N|TONE
AN INTERACTIVE EXPLORATION OF A POETIC SPACE THROUGH VOICE
by
Ala’ Diab
A Thesis Presented to the
FACULTY OF THE USC SCHOOL OF CINEMATIC ARTS
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF FINE ARTS
(INTERACTIVE MEDIA)
May 2010
Copyright 2010 Ala’ Diab
Dedication
This paper is dedicated to the unwavering support and love of my family. Also,
gratitude is extended to Mr. Fadi Ghandour for his generous, unconditional support
and to Ms. Nadine Toukan for her divine intervention. And finally, to the creative
energy that permeates the USC IMD family of faculty and students.
Acknowledgments
Heartfelt gratitude to the following people for their support and guidance:
Thesis Committee:
Mark Bolas, Associate Professor (Chair)
Perry Hoberman, Research Associate Professor (Second Advisor)
Dr. David Krum, Computer Scientist - ICT (Outside Advisor)
Special thanks to:
Anne Balsamo, Professor
Tracy Fullerton, Associate Professor
Table of Contents
Dedication
Acknowledgments
List of Figures
Abstract
Project Description
Project Concept
Prior Art
Legible City
Put-That-There
Text Rain
Vox Populi
Ursonography
Airwaves
Post Secret
User Experience
Prior Prototypes
The Experience
A User Experience Storyboard
Directed Path vs. Improvisation
Recent Work on Emotion and Language
A Short History of Speech Recognition
Evaluation
Discussion
Conclusion
References
List of Figures
Figure 1: A User Exploring the Legible City Project Represented in 3D Text on a Bicycle Rig
Figure 2: Put-That-There Office Setup
Figure 3: A Screen Grab from the Installation Showing How the Text Detects the Edge of a Moving Shape
Figure 4: A User Delivering a Presidential Speech to an Enthused Digital Crowd
Figure 5: A Series of Screen Shots of Jaap Blonk Delivering the Ursonate
Figure 6: Airwaves CD Cover
Figure 7: Post Secret Book Cover
Figure 8: A Screenshot of 'Max 5' Showing the 'Walker' Patch at Work
Figure 9: A Still from an Experiment Using Isadora
Figure 10: Rough Layout of the Proposed Space
Figure 11: The Utterance of the Word 'Mounds' Bends the Text to Form Semi-Circle Shapes
Figure 12: The Screen Is Brimming with Mounds by Now. They Keep Flowing from One Side of the Screen to the Next. Then the Word 'Mineral' Is Triggered to Reveal a Curled Nugget of Mineral Words Entangled in a Spiral.
Figure 13: The Mineral Cracks Open, and Roots Emerge from It and Start Digging into the Ground Beneath Them. The Same Goes for the Rest of the Screens, Where the Text Being Read Either Makes It Behave Differently, Like Changing Size, Case or Color, or by Introducing an Image
Figure 14: User Interaction Flow Diagram
Figure 15: A Series of Visualizations of the Effect of Changing a User's Voice on the Text on Screen
Figure 16: A Diagram Illustrating the Elements That Will Be Controlled by the System
Figure 17: Milestones in Speech Recognition and Understanding Technology over the Past 40 Years
Abstract
This paper is an accompanying document to an interactive prototype, submitted as part of the requirements for a Master of Fine Arts in Interactive Media. The project is an interactive art piece that explores the intersection of spoken word, voiced sounds and their visual expressions, both in shape and typographical form. The project will create a spatial playground for the user to play with her voice and affect the digital environment in interesting and compelling ways.
Project Description
The project is an attempt to create a connection between the human voice and a
poetic space. The poetic space will be an amalgamation of sound, typography and
evocative imagery. The visuals are going to be directly linked to the inflections of the
poetic performance.
The motivation for the project is a personal one, driven by the following areas of enquiry. The first comes from architecture and movement: how a visitor to a city is immersed in a new urban environment where she enacts a narrative mainly by walking and exploring. This notion was touched upon by Judith Butler in her idea of how what she called 'performativity' and repetition help enforce collectively accepted social norms, and how those very tools can be used to subvert those norms (Butler, 1990). Michel de Certeau describes walking as a way of reading a city, and how the city has a narrative not unlike that of a novel (Certeau, 1984). Both are forms of enacting textuality in its written and built manifestations.
The second area of interest is concerned with the rich expressive potential of the
human voice as an interface. Certain aspects of how we communicate vocally exist
beyond language, both in tonal variation and amplitude. This is evident in how early
formation of language in newborns is shaped by what is called 'Motherese' or Babytalk. Phonosemantics, a branch of linguistics, makes the controversial claim
that non-verbal voicing inherently carries meaning and emotion (Gazzola V., 2006).
The third area is the rhythmic power of poetry and its capacity to compress
meaning and express both local and universal themes. Repeating the same word
over and over has a hypnotic quality and the way it’s verbalized changes its
emotional resonance drastically.
Project Concept
From a user perspective, N|Tone is ultimately a poetic look at a representational space that is brought into existence through performed and visualized narrative. A series of narrative snippets, in the form of verses of a poem, will be presented on screen to invite users to piece together a story; the themes will vary emotionally and will be more or less clearly delineated. Room will be provided within the poems for users to improvise, by creating gaps in the texts or, in some places, going completely silent. The stage for this will be a room that is sound-proofed to heighten the feeling of isolation. A microphone will capture the spoken words, and visuals will fly around inside the projected image in response.
This work sees itself as part of a lineage that is rooted in performance in a space. The emphasis of the experience will be on placing the user at a distance from the screen, unlike more task-based experiences that force the user to be tied to a controller.
Prior Art
The Legible City
Jeffrey Shaw | 1988
In ‘The Legible City’, the visitor is able to ride a stationary bicycle through a simulated representation of a city that is constituted by computer-generated three-dimensional letters that form words and sentences along the sides of the streets. Using the ground plans of actual cities - Manhattan, Amsterdam and Karlsruhe - the existing architecture of these cities is completely replaced by textual formations. Travelling through these cities of words is consequently a journey of reading; choosing the path one takes is a choice of texts as well as their spontaneous juxtapositions and conjunctions of meaning.

Figure 1. A user exploring the Legible City Project represented in 3D text on a bicycle rig
The handlebar and pedals of the interface bicycle give the viewer interactive
control over direction and speed of travel. The physical effort of cycling in the
real world is gratuitously transposed into the virtual environment, affirming
a conjunction of the active body in the virtual domain. A video projector is
used to project the computer-generated image onto a large screen. Another
small monitor screen in front of the bicycle shows a simple ground plan of
each city, with an indicator showing the momentary position of the cyclist.
Each story line has a specific letter color so that the bicyclist can choose one
or another to follow the path of a particular narration. In the Amsterdam
(1990) and Karlsruhe (1991) versions all the letters are scaled so that they
have the same proportion and location as the actual buildings which they
replace, resulting in a transformed but exact representation of the actual
architectural appearance of these cities. The texts for these two cities are
largely derived from archive documents that describe mundane historical
events there.
Relevance: This piece combines both architectural representation of the
city/urban space with storytelling on the one hand and typographic
exploration on the other. Of particular interest is the source from which the
stories were assembled.
Put-That-There
Richard Bolt | 1980
The Architecture Machine Group at the Massachusetts Institute of Technology has experimented with the conjoint use of voice-input and gesture-recognition to command events on a large format raster-scan graphics display. Of central interest was how voice and gesture could be made to interorchestrate, actions in one modality amplifying, modifying, or disambiguating actions in the other. The approach involves the significant use of pronouns, effectively as "temporary variables," to reference items on the display. The interactions described were staged in the MIT Architecture Machine Group's "Media Room," a physical facility where the user's terminal is literally a room into which one steps, rather than a desktop CRT before which one is perched.

Figure 2. Put-That-There office setup
Relevance: This project combines a task-oriented interactive experience - laying out elements in a contextual fashion on screen - with speech recognition technology. Special attention was paid to staging the experience in its 'natural' environment of an office space. The stage was laid out to accurately depict a work environment where one would be sitting at a desk and making executive, albeit mundane, decisions. A marriage of a designed space and an interactive experience that is naturally mapped onto it makes this very relevant to N|TONE.
Text Rain
Camille Utterback & Romy Achituv | 1999
Text Rain is an interactive installation in which participants use the familiar instrument of their bodies to do what seems magical - to lift and play with falling letters that do not really exist. In the Text Rain installation participants stand or move in front of a large projection screen. On the screen they see a mirrored video projection of themselves in black and white, combined with a color animation of falling letters. Like rain or snow, the letters appear to land on participants' heads and arms.

Figure 3. A screen grab from the installation showing how the text detects the edge of a moving shape
The letters respond to the participants' motions and can be caught, lifted, and
then let fall again. The falling text will 'land' on anything darker than a
certain threshold and 'fall' whenever that obstacle is removed. If a
participant accumulates enough letters along their outstretched arms, or
along the silhouette of any dark object, they can sometimes catch an entire
word, or even a phrase. The falling letters are not random, but form lines of a
poem about bodies and language. 'Reading' the phrases in the Text Rain
installation becomes a physical as well as a cerebral endeavor.
Relevance: This project creates an emotive space with something as simple as typography. The poetry is read through a modality that engages the body as well as the eyes, moving meaning into an embodied space.
Vox Populi
Don Ritter | 2005
A large video projection of a crowd yells “speech, speech,” and encourages visitors to speak from a lectern equipped with a microphone. A teleprompter on the lectern provides the text of historical, political speeches, though the sources are not specified. When a visitor delivers a speech through the microphone, the text scrolls on the teleprompter, the crowd responds with varying degrees of hostility, support or ridicule, and the visitor’s speech is mixed with the screaming of the crowd through a sound system.
Within Vox Populi, anyone can adopt the role of leader and speak the words
of John F. Kennedy, Martin Luther King Jr, George W. Bush, and others.
Visitors are free to speak whatever they want through the microphone, but
most read the speeches provided. The amount of confidence within the
leader’s voice will control various aspects of the installation, including the
specific response of the crowd and scrolling of the text on the teleprompter.
Figure 4. A user delivering a presidential speech to an enthused digital crowd
If a leader speaks continuously for four minutes at a high volume and tempo,
the crowd remains enthusiastically supportive.
Relevance: This installation puts the call-and-response technique to good use. The on-screen ‘public’ exhorts the speaker to excite and incite them with the urgent ‘speech’ call. The performance of the speaker (the level of enthusiasm indicated by the rise in tone) is monitored and directly influences the reaction of the public.
Ursonography
Golan Levin & Jaap Blonk | 2005
Figure 5. A series of screen shots of Jaap Blonk delivering the Ursonate

Ursonography is an audiovisual interpretation of a famous example of 20th-century concrete poetry, the ‘Ursonate’ of Kurt Schwitters. The poetry piece emphasizes the abstract musical qualities of speech. The performance of that piece by Jaap Blonk is augmented by a simple yet elegant
form of expressive, real-time, intelligent subtitling system. The system uses
speech recognition and score-following technology to sync the projected
subtitles with the timing and timbre of the performer’s voice. It also
punctuates the poem’s structure with dynamic variation on the typographic
motion in space.
Relevance: A powerful way of making visible some of the nuances of a
poetry recital. The text springs to life due to the tight synching of utterance
and instantiation. It is a performed piece that contains no words but retains a
kind of structure that forces the listener to appreciate the phonetic qualities
of a kind of pre-language language.
Airwaves
Loops & Topology | 2006
Figure 6. Airwaves CD cover

Airwaves is the culmination of the collaboration between Jonathan Dimond (ex-Jazz Convenor at the Queensland Conservatorium, Griffith), with Jamie Clark also a contributing writer. Both composers and both ensembles have been exploring the musical possibilities of speech for a number of years.
The combined 8-piece band plays along with the recorded voices of Churchill,
Hitler, Gandhi, Earhardt, Whitlam, Howard, Freud, Einstein, Bradman, Melba,
and a host of others, including Marconi himself. These figures are heard in
"voice portraits" - a new technique using characteristic intonation patterns of
a person's speech to make melody. The band plays music designed to
emphasize this melody, so that when Bill Clinton talks about "that woman", it
sounds like he's singing. The result is a new kind of opera. The two
ensembles combine different approaches - Topology's contemporary
classical perspective and Loops' jazz background - to create a new, wide-ranging ethos.
Relevance: This album utilizes language’s inherent rhythm by chunking the
musical phrasing according to the speech patterns of the recorded pieces. It
also zooms in on the micro level and extracts local pitches and uses that data
as a composition strategy.
PostSecret: Confessions on Life, Death, and God
Frank Warren | 2009
PostSecret works like this: people are invited to send anonymous postcards bearing a secret, decorated in any or no way, to Warren's Germantown, MD home address (it's still the same, which is remarkable). The original postcards Warren printed up were quite plain, and soon he started getting cards returned that had sketches, paintings, collages, and more on them. In the five PostSecret books, including the new "PostSecret: Confessions on Life, Death, and God," the postcards are displayed in all of their glory (while almost all of them are two-dimensional, Warren has received objects, too, like the bag of coffee beans with a secret printed on it).
Relevance: This project is included because of its user-generated-content aspect. Contributors exercise their creativity in the postcards they send to the author. There’s a democratic aspect to printing a book with seemingly no authorial control over the content.
Figure 7. Post Secret book cover
User Experience
Prior Prototypes
The Walker
The Walker was driven by the question of whether I could build a prototype in Max 5, a visual programming environment developed by Cycling '74, in which simulated walking (by either tapping on the space bar or hitting an attached microphone) would drive and control the speed of a musical piece. The prototype worked fairly well for what it was. It was interesting to see through user interaction how people slipped into a feedback loop, trying to push the speed of the piece beyond what was considered musical to see what happened. The initial plan was to try to turn the tapping into movement, to see how engaging the body instead of the hand would transpose the experience.

Figure 8. A screenshot of 'Max 5' showing the 'Walker' patch at work
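The core mapping behind the Walker - tap interval in, playback rate out - can be sketched outside of Max as well. The following is a minimal illustration in Python (the class and method names are mine, not part of the Max patch), where the average interval between recent taps sets a playback-rate multiplier:

```python
import time

class Walker:
    """Maps the interval between 'steps' (taps) to a playback-rate multiplier."""

    def __init__(self, base_interval=0.5, history=4):
        self.base_interval = base_interval  # seconds per tap at normal (1.0x) speed
        self.history = history              # how many recent intervals to average
        self.taps = []

    def tap(self, t=None):
        """Register a tap (space bar hit or microphone transient) at time t."""
        self.taps.append(time.monotonic() if t is None else t)
        self.taps = self.taps[-(self.history + 1):]  # keep only recent taps

    def rate(self):
        """Playback rate: faster tapping -> shorter intervals -> higher rate."""
        if len(self.taps) < 2:
            return 1.0
        intervals = [b - a for a, b in zip(self.taps, self.taps[1:])]
        avg = sum(intervals) / len(intervals)
        return self.base_interval / avg

w = Walker()
for t in [0.0, 0.25, 0.5, 0.75]:  # taps arriving twice as fast as the base interval
    w.tap(t)
print(w.rate())                   # -> 2.0
```

Averaging over a short window is what produces the feedback loop the users fell into: each tap nudges the rate, and the user hears the nudge and taps faster still.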
Dynamic Postcard
This prototype was an exercise in learning Isadora, a high-level visual development environment for theatre production developed by TroikaTronix, and in trying more ideas for triggering graphics on screen using amplitude and pitch data. The text and images were moving according to a pre-defined sequence that was triggered by speaking into the microphone. It was unclear to the users who were testing it how their voice mapped to the events happening on screen, a result of the lack of a script.

Figure 9. A still from an experiment using Isadora
The Experience
The Space:

The space is going to be a partitioned confinement that should fit up to two users, along with a microphone and a stand. The dimensions and specs of the space will be determined in consideration of other accompanying exhibits' spatial requirements and the dimensions of the hall in which the group of exhibits will be housed. The space will also contain speakers, placed in a way that avoids possible feedback issues; alternatively, noise-canceling mics can be used with limited success. The projection screen will be made from a flex membrane that can be pulled back against a solid background and made into various shapes. In the case of this project the screen will take a concave shape.
Use of Sound
Sound will be kept to a minimum, taking the form of feedback sounds in the Directed Path experience that egg the user on either to progress or to notice the change in the way she's affecting the experience on screen. For example, a bell sound would be triggered if the user exercised their pitch during a particular moment; that sound would ramp up in pitch to indicate that the change in pitch is in effect.

Figure 10. Rough layout of the proposed space
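One simple way to model this feedback mechanic - a sketch with assumed names and ranges, not the actual implementation - is to let the bell's own frequency ramp with the spread of the user's recent pitch values:

```python
def feedback_bell_freq(recent_pitches, base_freq=880.0, sensitivity=2.0):
    """Return the frequency (Hz) of the feedback bell.

    The wider the user's recent pitch range, the higher the bell rings,
    signalling that the change in pitch is taking effect.
    """
    if not recent_pitches:
        return base_freq
    spread = max(recent_pitches) - min(recent_pitches)  # pitch range exercised, in Hz
    # Ramp the bell up in proportion to how much the user varies their pitch.
    return base_freq + sensitivity * spread

print(feedback_bell_freq([200.0, 200.0]))  # flat voice   -> 880.0
print(feedback_bell_freq([180.0, 260.0]))  # varied voice -> 1040.0
```

A perceptually even ramp would more likely scale in semitones rather than raw Hz, but the linear form is enough to convey the mechanic.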
A User Experience Storyboard
Visually speaking, the project shifts between abstract expressive typography and minimal imagery. Inspired by the core message of the Sufi poem that was selected for it, the project is a conscious journey in the transcendence of objective reality by ascending through orders of creation: starting with inanimate and lifeless things, through organic material, all the way to angel-like sublimation. Of the keywords that will trigger the transitions, some will be known and others will not. The poem in question has a nice balance of imagery and sense of rhythm to transform into an interactive audiovisual experience. It is a short poem by Jalaludin Rumi:
‘ I have experienced seven hundred and seventy mounds
I died from minerality and became vegetable
And from vegetativeness I died and became animal
I died from animality and became man
Then why fear disappearance through death?
Next time I shall die
Bringing forth wings and feathers like angels
After that soaring higher than angels
What you cannot imagine, I shall be that ’
‘Mound’ reveals ‘mineral’, from which roots emanate to penetrate soil, creating a ‘tree’ that is transformed into ‘animal’ and then ‘human’, finally ending with an angel.
The following few diagrams show one scenario of how things on screen will look and behave in response to speech:
Figure 11. The utterance of the word ‘Mounds’ bends the text to form semi-circle shapes
Figure 12. The screen is brimming with mounds by now. They keep flowing from one side
of the screen to the next. Then the word ‘mineral’ is triggered to reveal a curled nugget of
mineral words entangled in a spiral.
The following is a description of how the user would interact with the piece:
The user will enter the room to find a microphone on a stand and an
instruction text on the screen that would read: Welcome, please pick up the
microphone and say ‘Begin’.
A line of text starts scrolling on screen like a news ticker.
If the user for some reason doesn’t say anything, the type on screen will start
blinking to give a sense of urgency.
If the user starts reading, the words scroll on screen corresponding to the
speed of the delivery.
Recognized words are going to be highlighted on screen, changing size and
color based on the volume and the pitch of the delivery.
Figure 13. The mineral cracks open, and roots emerge from it and start digging into the ground beneath them. The same goes for the rest of the screens, where the text being read either makes it behave differently - changing size, case or color - or introduces an image
Along with the text, accompanying visuals will appear on screen to illustrate the ‘expressivity’ of the speech.
If the user remains silent, nothing will happen. But if she stops speaking during a stage in the experience, the animated text is caught in a loop until new words are fed into it.
The poem will progress only if the desired keyword is spoken.
Room for improvisation exists, as everything the user does gets captured and processed; a layer of user-generated content can thus exist alongside the pre-existing text, creating the opportunity for mischief, appropriation and re-contextualization of the given text.
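The rule that the poem advances only on the desired keyword, while everything else is captured as improvisation, amounts to a small state machine over the poem's stages. A minimal sketch (the stage keywords follow the storyboard; the class and method names are illustrative):

```python
# Stage keywords taken from the storyboard's order of creation.
STAGES = ["mounds", "mineral", "tree", "animal", "human", "angel"]

class PoemProgress:
    """Advances through the poem's stages only when the gating keyword is heard."""

    def __init__(self, stages=STAGES):
        self.stages = list(stages)
        self.index = 0  # currently waiting for stages[0]

    def hear(self, word):
        """Feed one recognized word; return True if the poem advanced a stage."""
        if self.index < len(self.stages) and word.lower() == self.stages[self.index]:
            self.index += 1
            return True
        return False  # off-script words are still captured, but don't advance the poem

    @property
    def finished(self):
        return self.index == len(self.stages)

p = PoemProgress()
p.hear("heartburn")  # improvisation: ignored by the gate
p.hear("mounds")     # advances to waiting for 'mineral'
print(p.index)       # -> 1
```

Keeping the gate this strict is what lets the directed path and free play coexist: the recognizer can forward every word to the visual layer while only the keyword moves the narrative forward.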
Figure 14. User Interaction flow diagram
Directed Path vs. Improvisation
As the experience is based on a poem, the user will be invited to explore the poem in
its textual form through the control of the speed and size of the typographic
elements on screen. Recognized words will trigger an animated sequence that has a
specific path. The path is meant to limit the amount of control that the user will have
in order to enable her to recognize the change she's making in the elements on
screen.
Since such a system is readily capable of capturing voice input in real time and analyzing it on the fly for prosody (timing, inflection and emphasis) in its abstract form as both pitch and amplitude, it opens up possibilities for 'going off script' and provides an emergent experience rooted in free play that would encourage people to explore the potentialities of the system.
Going off script could provide for a humorous reading/interpretation of the poem by layering extra phrasing on top of the verse. As the example below illustrates, one could die from a heartburn.
Also, since the system can read both the pitch and amplitude of a voice, that data can be used to do interesting visualizations on top of the pre-sequenced animated type; mapping amplitude or loudness to size and pitch to jitter could yield interesting results.
Figure 15. A Series of Visualizations of the Effect of Changing a User's Voice on the Text on Screen
Figure 16. A diagram illustrating the elements that will be controlled by the system
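The proposed amplitude-to-size and pitch-to-jitter mappings can be sketched as pure functions over per-frame voice features. The numeric ranges below (type sizes, the 80-400 Hz voice band, the jitter ceiling) are assumptions for illustration, not measured values from the project:

```python
def amplitude_to_size(amplitude, min_size=12, max_size=96):
    """Map a normalized amplitude (0..1) to a type size in points."""
    a = max(0.0, min(1.0, amplitude))  # clamp, since mic levels can spike
    return min_size + a * (max_size - min_size)

def pitch_to_jitter(pitch_hz, low=80.0, high=400.0, max_jitter=10.0):
    """Map voice pitch (Hz) to per-glyph positional jitter in pixels."""
    p = max(low, min(high, pitch_hz))  # clamp to a typical voice band
    return (p - low) / (high - low) * max_jitter

print(amplitude_to_size(0.5))  # -> 54.0 (a half-loud voice, mid-sized type)
print(pitch_to_jitter(240.0))  # -> 5.0  (mid-band pitch, moderate jitter)
```

The point of keeping these as simple, monotonic mappings is legibility of cause and effect: the user testing of the earlier prototypes showed that people lose the thread when the voice-to-visual mapping is not obvious.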
Recent Work on Emotion and Language
Significant work had been done on studying the connections and intersections
between language, emotion and cognition. Of particular relevance are two
researches done at USC’s Signal Analysis & Interpretation Lab (SAIL) and Brain and
Creativity Institute respectively both in task-oriented emotional reading and the
role of prosody in detecting empathy in human-to-human interaction.
One study by Lisa Aziz-Zadeh of USC's Brain and Creativity Institute and USC
doctoral student Tong Sheng found that the regions of the brain that are involved in
inflecting one's language with prosody are largely the same as those used to listen to
the subtle signals in others' speech. People whose prosody makes them very
expressive also tend to be good at picking up on others' prosody, and people with a
talent for prosody tend to be more empathic (Healy, 2010).
The other research investigates politeness and frustration behavior of children
during their spoken interaction with computer characters in a game. The study is
based on a Wizard-of-Oz dialog corpus of 103 children playing a voice-activated
computer game. The analysis showed that there is a positive correlation between
frustration and the number of dialog turns reflecting the fact that longer time spent
solving the puzzle of the game led to a more frustrated child (Yildirim et al., 2005).
A Short History of Speech Recognition
This section is not intended to be an exhaustive history of the technology behind the currently available tools for speech recognition; the author of this thesis recognizes that he is no expert on the matter. The purpose of this history is to highlight that there are still challenges facing the development of an all-encompassing solution for some key problems.
The first speech recognizer appeared in 1952 and consisted of a device for the
recognition of single spoken digits (Davis, Biddulph, & Balashek, 1952). Another
early device was the IBM Shoebox (IBM), exhibited at the 1964 New York World's
Fair.
One of the most notable domains for the commercial application of speech
recognition in the United States has been health care and in particular the work of
the medical transcriptionist. Other areas where the technology has been deployed include the military (high-performance fighter aircraft, helicopters, battle management, and the training of air traffic controllers), as well as telephony and other domains, where it has served as an electronic responder/query and retrieval system. The biggest limitation to speech recognition's ability to automate transcription, however, is seen as the software: the nature of narrative dictation is highly interpretive and often requires judgment that may be provided by a real human but not yet by an automated system.
A distinction in ASR is often made between "artificial syntax systems" which are
usually domain-specific in the way that they have a limited vocabulary that is highly
optimized and "natural language processing" which is usually language-specific.
Each of these types of application presents its own particular goals and challenges.
Today speech technologies are commercially available for a limited but interesting range of tasks. These technologies enable machines to respond correctly and reliably to human voices, and provide useful and valuable services.

Figure 17. Milestones in speech recognition and understanding technology over the past 40 years
Evaluation
The early prototypes concentrated mainly on getting around the technical challenge
of capturing and mapping the human voice on an interactive experience and thus
lacked the depth and richness that come with forethought and intent. Nonetheless, users at an exhibition in December 2009 commented on aspects of those short experiences that will influence the final version, namely the need for clear guidance on how to start the experience and what to expect it to do. The core challenge is the intuitive mapping of the users' actions (the reading) onto what's being processed and displayed on screen.
Discussion
This is not the first interactive media experience to use the microphone as an input
device. As the Prior Art section illustrated, there have been numerous artistic
experiments that used the microphone as a performative aid. Also, in the area of video games, there have been titles in both the jukebox/karaoke and the be-part-of-a-band genres that employed it: Karaoke Revolution from Konami (2009) and Guitar Hero from Activision (2006), respectively. Another example, in which the player uses voice commands to guide an avatar through a space to perform tasks and finish the game, is Lifeline from Konami (2004), an interesting experiment that failed because it didn't recognize the limitations of the technology and therefore didn't set proper and realistic expectations for what ended up being a frustrating experience.
This project is not concerned with the technology's ability to recognize the person speaking; although it is capable of doing so, that requires training over a longer period of time than the intended experience allows. That being said, speech recognition technology is getting progressively better, and the increase in detection accuracy is noticeable, which makes for interesting design decisions based on the fact that the computer knows fairly well the words being said.
What I feel is missing from these experiences, both task-oriented and explorative, is a sense of the poetic. Setting aside the task-oriented nature of some of the projects for a moment leaves an aesthetic experience that might be visually engrossing but lacks eloquence. To ‘put-that-there’ ends the relationship between the user and the object once the task is fulfilled. To voice something into the world should come with a sense of care and whimsy (Certeau, 1984).
On one level N|TONE pays respect to the voice by writing it on a wall. The wall in
this sense is a witness to the act of utterance. The fact that the poem might not be
familiar to the user can be advantageous in the way it is a seed for possible
improvisation that might occur as the design allows. It also allows for more playful
use of interpretive animations that could be designed for users to appreciate.
Conclusion
In conclusion, this project is a call to invite some poetry into interactivity and
capture a sense of eloquence both textual and visual that the designer believes is
lacking in current experiences. The human voice is a fascinating medium that is rich
and deserves more work to be done outside the specialized areas of speech
recognition and signal processing.
As for the project at hand, work still needs to be done to develop some of the
interactive assets that will respond to the input. Those assets are related to dynamic
typography and possibly some of the photographic images and sounds that will be
the core of the work. More user-testing will be done to see how far the project can
go to achieve its intent.
References
Butler, J. (1990). Performative Acts and Gender Constitution: An Essay in
Phenomenology and Feminist Theory. In S.-E. Case, Performing Feminisms: Feminist
Critical Theory and Theatre (p. 270). Baltimore: The Johns Hopkins University Press.
Certeau, M. d. (1984). The Practice of Everyday Life. Berkeley and Los Angeles, California: University of California Press.
Davis, K. H., Biddulph, R., & Balashek, S. (1952). Automatic Speech Recognition of Spoken Digits. Journal of the Acoustical Society of America, 637-642.
Grahn, J. A., & Rowe, J. B. (2009, June 10). Feeling the Beat: Premotor and Striatal Interactions in Musicians and Nonmusicians during Beat Perception. The Journal of Neuroscience, pp. 7540-7548.
Healy, M. (2010, January 19). Linguistically musical? You're probably nicer, too. Retrieved from LA Times | Health: http://latimesblogs.latimes.com/booster_shots/2010/01/linguistically-musical-youre-probably-nicer-too.html
IBM. (n.d.). IBM Shoebox. Retrieved March 31, 2010, from IBM Archives: http://www-03.ibm.com/ibm/history/exhibits/specialprod1/specialprod1_7.html
Lynch, K. (1960). The Image of the City. Cambridge MA: The MIT Press.
Yildirim, S., et al. (2005). Detecting Politeness and Frustration State of a Child in a Conversational Computer Game. Proc. Interspeech (pp. 2209-2212). Lisboa.