SOUND TO IMAGE
Proceedings SMC’07, 4th Sound and Music Computing Conference, 11-13 July 2007, Lefkada, Greece
REVIEWING THE TRANSFORMATION OF SOUND TO IMAGE IN NEW COMPUTER MUSIC SOFTWARE
Esthir Lemi, Anastasia Georgaki
National and Kapodistrian University/ Music Department, Athens, Greece
ABSTRACT
In this essay we analyse the relationship between sound and image in computer music, examining sound visualisation over the timespan in which it has existed. We judge software by aesthetic criteria handed down to us from the theories of abstract painting (20th century avant-garde), the theory of montage of Sergei Eisenstein, neurophysiology (synesthesia, muscular images) and the successful correspondence of the two media (pixel and music) in the works and theory of J. Whitney.
1. INTRODUCTION
Since there is always a need to write music down, the relationship between image and sound is a very important one, and an approach of this kind is crucial for composers. Xenakis justifies the composer’s need to make a first draft by hand with the poetic, but also logical, saying that “the hand is the organ of the body that is closest to the brain” [1].
The history of modernism begins with the spiritual relationship between two great men, Kandinsky and Schoenberg. During their long friendship they certainly managed to influence each other, breaking the boundaries of representational art and tonal music respectively, each opening a new era of abstraction in his art. On a deeper level they achieved a new approach to art as a constant and living centre of creativity. Many artists and movements followed (Duchamp, Mondrian, Matiuschin, Hauer, Fluxus etc), working from musical and visual principles which aided the evolution of art. This evolution led to new art forms (video art, interactive performance, visual music etc).
The boom of the computer age helped transform this relationship: the image as a decoder of the musical language became a language of visual translation with the help of symbolic representations (from the theory of Paul Klee at the Bauhaus to the aleatoric scores). The involvement of the computer in the production of a common image-sound result gave birth to a completely new view that could not have existed up to that point [2].
2. FROM NOTATION TO VISUAL SCORE
The first connection between colour and music can be found in Aristotle’s text “On Sense and the Sensible” [3]. The text categorises colours according to consonance and dissonance, in ratios representative of the harmonic music system of Pythagoras. Even though the term would not be invented until the 19th century, Aristotle is describing the synesthetic point of view in art.
On the other hand there are formal methods of approaching a graphic conception of an encoded musical structure. In the generalised typical form (the stave), as it was given to us by Guido d’Arezzo, the essential information conveyed is the relationship between the change of pitch (vertical axis) and the evolution of time (horizontal axis).
The movement of aleatorism gave a freer description of music composition by using abstract forms, through which the macrostructure of the piece remains clear and (because of the abstract, flexible handling) the microstructure of the piece can be controlled in greater detail [4].
After the Second World War the evolution of musical writing became revolutionary as far as its focus and analysis options were concerned. We can separate the options of representation into the following general categories:
Graphic Representation of:
the external/ internal structure
the harmonic and melodic structure (pitch)
rhythmic structure (time)
timbral structure
motive (cellular reconstruction)
The criteria by which these representations were created were always set in the context of the synesthetical relationship between the arts: the decoding of aural symbols, literal and formal congruences (onomatopoeia etc) and kinaesthetic stimuli [5]. For example, a modern music score can provide information on time, technique, expression, pitch and volume in a visually structured form which helps the performer understand and perform the piece with the accuracy that the composer wishes.
For example, in the score of E. Lemi’s piece for solo trumpet “Stamina” (2007), the performer produces sounds of indefinite pitch that cannot be shown in classical notation. To present the various timbral mutations in the time domain, symbolic forms and colours were used to represent the different timbres, while the nuances of luminance in the orange colour define the dynamic evolution between piano and forzatissimo (Fig. 1).
A discretionary synesthetical approach provides a vague dimension, with neurophysiological and psychological parameters that cannot be perceived in an objective way [6]. For example, for the time being we cannot represent a sound as “yellow” as part of a serious correspondence. A basis of this type is vague and doubtful, and the correspondence lies within a deeper coding.
A more significant method of correspondence between image and music came with the growth of computer music, particularly the research of J. Whitney on the construction of a harmonic relationship based on the tone as a basic component of music and the pixel as a basic component of image. During the seventies he coined the term “harmonic pixel phenomena”, set out in his book “Digital Harmony - On the Complementarity of Music and Visual Art”. In one of the chapters he includes programs in the Pascal language which create “differential dynamics”, a family of algorithms that activate each pixel point of a cluster differentially. The plasma-like liquidity of such motions permits aggregate architectonic structures to match musical action. J. Whitney also gave us the term “differential digital harmony” to express the idea of evolutions of ratios into visual and sound models of harmony [2].
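The idea of activating each point of a cluster differentially can be sketched in a few lines. The following is not Whitney's Pascal code, which is not reproduced here, but a minimal illustration under a stated assumption: point k of a cluster turns around a circle k times while a normalised time t runs from 0 to 1. The function names are hypothetical.

```python
import math
from fractions import Fraction


def differential_positions(n_points, t, radius=1.0):
    """Return the (x, y) position of every point at normalised time t.

    Assumption for illustration: point k (counted from 1) completes k full
    turns while t runs from 0 to 1, so each point moves at its own rate.
    """
    positions = []
    for k in range(1, n_points + 1):
        angle = 2.0 * math.pi * k * float(t)
        positions.append((radius * math.cos(angle), radius * math.sin(angle)))
    return positions


def distinct_phases(n_points, t):
    """Count how many different angular positions are occupied at time t.

    Exact fractions are used so that points which coincide are counted once.
    """
    t = Fraction(t)
    return len({(k * t) % 1 for k in range(1, n_points + 1)})


# At simple ratios of the cycle the cluster falls into simple symmetric figures.
print(distinct_phases(12, Fraction(1, 2)))  # 2
print(distinct_phases(12, Fraction(1, 3)))  # 3
```

Under this assumption the points drift apart and realign at simple fractions of the cycle, which is one way to read the analogy between visual pattern and the ratios of musical consonance.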
The efforts of artists to combine the two arts created many hybrid forms of art, which constituted a further dynamic evolution. In this way a form of art became possible which allows a kind of visual composition through the rhythmic change of images, even in the absence of sound. This art was named visual music. Drawing on the technical and aesthetic theories of the beginning of the twentieth century, visual composition combines the meanings of consonance and dissonance, as a common language for music, with contrast and variation of the many visual dimensions, including direction, speed, shape, size and colour [7].
This is the result of using the theoretical basis of montage, with its need for rhythmic continuity, as well as the symbolic approaches to time and rhythm in painting and experimental cinematography. The classification of the techniques for sequencing images according to their rhythmic behaviour was given by S. Eisenstein, who believed that the process of montage is the most important part of the artistic value of a film. For him, montage was a multi-tiered construct of tension and release, a description that can also be given to the art of music. Furthermore, he listed five basic types of montage, which are applied today in the theory of visual music: metric, rhythmic, tonal, overtonal and intellectual montage [8]. Basically, he values rhythmic ratios and the meaning of rhythm as one of the basic elements of narration.
3. TAXONOMY OF COMPUTER VISUAL MUSIC SOFTWARE
Iannis Xenakis started in 1972 to construct a music system for computers which specialised in the creation of sounds through graphs. The visual palette of UPIC (Unité Polyagogique Informatique du CEMAMu) consists of lines, curves and points. Many programs for visualising sound followed, most of them presented as simplified composition programs. In what follows we refer only to programs with a two-dimensional presentation on the x and y axes.
The programs can be separated according to the visual production into programs with:
graphical representation
colour representation
and according to their functions, into programs of reproducing:
sound to image
image to sound
simultaneous production of image to sound
A general diagram with characteristic samples from programs can be presented as follows:
[Table: characteristic programs by category]
As we can see in the table above, each category offers characteristic examples of image-sound translation. The categorisation here follows the procedure and the aesthetic result.
In the first category, graphical production, we have UPIC, which lets the user draw waves and envelopes, compose a page and do the mixing. UPIC, as a pioneer in the field of visualisation, changed creative habits and methods by allowing simultaneous composition at the macro- and micro-compositional level through the interaction between man (via the hands that design the music) and the machine that receives the commands [9].
Phonogramme, a simpler software composition tool, has a visual palette that spans all the shades between white and black (silence and maximum volume respectively). Despite the good ideas it presents, this system is not based on empirical studies and does not support design strategies [10].
In the second category, that of colour presentation, the palette of symbolisms grows, and there is a possibility of choosing further parameters. The chromatic “desymbolism” of sound is an idea that was successfully applied for the first time during the eighteenth century by the French Jesuit monk Louis Bertrand Castel. The monk constructed a visual musical instrument that performed sounds with a parallel display of colours, and he inspired later generations. Three centuries later, we still have not found a logical image-sound correspondence, so the correspondences are always used in combination, according to what they serve and present.
The first example of producing sound from a picture is Metasynth, which provides a range of red-green-yellow colours that work as filters. As in Phonogramme, but in the opposite direction, the grey shades symbolise the differences of volume (white for the maximum volume and black for silence) [10].
Constructed for educational purposes, Hyperscore has a more playful layout. The melodies are composed independently in the motive window and are then arranged as the user wishes in the sketch window. So we have a motivic structure, in which each motive has its own colour. In a third window, the harmonic line of the piece is defined. The construction follows a simple setting of the motivic elements of the composition, and this simple visual setting certainly enables children to comprehend composition. Moreover, the visual score can be exported as a MIDI file and displayed as a stave in a dedicated notation program (Finale, Sibelius etc) [11]. Such a play-like procedure (from the abstract to the specific) pleases the imagination and amuses, but its creative possibilities are limited.
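The grey-scale convention described above (white for maximum volume, black for silence) can be illustrated with a small additive-synthesis sketch. This is not Metasynth's actual algorithm, which is not documented here. It assumes that each image row drives one sine oscillator, that each column is a slice of time, and that pixel brightness is the oscillator's amplitude. The function name, the frequency range and the column duration are hypothetical choices.

```python
import math


def image_to_samples(image, sample_rate=8000, column_seconds=0.05,
                     f_low=200.0, f_high=2000.0):
    """Render a grey-scale image as audio samples in the range [-1, 1].

    image: list of rows, top row first; each pixel is a brightness in [0, 1].
    Assumptions for illustration: the top row is the highest frequency, rows
    are spaced linearly in frequency, and brightness is amplitude
    (1.0 = white = loudest, 0.0 = black = silence).
    """
    n_rows = len(image)
    n_cols = len(image[0])
    if n_rows == 1:
        freqs = [f_low]
    else:
        step = (f_high - f_low) / (n_rows - 1)
        freqs = [f_high - row * step for row in range(n_rows)]

    samples_per_column = round(sample_rate * column_seconds)
    samples = []
    for col in range(n_cols):
        for i in range(samples_per_column):
            t = (col * samples_per_column + i) / sample_rate
            mix = sum(image[row][col] * math.sin(2.0 * math.pi * freqs[row] * t)
                      for row in range(n_rows))
            samples.append(mix / n_rows)  # divide so the mix stays within [-1, 1]
    return samples


# A two-row, three-column picture: a high tone, then silence, then a low tone.
picture = [
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]
audio = image_to_samples(picture)
```

Inverting the brightness (using 1.0 minus each pixel value) would give the opposite convention that the text attributes to Phonogramme.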
Soundpaint is a program that maps from colour to sound by introducing a vector space homomorphism between visual space (in the RGB palette) and sound space. The constraints used for the process are injectivity, surjectivity, topology preservation and user-definable mapping. The program has potential, because its table is regarded in a way that implies flexible behaviour of movement during the process of painting [12].
In the case of fractals, we examine the technique of computer-aided musical composition that computes iterated functions from the codes of chaotic procedures [13]. Through this, the descriptive system of these functions can form the fractals, which can be rendered as a music score. Fractal interpolation waveforms are deterministic, so the same melody will be generated repeatedly unless the parameters of the waveform are changed in pitch, dynamic level, behaviour and into a new musical form [14]. The great interest of a non-linear dynamic system for musical composition is its natural relationship, in behaviour or phenomena, to the real world, which unfolds the mechanical affinity of controlling and contemplating. The chaotic procedures consist of a process of modification, whose internal constancy is verified by the rules that lie coded in its equation [15].
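A vector space homomorphism is a linear map, so the constraints named above can be checked on a small example. The sketch below does not reproduce Soundpaint's mapping, which is not given here. The 3x3 matrix and the three sound axes (pitch, loudness, brightness) are hypothetical, chosen only to show what linearity and injectivity mean for a colour-to-sound mapping.

```python
# Hypothetical matrix: each row says how much of R, G and B feeds one sound axis.
# The sound axes (pitch, loudness, brightness) are assumptions for illustration.
COLOUR_TO_SOUND = (
    (0.5, 0.5, 0.0),  # pitch
    (0.0, 0.5, 0.5),  # loudness
    (0.5, 0.0, 0.5),  # brightness
)


def colour_to_sound(rgb, matrix=COLOUR_TO_SOUND):
    """Apply the linear map to an (r, g, b) triple with components in [0, 1]."""
    return tuple(sum(m * c for m, c in zip(row, rgb)) for row in matrix)


def determinant3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_bijective(matrix=COLOUR_TO_SOUND):
    """A linear map between 3-D spaces is injective and surjective
    exactly when its determinant is non-zero."""
    return determinant3(matrix) != 0


orange = (1.0, 0.5, 0.0)
print(colour_to_sound(orange))  # (0.75, 0.25, 0.5)
print(is_bijective())           # True
```

Because the map is linear, mixing two colours gives the corresponding mix of their sounds, and a non-zero determinant guarantees that no two colours collapse onto the same sound. A user-definable mapping would amount to letting the user edit the matrix.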
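The deterministic character of such procedures can be shown with a simpler stand-in. The sketch below uses the logistic map, a standard chaotic iterated function, rather than the fractal interpolation waveforms cited above. The mapping of its values to MIDI note numbers and all parameter values are assumptions made for illustration.

```python
def logistic_orbit(r, x0, n):
    """Iterate x -> r * x * (1 - x) n times, starting from x0.

    For 0 <= r <= 4 and 0 <= x0 <= 1 every value stays within [0, 1].
    """
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def orbit_to_pitches(orbit, low=48, high=84):
    """Scale values in [0, 1] to MIDI note numbers between low and high."""
    span = high - low
    return [low + int(x * span) for x in orbit]


def melody(r=3.9, x0=0.5, n=16):
    """Hypothetical helper: a short pitch sequence from one parameter set."""
    return orbit_to_pitches(logistic_orbit(r, x0, n))


same_again = melody() == melody()       # identical parameters, identical melody
changed = melody(r=3.7) != melody()     # a different parameter changes the result
```

The equation is the whole rule set: nothing random enters, so the sequence only changes when a parameter such as r or the starting value is changed.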
Jitter provides a vast field of application, which allows the visual parameters to function as aspirations to the dynamic and non-objective qualities of music [16]. This is the first time that we can speak of the possibility of producing visual music, because the complexity of the program leaves the user free to choose the circumstances of the mapping, from a simple sketch to a graph table and the application of video. This freedom of procedure, together with the simultaneous and flexible application to image and sound, produces results in which image and sound share a common relationship [17].
Sonos is a program that works with Max/MSP/Jitter and controls the sound parameters using the visualisation of sound as its own procedure. An important function is graphical transformation using a transfer matrix as control interface. This is done by connecting each plane of a matrix to a transformation, while each pixel simultaneously stores a value. The procedure is activated when the user colours a matrix. Here the abscissa represents frequency in time. A controller, such as a keyboard, a mouse or any joystick, may then allow the exploration of a sensitive variation of sound and image, giving it interactive form [18].
The general problem of all these programs is that beyond the scientific representations, such as waveforms or spectrograms, computer visualisation offers no objective representation of the sound phenomenon. There are only subjective, more or less metaphorical representations, in heterogeneous sound or musical contexts. The subjective relationship between visual and sound results is due to the absence of an objective categorisation of colours. According to neurobiological studies of the eye and the brain, the greater part of what has been written about colours is arbitrary. According to this research, the brain receives the values of shade and luminosity separately. Also, the perception of depth of field and the separation of form from surface are achieved without colour, while judgements within the colour range depend on genetic and empirical elements. The generalisation that we can make concerns the three-dimensional construction of the colour space, independent of the kind of colour model (RGB, RYB, CMY). The results of the research show that there is no hierarchy within the colour range beyond the value of luminosity and its saturation.
The programming of systems of sound visualisation balances, in movement, between those subjective and objective levels of hierarchy. Even if most of the criteria rest on physiologically arbitrary approaches, our connection with the systems creates a general hierarchy that we can accept as is (e.g. can we throw away Kandinsky’s theory, on which modernism was based?).
4. VISUAL COMPUTER MUSIC
The new thing that computer music brought into the world was the way in which music was processed with the focus on sound. The processing of sound using mathematical models helped us enter its core. The ability to deconstruct the sound into its basic components and to alter sensitive parameters makes the material itself easy to modify, providing us with one more aesthetic advantage: an immediate and precise impression of the sound result and of its sculptural reshaping, even though the opportunity to build a common vocabulary has not yet been given. Besides that, in all of the programmes there is always a hidden intent of touching a broader public and of discovering a new synesthetic behaviour and a new musical form.
From the practical applications point of view the programmes of visualisation are separated into the categories:
of musical production and presentation as two different processes and
the parallel use of the parameters with the synchronous presentation of the results.
The two different techniques (acousmatic and interactive) are used, thanks to the computer, for the production of more complex applications that could not be produced without the power of the machine [4]. In the computer music repertoire we meet inspired works made with some of the visualisation programs, although a successful creation is usually due to the composer’s own ability and talent.
Xenakis’ Mycenae Alpha is the first work entirely realised on the UPIC (1979). Other composers followed, such as Julio Estrada with Eua-on (1980), but the big bloom came during the nineties with pieces such as “Saxatile” by Jean-Claude Risset and “L’Autel de la Perte et de la Transformation” (“The Altar of Loss and Transformation”) by Brigitte Robindore (1993).
Beyond researchers and modern composers, pop artists have worked with many of the programs. One of the best known examples is Aphex Twin, who used Metasynth to hide an image of himself in a spectrogram, as well as an image of a spiral shape in the first track of “Windowlicker”.
On the other hand, programs such as Hyperscore are used mainly for educational purposes. The pieces that have been written with them are charming, with a corresponding visual and musical lightness, such as “Creepy Raindrops” by Chelsea o Hara (2002) and “My Very Happy Hyperscore” by Garry Hughes (2002).
Nowadays, when the technological revolution allows more delicate manipulation of sound and image, it is possible to combine more complex techniques that bring a more fundamental result.
Even though we have examples of pieces visualised with particular software, the flexible new forms of programming allow us to include, in the prehistory of the modern foundations, visionary steps taken by artists who worked with sound and experimental cinema. Norman McLaren, in his film “Synchromy” (1971), composed music by drawing directly on the optical soundtrack of 35-mm film blocks of different vertical and horizontal sizes, which are audible as square waves of different frequencies and amplitudes respectively. The visual component of the film was created by manipulating the soundtrack on an optical printer, to create multiple copies in different foreground and background colours. In this way, McLaren used the technology of film to associate sound with image millisecond by millisecond [16]. The gathering and comparison of such pieces can yield results for a new fundamental theory of reversibility, transformation and interaction.
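The correspondence between drawn blocks and square waves can be sketched as follows. This is a simplified model, not a description of McLaren's actual working method. It assumes that the number of blocks drawn per film frame sets the frequency at a projection speed of 24 frames per second, and that the fraction of the soundtrack width covered by a block sets the amplitude. The function and parameter names are hypothetical.

```python
def blocks_to_square_wave(blocks_per_frame, width_fraction, frames,
                          fps=24, sample_rate=48000):
    """Return audio samples for a run of identical blocks on the soundtrack.

    Assumptions for illustration:
      frequency (Hz) = blocks_per_frame * fps
      amplitude      = width_fraction, a value in [0, 1]
    Each block and the gap after it form one cycle of the square wave.
    """
    frequency = blocks_per_frame * fps
    period = sample_rate / frequency          # samples per cycle
    n_samples = round(frames / fps * sample_rate)
    samples = []
    for i in range(n_samples):
        phase = (i % period) / period         # position within the cycle, 0..1
        samples.append(width_fraction if phase < 0.5 else -width_fraction)
    return samples


# Ten half-width blocks per frame for one second of film: a 240 Hz square wave.
tone = blocks_to_square_wave(10, 0.5, frames=24)
```

In this model, doubling the number of blocks per frame raises the tone by an octave, while widening the blocks makes it louder, so the same strip can be read both as picture and as sound.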
5. TOWARDS A VISUAL MUSIC THEORY
With the term interactive, we generally refer to any real-time adaptation in relation to another action. However, we accept that this relation does not only concern the two senses, but also exists for the viewer in a simultaneous use of the rest of the senses to get the stimuli.
This belief also holds in the case of acousmatic music, where, even though the stimuli are more or less constant, the rest of the senses are always open to parallel activations that have to be taken into account because they affect the audience. An interesting point of view on this phenomenon is that of Antonio Damasio [20].
Antonio Damasio claims that the complex aesthetic stimuli of visual and auditory incentives create a neurological model inside the brain, which transforms into what can be described as an “image”. As the “image” brings information from the physical aesthetic action field of hearing and touching, the term “image”, according to Antonio Damasio, does not refer to the visual correspondence, but to something more complex. Albert Einstein was one of the first who studied that phenomenon, and he named those models of brain images “muscular images”. From all this general information we gather that the creation of a system for the visualisation of sound includes parameters from the fields of neurophysiology and the history of art, and if these fields are taken into consideration, the results allow the user a friendlier relationship with the machine (the aversion of a large part of the population to technology because of incomprehensible programs is well known). There is a stable basis in all those facts. We need to put our faith in a new kind of reading of music, which does not need the time-consuming familiarity of classical notation. Something of that sort would help the audience to experience music creation more consciously, and give them levels of sentiment beyond amusement, levels that, until today, only musicians shared.
Charles Rosen [21] has written convincingly about the important role of the written score in the Western classical tradition. The score was used to circulate new compositions and preserve them for future generations. The audience’s ability to read the score, at a time when most of the bourgeoisie learned musical notation, was critical to the reception of new works. Until the end of the 19th century, music was in large part a private experience. Most people would first encounter a Beethoven sonata alone at the piano, paging through the score. This private dialogue of discovery, between amateur pianist and composer, suggests that the Western relationship to music had once been closer to our contemporary relationship to poetry - engaged with a page, searching, meditative.
Today, conversely, we think of music as belonging mostly to the public sphere. Rosen writes: “Our assumption today, made unconsciously, that almost all music is basically public is a radical distortion of Western tradition. We no longer have a public that largely understands how the visual experience of a musical score is transformed into an experience of sound, and to what extent this transformation is not a simple matter but is capable of individual inflections”.
The transformation of sound into image enables an audience to better comprehend the musical structure of a work, by presenting them with another level of sensory involvement. Yet, in order to facilitate an engagement with a visual aesthetic, it is necessary to re-evaluate the criteria upon which the visual representation is based, and re-locate them from the technical arena to an aesthetic one. This could be done on three levels:
Firstly, by reviewing this technical software interface, encouraging composers and computer scientists to collaborate to improve software usability.
Secondly, by instigating a co-ordinated effort to compile theories of music/ sound reversibility, in order to consistently improve on the aesthetic criteria upon which the software is based.
Finally, by examining new possibilities of achieving reversibility of image and sound through mapping.
The parallel action of image and sound offers our comprehension of music a more drastic, vivid feeling about sound, like the descriptions of music in sci-fi books such as “Brave New World” by A. Huxley [22] back in the thirties. This reminds us that technology is here today to serve the creative fantasy and our wishful imagery, unless we turn into an unnerving state of consuming ideas.
6. CONCLUSION
Unforeseen events of auditory imagery in combination with visual representation offer a way of presenting and communicating in art that emulates the way we receive stimuli in our everyday life. “This use of auditory information should be consistent with normal experience even if the phenomenon being represented is far outside normal experience” [23]. Human perception works in that way, and in a way this helps a deeper understanding of musical composition, but the results so far in music are only experimental works that repeat all the trivial musical systems. The chaos results from a combination of the absence of an applied theory of structure and the refusal of musician-computer scientists to observe with greater attention and sensitivity the evolution of musical history.