Sixteen Years of Sound & Music Computing
A Look Into the History and Trends of the Conference and Community

D.A. Mauro, F. Avanzini, A. Baratè, L.A. Ludovico, S. Ntalampiras, S. Dimitrov, S. Serafin

Papers

Sound and Music Computing Conference 2009 (ed. 6)

Dates: from July 23 to July 25, 2009
Place: Porto, Portugal
Proceedings info: Proceedings of the 6th Sound and Music Computing Conference (SMC2009), ISBN 978-989-95577-6-5


2009.1
Accessing Structure of Samba Rhythms Through Cultural Practices of Vocal Percussion
Naveda, Luiz   Institute for Psychoacoustics and Electronic Music (IPEM), Ghent University; Ghent, Belgium
Leman, Marc   Institute for Psychoacoustics and Electronic Music (IPEM), Ghent University; Ghent, Belgium

Abstract
In the field of computer music, melody-based forms of vocalization have often been used as channels to access subjects' queries and retrieve information from music databases. In this study, we look at percussive forms of vocalization in order to retrieve rhythmic models entrained by subjects in Samba culture. By analyzing recordings of vocal percussion collected from randomly selected Brazilian subjects, we aim to compare emergent rhythmic structures with current knowledge about Samba music forms. The database of recordings was processed using a psychoacoustically inspired auditory model and further displayed as loudness and onset images. The analyses of emergent rhythmic patterns show intriguing similarities with the findings of previous studies in the field and offer new perspectives on the use of vocal forms in music information retrieval and musicology.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849591
Zenodo URL: https://zenodo.org/record/849591


2009.2
A Chroma-based Salience Function for Melody and Bass Line Estimation From Music Audio Signals
Salamon, Justin   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Gómez, Emilia   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain

Abstract
In this paper we present a salience function for melody and bass line estimation based on chroma features. The salience function is constructed by adapting the Harmonic Pitch Class Profile (HPCP) and used to extract a mid-level representation of melodies and bass lines which uses pitch classes rather than absolute frequencies. We show that our salience function has comparable performance to alternative state of the art approaches, suggesting it could be successfully used as a first stage in a complete melody and bass line estimation system.
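[Editor's note] As a rough illustration of the kind of chroma-based salience computation the abstract describes, the sketch below folds spectral peak energy into a 12-bin pitch-class vector with harmonic weighting. The function name, parameters and weighting scheme are illustrative assumptions, not the authors' HPCP adaptation.

```python
import numpy as np

def chroma_salience(peak_freqs, peak_mags, f_ref=440.0, n_bins=12,
                    n_harmonics=4, decay=0.8):
    """Fold spectral peaks into a 12-bin pitch-class salience vector.

    Each peak votes for the pitch classes of the fundamentals it could be a
    harmonic of, with geometrically decaying weight (illustrative scheme).
    """
    salience = np.zeros(n_bins)
    for f, m in zip(peak_freqs, peak_mags):
        if f <= 0:
            continue
        for h in range(1, n_harmonics + 1):
            f0 = f / h  # candidate fundamental if this peak is harmonic h
            pc = int(round(12 * np.log2(f0 / f_ref))) % n_bins
            salience[pc] += (decay ** (h - 1)) * m
    return salience / (salience.max() + 1e-12)

# Toy usage: the first three partials of an A4 tone
print(chroma_salience([440.0, 880.0, 1320.0], [1.0, 0.5, 0.3]))
```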

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849573
Zenodo URL: https://zenodo.org/record/849573


2009.3
A Classification Approach to Multipitch Analysis
Klapuri, Anssi   Department of Signal Processing, Tampere University of Technology; Tampere, Finland

Abstract
This paper proposes a pattern classification approach to detecting the pitches of multiple simultaneous sounds. In order to deal with the octave ambiguity in pitch estimation, a statistical classifier is trained which observes the value of a detection function both at the position of a candidate pitch period and at its integer multiples and submultiples, in order to decide whether the candidate period should be accepted or rejected. The method improved significantly over a reference method in simulations.
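[Editor's note] A minimal sketch of the underlying idea, assuming a generic periodicity detection function and an off-the-shelf binary classifier (logistic regression here is an assumption; the paper's statistical classifier and exact feature layout may differ).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def candidate_features(detection_fn, period, factors=(1, 2, 3)):
    """Sample a periodicity detection function at a candidate period and at
    its integer multiples and submultiples (illustrative feature layout)."""
    lags = [period * k for k in factors] + [period / k for k in factors[1:]]
    idx = np.clip(np.round(lags).astype(int), 0, len(detection_fn) - 1)
    return detection_fn[idx] / (detection_fn.max() + 1e-9)

# Toy usage: train an accept/reject classifier on labelled candidate periods
# (random stand-in data; a real system would use annotated sound mixtures).
rng = np.random.default_rng(0)
X = rng.random((200, 5))              # stand-in feature vectors
y = rng.integers(0, 2, 200)           # stand-in accept/reject labels
clf = LogisticRegression().fit(X, y)
print(clf.predict(candidate_features(rng.random(1000), period=120.0).reshape(1, -1)))
```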

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849575
Zenodo URL: https://zenodo.org/record/849575


2009.4
A Computational Model That Generalises Schoenberg's Guidelines for Favourable Chord Progressions
Anders, Torsten   Interdisciplinary Centre for Computer Music Research (ICCMR), School of Computing, Communications and Electronics, University of Plymouth; Plymouth, United Kingdom
Miranda, Eduardo Reck   Interdisciplinary Centre for Computer Music Research (ICCMR), School of Computing, Communications and Electronics, University of Plymouth; Plymouth, United Kingdom

Abstract
This paper presents a formal model of Schoenberg's guidelines for convincing chord root progressions. The model has been implemented as part of a system that models a considerable part of Schoenberg's Theory of Harmony. This system implements Schoenberg's theory in a modular way: besides generating four-voice homophonic chord progressions, it can also be used to create other textures that depend on harmony (e.g., polyphony). The proposed model generalises Schoenberg's guidelines in order to make them applicable to more use cases. Instead of modelling his rules directly (as constraints on scale degree intervals between chord roots), we model his explanation of these rules (as constraints between chord pitch class sets and roots, e.g., whether the root pitch class of some chord is an element of the pitch class set of another chord). As a result, this model can be used not only for progressions of diatonic triads, but also for chords with a large number of tones, and in particular for microtonal music beyond 12-tone equal temperament and beyond 5-limit harmony.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849577
Zenodo URL: https://zenodo.org/record/849577


2009.5
A Dynamic Analogy Between Integro-differential Operators and Musical Expressiveness
De Poli, Giovanni   CSC, Department of Information Engineering, Università di Padova; Padova, Italy
Rodà, Antonio   CSC, Department of Information Engineering, Università di Padova; Padova, Italy
Mion, Luca   TasLab, Informatica Trentina; Trento, Italy

Abstract
Music is often related to mathematics. Since Pythagoras, the focus has mainly been on the relational and structural aspects of pitches, described by arithmetic or geometric theories, and on sound production and propagation, described by differential equations, Fourier analysis and computer algorithms. However, music is not only score or sound; it conveys emotional and affective content. The aim of this paper is to explore a possible association between musical expressiveness and basic physical phenomena described by integro-differential operators.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849579
Zenodo URL: https://zenodo.org/record/849579


2009.6
A Framework for Ecosystem-based Generative Music
Bown, Oliver   Centre for Electronic Media Art, Monash University; Clayton, Australia

Abstract
Ecosystem-based generative music is computer-generated music that uses principles borrowed from evolution and ecosystem dynamics. These are different from traditional interactive genetic algorithms in a number of ways. The possibilities of such an approach can be explored using multi-agent systems. I discuss the background, motivations and expectations of ecosystem-based generative music and describe developments in building a software framework aimed at facilitating the design of ecosystemic sonic artworks, with examples of how such a system can be used creatively.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849581
Zenodo URL: https://zenodo.org/record/849581


2009.7
A Framework for Musical Multiagent Systems
Ferrari Thomaz, Leandro   Department of Computer Science, University of São Paulo (USP); São Paulo, Brazil
Queiroz, Marcelo   Department of Computer Science, University of São Paulo (USP); São Paulo, Brazil

Abstract
Multiagent system technology is a promising new avenue for interactive musical performance. In recent works, this technology has been tailored to solve specific, limited-scope musical problems, such as pulse detection, instrument simulation or automatic accompaniment. In this paper, we present a taxonomy of such musical multiagent systems, and an implementation of a computational framework that subsumes previous works and addresses general-interest low-level problems such as real-time synchronization, sound communication and spatial agent mobility. By using it, a user may develop a musical multiagent system focusing primarily on his/her musical needs, while leaving most of the technical problems to the framework. To validate this framework, we implemented and discussed two case studies that explored several aspects of musical multiagent systems, such as MIDI and audio communication, spatial trajectories and acoustical simulation, and artificial life constructs like genetic codes and reproduction, thus indicating the usefulness of this framework in a variety of musical applications.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849583
Zenodo URL: https://zenodo.org/record/849583


2009.8
A Framework for Soundscape Analysis and Re-synthesis
Valle, Andrea   CIRMA, Università di Torino; Torino, Italy
Schirosa, Mattia   CIRMA, Università di Torino; Torino, Italy
Lombardo, Vincenzo   CIRMA, Università di Torino; Torino, Italy

Abstract
This paper presents a methodology for the synthesis and interactive exploration of real soundscapes. We propose a soundscape analysis method that relies upon a typology of sound object behaviors and a notion of “sound zone” that collocates object typologies in spatial locations. Then, a graph-based model for organising sound objects in space and time is described. Finally, the resulting methodology is discussed in relation to a case study.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849585
Zenodo URL: https://zenodo.org/record/849585


2009.9
Album and Artist Effects for Audio Similarity at the Scale of the Web
Flexer, Arthur   Austrian Research Institute for Artificial Intelligence (OFAI); Vienna, Austria
Schnitzer, Dominik   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria

Abstract
In audio-based music recommendation, a well-known effect is the dominance of songs from the same artist as the query song in recommendation lists. We verify that this effect also exists in a very large data set at the scale of the world wide web (> 250,000). Since our data set contains multiple albums from individual artists, we can also show that the album effect is even larger than the artist effect.
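[Editor's note] A minimal sketch of how such artist and album effects can be measured in recommendation lists, assuming a precomputed pairwise distance matrix and per-song artist/album labels (the data layout and the measure are illustrative, not the authors' setup).

```python
import numpy as np

def neighbour_share(dist, artists, albums, k=10):
    """Fraction of each query's k nearest neighbours that share its artist
    or its album, averaged over all queries (illustrative measurement of
    the artist and album effects)."""
    same_artist, same_album = [], []
    for i in range(dist.shape[0]):
        nn = [j for j in np.argsort(dist[i]) if j != i][:k]
        same_artist.append(np.mean([artists[j] == artists[i] for j in nn]))
        same_album.append(np.mean([albums[j] == albums[i] for j in nn]))
    return float(np.mean(same_artist)), float(np.mean(same_album))

# Toy usage with a symmetric random distance matrix and a tiny catalogue
rng = np.random.default_rng(1)
d = rng.random((6, 6)); d = (d + d.T) / 2; np.fill_diagonal(d, 0)
print(neighbour_share(d, ["a", "a", "b", "b", "c", "c"],
                      ["x", "x", "y", "z", "w", "w"], k=2))
```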

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849593
Zenodo URL: https://zenodo.org/record/849593


2009.10
A Stratified Approach for Sound Spatialization
Peters, Nils   Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT), Schulich School of Music, Music Technology Area, McGill University; Montreal, Canada
Lossius, Trond   Bergen Center for Electronic Arts (BEK); Bergen, Norway
Schacher, Jan C.   Institute for Computer Music and Sound Technology (ICST), Zurich University of the Arts (ZHdK); Zurich, Switzerland
Baltazar, Pascal   Centre National de Creation Musicale (GMEA); Albi, France
Bascou, Charles   Centre National de Creation Musicale (GMEM); Marseille, France
Place, Timothy   Cycling '74; Paris, France

Abstract
We propose a multi-layer structure to mediate essential components in sound spatialization. This approach will facilitate artistic work with spatialization systems, a process which currently lacks structure, flexibility, and interoperability.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849587
Zenodo URL: https://zenodo.org/record/849587


2009.11
A System for Musical Improvisation Combining Sonic Gesture Recognition and Genetic Algorithms
Van Nort, Doug   Rensselaer Polytechnic Institute (RPI); Troy, United States
Braasch, Jonas   Rensselaer Polytechnic Institute (RPI); Troy, United States
Oliveros, Pauline   Rensselaer Polytechnic Institute (RPI); Troy, United States

Abstract
This paper describes a novel system that combines machine listening with evolutionary algorithms. The focus is on free improvisation, wherein the interaction between player, sound recognition and the evolutionary process provides an overall framework that guides the improvisation. The project is also distinguished by the close attention paid to the nature of the sound features, and the influence of their dynamics on the resultant sound output. The particular features for sound analysis were chosen in order to focus on timbral and textural sound elements, while the notion of “sonic gesture” is used as a framework for the note-level recognition of the performer's sound output, using a Hidden Markov Model based approach. The paper discusses the design of the system, the underlying musical philosophy that led to its construction, as well as the boundary between system and composition, citing a recent composition as an example application.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849589
Zenodo URL: https://zenodo.org/record/849589


2009.12
Auditory Representations as Landmarks in the Sound Design Space
Drioli, Carlo   Department of Computer Science, Università di Verona; Verona, Italy
Polotti, Pietro   Department of Computer Science, Università di Verona; Verona, Italy
Rocchesso, Davide   IUAV University of Venice; Venezia, Italy
Delle Monache, Stefano   IUAV University of Venice; Venezia, Italy
Adiloğlu, Kamil   Technische Universität Berlin (TU Berlin); Berlin, Germany
Anniés, Robert   Technische Universität Berlin (TU Berlin); Berlin, Germany
Obermayer, Klaus   Technische Universität Berlin (TU Berlin); Berlin, Germany

Abstract
A graphical tool for the timbre space exploration and interactive design of complex sounds by physical modeling synthesis is presented. It is built around an auditory representation of sounds based on spike functions and provides the designer with both a graphical and an auditory insight. The auditory representation of a number of reference sounds, located as landmarks in a 2D sound design space, provides the designer with an effective aid to direct his search along the paths that lie in the proximity of the most inspiring landmarks.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849597
Zenodo URL: https://zenodo.org/record/849597


2009.13
Automatic Jazz Harmony Evolution
Bäckman, Kjell   University West; Trollhättan, Sweden

Abstract
Throughout jazz history, jazz harmony has mainly been functionally based on principles of tonality derived from the Classical and Romantic periods of the 18th and 19th centuries. In the Evolutionary Jazz Harmony project we introduced a functionless harmony system that influenced the musical feeling of jazz compositions towards an avant-garde harmonic feeling. The main features of that new harmony system were chords not built on any specific base note and not necessarily connected to the major/minor concept. In this project we introduce an automatic evaluation of the produced harmony sequences that considers both each individual chord and the chord progression. A population of chord progressions is evaluated and the highest-ranked ones are most likely to be used for breeding the offspring. This project is one of the sub-projects of the EJI (Evolutionary Jazz Improvisation) project, in which we explore various aspects of jazz music: improvised solos, harmony, tune creation, algorithmic creation of piano, bass and drum accompaniment, communication between instruments, etc. The results have been evaluated by a live jazz group consisting of professional jazz musicians.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849599
Zenodo URL: https://zenodo.org/record/849599


2009.14
Automatic Manipulation of Music to Express Desired Emotions
Oliveira, António Pedro   Centre for Informatics and Systems, University of Coimbra; Coimbra, Portugal
Cardoso, Amílcar   Centre for Informatics and Systems, University of Coimbra; Coimbra, Portugal

Abstract
We are developing a computational system that produces music expressing desired emotions. This paper is focused on the automatic transformation of two emotional dimensions of music (valence and arousal) by changing musical features: tempo, pitch register, musical scales, instruments and articulation. Transformation is supported by two regression models, each with weighted mappings between an emotional dimension and music features. We also present two algorithms used to sequence segments. We conducted an experiment with 37 listeners who were asked to label online two emotional dimensions of 132 musical segments. Data from this experiment were used to test the effectiveness of the transformation algorithms and to update the weights of features in the regression models. Tempo and pitch register proved to be relevant for both valence and arousal. Musical scales and instruments were also relevant for both emotional dimensions but with a lower impact. Staccato articulation influenced only valence.
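[Editor's note] The sketch below illustrates, under stated assumptions, the general shape of such a regression-based transformation: two linear models map musical features to valence and arousal, and a feature vector is nudged toward target values. The feature names, the linear model and the update step are assumptions, not the authors' system.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Stand-in data: five illustrative feature columns (e.g. tempo, pitch
# register, scale, instrumentation, articulation) for 132 segments, plus
# synthetic listener ratings; values and feature names are assumptions.
rng = np.random.default_rng(2)
X = rng.random((132, 5))
valence = rng.random(132)
arousal = rng.random(132)

valence_model = LinearRegression().fit(X, valence)
arousal_model = LinearRegression().fit(X, arousal)

def step_towards(features, target_v, target_a, lr=0.1):
    """Nudge a feature vector so the predicted valence/arousal move towards
    the desired values (a gradient-style sketch of feature transformation)."""
    err_v = valence_model.predict(features[None, :])[0] - target_v
    err_a = arousal_model.predict(features[None, :])[0] - target_a
    return features - lr * (err_v * valence_model.coef_ + err_a * arousal_model.coef_)

print(step_towards(X[0], target_v=0.8, target_a=0.3))
```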

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849601
Zenodo URL: https://zenodo.org/record/849601


2009.15
Computational Investigations Into Between-hand Synchronization in Piano Playing: Magaloff's Complete Chopin
Goebl, Werner   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria
Flossmann, Sebastian   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria
Widmer, Gerhard   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria

Abstract
The paper reports on first steps towards automated computational analysis of a unique and unprecedented corpus of symbolic performance data. In particular, we focus on between-hand asynchronies – an expressive device that plays an important role particularly in Romantic music, but has not been analyzed quantitatively in any substantial way. The historic data were derived from performances by the renowned pianist Nikita Magaloff, who played the complete work of Chopin live on stage, on a computer-controlled grand piano. The mere size of this corpus (over 320,000 performed notes or almost 10 hours of continuous performance) challenges existing analysis approaches. The computational steps include score extraction, score-performance matching, definition and measurement of the analyzed features, and a computational visualization tool. We then present preliminary data to demonstrate the potential of our approach for future computational modeling and its application in computational musicology.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849603
Zenodo URL: https://zenodo.org/record/849603


2009.16
Content Analysis of Note Transitions in Music Performance
Loureiro, Maurício   School of Music, Universidade Federal de Minas Gerais (UFMG); Belo Horizonte, Brazil
Yehia, Hani   School of Engineering, Universidade Federal de Minas Gerais (UFMG); Belo Horizonte, Brazil
de Paula, Hugo   Pontifical Catholic University of Minas Gerais (PUC Minas); Belo Horizonte, Brazil
Campolina, Thiago   School of Engineering, Universidade Federal de Minas Gerais (UFMG); Belo Horizonte, Brazil
Mota, Davi   School of Music, Universidade Federal de Minas Gerais (UFMG); Belo Horizonte, Brazil

Abstract
Different aspects of music performance have been quantified by a set of descriptor parameters, which try to capture the resources used by the performer to communicate his/her intention of expressiveness and intelligibility. The quality of note transitions is quite important in the construction of an interpretation. Transitions are manipulated by the performer by controlling note durations and the quality of attacks and note groupings. These characteristics can be modeled by parameters that describe what happens between the notes of a musical sentence and attempt to represent how we perceive note articulations and groupings of legato or detached notes. On the other hand, the quality of transitions between legato notes may be related to the musician's abilities, to the reverberation characteristics of the performance room and to the acoustic characteristics of the instrument. This text illustrates methods for the extraction and definition of descriptor parameters related to the quality of transitions between notes, which are capable of revealing relevant aspects of how these transitions are accomplished. The procedures described here are part of a model for analyzing expressiveness in the performance of monophonic musical instruments. The samples used consist of recordings of interpretations of excerpts from the classical repertoire for solo clarinet.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849605
Zenodo URL: https://zenodo.org/record/849605


2009.17
Current Directions With Music Plus One
Raphael, Christopher   Indiana University Bloomington; Bloomington, United States

Abstract
We discuss the varieties of musical accompaniment systems and place our past efforts in this context. We present several new aspects of our ongoing work in this area. The basic system is presented in terms of the tasks of score following, modeling of musical timing, and the computational issues of the actual implementation. We describe some improvements in the probabilistic modeling of the audio data, as well as some ideas for more sophisticated modeling of musical timing. We present a set of recent pieces for live player and computer controlled pianos, written specifically for our accompaniment system. Our presentation will include a live demonstration of this work.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849607
Zenodo URL: https://zenodo.org/record/849607


2009.18
Dissonances: Brief Description and Its Computational Representation in the RTCC Calculus
Perchy, Salim   AVISPA (Ambientes Visuales de Programación Aplicativa) Research Group, Pontificia Universidad Javeriana de Cali; Cali, Colombia
Sarria Montemiranda, Gerardo Mauricio   AVISPA (Ambientes Visuales de Programación Aplicativa) Research Group, Pontificia Universidad Javeriana de Cali; Cali, Colombia

Abstract
Dissonances in music have had a long evolutionary history, ranging from days of strict prohibition to times of enrichment of musical motives and forms. Nowadays, dissonances account for most of musical expressiveness and are supported by a full theory of application, making them a frequently adopted compositional resource. This work partially describes their theoretical background as well as their evolution in music, and finally proposes a new model for their computational use.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849609
Zenodo URL: https://zenodo.org/record/849609


2009.19
Does a "natural" Sonic Feedback Affect Perceived Usability and Emotion in the Context of Use of an ATM?
Susini, Patrick   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Misdariis, Nicolas   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Houix, Olivier   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Lemaitre, Guillaume   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France

Abstract
The present study examines the question of a “natural” sonic feedback associated with the keys of a numerical keyboard, in the context of use of an Automatic Teller Machine (ATM). “Natural” is defined here as an obvious sound feedback with regard to the action made by a user on a device. The aim is then to study how “naturalness” is related to the perceived usability and the perceived emotion of the sonic feedback before and after participants perform several tasks with the keyboard. Three levels of “naturalness” are defined: causal, iconic, and abstract. In addition, two levels of controlled usability of the system are used: a low level and a high one. Results show that pre-experimental ratings of perceived “naturalness” and perceived usability were highly correlated. This relationship held after the participants interacted with the keyboard. “Naturalness” and emotional aspects were less dependent, revealing that “naturalness” and usability represent a special type of relation. However, results are affected by the level of controlled usability of the system. Indeed, the positive change at the high level of controlled usability for the iconic sounds (medium level of naturalness) obtained after the performance task fails to appear at the low level of controlled usability.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849611
Zenodo URL: https://zenodo.org/record/849611


2009.20
Empirically Based Auditory Display Design
Brazil, Eoin   Interaction Design Center, University of Limerick; Limerick, Ireland
Fernstrom, Mikael   Interaction Design Center, University of Limerick; Limerick, Ireland

Abstract
This paper focuses on everyday sounds and in particular on sound description, sound understanding, sound synthesis/modelling and on sonic interaction design. The argument made in this paper is that the quantitative-analytical reductionist approach reduces a phenomenon into isolated individual parts which do not reflect the richness of the whole, as also noted by Widmer et al. [1]. As with music, so is it for everyday sounds that multidimensional approaches and techniques from various domains are required to address the complex interplay of the various facets in these types of sounds. An empirically inspired framework for sonic interaction design is proposed that incorporates methods and tools from perceptual studies, from auditory display theories, and from machine learning theories. The motivation for creating this framework is to provide designers with accessible methods and tools, to help them bridge the semantic gap between low-level perceptual studies and high-level semantically meaningful concepts. The framework is designed to be open and extendable to other types of sound such as music.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849613
Zenodo URL: https://zenodo.org/record/849613


2009.21
Enhancing Expressive and Technical Performance in Musical Video Games
Marczak, Raphaël   Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université Bordeaux-I; Bordeaux, France
Robine, Matthias   Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université Bordeaux-I; Bordeaux, France
Desainte-Catherine, Myriam   Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université Bordeaux-I; Bordeaux, France
Allombert, Antoine   Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université Bordeaux-I; Bordeaux, France
Hanna, Pierre   Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université Bordeaux-I; Bordeaux, France
Kurtág, György   Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université Bordeaux-I; Bordeaux, France

Abstract
Musical video games are best-selling games. One of their selling points is the improvement of players' musical abilities. But interviews with gamers and musicians show that while the former feel enough freedom in musical expression, the latter are more sceptical and feel more limited in their ability to express themselves when playing. In parallel with game development, some research works propose improving control and meta-control of music to allow expressive performance without the user being a virtuoso. The goal of this article is first to present interviews conducted to gather users' opinions. Some research works on improving expressive musical performance are then presented. Finally, we propose games that enhance expressive and technical musical performance, linking current gameplay with current research. Sixty-seven percent of rhythm gamers will certainly buy a real instrument in the mid-term. So playing musical video games could make players want to play real instruments. But do they really learn music by playing this kind of game? Do they have enough freedom to express themselves musically? And what about the sensation of playing as a band when several players are allowed? This article starts with a description of the main characteristics of musical video games in section 2. Section 3 presents the main categories of musical video games with some famous examples. Interviews with different gamers and musicians illustrate the main assets and limits of this kind of game in section 4. We then present in section 5 some hardware and software developed in a computer music research environment. A way of pooling musical video games and this research into innovative games is finally proposed in section 6.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849615
Zenodo URL: https://zenodo.org/record/849615


2009.22
Explorations in Convolutional Synthesis
Smyth, Tamara   School of Computing Science, Simon Fraser University; Vancouver, Canada
Elmore, Andrew R.   School of Computing Science, Simon Fraser University; Vancouver, Canada

Abstract
In this work we further explore a previously proposed technique, used in the context of physical modeling synthesis, whereby a waveguide structure is replaced by a low-latency convolution operation with an impulse response. By doing so, there is no longer the constraint that successive arrivals be uniformly spaced, nor need they decay exponentially as they must in a waveguide structure. The structure of an impulse response corresponding to an acoustic tube is discussed, with possible synthesis parameters identified. Suggestions are made for departing from a physically-constrained structure, looking in particular at impulse responses that are mathematically-based and/or that correspond to hybrid or multi-phonic instruments by interleaving two or more impulse responses. The result is an exploration of virtual musical instruments that are either based on physical instruments, completely imagined, or somewhere in between.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849617
Zenodo URL: https://zenodo.org/record/849617


2009.23
Expressive Performance Rendering: Introducing Performance Context
Flossmann, Sebastian   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria
Grachten, Maarten   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria
Widmer, Gerhard   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria / Austrian Research Institute for Artificial Intelligence (OFAI); Vienna, Austria

Abstract
This paper describes an expressive performance rendering system that uses a probabilistic network to model dependencies between score and performance. The score context of a note is used to predict the corresponding performance characteristics. Two extensions to the system are presented, which aim at incorporating the current performance context into the prediction, which should result in more stable and consistent predictions. In particular we generalise the Viterbi algorithm, which works on discrete-state Hidden Markov Models, to continuous distributions and use it to calculate the overall most probable sequence of performance predictions. The algorithms are evaluated and compared on two very large data sets of human piano performances: 13 complete Mozart sonatas and the complete works for solo piano by Chopin.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849619
Zenodo URL: https://zenodo.org/record/849619


2009.24
Extending the Folksonomies of freesound.org Using Content-based Audio Analysis
Martínez, Elena   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Celma, Òscar   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Sordo, Mohamed   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
de Jong, Bram   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Serra, Xavier   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain

Abstract
This paper presents an in-depth study of the social tagging mechanisms used in Freesound.org, an online community where users share and browse audio files by means of tags and content-based audio similarity search. We performed two analyses of the sound collection. The first one is related to how users tag the sounds, and we could detect some well-known problems that occur in collaborative tagging systems (i.e. polysemy, synonymy, and the scarcity of the existing annotations). Moreover, we show that more than 10% of the collection was scarcely annotated with only one or two tags per sound, thus frustrating the retrieval task. In this sense, the second analysis focuses on enhancing the semantic annotations of these sounds by means of content-based audio similarity (autotagging). In order to “autotag” the sounds, we use a k-NN classifier that selects the available tags from the most similar sounds. Human assessment is performed in order to evaluate the perceived quality of the candidate tags. The results show that, in 77% of the sounds used, the annotations have been correctly extended with the proposed tags derived from audio similarity.
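[Editor's note] A minimal sketch of the k-NN autotagging idea described above, assuming precomputed content-based feature vectors; the distance measure and tag-voting scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from collections import Counter

def autotag(query_vec, feature_matrix, tag_lists, k=5, n_tags=3):
    """Propose tags for a sparsely annotated sound by pooling the tags of
    its k acoustically nearest neighbours and keeping the most frequent."""
    dists = np.linalg.norm(feature_matrix - query_vec, axis=1)
    neighbours = np.argsort(dists)[:k]
    votes = Counter(tag for i in neighbours for tag in tag_lists[i])
    return [tag for tag, _ in votes.most_common(n_tags)]

# Toy usage with synthetic feature vectors
rng = np.random.default_rng(3)
feats = rng.random((4, 8))
tags = [["rain", "field-recording"], ["rain", "thunder"], ["synth"], ["rain"]]
print(autotag(rng.random(8), feats, tags, k=3))
```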

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849621
Zenodo URL: https://zenodo.org/record/849621


2009.25
First-order Logic Classification Models of Musical Genres Based on Harmony
Anglade, Amélie   Centre for Digital Music (C4DM), Queen Mary University of London; London, United Kingdom
Ramirez, Rafael   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Dixon, Simon   Centre for Digital Music (C4DM), Queen Mary University of London; London, United Kingdom

Abstract
We present an approach for the automatic extraction of transparent classification models of musical genres based on harmony. To allow for human-readable classification models we adopt a first-order logic representation of harmony and musical genres: pieces of music are represented as lists of chords and musical genres are seen as context-free definite clause grammars using subsequences of these chord lists. To induce the context-free definite clause grammars characterising the genres we use a first-order logic decision tree induction algorithm, Tilde. We test this technique on 856 Band in a Box files representing academic, jazz and popular music. We perform 2-class and 3-class classification tasks on this dataset and obtain good classification results: around 66% accuracy for the 3-class problem and between 72% and 86% accuracy for the 2-class problems. A preliminary analysis of the most common rules extracted from the decision tree models built during these experiments reveals a list of interesting and/or well-known jazz, academic and popular music harmony patterns.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849623
Zenodo URL: https://zenodo.org/record/849623


2009.26
FM4 Soundpark Audio-based Music Recommendation in Everyday Use
Gasser, Martin   Austrian Research Institute for Artificial Intelligence (OFAI); Vienna, Austria
Flexer, Arthur   Austrian Research Institute for Artificial Intelligence (OFAI); Vienna, Austria

Abstract
We present an application of content-based music recommendation techniques within an online community platform targeted at an audience interested mainly in independent and alternative music. The web platform’s goals will be described, the problems of content management approaches based on daily publishing of new music tracks will be discussed, and we will give an overview of the user interfaces that have been developed to simplify access to the music collection. Finally, the adoption of content-based music recommendation tools and new user interfaces to improve user acceptance and recommendation quality will be justified by detailed user access analyses.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849625
Zenodo URL: https://zenodo.org/record/849625


2009.27
Free Classification of Vocal Imitations of Everyday Sounds
Dessein, Arnaud   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Lemaitre, Guillaume   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France

Abstract
This paper reports on the analysis of a free classification of vocal imitations of everyday sounds. The goal is to highlight the acoustical properties that have allowed the listeners to classify these imitations into categories that are closely related to the categories of the imitated sound sources. We present several specific techniques that have been developed to this end. First, the descriptions provided by the participants suggest that they have used different kinds of similarities to group together the imitations. A method to assess the individual strategies is therefore proposed and makes it possible to detect an outlier participant. Second, the participants’ classifications are submitted to a hierarchical clustering analysis, and clusters are created using the inconsistency coefficient, rather than the height of fusion. The relevance of the clusters is discussed and seven of them are chosen for further analysis. These clusters are predicted perfectly with a few pertinent acoustic descriptors, and using very simple binary decision rules. This suggests that the acoustic similarities overlap with the similarities used by the participants to perform the classification. However, several issues need to be considered to extend these results to the imitated sounds.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849627
Zenodo URL: https://zenodo.org/record/849627


2009.28
Instrument Augmentation Using Ancillary Gestures for Subtle Sonic Effects
Lähdeoja, Otso   CICM, University Paris VIII; Paris, France
Wanderley, Marcelo M.   Input Devices and Music Interaction Laboratory (IDMIL), Schulich School of Music, Music Technology Area, McGill University; Montreal, Canada
Malloch, Joseph   Input Devices and Music Interaction Laboratory (IDMIL), Schulich School of Music, Music Technology Area, McGill University; Montreal, Canada

Abstract
In this paper we present an approach to instrument augmentation that uses the musician's ancillary gestures to enhance the liveliness of real-time digitally processed sound. In augmented instrument praxis, the simultaneous control of the initial instrument and its electric/electronic extension is a challenge due to the musician's physical and psychological constraints. Our work seeks to address this problem by designing non-direct gesture-sound relationships between ancillary gestures and subtle sonic effects, which do not require full conscious control by the instrumentalist. An application for the electric guitar is presented on the basis of an analysis of the ancillary movements occurring in performance, with specific gesture data acquisition and mapping strategies, as well as examples of musical uses. While the research work focuses on the electric guitar, the system is not instrument-specific, and can be applied to any instrument using digital sound processing.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849629
Zenodo URL: https://zenodo.org/record/849629


2009.29
Interactive Infrasonic Environment: A New Type of Sound Installation for Controlling Infrasound
Gupfinger, Reinhard   Graduate School of Interface Culture, Kunstuniversität Linz (UFG); Linz, Austria
Ogawa, Hideaki   Graduate School of Interface Culture, Kunstuniversität Linz (UFG); Linz, Austria
Sommerer, Christa   Graduate School of Interface Culture, Kunstuniversität Linz (UFG); Linz, Austria
Mignonneau, Laurent   Graduate School of Interface Culture, Kunstuniversität Linz (UFG); Linz, Austria

Abstract
This paper proposes a new type of interactive sound instrument for use with audiences in sound installations and musical performances. The Interactive Infrasonic Environment allows users to perceive and experiment with the vibration and acoustic energy produced by infrasound. This article begins with a brief overview of infrasound and examines its generation, human perception, areas of application and some odd myths. Infrasound is sound with a frequency lower than 20 hertz (20 cycles per second) – outside the normal limits of human hearing. Nevertheless the human body can perceive such low frequencies via cross-modal senses. This paper describes three key aspects of infrasonic sound technologies: the artificial generation of infrasound, the human perception of infrasound, and the interactive environment for sound installations and musical performances. Additionally we illustrate these ideas with related works from the field of sound art and interactive art.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849631
Zenodo URL: https://zenodo.org/record/849631


2009.30
InTune: A Musician's Intonation Visualization System
Lim, Kyung Ae   School of Informatics, Indiana University Bloomington; Bloomington, United States
Raphael, Christopher   School of Informatics, Indiana University Bloomington; Bloomington, United States

Abstract
We present a freely downloadable program, InTune, designed to help musicians better hear and improve their intonation. The program uses the musical score from which the musician plays, assisting our approach in two ways. First, we use score following to automatically align the player’s audio signal to the musical score, thus allowing better and more flexible estimation of pitch. Second, we use the score match to present the tuning analysis in ways that are natural and intuitive for musicians. One representation presents the player with a marked-up musical score showing notes whose pitch characteristics deserve closer attention. Two other visual representations of audio overlay a musical time grid on the music data and allow random access to the audio, keyed by musical time. We present a user study involving 20 highly educated instrumentalists and vocalists.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849633
Zenodo URL: https://zenodo.org/record/849633


2009.31
JackTrip/SoundWIRE Meets Server Farm
Cáceres, Juan-Pablo   Center for Computer Research in Music and Acoustics (CCRMA), Stanford University; Stanford, United States
Chafe, Chris   Center for Computer Research in Music and Acoustics (CCRMA), Stanford University; Stanford, United States

Abstract
Even though bidirectional, high-quality and low-latency audio systems for network performance are available, the complexity involved in setting up remote sessions calls for better tools and methods to assess and tune network parameters. We present an implementation of a system to intuitively evaluate the Quality of Service (QoS) on best-effort networks. In our implementation, musicians are able to connect to a multi-client server and tune the parameters of a connection using direct “auditory displays.” The server can scale up to hundreds of users by taking advantage of modern multi-core machines and multi-threaded programming techniques. It also serves as a central “mixing hub” when network performance involves several participants.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849635
Zenodo URL: https://zenodo.org/record/849635


2009.32
Learning Jazz Grammars
Gillick, Jon   Wesleyan University; Middletown, United States
Tang, Kevin   Cornell University; Ithaca, United States
Keller, Robert M.   Harvey Mudd College; Claremont, United States

Abstract
We are interested in educational software tools that can generate novel jazz solos in a style representative of a body of performed work, such as solos by a specific artist. Our approach is to provide automated learning of a grammar from a corpus of performances. Use of a grammar is robust, in that it can provide generation of solos over novel chord changes, as well as ones used in the learning process. Automation is desired because manual creation of a grammar in a particular playing style is a labor-intensive, trial-and-error, process. Our approach is based on unsupervised learning of a grammar from a corpus of one or more performances, using a combination of clustering and Markov chains. We first define the basic building blocks for contours of typical jazz solos, which we call “slopes”, then show how these slopes may be incorporated into a grammar wherein the notes are chosen according to tonal categories relevant to jazz playing. We show that melodic contours can be accurately portrayed using slopes learned from a corpus. By reducing turn-around time for grammar creation, our method provides new flexibility for experimentation with improvisational styles. Initial experimental results are reported.
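[Editor's note] A toy sketch of the contour idea: a first-order Markov chain is learned over abstract “slope” symbols and then sampled to generate a new contour. The symbol alphabet and learning scheme are illustrative; the paper's grammar induction combines clustering with Markov chains in a more elaborate way.

```python
import random
from collections import defaultdict

def learn_transitions(slope_sequences):
    """Count first-order transitions between contour symbols ('slopes')."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in slope_sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return counts

def generate(counts, start, length, seed=0):
    """Sample a new contour from the learned transition counts."""
    random.seed(seed)
    out, state = [start], start
    for _ in range(length - 1):
        nxt = counts.get(state)
        if not nxt:
            break
        symbols, weights = zip(*nxt.items())
        state = random.choices(symbols, weights=weights)[0]
        out.append(state)
    return out

corpus = [["up", "up", "down", "flat"], ["up", "down", "down", "flat"]]
print(generate(learn_transitions(corpus), "up", length=6))
```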

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849639
Zenodo URL: https://zenodo.org/record/849639


2009.33
L-systems, Scores, and Evolutionary Techniques
Figueira Lourenço, Bruno   Department of Computational Science, University of Brasilia; Brasilia, Brazil
Ralha, José C. L.   Department of Computational Science, University of Brasilia; Brasilia, Brazil
Brandão, Márcio C. P.   Department of Computational Science, University of Brasilia; Brasilia, Brazil

Abstract
Although musical interpretation of L-Systems has not been explored as extensively as the graphical interpretation, there are many ways of creating interesting musical scores from strings generated by L-Systems. In this article we present some thoughts on this subject and propose the use of genetic operators with L-System to increase variability.
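[Editor's note] A minimal sketch of L-system string rewriting combined with a simple mutation operator on the production rules, as an illustration of how genetic operators could increase variability; the rules and the operator are toy assumptions, not the authors' design.

```python
import random

def rewrite(axiom, rules, depth):
    """Apply L-system production rules to a string for `depth` iterations."""
    s = axiom
    for _ in range(depth):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

def mutate_rules(rules, alphabet, rate=0.1, seed=0):
    """A toy genetic operator: randomly replace symbols in rule bodies."""
    random.seed(seed)
    return {lhs: "".join(random.choice(alphabet) if random.random() < rate else ch
                         for ch in rhs)
            for lhs, rhs in rules.items()}

rules = {"A": "AB", "B": "A"}                    # the classic 'algae' L-system
print(rewrite("A", rules, 5))                    # -> ABAABABAABAAB
print(rewrite("A", mutate_rules(rules, "AB"), 5))
# Each symbol of the final string would then be mapped to a note, rest or
# duration to produce a musical score.
```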

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849637
Zenodo URL: https://zenodo.org/record/849637


2009.34
Making an Orchestra Speak
Nouno, Gilbert   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Cont, Arshia   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Carpentier, Grégoire   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Harvey, Jonathan   Composer, Independent; United Kingdom

Abstract
This paper reports on various aspects of the computer music realization of “Speakings” for live electronics and large orchestra by composer Jonathan Harvey, with the artistic aim of making an orchestra speak through computer music processes. The underlying project calls for various computer music techniques: computer-aided composition as an aid for the composer's writing of instrumental scores, as well as real-time computer music techniques for the electronic music realization and for performance on stage with an orchestra. Besides the realization techniques, the problem itself brings challenges for existing computer music techniques that required the authors to pursue further research and studies in various fields. The current paper thus documents this collaborative process and introduces technical aspects of the proposed methods in each area, with an emphasis on the artistic aim of the project.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849641
Zenodo URL: https://zenodo.org/record/849641


2009.35
Motion-enabled Live Electronics
Eckel, Gerhard   Institute of Electronic Music and Acoustics (IEM), University of Music and Performing Arts (KUG); Graz, Austria
Pirrò, David   Institute of Electronic Music and Acoustics (IEM), University of Music and Performing Arts (KUG); Graz, Austria
Sharma, Gerriet Krishna   Institute of Electronic Music and Acoustics (IEM), University of Music and Performing Arts (KUG); Graz, Austria

Abstract
Motion-Enabled Live Electronics (MELE) is a special approach towards live electronic music aiming at increasing the degree of the performers’ embodiment in shaping the sound processing. This approach is characterized by the combination of a high-resolution and fully-3D motion tracking system with a tracking data processing system tailored towards articulating the relationship between bodily movement and sound processing. The artistic motivations driving the MELE approach are described, an overview of related work is given and the technical setup used in a workshop exploring the approach is introduced. Brief descriptions of the pieces realized in the workshop and performed in the final concert inform the presentation of the conclusions drawn from the workshop.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849643
Zenodo URL: https://zenodo.org/record/849643


2009.36
Musical Applications and Design Techniques for the Gametrak Tethered Spatial Position Controller
Freed, Adrian   Center for New Music and Audio Technologies (CNMAT), University of California, Berkeley; Berkeley, United States
McCutchen, Devin   Center for New Music and Audio Technologies (CNMAT), University of California, Berkeley; Berkeley, United States
Schmeder, Andy   Center for New Music and Audio Technologies (CNMAT), University of California, Berkeley; Berkeley, United States
Skriver Hansen, Anne-Marie   Department for Media Technology, Aalborg University; Aalborg, Denmark
Overholt, Daniel   Department for Media Technology, Aalborg University; Aalborg, Denmark
Burleson, Winslow   Arizona State University; Tempe, United States
Nørgaard Jensen, Camilla   Arizona State University; Tempe, United States
Mesker, Alex   Macquarie University; Sydney, Australia

Abstract
Novel Musical Applications and Design Techniques for the Gametrak tethered spatial positioning controller are described. Individual musical instrument controllers and large-scale musical and multimedia applications are discussed.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849645
Zenodo URL: https://zenodo.org/record/849645


2009.37
Musical Groove Is Correlated With Properties of the Audio Signal as Revealed by Computational Modelling, Depending on Musical Style
Madison, Guy   Umeå University; Umeå, Sweden
Gouyon, Fabien   INESC TEC; Porto, Portugal
Ullén, Fredrik   Karolinska Institutet; Solna, Sweden

Abstract
With groove we mean the subjective experience of wanting to move rhythmically when listening to music. Previous research has indicated that physical properties of the sound signal contribute to groove - as opposed to mere association due to previous exposure, for example. Here, a number of quantitative descriptors of rhythmic and temporal properties were derived from the audio signal by means of computational modeling methods. The music examples were 100 samples from 5 distinct music styles, which were all unfamiliar to the listeners. Listeners’ ratings of groove were correlated with aspects of rhythmic patterning for Greek, Indian, Samba, and West African music. Microtiming was positively correlated with groove for Samba and negatively correlated with groove for Greek, but had very small unique contributions in addition to the rhythmical properties. For Jazz, none of the measured properties had any significant contributions to groove ratings.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849647
Zenodo URL: https://zenodo.org/record/849647


2009.38
Musical Voice Integration/Segregation: VISA Revisited
Rafailidis, Dimitris   Department of Informatics, Aristotle University of Thessaloniki; Thessaloniki, Greece
Cambouropoulos, Emilios   Department of Music Studies, Aristotle University of Thessaloniki; Thessaloniki, Greece
Manolopoulos, Yannis   Department of Informatics, Aristotle University of Thessaloniki; Thessaloniki, Greece

Abstract
The Voice Integration/Segregation Algorithm (VISA) proposed by Karydis et al. [7] splits musical scores (symbolic musical data) into different voices, based on a perceptual view of musical voice that corresponds to the notion of auditory stream. A single ‘voice’ may consist of more than one synchronous note that is perceived as belonging to the same auditory stream. The algorithm was initially tested against a handful of musical works that were carefully selected so as to contain a steady number of streams (contrapuntal voices or melody with accompaniment). The initial algorithm was successful on this small dataset, but proved to run into serious problems in cases where the number of streams/voices changed during the course of a musical work. A new version of the algorithm has been developed that attempts to solve this problem; the new version, additionally, includes an improved mechanism for context-dependent breaking of chords and for keeping streams homogeneous. The new algorithm performs equally well on the old dataset, but gives much better results on the new larger and more diverse dataset.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849649
Zenodo URL: https://zenodo.org/record/849649


2009.39
New Tendencies in the Digital Music Instrument Design: Progress Report
Ferreira-Lopes, Paulo   Research Centre in Science and Technology of the Arts (CITAR), Catholic University of Portugal; Porto, Portugal / Hochschule für Musik Karlsruhe; Karlsruhe, Germany
Vitez, Florian   Hochschule für Musik Karlsruhe; Karlsruhe, Germany
Dominguez, Daniel   Hochschule für Musik Karlsruhe; Karlsruhe, Germany
Wikström, Vincent   Hochschule für Musik Karlsruhe; Karlsruhe, Germany

Abstract
This paper is a progress report from a workgroup at the University of Music Karlsruhe studying music technology at the Institut für Musikwissenschaft und Musikinformatik (Institute for Musicology and Music Technology). The group's activity is focused on the development and design of computer-controlled instruments – digital music instruments [5]. We describe three digital music instruments developed at the Computer Studio. These instruments are mostly unified by the idea of human gesture and human interaction, using new technologies to control the interaction processes. At the same time, they were built on an awareness of musical tradition while taking a fresh approach to everyday objects.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849651
Zenodo URL: https://zenodo.org/record/849651


2009.40
Non-negative Matrix Factorization With Selective Sparsity Constraints for Transcription of Bell Chiming Recordings
Marolt, Matija   Faculty of Computer and Information Science, University of Ljubljana; Ljubljana, Slovenia

Abstract
The paper presents a method for automatic transcription of recordings of bell chiming performances. Bell chiming is a Slovenian folk music tradition involving performers playing tunes on church bells by holding the clapper and striking the rim of a stationary bell. The tunes played consist of repeated rhythmic patterns into which various changes are included. Because the sounds of bells are inharmonic and their tuning not known in advance, we propose a two step approach to transcription. First, by analyzing the covariance matrix of the time-frequency representation of a recording, we estimate the number of bells and their approximate spectra using prior knowledge of church bell acoustics and bell chiming performance rules. We then propose a non-negative matrix factorization algorithm with selective sparsity constraints that learns the basis vectors that approximate the previously estimated bell spectra. The algorithm also adapts the number of basis vectors during learning. We show how to apply the proposed method to bell chiming transcription and present results on a set of field recordings.
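[Editor's note] A generic sparse-NMF sketch of the kind of factorization involved, with an L1 penalty on the activations; the paper's selective sparsity constraints, covariance-based spectrum estimation and adaptive number of basis vectors are not reproduced here.

```python
import numpy as np

def nmf_sparse(V, W_init, n_iter=200, sparsity=0.1, eps=1e-9):
    """Multiplicative-update NMF, V ~ W @ H, with an L1 penalty on the
    activations H to encourage sparse onsets."""
    W = W_init.copy()
    H = np.random.default_rng(0).random((W.shape[1], V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)   # sparse activation update
        W *= (V @ H.T) / (W @ H @ H.T + eps)              # basis (spectrum) update
    return W, H

# Toy usage: two synthetic 'bell' spectra mixed over 100 frames
rng = np.random.default_rng(4)
true_W = rng.random((64, 2))
V = true_W @ rng.random((2, 100))
W, H = nmf_sparse(V, W_init=true_W + 0.05 * rng.random((64, 2)))
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))   # relative reconstruction error
```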

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849653
Zenodo URL: https://zenodo.org/record/849653


2009.41
Parallelization of Audio Applications With Faust
Orlarey, Yann   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France
Fober, Dominique   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France
Letz, Stéphane   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France

Abstract
Faust 0.9.9.6 introduces new compilation options to automatically parallelize audio applications. This paper explains how the automatic parallelization is done and presents some benchmarks.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849655
Zenodo URL: https://zenodo.org/record/849655


2009.42
Polynomial Extrapolation for Prediction of Surprise Based on Loudness - a Preliminary Study
Purwins, Hendrik   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain / Neural Information Processing Group, Technische Universität Berlin (TU Berlin); Berlin, Germany
Holonowicz, Piotr   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Herrera, Perfecto   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain

Abstract
The phenomenon of musical surprise can be evoked by various musical features, such as intensity, melody, harmony, and rhythm. In this preliminary study we concentrate on the aspect of intensity. We formulate surprise as a critical deviation from the predicted next intensity value, based on the “immediate” past (∼ 7 s), slightly longer than the short-term memory. Higher-level cognition, processing the long-range structure of the piece and general stylistic knowledge, is not considered by the model. The model consists of an intensity calculation step and a prediction function. As a preprocessing method we compare instantaneous energy (root mean square), loudness, and relative specific loudness. This processing stage is followed by a prediction function, for which the following alternative implementations are compared: 1) discrete temporal difference of intensity functions, 2) FIR filter, and 3) polynomial extrapolation. In addition, we experimented with different analysis window lengths, sampling intervals and hop sizes of the intensity curve. Good results are obtained for loudness and polynomial extrapolation based on an analysis frame of 7 s, a sampling interval for the loudness measures of 1.2 s, and a hop size of 0.6 s. In the polynomial extrapolation, a polynomial of degree 2 is fitted to the loudness curve in the analysis window. The absolute difference between the extrapolated next loudness value and the actual value is then calculated and divided by the standard deviation within the analysis window. If the result is above a threshold value we predict surprise. The method is preliminarily evaluated on a few classical music examples.
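
A minimal sketch of the polynomial-extrapolation variant described above, assuming a loudness curve sampled at a fixed interval is already available; the frame length and threshold below are placeholder values, not the ones tuned in the paper.

import numpy as np

def detect_surprise(loudness, frame_len=12, threshold=3.0):
    """loudness: 1-D array of loudness values sampled at a fixed interval
    (e.g. roughly 12 samples covering a ~7 s analysis frame).
    Returns indices where the next value deviates 'surprisingly' from a
    degree-2 polynomial extrapolation of the current analysis frame."""
    surprises = []
    for t in range(frame_len, len(loudness)):
        window = loudness[t - frame_len:t]
        x = np.arange(frame_len)
        coeffs = np.polyfit(x, window, deg=2)        # fit degree-2 polynomial to the frame
        predicted = np.polyval(coeffs, frame_len)    # extrapolate one step ahead
        deviation = abs(loudness[t] - predicted) / (np.std(window) + 1e-9)
        if deviation > threshold:                    # normalized deviation above threshold
            surprises.append(t)
    return surprises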

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849657
Zenodo URL: https://zenodo.org/record/849657


2009.43
Predicting the Perceived Spaciousness of Stereophonic Music Recordings
Sarroff, Andy M.   New York University (NYU); New York, United States
Bello, Juan Pablo   New York University (NYU); New York, United States

Abstract
In a stereophonic music production, music producers seek to impart impressions of one or more virtual spaces upon a recording with two channels of audio. Our goal is to map spaciousness in stereophonic music to objective signal attributes. This is accomplished by building predictive functions through exemplar-based learning. First, the spaciousness of recorded stereophonic music is parameterized by three discrete dimensions of perception—the width of the source ensemble, the extent of reverberation, and the extent of immersion. A data set of 50 song excerpts is collected and annotated by humans for each dimension of spaciousness. A large feature set is generated on the music recordings and correlation-based feature selection is used to reduce the feature spaces. Exemplar-based support vector regression maps the feature sets to perceived spaciousness. We show that the predictive algorithms perform well on all dimensions and that perceived spaciousness can be successfully mapped to objective attributes of the audio signal.
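
Not the authors' pipeline, but a sketch of the same general recipe (feature selection followed by support vector regression). Univariate f_regression selection stands in here for correlation-based feature selection, and the data arrays are random placeholders.

import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# X: audio features per excerpt, y: mean human rating for one spaciousness
# dimension (e.g. extent of reverberation) -- placeholder random data here.
X = np.random.rand(50, 120)
y = np.random.rand(50)

model = make_pipeline(
    SelectKBest(score_func=f_regression, k=20),  # stand-in for correlation-based selection
    SVR(kernel="rbf", C=1.0),
)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(scores.mean())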

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849659
Zenodo URL: https://zenodo.org/record/849659


2009.44
Prototyping Musical Experiments for Tangisense, a Tangible and Traceable Table
Arfib, Daniel   Laboratoire d'Informatique de Grenoble (LIG); Grenoble, France
Filatriau, Jehan-Julien   Université catholique de Louvain; Louvain-la-Neuve, Belgium
Kessous, Loïc   ISIR-UPMC, University Paris VI; Paris, France

Abstract
We describe two musical experiments designed for interaction with a new tangible interface, which we have named Tangisense, based on a set of antennas and RFIDs. These experiments (classification, game of) use different kinds of time schedules, and are currently simulated using Max/MSP and Java programs and common human-computer interfaces. They are developed in such a way that they can be ported to a specific tangible interface with RFID tags at its heart. Details about this porting are given. These experiments will in the future serve as user-centered applications of this interactive table, be it for musical practice or sonic interaction design.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849661
Zenodo URL: https://zenodo.org/record/849661


2009.45
PV Stoch: A Spectral Stochastic Synthesis Generator
Döbereiner, Luc   Institute of Sonology, Royal Conservatory of The Hague; The Hague, Netherlands

Abstract
PV Stoch is a phase vocoder (PV) unit generator (UGen) for SuperCollider. Its objective is the exploration of methods used in “non-standard synthesis”, especially Dynamic Stochastic Synthesis (Xenakis), in another domain. In contrast to their original conception, the methods are applied in the frequency domain. This paper discusses some of the compositional motivations and considerations behind the approach, gives a description of the actual synthesis method and its implementation, and summarizes the results and conclusions drawn.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849663
Zenodo URL: https://zenodo.org/record/849663


2009.46
Real-time Binaural Audio Rendering in the Near Field
Spagnol, Simone   CSC, Department of Information Engineering, Università di Padova; Padova, Italy
Avanzini, Federico   CSC, Department of Information Engineering, Università di Padova; Padova, Italy

Abstract
This paper considers the problem of 3-D sound rendering in the near field through a low-order HRTF model. Here we concentrate on diffraction effects caused by the human head, which we model as a rigid sphere. For relatively close source distances there already exists an algorithm that gives a good approximation to analytical spherical HRTF curves; yet, due to excessive computational cost, it turns out to be impractical in a real-time dynamic context. For this reason, we propose a further approximation based on principal component analysis, which can significantly speed up spherical HRTF computation. The model resulting from such an approach is suitable for future integration in a structural HRTF model and for parameterization over anthropometric measurements of a wide range of subjects.
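
A minimal sketch of the principal-component idea, assuming a matrix of precomputed spherical-head HRTF magnitude responses (one row per source position/distance) is already available from the analytical algorithm; a small number of components then lets any response be reconstructed cheaply at run time.

import numpy as np
from sklearn.decomposition import PCA

# hrtf: precomputed spherical-head HRTF magnitude responses,
# shape (n_positions, n_frequency_bins) -- placeholder random data here.
hrtf = np.random.rand(500, 256)

pca = PCA(n_components=8)              # small number of principal components
weights = pca.fit_transform(hrtf)      # per-position weights, cheap to store/interpolate
basis = pca.components_                # shared spectral basis functions

# Run-time reconstruction of one response from its 8 weights:
approx = weights[0] @ basis + pca.mean_
error = np.max(np.abs(approx - hrtf[0]))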

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849665
Zenodo URL: https://zenodo.org/record/849665


2009.47
Real-time DTW-based Gesture Recognition External Object for Max/MSP and Puredata
Bettens, Frederic   Department of Circuit Theory and Signal Processing (TCTS Lab), Faculty of Engineering, University of Mons; Mons, Belgium
Todoroff, Todor   ARTeM (Art, Recherche, Technologie et Musique); Brussels, Belgium / Department of Circuit Theory and Signal Processing (TCTS Lab), Faculty of Engineering, University of Mons; Mons, Belgium

Abstract
This paper focuses on a real-time Max/MSP implementation of a gesture recognition tool based on Dynamic Time Warping (DTW). We present an original “multi-grid” DTW algorithm that does not require prior segmentation. The num.dtw object will be downloadable from the numediart website both for Max/MSP and for Pure Data. Though this research was conducted in the framework described below, with wearable sensors, we believe it could be useful in many other contexts. We are, for instance, starting a new project in which we will evaluate our DTW object on video tracking data as well as on a combination of video tracking and wearable sensor data.
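
The multi-grid, segmentation-free variant in the paper is not reproduced here; the following is only a textbook DTW distance between two sequences of sensor feature vectors, to illustrate the underlying similarity measure.

import numpy as np

def dtw_distance(a, b):
    """Plain dynamic time warping between two sequences of feature vectors (rows).
    Textbook O(len(a)*len(b)) version, not the paper's multi-grid variant."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])     # local distance
            D[i, j] = cost + min(D[i - 1, j],              # insertion
                                 D[i, j - 1],              # deletion
                                 D[i - 1, j - 1])          # match
    return D[n, m]

# e.g. compare a live accelerometer stream against a recorded gesture template
template = np.random.rand(80, 3)
live = np.random.rand(95, 3)
print(dtw_distance(template, live))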

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849667
Zenodo URL: https://zenodo.org/record/849667


2009.48
Score Playback Devices in PWGL
Laurson, Mikael   Centre for Music and Technology, Sibelius Academy, University of the Arts Helsinki; Helsinki, Finland
Kuuskankare, Mika   Centre for Music and Technology, Sibelius Academy, University of the Arts Helsinki; Helsinki, Finland

Abstract
This paper presents a novel system that allows the user to customize playback facilities in our computer-assisted environment, PWGL. The scheme is based on a class hierarchy. The behavior of an abstract root playback class containing a set of methods can be customized through inheritance. This procedure is demonstrated by a subclass that is capable of playing MIDI data. This playback device allows multi-instrument and micro-tonal scores to be realized automatically by using pitch-bend setups and channel mappings. Continuous control information can also be given in a score by adding dynamics markings and/or special Score-BPF expressions containing break-point functions. We give several complete code examples that demonstrate how the user can further change the playback behavior. We start with a simple playback device that allows channel information to be overridden. Next we discuss how to implement the popular keyswitch mechanism in our system. This playback device is capable of mapping high-level score information to a commercial orchestral database supporting keyswitch instruments. Our final example shows how to override the default MIDI output and delegate the play events to an external synthesizer using OSC.
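
PWGL itself is Lisp-based; the inheritance scheme described above can be sketched language-agnostically as follows. All class and method names here are hypothetical illustrations of the idea (an abstract root device, a MIDI subclass, a channel-override customization, and an OSC delegate), not the PWGL API.

class PlaybackDevice:
    """Abstract root playback class: subclasses customize behavior by overriding methods."""
    def prepare(self, score):      # called once before playback starts
        pass
    def play_note(self, note):     # called for every note event
        raise NotImplementedError
    def handle_control(self, ev):  # continuous control (dynamics, break-point functions)
        pass

class MidiPlayback(PlaybackDevice):
    """Default device: sends note events as MIDI."""
    def play_note(self, note):
        send_midi(note.channel, note.pitch, note.velocity)   # send_midi is hypothetical

class ChannelOverridePlayback(MidiPlayback):
    """Simple customization: force all notes onto one channel."""
    def __init__(self, channel):
        self.channel = channel
    def play_note(self, note):
        note.channel = self.channel
        super().play_note(note)

class OscPlayback(PlaybackDevice):
    """Delegate play events to an external synthesizer via OSC instead of MIDI."""
    def play_note(self, note):
        send_osc("/note", note.pitch, note.velocity)          # send_osc is hypothetical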

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849669
Zenodo URL: https://zenodo.org/record/849669


2009.49
Sensory Threads: Sonifying Imperceptible Phenomena in the Wild
Fencott, Robin   Interaction, Media, and Communication Group, Queen Mary University of London; London, United Kingdom
Bryan-Kinns, Nick   Centre for Digital Music (C4DM), Queen Mary University of London; London, United Kingdom

Abstract
Sensory Threads is a pervasive multi-person interactive experience in which sensors monitor phenomena that are imperceptible or peripheral to our everyday senses. These phenomena include light temperature, heart rate and spatial density. Participants each wear a sensor as they move around an urban environment, and the sensor data is mapped in real time to an interactive soundscape which is transmitted wirelessly back to the participants. This paper discusses the design requirements for the Sensory Threads soundscape. These requirements include intuitive mappings between sensor data and audible representation, and the ability for participants to identify individual sensor representations within the overall soundscape mix. Our solutions to these requirements draw upon work from diverse research fields such as musical interface design, data sonification, auditory scene analysis, and the theory of electroacoustic music. We discuss mapping strategies between sensor data and audible representation, our decisions about sound design, and issues surrounding the concurrent presentation of multiple data sets. We also explore the synergy and tension between functional and aesthetic design, considering in particular how affective design can provide solutions to functional requirements.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849671
Zenodo URL: https://zenodo.org/record/849671


2009.50
Shape-based Spectral Contrast Descriptor
Akkermans, Vincent   Faculty of Art, Media and Technology, University of the Arts Utrecht (HKU); Utrecht, Netherlands
Serrà, Joan   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Herrera, Perfecto   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain

Abstract
Mel-frequency cepstral coefficients are used as an abstract representation of the spectral envelope of a given signal. Although they have been shown to be a powerful descriptor for speech and music signals, more accurate and more easily interpretable options can be devised. In this study, we present and evaluate the shape-based spectral contrast descriptor, which is built upon the previously proposed octave-based spectral contrast descriptor. We compare the three aforementioned descriptors with regard to their discriminative power and MP3 compression robustness. Discriminative power is evaluated within a prototypical genre classification task. MP3 compression robustness is measured by determining how the descriptor values change between different encodings. We show that the proposed shape-based spectral contrast descriptor yields a significant increase in accuracy, robustness, and applicability over the octave-based spectral contrast descriptor. Our results also corroborate initial findings regarding the accuracy improvement of the octave-based spectral contrast descriptor over Mel-frequency cepstral coefficients for the genre classification task.
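
The shape-based descriptor proposed in the paper is not available off the shelf, but the two baselines it is compared against can be computed directly, for instance with librosa; a sketch is given below (the file name is a placeholder).

import librosa

y, sr = librosa.load("track.mp3")                                     # placeholder file name
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)                    # Mel-frequency cepstral coefficients
contrast = librosa.feature.spectral_contrast(y=y, sr=sr, n_bands=6)   # octave-based spectral contrast

# Frame-wise descriptors are commonly summarized (mean/std) before
# feeding a genre classifier, as in the evaluation described above.
features = [mfcc.mean(axis=1), mfcc.std(axis=1),
            contrast.mean(axis=1), contrast.std(axis=1)]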

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849673
Zenodo URL: https://zenodo.org/record/849673


2009.51
Sound and the City: Multi-layer Representation and Navigation of Audio Scenarios
Ludovico, Luca Andrea   Laboratorio di Informatica Musicale (LIM), Dipartimento di Informatica e Comunicazione (DICo), Università degli Studi di Milano; Milano, Italy
Mauro, Davide Andrea   Laboratorio di Informatica Musicale (LIM), Dipartimento di Informatica e Comunicazione (DICo), Università degli Studi di Milano; Milano, Italy

Abstract
IEEE 1599-2008 is an XML-based standard originally intended for the multi-layer representation of music information. Nevertheless, it is versatile enough to also describe information other than traditional scores written according to Common Western Notation (CWN) rules. This paper discusses the application of IEEE 1599-2008 to the audio description of paths and scenarios from urban life or other landscapes. The standard we adopt allows the multi-layer integration of textual, symbolic, structural, graphical, audio and video contents within a unique synchronized environment. Besides, for each kind of media, a number of digital objects are supported. As a consequence, thanks to the features of the format, the produced description is more than a mere audio track, a slideshow of sonified static images or a movie. Finally, an ad hoc evolution of a standard viewer for IEEE 1599 documents is presented, so that the results of our efforts can be experienced.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849675
Zenodo URL: https://zenodo.org/record/849675


2009.52
Sound Object Classification for Symbolic Audio Mosaicing: A Proof-of-concept
Janer, Jordi   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Haro, Martin   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Roma, Gerard   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Fujishima, Takuya   Yamaha Corporation; Hamamatsu, Japan
Kojima, Naoaki   Media Artist, Independent; Japan

Abstract
Sample-based music composition often involves manually searching for appropriate samples in existing audio. Audio mosaicing can be regarded as a way to automate this process by specifying the desired audio attributes, so that sound snippets matching these attributes are concatenated in a synthesis engine. These attributes are typically derived from a target audio sequence, which might limit the musical control of the user. In our approach, we replace the target audio sequence with a symbolic sequence constructed from pre-defined sound object categories. These sound objects are extracted by means of automatic classification techniques. Three steps are involved in the sound object extraction process: supervised training, automatic classification and user-assisted selection. Two sound object categories are considered: percussive and noisy. We present an analysis/synthesis framework in which the user first explores a song collection using symbolic concepts to create a set of sound objects. The selected sound objects are then used in a performance environment based on a loop-sequencer paradigm.
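
A generic sketch of the supervised-training and classification steps (not the authors' feature set or classifier settings): frame-level features summarized from labelled example sounds train a classifier that later tags unseen snippets as percussive or noisy. File paths are placeholders.

import numpy as np
import librosa
from sklearn.svm import SVC

def describe(path):
    """Summarize one sound file as a single feature vector (MFCC mean/std)."""
    audio, sr = librosa.load(path)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

percussive_files = ["perc_01.wav", "perc_02.wav"]   # placeholder labelled training sounds
noisy_files = ["noise_01.wav", "noise_02.wav"]      # placeholder labelled training sounds

X = np.array([describe(f) for f in percussive_files + noisy_files])
y = np.array([0] * len(percussive_files) + [1] * len(noisy_files))

clf = SVC(kernel="rbf").fit(X, y)                    # supervised training
label = clf.predict([describe("snippet.wav")])[0]    # automatic classification of a new snippet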

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849677
Zenodo URL: https://zenodo.org/record/849677


2009.53
Sound Search by Content-based Navigation in Large Databases
Schwarz, Diemo   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Schnell, Norbert   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France

Abstract
We propose to apply the principle of interactive real-time corpus-based concatenative synthesis to search in effects or instrument sound databases, which becomes content-based navigation in a space of descriptors and categories. This surpasses existing approaches of presenting the sound database first in a hierarchy given by metadata, and then letting the user listen to the remaining list of responses. It is based on three scalable algorithms and novel concepts for efficient visualisation and interaction: fast similarity-based search by a kD-tree in the high-dimensional descriptor space, a mass–spring model for layout, efficient dimensionality reduction for visualisation by hybrid multi-dimensional scaling, and novel modes for interaction in a 2D representation of the descriptor space such as filtering, tiling, and fluent navigation by zoom and pan, supported by an efficient 3-tier visualisation architecture. The algorithms are implemented and tested as C-libraries and Max/MSP externals within a prototype sound exploration application.
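
The first of the listed algorithms, similarity-based search in the descriptor space with a kD-tree, can be sketched with SciPy as follows (placeholder random data; the real system indexes per-unit audio descriptors from the corpus).

import numpy as np
from scipy.spatial import cKDTree

# descriptors: one row of audio descriptors (e.g. loudness, pitch, spectral
# centroid, ...) per sound unit in the corpus -- placeholder random data here.
descriptors = np.random.rand(100_000, 12)
tree = cKDTree(descriptors)

# Interactive navigation repeatedly asks for the units closest to the
# current position in descriptor space:
query = np.random.rand(12)
distances, indices = tree.query(query, k=8)   # 8 nearest sound units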

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849679
Zenodo URL: https://zenodo.org/record/849679


2009.54
The Effect of Visual Cues on Melody Segregation
Burkitt, Anthony Neville   The University of Melbourne; Melbourne, Australia
Grayden, David B.   The University of Melbourne; Melbourne, Australia
Innes-Brown, Hamish   The Bionic Ear Institute; Melbourne, Australia
Marozeau, Jeremy   The Bionic Ear Institute; Melbourne, Australia

Abstract
Music often contains many different auditory streams, and one of its great interests is the relationship between these streams (melody vs. counterpoint vs. harmony). As these different streams reach our ears at the same time, it is up to the auditory system to segregate them. Auditory stream segregation is based mainly on our ability to group different streams according to their overall auditory perceptual differences (such as pitch or timbre). People with impaired hearing have great difficulty separating auditory streams, including those in music. It has been suggested that attention can influence auditory streaming and, by extension, that visual information may do so as well. A psychoacoustic experiment was run on eight musically trained listeners to test whether visual cues could influence the segregation of a 4-note repeating melody from interleaved pseudo-random notes. Results showed that the amount of overall segregation was significantly improved when visual information related to the 4-note melody was provided. This result suggests that music perception for people with impaired hearing could be enhanced using appropriate visual information.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: Missing
Zenodo URL: Missing


2009.55
The Flops Glass: A Device to Study Emotional Reactions Arising From Sonic Interactions
Lemaitre, Guillaume   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Houix, Olivier   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Franinović, Karmen   Institute for Computer Music and Sound Technology (ICST), Zurich University of the Arts (ZHdK); Zurich, Switzerland
Visell, Yon   Schulich School of Music, Music Technology Area, McGill University; Montreal, Canada
Susini, Patrick   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France

Abstract
This article reports on an experimental study of emotional reactions felt by users manipulating an interactive object augmented with sounds: the Flops glass. The Flops interface consists of a glass embedded with sensors, which produces impact sounds when it is tilted, implementing the metaphor of objects falling out of the glass. The sonic and behavioural design of the glass was conceived specifically for the purpose of studying emotional reactions in sonic interactions. This study is the first of a series. It aims at testing the assumption that emotional reactions are influenced by three parameters of the sounds: spectral centroid, tonality and naturalness. The experimental results reported here confirm the significant influence of spectral centroid and naturalness, but fail to show an effect of tonality.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849681
Zenodo URL: https://zenodo.org/record/849681


2009.56
The Hyper-kalimba: Developing an Augmented Instrument From a Performer's Perspective
Rocha, Fernando   School of Music, Universidade Federal de Minas Gerais (UFMG); Belo Horizonte, Brazil
Malloch, Joseph   Input Devices and Music Interaction Laboratory (IDMIL), Schulich School of Music, Music Technology Area, McGill University; Montreal, Canada

Abstract
The paper describes the development of the hyper-kalimba, an augmented instrument created by the authors. This development was divided into several phases and was based on constant consideration of technology, performance and compositional issues. The basic goal was to extend the sound possibilities of the kalimba, without interfering with any of the original features of the instrument or with the performer’s pre-existing skills. In this way performers were able to use all the traditional techniques previously developed, while learning and exploring all the new possibilities added to the instrument.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849683
Zenodo URL: https://zenodo.org/record/849683


2009.57
The Kinematic Rubato Model as a Means of Studying Final Ritards Across Pieces and Pianists
Grachten, Maarten   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria
Widmer, Gerhard   Austrian Research Institute for Artificial Intelligence (OFAI); Vienna, Austria / Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria

Abstract
This paper presents an empirical study of the performance of final ritards in classical piano music by a collection of famous pianists. The particular approach taken here uses Friberg and Sundberg’s kinematic rubato model in order to characterize the variability of performed ritards across pieces and pianists. The variability is studied in terms of the model parameters controlling the depth and curvature of the ritard, after the model has been fitted to the data. Apart from finding a strong positive correlation between both parameters, we derive curvature values from the current data set that are substantially higher than curvature values deemed appropriate in previous studies. Although the model is too simple to capture all meaningful fluctuations in tempo, its parameters seem to be musically relevant, since performances of the same piece tend to be strongly concentrated in the parameter space. Unsurprisingly, the model parameters are generally not discriminative of pianist identity. Still, in some cases systematic differences between pianists are observed.
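
The kinematic model referred to above is commonly written as v(x) = (1 + (w^q − 1)x)^(1/q), with x the normalized score position within the ritard, w the final tempo relative to the pre-ritard tempo (depth) and q the curvature; under that assumption, fitting the two parameters to a measured tempo curve is a small least-squares problem, e.g. with SciPy (placeholder data below).

import numpy as np
from scipy.optimize import curve_fit

def kinematic_tempo(x, w, q):
    """Normalized tempo v(x) = (1 + (w**q - 1) * x) ** (1/q).
    x: score position in the ritard, normalized to [0, 1];
    w: final tempo relative to the pre-ritard tempo; q: curvature."""
    base = np.maximum(1e-6, 1.0 + (w**q - 1.0) * x)   # guard against negative base during fitting
    return base ** (1.0 / q)

# x: normalized score positions of the performed notes in the ritard,
# v: their local tempo divided by the pre-ritard tempo -- placeholder synthetic data here.
x = np.linspace(0.0, 1.0, 12)
v = kinematic_tempo(x, w=0.4, q=3.0) + 0.02 * np.random.randn(12)

(w_hat, q_hat), _ = curve_fit(kinematic_tempo, x, v, p0=[0.5, 2.0])
print(w_hat, q_hat)   # estimated depth and curvature for this performance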

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849685
Zenodo URL: https://zenodo.org/record/849685


2009.58
The Sonified Music Stand - an Interactive Sonification System for Musicians
Großhauser, Tobias   Ambient Intelligence Group, Center of Excellence in Cognitive Interaction Technology, Bielefeld University; Bielefeld, Germany
Hermann, Thomas   Ambient Intelligence Group, Center of Excellence in Cognitive Interaction Technology, Bielefeld University; Bielefeld, Germany

Abstract
This paper presents the sonified music stand, a novel interface that provides real-time auditory feedback for professional musicians by means of interactive sonification. Sonifications convey information using non-speech sound and are a promising means for musicians since they (a) leave the visual sense unoccupied, (b) address the sense of hearing, which is already used and thereby further trained, and (c) allow feedback information to be related in the same acoustic medium as the musical output, so that dependencies between action and reaction can be better understood. The paper presents a prototype system together with demonstrations of applications that support violinists during musical instrument learning. For this, a pair of portable active loudspeakers has been designed for the music stand, and a small motion sensor box has been developed to be attached to the bow, hand or wrist. The data are sonified in real time according to different training objectives. We sketch several sonification ideas with sound examples and give a qualitative description of using the system.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849687
Zenodo URL: https://zenodo.org/record/849687


2009.59
Three Methods for Pianist Hand Assignment
Hadjakos, Aristotelis   Technische Universität Darmstadt (TU Darmstadt); Darmstadt, Germany
Lefebvre-Albaret, François   Institut de Recherche en Informatique de Toulouse (IRIT); Toulouse, France

Abstract
Hand assignment is the task of determining which hand the pianist has used to play a note. We propose three methods for hand assignment: the first uses computer vision and analyzes video images provided by a camera mounted over the keyboard; the second and third use Kalman filtering to track the hands from MIDI data only or from a combination of MIDI and inertial sensing data. These methods have applications in musical practice, new piano pedagogy applications, and notation.
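
A minimal constant-velocity Kalman filter tracking one hand's horizontal position from noisy observations (e.g. the key positions of played notes). This is only a sketch of the filtering idea with assumed noise values, not the paper's MIDI/inertial fusion.

import numpy as np

def kalman_track(observations, dt=0.05, q=1.0, r=4.0):
    """Track position with a constant-velocity Kalman filter.
    observations: noisy 1-D positions (e.g. key index of each played note);
    q: process noise, r: observation noise (assumed values, not from the paper)."""
    F = np.array([[1.0, dt], [0.0, 1.0]])     # state transition (position, velocity)
    H = np.array([[1.0, 0.0]])                # we observe position only
    Q = q * np.array([[dt**3 / 3, dt**2 / 2], [dt**2 / 2, dt]])
    R = np.array([[r]])
    x = np.array([[observations[0]], [0.0]])  # initial state
    P = np.eye(2) * 10.0                      # initial uncertainty
    track = []
    for z in observations:
        # predict
        x = F @ x
        P = F @ P @ F.T + Q
        # update
        innovation = np.array([[z]]) - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
        x = x + K @ innovation
        P = (np.eye(2) - K @ H) @ P
        track.append(float(x[0, 0]))
    return track

print(kalman_track([40, 41, 43, 44, 60, 45, 47]))  # partially suppresses the outlier at 60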

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849689
Zenodo URL: https://zenodo.org/record/849689


2009.60
Towards an Experimental Platform for Collective Mobile Music Performance
Tahiroğlu, Koray   Department of Signal Processing and Acoustics (SPA), Helsinki University of Technology; Espoo, Finland

Abstract
This paper presents an experiment with an interactive performance system that enables audience participation in an improvisational computer music performance. The design provides an improvisation tool and a mechanism based on collective mobile interfaces, and includes an adaptive control module for parts of the system. Designing a collaborative interface around an easy-to-use, easy-to-control everyday communication tool allows the audience to become more familiar with the collaboration process and to experience a way of making music with a mobile device. The role of the audience is critical, not only for the design process of the system, but also for the experience of such experimental music.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849693
Zenodo URL: https://zenodo.org/record/849693


2009.61
Towards a Prosody Model of Attic Tragic Poetry: From Logos to Mousiké
Georgaki, Anastasia   Department of Music, University of Athens; Athens, Greece
Psaroudakēs, Stelios   Department of Music, University of Athens; Athens, Greece
Carlé, Martin   Institute of Music and Media Studies, Humboldt-Universität zu Berlin; Berlin, Germany
Tzevelekos, Panagiotis   Department of Informatics and Telecommunications, University of Athens; Athens, Greece

Abstract
Recently there has been increasing scientific interest in the performance of singing or reciting voices of the past, using analysis-synthesis methods. In the domain of Ancient Greek musicology, where we find the roots of occidental music, most research has been carried out by scholars of classical Greek literature. However, there is still a vast territory of research into audio performances to be explored with the help of new digital technologies. In this paper, we present an attempt to decode a recited text of Ancient Greek tragedy and render it into sound. In the first section of this article we outline the origin of music arising from the melodicity of speech in Ancient Greek tragedy. In the second section, we describe the methodology used to analyse the voice of S. Psaroudakēs, himself a professor of Ancient Greek music, with an open-source prosodic feature extraction tool based on Praat. We give a description of the prosodic analysis and implementation details, and discuss its feature extension capabilities as well. Last, we address the differences between the Ancient and Modern Greek phonological systems, the application of this research in music, and further developments.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849691
Zenodo URL: https://zenodo.org/record/849691


2009.62
Towards Audio to Score Alignment in the Symbolic Domain
Niedermayer, Bernhard   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria

Abstract
This paper presents a matrix-factorization-based feature for audio-to-score alignment. We show that, in combination with dynamic time warping, it can compete with chroma vectors, which have probably been the most frequently used approach in recent years. A great benefit of the factorization-based feature is its sparseness, which can be exploited to transform it into a symbolic representation. We show that music-to-score alignment using the symbolic version of the feature is less accurate, but on the other hand reduces the memory required for feature representation and during the alignment process to a fraction of the original amount. This is of special value when dealing with very long pieces of music, where the limits of standard DTW are reached.
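
The NMF-based feature proposed in the paper is not reproduced here; the chroma-plus-DTW baseline it is compared against can be sketched with librosa as follows (placeholder file names; the score side would normally be a synthesized rendition of the MIDI score).

import librosa

# Audio performance and a synthesized rendition of the score (placeholder paths).
y_audio, sr = librosa.load("performance.wav")
y_score, _ = librosa.load("score_rendition.wav", sr=sr)

# Chroma features on both sides, then dynamic time warping on the cost matrix.
chroma_audio = librosa.feature.chroma_stft(y=y_audio, sr=sr, hop_length=2048)
chroma_score = librosa.feature.chroma_stft(y=y_score, sr=sr, hop_length=2048)

D, wp = librosa.sequence.dtw(X=chroma_audio, Y=chroma_score, metric="cosine")
# wp is the optimal warping path: pairs of (audio frame, score frame) indices.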

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849695
Zenodo URL: https://zenodo.org/record/849695


2009.63
VocaListener: A Singing-to-singing Synthesis System Based on Iterative Parameter Estimation
Nakano, Tomoyasu   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Goto, Masataka   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan

Abstract
This paper presents a singing synthesis system, VocaListener, that automatically estimates parameters for singing synthesis from a user's singing voice with the help of song lyrics. Although there is a method to estimate singing synthesis parameters of pitch (F0) and dynamics (power) from a singing voice, it does not adapt to different singing synthesis conditions (e.g., different singing synthesis systems and their singer databases) or singing skill/style modifications. To deal with different conditions, VocaListener repeatedly updates the singing synthesis parameters so that the synthesized singing more closely mimics the user's singing. Moreover, VocaListener has functions to help modify the user's singing by correcting off-pitch phrases or changing vibrato. In an experimental evaluation under two different singing synthesis conditions, our system achieved synthesized singing that closely mimicked the user's singing.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
unknown

DOI: 10.5281/zenodo.849697
Zenodo URL: https://zenodo.org/record/849697

