Sixteen Years of Sound & Music Computing
A Look Into the History and Trends of the Conference and Community

D.A. Mauro, F. Avanzini, A. Baratè, L.A. Ludovico, S. Ntalampiras, S. Dimitrov, S. Serafin

Papers

Sound and Music Computing Conference 2011 (ed. 8)

Dates: from July 06 to July 09, 2011
Place: Padova, Italy
Proceedings info: Proceedings of the SMC 2011 - 8th Sound and Music Computing Conference, 06-09 July 2011, Padova - Italy, ISBN 978-88-97385-03-5


2011.1
A Bayesian Approach to Drum Tracking
Robertson, Andrew   Queen Mary University of London; London, United Kingdom

Abstract
This paper describes a real-time Bayesian formulation of the problem of drum tracking. We describe how drum events can be interpreted to update distributions for both tempo and phase, and how these distributions can combine together to give a real-time drum tracking system. The proposed system is intended for the purposes of synchronisation of pre-recorded audio or video with live drums. We evaluate the algorithm on a new set of drum files from real recordings and compare it to other state-of-the-art algorithms. Our proposed method performs very well, often improving on the results of other real-time beat trackers. We discuss the merits of such a formulation and how it makes explicit the assumptions that underlie approaches to beat tracking. We conclude by considering how such an approach might be used for other tasks, such as score following or audio alignment. The proposed algorithm is implemented in C++ and runs in real-time.
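
As an illustration only (not the authors' real-time C++ system), the sketch below shows the general idea of a Bayesian tempo update: a discrete prior over beat period is multiplied by a Gaussian likelihood built from observed inter-onset intervals. The grid resolution, metrical levels and variances are assumptions.

```python
import numpy as np

# Minimal sketch: Bayesian update of a tempo (beat-period) distribution
# from drum-onset inter-onset intervals. Illustrative only; not the
# authors' real-time C++ system.

periods = np.linspace(0.3, 1.0, 200)          # candidate beat periods (s)
prior = np.exp(-0.5 * ((periods - 0.5) / 0.1) ** 2)
prior /= prior.sum()                           # prior belief over beat period

def update_tempo(prior, ioi, sigma=0.02):
    """Multiply the prior by a Gaussian likelihood centred on the observed
    inter-onset interval (or its half/double, to allow for events at other
    metrical levels such as eighth notes)."""
    likelihood = np.zeros_like(periods)
    for mult in (0.5, 1.0, 2.0):               # metrical levels considered
        likelihood += np.exp(-0.5 * ((ioi - mult * periods) / sigma) ** 2)
    posterior = prior * likelihood
    return posterior / posterior.sum()

# Example: drum onsets roughly 0.52 s apart
onsets = [0.00, 0.52, 1.03, 1.55]
belief = prior
for prev, curr in zip(onsets, onsets[1:]):
    belief = update_tempo(belief, curr - prev)

print("MAP beat period: %.3f s" % periods[np.argmax(belief)])
```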

Keywords
drum tracking, interactive, probabilistic, real-time

Paper topics
Interactive performance systems, Sound/music signal processing algorithms

Easychair keyphrases
beat period [23], drum event [19], likelihood function [15], beat location [11], beat tracking [11], real time [9], beat period estimate [6], eighth note [6], international computer music [6], phase estimate [6], relative phase [6], tempo distribution [6], beat tracker [5], ground truth [5], beat time [4], beat tracking system [4], comb filter matrix [4], computer system time [4], drum tracker [4], gaussian shaped likelihood function [4], music information retrieval [4], music research [4], posterior distribution [4], prior distribution [4], real time drum tracking [4], tempo estimate [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849855
Zenodo URL: https://zenodo.org/record/849855


2011.2
A COMPARISON OF PERCEPTUAL RATINGS AND COMPUTED AUDIO FEATURES
Friberg, Anders   KTH Royal Institute of Technology; Stockholm, Sweden
Hedblad, Anton   KTH Royal Institute of Technology; Stockholm, Sweden

Abstract
The backbone of most music information retrieval systems is the features extracted from audio. There is an abundance of features suggested in previous studies ranging from low-level spectral properties to high-level semantic descriptions. These features often attempt to model different perceptual aspects. However, few studies have verified if the extracted features correspond to the assumed perceptual concepts. To investigate this we selected a set of features (or musical factors) from previous psychology studies. Subjects rated nine features and two emotion scales using a set of ringtone examples. Related audio features were extracted using existing toolboxes and compared with the perceptual ratings. The results indicate that there was a high agreement among the judges for most of the perceptual scales. The emotion ratings energy and valence could be well estimated by the perceptual features using multiple regression with adj. R2 = 0.93 and 0.87, respectively. The corresponding audio features could only to a certain degree predict the corresponding perceptual features, indicating a need for further development.
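
A minimal sketch of the kind of multiple-regression analysis reported (adjusted R2 computed from feature ratings); the data here are random stand-ins, not the study's ratings, and all names are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Sketch: predict an emotion rating (e.g. energy) from perceptual feature
# ratings via multiple regression, and report adjusted R^2 as in the paper.
# The random data below is a stand-in for the actual rating matrices.

rng = np.random.default_rng(0)
n_examples, n_features = 100, 9               # e.g. 100 ringtones, 9 rated features
X = rng.normal(size=(n_examples, n_features)) # perceptual feature ratings
y = X @ rng.normal(size=n_features) + 0.1 * rng.normal(size=n_examples)

model = LinearRegression().fit(X, y)
r2 = model.score(X, y)
adj_r2 = 1.0 - (1.0 - r2) * (n_examples - 1) / (n_examples - n_features - 1)
print("R^2 = %.3f, adjusted R^2 = %.3f" % (r2, adj_r2))
```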

Keywords
audio features, Music information retrieval, perceptual ratings

Paper topics
Automatic separation, classification of sound and music, Computational musicology, Content processing of music audio signals, Models for sound analysis and synthesis, Music information retrieval, Perception and cognition of sound and music, recognition

Easychair keyphrases
audio feature [16], perceptual rating [16], perceptual feature [15], pulse clarity [8], rhythmic clarity [7], harmonic complexity [6], pulse clarity model [6], rhythmic complexity [6], multiple regression [5], spectral flux [5], articulation dynamic modality [4], computed audio feature [4], computed feature [4], cronbach alpha [4], listening experiment [4], low level feature [4], mid level feature [4], music theory [4], rhythmic clarity articulation dynamic [4], separate multiple regression analysis [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849857
Zenodo URL: https://zenodo.org/record/849857


2011.3
Active Preservation of Electrophone Musical Instruments. The Case of the "Liettizzatore" of "Studio Di Fonologia Musicale" (Rai, Milano)
Canazza, Sergio   Università di Padova; Padova, Italy
Rodà, Antonio   Università di Udine; Udine, Italy
Novati, Maddalena   Archivio di Fonologia - RAI; Milano, Italy
Avanzini, Federico   Università di Padova; Padova, Italy

Abstract
This paper presents first results of an ongoing project devoted to the analysis and virtualization of the analog electronic devices of the “Studio di Fonologia Musicale”, one of the European centres of reference for the production of electroacoustic music in the 1950’s and 1960’s. After a brief summary of the history of the Studio, the paper discusses a particularly representative musical work produced at the Studio, "Scambi" by Henri Pousseur, and it presents initial results on the analysis and simulation of the electronic device used by Pousseur in this composition, as well as the ongoing work aimed at developing an installation that re-creates such electronic lutherie.

Keywords
Electroacoustic music, Musical cultural heritage preservation, Restoration

Paper topics
access and modelling of musical heritage, Technologies for the preservation

Easychair keyphrases
red dotted line [14], blue solid line [12], electroacoustic music [8], musical instrument [8], electronic lutherie [7], time constant [7], twin diode [7], electrophone instrument [6], fonologia musicale [6], output signal [6], spice simulation [6], active preservation [4], connecting rule [4], electronic component [4], electronic device [4], electronic instrument [4], front panel [4], music bar [4], open form [4], project scheme [4], stochastic signal [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849867
Zenodo URL: https://zenodo.org/record/849867


2011.4
A Learning Interface for Novice Guitar Players using Vibrotactile Stimulation
Giordano, Marcello   McGill University; Montreal, Canada
Wanderley, Marcelo M.   McGill University; Montreal, Canada

Abstract
This paper presents a full-body vibrotactile display that can be used as a tool to help learning music performance. The system is composed of 10 vibrotactile actuators placed on different positions of the body as well as an extended and modified version of a software tool for generating tactile events, the Fast Afferent/Slow Afferent (FA/SA) application. We carried out initial tests of the system in the context of enhancing the learning process of novice guitar players. In these tests we asked the performers to play the guitar part over a drum and bass-line base track, either heard or felt by the performers through headphones and the tactile display they were wearing. Results show that it is possible to accurately render the representation of the audio track in the tactile channel only, therefore reducing the cognitive load in the auditory channel.

Keywords
immersive, learning interface, tactile perception, vibrotactile stimulation

Paper topics
Interactive performance systems, Multimodality in sound and music computing, Perception and cognition of sound and music

Easychair keyphrases
vibrotactile feedback [10], digital musical instrument [7], tactile sense [6], vibrotactile event [6], vibrotactile stimulation [6], base track [5], glabrous skin [5], hairy skin [5], tactile channel [5], tactile display [5], tactile sensation [5], tactile stimulation [5], auditory system [4], equal sensation magnitude curve [4], frequency range [4], guitar player [4], international computer music [4], model human cochlea [4], sensory substitution [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849859
Zenodo URL: https://zenodo.org/record/849859


2011.5
An Adaptive Classification Algorithm For Semiotic Musical Gestures
Gillian, Nicholas   Sonic Arts Research Centre (SARC), Queen's University Belfast; Belfast, United Kingdom
Knapp, R. Benjamin   Sonic Arts Research Centre (SARC), Queen's University Belfast; Belfast, United Kingdom
O'Modhrain, Sile   Sonic Arts Research Centre (SARC), Queen's University Belfast; Belfast, United Kingdom

Abstract
This paper presents a novel machine learning algorithm that has been specifically developed for the classification of semiotic musical gestures. We demonstrate how our algorithm, called the Adaptive Naive Bayes Classifier, can be quickly trained with a small number of training examples and then classify a set of musical gestures in a continuous stream of data that also contains non-gestural data. The algorithm also features an adaptive function that enables a trained model to slowly adapt itself as a performer refines and modifies their own gestures over, for example, the course of a rehearsal period. The paper is concluded with a study that shows a significant overall improvement in the classification abilities of the algorithm when the adaptive function is used.
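
The sketch below is not the ANBC algorithm itself; it only illustrates the adaptive idea with a plain Gaussian naive Bayes whose class means drift toward confidently classified samples. The confidence threshold and adaptation rate are assumptions.

```python
import numpy as np

# Illustrative Gaussian naive Bayes with a simple adaptive step: after a
# confident classification, the winning class's mean drifts toward the new
# sample. This mimics the spirit of the paper's adaptive function; the
# actual ANBC algorithm differs in detail.

class AdaptiveGaussianNB:
    def __init__(self, adapt_rate=0.05, confidence=0.9):
        self.adapt_rate = adapt_rate
        self.confidence = confidence

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.vars_ = np.array([X[y == c].var(axis=0) + 1e-6 for c in self.classes_])
        return self

    def _posteriors(self, x):
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * self.vars_)
                                + (x - self.means_) ** 2 / self.vars_, axis=1)
        p = np.exp(log_lik - log_lik.max())
        return p / p.sum()

    def predict_adapt(self, x):
        p = self._posteriors(x)
        k = int(np.argmax(p))
        if p[k] >= self.confidence:            # adapt only on confident hits
            self.means_[k] += self.adapt_rate * (x - self.means_[k])
        return self.classes_[k], p[k]

# Toy usage: two gesture classes in a 3-D sensor feature space
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 3)), rng.normal(3, 1, (20, 3))])
y = np.array([0] * 20 + [1] * 20)
clf = AdaptiveGaussianNB().fit(X, y)
print(clf.predict_adapt(rng.normal(3.2, 1, 3)))
```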

Keywords
Adaptive Naive Bayes Classifier, Gesture Recognition, Musician computer interaction

Paper topics
Automatic separation, classification of sound and music, Interactive performance systems, recognition

Easychair keyphrases
target panel [25], anbc algorithm [24], ve baye classifier [19], semiotic musical gesture [14], training data [14], adaptive online training phase [12], practice phase [12], weighting coefficient [12], adaptive online training [11], baye classifier [8], musical gesture [8], real time [8], air makoto [7], naive baye classifier [7], target zone [7], adaptive online training mode [6], air makoto game [6], continuous stream [6], data collection phase [6], game phase [6], machine learning [6], machine learning algorithm [6], main game [6], maximum training buffer size [6], rejection threshold [6], target area [6], visual feedback [6], adaptive function [5], classification threshold [5], online training [5]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849869
Zenodo URL: https://zenodo.org/record/849869


2011.6
Analysis of Social Interaction in Music Performance with Score-Independent Audio Features
Volpe, Gualtiero   Università di Genova; Genova, Italy
Varni, Giovanna   Università di Genova; Genova, Italy
Mazzarino, Barbara   Università di Genova; Genova, Italy
Pisano, Silvia   Università di Genova; Genova, Italy
Camurri, Antonio   Università di Genova; Genova, Italy

Abstract
Research on analysis of expressive music performance is recently moving its focus from a single player to small music ensembles, extending the analysis to the social interaction among the members of the ensemble. A step in this direction is the definition and the validation of a set of score-independent audio features that make it possible to characterize the social interaction in the ensemble, based on the analysis of the music performance. This paper focuses on the analysis of four different performances of the same music piece performed by a string quartet. The performances differ with respect to factors affecting the social interaction within the ensemble. The analysis aims at evaluating whether and to what extent a set of consolidated score-independent audio features, already employed for analysis of expressive music content and particularly suitable for string instruments, makes it possible to distinguish among such different performances.

Keywords
analysis of music performance, score-independent audio features, social interaction

Paper topics
Social interaction in sound and music computing

Easychair keyphrases
social interaction [20], independent audio feature [15], score independent [10], performance condition performance [9], expressive content [8], audio feature [7], music performance [7], string quartet [6], expressive music performance [4], lower panel [4], music ensemble [4], residual energy [4], upper panel [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849879
Zenodo URL: https://zenodo.org/record/849879


2011.7
An analog I/O interface board for Audio Arduino open sound card system
Dimitrov, Smilen   Aalborg University; Aalborg, Denmark
Serafin, Stefania   Aalborg University; Aalborg, Denmark

Abstract
AudioArduino [1] is a system consisting of an ALSA (Advanced Linux Sound Architecture) audio driver and corresponding microcontroller code that can demonstrate full-duplex, mono, 8-bit, 44.1 kHz soundcard behavior on an FTDI based Arduino. While the basic operation as a soundcard can be demonstrated with nothing more than a pair of headphones and a couple of capacitors, modern PC soundcards typically make use of multiple signal standards and, correspondingly, multiple connectors. The usual distinction that typical off-the-shelf stereo soundcards make is between line-level signals (line-in/line-out) and those not conforming to this standard (such as microphone input/speaker output). To provide a physical illustration of these issues in soundcard design, this project outlines an open design for a simple single-sided PCB, intended for experimentation (via interconnection of basic circuits on board). The contribution of this project is in providing a basic introductory overview of some of the problems (PWM output in particular) in analog I/O design and implementation for soundcards through a real world example, which - while incapable of delivering professional grade quality - could still be useful, primarily in an educational scope.

Keywords
Arduino, audio, driver, PCB, PCM, PWM, Sound card

Paper topics
Interfaces for sound and music, Sound/music signal processing algorithms

Easychair keyphrases
audioa rduino [15], line level [14], duty cycle [13], analog signal [11], analog switch [6], sampled value [5], sample value [5], advanced linux sound architecture [4], analog sample [4], audio engineering society convention [4], dashed line [4], dead time [4], digital audio [4], digital audio hardware [4], fast pwm mode [4], integrated signal [4], linear ramp [4], line level signal [4], low pass filter [4], next pwm period [4], power supply [4], voltage signal [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849871
Zenodo URL: https://zenodo.org/record/849871


2011.8
AN EXPLORATION ON THE INFLUENCE OF VIBROTACTILE CUES DURING DIGITAL PIANO PLAYING
Fontana, Federico   Università di Udine; Udine, Italy
Papetti, Stefano   Università di Verona; Verona, Italy
Del Bello, Valentina   Università di Verona; Verona, Italy
Civolani, Marco   Università di Udine; Udine, Italy
Bank, Balázs   Budapest University of Technology and Economics; Budapest, Hungary

Abstract
An exploratory experiment was carried out in which subjects with different musical skills were asked to play a digital piano keyboard, first by following a specific key sequence and style of execution, and then performing freely. Judgments of perceived sound quality were recorded in three different settings, including standard use of the digital piano with its own internal loudspeakers, and conversely use of the same keyboard for controlling a physics-based piano sound synthesis model running on a laptop in real time. Through its audio card, the laptop drove a couple of external loudspeakers, and occasionally a couple of shakers screwed to the bottom of the keyboard. The experiment showed that subjects prefer the combination of sonic and vibrotactile feedback provided by the synthesis model when playing the key sequences, whereas they favour the quality of the original instrument when performing freely. These results spring from a preliminary evaluation, and they were in good accordance with the development stage of the synthesis software at the time of the experiment. They suggest that vibrotactile feedback modifies, and potentially improves, the performer's experience when playing on a digital piano keyboard.

Keywords
digital piano keyboard, physical modeling, tactile augmentation, vibrotactile perception

Paper topics
Multimodality in sound and music computing, Perception and cognition of sound and music

Easychair keyphrases
vibrotactile feedback [17], digital piano [11], sound quality [7], digital piano keyboard [6], musical instrument [6], mean value [5], piano sound [5], non musician [4], real time [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849873
Zenodo URL: https://zenodo.org/record/849873


2011.9
AN INTERACTIVE MUSIC COMPOSITION SYSTEM BASED ON AUTONOMOUS MAINTENANCE OF MUSICAL CONSISTENCY
Kitahara, Tetsuro   Nihon University; Tokyo, Japan
Fukayama, Satoru   The University of Tokyo; Tokyo, Japan
Sagayama, Shigeki   The University of Tokyo; Tokyo, Japan
Katayose, Haruhiro   Kwansei Gakuin University; Osaka, Japan
Nagata, Noriko   Kwansei Gakuin University; Osaka, Japan

Abstract
Various attempts at automatic music composition systems have been made, but they have not dealt with the issue of how the user can edit the composed piece. In this paper, we propose a human-in-the-loop music composition system, where the manual editing stage is integrated into the composition process. This system first generates a musical piece based on the lyrics input by the user. After that, the user can edit the melody and/or chord progression. The feature of this system is to regenerate, once the user edits the melody or chord progression of the generated piece, the remaining part so that it musically matches the edited part. With this feature, users can try various melodies and arrangements without taking into account the musical inconsistency between the melody and chord progression. We confirmed that this feature facilitated the user's trial and error in elaborating music.

Keywords
Automatic music composition, Bayesian network, Human-in-the-loop

Paper topics
Automatic music generation/accompaniment systems

Easychair keyphrases
chord progression [30], musical piece [12], musical consistency [10], autonomous maintenance [9], chord name [7], remaining part [7], chord voicing [6], light gray chord [6], manual editing stage [6], music composition [6], amc system [5], pitch motion [5], automatic music composition [4], bass line [4], bayesian network [4], dynamic bayesian network [4], melody node [4], music composition system [4], passing note [4], second measure [4], user editing [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849875
Zenodo URL: https://zenodo.org/record/849875


2011.10
An Interactive Surface Realisation of Henri Pousseur's 'Scambi'
Fencott, Robin   Queen Mary University of London; London, United Kingdom
Dack, John   Middlesex University London; London, United Kingdom

Abstract
This paper discusses the design and implementation of an interactive touch surface exhibit which re-appropriates a historic electroacoustic work for the digital age. The electroacoustic work in question is Henri Pousseur's seminal `Scambi' composition, originally created in 1957 at the RAI Studios, Milan. The status of Scambi as a key example of an electroacoustic `open' form makes it ideal for re-appropriation as an interactive public exhibit, while an existing musicological analysis of Pousseur's compositional instructions for Scambi provides insight for the user interface design and the translation of the written compositional process into interactive software. The project is on-going, and this paper presents our current work in progress. We address the musicological, practical and aesthetic implications of this work, discuss informal observation of users engaging with our tabletop system, and comment on the nature of touchscreen interfaces for musical interaction. This work is therefore relevant to the electroacoustic community, fields of human computer interaction, and researchers developing new interfaces for musical expression. This work contributes to the European Commission funded DREAM project.

Keywords
Design, Electroacoustic, Fiducial, Heritage, Interaction, Multi-touch, Music, Pousseur, Re-appropriation, Scambi, Tangible

Paper topics
access and modelling of musical heritage, Interactive performance systems, Interfaces for sound and music, Multimodality in sound and music computing, Social interaction in sound and music computing, Sonic interaction design, Technologies for the preservation

Easychair keyphrases
interactive surface [13], sound segment [9], scambi sequencer [8], electroacoustic work [7], open form [7], fiducial marker [6], touch surface [6], henri pousseur [5], musical expression [5], surface interface [5], tangible object [5], computer vision [4], direct touch [4], dream project [4], middlesex university [4], multi touch [4], public exhibition [4], rai studio [4], scambi section [4], table surface [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849877
Zenodo URL: https://zenodo.org/record/849877


2011.11
Applications of Synchronization in Sound Synthesis
Neukom, Martin   Institute for Computer Music and Sound Technology (ICST), Zurich University of the Arts (ZHdK); Zurich, Switzerland

Abstract
The synchronization of natural and technical periodic processes can be simulated with self-sustained oscillators. Under certain conditions, these oscillators adjust their frequency and their phase to a master oscillator or to other self-sustained oscillators. These processes can be used in sound synthesis for the tuning of non-linear oscillators, for the adjustment of the pitches of other oscillators, for the synchronization of periodic changes of any sound parameters and for the synchronization of rhythms. This paper gives a short introduction to the theory of synchronization, shows how to implement the differential equations which describe the self-sustained oscillators and gives some examples of musical applications. The examples are programmed as mxj~ externals for MaxMSP. The Java code samples are taken from the perform routine of these externals. The externals and Max patches can be downloaded from http://www.icst.net/.
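
The paper's examples are mxj~ externals (Java) for MaxMSP; purely as an illustration of the underlying idea, the sketch below integrates a driven van der Pol oscillator (a self-sustained oscillator) with a semi-implicit Euler step, so that for a suitable driving amplitude its output locks to the external frequency. All parameter values are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (not the paper's mxj~ externals): semi-implicit Euler
# integration of a driven van der Pol oscillator,
#     x'' - mu*(1 - x^2)*x' + w0^2 * x = A * sin(w_drive * t),
# which for suitable A adjusts its phase/frequency to the driving signal.

sr = 44100.0                      # sample rate (Hz)
dt = 1.0 / sr
mu = 5.0                          # nonlinearity / damping parameter (1/s)
f0, f_drive = 220.0, 225.0        # natural and driving frequencies (Hz)
w0, w_drive = 2 * np.pi * f0, 2 * np.pi * f_drive
A = 2e6                           # driving amplitude (illustrative)

x, v = 1.0, 0.0
out = np.empty(int(sr))           # one second of audio-rate output
for n in range(out.size):
    t = n * dt
    a = mu * (1.0 - x * x) * v - w0 * w0 * x + A * np.sin(w_drive * t)
    v += a * dt                   # update velocity first (semi-implicit Euler)
    x += v * dt                   # then position, using the new velocity
    out[n] = x

print("peak amplitude:", np.max(np.abs(out)))
```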

Keywords
Self-Sustained Oscillators, Sound Synthesis, Synchronization

Paper topics
Automatic music generation/accompaniment systems, Digital audio effects, Models for sound analysis and synthesis, Sound/music signal processing algorithms

Easychair keyphrases
van der pol [26], der pol oscillator [25], self sustained oscillator [17], limit cycle [7], max patch smc [7], phase difference [7], differential equation [6], natural frequency [6], non linear [6], phase diagram [6], chaotic oscillator [5], coupled oscillator [5], exciting force [5], rossler oscillator [5], sound synthesis [5], chaotic behavior [4], linear oscillator [4], max patch [4], non linear oscillator [4], phase space [4], quasi linear oscillator [4], rossler system [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849881
Zenodo URL: https://zenodo.org/record/849881


2011.12
A Rule-Based Generative Music System Controlled by Desired Valence and Arousal
Wallis, Isaac   Arizona State University; Tempe, United States
Ingalls, Todd   Arizona State University; Tempe, United States
Campana, Ellen   Arizona State University; Tempe, United States
Goodman, Janel   Arizona State University; Tempe, United States

Abstract
This paper details an emotional music synthesis (EMS) system which is designed around music theory parameters and previous research on music and emotion. This system uses a rule-based algorithm to generate the music from scratch. Results of a user study on this system show that listener ratings of emotional valence and arousal correlate with intended production of musical valence and arousal.
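
A toy sketch of what a rule-based valence/arousal-to-music mapping can look like; the specific rules, parameters and ranges below are invented for illustration and are not the paper's.

```python
# Toy rule-based mapping from desired valence/arousal (each in [-1, 1]) to
# a handful of musical parameters, in the general spirit of an EMS system.
# The specific rules and ranges below are illustrative assumptions only.

def emotion_to_music(valence, arousal):
    return {
        # higher arousal -> faster tempo and louder dynamics
        "tempo_bpm": int(70 + 70 * (arousal + 1) / 2),
        "dynamics": "ff" if arousal > 0.5 else "mf" if arousal > -0.5 else "pp",
        # higher valence -> major mode, smoother (less rough) rhythm
        "mode": "major" if valence >= 0 else "minor",
        "rhythmic_roughness": round(0.5 * (1 - valence), 2),
        # wider pitch register for high-arousal, high-valence settings
        "register_octaves": 3 if arousal > 0 and valence > 0 else 2,
    }

print(emotion_to_music(valence=0.8, arousal=0.6))
print(emotion_to_music(valence=-0.7, arousal=-0.4))
```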

Keywords
Algorithmic Composition, Emotion, Music

Paper topics
Interactive performance systems, Interfaces for sound and music, Models for sound analysis and synthesis, Perception and cognition of sound and music

Easychair keyphrases
musical feature [10], intended valence [9], musical parameter [9], harmonic mode [8], intended arousal [8], musical emotion [8], rhythmic roughness [7], upper extension [7], arousal setting [6], emotional music synthesis [6], perceived valence [6], pitch register [6], voice spacing [6], voice leading [5], voicing size [5], international affective picture system [4], mean clicked valence [4], note generation [4], perceived arousal [4], real time [4], user study [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849861
Zenodo URL: https://zenodo.org/record/849861


2011.13
A Survey of Raaga Recognition Techniques and Improvements to the State-of-the-Art
Koduri, Gopala Krishna   Cognitive Science Lab, International Institute of Information Technology (IIITH); Hyderabad, India
Gulati, Sankalp   Digital Audio Processing Lab (DAP Lab), Indian Institute of Technology Bombay (IITB); Mumbai, India
Rao, Preeti   Digital Audio Processing Lab (DAP Lab), Indian Institute of Technology Bombay (IITB); Mumbai, India

Abstract
Raaga is the spine of Indian classical music. It is the single most crucial element of the melodic framework on which the music of the subcontinent thrives. Naturally, automatic raaga recognition is an important step in computational musicology as far as Indian music is concerned. It has several applications like indexing Indian music, automatic note transcription, comparing, classifying and recommending tunes, and teaching, to mention a few. Simply put, it is the first logical step in the process of creating computational methods for Indian classical music. In this work, we investigate the properties of a raaga and the natural process by which people identify the raaga. We survey the past raaga recognition techniques, correlating them with human techniques, in both north Indian (Hindustani) and south Indian (Carnatic) music systems. We identify the main drawbacks and propose minor, but multiple improvements to the state-of-the-art raaga recognition technique, and discuss and compare it with the previous work.
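
A small sketch of the pitch-class-profile idea that underlies much of the surveyed work: fold a pitch contour (in cents relative to an assumed tonic) into a 12-bin distribution and compare profiles with a simple distance. The tonic value and contours below are hypothetical.

```python
import numpy as np

# Sketch: build a 12-bin pitch-class distribution from a pitch contour (Hz)
# relative to an assumed tonic, then compare two profiles with an L1
# distance. Tonic and contour values are hypothetical.

def pitch_class_profile(contour_hz, tonic_hz, bins=12):
    contour_hz = np.asarray(contour_hz, dtype=float)
    voiced = contour_hz > 0                          # ignore unvoiced frames
    cents = 1200.0 * np.log2(contour_hz[voiced] / tonic_hz)
    pc = np.mod(np.round(cents / (1200.0 / bins)), bins).astype(int)
    hist = np.bincount(pc, minlength=bins).astype(float)
    return hist / hist.sum()

def profile_distance(p, q):
    return float(np.sum(np.abs(p - q)))              # L1 distance between profiles

tonic = 146.8                                        # assumed tonic (D3), Hz
contour_a = tonic * 2 ** (np.array([0, 200, 400, 700, 900, 700, 400]) / 1200.0)
contour_b = tonic * 2 ** (np.array([0, 100, 400, 500, 800, 500, 100]) / 1200.0)
pa, pb = pitch_class_profile(contour_a, tonic), pitch_class_profile(contour_b, tonic)
print("profile distance:", profile_distance(pa, pb))
```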

Keywords
carnatic, hindustani, indian classical music, raaga properties, raaga recognition, survey

Paper topics
access and modelling of musical heritage, Automatic separation, classification of sound and music, Computational musicology, Music information retrieval, recognition, Technologies for the preservation

Easychair keyphrases
pitch class profile [31], pitch class [17], raaga recognition [17], indian classical music [14], pitch class distribution [12], indian music [10], classical music [9], carnatic music [8], distance measure [7], raaga recognition system [7], raaga recognition technique [7], stable note region [7], pitch contour [6], pitch value [6], raaga identification [6], stable note [6], note segmentation [5], bhupali total test sample [4], continuous pitch contour [4], detected stable note region [4], hindustani classical music [4], melodic atom [4], music information retrieval [4], pakad matching [4], pitch detection [4], pitch extraction [4], south indian classical music [4], swara intonation [4], test dataset [4], trained musician [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849863
Zenodo URL: https://zenodo.org/record/849863


2011.14
A toolbox for storing and streaming music-related data
Nymoen, Kristian   University of Oslo; Oslo, Norway
Jensenius, Alexander Refsum   University of Oslo; Oslo, Norway

Abstract
Simultaneous handling and synchronisation of data related to music, such as score annotations, MIDI, video, motion descriptors, sensor data, etc. requires special tools due to the diversity of this data. We present a toolbox for recording and playback of complex music-related data. Using the Sound Description Interchange Format as a storage format and the Open Sound Control protocol as a streaming protocol simplifies exchange of data between composers and researchers.
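
The toolbox itself targets SDIF storage and OSC streaming in the authors' own environment; purely as a hedged illustration of the streaming side, the sketch below sends a time-stamped motion-descriptor frame over OSC using the python-osc package. The address pattern and payload layout are assumptions, not the toolbox's own.

```python
import time
from pythonosc.udp_client import SimpleUDPClient   # pip install python-osc

# Hedged sketch: stream a time-stamped motion-descriptor frame over Open
# Sound Control, as on the playback/streaming side of such a toolbox. The
# address pattern and payload layout below are hypothetical.

client = SimpleUDPClient("127.0.0.1", 9000)          # receiver host/port (assumed)

def send_frame(t, position_xyz, quantity_of_motion):
    # one OSC message per frame: time tag, 3-D position, a scalar descriptor
    client.send_message("/gdif/frame", [float(t), *map(float, position_xyz),
                                        float(quantity_of_motion)])

start = time.time()
for i in range(5):                                    # stream a few dummy frames
    send_frame(time.time() - start, (0.1 * i, 0.0, 1.2), 0.05 * i)
    time.sleep(0.01)
```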

Keywords
Gesture Description Interchange Format, Open Sound Control, Sound Description Interchange Format, Synchronization of music-related data streams

Paper topics
Computer environments for sound/music processing, Multimodality in sound and music computing, Sonic interaction design

Easychair keyphrases
data type [7], motion capture data [7], music related data [7], international computer music [6], music related body motion [6], interchange format [5], music related [5], playback module [5], type tag [5], description interchange format [4], dimensional position stream [4], file name [4], frame type [4], gdif data [4], gdif data type [4], matrix type [4], open sound control protocol [4], real time [4], sdif file [4], sound description interchange [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849865
Zenodo URL: https://zenodo.org/record/849865


2011.15
Audio Physical Computing
Valle, Andrea   Interdipartimental Center for Research on Multimedia and Audiovideo (CIRMA), Dipartimento di Studi Umanistici, Università di Torino; Torino, Italy

Abstract
The paper describes an approach to the control of electromechanical devices for musical purposes (mainly, DC motors and solenoids) using audio signals. The proposed approach can be named audio physical computing, i.e. physical computing oriented towards sound generation by means of audio signals. The approach has its origin in a previous physical computing project dedicated to music generation, the Rumentarium Project, that used microcontrollers as the main computing hardware interface. First, some general aspects of physical computing are discussed and the Rumentarium project is introduced. Then, a reconsideration of the technical setup of the Rumentarium is developed, and the audio physical computing approach is considered as a possible replacement for microcontrollers. Finally, a music work is described, in order to provide a real-life example of audio physical computing.

Keywords
audio control signal, interaction, physical computing

Paper topics
Interactive performance systems, Interfaces for sound and music

Easychair keyphrases
sound card [34], physical computing [23], audio physical computing [20], audio signal [17], sound body [13], dc coupled sound card [10], cifre del colpo [7], physical object [7], audio physical computing scenario [6], rumentarium project [6], control signal [5], direct current [5], electromechanical device [5], onset detection [5], pulse train [5], software layer [5], digital control [4], electronic music [4], operating system [4], real time [4], rotation direction [4], rotation speed [4], sound generation [4], square wave [4], unit sound body [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849883
Zenodo URL: https://zenodo.org/record/849883


2011.16
Auditory Feedback in a Multimodal Balancing Task: Walking on a Virtual Plank
Serafin, Stefania   Aalborg University; Aalborg, Denmark
Turchet, Luca   Aalborg University; Aalborg, Denmark
Nordahl, Rolf   Aalborg University; Aalborg, Denmark

Abstract
We describe a multimodal system which exploits the use of footwear-based interaction in virtual environments. We developed a pair of shoes enhanced with pressure sensors, actuators, and markers. Such shoes control a multichannel surround sound system and drive a physically based sound synthesis engine which simulates the act of walking on different surfaces. We present the system in all its components, and explain its ability to simulate natural interactive walking in virtual environments. The system was used in an experiment whose goal was to assess the ability of subjects to walk blindfolded on a virtual plank. Results show that subjects perform the task slightly better when they are exposed to haptic feedback as opposed to auditory feedback, although no significant differences are measured. The combination of auditory and haptic feedback does not significantly enhance the task performance.

Keywords
auditory feedback, balancing task, haptic feedback, multimodal experience

Paper topics
Multimodality in sound and music computing

Easychair keyphrases
haptic feedback [19], auditory feedback [11], motion capture system [9], natural interactive walking [9], virtual environment [8], virtual plank [8], audio haptic [7], pressure sensor [7], creaking wood [5], haptic shoe [5], aalborg university copenhagen [4], footstep sound [4], medium technology aalborg [4], synthesis engine [4], technology aalborg university [4], virtual reality haptic shoe [4], visual feedback [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849885
Zenodo URL: https://zenodo.org/record/849885


2011.17
Automatically detecting key modulations in J.S. Bach chorale recordings
Mearns, Lesley   Queen Mary University of London; London, United Kingdom
Benetos, Emmanouil   Queen Mary University of London; London, United Kingdom
Dixon, Simon   Queen Mary University of London; London, United Kingdom

Abstract
This paper describes experiments to automatically detect key and modulation in J.S. Bach chorale recordings. Transcribed audio is processed into vertical notegroups, and the groups are automatically assigned chord labels in accordance with Schoenberg's definition of diatonic triads and sevenths for the 24 major and minor modes. For comparison, MIDI representations of the chorales are also processed. Hidden Markov Models (HMMs) are used to detect key and key change in the chord sequences, based upon two approaches to chord and key transition representations. Our initial hypothesis is that key and chord values which are systematically derived from pre-eminent music theory will produce the most accurate models of key and modulation. The music theory models are therefore tested against models embodying Krumhansl's data resulting from perceptual experiments about chords and harmonic relations. We conclude that the music theory models produce better results than the perceptual data but that all of the models produce good results. The use of transcribed audio produces encouraging results, with the key detection outputs ranging from 79% to 97% of the MIDI ground truth results.
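
A compact illustration of the HMM decoding step: given an observation matrix P(chord | key) and a key-transition matrix, Viterbi decoding recovers a key/modulation path from a chord sequence. The toy key set and probability tables below are invented, not the paper's theory- or Krumhansl-derived matrices.

```python
import numpy as np

# Sketch: Viterbi decoding of a hidden key sequence from observed chord
# labels, as in an HMM-based modulation detector. The two keys, chord
# vocabulary and probability tables are toy values for illustration.

keys = ["C major", "G major"]
chords = ["C", "F", "G", "D"]
B = np.array([[0.40, 0.30, 0.25, 0.05],    # observation matrix P(chord | C major)
              [0.30, 0.05, 0.40, 0.25]])   # P(chord | G major)
A = np.array([[0.95, 0.05],                # key transition matrix (keys are "sticky")
              [0.05, 0.95]])
pi = np.array([0.5, 0.5])                  # initial key distribution

def viterbi(obs):
    T, N = len(obs), len(keys)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, T):
        scores = delta[t - 1][:, None] + np.log(A)   # scores[i, j]: key i -> key j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):                    # backtrack the best key path
        path.append(int(psi[t][path[-1]]))
    return [keys[k] for k in reversed(path)]

observed = [chords.index(c) for c in ["C", "F", "C", "G", "D", "G"]]
print(viterbi(observed))
```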

Keywords
Harmony, Hidden Markov models, Key detection, Perception, Polyphonic music transcription, Systematic musicology

Paper topics
Automatic separation, classification of sound and music, Computational musicology, Content processing of music audio signals, Perception and cognition of sound and music, recognition

Easychair keyphrases
observation matrix [16], key transition matrix [15], music theory [14], ground truth [13], diatonic chord [8], key change [8], transcribed audio [8], truth midi [8], chord sequence [7], error rate [7], ground truth midi [7], music theory model [7], perceptual data [7], pitch class set [7], transcribed data [7], chord value [6], complex chord [6], dist conc mod [6], hand annotated sequence [6], hidden markov model [6], key context [6], key modulation detection [6], midi data [6], pitch class [6], chord rating [5], err dist [5], home key [5], major chord [5], midi pitch [5], minor key [5]

Paper type
Full paper

DOI: 10.5281/zenodo.849893
Zenodo URL: https://zenodo.org/record/849893


2011.18
Automatic Creation of Mood Playlists in the Thayer Plane: A Methodology and a Comparative Study
Panda, Renato   University of Coimbra; Coimbra, Portugal
Paiva, Rui Pedro   University of Coimbra; Coimbra, Portugal

Abstract
We propose an approach for the automatic creation of mood playlists in the Thayer plane (TP). Music emotion recognition is tackled as a regression and classification problem, aiming to predict the arousal and valence (AV) values of each song in the TP, based on Yang’s dataset. To this end, a high number of audio features are extracted using three frameworks: PsySound, MIR Toolbox and Marsyas. The extracted features and Yang’s annotated AV values are used to train several Support Vector Regressors, each employing different feature sets. The best performance, in terms of R2 statistics, was attained after forward feature selection, reaching 63% for arousal and 35.6% for valence. Based on the predicted location of each song in the TP, mood playlists can be created by specifying a point in the plane, from which the closest songs are retrieved. Using one seed song, the accuracy of the created playlists was 62.3% for 20-song playlists, 24.8% for 5-song playlists and 6.2% for the top song.
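
Once AV values are predicted for every song, playlist creation reduces to nearest-neighbour retrieval around a point (or a seed song) in the plane, as sketched below with made-up coordinates.

```python
import numpy as np

# Sketch: once each song has predicted (valence, arousal) coordinates in
# the Thayer plane, a mood playlist is simply the k songs closest to a
# requested point (or to a seed song). Coordinates below are made up.

songs = {"song_a": (0.7, 0.6), "song_b": (-0.4, 0.3),
         "song_c": (0.6, 0.5), "song_d": (-0.6, -0.7), "song_e": (0.1, 0.9)}

def mood_playlist(point, k=3):
    names = list(songs)
    coords = np.array([songs[n] for n in names])
    dists = np.linalg.norm(coords - np.asarray(point), axis=1)
    return [names[i] for i in np.argsort(dists)[:k]]

# playlist around a happy/energetic point, and around a seed song
print(mood_playlist((0.8, 0.7)))
print(mood_playlist(songs["song_b"]))
```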

Keywords
classification, mood, playlist, regression

Paper topics
Automatic separation, classification of sound and music, Music information retrieval, Perception and cognition of sound and music, recognition

Easychair keyphrases
music information retrieval [11], mir toolbox [9], thayer plane [9], feature set [7], music emotion recognition [7], seed song [7], automatic playlist generation [6], mood detection [6], music mood [6], playlist generation [6], audio signal [5], dimensional model [5], feature selection [5], audio feature [4], av mood modeling [4], best result [4], desired mood trajectory [4], emotional state [4], feature extraction [4], feature selection algorithm [4], ground truth [4], high number [4], language processing [4], mood playlist [4], song playlist [4], thayer model [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849887
Zenodo URL: https://zenodo.org/record/849887


2011.19
Automatic Multi-track Mixing Using Linear Dynamical Systems
Scott, Jeffrey   Drexel University; Philadelphia, United States
Prockup, Matthew   Drexel University; Philadelphia, United States
Schmidt, Erik   Drexel University; Philadelphia, United States
Kim, Youngmoo   Drexel University; Philadelphia, United States

Abstract
Over the past several decades music production has evolved from something that was only possible with multi-room, multi-million dollar studios into the province of the average person’s living room. New tools for digital production have revolutionized the way we consume and interact with music on a daily basis. We propose a system based on a structured audio framework that can generate a basic mix-down of a set of multi-track audio files using parameters learned through supervised machine learning. Given the new surge of mobile content consumption, we extend this system to operate on a mobile device as an initial measure towards an integrated interactive mixing platform for multi-track music.
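
As a minimal illustration of the linear-dynamical-system view (not the paper's model), the sketch below uses a scalar Kalman filter to smooth noisy per-frame estimates of a single track's mixing coefficient. The noise settings and data are illustrative assumptions.

```python
import numpy as np

# Sketch: a scalar Kalman filter (the simplest linear dynamical system)
# smoothing noisy per-frame estimates of one track's mixing coefficient.
# Process/measurement noise values and the data are illustrative only.

rng = np.random.default_rng(2)
true_gain = 0.6 + 0.2 * np.sin(np.linspace(0, np.pi, 200))   # slowly varying fader
measured = true_gain + rng.normal(0, 0.1, true_gain.size)    # noisy frame estimates

q, r = 1e-4, 0.1 ** 2        # process and measurement noise variances
x, p = measured[0], 1.0      # state estimate and its variance
smoothed = np.empty_like(measured)
for t, z in enumerate(measured):
    p = p + q                             # predict (random-walk gain model)
    k = p / (p + r)                       # Kalman gain
    x = x + k * (z - x)                   # correct with the new measurement
    p = (1.0 - k) * p
    smoothed[t] = x

print("RMS error raw: %.3f, filtered: %.3f"
      % (np.sqrt(np.mean((measured - true_gain) ** 2)),
         np.sqrt(np.mean((smoothed - true_gain) ** 2))))
```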

Keywords
automatic mixing, machine learning, mobile devices, music production

Paper topics
Automatic music generation/accompaniment systems, Computer environments for sound/music processing, Content processing of music audio signals, Interfaces for sound and music, Multimodality in sound and music computing, Music information retrieval

Easychair keyphrases
mixing coefficient [17], linear dynamical system [9], mobile device [8], final mix [7], structured audio [7], weighting coefficient [7], automatic mixing system [6], least square [6], multi track [6], supervised machine learning [6], amplitude amplitude [5], real time [5], acoustic feature [4], fader value [4], game console [4], ground truth weight [4], hardware accelerated linear algebra [4], individual track [4], kalman filtering [4], multi track session [4], source audio [4], time varying mixing [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849889
Zenodo URL: https://zenodo.org/record/849889


2011.20
BeatLED - The Social Gaming Partyshirt
De Nies, Tom   Multimedia Lab (MMLab), Interdisciplinary Institute for BroadBand Technology (IBBT), Ghent University; Ghent, Belgium
Vervust, Thomas   Center for Microsystems Technology (CMST), Ghent University; Ghent, Belgium
Demey, Michiel   Institute for Psychoacoustics and Electronic Music (IPEM), Ghent University; Ghent, Belgium
Van De Walle, Rik   Multimedia Lab (MMLab), Interdisciplinary Institute for BroadBand Technology (IBBT), Ghent University; Ghent, Belgium
Vanfleteren, Jan   Center for Microsystems Technology (CMST), Ghent University; Ghent, Belgium
Leman, Marc   Institute for Psychoacoustics and Electronic Music (IPEM), Ghent University; Ghent, Belgium

Abstract
This paper describes the development of a social game, BeatLED, using music, movement and luminescent textile. The game is based on a tool used in research on synchronization of movement and music, and social entrainment, at the Institute for Psychoacoustics and Electronic Music (IPEM) at Ghent University. Players, divided into several teams, synchronize to music and receive a score in real time, depending on how well they synchronize with the music and each other. While this paper concentrates on the game design and dynamics, an appropriate and original means of providing output to the end users was needed. To accommodate this output, a flexible, stretchable LED-display was developed at CMST (Ghent University), and embedded into textile. In this paper we analyze the characteristics a musical social game should have, as well as the overall merit of such a game. We discuss the various technologies involved, the game design and dynamics, a proof-of-concept implementation and the most prominent test results. We conclude that a real-world implementation of this game not only is feasible, but would also have several applications in multiple sectors, such as musicology research, team-building and health care.

Keywords
multimodal interaction, musical game, social entrainment

Paper topics
Music information retrieval, Perception and cognition of sound and music, Social interaction in sound and music computing, Sound and music for VR and games

Easychair keyphrases
social game [14], team score [9], ghent university [7], test session [7], audio track [6], concept implementation [6], led display [6], wii remote [6], acceleration data [5], game dynamic [5], individual score [5], peak detection [5], score array [5], synchronizing algorithm [5], highest scoring team [4], lowest scoring team [4], peak detection algorithm [4], sampling rate [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849895
Zenodo URL: https://zenodo.org/record/849895


2011.21
C. Elegans Meets Data Sonification: Can We Hear its Elegant Movement?
Terasawa, Hiroko   University of Tsukuba; Tsukuba, Japan
Takahashi, Yuta   University of Tsukuba; Tsukuba, Japan
Hirota, Keiko   University of Tsukuba; Tsukuba, Japan
Hamano, Takayuki   Exploratory Research for Advanced Technology (ERATO), Okanoya Emotional Information Project, Japan Science and Technology Agency (JST); Kawaguchi, Japan
Yamada, Takeshi   University of Tsukuba; Tsukuba, Japan
Fukamizu, Akiyoshi   University of Tsukuba; Tsukuba, Japan
Makino, Shoji   University of Tsukuba; Tsukuba, Japan

Abstract
We introduce our video-data sonification of Caenorhabditis elegans (C. elegans), a small nematode worm that has been extensively used as a model organism in molecular biology. C. elegans exhibits various kinds of movements, which may be altered by genetic manipulations. In pursuit of potential applications of data sonification in molecular biology, we converted video data of this worm into sounds, aiming to distinguish the movements by hearing. The video data of C. elegans wild type and transgenic types were sonified using a simple motion-detection algorithm and granular synthesis. The movement of the worm in the video was transformed into the sound cluster of very-short sine-tone wavelets. In the evaluation test, the group of ten participants (from both molecular biology and audio engineering) were able to distinguish sonifications of the different worm types with an almost 100% correct response rate. In the post-experiment interview, the participants reported more detailed and accurate comprehension on the timing of the worm's motion in sonification than in video.
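
A hedged sketch of the general pipeline only: frame differencing as a crude motion detector, driving short windowed sine grains (granular synthesis). Grain length, frequency and the motion-to-amplitude mapping are assumptions, not the authors' method.

```python
import numpy as np

# Sketch of the general pipeline only: frame differencing as a crude motion
# detector, then short Hann-windowed sine grains whose amplitude follows the
# motion energy. Grain length, frequency and the mapping are assumptions.

sr = 44100

def sine_grain(freq, dur=0.02, amp=1.0):
    t = np.arange(int(sr * dur)) / sr
    return amp * np.hanning(t.size) * np.sin(2 * np.pi * freq * t)

def sonify(frames, fps=25, freq=880.0):
    out = np.zeros(int(sr * len(frames) / fps))
    for i in range(1, len(frames)):
        motion = np.mean(np.abs(frames[i].astype(float) - frames[i - 1].astype(float)))
        grain = sine_grain(freq, amp=min(1.0, motion / 50.0))
        start = int(i / fps * sr)
        out[start:start + grain.size] += grain[:out.size - start]
    return out

# dummy "video": 50 random 64x64 grayscale frames stand in for worm footage
frames = np.random.default_rng(3).integers(0, 255, (50, 64, 64)).astype(np.uint8)
audio = sonify(frames)
print("generated %.2f s of audio, peak %.3f" % (audio.size / sr, np.abs(audio).max()))
```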

Keywords
C. Elegans, Data sonification, Molecular biology, Research communication, Video

Paper topics
Auditory display and data sonification

Easychair keyphrases
video data [15], audio engineer [11], granular synthesis [10], data sonification [8], evaluation test [7], identification task [7], molecular biologist [7], molecular biology [7], auditory display [6], wild type [6], caenorhabditis elegan [5], model organism [5], sound cluster [5], correct response rate [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849897
Zenodo URL: https://zenodo.org/record/849897


2011.22
Comparing Inertial and Optical MoCap Technologies for Synthesis Control
Skogstad, Ståle A.   University of Oslo; Oslo, Norway
Nymoen, Kristian   University of Oslo; Oslo, Norway
Høvin, Mats   University of Oslo; Oslo, Norway

Abstract
This paper compares the use of two different technologies for controlling sound synthesis in real time: the infrared marker-based motion capture system OptiTrack and Xsens MVN, an inertial sensor-based motion capture suit. We present various quantitative comparisons between the data from the two systems and results from an experiment where a musician performed simple musical tasks with the two systems. Both systems are found to have their strengths and weaknesses, which we will present and discuss.

Keywords
Motion capture, OptiTrack, Synthesis Control, Xsens

Paper topics
Interfaces for sound and music, Multimodality in sound and music computing, Sonic interaction design

Easychair keyphrases
motion capture [30], motion capture system [15], mocap system [10], real time [10], motion capture technology [7], optitrack system [7], xsen mvn suit [7], continuous onset task [6], global coordinate system [6], infrared marker based motion [6], marker based motion capture system [6], xsen mvn system [6], position data [5], acceleration data [4], control data [4], hand clap [4], inertial sensor [4], irmocap system [4], left foot [4], marker based motion capture [4], motion capture data [4], optical marker based motion [4], pitch following task [4], positional drift [4], rigid body [4], standard deviation [4], xsen suit [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849899
Zenodo URL: https://zenodo.org/record/849899


2011.23
DANCEREPRODUCER: AN AUTOMATIC MASHUP MUSIC VIDEO GENERATION SYSTEM BY REUSING DANCE VIDEO CLIPS ON THE WEB
Nakano, Tomoyasu   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Murofushi, Sora   Waseda University; Tokyo, Japan
Goto, Masataka   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Morishima, Shigeo   Waseda University; Tokyo, Japan

Abstract
We propose a dance video authoring system, DanceReProducer, that can automatically generate a dance video clip appropriate to a given piece of music by segmenting and concatenating existing dance video clips. In this paper, we focus on the reuse of ever-increasing user-generated dance video clips on a video sharing web service. In a video clip consisting of music (audio signals) and image sequences (video frames), the image sequences are often synchronized with or related to the music. Such relationships are diverse in different video clips, but were not dealt with by previous methods for automatic music video generation. Our system employs machine learning and beat tracking techniques to model these relationships. To generate new music video clips, short image sequences that have been previously extracted from other music clips are stretched and concatenated so that the emerging image sequence matches the rhythmic structure of the target song. Besides automatically generating music videos, DanceReProducer offers a user interface in which a user can interactively change image sequences just by choosing different candidates. This way people with little knowledge or experience in MAD movie generation can interactively create personalized video clips.

Keywords
Mashup, Music Signal Processing, Music Video Generation

Paper topics
Content processing of music audio signals, Multimodality in sound and music computing

Easychair keyphrases
image sequence [45], video clip [42], bar level feature [15], dance video [15], music video [13], dance video clip [12], music structure [12], bar level [11], music video clip [9], visual unit [9], mad movie [8], video sharing web service [8], view count [8], audio signal [7], linear regression model [7], mashup video clip [7], video generation [7], visual feature [7], automatic music video generation [6], chorus section [6], context relationship [6], existing dance video clip [6], musical section [6], segmenting and concatenating [6], accumulated cost [5], dance motion [5], frame feature [5], temporal continuity [5], dance video authoring [4], mad movie generation [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849903
Zenodo URL: https://zenodo.org/record/849903


2011.24
Demetrio Stratos Rethinks Voice Techniques: A Historical Investigation at ISTC in Padova
Ceolin, Elena   Dipartimento di Storia delle Arti Visive e della Musica, Università di Padova; Padova, Italy
Tisato, Graziano   Istituto di Scienze e Tecnologie della Cognizione (ISTC); Padova, Italy
Zattra, Laura   Dipartimento di Storia delle Arti Visive e della Musica, Università di Padova; Padova, Italy

Abstract
Demetrio Stratos (1945-1979) was a singer known for his creative use of vocal techniques such as diplophony, bitonality and diphony (overtone singing). His need to know the scientific explanation for such vocal behaviors drove him to visit the ISTC in Padova (Institute of Cognitive Sciences and Technologies) in the late Seventies. ISTC technical resources and the collaboration with Franco Ferrero and Lucio Croatto (phonetics and phoniatric experts) allowed him to analyze his own phono-articulatory system and the effects he was able to produce. This paper presents the results of a broad historical survey of Stratos’ research at the ISTC. The historic investigation is made possible by textual criticism and interpretation based on different sources: digital and audio sources, sketches, various bibliographical references (published or unpublished) and oral communications. Sonograms of Stratos’ exercises (made at the time and recently redone) show that various abilities existed side by side in the same performer, which is rare to find. This marks his uniqueness in the avant-garde and popular music scene of the time. The ultimate aim of this study was to produce a digital archive for the preservation and conservation of the sources related to this period.

Keywords
analysis, historical investigation, philology and sources, preservation

Paper topics
access and modelling of musical heritage, Technologies for the preservation

Easychair keyphrases
istc archive [30], demetrio strato [24], conservative copy [12], magnetic tape [12], compact cassette [11], di demetrio strato [9], elena ceolin [8], overtone singing [7], academic year [6], di alcuni tipi [6], graduation thesis [6], lo strumento voce [6], tecnologie della cognizione [6], vocal cord [6], vocal technique [6], ferrero franco [5], analogue audio document [4], arti visive e della [4], basf chromdioxid [4], conservative copy cif0008 [4], conservative copy cif0009 [4], digital archive [4], harmonic partial [4], il fondo demetrio strato [4], sergio canazza targon [4], visive e della musica [4], vocal effect [4], vocalizzo di demetrio [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849907
Zenodo URL: https://zenodo.org/record/849907


2011.25
Design and Applications of a Multi-Touch Musical Keyboard
McPherson, Andrew P.   Drexel University; Philadelphia, United States
Kim, Youngmoo   Drexel University; Philadelphia, United States

Abstract
This paper presents a hardware and software system for adding multiple touch sensitivity to the piano-style keyboard. The traditional keyboard is a discrete interface, defining notes by onset and release. By contrast, our system allows continuous gestural control over multiple dimensions of each note by sensing the position and size of up to three touches per key. Sensors are constructed using system-on-chip capacitive touch sensing controllers on circuit boards shaped to each key. The boards are laminated with thin plastic sheets to provide a traditional feel to the performer. The sensors, which are less than 3mm thick, mount atop an existing acoustic or electronic piano keyboard. The hardware connects by USB, and software on a host computer generates OSC messages reflecting a broad array of low- and high-level gestures, including motion of single points, two- and three-finger pinch and slide gestures, and continuous glissando tracking across multiple keys. This paper describes the sensor design and presents selected musical mappings.

Keywords
Gesture recognition, Keyboard, Music interfaces, Touch sensing

Paper topics
Interactive performance systems, Interfaces for sound and music

Easychair keyphrases
white key [13], horizontal position [7], host controller [7], black key [6], higher level gestural feature [6], acoustic piano [5], circuit board [5], osc message [5], capacitive sensing [4], computer music [4], contact area [4], continuous key position sensing [4], existing touch [4], expressive plucked string synthesis [4], key touch [4], multiple touch sensitivity [4], multi touch [4], octave controller [4], open sound control [4], touch location [4], touch position [4], touch sensitivity [4], touch size [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849909
Zenodo URL: https://zenodo.org/record/849909


2011.26
Designing an Expressive Virtual Percussive Instrument
Dolhansky, Brian   Drexel University; Philadelphia, United States
McPherson, Andrew P.   Drexel University; Philadelphia, United States
Kim, Youngmoo   Drexel University; Philadelphia, United States

Abstract
One advantage of modern smart phones is their ability to run complex applications such as instrument simulators. Most available percussion applications use a trigger-type implementation to detect when a user has made a gesture corresponding to a drum hit, which limits the expressiveness of the instrument. This paper presents an alternative method for detecting drum gestures and producing a latency-reduced output sound. Multiple features related to the shape of the percussive stroke are also extracted. These features are used in a variety of physically-inspired and novel sound mappings. The combination of these components provides an expressive percussion experience for the user.
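
An illustrative sketch (not the paper's detector): peak-picking an accelerometer-magnitude stream to detect a hit gesture and using the peak height as a rough velocity proxy. Threshold and refractory values are assumptions.

```python
import numpy as np

# Illustrative sketch: detect drum-hit gestures by peak-picking the
# accelerometer magnitude of the forward swing, using the peak height as a
# velocity estimate. Threshold and refractory values are assumptions.

def detect_hits(acc_mag, threshold=1.5, refractory=10):
    """acc_mag: 1-D array of accelerometer magnitudes (g), one per sample."""
    hits, last = [], -refractory
    for i in range(1, len(acc_mag) - 1):
        is_peak = acc_mag[i] > acc_mag[i - 1] and acc_mag[i] >= acc_mag[i + 1]
        if is_peak and acc_mag[i] > threshold and i - last >= refractory:
            hits.append((i, float(acc_mag[i])))     # (sample index, velocity proxy)
            last = i
    return hits

# synthetic stream: quiet motion with two sharp swings added on top
rng = np.random.default_rng(4)
stream = np.abs(rng.normal(1.0, 0.05, 200))
stream[60:64] += [1.0, 2.2, 1.4, 0.5]
stream[140:144] += [0.8, 1.9, 1.1, 0.4]
print(detect_hits(stream))
```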

Keywords
expression, mobile platform, percussion

Paper topics
Interactive performance systems, Interfaces for sound and music, Multimodality in sound and music computing, Sound and music for VR and games

Easychair keyphrases
mobile device [16], output sound [14], percussion instrument [9], accelerometer profile [7], bass drum [7], drum stick [7], forward swing [7], hit prediction [7], acceleration magnitude [6], accelerometer sample [6], velocity estimate [6], back swing [5], hit detection [5], musical expression [5], percussive stroke [5], playing style [5], accelerometer magnitude [4], actual peak [4], expressive virtual percussion instrument [4], feature extraction [4], onset detection [4], physical instrument [4], robust peak picking algorithm [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849911
Zenodo URL: https://zenodo.org/record/849911


2011.27
Distance Mapping for Corpus-Based Concatenative Synthesis
Schwarz, Diemo   UMR STMS, Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France

Abstract
In the most common approach to corpus-based concatenative synthesis, the unit selection takes place as a content-based similarity match based on a weighted Euclidean distance between the audio descriptors of the database units and the synthesis target. While the simplicity of this method explains the relative success of CBCS for interactive descriptor-based granular synthesis--especially when combined with a graphical interface--and audio mosaicing, and still makes it possible to express categorical matches, certain desirable constraints cannot be expressed, such as disallowing repetition of units, matching a disjunction of descriptor ranges, or asymmetric distances. We therefore map the individual descriptor distances by a warping function that can express these criteria, while still being amenable to efficient multi-dimensional search indices like the kD-tree, for which we define the preconditions and cases of applicability.
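
In the spirit of the paper, the sketch below passes each signed per-descriptor distance through a mapping function before the weighted Euclidean combination used for unit selection; the two example mappings and the data are illustrative, not the paper's definitions.

```python
import numpy as np

# Sketch: each signed per-descriptor distance is passed through a mapping
# function before the weighted Euclidean combination used for unit
# selection. The two example mappings below are illustrative only.

def identity(d):
    return d

def asymmetric(d):
    # penalise units whose descriptor lies below the target much more
    # strongly than units lying above it (a one-sided preference)
    return np.where(d < 0, 10.0 * np.abs(d), d)

def select_unit(target, units, weights, mappings):
    """units: (n_units, n_descriptors); signed distances are mapped per
    descriptor, then combined as a weighted Euclidean distance."""
    signed = units - target                          # signed descriptor distances
    mapped = np.column_stack([m(signed[:, j]) for j, m in enumerate(mappings)])
    cost = np.sqrt(np.sum(weights * mapped ** 2, axis=1))
    return int(np.argmin(cost)), cost

target = np.array([0.5, 0.2])                        # e.g. [pitch, brightness], normalised
corpus = np.array([[0.45, 0.25], [0.60, 0.10], [0.30, 0.50]])
best, costs = select_unit(target, corpus, weights=np.array([1.0, 2.0]),
                          mappings=[identity, asymmetric])
print("selected unit:", best, "costs:", np.round(costs, 3))
```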

Keywords
audio descriptors, audio mosaicing, concatenative synthesis, constraints, content-based retrieval, corpus-based synthesis, databases, search algorithms, similarity, unit selection

Paper topics
Content processing of music audio signals, Models for sound analysis and synthesis

Easychair keyphrases
distance mapping function [31], distance mapping [22], corpus based concatenative synthesis [16], unit selection [10], real time [7], selection criterion [5], time interactive [5], asymmetric distance mapping function [4], asymmetric duration distance mapping [4], audio mosaicing [4], binary distance mapping function [4], descriptor space [4], distance calculation [4], elimination rule [4], individual signed descriptor distance [4], kd tree search [4], mapped distance [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849913
Zenodo URL: https://zenodo.org/record/849913


2011.28
DYNAMIC INTERMEDIATE MODELS FOR AUDIOGRAPHIC SYNTHESIS
Goudard, Vincent   Lutheries - Acoustique - Musique (LAM), Institut Jean Le Rond d'Alembert; Paris, France
Genevois, Hugues   Lutheries - Acoustique - Musique (LAM), Institut Jean Le Rond d'Alembert; Paris, France
Doval, Boris   Lutheries - Acoustique - Musique (LAM), Institut Jean Le Rond d'Alembert; Paris, France
Ghomi, Émilien   Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud XI; Paris, France

Abstract
When developing and configuring software instruments, the way data from gesture interfaces are correlated with the parameters required to control the synthesis, i.e. the mapping, plays a decisive role in the ergonomics, playability and expressiveness of a device. The authors propose an approach based on a modular software design. In order to improve and enrich the interaction between musicians and their instrument, the authors introduce the notion of "Dynamic Intermediate Models" (DIM), designed within a modular software architecture to complete and extend the notion of mapping functions. In such a scheme, these modules are inserted between those dedicated to formatting the data from the interfaces and those in charge of audio-graphic synthesis and rendering. In this paper, the general framework of the software architecture and the concept of "Dynamic Intermediate Models" are presented and developed, together with a theoretical programme for implementing DIMs based on a multidisciplinary approach that takes the different aspects of evaluation into account.

Keywords
audio synthesis, data mapping, Human-Computer Interaction (HCI), instrumentality, musical gesture

Paper topics
Computer environments for sound/music processing, Digital audio effects, Interactive performance systems, Interfaces for sound and music, Models for sound analysis and synthesis, Multimodality in sound and music computing, Sonic interaction design

Easychair keyphrases
meta mallette [8], dynamic intermediate model [7], musical instrument [7], computer music [6], puce muse [6], interaction device [4], lam institut jean [4], les nouveaux geste [4], modular software architecture [4], musical gesture [4], non linear [4], orjo project [4], real time [4], roulette model [4], synthesis algorithm [4], synthesis parameter [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849915
Zenodo URL: https://zenodo.org/record/849915


2011.29
Emotional response to major mode musical pieces: score-dependent perceptual and acoustic analysis
Canazza, Sergio   Università di Padova; Padova, Italy
De Poli, Giovanni   Università di Padova; Padova, Italy
Rodà, Antonio   Università di Padova; Padova, Italy

Abstract
In the Expressive Information Processing field, some studies have investigated the relation between music and emotions, showing that it is possible to correlate the listeners' main appraisal categories with the acoustic parameters that best characterize expressive intentions, thereby defining score-independent models of expressiveness. Other studies take into account that part of the emotional response to music results from the cognitive processing of musical structures (key, modalities, rhythm), which are known to be expressive in the context of the Western musical system. Almost all of these studies investigate emotional responses to music by using linguistic labels, which is potentially problematic since it can encourage participants to simplify what they actually experience. Recently, some authors proposed an experimental method that makes no use of linguistic labels. By means of multidimensional scaling (MDS), a two-dimensional space was found to provide a good fit to the data, with arousal and emotional valence as the primary dimensions. In order to bring out other latent dimensions, a perceptual experiment and a comprehensive acoustic analysis were carried out using a set of musical pieces all in major mode. Results show that participants tend to organize the stimuli into three clusters, related to musical tempo and to timbral aspects such as the spectral energy distribution.

Keywords
Audio analysis, Expressive information processing, Musical and physical gestures, Perceptual analysis

Paper topics
Music performance analysis and rendering

Easychair keyphrases
major mode [13], emotional response [8], high arousal [8], major major [8], minor minor [8], bigand experiment [7], expressive intention [7], mean value [7], musical piece [7], acoustic feature [6], audio feature [6], listener main appraisal category [6], low arousal [6], minimum error rate [6], acoustical society [5], high valence [5], low valence [5], musical expression [5], musical structure [5], music performance [5], verbal label [5], acoustic analysis [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849917
Zenodo URL: https://zenodo.org/record/849917


2011.30
ENSEMBLE: IMPLEMENTING A MUSICAL MULTIAGENT SYSTEM FRAMEWORK
Thomaz, Leandro   Department of Computer Science, University of São Paulo (USP); São Paulo, Brazil
Queiroz, Marcelo   Department of Computer Science, University of São Paulo (USP); São Paulo, Brazil

Abstract
Multiagent systems can be used in a myriad of musical applications, including electro-acoustic composition, automatic musical accompaniment and the study of emergent musical societies. Previous works in this field were usually concerned with solving very specific musical problems and focused on symbolic processing, which limited their widespread use, especially when audio exchange and spatial information were needed. To address this shortcoming, Ensemble, a generic framework for building musical multiagent systems, was implemented, based on a previously defined taxonomy and architecture. The present paper discusses some implementation details and framework features, including event exchange between agents, agent motion in a virtual world, realistic 3D sound propagation simulation, and interfacing with other systems such as Pd and audio processing libraries. A musical application based on Steve Reich’s Clapping Music was conceived and implemented using the framework as a case study to validate the aforementioned features. Finally, we discuss some performance results and the corresponding implementation challenges, together with the solutions we adopted to address them.

Keywords
computer music, multiagent system, software framework

Paper topics
3D sound/music, Computer environments for sound/music processing, Interactive performance systems, Social interaction in sound and music computing, Sound/music signal processing algorithms

Easychair keyphrases
virtual environment [14], sound propagation [13], musical agent [11], sound propagation simulation [9], sound sensor [9], agent position [7], audio frame [7], musical application [7], musical multiagent [7], sound actuator [7], virtual world [7], event exchange [6], musical multiagent application [6], musical multiagent system [6], periodic event exchange [6], audio interface [5], frame size [5], memory access [5], audio processing [4], graphical user interface [4], international computer music [4], newton raphson method [4], open sound control [4], operating system [4], realistic sound propagation simulation [4], real time [4], sound processing [4], sound propagation processing time [4], starting point [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849919
Zenodo URL: https://zenodo.org/record/849919


2011.31
Evaluating of sensor technologies for The Rulers, a kalimba-like Digital Musical Instrument
Brum Medeiros, Carolina   Input Devices and Music Interaction Laboratory (IDMIL), Schulich School of Music, Music Technology Area, McGill University; Montreal, Canada
Wanderley, Marcelo M.   Input Devices and Music Interaction Laboratory (IDMIL), Schulich School of Music, Music Technology Area, McGill University; Montreal, Canada

Abstract
Selecting a sensor technology for a Digital Musical Instrument (DMI) is not obvious, especially because it involves a performance context. For this reason, when designing a new DMI one should be aware of the advantages and drawbacks of each sensor technology and methodology. In this article, we present a discussion of the Rulers, a DMI based on seven cantilever beams fixed at one end, which can be bent, vibrated, or plucked. The instrument already has two sensing versions: one based on IR sensors, the other on Hall sensors. We introduce strain gages, sensors widely used in industry for measuring loads and vibration, as a third option for the Rulers. Our goal was to compare the three sensor technologies according to their measurement function, linearity, resolution, sensitivity and hysteresis, and also according to real-time application indicators such as mechanical robustness, stage light sensitivity and temperature sensitivity. Results indicate that while strain gages offer a more robust, medium-sensitivity solution, the requirements for their use can be an obstacle for novice designers.

Keywords
DMI, sensor, strain gages

Paper topics
Interfaces for sound and music

Easychair keyphrases
hall sensor [34], strain gage [33], sensor output [8], digital musical instrument [7], fixed end [7], measurement range [6], power source voltage [6], sensor technology [6], conditioning circuit [5], free end [5], measurement function [5], mechanical robustness [5], output response [5], sensor type [5], last measurement point [4], magnetic field [4], thermal expansion coefficient [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849921
Zenodo URL: https://zenodo.org/record/849921


2011.32
Explaining musical expression as a mixture of basis functions
Grachten, Maarten   Johannes Kepler University Linz; Linz, Austria
Widmer, Gerhard   Johannes Kepler University Linz; Linz, Austria

Abstract
The quest to understand how pianists interpret notated music to turn it into a lively musical experience has led to numerous models of musical expression. One of the major dimensions of musical expression is loudness. Several models exist that explain loudness variations over the course of a performance in terms of, for example, phrase structure or musical accent. Often, however, especially in piano music from the Romantic period, performance directives are written explicitly in the score to guide performers. It is to be expected that such directives can explain a large part of the loudness variations. In this paper, we present a method to model the influence of notated loudness directives on loudness in piano performances, based on least-squares fitting of a set of basis functions. We demonstrate that the linear basis model approach is general enough to allow for incorporating arbitrary musical features. In particular, we show that by including notated pitch in addition to loudness directives, the model also accounts for loudness effects related to voice-leading.
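The core idea, expressing performed loudness as a linear mixture of basis functions fitted by least squares, can be sketched in a few lines. The toy basis functions below (a constant, a ramp towards a crescendo target, notated pitch) are illustrative assumptions, not the authors' basis set.

```python
import numpy as np

def fit_basis_model(basis, loudness):
    """Least-squares fit of loudness to a linear mixture of basis functions.

    basis:    (n_notes, n_basis) matrix; each column is one basis function
              evaluated at every note of the score
    loudness: (n_notes,) measured loudness per performed note
    Returns the weight vector and the model's predicted loudness.
    """
    weights, *_ = np.linalg.lstsq(basis, loudness, rcond=None)
    return weights, basis @ weights

# Toy usage with made-up basis functions for a 4-note fragment.
basis = np.column_stack([
    np.ones(4),                # constant offset
    np.linspace(0.0, 1.0, 4),  # ramp, e.g. towards a crescendo target
    [60, 64, 67, 72],          # notated MIDI pitch
])
loudness = np.array([0.40, 0.55, 0.65, 0.90])
w, predicted = fit_basis_model(basis, loudness)
```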

Keywords
basis functions, dynamics, expression, linear regression, music performance

Paper topics
Music performance analysis and rendering

Easychair keyphrases
basis function [49], dynamic marking [16], dyn pit [15], linear basis model [15], dynamic annotation [13], dyn pit dyn [12], pit dyn pit [12], dyn pit gr [11], dyn pit ir [11], musical expression [11], loudness variation [9], gr dyn pit [6], ir dyn pit [6], pit gr dyn [6], pit ir dyn [6], loudness variance [5], melody note [5], musical piece [5], music performance [5], predictive accuracy [5], weight vector [5], computational perception johanne kepler [4], grace note [4], onset time [4], piano performance [4], predicted loudness [4], third order polynomial pitch model [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849923
Zenodo URL: https://zenodo.org/record/849923


2011.33
Exploring the Design Space: Prototyping "The Throat V3" for the Elephant Man Opera
Elblaus, Ludvig   KTH Royal Institute of Technology; Stockholm, Sweden
Hansen, Kjetil Falkenberg   KTH Royal Institute of Technology; Stockholm, Sweden
Unander-Scharin, Carl   The University College of Opera; Stockholm, Sweden

Abstract
Developing new technology for artistic practice requires methods other than classical problem solving. Some of the challenges involved in the development of new musical instruments have affinities with the realm of wicked problems. Wicked problems are hard to define and have many different solutions that are good or bad (not true or false). The body of possible solutions to a wicked problem can be called a design space, and exploring that space must be the objective of a design process. In this paper we present effective methods of iterative design and participatory design that we have used in a project developed in collaboration between the Royal Institute of Technology (KTH) and the University College of Opera, both in Stockholm. The methods are outlined, and examples are given of how they have been applied in specific situations. The focus lies on prototyping and evaluation with user participation. By creating and acting out scenarios with the user, and thus asking questions through a prototype and receiving answers through practice and exploration, we removed the bottleneck represented by language and allowed communication beyond verbalizing. In doing so, even so-called tacit knowledge could be activated and brought into the development process.

Keywords
interactive systems, opera, Participatory design, prototyping, singing, supercollider

Paper topics
Digital audio effects, Interactive performance systems, Interfaces for sound and music

Easychair keyphrases
design space [16], development process [12], design process [9], participatory design [9], wicked problem [9], musical instrument [8], music computing [8], signal processing [8], padova italy [7], elephant man [6], iterative design [6], problem solving [6], user participation [5], black box [4], called tacit knowledge [4], interaction design [4], tacit knowledge [4], unander scharin [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849925
Zenodo URL: https://zenodo.org/record/849925


2011.34
Extraction of sound localization cue utilizing pitch cue for modelling auditory system
Okuno, Takatoshi   University of Ulster; Coleraine, United Kingdom
Mcginnity, Thomas M.   University of Ulster; Coleraine, United Kingdom
Maguire, Liam P.   University of Ulster; Coleraine, United Kingdom

Abstract
This paper presents a simple model for the extraction of a sound localization cue utilizing pitch cues in the auditory system. In particular, the extraction of the interaural time difference (ITD) as the azimuth localization cue, rather than the interaural intensity difference (IID), is constructed using a conventional signal processing scheme. The new configuration in this model is motivated by psychoacoustical and physiological findings, suggesting that the ITD can be controlled by the pitch cue in the simultaneous grouping of auditory cues. The localization cues are extracted at the superior olivary complex (SOC) while the pitch cue may be extracted at a higher stage of the auditory pathway. To explore this idea in the extraction of ITD, a system is introduced to feed back information on the pitch cue to control and/or modify the ITD for each frequency channel.
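A common generic way to estimate the ITD in a single frequency channel is to cross-correlate the left and right ear signals and take the lag of the maximum. The sketch below does exactly that; it is only a baseline illustration and does not include the pitch-cue feedback proposed in the paper.

```python
import numpy as np

def estimate_itd(left, right, fs, max_itd_s=0.001):
    """Estimate the interaural time difference of one frequency channel.

    left, right: equal-length arrays of band-limited ear signals
    fs:          sample rate in Hz
    max_itd_s:   physically plausible ITD range (about +/- 1 ms for a human head)
    Returns the ITD in seconds (positive means the right ear lags the left ear).
    """
    max_lag = int(max_itd_s * fs)
    lags = np.arange(-max_lag, max_lag + 1)
    # Cross-correlation restricted to plausible lags.
    corr = [np.dot(left[max(0, -l):len(left) - max(0, l)],
                   right[max(0, l):len(right) - max(0, -l)]) for l in lags]
    return lags[int(np.argmax(corr))] / fs
```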

Keywords
Auditory system, Pitch, Sound localization

Paper topics
3D sound/music, Models for sound analysis and synthesis, Perception and cognition of sound and music, Sound/music and the neurosciences, Sound/music signal processing algorithms

Easychair keyphrases
ear signal [17], estimated angle [16], better ear [13], directional white noise [12], pitch extraction [10], frequency channel [9], pitch extraction algorithm [9], right ear signal [9], sound localization cue [9], harmonic stream [8], auditory cue [7], gammatone filter bank [7], dummy head [6], sound localization [6], auditory system [5], binaural signal [5], decision process [5], female speech [5], localization cue [5], pitch cue [5], sound source [5], white noise [5], auditory pathway [4], auditory scene analysis [4], auto correlation function [4], frame size [4], frequency range [4], left ear signal [4], right ear [4], sound localization cue utilizing [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849927
Zenodo URL: https://zenodo.org/record/849927


2011.35
Foley sounds vs real sounds
Trento, Stefano   Conservatorio di musica “Cesare Pollini” di Padova; Padova, Italy
De Götzen, Amalia   Conservatorio di musica “Cesare Pollini” di Padova; Padova, Italy

Abstract
This paper is an initial attempt to study the world of sound effects for motion pictures, also known as Foley sounds. Through several audio and audio-video tests we have compared Foley and real sounds originating from an identical action. The main purpose was to evaluate whether sound effects are always better than real sounds [1]. Once this aspect is clarified, the next step will be to understand how Foley effects exaggerate important acoustic features. These are the basis for creating a database of expressive sounds, such as audio caricatures, to be used in different applications of sound design such as advertisements or soundtracks for movies.

Keywords
foley sounds, perception, sound effects

Paper topics
Perception and cognition of sound and music

Easychair keyphrases
real sound [26], foley sound [23], audio video test [20], foley effect [14], audio test [13], foley artist [8], foley stage [8], anchor sound [7], sliding door [6], sound effect [6], summertime grass [5], audio file [4], direct comparison [4], passionate kisses [4], screen action walking [4], twentieth centuryfox film corporation [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849929
Zenodo URL: https://zenodo.org/record/849929


2011.36
from snow [to space to movement] to sound
Kontogeorgakopoulos, Alexandros   Cardiff Metropolitan University (UWIC); Cardiff, United Kingdom
Kotsifa, Olivia   Cardiff Metropolitan University (UWIC); Cardiff, United Kingdom
Erichsen, Matthias   Cardiff Metropolitan University (UWIC); Cardiff, United Kingdom

Abstract
The current paper concerns a work-in-progress research and design project regarding a forthcoming mixed-media interactive performance which integrates space design, sound, visuals and snowboarding. The aim is to create a playful and even provocative experience for the user-performers and for the spectators of the final event by mixing and blending music, sound design, architecture, visual projections and freestyle snowboarding. It is a collaborative effort between a French freestyle snowpark development and snowboarding events company named H05, and three researchers and practitioners in computer music, architectural design and electronic engineering. Computer motion tracking techniques, a variety of spatial and body sensors, and sonic transformations of pre-composed material have been and are currently being explored for the realization of the musical part of the piece. The fundamental and key concept is to associate sound features and interactively composed sound objects with snowboarding full-body gestures. Architectural design plays a critical role in the project, since the composed space shapes the snowboarding movements, which in turn form the musical and visual elements of our work. The current paper describes our initial designs and the working prototypes used during a test period in the H05 snowparks in the Alps.

Keywords
Architecture, Interactive Composition, Mixed Media Performance, Music Interaction, Snowboarding

Paper topics
Interactive performance systems, Interfaces for sound and music, Multimodality in sound and music computing, Sonic interaction design

Easychair keyphrases
interactive performance [6], interactive dance [5], motion tracking [5], computer vision algorithm [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849931
Zenodo URL: https://zenodo.org/record/849931


2011.37
Functional Signal Processing with Pure and Faust using the LLVM Toolkit
Gräf, Albert   Johannes Gutenberg University Mainz; Mainz, Germany

Abstract
Pure and Faust are two functional programming languages useful for programming computer music and other multimedia applications. Faust is a domain-specific language specifically designed for synchronous signal processing, while Pure is a general-purpose language which aims to facilitate symbolic processing of complicated data structures in a variety of application areas. Pure is based on the LLVM compiler framework which supports both static and dynamic compilation and linking. This paper discusses a new LLVM bitcode interface between Faust and Pure which allows direct linkage of Pure code with Faust programs, as well as inlining of Faust code in Pure scripts. The interface makes it much easier to integrate signal processing components written in Faust with the symbolic processing and metaprogramming capabilities provided by the Pure language. It also opens new possibilities to leverage Pure and its JIT (just-in-time) compiler as an interactive frontend for Faust programming.

Keywords
Faust, functional programming, LLVM, Pure, signal processing

Paper topics
Computer environments for sound/music processing

Easychair keyphrases
faust module [17], extern extern extern [15], pure interpreter [9], signal processing [9], control variable [7], faust code [7], block size [5], faust compiler [5], faust program [5], pure code [5], bitcode module [4], computer music [4], data structure [4], pure script [4], put control gate [4], signal processing component [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849933
Zenodo URL: https://zenodo.org/record/849933


2011.38
Generating Musical Accompaniment through Functional Scaffolding
Hoover, Amy K.   University of Central Florida; Orlando, United States
Szerlip, Paul A.   University of Central Florida; Orlando, United States
Stanley, Kenneth O.   University of Central Florida; Orlando, United States

Abstract
A popular approach to music generation in recent years is to extract rules and statistical relationships by analyzing a large corpus of musical data. The aim of this paper is to present an alternative to such data-intensive techniques. The main idea, called functional scaffolding for musical composition (FSMC), exploits a simple yet powerful property of multipart compositions: the patterns of notes and timing in different instrumental parts of the same song are functionally related. That is, in principle, one part can be expressed as a function of another. The utility of this insight is validated by an application that assists the user in exploring the space of possible accompaniments to preexisting parts through a process called interactive evolutionary computation. In effect, without the need for musical expertise, the user explores transforming functions that yield plausible accompaniments derived from preexisting parts. In fact, a survey of listeners shows that participants cannot distinguish songs with computer-generated parts from those that are entirely human composed. Thus this one simple mathematical relationship yields surprisingly convincing results even without any real musical knowledge programmed into the system. With future refinement, FSMC might lead to practical aids for novices aiming to fulfill incomplete visions.

Keywords
Accompaniment, Compositional Pattern Producing Networks (CPPNs), Computer-Generated Music, Functional Scaffolding for Musical Composition (FSMC)

Paper topics
Automatic music generation/accompaniment systems, Computational musicology

Easychair keyphrases
nancy whiskey [16], bad girl lament [14], functional relationship [12], steel guitar [12], bad girl lament accompaniment [8], neat drummer [8], functional scaffolding [7], musical composition [7], chief dougla daughter [6], interactive evolutionary computation [6], kilgary mountain [6], listener study [6], music generation [6], rhythm network [6], girl lament [5], musical knowledge [5], musical part [5], plausible accompaniment [5], called functional scaffolding [4], central florida orlando [4], compositional pattern producing network [4], computer generated accompaniment [4], hidden node [4], instrumental part [4], musical relationship [4], nancy whiskey accompaniment [4], nancy whiskey rhythm [4], transforming function [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849935
Zenodo URL: https://zenodo.org/record/849935


2011.39
Gestural Control of Real-time Speech Synthesis in Luna Park
Beller, Grégory   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France

Abstract
This paper presents the research and development realized for an artistic project called Luna Park. This work is connected, at various levels, to the paradigm of concatenative synthesis, both in its form and in the processes it employs. Thanks to a real-time programming environment, synthesis engines and prosodic transformations are manipulated, controlled and activated by gesture, via accelerometers built for the piece. This paper explains the sensors, the real-time audio engines and the mapping that connects these two parts. The world premiere of Luna Park takes place in Paris, in the projection space of IRCAM, on June 10th, 2011, during the AGORA festival.

Keywords
concatenative synthesis, Gesture, mapping, prosody, real time, TTS

Paper topics
Interactive performance systems, Models for sound analysis and synthesis, Sound/music signal processing algorithms

Easychair keyphrases
real time [34], audio engine [12], concatenative synthesis [12], speech synthesis [12], speech rate [10], luna park [8], gesture capture [6], hit energy estimation [6], real time speech synthesis [6], batch mode [5], absolute position [4], gestural control [4], prosodic transformation [4], real time programming environment [4], real time prosodic transformation [4], real time target [4], speech synthesizer [4], tactile ribbon [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849937
Zenodo URL: https://zenodo.org/record/849937


2011.40
Humanities, Art and Science in the Context of Interactive Sonic Systems - Some Considerations on a Cumbersome Relationship
Polotti, Pietro   Conservatorio di musica "Giuseppe Tartini" di Trieste; Padova, Italy

Abstract
The theme of this conference, “creativity rethinks science”, involves a radical epistemological challenge with respect to a classical view of science and is an extremely hot topic of speculation within the scientific community, at least where computer science is concerned. In this paper, we propose some considerations about the role that artistic research could play within science, where science is meant in the wide sense of knowledge, thus including the humanities as one of the partners together with the natural sciences. After a more general discussion focused mainly on the field of Information and Communication Technology (ICT), we restrict the scope to the case of sound art involving new technologies and sound design for Human Computer Interaction (HCI), namely Sonic Interaction Design (SID). In our discussion, the concept of design has particular relevance, since it provides a connection between fields traditionally far apart from one another, such as the natural sciences, art, engineering and the humanities. In the last part of the paper, we provide some examples of what we mean by doing artistic research guided by a design practice. We envisage this as one of the possible ways to make the dialogue between artistic research and scientific research more feasible at a methodological level.

Keywords
Acoustic paradigm, Artistic research, Interactive Arts, Interdisciplinarity, Rhetoric, Sonic Interaction Design

Paper topics
Auditory display and data sonification, Interactive performance systems, Sonic interaction design

Easychair keyphrases
human computer interaction [12], artistic research [10], interactive art [8], natural science [8], scientific research [7], computer science [6], human computer [6], human computer study [6], auditory display [5], public art [5], sound design [5], artistic practice [4], design methodology [4], epistemological revolution [4], interactive installation [4], next section [4], non verbal sound [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849939
Zenodo URL: https://zenodo.org/record/849939


2011.41
Improved Frequency Estimation in Sinusoidal Models through Iterative Linear Programming Schemes
Shiv, Vighnesh Leonardo   Catlin Gabel School; Portland, United States

Abstract
Sinusoidal modeling systems are commonly employed in sound and music processing for their ability to decompose a signal into its fundamental spectral information. Sinusoidal modeling is a two-phase process: sinusoidal parameters are estimated in each analysis frame in the first phase, and these parameters are chained into sinusoidal trajectories in the second phase. This paper focuses on the first phase. Current methods for estimating parameters rely heavily on the resolution of the Fourier transform and are thus hindered by the Heisenberg uncertainty principle. A novel approach is proposed that can super-resolve frequencies and attain more accurate estimates of sinusoidal parameters than current methods. The proposed algorithm formulates parameter estimation as a linear programming problem in which the L1 norm of the residual component of the sinusoidal decomposition is minimized. Sharing information from iteration to iteration and from frame to frame allows for efficient parameter estimation at high sampling rates.
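L1 residual minimisation can be phrased as a standard linear program by splitting the residual into two non-negative parts. The sketch below fits amplitudes of a fixed grid of cosine-phase atoms to one analysis frame with scipy's linprog; the candidate grid and cosine-only atoms are simplifying assumptions, far coarser than the iterative, phase-aware scheme described in the abstract.

```python
import numpy as np
from scipy.optimize import linprog

def l1_sinusoid_fit(frame, fs, candidate_freqs):
    """Fit cosine atoms at candidate frequencies to one analysis frame by
    minimising the L1 norm of the residual, posed as a linear program.

    frame:           (N,) samples of the analysis frame
    fs:              sample rate in Hz
    candidate_freqs: iterable of candidate frequencies in Hz
    Returns the fitted amplitude for each candidate frequency.
    """
    n = np.arange(len(frame))
    A = np.column_stack([np.cos(2 * np.pi * f * n / fs) for f in candidate_freqs])
    N, K = A.shape
    # Variables: K free amplitudes, then N positive and N negative residual parts.
    c = np.concatenate([np.zeros(K), np.ones(2 * N)])    # minimise sum of |residual|
    A_eq = np.hstack([A, np.eye(N), -np.eye(N)])          # A a + r_pos - r_neg = frame
    bounds = [(None, None)] * K + [(0, None)] * (2 * N)
    res = linprog(c, A_eq=A_eq, b_eq=frame, bounds=bounds, method="highs")
    return res.x[:K]
```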

Keywords
Linear programming, Parameter estimation, Sinusoidal modeling, Sound analysis/synthesis

Paper topics
Digital audio effects, Models for sound analysis and synthesis, Sound/music signal processing algorithms

Easychair keyphrases
analysis frame [32], linear program [20], linear programming [17], sinusoidal modeling [10], frequency bin [9], frequency estimation [8], simplex algorithm [8], exponential decay rate [7], hypothesis set [7], linear programming problem [7], parameter estimation [7], sinusoidal frequency [7], super resolve frequency [7], padding factor [6], signal processing [6], sinusoidal amplitude [6], sinusoidal parameter [6], residual component [5], sinusoidal decomposition [5], average absolute error [4], fourier analysis based system [4], fourier based method [4], frequency estimation error [4], linear least square system [4], lth analysis frame [4], optimal basis [4], short time fourier transform [4], sinusoidal modeling system [4], sinusoidal parameter estimation [4], time varying amplitude [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849942
Zenodo URL: https://zenodo.org/record/849942


2011.42
IMPROVING PERFORMERS’ MUSICALITY THROUGH LIVE INTERACTION WITH HAPTIC FEEDBACK: A CASE STUDY
Michailidis, Tychonas   Birmingham Conservatoire; Birmingham, United Kingdom
Bullock, Jamie   Birmingham Conservatoire; Birmingham, United Kingdom

Abstract
Physical interaction with instruments allows performers to express and realise music based on the nature of the instrument. Through instrumental practice, the performer is able to learn and internalise sensory responses inherent in the mechanical production of sound. However, current electronic musical input devices and interfaces lack the ability to provide a satisfactory haptic feedback to the performer. The lack of feedback information from electronic controllers to the performer introduces aesthetic and practical problems in performances and compositions of live electronic music. In this paper, we present an initial study examining the perception and understanding of artificial haptic feedback in live electronic performances. Two groups of trumpet players participated during the study, in which short musical examples were performed with and without artificial haptic feedback. The results suggest the effectiveness and possible exploitable approaches of haptic feedback, as well as the performers’ ease of recalibrating and adapting to new haptic feedback associations. In addition to the methods utilised, technical practicalities and aesthetic issues are discussed.

Keywords
controllers, haptics, live electronics

Paper topics
Interactive performance systems, Perception and cognition of sound and music

Easychair keyphrases
haptic feedback [38], live electronic [14], vibrating motor [8], electronic controller [6], pressure sensor [6], vibrating feedback [6], gestural control [4], instrumental performer [4], mapping strategy [4], musical input device [4], pressure sensor glove [4], trumpet player [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849946
Zenodo URL: https://zenodo.org/record/849946


2011.43
Improving tempo-sensitive and tempo-robust descriptors for rhythmic similarity
Holzapfel, André   Austrian Research Institute for Artificial Intelligence (OFAI); Vienna, Austria
Flexer, Arthur   Austrian Research Institute for Artificial Intelligence (OFAI); Vienna, Austria
Widmer, Gerhard   Austrian Research Institute for Artificial Intelligence (OFAI); Vienna, Austria

Abstract
For describing the rhythmic content of music signals, features that are invariant in the presence of tempo changes are usually preferred. In this paper it is shown that the importance of tempo depends on the musical context. For popular music, a tempo-sensitive feature is optimized on multiple datasets using analysis of variance, and it is shown that a tempo-robust description also profits from integration into the resulting processing framework. Important insights are given into optimal parameters for rhythm description, and the limitations of current approaches are indicated.

Keywords
music information retrieval, music similarity, rhythm

Paper topics
Content processing of music audio signals, Music information retrieval, Perception and cognition of sound and music

Easychair keyphrases
tempo change [13], pulse tempo [11], mean accuracy [10], multi band processing [7], periodicity spectral magnitude [7], rhythmic similarity [7], turkish art music [7], factor interaction [6], similarity measure [6], data set [5], onset pattern [5], tempo information [5], window length [5], band processing scheme [4], classification accuracy [4], dashed line [4], nearest neighbor classification [4], plotted mean accuracy [4], processing parameter anova [4], scale transform [4], significantly different mean [4], system parameter [4], tukey hsd adjustment [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849948
Zenodo URL: https://zenodo.org/record/849948


2011.44
Improving the Efficiency of Open Sound Control with Compressed Address Strings
Kleimola, Jari   School of Electrical Engineering, Department of Signal Processing and Acoustics, Aalto University; Espoo, Finland
Mcglynn, Patrick J.   Sound and Digital Music Technology Group, National University of Ireland Maynooth; Maynooth, Ireland

Abstract
This paper introduces a technique that improves the efficiency of the Open Sound Control (OSC) communication protocol. The improvement is achieved by decoupling the user interface and the transmission layers of the protocol, thereby reducing the size of the transmitted data while simultaneously simplifying the receiving end parsing algorithm. The proposed method is fully compatible with the current OSC v1.1 specification. Three widely used OSC toolkits are modified so that existing applications are able to benefit from the improvement with minimal reimplementation efforts, and the practical applicability of the method is demonstrated using a multitouch-controlled audiovisual application. It was found that the required adjustments for the existing OSC toolkits and applications are minor, and that the intuitiveness of the OSC user interface layer is retained while communicating in a more efficient manner.
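The decoupling described above, sending a short integer token instead of the full OSC address string once both ends share a dictionary, can be illustrated with a tiny mapping class. This is only a conceptual sketch; the actual token format and setup handshake of the proposed method are not reproduced here.

```python
class AddressCompressor:
    """Toy shared dictionary mapping OSC address strings to integer tokens.

    In a real deployment the sender and receiver would agree on the mapping
    during a setup phase; here both directions live in one object.
    """
    def __init__(self):
        self._to_token = {}
        self._to_address = {}

    def compress(self, address: str) -> int:
        """Return the token for an address, registering it on first use."""
        if address not in self._to_token:
            token = len(self._to_token)
            self._to_token[address] = token
            self._to_address[token] = address
        return self._to_token[address]

    def expand(self, token: int) -> str:
        """Recover the full address string on the receiving end."""
        return self._to_address[token]

# Example: the full address travels once, after which only its token is sent.
c = AddressCompressor()
t = c.compress("/synth/1/filter/cutoff")
assert c.expand(t) == "/synth/1/filter/cutoff"
```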

Keywords
gestural controllers, interaction, networking, OSC

Paper topics
Computer environments for sound/music processing, Interactive performance systems, Interfaces for sound and music

Easychair keyphrases
address string [22], integer token [17], open sound control [11], osc message [10], receiving end [8], address space [6], received osc message [6], standard osc [6], standard osc implementation [6], address part [5], address pattern [5], compressed message [5], data vector [5], osc toolkit [5], setup phase [5], user interface [5], end parsing algorithm [4], end point [4], established ip based technique [4], major osc specification update [4], mapping mechanism [4], next major osc specification [4], parameter update [4], practical applicability [4], receiving end parsing [4], shared dictionary mapping mechanism [4], streamliner class [4], supplied address string [4], transmitted data [4], url style address [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849950
Zenodo URL: https://zenodo.org/record/849950


2011.45
Investigation of the relationships between audio features and induced emotions in Contemporary Western music
Trochidis, Konstantinos   Université de Bourgogne; France
Delbé, Charles   Université de Bourgogne; France
Bigand, Emmanuel   Université de Bourgogne; France

Abstract
This paper focuses on emotion recognition and understanding in Contemporary Western music. The study seeks to investigate the relationship between perceived emotion and musical features in this genre. Twenty-seven contemporary music excerpts are used as stimuli to gather responses from both musicians and non-musicians, which are then mapped onto an emotional plane in terms of arousal and valence dimensions. Audio signal analysis techniques are applied to the corpus and a base feature set is obtained. The feature set contains characteristics ranging from low-level spectral and temporal acoustic features to high-level contextual features. The feature extraction process is discussed with particular emphasis on the interaction between acoustical and structural parameters. Statistical relations between audio features and emotional ratings from psychological experiments are systematically investigated. Finally, a linear model is created using the best features and the mean ratings, and its prediction efficiency is evaluated and discussed.

Keywords
audio features, emotion processing, multiple linear regression, music emotion recognition

Paper topics
Computational musicology, Music information retrieval, Perception and cognition of sound and music

Easychair keyphrases
contemporary western music [11], non musician [10], emotional response [7], acoustic feature [6], high level contextual feature [6], low level acoustical feature [6], multiple linear regression analysis [6], music information retrieval [6], pulse clarity [6], high level [5], musical excerpt [5], contemporary art music [4], contemporary music excerpt [4], low level feature [4], low level spectral [4], regression model [4], tonal centroid [4], universite de bourgogne [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849952
Zenodo URL: https://zenodo.org/record/849952


2011.46
Isomorphic Tessellations for Musical Keyboards
Maupin, Steven   University of Regina; Regina, Canada
Gerhard, David   University of Regina; Regina, Canada
Park, Brett   University of Regina; Regina, Canada

Abstract
Many traditional and new musical instruments make use of an isomorphic note layout across a uniform planar tessellation. Recently, a number of hexagonal isomorphic keyboards have become available commercially. Each such keyboard or interface uses a single specific layout for notes, with specific justifications as to why this or that layout is better. This paper is an exploration of all possible note layouts on isomorphic tessellations. We begin with an investigation and proof of isomorphism in the two regular planar tessellations (square and hexagonal); we describe the history and current practice of isomorphic note layouts, from traditional stringed instruments to commercial hex keyboards and virtual keyboards available on tablet computers; and we investigate the complete space of such layouts, evaluating the existing popular layouts and proposing a set of new layouts optimized for specific musical tasks.
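An isomorphic layout simply means that the pitch of each key is an affine function of its grid coordinates, so every interval keeps the same shape wherever it is played. The helper below uses a square grid with steps of +2 semitones to the right and +5 upwards, loosely resembling a Wicki-Hayden arrangement; this is a simplified illustration, since the paper works with hexagonal coordinates and evaluates many more layouts.

```python
def isomorphic_pitch(col, row, base_midi=60, right_step=2, up_step=5):
    """MIDI pitch of the key at (col, row) in an isomorphic layout.

    Each step to the right adds `right_step` semitones and each step up adds
    `up_step` semitones, so any interval or chord keeps the same shape
    wherever it is played.  With steps (2, 5) the up-right diagonal is a
    perfect fifth.
    """
    return base_midi + col * right_step + row * up_step

# A major triad shape relative to its root, valid from any starting key:
root = isomorphic_pitch(0, 0)          # 60 (C4)
major_third = isomorphic_pitch(2, 0)   # 64 (E4)
perfect_fifth = isomorphic_pitch(1, 1) # 67 (G4)
```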

Keywords
isomorphic, jammer, keyboard layouts, wicki-hayden

Paper topics
Interactive performance systems, Interfaces for sound and music

Easychair keyphrases
semi tone [26], diatonic scale [25], perfect fifth [21], chromatic scale [19], dominant seventh chord [15], wicki hayden layout [12], dominant seventh [11], harmonic table [11], square tessellation [10], harmonic combination [9], major third [9], harmonic layout [8], inversed layout [8], musical instrument [8], vertical direction [8], horizontal direction [7], semi tone axis [7], stringed instrument [7], hexagonal layout [6], major scale [6], minor third interval [6], minor triad [6], root note [6], square layout [6], vertical interval [6], chord shape [5], dominant triad [5], minor third [5], perfect fourth [5], table layout [5]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849954
Zenodo URL: https://zenodo.org/record/849954


2011.47
Leech: BitTorrent and Music Piracy Sonification
Mckinney, Curtis   Bournemouth University; Bournemouth, United Kingdom
Renaud, Alain   Bournemouth University; Bournemouth, United Kingdom

Abstract
This paper provides an overview of a multimedia composition, Leech, which aurally and visually renders BitTorrent traffic. The nature and usage of BitTorrent networking is discussed, including the implications of widespread music piracy. The traditional usage of borrowed musical material as a compositional resource is discussed and expanded upon by including the actual procurement of the musical material as part of the performance of the piece. The technology and tools required to produce this work, and the roles that they serve, are presented. Eight distinct streams of data are targeted for visualization and sonification: torrent progress, download/upload rate, file name/size, number of peers, peer download progress, peer location, packet transfer detection, and the music being pirated. An overview of the methods used for sonifying and visualizing this data in an artistic manner is presented.

Keywords
BitTorrent, network music, piracy, sonification

Paper topics
Auditory display and data sonification, Automatic music generation/accompaniment systems, Interactive performance systems, Interfaces for sound and music

Easychair keyphrases
packet transfer [9], client torrent client [7], multi medium composition [6], torrent client torrent [6], download progress [5], music piracy [5], packet capture [5], real time [5], borrowed musical material [4], delay line [4], geographic location [4], mined data torrent progress [4], peer location [4], torrent download [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849956
Zenodo URL: https://zenodo.org/record/849956


2011.48
Limits of Control
Rutz, Hanns Holger   Interdisciplinary Centre for Computer Music Research (ICCMR), University of Plymouth; Plymouth, United Kingdom

Abstract
We are analysing the implications of music composition through programming, in particular the possibilities and limitations of tracing the compositional process through computer artefacts. The analysis is attached to the case study of a sound installation. This work was realised using a new programming system which is briefly introduced. Through these observations we are probing and adjusting a model of the composition process which draws ideas from systems theory, the experimental system of differential reproduction, and deconstructionism.

Keywords
Composition Process, Music Programming, Systems Thinking

Paper topics
Computer environments for sound/music processing, Interfaces for sound and music

Easychair keyphrases
computer music [9], composition process [8], programming language [8], language oriented system [7], sound file [7], sound process [5], computer music composition [4], differential reproduction [4], hann holger rutz [4], solution space [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849958
Zenodo URL: https://zenodo.org/record/849958


2011.49
Ljudskrapan/The Soundscraper: Sound exploration for children with complex needs, accommodating hearing aids and cochlear implants
Hansen, Kjetil Falkenberg   KTH Royal Institute of Technology; Stockholm, Sweden
Dravins, Christina   Riga Stradiņš University; Riga, Latvia
Bresin, Roberto   KTH Royal Institute of Technology; Stockholm, Sweden

Abstract
This paper describes a system for accommodating active listening for persons with hearing aids or cochlear implants, with a special focus on children with complex needs, for instance at an early stage of cognitive development and with additional physical disabilities. The system is called "Ljudskrapan" (or "the Soundscraper" in English) and consists of a software part in Pure data and a hardware part using an Arduino microcontroller with a combination of sensors. For both the software and hardware development, one of the most important aspects was to always ensure that the system was flexible enough to cater for the very different conditions that are characteristic of the intended user group. The Soundscraper has been tested with 25 children with good results. An increased attention span was reported, as well as surprising and positive reactions from children where the caregivers were unsure whether they could hear at all. The sound generating models, the sensors and the parameter mapping were simple, but provided a controllable and complex enough sound environment even with limited interaction.

Keywords
cochlear implants, cognitive impairments, interactive sound, new musical instruments, physical impairments, scratching, sound interaction

Paper topics
Interactive performance systems, Interfaces for sound and music, Perception and cognition of sound and music, Sonic interaction design

Easychair keyphrases
cochlear implant [14], active listening [6], hearing impairment [6], hearing aid [5], parameter mapping [5], pure data [5], sound model [5], early stage [4], electrode array [4], inner ear [4], musical instrument [4], music therapy [4], sensor data [4], sound manipulation [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849960
Zenodo URL: https://zenodo.org/record/849960


2011.50
Marco Stroppa's Compositional Process and Scientific Knowledge Between 1980-1991
Tiffon, Vincent   (UDL3-CEAC), Université de Lille-Nord de France; France
Sprenger-Ohana, Noémie   (UDL3-CEAC), Université de Lille-Nord de France; France

Abstract
The purpose of this paper is to show the creative relationship that can be established between scientific knowledge and musical innovation, through the example of Marco Stroppa’s work carried out between 1980 and 1991 in four specific places: the CSC in Padova, the conservatory of Venice, Ircam (Paris) and MIT (USA). The following methodological tools allow us to understand the links between Stroppa’s technico-scientific innovation and his musical invention: an analysis of his training years from 1980 to 1983 and of the main sources of his cognitive models; a genetic study of the work Traiettoria (1982-1988), that is, the systematic study of traces, sketches, drafts, computer jotters and other genetic documents; written work published by Stroppa between 1983 and 1991; multiple interviews with the composer and witnesses of the period; and a partial reconstitution under OpenMusic (OMChroma workspace) of the portion of synthesis initially performed under Music V. In fact, Traiettoria constitutes what can be labelled a laboratory of Marco Stroppa’s “workshop of composition”.

Keywords
Compositional process, CSC, Instrument and electronics, Marco Stroppa, Musicology, Music V, OMChroma

Paper topics
access and modelling of musical heritage, Computer environments for sound/music processing, Music information retrieval, Perception and cognition of sound and music, Technologies for the preservation

Easychair keyphrases
marco stroppa [19], sound synthesis [10], scientific knowledge [8], computer generated sound [6], computer music [6], noemie sprenger ohana [6], synthetic sound [6], computer jotter [5], contrast model [5], sound object [5], computer assisted composition [4], high level musical control [4], jean claude risset [4], miniature estrose [4], musical information organism [4], real time [4], similarity index [4], software program [4], sound family [4], stroppa work [4], structured programming [4], vincent tiffon [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849962
Zenodo URL: https://zenodo.org/record/849962


2011.51
Melody Harmonization in Evolutionary Music using Multiobjective Genetic Algorithms
Freitas, Alan   Universidade Federal de Ouro Preto (UFOP); Belo Horizonte, Brazil
Guimarães, Frederico Gadelha   Universidade Federal de Minas Gerais (UFMG); Belo Horizonte, Brazil

Abstract
This paper describes a multiobjective approach to melody harmonization in evolutionary music. There are numerous methods, and a myriad of possible results, for harmonizing a given melody. Some implicit rules can be extracted from musical theory, but some harmonic aspects can only be defined by the preferences of a composer. Thus, a multiobjective approach may be useful, allowing an evolutionary process to find a set of solutions that represent a trade-off between the rules expressed in different objective functions. In this paper, a multiobjective evolutionary algorithm defines chord changes with differing degrees of simplicity and dissonance. While presenting such an algorithm, we discuss how to embed musical cognizance in Genetic Algorithms at a meta-level. Experiments were carried out and the results compared with human judgment. The findings suggest that it is possible to devise a fitness function which reflects human intentions for harmonies.
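The multiobjective aspect boils down to keeping the harmonizations that are not dominated on any objective (say, a simplicity cost and a dissonance cost). A minimal Pareto filter is sketched below; the placeholder objective functions are assumptions, not the paper's fitness definitions.

```python
def pareto_front(candidates, objectives):
    """Return the candidates not dominated on all objectives (all minimised).

    candidates: list of harmonizations (any representation)
    objectives: list of functions, each mapping a candidate to a cost
    """
    scores = [tuple(f(c) for f in objectives) for c in candidates]
    front = []
    for i, si in enumerate(scores):
        dominated = any(
            all(sj[k] <= si[k] for k in range(len(si))) and sj != si
            for j, sj in enumerate(scores) if j != i
        )
        if not dominated:
            front.append(candidates[i])
    return front

# Placeholder objectives, e.g. number of chord changes and a dissonance count.
simplicity = lambda harmonization: harmonization["changes"]
dissonance = lambda harmonization: harmonization["dissonant_notes"]
```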

Keywords
Evolutionary Music, Harmonization, Multiobjective Optimization

Paper topics
Computer environments for sound/music processing, Interfaces for sound and music, Social interaction in sound and music computing

Easychair keyphrases
fitness function [17], genetic algorithm [12], dissonance function [9], pareto front [9], genetic operator [8], representation scheme [7], simplicity function [7], average hypervolume [6], column total [6], evaluation function [6], dissonant note [5], evolutionary music [5], invalid note [5], root note [5], computer music [4], dissonant chord [4], final result [4], fitness value [4], large vertical interval [4], mutation phase [4], pitch mutation [4], random solution [4], western music [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849964
Zenodo URL: https://zenodo.org/record/849964


2011.52
Multiple-Instrument Polyphonic Music Transcription using a Convolutive Probabilistic Model
Benetos, Emmanouil   Queen Mary University of London; London, United Kingdom
Dixon, Simon   Queen Mary University of London; London, United Kingdom

Abstract
In this paper, a method for automatic transcription of music signals using a convolutive probabilistic model is proposed. The model extends the shift-invariant Probabilistic Latent Component Analysis method. Several note templates from multiple orchestral instruments are extracted from monophonic recordings and are used for training the transcription system. By incorporating shift-invariance into the model along with the constant-Q transform as a time-frequency representation, tuning changes and frequency modulations such as vibrato can be supported by the system. For postprocessing, Hidden Markov Models trained on MIDI data are employed, in order to favour temporal continuity. The system was tested on classical and jazz recordings from the RWC database, on recordings from a Disklavier piano, and a woodwind quintet recording. The proposed method, which can also be used for pitch content visualization, is shown to outperform several state-of-the-art approaches, using a variety of error metrics.
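Shift-invariant PLCA itself is beyond a short sketch, but the general template-activation idea, explaining each spectrogram frame as a non-negative combination of instrument note templates, can be shown with plain multiplicative NMF updates. This is a deliberately simplified stand-in, not the convolutive probabilistic model of the paper.

```python
import numpy as np

def activations_from_templates(spectrogram, templates, n_iter=200, eps=1e-9):
    """Estimate note activations given fixed spectral templates (NMF-style).

    spectrogram: (n_bins, n_frames) non-negative magnitude spectrogram
    templates:   (n_bins, n_notes) non-negative note templates, one column per
                 instrument/pitch pair
    Returns an (n_notes, n_frames) activation matrix (a rough piano roll).
    """
    V, W = spectrogram, templates
    H = np.random.default_rng(0).random((W.shape[1], V.shape[1]))
    for _ in range(n_iter):
        # Multiplicative update minimising KL divergence, with W held fixed.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T.sum(axis=1, keepdims=True) + eps)
    return H
```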

Keywords
Hidden Markov models, Polyphonic music transcription, Probabilistic latent component analysis, Shift-invariance

Paper topics
Automatic separation, classification of sound and music, Models for sound analysis and synthesis, recognition

Easychair keyphrases
pitch template [10], signal processing [10], transcription matrix [8], transcription system [8], hidden markov model [7], polyphonic music transcription [7], spectral template [7], time pitch representation [7], instrument template [6], invariant probabilistic latent component analysis [6], multiple f0 estimation [6], music transcription [6], shift invariant [6], transcription experiment [6], frequency modulation [5], multiple instrument [5], woodwind quintet [5], constant q transform [4], f0 woodwind quintet [4], instrument source [4], midi scale [4], music information retrieval [4], piano recording [4], piano roll transcription [4], piano template [4], pitch content visualization [4], relative pitch tracking [4], shift invariant plca [4], transcription error metric [4], woodwind quintet recording [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849966
Zenodo URL: https://zenodo.org/record/849966


2011.53
On Computing Morphological Similarity of Audio Signals
Gasser, Martin   Austrian Research Institute for Artificial Intelligence (OFAI); Vienna, Austria
Flexer, Arthur   Austrian Research Institute for Artificial Intelligence (OFAI); Vienna, Austria
Grill, Thomas   Austrian Research Institute for Artificial Intelligence (OFAI); Vienna, Austria

Abstract
Most methods to compute content-based similarity between audio samples are based on descriptors representing only the spectral envelope or the texture of the audio signal. This paper describes an approach based on (i) the extraction of spectro-temporal profiles from audio and (ii) non-linear alignment of the profiles to calculate a distance measure.
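Non-linear alignment of two spectro-temporal profiles (for example spectral-centroid trajectories) is typically done with dynamic time warping; a compact DTW distance is sketched below. Using the raw feature values rather than their derivatives is a simplification of the derivative DTW variant mentioned in the keyphrases.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature trajectories,
    e.g. the spectral-centroid envelopes of two audio samples."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Two toy centroid trajectories: similar shape, different speed.
print(dtw_distance([1, 2, 3, 3, 2], [1, 1, 2, 3, 2]))
```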

Keywords
content-based, morphology, similarity

Paper topics
Automatic separation, classification of sound and music, Content processing of music audio signals, recognition, Sound/music signal processing algorithms

Easychair keyphrases
spectral centroid [11], time series [9], derivative dynamic time warping [6], spectral evolution [6], audio signal [5], distance matrix [5], magnitude spectra [5], acoustical society [4], audio sample [4], austrian research institute [4], calculate similarity [4], constant q magnitude spectra [4], cross correlation [4], cross correlation function [4], noisy signal [4], pitch envelope [4], reference profile [4], shift value [4], spectral evolution trajectory [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849968
Zenodo URL: https://zenodo.org/record/849968


2011.54
On the Creative use of Score Following and its Impact on Research
Cont, Arshia   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France

Abstract
Score following has been an active discipline of sound and music computing for almost 30 years and has haunted both algorithmic and computational development in realtime music information retrieval, as well as artistic applications in interactive computer music. This paper explores the creative use of such technologies and brings attention to new scientific paradigms that emerge from their artistic use. We show how the scientific and artistic goals of score following systems can differ, and how the latter continuously helps rethink the former. We focus mostly on the musical goals of score following technologies, which brings us to an underestimated field of research despite its obvious relevance in creative applications: synchronous reactive programming and its realization in Antescofo.

Keywords
Realtime Interactions, Score Following, Synchronous Programming

Paper topics
Automatic music generation/accompaniment systems, Computer environments for sound/music processing, Interactive performance systems

Easychair keyphrases
score following [52], computer music [27], score following system [17], electronic score [14], interactive music system [14], live performance [13], score follower [12], score following technology [12], electronic process [9], automatic accompaniment [8], virtual score [8], artistic goal [7], interactive computer music [7], interactive system [7], live electronic [7], live performer [7], musical goal [7], recognition system [7], synchronous language [7], electronic program [6], instrumental score [6], international computer music [6], music score [6], score following paradigm [6], programming environment [5], real time [5], research paradigm [5], computer music composition [4], contemporary music review [4], music information retrieval [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849970
Zenodo URL: https://zenodo.org/record/849970


2011.55
Parametric Trombone Synthesis by Coupling Dynamic Lip Valve and Instrument Models
Smyth, Tamara   Simon Fraser University; Vancouver, Canada
Scott, Frederick   Simon Fraser University; Vancouver, Canada

Abstract
In this work, a physics-based model of a trombone coupled to a lip reed is presented, with the parameter space explored for the purpose of real-time sound synthesis. A highly configurable dynamic lip valve model is reviewed and its parameters discussed within the context of a trombone model. The trombone model is represented as two separate parametric transfer functions, corresponding to tapping a waveguide model at both mouthpiece and bell positions, enabling coupling to the reed model as well as providing the instrument's produced sound. The trombone model comprises a number of waveguide filter elements---propagation loss, reflection at the mouthpiece, and reflection and transmission at the bell---which may be obtained through theory and measurement. As oscillation of a lip reed is strongly coupled to the bore, and playability strongly dependent on the bore and bell resonances, it is expected that a change in the parameters of one will require adapting the other. Synthesis results, emphasizing both interactivity and high-quality sound production are shown for the trombone in both extended and retracted positions, with several example configurations of the lip reed.

Keywords
acoustics, interaction, parametric, synthesis, trombone

Paper topics
Models for sound analysis and synthesis, Sound/music signal processing algorithms

Easychair keyphrases
dynamic lip reed model [8], trombone model [8], reed model [7], lip reed [6], transfer function [6], lip valve [5], parameter value [5], propagation loss [5], bore base [4], bore pressure [4], convolutional synthesis [4], digital audio effect [4], dynamic lip valve model [4], extended trombone [4], high quality sound production [4], impulse response [4], mouthpiece model [4], mouth pressure [4], overall driving force acting [4], pressure controlled valve [4], real time sound synthesis [4], retracted trombone [4], trombone instrument model [4], valve model [4], waveguide model [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849972
Zenodo URL: https://zenodo.org/record/849972


2011.56
PERSONALITY AND COMPUTER MUSIC
Garrido, Sandra   The University of New South Wales (UNSW); Sydney, Australia
Schubert, Emery   The University of New South Wales (UNSW); Sydney, Australia
Kreutz, Gunter   Carl von Ossietzky University Oldenburg; Oldenburg, Germany
Halpern, Andrea   Bucknell University; Lewisburg, United States

Abstract
There is some evidence that both music preferences and an attraction to computers and technology are related to personality. This paper argues that the specific measure of 'music-systemizing' may therefore be predictive of a preference for electronica, techno and computer-generated music. We report a preliminary study with 36 participants in which those who enjoy computer-music-based genres demonstrated a trend toward a higher mean score on the music-systemizing scale than those who enjoy love songs.

Keywords
computer-music, electronica, personality

Paper topics
Perception and cognition of sound and music, Social interaction in sound and music computing

Easychair keyphrases
computer music [33], love song [20], music preference [13], music systemizing [9], music empathizing [8], love song fan [7], love song group [7], computer generated music [6], computer music composer [6], cognitive style [5], dance music [5], effect size [5], individual difference [5], personality trait [5], computer music group [4], computer music lover [4], computer music style [4], favourite piece [4], least favourite [4], mean age [4], music cognitive style [4], music systemizing scale [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849974
Zenodo URL: https://zenodo.org/record/849974


2011.57
PHYSICAL MODELING MEETS MACHINE LEARNING: TEACHING BOW CONTROL TO A VIRTUAL VIOLINIST
Percival, Graham   Science and Music Research Group, University of Glasgow; Glasgow, United Kingdom
Bailey, Nicholas   Science and Music Research Group, University of Glasgow; Glasgow, United Kingdom
Tzanetakis, George   Department of Computer Science, University of Victoria; Victoria, Canada

Abstract
The control of musical instrument physical models is difficult; it takes many years for professional musicians to learn their craft. We perform intelligent control of a violin physical model by analyzing the audio output and adjusting the physical inputs to the system using trained Support Vector Machines (SVMs). Vivi, the virtual violinist, is a computer program which can perform music notation with the same skill as a beginning violin student. After only four hours of interactive training, Vivi can perform all of Suzuki violin volume 1 with quality comparable to that of a human student. Although physical constants are used to generate audio with the model, the control loop takes a "black-box" approach to the system. The controller generates the finger position, bow-bridge distance, bow velocity, and bow force without knowing those physical constants. This method can therefore be used with other bowed-string physical models and even musical robots.
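
A minimal sketch, on invented data, of the black-box idea described above: learn an inverse mapping from a measured audio feature back to a physical input and query it for a target value. A support-vector regressor stands in here for the paper's trained SVMs, and the loudness/bow-force relationship below is purely illustrative.

    import numpy as np
    from sklearn.svm import SVR

    # Hypothetical training data: an audio feature (e.g. loudness) measured
    # from the physical model's output at different bow forces.
    bow_force = np.linspace(0.1, 2.0, 50).reshape(-1, 1)
    measured_loudness = np.log(bow_force[:, 0]) + 0.05 * np.random.randn(50)

    # Learn the inverse mapping: desired loudness -> bow force.
    controller = SVR(kernel="rbf", C=10.0)
    controller.fit(measured_loudness.reshape(-1, 1), bow_force[:, 0])

    target_loudness = 0.3
    suggested_force = controller.predict([[target_loudness]])[0]
    print(f"bow force for target loudness {target_loudness}: {suggested_force:.2f}")

Because nothing about the model's physical constants enters the learned mapping, the same loop could in principle drive a different bowed-string model or a robot, which is the point the abstract makes.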

Keywords
control, machine learning, physical modeling, synthesis, violin

Paper topics
Models for sound analysis and synthesis

Easychair keyphrases
bow force [16], sheet music [14], physical model [13], violin physical model [11], bowed string [10], audio file [9], sound quality [9], virtual violinist [8], basic training [7], machine learning [7], audio analysis [6], audio signal [6], bow bridge distance [6], control loop [6], note data [6], physical parameter [6], string played mf [6], bow velocity [5], musical instrument [5], open string [5], percival music [5], physical action [5], string dynamic [5], violin bowing [5], violin sound [5], bow control [4], feedback control [4], fold cross validation [4], music notation [4], professional musician [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849976
Zenodo URL: https://zenodo.org/record/849976


2011.58
Prioritized Contig Combining to Segregate Voices in Polyphonic Music
Ishigaki, Asako   Keio University; Tokyo, Japan
Matsubara, Masaki   Keio University; Tokyo, Japan
Saito, Hiroaki   Keio University; Tokyo, Japan

Abstract
Polyphonic music is comprised of independent voices sounding synchronously. The task of voice segregation is to assign notes from a symbolic representation of a music score to monophonic voices. The human auditory sense can distinguish these voices; hence, many previous works utilize perceptual principles. Voice segregation can be applied to music information retrieval and automated transcription of polyphonic music. In this paper, we propose to modify the voice segregation algorithm of the contig mapping approach by Chew and Wu. This approach consists of three steps: segmentation, segregation, and combining. We present a modification of the combining step on the assumption that the accuracy of voice segregation depends on whether the segregation correctly identifies which voice is resting. Our algorithm prioritizes voice combining at segmentation boundaries with increasing voice counts. We tested our voice segregation algorithm on 78 pieces of polyphonic music by J.S. Bach. The results show that our algorithm attained an average voice consistency of 92.21%.

Keywords
contig mapping approach, stream separation, voice segregation, voice separation

Paper topics
Automatic separation, classification of sound and music, recognition

Easychair keyphrases
voice segregation [28], polyphonic music [20], music piece [13], contig mapping [12], voice segregation algorithm [12], voice count [11], increasing voice count [9], music information retrieval [9], pitch proximity [8], success ratio [8], voice contig [8], full experiment [7], maximal voice contig [7], voice music [7], adjacent contig [6], decreasing voice count [6], third voice [6], bach sinfonia [5], first note [5], perceptual principle [5], second beat [5], sky blue [5], voice separation [5], automatic music transcription [4], bach fugue [4], contig combining [4], ground truth [4], increasing voice [4], second voice [4], voice connection [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849978
Zenodo URL: https://zenodo.org/record/849978


2011.59
RaPScoM - A Framework for Rapid Prototyping of Semantically Enhanced Score Music
Rubisch, Julian   FH St. Pölten ForschungsGmbH, St. Pölten University of Applied Sciences; Sankt Pölten, Austria
Doppler, Jakob   FH St. Pölten ForschungsGmbH, St. Pölten University of Applied Sciences; Sankt Pölten, Austria
Raffaseder, Hannes   FH St. Pölten ForschungsGmbH, St. Pölten University of Applied Sciences; Sankt Pölten, Austria

Abstract
In film and video production, the selection or production of suitable music often turns out to be an expensive and time-consuming task. Directors or video producers frequently do not possess enough expert musical knowledge to express their musical ideas to a composer, which is why the usage of temp tracks is a widely accepted practice. To improve this situation, we aim at devising a generative music prototyping tool capable of supporting media producers by exposing a set of high-level parameters tailored to the vocabulary of films (such as mood descriptors, semantic parameters, film and music genre etc.). The tool is meant to semi-automate the process of producing and/or selecting temp tracks by using algorithmic composition strategies to either generate new musical material, or process exemplary material, such as audio or MIDI files. Eventually, the tool will be able to provide suitable raw material for composers to start their work. We will also publish parts of the prototype as an open source framework (the RaPScoM framework) to foster further development in this area.

Keywords
affect, algorithmic, composition, framework, generative, music, score, semantic, temp-track

Paper topics
Automatic music generation/accompaniment systems, Computer environments for sound/music processing

Easychair keyphrases
film music [12], temp track [8], score music [7], rapscom framework [6], applied science st [4], medium production university [4], musical segment [4], rapid prototyping [4], rough cut [4], semantically enhanced score music [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849980
Zenodo URL: https://zenodo.org/record/849980


2011.60
REAL-TIME UNSUPERVISED MUSIC STRUCTURAL SEGMENTATION USING DYNAMIC DESCRIPTORS
Pires, André S.   University of São Paulo (USP); São Paulo, Brazil
Queiroz, Marcelo   University of São Paulo (USP); São Paulo, Brazil

Abstract
This paper presents three approaches for music structural segmentation, i.e. intertwined music segmentation and labelling, using real-time techniques based solely on dynamic sound descriptors, without any training data. The first method is based on tracking peaks of a sequence obtained from a weighted off-diagonal section of a dissimilarity matrix, and uses Gaussian models for labelling sections. The second approach is a multi-pass method using Hidden Markov Models (HMM) with Gaussian Mixture Models (GMM) in each state. The third is a novel approach based on an adaptive HMM that dynamically identifies and labels sections, and also sporadically reevaluates the segmentation and labelling, allowing the redefinition of past sections based on recent and immediate past information. Finally, an evaluation method is presented that allows penalizing both incorrect section boundaries and an incorrect number of detected segments, if so desired. Computational results are presented and analysed from both quantitative and qualitative points of view.
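
A minimal sketch of the data structure behind the first approach: a frame-wise dissimilarity matrix over dynamic descriptors and an off-diagonal novelty sequence whose peaks suggest boundaries. Descriptor extraction, the weighting scheme and the Gaussian labelling stage are omitted, and all names and parameters are assumptions.

    import numpy as np

    def dissimilarity_matrix(features):
        """Euclidean dissimilarity between all pairs of descriptor frames.
        features: (n_frames, n_dims) array of dynamic descriptors."""
        diff = features[:, None, :] - features[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1))

    def off_diagonal_novelty(D, lag=20):
        """Average dissimilarity between each frame and the `lag` frames
        preceding it; local maxima hint at section boundaries."""
        n = D.shape[0]
        novelty = np.zeros(n)
        for i in range(lag, n):
            novelty[i] = D[i, i - lag:i].mean()
        return novelty

    def pick_peaks(novelty, threshold):
        """Indices where the novelty curve exceeds threshold and is a local maximum."""
        peaks = []
        for i in range(1, len(novelty) - 1):
            if novelty[i] > threshold and novelty[i] >= novelty[i - 1] and novelty[i] > novelty[i + 1]:
                peaks.append(i)
        return peaks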

Keywords
musical information retrieval, real-time sound processing, unsupervised segmentation

Paper topics
Automatic separation, classification of sound and music, Music information retrieval, recognition

Easychair keyphrases
dynamic descriptor [24], real time [18], potential state [12], music structural segmentation [9], temporal memory [9], dissimilarity matrix [8], exponential decay [8], normal distribution [7], training data [7], transition point [7], dissimilarity sequence [6], real time structural segmentation [6], similarity matrix [6], structural segmentation [6], bhattacharyya distance [5], clustering algorithm [5], observation sequence [5], viterbis algorithm [5], bayesian information criterion [4], dissimilarity matrix peak [4], euclidean norm [4], global timbre [4], hidden markov model [4], music information retrieval [4], real time unsupervised music [4], real time unsupervised technique [4], red dot [4], reference sequence [4], refinement stage [4], unsupervised music structural segmentation [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849982
Zenodo URL: https://zenodo.org/record/849982


2011.61
Rencon Workshop 2011 (SMC-Rencon): Performance Rendering Contest for Computer Systems
Hashida, Mitsuyo   Soai University; Osaka, Japan
Hirata, Keiji   Future University Hakodate; Hakodate, Japan
Katayose, Haruhiro   Kwansei Gakuin University; Osaka, Japan

Abstract
The Performance Rendering Contest (Rencon) is an annual international competition in which entrants present computer systems they have developed for generating expressive musical performances, which audience members and organizers judge. Recent advances in performance-rendering technology have brought with them the need for a means for researchers in this area to obtain feedback about the abilities of their systems in comparison to those of other researchers. The Rencon contest at SMC2011 (SMC-Rencon) will have two stages of evaluation. In the first stage, the musicality of generated performances and the technical quality of the systems will be evaluated by expert reviewers using a blind evaluation procedure. In the second stage, performances generated on site will be openly evaluated by the SMC audience and Internet viewers. The SMC-Rencon Award will be bestowed on the system that scores highest in the listening evaluation across Stages I and II.

Keywords
autonomous music systems, interactive music interfaces, music expression, music listening evaluation, performance rendering

Paper topics
Automatic music generation/accompaniment systems, Interactive performance systems, Interfaces for sound and music, Music performance analysis and rendering

Easychair keyphrases
performance rendering [23], set piece [13], technical quality [12], interactive section [10], performance rendering system [9], autonomous section [8], commercial music software [6], international computer music [6], musical performance [6], computer system [5], expert reviewer [5], expressive performance [5], performance expression [5], rendered performance [5], autonomous computer system [4], evaluation stage [4], expressive musical performance [4], human performance [4], internet viewer [4], kagurame phase [4], midi level note data [4], newly composed piano piece [4], rencon webpage [4], smc rencon award [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849984
Zenodo URL: https://zenodo.org/record/849984


2011.62
Robotic piano player making pianos talk
Ritsch, Winfried   Institute of Electronic Music and Acoustics (IEM), University of Music and Performing Arts (KUG); Graz, Austria

Abstract
The vision of a piano that can talk, a piano that produces understandable speech by playing notes with a robotic piano player, has been developed as an artwork over the last decade. After recorded ambient sound was successfully transcribed for piano and ensembles, the outcome of this mapping was applied by the composer Peter Ablinger in his artwork, which explores auditory perception in the tradition of the artistic phenomenalists. For this vision a robotic piano player was developed to play the result of mapping voice recordings, reconstructing the key features of the analyzed spectrum stream so that a voice can be imagined and roughly recognized. This paper is a report on the artistic research, mentioning different solutions; the resulting artworks are referenced.
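
The paper reports artistic research rather than a single published algorithm, but the underlying idea of reconstructing the key features of a voice spectrum with piano notes can be illustrated roughly as follows. This is a coarse sketch under assumed parameters, not the transcription actually used for Ablinger's works.

    import numpy as np

    def frame_to_piano_notes(frame, sr, n_notes=8):
        """Pick the strongest spectral peaks of one analysis frame and quantize
        them to the nearest piano keys (MIDI 21..108)."""
        spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
        freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
        order = np.argsort(spec[1:])[::-1] + 1   # strongest bins first, skip DC
        notes = []
        for idx in order:
            midi = int(round(69 + 12 * np.log2(freqs[idx] / 440.0)))
            if 21 <= midi <= 108 and midi not in notes:
                notes.append(midi)
            if len(notes) == n_notes:
                break
        return sorted(notes)

Running this frame by frame over a voice recording yields a stream of chords whose repetition rate and register roughly trace the speech spectrum, which is the effect the abstract describes.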

Keywords
cognition, player piano, robotics, speech, transcription

Paper topics
Automatic music generation/accompaniment systems, Automatic separation, classification of sound and music, Computational musicology, Music and robotics, Music information retrieval, Music performance analysis and rendering, Perception and cognition of sound and music, recognition

Easychair keyphrases
peter ablinger [12], robotic piano player [9], piano player [8], player piano [7], robot piano player [7], key feature [5], real time [5], art work [4], pure data [4], repetition rate [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849986
Zenodo URL: https://zenodo.org/record/849986


2011.63
Sonic Gestures as Input in Human-Computer Interaction: Towards a Systematic Approach
Jylhä, Antti   School of Electrical Engineering, Aalto University; Espoo, Finland

Abstract
While the majority of studies in sonic interaction design (SID) focuses on sound as the output modality of an interactive system, the broad scope of SID also includes the use of sound as an input modality. Sonic gestures can be defined as sound-producing actions generated by a human in order to convey information. Their use as input in computational systems has been studied in several isolated contexts; however, a systematic approach to their utility is lacking. In this study, the focus is on general sonic gestures, rather than exclusively focusing on musical ones. Exemplary interactive systems applying sonic gestures are reviewed, and based on previous studies on gesture, the first steps towards a systematic framework of sonic gestures are presented. Here, sonic gestures are studied from the perspectives of typology, morphology, interaction affordances, and mapping. The informational richness of the acoustic properties of sonic gestures is highlighted.

Keywords
audio input, gesture studies, multimodal interaction, sonic interaction design

Paper topics
Interfaces for sound and music, Multimodality in sound and music computing, Sonic interaction design

Easychair keyphrases
sonic gesture [77], sound producing action [11], hand clap [10], finger snap [8], gesture type [7], sonic interaction [7], sustained gesture [7], temporal deviation [7], computational system [6], human computer interaction [6], instrumental sonic gesture [6], interactive system [6], iterative gesture [6], sound producing [6], discrete command [5], impulsive gesture [5], signal processing [5], basic gesture [4], continuous parameter [4], continuous sonic interaction [4], convey information [4], digital audio effect [4], everyday sound [4], hand configuration [4], iterative sustained [4], sonic interaction design [4], temporal form [4], vocal joystick [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849990
Zenodo URL: https://zenodo.org/record/849990


2011.64
SONIK SPRING
Henriques, Tomas   State University of New York (SUNY), Buffalo State College; Buffalo, United States

Abstract
This paper presents a new digital musical instrument that focuses on the issue of feedback in instrument design as a key condition for achieving a performance tool that is both highly responsive and highly expressive. The Sonik Spring emphasizes the relationship between kinesthetic feedback and sound production, while linking visual and gestural motion to the auditory experience and musical outcome. The interface consists of a 15-inch coil that is held and controlled using both hands. The coil exhibits unique stiffness and flexibility characteristics that allow many degrees of variation of its shape and length. The design of the instrument is described and its unique features discussed. Three distinct performance modes are also detailed, highlighting the instrument's expressive potential and wide-ranging functionality.

Keywords
Digital Music Instrument, Gestural Control of Sound, Kinesthetic and visual feedback

Paper topics
Interfaces for sound and music, Models for sound analysis and synthesis, Sonic interaction design

Easychair keyphrases
sonik spring [25], left hand controller [9], force feedback [7], hand controller [7], push button [6], gestural motion [5], left hand [5], hand unit [4], kinesthetic feedback [4], musical expression [4], physical model [4], right hand controller [4], sensor data [4], spatial motion [4], spring mass system [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849992
Zenodo URL: https://zenodo.org/record/849992


2011.65
SoundScape: A music composition environment designed to facilitate collaborative creativity in the classroom
Truman, Sylvia   Regent's College; London, United Kingdom

Abstract
A question that has gained widespread interest is 'how can learning tasks be structured to encourage creative thinking in the classroom?' This paper adopts the stance of drawing upon theories of learning and creativity to encourage creative thinking in the classroom. A number of scholars have suggested that the processes of 'learning' and 'creativity' are inextricably linked. Extending upon this, a generative framework is presented which serves as a design support tool for planning creative learning experiences. A demonstration of how this framework can be applied is made through the design of SoundScape, a music composition program designed for school children. This paper reports upon a study using SoundScape within a school with 96 children aged 11. The study focused on two objectives: firstly, identifying any differences between explicitly supporting the creative process of 'preparation' and not doing so, and secondly, comparing the outcomes of using real-world metaphors to create music with those of using abstract visual representations to specify music.

Keywords
Creativity, Education, Music Composition

Paper topics
Computer environments for sound/music processing, Interfaces for sound and music, Social interaction in sound and music computing

Easychair keyphrases
specify music [24], creative process [19], composition object [16], visual metaphor [16], music composition [15], abstract representation [12], creative thinking [9], generative framework [9], non preparation condition [9], preparation condition [8], creative learning [7], design support tool [7], formal music training [7], composition task [6], creative idea [6], discussion point [6], explicitly supporting [6], music composition task [6], perceived confidence [6], research hypothesis [6], significant difference [6], visual metaphor condition [6], personal level [5], stage model [5], supporting preparation [5], theoretical background [5], available composition object [4], creative act [4], evaluation process [4], musical bar [4]

Paper type
Full paper

DOI: 10.5281/zenodo.849998
Zenodo URL: https://zenodo.org/record/849998


2011.66
SOUND SPATIALIZATION CONTROL BY MEANS OF ACOUSTIC SOURCE LOCALIZATION SYSTEM
Salvati, Daniele   Università di Udine; Udine, Italy
Canazza, Sergio   Università di Padova; Padova, Italy
Rodà, Antonio   Università di Udine; Udine, Italy

Abstract
This paper presents a system for controlling the sound spatialization of a live performance by means of the acoustic localization of the performer. Our proposal is to allow a performer to directly control the position of a sound played back through a spatialization system by moving the sound produced by his or her own musical instrument. The proposed system is able to locate and track the position of a sounding object (e.g., voice, instrument, sounding mobile device) in a two-dimensional space with accuracy, by means of a microphone array. We consider an approach based on Generalized Cross-Correlation (GCC) with Phase Transform (PHAT) weighting for the Time Difference Of Arrival (TDOA) estimation between the microphones. In addition, a Kalman filter is applied to smooth the time series of observed TDOAs, in order to obtain a more robust and accurate estimate of the position. To test the system in real-world conditions and to validate its usability, we developed a hardware/software prototype composed of an array of three microphones and a Max/MSP external object for the sound localization task. We obtained promising preliminary results with a human voice in a real, moderately reverberant and noisy environment, using a binaural spatialization system for headphone listening.
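
A minimal sketch of the GCC-PHAT step described above, estimating the TDOA between two microphone signals; the array geometry, the Kalman smoothing and the Max/MSP integration are left out, and the function name and parameters are assumptions.

    import numpy as np

    def gcc_phat_tdoa(sig, ref, sr, max_tau=None):
        """Estimate the time difference of arrival between two signals
        using Generalized Cross-Correlation with PHAT weighting."""
        n = len(sig) + len(ref)
        SIG = np.fft.rfft(sig, n=n)
        REF = np.fft.rfft(ref, n=n)
        cross = SIG * np.conj(REF)
        cross /= np.abs(cross) + 1e-12          # PHAT weighting: keep phase only
        cc = np.fft.irfft(cross, n=n)
        max_shift = n // 2
        if max_tau is not None:
            max_shift = min(int(sr * max_tau), max_shift)
        cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
        shift = np.argmax(np.abs(cc)) - max_shift
        return shift / float(sr)                # TDOA in seconds

Pairwise TDOAs from the three microphones can then be intersected geometrically to obtain a two-dimensional position, which is the quantity the Kalman filter smooths over time.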

Keywords
acoustic source localization system, musical interface, sound spazialization control

Paper topics
Interactive performance systems, Interfaces for sound and music

Easychair keyphrases
kalman filter [9], sound source [9], time delay [9], time delay estimation [9], human voice [8], sound spatialization [8], acoustic source localization [7], maximum peak detection [6], signal processing [6], microphone array [5], noisy environment [5], raw data [5], spatialization system [5], acoustical society [4], array signal processing [4], audio engineering society [4], live performance [4], microphone array signal [4], msp external object [4], musical instrument [4], sound localization [4], source position [4], tdoa estimation [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849994
Zenodo URL: https://zenodo.org/record/849994


2011.67
Sound Spheres: A Design Study of the Articulacy of a Non-contact Finger Tracking Virtual Musical Instrument
Hughes, Craig   The Open University; Milton Keynes, United Kingdom
Wermelinger, Michel   The Open University; Milton Keynes, United Kingdom
Holland, Simon   The Open University; Milton Keynes, United Kingdom

Abstract
A key challenge in the design of Virtual Musical instruments (VMIs) is finding expressive, playable, learnable mappings from gesture to sound that progressively reward practice by performers. Designing such mappings can be particularly demanding in the case of non-contact musical instruments, where physical cues can be scarce. Unaided intuition works well for many instrument designers, but others may find design and evaluation heuristics useful when creating new VMIs. In this paper we gather existing criteria from the literature to assemble a simple set of design and evaluation heuristics that we dub articulacy. This paper presents a design case study in which an expressive non-contact finger-tracking VMI, Sound Spheres, is designed and evaluated with the support of the articulacy heuristics. The case study explores the extent to which articulacy usefully informs the design of a non-contact VMI, and we reflect on the usefulness or otherwise of heuristic approaches in this context.

Keywords
articulacy, design heuristics, finger tracking, interaction design, light-weight methodology, virtual musical instruments

Paper topics
Interfaces for sound and music

Easychair keyphrases
sound sphere [47], tracking sphere [44], control parameter [23], sound sphere vmi [20], non contact vmi [19], design decision [16], visual feedback [13], non contact [11], finger tracking [10], musical instrument [10], evaluation heuristic [9], virtual musical instrument [9], user interface [7], vmi design [7], instrument control parameter [6], key design decision [6], musical expression [6], present case study [6], s erva tion [6], computer music [5], led array [5], pressure control [5], acute angle [4], design case study [4], design heuristic [4], finger tracking application [4], non contact finger tracking [4], open university [4], starting position [4], stru ctu [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.849996
Zenodo URL: https://zenodo.org/record/849996


2011.68
Spatio-temporal unfolding of sound sequences
Rocchesso, Davide   IUAV University of Venice; Venezia, Italy
Delle Monache, Stefano   IUAV University of Venice; Venezia, Italy

Abstract
Distributing short sequences of sounds in space as well as in time is important for many applications, including the signaling of hot spots. In a first experiment, we show that the accuracy in the localization of one such spot is not improved by the apparent motion induced by spatial sequencing. In a second experiment, we show that increasing the number of emission points does improve the smoothness of spatio-temporal trajectories, even for those rapidly-repeating pulses that may induce an auditory-saltation illusion. Other indications for auditory-display designers can also be drawn from the experiments.

Keywords
auditory display, auditory saltation, sound design, spatial hearing

Paper topics
Perception and cognition of sound and music, Sonic interaction design

Easychair keyphrases
impact position [20], auditory saltation [14], estimated position [14], emission point [10], position estimated position [9], impact sound [8], sound source [8], auditory saltation effect [6], inter onset interval [6], inter stimulus [6], standard deviation [6], acoustical society [5], cutaneous rabbit [5], ecological condition [5], piezo speaker [5], several subject [5], cardboard panel [4], double impact [4], minimum audible angle [4], quadriple impact [4], shorter ioi [4], sound event [4], sound motion [4], sound stimulus [4], stimulus interval [4], traversing sequence [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.850000
Zenodo URL: https://zenodo.org/record/850000


2011.69
Support for Learning Synthesiser Programming
Dykiert, Mateusz   University College London; London, United Kingdom
Gold, Nicolas E.   University College London; London, United Kingdom

Abstract
When learning an instrument, students often like to emulate the sound and style of their favourite performers. The learning process takes many years of study and practice. In the case of synthesisers, the vast parameter space involved can be daunting and unintuitive to the novice, making it hard to define the desired sound and difficult to understand how it was achieved. Previous research has produced methods for automatically determining an appropriate parameter set to produce a desired sound, but this can still require many parameters and does not explain or demonstrate the effect of particular parameters on the resulting sound. As a first step to solving this problem, this paper presents a new approach to searching the synthesiser parameter space to find a sound, reformulating it as a multi-objective optimisation problem (MOOP) in which two competing objectives (closeness of perceived sonic match and number of parameters) are considered. As a proof of concept, a Pareto-optimal search algorithm (NSGA-II) is applied to CSound patches of varying complexity to generate a Pareto front of non-dominated (i.e. "equally good") solutions. The results offer insight into the extent to which the size and nature of parameter sets can be reduced whilst still retaining an acceptable degree of perceived sonic match between target and candidate sounds.
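
A minimal sketch of the multi-objective view described above: each candidate parameter set is scored by a (perceived sonic distance, number of parameters) pair, and only non-dominated candidates are kept. The candidate values are invented for illustration, and NSGA-II itself is not reproduced here.

    def dominates(a, b):
        """True if candidate a is at least as good as b in every objective
        (smaller is better) and strictly better in at least one."""
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def pareto_front(candidates):
        """Keep only the non-dominated objective vectors."""
        return [c for c in candidates
                if not any(dominates(other, c) for other in candidates if other != c)]

    # Hypothetical candidates: (sonic distance to target, number of parameters used)
    candidates = [(0.12, 14), (0.30, 5), (0.12, 20), (0.45, 3), (0.08, 25)]
    print(pareto_front(candidates))   # [(0.12, 14), (0.30, 5), (0.45, 3), (0.08, 25)]

The front makes the trade-off explicit: a learner can pick a slightly worse sonic match in exchange for a patch with far fewer parameters to understand.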

Keywords
CSound, learning, multi-objective, parameters, synthesiser

Paper topics
Computational musicology

Easychair keyphrases
genetic algorithm [13], random search [10], sonic match [10], pareto front [8], pareto optimal search algorithm [8], synthesiser parameter [8], csound patch [7], fitness function [7], function evaluation [6], target sound [6], desired sound [5], parameter set [5], k function evaluation [4], multi objective optimisation [4], multi objective optimisation problem [4], objective optimisation problem [4], parameter value [4], search algorithm [4], university college london [4], yee king [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.850002
Zenodo URL: https://zenodo.org/record/850002


2011.70
The Closure-based Cueing Model: Cognitively-Inspired Learning and Generation of Musical Sequences
Maxwell, James B.   Simon Fraser University; Vancouver, Canada
Pasquier, Philippe   Simon Fraser University; Vancouver, Canada
Eigenfeldt, Arne   Simon Fraser University; Vancouver, Canada

Abstract
In this paper we outline the Closure-based Cueing Model (CbCM), an algorithm for learning hierarchical musical structure from symbolic inputs. Inspired by perceptual and cognitive notions of grouping, cueing, and chunking, the model represents the schematic and invariant properties of musical patterns, in addition to learning explicit musical representations. Because the learned structure encodes the formal relationships between hierarchically related musical segments, as well as the within-segment transitions, it can be used for the generation of new musical material following principles of recombinance. The model is applied to learning melodic sequences, and is shown to generalize perceptual contour and invariance. We outline a few methods for generation from the CbCM, and demonstrate a particular method for generating ranked lists of plausible continuations from a given musical context.

Keywords
cognition, generation, machine-learning, music, perception

Paper topics
Automatic music generation/accompaniment systems, Automatic separation, classification of sound and music, Music information retrieval, Perception and cognition of sound and music, recognition

Easychair keyphrases
higher level [22], source work [11], level node [9], music perception [9], identity state [7], schema state [7], novel production [6], invariance state [5], chunk boundary [4], formal cue [4], information theoretic property [4], musical structure [4], pitch sequence [4], simon fraser university [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.850004
Zenodo URL: https://zenodo.org/record/850004


2011.71
The EyeHarp: An Eye-Tracking-Based Musical Instrument
Vamvakousis, Zacharias   Pompeu Fabra University (UPF); Barcelona, Spain
Ramirez, Rafael   Pompeu Fabra University (UPF); Barcelona, Spain

Abstract
In this paper we present EyeHarp, a new musical instrument based on eye tracking. EyeHarp consists of a self-built low-cost eye-tracking device which communicates with an intuitive musical interface. The system allows performers and composers to produce music by controlling sound settings and musical events using eye movement. We describe the development of EyeHarp, in particular the construction of the eye-tracking device and the design and implementation of the musical interface.
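
The Easychair keyphrases mention a velocity-based fixation detection algorithm; below is a minimal sketch of that generic idea (a simple velocity-threshold classifier, not necessarily the EyeHarp's exact method). The threshold values and function name are assumptions.

    import numpy as np

    def detect_fixations(gaze_xy, sr, velocity_threshold=50.0, min_duration=0.1):
        """Classify gaze samples into fixations by thresholding point-to-point
        velocity (units per second); returns (start_index, end_index) pairs."""
        velocity = np.linalg.norm(np.diff(gaze_xy, axis=0), axis=1) * sr
        is_fixation = np.concatenate(([False], velocity < velocity_threshold))
        fixations, start = [], None
        for i, fix in enumerate(is_fixation):
            if fix and start is None:
                start = i
            elif not fix and start is not None:
                if (i - start) / sr >= min_duration:
                    fixations.append((start, i))
                start = None
        if start is not None and (len(is_fixation) - start) / sr >= min_duration:
            fixations.append((start, len(is_fixation)))
        return fixations

In an interface like the one described, each detected fixation on a screen region would then be mapped to a note or control event.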

Keywords
digital musical instrument, eye tracking, Interfaces for sound and music, real time performance

Paper topics
Interfaces for sound and music

Easychair keyphrases
eye tracking [28], fixation detection algorithm [19], eye tracking device [17], eye movement [15], eye tracking system [11], musical instrument [11], fixation detection [10], real time [8], eyewriter project [6], eyeharp melodic [5], response time [5], spatial distribution [5], temporal control [5], commercial eye tracking system [4], eyeharp interface [4], eyeharp layer [4], international computer music [4], low cost eye tracking [4], main circle [4], musical interface [4], music performance [4], raw data [4], real time melody [4], smoothing amount [4], velocity based fixation detection algorithm [4], video based head tracking [4]

Paper type
Full paper

DOI: 10.5281/zenodo.850006
Zenodo URL: https://zenodo.org/record/850006


2011.72
The plurality of melodic similarity
Marsden, Alan   Lancaster University; Lancaster, United Kingdom

Abstract
Melodic similarity is a much-researched topic. While there are some common paradigms and methods, there is no single emerging model. The different means by which melodic similarity has been studied are briefly surveyed, and contrasts are drawn between them which lead to important differences in light of the finding that similarity is dependent on context. Models of melodic similarity based on reduction are given particular scrutiny, and the existence of multiple possible reductions is proposed as a natural basis for a lack of triangle inequality. It is finally proposed that, in some situations at least, similarity is deliberately sought by maximising the similarity of interpretations. Thus melodic similarity is found to be plural on two counts (differing contexts and multiple interpretations) and furthermore to be an essentially creative concept. There are therefore grounds for turning research on melodic similarity on its head and using the concept as a means for studying reduction and for use in musically creative contexts.

Keywords
Analysis, Melodic similarity, Reduction

Paper topics
Computational musicology, Music information retrieval

Easychair keyphrases
melodic similarity [36], triangle inequality [12], music information retrieval [9], proceeding international [5], schenkerian analysis [5], similarity judgement [5], edit distance [4], levenshtein distance [4], maximum similarity [4], multiple interpretation [4], music analysis [4]

Paper type
Full paper

DOI: 10.5281/zenodo.850008
Zenodo URL: https://zenodo.org/record/850008


2011.73
The Vowel Worm: Real-time Mapping and Visualisation of Sung Vowels in Music
Frostel, Harald   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria
Arzt, Andreas   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria
Widmer, Gerhard   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria

Abstract
This paper presents an approach to predicting vowel quality in vocal music performances, based on common acoustic features (mainly MFCCs). Rather than performing classification, we use linear regression to project spoken or sung vowels into a continuous articulatory space: the IPA Vowel Chart. We introduce a real-time on-line visualisation tool, the Vowel Worm, which builds upon the resulting models and displays the evolution of sung vowels over time in an intuitive manner. The concepts presented in this work can be used for artistic purposes and music teaching.
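
A minimal sketch of the regression step described above: a least-squares linear map from MFCC frames to two continuous vowel-chart coordinates. The function names are illustrative, the training data are assumed to be given, and the actual system uses richer feature combinations than plain MFCCs.

    import numpy as np

    def fit_vowel_map(mfcc_train, chart_xy_train):
        """Least-squares linear map from MFCC vectors to 2-D vowel-chart positions.
        mfcc_train: (n_frames, n_coeffs); chart_xy_train: (n_frames, 2)."""
        X = np.hstack([mfcc_train, np.ones((len(mfcc_train), 1))])  # add bias term
        W, *_ = np.linalg.lstsq(X, chart_xy_train, rcond=None)
        return W

    def project_to_chart(mfcc_frames, W):
        """Project new MFCC frames into the (backness, height) plane."""
        X = np.hstack([mfcc_frames, np.ones((len(mfcc_frames), 1))])
        return X @ W

Projecting each incoming frame and drawing the resulting trajectory over the IPA Vowel Chart gives the kind of real-time "worm" visualisation the paper describes.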

Keywords
acoustic-articulatory mapping, IPA Vowel Chart, MFCC, multiple linear regression, real-time, singing, speech, visualisation, vowels, Vowel Worm

Paper topics
Automatic separation, classification of sound and music, Content processing of music audio signals, Models for sound analysis and synthesis, Music information retrieval, Music performance analysis and rendering, Perception and cognition of sound and music, recognition

Easychair keyphrases
ipa vowel chart [14], sung vowel [14], real time [13], vowel worm [13], vowel quality [12], fundamental frequency [9], regression model [7], vowel height [6], formant frequency [5], ipa vowel [5], spoken vowel [5], vowel chart [5], window size [5], feature combination [4], multiple linear regression [4], real time visualisation [4], vowel space [4]

Paper type
Full paper

DOI: 10.5281/zenodo.850010
Zenodo URL: https://zenodo.org/record/850010


2011.74
Towards a Generative Electronica: Human-Informed Machine Transcription and Analysis in MaxMSP
Eigenfeldt, Arne   Simon Fraser University; Vancouver, Canada
Pasquier, Philippe   Simon Fraser University; Vancouver, Canada

Abstract
We present the initial research into a generative electronica system based upon analysis of a corpus, describing the combination of expert human analysis and machine analysis that provides parameter data for generative algorithms. Algorithms in MaxMSP and Jitter for the transcription of beat patterns and section labels are presented, and compared with human analysis. Initial beat generation using a genetic algorithm utilizing a neural net trained on the machine analysis data is discussed, and compared with the use of a probabilistic model.
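
A minimal sketch of the beat-generation step described above: a toy genetic algorithm over 16-step onset patterns. The fitness function here is only a placeholder for the neural net trained on the machine-analysis data, and all constants (pattern length, population size, rates) are assumptions.

    import random

    STEPS = 16
    TARGET_DENSITY = 0.4     # placeholder stylistic statistic from the corpus

    def fitness(pattern):
        """Placeholder for the trained fitness model: reward patterns whose
        onset density is close to the corpus statistic."""
        density = sum(pattern) / STEPS
        return -abs(density - TARGET_DENSITY)

    def evolve(generations=50, pop_size=40, mutation_rate=0.05):
        pop = [[random.randint(0, 1) for _ in range(STEPS)] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=fitness, reverse=True)
            parents = pop[: pop_size // 2]
            children = []
            while len(children) < pop_size - len(parents):
                a, b = random.sample(parents, 2)
                cut = random.randrange(1, STEPS)
                child = a[:cut] + b[cut:]
                child = [1 - s if random.random() < mutation_rate else s for s in child]
                children.append(child)
            pop = parents + children
        return max(pop, key=fitness)

    print(evolve())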

Keywords
Electronica, Generative Music, MaxMSP, Music transcription

Paper topics
Automatic music generation/accompaniment systems, Automatic separation, classification of sound and music, Music information retrieval, recognition

Easychair keyphrases
beat pattern [20], machine analysis [11], human analysis [9], genetic algorithm [8], neural network [6], electronic dance music [4], fitness function [4], high frequency [4], mean value [4], notable artist [4], signal processing [4], successive measure [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.850012
Zenodo URL: https://zenodo.org/record/850012


2011.75
TOWARDS A PERSONALIZED TECHNICAL EAR TRAINING PROGRAM: AN INVESTIGATION OF THE EFFECT OF ADAPTIVE FEEDBACK
Kaniwa, Teruaki   University of Tsukuba; Tsukuba, Japan
Kim, Sungyoung   Yamaha Corporation; Hamamatsu, Japan
Terasawa, Hiroko   University of Tsukuba; Tsukuba, Japan
Ikeda, Masahiro   Yamaha Corporation; Hamamatsu, Japan
Yamada, Takeshi   University of Tsukuba; Tsukuba, Japan
Makino, Shoji   University of Tsukuba; Tsukuba, Japan

Abstract
Technical ear training aims to improve the listening skills of sound engineers so that they can skillfully modify and edit the structure of sound. To provide non-professionals, such as amateur sound engineers and students, with this technical ear training, we have developed a simple yet personalized ear training program. The most distinct feature of this system is that it adaptively controls the training task based on the trainee's previous performance. In detail, this system estimates a trainee's weaknesses and generates a training routine that provides drills focusing on those weaknesses, so that the trainee can effectively receive technical ear training without an instructor. We subsequently investigated the effect of the new training program with a one-month training experiment involving eight subjects. The results showed that the score of the group assigned to the proposed training system improved more than that of the group assigned to conventional training.
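
A minimal sketch of the adaptive principle described above (not the authors' exact scheme): keep a per-band error estimate and sample the next drill in proportion to it, so drills concentrate on the trainee's weaknesses. The band list, smoothing and class name are assumptions.

    import random

    BANDS_HZ = [125, 250, 500, 1000, 2000, 4000, 8000]   # assumed drill bands

    class AdaptiveTrainer:
        def __init__(self):
            # Start with one virtual miss and two virtual trials per band (smoothing).
            self.misses = {b: 1 for b in BANDS_HZ}
            self.trials = {b: 2 for b in BANDS_HZ}

        def record(self, band, correct):
            """Log the outcome of one drill centred on `band`."""
            self.trials[band] += 1
            if not correct:
                self.misses[band] += 1

        def next_band(self):
            """Sample the next drill band, weighted by estimated error rate."""
            weights = [self.misses[b] / self.trials[b] for b in BANDS_HZ]
            return random.choices(BANDS_HZ, weights=weights, k=1)[0]

    trainer = AdaptiveTrainer()
    trainer.record(2000, correct=False)
    print(trainer.next_band())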

Keywords
Recording, Technical ear training, Timbre

Paper topics
access and modelling of musical heritage, Perception and cognition of sound and music, Technologies for the preservation

Easychair keyphrases
correct answer rate [20], low score band [14], proposal group [14], conventional group [12], pink noise [12], ear training program [9], ear training [7], training program [7], training system [7], correct answer [6], pink noise pink noise [6], technical ear training [6], absolute identification [5], center frequency [5], technical listening [5], audio engineering society [4], average correct answer rate [4], equal variance [4], intelligent tutoring system [4], personalized technical ear training [4], question appearance [4], technical ear training program [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.850014
Zenodo URL: https://zenodo.org/record/850014


2011.76
Using Physical Models Is Necessary to Guarantee Stable Analog Haptic Feedback Control for Any User and Haptic Device
Berdahl, Edgar   ACROE, INPG and CCRMA, Stanford University ACROE, National Polytechnic Institute of Grenoble (INPG) Center for Computer Research in Music and Acoustics (CCRMA), Stanford University; Stanford, United States
Florens, Jean-Loup   Association pour la Création et la Recherche sur les Outils d’Expression (ACROE), Grenoble Institute of Technology (Grenoble INP); Grenoble, France
Cadoz, Claude   Association pour la Création et la Recherche sur les Outils d’Expression (ACROE), Grenoble Institute of Technology (Grenoble INP); Grenoble, France

Abstract
It might be easy to imagine that physical models only represent a small portion of the universe of appropriate force feedback controllers for haptic new media; however, we argue the contrary in this work, in which we apply creative physical model design to re-examine the science of feedback stability. For example, in an idealized analog haptic feedback control system, if the feedback corresponds to a passive physical model, then the haptic control system is guaranteed to be stable, as we prove. Furthermore, it is in fact necessary that the feedback corresponds to a passive physical model. Otherwise, there exists a passive "user/haptic device transfer function" that can drive the feedback control system unstable. To simplify the mathematics, we make several assumptions, which we discuss throughout the paper and reexamine in an appendix. The work implies that besides all of the known advantages of physical models, we can argue that we should employ only them for designing haptic force feedback. For example, even though granular synthesis has traditionally been implemented using signal modeling methods, we argue that physical modeling should nonetheless be employed when controlling granular synthesis with a haptic force-feedback device.
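
For background, the passivity property invoked above is commonly phrased via positive realness; the following is a standard textbook characterization for a linear time-invariant driving-point impedance H(s), stated here only as context rather than as the paper's own derivation:

    H(s)\ \text{positive real} \iff
    \begin{cases}
      H(s)\in\mathbb{R} & \text{for } s\in\mathbb{R},\\
      H(s)\ \text{analytic} & \text{for } \operatorname{Re}(s)>0,\\
      \operatorname{Re}\{H(s)\}\ge 0 & \text{for } \operatorname{Re}(s)>0,
    \end{cases}

and a linear time-invariant one-port is passive exactly when its driving-point impedance is positive real.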

Keywords
analog, ergotic, haptic, passivity, physical models, sound synthesis

Paper topics
Interactive performance systems, Interfaces for sound and music, Models for sound analysis and synthesis, Multimodality in sound and music computing, Music and robotics, Sonic interaction design, Sound and music for VR and games

Easychair keyphrases
physical model [46], haptic device [31], feedback control [22], control system [17], passive physical model [15], physical modeling [13], haptic feedback [10], haptic feedback control system [10], feedback control system [9], haptic feedback control [7], positive real [7], granular synthesis [6], haptic force feedback [6], international computer music [6], linear physical model [6], phase response [6], positive real function [6], recherche sur les outil [6], user hand [6], driving point [5], ergotic function [5], musical instrument [5], user device [5], device transfer function [4], driving point impedance [4], enactive system book [4], haptic control system [4], haptic feedback controller [4], m fext fext [4], strictly positive real [4]

Paper type
Full paper

DOI: 10.5281/zenodo.850016
Zenodo URL: https://zenodo.org/record/850016


2011.77
Version Detection for Historical Musical Automata
Niedermayer, Bernhard   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria
Widmer, Gerhard   Johannes Kepler University Linz; Linz, Austria
Reuter, Christoph   Department of Musicology, University of Vienna; Vienna, Austria

Abstract
Musical automata were very popular in European homes in the pre-phonograph era, but have attracted little attention in academic research. Motivated by a specific application need, this paper proposes a first approach to the automatic detection of versions of the same piece of music played by different automata. Due to the characteristics of the instruments as well as the themes played, this task deviates considerably from cover version detection in modern pop and rock music. We therefore introduce an enhanced audio matching and comparison algorithm with two novel features: (1) a new alignment cost measure, Off-Diagonal Cost, based on the Hough transform; and (2) a split-and-merge strategy that compensates for major structural differences between different versions. The system was evaluated on a test set comprising 89 recordings of historical musical automata. Results show that the new algorithm performs significantly better than a 'standard' matching approach without the above-mentioned new features, and that it may work well enough to be practically useful for the intended application.

Keywords
Dynamic Time Warping, Hough Transform, Mechanical Music Instrument, Musical Automata, Music Box, Version Detection

Paper topics
Automatic separation, classification of sound and music, Content processing of music audio signals, Music information retrieval, recognition

Easychair keyphrases
alignment path [13], hough transform [9], musical box [9], similarity measure [9], chroma vector [8], version detection [7], cover version detection [6], diagonal cost [6], dynamic time warping [6], musical automata [6], relative path cost [6], dissimilarity matrix [5], flute clock [5], alignment cost [4], audio matching [4], audio recording [4], audio signal [4], chroma feature [4], compact similarity measure [4], feature sequence [4], historical musical automata [4], image domain [4], main melody note [4], matching cost [4], music information retrieval [4], pitch class [4], version detection system [4]

Paper type
Full paper

DOI: 10.5281/zenodo.850018
Zenodo URL: https://zenodo.org/record/850018


2011.78
When sound teaches
Zanolla, Serena   Università di Udine; Udine, Italy
Canazza, Sergio   Università di Padova; Padova, Italy
Rodà, Antonio   Università di Udine; Udine, Italy
Romano, Filippo   Università di Udine; Udine, Italy
Scattolin, Francesco   Università di Udine; Udine, Italy
Foresti, Gian Luca   Università di Udine; Udine, Italy

Abstract
This paper presents the Stanza Logo-Motoria, a technologically augmented environment for learning and communication, which we have been experimenting with in a primary school since last year; this system offers an alternative and/or additional tool to traditional ways of teaching, which often do not adapt to individual learning abilities. The didactic use of interactive multimodal systems such as the Stanza Logo-Motoria does not replace the teacher; on the contrary, this kind of technology is a resource offering greater access to knowledge and to interaction with others and with the environment. This is made possible by inventing systems and activities that bring out the inherent value of using technology and of integrating it into learning processes. The aim of this paper is to document the activities carried out with Resonant Memory, the first application of the Stanza Logo-Motoria, and the experimental protocol we are implementing. In addition, we introduce a new application of the system, the Fiaba Magica, for strengthening gesture intentionality in children with motor-cognitive impairments.

Keywords
Augmented Environment for Teaching, Augmented Reality, Disability, Interactive and Multimodal Environment, Stanza Logo-Motoria

Paper topics
Content processing of music audio signals, Interactive performance systems, Multimodality in sound and music computing, Social interaction in sound and music computing, Sound/music and the neurosciences

Easychair keyphrases
stanza logo motoria [41], fiaba magica application [20], resonant memory application [14], resonant memory [9], fiaba magica [6], severe disability [6], validation protocol [5], augmented reality [4]

Paper type
Full paper

DOI: 10.5281/zenodo.850020
Zenodo URL: https://zenodo.org/record/850020


2011.79
Where do you want your ears? Comparing performance quality as a function of listening position in a virtual jazz band
Olmos, Adriana   McGill University; Montreal, Canada
Rushka, Paul   McGill University; Montreal, Canada
Ko, Doyuen   McGill University; Montreal, Canada
Foote, Gordon   McGill University; Montreal, Canada
Woszczyk, Wieslaw   McGill University; Montreal, Canada
Cooperstock, Jeremy R.   McGill University; Montreal, Canada

Abstract
This study explores the benefits of providing musicians with alternative audio rendering experiences while they perform with a virtual orchestra. Data collection methods included a field study with a large jazz band and a pilot study in which musicians rehearsed using a prototype that presented two different audio rendering perspectives: one from the musician's perspective, and a second from the audience perspective. The results showed that the choice of audio perspective makes a significant difference in some musicians' performance. Specifically, for some musicians, e.g., lead trumpet players, an acoustically natural mix resulted in improved performance; for others, e.g., drummers, it was easier to play along with the artificial "audience" perspective. These results motivate the inclusion of a music mixer capability in such a virtual rehearsal scenario.

Keywords
audio rendering experiences, Jazz band, music perception, virtual orchestra, virtual rehearsal

Paper topics
Interactive performance systems, Interfaces for sound and music, Music performance analysis and rendering, Perception and cognition of sound and music

Easychair keyphrases
audio image [8], audience mix [7], lead trumpet [7], trumpet player [7], audience perspective [6], audio rendering perspective [6], binaural recording [6], jazz band [6], mcgill jazz orchestra [6], music mcgill university [6], audio rendering [5], ensemble rehearsal [5], musician perspective [5], pilot study [5], schulich school [5], audio perspective [4], field study [4], intelligent machine mcgill university [4], mcgill university [4], musician performance [4], rehearsal session [4], sound source [4]

Paper type
Full paper

DOI: 10.5281/zenodo.850024
Zenodo URL: https://zenodo.org/record/850024

