Sixteen Years of Sound & Music Computing
A Look Into the History and Trends of the Conference and Community

D.A. Mauro, F. Avanzini, A. Baratè, L.A. Ludovico, S. Ntalampiras, S. Dimitrov, S. Serafin

Papers

Sound and Music Computing Conference 2015 (ed. 12)

Dates: from July 30 to August 1, 2015
Place: Maynooth, Ireland
Proceedings info: Proceedings of the 12th Int. Conference on Sound and Music Computing (SMC-15), Maynooth, Ireland, July 30, 31 & August 1, 2015, ISBN 978-0992746629


2015.1
A Computational Model of Tonality Cognition Based on Prime Factor Representation of Frequency Ratios and Its Application
Shiramatsu, Shun   Nagoya Institute of Technology; Nagoya, Japan
Ozono, Tadachika   Nagoya Institute of Technology; Nagoya, Japan
Shintani, Toramatsu   Nagoya Institute of Technology; Nagoya, Japan

Abstract
We present a computational model of tonality cognition derived from physical and cognitive principles concerning the frequency ratios of consonant intervals. The proposed model, which we call the Prime Factor-based Generalized Tonnetz (PFG Tonnetz), is based on the Prime Factor Representation of frequency ratios and can be regarded as a generalization of the Tonnetz. Our intended application of the PFG Tonnetz is a system for supporting spontaneous and improvisational participation of inexpert citizens in music performance for regional promotion. For this application, the system needs to determine pitches satisfying tonality constraints imposed by the surrounding polyphonic music, because inexpert users frequently lack music skills related to tonality. We also explore a working hypothesis on the robustness of the PFG Tonnetz against recognition errors on harmonic overtones in polyphonic audio signals. On the basis of this hypothesis, the PFG Tonnetz has good potential as a representation of the tonality constraints of surrounding polyphonic music.
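
As an illustration of the prime factor representation at the core of the model, the sketch below maps just-intonation frequency ratios to integer exponent vectors over a fixed prime basis; the function name, prime list and error handling are assumptions for illustration, not taken from the paper.

```python
from fractions import Fraction

def prime_exponents(ratio, primes=(2, 3, 5, 7)):
    """Express a frequency ratio as exponents over a fixed list of primes.

    Example: 3/2 (perfect fifth) -> (-1, 1, 0, 0), i.e. 2**-1 * 3**1.
    Raises ValueError if the ratio contains other prime factors.
    """
    frac = Fraction(ratio)
    num, den = frac.numerator, frac.denominator
    exps = []
    for p in primes:
        e = 0
        while num % p == 0:
            num //= p
            e += 1
        while den % p == 0:
            den //= p
            e -= 1
        exps.append(e)
    if num != 1 or den != 1:
        raise ValueError("ratio has prime factors outside %s" % (primes,))
    return tuple(exps)

# Just-intonation consonances as points on an integer grid:
for name, ratio in [("octave", Fraction(2, 1)),
                    ("perfect fifth", Fraction(3, 2)),
                    ("major third", Fraction(5, 4))]:
    print(name, prime_exponents(ratio))
```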

Keywords
ice-breaker activity, PFG Tonnetz, pitch contour, prime factor representation, tonality

Paper topics
Computational musicology and Mathematical Music Theory, Models for sound analysis and synthesis, Perception and cognition of sound and music, Social interaction in sound and music computing

Easychair keyphrases
frequency ratio [21], integer grid point [15], consonant interval [13], prime factor representation [12], pitch contour [11], harmonic overtone [9], limit pfg tonnetz [9], polyphonic audio signal [9], body motion [8], factor based generalized tonnetz [8], prime number [8], tonality constraint [8], cognitive principle [7], computational model [7], polyphonic music [7], recognition error [7], regional promotion [7], tonality cognition [7], grid point chord [6], limit just intonation [6], music performance [6], pitch frequency [6], tonality model [6], grid point [5], improvisational participation [5], inexpert user [5], minor chord [5], integer frequency ratio [4], pfg tonnetz space [4], pitch satisfying constraint [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851165
Zenodo URL: https://zenodo.org/record/851165


2015.2
Acoustically Guided Redirected Walking in a WFS System: Design of an Experiment to Identify Detection Thresholds
Nogalski, Malte   Hamburg University of Applied Sciences (HAW); Hamburg, Germany
Fohl, Wolfgang   Hamburg University of Applied Sciences (HAW); Hamburg, Germany

Abstract
Redirected Walking (RDW) has received increasing attention during the last decade. RDW techniques allow users to explore, by means of real walking, large-scale virtual environments (VEs) that are significantly larger than the required physical space. This is accomplished by applying discrepancies between the real and the virtual movements. This paper focuses on the development of an experiment to identify detection thresholds for an acoustic RDW system by means of a wave field synthesis (WFS) system. The implementation of an automated test procedure is described.
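
For context, a minimal sketch of how a curvature gain bends the physical walking path while the virtual path stays straight, assuming the common 1/radius convention for the gain; this illustrates the redirection principle only, not the paper's implementation.

```python
import math

def redirect_step(x, y, heading, step_len, curvature_gain):
    """Advance the walker's physical position by one step while injecting a
    curvature gain: the virtual path is kept straight, but the physical
    heading (radians) is bent by an angle proportional to the distance walked.

    curvature_gain is taken as 1/radius (in 1/m), one common convention;
    below the detection threshold the bend goes unnoticed.
    """
    heading += step_len * curvature_gain      # imperceptible heading offset
    x += step_len * math.cos(heading)
    y += step_len * math.sin(heading)
    return x, y, heading

# Walking 10 m in 0.5 m steps with gain 1/15 m^-1 curves the physical path
# by about 38 degrees while the virtual walk stays straight.
state = (0.0, 0.0, 0.0)
for _ in range(20):
    state = redirect_step(*state, step_len=0.5, curvature_gain=1 / 15)
print(state)
```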

Keywords
immersive virtual environments, real time tracking, redirected walking, virtual reality, wave field synthesis

Paper topics
Perception and cognition of sound and music, Sonic interaction design, Spatial audio

Easychair keyphrases
test subject [41], virtual sound source [30], rotation gain [20], curvature gain [19], virtual environment [18], sound source [15], redirected walking [13], tracking area [12], starting position [11], curvature gain test [9], detection threshold [9], wave field synthesis [9], rotation gain test [7], tracking system [7], virtual world [7], mowec source [6], optional component [6], physical wf area [6], real world [6], rotational distortion [6], time dependent gain [6], translation gain [6], virtual rotation [6], alarm clock [5], auditory cue [5], gain test [5], self motion [5], time dependent [5], tracking data [5], immersive virtual environment [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851053
Zenodo URL: https://zenodo.org/record/851053


2015.3
Addressing Tempo Estimation Octave Errors in Electronic Music by Incorporating Style Information Extracted from Wikipedia
Hörschläger, Florian   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria
Vogl, Richard   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria
Böck, Sebastian   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria
Knees, Peter   Department of Computational Perception, Johannes Kepler University Linz; Linz, Austria

Abstract
A frequently occurring problem of state-of-the-art tempo estimation algorithms is that the predicted tempo for a piece of music is a whole-number multiple or fraction of the tempo as perceived by humans (tempo octave errors). While often this is simply caused by shortcomings of the algorithms used, in certain cases this problem can be attributed to the fact that the actual number of beats per minute (BPM) within a piece is not a listener’s only criterion for considering it “fast” or “slow”. Indeed, it can be argued that the perceived style of music sets an expectation of tempo and therefore influences its perception. In this paper, we address the issue of tempo octave errors in the context of electronic music styles. We propose to incorporate stylistic information by means of probability density functions that represent tempo expectations for the individual music styles. In combination with a style classifier, these probability density functions are used to choose the most probable BPM estimate for a sample. Our evaluation shows a considerable improvement of tempo estimation accuracy on the test dataset.
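
A minimal sketch of the re-ranking idea: weight octave-related tempo candidates by style-conditional tempo densities and keep the most probable one. The style names, density parameters and octave factors below are invented placeholders; the paper derives its densities from style information extracted from Wikipedia.

```python
from scipy.stats import norm

# Hypothetical style-conditional tempo priors (BPM); placeholders only.
STYLE_TEMPO_PDF = {
    "drum and bass": norm(172, 6),
    "house":         norm(126, 5),
    "dubstep":       norm(140, 5),
}

def resolve_octave_error(raw_bpm, style_posteriors,
                         factors=(1/3, 0.5, 1.0, 2.0, 3.0)):
    """Re-rank octave-related tempo candidates by the style-weighted density
    and return the most probable one."""
    candidates = [raw_bpm * f for f in factors]

    def weighted_density(bpm):
        return sum(p * STYLE_TEMPO_PDF[style].pdf(bpm)
                   for style, p in style_posteriors.items())

    return max(candidates, key=weighted_density)

# A raw estimate of 86 BPM with a confident drum-and-bass classification
# is corrected to its 172 BPM octave.
print(resolve_octave_error(86, {"drum and bass": 0.9, "house": 0.1}))
```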

Keywords
information extraction, music information retrieval, octave errors, tempo estimation, wikipedia extraction

Paper topics
Multimodality in sound and music computing, Music information retrieval

Easychair keyphrases
tempo estimation [35], tempo octave error [14], probability density function [12], giantstep tempo dataset [11], music information retrieval [11], tempo estimation accuracy [11], tempo estimation algorithm [11], tempo range [11], block level feature [9], tempo annotation [9], tempo estimate [9], th international society [9], tempo relationship [8], music style [7], octave error [7], tempo information [7], wikipedia article [7], art tempo estimation [6], dance nu disco [6], electronic music [6], electronic music style [6], indie dance nu [6], tempo estimator [6], tempo induction algorithm [6], tempo ranker [6], feature vector [5], probability density [5], style estimation [5], house glitch hop [4], infobox music genre [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851139
Zenodo URL: https://zenodo.org/record/851139


2015.4
A DTW-Based Score Following Method for Score-Informed Sound Source Separation
Rodriguez-Serrano, Francisco Jose   Universidad de Jaén; Jaén, Spain
Menendez-Canal, Jonatan   Universidad de Oviedo; Oviedo, Spain
Vidal, Antonio   Universitat Politècnica de València; Valencia, Spain
Cañadas-Quesada, Francisco Jesús   Universidad de Jaén; Jaén, Spain
Cortina, Raquel   Universidad de Oviedo; Oviedo, Spain

Abstract
In this work, a new online DTW-based score alignment method is used within an online score-informed source separation system. The proposed alignment stage deals with the input signal and the score. It estimates the score position of each new audio frame in an online fashion by using only information from the beginning of the signal up to the present audio frame. Then, under the Non-negative Matrix Factorization (NMF) framework and using previously learned instrument models, the different instrument sources are separated. The instrument models are learned on training excerpts of the same kinds of instruments. Experiments are performed to evaluate the proposed system and its individual components. Results show that it outperforms a state-of-the-art comparison method.
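
A much simplified sketch of one online DTW step, updating a cumulative-cost column for each incoming audio frame; the local cost function, path constraints and windowing used in the paper are omitted here.

```python
import numpy as np

def online_dtw_step(prev_col, frame_cost):
    """Given the cumulative-cost column for the previous audio frame and the
    local cost of the new frame against every score position, return the new
    column and the estimated score position.

    frame_cost[j] is the distance between the incoming audio frame and score
    frame j (e.g. a spectral-template or chroma distance). For the very first
    audio frame, pass prev_col = np.zeros(len(frame_cost)).
    """
    n = len(frame_cost)
    col = np.empty(n)
    col[0] = prev_col[0] + frame_cost[0]
    for j in range(1, n):
        col[j] = frame_cost[j] + min(prev_col[j],      # audio advances only
                                     prev_col[j - 1],  # diagonal step
                                     col[j - 1])       # score advances only
    return col, int(np.argmin(col))
```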

Keywords
alignment, audio, DTW, music, score, source-separation

Paper topics
Models for sound analysis and synthesis, Sound/music signal processing algorithms

Easychair keyphrases
source separation [16], alignment method [13], instrument model [13], spectral pattern [13], spectral basis function [9], score alignment [8], dynamic time warping [6], excitation basis vector [6], non negative matrix factorization [6], signal processing [6], alignment stage [5], carabias orti [5], cost function [5], cost matrix [5], midi time [5], musical instrument [5], time series [5], latent variable analysis [4], low complexity signal decomposition [4], multi excitation model [4], multiplicative update rule [4], neural information processing system [4], nonnegative matrix factorization [4], offline version [4], online scoreinformed source separation [4], polyphonic audio [4], real time [4], signal model [4], sound source separation [4], trained instrument model [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851127
Zenodo URL: https://zenodo.org/record/851127


2015.5
A Loop Sequencer That Selects Music Loops Based on the Degree of Excitement
Kitahara, Tetsuro   Nihon University; Tokyo, Japan
Iijima, Kosuke   Nihon University; Tokyo, Japan
Okada, Misaki   Nihon University; Tokyo, Japan
Yamashita, Yuji   Nihon University; Tokyo, Japan
Tsuruoka, Ayaka   Nihon University; Tokyo, Japan

Abstract
In this paper, we propose a new loop sequencer that automatically selects music loops according to the degree of excitement specified by the user. A loop sequencer is expected to be a good tool for non-musicians to compose music because it does not require expert musical knowledge. However, it is not easy to select music loops appropriately because a loop sequencer usually has a huge loop collection (e.g., more than 3000 loops). It is therefore necessary to select music loops automatically based on simple and easy user input. In this paper, we focus on the degree of excitement. In typical techno music, the temporal evolution of excitement is an important feature. Our system allows the user to input the temporal evolution of excitement by drawing a curve, then selects music loops automatically according to the input excitement. Experimental results show that our system is easy to understand and generates satisfying musical pieces for non-experts.
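
A toy sketch of excitement-driven loop selection written as a Viterbi-style dynamic program (the paper's keywords mention a hidden Markov model); the mismatch cost and transition penalty below are assumptions.

```python
import numpy as np

def select_loops(user_excitement, loop_excitement, trans_penalty=0.2):
    """Pick one loop per bar so that the loops' excitement levels follow the
    user's drawn curve, via a Viterbi-style dynamic program.

    user_excitement: length-T sequence (one target value per bar, 0..1)
    loop_excitement: length-N sequence (one excitement value per loop)
    trans_penalty:   hypothetical cost for switching loops between bars
    """
    user = np.asarray(user_excitement, float)
    loops = np.asarray(loop_excitement, float)
    T, N = len(user), len(loops)
    emit = np.abs(user[:, None] - loops[None, :])   # mismatch cost per bar/loop
    cost = np.zeros((T, N))
    back = np.zeros((T, N), dtype=int)
    cost[0] = emit[0]
    for t in range(1, T):
        # switch[i, j]: cost of playing loop i at bar t-1 and loop j at bar t
        switch = cost[t - 1][:, None] + trans_penalty * (
            np.arange(N)[:, None] != np.arange(N)[None, :])
        back[t] = switch.argmin(axis=0)
        cost[t] = emit[t] + switch.min(axis=0)
    path = [int(cost[-1].argmin())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# A rising excitement curve picks the calm loop first, the intense loop last.
print(select_loops([0.1, 0.2, 0.8, 0.9], [0.1, 0.5, 0.9]))
```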

Keywords
Automatic music composition, Computer-aided music composition, Degree of excitement, Hidden Markov model, Loop sequencer

Paper topics
Interfaces for sound and music

Easychair keyphrases
music loop [37], loop sequencer [10], baseline system [8], musical piece [7], music composition [5], techno music [5], computer aided music composition [4], music loop according [4], temporal evolution [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851065
Zenodo URL: https://zenodo.org/record/851065


2015.6
A Music Performance Assistance System based on Vocal, Harmonic, and Percussive Source Separation and Content Visualization for Music Audio Signals
Dobashi, Ayaka   Kyoto University; Kyoto, Japan
Ikemiya, Yukara   Kyoto University; Kyoto, Japan
Itoyama, Katsutoshi   Kyoto University; Kyoto, Japan
Yoshii, Kazuyoshi   Kyoto University; Kyoto, Japan

Abstract
This paper presents a music performance assistance system that enables a user to sing, play a musical instrument producing harmonic sounds (e.g., guitar), or play drums while playing back a karaoke or minus-one version of an existing music audio signal from which the sounds of the user part (singing voices, harmonic instrument sounds, or drum sounds) have been removed. The beat times, chords, and vocal F0 contour of the original music signal are visualized and are automatically scrolled from right to left in synchronization with the music playback. To help a user practice singing effectively, the F0 contour of the user’s singing voice is estimated and visualized in real time. The core functions of the proposed system are vocal, harmonic, and percussive source separation and content visualization for music audio signals. To provide the first function, vocal-and-accompaniment source separation based on RPCA and harmonic-and-percussive source separation based on median filtering are performed in a cascading manner. To provide the second function, content annotations (estimated automatically and partially corrected by users) are collected from a Web service called Songle. Subjective experimental results showed the effectiveness of the proposed system.
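
A minimal sketch of the harmonic/percussive step of the cascade using median filtering of the magnitude spectrogram; the filter lengths are illustrative defaults, not values from the paper.

```python
import numpy as np
from scipy.ndimage import median_filter

def hpss_masks(mag_spec, harm_len=31, perc_len=31, eps=1e-10):
    """Median-filtering harmonic/percussive separation: filter the magnitude
    spectrogram along time for the harmonic estimate and along frequency for
    the percussive estimate, then form soft masks.

    mag_spec: 2-D array of shape (n_freq_bins, n_frames).
    """
    harm = median_filter(mag_spec, size=(1, harm_len))   # smooth across time
    perc = median_filter(mag_spec, size=(perc_len, 1))   # smooth across frequency
    harm_mask = harm / (harm + perc + eps)
    perc_mask = perc / (harm + perc + eps)
    return harm_mask, perc_mask

# In the cascade, these masks would be applied to the accompaniment
# spectrogram obtained from the RPCA vocal separation stage, then inverted.
```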

Keywords
Harmonic and percussive source separation, Music content visualization, Music performance assistance, Singing voice separation

Paper topics
Interfaces for sound and music, Sound/music signal processing algorithms

Easychair keyphrases
music audio signal [28], singing voice [19], source separation [19], percussive source separation [17], vocal f0 contour [14], active music listening [12], singing voice separation [12], beat time [11], real time [10], accompaniment sound [8], music content [8], robust principal component analysis [8], musical instrument [7], music performance assistance [7], service called songle [7], user singing voice [7], web service [7], instrument part [6], performance assistance system [6], chord progression [5], median filtering [5], percussive sound [5], accompaniment source separation [4], audio signal [4], automatic accompaniment [4], median filter [4], playback position [4], polyphonic music [4], vocal andaccompaniment source separation [4], vocal spectrogram [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851133
Zenodo URL: https://zenodo.org/record/851133


2015.7
Analysis and Resynthesis of the Handpan Sound
Alon, Eyal   University of York; York, United Kingdom
Murphy, Damian   University of York; York, United Kingdom

Abstract
Handpan is a term used to describe a group of struck metallic musical instruments, which are similar in shape and sound to the Hang (developed by PANArt in January 2000). The handpan is a hand-played instrument, which consists of two hemispherical steel shells that are fastened together along the circumference. The instrument usually contains a minimum of eight elliptical notes and is played by delivering rapid and gentle strikes to the note areas. This report details the design and implementation of an experimental procedure to record, analyse, and resynthesise the handpan sound. Four instruments from three different makers were used for the analysis, giving insight into common handpan sound features, and the origin of signature amplitude modulation characteristics of the handpan. Subjective listening tests were conducted aiming to estimate the minimum number of signature partials required to sufficiently resynthesise the handpan sound.
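
A rough additive-resynthesis sketch of a handpan-like note built from a few signature partials, each an exponentially decaying sinusoid with slow amplitude modulation; the parameter layout and the example values are assumptions for illustration, not measurements from the paper.

```python
import numpy as np

def resynthesize_partials(partials, dur=2.0, sr=44100):
    """Additive resynthesis from a list of signature partials. Each partial is
    (freq_hz, amplitude, t60_s, am_rate_hz, am_depth): an exponentially
    decaying sinusoid with slow amplitude modulation.
    """
    t = np.arange(int(dur * sr)) / sr
    out = np.zeros_like(t)
    for freq, amp, t60, am_rate, am_depth in partials:
        decay = np.exp(-6.91 * t / t60)                      # -60 dB after t60 s
        am = 1.0 + am_depth * np.sin(2 * np.pi * am_rate * t)
        out += amp * decay * am * np.sin(2 * np.pi * freq * t)
    return out / max(1e-9, np.max(np.abs(out)))              # normalize peak

# Hypothetical signature partials for one note area:
note = resynthesize_partials([(220.0, 1.0, 1.8, 2.5, 0.3),
                              (440.0, 0.6, 1.2, 3.0, 0.2),
                              (660.0, 0.4, 0.9, 2.0, 0.2)])
```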

Keywords
amplitude modulation, analysis, decay rates, handpan, hang, listening test, partials, resynthesis, signature, T60

Paper topics
Models for sound analysis and synthesis

Easychair keyphrases
note field [27], handpan sound [18], amplitude modulation [13], signature partial [12], resynthesised signal [7], signature amplitude modulation [7], amplitude modulation characteristic [6], decay rate [6], highest magnitude partial [6], listening test [6], magnetic absorbing pad [6], musical instrument [6], surrounding note field [6], audio signal [5], frequency value [5], note group [5], steel pan [5], amplitude modulated partial frequency [4], decay time [4], energy decay relief [4], estimated amplitude modulation rate [4], highest magnitude [4], mean pd60 decay time [4], median similarity rating [4], modulated partial frequency value [4], signature handpan sound [4], steady state [4], subjective listening test [4], undamped and damped configuration [4], undamped and damped measurement [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851153
Zenodo URL: https://zenodo.org/record/851153


2015.8
Analysis of Musical Textures Played on the Guitar by Means of Real-Time Extraction of Mid-Level Descriptors
Freire, Sérgio   Universidade Federal de Minas Gerais (UFMG); Belo Horizonte, Brazil
Cambraia, Pedro   Universidade Federal de Minas Gerais (UFMG); Belo Horizonte, Brazil

Abstract
The paper presents a set of mid-level descriptors for the analysis of musical textures played on the guitar, divided into six categories: global, guitar-specific, rhythm, pitch, amplitude and spectrum descriptors. The employed system is based on an acoustic nylon guitar with hexaphonic pick-ups, and was programmed in Max. An overview of the explored low-level audio descriptors is given in the first section. Mid-level descriptors, many of them based on a general affordance of the guitar, are the subject of the central section. Finally, some distinctive characteristics of six different textures -- two-voice writing, block chords, arpeggios, fast gestures with legato, slow melody with accompaniment, strummed chords -- are highlighted with the help of the implemented tools.

Keywords
descriptors of guitar performance, hexaphonic nylon guitar, interactive musical systems, mid-level descriptors

Paper topics
Content processing of music audio signals, Interactive performance systems, Music performance analysis and rendering

Easychair keyphrases
mid level descriptor [28], level descriptor [14], mid level [10], mid level descriptor value [8], string jump [8], block chord [7], fundamental frequency [7], superimposition index [7], mean value [6], real time [6], standard deviation [6], string index [6], left hand [5], open string [5], pitch class [5], spectrum descriptor [5], string centroid [5], acoustic guitar [4], implemented mid level descriptor [4], low level descriptor [4], non pitched event [4], prime form [4], prominent ioi [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851077
Zenodo URL: https://zenodo.org/record/851077


2015.9
Analyzing the influence of pitch quantization and note segmentation on singing voice alignment in the context of audio-based Query-by-Humming
Valero-Mas, Jose J.   Universidad de Alicante; Alicante, Spain
Salamon, Justin   New York University (NYU); New York, United States
Gómez, Emilia   Pompeu Fabra University (UPF); Barcelona, Spain

Abstract
Query-by-Humming (QBH) systems base their operation on aligning the melody sung/hummed by a user with a set of candidate melodies retrieved from music tunes. While MIDI-based QBH builds on the premise of existing annotated transcriptions for any candidate song, audio-based research makes use of melody extraction algorithms for the music tunes. In both cases, a melody abstraction process is required for solving issues commonly found in queries such as key transpositions or tempo deviations. Automatic music transcription is commonly used for this, but due to the reported limitations in state-of-the-art methods for real-world queries, other possibilities should be considered. In this work we explore three different melody representations, ranging from a general time-series one to more musical abstractions, which avoid the automatic transcription step, in the context of an audio-based QBH system. Results show that this abstraction process plays a key role in the overall accuracy of the system, obtaining the best scores when temporal segmentation is dynamically performed in terms of pitch change events in the melodic contour.
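
A minimal sketch of one melody abstraction of the kind discussed: semitone quantization of an f0 contour with segmentation at pitch-change events; the reference pitch and minimum-length threshold are illustrative, not the paper's settings.

```python
import numpy as np

def contour_to_notes(f0_hz, ref_hz=440.0, min_frames=5):
    """Convert a frame-wise f0 contour into a coarse note-like representation:
    quantize to semitones relative to A4 and start a new segment whenever the
    quantized pitch changes (segmentation driven by pitch-change events).

    Returns a list of (semitone, n_frames) pairs; min_frames drops very short
    segments that are likely transcription noise.
    """
    semitones = np.round(12 * np.log2(np.asarray(f0_hz) / ref_hz)).astype(int)
    notes, start = [], 0
    for i in range(1, len(semitones) + 1):
        if i == len(semitones) or semitones[i] != semitones[start]:
            if i - start >= min_frames:
                notes.append((int(semitones[start]), i - start))
            start = i
    return notes

# A short hummed query hovering around A4 then jumping up a whole tone:
print(contour_to_notes([440, 441, 439, 440, 440, 494, 493, 495, 494, 494]))
```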

Keywords
Audio-based Query-by-Humming, Melody encoding, Singing voice alignment

Paper topics
Music information retrieval

Easychair keyphrases
time series [18], music information retrieval [12], temporal segmentation [10], alignment algorithm [8], full automatic music transcription [8], melodic contour [8], subsequence dynamic time warping [8], abstraction process [6], hit rate [6], main f0 contour [6], melody estimation algorithm [6], music collection [6], pitch change event [6], candidate song [5], edit distance [5], fundamental frequency [5], semitone quantization [5], smith waterman [5], candidate melody [4], estimation algorithm melodia [4], frequency value [4], general time series [4], mean reciprocal rank [4], melody abstraction [4], melody abstraction process [4], melody extraction [4], pitch contour [4], polyphonic music signal [4], symbolic aggregate approximation [4], symbolic representation [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851123
Zenodo URL: https://zenodo.org/record/851123


2015.10
An Augmented Guitar with Active Acoustics
Lähdeoja, Otso   University of the Arts Helsinki; Helsinki, Finland

Abstract
The present article describes and discusses an acoustic guitar augmented with structure-borne sound drivers attached to its soundboard. The sound drivers make it possible to drive electronic sounds into the guitar, transforming the soundboard into a loudspeaker and building a second layer of sonic activity on the instrument. The article presents the system implementation and its associated design process, as well as a set of sonic augmentations. The sound esthetics of augmented acoustic instruments are discussed and compared to instruments comprising separate loudspeakers.

Keywords
Active acoustics, Augmented Instrument, Guitar, Live electronics, Sound processing, Structure-borne sound

Paper topics
Interfaces for sound and music, Multimodality in sound and music computing

Easychair keyphrases
sound driver [11], acoustic guitar [9], augmented instrument [9], structure borne sound driver [8], acoustic instrument [6], active control [5], electronic sound [5], frequency response [5], signal processing [5], acoustic sound [4], active acoustic [4], active acoustic guitar [4], attack timbre modification [4], computer music [4], design process [4], electric guitar [4], hexaphonic pickup [4], international computer [4], playing technique [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851049
Zenodo URL: https://zenodo.org/record/851049


2015.11
An Exploration of Mood Classification in the Million Songs Dataset
Corona, Humberto   University College Dublin (UCD); Dublin, Ireland
O'Mahony, Michael   University College Dublin (UCD); Dublin, Ireland

Abstract
As the music consumption paradigm moves towards streaming services, users have access to increasingly large catalogs of music. In this scenario, music classification plays an important role in music discovery. It enables, for example, search by genres or automatic playlist creation based on mood. In this work we study the classification of song mood, using features extracted from lyrics alone, based on a vector space model representation. Previous work in this area reached contradictory conclusions based on experiments carried out using different datasets and evaluation methodologies. In contrast, we use a large freely-available dataset to compare the performance of different term-weighting approaches from a classification perspective. The experiments we present show that lyrics can successfully be used to classify music mood, achieving accuracies of up to 70% in some cases. Moreover, contrary to other work, we show that the performance of the different term weighting approaches evaluated is not statistically different using the dataset considered. Finally, we discuss the limitations of the dataset used in this work, and the need for a new benchmark dataset to progress work in this area.
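
A minimal sketch of lyrics-based mood classification with one term-weighting scheme (tf-idf) in a vector space model; the toy lyrics, labels and classifier below are placeholders, and the paper compares several weighting schemes under a common protocol.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy lyric/mood pairs; in the paper the lyrics come from the Million Song
# Dataset's musiXmatch data and the labels from mood quadrants.
lyrics = ["shine bright happy day dance all night",
          "tears fall alone in the cold dark rain"]
moods = ["positive", "negative"]

# One term-weighting scheme (tf-idf) feeding a linear classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(lyrics, moods)
print(model.predict(["dark rain and tears"]))
```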

Keywords
Million Songs Dataset, Mood classification, Music classification, Music information retrieval, Sentiment classification, text mining

Paper topics
Music information retrieval, Perception and cognition of sound and music

Easychair keyphrases
term weighting [17], mood classification [15], mood quadrant [15], term weighting scheme [15], music information retrieval [12], music classification [11], music mood classification [11], document frequency [10], song dataset [9], term frequency [9], th international society [9], vector space model [9], classification performance [8], delta tf idf [7], distinct term [7], mood group [7], classification accuracy [6], mood tag [6], social tag [6], classification result [5], feature analysis [5], lyrical feature [5], mood category [5], musixmatch dataset [5], term distribution [5], accuracy tf idf [4], idf term weighting [4], lyric based classification [4], mood granularity [4], statistically significant difference [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851021
Zenodo URL: https://zenodo.org/record/851021


2015.12
Archaeology and Virtual Acoustics. A Pan Flute From Ancient Egypt
Avanzini, Federico   Università di Padova; Padova, Italy
Canazza, Sergio   Università di Padova; Padova, Italy
De Poli, Giovanni   Università di Padova; Padova, Italy
Fantozzi, Carlo   Università di Padova; Padova, Italy
Pretto, Niccolò   Università di Padova; Padova, Italy
Rodà, Antonio   Università di Padova; Padova, Italy
Angelini, Ivana   Università di Padova; Padova, Italy
Bettineschi, Cinzia   Università di Padova; Padova, Italy
Deotto, Giulia   Università di Padova; Padova, Italy
Faresin, Emanuela   Università di Padova; Padova, Italy
Menegazzi, Alessandra   Università di Padova; Padova, Italy
Molin, Gianmario   Università di Padova; Padova, Italy
Salemi, Giuseppe   Università di Padova; Padova, Italy
Zanovello, Paola   Università di Padova; Padova, Italy

Abstract
This paper presents the early developments of a recently started research project, aimed at studying from a multidisciplinary perspective an exceptionally well preserved ancient pan flute. A brief discussion of the history and iconography of pan flutes is provided, with a focus on Classical Greece. Then a set of non-invasive analyses is presented, based on 3D scanning and materials chemistry, which serve as the starting point to inspect the geometry, construction, age and geographical origin of the instrument. Based on the available measurements, a preliminary analysis of the instrument tuning is provided, which is also informed by elements of the theory of ancient Greek music. Finally, the paper presents current work aimed at realizing an interactive museum installation that recreates a virtual flute and allows intuitive access to all these research facets.
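
As a rough illustration of the kind of preliminary tuning estimate such measurements allow, the sketch below combines a quarter-wavelength resonance for a stopped pipe (with a simple end correction) and an interval calculation in cents; the constants are textbook defaults, not values from the paper's analysis.

```python
import math

def stopped_pipe_frequency(length_m, diameter_m, c=343.0):
    """Rough fundamental of a stopped (closed-end) pipe: quarter-wavelength
    resonance with an end correction of 0.3 times the bore diameter."""
    effective_length = length_m + 0.3 * diameter_m
    return c / (4.0 * effective_length)

def cents_between(f1, f2):
    """Interval between two frequencies in cents, for comparing an estimated
    tuning against theoretical scale steps."""
    return 1200.0 * math.log2(f2 / f1)

f_low = stopped_pipe_frequency(0.14, 0.007)   # hypothetical 14 cm pipe
f_high = stopped_pipe_frequency(0.12, 0.007)  # hypothetical 12 cm pipe
print(f_low, f_high, cents_between(f_low, f_high))
```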

Keywords
3D scanning, Archaeoacoustics, Interactive multimedia installations, Virtual instruments

Paper topics
Interfaces for sound and music, Multimodality in sound and music computing

Easychair keyphrases
pan flute [13], ancient greek music [11], musical instrument [10], active preservation [5], metric measurement [5], franc ois vase [4], internal pipe diameter dint [4], preserved ancient pan flute [4], sound synthesis [4], stopped pipe wind instrument [4], very high resolution [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851067
Zenodo URL: https://zenodo.org/record/851067


2015.13
A Score-Informed Piano Tutoring System with Mistake Detection and Score Simplification
Fukuda, Tsubasa   Kyoto University; Kyoto, Japan
Ikemiya, Yukara   Kyoto University; Kyoto, Japan
Itoyama, Katsutoshi   Kyoto University; Kyoto, Japan
Yoshii, Kazuyoshi   Kyoto University; Kyoto, Japan

Abstract
This paper presents a novel piano tutoring system that encourages a user to practice playing a piano by simplifying difficult parts of a musical score according to the playing skill of the user. To identify the difficult parts to be simplified, the system is capable of accurately detecting mistakes of a user's performance by referring to the musical score. More specifically, the audio recording of the user's performance is transcribed by using supervised non-negative matrix factorization (NMF) whose basis spectra are trained from isolated sounds of the same piano in advance. Then the audio recording is synchronized with the musical score using dynamic time warping (DTW). The user's mistakes are then detected by comparing those two kinds of data. Finally, the detected parts are simplified according to three kinds of rules: removing some musical notes from a complicated chord, thinning out some musical notes from a fast passage, and removing octave jumps. The experimental results showed that the first rule can simplify musical scores naturally. The second rule, however, simplified the scores awkwardly, especially when the passage constituted a melody line.
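
A minimal sketch of the supervised transcription step: multiplicative NMF updates for the activations with the basis matrix fixed to spectra learned from isolated piano notes; the Euclidean cost used here is an assumption, not necessarily the paper's choice.

```python
import numpy as np

def nmf_activations(V, W, n_iter=200, eps=1e-10):
    """Supervised NMF: estimate activations H for a magnitude spectrogram V
    given a fixed basis matrix W (one column per piano pitch, learned in
    advance from isolated sounds of the same piano).

    V: (n_freq_bins, n_frames), W: (n_freq_bins, n_pitches).
    Multiplicative updates for the Euclidean cost; W is never updated.
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H  # threshold H to obtain a piano-roll-like transcription
```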

Keywords
NMF, Piano performance support, Score simplification

Paper topics
Interactive performance systems, Music performance analysis and rendering

Easychair keyphrases
multipitch estimation [18], musical score [15], piano roll [15], score simplification [11], octave error [10], activation matrix [8], actual performance [8], audio signal [8], mistake detection [8], dynamic time warping [7], audio recording [6], non negative matrix factorization [6], synchronized piano roll [6], musical note [5], simplified score [5], user performance [5], base spectrum matrix [4], difficult part [4], fast passage [4], harmonic structure [4], informed piano tutoring system [4], musical score according [4], novel piano tutoring system [4], player skill [4], practice playing [4], rwc music database [4], score informed piano tutoring [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851129
Zenodo URL: https://zenodo.org/record/851129


2015.14
A Tambourine Support System to Improve the Atmosphere of Karaoke
Kurihara, Takuya   Nihon University; Tokyo, Japan
Kinoshita, Naohiro   Nihon University; Tokyo, Japan
Yamaguchi, Ryunosuke   Nihon University; Tokyo, Japan
Kitahara, Tetsuro   Nihon University; Tokyo, Japan

Abstract
Karaoke is a popular amusement, but people do not necessarily enjoy karaoke when they are not singing. It is better for non-singing people to engage in karaoke to enliven it, but this is not always easy, especially if they do not know the song. Here, we focus on the tambourine, which is provided in most karaoke spaces in Japan but is rarely used. We propose a system that instructs a non-singing person in how to play the tambourine. Once the singer chooses a song, the tambourine part for this song is automatically generated based on the standard MIDI file. During playback, the tambourine part is displayed in a common music-game style together with the usual karaoke-style lyrics. The correctness of the tambourine beats is fed back on the display. The results showed that our system motivated non-singing people to play the tambourine with a game-like instruction even for songs that they did not know.

Keywords
Karaoke, Tambourine part generation, Tambourine Support

Paper topics
Interactive performance systems

Easychair keyphrases
tambourine part [19], baseline system [16], tambourine player [14], easy song hard song [12], mean value [10], body motion [9], play karaoke [9], practice mode [9], tambourine part generation [9], unknown song [9], tambourine performance [8], usual karaoke style lyric [8], easy song [7], instrumental solo section [7], tambourine support system [7], temporal differential [7], common music game style [6], hard song [6], hard song easy song [6], non singing person [6], real time tambourine performance [6], rwc music database [6], wii tambourine [6], singing voice [5], singer favorite song [4], snare drum [4], strong note [4], system easy song [4], tambourine performance feedback [4], unknown known unknown song song [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851059
Zenodo URL: https://zenodo.org/record/851059


2015.15
Automatic Singing Voice to Music Video Generation via Mashup of Singing Video Clips
Hirai, Tatsunori   Waseda University; Tokyo, Japan
Ikemiya, Yukara   Kyoto University; Kyoto, Japan
Yoshii, Kazuyoshi   Kyoto University; Kyoto, Japan
Nakano, Tomoyasu   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Goto, Masataka   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Morishima, Shigeo   CREST, JST, Waseda Research Institute for Science and Engineering, Waseda University; Tokyo, Japan

Abstract
This paper presents a system that takes audio signals of any song sung by a singer as the input and automatically generates a music video clip in which the singer appears to be actually singing the song. Although music video clips have gained popularity on video streaming services, not all existing songs have corresponding video clips. Given a song sung by a singer, our system generates a singing video clip by reusing existing singing video clips featuring the singer. More specifically, the system retrieves short fragments of singing video clips that include singing voices similar to that in the target song, and then concatenates these fragments using a technique of dynamic programming (DP). To achieve this, we propose a method to extract singing scenes from music video clips by combining vocal activity detection (VAD) with mouth aperture detection (MAD). The subjective experimental results demonstrate the effectiveness of our system.

Keywords
audio-visual processing, Music video generation, singing scene detection

Paper topics
Multimodality in sound and music computing

Easychair keyphrases
video clip [82], music video clip [68], singing scene [61], singing voice [61], music video [47], non singing scene [28], singing voice feature [22], singing scene detection [17], singing voice separation [15], video generation [13], mouth aperture degree [12], audio visual [11], audio visual synchronization [11], singing video [9], automatic music video generation [8], database clip [8], real music video clip [8], audio signal [7], edge free dp [7], existing music video [7], real video clip [7], similar singing voice [7], singing video clip [7], instrumental section [6], mouth aperture [6], music video generation [6], scene detection method [6], talking head [6], video fragment [6], arbitrary song [5]

Paper type
Full paper

DOI: 10.5281/zenodo.851033
Zenodo URL: https://zenodo.org/record/851033


2015.16
Bean: A Digital Musical Instrument for Use in Music Therapy
Kirwan, Nicholas John   Aalborg University; Aalborg, Denmark
Overholt, Daniel   Aalborg University; Aalborg, Denmark
Erkut, Cumhur   Aalborg University; Aalborg, Denmark

Abstract
The use of interactive technology in music therapy is rapidly growing. The flexibility afforded by the use of these technologies in music therapy is substantial. We present steps in the development of Bean, a Digital Musical Instrument wrapped around a commercial game console controller and designed for use in a music therapy setting. Bean is controlled by gestures, and has both physical and virtual segments. The physical user interaction is minimalistic, consisting of the spatial movement of the instrument, along with two push buttons. Also, some visual aspects have been integrated in Bean. Direct visual feedback from the instrument itself is mirrored in accompanying software, where a 3D virtual representation of the instrument can be seen. Sound synthesis currently consists of amplitude and frequency modulation and effects, with a clear separation of melody and harmony. These aspects were developed with an aim to encourage an immediate sense of agency. Bean is being co-developed with clients and therapists, in order to assess the current state of development, and provide clues for optimal improvement going forward. Both the strengths and the weaknesses of the design at the time of the evaluation were assessed. Using this information, the current design has been updated, and is now closer to a formal evaluation.

Keywords
DMI, Music Therapy, Participatory Design, Tangible Interface for Musical Expression

Paper topics
Interfaces for sound and music, Multimodality in sound and music computing, Social interaction in sound and music computing, Sound/music and the neurosciences

Easychair keyphrases
music therapy [24], music therapist [8], digital musical instrument [7], sensor data [7], aural feedback [6], solo voice [6], therapeutic setting [6], visual feedback [6], free play [5], aalborg university copenhagen [4], art therapist [4], mapping strategy [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851069
Zenodo URL: https://zenodo.org/record/851069


2015.17
Capturing and Ranking Perspectives on the Consonance and Dissonance of Dyads
Breen, Aidan   National University of Ireland Galway (NUI Galway); Galway, Ireland
O'Riordan, Colm   National University of Ireland Galway (NUI Galway); Galway, Ireland

Abstract
In many domains we wish to gain further insight into the subjective preferences of an individual. The problem with subjective preferences is that individuals are not necessarily coherent in their responses. Often, a simple linear ranking is either not possible, or may not accurately reflect the true preferences or behaviour of the individual. The phenomenon of consonance is heavily subjective, and individuals often report perceiving different levels of consonance, or indeed dissonance. In this paper we present a thorough analysis of previous studies on the perception of consonance and dissonance of dyads. We outline a system which ranks musical intervals in terms of consonance based on pairwise comparison, and we compare results obtained using the proposed system with the results of previous studies. Finally, we propose future work to improve the implementation and design of the system. Our proposed approach is robust enough to handle incoherences in subjects' responses, preventing the formation of circular rankings while maintaining the ability to express these rankings --- an important factor for future work. We achieve this by representing the data gathered on a directed graph. Abstract objects are represented as nodes, and a subject's preference across any two objects is represented as a directed edge between the two corresponding nodes. We can then make use of the transitive nature of human preferences to build a ranking --- or partial ranking --- of objects with a minimum of pairwise comparisons.
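
A minimal sketch of the digraph idea using networkx: pairwise judgements become directed edges, an edge that would create a cycle (a circular ranking) is handled so the graph stays acyclic, and a topological order yields a partial ranking; simply discarding such edges, as done below, is a simplification of the paper's approach.

```python
import networkx as nx

def add_preference(graph, preferred, other):
    """Record "preferred is judged more consonant than other" as a directed
    edge, but drop the edge if it would create a cycle, i.e. an incoherent,
    circular ranking (a simplification used here for illustration).
    """
    graph.add_edge(preferred, other)
    if not nx.is_directed_acyclic_graph(graph):
        graph.remove_edge(preferred, other)
    return graph

g = nx.DiGraph()
for a, b in [("P5", "M3"), ("M3", "m2"), ("m2", "P5")]:  # last edge is circular
    add_preference(g, a, b)

# Any topological order of the remaining DAG is a (partial) consonance ranking.
print(list(nx.topological_sort(g)))
```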

Keywords
Consonance, Digraph, Dissonance, Partial Ranking, Ranking, Subjective Preferences

Paper topics
Computational musicology and Mathematical Music Theory, Models for sound analysis and synthesis, Music information retrieval, Sound/music and the neurosciences

Easychair keyphrases
weighted graph [8], directed graph [6], sample group [6], pairwise comparison [5], piano note [5], ranking method [5], subjective preference [5], ranking algorithm [4], subject response [4], test bed [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851045
Zenodo URL: https://zenodo.org/record/851045


2015.18
Cooperative musical creation using Kinect, WiiMote, Epoc and microphones: a case study with MinDSounDS
Tavares, Tiago Fernandes   University of Campinas (UNICAMP); Campinas, Brazil
Rimoldi, Gabriel   University of Campinas (UNICAMP); Campinas, Brazil
Eger Pontes, Vânia   University of Campinas (UNICAMP); Campinas, Brazil
Manzolli, Jônatas   University of Campinas (UNICAMP); Campinas, Brazil

Abstract
We describe the composition and performance process of the multimodal piece MinDSounDS, highlighting the design decisions regarding the application of diverse sensors, namely the Kinect (motion sensor), real-time audio analysis with Music Information Retrieval (MIR) techniques, WiiMote (accelerometer) and Epoc (Brain-Computer Interface, BCI). These decisions were taken as part of a collaborative creative process, in which the technical restrictions imposed by each sensor were combined with the artistic intentions of the group members. Our mapping schema takes into account the technical limitations of the sensors and, at the same time, respects the performers’ previous repertoire. A deep analysis of the composition process, particularly due to the collaborative aspect, highlights advantages and issues, which can be used as guidelines for future work in a similar condition.

Keywords
BCI, Kinect, Multimedia, Musical Creation, Music Information Retrieval, Synthesis, WiiMote

Paper topics
Interactive performance systems, Interfaces for sound and music, Sonic interaction design

Easychair keyphrases
composition process [16], virtual environment [7], brain computer interface [6], musical expression [4], slap gesture [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851117
Zenodo URL: https://zenodo.org/record/851117


2015.19
CrossSong Puzzle: Generating and Unscrambling Music Mashups with Real-time Interactivity
Smith, Jordan B. L.   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Percival, Graham   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Kato, Jun   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Goto, Masataka   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Fukayama, Satoru   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan

Abstract
There is considerable interest in music-based games, as the popularity of Rock Band and others can attest, as well as in puzzle games. However, the two have rarely been combined. Most music-based games fall into the category of rhythm games, and in those games where music is incorporated into a puzzle-like challenge, music usually serves as either an accompaniment or a reward. We set out to design a puzzle game where musical knowledge and analysis would be essential to making deductions and solving the puzzle. The result is the CrossSong Puzzle, a novel type of music-based logic puzzle that truly integrates musical and logical reasoning. The game presents a player with a grid of tiles, each representing a mashup of measures from two different songs. The goal is to rearrange the tiles so that each row and column plays a continuous musical excerpt. Automatically identifying a set of song fragments to fill a grid such that each tile contains an acceptable mash-up is our primary technical hurdle. We propose an algorithm that analyses a corpus of music, searches the space of possible fragments, and selects an arrangement that maximizes the “mashability” of the resulting grid. This algorithm and the interaction design of the system are the main contributions.

Keywords
games, interfaces, mashups, puzzles

Paper topics
Content processing of music audio signals, Interactive performance systems, Interfaces for sound and music, Social interaction in sound and music computing, Sonic interaction design, Sound and music for VR and games

Easychair keyphrases
crosssong puzzle [12], visual hint [7], music based game [4], puzzle game [4], real time [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851097
Zenodo URL: https://zenodo.org/record/851097


2015.20
Design and Implementation of a Whole-body Haptic Suit for "Ilinx", a Multisensory Art Installation
Giordano, Marcello   McGill University; Montreal, Canada
Hattwick, Ian   McGill University; Montreal, Canada
Franco, Ivan   McGill University; Montreal, Canada
Egloff, Deborah   McGill University; Montreal, Canada
Frid, Emma   Sound and Music Computing group, KTH Royal Institute of Technology; Stockholm, Sweden
Lamontagne, Valerie   Concordia University; Montreal, Canada
Martinucci, Maurizio (Tez)   TeZ, Independent; Netherlands
Salter, Christopher   Concordia University; Montreal, Canada
Wanderley, Marcelo M.   McGill University; Montreal, Canada

Abstract
Ilinx is a multidisciplinary art/science research project focusing on the development of a multisensory art installation involving sound, visuals and haptics. In this paper we describe design choices and technical challenges behind the development of six tactile augmented garments, each one embedded with thirty vibrating actuators. Starting from perceptual experiments, conducted to characterize the actuators used in the garments, we describe hardware and software design, and the development of several haptic effects. The garments have successfully been used by over 300 people during the premiere of the installation in the TodaysArt 2014 festival in The Hague.

Keywords
haptics, multisensory, whole-body suit

Paper topics
Multimodality in sound and music computing, Social interaction in sound and music computing, Sonic interaction design

Easychair keyphrases
duty cycle [18], haptic effect [12], driver board [8], duty cycle difference [6], pwm duty cycle [6], average peak amplitude [4], body segment [4], central processing unit [4], dual lock velcro strip [4], duty cycle value [4], multi sensory art installation [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851025
Zenodo URL: https://zenodo.org/record/851025


2015.21
Desirable aspects of visual programming languages for different applications in music creation
Pošćić, Antonio   Faculty of Electrical Engineering and Computing, University of Zagreb; Zagreb, Croatia
Kreković, Gordan   Faculty of Electrical Engineering and Computing, University of Zagreb; Zagreb, Croatia
Butković, Ana   Faculty of Humanities and Social Sciences, University of Zagreb; Zagreb, Croatia

Abstract
Visual programming languages are commonly used in the domain of sound and music creation. Specific properties and paradigms of those visual languages make them convenient and appealing to artists in various applications such as computer composition, sound synthesis, multimedia artworks, and the development of interactive systems. This paper presents systematic research on several well-known languages for sound and music creation. The research was based on the analysis of cognitive dimensions such as abstraction gradient, consistency, closeness of mapping, and error-proneness. We have also considered the context of each analyzed language, including its availability, community, and learning materials. Data for the research were collected from a survey conducted among users of the most notable and widespread visual programming languages. The data is presented both in raw, textual format and in a summarized table view. The results indicate desirable aspects along with possible improvements of visual programming approaches for different use cases. Finally, future research directions and goals are suggested in the field of visual programming for applications in music.

Keywords
cognitive dimensions, music creation, visual programming

Paper topics
Computer environments for sound/music processing, Interfaces for sound and music

Easychair keyphrases
pure data [36], few time [16], visual programming language [14], visual programming [11], few week [10], native instrument reaktor [9], algorithmic composition [8], programming language [8], formal music education [7], interactive system [7], music creation [7], audio effect [6], computer music [6], debugging tool limitation [6], few day [6], few month [6], music composition [6], symbolic sound kyma [6], answered question [5], inspiring aspect [5], musical composition [5], online tutorial [5], temporal dimension [5], user base [5], visual representation [5], automating composition technique [4], cognitive dimension framework [4], existing sound [4], program flow control [4], sound synthesis [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851083
Zenodo URL: https://zenodo.org/record/851083


2015.22
Developing Mixer-style Controllers Based on Arduino/Teensy Microcontrollers
Popp, Constantin   Knobtronix; United Kingdom
Soria Luz, Rosalía   Knobtronix; United Kingdom

Abstract
Low-cost MIDI mixer-style controllers may not lend themselves to the performance practise of electroacoustic music. This is due to the limited bit depth at which control values are transmitted and, potentially, the size and layout of control elements, which provide only coarse control of sound processes running on a computer. As professional controllers with higher resolution and higher quality controls are more costly and possibly rely on proprietary protocols, the paper investigates the development process of custom DIY controllers based on the Arduino and Teensy 3.1 microcontrollers, and Open Source software. In particular, the paper discusses the challenges of building higher resolution controllers on a restricted budget with regard to component selection, printed circuit board and enclosure design. The solutions, compromises and outcomes are presented and analysed in fader-based and knob-based prototypes.

Keywords
electroacoustic performance practise, high-resolution, human computer interaction, midi-controllers, open source

Paper topics
Interfaces for sound and music

Easychair keyphrases
mixer style controller [6], fader box [5], open source [5], electroacoustic music [4], size comparison [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851087
Zenodo URL: https://zenodo.org/record/851087


2015.23
Distributing Music Scores to Mobile Platforms and to the Internet using INScore
Fober, Dominique   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France
Gouilloux, Guillaume   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France
Orlarey, Yann   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France
Letz, Stéphane   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France

Abstract
Music notation is facing new musical forms such as electronic and/or interactive music, live coding, and hybridizations with dance, design, and multimedia. It is also facing the migration of musical instruments to gestural and mobile platforms, which poses the question of new score usages on devices that mostly lack the graphic space necessary to display the music in a traditional setting and approach. Music scores distributed and shared on the Internet are also starting to support innovative musical practices, which raises other issues, notably regarding dynamic and collaborative music scores. This paper introduces some of the perspectives opened by the migration of music scores to mobile platforms and to the Internet. It also presents the approach adopted with INScore, an environment for the design of augmented, interactive music scores.

Keywords
collaboration, internet, music score

Paper topics
Interactive performance systems

Easychair keyphrases
music score [15], music notation [12], mobile platform [8], collaborative score design [6], interactive music score [6], use case [6], websocket server [6], forwarding mechanism [5], computer music [4], event based interaction mechanism [4], international computer [4], score set gmnf [4], web audio api [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851061
Zenodo URL: https://zenodo.org/record/851061


2015.24
Embodied Auditory Display Affordances
Roddy, Stephen   Trinity College Dublin; Maynooth, Ireland
Furlong, Dermot   Trinity College Dublin; Maynooth, Ireland

Abstract
The current paper takes a critical look at the current state of Auditory Display. It isolates naive realism and cognitivist thinking as limiting factors to the development of the field. An extension of Gibson’s theory of affordances into the territory of Embodied Cognition is suggested. The proposed extension relies heavily on Conceptual Metaphor Theory and Embodied Schemata. This is hoped to provide a framework in which to address the problematic areas of theory, meaning and lack of cognitive research in Auditory Display. Finally, the current research’s development of a set of embodied auditory models, intended to offer greater lucidity and reasonability in Auditory Display systems through the exploitation of embodied affordances, is discussed.

Keywords
Affordances, Auditory, Cognition, Data-driven, Display, Embodied, Furlong, Music, Roddy, Sonification, Stephen

Paper topics
Auditory displays and data sonification, Perception and cognition of sound and music, Sonic interaction design

Easychair keyphrases
auditory display [27], embodied schema [21], embodied affordance [15], meaning making [12], symbol grounding problem [12], embodied cognition [11], auditory domain [10], cognitive capacity [10], cognitive science [10], embodied interaction [10], naive realism [10], big small schema [9], second generation cognitive science [8], cognitively based research [6], conceptual metaphor theory [6], problem area [6], design framework [5], embodied experience [5], human experience [5], auditory perception [4], bass line [4], design pattern [4], ecological interface design [4], embodied auditory model [4], embodied mind [4], embodied music cognition [4], embodied schema theory [4], envelope attack speed [4], pitch level [4], stereo image [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851019
Zenodo URL: https://zenodo.org/record/851019


2015.25
Exploring the General Melodic Characteristics of XinTianYou Folk Songs
Li, Juan   Xi'an Jiaotong University (XJTU); Xi'an, China
Dong, Lu   Xi'an Jiaotong University (XJTU); Xi'an, China
Ding, Jianhang   Xi'an Jiaotong University (XJTU); Xi'an, China
Yang, Xinyu   Xi'an Jiaotong University (XJTU); Xi'an, China

Abstract
This paper aims to analyze one style of Chinese traditional folk songs named Shaanxi XinTianYou. Research on XinTianYou is beneficial to cultural exploration and music information retrieval. We build a MIDI database to explore the general characteristics of its melody. Our insight is that the combination of intervals reflects the characteristics of the music style. To find the most representative combinations of intervals, we propose to use the N-Apriori algorithm, which counts the frequent patterns of the melody. Considering both the significance of and similarity between music pieces, we also provide a multi-layer melody perception clustering algorithm which uses both the melodic direction and the melodic value. The significant patterns are selected as the general characteristics of XinTianYou. The musical structure of XinTianYou is analyzed based on both the experimental results and music theory. We also ask experts to evaluate our experimental results, and they confirm that our results are consistent with the experts' intuition.
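
A simplified sketch of the pattern-mining idea: count contiguous interval patterns across a small corpus and keep the frequent ones; this plain n-gram count stands in for the N-Apriori mining described in the paper, and the example melodies are invented.

```python
from collections import Counter

def frequent_interval_patterns(melodies, max_len=3, min_support=2):
    """Count contiguous interval patterns (in semitones) across a corpus of
    melodies (lists of MIDI pitches) and keep those occurring at least
    min_support times.
    """
    counts = Counter()
    for pitches in melodies:
        intervals = [b - a for a, b in zip(pitches, pitches[1:])]
        for n in range(1, max_len + 1):
            for i in range(len(intervals) - n + 1):
                counts[tuple(intervals[i:i + n])] += 1
    return {pattern: c for pattern, c in counts.items() if c >= min_support}

# Two toy melodies sharing the contour +2, +3, -3, -2:
print(frequent_interval_patterns([[62, 64, 67, 64, 62],
                                  [60, 62, 65, 62, 60]]))
```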

Keywords
Clustering, Folk songs, General characteristics, Pattern mining

Paper topics
Music information retrieval

Easychair keyphrases
general characteristic [11], folk song [10], melody segment [8], frequent pattern [7], average sc result [6], chinese folk song [6], edit distance [6], multi layer melody [6], clustering result [5], significant pattern [5], aware top k pattern [4], candidate k item [4], chinese folk music [4], chinese music [4], frequent item [4], music piece [4], redundancy aware [4], top k cosine similarity [4], wide interval [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851035
Zenodo URL: https://zenodo.org/record/851035


2015.26
Follow the Tactile Metronome: Vibrotactile Stimulation for Tempo Synchronization in Music Performance
Giordano, Marcello   McGill University; Montreal, Canada
Wanderley, Marcelo M.   McGill University; Montreal, Canada

Abstract
In this paper we present a study evaluating the effectiveness of a tactile metronome for music performance and training. Four guitar players were asked to synchronize to a metronome click-track delivered either aurally or via a vibrotactile stimulus. We recorded their performance at different tempi (60 and 120BPM) and compared the results across modalities. Our results indicate that a tactile metronome can reliably cue participants to follow the target tempo. Such a device could hence be used in musical practice and performances as a reliable alternative to traditional auditory click-tracks, generally considered annoying and distracting by performers.

Keywords
haptics, metronome, music performance, notification, tactile, vibrotactile

Paper topics
Interactive performance systems, Interfaces for sound and music, Multimodality in sound and music computing

Easychair keyphrases
tactile metronome [17], auditory metronome [10], music performance [8], auditory click track [7], metronome signal [7], tactile stimulus [7], target tempo [7], click track [6], computer music [5], target ioi [5], audio modality tactile modality figure [4], guitar player [4], raw data point [4], reaction time [4], tactile click track [4], tactile display [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851023
Zenodo URL: https://zenodo.org/record/851023


2015.27
Generalizing Messiaen’s Modes of Limited Transposition to a n-tone Equal Temperament
Ludovico, Luca Andrea   Università degli Studi di Milano; Milano, Italy
Baratè, Adriano   Università degli Studi di Milano; Milano, Italy

Abstract
Modes of limited transposition are musical modes originally conceived by the French composer Olivier Messiaen for a tempered system of 12 pitches per octave. They are defined on the basis of symmetry-related criteria used to split an octave into a number of recurrent interval groups. This paper describes an algorithm to automatically compute the modes of limited transposition in a generic n-tone equal temperament. After providing a pseudo-code description of the process, we propose a Web implementation.
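
As a toy illustration only (not the paper's algorithm, and with no claim about its mode ordering), the following Python sketch enumerates interval patterns of limited transposition in an n-tone equal temperament by repeating interval blocks whose sum is a proper divisor of n; degenerate cases such as the chromatic scale are included:

def compositions(total):
    """All ordered sequences of positive integers summing to total."""
    if total == 0:
        yield ()
        return
    for first in range(1, total + 1):
        for rest in compositions(total - first):
            yield (first,) + rest

def limited_transposition_modes(n):
    """Naive enumeration of interval patterns in n-TET whose cyclic
    structure repeats with period d < n (d a proper divisor of n),
    deduplicated up to rotation."""
    seen, modes = set(), []
    for d in (d for d in range(1, n) if n % d == 0):
        for block in compositions(d):
            pattern = block * (n // d)
            canon = min(pattern[i:] + pattern[:i] for i in range(len(pattern)))
            if canon not in seen:
                seen.add(canon)
                modes.append(canon)
    return modes

# e.g. in 12-TET, (2, 2, 2, 2, 2, 2) is the whole-tone scale (Messiaen's mode 1)
print(len(limited_transposition_modes(12)))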

Keywords
generalization, modes of limited transposition, Olivier Messiaen

Paper topics
Computational musicology and Mathematical Music Theory

Easychair keyphrases
limited transposition [11], ring diagram [10], pitch class [9], equal temperament [8], global interval [7], messiaen mode [7], olivier messiaen [6], tone equal temperament [6], frequency ratio [5], theoretical work [5], data structure [4], generalized mode [4], western music [4]

Paper type
Full paper


2015.28
Grammatical Evolution with Zipf's Law Based Fitness for Melodic Composition
Loughran, Róisín   University College Dublin (UCD); Dublin, Ireland
McDermott, James   University College Dublin (UCD); Dublin, Ireland
O'Neill, Michael   University College Dublin (UCD); Dublin, Ireland

Abstract
We present a novel method of composing piano pieces with Grammatical Evolution. A grammar is designed to define a search space for melodies consisting of notes, chords, turns and arpeggios. This space is searched using a fitness function based on the calculation of Zipf's distribution of a number of pitch and duration attributes of the given melodies. In this way, we can create melodies without setting a given key or time signature. We can then create simple accompanying bass parts to repeat under the melody. This bass part is evolved using a grammar created from the evolved treble line, with a fitness based on Zipf's distribution of the harmonic relationship between the treble and bass parts. From an analysis of the system we conclude that the designed grammar and the construction of the compositions from the final population of melodies are more influential on the musicality of the resultant compositions than the use of the Zipf-based metrics.
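
For illustration, a minimal Python sketch of one possible Zipf-based fitness term (an assumed form, not the paper's exact measure): count the frequency of an attribute, fit a line to the log-log rank-frequency plot, and penalise deviation of the slope from -1.

import numpy as np
from collections import Counter

def zipf_fitness(pitches):
    """Score how closely the rank-frequency distribution of an attribute
    (here, MIDI pitches) follows Zipf's law.  Lower is better."""
    counts = sorted(Counter(pitches).values(), reverse=True)
    if len(counts) < 2:
        return float("inf")          # degenerate melody: nothing to fit
    ranks = np.arange(1, len(counts) + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(counts), 1)
    return abs(slope - (-1.0))

# toy melody as MIDI note numbers
print(zipf_fitness([60, 62, 60, 64, 60, 62, 65, 60]))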

Keywords
Algorithmic Composition, Evolutionary Computation, Grammatical Evolution, Melodic Composition

Paper topics
Computational musicology and Mathematical Music Theory, Sound/music signal processing algorithms

Easychair keyphrases
fitness function [15], zipf distribution [15], grammatical evolution [12], final generation [10], bass part [7], zipf law [7], best fitness [6], final population [6], fit individual [6], pitch duration [6], short melody [6], duration attribute [5], musical composition [5], top individual [5], accompanying bass part [4], bass accompaniment [4], best individual [4], best median ideal [4], computer music [4], event piece event [4], evolutionary run [4], fitness measure [4], genetic algorithm [4], piano piece [4], treble melody [4]

Paper type
Full paper


2015.29
Granular Model of Multidimensional Spatial Sonification
Wan Rosli, Muhammad Hafiz   Media Arts and Technology Program, University of California, Santa Barbara (UCSB); Santa Barbara, United States
Cabrera, Andrés   Media Arts and Technology Program, University of California, Santa Barbara (UCSB); Santa Barbara, United States
Wright, Matthew James   Media Arts and Technology Program, University of California, Santa Barbara (UCSB); Santa Barbara, United States

Abstract
Sonification is the use of sonic materials to represent information. Spatial sonification is a natural fit for spatial data, i.e., data that contains positional information, owing to the inherently spatial nature of sound. However, perceptual issues such as the precedence effect and the minimum audible angle limit our ability to perceive directional stimuli. Furthermore, the mapping of multivariate datasets to synthesis-engine parameters is non-trivial as a result of the vast information space. This paper presents a model for representing spatial datasets via spatial sonification through the use of granular synthesis.

Keywords
Auditory Displays, Data Sonification, Granular Synthesis, Multimodal Data Representation, Psychoacoustics, Spatial Audio

Paper topics
Auditory displays and data sonification, Models for sound analysis and synthesis, Multimodality in sound and music computing, Perception and cognition of sound and music, Spatial audio

Easychair keyphrases
data point [29], flash rate [11], spatial data [10], granular synthesis [9], granular stream [8], lightning occurrence [7], auditory display [6], spatial sonification [6], synthesis engine [6], data slice [5], grain density [5], temporal transformation [5], complex data space [4], flash rate value [4], minimum audible angle [4], multimodal data representation [4], perceptual issue [4], point cloud [4], sound particle [4], sound spatialization [4], spatial dataset [4], spatial sound [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851089
Zenodo URL: https://zenodo.org/record/851089


2015.30
Guided improvisation as dynamic calls to an offline model
Nika, Jérôme   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Bouche, Dimitri   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Bresson, Jean   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Chemillier, Marc   École des hautes études en sciences sociales (EHESS); Paris, France
Assayag, Gérard   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France

Abstract
This paper describes a reactive architecture handling the hybrid temporality of guided human-computer music improvisation. It aims at combining reactivity and anticipation in music generation processes steered by a scenario. The machine improvisation takes advantage of the temporal structure of this scenario to generate short-term anticipations ahead of the performance time, and reacts to external controls by refining or rewriting these anticipations over time. To achieve this within the framework of interactive software, guided improvisation is modeled as embedding a compositional process into a reactive architecture. This architecture is instantiated in the improvisation system ImproteK and implemented in OpenMusic.

Keywords
Guided improvisation, Music generation, Planning, Reactive architecture, Scenario, Scheduling

Paper topics
Interactive performance systems, Music performance analysis and rendering

Easychair keyphrases
improvisation handler [26], generation model [20], improvisation renderer [13], generation parameter [11], dynamic control [9], guided improvisation [9], performance time [9], improvisation fragment [8], music generation [7], time window [7], execution trace [6], generation process [6], improvisation handler agent [6], improvisation system [6], reactive architecture [6], real time [6], short term [6], sub sequence [6], computer music [5], generation phase [5], temporal structure [5], action container [4], handler action container [4], human computer improvisation [4], long term structure [4], memory generation model [4], performance time tp [4], short term plan extraction [4], th international computer [4], user control [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851143
Zenodo URL: https://zenodo.org/record/851143


2015.31
Harmony of the Spheres: A Physics-Based Android Synthesizer and Controller with Gestural Objects and Physical Transformations
Thalmann, Florian   Queen Mary University of London; London, United Kingdom

Abstract
This paper introduces the concepts and principles behind Harmony of the Spheres, an Android app based on physical spaces and transformations. The app investigates how gestural multitouch and accelerometer control can be used to create and interact with objects in these physical spaces. The properties of these objects can be arbitrarily mapped to sound parameters, either of an internal synthesizer or external systems, and they can be visualized in flexible ways. On a larger scale, users can make soundscapes by defining sequences of physical space conditions, each of which has an effect on the positions and properties of the physical objects.

Keywords
audiovisual mapping, gestural interaction, mobile apps, musical spaces, physical models

Paper topics
Computational musicology and Mathematical Music Theory, Interactive performance systems, Interfaces for sound and music, Sonic interaction design

Easychair keyphrases
physical condition [14], inherent motion [9], musical object [7], real time [7], directional gravity [5], physical model [5], audio parameter [4], central gravity [4], gravitational center [4], internal synthesizer [4], mathematical music theory [4], n dimensional space [4], physical space [4], physical transformation [4], transformational theory [4], visual dimension [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851163
Zenodo URL: https://zenodo.org/record/851163


2015.32
How Well Can a Music Emotion Recognition System Predict the Emotional Responses of Participants?
Song, Yading   Queen Mary University of London; London, United Kingdom
Dixon, Simon   Queen Mary University of London; London, United Kingdom

Abstract
Music emotion recognition (MER) systems have been shown to perform well for musical genres such as film soundtracks and classical music. It seems difficult, however, to reach a satisfactory level of classification accuracy for popular music. Unlike genre, music emotion involves complex interactions between the listener, the music and the situation. Research on MER systems is also hampered by the lack of empirical studies on emotional responses. In this paper, we present a study of music and emotion using two models of emotion. Participants' responses to 80 music stimuli under the categorical and dimensional models are compared. In addition, we collect 207 musical excerpts provided by participants for four basic emotion categories (happy, sad, relaxed, and angry). Given that these examples represent intense emotions, we use them to train classifiers on musical features, using support vector machines with different kernels and random forests. The most accurate classifier, based on random forests, is then applied to the 80 stimuli, and the results are compared with participants' responses. The analysis shows similar emotional responses for both models of emotion. Moreover, if the majority of participants agree on the same emotion category, the emotion of the song is also likely to be recognised by our MER system. This indicates that subjectivity in music experience limits the performance of MER systems, and that only strongly consistent emotional responses can be predicted.
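
As a hedged illustration of the classification stage only (placeholder random data and assumed hyperparameters, not the paper's features, excerpts or results), a scikit-learn sketch comparing SVM kernels with a random forest:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# X holds per-excerpt feature vectors, y the four emotion labels
# (happy, sad, relaxed, angry); both are placeholders here.
rng = np.random.default_rng(0)
X = rng.random((207, 20))                       # placeholder features
y = rng.integers(0, 4, size=207)                # placeholder labels

for clf in (SVC(kernel="linear"), SVC(kernel="rbf"),
            RandomForestClassifier(n_estimators=200, random_state=0)):
    scores = cross_val_score(clf, X, y, cv=5)   # 5-fold cross-validation
    print(type(clf).__name__, scores.mean())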

Keywords
emotional responses, music emotion, music emotion recognition, music perception

Paper topics
Music information retrieval, Perception and cognition of sound and music

Easychair keyphrases
dimensional model [31], music information retrieval [19], participant response [18], music emotion recognition [17], recognition system [15], th international society [15], categorical model [13], emotional response [13], emotion recognition system [12], musical excerpt [12], random forest [12], second clip [11], support vector machine [9], emotion category [8], induced emotion [7], musical feature [7], basic emotion [6], emotion model [6], emotion recognition [6], music emotion [6], happy sad [5], machine learning [5], popular music [5], recognition accuracy [5], artist name [4], greatest number [4], music research [4], popular musical excerpt [4], recognition result [4], subjective music recommendation system [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851055
Zenodo URL: https://zenodo.org/record/851055


2015.33
INTER-CHANNEL SYNCHRONISATION FOR TRANSMITTING LIVE AUDIO STREAMS OVER DIGITAL RADIO LINKS
Brown, Stephen   National University of Ireland Maynooth; Maynooth, Ireland
Oliver, Jorge   National University of Ireland Maynooth; Maynooth, Ireland

Abstract
There are two key challenges to the use of digital, wireless communication links for the short-range transmission of multiple live music streams from independent sources: delay and synchronisation. Delay is a result of the necessary buffering in digital music streams and of digital signal processing. Lack of synchronisation between time-stamped streams is a result of independent analogue-to-digital conversion clocks. Both of these effects are barriers to the wireless, digital recording studio. In this paper we explore the issue of synchronisation, presenting a model, some network performance figures, and the results of experiments exploring the perceived effects of losing synchronisation between channels. We also explore how this can be resolved in software when the data is streamed over a Wi-Fi link for real-time audio monitoring using consumer-grade equipment. We show how both fixed and varying offsets between channels can be resolved in software, to below the level of perception, using an offset-merge algorithm. As future work, we identify some of the key solutions for automated calibration. The contributions of this paper are the presentation of perception experiments on mixing unsynchronised music channels, the development of a model representing how these streams can be synchronised after the fact, and the presentation of current work in progress towards realizing the model.
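
A minimal sketch of one common way to estimate and compensate a fixed inter-channel offset (plain cross-correlation over equal-length buffers); this is illustrative only and is not the paper's offset-merge algorithm, which operates on time-stamped network streams:

import numpy as np

def estimate_offset(ref, other, max_lag):
    """Estimate the sample offset of `other` relative to `ref` by locating
    the peak of their cross-correlation within +/- max_lag samples."""
    assert len(ref) == len(other) > 2 * max_lag   # assumed equal-length buffers
    lags = np.arange(-max_lag, max_lag + 1)
    scores = [np.dot(ref[max_lag:-max_lag],
                     other[max_lag + lag:len(other) - max_lag + lag])
              for lag in lags]
    return lags[int(np.argmax(scores))]

def align(ref, other, max_lag=2048):
    """Shift `other` so that it lines up with `ref` (circular shift,
    acceptable for an illustrative buffer-level sketch)."""
    lag = estimate_offset(ref, other, max_lag)
    return np.roll(other, -lag)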

Keywords
Digital Audio, Interaural Time Difference, Latency, Synchronisation

Paper topics
Computer environments for sound/music processing, Content processing of music audio signals, Interactive performance systems, Sound/music and the neurosciences, Sound/music signal processing algorithms

Easychair keyphrases
real time [13], interaural time difference [11], offset merge [7], buffer size [6], inter channel [6], front end software [4], mixing desk [4], music performance [4], real time monitoring function [4], real time operating system [4], sound localization [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851121
Zenodo URL: https://zenodo.org/record/851121


2015.34
INVESTIGATING THE EFFECTS OF INTRODUCING NONLINEAR DYNAMICAL PROCESSES INTO DIGITAL MUSICAL INTERFACES
Mudd, Tom   The Open University; Milton Keynes, United Kingdom
Holland, Simon   The Open University; Milton Keynes, United Kingdom
Mulholland, Paul   The Open University; Milton Keynes, United Kingdom
Dalton, Nick   The Open University; Milton Keynes, United Kingdom

Abstract
This paper presents the results of a study that explores the effects of including nonlinear dynamical processes in the design of digital musical interfaces. Participants of varying musical backgrounds engaged with a range of systems, and their behaviours, responses and attitudes were recorded and analysed. The study suggests links between the inclusion of such processes and scope for exploration and serendipitous discovery. Relationships between musical instruments and nonlinear dynamics are discussed more broadly, in the context of both acoustic and electronic musical tools. Links between the properties of nonlinear dynamical systems and the priorities of experimental musicians are highlighted and related to the findings of the study.

Keywords
digital musical instruments, mapping, nonlinear dynamical systems

Paper topics
Interactive performance systems, Interfaces for sound and music, Sonic interaction design

Easychair keyphrases
nonlinear dynamical [22], nonlinear dynamical system [20], nonlinear dynamical process [15], discontinuous mapping [13], continuous mapping [11], experimental music [11], musical practice [10], experimental music group [9], musical tool [8], static system [8], nonlinear dynamic [7], nonlinear dynamical element [6], computer music [5], free improvisation [5], continuum international publishing group [4], damped forced duffing oscillator [4], digital musical interface [4], experimental music group mapping [4], material oriented [4], midi control [4], musical background [4], non experimental group [4], non experimental music group [4], open university [4], overall score [4], sonic event [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851041
Zenodo URL: https://zenodo.org/record/851041


2015.35
Irish Traditional Ethnomusicology Analysis Using Decision Trees and High Level Symbolic Features
Martins, Mario   Federal University of Technology of Paraná (UTFPR); Apucarana, Brazil
Silla Jr., Carlos Nascimento   Federal University of Technology of Paraná (UTFPR); Apucarana, Brazil

Abstract
In this paper we investigate the suitability of decision tree classifiers for assisting the task of large-scale computational ethnomusicology analysis. In our experiments we employed a dataset of 10,200 traditional Irish tunes. In order to extract features from the Irish tunes, we converted them into MIDI files and then extracted high-level features from them. Our experiments with the traditional Irish tunes verify that decision tree classifiers can be used for this task.

Keywords
computational ethnomusicology, decision trees, irish music

Paper topics
Computational musicology and Mathematical Music Theory

Easychair keyphrases
high level symbolic feature [24], decision tree classifier [19], decision tree [15], music information retrieval [14], short excerpt [12], abc notation [11], computational ethnomusicology [11], midi file [9], tune according [9], irish traditional [8], irish traditional music [7], irish music genre [6], machine learning [5], slip jig [5], traditional irish [5], abc format [4], association rule mining [4], data mining [4], folk music [4], irish traditional tune [4], midi format [4], naive listener [4], rule mining algorithm [4], time signature [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851095
Zenodo URL: https://zenodo.org/record/851095


2015.36
LICHTGESTALT: INTERACTION WITH SOUND THROUGH SWARMS OF LIGHT RAYS
Fehr, Jonas   Aalborg University; Aalborg, Denmark
Erkut, Cumhur   Aalborg University; Aalborg, Denmark

Abstract
We present a new interactive sound installation to be explored through movement. Specifically, movement qualities extracted from motion tracking data excite a dynamical system (a synthetic flock of agents), which responds to the movement qualities and indirectly controls the visual and sonic feedback of the interface. In other words, the relationship between gesture and sound is mediated by synthetic swarms of light rays. The sonic interaction design of the system uses density as a design dimension and maps the swarm parameters to sound synthesis parameters. Three swarm behaviors and three sound models are implemented, and an evaluation suggests that the general approach is promising and the system has the potential to engage the user.

Keywords
gesture sound mapping, Interactive sound installation, sonic interaction design

Paper topics
Interactive performance systems, Multimodality in sound and music computing, Sonic interaction design

Easychair keyphrases
light ray [11], sound synthesis [10], interactive sound installation [6], movement quality [6], sonic interaction design [6], computing system [4], high pitch sound texture [4], human factor [4], physical model [4], visual appearance [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851151
Zenodo URL: https://zenodo.org/record/851151


2015.37
Mapping brain signals to music via executable graphs
Crowley, Katie   Trinity College Dublin; Maynooth, Ireland
McDermott, James   University College Dublin (UCD); Dublin, Ireland

Abstract
A method for generating music in response to brain signals is proposed. The brain signals are recorded using consumer-level brain-computer interface equipment. Each time-step in the signal is passed through a directed acyclic graph whose nodes execute simple numerical manipulations. Certain nodes also output MIDI commands, leading to patterned MIDI output. Some interesting music is obtained, and desirable system properties are demonstrated: the music is responsive to changes in input, and a single input signal passed through different graphs leads to similarly-structured outputs.

Keywords
adaptive composition, BCI, EEG, generative music, music

Paper topics
Auditory displays and data sonification, Multimodality in sound and music computing, Sound/music and the neurosciences, Sound/music signal processing algorithms

Easychair keyphrases
brain computer interface [9], output node [9], temporal structure [7], brain signal [6], bci data [5], brain computer [5], eeg signal [5], esense meter [5], midi note [5], time series [5], bci signal [4], executable graph [4], human computer interaction [4], inbound edge [4], multiple output [4], neural network [4], non static input signal [4], pmod unary pdiv sin [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851155
Zenodo URL: https://zenodo.org/record/851155


2015.38
MEPHISTO: A Source to Source Transpiler from Pure Data to Faust
Demir, Abdullah Onur   Middle East Technical University; Ankara, Turkey
Hacıhabiboğlu, Hüseyin   Middle East Technical University; Ankara, Turkey

Abstract
This paper introduces Mephisto, a transpiler that converts sound patches designed in the graphical computer music environment Pure Data into the functional DSP programming language Faust, which itself compiles into highly optimized C++ code. The aim of the proposed transpiler is to let sound designers, musicians and sound engineers who use Pure Data in their workflows create highly optimized C++ code embeddable in games or other interactive media, and to reduce the prototype-to-product delay. Mephisto's internal structure, its conventions and limitations, and its performance are presented and discussed.

Keywords
audio in games, faust, high performance sound processing, procedural sound design, pure data, transpiler

Paper topics
Computer environments for sound/music processing, High Performance Computing for Audio, Sound and music for VR and games

Easychair keyphrases
pure data [27], faust code [12], parse tree [9], dac object [8], highly optimized c code [8], object figure [8], programming language [7], optimized c code [6], average cpu utilization [4], block diagram [4], control mechanism [4], data structure [4], mephisto generated faust code [4], pd object tree [4], pure data patch [4], sound synthesis [4], standard ccitt dialing tone [4], transpiler generated faust code [4], tree traversal [4], tree walker [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851147
Zenodo URL: https://zenodo.org/record/851147


2015.39
MODELING OF PHONEME DURATIONS FOR ALIGNMENT BETWEEN POLYPHONIC AUDIO AND LYRICS
Dzhambazov, Georgi   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Serra, Xavier   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain

Abstract
In this work we propose how to modify a standard text-to-speech alignment scheme for the alignment of lyrics and singing voice. To this end we model the duration of phonemes specific to the case of singing. We rely on a duration-explicit hidden Markov model (DHMM) phonetic recognizer based on mel-frequency cepstral coefficients (MFCCs), which are extracted in a way robust to background instrumental sounds. The proposed approach is tested on polyphonic audio from the classical Turkish music tradition in two settings: with and without modeling phoneme durations. Phoneme durations are inferred from sheet music. In order to assess the impact of the polyphonic setting, alignment is also evaluated on an a cappella dataset compiled especially for this study. We show that the explicit modeling of phoneme durations improves alignment accuracy by an absolute 10 percent at the level of lyrics lines (phrases) and performs on par with state-of-the-art aligners for other languages.

Keywords
lyrics-to-audio alignment, phoneme durations, polyphonic audio, score-following, score-informed alignment, singing voice tracking, speech-to-text alignment, Turkish classical music

Paper topics
Computational musicology and Mathematical Music Theory, Content processing of music audio signals, Models for sound analysis and synthesis, Music information retrieval, Sound/music signal processing algorithms

Easychair keyphrases
musical score [9], alignment accuracy [7], singing voice [7], explicit hidden markov model [6], phoneme duration [6], audio alignment [5], duration explicit [5], automatic lyric [4], background instrument [4], classical turkish music [4], hidden markov model [4], hidden semi markov model [4], hmm singer adaptation [4], markov model [4], polyphonic audio [4], vocal activity detection [4]

Paper type
Full paper


2015.40
Movement Perception in Music Performance - A Mixed Methods Investigation
Schacher, Jan C.   Zurich University of the Arts (ZHdK); Zurich, Switzerland
Järveläinen, Hanna   Zurich University of the Arts (ZHdK); Zurich, Switzerland
Strinning, Christian   Zurich University of the Arts (ZHdK); Zurich, Switzerland
Neff, Patrick   University of Zurich (UZH); Zurich, Switzerland

Abstract
What are the effects of a musician's movement on the affective impact of experiencing a music performance? How can perceptual, sub-personal and cognitive aspects of music be investigated through experimental processes? This article describes the development of a mixed methods approach that tackles such questions by blending quantitative and qualitative methods with observations and interpretations. Basing the core questions on terms and concepts obtained through a wide survey of the literature on musical gesture and movement analysis, we show the iterative, cyclical advance and extension of a series of experiments, and draw preliminary conclusions from data and information collected in a pilot study. With the choice of particular canonical pieces from contemporary music, a multi-perspective field of questioning is opened up that provides ample material and challenges for a process of converging, intertwining, cross-discipline methods development. The resulting interpretation points to a significant affective impact of movement in music, yet these insights remain subjective and demand further and deeper investigation.

Keywords
affective impact, blended interpretation, mixed methods, movement perception, music performance

Paper topics
Multimodality in sound and music computing, Music performance analysis and rendering, Perception and cognition of sound and music

Easychair keyphrases
mixed method research [15], musical gesture [15], music performance [12], audio rating [10], mixed method [10], perceived effort [6], video rating [6], affective impact [5], video condition [5], continuous self report method [4], median time series [4], movement analysis [4], musical performance [4], music perception [4], quantitative track [4], research project [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851107
Zenodo URL: https://zenodo.org/record/851107


2015.41
Mozart is still blue: a comparison of sensory and verbal scales to describe qualities in music
Murari, Maddalena   Università di Padova; Padova, Italy
Rodà, Antonio   Università di Padova; Padova, Italy
Da Pos, Osvaldo   Università di Padova; Padova, Italy
Schubert, Emery   The University of New South Wales (UNSW); Sydney, Australia
Canazza, Sergio   Università di Padova; Padova, Italy
De Poli, Giovanni   Department of Information Engineering, Università di Padova; Padova, Italy

Abstract
An experiment was carried out in order to assess the use of non-verbal sensory scales for evaluating perceived music qualities, by comparing them with analogous verbal scales. Participants were divided into two groups; one group (SV) completed a set of non-verbal scale responses and then a set of verbal scale responses to short musical extracts, while a second group (VS) completed the experiment in the reverse order. Our hypothesis was that the ratings of the SV group can provide information unmediated (or less mediated) by verbal association in a much stronger way than those of the VS group. Factor analysis performed separately on the SV group, the VS group and all participants shows a recurring patterning of the majority of sensory scales versus the verbal scales into different factors. These results suggest that the sensory scale items index a different semantic structure than the verbal scales in describing music, and so they capture different (perhaps ineffable) qualities, making them potentially special contributors to understanding musical experience.

Keywords
music expressiveness, music qualities, non verbal sensory scales, semantic differential

Paper topics
Multimodality in sound and music computing, Perception and cognition of sound and music

Easychair keyphrases
sensory scale [41], verbal scale [38], non verbal sensory scale [28], musical excerpt [15], bizet mozart chopin [12], brahm vivaldi bizet [12], har sof smo rou [12], mal tak blu ora [12], mozart chopin bach [12], sof smo rou bit [12], vivaldi bizet mozart [12], blu ora har [9], hea lig col [9], lig col war [9], non verbal [9], cold warm [8], heavy light [8], bitter sweet [7], factor analysis [7], scale blue orange [7], soft smooth [7], age range [6], blue orange [6], chopin bach factor [6], equivalent verbal scale [6], factor score [6], mean age [6], rel mal tak blu [6], soft smooth sweet light warm [6], very familiar very unfamiliar [6]

Paper type
Full paper

DOI: 10.5281/zenodo.851099
Zenodo URL: https://zenodo.org/record/851099


2015.42
Multichannel Composition Using State-space Models and Sonification
Soria Luz, Rosalía   The University of Manchester; Manchester, United Kingdom

Abstract
This paper investigates the use of state-space models and real-time sonification as a tool for electroacoustic composition. State-space models provide mathematical representations of physical systems, making it possible to capture the behaviour of a real-life system in a matrix-vector equation. This representation provides a vector containing the so-called states of the system, describing how the system evolves over time. This paper shows different sonifications of state-space models and ways of using them in multichannel electroacoustic composition. Even though conventional sound synthesis techniques are used for sonification, very peculiar timbres and effects can be generated when sonifying state-space models. The paper presents an inverted pendulum, a mass-spring-damper system, and a harmonic oscillator, implemented in SuperCollider, together with different real-time multichannel sonification approaches and ways of using them in electroacoustic composition.
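
A minimal sketch, assuming a forward-Euler discretisation and arbitrary parameter values, of the kind of mass-spring-damper state-space model the paper sonifies (the actual implementation is in SuperCollider):

import numpy as np

# Mass-spring-damper in state-space form, x' = A x + B u, with state
# x = [position, velocity]; the state trajectory can then be mapped to
# synthesis parameters (e.g. pitch deviation and amplitude).
m, k, c = 1.0, 40.0, 0.5           # mass, stiffness, damping (assumed values)
A = np.array([[0.0, 1.0],
              [-k / m, -c / m]])
B = np.array([0.0, 1.0 / m])
dt = 0.001                          # sampling period Ts (assumed)

def step(x, u):
    """Advance the state one sample: x[n+1] = x[n] + Ts * (A x[n] + B u[n])."""
    return x + dt * (A @ x + B * u)

x = np.array([1.0, 0.0])            # released from position 1, at rest
trajectory = []
for n in range(5000):
    x = step(x, 0.0)                # unforced response
    trajectory.append(x[0])         # position, e.g. mapped to pitch deviation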

Keywords
Interactive System, Inverted Pendulum, Multichannel, Sonification, Spring Mass Damper, State Space Models

Paper topics
Auditory displays and data sonification, Interactive performance systems, Interfaces for sound and music, Models for sound analysis and synthesis, Spatial audio

Easychair keyphrases
real time [21], state space [21], mass spring damper system [18], state space model [17], inverted pendulum [10], sound synthesis [9], harmonic oscillator [7], system behaviour [6], time paradox [6], electroacoustic composition [5], sound transformation [5], mathematical model [4], model input value [4], multichannel sonification [4], sampling period ts [4], sonified state space model [4], state space form [4], state vector [4], stereo sonification [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851135
Zenodo URL: https://zenodo.org/record/851135


2015.43
MULTI-CHANNEL SPATIAL SONIFICATION OF CHINOOK SALMON MIGRATION PATTERNS IN THE SNAKE RIVER WATERSHED
Robertson, Ben   Eastern Washington University; Cheney, United States
Middleton, Jonathan   Eastern Washington University; Cheney, United States
Hegg, Jens   University of Idaho; Moscow, United States

Abstract
Spatialization, pitch assignment, and timbral variation are three methods that can improve the perception of complex data in both artistic and analytical contexts. This multi-modal approach to sonification has been applied to fish movement data with the dual goals of providing an aural representation for an artistic sound installation and a qualitative data analysis tool useful to scientists studying fish movement. Using field data collected from three wild Chinook Salmon (Oncorhynchus tshawytscha) living in the Snake River Watershed, this paper demonstrates how sonification offers new perspectives for interpreting migration patterns and the impact of environmental factors on the life cycle of this species. Within this model, audio synthesis parameters guiding spatialization, microtonal pitch organization, and temporal structure are assigned to streams of data through software developed by Ben Luca Robertson. Guidance for the project has been provided by Dr. Jonathan Middleton of Eastern Washington University, while collection and interpretation of field data were performed by University of Idaho Water Resources Program Ph.D. candidate Jens Hegg.

Keywords
auditory display, microtones, salmon, sonification, spatialization

Paper topics
Auditory displays and data sonification

Easychair keyphrases
auditory display [12], strontium isotope [11], chemical signature [10], pacific ocean [10], water resource program [9], strontium isotope signature [7], marine environment [6], migration pattern [6], pitch assignment [6], strontium isotopic ratio [6], maturation period [5], idaho water resource [4], maternal signature [4], mean value [4], otolith sample [4], snake river [4], timbral variation [4], water system [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851027
Zenodo URL: https://zenodo.org/record/851027


2015.44
MUSE: a Music-making Sandbox Environment for Real-Time Collaborative Play
Popa, Iulius A.T.   University of Calgary; Calgary, Canada
Boyd, Jeffrey Edwin   University of Calgary; Calgary, Canada
Eagle, David   University of Calgary; Calgary, Canada

Abstract
This paper reports the concept, design, and prototyping of MUSE, a real-time, turn-based, collaborative music-making game for users with little to no formal music education. MUSE is a proof-of-concept web application running exclusively in the Chrome web browser for four players using gamepad controllers. First, we outline the proposed methodology with respect to related research and discuss our approach to designing MUSE through a partial gamification of music using a player-centric design framework. Second, we explain the implementation and prototyping of MUSE. Third, we highlight recent observations of participants using our proof-of-concept application during a short art/installation gallery exhibition. In conclusion, we reflect on our design methodology based on the informal user feedback we received and look at several approaches to improving MUSE.

Keywords
collaborative music, interactive music, music gamification, music sandbox

Paper topics
Interactive performance systems, Interfaces for sound and music, Perception and cognition of sound and music, Social interaction in sound and music computing, Sonic interaction design

Easychair keyphrases
real time [23], long term engagement [14], musical toy [11], serious musical instrument [11], musical output [9], game system [8], musical result [8], emotional response [7], end turn [6], game mechanic [6], game rule [6], motivational affordance [6], overall pleasant musical output [6], real time collaborative music [6], collaborative music [5], creative freedom [5], designing muse [5], music making [5], music output [5], passive player [5], provide user [5], chrome web browser [4], game component [4], game design element [4], low level musical control [4], musical concept [4], musical instrument [4], player block [4], real time pleasant musical output [4], web audio api [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851159
Zenodo URL: https://zenodo.org/record/851159


2015.45
MUSICALLY INFORMED SONIFICATION FOR SELF-DIRECTED CHRONIC PAIN PHYSICAL REHABILITATION
Newbold, Joseph   University College London; London, United Kingdom
Bianchi-Berthouze, Nadia   University College London; London, United Kingdom
Gold, Nicolas E.   University College London; London, United Kingdom
Williams, Amanda   University College London; London, United Kingdom

Abstract
Chronic pain is pain that persists past the expected time of healing. Unlike acute pain, chronic pain is often no longer a sign of damage and may never disappear. Remaining physically active is very important for people with chronic pain, but in the presence of such persistent pain it can be hard to maintain a good level of physical activity due to factors such as fear of pain or re-injury. This paper introduces a sonification methodology which makes use of characteristics and structural elements of Western tonal music to highlight and mark aspects of movement and breathing that are important for building confidence in people's body capability, in a way that is easy to attend to and devoid of pain. The design framework and an initial conceptual design, which use musical elements such as melody, harmony, texture and rhythm to improve the efficiency of sonification for supporting physical activity in people with chronic pain, are presented and discussed. In particular, we discuss how such structured sonification can be used to facilitate movement and breathing during physical rehabilitation exercises that tend to cause anxiety in people with chronic pain. Experiments are currently being undertaken to investigate the use of these musical elements in sonification for chronic pain.

Keywords
Chronic pain, Implicit music understanding, Musically-informed, Physical rehabilitation, Sonification

Paper topics
Auditory displays and data sonification

Easychair keyphrases
chronic pain [15], physical activity [14], musically informed sonification [12], musical element [7], physical rehabilitation [7], exercise space [6], maximum target point [6], self efficacy [6], western tonal music [6], minimum amount [5], musical stability [4], musical training [4], provide information [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851111
Zenodo URL: https://zenodo.org/record/851111


2015.46
Music Content Driven Automated Choreography with Beat-wise Motion Connectivity Constraints
Fukayama, Satoru   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Goto, Masataka   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan

Abstract
We propose a novel method for generating choreographies driven by music content analysis. Although a considerable amount of research has been conducted in this field, a way to leverage various music features or music content in automated choreography has not been proposed. Previous methods suffer from a limitation in which they often generate motions giving the impression of randomness and lacking context. In this research, we first discuss what types of music content information can be used in automated choreography and then argue that creating choreography that reflects this music content requires novel beat-wise motion connectivity constraints. Finally, we propose a probabilistic framework for generating choreography that satisfies both music content and motion connectivity constraints. The evaluation indicates that the choreographies generated by our proposed method were chosen as having more realistic dance motion than those generated without the constraints.

Keywords
automated choreography, computer graphics, data driven, music analysis, probabilistic modeling

Paper topics
Multimodality in sound and music computing, Music and robotics

Easychair keyphrases
motion connectivity constraint [28], dance motion [22], motion fragment [22], musical constraint [21], music content [13], automated choreography [12], motion connectivity [10], cross entropy [8], chord label [7], generate choreography [7], music constraint [7], generating choreography [6], music content analysis [6], probabilistic model [6], various music feature [6], acoustic feature [5], beat location [4], hierarchical structure [4], kernel function [4], measure boundary [4], motion connectivity constrained choreography [4], motion database [4], musical feature [4], probabilistic framework [4], structural segmentation [4], structure label [4], subjective evaluation [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851119
Zenodo URL: https://zenodo.org/record/851119


2015.47
MUSICMEAN: FUSION-BASED MUSIC GENERATION
Hirai, Tatsunori   Waseda University; Tokyo, Japan
Sasaki, Shoto   CREST, JST, Waseda University; Tokyo, Japan
Morishima, Shigeo   Waseda Research Institute for Science and Engineering, Waseda University; Tokyo, Japan

Abstract
In this paper, we propose MusicMean, a system that fuses existing songs to create an in-between song, such as an average song, by calculating the average acoustic frequency of musical notes and the occurrence frequency of drum elements across multiple MIDI songs. We generate the in-between song for generative music by defining rules based on simple music theory. The system enables the interactive generation of in-between songs, representing a new form of interaction between humans and digital content. Using MusicMean, users can create personalized songs by fusing their favorite songs.
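
A minimal sketch of the note-averaging idea under an assumed linear blend of MIDI pitches (equivalent to averaging frequency on a log scale); MusicMean additionally applies music-theory rules and averages drum-pattern histograms:

def blend_pitch(midi_a, midi_b, blend_rate):
    """Blend two MIDI pitches with a user-specified blend rate in [0, 1];
    blend_rate = 0.5 gives an 'average' note."""
    return (1.0 - blend_rate) * midi_a + blend_rate * midi_b

def midi_to_hz(m):
    """Convert a (possibly fractional) MIDI note number to frequency in Hz."""
    return 440.0 * 2 ** ((m - 69) / 12)

print(midi_to_hz(blend_pitch(60, 64, 0.5)))   # a note halfway between C4 and E4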

Keywords
Average song, Interactive music generation, Song morphing

Paper topics
Computer environments for sound/music processing, Interfaces for sound and music

Easychair keyphrases
blend rate [15], averaging operation [13], musical note [11], average song [10], drum pattern [10], existing song [7], drum pattern histogram [6], musical bar [6], music generation [6], music theory [6], midi file [5], musical key [5], average note [4], mashup music video [4], musical note averaging operation [4], music video generation [4], statistical model [4], user specified blend rate [4], video generation system [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851071
Zenodo URL: https://zenodo.org/record/851071


2015.48
Music Synthesis based on Impression and Emotion of Input Narratives
Kanno, Saya   Ochanomizu University; Tokyo, Japan
Itoh, Takayuki   Ochanomizu University; Tokyo, Japan
Takamura, Hiroya   The University of Tokyo; Tokyo, Japan

Abstract
This paper presents a technique to synthesize music based on the impression and emotion of input narratives. The technique prepares a dictionary which records the sensibility polarity values of arbitrary words. It also supposes that users listen to sample chords and rhythms and input fitness values for pre-defined impression word pairs, so that the technique can learn the relations between chords/rhythms and these impressions. After these preparation steps, the technique interactively synthesizes music for input narratives: it estimates the fitness values of the narrative to the impression word pairs by applying the dictionary, and then selects the chord and rhythm progressions whose impressions and emotions are closest to the input narrative. Finally, the technique synthesizes the output tune by combining the chord and rhythm. We expect this technique to encourage the expression of the impression and emotion of input narratives through generated music.

Keywords
Document analysis, Learning of user's sensibility, Music synthesis

Paper topics
Multimodality in sound and music computing, Music performance analysis and rendering, Perception and cognition of sound and music

Easychair keyphrases
impression word [33], musical feature [25], fitness value [24], rhythm progression [10], sample chord [8], brassy simple [7], light heavy [7], preliminary data construction [7], th impression word [7], bright dark [6], energetic calm musical feature [6], musical feature value [6], document analysis [5], enjoyable wistful [5], music synthesis [5], chord progression [4], dark enjoyable wistful tripping [4], energetic calm [4], fitness value vector [4], minor seventh chord impression [4], point likert scale [4], semantic orientation calculation technique [4], seventh chord impression word [4], user prepared musical pattern [4], user sensibility [4], wistful tripping quiet energetic [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851101
Zenodo URL: https://zenodo.org/record/851101


2015.49
Navigating the Mix-space: Theoretical and Practical Level-balancing Technique in Multitrack Music Mixtures
Wilson, Alex   University of Salford; Salford, United Kingdom
Fazenda, Bruno   University of Salford; Salford, United Kingdom

Abstract
The mixing of audio signals has been at the foundation of audio production since the advent of electrical recording in the 1920s, yet the mathematical and psychological bases for this activity are relatively under-studied. This paper investigates how the process of mixing music is conducted. We introduce a method of transformation from a "gain-space" to a "mix-space", using a novel representation of the individual track gains. An experiment is conducted in order to obtain time-series data of mix engineers' exploration of this space as they balance levels within a multi-track session to create their desired mixture. It is observed that, while the exploration of the space is influenced by the initial configuration of track gains, there is agreement between individuals on the appropriate gain settings required to create a balanced mixture. Implications for the design of intelligent music production systems are discussed.

Keywords
Intelligent mixing systems, Mix-engineering, Music production, Subjective audio evaluation

Paper topics
Computational musicology and Mathematical Music Theory, Interfaces for sound and music, Perception and cognition of sound and music

Easychair keyphrases
mix space [28], mix engineer [9], audio engineering society convention [8], final mix [8], backing track [6], fader control [6], source position [6], relative loudness [5], track gain [5], audio engineering society [4], audio signal [4], dynamic range compression [4], final mix position [4], gain space [4], intelligent mixing system [4], intelligent music production system [4], level balancing [4], mix velocity [4], multitrack session [4], probability density function [4], rhythm section [4], source directivity [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851115
Zenodo URL: https://zenodo.org/record/851115


2015.50
Non-negative Sparse Decomposition of Musical Signal Using Pre-trained Dictionary of Feature Vectors of Possible Tones from Different Instruments
Nomura, Ryo   Hiroshima University; Japan
Kurita, Takio   Hiroshima University; Japan

Abstract
Decomposition of a music signal into the signals of the individual instruments is a fundamental task in music signal processing. This paper proposes a decomposition algorithm for music signals based on non-negative sparse estimation. We estimate the coefficients of a linear combination by assuming that the feature vector of the given music signal can be approximated as a linear combination of the elements of a pre-trained dictionary. Since a music sound can be considered a mixture of tones from several instruments and only a few tones appear at the same time, the coefficients must be non-negative and sparse if the music signals are represented by non-negative vectors. In this paper we use a feature vector based on auto-correlation functions. The experimental results show that the proposed decomposition method can accurately estimate the tone sequence from a music sound played using two instruments.
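
For illustration, a small Python sketch of non-negative sparse coefficient estimation by projected gradient descent (assumed penalty and step size, not the authors' solver), together with an auto-correlation feature of the kind the abstract describes:

import numpy as np

def nonneg_sparse_code(D, x, lam=0.1, lr=1e-3, n_iter=2000):
    """Estimate non-negative, sparse coefficients a such that D @ a ~ x,
    by projected gradient descent on 0.5*||x - D a||^2 + lam * sum(a)."""
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x) + lam      # gradient plus L1 penalty term
        a = np.maximum(0.0, a - lr * grad)  # project onto the non-negative orthant
    return a

def autocorr_feature(frame, n_lags):
    """Normalised auto-correlation feature of a signal frame (the paper's
    dictionary is built from such features of single tones)."""
    s = frame - np.mean(frame)
    ac = np.correlate(s, s, mode="full")[len(s) - 1:len(s) - 1 + n_lags]
    return ac / (ac[0] + 1e-12)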

Keywords
Auto-correlation functions, Decomposition of music signal, Dictionary learning, Non-negative sparse coding

Paper topics
Music information retrieval, Sound/music signal processing algorithms

Easychair keyphrases
auto correlation function [30], auto correlation [23], musical signal [23], non negative sparse coding [18], alto saxophone [16], music sound [14], sampling rate [14], individual instrument [12], non negative normalized auto [12], linear combination [11], negative sparse [11], cross correlation [10], dictionary matrix [10], non negative matrix factorization [8], normalized auto correlation [8], normalized auto correlation function [8], feature vector [7], midi sound source [7], negative normalized auto correlation [7], pre trained dictionary [7], alto saxophone part [6], decomposition algorithm [6], negative normalized auto correlation vector [6], non negative matrix [6], nonnegative matrix factorization [6], non negative sparse coefficient [6], sound signal [6], contrabass part [5], estimated coefficient [5], non negative coefficient [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851039
Zenodo URL: https://zenodo.org/record/851039


2015.51
ONLINE HARMONIC/PERCUSSIVE SEPARATION USING SMOOTHNESS/SPARSENESS CONSTRAINTS
Cañadas-Quesada, Francisco Jesús   Universidad de Jaén; Jaén, Spain
Vera-Candeas, Pedro   Universidad de Jaén; Jaén, Spain
Ruiz-Reyes, Nicolas   Universidad de Jaén; Jaén, Spain
Alonso-Jorda, Pedro   Universitat Politècnica de València; Valencia, Spain
Ranilla-Pastor, José   Universidad de Oviedo; Oviedo, Spain

Abstract
The separation of percussive sounds from harmonic sounds in audio recordings remains a challenging task that has received much attention over the last decade. In a previous work, we described a method to separate harmonic and percussive sounds based on a constrained Non-negative Matrix Factorization (NMF) approach. The approach distinguishes between percussive and harmonic bases by integrating percussive and harmonic sound features, such as smoothness and sparseness, into the decomposition process. In this paper, we propose an online version of our previous work. Instead of decomposing the whole mixture, the online proposal decomposes a set of segments of the mixture selected by a sliding temporal window. Both the percussive and harmonic bases of the next segment are initialized using the bases obtained in the decomposition of the previous segment. Results show that the online proposal can provide satisfactory separation performance, but the sound quality of the separated signals is inversely related to the latency of the system.
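
A minimal sketch of the online scheme using standard Euclidean multiplicative NMF updates; it omits the smoothness/sparseness constraints that distinguish harmonic from percussive bases and uses assumed window sizes:

import numpy as np

def nmf(V, W_init, H_init, n_iter=100):
    """Plain Euclidean multiplicative-update NMF (no constraints)."""
    W, H = W_init.copy(), H_init.copy()
    eps = 1e-12
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def online_decomposition(spectrogram, n_bases, win=64, hop=64):
    """Process the magnitude spectrogram in sliding windows, warm-starting
    each window's bases from the previous segment's solution."""
    n_bins, n_frames = spectrogram.shape
    rng = np.random.default_rng(0)
    W = rng.random((n_bins, n_bases))
    segments = []
    for start in range(0, n_frames - win + 1, hop):
        V = spectrogram[:, start:start + win]
        H = rng.random((n_bases, win))
        W, H = nmf(V, W, H)            # carrying W over is the warm start
        segments.append(W @ H)
    return segments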

Keywords
Constraints, Harmonic/Percussive separation, Latency, Non-negative Matrix Factorization, Online, Signal to Distortion Ratio (SDR), Signal to Interference Ratio (SIR), Smoothness, Sound source separation, Sparseness

Paper topics
Content processing of music audio signals, Music information retrieval, Sound/music signal processing algorithms

Easychair keyphrases
harmonic sound [26], online proposal [25], percussive sound [13], non negative matrix factorization [12], harmonic base [11], computation time [10], offline method [9], method online [8], percussive separation [7], separation performance [7], whole mixture [7], harmonic sound separation method [6], separated percussive signal [6], cost function [5], minimum local [5], next segment [5], proposal online [5], sir result [5], harmonic signal [4], language processing [4], magnitude spectrogram [4], matrix factorization [4], offline harmonic [4], percussive separation work [4], sliding window [4], source separation [4], whole mixture signal [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851017
Zenodo URL: https://zenodo.org/record/851017


2015.52
On the Musical Opportunities of Cylindrical Hexagonal Lattices: Mapping Flat Isomorphisms Onto Nanotube Structures
Hu, Hanlin   Department of Computer Science, University of Regina; Regina, Canada
Park, Brett   University of Regina; Regina, Canada
Gerhard, David   University of Regina; Regina, Canada

Abstract
It is possible to position equal-tempered discrete notes on a flat hexagonal grid in such a way as to allow musical constructs (chords, intervals, melodies, etc.) to take on the same shape regardless of the tonic. This is known as a musical isomorphism, and it has been shown to have advantages in composition, performance, and learning. Considering the utility and interest of such layouts, an extension into 3D interactions was sought, focussing on cylindrical hexagonal lattices, which have been extensively studied in the context of carbon nanotubes. In this paper, we explore the notation of this class of cylindrical hexagonal lattices and develop a process for mapping a flat hexagonal isomorphism onto such a lattice. This mapping references and draws upon previous explorations of the helical and cyclical nature of Western musical harmony.
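
For illustration, a tiny Python sketch of a flat hexagonal isomorphism (here with Wicki-Hayden-style axis intervals), showing that a chord shape is invariant under translation; wrapping such a layout onto a cylindrical lattice via a chiral vector is the paper's contribution and is not reproduced here:

def isomorphic_pitch(i, j, base=60, axis_a=2, axis_b=7):
    """Pitch at hex-grid coordinate (i, j) of a flat isomorphic layout:
    each step along one axis adds axis_a semitones and each step along
    the other adds axis_b.  axis_a=2, axis_b=7 corresponds to the
    Wicki-Hayden layout, so the same shape denotes the same chord type
    anywhere on the grid."""
    return base + axis_a * i + axis_b * j

# a major-triad shape keeps its quality under translation (transposition)
shape = [(0, 0), (2, 0), (0, 1)]                 # root, major third, fifth
print([isomorphic_pitch(i, j) for i, j in shape])        # C major: 60, 64, 67
print([isomorphic_pitch(i + 1, j) for i, j in shape])    # D major: 62, 66, 69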

Keywords
harmonic theory, hexagonal lattice, isomorphic layout, musical controller design, tonnetz, wicki-hayden

Paper topics
Computational musicology and Mathematical Music Theory, Interactive performance systems, Interfaces for sound and music

Easychair keyphrases
isomorphic layout [21], chiral vector [13], cylindrical hexagonal lattice [12], isotone axis [10], chiral angle [9], hexagonal lattice [9], carbon nanotube [7], cylindrical hexagonal [7], cylindrical hexagonal tube [7], hexagonal grid [6], pitch axis [5], boundary shape [4], chiral vector direction [4], dashed green line [4], harmonic table [4], musical isomorphism [4], tone equal [4], typical isomorphic layout [4], whole number [4], wicki hayden layout [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851085
Zenodo URL: https://zenodo.org/record/851085


2015.53
Perceiving and Predicting Expressive Rhythm with Recurrent Neural Networks
Lambert, Andrew   City University London; London, United Kingdom
Weyde, Tillman   City University London; London, United Kingdom
Armstrong, Newton   City University London; London, United Kingdom

Abstract
Automatically following rhythms by beat tracking is by no means a solved problem, especially when dealing with varying tempo and expressive timing. This paper presents a connectionist machine learning approach to expressive rhythm prediction, based on cognitive and neurological models. We detail a multi-layered recurrent neural network combining two complementary network models as hidden layers within one system. The first layer is a Gradient Frequency Neural Network (GFNN), a network of nonlinear oscillators which acts as an entraining and learning resonant filter to an audio signal. The GFNN resonances are used as inputs to a second layer, a Long Short-term Memory Recurrent Neural Network (LSTM). The LSTM learns the long-term temporal structures present in the GFNN's output, the metrical structure implicit within it. From these inferences, the LSTM predicts when the next rhythmic event is likely to occur. We train the system on a dataset selected for its expressive timing qualities and evaluate the system on its ability to predict rhythmic events. We show that our GFNN-LSTM model performs as well as state-of-the-art beat trackers and has the potential to be used in real-time interactive systems, following and generating expressive rhythmic structures.

Keywords
Audio Signal Processing, Expressive Timing, Gradient Frequency Neural Networks, Machine Learning, Metre Perception, Music Information Retreival, Recurrent Neural Networks, Rhythm Prediction

Paper topics
Computer environments for sound/music processing, Interactive performance systems, Music information retrieval, Music performance analysis and rendering, Perception and cognition of sound and music, Sound/music signal processing algorithms

Easychair keyphrases
neural network [19], beat tracking [15], gradient frequency neural network [12], rhythmic event [11], recurrent neural network [9], expressive timing [8], hebbian learning [8], audio data [7], city university london [6], connectivity matrix [6], mean field [6], music information retrieval [6], rhythm prediction [6], short term memory [6], audio signal [5], metrical structure [5], rhythmic structure [5], beat induction [4], hierarchical metrical structure [4], integer ratio [4], long term structure [4], mean field network [4], mid level representation [4], neural network model [4], online initonline initonline lstm [4], online online initonline initonline [4], onset detection function [4], real time [4], standard deviation [4], th international society [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851063
Zenodo URL: https://zenodo.org/record/851063


2015.54
Pop Music Visualization Based on Acoustic Features and Chord Progression Patterns Applying Dual Scatterplots
Uehara, Misa   Ochanomizu University; Tokyo, Japan
Itoh, Takayuki   Ochanomizu University; Tokyo, Japan

Abstract
Visualization is an extremely useful tool for understanding the similarity of impressions among a large number of tunes, or the relationships of individual characteristics among artists, effectively and in a short time. We expect chord progressions to be beneficial, in addition to acoustic features, for understanding the relationships among tunes; however, there have been few studies on the visualization of music collections with chord progression data. In this paper, we present a technique for the integrated visualization of chord progressions, meta information and acoustic features in large collections of tunes. The technique first calculates the acoustic feature values of the given set of tunes. At the same time, it collates typical chord progression patterns from the chord progressions of the tunes, given as sequences of characters, and records which patterns are used in which tunes. Our implementation visualizes the above information using dual scatterplots, where one scatterplot arranges tunes based on their acoustic features, and the other shows co-occurrences between chord progressions and meta information. In this paper, we present an experiment with tunes by 20 Japanese pop musicians using our visualization technique.

Keywords
acoustic feature, chord progression, information visualization, music recommendation

Paper topics
Interfaces for sound and music, Music information retrieval

Easychair keyphrases
chord progression pattern [44], meta information [39], chord progression [35], acoustic feature [32], progression pattern [17], meta information value [11], presented visualization technique [9], visualization technique [9], typical chord progression pattern [8], acoustic feature value [7], drag operation [7], music information retrieval [7], selected meta information [7], artist name [6], pop music [6], visualization result [6], selected dot [5], correlated meta information [4], japanese pop [4], music visualization [4], progression pattern matching [4], similar chord progression pattern [4], th meta information [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851029
Zenodo URL: https://zenodo.org/record/851029


2015.55
Psychoacoustic Impact Assessment of Smoothed AM/FM Resonance Signals
Goulart, Antonio   University of São Paulo (USP); São Paulo, Brazil
Timoney, Joseph   National University of Ireland Maynooth; Maynooth, Ireland
Queiroz, Marcelo   University of São Paulo (USP); São Paulo, Brazil
Lazzarini, Victor   National University of Ireland Maynooth; Maynooth, Ireland

Abstract
In this work we decompose analog musical resonant waveforms into their instantaneous frequency and amplitude envelope, and then smooth these signals before resynthesis. The psychoacoustic impact is evaluated in terms of dynamic brightness, tristimulus and spectrum irregularity. Signals with different amounts of resonance were analysed, and different types and lengths of smoothers were tested. Experiments were carried out with amplitude smoothing only, frequency smoothing only, and simultaneous smoothing of the amplitude and frequency signals. We draw conclusions relating the explored parameters to the results, which match the sounds produced with the technique.
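
A minimal version of the analysis-smoothing-resynthesis chain can be written around the Hilbert transform. The sketch below is an assumption about the general approach only: it uses a plain moving-average smoother rather than the specific smoother types and lengths compared in the paper.

```python
import numpy as np
from scipy.signal import hilbert

def smooth(x, length):
    """Simple moving-average smoother (stand-in for the smoothers studied)."""
    return np.convolve(x, np.ones(length) / length, mode="same")

def am_fm_smooth_resynth(x, fs, amp_len=64, freq_len=64):
    analytic = hilbert(x)
    amp = np.abs(analytic)                                  # amplitude envelope
    phase = np.unwrap(np.angle(analytic))
    inst_freq = np.diff(phase) * fs / (2 * np.pi)           # instantaneous frequency (Hz)
    amp_s = smooth(amp[:-1], amp_len)
    freq_s = smooth(inst_freq, freq_len)
    new_phase = 2 * np.pi * np.cumsum(freq_s) / fs          # reintegrate the frequency track
    return amp_s * np.cos(new_phase)                        # resynthesised waveform
```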

Keywords
AM-FM analysis resynthesis, analysis smoothing resynthesis, psychoacoustic impact of dafx

Paper topics
Models for sound analysis and synthesis, Perception and cognition of sound and music

Easychair keyphrases
instantaneous frequency [8], brightness value [6], musical instrument [6], env mod [5], order smoother [5], psychoacoustic metric [5], signal processing [5], tristimulus triangle [5], audio engineering society convention [4], frequency modulation [4], harmonics finding process [4], higher order [4], irregularity value [4], resonant waveform [4], waveform c figure [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851145
Zenodo URL: https://zenodo.org/record/851145


2015.56
Rendering and Subjective Evaluation of Real vs. Synthetic Vibrotactile Cues on a Digital Piano Keyboard
Fontana, Federico   Università di Udine; Udine, Italy
Avanzini, Federico   Università di Padova; Padova, Italy
Järveläinen, Hanna   Zurich University of the Arts (ZHdK); Zurich, Switzerland
Papetti, Stefano   Zurich University of the Arts (ZHdK); Zurich, Switzerland
Klauer, Giorgio   Conservatorio di musica “Cesare Pollini” di Padova; Padova, Italy
Malavolta, Lorenzo   Conservatorio di musica “Cesare Pollini” di Padova; Padova, Italy

Abstract
The perceived properties of a digital piano keyboard were studied in two experiments involving different types of vibrotactile cues in connection with sonic feedback. The first experiment implemented a free playing task in which subjects had to rate the perceived quality of the instrument according to five attributes: Dynamic control, Richness, Engagement, Naturalness, and General preference. The second experiment measured performance in timing and dynamic control in a scale playing task. While the vibrating condition was preferred over the standard non-vibrating setup in terms of perceived quality, no significant differences were observed in timing and dynamics accuracy. Overall, these results must be considered preliminary to an extension of the experiment involving repeated measurements with more subjects.

Keywords
digital piano, synthetic vibrations, tactile perception

Paper topics
Multimodality in sound and music computing, Perception and cognition of sound and music

Easychair keyphrases
vibrotactile feedback [10], dynamic control [9], general preference [9], vibration condition [9], digital keyboard [8], key vibration [8], digital piano [7], attribute scale [6], j arvel ainen [6], non vibrating standard [6], vibration sample [6], key velocity [5], negative group [5], perceived quality [5], piano synthesizer [5], control engagement richness [4], digital piano keyboard [4], individual consistency [4], positive group [4], significant difference [4], vibrotactile cue [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851015
Zenodo URL: https://zenodo.org/record/851015


2015.57
Sensor and Software Technologies for Lip Pressure Measurements in Trumpet and Cornet Playing - from Lab to Classroom
Großhauser, Tobias   Swiss Federal Institute of Technology in Zurich (ETH Zürich); Zurich, Switzerland

Abstract
Several technologies to measure lip pressure during brass instrument playing have already been developed as prototypes. This paper presents several technological improvements over previous methods and their optimization so that the technique can serve as an "easy to handle" tool in the classroom. It also offers new options for performance science studies capturing intra- and inter-individual variability of playing parameters. The improvements include a wireless sensor setup to measure lip pressure in trumpet and cornet playing and to capture the orientation and motion of the instrument. A lightweight design and simple fixation allow performing with minimal alteration of the playing conditions. Wireless connectivity to mobile devices is introduced for dedicated data logging. The app includes features such as data recording, visualization, real-time feedback, and server connectivity or other data-sharing options. Furthermore, a calibration method for the sensor setup was developed; the results showed a measurement accuracy within 5% deviation and a measurement range from 0.6 N up to a peak load of 70 N. A pilot study with 9 participants (beginners, advanced students and a professional player) confirmed its practical usability. Integrating these real-time data visualizations into daily teaching and practicing could be the next small step. Lip pressure forces are crucial for all brass instruments, and especially critical when playing in the upper register. Small changes to the fitting permit the use of the sensor with all brass instruments.

Keywords
app, biofeedback, brass, cornet, lip pressure, real-time, trumpet

Paper topics
Music performance analysis and rendering

Easychair keyphrases
lip pressure [17], sensor module [11], real time feedback [9], real time [6], professional player [5], sensor setup [5], brass instrument playing [4], data logging [4], dof imu [4], lip pressure measurement [4], miniature load cell [4], piston valve trumpet [4], real time data visualization [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851105
Zenodo URL: https://zenodo.org/record/851105


2015.58
Sensors2OSC
Deusany de Carvalho Junior, Antonio   University of São Paulo (USP); São Paulo, Brazil
Mayer, Thomas   Residuum, Independent; Germany

Abstract
In this paper we present an application that can send events from any sensor available on an Android device via OSC, using unicast or multicast network communication. Sensors2OSC permits the user to activate and deactivate any sensor at runtime and is forward compatible with any new sensor that may become available, without requiring an application upgrade. The sensor rate can be changed from the slowest to the fastest setting, and the user can configure any IP address and port to which the OSC messages are sent. The application is described in detail, together with a discussion of the limitations of Android devices and the advantages of this application compared with the many others available on the market.
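
On the receiving side, the forwarded messages can be handled with any OSC library. The sketch below uses python-osc to listen for sensor messages; the address pattern, argument layout and port are illustrative assumptions, not necessarily those emitted by the app.

```python
from pythonosc import dispatcher, osc_server

def on_accelerometer(address, *values):
    # values would be the x, y, z readings forwarded by the app (assumed layout)
    print(address, values)

disp = dispatcher.Dispatcher()
disp.map("/accelerometer", on_accelerometer)   # assumed address pattern
disp.set_default_handler(lambda addr, *args: print("other sensor:", addr, args))

# Listen on the IP/port configured in the app (port 9000 is an arbitrary example).
server = osc_server.BlockingOSCUDPServer(("0.0.0.0", 9000), disp)
server.serve_forever()
```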

Keywords
android, interaction, ipv6, mobile, multicast, osc, unicast

Paper topics
Interactive performance systems, Sonic interaction design

Easychair keyphrases
mobile device [12], android device [11], sensor rate [10], android api [9], sensor available [9], osc message [8], multicast address [6], sensor value [6], available sensor [5], main screen [5], port number [5], forward compatibility [4], mobile application development [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851161
Zenodo URL: https://zenodo.org/record/851161


2015.59
Smooth Granular Sound Texture Synthesis by Control of Timbral Similarity
Schwarz, Diemo   STMS, Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
O'Leary, Sean   National University of Ireland Maynooth; Maynooth, Ireland

Abstract
Granular methods to synthesise environmental sound textures (e.g. rain, wind, fire, traffic, crowds) preserve the richness and nuances of actual recordings, but need a preselection of timbrally stable source excerpts to avoid unnatural-sounding jumps in sound character. To overcome this limitation, we add a description of the timbral content of each sound grain in order to choose successive grains from similar regions of the timbre space. We define two different timbre similarity measures, one based on perceptual sound descriptors and one based on MFCCs. A listening test compared these two distances to an unconstrained random grain choice as a baseline and showed that the descriptor-based distance was rated as most natural, the MFCC-based distance generally as less natural, and the random selection always worst.
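
The MFCC-based variant can be sketched roughly as follows (an illustrative reconstruction, not the authors' code): each grain is summarised by its mean MFCC vector, and the next grain is drawn at random from the k grains timbrally closest to the one just played.

```python
import numpy as np
import librosa

def grain_mfccs(y, sr, grain_len):
    """Mean MFCC vector per fixed-length grain."""
    grains = [y[i:i + grain_len] for i in range(0, len(y) - grain_len, grain_len)]
    return grains, np.array([librosa.feature.mfcc(y=g, sr=sr, n_mfcc=13).mean(axis=1)
                             for g in grains])

def timbre_constrained_sequence(feats, n_steps, k=8, rng=np.random.default_rng()):
    """Random walk over grains restricted to the k timbrally nearest neighbours."""
    idx = [rng.integers(len(feats))]
    for _ in range(n_steps - 1):
        d = np.linalg.norm(feats - feats[idx[-1]], axis=1)
        candidates = np.argsort(d)[1:k + 1]        # exclude the current grain itself
        idx.append(rng.choice(candidates))
    return idx

y, sr = librosa.load("texture.wav", sr=None)       # e.g. a rain or wind recording (assumed file)
grains, feats = grain_mfccs(y, sr, grain_len=4096)
order = timbre_constrained_sequence(feats, n_steps=200)
output = np.concatenate([grains[i] for i in order])
```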

Keywords
concatenative synthesis, corpus-based synthesis, granular synthesis, sound descriptors, sound texture synthesis

Paper topics
Sound and music for VR and games, Sound/music signal processing algorithms

Easychair keyphrases
sound texture [21], sound texture synthesis [15], crowd water faucet [7], desert wind stadium [7], lapping wave desert [7], stadium crowd water [7], traffic jam baby [7], water faucet formula [7], wave desert wind [7], wind stadium crowd [7], diemo schwarz [6], environmental sound texture [6], naturalness rating [6], texture synthesis [6], sound designer [5], baby total orig descr [4], corpus based concatenative synthesis [4], descriptor based similarity measure [4], mfcc based distance [4], scaled naturalness rating [4], signal processing [4], timbral similarity [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851125
Zenodo URL: https://zenodo.org/record/851125


2015.60
Songrium: Browsing and Listening Environment for Music Content Creation Community
Hamasaki, Masahiro   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Goto, Masataka   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Nakano, Tomoyasu   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan

Abstract
This paper describes a music browsing assistance service, Songrium (http://songrium.jp), that enables visualization and exploration of massive user-generated music content with the aim of enhancing user experiences in enjoying music. Such massive user-generated content has yielded "web-native music", which we defined as musical pieces that are published, shared, and remixed (have derivative works created) entirely on the web. Songrium has two interfaces for browsing and listening to web-native music from the viewpoints of scale and time: Songrium3D for gaining community-scale awareness and Interactive History Player for gaining community-history awareness. Both of them were developed to stimulate community activities for web-native music by visualizing massive music content spatially or chronologically and by providing interactive enriched experiences. Songrium has analyzed over 680,000 music video clips on the most popular Japanese video-sharing service, Niconico, which includes original songs of web-native music and their derivative works such as covers and dance arrangements. Analyses of more than 120,000 original songs reveal that over 560,000 derivative works have been generated and contributed to enriching massive user-generated music content.

Keywords
interactive system, music visualization, user-generated content, web application, web-native music

Paper topics
Interfaces for sound and music, Music information retrieval

Easychair keyphrases
derivative work [54], music content [29], interactive history player [28], music star map [15], visual effect [15], content creation community [12], music content creation [12], user generated music content [12], web native music [12], video clip [11], hatsune miku [10], browsing assistance service [9], music browsing assistance [9], music structure [7], music video clip [7], native music [7], dimensional space [6], embedded video player [6], video sharing service [6], vocaloid character [6], web native music content [6], music recommendation [5], public event [5], repeated section [5], vocaloid song [5], webnative music [5], crypton future medium [4], music video [4], popular japanese video sharing [4], web service [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851091
Zenodo URL: https://zenodo.org/record/851091


2015.61
Sound My Vision: Real-time video analysis on mobile platforms for controlling multimedia performances
Kreković, Miranda   School of Computer and Communication Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL); Lausanne, Switzerland
Grbac, Franco   Independent; Switzerland
Kreković, Gordan   Faculty of Electrical Engineering and Computing, University of Zagreb; Zagreb, Croatia

Abstract
This paper presents Sound My Vision, an Android application for controlling musical expression and multimedia projects. Unlike other similar applications, which collect data only from sensors and input devices, Sound My Vision also analyses input video in real time and extracts low-level video features. Such a versatile controller can be used in various scenarios, from entertainment and experimentation to live music performances, installations and multimedia projects. The application can replace the complex setups usually required for capturing and analysing a video signal in live performances. Additionally, the mobility of smartphones allows changes of perspective, in the sense that the performer can become either an object or a subject involved in controlling the expression. The most important contributions of this paper are the selection of general, low-level video features and the technical solution for seamless real-time video feature extraction on the Android platform.
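
One representative low-level video feature is the amount of frame-to-frame motion. The sketch below is a generic OpenCV/Python illustration, not the app's Android implementation; it computes a crude motion feature from the camera stream, which would then be forwarded as an OSC control value.

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)                      # default camera
prev = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if prev is not None:
        # Mean absolute frame difference as a crude "amount of motion" feature in [0, 1]
        motion = float(np.mean(cv2.absdiff(gray, prev))) / 255.0
        print(f"motion feature: {motion:.3f}")  # here it would be sent via OSC
    prev = gray
```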

Keywords
mobile application, OSC, sensors, video analysis, video features

Paper topics
Interactive performance systems, Interfaces for sound and music

Easychair keyphrases
moving object [15], video feature [15], mobile device [13], mobile application [10], musical expression [9], real time [9], real time video analysis [8], computer music [7], touch screen [6], international computer [5], multimedia project [5], use case [5], android application [4], android operating system [4], computer vision [4], low level video feature [4], multimedia system [4], real time video [4], seamless real time video [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851081
Zenodo URL: https://zenodo.org/record/851081


2015.62
Synchronizing Spatially Distributed Musical Ensembles
Hadjakos, Aristotelis   Center of Music and Film Informatics, Detmold University of Music; Detmold, Germany
Berndt, Axel   Center of Music and Film Informatics, Detmold University of Music; Detmold, Germany
Waloschek, Simon   Center of Music and Film Informatics, Detmold University of Music; Detmold, Germany

Abstract
Spatially distributed musical ensembles play together while spread out in space, e.g., in a park or in a historic building. Despite the distance between the musicians, they should be able to play together with high synchronicity and realize complex rhythms (as far as the speed of sound permits). In this paper we propose systematic support for such ensembles based on electronic music stands that are synchronized to each other without using a permanent computer network, or any network at all. This makes it possible to perform music for spatially distributed ensembles in places where it is difficult to get access to a computer network, e.g., in parks, historic buildings or big concert venues.

Keywords
Click Track, Electronic Music Stand, Synchronization

Paper topics
Interactive performance systems

Easychair keyphrases
m sync player [28], music stand [21], digital music stand [20], click track [13], distributed button [9], distributed musical ensemble [9], electronic music stand [9], page turning [9], computer music [6], continuous synchronization [6], playback model [6], radio time signal [6], shot synchronization [6], delta time [5], visual cue [5], auditory cue [4], ensemble member [4], low latency audio transmission [4], msync player [4], musical expression [4], music performance [4], sheet music [4], synchronized m sync player [4], tempo change [4], web based click track editor [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851131
Zenodo URL: https://zenodo.org/record/851131


2015.63
SynPy: a Python Toolkit for Syncopation Modelling
Song, Chunyang   Queen Mary University of London; London, United Kingdom
Pearce, Marcus   Queen Mary University of London; London, United Kingdom
Harte, Christopher   University of York; York, United Kingdom

Abstract
In this paper we present SynPy, an open-source software toolkit for quantifying syncopation. It is flexible yet easy to use, providing the first comprehensive set of implementations for seven widely known syncopation models using a simple plugin architecture for extensibility. SynPy is able to process multiple bars of music containing arbitrary rhythm patterns and can accept time-signature and tempo changes within a piece. The toolkit can take input from various sources including text annotations and standard MIDI files. Results can also be output to XML and JSON file formats. This toolkit will be valuable to the computational music analysis community, meeting the needs of a broad range of studies where a quantitative measure of syncopation is required. It facilitates a new degree of comparison for existing syncopation models and also provides a convenient platform for the development and testing of new models.
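
To make the notion of a syncopation measure concrete, the sketch below implements one common reading of the Longuet-Higgins and Lee model (one of the model families the toolkit covers) for a single 4/4 bar on a 16-step grid. It is an independent illustration and does not reproduce SynPy's own API.

```python
# 16-step metrical weights for one 4/4 bar (Longuet-Higgins & Lee style hierarchy).
WEIGHTS = [0, -4, -3, -4, -2, -4, -3, -4, -1, -4, -3, -4, -2, -4, -3, -4]

def lhl_syncopation(pattern):
    """Syncopation of a binary 16-step onset pattern: a note followed by a
    rest on a higher-weight position contributes (rest weight - note weight)."""
    n = len(pattern)
    score = 0
    for i, onset in enumerate(pattern):
        if not onset:
            continue
        # Collect rest positions up to (not including) the next onset, wrapping around.
        j, rests = (i + 1) % n, []
        while not pattern[j] and j != i:
            rests.append(WEIGHTS[j])
            j = (j + 1) % n
        if rests and max(rests) >= WEIGHTS[i]:
            score += max(rests) - WEIGHTS[i]
    return score

son_clave = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
print(lhl_syncopation(son_clave))
```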

Keywords
python, syncopation modelling, toolkit

Paper topics
Computational musicology and Mathematical Music Theory, Models for sound analysis and synthesis

Easychair keyphrases
rhythm pattern [14], syncopation model [13], note sequence [12], velocity sequence [11], standard midi file [9], time signature [7], time span [7], metrical weight [6], music perception [6], syncopation prediction [6], synpy toolkit [6], metrical hierarchy [5], metrical level [5], note duration [5], arbitrary rhythm pattern [4], computational music analysis community [4], json file [4], longuet higgin [4], metrical position [4], open source [4], plugin architecture [4], prediction value [4], quarter note [4], queen mary [4], son clave rhythm [4], syncopation value [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851079
Zenodo URL: https://zenodo.org/record/851079


2015.64
Target-Based Rhythmic Pattern Generation and Variation with Genetic Algorithms
Ó Nuanáin, Cárthach   Pompeu Fabra University (UPF); Barcelona, Spain
Herrera, Perfecto   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Jordà, Sergi   Pompeu Fabra University (UPF); Barcelona, Spain

Abstract
Composing drum patterns and musically developing them through repetition and variation is a typical task in electronic music production. We propose a system that, given an input pattern, automatically creates related patterns using a genetic algorithm. Two distance measures that relate to rhythmic similarity (the Hamming distance and the directed-swap distance) are shown to yield usable fitness functions for the algorithm. A software instrument in the Max for Live environment demonstrates how this can be used in real musical applications. Finally, a user survey was carried out to examine and compare the effectiveness of the fitness metrics in determining rhythmic similarity, as well as the usefulness of the instrument for musical creation.
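
A minimal version of the Hamming-distance variant can be sketched as follows; this is an illustrative reconstruction, not the Max for Live instrument, and the population size, mutation rate and target distance are arbitrary choices. Binary 16-step patterns evolve towards a chosen Hamming distance from the target, yielding patterns that are related to, but not copies of, the input.

```python
import random

STEPS = 16

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def fitness(pattern, target, desired=3):
    # Highest when the pattern lies at the desired Hamming distance from the
    # target: "related, but not identical" (desired=3 is an arbitrary choice).
    return -abs(hamming(pattern, target) - desired)

def evolve(target, pop_size=60, generations=200, p_mut=0.05):
    pop = [[random.randint(0, 1) for _ in range(STEPS)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, target), reverse=True)
        parents = pop[: pop_size // 2]
        children = list(parents)
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, STEPS)                  # one-point crossover
            child = [bit ^ (random.random() < p_mut) for bit in a[:cut] + b[cut:]]
            children.append(child)
        pop = children
    return max(pop, key=lambda p: fitness(p, target))

target = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]    # four-on-the-floor kick
print(evolve(target))
```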

Keywords
Algorithmic Composition, Drums, Electronic Music, Genetic Algorithms, Rhythm Similarity

Paper topics
Computational musicology and Mathematical Music Theory, Computer environments for sound/music processing, Interfaces for sound and music, Models for sound analysis and synthesis, Perception and cognition of sound and music

Easychair keyphrases
genetic algorithm [32], target pattern [15], fitness function [12], hamming distance [11], rhythmic similarity [11], directed swap distance [9], drum pattern [9], distance measure [8], edit distance [7], swap distance [7], computer music [5], bit string [4], correlation matrix [4], next section [4], perceived similarity [4], rhythmic pattern [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851031
Zenodo URL: https://zenodo.org/record/851031


2015.65
Tempo Curving as a Framework for Interactive Computer-Aided Composition
Bresson, Jean   STMS, Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
MacCallum, John   Center for New Music and Audio Technologies (CNMAT), University of California, Berkeley; Berkeley, United States

Abstract
We present computer-aided composition experiments related to the notions of polyrhythmic structures and variable tempo curves. We propose a formal framework and a set of tools that make it possible to generate complex polyrhythms with multiple varying tempos, integrated in compositional processes and performance, and implemented as algorithms and prototype user interfaces.
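
One core operation in such a framework is mapping events specified in beats onto absolute time under a varying tempo. The sketch below is a generic illustration (not the authors' tools): it integrates beat durations under a piecewise-linear tempo curve to obtain onset times in seconds.

```python
import numpy as np

def onset_times(rhythm_beats, tempo_beats, tempo_bpm, resolution=0.001):
    """Map event positions given in beats onto seconds under a varying tempo.

    rhythm_beats : event positions in beats (e.g. [0, 0.5, 1, 1.75, ...])
    tempo_beats, tempo_bpm : breakpoints of a piecewise-linear tempo curve
    """
    grid = np.arange(0, max(rhythm_beats) + resolution, resolution)
    bpm = np.interp(grid, tempo_beats, tempo_bpm)        # tempo at each grid point
    beat_dur = 60.0 / bpm                                 # seconds per beat
    time_of_beat = np.concatenate([[0.0], np.cumsum(beat_dur[:-1] * resolution)])
    return np.interp(rhythm_beats, grid, time_of_beat)

# Example: an accelerando from 60 to 120 BPM over 8 beats.
print(onset_times([0, 1, 2, 3, 4, 5, 6, 7, 8], [0, 8], [60, 120]))
```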

Keywords
Computer-aided composition, Polytemporal music, Rhythm, Tempo

Paper topics
Computer environments for sound/music processing

Easychair keyphrases
tempo curve [16], computer aided composition [7], computer music [7], temporal pattern [7], simulated annealing algorithm [6], compositional process [4], musical material [4], recent work [4], target rhythm [4], varying tempo [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851109
Zenodo URL: https://zenodo.org/record/851109


2015.66
The "Harmonic Walk" and Enactive Knowledge: An Assessment Report
Mandanici, Marcella   Università di Padova; Padova, Italy
Rodà, Antonio   Università di Padova; Padova, Italy
Canazza, Sergio   Università di Padova; Padova, Italy

Abstract
The Harmonic Walk is an interactive physical environment based on the detection of the user's motion and devoted to the study and practice of tonal harmony. When entering the rectangular floor surface within the application's camera view, a user can literally walk inside the musical structure, triggering sound feedback that depends on the occupied zone. We arranged a two-mask projection setup to allow users to experience melodic segmentation and the tonality harmonic space, and we planned two-phase assessment sessions, submitting a group of 22 high school students to various test conditions. Our findings demonstrate the high learning effectiveness of the Harmonic Walk application. Its ability to convey abstract concepts in an enactive way produces substantial improvement rates both for subjects who received explicit information and for those who did not.

Keywords
Interactive physical environments, Music cognition, Music learning applications

Paper topics
Interactive performance systems, Interfaces for sound and music

Easychair keyphrases
harmonic change [20], harmonization task [16], harmonic walk [15], effect size [12], non musician [10], high school student [9], circular ring [8], subject category [8], melody harmonization [7], second test [7], employed chord [6], enactive experience [6], instructed subject [6], melodic segmentation [6], standard deviation [6], test conductor [6], tonality harmonic space [6], zone tracker application [6], assessment session [5], assessment test [5], audio file [5], explicit information [5], high school [5], instructed musician [5], tonal function [5], tonal melody [5], uninstructed musician [5], catholic institute barbarigo [4], harmonic walk application [4], music high school [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851043
Zenodo URL: https://zenodo.org/record/851043


2015.67
The Virtuoso Composer and the Formidable Machine: A Path to Preserving Human Compositional Expression
Cullimore, Jason   University of Regina; Regina, Canada
Gerhard, David   University of Regina; Regina, Canada

Abstract
Many contemporary computer music systems can emulate aspects of composers' behaviour, creating and arranging structural elements traditionally manipulated by composers. This raises the question as to how new computer music systems can act as effective tools that enable composers to express their personal musical vision: if a computer is acting as a composer's tool, but is working directly with score structure, how can it preserve the composer's artistic voice? David Wessel and Matthew Wright have argued that, in the case of musical instrument interfaces, a balance should be struck between ease of use and the potential for developing expressivity through virtuosity. In this paper, we adapt these views to the design of compositional interfaces. We introduce the idea of the virtuoso composer, and propose an understanding of computer music systems that may enhance the relationship between composers and their computer software tools. We conclude by arguing for a conceptualization of the composer/computer relationship that promotes the continued evolution of human musical expression.

Keywords
Critical Studies, Electronic Composition, Generative Music, Human-Computer Collaboration

Paper topics
Music and robotics, Music performance analysis and rendering, Social interaction in sound and music computing, Sound and music for VR and games

Easychair keyphrases
computer music system [19], musical work [10], computer music [7], musical instrument [7], computer based composition environment [6], musical idea [6], virtuoso composer [6], computer system [5], score structure [5], david cope experiment [4], musical intelligence [4], musical structure [4], real time [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851137
Zenodo URL: https://zenodo.org/record/851137


2015.68
To "Sketch-a-Scratch"
Del Piccolo, Alan   Università Ca' Foscari; Venezia, Italy
Delle Monache, Stefano   IUAV University of Venice; Venezia, Italy
Rocchesso, Davide   IUAV University of Venice; Venezia, Italy
Papetti, Stefano   Zurich University of the Arts (ZHdK); Zurich, Switzerland
Mauro, Davide Andrea   IUAV University of Venice; Venezia, Italy

Abstract
A surface can be harsh and raspy, or smooth and silky, and everything in between. We are used to sensing these features with our fingertips as well as with our eyes and ears: the exploration of a surface is a multisensory experience. Tools, too, are often employed in the interaction with surfaces, since they augment our manipulation capabilities. “Sketch a Scratch” is a tool for the multisensory exploration and sketching of surface textures. The user’s actions drive a physical sound model of real materials’ response to interactions such as scraping, rubbing or rolling. Moreover, different input signals can be converted into 2D visual surface profiles, thus enabling users to experience them visually, aurally and haptically.

Keywords
Exploration, Interaction, Texture sketching

Paper topics
Computer environments for sound/music processing, Interactive performance systems, Interfaces for sound and music, Multimodality in sound and music computing, Sonic interaction design

Easychair keyphrases
haptic feedback [13], surface texture [10], surface profile [7], tool mediated exploration [6], virtual surface [6], impact model [5], audio signal [4], friction model [4], interactive surface [4], lateral force [4], multisensory exploration [4], real surface [4], rubbed object [4], self contained interactive installation [4], sound design toolkit [4], texture exploration [4], world voice day [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851051
Zenodo URL: https://zenodo.org/record/851051


2015.69
TRAP: TRAnsient Presence detection exploiting Continuous Brightness Estimation (CoBE)
Presti, Giorgio   Laboratorio di Informatica Musicale (LIM), Dipartimento di Informatica (DI), Università degli Studi di Milano; Milano, Italy
Mauro, Davide Andrea   IUAV University of Venice; Venezia, Italy
Haus, Goffredo   Laboratorio di Informatica Musicale (LIM), Dipartimento di Informatica (DI), Università degli Studi di Milano; Milano, Italy

Abstract
We propose a descriptor of feature modulation, useful in classification tasks and real-time analysis. The descriptor is computed in the time domain, ensuring fast computation and optimal temporal resolution. In this work we use the amplitude envelope as the inspected feature, so the outcome of the process provides information about the input's energy modulation and can be exploited to detect the presence of transients in audio segments. The proposed algorithm relies on an adaptation of Continuous Brightness Estimation (CoBE).
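
The general idea, a time-domain descriptor of how fast the amplitude envelope is modulating, can be sketched as follows. This is an illustrative stand-in built from a one-pole envelope follower and a derivative-to-signal RMS ratio; it is not the exact CoBE formulation used in the paper.

```python
import numpy as np

def envelope(x, fs, tau=0.01):
    """One-pole amplitude envelope follower (time constant tau in seconds)."""
    a = np.exp(-1.0 / (tau * fs))
    env = np.empty_like(x)
    e = 0.0
    for n, v in enumerate(np.abs(x)):
        e = a * e + (1 - a) * v
        env[n] = e
    return env

def envelope_modulation(x, fs, win=2048, hop=512):
    """Frame-wise estimate of how fast the amplitude envelope is modulating:
    RMS of the envelope's derivative over RMS of the envelope (roughly in Hz).
    High values flag frames likely to contain transients."""
    env = envelope(np.asarray(x, dtype=float), fs)
    d = np.diff(env, prepend=env[0]) * fs
    out = []
    for start in range(0, len(env) - win, hop):
        e, de = env[start:start + win], d[start:start + win]
        out.append(np.sqrt(np.mean(de ** 2)) /
                   (2 * np.pi * np.sqrt(np.mean(e ** 2)) + 1e-12))
    return np.array(out)
```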

Keywords
Brightness, CoBE, Events detection, Feature extraction, MIR

Paper topics
Digital audio effects, Music information retrieval, Sound/music signal processing algorithms

Easychair keyphrases
envelope follower [7], event density [7], amplitude envelope [6], energy modulation [6], energy modulation amount [6], cobe value [5], energy envelope [5], low energy [5], attack leap [4], cobe ebf [4], continuous brightness estimation [4], crest factor [4], envelope brightness [4], feature extraction [4], modulation amount [4], onset detection [4], spectral flux [4], standard deviation [4], trap median [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851103
Zenodo URL: https://zenodo.org/record/851103


2015.70
Vibrotactile Discrimination of Pure and Complex Waveforms
Young, Gareth William   University College Cork (UCC); Cork, Ireland
Murphy, Dave   University College Cork (UCC); Cork, Ireland
Weeter, Jeffrey   University College Cork (UCC); Cork, Ireland

Abstract
We present experimental results investigating the application of vibrotactile stimuli of pure and complex waveforms. Our experiment measured subjects' ability to discriminate between pure and complex waveforms based on vibrotactile stimulation alone. Subjective same/different responses were captured for paired combinations of sine, saw and square waveforms at a fixed fundamental frequency of 160 Hz (f0). Each pairing was presented non-sequentially via a gloved vibrotactile device. Audio and bone-conduction cues were removed via headphone and tactile noise masking, respectively. The results from our experiments indicate that humans possess the ability to distinguish between different waveforms via vibrotactile stimulation when these are presented asynchronously at f0, and that this form of interaction may be developed further to advance the extra-auditory interactions of digital musical instruments (DMIs) in computer music.
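
The stimuli themselves are straightforward to generate. The sketch below is illustrative only (sample rate, duration and RMS normalisation are assumptions, not the authors' exact setup); it produces the three 160 Hz waveforms used in the paired comparisons.

```python
import numpy as np
from scipy.signal import sawtooth, square

fs, f0, dur = 44100, 160.0, 1.0                 # sample rate, fundamental, seconds
t = np.arange(int(fs * dur)) / fs
stimuli = {
    "sine":   np.sin(2 * np.pi * f0 * t),
    "saw":    sawtooth(2 * np.pi * f0 * t),
    "square": square(2 * np.pi * f0 * t),
}
# Equalise RMS so intensity cues do not confound the waveform comparison
# (an assumption about the kind of normalisation such an experiment would need).
stimuli = {k: v / np.sqrt(np.mean(v ** 2)) for k, v in stimuli.items()}
```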

Keywords
Interfaces for sound and music, Multimodality in sound and music computing, Perception and cognition of sound and music, Sound/music and the neurosciences

Paper topics
Interfaces for sound and music, Multimodality in sound and music computing, Perception and cognition of sound and music, Sound/music and the neurosciences

Easychair keyphrases
vibrotactile stimulus [10], complex waveform [7], vibrotactile feedback [7], audio tactile glove [6], multisensory integration [5], musical instrument [5], college cork cork [4], non musician [4], sub threshold [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851057
Zenodo URL: https://zenodo.org/record/851057


2015.71
"Virtual Tettix": Cicadas' Sound Analysis and Modeling at Plato's Academy
Georgaki, Anastasia   University of Athens; Athens, Greece
Queiroz, Marcelo   Department of Computer Science, University of São Paulo (USP); São Paulo, Brazil

Abstract
This paper deals with the acoustic analysis of timbral and rhythmic patterns of the Cicada orni sound activity, collected at the Plato Academy archaeological site during the summer of 2014 and comprising the Tettix soundscape database. The main purpose here is to use sound analysis for understanding the basic patterns of cicada calls and shrilling sounds, and subsequently use the raw material provided by the Tettix database in a statistical modeling framework for creating virtual cicada sounds, allowing the control of synthesis parameters spanning micro, meso and macro temporal levels.

Keywords
Cicada sound, Soundscape, Statistical models, Synthesis model, Timbre and rhythm analysis

Paper topics
Content processing of music audio signals, Models for sound analysis and synthesis, Sonic interaction design

Easychair keyphrases
virtual cicada [14], second order markov model [8], cicada orni [7], plato academy [7], macro temporal level [6], micro temporal [6], cicada call [5], cicada singing [5], low pass [5], macro temporal [5], synthesis model [5], temporal scale [5], tettix project [5], transition matrix [5], cicada chorus [4], high pass filter [4], lower right [4], low frequency [4], low pass filtered version [4], meso temporal [4], meso temporal scale [4], micro temporal scale [4], micro temporal synthesis engine [4], multi ethnic heterotopical soundscape [4], plato academy soundscape [4], precedence effect [4], statistical modeling [4], tettix database [4], upper left [4], upper right [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851141
Zenodo URL: https://zenodo.org/record/851141


2015.72
Voice quality transformation using an extended source-filter speech model
Huber, Stefan   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France
Roebel, Axel   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France

Abstract
In this paper we present a flexible framework for high-quality parametric speech analysis and synthesis. It constitutes an extended source-filter model. The novelty of the proposed speech processing system lies in its extended means to use a Deterministic plus Stochastic Model (DSM) for the estimation of the unvoiced stochastic component from a speech recording. Further contributions are the efficient and robust means to extract the Vocal Tract Filter (VTF) and the modelling of energy variations. The system is evaluated in the context of two voice quality transformations on natural human speech. The voice quality of a speech phrase is altered by re-synthesizing the deterministic component with different pulse shapes of the glottal excitation source. A Gaussian Mixture Model (GMM) is used in one test to predict energies for the re-synthesis of the deterministic and the stochastic components. The subjective listening tests suggest that the speech processing system is able to successfully synthesize and convey to a listener the perceptual sensation of different voice quality characteristics. Additionally, improvements in speech synthesis quality compared to a baseline method are demonstrated.

Keywords
Glottal Source, LF model, Source-Filter, Speech Analysis Transformation and Synthesis, Voice Quality

Paper topics
Content processing of music audio signals, Models for sound analysis and synthesis, Sound/music signal processing algorithms

Easychair keyphrases
voice quality [57], synthesis quality [17], mo synthesis quality [14], voice quality transformation [14], tense voice quality [12], signal processing [9], voice descriptor [9], voice quality rating [9], glottal pulse [8], quality rating [8], baseline method svln [7], rdgci contour [7], relaxed voice quality [7], sinusoidal content [7], time domain mixing [7], voice quality change [7], glottal excitation source [6], glottal flow [6], glottal source [6], spectral fading [6], spectral fading synthesis [6], spectral slope [6], speech communication association [6], synthesis quality rating [6], unvoiced signal [6], very tense voice [6], voice quality characteristic [6], energy measure [5], source filter [5], unvoiced component [5]

Paper type
Full paper

DOI: 10.5281/zenodo.851075
Zenodo URL: https://zenodo.org/record/851075


2015.73
Wave Voxel Synthesis
Haron, Anis   Media Arts and Technology Program, University of California, Santa Barbara (UCSB); Santa Barbara, United States
Wright, Matthew James   Media Arts and Technology Program, University of California, Santa Barbara (UCSB); Santa Barbara, United States

Abstract
We present research on sound synthesis techniques employing lookup tables of more than two dimensions. Higher-dimensional wavetables have not yet been explored to their full potential due to historical resource restrictions, particularly memory. This paper presents a technique for sound synthesis by means of three-variable functions, as an extension of existing multidimensional table-lookup synthesis techniques.
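
The basic principle, scanning a three-dimensional lookup table along an orbit whose repetition rate sets the pitch, can be sketched as follows. This is a generic illustration of three-variable table-lookup synthesis with an arbitrary terrain function and a simple Lissajous-like orbit, not the authors' exact formulation.

```python
import numpy as np

# A three-variable "terrain" evaluated on a grid: the voxel table.
N = 64
x, y, z = np.meshgrid(*(np.linspace(-1, 1, N),) * 3, indexing="ij")
voxels = np.sin(3 * np.pi * x) * np.cos(2 * np.pi * y * z)     # arbitrary 3-D function

def voxel_orbit(voxels, n_samples, fs=44100, f=110.0, depth=0.8):
    """Read the table along a closed 3-D orbit whose repetition rate sets the pitch."""
    n = voxels.shape[0]
    t = np.arange(n_samples) / fs
    phase = 2 * np.pi * f * t
    # A simple Lissajous-like orbit through the cube (indices clipped to the grid).
    ox = depth * np.sin(phase)
    oy = depth * np.sin(2 * phase + 0.5)
    oz = depth * np.cos(3 * phase)
    idx = lambda c: np.clip(((c + 1) / 2 * (n - 1)).astype(int), 0, n - 1)
    return voxels[idx(ox), idx(oy), idx(oz)]                    # nearest-neighbour lookup

signal = voxel_orbit(voxels, n_samples=44100)
```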

Keywords
Multidimensional wavetable, Sound Synthesis, Three variable functions

Paper topics
Digital audio effects, Multimodality in sound and music computing, Sound/music signal processing algorithms

Easychair keyphrases
voxel stack [17], wave terrain [16], wave terrain synthesis [14], harmonic content [10], sound synthesis [10], computer music [9], linear frequency [9], linear phase [9], frequency sweep [8], orbit trajectory [8], amplitude value [7], wave voxel [7], international computer [6], dimensional space [5], orbit length [5], sine wave [5], table size [5], dimensional lookup table [4], dimensional wavetable [4], dynamic voxel stack content [4], indexing operation [4], real time video image [4], sub wavetable [4], term wave voxel [4], variable function [4], wave voxel synthesis [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851037
Zenodo URL: https://zenodo.org/record/851037


2015.74
Web Audio Evaluation Tool: A Browser-Based Listening Test Environment
Jillings, Nicholas   Queen Mary University of London; London, United Kingdom
De Man, Brecht   Queen Mary University of London; London, United Kingdom
Moffat, David   Queen Mary University of London; London, United Kingdom
Reiss, Joshua Daniel   Queen Mary University of London; London, United Kingdom

Abstract
Perceptual evaluation tests, in which subjects assess certain qualities of different audio fragments, are an integral part of audio and music research. These require specialised software, usually custom-made, to collect large amounts of data using meticulously designed interfaces with carefully formulated questions, and to play back audio with rapid switching between different samples. New functionality in HTML5, included in the Web Audio API, allows for increasingly powerful media applications in a platform-independent environment. The advantage of a web application is easy deployment on any platform, without requiring any other application, enabling multiple tests to be conducted easily across locations. In this paper we propose a tool supporting a wide variety of easily configurable, multi-stimulus perceptual audio evaluation tests over the web, with multiple test interfaces, pre- and post-test surveys, custom configuration, collection of test metrics and other features. Test design and setup do not require a programming background, and results are gathered automatically using web-friendly formats for easy storage on a server.

Keywords
Audio Evaluation, HTML5, Listening Tests, Web Audio

Paper topics
Interfaces for sound and music, Music information retrieval, Perception and cognition of sound and music

Easychair keyphrases
web audio api [17], audio engineering society [15], listening test [10], comment box [8], perceptual evaluation [7], sample rate [7], audio sample [6], audio file [5], metricenable metricenable [5], audio fragment [4], audio perceptual evaluation [4], audio quality [4], browser based listening test environment [4], browser based perceptual evaluation [4], metricresult metricresult id [4], multiple stimulus [4], perceptual evaluation tool [4], setup file [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.851157
Zenodo URL: https://zenodo.org/record/851157


2015.75
Web Audio Modules
Kleimola, Jari   Department of Computer Science, Aalto University; Espoo, Finland
Larkin, Oliver   Department of Music, University of York; York, United Kingdom

Abstract
This paper introduces Web Audio Modules (WAMs), which are high-level audio processing/synthesis units that represent the equivalent of Digital Audio Workstation (DAW) plug-ins in the browser. Unlike traditional browser plug-ins, WAMs load from the open web with the rest of the page content, without manual installation. We propose the WAM API, which integrates into the existing Web Audio API, and provide its implementation with JavaScript and C++ bindings. Two proof-of-concept WAM virtual instruments were implemented in Emscripten and evaluated in terms of latency and performance. We found that the performance is sufficient for reasonable polyphony, depending on the complexity of the processing algorithms. Latency is higher than in native DAW environments, but we expect that the forthcoming W3C standard AudioWorkerNode, as well as browser developments, will reduce it.

Keywords
daw plugin, emscripten, sound synthesis, virtual instrument, web audio

Paper topics
Computer environments for sound/music processing, Digital audio effects, Interfaces for sound and music, Sound/music signal processing algorithms

Easychair keyphrases
web audio api [26], audio api [15], web browser [10], web audio api node [8], buffer size [7], parameter space [7], style virtual instrument [6], virtual instrument [6], virtual void [6], wam implementation [6], web page [6], web service [6], audio plug [5], daw style [5], manual installation [5], native plug [5], use case [5], wam sinsynth [5], web application [5], web audio [5], audio api node graph [4], audio module [4], open source [4], streamlined api [4], user interface [4], void data [4], wam api [4], web api [4], web component [4], web midi api [4]

Paper type
Full paper

DOI: 10.5281/zenodo.851149
Zenodo URL: https://zenodo.org/record/851149

