Sixteen Years of Sound & Music Computing
A Look Into the History and Trends of the Conference and Community

D.A. Mauro, F. Avanzini, A. Baratè, L.A. Ludovico, S. Ntalampiras, S. Dimitrov, S. Serafin

Papers

Sound and Music Computing Conference 2019 (16th edition)

Dates: from May 28 to May 31, 2019
Place: Málaga, Spain
Proceedings info: Proceedings of the 16th Sound & Music Computing Conference, ISBN 978-84-09-08518-7


2019.1
Adaptive Body Movement Sonification in Music and Therapy
Baumann, Christian   Fulda University of Applied Sciences (Hochschule Fulda); Fulda, Germany
Baarlink, Johanna Friederike   Musikschule Fulda; Fulda, Germany
Milde, Jan-Torsten   Fulda University of Applied Sciences (Hochschule Fulda); Fulda, Germany

Abstract
In this paper we describe ongoing research on the development of a body movement sonification system. High-precision, high-resolution wireless sensors are used to track body movement and record muscle excitation. We are currently using 6 sensors; in the final version of the system, full body tracking can be achieved. The recording system provides a web server including a simple REST API, which streams the recorded data in JSON format. An intermediate proxy server pre-processes the data and transmits it to the final sonification system. The sonification system is implemented using the Web Audio API. We are experimenting with a set of different sonification strategies and algorithms. Currently we are testing the system as part of an interactive, guided therapy, establishing additional acoustic feedback channels for the patient. In a second stage of the research we are going to use the system in a more musical and artistic way. More specifically, we plan to use the system in cooperation with a violist, where the acoustic feedback channel will be integrated into the performance.

Keywords
body movement, music and therapy, real time interactive machine learning, sonification, wekinator

Paper topics
Auditory display and data sonification, Automatic music generation/accompaniment systems, Interaction in music performance, Interfaces for sound and music, Sonic interaction design, Sound and music for accessibility and special needs

Easychair keyphrases
body movement [17], arm movement [12], sonification system [10], body movement data [9], bio feedback [6], body movement sonification [6], musical performance [6], sonification strategy [5], central chord [4], web audio api [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249449
Zenodo URL: https://zenodo.org/record/3249449


2019.2
Adaptive Loudness Compensation in Music Listening
Fierro, Leonardo   Università di Brescia; Brescia, Italy
Rämö, Jussi   Aalto University; Espoo, Finland
Välimäki, Vesa   Aalto University; Espoo, Finland

Abstract
The need for loudness compensation is a well-known fact arising from the nonlinear behavior of human sound perception. Music and sound are mixed and mastered at a certain loudness level, usually louder than the level at which they are commonly played. This implies a change in the perceived spectral balance of the sound, which is largest in the low-frequency range. As the volume setting in music playing is decreased, a loudness compensation filter can be used to boost the bass appropriately, so that the low frequencies are still heard well and the perceived spectral balance is preserved. The present paper proposes a loudness compensation function derived from the standard equal-loudness-level contours and its implementation via a digital first-order shelving filter. Results of a formal listening test validate the accuracy of the proposed method.
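
The compensation filter itself is a standard first-order low shelf. As a rough, hedged sketch of that filtering stage only (the gain-versus-listening-level mapping below is an arbitrary placeholder, not the paper's function derived from the equal-loudness contours):

```python
import numpy as np
from scipy.signal import lfilter

def low_shelf(gain_db, fc, fs):
    """First-order low-shelving filter (bilinear transform of
    H(s) = (s/wc + V0) / (s/wc + 1)). Boosts/cuts below fc by gain_db."""
    V0 = 10 ** (gain_db / 20.0)          # linear low-frequency gain
    K = np.tan(np.pi * fc / fs)          # prewarped crossover frequency
    b = np.array([1 + V0 * K, V0 * K - 1]) / (1 + K)
    a = np.array([1.0, (K - 1) / (1 + K)])
    return b, a

def compensate(x, fs, level_drop_db, fc=200.0, boost_per_db=0.3):
    # Hypothetical rule: boost the bass more as the playback level drops.
    # (The paper derives its compensation function from the equal-loudness
    #  contours; this linear mapping is only a placeholder.)
    b, a = low_shelf(boost_per_db * level_drop_db, fc, fs)
    return lfilter(b, a, x)

if __name__ == "__main__":
    fs = 44100
    x = np.random.randn(fs)              # stand-in for a music signal
    y = compensate(x, fs, level_drop_db=20.0)
    print(y.shape)
```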

Keywords
Audio, Digital filters, DSP, Equalization, Listening test

Paper topics
Digital audio effects, Perception and cognition of sound and music, Sound/music signal processing algorithms

Easychair keyphrases
trace guide [15], shelving filter [14], first order shelving filter [12], listening level [12], first order [11], first order low shelving filter [11], loudness compensation [10], digital filter [9], perceived spectral balance [9], crossover frequency [8], listening test [8], spectral balance [7], audio eng [6], equal loudness level contour [6], magnitude response [6], aalto acoustics lab [4], box plot [4], compensation method [4], first order filter [4], fractional order filter [4], iir filter [4], loudness level [4], order filter [4], pure tone [4], sensitivity function [4], sound pressure level [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249289
Zenodo URL: https://zenodo.org/record/3249289


2019.3
Adaptive Score-Following System by Integrating Gaze Information
Noto, Kaede   Future University Hakodate; Hakodate, Japan
Takegawa, Yoshinari   Future University Hakodate; Hakodate, Japan
Hirata, Keiji   Future University Hakodate; Hakodate, Japan

Abstract
In actual piano practice, people of different skill levels exhibit different behaviors, for instance leaping forward or to an upper staff, mis-keying, repeating, and so on. However, many conventional score-following systems hardly adapt to such accidental behaviors, which depend on individual skill level, because conventional systems usually learn only the frequent or general behaviors. We develop a score-following system that can adapt to a user's individuality by combining keying information with gaze, because the gaze is well known to be a highly reliable expression of a performer's thinking. Since it is difficult to collect a large amount of piano performance data reflecting individuality, we employ the framework of Bayesian inference to adapt to individuality. That is, to estimate the user's current position in the piano performance, keying and gaze information are integrated into a single Bayesian inference by a Gaussian mixture model (GMM). Here, we assume both the keying and gaze information conform to normal distributions. Experimental results show that, taking the gaze information into account, our score-following system can properly cope with repetition and, in particular, with leaping to an upper row of a staff.
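
As a toy illustration of the fusion idea only (not the authors' GMM formulation), a posterior over discrete score positions can be formed by multiplying a prior with Gaussian likelihoods for the keying-based and gaze-based position observations:

```python
import numpy as np

def fuse_position(prior, key_obs, gaze_obs, positions,
                  sigma_key=1.0, sigma_gaze=3.0):
    """Posterior over discrete score positions given a keying-based and a
    gaze-based observation, each modelled as a Gaussian around the true
    position (toy stand-in; sigmas and the grid are illustrative)."""
    def gauss(obs, sigma):
        return np.exp(-0.5 * ((positions - obs) / sigma) ** 2)

    post = prior * gauss(key_obs, sigma_key) * gauss(gaze_obs, sigma_gaze)
    return post / post.sum()

positions = np.arange(100)                 # note indices in the score
prior = np.ones_like(positions, float) / len(positions)
post = fuse_position(prior, key_obs=41, gaze_obs=47, positions=positions)
print("estimated position:", positions[np.argmax(post)])
```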

Keywords
Bayesian inference, Gaussian mixture model, gaze information, score-following

Paper topics
Multimodality in sound and music computing, Music performance analysis and rendering

Easychair keyphrases
gaze information [19], keying information [17], score following system [15], score following [12], bayesian inference [11], user current position [9], gaze point [8], mixture rate [8], normal distribution [8], user individuality [8], mixture ratio [7], note number [7], accuracy rate [6], piano performance [6], gaze distribution [5], keying distribution [5], random variable [5], set piece [5], eye hand span [4], eye movement [4], future university hakodate [4], keying position [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249400
Zenodo URL: https://zenodo.org/record/3249400


2019.4
ADEPT: Exploring the Design, Pedagogy and Analysis of a Mixed Reality Application for Piano Training
Gerry, Lynda   Aalborg University; Aalborg, Denmark
Dahl, Sofia   Aalborg University; Aalborg, Denmark
Serafin, Stefania   Aalborg University; Aalborg, Denmark

Abstract
One of the biggest challenges in learning how to play a musical instrument is learning how to move one's body in a nuanced physicality. Technology can expand available forms of physical interactions to help cue specific movements and postures to reinforce new sensorimotor couplings and enhance motor learning and performance. Using audiovisual first person perspective taking with a piano teacher in Mixed Reality, we present a system that allows students to place their hands into the virtual gloves of a teacher. Motor learning and audio-motor associations are reinforced through motion feedback and spatialized audio. The Augmented Design to Embody a Piano Teacher (ADEPT) application is an early design prototype of this piano training system.

Keywords
Augmented Reality, Mixed Reality, Pedagogy, Piano, Technology-Enhanced Learning

Paper topics
Interactive performance systems, Interfaces for sound and music, Multimodality in sound and music computing, Sound and music for Augmented/Virtual Reality and games

Easychair keyphrases
piano teacher [19], adept system [18], first person perspective [14], piano playing [9], piano training application [7], virtual reality [7], augmented reality [6], embodied music cognition [6], expert pianist [6], motion feedback [6], perspective taking [6], piano student [6], real time feedback [6], teacher hand [6], virtual embodiment [6], augmented embodiment [5], embodied perspective [5], piano instruction [5], piano training [5], real time [5], visual cue [5], visual environment [5], audio spatialization [4], magic leap headset [4], mixed reality [4], motion capture [4], piano performance [4], real time motion feedback [4], user physical piano [4], virtual hand [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249333
Zenodo URL: https://zenodo.org/record/3249333


2019.5
A Framework for Multi-f0 Modeling in SATB Choirs
Cuesta, Helena   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Chandna, Pritish   Music Technology Group (MTG), Pompeu Fabra University (UPF); Barcelona, Spain
Gómez, Emilia   Music Technology Group (MTG), Joint Research Centre, European Commission, Pompeu Fabra University (UPF); Barcelona, Spain

Abstract
Fundamental frequency (f0) modeling is an important but relatively unexplored aspect of choir singing. Performance evaluation as well as auditory analysis of singing, whether individually or in a choir, often depend on extracting f0 contours for the singing voice. However, due to the large number of singers singing in a similar frequency range, extracting the exact pitch contour is a challenging task. In this paper, we present a methodology for modeling the pitch contours of an SATB choir. A typical SATB choir consists of four parts, each covering a distinct range of fundamental frequencies and often sung by multiple singers. We first evaluate some state-of-the-art multi-f0 estimation systems for the particular case of choirs with one singer per part, and observe that the pitch of the individual singers can be estimated to a relatively high degree of accuracy. We observe, however, that the scenario of multiple singers for each choir part is far more challenging. In this work we combine a multi-f0 estimation methodology based on deep learning with a set of traditional DSP techniques to model the f0 dispersion for each choir part instead of a single f0 trajectory. We present and discuss our observations and test our framework with different configurations of singers.

Keywords
choral singing, multi-pitch, pitch modeling, singing voice, unison

Paper topics
Models for sound analysis and synthesis, Music information retrieval, Music performance analysis and rendering

Easychair keyphrases
multi f0 estimation [42], choir section [16], f0 estimation system [9], vocal quartet [8], f0 estimation algorithm [7], ground truth [7], multiple singer [7], traditional dsp technique [6], unison ensemble singing [6], choir recording [5], dispersion value [5], spectral peak [5], choral singing [4], deep learning [4], fundamental frequency [4], individual track [4], interpolated peak [4], locus iste [4], multi f0 extraction [4], multiple fundamental frequency estimation [4], music information retrieval [4], non peak region [4], pitch salience [4], satb quartet [4], singing voice [4], standard deviation [4], unison performance [4], unison singing [4], universitat pompeu fabra [4], vocal ensemble [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249421
Zenodo URL: https://zenodo.org/record/3249421


2019.6
A Framework for the Development and Evaluation of Graphical Interpolation for Synthesizer Parameter Mappings
Gibson, Darrell   Bournemouth University; Bournemouth, United Kingdom
Polfreman, Richard   University of Southampton; Southampton, United Kingdom

Abstract
This paper presents a framework that supports the development and evaluation of graphical interpolated parameter mapping for the purpose of sound design. These systems present the user with a graphical pane, usually two-dimensional, where synthesizer presets can be located. Moving an interpolation point cursor within the pane then creates new sounds by calculating new parameter values, based on the cursor position and the interpolation model used. The exploratory nature of these systems lends itself to sound design applications, which also have a highly exploratory character. However, populating the interpolation space with “known” preset sounds allows the parameter space to be constrained, reducing the design complexity otherwise associated with synthesizer-based sound design. An analysis of previous graphical interpolators is presented, and from this a framework is formalized and tested to show its suitability for the evaluation of such systems. The framework has then been used to compare the functionality of a number of previously implemented systems. This has led to a better understanding of the different sonic outputs that each can produce and has highlighted areas for further investigation.
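
One common interpolation model in such systems is an inverse-distance ("gravitational") weighting of the stored presets; a minimal sketch under the assumption that presets are plain parameter vectors placed in a unit pane:

```python
import numpy as np

def interpolate_presets(cursor, preset_positions, preset_params, power=2.0):
    """Inverse-distance-weighted blend of synthesizer preset parameter
    vectors, based on the cursor position in the 2-D interpolation pane."""
    d = np.linalg.norm(preset_positions - cursor, axis=1)
    if np.any(d < 1e-9):                       # cursor sits on a preset
        return preset_params[np.argmin(d)]
    w = 1.0 / d ** power                       # closer presets weigh more
    w /= w.sum()
    return w @ preset_params                   # weighted sum of parameters

# Three presets placed in the pane, each with 4 synth parameters.
positions = np.array([[0.1, 0.1], [0.9, 0.2], [0.5, 0.9]])
params = np.array([[0.2, 0.5, 0.1, 0.8],
                   [0.9, 0.1, 0.4, 0.3],
                   [0.5, 0.7, 0.9, 0.2]])
print(interpolate_presets(np.array([0.5, 0.5]), positions, params))
```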

Keywords
Interface, Interpolation, Sound Design, Synthesis, Visual

Paper topics
Digital audio effects, Interfaces for sound and music, New interfaces for interactive music creation

Easychair keyphrases
interpolation space [35], interpolation point [29], sonic output [18], synthesis engine [18], graphical interpolation system [17], sound design [17], interpolation system [14], interpolation function [12], parameter mapping [12], graphical interpolator [11], interpolation model [11], visual representation [11], synthesis parameter [10], visual model [9], graphical interpolation [8], preset location [8], real time [7], preset point [6], sound design application [6], gaussian kernel [5], parameter preset [5], parameter space [5], audio processing [4], cursor position [4], dimensional graphical [4], gravitational model [4], interpolation method [4], node based interpolator [4], synthesizer parameter [4], visual interpolation model [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249366
Zenodo URL: https://zenodo.org/record/3249366


2019.7
Alternative Measures: A Musicologist Workbench for Popular Music
Clark, Beach   Georgia Institute of Technology; Atlanta, United States
Arthur, Claire   Georgia Institute of Technology; Atlanta, United States

Abstract
The objective of this project is to create a digital “workbench” for quantitative analysis of popular music. The workbench is a collection of tools and data that allow for efficient and effective analysis of popular music. This project integrates software from pre-existing analytical tools, including music21, but adds methods for collecting data about popular music. The workbench includes tools that allow analysts to compare data from multiple sources. Our working prototype of the workbench contains several novel analytical tools which have the potential to generate new musicological insights through the combination of various datasets. This paper demonstrates some of the currently available tools as well as several sample analyses and features computed from this data that support trend analysis. A future release of the workbench will include a user-friendly UI for non-programmers.

Keywords
Music data mining, Music metadata, popular music analysis

Paper topics
Computational musicology and ethnomusicology, Models for sound analysis and synthesis, Music information retrieval

Easychair keyphrases
popular music [27], music information retrieval [12], chord transcription [11], existing tool [10], th international society [9], billboard hot [8], chord detection [8], multiple source [8], pitch vector [8], popular song [7], symbolic data [7], musical analysis [6], ultimate guitar [6], chord estimation [5], chord recognition [5], ground truth [5], harmonic analysis [5], musical data [5], audio feature [4], data collection [4], mcgill billboard dataset [4], midi transcription [4], music scholar [4], pitch class profile [4], popular music scholar [4], programming skill [4], spotify data [4], spotify database [4], symbolic metadata [4], timbre vector [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249402
Zenodo URL: https://zenodo.org/record/3249402


2019.8
A Model Comparison for Chord Prediction on the Annotated Beethoven Corpus
Landsnes, Kristoffer   Ecole Polytechnique Fédérale de Lausanne (EPFL); Lausanne, Switzerland
Mehrabyan, Liana   Ecole Polytechnique Fédérale de Lausanne (EPFL); Lausanne, Switzerland
Wiklund, Victor   Ecole Polytechnique Fédérale de Lausanne (EPFL); Lausanne, Switzerland
Moss, Fabian   Digital and Cognitive Musicology Lab, Ecole Polytechnique Fédérale de Lausanne (EPFL); Lausanne, Switzerland
Lieck, Robert   Digital and Cognitive Musicology Lab, Ecole Polytechnique Fédérale de Lausanne (EPFL); Lausanne, Switzerland
Rohrmeier, Martin   Digital and Cognitive Musicology Lab, Ecole Polytechnique Fédérale de Lausanne (EPFL); Lausanne, Switzerland

Abstract
This paper focuses on predictive processing of chords in Ludwig van Beethoven's string quartets. A dataset consisting of harmonic analyses of all Beethoven string quartets was used to evaluate an n-gram language model as well as a recurrent neural network (RNN) architecture based on long short-term memory (LSTM). By assessing model performance, this paper studies the evolution and variability of Beethoven's harmonic choices in different periods of his activity, as well as the flexibility of predictive models in acquiring basic patterns and rules of tonal harmony.
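
A hedged, minimal sketch of the n-gram side of such a comparison, using a bigram chord model with add-alpha smoothing and perplexity as the evaluation measure (the corpus, chord vocabulary and exact smoothing used in the paper may differ):

```python
from collections import Counter, defaultdict
import math

def train_bigram(sequences, alpha=1.0):
    """Maximum-likelihood bigram chord model with add-alpha smoothing."""
    vocab = sorted({c for seq in sequences for c in seq})
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq[:-1], seq[1:]):
            counts[prev][nxt] += 1

    def prob(prev, nxt):
        total = sum(counts[prev].values())
        return (counts[prev][nxt] + alpha) / (total + alpha * len(vocab))

    return prob, vocab

def perplexity(prob, seq):
    """Per-symbol perplexity of a chord sequence under the bigram model."""
    logp = sum(math.log(prob(p, n)) for p, n in zip(seq[:-1], seq[1:]))
    return math.exp(-logp / (len(seq) - 1))

# Toy training data: chord-symbol sequences (stand-ins for the corpus).
train = [["I", "IV", "V", "I"], ["I", "vi", "IV", "V", "I"]]
prob, vocab = train_bigram(train)
print(perplexity(prob, ["I", "IV", "V", "I"]))
```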

Keywords
chord prediction, harmony, lstm, musical expectancy, neural networks, ngram models, predictive processing

Paper topics
Computational musicology and ethnomusicology, Perception and cognition of sound and music

Easychair keyphrases
n gram model [14], beethoven string quartet [7], recurrent neural network [7], chord symbol [6], long term dependency [6], scale degree [5], annotated beethoven corpus [4], best performing n gram [4], cognitive science [4], cross validation [4], full roman numeral representation [4], gram language model [4], neural network [4], n gram language [4], optimal n gram length [4], predictive processing [4], short term memory [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249335
Zenodo URL: https://zenodo.org/record/3249335


2019.9
Analysis of Vocal Ornamentation in Iranian Classical Music
Shafiei, Sepideh   The City University of New York (CUNY); New York, United States

Abstract
In this paper we study tahrir, a melismatic vocal ornamentation which is an essential characteristic of Persian classical music and can be compared to yodeling. It is considered the most important technique through which the vocalist can display his or her prowess. In Persian, the nightingale's song is used as a metaphor for tahrir, and sometimes for a specific type of tahrir. Here we examine tahrir through a case study. We have chosen two prominent singers of Persian classical music, one contemporary and one from the twentieth century. In our analysis we have used both audio recordings and notation. This paper is the first step towards computational modeling and recognition of different types of tahrirs. Here we have studied two types of tahrirs, namely nashib and farāz, and their combination, through three different performance samples by two prominent vocalists. More than twenty types of tahrirs have been identified by Iranian musicians and music theorists. We hope to develop a method to computationally identify these models in future work.

Keywords
Iranian classical music, Iranian traditional music, radif, tahrir, vocal ornamentation

Paper topics
Computational musicology and ethnomusicology, Music information retrieval

Easychair keyphrases
persian classical music [14], main note [7], twentieth century [7], vocal radif [7], iranian music [6], traditional music [5], dar amad [4], persian music [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249414
Zenodo URL: https://zenodo.org/record/3249414


2019.10
An Interactive Music Synthesizer for Gait Training in Neurorehabilitation
Kantan, Prithvi   Aalborg University; Aalborg, Denmark
Dahl, Sofia   Aalborg University; Aalborg, Denmark

Abstract
Rhythm-based auditory cues have been shown to significantly improve walking performance in patients with numerous neurological conditions. This paper presents the design, implementation and evaluation of a gait training device capable of real-time synthesis and automated manipulation of rhythmic musical stimuli, as well as auditory feedback based on measured walking parameters. The proof-of-concept was evaluated with six healthy participants, as well as through critical review by one neurorehabilitation specialist. Stylistically, the synthesized music was found by participants to be conducive to movement, but not uniformly enjoyable. The gait capture/feedback mechanisms functioned as intended, although discrepancies between measured and reference gait parameter values may necessitate a more robust measurement system. The specialist acknowledged the potential of the gait measurement and auditory feedback as novel rehabilitation aids, but stressed the need for additional gait measurements, superior feedback responsiveness and greater functional versatility in order to cater to individual patient needs. Further research must address these findings, and tests must be conducted on real patients to ascertain the utility of such a device in the field of neurorehabilitation.

Keywords
Gait Rehabilitation, real-time synthesis, Rhythmic Auditory Stimulation, Sonification

Paper topics
Auditory display and data sonification, Automatic music generation/accompaniment systems, Sound/music and the neurosciences

Easychair keyphrases
gait parameter [17], gait measurement [11], real time [8], short term [8], auditory stimulus [7], im gait mate [7], swing time [7], auditory feedback [6], gait performance [6], long term [6], master clock [6], neurological condition [6], rhythmic auditory stimulation [6], stimulus generation subsystem [6], white noise white noise [6], subtractive subtractive [5], time variability [5], analytical subsystem [4], gait training [4], interactive music [4], measured gait parameter [4], neurorehabilitation specialist [4], real time auditory feedback [4], reference value [4], stance time [4], synthesis method [4], target group [4], temporal gait parameter [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249297
Zenodo URL: https://zenodo.org/record/3249297


2019.11
A PLATFORM FOR PROCESSING SHEET MUSIC AND DEVELOPING MULTIMEDIA APPLICATION
Wu, Fu-Hai (Frank)   National Tsing Hua University; Hsinchu, Taiwan

Abstract
Imagine reading sheet music on a computing device while listening to audio synchronized with the sheet. To this end, the sheet music must be acquired, analyzed and transformed into digitized information about melody, rhythm, duration, chords, expressiveness and the physical location on the score. Optical music recognition (OMR) is an appropriate technology for this purpose. However, to the best of our knowledge, no commercial OMR system for numbered music notation is available. In this paper, we demonstrate our proprietary OMR system and show three human-interactive applications: a sheet music browser, multimodal accompaniment, and games for sight-reading of sheet music. With this demonstration, we hope to foster usage of the OMR system and the applications and to gather valuable feedback.

Keywords
musical game, numbered music notation, optical music recognition, sight-reading for sheet music, singing accompaniment

Paper topics
not available

Easychair keyphrases
sheet music [17], omr system [7], sheet music browser [7], numbered music notation [6], multimedia application [4], optical music recognition [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249272
Zenodo URL: https://zenodo.org/record/3249272


2019.12
A SEQUENCER WITH DECOUPLED TRACK TIMING
Peter, Silvan David   Johannes Kepler University Linz; Linz, Austria
Widmer, Gerhard   Johannes Kepler University Linz; Linz, Austria

Abstract
Sequencers almost exclusively share the trait of a single master clock. Each track is laid out on an isochronously spaced sequence of beat positions. Vertically aligned positions are expected to be in synchrony as all tracks refer to the same clock. In this work we present an experimental implementation of a decoupled sequencer with different underlying clocks. Each track is sequenced by the peaks of a designated oscillator. These oscillators are connected in a network and influence each other's periodicities. A familiar grid-type graphical user interface is used to place notes on beat positions of each of the interdependent but asynchronous tracks. Each track clock can be looped and node points specify the synchronisation of multiple tracks by tying together specific beat positions. This setup enables simple global control of microtiming and polyrhythmic patterns.

Keywords
Microtiming, Oscillator, Polyrhythm, Sequencer

Paper topics
not available

Easychair keyphrases
beat position [6], node point [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249268
Zenodo URL: https://zenodo.org/record/3249268


2019.13
Audiovisual Perception of Arousal, Valence and Effort in Contemporary Cello Performance
Järveläinen, Hanna   Zurich University of the Arts (ZHdK); Zurich, Switzerland

Abstract
Perceived arousal, valence, and effort were measured continuously from auditory, visual, and audiovisual cues in contemporary cello performance. Effort (perceived exertion of the performer) was added for two motivations: to investigate its potential as a measure and its association with arousal in audiovisual perception. Fifty-two subjects participated in the experiment. Results were analyzed using Activity Analysis and functional data analysis. Arousal and effort were perceived with significant coordination between participants from auditory, visual, as well as audiovisual cues. Significant differences were detected between auditory and visual channels but not between arousal and effort. Valence, in contrast, showed no significant coordination between participants. Relative importance of the visual channel is discussed.

Keywords
Audiovisual, Contemporary music, Multimodality, Music perception, Real-time perception

Paper topics
Multimodality in sound and music computing, Perception and cognition of sound and music

Easychair keyphrases
audiovisual rating [13], visual channel [11], visual cue [11], activity analysis [10], activity level [9], audiovisual perception [9], auditory rating [9], factor combination [9], visual rating [9], differenced rating [6], music performance [6], significant coordination [6], valence rating [6], auditory channel [5], auditory cue [5], functional analysis [5], perceived arousal [5], rating increase [5], audiovisual cue [4], auditory modality [4], functional data analysis [4], intensity change [4], mean valence level [4], musical performance [4], screen size [4], significant bi coordination [4], visual condition [4], visual perception [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249453
Zenodo URL: https://zenodo.org/record/3249453


2019.14
Autoencoders for Music Sound Modeling: a Comparison of Linear, Shallow, Deep, Recurrent and Variational Models
Roche, Fanny   Grenoble Images Parole Signal Automatique (GIPSA-Lab), Université de Grenoble-Alpes; Grenoble, France
Hueber, Thomas   Grenoble Images Parole Signal Automatique (GIPSA-Lab), Université de Grenoble-Alpes; Grenoble, France
Limier, Samuel   Arturia; Grenoble, France
Girin, Laurent   Grenoble Images Parole Signal Automatique (GIPSA-Lab), Université de Grenoble-Alpes; Grenoble, France

Abstract
This study investigates the use of non-linear unsupervised dimensionality reduction techniques to compress a music dataset into a low-dimensional representation, which can be used in turn for the synthesis of new sounds. We systematically compare (shallow) autoencoders (AEs), deep autoencoders (DAEs), recurrent autoencoders (with long short-term memory cells, LSTM-AEs) and variational autoencoders (VAEs) with principal component analysis (PCA) for representing the high-resolution short-term magnitude spectrum of a large and dense dataset of music notes as a lower-dimensional vector (which is then converted back to a magnitude spectrum used for sound resynthesis). Our experiments were conducted on the publicly available multi-instrument and multi-pitch database NSynth. Interestingly, and contrary to the recent literature on image processing, they showed that PCA systematically outperforms shallow AEs. Only deep and recurrent architectures (DAEs and LSTM-AEs) lead to a lower reconstruction error. Since the optimization criterion in VAEs is the sum of the reconstruction error and a regularization term, it naturally leads to a lower reconstruction accuracy than DAEs, but we show that VAEs are still able to outperform PCA while providing a low-dimensional latent space with nice "usability" properties. We also provide corresponding objective measures of perceptual audio quality (PEMO-Q scores), which generally correlate well with the reconstruction error.
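
The PCA baseline in such a comparison amounts to projecting magnitude-spectrum frames onto the top principal directions and reconstructing from the low-dimensional code; a minimal sketch, with random data standing in for the NSynth spectral frames:

```python
import numpy as np

def pca_codec(spectra, n_components=16):
    """Fit a PCA 'encoder/decoder' on magnitude-spectrum frames and report
    the reconstruction error (the baseline the autoencoders are compared to)."""
    mean = spectra.mean(axis=0)
    X = spectra - mean
    # Principal directions from the SVD of the centred data matrix.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    W = Vt[:n_components]                       # (n_components, n_bins)
    code = X @ W.T                              # low-dimensional latent
    recon = code @ W + mean                     # back to magnitude spectra
    rmse = np.sqrt(np.mean((recon - spectra) ** 2))
    return code, recon, rmse

frames = np.abs(np.random.randn(500, 513))      # stand-in for |STFT| frames
code, recon, rmse = pca_codec(frames, n_components=16)
print(code.shape, rmse)
```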

Keywords
autoencoder, music sound modeling, unsupervised dimension reduction

Paper topics
Models for sound analysis and synthesis, Sound/music signal processing algorithms

Easychair keyphrases
latent space [26], latent dimension [12], latent coefficient [9], reconstruction accuracy [9], variational autoencoder [9], reconstruction error [8], latent vector [7], control parameter [6], latent space dimension [6], layer wise training [6], machine learning [6], magnitude spectra [6], music sound [6], pemo q score [6], principal component analysis [6], neural network [5], phase spectra [5], preprint arxiv [5], signal reconstruction [5], audio synthesis [4], decoded magnitude spectra [4], dimensionality reduction technique [4], encoding dimension [4], good reconstruction accuracy [4], latent representation [4], latent variable [4], low dimensional latent space [4], neural information process [4], pemo q measure [4], sound synthesis [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249404
Zenodo URL: https://zenodo.org/record/3249404


2019.15
Automatic Chord Recognition in Music Education Applications
Grollmisch, Sascha   Fraunhofer; Erlangen, Germany
Cano, Estefania   Fraunhofer; Erlangen, Germany

Abstract
In this work, we demonstrate the market-readiness of a recently published state-of-the-art chord recognition method, where automatic chord recognition is extended beyond major and minor chords to the extraction of seventh chords. To do so, the proposed chord recognition method was integrated in the Songs2See Editor, which already includes the automatic extraction of the main melody, bass line, beat grid, key, and chords for any musical recording.

Keywords
chord recognition, gamification, music education, music information retrieval

Paper topics
not available

Easychair keyphrases
songs2see editor [5], automatic chord recognition [4], score sheet [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249362
Zenodo URL: https://zenodo.org/record/3249362


2019.16
Automatic Chord-Scale Recognition using Harmonic Pitch Class Profiles
Demirel, Emir   Queen Mary University of London; London, United Kingdom
Bozkurt, Barıs   İzmir Demokrasi University; Izmir, Turkey
Serra, Xavier   Pompeu Fabra University (UPF); Barcelona, Spain

Abstract
In this work, we study and evaluate different computational methods to carry out a "modal harmonic analysis" of Jazz improvisation performances by modeling the concept of chord-scales. The Chord-Scale Theory is a theoretical concept that explains the relationship between the harmonic context of a musical piece and the possible scale types to be used for improvisation. This work proposes different computational approaches for the recognition of the chord-scale type in an improvised phrase given the harmonic context. We have curated a dataset to evaluate the different chord-scale recognition approaches proposed in this study; the dataset consists of around 40 minutes of improvised monophonic Jazz solo performances and is made publicly available on freesound.org. For the task of chord-scale type recognition, we propose one rule-based, one probabilistic and one supervised learning method. All proposed methods use Harmonic Pitch Class Profile (HPCP) features for classification. We observed an increase in the classification score when learned chord-scale models are filtered with predefined scale templates, indicating that incorporating prior domain knowledge into learned models is beneficial. The novelty of this study lies in presenting one of the first computational analyses of chord-scales in the context of Jazz improvisation.
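
The rule-based route can be pictured as matching a root-normalised pitch-class profile against binary scale templates; a hedged sketch with a small, assumed template set (the paper's templates and HPCP extraction may differ):

```python
import numpy as np

# Binary pitch-class templates for a few chord-scale types (root = 0).
# Illustrative only; the template set used in the paper may differ.
SCALES = {
    "ionian":     [0, 2, 4, 5, 7, 9, 11],
    "dorian":     [0, 2, 3, 5, 7, 9, 10],
    "mixolydian": [0, 2, 4, 5, 7, 9, 10],
    "altered":    [0, 1, 3, 4, 6, 8, 10],
}

def classify_scale(hpcp, root_pc):
    """Pick the chord-scale whose binary template captures the most energy
    of the (root-normalised) average pitch-class profile of a phrase."""
    profile = np.roll(hpcp, -root_pc)            # rotate so the root is bin 0
    profile = profile / (profile.sum() + 1e-12)
    scores = {}
    for name, degrees in SCALES.items():
        template = np.zeros(12)
        template[degrees] = 1.0
        scores[name] = profile @ template        # energy inside the scale
    return max(scores, key=scores.get), scores

hpcp = np.random.rand(12)                        # stand-in for an averaged HPCP
print(classify_scale(hpcp, root_pc=2))
```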

Keywords
audio signal processing, computational musicology, machine learning, music information retrieval

Paper topics
Automatic separation, classification, recognition of sound and music, Computational musicology and ethnomusicology, Music information retrieval, Music performance analysis and rendering, Sound/music signal processing algorithms

Easychair keyphrases
chord scale [65], pitch class [33], chord scale type [25], music information retrieval [17], chord scale recognition [14], harmonic pitch class profile [12], scale type [12], jazz improvisation [10], chord scale model [9], chord recognition [8], audio signal [7], binary template [7], chord scale theory [7], gaussian mixture model [7], automatic chord scale recognition [6], binary chord scale template [6], chroma feature [6], frame level hpcp [6], international society [6], pitch class profile [6], predefined binary chord scale [6], scale pitch class [6], feature vector [5], scale recognition [5], standard deviation [5], binary template matching [4], level hpcp vector [4], pitch class distribution [4], predefined binary template [4], scale recognition method [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249258
Zenodo URL: https://zenodo.org/record/3249258


2019.17
Belief Propagation algorithm for Automatic Chord Estimation
Martin, Vincent P.   Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université Bordeaux-I; Bordeaux, France
Reynal, Sylvain   Equipes Traitement de l'Information et Systèmes (ETIS), Université de Cergy-Pontoise; France
Crayencour, Hélène-Camille   Laboratoire des signaux et systèmes (L2S), Université Paris-Sud XI; Paris, France
Basaran, Dogac   Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France

Abstract
This work aims at bridging the gap between two completely distinct research fields: digital telecommunications and Music Information Retrieval. While works in the MIR community have long used algorithms borrowed from voice signal processing, text recognition or image processing, to our knowledge no work based on digital telecommunications algorithms has been produced. This paper specifically targets the use of the Belief Propagation algorithm for the task of Automatic Chord Estimation (ACE). This algorithm is in widespread use in iterative decoders for error-correcting codes, and we show that it offers improved performance in ACE. It certainly represents a promising alternative to the Hidden Markov Model approach.
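
On a chain of chord variables, sum-product belief propagation reduces to forward-backward smoothing; a minimal, generic sketch of that message passing (the frame likelihoods and transition matrix below are placeholders, not the paper's model):

```python
import numpy as np

def chain_belief_propagation(likelihood, transition, prior):
    """Sum-product message passing on a chain of chord variables.
    likelihood: (T, S) per-frame chord likelihoods, transition: (S, S),
    prior: (S,). On a chain this is equivalent to forward-backward."""
    T, S = likelihood.shape
    fwd = np.zeros((T, S))
    bwd = np.ones((T, S))
    fwd[0] = prior * likelihood[0]
    fwd[0] /= fwd[0].sum()
    for t in range(1, T):                        # forward messages
        fwd[t] = likelihood[t] * (fwd[t - 1] @ transition)
        fwd[t] /= fwd[t].sum()
    for t in range(T - 2, -1, -1):               # backward messages
        bwd[t] = transition @ (likelihood[t + 1] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()
    marginals = fwd * bwd
    return marginals / marginals.sum(axis=1, keepdims=True)

S, T = 24, 10                                    # 24 chord classes, 10 frames
lik = np.random.rand(T, S)                       # stand-in chord likelihoods
trans = np.full((S, S), 0.5 / (S - 1)) + np.eye(S) * (0.5 - 0.5 / (S - 1))
post = chain_belief_propagation(lik, trans, np.full(S, 1.0 / S))
print(post.argmax(axis=1))                       # most likely chord per frame
```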

Keywords
Automatic Chord Detection, Belief Propagation Algorithm, General Belief Propagation, Hidden Markov Model, Music Information Retrieval

Paper topics
Music information retrieval

Easychair keyphrases
ground truth [14], transition matrix [11], automatic chord estimation [9], belief propagation [9], bayesian graph [7], long term [7], audio signal [6], belief propagation algorithm [6], chord estimation [6], self transition [6], transition probability [6], hidden state [5], chord progression [4], computation time [4], deep learning [4], fifth transition matrix [4], graphical model [4], ground truth beat [4], inference process [4], long term correlation [4], minor chord [4], observation probability [4], pattern matching [4], short cycle [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249467
Zenodo URL: https://zenodo.org/record/3249467


2019.18
Capturing the reaction time to distinguish between voice and music
Villena-Rodríguez, Alejandro   ATIC Research Group, Andalucía Tech, Universidad de Málaga; Málaga, Spain
Tardón, Lorenzo José   ATIC Research Group, Andalucía Tech, Universidad de Málaga; Málaga, Spain
Barbancho, Isabel   ATIC Research Group, Andalucía Tech, Universidad de Málaga; Málaga, Spain
Barbancho, Ana Maria   ATIC Research Group, Andalucía Tech, Universidad de Málaga; Málaga, Spain
Gómez-Plazas, Irene   ATIC Research Group, Andalucía Tech, Universidad de Málaga; Málaga, Spain
Varela-Salinas, María-José   Andalucía Tech, Universidad de Málaga; Málaga, Spain

Abstract
Reaction times (RTs) are an important source of information in experimental psychology and EEG data analysis. While simple auditory RT has been widely studied, the response time when discriminating between two different auditory stimuli has not yet been determined. The purpose of this experiment is to measure the RT for the discrimination between two different auditory stimuli: speech and instrumental music.

Keywords
Auditory stimuli, Distinguish voice-music, Reaction time

Paper topics
not available

Easychair keyphrases
reaction time [7], speech excerpt [5]

Paper type
Demo

DOI: 10.5281/zenodo.3249274
Zenodo URL: https://zenodo.org/record/3249274


2019.19
Combining Texture-Derived Vibrotactile Feedback, Concatenative Synthesis and Photogrammetry for Virtual Reality Rendering
Magalhaes, Eduardo   Faculdade de Engenharia da Universidade do Porto (FEUP), Universidade do Porto; Porto, Portugal
Bernardes, Gilberto   Faculdade de Engenharia da Universidade do Porto (FEUP), Universidade do Porto; Porto, Portugal
Høeg, Emil   Aalborg University; Aalborg, Denmark
Pedersen, Jon   Aalborg University; Aalborg, Denmark
Serafin, Stefania   Aalborg University; Aalborg, Denmark
Nordahl, Rolf   Aalborg University; Aalborg, Denmark

Abstract
This paper describes a novel framework for real-time sonification of surface textures in virtual reality (VR), aimed at realistically representing the experience of driving over a virtual surface. A combination of capture techniques for real-world surfaces is used to map 3D geometry, texture maps, and auditory and vibrotactile attributes. For the sonification rendering, we propose using information primarily from graphical texture features to define target units in concatenative sound synthesis. To foster models that go beyond the current generation of simple sound textures (e.g., wind, rain, fire), towards highly “synchronized” and expressive scenarios, our contribution draws a framework for higher-level modeling of a bicycle's kinematic rolling on ground contact, with enhanced perceptual symbiosis between auditory, visual and vibrotactile stimuli. We scanned two surfaces represented as texture maps, consisting of different features, morphology and matching navigation. We define target trajectories in a 2-dimensional audio feature space, according to a temporal model and morphological attributes of the surfaces. This synthesis method serves two purposes: real-time auditory feedback, and vibrotactile feedback induced by playing back the concatenated sound samples through a vibrotactile inducer speaker. For this purpose, a virtual environment was created, including four surface variations and consisting of a bicycle ride, allowing the proposed architecture to be tested for real-time adaptation and adequate haptic feedback.

Keywords
Sonic Interaction Design, Sonification, Sound Synthesis, Virtual Reality

Paper topics
Auditory display and data sonification, Interactive performance systems, Models for sound analysis and synthesis, Multimodality in sound and music computing, Sonic interaction design, Sound and music for Augmented/Virtual Reality and games

Easychair keyphrases
vibrotactile feedback [12], concatenative sound synthesis [11], displacement map [10], descriptor space [9], dirt road [9], capture technique [7], real time [7], haptic feedback [6], sound texture [6], surface texture [6], texture map [6], aural feedback [5], feature vector [5], virtual environment [5], virtual reality [5], aalborg university [4], audio capture technique [4], audio stream [4], concatenative sound synthesis engine [4], first order autocorrelation coefficient [4], rubber hand [4], sensory feedback [4], sound synthesis [4], target definition [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249378
Zenodo URL: https://zenodo.org/record/3249378


2019.20
Comparison and Implementation of Data Transmission Techniques Through Analog Audio Signals in the Context of Augmented Mobile Instruments
Michon, Romain   GRAME-CNCM (Générateur de Ressources et d’Activités Musicales Exploratoires, Centre national de création musicale); Lyon, France / Center for Computer Research in Music and Acoustics (CCRMA), Stanford University; Stanford, United States
Orlarey, Yann   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France
Letz, Stéphane   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France
Fober, Dominique   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France

Abstract
Augmented mobile instruments combine digitally-fabricated elements, sensors, and smartphones to create novel musical instruments. Communication between the sensors and the smartphone can be challenging as there doesn’t exist a universal lightweight way to connect external elements to this type of device. In this paper, we investigate the use of two techniques to transmit sensor data through the built-in audio jack input of a smartphone: digital data transmission using the Bell 202 signaling technique, and analog signal transmission using digital amplitude modulation and demodulation with Goertzel filters. We also introduce tools to implement such systems using the Faust programming language and the Teensy development board.
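
For the analog transmission path, per-channel detection at the receiver can be done with the Goertzel algorithm, which evaluates the power of a single frequency bin; a minimal, generic sketch (the carrier frequency and block size are illustrative, not the paper's values):

```python
import numpy as np

def goertzel_power(block, target_freq, fs):
    """Power of one frequency bin via the Goertzel recursion, as used to
    detect an amplitude-modulated sensor tone arriving at the audio jack."""
    k = int(round(len(block) * target_freq / fs))
    w = 2.0 * np.pi * k / len(block)
    coeff = 2.0 * np.cos(w)
    s_prev, s_prev2 = 0.0, 0.0
    for x in block:                    # second-order recursion over the block
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

# A 1 kHz carrier whose amplitude could encode a sensor value.
fs, n = 44100, 441
t = np.arange(n) / fs
tone = 0.7 * np.sin(2 * np.pi * 1000 * t)
print(goertzel_power(tone, 1000, fs) > goertzel_power(tone, 2000, fs))  # True
```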

Keywords
Data Transmission Standards, Faust, Microcontrollers, Mobile Music, Sensors

Paper topics
Hardware systems for sound and music computing, Interfaces for sound and music, Languages, protocols and software environments for sound and music computing

Easychair keyphrases
uart tx encoder [15], audio jack [10], sensor data [8], signaling technique [8], goertzel filter [7], digital amplitude modulation [6], channel number [5], data transmission [5], faust program [5], musical instrument [5], parallel stream [5], analog audio sensor data [4], audio jack output speaker [4], audio sensor data transmission [4], audio signal [4], augmented mobile instrument [4], digital signal processing [4], faust generated block diagram [4], faust programming language [4], output audio signal [4], pin audio jack [4], sound synthesis [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249311
Zenodo URL: https://zenodo.org/record/3249311


2019.21
Composing Space in the Space: An augmented and Virtual Reality Sound Spatialization System
Santini, Giovanni   Hong Kong Baptist University; Hong Kong, Hong Kong

Abstract
This paper describes a tool for gesture-based control of sound spatialization in Augmented and Virtual Reality (AR and VR). While the increased precision and availability of sensors of all kinds has made possible, over the last twenty years, the development of a considerable number of interfaces for controlling sound spatialization through gesture, the combination with VR and AR has not yet been fully explored. Such technologies provide an unprecedented level of interaction, immersivity and ease of use, by letting the user visualize and modify the position, trajectory and behaviour of sound sources in 3D space. Like VR/AR painting programs, the application allows the user to draw lines that function as 3D automations for spatial motion. The system also stores information about the movement speed and directionality of the sound source. Additionally, other parameters can be controlled from a virtual menu. The possibility of alternating between AR and VR allows switching between different environments (the actual space where the system is located or a virtual one). Virtual places can also be connected to different room parameters inside the spatialization algorithm.

Keywords
Augmented Reality, Spatial Audio, Spatialisation Instrument, Virtual Reality

Paper topics
New interfaces for interactive music creation, Sound and music for Augmented/Virtual Reality and games, Spatial sound, reverberation and virtual acoustics

Easychair keyphrases
sound source [32], sound spatialization [12], virtual object [7], spatialization algorithm [6], real time [5], cast shadow [4], digital musical instrument [4], sound source position [4], virtual object position [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249329
Zenodo URL: https://zenodo.org/record/3249329


2019.22
COMPOSING WITH SOUNDS: DESIGNING AN OBJECT ORIENTED DAW FOR THE TEACHING OF SOUND-BASED COMPOSITION
Pearse, Stephen   University of Portsmouth; Portsmouth, United Kingdom
Landy, Leigh   De Montfort University Leicester; Leicester, United Kingdom
Chapman, Duncan   Independent; United Kingdom
Holland, David   De Montfort University Leicester; Leicester, United Kingdom
Eniu, Mihai   University of Portsmouth; Portsmouth, United Kingdom

Abstract
This paper presents and discusses the Compose With Sounds (CwS) Digital Audio Workstation (DAW) and its approach to sequencing musical materials. The system is designed to facilitate composition within the realm of sound-based music, wherein sound objects (real or synthesised), rather than traditional musical notes, are the main musical unit of construction. Unlike traditional DAWs or graphical audio programming environments (such as Pure Data, Max/MSP etc.) that are based around interactions with sonic materials within tracks or audio graphs, the implementation presented here is based solely around sound objects. To achieve this, a bespoke cross-platform audio engine known as FSOM (Free Sound Object Mixer) was created in C++. To enhance the learning experience, imagery and dynamic 3D animations and models are used to allow for efficient exploration and learning. To further support the educational focus of the system, the metaphor of a sound card is used instead of the term sound object. Collections of cards can subsequently be imported into and exported from the software package. When applying audio transformations to cards, interactive 3D graphics are used to illustrate the transformation in real time based on its current settings. Audio transformations and tools within the system all hook into a flexible permissions system that allows users or workshop leaders to create template sessions with features enabled or disabled based on the theme or objective of the usage. The system is part of a suite of pedagogical tools for the creation of experimental electronic music. A version for live performance is currently in development, as is the ability to use video within the system.

Keywords
digital audio workstation design, new interfaces for music creation, object-oriented composition, pedagogy

Paper topics
Algorithms and Systems for music composition, Interfaces for sound and music, Music creation and performance, New interfaces for interactive music creation, Sound/music signal processing algorithms

Easychair keyphrases
free sound object mixer [6], software package [5], audio engine [4], electroacoustic resource site [4], granular synthesis [4], interface project [4], sonic postcard [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249368
Zenodo URL: https://zenodo.org/record/3249368


2019.23
CompoVOX: REAL-TIME SONIFICATION OF VOICE
Molina Villota, Daniel Hernán   Université Jean Monnet; Saint-Étienne, France
Navas, Antonio   Universidad de Málaga; Málaga, Spain
Barbancho, Isabel   Universidad de Málaga; Málaga, Spain

Abstract
We have developed an interactive application that sonifies the human voice and visualizes a graphical interface related to the sounds produced. The program has been developed in Max/MSP; it takes the spoken voice signal and, from its processing, generates an automatic, tonal musical composition.

Keywords
acoustics, automatic composition, music, sonification, sonifiying voice, tonal music, voice

Paper topics
not available

Easychair keyphrases
real time [6], tonal musical sequence [4], voice signal [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249354
Zenodo URL: https://zenodo.org/record/3249354


2019.24
Conditioning a Recurrent Neural Network to synthesize musical instrument transients
Wyse, Lonce   National University of Singapore; Singapore, Singapore
Huzaifah, Muhammad   National University of Singapore; Singapore, Singapore

Abstract
A Recurrent Neural Network (RNN) is trained to predict sound samples based on audio input augmented by control parameter information for pitch, volume, and instrument identification. During the generative phase following training, audio input is taken from the output of the previous time step, and the parameters are externally controlled, allowing the network to be played as a musical instrument. Building on an architecture developed in previous work, we focus on the learning and synthesis of transients – the temporal response of the network during the short time (tens of milliseconds) following the onset and offset of a control signal. We find that the network learns the particular transient characteristics of two different synthetic instruments, and furthermore shows some ability to interpolate between the characteristics of the instruments used in training in response to novel parameter settings. We also study the behavior of the units in hidden layers of the RNN using various visualization techniques and find a variety of volume-specific response characteristics.

Keywords
analaysis/synthesis, audio synthesis, deep learning, musical instrument modeling

Paper topics
Content processing of music audio signals, Interaction in music performance, Models for sound analysis and synthesis, Sonic interaction design

Easychair keyphrases
steady state [10], output signal [6], recurrent neural network [6], decay transient [5], hidden layer [5], hidden unit [4], hidden unit response [4], musical instrument [4], sudden change [4], synthetic instrument [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249457
Zenodo URL: https://zenodo.org/record/3249457


2019.25
Copying clave - a Turing test
Blackmore, Simon   Sonic Arts Research Unit, Oxford Brookes University; Oxford, United Kingdom

Abstract
A blindfolded instructor (evaluator) plays a clave pattern. A computer captures and repeats the pattern; after 1 minute the experiment stops. This process is then repeated by a human, who also tries to copy the clave. After another minute they stop, and the evaluator assesses both performances.

Keywords
clave, Interaction, Machine listening

Paper topics
not available

Easychair keyphrases
not available

Paper type
Demo

DOI: 10.5281/zenodo.3249441
Zenodo URL: https://zenodo.org/record/3249441


2019.26
Dancing Dots - Investigating the Link between Dancer and Musician in Swedish Folk Dance
Misgeld, Olof   KTH Royal Institute of Technology; Stockholm, Sweden
Holzapfel, André   KTH Royal Institute of Technology; Stockholm, Sweden
Ahlbäck, Sven   Royal College of Music (KMH); Stockholm, Sweden

Abstract
The link between musicians and dancers is generally described as strong in many traditional musics and this holds also for Scandinavian Folk Music - spelmansmusik. Understanding the interaction of music and dance has potential for developing theories of performance strategies in artistic practice and for developing interactive systems. In this paper we investigate this link by having Swedish folk musicians perform to animations generated from motion capture recordings of dancers. The different stimuli focus on motions of selected body parts as moving white dots on a computer screen with the aim to understand how different movements can provide reliable cues for musicians. Sound recordings of fiddlers playing to the "dancing dot" were analyzed using automatic alignment to the original music performance related to the dance recordings. Interviews were conducted with musicians and comments were collected in order to shed light on strategies when playing for dancing. Results illustrate a reliable alignment to renderings showing full skeletons of dancers, and an advantage of focused displays of movements in the upper back of the dancer.

Keywords
dance, folk dance, folk music, interaction, Motion Capture, music, music performance, performance strategies, playing for dancing, polska

Paper topics
Computational musicology and ethnomusicology, Interaction in music performance, Interactive performance systems, Music performance analysis and rendering

Easychair keyphrases
alignment curve [9], automatic alignment [7], body movement [6], reduced rendering [6], secondary recording [6], music performance [5], body part [4], dance movement [4], drift phase [4], folk dance [4], folk music [4], scandinavian folk music [4], stimulus type [4], swedish folk dance [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249455
Zenodo URL: https://zenodo.org/record/3249455


2019.27
DAW-Integrated Beat Tracking for Music Production
Dalton, Brett   University of Victoria; Victoria, Canada
Johnson, David   University of Victoria; Victoria, Canada
Tzanetakis, George   University of Victoria; Victoria, Canada

Abstract
Rhythm analysis is a well researched area in music information retrieval that has many useful applications in music production. In particular, it can be used to synchronize the tempo of audio recordings with a digital audio workstation (DAW). Conventionally this is done by stretching recordings over time, however, this can introduce artifacts and alter the rhythmic characteristics of the audio. Instead, this research explores how rhythm analysis can be used to do the reverse by synchronizing a DAW's tempo to a source recording. Drawing on research by Percival and Tzanetakis, a simple beat extraction algorithm was developed and integrated with the Renoise DAW. The results of this experiment show that, using user input from a DAW, even a simple algorithm can perform on par with popular packages for rhythm analysis such as BeatRoot, IBT, and aubio.
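
The Percival–Tzanetakis-style pipeline the authors draw on is built from an onset strength signal and a lag-domain tempo pick; a simplified, hedged sketch of those two stages only (spectral-flux OSS plus an autocorrelation peak pick, with parameters chosen for the toy example rather than taken from the paper):

```python
import numpy as np

def onset_strength(x, fs, n_fft=1024, hop=441):
    """Spectral-flux onset strength signal (OSS): summed positive change
    of the magnitude spectrum between consecutive frames."""
    win = np.hanning(n_fft)
    frames = np.stack([x[i:i + n_fft] * win
                       for i in range(0, len(x) - n_fft, hop)])
    mags = np.abs(np.fft.rfft(frames, axis=1))
    flux = np.maximum(np.diff(mags, axis=0), 0.0).sum(axis=1)
    return flux, fs / hop                        # OSS and its frame rate

def estimate_tempo(oss, oss_rate, bpm_range=(60, 200)):
    """Pick the tempo whose lag maximises the OSS autocorrelation."""
    oss = oss - oss.mean()
    ac = np.correlate(oss, oss, mode="full")[len(oss) - 1:]
    lags = np.arange(1, len(ac))
    bpms = 60.0 * oss_rate / lags
    mask = (bpms >= bpm_range[0]) & (bpms <= bpm_range[1])
    return bpms[mask][np.argmax(ac[1:][mask])]

fs = 22050
x = np.zeros(10 * fs)
x[::fs // 2] = 1.0                               # impulse train at 120 BPM
oss, oss_rate = onset_strength(x, fs)
print(estimate_tempo(oss, oss_rate))             # close to 120
```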

Keywords
Beat Extraction, Beat Induction, Beat Tracking, Digitial Audio Workstation, Music Production, Renoise, Rhythm Analysis

Paper topics
Algorithms and Systems for music composition, Interfaces for sound and music, Music information retrieval

Easychair keyphrases
beat tracking [28], tempo curve [8], beat extraction [7], beat tracking system [6], mir eval [6], music research [6], real time beat tracking [6], beat delta [5], music production [5], oss calculation [5], audio recording [4], beat extraction algorithm [4], beat time [4], beat tracking algorithm [4], daw integrated beat tracking [4], digital audio workstation [4], digital music [4], expected beat delta [4], language processing [4], music information retrieval [4], peak picking [4], spectral flux [4], streamlined tempo estimation algorithm [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249237
Zenodo URL: https://zenodo.org/record/3249237


2019.28
Deep Linear Autoregressive Model for Interpretable Prediction of Expressive Tempo
Maezawa, Akira   Yamaha Corporation; Hamamatsu, Japan

Abstract
Anticipating a human musician's tempo for a given piece of music using a predictable model is important for interactive music applications, but existing studies base such anticipation on hand-crafted features. Following recent trends in using deep learning for music performance rendering, we present an online method for multi-step prediction of the tempo curve, given the past history of tempo curves and the music score that the user is playing. We present a linear autoregressive model whose parameters are determined by a deep convolutional neural network whose input is the music score and the history of the tempo curve; such an architecture allows the machine to acquire music performance idioms based on musical context, while being able to predict the timing based on the user's playing. Evaluations show that our model improves the tempo estimate over a commonly used baseline for tempo prediction by 18%.
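
The autoregressive core of such a model is easy to picture: each future tempo value is a weighted sum of recent values, fed back in for multi-step prediction. A minimal sketch with fixed, illustrative coefficients (in the paper they are produced per time step by the CNN from the score and the tempo history):

```python
import numpy as np

def predict_tempo(history, coeffs, n_steps):
    """Multi-step tempo prediction with a linear autoregressive model:
    each new value is a weighted sum of the last len(coeffs) values,
    then fed back as input for the next step."""
    buf = list(history)
    out = []
    for _ in range(n_steps):
        # coeffs[0] weights lag 1, coeffs[1] weights lag 2, and so on.
        nxt = float(np.dot(coeffs, buf[-len(coeffs):][::-1]))
        out.append(nxt)
        buf.append(nxt)
    return out

history = [112.0, 114.0, 117.0, 118.0]          # recent tempi (BPM)
coeffs = [0.6, 0.3, 0.1]                        # illustrative AR weights
print(predict_tempo(history, coeffs, n_steps=4))
```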

Keywords
Deep Neural Networks, Music Interaction, Tempo Prediction

Paper topics
Automatic music generation/accompaniment systems, Interaction in music performance, Music performance analysis and rendering

Easychair keyphrases
music score [51], music score feature [31], tempo curve [18], score feature [14], hand crafted feature [12], linear ar model [11], score feature extraction [11], timing prediction [10], fully connected layer [9], prediction coefficient function [9], deep linear ar model [8], music performance [8], feature extraction [7], music performance rendering [7], prediction error [7], segment duration [7], tempo prediction [7], deep non linear ar model [6], duet interaction [6], expressive timing [6], leaky relu batch norm [6], music score sn [6], performance feature [6], performance history [6], beat duration [5], deep learning [5], eighth note [5], human musician [5], piano playing [5], real time [5]

Paper type
Full paper

DOI: 10.5281/zenodo.3249387
Zenodo URL: https://zenodo.org/record/3249387


2019.29
Digital Manufacturing For Musical Applications: A Survey of Current Status and Future Outlook
Cavdir, Doga Buse   Center for Computer Research in Music and Acoustics (CCRMA), Stanford University; Stanford, United States

Abstract
In the design of new musical instruments, from acoustic to digital, merging conventional methods with new technologies has been one of the common approaches. Incorporation of prior design expertise with experimental or sometimes industrial methods suggests new directions in both musical expression design and the development of new manufacturing tools. This paper describes key concepts of digital manufacturing processes in musical instrument design. It provides a review of current manufacturing techniques which are commonly used to create new musical interfaces, and discusses future directions of digital fabrication which are applicable to numerous areas in music research, such as digital musical instrument (DMI) design, interaction design, acoustics, performance studies, and education. Additionally, the increasing availability of digital manufacturing tools and fabrication labs all around the world makes these processes an integral part of design and music classes. Examples of digital fabrication labs and manufacturing techniques used in education for student groups whose ages range from elementary to university level are presented. In the context of this paper, it is important to consider how the growing fabrication technology will influence the design and fabrication of musical instruments, as well as what forms of new interaction methods and aesthetics might emerge.

Keywords
acoustics of musical instruments, design and manufacturing of musical instrument, interaction design, iterative design

Paper topics
Hardware systems for sound and music computing, Interfaces for sound and music, Music creation and performance, New interfaces for interactive music creation, Sonic interaction design, Sound and music for accessibility and special needs

Easychair keyphrases
musical instrument [27], musical instrument design [19], musical expression [15], instrument design [14], additive manufacturing [12], rapid prototyping [9], digital fabrication [8], digital manufacturing [8], hybrid manufacturing [8], manufacturing technique [7], manufacturing tool [7], digital musical instrument [6], fabrication lab [6], injection molding [6], instrument body [6], fabrication method [5], acoustic instrument [4], brass pan flute [4], digital manufacturing tool [4], electronic circuit [4], incremental robotic sheet forming [4], industrial manufacturing [4], manufacturing process [4], music research [4], personal manufacturing [4], portable digital manufacturing tool [4], printing technology [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249280
Zenodo URL: https://zenodo.org/record/3249280


2019.30
DRAWING GEOMETRIC FIGURES WITH BRAILLE DESCRIPTION THROUGH A SPEECH RECOGNITION SYSTEM
Chamorro, África   Andalucía Tech, ATIC group, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain
Barbancho, Ana Maria   Andalucía Tech, ATIC group, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain
Barbancho, Isabel   Andalucía Tech, ATIC group, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain
Tardón, Lorenzo José   Andalucía Tech, ATIC group, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain

Abstract
In this contribution, we present a system that produces drawings of geometric figures, together with their descriptions transcribed in Braille, controlled by means of commands acquired through a speech recognition scheme. The designed system recognizes the spoken descriptions needed to draw simple geometric objects: shape, colour, size and position of the figures in the drawing. The speech recognition method selected is based on a distance measure defined with Mel Frequency Cepstral Coefficients (MFCCs). The complete system can be used both by people with visual impairments and by people with hearing impairments thanks to its interface which, in addition to showing the drawing and the corresponding transcription in Braille, also allows the user to hear the description of the commands and of the final drawing.
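
Illustrative sketch (assumptions: librosa is available for MFCC extraction, and dynamic time warping stands in for whatever distance measure the authors actually use): command recognition by nearest MFCC template.

import numpy as np
import librosa

def mfcc_sequence(y, sr, n_mfcc=13):
    # Frame-wise MFCCs, shape (frames, coefficients)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def dtw_distance(a, b):
    """Plain dynamic-time-warping distance between two MFCC sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def recognize(y, sr, templates):
    """Return the name of the stored command template closest to the input."""
    query = mfcc_sequence(y, sr)
    return min(templates, key=lambda name: dtw_distance(query, templates[name]))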

Keywords
Braille, Drawing, MFCCs, Speech recognition

Paper topics
not available

Easychair keyphrases
speech recognition [12], speech recognition subsystem [6], geometric figure [4], speech recognition scheme [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249437
Zenodo URL: https://zenodo.org/record/3249437


2019.31
EVALUATING A CONTINUOUS SONIC INTERACTION: COMPARING A PERFORMABLE ACOUSTIC AND DIGITAL EVERYDAY SOUND
Keenan, Fiona   University of York; York, United Kingdom
Pauletto, Sandra   KTH Royal Institute of Technology; Stockholm, Sweden

Abstract
This paper reports on the procedure and results of an experiment to evaluate a continuous sonic interaction with an everyday wind-like sound created by both acoustic and digital means. The interaction is facilitated by a mechanical theatre sound effect, an acoustic wind machine, which is performed by participants. This work is part of wider research into the potential of theatre sound effect designs as a means to study multisensory feedback and continuous sonic interactions. An acoustic wind machine is a mechanical device that affords a simple rotational gesture to a performer; turning its crank handle at varying speeds produces a wind-like sound. A prototype digital model of a working acoustic wind machine is programmed, and the acoustic interface drives the digital model in performance, preserving the same tactile and kinaesthetic feedback across the continuous sonic interactions. Participants’ performances are elicited with sound stimuli produced from simple gestural performances of the wind-like sounds. The results of this study show that the acoustic wind machine is rated as significantly easier to play than its digital counterpart. Acoustical analysis of the corpus of participants’ performances suggests that the mechanism of the wind machine interface may play a role in guiding their rotational gestures.

Keywords
Evaluation, Multimodality, Perception of Sound, Sonic Interaction, Sound Performance

Paper topics
access and modelling of musical heritage, Interactive performance systems, Interfaces for sound and music, Models for sound analysis and synthesis, Multimodality in sound and music computing, Perception and cognition of sound and music, Sonic interaction design, Technologies for the preservation

Easychair keyphrases
acoustic wind machine [55], wind machine [39], digital wind [33], continuous sonic interaction [19], digital model [18], theatre sound effect [16], crank handle [13], statistically significant difference [12], participant performance [10], digital counterpart [9], wind sound [9], rotational gesture [7], sound effect [7], digital musical instrument [6], everyday sound [6], historical theatre sound effect [6], sonic feedback [6], sound stimulus [6], statistical testing [6], wilcoxon signed rank test [6], easiness rating [5], order effect [5], performance gesture [5], similarity rating [5], steady rotation [5], theatre sound [5], digital wind machine [4], early twentieth century [4], free description [4], theatre wind machine [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249286
Zenodo URL: https://zenodo.org/record/3249286


2019.32
EXPERIMENTAL VERIFICATION OF DISPERSIVE WAVE PROPAGATION ON GUITAR STRINGS
Kartofelev, Dmitri   Department of Cybernetics, Tallinn University of Technology; Tallinn, Estonia
Arro, Joann   Department of Cybernetics, Tallinn University of Technology; Tallinn, Estonia
Välimäki, Vesa   Acoustics Lab, School of Electrical Engineering, Department of Signal Processing and Acoustics, Aalto University; Espoo, Finland

Abstract
Experimental research into the fundamental acoustic aspects of musical instruments and other sound generating devices is an important part of the history of musical acoustics and of physics in general. This paper presents experimental proof of dispersive wave propagation on metal guitar strings. The high-resolution experimental data of string displacement are gathered using video-kymographic high-speed imaging of the vibrating string. The experimental data are indirectly compared against a dispersive Euler-Bernoulli type model described by a PDE. In order to detect the minor wave features associated with the dispersion and distinguish them from other effects present, such as frequency-dependent dissipation, a second model lacking the dispersive (stiffness) term is used. Unsurprisingly, the dispersive effects are shown to be minor but definitively present. The results and methods presented here should find general application in string instrument acoustics.
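
Illustrative sketch (parameter values, boundary treatment and initial condition are assumptions, and frequency-dependent losses are omitted): an explicit finite-difference scheme for a dispersive Euler-Bernoulli type string model, u_tt = c^2 u_xx - kappa^2 u_xxxx, the kind of model the measurements are compared against.

import numpy as np

L, c, kappa, sr = 0.65, 200.0, 1.0, 44100   # length (m), speed (m/s), stiffness, sample rate
k = 1.0 / sr
h = np.sqrt((c**2 * k**2 + np.sqrt(c**4 * k**4 + 16 * kappa**2 * k**2)) / 2)  # stability limit
N = int(L / h)
h = L / N
lam2, mu2 = (c * k / h) ** 2, (kappa * k / h**2) ** 2

x = np.linspace(0, L, N + 1)
u = 0.001 * np.exp(-((x - 0.2 * L) ** 2) / (2 * 0.01 ** 2))   # initial displacement
u_prev = u.copy()                                             # zero initial velocity

out = np.zeros(sr)                                            # one second at a pickup point
for n in range(len(out)):
    u_next = np.zeros_like(u)
    u_next[2:-2] = (2 * u[2:-2] - u_prev[2:-2]
                    + lam2 * (u[3:-1] - 2 * u[2:-2] + u[1:-3])
                    - mu2 * (u[4:] - 4 * u[3:-1] + 6 * u[2:-2] - 4 * u[1:-3] + u[:-4]))
    u_prev, u = u, u_next          # outermost two points stay at zero (crude clamped ends)
    out[n] = u[int(0.3 * N)]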

Keywords
dispersion analysis, dispersive wave propagation, experimental acoustics, guitar string, kymography, line-scan camera, nylon string, stiff string, String vibration

Paper topics
Digital audio effects, Models for sound analysis and synthesis

Easychair keyphrases
string displacement [14], traveling wave [12], string vibration [11], guitar string [9], dispersive wave propagation [7], boundary condition [6], frequency dependent [6], high frequency [6], high frequency wave component [6], high speed line scan [6], time series [6], digital waveguide [5], dispersion analysis [5], full model [5], digital audio effect [4], dispersive euler bernoulli type [4], dispersive high frequency oscillating tail [4], electric field sensing [4], frequency dependent loss [4], general solution [4], group velocity [4], high resolution experimental data [4], initial value problem [4], line scan camera [4], piano string [4], signal processing [4], triangular shaped initial condition [4], video kymographic [4], wave equation [4], window size [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249372
Zenodo URL: https://zenodo.org/record/3249372


2019.33
Exploring the Effects of Diegetic and Non-diegetic Audiovisual Cues on Decision-making in Virtual Reality
Çamcı, Anıl   University of Michigan; Ann Arbor, United States

Abstract
The user experience of a virtual reality intrinsically depends upon how the underlying system relays information to the user. Auditory and visual cues that make up the user interface of a VR help users make decisions on how to proceed in a virtual scenario. These interfaces can be diegetic (i.e. presented as part of the VR) or non-diegetic (i.e. presented as an external layer superimposed onto the VR). In this paper, we explore how auditory and visual cues of diegetic and non-diegetic origins affect a user’s decision-making process in VR. We present the results of a pilot study, where users are placed into virtual situations where they are expected to make choices upon conflicting suggestions as to how to complete a given task. We analyze the quantitative data pertaining to user preferences for modality and diegetic-quality. We also discuss the narrative effects of the cue types based on a follow-up survey conducted with the users.

Keywords
Auditory and visual interfaces, Diegetic and non-diegetic cues, Virtual Reality

Paper topics
Sound and music for Augmented/Virtual Reality and games

Easychair keyphrases
non diegetic [33], diegetic quality [16], virtual reality [14], diegetic audio [11], diegetic visual [11], virtual environment [11], visual cue [11], diegetic audio cue [9], second attempt [9], cue type [8], virtual room [8], decision making [7], user experience [7], first attempt [6], non diegetic cue [6], non diegetic visual object [6], cinematic virtual reality [4], diegetic audio object [4], diegetic cue [4], diegetic quality pairing [4], diegetic sound [4], implied universe [4], make decision [4], non diegetic audio cue [4], non diegetic sound [4], user interface [4], virtual space [4], visual element [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249315
Zenodo URL: https://zenodo.org/record/3249315


2019.34
EXTENDING JAMSKETCH: AN IMPROVISATION SUPPORT SYSTEM
Yasuhara, Akane   Nihon University; Tokyo, Japan
Fujii, Junko   Nihon University; Tokyo, Japan
Kitahara, Tetsuro   Nihon University; Tokyo, Japan

Abstract
We previously introduced JamSketch, a system which enabled users to improvise music by drawing a melodic outline. However, users could not control the rhythm and intensity of the generated melody. Here, we present extensions to JamSketch to enable rhythm and intensity control.

Keywords
Automatic music composition, Genetic algorithm, Melodic outline, Musical improvisation, Pen pressure

Paper topics
not available

Easychair keyphrases
melodic outline [23], pen pressure [8], note density [6], piano roll display [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249349
Zenodo URL: https://zenodo.org/record/3249349


2019.35
FACIAL ACTIVITY DETECTION TO MONITOR ATTENTION AND FATIGUE
Cobos, Oscar   Andalucía Tech, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain
Munilla, Jorge   Andalucía Tech, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain
Barbancho, Ana Maria   Andalucía Tech, ATIC group, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain
Barbancho, Isabel   Andalucía Tech, ATIC group, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain
Tardón, Lorenzo José   Andalucía Tech, ATIC group, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain

Abstract
In this contribution, we present a facial activity detection system using image processing and machine learning techniques. Facial activity detection allows monitoring people's emotional states, attention, fatigue, reactions to different situations, etc., in a non-intrusive way. The designed system can be used in many fields such as education and musical perception. Monitoring the facial activity of a person can help us to know whether it is necessary to take a break, change the type of music that is being listened to, or modify the way the class is taught.

Keywords
Education, Facial activity detection, Monitor Attention, Musical perception, SVM

Paper topics
not available

Easychair keyphrases
facial activity detection system [10], facial activity detection [6], finite state machine [6], temporal analysis [6], mouth state detection [4], mouth status [4], non intrusive way [4], person emotional state [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249358
Zenodo URL: https://zenodo.org/record/3249358


2019.36
Finding new practice material through chord-based exploration of a large music catalogue
Pauwels, Johan   Queen Mary University of London; London, United Kingdom
Sandler, Mark   Queen Mary University of London; London, United Kingdom

Abstract
Our demo is a web app that suggests new practice material to music learners based on automatic chord analysis. It is aimed at music practitioners of any skill set, playing any instrument, as long as they know how to play along with a chord sheet. Users need to select a number of chords in the app, and are then presented with a list of music pieces containing those chords. Each of those pieces can be played back while its chord transcription is displayed in sync to the music. This enables a variety of practice scenarios, ranging from following the chords in a piece to using the suggested music as a backing track to practice soloing over.
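
Illustrative sketch (a toy catalogue, not the web app's backend): the selection logic amounts to returning the pieces whose transcribed chords all fall within the user's chosen set.

catalogue = {
    "Song A": ["C", "G", "Am", "F"],
    "Song B": ["C", "G", "F"],
    "Song C": ["Dm", "G7", "Cmaj7"],
}

def playable_with(selected_chords, catalogue):
    selected = set(selected_chords)
    return [title for title, chords in catalogue.items() if set(chords) <= selected]

print(playable_with(["C", "F", "G", "Am"], catalogue))   # ['Song A', 'Song B']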

Keywords
automatic chord recognition, music education, music recommendation, web application

Paper topics
not available

Easychair keyphrases
chord transcription [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249445
Zenodo URL: https://zenodo.org/record/3249445


2019.37
From Jigs and Reels to Schottisar och Polskor: Generating Scandinavian-like Folk Music with Deep Recurrent Networks
Mossmyr, Simon   KTH Royal Institute of Technology; Stockholm, Sweden
Hallström, Eric   KTH Royal Institute of Technology; Stockholm, Sweden
Sturm, Bob Luis   KTH Royal Institute of Technology; Stockholm, Sweden
Vegeborn, Victor Hansjons   KTH Royal Institute of Technology; Stockholm, Sweden
Wedin, Jonas   KTH Royal Institute of Technology; Stockholm, Sweden

Abstract
The use of recurrent neural networks for modeling and generating music has seen much progress with textual transcriptions of traditional music from Ireland and the UK. We explore how well these models perform for textual transcriptions of traditional music from Scandinavia. This type of music can have characteristics that are similar to and different from those of Irish music, e.g. structure, mode, and rhythm. We investigate the effects of different architectures and training regimens, and evaluate the resulting models using two methods: a comparison of statistics between real and generated transcription populations, and an appraisal of generated transcriptions via a semi-structured interview with an expert in Swedish folk music. As for the models trained on Irish transcriptions, we see these recurrent models can generate new transcriptions that share characteristics with Swedish folk music. One of our models has been implemented online at http://www.folkrnn.org.

Keywords
Deep Learning, Folk Music, GRU, LSTM, Neural Network, Polka, RNN

Paper topics
Algorithms and Systems for music composition, Automatic music generation/accompaniment systems, New interfaces for interactive music creation

Easychair keyphrases
scandinavian folk music [15], folk music [12], training data [10], recurrent neural network [7], swedish folk music [7], folkwiki dataset [6], real transcription [6], short term memory [6], gru layer [5], music transcription [5], traditional music [5], eric hallstr om [4], fake transcription [4], gated recurrent unit [4], gru model [4], irish traditional music [4], neural network [4], semi structured interview [4], transcription model [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249474
Zenodo URL: https://zenodo.org/record/3249474


2019.38
FROM VOCAL SKETCHING TO SOUND MODELS BY MEANS OF A SOUND-BASED MUSICAL TRANSCRIPTION SYSTEM
Panariello, Claudio   KTH Royal Institute of Technology; Stockholm, Sweden
Sköld, Mattias   Royal College of Music (KMH); Stockholm, Sweden
Frid, Emma   KTH Royal Institute of Technology; Stockholm, Sweden
Bresin, Roberto   KTH Royal Institute of Technology; Stockholm, Sweden

Abstract
This paper explores how notation developed for the representation of sound-based musical structures could be used for the transcription of vocal sketches representing expressive robot movements. A mime actor initially produced expressive movements which were translated to a humanoid robot. The same actor was then asked to illustrate these movements using vocal sketching. The vocal sketches were transcribed by two composers using sound-based notation. The same composers later synthesised new sonic sketches from the annotated data. Different transcriptions and synthesised versions of these were compared in order to investigate how the audible outcome changes for different transcriptions and synthesis routines. This method provides a palette of sound models suitable for the sonification of expressive body movements.

Keywords
robot sound, sonic interaction design, sonification, sound representation, sound transcription, voice sketching

Paper topics
Auditory display and data sonification, Models for sound analysis and synthesis, Multimodality in sound and music computing, Music performance analysis and rendering, Perception and cognition of sound and music, Social interaction in sound and music computing, Sonic interaction design

Easychair keyphrases
vocal sketch [21], sound synthesis [8], mime actor [7], notation system [7], sound structure [7], synthesized version [6], vocal sketching [6], humanoid robot [5], sonic sketch [5], expressive gesture [4], human robot interaction [4], kmh royal college [4], movement sonification [4], pitched sound [4], sonao project [4], sound based musical structure [4], sound model [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249299
Zenodo URL: https://zenodo.org/record/3249299


2019.39
Graph Based Physical Models for Sound Synthesis
Christensen, Pelle Juul   Aalborg University; Aalborg, Denmark
Serafin, Stefania   Aalborg University; Aalborg, Denmark

Abstract
We focus on physical models in which multiple strings are connected via junctions to form graphs. Starting with the case of the 1D wave equation, we show how to extend it to a string branching into two other strings, and from there how to build complex cyclic and acyclic graphs. We introduce the concept of dense models and show that a discretization of the 2D wave equation can be built using our methods, and that there are more efficient ways of modelling 2D wave propagation than a rectangular grid. We discuss how to apply Dirichlet and Neumann boundary conditions to a graph model, and show how to compute the frequency content of a graph using common methods. We then prove general lower and upper bounds on computational complexity. Lastly, we show how to extend our results to other kinds of acoustical objects, such as linear bars, and how to add dampening to a graph model. A reference implementation in MATLAB and an interactive JUCE/C++ application is available online.
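
Illustrative sketch (the junction weighting used here, 2/degree, is an assumption of this sketch and not necessarily the scheme derived in the paper): three ideal string segments joined at a single junction node, each interior point updated with the standard finite-difference scheme for the 1D wave equation.

import numpy as np

c, L, sr = 300.0, 0.5, 44100
k = 1.0 / sr
N = int(L / (c * k))
lam2 = (c * k / (L / N)) ** 2            # slightly below the stability limit of 1

segments = [(np.zeros(N), np.zeros(N)) for _ in range(3)]   # (current, previous)
uj = uj_prev = 0.0                                          # junction displacement
segments[0][0][N // 2] = segments[0][1][N // 2] = 0.001     # pluck the first branch

out = np.zeros(sr // 10)
for n in range(len(out)):
    new_segs, ends = [], []
    for u, u_prev in segments:
        u_next = np.zeros_like(u)                           # index 0 is a fixed end
        u_next[1:-1] = (2 * u[1:-1] - u_prev[1:-1]
                        + lam2 * (u[2:] - 2 * u[1:-1] + u[:-2]))
        u_next[-1] = 2 * u[-1] - u_prev[-1] + lam2 * (uj - 2 * u[-1] + u[-2])
        ends.append(u[-1])
        new_segs.append((u_next, u))
    d = len(segments)
    uj_next = 2 * uj - uj_prev + lam2 * (2.0 / d) * sum(e - uj for e in ends)
    segments, uj_prev, uj = new_segs, uj, uj_next
    out[n] = segments[1][0][N // 2]      # listen on another branch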

Keywords
Digital signal processing, Physical modeling for sound synthesis, Sound and music computing

Paper topics
and virtual acoustics, Models for sound analysis and synthesis, reverberation, Sound/music signal processing algorithms, Spatial sound

Easychair keyphrases
boundary condition [22], d wave equation [20], finite difference scheme [20], wave equation [14], pendant node [9], branching topology [8], rectangular grid [8], computational complexity [7], physical model [7], string segment [7], graph based physical model [6], mass spring network [6], digital waveguide [5], graph model [5], sound synthesis [5], aalborg university copenhagen [4], dense model [4], difference operator [4], d wave equation based model [4], edge node [4], hexagonal grid [4], linear bar model [4], mass spring system [4], n branch topology [4], physical modelling [4], stability condition [4], wave propagation [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249331
Zenodo URL: https://zenodo.org/record/3249331


2019.40
HMM-BASED GLISSANDO DETECTION FOR RECORDINGS OF CHINESE BAMBOO FLUTE
Wang, Changhong   Queen Mary University of London; London, United Kingdom
Benetos, Emmanouil   Queen Mary University of London; London, United Kingdom
Meng, Xiaojie   China Conservatory of Music; Beijing, China
Chew, Elaine   Queen Mary University of London; London, United Kingdom

Abstract
Playing techniques such as ornamentations and articulation effects constitute important aspects of music performance. However, their computational analysis is still at an early stage due to a lack of instrument diversity, established methodologies and informative data. Focusing on the Chinese bamboo flute, we introduce a two-stage glissando detection system based on hidden Markov models (HMMs) with Gaussian mixtures. A rule-based segmentation process extracts glissando candidates that are consecutive note changes in the same direction. Glissandi are then identified by two HMMs. The study uses a newly created dataset of Chinese bamboo flute recordings, including both isolated glissandi and real-world pieces. The results, based on both frame- and segment-based evaluation for ascending and descending glissandi respectively, confirm the feasibility of the proposed method for glissando detection. Better detection performance of ascending glissandi over descending ones is obtained due to their more regular patterns. Inaccurate pitch estimation forms a main obstacle for successful fully-automated glissando detection. The dataset and method can be used for performance analysis.
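
Illustrative sketch (the HMM stage is not reproduced; the input is assumed to be a frame-wise pitch track already quantized to MIDI note numbers): the rule-based candidate segmentation keeps maximal runs of note changes that all move in the same direction.

import numpy as np

def glissando_candidates(notes, min_changes=3):
    """Return (start_frame, end_frame, direction) spans where consecutive
    note changes all move in the same direction."""
    notes = np.asarray(notes)
    onsets = np.concatenate(([0], np.flatnonzero(np.diff(notes) != 0) + 1))
    pitches = notes[onsets]
    steps = np.sign(np.diff(pitches))        # +1 ascending, -1 descending
    candidates, run_start = [], 0
    for i in range(1, len(steps) + 1):
        if i == len(steps) or steps[i] != steps[run_start]:
            if i - run_start >= min_changes:
                candidates.append((int(onsets[run_start]), int(onsets[i]),
                                   int(steps[run_start])))
            run_start = i
    return candidates

track = [62]*5 + [64]*4 + [66]*4 + [67]*9 + [65]*5 + [64]*5 + [62]*5
print(glissando_candidates(track))           # one ascending and one descending run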

Keywords
Ethnomusicology, Glissando, Hidden Markov models, Playing technique detection

Paper topics
Automatic separation, classification of sound and music, Computational musicology and ethnomusicology, Content processing of music audio signals, Music information retrieval, Music performance analysis and rendering, recognition, Sound/music signal processing algorithms

Easychair keyphrases
playing technique [22], music information retrieval [11], descending glissando [10], glissando detection [10], isolated glissando [10], ground truth [9], ascending glissando [8], detection system [8], note change [8], pitch estimation [8], ascending and descending [7], chinese bamboo flute [7], international society [7], whole piece recording [7], automated glissando detection system [6], computational analysis [6], fully automated glissando [6], guitar playing technique [6], performed glissando [6], rule based segmentation [6], glissando candidate [5], modeling of magnitude and phase derived [5], note number [5], signal processing [5], ascending performed glissando [4], cbf playing technique [4], glissando detection system [4], hidden markov model [4], pitch estimation accuracy [4], playing technique detection [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249470
Zenodo URL: https://zenodo.org/record/3249470


2019.41
Increasing Access to Music in SEN Settings
Davis, Tom   Bournemouth University; Bournemouth, United Kingdom
Pierson, Daniel   Bournemouth University; Bournemouth, United Kingdom
Bevan, Ann   Bournemouth University; Bournemouth, United Kingdom

Abstract
This paper presents some of the outcomes of a one-year Higher Education Innovation Fund funded project examining the use of music technology to increase access to music for children within special educational need (SEN) settings. Despite the widely acknowledged benefits of interacting with music for children with SEN, there are a number of well documented barriers to access [1, 2, 3]. These barriers take a number of forms, including financial, knowledge-based or attitudinal. The aims of this project were to assess the current music technology provision in SEN schools within a particular part of the Dorset region, UK, determine the barriers they were facing and develop strategies to help the schools overcome these barriers. An overriding concern for this project was to leave the schools with lasting benefit and meaningful change. As such, an Action Research [4] methodology was followed, which has at its heart an understanding of the participants as co-researchers, helping ensure any solutions presented met the needs of the stakeholders. Although technological solutions to problems were presented to the school, it was found that the main issues were around the flexibility of equipment to be used in different locations, staff time and staff attitudes to technology.

Keywords
access, inclusion, interaction, SEN

Paper topics
Sound and music for accessibility and special needs

Easychair keyphrases
music technology [21], music therapy [15], action research [12], resonance board [12], music therapist [6], music therapy perspective [6], vibro tactile resonance board [6], sen setting [5], action research methodology [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249345
Zenodo URL: https://zenodo.org/record/3249345


2019.42
Insights in habits and attitudes regarding programming sound synthesizers: a quantitative study
Kreković, Gordan   Visage Technologies; Zagreb, Croatia

Abstract
Sound synthesis represents an indispensable tool for modern composers and performers, but achieving desired sonic results often requires a tedious manipulation of various numeric parameters. In order to facilitate this process, a number of possible approaches have been proposed, but without systematic user research that could help researchers articulate the problem and make informed design decisions. The purpose of this study is to fill that gap and to investigate the attitudes and habits of sound synthesizer users. The research was based on a questionnaire answered by 122 participants which, besides the main questions about habits and attitudes, covered questions about their demographics, profession, educational background and experience in using sound synthesizers. The results were quantitatively analyzed in order to explore relations between all those dimensions. The main results suggest that the participants more often modify or create programs than use existing presets or programs, and that such habits do not depend on the participants' education, profession, or experience.
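
Illustrative sketch (the numbers are made up; only the statistical tests named in the paper's keyphrases are shown): comparing two groups of respondents with a Wilcoxon rank-sum test and correlating two ordinal variables with Spearman's rho, using SciPy.

from scipy.stats import ranksums, spearmanr

with_education    = [4, 7, 2, 9, 5, 6, 8, 3, 7, 5]   # e.g. programs created per month
without_education = [5, 6, 3, 8, 4, 7, 6, 5, 9, 4]
stat, p = ranksums(with_education, without_education)
print(f"Wilcoxon rank-sum: statistic={stat:.3f}, p={p:.3f}")

years_experience = [1, 3, 5, 2, 10, 7, 4, 6, 8, 2]
programs_created = [2, 4, 6, 3,  9, 7, 4, 5, 8, 2]
rho, p = spearmanr(years_experience, programs_created)
print(f"Spearman rho={rho:.3f}, p={p:.3f}")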

Keywords
automatic parameter selection, quantitative studies, sound synthesis, user research

Paper topics
and software environments for sound and music computing, Interfaces for sound and music, Languages, Models for sound analysis and synthesis, protocols

Easychair keyphrases
sound synthesizer [37], user interface [18], synthesis parameter [17], synthesizer programming [14], rank sum test [12], wilcoxon rank sum [12], computer music [11], music education [11], existing program [10], genetic algorithm [10], sound synthesis [9], usage habit [9], automatic selection [7], create program [7], creating and modifying [7], spearman correlation coefficient [7], automatic parameter selection [6], formal music education [6], international computer [6], modifying program [6], music student [6], professional musician [6], statistically significant difference [6], user research [6], desired sound [5], synthesis engine [5], audio engineering society [4], audio feature [4], computer science technique [4], music education level [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249370
Zenodo URL: https://zenodo.org/record/3249370


2019.43
Interacting with digital resonators by acoustic excitation
Neupert, Max   Bauhaus-Universität Weimar; Weimar, Germany
Wegener, Clemens   The Center for Haptic Audio Interaction Research (CHAIR); Weimar, Germany

Abstract
This demo presents an acoustic interface which allows users to directly excite digital resonators (digital waveguides, lumped models, modal synthesis and sample convolution). Parameters are simultaneously controlled by the touch position on the same surface. The experience is an intimate and intuitive interaction with sound for percussive and melodic play.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
Demo

DOI: 10.5281/zenodo.3249260
Zenodo URL: https://zenodo.org/record/3249260


2019.44
Interacting with Musebots (that don’t really listen)
Eigenfeldt, Arne   Simon Fraser University; Vancouver, Canada

Abstract
tinySounds is a collaborative work for live performer and musebot ensemble. Musebots are autonomous musical agents that interact, via messaging, to create a musical performance with or without human interaction.

Keywords
generative music, interactive system, musebots, musical agents

Paper topics
not available

Easychair keyphrases
musebot ensemble [7]

Paper type
Demo

DOI: 10.5281/zenodo.3249347
Zenodo URL: https://zenodo.org/record/3249347


2019.45
Interaction-based Analysis of Freely Improvised Music
Kalonaris, Stefano   Center for Advanced Intelligence Project (AIP), RIKEN; Tokyo, Japan

Abstract
This paper proposes a computational method for the analysis and visualization of structure in freely improvised musical pieces, based on source separation and interaction patterns. A minimal set of descriptive axes is used for eliciting interaction modes, regions and transitions. To this end, a suitable unsupervised segmentation model is selected based on the author's ground truth, and is used to compute and compare event boundaries of the individual audio sources. While still at a prototypal stage of development, this method offers useful insights for evaluating a musical expression that lacks formal rules and protocols, including musical functions (e.g., accompaniment, solo, etc.) and form (e.g., verse, chorus, etc.).

Keywords
Computational musicology, Interaction and improvisation, Interaction in music performance, Perception and cognition of sound and music

Paper topics
Computational musicology and ethnomusicology, Improvisation in music through interactivity, Interaction in music performance, Music information retrieval, Perception and cognition of sound and music

Easychair keyphrases
freely improvised music [23], musical expression [13], free jazz [11], audio source [10], free improvisation [10], real time [9], dynamic mode [8], source separation [7], audio source separation [6], clear cut [6], musical improvisation [6], musical surface [6], music information retrieval [6], ordinal linear discriminant analysis [6], audio mix [5], jazz improvisation [5], static mode [5], activation time [4], auditory stream segregation [4], convex non negative matrix factorization [4], ground truth [4], improvised music [4], individual audio source [4], inter region [4], multi track recording [4], music theory [4], segmentation boundary [4], signal processing [4], structural segmentation [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249239
Zenodo URL: https://zenodo.org/record/3249239


2019.46
INTERACTIVE MUSIC TRAINING SYSTEM
Moreno, Daniel   Andalucía Tech, ATIC group, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain
Barbancho, Isabel   Andalucía Tech, ATIC group, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain
Barbancho, Ana Maria   Andalucía Tech, ATIC group, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain
Tardón, Lorenzo José   Andalucía Tech, ATIC group, E.T.S.I. Telecomunicación, Universidad de Málaga; Málaga, Spain

Abstract
In this contribution, we present an interactive system for playing while learning music. The system is based on different computer games controlled by the user with a remote control. The remote control has been implemented using IMU sensors for 3D tracking. The computer games are programmed in Python and allow users to practice rhythm as well as tune, i.e., the ascending or descending order of musical notes.

Keywords
IMU sensors, Interactive system, Music learning, Serious Games

Paper topics
not available

Easychair keyphrases
remote control [16], interactive music training system [10], computer game [8], practice rhythm [5], serious game [5], ascending or descending [4], lleva el cursor [4], mover el cursor [4], note order game [4], ve una partitura con [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249439
Zenodo URL: https://zenodo.org/record/3249439


2019.47
INTERNAL COMPLEXITY FOR EXPLORATORY INTERACTION
Hobye, Mads   Roskilde University; Roskilde, Denmark

Abstract
When designing interactive sound for non-utilitarian ludic interaction, internal complexity can be a way of opening up a space for curiosity and exploration. Internal complexity should be understood as non-linear mappings between the inputs and the parameters they affect in the output (sound). This paper presents three different experiments which explore ways to create internal complexity with rather simple interfaces for curious exploration.

Keywords
8 Bit synth, Curiosity, Exploration, Interaction, Ludic play

Paper topics
not available

Easychair keyphrases
noise machine [5], internal complexity [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249447
Zenodo URL: https://zenodo.org/record/3249447


2019.48
‘Jazz Mapping’ an Analytical and Computational Approach to Jazz Improvisation
Vassilakis, Dimitrios   National and Kapodistrian University of Athens; Athens, Greece
Georgaki, Anastasia   National and Kapodistrian University of Athens; Athens, Greece
Anagnostopoulou, Christina   National and Kapodistrian University of Athens; Athens, Greece

Abstract
“Jazz mapping” is a multi-layer analytical approach to jazz improvisation based on hierarchical segmentation and categorization of segments, or constituents, according to their function in the overall improvisation. In this way, higher-level semantics of transcribed and recorded jazz solos can be exposed. In this approach, the knowledge of the expert jazz performer is taken into account in all analytical decisions. We apply the method to two well-known solos, by Sonny Rollins and Charlie Parker, and we discuss how improvisations resemble storytelling, employing a broad range of structural, expressive, technical and emotional tools usually associated with the production and experience of language and of linguistic meaning. We make explicit the choices of the experienced jazz improviser, who has developed a strong command over the language and unfolds a story in real time, very similar to prose on a given framework. He/she utilizes various mechanisms to communicate expressive intent, elicit emotional responses, and make his/her musical “story” memorable and enjoyable to fellow musicians and listeners. We also comment on potential application areas of this work related to music and artificial intelligence.

Keywords
Interaction with music, Jazz Analyses, Jazz performance and AI, Machine learning, Music information retrieval, Semantics

Paper topics
Interaction in music performance, Models for sound analysis and synthesis, Music creation and performance, Music performance analysis and rendering, Perception and cognition of sound and music

Easychair keyphrases
jazz improvisation [14], thematic development [10], sonny rollin [8], personal voice [6], structural element [6], machine learning [5], charlie parker [4], jazz mapping [4], jazz solo [4], kapodistrian university [4], music study [4], story telling [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249431
Zenodo URL: https://zenodo.org/record/3249431


2019.49
Learning to Generate Music with BachProp
Colombo, Florian   Laboratory of Computational Neurosciences, Ecole Polytechnique Fédérale de Lausanne (EPFL); Lausanne, Switzerland
Brea, Johanni   Ecole Polytechnique Fédérale de Lausanne (EPFL); Lausanne, Switzerland
Gerstner, Wulfram   Ecole Polytechnique Fédérale de Lausanne (EPFL); Lausanne, Switzerland

Abstract
As deep learning advances, algorithms of music composition increase in performance. However, most of the successful models are designed for specific musical structures. Here, we present BachProp, an algorithmic composer that can generate music scores in many styles given sufficient training data. To adapt BachProp to a broad range of musical styles, we propose a novel representation of music and train a deep network to predict the note transition probabilities of a given music corpus. In this paper, new music scores generated by BachProp are compared with the original corpora as well as with different network architectures and other related models. A set of comparative measures is used to demonstrate that BachProp captures important features of the original datasets better than other models, and we invite the reader to a qualitative comparison on a large collection of generated songs.

Keywords
Automated Music Composition, Deep Learning, Generative Model of Music, Music Representation, Recurrent Neural Networks

Paper topics
Algorithms and Systems for music composition

Easychair keyphrases
neural network [15], generative model [12], recurrent neural network [11], time shift [11], novelty profile [9], music score [8], music composition [7], hidden state [6], note sequence [6], reference corpus [6], auto novelty [5], bach chorale [5], john sankey [5], local statistic [5], machine learning [5], midi sequence [5], novelty score [5], string quartet [5], base unit [4], data set [4], hidden layer [4], musical structure [4], preprint arxiv [4], probability distribution [4], recurrent layer [4], recurrent neural network model [4], science ecole polytechnique [4], song length [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249394
Zenodo URL: https://zenodo.org/record/3249394


2019.50
Mass-Interaction Physical Models for Sound and Multi-Sensory Creation: Starting Anew
Villeneuve, Jérôme   Grenoble Images Parole Signal Automatique (GIPSA-Lab), Université de Grenoble-Alpes; Grenoble, France
Leonard, James   Grenoble Images Parole Signal Automatique (GIPSA-Lab), Université de Grenoble-Alpes; Grenoble, France

Abstract
Mass-interaction methods for sound synthesis, and more generally for digital artistic creation, have been studied and explored for over three decades, by a multitude of researchers and artists. However, for a number of reasons this research has remained rather confidential, subsequently overlooked and often considered as the "odd-one-out" of physically-based synthesis methods, of which many have grown exponentially in popularity over the last ten years. In the context of a renewed research effort led by the authors on this topic, this paper aims to reposition mass-interaction physical modelling in the contemporary fields of Sound and Music Computing and Digital Arts: what are the core concepts? The end goals? And more importantly, which relevant perspectives can be foreseen in this current day and age? Backed by recent developments and experimental results, including 3D mass-interaction modelling and emerging non-linear effects, this proposed reflection casts a first canvas for an active, and resolutely outreaching, research on mass-interaction physical modelling for the arts.

Keywords
3D Physical Modeling, Emerging Non-linear Behaviors, Mass Interaction, Multi-Sensory, Processing

Paper topics
Interactive performance systems, Models for sound analysis and synthesis, Multimodality in sound and music computing, Music creation and performance, New interfaces for interactive music creation

Easychair keyphrases
mass interaction [31], sound synthesis [17], mass interaction physical modelling [16], physical modelling [15], real time [11], non linear [9], discrete time [7], interaction physical [7], non linear behaviour [7], computer music [6], mass interaction model [6], mass interaction physical model [6], non linearity [5], tension modulation [5], chaotic oscillation [4], finite difference scheme [4], grenoble inp gipsa lab [4], haptic interaction [4], modular physical modelling [4], musical instrument [4], physical model [4], virtual object [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249313
Zenodo URL: https://zenodo.org/record/3249313


2019.51
Mechanical Entanglement: A Collaborative Haptic-Music Performance
Kontogeorgakopoulos, Alexandros   Cardiff Metropolitan University; Cardiff, United Kingdom
Sioros, George   RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, Department of Musicology, University of Oslo; Oslo, Norway
Klissouras, Odysseas   oneContinuousLab; Athens, Greece

Abstract
Mechanical Entanglement is a musical composition for three performers. Three force-feedback devices, each containing two haptic faders, are mutually coupled using virtual linear springs and dampers. During the composition, the performers feel each others' gestures and collaboratively process the music material. The interaction's physical modelling parameters are modified during the different sections of the composition. An algorithm which processes three stereo channels stretches three copies of the same music clip in and out of sync. The performers are “controlling” the stretching algorithm and an amplitude modulation effect, both applied to recognisable classical and contemporary music compositions. Each of them substantially modifies the length and the dynamics of the same music clip, while also simultaneously affecting, subtly or often abruptly, the gestural behaviour of the other performers. At fixed points in the length of the composition, the music gradually comes back into sync and the performers realign their gestures. This phasing “game” between gestures and sound creates tension and emphasises the physicality of the performance.

Keywords
collaborative performance, composition, force-feedback, haptics, interactive music performance, lumped element modelling, mass-interaction networks, physical modelling

Paper topics
Improvisation in music through interactivity, Interaction in music performance, Interactive performance systems, Interfaces for sound and music, Music creation and performance, New interfaces for interactive music creation, Social interaction in sound and music computing

Easychair keyphrases
force feedback [10], computer music [9], haptic device [9], physical model [7], force feedback device [6], audio file [5], haptic fader [5], musical expression [5], musical instrument [5], signal processing [5], haptic digital audio effect [4], haptic signal processing [4], haptic signal processing framework [4], led light [4], mechanical entanglement [4], musical composition [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249242
Zenodo URL: https://zenodo.org/record/3249242


2019.52
Melody Identification in Standard MIDI Files
Jiang, Zheng   Carnegie Mellon University; Pittsburgh, United States
Dannenberg, Roger B.   Carnegie Mellon University; Pittsburgh, United States

Abstract
Melody identification is an important early step in music analysis. This paper presents a tool to identify the melody in each measure of a Standard MIDI File. We also share an open dataset of manually labeled music for researchers. We use a Bayesian maximum-likelihood approach and dynamic programming as the basis of our work. We have trained parameters on data sampled from the million song dataset and tested on a dataset including 1706 measures of music from different genres. Our algorithm achieves an overall accuracy of 90% in the test dataset. We compare our results to previous work.
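
Illustrative sketch (the per-measure scores are made up; the paper's Bayesian likelihoods are not reproduced): the dynamic-programming step amounts to a Viterbi pass over measures with a penalty for switching the melody channel.

import numpy as np

def identify_melody_channels(log_likelihood, switch_penalty=2.0):
    """log_likelihood[m, c] scores how melody-like channel c looks in measure m;
    changing channel between consecutive measures costs `switch_penalty`."""
    n_measures, n_channels = log_likelihood.shape
    dp = np.zeros((n_measures, n_channels))
    back = np.zeros((n_measures, n_channels), dtype=int)
    dp[0] = log_likelihood[0]
    for m in range(1, n_measures):
        for c in range(n_channels):
            scores = dp[m - 1] - switch_penalty * (np.arange(n_channels) != c)
            back[m, c] = int(np.argmax(scores))
            dp[m, c] = scores[back[m, c]] + log_likelihood[m, c]
    path = [int(np.argmax(dp[-1]))]
    for m in range(n_measures - 1, 0, -1):
        path.append(int(back[m, path[-1]]))
    return path[::-1]

scores = np.array([[0.0, 2.0, 1.0],     # 4 measures x 3 MIDI channels
                   [0.5, 1.8, 1.0],
                   [2.5, 0.2, 1.0],
                   [2.6, 0.1, 1.0]])
print(identify_melody_channels(scores))  # [1, 1, 0, 0]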

Keywords
Bayesian, Melody, Music analysis, Standard MIDI File, Viterbi

Paper topics
Automatic separation, classification of sound and music, Music information retrieval, recognition

Easychair keyphrases
training data [13], melody channel [12], midi file [10], window size [10], melody identification [8], note density [8], dynamic programming [7], standard deviation [7], switch penalty [7], channel containing [5], melody extraction [5], test data [5], bayesian probability model [4], channel switch [4], cross validation [4], feature set [4], fold cross [4], pitch mean [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249256
Zenodo URL: https://zenodo.org/record/3249256


2019.53
Melody Slot Machine
Hamanaka, Masatoshi   Center for Advanced Intelligence Project (AIP), RIKEN; Tokyo, Japan

Abstract
This paper describes our interactive music system called the “Melody Slot Machine,” which enables control of a holographic performer. Although many interactive music systems have been proposed, manipulating performances in real time is difficult for musical novices because melody manipulation requires expert knowledge. Therefore, we developed the Melody Slot Machine to provide an experience of manipulating melodies by enabling users to freely switch between two original melodies and morphing melodies.

Keywords
Generative Theory of Tonal Music, Interactive Music System, Melody Morphing

Paper topics
not available

Easychair keyphrases
melody slot machine [15], time span tree [12], melody morphing method [7], holographic display [6], cache size [5], melody segment [5], frame rate [4], virtual performer [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249262
Zenodo URL: https://zenodo.org/record/3249262


2019.54
Metrics for the Automatic Assessment of Music Harmony Awareness in Children
Avanzini, Federico   Laboratorio di Informatica Musicale (LIM), Dipartimento di Informatica e Comunicazione (DICo), Università degli Studi di Milano; Milano, Italy
Baratè, Adriano   Laboratorio di Informatica Musicale (LIM), Dipartimento di Informatica (DI), Università degli Studi di Milano; Milano, Italy
Ludovico, Luca Andrea   Laboratorio di Informatica Musicale (LIM), Dipartimento di Informatica e Comunicazione (DICo), Università degli Studi di Milano; Milano, Italy
Mandanici, Marcella   Conservatorio Statale di musica “Luca Marenzio” di Brescia; Brescia, Italy

Abstract
In the context of a general research question about the effectiveness of computer-based technologies applied to early music-harmony learning, this paper proposes a web-based tool to foster and quantitatively measure harmonic awareness in children. To this end, we have developed a Web interface where young learners can listen to the leading voice of well-known music pieces and associate chords with it. During the activity, their actions can be monitored, recorded, and analyzed. An early experiment involved 45 primary school teachers, whose performances were measured in order to get user-acceptance opinions from domain experts and to determine the most suitable metrics for automated performance analysis. This paper focuses on the latter aspect and proposes a set of candidate metrics to be used for future experimentation with children.

Keywords
assessment, harmony, metrics, music education, web tools

Paper topics
Perception and cognition of sound and music

Easychair keyphrases
tonal harmony [18], music tune [11], harmonic touch [9], leading voice [9], final choice [6], music education [6], parallel chord [6], harmonic function [5], final chord [4], harmonic awareness [4], harmonic space [4], harmony awareness [4], implicit harmony [4], learning effect [4], melody harmonization [4], primary chord [4], primary school child [4], research question [4], tonal function [4], tonic chord [4], web interface [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249389
Zenodo URL: https://zenodo.org/record/3249389


2019.55
MI-GEN∼: An Efficient and Accessible Mass-Interaction Sound Synthesis Toolbox
Leonard, James   Grenoble Images Parole Signal Automatique (GIPSA-Lab), Université de Grenoble-Alpes; Grenoble, France
Villeneuve, Jérôme   Grenoble Images Parole Signal Automatique (GIPSA-Lab), Université de Grenoble-Alpes; Grenoble, France

Abstract
Physical modelling techniques are now an essential part of digital sound synthesis, allowing for the creation of complex timbres through the simulation of virtual matter and expressive interaction with virtual vibrating bodies. However, placing these tools in the hands of the composer or musician has historically posed challenges in terms of a) the computational expense of most real-time physically based synthesis methods, b) the difficulty of implementing these methods into modular tools that allow for the intuitive design of virtual instruments, without expert physics and/or computing knowledge, and c) the generally limited access to such tools within popular software environments for musical creation. To this end, a set of open-source tools for designing and computing mass-interaction networks for physically-based sound synthesis is presented. The audio synthesis is performed within Max/MSP using the gen~ environment, allowing for simple model design, efficient calculation of systems containing single-sample feedback loops, as well as extensive real-time control of physical parameters and model attributes. Through a series of benchmark examples, we exemplify various virtual instruments and interaction designs.
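
Illustrative sketch (a single scalar element, not the gen~ toolbox code; parameter values are arbitrary): the basic mass-interaction building block is a point mass coupled to a fixed point by a spring and damper, computed with an explicit second-order discrete-time update.

import numpy as np

sr = 44100
k = 1.0 / sr                      # time step
m, K, Z = 1.0, 1.0e5, 0.5         # mass, stiffness, damping

x_prev = x = 0.001                # initial displacement, zero initial velocity
out = np.zeros(sr)
for n in range(sr):
    force = -K * x - Z * (x - x_prev) / k          # spring + damper interaction
    x_next = 2 * x - x_prev + (k * k / m) * force  # explicit mass update
    x_prev, x = x, x_next
    out[n] = x
# Oscillation frequency is approximately sqrt(K/m) / (2*pi), i.e. about 50 Hz here.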

Keywords
Mass-interaction, Max/MSP, Physical modelling, Toolbox

Paper topics
Models for sound analysis and synthesis, Multimodality in sound and music computing, Sound/music signal processing algorithms

Easychair keyphrases
mass interaction [25], mass interaction model [12], physical model [12], sound synthesis [11], computer music [9], physical modelling [9], mass interaction physical modelling [8], discrete time [7], harmonic oscillator [7], mass interaction modelling [7], mass interaction network [7], motion buffer [7], physical modeling [6], real time [6], drunk triangle [5], physical parameter [5], stability condition [5], control rate parameter [4], digital sound synthesis [4], external position [4], force feedback [4], gen patch [4], grenoble inp gipsa lab [4], mass interaction physical modeling [4], mass type element [4], mi gen toolbox [4], model based digital piano [4], multisensory virtual musical instrument [4], non linear [4], physically based synthesis method [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249376
Zenodo URL: https://zenodo.org/record/3249376


2019.56
MININGSUITE: A COMPREHENSIVE MATLAB FRAMEWORK FOR SIGNAL, AUDIO AND MUSIC ANALYSIS, ARTICULATING AUDIO AND SYMBOLIC APPROACHES
Lartillot, Olivier   University of Oslo; Oslo, Norway

Abstract
The MiningSuite is a free, open-source and comprehensive Matlab framework for the analysis of signals, audio recordings, music recordings, music scores, and other signals such as motion capture data, under a common modular framework. It adds a syntactic layer on top of Matlab, so that advanced operations can be specified using a simple and adaptive syntax. This makes the Matlab environment very easy to use for beginners, and at the same time allows power users to design complex workflows in a modular and concise way through a simple assemblage of operators featuring a large set of options. The MiningSuite is an extension of MIRtoolbox, a Matlab toolbox that has become a reference tool in MIR.

Keywords
Matlab toolbox, MIR, open source

Paper topics
not available

Easychair keyphrases
not available

Paper type
Demo

DOI: 10.5281/zenodo.3249435
Zenodo URL: https://zenodo.org/record/3249435


2019.57
Modeling and Learning Rhythm Structure
Foscarin, Francesco   Conservatoire national des arts et métiers (CNAM); France
Jacquemard, Florent   Institut national de recherche en informatique et en automatique (INRIA); France
Rigaux, Philippe   Conservatoire national des arts et métiers (CNAM); France

Abstract
We present a model to express preferences on rhythmic structure, based on probabilistic context-free grammars, and a procedure that learns the grammar probabilities from a dataset of scores or quantized MIDI files. The model formally defines rules related to rhythmic subdivisions and durations that are in general given in an informal language. Rule preferences are then specified with probability values. One targeted application is the aggregation of rule probabilities to qualify an entire rhythm, for tasks like automatic music generation and music transcription. The paper also reports an application of this approach on two datasets.
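
Illustrative sketch (the grammar and the two rhythm trees are made up): rule probabilities can be learned as relative frequencies of rule occurrences in a corpus of parse trees, and an entire rhythm can then be qualified by the sum of its rules' log-probabilities.

from collections import Counter
import math

# Each internal node is (symbol, [children]); string children are leaf events.
tree1 = ("m", [("half", ["note"]), ("half", [("q", ["note"]), ("q", ["note"])])])
tree2 = ("m", [("half", [("q", ["note"]), ("q", ["note"])]),
               ("half", [("q", ["note"]), ("q", ["note"])])])

def rules(tree):
    symbol, children = tree
    yield (symbol, len(children))                 # "symbol -> n subdivisions"
    for child in children:
        if isinstance(child, tuple):
            yield from rules(child)

counts = Counter(r for t in (tree1, tree2) for r in rules(t))
totals = Counter()
for (symbol, _), c in counts.items():
    totals[symbol] += c
prob = {rule: c / totals[rule[0]] for rule, c in counts.items()}

def log_score(tree):
    return sum(math.log(prob[r]) for r in rules(tree))

print(prob)
print(log_score(tree1), log_score(tree2))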

Keywords
Digital Music Scores, Grammatical Inference, Rhythmic notation, Weighted Context-Free-Grammars

Paper topics
Algorithms and Systems for music composition, Automatic music generation/accompaniment systems, Music information retrieval

Easychair keyphrases
parse tree [33], music notation [10], probabilistic context free grammar [8], weight value [8], context free grammar [7], rhythmic notation [7], time interval [7], time signature [7], training set [7], midi file [6], rhythm structure [6], rhythm tree [6], hierarchical structure [5], rhythm notation [5], enhanced wikifonia leadsheet dataset [4], k div rule [4], musical event [4], non terminal symbol [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249476
Zenodo URL: https://zenodo.org/record/3249476


2019.58
Musical Tempo and Key Estimation using Convolutional Neural Networks with Directional Filters
Schreiber, Hendrik   tagtraum industries incorporated; Raleigh, United States
Müller, Meinard   International Audio Laboratories Erlangen (AudioLabs); Erlangen, Germany

Abstract
In this article we explore how the different semantics of spectrograms’ time and frequency axes can be exploited for musical tempo and key estimation using Convolutional Neural Networks (CNN). By addressing both tasks with the same network architectures ranging from shallow, domain-specific approaches to VGG variants with directional filters, we show that axis-aligned architectures perform similarly well as common VGG-style networks, while being less vulnerable to confounding factors and requiring fewer model parameters.
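
Illustrative sketch (not the paper's exact architecture; layer sizes and the number of output classes are assumptions): directional filters are simply convolutions whose kernels extend along only one spectrogram axis, e.g. (1 x k) along time or (k x 1) along frequency.

import torch
import torch.nn as nn

class DirectionalCNN(nn.Module):
    def __init__(self, n_classes=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=(1, 5), padding=(0, 2)),   # temporal filter
            nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=(5, 1), padding=(2, 0)),  # spectral filter
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(16, n_classes)

    def forward(self, spec):                 # spec: (batch, 1, freq, time)
        return self.classifier(self.features(spec).flatten(1))

model = DirectionalCNN()
print(model(torch.randn(2, 1, 40, 256)).shape)   # torch.Size([2, 256])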

Keywords
CNN, Confounds, Key, MIR, Tempo

Paper topics
Automatic separation, classification of sound and music, Content processing of music audio signals, Models for sound analysis and synthesis, Music information retrieval, recognition, Sound/music signal processing algorithms

Easychair keyphrases
music information retrieval [22], tempo estimation [18], convolutional neural network [15], directional filter [13], key detection [12], key estimation [12], th international society [12], gtzan key [9], square filter [8], tempo task [8], convolutional layer [7], deepmod deepmod deepmod [6], electronic dance music [6], genre recognition [6], mir task [6], shallow architecture [6], standard deviation [6], deep architecture [5], giantstep key [5], network architecture [5], signal processing [5], validation accuracy [5], feature extraction module [4], giantstep tempo [4], key accuracy [4], key task [4], layer input conv [4], similar parameter count [4], tempo annotation [4], temporal filter [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249250
Zenodo URL: https://zenodo.org/record/3249250


2019.59
Music Temperaments Evaluation Based on Triads
Tong, Meihui   Japan Advanced Institute of Science and Technology (JAIST); Nomi, Japan
Tojo, Satoshi   Japan Advanced Institute of Science and Technology (JAIST); Nomi, Japan

Abstract
No single temperament can be optimal for both consonance and modulation. Dissonance has traditionally been calculated from the ratio of two pitch frequencies; however, in homophonic music the level should be measured on chords, especially triads. In this research, we propose to quantify it as a Dissonance Index of Triads (DIT). We select eight well-known temperaments, calculate the seven diatonic triads in all 12 keys, and compare the weighted average and standard deviation to quantify consonance; we then visualize the experimental results in a two-dimensional chart to compare the trade-offs between consonance and modulation.
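
For orientation, the sketch below only computes the frequencies and ratios of a C major triad under two of the tunings involved; the Dissonance Index of Triads itself is not reproduced here.

    import numpy as np

    C4 = 261.63  # Hz, reference pitch

    # C major triad (C-E-G) under 12-tone equal temperament and just intonation.
    equal_temperament = C4 * 2 ** (np.array([0, 4, 7]) / 12.0)   # 100-cent semitones
    just_intonation   = C4 * np.array([1.0, 5 / 4, 3 / 2])       # simple integer ratios

    for name, freqs in [("12-TET", equal_temperament), ("just", just_intonation)]:
        print(name, np.round(freqs, 2), "ratios:", np.round(freqs / freqs[0], 4))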

Keywords
equal temperament, mean tone, Pythagoras, Scale, visualization

Paper topics
Computational musicology and ethnomusicology, Perception and cognition of sound and music

Easychair keyphrases
equal temperament [14], sanfen sunyi fa [12], dissonance value [8], dit value [8], just intonation [8], dissonance curve [7], pythagorean tuning [6], critical bandwidth [5], music temperament [5], average consonant level [4], base tone [4], dissonance index [4], dissonance level [4], horizontal axis [4], mean tone [4], pure tone [4], quarter comma meantone [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249327
Zenodo URL: https://zenodo.org/record/3249327


2019.60
MUSICYPHER: MUSIC FOR MESSAGE ENCRYPTION
Jaime Marín, Víctor   Universidad de Málaga; Málaga, Spain
Peinado Dominguez, Alberto   Universidad de Málaga; Málaga, Spain

Abstract
An Android application has been developed to encrypt messages using musical notes that can be automatically played from the smartphone and/or stored in a MIDI file to be transmitted over any available connection. The app has been designed to recover the original message on the fly by detecting the notes played by a different device. The main objective of this project is to publicize the relationship between cryptography and music by showing old systems (17th century) implemented on modern devices.
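
A toy example of the general idea (letters substituted by notes); the table below is invented for illustration and is not Guyot's scheme or the app's actual mapping.

    import string

    # 26 letters fold onto 14 notes here, so some letters share a note; a real
    # cipher would disambiguate, e.g. with octaves or durations.
    NOTES = ["C4", "D4", "E4", "F4", "G4", "A4", "B4",
             "C5", "D5", "E5", "F5", "G5", "A5", "B5"]

    def encrypt(message):
        letters = [c for c in message.upper() if c in string.ascii_uppercase]
        return [NOTES[(ord(c) - ord("A")) % len(NOTES)] for c in letters]

    print(encrypt("Hola"))   # ['C5', 'C4', 'G5', 'C4']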

Keywords
Android, Cryptography, Encryption, Fundamental Frequency, Guyot, Java, Music, Real Time Audio Capture

Paper topics
not available

Easychair keyphrases
musical note [6], android application [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249270
Zenodo URL: https://zenodo.org/record/3249270


2019.61
Non-linear Contact Sound Synthesis for Real-Time Audiovisual Applications using Modal Textures
Maunsbach, Martin   Aalborg University; Aalborg, Denmark
Serafin, Stefania   Aalborg University; Aalborg, Denmark

Abstract
Sound design is an integral part of making a virtual environment come to life. Spatialization is important for the perceptual localization of sounds, while sound quality determines how well virtual objects come to life. The use of pre-recorded audio for physical interactions in virtual environments often requires a vast library of audio files to distinguish each interaction from the others. This paper explains the implementation of a modal synthesis toolkit for the Unity game engine that automatically adds impact and rolling sounds to interacting objects. Position-dependent sounds are achieved using a custom shader that can contain textures with modal weighting parameters. The two types of contact sounds are synthesized using a mechanical oscillator describing a spring-mass system. Since the contact force applied to the system includes a non-linear component, its value is found using an approximating algorithm, in this case the Newton-Raphson algorithm. The mechanical oscillator is discretized using the K method with the bilinear transform.
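
The role of the Newton-Raphson step can be shown in isolation: the sketch below solves a simple nonlinear contact-force balance k * x**alpha = F for the compression x. The constants are arbitrary, and the paper's full discretized oscillator is not reproduced.

    # Generic Newton-Raphson root finder applied to a power-law contact force.
    def newton_raphson(f, df, x0, tol=1e-10, max_iter=50):
        x = x0
        for _ in range(max_iter):
            step = f(x) / df(x)
            x -= step
            if abs(step) < tol:
                break
        return x

    k, alpha, F_ext = 1e5, 1.5, 2.0                 # illustrative constants
    f  = lambda x: k * x ** alpha - F_ext           # residual to drive to zero
    df = lambda x: alpha * k * x ** (alpha - 1)     # its derivative

    x = newton_raphson(f, df, x0=1e-3)
    print(x, k * x ** alpha)                        # compression and resulting force (~= F_ext)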

Keywords
Game Audio, Impact, K Method, Non-linear, Physical Modelling, Rolling, Sound Synthesis

Paper topics
Models for sound analysis and synthesis, Sonic interaction design, Sound and music for Augmented/Virtual Reality and games, Sound/music signal processing algorithms

Easychair keyphrases
mechanical oscillator [10], modal texture [10], modal synthesis [9], modal weight [9], virtual environment [8], glass table [6], impact sound [6], modal weighting [6], physical modelling [6], rolling sound [6], computer graphic [5], fundamental frequency [5], game engine [5], micro impact [5], sound synthesis [5], interaction type [4], normal mode [4], unity game engine [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249410
Zenodo URL: https://zenodo.org/record/3249410


2019.62
NO STRINGS ATTACHED: FORCE AND VIBROTACTILE FEEDBACK IN A GUITAR SIMULATION
Serafin, Stefania   Aalborg University; Aalborg, Denmark
Nilsson, Niels Christian   Aalborg University; Aalborg, Denmark
Paisa, Razvan   Aalborg University; Aalborg, Denmark
Fontana, Federico   Università di Udine; Udine, Italy
Nordahl, Rolf   Aalborg University; Aalborg, Denmark
Passalenti, Andrea   Università di Udine; Udine, Italy

Abstract
In this paper we propose a multisensory simulation of plucking guitar strings in virtual reality. The auditory feedback is generated by a physics-based simulation of guitar strings, and haptic feedback is provided by a combination of high-fidelity vibrotactile actuators and a Phantom Omni. Moreover, we present a user study (n=29) exploring the perceived realism of the simulation and the relative importance of force and vibrotactile feedback for creating a realistic experience of plucking virtual strings. The study compares four conditions: no haptic feedback, vibrotactile feedback, force feedback, and a combination of force and vibrotactile feedback. The results indicate that the combination of vibrotactile and force feedback elicits the most realistic experience, and that during this condition the participants were less likely to inadvertently hit strings after the intended string had been plucked. Notably, no statistically significant differences were found between the conditions involving either vibrotactile or force feedback, which suggests that haptic feedback is important but does not need to be high fidelity in order to enhance the quality of the experience.

Keywords
guitar simulation, haptic feedback, virtual reality

Paper topics
Sonic interaction design, Sound and music for Augmented/Virtual Reality and games

Easychair keyphrases
vibrotactile feedback [23], haptic feedback [18], virtual string [17], physical string [13], statistically significant difference [12], force feedback [10], significant difference [10], perceived realism [9], pairwise comparison [8], real guitar string [7], aalborg university [6], guitar string [6], involving force feedback [6], phantom omni haptic device [6], plucking guitar string [6], realistic experience [6], virtual guitar [6], auditory feedback [5], median score [5], perceptual similarity [5], questionnaire item [5], audio engineering society [4], computer music [4], musical instrument [4], real string [4], relative importance [4], vibrotactile actuator [4], virtual reality [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249321
Zenodo URL: https://zenodo.org/record/3249321


2019.63
OFFLINE SCORE ALIGNMENT FOR REALISTIC MUSIC PRACTICE
Jiang, Yucong   Indiana University Bloomington; Bloomington, United States
Ryan, Fiona   Indiana University Bloomington; Bloomington, United States
Cartledge, David   Indiana University Bloomington; Bloomington, United States
Raphael, Christopher   Indiana University Bloomington; Bloomington, United States

Abstract
In a common music practice scenario a player works with a musical score, but may jump arbitrarily from one passage to another in order to drill on difficult technical challenges or pursue some other agenda requiring non-linear movement through the score. In this work we treat the associated score alignment problem in which we seek to align a known symbolic score to audio of the musician's practice session, identifying all "do-overs" and jumps. The result of this effort facilitates a quantitative view of a practice session, allowing feedback on coverage, tempo, tuning, rhythm, and other aspects of practice. If computationally feasible we would prefer a globally optimal dynamic programming search strategy; however, we find such schemes only barely computationally feasible in the cases we investigate. Therefore, we develop a computationally efficient off-line algorithm suitable for practical application. We present examples analyzing unsupervised and unscripted practice sessions on clarinet, piano and viola, providing numerical evaluation of our score-following results on hand-labeled ground-truth audio data, as well as more subjective and easy-to-interpret visualizations of the results.
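
For orientation, a plain dynamic-time-warping alignment between a score feature sequence and a performance feature sequence is sketched below; unlike the method described above, it cannot model jumps or "do-overs" and only illustrates basic offline alignment.

    import numpy as np

    def dtw_path(score_feats, audio_feats):
        """Globally optimal monotonic alignment between two feature sequences."""
        n, m = len(score_feats), len(audio_feats)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = np.linalg.norm(score_feats[i - 1] - audio_feats[j - 1])
                cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
        # Backtrack the optimal alignment path.
        path, i, j = [], n, m
        while i > 0 and j > 0:
            path.append((i - 1, j - 1))
            i, j = min([(i - 1, j), (i, j - 1), (i - 1, j - 1)], key=lambda p: cost[p])
        return path[::-1], cost[n, m]

    score = np.random.rand(20, 12)   # e.g. one chroma vector per score event
    audio = np.random.rand(50, 12)   # one chroma vector per audio frame
    path, total_cost = dtw_path(score, audio)
    print(len(path), round(float(total_cost), 2))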

Keywords
beam search, music practice, score following

Paper topics
Automatic music generation/accompaniment systems, Automatic separation, classification and recognition of sound and music, Content processing of music audio signals, Interaction in music performance, Interactive performance systems, Music creation and performance, Sound/music signal processing algorithms

Easychair keyphrases
score alignment [26], practice session [16], score position [13], data model [9], hidden markov model [9], beam search [7], score alignment problem [7], pitch tree [6], ground truth [5], musical score [5], real time [5], score note [5], dynamic programming [4], mozart clarinet concerto [4], non terminal node [4], quarter note [4], score location [4], skip problem [4], traditional score alignment [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249396
Zenodo URL: https://zenodo.org/record/3249396


2019.64
OM-AI: A Toolkit to Support AI-Based Computer-Assisted Composition Workflows in OpenMusic
Vinjar, Anders   Independent; Norway
Bresson, Jean   STMS, Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Paris, France

Abstract
We present ongoing work exploring the use of artificial intelligence and machine learning in computer-assisted music composition. The om-ai library for OpenMusic implements well-known techniques for data classification and prediction in order to integrate them into composition workflows. We give examples using simple musical structures, highlighting possible extensions and applications.

Keywords
Artificial Intelligence, Common Lisp, Computer-Assisted Composition, Descriptors, Machine Learning, OpenMusic, Vector-Space

Paper topics
not available

Easychair keyphrases
machine learning [8], vector space [7], feature vector [5], computer assisted composition system [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249264
Zenodo URL: https://zenodo.org/record/3249264


2019.65
OSC-XR: A Toolkit for Extended Reality Immersive Music Interfaces
Johnson, David   University of Victoria; Victoria, Canada
Damian, Daniela   University of Victoria; Victoria, Canada
Tzanetakis, George   University of Victoria; Victoria, Canada

Abstract
Currently, developing immersive music environments for extended reality (XR) can be a tedious process requiring designers to build 3D audio controllers from scratch. OSC-XR is a toolkit for Unity intended to speed up this process through rapid prototyping, enabling research in this emerging field. Designed with multi-touch OSC controllers in mind, OSC-XR simplifies the process of designing immersive music environments by providing prebuilt OSC controllers and Unity scripts for designing custom ones. In this work, we describe the toolkit's infrastructure and perform an evaluation of the controllers to validate the generated control data. In addition to OSC-XR, we present UnityOscLib, a simplified OSC library for Unity utilized by OSC-XR. We implemented three use cases, using OSC-XR, to inform its design and demonstrate its capabilities. The Sonic Playground is an immersive environment for controlling audio patches. Hyperemin is an XR hyperinstrument environment in which we augment a physical theremin with OSC-XR controllers for real-time control of audio processing. Lastly, we add OSC-XR controllers to an immersive T-SNE visualization of music genre data for enhanced exploration and sonification of the data. Through these use cases, we explore and discuss the affordances of OSC-XR and immersive music interfaces.
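
Independently of Unity, the kind of OSC control data involved can be sent from Python with the python-osc package; the addresses and values below are illustrative and do not correspond to OSC-XR's actual namespace.

    # pip install python-osc
    from pythonosc.udp_client import SimpleUDPClient

    client = SimpleUDPClient("127.0.0.1", 9000)    # host and port of the receiver
    client.send_message("/slider/volume", 0.8)     # single float argument
    client.send_message("/pad/hit", [60, 0.5])     # e.g. note number and velocity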

Keywords
Extended Reality, Immersive Interaction, Immersive Interfaces for Musical Expression, Open Sound Control, Virtual Environments

Paper topics
Interactive performance systems, Interfaces for sound and music, New interfaces for interactive music creation, Sound and music for Augmented/Virtual Reality and games

Easychair keyphrases
immersive environment [21], osc message [14], osc xr controller [11], immersive music environment [9], osc controller [9], computer music [8], controller prefab [7], multi touch osc [7], transmitting osc message [7], musical expression [6], osc receiver [6], osc xr slider [6], pad controller [6], sound designer [6], touch osc controller [6], unity inspector [6], use case [6], virtual reality [6], audio processing [5], immersive interface [5], international computer [5], multi touch [5], musical interaction [5], performance environment [5], rapid prototyping [5], traditional instrument [5], immersive musical environment [4], immersive music interface [4], osc controller prefab [4], osc transmit manager [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249319
Zenodo URL: https://zenodo.org/record/3249319


2019.66
Perceptual Evaluation of Modal Synthesis for Impact-Based Sounds
Barahona, Adrián   University of York; York, United Kingdom
Pauletto, Sandra   KTH Royal Institute of Technology; Stockholm, Sweden

Abstract
The use of real-time sound synthesis for sound effects can improve the sound design of interactive experiences such as video games. However, synthesized sound effects can often be perceived as synthetic, which hampers their adoption. This paper aims to determine whether sounds synthesized using filter-based modal synthesis are perceptually comparable to directly recorded sounds. Sounds from four different materials that showed clear modes were recorded and synthesized using filter-based modal synthesis. Modes are the individual sinusoidal frequencies at which objects vibrate when excited. A listening test was conducted in which participants were asked to identify, in isolation, whether a sample was recorded or synthesized. Results show that recorded and synthesized samples are indistinguishable from each other. The study outcome shows that, for the analysed materials, filter-based modal synthesis is a suitable technique for synthesizing hit sounds in real time without perceptual compromises.
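
A minimal filter-based modal synthesis sketch: a small bank of two-pole resonators excited by a unit impulse. Mode frequencies, decay times and gains are invented rather than measured from recordings.

    import numpy as np
    from scipy.signal import lfilter

    fs = 44100
    excitation = np.zeros(fs)        # one second
    excitation[0] = 1.0              # the "hit"

    modes = [(440.0, 0.8, 1.0),      # (frequency in Hz, decay time in s, gain)
             (1130.0, 0.5, 0.5),
             (2680.0, 0.3, 0.25)]

    output = np.zeros_like(excitation)
    for freq, decay, gain in modes:
        r = np.exp(-1.0 / (decay * fs))          # pole radius from decay time
        w = 2 * np.pi * freq / fs                # pole angle from mode frequency
        b = [gain]                               # feedforward gain
        a = [1.0, -2 * r * np.cos(w), r * r]     # two-pole resonator
        output += lfilter(b, a, excitation)

    output /= np.max(np.abs(output))             # normalize before playback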

Keywords
Game Audio, Modal Synthesis, Procedural Audio, Sound Design

Paper topics
Digital audio effects, Models for sound analysis and synthesis, Perception and cognition of sound and music, Sound and music for Augmented/Virtual Reality and games

Easychair keyphrases
filter based modal synthesis [14], pre recorded sample [12], real time [10], sound effect [10], modal synthesis [9], procedural audio [9], sound design [9], audio file [8], impact based sound [7], perceptual evaluation [6], sound synthesis [5], synthesized version [5], video game [5], audio engineering society [4], deterministic component [4], discrimination factor [4], enveloped white noise signal [4], filterbased modal synthesis [4], f measure value [4], game engine [4], interactive application [4], modal synthesizer [4], musical instrument [4], real time sound synthesis [4], stochastic component [4], synthesized sound effect [4], synthetic sound [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249246
Zenodo URL: https://zenodo.org/record/3249246


2019.67
Percussion synthesis using loopback frequency modulation oscillators
Hsu, Jennifer   Department of Music, University of California, San Diego (UCSD); San Diego, United States
Smyth, Tamara   Department of Music, University of California, San Diego (UCSD); San Diego, United States

Abstract
In this work, we apply recent research results in loopback frequency modulation (FM) to real-time parametric synthesis of percussion sounds. Loopback FM is a variant of FM synthesis whereby the carrier oscillator "loops back" to serve as a modulator of its own frequency. Like FM, more spectral components emerge, but further, when the loopback coefficient is made time varying, frequency trajectories that resemble the nonlinearities heard in acoustic percussion instruments appear. Here, loopback FM is used to parametrically synthesize this effect in struck percussion instruments, known to exhibit frequency sweeps (among other nonlinear characteristics) due to modal coupling. While many percussion synthesis models incorporate such nonlinear effects while aiming for acoustic accuracy, computational efficiency is often sacrificed, prohibiting real-time use. This work seeks to develop a real-time percussion synthesis model that creates a variety of novel sounds and captures the sonic qualities of nonlinear percussion instruments. A linear, modal synthesis percussion model is modified to use loopback FM oscillators, which allows the model to create rich and abstract percussive hits in real-time. Musically intuitive parameters for the percussion model are emphasized resulting in a usable percussion sound synthesizer.
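
One simple way to write the loopback idea in code is sketched below: the oscillator's previous output sample modulates its own instantaneous frequency. The exact recursion and parameter ranges used in the paper may differ.

    import numpy as np

    fs = 44100
    f0 = 200.0                 # carrier / sounding frequency in Hz
    B = 0.9                    # loopback (feedback) coefficient; letting it decay
                               # over time produces the pitch-glide effect

    n = fs                     # one second
    y = np.zeros(n)
    phase, prev = 0.0, 0.0
    for i in range(n):
        inst_freq = f0 * (1.0 + B * prev)        # frequency modulated by own output
        phase += 2 * np.pi * inst_freq / fs
        y[i] = np.cos(phase)
        prev = y[i]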

Keywords
feedback systems, frequency and phase modulation synthesis, modal synthesis, pitch glides, sound synthesis, time-varying allpass filters

Paper topics
Digital audio effects, Models for sound analysis and synthesis, Sound/music signal processing algorithms

Easychair keyphrases
pitch glide [27], modal frequency [25], loopback fm oscillator [20], sounding frequency [16], carrier frequency [13], time varying [12], modal synthesis [11], percussion instrument [11], time varying timbre [11], percussion synthesis [10], amplitude envelope [9], filtered noise burst [9], raised cosine envelope [9], tom tom [9], acoustic resonator [8], real time [8], circular plate [7], fm percussion synthesis [7], loopback fm percussion [7], raised cosine [7], raised cosine excitation [7], sound synthesis [7], acoustic resonator impulse response [6], feedback coefficient [6], high carrier frequency [6], impulse response [6], percussion sound [6], percussion synthesis method [6], frequency component [5], percussion model [5]

Paper type
Full paper

DOI: 10.5281/zenodo.3249382
Zenodo URL: https://zenodo.org/record/3249382


2019.68
PERFORMING WITH SOUND SAMPLE-CONTROLLED GLOVES AND LIGHT-CONTROLLED ARMS
Pecquet, Frank   Université Paris 1 Panthéon-Sorbonne; Paris, France
Moschos, Fotis   Vocational Training Institute; Greece
Fierro, David   University Paris VIII; Paris, France
Pecquet, Justin   Jazz Institute of Berlin (JIB), Berlin University of the Arts (UdK); Berlin, Germany

Abstract
Interacting with media: the TransTeamProject (T3P) works on developing interactive glove techniques, and other materials, with sound and/or visual samples. Piamenca continues the work developed in Transpiano with a specific emphasis on visual content, such as transforming sound into lights, in this case together with a strong vernacular inspiration (Flamenco). The T3P creative project combines art music with techno-perspectives. After contextualizing the state of the art in the specific field of "body gesture technology", this paper explains how Piamenca relates to computers in a technical sense (methods and processes used to produce media transformations in audio and vision) and comments on their integration in terms of sound, music and audio-visual performance. It finally demonstrates some ideas, such as trans-music orientations, with regard to enhancement theories related to the transhumanism movement.

Keywords
flamenco, glove-technology, interaction, performance, piano, sampling, trans-music

Paper topics
Humanities in Sound and Music Computing, Interaction in music performance, Interactive music recommendation, Languages, protocols and software environments for sound and music computing, Multimodality in sound and music computing, Music creation and performance

Easychair keyphrases
musical instrument [4], sound spectrum [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249254
Zenodo URL: https://zenodo.org/record/3249254


2019.69
PHYSICAL MODELS AND REAL-TIME CONTROL WITH THE SENSEL MORPH
Serafin, Stefania   Aalborg University; Aalborg, Denmark
Willemsen, Silvin   Aalborg University; Aalborg, Denmark

Abstract
In this demonstration we present novel physical models controlled by the Sensel Morph interface.

Keywords
control, physical models, Sensel

Paper topics
not available

Easychair keyphrases
sympathetic string [8], bowed string [6], hammered dulcimer [6], sensel morph [6], hurdy gurdy [4], physical model [4], plucked string [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249276
Zenodo URL: https://zenodo.org/record/3249276


2019.70
PIANO SCORE-FOLLOWING BY TRACKING NOTE EVOLUTION
Jiang, Yucong   Indiana University Bloomington; Bloomington, United States
Raphael, Christopher   Indiana University Bloomington; Bloomington, United States

Abstract
Score following matches musical performance audio with its symbolic score in an on-line fashion. Its applications are meaningful in music practice, performance, education, and composition. This paper focuses on following piano music, one of the most challenging cases. Motivated by the time-changing features of a piano note during its lifetime, we propose a new method that models the evolution of a note in spectral space, aiming to provide an adaptive, and hence better, data model. This new method is based on a switching Kalman filter in which a hidden layer of continuous variables tracks the energy of the various note harmonics. The result of this method could potentially benefit applications in de-soloing, sound synthesis, and virtual scores. This paper also proposes a straightforward evaluation method. We conducted a preliminary experiment on a small dataset of 13 minutes of music, consisting of 15 excerpts of real piano recordings from eight pieces. The results show the promise of this new method.
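
The Kalman update at the heart of such a model can be illustrated on a single noisy harmonic-energy track; the sketch below shows only the scalar predict/update cycle, not the switching model over score positions.

    import numpy as np

    def kalman_track(observations, q=1e-3, r=1e-1):
        """Scalar Kalman filter; q = process noise, r = observation noise."""
        x, p = observations[0], 1.0              # initial state estimate and variance
        estimates = []
        for z in observations:
            p = p + q                            # predict (random-walk model)
            k = p / (p + r)                      # Kalman gain
            x = x + k * (z - x)                  # update with the new observation
            p = (1 - k) * p
            estimates.append(x)
        return np.array(estimates)

    true_energy = np.exp(-np.linspace(0, 3, 100))        # a decaying harmonic
    noisy = true_energy + 0.05 * np.random.randn(100)
    print(kalman_track(noisy)[:5])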

Keywords
piano music, score following, switching Kalman filter

Paper topics
Automatic music generation/accompaniment systems, Automatic separation, classification and recognition of sound and music, Content processing of music audio signals, Interaction in music performance, Interactive performance systems, Music information retrieval, Sound/music signal processing algorithms

Easychair keyphrases
kalman filter [17], score following [16], switching kalman filter [9], filtered distribution [7], mvmt1 piano concerto [7], discriminating data model [6], frequency profile [6], independent kalman filter [6], data model [5], observed data [5], piano music [5], real time [5], score alignment [5], state graph [5], art system [4], continuous variable [4], evaluation method [4], frame wise accuracy [4], hidden markov model [4], kalman filter model [4], musical score [4], partial amplitude [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249398
Zenodo URL: https://zenodo.org/record/3249398


2019.71
Polytopic reconfiguration: a graph-based scheme for the multiscale transformation of music segments and its perceptual assessment
Gillot, Valentin   Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); France
Bimbot, Frédéric   Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); France

Abstract
Music is usually considered a sequential process, in which sounds, groups of sounds and motifs occur chronologically, following the natural unfolding of time. At the same time, repetitions and similarities that develop between elements create multi-scale patterns which contribute to the perceived structure of the passage and trigger expectation mechanisms and systems [Narmour 2000][Bimbot et al. 2016]. These can be represented as a Polytopic Graph of Latent Relations [Louboutin et al. 2017], where each node of the graph represents a low-scale musical segment and vertices correspond to their mutual relations within the expectation systems. The content of a musical segment can be manipulated by applying various permutations to the nodes of the graph, thus generating a reconfiguration of its musical content, with the same elements in a different order. Specific permutations, called Primer Preserving Permutations (PPP), are of particular interest, as they preserve systems of analogical implications between metrically homologous elements within the segment. In this paper, we describe the implementation of the polytopic reconfiguration process and elaborate on the organizational properties of Primer Preserving Permutations as well as their potential impact on the inner structure of musical segments. Then, in order to assess the relevance of the reconfiguration scheme (and its underlying hypotheses), we report on a perceptual test where subjects are asked to rate musical properties of MIDI segments: some of them have been reconfigured with PPPs while others have been transformed by Randomly Generated Permutations (RGP). Results show that PPP-transformed segments score distinctly better than RGP-transformed ones, indicating that the preservation of implication systems plays an important role in the subjective acceptability of the transformation. Additionally, we introduce an automatic method for decomposing segments into low-scale musical elements, taking into account possible phase shifts between the musical surface and the metrical information (for instance, anacruses). We conclude on the potential of the approach for applications in interactive music composition.

Keywords
multiscale representation, music cognition, music structure, music transformation, perceptual tests, polytopic graph

Paper topics
Algorithms and Systems for music composition, Perception and cognition of sound and music

Easychair keyphrases
implication system [9], musical segment [9], time scale [8], elementary object [6], musical surface [6], phase shift [6], polytopic representation [6], degradation score [5], melodic line [5], musical object [5], perceptual test [5], account possible phase shift [4], analogical implication [4], compressibility criterion [4], inner structure [4], low scale [4], low scale musical element [4], musical consistency [4], parallel face [4], polytopic graph [4], randomly generated permutation [4], time shift [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249408
Zenodo URL: https://zenodo.org/record/3249408


2019.72
Predicting Perceived Dissonance of Piano Chords Using a Chord-Class Invariant CNN and Deep Layered Learning
Dubois, Juliette   ENSTA (École nationale supérieure de techniques avancées), MINES ParisTech; Paris, France
Elowsson, Anders   KTH Royal Institute of Technology; Stockholm, Sweden
Friberg, Anders   KTH Royal Institute of Technology; Stockholm, Sweden

Abstract
This paper presents a convolutional neural network (CNN) able to predict the perceived dissonance of piano chords. Ratings of dissonance for short audio excerpts were combined from two different datasets and groups of listeners. The CNN uses two branches in a directed acyclic graph (DAG). The first branch receives input from a pitch estimation algorithm, restructured into a pitch chroma. The second branch analyses interactions between close partials, known to affect our perception of dissonance and roughness. The analysis is pitch invariant in both branches, facilitated by convolution across log-frequency and octave-wide max-pooling. Ensemble learning was used to improve the accuracy of the predictions. The coefficient of determination (R2) between ratings and predictions is close to 0.7 in a cross-validation test of the combined dataset. The system significantly outperforms recent computational models.

Keywords
CNN, Consonance, DAG network, Deep Layered Learning, Dissonance, Ensemble Learning, Music Information Retrieval, Pitch invariant, Roughness

Paper topics
Automatic separation, classification and recognition of sound and music, Content processing of music audio signals, Models for sound analysis and synthesis, Music information retrieval, Perception and cognition of sound and music, Sound/music signal processing algorithms

Easychair keyphrases
pitch chroma [26], test condition [12], better result [8], computational model [7], cross validation [7], audio file [6], music information retrieval [6], acoustical society [5], dense layer [5], ensemble learning [5], pitch class [5], test run [5], convolutional layer [4], cross fold validation [4], deep layered learning [4], ground truth test [4], just intonation ratio [4], max pooling filter [4], neural network [4], non stationary gabor frame [4], piano chord [4], truth test condition [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249465
Zenodo URL: https://zenodo.org/record/3249465


2019.73
RaveForce: A Deep Reinforcement Learning Environment for Music Generation
Lan, Qichao   RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, Department of Musicology, University of Oslo; Oslo, Norway
Tørresen, Jim   RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, Department of Informatics, University of Oslo; Oslo, Norway
Jensenius, Alexander Refsum   RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, Department of Musicology, University of Oslo; Oslo, Norway

Abstract
RaveForce is a programming framework designed for a computational music generation method that involves audio-sample-level evaluation of generated symbolic music representations. It comprises a Python module and a SuperCollider quark. When connected with deep learning frameworks in Python, RaveForce can send the symbolic music representation generated by the neural network as Open Sound Control messages to SuperCollider for non-real-time synthesis. SuperCollider converts the symbolic representation into an audio file which is sent back to Python as the input of the neural network. With this iterative training, the neural network can be improved with deep reinforcement learning algorithms, taking the quantitative evaluation of the audio file as the reward. In this paper, we find that the proposed method can be used to search for new synthesis parameters for a specific timbre of an electronic music note or loop.
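
The training loop described above follows the usual reinforcement-learning environment pattern; a generic skeleton of such an environment is sketched below. Class and method names are illustrative (not RaveForce's API), and the renderer is a placeholder for the SuperCollider call.

    import numpy as np

    class SynthEnv:
        def __init__(self, target_audio):
            self.target = target_audio

        def step(self, action):
            audio = self.render(action)                    # would trigger non-real-time synthesis
            reward = -np.mean((audio - self.target) ** 2)  # e.g. negative sample-wise error
            done = True                                    # one-shot episode for a single note
            return audio, reward, done, {}

        def render(self, action):
            # Placeholder: a sine whose frequency and amplitude come from the action.
            t = np.linspace(0, 1, 44100, endpoint=False)
            freq, amp = 100 + 900 * action[0], action[1]
            return amp * np.sin(2 * np.pi * freq * t)

    env = SynthEnv(target_audio=np.zeros(44100))
    _, reward, _, _ = env.step(np.array([0.5, 0.1]))
    print(reward)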

Keywords
Deep Reinforcement Learning, Music Generation, SuperCollider

Paper topics
Automatic music generation/accompaniment systems, Models for sound analysis and synthesis

Easychair keyphrases
neural network [23], music generation [20], deep reinforcement learning [17], reinforcement learning [16], audio file [13], symbolic representation [13], observation space [11], symbolic music representation [11], computational music generation [9], deep learning [8], non real time synthesis [8], non real time [7], preprint arxiv [7], deep learning framework [6], live coding session [6], music generation task [6], non real time audio synthesis [6], open sound control message [6], raw audio generation [6], drum loop [5], raw audio [5], real time [5], action space [4], audio waveform [4], deep learning music generation [4], deep reinforcement learning environment [4], electronic music [4], kick drum [4], musical context [4], running time [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249325
Zenodo URL: https://zenodo.org/record/3249325


2019.74
Real Time Audio Digital Signal Processing With Faust and the Teensy
Michon, Romain   GRAME-CNCM (Générateur de Ressources et d’Activités Musicales Exploratoires, Centre national de création musicale); Lyon, France / CCRMA (Center for Computer Research in Music and Acoustics), Stanford University; Stanford, United States
Orlarey, Yann   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France
Letz, Stéphane   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France
Fober, Dominique   GRAME - Générateur de Ressources et d’Activités Musicales Exploratoires; Lyon, France

Abstract
This paper introduces a series of tools to program the Teensy development board series with the Faust programming language. faust2teensy is a command-line application that can be used both to generate new objects for the Teensy Audio Library and to build standalone Teensy programs. We also demonstrate how faust2api can produce digital signal processing engines (with potential polyphony support) for the Teensy. Details about the implementation and optimization of these systems are provided, and the results of various tests (computational, latency, etc.) are presented. Finally, future directions for this work are discussed, focusing on bare-metal implementation of real-time audio signal processing applications.

Keywords
DSP, Faust, Microcontroller, Teensy

Paper topics
Hardware systems for sound and music computing, Languages, protocols and software environments for sound and music computing, New interfaces for interactive music creation

Easychair keyphrases
teensy audio library [22], code listing [14], block size [13], faust program [13], teensy audio [11], teensy program [11], floating point [9], real time audio signal processing [9], teensy audio shield [9], polyphonic dsp engine [6], signal processing [6], audio shield [5], audio signal processing [5], sound synthesis [5], void loop [5], audio shield teensy audio shield [4], audio signal processing application [4], bare metal implementation [4], command line [4], digital signal processing [4], faust2api teensy [4], faust compiler [4], faust program implementing [4], monophonic dsp engine [4], processing power [4], realtime audio signal processing [4], sampling rate [4], sawtooth oscillator [4], standalone faust teensy program [4], void setup [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249282
Zenodo URL: https://zenodo.org/record/3249282


2019.75
Real-time Control of Large-scale Modular Physical Models using the Sensel Morph
Willemsen, Silvin   Aalborg University; Aalborg, Denmark
Andersson, Nikolaj   Aalborg University; Aalborg, Denmark
Serafin, Stefania   Aalborg University; Aalborg, Denmark
Bilbao, Stefan   The University of Edinburgh; Edinburgh, United Kingdom

Abstract
In this paper, implementation, instrument design and control issues surrounding a modular physical modelling synthesis environment are described. The environment is constructed as a network of stiff strings and a resonant plate, accompanied by user-defined connections and excitation models. The bow, in particular, is a novel feature in this setting. The system as a whole is simulated using finite difference (FD) methods. The mathematical formulation of these models is presented, alongside several new instrument designs, together with a real-time implementation in JUCE using FD methods. Control is through the Sensel Morph.
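
The core of such finite-difference simulations can be illustrated with the simplest member of the family, an ideal (non-stiff) string; stiffness, damping, connections and the bow model described above are omitted.

    import numpy as np

    fs = 44100
    k = 1.0 / fs               # time step
    c = 200.0                  # wave speed (sets pitch together with length)
    L = 1.0
    h = c * k                  # grid spacing at the stability (CFL) limit
    N = int(L / h)             # number of grid intervals
    lam2 = (c * k / h) ** 2    # Courant number squared (= 1 here)

    u = np.zeros(N + 1)        # displacement at time step n
    u_prev = np.zeros(N + 1)   # displacement at time step n - 1
    u[N // 2] = 0.001          # crude initial "pluck"
    u_prev[:] = u

    out = []
    for n in range(fs):        # one second of simulation
        u_next = np.zeros(N + 1)
        u_next[1:-1] = (2 * u[1:-1] - u_prev[1:-1]
                        + lam2 * (u[2:] - 2 * u[1:-1] + u[:-2]))
        u_prev, u = u, u_next  # shift states; the ends stay clamped at zero
        out.append(u[N // 4])  # read an output "pickup" point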

Keywords
high-fidelity control, physical modelling, real-time

Paper topics
Interactive performance systems, Models for sound analysis and synthesis, Sonic interaction design

Easychair keyphrases
grid point [14], bowed string [12], stiff string [11], sympathetic string [11], physical model [10], sensel morph [10], real time [8], grid spacing [7], sound synthesis [7], finite difference [6], hurdy gurdy [6], next time [6], plucked string [6], computer music [5], cpu usage [5], hammered dulcimer [5], non linear [5], connection term [4], discretised distribution function [4], excitation function [4], mass ratio [4], melody string [4], modular physical modelling synthesis environment [4], system architecture [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249295
Zenodo URL: https://zenodo.org/record/3249295


2019.76
Real-time Mapping of Periodic Dance Movements to Control Tempo in Electronic Dance Music
Jap, Lilian   KTH Royal Institute of Technology; Stockholm, Sweden
Holzapfel, André   KTH Royal Institute of Technology; Stockholm, Sweden

Abstract
Dancing to the beat of one's favorite DJ's music often leads to a powerful and euphoric experience. In this study we investigate the effect of putting a dancer in control of the music playback tempo, based on a real-time estimation of body rhythm and tempo manipulation of the audio. A prototype was developed and tested in collaboration with users, followed by a main study in which the final prototype was evaluated. A questionnaire was used to obtain ratings of the subjective experience, and open-ended questions were posed in order to obtain further insights for future development. Our results suggest a potential for enhanced engagement and enjoyment of the music when the dancer is able to manipulate the tempo, and document important design aspects for real-time tempo control.

Keywords
beat tracking, electronic dance music, embodiment, real-time interaction, rhythm

Paper topics
Interactive performance systems, Interfaces for sound and music, New interfaces for interactive music creation, Sonic interaction design

Easychair keyphrases
tempo manipulation [17], real time [10], second session [9], body movement [8], first session [6], hand wrist [5], tempo change [5], dance experience [4], data stream [4], electronic dance music [4], mean value rating [4], playback tempo [4], quality factor [4], slide value [4], standard playback [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249343
Zenodo URL: https://zenodo.org/record/3249343


2019.77
Real-Time Modeling of Audio Distortion Circuits with Deep Learning
Damskägg, Eero-Pekka   Aalto University; Espoo, Finland
Juvela, Lauri   Aalto University; Espoo, Finland
Välimäki, Vesa   Aalto University; Espoo, Finland

Abstract
This paper studies deep neural networks for modeling of audio distortion circuits. The selected approach is black-box modeling, which estimates model parameters based on the measured input and output signals of the device. Three common audio distortion pedals having a different circuit configuration and their own distinctive sonic character have been chosen for this study: the Ibanez Tube Screamer, the Boss DS-1, and the Electro-Harmonix Big Muff Pi. A feedforward deep neural network, which is a variant of the WaveNet architecture, is proposed for modeling these devices. The size of the receptive field of the neural network is selected based on the measured impulse-response length of the circuits. A real-time implementation of the deep neural network is presented, and it is shown that the trained models can be run in real time on a modern desktop computer. Furthermore, it is shown that approximately three minutes of audio is a sufficient amount of data for training the models. The deep neural network studied in this work is useful for real-time virtual analog modeling of nonlinear audio circuits.
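
The building block underlying such models, a stack of dilated causal 1-D convolutions, can be sketched as follows; this is not the paper's exact architecture (no gated activations, conditioning or skip connections).

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DilatedStack(nn.Module):
        def __init__(self, channels=16, layers=8, kernel_size=3):
            super().__init__()
            self.convs, self.pads = nn.ModuleList(), []
            for i in range(layers):
                d = 2 ** i                                  # dilation doubles per layer
                self.pads.append((kernel_size - 1) * d)     # left padding keeps convolutions causal
                self.convs.append(nn.Conv1d(channels, channels, kernel_size, dilation=d))
            self.inp = nn.Conv1d(1, channels, 1)
            self.out = nn.Conv1d(channels, 1, 1)

        def forward(self, x):                               # x: (batch, 1, samples)
            h = self.inp(x)
            for pad, conv in zip(self.pads, self.convs):
                h = torch.relu(conv(F.pad(h, (pad, 0)))) + h   # residual connection
            return self.out(h)

    model = DilatedStack()
    print(model(torch.randn(1, 1, 4096)).shape)             # torch.Size([1, 1, 4096])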

Keywords
Audio systems, Feedforward neural networks, Music, Nonlinear systems, Supervised learning

Paper topics
Digital audio effects, Sound/music signal processing algorithms

Easychair keyphrases
neural network [25], real time [25], convolutional layer [20], big muff [16], deep neural network [15], distortion effect [13], processing speed [13], ibanez tube screamer [12], training data [12], convolution channel [11], digital audio effect [11], receptive field [10], activation function [9], clipping amp [9], gated activation [9], harmonix big muff pi [8], layer model [8], non linear bp filter [8], signal ratio [8], tube screamer [8], audio interface [7], big muff pi [7], black box modeling [7], impulse response [7], audio distortion circuit [6], computational load [6], nonlinear activation function [6], selected model [6], tone stage [6], validation loss [6]

Paper type
Full paper

DOI: 10.5281/zenodo.3249374
Zenodo URL: https://zenodo.org/record/3249374


2019.78
Representations of Self-Coupled Modal Oscillators with Time-Varying Frequency
Smyth, Tamara   University of California, San Diego (UCSD); San Diego, United States
Hsu, Jennifer   University of California, San Diego (UCSD); San Diego, United States

Abstract
In this work we examine a simple mass-spring system in which the natural frequency is modulated by its own oscillations, a self-coupling that creates a feedback system in which the output signal "loops back" with an applied coefficient to modulate the frequency. This system is first represented as a mass-spring system, then in the context of well-known frequency and phase modulation synthesis, and finally as a time-varying stretched allpass filter, where both the allpass coefficients and the filter order are made time varying, the latter to allow for changes to the sounding frequency over time (e.g., pitch glides). Expressions are provided that map parameters of one representation to another, allowing either to be used for real-time synthesis.

Keywords
feedback systems, frequency and phase modulation synthesis, nonlinear modal coupling, pitch glides, time-varying allpass filters

Paper topics
Digital audio effects, Models for sound analysis and synthesis, Sound/music signal processing algorithms

Easychair keyphrases
time varying [16], sounding frequency [13], made time varying [12], self coupled oscillator [11], instantaneous phase [9], instantaneous frequency [8], real part [8], closed form representation [7], loopback fm parameter [7], loopback fm oscillator [6], mass spring system [6], frequency modulation [5], numerical integration [5], discrete time [4], final expression [4], time varying frequency [4], transfer function [4], unit sample delay [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249423
Zenodo URL: https://zenodo.org/record/3249423


2019.79
Resonance Improviser: A system for transmitting the embodied sensations of vocalization between two people during improvisation
Kelkar, Tejaswinee   University of Oslo; Oslo, Norway
Gerry, Lynda   Aalborg University; Aalborg, Denmark

Abstract
This is a system prototype for joint vocal improvisation between two people that involves sharing the embodied sensations of vocal production. This is accomplished using actuators that excite the two participants' rib cages with each other's voices, turning a person's body into a loudspeaker. A microphone transmits the vocal signals, and the players are given a Max patch to modulate the sound and feel of their voice. The receiver hears the other person's speech and effects through their own body (as if it were their own voice), while also feeling the resonance of the sound signal as it would resonate in the chest cavity of the other. The two players try to re-enact and improvise a script prompt provided to them without knowing what the other person can hear of their voice. The game may turn collaborative, adversarial, or artistic depending on the game play.

Keywords
actuator, sound exciter, system prototype, vocal improvisation

Paper topics
not available

Easychair keyphrases
social embodiment [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249443
Zenodo URL: https://zenodo.org/record/3249443


2019.80
SonaGraph. A cartoonified spectral model for music composition
Valle, Andrea   Interdipartimental Center for Research on Multimedia and Audiovideo (CIRMA), Dipartimento di Studi Umanistici, Università di Torino; Torino, Italy

Abstract
This paper presents SonaGraph, a framework and application for a simplified but efficient harmonic spectrum analyzer suitable for assisted and algorithmic composition. The model is inspired by the analog Sonagraph and relies on a constant-Q bandpass filter bank. First, the historical Sonagraph is introduced; then, starting from it, a simplified (“cartoonified”) model is discussed. An implementation in SuperCollider is presented that includes various utilities (interactive GUIs, music notation generation, graphic export, data communication). A comparison of results in relation to other tools for assisted composition is presented. Finally, some musical examples that make use of spectral data from SonaGraph to generate, retrieve and display music information are discussed.

Keywords
Assisted composition, Music notation, Spectral information

Paper topics
Algorithms and Systems for music composition, Interfaces for sound and music, Models for sound analysis and synthesis, Music information retrieval

Easychair keyphrases
sound object level [12], spectral data [12], music notation [11], filter bank [10], real time [10], music notation transcription [6], audio level [5], computer music [5], interactive gui [5], sample rate [5], spectral information [5], time resolution [5], amplitude threshold [4], assisted composition [4], constant q bandpass filter [4], gathered data [4], lilypond code [4], music information retrieval [4], spectral analysis [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249425
Zenodo URL: https://zenodo.org/record/3249425


2019.81
SONIC CHARACTERISTICS OF ROBOTS IN FILMS
Latupeirissa, Adrian B.   KTH Royal Institute of Technology; Stockholm, Sweden
Frid, Emma   KTH Royal Institute of Technology; Stockholm, Sweden
Bresin, Roberto   KTH Royal Institute of Technology; Stockholm, Sweden

Abstract
Today, robots are increasingly becoming an integral part of our everyday life. The expectations humans have about robots are influenced by how they are represented in science fiction films. The process of designing sonic interaction for robots is similar to how a Foley artist designs the sound effects of a film. In this paper, we present an exploratory study focusing on the sonic characteristics of robot sounds in films. We believe that findings from the current study could be of relevance for future robotic applications involving the communication of internal states through sounds, as well as for the sonification of expressive robot movements. Excerpts from five films were analyzed using the Long Time Average Spectrum (LTAS). As an overall observation, we found that a robot's sonic presence is highly related to its physical appearance. Preliminary results show that most of the robots analysed in this study have a "metallic" quality in their voice, matching the material of their physical form. Characteristics of their voices show significant differences compared to those of human characters; the fundamental frequency of robots is shifted to either higher or lower values than that of human characters, and their voices span a larger frequency band.
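
A long-time average spectrum can be approximated as an averaged power spectral density, for instance with Welch's method; the synthetic signal below merely stands in for a film excerpt.

    import numpy as np
    from scipy.signal import welch

    fs = 44100
    t = np.arange(fs * 5) / fs
    signal = np.sin(2 * np.pi * 220 * t) + 0.1 * np.random.randn(t.size)

    freqs, psd = welch(signal, fs=fs, nperseg=4096, noverlap=2048)
    ltas_db = 10 * np.log10(psd + 1e-12)       # averaged spectrum in dB
    print(freqs[np.argmax(psd)])               # strongest component (close to 220 Hz)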

Keywords
film sound design, human-robot interaction, LTAS, non-verbal communication, robot sound, sonic interaction design

Paper topics
Multimodality in sound and music computing, Social interaction in sound and music computing, Sonic interaction design

Easychair keyphrases
sound design [11], andrew martin [7], human robot interaction [7], robot sound [7], sonao project [7], robot movement [6], bicentennial man [5], frequency band [5], non verbal [5], physical appearance [5], short circuit [5], bremen emotional sound toolkit [4], emotional expression [4], fictional robot [4], fundamental frequency [4], kth royal institute [4], main human character [4], mechanical sound [4], music computing kth [4], non verbal communication [4], non verbal sound [4], real world robot [4], robot andrew [4], robot sonic presence [4], video excerpt [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249337
Zenodo URL: https://zenodo.org/record/3249337


2019.82
Sonic Sweetener Mug
Mathiesen, Signe Lund   Department of Food Science, Aarhus University; Aarhus, Denmark
Byrne, Derek Victor   Department of Food Science, Aarhus University; Aarhus, Denmark
Wang, Qian Janice   Department of Food Science, Aarhus University; Aarhus, Denmark

Abstract
Food and music are fundamental elements of most people's lives. Both eating and listening can modify our emotional and cognitive states, and, when paired, can result in surprising perceptual effects. This demo explores the link between the two phenomena of music and food, specifically the way in which what we taste can be influenced by what we listen to. We demonstrate how the same beverage can taste very different depending on the music that happens to be playing at the same time. To do this, we have created a system that turns the act of drinking into a form of embodied interaction with music. This highlights the multisensory character of flavour perception and underscores the way in which sound can be used to raise people's awareness of their own eating behaviour.

Keywords
Interactive systems, Multisensory flavour perception, Music, Sonic seasoning

Paper topics
not available

Easychair keyphrases
crossmodal correspondence [6], aarhus university [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249364
Zenodo URL: https://zenodo.org/record/3249364


2019.83
SOUND DESIGN THROUGH LARGE AUDIENCE INTERACTION
Hansen, Kjetil Falkenberg   KTH Royal Institute of Technology; Stockholm, Sweden
Ljungdahl Eriksson, Martin   University West / Edsbyn; Trollhättan, Sweden
Atienza, Ricardo   University of Arts, Crafts and Design (Konstfack); Stockholm, Sweden

Abstract
In collaboration with Volvo Cars, we presented a novel design tool to a large public of approximately three million people at the three leading motor shows in 2017 in Geneva, Shanghai and New York. The purpose of the tool was to explore the relevance of interactive audio-visual strategies for supporting the development of sound environments in future silent cars, i.e., a customised sonic identity that would alter the sonic ambience for the driver and by-passers. This new tool should be able to efficiently collect non-experts' sonic preferences for different given contexts. The design process should allow for a high-level control of complex synthesised sounds. The audience interacted individually using a single-touch selection of colour from five palettes and applying it by pointing to areas in a colour-book painting showing a road scene. Each palette corresponded to a sound, and the colour nuance in the palette corresponded to certain tweaking of the sound. In effect, the user selected and altered each sound, added it to the composition, and finally would hear a mix of layered sounds based on the colouring of the scene. The installation involved large touch screens with high quality headphones. In the study presented here, we examine differences in sound preferences between two audiences and a control group, and evaluate the feasibility of the tool based on the sound designs that emerged.

Keywords
Car sounds, Interaction, Novel interfaces, Sound design, Sound installation

Paper topics
Interactive performance systems, Interfaces for sound and music, Multimodality in sound and music computing, New interfaces for interactive music creation, Sonic interaction design

Easychair keyphrases
sound design [15], control group [13], school bell sound [9], motor sound [8], colour nuance [6], rolling sound [6], shanghai audience [6], colour book [5], colour palette [5], data collection [5], school scene [5], sound effect [5], audio effect [4], bell harmonic rolling [4], city centre [4], geneva audience [4], harmonic sound [4], musical expression [4], school area [4], school bell [4], volvo car [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249284
Zenodo URL: https://zenodo.org/record/3249284


2019.84
Sound in Multiples: Synchrony and Interaction Design using Coupled-Oscillator Networks
Lem, Nolan   Stanford University; Stanford, United States

Abstract
Systems of coupled oscillators can be employed in a variety of algorithmic settings to explore the self-organizing dynamics of synchronization. In the realm of audio-visual generation, coupled oscillator networks can usefully be applied to musical content related to rhythmic perception, sound synthesis, and interaction design. By formulating different models of these generative dynamical systems, I outline different methodologies for generating sound from collections of interacting oscillators and discuss how their rich, non-linear dynamics can be exploited in the context of sound-based art. A summary of these mathematical models is discussed, and a range of applications (audio synthesis, rhythmic generation, and music perception) is proposed in which they may be useful for producing and analyzing sound. I discuss these models in relation to two of my own kinetic sound sculptures, analyzing to what extent they can be used to characterize synchrony as an analytical tool.
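
The standard formulation of such networks is the Kuramoto model; the sketch below simulates it and computes the order parameter commonly used to quantify synchrony (it is generic, not the sculptures' actual code).

    import numpy as np

    N, K, dt, steps = 50, 1.5, 0.01, 2000
    rng = np.random.default_rng(0)
    omega = rng.normal(2 * np.pi * 1.0, 0.5, N)     # intrinsic frequencies (rad/s)
    theta = rng.uniform(0, 2 * np.pi, N)            # initial phases

    for _ in range(steps):
        # d(theta_i)/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)
        coupling = (K / N) * np.sum(np.sin(theta[None, :] - theta[:, None]), axis=1)
        theta = theta + dt * (omega + coupling)

    # Order parameter r in [0, 1]: 1 means all oscillators are phase-locked.
    r = np.abs(np.mean(np.exp(1j * theta)))
    print(round(float(r), 3))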

Keywords
generative music, sonification, sound art, sound sculpture, synchrony

Paper topics
Algorithms and Systems for music composition, Auditory display and data sonification, Hardware systems for sound and music computing, Interaction in music performance, Interactive performance systems, Interfaces for sound and music, Languages, protocols and software environments for sound and music computing, Models for sound analysis and synthesis, Music creation and performance, New interfaces for interactive music creation, Perception and cognition of sound and music, Sonic interaction design, Sound/music signal processing algorithms

Easychair keyphrases
coupled oscillator [33], instantaneous phase [11], coupled oscillator model [9], coupled oscillator network [9], intrinsic frequency [8], center frequency [6], complex order parameter [6], dynamical system [6], external forcing [5], audio visual resonance [4], coupled oscillator dynamic [4], coupled oscillator system [4], oscillator phase [4], phase coherence [4], phase response function [4], phase vocoder model [4], pushing motion [4], rhythmic generation [4], signal processing [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249427
Zenodo URL: https://zenodo.org/record/3249427


2019.85
State Dependency - Audiovisual interaction through brain states
Neff, Patrick   Center for Neuromodulation, Department of Psychiatry and Psychotherapy, University of Regensburg; Regensburg, Germany
Schacher, Jan C.   Institute for Computer Music and Sound Technology (ICST), Zurich University of the Arts (ZHdK); Zurich, Switzerland
Bisig, Daniel   Institute for Computer Music and Sound Technology (ICST), Zurich University of the Arts (ZHdK); Zurich, Switzerland

Abstract
Artistic installations using brain-computer interfaces (BCI) to interact with media in general, and sound in particular, have become increasingly numerous in recent years. Brain or mental states are commonly used to drive musical score or sound generation as well as visuals. Closed-loop setups can emerge here which are comparable to the propositions of neurofeedback (NFB). The aim of our audiovisual installation State Dependency, driven by brain states and motor imagery, was to enable the participant to engage in unbound exploration of movement through sound and space, unmediated by one's corpo-reality. With the aid of an adaptive feedback loop, perception is taken to the edge. We deployed a BCI to collect motor imagery as well as visual and cognitive neural activity, from which we calculate approximate entropy (a second-order measure of neural signal activity), which is in turn used to interact with the surround Immersive Lab installation. The use of entropy measures on motor imagery and various sensory modalities generates a highly accessible, reactive and immediate experience transcending common limitations of BCI technology. State Dependency goes beyond the common practice of abstract routing between mental or brain states and external audiovisual states. It provides new territory for unrestrained kinaesthetic and polymodal exploration in an immersive audiovisual environment.
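
Approximate entropy can be computed from a windowed signal as follows (a standard textbook formulation, not the installation's exact implementation); regular signals score low, irregular ones higher.

    import numpy as np

    def approximate_entropy(x, m=2, r_factor=0.2):
        x = np.asarray(x, dtype=float)
        r = r_factor * np.std(x)                    # tolerance relative to signal spread

        def phi(m):
            n = len(x) - m + 1
            templates = np.array([x[i:i + m] for i in range(n)])
            # Chebyshev distance between every pair of length-m templates
            dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
            counts = np.sum(dist <= r, axis=1) / n
            return np.mean(np.log(counts))

        return phi(m) - phi(m + 1)

    rng = np.random.default_rng(1)
    print(approximate_entropy(np.sin(np.linspace(0, 20 * np.pi, 500))))   # low: regular
    print(approximate_entropy(rng.standard_normal(500)))                  # higher: irregular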

Keywords
audio visual interaction, biofeedback, brain computer interface, motor imagery

Paper topics
Auditory display and data sonification, Multimodality in sound and music computing, Perception and cognition of sound and music, Sound/music and the neurosciences

Easychair keyphrases
motor imagery [16], neural activity [11], approximate entropy [8], entropy measure [8], immersive lab [8], state dependency [8], movement control [7], real time [7], audio visual [6], audio visual medium [6], brain state [5], eeg signal [5], mental state [5], visual cortex [5], adaptive feedback loop [4], bci art [4], closed loop setup [4], computer music [4], feedback loop [4], left primary motor cortex [4], motor cortex [4], motor imagery data [4], movement perception [4], primary visual cortex [4], right primary motor cortex [4], signal quality [4], swiss national science foundation [4], wet electrode [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249244
Zenodo URL: https://zenodo.org/record/3249244


2019.86
Teach Me Drums: Learning Rhythms through the Embodiment of a Drumming Teacher in Virtual Reality
Serafin, Stefania   Aalborg University; Aalborg, Denmark
Moth-Poulsen, Mie   Aalborg University; Aalborg, Denmark
Bednarz, Tomasz   The University of New South Wales (UNSW); Sydney, Australia
Kuchelmeister, Volker   The University of New South Wales (UNSW); Sydney, Australia

Abstract
This paper investigates how to design an embodied learning experience of a drumming teacher playing hand drums, to aid higher rhythm understanding and accuracy. By providing novices the first-person perspective of a drumming teacher while learning to play a West-African djembe drum, participants' learning was measured objectively by their ability to follow the drumming teacher's rhythms. Participants' subjective learning was assessed through a self-assessment questionnaire measuring aspects of flow, user experience, oneness, and presence. Two test iterations were conducted. In both, no significant difference was found in participants' ability to follow the drumming teacher's tempo between the experimental group exposed to the first-person perspective of the teacher in a VR drum lesson and the control group exposed to a 2D version of the stereoscopic drum lesson. A significant difference was found in the experimental group's presence scores in the first test iteration, and in the experimental group's oneness scores in the second test iteration. Participants' subjective feelings indicated enjoyment of and motivation for the presented learning technique in both groups.

Keywords
drumming, embodiment, pedagogy, virtual reality

Paper topics
Interaction in music performance, Sonic interaction design

Easychair keyphrases
drum lesson [17], first test iteration [17], control group [15], drumming teacher [15], test stimulus [13], test group [10], test iteration [10], first person perspective [9], hand drum [8], d drum lesson [7], virtual reality [7], vr drum lesson [7], independent t test [6], rhythm pattern [6], second test iteration [6], teaching material [6], trial phase [6], user experience [6], drumming lesson [5], drumming recording [5], significant difference [5], djembe drum [4], embodied first person perspective [4], fast tempo difference score [4], mean value [4], participant rhythm performance [4], playing teacher [4], rhythm accuracy [4], self assessment questionnaire [4], significance difference [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249341
Zenodo URL: https://zenodo.org/record/3249341


2019.87
Tempo and Metrical Analysis By Tracking Multiple Metrical Levels Using Autocorrelation
Lartillot, Olivier   University of Oslo; Oslo, Norway
Grandjean, Didier   University of Geneva; Geneva, Switzerland

Abstract
We present a method for tempo estimation from audio recordings based on signal processing and peak tracking, which does not depend on training on ground-truth data. First, an accentuation curve, emphasising the temporal location and accentuation of notes, is computed from a detection of bursts of energy localised in time and frequency. This makes it possible to detect notes in dense polyphonic textures while ignoring spectral fluctuations produced by vibrato and tremolo. Periodicities in the accentuation curve are detected using an improved version of the autocorrelation function. Hierarchical metrical structures, composed of a large set of periodicities in pairwise harmonic relationships, are tracked over time. In this way, the metrical structure can be tracked even if the rhythmical emphasis switches from one metrical level to another. Compared to all other participants in the MIREX Audio Tempo Extraction task from 2006 to 2018, this approach is the third best among those that can track tempo variations. While the two best methods are based on machine learning, our method suggests a way to track tempo founded on signal processing and heuristics-based peak tracking. In addition, the approach offers for the first time a detailed representation of the dynamic evolution of the metrical structure. The method is integrated into MIRtoolbox, a freely available Matlab toolbox.
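The paper's improved autocorrelation and multi-level metrical tracking are not reproduced here; as a point of reference, the following minimal sketch only shows the underlying idea of reading a global tempo off the strongest autocorrelation peak of an accentuation (onset-strength) curve. The frame rate, BPM range, and toy input are assumptions for illustration.

```python
import numpy as np

def tempo_from_accentuation(accent, frame_rate, bpm_range=(40, 240)):
    """Estimate a global tempo from an accentuation curve by picking the
    strongest autocorrelation peak within an admissible BPM range.

    accent     : 1-D accentuation curve, one value per analysis frame
    frame_rate : frames per second of the accentuation curve
    """
    accent = np.asarray(accent, dtype=float)
    accent = accent - accent.mean()

    # Full autocorrelation, keep non-negative lags only.
    ac = np.correlate(accent, accent, mode="full")[len(accent) - 1:]

    # Convert the admissible BPM range into a lag range (in frames).
    min_lag = int(frame_rate * 60.0 / bpm_range[1])
    max_lag = int(frame_rate * 60.0 / bpm_range[0])

    lag = min_lag + np.argmax(ac[min_lag:max_lag])
    return 60.0 * frame_rate / lag

# Toy check: impulses every 0.5 s at a 100 Hz frame rate -> ~120 BPM.
accent = np.zeros(3000)
accent[::50] = 1.0
print(tempo_from_accentuation(accent, frame_rate=100.0))
```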

Keywords
autocorrelation, metrical analysis, tempo

Paper topics
Computational musicology and ethnomusicology, Content processing of music audio signals, Music information retrieval, Sound/music signal processing algorithms

Easychair keyphrases
metrical level [50], metrical structure [33], metrical layer [28], metrical grid [24], metrical period [16], accentuation curve [10], autocorrelation function [10], music information retrieval [7], periodicity score [7], dvorak new world symphony [6], contextual background [5], global tempo [5], metrical centroid [5], peak lag [5], tempo estimation [5], allegro con fuoco [4], autocorrelation based periodogram [4], core metrical level [4], deep learning [4], dotted quarter note [4], dynamic evolution [4], dynamic metrical centroid [4], dynamic metrical centroid curve [4], large range [4], main metrical level [4], metrical analysis [4], mirex audio tempo extraction [4], strongest periodicity [4], successive frame [4], whole note [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249305
Zenodo URL: https://zenodo.org/record/3249305


2019.88
The Chordinator: an Interactive Music Learning Device
McCoy, Eamon   Georgia Institute of Technology; Atlanta, United States
Greene, John   Georgia Institute of Technology; Atlanta, United States
Henson, Jared   Georgia Institute of Technology; Atlanta, United States
Pinder, James   Georgia Institute of Technology; Atlanta, United States
Brown, Jonathan   Georgia Institute of Technology; Atlanta, United States
Arthur, Claire   Georgia Institute of Technology; Atlanta, United States

Abstract
The Chordinator is an interactive and educational music device consisting of a physical board housing a “chord stacking” grid. The board's 8x4 grid steps through each of its eight columns from left to right at a specified tempo, playing the chord built in each column. In the bottom row, the user places a block specifying a bass (root) note, represented as a scale degree; blocks representing major or minor thirds can then be placed above it. Each third block adds its interval above the note below it, and further third blocks stack additional intervals on top, creating a chord. There are three rows above each root, allowing either triads or seventh chords to be built. This interface, combined with the board design, is intended to create a simple representation of chord structure. Using the blocks, the user can physically “build” a chord with the most fundamental skill, in this case “stacking your thirds”, and also learns which chords work best in a sequence. The device provides quick satisfaction and a fun, interactive way to learn about the structure of chords, and can even spark creativity as people build interesting progressions or try to recreate progressions they love from their favorite music.
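As an illustration of the stacking logic described above, here is a hypothetical sketch that turns one column of blocks into MIDI pitches. The board's actual scale handling and block encoding are not specified in the abstract, so the C-major roots and the 'M'/'m' block labels are assumptions.

```python
# Scale-degree roots in C major, as MIDI pitches (C4 = 60). Assumed mapping.
MAJOR_SCALE_DEGREES = [60, 62, 64, 65, 67, 69, 71]  # C D E F G A B

def column_to_midi(root_degree, third_blocks):
    """Turn one Chordinator-style column into MIDI pitches.

    root_degree  : scale degree of the bass note, 1-7
    third_blocks : list of 'M' (major third, 4 semitones) or 'm'
                   (minor third, 3 semitones), stacked bottom to top
    """
    pitches = [MAJOR_SCALE_DEGREES[root_degree - 1]]
    for block in third_blocks:
        interval = 4 if block == "M" else 3
        pitches.append(pitches[-1] + interval)
    return pitches

# Degree 5 with a major then a minor third -> G major triad (G B D).
print(column_to_midi(5, ["M", "m"]))        # [67, 71, 74]
# Degree 2 with three thirds -> D minor seventh (D F A C).
print(column_to_midi(2, ["m", "M", "m"]))   # [62, 65, 69, 72]
```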

Keywords
Arduino, Chords, Chord Sequencer, Education, Interactive, Learning, Stacking Thirds

Paper topics
not available

Easychair keyphrases
third block [7], chord progression [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249360
Zenodo URL: https://zenodo.org/record/3249360


2019.89
The Viking HRTF Dataset
Spagnol, Simone   Aalborg University; Aalborg, Denmark
Purkhús, Kristján Bjarki   University of Iceland; Reykjavik, Iceland
Björnsson, Sverrir Karl   Technical University of Denmark (DTU); Kongens Lyngby, Denmark
Unnthorsson, Runar   University of Iceland; Reykjavik, Iceland

Abstract
This paper describes the Viking HRTF dataset, a collection of head-related transfer functions (HRTFs) measured at the University of Iceland. The dataset includes full-sphere HRTFs measured on a dense spatial grid (1513 positions) with a KEMAR mannequin with 20 different artificial left pinnae attached, one at a time. The artificial pinnae were previously obtained through a custom molding procedure from 20 different lifelike human heads. The analyses of results reported here suggest that the collected acoustical measurements are robust, reproducible, and faithful to reference KEMAR HRTFs, and that material hardness has a negligible impact on the measurements compared to pinna shape. The purpose of the present collection, which is available for free download, is to provide accurate input data for future investigations on the relation between HRTFs and anthropometric data through machine learning techniques or other state-of-the-art methodologies.
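The dataset's file format and loading API are not described in the abstract, so the sketch below assumes the HRIRs for one measured direction have already been loaded as numpy arrays; it only illustrates the typical use of such measurements, rendering a virtual sound source by convolving a mono signal with a left/right impulse-response pair.

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(mono, hrir_left, hrir_right):
    """Render a mono signal at the direction of a measured HRIR pair
    by convolving it with the left- and right-ear impulse responses."""
    left = fftconvolve(mono, hrir_left)
    right = fftconvolve(mono, hrir_right)
    return np.stack([left, right], axis=1)   # (samples, 2) stereo buffer

# Toy usage with stand-in data (a real HRIR pair would come from the dataset).
fs = 48000
mono = np.random.randn(fs)                      # 1 s of noise
hl, hr = np.random.randn(256), np.random.randn(256)
print(render_binaural(mono, hl, hr).shape)      # (48255, 2)
```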

Keywords
binaural, HRTF, KEMAR, spatial sound

Paper topics
Spatial sound, reverberation, and virtual acoustics

Easychair keyphrases
head related transfer function [14], related transfer function [10], negative mold [7], right channel [7], left channel [6], mean spectral distortion [6], pinna shape [6], standard large anthropometric pinna [6], audio eng [5], kemar mannequin [5], left pinna [5], measurement session [5], custom made pinna [4], dummy head [4], ear canal [4], impulse response [4], jesmonite r ear [4], kemar pinna replica [4], lifelike human head [4], pinna related transfer function [4], related transfer [4], shore oo hardness [4], signal process [4], starting point [4], viking hrtf dataset [4], virtual sound source [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249252
Zenodo URL: https://zenodo.org/record/3249252


2019.90
Toward Automatic Tuning of the Piano
Tuovinen, Joonas   Aalto University; Espoo, Finland
Hu, Jamin   Sibelius Academy, University of the Arts Helsinki; Helsinki, Finland
Välimäki, Vesa   Aalto University; Espoo, Finland

Abstract
The tuning of a piano is a complicated and time-consuming process, which is usually left to a professional tuner. To make the process faster and independent of the skills of a professional tuner, a semi-automatic piano tuning system is developed. The aim of the system is to tune a grand piano semi-automatically with the help of a non-professional tuner. The system consists of an aluminum frame, a stepper motor, an Arduino, a microphone, and a laptop computer. The stepper motor changes the tuning of the piano strings by turning the pins connected to them, whereas the aluminum frame holds the motor in place and the Arduino controls the motor. The microphone and the computer are used as part of a closed-loop control system, which tunes the strings automatically. The control system tunes the strings by minimising the difference between the current and the optimal fundamental frequency. The current fundamental frequency is obtained with an inharmonicity coefficient estimation algorithm, and the optimal fundamental frequency is calculated with the Connected Reference Interval (CRI) tuning process. With the CRI tuning process, a tuning close to that of a professional tuner is achieved, with a deviation of 2.5 cents (RMS) between the keys A0 and G5 and 8.1 cents (RMS) between G#5 and C8, where the tuner's tuning seems to be less consistent.
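The paper's CRI tuning process and inharmonicity-estimation algorithm are not reproduced here. As background for the quantities the abstract mentions, the following sketch evaluates the standard stiff-string partial model f_n = n * f0 * sqrt(1 + B * n^2) and the cent deviation used to report tuning accuracy; the example inharmonicity coefficient is only a typical order of magnitude, not a value from the paper.

```python
import numpy as np

def partial_frequencies(f0, B, n_partials=8):
    """Partial frequencies of a stiff piano string,
    f_n = n * f0 * sqrt(1 + B * n^2),
    where B is the inharmonicity coefficient."""
    n = np.arange(1, n_partials + 1)
    return n * f0 * np.sqrt(1.0 + B * n ** 2)

def cents(f_measured, f_target):
    """Deviation of a measured frequency from its target, in cents."""
    return 1200.0 * np.log2(f_measured / f_target)

# Example: A4 string with an assumed, typical-order inharmonicity coefficient.
print(partial_frequencies(440.0, 4e-4, 4))
print(cents(442.0, 440.0))   # ~ +7.85 cents sharp
```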

Keywords
acoustic signal processing, audio systems, automatic control, music, spectral analysis

Paper topics
Hardware systems for sound and music computing, Models for sound analysis and synthesis, Sound/music signal processing algorithms

Easychair keyphrases
fundamental frequency [37], partial frequency [14], professional tuner [14], closed loop control system [12], inharmonicity coefficient [12], stepper motor [12], beating rate [11], cri tuning process [11], inharmonicity coefficient estimation [9], piano string [8], coefficient estimation algorithm [7], target fundamental frequency [7], tuning process [7], aluminum frame [6], control system [6], piano tuner [6], piano tuning [6], piano tuning system [6], reference octave [6], cri process [5], lower tone [5], mat algorithm [5], measured output [5], mode frequency [5], target frequency [5], first matching partial [4], optimal fundamental frequency [4], piano tuning robot [4], tone equal temperament scale [4], yamaha grand piano [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249293
Zenodo URL: https://zenodo.org/record/3249293


2019.91
Towards a High-Performance Platform for Sonic Interaction Interfaces
Fasciani, Stefano   University of Wollongong in Dubai / University of Oslo; Dubai, United Arab Emirates
Vohra, Manohar   University of Wollongong in Dubai; Dubai, United Arab Emirates

Abstract
In this paper we introduce a hardware platform for prototyping interfaces of demanding sonic interactive systems. We target applications featuring a large array of analog sensors requiring data acquisition and transmission to computers at fast rates, with low latency and high bandwidth. This work is part of an ongoing project which aims to provide designers with a cost-effective and accessible platform for fast prototyping of complex interfaces for sonic interactive systems or musical instruments. The high performance is guaranteed by a SoC FPGA. The functionality of the platform can be customized without requiring significant technical expertise. In this paper, we discuss the principles, the current design, and the preliminary evaluation against common microcontroller-based platforms. The proposed platform can sample up to 96 analog channels at rates up to 24 kHz and stream the data via UDP to computers with sub-millisecond latency.
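The abstract states that acquired sensor data is streamed to the computer over UDP, but does not describe the datagram layout. The sketch below shows a hypothetical receiver that unpacks 16-bit samples for 96 channels, purely to illustrate how such a stream could be consumed on the computer side; the port number and packet format are assumptions.

```python
import socket
import numpy as np

# Hypothetical receiver for a sensor-acquisition board streaming analog
# channels over UDP. The layout (96 channels, 16-bit little-endian samples,
# whole frames per datagram) is assumed for illustration only.
N_CHANNELS = 96
PORT = 9000

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", PORT))

while True:
    data, _ = sock.recvfrom(65535)              # one datagram
    samples = np.frombuffer(data, dtype="<i2")  # 16-bit samples
    frames = samples.reshape(-1, N_CHANNELS)    # (frames, channels)
    # ...hand the sensor frames to the sound-synthesis layer here...
```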

Keywords
Hardware Platform, Musical Interface, Sonic Interaction

Paper topics
Hardware systems for sound and music computing, Interfaces for sound and music, New interfaces for interactive music creation, Sonic interaction design

Easychair keyphrases
sampling rate [16], sonic interactive system [14], analog signal [13], data acquisition [11], acquisition board [10], microcontroller based platform [9], sound synthesis [9], simultaneous sampling [8], fpga pin [7], maximum rate [7], maximum sampling rate [7], arm cortex [6], board computer [6], data acquisition system [6], pure data [6], udp packet [6], buffer size [5], musical instrument [5], serial interface [5], sonic interactive [5], bit arm cortex [4], filter bank [4], fpga based platform [4], fpga fabric [4], maximum data acquisition rate [4], measured data transmission [4], microcontroller based board [4], pressure sensitive touchpad [4], sensor data [4], sonic interaction [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249278
Zenodo URL: https://zenodo.org/record/3249278


2019.92
Towards CNN-based Acoustic Modeling of Seventh Chords for Automatic Chord Recognition
Nadar, Christon-Ragavan   Semantic Music Technologies Group, Fraunhofer Institute of Digital Media Technology (IDMT); Ilmenau, Germany
Abeßer, Jakob   Semantic Music Technologies Group, Fraunhofer Institute of Digital Media Technology (IDMT); Ilmenau, Germany
Grollmisch, Sascha   Semantic Music Technologies Group, Fraunhofer Institute of Digital Media Technology (IDMT); Ilmenau, Germany

Abstract
In this paper, we build upon a recently proposed deep convolutional neural network architecture for automatic chord recognition (ACR). We focus on extending the commonly used major/minor vocabulary (24 classes) to an extended chord vocabulary of seven chord types with a total of 84 classes. In our experiments, we compare joint and separate classification of the chord type and chord root pitch class, using one or two separate models, respectively. We perform a large-scale evaluation using various combinations of training and test sets of different timbre complexity. Our results show that ACR with an extended chord vocabulary achieves high f-scores of 0.97 for isolated chord recordings and 0.66 for mixed contemporary popular music recordings. While the joint ACR modeling leads to the best results for isolated instrument recordings, the separate modeling strategy performs best for complex music recordings. Alongside this paper, we publish a novel dataset for extended-vocabulary chord recognition which consists of synthetically generated isolated recordings of various musical instruments.
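The joint versus separate modeling comparison amounts to two ways of addressing the same 84-class label space (12 root pitch classes x 7 chord types). The sketch below shows one plausible encoding; the concrete chord-type list and ordering used by the authors are assumptions.

```python
# Hypothetical label encoding for extended-vocabulary chord recognition:
# 12 root pitch classes x 7 chord types = 84 joint classes.
ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
TYPES = ["maj", "min", "maj7", "min7", "7", "dim", "aug"]   # assumed 7 types

def to_joint_label(root_idx, type_idx):
    """Combine root and type into a single class index in [0, 83] (joint model)."""
    return root_idx * len(TYPES) + type_idx

def from_joint_label(label):
    """Split a joint class index back into (root, type), as two separate models would predict."""
    return divmod(label, len(TYPES))

label = to_joint_label(ROOTS.index("G"), TYPES.index("7"))   # G7
print(label, from_joint_label(label))                        # 53 (7, 4)
```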

Keywords
automatic chord recognition, deep convolutional neural network, harmony analysis

Paper topics
Automatic separation, classification of sound and music, Models for sound analysis and synthesis, Music information retrieval, recognition, Sound/music signal processing algorithms

Easychair keyphrases
chord type [28], chord recognition [20], isolated chord recording [15], music information retrieval [14], root pitch class [14], extended vocabulary acr [12], chord root pitch [11], chord voicing [10], seventh chord [10], acr model [8], extended vocabulary [8], music recording [8], neural network [8], automatic chord recognition [7], chord tone [7], th international society [7], acoustic modeling [6], chord label [6], chord vocabulary [6], isolated instrument recording [6], midi file [6], minor chord [6], minor chord vocabulary [6], modeling strategy [6], real life acr application [6], novel dataset [5], training set [5], chord recognition dataset [4], final dense layer [4], high f score [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249472
Zenodo URL: https://zenodo.org/record/3249472


2019.93
URALi: a proposal of approach to real-time audio synthesis in Unity
Dorigatti, Enrico   Conservatorio F. A. Bonporti; Trento, Italy

Abstract
This paper gives a basic overview of the URALi (Unity Real-time Audio Library) project, which is currently under development. URALi is a library that aims to provide a collection of software tools for realizing real-time sound synthesis in applications and software developed with Unity.

Keywords
not available

Paper topics
not available

Easychair keyphrases
not available

Paper type
Demo

DOI: 10.5281/zenodo.3249266
Zenodo URL: https://zenodo.org/record/3249266


2019.94
VIBRA - Technical and Artistic Issues in an Interactive Dance Project
Bergsland, Andreas   Norwegian University of Science and Technology (NTNU); Trondheim, Norway
Saue, Sigurd   Norwegian University of Science and Technology (NTNU); Trondheim, Norway
Stokke, Pekka   Ljos A/S; Norway

Abstract
The paper presents the interactive dance project VIBRA, based on two workshops that took place in 2018. It describes the technical solutions applied and discusses the artistic and expressive experiences. Central to the discussion is how the technical equipment, implementation, and mappings to different media affected the expressive and experiential reactions of the dancers.

Keywords
computer visuals, Interactive dance, motion sensors, spatial sound

Paper topics
Improvisation in music through interactivity, Interaction in music performance, Interactive performance systems, Interfaces for sound and music, Sonic interaction design

Easychair keyphrases
interactive dance [17], computer visual [11], myo armband [10], sensor data [10], interactive instrument [7], ngimu sensor [7], third author [7], body part [6], causal relationship [6], technical setup [6], dancer movement [5], musical expression [5], myo mapper [5], data communication [4], first author [4], myo sensor [4], project participant [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249248
Zenodo URL: https://zenodo.org/record/3249248


2019.95
Virtual Reality Music Intervention to Reduce Social Anxiety in Adolescents Diagnosed with Autism Spectrum Disorder
Adjorlu, Ali   Aalborg University; Aalborg, Denmark
Betancourt, Nathaly   Università di Trento; Trento, Italy
Serafin, Stefania   Aalborg University; Aalborg, Denmark

Abstract
This project investigates the potential of Head-Mounted-Display (HMD) based Virtual Reality (VR) that incorporates musical elements as a tool for exposure therapy, designed to help adolescents diagnosed with Autism Spectrum Disorder (ASD) deal with their social anxiety. An application was built that lets the user sing in VR while a virtual audience provides feedback. The application was tested with four adolescents diagnosed with ASD from a school for children with special needs in Denmark. The results of the evaluation are presented in this paper.

Keywords
Autism Spectrum Disorder, Music, Performance Anxiety, Performing, Singing, Social Anxiety, Virtual Audience, Virtual Reality

Paper topics
Interaction in music performance, Sound and music for accessibility and special needs, Sound and music for Augmented/Virtual Reality and games

Easychair keyphrases
virtual audience [27], social anxiety [26], simplified version [21], autism spectrum disorder [17], exposure therapy [15], virtual reality [14], liebowitz social anxiety scale [10], none none none [9], virtual environment [9], vr music intervention [9], likert scale [6], smiley face likert [6], smiley likert scale [6], concert hall [5], described situation [5], future iteration [5], voice command [5], developmental disorder [4], face likert scale [4], feared outcome [4], head mounted display [4], immersive tendency questionnaire [4], multisensory experience lab aalborg [4], presence questionnaire [4], scale ranging [4], virtual concert hall [4], vr exposure therapy [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249339
Zenodo URL: https://zenodo.org/record/3249339


2019.96
Visualizing Music Genres Using a Topic Model
Panda, Swaroop   Indian Institute of Technology Kanpur (IIT KANPUR); Kanpur, India
Namboodiri, Vinay P.   Indian Institute of Technology Kanpur (IIT KANPUR); Kanpur, India
Roy, Shatarupa Thakurta   Indian Institute of Technology Kanpur (IIT KANPUR); Kanpur, India

Abstract
Music genres serve as important meta-data in the field of music information retrieval and have been widely used for music classification and analysis tasks. Visualizing these music genres can thus be helpful for music exploration, archival, and recommendation. Probabilistic topic models have been very successful in modelling text documents. In this work, we visualize music genres using a probabilistic topic model. Unlike text documents, audio is continuous and needs to be sliced into smaller segments. We use simple MFCC features of these segments as musical words. We apply the topic model to the corpus and subsequently use the genre annotations of the data to interpret and visualize the latent space.
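The abstract leaves the vocabulary construction open (segment length, codebook size, number of topics). The following sketch shows one plausible "MFCC frames as musical words" pipeline: quantize frame-level MFCCs with k-means, count the resulting word indices per track, and fit an LDA topic model. The file paths, codebook size, and topic count are placeholders, not the paper's settings.

```python
import numpy as np
import librosa
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation

def track_to_bow(path, codebook, sr=22050, n_mfcc=13):
    """Represent one audio file as a bag of 'musical words':
    MFCC frames quantized against a k-means codebook."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T   # (frames, n_mfcc)
    words = codebook.predict(mfcc)
    return np.bincount(words, minlength=codebook.n_clusters)

# Assumed corpus: paths to audio files (placeholders).
paths = ["track01.wav", "track02.wav"]
all_mfcc = np.vstack([
    librosa.feature.mfcc(y=librosa.load(p, sr=22050)[0], sr=22050, n_mfcc=13).T
    for p in paths
])
codebook = KMeans(n_clusters=128, random_state=0).fit(all_mfcc)

bows = np.array([track_to_bow(p, codebook) for p in paths])
lda = LatentDirichletAllocation(n_components=10, random_state=0).fit(bows)
doc_topics = lda.transform(bows)   # per-track topic proportions to visualize
```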

Keywords
Music Genre Visualization, Probabilistic Music Genres, Probabilistic Topic Models

Paper topics
not available

Easychair keyphrases
topic model [19], music genre [13], probabilistic topic model [9], cluster mean [7], document topic proportion [7], text document [6], latent space [4], progressive genre visualization [4], term topic proportion [4], topic proportion [4]

Paper type
Demo

DOI: 10.5281/zenodo.3249352
Zenodo URL: https://zenodo.org/record/3249352


2019.97
Visual Pitch Estimation
Koepke, A. Sophia   University of Oxford; Oxford, United Kingdom
Wiles, Olivia   University of Oxford; Oxford, United Kingdom
Zisserman, Andrew   University of Oxford; Oxford, United Kingdom

Abstract
In this work, we propose the novel task of automatically estimating pitch (fundamental frequency) from video frames of violin playing using vision alone. In order to investigate this task, we curate a novel dataset of violin playing, which we plan to release publicly to the academic community. To solve this task, we propose a novel Convolutional Neural Network (CNN) architecture that is trained using a student-teacher strategy to transfer discriminative knowledge from the audio domain to the visual domain. At test time, our framework takes video frames as input and directly regresses the pitch. We train and test this architecture on different subsets of our new dataset. Impressively, we show that this task (i.e. pitch prediction from vision) is actually possible. Furthermore, we verify that the network has indeed learnt to focus on salient parts of the image, e.g. the left hand of the violin player is used as a visual cue to estimate pitch.
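The paper's architecture and training details are not given in the abstract. As a generic illustration of the student-teacher idea it describes, the sketch below trains a toy visual CNN to regress pitch values supplied as pseudo ground truth by an audio teacher; the network layout, frame sizes, and random stand-in data are all assumptions, not the authors' models.

```python
import torch
import torch.nn as nn

class VisualPitchStudent(nn.Module):
    """Toy visual 'student' that regresses a single pitch value per frame."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, frames):            # frames: (batch, 3, H, W)
        x = self.features(frames).flatten(1)
        return self.head(x).squeeze(1)

student = VisualPitchStudent()
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

frames = torch.randn(8, 3, 224, 224)      # stand-in video frames
teacher_pitch = torch.rand(8) * 40 + 55   # stand-in pseudo labels from an audio teacher

# One distillation step: the student matches the teacher's pitch estimates.
pred = student(frames)
loss = loss_fn(pred, teacher_pitch)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```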

Keywords
Audio-visual, Multi-modality, Visual pitch estimation

Paper topics
Multimodality in sound and music computing

Easychair keyphrases
video frame [14], visual information [14], pitch network [10], student network [9], convolutional layer [8], pseudo ground truth pitch [8], teacher network [8], violin playing [6], midi number [5], rpa tol [5], silent video [5], test time [5], audio visual [4], ground truth pitch [4], modal audio visual generation [4], multiple input frame [4], pitch frame [4], predict pitch [4], raw pitch accuracy [4], regress pitch [4], test set [4], truth pitch information [4], urmp dataset [4], visual cue [4], visual music transcription [4], visual pitch estimation [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249433
Zenodo URL: https://zenodo.org/record/3249433


2019.98
VocalistMirror: A Singer Support Interface for Avoiding Undesirable Facial Expressions
Lin, Kin Wah Edward   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Nakano, Tomoyasu   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan
Goto, Masataka   National Institute of Advanced Industrial Science and Technology (AIST); Tsukuba, Japan

Abstract
We present VocalistMirror, an interactive user interface that enables a singer to avoid their undesirable facial expressions in singing video recordings. Since singers usually focus on singing expressions and do not care about facial expressions, they sometimes notice that some of their own facial expressions are undesirable when watching recorded singing videos. VocalistMirror allows a singer to first specify their undesirable facial expressions in a recorded video, and then sing again while seeing a real-time warning that is shown when the singer's facial expression becomes similar to one of the specified undesirable expressions. It also displays Karaoke-style lyrics with a piano-roll melody and visualizes acoustic features of the singing voice. The iOS ARKit framework is used to quantify the facial expression as a 52-dimensional vector, which is then used to compute the distance from the undesirable expressions. Our experimental results showed the potential of the proposed interface.
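The abstract specifies a 52-dimensional facial-expression vector, and the keyphrases indicate an L1-norm distance to the marked expressions. The sketch below shows that comparison in isolation; the warning threshold and the random stand-in vectors are assumptions.

```python
import numpy as np

def should_warn(current, undesirable, threshold):
    """Warn when the current 52-dimensional facial-expression vector comes
    close (in L1 distance) to any previously marked undesirable expression.

    current     : (52,) expression vector for the current frame
    undesirable : (k, 52) vectors the singer marked as undesirable
    threshold   : distance below which the warning is shown (assumed tunable)
    """
    distances = np.abs(undesirable - current).sum(axis=1)
    return bool(distances.min() <= threshold)

# Toy example with random vectors standing in for ARKit-style weights.
rng = np.random.default_rng(0)
marked = rng.random((3, 52))
frame = marked[1] + 0.01 * rng.random(52)   # nearly identical to one marked pose
print(should_warn(frame, marked, threshold=2.0))   # True
```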

Keywords
facial expression, singer support interface, singing video

Paper topics
Interfaces for sound and music, Multimodality in sound and music computing

Easychair keyphrases
facial expression [68], undesirable facial expression [38], singing voice [18], serious music background [15], real time [14], short singing video clip [10], video clip [10], acoustic feature interface [9], singing video clip [9], acoustic feature [7], facial expression interface [7], fundamental frequency [7], dimensional facial vector [6], karaoke style lyric [6], l1 norm distance [6], selected undesirable facial expression [6], singing pitch [6], interface design [5], music computing [5], singing video [5], truedepth camera [5], video recording [5], expression overall impression [4], exterior design feature [4], piano roll melody [4], real time vocal part arrangement [4], rwc music database [4], similar facial expression [4], singer facial expression [4], singing app [4]

Paper type
Full paper

DOI: 10.5281/zenodo.3249451
Zenodo URL: https://zenodo.org/record/3249451


2019.99
VUSAA: An Augmented Reality Mobile App for Urban Soundwalks
Moreno, Josué   University of the Arts Helsinki; Helsinki, Finland
Norilo, Vesa   University of the Arts Helsinki; Helsinki, Finland

Abstract
This paper presents VUSAA, an augmented reality soundwalking application for Apple iOS devices. The application is based on the idea of Urban Sonic Acupuncture, providing site-aware generative audio content aligned with the present sonic environment. The sound-generating algorithm was implemented in Kronos, a declarative programming language for musical signal processing. We discuss the conceptual framework and implementation of the application, along with the practical considerations of deploying it via a commercial platform. We present results from a number of soundwalks organized so far and outline an approach to developing new models for urban dwelling.

Keywords
augmented reality, generative composition, mobile application

Paper topics
Automatic music generation/accompaniment systems, Sonic interaction design, Sound and music for Augmented/Virtual Reality and games

Easychair keyphrases
urban sonic acupuncture [14], aural weather [8], sonic acupuncture [7], app store [6], augmented reality [6], augmented reality soundwalking [6], ios app store [6], sonic content [6], user interface [5], app store review [4], conceptual framework [4], mobile device [4], public space [4], urban acupuncture [4], urban sonic acupuncture strategy [4]

Paper type
Position paper / Poster

DOI: 10.5281/zenodo.3249416
Zenodo URL: https://zenodo.org/record/3249416
