Dates: from July 01 to July 04, 2017
Place: Espoo, Finland
Proceedings info: Proceedings of the 14th Sound and Music Computing Conference 2017, ISBN 978-952-60-3729-5
Abstract
Music constraint systems provide a rule-based approach to composition. Existing systems allow users to constrain the harmony, but the constrainable harmonic information is restricted to pitches and intervals between pitches. More abstract analytical information such as chord or scale types, their root, scale degrees, enharmonic note representations, whether a note is the third or fifth of a chord and so forth are not supported. However, such information is important for modelling various music theories. This research proposes a framework for modelling harmony at a high level of abstraction. It explicitly represents various analytical information to allow for complex theories of harmony. It is designed for efficient propagation-based constraint solvers. The framework supports the common 12-tone equal temperament and arbitrary other equal temperaments. Users develop harmony models by applying user-defined constraints to its music representation. Three examples demonstrate the expressive power of the framework: (1) an automatic melody harmonisation with a simple harmony model; (2) a more complex model implementing large parts of Schoenberg's tonal theory of harmony; and (3) a composition in extended tonality. Schoenberg's comprehensive theory of harmony has not been computationally modelled before, neither with constraint programming nor in any other way.
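As a rough illustration of the rule-based idea described above (not the paper's framework, which targets propagation-based constraint solvers and a far richer representation), the following minimal Python sketch harmonises a melody of pitch classes under two invented constraints: each melody note must be a chord tone, and consecutive chords must share a pitch class. The chord vocabulary and constraints are purely illustrative.

```python
# Toy rule-based harmonisation with explicit chord roots and types.
CHORD_TYPES = {"maj": (0, 4, 7), "min": (0, 3, 7), "dom7": (0, 4, 7, 10)}

def chord_pcs(root, ctype):
    """Pitch classes of a chord given its root (0-11) and type name."""
    return {(root + i) % 12 for i in CHORD_TYPES[ctype]}

def harmonise(melody_pcs, so_far=()):
    """Backtracking search: one chord per melody note such that
    (1) the melody note is a chord tone and
    (2) consecutive chords share at least one pitch class."""
    if len(so_far) == len(melody_pcs):
        return so_far
    i = len(so_far)
    for r in range(12):
        for t in CHORD_TYPES:
            pcs = chord_pcs(r, t)
            if melody_pcs[i] not in pcs:
                continue                              # melody note must be a chord tone
            if so_far and not (pcs & chord_pcs(*so_far[-1])):
                continue                              # consecutive chords must share a tone
            result = harmonise(melody_pcs, so_far + ((r, t),))
            if result:
                return result
    return None

print(harmonise([0, 4, 7, 0]))   # harmonise a C-E-G-C melody fragment
```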
Keywords
algorithmic composition, computer aided composition, constraint programming, harmony, music theory
Paper topics
Automatic composition, accompaniment, and improvisation; Computational musicology and mathematical music theory
Easychair keyphrases
pitch class [33], constraint programming [15], integer variable [13], scale type [12], chord type [10], scale degree [10], chord root [9], harmony model [8], underlying harmony [8], analytical information [7], microtonal music [6], music constraint system [6], music representation [6], nonharmonic tone [6], propagation based constraint solver [6], variable domain [6], allow user [5], chord tone [5], contained object [5], element constraint [5], end time [5], equal temperament [5], formal detail [5], harmonic rhythm [5], melodic interval [5], music theory [5], note pitch [5], dominant seventh chord [4], pitch class integer [4], voice leading distance [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401977
Zenodo URL: https://zenodo.org/record/1401977
Abstract
This paper presents an adaptive karaoke system that can extract accompaniment sounds from music audio signals in an online manner and play those sounds synchronously with users' singing voices. This system enables a user to expressively sing an arbitrary song by dynamically changing the tempo of the user's singing voices. A key advantage of this system is that users can immediately enjoy karaoke without preparing musical scores (MIDI files). To achieve this, we use online methods of singing voice separation and audio-to-audio alignment that can be executed in parallel. More specifically, music audio signals are separated into singing voices and accompaniment sounds from the beginning using an online extension of robust nonnegative matrix factorization. The separated singing voices are then aligned with a user's singing voices using online dynamic time warping. The separated accompaniment sounds are played back according to the estimated warping path. The quantitative and subjective experimental results showed that although there is room for improving the computational efficiency and alignment accuracy, the system has a great potential for offering a new singing experience.
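The following is a much-simplified sketch of the online-alignment idea mentioned above: each incoming live-voice feature frame extends a DTW cost column over the separated reference voice, and the cheapest cell gives the estimated playback position. It is not the authors' implementation (a real online DTW such as Dixon's OLTW bounds the search region, and the separation front end is not shown); all feature shapes are placeholders.

```python
import numpy as np

class OnlineAligner:
    """Naive online alignment: each incoming live frame appends a new DTW
    column over the full reference; the estimated reference position is the
    cheapest cell in that column."""
    def __init__(self, ref_feats):
        self.ref = ref_feats                      # shape (n_ref_frames, n_dims)
        self.prev_col = None                      # accumulated-cost column

    def step(self, live_frame):
        dist = np.linalg.norm(self.ref - live_frame, axis=1)
        if self.prev_col is None:
            col = np.cumsum(dist)                 # first column: vertical moves only
        else:
            col = np.empty_like(dist)
            col[0] = self.prev_col[0] + dist[0]
            for i in range(1, len(dist)):
                col[i] = dist[i] + min(self.prev_col[i],      # horizontal
                                       self.prev_col[i - 1],  # diagonal
                                       col[i - 1])            # vertical
        self.prev_col = col
        return int(np.argmin(col))                # estimated reference frame index

# usage: feed feature frames (e.g. chroma) of the user's voice one by one
ref = np.random.rand(500, 12)                     # placeholder reference features
aligner = OnlineAligner(ref)
position = aligner.step(np.random.rand(12))
```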
Keywords
adaptive karaoke system, audio-to-audio alignment, automatic accompaniment, singing voice, singing voice separation
Paper topics
Analysis, synthesis, and modification of sound; Automatic composition, accompaniment, and improvisation; Computer-based music analysis; High-performance computing for audio; Music performance; Music performance analysis and rendering
Easychair keyphrases
singing voice [72], singing voice separation [42], user singing voice [30], accompaniment sound [24], music audio signal [17], stretch rate [14], audio alignment [12], audio signal [12], warping path [12], musical audio signal [9], musical score [9], real time [9], adaptive karaoke system [7], cost matrix [7], mini batch [7], real time audio [7], separated singing voice [7], singing voice alignment [7], automatic accompaniment [6], robust principal component analysis [6], user interface [6], karaoke system [5], deep recurrent neural network [4], hidden markov model [4], low rank [4], online dynamic time warping [4], phase vocoder [4], polyphonic midi score following [4], robust nonnegative matrix factorization [4], score alignment [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401897
Zenodo URL: https://zenodo.org/record/1401897
Abstract
This paper presents findings from an exploratory study on the effect of auditory feedback on gaze behavior. A total of 20 participants took part in an experiment where the task was to throw a virtual ball into a goal in different conditions: visual only, audiovisual, visuohaptic and audiovisuohaptic. Two different sound models were compared in the audio conditions. Analysis of eye tracking metrics indicated large inter-subject variability; difference between subjects was greater than difference between feedback conditions. No significant effect of condition could be observed, but clusters of similar behaviors were identified. Some of the participants’ gaze behaviors appeared to have been affected by the presence of auditory feedback, but the effect of sound model was not consistent across subjects. We discuss individual behaviors and illustrate gaze behavior through sonification of gaze trajectories. Findings from this study raise intriguing questions that motivate future large-scale studies on the effect of auditory feedback on gaze behavior.
Keywords
auditory feedback, eye tracking, gaze behaviour, haptic feedback, multimodal feedback, sonification
Paper topics
Multi-modal perception and emotion, Sonification, Virtual reality applications and technologies for sound and music
Easychair keyphrases
total fixation duration [30], gaze behavior [28], total fixation [26], auditory feedback [25], ball aoi [23], total fixation count [22], fixation duration [19], goal aoi [16], eye tracking [14], sound model [13], median value [12], throwing gesture [12], eye tracking metric [11], lowest median value [11], total fixation duration index [10], error rate [9], fixation count [9], averaged left [7], gaze trajectory [7], highest median value [7], longer fixation [7], movement sonification [7], visual attention [7], eye movement [6], eye tracking data [6], feedback condition [6], inter quartile range [6], right eye gaze position [6], total fixation count index [6], virtual ball [6]
Paper type
unknown
DOI: 10.5281/zenodo.1401927
Zenodo URL: https://zenodo.org/record/1401927
Abstract
This paper presents a live-performance system that enables a user to arrange the vocal part of music audio signals and to add a harmony or troll part by modifying the original vocal part. The pitches and rhythms of the original singing voices can be manipulated in real time while preserving the lyrics. This leads to an interesting experience in which the user can feel as if the original singer were being directed to sing the song as the user likes. More specifically, a user is asked to play a MIDI keyboard according to the user's favorite melody. Since the vocal part is divided into short segments corresponding to musical notes, those segments are played back one by one from the beginning such that the pitch and duration of each segment match the key pressed by the user. The functions needed for this system are singing voice separation, vocal-part segmentation, key-to-segment association, and real-time pitch modification. We propose three kinds of key-to-segment association to restrict the degrees of freedom in the manipulation of pitches and durations and to easily generate a harmony or troll part. Subjective experiments showed the potential of the proposed system.
Keywords
a MIDI keyboard, arrangement, harmony, interface, live-performance, real time, singing voice
Paper topics
Analysis, synthesis, and modification of sound; Interactive performance systems and new interfaces
Easychair keyphrases
vocal part [33], singing voice [28], pitch modification [18], easy arrangement mode [15], free arrangement mode [15], real time [15], time fixed mode [15], midi keyboard [13], arrangement mode [12], musical note [11], target note [10], existing song [9], user interface [8], onset time [7], singing voice separation [7], time stretching [7], musical note estimation [6], signal processing method [6], harmonic part [5], key operation [5], sound quality [5], accompaniment source separation [4], active music listening [4], assignment mode [4], drum part [4], pitch information [4], robust principle component analysis [4], rwc music database [4], troll part [4], vocal arrangement system [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401987
Zenodo URL: https://zenodo.org/record/1401987
Abstract
SysSon is a sonification platform originally developed in collaboration with climatologists. It contains a domain-specific language for the design of sonification templates, providing abstractions for matrix input data and accessing it in real-time sound synthesis. A shortcoming of the previous version had been the limited breadth of transformations applicable to this matrix data in real-time. We observed that the development of sonification objects often requires pre-processing stages outside the real-time domain, limiting the possibilities of fully integrating models directly into the platform. We designed a new layer for the sonification editor that provides another, semantically similar domain-specific language for offline rendering. Offline and real-time processing are unified through common interfaces and through a mechanism by which the latter can make use of the outputs of the former stage. Auxiliary data calculated in the offline stage is captured by a persisting caching mechanism, avoiding waiting time when running a sonification repeatedly.
Keywords
Offline, Sonification, Stream Processing
Paper topics
Analysis, synthesis, and modification of sound; Computer music languages and software; Sonification
Easychair keyphrases
real time [33], pre processing stage [11], sonification model [11], ugen graph [11], blob detection [10], offline program [8], attribute map [7], real time program [7], val val val [7], offline processing [6], real time sound synthesis [6], temperature anomaly [6], blob data [5], multi rate [5], audio sampling rate [4], blob matrix [4], development process [4], domain specific language [4], floating point number [4], general ide [4], pre processing [4], quasi biennial oscillation [4], real time program source [4], real time ugen graph [4], signal processing [4], sonification editor [4], sound model [4], structural data [4], sysson ide [4], user interface [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401879
Zenodo URL: https://zenodo.org/record/1401879
Abstract
We propose a framework for audio-to-score alignment on piano performance that employs automatic music transcription (AMT) using neural networks. Even though the AMT result may contain some errors, the note prediction output can be regarded as a learned feature representation that is directly comparable to MIDI note or chroma representation. To this end, we employ two recurrent neural networks that work as the AMT-based feature extractors to the alignment algorithm. One predicts the presence of 88 notes or 12 chroma at the frame level, and the other detects note onsets in 12 chroma. We combine the two types of learned features for the audio-to-score alignment. For comparability, we apply dynamic time warping as an alignment algorithm without any additional post-processing. We evaluate the proposed framework on the MAPS dataset and compare it to previous work. The result shows that the alignment framework with the learned features significantly improves the accuracy, achieving less than 10 ms in mean onset error.
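As an illustrative sketch of the alignment step only (the AMT networks are not shown), the code below combines frame-level chroma and chroma-onset distances into a single cost matrix and runs plain DTW. It assumes librosa's `librosa.sequence.dtw`, which accepts a precomputed cost matrix via `C=`; the mixing weight `w` is an invented parameter, not a value from the paper.

```python
import numpy as np
import librosa

def align(audio_chroma, audio_onset, score_chroma, score_onset, w=0.5):
    """Combine frame-level chroma and chroma-onset cosine distances into one
    cost matrix and align audio to score with plain DTW (no post-processing).
    Inputs are (n_frames, 12) feature matrices; 'w' is an illustrative weight."""
    def cos_dist(a, b):
        a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-9)
        b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-9)
        return 1.0 - a @ b.T                      # shape (n_audio, n_score)
    cost = w * cos_dist(audio_chroma, score_chroma) \
         + (1 - w) * cos_dist(audio_onset, score_onset)
    _, warping_path = librosa.sequence.dtw(C=cost)
    return warping_path                           # (audio_frame, score_frame) pairs
```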
Keywords
automatic music transcription, music feature extraction, recurrent neural network, score alignment
Paper topics
Analysis, synthesis, and modification of sound; Computer-based music analysis; Music information retrieval
Easychair keyphrases
score alignment [16], amt system [13], chroma onset feature [12], chroma onset [10], dynamic time warping [9], onset feature [8], carabias ortis algorithm [7], ewert algorithm [7], piano music [7], alignment algorithm [6], automatic music transcription [6], mean onset error [6], neural network [6], recurrent neural network [6], fastdtw algorithm [5], midi note [5], onset error [5], truncated backpropagation [5], ground truth [4], informa tion retrieval [4], map dataset [4], music informa tion [4], note onset [4], piecewise onset error [4], score midi [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401969
Zenodo URL: https://zenodo.org/record/1401969
Abstract
With the proliferation of video content of musical performances, audio-visual analysis becomes an emerging topic in music information retrieval. Associating the audio and visual aspects of the same source, or audio-visual source association, is a fundamental problem for audio-visual analysis of polyphonic musical performances. In this paper, we propose an approach to solve this problem for string ensemble performances by analyzing the vibrato patterns. On the audio side, we extract the pitch trajectories of vibrato notes of each string player in a score-informed fashion. On the video side, we track the left hand of string players and capture their fine-grained left-hand vibration due to vibrato. We find a high correlation between the pitch fluctuation and the hand vibration for vibrato notes, and use this correlation to associate the audio and visual aspects of the same players. This work is a complementary extension to our previous work on source association for string ensembles based on bowing motion analysis. Experiments on 19 pieces of chamber musical performances with at most one non-string instrument show more accurate and robust association performance than our previous method.
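A minimal sketch of the association idea described above: compute the correlation between each audio pitch-fluctuation trajectory and each video hand-vibration trajectory, then assign players greedily by maximal correlation. The greedy assignment and the plain Pearson correlation are simplifications for illustration, not the paper's method.

```python
import numpy as np

def associate(pitch_fluct, hand_vib):
    """pitch_fluct: list of per-player pitch-fluctuation signals (audio side);
    hand_vib: list of per-player hand-vibration signals (video side).
    Returns an audio->video index assignment by greedy max |correlation|."""
    n = len(pitch_fluct)
    corr = np.zeros((n, n))
    for i, p in enumerate(pitch_fluct):
        for j, h in enumerate(hand_vib):
            m = min(len(p), len(h))
            corr[i, j] = abs(np.corrcoef(p[:m], h[:m])[0, 1])
    assignment, used = {}, set()
    for i in np.argsort(-corr.max(axis=1)):       # most confident audio track first
        j = max((j for j in range(n) if j not in used), key=lambda j: corr[i, j])
        assignment[int(i)] = int(j)
        used.add(j)
    return assignment
```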
Keywords
Multi-modal Music Analysis, Source Association, Vibrato Analysis
Paper topics
Interactive performance systems and new interfaces, Music information retrieval, Music performance, Music performance analysis and rendering
Easychair keyphrases
pitch trajectory [22], source association [18], vibrato note [18], audio visual source association [16], audio visual [15], hand displacement curve [11], note level matching [11], sound source [11], source separation [11], signal process [10], score informed fashion [9], visual aspect [9], displacement curve [8], musical performance [8], score informed source separation [8], string ensemble [8], bounding box [7], music information retrieval [7], non string instrument [7], pitch fluctuation [7], audio visual analysis [6], detected vibrato note [6], level matching accuracy [6], matching score [6], motion velocity [6], note onset [6], correct association [5], fingering hand [5], hand motion [5], optical flow [5]
Paper type
unknown
DOI: 10.5281/zenodo.1401863
Zenodo URL: https://zenodo.org/record/1401863
Abstract
The paper presents an experiment in which subjects had to localize headphone-reproduced binaural piano tones while in front of a grand Disklavier instrument. Three experimental conditions were designed: when the fallboard was closed the localization was auditory only; when the fallboard was open the localization was auditory and visual, since the Disklavier's actuated key could be seen moving down while the corresponding note was produced; when the listener actively played the note the localization was auditory, visual and somatosensory. In all conditions the tones were reproduced using binaural recordings previously acquired on the same instrument. Such tones were presented either transparently or by reversing the channels. Thirteen subjects participated in the experiment. Results suggest that if auditory localization associates the tone with the corresponding key, then the visual and somatosensory feedback refine the localization. Conversely, if auditory localization is confused then the visual and somatosensory channels cannot improve it. Further experimentation is needed to explain these results in relation to i) possible activation of the auditory precedence effect at least for some notes, and ii) potential locking of the sound source position that visual and/or somatosensory cues might cause when subjects observe a key moving down, or depress it.
Keywords
Binaural sound, Multimodal perception, Piano tones
Paper topics
Multi-modal perception and emotion, Music performance, Spatial sound
Easychair keyphrases
piano tone [12], passive closed condition [11], somatosensory feedback [10], passive open condition [9], routing setting [9], active condition [8], auditory localization [8], key position [8], perceived position [8], listening condition [6], reverse routing setting [6], significantly different position [6], standard routing setting [6], active listening [5], binaural recording [5], binaural sample [5], note position [5], precedence effect [5], visual feedback [5], acoustic piano [4], auditory precedence effect [4], digital piano [4], musical instrument [4], passive open [4], perceived note position [4], reverse setting [4], somatosensory cue [4], sound source [4], sound source position [4], velocity value [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401931
Zenodo URL: https://zenodo.org/record/1401931
Abstract
Signal processing and computing procedures are presented to imprint, into any given audio signal (preferably obtained from a dry anechoic recording), the temporal and spectral characteristics of radiation of simple sources in free field and in reverberant rectangular rooms, as could virtually be obtained in a space of any spatial dimension $D \ge 1$. These techniques are based on mathematically exact solutions of the conventional acoustic field equations, usually expressed and solved for space of three dimensions or less, but generalized formally to describe the analogue of sound waves in space of higher dimensions. Processing as presented here is monaural, but the mathematical tools exist to imprint a hyper-angular dependence in higher dimensions, and to map that into conventional binaural presentations. Working computer implementations and sound samples are presented. The foreseen applications include: sonic art, virtual reality, extended reality, videogames, computer games, and others.
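For reference, the sketch below is the familiar 3-D rectangular-room image-source method (the abstract's contribution is its generalization to arbitrary dimension D, which is not reproduced here). Room geometry, reflection coefficient and reflection order are illustrative values.

```python
import numpy as np

def shoebox_ir(src, rcv, room, beta=0.8, order=3, fs=44100, c=343.0):
    """Crude 3-D image-source impulse response for a rectangular room.
    src, rcv, room: length-3 arrays in metres; beta: wall reflection coefficient."""
    length = int(fs * 2 * np.linalg.norm(room) * (order + 1) / c) + 1
    ir = np.zeros(length)
    rng = range(-order, order + 1)
    for nx in rng:
        for ny in rng:
            for nz in rng:
                img, refl = np.empty(3), 0
                for d, n in enumerate((nx, ny, nz)):
                    # mirror the source along dimension d across the room walls
                    img[d] = n * room[d] + (src[d] if n % 2 == 0 else room[d] - src[d])
                    refl += abs(n)
                dist = np.linalg.norm(img - rcv)
                sample = int(round(fs * dist / c))
                if sample < length:
                    ir[sample] += (beta ** refl) / max(dist, 1e-3)   # 1/r spreading
    return ir

ir = shoebox_ir(np.array([2.0, 3.0, 1.5]), np.array([4.0, 1.0, 1.2]),
                np.array([6.0, 5.0, 3.0]))
```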
Keywords
image source method, simulation of room acoustics, sound in space of arbitrary D dimensions
Paper topics
Analysis, synthesis, and modification of sound; Room acoustics modeling and auralization; Virtual reality applications and technologies for sound and music
Easychair keyphrases
image source [16], impulse response [10], higher dimension [7], source transfer function [6], image method [5], sound pressure [5], space dimension [5], frequency weighting [4], monopole source [4], reflection coefficient [4], transfer function [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401929
Zenodo URL: https://zenodo.org/record/1401929
Abstract
Arranging music composed for multiple instruments for solo piano is in demand, because many pianists want to practice playing their favorite songs. In general, piano arrangement is carried out by reducing the original notes so that they fit on a two-line staff. However, existing approaches have not reached a fundamental solution that improves not only the originality and playability but also the quality of the score. In this paper, we propose a new approach to arranging a musical score for the piano using four musical components - melody, chords, rhythm, and the number of notes - that can be extracted from the original score. Our method takes an original score as input and generates both the right- and left-hand parts of the piano score. For the right-hand part, we add optional notes from a chord to the melody. For the left-hand part, we select appropriate accompaniments from a database constructed from pop piano scores. The accompaniments are selected with the impression of the original score in mind. We generate high-quality solo piano scores that reflect the original characteristics and partly take playability into account.
Keywords
accompaniment selection, music arrangement, piano reduction
Paper topics
Automatic composition, accompaniment, and improvisation
Easychair keyphrases
piano arranged score [42], piano arrangement [23], accompaniment database [14], hand part [14], accompaniment matrix [12], high quality piano arrangement [12], root note [12], piano score [11], right hand part [11], musical component [9], special interest group [9], musical piece [8], musical score [8], pop musical piano score [8], rhythm part [7], maximum musical interval [6], note value [6], pianoarranged score [6], subjective evaluation [6], high quality [5], selected accompaniment [5], solo piano [5], sounding start [5], additional note [4], extant study [4], left part [4], piano arrangement process [4], piano arrangement system [4], quarter note [4], sugar song [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401875
Zenodo URL: https://zenodo.org/record/1401875
Abstract
Performance databases that can be referred to as numerical values play important roles in the research of music interpretation, the analysis of expressive performances, automatic transcription, and performance rendering technology. The authors have promoted the creation and public release of the CrestMuse PEDB (Performance Expression DataBase), a performance expression database of more than two hundred virtuoso piano performances of classical music from the Baroque period through the early twentieth century, including music by Bach, Mozart, Beethoven and Chopin. The CrestMuse PEDB has been used by more than fifty research institutions around the world. It has especially contributed to research on performance rendering systems as training data. Responding to the demand to expand the database, in 2016 we started a new three-year project to enhance the CrestMuse PEDB with a second edition. In the second edition, phrase information that pianists had in mind while playing is included, in addition to the performance data that can be referred to as numerical values. This paper gives an overview of the ongoing project.
Keywords
Music database, Music performance analysis, Phrase structure and interpretation
Paper topics
Computer-based music analysis, Music games and educational tools, Music performance analysis and rendering
Easychair keyphrases
self self [20], self self self [17], performance data [12], beta version [9], performance expression [6], performance expression database [6], phrase structure [6], piano sonata [6], musical structure [5], music performance [5], score note [5], structure data [5], acoustic signal [4], alignment information [4], apex note [4], audio signal [4], early twentieth century [4], extra note [4], matching file format [4], musical structure data [4], music information retrieval [4], music performance database [4], performance deviation data [4], performance expression data [4], performance rendering system [4], performed note [4], recorded performance [4], waltz self [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401963
Zenodo URL: https://zenodo.org/record/1401963
Abstract
Mobile and sensor-based technologies have created new interaction design possibilities for technology-mediated audience participation in live music performance. However, there is little if any work in the literature that systematically identifies and characterises design issues emerging from this novel class of multi-dimensional interactive performance systems. As an early contribution towards addressing this gap in knowledge, we present the analysis of a detailed survey of technology-mediated audience participation in live music, from the perspective of two key stakeholder groups - musicians and audiences. Results from the survey of over two hundred spectators and musicians are presented, along with descriptive analysis and discussion. These results are used to identify emerging design issues, such as expressiveness, communication and appropriateness. Implications for interaction design are considered. While this study focuses on musicians and audiences, lessons are noted for diverse stakeholders, including composers, performers, interaction designers, media artists and engineers.
Keywords
design implications, interaction design, interactive performance systems, live music, participatory performance, quantitative methods, survey, technology-mediated audience participation (TMAP)
Paper topics
Interactive performance systems and new interfaces, Music performance
Easychair keyphrases
live concert [22], most spectator [20], technology mediated audience participation [18], audience participation [16], live music [16], most musician [9], research question [9], mobile phone [8], mobile technology [7], survey participant [5], survey result [5], computer music [4], design implication [4], general design implication [4], interaction design [4], interactive performance system [4], interquartile range [4], live music performance [4], musical expression [4], music related information [4], special experience [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401873
Zenodo URL: https://zenodo.org/record/1401873
Abstract
While many studies have shown that auditory and visual information influence each other, the links underlying some intermodal associations are less clear. We here replicate and extend an earlier experiment with ratings of pictures of people singing high- and low-pitched tones. To this aim, we video recorded 19 participants singing high and low pitches and combined these into picture pairs. In a two-alternative forced choice test, two groups of six assessors were then asked to view the 19 picture pairs and select the "friendlier" and "angrier" expression, respectively. The result is that assessors chose the high-pitch picture when asked to select the "friendlier" expression, while asking for the "angrier" expression resulted in the low-pitch picture being chosen. A non-significant positive correlation was found between each participant's sung pitch range and the number of times their high-pitch (resp. low-pitch) picture was chosen.
Keywords
Emotion, Facial Expression, Inter-modal Perception, Pitch
Paper topics
Multi-modal perception and emotion, Social interaction in sound and music computing
Easychair keyphrases
low pitch [25], high pitch [21], pitch range [21], facial expression [17], midi note [14], low pitch picture [12], pitch picture [8], high pitch picture [7], inter assessor agreement [7], intermodal association [7], low note [6], low pitch face [6], vocal pitch height [6], angrier expression [5], non threatening [5], sung pitch [5], vocal pitch [5], aalborg university [4], actual produced pitch [4], assessor agreement [4], chosen high pitch picture [4], low pitch tone [4], produced pitch range [4], rating experiment [4], sung interval size [4], threatening display [4], vocal pitch range [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401909
Zenodo URL: https://zenodo.org/record/1401909
Abstract
The paper discusses a generative approach to the design of experimental electronic circuits for musical application. The model takes into account rewriting rules inspired by L-systems, constrained by domain-specific features depending on electronic components, and generates families of circuits. An integrated production pipeline is introduced that ranges from algorithmic design through simulation to hardware printing.
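To make the L-system idea concrete, here is a toy string-rewriting example over circuit-block symbols; the symbols and rules are invented for illustration and are not the paper's grammar, which is additionally constrained by electronic-component properties.

```python
# Hypothetical L-system over circuit-block symbols: 'I' input, 'F' filter stage,
# 'A' op-amp gain stage, 'B' buffer, 'O' output.
RULES = {
    "F": "FAB",      # a filter stage grows a gain stage and a buffer
    "A": "AF",       # a gain stage spawns another filter stage
}

def rewrite(axiom, rules, generations):
    s = axiom
    for _ in range(generations):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

for g in range(3):
    print(g, rewrite("IFO", RULES, g))
# 0 IFO
# 1 IFABO
# 2 IFABAFBO
```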
Keywords
Digital/Analog audio modeling, Experimental sound processing, Generative electronics, Physical computing
Paper topics
Analysis, synthesis, and modification of sound
Easychair keyphrases
amp op amp [17], buffer buffer buffer [14], buffer buffer [11], electronic circuit [8], audio circuit [6], circuit design [6], dc buffer buffer [6], root circuit [6], electronic component [5], generative model [5], algorithmic design [4], analog audio [4], audio circuit design [4], data structure [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401961
Zenodo URL: https://zenodo.org/record/1401961
Abstract
Many people are exposed to large sound pressure levels either occasionally or regularly, and thus need to protect their hearing in order to prevent hearing loss and other hearing disorders. Earplugs are effective at attenuating sound from the environment, but they do not attenuate bone-conducted sound; instead, they amplify it at low frequencies due to the occlusion effect. This is a problem, e.g., for many musicians and especially wind instrument players, since this low-frequency amplification greatly affects the sound of their own instruments. This makes it difficult for the musicians to play while using hearing protectors, and therefore many musicians choose not to use hearing protectors at all. In this paper, we propose electronic hearing protectors that mitigate the problems associated with musicians' hearing protection through several different approaches: reduction of the occlusion effect, adjustable attenuation with natural timbre, and monitoring of the musician's own instrument. We present the design of prototype electronic hearing protectors and their evaluation by professional musicians, in which they were shown to alleviate the problems associated with conventional hearing protectors.
Keywords
earplugs, hearing protection, musician, occlusion effect
Paper topics
Analysis, synthesis, and modification of sound; Music performance
Easychair keyphrases
ear canal [33], hearing protector [33], occlusion effect [28], hearing protection [18], occlusion reduction [16], bone conducted sound [15], occlusion reduction circuit [15], low frequency [14], electronic hearing protection solution [12], electronic hearing protector [12], instrument microphone [10], air conducted sound [9], conducted sound [8], electronic hearing [8], hearing aid [7], instrument microphone signal [7], natural timbre [7], adjustable attenuation [6], ear canal simulator [6], feedback loop [6], hearing protection solution [6], magnitude response [6], musician instrument [5], sound pressure [5], dashed red line [4], high pass filter [4], low frequency boost [4], musician hearing protection [4], problem used hearing protector [4], solid blue line [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401949
Zenodo URL: https://zenodo.org/record/1401949
Abstract
Today, a large variety of technical configurations are used in live performance contexts. In most of them, computers and other devices usually act as powerful yet subordinated agencies, typically piloted by performers: with few notable exceptions, large-scale gestures and structural developments are left either to the performer’s actions or to well-planned automations and/or composing algorithms. At the same time, the performance environment is either ignored or ‘tuned out’, ideally kept neutral with regard to the actual sound events and the overall performance process. This paper describes a different approach. The authors investigate the complex dynamics arising in live performance when multiple autonomous sound systems are coupled through the acoustic environment. In order to allow for more autonomous and adaptive -- or better: ecosystemic -- behaviour on the part of the machines, the authors suggest that the notion of interaction should be replaced with that of a permanent and continuing structural coupling between machine(s), performer(s) and environment(s). More particularly, the paper deals with a specific configuration of two (or more) separate computer-based audio systems co-evolving in their autonomic processes based on permanent mutual exchanges through and with the local environment, i.e.: in the medium of sound only. An attempt is made at defining a self-regulating, situated, and hybrid dynamical system having its own agency and manifesting its potential behaviour in the performance process. Human agents (performers) can eventually intrude and explore affordances and constraints specific to the performance ecosystem, possibly biasing or altering its emergent behaviours. In so doing, what human agents actually achieve is to specify their role and function in the context of a larger, distributed kind of agency created by the whole set of highly interdependent components active in sound. That may suggest new solutions in the area of improvised or structured performance and in the sound arts in general.
Keywords
autonomous sound generating systems, complexity, distributed agency, environmental agency, feedback systems, live electronics performance, structural coupling
Paper topics
Analysis, synthesis, and modification of sound; Automatic composition, accompaniment, and improvisation; Computer-based music analysis; Interactive performance systems and new interfaces; Music information retrieval; Music performance analysis and rendering
Easychair keyphrases
control signal [10], machine milieu project [9], performance ecosystem [9], live performance [7], signal processing [7], feedback delay network [6], human agent [6], complex system [5], audio signal processing [4], audio system [4], autonomous sound generating system [4], control signal processing [4], dynamical behaviour [4], dynamical system [4], electroacoustic music study network [4], live electronic [4], machine milieu [4], performer action [4], real time [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401871
Zenodo URL: https://zenodo.org/record/1401871
Abstract
We present the evaluation of a sonification approach for the acoustic analysis of tremor diseases. The previously developed interactive tool offers two methods for sonification of measured 3-axes acceleration data of patients' hands. Both sonifications involve a bank of oscillators whose amplitudes and frequencies are controlled by either frequency analysis similar to a vocoder or Empirical Mode Decomposition (EMD) analysis. In order to enhance the distinct rhythmic qualities of tremor signals, additional amplitude modulation based on measures of instantaneous energy is applied. The sonifications were evaluated in two experiments based on pre-recorded data of patients suffering from different tremor diseases. In Experiment 1, we tested the ability to identify a patient's disease by using the interactive sonification tool. In Experiment 2, we examined the perceptual difference between acoustic representations of different tremor diseases. Results indicate that both sonifications provide relevant information on tremor data and may complement already available diagnostic tools.
Keywords
diagnosis, emd, empirical mode decomposition, evaluation, sonification, tremor
Paper topics
Sonification
Easychair keyphrases
test participant [28], tremor type [20], ess psy dy [12], vocoder sonification [12], pilot study [11], emd sonification [10], dystonic tremor [9], percent correct response [9], tremor disease [9], interactive sonification [7], interactive sonification interface [7], empirical mode decomposition analysis [6], graphical user interface [6], hilbert huang transform [6], percent correct [6], pre recorded movement data [6], psychogenic tremor [6], reference diagnosis [6], right arm sensor [6], tremor analysis [6], correct diagnosis [5], identification task [5], response time [5], sonification method [5], sonification parameter [5], sound characteristic [5], available diagnostic tool [4], empirical mode decomposition [4], expert listening panel [4], pooled result [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401925
Zenodo URL: https://zenodo.org/record/1401925
Abstract
Certain properties of isomorphic layouts are proposed to offer benefits for learning and performance on a new musical instrument. However, there is little empirical investigation of the effects of these properties. This paper details an experiment that examines the effect of pitch adjacency and shear on the performances of simple melodies by 24 musically-trained participants after a short training period. In the adjacent layouts, pitches a major second apart are adjacent. In the unsheared layouts, major seconds are horizontally aligned but the pitch axis is slanted; in the sheared layouts, the pitch axis is vertical but major seconds are slanted. Qualitative user evaluations of each layout are collected post-experiment. Preliminary results are outlined in this paper, focusing on the themes of learnability and playability. Users show strong preferences towards layouts with adjacent major seconds, focusing on the potential for learning new pitch patterns. Users confirm advantages of both unsheared and sheared layouts, one in terms of similarity to traditional instrument settings, and the other in terms of ergonomic benefits. A model of participants' performance accuracy shows that sheared layouts are learned significantly faster. Results from this study will inform new music instrument/interface design in terms of features that increase user accessibility.
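The defining property of an isomorphic layout is that pitch is a linear function of button coordinates, and a shear is a linear transform of those coordinates; the short sketch below illustrates both with invented interval steps and shear factor (the specific layouts tested in the paper are not reproduced here).

```python
import numpy as np

def button_pitch(col, row, base_midi=48, right_step=2, up_step=7):
    """Isomorphic layout: pitch is a linear function of button coordinates.
    Here a step right adds a major second (2 semitones) and a step up adds a
    perfect fifth (7 semitones); the interval values are illustrative."""
    return base_midi + right_step * col + up_step * row

def shear_positions(cols, rows, shear=-0.3):
    """Shear the physical x-coordinates of the buttons while leaving their
    pitches unchanged; 'shear' is an illustrative factor."""
    x = np.asarray(cols, dtype=float) + shear * np.asarray(rows, dtype=float)
    y = np.asarray(rows, dtype=float)
    return x, y

print(button_pitch(2, 1))   # two M2 steps + one P5 above MIDI 48 -> MIDI 59
```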
Keywords
learning, new musical interfaces, performance, pitch layouts
Paper topics
Interactive performance systems and new interfaces, Music performance
Easychair keyphrases
non adjacent [13], pitch axis [13], isomorphic layout [12], non adjacent layout [12], adjacent layout [10], major second [10], adjacent m2 layout [9], octave axis [8], pitch layout [8], adjacent major second [7], unsheared layout [7], linear mixed effect model [6], pitch pattern [6], user evaluation [6], correct note [5], musical instrument [5], sheared layout [5], shear perfno [5], correct note score [4], first performance [4], first presented layout [4], known nursery rhyme fr [4], motor skill [4], negative statement [4], non adjacent m2 layout [4], octave axis pitch axis [4], participant selecting layout [4], positive impact [4], positive statement [4], training paradigm [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401989
Zenodo URL: https://zenodo.org/record/1401989
Abstract
Onset detection methods generally work on a two-stage basis: a first step which processes an audio stream and computes a time series depicting the estimated onset positions as local maxima in the function; and a second one which evaluates this time series to select some of the local maxima as onsets, typically with the use of peak picking and thresholding methodologies. Nevertheless, while the former stage has received considerable attention from the community, the latter typically ends up being one of a reduced catalogue of procedures. In this work we focus on this second stage and explore previously unconsidered methods based on descriptive statistics to obtain the threshold function. More precisely, we consider the use of the percentile descriptor as a design parameter and compare it to classic strategies such as the median value. Additionally, a thorough comparison is carried out between methodologies that consider the temporal evolution of the time series (adaptive techniques) and the use of static threshold values (non-adaptive). The results obtained support several interesting conclusions, the two most remarkable being that the percentile descriptor can be considered a competitive alternative for this task, and that adaptive approaches do not always imply an improvement over static methodologies.
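A minimal sketch of the second stage described above: an adaptive, sliding-window percentile threshold over an onset detection function (ODF), plus a static variant for comparison. Window size, percentile and offset are illustrative parameters, not the paper's settings.

```python
import numpy as np

def pick_onsets_adaptive(odf, percentile=75, window=32, offset=0.05):
    """A frame is an onset if it is a local maximum of the ODF and exceeds the
    given percentile of the ODF inside a centred sliding window."""
    onsets, half = [], window // 2
    for n in range(1, len(odf) - 1):
        lo, hi = max(0, n - half), min(len(odf), n + half + 1)
        threshold = np.percentile(odf[lo:hi], percentile) + offset
        if odf[n] > threshold and odf[n] >= odf[n - 1] and odf[n] > odf[n + 1]:
            onsets.append(n)
    return onsets

def pick_onsets_static(odf, percentile=75):
    """Non-adaptive variant: one global percentile threshold for the whole ODF."""
    t = np.percentile(odf, percentile)
    return [n for n in range(1, len(odf) - 1)
            if odf[n] > t and odf[n] >= odf[n - 1] and odf[n] > odf[n + 1]]
```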
Keywords
Descriptive statistics, Music information retrieval, Onset detection, Signal processing
Paper topics
Music information retrieval
Easychair keyphrases
onset detection [23], window size [20], percentile value [14], music information retrieval [11], overall performance [10], sliding window [10], adaptive methodology [9], static threshold value [7], median value [6], odf process [6], signal processing [6], time series [6], odf method [5], osf method [5], adaptive threshold [4], best figure [4], digital audio effect [4], friedman test [4], local maxima [4], onset selection function [4], percentile parameter [4], static methodology [4], th international society [4], threshold value [4], wilcoxon rank sum [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401899
Zenodo URL: https://zenodo.org/record/1401899
Abstract
This paper describes the use of a MIDI-enabled pipe organ console at Ryerson United Church and various explorations performed in expanding the music service as well as possibilities for exploration in alternative control of the instrument via various gestural interfaces. The latter provides new possibilities for expression and extended performance practice including dance. Future work on both the artistic use of the system as well as technical development of the interfacing system is then proposed.
Keywords
mapping, new interfaces, pipe organ
Paper topics
Automatic composition, accompaniment, and improvisation; Interactive performance systems and new interfaces; Music performance; Sound installation
Easychair keyphrases
ryerson united church [11], midi enabled pipe organ [10], pipe organ [7], enabled pipe organ console [6], alternative control [5], kinect controlled pipe organ [4], live performance [4], music service [4], open sound control [4], organ console [4], solid state organ system [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401983
Zenodo URL: https://zenodo.org/record/1401983
Abstract
Pitch and spatial height are often associated when describing music. In this paper we present results from a sound-tracing study in which we investigate such sound-motion relationships. The subjects were asked to move as if they were creating the melodies they heard, and their motion was captured with an infrared, marker-based camera system. The analysis is focused on calculating feature vectors typically used for melodic analysis, and investigating the relationships of melodic contour typologies with motion contour typologies. This is based on using proposed feature sets for melodic contour similarity measurement. We test these features by applying them to both the melodies and the motion contours to establish whether there is a correspondence between the two, and find the features that match the most. We find there to be a relationship between vertical motion and pitch contour when evaluated through features rather than simply comparing contours.
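As a much simpler stand-in for the feature sets discussed above, the sketch below computes one crude contour descriptor (the sign of change between resampled bins) identically for a pitch sequence and for a vertical-position sequence, and compares them by cosine similarity; the actual melodic-contour features used in the paper are richer.

```python
import numpy as np

def contour_signs(series, n_bins=16):
    """Resample a 1-D series (pitch in semitones, or vertical hand position)
    to n_bins and return the sign of the change between consecutive bins."""
    series = np.asarray(series, dtype=float)
    resampled = np.interp(np.linspace(0, len(series) - 1, n_bins),
                          np.arange(len(series)), series)
    return np.sign(np.diff(resampled))

def contour_similarity(pitch, motion_height):
    a, b = contour_signs(pitch), contour_signs(motion_height)
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

print(contour_similarity([60, 62, 64, 65, 64, 62], [0.1, 0.2, 0.3, 0.35, 0.3, 0.2]))
```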
Keywords
melodic contour, motion capture, music and motion
Paper topics
Computational musicology and mathematical music theory, Computer-based music analysis, Music performance analysis and rendering
Easychair keyphrases
melodic contour [32], feature vector [12], vertical motion [11], pitch contour [10], melodic contour typology [7], melodic similarity [7], sound tracing [7], mean segmentation bin [6], melodic contour similarity [6], signed relative distance [6], motion capture [5], motion contour [5], melodic feature [4], melodic fragment [4], motion capture recording [4], note mean motion [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401893
Zenodo URL: https://zenodo.org/record/1401893
Abstract
Touch-screen musical performance has become commonplace since the widespread adoption of mobile devices such as smartphones and tablets. However, mobile digital musical instruments are rarely designed to emphasise collaborative musical creation, particularly when it occurs between performers who are separated in space and time. In this article, we introduce an app that enables users to perform together asynchronously. The app takes inspiration from popular social media applications, such as a timeline of contributions from other users, deliberately constrained creative contributions, and the concept of a reply, to emphasise frequent and casual musical performance. Users' touch-screen performances are automatically uploaded for others to play back and add reply performances which are layered as musical parts. We describe the motivations, design, and early experiences with this app and discuss how musical performance and collaboration could form a part of social media interactions.
Keywords
asynchronous collaboration, distributed collaboration, group creativity, performance, smartphone, social media, touch-screen
Paper topics
Interactive performance systems and new interfaces, Music performance, Social interaction in sound and music computing, Sonic interaction design
Easychair keyphrases
touch screen [13], social medium [12], musical performance [11], music making [11], mobile device [9], tiny touch screen performance [8], mobile music [7], tiny performance [7], touch interaction [7], musical expression [6], early experience [5], sound scheme [5], mobile digital musical instrument [4], mobile music making [4], mobile phone orchestra [4], music therapy [4], screen musical performance [4], social medium app [4], social mobile music [4], touch area [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401907
Zenodo URL: https://zenodo.org/record/1401907
Abstract
A musical texture, be it that of an ensemble or of a solo instrumentalist, may be perceived as combinations of both simultaneous and sequential sound events; however, we believe that sensations of the corresponding sound-producing events (e.g. hitting, stroking, bowing, blowing) also contribute to our perception of musical textures. Musical textures could thus be understood as multimodal, with features of both sound and motion, hence the idea here of sound-motion textures in music. The study of such multimodal sound-motion textures will necessitate collecting and analyzing data of both the produced sound and of the sound-producing body motion, thus entailing a number of methodological challenges. In our current work on sound-motion textures in music, we focus on short and idiomatic figures for different instruments (e.g. ornaments on various instruments), and in this paper, we present some ideas, challenges, and findings on typical sound-motion textures in drum set performance. Drum set performance is particularly interesting because its at times complex textures are produced by one single performer, entailing a number of issues of human motion and motor control.
Keywords
constraints, drum set, motion capture, motion hierarchies, sound-motion, textures
Paper topics
Multi-modal perception and emotion, Music performance, Music performance analysis and rendering
Easychair keyphrases
sound motion [33], sound motion texture [33], drum set [17], sound motion object [15], drum set performance [14], sound producing body motion [14], motor control [12], body motion [11], human motor control [7], phase transition [7], drum set sound motion [6], bass drum [5], impact point [5], motion trajectory [5], drum set music [4], high frame rate [4], human body [4], infrared motion capture system [4], motion capture [4], motion event [4], motor control constraint [4], musical experience [4], passive marker [4], set sound motion texture [4], vertical displacement [4], very high frame [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401859
Zenodo URL: https://zenodo.org/record/1401859
Abstract
This paper describes the design and implementation of a real-time adaptive digital audio effect with an emphasis on using expressive audio features to control effect parameters. Research in A-DAFx is covered along with studies about expressivity and important perceptual sound descriptors for communicating emotions. This project aimed to exploit sounds as expressive indicators to create novel sound transformations. A test was conducted to see if guitar players could differentiate between an adaptive and a non-adaptive version of a digital audio effect. The participants could hear a difference, especially when performing expressively. However, the adaptive effect did not seem to enhance their expressive capabilities, and preference between the two versions varied evenly between participants.
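As a toy instance of the adaptive-effect idea (an audio feature driving an effect parameter), the sketch below maps per-block RMS to tremolo depth; the feature, effect and mapping are illustrative only and are not the mappings studied in the paper.

```python
import numpy as np

def adaptive_tremolo(x, fs=44100, block=1024, rate_hz=5.0):
    """Toy adaptive digital audio effect: the louder the input block (RMS),
    the deeper the amplitude modulation applied to it."""
    y = np.asarray(x, dtype=float).copy()
    for start in range(0, len(y) - block + 1, block):
        seg = y[start:start + block]
        rms = np.sqrt(np.mean(seg ** 2))
        depth = np.clip(rms * 2.0, 0.0, 0.9)             # feature -> parameter mapping
        t = np.arange(start, start + block) / fs         # absolute time keeps the LFO continuous
        lfo = 1.0 - depth * 0.5 * (1 + np.sin(2 * np.pi * rate_hz * t))
        y[start:start + block] = seg * lfo
    return y

out = adaptive_tremolo(np.random.randn(44100) * 0.3)
```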
Keywords
adaptive digital audio effects, expressive performance, feature extraction, mapping
Paper topics
Analysis, synthesis, and modification of sound; Interactive performance systems and new interfaces; Music performance
Easychair keyphrases
adaptive digital audio effect [18], adaptive version [16], non adaptive version [12], adaptive effect [11], real time [9], digital audio effect [7], fundamental frequency [7], effect parameter [6], feature extraction [6], musical expression [6], tone rate [6], audio signal [5], expressive intention [5], musical performance [5], non adaptive [5], parameter mapping [5], sound feature [5], acoustic tone parameter [4], adaptive mapping [4], effect change according [4], expressive acoustic feature [4], mapping strategy [4], spectral centroid [4], target group [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401955
Zenodo URL: https://zenodo.org/record/1401955
Abstract
Ferin Martino is a piano-playing software algorithm created by Jeff Morris for art installations. It uses nearby motion observed with a camera to shape, but not dictate, its flow of musical decisions. Aesthetically, the work challenges the notions of the composer and the composition by presenting a software program that composes its own oeuvre in such a way that visitors cannot experience the composition without also influencing it. The installation has taken many forms, at times including multiple cameras and speakers, video display, note input by the visitor, a digital player piano, and an outdoor venue with an original sculpture embedded in nature. This algorithm has also proven useful in live performance and computer-aided composition. Case studies of exhibiting this work illustrate the process of finding the most effective venue for an interactive art installation and the process of tuning interactivity for a given venue.
Keywords
algorithmic composition, installation, intermedia, piano, video
Paper topics
Automatic composition, accompaniment, and improvisation; Interactive performance systems and new interfaces; Music performance; Social interaction in sound and music computing; Sonic interaction design; Sound installation
Easychair keyphrases
ferin martino [12], digital player piano [4], disklavier digital player piano [4], midi reference track [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401891
Zenodo URL: https://zenodo.org/record/1401891
Abstract
Many of the dynamic range compressor (DRC) designs that are deployed in the marketplace today are constrained to operate in the time domain; therefore, they offer only temporally dependent control of the amplitude envelope of a signal. Designs that offer an element of frequency dependency are often restricted to performing specific tasks intended by the developer. Therefore, in order to realise a more flexible DRC implementation, this paper proposes a generalised time-frequency domain design that accommodates both temporally dependent and frequency-dependent dynamic range control, for which an FFT-based implementation is also presented. Examples given in this paper reveal how the design can be tailored to perform a variety of tasks using simple parameter manipulation, such as frequency-dependent ducking for automatic-mixing purposes and high-resolution multi-band compression.
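A minimal sketch of frequency-dependent dynamic range control on an STFT: a static threshold/ratio gain curve is evaluated per FFT bin and smoothed across frames with attack/release coefficients. All parameter values and the exact smoothing scheme are illustrative, not the design proposed in the paper.

```python
import numpy as np

def spectral_compressor(stft, threshold_db=-30.0, ratio=4.0,
                        attack=0.2, release=0.05):
    """Toy time-frequency DRC. 'stft' is a complex array of shape
    (n_bins, n_frames); gain is computed per bin and smoothed along time."""
    mag_db = 20 * np.log10(np.abs(stft) + 1e-9)
    over = np.maximum(mag_db - threshold_db, 0.0)
    target_gain_db = -over * (1.0 - 1.0 / ratio)          # static compression curve
    gain_db = np.zeros_like(target_gain_db)
    prev = np.zeros(stft.shape[0])
    for m in range(stft.shape[1]):
        # per-bin one-pole smoothing toward the target gain of this frame
        coeff = np.where(target_gain_db[:, m] < prev, attack, release)
        prev = coeff * target_gain_db[:, m] + (1 - coeff) * prev
        gain_db[:, m] = prev
    return stft * 10 ** (gain_db / 20.0)
```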
Keywords
dynamic range compression, FFT, spectral processing
Paper topics
Analysis, synthesis, and modification of sound
Easychair keyphrases
dynamic range compression [20], dynamic range [18], audio engineering society convention [16], domain drc design [12], frequency dependent gain factor [12], envelope detector [11], gain factor [11], time frequency domain [11], chain signal [10], frequency dependent [10], gain reduction [9], time domain drc [9], time frequency [9], audio signal [8], audio engineering society [7], dynamic range compressor [7], signal processing [7], time domain [7], frequency domain drc [6], time frequency transform [6], chain compression [5], digital audio [5], frequency range [5], gain computer [5], music production [5], sub band [5], automatic mixing purpose [4], band pass filter [4], calculated gain factor [4], fft based implementation [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401877
Zenodo URL: https://zenodo.org/record/1401877
Abstract
Deep learning approaches have become increasingly popular in estimating time-frequency masks for audio source separation. However, training neural networks usually requires a considerable amount of data. Music data is scarce, particularly for the task of classical music source separation, where we need multi-track recordings with isolated instruments. In this work, we start from the assumption that all the renditions of a piece are based on the same musical score, and that we can generate multiple renditions of the score by synthesizing it with different performance properties, e.g. tempo, dynamics, timbre and local timing variations. We then use this data to train a convolutional neural network (CNN) which can separate with low latency all the renditions of a score or a set of scores. The trained model is tested on real life recordings and is able to effectively separate the corresponding sources. This work follows the principle of research reproducibility, providing related data and code, and can be extended to separate other pieces.
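For context, the following sketch shows only the masking step of time-frequency-mask source separation: given each source's estimated magnitude spectrogram (e.g. a network's output), build soft ratio masks and apply them to the complex mixture STFT. The network itself and the score-driven data synthesis are not shown.

```python
import numpy as np

def ratio_mask_separation(mix_stft, est_mags):
    """mix_stft: complex mixture STFT of shape (n_bins, n_frames);
    est_mags: estimated magnitudes per source, shape (n_sources, n_bins, n_frames).
    Returns one masked complex STFT per source."""
    est_mags = np.asarray(est_mags)
    denom = est_mags.sum(axis=0) + 1e-9
    masks = est_mags / denom                      # soft masks in [0, 1], summing to ~1
    return masks * mix_stft[np.newaxis, ...]
```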
Keywords
classical music, data augmentation, deep learning, music source separation, neural networks
Paper topics
Music information retrieval
Easychair keyphrases
source separation [24], neural network [18], circular shifting [11], classical music [10], audio source separation [9], local timing variation [9], convolutional neural network [7], training data [7], classical music source separation [6], deep learning [6], deep neural network [6], generating training data [6], real life [6], score constrained nmf [6], circular shift [5], data augmentation [5], magnitude spectrogram [5], synthetic piece [5], deep convolutional neural network [4], low latency [4], low latency scenario [4], non negative matrix factorization [4], score informed source separation [4], target source [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401923
Zenodo URL: https://zenodo.org/record/1401923
Abstract
This paper presents a method that takes into account the form of a tune at several levels of organisation to guide music generation processes to match this structure. We first show how a phrase structure grammar can be constructed by hierarchical analysis of chord progressions to create multi-level progressions. We then explain how to exploit this multi-level structure of a tune for music generation and how it enriches the possibilities of guided machine improvisation. We illustrate our method on a prominent jazz chord progression called 'rhythm changes'. After creating a phrase structure grammar for 'rhythm changes' based on a corpus analysed with a professional musician and automatically learning the content of this grammar from this corpus, we generate improvisations guided by multi-level progressions created by the grammar. The results show the potential of our method to ensure the consistency of the improvisation regarding the global form of the tune, and how the knowledge of a corpus of chord progressions sharing the same hierarchical organisation can extend the possibilities of music generation.
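To illustrate the multi-level idea, the toy grammar below expands a tune-level symbol into AABA sections and then into chord strings, loosely in the shape of 'rhythm changes'. The productions are invented for illustration; they are not the grammar constructed and learned in the paper.

```python
import random

# Invented productions sketching a multi-level view of an AABA tune.
GRAMMAR = {
    "TUNE": [["A", "A", "B", "A"]],
    "A":    [["I-vi-ii-V", "I-vi-ii-V", "I-I7-IV-#IVdim", "I-V-I"]],
    "B":    [["III7", "VI7", "II7", "V7"]],
}

def expand(symbol):
    """Recursively expand a symbol; terminals are chord-progression strings."""
    if symbol not in GRAMMAR:
        return [symbol]
    production = random.choice(GRAMMAR[symbol])
    return [chord for s in production for chord in expand(s)]

print(expand("TUNE"))
```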
Keywords
Computational musicology, Computer music, Formal Language Theory, Machine improvisation, Music generation, Phrase Structure Grammar
Paper topics
Automatic composition, accompaniment, and improvisation
Easychair keyphrases
multi level progression [57], chord progression [44], multi level [35], rhythm change [35], phrase structure grammar [31], equivalent chord progression [12], music generation [11], multi level label [9], hierarchical structure [8], factor oracle [7], generation model [7], generative grammar [7], article n oun [6], computer music [6], memory sharing [6], machine improvisation [5], dominant seventh chord [4], human computer music improvisation [4], international computer [4], jazz chord progression [4], musical sentence [4], music generation model [4], professional jazz musician [4], style modelling [4], traditional bebop style rhythm [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401975
Zenodo URL: https://zenodo.org/record/1401975
Abstract
In the present study we describe the realization of a multimedia installation for the interactive sonification of a painting by Hartwig Thaler. The sonification is achieved by means of sound synthesis models capable of reproducing continuous auditory feedback, generated from the feature analysis of three color layers extracted from the painting (red, blue, yellow). Furthermore, a wooden sensor-equipped runway, divided into three sections, each of which represents a different layer, activates the corresponding soundscape according to the visitor's position. This system enables visitors to discover a 2D painting through a new 3D approach, moving along the runway toward the projection of the painting. The paper describes in detail the development of all the elements of the system: starting from the analysis of the image and the algorithm used to realize the acoustic elements, then the sensorized runway, and finally the testing results.
Keywords
board, installation, interactive, painting, runway, sensors, sonification, space, wooden
Paper topics
Analysis, and modification of sound, Computer music languages and software, Interactive performance systems and new interfaces, Sonification, Sound installation, Soundscapes and environmental arts, synthesis
Easychair keyphrases
wooden runway [9], color matrix [7], ground reaction force [6], localization system [6], sensor equipped runway [6], wooden board [5], audio interface [4], color layer [4], interactive installation [4], lightness value [4], piezoelectric sensor [4], real time [4], second sensor [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401883
Zenodo URL: https://zenodo.org/record/1401883
Abstract
This project presents the development of an interactive environment inspired by soundscape composition, in which a user can explore a sound augmented reality referring to a real soundscape. The user walks on the application’s responsive floor, controlling the soundscape rendering through her/his position. At the same time, s/he can explore the sound’s structure by entering the responsive floor’s central zone, where granular synthesis processes are activated. One main goal of the project is to sensitize participants to the surrounding sonic environment, also with a view to pedagogical studies. A first evaluation of an audio excerpt produced by a user’s soundwalk shows that the application can produce interesting, good-quality soundscapes that are fully consistent with the real environments that inspired them.
Keywords
interactive system, responsive floor, soundscape composition, soundscape design, soundscape interaction
Paper topics
Interactive performance systems and new interfaces, Sonic interaction design, Sound installation, Soundscapes and environmental arts
Easychair keyphrases
computer generated soundscape [11], granular synthesis [11], interactive soundscape [9], audio excerpt [8], audio file [7], audio source [5], deep listening [5], real time [5], soundscape composition [5], sub stream [5], user position [5], audio quality [4], barry truax [4], granular stream [4], granular synthesis process [4], musical instrument [4], real environment [4], real soundscape [4], responsive floor [4], sonic environment [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401885
Zenodo URL: https://zenodo.org/record/1401885
Abstract
A spatial array of vibro-mechanical transducers for bone-and-tissue conduction has been used to convey spatial ambisonic soundscapes and spatial musical material. One hundred volunteers underwent a five-minute listening experience, then described the experience in their own words, on paper, in an unstructured elicitation exercise. The responses were aggregated to elicit common emergent descriptive themes, which were then mapped against each other to identify to what extent the experience was valuable, enjoyable and informative, and what qualia were available through this technique. There appear to be some substantive differences between this way of experiencing music and spatial sound and other modes of listening. Notably, the haptic component of the experience appears potentially informative and enjoyable. We conclude that the development of similar techniques may have implications for augmented perception, particularly in respect of quality of life (QoL) in cases of conductive hearing loss.
Keywords
Audio spatial qualia, augmented perception, Bone and tissue conduction, Conductive hearing loss, Multi-modal music, Quality of life, Spatial Music
Paper topics
Multi-modal perception and emotion, Music performance analysis and rendering, Sonic interaction design, Soundscapes and environmental arts, Spatial sound, Virtual reality applications and technologies for sound and music
Easychair keyphrases
hearing loss [11], spatial music [9], tissue conduction [9], music listening [7], conductive hearing loss [6], signal set [6], positive comment [5], audio engineering society [4], bone conduction [4], spatial tissue conduction [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401905
Zenodo URL: https://zenodo.org/record/1401905
Abstract
Acoustical conditions on stage influence the sound and character of a musical performance. Musicians constantly modify their playing to accommodate the stage acoustics. The study of musicians' acoustical preferences is part of the characterization of this feedback loop, which impacts the musicians' comfort as well as the aural experience of a concert audience. This paper presents an investigation of the preferences of solo musicians on stage. Using spatial acoustic measurements and real-time auralization, different real rooms are resynthesized under laboratory conditions. Two formal tests are conducted with solo trumpet players: a room preference test and a test investigating the preferred directions of early energy. In the first test, musicians are presented with four different rooms and asked about their preference for five different aspects: practice of instrument technique, practice of concert repertoire, concert performance, ease of performance and sound quality. The second test concerns the preferred directions of early stage reflections. The auralized rooms are modified to provide early reflections from different directions (front-back, top-down, sides, no early reflections) and the musicians' preferences are investigated. The results show that the judged aspect, or musical activity to be performed, is a key factor in determining musicians' stage acoustics preferences. Drier rooms are preferred for technique practice, while louder rooms help to reduce the fatigue of the players. Bigger rooms with slightly longer reverberation are preferred for concert piece practice and concert performance. The ease of performance seems to correlate slightly with the preference for concert conditions. Regarding the direction of early reflections, there are no clear differences between the preferred directions, and the results suggest that the level of early energy is more important for the comfort of solo musicians on stage.
Keywords
Auralization, Early reflections, Musicians' stage preferences, Room acoustics, Stage acoustics, Stage acoustics preferences, Trumpet
Paper topics
Analysis, and modification of sound, Music performance, Room acoustics modeling and auralization, synthesis, Virtual reality applications and technologies for sound and music
Easychair keyphrases
early energy [18], early reflection [12], impulse response [12], stage acoustic [10], auralized room [9], directional early energy [9], performance context [9], reverberation time [7], virtual environment [7], general preference [6], late reverberation [6], practice technique [6], real room [6], room acoustic [6], solo musician [6], solo trumpet player [6], stage acoustics preference [6], acoustic condition [5], auralization error [5], concert hall [5], estimated preference [5], laboratory condition [5], reproduction loudspeaker [5], sound quality [5], stage support [5], concert concert easiness [4], concert easiness quality [4], quasi linear relationship [4], room acoustic parameter [4], stage acoustic preference [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401865
Zenodo URL: https://zenodo.org/record/1401865
Abstract
In this paper, we discuss a computational model of an automatic jazz session which is statistically trainable. Moreover, we describe a jazz piano trio synthesizing system that was developed to validate our model. Most previous mathematical models of jazz session systems require heuristic rules and human labeling of training data to estimate the musical intention of human players in order to generate accompaniment performances. In contrast, our goal is to statistically learn the relationship between a piano, a bass, and a drum player from performance MIDI data as well as from information contained in lead sheets, such as the tonic key and chord progression. Our system can generate the performance data of bass and drums from piano MIDI input alone, by learning the interrelationship of the performances and the time-series characteristics of the three instruments involved. The experimental results show that the proposed system can learn the relationship between the instruments and generate jazz piano trio MIDI output from only piano input.
Keywords
DNN, HMM, Jazz
Paper topics
accompaniment, Analysis, and improvisation, and modification of sound, Automatic composition, Music performance analysis and rendering, synthesis
Easychair keyphrases
jazz piano trio [22], jazz session system [17], piano performance [16], feature space [13], performance trajectory [12], performance feature space [11], piano performance feature [9], piano trio synthesizing [9], trio synthesizing system [9], accompaniment performance [8], performance feature [8], deviation vector [7], performance midi data [7], piano trio [7], training data [7], accompaniment instrument [6], jazz piano [6], performance state [6], stochastic state transition model [6], ipsj sig [5], neural network [5], sig technical [5], synthesizing phase [5], auxiliary function method [4], centroid deviation vector [4], hmm state centroid [4], machine learning method [4], recurrent neural network [4], short term memory [4], style feature vector [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401861
Zenodo URL: https://zenodo.org/record/1401861
Abstract
The aim of this paper is to present techniques and results for identifying the key of Irish traditional music melodies, or tunes. Several corpora are used, consisting of both symbolic and audio representations. Monophonic and heterophonic recordings are present in the audio datasets. Some particularities of Irish traditional music are discussed, notably its modal nature. New key-profiles are defined that are better suited to Irish music.
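Key-profile methods of this kind typically score every candidate tonic by correlating a pitch-class histogram of the tune against a rotated profile. The sketch below illustrates that mechanism only; the Krumhansl-Kessler major profile is used as a stand-in, not the new weighted-cadence profiles the paper defines for Irish music.

```python
import numpy as np

# Placeholder profile (Krumhansl-Kessler major), used purely for illustration.
MAJOR_PROFILE = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                          2.52, 5.19, 2.39, 3.66, 2.29, 2.88])

def estimate_tonic(pc_histogram, profile=MAJOR_PROFILE):
    """Correlate a 12-bin pitch-class histogram with the profile rotated to
    every candidate tonic and return the best match (0 = C ... 11 = B)."""
    scores = [np.corrcoef(pc_histogram, np.roll(profile, k))[0, 1]
              for k in range(12)]
    return int(np.argmax(scores))
```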
Keywords
Audio key detection, Irish traditional music, Modal music, Symbolic key detection
Paper topics
Computational musicology and mathematical music theory, Computer-based music analysis, Music information retrieval
Easychair keyphrases
key profile [36], irish traditional music [11], weighted cadence key profile [10], irish music [9], pitch class [9], fifth relative parallel neighbour [8], cross validation [7], pitch class histogram [7], unseen data [7], audio recording [6], confusion matrix [6], grid search process [6], music information retrieval [6], tone center image [6], accuracy score [5], best model [5], key detection [5], mirex score [5], audio dataset [4], best weight method [4], cross validation methodology [4], dublin institute [4], foinn seisi un [4], hyper parameter [4], irish traditional music tune [4], machine learning [4], mirex evaluation metric [4], neighbour key [4], test set [4], weight tuning [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401889
Zenodo URL: https://zenodo.org/record/1401889
Abstract
Recent advances in online technologies are changing the way we approach musical instrument teaching. The diversity of online music resources has increased through the availability of digital scores, video tutorials and music applications, creating the need for a cohesive, integrated learning experience. This article explores how different technological innovations have shaped music education and how their current transposition into the digital world is creating new learning paradigms. This is done by presenting an experimental online learning environment for the guitar that allows master-apprentice or self-taught learning, using interactive and collaborative tools.
Keywords
augmented score, instrumental gestures, musical notation, music education and pedagogy, online learning environment, serious game
Paper topics
Music games and educational tools, Music information retrieval, Music performance
Easychair keyphrases
augmented score [10], musical excerpt [10], music education [9], social network [9], informal music learning [6], serious game [6], sharing platform [6], audio analysis automatic multimodal analysis [4], group lesson [4], integrated learning experience [4], learner motivation [4], learning and teaching [4], learning environment [4], motion capture [4], online music resource [4], real time [4], score editor [4], self taught learner [4], ultimate guitar [4], video sharing platform [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401887
Zenodo URL: https://zenodo.org/record/1401887
Abstract
This paper introduces the first system for performing automatic orchestration based on real-time piano input. We believe that it is possible to learn the underlying regularities between piano scores and their orchestrations by well-known composers, in order to automatically perform this task on novel piano inputs. To that end, we investigate a class of statistical inference models based on the Restricted Boltzmann Machine (RBM). We introduce a specific evaluation framework for orchestral generation based on a prediction task in order to assess the quality of different models. To gain a better understanding of these quantitative results, we provide a qualitative analysis of the performance of the models, from which we try to extract the most crucial features amongst the different architectures. As prediction and creation are two widely different endeavours, we discuss the potential biases in evaluating temporal generative models through prediction tasks and their impact on a creative system. Finally, we introduce an implementation of the proposed models called Live Orchestral Piano (LOP), which allows users to perform real-time projective orchestration of a MIDI keyboard input.
Keywords
Automatic orchestration, Machine learning, Neural networks
Paper topics
accompaniment, and improvisation, Automatic composition
Easychair keyphrases
piano score [20], projective orchestration [16], projective orchestration task [14], visible unit [13], event level [12], frame level accuracy [12], frame level [11], real time [10], live orchestral piano [9], orchestral score [8], context unit [7], evaluation framework [7], event level accuracy [7], hidden unit [7], level accuracy [7], restricted boltzmann machine [7], conditional model [6], piano roll [6], piano roll representation [6], accuracy measure [5], automatic orchestration [5], companion website [5], gibbs sampling [5], neural network [5], probabilistic model [5], event level granularity [4], factored gated crbm [4], level accuracy measure [4], music generation field [4], statistical inference model [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401985
Zenodo URL: https://zenodo.org/record/1401985
Abstract
The theoretical basis of existing approaches to automatic fingering decision is mainly path optimization that minimizes the difficulty of the whole phrase, typically defined as the sum of the difficulties of the moves required for playing the phrase. However, from the practical viewpoint of beginner players, it is more important to minimize the maximum difficulty of the moves required for playing the phrase, that is, to make the most difficult move as easy as possible. To this end, we introduce a variant of the Viterbi algorithm termed the “minimax Viterbi algorithm” that finds the path of hidden states which maximizes the minimum transition probability along the path (not the product of the transition probabilities). Furthermore, we introduce a family of Viterbi algorithms termed the “Lp-Viterbi algorithm” that continuously interpolates between the conventional Viterbi algorithm and the minimax Viterbi algorithm. We apply these variants of the Viterbi algorithm to HMM-based guitar fingering decision and compare the resulting fingerings.
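As a rough illustration of the "weakest link" idea described here, the following Python sketch replaces the max-product recursion of the conventional Viterbi decoder with a max-min one. The variable layout and the inclusion of output probabilities in the path score are assumptions for the sketch, not the paper's exact formulation.

```python
import numpy as np

def minimax_viterbi(pi, A, B, obs):
    """Decode the state path whose weakest link is as strong as possible.

    pi : (S,)   initial state probabilities
    A  : (S, S) transition probabilities, A[i, j] = P(state j | state i)
    B  : (S, O) output probabilities,     B[j, o] = P(obs o | state j)
    obs: sequence of observation indices

    Each path is scored by the minimum probability encountered along it
    (rather than the product used by the conventional Viterbi algorithm);
    the path with the largest minimum is returned.
    """
    S, T = len(pi), len(obs)
    score = np.zeros((T, S))            # best "weakest link" ending in state j at time t
    back = np.zeros((T, S), dtype=int)
    score[0] = np.minimum(pi, B[:, obs[0]])
    for t in range(1, T):
        for j in range(S):
            # weakest link when arriving at j from each candidate predecessor i
            cand = np.minimum(score[t - 1], np.minimum(A[:, j], B[j, obs[t]]))
            back[t, j] = int(np.argmax(cand))
            score[t, j] = cand[back[t, j]]
    path = [int(np.argmax(score[-1]))]  # backtrack from the best final state
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return list(reversed(path))
```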
Keywords
Automatic Fingering Decision, Hidden Markov Model, Viterbi Algorithm
Paper topics
Computer-based music analysis
Easychair keyphrases
viterbi algorithm [54], minimax viterbi algorithm [36], lp viterbi algorithm [34], conventional viterbi algorithm [33], fingering decision [25], hidden state [20], transition probability [11], decoding problem [10], output symbol [10], generalized decoding problem [9], arg max [7], automatic fingering decision [7], difficult move [7], guitar fingering decision [7], difficulty level [6], hidden markov model [6], hmm based guitar fingering [6], maximum minimum transition probability [6], output probability [6], very small value [6], path optimization [5], pinky finger [5], blue line [4], fret number [4], maximum difficulty [4], maximum likelihood sequence [4], minimum transition probability [4], monophonic guitar phrase [4], red line [4], string instrument [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401971
Zenodo URL: https://zenodo.org/record/1401971
Abstract
This work proposes a lexicalized probabilistic context free grammar designed for meter detection, an integral component of automatic music transcription. The grammar uses rhythmic cues to align a given musical piece with learned metrical stress patterns. Lexicalization breaks the standard PCFG assumption of independence of production, and thus, our grammar can model the more complex rhythmic dependencies which are present in musical compositions. Using a metric we propose for the task, we show that our grammar outperforms baseline methods when run on symbolic music input which has been hand-aligned to a tatum. We also show that the grammar outperforms an existing method when run with automatically-aligned symbolic music data as input. The code for the grammar described here is available at https://github.com/author1/met-detection.
Keywords
Lexicalization, Meter, Metrical Structure, PCFG, Symbolic Music Analysis, Time Signature
Paper topics
Computer-based music analysis, Music information retrieval
Easychair keyphrases
sub beat [30], meter type [24], metrical structure [23], meter detection [15], full metrical structure detection [12], sub beat length [12], training data [11], true positive [9], beat tracking [8], rhythmic tree [7], symbolic music data [7], time signature [7], good turing smoothing [6], metrical tree [6], sub beat level [6], temperley model [6], beat level [5], longuet higgin [5], rhythmic stress [5], symbolic music [5], automatic music transcription [4], beat level transition [4], correct time signature [4], enough training data [4], first full bar [4], hand aligned tatum [4], hypothesis state [4], inner metric analysis [4], metrical stress pattern [4], non terminal [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401967
Zenodo URL: https://zenodo.org/record/1401967
Abstract
While good physical health receives more attention, psychological wellbeing is an essential component of a happy existence. An everyday source of psychological wellbeing is the voluntary practice of skilled activities one is good at. Taking musical creation as one such skilled activity, in this work we employ an interaction method to monitor the varying levels of engagement of musicians improvising on a desktop robotic musical interface (a network of intelligent sonic agents). The system observes the performer and estimates her changing level of engagement during the performance, while learning the musical discourse. When engagement levels drop, the musical instrument makes subtle interventions, coherent with the compositional process, until the performer's engagement levels recover. In a user study, we observed and measured the behaviour of our system as it dealt with losses of performer focus provoked by the controlled introduction of external distractors. We also observed that being engaged in our musical creative activity contributed positively to participants' psychological wellbeing. This approach can be extended to other human activities.
Keywords
Engaging interaction, interactive system, musical interaction, NOISA, semi-autonomous system, wellbeing promotion
Paper topics
accompaniment, and improvisation, Automatic composition, Interactive performance systems and new interfaces, Multi-modal perception and emotion, Music performance, Social interaction in sound and music computing, Sonic interaction design
Easychair keyphrases
psychological wellbeing [16], response module [11], musical activity [9], negative affect [9], creative activity [6], skilled activity [6], engagement level [5], noisa instrument [5], positive affect [5], user study [5], creative musical [4], intelligent sonic agent [4], music therapy [4], noisa system [4], overall improvement [4], participant psychological wellbeing [4], schooling issue digest [4], system usability scale [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401911
Zenodo URL: https://zenodo.org/record/1401911
Abstract
Highly recurrent networks that exhibit feedback and delay mechanisms offer promising applications for music composition, performance and sound installation design. This paper provides an overview and a comparison of several pieces that have been realised by various musicians in the context of a practice-led research project.
Keywords
Composition, Generative Art, Music Performance, Sound Installation, Sound Synthesis, Time-Delayed Feedback Networks
Paper topics
accompaniment, and improvisation, Automatic composition, Music performance, Sonification, Sound installation
Easychair keyphrases
network topology [22], network node [16], connection characteristic [12], computer music [11], delay time [11], sound synthesis [11], time delayed feedback network [8], acoustic output [7], audio routing [7], generative art [7], network connection [7], sheet music [7], delay network [6], digital signal processing [6], generative system [6], installation sheet music [6], izhikevich model neuron [6], piece roj [6], piece twin prime [6], recurrent connection [6], signal amplitude [6], sonic material [6], time delay [6], audio signal [5], feedback network [5], network algorithm [5], neural network [5], sound installation [5], twin prime [5], wave shaping function [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401959
Zenodo URL: https://zenodo.org/record/1401959
Abstract
Novelty detection is a well-established method for analyzing the structure of music based on acoustic descriptors. Work on novelty-based segmentation prediction has mainly concentrated on enhancement of features and similarity matrices, novelty kernel computation and peak detection. Less attention, however, has been paid to the characteristics of musical features and novelty curves, and their contribution to segmentation accuracy. This is particularly important as it can help unearth acoustic cues prompting perceptual segmentation and find new determinants of segmentation model performance. This study focused on spectral, rhythmic and harmonic prediction of perceptual segmentation density, which was obtained for six musical examples from 18 musician listeners via an annotation task. The proposed approach involved comparisons between perceptual segment density and novelty curves; in particular, we investigated possible predictors of segmentation accuracy based on musical features and novelty curves. For pitch and rhythm, we found positive correlations between segmentation accuracy and both the local variability of musical features and the mean distance between subsequent local maxima of novelty curves. According to the results, segmentation accuracy increases for stimuli with milder local changes and fewer novelty peaks. Implications regarding the prediction of listeners’ segmentation are discussed in the light of theoretical postulates of perceptual organization.
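For readers unfamiliar with novelty curves, the sketch below shows the generic Foote-style computation: a self-similarity matrix built from frame-wise features, scanned along its diagonal with a checkerboard kernel. The feature type, kernel size and tapering here are illustrative choices, not the exact configuration used in the study.

```python
import numpy as np

def novelty_curve(features, kernel_half=8):
    """Foote-style novelty from frame-wise features (e.g. chroma or MFCC frames).

    A cosine self-similarity matrix is built and a Hann-tapered checkerboard
    kernel is slid along its diagonal; large responses mark candidate segment
    boundaries. Generic illustration only, not the study's exact setup.
    """
    F = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-9)
    S = F @ F.T                                   # self-similarity matrix
    n = kernel_half
    taper = np.outer(np.hanning(2 * n), np.hanning(2 * n))
    sign = np.kron(np.array([[1.0, -1.0], [-1.0, 1.0]]), np.ones((n, n)))
    kernel = sign * taper
    novelty = np.zeros(S.shape[0])
    for i in range(n, S.shape[0] - n):
        novelty[i] = np.sum(S[i - n:i + n, i - n:i + n] * kernel)
    return novelty
```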
Keywords
kernel density estimation, musical features, musical structure, music segmentation, novelty detection
Paper topics
Computer-based music analysis, Music information retrieval
Easychair keyphrases
novelty curve [47], musical feature [34], perceptual segmentation density [22], segmentation accuracy [22], novelty detection [17], perceptual boundary density [14], musical piece [13], feature flux [12], novelty peak [10], musical stimulus [9], segment boundary [9], tonal centroid [9], musical change [8], segmentation task [8], boundary density [6], higher accuracy [6], local change [6], perceptual segment boundary density [6], boundary data [5], mean distance [5], perceptual segmentation [5], segmentation prediction [5], time point [5], time series [5], tonal feature [5], interdisciplinary music research university [4], music information retrieval [4], perceptual organization [4], perceptual segment boundary [4], tonal centroid time series [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401965
Zenodo URL: https://zenodo.org/record/1401965
Abstract
The paper presents the pch2csd project, focused on converting patches of the popular Clavia Nord Modular G2 synthesizer into Csound code. Now discontinued, the Nord Modular G2 left behind a lot of interesting patches for sound synthesis and algorithmic composition. To give this heritage a new life, we created our project with the hope of being able to simulate the original sound and behavior of the Nord Modular.
Keywords
Csound, format conversion, Nord Modular G2
Paper topics
Analysis, and modification of sound, Computer music languages and software, synthesis
Easychair keyphrases
nord modular [21], nord modular g2 [15], cable visibility [7], mapping table [7], third international csound [7], bonch bruevich st [6], patch format [6], clavia nord [5], petersburg state [5], binary encoding field name [4], bit midi value [4], csound code [4], field name header [4], list binary encoding field [4], modular g2 patch [4], name header byte [4], patch file [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401979
Zenodo URL: https://zenodo.org/record/1401979
Abstract
In this paper we present Qualia, a software tool for the real-time generation of graphical scores driven by audio analysis of the performance of a group of musicians. With Qualia, the composer analyses and maps the flux of data to specific score instructions, thus becoming part of the performance itself. Qualia is intended for collaborative performances. In this context, the creative process of composing music not only challenges musicians to improvise collaboratively through active listening, as is typical, but also requires them to interpret the graphical instructions provided by Qualia. The performance is then an interactive process based on “feedback” between the sound produced by the musicians, the flow of data managed by the composer and the corresponding graphical output interpreted by each musician. Qualia supports the exploration of relationships between composition and performance, promoting engagement strategies in which each musician participates actively using their instrument.
Keywords
Composer-Pilot, Composition, gestural feedback, graphical scores, Improvisation, Meta-improvisation, Qualia, Real-time Score Generation
Paper topics
accompaniment, and improvisation, Automatic composition, Music performance, Music performance analysis and rendering
Easychair keyphrases
real time [13], graphical score [7], animated notation [4], music notation [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401881
Zenodo URL: https://zenodo.org/record/1401881
Abstract
Sword sounds are synthesised by physical models in real-time. A number of compact sound sources are placed along the length of the sword to replicate the swoosh sound when it is swung through the air. Listening tests reveal that a model with reduced physics is perceived as more authentic. The model is further developed to be controlled by a Wii Controller and is successfully extended to include the sounds of a baseball bat and a golf club.
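The swoosh of a swung object is dominated by the Aeolian tone produced by vortex shedding. Assuming the textbook relation f = St·U/d between the Strouhal number, airspeed and cylinder diameter (the kind of relation compact-source models of this type build on), a back-of-the-envelope frequency estimate can be sketched as follows; the paper's model may vary the Strouhal number with the Reynolds number.

```python
def aeolian_tone_frequency(air_speed_ms, diameter_m, strouhal=0.2):
    """Vortex-shedding (Aeolian tone) frequency behind a cylinder: f = St * U / d.

    St ~= 0.2 is a typical value over a wide range of Reynolds numbers;
    treat this as a rough estimate, not the paper's full model.
    """
    return strouhal * air_speed_ms / diameter_m

# e.g. a 25 mm blade section swept through the air at 20 m/s:
# aeolian_tone_frequency(20.0, 0.025)  ->  160 Hz
```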
Keywords
Aeroacoustics, Physical Models, Real-Time, Sound Synthesis
Paper topics
Analysis, and modification of sound, synthesis, Virtual reality applications and technologies for sound and music
Easychair keyphrases
wii controller [11], aeolian tone [10], physical model [10], listening test [8], reynold number [8], synthesis method [8], compact source [7], sound effect [7], fundamental frequency [6], lift dipole [6], physically inspired model [6], physical model physical model [6], real time physical model [6], sword model [6], sword sound [6], top speed [6], circular cylinder [5], golf club [5], sound texture [5], strouhal number [5], audio engineering society convention [4], baseball bat [4], compact source model [4], low quality [4], synthesis model [4], time averaged acoustic intensity [4], tukey post hoc test [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401947
Zenodo URL: https://zenodo.org/record/1401947
Abstract
We present the problem of music renotation, in which the results of optical music recognition are rendered in image format while various parameters of the notation are changed, such as the size of the display rectangle or the transposition. We cast the problem as one of quadratic programming. We construct parameterizations of each composite symbol expressing the degrees of freedom in its rendering, and relate all the symbols through a connected graph. Some of the edges in this graph become terms in the quadratic cost function, expressing a desire for spacing similar to that in the original document. Other edges impose hard linear constraints between symbols, expressing relations, such as alignments, that must be preserved in the renotated version. The remaining edges represent linear inequality constraints, used to resolve overlapping symbols. The optimization is solved through generic techniques. We demonstrate renotation on several examples of piano music.
Keywords
Graph-based Optimization, Music Notation, Optical Music Recognition
Paper topics
Computer music languages and software, Music information retrieval
Easychair keyphrases
beamed group [12], optical music recognition [12], note head [11], inequality constraint [10], music notation [10], augmentation dot [8], horizontal position [8], equality constraint [7], quadratic term [7], basic symbol [6], linear inequality constraint [6], hard edge [5], isolated symbol [5], music renotation [5], objective function [5], composite symbol [4], musical symbol [4], optimization problem [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401943
Zenodo URL: https://zenodo.org/record/1401943
Abstract
The purpose of this paper is to present Robinflock, an algorithmic system for automatic polyphonic music generation to be applied to interactive systems targeted at children. The system allows real-time interaction with the generated music; in particular, a number of parameters of each line can be independently manipulated. In the context of interactivity, we highlight the importance of identifying the specific needs of the targeted scenario. We discuss how the specific needs of polyphony with children influenced the development choices. The algorithm has been adopted in a field study in a local kindergarten involving 27 children over a period of seven months.
Keywords
Algorithmic composition, automatic composition, computer music systems for children, music and hci, music interaction design
Paper topics
accompaniment, and improvisation, Automatic composition, Music games and educational tools, Sonic interaction design
Easychair keyphrases
polyphonic music [26], real time [21], algorithmic composition [10], algorithmic composition system [9], algorithmic system [7], quarter note [7], field study [6], fifth species counterpoint [6], interactive system [6], rhythm array [6], algorithmic composer [5], generation module [5], genetic algorithm [5], interactive scenario [5], musical parameter [5], music teacher [5], body movement [4], children attention [4], design phase [4], first order markov chain [4], help child [4], interaction lab [4], melody generation module [4], music education [4], music theory [4], real time algorithmic system [4], rhythm generation module [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401901
Zenodo URL: https://zenodo.org/record/1401901
Abstract
Recently, the end-to-end approach that learns hierarchical representations from raw data using deep convolutional neural networks has been successfully explored in the image, text and speech domains. This approach has been applied to musical signals as well but has not been fully explored yet. To this end, we propose sample-level deep convolutional neural networks which learn representations from very small grains of waveforms (e.g. 2 or 3 samples), going beyond typical frame-level input representations. Our experiments show how deep architectures with sample-level filters improve the accuracy in music auto-tagging and provide results comparable to previous state-of-the-art performance on the Magnatagatune dataset and the Million Song Dataset. In addition, we visualize the filters learned in a sample-level DCNN in each layer to identify hierarchically learned features and show that they are sensitive to log-scaled frequency along the layers, similar to the mel-frequency spectrogram that is widely used in music classification systems.
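A sample-level front end of this kind is essentially a stack of very short strided 1-D convolutions applied directly to the raw waveform. The PyTorch sketch below illustrates that structure only; the layer count, channel width and filter length are placeholders rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SampleLevelFrontEnd(nn.Module):
    """Stack of short, strided 1-D convolutions over the raw waveform, so the
    first layer sees only a few samples at a time (illustrative sketch)."""

    def __init__(self, n_layers=9, channels=128, filter_len=3):
        super().__init__()
        layers, in_ch = [], 1
        for _ in range(n_layers):
            layers += [nn.Conv1d(in_ch, channels, filter_len, stride=filter_len),
                       nn.BatchNorm1d(channels),
                       nn.ReLU()]
            in_ch = channels
        self.net = nn.Sequential(*layers)

    def forward(self, waveform):           # waveform: (batch, 1, n_samples)
        return self.net(waveform)          # time axis shrinks by filter_len per layer

# x = torch.randn(1, 1, 3 ** 9)           # ~0.45 s of audio at 44.1 kHz
# features = SampleLevelFrontEnd()(x)     # (1, 128, 1) summary per excerpt
```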
Keywords
Convolutional Neural Networks, music auto-tagging, Raw waveforms, Sample-level Deep Networks
Paper topics
Music information retrieval
Easychair keyphrases
convolution layer [31], filter length [26], first convolution layer [23], sample level [23], raw waveform [22], frame level [17], level raw waveform model [14], music auto tagging [14], deep convolutional neural network [12], layer filter length [12], model n layer [12], n layer filter [12], mel spectrogram model [11], strided convolution layer [11], mel spectrogram [10], sub sampling [8], arxiv preprint [7], convolutional neural network [7], pooling length [7], preprint arxiv [7], strided convolution [7], auto tagging task [6], frame level mel [6], last sub sampling layer [6], level mel spectrogram [6], music information retrieval [6], n model model [6], sample level architecture [6], stride auc model [6], time frequency representation [6]
Paper type
unknown
DOI: 10.5281/zenodo.1401921
Zenodo URL: https://zenodo.org/record/1401921
Abstract
Cochlear implants (CI) can restore some amount of hearing to people with severe hearing loss, but these devices are far from perfect. Although this technology allows many users to perceive speech in a quiet room, it is not so successful for music perception. Many public spaces (e.g., stores, elevators) are awash with background music, but many CI users report that they do not find music enjoyable or reassuring. Research shows that music training can improve music perception and appreciation for CI users. However, compared to the multiple computer-assisted solutions for language training, there are few systems that exploit the benefits of computer technologies to facilitate music training of children with cochlear implants. The few existing systems either target a different audience or have complex interfaces that are not friendly to children. In this study, we examined the design limitations of a prior application (MOGAT) in this field and developed a new system with more friendly interfaces for the music training of children with CI. The new system, SECCIMA, was crafted through an iterative design process that involved 16 participants. After the design phase was completed, the final game was evaluated and compared against the previous game with 12 new participants. Our results show that the newly designed interface is more intuitive and user-friendly than the previous one. To assist future work, we discuss some guidelines for designing user interfaces for this audience.
Keywords
assistive computing, cochlear implants, evaluation, game design, iterative design, music training, user study
Paper topics
Multi-modal perception and emotion, Music games and educational tools
Easychair keyphrases
cochlear implant [17], iterative design process [14], target pitch [13], completion time [12], music perception [9], music training [9], visual element [9], musical note [8], vocal matcher [8], instruction screen [6], post session questionnaire [6], progress bar [6], success rate [6], glossy ball [5], pitch contour [5], singing game [5], singnroll game [5], animated instruction [4], auditory training [4], deaf child [4], hearing loss [4], hearing research [4], higher lower [4], interactive music awareness program [4], objective measure [4], pitch perception [4], speech training [4], user interface [4], user study [4], xylophone game [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401915
Zenodo URL: https://zenodo.org/record/1401915
Abstract
This article discusses a recent large-scale audio-architectural installation which uses the glass structures of a greenhouse to create a multichannel sound system driven by structure-borne audio transducers. The sound system is presented and its implementation is discussed in reference to the constraints of a site-specific installation. A set of sound spatialisation strategies is proposed and their effectiveness weighed in the specific context of a large-scale work in which the audience takes an active role by moving within the piece, countering traditional centered spatialisation methods. Compositional strategies are developed in response to this context, emphasizing the spatial dimension of composition over the temporal and narrative ones and pointing towards the concepts of “Sonic Weather” and “Sonic Acupuncture”.
Keywords
Sound and architecture, Sound installation, Sound Spatialisation, Structure-borne sound, Time and Space in Electroacoustic Composition
Paper topics
Sound installation, Soundscapes and environmental arts, Spatial sound
Easychair keyphrases
sonic greenhouse [16], palm room [9], cactus room [8], sonic weather [8], structure borne sound driver [8], compositional strategy [7], helsinki winter garden [7], large scale [7], sound spatialisation [7], sonic acupuncture [6], sound diffusion system [6], winter garden [6], first person [5], glass panel [5], aural architecture [4], distance based amplitude panning [4], glass structure [4], largescale audio architectural installation [4], plexiglass panel [4], sound source [4], weather data [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401933
Zenodo URL: https://zenodo.org/record/1401933
Abstract
Content-based retrieval becomes more and more significant with the increasing availability of data. In the context of music data, emotion is a strong content-based characteristic. In this paper, we focus on emotion recognition from musical tracks in the 2-dimensional valence-arousal (V-A) emotional space. We propose a method based on convolutional (CNN) and recurrent neural networks (RNN), having significantly fewer parameters compared with the state-of-the-art (SOTA) method for the same task. We utilize one CNN layer followed by two branches of RNNs trained separately for arousal and valence. The method was evaluated using the “MediaEval2015 emotion in music” dataset. We achieved an RMSE of 0.202 for arousal and 0.268 for valence, which is the best result reported on this dataset.
Keywords
convolutional neural networks, music emotion, neural networks, recognition, recurrent neural networks, regression
Paper topics
Music information retrieval
Easychair keyphrases
neural network [15], sequence length [13], baseline feature [12], mel band energy [12], recurrent neural network [12], log mel band [9], raw feature [9], audio feature [8], average rmse [7], signal processing [7], standard deviation [7], mel band [6], music emotion recognition [6], bidirectional gru [5], feature set [5], band energy feature [4], continuous conditional neural field [4], convolutional recurrent neural network [4], dropout rate [4], log mel [4], mel band feature [4], music task [4], time distributed fc [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401917
Zenodo URL: https://zenodo.org/record/1401917
Abstract
Systema Naturae is a cycle of four compositions written for various instrumental ensembles and electromechanical setups. The workflow includes the design and construction of electromechanical instruments, algorithmic composition, automated notation generation, real-time control of the setups and synchronization with the acoustic ensemble during the performances. These various aspects have to be integrated into a single working pipeline that is shared between the two authors, which requires defining communication protocols for sharing data and procedures. The paper reports on these aspects and on the integration required between hardware and software, non-real-time and real-time operations, acoustic and mechanical instruments, and, last but not least, between the two composers.
Keywords
Algorithmic composition, Automated music instruments, Physical computing, Shared music practices
Paper topics
accompaniment, and improvisation, Automatic composition, Interactive performance systems and new interfaces, Robotics and music
Easychair keyphrases
sound body [49], real time [15], algorithmic composition [12], residual orchestra [11], systema naturae [10], acoustic instrument [9], algorithmic composition environment [9], hair dryer [9], music instrument [8], physical computing [8], wind instrument [7], click track [6], non idiophonic flute [6], wind instrument girodisco [6], musical instrument [5], transistor board [5], audio file [4], electromechanical setup [4], ensemble mosaik [4], eolio ocarino fischietta [4], indirectly cetro free aerophone [4], music composition [4], music notation [4], ocarino fischietta cocacola [4], real time application [4], real time performance [4], real time simulator [4], sound body part [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401973
Zenodo URL: https://zenodo.org/record/1401973
Abstract
This paper is an intermediate report of an ongoing artistic research project on technology-assisted performance practice. It describes the application Polytempo Network and some works that were realised using this application in the course of the last two years. The different compositional approaches chosen for each individual work and the experiences gathered during the process of composing, rehearsing and performing these works are discussed. The leading question is to what extent the usage of this technology is aesthetically significant.
Keywords
Networked music performance practice, Open form, Polytemporal music, Tempo synchronisation
Paper topics
Computer music languages and software, Interactive performance systems and new interfaces
Easychair keyphrases
application polytempo network [11], real time [9], polytempo network [7], technology assisted performance practice [6], tempo polyphony [6], tempo progression [6], computer music [5], electronic score [5], message driven operation mode [4], music notation [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401903
Zenodo URL: https://zenodo.org/record/1401903
Abstract
This article introduces the Csound Plugin Opcode Framework (CPOF), which aims to provide a simple, lightweight C++ framework for the development of new unit generators for Csound. The original interface for this type of work is provided in the C language, and it still provides the most complete set of components to cover all possible requirements. CPOF attempts to allow a simpler and more economical approach to creating plugin opcodes. The paper explores the fundamental characteristics of the framework and how it is used in practice. The helper classes included in CPOF are presented with examples. Finally, we look at some uses in the Csound source codebase.
Keywords
C++, Csound, musical signal processing, object-oriented programming
Paper topics
Analysis, and modification of sound, Computer music languages and software, synthesis
Easychair keyphrases
base class [11], fsig data [10], int init [8], plugin class [8], array subscript access [6], csound plugin opcode framework [6], fsig format [6], static constexpr char [6], vector data myflt [6], function table [5], int kperf [5], processing function [5], template typename [5], class template [4], compile time [4], const iterator [4], csound csound [4], init time [4], iterator type [4], member variable [4], myfltvec data [4], return iterator [4], typename t int [4], virtual function [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401935
Zenodo URL: https://zenodo.org/record/1401935
Abstract
This paper discusses the processes involved in designing and implementing an object-oriented library for audio signal processing in C++ (ISO/IEC C++11). The introduction presents the background and motivation for the project, which is related to providing a platform for the study and research of algorithms, with the added benefit of having an efficient and easy-to-deploy library of classes for application development. The design goals and directions are explored next, focusing on the principles of stateful representations of algorithms, abstraction/encapsulation, code re-use and connectivity. The paper provides a general walk-through of the current classes and a detailed discussion of two algorithm implementations. Completing the discussion, an example program is presented.
Keywords
digital signal programming libraries, music programming, object-oriented programming
Paper topics
Computer music languages and software
Easychair keyphrases
delay line [7], processing method [7], signal processing [7], base class [6], const double [6], band limited waveform [4], class hierarchy [4], const double sig [4], def vframe [4], derived class [4], function table [4], impulse response [4], iso international standard iso [4], processing class [4], signal generator [4], table lookup oscillator [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401867
Zenodo URL: https://zenodo.org/record/1401867
Abstract
This paper presents the results of an investigation into audio-visual (AV) correspondences conducted as part of the development of Morpheme, a painting interface to control a corpus-based concatenative sound synthesiser. Our goal was to measure the effects of the corpora on the effectiveness of the AV mappings used by Morpheme to synthesise sounds from images. Two mapping strategies and four corpora were empirically evaluated by 110 participants. To test the effectiveness of the mappings, sounds were generated using Morpheme from images designed to exhibit specific properties. The sounds were then played to the participants of the study and, for each sound, they were presented with three images: the image used to synthesise the sound and two similar distractor images. Participants were asked to identify the correct source image and rate their level of confidence. The results show that the audio corpus used to synthesise the audio stimuli had a significant effect on the subjects' ability to identify the correct image, while the mapping did not. No effect of musical/sound training was observed.
Keywords
Audiovisual Interaction, Concatenative Synthesis, Crossmodal Correspondence, Evaluation
Paper topics
Analysis, and modification of sound, Interactive performance systems and new interfaces, Multi-modal perception and emotion, Music information retrieval, Sonification, synthesis
Easychair keyphrases
correct selection rate [14], non expert [14], correct image selection rate [12], selection rate [12], non expert group [11], audio stimulus [9], image selection [8], audio corpus [7], unsupervised condition [7], audio corpus x2 [6], confidence level [6], correct image [6], audio unit [5], confidence rating [5], correct detection [5], expert group [5], sound synthesis [5], subject skill [5], benferroni adjusted p value [4], chi square result [4], concatenative synthesiser [4], correct image selection [4], correct selection rate x2 [4], image selection success [4], organised sound [4], significant relationship [4], statistical significance [4], strong relationship [4], subject ability [4], visual stimulus [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401953
Zenodo URL: https://zenodo.org/record/1401953
Abstract
The paper presents results from an experiment in which 91 subjects stood still on the floor for 6 minutes, first 3 minutes in silence, followed by 3 minutes with music. The head motion of the subjects was captured using an infrared optical system. The results show that the average quantity of motion during standstill is 6.5 mm/s, and that the subjects moved more when listening to music (6.6 mm/s) than when standing still in silence (6.3 mm/s). This result confirms the belief that music induces motion, even when people try to stand still.
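A quantity-of-motion figure like the millimetre-per-second values quoted here is essentially the average speed of a tracked marker. The sketch below shows one generic way to compute it from motion-capture data; the study's exact preprocessing (filtering, marker choice) may differ.

```python
import numpy as np

def quantity_of_motion(positions_mm, frame_rate_hz):
    """Average speed of a tracked head marker, in mm/s.

    positions_mm:  (n_frames, 3) array of marker coordinates in millimetres
    frame_rate_hz: motion-capture sampling rate in Hz
    """
    step = np.linalg.norm(np.diff(positions_mm, axis=0), axis=1)   # mm per frame
    return float(np.mean(step) * frame_rate_hz)                    # mm per second
```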
Keywords
micromotion, music-induced motion, music-related motion, standstill
Paper topics
Multi-modal perception and emotion
Easychair keyphrases
sound condition [9], average qom [4], electronic dance music [4], entire data [4], human standstill [4], mean qom [4], motion capture [4], younger participant [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401913
Zenodo URL: https://zenodo.org/record/1401913
Abstract
The process of creating historical-critical digital music editions involves the annotation of musical measures in the source materials (e.g. autographs, manuscripts or prints). This serves to chart the sources and create concordances between them. So far, this laborious task has barely been supported by software tools. We address this shortcoming with two interface approaches that follow different functional and interaction concepts. Measure Editor is a web application that complies with the WIMP paradigm and puts the focus on detailed, polygonal editing of image zones. Vertaktoid is a multi-touch and pen interface whose focus is on quick and easy measure annotation. Both tools were evaluated with music editors, giving us valuable clues for identifying the best aspects of both approaches and motivating future development.
Keywords
Digital Music Edition, MEI, Tools
Paper topics
Computational musicology and mathematical music theory
Easychair keyphrases
measure editor [21], music score [15], edirom editor [14], music edition [13], digital music edition [11], measure annotation [11], bounding box [8], last access [8], existing measure [7], music editor [7], optical measure recognition [7], optical music recognition [7], music encoding initiative [6], selection gesture [6], sus score [5], user interface [5], annotation process [4], common western music notation [4], featured music edition platform [4], full featured music edition [4], human computer interaction [4], interaction technique [4], measure annotation tool [4], music information retrieval [4], scissors tool [4], source material [4], textual annotation [4], th international society [4], web application [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401941
Zenodo URL: https://zenodo.org/record/1401941
Abstract
The use of Answer Set Programming (ASP) in musical domains has been demonstrated especially in composition, but so far it has not reached engineering areas of studio-music (post-)production such as multitrack mixing for stereo imaging. This article aims to demonstrate the use of this declarative approach to achieve a well-balanced mix. A knowledge base is compiled with rules and constraints extracted from the literature about what professional music producers and audio engineers suggest for creating a good mix. More specifically, this work can deliver both a mixed audio file (mixdown) and a mixing plan in (human-readable) text format, to serve as a starting point for producers and audio engineers to apply this methodology in their productions. Finally, this article presents a decibel (dB) scale and a panning scale to explain how the mixes are generated.
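Purely as an illustration of the two scales mentioned at the end of the abstract (not of the paper's specific rules or its ASP encoding), a dB-to-gain conversion and a constant-power pan law can be sketched as follows; the example mixing-plan line in the comment is hypothetical.

```python
import math

def db_to_amplitude_ratio(level_db):
    """Convert a decibel level to a linear amplitude ratio (0 dB -> 1.0)."""
    return 10.0 ** (level_db / 20.0)

def constant_power_pan(position):
    """Map a pan position in [-1, 1] (hard left .. hard right) to (left, right)
    gains while keeping the overall power roughly constant across the image."""
    theta = (position + 1.0) * math.pi / 4.0       # 0 .. pi/2
    return math.cos(theta), math.sin(theta)

# Rendering a hypothetical plan entry "vocals: -3 dB, pan 0.25" into channel gains:
# g = db_to_amplitude_ratio(-3.0)
# l, r = constant_power_pan(0.25)
# left_gain, right_gain = g * l, g * r
```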
Keywords
Answer Set Programming, Automatic Mix, Declarative Language, Logic Programming, Multitrack Mixing, Music Production
Paper topics
High-performance computing for audio
Easychair keyphrases
answer set programming [15], audio engineering society convention [10], lead instrument [9], audio engineering society [7], integrity constraint [7], low level feature [6], answer set [5], mixing process [5], amplitude ratio [4], audio engineer [4], intelligent system [4], knowledge base [4], knowledge representation [4], logic programming [4], mixdown file [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401981
Zenodo URL: https://zenodo.org/record/1401981
Abstract
In this paper we propose to extend the concept of the Internet of Things to the musical domain, leading to a subfield coined the Internet of Musical Things (IoMUT). IoMUT refers to the network of computing devices embedded in physical objects (Musical Things) dedicated to the production and/or reception of musical content. Musical Things, such as smart musical instruments or smart devices, are connected by an infrastructure that enables multidirectional communication, both locally and remotely. The IoMUT digital ecosystem gathers interoperable devices and services that connect performers and audiences to support performer-performer and audience-performer interactions that were not possible beforehand. The paper presents the main concepts of IoMUT and discusses the related implications and challenges.
Keywords
Internet of Things, Networks, NIME
Paper topics
Interactive performance systems and new interfaces, Music performance, Virtual reality applications and technologies for sound and music
Easychair keyphrases
musical thing [26], smart instrument [16], audience member [14], musical instrument [10], low latency [8], music performance [8], smart device [8], musical interaction [7], interoperable device connecting performer [6], live music performance [6], musical expression [6], musical performance [6], networked music performance [6], real time [6], smart wearable [6], virtual reality [6], digital ecosystem [5], musical content [5], tactile internet [5], collaborative music creation [4], co located interaction [4], iomut digital ecosystem [4], iomut ecosystem [4], musical experience [4], physical object [4], sensus smart guitar [4], technological infrastructure [4], virtual environment [4], wireless communication [4], wireless sensor network [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401869
Zenodo URL: https://zenodo.org/record/1401869
Abstract
Live Coding is a movement in electronic audiovisual performance that emerged at the turn of the millennium and is now being performed all over the world through a range of artistic practices. It involves the continual process of constructing and reconstructing a computer program and, given that most performances are improvisational in nature, working together from multiple computers can be a challenge when performing as an ensemble. When performing as a group, Live Coders will often share network resources, such as a tempo clock to coordinate rhythmic information, but rarely will they work together directly on the same material. This paper presents the novel collaborative editing tool, Troop, which allows users to simultaneously work together on the same piece of code from multiple machines. Troop is not a Live Coding language but an environment that enables higher levels of communication within an existing language. Although written in Python to be used with the Live Coding language FoxDot, Troop provides an interface to other Live Coding languages, such as SuperCollider, and can be extended to include others. This paper outlines the motivations behind the development of Troop before discussing its applications as a performance tool, and concludes with a discussion of the potential benefits of the software in a pedagogical setting.
Keywords
Collaboration, Live Coding, Network Performance
Paper topics
Computer music languages and software, Interactive performance systems and new interfaces, Social interaction in sound and music computing
Easychair keyphrases
live coding [35], live coding language [12], text buffer [10], collaborative live coding [6], computer music [6], connected client [6], live coding ensemble [6], live coding environment [6], buffer networked live coding system [4], client server model [4], connected performer [4], creative constraint [4], execution model [4], text box [4], troop server [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401895
Zenodo URL: https://zenodo.org/record/1401895
Abstract
The attack phase of sound events plays an important role in how sounds and music are perceived. Several approaches have been suggested for locating salient time points and critical time spans within the attack portion of a sound, and some have been made widely accessible to the research community in toolboxes for Matlab. While some work exists where proposed audio descriptors are grounded in listening tests, the approaches used in two of the most popular toolboxes for musical analysis have not been thoroughly compared against perceptual results. This article evaluates the calculation of attack phase descriptors in the Timbre toolbox and the MIRtoolbox by comparing their predictions to empirical results from a listening test. The results show that the default parameters in both toolboxes give inaccurate predictions for the sound stimuli in our experiment. We apply a grid search algorithm to obtain alternative parameter settings for these toolboxes that align their estimations with our empirical results.
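As background for the kind of descriptor being evaluated, the sketch below computes a log-attack-time style value from an energy envelope: the logarithm of the time between the envelope first crossing a lower and an upper fraction of its maximum. The threshold fractions are exactly the kind of parameters a grid search would tune; 0.2 and 0.9 are illustrative defaults, not the defaults of either toolbox.

```python
import numpy as np

def log_attack_time(envelope, fs, start_frac=0.2, stop_frac=0.9):
    """Log-attack-time style descriptor from an energy envelope.

    envelope: non-negative energy envelope samples
    fs:       envelope sampling rate in Hz
    Returns log10 of the time (s) between the first crossings of
    start_frac and stop_frac of the envelope maximum.
    """
    env = np.asarray(envelope, dtype=float)
    env = env / np.max(env)
    t_start = np.argmax(env >= start_frac) / fs   # first crossing of lower threshold
    t_stop = np.argmax(env >= stop_frac) / fs     # first crossing of upper threshold
    return float(np.log10(max(t_stop - t_start, 1.0 / fs)))
```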
Keywords
Attack, Attack phase descriptors, Listening test, Matlab, MIRtoolbox, Onset, Perceptual attack time, Timbre toolbox
Paper topics
Analysis, synthesis, and modification of sound, Computational musicology and mathematical music theory, Computer-based music analysis, Computer music languages and software, Multi-modal perception and emotion, Music information retrieval
Easychair keyphrases
energy envelope [32], attack phase descriptor [31], timbre toolbox [29], sound event [23], perceptual attack [22], attack phase [19], attack time [18], perceptual attack time [15], time point [14], physical onset [13], attack range [11], perceptual result [9], click track [7], group delay [7], perceptual onset [7], salient time point [7], time span [7], cutoff frequency [6], energy peak [6], frame size [6], log attack time [6], mirex audio onset competition [6], default parameter [5], attack slope [4], default strategy [4], grid search algorithm [4], music information retrieval [4], onset detection [4], sound file [4], various attack phase descriptor [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401919
Zenodo URL: https://zenodo.org/record/1401919
Abstract
This paper describes a versioning and annotation system for supporting collaborative, iterative design of mapping layers for digital musical instruments (DMIs). First we describe the prior experiences and contexts of working on DMIs that have motivated such features in a tool, then describe the current prototype implementation, and finally discuss future work and features intended to improve the capabilities of tools both for building new musical instruments and for general interactive applications that involve designing mappings with a visual interface.
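A hypothetical sketch of the kind of record such a versioning/annotation layer might keep: each saved mapping is a snapshot with a parent version, an author and free-text annotations, so collaborators can branch, revisit and comment on mapping designs. All field names here are illustrative, not the prototype's actual data model.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MappingVersion:
    mapping: dict                   # e.g. {"accel.x": "filter.cutoff"}
    author: str
    parent: Optional[str] = None    # id of the version this one was derived from
    annotations: list = field(default_factory=list)
    id: str = field(default_factory=lambda: uuid.uuid4().hex[:8])
    created: float = field(default_factory=time.time)


history = {}


def commit(mapping, author, parent=None, note=None):
    """Store a new mapping snapshot and return its version id."""
    version = MappingVersion(mapping=dict(mapping), author=author, parent=parent)
    if note:
        version.annotations.append((author, note))
    history[version.id] = version
    return version.id
```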
Keywords
collaboration, mapping, new instruments
Paper topics
Computer music languages and software, Interactive performance systems and new interfaces
Easychair keyphrases
user interface [7], computer music [4], digital musical instrument [4], mapping design [4], mapping tool [4], system component [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401939
Zenodo URL: https://zenodo.org/record/1401939
Abstract
This work presents a novel virtual analog model of the Lockhart wavefolder. Wavefolding modules are among the fundamental elements of 'West Coast' style analog synthesis. These circuits produce harmonically rich waveforms when driven by simple periodic signals such as sine waves. Since wavefolding introduces high levels of harmonic distortion, we pay special attention to suppressing aliasing without resorting to high oversampling factors. The results obtained are validated against SPICE simulations of the original circuit. The proposed model preserves the nonlinear behavior of the circuit without perceivable aliasing. Finally, we propose a practical implementation of the wavefolder using multiple cascaded units.
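One widely used way to suppress aliasing in memoryless nonlinearities without heavy oversampling is first-order antiderivative antialiasing (ADAA). The sketch below applies it to a simple sine-shaped folding curve as a stand-in; it illustrates the antialiasing strategy only and is not the Lockhart circuit equations from the paper.

```python
import numpy as np


def f(x):
    """Memoryless folding nonlinearity (stand-in for the wavefolder curve)."""
    return np.sin(x)


def F(x):
    """Antiderivative of f, required by the first-order antialiased form."""
    return -np.cos(x)


def wavefold_adaa(x, eps=1e-6):
    """Process a signal through f using first-order ADAA."""
    y = np.empty_like(x)
    y[0] = f(x[0])
    x_prev = x[0]
    for n in range(1, len(x)):
        dx = x[n] - x_prev
        if abs(dx) > eps:
            y[n] = (F(x[n]) - F(x_prev)) / dx   # antialiased branch
        else:
            y[n] = f(0.5 * (x[n] + x_prev))     # ill-conditioned: fall back to f
        x_prev = x[n]
    return y


# Drive the folder with a loud sine; higher input gain means more folds
# and a richer (and, without ADAA, more aliased) spectrum.
sr = 44100
t = np.arange(sr) / sr
out = wavefold_adaa(5.0 * np.sin(2 * np.pi * 440 * t))
```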
Keywords
Acoustic signal processing, Audio processing, Audio synthesis, Circuit simulation, Virtual analog modeling
Paper topics
Analysis, synthesis, and modification of sound
Easychair keyphrases
digital audio effect [20], lockhart wavefolder [12], fundamental frequency [7], lambert w function [7], antialiased form [6], computer music conf [6], lockhart wavefolder circuit [6], west coast [6], collector diode [5], emitter current [5], magnitude spectrum [5], real time [5], signal process [5], spice simulation [5], wavefolder model [5], aliasing distortion [4], audio rate [4], digital implementation [4], digital model [4], low frequency [4], moog ladder filter [4], post gain [4], west coast style [4], west coast synthesis [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401957
Zenodo URL: https://zenodo.org/record/1401957
Abstract
In the 1950s and 1960s, steel plates were widely used to add reverberation to sound. As a plate reverb itself is quite bulky and requires a lot of maintenance, a digital implementation is desirable. Currently, the available (digital) plugins rely solely on recorded impulse responses or simple delay networks. Virtual Analog (VA) simulations, on the other hand, rely on a model of the analog effect they are simulating, resulting in the sound and 'feel' of the classic analog effect. In this paper, a VA simulation of plate reverberation is presented. Not only does this approach result in a very natural-sounding reverb, it also opens up many interesting possibilities that go beyond what is physically realisable. Existing VA solutions, however, offer limited control over the dynamics of physical parameters. We present a model in which parameters such as the positions of the inputs and outputs and the dimensions of the plate can be changed while sound flows through the plate. This results in unique flanging and pitch-bend effects, respectively, which have not yet been achieved by the current state of the art.
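A minimal modal sketch of the idea behind dynamic plate reverberation: each plate mode is treated as a damped resonator excited at a fixed input position, while the output pickup position moves as audio flows, producing the flanging-like behaviour described above. The plate constants, mode counts and modulation rate below are illustrative assumptions, not the parameters of the model in the paper.

```python
import numpy as np

sr = 44100
Lx, Ly = 2.0, 1.0   # plate dimensions (m), illustrative
c = 2.0             # stiffness constant sqrt(D / (rho * h)), illustrative
M, N = 12, 12       # number of modes kept per dimension
T60 = 3.0           # decay time applied to all modes (s)

m = np.arange(1, M + 1)[:, None]
n = np.arange(1, N + 1)[None, :]
# Modal frequencies (rad/s) of a simply supported Kirchhoff plate; with these
# illustrative constants all modes lie well below the Nyquist frequency.
omega = c * ((m * np.pi / Lx) ** 2 + (n * np.pi / Ly) ** 2)


def shape(x, y):
    """Mode shapes of a simply supported plate evaluated at the point (x, y)."""
    return np.sin(m * np.pi * x / Lx) * np.sin(n * np.pi * y / Ly)


def plate_reverb(dry, x_in=(0.4, 0.6)):
    """Run dry audio through the modal plate while the output position sweeps
    slowly back and forth, modulating each mode's contribution over time."""
    decay = np.exp(-6.91 / (T60 * sr))      # per-sample amplitude decay
    pole = decay * np.exp(1j * omega / sr)  # complex resonator poles
    state = np.zeros((M, N), dtype=complex)
    gain_in = shape(*x_in)
    out = np.empty(len(dry))
    for i, s in enumerate(dry):
        # Output pickup sweeps along the plate at roughly 0.2 Hz.
        x_out = (0.5 + 0.4 * np.sin(2 * np.pi * 0.2 * i / sr)) * Lx
        state = pole * state + gain_in * s
        out[i] = np.sum(state.imag * shape(x_out, 0.5 * Ly))
    return out / (M * N)
```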
Keywords
audio effect, plate reverb, virtual analog
Paper topics
Analysis, synthesis, and modification of sound
Easychair keyphrases
plate reverb [21], pa1 dynamic plate reverb [10], plate reverberation [10], loss coefficient [7], pitch bend effect [6], radiation damping [6], boundary condition [5], computational time [4], dry input signal [4], emt140 plate reverb [4], modal shape [4], output sound [4], plate dimension [4], porous medium [4], steel plate [4], thermoelastic damping [4], update equation [4], virtual analog [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401951
Zenodo URL: https://zenodo.org/record/1401951
Abstract
Reverberation is a sonic effect that has a profound impact on music. Its implications extend to many levels, such as musical composition, musical performance and sound perception, and it has in fact nurtured the sonority of certain musical styles (e.g. plainchant). Such a relationship was possible because the reverberation of concert halls is stable (i.e. it does not vary drastically). However, what implications arise for music composition and music performance when reverberation is variable? How can music be composed and performed for situations in which reverberation is constantly changing? This paper describes Wallace, a digital software application developed to make a given audio signal flow across different impulse responses (IRs). Two pieces composed by the author using Wallace will be discussed and, lastly, some viewpoints on composing music for variable reverberation, particularly using Wallace, will be addressed.
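One simple way to make a signal flow "across" impulse responses is to convolve the dry audio with two IRs and crossfade between the wet signals over time. The sketch below illustrates only that underlying idea; Wallace itself is a Max/MSP application, and its actual convolution scheme is not reproduced here.

```python
import numpy as np
from scipy.signal import fftconvolve


def variable_reverb(dry, ir_a, ir_b, sr, sweep_seconds=10.0):
    """Crossfade from the reverberation of ir_a to that of ir_b."""
    wet_a = fftconvolve(dry, ir_a)
    wet_b = fftconvolve(dry, ir_b)
    n = max(len(wet_a), len(wet_b))
    wet_a = np.pad(wet_a, (0, n - len(wet_a)))
    wet_b = np.pad(wet_b, (0, n - len(wet_b)))
    # Equal-power crossfade that completes after sweep_seconds.
    mix = np.clip(np.arange(n) / (sweep_seconds * sr), 0.0, 1.0)
    return np.cos(0.5 * np.pi * mix) * wet_a + np.sin(0.5 * np.pi * mix) * wet_b
```

Chaining several such crossfades, or interpolating among more than two IRs, would let the reverberation change continuously throughout a piece.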
Keywords
Composition, Impulse Response, MaxMSP, Reverberation
Paper topics
Computer music languages and software, Interactive performance systems and new interfaces, Music performance, Room acoustics modeling and auralization
Easychair keyphrases
concert hall [12], reverberation scheme [9], variable reverberation [9], sound result [7], sound source [7], audio signal [6], composing music [6], natural reverberation [5], convolution process [4], music composition [4], sobre espaco [4], variaco sobre [4]
Paper type
unknown
DOI: 10.5281/zenodo.1401945
Zenodo URL: https://zenodo.org/record/1401945