The tentative programme is below.
Event Time Slot
Registration 08:30 to 09:00
Workshop Introduction 09:00 to 09:20
Forenoon Session: Detecting mental states 09:20 to 12:00
Keynote speech: Understanding affective expressions and experiences through behavioral machine intelligence
Prof. Shrikanth (Shri) Narayanan
Behavioral signals in the audio and visual modalities available in speech, spoken language and body language offer a window into decoding not just what one is doing but how one is thinking and feeling. At the simplest level, this could entail determining who is talking to whom about what and how using automated audio and video analysis of verbal and nonverbal behavior. Computational modeling can also target more complex, higher level constructs, like the expression and processing of emotions. Behavioral signals combined with physiological signals such as heart rate, respiration and skin conductance offer further possibilities for understanding the dynamic cognitive and affective states in context. Machine intelligence could also help detect, analyze and model deviation from what is deemed typical. This talk will focus on multimodal bio-behavioral sensing, signal processing and machine learning approaches to computationally understand aspects of human affective expressions and experiences. It will draw upon specific case studies to illustrate the multimodal nature of the problem in the context of both vocal encoding of emotions in speech and song, as well as processing of these cues by humans.
09:20 to 10:00
Coffee break 10:00 to 10:20
Oral presentations (20 minutes each, including Q&A)
  • Prosodic and voice quality analyses of loud speech: differences of hot anger and far-directed speech
  • A protocol for collecting speech data with varying degrees of trust
  • No Sample Left Behind: Towards a Comprehensive Evaluation of Speech Emotion Recognition Systems
  • Towards Feature-Space Emotional Speech Adaptation for TDNN based Telugu ASR systems
  • Detection of emotional states of OCD patients in an exposure-response prevention therapy scenario
  • New Features for Speech Activity Detection
  • Transformation of voice quality in singing using glottal source features
10:20 to 12:40
Lunch 12:40 to 13:40
Afternoon Session: Influencing mental states 13:40 to 16:20
Recorded presentations (20 minutes)
  • A Successive Difference Feature for Detecting Emotional Valence from Speech
  • Visualizing Carnatic music as projectile motion in a uniform gravitational field
13:40 to 14:00
Keynote speech: Everyday Features for Everyday Listening
Prof. John Ashley Burgoyne
You are sitting on a commuter train. How many passengers are wearing headphones? What are they listening to? What else are they doing? Most importantly, amid the cornucopia of distractions, what exactly are they hearing? Much research in music cognition pits ‘musicians’, variously defined, against non-musicians. Recently, especially since the appearance of reliable measurement instruments for musicality in the general population (e.g., Müllensiefen et al., 2014), there has been growing interest in the space in between. Moreover, the ubiquity of smartphones has greatly enhanced the ability of techniques like gamification or Sloboda’s ‘experience sampling’ to reach this general population outside of a psychology lab. Music information retrieval (MIR) – and signal processing research more generally – can provide the last ingredients to understand what is happening between our commuters’ earbuds: everyday features for studying everyday listening. Since Aucouturier and Bigand’s 2012 manifesto on the poor interpretability of traditional DSP measures, clever dimensionality reduction paired with feature sets like those from the FANTASTIC (Müllensiefen and Frieler, 2006) or CATCHY (Van Balen et al., 2015) toolboxes have sought a middle ground. This talk will present several uses of everyday features from the CATCHY toolbox for studying everyday listening, most notably a discussion of the Hooked on Music series of experiments (Burgoyne et al., 2013) and a recent user study of thumbnailing at a national music service. In conclusion, it will outline some areas where MIR expertise can go further than just recommendation to learn about and engage with listeners during their daily musical activities.
14:00 to 14:40
Coffee break 14:40 to 15:00
Oral presentations (20 minutes each, including Q&A)
  • Using Shared Vector Representations of Words and Chords in Music for Genre Classification
  • MAD-EEG: an EEG dataset for decoding auditory attention to a target instrument in polyphonic music
  • Mining Mental States using Music Associations
  • Synaesthesia: How can it be used to enhance the audio-visual perception of music and multisensory design in digitally enhanced environments?
15:00 to 16:20
Workshop Conclusion 16:20 to 16:30