Behavioral signals in the audio and visual modalities, carried in speech, spoken language, and body language, offer a window into decoding not just what a person is doing but how they are thinking and feeling. At the simplest level, this could entail determining who is talking to whom, about what, and how, using automated audio and video analysis of verbal and nonverbal behavior. Computational modeling can also target more complex, higher-level constructs, such as the expression and processing of emotions. Behavioral signals combined with
physiological signals such as heart rate, respiration, and skin conductance offer further possibilities for understanding dynamic cognitive and affective states in context. Machine intelligence could also help detect, analyze, and model deviations from what is deemed typical. This talk
will focus on multimodal bio-behavioral sensing, signal processing, and machine learning approaches for computationally understanding aspects of human affective expressions and
experiences. It will draw upon specific case studies to
illustrate the multimodal nature of the problem in the
context of the vocal encoding of emotions in speech and song, as well as the processing of these cues by humans.