Cognition & Language Workshop - Dave Kleinschmidt

This is an Archive of a Past Event

ROBUST LANGUAGE COMPREHENSION: Recognize the familiar, generalize to the similar, and adapt to the novel

Abstract:
Anyone who has used an artificial speech recognition system knows that robust speech perception remains a difficult, unsolved problem, yet it is one that human listeners solve nearly effortlessly.  Speech perception requires that the listener map continuous, variable acoustic cues to underlying linguistic units like phonetic categories and words.  One of the substantial challenges that human listeners have to tackle is the lack of invariance: there is no single set of acoustic cue values that reliably indicates the presence of a particular linguistic structure.  The lack of invariance arises in large part because the relationship between cues and linguistic units varies substantially from one situation to another, owing to differences between individual talkers, registers, dialects, accents, etc.: one talker's /p/ may be more like another talker's /b/.

In this talk I will present a computational framework, the ideal adapter, which characterizes the computational problem posed by the lack of invariance and how it might be solved.  This framework naturally suggests three ways that listeners might achieve robust speech perception in the face of the lack of invariance: recognition of familiar situations/talkers, generalization to new situations/talkers similar to those encountered before, and rapid adaptation to novel situations/talkers.  All three of these strategies have been observed in the empirical literature, bearing out a range of qualitative predictions of the ideal adapter framework, as well as quantitative predictions of an implemented model within this framework.
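
As a rough illustration of the adaptation strategy described above, the following is a minimal sketch of belief updating about one talker's phonetic categories.  It is not the implemented model mentioned in the abstract.  Everything specific in it is an assumption made for illustration: a single cue (voice onset time, in ms), Gaussian categories with a known within-category variance, a Normal prior over each category mean, labeled exposure tokens, and the hypothetical names CategoryBelief and prob_p.

```python
import math
from dataclasses import dataclass


@dataclass
class CategoryBelief:
    """Listener's belief about one talker's category (illustrative assumption)."""
    mean: float       # current best guess of the category mean cue value (ms)
    mean_var: float   # uncertainty (variance) about that mean
    noise_var: float  # within-category variance, assumed known

    def update(self, cue: float) -> "CategoryBelief":
        # Conjugate Normal-Normal update after hearing one labeled token.
        precision = 1.0 / self.mean_var + 1.0 / self.noise_var
        new_var = 1.0 / precision
        new_mean = new_var * (self.mean / self.mean_var + cue / self.noise_var)
        return CategoryBelief(new_mean, new_var, self.noise_var)

    def predictive_density(self, cue: float) -> float:
        # Density of a cue value, averaging over uncertainty about the mean.
        var = self.mean_var + self.noise_var
        return math.exp(-((cue - self.mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def prob_p(cue: float, b: CategoryBelief, p: CategoryBelief, prior_p: float = 0.5) -> float:
    """Posterior probability that a cue value came from /p/ rather than /b/."""
    like_p = p.predictive_density(cue) * prior_p
    like_b = b.predictive_density(cue) * (1.0 - prior_p)
    return like_p / (like_p + like_b)


# Hypothetical starting beliefs, e.g. carried over from previously heard talkers.
b_belief = CategoryBelief(mean=0.0, mean_var=100.0, noise_var=100.0)
p_belief = CategoryBelief(mean=50.0, mean_var=100.0, noise_var=100.0)

ambiguous_cue = 25.0
before = prob_p(ambiguous_cue, b_belief, p_belief)

# A novel talker produces /p/ with unusually short VOT (made-up value of 30 ms).
adapted_p = p_belief
for _ in range(10):
    adapted_p = adapted_p.update(30.0)

after = prob_p(ambiguous_cue, b_belief, adapted_p)
print(f"P(/p/ | 25 ms) before exposure: {before:.2f}, after exposure: {after:.2f}")
```

In this sketch, the same ambiguous cue becomes more likely to be heard as /p/ after exposure, because the listener's belief about this talker's /p/ has shifted toward the tokens just heard.  Under the same assumptions, recognition would correspond to reusing stored beliefs for a familiar talker, and generalization to starting from the beliefs held about similar talkers.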

Finally, this framework provides a unifying perspective on flexibility in language comprehension across different levels, and ties language comprehension together with other, more general perceptual processes that show similar adaptive properties.  These connections point to future directions for investigating how the computations necessary for robust speech perception might be carried out algorithmically and implemented in neural mechanisms.