Colloquium talk at CUNY Graduate Center, March 23rd

Title: Selection dynamics as grammar-enabling neural computation: evidence from phonological alternations, social accommodation, and leaky prosody

Abstract: Spoken language, although often characterized as structured configurations of discrete linguistic units, has a continuous substrate in the brain and in the speech signal. Formal linguistic theories and the language descriptions on which they are built often fruitfully abstract away from these continuous dimensions. However, there are also linguistic phenomena which fall outside the scope of purely discrete theories, indicating that a complete account requires integrating discrete and continuous aspects of linguistic cognition. In this talk, I’ll argue that the discrete appearance of linguistic units follows from non-linear dynamics of the neural substrate. In this vein, I’ll illustrate the concept of selection dynamics. Using Dynamic Field Theory (DFT) (e.g., Schöner & Spencer, 2016) as a framework, I’ll show how discrete behavior, i.e., selecting one category over another, can emerge from a continuous neural field representing a phonologically relevant dimension. I’ll then show that a specific implementation of selection dynamics in DFT can derive categorical alternations – the bread and butter of discrete formal theories of phonology – as well as phenomena that require integration of continuous dimensions, including socially-mediated phonetic convergence/divergence (joint work with Irene Yi and Claire Bowern) and leaky prosody (Tang & Shaw, 2021; Shaw & Tang, 2023).
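For readers unfamiliar with DFT, the selection regime can be sketched with a minimal one-dimensional Amari-style field: local excitation plus global inhibition forces the field to stabilize a single supra-threshold peak even when two inputs compete. The code below is purely illustrative background on the formalism, not the model presented in the talk; all parameter values (kernel widths, resting level, input strengths) are assumptions.

```python
import numpy as np

def simulate_selection(n=101, steps=2000, dt=1.0, tau=20.0):
    """Minimal 1-D dynamic neural field with selection dynamics:
    tau * du/dt = -u + h + s(x) + integral of w(x - x') f(u(x')) dx'."""
    x = np.arange(n, dtype=float)
    h = -5.0                        # resting level, below the 0 threshold
    u = np.full(n, h)               # field activation over the feature dimension

    # interaction kernel: local excitation plus global inhibition
    d = x[:, None] - x[None, :]
    w = 1.5 * np.exp(-d**2 / (2 * 5.0**2)) - 1.0

    # two localized inputs (two candidate categories); x=30 is stronger
    s = 6.0 * np.exp(-(x - 30.0)**2 / (2 * 4.0**2)) \
      + 5.0 * np.exp(-(x - 70.0)**2 / (2 * 4.0**2))

    for _ in range(steps):
        f = 1.0 / (1.0 + np.exp(-u))          # sigmoid output nonlinearity
        u += (dt / tau) * (-u + h + s + w @ f)
    return u

u = simulate_selection()
# the site with the stronger input forms a self-stabilized peak;
# global inhibition holds the competitor below threshold
print("selected site:", int(np.argmax(u)))
```

Because inhibition is global, the field cannot sustain two peaks at once: whichever site crosses threshold first suppresses the other, yielding discrete, categorical behavior from a fully continuous substrate.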


Schöner, G., & Spencer, J. P. (2016). Dynamic thinking: A primer on dynamic field theory. Oxford University Press.

Tang, K., & Shaw, J. A. (2021). Prosody leaks into the memories of words. Cognition, 210, 104601.

Shaw, J. A., & Tang, K. (2023). A dynamic neural field model of leaky prosody: proof of concept.

UCL talk, Feb 11th

Talk at University College London Speech Science Forum. Link is here. Slides for the talk are here. Recording is available here (Access Passcode: 71&Gx#4E)

Gestural coordination in the living lexicon of spoken words

Language varieties show variety-specific patterns of gestural coordination, where gestures are forces (dynamics) that exert control over articulatory movements (kinematics), see, e.g., Browman & Goldstein (1986). By hypothesis, the dimensions of gestural control are those that serve phonological function, e.g., supporting contrast in the lexicon.
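As background to the framework, a gesture in Articulatory Phonology is standardly modeled as a critically damped point attractor over a tract variable (e.g., lip aperture): the articulator is driven toward a target and arrives without overshoot. The sketch below uses illustrative parameter values, not those of any study discussed here.

```python
import numpy as np

def gesture_trajectory(x0, target, k=100.0, dt=0.001, dur=1.0):
    """One gesture as a critically damped point attractor:
    x'' = -k (x - target) - b x', with b = 2*sqrt(k) (critical damping),
    so the tract variable approaches the target without overshoot."""
    b = 2.0 * np.sqrt(k)
    x, v = x0, 0.0
    xs = []
    for _ in range(int(dur / dt)):
        a = -k * (x - target) - b * v   # restoring force + damping
        v += a * dt                     # semi-implicit Euler step
        x += v * dt
        xs.append(x)
    return np.array(xs)

# e.g., lip aperture (mm) closing toward a bilabial target of 0
traj = gesture_trajectory(x0=10.0, target=0.0)
```

Gestural coordination, in this picture, amounts to specifying when such dynamical systems are activated relative to one another; it is those activation (timing) relations that the studies below probe.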

I start by illustrating this point with a comparison of Russian palatalized consonants, e.g., /pʲ/, /bʲ/, /mʲ/, with articulatorily similar English sequences /pj/, /bj/, /mj/. High temporal resolution articulatory tracking, using Electromagnetic Articulography (EMA), reveals systematic differences in coordination corresponding to their differing phonological functions: complex segments (Russian) vs. segment sequences (English).

I next present cases in which linguistic context conditions systematic changes in gestural coordination. First, in Tokyo Japanese, high vowel devoicing can trigger the categorical loss of a lingual gesture for the vowel and subsequent reorganization of gestural coordination (Shaw & Kawahara 2018, 2021). Second, in Mandarin Chinese, certain morpho-syntactic environments condition a shift in gestural timing, which shortens syllable duration and precipitates a loss of lexical tone. This last case is particularly informative when compared with diaspora Tibetan, where tone loss has proceeded without gestural reorganization (Geissler et al., 2021). These patterns are consistent with a characterization of the human lexicon in terms of a relatively small number of gestures and coordination modes, organized to support phonological function and sensitive to linguistic context.

I close by presenting two additional cases, also drawn from Mandarin and Japanese, that challenge the completeness of this view of the lexicon, showing both (1) that the lexicon absorbs contextual prosodic influences, leading to gradient shifts in phonetic form (Tang & Shaw, 2021) and (2) that words can resist influences of prosodic context (Kawahara, Shaw, Ishihara, 2021). Taken together, the data suggest that a low dimensional characterization of the lexicon in terms of discrete gestures and coordination modes co-exists with a representation of higher dimensional phonetic parameterization.


Browman, C., & Goldstein, L. (1986). Towards an Articulatory Phonology. Phonology Yearbook, 3, 219-252.

Geissler, C., Shaw, J. A., Fang, H., & Tiede, M. (2021). Eccentric C-V timing across speakers of diaspora Tibetan with and without lexical tone contrasts. Proceedings of the 12th International Seminar on Speech Production, Yale University, 4 pgs.

Kawahara, S., Shaw, J.A., & Ishihara, S. (2021). Assessing the prosodic licensing of wh-in-situ in Japanese: A computational-experimental approach. Natural Language & Linguistic Theory.

Shaw, J. A., & Kawahara, S. (2018). The lingual articulation of devoiced /u/ in Tokyo Japanese. Journal of Phonetics, 66, 100-119.

Shaw, J. A., & Kawahara, S. (2021). More on the articulation of devoiced /u/ in Tokyo Japanese: effects of surrounding consonants. manuscript, Yale University and Keio University. 47 pgs.

Tang, K., & Shaw, J. A. (2021). Prosody leaks into the memories of words. Cognition, 210, 104601.


New Glossa paper

+Mandal, S., Best, C.T., Shaw, J.A., Cutler, A. 2020. Bilingual phonology in dichotic perception: A case study of Malayalam and English voicing. Glossa: a journal of general linguistics 5(1):73. 1-17. DOI:

Abstract: Listeners often experience cocktail-party situations, encountering multiple ongoing conversations while tracking just one. Capturing the words spoken under such conditions requires selective attention and processing, which involves using phonetic details to discern phonological structure. How do bilinguals accomplish this in L1-L2 competition? We addressed that question using a dichotic listening task with fluent Malayalam-English bilinguals, in which they were presented with synchronized nonce words, one in each language in separate ears, with competing onsets of a labial stop (Malayalam) and a labial fricative (English), both voiced or both voiceless. They were required to attend to the Malayalam or the English item, in separate blocks, and report the initial consonant they heard. We found that perceptual intrusions from the unattended to the attended language were influenced by voicing, with more intrusions on voiced than voiceless trials. This result supports our proposal for the feature specification of consonants in Malayalam-English bilinguals, which makes use of privative features, underspecification and the “standard approach” to laryngeal features, as against “laryngeal realism”. Given this representational account, we observe that intrusions result from phonetic properties in the unattended signal being assimilated to the closest matching phonological category in the attended language, and are more likely for segments with a greater number of phonological feature specifications.



New JASA paper

Shaw, J. A., & Tyler, M. D. (2020). Effects of vowel coproduction on the timecourse of tone recognition. The Journal of the Acoustical Society of America, 147(4), 2511-2524. pdf

Abstract: Vowel contrasts tend to be perceived independently of pitch modulation, but it is not known whether pitch can be perceived independently of vowel quality. This issue was investigated in the context of a lexical tone language, Mandarin Chinese, using a printed word version of the visual world paradigm. Eye movements to four printed words were tracked while listeners heard target words that differed from competitors only in tone (test condition) or also in onset consonant and vowel (control condition). Results showed that the timecourse of tone recognition is influenced by vowel quality for high, low, and rising tones. For these tones, the time for the eyes to converge on the target word in the test condition (relative to control) depended on the vowel with which the tone was coarticulated, with /a/ and /i/ supporting faster recognition of high, low, and rising tones than /u/. These patterns are consistent with the hypothesis that tone-conditioned variation in the articulation of /a/ and /i/ facilitates rapid recognition of tones. The one exception to this general pattern—no effect of vowel quality on falling tone perception—may be due to fortuitous amplification of the harmonics relevant for pitch perception in this context.

Talk at BLS

My talk at the Berkeley Linguistics Society workshop “Phonological representations: at the crossroad between gradience and categoricity” (Feb 7-8) was entitled: Finding phonological structure in vowel confusions across English accents. The talk draws a connection between some collaborative work on cross-accent speech perception (Shaw et al. 2018, 2019) and contrastive feature hierarchies, in the sense of Dresher (2009).

The slides are available here.




New paper in Frontiers

“Spatially Conditioned Speech Timing: Evidence and Implications” is part of the Frontiers research topic “Models and Theories of Speech Production”. The paper provides evidence that the temporal coordination of articulatory gestures in speech is sensitive to the moment-by-moment location of speech organs (tongue, lips), a result which has implications for mechanisms of speech motor control, including the balance between feed-forward and state-based feedback control.


Patterns of relative timing between consonants and vowels appear to be conditioned in part by phonological structure, such as syllables, a finding captured naturally by the two-level feedforward model of Articulatory Phonology (AP). In AP, phonological form – gestures and the coordination relations between them – receives an invariant description at the inter-gestural level. The inter-articulator level actuates gestures, receiving activation from the inter-gestural level and resolving competing demands on articulators. Within this architecture, the inter-gestural level is blind to the location of articulators in space. A key prediction is that inter-gestural timing is stable across variation in the spatial position of articulators. We tested this prediction by conducting an Electromagnetic Articulography (EMA) study of Mandarin speakers producing CV monosyllables, consisting of labial consonants and back vowels in isolation. Across observed variation in the spatial position of the tongue body before each syllable, we investigated whether inter-gestural timing between the lips, for the consonant, and the tongue body, for the vowel, remained stable, as is predicted by feedforward control, or whether timing varied with the spatial position of the tongue at the onset of movement. Results showed a correlation between the initial position of the tongue gesture for the vowel and C-V timing, indicating that inter-gestural timing is sensitive to the position of the articulators, possibly relying on somatosensory feedback. Implications of these results and possible accounts within the Articulatory Phonology framework are discussed.

Shaw, J. A., & Chen, W.-r. (2019). Spatially Conditioned Speech Timing: Evidence and Implications. Frontiers in psychology, 10(2726). doi:10.3389/fpsyg.2019.02726

AMP talk & poster: Oct 11

I’ll be presenting a couple of research projects at the Annual Meeting on Phonology (AMP).

Titles and links to abstracts are below:

Poster: Kevin Tang (University of Florida) and Jason Shaw (Yale University). Sentence prosody leaks into the lexicon: evidence from Mandarin Chinese

Talk: Shigeto Kawahara (Keio University), Jason Shaw (Yale University) and Shinichiro Ishihara (Lund University). Do Japanese speakers always prosodically group wh-elements and their licenser? Implications for Richards’ (2010) theory of wh-movement