Title and abstract of Stony Brook talk (Friday, Feb 24):
Linking phonology and phonetics in the frequency domain
Attempts to link phonetic and phonological form face two fundamental challenges. The first is the constancy-variability problem: how to capture the range of phonetic actuations of a constant phonological form. The second is the fidelity problem: how low-fidelity phonological forms map to high-fidelity phonetic forms. In this talk, we present an integrated approach to these problems. The core innovation is a transformation of high-fidelity phonetic data into a low-fidelity frequency domain (cosine components). Using the Discrete Cosine Transform (DCT), we express continuous movement of the tongue as a small number of frequency modulations that map to phonologically specified vocal tract constrictions while effectively preserving fine phonetic detail, i.e., millisecond-to-millisecond changes in spatial position over time. The DCT thus addresses the fidelity problem. To address the constancy-variability problem, we apply stochastic sampling techniques from the micro-prosody literature (e.g., Shaw & Davidson, 2011; Shaw, Gafos, Hoole, & Zeroual, 2011; Shaw & Gafos, 2015; Shaw, Gafos, Hoole, & Zeroual, 2009) to frequency components, effectively transforming phonological hypotheses into the (realistically variable) physical dimensions of phonetic form.
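The two steps described above can be illustrated with a minimal sketch. It is not the analysis pipeline used in the talk: the trajectory is synthetic, and the names `truncate` and `sample_trajectory`, the number of retained coefficients, and the noise level `sd` are all assumptions chosen for illustration. The orthonormal DCT-II is written out in plain Python so the example is self-contained; in practice a library routine such as `scipy.fft.dct` would normally be used instead.

```python
import math
import random

def dct(x):
    """Orthonormal DCT-II: time samples -> cosine-component coefficients."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct above: coefficients -> time samples."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (t + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def truncate(coeffs, n_keep):
    """Keep the first n_keep coefficients and zero out the rest."""
    return coeffs[:n_keep] + [0.0] * (len(coeffs) - n_keep)

def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

def sample_trajectory(mean_coeffs, sd, n_keep, rng):
    """Draw one simulated trajectory by adding Gaussian noise to the
    retained coefficients (hypothetical noise model, for illustration)."""
    noisy = [rng.gauss(c, sd) if k < n_keep else 0.0
             for k, c in enumerate(mean_coeffs)]
    return idct(noisy)

# Synthetic stand-in for a tongue-height trajectory: a rise and fall
# toward a constriction (100 samples, arbitrary units).
n = 100
trajectory = [5.0 + 3.0 * math.exp(-((t - 50) / 15.0) ** 2) for t in range(n)]

coeffs = dct(trajectory)

# Fidelity: reconstruction error when only a few coefficients are kept.
err_2 = rms_error(trajectory, idct(truncate(coeffs, 2)))
err_6 = rms_error(trajectory, idct(truncate(coeffs, 6)))

# Variability: simulate realistically variable tokens of the same form.
rng = random.Random(0)
simulated = [sample_trajectory(coeffs, sd=0.5, n_keep=6, rng=rng)
             for _ in range(3)]
```

Because the transform is orthonormal, keeping more coefficients can never increase the reconstruction error, so `err_6` is no larger than `err_2`. How many coefficients are enough, and how the noise on each coefficient should be estimated, are empirical questions this sketch does not settle.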
To demonstrate the approach, we take up the issue of phonetic underspecification (e.g., Archangeli, 1988; Keating, 1988), or in more neutral terms, apparent phonetic targetlessness, asking whether the phonetic signal provides evidence for the presence or absence of a phonological feature. The crux of phonetic arguments for targetlessness is often linear interpolation between flanking segments. Consider a phonological sequence ABC, where the feature specification of B is in question. Whether observed in the domain of intonation (Pierrehumbert & Beckman, 1988: 37-38), vowels (Browman & Goldstein, 1992; Lammert, Goldstein, Ramanarayanan, & Narayanan, 2014), or consonants (Cohn, 1993; Keating, 1988), “linear interpolation” on the relevant phonetic dimension between A and C constitutes an argument for the targetlessness of B. But how linear is linear? Rigorous assessment of linear interpolation faces both the constancy-variability problem and the fidelity problem. How do we decide whether an observed deviation from linearity is anything more than noisy actuation of a linear trajectory? More specifically, taking the case of ABC again, how do we distinguish complete targetlessness of B from (heavy) reduction of B? How do we know that B is truly targetless rather than just heavily susceptible to coarticulation with surrounding segments (cf. Recasens & Espinosa, 2009)? We demonstrate how transformations to frequency space bring clarity to these issues, revealing phonological patterns in variable phonetic data.
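The ABC comparison can be sketched in frequency space as follows. This is an illustrative toy under stated assumptions, not the method presented in the talk: the endpoint positions, the size of the hypothetical target for B, and the helper names `linear_interpolation` and `low_order_distance` are all invented for the example. The "targetless" hypothesis is a straight line from A to C; the alternative adds a displacement toward a target during B. Both are compared on their low-order DCT coefficients.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a sequence (plain-Python implementation)."""
    n = len(x)
    return [
        (math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
        * sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        for k in range(n)
    ]

def linear_interpolation(start, end, n):
    """Straight-line trajectory from start to end over n samples."""
    return [start + (end - start) * t / (n - 1) for t in range(n)]

def low_order_distance(traj_a, traj_b, n_keep=4):
    """Euclidean distance between the first n_keep DCT coefficients."""
    ca, cb = dct(traj_a)[:n_keep], dct(traj_b)[:n_keep]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ca, cb)))

n = 100
a_pos, c_pos = 2.0, 8.0  # hypothetical positions at A and C (arbitrary units)

# Hypothesis 1: B is targetless, so the trajectory is a line from A to C.
targetless = linear_interpolation(a_pos, c_pos, n)

# Hypothesis 2: B has a target that displaces the trajectory by up to
# 3 units from the line, peaking mid-sequence (hypothetical magnitude).
with_target = [y + 3.0 * math.sin(math.pi * t / (n - 1))
               for t, y in enumerate(targetless)]

d_null = low_order_distance(targetless, targetless)
d_target = low_order_distance(with_target, targetless)
```

A distance of zero, as in `d_null`, is what a noiseless targetless token would yield; `d_target` is clearly positive. Deciding whether a distance observed in real data exceeds what noisy actuation of a linear trajectory would produce requires a model of variability in the coefficients, which this sketch leaves out.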
The empirical domain of our demonstration is high vowel devoicing in Japanese. A classic description of the facts is that high vowels are devoiced between two voiceless consonants and after a voiceless consonant before a pause, but there is debate about whether these vowels are phonologically deleted, i.e., “targetless”, or merely devoiced (for a recent and comprehensive review, see Fujimoto, 2015). To resolve this issue, we collected Electromagnetic Articulography data on the trajectory of tongue movements during voiced and voiceless vowel productions from six speakers of Tokyo Japanese. Analysed within the computational framework described above, these data provide a clear answer while elucidating some previously unknown phonological conditions under which devoiced vowels lack lingual articulatory targets.
References
Archangeli, D. (1988). Aspects of underspecification theory. Phonology, 5, 183-208.
Browman, C. P., & Goldstein, L. (1992). ‘Targetless’ schwa: An articulatory analysis. In G. Docherty & R. Ladd (Eds.), Papers in Laboratory Phonology II: Gesture, Segment, Prosody (pp. 26-56). Cambridge: Cambridge University Press.
Cohn, A. C. (1993). Nasalisation in English: phonology or phonetics. Phonology, 10(1), 43-81.
Fujimoto, M. (2015). Chapter 4: Vowel devoicing. In H. Kubozono (Ed.), The handbook of Japanese phonetics and phonology. Berlin: Mouton de Gruyter.
Keating, P. (1988). Underspecification in phonetics. Phonology, 5, 275-292.
Lammert, A., Goldstein, L., Ramanarayanan, V., & Narayanan, S. (2014). Gestural control in the English past-tense suffix: an articulatory study using real-time MRI. Phonetica, 71(4), 229-248.
Pierrehumbert, J., & Beckman, M. (1988). Japanese Tone Structure. Cambridge, Mass.: MIT Press.
Recasens, D., & Espinosa, A. (2009). An articulatory investigation of lingual coarticulatory resistance and aggressiveness for consonants and vowels in Catalan. The Journal of the Acoustical Society of America, 125(4), 2288-2298.
Shaw, J. A., & Davidson, L. (2011). Perceptual similarity in input–output mappings: A computational/experimental study of non-native speech production. Lingua, 121(8), 1344-1358.
Shaw, J. A., Gafos, A., Hoole, P., & Zeroual, C. (2011). Dynamic invariance in the phonetic expression of syllable structure: a case study of Moroccan Arabic consonant clusters. Phonology, 28(3), 455-490.
Shaw, J. A., & Gafos, A. I. (2015). Stochastic Time Models of Syllable Structure. PLoS One, 10(5), e0124714.
Shaw, J. A., Gafos, A. I., Hoole, P., & Zeroual, C. (2009). Syllabification in Moroccan Arabic: evidence from patterns of temporal stability in articulation. Phonology, 26, 187-215.