Global Englishes, Rhyme, and Rap: A Meditation Upon Shifts in Rhythm

This essay considers how the Somali-born hip-hop artist K’naan occasionally uses rhymes that embody a slight but perceptually noticeable shift in the rhythms of global Englishes. Our verse prosody is being reshaped by the rhythmic contours of speakers who bring the prosody of their first language to bear upon their rhythmicization of English. This is no matter of local or virtuosic performance but a structural shift in the texture of our language.

On Rhyme
David Caplan
Presses Universitaires de Liège, 2017
Liège, Belgium

For centuries, sophisticated readers and performers of rhyme have derived pleasure from the ways in which innovative poets and lyricists have simply altered the tempo—either of the spoken language, the implicit prosody of language on the page, or the performance of language to a beat, as in folk song or light opera—to create novel rhyme pairs in which the number of syllables or the lexical boundaries within a linguistic string are compressed or expanded to rhyme with another. Probably the best-known example from canonical literature is Lord Byron's rhyme from Don Juan, “But—Oh! ye lords of ladies intellectual, / Inform us truly, have they not hen-pecked you all?”[1]

Of course, in recent usage virtuosic examples of multisyllabic rhymes and their variants (e.g., apocopated rhyme; mosaic rhyme)[2] come as much from spoken-word poets and rap artists as from poets writing for the page; as just one instance, consider the following verbally dexterous chain rhyme, which extends a single rhyme over a succession of lines by deleting word boundaries and even consonants: “tenfold/rental/essential/utensils/confidential/continental/compliment all/competent souls.”[3] Indeed, rap's ability to reshape what counts as a phonemic equivalence has attracted significant attention in Adam Bradley's Book of Rhymes: The Poetics of Hip Hop and David Caplan's Rhyme's Challenge: Hip Hop, Poetry, and Contemporary Rhyming Culture.

This essay does not attempt to replicate those discussions. Instead, it adds an important but lateral note, suggesting that some of the more intriguing, albeit less flamboyant, rhyme pairs in recent global rap demonstrate evidence of the tug between stress-timed and mora- or syllable-timed rhythms in global Englishes (where the first tends to reduce unstressed syllables and unstressed vowels, the others do not). In particular, it considers how the Somali-born hip-hop artist K’naan occasionally uses rhymes that embody a slight but perceptually noticeable shift in the rhythms of global Englishes. Such a shift in and of itself hardly appears to be significant, and it is easy to dismiss these instances as examples of wrenched accent or performativity.[4] However, a mix of historical precedent (e.g., the loss of final –e in Middle English) together with contemporary research regarding the pronunciation of global Englishes by L2 speakers suggests that a different, potentially structural explanation is needed. Such an explanation may well suggest that our verse prosody, indeed, one of our most stalwart prosodic devices, is being reshaped by the rhythmic contours of speakers who bring the prosody of their first language to bear upon their rhythmicization of English. As I will argue, this is no matter of local or virtuosic performance but a structural shift in the texture of our language.

Before I make such an argument, let me offer a quick caveat: the phenomenon I am remarking—rhyme pairs that display unreduced or full vowels—is only a minor part of K’naan's, and presumably other global rap artists', prosodic repertoires. In point of fact, it is the fairly unremarkable nature of such rhymes that is of interest, for they challenge the very definition of rhyme,[5] which has long been taken to involve not only a certain phonemic overlap in the nucleus and coda of the syllables in question but also a designation of stress on these syllables. It is the presence of stress—indeed, the very necessity—that I wish here to bring into question. To do that, though, I should first lay out why such an argument needs to be made.

Some Relations of Rhyme and Stress

It isn’t necessarily intuitive that rhyme need be connected to stress. Presumably, there could be unstressed syllables that share a phonemic coincidence (e.g., honey/motley, an example given in the Princeton Encyclopedia of Poetry and Poetics) or a pair of words in which an unstressed syllable rhymes with a stressed syllable (e.g., “Seth”/“Elizabeth,” as in John Hollander’s verse definition of rhyme[6]). Indeed, one could plausibly argue for a likely co-occurrence rather than an underlying requirement. That is, since rhyme adds attention to the syllable in question, the syllable is either more likely to receive secondary stress (especially in American English) or, in the context of a metrical poem, to receive metrical ictus, or both. In Hollander’s example, for instance, the vowel in the final syllable of Elizabeth would quite probably reduce to schwa, /ə/, had we not been primed by the precedent of Seth in the prior line and in the context of a poem written in heroic couplets to expect the mid front lax vowel /ɛ/.

Yet defining statements about rhyme—in the West, and not just in the English language—make a crucial connection between rhyme and accent. For example, the Princeton Encyclopedia of Poetry and Poetics, 4th ed., asserts that “The emergence of rhyme in the West had to await the development of accent (Clark remarks that ‘the history of the adoption of rhyme is almost exactly parallel to, and contemporaneous with, the history of the substitution of accent for quantity’)”[7]. Annie Finch’s recent Poet’s Craft says the same: “While classical and medieval verse did not rhyme, two factors led rhyme to become firmly established throughout European poetry by the Renaissance. The first [and the one that interests us] was a change in the pronunciation of vowels so that Latin became pronounced as an accented language; rhyme helps emphasize the patterns of accent so rhyme and accent usually go together.”[8]

Indeed, as John Thompson observes, since “what poetry imitates is the structure of the language itself,”[9] different languages develop different versification systems, such that each system draws upon the fundamental elements of rhythm for that language. Thus, a language such as French, which is said to be syllable-timed, gets a syllabic meter, like the French alexandrine. By the same argument, mora-timed languages such as Japanese develop a moraic meter, like haiku, while stress-timed languages like German and English develop their respective stress-based meters, including the alliterative “strong-stress” meters of the older Germanic languages and the modern manifestations of the iambic pentameter.

It is worth noting, then, that rhyme is not conventionally a feature of either syllable- or mora-counting meters (although it can become an added ornament, as in Marianne Moore’s poetry). It is, however, strongly associated with accentual-syllabic meters, and that association may be as much structural as it is historical. Determining whether or not rap (which has variously been described as accentual verse, ballad meter, and so forth) may also be said to be accentual-syllabic is beyond the limits of this paper, but suffice it to say that rap in English not only prominently features rhyme: it presents “a dense field of rhyme” as well as “more extravagant rhyming, piling internal rhyme upon external and rhyming several consecutive lines.”[10] And rap, which speeds up and slows down pronunciation to park it in the beat, is widely regarded as so attached to stress accent that it has even been proposed as a means of teaching English word and sentence stress patterns to English language learners.[11]

The Phonological Dynamics of K’naan's Rhymes

It should be of some interest, then, to find evidence of rhyme pairs—in global rap or verse—where a form of distinction other than stress contributes to the phonemic identity of the lexical strings. Let me offer two examples. Both are from songs by K’naan, a Somali-born singer, songwriter, rapper, and poet who writes and performs in English, a second language.[12]

The first example—a particular rhyme within a larger chain rhyme—comes from the song "I Was Stabbed by Satan" from Dusty-Foot Philosopher, K’naan's first album:

They say “freeze” but there's only one comin’ out
There's two dead with a legal gun to his head
It's stupid, he shoulda played ball instead
Now sing it out[13]

Three out of the four rhyme words are straightforward: they involve either monosyllabic rhyme, i.e., rhyme of two stressed monosyllabic words (here, head/dead), or they involve mosaic rhyme, that is, rhyme of a monosyllabic word with the stressed syllable of a disyllabic word (here, head/instead). The rhyme that interests us is the mosaic rhyme on stupid, which could be said to wrench accent, that is, to “force . . . an accent onto a syllable that is not accented in order to conform to a meter”[14] so that the initially stressed disyllabic word stupid and the finally stressed disyllabic word instead would rhyme. But K’naan's performance does not alter the fundamental stress pattern of the word. Stress accent (i.e., word accent) still falls on the first syllable, but, unlike in American English, the high front lax vowel [ɪ] in the unstressed syllable fails to reduce to schwa, /ə/, as in [ˈstü-pəd].[15]

A similarly idiosyncratic rhyme pair can be found in a more recent song, “The Sound of My Breaking Heart,” from K’naan’s 2012 album, Country, God, or the Girl. The lyrics to its first stanza read:

Bombs are falling on me since you been gone
Shook my world feels like it's armageddon
It's not a earthquake ake or tsunami
What you hear is the sound of my breaking heart[16]

Here the rhyme of interest falls on the adverbial phrase, since you been gone, and the noun armageddon. According to standard American and British English dictionaries (American Heritage, Merriam Webster's, the OED), the vowel in the final unstressed syllable of armageddon either reduces to schwa, /ɑːməˈɡɛdən/,[17] or entirely disappears, /ärˊmə-gĕd′n/.[18] In the context of the song, however, as well as in K’naan's performance of it, the rhyme asks us to hear this final syllable as unstressed but not reduced; that is, as we are further primed by the exemplar in the stressed monosyllable, gone, we hear the word-final vowel as /ɑ/, instead of schwa or a syllabic consonant.[19]

It is easy to dismiss these phonetic idiosyncrasies as a matter of performance; or, if we don't listen carefully, to characterize them as instances of wrenched accent. However, work in phonetics and phonology suggests a more interesting possibility. For whereas stress-timed languages like English (which I'll treat as an abstract, stable singular entity for the moment) tend to reduce vowels of unstressed syllables, the non-reduction of vowels in unstressed syllables tends to be a feature of both so-called syllable-timed languages, like many East African languages, and some more typologically complex languages like Somali.[20] Indeed, one of the signal differences between so-called standard English and East African English, or EAfE (i.e., a regional variety of global English spoken by non-native speakers), lies in the pronunciation of vowels. Notably, there is a widespread tendency in EAfE for vowels to lengthen: this phenomenon is both quantitative,[21] and, as pertains to the two rhyme pairs in question, qualitative as well: thus, "usually short vowels in EAfE are longer and more peripheral than in RP,"[22] that is, Received Pronunciation.

This nonreduction of vowels is, indeed, found across African Englishes, likely an influence of similarities in the phonology of the substrate languages. Speaking of West African Englishes, but with points that extrapolate to East African English as well, Baugh and Cable observe in their survey of "English Worldwide" that “The rarity of the central vowel [ə] and of syllabic consonants accounts for the full value of vowels in the final syllables of words—for example, smoother [smuθa] . . . bottle [bɔtεl], lesson [lεsɔn]. The rarity of reduced vowels and weak forms is typical of syllable-timed languages such as those of West Africa, in contrast with the stress-timed rhythms of English.”[23]

Of course, one must be cautious about generalizations across regions, especially since Somali does not necessarily generalize with other substrate languages of East African Englishes. As a member of “a set of languages called lowland Eastern Cushitic spoken by peoples living in Ethiopia, Somalia, Djibouti, and Kenya,”[24] its phonology is likely distinct from the phonology of the major languages spoken in Tanzania, Kenya, etcetera, countries firmly identified with EAfE.[25] Nonetheless, Somali shares with these languages a signal trait: vowels in Somali do not reduce, and a fruitful comparison can be drawn between the non-reduction of the vowel in the final syllable of lesson, in the example from Baugh and Cable, and the non-reduction I point out in the final syllable of armageddon. That is, according to the literature on African Englishes, even without the prime of gone as a rhyme word conditioning such an expectation, a speaker of English with K’naan's rich phonological background would likely realize the final syllable of armageddon with an unstressed but unreduced vowel. In the further context of the song, what we then hear—either as an anomalous or innovative example of rhyme—is a mismatch between a stressed syllable and an unstressed but unreduced one that share phonemic identity; that is, we are hearing—if we are listening closely—a tug between the underlying principles governing the realization of vowels in stress-timed "standard" English (where they reduce) and their realization in many varieties of global Englishes (where they do not).

Principles of Linguistic Rhythm, the Realization of Unstressed Vowels, and Implications for Rhyme

I have been using these terms, syllable-timed and stress-timed, fairly unremarkably, but they bear elaboration. Introduced by Kenneth Pike in 1945, they originally pointed to the idea that natural languages are isochronous, and this isochrony reflects underlying facts of linguistic structure: that is, in certain languages where stress-accent plays a crucial role in linguistic rhythm—not least of which is marking word boundaries and edges of phrasal memberships—stretches of speech were said to be equally timed between stresses. But in languages where stress-accent lacks a central role in their word-level phonology and morphology[26] (and many of which happen to be African and Asian), the fundamental prosodic unit (the syllable for French; the mora, for Japanese)[27] was said to be the underlying basis for rhythm; and stretches of speech were said to be equally timed between syllables or moras. Thus, the rhythms of stress-timed languages like German and English were said to be fundamentally different from the rhythms of a syllable-timed language like French, a mora-timed language like Japanese, or even an ambiguously timed language like Somali.[28]

Empirical research over the past seventy years has modified many of Pike's initial claims; nonetheless, this crucial taxonomic distinction still has force, even if we now regard the terms syllable-timed and stress-timed as involving speakers' perceptions rather than empirical facts about language. Importantly for this essay, we also now recognize that these terms designate ends of a continuum, rather than categorical distinctions.[29] For example, there are languages (reference dialects of English, such as British English and American English among them) that we recognize to be more acute in their stress-timed properties than others, but there are also “intermediate languages” such as Polish and Catalan that exhibit “shared properties characteristic of both stress- and syllable-based languages.”[30] In other words, the distinction between syllable- and stress-timing is better regarded as gradient rather than categorical.
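The gradient described above can be made concrete with a rhythm metric. The essay names no particular measure, so what follows is an assumption brought in for illustration: one index widely used in the speech-rhythm literature is the normalized Pairwise Variability Index (nPVI), which averages the normalized difference in duration between successive intervals (here, vowels). Higher values indicate stronger alternation between long and short vowels, as expected of stress-based rhythm; lower values indicate more even, syllable-based rhythm. The durations below are hypothetical and are not measurements from any study cited here.

```python
def npvi(durations):
    """Normalized Pairwise Variability Index for a sequence of durations.

    Each adjacent pair contributes |a - b| divided by the pair's mean;
    the contributions are averaged and scaled by 100.
    """
    if len(durations) < 2:
        raise ValueError("need at least two durations")
    terms = [abs(a - b) / ((a + b) / 2)
             for a, b in zip(durations, durations[1:])]
    return 100 * sum(terms) / len(terms)


# Hypothetical vowel durations in milliseconds (illustrative only).
even_vowels = [100, 100, 100, 100]       # no reduction: every vowel equal
alternating = [150, 50, 150, 50]         # full vowels alternating with reduced ones
in_between = [120, 90, 110, 100]         # mild alternation

for label, sample in [("even", even_vowels),
                      ("alternating", alternating),
                      ("in between", in_between)]:
    print(label, round(npvi(sample), 1))
```

On a measure of this kind, a variety of English that keeps unstressed vowels full would be expected to score lower than one that reduces them, which is one way of locating a speaker on the continuum rather than in a category.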

Indeed, these gradient qualities are also found within national varieties of English and manifest in important ways in the phonology of L2 speakers. For example, where speakers of the reference dialects (AmE, BE) strongly favor stress-timed rhythms, speakers of varieties from syllable-timed substrate languages tend to impose these syllable-timed rhythms upon their realization of English.[31] As just one instance, the literature on East African English remarks the lack of rhythmic differentiation in determinations of word class; that is, whereas “Standard English uses stress to indicate word class….[i]n EAfE, the distinction between the verbs pro′test, alter′nate, at′tribute and the nouns ′protest, ′alternate, ′attribute through stress is not always maintained.”[32] Similarly, “in African English as a whole it is very common for pronouns, auxiliary verbs, prepositions and so on to be stressed in running speech,” meaning that the phonology lacks an accentual distinction between content and function words; thus, “the principles whereby some words in English running speech are accented but others not are far from self-evident to those whose first language makes no such distinctions.”[33]

But what does this mean for our discussion of K’naan's seemingly idiosyncratic rhymes? First, they bespeak, perhaps somewhat surprisingly, a slight but perceptible change in rhythm. We have already recognized that speakers of North American English might expect a schwa, /ə/, that is, a reduced vowel, in armageddon and stupid—as befits the perceptual reality of a stress-timed language compressing strings of unstressed syllables between stressed ones—whereas speakers of other varieties would likely not, due to the principles of rhythm of their substrate languages. But we should also recognize that such unstressed but unreduced vowels also play a vital if fairly recently theorized role in the underlying rhythm of even reference varieties of English: that is, even though stress is an essential issue in the phonology of English, speech rhythm in English involves more than just the alternation of stressed and unstressed syllables. Rather, as psycholinguists have argued, discussions of rhythmic patterning should also factor in vowel quality (i.e., the unreduced or reduced nature of the vowel). In other words, the generalization that a weak/strong distinction between syllables in English is binary, that is, “wholly a function of whether or not [a syllable] is stressed,” must give way to a more nuanced contrast between full vowel quality and “central, or reduced, vowels, usually schwa,”[34] yielding four pertinent syllable types: primary stresses (P); secondary stresses (S); unstressed but unreduced syllables (U); and reduced (and thus unstressed) syllables (R).[35]
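The four-way typology can be restated as a small classification over two features, stress level and vowel quality. The sketch below is only an illustration of the taxonomy as summarized above: the `Syllable` class, its field names, and the encodings of stupid are hypothetical conveniences, not a transcription standard and not the cited authors' procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Syllable:
    stress: int       # 2 = primary stress, 1 = secondary stress, 0 = unstressed
    full_vowel: bool  # False when the vowel is reduced (typically schwa)


def syllable_type(syl: Syllable) -> str:
    """Return P, S, U, or R for a syllable.

    Stressed syllables are taken to carry full vowels, so vowel quality
    only decides the label when the syllable is unstressed.
    """
    if syl.stress == 2:
        return "P"
    if syl.stress == 1:
        return "S"
    return "U" if syl.full_vowel else "R"


def pattern(word) -> str:
    """Concatenate the type labels of a word's syllables."""
    return "".join(syllable_type(s) for s in word)


# Hypothetical encodings of "stupid" (illustrative only).
stupid_reduced = [Syllable(2, True), Syllable(0, False)]   # second vowel as schwa
stupid_unreduced = [Syllable(2, True), Syllable(0, True)]  # second vowel kept full

print(pattern(stupid_reduced))    # PR
print(pattern(stupid_unreduced))  # PU
```

The binary account sees both encodings as the same strong-weak word; the four-way account separates them, and it is the second, PU, shape that is at issue in the rhymes under discussion.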

Intriguingly, these unstressed but unreduced syllables—exactly the kind found in the two rhyme pairs from K’naan discussed here—are acoustically distinctive in speakers’ productions but not necessarily distinctive in hearers’ perceptions. That is, in terms of three acoustic correlates—duration, intensity, spectral characteristics—unstressed but unreduced vowels “are neither like P and S vowels, nor like R vowels”: instead, they “occupy an intermediate position between the stressed and the reduced vowels, significantly different from both.”[36] However, listeners tend to hear them categorically and thus to group them either as stressed or unstressed, rather than as a “clear-cut . . . intermediate vowel category.”[37] Such a habit (well attested to in other tendencies to level phonetic variation in favor of functional, categorical distinctions relevant to phonology) is also relevant to our tendency to re-characterize these unstressed but unreduced word-final syllables in K’naan’s songs as bearing stress and thus potentially as examples of wrenched accent.

Second, I want to remind us that while stress is a categorical determinant in norm-setting varieties of English, stress often plays, as we have seen, a less significant role in the phonology of many other languages, especially syllable-timed ones. And as speakers of these other languages learn English, they are likely to bring the phonology of their L1 languages to bear upon their L2 ones.[38] So, to the extent that “what poetry imitates is the structure of the language itself,”[39] we might well expect the parameters of even stalwart verse conventions such as rhyme to modulate along with changes in the underlying phonology of speakers of global Englishes.

Returning to K’naan’s rhyme pairs, then, we note that whereas rhyme in English conventionally requires both phonemic identity and stress-accent on the paired segments, K’naan occasionally admits rhymes that involve a "mismatch" between stress-accent and vowel quality, that is, between a stressed syllable and an unstressed but unreduced syllable (which is typically found in the context of a disyllabic or polysyllabic word). We may well expect such fairly unremarkable but rather anomalous or innovative examples of rhyme to become increasingly common and even conventional as L2 speakers from substrate languages that tend not to reduce unstressed syllables bring to their varieties of global Englishes a functional equivalence or alternative between vowel quality and stress.
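The contrast between the conventional requirement and the rhymes K’naan admits can be written as two predicates over the same pair of syllables. This is a minimal sketch under stated assumptions: the `RhymeSyllable` class and both function names are hypothetical, the rime strings are simplified, and the second predicate is one possible formalization of the argument above, not an established definition of rhyme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RhymeSyllable:
    rime: str         # nucleus plus coda, in simplified phonemic notation
    stressed: bool
    full_vowel: bool  # False when the vowel is reduced


def conventional_rhyme(a: RhymeSyllable, b: RhymeSyllable) -> bool:
    """Phonemic identity of the rimes AND stress on both syllables."""
    return a.rime == b.rime and a.stressed and b.stressed


def vowel_quality_rhyme(a: RhymeSyllable, b: RhymeSyllable) -> bool:
    """Phonemic identity of the rimes AND a full vowel in both syllables.

    Stress is not required, so a stressed syllable may pair with an
    unstressed but unreduced one.
    """
    return a.rime == b.rime and a.full_vowel and b.full_vowel


# Hypothetical encodings of the gone/armageddon pair (illustrative only).
gone = RhymeSyllable("ɑn", stressed=True, full_vowel=True)
don_dictionary = RhymeSyllable("ən", stressed=False, full_vowel=False)
don_performed = RhymeSyllable("ɑn", stressed=False, full_vowel=True)

print(conventional_rhyme(gone, don_performed))    # False
print(vowel_quality_rhyme(gone, don_performed))   # True
print(vowel_quality_rhyme(gone, don_dictionary))  # False
```

Only the second predicate accepts the pair as performed; neither accepts the dictionary pronunciation, since reduction removes the phonemic identity on which any rhyme depends.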

Rhyme’s Intersection with Larger Trends in the History of English

The rhythms of global Englishes are shifting. Such shifts have happened before. As the entry on rhyme in the Princeton Encyclopedia of Poetry and Poetics, 4th ed., points out, the reduction of vowels and subsequent dropping of most inflectional endings in Middle English had fundamental effects first upon the phonology of English and subsequently upon the nature of rhymes in English, leading to “sets of rhyming words in Eng. . . . [that are] different in character as well.”[40] While the shift in rhythms that I point to is certainly not comparable in scale or distribution to the nearly complete loss of inflections in Middle English,[41] it may not be unreasonable to suggest some similitude based on changes in vowels, or on potential impact upon the factoring of stress.

Indeed, even though standard English is “an outlier”[42] in terms of how centrally stress factors into its phonology, there is already ample evidence of varieties of global Englishes that are less stress-driven than others. For example, a 2011 study found that “the ratio of stressed to unstressed syllables was lower for non-native speakers compared to native speakers of English, showing that less contrast between stressed and unstressed syllables can be found in non-native English speech.”[43] A similar study from 2012 of New Zealand English [NZE] speakers showed that “younger speakers of NZE tended to show less of a distinction between stressed and unstressed vowels.”[44] Even within the capital of England, it is possible to find evidence of English rhythms that deviate from the stress-timed norm-setting standard of British English: that is, in contrast with the speech rhythm of their outer-city counterparts, the inner-city migrants who speak Multicultural London English (MLE) exhibit a “more syllable-based rhythmic patterning [that] is consistent with L2 varieties of English spoken around the world.”[45] The same holds true for speakers of North American varieties too.[46] Indeed, in study after study, L2 speakers of national or regional varieties of English have exhibited speech-rhythm properties that are—like the two rhyme pairs from K’naan examined in this article—“intermediate between what one would expect for a stress-based language like English and a syllable-based language.”[47]


It may, therefore, be prudent to reexamine our assumption that stress plays a categorical role in the determination of rhyme (or, for that matter, of rhythm) in contemporary English. Given mounting evidence that “L2 rhythm is clearly influenced by L1 rhythm . . .”[48] and that many L2 speakers lack central, if not functional, roles for stress, it appears unobjectionable to speculate that the nature of salience in global Englishes is itself shifting with the changing rhythmic patternings of its speakers. Thus, while rhyme in global Englishes will likely always involve phonetic (whether or not phonemic) coincidence, making stress a categorical or even normative factor for rhyme may need reappraisal in the not-distant future.[49] As the entry for rhyme in PEPP notes, "In some verse systems, the rules in a prosody survive sometimes for centuries after the ling. facts on which they were originally based have disappeared. One of the chief instances of this process is the mute e suffix in Fr., which disappeared from pronunciation in the 15thc. but was preserved in a set of elaborate rhyme rules into the 19th."[50] We certainly are not at such a point yet with our working definition of rhyme, but the question is, Will we be in 20, 50, 100 years, and will our versification handbooks keep up?

Coda: Rethinking Rhyme

Throughout much of history, rhyme has often been regarded as mechanical and unchanging, either a "cocoon or stimulant."[51] It has also, typically and unfairly, been relegated to the domain of the traditional and thus widely derogated; indeed, today, outside of rap and popular verse, rhyme has remained decidedly unpopular, even as interest in formal verse has markedly increased.[52]

And yet rhyme as a form of "natural-historical experience"[53] can itself be "revolutionary ground."[54] While I've argued so far for a rather neutral perspective, i.e., that the rhymes discussed herein reveal possibilities that lie less in the virtuosity, or prosodic repertoire, of a single poet than in the changing prosodic repertoire of global Englishes themselves, I'd like to conclude the essay by pointing to possible issues in choice regarding rhythm and the vexed relationships between rhythm and identity that underlie not only global Englishes but also the specific rhyme pairs under discussion.

As the conclusion to a quite technical literature review of research on the rhythmic patterning(s) of global Englishes makes clear, the connection between rhythm and identity can be a matter of choice or cultural performance, a recognition that the author of that essay recommends be made explicit in the context of the English as an International Language, or EIL, classroom: “Upholding either a local or global norm has different implications . . . if . . . learners aspire toward a globalist orientation then stress-based timing should be taught. However, if learners aspire toward a localist orientation, then syllable-based timing should be the focus . . . The key here is to introduce the element of choice to the learners, allowing them to decide their identity and orientation.”[55]

Choices in rhythm are thus tantamount to choices in “identity and orientation”: a speaker can choose a “local or global norm”—each with its distinct associations—“depending on what the speakers want to achieve or portray with their language use.”[56] That is, an L2 speaker of Somali English like K’naan may choose a norm that suggests the listener hear/see Somalia or hear/see North America. And the irony is that today doing the first, i.e., choosing the local norm, as K’naan feels he did on his first two albums, can make one “well known.”[57]

On those first two albums, K’naan’s apparent choice reflected his explicit aesthetic goal of heightening awareness of the humanitarian tragedy in Somalia, and he hoped that “When I sang, my audience wouldn’t just hear music; they would see geography.”[58] He later spoke of one of these songs as being the embodiment in his “truest voice [of]—my continent’s angst in a personal story.”[59] But in a trenchant 2012 New York Times essay, K’naan examines how he censored himself on his third album, Country, God, or the Girl, in order to achieve wider success as an artist, reworking his songs for “15-year-old American girls, mostly, who knew little of Somalia”: “How much better to sing them songs about Americans . . . so my songs became far more Top 40 friendly, but infinitely cheaper.”[60] His substitution of American names like Adam and Mary for Mohammad and Fatima not only changed what is thematized, it also changed the texture of what listeners heard. Still, the sound of his voice, he said, carries darkness: “I come with all the baggage of Somalia—of my grandfather’s poetry, of pounding rhythms, of the war, of being an immigrant, of being an artist, of needing to explain a few things. Even in the friendliest of melodies, something in my voice stirs up a well of history—of dark history, of loss’s victory.”[61] There may be a baseless coincidence between the “unfriendly machine-gun fire”[62] sometimes used as a descriptor of the rhythms of EAfE (and other syllable-timed languages with unreduced syllables) and the “something in my voice [that] stirs up a well of history”; nonetheless, there remains a more direct—if difficult—set of complex technical and philosophical connections to be made between shifts in natural-historical rhythms and questions of voice, identity and orientation. This essay, unfortunately, stops short of that.

