TWO SIDES OF INTELLIGIBILITY: THE PRACTICE AND PERCEPTION OF PERFORMED ACCENTS ONSTAGE

by

ELLEN LOUISE KRESS

A DISSERTATION

Presented to the Department of Theatre Arts and Division of Graduate Studies of the University of Oregon in partial fulfillment of the requirements for the degree of Doctor of Philosophy

December 2021

DISSERTATION APPROVAL PAGE

Student: Ellen Louise Kress

Title: Two Sides of Intelligibility: The Practice and Perception of Performed Accents Onstage

This dissertation has been accepted and approved in partial fulfillment of the requirements for the Doctor of Philosophy degree in the Department of Theatre Arts by:

Theresa J. May, Chairperson
Nelson Barre, Core Member
John B. Schmor, Core Member
Melissa M. Baese-Berk, Institutional Representative

and

Krista Chronister, Vice Provost for Graduate Studies

Original approval signatures are on file with the University of Oregon Division of Graduate Studies.

Degree awarded December 2021

© 2021 Ellen Louise Kress
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs (United States) License

DISSERTATION ABSTRACT

Ellen Louise Kress
Doctor of Philosophy
Department of Theatre Arts
December 2021

Title: Two Sides of Intelligibility: The Practice and Perception of Performed Accents Onstage

The profession of voice and dialect is built upon the premise of maximum understanding for the audiences attending theatre. This maximum understanding, or intelligibility, has historically driven the practice and continues to shape the profession today. Intelligibility has been used as an objective measure for countless performers throughout the history of performance. However, intelligibility may not be an objective threshold of listening, but a socially constructed term used for both the practice and perception of voices onstage.
The work of this dissertation unpacks the idea of audience intelligibility from two perspectives: a critical examination of the relatively short history of the profession of voice and dialect in English-speaking countries, and an empirical investigation into the audience’s role in building intelligibility for actors. Intelligibility is in fact susceptible to social structures and individuals’ preconceived normative ideas towards language. Analysis of the history of voice and dialect reveals two recurring goals throughout the past two centuries. One goal of the practice was to eliminate any non-standard language usage in actors and students, erasing any traces of their linguistic lived experiences onstage. The second goal was to replace these non-standard language varieties with sanitized or stereotyped versions of acceptable language varieties, appearing as either a general standardized accent or stereotypical versions of foreign or regional dialects.

The main results of the series of linguistic experiments fall into three main themes. The first theme is that the context of language (e.g., listening to a performance) will necessarily change how listeners perceive language. The second theme is that there are multiple ways to achieve maximum constructed intelligibility, which makes way for more diverse voices in performance. The third theme uncovers the ambiguous relationship between authenticity, imitation, and stereotype, which leads to bigger questions about the role authenticity continues to play in performance. I then offer modifications to the profession by taking seriously the notion of intelligibility as a socially constructed judgment that has a real-world effect on perception. The findings from the history and the experiments contribute to my position about the state of contemporary voice and dialect practices.
I use the findings from the body of this dissertation to grapple with my own position as a white theatre maker and advocate for practices that respect the linguistic autonomy of students and actors while honoring the needs of theatrical production.

CURRICULUM VITAE

NAME OF AUTHOR: Ellen Louise Kress

GRADUATE AND UNDERGRADUATE SCHOOLS ATTENDED:
University of Oregon, Eugene
University of New Mexico, Albuquerque
Augustana University, Sioux Falls

DEGREES AWARDED:
Doctor of Philosophy, Theatre Arts, 2021, University of Oregon
Master of Arts, Linguistics, 2016, University of Oregon
Bachelor of Arts, Theatre and Linguistics, 2014, University of New Mexico

AREAS OF SPECIAL INTEREST:
Speech Perception
Actor and Voice Pedagogy
New works development

PROFESSIONAL EXPERIENCE:
Teaching Fellowship, University of Oregon, 2014-2021
Dialect Coach, Oregon Contemporary Theatre, 2019-2021

GRANTS, AWARDS, AND HONORS:
Graduate school Special “Opps” grant. (2018, 2019). University of Oregon.
Meritorious Achievement Award in Direction for Machinal. (March 2019). Kennedy Center American College Theatre Festival Region VII.
General Excellence in Communication, Best website Design, and Best Social Media on behalf of GTFF, local 3544. (May 2018). American Federation of Teachers – Oregon
Arnold, Isabel, and Rupert Marks Scholarship. (2017-2018, 2018-2019, 2019-2020). Department of Theatre Arts. University of Oregon.
1 Hour Panel Presentation Panel Winner. (May 2017). University of Oregon Graduate Student Forum 8.

PUBLICATIONS:
Tonning-Kollwitz, Melissa, Joe Hetterly, and Ellen Kress. "The Current Use of Standard Dialects in the United States Theatre Industry." Voice and Speech Review (2021): 1-15.
Kress, Ellen, et al. "Embodiment and Social Distancing: Performances." Journal of Embodied Research 3.2 (2020).
Gillooly-Kress, Ellen. “Review: Building Character by Amy Cook.” New England Theatre Journal (2019).
Gillooly-Kress, Ellen. (2018).
“#HEWILLNOTDIVIDEUS: Weaponizing Performance of Identity from the Digital to the Physical.” The Journal of American Drama and Theatre. Vol. 30, 2 (2018).
Kress, Ellen. “An Interdisciplinary Exploration of Language through Theatre and Linguistics.” UNM Department of Theatre and Dance Honors Thesis. University of New Mexico (2014).

ACKNOWLEDGMENTS

I am profoundly indebted to and grateful for the Indigenous Nations and communities whose peoples, cultures, and customs continue to nurture my life. I was raised primarily on traditional lands of nineteen pueblos and the Apache. I completed this dissertation on the lands of the Kalapuya peoples. In this acknowledgement, I hope to build the capacity for solidarity with movements like #LandBack and #BlackLivesMatter and the activists engaged every day in the hard work of social justice. I would also like to acknowledge completing this dissertation in the global COVID-19 pandemic and the lives lost and livelihoods interrupted due to this global event. May we build a stronger, more resilient, and sustainable profession that acknowledges the dignity and creativity of the people who make the theatre profession so great.

For their patience, insight, and mentorship I am incredibly indebted to the members of this dissertation’s advisory committee. To Theresa J. May, for your lessons in persistence and resiliency; to John Schmor, for jumping into this project feet first and reminding me there’s joy to be had in this work; to Nelson Barre, for a fast friendship and stimulating inquiries; and to Melissa Baese-Berk for fostering and encouraging my interdisciplinary inquiry.

Much of this dissertation would not have been completed without access to a graduate employee union that fights tirelessly for the dignity of graduate students every day.
To my comrades who served with me for three years on the executive board of the GTFF; to the 2018-2019 bargaining team for becoming beyond comrades and fighting to truly make a difference in the world (hail Satan!). Finally, to Michael Marchman for becoming a de facto mentor and treasured friend.

And to Liz Fairchild, Zeina Salame, Waylon Lenk, Anna Dulba-Barnett, Tricia Rodley, Chelsea Couch, Brendon Zuel, Stephen Armijo, Lilly Josten, Ashley Baker, Mica Pointer, Nate Severance, Harrison Sim, Arielle Owens, and Loren Billington, thank you for your friendship, your advice, your creative collaboration, and the space you create for me when I need it. To anyone I have worked with in the theatre, you inspire me more than you know. To Ryan Sayegh and Ilan Weinschelbaum, I love you and I would not be where I am today without you. Finally, to my family, Joel and Barbara Kress, for raising an outgoing kid with a strong sense of justice who loves both art and science.

DEDICATED TO

My mom and dad, for in everything I do, they support me all the way.

TABLE OF CONTENTS

I. INTELLIGIBILITY AND AUDIENCE MEANING-MAKING
  1. Overview
    1.2 Motivation and Structure
  2. Literature Review: Different Disciplines, One Room
    2.1 Thread One: The Foundations of the Voice Profession
    2.2 Thread Two: A Cognitive Account of How Audiences Construct Intelligibility
  3. Chapter Summary
II.
A CRITICAL HISTORY OF VOICE AND DIALECT
  1. Overview
  2. Learn to Speak “Good” American English: Elocutionists 1900-1950
    2.1 William Tilly Teaches World English
    2.2 Edith Warman Skinner Bridges Performance
    2.3 Mimesis Vs. Semiosis: Establishing Dialect Coaching as a Profession
  3. Freeing Tension and the Rise of Regional Theatre
    3.1 Berry and Linklater and the British Voice Training Revolution
    3.2 A New Professional Organization Establishes Itself
    3.3 The Hunt for the Perfect Dialect: Midcentury Dialect Coaching
  4. Voice Practitioners Join the Internet
    4.1 Monich Coaches for the Movies
    4.2 Digital Approaches to Voice and Dialect
  5. Towards a Cognitive Conception of Voice Training
III. CONSTRUCTING AUDIENCE INTELLIGIBILITY USING EMPIRICAL INQUIRY
  1. Rationale for Empirical Approach
    1.2 Language Perception: General Mechanisms, Several Models
    1.3 Top-down Processes Help Organize the Speech Signal
      1.3.1 What is a Dialect Anyway? Social Construction of an Accent
      1.3.2 Social Construction of an Accent Affects Perception
      1.3.3 Measures of Factors Affecting Accented Speech Perception
  2. Preliminary Study
    2.1 Participants
    2.2 Stimuli
    2.3 Listening Groups
    2.4 Procedure
    2.5 Results (Two Alternative Forced Choice Task)
    2.6 Results (Free Response Question)
    2.7 Interim Discussion
  3. Experiment 1
    3.1 Method: Participants, Stimuli, Procedure
    3.2 Results
      3.2.1 Intelligibility (Accuracy of Recall)
      3.2.2 Accentedness and Comprehensibility
      3.2.3 Accentedness and Comprehensibility by Expectation
      3.2.4 Who is the Actor?
      3.2.5 Who Has the Most Stereotypical Accent?
  4. Experiment 2
    4.1 Method: Participants, Stimuli, Procedure
    4.2 Results
      4.2.1 Adjectives by Speaker
      4.2.2 Adjectives by Expectation
      4.2.4 Who is the Actor?
      4.2.4 Who Has the Most Stereotypical Accent?
  5. Discussion: What Can Practitioners Take from This Chapter?
IV. TOWARDS A NEW TRAINING PARADIGM FOR VOICE PROFESSIONALS
  1. Enacting the Expansive Imagination in Voice
  2. Case Study: Contemporary Linguistic Needs for Latinx Actors and Directors
  3. Pragmatic Answers to Utopian Questions
  4. My Own Practice Recommendations for Dialect
    4.1 Who Let this Dialect: Pre-production and Season Selection
      4.1.1 A Note on Casting
      4.1.2 Questions to Ask the Play (and Production Team)
    4.2 The Heart of the Work: One-on-one with the Actor
      4.2.1 Finally Meeting the Actor
      4.2.2 Questions to Ask an Actor
    4.3 Audience Outreach: Working Within the Community with the Dramaturg(e)
      4.3.1 Questions to Ask a Dramaturg(e)/Community Outreach
  5. Challenges that Remain, Where do We Go from Here?
APPENDIX
REFERENCES CITED

LIST OF FIGURES

1. Histogram of responses by keyword, and again with keywords by speaker
2. Histogram of responses by keyword, by expectation condition
3. Box and whisker plot that shows the median accentedness scores for all four speakers
4. Box and whisker plot that shows the median comprehensibility scores for all four speakers
5. Box and whisker plot for accentedness with expectation conditions
6. Box and whisker plot for comprehensibility with expectation conditions
7. Comparison of expectation conditions for selection of speaker most likely to be the actor
8. Comparison of expectation conditions for selection of speaker most likely to be the most stereotypical speaker of Russian-accented English
9. Box and whisker plots of the results from the Likert rating of all five adjectives, by speaker
10. Box and whisker plots for the five adjectives that compare the two listening conditions
11. Comparison of expectation conditions for selection of speaker most likely to be the actor in experiment 2
12. Comparison of expectation conditions for selection of speaker most likely to be a stereotypical speaker of Russian-accented English in experiment 2

LIST OF TABLES

1. Four different listening groups in the experiment
2. Results of two alternative forced choice task
3. Accuracy of transcription of sentences for each speaker
4. Accuracy of transcription of sentences by expectation by listening condition
5. Adjective alignment on the Likert scales for experiment 2
6. Mean and Standard Deviation for all five adjectives, by speaker

CHAPTER I
INTELLIGIBILITY AND AUDIENCE MEANING-MAKING

“You are a linguist. You think everything is about linguistics.”
- Julia Cho, The Language Archive

1. Overview

When performers or presenters speak to an audience that has gathered for the purpose of listening to what the speaker has to say, audience members generally expect to understand what the performer is saying.
Indeed, this assumption on the part of the audience/listener is more fundamental to the performance event than any other concern, including whether the listener will agree with the message, enjoy the story, or (in the case of dramatic performances) have empathy for the circumstances of the character portrayed. Because of this expectation, performers make an extra effort to produce speech over and above what counts as understood speech in day-to-day scenarios. An entire constellation of professions devoted to the voice in performance caters to this seemingly simple and objective expectation of being able to understand speakers in public performance. Voice professionals have dedicated their lives to defining what it means to be understood in these public speaking contexts, and in turn can lend their expertise to speakers of all stripes, from CEOs of large companies to actors in theater productions. This dissertation will focus on the latter group and examine the role that expectations of intelligibility play in performance.

Dudley Knight, a practitioner of voice and speech, claims that a way to measure this understanding is with intelligibility, or a measurement of the amount of information that is communicated to the listener or audience member (Knight 20). This expectation of intelligibility appears at first blush to be an objective measure of communication; common sense dictates there ought to be some objective threshold the speaker must meet in order to be understood by the listener. However, this logical idea about how speech perception might work does not necessarily square with the cognitive reality of the act of speaking and listening. An entire field of linguistics investigates this very assumption, and researchers have theorized that an objective threshold of understanding might not exist.
In fact, audiences' expectations for understanding performed language are an under-examined microcosm of much larger social forces at play. According to current research in cognitive linguistics, intelligibility is not merely a feature of the speech itself, but a result of a combination of factors that encompass the speaker, listener, and the context in which speech is perceived (Bakanic 12). Similarly, this attention to listener or audience context is reflected in current audience research in both performance and theatre (Sedgman 103). For both fields, the relative privilege and cultural context of audience members, along with implicit attitudes about how language “ought to sound,” quickly start to affect this seemingly objective measure of intelligibility. According to research, attitudes and previously held beliefs can predict the behavior of listeners, which poses a problem for the assumption that intelligibility of language onstage is somehow separate from the context in which that same language is presented.

Driven by the assumption that to speak on stage means an actor must deliver the text in a way that the audience understands the words (as well as the actions), performers are often asked to “speak clearly.” (Indeed, this may be the most common note given to actors at all levels of acting training, voice classes, and production.) But what does it mean to be “clear” or “intelligible” on stage? Practitioners assume that a basic level of effort is required so that the audience may understand not only what is being said, but how it relates to the characters and situations in the play they are witnessing. Audience members expect a basic or privileged level of understanding, whether they label this quality of speech as “clarity,” “intelligibility,” or “authenticity.” Audiences assume that performed speech has an objective set of linguistic forms (e.g., volume, enunciation, breath).
The question of what audience members understand, and how they understand what they hear, is at the heart of a debate in theater praxis, and, more generally, in the field of linguistics. Both fields examine closely the role that expectations and cultural contexts play in perception for audience members. These expectations of properly intelligible speech are prescriptive in their approach, measuring the speech of performers against some predetermined ideal. Often, prescriptive expectations of ideal speech that masquerade as mere descriptions of linguistic forms are still subject to the overarching ideologies and beliefs of society. Performed speech, which is a type of communication that is limited to certain venues and contexts in society, is not immune to these expectations. The social makeup of performers, characters, and audience members combine to influence the social factors constructing intelligibility.

Specifically for theatre, this expectation of intelligibility is directly connected to the types of bodies that have had the privilege to be onstage. This means the practice of western theatre has constructed intelligibility around the white body. Historically, in the United Kingdom and the United States, practitioners have privileged a certain sound or way of speaking that was indicative of a certain race, class, and gender of speaker (Skinner ix). This privilege contributed to the continued marginalization of theatre practitioners in most arenas of performance, with overt discrimination against those who do not speak close to the standard accepted language in performance. Those who did not sound like privileged white, middle-class, male cis-gender actors from the correct part of the United States or England were judged as speaking an inferior type of dialect and were not featured on the stage.
Indeed, actors and theatre professionals, like news broadcasters and politicians, are often encouraged to work to lose their regional or cultural accents in order to conform to a normative sound. The results of this exclusion reinforced which types of actors were invited to train in the various official institutions of theatre. Those who were able to overcome institutional barriers found hostility from their voice and acting teachers. For example, Stan Brown, an Associate Professor in Theatre at the University of Nebraska, Lincoln, recounts his early experiences with a voice teacher while training to become a voice teacher himself:

As a young acting student, I was told by one of my voice teachers that the English language didn’t belong to me. I am an African American. Her exact words were, ‘Well Stan, you know the English language doesn’t really belong to you...your culture.’... my initial response to her assessment was one of total silence and stillness (Brown 17).

Such overtly racist ideas stem from this expectation of total intelligibility in performance yet are irrefutably affected by prejudiced ideas of the culturally dominant ideal English speaker. Contemporary voice and dialect professionals have inherited these overtly racist ideas in their practice and continue to grapple with both the legacy of these origins and the expectations of audience members for intelligibility of actors on stage. Both ideas of elocution and audience expectations of intelligibility spring from the deeply held ideas of a society that privileges certain language usage over other usage and, by extension, privileges users of more standard language, as determined by voice professionals, over non-standard or colloquial language users.
This means that vocal professionals, in pursuit of presumably objective measures like intelligibility or clarity, reproduce harmful structures of language classification that help perpetuate bias and further enforce the flawed idea that some speakers do not get to claim to be users of a language. Similar to the idea of the white gaze, this raciolinguistic perspective is “attached...to a listening subject who hears and interprets the linguistic practices of language-minoritized populations as deviant based on their racial (or socioeconomic) position in society as opposed to any objective characteristics of their language use” (Flores and Rosa 151). The focus of both the theorizing of these raciolinguistic practices and the bulk of the work of this dissertation is on the listening subject (both the highly trained listening subjects in voice and dialect trainers, and the listening subject of the average theatre audience member) rather than the empirical practices of speaking subjects or performers.

The flexible and social nature of language perception and standard language ideologies in listeners combine to produce real-world consequences for speakers who do not speak standard dialects or who speak their second language with an accent of their first language. While an audience may perceive as harmless an actor using a foreign-accented English dialect to punctuate their villainous character, these types of social expectations that accompany accent perception in real-world scenarios can perpetuate harm. This harm comes from the stereotyped ideas that non-standard speakers are somehow inferior to speakers closer to the overall societal ideal (read as white) speaker.
As this dissertation will demonstrate, in the same way explicitly racist ideologies influenced strict gatekeeping in early elocution approaches that morphed into practices throughout the voice profession, these insidious social expectations influence everything from courtroom proceedings to education and many other contexts of language usage. Linguists have theorized that harmful social stereotyping as a result of normative language expectations has influenced perception in many different social continua and can lead to consequences that extend beyond initial judgments about these accents.

Real-world consequences are documented in many different linguistic studies. For example, perceived accent affects the credibility of eyewitness statements in courtroom proceedings (Frumkin 317). In that study, six videos of identical eyewitness testimony, varying by accent and ethnic background of the eyewitness, were presented to participants. Listeners perceived speakers with non-native and non-standard accents as less credible than native speakers with more standard accents, over and above ethnic background of the speaker alone, and these speakers were also more likely to be perceived as deceptive. Further, listeners were overall less accurate in their recollections when they perceived accented eyewitness testimony.

Theatre, in these cases, can offer a site for intervention that disrupts these negative views, and offers audiences a chance to re-configure their perceptions about non-standard language speakers. It follows, then, that practitioners in voice and dialect ought to consider how audience members who are seeing (and hearing) performance in which language is a central participant are using their biases to construct intelligibility, and therefore the meaning and experiences they are drawing from the performed event.
However, the way voice professionals are constructing intelligibility does not often take audience expectation, and the meaning created from these expectations, into account. Overwhelmingly, voice professionals rely upon their expert experience and predictions of audience intelligibility to guide how they train theatre performers. The work of this dissertation demonstrates that this assumption contributes to the continuing use of standard language as a guide for voice training. Understanding how audiences use their own context and expectations to construct intelligibility as a subjective measure is an essential piece of updating the training model for voice professionals, where such understanding will create space for performers and speakers who historically have been excluded from performing.

This dissertation opens audience expectations of intelligibility and standard language usage to assess the subjective nature of this measurement of speech. This dissertation will combine a critical history of the voice profession, a cognitive consideration of how audiences use social context and expectations to create meaning on stage, and a specific empirical linguistic case study to build the argument that audience members, while perceiving intelligibility as an objective reality of language, create their own subjective parameters around language perception on stage. I will demonstrate that because these parameters are sensitive to social context, intelligibility and speech perception can be used in thoughtful ways to counteract the overarching societal expectations around the use of standard language, expectations that exclude a large portion of speakers who do not share characteristics with those who have historically had control.
1.2 Motivation and Structure

I propose a direct challenge to the objective view of intelligibility, a view that privileges the white, male cis-gender middle-class listener as the preferred audience member, and I offer that challenge as a space for the audience and practitioner to explore their own linguistic biases. Examining cultural norms regarding language in this way opens opportunities and possibilities in theatre and entertainment production that allow for speakers who might not otherwise have a chance to participate in the process of production. The role of the audience member in constructing intelligibility for the stage is a site to challenge the idea that the way a person sounds immediately indicates or confirms their innate character traits. Often when a speaker is referred to as accented, with rare exceptions, the context is negative. This pernicious idea also figures heavily in the profession of accent reduction or modification, which is a clear example of the burden of expectations of understanding lying squarely on the speakers, while absolving listeners of their responsibility for their role in constructing intelligibility.

Approaches for training, both in performance and in real life, advocate for an appropriateness-based type of language education, claiming there are appropriate avenues for different types of speech. These models of training “advocate teaching language-minoritized students (or performers) to enact the linguistic practices of the white speaking subject when appropriate” while denying that the white listening subject or audience member may still continue to perceive linguistic markedness even with the best training that is available to the student or performer (Flores and Rosa 149). Performed language qualifies as a type of privileged language and thus is highly susceptible to this type of appropriateness-based training.
Performed language must therefore be conceptualized not as objective linguistic categories but as a set of racialized ideological perceptions perpetuated by those who create and maintain the structures of power in formal entertainment structures in stage, television, and film. While performed language appears in many forms, from public speaking to formalized performances, the work of this dissertation will focus on the formal voice profession structures that have arisen to support professional forms of performed speech, found in professional theatre, television, and film. This dissertation also spans fields, serving as a model of interdisciplinary inquiry that takes seriously the notion of approaching a problem from both a humanistic and a scientific point of view. Theatrical performance remains a uniquely structured human behavioral activity, which means the practice is an arena rife with the possibility of scientific and linguistic inquiry. My position as an artist and scientist allows me to see a problem or assumption in one discipline, such as audience perception of language onstage, and approach the problem as both a theatre and a linguistics scholar, creating significant contributions to both fields in the process. For example, this dissertation not only holds the potential to influence pedagogical approaches of voice and dialect practitioners, but also stands to contribute significantly to conversations in linguistics and cognitive science about social models of speech perception more generally. Answering this question of audience expectation of understanding feeds directly back into best practices for voice professionals to confront the overtly racist ideas that serve as the foundation of this practice, as the results of examining this phenomenon take the form of a practical guide for theatre makers.
One of the core tensions of this dissertation is disentangling the difference between knowledge that is considered objective versus subjective experience. In my own experience, appealing to objectivity for a socially constructed phenomenon such as intelligence or intelligibility automatically reinforces built-in biases of evaluators in mild cases, and creates active harm for those evaluated in the most severe cases. Specifically, I am interested in the issue of intelligibility (in both senses, as used by linguistics researchers and by voice practitioners) and how it can be considered an objective measure, when the substance that intelligibility is measuring is as nebulous as information or content of the speech, and when human speech carries more meaning than the words that are uttered. Intelligibility used by these practitioners is, in my estimation of the literature, a synecdoche for the perceptual experience and subsequent quality judgment of the expert listener. Intelligibility, for a voice or dialect coach, stands for their estimation of the perceived understanding of the dialect in the context of performance (Knight 72). For the sake of empirical research, linguists have been able to side-step that larger epistemology by creating a working definition of intelligibility that quantifies words understood and recreated by listeners (Flege 2020). In consideration of the objective, voice teachers have also adopted this approach to a degree, yet do not measure content produced as narrowly as asking each individual audience member to regurgitate the content that they have just witnessed. If we were to recreate this experimental paradigm in the theatre, imagine participating as an audience member in one such experiment, with a researcher asking you to rewrite each line as you witnessed A Midsummer Night's Dream by William Shakespeare.
My goal with the research in this dissertation is to use the definition of intelligibility and the constellation of similar terms narrowly and precisely from linguistics to explore the more common denotations of intelligibility in the voice profession that uses this term as a marker for success in theatre. Carefully parsing terms will reveal a gap in the assumptions that govern the voice profession and give clarity to existing linguistic literature on intelligibility. The linguistic experiments I have designed for my investigation initially aim to isolate the moment of perception of the average audience member when they encounter a speaker or a voice minus the richly complicated immediate context of live performance. This approach requires the explicit assumption that the basic auditory, visual, and linguistic perception in individual audience members continues to function as normal, even in a highly specialized context and environment (McConachie 34). From this base, I can then expand the paradigm to capture richer and more complicated contexts in which these voices are heard. In the linguistic sense, intelligibility and the accompanying factors audiences use to judge an accent will provide feedback or confirmation for the concrete goals for the expert listener. To examine this social construction of intelligibility, I must approach this topic using two main threads of inquiry. The first thread asks, what are the assumptions of audience understanding guiding the principles of voice and dialect coaching? How do these assumptions shift from their historical origins to influence the contemporary profession of voice and dialect? To answer these questions, I create a short critical history of this relatively young profession in theatre making, answering these questions for different eras of voice and dialect.
Interspersed with this critical history is an examination into how audiences construct meaning onstage, more specifically theatre found on educational and professional stages in the United States. How does the audience conception of voice (promoted and influenced by key players in voice and dialect training) and intelligibility contribute to this meaning making process? This first thread draws its answers from the legacy of American Realism and the establishment of actor training in the United States, the United Kingdom, and Ireland, and draws upon both historical and contemporary cognitive audience reception/perception research. A central theme of this exploration is the idea that words uttered onstage in their rich linguistic and social context of performance are doing things far above and beyond the base meaning of the assertions found on the page or in performance, providing each audience member the opportunity to create meaning for themselves (Austin 6). I will use the inheritors of J. L. Austin’s ordinary language theory, George Lakoff and Mark Johnson’s embodied realism, to critique the often-detached ways that historical practitioners have conceived of language and perception in their pursuit of the profession of voice. The other thread builds upon the theories of meaning-making presented in the previous thread by using a very specific empirical case study to advance our cognitive understanding of audience perception and how audiences use intelligibility to construct meaning on the stage. This thread examines in quite literal terms the assumptions of listeners when they encounter performed speech. In this thread, I ask, what cognitive processes are audiences accessing in the moment when encountering performed accented language? How do those expectations affect how speech is perceived, specifically in terms of intelligibility or clarity?
I build a brief and useful linguistic primer for practitioners before describing in detail the experiments and their results, ending with a collection of takeaways that will influence the conclusion of this dissertation. The conclusion brings together the prior cognitive audience research and this specific empirical linguistic inquiry to speculate on the future of the voice profession, offering best practices informed by the experiments conducted within the body of this dissertation. Mirroring other efforts in updating representation on stage, I speculate towards a new field of voice and dialect coaching that critically grapples with audience meaning making as a reflection of the cultural context of theatre production in a society that explicitly privileges some language varieties over others. A critical examination of the underlying meaning-making processes of audiences means we as practitioners may unlock the potential to intervene in harmful stereotype creation and reinforcement of dominant varieties of language. A re-tooling of voice and dialect and its role in theatre creation can expand the notion of who gets to sound like whom onstage, creating room at the table for an expanded variety of lived linguistic experiences.

2 Literature Review: Different Disciplines, One Room

This dissertation draws its foundation from two disparate fields (one humanistic, the other scientific) to investigate the role of intelligibility, as both fields contain unique approaches to knowledge that complement each other. Both bodies of knowledge and ways-of-knowing (one aesthetic, the other empirical) are crucial to deciphering the specific ways in which voice professionals inadvertently reinscribe biases and standard attitudes, and to suggesting proactive countermeasures to mitigate the damage already done by the profession.
I begin in the first thread by providing a history of the profession of voice and dialect that critically engages with the ideologies of over one hundred years of voice professionals, ending with recent prominent voices through the professional organization Voice and Speech Trainers Association (VASTA). Along with the voices of professionals themselves, I describe the material conditions of theatre production as a possible source for the shape of the profession. Voice practitioners have, from time to time, pondered the ethical considerations of their craft, with fewer still producing academic literature. One publication, Standard Speech: Essays on Voice and Speech (2000), printed as the initial issue of the Voice and Speech Review by VASTA, has served as the model for discussion about ethics of producing voice and dialect work. Precious few of these voices in the literature of voice professionals are concerned with audience perception and reception, which is the conversation into which I inject the research of this dissertation. I build upon the implicit arguments of Patsy Rodenburg’s work against what she calls “vocal imperialism” in her first book The Right to Speak (2015). The history of this profession is offered through a lens of philosophical approaches to language via cognitive audience studies that will roughly divide the different approaches to voice and dialect training throughout the brief history of this craft. In order to engage with this history, I am using two prominent audience listening theories.
One area is audience reception theory, introduced by Susan Bennett’s book Theatre Audiences: A Theory of Production and Reception (1997), and the other is contemporary iterations that have followed other humanities scholars and adopted a cognitive turn towards scholarship through Bruce McConachie’s Engaging Audiences: A Cognitive Approach to Spectating in the Theatre (2008) and other scholars that are already borrowing from psychology, linguistics, cognitive science, and embodied realism to explore the processes of audience reception/perception. Bennett has essentially laid the groundwork for future cognitive humanities investigation into theatre by arguing for deeper systematic research that accounts for different contexts for audience reception (89). The second thread answers Bennett’s call for deeper systematic research by examining a very specific instance of audience members constructing intelligibility through speech perception of non-native dialects on stage. To establish this thread, I examine the up-to-date theories of language perception that voice practitioners may directly use in training. I will specifically draw upon Rosina Lippi-Green’s work English with an Accent: Language, Ideology and Discrimination in the United States (2012) where she uses empirical evidence to demonstrate what she calls “standard language ideology,” a phenomenon affecting listeners as they reconcile their expectations of how a speaker should sound based upon visual cues such as perceived race or gender (Lippi-Green 64). This standard language ideology is particularly pertinent to performed language as the language is often spoken by an actor in a space who is read on multiple social levels (race, age, socioeconomic status, gender). I add to this by summarizing the most recent findings in the field of cognitive linguistics, including the most recent theories of non-native speech perception, second dialect acquisition, and speech adaptation.
I start by specifically drawing upon the work of Kevin McGowan (2015), Donald Rubin (1990, 1992, 2013), James Flege (1995, 2020), Munro and Derwing (1997), and the work of the very lab of which I am a member and where I have conducted the research of this dissertation, under the direction of Dr. Melissa Baese-Berk. To view these two threads as an intertwining braid, I will draw parallels between these two fields of study by highlighting overlapping vocabulary terms used in both fields. By defining terms such as accent, perception and even intelligibility, I can create a space with this dissertation that combines parallel conversations and offers best practices as a result of these overlapping fields of study. I will use the main takeaways from both threads to directly advocate for a better approach to a profession that has historically contributed to active harms of marginalized people.

2.1 Thread one: The foundations of the voice profession

The first thread concerns itself with a thematic historical overview of the voice and dialect profession, beginning with elocutionists at the turn of the twentieth century and leading to the working model of contemporary practitioners that lend their services to many forms of entertainment, from live theatrical performance to film and television. Vocal professionals work in terms of meaning-making by teaching performers to access their voice as an integral aspect of conveying language on stage. Most theatrical audiences expect to be able to easily understand these performers, which is an aspect that voice professionals have identified as a key area for voice work.
In this thread, I connect the expectations of both voice professionals and theatrical audiences to raciolinguistic ideology: both of these behaviors stem from the inferred expectation of the white listener, which automatically and systematically labels the marginalized voice and body as an Other, leading to real-world consequences for performers and marginalized communities. The chapter will demonstrate that even contemporary voice training subscribes to the appropriateness model of education, popular in many different language education models. The arguments in this thread preview the notion that the so-called objective listening criteria offered for performed speech are subjectively constructed between the speaker and the listener (i.e., the audience). The training apparatus of theatre is, indeed, the white listening apparatus and the accompanying economic system made manifest. In addition to conveying the written language of the piece, vocal choices the performer makes also do work to create meaning or context of this language. For example, an artistic director for the company may choose to hire a dialect coach for a production of Good People and ask that actors in the production perform using a Boston dialect, to convey a sense of meaning of place in production. Regardless of location of performance, audiences take this acoustic signal as part of the process of meaning-making in theatre. However, use of a Boston Southie dialect on the West Coast of the United States may be interpreted differently by audiences there than by audiences on the East Coast (especially if that playhouse is in Boston). The audience’s threshold for dialect accuracy may be wider for those on the West Coast of the United States since audience members likely have less direct experience with a Boston dialect in the real world outside of the performance. This difference in perception of authenticity between individual audience members affects how they are perceiving the show.
Even with a stereotypical dialect, audiences expect ease of access to understanding the language onstage as a seamless part of meaning making in theatrical production, calling for an accent that is both recognizable as authentic and clear in delivery. However, clear in this instance is defined by the narrow experience of the small slice of socioeconomically privileged demographics that Susan Bennett in Theatre Audiences demonstrates attends the theatre (Bennett 114). Practitioners have established authority in this growing field by opting for engagement with more general audiences for their scholarship. These practitioners cultivate access to their work not only by creating institutes and offering workshops, but also by espousing their philosophies and approaches to voice through publications that are meant as accessible guides for speakers and performers of all stripes. In these general audience publications, some practitioners grapple with the built-in inequalities of the profession of performance. In her 1993 book, The Right to Speak: Working with the Voice, Patsy Rodenburg writes about the right for every person to have their voice heard through her system of work of the voice. This is a simple declaration, yet to make such a declaration requires an honest admission of who has historically had the right to speak and in which arena, and who has not. In her introduction, she introduces the notion of “vocal imperialism” with respect to the notion that when a person opens their mouth to speak, they are exposed to snap judgments from others of their geographical and socioeconomic origin and subsequently their capabilities as speakers (Rodenburg 5). In this society, certain voices are privileged above others as a result of overlapping prejudices and beliefs, which is then reflected in the media, performance, and entertainment that this society consumes. Rodenburg warns against how these types of snap judgments lead to a loss of voice or vocal power in a speaker.
Rodenburg contributes a culturally sensitive addition to the usual idea of vocal practice as being a deeply unique and individual practice that aims to investigate the speaker’s own limits and restrictions. The fact that she names “vocal imperialism” as the first obstacle to declaring one’s right to speak reveals volumes about the society in which speakers and listeners find themselves. Naming these biases forms the basis of inquiry into how practitioners and audiences alike conceive of intelligibility, whether it is called vocal imperialism in the field of voice or standard language ideology in the field of linguistics. This deceptively simple belief in a standard language continues to hold massive implications for the use of dialect onstage. The fault does not rest in this field of voice alone; belief in standard language is a pervasive, common-sense idea that is deeply and subconsciously ingrained in how most of society views communication. However, the field of voice and dialect is in the unique position to actively push against how standard language beliefs affect individual audience members’ perceptions of how intelligible and clear an actor must sound in order to be accepted as a good performer or speaker. These expectations are not only the responsibility of the untrained listeners in the audience; voice experts’ opinions and trained ears are also responsible for shaping the conception and use of intelligibility. While some vocal practitioners like Patsy Rodenburg push back against this prevailing idea, the fact remains that vocal and dialect training has been shaped by these biases towards an abstract idealized language. These expert expectations are responsible for appeals to standardization of language use on stage. This thread traces a critical history of the harmful usage of these language standards throughout the profession that still find their way into contemporary linguistic practice onstage, in film, and on television.
Standard language varieties often reflected a neutral mode of speech that was anything but neutral, as these varieties often favored white, middle-class cis-gender actors because the variety acoustically most resembled these actors (Lippi-Green 14). Often in these books from early elocutionists, these attitudes towards modes of speaking would give way to overt discussions of racist ideas like restricting immigration to the United States. In EuphonEnglish, published in 1924, Marguerite DeWitt explicitly highlights why she believed white American speakers of English were the superior speakers in general: “ignorance may be condoned, lack of dexterity may be excused, but faulty speech and foreign accent are indelible signs of social inferiority” (DeWitt as quoted in Knight 40). Building from these explicitly racist structures, other voice and dialect professionals incorporated these prejudices and systemic inequities throughout the following century so that these explicitly racist ideas are now hiding as common sense or implicit approaches to vocal production. The use of “Good American Speech” as introduced by Edith Skinner in training, for instance, necessarily implies there are versions of Bad American Speech that do not deserve to be heard onstage (Skinner 1990, ix). These effects are borne disproportionately by those actors with marginalized identities, and there are many instances where actors are encouraged to either drop their home dialect or play up the ethnic aspects of their speech in order to secure roles (Sullivan). The professional manifestation of these practices includes the organization Voice and Speech Trainers Association (VASTA) that houses both practical resources like lists of trainers that are available in the institutional ecosystem, and also a scholarship wing where critical conversations have shaped the profession in its twenty years of existence.
I engage here with two of the most influential voices throughout the history of VASTA and the publication Voice and Speech Review because I have seen their work influence the community of voice and speech trainers in lasting and damaging ways. Because this profession rests on explicitly racist foundations that it has seldom acknowledged (apart from a growing contemporary call from a small group of scholars), I choose these voices as critical entries into understanding the historic legacy of racism and sexism in the contemporary iteration of this profession. Like voice and dialect coaches in production, these scholars are admired as respected authorities and gatekeepers of access to the voice and bear the responsibility to honestly reckon with the dark history and continued oppressive practices of this profession. Some of these influential scholarly and professional voices still defend the use of standardized dialects or Skinner’s “Good American Speech” as a pedagogical tool, contending that it is important for actors to learn about their unique voices through learning a different accent or mode of speech (Robbins 55). While many voice professionals may not use standardized dialects or accents or “Good American Speech,” there are other insidious ways where normative language attitudes seep into their practices. Again, while these normative attitudes are on the surface aimed at the marginalized speaker or actor, these attitudes spring from subjective ideologies that actually privilege the white listener. One of the ways that voice and dialect practitioners have attempted to circumvent issues of standardization of language practice is to appeal to the objective-sounding measure called intelligibility, as described at the beginning of this introduction. This is the term that Dudley Knight used in his article on standard language usage in voice pedagogy in the initial issue of Voice and Speech Review in 2000, an article so influential that it was reprinted in 2012.
His claim appeals to the commonsense notion that there must be some objective threshold for understanding sound and language onstage. Surely, Knight argues, there is some absolute baseline of minimum understanding from the audience when it comes to communication (Knight 65). The work of this dissertation dismantles this appeal to common sense and demonstrates that intelligibility is socially constructed, thereby demonstrating that no such objective threshold exists in the way conceived by Knight. Conveniently, Knight does not offer a direct definition of intelligibility, instead appealing to a know-it-when-you-see-it approach by saying, “most theatre accent coaches have a keen experiential awareness of what intelligibility is, because they have had to modify the accuracy of accent all the time to accommodate it” (Knight 75). Using appeals to authority is a common theme for voice trainers throughout the short history of this profession, which accomplishes two things: establishing this profession as having legitimate expertise, and gatekeeping speakers of non-intelligible accents from the profession. He then goes on to claim that intelligibility is fully the responsibility of the speaker: “A standard based on intelligibility is not tied to any prescriptive pattern. Rather it is based solely on the speaker’s ability to transmit to the listener the appropriate amount of linguistic information to the level of detail and specificity appropriate to the event” (Knight 75). This is a curious approach, because he appears to assign the responsibility for intelligibility to either the expert listener or the speaker, but not to both at the same time. The responsibility for intelligibility during rehearsal lies with the expert listener or voice coach, while intelligibility rests on the shoulders of the actor during performance.
Knight also appeals to this objective measure as a way to circumvent the idea that normative language attitudes about race and gender affect all forms of communication and especially language perception. Knight is referencing specifically the linguist Rosina Lippi-Green, who was also invited to contribute to the initial volume of Voice and Speech Review on Standard Language as an outside expert. In her article “The Standard Language Myth,” Rosina Lippi-Green explains a phenomenon with which voice and speech trainers must contend called standard language ideology (introduced briefly here but discussed more in depth in the following section), where a listener believes that a homogeneous or perfect version of language exists, and they are comparing what they hear with that expectation (Lippi-Green 24). In order to create a standard objective measure of intelligibility, Knight had to reconcile the subjective nature of standard language ideology, since every listener in the audience has the potential for a slightly different version of ideal language. His solution was to discredit Lippi-Green’s theory that listeners share responsibility for intelligibility, even going as far as accusing Lippi-Green of cherry-picking anecdotal evidence to support her case while offering his own list of anecdotal evidence. However, Knight’s own cherry-picked anecdotes of exceptions all skew heavily white cis-gender male (speakers who could initially be perceived as standard speakers by their looks alone), which ironically serves to confirm Lippi-Green’s theory that listeners are using social cues from their beliefs of standard language to construct intelligibility. It takes incredible hubris to criticize a linguist, whose entire job is to systematically investigate language usage, for picking and choosing some evidence while ignoring the rest in building a theory of language usage for voice professionals.
Knight’s mistake in 2000, reprinted in 2012, was refusing to take into account the audience member’s role in measuring intelligibility of speakers onstage. In the intervening time since 2012, Dudley Knight has softened his appeal to intelligibility as a wholly objective measure, describing intelligibility of the work he offers on his website as “not a fixed property of some idealized and prescribed accent model, but a constantly negotiated process between speaker and listener, within conditions set by the acoustics of the space and the familiarity of the audience with the language style” (Knight and Thompson). However, Knight and Thompson still suggest the use of objective measure through referencing the acoustics of a given location, along with an implied ability to measure the audience’s familiarity with the language style of the piece. Yet familiarity itself cannot be measured as a fixed quantity, since the audience's experience with the piece itself changes that familiarity in proportion to the amount of time they spend experiencing the production. In other words, the audience’s familiarity with the language styles will increase with every minute of the production. I engage with Knight’s arguments here in detail to make a point about how voice and dialect practitioners approach many kinds of objective or scientific measures. That is to say, the history in the first thread will reveal that Knight and other voice and dialect practitioners appeal to scientific authority on topics by using objective-sounding measures without following through on how to use such measures. The use of intelligibility as an objective measure of linguistic ability can be compared to the use of IQ tests; at the outset, there appear to be objective measures of intelligence, but scratch beneath the surface and one can find many instances of normative and racist judgments that accompany these types of measures (Kendi 311).
To balance this appeal to objectivity, as Knight often does in his article, voice and dialect practitioners appeal to their extensive and subjective experience of their profession. As a voice practitioner myself, I do not have a large issue with using personal experiences as evidence per se. I do, however, criticize when subjective observation is then passed off as objective or scientific evidence. Fortunately, other practitioners in this field of voice approach the issue of standard language through a lens that accounts for cultural differences in practitioners and audiences, laying the foundation for an alternative discipline of voice training. This deep interrogation of the foundation of this field is necessary if we are to include voices that have been pushed aside, voices that continue to be marginalized both in society at large and in our performance spaces. After all, as Rodenburg says, “voice work is for everybody...your voice belongs to you, it is your responsibility and right to use it fully” (Rodenburg xiv). A new generation of voice professionals has taken this quote to heart, as they create systems that attempt to open space for speakers who do not speak perceived standard varieties of English or other languages. Professor Melissa Tonning-Kollwitz and Joe Hetterly have pioneered work both in the professional sphere and through their continued scholarship of industry needs and the documented shift away from standard dialect in The Voice and Speech Review. From these and other professional observations, Tonning-Kollwitz has crafted a new system of voice work that accounts for biases, and she has adopted actively antiracist stances within the entertainment industry. Another voice professional, Daron Oram, advocates for a very explicit decolonization of the “linguistic imperialism” described by Rodenburg (Oram 280).
Contemporary voice and dialect professionals have begun the difficult work of transforming this profession through their own years of negative experiences. While these viewpoints are previewed in the initial thread of inquiry, the final chapter of this dissertation is in direct conversation with these contemporary scholars and practitioners. Interspersed with discussion of these voice practitioners and different eras of voice training are cognitive considerations of how trainers and audiences alike are constructing meaning using the social contexts of the voice. Part of this thread is a consideration of the use of the word “voice” and the many different permutations that govern this profession. This chapter also offers cognitive explanations behind the assumptions (established as common-sense principles) of practitioners in the field of voice. Through audience studies, cognitive humanities, and the philosophy of language, I can examine why these assumptions feel like common sense, and the explicit role intelligibility of speech plays in the larger context of meaning creation in production. Interest in audience studies, also known as audience reception, did not rise to scholarly prominence until Susan Bennett laid the foundation for a compelling case for studying audience reception systematically within theatre production in 1989 through her book Theatre Audiences: A Theory of Production and Reception. In Bennett’s estimation, theatre, while beginning in the west as an act of communal religious and political gathering for most of the populace (an embodied act or performance of democracy), has shifted instead to serve the interests of the middle class or bourgeois society (Bennett 3). "Naturalist theatre," which is what Bennett labels Realism in the American tradition that arose out of the work of Stanislavski, can be the culprit most guilty of catering to the middle-class tastes of these audiences.
Bennett observed that treating audiences as homogenized and sanitized masses did a massive disservice to the types of theatre that were pushing against the dominant or mainstream approach to theatrical production. These theatre companies, often sharing and promoting work created by marginalized theatre creators, tend to approach their audiences as groups of individuals who are "productive and emancipated spectators" within a vibrant cultural ecosystem (1). While an imaginary or stage world is still at the center of the model, it is the stage world concentrically wrapped in audiences' cultural expectations that constitutes audience reception. The cultural context in which theatre is created belongs directly to the audience members who are witnessing these acts of performance; the performance by its very nature cannot be separated from the expectations of those who witness it. Thus, according to Bennett, to study performance is to study the audience and the various contexts in which they encounter theatre. Audiences make meaning not just from the imaginary world on stage, but also from the real world in which they find themselves encountering this act. Because it takes these environments seriously, audience reception studies makes room for empirical consideration of the audience experience and thus engages with different fields such as sociology, anthropology, and even philosophy of language. Bennett builds a foundation for her model by reviewing the available empirical studies on theatrical audiences, which all confirm the initial conceit of the introduction; namely, polled audiences who attend performances occupy a narrow swath of socio-economic and political opinions, regardless of the geographical area that is sampled (Bennett 92).
While often sharing dominant societal habits and attitudes in general, this small band of politically like-minded middle-class theatre attendees also shares similarly held ideas about intelligibility’s role in meaning on stage. To this group, theatre and art for the middle class were made for relatively easy consumption, which requires access to understanding with little to no effort on the part of the theatrical audience. Theatre has been built for this particular type of white listener, a subject who is also found in other “appropriate” contexts, including academic and classroom instruction of language (Flores and Rosa 145). Contemporary audience research scholars such as Kirsty Sedgman follow this tradition of empirical investigation, calling for rigor in theatre audience research that rivals serious social scientific inquiry in other fields. This dissertation answers that call by posing specific empirical questions of audience perception in the theatre, by incorporating research from the field of linguistics, and by detailing a series of empirical experiments that advance cognitive audience studies. Bennett borrows heavily from semiotics and from philosophy that regards humans in the tradition of Locke and his “tabula rasa” rational mind, while still attempting to account for the immediate and material influence of context on that mind. In contrast, Bruce McConachie uses recent cognitive scholarship to embrace both nature and nurture in the quest to describe human meaning-making in performance. McConachie points to actor training as a huge influence on the mode of meaning-making for theatrical audiences, particularly in the United States of America and the United Kingdom and Ireland.
I will take up the arguments from McConachie and demonstrate that voice and dialect coaches are responsible in large part for replicating and disseminating ideas and modes of meaning-making for the voice and the use of dialects onstage, and that this, in turn, replicates social bias. The context of Realism or “naturalist theatre” in which audiences have grown accustomed to encountering theatre governs the practice of voice and dialect directly (Bennett 2). Both audiences and theatre producers in particular are subject to this naturalist approach, the dominant epistemology of theatre creation in the twentieth and twenty-first centuries. Because the voice profession was created in the twentieth century, it stands to reason that the voice and dialect profession subscribes to these same ideas about knowledge creation. This thread closely examines these ideas about knowledge creation through distinct generations or approaches to voice training. This thread not only offers an alternate lens into the guiding assumptions of voice and dialect professionals, but also lays the foundation for a very specific application of empirical investigation into the exact construction of intelligibility in which audience members participate.

2.2 Thread two: A cognitive account of how audiences construct intelligibility

One of the unique contributions of this dissertation is its critical engagement with linguistic theories that parallel discussions that scholars in theatre are having regarding the role of communication. This thread will serve as a specific instance of cognitive humanistic inquiry that details the cognitive mechanisms behind audience perception in order to assess its role in meaning-making for performance.
Since the focus of this dissertation is on audience experience and perception of voices and accents on stage, the best available expertise on the topic comes directly from psycholinguistics and related fields, where researchers have examined the perceptual, mechanical, and psychological processes that accompany language perception more generally. Sociolinguistic theories of non-native or foreign accent perception also serve as a critical basis for investigating standard language ideology, which is a root cause of the historic and continued use of standardized dialects or accents in actor training. Sociolinguistics also offers a very useful model with which to frame various approaches to voice and dialect training, called raciolinguistics. This model describes how speakers are unfairly held up as flawed examples of the language or accent they are attempting to acquire, and it describes the prevailing belief that some forms of language are appropriate only in certain contexts. This dissertation acts as a bridge in multiple ways, connecting conversations between fields as far apart as the social sciences and the humanities, and, within a discipline, between subfields of linguistics. To decipher the ways in which normative vocal training may reinscribe implicit bias and inadvertently prejudice the entire practice of theatre, I make use of ideas and terms from the field of linguistics. While these are not always credited to one particular author (as is often the case in the humanities), these terms are central to my project, and indeed they are the tools I apply reflectively to the field of vocal training for the theatre. I begin here with the term at the nucleus of my inquiry, standard language ideology, and branch out to the theory and experimental evidence from multiple scientists throughout this literature review.
Much as Rodenburg coins “vocal imperialism,” Rosina Lippi-Green introduces the term “standard language ideology” in her book English with an Accent. This term is defined as “a bias toward an abstract, idealized, homogeneous language” (Lippi-Green 64). Standard language ideology captures the audience belief that there is an idealized or perfect language against which speakers must measure themselves. This bias appears both in positive feelings towards the “idealized” language variety and in negative feelings towards non-standard or non-native varieties of language. For instance, many listeners register positive feelings towards prestigious accents such as British-accented English or French-accented English, and more negative attitudes towards Spanish-accented English, especially varieties from countries other than Spain (Lindemann 187). This belief is often the root cause of linguistic discrimination of all stripes, and that discrimination has consequences in real-world scenarios, including the loss of economic and cultural opportunities. For example, listeners will rate a local Catalonian dialect as less trustworthy than a standardized Spanish accent when listening to speakers on the radio, leading these listeners to disregard important information (Renaires-Lara et al. 16). Throughout this dissertation, I will compare “vocal imperialism” as described by Rodenburg with Lippi-Green’s “standard language ideology” as proposed in the field of sociolinguistics. The parallels are so striking that I believe both scholars are describing the same phenomenon, reached through different ways of accessing that knowledge. Rodenburg coined “vocal imperialism” using her years of personal embodied experience of ushering hundreds of students towards vocal freedom in performance. Lippi-Green points to the plethora of empirical evidence found by numerous linguistics researchers and scholars as the basis of her theory.
The goal of this dissertation is to create conversation between these different modes of knowledge, highlighting the diverse paths that one can take towards discovering the underlying assumptions of a profession such as voice and dialect. In some cases, standard language ideology can reinforce an accepted, explicit standard dialect that has been championed by those in power. In the United Kingdom, Received Pronunciation, a regionless, constructed dialect championed by the British Broadcasting Company and encouraged among its on-air personalities, is favorably associated with competence, education, self-confidence, and intelligence (Brown et al.). Received Pronunciation is a dialect that has been created for performance and for media, yet it still influences listeners’ language attitudes about how English, at least in the media, ought to be pronounced. Audiences tuning into the evening news expect anchors and reporters to be easily understood and expect a high level of intelligibility. Dialect coaches and voice professionals in the United States of America have attempted a type of standardized dialect for performance similar to Received Pronunciation, with one dialect, sometimes named Mid- or trans-Atlantic, becoming a popular dialect with which to train actors (Skinner). Actors and performers trained for both stage and television have long been taught standard dialects with an eye towards prestige, reinforcing who gets to sound like whom onstage. These dialects and the prestigious institutions that created them also enforce ideas about what accent is right or standard in any given culture, leading to a kind of feedback loop that affirms the confirmation bias of audiences and practitioners alike. These reflections are so prevalent that one can actually trace the changes in prestigious forms over the years. Performed speech may reflect the standard or idealized speech that is dominant at the time in which the media is produced (Elliott 105).
Earlier generations of voice and speech practitioners claim that the speech of actors (especially those explicitly trained in dialect or voice for the stage and film) can represent an ideal or standard style of speech to which all speakers ought to conform. Further, standard language ideologies about non-standard dialects and accents on stage influence stereotyped representations, where representations of accent and dialect in media can be conceived as cognitive shortcuts for characterization (Bakanic 14). All of these forms of language package the idea that certain types of speech are more appropriate for public-facing contexts, presenting these styles of speaking as objective linguistic fact, when the bulk of the identity of these styles rests in the listener. Both the speaker and the listener create these styles of speech through cooperation: the speakers’ continuous use of these styles, and the listeners' continued expectation of them. To examine how these styles are built without considering the trainers who construct this interaction misses a large part of the story, which will be the focus of this thread. The field of accent perception offers the strongest rebuttal to Dudley Knight’s arguments for an “objective” measure of intelligibility. A significant amount of prior research has examined the factors that affect accented speech perception; that research is previewed here and explored in depth in this thread. The three factors that many researchers use to build their theory of accented speech perception are accentedness, comprehensibility, and intelligibility. Accentedness is a subjective measure that refers to how “strong” a listener believes an accent to be. Comprehensibility is also a subjective measure, which asks the listener how easily they can understand the speaker. Both of these first two factors are measured on a Likert scale that asks for the opinion of the respondent.
For example, respondents giving their comprehensibility opinion will be asked “how easy was it for you to understand this speech?” and given a scale from 1, “not at all easy to understand,” to 9, “extremely easy to understand” (Derwing and Munro 1). In contrast to accentedness and comprehensibility, intelligibility is measured in a different manner. Linguists refer to intelligibility as an explicit measure of the amount of information a listener gathers from the speech signal. Instead of leaving this measure to the impression of an expert listener such as a dialect coach, researchers devise paradigms in which listeners are expected to write down the exact words that they heard. In this way, whether or not the listener writes “high” at the end of the sentence “the ball bounced very high” becomes a matter of objective achievement. Crucially, however, these researchers do not include only this measure of intelligibility in their investigations into speech perception. Research in this area operationalizes intelligibility as one of many factors used to examine the cognitive processes behind accent perception. These three factors, including intelligibility, are highly sensitive to different modes of context. Many factors determine these scores, including factors intrinsic to the speaker, intrinsic to the listener, or related to the environment in which the language is perceived (Moyer 192). Of particular interest to this research is the environment: the factors affecting perception of accented speech are not fixed within a listener, as they can be influenced by the context in which the speech is being perceived, including the expectations of the listener (Kang and Rubin 441). There are many specific empirical instances where expectation affects perception of non-native accent, which suggests that theatre or performance may be a specialized social context for language perception.
For example, listeners’ perception of the vowel space that speakers use is sensitive to explicit labeling of regional dialects on testing materials, shifting depending on the regional label (Niedzielski 80). Explicit mention of geographical areas may not even be necessary, as listeners are so sensitive to their environments that they can be affected by relatively minor features of the testing environment (e.g., stuffed toys in the testing area, Hay and Drager 867). That is, listeners’ perception of vowels was influenced by the toys the listeners had seen before the experiment, driving listeners to label these vowels as originating from different regions, even though they heard the exact same vowels in each experiment. This means that audience members are sensitive to all types of cues in the performance environment, from their expectations regarding the bodies that are onstage, to the decisions that designers make for costumes, set, lighting, and general ambiance. This sensitivity can lead to mismatched expectations that also have perceptual consequences. In other work, listeners were less accurate in transcribing information when they experienced a mismatch between what they were seeing and what they heard (e.g., seeing a photo of a white woman while hearing Mandarin-accented English, McGowan 515). This evidence supports a model in which linguistic and non-linguistic information (e.g., social expectations) are intertwined (Hay and Drager 866), and a model of socially weighted perception of spoken words that encompasses both linguistic and social factors, in which listeners map acoustic patterns to linguistic and social representations in tandem (Sumner et al. 1015). The result of storing and later accessing these representations is that listeners directly encode their judgments and prejudices in the very apparatus of language perception that they use in everyday life.
Language use literally cannot be disentangled from the social context in which listeners and speakers find themselves, including power structures inherent in the dominant society. Consequences of standard language ideology have been demonstrated in educational environments, pointing to real-world effects and showing that intelligibility can be affected by listeners’ attitudes toward a speaker. In a seminal study on perceived accentedness, degree of accent correlated negatively with undergraduate students’ perception of the teaching competence of international teaching assistants (Rubin and Smith 351). Comprehensibility ratings were measured after playing 4-minute lessons in either a “moderate” or a “strong” accent for 92 undergraduate students while displaying the photograph of one of two “lecturers,” a white or an East Asian instructor. In a follow-up, a standard American accent was used as the audio signal; students who saw a picture of an East Asian woman while listening to the lecture performed more poorly on the content exams in the post-test, thus affecting intelligibility (Rubin and Smith 348). In later research, this phenomenon is referred to as “reverse linguistic stereotyping,” which refers to a listener’s difficulty navigating a seemingly neutral accent produced by a speaker who appears not to be from the area (Kang and Rubin 441). An example of this type of reverse linguistic stereotyping was the relentless accusations leveled at Barack Obama for not talking “black enough” throughout his presidency (Graham). Direct implications for casting and theatre can be added to the complicated equation of listeners’ use of intelligibility to perceive language. One final concept from linguistics that will contribute to the foundation of this dissertation is the deficit model, a way of conceiving models of language or dialect acquisition.
Deficit models take the basic assumption made by standard language ideology, that there exists a “perfect or homogeneous” version of language, and apply this ideology to language instruction. The deficit model assumes speakers or learners are flawed or incomplete in their acquisition of the target language, and they are subsequently judged by the degree to which they are assumed to be flawed (Modiano 525). The ideal form is the unattainable yardstick by which all speakers are measured, and teachers are free to treat students according to their expert estimation of the students' skill acquisition in the classroom. Even liberal or culturally sensitive approaches to language instruction demonstrate shades of this deficit model. For example, a movement to explicitly teach code switching between non-standard home dialects and school-approved standard dialects still implies there is an appropriate space for one variety of speech over the other (Modiano 527). When the “appropriate” environment for the acceptable standard way of speaking is also the institutionally reinforced environment of school, students learn that their home dialect does not belong in the institution, thus reinforcing individual ‘deficits’ in their way of speaking (Rosa and Flores 145). Given this background, I designed a series of experiments using performed speech to test the hypothesis that objective measures of speech are indeed socially constructed in the minds of listeners. Both of these experiments manipulate listeners’ conscious expectation of performance, which disambiguates the role of the context of performance from language perception in general. These experiments both replicate classic experiments (e.g., exploring accentedness like Munro and Flege) and build a performance-specific inquiry into how audiences judge accents on stage.
Results from these experiments are incorporated into a subsequent discussion of the implications for cognitive humanities research, along with practical results for voice and dialect practitioners. The combination of the empirical findings of this project's specific linguistic inquiry and the systematic exploration of the psychological and philosophical underpinnings of conceptions of the voice contributes to meaning-making and reinforces an emerging view of how to approach performance in the conclusion of this dissertation.

3. Chapter Summary

In this dissertation, I refer to each section as a thread, as each section is both independent of and dependent upon the other in narrative construction and in chronology. That is to say, each thread is thematically organized, and each thread has its own unique chronological and thematic progression. References within individual sections that point towards discussion elsewhere in the dissertation will be noted as the dissertation progresses. The first thread contains a critical history of voice practitioners and their guiding assumptions about audience experience; I will sketch three general generations of voice and dialect pedagogy based upon three approaches to voice instruction. Starting from the foundations with elocutionists and inventors of standard dialects such as Received Pronunciation, through practitioners who were concerned with “freeing the natural voice,” to more recent practitioners who have embraced the science of voice in their approach to training the actor’s voice in VASTA, I examine the assumptions that lie at the core of each of these eras of voice pedagogy. To end this thread, I preview efforts by recent voice practitioners to decouple voice pedagogy from standard language, which aligns with efforts in other areas of theatrical production to expand representation both onstage and off.
Interspersed through this historical discussion is a discussion of how the theatrical apparatus influences audience members’ meaning-making, examining how American theatre has been shaped by over 100 years of the tradition of Realism, a theatrical movement that has its roots in the Moscow Art Theatre established by Konstantin Stanislavski in 1897 (Benedetti 12). Reliance upon “authenticity” in this mode of production leads to expectations of “real life” onstage. I will also discuss the material circumstances and the economics of American theatre making that influence the approach to voice training. The thread is bolstered by scholarship in cognitive humanities that takes seriously the notion that cognition in performance and the arts arises from the cognitive structures that are already in use by each human. This thread also offers cognitive philosophical underpinnings of the metaphors actively in use by voice professionals. This establishes a foundation from which I examine a smaller piece of this context of performance, namely what linguistic perception and expectation of intelligibility of voices and speakers on stage offer to the production and meaning-making process at large. The second thread interrogates the assumptions that voice and dialect practitioners make in their work about audience experience, using linguistic experimentation as a second lens for critical inquiry. In this thread, I examine the linguistic literature that traces the origins and effects of standard language ideology, which then serves as additional background to my own empirical investigation into audience perception of accents on stage. I provide background on other research of interest to voice and speech practitioners that demonstrates the nuances of acquiring a second dialect (Siegel), perceptions about regional and non-native accents (Moyer), and the mechanics behind clear speech (Smiljanić and Bradlow 4020; Bradlow and Bent 707).
Very specifically, I focus on the factors that surround audience judgments of imitated accents and their effect on intelligibility, both in the broad voice practitioner sense and in the more narrowly defined linguistic sense. Instead of making assumptions about how audiences perceive a performed accent, I ask audience members what criteria of judgment they were using when they perceived these types of accents. This thread ends with interpretation of these experiments and the exciting implications of how these data will contribute to the re-configuration of the profession of voice and dialect. These experiments will help define what clear speech, and even “intelligibility” as conceived by Dudley Knight, means in the context of performance with regard to audience expectations of understanding, which will lead to a more thoughtful pedagogy that accounts for these audience expectations. These two threads form the basis for the conclusion, where I grapple with my position as a voice professional and propose practices, informed by both the historic notion and the cognitive notion of audience expectation, as a specific guide for voice and dialect professionals. Part of this work is discussing the critical pitfalls of working in a profession situated within a society that holds such strong standard views towards language, and a large part of this chapter addresses the mismatch between the judgments that audience members make in the experiments and the judgments voice practitioners believe audience members make. I specifically begin with a question that guides my work as a vocal professional, borrowing from Amy Cook’s Building Character: The Art and Science of Casting. She asks of character creation more generally, “What does it mean to build characters from the ecosystem up, rather than a more psychologically focused method of character creation?” (117).
This thread answers this question more specifically about building characters as part of an embodied dramaturgical framework that treats dialect and accent selection as seriously as other aspects of theatrical production. As a dialect coach myself, I use specific examples from my own practice that address assumptions about audience perceptions and how audiences might create meaning while watching theatre. As a scholar, I hope that my focus on voice for this dissertation does not lead the reader to the false conclusion that I do not care about embodiment of voice. In fact, this dissertation deeply considers how voices are perceived as they are attached to bodies onstage, and how the combination of these two experiences carries social meaning. I am advocating for a deep consideration of these topics in voice precisely because I think these considerations have been left out of our conversation about representation. By pushing back against the more general assumptions about rationality and cognition, I am able to create a space that more deeply considers the intertwined, sometimes contradictory nature of meaning making and artistic creation in theatre (and in entertainment more widely). As Mark Johnson asserts in The Meaning of the Body (2007):

Our “body” and “mind” are dimensions of the primordial, ongoing organism-environment transactions that are the locus of who and what we are. Consequently, there is no mind entity to serve as the locus of reason. What we call “reason” is neither a concrete nor an abstract thing, but only embodied processes by which our experience is explored, criticized, and transformed in inquiry (vi).

This dissertation aims to illuminate assumptions that run contrary to these embodied processes in order to propose a new approach to voice and dialect practice that both rejects harmful, explicitly racist practices and builds a model that reflects our understanding of human cognition and meaning making.
This dissertation begins with the assertion that we all carry within us a voice that has been shaped and created by the location we grew up in, those whom we called peers, and every linguistic interaction since we acquired language at a young age. My aim is to create a paradigm that teaches students and actors about the complex lived experiences that accompany dialect on stage, while actively working to counteract the harm caused by the problematic aspects of the profession. In essence, I am proposing a type of deeply situated dialect dramaturgy that honors different forms of objective/subjective knowledge and that accompanies the pragmatic aspect of learning the sounds of a new dialect as an essential part of the character creation process.

CHAPTER II
A CRITICAL HISTORY OF VOICE AND DIALECT

“... Physically crossing ethnic borders was relatively easy for me until I entered the world of theatre. There cultural and monetary capital was acquired by entering the dominant culture. To gain entrance, I abandoned my voice.”
Micha Espinosa, “A Call to Action: Embracing the Cultural Voice or Taming the Wild Tongue”

1. Overview

Robert L. Hobbs published the book Teach Yourself Transatlantic in 1986, presenting a dialect that was created for the stage as a secret to becoming a successful individual in society. Appealing to his authority as a voice expert, Hobbs, a “well-known teacher in the field” (xii), claims that his system is not only advantageous for a student of acting but would also lead to success in fields beyond performance. Hobbs uses his authority as a voice teacher to combine the two major goals of voice practice in his arguments for Transatlantic as an appropriate dialect for all aspects of life.

Can accent really make a difference?
Yes—some people claim that it’s the major difference between managements at the upper and lower levels…the way you speak gives an impression—for better or worse—far more lasting than the clothes you wear or the design of your home. If you have upward mobility on your mind, speaking transatlantically can help you blend more successfully into the particular social or professional group of your choice. (Hobbs, 1986, X, emphasis my own)

Hobbs presents his own standard language ideology as immutable fact and wraps that ideology in his authority as a respected vocal coach to sell his book to people who may feel self-conscious about how they sound compared to their peers. Using this book to teach oneself a stage and film dialect from the 1940s is an extreme example of the idea of appropriateness, a theory positing that speakers believe certain varieties of the language or dialects that a community speaks are appropriate in certain contexts and not others. Hobbs and those who subscribe to these approaches conceptualize standardized linguistic practices such as stage dialects as an objective set of linguistic forms that are appropriate for an academic, work, or otherwise successful setting (Flores and Rosa 149). Appropriateness-based approaches center the idealized white listener as the target and aim to create maximum intelligibility based around the expectations of that listener. Marginalized workers and performers are required to learn the appropriate linguistic forms and assess when to use them, rather than asking a listener to accommodate the speaker. Hobbs uses this implicit framework by placing the onus of communication on the speaker, and at no point does he suggest that bosses and other listeners ought to practice listening to the plethora of dialects of their employees. Hobbs inherited this approach to Transatlantic, and the belief that changing one’s speech patterns can lead to advances in life, from a long line of voice professionals.
This thread lays out three major waves or generations of voice and dialect practices in United States theatrical production in the 20th and 21st centuries.9 Key philosophies and approaches to voice and dialect training shape each of these waves, and they reflect approximate successive generations of teachers and students who apprenticed under their preceding teachers, developing their own materials from their prior training. In part due to this loose apprenticeship structure, the three segments of voice training do not necessarily have distinct chronological borders. Each successive generation inherited the voice philosophy of the last, which fueled its own problematizing and the creation of its own view of the voice. Because of this, this chapter is roughly divided chronologically, but the sections will focus on key practitioners from each era. As in oral histories of families, strict chronologies do not matter as much as who taught whom, who learned to accept the ideology from their teacher, and who decided to push back against their teacher.

9 I will also be addressing British voice and dialect norms to the extent of their direct influence on practices in the United States.

Coupled with these shifts in approaches and teaching philosophies are the changes in the material circumstances of each successive generation, with the establishment of industry and educational norms. The approaches to voice philosophy and to theatrical production in the United States are intertwined enough that presenting them side by side will demonstrate their effect on each other. The standards of one era of training, in both philosophy of voice and material circumstances of production, become the essential problems and questions of the next era. Each of these waves contains key influential practitioners who help define the overarching approach to voice.
My historical study is limited to voice practitioners who have written instructional materials and scholarship that document their particular approaches and who are often cited as touchstones in the practice of voice. This study includes both approaches to voice instruction more broadly and practitioners whose area of instruction includes dialect coaching. Dialect coaching is defined as training an actor in a dialect or accent that is not their own (including dialects that are intentionally created and do not have a real-world equivalent) and can be a specialization of voice teachers who practice more broadly. I am defining voice practitioners as any instructors who train performers in any aspect of the voice, including vocal anatomy, breath work, movement (especially as it pertains to preparing the body for performance), and articulation of the vocal apparatus. The definition of voice practitioner is broad and can include theatrical, film, singing, and dialect instruction. This history includes both voice practitioners in the broad sense and dialect coaches specifically, to set the stage for the linguistic experiments and subsequent best practices that focus on the issue of dialect training. To attempt a history of dialect coaching without situating it within the larger voice profession would be an exercise in futility, especially since so many dialect coaches employ general voice practice in their work.

The first generation of practitioners, spanning the first half of the twentieth century, I will name the Elocutionary phase, due in part to the influence of several elocutionists who did not start specifically in performance or theatre, though some may have transitioned into it later in their careers. Philosophical advancements of the 1800s, including Semiotics and new scientific approaches to acting, heavily influenced the thinking of these practitioners.
This elocutionary era saw the foundation of several influential schools and departments in higher education, establishing the authority of the brand-new profession of voice and elocution. Practitioners in this era did not shy away from explicit stances on the linguistic supremacy of the English spoken by the white middle-class majority that created most popular entertainment. This phase also saw a change in preferred entertainment and media tastes, shifting from live performance to film (Elliott 140).

The second generation, students of the first generation who benefitted from key structural changes to the institution of theatre making and became “master practitioners” in their own right, created a response to the strict standardization of speech by exploring psycho-social and anatomical approaches to freeing students from tension. This generation’s focus on freeing tension coincided with an explosive growth of regional theatres, thus further embedding the profession of vocal training into the vast network of regional theatres and the larger economic arm of theatrical production in the United States. These regional theatres have become an integral part of the theatrical landscape in the United States, in particular through the birth of the League of Resident Theatres (LORT) system of theatrical creation (Zazzali 192). The philosophical underpinnings of the instructors in this time period attempt to break free of the partitioning of the mind and body into separate entities.

The third era takes the idea of freeing tension and claims to use a more scientific approach to the practice of voice.
Successive practitioners in the first twenty years of the twenty-first century have shifted their pedagogy to create context- and culture-specific approaches to voice that embrace embodied realism, where the practitioner and audience member alike are constructing meaning through individual and collective enacted experience of the world around them. The intergenerational shift between these successive voice practitioners is in part a result of the heavy use of the master/apprenticeship model of knowledge transmission, and thus some overarching conceptions of voice have succeeded in influencing the practice even today.

While other authors, including Dudley Knight (2000, 2012) and Derek Mudd (2014), have written versions of the history of Voice and Dialect practitioners, this dissertation’s version has a specific critical focus. The critical lens of this brief history of voice and dialect practitioners arises in two main themes, both directly related to the “standard language ideology” of Lippi-Green (9) and the “vocal imperialism” of Rodenburg (14). Voice instruction as a profession (regardless of generation) supports two explicit goals: one where a student is stripped of their particular idiolect or unique way of speaking, and the other where that student is then encouraged to use one or more dialects that have been simplified or sanitized. Both of these goals are pursued in the name of audience intelligibility, as estimated by the trained voice professional. Each generation of voice professionals has held these two goals, either explicitly or implicitly, which has had the potential to harm students in the process.
For example, the first goal in the elocutionary generation manifested in practitioner Edith Skinner’s classroom as her infamous day one exercise, where she would invite each student to pronounce their name and then correct the name into her proprietary “Good American English,” implying that the student’s own pronunciation was unacceptable (Skinner et al. 20). This first goal manifests differently in contemporary practices and can include various microaggressions and the social and economic barriers that institutions put into place that prevent a student actor from accessing voice training in the first place. Each era, through its relationship with higher institutions of learning and the economic realities of theatrical production, presents a way to affect students’ voices such that their home dialect or accent is not welcome on stage.

The second goal for these voice and dialect professionals appears more explicitly when a second dialect or accent is needed for the stage, whether it is a standardized dialect or a foreign or regional dialect. Historically, voice coaches taught imitated dialects in a way that strips the individual dialect of its nuance and complexity in the name of intelligibility or audience understanding. These sanitized dialects are reflections of stereotypes, or standardized linguistic ideas, that society creates through associating meaning-making with how people of a certain race, class, gender, or language background sound.
Students whose own voices do not match the expectations of intelligibility in performance are often faced with two issues of linguistic representation: being silenced in the voice training classroom, and then sometimes being asked to exoticize their own accents to fit the stereotyped expectations of dialect coaches, casting coaches, and directors.10 For example, Asian American actors like John Cho are still asked to put on highly stereotypical East Asian accents to portray Asian characters in the movies in which they are cast. Aware of the historically problematic representation of Asian characters in particular, Cho explained of his acting choices that he did not “want to do this [one] role in a kid’s comedy, with an accent, because I don’t want young people laughing at an accent inadvertently” (Sullivan). Black, Indigenous and Actors of Color like Cho are often keenly aware of the harmful stereotypes they are asked to perpetuate by voice and dialect professionals and directors in theatre, film and television. Marginalized actors are often put into the unenviable position of advocating for themselves and their identity groups against the overarching power structures of performance creation that favor stereotypical presentations of foreign and regional dialects accompanied by negative presentations of race, gender, and class. In this environment, marginalized actors do not possess enough power in their workplaces to push back against the tendency to simplify and present stereotypical accents.

These two goals in dialect and accent (subtracting undesirable linguistic traits or behaviors and replacing them with sanitized or stereotyped versions of language) are ways of erasing authenticity, and therefore embodied linguistic knowledge, for the voice student or performer under the care of voice professionals. These goals have remained prominent in part due to the unchallenged authority of the vocal or dialect coach as a singular expert in all matters of the voice.
This authority rests on a semi-scientific knowledge and appeals to audience intelligibility or ease of understanding for the audience. This appeal to intelligibility also feeds back into linguistic supremacist ideas that some speakers are already more naturally intelligible than other speakers, which often align with other supremacist ideas about race, gender, and class. This approach often claims that there is a voice that is appropriate for performance that does not match the voice of the actor, and thus they must learn the linguistic forms and policies that govern this appropriate way of speaking. Throughout this history of voice practitioners, I will highlight the ways in which key practitioners use their authority as experts in language to further their agenda.

10 I discuss a case study of contemporary Latinx students in the conclusion of this dissertation.

The following history of voice and dialect foregrounds the assumptions that practitioners in each successive generation used to build their profession and also highlights the material circumstances behind each approach to voice training. The material circumstances are an important piece of the story of this profession, as theatre has become an institutionalized piece of an ideal listening apparatus, having been privileged as a form created specifically for overwhelmingly white, middle-class audiences (Bennett 114). I trace the lineage of these assumptions through the different generations to construct the base for the contemporary understanding of and approaches to voice and dialect work. The following sections will demonstrate that the way voice training has been structured historically has catered nearly exclusively to an appropriateness-based approach to audience, at times explicitly privileging constructed linguistic forms over natural or spontaneous ones in service of maximum intelligibility for audience members.
Theatre as practiced in the United States during this time is the ideal white listening apparatus made manifest and is therefore a site ripe for intervention against appropriateness-based approaches to voice training, both now and in the future. Creating this opportunity to push back against prior generations of training (in the tradition of those who have come before) will set up a different view of how audiences may create meaning through their social expectations of voice independent of voice professional intervention, which establishes the foundations for the experiments in the following thread.

2. Learn to speak “Good” American English: Elocutionists 1900-1950

2.1 William Tilly teaches World English

The first generation of modern voice professionals begins with the elocutionists of voice, in an era that begins at the turn of the century and continues through approximately the 1950s. This generation is marked by elocutionary teachers who were not necessarily associated with theatre or film performance but were imposing strict standard English practices on their students in an effort to create speakers who were successful in life as well as performance. Most practitioners’ goal was to create permanent good speech in their students according to their own standards, reflecting their own view of what good speech ought to sound like down to the most minute phonetic detail, without producing much evidence as to why that speech was supposed to be better than other types of speech. The progenitor of this era, with foundational writing and training in speech, is William Tilly, who, through his obsession with capturing fine phonetic detail from speakers of English, also contributed to the creation of the International Phonetic Alphabet11 (IPA) by the International Phonetic Association (Knight 32). In this era, practitioners used strict phonetic transcription as the learning model for students.
Many students of William Tilly, such as Edith Skinner and Marguerite DeWitt, practiced narrow transcription of IPA as a way to capture precise detail of speech and advocated for a strict approach to the English language through this transcription style. Narrow transcription is the practice of transcribing linguistic sounds with as much phonetic detail as possible, where each letter that represents a linguistic sound (e.g., a phoneme, discussed in detail in the next chapter) can feature diacritic marks that indicate slight differences in pronunciation due to the location of the letter within the structure of words, and of course due to differences in the dialect of the speaker. With this system, each phoneme represents the ideal pronunciation of the sound, and every diacritic added represents a failure at achieving that ideal.

11 This alphabet is still in use by thousands of linguists around the world.

While elocution as a formal profession arose near the turn of the twentieth century, it owes most of its origin to the practice of rhetoric, the art of persuasive speech in the realm of public speaking. Rhetoric formally begins with Aristotle, and the art of instruction for public speaking takes many forms throughout history. The direct English descendant of rhetoric that contributes to elocutionary studies begins in the middle 1700s. The Art of Speaking, published by James Burgh in England in 1762, kicked off an elocutionary movement in the United States. This book would inspire other texts whose goal was to instill proper persuasive speaking in the public sphere. Two competing schools of elocution would arise at this time, with competing philosophies or approaches to the role of the voice in public performance.
One such school, known as the Mechanical School, taught countless students to align gestures and expression with speech in order to appear persuasive in the public sphere, though without much emphasis on connecting to the emotions underlying the speech. As a result of this school of elocution, in 1827, James Rush, a U.S. medical doctor, published A Philosophy of the Human Voice, based on his work on the anatomy and physiology of the human voice, according to the medical knowledge of the time (Mudd 31). This would mark the first instance where a voice practitioner would appeal to the authority of the fields of science and medicine to lend a veneer of authenticity to the claims made within the book. Proponents of the Mechanical School were highly influenced by Denis Diderot’s The Paradox of Acting (1883). In his treatise, Diderot claims that the role of the actor is to recreate the forms, gestures, and habits of characters without the actor becoming emotional themselves (Roach 116).

Soon after, another school of “expression” would rise in opposition to the Mechanical School of elocution. The expression school of elocution was influenced heavily by Romanticism, Naturalism, and other philosophical movements that arose in the same era. These teachers were interested in the interior expression of a speaker and warned against the external and mechanical nature of the school before them. The expression school of elocution further split into two fields, namely oral interpretation and actor training (Mudd 31). At the same time as these competing schools of thought, a young William Tilly formed his school of elocution in the late 1800s, which leaned into the Mechanical aesthetic. William Tilly grew up in Australia in the 1860s and 1870s and moved to Germany.
Having already established his school for elocution in Germany in the 1890s, Tilly moved to the United States in 1918, right when practitioners in the school of expression split the profession between oral interpretation and acting (Knight 32). While the school of expression found homes in English and speech departments, and eventually in fledgling theatre departments in universities in major cities along the east coast of the United States, Tilly’s sights were set on the scientific interpretation and precise expression of language, and he subsequently found the humanities limiting to his vision. Through his school, Tilly attracted many students who wished to master English as a second spoken language in all arenas of life, believing that they were carrying linguistic deficits. The United States at this time was undergoing a massive shift in demographics, with more than 15 million immigrants arriving in the years between 1900 and 1915, a number equal to the number of immigrants who had arrived in the previous 40 years (“Immigrants in the Progressive Era”). Due to this, Tilly had many eager students striving to assimilate to their new homeland.

Dudley Knight describes Tilly’s chief reform, one that was passed down to many of his students, most of whom became very influential in the field of voice, as “his attempt to teach the pronunciation of English as a spoken language, and not as a written one” (Knight 33). To assist with his goals, William Tilly was one of the first elocutionist practitioners to create a wholly artificial dialect that he taught to his students. Subsequent students of Tilly’s would call his system World English or even World Standard English (Knight 34). Through his advocacy for narrow transcription, Tilly set the stage for fanatical adherence to how English should sound in every arena of public performance (and by extension, private communication).
His students, including Marguerite DeWitt, Margaret Prenderghast McLean, and later Edith Warman Skinner, would carry this fanaticism through their own teaching and strict adherence to detail. With books such as Speak With Distinction (Skinner et al.) and Euphon English (DeWitt), elocution teachers framed their ideologies with their narrow instructions on how to sound in real life. Implicit biases against speakers whose first language was not English often became explicit, as evidenced by the introductions in several of these books. The explicit ideologies professed by voice practitioners put them at odds with other disciplines that were examining language in a systematic and scientific way.

The debate between those professing that language ought to be pronounced in a certain way and those who merely wished to observe and describe the varieties of language found in the world became particularly fierce in the 1920s. On one side of this debate were Tilly and his students, with their fanatical adherence to speech standards, while on the other side were linguists and linguistic anthropologists beginning to establish their fields and academic departments in universities. Anthropologist John Kenyon, having just published his influential textbook American Pronunciation, advocated fiercely for the equality of the different dialects heard in the United States and argued against the use of standard dialects, especially ones promoted by Tilly based on class and access to different elocutionary and training techniques (Mudd 35). Through this feud over standards in United States English pronunciation and dialect, many followers of William Tilly revealed their own biases against speakers with no formal training, especially those for whom English was not their first language. Elocution practitioners were expressing fear of a polyglot United States where diversity and difference are valued over unity and homogeneity.
These ideas bore strong resemblance to racist ideas expressed in other social spheres in this tumultuous time in United States history (Kendi 145). One passage from Tilly’s student Marguerite DeWitt in the introduction of her book Euphon English highlights this explicit prejudice, even fearing the dissolution of the United States entirely:

To squander national vitality and money on that which will but cause biological disintegration of a nation is not the philanthropy; to infuse into a body politic blood that destroys the racial blood of a nation is not the deed of a rational healer; to foster the growth of parasites on a national tree of education and knowledge is not the work of advanced sociologist. (DeWitt, qtd. in Knight 40)

DeWitt refers here specifically to her reluctance to educate immigrants to this country, using particular supremacist phrases like “racial blood of a nation” and claiming that such time spent in education is equivalent to fostering parasites that would otherwise be unwelcome in a racially pure United States. These racist ideas, however, were not limited to the practice of elocution, though finding them at the very root of this discipline should not be dismissed as incidental to the time in which these authors wrote. To demonstrate how these explicitly racist ideas became ingrained in the work of performance, I turn now to a practitioner who bridges elocution in general to performance in theatre specifically.

2.2 Edith Warman Skinner bridges performance

One student, who can trace her lineage directly to William Tilly and his school, would become particularly influential in future approaches to voice in the twentieth century and would make the connection between elocution generally and performance specifically. Edith Warman Skinner, like many in the expression school of rhetoric, realized the power of vocal training in the life of an actor (Skinner et al. xi).
Simultaneously with this jump to speech training in performance, acting training was shifting away from the teacher-apprentice model of touring theatre troupes, where actors were immersed in on-the-job training, and towards a more formal site of education through new partnerships with higher education institutions. For instance, the first theatre12 department in the United States was established at Yale University in 1925, on a similar timeline to Tilly’s growing influence (Berkeley 23). In these burgeoning theatre departments, the expressive school of elocution was winning, and theatre practitioners were employing new techniques by Konstantin Stanislavski to couple emotion and action with text. The actor was to consider the word as “verbal action” and no longer as literary form (Moore 69). Skinner had been trained as an actress at the Powers school, so she had an interest in combining the elocutionary lessons she was learning from her teacher Margaret Prenderghast McLean with her work as an actor (Knight 43). Skinner became the speech instructor at Carnegie Tech’s theatre training program in 1937. Skinner established herself as one of the premier speech trainers for theatre in America, not only because of the large number of famous actors she worked with but also because of the number of speech trainers that she taught. Her legacy as a speech trainer can be seen in the generations of contemporary speech trainers who can and often do trace their lineage directly back to her and her work at Carnegie. Edith Skinner would go on to hold appointments at both Carnegie Mellon and the Juilliard School. After World War II, many soldiers returned home from the war and took advantage of G.I. Bill benefits, thereby flooding the American university system, which allowed them the freedom to pursue disciplines in the humanities that were not necessarily immediately lucrative.
12 Standards for language extend to the never-ending debate over the ending of theatre/theater. Apart from regional differences (British versus American spelling preferences), some American institutional bodies use Webster’s dictionary as an appeal to authority to argue for the -er spelling. My preference is to refer to the practice as “theatre” and the space or location as “theater.”

Theatre departments developed throughout the country and would often focus on training for actors. Within this environment, Edith Warman Skinner found the necessary conditions to create a system similar to those of her teachers before her, a system that would become the standard for theatrical production and would heavily influence speech in film. Skinner’s system, named Good American Speech, presented what Skinner described as the most intelligible type of speech for performance (Knight 44). Skinner also taught her own version of the International Phonetic Alphabet, often combining her own cursive symbols with standard symbols for sounds and demanding that her students practice an exact copy of her own work.13 Through this, Skinner was able to create a proprietary system that required rigorous study with her and her designated students, thus establishing the practice of creating exclusive systems that require particular access to training. Though Skinner’s text Speak with Distinction was published posthumously, her unofficial notes and voice approach were shared between departments, always with careful attribution to Skinner and her system.

Skinner’s work eventually formed the basis for the Mid- (or Trans-) Atlantic dialect, a dialect that was used by many film and stage performers throughout the twentieth century, eventually losing popularity in the mid-1980s (Elliott 105). This dialect is recognizable in many Hollywood stars such as Audrey Hepburn, Judy Garland, and Marlon Brando.
Skinner constructed this dialect with the aim of maximum intelligibility, out of a mixture of dialect features from British varieties of English, most notably the absence of /r/ in particular environments and the use of broad /a/ (the initial vowel sound in “father”), and the rhythm of the speech of higher-class white residents of the Eastern coast of the United States. Eventually, this accent enjoyed the prestige status of being the most intelligible and preferable dialect for both stage and screen, due to the fanatical advocacy of Skinner’s students.

13 See Michael J. Barnes, “A Critique of Phonetic Transcription in American Actor Training,” in Standard Speech: Essays on Voice and Speech, page 100, for diagrams of Skinner’s use of the IPA.

Skinner’s system would remain popular with students even into the twenty-first century; the first edition of the Voice and Speech Review, in 2000, would feature no fewer than six articles on and rebuttals to Skinner’s system. Skinner continues to be one of the most influential early figures for voice professionals, whether they subscribe to her ideologies or push against them. Notable among Skinner’s students are Tim Monich, a dialect coach beloved by Hollywood and covered later in this chapter, and Sanford Robbins and David Hammond, both of whom would figure heavily in the eventual formation of the professional organization for voice professionals. Skinner’s Transatlantic is often still held up as the proper or correct approach to Shakespeare14 or classical work in particular (Hammond 143). The association between Skinner’s Good American English and performing classical works such as Shakespeare is incredibly strong; facets of this accent can be heard in the stereotypical pseudo-British Shakespearean dialect that students, and those poking fun at Shakespeare, often drop into while performing classical texts.
Lippi-Green’s standard dialect ideology is at play here, since many audience members and inexperienced actors strongly associate a prototypical accent with a particular type of dialect or accent. Linguistic stereotypes can be strongly associated with context when enough language users employ the same dialect consistently.

14 I often point out that the stereotypical accent used to poke fun at Shakespearean theatre sounds relatively close to the Transatlantic accent Skinner taught in her classrooms. This is especially ironic given David Crystal’s work in Original Pronunciation (OP) of Shakespeare which, as it has been historically reconstructed from Early Modern English, does not share many linguistic traits with Skinner’s system. For an excellent audio comparison between RP/Transatlantic and the OP of Shakespeare, see the video “Listen to a demonstration of the original pronunciation of Shakespeare's English and how it differs from modern English” on Encyclopedia Britannica’s website: https://www.britannica.com/video/187707/David-Crystal-pronunciation-Ben-Elizabethan-English-British

Concurrent with Skinner’s strong adherence to linguistic standards is the work of Arthur Lessac, who would become the progenitor of the next generation of voice professionals, who eschew overt standards for an individualized psychosocial approach. Edith Skinner and her contemporary Arthur Lessac would rise to be two of the preeminent theatre voice teachers in the United States (Mudd 30). The major difference between Skinner and Lessac was Skinner’s emphasis on “standardized speech,” while Lessac advocated for a more individualized approach to voice in performance that examined each voice student.
Arthur Lessac’s work as a voice teacher would mark a shift from standards and elocutionary practice to a more individualized approach to voice, though Lessac’s goal to “produce beautiful voices” and “clear, articulate speech” still bore the hallmarks of standardization (Lessac 114). Lessac’s standardized voice did not have to conform to a narrow phonetic or overly prescriptive approach to voice. Lessac’s class had more room than Skinner’s for voices outside of the narrow band of approved students, but not to the extent that every voice was welcome in the training classroom. Each speaker still had to adhere to a standard of “clear, articulate speech,” though now that standard was not made explicit through precise use of phonetic symbols and rote drills. Lessac’s work, however, did pave the way for the next era of voice training, guided for the most part by Kristin Linklater’s landmark work Freeing the Natural Voice, covered in the next generation.

2.3 Mimesis vs. Semiosis: Establishing dialect coaching as a profession

Near the end of this elocutionary era, a different set of voice professionals, guided by overt language ideologies similar to those of Tilly and his students, staked their expertise on dialect instruction for voice in performance, specifically with dialects that were trained and did not necessarily originally belong to the performers themselves. The second goal of historic dialect training, to produce sanitized and easily acquired stereotypical accents, guided these practitioners as they established their authority on dialect coaching. In this era, the husband-and-wife team Lewis Herman and Marguerite Shalett Herman published two seminal texts on dialects, Foreign Dialects (1943) and American Dialects (1947).
The original subtitle for both of these texts reads A Manual of Dialects for Radio, Stage, and Screen, implying the dialects presented were not only representative of the countries and regions they claimed but also suitable for a plethora of public performance scenarios over and above theatrical presentation. The text on foreign dialects in particular was intended to “help the actor prepare for the most difficult foreign role and offer the director or producer a convenient aid for correcting actors and evaluating applicants for authenticity and dialect ability” (Herman and Herman back cover). Presenting themselves as authorities on these dialects, Herman and Herman compiled this material during more than twenty-five years of acting, writing, and teaching across Europe, New York, and Hollywood. Their texts became canonical dialect and accent texts for producers, writers, and actors alike. In Foreign Dialects, Herman and Herman not only describe how the accents mechanically worked, but also describe the stereotypical stress patterns (described as “lilt”) of an average speaker. They also provide grammar expertise for common mistakes produced by these speakers, in service of providing what they believed was a more authentic example of dialects for playwrights and screenwriters. What Herman and Herman believed were mistakes in the dialect are actually valid and real differences in dialects that approach grammar differently from American English. Because they were approaching dialect from an ideology that held American English up as the standard, any deviations from this particular dialect were described as mistakes. Despite this, Herman and Herman frame this book as a helpful or neutral guide to foreign dialects.
However, in their introduction, Herman and Herman reveal their preference for American English for stage and film:

The art of the dialect is the twin art of being consistent with the fundamental and radical changes and of being consistently inconsistent with the less-important [sic] changes…if the dialect is to be very light, the radical changes may also become inconsistent. But if this point is reached, the character will be speaking an almost perfect American speech. (Herman and Herman 15, emphasis is mine)

Writing in 1942, Herman and Herman selected accents that were in demand from producers of stage and film and revealed their biases for and against several varieties of accented English through their selection and organization of these accents. For example, Herman and Herman divide British English into two chapters, assigning Cockney English its own chapter, and then assigning Australian English, Bermuda English, and the “Dialect of India” to the other chapter (Herman and Herman 65). That Herman and Herman consider Indian English15 a form of British English is a damaging and implicit reflection of their ideas on the colonization of India.

15 According to Babbel, there are 22 official languages in India, and well over 19,000 languages and dialects, so reducing a subcontinent to only one dialect reveals multitudes about Herman and Herman’s attitude towards this country.

In addition to the grammar and rhythm considerations, the original publication of the book also described the characterization of an average speaker of that particular dialect. For example, British speakers are characterized as stolid, resistant to change, and “unbrilliant... With temperate habits, and temperate emotions” (Herman and Herman 52). Chinese speakers are characterized as industrious, frugal, devoted to family and country, and as having “a proverb for every occasion and a wide grin to accompany it” (Herman and Herman 245).
Each chapter, therefore, explicitly trains speakers and indoctrinates them into accepting that speakers of a certain dialect are inextricably linked with character traits both positive and negative. These characterizations are particularly egregious for the various East Asian accents presented in the book, since, when this book was first published, Japan was one of the enemies of the United States in World War II (Royde-Smith). Angela C. Pao writes specifically about the portrayal of Japanese accents in the original 1943 publication:

The opening lines of the chapter on the Japanese dialect blatantly signal the wartime substitution of subjective prejudices for more objective observations: ‘Unfortunately, the Japanese military has caused the people of the other nations to brush the cherry-blossoms from their eyes and to thinking of these little, yellow men in unmentionable expletives. Their overpowering politeness has currently taken on a sinister threat, and their wide toothy grin, an ominous leer.’ (Herman and Herman 225, qtd. in Pao 358).

By painting these prejudices and their overall approach to dialect with an objective or scientific veneer, Herman and Herman established the practice of dialects for film and theatre as a neutral or even beneficial contribution to entertainment. Using an appeal to authority grounded in more than twenty-five years in the business, Herman and Herman could parrot explicit racial prejudice as scientific fact, and as necessary information for producing commercially viable yet ultimately harmful dialects. These accents and this training manual, therefore, contributed to popular entertainment’s construction of ethnicity and race. Accents and dialects are no longer neutral indicators but carry a constellation of meaning that includes stereotypes that audience members can incorporate into their artistic experience, according to what Robert Hodge and Gunther Kress describe as Social Semiotics (1988).
In this publication, Hodge and Kress posit that due to the human capacity to assign meaning to every experience, every perceptible detail is available for meaning-construction. Pao states:

We are in a society that assigns character traits (i.e. meaning) to how people sound and to deny that fact is negligent behavior on behalf of voice and dialect professionals everywhere...Succinctly stated, accents of all kinds (foreign, regional, class) function not on the mimetic plane (to which dialect coaches refer on almost all fronts) but on the semiotic plane (the production of meaning.) (Pao 359).

Herman and Herman built the argument that dialect coaches are offering their services as a reflection of the mimetic conception of accent, while denying or deemphasizing the semiotic use of accents to produce meaning in the minds of audiences. Nevertheless, Herman and Herman still offered stereotypical characterizations of the speakers. While overt racial characterization descriptions are omitted from the 1997 republication of this text, implicit characterizations of these accents remain throughout the text, through descriptions of lilt, mouth position, and Herman and Herman’s attempt at delineating different types of accents within each chapter (Pao 364). Anti-Asian sentiment in particular is present in surprising places, including descriptions of the Filipino dialect that “resembles Pidgin English. In fact, it has, for this reason, been called Bamboo English. But the pronunciation is based mostly on the Spanish with some infiltrations of Malaysian” (Herman and Herman 190, emphasis author’s own). Herman and Herman, while careful to separate Chinese and Japanese dialects, flatten the diverse history of trade in Southeast Asia by privileging the influence of the European colonial language over and above that of a neighboring country.
Herman and Herman then continue to flatten the taxonomy of different speech areas by assigning the “Portuguese dialect” to a sub-area of the chapter on Spanish dialects. Perhaps the most egregious examples of overt racism and overgeneralization exist in the practice monologues provided at the end of each chapter. For example, the monologue used in the Chinese dialect chapter still refers to an obedient Chinese owner of a dry-cleaning business, “grinning widely at a customer” as he says, “Ticket, please? Thank you. Me got wash finish… Maybe you put change in China Relief box, no? That for China people Japan make hurt. That for make world safe for democracy, no?” (Herman and Herman 258, qtd. in Pao 359). Editors at Routledge erased the overt racial linguistic imperialism, yet the text still subscribes to these ideologies through implicit means. Herman and Herman established what they thought was a neutral and empirical approach to dialect coaching, without acknowledging that they were relying on their authority as experts and building guides for accents according to their subjective perception of the speakers of these dialects. Dialect coaches have denied their participation in the semiotic meanings of these accents by focusing on the mimetic aspects of these dialects. Performance is created in a society that assigns character traits to how people sound. To deny that fact is a misstep for not only performers but also voice and dialect professionals. Later dialect coaches and other authors of books similar to Foreign Dialects and American Dialects grapple with this dichotomy of mimesis/semiotics with varying degrees of success. What remains from this time period, however, is a staunch adherence to the idea that the art of dialect coaching is a precise and near-scientific pursuit, masking the more dangerous prejudice and racial stereotyping that this profession replicates and perpetuates throughout entertainment.
In the next section, even dialects presented as neutral contribute to these dangerous ideas about the voices that produce these types of accents.

3. Freeing Tension and the rise of regional theatre

The second wave or generation follows the first era, but many of its ideas attempt to push back against the overt standards of the previous generation. This second wave of thought came to prominence in the 1950s and remained popular until the early 1990s, when voice practice was changed dramatically by the advent of the internet and the relative ease of access to knowledge. Practitioners Cicely Berry, Patsy Rodenburg, and Kristin Linklater were key players in transforming the profession from strict adherence to standard stage dialects to an individualized approach that aimed to consider the actor as a whole psycho-social being that requires individualized care and consideration (Mudd 40). Specifically, Linklater’s text Freeing the Natural Voice (1976) became a touchstone text (partially due to other practitioners who did not write and widely disseminate their materials) for this generation of voice professionals. In this era, practitioners pointed to the need for deep physiological and sometimes psychological work in order to free the body from tension, thus releasing the voice from the newly un-tensed body, implying that the ideal body and voice vessel is without the cultural and individual habits that are carried in actors’ bodies. The underlying ideology from this era presents an interesting new re-interpretation of the first goal of voice practitioners, which aimed to erase individual identifying characteristics for actors. As Rockford Sansom notes, “...their work demonstrates a seismic shift in ideology away from elocution to a praxis desiring an actor’s authentically personal expression and interpretation” (Sansom 159).
From this type of personal expression arose the idea of authenticity onstage, which shares an uneasy relationship with mimesis/semiotic representation for the audience. The audience can read the same linguistic performance as simple mimesis and simultaneously as a stand-in as part of the gestalt of the larger meaning apparatus of the performance as a whole. For this generation, I have chosen to analyze in depth the philosophies of three practitioners in particular as the most widely used examples of the philosophy that governed the voice profession at this time: Cicely Berry, Patsy Rodenburg (already introduced in the previous chapter), and Kristin Linklater. These three practitioners claim that to achieve authentic personal expression and by extension success in performance, students must release chronic tension in their bodies, since after all, the voice of the actor cannot be separated from their body. The success of this release is still determined by the expertise of these practitioners within their individual proprietary systems. Actors who are more bodily able to release tension get to enjoy the benefits of voice work. Voice professionals, therefore, privilege the voice that inhabits some idealized, unmarred vision of non-tension. In other words, the work of these practitioners still privileges some standard, idealized homogeneous body release from tension to produce optimal voice work16. Often, this type of voice work is accompanied by body/breath awareness work such as yoga, Feldenkrais17, and Alexander18 technique, all in the service of release of muscle tension and fuller awareness of the body (Moore 101). It should be noted that Feldenkrais and Alexander techniques both seek to eliminate what has been deemed “harmful” tension in the body, while practitioners of these systems still claim that some tension (e.g., when an actor’s vocal folds are activated in phonation) is acceptable.
The idea of acceptable approaches to body support of the voice is still heavily regulated by students of these practitioners. This era is also marked by the unprecedented boom of the profession in two key performance markets, regional theatres all across the United States and the United Kingdom, and also by the professionalization of the position of voice/movement professor in higher theatre education (Zazzali 47). This professionalization was due in part to two sources: the explosion of federal and grant funding available to regional theatre companies, and the creation of the Voice and Speech Trainers Association, which assisted in codifying expectations for these positions. The creation of such positions thus legitimizes the authority already created in the previous generation and the need for voice and movement practice for aspiring actors in the United States.

16 See Louis Colaianni’s interview, pp. 69-81 in Voice and Speech Training in the New Millennium by Nancy Saklad, for an in-depth discussion of the implications of what type of voice or body gets to be perceived as free of tension.

17 The Feldenkrais Method, created by Moshe Feldenkrais, is a system of body movement and awareness that uses gentle movements to ease tension. According to the method’s website https://feldenkrais.com/about-the-feldenkrais-method/, “The Feldenkrais Method is based on principles of physics, biomechanics, and an empirical understanding of learning and human development.” Through learning this system, “Since how you move is how you move through life, these improvements will often enhance your thinking, emotional regulation, and problem-solving capabilities.” Feldenkrais, like many practitioners of this era, used claims to science to sell a proprietary movement method as life-enhancing for their students.

18 Similar to the Feldenkrais Method, the Alexander Technique is another proprietary system of movement that makes similar claims of easing tension and creating movement “as nature intended.” According to the Alexander Technique official website https://alexandertechnique.com/fma/, Australian actor F.M. Alexander developed this approach to movement when he was experiencing chronic laryngitis whenever he performed. Alexander credited relieving the tension in his neck and body as the secret to his recovery from laryngitis and developed a system to ease muscle tension from his personal experience. There is a second dissertation’s worth of commentary about these movement systems, which also cover their proprietary systems in a pseudo-scientific veneer and take advantage of marginalized populations in similar fashion to the voice practitioners of this era.

This generation, like Skinner and her students before it, is also marked by the branding of specific approaches to voice training that would take hold in theatre departments across the United States. Continuing the work of Herman and Herman, Jerry Blunt created his own system for dialects, releasing Stage Dialects in 1967. Accompanying this guide was an innovation in dialect study: the book was released with practice tapes for the dialects featured within the chapters. The debate between approaches to dialect instruction in this era was between two camps: one camp advocated for use of spontaneous dialects “from the field”–using tapes of actual dialect speakers–while the other camp still advocated fiercely for the use of example dialects produced by dialect coaches themselves (Blunt viii). The implications of this debate would reverberate in the following years of dialect coaching into the twenty-first century.
Blunt also made claims about the qualities of the accents and dialects he sought out for his definitive yet limited guide on dialect, revealing an interesting paradox about the types of accents and dialects that are privileged above others. Blunt carried many standard language ideologies established by Herman and Herman into this generation of dialect instruction.

3.1 Berry and Linklater and the British voice training revolution

Kristin Linklater, who was a student under Iris Warren and who taught at many different institutions including the London Academy of Music and Dramatic Art and NYU, would publish her first book, Freeing the Natural Voice, in 1976 and kickstart a revolution in voice training. This book would become the foundation of the Linklater system of voice training, which has over one hundred master teachers working today (Linklater Voice). Linklater’s book became one of the most influential texts on voice through a combination of the number of prestigious positions Linklater would hold throughout her fifty-year career as a voice professional and the emerging actor training philosophies that privileged the individual psychological experiences of the actor. Linklater credited her individual approach to her work with legendary voice teacher Iris Warren, often quoting Warren’s training philosophy and mantra, “I want to hear the person, not the voice” (Linklater Voice). Other individual approaches were heralded as the new definitive way to train the voice; other highly influential practitioners in this era were Cicely Berry, whose Voice and the Actor (1973) preceded Linklater’s book, and Patsy Rodenburg (The Right to Speak 1992). All three practitioners promoted the idea of psychological work for the actor to “free” them from the habitual tensions that society has placed upon the actor’s body.
Rodenburg’s “vocal imperialism” constitutes the most socially aware ideation of this concept, while Berry and Linklater chose to reference this type of tension more obliquely as something more value neutral that leaves out intersections of class, gender and race. This shift away from practicing a constructed standard of acceptability for onstage speech in elocution and towards true individual freedom signaled attempts at a radical re-imagining of the voice profession. Practitioners in the previous generation wrote openly about their linguistic supremacist ideas, whereas this generation worked harder to include more voices in their work and classrooms. This shift was part of a larger movement towards a goal of the democratization of theatre, as theatre makers such as Peter Hall, Peter Brook and Trevor Nunn influenced the new “so-called radical” Royal Shakespeare Company of the 1960s and 1970s (Knowles 95). Linklater, Berry and Rodenburg sought to democratize the voice in response to what they thought were oppressive practices by the elocutionists before them. While they often sought freedom, they would often employ notions that accomplished the opposite, by seeking authenticity and insisting on their own version of intelligibility. In her book Voice and the Actor, Cicely Berry constructs intelligibility as the direct natural result of the actor’s conscious work towards relaxation and freedom in their training. Within her book, actors are repeatedly urged to “allow the words to do their own work,” so that if successful, “the meaning will be clear” (Berry 108). In this book, Berry defines the meaning of the words of the play as the original intention of the playwright or author, which means the actor’s job through voice work is to become the most neutral vessel possible through which the original intentions of the playwright may be read by the audience.
This goal, therefore, privileges bodies and voices that are already closer to what an audience member may consider neutral, which often means a white speaker from a region that does not have any noticeable accent. This privilege allows the bodies producing these voices to melt into the background, foregrounding the text and meaning constructed through the script. Following this logic, Berry constructs a highly valued intelligibility in which, most crucially, meaning is constructed by the words actors speak and not by the context in which these words are uttered. In other words, the goal is to become neutral enough to utter words mimetically and not contribute as an actor to any meaning construction in which the audience may partake. Berry privileges and rests her authority on the text to be spoken above all other embodied approaches. Peter Brook, in his foreword to Berry’s book, confirms this privilege by praising her ability to neutralize or free actors’ voices: “all is present in nature; our natural instincts have been crippled from birth...by the conditioning, in fact, of a warped society” (Berry iv). Both Brook and Berry commit to the idea that voice professionals must return the acting student to a tabula rasa for optimal intelligibility of the text, without explicitly constructing what that blank slate space looks like. In her subsequent book, The Actor and His Text, Berry privileges this blank slate or maximum ease of tension to promote “intuitive response” (24) to text to access what she believes is “a physical level, deeper than the intellect” (27). In other words, she does not trust intellectual access to meaning, and advocates for a subconscious approach to text and voice. This simultaneous freedom and instinctual response, pursued alongside an ill-defined notion of intelligibility of text (particularly Shakespeare), marks many of the influential practitioners in this era.
While still grappling with the particular instinct/intellect dichotomy, Linklater acknowledges that meaning is constructed through a combination of both the individual and the text they are speaking. In her book Freeing the Natural Voice (1976), her process revolves around the individual through whom the text is “revealed,” and around processes through which “interpretation of the text [is] released from within” (185). No longer neutral mouthpieces through which the authority of the author emanates, actors in Linklater’s approach now possess the important job of interpretation of the text. Linklater’s experience with method acting and American psychotherapeutic approaches leads to a conception of language that mirrors contemporary linguistic approaches and incorporates physiological and theoretical meanings of language: “Words have a direct line through the nerve endings of the mouth to sensory and emotive storehouses in the body...That direct line has been short-circuited, and the beginning work to release the build-in art of eloquence must be to re-establish the visceral connect of words to the body” (Freeing Shakespeare’s Voice 174). Her consequent privileging of the imagery that arises over the text reflects a conception of language in opposition to Berry’s detached definition of language. To Linklater, words are places and are embodied conceptions of a speaker’s interaction with the world. However, the issue with Linklater’s approach is that the goal of ultimate subjective human truth leads to a hyper-individualistic approach to voice and text interpretation that may smooth over particular cultural conditioning. Richard Knowles summarizes this point: “In attempting to transcend cultural conditioning en route to ‘the atmosphere of universal experience’ (Linklater 186) she allows for the effacement of cultural and other kinds of difference and is in danger of throwing the particular baby with the generalizing wash of her rhetoric” (103).
The question remains, in Linklater’s training, of whose “universal experience” is privileged over others, and the answer is still a homogeneous group of actors who had the geographic and economic means to access the training. Linklater’s conception of voice training reflects overarching approaches to actor training, as Realism or Bennett’s “naturalist mode” of theatre making takes hold over both the British and American stages. Both approaches trade on the audience's knowledge and desire to see authenticity onstage, while eschewing the semiotic power of meaning-making in the theatre. In Engaging Audiences: A Cognitive Approach to Spectating in the Theatre, Bruce McConachie interrogates the apparent tension between the training of the profession and exploration of the actual cognitive processes behind this training, as a way to access meaning-making in the naturalist mode of artistic production:

One rough parallel between therapy and science in our own field is the relationship between the teaching of acting and scientific research into how actors actually pursue their art. Most acting instructors will affirm that Stanislavski’s “system,” developed between 1906 and 1938, still has much to offer actors. This does not mean, however, that Stanislavski’s explanation for why his system worked–a curious psychological stew dependent on the theories of Pavlov and Ribot–retains scientific credibility among research psychologists today. While there appear to be good reasons to continue to work with actors on the basis of “objectives,” “obstacles,” and “emotions,” the acting class alone cannot become a laboratory to test for this scientific basis of Stanislavski’s ideas (11).

This particular acting system still has some kind of efficacy and holds a prominent place in American acting and voice training in the Linklater fashion.
Both acting and voice training from this time period make heavy use of the container metaphor of knowledge creation, where actors are conceived as empty vessels ready to be filled with knowledge. The teacher’s responsibility was to confer acting as a skill onto their student. The actor-as-empty-vessel metaphor is very similar to the Lockean blank slate metaphor and contributes directly to the notion that acting and other artistic skills can be created as pre-formed modular units that can only be conferred onto actors who have done the necessary prep work to become blank or neutral. In this sense, practitioners of this era assumed that voice teaching (like the number of proprietary approaches) or dialects (sanitized and ready for ease of use) could become discrete attainable chunks for the actor to master. While many theorists have posited and continue to move past this ontological divide, voice and dialect professionals in the Linklater fashion continue to work in this actor-as-empty-vessel mode even while outwardly claiming their resistance against this mode.

3.2 A New professional organization establishes itself

To determine how pervasive these container approaches are in the profession of voice and dialect without examining the establishment of higher education theatre practices in the United States would be to paint an incomplete picture of this practice (Zazzali 45). Theatre instructors adopted voice instruction focused on “freeing tension” to complement the dominant acting style of Realism that was already being used in different theatre departments. The League of Resident Theatres (LORT), formally established on March 18, 1966 (one decade prior to Freeing the Natural Voice), sought actors who had enough vocal stamina to perform in different styles for extended periods of time (Calta 26).
In response, several university theatre departments began training actors who could meet this demand and could enjoy long contracts as part of established repertory theatre programs, thus cementing the need for vocal coaches not only in these regional theatres, but in higher education institutions as well. Training in higher education would explode at about the same time as regional theatres were being founded using grants from the government during the middle part of the twentieth century (Zazzali 47). Suddenly, theatre programs were concerned with training actors for the conservatory-style seasons of realism plays that regional theatres were building throughout the nation. Institutions were enjoying unprecedented financial support in the form of government-supported grants and private foundations, offering incentive for higher education departments to train a legion of actors for stamina and longevity in their acting. This need for flexibility in acting style meant that an actor trained to be a performer not only through acting classes, but also through a newly founded discipline of voice and movement work. Establishing the intimate connection between institutions like the academy and the newly established LORT cemented a type of authority that would create new modes of respect for the profession of voice training. This type of authority would lead to the creation of the first professional organization, the Voice and Speech Trainers Association (VASTA), to shape the expectations of the profession and further define its authority on the topic of voice training. This professional organization bloomed from a series of casual gatherings that happened to coincide with the establishment of the League of Resident Theatres, in a way that helped to establish VASTA’s prominence not only as a professional organization in higher education, but also as a profession that was required to produce quality theatre in the United States.
To protect the integrity of this emerging profession, key players created the Voice and Speech Trainers Association (VASTA) in 1986. Dorothy Mennen (VASTA’s first president) describes the first academic gathering of speech professionals in 1968 as “a dynamic session which fired the spark that initiated a new group called (at the time) Theatre Voice and Speech” (Moore 100). VASTA as a formal organization was established in 1986, nearly two decades after these gatherings began, by five women at the National Educational Theatre Conference in New York City (Moore 101). Some of the functions VASTA would eventually fill were to issue evaluation guidelines for voice and speech trainers, a code of ethics and guidelines for training voice and speech trainers, and even to advocate for promotion and tenure procedures for newly created voice professionals working in higher education. VASTA also established a publication, Voice and Speech Review, that has since become a prominent source of interdisciplinary voice and dialect research. This organization, in its first ten years of existence, would become the preeminent authority on who counted as a properly credentialed authority on voice and dialect. Membership has grown from 150 in the first five years of existence to over 750 active members19. VASTA also draws its prestige and power from its many associations with other established academic organizations and networks, including the American Speech-Language-Hearing Association, the National Communication Association, the Voice Foundation, and the Association for Theatre in Higher Education (ATHE). Alliances with existing and well-respected organizations led to an air of authority and authenticity for the newly formed organization. VASTA continues to draw many different professionals across the voice training spectrum, from voice teachers in theatre, to singing instructors, and even linguists and speech pathologists.
Many members credit the success of VASTA to the role the annual conferences have played as a means of connection between various vocal professionals and their proprietary voice systems. Many key vocal professionals (many of whom appear in this manuscript) have presented workshops and keynotes, including Cicely Berry, Catherine Fitzmaurice, Jan Gist, Arthur Lessac, Patsy Rodenburg, Dudley Knight, David Crystal, and Rocco Dal Vera, among others. VASTA conferences have become famous for their mixture of academic work and practical sessions, including sessions called “Things That Work,” a round-table that shared techniques, tools, and tactics, and “The Identity Cabinet,” a session where members can perform work that is “close to the heart” (Moore 103). Presenting at these conferences has become an unofficial requisite for acceptance into this organization, and to present at these conferences means one’s particular approaches had to be approved by the organizers in the first place. VASTA’s practices are particularly insular because they sit at an intersection between academic and outside professional organizations, meaning that VASTA can use the gatekeeping mechanisms of both academic institutions and private professional organizations. Another form of authority establishment came from the communication networks that VASTA’s members established. One former president, the very same Dudley Knight, seeing a rising need for communication amongst members, established a listserv and email newsletter named VASTA Vox,[20] which quickly became a location for members to exchange training tips and, more crucially, to reveal their various stances on different issues, such as the incorporation of body techniques into voice instruction and the use of standard dialects in voice instruction.

[19] Estimates using Adrianne Moore’s numbers in her 2019 article “The History of Voice and Speech Trainers Association (VASTA)” in the Voice and Speech Review.
This discussion, particularly around the status of standard dialects, flared up occasionally, as the topic inhabits a contentious space. In this case, the arguments often revolve around the use of created standard stage dialects, like Skinner’s Good American Speech, and not necessarily regional or foreign dialects used onstage. Some practitioners who can trace their voice training lineage directly to Skinner defend Good American English to others as a relatively value-neutral “tool to teach phonetics,” while others still claim that to teach these standard dialects continues to oppress marginalized actors and introduces white language ideologies into the classroom (Moore 104). Oblique references to heated discussion can be found in VASTA’s outward publications in Voice and Speech Review, such as Rockford Sansom’s 2016 article “The unspoken voice and speech debate [or] the sacred cow in the conservatory” that summarized discussions on VASTA Vox. Though the unspoken speech debate to which Sansom is referring centers around instructional styles, Sansom still references fierce online discussions via VASTA Vox in his brief outline of historical conflicts within the organization, including the central question of the use of standard dialects in the voice classroom (160). In the intervening years, the substance of this email listserv has been lost entirely, and multiple attempts to retrieve the data have been unsuccessful.

[20] Early VASTA publications are built out of the debates that were had in this listserv. I have since tried to find an archive or other ways to access this listserv through asking early participants in VASTA Vox and it appears that all storage has been wiped out and no institutional repository of this debate exists anymore, apart from perhaps archived emails in members’ own personal inboxes. This is the peril of referencing online discourse in the early 2000s; here one day, and gone the next when your school updates its IT capacity.
In these listserv conversations, more senior members, officers and board members would often exercise their control over the conversation by chiming into more heated debates, essentially silencing minority opinion and junior members in their discussions, thus creating a hierarchy within the organization itself (Sansom 160).

3.3 The hunt for the perfect dialect: Midcentury dialect coaching

Following the success of Herman and Herman’s manuals of dialects, the next generation of dialect professionals sought the next evolution in dialect coaching. The innovation came in the form of easier access to audio tapes as supplements to the written manuals. These audio tapes were often pre-recorded exercises by the dialect coach to aid in acquisition of dialects, and thus were sanitized and stereotyped versions of the dialects in question. Access to audio tapes also aided dialect coaches in collecting samples from spontaneous speakers of the target dialects, which would lead to one of the biggest debates in the field. Dialect coaches often debated the use of standardized dialects versus the use of spontaneous real-world examples of dialect. On one hand, synthesized dialects contained fewer target sounds for the actor, which encouraged faster training; on the other hand, practitioners claimed that use of spontaneous dialects created a more “authentic dialect” (Blunt, xx). These authentic dialects were still filtered through actors who often were not given an adequate amount of time to acquire these new skills, which means that the use of authenticity in this regard should be regarded cautiously. The use of spontaneous dialect would present its own challenges, as often dialect coaches compared their idea of a regional or foreign dialect to the speakers they found in those areas. Jerry Blunt describes this struggle between intelligibility and authenticity in his own manual Stage Dialects (1967).
In this book, he details eleven of what he perceives as the most used dialects in the literature of theatre. In a move that already privileges white listeners and speakers, Blunt provides instructions for ten dialects of European or Anglo descent (Regional American, British English and a few foreign accented European dialects), and a Japanese dialect. The prime feature of this book is access to practice tapes for students of all stripes, and his dedication to creating authenticity in each dialect by featuring work done with spontaneous audio samples from across the world. By the design of his chapters, and the discussion of the sources of his dialects, Jerry Blunt reveals his position towards speakers of non-standard and foreign-accented varieties of English, which, while not nearly as explicitly antagonistic as Herman and Herman, still bears the hallmarks of a normative approach to language in performance. Blunt carries with him standard dialect attitudes similar to Herman and Herman, where his concern is the assumed white listening audience of stage and film. One place where this manual bears this explicit normative stance is “Standard English,” which Blunt claims is the Received Pronunciation or constructed British dialect of Daniel Jones, who documented the dialect phonetically in his English Pronouncing Dictionary (52). That Blunt does not include “British” in his title for this dialect speaks to his opinion on how this particular dialect ought to be considered as the gold standard for pronunciation in production. Even in his claims, however, Blunt admits that this Standard English holds a paradoxical position as the dialect of the educated high-class speakers in England: Standard English is self-conscious.
The habitual user is as aware of his speech as he is of his posture or his social deportment...When a speech with this characteristic is used inappropriately, its basic nature is changed from what it is to what it was never intended to be, and a falseness or affectation results (52). In contrast, Blunt’s reference to American English points to his dismissal of the use of a generalized American dialect in terms of authoritative uses on stage. The American English dialect in performance “has no authoritative standing, but is needed to specify the most widely employed form of American speech. It is the dialectal utterance of Midwestern and, more recently, Western groupings” (2). Blunt appeals to education and class despite no real-world use when advocating for Standard English in his chapter while simultaneously appealing to frequency of use when he admits that general American English is needed for performance. In other words, he does not uniformly apply criteria for standards of use for his dialects. For other accents, Blunt provides tapes as optimal or perfect examples to accompany the text, demonstrating the vowel and consonant shifts for each dialect. His tapes are sanitized versions of the dialects that are in the book; the speakers on the tapes are imitating each dialect after much training. While training these dialects, Blunt details collecting numerous primary sources as reference. Blunt admits to frustration in collecting these primary sources, particularly for the non-native accented English he sought out in Europe. He blames formal education (which he previously celebrated in the Standard English chapter) for denying him the stereotypical accents he came to expect while overseas: The core of the problem lay in the fact that the Italian living in Italy learns his English in school under the eye and ear of a teacher who places emphasis upon correct grammar and pronunciation.
In contrast, the foreign accent we Americans have come to know is a slightly taught patois developed by the foreign-born living in America (4). Blunt exalts education as a reason for including Standard English, but education becomes an obstacle when Blunt is pursuing primary sources for foreign accents for his own book. Blunt positions speakers whose English is their second language as second-class speakers of their own dialect. He seeks to extract accents untouched by formal language education without admitting that one of the only ways that these speakers can advance in society is seeking formal education, an education for which Blunt has a position in creating standards. In the introduction of his book, he admits, “more usable foreign accents can be found at home than abroad,” tacitly admitting that he is seeking a stereotype of the foreign-born immigrant whose “need for communication must of necessity bypass rules of grammar and the niceties of pronunciation” (4). Faced with an impossible position, immigrants to the United States must be able to communicate in English, but not sound educated enough according to Blunt’s expectations of intelligibility for a foreign accent. While uneven in his application of criteria about what qualifies as an accent worthy of study, Blunt did innovate the field with the use of practice audio tapes. This sets the stage for the next generation of dialect coaches, who will employ more than just audio, and more than just the samples directly collected by the dialect coaches themselves.

4. Voice practitioners join the Internet

The succeeding contemporary generation of voice professionals who were students of those promoting their own proprietary systems of voice have access to far greater amounts of knowledge than any generation previously.
While this generation still focuses on voice, body and breath work like in Linklater’s Freeing the Natural Voice, their access to information via the internet enforces their authority by appealing to cutting edge empirical knowledge in voice (Bartoskova 2). As a consequence, practitioners make more use of the internet as both repositories of knowledge and advertisements for their individual approaches to voice.[21] Practitioners also saw the regional theatres and theatre departments that once enjoyed unprecedented governmental and private support shrivel in the wake of neoliberal policymaking of the 1980s (Zazzali 201). Grants dried up with the resurgence of the neoliberal approach to funding, leaving these regional institutions with large built-up capital and buildings, and shrinking budgets for actual production work. Regional theatres that had built immense amounts of capital construction and fossilized into an unsustainable funding model reliant on these grants had to scramble to appease as many individual donors and patrons as possible, limiting the type of work presented to classics and easy-to-digest works of theatre (Zazzali 202). As a result, actors who had enjoyed relatively stable months-long contracts with theatres were now faced with a creative economy that would only offer job security for the length of one production before an actor would have to find more work. Voice and dialect coaches who worked for LORT houses also found their jobs becoming less stable. This destabilization forced contemporary voice coaches to seek jobs in higher education (slow to respond to the declining demand for conservatory-style actors and still a relatively safe occupation) and create side businesses of accent modification and dialect work through the use of the internet.

[21] Evidence of these websites can be seen in the shift from use of printed publications like books to more internet sources throughout this section.
Beginning with this generation, voice teachers find themselves with a plethora of choices when considering the type of training they want to add to their curriculum vitae to remain competitive as voice experts. Several key practitioners define this generation through a combination of published work and holding leadership positions within VASTA. One key and influential practitioner in this era is Louis Colaianni, who, as a nod to the elocutionists and their establishment of the International Phonetic Alphabet, released his book The Joy of Phonetics and Accents (1995), a book that joins the practice of embodied tension release with an emphasis on the use of the phonetic alphabet. Another duo, Dudley Knight and Philip Thompson, began work that similarly borrows psychological research and concepts such as Neuro-linguistic Programming, which makes popular the idea that certain people have certain modalities in which they learn best (Knight and Thompson iii). Like the previous generation of practitioners, Dudley Knight and Philip Thompson offer training certificates in official Knight-Thompson training that lend a veneer of authority to any voice trainer trying to gain an edge in the tightening labor market. Louis Colaianni can directly trace his lineage to Kristin Linklater and her system, which influences his own approach to voice profoundly. Colaianni adapted Linklater’s system of freedom of tension and added what he calls his signature exercise, the Phonetic Pillows approach, through his book The Joy of Phonetics (1995). Colaianni’s approach has proven incredibly popular, as he has taught in many higher education institutions, along with coaching in some of the largest performance institutions in the United States (“About”).
Riding the middle line between strict standardized learning of sounds and the freedom of imagination has become a lucrative approach to voicework, as Colaianni has offered several workshops in conjunction with the Linklater Center for Voice. In contemporary times, having a Linklater stamp of approval lends authority to Louis Colaianni and his approach to voice. Colaianni, in a credit to his popularity and system, has also made the jump to film, naming many famous film actors, including Bill Murray and America Ferrera, among those he has coached (“About”). Through turning his attention to fine phonetic detail (in a way similar to Edith Warman Skinner) and maintaining freedom from tension (via his training through Linklater), Colaianni has attempted to skirt issues with standard stage dialect and oppressive practices by attempting to strike a balance between these approaches of previous generations. His phonetic pillows, a set of large stuffed phonetic symbols derived from the International Phonetic Alphabet and used as embodied stimulation in the voice classroom, have become a part of his successful voice system. He claims: One dimensional phonetic symbols, printed on paper, tell our eyes what sounds they represent, tell our ears what sounds we are expected to utter, but make little or no appeal to our imaginations... In an effort to bring phonetics into the same physical world as other performance classes I have worked with student actors for many years on ways to get the symbols to jump from the page, enter our bodies and demand us to express them (“About Phonetic Pillows”). Colaianni favors the embodied experience and approach to voice work, and implicitly reinscribes the split between “intellectual” pursuit of voice and “embodied” pursuit of performance. Colaianni seems to equally use empirical pursuits, like phonetics and linguistics, as much as he uses embodied individual knowledge in his work.
He is pushing the needle from individualized psychological and social work in the actor back toward empirically minded work, a subjective/objective balance that this dissertation also attempts to strike. In contrast, Dudley Knight and Phil Thompson declare that their system of training does not use standards in the same way as training from previous eras. Dudley Knight and Philip Thompson, after developing their working relationship in the founding of VASTA, created a system of training called Knight-Thompson work, publishing Speaking with Skill: An introduction to Knight-Thompson speech work in 2012. This approach claims to be standard English agnostic, as Knight and Thompson claim that intelligibility is the ultimate goal of the work they offer (“About the Work”). As analyzed in the introduction, Knight’s use of the term intelligibility references directly an appeal to scientific authority, by claiming there is some objective measure that is resistant to social construction of what it means to be understood easily. Knight-Thompson work also uses accents as a natural extension of their analytical approach to accent. The K-T approach introduces the actor to “the four p’s” (person, posture, prosody and pronunciation): By addressing characteristic sounds with reference to the speaker’s system of sound categories, the inherent variability in the realization of these sounds, and the relation of these sounds to the speaker’s vocal tract posture, actors can more confidently achieve an accent performance that authentically represents the speech of the character (“About the Work”). Authenticity again appears as the ultimate goal for voice work, both within generalized voice and within dialect, while the remaining question is, to whom does the voice work sound authentic? The possible answers about authenticity include the original speaker of the dialect, the expert voice practitioner or director, and the theatrical audience.
The implied answer from their work seems to indicate that authenticity is determined by the practitioner, thus again giving authority to the voice coach in this work. Knight and Thompson present authenticity in this work as the ultimate goal, without qualifying why practitioners, actors and audiences alike ought to make authenticity the goal. The contemporary generation of voice professionals will challenge this implied goal.

4.1 Monich coaches for the movies

One of the most prolific and famous dialect coaches from this same time period is Tim Monich, whose Hollywood pedigree includes students such as Brad Pitt, Shia LaBeouf, Gerard Butler and many others (Wilkinson). His entry in the Internet Movie Database (IMDb) includes over 192 credits as dialect coach in a career that spans from 1983 to current productions slated for release in 2021 (“Tim Monich”). Like practitioners before him, Tim Monich can trace his voice and dialect training lineage directly to Edith Skinner, who was his teacher at Carnegie Mellon (Wilkinson). Monich even helped Skinner edit her text Speak With Distinction (1990), which speaks to his practice with precise phonetic symbols in training actors. While the lion’s share of Monich’s work is for television and film, he has hundreds of credits for theatre as well. Because of his prominent place in the film industry, Monich might be the most recognizable American dialect coach in the contemporary era. Like Blunt, Monich is an avid collector of samples for his dialect work, possessing enough recordings for fifty-three consecutive days of listening (Wilkinson). He works with a number of famous actors, often giving them options for linguistic models with which to work. Monich, while trained in the elocutionary style of Skinner and her proteges, has adapted his technique for dialect training to rise to the challenge of the fast-paced world of entertainment and film.
This means he is adapting his dialects to the skills of the actor, the desires of the director, and the overall look and feel of the film. His approach has earned him accolades from several highly influential actors and directors, including Martin Scorsese, and as a result he enjoys references for hundreds of accent and voice jobs in television and film. This means Monich has often become the most requested dialect coach in Hollywood, in an increasingly tight market that favors a few coaches at the top of the production hierarchy, leaving many more to carve out a living as coaches for independent films, television and theatrical productions. Because of this structure, many accent and voice coaches have turned to modern social media and technology to piece together enough work to live as a voice professional.

4.2 Digital Approaches to voice and dialect

From the foundation of these key practitioners, contemporary voice and dialect professionals are beginning to approach voice in new ways that attempt to dismantle the harms of this practice. Harm reduction begins with increased collaboration between voice and dialect coaches and voice scientists and other experts in language, via research collaborations that often appear in the pages of Voice and Speech Review (Bartoskova 1). These collaborations reflect Colaianni’s practice, where the voice is treated as both a creative vehicle for expression and a scientific object of study. This type of research celebrates the tension between objectivity and subjectivity of the voice. The modes by which this work is disseminated have shifted in the next generation, mainly with the use of personal websites, social media, YouTube and other digital means of communication. Accent and dialect coaching also benefits from this online explosion, yet remains unregulated.
The explosion of online voice coaches represents a threat to the model that has been established in the second half of the twentieth century by voice and dialect coaches who have established themselves as part of an organized professional association, draw upon the authority of higher education, or rely on the depth of experience as a seasoned coach for stage and film. Online voice coaching, in other words, is not subject to the tight gatekeeping or controls of the previous generations of voice professionals. A cursory search on YouTube reveals hundreds of channels that are dedicated to the topic of accent and dialect coaching, where some of the more popular personalities have amassed over 100,000 followers combined.[22] Not all channels that appeared in this search are entertainment or actor training focused; there are plenty of channels, like Dr. Geoff Lindsey and Linguistix Pronunciation, where the main goal is to help non-native speakers learning a second or third language to achieve more native-like pronunciation in their everyday lives (“Linguistix Pronunciation”). These practitioners exist through a large and profitable business called accent reduction, also known as accent modification or accent neutralization, which is an Anglo-centric profession aimed at helping speakers of English as a second language acquire more native-like pronunciation (Hope 10). These professionals and videos are catering directly to the idealized white listener and will often demonstrate different contexts for appropriate registers of language. Other dialect and voice trainers participate in what is called affirming voice therapy. A person’s desire to change may stem from the desire to have a voice that matches the gender identity a speaker may want to project to the world.

[22] I searched the term “dialect coach” for channel names on YouTube on May 3, 2020.
In this sense, treating all accent reduction or modification as inherently bad may exclude individuals who may wish to change their voices to match their gender identities (Nolan et al. 1368). For actor training in both film and stage, these channels and videos are part of a larger trend of individualized and freelancing vocal experts, of which the largest sector of voice professionals is voice and singing instructors (“Find a Voice Pro”). Often, videos in the tradition of accent and dialect coaching feature such language as “learn to improve or neutralize your accent!” and, “sound like a native speaker FAST!” which preys both on the precarious position of marginalized speakers and the accelerated nature of theatre and film production for actors. No YouTube video, no matter how thorough and engaging, holds the secret or key to accent neutralization or improvement, because the work of neutralization always caters to the normative white English-speaking listener. Dialect and accent coaches who have created these videos are engaging with raciolinguistic ideas of how to sound without interrogating the damaging normative language attitudes behind these sentiments. These dialect and accent coaches are often held up as popular culture icons, a type of expert to reference when discussing accents in entertainment. They often derive their authority from the sheer popularity of their videos and materials that are available on the internet. One of the most popular dialect coaches found on YouTube does not have his own channel, however, but is often called upon by popular general media producers like Wired and Insider as a particular expert in movie dialects and accents. Erik Singer has starred in over 10 videos that amass 1–13 million views each.[23] He stars in a series called “Technique Critique” where he breaks down several different accents in different scenarios in film and television. The most popular video in which he is featured is titled “Movie Accent Expert Breaks Down 32 Actors' Accents” where he analyzes both accents considered good and bad in his own opinion (Wired). In these videos, Singer showcases his expertise and positions himself as the undisputed authority on accents in popular entertainment. Subsequent to the original video, Erik Singer has also been featured in another video titled, “Movie Accent Expert Breaks down 31 Actors Playing Real People,” where he breaks down actors' attempts at idiolects, or specific individual accents (Singer Wired). The practice of imitating specific individual accents seems to be a genre peculiar to film and television, and strict adherence to recreating painstaking details of individuals' lives is often rewarded both financially and critically. The popularity of this video has led to another video where Singer breaks down 17 more film performances of idiolects in film. Often, YouTube channels are an external-facing part of the advertising apparatus for individual freelance voice and dialect coaches. Coaches will create viral-like videos in the format of a “talking head” explanatory video, or a demonstration of linguistic prowess through demonstrations of particular accents. These explanatory videos will be uploaded to YouTube, Instagram, Facebook and other social media and link to further services that they offer through their own personal websites. Services offered through this viral-like online presence can sometimes include one-on-one coaching via video conferencing with actors or speakers who wish to change their accent.[24] In this way, dialect and accent coaches are offering their services to a larger geographical area and wider demographic than would otherwise be possible if they were limited to more traditional approaches to training. Social media in this way are used as a type of personal brand advertisement and part of an independent business model. In this way, the digital profession of dialect and accent coaches recreates the inequality and power structures inherent within “influencer” styled professions. This means that a small number of popular or well-known dialect coaches enjoy the benefits of this structure while constantly navigating an ever-changing algorithmic landscape amassing followers and creating income, while a large number of dialect coaches do not (Cotter 904). Dialect coaches also use the power of YouTube and other audio corpora for various stages of research for different dialects in the tradition that Blunt established. Various videos and snippets of audio are available online for a coach to sift through and utilize in their work, though they may need to be careful because YouTube videos are not always accurately labeled. Coaches use videos uploaded by the speakers of the communities or accent of interest, with no explicit connection to accent or dialect work. A coach must consider the ethics while using the vast trove of internet resources that are available to dialect coaches and would-be actors who are looking for reference accents for their own work. Technically, users of the popular social media websites YouTube, Instagram, and Facebook must all agree to user agreements that hold that material that is uploaded publicly does not necessarily belong to the user. However, there are perceptions in this work that material uploaded (especially to locked, unlisted or private accounts) should not be used by other users on the website (Grover et al. 772). While not specifically illegal, use of these audio and video sources can create friction with the perception of ownership of the original material that has been uploaded.

[23] Number of views was assessed on May 3, 2020 using YouTube search for “Erik Singer” and he has since published more videos.

[24] The popularity of this type of training seems to have risen with the advent of the COVID-19 pandemic.
Despite legal murkiness and perception of ownership, the ethical question remains whether dialect and accent coaches may freely use spontaneous language and accent material for their own work, considering that these speakers may not necessarily explicitly consent to their lived linguistic lives being adapted for stage or film. Like using people’s images in film and stage, dialect coaches ought to adhere to stricter ethical codes in use of people’s linguistic likenesses for dialect and accent usage. The internet has facilitated both a rise in access to this type of work and an influx of digital influencer-styled accent, voice and dialect coaches without much ethical oversight. Coaching via the internet provides larger access to actor training and to resources, but also presents an issue of licensing and qualifications for doing this kind of actor training. In contrast to the establishment of VASTA in the generation prior, online dialect and accent coaching requires no formal membership nor formal training certification to create a business that caters to actors and speakers. Both approaches present their own advantages regarding access and professional ethics. On one hand, more people have access to resources and coaches online, while those who have access to officially vetted coaches or VASTA members are limited to students in higher education or participants who can pay the fees associated with official voice training programs such as the Linklater Center in New York City and Orkney, Scotland. The contemporary issue voice and dialect coaches must face is how to balance the gate-keeping privilege of VASTA with the unregulated generation of online voice and dialect coaching that borrows its business model from influencer-style online promotion. Inherent in both models of access to this discipline is still the prevalence of implicit and explicit biases that contribute to the enforcement of negative linguistic stereotypes seen through entertainment. The tension
The tension between access and professional training informs many of the best practices that I discuss in the conclusion of this dissertation.
5.0 Towards a cognitive conception of voice training
While the bulk of this chapter has explained the arguments and assumptions practitioners use in training, I turn my attention now to how theatrical audiences factor into this work. My focus for the following chapter will be the second side of intelligibility, which is the perception of how understandable speakers are to the average audience member. Intelligibility begins with listening, both for the actor and crucially in the audience. Both of these types of listening are subject to normative language ideology, especially given the privileged arena of theatrical speech. Voice trainers have conceived of theatrical audiences as arenas that require extra perceptual expertise and have historically fashioned their work around this expectation. Voice trainer Marian Hampton speaks of clarity and intelligibility in her opening essay on standard language:
As teachers, we must guide students in listening astutely to the speech of others so that they may adopt those characteristics which will contribute to the establishment of character, yet choose carefully what will help in this process without destroying an audience’s ability to understand the text of the play...We’ve all seen and heard productions in which the accent, albeit accurate beyond question, is so broad as to render the play unintelligible (15).
Hampton uses her conception of the near universal experience of perception of the audience to establish the knowledge base upon which she judges accents for actors, and simultaneously pits accuracy, not authenticity, against the needs of the audience. Intelligibility to Hampton is placed squarely on the execution of an “accurate yet broad” rendition of an accent and leaves little room for audience autonomy in the interpretation of said accents.
This refrain is the foundation of the profession of voice and dialect training after nearly one hundred years of philosophy and work of successive generations of voice professionals. The history of this profession necessarily informs the modern conception of intelligibility in its colloquial use by this profession. Understanding intelligibility in this way, however, eliminates the autonomy of individual audience members and their experiences. Honoring the autonomy of the experiences of audience members is the intervention of the cognitive conception of intelligibility. The second half of this dissertation will center individual experiences of audience members from a cognitive perspective, thereby creating room for contemporary voice and dialect training to embrace a more diverse and deeper understanding of intelligibility that will result in the inclusion of a more diverse group of performers and audiences.
The emphasis on actor training in this profession misses the audience’s role in meaning-making in production. Through the use of “audience reception,” Susan Bennett weakens the audience’s role in meaning-making processes when experiencing performance, by reducing audience members to passive receptacles of meaning in the theatre (4). Reception implies a passive, almost literary role for each audience member, which limits each audience member’s agency as a participant in the theatrical event. Bennett uses reception in part in response to other burgeoning theories of the time, most famously the reader-reception theory, thereby extending the metaphor of literacy or “reading” to a theatrical event (6). To counteract this passivity, I propose shifting from reception to perception, borrowing from the psychological use of the term. Perception recognizes a person’s participation in consciously or subconsciously recognizing and organizing sensory information due to a number of ecological and psychological factors (Michel).
This shift in terminology not only gives agency back to the audience member, but also acknowledges the complicated and precise cognitive processes that are activated when an audience member creates meaning for the performance they are witnessing, where the voice profession has instead flattened individual experiences into the professional’s expectation of audience experience. Audience spectatorship is not necessarily a uniquely cognitive activity but uses human cognitive faculties available to other modes of perception in a unique configuration for theatrical spectating. In Dr. Thalia R. Goldstein’s article, “Questions of Realness,” where she debates the role of cognition in Realism, she claims “Theatre is obviously artifice,” yet audiences have come to expect this artifice as a stand-in for authenticity. Moreover, theatre uses real humans in real time; even the most experimental of forms still often include humans on stage. Several questions about the tensions related to audience experience of authenticity and artifice may unlock the secret to creating a balance between artificial dialects and accents and the accurate portrayal of accents represented onstage. How does the audience parse what is the artifice of theatre in the form of performance, versus the real automatic cognitive reactions to witnessing human behavior in real time? How are we balancing these imagined scenarios with the reality of our automatic responses that have been shaped by our experience in the rest of society? The question of representation is particularly pressing when audiences can perceive that certain accents are representing certain, often derogatory, character traits. The ultimate ethical responsibility of presenting these accents lies with the voice professionals and production teams, which becomes fraught with the historical resonances of so many harmful practices of the past.
Examining the profession of voice and dialect may not be enough to untangle expectations around authenticity and intelligibility. According to Bruce McConachie, contemporary approaches to acting in general, even while they owe their origins to Stanislavski, heavily employ the actor-as-container metaphor and envision actor bodies as empty vessels ready to be filled with emotion and the psychological means to access character (44). The words spoken onstage amount to what Stanislavski called “verbal action” and ought to be considered an integral part of any acting approach but, more importantly, a reflection of the environment in which practitioners and audience members find themselves (Moore 69). Instead of asking what an actor can fill themselves with, cognitive approaches to acting conceive the actor as a permeable part of a larger system, asking, “what does it mean to build characters from the ecosystem up, rather than a more psychologically focused method of character assessment?” (Cook 117). This approach necessarily considers the contexts in which audiences and performers find themselves, which lends a sharper vocabulary and tools to confront the overarching raciolinguistic ideologies imposed by a society with normative listening ideologies. Conceiving actor and voice training as an ecological event, inextricably connected to the context in which theatre is created, leads to greater access to empathy and new pathways for meaning-making for both artists and audiences alike.
By reconsidering both spectating and training as embodied processes that cannot be separated from ecological and social conditions, theatre producers can uncover new and surprising ways to make meaning on the stage, and by extension can conceive of new ways to approach voice training. These surprising ways are bolstered by evidence from several neurological studies.
Italian scientists made an exciting discovery in the early 2000s when they observed the motor neurons (and not just the sense neurons) in monkeys’ brains light up when they watched their handlers perform actions with their hands (McConachie 70). The use of motor neurons points toward how empathy might be built in the brain: seeing an emotion could mean the perceiver is activating and feeling that same emotion in their own brain. In his book, Bruce McConachie says, “visuomotor representations… provide spectators with the ability to ‘read the minds’ of actor/characters, to intuit their beliefs, intentions, and emotions by watching their motor actions…Empathy is not an emotion, but it readily leads viewers to emotional engagements” (65). By conceiving of empathy as an automatic cognitive process in theatrical audiences, theatre producers can stop wondering about the necessary and sufficient conditions to create empathy in audiences and instead consider audience members as an integral interactive part of the process of theatrical creation. In this way, producers and audience members alike not only access new modes of meaning making but can justify in a very real way the role that performance plays in our social fabric (Dissanayake 89).
Activating empathy and subsequent emotional experience drives the interactive cognitive model of audience perception, deconstructing the rational model of audience experience that divides subjects from objects and assumes a separate rational world where meaning is made. The antidote to this objective approach can be found in Lakoff and Johnson’s concept of embodied realism. They write in Philosophy in the Flesh:
The alternative we propose, embodied realism, relies on the fact that we are coupled to the world through our embodied interactions…what disembodied realism misses…is that, as embodied, imaginative creatures, we never were separated or divorced from reality in the first place (emphasis authors’ own, 93).
Crucially, this rational objectivism destroys embodied experience as a mode of meaning-making. The assumption that objects lie “out there” and subjectivity lies within the audience destroys an opportunity to conceptualize meaning creation as a connective ecological event, dependent upon the exact conditions and contexts of each performance.
The kernel at the center of the voice and dialect profession is the very use of voice, which practitioners can also approach through embodied realism. In very concrete terms, the voice is the result of the physical configuration of an individual’s vocal tract and the subsequent effect that configuration has on how air travels through that system. Having a voice requires the vocal tract, but also usually the movement of air, which is typically provided by the lungs or, colloquially, the breath. The vocal tract has some features that can be consciously manipulated, while other components of voice are much harder or impossible to change. For example, physiological features like vocal tract length and medical issues like a deviated septum are nearly immutable, while placement of the lips, tongue and jaw can be changed rapidly. The combination of these mutable and conservative features creates an individual sound or voice. Honoring both the physiological circumstances and the context in which speakers and listeners find themselves completes the picture of understanding the role of intelligibility in voice training.
Embodied realism even supports the metaphorical use of voice: actors and even playwrights are encouraged to “find their voice” when performing in theatre. The very experience of using your voice in the theatre lends itself to metaphorical ways of conceiving of theatrical practice. In Metaphors We Live By, George Lakoff and Mark Johnson argue that nearly all metaphorical language that we use comes from a deeply embodied and very much non-metaphorical lived experience.
From there, lived experience ought to be centered when creating meaning-filled work such as performance. By examining the assumptions and underlying philosophies of these trainers through the lens of cognitive audience studies, I can bring the discipline of voice training into a more contemporary conversation with actor training and cognitive understandings of how humans make meaning out of the world around them. The historical conception of voice training contributes to contemporary understanding of knowledge transmission from each generation to the next, and tracing where some practitioners transmitted ideas while others resisted them offers clues to how the voice profession is situated in the larger ecosystem of theatre production. In the tradition of pushing back against prior generations, I advocate for a system that meets actors and audience members where they linguistically hail from, explicitly honoring the diversity of experience that has led them to inhabit a small dark room together to experience a specialized form of communication.
Yet the process of the audience’s use of intelligibility in their experience of performance remains a large question as part of this cognitive conception of voice. To probe deeper into the role of intelligibility and meaning-making for individual audience members, I will introduce my own cognitive linguistic studies in the following chapter. My empirical investigation, created around these very questions and coupled with cognitive humanities studies, will demonstrate how audiences can subjectively construct intelligibility onstage and how intelligibility can no longer act as a reliable objective measure for voice professionals and actors to use in their work. Finally, I will follow that chapter by picking up where this critical history leaves
I will discuss the needs of contemporary theatre makers for voice training, along with highlighting some practitioners who I believe are on the right path to account for historical biases and constructions of authority in this discipline. I will use results from the linguistics chapter to further explore our cognitive understanding of how individual audience members use context and their prior experience with authority to construct intelligibility of what they see and hear onstage. 100 CHAPTER III CONSTRUCTING AUDIENCE INTELLIGIBILITY USING EMPIRICAL INQUIRY “. . . [O]nce most people really come to understand what an embodied conception of mind entails, they are going to be upset about it. Much of what they hold dear is at stake – their view of mind, meaning, thought, knowledge, science, morality, religion, and politics.” -Mark Johnson, The Meaning of the Body, 15 1. Rationale for empirical approach In the previous chapter, I examined the assumptions and ways that dialect and voice trainers construct their authority and knowledge in their field, along with their understanding of how audiences perceive performed speech through an in-depth look at how the profession is constructed. This chapter will challenge the ideas that established the profession by asking specific empirical questions about audience understanding, or intelligibility, which appears to be the yardstick by which voice and dialect coaches measure the effectiveness of their training. Many voice trainers measure the success of actor voice and dialect training through their perception of how intelligible the actor sounds on stage. Cognitive perception and meaning making by audience members are highly context dependent and built over time according to the experience of the listener, trainer, and actor. 
In this chapter, I take aim specifically at intelligibility as a socially constructed phenomenon that is the direct result of the speaker (performer), the listener (audience member), and the specific context (e.g., performance space, context of the story, previous historical encounters with voice). Contemporary research on speech perception supports the ideas of context dependent constructions of intelligibility explored in the last chapter. I will demonstrate this by constructing my own empirical research that tests the question of specific social contexts and experiences in audience members. To test the specific influence of performance context in this construction of intelligibility, I have devised two empirical experiments that manipulate the listener’s belief, asking whether explicit knowledge that what they are about to hear is performed rather than spontaneous speech affects how they judge the voices they hear. What happens to a listener when they are expecting a context with maximum intelligibility? How are they constructing the voices they hear in the context of performed speech? In this chapter, I use the idea of expectation of performance as a direct stand-in for expectation of intelligibility, and will simply use the term “expectation” throughout these experiments as a shorthand phrase.
The best way to approach the gap in voice practice is to use a field of inquiry that specializes in understanding the mechanisms of linguistic perception, drawing useful information from primary studies, and creating a theory from which practitioners can work. Combining these two fields takes a careful approach because the vocabulary in one field can overlap with the vocabulary in the other field, while having two different meanings. For example, while researchers in linguistics operationalize intelligibility as accuracy of speech transcription, voice practitioners use intelligibility as a general measure of audience understanding.
In this case, when I refer to intelligibility, I must be careful to either highlight the colloquial usage by theatre practitioners or use intelligibility as a linguist. In some situations, intelligibility holds similar meanings in both fields; when practitioner Dudley Knight uses intelligibility, he carefully defines his usage as the “amount of linguistic information a listener can gather” (Knight 140). In contrast, intelligibility is defined by Derwing and Munro as “the extent to which the native speaker understands the intended message” and is specifically measured by recall of key terms in subsequent experiments (2). The difference lies in the expectations of knowledge recall of the listener. Untangling this distinction between linguistic usage and colloquial usage will be part of the work of this chapter, which is essential in the task of truly understanding how audiences perceive language onstage. Going into depth about the different uses of this term specifically will illuminate the gaps in knowledge that practitioners have been carrying despite their years of embodied subjective knowledge. A fresh new perspective on the terms any profession is using ought to be welcome at any stage in training.
Using research from an adjacent empirical field is the practice of interdisciplinary research of the cognitive humanities. I am continuing this tradition with this research, with one difference. Often cognitive humanities make use of research studies as the primary sources of theorizing. The research in this chapter offers a unique intervention by featuring a custom designed study, which means this chapter will consist of a literature review of relevant linguistic studies, a lab report of the experiments I conducted, and then a subsequent discussion of implications of the findings in the report that incorporates theories from cognitive humanities. The literature review is thematically organized.
The lab report is often the primary resource of cognitive humanities research and is therefore usually summarized without the raw analysis; this dissertation offers the opportunity to examine the research in the form that cognitive humanities scholars refer to but do not often display. In the future, I hope that more theorists in this field choose to work with linguists and scientists to present their primary findings in an accessible way for humanities researchers and practitioners alike. Combining interdisciplinary research into a format that is accessible to both disciplines has always been a goal of mine, and I will use this chapter as a blueprint for future research.
Numerous linguistic research studies support the idea that the objective-sounding measure of intelligibility is susceptible to social context and standard language ideology, through the various social experiences of different listeners or audience members. Despite this, in his discussion of standard accents and intelligibility, voice practitioner Dudley Knight off-handedly laments, “it appears that little if any research into intelligibility has been done up to now” (70). Knight is mistaken; since the foundational 1960 Wallace Lambert et al. article “Evaluational reactions to spoken languages,” thousands of articles on speech intelligibility have been published in the field of linguistics and our understanding of the social effects of language continues to grow. This means, contrary to Knight’s claims, that objective measures of language perception such as intelligibility are subject to language standards, especially in voice and dialect training where the environment explicitly judges speakers on their perceived intelligibility. Linguistics provides the tools and vocabulary necessary to explicitly examine how listeners use intelligibility to construct what they are hearing on stage and provides evidence for the embodied realism approach to knowledge construction of Lakoff and Johnson (4).
Listeners are exposed to significant variation in speech, including different accents and dialects, and subsequently they hold variable beliefs about how ideal language should sound. Regardless of the variability encountered in speech, listeners can not only parse information from speech, but also associate this speech signal with perceived physical and social qualities of the speaker25 (Agheyisi and Fishman 146).
25 Agheyisi and Fishman summarize the use of attitudinal matched guise studies investigating these qualities including various languages, regional dialects, races, socioeconomic status, religion, and gender.
Generally, accent perception and, by extension, meaning construction can be construed as a combination of two different types of cognitive processes: bottom-up or subconscious processes, and top-down or conscious processes. Bottom-up processes relate to how automatic cognitive processes decipher the acoustic signal that strikes the ear drums of the listener (Rauss and Pourtois 276). Top-down processes, which help language users predict patterns in the signals they are hearing, mold the perception of these acoustic signals (Rauss and Pourtois 276). This investigation closely examines the effect of using social context in top-down processing in accent perception and meaning construction in performance. These top-down processes are affected by conscious and subconscious socially ingrained ideas about language, gathered through a lifetime of being a language user. Listeners of spoken languages often judge accents and dialects and, by extension, the speakers of these accents and dialects, and comprehension of the speech signal can suffer as a result of these judgments (Gluszek and Dovidio 215). These socially ingrained language attitudes are close in concept to Lippi-Green’s “standard language ideology” (27).
Decades of research on non-native accent perception have demonstrated that listeners carry specific language attitudes towards non-native speech (Moyer 114). Simultaneously, several factors impact perception of non-native speech, all of which carry the ability to affect audience perception of intelligibility of the performer on stage. This chapter offers a succinct voice practitioner-friendly summary of the relevant literature in cognitive linguistics about non-native accent perception and other linguistic phenomena that are important to the practice of voice and dialect. Most importantly, accented speech perception is a composite of different factors that listeners weigh when hearing speech. To complicate matters, linguists use the term intelligibility in a very narrow sense; they operationalize intelligibility as a numerical measure of the amount of information that the listener receives and subsequently can reproduce (Munro and Derwing 287). Often, intelligibility is measured using the proportion of correct words recalled by listeners in various listening environments. Significant prior research has examined the constellation of related factors that affect accented speech perception and are closely related to intelligibility, particularly comprehensibility and accentedness (Derwing and Munro 1; Flege, Munro and MacKay 3129). Accentedness is a subjective measure that researchers define as how strong or heavy a listener believes an accent to be. Comprehensibility is also a subjective measure, which asks the listener how easily they can understand the speaker. Many social contexts often determine these scores, including contexts intrinsic to the speaker, intrinsic to the listener, or related to the environment in which the language is perceived (Moyer 144). Performance, in this case, can be construed as a particularly specialized social context in which a listener is encountering the speaker.
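To make the narrow linguistic sense of intelligibility concrete, the following is a minimal Python sketch of a proportion-of-words-correct score. It is an illustration only, not the scoring procedure of Munro and Derwing or of the experiments reported in this chapter. The function names `tokenize` and `intelligibility_score` are hypothetical, and the scoring rule (case-insensitive, order-insensitive exact word matching, with each target word matched at most once) is an assumption chosen for simplicity.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def intelligibility_score(target, transcription):
    """Proportion of target words found in the listener's transcription.

    Assumption for illustration: exact word matches, ignoring word order.
    Each transcribed word can satisfy at most one target word, so repeating
    a word in the transcription does not inflate the score.
    """
    target_words = tokenize(target)
    if not target_words:
        raise ValueError("target sentence contains no words")
    remaining = Counter(tokenize(transcription))
    correct = 0
    for word in target_words:
        if remaining[word] > 0:
            remaining[word] -= 1
            correct += 1
    return correct / len(target_words)
```

For example, `intelligibility_score("The boy ran to the store", "the boy went to the store")` returns 5/6, because five of the six target words are recovered. A score of this kind is a single number about word recovery, which is much narrower than the practitioner's colloquial sense of general audience understanding.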
The limitation to this research is that I must simplify this interaction to how listeners are reacting to the voice of the performer before introducing visuals to the working model. Clarifying the role of the actor’s voice can still lead to clues to how an audience member might behave; even with the impoverished signal of voice only, the listener can do a lot of work to fill in social expectation and qualities for the speaker.
This chapter demonstrates that the very environment of theatre or performance might contribute to skewing that seemingly objective measure of intelligibility, or other ways to judge communication onstage. Intelligibility, like other terms used in the theatre such as ‘authenticity’ or ‘clarity,’ is in fact a privileged form of judgment that listeners and practitioners use as shorthand, affected by the individual experiences of listeners. In other words, these terms or qualities are continuously constructed by their users, and subsequently these terms are affected by every instance of use. To believe these are objective measures is to lead audience and performers astray towards the belief that their perception of objective fact is the correct approach to this work. To examine this socially constructed idea of intelligibility, I explore factors over and above accentedness, comprehensibility and intelligibility that a listener may use to construct their perceptions of the voice and the actor they are encountering, including subjective factors that are often used in discussion of an actor’s voice, such as ‘clarity,’ ‘effort,’ and ‘authenticity.’
The final section of this chapter considers how my research might respond to voice and dialect practitioners in their calls for more interdisciplinary investigation into these issues.
I discuss the results with an explicit eye to how these results could be interpreted considering assumptions made by the voice and dialect profession, which will carry into the final chapter where I consider my position as a white cisgender practitioner myself. In the discussion, I seek to establish questions of how standard language ideology may be navigated in this craft, which will be considered in my conclusion. The discussion of this thread will lead directly into the final chapter, which will summarize the contemporary issues and challenges the voice and dialect industry faces and will provide best practices and considerations for those who seek to incorporate dialects into their productions.
1.2 Language perception: General mechanisms, several models
Before addressing the social aspect of speech perception, it is necessary to describe some of the more general speech perception theories and models that influence linguistic research today. Within this section, I will highlight how these models might serve as points of access for voice trainers into the field of speech perception.
First, in language perception more generally, several puzzles or issues of perception exist that speech models must address. These puzzles, like the social context surrounding speech, may at first blush appear to have simple answers, but linguistics research reveals that they are difficult to unravel. Models must address the most common issues, which include linearity (tracking the order in which the speech signal is received), segmentation (being able to perceive discrete meaningful units of language), speaker normalization (accounting for speech differences in different speakers), and the basic unit of speech perception (Ferrand 394). Often, speech practitioners begin instructing voice students with the most popular proposed basic unit of speech, the phoneme.
A phoneme is a unit of sound that is perceptually distinct and can help distinguish one word from another. Phonemes are often taught by voice practitioners to their students through use of the International Phonetic Alphabet or their own proprietary writing system (Blumenfeld 12). For example, /p/ and /b/26 are phonemes because in English, these sounds distinguish between the words “pad” and “bad” (Catford 184). Phonemes are an important key to understanding basic speech perception theories, as the other principles of speech perception are built from these units. For instance, the linearity and segmentation principles are both stated in terms of phonemes. The linearity principle refers to the idea that a specific sound in different words corresponds to the same specific phoneme (Ferrand 393). As a practical example, this principle might assert that the /k/ sound in “cat” is the same sound at the end of “back.” The segmentation principle asserts that the speech signal can be divided into discrete units that correspond to specific phonemes. Therefore, according to these two principles, the /k/ sound in “cat” and “back” not only should be the same sound, but they should be easily discernible in the speech signal, and both sounds ought to be identified as the phoneme /k/.
26 Linguistic notation often places phonemes in forward slashes, which is a convention I adopt for this dissertation.
Overwhelming evidence, however, has established that this is not the case. The exact acoustic sound characteristics of the /k/ in “cat” and “back” vary because of differing characteristics of the contexts in which this sound is produced. Voice practitioners can often explore these differences with their students; I often have my students explore the physical difference in the back of the mouth placement for the initial /k/ sound in “cup” versus “key.” The placement for “key” is closer to the front of the mouth than with “cup,” because of the mouth placement of the following vowel.
This phenomenon is referred to as coarticulation, where muscular preparation for one sound affects the production of the immediately surrounding sounds. The mystery remains that there is no clear-cut one-to-one mapping of the acoustic signal to discrete sounds; we perceive speech as a series of separate and distinct phonemes and words even as the acoustic boundaries between phonemes are blurred and highly variable within one speaker, much less between speakers of the same language (Ferrand 394).
The issue of speech perception compounds when listeners perceive speech not from one speaker but from many different speakers throughout the day. Theories of speech perception try to account for differences in speakers. Different factors (e.g., age, gender, language background) lead to wide variations in speech, including pitch, loudness, stress, and rate of speech. However, theories posit that listeners can account for these differences by somehow attuning to the ranges that speakers produce in their acoustic signals. Somehow, listeners can ignore irrelevant differences between productions of a given sound, while focusing on the acoustic features that indicate differences between meaningful units of speech (Ferrand 394). These units of speech also produce a linguistic quandary: do language users store and perceive speech as acoustic-phonetic features, abstract sound categories like phonemes, or larger units like syllables or small word units? This question might also have a different answer depending upon the age of the perceiver: children may process auditory information using larger units and later shift into adult-like behavior where they may depend upon smaller units like phonemes (Nittrouer 280). Units might also be sensitive to environment; a person may be able to attune to smaller units of sounds in a quiet situation than in a noisy situation, where they might rely upon context to predict the linguistic sounds they are hearing.
Clearly, these four issues (linearity, segmentation, speaker normalization, and units of perception) present unique challenges to creating models of speech perception that account for the wide variety of signals that a listener hears and of which they must make sense. The leading theories of speech perception grapple with these challenges and, in doing so, establish a basis for understanding the social implications of speech perception and ultimately how a listener might construct the idea of intelligibility to aid them in their perceptual journey.

One influential speech perception model is the Motor Theory of perception. This theory stresses the link between knowledge of production of speech and perception (Liberman and Mattingly 2). In its most basic form, the theory posits that a listener can perceive speech because they produce speech. Listeners are aware on some level of the relationship between the theoretical sounds in their speech and the articulatory gestures they produce to get there. Listeners learn to perceive in terms of different types of mouth gestures, but they do not track the actual movements; instead, they track abstract articulatory plans that would result in a perfect production of the utterance (Hawkins 127).

Acoustic Invariance Theory assumes a similar abstract articulatory plan for each sound found in a specific language. This theory focuses on core acoustic properties, however, and can be conceptualized as a template against which the listener compares the incoming sound (Stevens and Blumstein 1358). The listener is still working with abstract representations of speech. In both theories, listeners abstract essential features of the incoming acoustic signal and subsequently make a decision about its identity by checking against a theoretical list of features.

Developed in the 1980s, Direct Realism[27] pushes back against notions of specialized abstract representation of speech sounds (Ferrand 399).
This theory posits that direct knowledge of speech perception stems not only from the acoustic signal itself but also from the listener’s prior experience of perceiving the speech signal. Building prior experience more explicitly into its explanation of how the speech signal is perceived, the TRACE model reflects a connectionist approach that integrates parallel processing across multiple sources of information in speech perception (McClelland and Elman 41). In other words, listeners are processing sounds across different levels simultaneously, including phonetic features, phonemes, words, and, vitally, the social contexts of the speech. Units of perception can be as small as phonemes or as large as words (e.g., a logogen or another type of unit associated with words in a listener’s vocabulary). Each experience of a unit is tagged with useful information, such as perceived qualities of the speaker, to help the listener recall these units more efficiently in the future. These models reflect processes that researchers were using in computing, and these models became more sophisticated as knowledge about computers advanced.

[27] Not related to the theatre movement of Realism.

Native Language Magnet Theory has been an influential model over the beginning decades of the twenty-first century and begins to take seriously the notion that language perception is a weighted collection of individual experiences for the listener (Frieda et al. 130). The critical element of this theory is that phonetic categories are organized in terms of prototypes (like theories that came before), but these prototypes function as perceptual magnets that assimilate variations in production towards these categories.
Distinctions within the same category that are close to the prototype are reduced (e.g., the /k/ in both “key” and “cup”), but perceptual distinctions between category boundaries become even more distinct, so the boundaries between sound categories are clearly divided (Kuhl et al. 684). Thus, listeners in this model can account for a range of differences while maintaining a sound system that supports specific language perception and production.

These general perceptual theories lay the groundwork for language perception in an even more variable environment, which is language produced by a speaker who is speaking in their second, or even third, language. A model that is specifically concerned with perception of non-native contrasts in language is the Perceptual Assimilation Model (Best and McRoberts 193). This model can be used by voice practitioners to help predict the relative ease or difficulty an actor may have in acquiring a new performed dialect. Importantly, the Perceptual Assimilation Model can help account for difficulties in perceiving and acquiring a target performed dialect that is close to the actor’s own idiolect. This model incorporates perceptual attunement to the physical consequences of articulatory gestures that signal contrasts between speech sounds, incorporating speech gesture into a statistical experiential model. Degrees of distinction between phonological categories are what perceivers attend to; listeners can also learn to tune out phonetic sequences that do not signal a change of meaning. The listener can attune to a hierarchy of phonological differences and assign weights to these degrees, which are incorporated into the model (i.e., sound differences within a category receive little weight, while differences that signal a change in meaning receive higher weights).
The Perceptual Assimilation Model (PAM) predicts that discrimination performance on non-native contrasts will vary from poor to excellent depending on how the contrasting non-native phones are assimilated (according to the weights assigned) to native phonological categories (Faris, Best and Tyler EL1). If non-native sounds are not fully incorporated into a listener’s collective experience of sounds, they are categorized as examples of phonological sounds with ratings from excellent (e.g., native-like) to poor. If the features of the non-native sound are not consistent with any one native sound category, then it is uncategorized, and if it is not heard as speech (such as a lateral click from the Xhosa language), then it is non-assimilable into the listener’s sound inventory. PAM accounts for the counter-intuitive notion that sounds close to an actor’s own phonology may be more difficult to acquire, precisely because the actor’s native categories overlap with the target sounds of the accent and make those sounds harder to perceive. These categories can predict how well a listener can understand a non-native speaker and can help explain why some sounds are more salient than others. The next section takes the idea of categorization and expands on it, proposing that listeners track not just the sounds they hear but also the contexts in which they find themselves.

1.3 Top-down processes help organize the speech signal

The work of a voice practitioner is a continual balancing act between expert judgment and fine phonetic work with their actors, and understanding the empirical findings of this field about how these judgments work will strengthen that work. Theories of cognitive processes in the previous section describe the automatic and subconscious ways that listeners perceive speech and introduce just how variable the acoustic signal can be, which presents the first challenge to voice practitioners.
In this section, I describe the processes that listeners use to assign meaning to that acoustic signal, focusing specifically on those processes that live closer to the surface of the listener’s consciousness and experience of their world, which adds another layer to the complicated story of how listening and perception work in performance. The contexts in which people encounter speech affect the way that speech is processed, as demonstrated by countless experiments that measure listeners’ language attitudes. Social context even affects the very notions of “dialect” and “accent,” which turn out to be less stable notions than voice and dialect practitioners would like to admit. It appears that all these conscious and socially constructed processes come at a cognitive cost; listeners who judge speakers as less easily understood are in turn less likely to absorb information or content in the speech signal, creating a feedback loop where perception creates the results of that perception (Rubin 522). The following section explores the causes and consequences of these social top-down expectations and how they may relate to the precarious position voice and dialect practitioners hold in navigating these social contexts.

1.3.1 What is a dialect anyway? Social construction of an accent

To answer the question “what is a dialect?” requires consideration of how power intersects with the lived linguistic lives of everyday people. This power manifests itself by way of individuals’ standard language ideology, but these individual ideologies are shaped by authorities and the broader society in which they live, which impose an indexical order on speakers who are accessing macro-sociological categories as individual values (Silverstein 193). Answering this question points to who society at large believes has an “accent” or a “dialect” or a non-standard variation of a standard language.
These speakers are already marked by the community as deviant from the norm or accepted language usage. Pragmatic knowledge perceived as individual value judgments attached to linguistic forms varies depending on the availability, accuracy, detail and control that speakers in a community have (Preston 188). Michael Silverstein (1981) refers to these dimensions in his work on indexicality of the linguistic forms in the minds of speakers and listeners. The term indexicality is the notion that “attitudes towards and folk beliefs about languages are not isolated instances, but reflect patterned and structured ideologies within cultures and speech communities” (Silverstein, qtd. in Preston 182). Voice professionals must account for their own and their students’ systematic beliefs, the beliefs of the speakers they are trying to emulate, and the beliefs of the audience who will hear these speakers in specific instances, which creates a rich ecosystem of ideology to unpack as part of the rehearsal process. This accounting begins with the overarching idea of who even has an accent in the first place, as Preston (2016) notes:

When the folk say that someone has an accent, there are at least two important differences. First, for linguists, if the word “accent” is a technical term at all, it refers exclusively to the phonetic/phonological level. Folk respondents very often refer to the entire linguistic system with this word. Second, and more importantly, linguists know that everyone speaks some regional variety, even those heavily invested in removing such matters from their speech. Folk comment abounds, however, with the idea that somewhere there is “accent-free” speech; in the United States, for example, many respondents identify the Upper Midwest as “accent-free,” perhaps particularly those from the area itself.
(182)

Given these attitudes, explicit standardization (the codification of pronunciation, grammar, lexicon, or spelling for a given language variety) often interacts with explicit political structures (Moyer 85). Sociological indexicality affects power structures on the personal and institutional level. Often the type of political power involved with standardization includes the right to determine what may count as a “language” and what may count as a “dialect.” As an illustration of the interrelating power structures of language ideologies, I highlight the particular issue of labeling dialects within English, which could be an aid for voice practitioners in approaching the indexicality of their chosen accent or dialect.

Even within regional and international dialects of English, power structures dictate “inner” and “outer” circle dialects to legitimize certain speakers of English over others. The use of inner and outer delineates a type of privilege that has been awarded to historically white English speakers, while those on the “outside” have been relegated to a second or lower class of acceptable English usage. The inner circle consists of the United Kingdom[28], Ireland, the US, Canada, Australia, and New Zealand, and represents about 3–5% of English use in the world. In the remaining 95–97% of usage, English is a second language or is heavily influenced by local languages and used in official governmental contexts (Moyer 91). Using this delineation disrupts the colloquially accepted idea of nativeness, since some English dialects have indeed undergone language change independent of the trajectory of English in the inner circle (Mollin 170). Dialect coach Jerry Blunt implicitly uses this inner/outer orientation towards his collection of audio dialects, where he sought non-native dialects, but not in the context of the speakers living in the locations where each language variety is found.
Instead, he sought speakers from “outer” countries living in “inner” locations like the UK or the United States to collect audio samples. Given these complications, in these experiments I will refer to non-native dialects and mean English dialects where the speaker has learned English as their second language in a location outside of the United States.

[28] Within the United Kingdom some varieties of English are not as widely accepted as others, depending upon markers of class and geographical location. Perceptions of outsider versus insider status also vary with identification of the label “United Kingdom.”

Dialects standardized explicitly by prestigious institutions such as formal education or voice professionals are direct reflections of ideas about what accent is right or standard in any given culture. How these standardized accents are treated and reified can reveal elements of indexicality of the traits that are important to key authorities who control access to these artificial dialects. Though different dialects do not innately indicate characteristics of the speaker, the perception remains that dialect or accent can indeed reveal qualities about the character of the speaker and their status in wider society. These associations are often instilled in language users at a young age by various forms of entertainment; one study examining different Disney characters and accents found that villains were often portrayed with regional or non-native accents[29], while heroes often spoke in some form of unmarked or General American accent (Lippi-Green 92). The same phonemic trait can appear as a marker of both high and working class, and the dialect coach’s job is to aid the actor or student in disentangling phonetic realities from the social expectation of character traits.
Another example examines the use of rhoticity, defined as the appearance or disappearance of the phoneme /r/ in certain accents and dialects, especially in syllable-final position. The mere presence or absence of rhoticity in a dialect does not inherently point towards character traits, good or bad. Dialects of different prestige and mainstream acceptability employ r-lessness in various degrees. These accents are often seen as indicative of higher-class speakers specifically because of the perception of Received Pronunciation and other higher-class British accents as accents that were taught in schools and to performers and used in middle-class pursuits such as theatre. However, this type of r-lessness also applies to working-class dialects found in some neighborhoods of Boston, such as the Southie neighborhood, as popularized by figures who have actively cultivated a working-class image or persona, such as Whitey Bulger and Mark Wahlberg (Ulin).

[29] Since this 1997 study, the pattern has continued. In Lion King (2019), Scar (Jeremy Irons and Chiwetel Ejiofor) and Zazu (Rowan Atkinson and John Oliver) are the only British-sounding characters in Lion King, with the former the villain and the latter the sort of busybody killjoy.

The decision to use accents is influenced by stereotyped representations, where representations of accent and dialect in media are used as cognitive shortcuts for characterization (Bakanic 14). In the Disney study, foreign-accented characters in particular were more often seen as poor, uneducated, and as the bad guys, enforcing stereotypes that foreign-sounding speakers of English are not to be trusted (Lippi-Green 93). Further evidence points towards performed speech as a reflection of the standard or idealized speech of the dominant time in which the media is produced (Elliott 120).
Elliott investigates the predominance of certain speech varieties by tracking the change in rhoticity, or the use of /r/ at the end of syllables, throughout a century’s worth of movies and correlating that decline with the general decline in rhoticity in English in the United States over the same period of time. Though this claim could be attributed to language change in general and the fact that actors are also part of a language community, a tantalizing theory exists that the speech of actors (especially those explicitly trained in dialect or voice for the stage and film) can and does represent an ideal or standard style of speech, especially as it relates to Skinner and her Transatlantic speech.

Listeners may use expectations of idealized social norms for accents to assign character traits to accents, and may be using this same ideology to judge the authenticity and other social factors of the accents they are hearing. Using a performed accent (especially with actors who are trained to acquire a new accent in a short period of time) in this dissertation may offer access into the processes by which expectation of a standard accent affects perception. In this case, a listener may perceive a non-native accent through their conception of authenticity or intelligibility. German-speaking listeners were able to identify the origin of different imitated non-native accents (e.g., French, American, Italian) better than authentic non-native accents (Neuhauser & Simpson 1805). However, they were less accurate at judging the authenticity of the presented accents. That is, listeners’ expectation of authenticity does not translate to an ability to judge the authenticity of the accent. This may be because listeners are identifying stereotypical traits in the imitated speech they are hearing as evidence of authenticity, while these stereotypical cues are missing from the authentic accents that they are hearing.
The research in this chapter will further examine the effects of expectation on imitated and natural accents by examining other social factors that may be susceptible to these types of effects. Using expectations in this way will reveal the various layers of indexicality that listeners place upon their conception of dialect and the role dialects play in creating meaning for performance.

1.3.2 Social construction of an accent affects perception

Standard language expectations have also been demonstrated to have consequences in educational environments, demonstrating that comprehension is affected by listeners’ attitudes toward a speaker. In a foundational study on perceived accentedness in 1990, Donald Rubin and Kim Smith investigated lecturer ethnicity and lecture topic as factors in undergraduates’ attitudes towards International Teaching Assistants (337). They measured comprehensibility ratings after playing 4-minute lessons in either a ‘moderate’ or a ‘strong’ accent for 92 undergraduate students while projecting one of two lecturers, indicated by a photograph of a white or an East Asian instructor. Degree of accent correlated negatively with perceptions of teaching competence. In 1992, Rubin followed up this study by demonstrating that college students’ language perception and comprehension can be influenced by perceived race. Even when a standard American accent was used as the audio signal, students who saw a picture of an East Asian woman while listening to a lecture performed more poorly on the content exams in both the science and humanities post-tests (Rubin 516). In later research, Rubin named this phenomenon “reverse linguistic stereotyping,” demonstrating that listeners’ perceptions are sensitive even to the suggestion of racial context (Kang and Rubin 441).
Kevin McGowan explored the reverse of this effect in 2015, demonstrating that foreign-accented English paired with a picture of a person of a different perceived race resulted in similar detrimental effects on the listener (e.g., Chinese-accented English paired with a picture of a white woman). That is, listener expectation runs both ways; if the listener hears foreign-accented speech, they expect the image of the person to match the signal to which they are listening. In other words, listeners carry standard language expectations for more than their own language, and are poised to carry standard expectations for most of the accents they encounter in their lives. Voice practitioners and dialect coaches especially may make use of standard expectations or stereotypes to affect how audiences perceive the speakers of these accents. For example, if actors are using dialect in a surprising, non-stereotypical way, dialect coaches can account for the adjustment that audiences must make when they encounter these accents on stage for the first time.

To test the audience’s perceptions of stereotypical accents in a performance context, I will use foreign or non-native dialect as the tool for inquiry into this idea of intelligibility, as a specific example of a context in which audiences are creating meaning using the voice in performance. For the purposes of these experiments, I needed a target dialect or accent from which to work that could be controlled in the lab setting. Even with controls in place, asking a voice trainer to help an actor sound their best or most intelligible runs the risk of introducing many different variables. Beginning with a target dialect of Russian-accented English to be trained for an American English-speaking actor at least creates a target that can offer insight into listeners’ ideas of stereotype.
I acknowledge that reducing the lived linguistic experiences of speakers in this way is quite near the opposite of what I have been arguing in the dissertation up until this chapter. However, I do need an entry point into the world of context for intelligibility, and I need an entry point that will elicit reactions about that particular accent from listeners. Often, listeners are more likely to give explicit judgments or ratings when they are listening to non-native accents (Wester and Mayo). Starting so specifically with an accent like this means that the patterns and phenomena captured in the subsequent experiments in this chapter may not be generalizable to every moment (as I have argued so far in this dissertation); however, they may serve as a baseline for further inquiry into perception of intelligibility. Scientific knowledge, after all, starts with as many variables controlled as possible and introduces more variables as the model becomes more complex. The next section describes the factors I will use in interpreting the results of two experiments designed to tease apart the social measures behind intelligibility.

1.3.3 Measures of factors affecting accented speech perception

In order to test audience expectations, I must be able to measure some kind of experience the listeners are having. To do this, I will be using language attitudes that listeners can use to label their perceptual experience. As in previous research, I am interested in attitudes intrinsic to the listener that affect the factors of accentedness, comprehensibility, and intelligibility, and in the context in which the listeners perceive speech (Munro and Derwing 285). For performance-specific questions related to voice and dialect, I turn to other listener-intrinsic qualities assigned to accented speech that can be measured on scales like those used to judge accentedness and comprehensibility.
For example, when judging qualities of speech, listeners used adjectives such as “appealing,” “clear,” “pleasant,” “intelligent,” and “sophisticated” in different amounts on a five-point scale while listening to different regional accents in North America (LaMonica). These different qualities can reveal specific attitudes about different accents and dialects. In LaMonica’s study, listeners rated Southern dialects as more appealing, yet not as sophisticated, as accents found in the Midwestern United States, demonstrating that, while these scales are often aligned, there is some independence in descriptions of accents. The factors affecting perception of accented speech are not fixed within the speaker or even the listener, as these factors can be influenced by the context in which the speech is being perceived, including expectations of the listener (Kang & Rubin 450). In the present study, we ask specifically about the effect of these contexts on the factors of accentedness, comprehensibility and intelligibility, along with other qualities that listeners may associate in particular with non-native accents both within and outside of the context of performance.

To test the influence of expectation of standard accents on social factors in perceiving accented speech and ultimately intelligibility, I employ a modified matched guise experimental design (Lambert et al. 44), giving different instructions to different groups of listeners to make them believe they are in different scenarios. Experiments utilizing matched guise attempt to hold as many variables as possible constant and employ a single speaker speaking the same material over a series of accents or over a series of contexts (Giles & Coupland 34).
For example, in one experiment employing matched guise, participants listened to the same stimulus while being assigned to one of two different listening contexts: being told that the speaker is either a native speaker of Cantonese or an American speaker (Hu and Lindemann 254). Listeners who were told they were listening to a Cantonese speaker gave higher accentedness ratings than those who believed they were listening to an American speaker. I employ a matched guise paradigm in a similar way, introducing different listening contexts to different groups of listeners and using stimuli from a trained actor imitating a dialect and from natural speakers of that dialect. While this series of experiments captures a mere fraction of the rich environment an audience member would encounter while experiencing theatre, I first start with the voice or audio as a way to directly compare this work to work that I have just reviewed.[30] I want to test whether the mere suggestion of expecting performed or imitated speech affects audience perception. If perception is affected by the mere suggestion, I would expect future work to demonstrate that the entire audience experience affects speech perception in profound ways that are yet to be documented.

To support these experiments, I conducted a preliminary study using this matched guise paradigm to determine whether listeners are sensitive to the differences between an actor imitating an accent and natural speakers of the target accent, and whether patterns of description exist for listeners. Data from this pilot have helped me identify social factors that may be associated with a socially idealized or maximally intelligible accent. Below, I review this study, using conventions from linguistic inquiry. Implications of these data for the field of cognitive humanities follow the description of the experiment.
[30] Another issue I must address is that all of the experiments for this dissertation were conducted during the COVID-19 pandemic and I only had access to my participants via the internet, so I decided the best way to control for differences in technology and access was to focus on the audio portion of experiencing voice onstage. Adding video or a picture of these speakers will be an excellent future direction for further study of this effect.

2. Preliminary study

2.1 Participants

I used Mechanical Turk[31] to gather results from 108 participants in an experiment that took an average of 2.1 minutes to complete. These participants were in the United States, were at least 18 years of age, and had indicated that they had completed some high school. Importantly, these participants were from locations outside of the University of Oregon community, so I was able to gather a larger variety of listeners.

2.2 Stimuli

Four recordings of one randomly selected sentence from the Hearing in Noise Test (HINT) sentences (Nilsson, Soli and Sullivan) were used as stimuli for the experiment. Three recordings, one from each of three Russian-accented English speakers, were selected from the Archive of L1 and L2 Scripted and Spontaneous Transcripts and Recordings (ALLSSTAR) corpus (Bradlow, Kim and Blasingame). The fourth speaker was a university student who was trained in a Russian accent through a Voice and Dialect theatre class offered at the University of Oregon. This actor was subsequently privately coached specifically on all the sentences for 4 hours (two two-hour sessions, one for the first 60 sentences and another for the second set of 60 sentences) and then recorded the sentences. Acceptability was determined by the dialect coach[32]; recording would continue until the dialect coach was confident each sentence was successfully produced in the target Russian accent. The student was coached to read the sentences, and to not act out
the sentences, as they might normally do in a theatre class. Russian-accented English was chosen as the target accent partially because of the availability of this accent through the theatre class, but also because Russian-accented English has been described by the research of Stephanie Lindemann as “correct but not pleasant,” and occupies an intersection of intelligibility and accentedness, where the speakers are perceived as heavily accented yet still intelligible (204).

[31] Mechanical Turk is an online service provided by Amazon that uses HITs, or Human Intelligence Tasks, and pays workers a small amount of money when they complete a HIT. In this case, I paid for this HIT at an equivalent rate of $15/hour.

[32] Much appreciation to Dr. Tricia Rodley for her contribution as dialect coach, and to Christian Mitchell as the actor in this experiment.

2.3 Listening groups

Participants were divided into four listening groups in a 2×2 design that examines speaker type (trained vs. untrained) and expectation (no expectation vs. expectation), as described in Table 1 below. This design allows for multiple types of comparisons of the data, by comparing either rows or columns to one another or by comparing all four groups to each other. By separating by expectation (columns), we can examine the effect the listener has on intelligibility and other factors, while by separating by training (rows), we can examine the specific effect of speaker training on intelligibility and other factors. A listener heard either two real Russian-accented English speakers (untrained group), or a mixture of one real Russian-accented English speaker and an actor (trained group). No participant heard a different combination of the real Russian-accented English speakers, nor did they hear any other speaker compared to the actor. These two groups of speakers were crossed with two listening conditions.
In one condition, the listeners were explicitly told that there is an actor in the group of two speakers (expectation). In the other, listeners were not explicitly told there is an actor in the group (no expectation). Listeners were randomly assigned to one of four listening groups.

Table 1. Four different listening groups in the experiment.

              Expectation    No Expectation
Trained       Group 1        Group 2
Untrained     Group 3        Group 4

2.4 Procedure

Participants were instructed to complete the experiment using headphones. In the expectation condition, participants were first informed that they would hear two voices and that one of those two voices was an actor who had been trained to perform an accent. In the no expectation condition, they were informed that they would hear two speakers, but were given no other information about those speakers. They were then presented with the audio of the sentence spoken by each of two speakers and could listen to each audio clip as many times as they wished. Regardless of condition, participants advanced to the next screen, where they were asked to select which audio clip contained the speaker they thought was an actor in a two alternative forced choice task. Note that only half the participants were told in advance that there would be an actor producing the speech, and in only half the conditions was an actor actually included in the sound files. After their selection, an attention question was asked: “how did you listen to the audio samples today?” Finally, participants were asked to explain their choice of actor using a free response text box.

2.5 Results (two alternative forced choice task)

For the trained condition in both expectation and no expectation, participants selected the actor more often than the other speaker. In the untrained condition in both expectation and no expectation, participants selected speaker 142 more often than speaker 140. Exact percentages are found in Table 2.

Table 2.
Results of two alternative forced choice task Expectation No Expectation Trained Actor Speaker 144 Actor Speaker 144 59% 41% 65% 35% Untrained Speaker 142 Speaker 140 Speaker 142 Speaker 140 53% 47% 65% 35% Independent t-tests show that all these percentages are not significantly different from chance (all p-values >.05), probably due in part to the small number of participants in each square. T-tests also show that proportions within condition (expectation versus no expectation) and within listening groups (trained versus untrained) are not significantly different from each other, while demonstrating a trend in the direction towards the first option that they were given in the experiment. Non-significance in this case means there is no detectable difference between conditions. That is, we do not have evidence that listeners are sensitive to the presence of an imitated accent. However,, like in all scientific experiments, the null result should be interpreted cautiously, as a null result neither proves nor disproves hypotheses that are established as part of the experiment. The trend in the task demonstrates enough possibilities that I adopted this procedure for the main experiments below. 128 2.6 Results (Free response question) Participants were able to type free responses to the question “Why did you pick that speaker as the actor?” Responses were coded with a type-token count by tagging each response with a keyword (sometimes multiple keywords). A type-token count refers to how many (tokens) of each type of keyword was found in the responses. Responses containing the word “unnatural” accounts for over 20 individual responses of 108 participants in both conditions and listening groups. 
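The type-token count described above can be sketched in a few lines of Python. This is only an illustration of the counting step under stated assumptions: the coded responses below are hypothetical placeholders, not responses from the pilot, and the name `type_token_count` is introduced here for the example.

```python
from collections import Counter

# Hypothetical coded responses (NOT data from the pilot): each free response
# has already been tagged by hand with one or more keywords.
coded_responses = [
    ["unnatural"],
    ["forced", "exaggerated"],
    ["clear"],
    ["unnatural", "fake"],
    ["clear", "natural"],
]


def type_token_count(tagged):
    """Count how many tokens (tagged responses) each keyword type received."""
    counts = Counter()
    for tags in tagged:
        # set() ensures a response tagged twice with one keyword counts once.
        counts.update(set(tags))
    return counts


counts = type_token_count(coded_responses)
for keyword, n in counts.most_common():
    print(f"{keyword}: {n}")
```

Sorting the counts with `most_common()` gives the kind of frequency ordering reported for the keywords in this section.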
The term “unnatural” was followed in order by “clear,” “forced,” “fake,” “exaggerated,” “recognizable,” “natural,” and “not authentic.” These descriptors appear to have positive and negative connotations in their use, with negative terms “unnatural” and “not authentic” being the most transparent negations to their counterparts “natural” and “authentic”.

Figure 1. Histogram of responses by keyword, and again with keywords by speaker.

Dividing each key term by the choice that was made by the listener reveals that different terms used by listeners pattern differently in each instance of these keywords. For example, participants correctly chose the actor 70% of the times they used “forced” to describe their choice, compared to only 13% of the times they used “clear.” In particular, the use of the descriptor “clear” seems to pattern with selection of spontaneous accents, since most listeners who chose the descriptor of “clear” also chose one of the three speakers of the real speech samples of Russian-accented English as their selection for the actor. These selections show that listeners use different factors when selecting for an actor when they hear an imitated accent: the terms where the highest proportion of listeners select the actor include “unnatural”, “forced”, and “exaggerated”. Factors that listeners use in selection of natural speakers as the “actor” include “clear”, “fake”, and “natural”. Different proportions in keywords used to describe their choices point to at least some kind of sensitivity to the difference between imitated and natural accents. When examining the keywords that are used by expectation group (comparing the columns of the experiment design), another pattern arises in the responses. One key word “recognizable” is only used when the listener is explicitly expecting to hear an actor.
With explicit expectation, listeners were sensitive to this voice as imitation, with one participant describing the actor’s voice as imitating a famous actor. Another key word “natural” is used nearly exclusively while explicitly expecting to hear an actor. The fact that keywords appear in different proportions in expectation and no expectation listening conditions points to a possible difference in factors that listeners are using to judge imitated versus natural accents, regardless of their ability to accurately detect imitated accents versus natural accents.

Figure 2. Histogram of responses by keyword, by expectation condition

2.7 Interim Discussion
Data from the initial pilot study reveal the exciting possibility of patterns in how listeners conceptualize the voices they are hearing when there is an explicit expectation of intelligible speech (e.g., performed speech from an actor). Because of the trends and promising results, these findings lead to the experiments conducted specifically for this dissertation. I will employ specifically the keywords from the free response that were coded and analyzed. Beyond these terms, these experiments ask how subjective measures such as accentedness, comprehensibility, and other measures of language are viewed when the listener expects maximum intelligibility. Further, these experiments dissect just what listeners mean when they are listening for intelligible speech. Experiment One examines what kind of social qualities listeners are assigning to each of the four speakers (the actor and the three Russian-accented English speakers) before the notion of expectation is introduced at the end of the experiment.
From the findings of the pilot, the possibility that listeners are perceiving idealized forms of language as intelligible speech when asked to listen for an actor is not immediately clear, so I explicitly ask listeners to select who they believe to be the stereotypical accent, which helps to approach the idea of intelligibility through a more top-down process and conscious activation of social standard dialect expectations. To answer the other research question, which of the social factors are listeners using to subjectively determine if they are hearing idealized or standard forms of language, the free response keywords will be used. In Experiment Two, listeners employ descriptions from the keywords of the pilot on Likert scales (scales of 1 to 9, a standard practice in surveys and social experimentation) to see how expectation of performance affects these descriptions. Using these descriptors, I can approach the social construction of intelligibility and explore what types of qualities listeners use while perceiving speakers they expect are performing for them. Determining which of these special qualities listeners are using can clue dialect and voice coaches towards new goals in vocal (and more specifically dialect) production. What if, while voice professionals were using intelligibility as a benchmark, we could use a different quality as a goal instead? If audiences are indeed sensitive to social context and expectation, why achieve “authenticity,” when authenticity is constructed out of expectation of stereotype and not experience of reality for these accents? These experiments aim to answer these questions in service of expanding the notion of who has an acceptable performing voice. Copious research evidence exists that points toward social expectation shaping how speakers are perceived by listeners.
The hypothesis is that this special case of expectation (that performance is a social context that triggers a change in listeners’ perceptions of language) is not any different. Therefore, if listeners are using performance as a special social context when we test different groups of listeners (those in the Expectation condition and those in the No Expectation condition), we should see changes between the conditions. We should see that expectation affects how listeners score the speakers on dimensions such as accentedness, comprehensibility, and the adjectives that were found in the pilot experiment. This would demonstrate, in part, that objective-seeming adjectives are constructed through the social context that listeners have in the scenario. If there is no change between the Expectation and No Expectation groups, then performed speech is not a factor that listeners account for in their perception of speech.

3. Experiment 1
3.1 Method: Participants, stimuli, procedure
I recruited forty-five (45) participants whose first language is English and who had no prior experience with Russian from the human subjects pool at the University of Oregon. In addition to collecting language background information from each participant, I also asked about their experience in performing and attending live performance (acting, improv, role playing, and any other type of performance). The speakers are the same as in the pilot study: three Russian-accented English speakers from the ALLSSTAR Corpus and the actor. Forty-two HINT sentences were selected (see appendix). Each participant heard ten different sentences from each speaker. Two additional sentences were selected, where the participant heard all four speakers. In Qualtrics33, participants were randomly assigned to one of two expectation conditions.
In one condition (expectation), listeners were explicitly informed that one of the four speakers they are about to hear is an actor, and that they will be asked to choose who they believe the actor is. In the second condition (no expectation), participants were not informed that one of the four speakers they are about to hear is an actor. Each participant heard each sentence selected for the experiment. After each sentence, the participant transcribed the sentence. These transcriptions were scored for accuracy using AutoScore34 (Borrie, Barrett and Yoho). After transcribing each sentence, participants then rated each sentence on a 9-point scale for both accentedness (i.e., “how accented is this sentence?”) and comprehensibility (i.e., “how easy is it to understand is this sentence?”) (Derwing and Munro). After participants responded to each of the forty sentences, they were presented with the audio of the same sentence spoken by each of the speakers and could listen to each audio clip as many times as they wished. They were asked to select from the four voices which person they believed was the actor in a four-alternative forced choice task. To test explicit language attitudes about stereotype and authenticity, a second sentence was played with all four speakers and the listener was asked, “which is the closest to a stereotypical Russian accent?” The results of these procedures follow, first examining how accentedness, comprehensibility and intelligibility vary by speaker (either the native Russian speakers or the actor), and then how these attributes vary by listener condition (expectation versus no expectation).

33 Qualtrics is an online survey platform.

3.2 Results
3.2.1 Intelligibility (Accuracy of recall)
Since the work of this dissertation directly challenges the notion of intelligibility, I am electing here to rename intelligibility to what was being measured functionally from the experiment.
Intelligibility, in this case, is the proportion of correct words recalled to the number of words in the sentence. These intelligibility scores are at ceiling or nearly 100%, precisely because this experiment was designed to maximize this score to examine how accentedness and comprehensibility behave with maximum perceived intelligibility. What follows is a table that shows the proportions of words correct for each speaker. Because the measure is a proportion, the closer the number to one (1), the more accurate listeners were in their transcription of the sentences that they heard. Standard deviation is used to indicate the extent of deviation for the group as a whole. In other words, this measures how different each of the individual scores for each of the sentences are from one another. This means that a smaller number indicates scores that are all very similar, while a larger number shows a larger variation in scores.

34 AutoScore is a program that compares the ideal sentence to a response sentence given by a listener, and automatically counts the number of words correct in the sentence, automating the process of analysis.

Table 3. Accuracy of transcription of sentences for each speaker.

Speaker        Mean accuracy    Standard Deviation
Actor          .9574            .1621
Speaker 140    .9504            .1325
Speaker 142    .9338            .1791
Speaker 144    .9386            .1308

The Actor has a higher overall accuracy of recall than the three Russian-accented English speakers. However, when compared for significance using t-tests, none of these accuracy ratings are significantly different from each other. The t-test between the Actor and the speaker with the lowest rating (Speaker 142) does not reveal significant differences between the two (t(677)=1.7981, p=0.0726). The following table reveals accuracy of recall by expectation condition, combining all four speakers in both categories.
These data are the same responses that make up the above Table 3, but are divided in a different way that might help reveal the role of social expectation in accuracy of recall. Listeners in the Expectation condition showed a higher accuracy than No Expectation in their transcription of the sentences. However, a t-test reveals that these results are not significantly different from each other (t(1356)=1.2303, p>.05). Overall, these results show a slight trend towards more accuracy for the actor, and more accuracy for the Expectation condition.

Table 4. Accuracy of transcription of sentences by expectation condition

Condition         Mean Accuracy    Standard Deviation
Expectation       .9505            .1281
No Expectation    .9403            .1714

3.2.2 Accentedness and Comprehensibility
Listeners scored each sentence on Likert scales from 1 to 9 for both accentedness and comprehensibility. Listeners gave a score of 1 for “not at all accented” and 9 for “extremely accented.” For comprehensibility, listeners gave a score of 1 for “easiest to understand” and a score of 9 for “extremely difficult to understand.” In other words, the higher the score, the more accented and less comprehensible each sentence is judged to be. Results are presented in box plots, which show a summary of a set of data. Each box represents the first and third quartile35 of the data set, while the horizontal line represents the median. The ends of the whiskers (the lines above and below the box) represent the minimum and maximum values in each set of data. In this case, the minimum and maximum are always 1 and 9 for any set of data. Below is the box and whisker plot that directly compares the scores for the Actor and the three Russian-accented English speakers. Each data point in this plot is an individual sentence that each speaker produced for this experiment.

35 A quartile is the median of the data below and above the median of the entire data set.
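The box plot summary described above (minimum, first quartile, median, third quartile, maximum) can be sketched as follows. This is a minimal illustration that assumes the quartile definition given in the footnote, the median of the values below and above the overall median. The ratings are hypothetical and are not drawn from the experiment, and `five_number_summary` is a name introduced only for this example.

```python
from statistics import median


def five_number_summary(scores):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]          # values below the overall median
    upper = ordered[half + n % 2:]  # values above the overall median
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


# Hypothetical ratings on the 1-to-9 accentedness scale (NOT data from the study).
ratings = [1, 3, 4, 5, 5, 6, 6, 7, 9]
summary = five_number_summary(ratings)
print(summary)
```

The box in each plot spans the second and fourth values of this summary, the horizontal line marks the third, and the whiskers extend to the first and last.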
Speakers 140 and 142 are judged to have similar accentedness ratings and received very similar scores overall. The mean accentedness rating is 5.7 (s.d.36=1.91) for Speaker 140 and 5.6 (s.d.=2.0) for Speaker 142. The actor received a mean score of 5.1 (s.d.=1.87), meaning he was judged as less accented than speakers 140 and 142. Speaker 144 received a mean score of 4.6 (s.d. = 2.08), which means they were judged the least accented of the four speakers, while simultaneously demonstrating the widest variation in scores.

36 s.d. stands for standard deviation.

Figure 3. Box and whisker plot that shows the median accentedness scores for all four speakers.

Compared to accentedness ratings, each speaker has a lower overall mean comprehensibility rating. Again, as in Figure 3, the higher the number, the less comprehensible the speaker sounds. The actor, while demonstrating accentedness ratings that are similar to Speakers 140 and 142, appears more comprehensible than these speakers, patterning this time with Speaker 144. The mean comprehensibility rating for Speaker 140 is 3.94 (s.d. = 2.21) and for Speaker 142 is 4.1 (s.d.=2.27). The mean comprehensibility rating for the Actor is 3.1 (s.d.=2.09), while the mean comprehensibility rating for Speaker 144 is 3.0 (s.d.=2.21). The literature posits that accentedness and comprehensibility ratings can be independent of each other, and these results demonstrate that these scores do not always correlate with one another. Listeners are indeed constructing different understandings of accentedness and comprehensibility for each of these speakers, and appear sensitive enough to the differences between each speaker to rate them differently from one another. The intelligibility scores for each of these speakers do not differ significantly, yet speakers still receive different scores for the accentedness and comprehensibility of their speech.
These differences demonstrate that there are different ways to construct maximum intelligibility, regardless of expectation. The next section teases apart these results further by examining listening by expectation.

Figure 4. Box and whisker plot that shows the median comprehensibility scores for all four speakers.

3.2.3 Accentedness and comprehensibility by expectation
The results in the next two figures further divide the data of each speaker into two listening conditions: expectation versus no expectation. The same data from the first analysis of accentedness and comprehensibility appear again, this time further divided by expectation condition. Recall the prediction that, if listeners are sensitive to social context, they would adjust their ratings in a different direction, thus demonstrating that intelligibility is further constructed out of different subjective judgments of voice and is highly sensitive to the context in which the listeners and speakers find themselves. In the first box plot, the red boxes represent listeners who were explicitly told to listen for an actor in the experiment, while green represents listeners who were not informed they were listening to an actor until asked to determine who the actor is in the experiment. The following box plot demonstrates that accentedness is not necessarily sensitive to the social context of expectation. Because the boxes overlap and look similar between the two conditions for each speaker, it appears that there is little to no difference in this chart between the conditions, with the exception that for speaker 144, the median is different. As a comparison, the results of the comprehensibility question show different patterns between the Expectation and No Expectation conditions.

Figure 5. Box and whisker plot for accentedness with expectation conditions.

In this next plot, similar patterns for Speakers 140, 142 and 144 show no differences between the two expectation conditions.
For the Actor, listeners in the No Expectation condition were more consistent in their scores, concentrating around a score of 2 (where the median is, right at the bottom of the box). In the Expectation condition, by contrast, answers varied widely, with the first quartile at 1, and the third quartile ending at a score of 5. Changes in the variability of scores may indicate a change in behavior as a result of expecting a certain type of voice or performance context while listening to speakers.

Figure 6. Box and whisker plot for comprehensibility with expectation conditions.

3.2.4 Who is the actor?
The second half of this experiment asked each listener to determine who they thought was the actor after listening to a block of forty sentences that contained ten sentences each for all four speakers in the experiment. Remember, each listener gave ten intelligibility, accentedness and comprehensibility ratings per speaker, but only gave one actor selection per experiment, so the number of data points for this question is much lower than in the above figures, which contain hundreds of data points. Though the overall number of data points is lower, valuable insights about listener behavior can still be found. In the following figures, the y-axis is the proportion of times that a listener selected a particular speaker in the task. While each of the four speakers demonstrated different scores for the above three characteristics (i.e., intelligibility, accentedness and comprehensibility), listeners chose the actor the most in both expectation conditions. As shown in Figure 7, social Expectation of performance helped listeners choose the Actor more overwhelmingly than in the No Expectation condition.

Figure 7. Comparison of expectation conditions for selection of speaker most likely to be the actor.

3.2.5 Who has the most stereotypical accent?
The second question in the experiment, about the listener’s perception of stereotypical accents, shows a different pattern.
Immediately following the first actor selection question is a second question, asking the listener to select who they think is the most stereotypical Russian-accented English speaker. This question was asked to try to explicitly access the top-down social judgments that listeners might be using while they are listening to speakers of an accent that they can recognize. In fig. 8, selections appear as a wider range of possible acceptable answers. Unlike the first question that was asked of listeners, the idea of a stereotypical accent is up to the interpretation of the listener and there is no “right” answer in the experiment. These preferences subsequently appear in the form of a wider range of answers. For example, Speaker 142 and Speaker 140 were selected in this question and given that they had higher accentedness ratings than the other two speakers, this answer seems perfectly acceptable. These selections are interesting given the individual profiles of intelligibility, accentedness, and comprehensibility with no clear correlation between the scores these speakers received and the selections listeners made at the end of the experiment.

Figure 8. Comparison of expectation conditions for selection of speaker most likely to be the most stereotypical speaker of Russian-accented English.

4. Experiment 2
In addition to the performance specific questions being asked for this dissertation, I designed a second experiment in an attempt to find what kinds of judgments listeners are making while in different social contexts of listening. Managing to capture any kind of differences between Expectation and No Expectation conditions demonstrates that listeners may be sensitive to different listening contexts and adjust their parameters of accent judgment in response to these contexts.
The second experiment is designed to probe more closely into the different types of subjective judgments a listener might use in a context of performance, guided by the pilot project and keywords that appear when training voice and dialect. The hypothesis is that these adjectives are more sensitive to changes in social context since they are more associated with the idea of performance as demonstrated in the free answer of the preliminary study.

4.1 Method: Participants, stimuli, procedure
Again, I recruited forty-five (45) additional participants whose first language is English, who had no prior experience with Russian, and who had not participated in the first experiment. In addition to collecting language background information from each participant, I also asked about their experience in performing and attending live performance (acting, improv, role playing, and any other type of performance). Stimuli are the same sentences used in Experiment 1. Three Russian-accented English speakers from the ALLSSTAR Corpus (Speaker 140, 142 and 144) and the actor are the same speakers as in the first experiment (see appendix for sentences that the listeners heard). Each participant was randomly assigned to one of two expectation conditions as in Experiment 1. Then each participant listened to each sentence one at a time in a random order. In the Expectation condition, participants were told that they are listening to sentences from four different speakers and one of those speakers is an actor. There was no such explicit instruction in the no expectation condition. After each sentence, participants were asked to judge each speaker on nine-point judgment scales for the five most frequent responses in the pilot study. The scales were established so that 1 stood for “extremely natural,” “extremely authentic,” “not forced,” “extremely clear,” and “not exaggerated.” A score of 9 represented the opposite of these adjectives (shown in Table 5 below).
Scales were aligned so that speech that sounded spontaneous would receive a lower score, while the higher scores are attributes that indicated performed speech, according to the adjectives gathered in the pilot study. The exception to this rule was the adjective “clear”, where 1 represents careful, planned or otherwise performed speech. This was a mistake in my experimental design because I assumed that spontaneous speech would be clearer than performed speech, despite the extensive literature that points to the opposite condition.

Table 5. Adjective alignment on the Likert scales for experiment 2.

Score of 1 (spontaneous sounding)    Score of 9 (performance sounding)
Extremely natural                    Not Natural at all
Extremely authentic                  Not Authentic
Not forced                           Extremely forced
*Extremely clear                     *Not Clear at all or UNclear
Not exaggerated                      Extremely exaggerated

4.2 Results
4.2.1 Adjectives by Speaker
The five box plots in Figure 9 below show the results for all five adjectives. A few patterns are immediately apparent; for example, Speakers 140 and 142 have similar patterns, appearing with nearly identical profiles in all adjective cases. However, in some cases, the scores for the Actor behave like those of Speakers 140 and 142, and in some cases like those of Speaker 144. The different behaviors for the scores of the Actor may indicate that different vocal qualities are valued in performed voices than others. Even more apparent from this part of the experiment is that listeners often assigned different values of each quality to speakers, which indicates there is a range of acceptable qualities for these speakers while maintaining the same basic level of intelligibility, as indicated in the first experiment.

Figure 9. Box and whisker plots of the results from the Likert rating of all five adjectives, by speaker. (Panels: Natural, Forced, Authenticity, Clear, Exaggerated.)
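Because the “clear” scale was anchored in the opposite direction from the other four adjectives, comparing it directly with them requires reading it in reverse. One common way to do that is reverse coding, sketched below. This is an assumption offered for illustration: the dissertation does not state that the scores were recoded, the ratings shown are hypothetical, and `reverse_code` is a name introduced only for this example.

```python
SCALE_MIN, SCALE_MAX = 1, 9


def reverse_code(score):
    """Flip a rating on the 1-to-9 scale so that 1 becomes 9, 2 becomes 8, and so on."""
    if not SCALE_MIN <= score <= SCALE_MAX:
        raise ValueError(f"score {score} is outside the {SCALE_MIN}-{SCALE_MAX} scale")
    return SCALE_MIN + SCALE_MAX - score


# Hypothetical ratings for one sentence (NOT data from the study).
ratings = {"natural": 6, "authentic": 5, "forced": 7, "clear": 3, "exaggerated": 6}

# Only "clear" was anchored in the opposite direction, so only it is flipped.
aligned = {
    adjective: reverse_code(score) if adjective == "clear" else score
    for adjective, score in ratings.items()
}
print(aligned)
```

After this step a higher number would indicate more performance-sounding speech on all five scales, which is the reading the surrounding discussion applies to the “clear” scores.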
For comparison’s sake, I have also provided a table of the mean score and standard deviation of each adjective with each speaker. Note in Table 6, with the exception of the adjective “clear,” the Actor averaged higher ratings in all other adjectives. Recall that the Likert scale for “clear” was flipped as a result of the experimental design, so having a lower average means they still sound the most performative in that adjective category. In fact, the Actor scored as the most performative sounding out of all the speakers regardless of adjective or listening condition. These scores indicate that each speaker exhibits a unique set of attributes compared to the other speakers. Each of the three Russian-accented English speakers differs from the other two and has a unique set of scores. In the case of Speaker 144, he scores lower than the other two native Russian English speakers on all attributes, indicating that speaker 144 could be perceived as a more “performative” spontaneous speaker at the same time speaking more clearly than the other two native speakers. With their similar intelligibility scores from the first experiment, these results indicate speakers can exhibit different combinations of these adjectives and still be sufficiently intelligible for performance. As a speaker, the Actor shares some attributes with Speakers 140 and 142 (scoring similarly in natural, forced, and authentic) and shares other attributes with speaker 144 (scoring similarly in the clear category). These attributes for the actor point to a possible special attenuation to how they are speaking, indicating there might be unique acoustic properties that the actor exhibits that may clue a listener into performance outside of social expectation.

Table 6. Mean and Standard Deviation for all five adjectives, by speaker.

                       Actor    Speaker 140    Speaker 142    Speaker 144
Natural        Mean    4.97     4.82           4.76           3.81
               S.D.    2.23     2.10           2.25           1.92
Authentic      Mean    4.91     4.54           4.51           3.86
               S.D.    2.10     1.97           2.10           1.94
Forced         Mean    4.83     4.79           4.48           3.85
               S.D.    2.22     2.12           2.22           1.99
Clear          Mean    3.72     4.68           4.75           3.55
               S.D.    1.91     2.12           2.22           2.97
Exaggerated    Mean    4.53     4.32           4.01           3.75
               S.D.    2.72     2.15           2.16           2.01

4.2.2 Adjectives by Expectation
The following box and whisker plots further divide the data seen above into two listening conditions: Expectation and No Expectation. If social context is a factor in judging these speakers, we should see a difference between the two conditions. How that difference manifests is the key factor in these results, and common sense may dictate that expectation of performance may shift a listener’s perceptions towards indicating that voices are more performance-like. For this hypothesis to be true, a listener who is explicitly expecting an actor will score speakers higher on the Likert scales (with the exception of clear), which indicates the listener believes that the speakers are exhibiting performative behaviors in their voice. In fact, trends in the data show the opposite of this prediction. Figure 10 shows a slight pattern in the opposite direction of this prediction. Again, note that red (or the boxes on the right side of each column) indicates listeners were explicitly expecting to hear the voice of an actor. The overall pattern does not show a lot of change between the No Expectation and the Expectation conditions, meaning the boxes in the conditions for each adjective almost completely overlap and share descriptive attributes in both listening conditions. However, examining a few combinations of speaker and adjectives leads to an unexpected result. Listeners score Speakers 140 and 142 higher in natural in the No Expectation condition than in the Expectation condition, which is the opposite effect than what the hypothesis of using social expectations predicts.
Further, Speaker 142 also exhibits similar behavior with the adjective ‘clear.’ Social expectation of performance has resulted in listeners scoring Speaker 142 as clearer (with a lower score) than when listeners are not expecting performance. This possibly indicates that a listener might be adjusting their social expectations for speech towards a more generous mode of assessment if they know they are expecting performed speech. This effect seems to become more pronounced for speakers that would otherwise be judged as less intelligible or comprehensible (e.g., Speaker 142 had the lowest mean intelligibility score, .933 (s.d. = .179)).

Figure 10. Box and whisker plots for the five adjectives that compare the two listening conditions. (Panels: Natural, Forced, Authenticity, Clear, Exaggerated.)

4.2.3 Who is the actor?
The results from the actor selection question pattern differently than those from the first experiment. Figure 11 shows the comparison between the two expectation conditions. The y-axis is the proportion of answers that were given for that speaker. When listeners were expecting to hear an actor, they overwhelmingly chose the Actor as the correct choice almost 60% of the time. However, when listeners were not expecting to hear an actor, they selected the Actor and Speaker 144 each around 40% of the time. While strictly better than a chance guess (25%), this does demonstrate that Speaker 144 and the Actor may share characteristics in common. This even split between selecting Speaker 144 and the Actor appears in the first experiment, as well (see fig. 7). Without social expectation of consciously listening for the traits of performed speech, listeners may be tapping other social expectations to determine their choice. In many aspects, Speaker 144 and the Actor scored quite similarly in the rating experiments.

Figure 11. Comparison of expectation conditions for selection of speaker most likely to be the actor in experiment 2.
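The comparison with a chance guess (25% with four speakers) can be made concrete with an exact binomial calculation. The sketch below is an illustration only: the counts are hypothetical, chosen to resemble a selection rate near 60%, and are not the counts from this experiment, and the dissertation does not report running this test. The name `binomial_upper_tail` is introduced for the example.

```python
from math import comb


def binomial_upper_tail(successes, trials, chance):
    """Probability of at least `successes` hits in `trials` guesses at rate `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical counts (NOT data from the study): 13 of 22 listeners pick the actor.
CHANCE = 0.25  # four speakers, so a blind guess is correct one time in four
p_value = binomial_upper_tail(13, 22, CHANCE)
print(f"P(at least 13 of 22 by chance) = {p_value:.5f}")
```

A small probability here would mean that a selection rate this high is unlikely to arise from guessing alone, which is the sense in which a proportion can be called better than chance.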
4.2.4 Who has the most stereotypical accent?

Listeners answered the question about stereotype immediately after selecting their choice for who they believe was an actor in the experiment. Figure 12 compares the two listening conditions, giving proportions of answers that listeners selected. Answers were spread more evenly between the four speakers when listeners were expecting to hear an actor, selecting the Actor approximately half of the time. However, listeners selected the Actor more than any other speaker when they were not primed to expect to hear an actor in the voices. This curious pattern reflects some aspects of the choices that listeners made in the first experiment. Listeners’ expectations of performance might have shifted in the second experiment, since they had just completed fifteen minutes of an experiment that asked them to listen for “authentic” and “natural” voices, indicating that one or all of these speakers might not be authentic spontaneous speakers of Russian-accented English. This result demonstrates that opinions about “stereotype” can fluctuate from listener to listener. This question about stereotype does not necessarily have one right answer in the way that asking who they believe the actor is has one correct answer.

Figure 12. Comparison of expectation conditions for selection of speaker most likely to be a stereotypical speaker of Russian-accented English in experiment 2.

5. Discussion: What can practitioners take from this chapter?

Overall, the results of these experiments show that there are multiple ways to construct maximally intelligible speech. The four speakers were not statistically different from one another in their intelligibility scores, and yet exhibited different levels of attributes such as accentedness, comprehensibility, or the five adjectives featured in experiment 2.
If practitioners of voice and dialect were strictly using measures of intelligibility such as those Dudley Knight advocates, practitioners would be happy to accept all four voices in this experiment as acceptable voices for onstage performance. However, subjectively, differences between the four speakers exist and practitioners might feel the need to work with the speakers to make them more ideal for performance contexts. Instead of affecting intelligibility (which is already at maximum, according to these experiments), speakers would modify different aspects of their voices to affect the other aspects that listeners are using to judge these accents like “clear” or “natural” or “authentic.” This work with these voices would not be necessary, since they are already perceived as maximally intelligible. Using these parameters to perceive these voices becomes even more complicated, because these experiments have demonstrated that the very context of expecting performed speech affects a listener’s ability to judge. Just by mentioning a context change, the goal posts for “authenticity” move in an unexpected direction when listeners expect performed speech. Listeners are more generous in their observations (i.e., willing to accept performed speech as spontaneous-like) and are more likely to rate inauthentic speech as more authentic than they would in contexts where they hear the same type of speech outside of performance. This phenomenon could indicate a privileging of performed speech by listeners, which also means that authenticity could be an easier target for voice and dialect coaches than previously imagined. Another possible explanation of these results is that listeners are only using the word “authenticity” in a particular context that requires some doubt of the veracity of the speech they are encountering, and therefore expecting performed speech more readily primes a listener to use this term.
This phenomenon is similar to when listeners describe marginalized voices as “articulate,” implying that the default expectation of the marginalized speaker is that they cannot achieve a certain level of articulation that is acceptable to the listener. This type of compliment is a backhanded way to show surprise that also can be explained by privileging the white listening apparatus and attendant expectations. To highlight the inequality of the use of this adjective, language advocates have criticized some users of this adjective. For example, when President Joe Biden is quoted as saying that former president Barack Obama is an “articulate and bright and clean and a nice-looking guy,” he is implicitly arguing that he could not expect a Black man in Obama’s position to speak and look the way he does (Alim and Smitherman 10). Results from these experiments show that authenticity might be used in a similar sense by listeners, especially when primed to expect a fake or performed accent, or in a scenario where “authenticity” might be questioned by a listener.

One further aspect of these results worth noting demonstrates that the Actor as a speaker of imitated Russian-accented English is not always the most stereotypical speaker. These results, hand-in-hand with the relative struggle of listeners to actually identify the actor in the experiment, point towards different criteria used by listeners to determine performed speech other than use of stereotype to identify performance. Other research has demonstrated that listeners have some ability at identifying different types of accents from one another (such as German and French accents), but cannot reliably determine if the accent they are hearing is authentic or imitated (Neuhauser and Simpson 1805). Due to the variety in answers in the experiments, listeners may not be relying on their stereotypical representations of this accent to help guide their judgments of performed speech and stereotype.
The relative spread of the answers that listeners provided to both the actor and the stereotype question demonstrates that pinpointing authenticity and performance is still relatively difficult for listeners to do and they are not reliable in this task. The expectation from their experiences of performed speech and stereotypical accents in performance did not reliably provide listeners with enough examples to accurately determine which speaker was the performing imitator. Voice and dialect practitioners should take these results with a responsibility towards their listening audiences; the listener will not always be accurate in telling imitated accents from authentic ones.

On the other hand, the inability of listeners to tell spontaneous accents from performed accents can work in the favor of marginalized actors who do not speak a mainstream version of English who wish to perform onstage. Recall that the intelligibility measure for all four speakers was essentially the same, even between listening conditions. This means that listeners, even while they vary in their judgments of the accents they are hearing, are still receiving information and meaning from every speaker in this experiment. This means that even non-native speakers in their own accents can easily be understood and should be treated as such as performers in their own right, and voice practitioners ought to take care in how they approach intelligibility with their acting students. Specifically in the United States, students and actors who have non-native English accents or regionally accented English can therefore participate with their own unique voice and not necessarily fear that they are unintelligible (in both the colloquial and the linguistic sense) to audience members in performance.
However, the accent or dialect they are speaking will still trigger normative language attitudes and can contribute to subjective meaning-making when the individual audience members combine their ideas about how these speakers sound and what they are hearing as part of the script. Social expectation of performance can even push the boundaries of acceptable performance voices further, since listeners are more willing to judge accents to be more spontaneous-like in their delivery, as demonstrated by experiment 2. The results of these experiments show that practitioners have more flexibility than once thought because listeners can accept a wider range of what it means to be an intelligible voice onstage. Simply expecting performed speech boosts these voices towards a more generous interpretation for a host of attributes that listeners use in the audience.

Future directions include creating a better design to tease apart expectation of performance and expectation of stereotype, since these experiments have not made clear the link between the types of judgments listeners were asked to make and their selection for stereotype, as evident in the wide spread of answers for the stereotype question. In the future, the order in the procedure of the experiment can change, where I ask listeners their stereotype judgments before they participate in the rating task. Then, after the rating tasks, I can ask listeners who they think is the most stereotypical speaker. If listeners change their answer and subsequently change their scores for these attributes, I can gain more insights into the types of judgments listeners make when considering stereotypical voices. I can then compare first impressions of listeners with listeners after they have had fifteen minutes to adapt to these voices.
I can also create a scenario where the speech is masked (i.e., less intelligible in the empirical sense), which might result in a wider array of judgment scores for all four speakers and help to reveal any significant differences that social expectation makes for listeners, especially in a scenario where they must rely more heavily on the social expectation of performance. Another challenge is present in the form of Rubin and Kang’s “reverse linguistic stereotyping.” Marginalized actors in bodies that read as non-white will also face a challenge with respect to social expectations of how they are expected to sound onstage. A future direction in this research can explore that space between expectation of accent from marginalized bodies and adaptation to these voices when these expectations are not initially met. How can we exploit the gap between expectation and reality to train audience members to expand their initial expectations when interacting with new people?

The main results that these experiments demonstrate for voice professionals fall into three themes. The first theme is that the context of performance will necessarily change how listeners are perceiving the language they encounter. The second theme is that there are many different ways to create a voice that has maximum intelligibility; all of the voices in this experiment could convey linguistic information to listeners. This second theme means that voices that have historically been excluded in performance are not excluded because they were unintelligible; prior prejudices and catering to raciolinguistic ideas of the white listener have excluded these performers in the past. The third theme is the ambiguous relationship between authenticity, imitation, and stereotype of which voice practitioners can take advantage.
Listeners cannot reliably discern spontaneous accents from imitated or performed accents, which means there exists a space where voice practitioners can use this fact to highlight real voices onstage next to trained accents and dialects and manipulate audience expectations towards the benefit of marginalized or non-standard dialects.

These three themes will influence the practical steps this profession can take that I highlight in the next chapter. The results combined with the alternative approach of conceiving voice training in general will offer a foundation for the best practices for voice training (and listening!) in the future. I will then tie these steps into other work that is being done to expand the idea of theatre in the final chapter and conclusion to this dissertation.

CHAPTER IV
TOWARDS A NEW TRAINING PARADIGM FOR VOICE PROFESSIONALS

“We celebrate it when white actors nail an accent...We celebrate it! Until we can offer that same detail and attention to all linguistic identities, and to a myriad of accents, we’re still going to be erasing the humanity of those stories and those characters.”
Cynthia Santos DeCure, “How Should Black People Sound?” New York Times

“One of the most repressed things for Black people in this country has been our voice. Right now, we’re seeing if we can really find our voice, at this time, and this specific moment, to specifically tell this story – this beautiful thing – the way the team wants it to be told.”
Tre Cotten, “How Should Black People Sound?” New York Times

1. Enacting the expansive imagination in voice

I begin with the above quotes from the article “How Should Black People Sound?” by Reid Singer, which ran in the New York Times on October 28, 2020, as a vision for the future of the profession of voice and dialect, explicitly correcting the white supremacist foundations upon which this profession was built.
My research offers a critical look at these supremacist foundations, examining the assumptions upon which the profession is built, and then interrogating those assumptions in a way that reveals the social construction of what it means to experience a voice onstage. I have used two critical lenses on the profession and practice of voice professionals to examine how the profession might grow more equitably into the remainder of the 21st century and beyond.

One lens, a critical philosophical examination of the assumptions of voice practitioners throughout the history of the practice, demonstrates the role of the profession in cultivating and enforcing explicit racial and cultural stereotypes. The assumptions of these professionals asked two important questions. The first question, a question of unlearning, seeks to eradicate any trace of what Micha Espinosa refers to as “cultural voice” or marker of class, gender, or race (75). The second question of previous voice professionals seeks to implant a sanitized version of the voice, whether through generalized standard accents, or through stereotypical foreign and regional dialects. These two questions appeared in the name of an objective measure of clarity or intelligibility; the idea that the audience needs a sanitized version of authenticity governs the previous choices of voice professionals. Paradoxically, voice professionals assume the audiences’ hunger or awareness for authenticity on the stage but historically have not offered a system with which authenticity of various lived linguistic experiences may be honored. In this case, voice trainers’ perceptions of audience need drive the profession, and not necessarily the lived experiences of actors nor the audience. Intelligibility, a driving force in the working epistemology of voice, has been elevated as an objective and separate gold standard by which voice professionals ought to measure the success of their work.
By treating intelligibility as an objective measurement, voice professionals again denigrate the autonomy and lived experiences of actors and audience alike, by excluding any differences in subjective experience in society.

A second lens, and the key intervention of this dissertation, takes seriously the notion of intelligibility as a socially constructed judgment that has a real-world effect on perception and affects individuals differently when they use intelligibility to perceive their world. This lens is from the cognitive linguistics field and offers a different explanation of how speech perception may work in performance, which opens an avenue to respect different lived linguistic experiences of actors and audience members alike. This lens uses cognitive research into speech perception to establish an experiment of intelligibility in performance, demonstrating that intelligibility may not be as objective or easy to measure as voice professionals may desire. Interrogating this assumption of intelligibility is necessary to support a contemporary approach to voice training that actively resists the racist underpinnings of the profession. While these findings complicate the picture of providing professional voice services in some ways, they also demonstrate that audiences may be more willing to meet the actor in their own lived linguistic experience and may not be relying upon “authenticity” as a measure as common sense might imply. Further, the use of “authenticity” only arises when listeners are using their expectations of stereotype to guide their listening. This linguistic experiment demonstrates that listeners are willing to change their perceptual patterns only with the mere suggestion of performance or pretend. In other words, empirical evidence suggests that audiences are more likely to receive marginalized voices in the realm of performance and theatre.
With a critical look into the past of these practices and a cognitive look into the present-day reality of actual audience behavior, I can now offer my own approach as a white theatre maker and cognitive scholar that incorporates a more just and equitable voice practice for the future. To do this, I look to colleagues in language education, and will incorporate what April Baker-Bell calls an “Anti-Racist Black Language Pedagogy” that explicitly acknowledges and actively works against the white supremacist structure of which the practice of performed language is a part (11). By acknowledging the problematic foundations, introducing the working assumptions of the profession, and offering evidence-backed ways to push against these assumptions, I can contribute to a profession that can honor healthy holistic approaches to the voice and lived linguistic experiences of those who have been excluded from theatre, film, and entertainment. With enough practice, this approach can actively push back against the larger societal phenomenon of linguistic discrimination by extending audiences’ and performers’ imaginations in honoring actors’ own lived experiences and introducing accents and dialects in dynamic and surprising ways.

Creating an anti-racist approach to dialect training that does not rely upon pre-existing notions of standard language ideology or imperialism offers the chance for practitioners and audience members alike to experience a different form of empathy by enacting their imaginations towards a world where people of all different lived linguistic experiences have a right to their own stories in performance. Anti-racist approaches to dialect and voice training require an explicit examination of the assumptions that drive this behavior, and the choices that continue to be made regarding using accented language onstage.
This type of empathy does not necessarily always mean that the audience will be comfortable, as confronting detrimental biases sometimes requires sacrifice of comfort and complacency in the moment. As theatre trainer and anti-racist activist Nicole Brewer says, “It’s okay to explore another’s lived experience with empathy and curiosity, and deep profound listening, and it’s okay not feeling comfortable” (“Training with a Difference”).

The ethical imperative of vocal professionals is to provide a carefully scaffolded cultural framework to introduce these conversations into the production process, from pre-production through actor training. The benefits include creating a linguistically diverse soundscape onstage, which means that more actors and producers have access to American theatre-making. According to my preliminary linguistic data, audiences might be ready and eager to incorporate this soundscape into their meaning-making while experiencing the theatre. It is the ethical responsibility of the voice practitioner to take the steps necessary to respectfully represent linguistic lives on stage, and this dissertation has offered another critical intervention to ensure the respectful representation of dialect by considering audience experience as a direct factor in perceiving these dialects. At the same time, Micha Espinosa still explains the core goals of a voice practitioner:

I do believe that a voice should be free of those pesky glottal attacks and/or have the ability to sustain throughout a run of a show, but it was at that moment that I became aware of the cultural voice. A voice that has endured the dirt and struggle of constantly crossing borders might not be as aesthetically pleasing to some, but it was a lot more interesting to me (78).

Balancing audience expectations between “interesting” and intelligible need not be tied to racial, gender and class expectations or stereotypes.
Just as listeners construct the notion of intelligibility in everyday speech interactions, we, the performers and practitioners, can construct the idea of intelligibility to create an inclusive approach to voice. In other words, the choices we make as artists and practitioners hold real-world consequences: who we represent linguistically on stage reflects the values we hold while making our art. We must interrogate our own assumptions that guide our art creation, and to do so means we ought to be reaching for new tools and lines of inquiry. This is the lesson above all that I hope practitioners and scholars alike learn: theatre-making itself can be a critical tool in combating the detrimental belief system of a society that believes that how a person sounds is related directly to their worth.

This dissertation has sought to analyze the profession of voice and dialect through the dual methodologies of cognitive philosophy and linguistics in order to interrogate the assumptions driving this profession. In the preceding chapters, I detail the historical material circumstances and the prevailing assumptions about voice to establish the challenges facing contemporary vocal trainers. I created my own empirical inquiry to demonstrate that the notion of intelligibility, one of the driving qualities behind much vocal professional work, is in fact a subjective measure that is susceptible to implicit biases of the listener, information that should be a central concern to theatre practitioners and all cultural workers. These normative assumptions are baked into the theatre-creating apparatus and have contributed to a negative feedback loop that reinscribes and damages marginalized practitioners. The question before us as theatre practitioners generally is: how will we respond to the present call for a reckoning with the normative raciolinguistic listening practices of professional and regional theatre in the United States?
This dissertation is my attempt as a white theatre maker and voice coach with specific linguistic training to wrestle with the central question of equity in approaches to vocal training. I will not be able to correct all the historical violence that this profession has wrought on marginalized performers, but I offer an approach, using my own experience, that can begin to crack open anti-racist practice of voice training.

The previous chapters have established a complicated story of the interchange between audience expectations and the expectations of theatrical practitioners to build the idea of linguistic intelligibility on stage. In what follows, I will use the three themes that appeared in my experimental work to highlight my approach to voice and dialect practice. The three themes (listening is context dependent, maximally intelligible voices have a wide array of attributes and qualities, and the complicated relationship between authenticity, stereotype, and imitation) will appear in my past examples of work as I grapple with the question presented above. I will also use the historical lessons and assumptions that previous voice professionals have infused into this profession. I will push back against the two core goals: unmaking voices in their grit and lived linguistic experience and replacing these voices with sanitized or simplified versions of voice.

I am gearing this last chapter towards dialect coaching specifically for two reasons. The first reason is that the practice of training dialects has a highly fraught and explicit history in enforcing stereotypes, and therefore will be a fruitful site for anti-racist intervention. As linguist Vijay Ramjattan once remarked, “language practices are racialized, and language practices racialize as well” (Twitter).
The language practices of theatre are not immune from the racializing practices of assigning accents and dialects to characters with unsavory attributes, linking how people sound with innate negative stereotypes. The demand for the practice of dialect coaching is only growing and this is a prudent time to create an ethical framework from which to work (Singer). Secondly, I have personal experience in dialect training throughout my career as a dialect coach and have taken concepts from the research in this dissertation and applied them to project-based work. The three examples I will draw most heavily upon are my work in Pilgrims Musa and Sheri in the New World by Yussef El Guindi, Good People by David Lindsay-Abaire, and The Language Archive by Julia Cho. To accompany my practical work, I will draw upon theoretical examples from several other plays.

This work of undoing years of explicitly racist structures in the profession mirrors work by other practitioners in other areas of representation on stage. My own practices run in conversation with practices offered by Nicole Brewer and guidance set forth by the group WE SEE YOU WHITE AMERICAN THEATRE, which advocate for centering marginalized voices in all aspects of production, from onstage to offstage to the front of house and artistic management staff (“Training with a Difference”). Some voice work in this vein is gaining recognition in professional circles, from practitioner Daron Oram’s 2019 article “De-Colonizing Listening: Toward an Equitable Approach to Speech Training for the Actor” to the story of dialect coach Tre Cotten featured in the New York Times article that began this chapter. I borrow not only from theatre practitioners, but also from language teaching advocates like April Baker-Bell, whose “Anti-Racist Black Language Pedagogy” appears in her 2020 book Linguistic Justice: Black Language, Literacy, Identity, and Pedagogy.
Many different scholars and artists from marginalized communities have already begun this work and I will highlight some of this work in a section that prefaces my own practices by featuring the work of Latinx theatre makers. All of these examples of alternate approaches to theatre production serve to liberate and celebrate historically marginalized communities, offering an alternate vision of how theatre can shake loose the shackles of linguistic white supremacy. This chapter offers examples of liberatory linguistic practices that already exist within theatrical production, and the findings in my own linguistic research and dialect coaching support these practices. I seek not to reinvent the wheel, but to support and reaffirm the work already being done. Like the rest of my research and work, the method by which I conduct my work as a dialect coach sits at an intersection of my background as a linguist and as a theatre practitioner, utilizing the subjective knowledge I have accumulated as vocal coach and scientific knowledge gathered from my peers. Some of these practices have already been employed and some of these practices are steps I would recommend for the field. What has become abundantly clear throughout this research is that simple and robust solutions to equity, diversity and cultural competency in voice practices do not exist. The thorny history of white supremacy, raciolinguistic practices, and the material circumstances of theatrical creation has limited historic approaches to voice.

2. Case Study: contemporary linguistic needs for Latinx Actors and Directors

Contemporary voice practitioners are already creating space for voices that do not necessarily fit the mold of accepted performance, and I would like to discuss an approach that one facet of this community is using to push back against the dominant white mode of theatre production.
Borrowing from the scholarship of Gloria Anzaldúa, Micha Espinosa, a voice trainer and professor at Arizona State University, uses the term “cultural voice” to describe voices of actors who do not fit the dominant mode of acting: “Cultural voice is described as the self-constructed, emotionally bound, non-dominant performer’s identity and identification with the social-historical values and principles of one or more cultures” (75). Espinosa enumerates the difficulties of working as a marginalized voice in a white-dominated space, connecting her own struggles as a voice practitioner in higher education with Anzaldúa’s 1987 essay “How to Tame a Wild Tongue” and the linguistic terrorism of the “authentic wild tongue” (75). While in the white-dominated space, Espinosa feels, “to succeed, Latinos and Mexican students have to negotiate an identity with the psychological and physical realities they have been given. Both students and teachers often find themselves working with unexamined and opposing sets of external and internalized beliefs” (78). In this way, Anzaldúa’s concept of border identity is reflected in Espinosa’s personal experience, in a way that evokes physical location and environment as stand-in for the linguistic realities of students:

I have been straddling that tejas-Mexican border, and others, all my life. It’s not a comfortable territory to live in, this place of contradictions. Hatred, anger, and exploitation are the prominent features of this landscape. However, there have been compensations for this mestiza, and certain joys. Living on borders and in margins, keeping intact one’s shifting and multiple identity and integrity, is like trying to swim in a new element, an “alien” element (18).
I have demonstrated that the desire to “Tame a Wild Tongue” and its detrimental effects can be seen throughout the history of the voice profession, and yet this desire still remains in contemporary practices of training in both theatre and film. The discomfort of marginalized speakers in this new element of catering to the white listener is the focus of Espinosa’s training and her commitment to honoring the voices that “constantly endure the struggles of constantly crossing borders” as rightful participants in the theatre-making apparatus of higher education theatre. Oftentimes, the desire in terms of voice is for a mildly Hispanic accent that can read as “spicy” or “foreign” as a cognitive shortcut to character or entertainment, but the accent or dialect does not read as coming from a distinct culture or geographical area. The psychological damage done to the actor is described succinctly by Espinosa, “When we unconsciously continue to cast that one Latino student as a spirit, or ‘other,’ we again propagate Eurocentric dominance and the student’s social marginalization” (81).

To get a deeper read on the contemporary linguistic issues facing Latinx actors and theatre producers, I would like to highlight a recent conversation I had with fellow scholar Dr. Olga Sanchez Saltveit, who is a director and educator with over twenty years of experience bringing Latinx stories to the stage.37 Sanchez Saltveit shared her frustrations with the entertainment industry. Specifically, Sanchez Saltveit points to historic assumptions of directors and producers that, “any Latinx can be any kind of Latinx” (Personal Communication). This expectation results in actors creating a type of accent or dialect that is not reflective of any one geographical area as part of fulfilling the expected role. These stereotypical accents often accompany stereotypical roles, reinforcing harmful social and racial stereotypes.
Further, these accents erase any type of Latinx indigeneity; speakers of minority languages such as Quechua (spoken in some South American countries) are not represented with these generic Spanish-dominant accents. Additionally, Sanchez Saltveit points out that actors are sometimes asked to translate from Spanish to English on the fly, thereby performing a type of free labor for bilingual productions. Sanchez Saltveit’s personal experience is reflected in larger industry patterns. Mexican-born film producer and director Batán Silva remarks in a recent New York Times article, “There’s nothing worse than a Mexican character who sounds like an Argentinian or a Spaniard...Or actors who say seven things in Spanish, and then miraculously switch to English” (qtd. in Singer). He follows this statement by remarking that production studios are beginning the work of truly and deeply diversifying their production staff, actors and voice professionals included, to more accurately reflect the specific cultures required to fulfill the script’s demands. Sanchez Saltveit has seen promising signs of change for the industry as well, as professional theatre casting has begun to reflect each script’s specific cultural demands more accurately. Balancing these pressing issues of representation are the audibility requirements and needs of the production, namely that actors still must be heard by their audience.

37 Thanks to Dr. Olga Sanchez Saltveit for permission and input on this section that summarizes this conversation. The original scope of this dissertation included a systematic survey and interviews with practitioners, coaches, and directors, but that plan had to be altered in the wake of COVID-19. What follows is a summary of a conversation that could serve as a source of future inquiry with other theatre professionals.
In higher education, where a fair number of Latinx productions are produced, Sanchez Saltveit warns against using actor training to reinforce its own type of standard language expectations. She says, “Asking Latinx actors for Latino accents ultimately enforces codeswitching and appropriate times to be ‘Latinx’ or not” (Personal Communication). The use of the appropriateness paradigm in language teaching again assumes the primacy of the typical white listener as the target for language (Rosa and Flores). While she is sensitive to how alienating university and professional theatrical production can be for marginalized students, Sanchez Saltveit agrees that actors must deliver their text with clarity. Whatever accent they may bring to the process, she asks all actors to be “clear, strong and vocally present” (Personal Communication). When I asked her further about the definition of clarity, she responded that she defines clarity as:

The ability to understand people’s words, what they are saying. I know my hearing of what people are trying to say is broader because I have more experience with so-called accents. I grew up listening to people speak accented English. Clarity has to do with the receiver, and the ability for your audience to discern what the actor is saying. (Personal Communication, emphasis my own).

The professional instincts of Sanchez Saltveit in this one conversation confirm what the linguistic experiments of the previous chapter demonstrate: that judgments of intelligibility and clarity are in the minds of the audience. With a little practice, listeners can also increase their hearing range to understand a wider breadth of voices onstage. Further linguistic evidence of practice and accommodation can be found in the work of Melissa Baese-Berk, Ann Bradlow and Beverly Wright, where listeners are able to adapt to novel accented speech after training on a variety of other types of accented speech (EL177).
Change in both production and perception happens by two avenues: including historically marginalized theatre professionals in the larger production apparatus, and exposing audiences to a wider variety of lived linguistic experiences. This brief conversation offers a very specific viewpoint of linguistic issues in a particular identity group and does not begin to address the nuance and myriad approaches in one theatre-producing community. Other marginalized theatre creators run into similar obstacles while creating theatre in the larger theatrical apparatus of a white-dominated field. The rest of this chapter is dedicated to imagining a world and creating an ethical framework that not only minimizes the harms of centuries of stereotyping in entertainment, but also offers a way to actively undo harms that have been created by the dominating raciolinguistic assumptions of larger society. As executive producer Lang Fisher says, “We don’t want caricatures, and so it’s important not to have actors just winging the accent” (qtd. in Singer). To that end, what follows is aimed at encouraging thoughtful and diligent work from pre-production through audience interface.

3. Pragmatic answers to utopian questions

These practices are especially fraught when the production still requires training an accent or using dialect work as part of building the world onstage. This scenario involves a production team (casting, directing, and vocal coaching) that attempts not only to create an artistic statement satisfying to those who are part of the production, but also to cast the show as ethically as possible. The old approach to this issue would be to hire a dialect coach and have them teach a pre-created accent or dialect that the production requires, either from available materials, or custom crafted by the dialect coach to fit the desires of the production team. Oftentimes this accent or dialect is made of different elements or “ingredients” of the voice,
with “a recipe for every person’s voice” without much regard as to the source of these ingredients (Barton, qtd. in Sakland 30). This approach, with its two explicit goals of un-creating the actor’s voice and re-creating a pared-down version of an accent from pre-existing ingredients, is a reification of raciolinguistic ideals with a white listener in mind. I have established through the work of this dissertation that these practices are problematic at best, and actively harmful at worst.

To counteract this issue, production teams have several options. The most extreme of these options is to forego the accent or dialect entirely. This option risks flattening everyone’s lived experience to a “neutral” or General American accent that disqualifies the vast majority of actors who do not have a middle-class white cisgender background. Another option is to have actors use their home dialect or accent when in production. This option honors the lived linguistic experiences of individual actors, but might not serve the story as intended, and runs the risk of again reinscribing harmful stereotypes (if, say, an actor plays an angry character and happens to speak in a dialect from a marginalized group). Neither of these options serves to honor the cultural specificity that this era of entertainment and theatre production deserves.

Casting considerations may also reveal a third option for using dialects and accents in production. Responsible casting in theatrical production requires casting actors who share the basic traits of the characters they are to portray. For example, the documentary Disclosure advocates heavily for roles about trans characters to be given to trans actors (Feder 0:05:16). Production teams might advocate for casting actors who already speak the accent or dialect in question for production.
At first blush, this advocacy avoids the traps set forth by the first two options for our production team, but this third option reveals the desire for authenticity in representation onstage, a term that has been heavily problematized in this dissertation. As this desire for absolute authenticity is often practically impossible, I would caution against this enthusiasm. Brian Herrera addresses this issue regarding Latinx casting in production:

A more rigorous advocacy for culturally competent presentations of plays engaging Latinx racial, ethnic, and gender diversity need not solely rely upon demands for authenticity, indeed the hunger for authenticity—often rooted in some combination of fear and fantasy—can risk fetishization as readily as if promised the reward of cultural validation (33).

The use of actors with authentic dialects and accents risks reducing the nuances of the lived experiences of actors to a token that is ultimately used as a stand-in for the accent, for which, as my experiments demonstrate, audiences will be carrying a stereotype regardless. Herrera’s argument agrees with one of the takeaway points from my experiments: “the appearance of authenticity always lay in the eye of the beholder...the priority of presenting...requires a more reliable and more rigorous protocol than authenticity” (33). The average production team is put into an impossible position; they can neither approach accent and dialect by appealing to authenticity nor proceed without regard for the effects of how different voices present onstage. The following section explores yet more options to approach dialects and accents onstage, a place where dialect training can still be practiced, but is practiced with a heightened awareness and care towards the crushing mechanisms of white-dominated cultural spaces.
The keys to approaching linguistic casting for the stage include research with respect to the sources of this research, collaboration, and honoring both the lived linguistic experiences of the actor and the obligations of character as written or envisioned by the production team. The approach requires interrogation of the playwright, direction, voice training, and ultimately the audience to truly determine the role that accent and dialect training play in theatre making. There is hope yet for voice and dialect training!

4. My own practice recommendations for dialect

A strong vision of voice and dialect work within production situates the dialect or voice coach in conversation with the usual players (playwrights, actors, and directors) and incorporates their positionality as theatre makers within the white-dominated space in which theatrical production has historically been a participant. We can expand that linguistic positionality awareness to those often excluded from voice practice and include dramaturgs in the practice of accent and dialect. To keep my own practice as ethical as possible, as adopted from Chelsea Pace and Laura Rickard in their book Staging Sex: Best Practices, Tools, and Techniques for Theatrical Intimacy, I aim to be the first to name power structures in the room, both in production and in working with actors, to begin to demystify the myriad structures that uphold the practice of theatre (16). This looks like naming my point of view in the world as a white cisgender femme queer theatre maker38 who is (sometimes) paid for my expertise to teach and train with actors under my care. These practices I offer are from my limited and privileged point of view as a white theatre maker who aims to become an anti-racist co-conspirator in breaking the white-dominated structures that still govern theatre today (McIntosh 6).

38 I aim to use adjectives that describe myself from most visible label to least visible.
In plays that require dialect, the two positions of playwright and dramaturg contribute to interrogating the necessity of the dialect or accent in the first place. Because theatre makers strive to create stories that feature characters from many different backgrounds and perspectives, these characters will require the vocal obligation of cultural competency, promoting a space for respectful vocal training that acknowledges the necessity of trained dialects. What follows is a guide that roughly divides the production process and the ideal roles that a vocal professional may play throughout the life of a production, beginning with season/play selection and ending with public engagement about language in the play. Some of these practices spring from my own experiences as a coach, while others are suggestions that will enhance approaching dialect with a mind towards ethical and responsible considerations of voice. Following each subsection concerning guidance for each segment of targeted audience, I offer questions as guideposts for voice professionals in pre-production, working with the actor, and working with the audience. These questions build upon the suggestions of Bonnie Raphael in her 2000 article “Dancing on Shifting Ground” and Kim James Bey in her 2014 article “Speech Stereotypes: good vs. evil.” They also bear quite a resemblance to Elinor Fuchs’ “Visit to a Small Planet.” The final section of this conclusion offers thoughts that speculate on the future of the profession itself and how we can put into place practices for a more ethical profession going forward.

4.1 Who let this dialect: Pre-production and season selection

In contrast to contemporary approaches to dialect coaching, where coaches may be selected after the director has determined they want a dialect in their production, the dialect or vocal coach ought to be invited to the initial conversations around accent or dialect needs for an entire season.
In effect, the dialect coach serves to remind producers to consider ethical issues of voice and representation from the beginning. Historically, dialect coaches have not been included in season selection; in a report that surveys productions in the 2018-2019 season for member theatres of the Theatre Communications Group, Melissa Tonning-Kollwitz, Joe Hetterly, and I found that responsibility for determining the need for dialect work overwhelmingly lay with either the artistic director of a theatre company at season selection, or with the director when they are initially assigned their production (5). Dialect selection practices would benefit greatly from the keen expertise of dialect coaches participating in initial conversations with production teams. Further, this position ought to be adequately compensated so that the dialect coach is not tempted to recommend dialect work for a particular production simply to secure paid work for themselves within the season. The model of inclusion of a dialect or voice professional reflects the movement to de-gigify or create steady creative work in theatrical production, working back towards the models of theatrical creation in the 1950s and 1960s where professionals were hired for entire seasons or as permanent staff at professional theatre companies (Zazzali 48). Some regional theatre companies, like the Oregon Shakespeare Festival, do employ full-time voice professionals as company members. More positions like those at the larger theatre companies ought to be the norm, with a retooling of season selection to include the expertise of voice and dialect coaches. This recommendation reflects a call towards more stable employment in general in the sector of theatre, put forth by Brian Bell and Sam Hunter in American Theatre, arguing for a vast expansion of federal and state funding for live performance (“How U.S.
States Could Fund Repertory Resident Theatres”).

Another consideration in dialect coach selection is fitting the lived linguistic background of the dialect coach with the anticipated needs of the production or season. In this sense, actors who come from different backgrounds are more likely to see their lived experience affirmed through the professional conduct and creative decisions of the dialect coach. Micha Espinosa describes this affirming choice through her first experience of studying at a Patsy Rodenburg voice intensive under the direction of David Carrey:

I had never discussed the aesthetics of voice. I had adopted my Anglo teacher’s aesthetic. The voice teachers all agreed on the benefits of a clear tone and a healthy instrument. But one of the voice teachers, a non-native English speaker, liked a voice with a little dirt in it. A voice that sounded like it had life. Maybe that life was hard? Maybe that voice had imperfections? (78)

By training and employing voice professionals of different backgrounds, the profession can already begin to deconstruct the assumptions behind the chosen “aesthetics of voice” that have dominated the practice. In this case, the perceived voice aesthetic of the profession Espinosa was entering had no metaphorical dirt in it; it matched the listening expectations determined by the voice professionals who were instructing Espinosa in this workshop. Productions that employ marginalized or non-standard versions of different language varieties ought to endeavor to find and employ voice professionals of similar backgrounds. Luckily, resources are emerging that help support this recommendation for best practices. VASTA, as the leading professional organization that tracks voice professionals, has started to include search terms in their search service that honor different experiences. These terms include “cultural identity,” “equity/diversity/inclusion,” and “social justice” (“Find a Pro”).
The ethical responsibility of the dialect coach is to ask loudly and often if the accent is indeed necessary for the production. For example, if the playwright desires a Southern American regional dialect for a side character that is often teased for being stupid, the dialect coach may question that choice by asking if the playwright desires to play into stereotypes, and might question the use of that accent as a shortcut within the play or performance to signal the character’s intelligence to the audience. From my expertise, the moral duty of the dialect coach is to remind the playwright that accent is not indicative of intelligence and to ask the playwright to deeply consider their own biases and make a new choice.

In some instances, the choice of using a perceived non-accent or General American English39 accent is also worth consideration as part of the meaning-making process for the audience. The dialect coach must interrogate this choice, since the choice or desire to “do away with accent altogether” privileges the idea of general or neutral accents as maximally intelligible, which is an idea that this dissertation works diligently to interrogate. In entertainment, this route of eschewing the expectation that accent match character background has been used to great success in HBO’s television series Chernobyl. In an interview, the show’s creator Craig Mazin explains, “We didn’t want to fall into the ‘Boris and Natasha’ cliched accent [from The Rocky and Bullwinkle Show] because the Russian accent can turn comic very easily” (Freeth).

39 The definition of General American English that I am using is from Tonning-Kollwitz and Hetterly 2018, who define it as “a dialect of North American English that is free from regional characteristics” (295). The specific phonology is available as online supplemental material for that article.
Unbridled by concerns of authenticity, choosing not to match accent or dialect with immediate expectations may provide a fruitful avenue of theatre creation. At the time of season selection, a voice professional must account for ethical considerations, including a deep consideration of which accent or dialect is a) required by the playwright, b) desired by the director or production team, and c) appropriate for the actor who must use this dialect.

The first source of requirement ought to be the intentions and effects of accents used by the playwright in their writing. In new works development, a voice professional ought to act as a dramaturg and must ask the playwright two very important questions if the playwright desires to use a specific foreign or regional dialect for their characters. The first question, familiar to new works dramaturgy, is, “What is the work that you think the dialect is doing for creating meaning for the play?” The second question is, “What is the work that the dialect is actually doing for the play?” Oftentimes playwrights may desire the use of dialect as a sort of cognitive shortcut or stand-in for certain traits of their characters, which may cut down on exposition. However, these same cognitive shortcuts are very close to societal stereotypes and may in fact be reinforcing biases and stereotypes in ways unintended by the playwright. The answers to the two questions above may begin to disentangle the intent of the playwright in a new work from the potential impact of the new work. To illustrate how dialects may be used in new work development, I will use two different examples from established plays and playwrights to highlight the need to carefully think through the use of dialect as character trait. Often, to achieve their desire for a certain dialect or accent, playwrights might include instructions on how they prefer various characters to sound.
These instructions range from vague instructions in the stage directions to spelling changes that indicate phonetic differences between characters. The dialect coach’s responsibility in all cases is to interpret the intentions of the playwright and the work the dialect is doing for creating meaning on the stage. To demonstrate how to explore the ethical ramifications of dialect and accent requirements in playwriting, I will contrast two approaches to dialect that are baked into the writing of a play, considering the background of the playwright and the work that the addition of an accent or dialect might be doing in meaning-construction for the audience.

In Pilgrims Musa and Sheri in the New World, playwright Youssef El Guindi uses only sparse instructions for dialect work in his stage directions, which suggests that the directions he does include point towards pertinent details that must be included in the production. The opening character instructions read, “Musa (Offstage; accent)” followed immediately by “Sheri (Offstage)” (66). El Guindi indicates his desire for one of his main characters to have an accent, thereby also implying that the other character does not have an accent or speaks with a General American English accent that is read by the audience as neutral or accent-less. In this case, El Guindi wants to vocally separate Musa, who is a recent immigrant to the United States of America, from Sheri, who is a character native to the United States of America. Throughout the play, Musa reveals his desire to assimilate to American culture; being vocally marked and “Othered” prevents total assimilation and reminds Musa that he can never achieve his desire. In this case, a dialect coach or vocal professional considering this play may use these clues to conclude an accent or dialect is required for this play.
Other clues of requirements, such as differences in orthography written into the dialogue of the text, present a different challenge for the dialect coach, as these clues do not immediately lend themselves to ease of identification for the type of dialect that is required by the playwright. In these cases, careful consideration of the backgrounds and social statuses of the characters within the play and of the playwright is required to determine the need for dialects. Other instances of dialect desires might not be as clear-cut, and the dialect coach must carefully weigh the desires of the playwright against the potential to actively harm marginalized groups further. Often this decision becomes more difficult when the production in question is intended as comedy or satire. For example, in Avenue Q (2003), book writer Jeff Whitty writes the desire for dialect implicitly into the grammar of the character Christmas Eve’s lines: “He a pervert. You no spending time with him” (14). Christmas Eve is written as Brian’s smart East Asian girlfriend, and she often laments that people cannot see or hear her brilliance due to her accent. In some ways, this pastiche East Asian accent, often drawn from stereotypical examples of accents in popular culture (e.g., Mickey Rooney in Breakfast at Tiffany’s), reinforces audience expectations by once again tying the audio experience with Christmas Eve’s character traits. Perhaps Christmas Eve was written as smart satire, but Jeff Whitty does not do anything with her character that implies subversion, and in fact draws upon Herman and Herman’s exact examples in their Foreign Dialects book from the 1950s. A dialect coach may see the attempts at satire and is left with a decision. The next question ought to be, is the comedy “punching up” or cleverly uplifting a historically marginalized group?
Another way to examine this question: would a dialect coach or vocal professional, doing their due diligence and involving community input, be proud to present this character and dialect concept to a member of the community this character is trying to represent? Accent is often used as a cognitive shortcut to lead the audience to a conclusion about the character in question, and the dialect coach ought to examine every facet of this conclusion. Another option in this sticky instance is to investigate the implications of not using the desired accent. In this instance, if Christmas Eve used a General American accent, or even a location-specific New York accent, the audience would find themselves recreating the Kang and Rubin experiment of “reverse linguistic stereotyping,” where the reason Christmas Eve cannot find clients in “Sucks to be Me” is the audience’s expectation of her accent (442). Given the implications of either decision, the dialect coach has a lot to weigh and ought to be given enough power and respect to make the most ethical decision.

An unlikely ally of the voice professional ought to be the dramaturg; together they can offer deeply situated expertise that can guide the process of the production. Early incorporation and respect of the dialect or vocal coach helps to shape the production in the way that best combats racist, classist, and sexist linguistic stereotypes, an arm of the basic cultural competency responsibilities of the entire production team. The dialect coach, much like the dramaturg, can offer expertise about the linguistic lives of the characters in a way that shapes the overall meaning-making in production. For new works, the dialect coach’s number one question is, “What is this accent doing to enhance or detract from meaning-making for the audience?” This production team integration ought to combine with individual access to actors and ample time to integrate the linguistic requirements for the role.
Practically, in my own experiences, I am often asked to coach past the time for thoughtful pre-production integration and am often left with little time to work with actors on a desired accent, let alone to discuss with the director the motivations behind the desired accent.

4.1.1 A note on casting

A discussion of pre-production would not be complete without addressing the multi-faceted issue of casting and representation, which is a topic that has been addressed extensively in other venues of research and deserves its own dissertation’s worth of exploration. However, this guide would be incomplete without consideration of the effect of casting on dialect and accent coaching. Directors and dialect coaches have the responsibility to explicitly account for the power structures in society and the barriers that keep marginalized actors from entering the profession. In their quest for equitable representation, directors and dialect coaches must also be wary of fetishizing or tokenizing individuals as representatives of their race, ethnicity, gender, or socioeconomic status. For dialect coaches to stand the best chance of ethically doing the work they are hired to do, the explicit responsibility of representation of actors onstage ought to rest on the artistic director’s and director’s shoulders. A dialect coach’s job of ethical representation and training of lived linguistic experiences stands the greatest chance of succeeding if ethical approaches to casting are employed from the very outset of the production process. Cultural competency is the responsibility of all on the production team, with the director’s immediate responsibility being to consider the implications of the types of bodies onstage that they choose to represent the characters in their productions.
Acknowledging the racist power structures that exist in theatrical creation means that in production, roles that are created specifically for Black, Indigenous, and People of Color ought to be filled with individuals who fit the demographics of the character description as closely as possible or through thoughtful consideration of coalition casting (Herrera 33). A director ought to also consider casting in the opposite tenor, extending roles historically created for white actors to Black, Indigenous, and actors of color. With this foundation, dialect coaches can approach training these actors with as much specificity and care as in the casting process, tailoring the approach for each actor. Part of this process is creating space in training to explicitly acknowledge how society treats the lived linguistic experiences of not only the target dialect, but the lived linguistic experiences of the actor, as well. I will detail honoring such lived linguistic experiences in the following section about approaching the encounter with actors in training.

Pre-production work is vital for the success of the work that the dialect coach must do with the actors in the next phase of production. The following ethical questions and guidelines for pre-production work are not limited to only the dialect or voice coach. These questions can become the responsibility of the director, dramaturg and ultimately the artistic director, especially if the production company has an eye towards equity and representation. Much work has been done for authentic representation on stage, and the time is now to further extend that work to linguistic representation, which is an area that has been neglected. Production team members can work together towards ethical representation onstage while making room for the expansive possibilities exceeding the use of authenticity in production.
Dialects and voice onstage can start from grounded research in the lived linguistic experiences of speakers, but used thoughtfully against expectation, can lead to new ways to make meaning in the stories we present onstage.

The work of the dialect coach can also fall in line with new and important positions that are being created for production that ultimately respect the bodily autonomy and lived experience of the actor, in a similar vein to how Theatrical Intimacy Educators approach completing work while respecting the bodily autonomy of actors and producers alike (Pace). Like simulated intimate acts, this approach to voice and dialect coaching aims to respect the actor and to offer vocabulary for producing a simulation of the dialect or accent that respects all parties. Cultural competency and equity ought to be on the minds of every practitioner in production. Both professions build upon an exchange between the student or actor and their coach, in a configuration that can be physically, emotionally, and psychologically intimate and can be susceptible to abuse of power dynamics.

4.1.2 Questions to ask a play (and production team)

What are the dialect requirements of the play/playwright?
  ○ What character or personality traits do the accents point towards? Is the accent supposed to “stand in” for any character trait?
  ○ What does the use of a particular dialect reveal about the dramaturgical life of the characters?
  ○ What dramaturgically does a dialect contribute to the play?
  ○ How does the positionality or background of the playwright interact with the desired dialects or accents in the script?

What stereotypes/expectations (racist, classist, sexist) are the dialects participating/perpetuating? (Are you sure?)
  ○ In what ways is dialect enforcing these stereotypes/expectations?
  ○ In what ways is dialect use subverting these stereotypes/expectations?

What are the dialect requirements of the production company and director?
  ○ Are there expectations of a unifying dialect?

What are the dialect desires of the play/playwright?
  ○ Are there ways to include dialect/accent in a manner that subverts stereotype/expectations of the audience? How does this enhance the production?
  ○ How does dialect work connect into community outreach and audience education?

4.2. The heart of the work: one-on-one with the actor

After completing pre-production activities with the production team, answering the above questions, and getting a crystal-clear picture of the desires and ethical responsibilities of the vocal professional, the next phase is extensive research on the dialect or accent, working with real speakers of the desired dialect. This research includes specific research into the different intersecting identities of the characters (e.g., socioeconomic status, gender, race/ethnicity) with the aim to become as specific as possible when constructing the lived linguistic realities of these characters. Sources for academic research can range from specific linguistic descriptions of languages (in the case of second language speakers) and linguistic atlases (useful for pinpointing direct regional dialects) to dialect materials [40] that already exist for performance. An important source can be direct recordings from speakers, which are available oftentimes on YouTube, podcasts, and other archives of material. One overlooked source of language attitudes towards particular accents or dialects can be popular linguistics videos and articles, and general folk linguistic articles that include non-experts’ opinions about the accent in question. These types of resources can point an actor towards how their character might feel about how they sound, which gives actors access to more nuanced choices about the level of character self-confidence in any given scene and might affect their choices about how a character ought to sound when they are feeling an extreme emotion. For example, when a character is being teased about their lower-class upbringing, an actor might make creative linguistic choices about maintaining control over the more middle-class accent that the character acquired later in life. I have found that compiling materials about language attitudes towards speakers of the target accent or dialect can create access to a new form of dialogue that feeds into an actor’s autonomy over their sound on stage.

[40] See the bibliography of The Dialect Handbook (2003) by Ginny Kopf for a comprehensive collection of dialect instructional resources collected over the twentieth century. A future project of mine will be to collect resources created since this guide appeared.

A voice professional who aims to be an anti-racist co-conspirator ought to consider the level of formal linguistic education actors might have along with time constraints of training sessions with actors and cater the material accordingly. The voice professional can compile a resource similar to dramaturgical research that includes the reasons behind the accent or dialect that was selected, and the details of the target accent. Often, I include audio and video materials for actors as they desire an audio example from which to work, which activates the perceptual system, but can interfere with production of target sounds in unexpected ways (Kato and Baese-Berk 7). Regarding sanitized or theatrical examples of popular dialects, I will sometimes provide actors with materials from established dialect coaches (e.g., Blumenfeld, Singer, or even Knight and Thompson) with explicit discussion of the constructed and standardized nature of these dialects. I do not use pre-written dialects from countries outside of the United States and Great Britain. I will also always provide audio samples of real speakers.
One dialect that I often teach with pre-created material is Received Pronunciation, which often represents British characters on stages in the United States; my material consists of a series of worksheets and information adapted from colleague Dr. Tricia Rodley and Robert Blumenfeld (29). With careful framing of how to approach standardized dialects, I include a foreword to the materials that acknowledges power structures and a discussion of how these accents became standardized in the first place. I invite discussions with directors and actors alike to approach the desire for dialect in the first place.

Another approach to creating dialect work is to create a dialect or accent from scratch with which actors may work. Dialect creation has historically been a tool for voice professionals to impose their dialect ideologies upon speakers who enter their elocution classrooms and rehearsal rooms. There is room, however, for this type of dialect creation to be used in performance, especially when the characters’ backgrounds are fictionalized to the point of being from made-up locations. In film and television, extreme examples of dialect creation include creating entire separate languages. These constructed languages figure heavily in science fiction and high fantasy, from Marc Okrand’s creation of Klingon for Star Trek to the various languages in J.R.R. Tolkien’s The Lord of the Rings series and the languages created for Game of Thrones (Klingon Language Institute). These constructed languages often borrow their phonology and sound systems from languages that exist around the world. In a similar vein, constructing a dialect or accent would again consider the questions posed in the prior section. When real-world sources are selected for these types of characters, utmost care must be taken in order to ethically represent these fictionalized linguistic lives so that they may not reflect linguistic stereotypes that exist in the real world.
As an example of my own work in this area, I constructed the dialect for two characters, Alta and Restin, in The Language Archive by Julia Cho for my undergraduate student production in October 2013 at the University of New Mexico. Alta and Restin are the last two speakers of a language called Ell-o-wa, a language that the main character George has a vested interest in preserving (Cho 20). The few linguistic cues the play offers as to how these two speakers sound are a monologue from Restin wherein he describes a “golden fleek” (38), which demonstrates an important sound substitution at the end of words, and a few lines of dialogue in this playwright-created language between Alta and Restin. The production team determined that Restin and Alta are from some area where Slavic speakers live, so I, wearing many hats as dramaturg, linguist, and director of a student production, then turned my attention to the specific phonetic and phonemic categories of several major Slavic languages (e.g., Polish, Moscow Russian-accented English, and Czech) and selected several target consonant and vowel sound substitutions that would be targets for the actors. Part of the justification for targeting this part of the world was an explicit discussion of power dynamics that overlap with the bodies of the actors who would be featured speaking this dialect. While these dialects do not carry the overt prestige of western European dialects, they did carry a perceived, relatively neutral ethnic prestige in which speakers from other racialized parts of the world do not participate. In the play, Alta and Restin would sound like they were hailing from a foreign country, but the origin would be hard to pin down for the average audience member. This was important, because the two actors who were cast as these characters were of different ethnicities: one actor identified as white and the other actor self-identified as non-white Hispanic.
I wanted an accent that audience members would mark as “foreign” without necessarily marking one actor with more negative stereotypes. What resulted was an accent that gestured towards a certain part of the world but could live in the mouths of the two fictional Ellowan characters believably. Subsequent discussions with actors showed where each sound decision was made and how they related to real-world linguistic experiences. We could therefore approximate what could be considered the homeland of Restin and Alta without the need to fall back upon stereotypes of sounding Eastern European. Did the audience catch these nuances? The answer, according to the results of my linguistic research in chapter 3, is most likely not, but the important aspects of training and respect for the lived linguistic realities of speakers of minority languages remained a fruitful means of discussion and character creation.

4.2.1 Finally meeting the actor

The initial meeting between coach and actor is a crucial moment that sets the stage for, and the approach to, the work. It is in this initial meeting that misconceptions and unconsciously held biases are explored in a safe space one-on-one with the actor and discussed, along with the practical work of actors becoming intimately acquainted with their vocal apparatus. This is the first opportunity to create a space of mutual respect that acknowledges the power structures at play. The first lesson I impart during this meeting is that we all carry within us a voice that has been shaped and created by the location we grew up in, those who we called peers, and every linguistic interaction since acquiring language from a young age. The second lesson, directly from the lessons the study of linguistics imparts, is that we also live in a society where everyone carries with them standard language ideologies for language varieties, and these ideologies govern which varieties of language are more societally acceptable than others.
Expanding approaches to dialect and voice work past the individual to examine the socially built structures that govern language usage connects nicely to other work in character building and audience reception. Instead of approaching the individual as a psychological container unto themselves, this approach to dialect work foregrounds the idea that language is a tool for community use, and therefore is necessarily shaped by the ecology in which a community finds itself. Again, I invoke Amy Cook’s question about character building, “what does it mean to build characters from the ecosystem up, rather than a more psychologically focused method of character assessment?” (117). I take the same approach to dialect coaching and center language usage as an integrated part of community and ecology over and above any individual’s psychological experience.

After discussing general societal language attitudes with the actor, I begin by introducing the ideas of raciolinguistics, and how theatre is set up to cater towards an assumed white listener. In this discussion, I introduce the idea that accents and dialects are not inherently connected to character traits, but that society has an overwhelming tendency to assign character traits to how people sound when they speak. The framework we use will incorporate societal expectations (i.e., the average audience member) into the work while explicitly working against assigning character traits to how the actor will sound. This discussion is accompanied by any stereotypes we might be pushing back against with the production. Part of this work is a series of questions that we begin as a discussion point (see “4.2.2 Questions to ask an actor”) and that unpack how the actor thinks they sound before creating work on a new dialect. We then turn to discuss intelligibility as an attribute and feature of the listener.
I have not had the pleasure of sharing the results of my research in dialect coaching (partially because the pandemic has all but eliminated production for the time being), but I look forward to the day of showing actors how listeners react to performed accents and demonstrating that listeners might be more generous in their interpretations of the accents they will hear onstage. We then discuss the term authenticity and all the implications around the word, including its strange and uneasy relationship with how theatre uses representation to create meaning in the audience. This opens discussion for how we approach dialects for production; we will be targeting a few key sounds and rhythms to create a dialect that the audience will read as “authentic” while honoring the source of the dialect. What follows is an information session about the historical, ethnic, and socioeconomic circumstances that surround the target dialect (even if the dialect is constructed, such as Received Pronunciation). This discussion is often the bulk of the initial session, with a small introduction to language notation and an initial approach to the work.

After conversations about the theoretical framework and a small warm up, I introduce the actor to the idea of linguistic notation and offer a few brief exercises in thinking about how words sound (phonemes) as opposed to how they are spelled (orthography). Often, I incorporate a small introduction to the International Phonetic Alphabet with a historic framing about the origin of this alphabet. We begin to connect these symbols and how they are arranged on paper with the lived reality of the actor’s vocal apparatus. We then calibrate how we approach the work by exploring how the actor responds to commands to “make this sound harder/softer/wider” for future instruction and physiologically attune them to what their vocal apparatus is doing when they make certain linguistic sounds (Colaianni).
At this point in the instruction, I also introduce a sense of serious playfulness to the work, emphasizing the fact that it took six to seven years for the actor to master their first sound system (Statler, Heracleous and Jacobs). This type of work requires the playfulness of toddlers exploring their sound systems combined with the targeted work that an actor must do to achieve desired linguistic results. Opening the session for play helps to relax the actor and remind them that acquiring new skills will inevitably include making mistakes in the process. After initial discussion and exploration, I give actors training sheets that include practice of the IPA, target sounds of the dialect, and links to audio sources for practice.

Intense one-on-one sessions vary depending on the availability and schedule of the production and actors. Subsequent sessions include question-and-answer sessions arising from the initial discussion of framework, specific questions about target sounds and target lines in the production, and targeted practice with notes. Like Bonnie Raphael, I aspire to give notes “always stated in vocal rather than in acting terms” (168). I also give several physiological options for actors to access the sound they desire. When I provide sociolinguistic background to actors, I remind them that they may have sociolinguistic tools to make decisions about their dialect and accent usage but that they ultimately have autonomy to make acting choices with their dialects onstage. I will also participate in rehearsals at least once a week, and more often near the end of rehearsal, when there are full runs of the production, to give specific notes. While speakers are often variable [41] in their speech, I am listening for moments and target sounds that are attempted in dialect in production that do not quite make their targets.
I am also gathering feedback from the director, stage manager, and any assistant stage managers for their overall impression to ensure actors sound like they are of the same linguistic world, even if individual actors have different accent or dialect targets.

[41] For example, Huspeck demonstrates that almost every American English speaker is variable in their pronunciation of -ing at the end of words like “running” (152).

4.2.2 Questions to ask an actor

Initial questions to teach accent awareness and discuss lived linguistic experience with the actor before even approaching the target dialect:
  ○ Do you have an accent?
  ○ How do you feel when asked if you have an accent?
  ○ Have you been told, or do you think that you have an accent?
  ○ Where did you grow up?
  ○ Do the people you grew up with have an accent?
  ○ When was the last time someone pointed out to you that you had an accent?
  ○ How did that feel? Was it a positive experience?

Approaches to language notation, accompanied by an exercise that explores sounds in the mouth as they are arranged on the International Phonetic Alphabet:
  ○ Have you noticed that spelling does not always match pronunciation? In which sounds do you notice this discrepancy the most?
  ○ How will you consistently notate sound changes in your practice?

Finally, we combine language attitudes with practice and their attitudes towards their character’s accent or dialect. I ask the actor as their character the initial questions from above.

4.3 Audience outreach: Working within the community with the dramaturg(e)

Best practices for audience and community outreach are to build long-term, lasting relationships with audiences and the community far and above single productions, or even single seasons focused on marginalized groups. Like representation issues in casting, much research has been conducted in building community more generally with groups that have historically been excluded from the practice of theatre (Lacko 21).
The focus of this section is on particular roles that dialect coaches and voice professionals may fill over the course of production that lead to audience and community interfacing. These suggestions call for a tighter relationship between the dialect coach and dramaturg and whatever apparatus the production company has set up for community interface. In the absence of these roles, the dialect or vocal coach may fulfill these community obligations themselves.

Community outreach and coalition building may start within pre-production, when a dialect coach can partner with consultants to provide insights and audio material for dialect construction. This is an important part of the research process; access to audio for reference is an important part of the training available to actors. Access to collaborators also means the possibility of audio references that are custom tailored for the production. For example, in the 2019 production of Pilgrims Musa and Sheri in the New World, I was able to create custom audio tracks that included pronunciation of the Arabic that was used in the show, thanks to access to a willing Arabic speaker [42]. Not only were the exact phrases available, but I was able to work with my consultant to create slower instructional tracks that included pronunciation one phoneme at a time. In this case, the production was a community production, and the consultant was not monetarily compensated, but they were invited to a preview night and thanked in the program. Ideally, dialect coaches in a production ought to be allocated funding from the production budget to compensate consultants or pay to access other audio materials that are not readily available.

[42] Deepest thanks to Ryan Sayegh as consultant for this production.

Part of the dialect coach’s community outreach can be creating opportunities to educate the audience about their own linguistic biases as part of the dramaturgical process.
The careful, nuanced work that happens with actors must somehow extend to audience members, since it is within each audience member that meaning-making and perceptual processes happen. Usual venues for audience outreach include materials in the programs, interpretative displays in the lobby, and interactive events such as talkbacks and meet-the-artist type events. When dialect is considered core to the story or character, the dialect coach ought to be afforded the opportunity to provide materials and feedback on the dramaturg’s displays and notes for the program. Even short descriptions of accent or dialect choice for a given production can help give audience members insight into the complicated processes of how language attitudes and accent perception govern audience members’ everyday lives. Interpretative displays can include discussions of accents and actor training.

Another source of audience interaction can take the form of talkbacks with linguistic experts in the field. For the 2013 production of The Language Archive, where the play was about a linguist performing field work on a language with only two speakers remaining, I arranged an evening of talkbacks with two Linguistics PhD students who conduct research similar to the main character of the play. This accomplished two objectives: one was connecting audiences to a different way of conceptualizing language and language use, and another was introducing an entirely different audience to theatrical production, as the house was packed with linguistics professors and students who had never received the explicit invitation to a theatrical production by those who would be featured in the programming. Talkbacks can be structured in a similar fashion, with options for guests that include leaders from the speech community featured in the show, linguistic experts, or even the dialect coach themselves. This real-time interfacing can be incredibly valuable to all involved participants.
However, due to historical mistrust and exclusion from theatre creation, some members of communities may resist inclusion in production. Voice professionals, even with their good intentions, research, and community outreach, ought to be prepared for a non-response, or even a negative response, from the communities they research, and they ultimately bear responsibility for representation in production. This rejection can be as compassionate as the one Brian Andrew Cheslik offers in his article “ASL and Theatre: Here’s what not to do,” where he addresses the best intentions of directors who are trying to incorporate his native language of American Sign Language (because not all lived linguistic realities are auditory) into theatrical productions. Cheslik’s first request is casting Deaf characters with Deaf actors, to bring authenticity, representation, and work to these actors. Cheslik follows with, “While I appreciate that you want to share your student’s hard work learning sign language with my students, we are not interested in coming to see new signers butcher our language” (Cheslik). This point is an excellent reminder that even the most carefully researched, rehearsed, and practiced dialect for production will not in any meaningful way truly capture the depth of a person’s lived linguistic experience. This sentiment matters because it is directly connected to how theatres develop new audiences; without the ability to experience linguistic representation onstage, communities historically exploited for their culture will not trust these endeavors. Speakers of the group that will be featured have the right to not experience poor simulated approaches to their lives, even with the most careful approach from pre-production through community engagement.

While still a socially constructed term, authenticity has a large role to play in community representation and outreach.
Cheslik cites “authenticity” as the ultimate reason for his hesitance in engaging with hearing productions that feature ASL but also often points to poor planning and incorporation at all stages of production. “There has to be a reason why you have decided to do the show in ASL and English. Basically, there should be Deaf actors in the show. Do not just do a bilingual show with ASL and English without there being a Deaf performer involved” (Cheslik). While authenticity can be one avenue for community coalitions, Cheslik leaves the door open for Herrera’s coalitional casting approach to producing theatre with marginalized cultures. Including the marginalized group whom the piece of theatre is about (at various steps in the production process) is the ethical path for production.

Although I discuss it last here, approaching community coalition building and outreach at the end of the process is a mistake for production. These relationships need to build over time and in coalition with one another. These approaches help to avoid an exploitative relationship between the producing company and the community around which these plays and productions revolve. Such exploitative relationships directly contribute to the harm and stereotype creation that has governed this profession since its inception with elocution teachers. There has to be a balancing act between the tensions of authentic representation and authentic recreation on stage.

4.3.1 Questions to Ask a Dramaturg(e)/Community Outreach

How soon can we begin building community interaction and outreach? What clubs or organizations can we reach out to collaborate?
  ○ When and how are we tracking community outreach?
  ○ What is the budget to compensate community consultants for their contributions to dialect?
  ○ How else can we support a community or organization that historically has been omitted from theatre production?

How can we search for actors and consultants who are of the same background (overlap with casting considerations of the production team)?
  ○ What is our interaction with the casting process?

What do the dramaturgy materials look like?
  ○ Will there be interface via program/lobby display?
  ○ Will there be talkbacks or other interactive elements?
  ○ How do we create materials and community outreach that is as accessible as possible?

After the production, how can we critically engage with new audience members?
  ○ What does continued support and collaboration look like for communities that we have already engaged?

5. Challenges that Remain, Where Do We Go from Here?

This ecological approach to building the profession of dialect coaching complements existing anti-racist and decolonialization efforts in the theatre. I struggle against the prevailing tide of voice training as a singular white theatre maker, and I look to practitioners and scholars in actor training for answers. One such practitioner is Nicole Brewer, who is an advocate for decolonializing acting training. In her article “Training with a Difference,” Brewer argues against equity, diversity, and inclusion frameworks on the grounds that the policies they produce are often not strong enough to address the deepest underlying issues. She, like myself, argues for an anti-racist framework in approaching theatre production and training: “It’s a problem to not say racism. You have to turn to face racism. Lacking clarity around an anti-racist policy allows it to persist, despite your intention” (Brewer). Because dialect and voice training has not addressed its own explicitly racist and white supremacist foundations, this profession continues to reproduce and promote harmful practices, and I have a difficult time with my individual activism through one-on-one training with actors in project-based approaches to voice.
Dismantling biases using a discipline that has been explicitly built upon standard language, which is driven by white-dominated structures, does not necessarily inspire confidence in the success of the venture. After all, as Audre Lorde has said, “the master’s tools will never dismantle the master’s house” and therefore this enterprise may be doomed from the start (27). Lorde, however, qualifies this statement, “They may allow us temporarily to beat him at his own game, but they will never enable us to bring about genuine change. And this fact is only threatening to those women who still define the master’s house as their only source of support” (27). I must wrestle with the perceived threat of the master’s house (the institution of theatre in which I have been trained) as an ineffective at best and violent at worst approach to performance and theatrical creation. The methods by which these metaphorical tools in voice training have been employed uphold the structures of white supremacy and linguistic racism. By conceiving of these tools as just tools, we can begin to conceive of voice training in a way that can combat these structures. We must walk the knife’s edge of using voice training techniques and materials established as part of this structure and acknowledge their histories, while allowing room to do the work of linguistic transformation as production calls for it.

I must examine the privilege of my position: the role of dialect coach comes with an assumed institutional authority over what speech is deemed acceptable to appear on the stage. As discussed in the second thread, this reverence for authority reflects the standard language practices that have been in place since early elocutionists at the turn of the twentieth century (Knight, 2000). In these past years, I have come to grow uncomfortable with replicating this societal structure while still serving as a source of confidence building and joy for actors.
Thus, my own deepest desire is to create a system that empowers actors to become their own linguistic experts; I wish to become a kind of “guide on the side” that helps actors understand that the language they are already producing contains years of embodied knowledge of environment, and that knowledge is a powerful lens through which they may enter the lived linguistic experiences of others.

Yet another struggle that I come up against is that dialect coaching is usually project-based, and thus is part of a larger production team. A dialect coach is given an accent in a particular production and the directive to make an actor sound clear and accented in the performance. It is the coach’s responsibility to interpret “clarity” from many different production team members (most importantly the director) and balance the respect for the lived linguistic lives of actors on stage. Thus, a large portion of my approach is determining the best ways to deal with that balance and offer tools for producers and directors guided by the research in this dissertation. This type of work challenges the preconceived notion of accent as character trait by asking every participant in production, from the actor to the producer, about the role of trained accent or dialect in performance onstage. After that question is addressed, I offer other steps in training and production that continue to treat accent and dialect not as a superfluous accessory to be added to an actor’s repertoire, but as a deeply embodied part of the dramaturgical and practical work of inhabiting a character. The practice of using accent and dialect will remain a part of performance and entertainment; as a white theatre maker, I offer what equates to a harm-reduction approach: portraying accent in a way that challenges the prevailing attitudes about who gets to sound like whom.
Often, when presenting work outside of specific dialects and in voice training in general, my goal is not only to introduce the new dialect, but again to empower participants to fully accept their own embodied voice, by exploring a brief text that is meaningful to them outside of the project at hand. This may be reminiscent of approaches by other voice practitioners, such as Kristin Linklater’s Freeing the Natural Voice. However, in my work, I always introduce to participants the explicit societal structures that reinforce standard language biases, which grounds this practice in the real-world pressures we all face as language users. As discussed in the introduction, other practitioners, most notably Patsy Rodenburg in her Right to Speak, also address these issues implicitly, but my argument is that work of this nature must begin with this framework to even attempt a respectful representation of voices onstage. Since the publication of her book, Rodenburg has acknowledged that linguistic discrimination is its own form of violence towards marginalized actors and has since adjusted how she approaches Received Pronunciation in her own classroom.

    What I believe is that if you teach an accent that has painful historic resonances you must teach that accent with grace and sensitivity. You must also understand that the student has a right not to master or even speak that accent without the fear of failing a course. Of course, not speaking certain accents will affect an actor’s potential casting—Received Pronunciation is still very important for British actors’ careers—and that fact has to be very clearly communicated to the student. Most of my students, who have emotional problems with Received Pronunciation, when given the above option and having their pain honored, do learn and own Received Pronunciation. (qtd. in Espinosa 82)

Actors and students of voice deserve this type of sensitivity to individual experiences with sociolinguistic or raciolinguistic discrimination at all stages of their training and professional lives. At the heart of the work, the actor’s autonomy over their own instrument must be of paramount importance.

These structural patterns of white-dominated structures in theatre still persist, yet there are examples of contemporary practitioners who are resisting the dominant ideas and attitudes of how voice training has been established. Teachers who come from marginalized communities themselves are fighting for this type of freedom to acknowledge and work with students in this fashion. However, voice teachers are only one part of theatrical and film production, and their roles can be limited. For example, when Micha Espinosa worked with one well-meaning director, the director suggested,

    To record the one African-American student I was working with at the beginning of the semester, and he expected that by the end of the semester this student would speak in a Standard American dialect. When I rejected his advice, the director felt that I was not teaching this student the skills he needed. He had no knowledge of the emotional carnage that following his advice would have inflicted (81).

This type of behavior is driven by the raciolinguistic device of the idealized white listener, and it is obvious that standard language ideology can and does carry through other production roles in professional settings as well as higher education. To ethically practice voice and dialect, voice trainers ought to have not only the responsibility of saying “no” to other production team members, but the authority to do so as well. It is my ethical responsibility as a white theatre maker to ensure that this fight for equity and equal linguistic representation does not fall upon the shoulders of my marginalized colleagues; equity and cultural competency are everyone’s responsibility.
Another challenge that the voice profession will face in the coming years is the question of licensing and qualification, especially given the growing trend of voice coaches advertising their services on the internet. VASTA, the professional organization for voice trainers, is not a licensing body or board that regulates the industry or the proliferation of online dialect and accent coaches. Licensing does, however, often appear in the form of certifications from individual voice trainers and their schools. For example, the Linklater Center in New York City and Orkney, Scotland, offers certification to students in higher education or to participants who can pay the high fees associated with this official voice training program. This tension over who has access to, or the right to, study their voice lies at the heart of my own criticism of the material circumstances that have arisen around this industry. This model may be changing, however. The COVID-19 pandemic has forced established voice institutions to inhabit digital space in an unprecedented way. This increases access to teaching and classes, but presents issues for work that often requires intimate in-person contact. The internet has facilitated both a rise in access to this type of work and an influx of digital influencer-styled voice and dialect coaches. The larger question that remains is how to balance the gate-keeping privilege of VASTA with the unregulated generation of online voice and dialect coaching that borrows its business model from influencer-style online promotion. Inherent in both models of access to this discipline is the continued prevalence of implicit and explicit biases that contribute to the enforcement of negative linguistic stereotypes seen in entertainment. The desire to quantify qualifications into easy-to-read but hard-to-attain lines in one’s curriculum vitae or résumé has existed for a long time in theatre production at large.
One solution can be drawn from the emerging field of intimacy coordination, where leading intimacy coordinators have cautioned against the idea of a certification system. Chelsea Pace writes in a newsletter from June 2021, “The existence of ‘certification’ leverages systems of power that promote inequality, exclusion, and the dynamics of deeply problematic master-teacher models to capitalize, financially or otherwise, on gatekeeping access to knowledge and opportunity” (2). Access to certification also disproportionately affects Black, Indigenous, and People of Color who do not have the same resources and whose existing knowledge and skill sets can be overlooked. This issue of certification and access will continue to be a problem for the future of this profession, and how it is handled will necessarily reflect how willing this field is to change in order to correct historic harms and exclusions. Without a reimagining of the very structure of this profession, no progress can be made towards the ethical responsibility of harm reduction and inclusion. At the heart of this work in ethical approaches to voice training is the idea that every person communicates differently due to their lived experiences, and every human deserves dignity and respect for how they sound. Everyone’s voice

…comes from where we come from, but then every single one of us gets influenced in ways that are both conscious and unconscious through our entire life: who we dated, what we liked to watch when we were younger, a formative iconic figure for us during the era that we were growing up, what our age is, who we wanted to hang out with, where else we’ve lived in the world. There is a conscious and unconscious way in which our voice tells a story of who we are (Bay, qtd. in Feller).
The future of the voice profession lies in navigating a path that honors every language user’s unique experience, recognizes the power and rich resources of that unique experience, and makes room for serious and respectful linguistic play. Part of examining socially constructed values is to engage critically with audience expectations and to analyze, through empirical inquiry, how these social constructions arise. By recognizing the harms of historic voice practices that erased individual experiences and endeavored to replace these experiences with a stereotypical depiction, we can begin to correct and mitigate harm by recognizing voice practice as a tool through which we can ascribe our own social meaning in pursuit of an ethical practice of voice.

APPENDIX

STIMULI MATERIALS USED

Hearing In Noise Test (HINT) sentences (Nilsson, Soli and Sullivan) used as stimuli in Experiments 1 and 2 from Chapter 3.

HINT 1

1. A boy fell from the window.
2. The wife helped her husband.
3. Big dogs can be dangerous.
4. The shoes were very dirty.
5. The player lost a shoe.
6. Somebody stole the money.
7. The fire was very hot.
8. She's drinking from her own cup.
9. The picture came from a book.
10. The car is going too fast.
11. The paint dripped on the ground.
12. The towel fell on the floor.
13. The family likes fish.
14. The bananas are too ripe.
15. He grew lots of vegetables.
16. She argues with her sister.
17. The kitchen window was clean.
18. He hung up his raincoat.
19. The mailman brought a letter.
20. The mother heard the baby.
21. She found her purse in the trash.
22. The table has three legs.
23. The children waved at the train.
24. Her coat is on a chair.
25. The girl is fixing her dress.
26. It's time to go to bed.
27. Mother read the instructions.
28. The dog is eating some meat.
29. Father forgot the bread.
30. The road goes up a hill.
31. The painter uses a brush.
32. The family bought a house.
33. Swimmers can hold their breath.
34. She cut the steak with her knife.
35. They're pushing an old car.
36. The food is expensive.
37. The children are walking home.
38. They had two empty bottles.
39. Milk comes in a carton.
40. The dog sleeps in a basket.
41. The house had nine bedrooms.
42. They're shopping for school clothes.
43. They're playing in the park.
44. Rain is good for trees.
45. They sat on a wooden bench.
46. The child drank some fresh milk.
47. The baby slept all night.
48. The salt shaker is empty.
49. The policeman knows the way.
50. The buckets fill up quickly.
51. The boy is running away.
52. A towel is near the sink.
53. Flowers can grow in the pot.
54. He's skating with his friend.
55. The janitor swept the floor.
56. The lady washed the shirt.
57. She took off her fur coat.
58. The match boxes are empty.
59. The man is painting a sign.
60. The dog came home at last.

HINT 2

1. They heard a funny noise.
2. They found his brother hiding.
3. The dog played with a stick.
4. The book tells a story.
5. The matches are on a shelf.
6. The milk was by the front door.
7. The broom was in the corner.
8. The new road is on the map.
9. She lost her credit card.
10. The team is playing well.
11. The boy did a handstand.
12. They took some food outside.
13. The young people are dancing.
14. They waited for an hour.
15. The shirts are in the closet.
16. They watched the scary movie.
17. The milk is in a pitcher.
18. The truck drove up the road.
19. The tall man tied his shoes.
20. A letter fell on the floor.
21. The ball bounced very high.
22. Mother cut the birthday cake.
23. The football game is over.
24. She stood near the window.
25. The kitchen clock was wrong.
26. The children helped their teacher.
27. They carried some shopping bags.
28. Someone is crossing the road.
29. She uses her spoon to eat.
30. The cat lay on the bed.
31. They're running past the house.
32. He's washing his face with soap.
33. The dog is chasing the cat.
34. The milkman drives a small truck.
35. The bus leaves before the train.
36. The baby has blue eyes.
37. The bag fell off the shelf.
38. They are coming for dinner.
39. They wanted some potatoes.
40. They knocked on the window.
41. School got out early today.
42. The football hit the goalpost.
43. The boy ran away from school.
44. Sugar is very sweet.
45. The two children are laughing.
46. The firetruck is coming.
47. Mother got a sauce pan.
48. The baby wants his bottle.
49. The ball broke the window.
50. There was a bad train wreck.
51. The waiter brought the cream.
52. The teapot is very hot.
53. The apple pie is good.
54. The jelly jar was full.
55. The girl is washing her hair.
56. The girl played with the baby.
57. The cow is milked every day.
58. They called an ambulance.
59. They are drinking coffee.
60. He climbed up the ladder.

REFERENCES CITED

“About.” Joy of Phonetics, n.d. https://www.joyofphonetics.com
“About Phonetic Pillows.” Joy of Phonetics, n.d. https://www.joyofphonetics.com/phonetic-pillows-2/phonetic-pillows/.
“About the Work.” Knight-Thompson Speechwork, n.d. https://ktspeechwork.org/about-the-work/
Agheyisi, Rebecca, and Joshua A. Fishman. “Language Attitude Studies: A Brief Survey of Methodological Approaches.” Anthropological Linguistics, vol. 12, no. 5, 1970, pp. 137–57.
Alim, H. Samy, and Geneva Smitherman. Articulate while Black: Barack Obama, language, and race in the US. Oxford University Press, 2012.
Anzaldúa, Gloria. “How To Tame a Wild Tongue.” Borderlands / La Frontera: The New Mestiza, Aunt Lute Books, 1999.
Austin, John Langshaw. How to do things with words. Oxford University Press, 1975.
Bakanic, Von. Prejudice: Attitudes about race, class, and gender. Prentice Hall, 2009.
Bartoskova, Michaela. “The Role of the Psoas Major Muscle in Speaking and Singing.” Voice and Speech Review, vol. 15, no. 2, Mar. 2021, pp. 1–11.
Bell, Brian, and Sam Hunter. “How U.S.
States Could Fund Repertory Resident Theatres.” AMERICAN THEATRE, 21 June 2021, www.americantheatre.org/2021/06/21/how-u-s-states-could-fund-repertory-resident-theatres/.
Benedetti, Jean. Stanislavski: An introduction. Taylor & Francis, 2004.
Bennett, Susan. Theatre Audiences: A theory of production and reception. Psychology Press, 1997.
Berkely, Anne. “Changing Views of Knowledge and the Struggle for Undergraduate Theatre Curriculum, 1900-1980.” Fliotsos and Medford. 7-30.
Blumenfeld, Robert. Accents: A manual for actors, revised and expanded edition. Limelight Editions, 2002.
Borrie, Stephanie A., Tyson S. Barrett, and Sarah E. Yoho. "Autoscore: An open-source automated tool for scoring listener perception of speech." The Journal of the Acoustical Society of America 145.1 (2019): 392-399.
Bradlow, Ann R., and Tessa Bent. "Perceptual adaptation to non-native speech." Cognition 106.2 (2008): 707-729.
Bradlow, Ann R., Midam Kim, and Michael Blasingame. "Language-independent talker-specificity in first-language and second-language speech production by bilingual talkers: L1 speaking rate predicts L2 speaking rate." The Journal of the Acoustical Society of America 141.2 (2017): 886-899.
Brewer, Nicole. “Training With a Difference.” AMERICAN THEATRE, 4 Jan. 2018, www.americantheatre.org/2018/01/04/training-with-a-difference/.
Brown, Bruce L., Howard Giles, and Jitendra N. Thakerar. "Speaker evaluations as a function of speech rate, accent and context." Language & Communication (1985).
Brown, Stan. "Column the cultural voice." Voice and Speech Review 1.1 (2000): 17-18.
Brunstetter, Bekah. The Cake. Samuel French, 2019.
Calta, Louis. "26 Stage Troupes Form League to Bargain with Actors Equity." New York Times, 4 April 1966, pg. 26.
Catford, John Cunnison. A practical introduction to phonetics. Oxford: Clarendon Press, 1988.
Cho, Julia. The Language Archive. Dramatists Play Service, 2012.
Colaianni, Louis. The Joy of Phonetics and Accents. Joy Press, 1995.
Cook, Amy. Building Character: The art and science of casting. University of Michigan Press, 2018.
Cotter, Kelley. "Playing the visibility game: How digital influencers and algorithms negotiate influence on Instagram." New Media & Society 21.4 (2019): 895-913.
Dal Vera, Rocco, and Voice and Speech Trainers Association. Standard Speech: Essays on voice and speech. Applause Books, 2000.
Derwing, Tracey M., and Murray J. Munro. “ACCENT, INTELLIGIBILITY, AND COMPREHENSIBILITY: Evidence from Four L1s.” Studies in Second Language Acquisition, vol. 19, no. 1, Mar. 1997, pp. 1–16.
DeWitt, Marguerite E. Euphon English in America. EP Dutton & Company, 1924.
Diderot, Denis. The Paradox of Acting Translated with Annotations from Diderot’s ‘Paradoxe sur le Comedien’. Trans. by W. H. Pollock. London: Strangeway and Sons, 1883. Openlibrary. Web. 2 April 2020.
Disclosure: Trans Lives on Screen. Directed by Sam Feder, Netflix Movies, 2020.
Dissanayake, Ellen. Homo Aestheticus: Where art comes from and why. University of Washington Press, 2001.
“Dr Geoff Lindsey • speech coach.” YouTube Channel, n.d. https://www.youtube.com/user/englishspeechservice
El Guindi, Youssef. “Playscript: Pilgrims Musa and Sheri in the New World.” American Theatre, September 2012: 63-80.
Elliott, Nancy C. "Peer-reviewed Article Rhoticity in the Accents of American Film Actors: A Sociolinguistic Study." Voice and Speech Review 1.1 (2000): 103-130.
Espinosa, Micha. "A Call to Action: Embracing the Cultural Voice or Taming the Wild Tongue." Voice and Speech Review 7.1 (2011): 75-86.
“Find a Voice Pro.” VASTA. n.d. https://www.vasta.org/content.aspx?page_id=154&club_id=516524
Ferrand, Carole T. Speech Science: An integrated approach to theory and clinical practice. Pearson Education, 2013.
Flege, James E. "Second language speech learning: Theory, findings, and problems." Speech Perception and Linguistic Experience: Issues in cross-language research 92 (1995): 233-277.
Flege, James E., and Ocke-Schwen Bohn. "The Revised Speech Learning Model." Unpublished Preprint, 20 August 2020. https://www.researchgate.net/publication/342923241_The_revised_Speech_Learning_Model
Flege, James Emil, et al. “Factors Affecting Strength of Perceived Foreign Accent in a Second Language.” The Journal of the Acoustical Society of America, vol. 97, no. 5, May 1995, pp. 3125–34.
Freeth, Becky. “Chernobyl Creators Explain Why the Characters Don't Have Russian Accents.” Metro, Metro.co.uk, 7 June 2019, metro.co.uk/2019/06/07/chernobyl-cast-asked-not-put-russian-accents-make-emotion-authentic-9854176/?ito=cbshare.
Frieda, Elaina M., et al. "Adults’ perception and production of the English vowel /i/." Journal of Speech, Language, and Hearing Research 43.1 (2000): 129-143.
Frumkin, Lara. "Influences of accent and ethnic background on perceptions of eyewitness testimony." Psychology, Crime & Law 13.3 (2007): 317-331.
Fuchs, Elinor. "EF's visit to a small planet: some questions to ask a play." Theater 34.2 (2004): 4-9.
Giles, Howard, and Nikolas Coupland. Language: Contexts and consequences. Thomson Brooks/Cole Publishing Co, 1991.
Gluszek, Agata, and John F. Dovidio. “The Way They Speak: A Social Psychological Perspective on the Stigma of Nonnative Accents in Communication.” Personality and Social Psychology Review, vol. 14, no. 2, May 2010, pp. 214–37. SAGE Journals.
Goldstein, Thalia. “Questions of Realness.” The Junkyard, The Junkyard, 6 Apr. 2018, junkyardofthemind.com/blog/2017/8/14/questions-of-realness.
Graham, David A. “A Short History of Whether Obama Is Black Enough, Featuring Rupert Murdoch.” The Atlantic, Atlantic Media Company, 8 Oct. 2015, www.theatlantic.com/politics/archive/2015/10/a-short-history-of-whether-obama-is-black-enough-featuring-rupert-murdoch/409642/.
Grover, Purva, et al. "Perceived usefulness, ease of use and user acceptance of blockchain technology for digital transactions–insights from user-generated content on Twitter." Enterprise Information Systems 13.6 (2019): 771-800.
Hammond, David. "Peer-reviewed Article Sidebar ‘Good Speech in Classic Plays’: The Historical Perspective." Voice and Speech Review 1.1 (2000): 143-147.
Hampton, Marian. "Editorial Column The Golden Rule for Standard Speech." Voice and Speech Review 1.1 (2000): 13-16.
Hawkins, Eric W. "Foreign language study and language awareness." Language awareness 8.3-4 (1999): 124-142.
Hay, Jennifer, and Katie Drager. "Stuffed toys and speech perception." Linguistics 48.4 (2010): 865-892.
Herman, Lewis, and Marguerite Shalett Herman. American Dialects: A Manual for Actors, Directors, and Writers. Routledge, 1947.
Herman, Lewis, and Marguerite Shalett Herman. Foreign Dialects: A Manual for Actors, Directors, and Writers. Routledge, 1997.
Herrera, Brian Eugenio. "'But Do We Have the Actors for That?': Some Principles of Practice for Staging Latinx Plays in a University Theatre Context." Theatre Topics 27.1 (2017): 23-35.
Hobbs, Robert L. Teach Yourself Transatlantic: Theatre speech for actors. Mayfield Publishing Company, 1986.
Hodge, Robert, and Gunther Kress. "Social semiotics, style and ideology." Sociolinguistics. Palgrave, London, 1997. 49-54.
Hodge, Robert, et al. Social semiotics. Cornell University Press, 1988.
Hope, Donna. American English Pronunciation: It's No Good Unless You're Understood. Cold Wind Press, 2006.
Hu, Guiling, and Stephanie Lindemann. "Stereotypes of Cantonese English, apparent native/non-native status, and their effect on non-native English speakers’ perception." Journal of multilingual and multicultural development 30.3 (2009): 253-269.
Huspek, Michael R. "Linguistic variation, context, and meaning: A case of -ing/in' variation in North American workers' speech." Language in Society 15.2 (1986): 149-163.
“Immigrants in the Progressive Era: Progressive Era to New Era, 1900-1929.” The Library of Congress, n.d. www.loc.gov/classroom-materials/united-states-history-primary-source-timeline/progressive-era-to-new-era-1900-1929/immigrants-in-progressive-era/.
Johnson, Mark. "The meaning of the body." Developmental perspectives on embodiment and consciousness. Psychology Press, 2007. 35-60.
Kang, Okim, and Donald L. Rubin. "Reverse linguistic stereotyping: Measuring the effect of listener expectations on speech evaluation." Journal of Language and Social Psychology 28.4 (2009): 441-456.
Kato, Misaki, and Melissa Michaud Baese-Berk. "The effect of input prompts on the relationship between perception and production of non-native sounds." Journal of Phonetics 79 (2020): 1-20.
Kendi, Ibram X. Stamped from the Beginning: The definitive history of racist ideas in America. Hachette UK, 2016.
Knight, Dudley. “Peer Reviewed Article: Standards.” Voice and Speech Review 1.1 (2000): 61-88.
Knight, Dudley. Speaking with Skill: A Skills Based Approach to Speech Training. Methuen Drama, 2012.
Knight, Dudley. Speaking with skill: An introduction to Knight-Thompson speech work. A&C Black, 2013.
Knight, Dudley. “Reprint Standard Speech: The Ongoing Debate.” Voice and Speech Review, vol. 1, no. 1, Jan. 2000, pp. 31–54.
Knight, Dudley, and Philip Thompson. “About the Work.” Ktspeechwork.org, ktspeechwork.org/about-the-work/. Accessed 19 Dec. 2020.
Knowles, Richard Paul. “SHAKESPEARE, VOICE, AND IDEOLOGY: Interrogating the Natural Voice.” SHAKESPEARE, THEORY, AND PERFORMANCE, edited by James C Bulman, Taylor & Francis, 1996, pp. 95-116.
Kuhl, Patricia K., et al. "Cross-language analysis of phonetic units in language addressed to infants." Science 277.5326 (1997): 684-686.
Lacko, Ivan. "Imaginative Communities: The Role, Practice and Outreach of Community-Based Theatre." Ars Aeterna 6.2 (2014): 21-27.
Lakoff, George, and Mark Johnson. Philosophy in the Flesh: The cognitive unconscious and the embodied mind: How the embodied mind creates philosophy. Basic Books, 1999.
Lambert, Wallace E., et al. "Evaluational reactions to spoken languages." The Journal of Abnormal and Social Psychology 60.1 (1960): 44.
LaMonica, C. “Factors in an acoustical-attitudinal account of dialect perception.” New Ways of Analyzing Variation 48. Poster (2019).
“Language Opens Worlds.” Klingon Language Institute, www.kli.org/.
Liberman, Alvin M., and Ignatius G. Mattingly. "The motor theory of speech perception revised." Cognition 21.1 (1985): 1-36.
Lindemann, Stephanie. "Who speaks “broken English”? US undergraduates’ perceptions of non‐native English." International Journal of Applied Linguistics 15.2 (2005): 187-212.
Lindsay-Abaire, David. Good People. Theatre Communications Group, 2011.
“Linguistix Pronunciation.” YouTube Channel, n.d. https://www.youtube.com/channel/UC3wikulG1obZp9k1sH2cH3Q
Linklater, Kristin, Phyllis Epp, and William Snow. Freeing the Natural Voice. Drama Book Publishers, 1976.
Linklater, Kristin. Freeing Shakespeare's Voice: The actor's guide to talking the text. Theatre Communications Group, 1992.
Lippi-Green, Rosina. English with an Accent: Language, ideology and discrimination in the United States. Routledge, 2012.
Lippi-Green, Rosina. “The Standard Language Myth.” Voice and Speech Review 1.1 (2000): 23-30.
Lopez, Robert, et al. Avenue Q, the Musical: The complete book and lyrics of the Broadway musical. Applause Theatre & Cinema Books, 2003.
Lorde, Audre. "The master’s tools will never dismantle the master’s house." Feminist Postcolonial Theory: A reader 25 (2003): 25-40.
McClelland, James L., and Jeffrey L. Elman. "The TRACE model of speech perception." Cognitive Psychology 18.1 (1986): 1-86.
McConachie, Bruce. Engaging Audiences: A cognitive approach to spectating in the theatre. Springer, 2008.
McConachie, Bruce. Evolution, Cognition, and Performance. Cambridge University Press, 2015.
McConachie, Bruce. "Metaphors we act by: Kinesthetics, cognitive psychology, and historical structures." Journal of Dramatic Theory and Criticism (1993): 25-46.
McGowan, Kevin B. "Social expectation improves speech perception in noise." Language and Speech 58.4 (2015): 502-521.
McIntosh, Peggy. White Privilege: Unpacking the invisible knapsack. January 1990, convention.myacpa.org/houston2018/wp-content/uploads/2017/11/UnpackingTheKnapsack.pdf.
Michel, Alexandra. “Cognition and Perception: Is There Really a Distinction?” Association for Psychological Science - APS, 29 Jan. 2020, www.psychologicalscience.org/observer/cognition-and-perception-is-there-really-a-distinction.
Modiano, Marko. "Euro-English from a ‘deficit linguistics’ perspective." World Englishes 26.4 (2007): 525-533.
Mollin, Sandra. "New variety or learner English?: Criteria for variety status and the case of Euro-English." English World-Wide 28.2 (2007): 167-185.
Moore, Adrianne. “The History of the Voice and Speech Trainers Association (VASTA).” Voice and Speech Review, vol. 13, no. 1, Jan. 2019, pp. 97–105.
Moore, Sonia. The Stanislavski System: The professional training of an actor. Penguin, 1984.
Moyer, Alene. Foreign Accent: The phenomenon of non-native speech. Cambridge University Press, 2013.
Mudd, Derek. Staging the Voice: Towards a critical vocal performance pedagogy. 2014. Louisiana State University, PhD
Munro, Murray J., and Tracey M. Derwing. “Foreign Accent, Comprehensibility, and Intelligibility in the Speech of Second Language Learners.” Language Learning, vol. 49, Jan. 1999, pp. 285–310.
Neuhauser, Sara, and Adrian P. Simpson. "Imitated or authentic? Listeners’ judgements of foreign accents." Proceedings of the 16th international congress of phonetic sciences. 2007. 1805-1808.
Niedzielski, Nancy. "The effect of social information on the perception of sociolinguistic variables." Journal of language and social psychology 18.1 (1999): 62-85.
Nilsson, Michael, Sigfrid D. Soli, and Jean A. Sullivan. "Development of the Hearing in Noise Test for the measurement of speech reception thresholds in quiet and in noise." The Journal of the Acoustical Society of America 95.2 (1994): 1085-1099.
Nittrouer, Susan. "Discriminability and perceptual weighting of some acoustic cues to speech perception by 3-year-olds." Journal of Speech, Language, and Hearing Research 39.2 (1996): 278-297.
Nolan, Ian T., et al. "The role of voice therapy and phonosurgery in transgender vocal feminization." Journal of Craniofacial Surgery 30.5 (2019): 1368-1375.
Oram, Daron. "De-Colonizing Listening: Toward an Equitable Approach to Speech Training for the Actor." Voice and Speech Review 13.3 (2019): 279-297.
Pace, Chelsea. “The Certification Question.” The Journal for Consent-Based Performance, 1 June 2021, www.journalcbp.com/the-certification-question.
Pace, Chelsea, and Laura Rikard. Staging Sex: Best Practices, Tools, and Techniques for Theatrical Intimacy. Routledge, 2020.
Pao, Angela Chia-yi. "False accents: Embodied dialects and the characterization of ethnicity and nationality." Theatre Topics 14.1 (2004): 353-372.
Preston, Dennis R. "Whaddayaknow now." Awareness and Control in Sociolinguistic Research (2016): 177-199.
Ramjattan, Vijay. “language practices are racialized, and language practices racialize as well.” Twitter. 29 June 2021.
Raphael, Bonnie N. "Peer-reviewed Article Dancing on Shifting Ground." Voice and Speech Review 1.1 (2000): 165-170.
Reinares-Lara, Eva, Josefa D. Martín-Santana, and Clara Muela-Molina. "The effects of accent, differentiation, and stigmatization on spokesperson credibility in radio advertising." Journal of Global Marketing 29.1 (2016): 15-28.
Roach, Joseph. The Player's Passion: Studies in the Science of Acting. 1985. Ann Arbor: U of Michigan P, 1993.
Robbins, Sanford. "Essay Edith Warman Skinner: A Former Student's Recollection and Appreciation." Voice and Speech Review 1.1 (2000): 55-60.
Rodenburg, Patsy. The Right to Speak: Working with the voice. Bloomsbury Publishing, 1993.
Royde-Smith, John Graham. “World War II.” Encyclopædia Britannica, Encyclopædia Britannica, Inc., 15 May 2021, www.britannica.com/event/World-War-II.
Rubin, Donald L. "Nonlanguage factors affecting undergraduates' judgments of nonnative English-speaking teaching assistants." Research in Higher education 33.4 (1992): 511-531.
Rubin, Donald L., and Kim A. Smith. "Effects of accent, ethnicity, and lecture topic on undergraduates' perceptions of nonnative English-speaking teaching assistants." International journal of intercultural relations 14.3 (1990): 337-353.
Sabia, Joe. “Movie Accent Expert Breaks Down 32 Actors' Accents.” Wired YouTube Channel, n.d. https://www.youtube.com/watch?v=NvDvESEXcgE
Sabia, Joe. “Movie Accent Expert Breaks down 31 Actors Playing Real People.” Wired YouTube Channel, n.d. https://www.youtube.com/watch?v=lZSCGZphjq0
Sakland, Nancy. Voice and Speech Training in the New Millennium: Conversations with Master Teachers. Applause Theatre & Cinema Books, 2011.
Sakland, Nancy. “Robert Barton.” Voice and speech training in the new millennium: Conversations with master teachers. Applause Theatre & Cinema Books, 2011, pp. 29-38.
Sansom, Rockford. "The unspoken voice and speech debate [or] the sacred cow in the conservatory." Voice and Speech Review 10.2-3 (2016): 157-168.
Sedgman, Kirsty. "Challenges of cultural industry knowledge exchange in live performance audience research." Cultural Trends 28.2-3 (2019): 103-117.
Siegel, Jeff. Second Dialect Acquisition. Cambridge University Press, 2010.
Silverstein, Michael. "Indexical order and the dialectics of sociolinguistic life." Language & Communication 23.3-4 (2003): 193-229.
Silverstein, Michael. "The Limits of Awareness. Sociolinguistic Working Paper Number 84." (1981).
Singer, Reid. “How Should Black People Sound?” The New York Times, The New York Times, 28 Oct. 2020, www.nytimes.com/2020/10/28/style/hollywood-accent-coaches.html.
Skinner, Edith, et al. Speak with Distinction. Applause Theatre Book Publishers, 1990.
Smiljanić, Rajka, and Ann R. Bradlow. "Bidirectional clear speech perception benefit for native and high-proficiency non-native talkers and listeners: Intelligibility and accentedness." The Journal of the Acoustical Society of America 130.6 (2011): 4020-4031.
Staff, ABQJournal News. “Student Production Explores the Connection between Linguistics & Love.” Albuquerque Journal, 25 Oct. 2013, www.abqjournal.com/287877/albuquerque-theater-4.html.
Statler, Matt, Loizos Heracleous, and Claus D. Jacobs. "Serious play as a practice of paradox." The Journal of Applied Behavioral Science 47.2 (2011): 236-256.
Stevens, Kenneth N., and Sheila E. Blumstein. "Invariant cues for place of articulation in stop consonants." The Journal of the Acoustical Society of America 64.5 (1978): 1358-1368.
Stoller, Amy, et al. "Speech stereotypes: good vs. evil." Voice and Speech Review 8.1 (2014): 78-92.
Sullivan, Gail. “John Cho of 'Selfie': 'I Experienced Racism'.” The Washington Post, WP Company, 2 May 2019, www.washingtonpost.com/news/morning-mix/wp/2014/10/09/john-cho-of-selfie-wants-roles-outside-any-asian-stereotype-2/.
Sumner, Meghan, et al. "The socially weighted encoding of spoken words: A dual-route approach to speech perception." Frontiers in Psychology 4 (2014): 1015.
“Tim Monich.” Imdb, n.d. https://www.imdb.com/name/nm0598106/.
Tonning-Kollwitz, Melissa, and Joe Hetterly. "The Current Use of Standard Dialects in Speech Practice and Pedagogy: A Mixed Method Study Examining the VASTA Community in the United States." Voice and Speech Review 12.3 (2018): 295-315.
Tonning-Kollwitz, Melissa, Joe Hetterly, and Ellen Kress. "The Current Use of Standard Dialects in the United States Theatre Industry." Voice and Speech Review (2021): 1-15.
Trudgill, Peter, and Jean Hannah. International English: A guide to the varieties of standard English. Routledge, 2013.
Ulin, David L. “'Whitey Bulger' Digs Deep into a Gangster's Tale.” Los Angeles Times, Los Angeles Times, 1 Mar. 2013, www.latimes.com/books/la-xpm-2013-mar-01-la-ca-jc-whitey-bulger-20130303-story.html.
Wester, Mirjam, and Cassie Mayo. “Accent Rating by Native and Non-Native Listeners.” 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2014.
We See You W.A.T., n.d., www.weseeyouwat.com/.
Wilkinson, Alec. “Talk This Way.” The New Yorker, 11 Nov. 2009, www.newyorker.com/magazine/2009/11/09/talk-this-way.
“Who Is Kristin Linklater.” Linklater Voice, n.d. www.linklatervoice.com/linklater-voice/who-is-kristin-linklater.
Zazzali, Peter. Acting in the Academy: The history of professional actor training in US higher education. Routledge, 2016.