«The expression of stance in Mandarin Chinese: A corpus-based study of stance adverbs Haiyang Ai 304 Sparks Building, University Park, PA 16802, ...»
International Journal of Asian Language Processing 22 (1): 1-14
The expression of stance in Mandarin Chinese:
A corpus-based study of stance adverbs
304 Sparks Building, University Park, PA 16802, United States
Department of Applied Linguistics, The Pennsylvania State University
Stance-taking is considered one of the fundamental properties of human communication (Jaffe, 2009). It is pervasive, intersubjective, and collaborative. While a good deal of research has investigated the expression of stance in English, much less has been done on Chinese. In this study, we draw upon the five-million-word Academia Sinica Balanced Corpus of Modern Chinese to investigate a comprehensive set of 34 stance adverbs expressing certainty, likelihood, attitude, and style (Biber, 2006b) across various modes of communication, diverse genres, and media channels. We also zero in on a pair of synonymous stance adverbs, dique and queshi, to illustrate subtle functional and distributional differences across genres and registers. Implications for Chinese dictionary compilation, pedagogy, and sentiment analysis are discussed.
Keywords: Stance adverbs, Mandarin Chinese, corpus-based, register variation, near synonym
1 Introduction
Taking a stance towards the content of our utterances or the propositions of our interlocutors is considered one of the fundamental properties of human communication (Jaffe, 2009). In fact, Jaffe (2009) argues that there is no such thing as a completely neutral position; even neutrality itself indicates a stance. Biber (2006b) suggests that personal stance reflects how certain we feel about the truthfulness of a proposition, how we obtain access to the information, and what perspective we are taking. Du Bois (2007) maintains that stance-taking helps us to assign values to objects of interest, position ourselves, calibrate alignment with our interlocutors, and evoke the presupposed system of sociocultural values and ideologies. As such, the understanding of stance is closely related to the understanding of social aspects of human conduct. Research has shown that the expression of stance carries such pragmatic functions as hedging (Hyland, 1996), mitigating (Fraser, 1980; Holmes, 1984), showing evidentiality (Chafe, 1986), indicating appraisal (Martin & White, 2005), and evaluating propositions (Hunston & Thompson, 2000).
Stance can be expressed by a range of lexico-grammatical features including grammatical devices, value-laden word choice, and paralinguistic devices (cf. Biber, et al. 1999, pp. 966-9
classification, Biber and colleagues (e.g. Biber, 2006a, 2006b; Biber and Conrad, 2000; Helt, 1997) have conducted a series of studies examining the expression of stance across different genres and registers. For instance, Biber and Conrad (2000) compared adverbial markings of stance in speech and writing. Biber (2006a, 2006b) examined the expression of stance in American university spoken and written registers. Helt (1997) analyzed stance adverbial variation in spoken American English.
While a large number of studies have examined stance adverbs in English, relatively few have examined stance-taking in Chinese, a language typologically distant from English. In this study, we draw upon the five-million-word Academia Sinica Balanced Corpus of Modern Chinese to systematically investigate a comprehensive set of stance adverbs across different communicative modes, diverse genres, and various media channels. In addition to this macro-level analysis, we also zero in on a pair of synonymous adverbs at a micro level in order to illustrate subtle differences across genres and registers. In section 2, I outline the literature on stance-taking in Mandarin Chinese and the benefits of the corpus-based approach. I then describe the corpus data and research questions. In section 4 I report and discuss the variation of stance adverbs, followed by the findings for a pair of synonymous stance adverbs. I conclude the article with a discussion of the implications for Chinese dictionary compilation, pedagogy, and sentiment analysis.
2 Stance-taking in Mandarin Chinese
In Mandarin Chinese, adverbs are used to express time, attitude, manner, frequency, or duration. They are often placed after the subject, after the topic if no subject is present, or at the beginning of the sentence (Li & Thompson, 1981). In (1), the epistemic adverb yiding “definitely” expresses certainty about the speaker's willingness to cooperate, and is placed immediately after the subject women “we”:
(1) mei wenti, women yiding quanli peihe
    NEG problem we definitely full-power cooperate
    “No problem. We will definitely fully cooperate (with you).”
Only a few studies have examined the expression of stance in Mandarin Chinese. Adopting the conversation-analytic framework, Wu (2004) examined the expression of stance through the use of the final particles a and ou in the unfolding development of talk-in-interaction. Wang, Tsai, and Yang (2009) examined two stance adverbs, qishi (‘actually’) and shishishang (‘in fact’), in spoken discourse. They found that these two adverbs have face-saving and intersubjectivity functions, and are used more frequently in situations where politeness is expected, such as TV or radio interviews. Hsieh (2009) investigated the use of stance adverbs in press reportage. She reported that journalists often make strategic choices of epistemic stance markers to achieve special power. Other studies have taken a contrastive perspective. Zhen (2008) followed Biber’s (2006b) classification of stance adverbs and examined epistemic (i.e. certainty and likelihood) stance adverbs in a Chinese-English parallel corpus. She found that although there were English translations for Chinese stance adverbs, the positions and linguistic forms were not always consistent. Long and Xu (2010) compared stance adverbs in Chinese EFL learners’ English and Chinese argumentative essays on a shared topic. They reported that learners’ use of stance in English correlates strongly with their use of stance in Chinese. Taken together, these studies have contributed to our understanding of the expression of stance in Mandarin Chinese by examining different linguistic devices from different theoretical and methodological perspectives.
However, a systematic analysis of a comprehensive set of stance adverbs in Mandarin Chinese based on large-scale corpus data has yet to be done, which is the focus of the present study.
3 The corpus-based approach
The use of authentic and attested corpus data is likely to introduce more rigor into theory testing than the use of introspective and isolated data (Channell, 2000). Norrick (2009, p. 865) posits that the use of electronic corpora gives pragmatic research “a broader, more secure basis”. A central feature of corpus-based studies is the focus on frequency and distribution (Gries, 2009). Within the field of corpus linguistics, it is well recognized that differences in frequency imply differences in function and use (cf. Firth, 1957; Gries, 2010). In other words, the distributional patterns of a linguistic item reveal its semantic and functional properties. The relationship between meaning and distribution has been captured nicely by Harris (1970):
[I]f we consider words or morphemes A and B to be more different in meaning than A and C, then we will often find that the distributions of A and B are more different than the distributions of A and C. In other words, difference of meaning correlates with difference of distribution. (Harris, 1970, p. 785; cited in Gries, 2010, p. 122)
The role of frequency has also been recognized in usage-based linguistics, cognitive linguistics, and construction grammar (cf. Leech, 2011). A general underlying assumption is that the more frequently a linguistic expression is used, the more likely it is to be entrenched in cognition. At the same time, it is generally agreed that such frequency patterns, and the register or genre variation associated with them, are difficult to identify on the basis of native speakers’ intuition alone, whereas they are relatively easy to identify with a corpus-based approach. While some studies (e.g. Long & Xu, 2010; Zhen, 2008) used corpus data in examining the expression of stance in Chinese, they focused more on comparing stance-taking from a contrastive or second language acquisition perspective. In this study, we draw upon large-scale corpus data to systematically investigate a comprehensive set of stance adverbs across different genres and registers.
4 The Sinica Corpus
The corpus data used in this study were retrieved from the five-million-word Academia Sinica Balanced Corpus of Modern Chinese (hereafter the Sinica Corpus). The Sinica Corpus was designed to be a representative corpus of modern Mandarin Chinese, containing texts from different communicative modes (written, spoken, written-to-be-read, written-to-be-spoken, spoken-to-be-written), genres (narration, argumentation, exposition, and description), and media (newspaper, academic journals, conversation or interview, etc.).
Table 1 shows its composition by media. All texts in the corpus are word segmented and part-of-speech tagged. This allows us to identify all the stance adverbs by grammatical category.
The stance adverbs examined in this study were selected from previous studies of stance adverbs, notably Biber (2006b) and Zhen (2008), and through consultation of a reference thesaurus (Mei, Zhu, Gao, & Yin, 1983). Three further considerations shaped the list. First, we took account of the limitations of the tool used in this study, i.e. the online version of the Sinica Corpus accessed via its web-based interface. While the web-based interface is a convenient tool for exploring the distributional information of the corpus data, it does not allow for more complicated phrasal or regular-expression searches. Hence it is not possible to search for zai…chengdu shang “to … degree or extent”. Second, we selected these stance adverbs with the intention of making comparisons with Biber’s (2006b) study, whose classification of stance adverbs is also followed here. Third, the list of stance adverbs could become overwhelmingly long if multi-word expressions and constructions were included. While it is certainly desirable to include as many stance adverbs as possible, each stance adverb requires dozens of searches to determine its frequency in different communicative modes, genres, and media channels, so a line has to be drawn at some point in order to keep the project manageable. In total, 34 stance adverbs were included in this study (see Appendix B). Frequency counts were all normalized to occurrences per million words, and mean values of normalized frequency were used for comparison.
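The normalization step described above can be illustrated with a minimal sketch. The function names, the adverb labels, the raw counts, and the sub-corpus size below are all hypothetical and chosen only for illustration; they are not figures from the Sinica Corpus, and the sketch is not the procedure actually used to obtain the results reported here.

```python
from typing import Mapping


def per_million(raw_count: int, corpus_size: int) -> float:
    """Normalize a raw frequency to occurrences per million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be a positive number of words")
    # Multiply before dividing so that round counts give exact results.
    return raw_count * 1_000_000 / corpus_size


def mean_normalized(raw_counts: Mapping[str, int], corpus_size: int) -> float:
    """Mean normalized frequency over a group of adverbs (e.g. one stance category)."""
    if not raw_counts:
        raise ValueError("raw_counts must contain at least one adverb")
    values = [per_million(count, corpus_size) for count in raw_counts.values()]
    return sum(values) / len(values)


# Hypothetical example: two adverbs counted in a 2,000,000-word sub-corpus.
subcorpus_size = 2_000_000
counts = {"adverb_a": 500, "adverb_b": 300}

print(per_million(counts["adverb_a"], subcorpus_size))  # 250.0
print(mean_normalized(counts, subcorpus_size))          # 200.0
```

Normalizing in this way makes frequencies comparable across sub-corpora of different sizes, which matters here because the communicative modes, genres, and media channels are unlikely to contribute equal amounts of text.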
5 Result and discussion
5.1 Register variation of stance adverbs
The overall distribution of stance adverbs in Mandarin Chinese shows considerable variation across the four categories: certainty, likelihood, attitude, and style adverbs occur 188.5, 158.8, 117.7, and 77.2 times per million words respectively. Epistemic stance adverbs (i.e. those indicating certainty and likelihood) are relatively more frequent in our corpus data, while style adverbs are less so. In addition, epistemic certainty adverbs occur more frequently than likelihood adverbs. These findings are consistent with Biber’s (2006a) study of English stance adverbs in university registers.
Considerable variation has been found in the distribution of stance adverbs across different genres. Specifically, stance adverbs are extensively used in narration, but seldom in description (see Table 2). This is not necessarily surprising, as the description of an object or an event tends to be objective and impartial, while the narration of an event tends to afford more opportunity for speakers to convey their opinions, judgments, or evaluations.
Table 2. Distribution of Stance Adverbs Across Different Genres
Within the genre of narration, a number of stance adverbs stand out as having higher frequency than other members of the same category. For instance, yiding and bixü are pervasive in expressing certainty, while keneng is frequently used in expressing likelihood, and qishi emerged as a popular choice for marking personal attitude. These three stance adverbs are illustrated in examples (2)-(4).
Within each genre, large variation was observed among different groups of stance adverbs. Attitude and style adverbs, for instance, occur less frequently than certainty and likelihood adverbs. This seems to suggest that the expression of epistemic stance (i.e. certainty and likelihood) is more of a concern than the way they are expressed (i.e. style).
Next we explore the distributional patterns of stance adverbs across different communicative modes: written, written-to-be-read, written-to-be-spoken, spoken, and spoken-to-be-written. According to the Sinica Corpus manual, this differentiation is theoretically motivated. Specifically, written refers to written documents in the sense typical of corpus construction. Written-to-be-read, however, refers to documents that are essentially written for the purpose of giving a speech or lecture. It is argued that this type of text is different from common spontaneous speech, given that it is often carefully constructed and edited. Written-to-be-spoken refers to scripts for plays or dramas. They are written in such a manner as to mimic real spoken interaction. Because they are repeatedly rehearsed in advance, they are said to be somewhat different from real spontaneous speech. Spoken-to-be-written refers to texts such as notes taken at a conference. Although they originate in speech, they are often post-edited, and thus warrant a separate category in the corpus construction. Finally, the spoken mode consists of transcripts of spontaneous speech.