ASSESSING MODAL PROFICIENCY IN ENGLISH AS A SECOND LANGUAGE

by

Roann Altman

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Linguistics)

May 1984

Copyright by Roann Altman 1984

UNIVERSITY OF SOUTHERN CALIFORNIA
THE GRADUATE SCHOOL
UNIVERSITY PARK
LOS ANGELES, CALIFORNIA 90089

This dissertation, written by Roann Altman under the direction of her Dissertation Committee, and approved by all its members, has been presented to and accepted by The Graduate School, in partial fulfillment of requirements for the degree of DOCTOR OF PHILOSOPHY.

DISSERTATION COMMITTEE

ACKNOWLEDGEMENTS

In the several years it has taken me to conduct pilot studies, do the research, and write this dissertation, I have had the support of many friends, colleagues, and professors--all of whom were as ecstatic as I was that the task had finally been completed. I am indebted first of all to the members of my committee, each of whom contributed something special to the final product: Steve Krashen, who convinced me to do something that was manageable; Bill Rutherford, who understood how difficult it was to describe the system of modality and who made sure that all material produced was accurate; Jackie Schachter, whose insights led to a refinement of the research and of the final version; Ed Purcell, who provided much needed assistance in the area of statistical analysis; and Joe Hellige, who read the dissertation and served as an active participant throughout the process.
I would like to extend special thanks to those USC faculty and staff with whom I consulted at various times during the process: Bob Kaplan, Bernard Comrie, and Murvet Enç for discussions on a framework for modal analysis; and Doug Biber and Jack Roberson of the University Computing Center for their statistical and computing assistance. The development of Part I of the test used in this study owes a great deal to the efforts of Elena Garate, Assistant Director of the Office of International Students and Scholars (OISS). Several OISS staff members and Peer Advocates, along with members of the International Student Assembly, provided input for the lecture given on registration procedures. The administration of the test to the non-native speakers could not have been done without the assistance of two wonderful colleagues: Susie Dever, a member of the Humanities Audio Visual Center staff, helped proctor that part of the exam requiring audio-visual equipment; Mary Alvin helped with all aspects of test administration. In addition to their tremendous help in getting the test to run smoothly, both also provided much needed moral support. Special thanks go to the twenty native speakers whose responses served as the criterion for grading the non-native-speaker tests: Nicholas Arkimovich, Chuck Beckett, Tracy Bronstein, Anita Brown, Lynn Davis, Lisa Gittelman, Susan Grodsky, Monica Hidalgo, Paul Holt, David Katz, Tom Kennedy, Eli Hubara, Earl Levenson, Robert Louis, Susan Pressman, Devora Salovey, David Smith, Laurie Topel, Randy Wolman, and Harry Zimmerman.

And finally, there are some special people in my life who have helped in various ways: Bill Grabe, who understood exactly what I was talking about when I told him my ideas about modals and language acquisition; Clif de Cordoba, Allen Koshewa, Mike Maggio, and Fred Wilkey, for being there to provide emotional support; Mary Alvin, for being there and being such a tremendous help just when I needed her; Dr.
Caleb Gattegno and Joseph Sirota, for wanting to know when I'd be done already; Matt Geller, for his patience in seeing me through the most difficult time and for helping me with whatever needed to be done; and my parents, for wanting to see me graduate in May 1984.

ABSTRACT

In this research, a test of English modal expressions was developed and subsequently evaluated for its ability to assess overall language proficiency. Because the use of modals had been shown to be problematic for even advanced learners of English, it was hypothesized that a test of modal performance would be able to discriminate between native and non-native speakers and between different levels of non-native speakers. Based on a functional-semantic analysis of the system of modality, a three-part test was developed: (1) an open-ended portion requiring modal production in guided but spontaneous oral discourse, (2) a restricted-response format requiring that blanks within written dialogs be filled in with modal expressions, and (3) a closed-ended portion requiring that one of two sentences differing only in the modal expression used be selected as sounding more appropriate. This test (Special English Test) was administered to 65 non-native speakers (intermediate, low advanced, and high advanced level students) and to 17 native speakers. The native speakers were tested in order to obtain a set of acceptable responses. The percentage of native speakers providing a particular response on each test item was calculated and the resulting probabilities of occurrence served as the criterion for scoring the non-native speaker tests. In order to determine how well the Special English Test (SET) matched a test already in use, a canonical correlation analysis was performed on the non-native speaker results of the three parts of this test and on the seven parts of an English placement test (International Student Exam or ISE). Results showed a strong correlation between the two tests (R = 0.8278).
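The native-speaker criterion described above (crediting each non-native response with the proportion of native speakers who produced it) can be sketched as follows. This is a minimal illustration, not the actual SET scoring code; the item data and response strings below are invented for the example.

```python
# Sketch of probability-of-occurrence scoring: a response is worth the
# fraction of native speakers who gave that same response on the item.
# All data here are hypothetical, not drawn from the actual SET.

def build_criterion(native_responses):
    """Map each response to the fraction of native speakers producing it."""
    counts = {}
    for r in native_responses:
        counts[r] = counts.get(r, 0) + 1
    n = len(native_responses)
    return {r: c / n for r, c in counts.items()}

def score_response(criterion, response):
    """A response unattested among the native speakers earns zero credit."""
    return criterion.get(response, 0.0)

# Hypothetical fill-in-the-blank item answered by 17 native speakers:
natives = ["should"] * 10 + ["ought to"] * 4 + ["must"] * 3
criterion = build_criterion(natives)

print(score_response(criterion, "should"))    # majority native choice
print(score_response(criterion, "can"))       # unattested form
```

Under this scheme a learner who matches the majority native choice scores highest, while a grammatically possible but natively unattested modal scores nothing, which is exactly the appropriateness-based discrimination the test is after.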
The strongest contributor to the SET canonical variable was the restricted-response format, where the subjects were required to fill in blanks in a given context with an appropriate modal expression; the strongest contributor to the ISE canonical variable was the subtest where subjects were required to select a grammatical form to complete a sentence. In order to determine how effective the SET was as a predictor of language performance, a canonical discriminant analysis was performed. The resulting significant canonical discriminant function correlate was 0.8552 and was able to account for 98.89% of the variance. The SET was able to correctly predict 75.61% of the placements into four levels (intermediate, low advanced, high advanced, and native speaker). The study thus shows that it is possible to develop a test based on a single aspect of language structure, in this case modality, that can successfully discriminate between native and non-native speakers and between different levels of non-native speakers.

CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
MODALITY
    What Makes Non-Natives Native?
    Overview of Modality
    Traditional Linguistic Analyses
    Modal Research Studies
        Adult Native Speakers
        Child Language Acquisition
        Second Language Acquisition
    Towards Establishing a Modal Framework
EVALUATING LANGUAGE PROFICIENCY
    Purposes
    Issues in Evaluating Language Proficiency
        Trends
        Communicative Competence
        Testing Communicative Performance
    Testing Considerations
        Types of Language Data
        Format
        Scoring
        General Considerations
TESTING MODAL PROFICIENCY
    Specifying Test Content
        Establishing a Functional Framework
        Functions of Modality
        Scale of Compulsion/Obligation
    Issues in Testing Modality
        Type of Data
        Format
        Scoring
        Administrative Considerations
    Previous Research Involving Elicitation Procedures
    Designing the Test
        Pilot Testing
        Restricted-Response Format
        Closed-Ended Format
        Open-Ended Format
    Administering the Test
        Description of Subjects
        Description of Non-Native Speaker Administration Procedures
        Native Speakers
RESULTS AND ANALYSIS
    Description of Test Correction Procedures
        Coding of Responses
        Evaluating and Scoring the Responses
    Statistical Analysis of the Test
        Preliminary Analyses
        Establishing Concurrent Validity
        Predicting Placement
    Discussion
        Data Elicitation
        Validity
        Placement
CONCLUSION
    Appropriateness of the Test of Modality
        Modality as an Area of Assessment
        The Success of the Test Formats
    Beyond the Test: Suggestions for Further Research
        First Language Research
        Second Language Research
        Implications
REFERENCES
Appendix
    A. SPECIAL ENGLISH TEST: PART II
    B. SPECIAL ENGLISH TEST: PART III
    C. SPECIAL ENGLISH TEST: PART I
    D. PREPARATORY PAPERS
    E. NATIVE SPEAKER QUESTIONNAIRE
    F. INTERNATIONAL STUDENT INFORMATION SHEET

LIST OF TABLES

1. Expressions of Modality by Adult Native Speakers
2. Representations of WILL by Adult Native Speakers
3. Summary of Micro-Functions and Their Linguistic Realizations
4. Special English Test: Part I
5. Special English Test: Part II
6. Frequency Count of Subjects by Level
7. Simple Statistics of Scores by Level: SET Part I
8. Simple Statistics of Scores by Level: SET Part II
9. Simple Statistics of Scores by Level: SET Part III
10. Correlation Matrix of Non-Native-Speaker Scores on the SET and the ISE
11. Standardized Canonical Coefficients for the Significant Composite Variables of Each Test
12. Correlations Between the Subparts of Each of the Tests and the Opposing Canonical Variables
13. Standardized Canonical Discriminant Function Coefficients for the SET
14. Placement by the SET
15. F-Scores of the Significance of the Difference Between Levels Determined by the SET

LIST OF EXAMPLES

1. Special English Test, Part II: Sample Test Item
2. Special English Test, Part III: Sample Test Item
3. Special English Test, Part I: Episodes and Their Linguistic Representations
4. Special English Test, Part II: Item to be Scored

MODALITY

What Makes Non-Natives Native?

One of the problems facing those in the field of second or foreign language teaching is how to evaluate the language ability of non-native speakers of a language.
Language assessment is required at different stages of the learning process: at the beginning to determine placement, during the course of study to determine intermediate levels of achievement, and at the end to determine final proficiency. In each case, the language proficiency of a non-native speaker is compared either with that of other non-native speakers or against an established standard. Assessment of the language of more advanced speakers of a second or foreign language is especially problematic. The ability of these speakers to communicate, often fluently and without apparent syntactic or morphological errors, might lead one to conclude that they have native-like proficiency. Closer inspection of their speech, however, would likely show evidence of forms and structures that either are not common among native speakers or perhaps never appear at all among native speakers--despite the fact that most of their structures are grammatically correct in a technical sense. This study focuses on one such area of English where the distinction between native and non-native speaker usage lies not so much in syntactic grammaticality but in the application of rules of appropriateness. A test is described that can be used to (1) assess the nativeness of a learner's speech in one aspect of English structure and (2) predict level of overall proficiency.

Overview of Modality

Informal observation had led the researcher to believe that the modification of basic propositions was problematic for non-native speakers of English. While beginning level students often failed to use any modifiers at all, advanced students too experienced a tremendous amount of difficulty in expressing themselves. The missing or problematic forms seemed to fall under the grammatical classification of modality, including what are traditionally referred to in English as the modal auxiliaries: can, could, will, would, shall, should, may, might, must, ought (to), and sometimes need.
In order to determine more exactly what the nature of the difficulty was, and to see if indeed it was possible to use modality to distinguish native from non-native speakers and to determine level of proficiency, an investigation was made into the area of modality. The modals, which express such meanings as certainty, probability, possibility, permission, obligation, and advice, have generally been taught as a set of verbs with distinguishing characteristics: they have finite forms only, take no inflections, and do not appear as imperatives, for example (Halliday 1970:300). It would be incorrect to assume, however, that these auxiliaries could represent only modal meanings, or conversely, that these meanings could be expressed solely by the modal auxiliaries. In fact, these same meanings can be expressed with periphrastic verb forms (e.g., BE allowed to)[1] and even non-verbal forms (e.g., lexical items such as adverbs and adjectives). And, in certain contexts, the aforementioned auxiliaries do not even perform the modal function (e.g., can in the sense of ability;[2] will in its strictly future sense).

[1] A capitalized verb is a convention used to represent any form of the verb (e.g., BE = am, is, are, was, were).

[2] In fact, Boyd and Thorne (1969) consider can to be modal only when may can substitute for it, as in the two senses of possibility or permission. Can in the sense of ability, in conjunction with achievement verbs (e.g., see, understand) where they fail to take the progressive, or in sentences that convey a sporadic aspect and can be paraphrased with sometimes (e.g., Linguistics can be fun.) is considered non-modal. Leech (1971) also considers can to be non-modal when it means know how to and when used with verbs of inert cognition (e.g., remember, understand) and of inert perception (e.g., hear, see).
In order to find some way of determining non-native speaker proficiency in the area of English modality, it was first necessary to establish a clearer definition as to what fell within the realm of modality. Unlike linguistic descriptions of other aspects of English (e.g., verb morphology, prepositions, pronouns), descriptions of the system of modality are quite complex. While the system has variously been described by structural linguists, functionalists, semanticists, and pragmatists (see section on Traditional Linguistic Analyses), there does not seem to be any consensus among them as to the best framework within which to view modality. Rather than arbitrarily impose one descriptive framework, it seemed best to investigate a range of them and select the one that would best suit this type of linguistic study. (See Steele et al. 1981 for a rationale for minimizing assumptions in empirical investigations.) Closer investigation of the descriptions given by the traditional linguists, however, revealed that most of them failed to deal with the rules for actual language use (e.g., appropriateness rules). An essential supplement to any traditional linguistic description, therefore, would be a systematic investigation of actual native speaker performance. Although some recent modal descriptions have, in fact, been based on actual language data (e.g., Palmer 1979), these data are invariably written. A complete analysis should include spontaneous and natural oral language performance data as well.

Traditional Linguistic Analyses

Traditional analyses of modality can be divided into two major types: (1) those that begin with the modal auxiliaries themselves and simply enumerate all possible meanings and functions for each word and (2) those that begin with modal functions or meanings and then enumerate the many expressions that represent each one.
There is no clear agreement as to the best way to look at modality: there have been a great many analyses of each type, as well as discussions surrounding the difficulties of categorization that arise with the second type of analysis. The earlier modal analyses, in most cases, listed the modal auxiliaries one at a time with their various possible meanings and interpretations. Probably the first major work of this kind was Ehrmann's The Meanings of the Modals in American English (1966). It was soon followed by Boyd and Thorne's "The semantics of modal verbs" (1969). Although this analysis was based on the "speech act" notion of Austin (1962) and Searle (1965), the point of departure for the analysis was still the individual modal auxiliaries. Nevertheless, the work on speech acts did get linguists to focus on the uses to which language is put, thus opening the way for analyses based on semantic or functional categorization rather than on the structures themselves. The first linguist to analyze modal expressions in terms of the uses to which they are put was Halliday (1970), who considered language to have three basic functions: a content (ideational) function, a social role (interpersonal) function, and a discourse (textual) function. He distinguished between the interpersonal function of modality, on the one hand, and the ideational function of modulations, on the other. Modality, in conveying the speaker's comment on the content (e.g., probability of occurrence), can be expressed either by modal auxiliaries (e.g., should, might) or by non-verbal lexical items (e.g., possible, certain). Modulation, an integral part of the content, can be realized not only by modal auxiliaries (e.g., must) but also by periphrastic verb forms (e.g., BE able to, BE allowed to, BE willing to). Most of the modal analyses made during the 1970's categorized the modals according to the logical notions of necessity and possibility.
The earliest such analysis was that proposed by Leech (1971), in which the same modal form could appear in two parallel systems, each with a different meaning. Thus, on one end of the semantic continuum, must and have to[3] are the realizations for necessity in one system but for obligation in the other; and at the other end, may and can are the realizations for possibility in one system but for permission in the other. Although one of the first attempts to represent the modals using a framework encompassing more than just the modal forms for analysis, this analysis was problematic in its assumption that the system could be adequately described by such a neat, idealized, symmetrical framework. What was probably the major semantic analysis of modals was made by Lyons (1977). He distinguishes epistemic modality (viz. Halliday's modality)--wherein the speaker qualifies his or her commitment to the truth of what is being said (e.g., "You must be kidding; It might rain.")--from deontic modality (viz. Halliday's modulation)--the obligation and permission of engaging in a certain act (e.g., "You must go now; You may go now."). Nevertheless, they are not completely distinct, as evidenced by the use of the same modal auxiliaries to represent both epistemic and deontic modality (e.g., must and should) and the difficulty in distinguishing them when referring to a future event. For example, a clerk who tells a customer "I'll be with you shortly" is conveying only one meaning. Yet the statement could have the force of a prediction or a promise depending on whether or not the speaker has the power or ability to effect the event. Halliday also notes that the distinction between interpretations becomes especially fluid in hypothetical situations.

[3] He includes have to although he recognizes that it is not a true modal auxiliary: it can combine with other modals, as it has a non-finite form.
The most recent, most complete work on categorization of modals was done by Palmer (1979). Although he too assumes the centrality of possibility and necessity, he distinguishes three types of modality rather than two: epistemic, deontic, and dynamic. Epistemic modality is the modality of probability; deontic modality (also called root) is the modality of obligation. The dynamic category includes neutral dynamic modality, as in general cases of necessity and possibility, and subject-oriented modality, which modifies characteristics of the subject such as ability (can) and willingness (will)--all of which have often not even been considered modal (cf. Boyd and Thorne 1969; Leech 1971). He prefers this three-way division because of the lack of a clear distinction between the two major kinds of modality (i.e., deontic and epistemic), the existence of several different kinds of modality, and the close connection between possibility of events and possibility of propositions (p. 35). Nevertheless, he does concede that the issue is more terminological than real and that it would also be possible to maintain the two-way distinction as long as there was a division of deontic modality into subjective (i.e., deontic) and objective (i.e., dynamic) modality. The problems of the lack of any clear-cut division between the two basic kinds of modality, the existence of more than two interpretations for some of the modals, and the indeterminateness of some of the modals without sufficient context can be dealt with to some extent in a more pragmatic analysis. Kratzer (1981) has suggested that context be looked at in order to more appropriately interpret modals. In "The Notional Category of Modality," Kratzer claims: "Modals are context-dependent expressions since their interpretation depends on a conversational background which usually has to be provided by the utterance situation" (p. 42).
Some of the issues involved in establishing a framework for describing modality have been addressed by Leech and Coates (1979). Based on their analysis of written language data, they discuss the importance of coming to some agreement as to whether or not modals should be, or even can be, strictly categorized. Their view is similar to Palmer's (1979), which recognizes that it may be very difficult to categorize those modals that are not at the extremes of the continua (i.e., neither necessity nor possibility). Leech and Coates discuss three basic kinds of indeterminacy that arise when trying to categorize modal data: (1) gradience--difficulty in categorizing intermediate cases; (2) ambiguity--when more than one interpretation is possible, particularly with isolated sentences where there is no context to disambiguate; and (3) merger--a special case of ambiguity where an item in context conveys virtually the same meaning, no matter which of two possible modal interpretations is selected. The problem with gradience is demonstrated with the modal can. Can has three basic meanings--possibility, permission, and ability--with possibility related to each of the other two in some way. The possibility sense of can is related to the permission sense of can in terms of restriction; that is, can in the sense of possibility refers to restrictions in the real world that might prevent something from taking place, while can in the sense of permission refers to human restrictions. The possibility sense of can is related to the ability sense of can in terms of inherency; that is, whether the occurrence of an event is contingent on something external to the participant or within the participant. The problem of ambiguity can be demonstrated with must, where an isolated sentence could yield either an epistemic or deontic interpretation, as in: "They must realize that it's all over." An example of merger is may used in a generic sense.
It is equally possible to substitute can (in the sense of possibility; i.e., deontic) or a lexical item such as perhaps (i.e., epistemic) in an item such as the following: "It is appalling that students may be graduating high school without being able to read." Despite the problems of indeterminacy in categorization, Leech and Coates conclude that the concept of gradience will rarely be needed since most items will tend to fall into traditional categories (e.g., possibility, ability, permission) with core meanings. They propose the notion of a "quantitative stereotype" wherein one form tends to become associated with a particular category by virtue of its more frequent appearance. For example, a study by Coates (1980) of oral and written data showed that may and can are not in free variation but tend to occur in a set pattern: may in situations of epistemic possibility (e.g., it is possible that); can in the root (i.e., deontic) sense (e.g., it is possible for). In general, may seems to occur in more formal situations and in fixed phrases where can would not be acceptable.

Modal Research Studies

The fact that linguists themselves cannot agree on the most appropriate framework for viewing modality makes it unlikely that teachers of English as a second language (ESL) would have available to them clear descriptions of the system either. Some evidence for this lack of information comes from Wilkins, who notes that the language categories under which modal expressions fall "are the very things we use language for and yet they form only the smallest part of either the grammatical or situational content of language courses" (1979:89-90). One writer who has tried to render the grammatical description of the modal system simple enough for ESL students to understand is Close (1975). In A Reference Grammar for Students of English, he distinguishes a primary modal use (i.e., deontic) and a secondary use (i.e., epistemic).
The selection of the deontic sense as primary seems to be motivated by the fact that it is the more common form of the two, and is perhaps less subjective than the epistemic sense, which involves speaker judgment regarding probabilities. A framework that forces a distinction between epistemic and deontic modality, however, seems to result in two major problems of categorization. In addition to the indeterminacy of the modal auxiliary can, as described by Leech and Coates (1979), there is a tremendous amount of indeterminacy in the modals will, would, and the periphrastic form BE going to (often reduced to gonna). One approach to this indeterminacy, as already mentioned, is Palmer's treatment of will and dynamic possibility as a category separate from deontic and epistemic modality. A separate framework for will and shall had similarly been proposed by Leech (1971): the will of predictability was distinguished from the will of volition, which included the notions of insistence, intention, and willingness. Those linguists who are unwilling to accept such a neat theoretical division have suggested other possibilities for designing a framework. Leech and Coates (1979) have suggested that linguistic categories might better be established based on a statistical analysis of the data. Kellerman (1980) has suggested obtaining probabilities of occurrence of the forms from natural data. These suggestions show that some linguists are no longer content to use traditional analyses for describing language but prefer to analyze samples of language data to determine what speakers of the language actually do. In order to overcome the failure of traditional descriptions to take into account actual oral language use, a quantitative analysis of the probabilities of occurrence of various expressions of modality in the speech of adult native speakers was made (Altman 1982c).
The initial analysis of the data was based primarily on the work by Palmer, where a distinction was made between deontic and epistemic modality, and expressions involving will and would were considered separately. In addition to the data on probabilities of occurrence, the modal production by speakers who still were not competent in the language was also investigated. It was felt that the data from language acquisition might provide insights as to the most appropriate descriptive framework to use. Therefore, a survey was made of research studies reporting on the acquisition of the modal system by child learners of English as a first language and by child and adult learners of English as a second language.

Adult Native Speakers

The spontaneous oral data used for a quantitative analysis of the modal production of adult native speakers were collected during a small dinner party attended by a group of close friends (three women and four men,[4] all with Master's degrees). Approximately fifty minutes of taped discourse were analyzed. All forms expressing a modal function were underlined in the transcription. Forms included not only the traditional modal auxiliaries such as should, can, and will, but also periphrastic forms such as BE able to, BE supposed to, and BE going to; adverbials of possibility such as probably and maybe; lexical items such as think and BE sure; and simple verbs or infinitives where the context clearly conveyed a modal meaning, for example, go for you should go and belongs for should be. The forms were then divided into deontic, epistemic, or WILL/WOULD expressions[5] (as per Palmer 1979) and assigned to one of two or three strength categories that varied as to their strength of meaning.

[4] The few contributions made by one of the men were eliminated from the analysis as he was a native Japanese speaker.
Within deontic modality, therefore, there were expressions of strong obligation, advice, and suggestion/request/permission; within epistemic modality, there were absolute certainty, strong certainty, and uncertainty. For the category WILL/WOULD, WILL was considered the stronger form and WOULD the weaker form.[6] The quantities for each category of modal expressions appear in Table 1.

[5] The capitalized forms of will and would indicate that expressions having that same meaning but a different form also belong in this category; for example, BE going to.

[6] The distinction is not so clear when would is the form selected solely because of the rule for sequence of tenses.

TABLE 1
Expressions of Modality by Adult Native Speakers
(from strong (1) to weak (3))

           Deontic   Epistemic   WILL/WOULD*   Total
1             50        13          (114)        63
2             30        58           (23)        88
3             68**      38                      106
Total        148       109          (137)       257

*The number of expressions with WILL/WOULD appears in parentheses as they have not been included in the totals.
**Includes 27 instances of ability.

As can be seen from Table 1, expressions of modality are extremely common in speech (approximately 5 per minute; 6 per minute if expressions with WILL/WOULD are included) and the total number of deontic expressions (n = 148) is significantly higher than the total number of epistemic expressions (n = 109; chi-square = 5.92; df = 1; p < .02).[7]

[7] This difference disappears, however, if the 27 instances of ability are considered separately from the deontic category. In addition, it is to be expected that the relative quantity of deontic and epistemic expressions may vary as a function of topic.

WILL/WOULD

Because of the difficulty in distinguishing epistemic from deontic uses of WILL, the expressions considered under the category of WILL/WOULD included the functions of intention/volition as well as the functions of prediction/likelihood. In general, will was used much more frequently than would (114 vs.
23 instances), and would almost always represented a more tentative form of will, especially in the sense of intention. Both forms were most often used in their contracted forms ('ll and 'd). The distribution of the 114 instances of WILL appears in Table 2. As can be seen from this table, the most common realizations of the sense of WILL are the auxiliary will and the main verb want, with BE going to (especially gonna) almost as well represented.

TABLE 2
Representations of WILL by Adult Native Speakers

(wi)ll                        39
want                          36
BE going to (gonna)           29
present progressive            6
other                          4
TOTAL                        114

Scale of Obligation or Deontic Modality

There are basically three levels of deontic modality: (1) strong, absolute, inescapable obligation; (2) escapable obligation; and (3) weak, freedom to act (Close 1975), including suggestion, permission, and possibility. For the 50 instances of inescapable obligation, the most common representation was have to/hafta (n = 28), followed by ('ve/'s) gotta (n = 11). The third most common expression involved the verb need: need to + verb (n = 5) and need + noun (n = 6).

The 30 instances of escapable obligation form a continuum. The neutral, intermediate form is most often represented by the modal should or by the present tense of a verb (e.g., what you do is = what you should do is) (n = 19, including one instance of oughta). The strong form is 'd better (n = 5). The weak form implies that the person is thinking about escaping the obligation; for example, BE supposed to (n = 3) or present-tense verb forms (n = 3), e.g., Do you eat? = Are you supposed to eat?

The lowest category on the scale of obligation has been termed "freedom to act" by Close (1975) and includes the meanings of ability, permission, and opportunities provided. Out of the 68 expressions of modality included in this category, the predominant form used is can (generally phonologically reduced; n = 38).
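The chi-square value reported above for the deontic/epistemic comparison in Table 1 can be reproduced with a one-degree-of-freedom goodness-of-fit calculation against equal expected frequencies; a minimal sketch:

```python
# Reproducing the chi-square comparison reported for Table 1:
# 148 deontic vs. 109 epistemic expressions, df = 1, expected
# frequencies equal under the null hypothesis.

deontic, epistemic = 148, 109
expected = (deontic + epistemic) / 2   # 128.5 for each category
chi_sq = sum((obs - expected) ** 2 / expected for obs in (deontic, epistemic))
print(round(chi_sq, 2))   # → 5.92
```

Rerunning the same calculation with the 27 ability instances removed from the deontic count (121 vs. 109) yields a chi-square well below the critical value, consistent with footnote 7.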
Another common form is could (n = 17), which is the past of can when the sense is ability, but a more attenuated, polite form in all other instances. Requests for permission are most often indicated by the interrogative forms of can and could. The final sense in this category, that of opportunities provided, implies that suggestions are being made for future action.

Scale of Certainty or Epistemic Modality

The scale of certainty (i.e., epistemic modality) can also be divided into three major strength categories: (1) absolute certainty resulting from logical deduction; (2) strong certainty, greater than 50%; and (3) weak certainty or uncertainty (50% or less). In contrast with the linguistic representations of deontic modality, the expressions of epistemic modality assume a greater variety of forms even though there are fewer instances of them. The data lend themselves to being organized as in Wilkins (1976), with a distinction made between personal and impersonal representations of certainty. Of the 81 personal expressions of epistemic modality, all but 3 were used with the first person singular pronoun I. Impersonal modality was realized by the modal auxiliaries in 16 of the 34 instances, and by sentential adverbs in most of the remaining instances.

Cases of absolute certainty appeared least frequently (13 vs. 58 for strong certainty and 38 for weak certainty). The impersonal modal forms used were 2 cases each of must and NOT hafta; the personal forms included 2 instances each of I'm sure and I really/just think. By far the most common way to express strong certainty was with some form of the verb think (43 out of the 51 personal instances). Impersonal representation of strong certainty usually involved the adverb probably (5 out of 7 cases), with should appearing only once. The uncertainty of the third category was usually expressed by NOT think or NOT know (14 out of 17 cases).
Impersonal representations usually involved either maybe (7 cases) or might (6 cases), with may used in 4 cases and could used in 3 (total = 21).

Discussion

Attempting to analyze oral modal production by adult native speakers using a single framework was difficult. The major problem was the decision to include not just modal auxiliaries but all expressions of the modal functions. Although there may be a limited number of such expressions within deontic modality, within epistemic modality there is a much greater variety. The need to include both impersonal and personal expressions within epistemic modality, and the proliferation of lexical items and even adverbs that can express epistemic functions, make neat categorization extremely difficult if not impossible.

Another problem in using this single framework involved the categorization of some of the individual items. There were many instances of WILL/WOULD, and it was often very difficult to distinguish their modal usage (e.g., intention/volition; prediction/likelihood) from their pure future usage. The many instances of can were similarly difficult to categorize because of the overlap of meaning in the areas of ability, possibility, and permission. For this study, all were placed within the deontic category.

Child Language Acquisition

Some of the first research studies investigated to help establish a framework within which to view modality were studies of children acquiring English as their first language. According to a study by Wells (1979) reported by Fletcher (1979), can and will were the primary modal forms used by 3-1/2-year-old children, specifically to indicate their own willingness, inability, or request for permission, or to allow or prohibit an action by their interlocutor.
Based on Shepherd's (1981) report of Kuczaj's (1977) analysis of first language (L1) longitudinal data, the earliest forms used were gonna (for intention), can (for ability), and don't and can't (for permission), followed closely by hafta (for obligation), can't (for possibility), and then must (for probability). Using data produced by Nina during the first four years of her life, Pea, Mawby, and MacKain (1982) report that the most common forms were BE going to and will, followed by can, have (got) to, and need. The same corpus of data analyzed by Shepherd (1981) shows the forms will and gonna (for volition and intention) and can (for ability) to be the earliest acquired. In sum, these studies all show the earliest and most common modal forms to be variations of will and BE going to, can, and have (got) to. Later acquired and less frequent forms are need, could, and must (for probability).

Second Language Acquisition

The research studies investigated in the field of second language acquisition to help establish a modal framework involved teenagers, adults, and children. In his analysis of the acquisition of auxiliary verbs by adolescent learners of English as a second language, Schumann (1978) looked at the modals can, could, will, would, should, and may. Using a criterion of acquisition of 90% correct in three consecutive obligatory contexts, it was found that can was supplied correctly only 5% of the time, could was never supplied in the three possible obligatory contexts, will was used correctly approximately 38% of the time, and would was used only once in 13 possible contexts. While need is not an auxiliary in English, one of the subjects (Alberto) used it frequently in the sense of English have to, must, or need to, probably as a result of interference from Spanish.
In a reanalysis of Schumann's data, Andersen (1981) showed that can was used correctly 85% of the time in obligatory contexts if all instances of the negative were included, even if incorrectly formed. In light of the fact that it is often very difficult to determine obligatory contexts for the modals (Schumann 1978) and that the infrequent use of modals by learners is not likely to show them approaching the criterion of acquisition, a criterion of appearance can be adopted instead. Andersen's reanalysis showed that the auxiliary can may be considered to have appeared because "it was supplied in obligatory contexts at least 80% of the time in three consecutive samples" (p. 170). Only three other modal auxiliaries appeared, and then only in the production of a few of the subjects: will (4 subjects), could (3 subjects), must (1 subject). Appearance, of course, says nothing about correctness or appropriateness of use. Furthermore, frequency of occurrence cannot be equated with acquisition (Wode 1981:68).

Defining acquisition as the first of 20 successive days in which a form appeared in spontaneous data at least 5 times, Bahns (1983) investigated the modal production of Wode's four children, who were acquiring English in a naturalistic setting. For the 2- and 3-year-old girls, the only modal auxiliary "acquired" was can. By the age of 5, one boy had acquired can, will, should, might, and could, and by the age of 4, the other had acquired can, would, should, could, will, and might. Bahns cautions, however, that the exact age at the point of acquisition for each form might be misleading in that the forms were invariably used incorrectly. The adoption of an acquisition criterion, rather than an appearance one, shows actual acquisition to occur at a later date, though even such later usage is often not perfect.
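Andersen's appearance criterion quoted above (supplied in obligatory contexts at least 80% of the time in three consecutive samples) is mechanical enough to sketch directly; the per-sample proportions below are invented for illustration:

```python
# Sketch of Andersen's (1981) "appearance" criterion: a form counts as
# having appeared once it is supplied in obligatory contexts at least
# 80% of the time in three consecutive samples.

def has_appeared(rates, threshold=0.80, window=3):
    """rates: per-sample proportions of obligatory contexts correctly supplied."""
    return any(all(r >= threshold for r in rates[i:i + window])
               for i in range(len(rates) - window + 1))

can_rates  = [0.70, 0.85, 0.90, 0.82]   # invented sample proportions
will_rates = [0.40, 0.85, 0.60, 0.75]

print(has_appeared(can_rates))    # → True  (samples 2-4 all >= 0.80)
print(has_appeared(will_rates))   # → False
```

As the text notes, meeting such a criterion says nothing about correctness or appropriateness of use; it only registers that the form is being supplied.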
In a reanalysis of the data, Bahns (1981) compared the order of acquisition from a structural point of view (i.e., the order in which the meanings for each given modal form were acquired) and from a semantic-pragmatic point of view (i.e., the order in which the modal forms themselves were acquired to represent each semantic-pragmatic interpretation). Bahns found that a clearer developmental sequence could be seen when the forms were looked at in terms of the semantics/pragmatics. Thus, for the functions of possibility, ability, and permission, the first acquired modal form was clearly can, followed by could or might (the two boys differed). For the function of advice/suggestion/invitation, again the first modal form acquired was can, followed by should and then could. The meaning of request was also most clearly realized by can with one of the boys, and by can, would, or could by the other. The semantic notions of prediction and predictability, willingness, intention, and warning/permission/threat were all represented by will, though for willingness, one of the boys used first would and then will.

In a study of the acquisition of modal auxiliaries by adult second language learners (Altman 1982b), a similar pattern of use emerged. An analysis was made of the spontaneous oral production of two groups of ESL learners. One group consisted of 3 Japanese learners of English; the other, a class consisting of 2 Arabic speakers, 1 Chinese speaker, and 1 Spanish speaker. The forms were categorized according to the functions conveyed, and the analysis was carried out on only those forms occurring at least three times for a particular function. To denote possibility/ability/permission, 5 of the 7 subjects used can. To denote possibility, there was an overwhelming preference for the adverb maybe rather than the modal might. For the expression of strong obligation, 3 learners preferred have to, 2 preferred need, and 1 must.
Only one speaker used the modal should to express advice. And to convey intention, volition, and probability, a variety of expressions were used, not all of which were modal: one subject used would and then; another, want and gonna; if, will, and want were used by 3 other subjects. What characterized the modal production of these adult learners in general is the limited number of functions expressed by each of the seven subjects. The most advanced speaker (the one with the greatest total modal production and the one so judged by his teacher) regularly used (i.e., in more than three instances) 7 different forms to represent 5 different functions: have to, should, can, could, maybe, would, then. The forms used most often by the remaining 6 subjects were: have to, can, need, maybe, and want.

Towards Establishing a Modal Framework

The studies described above were reviewed in order to establish the categories needed for the development of a modal framework that would be clear enough to serve as a guide for determining proficiency in modal usage. The different kinds of grammatical descriptions put forth by linguists (e.g., structural, functional, semantic, and pragmatic) were paralleled by a similar diversity in the research studies themselves, particularly in the methodology used. While most studies in child language acquisition (L1 or L2) were longitudinal in nature (e.g., Wells 1979; Kuczaj 1977; Pea, Mawby and MacKain 1982; Bahns 1981), those of adult second language acquisition were primarily cross-sectional (e.g., Schumann 1978, Altman 1982b). The methodology used tended to determine the type of analysis: longitudinal studies focused on the order of appearance of expressions; cross-sectional studies focused on the quantity of particular expressions. The varying backgrounds of the researchers in the field of modality, along with the different types of methodology and analysis used, rendered the studies, for the most part, incomparable.
In order to make use of these research studies, therefore, it was decided to focus on what the studies had in common. The most outstanding point of commonality in all the studies was the early use and frequency of the modal auxiliary can. Even when an analysis was made of the forms likely to be used to express the various functions (i.e., possibility, ability, permission; request; and advice/suggestion/invitation), Bahns (1981) found that can was the predominant form, followed by could. Another apparent similarity among all the studies was the frequency and early use of will (or sometimes BE going to/gonna) to represent the functions of prediction, willingness, intention, and warning/permission/threat. Several studies also mentioned the appearance of have (got) to (or hafta) and sometimes need to represent the function of necessity/obligation. The rare mention of such modals as would, should, must, and might could only be taken to mean that these forms either had not been acquired or, if acquired, just were not used in the corpora of data under investigation.

The modal studies reviewed here can help refine the descriptive framework to be used as the basis for assessing modal proficiency. Since the area of modality is so diverse, however, it should be limited in some principled way. A rather easy approach to simplifying the task at hand would be to test only one of the major categories (i.e., epistemic or deontic). There are several reasons to choose to focus on deontic modality rather than epistemic modality. First, epistemic modality is not amenable to simple description. A great many of the expressions used to represent epistemic modality are not modal auxiliaries, and since many of them are lexical items, there are too many expressions to be easily included in a single framework. Second, it has been suggested that deontic modality is more basic than epistemic modality (Shepherd 1982).
Support for this hypothesis comes from Kuczaj (1977; reported in Hirst and Weil 1982), in which young children were seen to produce more deontics than epistemics. Shepherd (1981) predicts that modal forms will first be used to express deontic meanings and that the same forms will then be extended to epistemic ones. Givón (1979:28) notes that the earlier sense of the English modal auxiliaries involved obligation and that the probability senses developed later.

Another delimitation of the study would be to exclude the forms can and will/would from consideration. Linguistic descriptions have shown their categorization to be extremely difficult. In addition, they appear quite frequently and seem to be acquired early. As such, they would not serve as good indicators of different (especially advanced) levels of English language ability.

The one area that would effectively distinguish different levels of English language ability would be that of deontic modality. A differential order of acquisition and frequency of use has been determined for the different levels of strength that relate to different degrees of obligation (e.g., strong obligation, advice, suggestion/permission). For example, while have (got) to and need sometimes appeared in the studies reviewed, should appeared much less frequently, and ought to, 'd better, and BE supposed to appeared in only one study (Altman 1982c). Since the purpose of the present study is to find an area of modality that will distinguish native from non-native speakers and different levels of non-native ability, the modals to be studied need not be those learned early by all (e.g., can, will), but those that are likely to show a differential order of acquisition (even if it is not possible to determine with any precision what that order may be). The expressions of deontic modality have thus been chosen as the focus for this study.
EVALUATING LANGUAGE PROFICIENCY

Purposes

The primary purpose in undertaking this study was to develop a way to assess the language proficiency of advanced learners of English. Although performance data are imperfect realizations of what a speaker knows about a language, these data are the only means of getting at language competence. The basic problems in assessing proficiency are what kind of data should be collected, how best to collect those data, and how best to interpret them.

Traditionally, the focus in language testing had been on grammatical competence. In 1961, however, J. B. Carroll first suggested that testing not be limited to discrete grammar items but that there be a concern for the meaning of stretches of speech, that is, communication. This distinction between the use of language for communication (i.e., use, or performance) and the metalinguistic knowledge of formal language patterns (i.e., usage, or competence) has been stressed repeatedly by researchers in the field of second language acquisition (see, for example, Widdowson 1978, B. J. Carroll 1980).

In this study, designed to assess proficiency in the area of deontic modality, data will have to be collected that reflect the use of language for communication. The major concern will be how best to collect the data. Can a test instrument be designed that would allow the assessment of modal production? Would it be a valid measure of general language proficiency? Would this instrument be capable of distinguishing native from non-native speakers? Would it be able to discriminate between different levels of non-native speaker ability? And could it provide detailed information about the status of a learner's grammar (i.e., interlanguage) at a particular point in the learning process?
Issues in Evaluating Language Proficiency

Trends

The only way to effectively evaluate speakers' knowledge of a language (i.e., competence) is by looking at their actual use of the language (i.e., performance). Although this performance is never perfect, constrained by such factors as "memory limitations, distractions, shifts of attention and interest, and errors" (Chomsky 1965:3), it is nevertheless the key to determining proficiency.

The technique for evaluating second language (L2) proficiency has changed over the years, as a direct reflection of new trends in the field of foreign language teaching and learning. Spolsky (1978) has documented the history of these trends in the field of language testing and divided them into three major periods: (1) pre-scientific, where evaluation (usually written) was made by an experienced teacher; (2) psychometric-structuralist, where language testing experts tried to develop objective, reliable, and valid measures using the discrete items contributed by the structural linguists; and (3) the current language/communicative competence approach, with psychologists insisting on an overall (i.e., integrative) measure of language performance and sociolinguists insisting on a functional component as well.

Two major purposes of testing are to assess achievement and proficiency. While achievement tests measure how well specific skills taught in a particular course have been learned, proficiency tests measure total general language capability (Briere 1971). Since course content tended to be clearly specified under the earlier influence of the structuralists, achievement tests often measured production on discrete points of language (Clark 1978). The difficulty of comparing achievement across classes, much less different schools or different parts of the country or world, led J. B. Carroll (1961) to suggest that target proficiencies be described.
Carroll also recommended that the testing of proficiency not be restricted to discrete items measuring the use of parts of structure but that the items become more integrative, focusing on communication as a whole. The movement away from discrete-point items toward more communicative measures has been motivated in part by a new philosophy of language learning (i.e., that knowledge of grammar is no clear indication of ability to use the language for communication) and in part by the problems inherent in discrete-point tests (e.g., the need to assess mastery within a particular area using only a single test item, and the difficulty of designing such items and appropriate test formats in general) (Clark 1978).

The more recent, integrative tests are of two types: (1) direct, measuring actual language use in real situations, or (2) indirect, simulating actual language use. Indirect measures are based on what is known as "expectancy grammar" (Oller 1978): input is understood by using contextual clues (i.e., pragmatics) in addition to the internalized grammar. The two indirect measures that have received the most attention are the cloze procedure and dictation.

The cloze procedure, invented by W. L. Taylor in 1953, was first used to measure the readability of texts and the reading comprehension of native speakers of English (Oller 1979, Alderson 1979). Since that time, the procedure has been adopted for use with non-native speakers of English. In general, cloze procedure refers to "the systematic deletion of words from text" (Alderson 1979:219), with learners attempting to fill in the missing word. Deletion can be by the fixed-ratio method, where every nth word in a passage is deleted, or by the variable-ratio method, where words to be deleted are selected by virtue of their meaning or grammatical form class (Oller 1979:346-347).
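The fixed-ratio deletion method just described is simple to operationalize; the following is a minimal sketch (the passage is arbitrary, and a real test would skip the first sentence or two before deleting):

```python
# A minimal fixed-ratio cloze generator: every nth word is replaced by a
# blank, following the deletion procedure described above (Oller 1979).

def fixed_ratio_cloze(text, n=7, blank="_____"):
    words = text.split()
    key = []   # the deleted words, i.e., the answer key
    for i in range(n - 1, len(words), n):
        key.append(words[i])
        words[i] = blank
    return " ".join(words), key

passage = ("The cloze procedure was first used to measure the readability "
           "of texts and the reading comprehension of native speakers")
cloze_text, key = fixed_ratio_cloze(passage, n=7)
print(cloze_text)
print(key)   # the words a test-taker must restore
```

A variable-ratio version for the present study would instead select exactly the deontic modal expressions for deletion, which is what makes that method diagnostically interesting here.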
While scoring is most often by the exact-word method (i.e., only the exact same missing word is counted as correct), other scoring procedures have also been used. The acceptable-word method counts as correct synonyms or words from the same form class (Alderson 1979). There are two types of weighted procedure. In one, the errors are categorized according to how severely they violate contextual constraints. In the other, developed by Darnell (1968) and called clozentropy, non-native-speaker responses are weighted in accordance with the frequency with which they match those given by native speakers (Clark 1978, Oller 1979, Alderson 1979).

Although Oller stresses that cloze tests "require the utilization of discourse level constraints as well as structural constraints within sentences" (1979:347) and are thus valid pragmatic tasks (i.e., simulating real language processing in context), research by Alderson (1979) suggests that the cloze procedure does not require the use of the larger context but is sentence-bound, restricted to the immediate environment. If, in fact, it is found to be sentence-bound, he claims that it is more likely to be a measure of lower-order skills such as grammar than a measure of overall language proficiency.

The other major indirect test of language proficiency is dictation, where subjects write down material they hear (Oller 1979). In partial dictation, a segment of the oral material is provided, so the subject need only fill in the missing portion. A variation on dictation makes use of the principle of reduced redundancy. Although language is normally redundant, background noise introduced into the test situation interferes with this redundancy and thus severely affects the performance of less proficient speakers. (See Spolsky et al. 1968 for a complete description.)
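The clozentropy idea described above (weighting each learner response by how often native speakers gave that same word) can be sketched in simplified form. This is only an illustration of the weighting principle, not Darnell's actual scoring formula, and the native-speaker response distributions are invented:

```python
# A simplified clozentropy-style score (after Darnell 1968): each learner
# response is weighted by the proportion of native speakers who gave that
# same word for the blank. Response distributions here are invented.

from collections import Counter

def clozentropy_score(learner_answers, native_answers_per_blank):
    score = 0.0
    for answer, native_answers in zip(learner_answers, native_answers_per_blank):
        freqs = Counter(native_answers)
        score += freqs[answer] / len(native_answers)   # 0 if no native gave it
    return score

natives = [["should", "should", "must", "should"],   # blank 1
           ["maybe", "probably", "maybe"]]           # blank 2

print(round(clozentropy_score(["should", "maybe"], natives), 2))   # → 1.42
print(round(clozentropy_score(["can", "might"], natives), 2))      # → 0.0
```

The attraction of such a scheme for the present study is that it builds native-speaker probability-of-occurrence information directly into the scoring, rather than treating all acceptable answers as equivalent.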
While the strength of these indirect measures of language ability (e.g., cloze and dictation) lies in their high correlation with more direct measures (i.e., language use in real situations; Clark 1978), they have been criticized on two grounds. The first criticism comes from the more traditional structuralists, who criticize the indirect measures for failing to provide diagnostic information about the language performance of individuals (Oller 1978). Nevertheless, while the diagnostic information may not be as clearly presented as it might be in a multiple-choice, discrete-point type of test, such information is certainly present, particularly in the variable-ratio method of cloze testing (i.e., where items are omitted based on a particular meaning or function that they have). The second criticism comes from sociolinguists who do not feel that the indirect measures go far enough, in that they do not test language in real communicative situations (B. J. Carroll 1980).

Communicative Competence

The recent emphasis on testing communicative competence has two sources: (1) a reaction against the discrete-point testing of grammar items and (2) a push towards the addition of a sociolinguistic component in the grammar. Both influences on the testing process reflect the changes that have come about in the field of second language teaching. The evaluation of language learning as involving more than just learning the grammar of a language had its impetus in the work of linguists published in the early 1970's. The notion of "communicative competence" (Campbell and Wales 1970, Hymes 1972) was proposed to include "not only grammatical competence (or implicit and explicit knowledge of the rules of grammar) but also contextual or sociolinguistic competence (knowledge of the rules of language use)" (Canale and Swain 1980:4).
This broader view of competence has been evident not only in the field of second language teaching, primarily in the area of syllabus construction (see, for example, Munby 1978), but also in language teaching methodology (e.g., Brumfit 1980) and testing procedures (e.g., Morrow 1979).

One of the earliest influences on syllabus design of the suggestion that teaching focus on communicative competence can be seen in the work of Wilkins (1972, 1979). He recognized the growing dissatisfaction with a grammatically based syllabus: learners often did not perceive knowledge of grammar as applicable to their learning the language, meaning was subordinated to form, and there was no motivation for arbitrarily grouping together sentences with similar surface structures. The resulting alternative, a situational syllabus, was not without its problems: the difficulty of clearly specifying situations (especially those that were non-physical and non-observable) and the variety of linguistic forms possible in any one situation. Wilkins's solution was to propose a semantic or notional syllabus: the notions likely to be needed by learners would be matched with the grammatical forms required to express them. A notional syllabus would thus consist of semantico-grammatical categories (e.g., time, quantity, space, case, deixis) and categories of communicative function (e.g., modality--scales of certainty and commitment; judgment and evaluation; suasion; argument) (Wilkins 1976).

There are some who object to a functional or notional approach because of its lack of organization (Morrow and Johnson 1977, Brumfit 1980). Brumfit states that if a syllabus is to be used to direct students' learning, then it should be based on a systematic view of language, and it is "the grammatical system that gives us a generative framework which is by nature economical and capable of being systematically ordered for teaching" (1980:5).
Such objections to lack of grammatical organization are dismissed by Canale and Swain (1980) in favor of a functionally based communicative approach. First, they are not convinced that a functional approach need of necessity be totally disorganized, and second, they feel that a functional approach will provide greater motivation for students, who will be better able to relate to an approach to language that is communication-based.

In attempting to derive a viable theory of communicative competence for research purposes, Canale and Swain (1980) have suggested that the various theories that have been proposed differ primarily with regard to the relative emphasis placed on the elements of competence (e.g., grammatical competence, sociolinguistic competence, etc.). They propose an integrative theory of communicative competence where "there is a synthesis of knowledge of basic grammatical principles, knowledge of how language is used in social contexts to perform communicative functions, and knowledge of how utterances and communicative functions can be combined according to the rules of discourse" (p. 20). Their proposed theoretical framework includes:

1. grammatical competence - "knowledge of lexical items and of rules of morphology, syntax, sentence-grammar semantics, and phonology" (p. 29);

2. sociolinguistic competence
   a) sociocultural rules of use - rules of appropriateness for specified sociocultural contexts (e.g., topic, role, setting) and for expressing certain attitudes, registers, and styles within these contexts (cf. Hymes 1967, 1968 for more detail);
   b) rules of discourse - "combination of utterances and communicative functions" (p. 30) (cf. Halliday and Hasan 1976 and Widdowson 1978); and

3. strategic competence - verbal and non-verbal communication strategies used by learners to cope with situations where there is a lack of grammatical or sociolinguistic competence.
Their framework also includes a subcomponent called probability rules of occurrence, which specifies the knowledge that native speakers have as to the relative frequencies of the items in each of the three major components. [Probabilistic competence had been proposed as one of the contributing factors to communicative competence. See Labov (1972) for further details.]

The framework described above is quite similar to the one proposed earlier by Munby (1978), which included:

1. sociocultural orientation
   a) acknowledgement of varieties of language (cf. Hymes 1972),
   b) "the rules of use and language features appropriate to the relevant social context" (p. 23),
   c) the specification of learners' communication needs;

2. sociosemantic basis of linguistic knowledge
   a) language is triggered by semantic options derived from the social structure (cf. Halliday 1973a, 1973b);
   b) communicative approach based on notional categories (e.g., semantico-grammatical and modal meaning and communicative function categories) (Wilkins 1972); and

3. discourse (e.g., speech acts, rhetorical acts).

The frameworks differ primarily in two ways. First, Munby's framework is predominantly sociolinguistic, as evidenced not only by his sociocultural rules of use but also by the sociosemantic determination of the linguistic (i.e., grammatical) component. Second, Canale and Swain's framework includes a component of communication strategies and a subcomponent of probability rules of occurrence. While the subcomponent of probabilistic rules of occurrence provides additional specifications as to language content (i.e., native speaker competence), the strategic competence component only describes the way in which non-native speakers manage to perform in the second language and, as such, does not seem to properly belong in a theoretical framework that describes the content of language.
Testing Communicative Performance

The extension of the notion of competence to include communicative as well as linguistic competence had ramifications in second language teaching not only for syllabus design, teaching methodology, teacher training, and materials development (see Canale and Swain 1980:31-34) but for language testing as well (Brumfit 1980). The testing of communicative competence has been an issue at least as far back as Spolsky's paper "What does it mean to know a language? Or how do you get someone to perform his competence?" (1968). He stressed that in order to tap underlying competence it would be necessary to resort to measures which would do more than just sample surface features of the learner's production of the language. The subject should be asked to produce novel utterances. Although this might best be accomplished in an interview, the cost of administration and the lack of reliability in scoring would rule out its use except as an adjunct to other procedures. In order to overcome the difficulties posed by procedures such as interviews, it was suggested that background noise be added to tests (Spolsky et al. 1968). The increased test difficulty posed by the noise was based on the principle of redundancy: language is normally highly redundant, but native speakers and more proficient learners will be able to perform better with reduced redundancy than will less proficient learners. Nevertheless, reduced redundancy in testing--background noise in dictation tests and missing elements in cloze tests--could not really be considered a true measure of communicative performance because of its failure to provide evidence of ability to use the language in real situations (B. J. Carroll 1980). According to Morrow (1979), language tests generally fail to take into account several features of language use:

1. that language use is based on interaction, especially face-to-face oral interaction;

2. that such interaction is unpredictable;

3.
that there are rules of appropriateness that vary according to the linguistic and situational context;

4. that every utterance for communication has a purpose;

5. that performance has been ignored in favor of idealized competence;

6. that authentic language is not modified specially for non-native speakers; and

7. that whether or not an interaction is successful depends on the resulting behavior (1979:149-150).

In order to remedy this general failure of tests to measure actual language use, Morrow recommends that communicative tests:

1. be criterion-referenced (measuring learners' ability to perform a set of language activities), rather than norm-referenced (measuring their ability relative to each other);

2. establish their own validity (in accordance with claims as to what it is that was supposed to have been learned and what it is that can be predicted from such performance), rather than concurrent validity (against other tests);

3. use qualitative rather than quantitative assessment; and

4. consider face validity more important than reliability (1979:150-151).

Morrow's decision to consider face validity before reliability seems to be based on what has been described as a "reliability-validity 'tension'" (Davies 1978). Attempts to increase reliability are likely to lead to the development of more objective-type tests, where students may not be required to produce the language but just recognize it (Robinson 1973). These more restrictive tests, while reliable, are not likely to have much face validity when one considers that the major aim is to measure real language use. Relaxing the criteria of reliability and validity will be necessary if a test is to be developed which truly measures ability to use language in real situations. Such tests, according to Morrow, must be performance-based if the intent is to test proficiency.
There may be, however, situations where the purpose of testing will be diagnostic, in which case it may be possible and even advisable to use discrete point tests to determine exactly which components of language have been acquired (Morrow 1979; Canale and Swain 1980). These discrete point tests, however, are not necessarily tests of individual structural items of the language, but rather, of what Morrow calls "enabling skills"--the sub-skills that must be successfully performed to accomplish a global task. These sub-skills are defined in operational terms and appear repeatedly in the carrying out of more global tasks. Morrow suggests that tests be developed "which measure both overall performance in relation to a specified task, and the strategies and skills which have been used in achieving it" (1979:153), since successful performance on the enabling skills does not entail successful performance on the global task. Performance-based tests also pose a problem in terms of assessment. Specifically, subjective evaluation of performance is likely to result in reduced reliability. In order to satisfy the requirements of face validity and some amount of reliability, Morrow suggests following the work of B. J. Carroll (1977), where judges are asked to evaluate performance in accordance with the following specifications: size and complexity of text, range of structures and functions, processing speed, flexibility, accuracy and appropriacy, independence from other sources, and repetition and hesitation in processing. Although the exact scoring technique would remain to be worked out, he suggests that an overall score be given, taking the stated specifications into account.8

Testing Considerations

Types of Language Data

In order to evaluate the proficiency of language learners, there must be a body of second language data to evaluate. There have been two major approaches to the analysis of second language data.
Some of the earliest studies of language production in the field of second language acquisition were case studies of individuals, usually the children of the linguists themselves (Wode 1981:68). These studies, longitudinal in nature, provided detailed information on the language development of a few individuals. In order to obtain more representative language data, later studies were cross-sectional in nature: many subjects at different stages of language development were assessed. The data for the earlier longitudinal studies were usually spontaneous; that is, the researcher established a collection timetable and either recorded or observed actual language production. Two major problems arise with the collection and analysis of spontaneous speech data: (1) a complete description of a learner's grammar would require a great deal of data, much of which is duplicative; and (2) if the purpose of the data was to try to verify the acquisition of a particular rule, an instance might not appear or it might even be avoided (Swain, Dumas, and Naiman 1974). In order to overcome the inadequacies of spontaneous data, therefore, various elicitation procedures were devised to force a learner to either produce a specific linguistic structure or render a grammaticality judgment about one (Corder 1981). Elicitation procedures have taken various forms and have met with varying degrees of success.

8 Support for an overall, synthetic impression based on the defining criteria rather than an analytic score based on each individual criterion comes from Mullen (1978). In an analysis of test scores given to subjects by two different judges on four different scales, Mullen found that there was an interaction effect between the judges and the scale and between the judges and the subject, which meant that the scores given each subject on the four scales were not independent but were, rather, affected by the judge.
One of the earliest and most popular procedures has been translation: translation into the native language is taken as a measure of second language comprehension; translation into the target language, a measure of second language production (Swain, Dumas, and Naiman 1974). The problems with the translation procedure are that it is artificial and decontextualized, and thus unnatural. Another popular procedure has been elicited imitation, where the subject is asked to repeat an oral model. A major problem with this procedure is the influence that memory capacity has on the task. Also, it is not yet clear whether it is even necessary for learners to be able to understand an utterance before it can be correctly produced. In a study reported by Swain, Dumas, and Naiman (1974) comparing several methods of data elicitation, it was found that errors in translation were similar to those in spontaneous production and in imitation. Nevertheless, there were some errors that occurred in children's translations that did not occur in spontaneous speech. The problems inherent in eliciting oral data become especially acute if the goal is to collect data that reflect how well learners can really communicate. Spontaneous production, however, will not always produce the desired results, especially if a particular structure is under investigation. In trying to devise the best elicitation procedure for the collection of sentential complements in unmonitored speech, Richards (1980) investigated a wide range of elicitation tasks. The tasks included:

1. oral reconstruction--retelling a story that could be in either the first or the second language;

2. oral picture composition, with and without the support of first seeing the text in the native language;

3. a contrived situation that would trigger the desired structure, with the stimulus in either the first or second language;

4. imitation; and

5. sentence completion.
In a pilot test of these tasks, it was found that the oral reconstruction and oral picture composition were too open-ended. While they provided a great deal of oral language data, they failed to focus sufficiently on the desired structure (i.e., sentential complements). Once these more open-ended tasks were eliminated, the primary investigation showed that the contrived situation--a task where a situation was given in the first language along with a question in the second language and the subject had to serve as interpreter--provided the greatest response rate, followed closely by the sentence completion task, and then by imitation.9 The advantage of the contrived situation over the sentence completion is that the content is controlled by the investigator. Another elicitation procedure that has recently become popular in language acquisition research is introspection. Unlike introspection by linguists used to establish a grammar of the language, introspection in second language research most often refers to the intuitions of the language learners themselves. A theoretical model for researching mental states has been suggested by Cohen and Hosenfeld (1981). They distinguish two activities: thinking aloud (observation of what has just been said, without an effort to control the thoughts) and self-observation (an inspection of a mental state). Self-observation in turn can be either immediate (introspection) or delayed (retrospection). In an attempt to study introspection in connection with oral communication, Glahn (1980) concludes that introspection is valuable for the knowledge it yields about learner interlanguage, but that its value lies in the qualitative descriptions and in its support of quantitative results derived from other elicitation procedures.

9 Imitation fared slightly better than the sentence completion task for the more proficient students; it was less valuable at the lower level, where learners were unable to process the utterance.
Glahn's research also leads her to conclude that introspection is more likely to occur at higher linguistic levels and would be quite valuable for the information it could provide on the nature of the language learning process. Inasmuch as spontaneously collected data often do not provide the desired structure, or provide much more information than is really needed, a preferred approach for eliciting particular grammatical items would probably be to use an elicitation procedure. Spontaneous data production could always be used as supplemental evidence and as confirmation of the results collected from an elicitation procedure.

Format

In designing an elicitation procedure, several factors must be taken into account. Once the focus of the test is decided upon, an appropriate format must be selected. Considerations that enter into the designing of a format include the mode, the stimulus, and the response.

Mode. The mode of the test refers to the general domain to be focused on. The test may focus on one of the four primary skill areas (i.e., listening, speaking, reading, writing) or on any combination of skills: oral (listening/speaking) or written (reading/writing); receptive (listening/reading) or productive (speaking/writing). It can focus on specific language items (discrete point) or be more global in nature (integrative).

Stimulus. The primary difficulty in designing elicitation procedures is the selection of a stimulus appropriate for the purpose. There are many factors that must be taken into account when selecting a stimulus: (1) What is the area of language to be tested? (2) What is to be the mode of the test? (3) Do the subjects share a common language background? (4) Are the subjects literate? (5) Are the subjects adults or children? It is of utmost importance that a stimulus not be so inappropriate that the language point to be tested cannot be elicited.
For example, a written stimulus may prove so difficult that it fails to elicit the desired language. A picture stimulus may be thought too childish by adult subjects. And a stimulus in the native language is out of the question with subjects from mixed language backgrounds.

Response. The selection of the format for the response should also not interfere with the elicitation of the desired language item. While most formats for stimuli are either aural or visual (written or pictorial), response formats for the most part are written and sometimes oral. There is a great deal of variability possible in written response formats, ranging from open-ended formats, including compositions, paragraphs, and story retellings, to closed-ended formats, including typical multiple-choice and fill-in-the-blank items. Such closed-ended formats, while not allowing for the initiative of the learner, do allow the examiner to focus on specific language items. An intermediate format that provides a greater amount of leeway in the responses, while restricting the performance to the task at hand, is known as a restricted-response format (B. J. Carroll 1980). Although all three formats are theoretically possible, one may be preferable to another given the language items to be tested and administrative considerations such as time and cost.

Scoring

The ease with which a test can be scored is likely to have an effect on the test format chosen. There is often a tradeoff between format and scoring: open-ended tests that are relatively easy to design are often extremely difficult to correct; closed-ended tests that are extremely difficult and time-consuming to design are relatively easy to correct. The primary consideration in scoring is how reliable the procedure is. If the scoring is objective (e.g., only one correct answer), there is generally no problem of reliability.
If, on the other hand, scoring is subjective, either there must be several people scoring the test or an elaborate scoring procedure must be worked out, or both. The more elaborate scoring procedures needed for open-ended tests often include considerations such as "flexibility, accuracy, appropriacy, independence, repetition and hesitation" (B. J. Carroll 1980:60), with subjects evaluated into levels of ability (e.g., basic, intermediate, and target levels).

General Considerations

Beyond the considerations of test content, format, and scoring are administrative considerations such as the time, effort, and money required to design, administer, and score the test. In addition to these economical considerations, there are those of relevance, acceptability, and comparability (B. J. Carroll 1980:13-16). Relevance is concerned with how well the test reflects the needs of the learner. Acceptability refers to whether or not the subject will accept the content and format of the test. And comparability asks whether the scores at different times and from different groups are comparable (i.e., reliable). For a communicative test of language performance, B. J. Carroll also adds the criterion of authenticity:

that the tasks undertaken should be real-life, day-to-day communicative operations ...; that the language of the test should be interactive discourse ...; that the contexts of the interchanges are realistic ...; and that the rating of a performance ... will rely on non-verbal as well as verbal criteria. (B. J. Carroll 1980:11-12)

The best test for a given purpose will be the one that finds an acceptable balance among all four elements and takes into account authenticity in assessing language performance that is more communicative. B. J. Carroll applies these criteria to each of the three major test formats based on the type of response required.
While the open-ended format seems to be the most authentic in its closer approximation of real language communication, it is not very economical to administer, and the results are often not comparable because of the high degree of subjectivity involved. On the other hand, the comparability and economical nature of closed-ended, objective tests cannot overcome their general lack of relevance to real communicative needs. The test format that seems to offer the best balance is the restricted-response format, with an objective scoring of somewhat subjective responses.

TESTING MODAL PROFICIENCY

Specifying Test Content

Establishing a Functional Framework

As outlined in Chapter I, informal observation had shown the area of modality to be a difficult one for non-native speakers of English. While performance in this area might conceivably be improved through instruction, there would be no way of determining how much improvement there had been unless a measure of performance had been given both preceding and following the instruction. The test of modality to be designed, therefore, should be capable of measuring proficiency and providing some diagnostic information as well. The amount of detail that would need to be provided might be termed "macro-diagnostic" rather than "micro-diagnostic" (Clifford 1983): the former refers to information regarding some basic patterns of deficiency; the latter refers to information about very specific errors--errors that might occur only occasionally. There would, therefore, need to be a sampling of the area of grammar known as modality, with a particular focus on the contrasts that non-native speakers find difficult. Since the area of modality is very broad, and test construction involves a great deal of time and effort, a decision was made to limit the area of language to be tested. The area selected (viz.
, deontic modality) was based on linguistic descriptions of the various aspects of modality, on research studies of modal acquisition, and on the requirements of the present study. In order to provide diagnostic information, the area of deontic modality would have to be sampled, with a particular focus on the contrasts that proved to be difficult for non-native speakers.10 Within the larger framework of language descriptions in general, modality lies somewhere between the level of basic grammar and the discourse level.11 The basic grammar level includes the basic meaning categories of grammar (i.e., concepts, notions), what has been elaborated in Wilkins as semantico-grammatical categories. The discourse level includes meaning in connected discourse. And the intermediate level includes such items as modal meaning, sociolinguistic meaning, communicative functions, logical communication, and pragmatics (e.g., moods, emotion, and attitude). Munby (1978) provides a useful functional/semantic framework for describing language.

10 A review of some theories that attempt to account for language difficulty can be found in Kellerman (1979). Moving beyond the earlier proposals that focused either on a contrastive analysis of the two languages involved (Lado 1957) or on the relative markedness of a particular structure in each of the languages (Eckman 1977), Kellerman suggests that perhaps difficulty should be looked at in terms of teaching difficulty (i.e., how much effort must be expended to make the learner aware of the problem and to produce the item correctly). More specifically, he recommends that learners provide intuitions about their own language while doing a contrastive analysis so that they will notice differences between the two languages and not transfer those items that should not be transferred.

11 For a comparison of overall language frameworks proposed by Leech and Svartvik (1975), Candlin (1976), and Wilkins (1976), see Munby (1978).
The communicative (i.e., intermediate) portion, known as sociosemantics, includes a description of communicative events. These events are then converted into micro-functions (semantic/pragmatic subcategories), each with its own linguistic realizations. It is the micro-functions of modality that need to be assessed here. They do not exist in isolation, however. Their use varies depending on speaker attitudes (cf. Munby's (1978) "attitudinal-tones," based on Wilkins's (1976) categories of "personal emotions" and "emotional relations"), sociocultural rules for use (e.g., topic, role, setting; register, style; see, for example, Hymes 1972, van Ek 1979, and Canale and Swain 1980), the rules of discourse (e.g., speech acts), and probabilistic rules of occurrence. In addition, it is expected that these micro-functions would interact with elements at the lower conceptual level (e.g., time, quantity; see Wilkins 1972, 1979). In establishing a test that would assess performance on each of the micro-functions selected, it would be helpful to have an inventory of possible linguistic realizations for each of the micro-functions. A reference work such as this, however, has not yet been produced and may never be. The best source to date is Leech and Svartvik's (1975) description of everyday English grammar (A Communicative Grammar of English). It is not sufficient, however, to rely solely on traditional grammatical descriptions. It is of utmost importance that the area to be evaluated include items that sample language use in different social contexts (Ochs, in press). The problem then becomes one of establishing what adult native speaker norms are for each social situation. Because of the idiosyncratic nature of any one person's usage in a particular area, native speaker intuitions should not be used (Labov 1969). The solution proposed by Labov, therefore, is to collect oral data from speakers of the language.
Such oral data would provide valuable information about the variable rules in a language--the knowledge a speaker has about how often and when to apply a particular rule. Such a quantitative analysis should then become an integral part of the structural description of the rule and of the grammar as a whole.

Functions of Modality

A functional framework has been adopted in this study in order to be able to include not only the modal auxiliaries but the adverbs and periphrastic phrases as well. A similar approach was taken by Wilkins in his work Notional Syllabuses (1976). Wilkins describes modality as having two primary functions: (1) a reporting function, organized into a scale of certainty and a scale of commitment, and (2) a speech act function of suasion (i.e., getting things done). A functional approach is also taken by van Ek (1979). With a minimum level of language proficiency (i.e., threshold) being defined in terms of specific situations, activities, functions, topics, notions, forms, and degree of skill needed, modality would fall under the category of language functions. In their major work delimiting thresholds for learners of English in all areas, van Ek and Alexander (1975) placed modality under the functions of "expressing and finding out intellectual attitudes" and "suasion." These functional categories did not completely coincide with the functions of deontic modality described in Chapter I. Whereas the focus of deontic modality was the function of compulsion/obligation (specifically, obligation, advice, and permission), these micro-functions have been placed in different categories by Wilkins (1976) and van Ek and Alexander (1975). They distinguish between the speech act function of suasion and the simple reporting of intellectual attitudes.
The result is that expressions of obligation and of permission are located within the reporting category, while expressions of advice, suggestion, and request are located within the category of suasion. Although it makes sense to distinguish the reporting function from suasion (actually getting someone to do something), it does not make sense to list obligation and permission as reporting and advice/suggestion/request as suasion. A decision was made, therefore, for the purposes of this study, to deal with the scale of compulsion/obligation as it applies primarily in situations where others are involved (i.e., you + modal expressions) and secondarily in expressions of reporting (e.g., I, he, she + modal expressions). The micro-function of suasion was selected as the primary focus because it is a more important function for learners to be able to handle: it allows them to have some control over others and to respond appropriately when others ask them to do things. The reporting of obligation incurred and promises given, on the other hand, is more subtle and less crucial for successful communication because it does not involve direct interaction with others.

Scale of Compulsion/Obligation

Three major micro-functions fall within the function of compulsion/obligation along a scale from strong to weak: obligation (or, negatively, prohibition), advice, and permission. For each of these functions there are two major kinds of variations--grammatical and social. Grammatical variations include elements such as past or non-past, affirmative or negative, statement or interrogative, and person (i.e., 1st, 2nd, 3rd; singular, plural). Social variations that apply specifically to the area of modality include the politeness or familiarity of the expression and whether or not the statement is endorsed by the speaker. Language areas also need to be specified for the domain to be tested.
For modality, the domains selected (from those given by Leech and Svartvik 1975) are as follows: (1) the geographical domain or national variety will be American English; (2) the mode will be oral; and (3) the level of formality will be informal (i.e., colloquial). Although modality could also be tested in the more formal, written mode, the informal, oral mode was selected because it was in this domain that the language production of non-native speakers seemed most problematic. In order to further delimit the content of the test of modality, it was decided to design primarily items that were grammatically non-past, affirmative statements in the second person. However, it was felt that the other aspects of modal production (i.e., past, negative, interrogative, and non-second person) were exactly those aspects that were likely to cause problems for non-native speakers and thus would be good discriminating test items and should not be ignored. Therefore, within each micro-function (i.e., obligation, advice, permission) it was decided to design at least one more-difficult item. For each of the variations of the micro-functions, possible linguistic realizations were listed (see Table 3). This list contains only affirmative/negative and declarative/interrogative contrasts, because these are the principal contrasts used within the function of suasion (i.e., getting someone to do something). The few items that would be designed to test the reporting function of obligation would most likely be those forms occurring in a different person and a different tense (e.g., 1st or 3rd person, past). When designing the test, it was felt that there should be a special focus on testing items that are in contrast (e.g., obligation vs. negative obligation, permission vs. obligation, obligation vs. advice).

TABLE 3

Summary of Micro-Functions and Their Linguistic Realizations

OBLIGATION
  MODALITY: have to ('ve got to), must, 'd better, need to, BE supposed to; Do ... have to? Must I? Need I? Am I supposed to?
  negate MODALITY: don't have to, don't need to, needn't; Don't ... have to? Don't ... need to?
  negate EVENT*: can't, mustn't, 'd better not, BE not supposed to

ADVICE
  MODALITY: 'd better, should, ought to, BE supposed to; Should I? Am I supposed to?
  negate MODALITY: 'd better not, shouldn't; Shouldn't I? Aren't I supposed to?

PERMISSION**
  MODALITY: can, may, BE allowed to; Can I? May I? Am I allowed to?
  negate MODALITY: cannot/can't, may not, BE not allowed to; Can't I? Aren't I allowed to?

*The negations of the event for the functions of advice and permission are difficult to obtain and have thus been omitted from this table. For a complete description of all the options possible, see Palmer (1979).

**The difference between refusing permission (may not, can't) and laying an obligation not to (mustn't) is that "with the former it is to be assumed that permission is normally required, while with the latter the speaker takes a positive step in preventing the action for which permission may not normally be required" (Palmer 1979:65).

Issues in Testing Modality

Types of Tests

Since the purpose of testing modality is both to assess proficiency and to provide diagnostic information in this area, it was decided that the best approach to data collection would be some sort of formal elicitation. While spontaneous production might be recommended for its greater face validity, formal elicitation procedures would reduce the amount of data that had to be collected and would allow an immediate focus on the modal structures. Nevertheless, spontaneous production that is guided in some way might also be used to combine the benefits of an elicitation procedure with the greater face validity of actual language production. Language learner intuitions about grammaticality might also provide useful supplemental information as to their proficiency in this area.
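As an illustration only, the realization inventory summarized in Table 3 above can be thought of as a lookup structure pairing each micro-function and variation with its listed forms. The sketch below is a minimal Python encoding of part of that inventory; the names `REALIZATIONS` and `realizations` are hypothetical, not part of the test instruments described in this study.

```python
# Hypothetical encoding of part of Table 3: each micro-function maps
# its grammatical variations to the example forms listed in the table.
REALIZATIONS = {
    "obligation": {
        "modality": ["have to", "must", "'d better", "need to", "BE supposed to"],
        "negate_modality": ["don't have to", "don't need to", "needn't"],
        "negate_event": ["can't", "mustn't", "'d better not", "BE not supposed to"],
    },
    "advice": {
        "modality": ["'d better", "should", "ought to", "BE supposed to"],
        "negate_modality": ["'d better not", "shouldn't"],
    },
    "permission": {
        "modality": ["can", "may", "BE allowed to"],
        "negate_modality": ["cannot/can't", "may not", "BE not allowed to"],
    },
}

def realizations(function, variation="modality"):
    """Return the listed forms for a micro-function/variation pair."""
    return REALIZATIONS[function][variation]

print(realizations("permission"))  # ['can', 'may', 'BE allowed to']
```

Such a structure would let a test designer enumerate the contrasts to be sampled (e.g., obligation vs. negative obligation) mechanically rather than by hand.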
Since there was no way of knowing exactly what kind of knowledge about the learner's competence each type of data would provide, it was decided that it might be best to design several different kinds of tests. Ingram states: "For any full assessment ... a number of different types of subtests are more likely to give an accurate picture than any single measure" (1978:12). A battery of tests could be designed and administered, and then the best test or composite could be determined through statistical analysis. It may also be discovered that while one subtest is best for measuring proficiency, a different subtest may provide the best diagnostic information. The use of a test battery may also allow the assessment of different kinds of knowledge (e.g., comprehension vs. production, linguistic competence vs. communicative competence).

Format

Since I first noticed non-native speaker difficulty with modals in oral production, I decided that the test should focus on the modal forms likely to be used in informal, spoken discourse. Ideally, the testing of oral production and comprehension should be done completely in the oral mode (i.e., through listening and speaking). Testing solely within the oral mode is not feasible, however, because of the difficulty in eliciting, recording, and scoring oral production. Since a battery of tests was to be used, it was decided that each test could be given in a different mode, as long as the language to be tested was spoken English. Of the three possible response formats described by B. J. Carroll (1980)--open-ended, closed-ended, and restricted-response--it was decided to use each format in a different subtest. It was hoped that the advantages and disadvantages of each type of format would balance each other out effectively and result in a test battery that could evaluate modal proficiency.
The open-ended format would be an integrative measure of communicative competence/performance; the closed-ended format would be objective, based on particular structures; and the restricted-response format would be based on language functions. The test subparts would probably best be administered in the order going from most to least open (i.e., open, restricted-response, closed) so that the open activity would not be influenced by information given in the more objective formats.

The format for the stimulus needed to be restricted in one major way: the stimulus could not be in the native language of the students, since the test was to be administered to a group of students of mixed language backgrounds. As a test of oral modal usage, ideally most of the subtests should have an oral stimulus. The problem with using oral data as a stimulus, however, is that the test may become more a measure of listening comprehension than of modal proficiency. Similarly, with any subpart that had a written component for its stimulus, care would have to be taken that the test was not measuring just reading comprehension.

Scoring

In a battery of tests with a variety of response formats, some tests are likely to be quite easy to score (i.e., the objective, closed-ended ones) while others are likely to pose severe difficulties (i.e., the open-ended ones that are normally scored subjectively). Since this was to be a diagnostic test of modal proficiency, it was felt that scoring procedures should be as objective as possible. Otherwise, subjective scoring would run the risk of turning the tests into measures of general language proficiency rather than of modal proficiency. While most objective tests (e.g., multiple-choice, true-false, matching) have one clearly correct response, any type of open-ended test is likely to have more than one correct response, and the more open-ended the task, the greater the number of possible responses.
Since the selection of a particular modal expression depends on a great many variables (e.g., perception of the roles of the participants, amount of perceived formality and politeness), it would be inadvisable for the researcher to impose one response a priori as correct. In order, therefore, to select the best answer for a particular test item, it was decided to score the tests based on the responses given by the group of native speakers taking the test. Although anything given by the native speakers would be considered correct, variable points would be awarded in accordance with the percentage of subjects responding in that way. The use of native speakers as the criterion and the weighting of responses in cloze tests is known as "clozentropy" and was first used by Darnell (1968; Oller 1979:372-373; Alderson 1980). This variable scoring would not only provide a weighted score on each of the subtests for each individual, but would also provide valuable information about the probabilities of occurrence of particular modal forms used by the native speakers. The relative occurrence of the forms as used by the native and non-native speakers on each item could also be compared, and conclusions could be drawn regarding the nature of modal interlanguage. This qualitative analysis of individual items rather than a quantitative analysis of entire tests has been strongly recommended by Alderson (1980).

Administrative Considerations

The development of any test is quite time-consuming, and any test is likely to go through several stages before it can serve the purpose for which it was devised (B. J. Carroll 1980). In addition, tests need to be revised constantly in order to ensure they are still measuring what they are supposed to and to refine the test items. Nevertheless, at some point in the test development, once some pilot testing has been done, a decision must be made to produce the test, even if it is likely to be revised in the future.
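The clozentropy-style weighting described above can be sketched in present-day terms as follows. This is an illustration only: the item, responses, and proportions are invented, and the actual scoring in this study was done by hand, not by computer.

```python
# Sketch of clozentropy-style scoring: a non-native response earns the
# proportion of native speakers who gave that same response; a form no
# native speaker produced earns zero. Data below are hypothetical.
from collections import Counter

def native_weights(native_responses):
    """Map each response to the proportion of native speakers giving it."""
    counts = Counter(native_responses)
    total = len(native_responses)
    return {resp: n / total for resp, n in counts.items()}

def score_response(weights, response):
    """Score one non-native response against the native-speaker weights."""
    return weights.get(response, 0.0)

# Hypothetical blank with ten native-speaker responses.
natives = ["should"] * 6 + ["ought to"] * 3 + ["'d better"] * 1
w = native_weights(natives)

print(score_response(w, "should"))     # 0.6
print(score_response(w, "'d better"))  # 0.1
print(score_response(w, "must"))       # 0.0
```

Summing such weighted scores over all blanks yields the per-subject subtest score, while the weight tables themselves record the probabilities of occurrence of the native-speaker forms.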
If certain items prove to be problematic following the test administration, they can always be eliminated from the scoring procedure or from the statistical analysis. Since the modal test to be developed would be administered to both native and non-native speakers, the test had to be appropriate for both groups. The items had to be easy enough for the non-native speakers to understand and relevant for the native speakers as well. Since modality is quite difficult for even advanced learners of English, and modal expressions are often not even produced by lower-level learners, it was decided that the test would be administered only to those learners who were at least at the intermediate level of English. Although subjects below the intermediate level could take the test, they would likely not perform very well and would be needlessly frustrated by the experience. As with all tests, this one too would have to be short enough to administer within a reasonable period (i.e., no more than two 50-minute sessions), yet long enough to get the desired information. If any part of the test required oral production that would have to be collected individually, there would be additional time considerations. Time could be saved if the oral portion could be administered in a group, but then arrangements would have to be made for audio-recording the production of a large group of subjects. Another administrative consideration would be the scoring of the tests. Although an attempt would be made to make the scoring as objective as possible, the scoring would not likely be objective enough to be handled by computer. First establishing native speaker norms and then scoring non-native speaker production against these norms would thus be extremely time-consuming with a very large group of subjects.
Previous Research Involving Elicitation Procedures

While the tests developed by Richards (1980) were not designed to elicit modal expressions, they were designed to elicit complements in the unmonitored speech of second language learners. One advantage Richards had was that the subjects all spoke the same native language, and thus it was possible to use the native language as a stimulus. In the final study with thirty subjects, Richards found that the best test format was the contrived situation, in which the stimulus was given in the native language and a question in the second language (i.e., English) was used to trigger a response in the second language. The subject was told to act as an interpreter for another person in attempting to answer the question.

In an interlanguage study of modal auxiliary production by ten secondary school students, Smith (1980) collected data using a written translation test with distractors and two oral interviews (actually, controlled conversations designed to elicit desired structures). Data from the translation exercise suggest that errors occurred in the translation due to structures in the native language, errors that might not have occurred had the data been elicited differently. For example, in an item designed to elicit the expression for negative obligation (i.e., not have to/need to), the German form muss elicited a great many interlanguage forms with must not. Performance in a parallel controlled conversation confirmed that the form used in the translation exercise must indeed have resulted from interference from German and that the more open-ended conversation did not produce the same errors. The use of translation from the native language can therefore be highly problematic.

In an attempt to find elicitation procedures appropriate for testing English modality, a pilot study was undertaken with twelve native and non-native speakers (Altman 1982a).
One test had a multiple-choice format with three distractors and was designed to determine whether or not the subject understood the meaning of the modal expression that was underlined in the test item. Results showed that non-native speakers often selected completely different responses from those chosen by the native speakers, thus indicating that the modals were not well understood. The task in the other test was to rank order seven modals from strong to weak: should, can, have to, BE supposed to, could, 'd better, and must. The resultant rank orders showed that non-native speakers felt should to be a stronger expression than BE supposed to, with 'd better listed as the weakest of all. While the results from each of these tests supported each other (i.e., in that the non-native speakers were failing to understand the import of certain expressions of modality), a completely different kind of test format would have to be devised to elicit modal production.

Designing the Test

Pilot Testing

In accordance with the requirements set forth above--that is, that the test be both a proficiency and a diagnostic test and that the area to be tested be the functions of the scale of obligation (deontic modality; speech act suasion)--a battery of tests was devised. The subparts corresponded to each of the three response formats that were outlined by B. J. Carroll (1980): open-ended, restricted-response, and closed-ended. In designing the exact format for each of the subparts, several factors had to be kept in mind.

1. At least three subparts would be needed so that the measurement of proficiency would not be biased by any particular test format.

2. There should be a combination of formats: spontaneous and elicited, subjective and objective, integrative and discrete point, and all skills.

3. There should be no native language stimulus (i.e., no translation task), and the modal itself should not appear in the stimulus (except for the grammaticality judgment task).

4.
The subparts should test different levels of language performance (e.g., communication, language functions or semantics, and individual language forms) and not just language structure.

5. The test should not be so designed that it tests one of the four major skills (i.e., listening, speaking, reading, writing) rather than modality.

6. The scoring procedure should take into account not only the actual responses given by the native speakers but the probabilities of their occurrence as well.

In accordance with these recommendations, each of the three subparts was designed to test a different level of language performance. The open-ended format was designed to test communicative skill through a controlled, spontaneous production task. The restricted-response format was designed to elicit linguistic structures for a given language function. And the closed-ended format was designed to elicit grammaticality judgments, where the forms differed either in meaning or appropriateness.

Of the three subtests, the first to be developed was the restricted-response format. Since the investigative study done by Richards (1980) had shown a contrived situation to be best for eliciting complements, it was decided that a similar contrived situation might also work for modals (with the stimulus in the second language, however). The restricted-response format was able to test exactly what the non-native speakers had trouble with: selecting a form to use in a situation requiring a modal. Later, the objective, grammaticality judgment task was added to see if non-native speakers could aurally distinguish appropriate modal constructions. Finally, a communicative task was devised in order to provide spontaneous oral data that would show modal usage. In order that the subjects not realize that the test was about modals, the parts were administered from the most to the least subjective (i.e., open-ended, restricted-response, closed-ended).
Each subpart was pilot-tested with a group of native speakers. With each pilot test, bad items were discarded and new ones devised. The parts were timed in order to determine a reasonable length of time for administration to non-native speakers. The pilot of the entire test battery, administered to another five native speakers, included a questionnaire regarding the length of the test, the comprehensibility of the directions and individual test items, the form of the test, and the face validity of the test as a measure of deontic modality. The comments by those who took the pilot were incorporated into the final version of the test. Most would have preferred a shorter test, so the number of items in the restricted-response format was reduced by almost half, and the number of grammaticality judgments to be made was reduced as well. In addition, since the directions for the restricted-response and closed-ended formats were confusing to the subjects at first, a sample item was provided for each. Probably the most important feedback provided by those taking the pilot test, however, was that everyone agreed that the test had face validity.

Restricted-Response Format

The restricted-response format evolved out of an attempt to approximate the modal language production task. When speakers are aware of a particular modal function or meaning they wish to express, they try to come up with an appropriate form. Given sufficient context, therefore, it might be possible to trigger a modal function. A blank could be inserted within the item at the point where the modal form itself should go. Although this type of format is very similar to the cloze format (where blanks are placed in connected discourse), Alderson prefers to refer to such formats as "gap-filling tests" (1980:60). The primary difference is that cloze deletions are normally pseudo-random (i.e., every nth word) while gaps to be filled in are rationally motivated.
An additional difference between the fill-in-the-blank test used here and a traditional cloze test is that the blanks in the modal test could contain any number of words. The blanks could not realistically be limited to one-word responses because of the need to test the periphrastic modal forms as well as the simple modal auxiliaries. Test items were developed over a period of time, with actually-occurring modal usage serving as the basis on which most of the situations were developed. Since this study was to be of modality within the domain of spoken English, the contexts used to trigger modal functions were dialogues. At first, items were constructed with only one blank. Later, it was decided that, since modals often occur in clusters over a series of several utterances, several blanks could be placed within a longer conversation. An example of this fill-in-the-blank type item is the sample item given preceding this section of the test (see Example 1). This example, like all test items in this section, contains an identifying title, a brief description of the situation preceding the dialogue, and labels as to the identity of the speakers. Sociolinguistic variables thus form an integral part of the test items: the locations of the dialogues vary, as do the age, sex, and role of the speakers. (See Appendix A for the complete set of Part II test items.)

The first set of test items was administered to five native speakers and one non-native speaker. Items were retained if there were no more than three distinct responses from the five native speakers; items were rejected if four or five different answers were given. Following this first pilot test, items were rewritten or added as necessary so that each of the major modal functions was being elicited. The six major functions were: obligation, negative obligation, permission, prohibition (or negative permission), advice, and warning.
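The retention rule used in this first pilot test can be sketched as follows. The item names and responses below are invented for illustration; the actual pilot data differ.

```python
# Sketch of the pilot item-retention rule: an item survives only if the
# five native-speaker responses fall into no more than three distinct
# answers. All data here are hypothetical.
def retain_item(responses, max_distinct=3):
    """Keep an item whose native-speaker responses show enough agreement."""
    return len(set(responses)) <= max_distinct

pilot = {
    "crossing":  ["shouldn't", "can't", "'d better not", "shouldn't", "can't"],
    "deadline":  ["must", "have to", "should", "need to", "'d better"],
}

for name, answers in pilot.items():
    print(name, "retained" if retain_item(answers) else "rejected")
# "crossing" is retained (3 distinct responses); "deadline" is rejected (5).
```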
JAYWALKING

A native Californian is talking to a recently-arrived international student.

Native: Where are you going?
Student: Across the street to get something to eat.
Native: Well, you ____________ cross there.
Student: Why not?
Native: There's a police officer standing over there who's likely to give you a ticket.

Example 1: Special English Test, Part II: Sample Test Item

The final set of test items was then administered to five additional people to make sure the items were comprehensible and to find out how much time native speakers required to complete this portion of the test. With just over 50 items, the native speakers required approximately 18 minutes (or 20 seconds per item). Since the non-native-speaking subjects would likely require at least twice as much time as the native speakers to do each item because of their lower reading speed, the number of items was reduced to 17, with 36 blanks, and 35 minutes were allotted for this portion of the test. In order to make sure that the more advanced subjects (i.e., the faster readers) would not have an undue advantage in being able to go over the test items, subjects were told that time was a factor in their score: they were to work as quickly as possible and at the bottom of their answer sheets were to indicate the number of minutes it took them to complete that portion of the test. It was hypothesized that the amount of time required for this subtest might correlate with reading ability.

Closed-Ended Format

The closed-ended, objective format was designed to elicit grammaticality judgments of modal expressions in given contexts. For each situation (preceded by an identifying title), subjects listened to two alternatives that one of the interlocutors could have said. The subjects then circled the number of the alternative that sounded best. Items contrasted either different modal functions (e.g., obligation vs. advice) or different modal expressions within a given function (e.g., must vs.
've gotta within obligation, can't vs. not BE allowed to within negative permission). The items in this subpart were specifically designed to elicit grammaticality judgments of similar functions and forms. In order for subjects not to be able to focus on the exact forms being contrasted, the alternatives were given orally. However, so that the test was not entirely dependent on listening comprehension ability, the situation itself that preceded the alternatives was written as well as spoken, and an orienting title was given before each item. The sample item given to the subjects can be seen in Example 2. The statements following the numbers in parentheses were presented to the subjects orally and did not appear on their test answer sheets. (See Appendix B for the complete set of Part III test items.)

GRADUATION REQUIREMENTS

At a lecture about graduation requirements, students were told:

(1) You must turn in all library books and pay all fines before you can get your diploma.
(2) You have to turn in all library books and pay all fines before you can get your diploma.

1    2

Example 2: Special English Test, Part III: Sample Test Item

Open-Ended Format

The open-ended format was designed in order to provide a measure of spontaneous modal production in a communicative situation. As with any test that focuses on a specific area of language, it is extremely difficult to be sure that any segment of spontaneous data will contain samples of the items needed for analysis. There was, therefore, a need to control the spontaneous production task in some way in order to elicit expressions of obligation, particularly in contexts where one person is directly trying to get another to do something. Since the modal functions of obligation are quite prevalent in oral interactions, an oral communicative event was required that would trigger the requisite forms. (See B. J. Carroll 1980 for a complete framework for testing communicative performance.)
Of the several communicative events possible, the one that seemed most productive involved an orientation procedure at a university. The communicative activity entailed listening to a lecture and repeating the lecture to a student who has arrived late. In order to determine what the exact content of the lecture should be, a pilot test was conducted with employees and Peer Advocates (students hired to assist international students) of the Office of International Students and Scholars (OISS) at the University of Southern California (USC). These subjects were asked to pretend that the researcher was a new international student who had come asking them for information about registration. Their instructions about how to register for classes were audio-recorded. A composite lecture was then made from the information contained therein. The lecture to be delivered to the subjects was written with a great many modal expressions of obligation and advice. Nevertheless, it was unlikely that the subjects would be able to remember them exactly, for two reasons. First, the lecture was to be listened to, and the notes the subjects received to help them recount the lecture consisted of noun phrases, with no verbs. Second, since this portion of the test was to be administered first, the subjects would have no way of knowing that the focus of the test was modality. (See Appendix C for a transcript of the lecture and the directions and notes received by the students.)

Administering the Test

Description of Subjects

In order to determine proficiency in the area of modal production and to gain diagnostic information as well, the modal test (hereinafter referred to as the Special English Test) had to be administered to a group of non-native speakers. Although it was necessary to get a range of non-native speakers, those at the lower end had to have a moderate command of the language if they were to be able to understand the directions.
Since it seemed that expressions of modality were late-acquired, it was felt best to test only those who would likely be proficient enough in the language to be producing such expressions.12 In addition to testing the non-native speakers, it was necessary to test a group of native speakers as well, for two reasons. First, since modality was hypothesized to be an area of English that might serve to distinguish non-native from native speakers of English, members from both groups would have to be tested. And second, since the situations and contexts were set up to allow for variable native speaker usage, it was the answers given by the native speakers that would serve as the criterion against which the non-native-speaker responses would be judged.

Non-Native Speakers

There seemed to be two distinct ways of selecting the non-native speakers to be tested: (1) from groups that had already been determined to be of a certain proficiency level or (2) from a population of indeterminate proficiency. In the first instance, the groups would be a range of (intermediate and above) classes selected at random from those offered at the American Language Institute (ALI) at USC. In the second instance, the sample would be taken from a population whose proficiency was as yet unknown--newly-entering international students at USC. For the purposes of this research, the first alternative was deemed not viable for two reasons. First, the fact that the test would be administered to relatively homogeneous groups whose proficiency level was known could conceivably affect the results of the tests.

12 One of the subjects at the lower end, for example, failed to understand the directions to Part I and, instead of retelling the information to another international student, read the directions into the cassette recorder. His scores were not included in the analysis.
For example, instructions given to the group might be geared to their proficiency level; expectations by the test administrator might be inadvertently conveyed to the subjects. Second, and more important, administration of the test to selected classes of students at the ALI would preclude the testing of non-native speakers at the upper level of proficiency, since those students are exempt from having to take English classes. The second alternative would allow the inclusion of students at the upper level of proficiency. Given that there seemed to be approximately three broad levels of ability--intermediate and advanced students at ALI and released students (those not required to study English)--it was decided that a sample size of 20 students per level would be appropriate.13 Since these students were already scheduled to take the International Student Exam (ISE) for English placement purposes and other tests as required by their departments, it was decided that it would not be reasonable to require them to take an additional test that was to be used primarily for research purposes. Since requiring the test was not a viable alternative, it was decided to seek volunteers instead. The use of volunteers as subjects is problematic in two ways: (1) in getting people to volunteer and (2) in making sure the results are not biased in any way. In order to get students to volunteer, it was decided to offer them two incentives. Since the Special English Test was scheduled to last approximately one hour, the first incentive was that they would be able to get the results from their English placement test (International Student Exam, or ISE) approximately one hour before everyone else in their group.

13 Since there was no way of knowing beforehand the level of the students, it was decided that the total non-native-speaker sample should include approximately 60 subjects.
The advantage to this would be that they wouldn't have to wait in line with 200 or so other students and that they could then go immediately to see their advisors. The second incentive was that they would be told how they did on the Special English Test and would be given some suggestions on how to improve their performance in the area covered by the test. These two incentives were quite different in nature, and it was hoped that they would offset any bias in the self-selection of the subjects. Getting the ISE results earlier might be an incentive to those who are impatient and eager to get a headstart; finding out how they did on the SET would likely be an incentive to serious students.14

Native Speakers

As with the non-native speakers, there were two basic ways to select native-speaking subjects. They could either (1) come from one homogeneous group or (2) be a representative sample of a larger population. Since the non-native speakers being tested were newly-entering students at USC, an appropriate group of native speakers might be a group of freshmen. Since all freshmen are required to take English, the test could have been administered to randomly chosen classes of Freshman English. Using Freshman English students as subjects, however, posed several problems. First, there was the same problem encountered with the non-native speakers; namely, they could not be required to take the test. Second, it was not likely that many would volunteer, and if any did, they would likely volunteer for very similar reasons (e.g., because they enjoy taking tests or enjoy language). And third, this group was not really comparable to the group of entering international students.

14 Several students expressed to the researcher how silly they thought the first incentive was (i.e., getting the ISE results early). Others said they wanted to take the test because they were eager to promote research endeavors.
Even though both groups are new to the university, the freshmen are likely to be much more homogeneous--and thus not representative of the rest of the population. Freshmen are likely to be of approximately the same age (18 or 19 years of age) and to have only a high school education behind them. International students, on the other hand, are likely to range in age from 18 to 24 and often higher and to come from a variety of educational backgrounds. In order to prevent the bias inherent in using a homogeneous group of subjects, it was decided to seek twenty volunteers from a wider population. The groups sampled included primarily a dance group, as well as some friends and neighbors.

Description of Non-Native Speaker Administration Procedures

Preparation

Since two parts of the test required the use of tape recorders for both listening and speaking, and since it was unfeasible to test each subject individually, it was decided to administer the test in a language laboratory that had a console for monitoring audio output.

Subject Recruitment

At the oral interview portion of the ISE, students were given an Information Sheet about their language background that all international students are required to complete (see Appendix F), along with a memorandum from the researcher (see Appendix D). The memorandum stated that approximately 75 volunteers were needed to take a special English test.15 The stated purpose of the research was to measure how well they could communicate in English. In order to make sure that only those students would sign up who had a modest command of the English language, the memorandum stated that the test was for students at the intermediate level of English and above. It did not matter if students imagined themselves at the intermediate level but weren't, as long as they realized that they were the ones who had placed themselves in the position of taking a test that might be too difficult for them.
Another safeguard against getting volunteers whose command of English was not adequate was the memorandum itself: if they could read and understand it, their English was probably good enough for them to take the test.

During the time the students spent waiting for someone to give them an oral interview, they asked the researcher to clarify the nature of the test and/or signed up to take the test. The sign-up took no more than five minutes. They were given a 3 x 5 card to fill out with their name, address, and phone number; a release form to date and sign; and an appointment slip on which would be written their name and the date and time of their test (see Appendix D). They then selected one of four test administrations that would take place at the end of the week and signed their name on a sheet of paper for the time selected. For cross reference, their appointment time was also written on their 3 x 5 card.

15 The number of volunteers needed was intentionally overstated in order to get more students to volunteer. Any student who volunteered during the first administration of the ISE would have been given the Special English test.

Test Administration

As the students entered the testing room, their names were checked off on the 3 x 5 cards they had filled out (which had previously been alphabetized according to test administration time). They were then told to take a seat and sign their name next to their seat number on a sheet that was to be passed around. There were three rows in the language lab (A, B, C) with eight seats in each row. For the written parts of the exam, they were to write the test administration date and time along with their seat number. For the speaking part of the test, the tapes for each test administration time were color coded,16 with the seat number written on each tape. Two complete sets of tapes could thus accommodate the responses from the subjects at all four test administrations.
Prior to beginning the test, the researcher explained that they were taking the test in a language lab in order to facilitate the collection of the speaking part of the test. Then the assistant in charge of the master console gave instructions about how to use the audio equipment for listening and for speaking. Following the test administration, students were told where they could pick up their placement test results and were reminded to notify the researcher of any change in address or phone number so that they could be contacted in order to be given the results from the test they had just taken.

The three parts of the test were administered in order. So that they would understand exactly what to do, subjects received the instructions to each part of the test visually and aurally. The aural presentation was from a master cassette played from the console into the headsets. During the portion of the test in Part I where speaking was required, the students were individually handed the cassettes corresponding to their seat number and reminded how to work the machines to record their voices. The entire test was designed to last less than an hour, though in actuality it lasted about an hour and a quarter because of the additional time needed for passing out and collecting materials and for reading the instructions. The lecture given by the researcher in Part I was approximately 3-1/2 minutes long. The students were allowed as much time as they needed to record their retelling of the lecture, but recordings didn't seem to last longer than 5 minutes.

16 Test Administration 1, given on August 25, 1983 at 1:30 p.m., was red. Test Administration 2, given on August 25, 1983 at 3:00 p.m., was green. Test Administration 3, given on August 26, 1983 at 9:30 a.m., was yellow. Test Administration 4, given on August 26, 1983 at 11:00 a.m., was blue.
A time limit of 35 minutes had been placed on Part II as the estimated maximum time required for any intermediate level student or higher to be able to complete the 36 items. During the first test administration, however, everyone was able to finish Part II within 30 minutes, so the maximum time allotted was revised to 30 minutes. 17 The length of Part III was pre-determined by the number of minutes it took for the 18 items to be listened to on cassette--approximately 8.

17 The time was reduced in order not to put undue stress on those who had already completed this portion of the test and were required to wait until everyone else had finished it before the next part could be administered. At a later test administration, however, there was one student who was unable to complete Part II within the requisite 30 minutes--most likely because his English language proficiency was relatively low compared with that of the other subjects.

Native Speakers

Preparation

Since the 20 native speakers were to be selected from among friends and acquaintances who, because of prior commitments, were not likely to be able to get together at any one time, it was decided to test them individually. When feasible, a few were tested at once, though the administration of Part I was always individual. The test was administered in various locations, with all attempts made to cause the least amount of inconvenience to the subjects.

Subject Recruitment

Since the native-speaking subjects were friends and acquaintances of the researcher, several precautions were taken in order to keep the sample from being biased: (1) no one was asked to take it who knew that the research was on modality; (2) no linguists were asked to take it because of the undue influence their knowledge might have on their responses once they realized the test was on modals; and (3) no people with Ph.D.'s were asked to take it because the non-native speaker population would not have included anyone with a Ph.D.
No other constraints were placed on the selection of subjects. In no case did anyone turn down my request out of fear of test-taking. They were all informed that the test was being given for research purposes and that whatever answers they gave would be correct, because they were native speakers.

Test Administration

Overall, the native speaker test administration was virtually the same as the non-native one. The administration did vary, however, in a few minor ways because of the nature of the group being tested and the nature of the testing arrangements. In order that the subjects be uniquely identified, the subjects were asked to sign a sheet of paper in the order in which they took the test. 18 They were asked to fill out an address card on which their identification number was placed. In order to make sure there had been no bias in subject recruitment, they were also asked to fill out a questionnaire with information about their background, including sex, age, native language, country or state of origin and principal residence, highest educational degree received, and occupation (see Appendix E). The question on native language was included just in case there were some subjects whose native language was technically not English. 19

Another way in which the native speaker test administration differed from that of the non-native speakers concerned Part I. Since the non-native speakers had recorded their answers to Part I directly into the microphone rather than telling it directly to another person, it was decided to have the native speakers record their answers in the same way. Thus, when the subjects were about to record Part I, the researcher left the room until they had finished. 20 All the responses from native speakers on Part I were recorded onto two master cassettes, with each speaker identified by number just prior to the recording.

18 The native speaker test administration was assigned the color white and the number 5 (since there had been four previous administrations).

19 In fact, three of the subjects in this group were technically non-native speakers (with native languages of Spanish, Russian, and Hebrew).

20 During a pilot of this part of the test, the researcher remained in the room for some of the recordings and left the room during others. A quick analysis of the modal expressions used seemed to indicate that when the researcher was not present, the forms tended to be more removed from the present tense (e.g., use of future time expressions such as will, BE going to). Although the cause could not be traced definitively to the researcher's presence or absence, it was decided that the researcher would remain out of the room for all Part I recording sessions in order not to bias the results in any way.

RESULTS AND ANALYSIS

Description of Test Correction Procedures

In order to check the relative performance of the non-native speakers and to compare their results as a group with those of the native speakers, it was necessary to code the responses on each test item, assign scores to each response, and determine a total score for each non-native speaker individually and for the non-native speakers and native speakers as a group.

Coding of Responses

Since the goal of the test was to assess learners' production in given situations, two of the three parts of the test (i.e., Parts I and II) required that the subjects provide responses that were either completely open-ended or guided in some way (i.e., restricted-response). Only Part III had a restricted set of responses. Therefore, it was necessary to code the responses of Parts I and II in order to be able to assign a score to each item.

Part I. In Part I, subjects retold a lecture they had just heard about how to register for classes. On the notes they received to guide them, there were what could be called 14 distinct episodes for which a modal expression might be used.
The 13th episode could be divided into two parts, the second one calling for an expression of negative obligation (e.g., don't have to). The various episodes along with the linguistic realizations as given in the original lecture can be seen in Example 3. Each linguistic realization for each episode was assigned a two-digit code. Only those episodes where the subject could be said to be trying to get the other person to do something (i.e., use of you + verb; cf. Wilkins's category of suasion) were coded. For example, in episode 5, the linguistic realization in the original lecture is not really an instance of suasion but a reporting statement, and thus would be coded as missing. In all cases of suasion, however, where obligation, advice, or permission was expressed, responses received a special code--even if the response did not contain a modal verb (e.g., episode 11--you list--coded as you + verb).

1. Peer Advocate (OISS)
2. Orientation program
3. Schedule of classes in packet
4. Location: Physical Education (PE) Building
5. Numbered stations
6. Station 1: Permit to Register
7. American Language Institute--JEF 150: E-hold
8. Station 2: F-hold (OISS; passport, I-94)
9. Station 3: Advisor
10. Station 4: Final check (International Admissions)
11. Classes on Permit to Register
12. Computer terminal: Fee bill
13. Health insurance (1)
14. Health insurance (2)
15. Pay fee bill (sponsored vs. unsponsored)

1. "The first thing you should do is to try to see a Peer Advocate ..."
2. "Then you should plan to go to the orientation program ..."
3. "... there will be a schedule of classes, which you should read very carefully."
4. "You have to go to many different places in the PE building in order to register."
5. "... the stations are numbered in the order you have to go to them."
"The first station you~_!£ go to i s Stat i on 1 • • • '' "You should fill it out and then go over to the American Language Institute ••• ~ " ••• you have to go back to the PE building, to Station 2 11 "After your holds have been removed you ~ to Station 3 • • • " 11 ••• and then ~ to Station 4 ••• " 11 ••• you list the classes you intend to take on the Permit to Register ''When you get your fee bill you should check it over 11 "You must buy health insurance in order~be covered in case anything happens to you." "If ••• then you don't have to buy hea I th insurance from tii"ei:inTVers i ty. 11 "Finally ••• you have to pay your fee b iII." II Example 3: Special English Test, Part I: Episodes and Their Linguistic Representations 99 Part II. For the restricted learners read a dialogue appropriate forms. The 36 response format and filled in blanks generally in Part II, blanks with required the use of expressions of modality. Again, as in Part I, each response was given a two-digit code. In order to limit the number of codes, it was necessary to restrict the variety of responses in some way. Therefore, for this part of the test, a master list of the native-speaker responses was made. Responses that seemed to be idiosyncratic (i.e., one of a kind) were listed as other. Non-modal responses were coded as long as they were given by at least 2 of the 17 native speakers. Then the non-native-speaker responses were coded using the codes assigned to the native-speaker responses. If it later seemed that the non-native speakers were consistently using responses not used by the native speakers, these were then assigned their own code. All other non-native-speaker responses were left blank so as not to be assigned any points during scoring. Part III. Since Part III was a closed-ended format, the coding was already taken care of. The responses to each i tern consisted of a 1 or a 2 that had been circled by each subject. 100 Evaluating and Scoring the Responses .fH.t I. 
For this portion of the test, what was of most interest was whether or not a modal-like form was used, regardless of its grammatical accuracy. In order to determine what forms would be acceptable, all responses given for each item by the native speakers were listed and deemed acceptable and appropriate. Since all of the episodes dealt with some sort of obligation or advice, the responses given seemed to indicate that the modal forms used to represent these functions might be interchangeable. That is, while one subject might use have to for one episode and need to for another, another subject might do just the opposite. Therefore, a decision was made to assess modal production not on each episode individually but on the communicative event as a whole.

The scoring procedure, though complicated, provides a general score of modal production, with quantity the major contributing factor. The procedure was as follows. A list was made of all the modal auxiliaries and modal-like forms that appeared in any of the episodes. Then, the number of episodes in which each item appeared was tallied. (With 15 episodes, the maximum number of appearances could not exceed 15.) The responses were weighted according to their frequency of appearance. They were divided into five groups, with the most frequently occurring receiving 5 points and the least, 1 point. A list of the forms along with the number of episodes in which they appeared and the number of points they were assigned can be found in Table 4. The points received by each subject were then summed to get a total score for Part I (maximum = 75 points: 5 points for each of the 15 items). 21

Part II. For this fill-in-the-blank portion of the test, the concern was not only with how many modal-like expressions were used but with their degree of appropriateness and grammatical accuracy as well. Again, the native-speaker responses were used as the metric against which to measure the non-native speaker responses.
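The Part I frequency-band weighting described above can be sketched in a few lines of Python. The episode counts below are copied from Table 4; the band cut-offs (five equal-width bands of three episodes each) are an assumption inferred from that table, not stated in the text, and the sample subject is hypothetical.

```python
# Sketch of the Part I weighting procedure. Assumption: five equal-width
# frequency bands of three episodes each, inferred from Table 4.

# Episode counts for some of the forms, copied from Table 4 (max = 15 episodes).
episode_counts = {
    "have to": 14, "you + VERB": 14, "need to": 10, "will + VERB": 10,
    "will have to": 8, "must": 7, "'ve got to": 5, "should": 5,
    "can": 3, "would + VERB": 2, "oughta": 1,
}

def points_for(frequency):
    """Map an episode frequency (1-15) onto a 1-5 point weight."""
    return (frequency - 1) // 3 + 1

weights = {form: points_for(n) for form, n in episode_counts.items()}

# A subject's Part I score is the sum of the weights of the forms used,
# one per episode (maximum = 75 over 15 episodes).
subject_responses = ["have to", "should", "need to"]   # hypothetical subject
score = sum(weights[r] for r in subject_responses)
print(weights["have to"], weights["must"], score)      # 5 3 11
```

The `(frequency - 1) // 3 + 1` rule reproduces every frequency-to-points pairing listed in Table 4 (14 episodes yields 5 points, 10 yields 4, 7 and 8 yield 3, 5 yields 2, and 1-3 yield 1).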
Since this portion of the test contained 36 individual items, each with a limited set of correct responses, the points were derived for each item individually. As in Part I, all native-speaker responses were listed and given a two-digit code.

21 Of the 65 non-native speakers who took the test, one had to be eliminated due to what might be termed mechanical failure. Although the subject could be heard retelling part of the information into the cassette recorder, most of the cassette was blank, without any apparent sound. Since there was no way of determining for certain whether the silence was due to inability to carry out the task or to mechanical failure, the subject was dropped entirely from the analysis.

TABLE 4
Special English Test: Part I Responses, Relative Frequencies, and Points Assigned

Response          Number of Episodes in Which It Appeared (maximum = 15)   Number of Points Assigned
have to           14                                                        5
you + VERB        14                                                        5
need to           10                                                        4
will + VERB       10                                                        4
will have to       8                                                        3
must               7                                                        3
've got to         5                                                        2
're going to       5                                                        2
should             5                                                        2
VERB               5                                                        2
can                3                                                        1
will need to       3                                                        1
would + VERB       2                                                        1
oughta             1                                                        1
don't have to*     1 (4)                                                    1 (2)

*Since the function of negative obligation appeared in only one of the 15 episodes, its frequency was determined based on the number of subjects who gave it as a response in episode 14. Of the 17 subjects, 4 (or 24%) used don't have to. Since 24% is equivalent to approximately 4 out of 15, don't have to was assigned 2 points.

Then the number of responses for each item was tallied, and the percentage of native-speaker subjects giving each particular response was determined. Item 9 (Example 4) is used here to illustrate the scoring procedure (see Table 5).

GROCERY STORE

A customer is talking to the clerk in a grocery store after paying for the groceries.

Customer: Oh, I really hate those new plastic bags. __________ I have a paper one instead?
Clerk: Sure.
Example 4: Special English Test, Part II: Item to be Scored

TABLE 5
Special English Test: Part II Sample Responses and Scoring (Item 9)

Response   Number of Respondents   Percentage of Respondents (Total N = 17)   Number of Points Assigned
could      6                       35                                         35
may        6                       35                                         35
can        3                       18                                         18
can't      2                       12                                         12

In those cases where two or perhaps three responses were clearly synonymous (e.g., had better, 'd better; BE allowed to, BE permitted to), the percentage was assigned to all synonyms taken together. Grammatically incorrect responses--rare among the native speakers, but common among the non-natives--were given only two-thirds credit. 22

If non-native speakers were to be compared to native speakers, then it was necessary that there be some consensus among the native speakers. It was decided, therefore, that there would have to be a clear set of responses used by at least 75% of the native-speaker sample in order for an item to be retained. Three of the items (#13, #6, and the first blank in situation 12; see Appendix A) had to be dropped or else it would have been necessary to use one or more idiosyncratic responses (i.e., those given by only one subject) and/or a response that was not accurate (even though given by a native speaker). 23

For the remaining 33 items in this section of the test, the points received for each item were summed to derive a total part score for each subject. With 33 items and a maximum score on each item of 100 (i.e., 100%), the maximum total part score was 3300 points. The advantage of using a scoring system based on percentage of use is that it reflects native-speaker performance and thus establishes probabilities of occurrence of each response for each item.

22 It was decided to give two-thirds credit since the percentage of subjects responding to each item was easily divisible by 3. That is, with seventeen subjects, a response by one subject would yield a score of 6 points (since 1 out of 17 is 6%), by two subjects, 12 points, etc.

23 A look at the responses given by the non-native speakers for one of these items (#6) showed there to be an additional difficulty. The sentence with the blank is: "There __________ dancing but the band cancelled at the last minute." Many of these students had difficulty with the existential there and often responded with: "was supposed to be/have a," "was to be," or "was a," which would all require the following word to be dance.

Part III. As in Part II, the total number of native-speaker subjects giving each response (in this case a 1 or a 2) was found. The response given by the majority of native speakers (i.e., at least half) was determined to be the correct response. However, since the selection of either 1 or 2 is possible by chance at least 50% of the time, only those items were included where the answer could not be due to chance. A binomial distribution with a probability of .05 shows that with a sample of 17, the selection of 1 of 2 responses by 76% (n = 13) could no longer be due to chance. All items, therefore, where the response was given by fewer than 71% (n = 12) were eliminated (i.e., items 2, 7, 10, 13, and 17). For the remaining items, native- and non-native-speaker subjects were awarded one point for each response that conformed with the response given by a majority of the native speakers. The points were then tallied for each subject, with 13 being the maximum number of points possible.

Statistical Analysis of the Test

The major purpose in devising a modal test for non-natives was to determine whether performance in this area could serve to distinguish different levels of English ability.
Prior to performing any analysis to see whether this was possible, it was first necessary to check the validity of the modal test (i.e., the Special English Test or SET) by seeing whether it correlated with an outside criterion (in this case, the International Student Exam or ISE). In order to most effectively carry out these analyses, however, certain preliminary information had to be gathered.

Preliminary Analyses

It was felt that the test would discriminate well at the upper levels of ability and thus might be very difficult for speakers at the lower level. Therefore, only students at the intermediate level and above were asked to take the test. Nevertheless, it was quite possible that students from lower levels might have volunteered as well. In order to verify this, all subjects were divided according to the levels in which they had been placed by the ISE, and a frequency count was made of the number of subjects falling in each level. The levels used at USC are: Intensive 200, Special 201, 24 201, 202, and Release. In addition, there were a few near-native speakers (those who appeared native but whose native language was not English) and a group of native speakers. The distribution by level appears in Table 6.

TABLE 6
Frequency Count of Subjects by Level

Level                             Number of Subjects
Intensive 200                      1
Special 201 (Low Intermediate)     1
201 (Intermediate)                10
202 (Advanced)                    25
Release                           28
Questionable Native Speakers       3
Native Speakers                   17
TOTAL                             85

Since there was only one subject each in the intensive level class and the Special 201 class, and this would not allow for generalizations to be made, these subjects were dropped from further analysis. The total number of non-native-speaker subjects was thus reduced by 3 (including the one deleted due to mechanical failure) from 65 to 62.

24 The Special 201 class is lower than the 201 class and is distinct from it since most students who take Special 201 are also required to take 201.

Of the 20 native speakers tested, the questionnaire revealed that 3 were technically non-native speakers. Since these 3 subjects had originally been asked to volunteer for this study because they had been assumed to be native speakers, their scores were submitted for analysis to determine whether they performed well enough to be so considered. If they did, they would be merged with the native speakers; if not, they would be dropped from the study. The total number of native speakers in the study then would be either 17 or 20.

It also seemed appropriate at this point to look at all the native- and non-native-speaker subjects as groups and see how their performance on each of the subtests of the Special English Test compared. The results from the statistics on SET Parts I, II, and III appear in Tables 7, 8, and 9, respectively. As can be seen, in almost every case there is an increase in mean score at each respective level. Only for the mean on Part I for the questionable native speakers and the native speakers was there a decrease. A closer look at their other scores showed them to be very similar in all instances. It seemed, therefore, that the questionable native speakers were probably proficient enough to be considered native speakers.

Lower scores at a higher level also occurred with the minimum scores on Parts I and II and the maximum score on Part I between the 202 and Release groups. The lack of clear distinction between 202 and Release can also be seen in the extremely small increase on the mean of Part I, and the relatively small increases on the means of Parts II and III. Further statistical analysis would be needed to determine whether the distinction between the two levels could be considered significant.
TABLE 7
Simple Statistics of Scores by Level: SET Part I (Rounded to Nearest Hundredth)

Level                          Mean     Standard Deviation   Minimum   Maximum
201                           32.80     14.87                20.00     56.00
202                           42.12     10.81                16.00     62.00
Release                       42.44     12.90                14.00     61.00
Questionable Native Speakers  51.33     12.70                44.00     66.00
Native Speakers               49.47     11.72                32.00     69.00

TABLE 8
Simple Statistics of Scores by Level: SET Part II (Rounded to Nearest Hundredth)

Level                          Mean      Standard Deviation   Minimum    Maximum
201                            633.50    172.34                430.00     984.00
202                            815.04    152.68                588.00    1166.00
Release                        962.78    229.21                357.00    1421.00
Questionable Native Speakers  1409.00    153.70               1251.00    1558.00
Native Speakers               1445.59    232.23                983.00    1790.00

TABLE 9
Simple Statistics of Scores by Level: SET Part III (Rounded to Nearest Hundredth)

Level                          Mean    Standard Deviation   Minimum   Maximum
201                            6.70    2.21                  3.00     10.00
202                            8.32    1.49                  5.00     10.00
Release                        8.89    1.65                  6.00     12.00
Questionable Native Speakers  10.67    1.15                 10.00     12.00
Native Speakers               11.47    1.18                  9.00     13.00

Establishing Concurrent Validity

While informal questioning of the pilot group who took the test had established face validity of the test on modality, it was also necessary to validate it against some outside criterion. It was for this reason that the Special English Test had been administered to incoming international students: they had just taken the placement exam (International Student Exam) that could be used to validate the test designed for the study. The validity of the ISE itself can be seen in the success with which it places students into levels in which they belong. Confirmation of the validity of the ISE lies in the fact that not one of the 62 international students taking the test was deemed misplaced by the teachers during the first week of the semester (known as reevaluation week). 25 In only one case did a student have to repeat the course (i.e., meaning perhaps that he should have been placed in a lower level).
Although the results from this test would have concurred with his placement, a closer look at the test results showed him to have scored extremely low on SET Parts I and III. Lower scores on these tests that depend on more oral skills (speaking and listening) might have indicated that the lower level class (i.e., Special 201) was more appropriate.

25 There were, however, four students who were waived by their department (i.e., were released from having to take courses at the ALI). Such waivers, however, do not signify that the placement was in any way erroneous.

The correlation matrix for the non-native speakers on both sets of tests appears in Table 10. As can be seen, the highest correlations appear to be between SET Part II and all parts of the ISE. The next highest set of correlations is an inverse correlation between the number of minutes needed to complete the test and all parts of the ISE except the speaking portion. As expected, the number of minutes it took subjects to complete Part II was negatively correlated with ISE test performance. Since this correlation was generally consistent with the correlation between SET Part II and the ISE, yet not as high, this test variable was eliminated from further analysis.

TABLE 10
Correlation Matrix of Non-Native-Speaker Scores on the SET and the ISE

                  SET I     SET II    SET III    MINUTES
ISE WORD SENSE    0.28123   0.69497   0.53466   -0.55953
ISE GRAMMAR       0.31554   0.50549   0.36020   -0.45229
ISE COMPOSITION   0.17010   0.50877   0.22062   -0.44354
ISE READING       0.19819   0.55878   0.32969   -0.46093
ISE LISTENING     0.33214   0.62042   0.37989   -0.50071
ISE THEME         0.20504   0.51695   0.49603   -0.50236
ISE SPEAKING      0.24936   0.41763   0.26600   -0.15328

Since the correlations of all parts of the ISE were so high on just one part of the modal test (i.e., Part II), an attempt was made to discover exactly how well the two tests related to each other.
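The coefficients in Table 10 are ordinary Pearson product-moment correlations between subjects' part scores. A minimal pure-Python sketch of the computation (the five-subject score vectors below are hypothetical illustrations, not the study's data):

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical part scores for five subjects (not the study's data):
set_ii = [633, 815, 962, 1409, 1445]   # SET Part II
ise_ws = [40, 55, 60, 80, 85]          # ISE Word Sense
print(round(pearson(set_ii, ise_ws), 3))   # 0.992
```

Running this over the real 62-subject score vectors for each (SET part, ISE part) pair would fill in the matrix of Table 10.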
Since there were three subparts on the SET as well as the seven subparts on the ISE, it was necessary to perform a canonical correlation analysis. Canonical correlation analysis is a procedure used to find the best linear combination for two sets of variables. A composite variable is derived for each set so as to maximize the correlation between the two composites. The canonical correlation procedure used here was run under release 82.3 of SAS (SAS Institute Inc. 1982).

Of the three canonical correlates that were calculated, only the first one was significant (p < .0001): R = 0.8278, adjusted R = 0.7920, R² = 0.6852, F = 4.1635, df = 21. The standardized canonical coefficients for the one composite SET variable and the one composite ISE variable that were significant can be found in Table 11. As can be seen from Table 11, the variable that loaded highest on the composite SET variable was SET Part II. 26 On the ISE composite, the variable that loaded highest was Word Sense followed by Listening. 27 These high loadings on the composite variables of SET II, on the one hand, and ISE Word Sense, on the other, show how similar these subtests are in what they are measuring and how strong an influence they each have on their respective composite variables.

26 SET III loaded high on the second composite variable, and SET I loaded high on the third, but neither of these composite variables was significant.

TABLE 11
Standardized Canonical Coefficients for the Significant Composite Variables of Each Test

SET Variable           ISE Variable
SET 1    0.2511        ISE Word Sense     0.5085
SET 2    0.7374        ISE Grammar       -0.0127
SET 3    0.3647        ISE Composition   -0.0193
                       ISE Reading        0.0188
                       ISE Listening      0.3207

When each of the variables is correlated with the opposing canonical variable, the relative strength of each of the variables can be clearly seen (Table 12).
While SET II can again be seen to correlate strongly with the canonical variable of the ISE, the sections of the ISE that now correlate best with the canonical variable of the SET are the word sense portion, the listening portion, and the theme.

27 The word sense portion of the ISE focuses on the function of words (as did the SET). For example, one of the sample questions given to students prior to taking the test is: "Cooking can be a very technique/technical/technician/leave blank science."

TABLE 12
Correlations Between the Subparts of Each of the Tests and the Opposing Canonical Variables

           ISE Canonical Variable                 SET Canonical Variable
SET I      0.3358              ISE Word Sense     0.7781
SET II     0.7451              ISE Grammar        0.5834
SET III    0.5318              ISE Composition    0.4984
                               ISE Reading        0.5821
                               ISE Listening      0.6795
                               ISE Theme          0.6136
                               ISE Speaking       0.4676

The results of the canonical correlation analysis thus show that the composite variable of the SET correlates significantly with the composite variable of the ISE (R = 0.8278). The ISE has been shown to be a useful predictor of language proficiency in the American Language Institute. Although teachers have the option of overruling ISE placement for an individual student, they rarely exercise that option. Therefore, the high correlation of the SET with the ISE provides evidence for concurrent validity.

Predicting Placement

In order to find out whether or not a test that focuses on one portion of English (i.e., modality) can successfully predict proficiency or placement, a procedure known as discriminant analysis was performed on the Special English Test and placement level. Discriminant analysis is a form of multivariate analysis in which continuous data from an independent variable (in this case, the Special English Test) are used to predict membership in the dependent variable, which is categorical (in this case, the different levels of language proficiency).
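As a much-simplified illustration of the idea behind discriminant analysis (and emphatically not the canonical discriminant function procedure the study actually used), a nearest-centroid rule assigns each subject to the group whose mean (SET I, SET II, SET III) profile is closest. The group profiles below are hypothetical values loosely echoing the means in Tables 7-9:

```python
# Toy nearest-centroid classifier: a simplified stand-in for discriminant
# analysis. All data here are hypothetical, not the study's scores.

def centroid(rows):
    """Mean vector of a list of equal-length score vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def classify(subject, centroids):
    """Assign a (SET I, SET II, SET III) profile to the nearest group centroid."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda level: dist2(subject, centroids[level]))

# Hypothetical per-group score profiles (SET I, SET II, SET III):
groups = {
    "201":     [[33, 634, 7], [30, 600, 6]],
    "202":     [[42, 815, 8], [44, 830, 9]],
    "Release": [[42, 963, 9], [45, 980, 9]],
}
centroids = {level: centroid(rows) for level, rows in groups.items()}
print(classify([31, 620, 7], centroids))   # 201
```

A real discriminant analysis additionally weights the variables (here SET II would dominate, as Table 13 shows) and yields membership probabilities rather than a bare nearest-group label.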
The exact procedure used for this study was canonical discriminant function analysis, in which successive linear combinations are tried in order to get the best fit, that is, to determine which regression equation can best be used to predict group membership. The analysis was run under Version M, Release 9.1 of SPSS (Nie et al. 1975). The placement variable consisted of four levels: 201, 202, Release, and Native Speaker. Since the results from the analysis showed that the three questionable native speakers (i.e., those who seemed to be native speakers but whose native language was not English) were actually predicted to be native speakers by the SET, their scores were merged with those of the native speakers for the remainder of the analysis.

The weighted values of each of the part scores on the one function that was significant appear in Table 13. This analysis again shows the importance of SET II in its ability to discriminate between levels (by virtue of the fact that it has the largest coefficient).

TABLE 13
Standardized Canonical Discriminant Function Coefficients for the SET

Subpart    Coefficient
SET I      0.23902
SET II     0.80021
SET III    0.46243

Discriminant analysis was also able to show the correct placement of individual subjects. For those who were apparently misplaced by the SET, the correct level was given, along with the estimated probability of the placement being correct. It then became possible to identify each misplaced individual and try to see why they were placed where they were by the ISE. At this point, details about their language background were determined from the International Student Information Sheet (Appendix F) to find out what might account for their different performance on the two tests. Of the 79 subjects, 20 were predicted to be in a different group from the group they were actually in: 9 were predicted to be in a higher group (2 of them two levels higher), and 11 in a lower group (3 of them two levels lower).
While it does not seem possible to say exactly what caused the misplacement in each case, there does seem to be a general trend: those placed by the SET into a higher group than their actual group usually had high scores on the objective portions of the ISE, but relatively low scores on the productive portions (i.e., written theme and oral interview).

Of special interest are the two subjects in the Release category who scored sufficiently higher on the SET to be grouped with the native speakers. Both cases are easily explained by their language background data. One was a graduate student from India whose native language was Punjabi but who spoke Hindi as well. He began studying English in school at the age of 5; English was also used as the language of instruction throughout his educational career. The other subject began studying English at the age of 9 in Greece. For the two years prior to coming to USC she was attending high school in California. It seems appropriate that both these subjects would be predicted to be of native speaker caliber because of their extensive contact with English. The use of English as the language of instruction in classes appears to be an especially good predictor.

One surprise in the test results was that one native speaker (not one of the questionable ones) scored low enough to be placed in the Release category. The low score received by this subject was caused by poor performance on SET II. The only possible explanation was that either the subject was not operating under optimum conditions or the subject used forms that are not normally used (perhaps forms of a more formal register).

Discriminant analysis was further used to show how well the SET was able to predict the placement of the subjects into the different levels (see Table 14). The total correct placement into four groups (i.e., 201, 202, Release, and Native Speaker) was 75.61%.
Looking at the placement of only the non-native speakers into three groups (i.e., 201, 202, Release), 69.35% were placed correctly. The placement predictions were best at the Native Speaker and 201 levels, and worst at the 202 and Release levels. A closer look at the significance of the differences between levels shows that the distinction between the 202 and Release levels just barely misses being significant (p = .0563). The other distinctions are clearly significant (see Table 15). The canonical discriminant function analysis thus showed how effective the SET was as a predictor of English language proficiency level. With 75.61% correct placement into the four levels, a canonical discriminant function correlation of 0.8552, and 98.89% of the variance accounted for, the SET (especially Part II) can be said to be a rather good predictor of language proficiency.

TABLE 14
Placement by the SET

                                Predicted Group Membership
Actual Group      No. of Cases    201      202    Release   Native
201                    10           8        0        2        0
                                 80.0%     0.0%    20.0%     0.0%
202                    25           3       17        5        0
                                 12.0%    68.0%    20.0%     0.0%
Release                27           3        4       18        2
                                 11.1%    14.8%    66.7%     7.4%
Native and             20           0        0        1       19
Questionable                      0.0%     0.0%     5.0%    95.0%
Native Speakers

TABLE 15
F-Scores of the Significance of the Difference Between Levels Determined by the SET*

Levels     202       Release    Native and Questionable Native Speakers
201        5.1677    11.2650    52.7380
           0.0028     0.0000     0.0000
202                   2.6277    45.5460
                      0.0563     0.0000
Release                         28.8020
                                 0.0000

*The F-score appears on the first line, the probability on the second. For the analysis with the three subparts of the test, the F-statistic has 3 and 73 degrees of freedom.

Discussion

In general, the results of the statistical analyses performed on the Special English Test support the claim that a test based on one portion of the grammar of English (i.e., modality) can be used to distinguish native from non-native speakers and to discriminate between different levels of non-native speaker ability.
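The placement percentages reported above follow directly from the cell counts in Table 14; the short arithmetic check below (Python, illustrative only) reproduces the 75.61% overall figure and the 69.35% figure for the non-native speakers.

```python
# Cell counts from Table 14: actual group -> predicted group.
counts = {
    "201":     {"201": 8,  "202": 0,  "Release": 2,  "Native": 0},
    "202":     {"201": 3,  "202": 17, "Release": 5,  "Native": 0},
    "Release": {"201": 3,  "202": 4,  "Release": 18, "Native": 2},
    "Native":  {"201": 0,  "202": 0,  "Release": 1,  "Native": 19},
}

# Overall correct placement: diagonal cells over all 82 cases.
total = sum(sum(row.values()) for row in counts.values())
correct = sum(counts[g][g] for g in counts)
print(round(100 * correct / total, 2))        # -> 75.61

# Correct placement among non-native speakers only (62 cases).
nonnative = ["201", "202", "Release"]
nn_total = sum(sum(counts[g].values()) for g in nonnative)
nn_correct = sum(counts[g][g] for g in nonnative)
print(round(100 * nn_correct / nn_total, 2))  # -> 69.35
```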
The test was also shown to have face validity and concurrent validity. And above all, the success of the test can be attributed in part to its ability to elicit the desired structures. Each of these findings is discussed in more detail below.

Data Elicitation

The data elicitation procedures used here certainly contributed to the success of the study, in that they succeeded in eliciting appropriate data. Each of the test formats was successful in its own way: Part I provided guided, spontaneous speech data; Part II forced the production of specific forms according to the functional contexts provided; and Part III made use of the subjects' intuitions as they provided grammaticality judgments on orally presented data. The greater weight of Part II in the prediction may be due in part to the wide range of scores possible, from a low of 357 points to a high of 1421 points (1790 for the native speakers). It may also be due to the nature of the task: subjects were required to fill in a form for a particular function, which exactly parallels what happens when people are trying to communicate. Communication involves knowing what you want to say and then figuring out how to say it. The validity of this measure lies, then, in how well it succeeds in eliciting modal constructions. With the responses given by native speakers serving to show whether or not an item was capable of eliciting a modal expression, all items not deemed successful were eliminated from the analysis.

The lower contribution of Part III to the prediction could be due to several factors. First, after items were eliminated that native speakers could not agree on at better than chance level, only 13 items remained. This may have been too few for the purposes of discrimination.
Second, while the oral format of the grammaticality judgment task was designed so that subjects would not have time to study the forms, it may have been too difficult a task for many of the non-native speakers. The items were often long, and the non-natives may not have been able to hold so much information in short-term memory. Also, their listening comprehension may not have been good enough for them to detect a difference between the items. And third, the results from this portion of the test may not have been reliable. When forced to make quick judgments without much information to support the decision, subjects might well not make the same choice if they had to take the test again.

The more open-ended nature of Part I provided spontaneous oral data but did very little to help predict placement. The low contribution of Part I to the prediction is most likely due to the nature of the test and the scoring procedure used. Although the spontaneous production task was guided, it was not limited enough to allow for objective evaluation. It would probably have been possible for raters to provide a subjective evaluation of proficiency, but such a global measure would not have provided a clear indication of proficiency in the area of modality. A much shorter task, or perhaps a few smaller tasks in which the information to be retold was given in segments of 20 seconds or so, would have facilitated the scoring procedure. The use of episodes to delimit the discourse worked fairly well, but the evaluation of responses within the episodes still proved problematic. Since the native speakers themselves varied so much in their responses, a scoring procedure was used that would allow for such variation. The results might have been quite different, however, had the episodes been scored independently instead of jointly (i.e., with scores based on the probability of the use of a modal in any one of the 15 episodes).
Validity

The Special English Test was deemed to have face validity by a group who had taken the pilot test, and to have concurrent validity based on the strong set of correlations (primarily of SET II) with all subparts of the ISE. All of this, of course, is based on the premise that the ISE itself is a valid measure of English language proficiency. The best evidence of the ISE's validity seems to be the success with which it places students into appropriate levels at the ALI. Based on the sample tested here, it correctly placed approximately 93.55% of the non-native subjects.28

28 Of the four students misplaced by the ISE (less than 10% at each level), all were similarly misplaced by the SET. What can account for this? A discussion with the placement board confirmed that the placement of two of the four had indeed not been based solely on their test scores. Two graduate students, who probably should have been in the 202 class, were released. Several departments at the university feel that students who score high on the objective portions of the test should be released from English classes, since little oral or written production is required of their students (e.g., engineering). Another student, who would have been predicted to be in 202 because of his high writing skills, was placed in 201 because of his poor speaking abilities. The special emphasis in the 201 class on oral production enabled him to complete the language requirements in that one semester, without having to take the 202 class at all. One additional student, who was predicted to be a 201 student, was actually placed in 202 with a special tutorial class to help with his low speaking skills.

The canonical correlation analysis done to determine which composite variable of the SET best correlated with a composite variable of the ISE consistently showed the major contributing factor to the SET to be Part II and the major contributing factor to the ISE to be Word Sense.
This finding was quite interesting, but not surprising, in light of the similarity between the two tasks: both deal with functions of language and focus on forms appropriate in a given context. It will be recalled that the Word Sense portion of the ISE requires that subjects choose which part of speech best completes a sentence. Similarly, Part II of the SET requires that subjects fill in a form appropriate to the context (i.e., a modal function). One finding that was surprising, however, was that the grammar portion of the ISE did not load highly on the first canonical variable and that the number of minutes taken to do Part II of the SET did not correlate as strongly as it might have with the reading portion of the ISE. The weak contribution of the grammar portion of the ISE might be explained by the type of test it is. The typical grammar item is a multiple-choice item that focuses on either discrete points of grammar, logical connectors, or sentence meaning. The focus on any one of these is quite distinct from the focus on word function and thus understandably does not contribute as much to the ISE canonical variable as does the Word Sense portion. Similarly, the reading portion of the ISE focuses on comprehension. Since the number of minutes it took to complete SET II was more a function of speed, it is understandable that there may not be a very strong relationship between these measures.

Placement

The discriminant analysis showed the SET to be a fairly good predictor of level: 76% for all four levels (201, 202, Release, Native Speaker); 69% for the three non-native-speaker levels. This contrasts with the 94% correct prediction of the non-native speakers by the ISE. Since it has been determined that some of the placement decisions were based on subjective assessment of some of the non-native-speaker variables (e.g., level in school, major), it might be safe to conclude that, for those students who were borderline, the placement could have gone either way. If such was indeed the case, then some of the students misplaced by the SET may not have been misplaced at all.

Another question that must be considered is why the distinction between the 202 and Release levels was not clearly significant for the SET (p = .0563). One possibility may be that the SET does not discriminate well at that advanced a level of English. Another possibility might be that there is very little difference between students at the 202 level and those who have been released. In fact, the 202 students study English for only an additional 60-90 hours before they are released, and the major reason they have been placed in 202 is that they have deficiencies in their written skills, not their oral skills--which is what the SET emphasized. A third reason for the lack of a clear distinction between 202 and Release might be that the placement of borderline students at that level, as at all other levels, may be based on subjective factors that have nothing to do with test performance. Nevertheless, since the ISE successfully discriminates between the 202 and Release levels (p < .0001), it must be concluded that the SET is not as efficient at these levels as it might be.

Is it possible to account for those students misplaced by the SET by looking at their personal variables? Although nothing definitive was found, some tentative hypotheses can be proposed. Based on the data provided on their information sheets, those students who were predicted to be released but were actually studying in 201 or 202 seemed to be of two types: (1) those who had had a brief but intensive oral exposure to English but perhaps had gaps in their written knowledge, or (2) those who had a great deal of knowledge about the language but were poor in productive skills (i.e., theme, speaking).
On the other hand, those students who were predicted to be in 201 or 202 but had actually been released most likely had a good command of the language from years of exposure, even though they might not have done very well on some of the discrete parts of the test.

CONCLUSION

In a study such as this, where a theoretical description of English is being extended to a more applied area of research, there are conclusions to be drawn not only about the research results themselves but about the application procedure and the original theoretical description as well. This study, therefore, has a great deal to contribute to the assessment of second language proficiency, to the diagnosis of modal proficiency, to second language testing procedures, and to descriptions of native speaker usage of expressions of modality. The specific contributions lead to suggestions for further research in the areas of language description and language acquisition, with general implications for theoretical linguistics, applied linguistics, and language research methodology.

Appropriateness of the Assessment of Modality

The main purpose in undertaking this study was to investigate whether assessment of modal performance could be used to distinguish native from non-native speakers and to discriminate different levels of non-native speaker proficiency. Analysis of the data collected from the administration of the Special English Test showed that the SET can indeed distinguish natives from non-natives. It can, in addition, successfully place almost 70% of the non-native-speaking subjects into one of three levels. A test of significance showed the difference between the two lower levels to be significant (p < .0001) and the difference between the two higher levels to just miss being significant (p = .0563). The relatively high percentage of successful prediction is especially interesting because of the nature of the test itself.
Why is it that a test focusing on modality should be able to predict, with a fair amount of accuracy, a person's general language proficiency? What does the test really measure? Was the success of the test due to the focus on modality or the format of the test, or both? What could be done to increase the percentage of successful prediction?

Modality as an Area of Assessment

As suggested at the beginning of this study, modality seemed to be an area of English structure that posed difficulties for learners of all proficiency levels. While beginning learners failed to use modal expressions at all, advanced learners didn't seem to be using them in exactly the same way native speakers would in the same situation. Since the subjects in this study were at the intermediate level and above, they generally knew when a modal was required. A glance at the data seemed to indicate, however, that a major problem was accuracy; that is, tenses were often misformed (e.g., must for had to; should for should have) or part of the modal expression was missing (e.g., suppose to for BE supposed to). While it might be claimed that native speakers too make the mistake of omitting the -d in expressions such as BE supposed to, non-native speakers did so more frequently. In addition, there were other errors made by the non-native speakers that were almost never made by the native speakers. On the other hand, even when the non-native speakers used the modal expressions correctly, they tended to select those that were used by native speakers either much less frequently or not at all (e.g., needed to for were supposed to; have to for had better).

The assessment of modality, then, served as a fairly good indicator of proficiency level because it is an area that allows some leeway in the manner of expression yet maintains fairly rigid requirements on the form these expressions may take. The variation in expression occurs in taking into account factors having to do with the situation (e.g., setting; sex, age, and role of the participants; degree of formality; degree of politeness) and with rules of appropriateness. The requirements, on the other hand, have to do with grammatical restrictions based on the syntactic context (e.g., past tense, periphrastic modal if preceded by a modal auxiliary). Since grammatical rules regarding modal form are much easier to specify than are the many variations in modal expression, the grammatical rules often become the focus of English language teaching programs. Consequently, learners are rarely exposed to the colloquial variants normally used by native speakers. Nor do they ever learn about the probabilities of occurrence associated with each variant for each function.

In addition to being good indicators of general language proficiency (due to the range of expressions possible), modal expressions are an integral part of language because of the link they provide between a simple statement and a statement modified in accordance with either the speaker's evaluation of its likelihood (i.e., epistemic modality) or the speaker's attempt to influence others (i.e., deontic modality). These modifications of propositions are an essential part of communication, for without them there would be no communication of speaker intent, attitude, or feeling. The essence of a language (and of one's personality) may, in fact, be conveyed more by such modal expressions than by the propositions themselves. The expressions of modality lend themselves to being studied as a group in that they are a fairly closed set, allow some variation, and appear continuously during the communication process.

The Success of the Test Formats

The characteristics of modal expressions that make them difficult to learn also make them difficult to assess. The lack of a complete description of informal, spoken usage makes it virtually impossible to test all possible variations.
The most that can be hoped for is a sampling of usage that accurately reflects overall modal usage. In order to increase the success of the predictions made by Parts I and III, and to improve on the success of Part II, there are two basic alternatives: either improve the test format or improve the scoring procedure.

For Part I, both alternatives would be recommended. The guided retelling task should be reformulated into several smaller, better defined tasks, each requiring a distinct modal expression. Scoring, still based on the percentage correct given by native speakers, would thus be facilitated.

For Part III, the basic format seems sound but must be further investigated. Although scoring is completely objective, based on native speaker responses, experiments should be conducted in order to determine the reliability of the format: Would subjects choose the same answer if the test were given again at a later time? Would they choose the same answer if the alternatives were reversed? If the items were in a different order? Would the format be easier if a larger number of sample items were provided (or if the first few items were not counted)? Would there be any effect if the alternatives were provided in writing instead of orally? Would the answers be consistent, or would the additional time and focus on form foster monitoring of language behavior? If so, which seems more indicative of the subjects' performance?

For Part II, the strongest subtest, the format is exceptionally good. Improvement, however, could come in several ways. First, items could be revised and new ones written where there was even greater agreement among the native speakers as to the appropriate modal expression(s). Second, scoring procedures could be modified to determine whether any of them would be better than the one used, based on probabilities of native speaker usage. For example, only the response given by the most native speakers might be considered correct.
Or no credit could be given for an answer that was improperly formed in any way. And third, the responses given by the most native speakers, along with the incorrect responses given by the non-native speakers, could be used to construct a multiple-choice test that, while perhaps not as good a measure, might be more practical in terms of scoring.

Beyond the Test: Suggestions for Further Research

In general, the battery of tests developed to assess modality served as an excellent elicitation procedure. The formats provided a balance of oral and written activities, requiring the use of all four major language skills (though writing was minimal). All three of the major response formats--open-ended, restricted-response, and closed-ended--were used. The quantitative analysis conducted above, however, fails to make full use of the data provided. The scoring of tests and the summing of points to derive one overall score per subject obscures much of the information available. Alderson (1980) has strongly suggested that a qualitative analysis be undertaken whenever possible. A qualitative analysis of the wide range of data collected here would be of utmost importance not only to work on second language acquisition but to work on language description as well.

First Language Research

Probably one of the most important outcomes of this research is that it can provide information on native speaker usage of modality. A detailed analysis of the tests taken by the seventeen subjects whose native language was English would provide an enormous amount of information regarding modal usage in spontaneous speech, usage of forms in given situations, and grammaticality judgments. The guided spontaneous speech data from Part I could provide information that goes beyond the study of modality.
The data could be subjected to any other type of discourse analysis; for example, length of the entire retelling, average length of utterance, pauses, immediacy of directions given (e.g., you + verb, you will + verb, you should/have to + verb), and level of English used (e.g., degree of formality).

Research on Part II could focus on patterns of responses. Does one subject tend to select one form to realize a certain function? Do subjects with specific characteristics tend to respond in a certain way? (Personal information as to sex, age, highest level of education, and place lived in at different periods of one's life is available.) Is modal usage highly systematic, or is it idiosyncratic? And most important, what are the probabilities of occurrence of the usage of these forms by native speakers?

Information about probabilities of occurrence could be provided by research on Part III as well. A sociolinguistic analysis could also be done to determine the conditions under which a particular form is chosen. In light of the fact that this portion of the test is highly objective, the best research would most likely be on the format of the test itself (e.g., assessing its reliability and altering the administration to see what further conclusions can be drawn).

The information provided by this extended research would be of help to general linguists in their theoretical descriptions of the grammar of modality. It would be helpful as well to sociolinguists interested in language variation among speakers. And it would help other researchers interested in data elicitation procedures to see the relative advantages of each of the test formats and to revise them as necessary to suit their own purposes.

Second Language Research

The results from this study would also be of special help to those doing research in the field of second language acquisition.
Results could be extended in two basic ways: (1) by looking at the performance of each individual to see how it compared with the group of native speakers as a whole and (2) by looking at the performance of all the non-native speakers (either by level or as a group on each test item).

The study of individual performance would provide diagnostic data about each learner's proficiency with reference to the criterion. The language background information could then be investigated in order to determine what variables, if any, could account for differential language performance among individuals. For example, what variables could account for the near-native performance of some of the non-native speakers? If any non-native speaker held these variables, would we then expect native or near-native performance? Similarly, what variables could account for the very low performance of some of the non-native speakers? And would we then expect low performance in all situations? Or are there some other intervening variables that are not readily apparent?

Second language research could also be done where the focus is the performance on individual test items rather than the performance by individuals on the test as a whole. A cross-sectional analysis could be done showing the order of appearance and/or acquisition of the various modal expressions. An interlanguage study could be done in which the subjects' responses are analyzed by level. In order to determine the relative acquisition of forms and functions, the items can be analyzed separately, and those items focusing on the same function could be analyzed together. It would be especially fruitful to see whether production of particular modal expressions on Part I supports their production on Parts II and III.

Implications

This study could conceivably have far-reaching implications not only for applied linguistics, but for language research methodology and theoretical linguistics as well.
The field of applied linguistics would be helped tremendously if theoretical linguists could provide more complete descriptions of the system of modality. Expressions of modality are absolutely essential for successful communication, and a complete description (including probabilities of occurrence) of the use of this system by native speakers in informal, spoken situations is not yet available.

A major methodological implication that arises from this study concerns the administration of proficiency tests to non-native speakers. Such tests should always be administered first to native speakers in order to be sure that the native speakers can perform on them successfully. The responses given by these native speakers can also serve as the criterion against which to measure non-native speaker performance. And in those cases where native speaker performance is likely to vary (as in modality, for example), the specification of probabilities of occurrence for the responses to the various test items is an especially good way to determine exactly how closely the non-native speaker performance matches that of the native speakers.

A further major implication for the field of applied linguistics concerns second language teaching theory. If modality is such an important area for non-native speakers to acquire, and if it is so problematic for them, do they ever completely acquire it? If so, how? Is it through a process of natural language acquisition? Or must they be taught? If it is acquired over time, can the process be speeded up in any way? The test described in this research provides a way of assessing the modal proficiency of non-native speakers of English who might be part of an experiment designed to address just such questions.

REFERENCES

Alderson, J. Charles. 1979. The cloze procedure and proficiency in English as a foreign language. TESOL Quarterly 13(2):219-227.

Alderson, J. Charles. 1980. Native and nonnative speaker performance on cloze tests.
Language Learning 30(1):59-76.

Altman, Roann. 1982a. Giving and taking advice without offense. Manuscript submitted for publication.

Altman, Roann. 1982b. Interlanguage modality. Paper presented at the Annual Meeting of the American Association for Applied Linguists, San Diego, CA, December 27-30. (ERIC ED 228 861)

Altman, Roann. 1982c. A quantitative analysis of English expressions of modality. Unpublished manuscript, University of Southern California.

Andersen, Roger W. 1981. Two perspectives on pidginization as second language acquisition. In New dimensions in second language acquisition, Roger W. Andersen (Ed.), 165-195. Rowley, Massachusetts: Newbury House Publishers Inc.

Austin, J. L. 1962. How to do things with words. London: Oxford University Press.

Bahns, Jens. 1981. Semantisch-pragmatische Aspekte des Erwerbs von Modalverben. Sprache: Lehren-Lernen, Band II. Tübingen.

Bahns, Jens. 1983. On acquisitional criteria. International Review of Applied Linguistics XXI:57-68.

Boyd, J., and J. P. Thorne. 1969. The semantics of modal verbs. Journal of Linguistics 5:57-74.

Brière, Eugene J. 1971. Are we really measuring proficiency with our foreign language tests? Foreign Language Annals 4(4):385-391. (Reprinted in Teaching English as a second language: a book of readings (2nd ed.), 1972, Harold B. Allen and Russell N. Campbell (Eds.), 321-330. New York: McGraw-Hill, Inc.)

Brumfit, Christopher J. 1980. From defining to designing: communicative specifications versus communicative methodology in foreign language teaching. Studies in Second Language Acquisition 3(1):1-9.

Brumfit, Christopher J., and Keith Johnson (Eds.). 1979. The communicative approach to language teaching. Oxford: Oxford University Press.

Campbell, Robin, and Roger Wales. 1970. The study of language acquisition. In New horizons in linguistics, J. Lyons (Ed.), 242-260. Baltimore, Maryland: Penguin Books.

Canale, Michael, and Merrill Swain. 1980.
Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics 1(1):1-47.

Candlin, Christopher N. 1976. Communicative language teaching and the debt to pragmatics. In Semantics: theory and application, Georgetown University Round Table on Languages and Linguistics, Clea Rameh (Ed.), 237-256. Washington, D.C.: Georgetown University Press.

Carroll, Brendan J. 1977. Specifications for a new English language examination. Royal Society of Arts, mimeo.

Carroll, Brendan J. 1980. Testing communicative performance: an interim study. Oxford: Pergamon Press.

Carroll, John B. 1961. Fundamental considerations in testing for English language proficiency of foreign students. In Testing, 31-40. Arlington, Virginia: Center for Applied Linguistics. (Reprinted in Teaching English as a second language: a book of readings (2nd ed.), 1972, Harold B. Allen and Russell N. Campbell (Eds.), 313-321. New York: McGraw-Hill, Inc.)

Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge, Massachusetts: The M.I.T. Press.

Clark, John L. D. 1978. Psychometric considerations in language testing. In Approaches to language testing, Papers in applied linguistics, advances in language testing series 2, Bernard Spolsky (Ed.), 15-30. Arlington, Virginia: Center for Applied Linguistics.

Clifford, Ray. 1983. Trade-offs in testing. Panel presentation at the Fourteenth Annual CATESOL Conference, Los Angeles, CA, April 15-17.

Close, R. A. 1975. A reference grammar for students of English. London: Longman. (Ch. 14: The verb phrase (6): The modals)

Coates, Jennifer. 1980. On the non-equivalence of may and can. Lingua 50(3):209-220.

Cohen, Andrew D., and Carol Hosenfeld. 1981. Some uses of mentalistic data in second language research. Language Learning 31(2):285-313.

Corder, S. Pit. 1981. The elicitation of interlanguage. In Error analysis and interlanguage, S. Pit Corder (Ed.), 56-64. Oxford: Oxford University Press.

Darnell, Donald K. 1968.
The development of an English language proficiency test of foreign students using a cloze-entropy procedure. (ERIC ED 024 039)

Davies, A. 1978. Language testing. Language Teaching and Linguistics: Abstracts 11(3,4):145-159, 215-231.

Eckman, Fred R. 1977. Markedness and the contrastive analysis hypothesis. Language Learning 27(2):315-330.

Ehrmann, Madeleine. 1966. The meanings of the modals in American English. The Hague: Mouton.

Fletcher, Paul. 1979. The development of the verb phrase. In Language acquisition: studies in first language development, Paul Fletcher and Michael Garman (Eds.), 261-284. Cambridge: Cambridge University Press.

Givón, Talmy. 1979. From discourse to syntax: grammar as a processing strategy. In Syntax and semantics, Volume 3, Talmy Givón (Ed.). New York: Academic Press.

Glahn, Esther. 1980. Introspection as a method of elicitation in interlanguage studies. Interlanguage Studies Bulletin 5(1):119-128.

Halliday, M. A. K. 1970. Functional diversity in language as seen from a consideration of modality and mood in English. Foundations of Language 6:322-365.

Halliday, M. A. K. 1973a. Language in a social perspective. In Explorations in the functions of language, M. A. K. Halliday (Ed.), 48-71. London: Edward Arnold.

Halliday, M. A. K. 1973b. Towards a sociological semantics. In Explorations in the functions of language, M. A. K. Halliday (Ed.), 72-102. London: Edward Arnold.

Halliday, M. A. K., and R. Hasan. 1976. Cohesion in English. London: Longman.

Hirst, William, and Joyce Weil. 1982. Acquisition of epistemic and deontic meaning of modals. Journal of Child Language 9:659-666.

Hymes, Dell. 1967. Models of the interaction of language and linguistics. Journal of Social Issues 23:8-28.

Hymes, Dell. 1968. The ethnography of speaking. In Readings in the sociology of language, Joshua A. Fishman (Ed.), 99-138. The Hague: Mouton.

Hymes, Dell. 1972. On communicative competence. In Sociolinguistics, J. B. Pride and Janet Holmes (Eds.), 269-293.
Harmondsworth, England: Penguin Books. Ingram, Elisabeth. 1978. The psycholinguistic basis. In Approaches tQ language testing, Papers in applied linguistics, advances in language testing series Z, Bernard Spolsky (Ed.), 1-14. Arlington, Virginia: Center for Applied Linguistics. Kellerman, Eric. 1979. The problem with difficulty. Interlanguage Studies Bulletin 4:27-48. Kellerman, Eric. 1980. An eye for an eye: on the permeability of learners' language to the translation equivalent of a body part word and its concrete extensions of meaning. Unpublished manuscript. Kratzer, Angelika. 1981. The notional category of modality. In Words, worlds, and context: ~ approaches in ~ semantics, Hans-JUrgen Eikmeyer and Hannes Rieser (Eds.), 38-74. Berlin, New York: Walter de Gruyter. 145 Kuczaj, Stan A. 1977. Old and new forms, old and new meanings: the forms function hypothesis revisited. Paper presented at the Society for Research in Child Development. New Orleans, LA, March 17-20. Labov, William. 1969. Contraction, deletion and inherent variability of the English copula. Language 45(4):715-762. Lado, Robert. 1957. Linguistics across cultures: applied linguistics for language teachers. Ann Arbor, Michigan: University of Michigan Press. Leech, Geoffrey N. 1971. Meaning and~ English~. London: Longman. Leech, Geoffrey, and Jennifer Coates. 1979. Semantic indeterminacy and the modals. In Studies in English linguistics iQL Rando~ Quirk, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik (Eds.), 79-90. London: Longman. Leech, Geoffrey, and Jan Svartvik. 1975. grammar of English. London: Longman. A communicative Lyons, John. 1977. Semantics, Volume 2. Cambridge: Cambridge University Press. (Ch. 17: Modality) Morrow, Keith. 1979. Communicative language testing: revolution or evolution? In The communicative approach to language teaching, Christopher J. Brumfit and Keith Johnson (Eds.), 143-157. Oxford: Oxford University Press. Morrow, Keith, and Keith Johnson. 1977. 
Meeting some social language needs of overseas students. Canadian Modern Language Review 33(5):694-707. Mullen, Karen A. 1978. Direct evaluation of second language proficiency: the effect of rater and scale in oral interviews. Language Learning 28(2):301-308. Munby, John. 1978. Communicative syllabus design. Cambridge: Cambridge University Press. Nie, Norman H., C. Hadlai Hull, Jean G. Jenkins, Karin Steinbrenner, and Dale H. Bent. 1975. SPSS: Statistical Package IQL the Social Sciences (2nd ed.). New York: Mc-Graw Hill Book Co. 146 Ochs, Elinor. In press. Variation and error: a sociolinguistic approach to language acquisition in Samoa. In The cross-cultural study of language acguisition, D. Slobin (Ed.). Hillsdale, New Jersey: Lawrence Erlbaum. Oller, John W., Jr. 1978. Pragmatics and language testing. In Approaches to language testing, Papers in applied linguistics, advances in languag§ testing series 2, Bernard Spolsky (Ed.), 39-59. Arlington, Virginia: Center for Applied Linguistics. Oller, John W., Jr. 1979. Language tests at school: ~ pragmatic approach. London: Longman. Palmer, Frank R. 1979. Modality ~ ~ English modals. London: Longman. Pea, Roy D., Ronald W. Mawby, and Sally J. MacKain. 1982. World-making and world-revealing: semantics and pragmatics of modal auxiliary verbs during the third year of life. Paper presented at the Seventh Annual Boston University Conference on Child Language Development, October 8-10, 1982. Richards, D. R. 1980. Problems in eliciting unmonitored speech in a second language. Interlanguage Studies Bulletin 5(2):63-98. Robinson, P. 1971. Oral expression tests. English Language Teaching 25(2,3):151-155, 260-266. SAS Institute Inc. 1982. SAS user's guide: statistics (1982 ed.). Cary, North Carolina: SAS Institute Inc. Schumann, John H. 1978. ~ pidginization process: ~ model fQI second language acguisition. Rowley, Massachusetts: Newbury House Publishers. (Chapter IV) Searle, J. 1965. What is a speech act? 
In Philosophy in America, M. Black (Ed.), 221-239. London: Allen & Unwin; Ithaca, New York: Cornell University Press. (Reprinted in Language~ social context, 1972, Pier Paolo Giglioli (Ed.), 136-154. Harmondsworth, England: Penguin Books Ltd.) Shepherd, Susan Carol. 1981. Modals in Antiguan Creole, Child Language Acquisition, and History. Unpublished Ph.D. Dissertation, Stanford University. 147 Smith, Mark A. 1980. The grammatical model in an error analysis investigation. Unpublished manuscript. Spolsky, Bernard. 1973. What does it mean to know a language? Or how do you get someone to perform his competence? In Focus on the learner, John W. Oller, Jr., and Jack C. Richards (Eds.), 164-176. Rowley, Massachusetts: Newbury House Publishers. Spolsky, Bernard. 1978. Introduction: linguists and language tests. In Approaches to language testing, Papers in applied linguistics, advances in language testing series z, Bernard Spolsky (Ed.), v-x. Arlington, Virginia: Center for Applied Linguistics. Spolsky, Bernard, Bengt Sigurd, Masahito Sato, Edward Walker, and Catherine Arterburn. 1968. Preliminary studies in the development of techniques for testing overall second language proficiency. In Problems in foreign language testing, John A. Upshur (Ed.), Language Learning, Special Issue 3(August):79-101. Steele, Susan, with Adrian Admajian, Richard Demers, Eloise Jelinek, Chisato Kitagawa, Richard Oehrle, and Thomas Wasow. 1981. An encyclopedia Qf AUX: £ stygy lD cross-linguistic eguiyalence. Cambridge, Massachusetts: The MIT Press. Swain, M., G. Dumas, and N. Naiman. 1974. Alternatives to spontaneous speech: elicited translation and imitation as indicators of second language competence. Working Papers on Bilingualism 3. Taylor, Wilson L. 1953. "Cloze procedure": a new tool for measuring readability. Journalism Quarterly 30(4) :415-433. van Ek, J. A. 1979. The threshold level. In The communicative approach to language teaching, Christopher J. 
Brumfit and Keith Johnson (Eds.), 103-116. Oxford: Oxford University Press. van Ek, J. A., and L. G. Alexander. 1975. Threshold level English. Elmsford, New York: Pergamon Press Inc. Wells, c. G. 1979. Learning and using the auxiliary verb in English. In Language development, Victor Lee (Ed.), 250-269. New York: Halstead Press (Div. of John Wiley & Sons Inc.). 148 Widdowson, H. G. 1978. Teaching language as communication. London: Oxford University Press. Wilkins, David A. 1972. Grammatical, situational and notional syllabuses. Proceedings Qf the Third International Congress of Applied Linguistics. Copenhagen: Julius Groos Verlag, Heidelberg. (Reprinted in The communicative approach to language teaching, 1979, Christopher J. Brumfit and Keith Johnson (Eds.), 82-90. Oxford: Oxford University Press.) Wilkins, David A. 1976. Notional syllabuses. London: Oxford University Press. Wode, Henning. 1981. Learning~ sec~ language. Tubingen: Gunter Narr Verlag. 149 Appendix A SPECIAL ENGLISH TEST: PART II 150 Special English Test Fall 1983 Part I I Vo no.t Wlr.Ue on .the .tu.t booki.e.t. For this part of the test you will read a situation and try to figure out what it's about. If there is a word you do not understand, try to guess what it means, but do not spend a lot of time on it and don't worry about it. After you read each situation and think you understand what it's about, go back and try to decide what word or words should go In the blank and write your answer on the answer sheet next to the number of each item. Put down the first answer that you think.of. Do not try to remember grammar rules. Put down what sounds right to you:--More than one word may go In each blank. You will have 35 minutes to complete this part of the test. How ever, the amount of time it takes you to finish the test will affect your score. Therefore, you should work as quickly as possible and put down the first answer that comes to mind. You wi 11 not Jose any points for a wrong answer. 
After the last blank on the answer sheet there is a blank line followed by the word "minutes." As soon as you finish the last test item, look up at the board at the front of the room and copy down the number you see there onto your paper. This is very important. After you have recorded the number of minutes, raise your hand and someone will come over and collect your paper. Do not go back over any of the test items. Your first impression is the most important. After your paper is collected, please wait quietly until the others have finished. Then we will continue with the third and final part of the test.

Part II Example

In order to help you understand what to do, an example is given below. Please read it silently and think about the answer you would put in the space. Do not write on this paper.

JAYWALKING
A native Californian is talking to a recently-arrived international student.
Native: Where are you going?
Student: Across the street to get something to eat.
Native: Well, you __________ cross there.
Student: Why not?
Native: There's a police officer sitting over there who's likely to give you a ticket.

What answer did you think should go in the space? Can you think of any other answers that might have fit there?

Part II

1. INSURANCE
A man approaches while his friend is on the phone. He is not talking but seems to be waiting for someone to come back to talk to him.
Man: What's the matter?
Friend: I'm having trouble with my insurance company. They don't want to pay for the things I had stolen out of my car.
Man: How come?
Friend: They say there was no sign that anyone had broken into the car. Someone must've just opened the button carefully and then locked it again. This is the second time it's happened and they're refusing to pay.
Man: Well, then you __________ change insurance companies.

2. CLASS ASSIGNMENT
The teacher and students are in the classroom.
Teacher: Okay everyone.
Please put your books away and take out a pen or pencil. On the paper I've just handed out I'd like you to write your name and today's date. (Pause.) Is everybody ready? (Students nod.) Okay, you may begin.
(Students look up at the teacher, not knowing what to do. Finally, one of them raises her hand.)
Student: Excuse me, Mrs. Bell, but what __________ do?
Teacher: Oh, I'm sorry. I guess I forgot to tell you.

3. STUDYING
A boyfriend is talking to his girlfriend about her final exams.
Boy: How's your studying coming along?
Girl: Not too good. As soon as I finish studying something I forget it.
Boy: Why don't I come over and help you tonight?
Girl: Oh, that would be nice but you __________.
Boy: I know, but I'd like to.

4. TRIP
A husband and his wife are about to leave on a trip across the country.
Husband: D'you wanna drive first?
Wife: Well, I just took some medication for my allergies.
Husband: Oh, then you __________ drive just yet.
Wife: Yeah. If I sleep a while, I'll be okay when I wake up.

5. EXAM
Two students who are friends are talking.
Student 1: I'm really nervous.
Student 2: What's the matter?
Student 1: I'm not doing too well in my English class and if I don't get at least a "B" on this exam I'm gonna fail the course.
Student 2: Well, then, I guess you __________ study.
Student 1: I think you're right. See ya later.

6. WEDDING
Two friends are discussing what they did over the weekend.
Friend 1: Hi. What did you do over the weekend?
Friend 2: Oh, I went to a friend's wedding.
Friend 1: How was it?
Friend 2: Not too good. It was pretty unexciting. There __________ dancing but the band cancelled at the last minute.

7. LONG DISTANCE CALL
A college student from Los Angeles goes home with his roommate during summer vacation. His roommate comes from Long Island, New York.
L.A.: I'd like to call my parents to let them know I got here okay.
N.Y.: Sure. The phone's in the hall.
Just dial the area code and the number.
L.A.: Really? __________ dial "1" first?
N.Y.: No, not here. Just in California.

8. LIBRARY
The librarian has just asked a student not to smoke while in the library.
Student: I didn't know we __________ smoke in here.
Librarian: Yes. Can you imagine what might happen if someone accidentally dropped a cigarette?

9. GROCERY STORE
A customer is talking to the clerk in a grocery store after paying for the groceries.
Customer: Oh, I really hate those new plastic bags. __________ I have a paper one instead?
Clerk: Sure.

10. DATE
A boy calls a girl up on the phone to make a date.
Boy: Hi. What're you doing tonight?
Girl: Nothing. Why?
Boy: I thought we'd get together and go to a movie.
Girl: Sounds great! But didn't you tell me you were seeing Bill tonight?
Boy: Yeah. We __________ go out but he called to say he couldn't make it tonight and we'd have to do it some other time.

11. MOVIE STUDIO TOUR
A tour guide is taking a group on a tour of a movie studio.
Guide: Excuse me, ma'am, but you __________ go in there.
Tourist: Why not? We're taking a tour of the studio, aren't we?
Guide: Yes, but that area is marked restricted and only employees __________ to go in there.

12. CLASSROOM
During class there's a knock on the door. The young woman asks to speak to her brother.
Teacher: But he's not here.
Woman: What do you mean he's not here? He __________ meet me here at 12:00.
Teacher: Well, he didn't come today.
Woman: I'm gonna tell my mother then.
Teacher: Yes, and you __________ tell your brother that if he doesn't start coming to class he's gonna be dropped from the program.

13. VISITING AN AMERICAN FAMILY
An international student is talking with an American student about what to do when he goes to visit an American family.
Visitor: I've got a problem. I wonder if you can help me.
American: What is it?
Visitor: Well, this American family I met when I first got here has invited me to spend the weekend with them. What __________ do?
American: Well, you can bring a gift for the hostess, some flowers or candy, or you can bring some wine or dessert.
Visitor: But if I've spent the whole weekend with them, __________ do something more for them?
American: Well, it would probably be a good idea to write a thank-you note afterwards letting them know how much you appreciated their hospitality.

14. GUN SHOP
A young man enters a gun shop and indicates that he'd like to purchase a rifle.
Clerk: I'm sorry, sir. You __________ buy a gun.
Customer: Why not?
Clerk: You are here in the U.S. on a student visa which means that you __________ wait six months before you __________ to purchase one.

15. FACULTY DINING ROOM
Two students enter the faculty dining room for lunch.
Maitre d': I'm sorry. The upstairs dining room is closed to students. You __________ eat downstairs.
Student: I didn't know we __________ eat up here.
Maitre d': Yes. Students __________ eat up here since last September.

16. PROFESSOR'S OFFICE
A student goes to her professor's office after class in order to discuss the grade she received on her midterm exam.
Professor: What seems to be the problem?
Student: I don't understand why I failed the exam.
Professor: Well, let's see. (Pause.) Oh, yes. You __________ answer all five questions but you only answered three.
Student: Where does it say that?
Professor: On the top of the first page, in the directions.
Student: Oh. I thought we only __________ answer three questions. I guess I __________ read the directions more carefully.
Professor: Yes. That would've been a good idea. And you __________ read the directions carefully on the final or you'll fail the course.
Student: I will. I don't want to fail this course, because if I do, I __________ drop out of school.

17.
REGISTRATION
A student enters the registration area in order to register.
Checker: __________ I see your registration materials please?
Student: Yes. Here they are.
Checker: I'm sorry. You __________ register yet.
Student: Why not?
Checker: Because you have an E-hold on your Permit to Register and you __________ get it removed before you __________ register.
Student: What __________?
Checker: You __________ go over to the American Language Institute and they'll take care of it for you. Have you taken the International Student Exam?
Student: No.
Checker: Well, that's the first thing you __________ do.
Student: But I'm from Hong Kong. My native language is English. __________ I register without taking the test?
Checker: No, all entering international students __________ take the test.

Part II Answer Sheet

(Date) (Time) (Seat Number)

1. ____  2. ____  3. ____  4. ____  5. ____  6. ____  7. ____  8. ____  9. ____
10. ____  11. ____  12. ____  13. ____  14. ____  15. ____  16. ____  17. ____

Now, look up at the board and write in the space below the number that is written there.

__________ minutes

Appendix B

SPECIAL ENGLISH TEST: PART III

Part III Script

1. PASSPORT
Two Americans are talking and one says she's thinking about going to Europe over the summer. Her friend says:
(1) Well, you oughta get a passport.
(2) Well, you need to get a passport.

2. DOCTOR
A doctor is talking to a patient about the dangers of taking certain medication and driving.
(1) You shouldn't drive after taking this medication.
(2) You can't drive after taking this medication.

3. LOST WALLET
Two teenagers who went to a party together are getting ready to leave. The one who drove there realizes he left his wallet at home, so his friend says to him:
(1) Well, you'd better let me drive.
(2) Well, you must let me drive.

4. BUS STOP
A mother is waiting for a bus with her son who suddenly asks why they're standing there if the sign says "No Standing." The mother laughs and says:
(1) That just means that cars had better not stop here because this is where the bus stops.
(2) That just means that cars can't stop here because this is where the bus stops.

5. MOVIE
Two friends are discussing a movie one of them is going to see. One asks how long it'll take to get there and when to leave if the movie starts at 9:00. His friend says:
(1) It's 8:00 now so you should leave right away.
(2) It's 8:00 now so you have to leave right away.

6. MONEY CHANGER
A woman is trying to get change for the laundry and puts a dollar bill into the money changer but nothing happens. Another woman comes over and says:
(1) Oh, in order to get it to work you have to use a new bill.
(2) Oh, in order to get it to work you're supposed to use a new bill.

7. DINNER RESERVATIONS
On Saturday afternoon a group of friends are deciding where to go for dinner that night. One suggests a fancy French restaurant downtown. One of the others responds:
(1) We can't go there; you must have reservations.
(2) We can't go there; you've gotta have reservations.

8. PICNIC
A group is planning a picnic and someone calls up asking for information about what to bring. The person who is organizing the picnic says:
(1) You'd better bring your own food.
(2) You're supposed to bring your own food.

9. WAITING ROOM
A woman turns to her husband who is about to leave the airport waiting area and says:
(1) You mustn't go too far or you'll miss the plane.
(2) You'd better not go too far or you'll miss the plane.

10. SHOES
A sales clerk walks into the dressing room and sees a customer trying on a pair of pants with her shoes still on. The clerk politely says:
(1) Excuse me, but you're not supposed to try clothes on without taking your shoes off first.
(2) Excuse me, but you shouldn't try clothes on without taking your shoes off first.

11. VITAMINS
A man and a woman are eating lunch together and he's just having soda and potato chips. She says to him:
(1) If you're gonna eat that way you must take vitamins.
(2) If you're gonna eat that way you oughta take vitamins.

12. STUDENT
A teenage son asks his mother if it's all right if he takes the car out for a drive. She asks him if he's finished studying for his exams the next day. He says he hasn't but that it's such a nice day out that he doesn't want to stay in. She answers:
(1) You'd better study when you get back then.
(2) You should study when you get back then.

13. NO SMOKING
The flight attendant calls out to a passenger smoking while the plane is about to land:
(1) Sir, you mustn't smoke while the NO SMOKING sign is lit.
(2) Sir, you shouldn't smoke while the NO SMOKING sign is lit.

14. EXPIRED LICENSE
A woman in a store goes to make a purchase with her credit card. The clerk asks to look at her driver's license and sees that it expired a month earlier. The clerk says:
(1) You know, you have to renew your license.
(2) You know, you need to renew your license.

15. LOST KEYS
A man and his wife are about to leave their hotel room to go sightseeing. He says to her:
(1) You shouldn't close the door yet; I can't find my keys.
(2) You'd better not close the door yet; I can't find my keys.

16. DICTIONARY
The proctor of an exam sees one of the students using a dictionary and says:
(1) You're not allowed to use a dictionary during the exam.
(2) You can't use a dictionary during the exam.

17. PHARMACIST
A patient has just gotten some pills from the pharmacist, who is telling her how to take them because of the stomach problems they might cause:
(1) You're supposed to take these pills after meals.
(2) You should take these pills after meals.

18. REPRIMAND
A father is yelling at his young son.
(1) You mustn't talk back to your mother like that.
(2) You can't talk back to your mother like that.

Appendix C

SPECIAL ENGLISH TEST: PART I

Special English Test
Fall 1983

Registration Lecture
"How to Register for Classes"

In order to get ready to register for classes, there are three things you should do.
The first thing you should do is try to see a Peer Advocate in the Office of International Students and Scholars (OISS) who will help you with any questions you might have. Then you should plan to go to the orientation program which is held the week before registration begins. During orientation the registration procedure will be gone over in detail. And third, in the packet that all international students receive during orientation, there will be a schedule of classes, which you should read very carefully. It not only tells you when and where classes meet, but also tells you exactly how to register.

Registration takes place in the Physical Education (PE) building. You have to go to many different places in the PE building in order to register. Each place is called a station, and the stations are numbered in the order you have to go to them. The first station you have to go to is Station 1 where you pick up your Permit to Register. You should fill it out and then go over to the American Language Institute Administrative Offices which are located in the Jefferson Building, Room 150. They will remove your English or E-hold by either registering you for a class at the ALI or by releasing you. After your E-hold has been removed, you have to go back to the PE building, to Station 2, to get your F-hold removed. In order to get it removed you have to bring your passport and your I-94 with you. After your holds have been removed you go to Station 3 where you can see an advisor from your department. After you talk to your advisor you list the classes you intend to take on the Permit to Register and then go to Station 4 where the people from International Admissions will check to be sure all your papers are in order.

After Station 4 you follow the same procedure as all other students at the school. You wait in line for a computer terminal. The computer operator will enter the courses you listed on your Permit to Register and a fee bill will be generated.
When you get your fee bill you should check it over to make sure you got the classes you wanted. Next you stop at the health insurance station. You must buy health insurance in order to be covered in case anything happens to you. If you have health insurance from your country and you show proof that your insurance company has a claims office in the U.S., then you don't have to buy health insurance from the university. Finally, after you have signed up for health insurance, you have to pay your fee bill. If you're a sponsored student you should bring with you the letter stating that payment is guaranteed by the sponsoring agency. If you're not a sponsored student then you have to pay your fee bill yourself. You can charge the fees or pay by check or cash--though most students don't pay cash.

Special English Test
Fall 1983

Part I General

The English exam you are going to take tests how well you communicate in certain situations. It consists of three parts:

Part I: Listening and Speaking (10 minutes)
Part II: Fill-in-the-Blanks (35 minutes)
Part III: Multiple Choice (10 minutes)

Part I

On the cassette you are going to listen to, you will be given instructions about how to register for classes. After you finish listening to the instructions, you will be asked to repeat them to another international student who has just arrived and asks you what to do to register for classes. In order to help you remember what to tell the other student, you can use the notes below as a guide.

"How to Register for Classes"

A. Getting Ready
1. Peer Advocate (OISS)
2. Orientation program
3. Schedule of classes in packet

B. Registration
1. Location: Physical Education (PE) Building
2. Numbered stations
3. Station 1: Permit to Register
4. American Language Institute--JEF 150: E-hold
5. Station 2: F-hold (OISS; passport, I-94)
6. Station 3: Advisor
7. Station 4: Final check (International Admissions)
8. Classes on Permit to Register
9. Computer terminal: Fee bill
10.
Health insurance
11. Pay fee bill (sponsored vs. not sponsored)

Part I Directions

Now pretend that it is 5 P.M. and that you have just left the Office of International Students and Scholars (OISS) where someone told you all the information you just heard on the cassette. As you are leaving the office, you meet another international student who has just arrived on campus. Since the office has just closed for the day and there is no one around to help him, he says to you: "Excuse me. Could you tell me what to do to register for classes?" You know that it is very important for him to get the same information you did, especially since the orientation program is scheduled for the next day. So you respond: "Sure." and then begin talking directly to him (that is, into the cassette recorder) and tell him exactly what to do to register for classes. You may use the notes you received as a guide.

Appendix D

PREPARATORY PAPERS

CAMPUS MEMO

TO: International Students
FROM: Roann Altman, Instructor, ALI; Ph.D. Candidate, Linguistics
DATE: August 1983
SUBJECT: Volunteers Needed to Take Special English Test

I am working on my Ph.D. in Linguistics and need to administer a test to approximately 75 international students. The test will measure how well you communicate in English in certain situations. There are three parts to the test: a speaking part, a fill-in-the-blank part, and a multiple-choice part. The test will have absolutely no effect on any other test scores or classes you take at U.S.C.

Questions and Answers About the Test

Who can take the test? The test is for students at the intermediate level of English and above.

How long will it take? It will take about one hour.

When will it be given? It will be given Friday, August 26, 1983.

Where will it be given? It will be given in Taper Hall of Humanities (THH).

There are two advantages to taking this test.

1.
You will get your International Student Exam (ISE) test results approximately one hour earlier than the others in your group.
2. Sometime in the fall I will meet with you in groups to tell you how you did on the test and to offer some suggestions on how you can improve your English in the specific area covered by the test.

What do I do if I want to take the test? If you want to take the test, please see me now--or if I'm not around, ask someone where you can find me. You will then be asked to sign up for the test and will be given an appointment slip telling you when and where to go.

REMEMBER:
1. You must sign up if you want to take the test.
2. Only the first 75 who sign up will be allowed to take the test.
3. Only those who take the test will be given their ISE results early.

Special English Test
Fall 1983

Sign-Up

If you would like to take the Special English Test, please read and fill out the attached "Release Form" and 3x5 card. On the card please put your name, address, and telephone number (see sample below). The address and telephone number are necessary in order for me to be able to get in touch with you during the fall semester to go over your test results. If you move, please be sure to let me know. You can put a note in my mailbox in the Jefferson Building.

(Family name, Given name, Middle name)
(Street Address)
(City, State, Zip Code)
(Telephone Number)

After you have filled out the 3x5 card and signed the "Release Form," please hand them in so that you can be given an appointment for the test. Thank you.
Special English Test
Fall 1983

Release Form

I, __________________________ (Name--please print), would like to take the Special English Test to be given by Roann Altman. I understand that the test results are to be used for research purposes only and that my name will not be used in reporting the results. The test results will have absolutely no effect on any other tests or classes I take at U.S.C. If I take the test, I will be able to get my International Student Exam (ISE) results early. Then, during the fall semester of 1983 you will let me know how I did on the Special English Test and give me some advice as to how I can improve my English.

(Date) (Signature)

Special English Test
Fall 1983

Appointment Slip

Please bring this appointment slip, along with a pencil or pen, to the test.

__________________________ will take the Special English Test in Taper Hall of Humanities (THH) Room 309 on __________ at __________.

If you have questions or want more information, please contact:
Roann Altman
ALI Room 202
Phone: 743-8866

REMEMBER: You will get your ISE results an hour earlier if you take this Special English Test.

Appendix E

NATIVE SPEAKER QUESTIONNAIRE

Special English Test
Fall 1983

Questionnaire

1. Name __________
2. Sex (circle one): Male Female
3. Age __________
4. Native language __________
5. Where are you from originally? (City) (State) (Country, if not US)
6. Where have you spent most of your life? (Please use percentages if you have spent a great deal of time in more than one state or in another country.) (States/Countries)
7. Highest educational degree received (circle one): High School Diploma Bachelor's Master's Doctorate Other (please specify): __________
8.
Occupation __________

Appendix F

INTERNATIONAL STUDENT INFORMATION SHEET

AMERICAN LANGUAGE INSTITUTE
INFORMATION SHEET

(Date)
Name: (Family) (Given) (Middle)
Local Address: (House Number) (Street) (Apartment Number) (City) (Zip Code) (Telephone)
Family Address: (City) (Country)

Check one: ( ) Freshman ( ) Sophomore ( ) Junior ( ) Senior ( ) Graduate
Major __________ Department __________
Graduate Assistantship Award? ( ) Yes ( ) No
Native language __________ Other languages __________
How long have you been in Los Angeles/California? __________ in the United States? __________
How old were you when you began speaking English? __________
Have you ever lived with native English speakers? ( ) Yes ( ) No If so, for how long? __________
Have you ever lived in an English-speaking country? ( ) Yes ( ) No If so, where? __________ For how long? __________ How old were you? __________

When you studied English:
                                   Elementary School  Secondary School  College/
                                   (Gr. K-6)          (Gr. 7-12)        University  Other
In what country did you study it?  ________           ________          ________    ________
For how many years (or months)?    ________           ________          ________    ________
For how many hours per week?       ________           ________          ________    ________
Was English used as the language
of instruction in other classes?   yes/no             yes/no            yes/no      yes/no
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
A regression model of coarticulation effects in naturalistic American English
Assertion and modality
Ryūkyūan language history
Syntactic reanalysis in early English
A geometric approach to error correcting codes
A study of Japanese communication: compliment-rejection production and second language instruction
Temporal Structure Of Spoken Korean: An Acoustic Phonetic Study
An historical study of theatrical entertainment in Virginia City, Nevada or Bonanza and Borasca Theatres on the Comstock (1860-1875)
A study of Korean national cultural values influencing leadership styles and cross-cultural management practices
Post-verbal phenomena in colloquial Persian syntax
An investigation of the effects of liquid crystalline phases on drug permeation through skin
A videofluorographic investigation of tongue and throat positions in playing flute, oboe, clarinet, bassoon, and saxophone
A case grammar of the parker manuscript of the "Anglo-Saxon chronicle" from 734 to 891
Morphosyntactic feature chains and phonological domains
"Tuwaak bwe elimaajnono": perspectives and voices: a multiple case study of successful Marshallese immigrant high school students in the United States
A comparative analysis of three contemporary solo performances based on the lives and works of women writers
A fine structural analysis of ovarian morphology, oogenesis, and ovulation in the marine bryozoan 'Membranipora serrilamella' (Cheilostomata, Anasca)
A conditional resolution of the apparent paradox of self-deception
Auditors' risk attitudes: a hierarchical levels study within various decision contexts
A longitudinal study of anxiety: noted relationships between anxiety, depression, parenting style, and academic achievement
Asset Metadata
Creator
Altman, Roann (author)
Core Title
Assessing modal proficiency in English as a second language
Degree
Doctor of Philosophy
Degree Program
Linguistics
Defense Date
05/01/1984
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
OAI-PMH Harvest
Format
application/pdf (imt)
Language
English
Contributor
Digitized by Interlibrary Loan Department (provenance)
Advisor
Krashen, Stephen (committee chair), Hellige, Joseph B. (committee member), Purcell, Edward T. (committee member), Rutherford, William (committee member), Schachter, Jacqueline (committee member)
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-491093
Unique identifier
UC11298217
Identifier
etd-Altman-579647.pdf (filename), usctheses-c3-491093 (legacy record id)
Legacy Identifier
etd-Altman-579647.pdf
Dmrecord
491093
Document Type
Dissertation
Format
application/pdf (imt)
Rights
Altman, Roann
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the au...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA