Rhetorical abstraction as a facet of expected response: A structural equation modeling analysis
INFORMATION TO USERS

This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer. The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleedthrough, substandard margins, and improper alignment can adversely affect reproduction. In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion. Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-hand corner and continuing from left to right in equal sections with small overlaps. Each original is also photographed in one exposure and is included in reduced form at the back of the book. Photographs included in the original manuscript have been reproduced xerographically in this copy. Higher quality 6" x 9" black and white photographic prints are available for any photographs or illustrations appearing in this copy for an additional charge. Contact UMI directly to order.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

A Bell & Howell Information Company
300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA
313/761-4700 800/521-0600

RHETORICAL ABSTRACTION AS A FACET OF EXPECTED RESPONSE: A STRUCTURAL EQUATION MODELING ANALYSIS

by

Donald L.
Weasenforth

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (Linguistics)

May 1995

Copyright 1995 Donald L. Weasenforth

UMI Number: 9617003. Copyright 1995 by Weasenforth, Donald Lester. All rights reserved. UMI Microform 9617003. Copyright 1996, by UMI Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code. UMI, 300 North Zeeb Road, Ann Arbor, MI 48103.

UNIVERSITY OF SOUTHERN CALIFORNIA, THE GRADUATE SCHOOL, UNIVERSITY PARK, LOS ANGELES, CALIFORNIA 90007

This dissertation, written by Donald Lester Weasenforth under the direction of his Dissertation Committee, and approved by all its members, has been presented to and accepted by The Graduate School, in partial fulfillment of requirements for the degree of DOCTOR OF PHILOSOPHY.

Dean of Graduate Studies
DISSERTATION COMMITTEE
Chairperson

To Anna, Erik and Kristen . . . One eye on the past, one to the future and ever mindful of the ephemeral present which so delicately binds the two.

Acknowledgments

I would first like to acknowledge the guidance and direction provided by the members of my dissertation committee. They have not only guided me through the doctoral studies and the completion of this dissertation, but have also offered insightful suggestions for the direction of future research. Dr.
Robert Kaplan's blend of kindness, firm prodding, unquestionable expertise and trustworthy counsel (just to mention a few of his favorable characteristics) made the journey pleasant and certainly intellectually profitable. I consider it an honor to have worked with a person with such a respected reputation in the field and will hold dear the invaluable mentoring that he has so generously provided. He has gifted me with a treasure of anecdotes which he skillfully employs to illuminate his discussions, which have deepened my understanding of written discourse. I hope to pay my debt in part, as he has, by treating students under my charge as well as I have been treated. I wish him and his beautiful wife, Audrey, a restful, but not too restful, retirement.

In the midst of a tremendously hectic schedule, Dr. Lyle Bachman has found time to field questions, offer advice, and even provide help in setting up a DOS directory for a once MAC-only literate graduate student. His expertise in language testing and statistical approaches to language related research is well known, and I could not do him justice in describing the extent of his knowledge and experience, which have incited me to test assumptions of language use, thereby opening a number of paths for research. He has also provided me with knowledge of and experience with a wealth of research methodologies which should prove useful in pursuing further research as well as in completing practical test development.

Dr. David Eskey's commitment to TESOL and his acumen in a wide range of issues related to literacy and language pedagogy have prompted me to explore the implications of the results of a rather abstract analysis. His ready sense of humor has more than once reminded me to consider the larger context of this research and the dissertation process.

To my wife, Esther, I owe a great debt of gratitude.
For working with me in a number of capacities, including the counting of textual features, typing, and proofreading, I cannot express enough appreciation. Perhaps more important was her continual moral support, encouragement and commitment all along the way. I am guilty of no hyperbole when I state that, without her support, this work would not have been completed.

To a great degree I owe to my mother an understanding of the value of persistence which has pushed me to complete this project. Although she would not see the end of the project, she unfailingly supported my work as long as life allowed her to do so. The inspiration of her support and of her wish to see the completion of the work have pushed me forward at times of diminished momentum.

Finally, I would like to express my gratitude again to those who survived the multiple ratings of 400 student essays. Thanks are due to Joe Allen, David Bycina, Daryl Kinney and Cheryl Kraft for suffering through several long days of ratings and discussions of their evaluations to provide the data for the study.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

Chapter I. Introduction
I.1. Preface
I.2. Language Testing
I.2.1. Construct Validity
I.2.2. Model of Academic Writing Proficiency
I.3. Discourse Analysis
I.3.1. Relative Significance of Textual Dimensions
I.3.2. Interrelationships between Textual Features and Dimensions
I.4. Purpose of the Study
I.5. Value of the Study
I.6. Limitations of the Study
Notes

Chapter II. Review of Literature: Discourse Analysis
II.1. Overview
II.2. Discourse Structure
II.2.1. Rhetorical Aspects of "Planned" Discourse
II.3. Language Testing
II.3.1. Facet Theory
II.3.2.
Facets of Writing Tests
II.3.3. Facets of Expected Response in the Study
Notes

Chapter III. Review of Literature: Statistical Methodology
III.1. Overview
III.2. Analyses of Rater Consistency
III.3. Structural Equation Modeling
III.3.1. Applications of SEM to Linguistics Research
III.3.2. Fundamental Concepts
III.3.3. Model Specification
III.3.4. Model Identification
III.3.5. Estimation
III.3.6. Model Fit
III.3.7. Benefits
III.3.8. Disadvantages
Notes

Chapter IV. Methodology
IV.1. Overview
IV.2. Personnel
IV.3. Corpus
IV.4. Rating Scales
IV.4.1. ESL Challenge Test Essay Evaluation Scale
IV.4.2. ESL Composition Profile
IV.4.3. Rhetorical Abstraction Scales
IV.5. Variables
IV.5.1. Holistic Ratings of Overall Textual Quality
IV.5.2. Rhetorical Abstraction Ratings
IV.5.3. ESL Composition Profile Ratings
IV.5.4. Text Length
IV.5.5. Textual Elaboration
IV.5.6. Topic Abstraction
IV.6. Descriptives and Reliabilities
IV.7. Preliminary Exploratory Factor Analyses
IV.8. Preliminary Regression Analyses
IV.9. Structural Equation Modeling
IV.9.1. Model Specification for the Study
IV.10. Bootstrapping
IV.11. Methodological Assumptions
IV.12. Methodological Limitations
Notes

Chapter V. Results
V.1. Overview
V.2. Analyses of Distributional Statistics
V.3. Structural Equation Modeling Analyses
V.3.1. Measurement Models
V.3.2. Full Models
Notes

Chapter VI. Discussion and Conclusions
VI.1. Overview
VI.2. Addressing the Hypotheses
VI.3. Implications
VI.3.1. Language Assessment
VI.3.2. Language Teaching
VI.3.3. Discourse Analysis
VI.4.
Further Research
VI.4.1. Test Validation
VI.4.2. Discourse Description
VI.5. Conclusion
Notes

References
Appendix A: Essay Prompts
Appendix B: Rating Scales
Appendix C: Descriptive Statistics
Appendix D: Validation Statistics for Rhetorical Abstraction Scales
Appendix E: Correlation Matrix
Appendix F: Text Samples

LIST OF TABLES

Table II.1 Moffett's Levels of Abstraction
Table II.2 Summary of Investigations of Rhetorical Abstraction
Table II.3 Britton et al.'s Adaptation of Moffett's Levels of Rhetorical Abstraction
Table II.4 Freedman and Pringle's Adaptation of Britton et al.'s Rating Scale
Table III.1 Fit Indices
Table IV.1 Rating Categories and Descriptions
Table IV.2 Latent and Observed Variables
Table IV.3 Reliability Estimates
Table V.1 Fit Indices for Textual Elaboration Measurement Model
Table V.2 Parameter Estimates for Textual Elaboration Measurement Model
Table V.3 Fit Indices for Topic Abstraction Measurement Model
Table V.4 Fit Indices for Original Rhetorical Abstraction Measurement Model
Table V.5 Correlations of Errors for Full Model 1: Rhetorical Abstraction Features
Table V.6 Correlations of Errors for Full Model 1: ESL Composition Profile Features
Table V.7 Correlations for Exogenous Latent Variables for Full Model 1
Table V.8 Fit Indices for Full Model 1
Table V.9 Exogenous Variable Estimates for Full Model 1
Table V.10 Fit Indices for Full Model 2
Table V.11 Exogenous Variable Estimates for Full Model 2

LIST OF FIGURES

Figure II.1.
Levels of Rhetorical Abstraction in Argumentation
Figure III.1 Example of LISREL Measurement Model
Figure III.2 Example of LISREL Structural Model
Figure III.3 Example of LISREL Full SEM Model
Figure IV.1 A Proposed Structural Equation Model of Discourse Features
Figure IV.2 Example of Structural Equation Model of Discourse Features
Figure V.1 Textual Elaboration Measurement Model
Figure V.2 Topic Abstraction Measurement Model
Figure V.3 Alternative Models of Textual Abstraction and Topic Abstraction
Figure V.4 Original Rhetorical Abstraction Measurement Model
Figure V.5 Full Model 1
Figure V.6 Full Model 2
Figure VI.1 Proposed Model of Assessment

ABSTRACT

Researchers have attempted to identify textual features which determine holistic evaluations of texts. However, most studies have involved evaluations of L1 texts, which may be qualitatively and quantitatively different from those of L2 texts. Most of the studies have been restricted by analyses of a limited number of textual features, usually syntactic features. Also, quantitative studies have relied predominantly on regression analyses, which are based on assumptions that may not be consistent with aspects of assessment. This project involved the application of quantitative analyses of L2 texts in an investigation of the relative salience of twelve textual features. Six of the features were associated with rhetorical abstraction, defined in the literature as the linking of general and specific information within a text. These features included the specification of a central theme, specification of directions for solutions to a problem, amount of evidence used to support arguments, number of explicit warrants used to associate evidence with arguments, elaboration of discussions of causes and elaboration of premises of an argument.
The application of structural equation modeling (SEM) provided statistical descriptions of the relative salience of textual features. The value of this study lies partially in the fact that this is the first application of SEM in an investigation of the relative salience of textual features, and in the argument that SEM is preferable to regression analyses in this type of investigation. A total of 391 essays written by L2 English language students at the University of Southern California were analyzed for the study. The students were enrolled in writing classes representing a range of proficiency levels and were heterogeneous with regard to ethnic identity and native language. Students wrote their protocols under standardized testing conditions. Protocols were evaluated through the application of holistic rating, analytic rating and quantitative measures. Results indicate that some aspects of rhetorical abstraction are significant features of L2 students' test protocols. Rhetorical abstraction is associated with other dimensions, such as text length, content, and organization, which are also significant features. The results may have implications for writing assessment and pedagogy. The study also represents a methodological advance in multidimensional textual analysis.

CHAPTER I
Introduction to the Study

I.1. Preface.
The research endeavor reported in this text represents an investigation of one area in which the fields of language testing and written discourse analysis intersect. The study is grounded in language testing in that the texts which were analyzed for the study were written under standardized testing conditions, that is, with a set of essay prompts commonly assigned to all writers, a common amount of time allotted for planning, writing and revising, and standardized evaluations of the texts. The information provided by the study is relevant to the construct and pedagogical validity of tests of writing abilities. The study is also related to language testing in that it will yield a model of academic writing proficiency which may be useful in investigations of test validity, in rating scale development and in rater training. The study is related to written discourse analysis in that it entails evaluations, or analyses, of written texts, although the analyses may be coarser grained than those often employed in written discourse research. The analyses completed in this study involve holistic evaluations of textual dimensions and frequency counts of syntactic and grammatical features which may define textual dimensions. The central point of interest is the relative salience of textual dimensions to readers of student protocols. A second, related point of interest lies in the relationships between textual dimensions.

In the discussion that follows, I seek to situate this study more clearly within the fields of language testing and written discourse analysis. I will further pursue the same objective in Chapter II by reviewing literature in both fields.

I.2. Language Testing.

The two areas in language testing which the study speaks to are construct validity of writing assessment and the development of a model of academic writing proficiency.
Holistic evaluations obscure rater behavior, so that it is unclear which textual features are being evaluated and the relative significance of the textual features in determining ratings is not obvious. Although a number of researchers have investigated the relative significance of textual features in determining ratings of overall textual quality, none have provided a coherent model in which the relative significance of textual features and the interactions of those features are described. The model resulting from the present study provides indications of how experienced readers evaluate the writing abilities of second language (L2) learners, identifying textual features which are important in determining whether a student can be considered academically literate.

I.2.1. Construct Validity.

Since Diederich et al.'s (1961) seminal investigation of holistic ratings of students' written texts, a good deal of effort has been invested in describing the nature of holistic rating. The bulk of this work has been completed with concerns for test validity and rating reliability in mind. With the understanding that raters can vary markedly in their judgments of a text and a recognition of the complexity of how textuality is determined, it is not surprising that reliable scores can be difficult to come by. Promoting reliable rating, a largely statistical concern, raises related concerns about the relationship of scores to theoretical assumptions underlying writing assessment, a largely conceptual issue. The extent to which assessment of writing is consistent with a theoretical understanding of writing is referred to as construct validity. The extent to which holistic rating reflects an institution's or an individual's theoretical understanding of writing proficiency and instructional objectives is unclear because holistic rating obscures the agendas of raters.
A fair amount of research has contrasted holistic methods with analytic and primary trait scoring or other methods of assessing writing (Breland and Gaynor 1979, Perkins 1980; 1983, Skehan 1990). In general, holistic scoring has been defended as more valid than other methods in that it entails, in theory at least, the evaluation of a complete, coherent text. Another type of investigation has focused on the effects of aspects, or facets, of the testing situation on holistic evaluations (Bachman 1990, Bachman and Palmer forthcoming). This set of research studies attributes variations in ratings to various characteristics of the testing situation. Test taker characteristics, such as fatigue, familiarity with test formats, native language and learning style, have been investigated as sources of variation in the measurement of writing abilities (Bachman, Purpura and Cushing 1993, Kunnan 1991). Test methods, such as prompt types, variations in rating scales, topic and explicitness of test instructions, have also received attention (Golub-Smith et al. 1993, Greenberg 1986, Hale 1991, Keech 1985, Ruth and Murphy 1989). Rater characteristics, such as fatigue, knowledge of topic, extent of experience and training, have also been identified as possible sources of variance in ratings (McNamara 1990, McNamara and Adams 1991). Another source of variation in ratings lies in the characteristics of the written texts, called facets of expected response (Bachman 1990). Handwriting, grammatical accuracy, vocabulary usage, organization and other textual features have been investigated as facets of protocols that determine ratings to some degree. A fuller discussion of textual features as sources of variation in scores is provided in II.3.3. Theoretically, with analytic and primary trait rating, one knows which textual features are being evaluated.
With holistic rating, one assumes that the overall quality of texts is being rated, but what is actually being evaluated is perhaps less clear. It is not unheard of for a single textual feature, such as one misspelled word, to significantly affect the evaluation of overall textual quality. Diederich et al. (1961) found that raters differed markedly according to the textual features that they focussed on in evaluating students' writing holistically. They also found that, even when raters evaluated the same features, their scores differed. Diederich nonetheless argued that raters could be brought into perfect agreement in holistic scoring. However, as additional research has been completed, it has become evident that bringing raters into perfect agreement is not an easy task and may not even be desirable.1 Raters' judgments can be constrained somewhat through the use of scales and training, but raters do not always follow the direction given in rating guidelines and training. While recognizing that it is probably impossible to reach 100% consistency in raters' evaluations, one would be misguided to ignore the issue of variations in rater judgments. Accurate placement of students depends on the consistency of rater judgments: consistency not just in the sense of assigning similar scores, but also in the sense of evaluating the same features when assigning scores. If raters look at different features of texts during evaluation, it may be more likely that the reliability of scores will suffer. On the other hand, if raters assign exactly the same scores while looking at different features, one attains a high level of reliability; that is, there is no variation in the scores that are assigned across raters. However, the scores in this instance are not comparable; they do not represent measures of the same textual feature.
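The score-consistency half of this distinction is the part that is straightforward to quantify. As an illustrative sketch, not drawn from the study itself, one common statistic treats each rater as an "item" and estimates how consistently a panel ranks the same essays (Cronbach's alpha); all rater labels and scores below are hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(ratings_by_rater):
    """Internal-consistency estimate across raters.

    ratings_by_rater: k lists, one per rater, each holding that rater's
    holistic scores for the same n essays, aligned by essay.
    """
    k = len(ratings_by_rater)
    # Variance of each rater's scores, and of the per-essay totals.
    rater_vars = [pvariance(scores) for scores in ratings_by_rater]
    totals = [sum(essay) for essay in zip(*ratings_by_rater)]
    return (k / (k - 1)) * (1 - sum(rater_vars) / pvariance(totals))

# Hypothetical holistic scores from three raters on five essays.
rater_a = [4, 3, 5, 2, 4]
rater_b = [4, 2, 5, 2, 3]
rater_c = [5, 3, 4, 1, 4]
print(round(cronbach_alpha([rater_a, rater_b, rater_c]), 3))
```

An alpha near 1 indicates that the panel ranks essays consistently, but, as the passage above notes, a high value cannot show whether the raters were attending to the same textual features when they scored.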
At this point, the validity of the test may be compromised. Consequently, students may be misplaced for instruction; students may pay more tuition and lose time completing unnecessary courses; and discontent may be expressed by departments, students and instructors. Additionally, score reports may become less meaningful.2 While investigating the statistical consistency of scores can be rather straightforward, researching rater consistency in terms of evaluating the same textual features can be rather difficult. The focus of raters' evaluations is not obvious, not even to the raters much of the time. The relative amount of attention given to various textual features is perhaps even less obvious. Thus, one of the first steps in maintaining rater consistency would be the identification of textual features which influence raters' judgments. A subsequent step would be the description of the relative strength of the influence of these features in determining raters' judgments. This study was undertaken to complete these two steps. The study was designed to investigate the influence of textual features not yet investigated on holistic evaluations of L2 students' protocols written under standardized testing conditions. The protocols are limited in terms of text type and rhetorical mode. Essays, a text type commonly required in composition courses, were analyzed. All protocols presented a line of argumentation. Results indicate which textual features influenced raters' evaluations of overall textual quality and provide indications of the relative amount of influence of textual features. This is accomplished through the application of structural equation modeling, a statistical methodology not used before in such an investigation, but theoretically more appropriate than other methods.

I.2.2. Model of Academic Writing Proficiency.
During the past decade, a fair amount of work has been completed to define the general construct of language proficiency (Bachman 1990, Bachman and Palmer 1982, Canale 1983, Canale and Swain 1980). Models of academic writing proficiency in particular have also been proposed (Bizzell 1992, Chiseri-Strater 1991, Hamp-Lyons 1990; 1991a, Kaplan 1987; 1988, Nash 1990a).3 The relation of a model of writing proficiency to language testing is fairly direct. Models of language proficiency have served as guides in the development of rating scales and rating procedures (rater training, score reporting). A model of writing proficiency also provides a basis for evaluating the construct validity of writing tests. The lack of a model of writing proficiency represents a significant difficulty in determining the construct validity of writing tests, since there is no commonly accepted theoretical standard against which tests can be compared.4 Although a model should incorporate all significant aspects of a construct, it has been observed that, in practice, writing proficiency is often limited to grammatical accuracy. Berthoff's (1984; 1986) argument against limiting the definition of writing proficiency to accurate grammar usage is echoed by Davies (1990), Kaplan (1982; 1988) and McNamara and Adams (1991), who have argued that the sole, or predominant, focus on language usage in assessments of academic writing may adversely affect the construct validity of the assessments. Using evaluations of syntactic competence as the sole or the predominant predictors of writing proficiency, across tasks and ability levels, should be questioned.
It should be questioned especially in light of a sizeable body of literature which indicates that other abilities, such as the ability to order propositional material to form a coherent text, are significant aspects of writing proficiency, and in light of the understanding that syntactic devices are not the only means by which propositional material is ordered (Coe 1988, Connor 1987; 1990; 1991, Enkvist 1987, Kaplan 1988; 1991, van Dijk 1990). Researchers have thus called for the development of a more comprehensive model of writing proficiency (Hamp-Lyons 1991a, Kaplan 1982, Keech 1984). In this research, I advocate the inclusion of rhetorical abstraction as a component in the proposed model and investigate the role of rhetorical abstraction in determining the overall quality of L2 students' protocols. Rhetorical abstraction is defined as the hierarchical organization of propositional material, entailing the contextualization of relatively specific pieces of information within more general patterns of information and, inversely, the substantiation of relatively general information through the use of details or examples. This ability has been discussed by rhetoricians, including Berthoff (1986), Winterowd (1986) and Winterowd and Gillespie (1994), and by linguists, including Kaplan (1988), Mann and Thompson (1988) and Pike and Pike (1983). More detailed operational definitions used for this project are provided in Chapters II and IV. Little empirical research has been completed to investigate features of rhetorical abstraction. Much of the research that exists remains inconclusive due to vague operationalizations of the notion of abstraction and to inadequate attention given to the development of valid, reliable measurement instruments.
The false dichotomization of writing products and processes and the predominant concern with writing processes during the 1980s partially account for this lack of research (Hamp-Lyons 1991a, Horowitz 1986). An interest in maintaining the objectivity of assessment may also account for the lack of research (Madaus 1994). Since rhetorical features can be more elusive than grammatical and lexical features, assessments of rhetorical aspects of texts could make testing appear to be more subjective. In addition to describing the relative salience of various textual features as determinants of overall textual quality, this study yields a coherent model of writing proficiency which incorporates syntactic, lexical, semantic and rhetorical features in a coherent representation of textual quality. The model may be further distinguished from other proposed models in that it is based on statistical analyses of empirical data through the application of methodology that may provide more accurate estimates of the influence of textual features and the strength of relationships between features.

I.3. Discourse Analysis.

The same issues raised with regard to the development of a model of academic writing proficiency are also discussed by written discourse analysts in terms of textuality, textual features or dimensions, the structure of discourse and the interrelationships between textual dimensions. Two issues relevant to analysis of written discourse were investigated in this study. The first issue, and the primary focus of this study, is the description of the relative significance of textual dimensions in determining textual quality. The second issue is the description of the interrelationships of textual dimensions.
Before discussing each of these issues, it may be helpful to contextualize the discussions by identifying a wider range of factors involved in determining textuality. There are many factors which may be instrumental in determining textuality. In addition to features of texts (e.g., syntax, lexicon, rhetoric, content, print type and quality, paper quality and color), it has been argued that extratextual factors also affect textuality. These factors include characteristics of authors (e.g., "real world" knowledge, articulateness, willingness to be understood), of readers (e.g., knowledge of world events, knowledge of linguistic forms, willingness to understand) and contextual factors (social and cultural constraints on both writers and readers). All of these factors were at work in the textual analyses that were completed for this study. Since writing ability was tested, writer characteristics, at least writing ability, were influential in determining the quality of texts. Of course, it is recognized that familiarity with the topics, ability to interpret graphics and ability to appropriate the data from graphics in support of arguments are only three of many other writer characteristics which had a bearing on the development of the texts. Characteristics of the raters also played a role in determining textual quality. The raters' experience with L2 students' texts may have allowed them to interpret texts which may have been indecipherable to someone not familiar with L2 texts. As will be described in more detail later, the familiarity with environmental issues that one rater brought to the rating table played an obvious role in the evaluation of textual quality. The social context, a testing situation, was also apparently influential in determining textual quality.
While recognizing the complexity of the determination of textuality, I will take a limited, text-based approach to investigating what makes a text a text. This study focuses on the relative influence of thirteen textual dimensions in the constitution of textuality. The relevance of the two objectives of the study to the field of written discourse analysis is discussed below.

1.3.1. Relative Significance of Textual Dimensions.

A number of textual features have been found significant in determining the quality of a text, as reviewed in Chapter II. It has also been observed that some textual features play a more significant role than others in determining textual quality. The complex syntax of Derrida and the amusing breaches of coherence in Ionesco's plays serve as examples of defining features of the two writers' texts. Although students are not as accomplished or renowned as these two writers, their texts may also have defining features. In L2 students' texts, the defining feature may be an amusing lexical choice or a foreign turn of logic. One might expect that syntactic features, particularly the misappropriation of them, would figure prominently in evaluations of L2 students' texts. This might be expected since many L2 students have difficulty in constructing syntactically accurate texts. However, discourse involves more than syntactic control. L2 students may also exhibit difficulty in appropriately deploying rhetorical and semantic features of their writing. One might, therefore, expect features other than syntactic ones to be important in determining the overall quality of L2 texts.5 Most investigations of the relative salience of textual features have involved analyses of L1 texts.
Evaluations of L1 and L2 texts, however, can be qualitatively different. For instance, raters may expect L1 and L2 writers' familiarity with some topics to vary systematically and may thereby adjust their judgments accordingly. There may also be quantitative differences in evaluations of L1 and L2 texts, with ratings for L1 texts being generally higher and representing a different distribution. Results from analyses of L1 texts may thus not be generalizable to assessments of L2 texts. Investigations of the relative salience of textual features have also focussed largely on syntactic features while rhetorical and semantic aspects of discourse have been slighted (Schroder 1991). Many studies have focussed on a limited number of syntactic features or on ill-defined constructs associated with syntactic features. For example, many studies of the 1970s and early 1980s focussed primarily, or exclusively, on measures of "syntactic maturity," operationally defined by measures of length, particularly by the number of T-units, T-units per sentence and clauses per T-unit (Hunt 1964; 1965). Only recently have multidimensional analyses of discourse, including syntactic, semantic and rhetorical features, been attempted (Biber 1988; 1992, Carlson 1988, Connor 1987; 1990, Ferris 1990, Hatch 1991). These analyses have been limited in several ways. Investigations of the relative significance of various features have been limited by the relative lack of such research, the lack of a consistent model and constraints imposed by the statistical methodologies employed. Lack of research has meant that relatively little information is available to build upon. Relatively little is known about the influence of many features on the constitution of textuality. Many aspects of these features have not been investigated.
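The length-based "syntactic maturity" indices mentioned above are simple to compute once a text has been segmented. The following sketch is illustrative only and is not the procedure used in this study: it assumes the T-units and their clause counts have already been identified by hand, and the sample data are invented.

```python
# Illustrative only: length-based "syntactic maturity" indices
# (mean words per T-unit, mean clauses per T-unit) for one essay.
# Segmentation into T-units and clause counting are assumed to have
# been done by hand; the sample T-units below are invented.

def maturity_indices(t_units):
    """t_units: list of (text, clause_count) pairs for one essay."""
    n = len(t_units)
    words = sum(len(text.split()) for text, _ in t_units)
    clauses = sum(count for _, count in t_units)
    return {
        "t_units": n,
        "words_per_t_unit": words / n,
        "clauses_per_t_unit": clauses / n,
    }

sample = [
    ("The factory reduced emissions because regulators demanded it", 2),
    ("Pollution levels fell", 1),
    ("Residents who had complained for years were satisfied", 2),
]
print(maturity_indices(sample))
```

Even when such counts are automated, they capture only the length dimension of syntax; as the studies cited above suggest, they say nothing about rhetorical or semantic control.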
There is little consistent information about the relative significance of these features. Little is also known about various aspects (functions, distributions across text types) and the relationships between textual features. The lack of a consistent model has likewise hindered the coordination of efforts to define textuality. The measurement of textual features varies across studies, limiting the comparability of results. The definition of features is sometimes vague and often varies across studies, also hampering the comparability of studies. The influence of other textual characteristics, such as text type and topic, on overall quality of texts is also little understood and thereby limits comparability. Researchers have often been satisfied with the identification of relatively more significant textual features, ignoring the interrelationships between features which may influence the relative significance of features. The choice of statistical analyses has not allowed the analysis of a coherent model of multivariate data. The statistical methodologies, mostly regression and discriminant analysis, provide limited information necessary for the establishment of a model. They also entail assumptions which may not be consistent with the measurement of many textual features of interest, as discussed in III.3.7. This study incorporates evaluations of syntactic, semantic and rhetorical features as well as text length in an analysis of L2 texts. The statistical analysis employed allows investigations of the relative significance of these features in determining textual quality. It also provides descriptions of the strength of relationship between textual features.

1.3.2. Interrelationships between Textual Features and Dimensions.
Intuitively, it appears plausible that textual features and dimensions6 interact in determining textuality (Hatch 1991; 1992, Kaplan 1988; 1991, Quirk et al. 1985, Mauranen 1993). It is commonly assumed that texts cannot be defined in terms of clearly distinguished components. Although textual dimensions or features are often discussed, for the sake of research and assessment, as if they are isolatable, independent components of texts, in practice it is often difficult to distinguish textual features/dimensions. In assessment, for example, raters sometimes find it difficult to draw clear distinctions between lexical features of texts and syntactic, semantic and rhetorical features. As aptly demonstrated by Halliday and Hasan (1976) and Schiffrin (1987), lexical and syntactic features may be deployed within a text to enhance the text's cohesiveness. Although the relationship between cohesive features and a broader textual dimension is implicit in their work, Halliday and Hasan and Schiffrin focus their attention on the identification of relatively discrete features which may co-occur in texts. Hasan (1984) and Hoey (1991a; 1991b) have additionally shown that the lexical and syntactic features identified by Halliday and Hasan interact (i.e., co-occur) to establish the coherence of a text. Hasan and Hoey elaborated Halliday and Hasan's investigation of coherence by describing the network of grammatical and lexical features that defines textual cohesion. In this work, they not only posited a relationship between grammatical and lexical features and the broader dimension of coherence, but also theorized that these features interacted so as to define textual coherence. Similarly, particular lexical and syntactic features seem to co-occur to define textual dimensions other than cohesiveness or coherence.
Passive voice, latinate lexical items and longer sentences have been associated with the relative formality of texts, for instance. Biber (1988) has sought to define textual dimensions, assuming that particular grammatical and lexical features co-occur to define dimensions. Biber, Hasan and Hoey assumed not only that discrete features could collectively define broader dimensions, but also posited an interaction/correlation among textual features in the definition of dimensions. Two types of relationships between textual features and dimensions are observable in the work referenced above. Correlational relationships between textual features and dimensions are defined as the co-occurrence of features and dimensions. Aspectual relationships are defined as the co-occurrence, or possible interaction, of two or more features in the definition of a textual dimension. The establishment of these relationships depends on the definition of the features/dimensions, including the level of conceptual aggregation. Mechanics, defined as paragraphing, may be seen as an aspect of organization, whereas mechanics, defined as handwriting, may be seen as irrelevant to organization. Mechanics, defined at a different level of conceptual aggregation as paragraphing, handwriting, spelling and capitalization, may be seen as a dimension that is correlated with organization. I will discuss each of these two relationships, providing examples for textual features and for textual dimensions. I will also argue that some features/dimensions may be seen as independent. First, lexical and/or syntactic features are found to co-occur in texts in a correlational relationship. Halliday and Hasan posited the co-occurrence of features and related these groups of features to textual cohesiveness.
On the other hand, Biber (1988) found some groups of lexical and syntactic features to be correlated without being associated with any identifiable broader dimension. Similarly, modal verbs are found to co-occur in certain sections of research papers and are related to tentativeness or hedging (Swales 1991). Correlational relationships are also assumed to occur between textual dimensions. For example, vocabulary and content are assumed to be closely correlated within a text. It is argued, for example, that vocabulary carries much of the substantive content of a text and the two textual dimensions would, therefore, be highly correlated. Similarly, Hillocks (1986) suggests that text length and content may be highly correlated. It is tempting to interpret correlational relationships in terms of causality. That is, one may want to conclude that if text length and content are highly correlated, it is because longer texts are more informative, and shorter ones, less informative. While a significantly large correlation might be interpreted as an indication of one of the textual features being an aspect of the other, it could also be interpreted as simply the co-occurrence of independently defined textual dimensions. The independence of the two dimensions appears to be intuitively plausible. It is possible that a longer text, a personal letter for example, would exhibit relatively little substantive content in that much of the letter is phatic in nature, not intended to present new information. A relatively short text, on the other hand, might be densely packed with new information. Memoranda might serve as a commonly recognized example. Conversely, textbooks may serve as a common example of longer texts with relatively more substantive content.
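The correlational claim discussed above is straightforward to check on rating data with a Pearson coefficient. A minimal sketch follows; the essay lengths and content subscores below are invented for illustration and are not data from this study.

```python
# Illustrative only: Pearson correlation between text length (word
# count) and a rater's content subscore. All data points are invented.
import math

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

lengths = [180, 240, 310, 150, 420, 275]    # words per essay (invented)
content = [21, 24, 27, 19, 30, 25]          # content subscores (invented)
print(round(pearson(lengths, content), 3))
```

As the surrounding discussion emphasizes, even a large coefficient computed this way licenses no causal or aspectual conclusion by itself; it records co-occurrence only.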
Intuitively it seems that content, defined as the semantic import of a text, may be more closely related to vocabulary choice or textual organization than it might be to mechanics, defined as handwriting legibility, paragraphing, spelling and punctuation. Mechanics, defined as legibility of handwriting, and language use, defined as felicitous use of syntactic structures, for example, may be unrelated. That is, the felicitous deployment of syntactic structures in a text probably has little, if anything, to do with the handwriting. Similarly, one may observe in students' writing content which is engaging and entertaining. In some of these texts, a native-like control of mechanics, vocabulary and syntax may be observed. In others, little control of these three aspects may be observed. The correlation of evaluations of content and control of the three features across texts would therefore be equal to zero. In the second type of relationship, lexical and syntactic features which co-occur are often assumed to define broader textual dimensions, such as cohesiveness and coherence. That is, the features are seen as components, or facets, of a more broadly defined textual dimension. I will refer to this relationship throughout the dissertation as an aspectual relationship. This type of relationship played a prominent role in tagmemic theory (Pike 1982, Pike and Pike 1983) in which language was seen as a hierarchy of aspectually related linguistic components. In an aspectual view of textual dimensions, such as Biber's (1988), features not only co-occur, but combine as aspects of a textual dimension.7 In a sense, the co-occurrence of particular textual features causes the textual dimension. The co-occurrence of prepositions and relative clauses, for example, causes the informational load of a text.
The co-occurrence of latinate lexical items, longer words and the lack of contractions may cause the formality of a text. In the present study, data use and examination of premises/issues are hypothesized to be aspects of content. It seems reasonable to interpret significantly high correlations between the first two features and content as indications that the two features are aspects of content. One might expect the use of data, or evidence, to support an argument and the analysis of issues to be viewed as substantive material of a text. One may be tempted to posit a high correlation between lexical coherence features, such as moreover, in addition, also, and the coherence of texts. Although such features are often used by novice writers as an aid to organizing their texts, they may be absent in many texts which are nonetheless coherent and well-organized. In such texts the correlation between lexical coherence features and the coherence or organization may be equal to zero. Textual dimensions may co-occur to define a particular textual quality. The discourse of literary criticism often reflects this conceptualization. A list of descriptives is often strung together to form a stylistic picture of a text. Texts are described in additive terms such as fresh and exciting and modern. A literary critic might be hard pressed to distinguish the fresh from the exciting and the modern aspects of a text. Nonetheless, the descriptions imply that a text could be fresh without being exciting and modern or that it could be fresh and exciting without being modern, etc. This conceptualization of textual descriptions implies that textual dimensions can be defined independently but may co-occur to define a particular text or author.

1.4. Purpose of the Study.
This study involved the application of computer-aided textual analyses, ratings of student essays and statistical procedures (structural equation modeling) to provide descriptions of L2 university-level students' written discourse. Each of these three aspects of the study is described in more detail in Chapter IV; structural equation modeling (SEM) is also further described in Chapter III. The analyses were applied to a corpus of argumentative essays written by L2 university students. Detailed descriptions of the corpus and the writers are provided in Chapter IV. The analyses provided the following types of descriptions:

• Descriptions of the efficacy of the observed variables (i.e., frequency counts and ratings of textual features) in measuring underlying textual dimensions,

• Descriptions of the structural composition of rhetorical abstraction,

• Descriptions of the statistical significance of various textual dimensions in determining overall textual quality,

• Descriptions of the strength of relationships between various textual dimensions in terms of correlations,

• Descriptions of the strength of relationships between more discrete textual features (i.e., length, rhetorical abstraction features, topic abstraction and textual elaboration) and broader textual dimensions (i.e., content, organization, language use, vocabulary, mechanics), and

• Statistical indices of the adequacy of two models of discourse structure.

It may be important to point out explicitly at this time that the descriptions that are provided by SEM are statistical. They are numerical indices of the strength of relationship between variables and are based on observed variances in measures. Furthermore, the descriptions are of textual dimensions rather than of more discrete elements of texts.
For example, rather than describing the function of a particular syntactic structure or lexical item or associating particular syntactic features with particular rhetorical features, a more general analysis is attempted in this study. Broader textual aspects, as contrasted with particular syntactic structures or lexical items, are the objects of analysis and are referred to as textual dimensions. The textual dimensions are defined in terms of syntactic/lexical features or in terms of other types of aspects as found in the ESL Composition Profile (Jacobs et al. 1981). Likewise, the objective of the study is not to trace the development of texts or to identify the functions of syntactic structures or lexical items. The main objective of the study is, instead, to describe, in statistical terms, the relative salience of textual dimensions to raters of L2 texts. The types of descriptions that SEM provided for the study are discussed in more detail in Chapters III and IV. Nonetheless, it may be apparent from this brief discussion that the analyses provided more information than can be adequately explicated in this dissertation. This embarrassment of informational riches is due to the need to specify a comprehensive model of discourse which theoretically identifies all textual dimensions or, at least, all significant textual dimensions. Although the analyses provided information about the relative salience of grammatical, lexical and semantic dimensions of the texts, rhetorical abstraction remained the main focus of the study. While information about other textual dimensions is provided in this dissertation, the discussions will focus exclusively on information regarding rhetorical abstraction and the relationships between features of rhetorical abstraction and other textual dimensions.
To help focus the dissertation, I have formulated a series of hypotheses which have guided the research and will constrain the discussions that will be found in subsequent chapters of the dissertation. The following hypotheses were investigated in this study:

1. The model of rhetorical abstraction proposed in the study (see IV.9.1) is supported by statistical analyses.

2. Rhetorical abstraction is a statistically significant aspect of overall textual quality.

3. Evaluations of rhetorical abstraction are significantly associated with evaluations of text length.

4. Evaluations of rhetorical abstraction are significantly associated with content.

5. Evaluations of rhetorical abstraction are significantly associated with rhetorical organization.

6. Evaluations of rhetorical abstraction are significantly associated with topic abstraction.

7. Evaluations of rhetorical abstraction are significantly associated with textual elaboration.

The first hypothesis reflects a basic concern for the definition of the textual dimension that is the focus of the study. The hypothesis is tested by determining whether the observed variables appear to be measures of common underlying textual dimensions which may represent aspects of rhetorical abstraction. These aspects should, in turn, be associated with a single common underlying variable which can reasonably be labeled rhetorical abstraction. The second hypothesis is of central concern in this study. Available literature provides only speculations about the nature of rhetorical abstraction. It has often been asserted that the rhetorical abstraction features which have been incorporated in this study play an important role in determining overall textual quality.
This hypothesis will be tested by calculating regression weights between measures of overall textual quality and rhetorical abstraction features. The last five hypotheses claim that features of rhetorical abstraction are significantly associated with other textual dimensions. These hypotheses are tested in two ways. SEM analyses will calculate correlations between all textual dimensions, providing one indication of the strength of association. Also, an analysis regressing the broader textual dimensions on more discrete ones will provide regression weights indicating the extent to which rhetorical abstraction features can be considered aspects of the broader dimensions. Results from tests of the last five hypotheses may be useful in defining the broader textual dimensions such as content. They may also be useful in describing the relative salience of rhetorical abstraction features. If rhetorical abstraction, for example, is found to be a significant aspect of content and content a significant aspect of overall textual quality, one may expect rhetorical abstraction to be instrumental in determining overall quality by enhancing the substantive aspects of texts. One may thereby identify more specific functions of rhetorical abstraction. It might be expected that aspects of rhetorical abstraction, particularly data use, would be correlated with measures of content. This may not, in fact, be the case since content has been interpreted in various ways by raters. In fact, raters may resist requests to evaluate the content of texts. Hamp-Lyons (1991a) notes that raters in her study were reluctant to evaluate substantive content in students' essays because the raters felt unqualified to do so. One might also reasonably expect the length of a text to increase as one elaborates discussions of a topic.
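The regression-weight idea behind these tests can be illustrated in a deliberately simplified form with ordinary least squares for a single predictor. This is only a sketch of the concept: the study itself estimates weights simultaneously within an SEM model, not one predictor at a time, and all ratings below are invented.

```python
# Illustrative only: a one-predictor least-squares regression weight,
# a simplified stand-in for the simultaneous SEM estimates described
# in the text. All ratings below are invented.

def ols_slope_intercept(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

abstraction = [2, 3, 5, 4, 6, 3]     # rhetorical abstraction ratings (invented)
quality = [55, 60, 78, 70, 85, 62]   # overall quality scores (invented)
b, a = ols_slope_intercept(abstraction, quality)
print(f"weight={b:.2f}, intercept={a:.2f}")
```

A single-predictor weight like this conflates the influence of every omitted dimension; the advantage claimed for SEM later in this chapter is precisely that it estimates all such weights relative to one coherent model.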
Since the amount of elaboration is reportedly measured by the rhetorical abstraction ratings, it seems reasonable to expect text length to vary proportionately with measures of rhetorical abstraction. It is also hypothesized that topic abstraction, as defined by Biber (1988), is highly correlated with rhetorical abstraction. Rhetoricians concerned with composition instruction (Berthoff 1986, Winterowd 1989; 1994, Shaughnessy 1977) have suggested that balancing abstractions with more specific data is valued by composition evaluators. If this is true, then topic abstraction may be significantly correlated with rhetorical abstraction, or at least with some features of rhetorical abstraction. Finally, the textual elaboration dimension (Biber 1988) is hypothesized to be correlated with rhetorical abstraction. It may be that Biber's dimension represents a different type of textual elaboration than that represented in this study as rhetorical abstraction. Biber describes his textual dimension as on-line elaboration, that is, elaboration common to spontaneous communication, such as face-to-face conversations or e-mail messages. Given the fact that Biber associated this type of elaboration with spontaneous communication and the association of rhetorical abstraction with relatively more planned and edited texts, one might expect an inverse relationship between the two dimensions. If the two types of elaboration represent mutually exclusive textual dimensions, one common to spontaneous communication and the other to relatively more planned, edited communication, a relatively large negative correlation may be found between the two dimensions. If, on the other hand, the two dimensions are interrelated, one might expect to find a positive correlation between them.

1.5. Value of the Study.
The importance of the study to the field of discourse analysis lies in the resultant description of textuality and the application of a statistical methodology which may prove useful to multidimensional textual analyses. The study may also be useful to language testers interested in validation of tests of writing abilities, rater training and the coordination of writing assessment and pedagogy for L2 students. The descriptions of textuality yielded by the study are drawn from a broader perspective than those in which researchers trace topic development or identify particular cohesion features in texts. The descriptions are specifically applicable to essays written by L2 students under standardized testing conditions. Being limited to academic essays, the study may be useful in defining academic writing proficiency in terms of the relative salience of textual dimensions. It has been asserted, for example, that substantive aspects (content and development of ideas) may be more important to readers of academic texts than they are to readers of other types of texts. Likewise, it is claimed that terseness, as contrasted with rhetorical abstraction and textual elaboration, is generally favored more in some business-oriented texts than it is in other types of texts. If it can be determined that readers place more emphasis on certain textual features when reading one type of text as opposed to another, this discrepancy in reader expectations may be useful in distinguishing text types. The descriptions of relative salience may also be taken as an indication of how future research efforts should be channeled.
If, for example, rhetorical aspects of academic essays appear to be relatively important in determining the overall quality of the texts, this information could be used to argue that additional research on rhetorical features is warranted. On a more abstract level, I believe that an unpretentious goal for this research is the provision of information useful in the formulation of a multidimensional, theory-driven model of textual analysis proposed by discourse analysts (Connor 1987; 1990; 1991, Hatch 1991; 1992, Kaplan 1987; 1988; 1991, van Dijk 1990, van Eemeren 1990). Two types of information from this study may be of use in the development of such a model. First, several textual features incorporated in this study have received little or no attention in previous research. Features such as content and vocabulary which are expected to be relatively important in determining textual quality have been investigated in only a handful of studies (see II.3.3). In contrast, many studies have focussed on the role of formal aspects of texts, i.e., syntactic features, spelling and handwriting. Knowledge of other aspects, e.g., semantic and rhetorical aspects, is needed to develop a more comprehensive model of textuality. The fact that some textual features have not received much attention may be in part due to the difficulty of defining the features. This appears to be the case for content, which has been confused with other textual features. It may, in fact, be impossible to distinguish categorically some features from others. This not only complicates the definition of the features, but also makes the measurement of features a complex task. Measurement is also a problem in that there is no commonly accepted metric for quantifying features such as content.
While it is a fairly straightforward task to tally frequency counts for syntactic structures or lexical items, it is not clear how one can measure semantic aspects of texts. Kintsch (1988), van Dijk (1990) and Meyer (1987; 1992) have proposed procedures for identifying and counting propositions, but their decisions to define propositions in the manner in which they did are arbitrary and thus open to controversy. Nonetheless, it may be generally agreed that additional understanding of the importance of these textual features in determining textual quality would be useful. The study may also be useful in identifying textual features which are associated and identifying features which may reasonably be considered aspects of broader textual dimensions. The study may also be useful in that validated scales for various features of rhetorical abstraction are provided. Similarly, it may be useful to have a better understanding of the importance of a more inclusive set of features, not just more features. It is unfortunate that some of the studies reviewed in Chapter II are limited to investigations of only one type of textual feature. Many are limited to grammatical aspects or various measures of length. Studies which incorporate only, or predominantly, syntactic features are most likely going to produce results which indicate that syntactic features of texts are the most significant features determining textual quality. It is, therefore, desirable to incorporate as comprehensive a list of textual features as possible in this type of analysis. This study may be useful since a fairly wide variety of features is incorporated, perhaps giving a less "biased" view of textual quality. Second, the use of SEM, a methodology not previously applied in investigations of the relative significance of textual features, may prove to be useful in a number of ways.
The theoretical bases for SEM are preferable to those of other multivariate statistical analyses. Unlike regression, SEM does not assume that measurement errors are equal, a difficult assumption to make when dealing with many different types of measures such as those incorporated in the proposed research. Furthermore, SEM permits correlated errors, which might be expected when the same raters evaluate multiple aspects of texts. Finally, SEM is able to estimate relations among variables at the same time rather than estimating the significance of one variable at a time. This provides the advantage of yielding parameter estimates relative to a complete, coherent model. SEM also allows the researcher to incorporate different types of measures (ordinal measures, such as ratings, and interval measures, such as frequency counts) in one coherent analysis. The ability to incorporate both types of measures provides the researcher with the ability to perform a multidimensional textual analysis not limited to frequency counts. This represents an advantage in that, while the frequency of occurrence of textual features may be an important aspect of texts, there are probably other features which are also important but which cannot be reduced to frequency counts. While the same argument may be made with regard to any attempt to quantify textual characteristics, the possibility of including ratings in an analysis along with interval measures expands the range of features that can be examined. SEM and the advantages that it offers to discourse analyses are discussed in more detail in Chapter III. Perhaps another valuable aspect of this study is the definition of textual dimensions that remain ambiguous. I have already referred to the difficulty of defining dimensions such as content.
One of the tasks of this study is to describe the relationship between features of rhetorical abstraction and other textual dimensions incorporated in the study, in a sense defining the dimensions. For example, the relationships between rhetorical abstraction features (data use, consideration of causes and examination of premises/issues) and content are investigated. Results indicate whether these features play a part in raters' evaluations of content. Although the ESL Composition Profile may give some indication of the meaning of content, the terms used in the document are relatively vague and not specifically related to argumentation. The results of the study may be useful in identifying aspects of content, as well as other more ambiguous terms, such as organization. This information may, in turn, be useful in test development and in coordinating the assessment and instructional arms of language programs. The study may also have implications for the testing of students' writing abilities. These implications are associated with various conceptualizations of the validity of writing tests. One implication is related to the construct validity of writing tests. For a writing test to have construct validity, it must be consistent with a theoretical understanding of the writing abilities which are purportedly measured. The theoretical understanding of abilities is reflected in the rating criteria that supposedly guide raters in their evaluations. Another implication is related to the pedagogical validity of the test. The concern with pedagogical validity is that the testing procedures should be reflective of classroom instruction. The promotion of planning and revision activities and the specification of a rhetorical context are important considerations for both testing and teaching.
In addition to being important for both contexts, I would argue that such considerations should receive similar attention in both contexts. If, for example, specification of an audience and a purpose for writing are considered important in teaching writing, they should also be incorporated in assessments of writing as well. Similarly, the focus of evaluations and instruction on particular features of writing should be similar. For instance, if a predominant focus is placed on grammatical aspects of writing in classrooms, the same focus should be observable in assessments for the same program. This would be expected for accurate placement and effective instruction of students.8 Investigations of both types of validity may be made possible by the results of the study. Results will indicate the relative significance of various textual dimensions in determining the overall quality of students' protocols. The indications of relative significance can be compared with theoretical understandings of adequate academic writing and with classroom instruction, providing evidence which can be used to judge the validity of writing assessments. Results from the study may also be useful in rater training once validity is investigated. If raters' behavior does not appear to be consistent with explicitly stated criteria, additional rater training may be called for. If, for example, raters appear to place more importance on grammatical aspects of protocols than appears to be warranted by the stated criteria, raters may be trained to give less attention to grammar or give more attention to other textual features.
Likewise, if raters seem to base their evaluations on textual features which are not part of the criteria, training could be implemented to make ratings more consistent with criteria. The results of the study may also have applications for the assessment of literacy levels, including readability estimates of texts. Readability has long been based solely on measures of word/clause/sentence length and frequency counts of syntactic features of assumed relative processing difficulty. Although these measures have been considered valid indices of readability, their validity is questionable. Literacy experts (Davison and Green 1988, Horning 1993, Kintsch 1988, Krashen 1988; 1993) have called for investigations of other determinants of readability, including rhetorical features. The study will provide a comprehensive model of textual quality which will indicate the relative significance of textual dimensions in describing the overall quality of texts. This information may be useful in supporting assertions regarding the importance of rhetorical and semantic features in estimates of the readability of texts.
1.6. Limitations of the Study.
A number of limitations are inherent in research involving the statistical analysis of the relative importance of textual features to raters. These limitations are discussed here. Limitations of the specific methodology used in this study are discussed in IV.12. The discussion that immediately follows focuses on the necessary limitations of text types and textual features, as well as the limitations imposed by statistical analyses of linguistic data in general. Researchers interested in academic discourse have pointed out that there are many types of discourse used in academic contexts (Bizzell 1992, Bridgeman and Carlson 1983, Jolliffe 1988, Nash 1990a, Swales 1991).
In order to conduct this study, it was necessary to limit the types of discourse to be analyzed. I have chosen to analyze a type of academic written discourse which I describe in terms of genre and rhetorical mode. The genre of interest in the study is de Montaigne's essai, which is commonly taught in modern composition courses as essay writing and can be observed in extended forms identified as screening papers, qualification papers, or theses and dissertations (van Peer 1988; 1990).9 My reasons for choosing the essay as the object of analysis include the pervasiveness of this text type in Western academic discourse (Berlin 1984; 1987, van Peer 1990).10 Perhaps most importantly, the essay is often used to decide whether students should advance academically and who should receive financial awards. It is commonly used to test students' writing abilities as a key to admission to academic programs and as an indication of academic achievement. Forms of the essay are also key to gaining fellowships and scholarships. I have further restricted the choice of text type to be analyzed by limiting the analyses to argumentation. For this study, I have defined argumentation in terms of the purpose of a text, namely to defend a policy or belief (Kinneavy 1971). Argumentation is further defined as a process of advancing, supporting, modifying and criticizing claims so that the targeted audience may grant or deny adherence (Perelman and Olbrechts-Tyteca 1969, Kinneavy 1971), an inducement to some type of action, be it intellectual, emotional, or physical (Kinneavy 1971). Bridgeman and Carlson (1983) have found argumentation to be a prevalent discourse type in academic programs. This observation is corroborated by others (Fitzgerald 1988, Hamp-Lyons and Heasley 1987, Hughey 1990, Johns 1986; 1991).
Another reason for choosing argumentation is that its pervasive presence in society in general (Hays and Brandt 1992, Johns 1991, Kinneavy 1971) may provide a basis for the potentially wide application of the research. With an understanding of difficulties due to "field" variation, as Toulmin (1958) discusses, it is hoped that the results can be readily applied to other types of argumentation. I argued earlier that a comprehensive set of textual features should be incorporated in the study. Given the fact that there are thousands of textual features that could be incorporated, a comprehensive analysis is impossible for this study. The dimensions chosen for inclusion in the study (content, organization, language use, vocabulary, and mechanics) are assumed to represent a relatively comprehensive list of features. It was necessary to restrict the number of rhetorical abstraction features also. There are other aspects which could be investigated, aspects such as abstractness of topic (Berthoff 1986, Davison and Green 1988), aspects associated with referencing outside sources (Campbell 1990, Greene 1993, Keech 1984), frequency of the presentation of new information (Spiro and Taylor 1987), audience address (Black 1989, Connor 1987; 1990; 1991, Flower and Hayes 1980, Hays and Brandt 1992, Kirsch and Roen 1990, Sandell 1977) and use of metadiscourse features (Berthoff 1984, Nash 1990b, Zwicky 1982). I have limited my investigation to the seven rhetorical abstraction features identified and defined in II.3.3. These seven features were chosen for several reasons. They have been associated with the hierarchical organization of propositional material in argumentative writing (Berthoff 1986, Weasenforth 1993, Winterowd and Gillespie 1994). They were present in essays written for this study.
Also, in a previous study, the measurement scales for these seven features performed well in several respects (Weasenforth 1993), providing an adequate psychometric basis for this study. One should question the use of one set of ratings to identify the object of evaluation in a second set of ratings. The textual features being evaluated by raters may not be clear for either set. I have assumed that the nature of the holistic ratings is unclear; it is not certain what the raters were looking at when they assigned their scores. I also assumed that the nature of the second set of ratings, ratings from applications of the ESL Composition Profile, was clear. That is, I assumed that when raters assigned a score for content, for example, only features associated with content as the scale indicated were evaluated. Although this generally seemed to be true during the training sessions, raters, in fact, may not have followed the scales exactly as directed. Deviations from directions in the scales would obviously confound the results of this study. As with any statistical analysis, disappearance of the data represents a potential limitation. I use this term to refer to the distancing, by means of multiple layers of analysis, of the observer from the object of interest. This is an issue which Chomsky (1986) has referred to in his criticisms of statistical analyses of linguistic data. With each layer of analysis (the assignment of scores, the transformation of scores to meet distributional requirements, the application of bootstrapping to meet sample size restrictions, the calculation of correlation/covariance matrices and the estimation of parameters based on those matrices), the texts being analyzed can become lost to sight rather easily.
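The bootstrapping layer just mentioned can be illustrated in miniature: essays are resampled with replacement and a statistic is recomputed on each resample, yielding a distribution of estimates. The sketch below bootstraps a Pearson correlation between two sets of ratings; the score vectors and sample sizes are invented for illustration and are not data from this study.

```python
import random
import statistics

def pearson_r(x, y):
    # Pearson product-moment correlation for two equal-length score lists
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def bootstrap_r(x, y, n_boot=1000, seed=0):
    """Resample (x, y) pairs with replacement; return correlation estimates."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(pearson_r([x[i] for i in idx], [y[i] for i in idx]))
    return estimates

# Hypothetical holistic and content ratings for ten essays
holistic = [70, 82, 65, 90, 75, 60, 88, 72, 79, 85]
content  = [68, 80, 70, 92, 74, 58, 85, 75, 77, 83]
ests = bootstrap_r(holistic, content)
```

The spread of `ests` indicates how stable the correlation estimate is under resampling, which is exactly the kind of derived quantity that distances the analyst from the texts themselves.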
In generalizing results across a wide variety and number of textual features as well as across a fairly large number of essays, it is easy to lose sight of the more specific observations that can be made. Similarly, it is easy, and convenient, to assume temporarily that all raters interpreted scales similarly. It is, likewise, easy to lose sight of the variations in the deployment and functions of the various features across texts. This problem has been mitigated to some degree, I hope, through references to comments given by raters during rating sessions and through the revision of sample texts. It should be pointed out that statistical procedures are normally used to identify general trends in data, often at the expense of specific deviations from the observed trend. While data which deviate from the trend can be identified and the deviations estimated, the predominant concern in the application of statistical procedures is to describe a body of data at a general level. Because the impetus of this study was to provide a general description of the relationship between textual dimensions and overall textual quality, statistical analyses seemed to be an appropriate, objective method of deriving the descriptions. The validity of assigning numerical values to linguistic constructs may also be questioned. To reduce the substantive content or any other textual characteristic to a single descriptive digit could be seen as meaningless. This problem has been partially addressed in the study through rater training and periodic reviews of ratings to guarantee consistency of judgments. Nonetheless, one may argue, the textual dimensions are defined with reference to various textual features. Content, for example, is associated with use of reasoning and the interrelation of ideas, to mention just two related features.
In fleshing out the argument, one may question whether a particular rating for content represents an evaluation of the use of reasoning or of the interrelation of ideas or of both features. One may further question what these two features mean. Although part of the study entailed an identification of the features associated with content and other textual dimensions, it is true that the intent of the raters in assigning scores is not entirely clear. This does, in fact, impose limitations on the interpretations of the ratings. There is evidence, however, which indicates that the ratings are not meaningless. The relatively high levels of rater consistency indicate that raters were identifying the same features and evaluating them consistently. The use of statistical analyses, particularly complex analyses such as those used in this study, may also limit the accessibility of a study. The complex mathematical calculations and the long list of theoretical assumptions underlying them can be intimidating for those not familiar with the procedures. I will discuss this limitation with regard to SEM in particular in Chapter III. For readers who have limited knowledge of SEM, I have provided a general introduction to the procedures involved in SEM analyses in Chapter III. I have also provided a description of how SEM was applied in this study in Chapter IV.
Notes
1. If raters are coerced into producing highly consistent evaluations of texts, a number of questions are raised with regard to the validity of the test. To what extent is the authenticity of the test compromised? That is, to what extent does the test become artificial in the sense that it reflects less clearly the types of evaluation that are made in the target language use context? What amount of consistency is acceptable? A number of administrative questions are also raised.
What amount of time and money is to be spent in bringing raters to the level of consistency that is demanded? How will attempts to maintain consistency be affected by variations in topics, rhetorical mode, test taker population and changes in the curriculum?
2. Test validity and reliability also have real-world importance for test takers. Raters' judgments may determine whether and when a student successfully achieves a degree, whether a potential student is admitted to higher education and whether he is admitted to a desired school. Such decisions which are based on raters' judgments have implications for test takers' professional life, finances and access to particular communities. Bachman (1990) and Bachman and Palmer (forthcoming) discuss the implications of testing at some length.
3. I recognize that the construct academic writing proficiency is based on the assumption that a discourse community related to academic programs can be identified but lack the space to discuss this assumption here. (See Swales 1987 and Swales 1991 for a discussion of defining discourse communities.) I also recognize that such a construct must necessarily consider a number of factors, including (but not limited to) academic field of study, academic status and requirements of particular institutions. Some of these factors have been addressed in descriptions of academic discourse which are reviewed here. The limitations I have chosen to impose on this study and which I will discuss below may belie the understanding that we are not dealing with a simple, monolithic construct.
4. The argument for local relativity of educational assessment has gained more currency in recent literature (Darling-Hammond 1994, Haswell and Wyche-Smith 1994).
The scope of a test's standardization, I would argue, depends a great deal on the purpose of the test. Whereas classroom assessment is often used for diagnostic purposes, institutionally standardized tests are more often used to describe student achievement. Given this difference in focus, perhaps classroom assessment should be more situationally dependent than institutionally standardized tests. I would further argue that the two types of tests could, and in some situations should, coexist. While classroom standardized tests may aid in the delivery of effective instruction and in the delivery of useful feedback to students within the classroom, tests standardized on a wider population may be more useful in assuring that the promotion of students from one level of instruction to another holds similar meaning from one institution to another. I do not see the establishment of a model as being inconsistent with the relative focus of assessments. A model would provide general guidelines on which more specific assessment needs could be based.
5. Studies of contrastive rhetoric attest to culturally defined differences in the rhetorical organization of texts (Connor and Kaplan 1987, Kaplan 1966; 1988). It has been demonstrated that these differences can be significant in determining the overall quality of texts (Connor 1987; 1991, Ferris 1991).
6. I distinguish the terms textual features and textual dimensions in terms of discreteness. That is, features are relatively individually distinct, while dimensions are more broadly defined and may incorporate a number of features in their composition. In this dissertation, I have consistently used features to refer to syntactic and lexical features, as well as the seven features of rhetorical abstraction.
7.
One questionable aspect of Biber's work and work similar to his is the assumption that frequency counts of textual features are useful in defining textual characteristics. Partially due to this assumption, the interpretations of Biber's textual dimensions are questionable. Couture (1986), for example, suggests that Biber's work may actually define register variations rather than textual dimensions as described by Biber. In fact, whether the dimensions actually define any textual characteristic is open to debate. Given the fact that many features were not included in Biber's analyses and the understanding that aspects of the features, such as functional variation and salience, were not investigated (Halliday 1989), it is not clear what Biber's dimensions actually represent.
8. I would like briefly to elaborate this discussion by making two points. First, the focus of evaluations in terms of textual features may shift across proficiency levels. Grammatical aspects may play a more important role in determining ratings of novice students' writing than in ratings of more advanced students' writing. The shift in focus across proficiency levels was not investigated in this study. Second, I am not arguing that all students should be taught in the same way. Instead, I am arguing that the general thrust of an instructional program should be reflected in the program's assessment procedures.
9. This list is not intended to be exhaustive. Essai writing may also include professional journal articles, abstracts, exam papers and some editorials, to name but a few other examples.
10. A more detailed description of the genre is provided by Berlin (1984; 1987) and van Peer (1988; 1990).
CHAPTER II
Review of Literature: Discourse Analysis
II.1. Overview.
This chapter is devoted to a review of the literature of two areas of research which are relevant to the endeavors described in the previous chapter. I will begin this discussion with a summary of theoretical work and empirical research in the field of written discourse analysis. More specifically, I have culled out a body of work related to investigations of written discourse structure. The main focus of this part of the discussion will be on investigations of rhetorical abstraction as an aspect of structured, "planned" discourse. The discussion will also include a summary of Toulmin's theoretical work on the structure of argumentation and an explanation of the relevance of his theory to the study. Since the study is also related to facet theory as discussed in the language testing field, a review of this body of literature will also be provided. A number of researchers have demonstrated that various facets (i.e., aspects, features or characteristics) of texts may influence readers' evaluations of the overall quality of texts. This literature indicates that features related to the structuring of texts, as well as other textual characteristics, can affect readers' evaluations of textual quality. The influence of textual features on readers is of great concern to language testing experts who seek to minimize measurement error in evaluations of writing performances of test takers. This discussion will include a brief overview of facet theory and a more extended review of literature relevant to the textual dimensions analyzed in this study.
II.2. Discourse Structure.
An important line of discourse research has involved descriptions of structures or patterns of discourse, the choice of terms depending on one's convictions about the extent to which the shape of discourse can be predicted (Hoey 1991b). As Hoey points out, few analysts adhere to the strict view of written discourse as astructural. Although one may be inclined to argue this viewpoint when faced with occasional texts which are impenetrable due to their amorphous character, there is general agreement that written discourse takes on some form. There are, however, strong and weak versions of the theory of text structure. Weak views of text structure are characterized by relatively less strict associations of textual features. Hoey's (1991a; 1991b) own work and that of Hasan (1984) exemplify this view of discourse structure. Both researchers identify semantic associations among lexical and grammatical features of texts to form semantic networks which partially define the coherence of texts. Halliday and Hasan (1976) and Schiffrin (1987) similarly suggest that the identification of explicit cohesion markers will in part account for the reason texts seem to hold together. Other proponents of a weak view of text structure have sought to describe the linkage of semantic aspects of texts rather than tracing links among explicit lexical and grammatical features. Connor (1987; 1990; 1991), Connor and Lauer (1985; 1988) and Lautamatti (1987) have investigated the role of topicalization in determining text structure. Their work entails the identification of various types of topical material in texts and the description of how these materials appear to be tied together to form coherent texts.
Similarly, much work has been devoted to the ordering of given-new information (Chafe 1982, Cook-Gumperz and Gumperz 1981, Givón 1984, Gumperz, Kaltman and O'Connor 1984). This work describes how the progression from shared information to unshared is accomplished in order to create coherent texts. The strong view of discourse structure incorporates a more rigid depiction of the organization of texts. Tagmemists, for example, viewed all linguistic production as hierarchically layered binary units (Pike 1982, Pike and Pike 1983, Young, Becker and Pike 1970). The psychologist Crothers (1979) devised and demonstrated a descriptive framework for inferential aspects of discourse, positing a hierarchical ordering of inferences. Van Dijk (1977), Kintsch (1974; 1986; 1988) and van Dijk and Kintsch (1983) have proposed an analytic system which incorporates their view of discourse as hierarchically ordered linguistic units. Studies of discourse types or descriptive dimensions of discourse have also been based on the relative amount of structuring observed in texts. Distinctions in text types have been expressed in terms such as "elaborated" vs. "restricted" codes, "oral" vs. "literate" discourse, or "spoken" and "written" modes. These terms are used in referring not only to the mode of production, but also to the organization of the discourse. A notion behind the dichotomy of oral and written language in some research is based on an assumption that oral language is informally organized while written language, on the other hand, is formally structured (Brown and Yule 1983, Chafe 1982; 1985; 1986, Chafe and Danielewicz 1986, Chafe and Tannen 1987, Cook-Gumperz and Gumperz 1981, Gumperz, Kaltman and O'Connor 1984, Tannen 1982a; 1982b; 1985). In some respects the assumed dichotomy between oral and written language is valid.
The amount of time generally needed for production and the reliance on prosodic features, for instance, distinguish the oral and written forms of language fairly clearly. It is not the case, however, that all spoken language is informally organized. Many forms of spoken public presentation are very formally structured. Similarly, some forms of written language, including many personal letters, are relatively loosely organized. The formal/informal aspects of language variation can perhaps be more accurately captured by Ochs' (1979) distinction between "planned" and "unplanned" discourse, examples of which can be observed in both the oral and written modes.1 Ochs describes the differences between the two types of discourse in terms of syntactic features, such as left dislocation, conjoined and embedded clauses, phonological and syntactic parallelism, as well as chances for repair and "nextness," that is, the marking of clausal relationships through the use of syntactic and other explicit cohesive ties. A similar distinction in discourse types is that of "involvement" and "detachment" as discussed by Tannen (1985) and Chafe (1982). The distinction drawn here is based on the relative extent to which a speaker or writer can rely on immediate context in order to communicate. Tannen has defined these concepts in terms of syntactic features and qualitative differences in communication. For example, she describes "involved" discourse as a variety in which more discussion of personal experience takes place and in which inferencing is relied on more often. "Detached" discourse, on the other hand, is characterized by less personal content and greater explicitness.
Chafe (1982; 1985; 1986) and Chafe and Danielewicz (1986) describe "detached" discourse in terms of other syntactic features, including relatively frequent occurrences of relative clauses, nominalizations and passive voice. In contrast, Chafe describes "involved" discourse in terms of the type of content communicated as well as frequently occurring syntactic features. The syntactic features which he suggests occur with relatively high frequency in "involved" discourse include first and second person pronouns and lexical features, including hedges. He also suggests that in "involved" discourse, more details are used, more repairs occur, and feelings and thoughts are more frequently reported. Bereiter and Scardamalia (1987) have devised a model of discourse production based on a similar distinction in discourse types. Their "knowledge telling" is similar in notion to "involved" discourse, describing a less formally structured form of discourse. They point to preschool and early elementary school children's speech as examples of "knowledge telling" discourse, contrasting this with the more formally structured discourse, "knowledge transforming," which is associated with academic training. In a series of experiments with elementary school children, they found evidence which indicated that older children organized text with reference to macrostructural (van Dijk 1977, van Dijk and Kintsch 1983) features while younger children pieced together text in a linear fashion without reference to the overall organization of the texts. Tannen and Chafe likewise associate the "detached" type of discourse with academic training. Schieffelin and Ochs (1986) and Ochs (1979) associate the notion of "planned" discourse with academic writing in terms of socialization.
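Tallies of surface markers like those Chafe associates with "involved" discourse (first and second person pronouns, hedges) are the raw material of frequency-based analyses of this kind. The sketch below is only illustrative: the marker lists are invented stand-ins, not a validated feature inventory, and real studies would work from a tagged corpus.

```python
from collections import Counter

def feature_counts(tokens, feature_sets):
    """Tally how often tokens from each (hypothetical) marker set occur.

    tokens: list of lower-cased word tokens from a text
    feature_sets: dict mapping a feature name to a set of marker words
    """
    counts = Counter()
    for tok in tokens:
        for name, markers in feature_sets.items():
            if tok in markers:
                counts[name] += 1
    return dict(counts)

# Illustrative marker lists (assumptions, not a validated taxonomy)
features = {
    "hedges": {"perhaps", "probably", "may"},
    "first_person": {"i", "we", "my", "our"},
}
text = "perhaps we may argue that our claims are probably sound".split()
counts = feature_counts(text, features)
```

Per-text counts of this kind, normalized for text length, are what factor-analytic approaches such as Biber's take as input.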
Biber and Finegan's (Biber 1984; 1988; 1992, Biber and Finegan 1989a; 1989b) work represents another approach to identifying descriptive dimensions of discourse. Through the use of factor analytic procedures, Biber (1988) identified seven dimensions of textuality and labeled them with reference to the relative frequency with which selected syntactic and lexical features co-occurred in texts. Among other dimensions identified in his study, Biber's analysis yielded common factors related to the distinction described by Tannen, Chafe, and Bereiter and Scardamalia as summarized above. He defines "informational" discourse (characterized by frequent occurrences of nouns, prepositions and attributive adjectives, high type/token ratios and longer words) as carefully crafted and highly edited. Discourse in which information is presented in a loose, fragmented manner is represented by a common factor labeled "on-line informational elaboration" and is defined by the frequent co-occurrence of that complements, demonstratives, that relative clauses in object position and that clauses used as adjectival complements. A fairly large body of literature has identified syntactic features, lexical features and measures of length associated with "planned" discourse. There are inconsistencies in the results of the studies, but a common finding is the correlation of longer T-units with older writers (Crowhurst 1980, Hunt 1964; 1965, Martinez San Jose 1973, Nietzke 1972, Rosen 1969). Specialized vocabulary has been associated with planned, academic writing (see, e.g., Huckin 1983, Reid 1990). Subordination has also often been associated with relatively highly planned written discourse (Hunt 1964; 1965, Crowhurst 1980).
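The length and variety measures cited in this paragraph (type/token ratio, word length) reduce to simple tallies. A minimal sketch, using a regular-expression tokenizer as a crude stand-in for the careful T-unit and word segmentation such studies require:

```python
import re

def length_measures(text):
    """Compute simple length/variety measures of the kind cited above."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    return {
        "tokens": len(words),                                      # total words
        "types": len(set(words)),                                  # distinct words
        "type_token_ratio": len(set(words)) / len(words),          # lexical variety
        "mean_word_length": sum(len(w) for w in words) / len(words),
    }

m = length_measures("The essay argues that the essay form dominates academic writing.")
```

Note that the type/token ratio is sensitive to text length, which is one reason such indices, taken alone, give a limited picture of textual quality.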
One might expect features of deliberate, highly planned discourse to be observable at other levels of language as well, particularly at the rhetorical level. In fact, discourse analysts (Akinnaso 1982, Biber 1992, Kaplan 1991, Shuy 1981) have called for an expansion of research in discourse types to include rhetorical features. It has even been suggested that the rhetorical features of text may play a more significant role in defining textual characteristics than do syntactic features (Flower 1979, Carrell 1987, Horowitz 1987, Kaplan 1982; 1991, Kintsch 1986; 1988, Smith 1985, van Dijk 1990).

II.2.1. Rhetorical Aspects of "Planned" Discourse.

Rhetorical abstraction is defined in this study as the hierarchical organization of propositional content in the form of functionally related components of argumentative discourse. I would like to review theoretical discussions and empirical studies--from the fields of rhetoric and linguistics--related to rhetorical abstraction as defined in this study. The review will focus on approaches to describing the hierarchical ordering of propositional content. I will also point out the following aspects of the approaches: assumptions of the directional nature of abstraction, the nature of text comprehension and constraints placed on descriptions of propositional structuring of text.

II.2.1.1. Theoretical Discussions.

Working in the field of rhetoric, Christensen (1965; 1967) developed procedures for describing the hierarchical organization of propositions in a text, suggesting that such organization is an index of structured discourse.
He proposed that propositions of well planned discourse spanned multiple levels of generality so that more concrete information is contextualized by more general information, and general information is associated with more concrete information in the form of examples and other more specific types of information. Christensen identified three types of relationships that obtain between pairs of sentences: subordinate, coordinate and superordinate. He limited his analysis to paragraphs and used the sentence as the unit of analysis. He also assumed that the sequence of sentences limited the relationships that could be observed in a text, so that his analyses traced movement from one level of generalization to another by looking only at adjacent pairs of sentences. As an example of his analysis, Christensen analyzed the following paragraph written by Kenneth Tynan:

1) 2 In Spain, where I saw him last, he looked profoundly Spanish.
2)    3 He might have passed for one of those confidential street dealers who earn their living selling spurious Parker pens in the cafes of Malaga or Valencia.
3)       4 Like them, he wore a faded chalk-striped shirt, a coat slung over his shoulders, a trim, dark moustache, and a sleazy, fat-cat smile.
4)       4 His walk, like theirs, was a raffish saunter, and everything about him seemed slept in, especially his hair, a nest of small, wet serpents.
5)    3 Had he been in Seville and his clothes been more formal, he could have been mistaken for a pampered elder son idling away a legacy in dribs and on drabs, the sort you see in windows along the Sierpes, apparently stuffed.
6) 2 In Italy he looks Italian; in Greece, Greek; wherever he travels on the Mediterranean coast, Tennessee Williams takes on a protective colouring which melts him into his background, like a lizard on a rock.
7) 2 In New York or London he seems out of place, and is best explained away as a retired bandit.
8)    3 Or a beachcomber; shave the beard off any of the self-portraits Gauguin painted in Tahiti, soften the features a little, and you have a sleepy outcast face that might well be Tennessee's. (1965: 140)

In Christensen's analysis, the shifting indentation and the number immediately preceding each sentence represent the "structural level" (i.e., the level of generality) of the sentence. None of the sentences in this paragraph is marked with a 1 because Christensen found no topic sentence in the paragraph, the type of sentence marked with a 1. All sentences marked with the same number (n) are semantically coordinate--or "structurally" equal--unless a sentence which is marked with a larger number intervenes. Sentences marked with a larger number (n + 1, n + 2, etc.) are semantically subordinate to sentences which precede them and which are marked with a smaller number. One must question the decision to take the sentence as the unit of analysis as well as the decision to partition text into binary segments. I present this discussion and the example to help identify the construct in which we are interested. Christensen's work represents one attempt to describe the hierarchical ordering of propositional content in a text, the main focus of this study.

Christensen's work was expanded by Pitkin (1977a; 1977b), who identified levels of generalization in terms of relationships between propositional material in texts and devised a system for graphically displaying the relationships and the levels of generalization. He points out the limitations of working only at the paragraph level, but maintains the assumptions that relationships do not span sentences and that sentences represent only one proposition.
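Christensen's level numbers can be read as instructions for building a tree: each sentence attaches beneath the most recent sentence bearing a smaller number, so equal numbers are coordinate and larger numbers are subordinate. A minimal sketch of recovering that hierarchy (my own illustration, with the Tynan sentences abbreviated):

```python
# Recover a hierarchy from level-annotated sentences: each sentence
# becomes a child of the most recent sentence with a smaller level.

def build_hierarchy(annotated):
    """annotated: list of (level, sentence) pairs in text order."""
    root = {"level": 0, "text": "(paragraph)", "children": []}
    stack = [root]
    for level, text in annotated:
        # Pop coordinates and deeper levels until a shallower node is on top.
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"level": level, "text": text, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root

# Abbreviated paraphrases of the first six Tynan sentences and their levels.
tynan = [(2, "In Spain he looked profoundly Spanish."),
         (3, "He might have passed for a street dealer."),
         (4, "Like them, he wore a faded shirt."),
         (4, "His walk was a raffish saunter."),
         (3, "He could have been mistaken for an elder son."),
         (2, "In Italy he looks Italian.")]

tree = build_hierarchy(tynan)
# The two level-2 sentences surface as coordinate children of the paragraph.
print([child["text"] for child in tree["children"]])
```

Applied to the annotated paragraph, the two level-2 sentences emerge as coordinate, with the level-3 and level-4 sentences nested beneath them, mirroring Christensen's indentation.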
One of the most important contributions of Pitkin's work, however, is that of defining the hierarchical organization of propositions in texts in terms of functional relationships rather than in terms of semantic generality. The more nebulous distinction of levels of generalization is one of the stumbling blocks of much of the empirical research presented below.

Nold and Davis (1980) discuss what they believed to be a pedagogically useful system for analyzing writing. They adopted Christensen's distinction of coordinate, subordinate and superordinate relations, applied it to the labeling of T-units, and stated rules of movement which they claimed described the propositional development of a text. They also defined distinctions in levels in terms of semantic generality and assumed connections of adjacent segments only. A significant aspect of their research is the relation of Christensen's work to the psycholinguistic work of Kintsch.

In spite of his understanding of the dynamic structure of text, Coe (1988) proposes an analytic system which traces movement in terms of generalization between adjacent clauses. The major contribution of his work is the refinement of Nold and Davis' model so that a three-dimensional diagram is not needed. His model also is not as limited to a strictly linear analysis as is that of Nold and Davis. Instead, strings of clauses are tied to superordinate propositions in a tree diagram.

A number of other rhetoricians (Berthoff 1984; 1986, Winterowd 1986, Shaughnessy 1977a; 1977b) have provided descriptions of rhetorical abstraction based on observations of students' essays. Winterowd's (1986) distinction
between propositional (highly abstract) and appositional (highly concrete) writing is meant to describe what I will look at as two types of abstract text--that which presents highly general information and that which is weighted with very specific information. As Berthoff (1984; 1986) points out, it is a misconception to equate abstraction with generalization. It is possible to abstract from generalities to draw relationships to specific observations just as it is possible to abstract from specific information to form generalities. Winterowd (1986) suggests that students should be taught to organize details by defining general patterns and to tie down generalizations to details and examples. However, his assertion that abstract texts are characterized by an explicit topic sentence and organizational rigidity is, I believe, misleading. It is not necessarily true that an abstract text, as Winterowd has defined it here (i.e., text with highly general content), will contain an explicit topic sentence or "rigid" organization.2 Furthermore, his description of abstractness through the use of nebulous terms such as "background/foreground style" and "presence" is ambiguous and consequently not very helpful.

Taking a more cognitive approach and aiming to describe the hierarchical ordering of types of propositions rather than specific propositions, Moffett (1969) produced a taxonomy of discourse types which he proposed to use in tracing children's development of writing skills. He drew on the notion of distance of the writer from his topic and the Piagetian theory of puerile egocentrism to posit a hierarchy of these discourse types. Connecting his hierarchy of propositional organization with cognitive ability, he developed a scale of indices of rhetorical abstraction (see Table II.1 below).
Table II.1
Moffett's Levels of Abstraction

Level          Rhetorical mode           Treatment of topic
Recording      Drama                     Record what's happening
Reporting      Narration                 Describe what happened
Generalizing   Exposition                Describe what happens
Theorizing     Logical Argumentation     Speculate about what may happen

Moffett hypothesized that beginning writers exhibited an egocentric approach to addressing a topic, simply recording data. As the writer matures, according to Moffett's theory, the writer becomes capable of reporting data, then of generalizing observed data to report patterns and finally of theorizing about the nature of observed patterns in data. Moffett adheres to a linear relation between these text types and cognitive ability. He views discourse types as reflections of progressive stages in a writer's development and tends to describe the discourse types as occurring in isolation from each other. His hierarchy unfortunately implies that abstract thought is not present in the production of "less abstract" texts, such as drama and narration. In spite of these limitations in his theory, Moffett's work represents a noteworthy pioneering attempt to describe rhetorical abstraction.

Within the field of linguistics, the basis of tagmemic work (Gray 1977, Pike 1982, Pike and Pike 1983, Young, Becker and Pike 1970) is the hierarchical organization of communication. All communication is seen as structured from small blocks of linguistic matter--such as phonemes--which are combined into blocks of morphemes and so forth until propositional tagmemes are combined into text, which in turn fits into cultural and cognitive tagmemes. Texts are decomposed into increasingly discrete units of analysis through binary segmentations of larger units into nuclear and marginal material.
Grimes (1975), drawing on the tagmemists' notion of slot grammar and borrowing the transformational grammarians' descriptive device, phrase structure rules, proposed a system for use in describing the structure of discourse. Following the lead of the tagmemists, he assumed that discourse should be broken into binary segments, propositional predicates and arguments, between which functional relationships are identified. Also as in tagmemic analyses, a text is broken down into a hierarchically organized set of pairs of propositions.

Hinds (1979), like Christensen, posited a hierarchical organization of propositional material and limited his view to the paragraph. He argues that many types of discourse--including narrative, procedural and expository discourse--are linearly and hierarchically organized. He uses tree diagrams, like those of the tagmemists and Grimes, to segment increasingly discrete, functionally related bits of propositions.

Selinker, Trimble and Trimble (1976; 1978), concerned about the comprehension difficulties that their ESP students had in reading technical English texts, investigated the source of their difficulties. They determined that the students were able to process the syntax and the lexical items easily, but the rhetorical development of the texts threw them off track. In response to these findings, the researchers developed a hierarchically ordered taxonomy of rhetorical functions observed in technical texts. This taxonomy represented, with varying specificity, the hierarchically ordered propositional content of the texts.

A number of researchers have investigated the structuring of texts in terms of macrostructures--propositional elements representing the gist of texts (Kintsch 1974; 1986; 1988, Meyer 1987, van Dijk and Kintsch 1983).
Discourse has been described by these researchers as branching, hierarchically ordered sets of propositions. Psycholinguistic work indicating that recall of macrostructures is greater than recall of more specific information seems to support the hypothesis that readers process texts as hierarchically structured sets of propositions (Frase 1969, Johnson 1970, Singer et al. 1992, Speelman and Kirsner 1990).

Mann and Thompson (1987; 1988) have developed an elaborate theory, Rhetorical Structure Theory, to describe the propositional organization of texts. They assume a simple binary segmentation of texts into text spans and the identification of nuclear (i.e., inclusive, relatively general information) and satellite (i.e., more specific information) spans within adjacent pairs. They hypothesize that the order of spans is determined generally by the functional relationship that obtains between the spans. Like the work of Pitkin and Coe, they neglect the dynamic network of semantic links within a text that Hoey (1991a; 1991b) has clearly demonstrated.

Hoey (1991a; 1991b) has worked extensively in developing an analysis of the organization of propositional content of texts by identifying semantic networks marked by lexical features. His analyses take into account the dynamic nature of readers' reconstruction of texts and are not limited to segments of texts.

II.2.1.2. Empirical Research.

I have summarized empirical research that has been done to investigate rhetorical abstraction as a predictor of holistic scores of compositions (see Table II.2 below). I will elaborate on this information below.

Britton et al. (1975) adapted Moffett's (1969) model of abstraction (see Table II.3 below) in a study involving the evaluation of 2122 pieces of writing from 65 secondary schools in Britain.
The writing was done by first, third, fifth and seventh grade elementary children and represented an informative, transactional (i.e., "to get things done... concerned with an end outside itself") type of writing. The researchers distinguished this type of writing from persuasive writing. They employed three raters, two writing teachers and one member from the research team. Each rater classified each piece of writing according to the highest level of abstraction observed in the piece. In analyzing the distribution of ratings across grade levels, the researchers found that more children at higher grade levels used more "higher level" (i.e., speculative and tautologic) abstraction than did the children at the lower grade levels; they attribute this to the cognitive maturity of the older children.

There are a number of problems with Britton et al.'s study and some methodological issues which the researchers did not address. Britton et al. assume--an assumption implicit in other studies (Dilworth, Reising and Wolfe 1978, Caplan and Keech 1980) and explicitly stated in Freedman and Pringle (1980)--a direct correlation between levels of abstraction and writer ability.3 The assumption that higher levels of generalization define greater cognitive ability or better writing is misleading (Berthoff 1984; 1986); it is as important to tie generalizations to specific information as it is to organize details and examples around general principles. It is not clear how raters distinguished among descriptors used in the scale. It is not clear, for example, what constituted a "generalization" or "patterns of generalizations." It is also not clear to what extent the type of data used affected their results. Their data included student assignments from different fields of study, from different schools, on different topics and presumably in different rhetorical modes. Also, the researchers do not indicate whether all students were native speakers of English, an important consideration in evaluations of writing. The last concern is a question of the generalizability of their findings to evaluations of writing done by college and university students in light of expected performances at this level of education (Freedman and Pringle 1980, Hays et al. 1990).

Table II.2
Summary of Investigations of Rhetorical Abstraction

Britton et al. (1975)
  Corpus: 2122 texts on various topics and in various rhetorical modes, written as class assignments
  Subjects: 1664 British 1st, 3rd, 5th and 7th graders (L1 students?)
  Assumptions regarding textual structure: dynamic reconstruction of whole text
  Assumptions regarding abstraction: abstraction equated with generalization; linear relation between rhetorical abstraction and cognitive abilities

Dilworth et al. (1978)
  Corpus: 90 essays based on 2 prompts, interpretations of poetry, written as impromptu essays for test
  Subjects: 90 American high school students (L1 students?)
  Assumptions regarding textual structure: binary segmentation of discourse
  Assumptions regarding abstraction: abstraction equated with generalization; linear relation between rhetorical abstraction and cognitive abilities

Caplan & Keech (1980)
  Corpus: 129 argumentation essays written as impromptu essays for test
  Subjects: 129 American high school students (L1 students?)
  Assumptions regarding textual structure: dynamic reconstruction of text; binary segmentation of discourse
  Assumptions regarding abstraction: range of abstraction is more important than highest level

Freedman & Pringle (1980)
  Corpus: 300 argumentation essays written outside of class
  Subjects: 300 Canadian high school and college students (L1 students?)
  Assumptions regarding textual structure: dynamic reconstruction of whole text
  Assumptions regarding abstraction: abstraction equated with generalization; dichotomization of rhetorical and cognitive abstraction

Lindeberg (1985)
  Corpus: 20 description essays
  Subjects: 20 Swedish university EFL students
  Assumptions regarding textual structure: binary segmentation of discourse
  Assumptions regarding abstraction: range of abstraction is more important than highest level

Carlson (1988)
  Corpus: 406 impromptu argumentation essays written for test (farming and space topics)
  Subjects: 203 Arabic, Chinese, Spanish and British high school students (169 L1 students)
  Assumptions regarding textual structure: dynamic reconstruction of whole text
  Assumptions regarding abstraction: frequency of occurrence of features determines level of abstraction

Ferris (1990)
  Corpus: 60 argumentation essays written for test (technology and consumer culture topics)
  Subjects: 30 L1 and 30 L2 freshman writing students at USC
  Assumptions regarding textual structure: dynamic reconstruction of whole text
  Assumptions regarding abstraction: interested in degree to which individual argument components were developed

Connor (1987; 1990; 1991)
  Corpus: 40 impromptu argumentation essays written for test (community problem topic)
  Subjects: 10 English, 10 Finnish, 10 German and 10 American high school students
  Assumptions regarding textual structure: dynamic reconstruction of whole text
  Assumptions regarding abstraction: interested in degree to which individual argument components were developed

Table II.3
Britton et al.'s Adaptation of Moffett's Levels of Abstraction

Moffett's scale   Britton et al.'s scale   Descriptors
Record            Record                   Eye witness account or running commentary
Report            Report                   Account of particular series of events or
                                           appearance of a particular place
                  Generalized narrative    Discussion of particular events and places, but
                                           no indication of pattern in generalized form
Classificatory    Low-level analogic       Loosely related generalizations, no logical or
                                           hierarchical relationships made explicit
                  Analogic                 Generalizations related in explicitly logical or
                                           hierarchical manner
Theoretical       Speculative              Speculations about generalizations
                  Tautologic               Use of hypotheses and deductions from them,
                                           theory backed by logical argumentation
Dilworth, Reising and Wolfe (1978) labeled each T-unit of 100 essays written by high school students as representative of low level abstraction (i.e., physical description, statement of objective fact or restatement of the prompt), moderate abstraction (i.e., limited inferences or judgments relevant to specific circumstances), or high level abstraction (i.e., principles or generalizations concerning life at large).4 Although the researchers report a "... modest but clear tendency... to place a premium on relatively abstract signification in the papers...," the correlation between high abstractions and grades was very low (1978: 102). This might be expected if the essays consisted mostly of generalizations without data used as evidence of the generalizations, a phenomenon common in student writing (Winterowd 1986). The researchers also report, without any substantiation, that the "... superior students... tend[ed] to increase the number of abstractions as their papers [grew] longer, pacing their generalizations and supporting them with specifics" while "... the typical students tend[ed] to give fewer generalizations and... tend[ed] not to culminate a line of evidence" (1978: 103). As in Britton et al.'s work, it is not clear how the task--interpretation of poetry--affected students' responses, and findings may not be applicable to college/university level writing. The researchers also do not indicate whether all of their subjects were native speakers of English.

Caplan and Keech (1980) computed correlations between holistic scores assigned to 129 essays written by high school students in 3 classes and scores representing the levels of abstraction spanned by each essay.
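The computation Caplan and Keech describe amounts to a Pearson correlation over per-essay pairs of holistic score and abstraction-span score. A hypothetical sketch of that analysis follows; the data points are invented for illustration and are not Caplan and Keech's:

```python
# Pearson correlation between holistic scores and abstraction-span
# scores, computed over per-essay pairs.

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Per-essay (holistic score, abstraction span) pairs -- fabricated data.
essays = [(6, 3), (5, 3), (4, 2), (3, 2), (2, 1), (5, 2), (3, 1), (6, 3)]
scores = [s for s, _ in essays]
spans = [a for _, a in essays]
print(round(pearson_r(scores, spans), 2))
```

With real essay data, the size of this coefficient is what distinguishes the "moderate" correlation Caplan and Keech report from a strong or negligible one.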
Readers used a system similar to that of Nold and Davis (1980), assigning a numerical value indicating the level of abstraction: 1 = "highly abstract statement," 2 = "more focused generalization," 3 = "somewhat generally stated detail or example offered," 4 = "specific, concrete detail, image, event." A "moderate" correlation was found to obtain between the two ratings. In a second analysis of 16 of the original essays, they found that the presence of details was an "important" (1980: 199) determiner of holistic score distribution into upper and lower halves. They indicate that the presence of details compensated at times for the lack of meaningful generalizations and vice versa. The ambiguous descriptors used in their scale may account for the low inter-rater reliability coefficient for the first study. Their second study required the assumption that the texts could be segmented into binary pairs of sentences, a questionable assumption in light of Hoey's (1991a; 1991b) work. Again, the researchers do not indicate the native language of the students, and the writers were high school students.

Freedman and Pringle (1980) developed a set of scales for traditional rhetorical ratings and for rating levels of generalization in high school and college freshman argumentative essays. The researchers sought to correlate performance on a number of dimensions of writing ability with cognitive development, which they operationally defined as the highest level of abstraction observed in students' writing. The researchers used three instruments--a syntactic scale, a rhetorical scale and a scale of abstraction levels. They found that "development"--which they describe as the extent to which details are used--correlated highly with ratings, but that level of abstraction did not, although level of abstraction did
correlate significantly with level of education. They suggest that abstraction, as an index of cognitive maturity, be incorporated as a category in rating scales since it correlated significantly with the level of education. The categories of their rating scale for abstraction are listed in Table II.4 below along with descriptors from Britton et al.'s scale which they adapted.

Table II.4
Freedman and Pringle's Adaptation of Britton et al.'s Rating Scale

Britton et al.'s scale   Freedman & Pringle's scale    Descriptors for Freedman
                                                       and Pringle's scale
Record                   Report                        Summarization of any primary data
Report                                                 necessary to task, including others'
                                                       generalizations, in chronological or
                                                       co-ordinate pattern
Generalized narrative    Commentary                    Summarization of primary data by
Low-level analogic                                     generalizing, but generalizations not
                                                       related and impose no organization on
                                                       presentation of material
Analogic                 First-level classification    Generalizations organize body of data;
Speculative                                            text is structured by logical order
                                                       and argumentation
Tautologic               Second-level classification   Classifications of classifications and
                                                       generalizations of generalizations

It may be that Freedman and Pringle found that holistic scores and ratings of abstraction did not correlate significantly due to the nature of the ratings. Berthoff (1986) identifies two misconceptions which may have led to these inconclusive results. First, the researchers assumed that abstraction was unidirectional in nature; that is, they were interested in only the highest level of abstraction observed in students' essays. As Shaughnessy (1977a; 1977b),
Ohmann (1979) and Winterowd (1986) point out, it is not uncommon for students to write essays of a highly general nature without relating the general information to more specific details and examples. They also suggest that this type of novice writing is generally not valued by writing instructors.

Second, since Freedman and Pringle provide no information about rater training, no samples of writing to exemplify the constructs and the various levels, and no copy of the rating scale, it is not at all clear how raters accomplished the task of rating for abstraction. It is not clear, for instance, how raters distinguished generalizations that were copied from the prompts from those that the writers produced. Nor is it clear how raters distinguished generalizations and classifications from generalizations of generalizations and classifications of classifications. One of the most questionable aspects of the Freedman and Pringle scale may be the assumption that raters would be able to recover the writers' thought processes, a difficult if not impossible task. Concern for this difficulty prompted Berthoff (1986) to call for work that would define abstraction in rhetorical terms rather than in cognitive terms. There is, moreover, no indication of how valid the rating scale was or how reliable the ratings were: the researchers provide little discussion about the development of the scale and none about rater training, and inter-rater reliability estimates for ratings of abstraction and the data matrix are not provided.

Lindeberg (1985) proposes a model involving the identification of levels of generalization based on functional roles between adjacent clauses.
She devised a system for the graphic representation of movement from levels of generalization and reports that her study of 20 essays indicated that more movement along a range of levels was moderately correlated with holistic scores, although she does not describe what the holistic evaluations represent. Although one of her major goals was to devise a system of analysis divorced from the intuitive nature of previous analyses by making reference to explicit markers of abstraction, her proposed F-units and the assignment of levels of abstraction remain highly intuitive in nature. The role that explicit markers of abstraction play in her analysis is not clear, although she highlights their use as a means of developing a more objective type of analysis. She did, however, find evidence which indicated that rhetorical abstraction was a significant aspect of overall textual quality.

Carlson (1988) investigated the correlation between the existence of individual components of argumentation identified in Toulmin's (1958) model and holistic scores assigned in the International Association for the Evaluation of Educational Achievement (IEA) Study of Written Composition (Purves 1988). The components--claims, support/justification and qualifications/rebuttal--were tagged by 5 high school ESL writing instructors. Carlson found the frequency of occurrence of these individual features to be moderately correlated with holistic scores. She was interested only in the presence of these individual features, not the extent to which they were developed or their relevance to the line of argumentation developed by the writer. Nor was Carlson interested in the range of abstraction levels these various components represented. Her research is
summarized here because she borrows from the same model of argumentation that is used in this project, and the components of argumentation she investigates represent elements of a hierarchical organization of argumentative texts.

In the same study Carlson (1988) included ratings based on a rating scale--developed by Alan Purves and Anna Soter--which was devised to "evaluate several characteristics of the reasoning process demonstrated in written prose...." (1988: 234). Rating dimensions--content/thinking, organization and style/tone--included a number of subcategories, including evaluation of issues and consideration of alternatives, which are aspects of rhetorical abstraction investigated in this project. Carlson reports that the inter-rater reliabilities for some categories were "quite low" (1988: 235), so the ratings for each subcategory were collapsed into one score for the appropriate dimension. These composite scores correlated highly with the holistic scores of overall writing ability.

Although rhetorical abstraction was not the main interest in their studies, research by Ferris (1990) and Connor (1987; 1990; 1991) is related in that it seeks correlations between holistic scores and measures of argument components identified by Toulmin (1958). Both Ferris (1990) and Connor (1987; 1990; 1991) found that composite scores for three components of argumentation were statistically significant predictors of holistic scores.

II.2.1.3. Defining Rhetorical Abstraction.

In this section of the chapter, I will summarize Toulmin's (1958) theory of the structure of argumentation, pointing out how his theory was used in this study to define levels of rhetorical abstraction operationally. Examples of the levels are provided to clarify the definitions.
Toulmin (1958) and Toulmin et al. (1979) identify components of argumentative discourse, providing a theoretical framework for rhetorical analyses of argumentation. These components include claims, data, warrants and background and are displayed in a hierarchical order in Figure II.1 below.

I. Background
   A. Major Claim
      1. Minor Claims
         a. Warrants
            aa. Data

Figure II.1 Levels of Rhetorical Abstraction in Argumentation

Claims, according to Toulmin (1958), are "conclusions whose merits we are seeking to establish," "assertions put forward publicly for general acceptance." They are general assertions which are presented for adherence and which are judged partially on the merit of the data marshalled in their support and the credibility of the warrants. I will distinguish two types of claims: major claims and minor claims (Tirkkonen-Condit 1985, Toulmin 1993). I have defined the major claim as the "thesis statement," "controlling idea," or "the proposition" of a line of argumentation. Minor claims are entailed by the major claim and can be represented in the "body" of the essay as topic sentences. The following text is provided as an example of the use of major and minor claims.

MAJOR CLAIM
The atrocities committed in Bosnia must be stopped through continued trade embargoes and increased diplomatic pressures. ...

MINOR CLAIM 1
Continued trade embargoes will force the Serbians to reconsider their aggressive actions. ...

MINOR CLAIM 2
Increased diplomatic pressures will bring the Serbian government to a realization that their actions concern the rest of the world. ...

Major claims may include a proposal for change and a statement of direction for the change (Aston 1977, Tirkkonen-Condit 1985). For example, a change in current policy or belief can be expressed.
In addition to the change, a direction for the change may be indicated through a suggestion of how the current policy or belief should be altered or through the proposal of solutions. Statements of direction may also be expressed in the form of minor claims. In the example that follows, the statements of change and direction are both expressed in a major claim.

CHANGE
The atrocities committed in Bosnia must be stopped

DIRECTION
through continued trade embargoes and increased diplomatic pressures.

Data are defined as pieces of evidence which are appealed to explicitly as substantiation for claims. They include "experimental observations, matters of common knowledge, statistical data, personal testimony, previously established claims, or other comparable 'factual data'" (Toulmin et al. 1979: 25). Two types of data were defined for the study. External data included data that were explicitly identified by the writer as being derived from outside the writer's personal experience, including any data drawn from the graphics provided in the prompts. Personal experience data included common knowledge, personal testimony and previously established claims if those claims are not attributed by the writer to sources other than him/herself.

Data are related to claims through the use of warrants, which may be stated explicitly or may be implicit to the argumentation. Warrants are "... general hypothetical statements which can act as bridges ... to show that, taking [particular] data as a starting point, the step to the original claim or conclusion is an appropriate and legitimate one." They "... take the form of laws of nature, legal principles and statutes, rules of thumb, engineering formulas, and so on" (Toulmin 1958: 78). They are often implicit to argumentation.
However, they may also be explicitly marked through the use of lexical items such as because, if ... then, since, therefore and thus.

CLAIM
Continued trade embargoes will force the Serbians to reconsider their aggressive actions. ...

DATA
They have already brought the Serbian economy to a virtual standstill. Three quarters of manufacturers have shut down. The inflation rate is soaring to 70%. Many stores have closed shop or carry only a few items.

WARRANT
Because Karadzic recognizes the political implications of these problems, I believe that he will eventually bow to the UN's wishes and stop the killing.

I use the term background for Toulmin's term backing. Toulmin (1958) and Toulmin et al. (1979) use backing to refer to information which supports the warrants of an argument.5 Background is defined in this study as information which contextualizes a whole line of argumentation and takes the forms of examination of premises and issues (i.e., underlying assumptions). Material described by the category consideration of causes may also be used to contextualize a line of argumentation.

Issues are defined as the points of contention within a line of argumentation (Missimer 1986, Toulmin 1993) and are expressed in neutral terms. Statements of issues can be used to organize or contextualize a series of arguments. Various aspects of issues, including their nature and history, may be identified and examined. Both of these aspects are illustrated in the example that follows.

The precarious situation in Bosnia is not new, having begun with ...

HISTORY OF ISSUES
centuries-old clashes between Serbians and Muslims, and represents ...

NATURE OF ISSUES
a complex set of political, social and economic issues.
Premises, fundamental assumptions underlying a line of argumentation, may also be explicitly stated and examined. Examinations of premises can be used to indicate the limitations of arguments, in a sense associating arguments with more fundamental, and often more general, considerations (Missimer 1986).

The fundamental assumptions underlying our initial response to the Bosnian problem were that the problem is simply a Serbian grab for territory and that US military intervention will solve the problem.

Discussions of causes of the policy or belief can also serve as background against which particular solutions are proposed (Missimer 1986). In the following example, two possible causes of the problem are identified.

The situation in Bosnia was caused indirectly by the break-up of the former USSR, but has more direct ties to the growing Serbian nationalism pandered by Karadzic. ...

The proposed hierarchical organization of these various elements was determined by the functional relationships which obtain among them. Data are subordinated to warrants in the sense that their relevance in a line of argumentation depends on the logical relation to claims which warrants provide. Warrants are subordinate to minor claims in that they serve as bridges between data and claims. Minor claims are subordinate to major claims in that they are logically entailed by the major claim. They are often referred to as arguments used to support the major claim or proposition of a line of argumentation (Toulmin 1993). The line of argumentation is, in turn, subordinated to background material which contextualizes the line of argumentation (Toulmin 1993). The hierarchical organization of propositions represented in Figure II.1 should not be interpreted as a statement of the linear ordering of propositional material.
On the contrary, data may be used in the beginning of a paper along with more general information serving as introduction, and background material may be interspersed throughout a text. It is not always the case that background material precedes the presentation of a major claim, which in turn precedes minor claims, followed by warrants and finally data. Nor should one assume from viewing this hierarchy that all these components will be explicitly stated. Warrants are often implicit to the line of argumentation in many of the essays evaluated for this project, and claims are frequently not explicitly stated.

II.3. Language Testing.

Language testers have raised several of the same questions that many written discourse analysts have investigated. Of importance to those working in both areas is the identification of textual features/dimensions as well as the relative importance of the features/dimensions in determining textuality. Also of interest are the effects of contextual factors that affect the production of texts. These concerns are of particular interest to language testers working within the area of facet theory. In the following sections, I will provide a definition of facet theory and a brief general literature review related to the theory. This general review will be followed by a literature review focused more specifically on facets related to the testing of writing. The final part of this discussion will be the identification of facets of expected response which were investigated in this study and reviews of the literature associated with each facet.

II.3.1. Facet Theory.

Facet theory is an approach to research design which, according to Canter (1983; 1985a), is better suited to social research because of the many variables and the interactions of variables often observed in investigations of human behavior.
Brown (1985) and Canter (1985b) point out that the major concern of facet theory is the association of conceptualizations with empirical observations. The first step in explicating these associations for research involves the identification of aspects of the research environment and characteristics of the population which may influence empirical results. Brown (1989) associates the development of facet theory with the increasing interest in contextual influences on language use as reflected in ethnographic research, such as Hymes' (1964) seminal work, and in the cognitive sciences, such as Rumelhart (1977). Although she places emphasis on language production within the classroom, many of the principles of facet theory apply to the interpretation of language in testing situations.

Facet theory within the testing field represents concerns about variations in characteristics (i.e., facets) of the design and administration of tests. Fundamentally important is the identification of facets and descriptions of their effects on test scores. Bachman (1990) has developed a framework which facilitates the identification of test method facets. Earlier discussions of facets (e.g., Bachman and Palmer 1982, Guttman 1970) defined facets on a relatively general level, referring to test format variations, such as multiple-choice, fill-in-the-blank, or writing sample, as clearly distinguished test method facets. Part of the value of Bachman's framework is that it underlines the fact that within these distinctions lie finer distinctions which may affect test results. For example, facets of writing samples, such as length, topic and rhetorical mode, have been shown to affect production and evaluations of the sample. Similarly, facets of test administration can affect test results.
For instance, variations in time allotment for writing and variations in raters and rating scales influence the evaluations of a test taker's writing.

Bachman (1990) has identified several types of test method facets. One type is associated with the testing environment, including test location and personnel. Another group of facets is related to test format, including instructions, test organization and the nature of the language used in the test. Facets of expected response are a separate category, representing the expectations of testers which may or may not be reflected in test takers' responses.

The importance of facet theory lies partially in its relation to test reliability and validity (Skehan 1990). Given the variations in test taker performance which are attributable to test method facets, the reliability of test results can be questioned unless such variations are controlled either statistically or in the testing procedures. Variations in test methods also raise questions about the validity of tests. Much of the discussion of the preference for "direct" versus "indirect" tests of writing is based on the issue of the effects of test method facets on validity (Oller 1976; 1979). Similarly, concerns about the inclusion of particular prompt types on the Test of Written English revolve around the issue of test validity (Bridgeman and Carlson 1983, Raimes 1990). Bachman (1990) provides a more lengthy review of research in test method facets in general terms. I would now like to focus this discussion on investigations of test method facets specific to tests of academic writing.

II.3.2. Facets of Writing Tests.

A number of aspects of the testing environment have been investigated.
These include the amount of time allotted for writing, the effect of writing with a word processor versus handwriting, and the effect of an established classroom environment on writing assessment. Hale (1991), looking at writing done for the Test of Written English, found that there was no statistically significant difference between scores assigned to essays written within a 30 minute period and scores assigned to essays written within a 45 minute period, although he notes that some students felt unduly rushed under the shorter time allotment. Kroll (1990), although not working with a standardized test, similarly found that the allotment of additional time for writing did not yield significantly higher ratings.

Moustafa (1987) investigated another aspect of the testing environment, namely the mode of production. She compared scores assigned to handwritten texts with those assigned to word processed copies. Although her sample size was small, she nonetheless found evidence that suggested that word processing yielded significantly higher ratings.

Nelson (1990) investigated situational constraints on students' evaluations of writing tasks, finding that the classroom represents a social environment which students rely on for their understanding of writing assignments. Poole (1990) found that the testing situation creates an environment which is distinguishable from, yet similar to, the normal classroom environment.
A good many facets of test format have been discussed. Facets of test format which have been proposed as significant determiners of test performance include topic assignment (Freedman 1977, Freedman 1979, Hoetker 1982, Ruth and Murphy 1988), specification of rhetorical constraints (Flower and Hayes 1977, Flower and Hayes 1980, Odell 1981), specificity of instructions (Greenberg 1982), prompt length (Brossell 1983, Brossell and Ash 1984), wording of prompts (Greenberg 1982, Ruth and Murphy 1988), chronological presentation of a series of prompts (Hayward 1989; 1990), and the cognitive complexity represented by the task (Bereiter and Scardamalia 1987, Caccamise 1987, Keech 1984, Quellmalz, Capell and Chou 1982, Tetroe as quoted in Bereiter and Scardamalia 1987).

A number of studies have investigated the use of task types through the administration of surveys to university faculty. These studies have defined task types in terms of the kinds of information included in writing prompts and the rhetorical modes assigned. Most notably, Bridgeman and Carlson (1983) conducted a survey of 190 academic departments at 34 American and Canadian universities, finding that descriptive tasks were valued by science and engineering departments while argumentation tasks were considered most important for undergraduates, business students and psychology majors. Science and engineering faculty indicated that describing and interpreting graphics was a preferred task in their fields while the undergraduate English faculty saw this as an inappropriate task. Ostler (1980) surveyed a smaller sample of faculty at the University of Southern California and found similarly that task types varied across fields of study and academic status (i.e., graduate vs. undergraduate).
Horowitz's (1986b) survey yielded similar results, with faculty identifying different tasks as being more important for their particular fields of study. A number of investigations have found that reading-to-write assignments are common to academic writing tasks (Campbell 1990, Flower 1991, Johns 1991, Quinn and Matsuhashi 1985). They have also described the apparent added difficulties which students experience when incorporating borrowed information into their original text, particularly with ordering the information into a coherent hierarchical plan. As a result of interviews with undergraduate faculty members from a variety of fields, Schmersahl and Stay (1992) found that most of the writing tasks used by faculty required the manipulation of source readings. Many of the tasks required summarization, synthesis and/or evaluation of reading sources. Faculty placed a great deal of value on the ability to link general information to relevant specifics and the ability to recognize general patterns in specific observations.

In terms of rhetorical modes important to academic writing, a number of researchers have identified argumentation as a predominant type of writing for academic training (Hamp-Lyons and Heasley 1987, Johns 1986; 1991, Vahapassi 1988). Exposition has also been suggested as the type of writing most often required in academic contexts (Fitzgerald 1988).

Investigations have been made of the effects of writing prompt facets on scores (Carlson et al. 1985, Golub-Smith, Reese and Steinhaus 1993, Hale 1991, Weasenforth 1991) and features of protocols (Frase et al. forthcoming, Reid 1990, Weasenforth 1993). This literature indicates that prompt type (i.e., prose vs. graph) is an insignificant determiner of holistic scores.
However, Reid (1990) and Weasenforth (1993) have found that the same prompt type differences appear to promote particular textual qualities, such as the choice of vocabulary and the ordering of propositional material in texts.

The effects of rhetorical constraints on responses have also been investigated. There exists a large body of literature on the relationship between assigned rhetorical mode and syntactic features of protocols (Crowhurst 1978a; 1978b, Keech 1984, Longacre 1983, Martinez San José 1973, Rosen 1969, Smith 1985) and lexical features (Carter and Nash 1990, Hyland 1990) in English as well as in other languages (Anscombre and Ducrot 1983, Brauße 1983, Choi 1988, Clyne 1991, Primatorova-Miltscheva 1987, Weydt 1979a; 1979b). Flower and Hayes (1980) investigated the effect of assigning an audience on the holistic ratings of students' essays and found that the assignment of audiences was correlated with significantly higher scores. Soter (1988) also found that the rhetorical mode chosen by a test taker can be a significant determiner of ratings. Connor (1987; 1990; 1991) and Connor and Lauer (1985; 1988) have devoted much of their work to investigations of rhetorical features of argumentative texts, such as audience address, propositional organization and persuasive appeal, as predictors of holistic scores. They have found some rhetorical features, such as the address of counterarguments, to be statistically significant predictors of ratings.

Keech (1984; 1985) has argued that the relationship between test methods used for testing writing proficiency and assessment results is more complicated than is generally assumed, demonstrating that writers' responses can be significantly affected by test methods.
She has called for investigations of linguistic and cognitive constraints imposed by writing tests, pointing out that the reliability of assessments depends on a clearer understanding of the effects of these constraints on writers.

Facets of expected response are the focus of attention in this study and have also received a fair amount of attention in other studies. Researchers have sought to identify textual features of common interest to readers and to describe the relative importance of the features to readers. A second approach in these investigations has been to describe the relative significance of measures of various features in predicting evaluations of overall textual quality.

Important textual features have been identified through the use of surveys and by analyzing raters' behavior to identify characteristics most significant in evaluating writing. Results from Bridgeman and Carlson's (1983) survey indicate that faculty members relied more on discourse-level characteristics than word- or sentence-level characteristics to evaluate students' writing. Preliminary results from a replication of Bridgeman and Carlson's (1983) study by Hale et al. (forthcoming) appear to be yielding similar results. Faculty members in Horowitz's (1986b) survey indicated that discourse-level criteria were more important to evaluations of writing than were sentence-level criteria. There is consistency in the level of language which is reported to be most important. Responses to these surveys indicate that discourse-level aspects (e.g., content, organization and development) are more important than syntax and mechanics in evaluating students' writing.
It appears, however, that when faculty are asked to rate essays, usage in fact plays a more significant role than these survey results indicate (Bridgeman and Carlson 1983, Raimes 1990), particularly for the English and ESL departments (Bridgeman and Carlson 1983). An obvious difficulty in interpreting these results, however, is the lack of a clear definition of the rating categories.

One of the earliest studies of criteria used in evaluating writing was completed by Diederich, French and Carlton (1961), who with the aid of Educational Testing Service colleagues had 53 faculty members from English, the social sciences and the natural sciences rate 300 freshman compositions. Low inter-rater reliabilities suggested that raters used different criteria or interpreted the criteria differently. When Diederich et al. investigated the use of rating criteria, they found that raters clustered around various types of criteria, with the largest group most concerned about more global characteristics of writing, including development and support of ideas and relevance of ideas to topics. The next largest group was mostly concerned about usage, and the third group about organization and analysis of issues.

As a result of in-depth interviews with two faculty members at the University of San Diego, Johns (1991) found that one of the main concerns faculty expressed regarding students' writing was their apparent inability to structure their work clearly. The concern centered around the ability to highlight macro-structures in a piece of writing, to organize details around the major points and to anchor general ideas appropriately to specific details and examples. Johns goes on to propose test methods which could be implemented to assess students' ability to structure the propositional content of their writing in a hierarchical arrangement.
Ballard and Clanchy (1991) similarly found that faculty are more interested in the thinking skills involved in creating a coherent, reasoned argument or explication than they are in linguistic accuracy.

Rhetorical abstraction is often described in the literature as a significant factor of writing proficiency (Berthoff 1986, Caplan and Keech 1980, Freedman and Pringle 1980, Johns 1991, Lindeberg 1985), but has not been adequately investigated. A number of rhetoricians (Berthoff 1984; 1986, Christensen 1965; 1967, Coe 1988, Kaplan 1982, Nash 1990b, Pitkin 1977a; 1977b, Winterowd 1986) have also suggested that the hierarchical organization of propositional material is an important aspect of academic writing. Lindeberg (1985) reports empirical results that suggest that rhetorical abstraction may play a significant role in determining scores. Witte and Faigley (1981) appear to indicate that rhetorical abstraction is associated with more highly-rated texts when they associate low-rated essays with less elaboration of ideas and less expansion of concepts.

In addition to surveying readers, researchers have also interviewed students in order to identify those features which apprenticed writers believe to be important to writing. Leki (1991) surveyed a group of L2 beginning freshman writing students and found that the students indicated that grammar and mechanics aspects of writing were more important to them. Former L2 composition students in a survey study by Leki and Carson (1994) indicated that grammar, lexical and mechanics features were the most important aspects of academic texts. Results from these studies are not clear, however. The students
in Leki's (1991) study indicated that they paid little attention to instructors' comments about language-level errors, although they thought language-level aspects of their writing were important. Similarly, the former students in Leki and Carson's (1994) study indicated that content faculty paid little attention to grammar and spelling although the students believed these features to be important to the success of their academic writing.

Quantitative analyses of the effects of various linguistic and rhetorical features (Cushing 1992, Freedman 1979, Hake and Williams 1981, Santos 1985) reveal a similarly complex and contradictory picture. Freedman's (1979) analysis of variances in holistic scores indicated that content and organization generally accounted for more variance than did language use and mechanics. She suggests, however, that this is the case only when the "language" and mechanics features are highly rated. Otherwise, the more discrete features appear to receive more of the raters' attention. Hake and Williams (1981) found that nominalizations accounted for much of the variance in holistic scores assigned by high school writing instructors. Santos' (1985) results from having 178 UCLA professors rate 2 compositions, one from an L1 writer and one from an L2 writer, indicated that non-ESL faculty tended to correct language more than ESL faculty, but also tended to rate organization, development and use of supporting ideas/arguments more harshly than ESL faculty.

Carlson (1988) investigated the correlations of holistic ratings of overall quality with a variety of other types of evaluation, including ratings of rhetorical organization, critical thinking skills and computer-aided tagging of grammatical and lexical features.
She found that the ratings of critical thinking skills observed in the essays were highly correlated with holistic scores.

Salient textual features have also been identified through the use of raters' think-aloud protocols (Hamp-Lyons 1991a, Vaughan 1991). The protocols from Vaughan's nine holistic raters revealed that handwriting and extended metaphor were often referred to as criteria. They also indicate that raters do not always interpret features referenced in rating scales in the same manner. The protocols from Hamp-Lyons' four raters indicated that "argumentation" and "development" were most often referred to as significant criteria.

II.3.3. Facets of Expected Response in the Study.

In this section I identify the 17 facets of expected response. For each, I provide a review of literature, including theoretical discussions and results from empirical research. The reviews will be limited to discussions of the features as significant facets of writing assessment.

• Text Length: Text length is defined in this model in terms of the amount of language produced, operationally defined as the total number of words, clauses and characters. Hillocks (1986) reviews a number of studies in which text length appeared to be the single best predictor of holistic scores of overall quality and hypothesizes that text length is highly correlated with ratings of content. Carlisle and McKenna (1991), Connor (1990), Ferris (1990), Linnarud (1986) and Vaughan (1991) have since found text length to be a significant determinant of the overall quality of L2 texts. Homburg (1984) found that text length, measured by both the number of words and the number of dependent clauses per essay, was statistically significant in predicting overall textual quality in L2 texts.
Stewart and Grobe (1979) and Grobe (1981) found text length to be a consistently significant factor in rating texts produced by L1 students in grades five, eight and eleven. However, the influence of text length was attenuated when measures of vocabulary were included in Grobe's (1981) replication. Witte and Faigley (1981: 196) point out that highly rated essays from L1 university students were an average of 375 words longer than low-rated essays. Faigley (1979) found text length to be a significant predictor of textual quality in his application of regression analysis. Connors and Lunsford (1988: 406) noted that L1 essay length, as reported in research studies, has increased from an average length of 162 words in 1917 to an average length of 422 words, suggesting that longer texts are evaluated more positively in general. Witte (1983) found a statistically significant difference in text lengths between low-rated and high-rated L1 student compositions. In Nold and Freedman's (1977) study, text length was one significant predictor of the overall quality of L1 freshman writers' compositions.

• Topic abstraction: In a factor analysis of textual dimensions, Biber (1988) found that six lexicogrammatical features (conjuncts, agentless passives, BY-passives, past participial clauses, past participial WHIZ deletions and "other" adverbial subordinators) loaded on the same common factor, and he associated the common factor with the abstractness of topics addressed in texts. More occurrences of these features within a text, Biber argues, reflect the address of relatively more abstract topics. He also found that academic texts were clearly distinguished from all other texts on this dimension. Connor (1991) also found this textual feature to be a statistically significant predictor of holistic evaluations of students' argumentative essays.
• Textual elaboration: Biber (1988) argues that four features (that clauses as verb complements, demonstratives, that relative clauses on object positions and that clauses used as adjectival complements) are associated with a common function in the structuring of texts. He interprets this particular cluster of features as lexical and syntactic manifestations of "on-line elaboration" of texts. By "on-line" elaboration, Biber refers to the instantaneous elaboration of texts which might commonly be observed in face-to-face conversations or in electronic-mail communications. "On-line" elaboration, as defined by Biber, stands in contrast to deliberate, edited elaboration which might be observed more readily in prepared speeches and highly edited written texts. The choice of this particular textual dimension for the present study was based on personal observations of students' use of "on-line" elaboration. The choice is also validated by Biber's empirical evidence which demonstrates that academic written prose can be distinguished from other types of written discourse included in his study by comparing frequency counts of these features. I know of no other empirical research which investigates this textual dimension as a significant aspect of textuality.

• Language Use: Language use includes adherence to grammar conventions. Eskey (1983) has suggested that linguistic accuracy is an important aspect of academic writing. There is also a fairly large body of empirical research which suggests that syntactic features play a significant role in determining holistic scores (e.g., Freedman 1979, Pollitt and Hutchinson 1987).
Applying Rasch item response theory analysis in an investigation of ratings, McNamara and Adams (1991) found that raters treated the category grammatical accuracy more severely than categories associated with sociolinguistic features and discourse-level organization. Similarly, Pollitt and Hutchinson (1987) found that a category which included syntax, vocabulary and mechanics was more harshly rated than other categories in their scale. Cushing (1992) also used Rasch analysis to investigate ratings from UCLA's ESLPE writing test and found that 'language' (i.e., grammatical accuracy) was rated more severely than discourse-level features. McNamara and Adams (1991) review a body of literature which indicates that significant amounts of variance in raters' holistic evaluations of writing are accounted for by measures of grammatical competence. They further indicate that raters' orientation to grammatical features may be deep-seated, and that in spite of training, raters may continue to rely predominantly on ingrained criteria which they bring with them to the rating job.6 Consistent with this view, Green and Hecht (1985) found that both L1 and L2 raters of students' papers focused on sentence-level errors, particularly syntactic errors.

• Rhetorical Organization: Bridgeman and Carlson's (1983) work indicates that university faculty consider organizational aspects of texts to be generally more important than sentence-level features. Johns' (1991) interviews of university faculty appear to corroborate this finding. Rhetorical organization has been identified as statistically significant in determining holistic scores. Freedman (1979) pointed out that raters of students' essays pay more attention to organizational structures of texts if the more discrete aspects of the texts are felicitous.
Carlisle and McKenna (1991) unexpectedly found that raters in their study placed more weight on organizational structure than on sentence-level features in their evaluation of L2 university student essays.

• Content: The literature suggests that a writer's presentation of content is significant in determining holistic scores. Results from Bridgeman and Carlson's (1983), Schmersahl and Stay's (1992) and Johns' (1991) surveys suggest that presentation of substantive content is valued in academic writing. Investigations of holistic ratings indicate that ratings of content are significant predictors of holistic ratings of overall textual quality (Carlisle and McKenna 1991, Freedman 1979). Carlisle and McKenna (1991) unexpectedly found that raters in their study placed more importance on content than on sentence-level features in their evaluation of L2 university student essays. Markham (1976) found that evaluations of content accounted for a statistically significant amount of variation in holistic ratings assigned to L1 fifth-grade descriptive texts. Witte and Faigley (1981: 197) appear to indicate that more substantial content is associated with more highly rated L1 university essays, stating that "low-rated essays generally fail to elaborate and extend concepts through successive T-units." Witte (1983) concludes that writers of the high-rated compositions in his study provide more substantial content than writers of low-rated essays.

• Vocabulary: Researchers suggest that the type of vocabulary used (Hoey 1991a; 1991b, Reid 1990) plays a significant role in determining ratings. Biber (1988) suggests that variety in word choice is associated with edited, academic genres. Patthey-Chavez (1988) similarly found that variety in word choice was a significant predictor of higher quality writing.
Weasenforth (1993) obtained qualitative evidence that technical vocabulary can significantly affect ratings. Linnarud (1986), who investigated measures of vocabulary and length, found that lexical choice played a significant role in determining ratings of L2 students' texts. Grobe (1981) found that measures of vocabulary diversity accounted for statistically significant amounts of variance in ratings of L1 fifth, eighth and eleventh grade students. Pritchard (1981) similarly found use of lexical cohesive devices to be highly correlated with overall writing quality. Witte and Faigley (1981: 198) conclude their study of L1 university essays by pointing out that student writers of more highly rated essays used more lexical collocations, displaying "more adequate working vocabularies." In an investigation of teachers' comments on L1 texts, Connors and Lunsford (1988: 406) found that many comments were related to "wrong word errors." Green and Hecht (1985) found that L1 raters were more concerned about vocabulary in students' papers; L2 raters were less concerned about vocabulary than they were about syntactic errors.

• Mechanics: Kaplan (1988) provides anecdotal evidence that punctuation, paragraphing and spelling can be significant determinants of textual quality. Bruthiaux (1993) also suggests that punctuation is a significant feature of textuality. Rafoth and Rubin (1984: 455) found that mechanics (including punctuation, capitalization and spelling) was a "potent" factor in determining overall textual quality. Other researchers have focused on more specific types of mechanics. A number of studies have investigated the influence of handwriting on evaluations of texts produced by L1 students.
Markham (1976) found that evaluations of handwriting accounted for a statistically significant amount of variation in holistic ratings assigned to fifth-grade descriptive texts. Hughes, Keeling and Tuck (1983) provide a fairly extensive review of research on the influence of handwriting on ratings of student writing. They and Sloan and McGinnis (1978) found that variations in handwriting accounted for a significant amount of variation in scores assigned to high school students' essays. More recently, Powers et al. (1994) found that handwritten texts tended to be more highly scored than the same texts which were word-processed. Spelling has also been shown to be a significant determinant of writing evaluations. Stewart and Grobe (1979) and Grobe (1981) found spelling to be a consistently significant factor in rating texts produced by L1 students in grades five, eight and eleven. Connors and Lunsford (1988) point out that in evaluations of L1 student texts, teachers' comments related to spelling predominated, suggesting that spelling played a significant role in determining evaluations of overall quality. To their surprise, Scannell and Marshall (1966) found that spelling played a significant role in determining ratings of L1 high school seniors' essays even though raters were trained to evaluate only the content of the writing.

• Statement of claims: Aston (1977) discusses asserting claims, minor and major, as necessary propositional components of the persuasive act. Connor (1987) employed a rating scale of audience awareness in which several of the variables of the current set of scales (statement of a major claim and an indication of change and direction for the change) appear.
She found that the ratings were significantly correlated with holistic ratings of overall quality, suggesting that these variables are significant determiners of overall textual quality.7 Connor and Lauer (1988) included a holistic rating of the explicitness of claims, the consistency of the claims, the number of subclaims and the presentation of solutions in their contrastive analysis of students' essays. They did not, however, investigate the correlation of these ratings with ratings of overall quality.

• Statement of direction: Another critical element of argument is the statement of the direction which a change should take (Aston 1977, Tirkkonen-Condit 1985). I know of no investigations of the role this type of statement plays in determining ratings.

• Data use: Although the type of data used in arguments is field dependent (that is, it varies across situations, including academic disciplines), data as a propositional component of argumentation is crucial to all arguments (Toulmin 1958, Toulmin et al. 1979). The use of data was moderately correlated with holistic scores in Carlson's (1988) empirical study. Connor and Lauer (1988) included ratings of the amount of data, the extent to which it was developed, the variety of types of data used and the explicitness of the connection of data to claims in their investigation of students' compositions. They did not investigate the correlation of these features with holistic scores in this study. In a subsequent study, Connor (1991) found that the presentation of data which supported claims was moderately correlated with TWE holistic scores.

• Data type: The use of "objective," empirical data has often been associated with academic writing (MacDonald 1990).
Channell (1990) found that, while the use of less precise data is acceptable under certain circumstances she identified, the use of precise, objective evidence is highly valued in academic discourse. Van Peer (1988; 1990) sees the "dialectal interpenetration of subjective and objective" as a defining quality of the academic essay. Perelman and Olbrechts-Tyteca (1969) discuss the importance of data, both abstract and concrete, in the substantiation of various types of arguments. I know of no investigations of the use of various data types as predictors of ratings.

• Warrants: Perelman and Olbrechts-Tyteca (1969) discuss the various ways in which data are related to claims in terms of liaisons. Toulmin (1958) and Toulmin et al. (1979) include warrants as a necessary, field independent, component of arguments. Connor (1987) suggests that the use of warrants is associated with more proficient writing, finding very few occurrences of warrants, explicit or implicit, in any of the essays used in her study, especially in essays with lower holistic scores. Connor and Lauer (1988) investigated ratings of the amount of warrants, their trustworthiness and their relevance but did not calculate correlations with holistic scores. Connor (1991) found that the statement of warrants correlated moderately with TWE holistic scores. Carlson (1988) similarly found that the explicit statement of warrants was moderately correlated with the IEA holistic scores.

• Claim/data/warrant: A clear statement of claims and the marshalling of supporting evidence have been identified as important aspects of academic discourse (Campbell 1990, Johns 1985; 1991, Kennedy 1985, Spivey 1983). Ferris (1990) combined scores for: 1) statement of major and minor claims, solutions and conclusion, 2) use of data, and 3) explicit statement of warrants.
She found this composite score to be a statistically significant predictor of holistic scores of L1 and L2 freshman writers' essays and also found this score a statistically significant variable in discriminating between L1 students and L2 students. Similarly, in her studies of argumentative essays, Connor (1987; 1990; 1991) found the composite scores for the existence and effectiveness of claims, data and warrants to be the most significant predictor of holistic scores.

• Consideration of causes: Berthoff (1984; 1986) and Marzano et al. (1988) indicate that the ability to tease out immediate and indirect causes of a problem is an important critical skill for academic writing. This type of propositional material can be used to situate a line of argumentation for the reader (Aston 1977) or can be employed as minor claims in the argumentation, as is often the case in the essays used for this project. I know of no investigations of the consideration of causes as a predictor of ratings.

• Examination of issues: Missimer (1986) identifies the extraction of issues from arguments as an important aspect of the development of arguments. Marzano et al. (1988) suggest that the ability to abstract individual issues from an argument should be included in critical thinking curricula. Berthoff (1984; 1986) claims that students often do not examine the nature and/or history of an issue in their writing although this is often required in academic discourse. This type of propositional material can also be used to situate a line of argumentation (Aston 1977). I know of no investigations of the examination of issues as a predictor of ratings.
• Examination of premises: Toulmin (1993) indicates that, although it is not incorporated in his diagrammatic representation of arguments, examination of the underlying assumptions of a line of argumentation should be an integral part of a theory of the structure of argumentation. Perelman and Olbrechts-Tyteca (1969) also view the identification of underlying assumptions as a crucial step in the act of persuasion. Lipman (1991) identifies the recognition of underlying assumptions as an important aspect of academic discourse. Berthoff (1986) lists the examination of premises as a necessary skill for academic writers. Aston (1977) discusses statements of premises as one type of propositional material given in the situation, propositional content which situates a line of argumentation. I know of no investigations of the examination of underlying assumptions as a predictor of ratings.

Notes

1. The labels "planned" and "unplanned" are misleading in that all discourse is planned. The distinction would be more accurately expressed in relative terms.

2. It is explicitly indicated in the rating scales developed for this project that major and minor claims may be explicitly stated or implicit to the text. In either case, raters were asked to provide evaluations of the claims. Connor's (1987) work demonstrates that while many of the same propositional components can be observed in highly rated writing, and while the components are organized in similar patterns, it is not necessarily the case that highly rated writing will follow conventional organizational principles. In fact, the most highly rated essay written by a student in the United States was narrative, not argumentative, in structure.
While certain propositional components (e.g., claims, solutions) are referenced in the scales developed for this project, no assumptions about the linear organization of the components are incorporated in the scales.

3. Although this question is interesting and underlies the usefulness of test results, it lies beyond the scope of this study. Confounding factors, such as an examinee's misinterpretation of test procedures, lack of cooperation and other sources of measurement error, make the correlation of abilities and scores a hazardous endeavor.

4. The researchers analyzed the texts, but it is not clear how many rated the texts in terms of abstractions or how reliable the ratings were. Additionally, no contextualized examples of the three levels of abstraction are provided.

5. Toulmin's use of this term has been criticized for its restricted definition and the lack of a clear distinction between backing and data (Brockriede and Ehninger 1971, Manicas 1971). While the examples he presents clearly illustrate backing as factual information, in many arguments unverifiable fundamental assumptions serve as backing. As an example, the fundamental assumption of the existence of a god serves as the justification for warrants in arguments over abortion rights. Such information not only serves as justification for warrants, but as justification for the entailment of minor claims.

6. This difficulty was, in fact, observed in the current study. Both hired raters had difficulty applying the set of scales used in this study. Out of frustration, one rater complained that the criteria were irrelevant to the type of ratings usually applied by language teachers, explaining that "... after all, we are LANGUAGE instructors."

7. The audience awareness ratings were based also on evaluations of writers' address of counterarguments, as well as their statement of major claims.
It is, thus, not clear to what extent the evaluations of statement of major claims correlated with holistic scores. As noted below, the address of counterarguments has been found to be a significant predictor of holistic scores.

CHAPTER III

Review of Literature: Statistical Methodology

III.1. Overview.

This chapter provides an introductory explanation of the statistical analyses used in the development and validation of the rhetorical abstraction scales and those analyses used to test the hypotheses presented in Chapter I. The explanations are supplemented with references to relevant literature so that the reader may pursue more detailed information with relative ease. There are two main purposes for this chapter: The first is to make accessible the statistical technologies described herein to those who are not familiar with them; the second purpose is to leave a fairly detailed account of the implementation of the technologies so that the appropriateness of their implementation can be judged more readily.

In the first part of the chapter, I present a brief review of two statistical analyses, Generalizability Theory and multifaceted Rasch scalar analysis, which are useful in investigations of rater consistency. These two analyses were used in a preliminary project, the goal of which was to develop and validate the rating scales for rhetorical abstraction. More detailed descriptions of that project are provided in Chapter IV and in Weasenforth (1993). The second and larger part of this chapter is devoted to an expository discussion of structural equation modeling (SEM) and a review of the literature related to the statistical procedures involved in this type of analysis.
SEM was the statistical analysis used to address the hypotheses expressed in I.4.

III.2. Analyses of Rater Consistency.

In constructing the rating scales of rhetorical abstraction, two types of quantitative analyses were employed: Generalizability Theory and multifaceted Rasch scalar analyses. Both analyses were used to investigate the consistency of raters' evaluations.

Generalizability Theory (GT) is based on classical test theory and factorial analysis of variance without the stringent distributional assumptions (Brennan 1992, Shavelson and Webb 1991). As in classical test theory, an assumption underlying the analysis is that ratings are a composite of a "pure" measure of a test taker's true ability and of measurement error. However, GT expands classical theory by using factorial ANOVA to provide variance estimates for a "pure" measure and multiple sources of error instead of just one source (Bolus, Hinofotis and Bailey 1982, Brennan 1992, Shavelson and Webb 1991). GT analyses entail a two-step process. The first step includes the estimation of variance for each identified source of variation in scores through the application of general analysis of variance. The second step involves the use of these variance estimates to estimate the generalizability (analogous to reliability) of test results.

Multifaceted Rasch scalar analysis is based on latent-trait theory, which is also known as Item Characteristic Curve Theory or Item Response Theory when applied to tests of ability and achievement (Weiss 1983). Rasch's analytic model is part of a family of mathematical models of the functional relations between ratings and underlying hypothetical traits or abilities, such as writing ability or rating severity.
Ratings, serving as mutually independent observations of ability, are used as the known elements of an equation that is assumed to describe the relationship between the ratings and the unobserved latent trait. With the known elements of the equation specified, the purpose of a latent trait analysis is to estimate a test taker's ability on the trait or a rater's severity level in judging test takers' performances (Linacre 1989; 1991, Rasch 1960, Weiss 1983). The unit of measurement used to describe rater severity and the difficulty of rating steps is the logit, the logarithm of the odds of moving from one level on a scale to the next. Test takers' abilities, raters' severity and difficulties of rating categories are all measured in logits on a common scale.

Both analyses were used because they offer different perspectives on rater consistency. GT analyses estimate variances for groups of test facets, such as raters as a group, test takers as a group and rating scales as a single facet. In contrast, Rasch scalar analyses identify more specific causes of variance, such as one particular score of a particular rater applying a particular scale to evaluate a particular essay (Bachman 1990). The procedures employed to investigate rater consistency and the results of the GT and Rasch analyses are summarized in Chapter IV.

III.3. Structural Equation Modeling.

Structural equation modeling, entailing a series of statistical procedures, was the primary source of statistical evidence used in this study. The following discussion includes a brief historical sketch of the applications of SEM to linguistic research. A discussion of fundamental concepts used in SEM analyses follows. This discussion is, in turn, followed by a general description of the procedures entailed in SEM analyses.
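As a concrete illustration of the logit metric described above, the sketch below computes the probability of success under a simple dichotomous Rasch model extended with a rater-severity term. The study itself used a multifaceted polytomous model estimated with dedicated software; this simplified one-observation version is only meant to show how abilities, difficulties and severities combine on a common logit scale:

```python
import math

def rasch_probability(ability: float, difficulty: float,
                      severity: float = 0.0) -> float:
    """Probability of success for one observation under a simple
    Rasch model; all parameters are expressed in logits."""
    logit = ability - difficulty - severity
    return 1.0 / (1.0 + math.exp(-logit))

# When ability exactly matches the combined difficulty and severity,
# the logit is zero and the probability of success is 0.5.
print(rasch_probability(0.0, 0.0))
# A test taker one logit above the step difficulty, judged by a rater
# of average (zero) severity:
print(round(rasch_probability(1.0, 0.0), 3))
```

Because every parameter lives on the same logit scale, a harsher rater (larger severity) lowers the success probability in exactly the way a harder rating step would.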
The description is organized chronologically according to the order in which SEM procedures are completed. These include descriptions of the specification of models, identification, estimation and the examination of model fit. The chapter concludes with an identification of the advantages and disadvantages which SEM offers to this type of study.

III.3.1. Applications of SEM to Linguistics Research.

Research which has contributed to the development of SEM extends back to the second decade of the 20th century with the development of path diagrams. Several fields, including sociology, psychology and economics, have contributed to its development. Bollen (1989) provides a more elaborated historical overview and references for additional historical information.1

SEM analyses have been applied in investigations of several language-related issues, including language use, language acquisition and attrition and text structure. Bachman and Palmer (1981; 1982) introduced the use of SEM to the field of applied linguistics, developing a model of language use which has since superseded the previously accepted model proposed by Canale and Swain (1980). Bachman and Palmer used several types of measures of linguistic, pragmatic and sociolinguistic competence in order to test a proposed model of communicative competence. They found that a model which included a general ability factor and two more specific factors, grammatical/pragmatic and sociolinguistic, best described their data.

Sang et al. (1986) used SEM to test Oller's (1976) unitary language competence hypothesis. Measures of three postulated levels of language were incorporated in their SEM model. The researchers postulated a basic level at which pronunciation, spelling and lexicon are acquired.
The intermediary level, at which language acquirers integrate basic knowledge, was measured by tests of grammar usage and reading comprehension. The third level, at which pragmatic appropriateness is acquired, was measured by two listening comprehension tests which required processing language at a discourse level. Consistent with Bachman and Palmer's findings, Sang et al. found that a multiple factor model of language competence explained their data better than a single factor model. This evidence was used to discount Oller's theory.

Purcell's (1983) work is an early application of SEM to language acquisition. He investigated the determinants of English pronunciation acquisition by using data originally collected by Suter (1976). The data included evaluations of L2 students' pronunciation as measures of pronunciation accuracy and questionnaire responses as measures of determinants of pronunciation accuracy. Results from his study indicated that an L2 student's length of stay in an English-speaking country was the most significant predictor of highly rated pronunciation accuracy.

Language retention has also been investigated through applications of SEM (Gardner, Lalonde and Moorcroft 1987, Gardner, Lalonde and Pierson 1983, Lalonde and Gardner 1984). Pre- and post-tests were used in longitudinal studies which incorporated measures of motivation and attitudes toward language learning. Results indicate that students' motivations and attitudes significantly influence language acquisition and retention.

More recently, Biber (1992) tested a number of models of text structure using SEM to perform confirmatory factor analyses. He used frequency counts of 33 lexical and syntactic text features as measures of various types of textual complexity. Results were interpreted as indicating that written registers vary more in complexity than spoken texts.
Written registers, he concluded, varied in terms of the types and extent of complexity; spoken registers varied only in the extent of their complexity.

III.3.2. Fundamental Concepts.

SEM analyses entail the simultaneous analysis of multiple variables, observed and latent, to solve groups of linear equations which represent relationships among the variables. Latent variables are unobserved constructs which are not directly measurable. Observed variables, in contrast, can be directly measured in the form of test scores, ratings or frequency counts. Observed and latent variables are associated through applications of theory and/or previous empirical findings. Associations between variables may also be proposed as hypotheses to be tested.

Two examples, one from psychology and another from this study, may be helpful in clarifying the distinction between these two types of variables and the associations that are drawn between them. Within the field of psychology some researchers are interested in describing and measuring depression, a complex psychological and physiological state of being. Among other types of behavior, depression has been associated with loss of memory, loss of sleep and substance abuse. Depression is observable and measurable only in the sense that one can observe and measure behavior which has been associated with the state. That is, one can measure loss of memory, loss of sleep and substance abuse. Based on the theoretical assumption that these three behaviors are related to depression, one claims that depression has been measured.

The same procedures are involved in describing and measuring the quality of written texts. It is assumed that the organization of a text entails the existence of a thesis and topic sentences and the use of transitions, to name but several features associated with textual organization.
The measurement of these observable features of a text (often through the impressionistic ratings of readers, sometimes through frequency counts of cohesion markers) is often claimed to be measurement also of the unobserved textual dimension organization with which the observed variables have been associated.

In SEM analyses, the relationships between observed and latent variables are specified prior to analysis and are based on substantive theory or are stated as hypotheses. To extend the example drawn from this study, various observed textual features are assumed or hypothesized to be associated with a full set of latent textual dimensions. Handwriting, the existence of paragraph indentation and punctuation usage are associated with the textual dimension mechanics. Similarly, subject-verb agreement, article usage, pronoun usage and choice of prepositions are associated with language use. The dimension vocabulary is associated with appropriateness of word choice, translation and word form accuracy. Content is related to the use of examples, the use of irrelevant information and the reasonableness of the thesis.2

In addition to the association of observed and latent variables, relationships between latent variables are specified. Continuing the textual analysis example, one may assume or hypothesize that the five textual dimensions (content, organization, vocabulary, language use and mechanics) are independent or, on the contrary, that they are related in some way. One could, for example, assume that content and vocabulary are correlated or that vocabulary represents an aspect of content. The power of SEM analyses lies in the ability to test theoretical assumptions by identifying and examining the relationships among latent variables of a theoretical model.
This power is a result of the combination of: i) latent factor analysis, which associates observed variables with latent ones, and ii) regression analysis, which estimates the strength of relationships among the latent variables (Bollen 1989, Pedhazur and Schmelkin 1991). These two analyses are referred to as the measurement model and the structural model components of a full SEM model.

The measurement models define the relationships between latent and observed variables in terms of factor loadings. Measurement error for each observed variable is partialled out so that the latent variable theoretically is a true representation of the trait of interest (Long 1983b, Pedhazur and Schmelkin 1991). Figure III.1 presents, in the form of a path diagram, an illustration of a measurement model which might be incorporated into a SEM model. In the figure, organization is represented as a latent variable through the use of an ellipse; the features associated with organization are represented as observed variables through the use of rectangles. The arrows connecting the latent and observed variables represent the factor loadings; the other arrows represent error terms.

[Figure III.1. Measurement Model: the latent variable organization (ellipse) measured by three observed variables (rectangles), existence of thesis, existence of topic sentences and use of transitions, each with an associated error term.]

This figure graphically ties together many of the concepts discussed so far. First, textual organization is represented as a latent, unobserved construct that is characterized by the existence of a thesis, the existence of topic sentences and the use of transitions. These three observed variables are measurable, and their measures are represented as indicators of textual organization. It is assumed that a certain amount of error exists in measuring each of the three variables.
The error, however, is partialled from the measures so that theoretically one obtains a true measure of the latent variable organization.

The structural model defines the relationships between latent variables. An example of a structural model is presented in Figure III.2 below. Correlations, represented by slings, and "causal" or predictor relationships, represented by arrows, are two types of relationships that may be defined. The slings and arrows in Figures III.1 and III.2 are called parameters of the models. The relationships that may be defined by parameters will be discussed in more detail in III.3.3. The same textual analysis example is drawn on for this diagram.

[Figure III.2. Structural Model: the latent variable overall textual quality regressed on five latent variables, content, organization, vocabulary, language use and mechanics, with a sling connecting content and vocabulary.]

In Figure III.2 the latent variable overall textual quality is shown to be regressed on five other latent variables: content, organization, vocabulary, language use and mechanics. That is, the five textual dimensions are represented as determinants of the overall quality of texts, as indicated by the arrows. Furthermore, the textual dimensions content and vocabulary are shown to be correlated, as indicated by the sling which connects the two latent variables. It is possible to specify a correlation or causal relationship between each pair of variables on the right side of Figure III.2; only the one is specified in this model for the sake of clarity.

Before moving on to a discussion of the mathematical representation of relationships between variables, another useful distinction drawn between variables should be mentioned. In Figure III.2 above, one may notice that the
latent variable overall textual quality is represented as a function of the five latent variables on the right hand side of the diagram. That is, content, organization, vocabulary, language use and mechanics are posited to be aspects of overall textual quality. Overall textual quality, depicted as being caused by the other five latent variables, is called the endogenous latent variable of the model. The other five latent variables, the causes of overall textual quality, are the exogenous latent variables of the model.

The last issue raised in the introductory paragraph for section III.3.2 was that of the mathematical representation of relationships. Each parameter of a model (the correlations, the causal relationships, the errors and the factor loadings between latent and observed variables) can be described mathematically by a linear equation. Through the application of matrix algebra, the set of equations is solved to provide an estimate for each parameter. The known parts of the equations are the correlations or covariances between variables. It is important to point out that solving the set of equations provides for the simultaneous analysis of all variables as part of a complete, coherent model. I will discuss the implications of the simultaneous analysis of all variables in III.3.7.

Three types of SEM models have been described: the measurement model, the structural model and the full model in which both of the previous models are combined. SEM can be used to analyze each of these types of models, as well as others (see Bollen 1989, Joreskog and Sorbom 1988; 1993). SEM is useful in performing factor analyses, in which case only the measurement models are analyzed. The structural model can also be analyzed separately in a path analysis.
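The structural relationship just described, an endogenous variable expressed as a linear function of exogenous ones, can be sketched directly. The gamma weights and dimension scores below are invented for illustration; the disturbance term is omitted to keep the sketch minimal.

```python
# Hypothetical structural equation: overall textual quality regressed on
# five exogenous textual dimensions. All numeric values are invented.
gammas = {"content": 0.35, "organization": 0.25, "vocabulary": 0.15,
          "language_use": 0.15, "mechanics": 0.10}

def predicted_quality(dimension_scores, gammas):
    """Endogenous latent variable as a weighted sum of the exogenous
    latent variables (disturbance term omitted for the sketch)."""
    return sum(gammas[d] * dimension_scores[d] for d in gammas)

scores = {"content": 1.0, "organization": 0.5, "vocabulary": -0.2,
          "language_use": 0.0, "mechanics": 0.8}
print(round(predicted_quality(scores, gammas), 3))  # 0.525
```

In an actual SEM analysis the gamma weights are not chosen; they are the free parameters the analysis estimates from the covariances among the variables.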
This study involved the testing of a series of measurement models and two full models described in Chapter IV. Regardless of the type of model specified, the same procedures are followed in completing SEM analyses. These procedures include the specification of a theoretical model, generation of a correlation or covariance matrix from a raw data matrix, estimation of model parameters and generation of an estimated matrix, and finally the assessment of model fit (Bollen 1989). These procedures will be discussed in III.3.3 through III.3.6.

III.3.3. Model Specification. Model specification is the first procedure necessary to a SEM analysis.³ Specification entails the identification of variables to be included in a model as well as the parameters that represent relationships between variables. Specification of a model is often represented graphically in the form of a path diagram for heuristic purposes. It is also represented mathematically in the form of matrices and a set of linear equations.

Logically, one should identify the latent variables prior to identifying the observed ones. Latent variables define the primary conceptual interests of a SEM analysis and thereby are usually the focus of the analysis. Identification of all significant latent variables is theoretically necessary. Although SEM analyses can be completed without all conceptually significant variables being identified, estimates of relationships (i.e., parameter estimates) may be unreliable unless all variables are incorporated in a model. The exclusion of a significant variable, for example, may lead to results which erroneously indicate the significance of other variables which were included in a model but may otherwise not be significant. The foregoing discussion is not meant to deny the importance of specifying observed variables.
Certainly care must be given to the selection of measures which represent observed variables. It is, after all, the observed variables which define the latent traits that are of primary interest. A number of concerns should be considered in choosing the observed variables, including the appropriate aspects of the latent trait to be measured, the number of observed variables associated with each latent variable, the level of measurement and the distributional characteristics of the measurements.

The specification of parameters is largely implicit in the choice of variables. A necessary criterion for choosing latent variables is that the variables be related in some manner, although it is not always clear what the relationship may be. The choice of observed variables is also based on the assumption that they are associated with at least one latent variable. In addition to deciding whether a particular parameter should exist, one must also specify the type of parameters between latent variables. Latent variables may be correlated to indicate the coexistence of latent traits. A causal relationship may be specified instead of a correlation. This type of relationship indicates that a latent trait may be a function or aspect of another, or it may indicate that a given trait determines another. Both of these relationships are graphically represented in Figure III.2 above. Content and vocabulary are assumed to be correlated. In contrast, the five textual dimensions on the right side of the figure are represented as aspects of overall textual quality; that is, they determine the overall quality of texts.

With regard to the specification of parameters, it is important to note that SEM computer programs will automatically identify all possible parameters in a model.
By adding an arrow or sling between variables of a path diagram, the researcher is indicating that the statistical estimate of the parameter is some value other than 0. By not adding the arrow or sling, the researcher is indicating that the parameter estimate is 0; s/he is not indicating that the parameter does not exist. Figure III.2, therefore, indicates that the researcher assumes that the correlation between content and vocabulary is greater or less than 0. In contrast, the researcher has apparently assumed that the correlation between content and organization, for instance, is 0. In other words, the researcher has assumed that the two textual dimensions are unrelated. Thus, the nonexistence of an arrow or sling represents as significant a statement as the existence of one of the symbols. By not specifying the existence of a parameter, the researcher has in effect fixed the value of the parameter to 0. The parameter in this case is said to be fixed. The researcher may also fix the value of a parameter to be a value other than 0. Parameters whose values are not fixed are called free parameters. The values of free parameters are estimated during the SEM analysis.

It may be helpful to the reader to provide an example of a full model and to discuss the assumptions represented by it. The path diagram of a hypothetical model is presented in Figure III.3 below. This model will provide a basis for the synthesis of all the fundamental information needed for interpreting the models used in this study.
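The distinction between fixed and free parameters can be made concrete with a small sketch of the latent-correlation pattern implied by Figure III.2: the single sling (content with vocabulary) marks a free parameter, and every omitted sling is a parameter fixed to 0. The representation below is an illustrative construction, not the notation of any particular program.

```python
# Fixed versus free parameters, following the Figure III.2 example:
# the content-vocabulary correlation is free (to be estimated);
# every other pairwise correlation is fixed to 0 by omission.
dimensions = ["content", "organization", "vocabulary",
              "language_use", "mechanics"]
free = {("content", "vocabulary")}  # the single sling in the diagram

def phi_pattern(dimensions, free):
    """Return 'free' for correlations to be estimated, 0.0 for fixed
    ones, and 1.0 on the diagonal (dimensions standardized)."""
    pattern = {}
    for i in dimensions:
        for j in dimensions:
            if i == j:
                pattern[(i, j)] = 1.0
            elif (i, j) in free or (j, i) in free:
                pattern[(i, j)] = "free"
            else:
                pattern[(i, j)] = 0.0
    return pattern

p = phi_pattern(dimensions, free)
print(p[("content", "vocabulary")])    # free
print(p[("content", "organization")])  # 0.0
```

Omitting a sling is thus an explicit claim (a value fixed at 0), not the absence of a claim, which is exactly the point made in the text above.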
[Figure III.3. Full SEM Model: overall textual quality, measured by the scores of Rater 1, Rater 2 and Rater 3, regressed on content (use of examples, use of irrelevant information, reasonableness of thesis), organization (existence of thesis, existence of topic sentences, use of transitions), vocabulary (appropriateness of word choice, translation, word form accuracy), language use (subject-verb agreement, article usage, pronoun usage, choice of prepositions) and mechanics (handwriting, paragraph indentation, punctuation usage); each observed variable has an error term; content is correlated with vocabulary, and the errors of reasonableness of thesis and existence of thesis are correlated.]

A brief description of the model may be helpful, beginning with an identification of the variables. The full model is composed of six measurement models, one for each of the five exogenous latent variables and one for the endogenous latent variable. Associated with each latent variable there are three or four observed variables (i.e., measures of the associated latent variables). In addition to the measures of the five exogenous variables which were already identified, there are three measures of overall textual quality. These three measures are the holistic scores of three raters. Finally, there is an error term for each and every observed variable.

Several types of parameters are represented in the model. The one endogenous latent variable is regressed on the five exogenous variables. All observed variables are represented as factors of a common latent trait. Error terms are represented as factors of measurements. Two correlations are indicated in the model. As in the previous examples, content is correlated with vocabulary. Additionally, the error associated with the measure of reasonableness of thesis and the error associated with the measure of existence of thesis are correlated.
A substantive interpretation of the full model represented in Figure III.3 will conclude this section. As discussed before, one assumption of the model is that the overall quality of a text is determined by five textual dimensions: content, organization, vocabulary, language use and mechanics. It is assumed that only these five textual dimensions determine overall textual quality. These assumptions of the model deny that any other textual dimensions are significant in determining overall textual quality. They also preclude the existence of any other type of possibly significant determinant of textual quality, including sociocultural constraints or reader volition and attitudes, for example.

Each of the textual dimensions is defined in terms of three or four more discrete textual features, as discussed earlier. It is assumed, for example, that overall textual quality is measured by the impressionistic ratings of three raters. Similarly, the measurement of the use of examples, the use of irrelevant information and the reasonableness of a thesis is assumed to be measurement of a textual dimension which is labeled content. By including only these three measures, one is not stating that these are the only possible measures of content.

With regard to the association of measures with underlying textual dimensions, it is important to point out that the model incorporates an assumption that each measure is associated with only one textual dimension. The use of irrelevant information, for instance, is associated with only content. Although it may be argued that the presence of irrelevant information may also be a measure of organization, the model does not accommodate this argument. It is further assumed that error (measurement and/or random) may be present in each measure of the textual dimensions. In most cases, the errors are not correlated.
One may, however, make a valid argument for some error terms to be correlated. In the hypothetical model, for example, one may argue that, since the reasonableness of a thesis and the existence of a thesis measures involve the assessment of the same textual feature, the errors associated with the measures may be correlated. Finally, the model assumes that the textual dimensions content and vocabulary are correlated. One may support this assumption by referring to substantive theory which indicates that vocabulary is the primary bearer of the semantic content of a text (Hoey 1991a; 1991b, for example). In contrast, the model incorporates the assumption that no other textual dimensions are related.

The discussion in this section has focussed on the graphical representation of model specification. It was mentioned earlier, however, that mathematical representations of model specification must also be completed for the computer programs to run the analyses. I will not provide in this dissertation a discussion of the mathematical specification of models, which involves the application of tracing rules from matrix algebra. For further information about the mathematical specification of models, the reader may wish to consult Long (1983b) and Bollen (1989).

III.3.4. Model Identification. Once a model has been specified, it is necessary to verify the identification of the model. Identification is not a procedure, but rather a characteristic of individual parameters of a model and of the full model. Individual parameters are said to be identified if the model contains enough information for the parameter values to be consistently estimated.
If all parameters of a model are identified, the whole model is identified. If not enough information is available to consistently estimate parameter values, the parameter and the model are said to be non-identified.⁴

Identification of individual parameters and of a full model is determined by the amount of information known relative to the amount of information sought from a SEM analysis. The amount of information, in this case, is not determined by sample size. The number of essays in this study is irrelevant when considering the identification of the models, although it is important when choosing an estimation method. Instead, the values associated with observed variables and their variances are the information which is relevant to model identification. If enough of these values are known, then the parameters and the model will be identified.

Another intuitive conceptualization may be drawn by means of reference to the set of equations that are solved in SEM analyses. As explained earlier, parameters are described mathematically in the form of algebraic equations. If there is at least one equation that will describe each parameter, then the parameter, and thus the model, are identified. If no equation can be found to describe a parameter, then the parameter is non-identified.

In order to formulate an identified model, one may increase the number of observed variables, thereby increasing the amount of information that will be used to estimate parameters.
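The "known information versus sought information" idea can be sketched as a simple counting check, often called the t-rule (a necessary but not sufficient condition for identification): with p observed variables the data supply p(p + 1)/2 unique variances and covariances, and a model estimating more free parameters than that cannot be identified. The example counts are hypothetical.

```python
# A common counting check for identification (the "t-rule"): with p
# observed variables there are p*(p+1)/2 unique variances and
# covariances, and a model with more free parameters than that
# cannot be identified. Passing the check does not guarantee
# identification; failing it guarantees non-identification.
def t_rule(n_observed, n_free_parameters):
    known = n_observed * (n_observed + 1) // 2
    return n_free_parameters <= known, known

# Three indicators of one latent variable: 3 loadings + 3 error
# variances = 6 free parameters (latent variance fixed to 1),
# against 6 known moments.
ok, known = t_rule(3, 6)
print(known, ok)  # 6 True

# Two indicators: 4 free parameters but only 3 known moments.
print(t_rule(2, 4))  # (False, 3)
```

The failing two-indicator case is one way to see why the three-measures rule of thumb discussed next has the force it does.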
The rule of thumb that there should be at least three measures of each latent variable is based on the need to have enough information in order for the proposed model to be identified.⁵ One may also fix parameter values in order to obtain an identified model. By fixing values, the SEM analyses are provided with additional information which would otherwise have to be estimated.

The importance of model identification makes it a necessary topic in a discussion of SEM. Long (1983a; 1983b) points out that parameter estimates can vary markedly if a model is non-identified. SEM programs may cease estimation procedures if a model is non-identified, providing a clue that the model may not be identified. However, the programs may complete the normal procedures in spite of a model being non-identified. In this case, the researcher may obtain estimates which are not reliable. If identification of the model is not verified, the researcher may end up interpreting unreliable statistics.

Verifying identification of a model is a difficult task. Long (1983b) suggests writing and solving the appropriate sets of equations by hand. For a large model, such as the one proposed for this study, these procedures would require days and would be prone to error. Bollen (1989) discusses a number of tests appropriate for several types of models. All of these tests are limited in their effectiveness to identify non-identified models. Many are appropriate for only particular types of models. Most are complicated to the point of being impractical. LISREL provides some checks for model identification. Very high correlations between estimates from the information matrix may indicate a nearly non-identified model (Joreskog and Sorbom 1989). Modification indices may also be analyzed to determine whether individual parameters are identified.⁶ LISREL will provide a message if a parameter may not be identified, but one should not fully rely on LISREL to flag non-identified parameters because the program is not 100% trustworthy (Joreskog and Sorbom 1989).

III.3.5. Estimation. Once the model has been specified and identification of the model has been verified, a correlation/covariance matrix is computed. The elements of the matrix are correlations/covariances of the observed variables. These correlations or covariances are used to solve the linear equations which represent the relationships between variables in the model. In other words, solution of the equations provides estimates for the parameter values.

The estimation process involves give and take between the observed relationships (correlations/covariances) in the data and the relationships (model parameters) which the researcher assumes to exist. The object of a SEM analysis is to impose a structure, in the form of a SEM model, on the data while maintaining the actual observed relationships in the data. Put another way, the researcher attempts to explain the data by showing that hypothesized relationships exist among variables.

A SEM analysis can involve numerous recalculations of parameter estimates in the process of solving the set of equations while also trying to maintain the original correlations/covariances. The computer program retains the matrix of original correlations/covariances while completing recalculations. After each calculation of parameter estimates, it calculates a second correlation/covariance matrix which accounts for the most recent parameter estimates. The two correlation/covariance matrices are then compared. Once an adequate match is obtained, the program terminates the estimation procedures.
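The iterative loop just described, adjust the free parameters, recompute the implied covariances, compare against the observed ones, can be caricatured in a few lines. The sketch below estimates a single free loading by crude stepwise search; the observed covariances and fixed loadings are invented, and real SEM programs use far more sophisticated algorithms (the iterative methods discussed in the next section).

```python
# A toy version of the iterative estimation loop: repeatedly adjust
# one free parameter (a factor loading) so that the model-implied
# covariances approach the observed ones. All values are invented.
observed = {("x1", "x2"): 0.48, ("x1", "x3"): 0.42, ("x2", "x3"): 0.56}
fixed = {"x2": 0.8, "x3": 0.7}  # loadings held fixed; x1's is free

def discrepancy(lam1):
    """Sum of squared residuals between the observed covariances and
    those implied by the one-factor model with free loading lam1."""
    implied = {("x1", "x2"): lam1 * fixed["x2"],
               ("x1", "x3"): lam1 * fixed["x3"],
               ("x2", "x3"): fixed["x2"] * fixed["x3"]}
    return sum((observed[k] - implied[k]) ** 2 for k in observed)

# Crude iterative search: start from a rough value and refine in
# small steps until no step reduces the discrepancy further.
lam1, step = 0.1, 0.01
while discrepancy(lam1 + step) < discrepancy(lam1):
    lam1 += step
print(round(lam1, 2))  # converges near 0.6, where the residuals vanish
```

The essential point survives the caricature: estimation is driven entirely by the gap between the observed matrix and the matrix the current parameter values imply.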
The estimation of parameter values takes precedence over maintaining the original correlations/covariances. That is, the computer program will estimate parameter values at the expense of changing the original correlations/covariances if necessary. Therefore, although the program may stop estimation procedures, the two correlation/covariance matrices may not be identical. In fact, they are rarely identical. The extent to which the two matrices differ lies at the heart of the issue of model fit, which will be discussed in the next section.

The last issue that I would like to address in this section is that of estimation methods. This issue is important in that the choice of estimation methods can determine the reliability of parameter estimates and estimates of the adequacy of a model. The discussion which follows is brief and largely non-technical.

The procedures described above are followed in every estimation of a SEM model. There are, however, a number of mathematical algorithms which can be used to calculate and recalculate the parameter estimates and the associated covariance matrix. These algorithms are referred to as estimation methods. One may draw distinctions between a number of types of estimation methods. The first distinction that can be made is between those estimation methods which provide rough estimates of parameter values and those which provide finer estimates. Because of the complexity of simultaneously solving a set of equations, a great deal of computing power may be necessary to calculate parameter estimates and other statistics. In order to conserve computing effort, SEM programs generally use both types of estimation methods. Fast, noniterative methods (e.g., two-stage least squares or instrumental variables methods) are used first to provide rough estimates.
These rough estimates are then improved through the use of iterative estimation methods. The iterative methods use the rough estimates as starting values. They can then make hundreds of passes through the data to further refine the estimates until adequate solutions to the equations are found.

Iterative estimation methods vary according to: 1) the assumptions underlying them, 2) the type of matrices that are used as input and 3) the types of data that can be analyzed (Bollen 1989, Joreskog and Sorbom 1988; 1993). Both the maximum likelihood and generalized least squares estimation methods assume a multivariate normal distribution of the observed variables, although they are robust under moderate violations of normality (Boomsma 1983). Both are used to analyze a matrix of correlations of continuous data. Unweighted least squares does not require that the assumption of multivariate normal distribution be met and is appropriate when all variables are measured in the same units. Distribution-free methods of estimation are also available. These methods are used when variable distributions are not normal or when ordinal data are analyzed. Weighted least squares analyzes a covariance matrix based on correlations of ordinal data. Diagonally weighted least squares analyzes a vector of covariances of ordinal data. The distribution-free assumption of these two methods makes them theoretically attractive. However, they require large sample sizes, making them difficult to apply in practice (Joreskog and Sorbom 1988; 1993).

III.3.6. Model Fit. Before analyzing the results of a SEM analysis, it is wise to check the model fit because poor fit indicates that the parameter estimates may be unreliable. Model fit is a term which refers to the extent to which a proposed SEM model can explain the data.
Measures of model fit indicate the reliability with which a model describes relationships that can actually be observed in the data. As briefly discussed above, SEM programs calculate two correlation/covariance matrices. The first matrix includes the actual correlations/covariances of the observed variables. Based on the parameter estimates, a corresponding matrix is also calculated as an estimate of the original matrix. Evaluating the extent to which a model fits the data involves comparisons of elements of the original matrix with corresponding elements of the estimated matrix. The difference, known as the residual, between each pair of elements serves as an index of model fit.

Relatively large residuals indicate that the model does not fit the data, or that the model does not fit the data well. That is, the SEM model does not do an adequate job of describing the relationships between variables. On the other hand, relatively small residuals indicate that model fit is adequate. Since, even with a small model, there are many residuals to be calculated, it would be impractical to evaluate model fit by looking at individual residuals. Therefore, summary descriptions of the residuals are provided by SEM programs. Many programs provide a plot of all residuals as a graphic description of model fit. In addition to plots, programs provide numerical indices of model fit.

There are three types of numerical fit indices: indices relevant to the fit of measurement models, indices related to overall fit of a full model and indices for individual parameters. A brief non-technical discussion of fit indices may be helpful in evaluating the results of the study. I have summarized this discussion in Table III.1 below for the reader's ease of reference.
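The element-by-element comparison described above can be sketched in miniature: residuals are the differences between observed and model-implied covariances, and a summary of their size stands in for the plots and indices the programs provide. The matrices below are invented for illustration.

```python
# Residual-based model fit, in miniature: element-wise differences
# between the observed and model-implied covariance matrices.
# All values are invented for illustration.
observed = {("x1", "x2"): 0.48, ("x1", "x3"): 0.40, ("x2", "x3"): 0.56}
implied  = {("x1", "x2"): 0.48, ("x1", "x3"): 0.42, ("x2", "x3"): 0.56}

# One residual per unique element of the matrices.
residuals = {k: round(observed[k] - implied[k], 4) for k in observed}

# A crude summary in place of residual plots: the largest
# absolute residual across all elements.
largest = max(abs(r) for r in residuals.values())
print(residuals[("x1", "x3")])  # -0.02
print(largest)                  # 0.02
```

Here only one element misfits; a pattern of large residuals concentrated around particular variables is exactly the kind of diagnostic the plots and indices summarize.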
The only fit index for measurement models that I will discuss here and in subsequent chapters is the squared multiple correlation. A squared multiple correlation (R²) is calculated for each of the observed variables in a measurement model and is an index of the reliability of the variable as a measure of the associated latent variable. It indicates the proportion of variance in the endogenous variable which is accounted for by the variables in the structural equation (Joreskog and Sorbom 1993). Values for this index range from 0.0, indicating that the observed variable does not measure the latent variable, to 1.0, indicating that the observed variable is a "pure" measure of the latent variable. Values of .70 or greater were considered adequate (Byrne 1989, Long 1983b).

The next set of indices to be addressed are associated with overall fit of the full model.⁷ A great deal of the literature addresses various issues related to the evaluation of overall fit (see e.g., Bollen and Long 1993, Marsh, Balla and McDonald 1988 and Mulaik et al. 1989). Tanaka (1989) provides a comparative analysis of nine fit indices and suggestions for the choice of indices. LISREL 8 provides a menu of 28 indices for overall model fit, only five of which were consulted in this study and will be discussed here.

The χ² is a test statistic that can be used to assess model fit. If the model fits well, the χ² will be small relative to the degrees of freedom, and the p value large relative to the level of significance chosen. However, the index is sensitive to sample size. The χ² index tends to be large for large samples although the model may fit according to other criteria. On the other hand, it tends to be small for small samples although the model may not adequately fit the data according to other criteria.
Because of the sensitivity of the χ² to sample size, the Goodness-of-Fit (GFI) and Adjusted Goodness-of-Fit (AGFI) indices were developed; according to Joreskog and Sorbom (1985), they are independent of sample size and robust to departures from normality.⁸ It is generally accepted that, with a range of 0.0 to 1.0, a GFI > .90 is considered good; a GFI > .85 is considered acceptable (Long, Kahn and Schutz 1992). The AGFI is the same index as the GFI except that it is adjusted for the number of degrees of freedom in the model. Values for the AGFI also range from 0.0 to 1.0 and generally have lower values. Long, Kahn and Schutz (1992) suggest that GFI and AGFI values of .90 or greater indicate good fit and values between .85 and .90 indicate adequate fit.

Bentler and Bonett's (1980) Normed Fit Index (NFI) is a sample-dependent index which tests model fit through comparison of the hypothesized model with a null model in which all observed measures are assumed to be independent. Values for this index range from 0.0 to 1.0. An NFI of .90 or greater is considered an indication of good model fit.

The Root-Mean-Square Error of Approximation index (RMSEA) indicates the discrepancy between elements in the hypothesized matrix and those in the sample matrix. Values for this index range from 0.0 to 1.0. If a model fits the data well, the RMSEA will be < .05 (Byrne 1989).

The third set of indices are used to evaluate the fit of individual parameters to the data. Along with each parameter estimate, LISREL provides a standard error and a t-value. All parameter estimates range in value from -1.0 to 1.0. The larger the absolute value of the estimate, the stronger the association between the associated variables. Negative polarity indicates inverse relationships between variables.
The acceptability of parameter estimates is judged against theory; estimates which appear to be inconsistent with accepted theory can be rejected and may warrant model respecification. Standard errors serve as indications of the amount of error which is associated with measuring the particular relationship and can be used to test whether parameter estimates differ significantly from 0.0 (Joreskog and Sorbom 1993). Standard errors range in value from 0.0 to 1.0, with 0.0 indicating that no error is involved in measuring the associated parameter, and 1.0 indicating that the measure is composed of only error. Errors smaller than the absolute value of the associated parameter estimates indicate adequate parameter estimations. Standard errors that are equal to or greater than the associated parameter estimates indicate that the estimate may be considered equal to 0.0.

The t-value serves as an index of the significance of the particular parameter to the full model. It is calculated by dividing the parameter estimate by its associated standard error; there is theoretically no restriction on the range of t-values. T-values whose absolute values are greater than 2.0 indicate that the parameter can be considered significant to the model (Byrne 1989). T-values with an absolute value less than 2.0 signal that the parameter should perhaps be fixed to 0.0, thereby removing it from the model (Joreskog and Sorbom 1993).

LISREL also provides fitted and standardized residuals which are indications of the amount of discrepancy of fit between the sample and hypothesized covariance matrices. Relatively large residuals indicate misfit of the model to the data; small residuals indicate adequate fit. Fitted residuals are reported in their original metric form; standardized residuals are normalized and are analogous to Z-scores.
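The t-value computation just described is simple enough to sketch directly; the 2.0 cutoff follows Byrne (1989) and Joreskog and Sorbom (1993), and the function names are illustrative:

```python
def t_value(estimate, std_error):
    # A parameter's t-value is its estimate divided by its standard error.
    return estimate / std_error

def retain_parameter(estimate, std_error):
    # |t| > 2.0: treat the parameter as significant to the model;
    # otherwise consider fixing it to 0.0, removing it from the model.
    return abs(t_value(estimate, std_error)) > 2.0
```

An estimate of .45 with a standard error of .10 gives t = 4.5 and would be retained; the same estimate with a standard error of .40 gives t ≈ 1.1 and might be fixed to 0.0.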
Because all standardized residuals are normed to a common scale, they are easier to interpret. Values for standardized residuals which are greater than 2.0 for any element of the model indicate model misspecification (Byrne 1989).

LISREL also provides modification indices and estimations of the expected change in χ² for each modification index. Both of these pieces of information are provided for each parameter that is not identified in a model. Modification indices that are relatively large indicate that model fit might change if the associated parameter were added to the existing model. The estimations of the expected change in χ² are estimates of the change that might be expected in the χ² fit statistic once the parameter is added to the model.9

Table III.1. Fit Indices

R²: reliability of a measure of a latent variable; range 0.0 (poor fit) to 1.0 (good fit); ≥.70 considered reasonable.
χ², df, p: fit between the constrained model and the unrestricted sample data; poor fit: large χ², small p; good fit: small χ², large p.
GFI: amount of variance and covariance explained by the model; range 0.0 to 1.0; >.90 good, >.85 considered acceptable.
AGFI: amount of variance and covariance explained by the model, adjusted for degrees of freedom; range 0.0 to 1.0; >.90 good, ≥.85 acceptable.
NFI: fit of the proposed model relative to a null model; range 0.0 to 1.0; ≥.90 indicates good fit.
RMSEA: average discrepancy between elements in the sample and hypothesized covariance matrices; range 0.0 to 1.0; <.05 suggested.
Standardized residuals: discrepancy of fit between the sample and hypothesized covariance matrices; absolute values >2.0 indicate poor fit.
T-values: significance of a parameter to the model; |t| < 2.0 poor fit, |t| > 2.0 good fit.
Standard errors: amount of error associated with a parameter estimate; large relative to the estimate indicates poor estimation, small relative to the estimate indicates adequate estimation.
Modification indices, expected change in χ²: additional parameters that would change the χ² test of model fit; relatively large values suggest respecification.

III.3.7. Benefits. The application of structural equation modeling provides several theoretical advantages to research in the social and behavioral sciences. One of the most important advantages is a result of the combination of measurement and structural models as discussed above. When measuring abstractions--such as textual elaboration, organization and content--these measurements can often contain sizeable amounts of error. The inclusion of measurement models in the full SEM model allows the researcher to take into account measurement error, thus allowing investigations of abstract constructs that are not confounded with error (Bollen 1989, Newcomb 1990).

Another measurement issue commonly raised with regard to the social and behavioral sciences is that of the level of measurement (Dwyer 1983, Goldberger and Duncan 1973). Often measures of abstract constructs, such as those identified previously, are not linear. Use of nonlinear measures may violate assumptions of other multivariate analyses.
However, SEM analyses accommodate ordinal level data through the calculation of appropriate correlations and the application of appropriate estimation methods (Joreskog and Sorbom 1988; 1993).10

As a result of model specification, a demand is placed on researchers for an explicit statement of theoretical assumptions underlying a model and the testing of those assumptions (Hughes, Price and Marrs 1986). In developing a model to be tested by SEM procedures, the researcher must identify measures of latent variables, indicate whether measurement errors are correlated and identify latent variables and the relationships among latent variables. The basis of these specifications should be theory or previous empirical evidence. Results from the analyses: 1) indicate whether the model as a whole fits the observed relationships in the data and 2) identify specific parameters which do not fit the data. Results also identify revisions to the model that would yield a better fitting model, yet the researcher should provide theoretically sound arguments for making any changes to the model (Joreskog and Sorbom 1988; 1993, MacCallum, Roznowski and Necowitz 1992).

Unlike ANOVA and regression analyses, measurement errors associated with different variables are not assumed to be equal in SEM analyses (Bollen 1989, Fassinger 1987, Joreskog and Sorbom 1988; 1993, Pedhazur 1982). The importance of this issue becomes apparent as the number of measures and type of measures involved in a study increases. One might expect measurement errors to differ across raters and rating scales.
The rating scales in this study, for instance, call for different textual features to be evaluated, including observable features--such as the use of prepositions and articles--as well as more elusive features--such as the reasonableness of content and the effects of various features on intelligibility. As Carlson (1988) and Ferris (1990) have demonstrated, some features are rated with less error than others. In any rating session, one may observe variations in rater behavior which may yield different amounts of measurement error associated with individual raters.11

Instead of being limited to analyses of a single dependent variable as is the case in regression analyses, SEM makes it possible to analyze one or more dependent variables (Pedhazur 1982, Pedhazur and Schmelkin 1991). This possibility allows the analysis of models such as the one represented in Figure IV.1 (see IV.9.1). If regression were used to analyze the relationships represented in this model, separate analyses would have to be run. SEM, however, allows the researcher to analyze the complete model, maintaining the coherence of the theoretical model and providing a firmer basis for valid, reliable interpretations (Bollen 1989, Joreskog and Sorbom 1988).

In other multivariate analyses, errors are assumed to be uncorrelated (Fassinger 1987, Pedhazur 1982). This assumption can be relaxed in SEM analyses in some cases. For example, the errors associated with the same rater for various ratings can be allowed to correlate. It is reasonable to assume that these errors would be correlated since it has been observed that a rater may behave similarly across various ratings.
It has been found, for instance, that individual raters evaluate different features of protocols with a similar level of severity across rating categories (McNamara and Adams 1991, Weasenforth 1993).12

III.3.8. Disadvantages. Although SEM provides a number of theoretical benefits, it also poses several practical problems. One of the problems encountered in applications of SEM is that rather large sample sizes are required. Sample size requirements are particularly demanding when the weighted least squares method of estimation is employed. As an example, for the model proposed in Figure IV.1 (see IV.9.1) with 45 observed variables, a sample size of 990 essays is needed (Joreskog and Sorbom 1988). The time and financial support needed to collect several ratings for each of sixteen textual dimensions for each of 990 essays is impractical for this study, as it may also be for others.13

The issue of model fit has received a great deal of attention. Much of the discussion of model fit is due to the difficulties of basing fit statistics on χ² distributions. As a result of being based on χ² distributions, fit statistics are not reliable with relatively large or small samples (e.g., Boomsma 1983, Bollen 1986; 1990, Hu, Bentler and Kao 1992, Tanaka 1987). χ² fit statistics based on large samples tend to indicate lack of fit while fit statistics based on small samples tend to show fit. This problem has been somewhat resolved through the provision of a number of fit statistics, some of which are independent of sample size.

Model identification is another issue which has also been discussed rather often. In spite of this attention, verifying the identification of a model is a difficult task to accomplish. Indeed, Long (1983b) describes identification as one of the most difficult practical problems of applying SEM analyses.
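The 990-essay figure for 45 observed variables is consistent with requiring at least k(k-1)/2 cases for weighted least squares estimation, where k is the number of observed variables; that reading of the requirement is an inference here, not a formula quoted from Joreskog and Sorbom (1988).

```python
def wls_min_sample(k):
    # k(k-1)/2: the number of distinct off-diagonal covariance elements
    # among k observed variables, taken here as the minimum sample size
    # for weighted least squares estimation (an assumed rule of thumb).
    return k * (k - 1) // 2

# For the 45 observed variables of the model in Figure IV.1:
needed = wls_min_sample(45)  # matches the 990 essays cited above
```

The requirement grows roughly quadratically in the number of observed variables, which is why adding indicators quickly makes WLS estimation impractical for small corpora.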
A final disadvantage which I would like to identify may have become obvious in the foregoing discussions. In spite of the relative ease of computation which computers provide, a number of complex procedures are necessary. These procedures require an understanding of statistics, a knowledge of the theoretical field in which one works and experience with the statistical software needed to complete analyses. These requirements may discourage researchers and potential audiences and thereby represent a potential disadvantage.

Notes

1. SEM is known by several other names, including structural modeling, causal modeling, analysis of covariance modeling and latent variable modeling. Because of the predominant use of the LISREL computer program to run SEM analyses, SEM models are often referred to as LISREL models.

2. One may rightly argue that some of the observed variables--such as the reasonableness of the thesis--seem to be as unobservable as the construct content. Reasonableness of a thesis could also be represented as a latent construct associated with a set of tangible textual features. It could also be argued that the observed variables are not associated with the appropriate construct or that they may be associated with more than one construct. The point is that sets of directly measured variables are assumed or hypothesized to be associated with unobserved constructs.

3. Using SEM for model building may involve continual respecification of a model. However, one still begins the analysis by specifying a plausible model, modifying it according to results.

4. This issue is considerably more complicated than I have portrayed it here. Actually, an important distinction should be drawn between just-identified and overidentified parameters and models. A more elaborated, but intuitive, discussion is provided by Pedhazur and Schmelkin (1991).
More elaborated and more technical discussions can be found in Bollen (1989) and Long (1983b).

5. Estimation of a measurement model with only two measures of a latent variable, for example, is impossible because the model is not identified. However, once the measurement model is incorporated in a full model, parameters of the previously non-identified model may be consistently estimated. Estimation of the incorporated model may be possible because there may be enough information in the rest of the model which would compensate for the lack of information in the non-identified measurement model.

6. Joreskog and Sorbom (1989: 17) state that if a model is identified, the information matrix "is almost certainly positive definite." If the matrix is singular, the model is non-identified. They also suggest estimating a model with credible values, then using the Sigma matrix as the data matrix to estimate Theta. If similar estimates are calculated, Joreskog and Sorbom (1989: 18) suggest that the model is identified.

7. LISREL 8 performs a preliminary check of certain aspects of the hypothesized model after a specified number of iterations (20 by default). The Λx and Λy matrices are checked for full rank status; the Φ, Ψ, Θε and Θδ matrices are checked to assure that they are positive definite. If these conditions are not met, LISREL discontinues analyses and reports the solution is inadmissible, indicating that the model does not adequately fit the data.

8. These claims have been disputed by Marsh, Balla and McDonald (1988) and Tanaka (1989).

9. LISREL does not provide modification indices or estimations in expected change for parameters that already exist in the model. EQS, another widely used SEM program, does provide these indices. This represents a limitation of the LISREL program.

10.
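The heuristic in note 6--an identified model has a positive definite information matrix, while a singular matrix signals non-identification--can be sketched with a plain Cholesky attempt. This is a minimal stand-in for LISREL's internal check, not a reproduction of it.

```python
def is_positive_definite(m):
    # Attempt a Cholesky factorization of the square matrix m (a list of
    # lists). A non-positive pivot means the matrix is singular or
    # indefinite, which under the heuristic flags possible
    # non-identification.
    n = len(m)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                pivot = m[i][i] - s
                if pivot <= 1e-12:
                    return False
                L[i][j] = pivot ** 0.5
            else:
                L[i][j] = (m[i][j] - s) / L[j][j]
    return True

identity = [[1.0, 0.0], [0.0, 1.0]]  # positive definite
singular = [[1.0, 1.0], [1.0, 1.0]]  # rank 1, hence singular
```

Applied to the two matrices above, the check accepts the identity matrix and rejects the singular one.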
Ordinal data represent ordered categories of measures for which there is no metric scale. Statisticians are developing programs for the analysis of ordinal level multivariate data (see Cliff 1993 and Hildebrand, Laing and Rosenthal 1977). It is also possible to investigate the thresholds for ordinal data through PRELIS or FACETS in order to determine whether the data approximate interval distributions.

11. FACETS analyses of rater behavior (e.g., Weasenforth 1993) have demonstrated this variation in raters' behavior across ratings and test takers.

12. Although not observed in this study, there is another important benefit that SEM provides. Since SEM applies an ex post facto correlational research design, it does not require the assignment of separate comparison groups or the administration of an experimental treatment as in experimental research. As a result, the complexity of a study, especially when the number of variables increases, is decreased (Bachman 1989).

13. Bootstrapping procedures are available with PRELIS 2 and provide a means of circumventing the problem of small sample sizes, although not without some controversy. Since bootstrapping was used in this study, it will be discussed in Chapter IV. It is not a procedure integral to SEM and is, thus, not discussed in this chapter.

CHAPTER IV
Methodology

IV.1. Overview. The purpose of this chapter is to provide a description of the methodology employed to complete the present research study. The first aspect of the methodology to receive attention will be the personnel involved in the study. I have chosen to discuss the personnel separately because there were a number of people involved in the study, and the duties assigned to various personnel differed markedly. Extracting this information from further discussions will, I hope, ease the reader's way through the chapter.
Also receiving separate attention are descriptions of the corpus of texts used in the study and the three rating scales, with a more detailed description of the rhetorical abstraction scales. I will then provide descriptions of the variables measured through the application of the scales and through computer-aided frequency counts of various features in the texts. The statistical analysis, structural equation modeling (SEM), will then be discussed. Unlike the general discussion in the previous chapter, this discussion of SEM is focussed on the specification of two models used in this study. Discussions of the assumptions underlying the general methodology and of the limitations inherent to the methodology will conclude this chapter.

IV.2. Personnel. Because of the complexity of the research methodology, it may be helpful to the reader to extract from further discussion of the methodology descriptions of the personnel who were involved in the study. This relatively brief discussion will include descriptions of the personnel and the duties which they performed for the successful completion of the study. Details of the rating procedures will be provided in later discussions of the rating scales and the variables employed in the study.

A professionally trained typist was employed to type and edit the essays used for the study. In addition to typing the texts, she proofread the word processed copies to assure typing accuracy and edited them as described below. She also generated computer-aided counts of textual features used as estimates of text length and computer-aided counts of lexical items relevant to the study. Finally, she counted manually tagged grammatical features in the texts and normalized the frequency counts for text length. The typist also verified all counts and computations involved in the normalization procedures.
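The normalization of frequency counts for text length can be sketched as a rate per fixed number of words; the per-1,000-word basis below is an assumption borrowed from Biber's (1988) practice, since the exact basis is not stated at this point in the text.

```python
def normalized_frequency(raw_count, text_length_in_words, per=1000):
    # Convert a raw feature count to a rate per `per` words so that
    # essays of different lengths are directly comparable.
    return raw_count * per / text_length_in_words
```

For instance, twelve occurrences of a feature in a 400-word essay normalize to a rate of 30 per 1,000 words.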
Three sets of raters were hired to complete various types of evaluations of the texts. For the first set of ratings--the holistic ratings of overall textual quality--two raters were hired to work with the researcher in providing a set of three scores for each essay. Each of the three raters had at least five years of experience in teaching and assessing English as a Second Language (ESL) composition.

A second set of three raters--including two hired raters and the researcher--completed a set of 15 scores (5 per rater) for each essay, using the ESL Composition Profile analytic scale described below. Each of these raters also had at least five years of experience in both teaching and testing ESL composition. Two of the three had previously been trained by Educational Testing Service to rate Test of Written English essays. Neither of the hired raters was involved in the earlier holistic ratings.

The third set of raters--including a hired rater and the researcher--provided a set of 14 scores (7 per rater) for each essay. These scores resulted from applications of the rhetorical abstraction scales which were developed and validated specifically for this study and which are described in more detail below. The hired rater was also involved in the holistic evaluations which had been completed a year and a half earlier.

IV.3. Corpus. A total of 421 essays were collected from students in the Freshman Writing Program and the American Language Institute (ALI) at the University of Southern California. The students represented a range of ethnic backgrounds and proficiency levels--from intermediate ESL to second year freshman writing levels--and included both native and non-native speakers of English.
Four prompts (2 topics x 2 prompt types--graph and prose) were randomly distributed to students who were told that the essays they produced might be graded as in-class writing or used as first drafts of a class assignment (see Appendix A for copy of prompts). Students were allowed a total of 35 minutes to plan, write and revise their essays; each student wrote one essay for the study. All four prompts clearly requested argumentation; the two graph prompts additionally requested that information from the graphics be used to support arguments.1

All 421 essays were collected over a five week period during the later part of the Fall Semester of 1991. The first group of essays was collected during the seventh week of the semester, so that all subjects had received at least half of a semester of instruction in composition. In addition to collecting essays, a group of 26 students also volunteered to take part in two interviews--one before and one after writing their essays. These students were presented with the prompt which they would use/had used for the writing exercise. They were not told that they would use the prompt for the writing assignment and they did not write their essays any sooner than 48 hours after the first interview.

All but nine essays were found to be argumentative in nature. These nine essays were either criticisms of the activity or purely narrative or descriptive in nature, as determined by the researcher, and were, thus, excluded from this study as well as from previous studies involving the essays. The 412 remaining essays were holistically scored by three experienced, trained raters during the Spring Semester of 1992 using a traditional rhetorical rating scale (see Appendix B for copy of scale). This scale is described in more detail below.
An additional 21 essays were deleted from the corpus before the current study. In a comparative investigation of scores assigned to L1 and L2 essays, it was determined that the two groups of students represented two populations defined by score distributions.2 So as not to complicate interpretations of results that the inclusion of both populations might cause, the essays of the 21 L1 students were dropped from the present study. The remaining 391 essays were evaluated for the present study.

All 391 essays were word processed to allow for computer-aided analyses, including two measures of length and frequency counts of lexicogrammatical features which were identified by Biber (1988) as measures of textual dimensions of interest in this study. These measures will be described in more detail below. The only type of change intentionally made to the word processed copies was revision of unconventional spelling.3 This change affected only those lexical features (identified below) that are associated with the two textual dimensions defined by Biber (1988). The change was necessary for the reliability of the computer assisted identification of the relevant lexical features. No other intentional changes were made to the essays. To assure that no unintentional changes had been made, the essays were read after being processed and were compared with the original hand-written copies. All unintentional changes were revised so that the word processed copies were the same as the original copies except for the spelling changes made to those lexical items which are associated with the two textual dimensions defined by Biber.

The original handwritten copies of all 391 essays were also rated against two analytic scales. The rating scales and rating procedures are described below.

IV.4. Rating Scales.
In this section I will discuss the rating scales by identifying the rating categories, the textual features associated with the categories and the number of score points. I will also provide a discussion of the reasons for choosing these particular scales for use in this research. For the ESL Challenge Test Essay Evaluation Scale and the rhetorical abstraction scales, I will also discuss the development of the scales. Since the rhetorical abstraction scales were developed specifically for use in this study and since they have not been published, I will provide an extended discussion of their development as well as their validation.

IV.4.1. ESL Challenge Test Essay Evaluation Scale. The holistic ESL Challenge Test Essay Evaluation Scale (see Appendix B) was used to generate a set of three holistic ratings of each essay. Although not clearly indicated in the scale, there are four textual dimensions referenced: overall organization, paragraph organization, vocabulary/syntax and mechanics. The textual features referenced in the overall organization dimension include the:

• tripartite organization of texts,
• unity of the three parts,
• statement of a thesis,
• explicitness of the thesis and tripartite organization,
• clarity of the thesis and tripartite organization,
• quality of the formulation of the thesis and
• effectiveness of the thesis.

The second textual dimension--paragraph organization--references a number of textual features, including the:

• address of the prompt,
• organization of material within paragraphs,
• inter-relation of paragraphs and
• the use of cohesive devices.

The relevance of the first feature to paragraph organization seems obscure.
It, in turn, engenders a number of facets, including the:

• directness of addressing the topic,
• coherence of the response,
• comprehensiveness of the response and
• adequacy of the response.

The relation between comprehensiveness and adequacy is not indicated. The second feature is described in terms of the:

• relevance of details,
• "adequacy" of details,
• sufficiency/quantity of details,
• coherence of the integration of details within paragraphs,
• clarity of "focus,"
• paragraph "development," and
• consistency in the use of topic sentences.

References to the third feature include the:

• "logical" nature of development,
• "adequacy" of development,
• interrelation of paragraphs and
• "adequacy" and "appropriateness" of the development of a thesis.

The last feature mentioned in this category is described with references to the:

• existence of cohesive devices,
• "appropriateness" of the cohesive devices,
• "rudimentary" nature of the cohesive devices and
• variety in the use of cohesive devices.

The third dimension combines elements of vocabulary and grammar usage. The first feature is correctness of word forms. The second makes reference to the:

• range of vocabulary,
• use of sub-technical vocabulary and
• sophistication of the vocabulary.

The last feature is related to vocabulary usage and is described in terms of the:

• correctness of word choice,
• effect of word choice on intelligibility and
• use of appropriate register.

The other two features in this category are related to grammar. The first makes reference to the:

• existence of errors and
• effects of errors on intelligibility.

The second refers to the:

• variety in syntactic structures, defined as the proportional use of simple, compound and complex sentences.
The last dimension is related to textual features often subsumed under the label, "mechanics." The first feature in this dimension refers to the:

• violation/observation/usage of paragraphing conventions.

The second is defined by reference to the:

• "appropriateness" of punctuation and
• "consistency" of punctuation.

The third feature is defined by reference to the:

• "consistency" of capitalization and
• "appropriacy" of capitalization.

Finally, the last feature of this category is defined as the:

• existence of "spelling errors,"
• frequency of "spelling errors," and
• effect of the "spelling errors" on intelligibility.4

The scale has a range of 10 possible scores, "1" to "10," with two possible points for each of five profiles labeled "incompetence," "minimal competence," "developing competence," "proficiency w/ some errors," and "near native proficiency." No weighting of categories is indicated. There is also no indication of how the decision to choose one of the two scores for each profile should be made.

This scale was originally an adaptation of the essay rating scale used for the American Language Institute (ALI) placement examination. The 10 point range represented the range of proficiencies of students who normally took the placement examination. Use of the adapted scale was proposed for the ESL Challenge Test, but validation of the scales was not completed due to a lack of funds. In spite of the potential drawbacks to using this scale for research purposes, I chose to do so because of earlier research on prompt type effects. In order to investigate prompt type effects on students' performance on the ESL Challenge Test, this scale--which at the time was being used for the exam--was applied to generate holistic evaluations of essays. These holistic evaluations are also used for the present study.
In defense of the use of these scales, I should point out that the estimates of rater consistency (see Table IV.3 below) suggest that the scale was adequate (α > .80) in providing a basis for relatively reliable ratings. It also incorporates a number of features which are often included in scales used by other ESL programs. This being the case, results from this study may be more immediately applicable to other programs.

IV.4.2. ESL Composition Profile. The ESL Composition Profile is an analytic rating scale developed by Jacobs et al. (1981) and widely recognized in ESL programs (see Appendix B). The scale was developed specifically for academically oriented ESL programs and was designed to evaluate a wide range of writing proficiencies. The scale consists of five rating categories for each of which a rater assigns a separate score. The rating categories represent different textual dimensions, including content, organization, vocabulary, language use and mechanics.

The content category makes reference to a number of features, including the:

• adequacy of reasoning,
• development of the thesis,
• relevance of information and
• interrelation of ideas.

The organization category is defined by references to the:

• existence and effectiveness of a thesis,
• existence and "strength" of topic sentences,
• existence of opening and closing sentences and paragraphs,
• existence and "strength" of transitions,
• communication of ideas and
• "organization."

The rating category for vocabulary refers to the following features:

• correctness of idiom and word form usage in context,
• effectiveness of idiom and word form usage,
• clarity of meaning,
• use of translation,
• effectiveness of word choice,
• clarity of meaning and
• knowledge of English.
Language use is defined by the:
• variety of sentence types,
• effectiveness of sentence use,
• clarity of meaning,
• extent of the writer's mastery of sentence rules and
• correct usage of the following grammatical features: verb tense, subject-verb agreement, number, word order, word use, articles, pronouns and prepositions.
The final category, mechanics, makes references to the:
• frequency of occurrence of errors in spelling, punctuation, capitalization and paragraphing and
• legibility of the writer's handwriting.

As with the holistic rating scale, one may observe that each category is characterized with reference to many features, some of which may appear to be ambiguously defined. Also, the distinction between categories may not always be clear.

The original scales differed from those used in this study in the number of scores that could be assigned for each category. For this study, the scales were adapted in that only 4 possible scores could be assigned, with a range from "0" to "3," for all five categories. The only other change made was the deletion of the comments section at the bottom of the scale.

The scales were chosen to be used in the study for two reasons. The ESL community's familiarity with the scales promised immediate recognition by raters hired for the study and by ESL composition teachers and evaluators outside of the ALI as well. This familiarity not only may have saved time in rater training for the study, but also may have rendered the study more accessible to the ESL community at large. Secondly, the close relationship in structure and content of the scales to scales used in many composition programs provides a basis for generalizability of the results of the study.

IV.4.3. Rhetorical Abstraction Scales.
Of central interest in this study was the role of rhetorical abstraction in the assignment of holistic scores. Now I would like to discuss at some length the rating scales that were used to measure dimensions of rhetorical abstraction for this study. The discussion below will provide information on the format of the scales, including identifications of the rating categories and descriptions of the features referenced in them. Since the scales were developed specifically for this study, development and validation of the scales will also be discussed.5

IV.4.3.1. Rating Categories.

Toulmin's (1958) components of argumentation--claims, data, warrants and background--were incorporated in the rating scale (see Appendix B). A list of rating categories and descriptions of the categories are given below in Table IV.1. Definitions and examples of Toulmin's components are provided in Chapter II. More detailed descriptions of the textual features represented in Table IV.1 are also provided in Chapter II. This information is summarized below for the convenience of the reader.

The first rating category asks raters to indicate whether a major claim (i.e., proposition or thesis) is evident, either explicitly stated or implicit to the writer's discussion. Further reference is made in the first category to the extent to which the major claim identifies a single idea which is central to the writer's discussion.

The statement of direction category is defined with reference to the presence of direction for proposed solutions to the stated problem. Raters were also asked to evaluate the extent to which the statements of direction were elaborated.

The data use category was used to evaluate the writer's use of data, or evidence used to support minor claims.
Raters evaluated the relative amounts of evidence marshalled by the writer to substantiate his assertions.

The data type category refers to the relative proportions of two types of data that could be observed in the essays. One type of data was that which was explicitly associated with a secondary source of information or that which was borrowed from the essay prompts. The other data type was that which was not from the prompts or which was not associated with another source.

The explicit statement of warrants category refers to the relative frequency with which warrants were explicitly stated. The relative frequency of use is the only dimension of this category.

Consideration of causes refers to the identification of causes of the problem to be solved. This category also asks raters to evaluate the extent to which the discussions of causes are elaborated.

The rating category for examination of premises/issues prompted raters to indicate whether the writer explicitly stated underlying assumptions and/or the issues involved in his discussion. The category also referenced the extent to which the discussion was elaborated.

In concluding the description of the scales, I would like to point out that there were four score points, with a range of "0" to "3," for six of the seven scales. The data type scale had five possible scores ranging from "0" to "4."

Table IV.1
Rating Categories and Descriptions

Statement of claims: The presence of major and minor claims--implicit or explicit--and the extent to which they represent one central idea
Statement of direction: The presence of statements of direction--in major and minor claims--for the desired change and the extent to which the statements are elaborated
Data use: The extent to which data is used to support minor claims
Data type: The proportion of all data which appears to be expert opinion or statistical in nature versus that which appears to come from the writer's personal experience or from common knowledge
Explicit statement of warrants: The frequency at which explicitly stated warrants are used to link data with minor claims
Consideration of causes: The presence of discussions of causes of the perceived problem and the extent to which they are elaborated
Examination of premises/issues: The presence of explicit statements of: 1) assumptions underlying the writer's line of argumentation or 2) the history and/or nature of the main issue, and the extent to which the statements are elaborated

IV.4.3.2. Scale Development.

To date, I know of only two other attempts to construct rating scales for evaluating rhetorical abstraction. Harris, Laan and Mossenson (1988) developed a similar set of scales for narrative writing to measure the writing performance of children and to build a developmental model of writing abilities. They suggest that similar scales be developed for expository and argumentative discourse. As part of the Cambridge-UCLA Language Testing Project, Bachman et al. (1991) developed a set of scales to be used for describing test methods. Included in the set are scales for the degree of contextualization of textual information, the
distribution of new information, the abstractness of information and the complexity of rhetorical organization.

Since no scale of rhetorical abstraction for argumentation exists, it was necessary to develop and validate one. The development and validation of the scales are discussed in detail by Weasenforth (1993). The procedures and results are summarized here.

Several concerns guided efforts in developing the scales. In light of the possible complexity of the scales, given the overwhelming number of textual features associated with rhetorical abstraction, it was decided that the number of rating categories and descriptors should be limited. Observations of features from previous readings of the texts partially guided the choice of categories and descriptors. The usefulness of the scales--in terms of the generalizability estimates and consistency of the ratings, and the clarity of distinction between scales as evaluated by raters--also determined which of the scales would be used for this study. Likewise, the rating categories were defined with reference to one or two features as described in the previous discussion. As Carlson (1988) and Ferris (1990) have demonstrated, reliable ratings can be difficult to come by when raters are asked to evaluate elusive or ill-defined multidimensional textual features, such as warrants, audience address and metadiscourse. For this reason, evaluations were restricted--except in the case of statement of claims--to explicitly stated features.

IV.4.3.3. Validation of Scales.

Since no other measures of rhetorical abstraction for argumentative discourse exist, investigations of construct validity were restricted to appeals to two sources of expert opinion. I relied heavily on relevant literature as reviewed in Chapter II. Also, advice was gathered from Robert B.
Kaplan, Ross Winterowd and Stephen Toulmin, experts in rhetoric, discourse analysis and argumentation. These experts were asked to evaluate the scales in terms of: 1) the relevance of rating categories to the construct, 2) the relevance of descriptors to the construct, 3) the clarity and conciseness of the categories and descriptors and 4) step level judgments. Evaluations indicated that the scales were relevant to the notion of rhetorical abstraction, particularly to argumentative discourse, but that there exist other aspects of the construct that could also be investigated, aspects identified in Chapter II. While there was consensus that the categories appeared to be distinguishable, some descriptors for the rating levels were found to be ambiguous.

The construct validity of the scales was, thus, supported by the discourse analysis and argumentation experts consulted for the project. The features described in the scales and rater training materials appear to be relevant to what can be considered an elaboration of rhetorical elements of texts. Furthermore, the descriptors for each scale seem to describe levels of an elaboration of the features. The features and descriptors also appear to be relevant to argumentation in particular.

The other aspect of the validation process involved investigations of the reliability of ratings which resulted from applications of the scales. These investigations entailed both qualitative and quantitative procedures. It was crucial to verify that raters were not only able to identify textual features of interest, but that they were able to classify the features consistently according to the descriptors in the scales. Once they were able to identify features consistently, it was important to verify that they were able to rate the features consistently.
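The quantitative procedures described below rest on a persons x raters generalizability design. As an illustrative sketch only (hypothetical scores, not the study's data), the variance components and relative G coefficient for a fully crossed one-facet design can be estimated from the two-way ANOVA mean squares:

```python
def g_study(scores):
    """Variance components and relative G coefficient for a fully crossed
    persons x raters design with one score per cell, using the standard
    expected-mean-square equations."""
    n_p, n_r = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    p_means = [sum(row) / n_r for row in scores]
    r_means = [sum(col) / n_p for col in zip(*scores)]
    ss_p = n_r * sum((m - grand) ** 2 for m in p_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in r_means)
    ss_tot = sum((x - grand) ** 2 for row in scores for x in row)
    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = (ss_tot - ss_p - ss_r) / ((n_p - 1) * (n_r - 1))
    var_res = ms_res                          # residual (p x r interaction + error)
    var_p = max((ms_p - ms_res) / n_r, 0.0)   # persons (true-score) variance
    var_r = max((ms_r - ms_res) / n_p, 0.0)   # rater (severity) variance
    g = var_p / (var_p + var_res / n_r)       # relative G coefficient
    return var_p, var_r, var_res, g

# Hypothetical scores: 5 essays (rows) rated by 3 raters (columns)
ratings = [[8, 8, 7], [5, 6, 5], [9, 9, 9], [3, 4, 3], [6, 6, 7]]
var_p, var_r, var_res, g = g_study(ratings)
```

Dividing each component by their sum gives the percentages of observed variance attributed to examinees, raters, and the residual, as reported in the GT results below.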
This part of the validation process was completed in three sessions. During the first session, only qualitative evidence regarding the usefulness of the scales was collected. The second and third phases of the investigation included discussions of the scales and independent ratings of essays. Ratings from these last two sessions were analyzed qualitatively and quantitatively, using Generalizability Theory (GT) and multifaceted Rasch scalar analyses. Rating sessions were also recorded to provide additional evidence of raters' use of the scales.

A survey of the data matrices indicated that raters were successful in recognizing the existence of the features. The taped discussions during training sessions also indicated that raters had developed a common definition of the textual features. Finally, a tagging and labeling procedure provided evidence that raters were not only able to recognize the existence of features, but that they were, in fact, identifying features consistently.

Tables containing statistical information used in the validation process are provided in Appendix D. Statistics for each of the original sixteen scales are given although only seven of these scales were used for the current study.

Descriptive statistics for the ratings gathered during the validation procedures are reported in Tables 1 and 2. Relatively large differences in mean scores across investigation phases raised questions about the reliability of ratings but could largely be attributed to rater training. The small differences for the last three scales reflect the fact that there were relatively few of these features observed in the essays.
The rater consistency reflected in the mean scores for the logical entailment and logical relation scales would have been welcome were it not for the fact that the scores tended to cluster around a single score. These scales appeared to be ineffective in distinguishing writers' abilities in terms of logically relating elements of an argument.

The standard deviations reported for the third phase results were larger than those for the second phase results, indicating a greater amount of variation in ratings for the third phase. The larger amounts of variation were of concern because they could reflect a decrease in the reliability of ratings. However, the greater variation did not appear to be due to raters since the standard deviations for scores--except those for the last three scales--assigned by each rater were reduced by the third phase. Rater variation did account for greater percentages of total observed variance in the third phase data than in the second phase data (see Table 3 in Appendix D). However, except for the development of major claim ratings, the increases in percentages were not significant (assuming that an association of 10% of total variance with raters is reasonable). Increases in variance estimates could be observed for test takers for some scales and for residuals for other scales. Thus, the greater variation in scores generally did not appear to be associated with rater behavior, allaying concerns about rater consistency.

A cursory analysis of the figures for the second phase revealed relatively more variance in scores across raters for eight scales for the second phase as compared to the third phase. On the other hand, the figures for the third phase revealed relatively more variance across raters for four of the scales than there was for the second phase ratings.
The GT analyses indicated that only two of the instances were statistically significant: the variance in statement of claims scores for the second phase and the variance in development of major claim scores for the third phase.

Results from the GT analyses for each scale are reported in Table 3 (see Appendix D). For both phases of the investigations, almost all variance estimates for raters were reasonably low--i.e., less than 10% of the total observed variance--indicating that raters were relatively consistent in applying the scales. In general, the percentages of variance associated with examinees were respectable (i.e., >50%) although there are a number which are considerably lower. The low estimates were attributed to raters' not being able to identify these features and to little variation in the occurrence of these features. Estimates for the residuals were disturbingly high; one would ideally like to see 0% for all residuals. Unfortunately, there is no way to identify the exact source of this variation since rater x test taker interactions, other sources of systematic error and sources of random error are confounded in the estimate.

Generalizability coefficients indicated that many of the scales performed well enough, in terms of raters' consistent interpretation of them, to be useful in providing reliable ratings of the constructs they describe. Ratings from seven of the 15 scales were reliable at the .80 level or higher by the last phase of the investigation; three were reliable at the .69-.79 levels.

Multi-faceted Rasch scalar analyses were also completed for the last two phases of the investigation of rater consistency. FACETS rater measurements (see Table 4 in Appendix D) provided several indices of rater consistency.
The total of all scores for each rater serves as an index of the consistency of raters across all scales, all rating levels and all test takers. The totals for the second phase range from 555 to 639, a difference of 84 points assigned by raters 2 and 3. In the third phase the range was narrower, 641 to 696, with a difference of 55 points assigned by raters 1 and 2. The difference between score ranges indicated that raters seemed to be more consistent overall in the last set of ratings in terms of the total number of points assigned.

The fit statistics for the second phase indicated that rater behavior for raters 1 and 3 was not consistent with the Rasch model. The fit statistic (-2) for rater 1 indicated that there was too little variation in the first rater's scores for his behavior to be adequately modeled. The fit statistic (2) for rater 3, on the other hand, indicated that there was too much variation in his scores for adequate fit to the model. Fit statistics for the third phase indicated that the scores raters assigned were consistent with the Rasch model, indicating that each rater was generally consistent in his/her application of the scales and that the variations in each rater's scores were roughly equal.

The fixed χ² tests suggested that significant differences in rater severity existed for both phases. Statistics for the second phase data indicated that all raters differed significantly in severity. Statistics for the third phase data indicated that raters 2 and 3 applied the scales with nearly equal levels of severity, but they were significantly less severe than rater 1.
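The totals-based comparison above amounts to a simple computation over each rater's scores. A minimal sketch with hypothetical per-rater score lists (not the study's data):

```python
def rater_totals(scores_by_rater):
    """Total points assigned by each rater, plus the spread between the most
    and least generous raters, as a crude index of severity differences."""
    totals = {rater: sum(scores) for rater, scores in scores_by_rater.items()}
    spread = max(totals.values()) - min(totals.values())
    return totals, spread

# Hypothetical: three raters scoring the same four essays
phase = {"r1": [7, 6, 8, 5], "r2": [6, 5, 7, 4], "r3": [8, 7, 9, 6]}
totals, spread = rater_totals(phase)
# A narrowing spread across phases would parallel the 84-point vs 55-point
# difference reported above.
```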
The χ² results seemed to contradict the GT results, which portray raters as being very consistent. It may well be that raters varied in severity but rank ordered the essays similarly (see Linacre 1989 for a discussion of various conceptualizations of rater reliability).

FACETS also identified inconsistent interactions of facets, that is, interactions that do not appear to be consistent with the majority of the same type of interactions. Two types of interactions were investigated with FACETS: rater x test taker and rater x scale. The estimates for raters and test takers and the estimates for raters and scales are reported in Tables 5 and 6 (see Appendix D). There are four cases of significant interactions between raters and test takers for the second phase data. By the third phase of the investigation, the number of significant interactions was reduced to one. Likewise, the number of significant rater x scale interactions was reduced from seven to four. The reduction in the number of significant interactions suggests that raters were generally able to apply the scales with a higher degree of consistency by the third phase of the investigation.

Because of the large amount of time (approximately 8 minutes per essay) needed to apply sixteen scales to each essay, it was decided to use only half of the scales for this study. The choice of scales was based on the statistical results just summarized as well as raters' comments regarding the scales' usefulness.

IV.5. Variables.

Now that I have provided descriptions of the personnel who generated the data used in this study, the corpus on which the data is based and the measurement instruments used to generate the data, I would like to describe the variables in more detail.
These include three sets of ratings and fifteen interval-level measures--three frequency counts of text length and twelve frequency counts of lexical and grammatical features in the essays. Each of these measures was used in the LISREL analyses as an observed variable, that is, a measure of a latent textual dimension. The following discussions of the variables will address the issues of their labeling for use in the LISREL analyses, their role in the LISREL models, the procedures followed in calculating measures of the variables and estimations of the reliability of the measures.

For the reader's convenience, I have provided in Table IV.2 a complete list of the measures used in the LISREL analyses, organized according to the latent variable with which they are associated, the level of measurement--ordinal or interval--which they represent and the type of variable--endogenous or exogenous--of the associated latent variable. A graphic representation of a proposed LISREL model (see Figure IV.1 below) may be more immediately interpretable for readers familiar with path diagrams. The exemplary model will be described in detail later in the discussion of structural equation modeling. I have also summarized in Table IV.3 the estimations--using Cronbach's α--of the reliability of each variable as a measure of the latent textual dimension. Descriptive statistics are provided for each variable in Appendix C. LISREL also provided various parameter estimates--including squared multiple correlations, error estimates, residual estimates and t-values--which indicate whether the ratings were reliable, statistically significant measures of an underlying variable. This information will be provided in Chapter V.

IV.5.1. Holistic Ratings of Overall Textual Quality.
The holistic ratings (HR1, HR2, HR3) served as measures of the single endogenous variable, overall textual quality, in the full LISREL models (see Figure IV.1 for an example). These are the ratings that were based on the ESL Challenge Test Essay Evaluation Scale and were completed by three raters, including the researcher. Each of the variables--HR1, HR2 and HR3--represents the set of 391 scores for each rater.

Three experienced raters completed the holistic evaluations during one weekend at the end of the Fall Semester 1991. Raters independently scored essays in batches of approximately 50 essays each. In addition to the norming session at the beginning of each day's rating session, there were norming sessions before rating each batch of papers, using approximately 3-4 essays drawn from the batch to be rated. Whenever scores for a particular essay varied by more than 2 points, raters were asked to reread the essay independently and rescore it so that the scores were brought to within a 2-point range.6

Cronbach's α (.88) indicates that raters were adequately (i.e., α > .80) consistent in their application of the scale. It also suggests that the ratings do serve as reliable measures of a single latent variable. LISREL also provided information which indicates that the ratings were reliable, statistically significant measures of an underlying variable which I have labeled overall textual quality.

IV.5.2. Rhetorical Abstraction Ratings.

The rhetorical abstraction ratings (R11-R72) were originally hypothesized to be measures of various aspects of a single latent textual dimension called rhetorical abstraction.
As previously described in more detail, these aspects included statement of claims (R11 and R12 served as measures), statement of direction (R21 and R22), data use (R31 and R32), data type (R41 and R42), explicit statement of warrants (R51 and R52), consideration of causes (R61 and R62) and examination of premises and issues (R71 and R72). Each variable--R11, R12, R21, R22, etc.--represents the set of scores assigned by each rater for each rating category.

These measures were ratings assigned by two ALI personnel and were based on a set of seven rhetorical abstraction scales. Ratings were completed over a consecutive five-day period after the Fall Semester of 1993. Socialization sessions were conducted prior to starting the rating for the day, after lunch break and mid-afternoon. Ratings were completed independently by the two raters. At the end of the day, ratings were collated and checked for discrepancies. Whenever ratings varied by more than one point, raters reread the associated essay and rerated it independently so as to bring the ratings to within a one-point range.7

Cronbach's αs were computed for the complete set of 14 scores and for each pair of scores. The α for the full set of scores was .63, indicating that the raters did not apply the rating scales, jointly, to an adequate degree of consistency. The relatively small index may also suggest that the set of ratings are not measures of a single underlying construct. The αs for each of the subcategories were almost all adequately large. The only marginally respectable index was .79 for ratings of explicit statement of warrants.
The αs for the subcategories indicate that the ratings were reliable measures of the textual dimension associated with the appropriate subcategory, although the α for the whole set of ratings suggests that the seven subcategories may not be aspects of a common textual dimension.8

IV.5.3. ESL Composition Profile Ratings.

The final set of ratings (CT1-CT3, OG1-OG3, LU1-LU3, VC1-VC3 and MC1-MC3) were generated from the application of Jacobs et al.'s (1981) scales. Three scores (CT1, CT2, CT3) serve as measures of content as defined in the scale. The scores OG1, OG2 and OG3 are measures of organization; VC1, VC2 and VC3 are measures of vocabulary usage. The last two sets of scores are measures of language use (LU1, LU2 and LU3) and mechanics (MC1, MC2 and MC3).

Three raters completed all ratings during a three-day period after the Fall Semester 1993. Socialization sessions were conducted prior to starting the rating for the day, after lunch break and mid-afternoon. Ratings were completed independently by the three raters. At the end of the day, ratings were collated and checked for discrepancies. Whenever ratings varied by more than one point, raters reread the associated essay and rerated it independently so as to bring the ratings to within a one-point range.9

Alpha estimates of rater consistency were computed for each set of ratings (i.e., CT1-CT3, OG1-OG3, etc.). All estimates were respectable except the one for the mechanics ratings (.76). LISREL analyses of the individual measurement models estimated adequate fit for each model.

IV.5.4. Text Length.

Three measures of text length were taken. One measure of text length was the number of words in each text (TL1).10 The typist used the Microsoft Word 4.0 Count function to calculate word counts.
Once the word count for each text was calculated, it was necessary to read through the text to look for hyphenated words and acronyms and adjust the word count accordingly if these features were found. The typist completed a second count of each text to verify the accuracy of the counts.

The number of clauses (TL2) was used as a second measure of text length. The researcher manually marked each clause in each essay; the typist counted the number of clauses in each essay.11 Clause identification and counts were checked for accuracy by a second pass through the essays.

The third measure of text length was the number of characters per essay (TL3). This count included the number of characters, punctuation and spaces, including the five indentation spaces at the beginning of paragraphs. As with the word counts, the typist used the Microsoft Word 4.0 Count function to generate these data. She repeated this procedure to verify the accuracy of the original count.

The standardized α for the three measures of text length was .97. This value indicates that all three measures are associated with a common underlying textual dimension.

IV.5.5. Textual Elaboration.

Frequency counts were made of four lexicogrammatical features: 1) that clauses as verb complements, 2) demonstratives, 3) that relative clauses on object positions and 4) that clauses used as adjectival complements. These features were found by Biber (1984; 1988) to load on the same common factor in his factorial analysis of textual dimensions. This common factor was labeled textual elaboration, as discussed in Chapter II. Each of the four features is represented in the LISREL model (see Figure IV.1 below) as a separate measure of the textual dimension, textual elaboration.
That clauses as verb complements are represented by the variable IE1, demonstratives by IE2, that relative clauses on object positions by IE3 and that clauses used as adjectival complements by IE4.

In response to the very low α (.06) and the preliminary factor analysis and LISREL analysis results, which indicated that the four measures were not associated with a common textual dimension, the use of a composite variable in place of the four separate measures was suggested. The composite variable, IEC, is defined as the sum of the four scores for each text.

Counts for the textual elaboration features were generated using the Microsoft Word 4.0 Find function. Biber's (1988) lists of the various features were used in the identification process. In order to identify demonstratives, the typist input into the Find function each form of the feature. The Find function then identified each occurrence of the feature, one after another. A tally of the number of occurrences for each essay was made. The typist verified the accuracy of the tallies by combing through the essays a second time.

In order to identify the three types of that clauses, the researcher used the Find function by inputting the word that. Each occurrence of the word was identified, at which point the researcher categorized the use of that as one of the three features of interest or as some other type of feature. Tallies for each of the three features were made for each essay. The researcher verified the accuracy of the identification and categorization of the that clauses. The typist verified the totals for each essay.

Once the counts were calculated and verified, the counts for each essay were normalized for a text length of 500 words.12 Accuracy of the computations was verified by computing each score a second time.
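The normalization just described is a simple proportional scaling. A minimal sketch (the function name and example counts are illustrative, not from the study):

```python
def per_500_words(raw_count, text_length_in_words):
    """Scale a raw feature count to a common 500-word basis, so counts
    from essays of different lengths are directly comparable."""
    return raw_count * 500 / text_length_in_words

# e.g., 3 demonstratives in a 250-word essay yield a normalized count of 6.0
```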
The normalized frequencies were used in the statistical analyses of this study.

IV.5.6. Topic Abstraction. Syntactic features associated with abstract/situated style, as defined in Biber's (1988) research, were also identified and counted. These features include 1) conjuncts, 2) agentless passives, 3) BY-passives, 4) past participial clauses, 5) past participial WHIZ deletions and 6) "other" adverbial subordinators (Biber 1988). Each of the six features is represented in the LISREL model (see Figure IV.1) as a separate measure of the textual dimension. Conjuncts are represented by the variable AB1, agentless passives by AB2, BY-passives by AB3, past participial clauses by AB4, past participial WHIZ deletions by AB5 and "other" adverbial subordinators by AB6.

As in the case of the previous variable, it was decided to include in the LISREL analyses a composite variable, ABC, in place of the six separate measures. The theoretical questions and practical problems raised are the same as those raised with respect to the composite score for textual elaboration. The α estimate of reliability is also very low (.06). Also, the preliminary factor and LISREL analyses indicate that the six measures are not associated with a common underlying textual dimension.

Procedures similar to those used to identify and count the textual elaboration features were also used for the topic abstraction features. The typist input each of the lexical items for conjuncts and "other" adverbial subordinators into the Find function and generated frequency counts for each of the items for each essay. She then verified the totals by completing a second sweep of the essays. The researcher used the Find function to identify BY-passives, following the same procedures used to identify and count the THAT clauses for textual elaboration.
The researcher also completed a second sweep of the essays to verify the identification and counts of BY-passives. Since the Find function could not identify agentless passives, past participial clauses and past participial WHIZ deletions, the researcher read through each of the 391 essays, tagging each occurrence of the features. A second reading was done for each of the features to check the accuracy of the identification and count of each of the three features. Once the counts were calculated and verified, the counts for each essay were normalized for a text length of 500 words. Accuracy of the computations was verified by computing each score a second time. The normalized frequencies were used in the statistical analyses of this study.

Table IV.2
Latent and Observed Variables

Latent Variables                   Observed Variables   Descriptions of Observed Variables
Overall textual quality            HR1, HR2, HR3        Set of 3 holistic scores based on ESL Challenge Test scale
Rhetorical Abstraction
  Statement of claims              R11, R12             Set of 2 scores based on Rhetorical Abstraction scale
  Statement of direction           R21, R22             Set of 2 scores based on Rhetorical Abstraction scale
  Data use                         R31, R32             Set of 2 scores based on Rhetorical Abstraction scale
  Data type                        R41, R42             Set of 2 scores based on Rhetorical Abstraction scale
  Explicit statement of warrants   R51, R52             Set of 2 scores based on Rhetorical Abstraction scale
  Consideration of causes          R61, R62             Set of 2 scores based on Rhetorical Abstraction scale
  Examination of premises/issues   R71, R72             Set of 2 scores based on Rhetorical Abstraction scale
Content                            CT1, CT2, CT3        Set of 3 holistic scores based on ESL Composition Profile
Organization                       OG1, OG2, OG3        Set of 3 holistic scores based on ESL Composition Profile
Language use                       LU1, LU2, LU3        Set of 3 holistic scores based on ESL Composition Profile
Vocabulary                         VC1, VC2, VC3        Set of 3 holistic scores based on ESL Composition Profile
Mechanics                          MC1, MC2, MC3        Set of 3 holistic scores based on ESL Composition Profile
Text Length                        TL1                  Number of words in text
                                   TL2                  Number of clauses in text
                                   TL3                  Number of characters in text
Topic Abstraction                  AB1                  Normalized frequency count of conjuncts
                                   AB2                  Normalized frequency count of agentless passives
                                   AB3                  Normalized frequency count of BY-passives
                                   AB4                  Normalized frequency count of past participial clauses
                                   AB5                  Normalized frequency count of past participial WHIZ deletions
                                   AB6                  Normalized frequency count of adverbial subordinators
                                   ABC                  Composite of AB1-AB6
Textual Elaboration                IE1                  Normalized frequency count of THAT clauses as verb complements
                                   IE2                  Normalized frequency count of demonstratives
                                   IE3                  Normalized frequency count of THAT relative clauses on object positions
                                   IE4                  Normalized frequency count of THAT clauses as adjective complements
                                   IEC                  Composite of IE1-IE4

IV.6. Descriptives and Reliabilities. Univariate and bivariate descriptive statistics of the variables, including means, standard deviations, kurtosis and skewness, were calculated.13 The results of the univariate analyses are presented in Appendix C. Cronbach's α was also calculated as an estimate of the consistency of each set of variables in measuring the associated latent variable. The α estimates are provided in Table IV.3 below.

Table IV.3
Reliability Estimates

Latent Variables/Measures                          Cronbach's α
Overall Quality (HR1, HR2, HR3)                    .8[?]
Content (CT1, CT2, CT3)                            .82
Language Use (LU1, LU2, LU3)                       .81
Mechanics (MC1, MC2, MC3)                          .76
Organization (OG1, OG2, OG3)                       .80
Vocabulary (VC1, VC2, VC3)                         .8[?]
Rhetorical Abstraction (R11-R72)                   .6[?]
  R11, R12 (Statement of claims)                   .89
  R21, R22 (Statement of direction)                .89
  R31, R32 (Data use)                              .92
  R41, R42 (Data type)                             .97
  R51, R52 (Explicit statement of warrants)        .79
  R61, R62 (Consideration of causes)               .95
  R71, R72 (Examination of premises/issues)        .83
Textual Elaboration (IE1, IE2, IE3, IE4)           .06
Topic Abstraction (AB1, AB2, AB3, AB4, AB5, AB6)   .06
Text Length (TL1, TL2, TL3)                        .41 (standardized α = .97)14

These results were used partially to address the LISREL assumption of multivariate normal distributions of variables. Although investigations of multivariate distributions are more complex than simply looking at univariate and bivariate distributions, it was nonetheless useful to look at these distributions. They revealed that the measures associated with Biber's textual dimensions were highly (i.e., >2) skewed, underscoring the need for the use of weighted least squares estimation in the LISREL analyses. Furthermore, the descriptive statistics indicated that the distributions of the measures R41 and R42, associated with the rhetorical abstraction variable data type, were parabolic. This observation suggested that the data type variable should not be included in the LISREL analyses, since LISREL is not able to accommodate such distributions.

IV.7. Preliminary Exploratory Factor Analyses. Exploratory factor analyses were completed prior to running the LISREL analyses.15 The purpose of these preliminary analyses was to investigate aspects of the proposed models described in IV.9. Of interest was the association of observed variables with first-order underlying common factors.
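As a concrete illustration of how the reliability estimates in Table IV.3 are obtained, Cronbach's α for a set of parallel ratings can be computed from the per-item variances and the variance of the per-essay total scores. The scores below are invented for illustration; only the formula reflects the study's procedure.

```python
from statistics import variance

def cronbach_alpha(ratings):
    """Cronbach's alpha for `ratings`: a list of score lists, one per
    measure/rater, each containing one score per essay."""
    k = len(ratings)
    totals = [sum(scores) for scores in zip(*ratings)]   # per-essay total score
    item_vars = sum(variance(item) for item in ratings)  # sum of per-item variances
    return k / (k - 1) * (1 - item_vars / variance(totals))

# Invented example: three holistic scores (in the spirit of HR1-HR3) for five essays.
hr = [
    [4, 3, 5, 2, 4],
    [4, 2, 5, 3, 4],
    [5, 3, 4, 2, 3],
]
print(round(cronbach_alpha(hr), 2))
```

High agreement among the parallel scores drives α toward 1, as in the holistic and rhetorical abstraction sets above, while unrelated measures produce the near-zero values seen for the composite Biber features.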
For example, the hypothesis that R11 and R12 were measures of a common textual dimension which could be labeled statement of claims was tested. Similarly, assumptions related to the factorial structure of the other constructs incorporated in the full models were investigated. Carroll's (1992) factor analysis programs are designed to facilitate the identification of higher-order factors, which was a second aspect of interest in these preliminary analyses. Of particular interest was the hypothesized higher-order structure of rhetorical abstraction, as presented in Figure IV.1. It was also assumed that other higher-order factors might be identified, given investigations which indicate that language use may be adequately modeled as a hierarchical structure (Bachman 1990, Sang et al. 1986). Forty-two of the variables identified in IV.5 were included in the factor analyses. The composite scores for topic abstraction and textual elaboration were excluded.

IV.8. Preliminary Regression Analyses. A preliminary stepwise regression was performed prior to the LISREL analyses in order to gain a rough perspective on the behavior of the variables in predicting overall textual quality.16 The sums of the three holistic scores for each essay were used for the dependent variable, overall textual quality.
Fifteen independent variables were entered into the regression:
• topic abstraction (the composite score, ABC),
• textual elaboration (the composite score, IEC),
• text length (number of words only, TL1),
• content (sum of three scores, CT1-CT3),
• language use (sum of three scores, LU1-LU3),
• organization (sum of three scores, OG1-OG3),
• vocabulary (sum of three scores, VC1-VC3),
• mechanics (sum of three scores, MC1-MC3),
• statement of claims (sum of two scores, R11 and R12),
• statement of direction (sum of two scores, R21 and R22),
• data use (sum of two scores, R31 and R32),
• data type (sum of two scores, R41 and R42),
• explicit statement of warrants (sum of two scores, R51 and R52),
• consideration of causes (sum of two scores, R61 and R62), and
• examination of premises/issues (sum of two scores, R71 and R72).

IV.9. Structural Equation Modeling. The following discussion of the LISREL models used in this study includes descriptions of the two types of models specified and explanations of the purposes of each model.17 The parameters of particular interest in the study are identified, and their relation to the hypotheses tested in the study is discussed.

IV.9.1. Model Specification for the Study. The previous discussion of model specification in Chapter III was provided as a general explanation of the conventions used in SEM analyses to indicate variables and relationships among them. I would now like to present path diagrams of two models originally proposed in this particular study. The reader may find it useful to refer to the previous discussions of the variables in IV.5 and the general discussion of model specification in III.3.3.18 In the following discussion, I will describe prototypes of models used in this study. I will also describe the relationship of the models to the hypotheses that have guided the study.
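Before turning to the path diagrams, the measurement side of such a specification can be sketched in code: each latent variable is paired with the observed measures that load on it (variable names follow Table IV.2). This dictionary representation is purely illustrative; the study's models were specified in LISREL, not in code like this.

```python
# Illustrative sketch (not LISREL input): a measurement model as a mapping from
# latent variables to their observed measures. Only a subset of the study's
# variables is shown; names follow Table IV.2.

measurement_model = {
    "overall_quality": ["HR1", "HR2", "HR3"],
    "claims":          ["R11", "R12"],
    "direction":       ["R21", "R22"],
    "content":         ["CT1", "CT2", "CT3"],
    "text_length":     ["TL1", "TL2", "TL3"],
}

def each_measure_loads_once(model) -> bool:
    """True if every observed variable loads on one and only one latent variable."""
    all_measures = [m for measures in model.values() for m in measures]
    return len(all_measures) == len(set(all_measures))

print(each_measure_loads_once(measurement_model))
```

The check mirrors a constraint of the models described below: each observed variable is hypothesized to measure exactly one underlying textual dimension.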
Although there are a number of similarities between the two models, including the variables incorporated in the models and the types of parameters identified, I have chosen to describe both. The model represented in Figure IV.1 was specified so as to investigate the definition of overall textual quality in terms of a selected group of textual features and their relative salience in determining overall quality. The model depicted in Figure IV.2, on the other hand, was specified so as to investigate the relationships between more discrete measures, including rhetorical abstraction features, and more general measures. The two models are, thus, distinguished in terms of the questions they address. They also differ in terms of how they are specified. In specifying the second model, the measures of overall textual quality were not included.

Figure IV.1
A Proposed Structural Equation Model of Discourse Features

Seventeen latent variables are incorporated in the model represented in Figure IV.1:
• text length,
• textual elaboration,
• topic abstraction,
• rhetorical abstraction,
• claims,
• direction,
• data use,
• data type,
• warrants,
• causes,
• issues,
• organization,
• content,
• vocabulary,
• language use,
• mechanics and
• overall textual quality.

Nine of these variables are represented as aspects of overall textual quality by the inclusion of single-headed arrows. That is, these variables are hypothesized to be dimensions of overall textual quality.
The variables claims, direction, data use, data type, warrants, causes and issues are similarly represented as aspects of rhetorical abstraction. The variables represented by boxes and associated with the latent variables are measures of the latent variables. These observed variables were identified and described in Table IV.2. It is important to note that each of the measures loads on one and only one latent variable. Given that there are some apparent overlaps in the content of scales, and given the ambiguity of raters' interpretations of features referenced in scales, it may not be reasonable to view each observed variable as a measure solely of one underlying textual dimension. Modification indices will indicate whether, from a statistical point of view, the measures should be associated with more than one dimension. I begin with the hypothesis that they should not be, because associating measures with multiple dimensions could obscure interpretations of the dimensions.

All correlations and causal relations between latent predictor variables are fixed at 0.0. That is, the model indicates that none of the latent variables are correlated and that none of them predict any others. The interpretation of these constraints would be that the textual dimensions represented in the model are independent, a position that is highly untenable (Coulthard 1985, Kaplan 1988). This model was tested, nonetheless, as a baseline model against which other models could be compared. The other models allow all of the latent predictor variables to be free, allowing LISREL to estimate the correlations among them. In order to simplify the figure, none of the error terms in the model are shown to be correlated.
As suggested in the discussion of LISREL's benefits, this is not a reasonable assumption, since one might well expect raters to behave similarly across various ratings. Therefore, errors for all ratings associated with each rater are estimated in all models. This would be represented by slings connecting the appropriate errors associated with the scores assigned by each rater (e.g., R11-R21-R31-R41-R51-R61-R71, R12-R22-R32-R42-R52-R62-R72, OG1-CT1-VC1-LU1-MC1, etc.).

The model represented in Figure IV.2 below is the second type of model, which may prove useful in an investigation of the relationships between the more discrete textual features (text length, topic abstraction, textual elaboration and aspects of rhetorical abstraction) and the more inclusive textual dimensions (organization, content, vocabulary, language use and mechanics). This model hypothesizes that the more discrete textual features are aspects of the five dimensions which were measured with the ESL Composition Profile scales. Text length, for instance, is associated with the content, organization, vocabulary, language use and mechanics dimensions of students' texts.

Figure IV.2
Example of Structural Equation Model of Discourse Features

It should be pointed out that a SEM model cannot necessarily be shown to be the sole model which best explains observed covariances in the data.
A model can be shown to be consistent with the data, but there is theoretically more than one model which can adequately explain the data. For this reason, two or more models are usually tested and compared. The two models represented in Figures IV.1 and IV.2 are in a sense prototypes of several models which were tested and which will be discussed in more detail in Chapters V and VI.

It may be helpful at this point to describe the relationship between the models and the hypotheses expressed in I.4. In the following discussion I identify the parameters of interest in addressing the hypotheses and explain how the significance of the parameters was interpreted. The discussion that follows outlines in general terms the procedures that were followed in testing the hypotheses. Once model fit was assessed, specific parameter estimates were identified in order to address the hypotheses.

In order to address the hypothesis that rhetorical abstraction would be a significant dimension of overall textual quality, the following parameter estimates in models similar to that depicted in Figure IV.1 were identified:
• estimates for parameters (i.e., γ parameters) relating rhetorical abstraction features with the latent variable, overall textual quality,
• error estimates for the same γ parameters as indications of the accuracy of the γ parameter estimates and
• t-values as indications of the significance of the parameters relative to all other parameters in the model.

Two types of statistical relationships between rhetorical abstraction and other latent variables were explored: correlations and aspectual relationships. Correlations, investigated in the first type of model, could be interpreted in terms of co-occurrence of the associated features in texts, or they could indicate an overlap of similar features in the associated rating scales.
Aspectual relationships, investigated in the second type of model, were interpreted as indications that features of rhetorical abstraction were aspects of a particular dimension, or factors in defining the associated textual dimension. It was hypothesized, for example, that rhetorical abstraction features would not only be highly correlated with ratings of content, but that they would also be identified as significant aspects of content. Correlations would indicate that rhetorical abstraction features co-occurred with textual features associated with content. The investigation of predictor relationships provided a basis for a different, stronger statement involving the identification of elements of content. If, for example, significant γ parameter estimates relate features of rhetorical abstraction with content, rhetorical abstraction features could be interpreted as aspects of a broader textual dimension labeled content.

The parameter estimates to be identified in describing correlational relationships are the following, in models similar to that in Figure IV.1:
• estimates for correlations (i.e., φ parameters) between rhetorical abstraction variables and other latent variables,
• error estimates for the same φ parameters as indications of the accuracy of the estimates and
• t-values as indications of the significance of the parameters relative to all other parameters in the model.
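The t-values referred to in these lists are formed, as in standard LISREL output, by dividing a parameter estimate by its standard error; values beyond the conventional cutoff of |t| > 1.96 (two-tailed, α = .05) are treated as significant. A minimal sketch, with invented values:

```python
# How a LISREL-style t-value is formed and read. The estimate and standard
# error below are invented for illustration, not taken from the study.

def t_value(estimate: float, std_error: float) -> float:
    """A parameter's t-value is its estimate divided by its standard error."""
    return estimate / std_error

def is_significant(t: float, cutoff: float = 1.96) -> bool:
    """Conventional two-tailed significance check at alpha = .05."""
    return abs(t) > cutoff

gamma_est, gamma_se = 0.42, 0.11   # hypothetical gamma estimate and its SE
t = t_value(gamma_est, gamma_se)
print(round(t, 2), is_significant(t))
```

The same computation applies whether the parameter is a γ (predictor) or a φ (correlation) estimate.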
The parameters of interest in investigations of rhetorical abstraction features as aspects of more inclusive textual dimensions include:
• estimates for γ parameters between rhetorical abstraction variables and the variables representing broader textual dimensions (content, organization, vocabulary, language use and mechanics),
• error estimates for the same γ parameters as indications of the accuracy of the estimates and
• t-values as indications of the significance of the parameters relative to all other parameters in the model.

IV.10. Bootstrapping. As discussed in Chapter III, one disadvantage of distribution-free estimation procedures, such as weighted least squares (WLS) and diagonally weighted least squares, is that they require very large sample sizes. Although the sample of 391 essays in this study would have been sufficiently large for other estimation procedures, it was not for WLS. I, therefore, needed to find a method to compensate for this relatively small sample size. LISREL 8 (Jöreskog and Sörbom 1993) includes such a method, called bootstrapping. Bootstrapping essentially entails multiple random sampling from the original sample so as to expand the original sample. Statistical procedures, including the calculation of correlation/covariance matrices, can then be performed on the larger sample.19

Applications of bootstrapping procedures include specification of the number of samples to be generated, the number of cases to be drawn from the original sample to generate the additional samples and the random seed number for the selection of cases. For this study, a total of 900 samples were generated, based on random drawings of 75% of the original sample with a random seed of 11.

IV.11. Methodological Assumptions.
A number of assumptions regarding the structure of written discourse are implicit in this study. Perhaps the most basic assumption is that discourse can be adequately described in terms of components. This assumption is apparent in the path diagrams, in which seventeen latent variables are represented as different dimensions or components of discourse. The model represented in Figure IV.1 is based on the even stronger assumption that the components are independent, since the φ parameters are fixed at 0.0. While this model was tested, the assumption made in this study is that the components defined in the study are interrelated. The interrelation of components was incorporated into the models by freeing the φ parameters to be estimated. The assumption that discourse is componential in structure is also observable in the construction of scales and in discourse analysis, but is questionable.

There is an assumption that the components identified in the model are relatively significant textual dimensions. It is true that in this study the relative significance of components within the model is hypothesized and tested. The choice of components, however, was based on the assumption that the components were significant in determining overall textual quality. While the literature consistently identifies some features, most notably text length and syntax, as significant textual dimensions, there is little empirical evidence related to other textual features. This is due in part to the large number of features and the interdependence of the features in determining textuality.

Further assumptions were made about the componential nature of several of the textual dimensions identified in the model. It was assumed that the seven variables associated with rhetorical abstraction are components of a single textual dimension.
Similarly, it was assumed that the dimensions topic abstraction and textual elaboration can be measured by frequency counts of the various lexicogrammatical features associated with the two dimensions. These assumptions may actually be considered hypotheses of the research, since the LISREL analyses test the viability of the measurement models and provide indications of whether the observed variables of the two dimensions defined by Biber can reasonably be considered measures of a common underlying dimension. Similarly, LISREL will indicate whether the seven aspects of rhetorical abstraction can be considered aspects of a common underlying textual dimension.

The inclusion of dimensions identified by Biber is based on the assumption that the co-occurrence of features that defined textual dimensions in Biber's research would also define the same dimensions in the texts used for this study. It should be pointed out that Biber's research is based on analyses of L1 texts, both professional and non-professional. As noted earlier, the corpus used for this study included only L2 texts of undergraduate and graduate university students. It is conceivable that the features which defined textual dimensions in Biber's work would not define the same, if any, dimensions in the texts used for this study (Grabe and Biber 1987). The features may serve different functions in the L2 texts. Some features may also be used with different frequencies in the L2, since the students may not have acquired use of certain features (Ferris 1990; 1991).

On a more theoretical level, the assumption that variables are linearly related is made in the application of linear structural equation modeling. It is conceivable, however, that the relationships among variables might be better described by non-linear equations.
Because non-linear equation modeling is still being developed and its applications are uncertain, I will have to operate on the assumption that the relationships are linear in nature (Kunnan 1991).

IV.12. Methodological Limitations. There are a number of limitations to this study which should be identified. I will first identify and discuss those limitations relevant to the corpus used for the study. Limitations related to the ratings will then be identified and discussed. The last category of limitations to be discussed in this section is related to the use of LISREL.

There are many textual features which were not included in this study and which may be significant in determining the overall quality of texts. Ferris (1990) found, for example, that the proportion of subtopics to text length was statistically significant in her study of L2 essays. Connor (1991) found certain types of topic coherence to be significant in her study of L1 texts. There are, likewise, many other features which have not been investigated but may prove to be significant in defining textuality.

The corpus analyzed for the study limits interpretations of the results in several ways. The corpus was limited to L2 texts which were generated under test conditions. These conditions include the limitation of writing time to 35 minutes, the assignment of a prompt and the understanding on the part of participants that the essays would be evaluated. It has been shown that the L1-L2 distinction can be a significant determiner of evaluations, although this may be denied in interviews with faculty (Rusikoff 1994). Moreover, time allotments can affect evaluations (Hale 1991), as can topic (Keech 1984). Interpretations of results must then be constrained by these experimental restrictions.
The nature of the ratings and the raters who provided them also constrains the study. Only ESL instructors/testers were used for this study. It has been shown that evaluations can vary markedly across academic disciplines. Bridgeman and Carlson (1983) found that disciplines varied in the types of writing tasks they valued and in the relative significance accorded various textual features in evaluations of writing. Differences exist even within the composition field, with distinctions shown to exist between the way ESL and English Department composition instructors evaluate writing (Ferris 1990, Rusikoff 1994).

Another limitation related to the ratings is a result of the manner in which the rating categories of the scales are defined. As mentioned in the discussion of the rhetorical abstraction scales, the rating categories of those scales were designed so as to represent single dimensions (i.e., usually the extent of elaboration) of a textual feature. This is not the case for the other two scales used in the study. There are references to nearly 50 features for each scale level in the ESL Challenge Test scale. For each rating category in the ESL Composition Profile, there are similarly over 10 features referenced. The multidimensional nature of these two scales restricts the level of specificity to which the results can be interpreted. If language use, for example, is identified as a significant predictor of overall textual quality, it will not be possible to say whether the use of prepositions, or subject-verb agreement, or any other feature referenced in the category was in itself responsible for the significance. Likewise, it will be impossible to determine whether any of the features were ignored or if all the features were equally significant in predicting overall textual quality.
Conversely, the unidimensional nature of the rhetorical abstraction scales restricted the number of features that could be included in the associated set of evaluations. Because of the more discrete nature of these measures and the limited resources for the study, there are many other textual features which have been associated with rhetorical abstraction but could not be included in the study.

I have assumed that the features incorporated in this study can be categorized into two groups: more discrete features and broader dimensions. Based on this assumption, I have hypothesized that the more discrete features are facets of the broader dimensions. Intuitively, it makes sense to assume that grammatical features, for instance, make up, or compose, larger dimensions of discourse. It is, however, conceivable that this is not the case.

The last set of limitations that I would like to discuss is related to the statistical analyses performed for the study. It is necessary to point out that SEM does not confirm a model. Instead, it provides indices upon which judgments of fit are based. If a model is found to fit the data, it is considered to be one of a number of possible models that will fit the data. The focus of the research then becomes finding a model which fits the data better than other models. This determination is based on substantive considerations as well as statistical estimations. One must consider the theoretical soundness and parsimony of models in addition to the fit statistics of the whole model and individual parameters. One runs the risk of capitalizing on chance when modifying models to fit the data (Byrne 1989; MacCallum, Roznowski and Necowitz 1992). A reliance on post hoc analyses in the study in order to improve model fit limits the results to the extent to which chance played a part in finding acceptable fit.
There is a possibility that the results may not be replicated with other samples since the post hoc analyses may have taken advantage of possible peculiarities in the data. Replication of the study with different texts and raters would address this issue. The assumption that the relative salience of textual features/dimensions is defined by the amount of variance in the endogenous variable that is accounted for by the exogenous variables must also be considered. I analyzed the ESL Composition Profile ratings with FACETS and with LISREL and ended up with different results in terms of identifying the relative "significance" of textual features. This may not be surprising since FACETS identifies significant features in terms of levels of difficulty (i.e., the relative magnitude of scores). Other factors may also play a role in determining the relative salience or significance of features. For example, raters may bring their own agendas to the rating table for whatever reason, as noted by Hamp-Lyons (1991). The relative significance or relative salience of features as defined by personal agendas may not be consistent with the statistical significance of features.

Notes

1. Prompt type assignment was considered an irrelevant variable in the present study. It was found to be a statistically insignificant predictor of holistic scores (Weasenforth 1993). The potential effects of prompt type assignment on the other ratings and variations in frequency counts of various lexicogrammatical features across prompt type assignments were not investigated. These investigations are suggested for further research in prompt type as a test method facet and its potential for determining test taker responses.

2. The graphic representation of the score distributions for the L1 and L2 students (Figure IV.1), descriptive statistics (Table IV.
1) and a test of group distinction indicate that the L1 and L2 students represent two populations.

[Figure IV.1: Frequency Distribution of Holistic Scores for L1 and L2 Students]

Table IV.1
Descriptive Statistics for L1 and L2 Score Distributions
Language Group   Sample Size   Mean Score   Standard Deviation
L1 students      21            28.81        1.06
L2 students      391           21.45        3.13

The frequency distribution of raw holistic scores (Figure IV.1) indicates that there is little overlap of the scores for the L1 and L2 students. The mean score (Table IV.1) for the L1 students is higher than that of the L2 students by more than 7 points, indicating that the L1 students performed better in general than the L2 students. The Mann-Whitney U test of the two groups indicates that L1 students generally ranked higher than L2 students (as reflected in Figure IV.1) and that the difference in ranking was statistically significant (z = -9.06, p < .05).

3. Changes in spelling were based on Microsoft's spell checking function. In order to ensure that all necessary changes were made, the typist carefully read each of the 391 essays to identify misspellings that would not have been identified by the spell checker. ALI instructors were consulted about changes if the typist and researcher could not come to a decision about changes.

4. Perhaps the most significant observation that can be made from this tedious description of the scales is that a great, perhaps impossible, demand is placed on raters. Not only are raters supposed to consider all of these features while assigning scores to essays, but they must also interpret the meaning and determine the relative significance of the features.
This description clearly illustrates why scores assigned in holistic ratings remain ambiguous.

5. The reader should be warned against wholesale adoption of the scales. The use of the scales required fairly extensive rater training and a significant amount of time to apply. Several hours were spent at the beginning of each training session in discussion of the constructs incorporated in the scales, of scale levels and of their application to ratings of a small set of essays. Raters spent approximately eight minutes per essay in their individual rating sessions. In contrast, approximately 2.5 minutes per essay is needed to assign one holistic score in ratings for the English language placement test used at the American Language Institute at the University of Southern California. Research (Quellmalz et al. 1984, Spooner-Smith et al. 1980) indicates the importance of considering commitments of time and financial resources when deciding on the type of rating system to be used for a program. Similar concerns should be considered when designing research studies as well.

6. Original holistic scores varied by more than a 2 point range for a total of 15 essays, or 4% of all essays.

7. Discrepancies representing a 2 point range or greater were observed for 111 sets of the original ratings. This number is approximately 4% of the 2737 sets (7 sets of scores for 391 essays).

8. LISREL was unable to estimate fit statistics for the seven individual measurement models since they included only two observed variables and were thus non-identified. LISREL fit statistics for the full Rhetorical Abstraction model, as discussed in Chapter V, indicated that the model was misspecified and thus possibly theoretically indefensible. Results for the exploratory factor analyses seem to corroborate the LISREL results, indicating that the 7 aspects (i.e., statement of claims, statement of direction, etc.)
may not be facets of a common latent factor.

9. A range of 2 points or greater was observed in 36 sets of the original ESL Composition Profile ratings. Of the 1955 sets (5 sets of scores for each of the 391 essays), the discrepancies represented approximately 2%.

10. The following were counted as a single word:
• abbreviations (e.g., etc., TV)
• numbers (e.g., 356, 1990)
• time designations (e.g., 7:30 = 1 word, a.m. = 1 word, p.m. = 1 word)
• hyphenated words (e.g., well-defined)
The following were counted as separate words:
• acronyms (e.g., US = 2 words, AIDS = 4 words, LA = 2 words, USC = 3 words)
• hyphenated words that are not conventionally hyphenated (e.g., world-wide = 2 words, well-being = 2 words)
Outline designations (e.g., I., A., 1.) and title words were not included in the word counts.

11. Sentence fragments which lacked a subject or verb and non-clausal material, such as introductory prepositional phrases, were excluded from the counts.

12. The following formula was used to normalize raw frequency counts of lexicogrammatical features for text length:
Normalized score = (Raw score ÷ # words in text) × 500

13. PRELIS 2.03 and SPSS 6.0 for Windows were used to calculate univariate and bivariate descriptive statistics. SPSS 6.0 for Windows was also used to calculate Cronbach's α.

14. The standardized α "is the α value that would be obtained if all of the items were standardized to have a variance of 1" (Norusis 1990: 467). The variances of the three measures of text length vary markedly, accounting for the substantial difference between the α and the standardized α. There was relatively little difference between the αs and the standardized αs for all other measures; only the unstandardized α is thus reported for all other measures.

15.
Carroll's (1992) exploratory factor analysis programs (FAS, HFS and PMS) were used to complete the preliminary factor analyses.

16. The preliminary regression analyses were completed using SPSS 6.0 for Windows.

17. LISREL 8.0 for Windows was used to complete the structural equation modeling analyses of the data.

18. The reader should note that the observed variables in the models (see Figures IV.1 and IV.2) used in this study are not measures of features associated with textual dimensions, as was the case in the hypothetical models in IV.5. Instead, the observed variables for the rhetorical abstraction, organization, content, vocabulary, language use and mechanics latent variables are ratings of individual raters. Raters assessed groups of textual features to provide a score for the textual dimension.

19. See Mooney and Duval (1993) and Bollen and Stine (1993) for fuller and more technical explanations of bootstrapping procedures. Both works also provide discussions of the assumptions underlying bootstrapping and difficulties associated with bootstrapping.

CHAPTER V
Results

V.1. Overview. The purpose of this chapter is to present and interpret the results of the various statistical analyses discussed in Chapters III and IV. A brief discussion of the results from investigations of distributions of data immediately follows this introduction. In concluding this chapter, I will discuss each of several aspects of the SEM analyses.1 The results from explorations of several measurement models will be discussed first. The discussion will then turn to results gathered from testing two types of SEM models.
Results from the first type of model focus on the relative salience of various textual features, particularly rhetorical abstraction, in determining overall textual quality. These results, as well as results from testing the second type of model, will address the issue of the relation of rhetorical abstraction features to other textual dimensions. Results from testing the second type of model focus more specifically on relationships between rhetorical abstraction measures and more inclusive measures, including content, organization, vocabulary, language use and mechanics.

V.2. Analyses of Distributional Statistics. The distributions of all variables used in this study were examined with PRELIS 2.03. Although the weighted least squares estimation method is not based on the assumption of a normal distribution of data, an investigation of the distributions is still important for two reasons. The calculation of some correlation matrices used in SEM analyses is based on the assumption that the data are normally distributed. Marked departures from normality could significantly alter the correlations and thereby affect substantive interpretations of results. An investigation of distributions is important also because, while SEM analyses are robust to departures from normality, they are not stable when data have a U-shaped distribution (Bollen 1989, Boomsma 1983). Foregoing investigations of data distributions could thus lead to results that are substantively interpretable but which inaccurately portray actual relationships in the data. If such a distribution were identified, one would want either to transform the data or to delete the data from further analyses.2 Univariate distributional information for all variables is reported in Appendix C.
This information indicates that many of the variables exhibit high levels (i.e., > 2.0) of kurtosis and skewness. Tests of bivariate normal distributions also indicate that many bivariate distributions are not normal. Tests of multivariate normality for continuous variables indicate that the continuous data are not normally distributed.3 Since a polychoric/polyserial matrix was calculated for the analyses, the non-normal distributions of most data were not of great concern. However, the investigations of the data identified two variables, R41 and R42, as having nonlinear distributions. Because the distributions of R41 and R42 (measures of data type) were parabolic, they were not included in the SEM analyses.4

V.3. Structural Equation Modeling Analyses. The first concern in analyzing the LISREL results was that the theoretical model fit the data. As suggested by Jöreskog (1993), the measurement models of the full LISREL models were checked first for fit. These investigations were completed for each measurement model, then for all possible combinations of measurement models. The results of these investigations raised questions about the viability of several of the measurement models. Because the investigations raised questions about the definition of three of the latent variables, this information will be discussed first. Although the fit of component measurement models is usually not included in a discussion of SEM results, it will be here for two reasons. Firstly, the association of particular grammatical and lexical features with particular textual characteristics has received a fair amount of attention in recent computational discourse literature (see Biber 1988; 1992, Frase et al. forthcoming in particular).
Results from the analyses of measurement models in this study are relevant to the validity of assuming that such associations exist and, therefore, hold some potential interest for discourse analysts, as discussed in the concluding chapter. More specifically, the results of these analyses call into question Biber's (1988) factor analysis approach to defining textual dimensions. They also indicate that the definition of rhetorical abstraction as heretofore presented may not be valid. The results, therefore, call for significant revisions to the original full models represented in the previous chapter. The second reason for discussing model fit of several measurement models is that in order to complete this study, it was necessary to delete from the original model two measurement models and to modify another. This discussion thus serves as explicit documentation of the methodological procedures followed in the research and as an explanation for the deletion and modification of some components of the original model presented in Chapter IV. Following the discussion of the measurement models which were not supported by SEM analyses, results from analyses of the full models are presented and interpreted. Included in this part of the chapter will be discussions of model fit and parameter estimates of particular interest in the study for two conceptually distinct full models.

V.3.1. Measurement Models. The results of preliminary investigations of three measurement models raised questions about the theoretical assumptions underlying the models. The models in question are the textual elaboration, topic abstraction and rhetorical abstraction models. The discussions of these three models will focus on model fit statistics, the association of observed and latent variables and error estimates.
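The fit criteria applied throughout the following discussion (RMSEA below .05, a χ²-to-degrees-of-freedom ratio below 2.0, and GFI, AGFI and NFI near 1.0) can be collected into a simple screening helper. This is an illustrative sketch only, not part of the original analyses; the .90 floor used for GFI, AGFI and NFI is a common convention assumed here, not a value taken from the study:

```python
def screen_fit(rmsea, chisq, df, gfi, agfi, nfi):
    """Check a model's fit indices against the rule-of-thumb cutoffs
    used in this chapter. Returns a dict of pass/fail flags; the
    cutoffs are conventions, not formal statistical tests."""
    return {
        "rmsea_ok": rmsea < 0.05,
        "chisq_df_ok": (chisq / df) < 2.0,
        "gfi_ok": gfi >= 0.90,    # assumed floor for "close to 1.0"
        "agfi_ok": agfi >= 0.90,  # assumed floor
        "nfi_ok": nfi >= 0.90,    # assumed floor
    }

# Values reported for the textual elaboration measurement model ...
elaboration = screen_fit(rmsea=0.00, chisq=0.84, df=2,
                         gfi=1.00, agfi=0.99, nfi=0.97)
# ... and for the topic abstraction measurement model
abstraction = screen_fit(rmsea=0.11, chisq=53.49, df=9,
                         gfi=0.98, agfi=0.95, nfi=0.96)
```

Run on the two sets of reported values, the helper flags the elaboration model as passing every criterion while the abstraction model fails on RMSEA and the χ²/df ratio, mirroring the verbal assessments that follow.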
Results for the textual elaboration measurement model are summarized in Tables V.1 and V.2 and Figure V.1 below.5 As a reminder, on-line textual elaboration, according to Biber's (1988) interpretations, is represented by four grammatical features: that clauses as verb complements, demonstratives, that relative clauses on object positions and that clauses used as adjectival complements. Measures of these four features are assumed to be indications of spontaneous elaboration of texts, an elaboration that may be typically observed in face-to-face conversations. As indicated in Table V.1, all fit statistics indicate that the model is statistically sound. The RMSEA is well below .05, indicating that the measures considered together do an adequate job of measuring the latent variable. The other four indices indicate that the model does rather well in explaining actual relationships observed in the data. The ratio of χ² to the degrees of freedom is well below 2.0. The GFI is 1.00, and both the AGFI and NFI are close to 1.0. All of these indices of overall model fit indicate that the measurement model reflects observed relationships in the data very well.

Table V.1
Fit Indices for Textual Elaboration Measurement Model
Index       Fit Estimate
RMSEA       0.00
χ² (df) p   0.84 (2) 0.66
GFI         1.00
AGFI        0.99
NFI         0.97

In spite of the fit indices, parameter estimates reveal problems in the model. Estimates for the associations of observed variables and the latent variable and the associated error estimates are reported in the graphic representation of the model in Figure V.1 below. Three of the parameter estimates (.12, .09 and -.01) are low, indicating that the measures may not be measuring an underlying textual dimension common to all four measures. The respective error estimates (.28, .20 and .04) associated with these measures are larger than the associated measure, providing a statistical basis for interpreting the measure estimates as being equal to 0.0. These results indicate that what is being measured is not necessarily a textual dimension called textual elaboration, but rather simply one of the four grammatical/lexical features identified by Biber (1988).6

[Figure V.1: Textual Elaboration Measurement Model]

Other parameter estimates, reported in Table V.2 below, reinforce concerns about the model. The squared multiple correlations indicate that three of the four observed variables are, in fact, not measures of a single textual dimension, be it textual elaboration or any other dimension. The R² for IE2 is high because LISREL automatically assigned it as the reference variable for scaling purposes, but it is too high (> 1.0) to be interpretable. Relatively large error estimates are associated with all four measures, indicating that none of the observed variables is an adequate measure of the latent variable. Furthermore, none of the t-values for the observed variables is greater than 2.0. The low t-values indicate that the parameters (i.e., the associations of the observed variables with the latent variable) are statistically insignificant. On the last two lines in Table V.2 are reported the error estimates of the measurement errors and the associated t-values. The small size of three of the four θδ estimates relative to the error estimates reported on the second line corroborates previous interpretations.
The t-values indicate that the error estimates are statistically more significant parameters than the associations of observed variables with the latent variable. This would have been a curious result had the observed variables been adequate measures of one underlying textual dimension.

Table V.2
Parameter Estimates for Textual Elaboration Measurement Model
                                 IE1     IE2     IE3     IE4
R²                               [values illegible in this copy]
Standard errors                  1.66    0.25    0.13    10.93
T-values                         0.44    0.44    0.43    -0.27
Error estimate for error (θδ)    0.10    14.17   0.08    0.07
T-values for θδ                  9.87    -0.15   12.37   13.96

Similar results were found for the topic abstraction measurement model. These results are reported in Table V.3 and Figure V.2 below.7

Table V.3
Fit Indices for Topic Abstraction Measurement Model
Index       Fit Estimate
RMSEA       0.11
χ² (df) p   53.49 (9) 0.00
GFI         0.98
AGFI        0.95
NFI         0.96

Not all of the fit indices were adequate for this model. Although the GFI, AGFI and NFI indicate adequate fit, the RMSEA is higher than the expected .05, and the χ² to degrees of freedom ratio is greater than 2.0. The high RMSEA indicates that the six observed variables are not effective measures of a common latent variable. Problems similar to those in the analysis of the previous model are obvious in the graphic representation of the topic abstraction model in Figure V.2 below. Four (AB1, AB2, AB3 and AB4) of the six variables appear to be only slightly associated with topic abstraction. The variables AB5 and AB6 have stronger relationships with the latent variable. However, since the model would not converge, no error estimates or t-values are provided, and it is thus impossible to determine whether the associations are significant. This provisional solution does indicate, however, that the observed variables may not be related to topic abstraction as defined by Biber.
[Figure V.2: Topic Abstraction Measurement Model]

Because of the problems identified in the investigations of both of these measurement models, the two latent variables were eliminated from investigations of the full models. Even if the latent variables were found to be significant aspects of overall textual quality, substantive interpretations of the results would be complicated since the variables are not well-defined.8 As pointed out in Chapter IV, one could argue that the composite score of all frequency counts for each textual dimension may represent Biber's definition of the latent variables more accurately. Both of these models are graphically represented below in Figure V.3. Both of these models raise theoretical concerns and present problems for SEM analyses. Theoretical concerns are related to the interpretation of the models in terms of the definitions of the textual dimensions that are supposedly measured. The models hypothesize that the observed variables, IEC and ABC, are "pure"

[Figure V.3: Alternate Models of Textual Elaboration and Topic Abstraction]

measures of the associated textual dimension. The validity of assuming that a single measure, particularly one limited to a handful of syntactic and lexical features, can be considered an adequate measure of a textual dimension is questionable (Connor 1991, Kaplan 1988, van Dijk 1990). Analyzing the models with SEM is also problematic in two ways. It is not possible to analyze the individual measurement models since they are not identified in the SEM sense of the term.
Inclusion of the measurement models in full LISREL models also caused matrices in the analyses to be non-positive definite. Furthermore, no satisfactory solution was found when either or both of these measurement models were included in full models.9 These results served as additional indications that the measurement models did not adequately explain the data. They were thus excluded from the analyses discussed below. The third individual measurement model that should be discussed here is the rhetorical abstraction model. Two measurement models were tested before an adequate solution was identified. Each of these models is discussed and fit statistics are reported for each. Since neither of the models fit the data, LISREL provided only preliminary solutions useful only in respecifying the model to obtain adequate fit. These solutions are not reported. The originally proposed model of rhetorical abstraction is represented in Figure V.4 below. The reader should note that for the sake of clarity of the figure, two types of relationships are not represented in the figure. The measurement errors associated with the same rater (R11, R21, R31, etc. and R12, R22, R32, etc.) were correlated. Each first order latent variable was also correlated with each other first order variable.

[Figure V.4: Original Rhetorical Abstraction Measurement Model]

Indices of overall fit for the model are reported in Table V.4 below. All indices indicate that the original model fits the data poorly.
Table V.4
Fit Indices for Original Rhetorical Abstraction Measurement Model
Index       Fit Estimate
RMSEA       2.37
χ² (df) p   4388 (2) 0.00
GFI         0.33
AGFI        -24.98
NFI         0.37

Only a preliminary solution was provided since the model would not converge. Results included a negative multiple correlation for structural equations and large measurement errors relative to parameter estimates. These results were further indications that the model represented in Figure V.4 does not adequately reflect relationships observed in the data. Since the previous model was inadequate, the model was modified. The six dimensions represented as first order factors in Figure V.4 were used in the full models without the latent variable rhetorical abstraction. The six dimensions are represented as first order variables as in Figure V.5.10

V.3.2. Full Models. Two types of specifications for full models were tested. The first type was employed to investigate the relative significance of various textual dimensions in defining overall textual quality. The second type of specification was used to investigate the relationships between measures of more discrete textual dimensions and more global dimensions. The full model represented in Figure V.5 below was used to investigate the relative salience of 12 textual dimensions as aspects of overall textual quality. As noted earlier, the topic abstraction and textual elaboration dimensions have been eliminated from the model, and rhetorical abstraction features are represented as independent of any underlying textual dimension other than overall textual quality. The reader should note again that scores assigned by each rater are correlated, and all textual dimensions are intercorrelated. These correlations are not depicted in the figure so that the figure will be readable.
Three types of statistical estimates are provided in the figure: estimates for the associations between observed and latent variables, estimates for measurement errors and estimates for the relationships between the twelve textual dimensions and overall textual quality. Estimates of correlations between measurement errors (see Tables V.5 and V.6) and latent variables (see Table V.7) are presented in tabular form below.11 Estimates for the associations between observed and latent variables are all above .71, and the associated errors are all equal to or less than .10. These two pieces of information indicate that the observed variables are effective measures of the associated textual dimension. Estimates of the relationships between textual dimensions and overall textual quality are of most interest, however, since they directly address the hypotheses of the study. These estimates will be discussed below after a report of other parameter estimates and model fit.

[Figure V.5: Full Model 1]

Correlations of errors are reported in Tables V.5 and V.6 below. Correlations between errors are all below .35, most below .10, indicating that raters were able to maintain a fairly clear distinction between rating categories.

[Table V.5: Correlations of Errors for Full Model 1: Rhetorical Abstraction Features (matrix values not legibly recoverable from this copy)]

[Table V.6: Correlations of Errors for Full Model 1: ESL Composition Profile Features (matrix values not legibly recoverable from this copy)]

Correlations between all textual dimensions are provided in Table V.7 below. Two general observations should be made with regard to this set of correlations. Correlations for text length (TL) and ESL Composition Profile ratings range in value from .34 to .93, the highest correlation being between language use (LU) and vocabulary (VC). This correlation appears to be partially due to overlapping material in the scales, but may also be due to difficulties in discriminating between the two types of features (Huckin 1983). The next highest correlation for this set of variables is .83, between organization (OG) and content (CT). This correlation does not appear to be due to overlapping information in the scales; other researchers have reported connections between the two textual dimensions (Hamp-Lyons 1991a). These correlations attest to the difficulty of isolating what appear to be interrelated textual dimensions. They also underscore the importance of using a methodology which can accommodate the interrelationship of dimensions rather than analyzing them as independent textual characteristics.
The correlations between features of rhetorical abstraction and the other six features are relatively low, the highest being .25 between statement of direction and language use. One may interpret the generally small correlation coefficients as an indication that the elaboration of rhetorical abstraction features within a text is only slightly related to the adequacy of content, organization, vocabulary, language use and mechanics. The correlations between various rhetorical abstraction features are also relatively low. However, while other correlations are less than .30, statement of claims and statement of direction are correlated at .69. These correlation coefficients corroborate results from analyses of the rhetorical abstraction measurement model in that they indicate that in general the features are only slightly related to each other. With the exception of statement of claims and statement of direction, the features appear to be independent of each other, which would be inconsistent with a strong theoretical view of the interdependence of textual features as well as with the proposed model of rhetorical abstraction.

Table V.7
Correlations for Exogenous Latent Variables for Full Model 1
     TL    CT    OG    VC    LU    MC    CL    DI    DU    WA    CA    IS
TL   1.00
CT   .63   1.00
OG   .43   .83   1.00
VC   .45   .71   .62   1.00
LU   .42   .66   .57   .93   1.00
MC   .34   .49   .47   .60   .65   1.00
CL   .05   .07   .08   .05   .15   .07   1.00
DI   .09   .13   .09   .08   .25   .12   .69   1.00
DU   .09   .10   .09   .09   .18   .06   -.04  .18   1.00
WA   .09   .04   .07   .08   .12   .04   .20   .22   .16   1.00
CA   .12   .07   .10   .11   .18   .05   -.01  .09   .24   .03   1.00
IS   .18   .05   .09   .12   .11   .06   .20   .12   .29   .13   .27   1.00

Fit indices are reported in Table V.8 below. The indices indicate that the model fits the data well. The RMSEA is less than .05, indicating that the sample is representative of the target population. The χ²/df ratio is less than 2.0, indicating that model fit is adequate.
The GFI, AGFI and NFI indicate perfect fit.

Table V.8 Fit Indices for Full Model 1

RMSEA        .022
χ² (df) p    199.57 (168)  .048
GFI          1.00
AGFI         1.00
NFI          1.00

Table V.9 presents parameter estimates for the exogenous variables. For each variable, the γ estimate (the strength of the relationship between the twelve textual dimensions and overall textual quality), an estimate of the error associated with the γ estimate, and the associated t-value are reported. One should first compare the γ estimate with its standard error to test whether the parameter estimate differs significantly from 0.0. If the standard error is equal to or greater in value than the γ estimate, the γ estimate may be considered equal to 0.0.

Table V.9 Exogenous Variable Estimates for Full Model 1

Variable        γ Estimate   Standard Error   t-value
Text Length        .40            0.05          3.35
Claims            -.18            0.09         -1.98
Direction          .44            0.09          4.96
Data use           .12            0.06          1.91
Warrants          -.11            0.05         -2.04
Causes             .04            0.06          0.67
Issues             .24            0.06          4.14
Organization       .06            0.06          1.21
Content           -.15            0.07          0.96
Vocabulary         .48            0.08          3.29
Language Use       .08            0.10          1.84
Mechanics          .08            0.04          1.19

This is the case for the following variables: causes, organization and language use. All other estimates may be considered significantly different from 0.0. T-values should also be reviewed to identify parameters which are statistically significant. Joreskog and Sorbom (1993) suggest that t-values smaller than |1.96| indicate that the associated parameter estimate may be considered equal to 0.0. This is the case for all but six of the parameters in this analysis. The significant parameters are text length, claims, direction, warrants, issues and vocabulary. One may then compare the sizes of the significant estimates to determine their relative importance in determining overall textual quality.
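The two decision rules just described (comparing the estimate with its standard error, and comparing |t| with the 1.96 critical value) can be sketched as follows. The function names are ours; the example values are taken from Table V.9:

```python
def zero_by_se(estimate, std_error):
    """Rule 1: treat a gamma estimate as 0.0 if its standard error
    equals or exceeds the estimate's magnitude."""
    return std_error >= abs(estimate)

def zero_by_t(t_value, critical=1.96):
    """Rule 2 (Joreskog and Sorbom 1993): treat the estimate as 0.0
    if |t| does not reach the critical value."""
    return abs(t_value) < critical

# Causes (estimate .04, SE .06, t 0.67) is flagged as zero by both rules:
print(zero_by_se(0.04, 0.06), zero_by_t(0.67))  # True True
# Direction (estimate .44, SE .09, t 4.96) survives both:
print(zero_by_se(0.44, 0.09), zero_by_t(4.96))  # False False
```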
The significant textual dimensions are displayed below in order of their significance, from highest (1) to lowest (6). The associated γ parameter estimates are included in the list.

1. Vocabulary (.48)
2. Statement of direction (.44)
3. Text length (.40)
4. Examination of premises/issues (.24)
5. Explicit statement of warrants (-.11)
6. Statement of claims (-.18)

The most salient textual dimension for the raters who assigned holistic evaluations of overall textual quality was apparently vocabulary. This dimension, as defined in the ESL Composition Profile rating scale, included the features of word form, word choice and appropriate usage of idioms. The reader should note that language use may also be considered one of the most salient dimensions: vocabulary and language use are highly correlated (.93), so by implicating one of the dimensions, the other must also be implicated. The next most salient dimension is statement of direction, one of the rhetorical abstraction features. Text length was also influential. Examination of premises/issues, another feature originally associated with rhetorical abstraction, was also significant in determining overall textual quality. Finally, estimates for two features originally associated with rhetorical abstraction were negative. Explicit statement of warrants and statement of claims were both significant features of overall textual quality, but in an inverse relationship with overall quality. That is, the extensive use of explicitly stated warrants and extensive elaboration of claims are apparently indicative of relatively poorer writing.

The second type of model was specified to investigate the relationships between more discrete measures and more global measures.
The more discrete measures, which were identified as exogenous variables in the model, included:
• text length,
• statement of claims,
• statement of direction,
• data use,
• explicit statement of warrants,
• consideration of causes, and
• examination of premises/issues.

The more global measures, which were identified as endogenous variables, included:
• content,
• organization,
• vocabulary,
• language use, and
• mechanics.

While the correlational parameter estimates reported in Table V.7 offer some information about the relationships between variables, the relationships defined in the model depicted in Figure V.6 are of a somewhat different nature. The parameter estimates discussed earlier can be interpreted in terms of the co-occurrence of textual features. The relationships between textual features in Figure V.6 can be interpreted in terms of the salience of the exogenous variables (length, claims, direction, data use, warrants, causes and issues) as aspects of the endogenous variables (content, organization, vocabulary, language use and mechanics). The purpose in testing these models was to investigate whether the more discrete measures were considered aspects of the more inclusive measures. Again, there was a focus on the measures of rhetorical abstraction and their relationship to the global measures. It was hypothesized, for example, that measures of rhetorical abstraction (particularly data use, consideration of causes and examination of premises/issues) could be considered aspects of content. As with the previous model, two types of correlations were included in the analysis but are not represented in the graphic representation of the model in Figure V.6 below.
Only three types of parameter estimates are reported in the figure: estimates for the associations between observed and latent variables, measurement errors, and estimates for the relationships between exogenous and endogenous variables. This limitation was necessary to maintain the legibility of the figure. Correlations between selected errors and between latent variables are provided in tabular format below.

[Figure V.6 Full Model 2: path diagram relating the exogenous variables to the endogenous variables; not legible in this reproduction]

The fit statistics indicate that this second model also fits the data. The RMSEA index is smaller than .05, the χ²/df ratio is smaller than 2.0, and the other three indices approach 1.0.

Table V.10 Fit Indices for Full Model 2

Index        Fit Estimate
RMSEA        .025
χ² (df) p    174.48 (168)  .052
GFI          1.00
AGFI         .99
NFI          .98

Estimates for the γ parameters represented in the model are reported in Table V.11 below. The exogenous variables, the more discrete textual dimensions, are listed along the left margin of the first column. Under the names of these variables are subsumed the broader textual dimensions which were regressed on the more discrete variable indicated above them. The associated standard errors and t-values for each parameter estimate are reported in the last two columns. Standard errors for four γ parameters (the regressions of content and organization on both causes and issues) are greater than the parameter estimates.
These parameters may be considered equal to 0.0; that is, the rhetorical features consideration of causes and examination of premises/issues do not appear to be aspects of the broader textual dimensions content and organization. The t-values associated with the γ parameter estimates for the data use-content and data use-organization parameters fall between -1.96 and 1.96. These parameter estimates may thus also be considered equal to 0.0. These results suggest that the relative amount of data used as support in an essay was not considered by raters to be an aspect of either content or organization.

Table V.11 Exogenous Variable Estimates for Full Model 2

Exogenous Variable-
  Endogenous Variable   γ Estimate   Standard Error   t-value
Text Length-
  Content                  0.60           0.06         10.27
  Organization             0.37           0.06          6.26
  Vocabulary               0.46           0.06          8.27
  Language Use             0.42           0.06          7.61
  Mechanics                0.28           0.06          5.12
Claims-
  Content                 -0.14           0.05         -3.00
  Organization            -0.14           0.05         -2.67
Direction-
  Content                  0.21           0.05          4.14
  Organization             0.15           0.06          2.62
Data use-
  Content                 -0.05           0.04         -1.19
  Organization            -0.07           0.05         -1.29
Warrants-
  Content                 -0.14           0.03         -4.11
  Organization            -0.17           0.04         -4.18
Causes-
  Content                 -0.02           0.04         -0.52
  Organization             0.03           0.05          0.65
Issues-
  Content                 -0.03           0.04         -0.66
  Organization            -0.02           0.05         -0.53

Eleven estimates are identified as significant. These include all five estimates involving text length, indicating that text length can be considered an aspect of all five broader textual dimensions: content, organization, vocabulary, language use and mechanics. Text length appears to be most closely related to content. The relationships between text length and three of the broader dimensions (organization, vocabulary and language use) appear to be of roughly the same strength. There is a looser relationship between text length and mechanics.
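The ordering claimed above for text length's relationships can be read directly off the Table V.11 estimates. A minimal sketch (the dictionary is simply a transcription of the table's text length rows):

```python
# Gamma estimates for text length from Table V.11.
text_length_gamma = {
    "content": 0.60,
    "organization": 0.37,
    "vocabulary": 0.46,
    "language use": 0.42,
    "mechanics": 0.28,
}

# Rank the endogenous dimensions by the strength of their relationship
# with text length, strongest first.
ranked = sorted(text_length_gamma, key=text_length_gamma.get, reverse=True)
print(ranked[0])   # content (closest relationship)
print(ranked[-1])  # mechanics (loosest relationship)
```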
Similarly, statement of claims, statement of direction and explicit statement of warrants appear to be aspects of content and organization. Statement of claims appears to be related to both content and organization to an equal degree. Statement of direction appears to be more closely related to content than to organization. The relationships between explicit statement of warrants and the two broader dimensions appear to be of fairly equal strength. It appears that it is the absence (given the negative polarity of the estimates) of statement of claims and explicit statement of warrants that partially constitutes the substantive and organizational aspects of the texts.

Modification indices for both Models 1 and 2 indicated that model fit could be improved if the measure CT2 were considered a measure not only of content but also of organization. These models were tested, but only modest improvements in model fit were attained. Parameter estimates varied slightly, but the substantive interpretations remained the same. The models represented in Figures V.5 and V.6 were, therefore, not only adequate in explaining the data, but also more parsimonious and more easily interpretable than other models suggested by the LISREL modification indices.

Notes

1. Two preliminary analyses (exploratory factor analyses and stepwise regression) were performed on the data. The results of these analyses are summarized below. The exploratory factor analyses revealed that: 1) measures for textual elaboration (IE1-IE4) did not load on a common factor, 2) measures for topic abstraction (AB1-AB6) did not load on a common factor and 3) measures for rhetorical abstraction (R11-R72) did not load on a common factor. These results indicate that the measures for each assumed underlying construct did not, in fact, measure any common underlying construct.
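The common-factor check described in this note can be approximated with a one-factor extraction from a correlation matrix. The sketch below is a rough principal-component loading heuristic, not the study's actual EFA procedure, and the .4 loading threshold is a common rule of thumb assumed here for illustration:

```python
import numpy as np

def loads_on_common_factor(R, threshold=0.4):
    """Extract the leading component of a correlation matrix R and ask
    whether every measure loads on it at or above the threshold."""
    eigvals, eigvecs = np.linalg.eigh(R)   # eigenvalues in ascending order
    v = eigvecs[:, -1]                     # leading eigenvector
    loadings = v * np.sqrt(eigvals[-1])    # component loadings
    return bool(np.all(np.abs(loadings) >= threshold))

# Four measures that all intercorrelate at .6 share a common factor:
R_common = np.full((4, 4), 0.6)
np.fill_diagonal(R_common, 1.0)
print(loads_on_common_factor(R_common))   # True

# Two pairs of measures that correlate only within each pair (the pattern
# described in this note) do not load on a single common factor:
R_split = np.array([
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
])
print(loads_on_common_factor(R_split))    # False
```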
The following results were also of concern. Language use (LU1-LU3), vocabulary (VC1-VC3) and overall textual quality (HR1-HR3) variables consistently loaded on a common factor. Organization (OG1-OG3) and content (CT1-CT3) variables also consistently loaded on a common factor. These results are reflected in the high correlations between language use and vocabulary and between organization and content that LISREL produced. The fact that HR1-3 loaded on the same factor as LU1-3 and VC1-3 is reflected in the relatively large γ estimates in the LISREL results.

The results of the preliminary regression analysis are summarized in Table NV.1 below. Results for only the six variables identified as significant predictors are reported, although all thirteen latent variables were included in the analyses.

Table NV.1 Stepwise Regression Results

Variable                           β       F-value      p
Language Use                     .3700     508.64    0.0000
Text Length                      .2118     368.01    0.0000
Vocabulary                       .2953     294.74    0.0000
Examination of premises/issues   .1208     235.38    0.0000
Content                          .1030     195.14    0.0000
Topic Abstraction                .0699     166.01    0.0000

Language use is identified as the most significant predictor of overall textual quality, followed by vocabulary and text length. Content was also identified as a significant predictor. Only one feature of rhetorical abstraction, examination of premises/issues, was identified as significant. As in Connor's (1991) study, Biber's dimension topic abstraction appears to be significant.

2. Boomsma (1983) identifies and discusses the difficulties encountered when data are transformed. Transformations of data may alter statistical results and the substantive interpretations thereof. Even if the statistical results are not altered, substantive interpretations may have to be altered to reflect the changes in the data.

3.
In addition to the univariate tests of normality, PRELIS 2 executes a test of the joint hypothesis of no multivariate skewness or excess kurtosis (Bollen 1989, Joreskog and Sorbom 1993). Results from the latter test are reported below.

Skewness            Kurtosis            Skewness and Kurtosis
Z-score      p      Z-score      p      Chi-square      p
148.497      0.0    52.183       0.0    24774.287       0.0

The results indicate that the multivariate distribution of the data is not normal.

4. Data type (measured by R41 and R42) was not identified as a significant predictor of overall textual quality in the preliminary regression analyses described above, suggesting that the SEM analyses may not be significantly changed in terms of identifying significant aspects of overall textual quality.

5. LISREL provides three types of solutions, that is, full sets of parameter estimates: a fitted solution, a standardized solution and a completely standardized solution. For the fitted solution, LISREL standardizes neither the latent nor the observed variables. For the standardized solution, the program standardizes the latent variables to a common scale. For the completely standardized solution, both latent and observed variables are standardized to a common scale (Joreskog and Sorbom 1988). The completely standardized solutions are reported in this study since they are consistent with the goals of the study.

6. The one variable which is actually being measured depends on model specification. IE2 was identified as the feature being measured simply because it was chosen as the reference variable for the analysis. Any of the four observed variables could have been chosen as the reference variable, that is, the variable which most accurately represents the latent variable.

7. This model would not converge with the WLS estimator after 400 iterations. Provisional estimates were provided.
Analyzing the model with a matrix of product-moment correlations and the maximum likelihood estimator (an appropriate approach with continuous variables) yielded results with very similar substantive interpretations. This underscores the inadequacy of the model.

8. Textual elaboration was not found to be a significant variable in the regression analysis. Although topic abstraction was identified as a significant predictor in the regression analysis, it is not clear what the cluster of lexicogrammatical features represents.

9. For these analyses, the admissibility check was turned off and the maximum number of iterations was set at 300. Every analysis provided only provisional solutions.

10. Correlations of the rhetorical abstraction variables, reported in Table NV.2 below, indicate that statement of claims and statement of direction are relatively closely related to each other. Since both variables are related to the concept of claims, this may not be surprising. Perhaps more interestingly, the correlations also indicate that pairs of measures for the same feature (i.e., R11 and R12, R21 and R22, etc.) are measuring the same textual dimension. In fact, the correlations between pairs of measures are higher than those between features (e.g., R11 and R21, R12 and R22, etc.), leading to the conclusion that the rhetorical abstraction features are not related to a common textual dimension. Correlations between other features are relatively low compared to correlations between ratings of the same feature. These correlations suggest that the specification of separate latent variables for the six features would be the most satisfactory from a statistical point of view.
Table NV.2 Correlation Matrix for Rhetorical Abstraction Variables

       R11    R12    R21    R22    R31    R32    R51    R52    R61    R62    R71    R72
R11   1.00
R12   0.95   1.00
R21   0.49   0.63   1.00
R22   0.64   0.75   0.95   1.00
R31  -0.14   0.04   0.18   0.16   1.00
R32  -0.13   0.06   0.21   0.16   0.97   1.00
R51   0.14   0.21   0.31   0.22   0.18   0.16   1.00
R52   0.13   0.22   0.14   0.11   0.17   0.15   0.77   1.00
R61  -0.06   0.01   0.01   0.10   0.24   0.24  -0.05   0.10   1.00
R62  -0.02   0.03   0.04   0.11   0.19   0.21  -0.04   0.08   0.99   1.00
R71   0.13   0.16   0.04   0.12   0.30   0.33   0.08   0.18   0.31   0.32   1.00
R72   0.21   0.24   0.11   0.17   0.23   0.24   0.07   0.15   0.26   0.22   0.84   1.00

Results from this analysis call into question the validity of associating these six features (statement of claims, statement of direction, data use, explicit statement of warrants, consideration of causes and examination of premises/issues) with a common textual dimension called rhetorical abstraction. Additional research is required to investigate the nature and definition of such a dimension; this research lies beyond the current study. Nevertheless, I will continue to refer to the six features as aspects of rhetorical abstraction. A fuller discussion of the validity of associating these features with rhetorical abstraction is provided in the concluding chapter.

11. All results reported in this chapter are from completely standardized solutions.

CHAPTER VI
Discussion and Conclusions

VI.1. Overview. In this chapter I will discuss the results summarized in the previous chapter as they relate to language assessment, language teaching and discourse analysis. There are actually a number of discussions incorporated in this chapter. First, the results that are useful in addressing the hypotheses are summarized and discussed. The SEM analyses yielded many results that deserve mention; however, I will limit elaborated discussion to those results which address the hypotheses articulated in Chapter I.
These discussions will include interpretations of the results and explanations for the findings. Explanations will include references to limitations of the study, text samples and raters' comments. Implications of the results for language testing and teaching, and for discourse analysis, are then discussed. Included in this section is a discussion of the coordination of testing and teaching efforts within programs. The final section of this chapter will include proposals of several avenues of research which may be useful to the language testing and discourse analysis fields. These proposals are prompted by limitations imposed on this study and by results which were interesting but only tangentially related to the hypotheses tested in the study.

VI.2. Addressing the Hypotheses. Results which directly address the hypotheses stated in I.4 are summarized below. The hypotheses are restated for the reader's convenience, and the relevant results are provided and more fully interpreted than they were in the previous chapter.

The first hypothesis expressed in I.4 addressed the validity of the rhetorical abstraction model. A statistical solution was sought to corroborate the second-order factor model of rhetorical abstraction as reflected in the rating scales for rhetorical abstraction (see Appendix B). The SEM results presented in Chapter V do not support the assumption of a second-order model. Nor are they consistent with the more general assumption that the original seven rhetorical abstraction features can be related to a common underlying textual dimension. In spite of the lack of statistical support, the variables are still referred to as features of rhetorical abstraction. The lack of statistical support for a coherent model of rhetorical abstraction may be attributed to several characteristics of the study.
The abbreviated ranges of possible scores evident in the scales applied in the measurement of rhetorical abstraction features may partially account for the results. For all but one of the scales, there were only four possible scores (0, 1, 2 and 3). This range stands in contrast with the range of possible scores (from 0 to 9) for the holistic ratings of overall quality. The restricted range of possible scores may have allowed higher consistencies in raters' evaluations than might have been observed had a wider range of scores been identified, particularly for rhetorical abstraction. Correlations between rhetorical abstraction features are small relative to the correlations between measures for each feature, providing a contrast in correlational relationships which may have resulted in the failure of the proposed model.

A second, conceptual explanation for the failure of the model is related to the definitions of the rhetorical abstraction features. The scale levels were generally defined in terms of the amount of elaboration of each feature, but other aspects of the seven features could have been rated. For example, the relevance of data, warrants, causes and premises/issues to the major claim may be a significant aspect of coherent text and could have been measured. Likewise, the organization of the various features of rhetorical abstraction could have been measured. There are, in all probability, many other aspects of the seven features that can be associated with overall textual quality. As argued above, the inclusion of ratings of these various other aspects could lead to the successful identification of a model of rhetorical abstraction.

The fact that no adequate model of rhetorical abstraction was identified may also be a result of the text types evaluated.
Relatively little variation in the ratings of two rhetorical abstraction features (i.e., consideration of causes and examination of premises/issues) was observed, probably due to the type of texts chosen for the study. Some of the rhetorical abstraction features were not observed in many of the essays. Furthermore, the highest level of elaboration of some features was observed in only a handful of essays. The restricted variation may be due to the limited amount of time allotted for writing and the absence of instructions indicating that the features should be included in the students' texts. It could be that if another corpus of texts (one which exhibited a fuller representation of the rhetorical abstraction features) were analyzed, an adequate model of the construct could be derived. The limited time could have restricted the number of rhetorical abstraction features deployed in many protocols, particularly because the writers were L2 students. Cultural variations in the rhetorical development of writing may partially account for the scarcity. Also, the fact that none of the writers were experts may account for the lack of rhetorical development.

A second statistical explanation may also provide a clue as to why the proposed model was not found adequate, and may also provide a direction for future investigations. Results from the preliminary exploratory factor analyses indicate that pairs of these seven features load on separate common factors, suggesting that rhetorical abstraction may be a multidimensional construct. It could be that investigations of additional features associated with rhetorical abstraction would provide a basis for a coherent multidimensional model of the construct. As discussed in Chapter II, there were many features that have been associated with rhetorical abstraction which were not measured in this study.
The second hypothesis addressed the salience of rhetorical abstraction as one dimension of overall textual quality. It was hypothesized that rhetorical abstraction would prove to be a statistically significant dimension of overall textual quality. Since the first hypothesis was not supported, this hypothesis is restated as a series of hypotheses addressing the relative salience of the six features which were assumed to be related to a common dimension called rhetorical abstraction. These hypotheses are equivalent to hypothesizing that six textual features (statement of claims, statement of direction, data use, explicit statement of warrants, consideration of causes and examination of premises/issues) are statistically significant features of overall textual quality.

Parameter estimates for the first full SEM model indicate that several rhetorical abstraction features were significant (see Table V.9). Statement of direction and examination of premises/issues were both significantly and positively related to overall quality. Statement of claims and explicit statement of warrants were also statistically significant, but their relationship with overall textual quality is inverse.

Of the four features found to be significant and positively associated with overall quality, examination of premises/issues appears to be the least significant. This aspect of rhetorical abstraction was earlier associated with the contextualization of a whole line of argumentation. The feature involved the identification and possible examination of the assumptions and issues underlying arguments. Raters gave little indication in their discussions that this textual feature appeared significant to them. However, they did note that the feature occurred relatively infrequently and indicated that it added a "sophistication" to the papers in which it was exhibited.
They further interpreted the feature as an indication that the writers who included it in their papers (see essay #008 in Appendix F for an example) had given more thought to the topic than had other writers, suggesting that examination of premises/issues might constitute an aspect of content.

The other rhetorical abstraction feature that was identified as significant and positively associated with overall quality was statement of direction. In fact, statement of direction appears to be the second most significant textual dimension, subordinate only to vocabulary in importance. This is not surprising, since the elaboration of this feature constitutes the development of the major claim. Stating the direction of change is one means of elaborating a proposition and was the means often used by the writers of the texts used for the study (see essay #277 in Appendix F for an example). Raters did provide some indications that this textual feature played a role in determining overall quality. They expressed the opinion that some statements of direction were irrelevant to Western societies and therefore constituted weak points of support for the proposition. One example of an irrelevant statement of direction was the suggestion that water consumption could be decreased through the use of physical punishment or confinement of people who were seen as consuming too much water. Raters also noted that the number of statements of direction was important in determining overall textual quality, with one statement, for example, being seen as an inadequate development of a proposition. These comments seem to suggest that statements of direction were also considered aspects of content.

As indicated earlier, there were two other significant features of rhetorical abstraction which are inversely associated with overall textual quality.
Statement of claims and explicit statement of warrants are related to overall quality to an approximately equal degree. The estimate for the relationship between statement of claims and overall quality indicates that, as the statement of a central idea becomes clearer, the overall quality of the writing decreases. The estimate for the relationship between explicit statement of warrants and overall quality similarly indicates that, as more warrants are stated explicitly, the overall quality of the text is adversely affected.

There is no clear explanation for why statement of claims is inversely related to overall quality. This is a surprising and counterintuitive result. The clear statement of a central idea in a text is generally regarded as a desirable characteristic of texts, as can be seen in many composition textbooks. It may be that the raters interpreted the scale descriptors as being related to explicitly stated claims. They may then have assigned lower scores to those essays in which claims were often or always expressed explicitly. Since the explicit statement of theses and topic sentences has been seen as the mark of developing writers, raters may have been expressing a preference for clear, yet implicit, claims. Another reason for this finding may be that raters dispreferred the overuse and predictability of the claims stated by test takers. There is a piece of anecdotal evidence which may support this conjecture. Raters noted that in many essays test takers would repeat the appropriate parts of the prompt as their claims (see essay #277 in Appendix F for an example). The wholesale adoption of the prompt and the frequent repetition of the same, predictable claims appeared to be dispreferred by raters.
There is also evidence which supports an explanation for the inverse relationship between explicit statement of warrants and overall quality. The same argument used to provide a provisional explanation of the relationship between statement of claims and overall quality can be made here. Other researchers who have investigated the use of coherence devices in composition have concluded that such linguistic devices, particularly when they are expressed explicitly, can be overused (Witte and Faigley 1981). That is, continually explicit marking of rhetorical or logical development in texts can be dispreferred by composition evaluators. This was apparently the case for essay #895 in this study, a copy of which is provided in Appendix F. The writer's frequent use of therefore was noted by raters and may account for the relatively low holistic score (an average of 7.3) for overall quality.

The fact that any rhetorical abstraction features were identified as being significant was surprising, given raters' comments in an earlier study (Weasenforth 1993). During the development of the scales of rhetorical abstraction for this earlier study, the raters indicated that the rhetorical abstraction features would have little relevance to their evaluations of L2 essays. One rater suggested that it is not a "language teacher's" responsibility to evaluate features of rhetorical abstraction, implying that rhetorical abstraction was irrelevant to language instruction. The other rater similarly indicated that, as evaluators of L2 writing, the main concern should be with syntactic accuracy and vocabulary usage. In addition to finding the rhetorical abstraction features irrelevant because the texts were written by L2 writers, the raters indicated that the features were irrelevant because the texts were written under standardized test constraints.
They further argued that the features might have been expected had the texts been written as a class assignment. However, since the texts were impromptu and written within a relatively short period of time, rhetorical abstraction features could not be expected to be found in the texts.¹ Ferris (1991) similarly notes the same difference in expectations as a possible explanation for her findings. She suggests that teachers, in their role as teachers in English composition classes, look for "rhetorical sophistication"; as evaluators, however, they look for grammatical accuracy. It is not surprising that readers might apply different evaluation criteria to a text which students had written for class compared to the criteria they might apply to an impromptu text written for a test. However, the variation in expectations raises questions about the validity of the ratings. If the criteria used in the classroom and those used in testing vary, to what extent is it valid to use the test results to place or promote students? To what extent do the two sets of criteria reflect the pedagogical objectives of the program? To what extent do the two sets of criteria reflect the program's model of adequate writing? Particularly important for placement: are the criteria used in testing efficient discriminators of writers at various levels of proficiency? Anecdotal evidence may explain why consideration of causes and data use were not found to be significant textual features in this study. It may be that raters saw and treated both of these features as irrelevant to the evaluation of protocols written by L2 test takers. Hamp-Lyons (1991a) concluded that the raters in her study avoided evaluations of substantive aspects of essays, explaining that the raters felt unqualified to evaluate content. Such a position is untenable from both a theoretical and practical perspective.
If one assumes that textual features interact to constitute the substantive aspects of texts, then the isolation of features related to content from other textual features is impossible. Also, as Keech (1984) has demonstrated, the manipulation of content affects the writer's manipulation of other textual features. In disregarding content, assuming that this is even possible, raters may very well get a narrow view not only of the overall quality of a text, but also of the writer's control of more formal textual features, such as grammar and vocabulary. In the case of consideration of causes, it may be more likely that the lack of any significant finding is due to the relative scarcity of the feature. The descriptive statistics (see Table 1 in Appendix C) indicate that few instances of consideration of causes were exhibited in the texts. The mean for measures of consideration of causes was .66 with a standard deviation of .74. The highest score (4) for this scale was not assigned; the next highest (3) was rarely assigned. The last several hypotheses were formulated in response to an interest in defining rhetorical abstraction. The definition sought was in terms of the type and strength of relationships between rhetorical abstraction features and other textual dimensions. Identifying the associations of rhetorical abstraction with other dimensions, such as text length, content or organization, might be useful in defining the nature of rhetorical abstraction. The third hypothesis posited a significant relationship between rhetorical abstraction features and text length. This hypothesis was tested by calculating correlations between the variables. The other hypotheses involved the identification of broader textual dimensions with which rhetorical abstraction features were correlated or of which the features appear to be aspects.
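The scarcity of consideration of causes scores described above can be made concrete with a short sketch. The score distribution below is invented: the counts are constructed only so that the mean and standard deviation approximate the reported values (.66 and .74) on the 0-4 scale, with the top score never assigned and a score of 3 rare.

```python
import statistics

# Hypothetical distribution of "consideration of causes" scores for 50
# essays on the 0-4 scale. The counts are invented to approximate the
# reported mean (.66) and standard deviation (.74); the top score (4)
# is never assigned and 3 is rare, as in the study.
scores = [0] * 24 + [1] * 20 + [2] * 5 + [3] * 1

print(round(statistics.mean(scores), 2))   # 0.66
print(round(statistics.stdev(scores), 2))  # 0.75
```

A distribution compressed against the floor of the scale in this way leaves little variance for the feature to share with other variables, which is one reason a scarce feature is unlikely to emerge as statistically significant.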
Statistically significant relationships between rhetorical abstraction features and several textual dimensions (content, rhetorical organization, textual elaboration and topic abstraction) were hypothesized. These relationships were described in terms of correlations in model 1 and in terms of regression weights in model 2. Results from the relevant investigations indicated whether rhetorical abstraction features could be viewed as features which commonly occur with content, rhetorical organization, textual elaboration and topic abstraction, or as aspects of these dimensions. Since the topic abstraction and textual elaboration variables were omitted from analyses following the preliminary SEM factor analyses, the last two hypotheses expressed in 1.4 were not tested. Testing the hypotheses would not have been useful or meaningful since the latent variables could not be defined. None of the rhetorical abstraction features correlated significantly with text length or with any of the textual dimensions identified above (see Table V.7). The correlations between these features ranged from .04 to .18. In fact, the highest correlation between any rhetorical abstraction feature and any of the broader dimensions was .25, between direction and language use. The low correlations indicate that the rhetorical abstraction features are generally independent of the other textual features and dimensions. An explanation for this finding is that, as the raters suggested, a fairly clear distinction can be drawn between the rhetorical abstraction features and the other textual dimensions. Indeed, the raters argued that rhetorical abstraction should be disregarded, and evaluations should instead be focussed mainly on language use and vocabulary.
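One way to read coefficients of this size is in terms of shared variance (r squared). The sketch below simply applies that reading to the correlations reported above; no new data are involved, only the interpretation.

```python
# Shared variance (r^2) implied by the reported correlations: even the
# highest coefficient, .25 between direction and language use, implies
# that the two variables share only about 6% of their variance.
for label, r in [("low end of range", 0.04),
                 ("high end of range", 0.18),
                 ("direction / language use", 0.25)]:
    print(f"{label}: r = {r:.2f}, shared variance = {100 * r * r:.1f}%")
```

On this reading, well over 90% of the variance in each rhetorical abstraction feature is unaccounted for by any of the broader dimensions, which is what licenses the independence interpretation in the text.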
The assumption implicit in this argument is that rhetorical abstraction could be considered separately from other (supposedly more important or more relevant) textual features. The low correlations between rhetorical abstraction features and text length ranged in value from .05 to .18. These small correlation coefficients indicate that the relationship between text length and rhetorical abstraction features is statistically insignificant. One may interpret this to mean that text length is not dependent on rhetorical elaboration. That is, a writer may extend the length of a text without necessarily elaborating the text rhetorically. The writer may also rhetorically elaborate the text without necessarily lengthening it. This seems to be a rather common-sense conclusion. The writer may, for example, lengthen a text through verbosity without elaborating the rhetorical aspects of the text. Results from the analyses of the second model may, at first, appear to be inconsistent with interpretations of the correlational results. Parameter estimates summarized in Table V.11 indicate that several rhetorical abstraction features may be considered aspects of content and organization. Statement of claims, statement of direction and explicit statement of warrants are all identified as aspects of both content and organization. All of these estimates are relatively small compared to the estimate for the relationship between text length and content. The parameter estimates for the relationships between statement of claims and explicit statement of warrants and the broader dimensions content and organization are negative.
Similar to the argument made earlier, it may be that both features were considered aspects of content and organization, but the explicitness with which they were articulated was considered detrimental to the quality of the two textual dimensions. The parameter estimates for the relationships between statement of direction and content and organization indicate that elaborations of statements of direction are aspects of both textual dimensions. These results support the hypotheses that rhetorical abstraction features would be significantly related to content and organization. The results are also easily explained with an understanding of the definition and function of statement of direction. The feature is defined as elaboration of discussions of solutions to a problem. The feature was related not only to the identification of solutions, but also to descriptive discussions of the solutions. An example drawn from essays written about the topic of water conservation may be helpful. One statement of direction in many of these essays was the suggestion that water should be rationed in order to solve the problem of decreased water supplies. An elaborated discussion of this particular solution could include suggestions for media promotion of water conservation, issuance of water-saving devices (such as special shower nozzles) and governmental regulations (such as the stipulation that water should not be used for washing cars or watering lawns). With the understanding that such information was associated with statement of direction, one might expect this particular feature to be an aspect of content. The connection between statement of direction and organization might also be expected. Statements of direction often functioned as organizational devices in the essays evaluated for the study.
That is, they often served as the topic sentences in the "body" of the essays. It may be this function which brought about the SEM results which indicate that statement of direction is an aspect of organization. The foregoing discussion of results from analyses of the second model may appear to be inconsistent with interpretations of the correlational results from analyses of the first model. To claim on one hand that rhetorical abstraction features do not co-occur with broader dimensions and to state on the other that the rhetorical abstraction features may be aspects of the dimensions may seem to be irreconcilable statements. Comments from the two raters who provided holistic scores of overall quality may be useful in reconciling the two statements. In a discussion of protocol evaluation, raters suggested that a hierarchical arrangement of textual features portrayed their theory of the relative importance assigned to features in evaluations of L2 test protocols. They categorized features into "basic" and "supplementary" features, which they placed respectively at the base and the apex of the hierarchy (see Figure VI.1).

[Figure VI.1. Proposed Model of Assessment: a triangle with grammar, vocabulary and mechanics at the base, rhetorical organization in the middle, and content at the apex.]

Of greatest importance in evaluating L2 writers' test protocols, according to the two raters, were the most "basic" elements: grammar, vocabulary and mechanics. Basic, but of secondary importance, was overall rhetorical organization (i.e., the tripartite organization of texts and the use of explicit theses and topic sentences). These features, according to this model of evaluation, were the focus of writing assessment for L2 test takers. Content was of relatively less importance to evaluations.
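Purely statistically, the two sets of results need not conflict: a feature can show a near-zero bivariate correlation with a dimension and still carry a meaningful weight in a multivariate model, because regression weights are estimated with the other variables held constant. The simulation below (all data invented, standard library only) builds an outcome that depends on a predictor with a true weight of 1, yet the bivariate correlation comes out small.

```python
import math
import random

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

random.seed(2)
# Two highly correlated predictors (a shared component plus small noise)...
common = [random.gauss(0, 1) for _ in range(500)]
x1 = [c + random.gauss(0, 0.33) for c in common]
x2 = [c + random.gauss(0, 0.33) for c in common]
# ...and an outcome built from them with weights +1 and -1.
y = [a - b for a, b in zip(x1, x2)]

# The true multivariate weight of x1 is 1, but its bivariate correlation
# with y is small because the shared component cancels out of y.
print(round(pearson(x1, y), 2))
```

This is only a statistical caveat; the raters' hierarchical model discussed above remains the substantive reconciliation offered in the study.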
Content appeared to hold some importance in two respects. First, content appeared to be important or useful in determining whether the writing was "on or off topic." If the content of the text did not match raters' expectations, the writer could be asked to retake the examination with the explanation that the writing had been "canned" or was not comparable to the other essays and therefore adversely affected the reliability and validity of the ratings that would have been assigned to the writing. Second, content, loosely defined as the amount of writing, was useful in determining whether writers had produced enough text to assess grammar, vocabulary and mechanics adequately. Content and the elaboration of basic elements, although related to the elements, were considered separately, as, say, "frosting on the cake," as one rater metaphorically explained. The other rater similarly explained that evaluators of L2 writing, especially under standardized testing circumstances, should limit their evaluations to "basic" textual features, particularly grammar and vocabulary, as well as rhetorical organization, which was viewed as "basic" but more relevant to writing produced by relatively more advanced writers. Concerns for content were considered less relevant in evaluating protocols. Rhetorical abstraction was associated with content, but was relegated to a relatively small area toward the apex of the triangle. Similar to views expressed by raters in Hamp-Lyons' (1991a) study, raters considered evaluation of content to be the purview of "content course" professors and instructors. Evaluation of rhetorical abstraction similarly was not seen as the responsibility of evaluators of L2 texts. These assumptions raise questions about the validity of the writing tests; these questions will be discussed further in VI.3.

VI.3. Implications.
Results from the study have implications for both the testing and teaching arms of L2 programs, which I will argue should be coordinated. The implications that I discuss are drawn in response to results related to rhetorical abstraction features. Results related to other features, however, may also be referenced if they are useful in discussing the same implications. Implications for testing L2 writing are discussed first, followed by discussions of implications for the teaching of writing to L2 students. Results from the study also have implications for text-descriptive efforts in the field of discourse analysis. The initial SEM analyses, in particular, yielded information which calls into question quantitative efforts to define characteristics of texts.

VI.3.1. Language Assessment.

I would like to discuss two implications for writing assessment in the process of advancing two arguments which are relevant to the success of language programs, particularly the assessment arms of programs. The first of my arguments is that program testers and administrators should know what is being tested. Although in theory this assertion may not seem controversial, the implications entailed by the assertion may draw some debate. The study yielded information which indicates that raters were not fully aware of what they were evaluating, or at least not aware of the relative importance that they accorded various textual features in their assessment. In contrast to their denial of the relevance of rhetorical abstraction features to the evaluation of L2 protocols, parameter estimates indicate that three of these features were significant. Statement of direction was one of the features which raters dismissed as irrelevant, yet it was one of the most statistically significant indicators of overall textual quality.
One of the disadvantages of holistic scoring is that it masks the evaluation process. Although it allows for reliable and relatively cost-efficient assessment of writing abilities, holistic rating often obscures the criteria actually used to assess students' writing. High reliability estimates may indicate that scores are relatively consistent, but they do not guarantee that raters are evaluating the same textual features. One may argue that this may not matter, since overall textual quality is being evaluated and the further breakdown of features is not warranted. However, it has been shown that raters do at times focus on one, or a limited number of, textual features when asked to evaluate texts holistically (Hamp-Lyons 1991b). I am not advocating the abandonment of holistic rating, especially since it does provide a number of attractive benefits, such as a quick and cost-effective means of assessment. However, one should also consider the disadvantages, such as the masking of what is actually being rated. While this same problem may also occur with other evaluation procedures, such as analytic and primary trait rating, it may not occur to the same extent, since raters' attention is focussed on a limited number of textual features. As researchers have investigated the nature of holistic evaluations, it has been discovered that holistic measures can include significant amounts of error associated with rater fatigue, test taker characteristics and other phenomena which assessors may not have intended to evaluate (Kunnan 1991, Rusikoff 1994). Again, the intrusion of extraneous effects on assessment probably occurs with other types of rating also.
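The point that high reliability estimates do not guarantee shared criteria can be illustrated with a toy simulation (all data invented, standard library only): two raters weight entirely different textual features, yet their scores correlate highly because the features themselves co-vary with overall proficiency.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation, used here as an inter-rater reliability estimate."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (math.sqrt(sum((a - mx) ** 2 for a in x))
                  * math.sqrt(sum((b - my) ** 2 for b in y)))

random.seed(1)
# Each simulated essay has an underlying proficiency; grammar and
# organization scores are both noisy reflections of it.
proficiency = [random.gauss(0, 1) for _ in range(200)]
grammar = [p + random.gauss(0, 0.5) for p in proficiency]
organization = [p + random.gauss(0, 0.5) for p in proficiency]

# Rater A attends mostly to grammar; rater B mostly to organization.
rater_a = [0.9 * g + 0.1 * o for g, o in zip(grammar, organization)]
rater_b = [0.1 * g + 0.9 * o for g, o in zip(grammar, organization)]

# Reliability is high even though the raters apply different criteria:
# the consistency comes from the features co-varying, not from any
# agreement about what is being scored.
print(round(pearson(rater_a, rater_b), 2))
```

High agreement here says nothing about which features are being scored, which is precisely why a reliability coefficient alone cannot validate the criteria behind a holistic scale.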
Laying aside the issue of contrasts between holistic rating and other forms of assessment, my argument is that failing to recognize the nature of an assessment, be it holistic, analytic, primary trait or any other type, can place a program in a vulnerable position and deny it a reliable basis for placement and promotion of students. For example, to assume that no measurement error is involved in an assessment, even when perfect reliability estimates are obtained, is to ignore the many facets of assessment. This ignorance, in turn, can lead to other difficulties, such as teacher and student dissatisfaction with the program and ineffective instruction. If one accepts this argument, it may be concluded that calculating correlational estimates of reliability of any particular value is not adequate in itself to assure rater consistency. Perhaps more attention should be paid, in rater training, to those features which raters do evaluate. Procedures which would entail multiple analytic ratings of essays may be too expensive, but trainers could gather anecdotal evidence which could be useful in focussing raters' attention on relevant textual features. Procedures similar to training for the Test of Written English ratings could be implemented to train raters to focus on relevant features. These procedures include investigations of how raters apply the rating scale and explicit direction of raters to focus on only those features which are referenced in the scale. I would also argue that there should be some consistency between the explicitly stated goals and the implicit goals of assessment. This argument also may not raise a great deal of controversy, but lack of adherence to the principle underlying the argument can be observed in testing procedures in L2 programs.
It may be encouraging to find that raters of L2 protocols in this study did apparently evaluate rhetorical abstraction features, as others have suggested should be the case (Carlson 1988, Freedman and Pringle 1980, Hamp-Lyons 1991a, Johns 1991). However, if, as the SEM results suggest, elaboration of statements of direction is a textual feature which is significant in constituting the overall quality of texts, there should be some explicit statement of the feature's importance in the rating scale and in instructions to test takers. Similarly, if, as two raters argued and the SEM results partially corroborated, the primary concern in assessing L2 students' texts should be vocabulary usage and grammatical accuracy, then these concerns should be reflected in the rating scale and in rater training. However, there are only three references to vocabulary usage and one reference to grammatical accuracy in the rating scale (see Appendix B) used for the holistic scores. There is no indication of the level of significance that vocabulary usage and grammatical accuracy should play relative to the other textual features identified in the scale. It is not made clear in the scale that vocabulary usage and grammatical accuracy should play predominant roles in evaluations. The raters appear to take the position that it does not matter what a writer says as long as s/he says it using felicitous grammar and vocabulary. They appear to interpret the scoring instructions to mean that grammar and vocabulary should be the primary concerns in evaluating students' writing. This approach to evaluating writing may be an inevitable result if the particular generation of teachers employed as raters have accepted this theory of writing evaluation.
The same argument can be made in response to the role of text length in determining overall textual quality. It is curious that test takers who complete the ALI placement test are notified that their writing will be judged partially according to the amount of text they produce, although there is no mention of text length in the rating scale. As the SEM results indicate, text length appears to play a relatively prominent role in holistic evaluations. Since text length apparently is an important indicator of overall textual quality, and since students are explicitly notified that text length is important in the evaluation of their writing, it could easily be argued that there should be some reference to this feature in the rating scale. The coordination of explicit and implicit assessment procedures might provide for more reliable placement and promotion of students. It might also provide more reliable diagnostic information for instructors, students and academic departments. To finish this second argument, I would conclude that the importance of explicitly stating criteria for evaluating writing lies in the validity and reliability of assessment. These two concerns in turn are manifested in more practical, mundane concerns related to rater training, student placement and instruction, and the articulation of L2 curricula with Freshman Writing programs and "content courses." On a more general level of discussion, this study may be useful to the language testing field in terms of its application to investigations of test method effects (Bachman 1990, Kunnan 1991). As discussed in Chapter 4, there are many sources of variation in assessment, including (but not limited to) test taker characteristics, characteristics of the testing environment, characteristics of the test and characteristics of expected response.
In this study I have taken a text-based approach to examining characteristics of expected response. Expected response was limited in the study to particular textual features/dimensions and the relative salience of the features in determining overall textual quality. The study represents a modest expansion of Bachman's (1990) facet theory approach to investigating test validity and reliability. I have looked at textual features/dimensions of L2 writing which have received little, or no, attention in empirical research. Although some of the features have been included in similar empirical investigations, most of these investigations have been limited to L1 writing, as reviewed in Chapter 2. As argued previously, one might expect findings from investigations of L1 writing to differ from findings of investigations of L2 writing (Connor and Kaplan 1987). The study also represents a renewed interest in the interface of applied linguistics with psychometric theory and statistics, which promises to be useful to investigations of language use and acquisition (Bachman 1990). Use of SEM, in particular, to investigate the relative salience of textual features provides theoretical advantages over statistical procedures used in previous research which has described the relative salience of textual features. The application of a procedure that has a preferable theoretical foundation has implications for the validity of previous findings.

VI.3.2. Language Teaching.

Results from the study may be related to L2 writing instruction in several ways. I will argue first that the results should not be taken as categorical directives for how writing should be taught. However, I would like to suggest that two features receive more attention than they have traditionally received in composition classes.
Finally, I will advance an argument for the systematic coordination of language teaching and assessment. I do not want to use the results to rehash stale arguments about the relative focus of pedagogical efforts. It is likely the case that particular aspects of language are more salient than others depending on a wide range of factors. It may be important to emphasize rhetorical aspects in some classroom circumstances, whereas in others the teacher may be better advised to focus on grammatical aspects. The choice of how teaching efforts should be focussed depends on students' needs, the type of texts students are being asked to produce, and the audience being addressed, as well as a great number of other contextual (social, historical, cultural) factors. I would argue that a categorical call for a general focus on rhetorical features or on lexical features or any other type of textual feature may run counter to the efficacy of language teaching. Such a pluralistic view of instructional focus can be very difficult to implement, however. Certainly part of curriculum design and teacher training entails setting an agenda for the classroom. In defining the type of instruction that will be offered in composition courses, one is tempted to dictate that emphasis be placed on particular genres or on particular textual features which may define genres. In some cases, academic departments may mandate that particular texts or textual features be taught. However, it is often the case that composition instructors are sent forth to teach general writing skills.
The complexity of text interpretation, the lack of a coherent model of textuality, and the uncertainty of students' future uses of writing are only several of the daunting obstacles that face anyone who would try to make a rational decision about the textual features that should receive more emphasis in composition classrooms. Nevertheless, it is not, in my estimation, adequate to settle for the "easy way out." As noted by Leki (1991), it is often easier to identify sentence-level errors (e.g., syntax and mechanics) than textual features associated with content or coherence. It would be a mistake to focus on sentence-level features simply because they are more obvious. A more rational, theoretical approach to writing instruction could provide for more effective instruction as well as benefit the profession.² Indeed, I would argue that any program that purports to focus on communicative competence would be misrepresenting itself if it were to focus predominantly on grammar and vocabulary.³ While grammar and vocabulary control are considered integral aspects of communicative competence, they are not the only aspects and may not be the most significant aspects. To focus on these two particular aspects of communication would be inconsistent with a communicative syllabus. In spite of my avoidance of a prescriptive stance, I would argue that both rhetorical abstraction and vocabulary should receive more attention when curricula for writing courses are designed. Results from the SEM analyses indicate that these were significant textual features of overall quality. Coady (1991, 1994), Hoey (1991a, 1991b), and Huckin, Haynes and Coady (1993) have argued that vocabulary, in particular, is one of the most important determinants of textual quality.
I would argue that the two features should receive more attention also because relatively few empirical investigations have involved these features (see review by Zimmerman 1994). Finally, I will argue that the testing and teaching arms of a program should be systematically coordinated. This discussion stems from raters' comments which gave rise to questions regarding variations in expected response, with such variations determined by whether writing is completed as a class assignment or for a test. Raters' comments indicate that the results of the SEM analyses might be different had the writing been produced as a classroom assignment. More specifically, the raters indicated that rhetorical abstraction features and content would have been more salient features of the overall quality of texts produced as class assignments. Poole (1990) also noted differences in expected response, and Ferris (1991) noted specifically that raters would have placed more emphasis on substantive aspects of texts produced for class than they would have for test protocols. These differences between expectations raise questions regarding the validity of assessments and/or the evaluation of classroom writing. One might reasonably expect variations in the quality of texts produced under different circumstances, and, in fact, Kroll (1990) has identified some of those variations. It seems reasonable, for example, to expect more errors in grammar usage and mechanics when writers must produce texts within a restricted period of time, and thus to adjust one's evaluations of writers accordingly. However, the assertion that some textual features are relevant to the evaluation of class assignments, but irrelevant to test protocols, should be questioned. I will discuss this issue as a possible area of research in the concluding section of this chapter.
I surmise that one reason for the unquestioning acceptance of different expectations is that assessment is too often treated as the stepchild of language programs. I would like to advance the proposition that assessment be accorded the attention that teaching is given in language programs. A relatively great amount of energy is expended on materials development, curriculum design, teacher training, and teacher and course evaluations. While these concerns deserve a great deal of attention, assessment should also be seen as an integral part of providing students with the language skills/abilities that they may be expected to have. Assessment is often seen as a perennial nuisance, as a waste of classroom time, as a goad to prod students along an academic path, or as protection against possible challenges of final grades. Little, or no, connection is seen between assessment and teaching materials (Bachman 1993); assessment materials are rarely incorporated in textbooks. The relationship between assessment and curriculum design is also often ignored or casually assumed. Programs may, for example, state objectives in a curriculum in terms of "70% accuracy in understanding and usage." It is not made clear what "70% accuracy" in a student's understanding and usage means or how it should be evaluated. Moreover, there is often little, or no, attempt to define or describe such terminology.⁴ Furthermore, I believe that the professionalization of the field of TESOL is dependent in part on the clarity and soundness of the theories of language use and acquisition that are held, as well as the clarity and soundness of the applications of theory to classroom practice. Classroom practice, I maintain, includes both teaching and assessment.
Testing should reflect accepted theories of language use and acquisition and should be an integral part of language learning, not an adjunct activity to fill time or fulfill mandates.

VI.3.3. Discourse Analysis.

Results from the SEM analyses raise questions about the dimensions defined by Biber (1988) and Grabe (1984), most importantly about their validity. While the textual dimensions topic abstraction and textual on-line elaboration may be identifiable textual dimensions and may be definable in terms of component textual features, it is doubtful that frequency counts alone will tell us much about them. Grabe and Biber (1987) argue that the dimensions may not be defined similarly in L1 and L2 texts, but this argument raises questions about the validity of Biber's (1988) definitions of the dimensions. If the dimensions that he has identified are observable and are defined as he suggested, they should be observable in both L1 and L2 texts and should be identically defined across both text types. Variability of the definition of textual dimensions across text types, most importantly, raises questions about the validity of generalizing Biber's (1988) results. If, in order to maintain the validity of his original analyses, Biber must argue that textual dimensions may be defined differently from one group of texts to another, then his original results are limited in application to those texts included in his research. The common factors may define dimensions of his corpus, but are not necessarily relevant to other texts. Proposing that the definition of dimensions varies raises the question of whether the dimensions actually exist, even in the corpus used in his research. If, for example, frequency counts of features not included in the original analyses were incorporated in the analysis, differently defined common factors would most likely be derived.
Not only would additional features load on the already existing factors, but, most disconcertingly, the factor loadings would possibly be different. Features which originally belonged to a particular common factor would most likely load on a different factor. Anyone who has familiarity with factor analysis recognizes that there are an infinite number of solutions to identifying an adequate factor structure. Perhaps the most difficult job for the researcher applying factor analysis is identifying the most acceptable solution, not only from a statistical perspective, but from a conceptual point of view as well. It is possible that the dimensions identified by Biber are no more than the reification of arbitrary correlations of frequency counts. This appears particularly possible for those dimensions for which little literature can be found for support. Finally, if the dimensions exist and vary in definition, to what extent can one apply these multiple definitions in discourse analytic research? To argue that the dimensions are not observable in L1 texts or that they may be defined differently in L2 texts then begs the question of the identification of these dimensions. The argument appears to be an attempt to maintain credence in objectively defined (statistical probabilities of co-occurrences of selected grammatical and lexical features) textual features while admitting the nonexistence of the dimensions for texts which were not included in the original research. Biber (1988; 1992) has argued that a multidimensional approach to defining textual dimensions is needed. However, the definitions of his dimensions are based on a restricted selection of features, particularly lexical and grammatical features. Furthermore, the frequency of occurrence of the features is the only measure used to define the latent textual dimensions.
Other aspects of the features (e.g., functions and semantic relevance) are ignored. Ignoring other features and various aspects of the features may in part account for the results of this study which are related to investigations of two of Biber's textual dimensions. The results of this study are consistent with the conclusion that the textual dimensions topic abstraction and textual on-line elaboration do not exist as Biber has defined them. Rather than arguing that the dimensions should be redefined from one text to another (a not very useful or credible argument), it seems more reasonable to argue that features other than, or in addition to, frequency of occurrence of those select lexical and grammatical features listed in Biber's (1988) research should be accounted for before definitions of textual dimensions are provided. Somewhat different conclusions can be drawn with regard to the rhetorical abstraction textual dimension. Although the existence of a latent dimension labeled rhetorical abstraction was not supported, results of the SEM analyses indicated that the seven features (statement of claims, statement of direction, etc.) are real. That is, the measures for each feature were relatively highly correlated, and both anecdotal and statistical evidence from an earlier study (Weasenforth 1993) indicated that readers were able consistently to identify and to measure the features. It may be that rhetorical abstraction is a multidimensional construct, the structure of which would be revealed with the inclusion of a larger number of textual features. It may be that the measurement of other aspects of the features would lead to a successful model of rhetorical abstraction. This issue is discussed further as part of a future research agenda.
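The indeterminacy of factor solutions discussed above can be made concrete with a small sketch. The loading matrix below is entirely hypothetical (four textual features, two factors), but the algebraic point is general: any orthogonal rotation of the loadings reproduces the common-factor part of the covariance matrix exactly, so fit statistics alone cannot choose among infinitely many equally adequate "dimensions".

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

# Hypothetical loadings of four textual features on two common factors
L = [[0.8, 0.1],
     [0.7, 0.2],
     [0.1, 0.9],
     [0.2, 0.6]]

theta = math.radians(30)  # any rotation angle gives another valid solution
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

L_rot = matmul(L, R)  # rotated loadings: a different "interpretation"

# Common-factor part of the implied covariance matrix: L Lᵀ
S1 = matmul(L, transpose(L))
S2 = matmul(L_rot, transpose(L_rot))

# Both loading matrices imply the same covariances, hence identical fit
same = all(abs(S1[i][j] - S2[i][j]) < 1e-9
           for i in range(4) for j in range(4))
print(same)  # prints True
```

Because L R (L R)ᵀ = L R Rᵀ Lᵀ = L Lᵀ for any orthogonal R, the data cannot distinguish the original factors from their rotations; the choice among solutions is conceptual, which is precisely the burden placed on the researcher above.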
The research represents the advancement of a multidimensional model of written discourse in terms of the textual features included in the study and the statistical methodology applied. Two features, vocabulary and rhetorical abstraction, have received little, or no, attention in empirical investigations of textual quality. This study, as do Connor's (1987; 1991; 1994), Ferris' (1991) and Hatch's (1991), demonstrates that a multidimensional statistical approach to discourse analysis is both feasible and useful. In particular, the application of SEM in this study represents the introduction of a statistical approach which provides theoretical advantages to multidimensional analyses of written discourse. SEM is particularly useful in analyzing both ordinal and interval level data, thus accommodating both ordinal measures of semantic, rhetorical textual features as well as interval measures (e.g., frequency counts) of grammatical and lexical features. SEM also accounts for measurement error, thereby providing for a "pure" measure of underlying textual dimensions. It also allows for the correlation of error terms of varying magnitude (an important aspect when raters evaluate multiple textual features) and the correlation of textual features, thereby accommodating the generally accepted assumption of the interdependence of textual features. It also allows the analysis of a complete model of latent and observed variables and thereby may be useful in developing and testing a model of discourse structure.

VI.4. Further Research.

There are a number of avenues of research that could logically follow from this study. One direction would be to continue investigations related to test validation, looking at the relation of ratings to stated teaching and assessment objectives.
Another direction would lie in the pursuit of information more closely related to discourse descriptions. This direction includes the description of individual textual dimensions and text structure.

VI.4.1. Test Validation.

In this section I will propose three directions for future research as suggested by the results of the study. The first set of proposals involves four types of expansion of the same research. I will suggest replications of this study with several distinct purposes in mind. I will also propose investigations of qualitative evidence, comparisons of SEM and regression analyses and the replication of previous studies. The second set of proposals stems from results of this study which may be interesting but were not related to investigations of rhetorical abstraction. This set of proposals includes investigations of the discrepancies between expected response in testing versus classroom environments and the effects of these discrepancies on test validity. I will also suggest further investigations of the role of textual features not focussed on in this study, as well as further investigations of the definitions of vague categories such as content. The final proposal comes from taking a broader perspective of the areas involved in this study. I will argue that a meta-analysis of research related to investigations of textual features and their role in determining textual quality is now warranted. It may be useful to replicate the study with a larger sample. This may be particularly useful in light of the demands of the weighted least squares estimator which must be used with ordinal measures.
Replication would not only be useful in light of the statistical demands of a particular estimation procedure, but also with regard to investigating the reliability and generalizability of the results of this study, as well as to provide an empirically based model of discourse structure. Other types of evidence could be sought for describing the role of rhetorical abstraction in defining textuality. Qualitative evidence, in the form of recorded discussions during rater training and rating sessions, was collected. This evidence could be investigated more fully to determine whether it is consistent with the quantitative evidence. Such an endeavor may also be useful if it provides different perspectives on the investigation of the relative salience of text features. It may, for example, provide some indication of what other features, or aspects of features, could prove to be significant indicators of textual quality. Finally, it may be useful to replicate this study using regression analysis instead of SEM analyses. This replication may serve to identify the possible limitations of regression analyses in studies such as this one. Replication may also be useful in underscoring the usefulness of SEM analyses by pointing out the limitations of other types of analyses. It may also be useful to replicate previous studies which have involved regression analyses. Reanalysis of data from previous studies with the application of SEM may be useful for the same reasons. There are a number of results from this study that were not discussed at length in this dissertation because of the focus of the study. These results are discussed below in terms of future research. First, it may be useful to investigate how the testing situation modifies the expectations of raters.
It was suggested that test takers would not generally have time to incorporate in their writing many, if any, of the rhetorical abstraction features. Two of the raters described the rhetorical abstraction features as "frosting on the cake," pointing out that the main purpose of evaluating ESL students' protocols, at least in a testing environment, was primarily to ascertain how much command of grammar and vocabulary and some sense of organization the test taker could display. Elaboration of content and sophistication of organizational aspects of the texts were "extras." This same position was taken by the raters in Ferris' (1991) study who indicated that grammatical accuracy was their primary concern in evaluating test protocols but that rhetorical development was their primary concern in evaluating texts written as class assignments. Although they do not provide comparative information about raters' expectations in testing versus classroom situations, Pollitt and Hutchinson (1987) point out that it is common practice for raters to accord grammatical correctness greater importance than other textual features. Identifying the situational variations in raters' evaluations would be useful in analyses of the pedagogical validity of writing tests and in test development. It may not be surprising that raters would not expect to find certain types of features in texts which had been extemporaneously composed within a 35-minute block of time. It appears, in fact, that with the exception of several features, the features of rhetorical abstraction identified in this study had relatively little effect on raters' judgments. The diminished expectations within a testing situation raise questions about the validity of test results.
It may be useful to identify discrepancies in expected responses and to describe how those discrepancies relate to the validity of a test and the reliability of scores. This type of information may be useful in the development of test formats and procedures that might be more valid and might yield more reliable measures of writing abilities. Further investigations of the role of other features in determining overall textual quality, including content, text length, and vocabulary, may be useful. Investigations of content would be interesting at least in the sense that it is not clear what is meant by this rubric. It is interesting that the results indicated that the use of data did not appear to be an aspect of content. This information may also be useful in test development and validation. Investigations of vocabulary as an influential aspect of discourse would be useful since relatively few investigations of vocabulary exist. There are a number of aspects to vocabulary (including, but not limited to, the type of vocabulary, appropriateness of usage, and variety of usage) which could be investigated. Given the apparent value that students place on vocabulary acquisition (Zimmerman 1994, Leki and Carson 1994) and the consistency with which studies have shown vocabulary to be a significant aspect of overall textual quality, further investigations of vocabulary usage may be fruitful and useful. The high correlation between vocabulary and language use may also deserve some attention. While the correlation may be due in part to the overlap in criteria in the ESL Composition Profile scales, it is also conceivable that the two textual dimensions are difficult to distinguish. This would have implications for a model of discourse structure as well as for assessment methods. The role of text length is also interesting.
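One consideration for the vocabulary and language use correlation discussed above is rating unreliability: observed correlations between rated dimensions are attenuated by measurement error, which is exactly what SEM's "pure" latent measures correct for. The sketch below illustrates the classic Spearman disattenuation correction (the simplest case of what SEM does in generalized form); the true correlation and reliabilities are hypothetical, not values from this study.

```python
import math
import random

random.seed(42)
n = 50_000
rho = 0.6            # hypothetical "pure" correlation between two dimensions
rel_x = rel_y = 0.7  # hypothetical rater reliability of each observed score

obs_x, obs_y = [], []
for _ in range(n):
    lx = random.gauss(0, 1)                                        # latent score 1
    ly = rho * lx + math.sqrt(1 - rho * rho) * random.gauss(0, 1)  # latent score 2
    # add rating error scaled so that var(latent) / var(observed) = reliability
    obs_x.append(lx + math.sqrt(1 / rel_x - 1) * random.gauss(0, 1))
    obs_y.append(ly + math.sqrt(1 / rel_y - 1) * random.gauss(0, 1))

def corr(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a)
                           * sum((y - mb) ** 2 for y in b))

r_obs = corr(obs_x, obs_y)                 # attenuated by rating error
r_true = r_obs / math.sqrt(rel_x * rel_y)  # disattenuated estimate, near rho
```

The observed correlation lands near rho times the geometric mean of the reliabilities, while the corrected estimate recovers the latent correlation; if the vocabulary and language use correlation remains high even after such correction, the case that the two dimensions are genuinely hard to distinguish becomes stronger.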
As pointed out by Hillocks (1986), text length has consistently been identified as a significant aspect of overall textual quality. This observation is consistent with the results in this study. Researchers have speculated about the reasons for this finding. The significance of text length in determining overall textual quality has been attributed to the association of this feature with content (Hillocks 1986, Nold and Freedman 1977) and sophistication (Nold and Freedman 1977). The SEM analyses indicate that text length is a significant aspect of many textual dimensions, including content, organization, language use, vocabulary and mechanics. It is not clear why this would be true. It may be the case that more proficient writers produce longer texts which are in general more competently constructed. It may be that raters in general expect better writing to be exhibited in longer papers. Investigations of the effect of text length may be useful to test validation, and such investigations may have implications for composition instruction. The results of this study were somewhat useful in identifying aspects of broader rating categories, such as content. As expected, several features of rhetorical abstraction were identified by LISREL as aspects of content. Surprisingly, data use was not. It might be useful and interesting to identify the aspects of content and other broad rating categories, such as organization and development. This information might be useful in validating ratings, that is, to determine whether textual features of interest are actually being rated. This information could be used in rater training and scale development as well as test validation.
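The speculation above, that text length may matter because more proficient writers simply produce longer texts, is testable with a partial correlation: if the length-quality association vanishes once a proficiency measure is held constant, length is a proxy rather than a cause. The simulation below is a sketch under that assumption; every coefficient is hypothetical, and proficiency is treated as directly observable only for illustration.

```python
import math
import random

def corr(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a)
                           * sum((y - mb) ** 2 for y in b))

random.seed(7)
n = 30_000
length, quality, prof = [], [], []
for _ in range(n):
    p = random.gauss(0, 1)  # latent writer proficiency (hypothetical)
    # both length and quality are driven by proficiency, not by each other
    length.append(0.7 * p + 0.71 * random.gauss(0, 1))
    quality.append(0.8 * p + 0.6 * random.gauss(0, 1))
    prof.append(p)

r_lq = corr(length, quality)  # raw length-quality correlation (spurious here)
r_lp = corr(length, prof)
r_qp = corr(quality, prof)

# first-order partial correlation of length and quality given proficiency
partial = (r_lq - r_lp * r_qp) / math.sqrt((1 - r_lp ** 2) * (1 - r_qp ** 2))
```

In this simulated scenario the raw correlation is substantial while the partial correlation is near zero; applied to real ratings, the contrast between the two would speak to whether raters reward length itself or the proficiency it signals.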
Finally, I believe that an updated meta-analysis of research focused on various features which influence raters and the extent or the manner in which they affect raters would be useful to the language assessment and teaching fields. It has been nearly a decade since Hillocks' (1986) meta-analysis was published. Because a fair amount of research has been completed since that time, it may be useful to compile a list of the studies, summarize them, and identify consistencies and inconsistencies in the results. Such a compilation of research efforts could provide a basis for future studies as well as direction for composition instruction.

VI.4.2. Discourse Description.

It may be useful to investigate the relative salience of features in other types of texts defined according to rhetorical mode or genre. Argumentation texts were analyzed in this study, but similar analyses of narrative, descriptive, expository or other rhetorically defined text types may be useful in terms of distinguishing and defining different text types by identifying salient aspects of the texts. Empirical evidence could be sought for the theoretical assumption that argumentation and persuasion texts can be distinguished as two separate text types (Connor 1987, 1991). The same approach could be taken for other genres, such as grant proposals or research articles. If, as is maintained by other researchers (Flower 1979, Carrell 1987, Horowitz 1987, Kaplan 1982; 1991, Kintsch 1986; 1988, Smith 1985, van Dijk 1990), rhetorical features are more useful in distinguishing text types, a comparison of SEM models for various text types might be an intellectually profitable line of research. Similarly, it may also be useful to replicate the research with professional texts (as distinct from student texts).
Resultant information could be useful in defining pedagogical objectives for novice writers, that is, for graduate students as well as for L2 language learners. Selection of a different set of textual features may also prove useful. A model of the rhetorical structure of texts could employ a set of only rhetorical features rather than a list of features which supposedly represent all salient features. This approach may be useful in narrowing the number of features included in a more comprehensive model. Furthermore, rhetorical features not included in this study could be investigated. This would be especially advisable should the features, as argued earlier, be specific to argumentation texts. As suggested earlier, various aspects of the features should also be investigated. Although this dissertation is based to a great degree in language testing, it is also relevant to discourse analysis concerns. A discussion of the usefulness of SEM to discourse analysis and of the difficulties in applying it could be useful to the field. Other than Biber's (1992) work, I am aware of no other application of SEM in written discourse analysis efforts. An introduction to SEM for the discourse analysis field could not only be instructive, but could also clarify the idea that quantitative approaches to discourse analysis, and SEM in particular, do not necessarily need to be limited to frequency counts of grammatical and lexical textual features. An introduction to SEM for the discourse analysis field could present the possibility of applying SEM to address discourse analysis interests, such as text structure descriptions and defining text type distinctions, and not concerns limited to language testing per se (Mauranen 1993).

VI.5. Conclusion.
A number of aspects of this study (e.g., choice of text type, raters and test takers) limit the extent to which the results thereof can be generalized in applications to wider contexts. Nevertheless, the consistency of the results of this study with those of other studies may give some indication of the extent to which they can be generalized. The study may also be useful in providing different perspectives on the evaluation of students' written protocols. In spite of the particularity of the study, several findings are consistent with those of other researchers. It is interesting that a consistent finding across studies has been that relatively discrete features of textuality, particularly grammar and vocabulary, appear to be more significant than more broadly defined dimensions. A more striking consistency is the relative significance of text length in determining overall textual quality, striking in that it is rare to find text length not to be a significant aspect of textuality, if not the most significant. The results of the study may also provide a clearer, or different, perspective on the nature of rating. Many similar empirical studies have indicated that rhetorical organization and development are insignificant features of students' texts. This study, however, indicates that rhetorical development, or some aspects thereof, can be as important as more discrete textual features in determining the overall quality of students' texts. The study thereby provides empirical support for self-report data that indicates that rhetorical development is important in evaluations of students' written texts. As results from additional research become available, a clearer understanding of the nature of rating should evolve.
Future research should involve investigations of various text types, raters (e.g., expert, novice and NNS), test takers and contexts (i.e., testing and non-testing). Notes 1 . In contrast to the anecdotal evidence reported above, raters who participated in the study decried the lack of evidence, the repetition of ideas from the prompts and the lack of contextualization of arguments. These comments could be construed as evidence that the raters did actually value and evaluate the features of rhetorical abstraction identified in the study, as the statistical results indicate. 2. One should also be aware of the possible difficulties spawned by the evaluation of more illusive textual features (Weasenforth 1993). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 243 3. Kaplan (1994) h as argued that grammatical control is prerequisite to writing and that any course focused on grammar is not teaching writing. He points out that in many EFL contexts, syllabi have only grammatical accuracy as their objective. In ESL environments the options are much broader, in some ca s es , a grammar focus may be justified. 4. Kaplan notes that this type of definition is a direct outcome of the pressure for accountability. Programs in many states are required by law to specify goals and objectives. However, given the complexity o f communicative competence, they find themselves compelled to write s u c h vacuous definitions. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 244 REFERENCES Akinnaso, F. N. 1982. On the differences between spo ken and written language. Language and speech. 25. 97-125. Anscombre, J. and O. Ducrot 1983. L'argumemation dans la langue. [Argumentation in language.] Bruxelles: P. Madrago. Aston, G. 1977. Comprehending value: Aspects of the structure of argumentative discourse. Studi ita lia n i di linguistics teorica ed applicata. 6. 3. 465-509. Bachman, L. 
1989. Language testing-SLA research interfaces. In R. B. Kaplan eta l. (eds.) Annual review o f applied linguistics, 9. New York: Cambridge University Press. 193-209. Bachman, L. 1990. Fundamental considerations in language testing. Oxford: Oxford University Pres s. Bachman, L. and A. Palmer. 1981. The construct validation of the FSI oral interview. Language learning. 31. 1.67-86. Bachman, L. and A. Palmer. 1982. The construct validation of so m e components of communicative proficiency. TESOL Quarterly. 16.4. 449-465. Bachman, L. and A. Palmer, forthcom ing. Language testing in practice. New York: Oxford University Press. Bachman, etaL 1991. Cambridge-University of California, Los Angeles language testing project: Test method rating instrument Los Angeles: University of California, Los Angeles. Interim Report submitted to the University of Cambridge Local Examination Syndicate. Bachman, L., Purpura, J. and S. Cushing. 1993. Development o f a questionnaire item bank to explore test-taker characteristics. Los Angeles: University of California, Los Angeles. [Interim Report submitted to die University of Cambridge Local Examination Syndicate). Ballard, B. and J. Clanchy. 1991. Assessment by misconception: Cultural influences and intellectual traditions. In L. Hamp-Lyons (ed.) Assessing second language w riting in academic contexts. Norwood, NJ: Ablex. 19-35. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 245 Bentler, P. and D. Bonett 1980. Significance tests and goodness-of-fit in the analysis of covariance structures. Psychological bulletin. 88. 588-600. Bereiter, C. and M. Scardamalia 1987. The psychology o f written composition. Hillside, NJ: Lawrence Erlbaum Associates. Berlin, J. A. 1984. Writing instruction in nineteenth-century American colleges. Carbondale, IL: Southern Illinois University Press. Berlin, J. A. 1987. Rhetoric and reality: Writing instruction in American Colleges, 1900-1985. 
Carbondale, IL: Southern Illinois University Press. Berthoff, A. 1984. Is teaching still possible?: Writing, meaning, and higher order reasoning. College English. 46. 8. 743-755. Berthoff, A. 1986. Abstraction as a speculative instrument. In D. A. McQuade (ed.) The territory o f language: Linguistics, stylistics, and the teaching o f composition. Carbondale, IL: Southern Illinois University Press. 227-237. Biber, D. 1984. A model of textual relations within the written and spoken modes. Los Angeles: University of Southern California Ph. D. dissertation. Biber, D. 1988. Variation across speech and writing. Cambridge: Cambridge University Press. Biber, D. 1992. On the complexity of discourse complexity: A multidimensional analysis. Discourse processes. 15. 133-163. Biber, D.^and E. Finegan. 1989a Styles and stance in English. Text. 9. 1. 93- Biber, D. and E. Finegan. 1989b. Drift and the evolution of English style. Language. 65. 3. 487-517. Bizzell, P. 1992. Academic discourse and critical consciousness. Pittsburgh, PA: University of Pittsburgh Press. Black, K. 1989. Audience analysis and persuasive writing at the college level. Research in the teaching o f English. 23. 3. 231-249. Bollen, K. 1986. Sample size and Bentler and Bonett's nonnormed fit index. Psychometrika. 51. 375-377. Bollen, K. 1989. Structural equation modeling with latent variables. New York: John Wiley & Sons. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 246 Bollen, K. 1990. Overall fit in covariance structure models: Two types of sample size effects. Psychological bulletin. 107. 2. 256-259. Bollen, K. and J. Long (eds.). 1993. Testing structural equation models. Thousand Oaks, CA: Sage. Bollen, K. andR. Stine. 1993. Bootstrapping goodness-of-fit measures in structural equation models. Thousand Oaks, CA: Sage. Bolus, R., F. Hinofotis and K. Bailey. 1982. An introduction to generalizability theory in second language research. 
Language learning. 32. 245-258.
Boomsma, A. 1983. On the robustness of LISREL (maximum likelihood estimators) against small sample size and non-normality. Groningen: Rijksuniversiteit Groningen.
Brauße, U. 1983. Bedeutung und Funktion einiger Konjunktionen und Konjunktionaladverbien: aber, nur, immerhin, allerdings, dafür, dagegen, jedoch. [The meaning and function of some conjunctions and adverbial conjunctions: but, only, nevertheless, of course, for it/them/that, against it/them/that, however.] In W. Bahner, W. Neumann, J. Schildt, B. Techtmeier, D. Viehweger, and W. Wurzel (eds.) Linguistische Studien, Reihe A: Untersuchungen zu Funktionswörtern I. [Linguistic studies, Series A: Investigations of function words I.] Berlin: Akademie der Wissenschaften der DDR, Zentralinstitut für Sprachwissenschaft. 1-40.
Breland, H. and J. Gaynor. 1979. A comparison of direct and indirect assessments of writing skill. Journal of educational measurement. 16. 2. 119-128.
Brennan, R. 1992. Elements of Generalizability Theory. Iowa City, IA: American College Testing Program.
Bridgeman, B. and S. Carlson. 1983. Survey of academic writing tasks required of graduate and undergraduate foreign students. Princeton, NJ: Educational Testing Service. [Research Report 15.]
Britton, J., T. Burgess, N. Martin, A. McLeod and H. Rosen. 1975. The development of writing abilities. London: MacMillan Education.
Brockriede, W. and D. Ehninger. 1971. Toulmin on argument: An interpretation and application. In R. Johannesen (ed.) Contemporary theories of rhetoric: Selected readings. New York: Harper and Row. 241-255.
Brossell, G. 1983. Rhetorical specification in essay examination topics. College English. 45. 165-173.
Brossell, G. and B. Ash. 1984. An experiment with the wording of essay topics. College composition and communication. 35. 423-425.
Brown, G. 1989. Making sense: The interaction of linguistic expression and contextual information. Applied linguistics. 10. 1. 97-108.
Brown, G. and G. Yule. 1983. Discourse analysis. Cambridge: Cambridge University Press.
Brown, J. 1985. An introduction to the uses of facet theory. In D. Canter (ed.) Facet theory: Approaches to social research. New York: Springer-Verlag. 17-57.
Bruthiaux, P. 1993. Knowing when to stop: Investigating the nature of punctuation. Language and communication. 13. 1. 27-44.
Byrne, B. 1989. A primer of LISREL: Basic applications and programming for confirmatory factor analytic models. New York: Springer-Verlag.
Caccamise, D. 1987. Idea generation in writing. In A. Matsuhashi (ed.) Writing in real time. Norwood, NJ: Ablex. 224-253.
Campbell, C. C. 1990. Writing with others' words: Using background reading text in academic compositions. In B. Kroll (ed.) Second language writing: Research insights for the classroom. Cambridge: Cambridge University Press. 211-230.
Canale, M. 1983. On some dimensions of language proficiency. In J. Oller (ed.) Issues in language testing research. Rowley, MA: Newbury House. 333-342.
Canale, M. and M. Swain. 1980. Theoretical bases of communicative approaches to second language teaching and testing. Applied linguistics. 1. 1-47.
Canter, D. 1983. The potential of facet theory for applied social psychology. Quality and quantity. 17. 35-67.
Canter, D. 1985a. Editor's introduction: The road to Jerusalem. In D. Canter (ed.) Facet theory: Approaches to social research. New York: Springer-Verlag. 1-13.
Canter, D. 1985b. How to be a facet researcher. In D. Canter (ed.) Facet theory: Approaches to social research. New York: Springer-Verlag. 265-275.
Caplan, R. and C. Keech. 1980. Showing-writing: A training program to help students be specific. Collaborative Research Study No. 2. Berkeley, CA: University of California, Berkeley. [ERIC ED 198 539.]
Carlisle, R. and E. McKenna. 1991. Placement of ESL/EFL undergraduate writers in college-level writing programs. In L. Hamp-Lyons (ed.) Assessing second language writing in academic contexts. Norwood, NJ: Ablex. 197-214.
Carlson, S. 1988. Cultural differences in writing and reasoning skills. In A. Purves (ed.) Writing across languages and cultures: Issues in contrastive rhetoric. Thousand Oaks, CA: Sage. 227-260.
Carlson, S., B. Bridgeman, R. Camp and J. Waanders. 1985. Relationship of admission test scores to writing performance of native and nonnative speakers of English. Princeton, NJ: Educational Testing Service. [Research Report 19.]
Carrell, P. 1987. Interactive text processing: Implications for ESL/second language reading classrooms. In P. Carrell, J. Devine, and D. Eskey (eds.) Interactive approaches to second language reading. Cambridge: Cambridge University Press. 239-259.
Carroll, J.B. 1992. Program PMS.TX1. J.B. Carroll, Unc.
Carter, R. and W. Nash. 1990. Seeing through language: A guide to styles of English writing. Oxford: Basil Blackwell.
Chafe, W. 1982. Integration and involvement in speaking, writing, and oral literature. In D. Tannen (ed.) Spoken and written language: Exploring orality and literacy. Norwood, NJ: Ablex. 35-54.
Chafe, W. 1985. Linguistic differences produced by differences between speaking and writing. In D. R. Olson, N. Torrance, and A. Hildyard (eds.) Literacy, language, and learning: The nature and consequences of reading and writing. Cambridge: Cambridge University Press. 105-123.
Chafe, W. 1986. Writing in the perspective of speaking. In C. Cooper and S. Greenbaum (eds.) Studying writing: Linguistic approaches. Thousand Oaks, CA: Sage. 12-39.
Chafe, W. and J. Danielewicz. 1986. Properties of spoken and written language. In R. Horowitz and S. J. Samuels (eds.) Comprehending oral and written language. New York: Academic Press.
Chafe, W. and D. Tannen. 1987.
The relation between written and spoken language. Annual review of anthropology. 16. 383-407.
Channell, J. 1990. Precise and vague quantities in writing on economics. In W. Nash (ed.) The writing scholar: Studies in academic discourse. Thousand Oaks, CA: Sage. 95-117.
Chiseri-Strater, E. 1991. Academic literacies: The public and private discourse of university students. Portsmouth, NH: Boynton/Cook.
Choi, Y. 1988. Textual coherence in English and Korean: An analysis of argumentative discourse. Urbana-Champaign, IL: University of Illinois. Ph.D. dissertation.
Chomsky, N. 1986. Knowledge of language: Its nature, origin, and use. New York: Praeger.
Christensen, F. 1965. A generative rhetoric of the paragraph. College composition and communication. 16. 144-156.
Christensen, F. 1967. Notes toward a new rhetoric. New York: Harper and Row.
Cliff, N. 1993. Dominance statistics: Ordinal analyses to answer ordinal questions. Los Angeles: University of Southern California. Unpublished manuscript.
Clyne, M. 1991. The sociocultural dimension: The dilemma of the German-speaking scholar. In H. Schröder (ed.) Subject-oriented texts: Languages for special purposes and text theory. Berlin: Walter de Gruyter. 49-67.
Coady, J. 1991. Teaching and learning vocabulary. TESOL Quarterly. 25. 4. 707-710.
Coady, J. 1994. Detecting growth in language. Journal of reading. 37. 4. 341-342.
Coe, R. 1988. Toward a grammar of passages. Carbondale, IL: Southern Illinois University Press.
Connor, U. 1987. Argumentative patterns in student essays: Cross-cultural differences. In U. Connor and R. B. Kaplan (eds.) Writing across languages: Analysis of L2 text. Reading, MA: Addison-Wesley. 57-71.
Connor, U. 1990. Linguistic/rhetorical measures for international persuasive student writing. Research in the teaching of English. 24. 1. 67-87.
Connor, U. 1991. Linguistic/rhetorical measures for evaluating ESL writing. In L. Hamp-Lyons (ed.) Assessing second language writing. Norwood, NJ: Ablex. 215-225.
Connor, U. 1994. Learning discipline-specific academic writing: A case study of a Finnish graduate student in the United States. Paper presented at the annual conference of the American Association of Applied Linguistics. Baltimore. March.
Connor, U. and R. B. Kaplan (eds.). 1987. Writing across languages: Analysis of L2 text. Reading, MA: Addison-Wesley.
Connor, U. and J. Lauer. 1985. Understanding persuasive essay writing. Text. 5. 4. 309-326.
Connor, U. and J. Lauer. 1988. Cross-cultural variation in persuasive student writing. In A. Purves (ed.) Writing across languages and cultures: Issues in contrastive rhetoric. Thousand Oaks, CA: Sage. 138-159.
Connors, R. and A. Lunsford. 1988. Frequency of formal errors in current college writing, or Ma and Pa Kettle do research. College composition and communication. 39. 4. 395-409.
Cook-Gumperz, J. and J. Gumperz. 1981. From oral to written: The transition to literacy. In M. F. Whiteman (ed.) Writing: The nature, development, and teaching of written communication, Vol. 1: Variation in writing. Hillsdale, NJ: Erlbaum. 89-109.
Coulthard, M. 1985. An introduction to discourse analysis. New York: Longman.
Couture, B. 1986. Effective ideation in written text: A functional approach to clarity and exigence. In B. Couture (ed.) Functional approaches to writing: Research perspectives. Norwood, NJ: Ablex. 69-92.
Crothers, E. 1979. Paragraph structure inference. Norwood, NJ: Ablex.
Crowhurst, M. 1978a. The effect of audience and mode of discourse on the syntactic complexity of the writing of sixth and tenth graders. Minneapolis: University of Minnesota. Ph.D. dissertation. [Dissertation Abstracts International 38. 7300-A.]
Crowhurst, M. 1978b. Syntactic complexity in two modes of discourse at grades 6, 10 and 12. Washington, D.C.: U.S. Department of Education, Office of Educational Research and Improvement. [ERIC ED 168 037.]
Crowhurst, M. 1980. Syntactic complexity and teachers' quality ratings of narrations and arguments. Research in the teaching of English. 14. 223-231.
Cushing, S. 1992. An IRT analysis of the ESLPE writing subtest. Los Angeles, CA: University of California, Los Angeles. Unpublished manuscript.
Darling-Hammond, L. 1994. Performance-based assessment and educational equity. Harvard educational review. 64. 1. 5-30.
Davies, A. 1990. Principles of language testing. Oxford: Basil Blackwell.
Davison, A. and G. Green (eds.). 1988. Linguistic complexity and text comprehension: Readability issues reconsidered. Hillsdale, NJ: Erlbaum.
Diederich, P., French, S. and S. Carlton. 1961. Factors in judgements of writing ability. Princeton, NJ: Educational Testing Service. [Research Bulletin 61-15.]
Dilworth, C., R. Reising and D. Wolfe. 1978. Language structure and thought in written composition: Certain relationships. Research in the teaching of English. 12. 97-106.
Dwyer, J. 1983. Statistical models for the social and behavioral sciences. Oxford: Oxford University Press.
Enkvist, N. 1987. Text linguistics for the applier: An orientation. In U. Connor and R. B. Kaplan (eds.) Writing across languages: Analysis of L2 text. Reading, MA: Addison-Wesley. 23-42.
Eskey, D. 1983. Meanwhile, back in the real world...: Accuracy and fluency in second language teaching. TESOL Quarterly. 17. 315-323.
Faigley, L. 1979. The influence of generative rhetoric on the syntactic maturity and writing effectiveness of college freshmen. Research in the Teaching of English. 13. 197-206.
Fassinger, R. 1987. Use of structural equation modeling in counseling psychology research. Journal of counseling psychology. 34.
4. 425-436.
Ferris, D. 1990. Linguistic and rhetorical characteristics of student argumentative writing by native and non-native speakers of English. Los Angeles: University of Southern California, Department of Linguistics. Qualifying paper.
Ferris, D. 1991. Syntactic and lexical characteristics of ESL student writing: A multidimensional study. Los Angeles: University of Southern California, Department of Linguistics. Ph.D. dissertation. [Dissertation Abstracts International 52. 08-A.]
Fitzgerald, K. 1988. Rhetorical implications of school discourse for writing placement. Journal of basic writing. 7. 1. 61-72.
Flower, L. 1979. Writer-based prose: A cognitive basis for problems in writing. College English. 41. 19-37.
Flower, L. 1991. Personal communication. 29 March.
Flower, L. and J. R. Hayes. 1977. Problem-solving strategies and the writing process. College English. 39. 449-461.
Flower, L. and J. R. Hayes. 1980. The cognition of discovery: Defining a rhetorical problem. College composition and communication. 31. 21-32.
Frase, L. 1969. Paragraph organization of written materials: The influence of conceptual clusterings upon level of organization. Journal of educational psychology. 60. 344-401.
Frase, L., J. Faletti, J. Reid, D. Biber and U. Connor. forthcoming. A computer analysis of TOEFL's Test of Written English. Princeton, NJ: Educational Testing Service. [Research Report # not yet assigned.]
Freedman, A. and I. Pringle. 1980. Writing in the college years: Some indices of growth. College composition and communication. 31. 311-324.
Freedman, S. 1977. Influences on the evaluation of student writing. Los Angeles, CA: University of California, Los Angeles. Ph.D. dissertation.
Freedman, S. 1979. How characteristics of student essays influence teachers' evaluations. Journal of Educational Psychology. 71. 328-338.
Gardner, R., Lalonde, R. and R. Moorcroft. 1987. Second language attrition: The role of motivation and use. Journal of language and social psychology. 6. 1. 29-47.
Gardner, R., Lalonde, R. and R. Pierson. 1983. The socio-educational model of second language acquisition: An investigation using LISREL causal modeling. Journal of language and social psychology. 2. 51-65.
Givón, T. 1984. Syntax: A functional-typological introduction, 2 vols. Amsterdam: Benjamins.
Goldberger, A. and O. Duncan (eds.). 1973. Structural equation models in the social sciences. New York: Academic Press.
Golub-Smith, M., C. Reese and K. Steinhaus. 1993. Topic and topic type comparability on the Test of Written English. Princeton, NJ: Educational Testing Service. [Research Report 42.]
Grabe, W. 1984. Towards defining expository prose within a theory of text construction. Los Angeles, CA: University of Southern California. Ph.D. dissertation.
Grabe, W. and D. Biber. 1987. Who are we writing for? A linguistic comparison of freshman argumentative essays and published English genres. Los Angeles, CA: University of Southern California. Unpublished manuscript.
Grady, M. 1971. A conceptual rhetoric of the composition. College composition and communication. 22. 348-354.
Gray, B. 1977. The grammatical foundations of rhetoric: Discourse analysis. The Hague: Mouton.
Green, P. and K. Hecht. 1985. Native and non-native evaluation of learners' errors in written discourse. System. 13. 2. 77-97.
Greenberg, K. 1982. Some relationships between writing assignments and students' writing performance. The writing instructor. 2. 7-14.
Greenberg, K. 1986. The development and validation of the TOEFL writing test: A discussion of TOEFL Research Reports 15 and 19. TESOL Quarterly. 20. 3. 531-544.
Greene, S. 1993. The role of task in the development of academic thinking through reading and writing in a college history course. Research in the teaching of English. 27. 1. 46-75.
Grimes, J. 1975. The thread of discourse. The Hague: Mouton.
Grobe, C. 1981. Syntactic maturity, mechanics, and vocabulary as predictors of quality ratings. Research in the teaching of English. 15. 75-85.
Gumperz, J., H. Kaltman and M. C. O'Connor. 1984. Cohesion in spoken and written discourse. In D. Tannen (ed.) Coherence in spoken and written discourse. Norwood, NJ: Ablex. 3-20.
Guttman, L. 1970. Integration of test design and analysis. In Proceedings of the 1969 invitational conference on testing problems. Princeton, NJ: Educational Testing Service. 53-65.
Hake, R. and J. M. Williams. 1981. Style and its consequences: Do as I do, not as I say. College English. 43. 5. 433-451.
Hale, G. 1991. Effects of the amount of time allowed on the Test of Written English. Princeton, NJ: Educational Testing Service. [Research Report 39.]
Hale, G., B. Bridgeman and C. Taylor. forthcoming. An investigation of writing tasks. Princeton, NJ: Educational Testing Service. [Research Report # not yet assigned.]
Halliday, M.A.K. 1989. Language, context, and text: Aspects of language in social-semiotic perspective. Oxford: Oxford University Press.
Halliday, M.A.K. and R. Hasan. 1976. Cohesion in English. London: Longman.
Hamp-Lyons, L. 1990. Second language writing: Assessment issues. In B. Kroll (ed.) Second language writing: Research insights for the classroom. Cambridge: Cambridge University Press. 69-87.
Hamp-Lyons, L. 1991a. Reconstructing 'academic writing proficiency'. In L. Hamp-Lyons (ed.) Assessing second language writing in academic contexts. Norwood, NJ: Ablex. 127-153.
Hamp-Lyons, L. 1991b. Holistic writing assessment of LEP students. Denver: University of Colorado, Denver. [Center for Research in Applied Language Research Report 2.]
Hamp-Lyons, L. and B. Heasley. 1987.
Study writing: A course in written English for academic and professional purposes. New York: Cambridge University Press.
Harris, J., S. Laan and L. Mossenson. 1988. Applying partial credit analysis to the construction of narrative writing tests. Applied measurement in education. 1. 335-346.
Hasan, R. 1984. Coherence and cohesive harmony. In J. Flood (ed.) Understanding reading comprehension. Newark, DE: International Reading Association.
Haswell, R. and S. Wyche-Smith. 1994. Adventuring into writing assessment. College composition and communication. 45. 2. 220-236.
Hatch, E. 1991. Using a layered discourse analysis of language data. Presentation given at the Second Language Research Forum. Los Angeles. March.
Hatch, E. 1992. Discourse analysis in education. New York: Cambridge University Press.
Hays, J. and K. Brandt. 1992. Socio-cognitive development and students' performance on audience-centered argumentative writing. In M. Secor and D. Charney (eds.) Constructing rhetorical education: From the classroom to the community. Carbondale, IL: Southern Illinois University Press. 202-229.
Hays, J., R. Durham, K. Brandt and A. Raitz. 1990. Argumentative writing of students: Adult socio-cognitive development. In G. Kirsch and D. Roen (eds.) A sense of audience in written communication. Thousand Oaks, CA: Sage. 248-266.
Hayward, M. 1989. Choosing an essay test question: It's more than what you know. Teaching English in the two-year college. 16. 174-178.
Hayward, M. 1990. Evaluations of essay prompts by nonnative speakers of English. TESOL Quarterly. 24. 4. 753-757.
Hildebrand, D., Laing, J., and H. Rosenthal. 1977. Analysis of ordinal data. Thousand Oaks, CA: Sage.
Hillocks, G. 1986. Research on written composition. Urbana, IL: National Council on Research in English.
Hinds, J. 1979. Organizational patterns in discourse. In T. Givón (ed.) Syntax and semantics, Vol. 12: Discourse and syntax. New York: Academic Press. 135-157.
Hoetker, J. 1982. Essay examination topics and students' writing. College Composition and Communication. 33. 4. 377-392.
Hoey, M. 1991a. On the surface of discourse. Nottingham: University of Nottingham.
Hoey, M. 1991b. Patterns of lexis in text. Oxford: Oxford University Press.
Homburg, T. 1984. Holistic evaluation of ESL compositions: Can it be validated objectively? TESOL Quarterly. 18. 1. 87-107.
Horning, A. 1993. The psycholinguistics of readable writing. Norwood, NJ: Ablex.
Horowitz, D. 1986a. Process not product: Less than meets the eye. TESOL Quarterly. 20. 1. 141-144.
Horowitz, D. 1986b. What professors actually require: Academic tasks for the ESL classroom. TESOL Quarterly. 20. 3. 445-462.
Horowitz, R. 1987. Rhetorical structure in discourse processing. In R. Horowitz and S. Samuels (eds.) Comprehending oral and written language. New York: Academic Press. 117-160.
Hu, L.-t., P. Bentler and Y. Kao. 1992. Can test statistics in covariance structure analysis be trusted? Psychological bulletin. 112. 2. 351-362.
Huckin, T. 1983. A cognitive approach to readability. In P. V. Anderson, R. Brockman and C. R. Miller (eds.) New essays in technical and scientific communication: Research, theory, practice. Farmingdale, NY: Baywood. 90-108.
Huckin, T., Haynes, M., and J. Coady (eds.). 1993. Second language reading and vocabulary learning. Norwood, NJ: Ablex.
Hughes, D.C., B. Keeling and B. Tuck. 1983. Effects of achievement expectations and handwriting quality on scoring essays. Journal of educational measurement. 20. 65-70.
Hughes, M., R. Price and D. Marrs. 1986. Linking theory construction and theory testing: Models with multiple indicators of latent variables. Academy of management review. 11. 1. 128-144.
Hughey, J. 1990. ESL composition testing. In D. Douglas (ed.) English language testing in US colleges and universities. Washington, DC: National Association for Foreign Student Affairs. 51-67.
Hunt, K. 1964. Differences in grammatical structures written at three grade levels, the structures to be analyzed by transformational methods. Washington, DC: U.S. Department of Education, Office of Educational Research and Improvement. [ERIC ED 003 322.]
Hunt, K. 1965. Grammatical structures written at three grade levels. Washington, DC: U.S. Department of Education, Office of Educational Research and Improvement. [ERIC ED 113 735.]
Hyland, K. 1990. A genre description of the argumentative essay. RELC Journal: A journal of language teaching and research in Southeast Asia. 21. 1. 66-78.
Hymes, D. 1964. Introduction: Toward ethnographies of communication. In J. Gumperz and D. Hymes (eds.) The ethnography of communication. American anthropologist. 66. 6. 1-34.
Jacobs, H., S. Zinkgraf, D. Wormuth, V. Hartfiel, and J. Hughey. 1981. Testing ESL composition. Rowley, MA: Newbury House.
Johns, A. 1985. Coherence and information load: Some considerations for the academic classroom. San Diego, CA: San Diego State University. Unpublished manuscript.
Johns, A. 1986. Writing tasks and evaluation in lower division classes: A comparison of two- and four-year post-secondary institutions. Washington, DC: U.S. Department of Education, Office of Educational Research and Improvement. [ERIC ED 273 090.]
Johns, A. 1991. Faculty assessment of ESL student literacy: Implications for writing assessment. In L. Hamp-Lyons (ed.) Assessing second language writing in academic contexts. Norwood, NJ: Ablex. 167-179.
Johnson, R. 1970. Recall of prose as a function of the structural importance of linguistic units. Journal of verbal learning and verbal behavior. 9. 12-20.
Jolliffe, D. (ed.). 1988. Writing in academic disciplines. Norwood, NJ: Ablex.
Jöreskog, K. 1993. Testing structural equation models. In K. Bollen and J. Long (eds.) Testing structural equation models. Thousand Oaks, CA: Sage. 294-316.
Jöreskog, K. and D. Sörbom. 1985. Simultaneous analysis of longitudinal data from several cohorts. In W. Mason and S. Feinberg (eds.) Cohort analysis in social research: Beyond the identification problem. New York: Springer-Verlag. 323-341.
Jöreskog, K. and D. Sörbom. 1988. PRELIS: A program for multivariate data screening and data summarization. Chicago: Scientific Software.
Jöreskog, K. and D. Sörbom. 1989. LISREL 7: A guide to the program and applications. Chicago: SPSS.
Jöreskog, K. and D. Sörbom. 1993. New features in LISREL 8. Chicago: Scientific Software.
Kaplan, R. B. 1966. Cultural thought patterns in intercultural education. Language learning. 16. 1-20.
Kaplan, R. B. 1982. An introduction to the study of written texts: The "discourse compact." In R. B. Kaplan et al. (eds.) Annual review of applied linguistics, 3. Rowley, MA: Newbury House. 138-147.
Kaplan, R. B. 1987. Cultural thought patterns revisited. In U. Connor and R. B. Kaplan (eds.) Writing across languages: Analysis of L2 text. Reading, MA: Addison-Wesley. 9-21.
Kaplan, R. B. 1988. Contrastive rhetoric and second language learning: Notes toward a theory of contrastive rhetoric. In A. Purves (ed.) Writing across languages and cultures: Issues in contrastive rhetoric. Thousand Oaks, CA: Sage. 275-304.
Kaplan, R. B. 1991. Concluding essay: On applied linguistics and discourse analysis. In W. Grabe et al. (eds.) Annual review of applied linguistics, 11. New York: Cambridge University Press. 199-204.
Kaplan, R. B. 1994. Personal communication. 2 November.
Keech, C. 1984. Apparent regression in student writing performance as a function of unrecognized changes in task complexity. Berkeley, CA: University of California, Berkeley. Ph.D. dissertation.
Keech, C. 1985. Writing modes and writing "prompts." In P. Evans (ed.) Directions and misdirections in English evaluation. Ottawa: The Canadian Council of Teachers of English. 100-104.
Kennedy, M. 1985. The composing process of college students writing from sources. Written communication. 2. 434-456.
Kinneavy, J. 1971. A theory of discourse: The aims of discourse. Englewood Cliffs, NJ: Prentice-Hall.
Kintsch, W. 1974. The representation of meaning in memory. Hillsdale, NJ: Lawrence Erlbaum.
Kintsch, W. 1986. On modeling comprehension. In S. DeCastell, A. Luke and K. Egan (eds.) Literacy, society and schooling: A reader. Cambridge: Cambridge University Press. 175-195.
Kintsch, W. 1988. The role of knowledge in discourse comprehension: A construction-integration model. Psychological review. 95. 2. 163-182.
Kirsch, G. and D. Roen (eds.). 1990. A sense of audience in written communication. Written communication annual, Vol. 5. Thousand Oaks, CA: Sage.
Krashen, S. 1988. Do we learn to read by reading?: The relationship between free reading and reading ability. In D. Tannen (ed.) Linguistics in context: Connecting observation and understanding. Norwood, NJ: Ablex. 269-298.
Krashen, S. 1993. The power of reading. Englewood, CO: Libraries Unlimited.
Kroll, B. 1990. What does time buy? ESL student performance on home versus class compositions. In B. Kroll (ed.) Second language writing: Research insights for the classroom. New York: Cambridge University Press. 140-154.
Kunnan, A. 1991. Modeling relationships among some test-taker characteristics and performance on tests of English as a foreign language. Ann Arbor, MI: University of Michigan. Ph.D. dissertation.
Lalonde, R. and R. Gardner. 1984. Investigating a causal model of second language acquisition: Where does personality fit in? Canadian journal of behavioural science. 16. 224-237.
Lautamatti, L. 1987. Observations on the development of the topic of simplified discourse. In U. Connor and R. B. Kaplan (eds.) Writing across languages: Analysis of L2 text. Reading, MA: Addison-Wesley. 87-114.
Leki, I. 1991. The preferences of ESL students for error correction in college-level writing classes. Foreign language annals. 24. 203-218.
Leki, I. and J. Carson. 1994. Students' perceptions of EAP writing instruction and writing needs across the disciplines. TESOL Quarterly. 28. 1. 81-101.
Linacre, M. 1989. Many-faceted Rasch measurement. Chicago: MESA Press.
Linacre, M. 1991. Constructing measurement with a many-facet Rasch model. Washington, DC: U.S. Department of Education, Office of Educational Research and Improvement. [ERIC ED 333 047.]
Lindeberg, A.-C. 1985. Abstraction levels in student essays. Text. 5. 4. 327-346.
Linnarud, M. 1986. Lexis in composition: A performance analysis of Swedish learners' written English. Malmö, Sweden: Lunds Universitet. Ph.D. dissertation. [Dissertation Abstracts International 47. 3718-C.]
Lipman, M. 1991. Thinking in education. Cambridge: Cambridge University Press.
Long, J. 1983a. Confirmatory factor analysis. Thousand Oaks, CA: Sage.
Long, J. 1983b. Covariance structure models: An introduction to LISREL. Thousand Oaks, CA: Sage.
Long, B., Kahn, S. and R. Schutz. 1992. Causal model of stress and coping: Women in management. Journal of counseling psychology. 39. 2. 227-239.
Longacre, R. 1983. The grammar of discourse. New York: Plenum Press.
MacCallum, R., M. Roznowski and L. Necowitz. 1992. Model modifications in covariance structure analysis: The problem of capitalization on chance. Psychological bulletin. 111. 3. 490-504.
MacDonald, S. 1990. The literary argument and its discursive conventions. In W. Nash (ed.) The writing scholar: Studies in academic discourse. Thousand Oaks, CA: Sage. 31-62.
Madaus, G. 1994. A technological and historical consideration of equity issues associated with proposals to change the nation's testing policy. Harvard educational review. 64. 1. 76-95.
Manicas, P. 1971. On Toulmin's contribution to logic and argumentation. In R. Johannesen (ed.) Contemporary theories of rhetoric: Selected readings. New York: Harper and Row. 256-270.
Mann, W. and S. Thompson. 1987. Text generation: The problem of text structure. Marina del Rey, CA: Information Sciences Institute. [ISI/RS-87-181.]
Mann, W. and S. Thompson. 1988. Rhetorical structure theory: Towards a functional theory of text organization. Text. 8. 3. 243-281.
Markham, L.R. 1976. Influences of handwriting quality on teacher evaluation of written work. American educational research journal. 13. 277-283.
Marsh, H., Balla, J. and R. McDonald. 1988. Goodness-of-fit indices in confirmatory factor analysis: The effect of sample size. Psychological bulletin. 103. 391-410.
Martinez San José, C. 1973. Grammatical structures in four modes of writing at fourth-grade level. Syracuse, NY: Syracuse University. Ph.D. dissertation. [Dissertation Abstracts International 33. 5411-A.]
Marzano, R., R. Brandt, C. Hughes, B. Jones, B. Presseisen, S. Rankin and C. Suhor. 1988. Dimensions of thinking. Alexandria, VA: The Association for Supervision and Curriculum Development.
Mauranen, A. 1993. Cultural differences in academic rhetoric: A textlinguistic study. New York: Peter Lang.
McNamara, T. 1990. Item response theory and the validation of an ESP test for health professionals. Language testing. 7. 1. 52-75.
McNamara, T. and R. Adams. 1991. Exploring rater behavior with Rasch techniques. Paper presented at the Language Testing Research Colloquium. Vancouver. March.
Meyer, B. 1987. Following the author's top-level organization: An important skill for reading comprehension. In R. Tierney, P. Anders and J. Mitchell (eds.) Understanding readers' understanding: Theory and practice. Hillsdale, NJ: Lawrence Erlbaum. 59-76.
Meyer, B. 1992. An analysis of a plea for money. In W. Mann and S. Thompson (eds.) Discourse description: Diverse linguistic analyses o f a fund-raising text. Philadelphia: John Benjamins. 79-108. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 261 Missimer, C. 1986. Good arguments: An introduction to critical thinking. Englewood Cliffs, NJ: Prentice-Hall. Moffett, J. 1969. Teaching the universe o f discourse. Boston: Houghton- Mifflin. Mooney, C. and R. Duval. 1993. Bootstrapping: A nonparametric approach to statistical inference. Thousand Oaks, CA: Sage. Moustafa, M. 1987. The effects of word processing on composing in a university ESL composition class. Los Angeles: University of Southern California Unpublished manuscript Mulaik, S., L. James, J . van Alstine, N. Bennett, S. Lind and C. Stilwell. 1989. Evaluation of goodness-of-fit indices for structural equation models. Psychological bulletin. 105. 3. 430-445. Nash, W. (ed.). 1990a. The writing scholar: Studies in academic discourse. Thousand Oaks, CA: Sage. Nash, W. 1990b. The stuff these people write. In W. Nash (ed.) The writing scholar: Studies in academic discourse. Thousand Oaks, CA: Sage. 8-30. Nelson, J. 1990. This was an easy assignment: Examining how students interpret academic writing tasks. Berkeley, CA: University of California at Berkeley, Center for the Study of Writing. (Technical Report 43.] Newcomb, M. 1990. What structural equation modeling can tell u s about social support In B. Sarason, I. Sarason and G. Pierce (eds.) Social support: An interactional view. New York; John Wiley and Sons. 26-62. Nietzke, D. 1972. Hie influence of composition assignment upon grammatical structure. Urbana-Champaign, IL: University of Illinois, Urbana- Champaign. Ph.D. dissertation. [Dissertation Abstracts International 32. 5476-A.] Nold, E. and B. Davis. 1980. The discourse matrix. College composition and communication. 31. 141-152. Nold, E. and S. 
Freedman. 1977. An analysis of readers' responses to essays. Research in the Teaching o f English 11. 164-174. Norusis, M. 1990. SPSS base system user's guide. Chicago: SPSS, Inc. Ochs, E. 1979. Planned and unplanned discourse. In T. Giv6n (ed.) Discourse and syntax. New York: Academic P ress. 51-80. [Syntax and semantics, Vol. 12.1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 262 Odell, L. 1981. Defining and assessing competence in writing. In C. R. Cooper (ed.) The nature and measurement o f competency in English. Urbana, D L : National Council of Teachers of English. 95-138. Ohmann, R. 1979. Use definite, specific, concrete language. College English. 41. 390-397. Oiler, J. 1976. A program for language testing research. In H. Brown (ed.) Papers in second language acquisition. Language learning. 141-166. Oiler, J. 1979. Language tests at school London: Longman. Ostler, S. 1980. A survey of academic needs for advanced ESL. TESOL Quarterly. 14.489-502. Patthey-Chavez, G. 1988. Writing opinions in high school: A comparison of Anglo and Latino students texts. Paper presented at the 11th Annual Meeting of the American Association of Applied Linguistics, New Orleans. March. Pedhazur, E. 1982. Multiple regression in behavioral research, explanation and prediction. New York: Holt, Reinhart and Winston. Pedhazur, E and L. Schmelkin. 1991. Measurement, design and analysis: An integrated approach. Hillsdale, NJ: Lawrence Erlbaum. Perelman, C. and L. Olbrechts-Tyteca. 1969. The new rhetoric: A treatise on argumentation. Tr. J. Wilkinson and P. Weaver. South Bend, IN: University of Notre Dame Press. Perkins, K. 1980. Using objective methods of attained writing proficiency to discriminate among holistic evaluations. TESOL Quarterly. 14. 1. til- 69. Perkins, K. 1983. On the use of composition saving techniques, objective m easures, and objective tests to evaluate ESL writing ability. TESOL Quarterly. 17. 4. 651 -671. 
Pike, K. 1982. Linguistic concepts. Lincoln, NB: University of Nebraska Press. Pike, K. and E. Pike. 1983. Text and tagmeme. London: Frances Pinter. Pitkin, W. 1977a. Hierarchies and the discourse hierarchy. College English. 38. 648-659. Pitkin, W. 1977b. X/Y: Some basic strategies of discourse. College English. 38. 660-672. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 263 Pollitt, A. and C. Hutchinson. 1987. Calibrated graded assessm ents: Rasch partial credit analysis of performance in writing. Language testing. 4. 72-92. Poole, D. 1990. Contextualizing IRE in an eighth grade quiz review. Linguistics and education. 2. 185-211. Powers, D., Fowles, M., Famum, M . and P. Ramsey. 1994. W ill they think less of my handwritten essay if others word process theirs? Effects on essay scores of intermingling handwritten and word-processed essays. Journal o f educational measurement. 31.3.220-233. Pritchard, R. 1981. A study of cohesion devices in the good and poor composition of eleventh graders. Columbia, MO: University of Missouri, Columbia. Ph.D. dissertation. [Dissertation Abstracts International 42. 02-A.] Purcell, E 1983. Models of pronunciation accuracy. In J. Oiler (ed.) Issues in language testing research. Rowley, MA: Newbury House. 133-153. Purves, A. (ed.). 1988. Writing across languages and cultures: Issues in contrastive rhetoric. Thousand Oaks, CA: Sage. Quellmalz, E.,etaL 1984. Designing writing assessments: Balancing fairness, utility, and cost. Los Angeles: University of California, Los Angeles, Center for the Study of Evaluation. [CSE Report 188.] Quinn, K. and A. Matsuhashi. 1985. Stalking ideas: The generation and elaboration o f arguments. Washington, DC: U.S. Department of Education, Office of Educational Research and Improvement [ERIC ED 257 043.] Quirk, R., Greenbaum, S., Leech, G., and J. Svartvik. 1985. A comprehensive grammar o f the English language. Longman: New York. Rafoth, B. and D. 
Rubin. 1984. The impact of content and mechanics on judgments of writing quality. Written communication. 1. 446-458.
Raimes, A. 1990. The TOEFL Test of Written English. TESOL Quarterly. 24. 3. 427-460.
Rasch, G. 1960. Probabilistic models for some intelligence and attainment tests. Chicago: University of Chicago Press.
Reid, J. 1990. Responding to different topic types: A quantitative analysis from a contrastive rhetoric perspective. In B. Kroll (ed.) Second language writing: Research insights for the classroom. Cambridge: Cambridge University Press. 191-210.
Rosen, H. 1969. An investigation of the effects of differentiated writing assignments on the performance in English composition of a selected group of 15/16 year old pupils. London: University of London. Ph.D. dissertation.
Rumelhart, D. 1977. Toward an interactive model of reading. In S. Dornic (ed.) Attention and performance. New York: Academic Press. Vol. 6. 573-603.
Rusikoff, K. 1994. A comparison of selected writing criteria used to evaluate non-native speakers of English at a California State university. Los Angeles: University of Southern California. Ph.D. dissertation.
Ruth, L. and S. Murphy. 1988. Designing writing tasks for the assessment of writing. Norwood, NJ: Ablex.
Sandell, R. 1977. Linguistic style and persuasion. London: Academic Press.
Sang, F., B. Schmitz, H. Vollmer, J. Baumert and P. Roeder. 1986. Models of second language competence: A structural equation approach. Language testing. 3. 1. 54-79.
Santos, T. 1985. Professors' reactions to the academic writing of non-native-speaking students. Los Angeles: University of California, Los Angeles. Ph.D. dissertation.
Scannell, D. and J. Marshall. 1966. The effect of selected composition errors on grades assigned to essay examinations. American educational research journal. 3. 125-130.
Schieffelin, B. and E. Ochs. 1986.
Language socialization. Annual review of anthropology. 15. 163-191.
Schiffrin, D. 1987. Discourse markers. Cambridge: Cambridge University Press.
Schmersahl, C. and B. Stay. 1992. Looking under the table: The shapes of writing in college. In M. Secor and D. Charney (eds.) Constructing rhetorical education. Carbondale, IL: Southern Illinois University Press.
Schroder, H. (ed.) 1991. Subject-oriented texts: Languages for special purposes and text theory. Berlin: Walter de Gruyter.
Selinker, L., Todd-Trimble, M. and L. Trimble. 1976. Presuppositional rhetorical information in EST discourse. TESOL Quarterly. 10. 3. 281-290.
Selinker, L., Todd-Trimble, M. and L. Trimble. 1978. Rhetorical function-shifts in EST discourse. TESOL Quarterly. 12. 3. 311-320.
Shaughnessy, M. 1977a. Errors and expectations. New York: Oxford University Press.
Shaughnessy, M. 1977b. Some needed research on writing. College composition and communication. 28. 317-320.
Shavelson, R. and N. Webb. 1991. Generalizability theory: A primer. Thousand Oaks, CA: Sage.
Shuy, R. 1981. Toward a developmental theory of writing. In C. Frederiksen and J. Dominic (eds.) Writing: The nature, development and teaching of written communication. Hillsdale, NJ: Lawrence Erlbaum. 2. 120-132.
Singer, M., M. Halldorson, J. Lear, and P. Andrusiak. 1992. Validation of causal bridging inferences in discourse understanding. Journal of memory and language. 31. 507-524.
Skehan, P. 1990. Progress in language testing: The 1990s. In C. Alderson and B. North (eds.) Language testing in the 1990s. London: Macmillan. 3-21.
Sloan, C. and I. McGinnis. 1978. The effect of handwriting on teachers' grading of high school essays. DeKalb, IL: Northern Illinois University. [ERIC ED 220 836.]
Smith, E. 1985. Text types and discourse framework. Text. 5. 229-247.
Soter, A. 1988.
The second language learner and cultural transfer in narration. In A. Purves (ed.) Writing across languages and cultures: Issues in contrastive rhetoric. Thousand Oaks, CA: Sage. 177-205.
Speelman, C. and K. Kirsner. 1990. The representation of text-based and situation-based information in discourse comprehension. Journal of memory and language. 29. 119-132.
Spiro, R. and B. Taylor. 1987. On investigating children's transition from narrative to expository discourse: The multi-dimensional nature of psychological text classification. In R. Tierney, P. Anders and J. Mitchell (eds.) Understanding readers' understanding: Theory and practice. Hillsdale, NJ: Lawrence Erlbaum. 77-93.
Spivey, N. 1983. Discourse synthesis: Constructing texts in reading and writing. Austin, TX: University of Texas. Ph.D. dissertation.
Spooner-Smith, L., et al. 1980. Characteristics of student writing competence: An investigation of alternative scoring systems. Los Angeles: University of California, Los Angeles, Center for the Study of Evaluation. [CSE Report 134.]
Stewart, M. and H. Grobe. 1979. Syntactic maturity, mechanics of writing, and teachers' quality ratings. Research in the Teaching of English. 13. 207-215.
Suter, R. 1976. Predictors of pronunciation accuracy in second language learning. Language learning. 26. 233-253.
Swales, J. 1987. Operationalizing the concept of discourse community. Paper presented at the Conference on College Composition and Communication. Atlanta, GA. November.
Swales, J. 1991. Discourse analysis in professional contexts. In W. Grabe et al. (eds.) Annual Review of Applied Linguistics, 11. New York: Cambridge University Press. 103-114.
Tanaka, J. 1987. How big is big enough? The sample size issue in structural equation models with latent variables. Child Development. 58. 134-146.
Tanaka, J. 1989.
Multifaceted conceptions of fit in structural equation models. In K. Bollen and J. Long (eds.) Testing structural equation models. Thousand Oaks, CA: Sage. 10-39.
Tannen, D. 1982a. Oral and literate strategies in spoken and written narratives. Language. 58. 1-21.
Tannen, D. 1982b. The oral/literate continuum in discourse. In D. Tannen (ed.) Spoken and written language: Exploring orality and literacy. Norwood, NJ: Ablex. 1-16.
Tannen, D. 1985. Relative focus on involvement in oral and written discourse. In D. Olson, N. Torrance and A. Hildyard (eds.) Literacy, language, and learning: The nature and consequences of reading and writing. New York: Cambridge University Press. 124-147.
Tirkkonen-Condit, S. 1985. Argumentative text structure and translation. Jyväskylä, Finland: Kirjapaino OY, Sisä-Suomi.
Toulmin, S. E. 1958. The uses of argument. Cambridge: Cambridge University Press.
Toulmin, S. E. 1993. Personal communication. 18 June.
Toulmin, S. E., R. Rieke and A. Janik. 1979. An introduction to reasoning. New York: Macmillan Publishing Company.
Vahapassi, A. 1988. The problem of selection of writing tasks in cross-cultural study. In A. Purves (ed.) Writing across languages and cultures: Issues in contrastive rhetoric. Thousand Oaks, CA: Sage. 51-78.
Van Dijk, T. 1977. Text and context: Explorations in the semantics and pragmatics of discourse. London: Longman.
Van Dijk, T. 1990. The future of the field: Discourse analysis in the 1990s. Text. 10. 1. 133-156.
Van Dijk, T. and W. Kintsch. 1983. Strategies of discourse comprehension. New York: Academic Press.
Van Eemeren, F. 1990. The study of argumentation as normative pragmatics. Text. 10. 1/2. 37-44.
Van Peer, W. 1988. The invisible textbook: Writing as a cultural practice. In A. Luke, S. deCastell and C. Luke (eds.) Language, authority and criticism: Readings on the school textbook. London: Falmer. 123-132.
Van Peer, W. 1990. Writing as an institutional practice. In W. Nash (ed.) The writing scholar: Studies in academic discourse. Thousand Oaks, CA: Sage. 192-204.
Vaughan, C. 1991. Holistic assessment: What goes on in the raters' minds? In L. Hamp-Lyons (ed.) Assessing second language writing in academic contexts. Norwood, NJ: Ablex. 111-126.
Weasenforth, D. 1991. Prompt type effects on essay ratings. Los Angeles: University of Southern California. Unpublished manuscript.
Weasenforth, D. 1993. Prompt type effects: A discourse analysis of protocols based on prose vs. graph essay prompts. Paper presented at the American Association of Applied Linguists convention. Atlanta. April.
Weiss, D. (ed.) 1983. New horizons in testing: Latent trait test theory and computerized adaptive testing. New York: Academic Press.
Weydt, H. 1979a. Immerhin. [Nevertheless.] In H. Weydt (ed.) Die Partikeln der deutschen Sprache. [Particles in the German language.] Berlin: Walter de Gruyter. 335-348.
Weydt, H. 1979b. Zur Unterscheidung semantisch-pragmatisch, dargestellt an den Partikeln jedenfalls, immerhin, und schliesslich. [Toward a semantic-pragmatic distinction between the particles in any case, nevertheless and finally.] In I. Rosengren (ed.) Sprache und Pragmatik. [Language and Pragmatics.] Malmö: CWK Gleerup. 355-370.
Winterowd, R. 1986. Brain, rhetoric and style. In D. McQuade (ed.) The territory of language: Linguistics, stylistics, and the teaching of composition. Carbondale, IL: Southern Illinois University Press. 34-64.
Winterowd, R. and V. Gillespie (eds.) 1994. Composition in context: Essays in honor of Donald C. Stewart. Carbondale, IL: Southern Illinois University Press.
Witte, S. 1983. Topical structure and writing quality: Some possible text-based explanations of readers' judgments of student writing. Visible language. 17. 2. 177-204.
Witte, S. and L. Faigley.
1981. Cohesion, coherence and writing quality. College composition and communication. 32. 189-204.
Young, R., A. Becker and K. Pike. 1970. Rhetoric: Discovery and change. New York: Harcourt, Brace and World.
Zimmerman, C. 1994. Self-selected reading and interactive vocabulary instruction: Knowledge and perceptions of word learning among L2 learners. Los Angeles: University of Southern California. Ph.D. dissertation.
Zwicky, A. 1982. Om tekstbinding i gode og dårlige skolestiler. [On coherence in good and bad academic writing.] Oslo: University of Oslo. Master's thesis.

Appendix A
Essay Prompts

Prompt 1A

Student #

The global supply of fresh water is quickly decreasing due to population growth, increased pollution and overuse of the supply. Propose solutions to solve the problem, using information from the graph and charts to support your argument.

[Figure 1: Number of U.S. dollars to produce 1 liter of fresh water, actual (1990) and projected (2020), for desalinization (extraction of salt from water), recycling, diversion of waterways, and towing icebergs.]

[Figure 2: Percentages of all water used, by use (recreational use, industrial use, farm irrigation), 1990 figures.]

[Figure 3: Percentage of all unused water, by cause (evaporation, pollution, salt water), 1990 figures.]

Prompt 1B

Student #

The global supply of fresh water is quickly decreasing due to population growth, increased pollution and overuse of the supply.
Propose solutions to solve the problem and support your argument.

Prompt 2A

Student #

Homelessness is becoming a major problem around the world due to slowing economies, rising costs of living and changes in the types of skills demanded of employees. Propose solutions to solve the problem, using information from the graphs and chart to support your argument.

[Figure 1: World's homeless population, by area (in millions), rural vs. urban.]

[Figure 2: Average cost of a one-bedroom home and annual rent for a one-bedroom apartment, by year, 1960-1990 (in thousands of U.S. dollars), U.S. figures.]

[Figure 3: Highest education level completed by homeless (college, high school, less than high school), by percentage of total homeless population, 1990 U.S. figures.]

Prompt 2B

Student #

Homelessness is becoming a major problem around the world due to slowing economies, rising costs of living and changes in the types of skills demanded of employees. Propose solutions to solve the problem and support your argument.

Appendix B
Rating Scales
ESL CHALLENGE TEST ESSAY EVALUATION SCALE

1-2 Incompetence
*Indistinguishable introduction, body, conclusion
*No apparent thesis
*Off-topic, incoherent response
*Irrelevant, unconnected details within paragraph(s); no focus
*No logical development; paragraphs not related to one another; don't develop a thesis
*No or inappropriately used cohesive devices
*Severe problems with word forms
*Vocabulary range extremely limited
*Frequent errors in word choice, often affecting intelligibility
*Frequent grammatical errors, affecting intelligibility
*Range of syntactic structures extremely limited; short simple sentences
*Paragraphing conventions often violated
*Punctuation inappropriate, inconsistent
*Capitalization inappropriate, inconsistent
*Frequent spelling errors, often affecting intelligibility

3-4 Minimal Competence
*Apparent introduction, body, conclusion
*Inexplicit thesis
*On topic; may not address all elements of the question
*Insufficient, sometimes irrelevant, detail within paragraph(s); no focus
*Inadequate development; paragraphs usually related, but do not adequately or appropriately develop a thesis
*Few cohesive devices, sometimes inappropriately used
*Many problems with word forms
*Very limited vocabulary range
*Frequent errors in word choice, occasionally affecting intelligibility
*Frequent grammatical errors, occasionally affecting intelligibility
*Range of syntactic structures very limited; some compound but no complex sentences
*Paragraphing conventions sometimes violated
*Punctuation sometimes inappropriate, inconsistent
*Capitalization sometimes inappropriate, inconsistent
*Frequent spelling errors, occasionally affecting intelligibility
5-6 Developing Competence
*Explicit introduction, body, conclusion; may lack unity
*Explicit but poorly formulated thesis
*Addresses all elements of question, but not adequately
*Insufficient paragraph development; inconsistent use of topic sentences; inadequate detail; paragraphs sometimes not focused
*Paragraphs related, but may not adequately or appropriately develop a thesis
*Rudimentary cohesive devices
*Occasional problems with word forms
*Limited vocabulary
*Some minor word choice errors; some register problems
*Occasional grammatical errors, rarely affecting intelligibility
*Syntactic structures somewhat varied; some complex sentences
*Paragraphing conventions observed consistently
*Punctuation sometimes inappropriate or inconsistent
*Capitalization appropriate and consistent
*Occasional spelling errors

7-8 Proficiency with Some Errors
*Clear introduction, body, conclusion; may not be perfectly unified
*Clear but weak thesis
*Addresses all elements of question; some parts may be slighted
*Clear paragraph development; topic sentences and supporting detail; clear focus
*Paragraphs related, but may not adequately or appropriately develop a thesis
*Some variety in cohesive devices
*Very few word form problems
*Fairly broad sub-technical vocabulary
*Very few word choice errors; register appropriate
*Very few grammatical errors
*Variety of syntactic structures; some complex sentences
*Paragraphing conventions observed consistently
*Punctuation appropriate and consistent
*Capitalization appropriate and consistent
*Infrequent spelling errors
9-10 Near Native Proficiency
*Clear, unified introduction, body, conclusion
*Clear, effective thesis
*Comprehensive response to the prompt
*Good paragraph development; topic sentences and ample detail; clear focus
*Paragraphs related; adequately and appropriately develop the thesis
*Variety of cohesive devices
*Correct word forms
*Broad, sophisticated vocabulary
*Hardly any word choice errors; register appropriate
*Almost no grammatical errors
*Syntactic structures varied; many complex sentences
*Paragraphing conventions used consistently
*Punctuation appropriate
*Capitalization appropriate
*Almost no spelling errors

ESL COMPOSITION PROFILE

TOPIC          STUDENT          DATE

COMPONENTS / CRITERIA

CONTENT
3 EXCELLENT TO VERY GOOD: well-reasoned thesis • related ideas • specific development (personal experience, illustration, examples, facts, opinions) • good use of description/comparison-contrast
2 GOOD: adequate reasoning • thesis partly developed • occasionally unrelated ideas
1 FAIR TO POOR: poor reasoning • unnecessary information • very little development
0 VERY POOR: irrelevant • no development • (or) not enough to evaluate

ORGANIZATION
3 EXCELLENT TO VERY GOOD: effective thesis • strong topic sentences • introductory and concluding sentences/paragraphs • use of transitions • organized
2 GOOD: clear topic sentences • no concluding sentences or paragraph • weak transitions • incomplete sequencing/organization
1 FAIR TO POOR: no topic sentence • lacks transitions • little or no sequencing/organization
0 VERY POOR: does not communicate one idea • no evidence of organization • (or) not enough to evaluate

VOCABULARY
3 EXCELLENT TO VERY GOOD: correct use of idioms/word forms (prefixes, suffixes, roots, compounds) in context • effective word choice • word meaning precise
2 GOOD: mostly effective and correct idioms/word forms/word choice in context • meaning clear
1 FAIR TO
POOR: frequent errors in idioms/word forms/word choice • some translation • meaning confused
0 VERY POOR: little knowledge of English vocabulary • mostly translation • (or) not enough to evaluate

LANGUAGE USE
3 EXCELLENT TO VERY GOOD: sentence variety • correct verb tenses • few errors in subject-verb agreement, number, word order/use, articles, pronouns, prepositions
2 GOOD: effective but simple constructions • mostly correct verb tenses • several errors in subject-verb agreement, number, word order/use, articles, pronouns, prepositions, but meaning clear
1 FAIR TO POOR: ineffective simple constructions • frequent errors in verb tense, subject-verb agreement, number, word order/use, articles, pronouns, prepositions

MECHANICS
3 EXCELLENT TO VERY GOOD: few errors in spelling, punctuation, capitalization, paragraphing
2 GOOD: occasional errors in spelling, punctuation, capitalization, paragraphing
1 FAIR TO POOR: frequent errors in spelling, punctuation, capitalization, paragraphing • handwriting unclear
0 VERY POOR: dominated by errors in spelling, punctuation, capitalization, paragraphing • illegible handwriting

Rating Scales for Evaluation of Rhetorical Abstraction

I. Claims (Major and minor claims)

Ia. Statement of Claims (Clarity and specificity of statement of major and minor claims)
0 Not identifiable
1 Unclear central idea or several confused ideas
2 One general central idea
3 One specific central idea

Ib. Statement of Direction (Statement of direction of change in major or minor claims)
0 No statement
1 Minimal statement
2 Adequate statement
3 Elaborated statement

II. Data (Facts, expert opinion, and personal experience/knowledge)

IIa.
Data Use (Extent of use of personal experience or "external" data to support minor claims)
0 No use of data
1 Minimal use of data
2 Adequate use of data
3 Extensive use of data

IIb. Data Type (Relative amount of personal experience and "external" data in terms of % of total data)
0 No external data; exclusive use of personal experience data
1 Relatively little external data; predominant use of personal experience data
2 Half of data is external; half of data is personal experience data
3 Predominant use of external data; relatively little personal experience data
4 Exclusive use of external data; no personal experience data

III. Warrants (Frequency of explicit statements of relationships between data and minor claims)

IIIa. Explicit Statement of Warrants
0 Warrants never used
1 Warrants sometimes used
2 Warrants often used
3 Warrants always used

IV. Analysis of Topic

IVa. Consideration of Causes (Extent of consideration of causes of problem(s))
0 No consideration
1 Minimal consideration
2 Adequate consideration
3 Extensive consideration

IVb. Examination of Premises and Issues (Extent of examination of nature and/or history of main issue or fundamental assumptions underlying a line of argumentation)
0 No examination
1 Minimal examination
2 Adequate examination
3 Extensive examination

Appendix C
Descriptive Statistics
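The table in this appendix reports four descriptive statistics (mean, standard deviation, kurtosis, skewness) for each observed variable. As a minimal sketch of how these statistics are computed (using plain moment-based estimators; SPSS, the package cited in the references, applies small-sample corrections to skewness and kurtosis, so its values would differ slightly from these):

```python
import statistics

def describe(xs):
    """Mean, sample SD, skewness, and excess kurtosis of a sample.

    Skewness and kurtosis here use the simple population-moment
    estimators; this is a sketch, not the corrected formulas a
    statistics package reports.
    """
    n = len(xs)
    mean = sum(xs) / n
    sd = statistics.stdev(xs)          # n - 1 in the denominator
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2 - 3.0      # excess kurtosis: normal -> 0
    return mean, sd, skewness, kurtosis
```

A symmetric, flat sample such as [1, 2, 3, 4, 5] gives zero skewness and negative excess kurtosis; the very large positive kurtosis and skewness values for variables such as AB4 below reflect scores massed near zero with a few high outliers.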
Table 1
Descriptives

Variable     Mean      SD     Kurtosis  Skewness
AB1          3.81     3.56      2.28      1.33
AB2          3.16     3.58      3.70      1.69
AB3          0.50     1.11     11.19      2.95
AB4          0.05     0.29     38.53      6.14
AB5          1.07     1.51      1.32      1.39
AB6          0.79     1.30      3.06      1.78
ABC          9.39     5.71      0.56      0.68
CT1          1.68     0.60     -0.29     -0.03
CT2          1.60     0.65     -0.33      0.16
CT3          1.60     0.70     -0.40      0.28
HR1          7.24     1.34     -0.66      0.23
HR2          7.01     1.18      0.42      0.37
HR3          7.33     1.16      0.25      0.18
IE1          1.48     1.92      1.60      1.40
IE2          4.90     3.91      0.59      0.93
IE3          0.54     1.22     11.18      2.95
IE4          0.23     0.70     14.64      3.56
IEC          7.13     4.59     -0.03      0.65
LU1          1.62     0.60     -0.56      0.22
LU2          1.54     0.68     -0.33      0.50
LU3          1.57     0.69     -0.44      0.52
MC1          1.96     0.54      0.75     -0.13
MC2          1.88     0.62     -0.12     -0.05
MC3          2.08     0.69     -0.06     -0.35
OG1          1.72     0.62     -0.25      0.01
OG2          1.67     0.65     -0.40      0.16
OG3          1.82     0.76     -0.48     -0.11
R11          2.14     0.72      2.39     -1.19
R12          2.26     0.75      0.96     -0.98
R21          1.51     0.64     -0.22     -0.26
R22          1.63     0.72     -0.10     -0.25
R31          0.85     0.68      0.48      0.54
R32          0.96     0.72      0.47      0.56
R41          2.34     1.56     -1.30     -0.54
R42          2.44     1.43     -1.12     -0.57
R51          0.50     0.96      2.40      1.82
R52          0.86     0.77     -0.08      0.60
R61          1.01     0.60      2.58      0.92
R62          1.08     0.64      2.68      1.00
R71          0.66     0.74      0.28      0.92
R72          0.78     0.80      0.41      0.89
TL1        297.46   101.88      3.20      1.18
TL2         28.54    11.03      3.81      1.28
TL3       1706.36   572.18      2.78      1.12
VC1          1.85     0.55      0.31     -0.16
VC2          1.72     0.66     -0.54      0.21
VC3          1.81     0.67     -0.61      0.15

Appendix D
Validation Statistics for Rhetorical Abstraction Scales
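Table 3 in this appendix reports variance components for persons (σ²p), raters (σ²r), and the person-by-rater interaction confounded with error (σ²pr,e), together with a generalizability coefficient for each scale. The table's footnote does not name the estimation algorithm, so as an assumption the sketch below uses the standard relative coefficient for a fully crossed persons × raters design, Eρ² = σ²p / (σ²p + σ²pr,e / nr), which reproduces several tabled values (for example, .60 / (.60 + .44/3) ≈ .80 for Statement of claims in the second phase, with three raters):

```python
def g_coefficient(var_p, var_pr_e, n_raters):
    """Relative generalizability coefficient E(rho^2) for a crossed
    persons x raters design: var_p / (var_p + var_pr_e / n_raters).

    Negative variance estimates (sampling artifacts, flagged ** in
    Table 3) are truncated to zero, a common G-theory convention.
    """
    var_p = max(var_p, 0.0)
    var_pr_e = max(var_pr_e, 0.0)
    denominator = var_p + var_pr_e / n_raters
    return var_p / denominator if denominator > 0 else 0.0

# Statement of claims, second phase, three raters:
print(round(g_coefficient(0.60, 0.44, 3), 2))
```

Not every row of the table reproduces exactly under this formula, which may reflect either the quality of the reproduction or a different estimation algorithm in the original analysis.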
Table 1
Descriptive Statistics: Second Phase (N = 31), Third Phase (N = 30)
(Entries are given as 2nd phase / 3rd phase; [?] marks a cell illegible in the reproduction.)

Scale: Range of scores (unused scores); Mean; SD
Statement of claims: [?] / (0) 1-3; 2.20 / 2.73; 1.02 / [?]
Development of the major claim: 0-2 (3) / 0-2 (3); 1.13 / 1.54; 0.70 / 0.93
Statement of change: 0-3 / 0-3; 1.05 / 1.80; 0.72 / 1.30
Statement of direction: 0-3 / 0-3; 1.47 / 1.80; 0.82 / 1.34
Logical entailment: 0-3 / (0) 1-3; 2.17 / 2.87; 1.18 / 0.93
Data Use: 0-3 / 0-3; [?] / 1.53; 0.72 / 1.38
Use of external data: 0-3 / 0-3; 1.45 / 0.62; [?] / 2.22
Use of personal experience data: 0-3 / 0-3; 1.25 / 2.31; 0.93 / 2.29
Explicit statement of warrants: 0-3 / [?]; 0.84 / 0.52; [?] / 1.45
Logical relation: [?] / (0) 1-3; 2.28 / 2.78; 1.15 / [?]
Immediate causes (2nd phase only): 0-3; [?]; [?]
Indirect causes (2nd phase only): 0-3; [?]; [?]
Causes (combined immediate and indirect causes): 0-3 / 0-3; 1.43 / [?]; 0.71 / [?]
Examination of issues: 0-3 / 0-2 (3); 1.08 / 1.21; 0.70 / 1.07
Evaluation of solutions: 0-2 (3) / 0-3; 0.08 / 0.43; 0.24 / 1.74
Examination of premises: 0-2 (3-4) / 0-4; 0.08 / 0.20; 0.30 / 2.00
Counterargument: [?] / 0-4; 0.23 / 0.41; 0.60 / 2.30

Table 2
Mean Score and Standard Deviation for Each Rater for Each Scale: Second Phase (N = 31), Third Phase (N = 30)
(Entries are mean/SD; columns are Raters 1-3, Phase 2 then Phase 3.)

Statement of claims: [?]/1.02, 2.52/1.02, 1.68/1.01; 2.67/0.76, 2.80/0.55, 2.73/0.64
Development of the major claim: 1.03/0.55, 1.40/0.85, 0.97/0.71; 1.30/0.53, 1.93/0.25, 1.40/0.50
Statement of change: 1.42/0.62, 1.68/0.79, 1.32/0.75; 1.63/0.67, 1.93/0.36, 1.83/0.46
Statement of direction: 1.45/0.72, 1.48/0.89, 1.48/0.85; 1.90/0.55, 1.90/0.40, 1.90/0.55
Logical entailment: 2.48/1.06, 2.
10/1.22, 1.94/1.26; 2.87/0.57, 2.80/0.61, 2.93/0.25
Data Use: 1.61/0.72, 1.87/0.62, 1.39/0.84; 1.50/0.57, 1.60/0.56, 1.50/0.57
Use of external data: 1.26/0.73, 1.68/0.98, 1.42/1.06; 0.90/0.76, 0.43/0.82, 0.53/0.90
Use of personal experience data: 1.35/0.95, 1.29/0.86, 1.10/0.98; 2.00/0.83, 2.47/0.94, 2.47/0.90
Explicit statement of warrants: 0.97/1.08, 0.90/0.65, 0.64/0.60; 0.33/0.66, 0.67/0.61, 0.57/0.57
Logical relation: 2.32/1.08, 2.35/1.14, 2.16/1.24; 2.70/0.70, 2.70/0.75, 2.93/0.36
Immediate causes (Phase 2 only): 0.61/0.72, 0.64/0.80, 0.61/0.84
Indirect causes (Phase 2 only): 1.26/0.58, 1.32/0.60, 1.71/0.94
Causes (combined immediate and indirect causes; Phase 3 only): 1.40/0.62, 1.20/0.66, 1.43/0.73
Examination of issues: 1.22/0.67, 1.19/0.65, 0.84/0.78; 1.40/0.50, 1.67/0.46, 1.07/0.45
Evaluation of solutions: 0.13/0.43, 0.00/0.00, [?]/0.30; 0.33/0.55, 0.43/0.68, 0.53/0.78
Examination of premises: 0.06/0.35, 0.00/0.00, 0.19/0.54; 0.13/0.73, 0.50/1.07, 0.23/0.77
Counterargument: 0.16/0.52, 0.16/0.37, 0.38/0.92; 0.17/0.65, 0.60/0.89, 0.47/1.07

Table 3
Variance Estimates and Generalizability Coefficient for Each Scale: 2nd Phase (N = 31) and 3rd Phase (N = 30)†
(Entries are 2nd phase / 3rd phase; percentages are of total variance; G = generalizability coefficient.)

Statement of claims: σ²p .60 (49%) / .16 (42%); σ²r .10 / -.01 (1%)**; σ²pr,e .44 / .23 (57%); G .80 / .69
Development of major claim: σ²p [?] (50%) / .04; σ²r .05 (9%) / .11 (36%); σ²pr,e .23 (41%) / .16 (50%); G .79 / .46
Statement of change: σ²p .24 (43%) / .15 (53%); σ²r .02 / .02 (7%); σ²pr,e .29 (52%) / [?] (40%); G .77 / .80
Statement of direction: σ²p .47 (68%) / .17 (68%); σ²r -.01 (1%)** / -.01 (1%)**; σ²pr,e .21 (31%) / .08 (31%); G .87 / .87
Logical entailment: σ²p .76 (52%) / .02 (7%)*; σ²r [?] (1%)** / [?]; σ²pr,e .64 (44%) / [?] (92%); G .78 / .18
Data Use: σ²p .32 (54%) / .
16 (48%); σ²r .05 (9%) / -.01 (0%)**; σ²pr,e .22 (37%) / .17; G .82 / .73
Use of external data: σ²p [?] (61%) / .48 (65%); σ²r .03 / .05 (7%); σ²pr,e .32 (35%) / .20 (27%); G .84 / .88
Use of personal experience data: σ²p .49 (55%) / .48 (56%); σ²r .01 / .06 (7%); σ²pr,e [?] (44%) / .32 (37%); G .79 / .82
Explicit statement of warrants: σ²p .17 (26%) / .16 (41%); σ²r .01 (2%) / .02 (6%); σ²pr,e .48 (72%) / [?] (53%); G .51 / .70
Logical relation: σ²p .61 / -.01 (4%)**; σ²r [?] / .01 (2%)*; σ²pr,e .72 (54%) / .28 (94%); G .72 / .00
Immediate causes (2nd phase only): σ²p .29 (46%); σ²r -.01 (2%)**; σ²pr,e .33 (52%); G .72
Indirect causes (2nd phase only): σ²p .28 (50%); σ²r .03 (9%); σ²pr,e .24 (41%); G .78
Consideration of causes (3rd phase only): σ²p .31 (67%); σ²r .01 (2%)*; σ²pr,e .14 (31%); G .86
Examination of issues: σ²p [?] (33%) / [?] (27%); σ²r .04 (7%) / .02 (10%); σ²pr,e .32 (60%) / .15 (63%); G .62 / .57
Evaluation of solutions: σ²p .01 (7%)* / .28 (61%); σ²r .00 (2%)* / .00; σ²pr,e .08 (91%) / .17 (38%); G .20 / .83
Explicit statement of premises: σ²p -.01 (3%)** / [?] (36%); σ²r .01 (3%)* / .02 (3%)*; σ²pr,e .15 (94%) / .48 (61%); G .00 / .64
Counterargument: σ²p .26 (60%) / .40 (58%); σ²r .01 (3%)* / .04 (5%); σ²pr,e .16 (37%) / .30 (37%); G .83 / .83

† Using [?] algorithm
* Statistically equivalent to 0 (standard error greater than variance estimate)
** Negative variance due to sampling; may be regarded as 0.0 variance

Table 4
Rater Measurement
(Columns: Rater; Total of Scores; Measure (Logits); Model Error; Fit Statistic.)

Phase 2:
Rater 1: 612, -0.04, .07, -2
Rater 2: 639, -0.17, .07, -1
Rater 3: 553, 0.21, .07, 2
Separation = 2.13, Reliability = .82; Fixed χ² = 16.68, df = 2, p < .003

Phase 3:
Rater 1: [?], 0.18, [?], 0
Rater 2: 696, -0.13, .07, 0
Rater 3: 679, -0.04, .07, 0
Separation = 1.42, Reliability = .67; Fixed χ² = 8.97, df = 2, p < .01

Facets model: rating scale for examinees, partial credit for scales, rating scale for raters.

Table 5
Significant (|Z-score| ≥ 2) Rater × Examinee Bias Estimates
(Columns: Z-score; Bias (Logits); Model Error.)

Phase 2:
Rater 1, Examinee 091: -2.13, -0.86, .40
Rater 1, Examinee 381: -2.25, -0.76, .34
Rater 3, Examinee 011: 2.04, 0.68, .33
Rater 3, Examinee 094: -2.57, -1.00, .39

Phase 3:
Rater 2, Examinee 031: 2.
10, 0.76, .36

Facets model: rating scale for examinees, partial credit for scales, rating scale for raters.

Table 6
Significant (|Z-score| ≥ 2) Rater × Scale Bias Estimates
(Columns: Z-score; Bias (Logits); Model Error. [?] marks a cell illegible in the reproduction.)

Phase 2:
Rater 1, Statement of change: -2.13, -0.60, .31
Rater 1, Statement of direction: -2.35, -0.68, .29
Rater 1, Data use: 2.05, 0.57, .28
Rater 2, Statement of direction: 2.55, 0.62, .24
Rater 2, Examination of issues: -2.45, -0.76, .31
Rater 3, Statement of claims: 2.00, 0.47, .23
Rater 3, Data use: -2.56, -0.68, .27

Phase 3:
Rater 1, Development of major claim: [?], [?], [?]
Rater 1, Use of external data: -2.90, -0.59, .20
Rater 1, Examination of issues: -2.78, -1.01, .36
Rater 2, Development of major claim: -3.16, -2.11, .67

Facets model: rating scale for examinees, partial credit for scales, rating scale for raters.

Appendix E
Correlation Matrix
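The matrix that follows mixes product-moment correlations (for continuous measures such as the text-length variables) with polychoric and polyserial correlations (for the ordinal rating-scale variables); the latter two require specialized estimation routines, typically PRELIS in LISREL-era structural equation modeling work. As a sketch of the product-moment portion only, using hypothetical rater scores in place of the study's data:

```python
import numpy as np

# Hypothetical holistic scores from three raters (columns, standing in
# for HR1-HR3) on five essays (rows); the study's actual data are the
# rated essays analyzed in the body of the dissertation.
scores = np.array([
    [7.0, 6.0, 7.0],
    [8.0, 7.0, 8.0],
    [5.0, 6.0, 5.0],
    [9.0, 8.0, 9.0],
    [6.0, 7.0, 6.0],
])

# rowvar=False: columns are variables, rows are observations.
r = np.corrcoef(scores, rowvar=False)
print(np.round(r, 3))
```

A polychoric correlation would instead be estimated by maximum likelihood under an assumed bivariate normal distribution for the latent continuous variables; np.corrcoef computes only the Pearson product-moment matrix.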
Table 1
Matrix of Product Moment, Polychoric and Polyserial Correlations Used for Analysis

        HR1     HR2     HR3     TL1     TL2     TL3
HR1    1.000
HR2    0.750   1.000
HR3    0.720   0.798   1.000
TL1    0.530   0.527   0.557   1.000
TL2    0.411   0.424   0.427   0.902   1.000
TL3    0.549   0.542   0.573   0.990   0.875   1.000
R11    0.129   0.119   0.079  -0.054  -0.112  -0.034
R12    0.181   0.139   0.145   0.106   0.019   0.122
R21    0.351   0.388   0.361   0.365   0.280   0.387
R22    0.304   0.334   0.280   0.309   0.210   0.335
R31    0.298   0.271   0.300   0.373   0.318   0.381
R32    0.237   0.247   0.287   0.341   0.280   0.351
R41   -0.123  -0.197  -0.089  -0.246  -0.285  -0.230
R42   -0.133  -0.199  -0.113  -0.227  -0.262  -0.208
R51   -0.002   0.092   0.053   0.062   0.043   0.060
R52    0.077   0.090   0.032   0.082   0.054   0.082
R61    0.090   0.166   0.178   0.239   0.176   0.271
R62    0.091   0.188   0.186   0.232   0.171   0.266
R71    0.249   0.287   0.293   0.279   0.258   0.300
R72    0.168   0.253   0.238   0.219   0.196   0.234
OG1    0.523   0.530   0.522   0.495   0.341   0.519
OG2    0.517   0.569   0.527   0.485   0.370   0.500
OG3    0.594   0.606   0.611   0.682   0.544   0.698
CT1    0.434   0.392   0.466   0.407   0.307   0.416
CT2    0.476   0.549   0.514   0.375   0.250   0.395
CT3    0.407   0.464   0.518   0.311   0.205   0.319
VC1    0.648   0.689   0.682   0.344   0.245   0.350
VC2    0.649   0.660   0.594   0.410   0.334   0.427
VC3    0.705   0.696   0.688   0.486   0.348   0.508
LU1    0.626   0.636   0.632   0.346   0.244   0.363
LU2    0.626   0.666   0.640   0.369   0.272   0.391
LU3    0.713   0.733   0.726   0.393   0.272   0.416
AB1    0.012  -0.036  -0.006  -0.070  -0.081  -0.057
AB2    0.277   0.218   0.181   0.167   0.151   0.169
AB3    0.232   0.169   0.176   0.089   0.060   0.102
AB4    0.092   0.014   0.034   0.026   0.040   0.025
AB5    0.065   0.103   0.075   0.004  -0.057   0.024
AB6    0.038   0.070   0.090   0.094   0.068   0.100
ABC    0.014   0.070   0.020   0.031   0.084   0.018
IE1    0.112   0.033   0.039  -0.031  -0.026  -0.022
IE2    0.011  -0.012   0.020   0.015   0.050   0.000
IE3    0.006  -0.004  -0.027   0.010   0.036   0.011
IE4    0.244   0.200   0.191   0.103   0.059   0.122
IEC    0.118   0.075   0.049   0.007   0.048   0.005
MC1    0.243   0.377   0.391   0.266   0.244   0.278
MC2    0.438   0.564   0.499   0.287   0.215   0.313
MC3    0.404   0.520   0.496   0.241   0.156   0.263
.1 8 1 0 .1 6 7 0 .1 5 1 0 .1 6 9 AB3 0 .2 3 2 0 .1 6 9 0 .1 7 6 0 .0 8 9 0 .0 6 0 0 .1 0 2 AB4 0 .0 9 2 0 .0 1 4 0 .0 3 4 0 .0 2 6 0 .0 4 0 0 .0 2 5 AB5 0 .0 6 5 0 .1 0 3 0 .0 7 5 0 .0 0 4 - 0 .0 5 7 0 .0 2 4 AB6 0 .0 3 8 0 .0 7 0 0 .0 9 0 0 .0 9 4 0 .0 6 8 0 .1 0 0 ABC 0 .0 1 4 0 .0 7 0 0 .0 2 0 0 .0 3 1 0 .0 8 4 0 .0 1 8 IE 1 0 .1 1 2 0 .0 3 3 0 .0 3 9 - 0 .0 3 1 - 0 .0 2 6 - 0 .0 2 2 IE 2 0 .0 1 1 -0 .0 1 2 0 .0 2 0 0 .0 1 5 0 .0 5 0 0 .0 0 0 XE3 0 .0 0 6 -0 .0 0 4 - 0 .0 2 7 0 .0 1 0 0 .0 3 6 0 .0 1 1 IE4 0 .2 4 4 0 .2 0 0 0 .1 9 1 0 .1 0 3 0 .0 5 9 0 .1 2 2 IEC 0 .1 1 8 0 .0 7 5 0 .0 4 9 0 .0 0 7 0 .0 4 8 0 .0 0 5 MCI 0 .2 4 3 0 .3 7 7 0 .3 9 1 0 .2 6 6 0 .2 4 4 0 .2 7 8 MC2 0 .4 3 8 0 .5 6 4 0 .4 9 9 0 .2 8 7 0 .2 1 5 0 .3 1 3 MC3 0 .4 0 4 0 .5 2 0 0 .4 9 6 0 .2 4 1 0 .1 5 6 0 .2 6 3 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 294 “ Table 1 -continued Matrix of Product Moment, Polycharic and Polyserial Correlations used for _________________________ Analysis_________________ _ _ _ _ _ _ R l l R12 R21 R22 R31 R32 R l l 1 .0 0 0 R12 0 .9 5 4 1 .0 0 0 R21 0 .4 8 8 0 .6 3 5 1 .0 0 0 R22 0 .6 3 7 0 .7 4 5 0 .9 4 7 1 .0 0 0 R31 - 0 .1 3 8 0 .0 4 4 0 .1 8 5 0 .1 5 6 1 .0 0 0 R32 - 0 .1 2 6 0 .0 6 2 0 .2 0 8 0 .1 6 4 0 .9 7 3 1 .0 0 0 R42 - 0 .0 7 4 - 0 .0 3 0 -0 .0 3 9 - 0 .0 6 2 -0 .0 8 4 0 .0 5 3 R51 0 .1 3 6 0 ,2 1 1 0 .3 1 4 0 .2 2 0 0 .1 8 5 0 .1 5 8 R52 0 .1 3 2 0 .2 1 7 0 .1 4 0 0 .1 0 7 0 .1 7 3 0 .1 4 6 R61 - 0 .0 5 5 0 .0 0 9 0 .0 1 2 0 .1 0 2 0 .2 4 3 0 .2 4 0 R62 - 0 .0 2 2 0 .0 3 3 0 .0 4 1 0 .1 1 5 0 .1 8 6 0 ,2 0 8 R71 0 .1 3 2 0 .1 5 7 0 .0 4 4 0 .1 1 9 0 .2 9 7 0 .3 3 1 R72 0 .2 1 3 0 .2 3 8 0 .1 0 6 0 .1 7 3 0 .2 3 0 0 .2 3 9 OQX 0 .2 9 7 0 .3 7 9 0 .5 2 8 0 .4 4 6 0 .2 0 9 0 .1 5 7 002 0 .2 6 9 0 .3 2 2 0 .5 2 9 0 .4 2 3 0 .0 6 6 0 .0 7 7 0G3 0 .1 2 3 0 .2 5 1 0 .5 3 7 0 .4 5 6 0 .3 1 9 0 .2 7 0 CT1 0 .2 2 6 0 .3 2 7 0 .4 4 8 0 .3 5 2 0 .1 3 3 0 .1 6 7 CT2 0 .1 8 0 0 .2 7 8 0 .3 7 5 0 .3 
4 7 0 .1 0 1 0 .0 8 6 CT3 0 .2 4 8 0 .3 3 6 0 .4 4 7 0 .4 0 0 0 .0 7 9 0 .1 1 7 VC1 0 .1 7 1 0 .1 9 6 0 .3 2 3 0 .2 9 7 0 .2 5 4 0 .2 0 0 VC2 0 .0 9 1 0 .1 1 6 0 .2 9 3 0 .2 4 0 0 .1 4 8 0 .1 5 5 VC3 0 .1 1 2 0 .1 7 1 0 .3 5 6 0 .3 2 4 0 .2 5 5 0 .2 2 9 LU1 0 .1 0 8 0 .1 5 4 0 .3 2 1 0 .2 8 6 0 .2 5 3 0 .2 1 3 LU2 0 .0 7 8 0 .0 6 5 0 .2 4 2 0 .1 9 2 0 .1 6 5 0 .1 5 7 UJ3 0 .0 7 2 0 .1 2 3 0 .3 7 0 0 .3 1 0 0 .2 2 0 0 .2 2 8 AB1 - 0 .0 2 0 0 .0 3 0 - 0 .0 0 1 0 .0 0 0 0 .0 8 9 0 .0 6 8 AB2 0 .0 6 7 0 .0 8 6 0 .2 0 9 0 .1 5 0 0 .0 4 6 - 0 .0 0 5 AB3 0 .0 4 1 0 .0 0 4 0 .1 7 9 0 .1 4 3 0 .0 9 0 0 .0 4 8 AB4 - 0 .0 5 0 - 0 .0 5 9 - 0 .0 1 3 0 .0 0 0 0 .0 3 0 0 .0 2 2 AB5 0 .0 5 1 0 .0 6 7 0 .0 5 2 0 .0 6 2 0 .0 7 5 0 .1 9 7 AB6 - 0 .0 9 8 -0 .0 6 2 0 .0 3 1 -0 .0 1 9 0 .0 7 8 0 .0 6 5 ABC - 0 .0 9 7 -0 .0 7 6 - 0 .0 2 5 -0 .0 5 4 0 .1 4 4 0 .0 9 8 IE 1 0 .0 5 3 0 .0 4 1 - 0 .0 0 3 0 .0 3 8 - 0 .0 8 2 - 0 .0 8 0 IB2 0 .0 2 4 0 .0 4 6 - 0 .0 0 5 -0 .0 0 5 - 0 .0 1 7 - 0 .0 6 5 IH3 - 0 .0 9 8 - 0 .1 1 5 - 0 .0 7 6 -0 .0 9 2 - 0 .0 1 6 - 0 .0 6 1 IE 4 0 .0 2 1 0 .0 7 0 0 .1 7 4 0 .1 2 5 0 .1 3 4 0 .1 1 2 IEC 0 .0 1 1 0 .0 1 0 - 0 .0 1 6 0 .0 0 6 -0 .0 1 7 - 0 .0 5 1 MCI 0 .1 3 3 0 .1 1 1 0 .1 7 9 0 .1 8 9 0 .0 8 8 0 .0 6 9 MC2 0 .1 7 6 0 .1 2 2 0 .1 8 2 0 .1 9 9 0 .1 3 4 0 .1 6 3 MC3 0 .1 8 1 0 .1 5 9 0 .2 1 6 0 .2 5 2 0 .0 8 3 0 .1 1 8 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
295 Table 1 -continued ’ ’ Matrix of Product Moment, Polychoric and Polyserial Correlations used for ____________________ Analysis ________________________ R41 R42 R51 R52 R61 R62 R41 1 .0 0 0 R42 0 .9 7 0 1 .0 0 0 R SI -0 .0 1 4 -0 .0 7 3 1 .0 0 0 R52 0 .0 3 0 0 .0 3 9 0 .7 6 9 1 .0 0 0 R61 0 .2 2 0 0 .2 4 2 -0 .0 4 5 0 .0 9 6 1 .0 0 0 R62 0 .2 1 5 0 .2 6 4 - 0 .0 4 5 0 .0 8 3 0 .9 8 6 1 .0 0 0 R71 -0 .1 0 7 - 0 .1 1 3 0 .0 7 6 0 .1 8 3 0 .3 1 2 0 .3 1 6 R72 -0 .1 4 2 - 0 .1 0 7 0 .0 7 3 0 .1 4 9 0 .2 5 9 0 .2 2 3 OQ1 -0 .1 1 9 - 0 .1 5 3 0 .2 1 3 0 .1 5 1 0 .1 3 4 0 .1 0 7 OQ2 - 0 .1 7 3 - 0 .1 9 4 0 .1 7 7 0 .1 0 6 0 .0 2 6 0 .0 1 0 0 0 3 -0 .0 8 8 - 0 .1 0 1 0 .2 4 2 0 .1 4 8 0 .2 0 6 0 .1 8 2 CT1 -0 .0 4 7 - 0 .0 5 0 0 .2 2 8 0 .1 1 1 0 .1 2 4 0 .1 3 5 CT2 -0 .0 7 5 - 0 .1 0 8 0 .1 2 3 0 .0 7 8 0 .1 0 0 0 .1 0 1 CTO - 0 .0 4 8 - 0 .0 5 8 0 .2 4 3 0 .0 6 9 0 .0 9 2 0 .1 1 5 VC1 -0 .1 0 9 -0 .1 2 4 -0 .0 3 9 - 0 .0 9 9 0 .1 0 1 0 .1 1 3 VC2 -0 .0 9 9 -0 .0 9 2 - 0 .0 5 5 - 0 .0 2 0 0 .0 4 8 0 .0 5 7 VC3 -0 .0 9 0 -0 .0 9 4 -0 .0 4 9 0 .0 1 9 0 .2 1 3 0 .2 0 3 X jU I -0 .0 3 0 - 0 .0 7 9 -0 .0 6 3 - 0 .0 2 7 0 .1 6 5 0 .1 4 4 LU2 - 0 .0 6 5 - 0 .0 8 7 -0 .0 7 0 -0 .0 4 1 0 .0 8 7 0 .1 1 8 LU3 - 0 .0 4 5 -0 .0 5 4 0 .0 0 5 0 .0 8 1 0 .1 6 7 0 .1 9 7 AB1 0 .0 2 6 0 .0 1 4 0 . 
1 1 1 0 .0 3 5 -0 .0 7 4 -0 .0 8 8 AB2 - 0 .0 0 9 -0 .0 1 3 - 0 .0 6 4 -0 .0 0 7 0 .0 0 2 • 0 .0 0 2 AB3 - 0 .0 5 3 -0 .0 3 3 0 .0 1 3 -0 .0 4 5 0 .0 3 2 0 .0 2 3 AB4 0 .0 0 3 -0 .0 3 3 -0 .1 0 0 -0 .0 0 9 0 .0 2 5 0 .0 0 4 AB5 0 .1 6 9 0 .1 7 6 -0 .0 3 2 0 .1 0 3 0 .2 2 2 0 .2 2 6 AB6 0 .0 3 6 0 .0 2 4 0 .0 1 6 - 0 .0 1 5 0 .0 6 4 0 .0 1 7 ABC 0 .0 6 9 0 .0 3 2 - 0 .0 9 1 -0 .0 9 4 0 .0 0 9 - 0 .0 2 5 IE 1 -0 .0 7 0 - 0 .0 6 4 -0 .0 1 4 -0 .0 2 3 -0 .1 6 4 -0 .1 5 1 IE 2 -0 .0 0 6 - 0 .0 2 ‘i 0 .0 3 2 -0 .0 4 9 -0 .0 0 4 -0 .0 3 4 IE 3 0 .0 2 1 - 0 .0 1 6 - 0 .0 1 2 0 .0 1 1 -0 .0 5 7 - 0 .0 4 7 IB 4 0 .0 4 6 0 .0 3 7 0 .0 1 3 0 .0 2 0 0 .0 4 1 0 .0 1 9 IBC -0 .0 3 0 - 0 .0 4 6 - 0 .0 4 5 -0 .0 7 2 0 .1 2 6 - 0 .1 3 7 MCI - 0 .1 0 1 - 0 .1 3 5 0 .0 2 2 0 .0 1 2 0 .2 5 5 0 .2 2 0 MC2 -0 .1 3 7 - 0 .1 1 4 0 .0 1 2 0 .0 6 8 0 .2 3 0 0 .2 2 5 MC3 -0 .0 7 7 - 0 .0 7 0 - 0 .0 5 7 0 .0 0 9 0 .2 6 7 0 .2 4 1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
Table 1 (continued)

       R71     R72     OG1     OG2     OG3     CT1
R71    1.000
R72    0.835   1.000
OG1    0.182   0.228   1.000
OG2    0.112   0.188   0.669   1.000
OG3    0.146   0.141   0.786   0.803   1.000
CT1    0.227   0.181   0.657   0.580   0.557   1.000
CT2    0.075   0.062   0.520   0.858   0.615   0.610
CT3    0.097   0.103   0.477   0.725   0.609   0.758
VC1    0.193   0.144   0.535   0.483   0.529   0.513
VC2    0.015   0.042   0.468   0.615   0.571   0.391
VC3    0.099   0.088   0.508   0.495   0.643   0.414
LU1    0.121   0.078   0.510   0.403   0.438   0.433
LU2    0.059   0.074   0.526   0.612   0.519   0.411
LU3    0.158   0.162   0.509   0.437   0.568   0.410
AB1   -0.044  -0.031  -0.021  -0.005  -0.003  -0.030
AB2   -0.063  -0.022   0.276   0.227   0.207   0.157
AB3    0.008  -0.015   0.040   0.138   0.104   0.097
AB4    0.036   0.052  -0.003   0.074   0.051   0.060
AB5    0.203   0.143  -0.022   0.001  -0.070   0.047
AB6    0.027  -0.005   0.033   0.003   0.075   0.045
ABC    0.017   0.005  -0.006   0.022   0.061  -0.087
IE1    0.071   0.052  -0.007   0.050   0.005   0.039
IE2   -0.092  -0.043   0.057  -0.020  -0.028  -0.016
IE3   -0.019  -0.070  -0.036   0.041   0.030  -0.009
IE4    0.003   0.008   0.165   0.164   0.143   0.128
IEC    0.057   0.028   0.006   0.070   0.044   0.026
MC1    0.212   0.186   0.318   0.317   0.232   0.261
MC2    0.276   0.170   0.362   0.531   0.409   0.315
MC3    0.257   0.181   0.304   0.374   0.375   0.280
Table 1 (continued)

       CT2     CT3     VC1     VC2     VC3     LU1
CT2    1.000
CT3    0.804   1.000
VC1    0.472   0.391   1.000
VC2    0.612   0.437   0.676   1.000
VC3    0.501   0.434   0.804   0.795   1.000
LU1    0.385   0.300   0.768   0.558   0.647   1.000
LU2    0.591   0.428   0.634   0.826   0.678   0.666
LU3    0.425   0.446   0.723   0.656   0.810   0.778
AB1   -0.008  -0.033   0.102  -0.015  -0.003   0.002
AB2    0.186   0.084   0.301   0.242   0.213   0.250
AB3    0.141   0.119   0.092   0.135   0.114   0.086
AB4    0.109   0.085   0.007   0.024   0.002   0.090
AB5    0.055   0.076  -0.033   0.062   0.023   0.025
AB6    0.044   0.037   0.255   0.087   0.113   0.020
ABC   -0.015  -0.051   0.140   0.102   0.022   0.039
IE1    0.022   0.061   0.002   0.013   0.026   0.046
IE2   -0.001   0.028   0.034   0.034   0.093   0.027
IE3    0.030   0.085   0.166   0.017   0.015   0.010
IE4    0.177   0.091   0.310   0.218   0.198   0.193
IEC    0.050   0.081   0.084   0.102   0.079   0.066
MC1    0.248   0.276   0.396   0.282   0.267   0.460
MC2    0.576   0.374   0.483   0.623   0.455   0.460
MC3    0.386   0.353   0.494   0.491   0.452   0.454

Table 1 (continued)

       LU2     LU3     AB1     AB2     AB3     AB4
LU2    1.000
LU3    0.781   1.000
AB1   -0.043  -0.076   1.000
AB2    0.232   0.239   0.019   1.000
AB3    0.110   0.158   0.034   0.091   1.000
AB4    0.049   0.002  -0.056   0.016  -0.056   1.000
AB5    0.004   0.083  -0.061   0.009   0.019  -0.027
AB6    0.047   0.107   0.001   0.021   0.096   0.031
ABC    0.037  -0.009  -0.034  -0.023  -0.070   0.021
IE1    0.1-0   0.059   0.044   0.028  -0.009   0.110
IE2    0.094   0.021  -0.057   0.200   0.167  -0.035
IE3    0.031  -0.007  -0.080  -0.024  -0.035   0.132
IE4    0.164   0.184   0.618   0.664   0.302   0.014
IEC    0.155   0.052  -0.033   0.073   0.007   0.106
MC1    0.335   0.389   0.004   0.167   0.127   0.052
MC2    0.642   0.473  -0.052   0.192   0.081   0.027
MC3    0.468   0.553  -0.122   0.167   0.100   0.015

Table 1 (continued)

       AB5     AB6     ABC     IE1     IE2     IE3
AB5    1.000
AB6    0.081   1.000
ABC    0.101   0.058   1.000
IE1    0.023   0.003  -0.009   1.000
IE2   -0.069   0.068   0.005   0.028   1.000
IE3    0.038   0.087   0.118  -0.016   0.015   1.000
IE4    0.248   0.278  -0.041   0.054   0.111  -0.034
IEC    0.048   0.053   0.410   0.831   0.292   0.197
MC1    0.112   0.081   0.143  -0.045   0.064   0.008
MC2    0.132   0.080   0.114   0.064   0.051   0.026
MC3    0.229   0.079   0.092  -0.072  -0.024  -0.026

Table 1 (continued)

       IE4     IEC     MC1     MC2     MC3
IE4    1.000
IEC    0.056   1.000
MC1    0.165   0.036   1.000
MC2    0.161   0.133   0.572   1.000
MC3    0.119  -0.031   0.705   0.755   1.000

Appendix F
Sample Texts
Essay #008
[Examinations of premises/issues have been marked in bold print]

Eventhough homelessness is becoming a major problem around the world due to slowing economies, rising costs of living and changes in the types of skills demanded of employees, however, all the causes above are mainly derived from the idea of capitalism. In order to solve the problem of homelessness, the government itself has to change the economy system form capitalist to socialist

First of all, how possibly that capitalism system is the main reason to homelessness? In order for anyone to answer this question, let's we ourselves ask another question: who are in charge of this system and what do they beneficial from it? Firstly, capitalism is running big businessmen whose are hungry to richness and fortune. They can be considered the careless people in the world because all they want to do is bring more profit for themselves. For example, if a certain corporation is decided to move its branches or factories from Los Angeles to Mexico in order to enrich the company with profitability then it will increase the rate of unemployment in America. Moreover, it also will create the changes in the types of skills demanded of employees for those workers who are wanted to work at another field. Since they are unable to find jobs to pay for their bills, sooner or later, they will end up in the street like others previous homelessness.

However, there is a solution to this problem. In order to decrease the rate of homelessness, any government has to change the economy system from capitalism to socialism. Socialism idea is the most qualifiable equality for everyone in turn of capital but it also prevents businessmen away from profit themselves.
Essay #277
[Statements of direction have been marked in bold print]

One of the resources more important for life is the water and currently the world is seeing that the supply of water in the next years will decrease owing to mainly population growth, pollution and overuse. Therefore in many places are looking for a solution. In general there are three forms to obtain water: desalinization of water, recycling and use of other sources of potable water.

3/4 of the earth is water but it is salted. Therefore for using this water we need to extract the water, because it can be used in its natural way including for farming, and household. Besides to date, there is not a efficient and cheap form to extract the water, and it will be so difficult to find a cheap method.

Other solution is to recycle the water in general in many countries are doing with excellent result The water passes many filters until it is almost clean. This water is not potable but it can be used in farm irrigation, in the industry and in the house (for washing, and in the bathroom. In this way, it is cycle where the water is used then recycle and again used and so on, and the lost of water is minimum.

Here the problem is that the population grow very fast, the water that already exist will not be enough and then will be neccesary so found more water until reach a optimum level of water. But maybe it will have a moment when there will be not possible to maintain it with the currently sources of water. With the recycling system will not have have problem with overuse because all the water used is recolected and used again, the only problem is with potable water. But there are still sources of potable water that are not used like the poles of the earth. The poles of the Earth are the biggest storage of potable water.
The problem here is the distribution of this water, how to transport this water to the places where it is needed, in Los Angeles some people are thinking in way of carrying icebergs. One way is putting turbines in one iceberg and bring it until L.A.

Essay #895
[Markers of warrants have been marked in bold print]

Everybody will die without water. As we have seen, U.S.A. is facing a very serious problem, which is lack at water supply. As a part of the population living in U.S.A., we should be aware of the dangerous at this situation and to help out to solve the problem. This could be very hard for everybody to do so, but we should work this out for our own future.

Good solutions are very important in solving the water supply problem. We have already seen many facts of the water problems. For example, the money we use to produce water is quickly increasing. When water supply become less and less, we will reach to the point that we cannot afford to buy water to maintain our life. Therefore, we must have some effictive solution by now.

The first one come to our mind is to cut down water use. Which we are already doing it right now. Household water use is the field we should work on because it is 1/6 of all water use in the nation. Therefore, if we can cut it down, the problem is able to solve in the short-run. The reason I said it can only solve the problem in the short run is because no matter how successful we are, we still cannot use this 1/6 of the water use to cover other 5/6 of the water use. Therefore, it is not a good long-term solution to our water problem.

Ocean is our best solution, if we are able to transmit sea water into drinking water, our problem will no longer exits. Because sea-water will never finish. Therefore, the thing we should work on is the new technology. No only transmitting sea water into drinking water.
We can also find a way to collect water when it rain. Because most of the rain water is waste and if we can collect it more effectively, we will increase out water input greatly.

Why is diamond so much more expensived than water? We cannot live without water, but we can live no problem without diamond and why is that true in the different of value? The answer is the amoung quanties of both we can get. Since diamond is so rare, therefore it cost alot; however, if one day the amoung of water we can get is equal to the amoung of diamond we are getting now, everybody will die for water. Of course, none of us want this to happen. Therefore, we should stand up with our Government and our society as a whole to solve this problem. Everyone of use should involve in this. Do not wait until it is too late!
Asset Metadata
Creator Weasenforth, Donald Lester (author) 
Core Title Rhetorical abstraction as a facet of expected response: A structural equation modeling analysis 
Contributor Digitized by ProQuest (provenance) 
Degree Doctor of Philosophy 
Degree Program Linguistics 
Publisher University of Southern California (original), University of Southern California. Libraries (digital) 
Tag education, tests and measurements; language, linguistics; language, rhetoric and composition; OAI-PMH Harvest 
Language English
Permanent Link (DOI) https://doi.org/10.25549/usctheses-c17-479242 
Unique identifier UC11351896 
Identifier 9617003.pdf (filename),usctheses-c17-479242 (legacy record id) 
Legacy Identifier 9617003-0.pdf 
Dmrecord 479242 
Document Type Dissertation 
Rights Weasenforth, Donald Lester 
Type texts
Source University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection) 
Access Conditions The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the au... 
Repository Name University of Southern California Digital Library
Repository Location USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA