Intelligence test development with special reference to a test for use in Iraq
INTELLIGENCE TEST DEVELOPMENT WITH SPECIAL REFERENCE TO A TEST FOR USE IN IRAQ

A Dissertation
Presented to the Faculty of the School of Education
the University of Southern California

In Partial Fulfillment of the Requirements for the Degree
Doctor of Education

by
Abdul Jalil Alzobaie
July 1954

UMI Number: DP25903
All rights reserved

INFORMATION TO ALL USERS
The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion.

Published by ProQuest LLC (2014). Copyright in the Dissertation held by the Author. Microform Edition © ProQuest LLC. All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code.

ProQuest LLC, 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106-1346

This dissertation, written under the direction of the Chairman of the candidate's Guidance Committee and approved by all members of the Committee, has been presented to and accepted by the Faculty of the School of Education in partial fulfillment of the requirements for the degree of Doctor of Education.

TABLE OF CONTENTS

CHAPTER

I. THE PROBLEM AND DEFINITIONS OF TERMS USED
    The problem
        Statement of the problem
        Importance of the study
    Brief statement of procedure
    Definition of terms used
        Intelligence
        Culture
        Social status
        Socio-economic
        Mental test
        Age scale
        Point scale
        Validity
        Reliability
        Iraq
    Organization of the dissertation

II. EARLY HISTORY OF MENTAL MEASUREMENT
    Beginning of experimental psychology
    Early beginning of measurement
    The work of Binet
    Summary and evaluation

III. DEVELOPMENT OF INDIVIDUAL INTELLIGENCE TESTS (VERBAL TYPE)
    Extension of Binet's work: the Stanford-Binet
    Modification of Binet's practices: point scales
    Modification of Binet's practices: the work of Kuhlmann
    Far-reaching revision of Binet's practices: the work of Wechsler
    Complete departure from Binet's practices: the CAVD (Thorndike and others)
    Summary and evaluation

IV. DEVELOPMENT OF INDIVIDUAL INTELLIGENCE TESTS (NON-VERBAL TYPE)
    Performance tests
        Early development of performance tests
        What performance tests really are
        Representative scales of performance type
        Recent use of performance tests
        Summary and evaluation of performance tests
    Tests of early development
        The nature of tests of early development
        Representative tests of early development
        Summary and evaluation of tests of early development

V. DEVELOPMENT OF GROUP INTELLIGENCE TESTS (VERBAL AND NON-VERBAL TYPE)
    Early development of group intelligence tests
    The nature of group intelligence tests
    Trends in developing group intelligence tests
        The Army tests (Alpha and Beta) and their chief revisions
        A departure from the Army practice: the Otis Self-Administering Tests of Mental Ability
        Far departure from the Army practice: the CAVD (Thorndike and others)
        Complete departure from the Army practice: California Test of Mental Maturity
    Summary and evaluation

VI. DEVELOPMENT OF INTELLIGENCE TESTS BASED ON FACTOR ANALYSIS
    Spearman's two-factor theory
    Bi-factor theory
    Multiple-factor theory
    Primary mental abilities: Thurstone's investigations
    Representative tests of factor analysis
    Summary and evaluation

VII. CULTURAL FACTORS IN CURRENT INTELLIGENCE TESTS: STUDIES DEALING WITH I.Q. OR TOTAL SCORE
    I.Q. and socio-economic factors
    I.Q. and family relationship
    I.Q. and natio-racial differences
    Factors which may contribute to status differences
    Summary and evaluation

VIII. CULTURAL FACTORS IN CURRENT INTELLIGENCE TESTS: STUDIES DEALING WITH ANALYSIS OF TEST ITEMS
    Binet
    Stern and associates
    Weintrob and Weintrob
    Yerkes and associates
    Bridges and Coler
    English
    Pressey and Ralston
    L. W. Pressey
    Burt
    Stoke
    Long
    Saltzman
    Clarke
    Murray
    Eells
    Driggs
    Summary and evaluation

IX. DEVELOPMENT OF CULTURE-FREE OR CULTURE-FAIR INTELLIGENCE TESTS
    Goodenough Draw-a-Man Test
    The Leiter International Performance Scale
    The Cattell Culture-Free Test
    The work of Chicago group
    Summary and evaluation

X. COMPARISON OF THE STANFORD-BINET WITH ITS ADAPTATION IN OTHER CULTURES
    The Indian form
    The South African form
    The Egyptian form
    The Mexican form
    The Swedish form
    An investigation of certain aspects of Bantu intelligence
    Summary and evaluation

XI. SUGGESTIONS FOR CONSTRUCTION OF AN INTELLIGENCE TEST FOR IRAQ
    Difficulties which face any intelligence test builder in Iraq
    General principles of importance in the choice and arrangement of test's items
    Analysis of the Wechsler Intelligence Scale for Children and the Stanford-Binet
    Summary and evaluation

XII. SUMMARY
    The problem and procedure
    Findings and conclusions
    Recommendations

BIBLIOGRAPHY

LIST OF TABLES

TABLE

I. Representative Tests Derived from the Binet Scales
II. Summary of Research Studies Dealing with Child's I.Q. and Parent's Occupation when Results Were Reported in Correlation Form
III. Summary of Research Studies Dealing with Child's I.Q. and Parent's Occupation when Results Were Reported in Terms of Mean or Median I.Q.
IV. Summary of Research Studies Dealing with Child's I.Q. and Parent's Occupation when Results Reported Were Not in Terms of Mean or Median I.Q.
V. Summary of Research Studies Dealing with Child's I.Q. and General Social Status when Results Were Reported in Correlation Form
VI. Summary of Research Studies Dealing with Child's I.Q. and General Social Status when Results Were Reported in Terms of Mean or Median I.Q.
VII. Summary of Research Studies Dealing with Decline of I.Q. of Isolated Group Children with Advancing Age
VIII. Summary of Research Studies Dealing with Intelligence of Rural Children
IX. Correlations Between I.Q.'s and Various Factors for Foster Children and "Own" Children
X. Correlation for Various Traits of Three Groups of Twin Pairs
XI. Summary of Research Studies Dealing with the Intelligence of Negro Children
XII. Summary of Research Studies Dealing with the Intelligence of American Indian Children
XIII. Summary of Research Studies Dealing with the Intelligence of Spanish Speaking Children
XIV. Percentage of Tests Passed in Certain Age Level of Two Status Groups
XV. The Percentage of Children from Each Occupation Group Who Score above the Median for their Age (for the Total Group), on Each Test
XVI. The Percentage of Children in Each Occupational Group Scoring above the Median for Their Age, on Each of the Four Tests
XVII. Comparison of Group I and Group II by Subtests of the Kuhlmann-Anderson Intelligence Test
XVIII. The Total Percentage of Each of Three Groups Passing Each Test
XIX. Revised Stanford-Binet Scale L Items Showing Either Strong Tendencies or No Tendencies to Differentiate Negro and White Groups
XX. Differences in Degrees of Discrimination at the Means of Tests and Their Subtests between Higher and Lower Ranking Occupations
XXI. Comparison between the 1916 Stanford-Binet and its Adaptations
XXII. Comparison between the 1937 Stanford-Binet and its Swedish Adaptation
XXIII. Analysis of Committee Responses to Stanford-Binet Items

CHAPTER I

THE PROBLEM AND DEFINITIONS OF TERMS USED

Mental testing during the last fifty years has passed through many stages of development.
It has been one of the most challenging research areas in psychology and education. It has aroused a great deal of interest and activity. This is illustrated by the large number of mental tests and other publications concerning intelligence testing. It is unnecessary to point out the important role which intelligence tests have come to play in modern education in the United States. They are used widely as a basis for admission to institutions of higher learning, for assigning pupils to class groups, for advising students with respect to selection of courses and vocations, for determining which students should be excluded from regular school work, and for educational research.

Intelligence tests are also used to compare the mental ability of racial and cultural-social groups. The differences found between these groups have been attributed to differences in native ability by some psychologists, especially the early ones. However, in the recent history of mental testing, there has been an increasing interest in the importance of cultural or environmental factors. Many psychologists, though they do not deny the role of heredity, believe that any interpretation of intelligence test results must take into consideration the way in which cultural factors enter into the determination of test performance. It is hoped that this study will deal with the development of intelligence tests and with cultural factors in intelligence test performance.

I. THE PROBLEM

Statement of the problem. The present study is designed to summarize the trends in the construction of intelligence tests since the early stages of the movement, with an attempt to show the influence of culture on test items, and the implications of that influence for the construction of an intelligence test in Iraq. The study will attempt to answer the following questions:

1. What have been the trends in developing intelligence tests since the beginning of the movement to the present time?

2. What are the present trends in the design and construction of intelligence tests?

3. What evidences of the influences of culture on subtests and individual items of commonly used intelligence tests have been found? What has been done to decrease cultural bias on test items?

4. How have the subtests on the Stanford-Binet been adapted for use in cultures other than the American culture?

5. What are the difficulties facing a constructor of intelligence tests in Iraq, and what are some of the suggestions for devising an intelligence test for this country?

Importance of the problem. It is hardly necessary to point out the important role which intelligence tests have come to play in modern educational procedures. Intelligence tests have certainly come to play a highly important part in an increasing number of school situations in western countries. It is regrettable that until now no work in this field has been done in Iraq. This lack of interest in developing intelligence tests cannot be attributed to any one reason.

The problem was developed from an interest in the importance of intelligence testing in the American schools. This has led to this study, which is expressed in terms of the five major questions posed above under the statement of the problem.

It is hoped that the investigation of the development of intelligence tests in the United States, and the study of the amount and nature of various types of cultural factors in existing intelligence tests, may be of some help in developing an intelligence test in Iraq. It is expected that this study may help in the selection of items from many of the better known tests that would be useful and that, with experimentation to test their validity, could be worked into a particular and usable intelligence test for Iraq.

Further, it is thought possible that this study may furnish background and a springboard to any future investigator who may wish to devise such a test.
It may help him to avoid a few of the major pitfalls to which he might otherwise fall victim.

II. BRIEF STATEMENT OF PROCEDURE

The procedure used in this study consists mostly of reviewing the related research concerning the development of intelligence tests since the time of Binet, and the influence of cultural factors upon intelligence test performance. In addition, letters were sent to the Ministries of Education of many countries, stating the purpose of the study and asking for a copy of the Stanford-Binet in their native tongue. Only a few countries sent the requested copies. Others sent different tests which were not useful. Still others replied, explaining that the translation was in progress and had not yet been completed. A comparison between the 1916 Stanford-Binet and its adaptations from India, Mexico, South Africa, and Egypt was made on the basis of age assignments and the content of the individual items. Then a similar comparison was made between the 1937 Stanford-Binet and its Swedish adaptation on the basis of age assignments, item content, and the score points assigned.

Finally, a committee of five Iraqi graduate students studying at the University of Southern California was chosen to help in analyzing each item on the Wechsler Intelligence Scale for Children and the Stanford-Binet. Each member was contacted individually and was asked to give his opinion as to whether or not the item, if translated, would fit Iraqi social and cultural conditions.

III. DEFINITION OF TERMS USED

Terms in this study which call for clarification include "intelligence," "culture," "social status," "socio-economic," "mental test," "age scale," "point scale," "validity," "reliability," and "Iraq."

Intelligence. Despite the thought and research which have been concerned with mental testing, there is today no generally accepted definition of intelligence.
It has been called, among other things, the ability to learn, the ability to do abstract thinking, the ability to profit from experience, the ability to adjust to one's environment, the ability to comprehend, the ability to perceive, and a composite of abilities. This diversity of definitions is a result of (1) limited information, resulting from the difficulty of doing research on the nature of mental processes, and (2) the complexity of man's mental behavior.

It is not the purpose of this study to arrive at a new definition of intelligence, nor to determine that any one definition or concept of intelligence is right or better than any other. However, the concern of this study is "test intelligence," which refers to the score an individual makes on an intelligence test and which may be an estimate of the genetic and developmental factors which have influenced his mental ability.

Culture. For the purpose of this study, the term culture will be used in its widest sense, to include not only the very striking contrasts in "way of living" with which the ethnologist deals, but also the minor variations within one culture. Both types of differences in culture may determine to a large extent how a subject will react to any test situation.

Social status. Previous research in this field has defined "social status" in various ways; frequently it has been equated with the occupational status of the father. Recent advances in the fields of sociology and social anthropology provide the basis for what is probably a more useful conception of social status than the one based upon occupation, or upon any other single factor. Warner has developed a technique for measuring the social status of the individual. It is based on occupation, source of income, type of home, and dwelling area.
Thus, social class is defined as "two or more orders of people who are believed to be, and are accordingly ranked by the members of the community, in socially superior and inferior positions."¹

¹ W. Lloyd Warner and Paul S. Lunt, The Social Life of a Modern Community (Yankee City Series, Vol. 1; New Haven: Yale University Press, 1941), p. 82.

Socio-economic. The term "socio-economic" refers to those combinations of social and economic factors which designate the status level of individuals.

Mental test. A standardized task or series of tasks used for the measurement or appraisal of some specialized aspect of ability. Unless otherwise designated, the term is generally understood to have reference to general intelligence.

Age scale. A scale in which the items are arranged in groups according to the age at which a certain proportion of children are able to pass them. Also known as a year scale.

Point scale. A scale in which the items are arranged in serial order, usually on the basis of difficulty, and the score is expressed in "points" according to the weighted or unweighted sum of items passed.

Validity. In mental measurement the term is defined as the degree to which a test measures what it is designed to measure.

Reliability. The term refers to the stability of a given measure on repeated applications or, as it is sometimes put, the extent to which a test is consistent in measuring whatever it does measure.

Iraq. Iraq is a modern name for the country of Mesopotamia, bounded on the north by Turkey, on the south by the Persian Gulf and Saudi Arabia, on the east by Iran, and on the west by Syria and Trans-Jordan. The government is a constitutional government with a hereditary monarchy and representatives. Most of the people are Arabs, with several minority groups. The official language is Arabic and the religion of the majority of the people is Islam.

IV. ORGANIZATION OF THE DISSERTATION

This chapter has introduced the study by discussing the nature of the problem and stating its importance, with a brief statement of the procedure and definitions of some of the terms used.

Chapter II reviews the literature on the early history of mental measurement.

Chapter III discusses the trends and directions in the development of individual tests of general mental ability of the verbal type, while Chapter IV discusses the non-verbal type.

Chapter V describes the development of group intelligence tests of both the verbal and non-verbal type, while Chapter VI deals with factor-analysis intelligence tests.

Chapters VII and VIII discuss the cultural factors in current intelligence tests. The former deals with the studies concerning the I.Q. or total score, while the latter deals with studies of the analysis of test items.

The development of so-called "culture-free" or "culture-fair" tests is discussed in Chapter IX.

Chapter X makes a comparison of the Stanford-Binet with its adaptations in other cultures.

Suggestions for construction of an intelligence test for Iraq may be found in Chapter XI.

Chapter XII contains the summary and conclusions.

CHAPTER II

EARLY HISTORY OF MENTAL MEASUREMENT

Although the field of mental measurement, as known today, is of recent growth, it is, nevertheless, interesting to trace its early history and to find some of the causes which led to its development.

Tests and measurements of one kind or another have faced man since his beginning and have always played an important part in his history. Although from ancient times, as indicated by his written records, man has used tests for many purposes, there is no evidence of a scientific approach to the techniques of human measurement until near the beginning of the present century.
Scientific testing and measurement were forced to wait until the establishment of an experimental and scientific psychology, and such a psychology was slow in coming, failing to appear until twenty years before the end of the nineteenth century.¹

¹ Clark L. Hull, Aptitude Testing (New York: World Book Company, 1936), p. 45.

Beginning of experimental psychology. It is unquestionably impossible to identify definitely the first person who began the objective testing of the mental processes of the human individual. It is possible, however, to state that the roots of experimental psychology may be traced to the activities of scientists who were working in Germany, England, France, and the United States in the latter part of the nineteenth century.

Fechner, in Germany, took the decisive step in combining experimental methods with mathematical analysis,² and the work of experimental psychology in Germany began in earnest when Wilhelm Wundt established, in 1879, the first psychological laboratory to be recognized and used as such in the world. This laboratory was at Leipzig. Here were trained such men as Kraepelin, Lehmann, Külpe, Meumann, and the Americans Hall, Cattell, Scripture, Angell, Titchener, Witmer, Warren, Stratton, Judd,³ and many others who have had great influence upon psychology in general and, to varying degrees, upon testing as a psychological and scientific technique. The Germans and those most influenced by them were interested in measuring visual and auditory sensation and reaction time, and in experimentation in the fields of psychophysics and association. In their work, however, they were more interested in the way in which men resembled each other than in their differences.⁴

² Andrew T. Wylie, "A Brief History of Mental Tests," Teachers College Record, 23:19-33, January, 1922.

³ Edwin G. Boring, A History of Experimental Psychology (New York: The Century Company, 1929), pp. 413-

⁴ Anne Anastasi, Differential Psychology (New York: The Macmillan Company, 1937), p. 11.

In England, Sir Francis Galton, a widely known follower of Darwin, first attempted to apply the evolutionary principle of variation, selection, and adaptation to individuals. In 1869 he published a book dealing with the problems of heredity and in particular with hereditary genius.⁵

Galton⁶ made real headway so far as methods of making discriminations, distributions, and correlations were concerned. He was instrumental in building up a tool of analysis, but he applied it to what the ruling psychologists proposed, namely, the sensory processes. It was plausible to him that acuities in vision or hearing or weight discrimination should indicate differences in mentality. Because of his satisfaction with the evidence from such processes, his idea of studying widely spaced groups did not yield fruitful results in mental testing.

⁵ Francis Galton, Hereditary Genius: An Inquiry Into Its Laws and Consequences (London: Macmillan and Company, Ltd., 1914), 37$ pp.

⁶ Francis Galton, Inquiries Into Human Faculty and Its Development (London: Macmillan and Company, 1883), 387 pp.

He also began the use of "free association" tests, a technique which was subsequently applied by Wundt. He also discovered wide variations in mental imagery through the use of questionnaire methods.

In France, Alfred Binet was outstanding in influence and accomplishment among his contemporaries. In 1896 he proposed a series of psychological tests. In 1905 he published a scale for the measurement of intelligence, and by 1911 this had been revised two times. This scale was the direct ancestor of the majority of individual tests to be derived in the years to follow in many countries. Binet had designed this work at the request of the school authorities of Paris, with the idea in mind of classifying children in school according to their abilities.⁷

⁷ Kimball Young, "The History of Mental Testing," The Pedagogical Seminary, 31:1-48, March, 1924.

In the United States, a young country, experimental psychology was also developing. Many students went to Germany to study in Wundt's experimental laboratory. Some went to England, others to France. Thus it was that the Germans influenced American psychologists through such men as Cattell, and the French through Goddard and Terman, who accepted Binet's scale to use in their own work in the United States. Thorndike, and later Thurstone, had great influence on the development of statistical procedures in the United States. Many of the American students who returned from abroad were instrumental in having departments of psychology accepted and experimental laboratories introduced into the universities where they worked.

The first psychological laboratory was established by William James at Harvard.⁸ It was upon this foundation of the experimental laboratory that testing was developed in the United States. Testing was developed in this country much more extensively and quickly than in any other country in the world.

⁸ Boring, op. cit., p. 496.

Early beginning of measurement. As has been shown, modern psychology and education owe a large debt to the early experimental psychologists for their efforts to discover general laws or principles of mental activity and how people are alike. Attempting to discover similarities among human beings was their primary purpose, and it was the intense desire to discover these similarities that led to increased efficiency in methods of mental measurement and the accelerated application of experimental procedures. It was later that experimenters became interested in how people differ rather than how they are alike.

Probably the first formal approach in the development and use of mental tests was made by Oehrn⁹ in Germany in 1889.
These tests were built upon a time factor in two ways: first, from the standpoint of time taken per unit of work, and, second, the amount of work done per unit of time. The tests were grouped under four main heads: (1) a study of perceptual processes; (2) memory; (3) the association processes; and (4) motor functions.

⁹ Paul L. Boynton, Intelligence: Its Manifestations and Measurement (New York: D. Appleton and Company, 1933), p. T3T.

In the United States, Rice developed a uniform spelling test¹⁰ which he administered to 33,000 pupils in different cities. His work immediately attracted attention, but, as has been true in the case of many new scientific developments, it attracted many ardent opponents. The report of Rice's results in the Department of Superintendents in 1897 was laughed at by most of the school men in attendance.

¹⁰ Joseph M. Rice, "The Futility of the Spelling Grind," Forum, 23:163-172, 409-419, April, June, 1897.

Galton, who has already been mentioned because of his contributions, particularly his discovery of a method of correlation, was an early influence on the scientific measurement of human beings.

It is generally assumed that the first individual to definitely use the expression "mental test" was the American psychologist Cattell¹¹ in his 1890 article in Mind. Equally important is the fact that the article goes on to describe a series of tests which he was actually using with his students at the University of Pennsylvania in the hope of finding a practical method of appraising their abilities which would enable him to advise them beforehand as to their chances of college success. These tests consisted of measurements of keenness of vision and hearing, color vision, sensitivity to pain, color preference, reaction time, rote memory, mental imagery, and the like.

Almost immediately after the publication of Cattell's 1890 article, a number of other psychologists decided to try out the method.¹²
Jastrow in 1892 used a similar series with students at the University of Wisconsin and in 1893 set up a special exhibition at the World's Columbian Exposition in Chicago. Visitors to the exposition were invited to come in and try their abilities. The tests were also applied to school children and the results compared with teachers' estimates of their mental acuteness.

¹¹ James McKeen Cattell, "Mental Tests and Measurements," Mind, 15:373-81, July, 1890.

¹² Florence L. Goodenough, Mental Testing: Its History, Principles, and Applications (New York: Rinehart and Company, Inc., 1940), p. 41.

In 1891 Bolton¹³ worked out a study under Franz Boas on "The Growth of Mentality in School Children" in Worcester, Massachusetts. These tests were given to some 1,500 students and consisted of remembering five, six, seven, and eight place numbers from one auditory presentation. As in the case of practically all of these early tests, this one of Bolton and Boas was wholly inadequate to measure even the one thing it purported to measure, in this instance, memory.

¹³ Thaddeus L. Bolton, "The Growth of Memory in School Children," American Journal of Psychology, 4:362-80, April, 1892.

In this early period, there was no tester who was more careful in the preparation of his apparatus than Gilbert, first working at Yale and later at the University of Iowa. He made two studies, which appeared in 1894¹⁴ and 1895.¹⁵ The measures were of the same general nature as those used by Cattell. They consisted of tests of visual and auditory acuity, speed of tapping, reaction time, and several others, including anthropometric measurements. The significant thing about Gilbert's work is the fact that he endeavored to introduce some control over the selection of his subjects. It is little wonder that Gilbert, like many others in these early days, was able to conclude as follows:

    . . . My data show no such relation between weight and height and mental ability as has been claimed, but give a negative result, and if any positive result can be stated at all, it would be that the taller and heavier the children, the duller they are, instead of the opposite. The marked differences occur between ages 10 and 14. By referring to the charts for graded weight, lung capacity, and wrist lift, it will be seen that they all follow approximately the same law.¹⁶

¹⁴ Boynton, op. cit., p. 154.

¹⁵ Ibid., p. 156.

¹⁶ J. A. Gilbert, "Researches Upon School Children and College Students," Vol. 1, p. 39, cited by Paul L. Boynton, op. cit., p. 157.

The German psychologist Ebbinghaus¹⁷ came closest to the concepts that Binet, Henri, and Simon finally brought into a workable form for the measurement of intelligence. In his well known completion test, he arrived at a measurement of a central higher process. (It was about this time that Binet and Henri, in a series of articles in L'Année Psychologique in 1895-1898, described an intensive study of subtests for the measurement of the higher complex functions.)

¹⁷ George D. Stoddard, The Meaning of Intelligence (New York: The Macmillan Company, 1947), p. 8£.

Another study of importance at about this time is the one made by Guiccardi and Ferrari in Italy.¹⁸ This was an analysis of the use of mental tests on deficient children, and these tests were divided into four groups. The first of these was a test of motor phenomena. The second was on "vasomotor phenomena" or emotional states. The third was a test in the field of apperception and attention. The fourth and last of the tests was on the superior phenomena of reasoning and of the "esthetic emotions and associations."
In the United States, Thorndike, who worked under Cattell, Boas, and others, was the first to publish a book which dealt directly with measurement.¹⁹ This book was devoted to fundamental methods of test construction and to statistical methods. It remained a standard textbook in its field for more than a decade. A year after the publication of Thorndike's book, Alfred Binet in France published the test which made him famous and contributed so greatly to the immediate and subsequent development of testing.²⁰ Most of the experiments presented up to this point lack two things. One of these is attention to validity, and the other is attention to reliability. In short, the early workers did not use good tests and they did not score them well.

18 Boynton, op. cit., p. 161.
19 Edward L. Thorndike, An Introduction to the Theory of Mental and Social Measurements (New York: Teachers College, Columbia University, 1913), 277 pp.
20 Anastasi, op. cit., p. 20.

The work of Binet. The work of Binet is important and merits special consideration because of the great stimulus he gave intelligence testing by the construction of the famous Binet-Simon Scale for the measurement of intelligence. It is of great importance historically because so much of the later development of intelligence testing is implied in his work.

Alfred Binet was born in Nice, France. His father was a physician and his mother an artist. He studied medicine and worked particularly under Charcot and Féré. To the influence of the former, he undoubtedly owed his knowledge of and interest in abnormal psychology.²¹

21 Goodenough, op. cit., p. 43.

In 1896 Binet and Henri published an article in which they described tests that they proposed to try out with school children and which were designed to "measure" each of eleven named processes: (1) memory, (2) mental imagery, (3) comprehension, (4) attention, (5) imagination, (6) suggestibility, (7) aesthetic appreciation, (8) force of will, as indicated by sustained effort in muscular tasks, (9) moral sentiments, (10) motor skill, and (11) judgment of visual space. For each of these, a number of different tests was proposed in order to cover aspects of the ability to be measured. Binet found that tests of sensory judgment and other simple functions seemed to have little relation to general mental functioning. Then, he gradually formulated a description of intelligence as "the tendency to take and maintain a definite direction; the capacity to make adaptations for the purpose of attaining a desired end; and the power of auto-criticism."²²

The stage was set, then, for the call in 1904 to produce the first practical mental test. The schools of Paris became concerned about their many non-learners and decided to remove the hopelessly feeble-minded to schools where they would not be held to the standard curriculum. Aware of errors in teachers' judgment, they wished to avoid segregating the child of good potentiality who could learn if he tried and the trouble-making child whom the teacher wanted to be rid of.²³ Moreover, they wanted to identify those dull children from good families whom the teacher might hesitate to rate low, and those dull children with pleasant personalities who might be favored by teachers. Evidently, a method of making the selection was called for, and it was specifically to meet this need that Binet and his colleague, Théodore Simon, constructed their first formal scale for appraising the intelligence of children.²⁴

22 Lewis M. Terman, The Measurement of Intelligence (Boston: Houghton Mifflin, 1918), p. 45.
23 Lee J. Cronbach, Essentials of Psychological Testing (New York: Harper and Brothers, 1949), p. 103.

In 1905 appeared the first intelligence scale.
It included thirty items which were arranged in order of increasing difficulty, and the results for each child were compared with an average for children of the same chronological age. Their estimates of average ability, of retardation, and of acceleration were based on agreement with or departure from this standard. The following examples will give a general idea of the kind of tasks which, at this time, Binet thought indicative of various levels of intellectual ability.

1. Visual co-ordination.
4. Knowledge of food.
6. Execution of simple orders and imitation of gestures.
9. Naming of designated objects.
15. Repetition of a sentence of fifteen words after a single hearing.
20. Stating similarities of two familiar objects.
25. Supplying the missing words in easy sentences.
30. Distinguishing between abstract terms.

24 Boynton, op. cit., p. 170.

It is interesting to note that in the development of this scale, Binet had as his purpose the selection of feebleminded children, and as a result, the test was not of any particular importance so far as the normal or superior child was concerned. One other thing, however, which should be noted, is that Binet very definitely makes use of the mental age concept in this scale.

In 1908, Binet²⁵ devised the first intelligence scale that is really worthy of this title. This scale was designed not only for the selection of the mentally unfit, but also for the classification of the normal and the superior. The outstanding feature of this scale, however, lay in the fact that instead of being arranged in order of difficulty, the tests were now grouped according to the age at which they were most commonly passed. In this respect, the scale was divided into a series of questions prepared for certain definite years so that the examiner would be able to give a child of any certain chronological age (covered in the age range of the test) questions which were designed especially for the average child of this age level.
25 Stoddard, op. cit., p. 96.

Binet employed a large number of children in the standardizing sample for the 1908 scale. Tests were placed at a point where from 50 to 75 per cent of the children passed. By this time, Binet had definitely developed his idea that a child grows mentally and that his rate of growth in comparison to an average rate of growth gives an important practical discrimination. He also approached the concept of a normal distribution, observing that, for a given set of tests, a few children would fail, many would pass, and a few would score very high.²⁶

26 Stoddard, loc. cit.

The method of finding the mental age by means of this scale was first to assign to the child a mental age corresponding to the age level at which not more than one test was failed. To this basal age was to be added one year for each five tests passed at levels above the basal. Inasmuch as the number of tests at successive ages varied from three to eight and as no credit for fractional years was allowed, it is evident that the method could at best yield only very crude results.²⁷ Binet readily saw his mistake and corrected it in his 1911 scale. Another feature of the 1908 revision is the fact that the tests for the younger years were too easy, causing children of these years to appear to be accelerated mentally, and the tests of the advanced years were too hard, causing the children to appear retarded. "If Binet had had a normal group with which to work, instead of being forced to experiment on relatively low-type children, some of these errors might have been avoided."²⁸

27 Goodenough, op. cit., p. 50.
28 Boynton, op. cit., p. 175.

The following examples will give a general idea of the kind of items included in the 1908 scale.

Age III
1. Pointing to nose, eyes and mouth
2. Repetition of short sentences
3. Repetition of two digits
4. Enumeration of objects in pictures
5. Knows his last name.

Age VI
1. Knows right and left
2. Repetition of a sentence of sixteen syllables
3. Aesthetic comparison. Choose the prettier of three pairs of faces
4. Definition of familiar objects
5. Executes three commissions given simultaneously
6. Knows age
7. Distinction between morning and afternoon.

Age X
1. Repeats the months of the year
2. Knows the names of nine pieces of money
3. Uses three words in one sentence
4. Responds to three questions
5. Responds to five difficult questions.

Age XIII
1. Solves the cutting paper test
2. Completes a triangle
3. Gives difference between five pairs of abstract words.²⁹

29 Rudolph Pintner, Intelligence Testing: Methods and Results (New York: Henry Holt and Company, Inc., 1932), pp. 137-40.

Shortly before Binet's untimely death, a second revision of the Binet-Simon scale appeared. The 1911 scale differed from that published in 1908 in its details, rather than in its major principles. In this scale, Binet had omitted several tests which depend upon the scholastic ability of the child, such as reading and writing, and also certain tests which are matters of knowledge dependent upon the environment, such as age and days of the week. In addition, some tests were found too hard for the age at which they were placed and so were moved, and a few new tests were introduced. Again, Binet found it advantageous to extend the scale into the upper limits. Accordingly, he worked out tests for age 15 and adults. The method of scoring was the same as that used in the 1908 scale, except that the additional allowance for tests passed above the basal year was 0.2 for each test passed, thus permitting the use of fractional parts of a year in computing the mental age.

These three scales show the progressive development of the idea of measuring intelligence by age steps. The first series of tests was arranged in order of difficulty, the second series according to age, and the third series was better standardized according to age and to a uniform number of tests for each age.
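The two scoring rules described above, for the 1908 and 1911 scales, can be sketched in a few lines of Python. This is an illustration only and not part of Binet's published procedure: the function names and the sample record are hypothetical, and the basal age is assumed to be the highest age level at which not more than one test was failed.

```python
from typing import Dict, List


def basal_age(results: Dict[int, List[bool]]) -> int:
    """Return the basal age: an age level with not more than one failed test.

    Assumption: when several levels qualify, the highest one is taken.
    """
    qualifying = [age for age, outcomes in results.items()
                  if outcomes.count(False) <= 1]
    if not qualifying:
        raise ValueError("no age level with at most one failure")
    return max(qualifying)


def passed_above(results: Dict[int, List[bool]], basal: int) -> int:
    """Count the tests passed at age levels above the basal age."""
    return sum(outcomes.count(True)
               for age, outcomes in results.items() if age > basal)


def mental_age_1908(basal: int, passed: int) -> int:
    """1908 rule: one full year per five tests passed, no fractional credit."""
    return basal + passed // 5


def mental_age_1911(basal: int, passed: int) -> float:
    """1911 rule: 0.2 year for each test passed above the basal age."""
    return round(basal + 0.2 * passed, 1)


# Hypothetical record: True = passed, False = failed, keyed by age level.
child = {
    6: [True, True, True, True, True],
    7: [True, True, True, False, True],
    8: [True, False, True, False, True],
    9: [True, True, False, False, True],
    10: [False, False, True, False, False],
}

basal = basal_age(child)             # level 7 (one failure)
passed = passed_above(child, basal)  # 3 + 3 + 1 = 7 tests
print(mental_age_1908(basal, passed))
print(mental_age_1911(basal, passed))
```

For this hypothetical child the 1908 rule gives a mental age of 8, while the 1911 rule gives 8.4; the two passed tests that the earlier rule discards are what the fractional allowance recovers.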
Quite naturally, the influence of these scales was widespread over Europe and America. From 1908 on, a more or less constant stream of revisions and modifications of either the 1908 or the 1911 scale has been made.

Degand³⁰ in Belgium applied the 1908 scale to children in a private school in Brussels. Also, Johnson³¹ reported, in 1910, her use and modification of the 1908 scale at Sheffield, England, in connection with some of her work with factory girls. Bobertag³² is largely responsible for the introduction and use of the scale in Germany. Ferrari³³ also made use of this scale in some experimental work on Italian children. As for the United States, the next chapter will be devoted to the more important revisions and modifications of the Binet tests.

30 Boynton, op. cit., p. 179.
31 Pintner, op. cit., p. 175.
32 Goodenough, op. cit., p. 59.
33 Boynton, op. cit., p. 180.

Summary and evaluation. Tests and measurements of man's abilities have been attempted since his beginning, but scientific work along this line had to wait until near the beginning of this century for the establishment of a scientific and experimental psychology. Such a psychology had its roots in Germany, England, France and the United States in the latter part of the nineteenth century. The psychological laboratory of Wundt, in 1879, was the training ground for many of those who went on to influence psychology. The Germans and those most influenced by them were interested in measuring visual and auditory sensation, reaction time, and experimentation in the field of psychophysics and association. In England, Galton made headway so far as methods of making discriminations, distributions, and correlations were concerned. He also began the use of "free association" tests. The French, led by Binet, worked out a scale for the measurement of intelligence. In the United States, experimental psychology was also developing.
The first psychological laboratory was established by William James at Harvard.

The first work done in measurement was made to determine how people are alike, and grew to study how they differ. Oehrn, in Germany, began a timed test of perceptual processes, memory, association and motor functions. Rice, in the United States, developed a uniform spelling test. Cattell was the first to use the expression "mental tests," and he tried to develop one. His article stimulated other psychologists to try his method. Thorndike, in 1904, came out with the first book that dealt directly with measurement.

Most of the early experiments lack two things. One of these is attention to validity, and the other is attention to reliability.

The work of Binet is important and merits special consideration because of the great stimulus he gave intelligence testing. His attempts to sort out the feeble-minded children in France led him to devise the first intelligence scale worthy of any note. The first scale was rough, and yielded only crude measurements, but the 1911 scale remedied this fault. This revision also replaced some of the original tests, and extended the upper limit of the test.

The influence of Binet's scales was widespread over Europe and America. Degand in Belgium, Johnson in England, Bobertag in Germany, and Ferrari in Italy made use of Binet's work. In the United States, many revisions have been made, but the most important were the Stanford-Binet revisions.

CHAPTER III

DEVELOPMENT OF INDIVIDUAL INTELLIGENCE TESTS (VERBAL TYPE)

In this chapter, an attempt will be made to indicate the general development and trends of intelligence tests following the work of Binet. This chapter will deal with the individual tests of general mental ability of the verbal type, except the so-called "culture-fair tests," to the development of which a separate chapter will be devoted.
No attempt will be made to describe the innumerable individual intelligence tests, or to give detailed information concerning standardization, administration and scoring.

In the early years of his experimental work, Binet followed the traditional line of approach with its theoretical analysis of intelligence. However, when he was assigned the task of selecting candidates for special subnormal schools in Paris, he met a practical problem that gave new direction to his work. To solve this problem, he began to combine tests of many types into a single scale. The very hodgepodge of single tasks which he put together aided him in getting a fairly good measure of general ability. His success diverted attention from the measurement of specific abilities and led many psychologists to concentrate on the new problem, the measurement of general mental ability. As a result, Binet's techniques have been greatly refined. In addition to his scales of 1905, 1908, and 1911, Binet made two outstanding contributions to the theory of psychological measurement. He developed, first, the concept of general intelligence, and secondly, the concept of mental age.

Binet's influence is reflected in the many revisions of the Binet-Simon Test which have appeared in the United States and other countries. In 1906, Goddard had been called to the training school at Vineland, New Jersey, to organize a research laboratory for the study of mental deficiency. He translated the Binet-Simon scale and adapted it for use with children in the United States. Interest in the scale developed rapidly, particularly among those interested in subnormal and backward children in institutions for the feeble-minded and in special classes in the public schools. The rapid development of the use of this scale in this country was at first generally ignored or was unfavorably criticized by many psychologists.¹ The Goddard revision was probably used more than any other scale until the appearance of the Stanford-Binet by Terman and his collaborators in 1916.

1 Francis N. Maxfield, "Trends in Testing Intelligence," Educational Research Bulletin, 15:13[?]-41, May, 1936.

Extension of Binet's work: the Stanford-Binet. In 1916, Terman² published his first revision of the Binet-Simon. This scale was definitely an extension of Binet's ideas, rather than a departure from them. What Terman did, however, was to work out more thoroughly and more accurately the method suggested by Binet. The scope of the standardization was extended and the scale was adjusted more accurately to fit each age. Another contribution by Terman in connection with this scale was the adoption of the Intelligence Quotient as suggested by Stern.³ The introduction of the I.Q. as an integral part of this revision appeared to warrant some prognosis of individual school attainment and to offer a basis for use in school planning, if it could be shown that the I.Q. was both constant and reliable.

2 Lewis Terman, The Measurement of Intelligence: An Explanation of and a Complete Guide for the Use of the Stanford Revision and Extension of the Binet-Simon Intelligence Scale (Boston: Houghton Mifflin Company, 1916), 362 pp.
3 Rudolph Pintner, Intelligence Testing: Methods and Results (New York: Henry Holt and Company, 1932), p. 417.

In 1937, the new revision of the Stanford-Binet by Terman and Merrill⁴ appeared. The scale is an age scale. That is to say, the tests are grouped in terms of age levels. Ten years of work had been put into a thorough and careful revision and restandardization. The range had been increased both upward and downward. It is no exaggeration to say that at one time the major task of the psychometrician was to administer the Stanford-Binet. In fact, it was paid the compliment of being the standard against which newly developed scales were validated.

The Revised Stanford-Binet provides two complete scales designated as Form L and Form M.
Both scales were devised for children aged two and upward to adulthood. Each scale contains 129 tests, distributed over twenty age levels. Below age five, the levels cover half-year intervals; from age five through fourteen, the intervals are yearly; and beyond age fourteen, there are four adult levels, designated as "Average Adult" and "Superior Adult I, II, and III." All levels have six tests except "Average Adult," which has eight.

The items of this scale were selected according to a variety of criteria.⁵ Among them were: (a) general opinion as to their worth formed by expert workers; (b) an increase in the percentage of children who succeeded with each item at increasing age; and (c) the correlation of each subtest with the total score.

4 Lewis Terman and Maud A. Merrill, Measuring Intelligence (New York: Houghton Mifflin Company, 1937), 461 pp.
5 Ibid., pp. 9-10.

The test included a variety of tasks. Attempts at classification in order to reduce them to an intelligible form have been manifold. Porteus classified the tests as follows:

(1) Memory. These include memory span for digits, sentences, commissions, items read in a story or news paragraph, pictures, and designs -- 21 1/2 tests.
(2) School Attainments. There are 4 1/2 tests involving school attainment -- 4 of arithmetic and the 'Reading Report' test.
(3) Verbal Ability. These are tests of vocabulary, verbal comprehension and expression, description, definitions, verbal reasoning, rhymes, word association, and verbal classifications -- 32 tests.
(4) Common Knowledge and Comprehension of Practical Situations. These include similarities, picture interpretation, picture absurdities, problems of fact, and aesthetic comparison -- 19 tests.
(5) Practical Judgment and Abilities. These are tests of manipulative skill, drawing, form board, planning, induction, and ingenuity -- 20 tests.⁶

Inspection of the scale reveals that it is heavily weighted with verbal material. Krugman⁷ believes that there has been an undue emphasis on verbal material. He found levels VIII and XI of Form L, for example, especially open to this criticism. This opinion seems reinforced by the findings of McNemar,⁸ who attempted to develop a non-verbal scale covering all age levels. He could include no tests from level VIII and only one from level XI. Levels XIV and upward are almost entirely verbal in character.

6 Stanley D. Porteus, The Practice of Clinical Psychology (New York: American Book Company, 1941), pp. 120-21.
7 Morris Krugman, "Some Impressions of the Revised Stanford-Binet Scale," Journal of Educational Psychology, 30:594-603, November, 1939.

As for the location of some of the tests, Krugman⁹ felt that three tests at level XIII were too easy. No other age level had this many tests misplaced. These three were the "Plan of Search," "Word Memory," and "Paper Cutting." In addition, Harriman¹⁰ found that two hundred school children in the fifth and sixth grades with a mean age of 11-7 and mean I.Q. of 112 made 63 per cent of successes at level XII and 78 per cent of successes at level XIII. There is some contradictory evidence to these findings. Mitchell¹¹ found that with groups of college freshmen and senior medical students level XIII is more difficult than level XII.

8 Quinn McNemar, The Revision of the Stanford-Binet Scale: An Analysis of the Standardization Data (New York: Houghton Mifflin Company, 1942), 185 pp.
9 Krugman, op. cit., p. 597.
10 Philip Harriman, "Irregularity of Success on the 1937 Stanford Revision," Journal of Consulting Psychology, 3:83-85, May-June, 1939.

Modification of Binet's practices: point scales. Even before the first Stanford Revision appeared, certain quite different developments, based on the work of Binet, were in the making.
These developments were (1) abandoning the age-scale approach, and (2) the use of the same tests, in many cases, at different age levels with a different scoring system. In the point scale, all the tests of a similar nature, as, for example, those for memory of digits, were grouped into a single test and given together, in increasing order of difficulty, while Binet placed them in different years according to the number of digits remembered. Another difference is that in the Binet scoring the all-or-none principle is used, that is, the subject either succeeds or fails. In the point scale, a varying amount of credit may be given according to the quality of the performance. The results of the tests are expressed in total scores ranging from 0 to 100, and this might be converted into an equivalent M.A. Theoretically the entire point scale is given to each individual examined; but in practice this is not necessary with young children who fail the earlier subdivisions of any test, so that the later ones are unnecessary.

11 Mildred B. Mitchell, "Irregularities of University Students on the Revised Stanford-Binet," Journal of Educational Psychology, 32:513-22, October, 1941.

The first point scale to gain general publicity was the Yerkes, Bridges and Hardwick scale,¹² which was a revision of the Binet test.

Probably the best known point scale in current use is the Herring Revision of the Binet-Simon.¹³ The scale is essentially a modification of the ideas and practices of Binet, rather than a radical departure from them. Some points in the Herring examination are given for illustrative purpose.

Group A
1. What is this picture about? Tell me what you see in this picture. (Four pictures presented successively).
2. In the first row of numbers tell me what two numbers should come next (here and here). Go ahead. (Eight rows of figures, each row presenting a number completion problem).
3. Read this to yourself. Then begin at the beginning and tell me everything you have read. (A passage of thirteen memories).
4. I am going to read some numbers. When I am through, say the numbers backward. If I say 9, 2, you say 2, 9. Do you understand? (Digit group ranging in number from two to nine).

Group C
14. Solving five other problematic situations.
15. Detecting absurdities in eight statements.
16. Sentence building. Four sets of three words each.
17. Giving a rhyme for each of four words.
18. Picking out similarities in six groups of three things each.
19. Interpretation of five proverbs.
20. Reproduction of thought. Thirteen (so-called) memories. Rather difficult reading.
21. Reading three mixed sentences.
22. Solving three arithmetical problems.

Group E
31. Naming five familiar objects.
32. Form comparison.
33. Performing three commands.
34. Diagrammatic problem solving.
35. Repetition of digits forward. Groups ranging from two to ten in number.
36. Repetition of three sentences ranging from nineteen to twenty-four syllables.
37. Detection of proportional relationships from material read.
38. Code writing.¹⁴

12 Robert M. Yerkes, James W. Bridges, and Rose S. Hardwick, A Point Scale for Measuring Mental Ability (Baltimore: Warwick and York, Inc., 1915), 218 pp.
13 John Herring, Herring Revision of the Binet-Simon Tests (Yonkers, New York: World Book Company, 1921), 5[?] pp.

Upon inspection of the material, one notices considerable similarity with the Binet material. Many of the subtests are derived or revised from the original Binet items, but these items are arranged and treated differently.

It would be desirable to mention the advantages which have been claimed for the point scale in contrast to the age scale. These advantages are derived from the work of Yerkes and Foster, and presented as follows:

Age-Scale Characteristics | Point-Scale Characteristics
1. Tests organized by years or other age-units. | Single graded-test scale.
2. Tests and items selected by relationship of success to age. | Tests selected in terms of the function to be measured.
3. Varied, unrelated ungraded tests in a composite. | Each test so graded as to be available over a wide range of ages.
4. Internally standardized and inflexible. | Standardized against external criteria and flexible.
5. All-or-none ratings of subject's responses. | More-or-less ratings of subject's responses.
6. Qualitative. | Quantitative.
7. Measurement not fully amenable to statistical treatment. | Measurement wholly amenable to statistical treatment.
8. Tests weighted equally. | Tests weighted unequally.
9. Implicit assumption that of new appearing or emerging functions. | Implicit assumption that of continuously developing functions.
10. Measurements for different ages relatively incommensurable. | Measurements for different ages comparable and commensurable.¹⁵

14 Paul Boynton, Intelligence: Its Manifestations and Measurement (New York: D. Appleton and Company, 1933), pp. 193-94.
15 Robert M. Yerkes and Josephine Foster, A Point Scale for Measuring Mental Ability (Baltimore: Warwick and York, 1923), p. [?].

Modifications of Binet's practices: the work of Kuhlmann. Another trend in developing individual tests of general mental ability was the work of Kuhlmann. His name is associated with modifications of the practices and ideas of Binet without going to the length of radical changes.

Two revisions of the Binet Scale have been published by Kuhlmann.¹⁶ The first one, published in 1912, adhered closely to the original Binet scale, while the second one, which was published in 1922,¹⁷ introduced a number of departures, among them the elimination of nineteen of the original subtests, an increase in the total number of subtests to 129, and a credit for speed as well as accuracy. The scale was extended downward to the three-month level and upward to the age of fifteen years. The extension of the scale to the three-month level was his most original contribution. Some such tests were proposed by Binet in 1905, but they were few in number and unstandardized.

16 Frank S. Freeman, Theory and Practice of Psychological Testing (New York: Henry Holt and Company, 1950), p. 115.
17 Frederick Kuhlmann, A Handbook of Mental Tests: A Further Revision and Extension of the Binet-Simon Scale (Baltimore: Warwick and York, Inc., 1922), 208 pp.

This instrument is mentioned because it has great practical and theoretical importance in developing individual intelligence tests. But in his Tests of Mental Development,¹⁸ which he drew from various sources, including the author's previous revisions of the Binet scale and the Kuhlmann-Anderson group tests, he definitely carried the revision of Binet's ideas a step farther. In earlier age ranges, tests were drawn from Gesell.

The essential point of resemblance between the Tests of Mental Development and the original Binet scale and the two Stanford Revisions is that all of them are composite analytic instruments containing a multiplicity of items. Indeed one might say that the basic operative conception of general intelligence embodied in the instrument is even more loosely defined than with Terman and Binet.¹⁹

18 Frederick Kuhlmann, Tests of Mental Development: A Complete Scale for Individual Examination (Minneapolis: Educational Test Bureau, 1939), 314 pp.
19 James L. Mursell, Psychological Testing (New York: Longmans, Green and Company, 1950), p. 12[?].

Kuhlmann differed from the original Binet and the Stanford Revision in the arrangement of the tests. The tests are not grouped in age levels, but, instead, they are arranged in order of increasing difficulty, each test having an age-value and a value in "mental units" (MU).
These 'mental units' are based upon a mental growth curve constructed by Heinis and believed by him and Kuhlmann to represent the actual course of mental development. The units in this curve are said to be equal throughout its length, so that scores and differences between scores are said to be directly comparable along the entire scale.²⁰

Another distinctive feature about the test is the use of "Percent of Average" (PA). This index is found simply by dividing the individual's score in MU-points by the average MU-point score for his own age-group. According to Kuhlmann, the "Percent of Average" provides a more constant index of mental development for the individual from year to year than does the intelligence quotient.

F. L. Wells has reviewed this test:

The Kuhlmann Binet occupies a position between the Binet systems and the Wechsler-Bellevue Intelligence Scale and Detroit Tests of Learning Aptitudes; on the whole nearer the Binet. The main issue is between a large number of disparate procedures, each aimed at a single or narrow range of developmental levels, and a much smaller number, each covering a considerable range of developmental levels. Kuhlmann is clearly a partisan of variety, in the Binet tradition. . . . All in all, the present offering probably represents the best instrument available over a fairly wide area, denoted as follows: (a) where it is desired to represent the intelligence function by a single figure; (b) for healthy individuals up to say fourteen years of age; and (c) for individuals with intelligence defect.²¹

20 Freeman, op. cit., p. 117.

For purpose of illustration and comparison with other revisions of the Binet scale, a few tests from the Kuhlmann 1939 scale follow:

4 months. Sits with support.
1 year. In speech, combines two or more syllables.
2 years. Makes horizontal and vertical lines.
3 years. Identifies objects in a picture.
4 years. Traces a square.
5 years, 1 month. Repeats a series of digits.
6 years, 2 months. Repeats digits backwards.
7 years, 2 months. Copies lines in a dot pattern.
8 years. Uses a number-letter code.
9 years. Finds digits having middle value in series of five.
10 years, 5 months. Draws a triangle on a square according to directions.
11 years, 8 months. Comprehends and follows directions.
11 years, 11 months. Supplies two of four terms to complete analogy.
12 years, 6 months. Draws upright forms in inverted positions.²²

Several revisions of the Binet have been mentioned. For the purpose of clarification, the characteristics of the most important ones are summarized in Table I.

21 Oscar K. Buros, editor, The Nineteen Forty Mental Measurements Yearbook (Highland Park, New Jersey: 1941), pp. 254-55.
22 Freeman, op. cit., p. 118.

TABLE I
REPRESENTATIVE TESTS DERIVED FROM THE BINET SCALES

Test | Author | Age Range | Special Features
Stanford Revision | Terman, 1916 | 3-adult | Introduced the I.Q.
Stanford Revision | Terman-Merrill, 1937 | 2-adult | Parallel forms. Extended for higher and lower ages.
Point Scale | Yerkes-Bridges, 1923 | 3-adult | Subtests organized as point scale.
Tests of Mental Development | Kuhlmann, 1939 | 3 months-adult | Uses Heinis scaling and point scoring.

Far-reaching revision of Binet's practices: the work of Wechsler. Another trend in developing individual intelligence tests of general mental ability was embodied in the work of Wechsler. His instruments have been chosen to represent a far-reaching revision of the practices and ideas of Binet, yet not unrelated to them.

1. The Wechsler-Bellevue Intelligence Scale,²³ an individually administered point scale, consists of ten subtests, with one alternative (vocabulary test). The six subtests which made the Verbal Scale are entitled General Information, General Comprehension, Digit Span, Arithmetic, Similarities, and Vocabulary.
The so-called Performance Scale includes Picture Arrangement, Picture Completion, Block Design, Object Assembly, and Digit Symbol.

A principal difference between the Binet-type scale and the Bellevue is that items of various kinds are grouped at each age level in the Binet, whereas, in the Wechsler, all items of a particular type are grouped together, constituting a subtest of the whole. The items within a given subtest are arranged in order of difficulty as found by Wechsler on standardization. Nevertheless, evidence has been advanced by Jastak24 on the basis of an item analysis of sixteen hundred cases that many items are misplaced.

23 David Wechsler, The Measurement of Adult Intelligence (Baltimore: The Williams and Wilkins Company, 1944), 258 pp.
24 Joseph Jastak, "An Item Analysis of the Wechsler-Bellevue Tests," Journal of Consulting Psychology, 14:88-94, April, 1950.

The author has constructed his scale in this form (Verbal and Performance) because, apparently, he believes that intelligence involves not only ability to deal with abstractions and conceptual thinking but also ability to deal with problems of concrete objects rather than words and numbers.

One of the distinctive features of this scale is the system of selection of the cases for standardization. The norms are based upon scores obtained from 1,750 subjects ranging in age from seven to seventy, selected out of 3,500 subjects to whom the tests had been presented, the selection being a sampling based upon the occupational distribution of the United States adult white population, as indicated by the 1930 census. The adult subjects were divided into age groups by five-year intervals, the number of cases in a group ranging from fifty subjects in the later fifties to 195 subjects in the later twenties.
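The contrast drawn above between the Binet-type arrangement and the Wechsler arrangement may be illustrated by a minimal sketch in Python. The item pool, its difficulty ranks, and the function names are hypothetical and serve only to show the two groupings; the difficulty rank stands in for the level at which a Binet-type item would be placed, and nothing here reproduces the actual content of either scale.

```python
from collections import defaultdict

# Hypothetical item pool: (kind of item, difficulty rank).
# The kinds echo subtest names from the text; the ranks are invented
# purely for illustration and are not taken from any published scale.
ITEMS = [
    ("Digit Span", 3), ("Vocabulary", 1), ("Arithmetic", 2),
    ("Vocabulary", 3), ("Digit Span", 1), ("Arithmetic", 1),
    ("Digit Span", 2), ("Arithmetic", 3), ("Vocabulary", 2),
]


def age_scale_layout(items):
    """Binet-type layout: items of various kinds grouped at each level."""
    levels = defaultdict(list)
    for kind, rank in items:
        levels[rank].append(kind)
    return {rank: sorted(kinds) for rank, kinds in sorted(levels.items())}


def point_scale_layout(items):
    """Wechsler-type layout: one subtest per kind, ordered by difficulty."""
    subtests = defaultdict(list)
    for kind, rank in items:
        subtests[kind].append(rank)
    return {kind: sorted(ranks) for kind, ranks in sorted(subtests.items())}


if __name__ == "__main__":
    print(age_scale_layout(ITEMS))
    print(point_scale_layout(ITEMS))
```

In the first layout each level holds a mixture of item kinds; in the second each kind of item forms its own subtest running from easiest to hardest, which is the arrangement the text attributes to Wechsler.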
Another departure from Binet's practices is that the average performance of individuals in any age group is taken as the point of reference for that age group. The net result of this procedure is that older subjects generally receive higher I.Q.'s on this scale than they do in other tests of general intelligence.

All parts of the scale are scored on a point basis; the raw score is then converted into a weighted score by means of a conversion table. The weighted scores for all parts of the scale are then added to obtain the full score upon which a "full I.Q." is based. The same principle is used to find the "verbal I.Q." or the "performance I.Q." The weighted scores were derived from raw scores by use of Hull's method of standard score. A mean of ten and a standard deviation of three were arbitrarily assigned, thus equating each subtest with all the others.

Thus, the I.Q. cannot be obtained by dividing the mental age by chronological age in the usual way, as in the case of the Stanford-Binet. In fact, Wechsler's intelligence quotients are not in reality quotients at all. The Wechsler I.Q. is the ratio between the obtained or actual score and the expected mean score for the age of the particular subject. It is evident, then, that the meaning of the intelligence quotient when using this scale is different from that customarily associated with this term. In this respect, Wechsler disregards the mental age concept entirely. His concept of mental age and I.Q. is consistent with the following:

Certainly a mental age of fifteen is no more meaningful for an average thirty-year-old than a mental age of thirty for an average fifteen-year-old. If it is helpful to say that a six-year-old is bright, knowing full well that this mental maturity is at a low level, then we can make the analogous statement about a sixty-year-old. We do not say, 'the child is bright for a six-year-old,' since that is redundant. Neither should we say that a man is bright (or dull) for sixty. That, too, is redundant. Either he is bright or he is not, his age being utilized as a reference point for entering the tabulation of scores. For those still older, to be bright is to postpone senility.25

It is apparent, then, that the Wechsler-Bellevue I.Q. implies a different meaning from the Stanford-Binet I.Q. The method of derivation is different, as are the groups with whom the individual subject is being compared. In the Stanford-Binet a comparison is made with the general population within the limits of the standardization sample, but in the case of the Wechsler-Bellevue the comparison is made with other individuals falling in the same age group.

25 George Stoddard, "On the Meaning of Intelligence," Psychological Review, 48:252, May, 1941.

This scale has been criticized by Cronbach, who says:

. . . one can point to numerous shortcomings. Most of them arise from Wechsler's emphasis on clinical utility rather than upon any theory of mental measurement. The test consists of a random collection of items, most of them 20 or more years old. Although Wechsler collected norms conscientiously, and attempted some compensation for his failure to test subjects outside the New York area, his norms are probably not representative of Americans generally. I.Q.'s are determined by an arbitrary computation having no particular advantages or rational justification. Technical reports on validity and reliability have been inadequate. The test is not difficult enough for measurement of superior subjects. Furthermore, the test was based on no clear theory of intelligence. All these criticisms suggest that the test is far short of the best that could be designed at present.26

The Wechsler-Bellevue scale is thought by many psychologists to be superior for adults to the Revised Stanford-Binet scale.
It does not involve the difficulty encountered by the latter in connection with ages above sixteen. The subtests are better suited for older persons than the upper-level subtests of the Stanford-Binet.

2. The Wechsler Intelligence Scale for Children.27 This scale has grown logically out of the Wechsler-Bellevue Intelligence Scale used with adolescents and adults. In fact, most of the items in the WISC are from Form II of the earlier scales, the main additions being new items at the easier end of each test to permit examination of children as young as five years of age.

The scale is based on the same philosophy of global intelligence and the inadequacy of the mental age concept as the adult scale. Even though the materials overlap, the WISC is a distinct test from the Wechsler-Bellevue scales and is independently standardized. Both scales can be used with adolescents. However, it is expected that the WISC will be preferred in testing adolescents up through the age of fifteen years.

26 Lee Cronbach, Essentials of Psychological Testing (New York: Harper and Brothers, Publishers, 1949), p. 158.
27 David Wechsler, Wechsler Intelligence Scale for Children: Manual (New York: The Psychological Corporation, 1952), 114 pp.

The WISC consists of twelve tests which, as in the adult scale, are divided into sub-groups identified as verbal and performance. The six subtests which make up the Verbal Scale are entitled General Information, General Comprehension, Digit Span, Arithmetic, Similarities, and Vocabulary. The Performance Scale includes Picture Arrangement, Picture Completion, Block Design, Object Assembly, Coding, and Mazes.

In the interest of shortening the time required for testing, the scale is ordinarily to be administered on the basis of only ten tests. Digit Span is considered an alternate test in the verbal series and Mazes an alternate test in the performance series.

The standardization sample is much more adequate than for the adult scale.
It included one hundred boys and one hundred girls at each age level from five through fifteen years, with distribution as to the area of the country, urban-rural proportion, and parental occupation being based on the 1940 United States census. Only white children were included in the sample.

The value of the test for predicting common-sense criteria such as school progress or other evidences of adjustment remains to be established by future research. With its verbal and performance I.Q.'s obtainable from one uniformly standardized scale, and its interesting possibilities for subtest analysis, the WISC undoubtedly will be used by clinicians, and will be evaluated by a wealth of studies. The test may become a more important competitor to the Stanford-Binet when an individual test is needed for the ages from five through fifteen years.

Complete departure from Binet's practices: the CAVD28 (Thorndike and Others). This scale represents a very definite departure from the ideas and practices of Binet. It is for individual administration at the lower level, and can be used as a group test at upper levels. This scale was constructed by Thorndike and others at Teachers College, Columbia University. The letters CAVD refer to the four kinds of content used in the tests, namely, Completion, Arithmetic, Vocabulary, and Directions.

The distinguishing feature of this scale is that the items are arranged in order of difficulty, providing seventeen different levels, in each of which the tasks of any one subtest are of nearly equal difficulty. Also the steps between levels are of approximately equal difficulty. This scale will be mentioned again in the chapter on development of group intelligence tests.

28 Edward L. Thorndike, and others, The Measurement of Intelligence (New York: Teachers College, Columbia University, 1927), 616 pp.

Evaluation and summary.
The evolution of individual intelligence tests of general mental ability of the verbal type was rapid after the first appearance of the Binet scales. Binet's influence is reflected in the many revisions of the Binet-Simon test which have appeared in the United States and other countries. The Stanford-Binet revisions were an extension of Binet's work rather than a departure from it. The idea of the Intelligence Quotient, as suggested by Stern, was used in connection with this scale.

The point scales developed by Yerkes and others were a modification of Binet practices. The modifications were the abandonment of grouping the tests by age level and the use of the same tests, in many cases, at different age levels with a different scoring system. In the point scale, all the tests of a similar nature are grouped into a single group and given together, in increasing order of difficulty. Another point which is different from Binet's work is the varying amount of credit given according to the quality of the performance.

Another modification of the practices and ideas of Binet, short of radical change, was made by Kuhlmann. He extended the test downward to the three-month level and upward to fifteen years. He arranged the tests in order of increasing difficulty, each test having an age-value and a value in "mental units." Another distinctive feature of the test is the use of "Percent of Average."

Another trend in developing individual tests of general mental ability was embodied in the work of Wechsler. His work represents a far-reaching revision of the practices and ideas of Binet, yet not unrelated to them. A principal difference between the Binet-type scale and the Wechsler scales is that, in the latter, all items of a particular type are grouped together, constituting a subtest of the whole. The items within a given subtest are arranged in order of difficulty. The I.Q. on these scales is determined on the basis of weighted scores; therefore, it carries a different meaning from that of the Stanford-Binet.

The CAVD, constructed by Thorndike and others, is a complete departure from the Binet ideas. It is for individual administration at the lower level, and can be used as a group test at upper levels.

These tests represent the major developments that have occurred in individual intelligence tests of general mental ability of verbal type since the time of Binet.

CHAPTER IV

DEVELOPMENT OF INDIVIDUAL INTELLIGENCE TESTS (NON-VERBAL TYPE)

The purpose of this chapter is to deal with the development of individual tests of general mental ability not covered in the preceding chapter. Here, the concern will be with the development of performance tests as they are used to measure intelligence, rather than those which measure more specific abilities such as dexterity and spatial ability. Since the intelligence tests of young children are mostly of non-verbal nature, a section will be devoted to their development. No attempt will be made to mention the characteristics of every test now on the market; however, a few specific descriptions will be given.

I. PERFORMANCE TESTS

Performance tests have been used extensively as one means of measuring general intelligence. The term is usually applied to those tests to which the child responds by doing or "performing" something, in contrast with those tests to which he responds verbally (either orally or by writing his answer).

Early development of performance tests. Performance tests are the outgrowth of the intelligence measurement movement. These non-verbal tests have shared the fame of the language tests. Among the tests devised by Alfred Binet were a number of tests which do not require verbal responses.
For example, in the test known as "prehension provoked tactually," the examiner places a small wooden cube in contact with the palm or the back of the subject's hand to determine whether he can execute properly coordinated movements of grasping. In the drawing test, he shows the subject two drawings, permits him to look at them for ten seconds, and then requires him to draw the views from memory. None of these tests expects a verbal response from the subject.

Early revisions of the Binet-Simon scale had adhered more or less closely to the original scale, and consequently some of their tests are non-verbal in nature. In spite of the merits of the Binet-Simon scale and its revisions, their chief deficiency lies in the large proportion of tasks requiring language responses. This criticism of the scale was vigorously presented by Ayres in 1911. He pointed out that the Binet tests predominantly reflect the child's ability to use words fluently, and do not reveal his ability to do acts. Thus, the scale gives "a warped and partial measure of his real degree of intelligence."1

The language difficulty, inherent in the Binet-Simon and its various revisions, became evident when the clinical psychologists attempted to apply it in various fields of practical work. They found that the scale was inadequate for the mental examination of non-English speaking people, speech defectives, the deaf, and those with a language disability. Hence they introduced non-language tests which do not require language responses on the part of the child for adequate performance. Among those who first used a performance test were Healy and Fernald.2 In carrying out mental examinations at the Juvenile Psychopathic Institute of Chicago, they had been confronted with the problem of testing a cosmopolitan population. Some of the inmates were illiterate, and some, educated in their own tongue, were unable to speak the English language.
Healy and Fernald, in testing their subjects, became convinced that language, as far as possible, should be eliminated from the mental examinations given to such subjects. They say:

In predicting the possible development of an individual under various conditions, it is most desirable to ascertain the mental ability quite apart from the individual's experience in formal training in our language, or indeed any language. It often becomes necessary to classify mentally a subject who has had no education in English-speaking schools, or indeed who has had but little schooling of any kind.3

1 Leonard P. Ayres, "The Binet-Simon Measuring Scale for Intelligence: Some Criticisms and Suggestions," Psychological Clinic, 5:187-96, November, 1911.
2 William Healy, and Grace M. Fernald, "Tests for Practical Mental Classification," Psychological Monographs, 13, No. 2:1-53, March, 1911.

The work carried on at the Institute demonstrated the practical value of the performance tests. Healy and Fernald concluded as follows:

On one occasion we found ourselves able to demonstrate satisfactorily that a Gypsy boy of fifteen, quite innocent of schooling and knowledge of the three R's, had at least fair, if not good, native ability. And repeatedly a number of our tests have proved most serviceable in mentally classifying young, deaf and dumb children.4

Knox,5 in his work among the immigrants at Ellis Island, found it impossible (even with the services of an interpreter) to use scales in which language responses were required. Faced with this language obstacle, and under the necessity to diagnose mental deficiency among the immigrants, Knox devised a series of performance tests, many of which were excellent and some of which are still used in psychological clinics.

3 Ibid., p. 4.
4 Loc. cit.
5 Howard A. Knox, "A Scale Based on the Work at Ellis Island for Estimating Mental Defects," Journal of American Medical Association, 62:741-47, March, 1914.

Pintner and Paterson6 also found that the language scale was "absolutely inadequate to test the mentality of deaf children." They experimented with the Binet-Simon but were confronted with numerous difficulties, such as the lack of comprehension of certain tasks because a physical deficiency had deprived the subject of the experience needed to acquire the proper test reaction. Consequently, they constructed a scale of performance tests which requires practically no instructions for the child other than natural gestures. Pintner and Paterson considered the non-language feature of the test as a sine qua non in the measurement of mentality of the deaf. As to the importance of the performance tests, they say:

Here we have a group of individuals, completely shut off from hearing language, and for that reason laboring under a language difficulty that only in rare cases is surmounted to the extent of making them comparable in language ability to ordinary hearing individuals. Any kind of tests involving reading or spoken language cannot be used as a test of their mentality. If we employ such tests for measuring the mentality of the deaf and use the standardization obtained from hearing children, we will not be measuring mentality but merely difference in language ability. There may be a greater percentage of feeblemindedness among the deaf than among the hearing but the fact a deaf child does not measure up to the language standard of a hearing child is not indication of mental deficiency.7

6 Rudolph Pintner, and Donald G. Paterson, "The Binet Scale and the Deaf Child," Journal of Educational Psychology, 6:201-09, April, 1915.

Thus, scales of performance tests first grew out of a need for some means of measuring the mental ability of subjects who were handicapped in verbal expression, such as the deaf, or those with foreign-born parents who spoke their native language in their homes, or those whose language environment was restricted and meager, and those with marked speech defects.

Pintner and Paterson, in 1917, brought out the first well-standardized performance scale. Other tests of the same general sort soon followed. Of those in use today, the Arthur Point Scale (1930) and the Cornell-Coxe series (1934) are among the best known. These scales will be mentioned in more detail in another section of this chapter.

What performance tests really are. Most performance tests are of the form board variety, that is, wooden boards from which blocks of various forms have been cut, or various kinds of picture puzzles. Form boards have been classified by Newell8 into eight groups on the basis of the problem presented for solution and the ability upon which the solution depends.

7 Rudolph Pintner, and Donald G. Paterson, A Scale of Performance Tests (New York: D. Appleton and Company, 1923), p. 20.
8 Constance D. Newell, "The Uses of the Form Board in the Mental Measurement of Children," Psychological Bulletin, 28:309-18, April, 1931.

Picture Form Boards. A picture has been pasted on a board, and from this various forms have been cut. The difference in the size and shape of the pieces and the background of the picture serve as guides to the correct solution.

Picture Completion Form Boards. These consist of pictures from which blocks have been cut, but the blocks are all of the same shape and size.
To complicate the problem, many additional pieces are presented from which the correct pieces must be selected for placement. The choice depends upon the understanding of the situation in the picture, since the shape and size of the blocks and the background contiguous to the holes give no clue to the correct choice.

Picture Puzzles. These consist of pictures cut into a number of pieces which, when fitted together, form a complete picture. Some are cut along straight lines, some into irregular pieces, and some are provided with frames into which the pieces are to be fitted.

Form Boards with Geometrical Insets. These are blocks of geometrical shapes cut from a plain wooden board. Each block fits into a depression of the same shape. This has been one of the most widely used types of form boards. The number of depressions varies from two or three to ten or twelve.

Construction Form Boards. In these boards the blocks have been cut apart and the pieces must be put together to fill the recess. The simpler ones have only two parts cut straight, while the more complex have three or four blocks cut diagonally so that one or more of the pieces must be turned over to make them fit.

Cylinders. Cylinders varying in height only, in diameter only, or in both height and diameter are placed in corresponding depressions.

Peg Form Boards. Instead of blocks, long pegs of various shapes are placed in depressions corresponding in shape.

Form Board Tests of Apperception. To solve this type correctly, there must be perception of a relationship between the parts. For instance, separate pieces of wood representing the body, head, arms and legs of a manikin must be assembled in their proper relationships.

Many performance tests, not of the form board type, have been used with young children.
Examples of these are building towers of blocks, copying figures, punching out perforated holes in a sheet of paper, matching colors, stringing beads and buttoning large buttons.

Representative scales of performance tests. Three performance scales are presented below in chronological order for illustrative purposes.

1. Pintner-Paterson Scale of Performance Tests.9 This is a pioneer performance scale, published in 1917, abbreviated and revised in 1937. It is for individual administration and consists of fifteen subtests. This scale, however, is not absolutely original with these authors. Of the fifteen tests which they suggest, only three were originally devised by them. The other tests are ones which had been prepared by other investigators, but which Pintner and Paterson restandardized and finally incorporated in their combined series. The tests which were used in the series are: (1) Mare and Foal Form Board, (2) Seguin Form Board, (3) Five-Figure Board, (4) Two-Figure Board, (5) Casuist Board, (6) Triangle Test, (7) Diagonal Test, (8) Healy Puzzle, (9) Manikin Test, (10) Feature Profile Test, (11) Ship Test, (12) Healy Picture-Completion Test I, (13) Substitution Test, (14) Adaptation Board, and (15) Cube Test.10

Various methods for the calculation of mental age have been suggested by the authors. They show how these tests may be shaped into a year scale or a point scale. They also show how a mental age may be derived from tables of median performance and also how a percentile rating of the subject may be obtained.

9 Pintner and Paterson, A Scale of Performance Tests, 217 pp.

The scale is considered a valuable supplement to highly verbal scales, but it is not a good substitute for them. It is particularly useful for the deaf, for those who do not understand English, and for certain types of emotionally disturbed subjects.
"It is not very satisfactory for older children, and older children who are dull often achieve high ratings which are probably spurious because of their manipulation facility."11

2. The Arthur Point Scale of Performance Tests.12 This is another instrument for individual administration, designed for ages six years and upwards. The scale, Form I, consists of eight tests used in the Pintner-Paterson, plus two other tests. The eight are: Knox Cube Test, Seguin Form Board, Two-Figure Form Board, Casuist Form Board, Manikin, Feature Profile, Mare and Foal, and Healy Picture Completion I. The two additional are: Porteus Maze Test and Kohs Block Design Test.13

11 James L. Mursell, Psychological Testing (New York: Longmans, Green and Company, 1949), p. 170.
12 Grace Arthur, A Point Scale of Performance Tests, Volume I, Clinical Manual (New York: The Commonwealth Fund, 1943), 64 pp.

Each test yields a raw score, which is assigned a value proportional to the effectiveness of the subtest in discriminating between successive age-levels. The raw score, then, is converted into weighted score points. The total of these weighted scores is converted into a mental age.

This performance scale was devised primarily as a clinical instrument to be used as a substitute for the Binet revisions in cases where

. . . the highly verbalized Binet tests are inadequate, first, because of language handicap; second, because of speech or hearing defect; third, because the Binet scale fails to give an adequate report of the intelligence of the individual in whom verbal and non-verbal abilities are markedly unequal in their development.14

3. Cornell-Coxe Performance Ability Scale.15 This scale includes six tests, with a seventh as an optional substitute for the third. These tests were selected from a variety of sources. They are: (1) Manikin Profile Test, (2) Kohs Block Design Test, (3) Picture Arrangement Test, (4) Digit-Symbol Test, (5) Memory for Designs Test, (6) Cube Construction Test, (7) Picture Completion Test.16

13 Ibid., pp. 23-57.
14 Ibid., p. 1.
15 Ethel L. Cornell, and Warren Coxe, A Performance Ability Scale: Examination Manual (Yonkers, New York: World Book Company, 1934), 88 pp.

The authors adopt a unique way of determining mental ages. For them a mental age is neither the median of the scores of a given age group nor the median of the ages of those making a given score. Rather, it is a somewhat arbitrarily determined median between these two values, which makes it decidedly questionable and ambiguous.17

In constructing this scale, the authors were interested primarily in developing a supplementary instrument for those of the Binet type and other verbal tests. Cornell and Coxe say in this regard:

One important value of any scale supplementary to the Binet scale lies in the fact that if the two scales used give different results, the psychologist's attention is directed toward discovering reasons for whatever differences may be found, and his analysis and interpretations are thereby enriched and tend to have greater validity.18

Recent use of performance tests. A significant trend is represented in several studies of the pattern of performance, as distinguished from gross achievement, of the individual given performance intelligence tests. Stimulated somewhat by the recent application of factor analysis to test data of this sort, this trend reflects chiefly a wider acceptance among psychometricians of the need for more detailed understanding of the individual through test performance.

16 Ibid., pp. 11-36.
17 Mursell, op. cit., p. 171.
18 Cornell and Coxe, op. cit., p. 37.

Another new trend is toward testing concept formation. This has come principally from experimental work in clinical psychology on disturbances in intellectual processes with brain injuries. More recently, some of these experimental methods have been adapted to the testing of normal groups.
A considerable number of studies have appeared in recent years which have utilized sorting tests to throw m light on the process of concept formation. One of these sorting tests is the Goldstein-Scherer Cube Test,^ which is a modification of the Kohs Block Design Test. Another 20 test of this kind is the Weigl Color-Form Sorting Test. In this test the subject is asked to sort various colored geometric shapes. Two possible solutions are successively sought— sorting by color and sorting by form, although not necessarily in this order. Responses to these tests were found to show two ^ Kurt Goldstein, and Martin Sherer, "Abstract and Concrete Behavior: An Experimental Study with Special Tests," Psychological Monographs, 53* No. 2:1-151* 19^-1. 20 Egon Weigl, "On the Psychology of the So-called Processes of Abstraction," Journal of Abnormal and Social Psychology, 36:1-33* January, 1941. g g _ kinds of approach: making a classification based entirely on the perceptual impact of the situation, which is called the concrete attitudes; and behavior characterized by con ceptual or abstract, volitional attempt to discover cate gories of classification. 21 Heldbreder, in her studies on the attainment of concepts by using card-sorting tests, found that there is a hierarchical order of concept formation which is deter mined by factors within the organism, but this order can -be varied by the amount of "situational support (perceptual or semantic cues contained in the test cards) toward one 22 or another mode of conceptual behavior." A different type of test, but one also concerned with concepts, or what its author calls "education abil ity," was developed and quantitatively standardized by Raven. J This is a series of sixty matrix tests, each test a design from which a part has been removed. The tests mentioned above are especially interest ing because they are non-verbal approaches to the problem pi Edna E. Heidbreder, "The Attainment of Concepts: i VII. 
Conceptual Achievements During Card-Sorting,1 1 Journal I of Psychology, 27:3-39* January, 1949. ' 22 i Ibid., p. 3. ^ John C. Raven, "Testing the Mental Ability of Adults," Lancet, 242:115-17* January, 1942. 59~ of.testing concept formation. Further research may throw more light on the relations of non-verbal approaches to the development of concepts and abstract thinking. Summary and evaluation of performance tests. Per formance tests have been found most helpful with persons handicapped by language disabilities. They are useful for the deaf, the illiterate, and foreigners, and those who have speech or reading disabilities. They are also valuable in helping to identify chil dren who are shy or inarticulate because of emotional reasons, and who, therefore, may appear at a disad vantage on verbal tests of mental ability.2^ They are also useful in identifying mentally retarded I adolescents and adults. Performance tests offer a good opportunity for clinical observation. They can provide more than a rat ing in the form of a numerical index. The subject’s ap proach to the test problem might be revealed through the variety of tasks that they offer. Thus, they are more helpful than verbal tests for studying some types of cases. An important reason for including performance tests ] in studies of adults is that different mental func tions decline at different rates after maturity is reached. Language tests and arithmetic tests remain stable well into the 30's, before slow decline sets pli Frank S. Freeman, Theory and Practice of Psy chological Testing (New York: Harper and Brothers, I Publishers, 1949)7 P- 1 6 3. YO In* But performance on non-language tests begins to decline in very early adulthood. 
    This decline in perceptual and speed tests, and perhaps in all tests involving spatial reasoning, makes it important to include all types of tests in assaying the competence of a mature adult.25

In addition to their advantages, performance tests have practical limitations. These tests are affected by the speed factor, the manipulative dexterity factor, and the systematic method of working. "This raises some question of their power to discriminate valid intellectual responses or responses calling for general intelligence."26

These tests were first constructed as a substitute for verbal tests of the Binet type, but it is sounder to regard them as supplements. The reason for this conclusion is that many studies showed that the partial coefficient of correlation falls at .50 or lower (the average degree of resemblance which exists between the two types of tests is not high). Although the two types may measure some functions in common, they are not interchangeable. Each type may measure functions different from those measured by the other type.

Finally, the performance tests may "indicate general mental ability more with younger and inferior subjects than with superior and older individuals."27

25 Lee J. Cronbach, Essentials of Psychological Testing (New York: Henry Holt and Company, 1949), p. 185.
26 Mursell, op. cit., p. 169.

II. TESTS OF EARLY DEVELOPMENT

Tests of early development have been included in this chapter because of their nature. They consist, mostly, of non-verbal material. This section will deal specifically with tests designed for young children.

Tests of early development fall into two categories. The first is that of infant tests, usually of motor skills, used in measurement primarily up to age two, though many are limited to the first year of life. The second category is that of the pre-school group, ranging from approximately age two to age five or beyond.
Some of these scales, for the greatest part, are not tests as that term is commonly understood. They are, rather, developmental scales, grouped at respective average level, and derived from observation of children's behavior and experimentation in a variety of situations.

Psychological tests for children at these ages are used for three purposes: (1) to determine how far the child has progressed in the kinds of development normally taking place at his age; (2) to predict, insofar as possible, future development and intellectual status; (3) to measure the intelligence of infants and young children for adoption. Agencies for the placement of children in foster homes are greatly concerned with this problem.

27 Cronbach, op. cit., p. 163.

The nature of tests of early development. Infant tests are primarily observation scales of growth and development of motor skills; so these scales have included many items of this type. Norms exist for these tests but much subjective judgment enters into the picture.

The pre-school tests include items of spatial perception, verbal skills, and performance skills. Norms have been adopted for these tests and judgment is less subjective.

Representative tests of early development. Since the Binet-Simon in 1905 many tests have been published which purport to measure the intelligence of children under five. Much research has been done as to the actual value of the tests as predictive measures of later intellectual status, and much of the literature is controversial, though the findings appear to follow along certain lines.

Two tests will be chosen from among many for brief description.

1. Cattell Developmental and Intelligence Scale.28 This scale covers the range from two to thirty months. Many of the items were adopted from Gesell and his co-workers, others were collected from various sources including the Stanford-Binet, and still others were original.
Test items were grouped at age levels as they are in the Stanford-Binet. There are twenty-two levels. During the first year each level covers only one month, and during the second year two months. In the first half of the third year each level covers three months, and from thirty months to the end of the fourth year each level covers six months.

    Strictly speaking, the scale . . . has been so constructed as to constitute an extension downward of Form L of the Stanford-Binet tests. Between the ages of twenty-two and thirty months Stanford-Binet items are intermingled with other items. Thus, using the infant test items for the early months and the Stanford-Binet tests for the older ages with a mixture of the two between, one continuous scale from early infancy to maturity has been attained.29

An inspection of the scale reveals that the tests at the earlier level are of non-verbal type and involve responses to sensory stimulation, by turning head, grasping, manipulating and so on. A gradual change in direction into more manipulatory tasks takes place at about five months. A vocabulary test appears at eleven months. From this point on more verbal tests are utilized although manipulatory tasks still predominate.

28 Psyche Cattell, The Measurement of Intelligence of Infants and Young Children (New York: Psychological Corporation, 1940), 274 pp.
29 Ibid., p. 24.

The following two age levels illustrate the nature of the scale.

Two months
    (1) Attends to voice
    (2) Inspects environment
    (3) Follows ring in horizontal motion
    (4) Follows moving person
    (5) Babbles
    (Alt. a) Follows ring in vertical motion
    (Alt. b) Lifts head in prone position

Thirty months
    (1) Differentiates bridge from tower
    (2) Imitates drawing lines and circles
    (3) Stanford-Binet three-hole form board rotated
    (4) Folds paper
    (5) Stanford-Binet identifying objects by use
    (Alt. a) Identifies pictures from name
    (Alt. b) Concept of one30

2.
The California Preschool Mental Scale.31 This scale covers the range from one and a half years to six years and consists of the following subtests: (1) manual facility, (2) block building, (3) drawing, (4) form discrimination, (5) spatial relationships discrimination, (6) size and number discrimination, (7) language comprehension, (8) language facility, (9) immediate recall, (10) completions.32

The scale yields three types of score: mental ages and resulting I.Q.'s, standard deviation scores, and profiles based on the various types of tests.

30 Ibid., pp. 97-104, 258-68.
31 Adele S. Jaffa, The California Preschool Mental Scale: Form A (Berkeley, California: University of California Press, 1934).
32 Mursell, op. cit., pp. 184-85.

Summary and evaluation of intelligence tests of early development. Though certain individuals who designed infant and preschool tests seem to have found certain predictive value for their tests, research studies by others do not bear them out. Results of research studies in the field show that infant tests have little value for predicting later I.Q.'s or scholastic ability. Tests given after age two and on up to six show higher correlation with the results of later intelligence tests. The infant tests are only predictive of how well the subject is apt to perform on a similar test given a short time later. Correlations drop with increasing time span between tests.

To support the above conclusion some studies will be mentioned briefly.

Anderson33 concluded that infant tests, as at present constituted, measure very little, if any at all, of the function that is called intelligence at later ages.

33 John E. Anderson, "The Prediction of Terminal Intelligence from Infant and Preschool Tests," The Thirty-ninth Yearbook of the National Society for the Study of Education, Part I (Bloomington, Ill.: Public School Publishing Co., 1940), pp. 385-403.
Preschool intelligence tests, while they are instruments of some value and usefulness, measure only a portion of that function.

Bayley, in 1933, checked the Kuhlman, the Gesell, and the Jones test and found very low correlations between the results of these tests and the results of later tests. She concluded that,

    The behavior growth of early months of infant development has little predictive relation to the later development of intelligence even though the latter behavior may depend in large part on previously matured elementary neural connections or behaviorist patterns.34

Herring, in a study on the reliability of the Buhler Baby Tests, concluded,

    Though the tests seemed to indicate a fair degree of reliability over a limited time interval there was little consistency between scores over a period of several months. For these infant tests as for most of the infant tests so far reported, the predictive value over a long period of time is slight.35

34 Nancy Bayley, "Mental Growth During the First Three Years: A Developmental Study of Sixty-one Children by Repeated Tests," Genetic Psychology Monographs, 14, No. 1:74, July, 1933.
35 Amanda Herring, "An Experimental Study of the Reliability of the Buhler Baby Tests," Journal of Experimental Education, 6:159, December, 1937.

Honzik,36 in testing 252 children at Berkeley, found that an initial test given at the age of twenty-one months gives a negligible prediction of test success on the Stanford-Binet at six or seven years. Tests given at three years or later to children who have test experience are much more predictive (.58 to .76) at age six or seven years, yet the correlations are not high enough to warrant full confidence if one were adopting a child.

There are a number of possible reasons why these tests have low predictive value. First, the infant tests may be measuring some factors which have no relation to intelligence as measured at a later date, as they are primarily sensorimotor in nature.
Second, research studies use the Binet I.Q. as the criterion, which may be a fallacious criterion. The assumption must be made that the I.Q. remains constant, and this assumption is open to question. Finally, the environmental factors at play on the child may have a tremendous bearing upon his mental growth.

36 Marjorie P. Honzik, "Constancy of Mental Test Performance During the Preschool Period," The Pedagogical Seminary, 52:285-302, June, 1938.

    In spite of their low predictive value for young infants, these scales are of assistance to an experienced clinical psychologist in appraising a child's mental development when attention is given principally to analysis of performance on the various parts rather than to numerical scores, and when the analysis is used in conjunction with other clinical data.37

Finally, infant tests may never be of great predictive value because so many factors of an environmental nature are at work upon the child in his formative years that may influence his mental growth in either direction.

37 Freeman, op. cit., p. 261.

CHAPTER V

DEVELOPMENT OF GROUP INTELLIGENCE TESTS (VERBAL AND NON-VERBAL TYPE)

The development of the individual intelligence tests has been discussed in Chapters III and IV. In this chapter, the trends in developing group intelligence tests will be discussed, except those based on factor analysis, to which a later chapter will be allotted. No attempt will be made to analyze or even list all the group tests now on the market. Special consideration will be given to a few of the better-known or more adequate examinations, but this consideration will not be extensive because it is difficult to make indisputable generalizations about most of the tests.

Early development of group intelligence tests. Early attempts at mental testing were largely individual in type. The procedures used followed the method set by Binet.
General opinion held that group tests could not be well enough controlled to merit consideration as a useful instrument in testing programs. Fortunately, a few individuals, attracted by the tremendous time saving that the use of group tests would bring about, went ahead despite general opinion.

In ____ Whipple published a book on tests. Many of these tests were suitable for groups. The publication of this book emphasized the accelerating interest in tests during the period prior to World War I. In his introduction, Whipple recognized the growing interest in group tests when he stated:

    One need not be a close observer to perceive how markedly the interest in mental tests had developed during the past few years. Not very long ago attention to tests was largely restricted to a few laboratory psychologists; now tests have become the objects of attention of many workers whose primary interest is in education, social service, medicine, industrial management and many other fields in which applied psychology promises valuable returns.1

Sylvester, in recalling his experiences with testing prior to World War I, stated:

    The writer recalls a situation in 1916 in which the urge toward group testing was felt most keenly. This was before the hastened development of group methods under pressure of necessity for marshalling the Nation's man-power in the organization of our citizenship into armies for services in the World War. The medical school of the State University of Iowa had been invited to make a comprehensive study of the pupils in the public schools of Wapello, Iowa. Each pupil was given medical examination by specialists in a dozen fields of medicine, dentistry, and school hygiene. Experts on hereditary and environmental factors studied the family and community aspects of each child's life. The writer was assigned the task of measuring all children as to intelligence and of diagnosing the mental deficiency of those presenting problems.
    At that time our only available measuring devices were the Stanford-Binet Scale, the Yerkes Point Scale, and Performance Scales. To apply any of these to 500 children was a task entirely beyond our resources or personnel and time.

    We met the situation by adopting several of the Yerkes Point Scale Tests to group application. After they have been given and scored, clinical psychologists took the children one at a time, and cleared up any doubtful responses that had been written in the group testing, and applied the remaining tests to the point scale. Thus fairly accurate scores were secured. Results from the Wapello survey were so satisfactory that the method of group screening and testing was applied in other schools in Iowa. . . . These early efforts are reported as representative of such attempts at group mental testing of that early period. With the exception of parts of the Otis' test no actual products of that period remain in use today, but those beginnings revealed the possibilities of the method and prepared the way for the miraculously rapid formation of the Army Test in 1917.2

1 Guy M. Whipple, Manual of Mental and Physical Tests, Simple Processes (Baltimore: Warwick and York, Inc., 19--), p. v.
2 Reuel H. Sylvester, "Group Mental Tests," Clinical Psychology, editor, Robert A. Brotemarkle (Philadelphia: University of Pennsylvania Press, 1931), pp. 145-46.
3 Clark L. Hull, Aptitude Testing (New York: World Book Company, 1928), p. 15.
4 Ibid., p. 17.

Otis3 was one of those few who went ahead, preparing a test that could be administered to a large number of subjects at the same time. Otis turned this test over to the Armed Services immediately. Robert M. Yerkes and his assistants modified the material provided by Otis into the now famous Army Alpha Test for military aptitude. As the Binet is the direct ancestor of many individual tests, so is the Army Alpha, influenced as it was by the material
provided by Otis, the direct ancestor of many group tests of today.5

A great impetus was given to the construction and use of group tests by the advent of mental testing in the army during the World War in 1917-18. The need was great for testing thousands of men in a short time, so some sort of group method was obviously a necessity. In addition, instead of individuals working alone, a group of psychologists were working together to construct group tests for measuring intelligence. Under these circumstances, group methods developed rapidly and reached a degree of development that otherwise probably would have required many years to attain.

The pioneers who developed the early group tests had no models to imitate. Thus, it seems logical that any test that happened to be built would have more chance to be accepted without adequate criteria and thorough analysis. This actually was the situation in many instances. In those cases where criteria were demanded, less specific and detailed standards were required. Also, relatively more attention was paid to norms and less to such a criterion as validity.

5 Ibid., p. 18.

The nature of group intelligence tests. Group tests do not differ very much in their types of material. Certain types of material are found in many group tests and a description of some of them will be given for illustrative purposes.

(a) Opposites. In this kind of test the subject is asked to respond by writing down or indicating the opposite of a given word. For example:

    Direction: Underline the word which is opposite, or most nearly opposite, in meaning to the beginning word of each line:
    1. Exit      1 emit   2 transcend   3 entrance   4 origin   5 arrival
    20. Depress  1 press  2 elate       3 oppress    4 exhort   5 climb6

(b) Analogies.
In this type of test, the analogy between two words is given and the subject has to decide as to a similar analogy with reference to another pair. For example:

    Direction: Pick out the word which belongs to the third word in the same way that the first two words belong together:
    1. People: houses    birds:   rivers   nests   fields
    20. Baby: man        Colt:    cow   cattle   horse7

6 Lewis M. Terman and Quinn McNemar, Terman-McNemar Test of Mental Ability: Form £ (Yonkers, New York: World Book Company, 1941).
7 Rudolf Pintner and Walter N. Durost, Pintner-Durost Elementary Test, Scale 2: Form A (Yonkers, New York: World Book Company, 1940), pp. 6-7.

(c) Best reasons. This test appears in many forms. It is often called a test of common sense or comprehension. The subject is asked to indicate in some form or other the best answer to a question. For example:

    Direction: Read each statement and underline the number of the answer which you think is the best:
    10. The saying, "If the shoe fits, wear it," means
        1. Be sure to buy shoes that fit.
        2. Give the devil his due.
        3. Do not take unnecessary steps.
        4. Recognize your own faults and virtues.8

(d) Disarranged Sentences. In this type of test, the words in a sentence are scrambled and the subject is asked to arrange them properly. For example:

    Direction: Look at the disarranged sentences, "think out" the right order for the words, and then do what each sentence tells you to do:
    3. Sentence underline this the in shortest word.
    40. Order sentence the write this with rearranged right words the in.9

8 Terman and McNemar, op. cit.

(e) Arithmetical Problems. In this kind of test, the subject is given problems in arithmetic for solution. For example:

    1. How long is half of 8 minutes?
    15. How many times as heavy is 1/2 of a load which weighs a ton and a half as a load weighing half a ton?10

(f) Information. In this kind of test, the subject is tested regarding his knowledge of general information.
For example:

    Direction: Underline the correct word.
    1. Euchre is played with   dice, rackets, cards, pins.
    2. John Wesley was famous in   literature, science, war, religion.11

(g) Sentence Completion. In this type of test, words in a sentence or passage are omitted and the subject is asked to fill them in. For example:

    Direction: Fill in each blank:
    1. Fish swim ________ the water.
    2. Boys _______ girls like to ______ ball.12

9 Wilford S. Miller, Miller Mental Ability Test: Form A (Yonkers, New York: World Book Company, 1921).
10 Melvin E. Haggerty and others, National Intelligence Tests: Scale A, Form 2 (Yonkers, New York: World Book Company, 1920).
11 Rudolf Pintner, Intelligence Testing: Methods and Results (New York: Henry Holt and Co., 1932), p. 187.

(h) Classification, Generalization. There are many tests requiring some kind of generalization, classification or logical selection. These may be tested in several different ways.

    Example 1: Direction: "In each row draw a line under each of the two words that tell what the thing always has:"
    Table - books, cloth, dishes, legs, top.
    Idiocy - crime, foolishness, poverty, stupidity, tuberculosis.

    Example 2: Direction: "Which one of the five things below is most like these three: horse, pigeon, cricket."
    1 stall; 2 saddle; 3 eat; 4 goat; 5 chirp.

    Example 3: Direction: Draw a line under the two words which tell what the thing always has:
    A circle always has: altitude, circumference, latitude, longitude, radius
    Abhorrence always involves: aversion, dislike, fear, rage, timidity.13

12 Haggerty and others, op. cit.
13 Pintner, op. cit., pp. 188-89.

(i) Number Completion. "This calls for discovering the rule or method in the arrangement of a series of numbers and indicating this in some way." For example:

    Direction: Look at each row of numbers below, and on the two dotted lines write the two numbers that should come next:
    3   4   5   6   7   8     .....  .....
    81  27  9   3   1   1/3   .....  .....
    3   6   8   16  18  36    .....  .....14

(j) Word Knowledge. This is tested by asking for the meaning of single words or of words used in sentences.

(k) Non-Verbal Material. Almost all of the verbal material is duplicated in some form or other in non-verbal material. Most of this type of material cannot be conveniently reproduced here.

Much of the verbal material has been replaced by pictures. The logical sequence test is carried out by arranging a series of pictures so that they will form a logical sequence. The analogy test is presented by pictures, and the subject has to choose from several pictures the one which makes the best fourth picture. The similarity test is carried out by similar pictures which require the discrimination between similarities and opposites.

14 James C. De Voss (Director), Army Alpha Intelligence Tests: Form < 3 in Public School (Emporia, Kansas: Bureau of Educational Measurement and Standards, 1920).

Also, the group non-verbal tests include in their content the picture absurdity test, the picture completion test, the picture arrangement test, the maze test, and the cube test which requires the subject to tell how many cubes are in the pile. Code tests are also very common. There are many tests which require the subject to copy geometrical forms. Aesthetic judgment is tested by marking the prettiest of three or more pictures or diagrams. The dot imitation test, which requires the subject to draw lines from one dot to another in accordance with the movement of a pointer, is also used. Finally, to become familiar with all kinds of non-verbal material, the reader must study the various tests themselves.

Trends in developing group tests.

    The evolution of group intelligence tests . . . has been much less clear-cut than that of individual intelligence testing.
    Many of the early group tests are still widely used, usually in revisions and sometimes with new names, with refinements and improvements, but without basic alteration.15

The Army (Alpha and Beta) tests will be discussed in some detail as representative of early group intelligence tests. Later in this section, an attempt will be made to discuss those tests which have shown new trends in the development of group intelligence tests.

15 James L. Mursell, Psychological Testing (New York: Longmans, Green and Company, 1950), p. 142.

The Army tests (Alpha and Beta) and their chief revisions. The Army tests were the first major development in group intelligence testing. The original tests have become outmoded, although some of their revisions are still used to some extent.

    But many of the persistent problems of group measurement defined themselves then, and many of its characteristic concepts and methods were established. Thus some understanding of the original Army testing program is valuable as leading to a comprehension of the development, since that time.16

Two major tests were developed in connection with the Army testing. The first one was called the Alpha Scale, the second one was called the Beta Scale.

The Alpha Scale was a group test which was suitable for men who could read and understand the English language. The test consisted of the following: (1) Following directions; (2) Arithmetical problems; (3) Practical judgment; (4) Synonym-antonym; (5) Disarranged sentences; (6) Number series; (7) Analogies; (8) Information.17

16 Mursell, loc. cit.
17 Clarence S. Yoakum and Robert M. Yerkes, Army Mental Tests (New York: Henry Holt and Company, 1920), pp. 53-53^

There are five different forms of this test, all roughly of the same degree of difficulty. It is, therefore, useful in testing groups where there is danger of coaching.
The test is scored on the number of right responses, except with subtests 4 and 5, in which a deduction for error is made to compensate for chance.

The Army Alpha has been revised many times. The first Nebraska Revision followed the original make-up and scoring system.18 The Schrammel-Brannan Revision19 was intended for grades 4 to 16, and the original five forms were reduced to three. The oral directions and instructions were reduced, and the test was made largely self-administering.

Another important revision was made by Wells.20 He eliminated the practical judgment subtest and replaced it with a numerical subtest. Percentile norms were supplied for high school boys and girls, for seventh and eighth grades, and for adult men.

18 Mursell, op. cit., p. 143.
19 Loc. cit.
20 Frederic L. Wells, "Army Alpha Revised," Personnel Journal, 10:411-17, April, 1932.

The Beta Test. This test was intended for those unable to read English. It consisted of a variety of pictures and diagrams. The tests are: (1) Maze Drawing; (2) Cube Analysis; (3) X-O Series, or completing series of crosses and circles; (4) Digit Symbol; (5) Number Checking; (6) Drawing Completion; (7) Geometrical Construction.21 The directions are explained with gestures or are demonstrated on a blackboard chart. The scoring is on the number of right responses, except with the code substitution subtest.

The most important revision of Beta was that made by Kellogg and Morton.22 Some changes were made in the subtests, but they were not very important. The cube analysis and X-O series tests were eliminated and a picture discrimination test was introduced.

Points of significance concerning the Army Tests (Alpha and Beta). 1. The criteria of validity. The validity of the tests as measures of intelligence was checked against the following criteria: . . .
    officer rating of men, army rank as an outcome of survival of the fittest, other kinds of intelligence scales, professional success, and ability to learn as evidenced by school standing. Not only has the scale as a whole been thus checked up, but also every one of the separate parts making up the scale.23

21 Yoakum and Yerkes, op. cit., pp. 100-18.
22 Chester E. Kellogg and N. W. Morton, "Revised Beta Examination," Personnel Journal, 13:94-100, August, 1934.
23 Yoakum and Yerkes, op. cit., p. 9.

As far as the selection of subtests is concerned, two criteria were kept in mind.

    . . . If, for example, the relationship between two of the tests is very high, it is possible that the tests are repetitive and that one of them is unnecessary. On the other hand, an extremely low relationship between one of the tests and the total score might indicate that the test should be omitted because it adds little to the measurement of intelligence yielded by the group of tests as a whole.24

But, in constructing the tests, the authors had to abandon the first criterion (low correlation between subtests), because the mean intercorrelation of the subtests of Alpha was about .61, and those with the highest relationship to total score had the closest relationship to one another.

2. Mental age. The major reports from the Army mental testing led to the "discovery" of an average mental age for the American soldier of around thirteen years. This aroused a widespread controversy. An explanation of this phenomenon has been offered by Freeman:

    A carefully selected group of men were given the Army Alpha, and also the Stanford-Binet. The Stanford-Binet mental ages of these men were found. By a comparison of these mental ages with the Alpha scores of the same men, the mental ages which are equivalent to the various Alpha scores were calculated.
    This procedure assumes that scores made by children and by adults on the same mental test represent equivalent mental capacities. The results of the army test seem to give conclusive evidence that this assumption is not correct. While the discrepancy may be explained in part by other minor factors, the chief explanation must be this lack of equivalence of the results of the test given to children who are in school and are accustomed to doing tasks similar to those demanded by the tests, and to adults who have been out of school for from six to ten years or more, and have lost a good deal of their adeptness for performing tasks which involve clerical skill. It is unsafe, therefore, to interpret the mental-age ratings of adults, when they are obtained in this way, as meaning the same thing as they have been found to mean in our experience with children.25

24 Ibid., p. 6.
25 Frank N. Freeman, Mental Tests, Their History, Principles and Applications (Boston: Houghton Mifflin Company, 1939), p. 128.

This explanation involves the rather singular implication that as soon as a person leaves school his intelligence ceases to be stimulated, and also it raises the broad issue of the effect of environment upon intelligence.

Finally, it should be noted that the Army project was a definitely successful venture in group intelligence testing on a very large scale. It was soon followed by the construction of other group tests; for example, the Otis Test in 1918, the Pressey Test in 1918, the Haggerty Test in 1919, the Whipple Test in 1919, and the National Intelligence Test by affiliated psychologists in 1920.

A departure from the Army practice: The Otis Self-Administering Test of Mental Ability.26 This test was developed quite early. It represents a departure from original Army practice.

26 Arthur S. Otis, Otis Self-Administering Tests of Mental Ability, Grades 4-9, 10-adult (Yonkers, New York: World Book Co., 1922).
There are two levels of this test, the Intermediate Examination for Grades 4 to 9 and the Advanced Examination for High Schools and College.

The "self-administering" feature refers to and is based upon the "scrambled" or "spiral omnibus" arrangement. That is, the test is not divided into subtests, but different types of items appear mixed up throughout the test, beginning with easy items and proceeding to more difficult ones.

The test yields a point score that can be transformed into mental ages and so-called intelligence quotients. Otis suggests the following way to find the I.Q.:

    The IQ of an individual may be found as follows: Add to 100 the number of points by which a pupil's score exceeds the norm for his age, or subtract from 100 the number of points by which a pupil's score falls below the norm for his age.27

Thus, the Otis I.Q. is different from the Binet I.Q. because it is based on the deviation or difference of the subject's score from the mean score for the age group.

27 Arthur S. Otis, Otis Self-Administering Tests of Mental Ability, Manual of Directions and Key (Yonkers, New York: World Book Company, 1922), p. 51.

Some items in the Otis Self-Administering tests are given below for illustrative purposes:

    34. Of the five things below, four are alike in a certain way. Which is the one not like these four?
        1 smuggle, 2 steal, 3 bribe, 4 cheat, 5 sell
    36. The opposite of hope is?
        1 faith, 2 misery, 3 sorrow, 4 despair, 5 hate
    40. If 2 1/2 yards of cloth cost 30 cents, how many cents will 10 yards cost?
    44. One number is wrong in the following series. What should that number be?
        0 1 3 6 10 15 21 28 34
    59. Revolution is related to evolution as flying is to
        1 birds, 2 whirling, 3 walking, 4 wings, 5 standing
    74. A statement the meaning of which is not definite is said to be?
        1 erroneous, 2 doubtful, 3 ambiguous, 4 distorted, 5 hypothetical28

Far departure from the Army practice: The CAVD (Thorndike and others).29
This test has been discussed very briefly in Chapter III as representing a complete departure from Binet. The test is included in this section because it was designed as a group test, although it can be administered individually at the lower age levels.

28 Arthur S. Otis, Otis Self-Administering Tests of Mental Ability, Higher Examination: Form B (Yonkers, New York: World Book Company, 1922), pp. 3-4.
29 Edward L. Thorndike, and others, The Measurement of Intelligence (New York: Teachers College, Columbia University Bureau of Publications, 1927), 616 pp.

Thorndike stressed in this test what he believed to be three important aspects of intelligence: (1) altitude, or the difficulty of the tasks that can be carried out, (2) breadth, or the number of tasks of equal difficulty that can be done in a given time, (3) speed, or the quickness with which responses can be made. Altitude was considered to be the most essential characteristic of intelligence, and the range was said to correlate with it. Thus, the scale was set up primarily to measure altitude of intellect.

The intellectualness of our total inventory of tasks, and the intellect whose level or altitude, range or width, and facility or quickness it measures, will be called hereafter Intellectualness CAVD and Intellect CAVD (the symbol CAVD refers to four series of tasks which constitute it--completions, arithmetical problems, vocabulary and directions). The total series of tasks concerns four lines of ability:
C. To supply words so as to make a statement true and sensible.
A. To solve arithmetical problems.
V. To understand single words.
D. To understand connected discourse as in oral directions or paragraph reading.
The arrangement of scoring is such as to attach equal weight to each of these four varieties of tasks.30

There are seventeen groups of tests each measuring a level of intellect, from level A to level Q.
Each level is represented by forty items or tasks, arranged in order of difficulty, ten for each of the four kinds of material mentioned above, that is, ten completion items, ten arithmetic items, and so forth. Also, the steps between levels are of approximately equal difficulty. The lowest level is suitable for three-year-old children while the highest levels are intended for superior adults. A striking feature of the scale is that it purports to measure increase in intelligence in levels or units of equal value at any part of the scale.

30 Ibid., p. 65.

In taking the test, the subject starts at a level where he can achieve everything and then proceeds to that level where he fails practically everything. There is no time limit and so the test becomes one of power, and not of speed.

The scale yields three kinds of scores, namely, altitude scores which mean the per cent of items passed, range scores which mean the per cent of items passed at each level, and area scores which mean the sum of all success.

The range of difficulty in the same category of mental tasks may be illustrated by the following examples.

Sentence completion, level A (lowest level)
You are sitting on a ______ .

Sentence completion, level Q (highest level)
It must seem to the wisest men, when brought into contact with the great things of nature ____ they is nothing __ to infinitude of ____ they are ignorant.

Arithmetic, level A
Counts 2 pennies

Arithmetic, level Q
Let n = any number
Let nr = 1 divided by n
Let ng = 10 divided by n
Let n [symbol illegible in source] = the number raised to the same power as itself
What does ( [expression illegible in source] ) equal?
Vocabulary, level A
Show me the horse (to be indicated in series of pictures)

Vocabulary, level Q
Radical (means) 1 light 2 agitator 3 straight line 4 root 5 ray

Directions, level A
"Make a ring, like this," showing act

Directions, level Q
A rather long poem, entitled "Dirge in Woods" is read. The following is one of the questions asked: What is as [illegible in source] as the masses?31

Complete departure from the Army practice: California Test of Mental Maturity. The tendency in recent group tests has been in the direction of breaking down the total score, mental age, or I.Q. into two or more aspects of mental ability. Thus, the California Tests of Mental Maturity have been constructed to measure language and non-language factors. The test is set up for five levels: Pre-Primary, Primary, Elementary, Intermediate, and Advanced. All of them are designed to test the same "factors": namely, visual acuity, auditory acuity, motor coordination, memory, spatial relationships, reasoning, and vocabulary.

31 Ibid., pp. 66-94.

The first three subtests are used "to identify individuals with defects sufficiently serious to prevent obtaining a valid diagnosis or measurement when they take any paper-and-pencil test."32 These first three are not included in the scoring. The remaining "factors" are broken down as follows:

Memory: immediate and delayed recall.
Spatial relationships: sensing right and left, manipulation of areas.
Reasoning: opposite, similarities, analogies, number series, numerical quantity, inference.
Vocabulary: word-knowledge.33

The test yields three kinds of M.A.'s and three kinds of I.Q.'s, namely, a language M.A. and I.Q., a non-language M.A. and I.Q., and a total M.A. and I.Q. The scale, also, is accompanied by profiles . . .
designed to show graphically the relative extent to which each student possesses these abilities (which are tested by the test) thus enabling the teacher to see at a glance the probable sources of difficulty or success and to provide to the maximum the guidance which such a profile may suggest.34

32 Ernest W. Tiegs, and others, Manual: California Test of Mental Maturity-Advanced (Los Angeles: California Test Bureau, 1951), p. 2.
33 Ernest W. Tiegs, and others, Manual of Direction: California Test of Mental Maturity-Intermediate Series (Los Angeles: California Test Bureau, 1942), pp. 2-4.
34 Ibid., p. 1.

It seems profitable to mention the attitude of the authors concerning the validity of a testing instrument. It is rather unusual. They say:

The validity of any test is difficult to establish; there are no purely objective criteria or standards which correspond to the factors or abilities in terms of which conceptions of mentality are currently described.
The authors of these tests believe that the multiple factor theory of intelligence comes nearer to explaining observable phenomena than does the strong central factor theory alone. They recognize the importance of philosophical contributions, but they believe that progress in determining the nature of mentality and the value of tests of mental maturity are dependent largely upon further studies in factor analysis which employ analytical and statistical techniques. This series of tests recognize contributions already made by including samplings of verbal ability, mathematical ability, spatial relations, and logic.35

Apparently the authors believe that they have established "face validity" by accepting the results of factor analysis.

Kuhlmann has expressed doubts as to the value of labeling tests by the functions they measure or are supposed to measure.
He says:

We do not believe there is much merit in labeling tests as regards functions measured, as the authors have done, first because it cannot be done correctly by inspection; and second, because these labels are not of much value until we know also how these functions enter into school achievement in different school subjects. . . . It would be hazardous, indeed, to conclude from a score on two brief tests that a child has a poor memory, for example.36

35 Ibid., pp. 1-2.
36 Oscar K. Buros, editor, The Nineteen Forty Mental Measurement Yearbook (Highland Park, New Jersey: 1941), p. 209.

Summary and evaluation. Individual tests are quite tedious to give, especially when a large number of persons must be tested; so, group intelligence tests have been developed to be given to a large number of individuals at the same time. There is no doubt that the group tests are of value from a practical standpoint in that they enable the examiner to test a large group of subjects in a short period of time with a fair degree of success.

In spite of time saved by using group tests, these are not as useful as are individual tests in studying an individual case because it is not possible to observe the subject's approach to the solution of the problem, nor his behavior. Also, it is not possible to evaluate his responses, since group tests are scored quite rigidly. Furthermore, group tests are somewhat limited in the sorts of abilities tapped.

An inspection of most group tests reveals that they are heavily weighted with verbal material. In this respect Cronbach says:

The common tests are heavily verbal. While several of them include problems requiring reasoning about numbers or geometric forms, verbal items account for the bulk of the variation in scores. This is warranted by the fact that verbal items generally correlate well with other sorts of items. Nearly all group tests are thought of by their designers as scholastic aptitude tests.
Even the Army, in Army Alpha and the General Classification Test, sought to predict which men could most readily learn new duties. Since school demands verbal facility at every turn, this justifies the usual verbal loading of tests.37

Group tests are, with few exceptions, speed tests. Although speed is not the determining factor in causing high or low scores on most group tests, as shown by some studies,38 the current trend in constructing new tests is to provide enough time for nearly everyone to finish.

Most group tests are based upon the "general factor" theory of intelligence; for most of them measure a person's mental activities by means of several kinds of tasks and then rate the individual by means of a single index. A few tests are based upon the group factor theory.

The development of group non-verbal tests of mental ability "has been hampered by the difficulty of eliminating language completely." Another difficulty is that most non-verbal items do not call for very complex mental processes. However, some of the authors of these tests maintain that the non-verbal tests require essentially the same type of intelligence performance as that required by the abstract symbols of language and number. They hold that problems presented in diagrams, pictures, charts, and geometric forms closely parallel those presented by means of language and number. In this respect Cronbach says:

37 Lee J. Cronbach, Essentials of Psychological Testing (New York: Harper and Brothers Publishers, 1949), p. 175.
38 Mary W. Bennett, "Factors Influencing Performance on Group and Individual Tests of Intelligence: I. Rate of Work," Genetic Psychology Monographs, 23:237-318, May, 1941.

. . . With the exception of 'spatial' items which measure ability to perceive relations among forms, most non-verbal items are too easy to measure superior adolescents and adults.
Some recent tests have demonstrated ingenious solutions to this problem. But if 'higher intelligence' involves ability to do abstract thinking, it is hard to conceive of an adequate high-[level te]st that does not involve vocabulary and con[cepts].39

Finally, different tests of the same functions are not interchangeable. Although they may measure some functions in common, or are in other ways interrelated, each one also measures functions different from those of other tests.

39 Cronbach, op. cit., p. 188.

CHAPTER VI

DEVELOPMENT OF INTELLIGENCE TESTS BASED ON FACTOR ANALYSIS

The development of individual and group intelligence tests of general mental ability has been discussed in the previous chapters. This chapter will concern the development of intelligence tests through the application of factor analysis.

It seems, in the last few years, that the development of intelligence tests has been in one sense rapid, but at the same time so-called tests of general intelligence have not advanced much beyond their status a few years ago. That is to say, the directions which research in the field has taken have been away from over-all tests and toward a breakdown into more adequate tests of "factors" or "functions."

Factor analysis is highly technical and only a relatively simple explanation can be presented here. Freeman says:

The technique is essentially a search for the psychological activities which are at the basis of and determine test performance. All techniques of factor analysis are statistical and based upon the correlation coefficient. After the statistical calculations have been made, it is necessary for the investigator to bring to bear his psychological insights to interpret and name his statistical findings. Tests contain a variety of items. What psychological functions do the various types of items have in common? Are these functions in common between various tests of verbal performances? Between verbal and numerical?
Between spatial perception and numerical ability? Between reasoning with verbal and with nonverbal materials? These are among the questions the factor analyst seeks to answer. After he has found his answer, at least tentatively, he proceeds to construct a scale in which items are included and so grouped as to measure only, or almost solely, the factors he has segregated from his preliminary testing and statistical analysis.
The factor analyst does not begin with a definite set of preconceived mental functions. He tries to discover which psychological process, or components, are necessary to explain his data. Yet, it should be noted, he must at the very outset have some conception of the kinds of test items to include in preliminary experimentation. Thus what he ultimately distills out as factors is basically dependent upon his original conceptions regarding his preliminary items. The factor analyst, in seeking the components of 'intelligence,' for example, does not start with tests of color perception, tone discrimination, or finger dexterity.1

It will be helpful to describe in more detail the background of intelligence tests of factor analysis.

Spearman's two-factor theory. As early as 1904, about a year before the first public appearance of the famous Binet scale for measuring general, or global, intelligence, the field of statistics known as factor analysis was founded by Charles Spearman. In that year Spearman had published his article, "General Intelligence, Objectively Determined and Measured," giving an exposition of his two-factor theory.2

1 Frank S. Freeman, Theory and Practice of Psychological Testing (New York: Henry Holt and Company, 1949), p. 89.

In this article, he argued that, . . .
all branches of intellectual activity have in common one fundamental function (or group of functions) whereas the remaining or specific elements of the activity seem in every case to be wholly different from that in all the others.3

This publication marked the first serious attempt to apply mathematical procedures to mental organization.4 Twenty-three years later, he published the first complete, detailed exposition of his theory in The Abilities of Man, with some modifications of his earlier reports.5

Spearman's theory was inspired by a "curious" observation in many tables of intercorrelations between the measurements of different abilities (scores of tests, marks for school subjects, or estimates made on general impression). Statistical investigation of the phenomenon observed in these tables of intercorrelation enabled him to work out what he called the "tetrad equation," which became the chief substantiation for his theory. Having worked out his "tetrad equation," Spearman found mathematical proof for the following principles:

2 Charles E. Spearman, "General Intelligence, Objectively Determined and Measured," American Journal of Psychology, 15:201-293, April, 1904.
3 Ibid., p. 284.
4 Florence L. Goodenough, Mental Testing: Its History, Principles and Applications (New York: Rinehart and Company, 1949), p. 280.
5 Charles E. Spearman, The Abilities of Man, Their Nature and Measurement (New York: Macmillan Co., 1927), 415 pp.

Whenever the tetrad equation holds throughout any table of correlations, and only when it does so, then every individual measurement of every ability (or of any other variable that enters into the table) can be divided into two independent parts which possess the following momentous properties.
The one part has been called the 'general factor' and denoted by the letter 'g'; it is so named because, although varying freely from individual to individual, it remains the same for any individual in respect of all correlated abilities. The second part has been called the 'specific factor' and denoted by the letter 's.' It not only varies from individual to individual, but even for any one individual from each ability to another.6

Thus, Spearman postulated the g factor, in the first place, to explain correlations that he found to exist among diverse sorts of mental activities. That is to say, he concluded that all mental activity is to some extent dependent upon and an expression of this general factor; and the magnitude of the correlation coefficient found between any two forms of mental activity reveals the extent to which this g factor is operative in each and common to both. Since the intercorrelations are by no means perfect, Spearman postulated the existence of specific factors, called the s factor, each of which is largely specific to a particular type of activity.

6 Ibid., pp. 74-75.

Spearman, after pointing out that standard mental testing has certain theoretical and structural defects, holds that all these defects can be resolved by turning to his theory of two factors. He says:

From all these defects in the current methods of testing intelligence, let us turn to another method that claims to overcome them. It has been designated as that of Two Factors. Its essential nature consists in splitting the score of the subject at any test (or sub-test) into two independent parts, of which the one, 'G,' is quite general, whereas the other, 'S,' is narrowly specific. Here, it is said, the fatal equivocality vanishes. For G has been shown to measure a perfectly definite mental power; this may be roughly described as that of originative thinking in contrast with merely sensing or reproducing.
More precisely, the process has been designated as 'eduction' and has been shown to take two sharply distinguishable forms. In one, the person sets out from two or more items of knowledge and goes on to perceive how these are mutually related. For instance, he might hear and understand the words 'good' and 'bad,' which qualities he should then perceive to be 'contrary' to one another. In the other form of eduction, the person starts from any item of knowledge together with some appropriate relation, and then conjures up to mind the other item so related to the first one. For instance, he might somehow be given the ideas 'good' and 'contrary to,' and thence he should derive the further idea of 'bad.' In these two forms of eduction, it is asserted, are comprised the whole and nothing but the whole of genuine insightful knowledge.7

The first major test in practice of Spearman's theory was undertaken by Burt8 in 1909 when he studied the intercorrelations for two groups of Oxford students on twelve tests.

7 Charles E. Spearman, "Intelligence Tests," Eugenics Review, 30:250, January, 1939.
8 Cyril Burt, "Experimental Tests of General Intelligence," British Journal of Psychology, 3:94-177, December, 1909.

In an important study, Brown and Stephenson9 substantiated the theory of two factors on an adequate statistical basis. This research was partially in answer to a critical article by Pearson and Moul10 in which they suggested some twelve to fifteen abilities.

One of the most active critics of the "g" concept was Thorndike, who asserted that in many cases the self-correlations as well as the intercorrelations of the measures used by Spearman were too low to justify the theory. He questioned the very existence of any such universal trait.
According to Thorndike, an individual has no general ability, but only a very large number of specific abilities consisting of elements that overlap to varying degrees; any such generality as may exist is inherent in the nature of the tasks performed.

9 William Brown, and William Stephenson, "A Test of the Theory of Two Factors," British Journal of Psychology, 23:352-70, April, 1933.
10 K. Pearson and M. Moul, "The Mathematics of Intelligence, I, The Sampling Errors in the Theory of a Generalized Factor," cited by Karl J. Holzinger, Review of Educational Research, 9:528, December, 1939.

Bifactor theory. In more recent years, when psychologists started to use larger and more varied batteries of tests, they found the tetrad criterion was generally not satisfied. Since this finding implies that a single general factor is not sufficient to account for the intercorrelations, a more elaborate theory was required.

The bifactor theory as developed by Holzinger11 postulated a general factor, a number of group factors, and factors specific to each test. Holzinger's bifactor technique is an improved and mathematically simplified form of Spearman's two-factor method. Holzinger retained a large general factor but placed greater emphasis on group factors than did Spearman. After the general factor was removed, the bifactor method sought clusters of tests which show significant residual correlations with each other and zero correlations with the remaining tests. Each such cluster was assumed to contain a group factor. In addition, each test was assumed to contain a specific factor. The fundamental equation, as Holzinger wrote it, made each test score dependent upon the general factor, one group factor, and a specific factor.

11 Karl J. Holzinger, and Frances Swineford, A Study in Factor Analysis: The Reliability of Bi-Factors and Their Relation to Other Measures (Chicago: The University of Chicago Press, 1942), 88 pp.
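The tetrad criterion and the bifactor extension described above can be illustrated numerically. The following is a minimal sketch in Python under stated assumptions, not Holzinger's or Spearman's own computing procedure: the loadings in `g` and `h` are hypothetical values chosen only for illustration, tests are assumed standardized, and factors are assumed uncorrelated, so that the model correlation between two different tests is the sum of the products of their loadings on shared factors. A tetrad difference for four tests a, b, c, d is taken here as the difference between any two of the products r_ab*r_cd, r_ac*r_bd, and r_ad*r_bc.

```python
from itertools import combinations


def model_correlations(g, h=None):
    """Correlation table implied by a general factor (loadings g) and,
    optionally, one group factor (loadings h), assuming uncorrelated factors."""
    n = len(g)
    h = h if h is not None else [0.0] * n
    return [[1.0 if i == j else g[i] * g[j] + h[i] * h[j]
             for j in range(n)] for i in range(n)]


def max_tetrad_difference(r):
    """Largest absolute tetrad difference over every set of four tests."""
    worst = 0.0
    for a, b, c, d in combinations(range(len(r)), 4):
        p1 = r[a][b] * r[c][d]
        p2 = r[a][c] * r[b][d]
        p3 = r[a][d] * r[b][c]
        worst = max(worst, abs(p1 - p2), abs(p1 - p3), abs(p2 - p3))
    return worst


# Hypothetical loadings for five tests (illustrative only, not real data).
g = [0.8, 0.7, 0.6, 0.5, 0.4]   # general factor
h = [0.5, 0.5, 0.0, 0.0, 0.0]   # group factor shared by the first two tests

one_factor = max_tetrad_difference(model_correlations(g))
bifactor = max_tetrad_difference(model_correlations(g, h))

print("general factor only:", one_factor)   # vanishes up to rounding error
print("with a group factor:", bifactor)     # clearly nonzero
```

With a general factor alone every tetrad difference vanishes, which is the condition Spearman relied upon; adding a group factor to two of the tests leaves a residual of about .075 in this invented example, which is the kind of departure that led to the bifactor formulation.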
Burt's group factor method12 closely resembled the bifactor solution of Holzinger.

A simple non-mathematical method of factor analysis which was presented by Tryon13 may be considered as being essentially of the bifactor type. He grouped the tests into clusters and obtained final correlation profiles which reveal the essential nature of the underlying factors.

Multiple-Factor theory. The theory held by a large number of contemporary psychologists proposes a relatively small number of moderately broad group factors, each of which may enter with different weights into different tests. Such theory has been variously designated Multiple-Factor or Group-Factor theory.

The earliest important contribution to this theory was provided by Kelley14 in Crossroads in the Mind of Man. He contended, after a critical analysis of Spearman's data, that the general factor is of relatively minor importance. He attributed the major relationships among tests to a relatively small number of broad group factors.

12 Cyril Burt, "Factor Analysis by Sub-Matrices," Journal of Psychology, 6:339-75, October, 1938.
13 Robert C. Tryon, Cluster Analysis: Correlation Profile and Orthometric (Factor) Analysis for the Isolation of Unities in Mind and Personality (Ann Arbor, Michigan: Edwards Brothers, 1939), 120 pp.
14 Truman L. Kelley, Crossroads in the Mind of Man: A Study of Differentiable Mental Abilities (Stanford University, California: Stanford University Press, 1928), 238 pp.

One of the leading exponents of this theory today is Thurstone. Since the publication of The Vectors of Mind,15 Thurstone has contributed a number of modifications to his method (centroid method) which he has used to obtain his factors. A simple variation of this method was also furnished by Woodrow and Wilson.16

One of the more important applications of Multiple-Factor analysis was made by Thurstone.17, 18 He identified several "primary" mental abilities.
The analyses and interpretation of Thurstone and others led Freeman to conclude:

. . . that certain mental operations have in common a 'primary' factor which gives them psychological and functional unity and which differentiates them from other mental operations. These mental operations, then, constitute a 'group.' A second group of mental operations has its own unifying 'primary' factor; a third group has a third; and so on. In other words, there are a number of groups of mental abilities (the number being, as yet, undetermined) each of which has its own 'primary' factor, giving the group a functional unity and cohesiveness. Each of these 'primary' factors is said to be relatively independent of the others.19

15 Louis L. Thurstone, The Vectors of Mind: Multiple-Factor Analysis for the Isolation of Primary Traits (Chicago: University of Chicago Press, 1935), 266 pp.
16 Herbert Woodrow, and Lawrence Wilson, "A Simple Procedure for Approximate Factor Analysis," Psychometrika, 1:245-58, December, 1936.
17 Louis L. Thurstone, "The Isolation of Seven Primary Abilities," Psychological Bulletin, 33:780-81, December, 1936.
18 Louis L. Thurstone, "A New Rotational Method in Factor Analysis," Psychometrika, 3:199-218, December, 1938.

Primary mental abilities; Thurstone's investigations. Spearman and Holzinger spent many years in the search for unitary mental traits. These and other studies have received less attention than the work of Thurstone; the claims and possibilities of factor analysis are best illustrated in his work, however.20

In 1938, Thurstone21 reported upon an experiment in which a battery of fifty-seven psychological tests was given to 240 University of Chicago students.
These tests were loosely classified as follows: (1) Abstraction (five tests); (2) Verbal (eight tests); (3) Space (nine tests); (4) Form (four tests); (5) Number (six tests); (6) Numerical Reasoning (four tests); (7) Verbal Reasoning (three tests); (8) Space Reasoning (three tests); (9) Rote Learning (six tests); and (10) Unclassified (nine tests). This grouping was tentative, for the main purpose of the study was to isolate primary mental abilities.

19 Freeman, op. cit., pp. 84-85.
20 Lee J. Cronbach, Essentials of Psychological Testing (New York: Harper and Brothers, Publishers, 1949), p. 205.
21 Louis L. Thurstone, Primary Mental Abilities (Chicago: University of Chicago Press, 1939), 121 pp.

Thurstone carried out the analysis to thirteen factors through the use of the centroid system of analysis, then he tentatively suggested the following factors:

S Spatial
V Verbal Relations
I Induction
P Perceptual
M Memory
R Restriction
N Numerical
W Word Fluency
D Deduction

Factors 10, 11, 12, and 13 may be regarded as obscure. Test items, to be considered meaningful in their contribution to the concept of a common element, were required to have a factor loading of .40 or above.

Probably the most ambitious project ever attempted in this area was reported by Thurstone22 in 1941, under the title, Factorial Studies of Intelligence. In this study, the Thurstones prepared a battery of tests suitable for the fourteen-year age level. The tests were administered to children in the eighth grade of fifteen Chicago elementary schools.

22 Louis L. Thurstone, and Thelma G. Thurstone, Factorial Studies of Intelligence (Chicago: University of Chicago Press, 1941), 94 pp.

A factorial analysis of the data yielded the following factors, which seem most readily identifiable:
V Verbal-comprehension
N Number
W Word-fluency
M Memory
S Space
I Induction

Then the Thurstones concluded as follows:

The single factor loadings show that the verbal factor has the highest loading and the rote-memory factor the lowest loading on the common general factor in the primary abilities. This general factor is what we have called a 'second-order general factor.' It makes its appearance, not as a separate factor, but as a factor inherent in the primaries and their correlations. If further studies of the primary mental abilities of children should reveal this general factor, it will sustain Spearman's contention that there exists a general intellective factor. Instead of depending on the averages of centroids of arbitrary test batteries for its determination, the present method should enable us to identify it uniquely.
We have not been able to find in these data a general factor that is distinct from the primary factors, but the second-order general factor should be of as much psychological interest as the more frequently postulated, independent general factor of Spearman. Our findings seem to support Spearman's claim for a general intellective factor, but he has been so critical of our work on the primary mental abilities that it is uncertain whether he would accept our support for a general intellective factor. We have not found any occasion to take sides as regards the existence of a general intellective factor, either as a factor independent of the primaries or as a factor operating through correlated primaries.
We have reported on primary mental abilities in adults, which seem to show only low positive correlations except for the two verbal factors. Here we find higher correlations among the primary factors for eighth-grade children. It is now an interesting question to determine whether the correlations among primary abilities of still younger children will reveal, perhaps even more strongly, a second-order general factor.23
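The notion of a general factor that appears only through the correlations among the primaries may be made concrete with a small numerical sketch. The correlation table below is invented for illustration and is not the Thurstones' data; the extraction shown is the first principal component found by power iteration, used here only as a simple stand-in for their second-order analysis, whose actual procedure is not reproduced. The names `LABELS`, `R`, and `first_factor_loadings` are hypothetical.

```python
import math

# Order of the six primaries as listed in the text.
LABELS = ["V", "N", "W", "M", "S", "I"]

# Hypothetical correlations among the primaries (illustrative only).
R = [
    [1.00, 0.40, 0.55, 0.30, 0.35, 0.55],  # V
    [0.40, 1.00, 0.35, 0.20, 0.30, 0.40],  # N
    [0.55, 0.35, 1.00, 0.25, 0.25, 0.45],  # W
    [0.30, 0.20, 0.25, 1.00, 0.15, 0.25],  # M
    [0.35, 0.30, 0.25, 0.15, 1.00, 0.40],  # S
    [0.55, 0.40, 0.45, 0.25, 0.40, 1.00],  # I
]


def first_factor_loadings(r, iterations=200):
    """Loadings on the first principal component of a correlation table,
    found by power iteration (a stand-in for a second-order factor)."""
    n = len(r)
    v = [1.0] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(r[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        eigenvalue = norm  # converges to the largest eigenvalue
    return [x * math.sqrt(eigenvalue) for x in v]


loadings = first_factor_loadings(R)
for label, value in zip(LABELS, loadings):
    print(label, round(value, 2))
```

In this invented table the verbal primary carries the largest loading and memory the smallest, which mirrors the pattern the Thurstones describe; a different table, or a different method of extraction, would give different figures.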
Thurstone does not claim that factors will remain the same if some of the elements in the correlation matrix are changed or if a different population is used for deriving the correlations. In addition the factors, in all likelihood, change with variations in difficulty of the tests, practice effects, reliability of the tests, and requirement of speed.

23 Ibid., p. 26.

Thurstone followed his preliminary investigations with attempts to construct tests, in various editions and forms, which purport to measure what he calls the "Primary Mental Abilities." The number of the factors employed varies from one form or edition of the test to other forms or editions. The important point is that these factors, or "Primary Mental Abilities," are said to be relatively independent of each other.

Thurstone's recent volume24 (1947) sets forth the underlying theory of factorial analysis in its relation to test construction and interpretation. The nature of his factors is too involved to discuss here, but his conclusions represent the latest of the most widely accepted concepts of mental abilities.

24 Louis L. Thurstone, Multiple Factor Analysis: A Development and Expansion of the Vectors of Mind (Chicago: University of Chicago Press, 1947), 535 pp.

Representative tests of factor analysis.

1. Chicago Tests of Primary Mental Abilities. This scale, for ages eleven to seventeen, is constructed upon the group factor theory of mental abilities; that is, upon the theory that intelligence consists of the operations of certain distinguishable and relatively independent mental functions. An endeavor had been made to derive tests in which there is a heavy saturation of a primary factor and in which other factors are minimized. The six primary factors measured by this battery may be identified as follows:

The Number factor (N) is involved in the ability to do numerical calculation rapidly and accurately.
It is not dependent upon the reasoning factors in problem solving, but seems to be restricted to the simpler processes, such as addition and multiplication.
The Verbal factor (V) is found in tests involving verbal comprehension, for example tests of vocabulary, opposites and synonyms, completion tests, and various reading comprehension tests.
The Space factor (S) is involved in any task in which the subject manipulates an object imaginably in two or three dimensions. The ability is involved in many mechanical tasks and in the understanding of mechanical drawings. Such material cannot be used conveniently in testing situations, so we have used a large number of tasks which are psychologically similar, such as Flags, Cards, and Figures.
The Word Fluency factor (W) is involved whenever the subject is asked to think of isolated words at a rapid rate. It is for this reason that we have called the factor a Word Fluency factor. It can be expected in such tests as anagrams, rhyming, and producing words with a given initial letter, prefix, or suffix.
The Reasoning factor (R) is involved in tasks that require the subject to discover a rule or principle covering the material of the test. The Letter Series and Letter Grouping tests are good examples of the task.
A Memory factor (M) has been clearly present in all test batteries. The tests for memory which are now being used depend upon the ability to memorize quickly. It is possible that the memory factor will be broken down into several Memory factors.25

The validity of the scale is reported in a manner quite different from those which are used with other scales thus far discussed. Thurstone feels that a search for primary mental abilities would not be helped by attempts to get higher validity, that is, higher correlations between a battery and a criterion such as academic marks.
Rather, he feels that the scale would be valid if the correlations between the primary abilities were relatively low; for such correlations would show relative independence of factors, as required by the theory, and would thus satisfy the underlying group-factor theory of mental abilities.

The idea of general intelligence was given up, and with it went the familiar global score or rating, whether in the nature of a mental age, a percentile, or a standard score. An individual's raw scores obtained on this scale, for each of the primary abilities, are converted into percentile ranks which are then plotted on a profile, because "the principal purpose of the present test is to obtain a profile of the six primary mental abilities for each child. . . ."²⁶

²⁵ Louis L. Thurstone, Instruction for Examiner-The Chicago Tests of Primary Mental Abilities (Washington, D.C.: The American Council on Education, 1941), pp. 5-9.

Burt has criticized this test:

To most British psychologists the omission of a separate measurement for 'intelligence' (i.e. innate, general, cognitive ability) will seem a serious defect. . . . In the Chicago Tests the presence of a general factor seems indicated by the positive intercorrelations between the primary abilities (.15 to .55); and its similarity to 'intelligence' is attested by the relative magnitudes of the factor-loadings for the several tests (highest for reasoning, decidedly high for the verbal abilities--possibly because there is so large a verbal element in most of the tests--low for memory and the space factor). It probably includes a marked element of speed and may underrate those whose capabilities are exhibited more freely in practical or manual problems. The authors recognize the presence of this 'second order general factor' and its 'psychological interest;' and presumably a measure of it, if required, could be obtained by taking a simple (or better a weighted) total of the six scores.
The attempt to measure special abilities, however, is of great interest and should be particularly instructive to British users. . . . There can be little doubt that the six abilities selected by the Chicago tests are of great importance both for school success and for vocational guidance; and, to judge by our own limited experience, this set of tests appears to assess them (at any rate with older pupils) far more effectively than any other existing set of group tests. Nevertheless, I believe that the assessment of special abilities or aptitudes is far more difficult than the assessment of general intelligence.²⁷

²⁶ Ibid., p. 19.

²⁷ Cyril Burt, The Third Mental Measurement Yearbook (New Brunswick: Rutgers University Press, 1949), p. 225.

2. Tests of Primary Mental Abilities.²⁸ This battery was published in 1946 and was intended for five- and six-year-old children. The tests measure five factors:

   1. Verbal Meaning is measured by tests of vocabulary, sentence comprehension, sentence completion, paragraph comprehension, and auditory discrimination by indicating appropriate pictures.
   2. Perceptual Speed is measured by tests of identical pictures and identical forms by crossing out the appropriate picture.
   3. The Quantitative factor is measured by tests of counting, comprehension of quantitative concepts, and story problems by indicating the appropriate number of pictures.
   4. The Motor factor is measured by drawing lines connecting dots in parallel rows.
   5. The Space factor is measured by items requiring the child to select from a series of diagrams the one which will complete an incomplete square, and by items requiring him to copy in the element or elements omitted from an incomplete design from the complete design which is presented.

²⁸ Louis L. Thurstone and Thelma G. Thurstone, Examiner Manual for the Tests of Primary Mental Abilities (Chicago: Science Research Associates, 1946), 24 pp.

The tests are designed for group administration.
The battery yields scores for the five factors plus general learning ability (total). These scores can be converted into mental ages and into quotients, and the ability scores yield a profile.

Summary and evaluation. Factor analysis continues to show a few factors that recur in factoring various tests and using various methods of analysis, but there is still a great deal of confusion. A few unsolved problems, to which attention has been called, remain. These are: the number of different factors found in any given test, the degree to which factor loadings are found constant for the same type of test by different investigators, the non-intellective factors that may influence any test at different times, and the relative complexity of factors at different age levels.

Different methods of factor analysis may yield different factors. Holzinger and others²⁹ compared bi-factor and multiple-factor methods and found that three uncorrelated group factors revealed by the bi-factor method were resolved into at least four by a multiple-factor method. Fruchter³⁰ identified at least two factors in verbal fluency: F, related to the flow of responses, and S, related to the selection of responses required for the solution of the problem.

²⁹ Karl J. Holzinger and others, The Estimation of Pupil Ability by Three Factorial Solutions (Berkeley: University of California Press), 252 pp.

There has been considerable debate concerning the reality of mental factors, and on this point Wolfle says:

Thomson, Thurstone, and Tryon have repeatedly criticized the naivete of supposing that every factor necessarily represents an ultimate and unitary mental ability. None of the major students of factor analysis ever held such a view, but some of their critics have fallen into the easy error of accusing both Spearman and Thurstone of it because of the names they have given to their factors.
Spearman's concept of g as the total fund of mental energy and Thurstone's 'primary traits' and 'primary abilities' are easily misinterpreted. The ordinary connotations of the word 'primary' are such as to foster the notion that Thurstone has, or believes he has, isolated the basic and ultimate causes of differences in ability.³¹

There is no doubt that factor analysis has at least a superficial resemblance to the theory of faculty psychology. Burt³² has remarked that many of the factors that are being announced are given almost exactly the same names that Gall long ago attributed to his mental faculties. Also, some prominent workers in the field have declared that factors are mental faculties under another name, the only difference being that faculties were arrived at a priori while factors are arrived at by dint of statistical analysis. But Cronbach says: "Although the theory superficially resembles faculty psychology, it accepts none of the faculty psychologists' ideas as to the nature and source of abilities."³³

³⁰ Benjamin Fruchter, "The Nature of Verbal Fluency," Educational and Psychological Measurement, 8:33-47, Spring, 1948.

³¹ Dael Wolfle, Factor Analysis to 1940 (Chicago: The University of Chicago Press, 1940), p. 26.

³² Cyril Burt, "Mental Abilities and Mental Factors," British Journal of Educational Psychology, 14:85-94, June, 1944.

Some workers in the field attempted to find the relationship between the Primary Mental Abilities battery and the usual criteria, for example, success in school. Goodman³⁴ and Ellison and Edgerton³⁵ found little relationship between factor scores and achievement in school subjects that might be thought likely to be associated with them. Also, Stuit³⁶ and Hudson and Adkins³⁷ reported that the factor patterns revealed by the Primary Mental Abilities battery have little relationship to vocational fitness and choice. Other studies of prediction indicated that factor analysis in its present stage of development "has not produced tests which are superior to non-factorial diagnostic tests for practical purposes."³⁸

³³ Cronbach, op. cit., p. 196.

³⁴ Charles H. Goodman, "Prediction of College Success by Means of Thurstone's Primary Mental Abilities Tests," Educational and Psychological Measurement, 4:125-40, Summer, 1944.

³⁵ Mary L. Ellison and Harold A. Edgerton, "The Thurstone Primary Mental Abilities and College Marks," Educational and Psychological Measurement, 1:399-406, October, 1941.

Finally, any detailed discussion of the technical methods and procedures of intelligence test construction by factor analysis is outside the scope of this chapter. A comprehensive, non-technical presentation of the various methods of factor analysis, their limitations and usefulness, may be found in a publication by Dael Wolfle.³⁹

³⁶ Dewey B. Stuit and Harry H. Hudson, "The Relation of Primary Mental Abilities to Success in Professional Schools," Journal of Experimental Education, 10:179-82, March, 1942.

³⁷ James L. Mursell, Psychological Testing (New York: Longmans, Green and Company, 1950), p. 414.

³⁸ Cronbach, op. cit., p. 210.

³⁹ Wolfle, op. cit., 69 pp.

CHAPTER VII

CULTURAL FACTORS IN CURRENT INTELLIGENCE TESTS: STUDIES DEALING WITH I.Q. OR TOTAL SCORE

In the recent history of intelligence testing, there has been an ever-increasing emphasis on the influence of cultural factors on intelligence test performance. Psychologists are coming to recognize more and more that the individual's attitudes, responses, interests, and goals--as well as what he is able to accomplish in practically any area--cannot be discussed independently of his cultural frame of reference. There is a mass of evidence to indicate that cultural differentials are also present in motor and in discriminative or perceptual responses.
It is perhaps unnecessary to point out that the term "culture" is used in this context in its very widest sense, to include not only the very striking contrasts in ways of living with which the ethnologist deals, but also the minor variations within one's own culture which are more directly the province of the sociologist and the psychologist. Both types of differences in culture may determine how a subject will react to any test situation.

Every intelligence test is a sample of behavior. As such, intelligence tests should reflect every factor which may influence behavior. No one would think of giving an intelligence test standardized upon American children to a child in Iraq, or South Africa, and expect the results to mean very much. By the same token, no one would expect an intelligence test standardized upon urban children to be a fair measure of the intelligence of children living in isolated areas. Therefore, it is important to examine the relationship between the culture and the I.Q., and to explore the extent to which intelligence test performance depends upon the cultural background of the person tested.

Almost since the advent of intelligence testing, educators and psychologists have debated and investigated the relationship of the I.Q. to the various cultural factors. These investigators have shown that there is a definite and measurable relationship between the scores the pupils obtain on intelligence tests and their cultural background. There are a large number of studies which have dealt with this problem since the time of Binet. Some of these studies will be reported in this and the following chapter. This chapter is designed to give a general overview of research knowledge and of many research studies dealing with the relationship of I.Q.'s or intelligence test scores to various measures of cultural and social factors.
In the next chapter, more detailed reviews will be found of the few studies which deal with the analysis of individual items.

I.Q. and socio-economic factors. There is a wealth of evidence that intelligence is more or less closely related to socio-economic factors such as parent's occupation, home conditions, institutional conditions, the character of the community, and the like. It has been shown that intelligence test performance does vary, systematically, with differences in socio-economic background, no matter what tests, what measures of socio-economic background, what age levels of pupils, or what statistical techniques are used.

An attempt is made in this section to present an overview of the major findings of many studies dealing with this problem. For the sake of brevity, this is done in tabular form, even though by doing so it is necessary to omit many important characteristics and qualifications which would have to be stated if the full significance of the findings of each study were to be assessed. Only a very few simple facts will be reported for each study: (a) the author; (b) date of publication; (c) test used, and whether analysis is in terms of I.Q., M.A., or some other score; (d) measure used as index of social status; (e) number of cases; (f) age or grade range covered; and (g) major findings. Where possible, the major findings of each study are reduced to one or two figures.

1. Studies dealing with test scores and parental occupation. There seems to be no doubt that there is a basic relationship between the I.Q.'s of children and the occupational level of their parents. This has been demonstrated by a large number of studies. Some of them are summarized in Tables II, III, and IV.

Table II shows that the correlation between parental occupation and intelligence scores ranges from the low .20's to .60.
It should be noted that the studies yielding the lowest correlations are all based upon pupils in England (Duff-Thompson, MacDonald, and Gray-Moshinsky). The other studies, based upon pupils in the United States, and using Stanford-Binet, yield correlations ranging from .30 to .60.

TABLE II

SUMMARY OF RESEARCH STUDIES DEALING WITH CHILD'S I.Q. AND PARENTS' OCCUPATIONS WHEN RESULTS WERE REPORTED IN CORRELATION FORM

Author          Date  Correlation  Number     Range included      Tests used
                      coefficient  of pupils  Ages      Grades
Duff-Thompson   1923  .28a         13,625     11-12     --        Northumberland Mental
MacDonald       1925  .26a         2,047      11-12     --        Northumberland Mental
Stoke           1927  .30          508        --        1-3       Stanford-Binet, Dearborn
Leahy           1935  .45          194        5-15      --        Stanford-Binet
Gray-Moshinsky  1935  .25a         9,000      9-12 1/2  --        Otis Advanced
Bayley          1940  .54-.60      47         7-10      --        Stanford-Binet

a. Contingency coefficient.

Table III reports many of the studies which deal with the mean or median I.Q. for two groups of pupils, the highest and the lowest reported by the authors on the occupational ladder. The fourth, fifth, and sixth columns of Table III indicate the typical I.Q. found for children of these two extreme groups, and the difference between the two. Without a single exception, the children of the highest group are found to score higher than the children of the lowest group, and in most of the studies the difference is substantial. Three of the studies reported were based upon pupils not in the United States: two in England (Duff-Thompson, Gray-Moshinsky) and one in Canada (Sandiford). There is no clear tendency for the size of these differences to vary consistently with the ages or grades included, or with the test used. No completely valid conclusion can be inferred from these studies because of the variety of procedures used and the varying spread of social status apparently involved in the different studies.

TABLE III

SUMMARY OF RESEARCH STUDIES DEALING WITH CHILD'S I.Q. AND PARENTS' OCCUPATION WHERE RESULTS WERE REPORTED IN TERMS OF MEAN OR MEDIAN I.Q.

Author          Date  Measure Mean or median I.Q.  Number of pupils       Range included     Tests used
                              High  Low   Differ.  High    Low    Total   Ages      Grades
Arlitt          1921  Median  126   83    43       48      71     304     --        Primary  Stanford-Binet
Dexter          1923  Mean    115   89    26       225     522    2,782   --        1-8      Dearborn, National
Duff-Thompson   1923  Mean    121   91    20       13      35     13,635  11-12     --       Northumberland Mental
Fukuda          1925  Mean    101   84    17       29      24     257     --        1-8      Stanford-Binet
Sandiford       1926  Median  105   101   4        659     456    5,296   Adults    High S.  Army Alpha (modified)
Collins         1928  Mean    115   94    21       ?       ?      4,727   --        1-6      Stanford-Binet, Otis Group
Armstrong       1931  Median  125   100   25       12      10     114     --        4-8      Otis Intermediate
Jordan          1933  Median  105   88    17       39      104    1,252   --        1-7      Pintner-Cunningham, Dearborn, National
Gray-Moshinsky  1935  Mean    134   112   22       ?       ?      9,000   9-12 1/2  --       Otis Advanced
Leahy           1935  Mean    119   102   17       40      23     194     5-14      --       Stanford-Binet
McNemar         1937  Mean    117   97    20       ?       ?      959     10-14     --       Stanford-Binet
Maddy           1943  Mean    112   96    16       166     153    319     --        6        National, Kuhlmann-Anderson, Otis Self-Administering
Robinson        1946  Mean    109   96    13       41      67     491     --        3        Kuhlmann-Anderson

Table IV reports additional studies dealing with parental occupation and the child's performance on intelligence tests in terms of raw scores, percentiles, percentage above the mean, or some other measure. The results of these studies are not readily comparable with each other, because of the differences in the units involved, but without exception the difference is in favor of the highest occupational group.

TABLE IV

SUMMARY OF RESEARCH DEALING WITH CHILD'S I.Q. AND PARENT'S OCCUPATION WHEN RESULTS REPORTED WERE NOT IN TERMS OF MEAN OR MEDIAN I.Q.

Author           Date  Measure            Score of group    Number of pupils          Range included Tests used
                                          Highest  Lowest   Highest  Lowest  Total    Ages   Grades
Pressey-Ralston  1919  --                 85a      39a      57       248     548      10-14  --      Pressey Group
Pressey          1920  --                 79a      38a      21       138     337      6-8    --      Pressey Primer
Colvin-MacPhail  1924  Median score       53       42       174      310     2,532    --     12      Brown University Psychological
Bear             1926  Median score       51       37       21       11b     85       --     13      Otis Higher
Byrnes-Henmon    1936  Median percentile  68       41       6,029    24,618  100,820  --     12      Ohio State University Psychological, A.C.E. Psychological, Henmon-Nelson
Canady           1936  Median score       98       73       49       164     441      --     13      A.C.E. Psychological
Glass            1936  Mean percentile    65       43       ?        ?b      118      --     13      A.C.E. Psychological
Livesay          1941  Mean score         178      102      191      546     1,896    --     12      A.C.E. Psychological
Smith            1942  Mean percentile    59       37       1,003    97      5,487    --     12      ?

a. Per cent above median score for entire age group.
b. Farmers.

Two additional studies which deal with the same occupational factor, and are based upon pupils in other countries, may be mentioned briefly. Syrkin¹ found superior test scores for children of Soviet officials as compared with children of workers. Even when vocabulary differences are eliminated, the superiority of one group over the other remains. Another investigator² found that for one thousand ten-year-olds in Czechoslovakia the mean Binet I.Q.'s ranged from 90 for children of day laborers to 117 for the children of university educated parents.

¹ M. Syrkin, "Analysis of the Content of a Test from the Point of View of Social Classes," cited by Gertrude Hildreth, Review of Educational Research, 5:211, June, 1935.

² Conference Internationale de Psychotechnique, Comptes Rendus de la VIIIe Conference Internationale de Psychotechnique. Cited by Noel Keys, Review of Educational Research, 8:245, 1938.

2. Studies dealing with test scores and general socio-economic status. Differences in socio-economic status reflect themselves in the intelligence test performance of children. The relationship is clearly shown in Tables V and VI.

There are a large number of studies which report correlations between I.Q.'s and social status as measured by the Sims social-economic scale, or other similar comprehensive scales. Some of these studies are summarized in Table V. The correlations reported in this table vary even more widely than those in the preceding table, which is perhaps to be expected in view of the greater variety of instruments used for measuring social status. It will be noted, however, that correlations based on the Sims scale alone are found to vary from .20 to .57.

TABLE V

SUMMARY OF RESEARCH STUDIES DEALING WITH CHILD'S I.Q. AND GENERAL SOCIAL STATUS WHEN RESULTS WERE REPORTED IN CORRELATION FORM

Author             Date  Correlation  Number     Range included       Tests and score units                         Measure of
                         coefficient  of pupils  Age    Grades                                                      socio-economic status
Fukuda             1925  .53          200        --     1-8           Stanford-Binet (I.Q.'s)                       Whittier (modified)
Chapman-Wiggins    1925  .32          632        --     6-8           National (I.Q.'s)                             Chapman-Sims
Freeman            1928  .48          401        9      --            Stanford-Binet (I.Q.'s)                       Special
Chauncey           1929  .20          243        --     8-9           McCall Multimental (scores)                   Sims
Winch              1930  .21-.65      233        7-13   --            Special test (scores)                         Special
Cuff               1933  .24          758        --     13            A.C.E. Psychological (percentiles)            Sims
Leahy              1935  .37-.53      194        5-15   --            Stanford-Binet (I.Q.'s)                       Special
Bruce              1940  .55-.57      187        6-13   --            Stanford-Binet, Kuhlmann-Anderson (I.Q.'s)    Sims
Honzik             1940  .35-.42      213        6-8    --            Stanford-Binet (I.Q.'s)                       Special
Bryan              1941  .49          169        --     Intermediate  Otis Self-Administering (I.Q.'s)              Sims
Shaw               1941  .31          ?          --     4-8           Otis Self-Administering (I.Q.'s)              Sims
Stroud             1942  .28          535        --     6             Otis Self-Administering (I.Q.'s)              Special
Havighurst-Breese  1947  .21-.41      90         13     --            Chicago Primary Mental Abilities (T scores)   Special

Table VI reports some of the studies which deal with the relation of socio-economic status to I.Q.'s in terms of the mean or median.
It will be noticed that the differences between the two groups reported are, without exception, in favor of the children who came from the high class or middle class, and, in most of the studies, the difference is substantial.

TABLE VI

SUMMARY OF RESEARCH STUDIES DEALING WITH CHILD'S I.Q. AND GENERAL SOCIAL STATUS WHEN RESULTS WERE REPORTED IN TERMS OF MEAN OR MEDIAN I.Q.

Author            Date  Measure Higher group (N)   I.Q.  Lower group (N)   I.Q.  Diff.  Total  Ages   Tests used                           Socio-economic measure
Terman            1915  Mean    Superior (102)     107   Inferior (80)     93    14     ?      9      Stanford-Binet                       Teacher's classification
Oldham            1935  Mean    Highest (78)       101   Lowest (73)       87    14     319    12-16  Haggerty                             Sims
Havighurst-Janke  1944  Mean    Lower-Middle (26)  114   Lower-Lower (16)  91    23     110    10     Stanford-Binet                       Warner
Havighurst-Janke  1944  Mean    Lower-Middle (14)  107   Lower-Lower (16)  91    16     110    10     Goodenough Draw-a-Man                Warner
Janke-Havighurst  1945  Mean    Lower-Middle (42)  112   Lower-Lower (11)  98    14     110    16     Stanford-Binet                       Warner
Janke-Havighurst  1945  Mean    Lower-Middle (44)  109   Lower-Lower (13)  103   6      110    16     Wechsler-Bellevue Performance Scale  Warner
Carroll           1945  Median  Middle (172)       105   Lower (128)       95    20     300    ?      Kuhlmann-Anderson                    Teacher's rating
Murray            1947  Mean    Middle             102   Lower             84    18     187    10     Henmon-Nelson                        Warner
Murray            1947  Mean    Middle             98    Lower             79    19     208    14     Henmon-Nelson                        Warner
Eells             1947  Mean    High (226)         116   Low (322)         98    18     ?      9-10   Henmon-Nelson                        Warner
Eells             1947  Mean    High (233)         115   Low (361)         93    22     ?      13-14  Terman-McNemar                       Warner

Mention may also be made of studies conducted in other countries, all of which demonstrate the correspondence between intelligence and socio-economic factors. These studies show the same relationship, found by the American studies, between intelligence score and the social-economic status of the children tested. Comparable data have been obtained on large groups of subjects from early infancy to high school age in such countries as England,³ Scotland,⁴ Poland,⁵ and the Soviet Union.⁶

³ Raymond B. Cattell, "Occupational Norms of Intelligence, and the Standardization of an Adult Intelligence Test," British Journal of Psychology, 25:1-28, July, 1934.

⁴ Charlotte M. Fleming, "Socio-Economic Level and Test Performance," British Journal of Educational Psychology, 13:74-82, June, 1943.

⁵ Anne Anastasi and John P. Foley, Differential Psychology (New York: The Macmillan Co., 1949), p. 808.

⁶ Belle Dubnoff, "A Comparative Study of Mental Development in Infancy," Pedagogical Seminary, 53:67-73, September, 1938.

3. Studies dealing with intelligence test scores and community setting. Once again, broad differences in community setting reflect themselves in the intelligence test performance of children. Many psychologists found a relationship between ratings of relative community isolation and the intelligence scores obtained on various tests. The relationship is clearly shown in Table VII and Table VIII.

An inspection of Table VII shows that the results of all the studies dealing with isolated groups are quite uniform. The average I.Q. is clearly below the national norms. The inferiority is more marked on verbal tests and less marked on non-language and performance scales, and the I.Q. declines with advancing age.

TABLE VII

SUMMARY OF RESEARCH STUDIES DEALING WITH DECLINE OF I.Q. OF ISOLATED GROUP CHILDREN WITH ADVANCING AGE

Author            Date  Measure Test used               Cases  Grade  Age    I.Q.  Decline  Average I.Q.
                                                                                   in I.Q.  for whole group
Gordon            1923  Mean    Stanford-Binet          9      --     6      89    28       70
                                                        8      --     12     61
Hirsch            1928  Mean    Pint.-Cunn., Dearborn   88     --     5-9    87    12       79
                                                        174    --     14     75
Sherman and Key   1932  Mean    Pint.-Cunn.             25     --     6-8    84    31       ?
                                                        21     --     10-12  53
                                Draw-a-Man              25     --     6-8    80    9
                                                        21     --     10-12  71
Wheeler           1932  Median  Dearborn                33     --     6      95    22       ?
                                                        61     --     15     73
                                Illinois                23     --     8      85    14
                                                        63     --     15     71
Stroud            1935  Mean    ?                       ?      1      --     90    20       ?
                                                        ?      6-8    --     70
Asher             1935  Median  Meyers Mental Measure   25     --     7      83    22       68
                                                        15     --     15     61
Edward and Jones  1938  Mean    ?                       22     --     7-8    104   31       ?
                                                        39     --     14-15  73
Wheeler           1942  Median  Dearborn                188    --     6      103   22       ?
                                                        116    --     15     81

In summary, Table VIII, together with many other similar surveys, shows rural children to be inferior on intelligence tests in comparison with urban children. This inferiority tends to be greater on verbal than on nonverbal tests. Also, with increasing age, rural scores tend to decline in relation to urban norms. The average scores tend to be lower in those districts with poorer schooling facilities.

TABLE VIII

SUMMARY OF RESEARCH STUDIES DEALING WITH INTELLIGENCE OF RURAL CHILDREN

Author     Date  Measure       Cases         Range included   Tests used                            Results       Difference
                               Urban  Rural  Grade  Age                                             Urban  Rural
Pressey    1920  Median        337    183    --     6-8       Primer Scale                          --     22a
Jones      1932  Mean I.Q.     921    351    --     3-14      Stanford-Binet                        101b   93b    8
Klineberg  1932  Total points  300    700    --     10-12     Six tests in Pintner-Paterson Series  216c   187c   29
McNemar    1942  Mean I.Q.     354    144    --     2-5 1/2   Stanford-Binet                        106b   101b   5
                               864    422    --     6-14      Stanford-Binet                        106b   95b    11
                               204    103    --     15-18     Stanford-Binet                        107b   96b    11

a. Percentage above the median for their age made by city children.
b. Mean I.Q.
c. Mean raw score.
A number of investigations conducted in the European countries have substantiated the United States studies concerning urban-rural differences in intelligence test scores, except in Great Britain,⁷, ⁸ where the investigations have shown much less urban-rural differentiation in intelligence test performance than has been found in the United States and in other European countries.

I.Q. and family relationship. There is no doubt that there is resemblance in test performance for various degrees of blood relationship, and this resemblance in mentality decreases as the relationship becomes more and more remote. This relationship is shown below:⁹

Relationship            Average correlation
Identical twins         .90
All twins               .75
Fraternal twins         .70
Sibling                 .50
Cousins                 .20
Unrelated individuals   .00
Parent-child            .50

⁷ M. E. Bickersteth, "Application of Mental Tests to Children of Various Ages," British Journal of Psychology, 9:23-73, December, 1917.

⁸ Godfrey H. Thomson, "The Northumberland Mental Tests," British Journal of Psychology, 12:201-22, December, 1921.

⁹ Rudolph Pintner, Intelligence Testing: Methods and Results (New York: Henry Holt and Company, 1932), p. 512.

Certain special family relationships have been singled out by investigators. These are twins and foster children. The purpose was to study the contribution of hereditary and environmental factors. In this respect, a summary of some of the research studies dealing with the effect of a foster home environment upon the mentality of the child, and also the results obtained from identical twins raised together or apart, will be presented in Tables IX and X.

A glance at Table IX shows that Freeman's study demonstrated a very definite relationship between the intelligence level of the child and the various characteristics of the home in which he had been placed.
On the other hand, the studies of Burks and Leahy show a low correlation between the mentality of foster children and the ratings assigned to the foster homes on the various characteristics indicated.

TABLE IX

CORRELATIONS BETWEEN I.Q.'S AND VARIOUS FACTORS FOR FOSTER CHILDREN AND "OWN" CHILDREN¹⁰

                        Foster children                  "Own" children
Factor                  Freeman    Burks      Leahy      Burks      Leahy
                        N    r     N    r     N    r     N    r     N    r
Home rating             401  .48   206  .21   194  .19   104  .42   194  .53
Cultural index          --   --    186  .25   194  .21   101  .44   194  .51
Economic status         --   --    181  .23   194  .15   99   .24   194  .37
Father's intelligence   180  .37   178  .07   178  .15   100  .45   175  .51
Mother's intelligence   255  .28   204  .19   186  .20   105  .46   191  .51
Father's vocabulary     152  .2?   181  .13   177  .22   101  .47   168  .47
Mother's vocabulary     224  .37   202  .23   185  .20   104  .43   190  .49

¹⁰ Adapted from Jane Loevinger, "Intelligence as Related to Socio-Economic Factors," 39th Yearbook, National Society for the Study of Education, I, 159-210, 1940.

Additional studies might be mentioned briefly. Speer¹¹ showed that the longer the child is retained in his own home, the less he resembles his foster mother, and the earlier he is adopted, the greater the resemblance. Harms¹² showed that when a group of adopted children, whose true parents were of low mentality, were placed in superior foster homes, their mentality showed a rise above expectancy after five years of residence.

Table X shows correlations for three groups: identical twins raised together, fraternal twins raised together, and identical twins raised apart. It shows that the correlations of the third group drop, in contrast to the first group, on all the indices of mentality. Also, members of the third group resemble one another, according to the data of this study, less than fraternal twins raised together.
On the other hand, some studies of a few pairs of separated identical twins showed that differences in intelligence test scores were uniformly small and insignificant, although a number of differences in attitudes, social conformity, and other personality characteristics were noted.¹⁴, ¹⁵

¹¹ George S. Speer, "The Intelligence of Foster Children," Pedagogical Seminary, 57:49-55, September, 1940.

¹² Irene E. Harms, "Children with Inferior Social Histories: Their Mental Development in Adoptive Homes" (unpublished Master's thesis, University of Iowa, 1941).

TABLE X

CORRELATIONS FOR VARIOUS TRAITS OF THREE GROUPS OF TWIN PAIRS¹³

Trait                      Identical        Fraternal        Identical
                           raised together  raised together  raised apart
Standing height            .981             .934             .969
Sitting height             .965             .901             .960
Weight                     .973             .900             .886
Head length                .910             .691             .917
Head width                 .908             .654             .880
Binet M.A.                 .922             .831             .637
Binet I.Q.                 .910             .640             .670
Otis I.Q.                  .922             .621             .627
Stanford Achievement Test  .955             .833             .507
Woodworth-Matthews         .582             .371             .583

¹³ H. H. Newman, Frank N. Freeman, and Karl J. Holzinger, Twins: A Study of Heredity and Environment (Chicago: University of Chicago Press, 1937), 369 pp.

I.Q. and natio-racial differences. Rates of mental development have been assigned to various "races" in the United States. The procedure typically followed was to test samples of each racial group, calculate the average score, and make comparisons after consulting age equivalents or norms. Brigham¹⁶ analyzed the scores made by foreign-born recruits in the United States Army during World War I. He found that the recent immigrants made relatively low average scores while the older immigrant stocks made higher averages. The conclusion was drawn, for example, that the average intelligence of the Poles, the Italians, the Russians, and the Greeks was low, while that of the English, the Scotch, the Germans, and the native-born Americans was high.
A few studies in the early twenties followed the same line. Years of experience in applying mental tests to the study of racial psychology have thrown much light both on the subject itself and upon the significance and proper use of psychometrics. The great body of this work has been done with the Negro and the American Indian. Summaries of research studies involving these two groups are presented in Table XI and Table XII. In addition, a summary of research studies completed with Spanish-speaking children is presented in Table XIII.

¹⁴ Robert Saudek, "A British Pair of Identical Twins Reared Apart," Character and Personality, 3:17-39, September, 1934.
¹⁵ Anastasi and Foley, op. cit., p. 3?6.
¹⁶ Carl C. Brigham, A Study of American Intelligence (Princeton: Princeton University Press, 1923), 210 pp.

In summary, these studies, together with many other similar surveys, show that Negro, Indian, and Spanish-speaking children score below Anglo-American norms in mental test performance. This difference tends to be greater on verbal than on nonverbal tests. In fact, some studies showed that the Indians exceeded the Anglo-American norms on a Draw-a-Man test.

Factors which may contribute to status differences. Many psychologists, although they do not deny the role of heredity, attribute the differences found on intelligence tests between differing socio-economic or racial groups to differing cultural factors. A discussion of some of these factors follows.

A. Genetic ability. This term covers those aspects of intelligence which may be presumed to be directly inherited through the genetic structure of the individual. The comparison of genetic mental potential of two

TABLE XI
SUMMARY OF RESEARCH STUDIES DEALING WITH THE INTELLIGENCE OF NEGRO CHILDREN
Author Date Measured Number of cases Range included Test used Result Differ Main conclusion White Negro Grade Age White I.Q. Negro I.Q.
ence Schwegler and Winn Arlitt 1920 1921 Median Median 58 191 58 71 Primary Grades 10-17 S tanford-Binet S tanf or d -Bine t 103 106 89 83 14 23 Racial differences Jordan 1922 Median 1,504 247 -- 10-14 N.I.T. 97 71 26 Racial differences Pintner and Keller 1922 Mean 249 71 Kindergarten and 1 ,2 -- Stanford-Binet 95 88 7 -— Peterson 1923 1 Mean 772 734 3-8 — Pressey ? 75 9 * --- Hirsch 1925 Mean 1,030 449 5-18 Pint-Cunn-Dearborn 98 85 13 Natio-racial difference Lacy 1926 Mean 4,947 167 1-3 — Stanford-Binet 103 78 25 Strachan Davis Garth et al. 1926 1928 1930 Median Median Median 14,463 609 222 2,006 Kindergarten and 1,2 8-12 4-9 -- Stanford-Binet Terman group Otis Classification Part II 102 93 78 78 9 Native ability and school training Long 1934 Mean -- 4,684 1,3,5 -- Kuhlmann-Anderson 94 — Environment and social Bechham 1939 Mean — 912 • 9-12 -- Henmon-Nelson « . — 92 — -- Bruce 1940 Mean 521 87 87 432 72 72 -- 6-13 6-13 6-13 Kuhlmann-Anderson Binet Grace Arthur 88 90 93 72 74 74 12 16 16 Innate differences Brown 1944 Mean 341 91 Kindergarten — - Stanford-Binet 107 101 6 Cultural factors Laurance 1953 Mean3 7l6b 6,067 558 255 mm mm mm 6 -7 13-14 Chicago non-verbal 49 124 21 96 28 28 am mm mm a. The mean scores and not the mean of I.Q.1s. b. Standardization group. 148 TABLE XII SUMMARY OF RESEARCH STUDIES DEALING WITH THE INTELLIGENCE OF AMERICAN INDIAN CHILDREN Author Date Measured Number of cases Range included Grade Age Test used Results I.Q. Main conclusion Hunter 1921 Mean 715 -- ? Otis 83 White 123 Racial differences Garth 1925 Median 1,050 4-8 -- N.I.T. 69 all grades Fourth Grade 50 Eighth Grade 80 Social status and temperament Fitzgerald et al. 1926 Median 98 — 10-25 N.I.T., Otis, and Terman Group 87.5 Environment and language Goodenough 1926 Mean 79 ? ? Goodenough 86 White 102 Language not cause Garth 1927 Mean 765 ,4-8 N.I.T. 74-77 Racial difference not cause Garth and Garrett 1928 Mean 2 ,256 4-8 -- N.I.T 70-91 White 98 9 Garth et al. 
1928 Median 1,000 4-9 -- Otis 70 all grades, Fourth Grade 67, Ninth Grade 81 Environment
Sandiford 1928 Median 717 1-8 -- N.I.T. P-rP-Performance 80 92 Language handicap
Haught 1934 Median 961 -- 6-16 Pint.-Cunn., N.I.T., and Terman Group 71-84 Native ability
Arthur 1941 Median 31 grade school -- Stanford-Binet, Arthur Performance 83 90 Limited English and environmental factors
Arthur 1941 Median 21 high school -- Stanford-Binet, Arthur Performance 94 126 Environment and limited English
Dennis 1942 Mean 152 6-10 Goodenough 108 No inferiority to white norms
Rohrer 1942 Mean 125 110 1-3 4-8 Goodenough, Otis Self-Administering 104 100 Cultural factors
Havighurst and Hilkevitch 1944 Median 6?0 6-15 Arthur Point Scale 97 Do about as well as white children
Havighurst and others 1946 Mean 325 -- 6-11 Goodenough 110 White 101 Cultural factors

TABLE XIII
SUMMARY OF RESEARCH STUDIES DEALING WITH THE INTELLIGENCE OF SPANISH SPEAKING CHILDREN
Author Date Measured Number of cases (American, Spanish speaking) Range included (Grade, Age) Tests used Results (American I.Q., Spanish speaking I.Q., The difference) Main conclusion
Sheldon 1924 Mean 100 100 1 -- Binet 105 89 16 ?
Goodenough 1926 Mean 500 367 1-4 -- Goodenough 101 88 3 Language not cause
Garth 1928 Median -- ? 3-8 -- N.I.T. 78 -- Heredity and environment
Garretson 1928 Median 197 117 3-8 -- N.I.T. 94 80 14 Heredity
Randalls 1929 Mean -- 92 -- 9-13 N.I.T. -- 86 -- Language and home environment
Haught 1931 Mean ? ? 9-12 -- P.
-Cunningham, N.I.T., and Terman Group 100 79 21 Language not cause
Herriman 1932 Mean 28 78 7 -- Terman Group 107 87 20 Language and heredity
Davenport 1932 Mean 62 210 1-3 Goodenough 93 87 6 Environment
Manuel and Hughes 1932 Mean 396 440 -- 7-10 Goodenough 106 92 14 ?
Garth and Johnson 1934 Median ? ? 4-9 -- Otis Part II, Terman Group 83 80 -- ?
Manuel 1935 Mean -- 132 -- 7-12 S. Binet-Spanish, S. Binet-English -- 84 82 -- Environment and language
Garth and others 1936 Mean -- 455 8-16 Otis-Pint. Non-language -- 101 -- Language
Craig 1938 Mean -- 144 -- 5-8 L.I.P.S. -- 101.5 -- Environment
Pratt 1939 Mean 146 95 3-8 Terman Group 90 80 10 ?
Mahekian Goulard Carlson and Henderson Arthur and Cook 1939 1949 1950 1951 Mean Mean Mean Mean 105 236 100 115 97 1-3 5-6 11-12 6-16 Otis Group English -- Spanish -- L.I.P.S. C.T.M.M. -- Detroit First Grade, Detroit Primary, Pint.-Cunn., and C.T.M.M. Arthur Scale -- Stanford-Binet -- 95 102 85 87 91 101 84 13 Language Many factors Heredity and environment

or more individuals is possible only if such individuals have received identical experience and identical cultural training and have the same level of motivation. Therefore, accurate comparison of the genetic intelligence of two or more persons is impossible, since no two individuals have identical experience and the same level of motivation.

B. Developmental factors. By developmental factors is meant those elements of the environment of the child which may contribute to his mental growth and development. It is necessary to differentiate between two types of experiences.
The first of these includes the experiences to which an individual is introduced by his culture. The second comprises the unique experiences which occur to the individual and in which other individuals in the society do not ordinarily participate. These unique experiences may include a great variety of accidental events, traumas, and parental treatment (except as this treatment is culturally determined and taught). These experiences influence the mental growth of the individual. Thus it is not possible to measure accurately his genetic intelligence. The exact nature and amount of the effect of such experiences upon mental activity is at present uncertain, but it cannot be ignored in dealing with research on mental testing.

Since cultural training influences mental development, cross-cultural comparisons of mental ability of grossly dissimilar cultural groups offer an advantage to the cultural group for which a test was designed. Cultural group A, when tested by a test constructed for a widely different cultural group B, will be handicapped to the extent that the test demands knowledge and performance which is not part of culture A. To continue the example, the performance of group A on a test designed for group B cannot be regarded as a basis for comparing the genetic intelligence of group A with that of B, since group B is expected to perform better, on the average, by virtue of greater training in and familiarity with the behavior demanded for successful performance.

It is obviously unfair to expect Australian aborigines or Mexican Indians who have never seen a tennis court to know that the net is missing from the picture (Army Beta); it is even less probable that the aborigines or Indians would be able to answer the questions contained in the information tests.
In other cases, the situation may be a familiar one and yet have a culturally determined connotation which will call forth a different and therefore, from the tester's point of view, erroneous solution to the problem. Fitzgerald and Ludeman¹⁷ make this point in connection with one of the items in the National Intelligence Test. This item presents the word crowd, and follows it with five other words (danger, closeness, dust, excitement, number), the task being to underline the two words which tell what a crowd always has. The correct words are, of course, closeness and number. Many Indian children, however, underlined dust and excitement, and the authors point out that this is a reasonable answer for children who never see a crowd without these two accompaniments.

Curtis and others,¹⁸ in their study of Jamaican children, found that these children showed an inferiority in the items requiring the use of paper and pencil. They also gave an unsatisfactory performance in the items requiring the repetition, word by word, of simple sentences spoken by the experimenter. This was related to the linguistic habits of the population.

¹⁷ J. A. Fitzgerald and W. W. Ludeman, "The Intelligence of Indian Children," Journal of Comparative Psychology, 6:319-28, August, 1926.
¹⁸ Margaret W. Curtis, Frances B. Marshall, and Morris Steggerda, "The Gesell Schedules Applied to One-, Two-, and Three-Year-Old Negro Children of Jamaica, B.W.I.," Journal of Comparative Psychology, 20:125-26, October, 1935.

Blackwood, in her application of the International Group Mental Test to Indian and Spanish-American children, writes:

    The reactions of the two racial groups were interesting. To the preliminary directions given to the Spanish-American group, it was found necessary to add the words: 'No one must ask me any questions about the test,' otherwise there would have been a constant flow of questions such as 'Is this right?' . . .
    'What does this picture mean?' and so on. This addition to the directions was quite superfluous in the case of the Indian, for not a single Indian child even attempted to ask anything whatever.¹⁹

However, these studies are not cited to prove that culture influences intelligence test performance; that is a recognized phenomenon. Hebb says: ". . . all psychologists recognize that one cannot compare the innate intelligence of subjects from two different cultures: experience affects their I.Q.'s to an unknown degree."²⁰ Cronbach also states this as a fact not needing discussion: "Obviously, a test is indicative of innate aptitude only when all persons compared have similar backgrounds."²¹

Finally, a valid comparison of mental ability of persons from dissimilar cultural groups must utilize those elements which are familiar to persons of both groups.

¹⁹ B. Blackwood, "A Study of Mental Testing in Relation to Anthropology," cited by Otto Klineberg, Characteristics of the American Negro (New York: Harper and Brothers Publishers, 1944), p. 71.
²⁰ Donald O. Hebb, The Organization of Behavior; a Neuropsychological Theory (New York: John Wiley and Sons, Inc., 1949), p. 295.
²¹ Lee J. Cronbach, Essentials of Psychological Testing (New York: Harper and Brothers, 1949), p. 116.

C. Social status-cultural bias in test items. When most intelligence tests in common use today are examined in the light of cultural differences between different social and cultural groups in American life, it is apparent that certain aspects of the tests operate to produce a cultural bias in favor of high-status pupils. Eells says concerning this factor:

    . . .
    if children from different social status levels have different kinds of experiences with different kinds of materials, and if the intelligence tests contain a disproportionate amount of material drawn from the cultural experiences with which pupils from the higher social levels are more familiar, one would expect that children from higher social status level would show higher I.Q.'s than those from the lower levels. This argument tends to conclude that the observed differences in pupil I.Q.'s are artifacts dependent upon the specific content of the test items and do not reflect accurately any important underlying ability in the pupils.²²

²² Kenneth W. Eells and others, Intelligence and Cultural Differences (Chicago: University of Chicago Press, 1951), p. 4.
²³ David R. Stone, "Certain Verbal Factors in the Intelligence-Test Performance of High and Low Social Status Groups" (unpublished Doctor's dissertation, Department of Education, University of Chicago, 1946), 104 pp.

A few of the kinds of test items most influenced by the status characteristics are as follows:

1. The vocabulary used in items. A study completed by Stone²³ shows clear differences between high- and low-status groups in the knowledge of words used in the presentation of intelligence test items.

2. The use of printed material. Items which require either reading or comprehension of printed material are biased in favor of high-status pupils. The greater familiarity of these children with printed material is largely a result of home influences. High-status homes have more books, more reading, and more emphasis on reading materials than low-status homes. This influence is particularly important in performance on group tests.

3. The use of school-like tasks in the items. Concepts of "opposites," "analogies," "classifications," and "syllogisms" are more useful and more frequently found in high-status culture.
The frequency of these items in currently used mental tests constitutes a cultural bias in favor of the high-status pupil.

4. Tests of information. From the differences in experience and environment between the high- and low-status groups, it is obvious that the information learned by the children at these different levels concerns dissimilar facts and objects. The tests, however, are constructed to require information about experiences and objects that are characteristically high-status in nature. The low-status child is penalized by these tests to the extent that his culture does not include this information.

These are some of the factors which give high-status pupils an advantage in performance on standard group intelligence tests. These factors are clearly related to the cultural background of the pupils, which suggests that the superiority of the pupils from high-status homes is due, in part, to the greater cultural familiarity of these higher socio-economic groups with the experiences and symbols used in the tests.

D. Motivation. The administration of intelligence tests and the interpretation of the test scores proceed upon the assumption that all those who are taking the tests are equally strongly motivated, that is to say, that they are all trying equally hard to do well on the test. One cannot assume that members of the lower class are as anxious to make good scores as are high-class children. A high score may mean nothing to the former, or the test itself may be relatively devoid of significance, and they may, as a consequence, be indifferent to the result.

E. The speed factor. As a part of cultural background, the factor of speed should be taken into consideration. The large majority of tests of intelligence depend at least to some extent upon speed. The attitude toward speed may vary greatly in different communities.
Peterson and others²⁴ noted a relative indifference to speed among the Negroes: "The differences in speed are all in the same direction, favoring the white children." Klineberg²⁵ compared groups of Negroes and Indians in this regard. Some of them lived on reservations or in rural settings. Others were city dwellers or were students in good colleges. He found that with the latter, speed of reaction to test situations was much higher than with the former. Porteus²⁶ also noted the indifference to speed among his native Australian subjects.

It is not clear that this factor is generally present, although in many cases and situations it should undoubtedly be taken into consideration. Quite probably, speed is not a hereditary factor.

²⁴ Joseph Peterson, Lyle H. Lanier, and H. M. Walker, "Comparisons of White and Negro Children in Certain Ingenuity and Speed Tests," Journal of Comparative Psychology, 5:271-83, June, 1925.
²⁵ Otto Klineberg, "An Experimental Study of Speed and Other Factors in 'Racial' Differences," Archives of Psychology, 93:1-111, January, 1928.
²⁶ Stanley D. Porteus, The Psychology of Primitive People (New York: Longmans, Green and Company, 1931), 438 pp.

F. The schooling factor. If schooling plays a part in mental test scores, and if the schooling available to the different groups (racial or socio-economic) differs in any important degree, those considerations must enter into any analysis of the differences in test scores among these groups. Middle-class children stay longer in school and take a more academic curriculum, and the schools of the middle class are better equipped and have the best prepared teachers. On the other hand, lower-class children leave school early to find a job, or because of lack of motivation. Variations in schooling should be taken into consideration in any attempt to explain the differences in intelligence scores between the various socio-economic cultural groups.

G. The factor of sampling.
Still another factor which must be considered in the analysis of the test results is the factor of sampling. This means the degree to which any particular group, socio-economic or racial, can be taken as representative of the whole group. The results obtained from these particular groups may prove false when they are applied elsewhere. For example, norms based on the occupational distribution of the country as a whole may easily be biased in other respects.

Summary and evaluation. Seven tables have been presented to show the established fact that there is a definite relationship between intelligence test scores and socio-economic status. Another two tables have been presented to show the influence of environmental and cultural factors on foster children and identical twins reared apart. Finally, three tables have been compiled concerning the intelligence of American Indian, Negro, and Mexican children, to show the scores obtained by these children on mental tests as compared with the national norms.

The question now arises: are the differences found on intelligence test scores due to heredity or to environmental and cultural factors? One hypothesis maintains that the relationship between intelligence test scores and social status is due to selection. The less able members of an underprivileged community may tend to remain in it while the able tend to migrate. This assumption was made by Hirsch²⁷ and by Pressey and Ralston.²⁸ As for the racial groups, Ferguson²⁹ attributed the differences to racial characteristics dependent on heredity.

²⁷ Nathaniel D. Hirsch, "An Experimental Study of the East Kentucky Mountaineers; a Study in Heredity and Environment," Genetic Psychology Monographs, 3:183-244, March, 1928.
²⁸ S. L. Pressey and Ruth Ralston, "The Relation of the General Intelligence of School Children to the Occupations of Their Fathers," Journal of Applied Psychology, 3:366-73, December, 1919.
²⁹ George O.
Ferguson, "The Psychology of the Negro," Archives of Psychology, 36:1-138, April, 1916.

On the other hand, many psychologists, though they do not deny the role of heredity, attribute the differences to cultural factors. The approach taken here to explain the relationship between intelligence test scores and socio-economic status, and the inferior scores of racial groups, was, in most cases, the one championed by the Chicago group.

The factors which may contribute to the differences in intelligence test performance are genetic ability, developmental factors, social status and cultural bias in test items, motivation, speed, schooling, and sampling. It is difficult to determine what factor or factors are mainly responsible for status differences. Data that seem to be indicative of one cause are often later interpreted as being a sign of a different cause. Many of the findings may be explained in terms of any and all of the presumed factors. Further directed research is the only way to discover which of these are valid.

CHAPTER VIII

CULTURAL FACTORS IN CURRENT INTELLIGENCE TESTS: STUDIES DEALING WITH ANALYSIS OF TEST ITEMS

Almost everyone familiar with the history of mental testing in the United States is aware that there is a difference in performance between high and low socio-economic levels. The relatively higher scores of high-status pupils became evident soon after the publication of Binet's first scale. Many research studies have clearly established the existence of significant differences in measured I.Q.'s in favor of children from high-status homes.¹ Comparatively few studies, however, made an analysis of responses on individual test items for both groups.

This chapter will deal with the studies that use the item or subtest analysis approach to differentiate between performance of high-status and low-status groups on current intelligence tests. The purpose is to show what items or subtests of commonly used intelligence
The purpose is to show what items or subtests of commonly used intelligence 1 Kenneth W. Eells,’ feocial-Status Factors in In telligence-Test Items,” (unpublished Doctor’s disserta tion, Department of Education, The University of Chicago, 19^8), p. 67. 164 tests are more influenced by diverse cultures. The treat ment of these studies will not be confined to the usual factual review of results. Instead, an effort will be made to appraise and evaluate each study. Binet (1911). Frobably the earliest study which involves the suspicion that there may be a relationship between a child's relative intelligence and the social 2 status of his family was that of Binet. He, first, studied the relationship of social status to intelligence test performance by reexamining the findings of Decroly and Degand in Belgium and by comparing them with his own findings in Paris. He found that the Belgian children, from favored social strata, were approximately a year and a half in advance of the Parisian children from working class districts. When he analyzed these comparisons by individual tests he found what he regarded as significant differences. He says of them: There is a whole series of tests in which the advance is more marked than in the others; and con sequently it is perhaps possible to deduce some thing interesting upon which aptitudes are most favored in the education of a rich child. A priori one would suppose that these children, little used / 2 Alfred Binet, and The. Simon, The Development of Intelligence in Children, translated by Elizabeth S. Kitfe (Baltimore: Williams and Wilkins Company, 1 9 1 6), 33 6 pp. ----------------------------- _ r 6 g to serving themselves, constantly surrounded by willing servants, would be more awkward with their hands than future workmen. 
    But without making supposition let us see what the facts reveal, or rather let us see how we can draw some conclusion from the tables which have been submitted to us.³

Since the Belgian children had shown an average advance of a year and a half over the Parisian children, Binet divided the subtests into two lists, one containing items showing an advance of more than a year and a half and the other containing items showing an advance of less than a year and a half. Binet listed fifteen items in the first category and ten in the second. For each of these subtests he attempted to judge what "aptitude" was probably tested. He concluded that of the fifteen items showing greater than average differences for the Belgian children, seven involved language ability, four involved home training, three involved attention, and one involved what he called "practical life." Of the ten items showing less than average differences, he found that six were related to school training, and one to judgment.

On the basis of these data he concluded that the superiority of the Belgian children was most marked on tests involving language and home training, and least marked on tests related to school training. With respect to the language factor, he writes:

    . . . The little children of the upper classes understand better and speak better the language of others. We have also noted that when they begin to compose, their compositions contain expressions and words better chosen than those of poor children. This verbal superiority must certainly come from the family life; the children of the rich are in a superior environment from the point of view of language; they hear a more correct language and one that is more expressive.⁴

³ Binet, op. cit., p. 319.

With regard to the scholastic-training factor, which seemed to appear frequently in the small-difference items, Binet writes:

    . . . as Decroly and Degand have already remarked, it is especially in the degree of instruction that the children of the rich approach those of the poor.
    They are not backward in instruction but they do not show the same marked advance that they showed in other tests. This may be the result of accidental circumstances which have no importance; for example, the habit of the parents of not pushing their children and of not sending them to school too early.⁵

These two quotations are significant as indicating that Binet himself was very much aware of the influence of both home and school training on the results of his tests, and that he did not regard the observed difference as necessarily indicative of inherent differences in native ability.

⁴ Binet, op. cit., p. 320.
⁵ Ibid., p. 321.

Binet's procedures in item analysis were, of course, crude ones. No data are given for any of the items, other than the statement that they showed either more or less than the one and a half year advance which Binet took as the dividing line. It is impossible, therefore, to determine whether or not any of the smaller differences were actually zero or close to zero. The number of cases Binet had available was too small to permit very reliable inferences. Furthermore, the assignment of certain "aptitudes" to each subtest is a highly subjective process.

Stern and associates (1912). Stern⁶ compared the Binet test results for pupils in two different types of German schools, the Volksschule (lower and middle class) and the Vorschule (high class). The aim was "to find out whether there exist typical differences of intelligence between groups of children of the same age, and what magnitude these differences attain at different ages."⁷

⁶ William Stern, The Psychological Methods of Testing Intelligence, translated by Guy M. Whipple (Baltimore: Warwick and York, Inc., 1914), 160 pp.
⁷ Ibid., pp. 54-55.

Five groups were tested. These were 7 and 9 year old pupils of the Vorschule, and 7, 9, and 10 year old pupils of the Volksschule. He then compared the pupils on two groups of items separately.
He scored the 9 and 10 year old pupils in both schools on those items which were at their chronological age on the Binet scale, and he also scored the 11 and 12 year items for both groups. The results of this analysis are indicated in Table XIV. The first column shows that the pupils in the "lower" school are approximately a year retarded. The second column shows that the pupils in the Vorschule rank a little below the Volksschule pupils of their own age. The more difficult or more advanced items, however, were passed by the Vorschule pupils "nearly twice as well as by their mates of like age in the Volksschule," as shown by the third column. Stern comments on these data:

    These tests which lie above the age-level of the subjects are passed by the Vorschule pupils nearly twice as well as by their mates of like age in the Volksschule, and even the older pupils in these tests fall 18 per cent behind the younger children of the better school. If this interesting result should be confirmed again in the detailed computation, as it probably will be, we should then say: children of different social classes differ from each other less in the performances appropriate to their age than in the mastery of tasks that really lie above their level.⁸

⁸ Stern, op. cit., pp. 56-57.

TABLE XIV
PERCENTAGE OF TESTS PASSED IN CERTAIN AGE LEVEL OF TWO STATUS GROUPS⁹

                                            All items   Items from   Items from
Group                                       9-12        9 and 10     11 and 12
9 year Vorschule pupils (high status)       70          77           64
9 year Volksschule pupils (low status)      60          81           34
10 year Volksschule pupils (low status)     70          86           46

⁹ Ibid., p. 56.

Stern's study, like Binet's, was simple in conception and procedure, and the data were not reported very completely. It is significant chiefly for its demonstration of another early interest in the item-analysis approach.

Weintrob and Weintrob (1912).¹⁰ These two workers selected three groups of children. Each group consisted of seventy children of different ages and about "equal proportion of each sex."
The first group was from the Horace Mann school, which draws its pupils mostly from wealthy families. The second group was from the Speyer school, which draws its pupils from middle-class families. The third group was from the Hebrew Sheltering Orphan Asylum.

The Binet test was used for measuring the intelligence of these groups. The Weintrobs also studied the individual items and grouped them into three classes according to the authors' judgment as to the chief trait measured.

The first group consisted of items which the authors identified as involving the use of language, "as reading, sentence building, rhyming, definitions, and filling in omitted words."¹¹ The average scores were: Horace Mann 10.5, Speyer 7.9, and Asylum 12, the Asylum having the highest score. The second group of items was judged to test the ability to reason. The average scores were Horace Mann 11.4, Speyer 8, and Asylum 10.7. The third group of items involved observation, sense discrimination, and counting and reckoning ability. "In these the average of the three institutions varied, no one being highest at all."¹²

The Weintrobs' comment on these data was:

    Judging from the results of these tests, then, it would seem that environment does not greatly affect mental capacity, if at all. If environment was the determining factor that its exponents argue it to be, the measuring scale for intelligence should have placed Horace Mann at the head, . . . Speyer in the second place, and the Asylum at the foot.¹³

¹⁰ Joseph Weintrob and Raleigh Weintrob, "The Influence of Environment on Mental Ability as Shown by Binet-Simon Tests," Journal of Educational Psychology, 3:577-583, December, 1912.
¹¹ Weintrob and Weintrob, "The Influence of Environment on Mental Ability as Shown by Binet-Simon Tests," op. cit., p. 582.
¹² Loc. cit.
¹³ Loc. cit.

The results of this study concerning language
ability differ from those of Binet, who found that the largest differences in language were in favor of the high-status group.

Few data are reported for these items. It is not possible to appraise the significance of the findings except that they support the general finding of the Weintrobs' study, that the status differences are small, inconsistent, and relatively unimportant. The item analysis does not seem to contribute very much to further insight into the nature of the relationships.

Yerkes and associates (1915). The fourth study14 found in this field is that carried out by Yerkes and his associates in Cambridge, Massachusetts, in connection with their revision of the Binet test scale. The study was based on two groups of fifty-four children each. The first group was chosen from a school which the authors called "favored." The second, "unfavored" group was chosen from a second school by matching individuals in the favored group with respect to age and sex.

14 Robert Yerkes and Helen Anderson, "The Importance of Social Status as Indicated by the Results of the Point-Scale Method of Measuring Mental Capacity," Journal of Educational Psychology, 6:137-150, March 1915.

Yerkes and his associates reported information on twenty individual items given to the children of both groups. Out of the twenty items, they found the unfavored pupils (boys and girls) scored higher than the favored group on only one item. The favored group (boys and girls) scored higher than the unfavored group on fourteen items; on five tests, the directions of the relationship were reversed for boys and for girls. Yerkes and his associates concluded:

Differences in economic or social status seem, then, to be correlated with differences in mental capacity, as measured by the point scale, which may amount to as much as 30 per cent.
In other words, at and about the age of 6 years the favored individuals do from a quarter to a third better in the point-scale examination than do the unfavored.15

No standard errors were reported. Many of the differences appear very small and, in view of the small groups, it is likely that most of the differences are actually not significant. The authors made no comment on the types of items which showed different degrees of status differences. The one subtest on which the unfavored pupils made a slightly better showing than the favored group was the first one in the scale, presumably testing "aesthetic judgment" by means of a judgment as to the prettiness of a picture of a face.

15 Yerkes and Anderson, op. cit., p. 149.

Bridges and Coler (1917). These two authors studied 301 children in the first three grades of two schools situated in very different localities of Columbus, Ohio.16 The favored group came from a school situated in a high-status neighborhood. The unfavored group came from a school located near the railroad in a poor factory district.

16 James Bridges and Lillian Coler, "The Relation of Intelligence to Social Status," Psychological Review, 24:1-31, January 1917.

The Yerkes-Bridges Point Scale was used in this investigation. The twenty tests which make up the scale were examined. The authors found that the differences, although sometimes very small, were always on the side of the favored school. The differences were expressed as differences between mean scores on the subtests.

The five tests which showed the "greatest superiority" of the favored school were reported as: absurd statements, comprehension of questions, comparison of familiar objects, concrete definitions, and counting backward from 20 to 1.

With respect to two of these tests, the investigators offer special explanations of what they regard as probable reasons for the results obtained. With regard
to the test consisting of counting backward from twenty to one, they say:

Probably the reason for the higher average score in the favored school is the fact that games involving counting backwards were played by the children of this school and when they were given this test they knew what was expected, with little explanation; while counting backward seemed a new process for most of the younger children of the unfavored school.17

With regard to the test involving concrete definitions, they say:

The results . . . would probably have shown a greater difference between the two schools but for the fact that the average for the unfavored school was raised because a greater number of unfavored children were able to define 'charity.' The familiarity with this term is easily understood, as charity in some form is extremely common in the unfavored district.18

17 Bridges and Coler, op. cit., pp. 23-24.
18 Ibid., p. 24.

The five tests which showed the "least difference" between the two schools were reported as: arranging weights, aesthetic judgment, copying a square and a diamond, comparison of lines and weights, and drawing of designs from memory.

With regard to the test involving aesthetic judgment, the authors say:

This is probably the easiest of all the tests; it was seldom missed by either school, which accounts for the small difference in the average scores.19

In summarizing the findings of their analysis of the separate tests, they say:

The results from the single tests show the greatest difference in those involving primarily motor coordination and kinaesthetic judgment. This agrees with Thorndike's view that individuals differ least in sensory motor functions and most in analysis and abstraction.20

The first part of this conclusion, involving analysis and abstraction, agrees with Binet's analysis of the differences between the scores of the children in the private school at Brussels and those from the poorer section of Paris.
But the last part of the conclusion, concerning sensory-motor tests, seems faulty. The authors' data show only that there were not large systematic group differences in the sensory-motor tests; they do not show that the variation of individual scores was small, although the latter may well have been the case.

19 Bridges and Coler, op. cit., p. 24.
20 Ibid., p. 25.

English (1917). English studied two small groups21 of children, ages twelve to fourteen. The first group consisted of thirty-one children and was chosen from "a free-paying, elementary school, drawing its pupils from the lower middle classes."22 The second group, consisting of thirty-seven pupils, was chosen from a preparatory school which drew its pupils from a high class.

There were ten tests measuring, with varying degrees of satisfactoriness, the functions of memory in its various forms; perceptual discrimination; analogical reasoning; rapidity of arm-movement; rapidity and accuracy of arm-movement under conditions demanding maximal attention; ability to divide attention or rapidly to alternate it; ability to understand spatial relations or to introduce order into one's spatial perception; and ability to comprehend conceptual relations.23

21 Horace B. English, "An Experimental Study of Mental Capacities of School Children, Correlated with Social Status," Psychological Monographs, 23:266-331, 1917.
22 Ibid., p. 269.
23 Ibid., p. 266.

On each of the ten tests analyzed, mean scores for the two school groups were compared. The pupils in each group were also divided into six groups on the basis of their performance on each test, and interschool comparisons were made between equivalent sextiles as well as between the school groups as a whole. The author computed a great many correlations between different types of tests and social status. Probable errors were reported for the mean scores on each subtest for each school group, although not for the differences between the two means.
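The missing quantity, the probable error of a difference between two means, can be derived from the two reported probable errors. The sketch below is illustrative only: it assumes independent groups and the conventional relation P.E. = 0.6745 x S.E., and the means and standard deviations are hypothetical figures, not values from English's monograph (only the group sizes, 37 and 31, come from the text).

```python
import math

# Conventional normal-curve constant relating a probable error
# to a standard error (P.E. = 0.6745 x S.E.). Assumed here.
PE_FACTOR = 0.6745


def probable_error_of_mean(sd: float, n: int) -> float:
    """Probable error of a mean, from the group's SD and size."""
    return PE_FACTOR * sd / math.sqrt(n)


def probable_error_of_difference(pe_a: float, pe_b: float) -> float:
    """Probable error of the difference between two independent means."""
    return math.sqrt(pe_a ** 2 + pe_b ** 2)


# Hypothetical subtest figures (NOT taken from English's study);
# only the group sizes, 37 and 31, are from the text.
pe_preparatory = probable_error_of_mean(sd=6.0, n=37)
pe_elementary = probable_error_of_mean(sd=5.0, n=31)
pe_diff = probable_error_of_difference(pe_preparatory, pe_elementary)
ratio = (24.0 - 20.0) / pe_diff

print(f"P.E. of difference = {pe_diff:.2f}; difference / P.E. = {ratio:.2f}")
```

With the two probable errors in hand, a reader of such a study could judge each school difference against its own probable error, which is the comparison the monograph left undone.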
English's major conclusions are:

1. Children from the "better" class were "strikingly superior" in all tests except those involving rapid movement.

2. On a number of tests the status differences were smaller when the top sextiles of the two schools were compared than when other sextile groups, or the total school group, were contrasted.

This study made a careful report of the subjective impressions of the examiners as to differences in the way in which individual pupils approached the test problems. The study attempted to apply to the item analysis somewhat more refined statistical procedures than had heretofore been used. This was apparently the first item-analysis study to report probable errors or to use correlation procedures.

Pressey and Ralston (1919). In 1919, Pressey and Ralston24 undertook a study to find the relationship between the occupation of parents and the I.Q.'s of children of ages ten to fourteen. They made the point that to study "social status" as related to intelligence of children requires the selection of cases according to some economic or social standard which can be applied to each individual child or family in question. Among the best of these is the occupation of the parent.

24 S. L. Pressey and Ruth Ralston, "The Relation of the General Intelligence of School Children to the Occupation of Their Fathers," Journal of Applied Psychology, 3:366-373, December 1919.

Pressey and Ralston's study scaled occupations into four classes:

1. Professional (teacher, lawyer, doctor, minister, editor).
2. Executive (independent businessman, foreman).
3. Artisan (electrician, engineer, skilled workman).
4. Labor (section hand, factory operator, unskilled laborer).

The Pressey Group Point Scale was used in this study. The four tests of the scale are as follows:

The first consists of twenty-five sentences with the words disarranged, and among the words of each sentence one word that cannot be used in the sentence . . . the subjects are told to cross out this extra word. The second test is an information test; it consists of twenty-five lists of five words each . . . the third is a test of arithmetical ability, . . . The fourth is a test of vocabulary and of moral discrimination.25

To explore the matter of the discriminating power of the individual subtests themselves, Pressey and Ralston made a separate distribution for each of the subtests, with the results shown in Table XV. The authors comment on these data:

As may be seen from this table the results are highly consistent from test to test. The results by test would suggest, therefore, that the scale really was measuring some fundamental underlying factor such as native endowment, and that any special features of environment that might be expected to influence the record made on certain of the tests have not affected the scores to any important extent.26

25 Pressey and Ralston, op. cit., pp. 370-371.
26 Ibid., p. 372.

TABLE XV
THE PERCENTAGE OF CHILDREN FROM EACH OCCUPATION GROUP WHO SCORE ABOVE THE MEDIAN FOR THEIR AGE (FOR THE TOTAL GROUP) ON EACH TEST28

Occupational Group    Test I    Test II    Test III    Test IV
Professional            80         80         80          87
Executive               58         64         68          56
Artisan                 56         54         51          59
Laborer                 47         46         48          45

28 Pressey and Ralston, op. cit., p. 371.

L. W. Pressey (1920). Pressey27 undertook a further study, involving the testing of children of ages six to eight. She felt that the earlier age of the children should lessen the opportunity for differences due to home influences; furthermore, she believed that the Pressey Primer test which was used with these children would be less subject to cultural environmental influences than the Pressey Group Point Scale which had been used with the older pupils in the earlier study.

27 Luella W. Pressey, "The Influence of (a) Inadequate Schooling and (b) Poor Environment upon Results with Tests of Intelligence," Journal of Applied Psychology, 4:91-96, March 1920.

The Pressey Primer scale was made up of four subtests. Two of these the author believed might be subject to some influence of home environment. One was a test (Test II) requiring the child to select the one out of three pictures of objects which is not like the other two; the other (Test IV) involved the crossing out of something wrong in a series of pictures. The other two tests were one (Test I) involving recognition of patterns of dots and one (Test III) involving the fitting together of pieces of geometric figures. The author believed that these two tests might be less subject to the influence of home environment.

The results of the study are shown in Table XVI. The data for the subtests showed a substantially similar relationship to the occupational grouping for each of the four tests. No standard errors were reported, but the differences from subtest to subtest are obviously minor.

TABLE XVI
THE PERCENTAGE OF CHILDREN IN EACH OCCUPATIONAL GROUP SCORING ABOVE THE MEDIAN FOR THEIR AGE, ON EACH OF THE FOUR TESTS29

Occupational Group    Test I    Test II    Test III    Test IV
Professional            68         70         72          71
Executive               62         62         38          61
Artisan                 50         61         31          54
Laborer                 40         38         42          47

29 L. W. Pressey, op. cit., p. 95.

The author concluded:

As will be seen, the per cents are distinctly constant from test to test. The differences are quite as great on the two tests which we would expect least influenced by home environment as they are on the two tests we would expect most sensitive to such influences. . . .
it seems reasonable to infer that the differences found between the occupational groups are probably true differences in a fundamental, underlying general intelligence or native endowment.30

This was apparently the first study to subject a group intelligence test to item analysis on a social-status basis. Much more important, however, it was the first item-analysis study to measure social status by a scale applying to individual pupils (parental occupation in this case) rather than by treating whole school populations as homogeneous status groups.

30 Pressey, op. cit., p. 95.

Burt (1922). Burt,31 in a study on the "Influence of Social Status upon the Individual Tests," selected two schools at each end of the social status scale. "With but a single representative of either type of school, the figures for the test taken individually would be too much at the mercy of wild accidents of sampling."32 After calculating the percentages of pupils in each status group passing the subtests, or items, he ranked the sixty-five subtests in order of their difficulty for each status group separately. Unfortunately, he reports neither the percentages nor the ranks, but only the differences between the two rankings. It is thus possible to identify those items which showed relatively large or relatively small status differences, but it is not possible to determine whether the low-status pupils excelled on any of the items.

31 Cyril Burt, Mental and Scholastic Tests (London: P. S. King and Son, Ltd., 1922), 432 pp.
32 Ibid., p. 195.

Burt summarizes the differences as follows:

The tests which prove relatively easier for children of 'superior' social class fall principally into the following broad groups: (1) Tests requiring linguistic facility. . . . (2) Scholastic tests, especially tests in literary subjects. . . . (3) Memory tests requiring the repetition of sentences. . . . (4) Tests depending upon items of information imparted during early life in a cultured home. . . .
For the poorer children tests in the following categories prove relatively easier: (1) Tests depending upon familiarity with money. . . . (2) Tests perceptual rather than conceptual in character, especially where manual activity is also introduced. . . . (3) The more practical tests generally. . . . (4) Tests depending upon critical shrewdness--e.g., noting absurdities, resisting suggestion.33

The author does not give any information concerning the size of the social-status differences which separated the two groups. Nor did he report the number of cases included in the study, or what age or grade levels were used. Unfortunately, no standard errors were computed and none can be estimated, since neither the underlying percentages nor the number of cases were reported.

This study is a noteworthy example of one whose possible real contribution to established knowledge is almost entirely destroyed by inadequate reporting.

33 Burt, op. cit., p. 195.

Stoke (1927). Stoke34 selected two groups with contrasting social status characteristics, just as Burt had done, but he did so on the basis of the occupational background of individual families rather than by taking whole schools as a unit. His low-status group consisted of seventy-three pupils representing the lowest two of Taussig's five occupational groupings; his high-status group consisted of seventy students from the top two of Taussig's classification. The test used was the Stanford-Binet.

34 Stuart M. Stoke, Occupational Groups and Child Development (Cambridge: Harvard University Press, 1927).

For each of these groups Stoke computed the per cent passing each subtest, then ranked the items in accordance with their degree of difficulty for each group separately. He then subtracted these ranks, just as Burt had done. However, Stoke noticed that the largest variations in ranking occurred for the very easy or the very difficult items.
He says:

The large differences are due to the fact that the tests are not discriminative enough between the children of the high group at the earliest age level used (the five-year level); and on the high end of the scale, in years 10 and 12, the tests do not discriminate sufficiently between the children of the low group.35

Stoke then eliminated the tests at either end of the scale, and recomputed the rankings of the twenty-five items within the middle range (years 6 to 9) where the tests were properly discriminative among individuals in both groups. When this was done, the largest shift of ranking, among the twenty-five items, was 2.5 ranks.

Stoke also reports the percentages of each group passing each item, besides reporting the differences in rankings. From these data he found that

The percentages of children passing these tests shows that in only two tests do the children of the low group excel those of the high, and then only by 4 per cent. In one test, an equal percentage passed, and in the remaining 22 tests, the percentage of high-group children passing exceeded the percentage of low-group children by an average of 10.36

35 Stoke, op. cit., p. 29.

Stoke found that five of the items showed differences of fifteen per cent or more. These tests are: (1) Description of pictures, (2) Repetition of digits, (3) Differences between two things, (4) Similarities of two things, and (5) Making change. With the exception of the last test and the digit test, these tests are linguistic. "They do not, however, demand answers of the rote memory type but require some original thinking, as in the case of stating likenesses and differences."37 He also points out, however, that

There are other linguistic tests which seem about as easy for the low group as for the high. . . .
Evidently the children of the low group have almost as much linguistic ability as have the children of the high, where rote memory is concerned, but where more than rote memory is required in the use of the words, the children of the high group excel.38

Finally, the author did not report the standard errors; thus it is difficult to say whether many of these differences between the two groups are statistically significant. He added and compared the differences as though they were equal, disregarding the fact that percentage differences do not represent equal amounts of difficulty differences at various points on the scale.

36 Stoke, op. cit., p. 29.
37 Ibid., p. 30.
38 Ibid., pp. 30-31.

This study is noteworthy not only because it was the first one to involve the items of the Stanford Revision of the Binet scale, but also because it shows evidences of more careful attention to statistical techniques.

Long (1935). Long39 gave a battery of intelligence and achievement tests to two hundred third-grade Negro children in Washington, D. C. His subjects were divided into two groups of one hundred children each. One of these groups (Group I) was selected from relatively underprivileged communities. The other group, designated as Group II, was more carefully selected from the better communities. The groups were selected on the bases of socio-economic and cultural opportunities, as judged by the communities from which the groups came. The ratings of the communities were estimated by school supervisory and administrative officers.

39 Howard Hale Long, "Test Results of Third-Grade Negro Children Selected on the Basis of Socio-Economic Status," Journal of Negro Education, 4:192-212, 523-552, April, October 1935.

Long gave his subjects the Stanford-Binet, Pintner-Paterson Short Performance Scale, Dearborn A Group Test of Intelligence, Kuhlmann-Anderson Intelligence Test, and New Stanford Achievement, Reading and Arithmetic.
He also, however, subjected the Kuhlmann-Anderson test to an analysis by separate subtests. The results of his analysis are indicated in Table XVII.

He found that for four of the subtests (completion (13), counting taps (14), likeness in pictures (17), and finding words that do not belong to others (22)), the high-status pupils showed a significant superiority; on two subtests (finding similar forms among a group of varied forms (16), and substituting numbers for letters (21)), the low-status group showed a significant superiority. On the remaining four subtests, no significant differences were found, although the low-status group showed a slight superiority on three out of the four subtests (rearranging letters to form words (15), drawing geometrical forms (18), pencil-and-paper form board (20)).

The author made no interpretation of the item data other than to point out which subtests showed significant differences and in which direction.

TABLE XVII
COMPARISON OF GROUP I AND GROUP II BY SUB-TESTS OF THE KUHLMANN-ANDERSON INTELLIGENCE TEST40

             Mean M.A. (months)    Standard deviation (months)      Difference
Sub-Test     Group I   Group II       Group I   Group II       M(II) - M(I)   D/sigma(D)
   13          92.7      98.9          14.62     19.11              6.2          2.56
   14         100.2     109.1          13.26     10.53              8.9          5.27
   15         113.7     111.1          16.77     15.21             -2.6         -1.15
   16         122.7     120.3           7.02      7.41             -2.4         -2.35
   17         104.1     109.0          14.82     14.82              4.9          2.33
   18         105.8     105.4          16.38     10.53              -.4          -.21
   19         108.9     110.8          12.48     18.33              1.9           .86
   20         112.3     111.1          14.04     14.04             -1.2          -.61
   21         126.7     118.6          25.74     22.23             -8.1         -2.38
   22         106.0     115.2          10.92     18.33              9.2          4.39

40 Ibid., p. 210.

It should be noted that the differences in this study were computed in terms of mean mental ages on the subtest as a whole, and were not based on the individual items which comprise the subtests. Another point which should be mentioned is the fact that the low-status group averaged seven months older than the high-status group.
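The final column of Table XVII can be checked against the tabled means and standard deviations. The sketch below assumes, as the figures suggest but the text does not state, that the column is a critical ratio: the difference between the two mean mental ages divided by the standard error of that difference for two independent groups of one hundred children each. Under that assumption the computed ratios come out close to the tabled ones, with small discrepancies that may reflect rounding or a slightly different formula in the original.

```python
import math

N_PER_GROUP = 100  # each of Long's two groups had one hundred children

# (subtest, mean M.A. Group I, mean M.A. Group II, SD Group I, SD Group II)
# Values are those given in Table XVII, in months.
TABLE_XVII = [
    (13, 92.7, 98.9, 14.62, 19.11),
    (14, 100.2, 109.1, 13.26, 10.53),
    (15, 113.7, 111.1, 16.77, 15.21),
    (16, 122.7, 120.3, 7.02, 7.41),
    (17, 104.1, 109.0, 14.82, 14.82),
    (18, 105.8, 105.4, 16.38, 10.53),
    (19, 108.9, 110.8, 12.48, 18.33),
    (20, 112.3, 111.1, 14.04, 14.04),
    (21, 126.7, 118.6, 25.74, 22.23),
    (22, 106.0, 115.2, 10.92, 18.33),
]


def critical_ratio(mean_1, mean_2, sd_1, sd_2, n_1=N_PER_GROUP, n_2=N_PER_GROUP):
    """Difference (Group II minus Group I) over its standard error.

    Assumes independent groups; this formula is an inference from the
    tabled figures, not a procedure stated in the source.
    """
    se_diff = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_2 - mean_1) / se_diff


ratios = {
    subtest: critical_ratio(m1, m2, s1, s2)
    for subtest, m1, m2, s1, s2 in TABLE_XVII
}

for subtest, ratio in ratios.items():
    print(f"Subtest {subtest}: critical ratio = {ratio:+.2f}")
```

A positive ratio favors Group II (the high-status group) and a negative ratio favors Group I, matching the direction of the differences reported in the table.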
This study is notable for being the first attempt to analyze, by subtests, the relationship between social status and intelligence scores for Negro pupils, and for its relatively complete presentation of the statistical data needed for proper evaluation of its findings.

Saltzman (1940). Saltzman41 studied 254 pupils in the first grade of two schools in New York City. The purpose of the study was to compare the performance of two groups of children of different socio-economic background on the Stanford-Binet examination, and to analyze the differing effects of social status on success in the individual tests of the Stanford-Binet Scale.

41 Sara Saltzman, "The Influence of Social and Economic Background on Stanford-Binet Performance," Journal of Social Psychology, 12:71-81, August 1940.

The schools were selected to represent "the opposite ends of the social and economic scale."42 School A is situated on the lower east side of Manhattan, in a crowded slum area. About 80 per cent of the families whose children attend the school are unemployed. School B is located in a very fine residential section in upper Manhattan. The parents of the children attending the school are professional or business people.

Each pupil was given a Stanford-Binet and a Goodenough Draw-a-Man Test. She grouped the pupils from both schools into "inferior," "average," and "superior" groups on the basis of the Goodenough I.Q.'s because, as the author puts it:

Since the higher economic group had a marked advantage on verbal tests which makes up a large part of the Stanford-Binet scale, matching the groups on the basis of the Binet I.Q. scores would obviously have been unfair.
Instead, the children in each school were divided into three groups on the basis of the Goodenough test: those who made an inferior score (70-89); those who made an average score (90-109) and those who made a superior score (110+).43

Saltzman made analyses of all the items in the Stanford-Binet from age five to age ten. This was done in terms of percentage correct for each status group, and the difference between the two percentages. Standard errors of the differences are given for the analysis of total groups, but not for the three group analyses based on the Goodenough test.

42 Saltzman, op. cit., p. 72.
43 Ibid., p. 74.

The results of the group differences are indicated in Table XVIII. Saltzman's tabled data made it obvious that the large differences tended to occur at the middle of the scale, and the small ones near both ends of the scale. She failed to point out that this is inherent in the percentage method of analysis, where very easy or very difficult items necessarily show small differences.

With regard to a group of items in which the low-status pupils are reported as surpassing the high-status pupils, she concludes:

It will be noted that the tests in which group A [the low-status pupils] showed superiority involve the types of ability which children of poor environment probably have more opportunities to acquire: counting and handling money, rote memory, and sensory discrimination.44

This conclusion is based on items on which negative status differences were found. No data are given as to the statistical significance of these negative differences. Incidentally, the one item showing a large negative difference (picture description) is one of three omitted through error from the summary table.

44 Saltzman, op. cit., p. 7a.

TABLE XVIII
THE TOTAL PERCENTAGE OF EACH OF THREE GROUPS PASSING EACH TEST45

                            Inferior       Average       Superior
                            A      B       A      B      A      B
Number of pupils           22     12      79     47     39     55

V
  Weights                 100    100     100    100    100    100
  Color                    68     92      91    100    100     91
  Aesthetic                95    100     100    100    100    100
  Definitions             100    100     100    100    100    100
  Patience                 91     92      90     98     98     98
  3 commissions            82    100      90    100    100    100

VI
  Right and left           77     92      92     91     86     96
  Mutilated pictures       50     75      79     93     91     98
  Counting 13              86     92      94     90     98     98
  Comprehension            50    100      77     95     95     98
  Coins                    95     63      91     97    100     99
  Sentences 16-18          41     92      70     93     78     98

VII
  Fingers                  64     67      83     83     96     89
  Picture description      54     83      81     83     81     51
  Digits                   33     83      49     85     74    100
  Bowknot                  41     67      63     69     66    100
  Differences              18     83      47     81     63    100
  Diamond                   9     25      33     46     52     50
  Digits rev. 3             0     42      26     71     44     80

VIII
  Ball and field            0      0      11     36      5     14
  Counting 20-0             9     42      12     30     24     35
  Comprehension             0     58      26     53     10     64
  Similarities             14     42      29     62     32     54
  Definitions              14     58      26     68     22     70
  Vocabulary                0     17       0     11      0     13

IX
  Date                      0      0       1     24      0     12
  Weights                   5     25      13     42      5     19
  Change                    5      8       1      1      5     20
  4 digits rev.             0      8      21     11     19
  Three words               0      8       0     22      0     14
  Rhymes                    0     42      10     47     16     54

X
  Absurdities               0      8       4     10      0      1
  Designs                   0      0       0      0      5     13
  Reading                   0      0       0      5      0      0
  Comprehension             0      8       0      9      0      0
  60 words                  0     17       2     20      0     19
  Digits 6                  0     17      13     34     27     29
  Sentences 20-22           0      8       1     29     12      3

45 Saltzman, op. cit., p. 76.

With regard to the items in which group B showed a "marked superiority" she says:

In general, the same type of tests, namely verbal ones, are found to be passed by more of the children in Group B than in Group A. Their greatest superiority was on the tests involving vocabulary, verbal comprehension of every-day situations, and rhymes. They are also superior in motor control.
In the environment of these children we can readily see that there would be opportunities to converse with adults, develop motor control, and acquire information and accepted moral concepts.46

It is not clear how items were selected as showing the "greatest superiority" on which this generalization is based. It was apparently not alone on the basis of size of percentage differences, since some of the items included show smaller differences than some of those omitted.

The author, unfortunately, made most of her interpretations in terms of the three-group analysis, for which no standard errors are given. She apparently selected the items for her conclusions on the basis of the size of the differences between the percentages of the two groups (A and B), without regard to the inequality of percentage units at different points of the scale.

The study suffers seriously from carelessness in reporting and from inadequate statistical analysis.

46 Saltzman, op. cit., p. 79.

Clarke (1941). Clarke47 selected two groups, ages fourteen to sixteen years, from the New York State Training School for Boys. The first group consisted of 116 Negroes, and the other consisted of 116 whites. Both groups were equated in chronological age and mental age. "The cultural and socio-economic status of the two groups were assumed to be comparable."48

47 Daniel P. Clarke, "Stanford-Binet Scale 'L' Response Patterns in Matched Racial Groups," Journal of Negro Education, 10:230-238, April 1941.
48 Ibid., p. 232.

The author examined the group responses to individual items of the Revised Stanford-Binet Scale "L" from the eleventh year through the Average Adult Level of the scale "to determine whether groups so matched would respond differently to individual items of the Revised Stanford-Binet Intelligence Scale 'L.'"49 The results of his analysis are indicated in Table XIX.
Clarke summarized his article by saying:

No significant difference was found between Negro and white abilities to pass the individual items of the Stanford-Binet Scale 'L.' However, strong tendencies were noted for Negro superiority in verbal functions and white superiority in reasoning and number functions; test items failing to differentiate the groups tended to combine verbal and reasoning content.50

This was apparently the first item-analysis study to undertake analysis of responses of matched Negro and white groups. The conclusion concerning verbal ability is not in line with other studies which show that the white pupils, generally, are superior in verbal facility to the Negroes. This could be due to the inclusion in the white group of boys from homes in which foreign languages are spoken (58 per cent of the fathers and 34 per cent of the mothers of the white boys were foreign born). In any case, the differences obtained, although not significant, should be interpreted on cultural rather than social bases.

49 Clarke, op. cit., p. 237.
50 Loc. cit.
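The "chances in 100" figures in Table XIX appear to follow from the two percentages failing and the group sizes. The sketch below is one plausible reconstruction, not Clarke's stated procedure: it assumes two independent groups of 116 boys, an unpooled standard error for the difference between the two percentages, and a one-tailed area under the normal curve. The function names are illustrative. Under these assumptions, percentages of 39.7 and 54.7 give roughly 99 chances in 100 and identical percentages give exactly 50, in line with the tabled entries.

```python
import math


def normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def chances_in_100(pct_fail_a: float, pct_fail_b: float,
                   n_a: int = 116, n_b: int = 116) -> float:
    """Chances in 100 that a difference between two percentages is 'true'.

    Assumed method: unpooled standard error of the difference between
    two independent proportions, one-tailed normal area.
    """
    p_a, p_b = pct_fail_a / 100.0, pct_fail_b / 100.0
    se_diff = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    if se_diff == 0:
        return 50.0  # no variation in either group; nothing to compare
    z = abs(p_a - p_b) / se_diff
    return 100.0 * normal_cdf(z)


# Percentages failing, as tabled for two of the items.
print(f"Dissected Sentences:    {chances_in_100(39.7, 54.7):.1f}")
print(f"Arithmetical Reasoning: {chances_in_100(36.2, 19.8):.1f}")
print(f"Memory for Designs:     {chances_in_100(16.4, 16.4):.1f}")
```

Note that a value of 50 on this scale means no evidence of a difference at all, which is why the items with identical percentages are listed at 50 rather than at zero.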
TABLE XIX
REVISED STANFORD-BINET SCALE (L) ITEMS SHOWING EITHER STRONG TENDENCIES OR NO TENDENCIES TO DIFFERENTIATE NEGRO AND WHITE GROUPS51

                                                   Per cent Failing    Chances in 100 of a
                                                                       "True" Difference
Test Items                                          Negro    White     between Means

(Strongly Suggestive Differentiation Favoring Negro Group)
Dissected Sentences (XIII-5)                         39.7     54.7         98.9
Memory for Sentences V (A.A.-7)                      31.0     43.1         97
Vocabulary (XIV-1)                                   31.0     43.1         97
Vocabulary (XII-1)                                   16.4     25.0         95
Memory for Sentences IV (XI-4)                       10.3     16.4         92

(Strongly Suggestive Differentiation Favoring White Group)
Arithmetical Reasoning (A.A.-4)                      36.2     19.8         99.7
Repeating 5 Digits Reversed (XII-4)                  36.2     27.7         94
Picture Absurdities III (XIV-3)                      56.2     43.1         98

(No Differences between Groups)
Memory for Designs (XI-1)                            16.4     16.4         50
Minkus Completion (XII-6)                            52.5     52.6         50
Verbal Absurdities II (XII-2)                        13.8     13.8         50
Memory for Words (XIII-2)                            32.8     32.8         50
Copying a Bead Chain from Memory (XIII-6)            39.7     39.7         50
Differences Between Abstract Words (A.A.-3)          44.0     44.0         50

51 Ibid., p. 236.

Murray (1947). Another study using the Warner social class groupings is the investigation by Murray.52 He gave a battery of group intelligence tests to 401 Negro children, ages ten and fourteen, in Gary, Indiana, and then classified the pupils into social-class groups on the basis of data obtained from interviews with the parents. For the purpose of the item analysis, Murray used the middle-class and the lower-lower-class groups, omitting the intervening upper-lower-class pupils. He included in the item analysis only those items which were reached and attempted by substantially all of the pupils in both status groups--a total of 119 items in three tests.

Murray, after analyzing his data, found that only five of the 119 items showed negative differences (that is, with lower-lower-class pupils excelling the middle-class pupils), and only one of these five was significant at even the 5 per cent level.
The proportion of items showing statistically significant (critical ratio of 5.0 or more) differences in favor of middle-class pupils varies from 16 per cent on the Kuhlmann-Anderson test to 64 per cent on the Henmon-Nelson test.

52 Walter I. Murray, "The Intelligence Test Performance of Negro Children of Different Social Classes," (unpublished Doctor's dissertation, Department of Education, The University of Chicago, 1947), 128 pp.

Murray also classified each of the items into categories, using such categories as: definitions, number series, analogies, classification, sentence, artificial language, geometric design, opposite, syllogisms, arithmetic, and social reasoning. For each of these categories he determined the proportion of the items which showed significant differences between the two social-class groups. The most significant finding is that geometric-design items showed a smaller proportion of significant differences than most of the more highly verbal categories. With regard to this, he comments:

    A plausible explanation of this phenomena is that the tasks demanded abilities of perception and form and were less saturated with language which is predominantly of a middle class character. On the other hand, the verbal classifications required abilities which were dependent upon school and extra-scholastic training. The amount of extra-scholastic training presumably varies directly as one ascends the social-class hierarchy.53

53 Murray, op. cit., p. 71.

Murray also made a detailed analysis of the errors, or wrong responses, on eighty-two items--all of the items in two of his tests. For each of these items he computed the proportion of the pupils in each social-class group checking each wrong response, and the significance of the differences between the two social-class groups on each response. He found twenty-two items which showed significant differences on wrong responses.
After examination of these twenty-two items, Murray concluded that all of the differences could be accounted for in terms of (a) differences in social experience, (b) structural peculiarities of the items, (c) differences in school achievement, and (d) difference in associational patterns.

This study is one of the most extensive investigations using the item-analysis approach. It involves a larger number of items, a better basis for determining social status, and more exhaustive statistical analysis than any of the preceding studies. The study might be criticized on the basis of the fact that the number of items is not sufficient to make the categorizing of items very satisfactory, and there are some inaccuracies in computation and reporting.

Eells (1948). The most thorough analysis of socio-economic differences in intelligence test performance was carried out in Rockford, Illinois, a city of 115,000.54 The socio-economic status of each child in the city between the ages of nine and fourteen years was computed by use of Warner's Index of Status Characteristics. By means of this index, and parental birthplace and interview information, three special groups were selected for intensive study. These three groups--high-status Old American, low-status Old American, and low-status ethnic--were established for the younger and for the older pupils separately. Ten standard group tests were administered to both high and low socio-economic level pupils. Tests given to nine- and ten-year-old pupils were: Henmon-Nelson, Otis Alpha (nonverbal), Otis Alpha (verbal), Kuhlmann-Anderson (Grade III), and Kuhlmann-Anderson (Grade VI). Tests given to thirteen- and fourteen-year-old pupils were: Terman-McNemar, Otis Beta, California Mental Maturity, Thurstone Spatial, and Thurstone Reasoning.
The study attempted to analyze the responses of about five thousand pupils,

    . . . from high and low social-status backgrounds on various types of items taken from widely used group intelligence tests. Its principal purpose is to provide a basis for tentative inferences, and for further research, dealing with the extent to which group differences in I.Q.'s may be due to the presence in the tests of materials drawn more largely from the culture characteristic of high-status pupils than from that with which low-status pupils are familiar.55

    There were four steps in the procedures used: (a) collection of the test data; (b) securing of a socio-economic or social-status classification of the pupils; (c) analysis of the relation between social status and I.Q.'s secured from the tests; and (d) analysis of individual item responses in terms of several subgroups of pupils.56

54 Kenneth Walter Eells, and others, Intelligence and Cultural Differences (Chicago: University of Chicago Press, 1951), 388 pp.
55 Eells, op. cit., p. 51.
56 Loc. cit.

The major findings of the study were:

    (1) Correlations between I.Q.'s (or percentile ranks on certain tests) and the Index of Status Characteristics vary with the test used and the age level tested. They are moderate in size (.20 to .43) and are all definitely significant in the statistical sense. The correlation is linear for the thirteen- and fourteen-year-olds, but is not linear for the nine- and ten-year-olds.

    (2) Increasing the time limits of the tests has no effect on the size of the status differences on any test except the two Thurstone tests, which are planned primarily as speed tests.

    (3) About half of the items in the tests for nine- and ten-year-old pupils, and about 85 per cent of the items in the tests for the thirteen- and fourteen-year-old pupils, show differences between high- and low-status groups large enough to be significant at the 1 per cent level.
    On the other hand, more than a third of the items from the tests for the younger pupils, and about a tenth of those from the tests for the older pupils, show status differences too small to be significant even at the 5 per cent level.

    (4) Mean status differences are largest for verbal and smallest for picture, geometric-design, and stylized-drawing items. The dispersion of status differences is greater for verbal and for picture items than it is for geometric-design, stylized-drawing, number-combination, or letter-combination items.

    (5) Mean status differences for different types of test questions (opposites, analogies, etc.) vary from category to category, but no consistent trends appear, and no meaningful generalization appears to be possible regarding the types of questions showing large or small differences, provided the form of symbolism is held constant.

    (6) Status differences for verbal items are fairly closely related (r = .46 to .62) to the difficulty of the items as determined for high-status pupils, with the easier items showing the larger status differences. . . . Status differences for nonverbal items are not so markedly related to difficulty of the items.

    (7) Practically all of the items showing unusually large status differences are verbal in symbolism. A substantial number of them involve what appears to be a relatively academic or bookish vocabulary. . . . Items which show small differences are almost without exception either nonverbal in symbolism or involve simple everyday words which do not appear to be intended as testers of vocabulary knowledge. The subject matter of such items is usually either 'noncultural' or drawn from materials quite common to the experience of children at all status levels.
    (8) The proportion of items showing statistically significant status differences is larger in the tests for the 13 and 14-year old pupils; conversely, the proportion of items showing differences which are not significant is larger in the tests for the younger pupils. . . . This age differential is, however, due in part to the presence of a large proportion of verbal items in the tests for the older pupils and a larger proportion of picture, geometric-design, and stylized-drawing items in the tests for the younger pupils.57

Further analysis of individual items, too detailed to cite here, revealed that in those items which indicated an advantage for the high-status group, there was often a factor of vocabulary or familiarity with material, symbols, or experiences which penalized the low-status group. The apparent superiority of the higher level pupils, for these items, could be interpreted as a difference in the experiences and behavior of the cultural background from which they came.

57 Eells, op. cit., pp. 53-56.

Driggs (1952). Driggs' study58 of the relationship between intelligence test items and the occupation of parents of school children was derived from another investigation of over five thousand children who attended the public schools of Winston-Salem, North Carolina. This community is noted primarily for the tobacco industry, with a population in excess of eighty thousand people in its metropolitan area. The author chose 471 test booklets selected at random from those of the five thousand children for his final analysis of items.

58 Don F. Driggs, "A Study of the Relationship Between Intelligence Test Items and Occupation of Parents of School Children" (unpublished Doctor's dissertation, School of Education, University of North Carolina, 1952), 227 pp.

The tests used were: (1) the Pintner-Cunningham Primary Test, Form A, in Grade I and the first half of Grade II.
There are seven subtests whose names indicate their content: common observation, aesthetic differences, associated objects, discrimination of size, picture parts, picture completion, and dot drawing. (2) The Pintner-Durost Elementary Test, Scale 1, Form A (picture content) in the last half of Grade II, Grade III, and the first half of Grade IV. The test is composed of six subtests, namely, vocabulary, number sequence, analogies, opposites, logical selection, and arithmetic reasoning. (3) The Pintner Intermediate Test, Form A, in the second half of Grade IV through Grade VIII. The test is composed of eight subtests. These are: vocabulary, logical selection, number sequence, best answer, classification, opposites, analogies, and arithmetic reasoning.

Since the study involved the occupation of the parent, the Minnesota Occupational Rating Scale was used for stratifying the occupation. The group was divided into four occupational groups from the highest, "A," down through "B" and "C" to "D," the lowest. After the tests to be studied had been separated into occupational groups, analysis of the over-all tests was undertaken. This involved calculation of the mean and the 25th and 75th percentiles for each of the three tests as a whole. The same was done for each of the subtests in the three intelligence tests used. The same was also done for each item in each test for the highest and lowest occupational groups. The means of these two occupational groups (the highest and lowest scoring groups) were treated as the single scores most representative of the groups' ability on that particular measure. These means were then used to test the degree of significance of any differences in test scores of the two groups. The analysis of individual items is too detailed to be reported here. The results of the subtests are reported in Table XX.
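The comparison of the highest and lowest occupational groups described above rests on a critical ratio, that is, the difference between two group means divided by the standard error of that difference. The sketch below is illustrative only: the scores are hypothetical, the function name is an assumption, and the unpooled standard-error formula is one common form of the statistic rather than a procedure confirmed from Driggs' report.

```python
import math
import statistics


def critical_ratio_of_means(group_a, group_b):
    """Difference between two group means divided by its standard error.

    Assumes independent groups and uses the unpooled standard error
    sqrt(s_a^2 / n_a + s_b^2 / n_b); this formula is an illustrative choice.
    Both groups need at least two scores and some spread in the scores.
    """
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 divisor)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical subtest scores, not data from any study cited here.
highest_group = [10, 12, 14, 16]
lowest_group = [8, 10, 12, 14]
ratio = critical_ratio_of_means(highest_group, lowest_group)
print(f"critical ratio = {ratio:.2f}")
```

A larger ratio means that the difference between the two groups is large relative to sampling error, which is how the critical ratios in Table XX can be read.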
Driggs found that the most discriminating subtests of the three intelligence tests used were: (1) vocabulary, (2) dot drawing, (3) logical selection, (4) classification, and (5) best answer. The five least discriminating subtests were: (1) discrimination of size, (2) picture completion, (3) associated objects, (4) arithmetic reasoning, and (5) analogies.

TABLE XX

DIFFERENCES IN DEGREES OF DISCRIMINATION AT THE MEANS OF TESTS AND THEIR SUBTESTS BETWEEN HIGHEST AND LOWEST RANKING OCCUPATIONS59

Tests and subtests                  Critical ratio

Pintner-Cunningham                  7.97
  7. Dot Drawing                    5.98
  1. Common Observation             4.98
  2. Aesthetic Differences          4.24
  5. Picture Parts                  4.17
  3. Associated Objects             2.95
  6. Picture Completion             2.38
  4. Discrimination of Size         1.68

Pintner-Durost                      8.65
  5. Logical Selection              5.61
  4. Opposite                       4.59
  2. Number Sequence                4.08
  1. Vocabulary                     3.57
  3. Analogies                      3.16
  6. Arithmetic Reasoning           3.07

Pintner-Intermediate                7.74
  1. Vocabulary                     6.22
  5. Classification                 5.59
  4. Best Answer                    5.40
  2. Logical Selection              5.31
  6. Opposite                       4.09
  8. Arithmetic Reasoning           4.05
  3. Number Sequence                3.59
  7. Analogies                      3.43

59 Driggs, op. cit., p. 192.

Summary and evaluation. Sixteen studies involving the analysis of responses to individual test items or subtests are summarized. Although three of them have been of high quality, most of them suffer from a variety of weaknesses. Among the more obvious shortcomings of some of them may be listed small number of pupils, small number of items, unsatisfactory methods of social-status measurement, improper statistical procedures, and inadequate reporting of procedures and results. Failure to test differences for statistical significance and treatment of percentage differences as representing equal differences in difficulty at all points along the scale are noted frequently.
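The last weakness noted above, treating percentage differences as equal at all points along the scale, can be illustrated with a short sketch. It converts per cent passing to normal deviates, one conventional way of placing item difficulties on a comparable scale. The function names and the percentages are hypothetical, and nothing here reproduces the procedure of any particular study reviewed.

```python
from statistics import NormalDist


def normal_deviate(proportion_passing):
    """Return the unit-normal deviate below which the given proportion falls."""
    return NormalDist().inv_cdf(proportion_passing)


def deviate_gap(high_status_pass, low_status_pass):
    """Status difference for one item, expressed in normal-deviate units."""
    return normal_deviate(high_status_pass) - normal_deviate(low_status_pass)


# Two hypothetical items, each showing a 10-point percentage difference.
middle_item = deviate_gap(0.60, 0.50)  # item of middling difficulty
easy_item = deviate_gap(0.95, 0.85)    # item that nearly everyone passes

print(f"middle item: {middle_item:.2f} deviate units")
print(f"easy item:   {easy_item:.2f} deviate units")
```

Under these assumptions the same ten-point percentage difference corresponds to a larger difference in normal-deviate units for the easy item than for the item of middling difficulty, so raw percentage differences are not directly comparable across items of unequal difficulty.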
From such studies, it seems clear that the relative performance of low-status as compared with high-status pupils will vary to some extent with the particular item and test used. This could only be true if there were cultural factors intrinsic to at least some of the tests which favor the high-status pupils. It does not follow, of course, that all of the differences between the groups in test performance are a result of cultural factors, but it does give evidence that cultural factors influence the test performance of pupils from different socio-economic levels in the United States.

The conclusion that pupils from low socio-economic levels of the United States are genetically inferior assumes that the tests are fair measures of their ability. This assumption is partially a function of a general lack of information about the differences which exist between high and low socio-economic levels in the United States.

Although the low socio-economic levels in the United States share many things in common with the high socio-economic groups, under the common American culture, the environment of the low-status person is apparently a culture of its own, different in kind and degree from the culture which is characteristic of the high socio-economic levels of the United States.

Finally, the most adequate general explanation which can be derived from most of the findings is that the variation in opportunity for familiarity with specific cultural words, objects, or processes is most likely responsible for the differences found on test scores.

It is also probable that status differences in responses to test items are due to genetic and developmental differences and to culture-bias in tests and motivational factors. It is difficult to state the true significance, on the basis of present knowledge, of the I.Q. differences between pupils of differing cultural backgrounds.
CHAPTER IX

DEVELOPMENT OF SO-CALLED "CULTURE-FREE" OR "CULTURE-FAIR" INTELLIGENCE TESTS

Any discussion of the influence of culture on test performance involves two questions. First, to what degree is test performance influenced by cultural factors, and secondly, what can be done about it?

In answer to the first question, two chapters have been presented. Chapter VII has shown a positive relationship between intelligence test performance and cultural background. Chapter VIII has shown a considerable degree of differences in performance on various types of intelligence test items when children of different cultures are compared.

This chapter is devoted to the second question. Many attempts have been made to develop intelligence tests which could be applied to subjects from different cultural backgrounds. The Draw-a-Man Test is presented below as representative of early attempts to attack the problem of cultural influence upon test performance. This will be followed by presenting three tests which were purposely devised to be "culture-free" or "culture-fair" intelligence tests.

Goodenough Draw-a-Man Test.1 This is a test which purports to evaluate a child's intelligence by means of his drawing. It is intended for ages from three and one-half to thirteen and one-half years. The child is instructed to make a picture of a man as best he can. He is told to work carefully and to take his time. Instructions to the subject are as follows:

    On these papers I want you to make a picture of a man. Make the very best picture that you can. Take your time and work very carefully. I want to see whether the boys and girls in school can do as well as those in other schools. Try very hard and see what good pictures you can make.2

The scoring is based not upon aesthetic quality but, rather, upon the presence of essential details, such as attached legs, nose, fingers, etc., which presumably indicate the individual's level of "perceptual differentiation of an object that is very familiar in his environment."3

1 Florence L. Goodenough, Measurement of Intelligence by Drawing (Chicago: World Book Company, 1926).
2 Ibid., p. 85.
3 Frank S. Freeman, Theory and Practice of Psychological Testing (New York: Henry Holt and Company, 1949), p. 208.

The raw score, which gives the number of points out of a possible total of 51 on which a given drawing has received credit, is not ordinarily used directly. Raw scores are converted into "mental ages," and the ratio between the subject's mental age and his chronological age is taken as his Goodenough I.Q. The table for converting raw scores into mental age equivalents is derived from the norms for ages three to thirteen, which were based on 2,506 children, age four to ten, in grades appropriate to their age, drawn almost entirely from schools in New Jersey and in New York City.

Since its introduction in 1926, the Goodenough Draw-a-Man Test has been used in a number of comparative studies of national and racial groups. However, as Goodenough4 has pointed out, the fact that the test is free from verbal requirements does not necessarily mean that it is equally suitable for all groups. Studies by Dennis5 and Havighurst6 noted a marked difference in favor of boys in Indian tribes where art work is chiefly the responsibility of the males. Also, a number of American Indian groups obtained higher average I.Q.'s than white groups. Menzel,7 in his application of this test to children in India, showed that Indian children did not make as high scores as American children of corresponding ages. He felt that new norms should be used for Indian children and suggested:

    Educational standards and practices cannot be imported from other countries without thoroughgoing modification and adaptation which takes into account the handicaps and advantages under which the Indian pupils labor.8

4 Florence L. Goodenough, and Dale B. Harris, "Studies in the Psychology of Children's Drawing: II, 1928-1949," Psychological Bulletin, 47:369-433, September 1950.
5 Wayne Dennis, "The Performance of Hopi Children on the Goodenough Draw-a-Man Test," Journal of Comparative Psychology, 34:341-348, December 1942.
6 Robert J. Havighurst, and others, "Environment and the Draw-a-Man Test: The Performance of Indian Children," Journal of Abnormal and Social Psychology, 41:50-63, January 1946.
7 Emil W. Menzel, "The Goodenough Intelligence Test in India," Journal of Applied Psychology, 19:615-624, October 1935.
8 Ibid., p. 624.

Although Huang9 noted similarities in the developmental sequences of both Chinese and American children in drawing, Hsiao10 found it necessary to re-standardize the test for use in China. Papavassiliou,11 in using this test with Greek children, found that some modification of the scoring system was necessary, because of differences in the drawings themselves and because of a lack of art education in Greek schools.12

In evaluating the usefulness of the test, Goodenough has written as follows:

    Repeated studies have shown that when used with children of reasonably similar cultural backgrounds who are equally motivated to do well, the test is serviceable as a crude measure of 'general intelligence,' although the moderate self correlations and correlations with outside criteria make it clear that it cannot serve as a satisfactory substitute for individual tests of the Binet type.13

9 I. Huang, The Psychology of Children's Drawing (Shanghai: Commercial Press, 1938), cited in Psychological Abstracts, 13:395, No. 3889, 1939.
10 H. H. Hsiao, "Hsiao's Revision of Goodenough's Test of Intelligence for Children," cited in Psychological Abstracts, 15:308, No. 2836, 1941.
11 I. Th. Papavassiliou, "Validity of the Goodenough Draw-a-Man Test in Greece," Journal of Educational Psychology, 44:244-248, April 1953.
12 Ibid., p. 248.
13 Goodenough, op. cit., p. 399.

The Leiter International Performance Scale. This instrument14 was devised as a scale for mental measurement which would be entirely non-language in type and could be applied to subjects from different cultural backgrounds. Neither verbal instructions nor pantomime are necessary in applying the tests. The scale consists of sixty-eight items, four at each level, ranging in difficulty from the two year to the eighteen year level. The author suggested that its greatest usefulness is for testing children between the ages of five and twelve. The test items consist of matching colors and forms, picture completion, perception of use of common objects, logical sequence, analogies, number estimation, geometrical completion, opposites, recognition of age differences, complex discrimination, perception of relationships other than by use, and recognition of material differences. The materials consist of a frame to which is attached a cardboard strip on which are printed the forms to be matched or the series to be completed. The subject is given blocks containing the complementary pictures and success depends upon the order in which these blocks are inserted in the frame.

14 Russell G. Leiter, The Leiter International Performance Scale, Volume I (Santa Barbara: Santa Barbara State College Press, 1940), 95 pp.

Originally, this scale was standardized for use with Japanese and Chinese children between the ages of three years and sixteen years. In 1948, Leiter established norms for Caucasian children between the age levels of five and twelve years.
The mental age concept is used in connection with this scale, and it is based upon the total number of months of credit earned. The intelligence quotient is the ratio MA/CA. Leiter used a chronological age of 156 months for anyone thirteen years and older. This in itself defines the upper limits of the test's usefulness.

Though the scale's materials are supposedly culture free, Porteus found it necessary to omit those tests which were not applicable to the primitive people because of their lack of familiarity with the materials employed. Then, Porteus drew the following conclusions:

    1. A certain number of the tests, sufficient to make a brief scale, are applicable to natives such as the Bantu. The tests proved completely independent of language, interesting to the natives and apparently comprehensible from the standpoint of cultural background.

    2. The tests were not, however, applicable without verbal explanations to people as primitive as the Bushmen.

    3. . . . only one test in the series, that of cube counting, seems to be directly influenced by school experience.15

Tate,16 in an attempt to evaluate the relative freedom from cultural influence of the Leiter International Performance Scale, gave this test and two other tests of intelligence, the Stanford-Binet and the Arthur Scale of Performance, to 108 five-year-olds. These subjects were from four distinct groups: an upper socio-economic group with preschool experience, an upper socio-economic group enrolled in kindergarten with no preschool experience, a lower socio-economic group enrolled in kindergarten with no preschool experience, and a lower socio-economic group from a state orphanage.
The main findings were: (a) All tests differentiated significantly be tween all pairs of the experimental groups, except between the two professional groups; (b) the LIPS appeared no more culture free than either the Binet or the Arthur; (c) the LIPS means for all groups were consistently smaller than those of either the Arthur or the Binet with the magnitude of means correlated ^ Russell G. Leiter, and Stanley D. Porteus, The Leiter International Performance Scale (University of Hawaii Research Publication, No. lj, Honolulu, Hawaii: University of Hawaii, May 1956), pp. 52-3 5. Miriam E. Tate, "Influence of Cultural Factors on the Leiter International Performance Scale,” Journal of Abnormal and Social Psychology, 47:497-501, April 222 as highly with each other tests as those tests did with each other. 17 He concluded: The general conclusion is that,the LIPS, though probably no more free of cultural /influence than the Binet or the Arthur, is a valid, useful instrument for measuring intelligence at the preschool level, but that it seriously needs a restandardization of revision of published norms. . . .1 8 The Gattell Culture-Free Test. Cattell, In an at tempt to attack the problem of culture influences upon test performance, undertook a study to provide a culture- free test. He felt that intelligence tests measure a good deal of obviously acquired knowledge and skill, and that they are heavily weighted with special abilities distinct from intelligence. To him, an intelligence test (culture-free) should be built upon: The following list of objects common to the ob servations of men wherever and however they live is given as an illustration of a possible nucleus, upon which careful investigation of primitive and civil ized cultures might build a far longer and more detailed matrix, of Items for intelligence tests: Common objects: The human body and its parts Footprints, etc. Trees (schematic and unspecific) (except for Eskimos!) Tate, op. cit., p. 501. Loc. oit. 
Four-legged animals (schematic and unspecific) Earth and sky Clouds, sun, moon, stars, lighting Fire and smoke Water and its transformations Parents and children (growth) and simple family relationship (except in special tribes) Common processes: Breathing, choking, coughing, sneezing Eating, drinking, defecating, urinating Sleeping Birth and death Running, walking, climbing, jumping Striking, stroking Sensing--seeing, hearing, smelling, tasting, etc. Emotional experiences--anger, grief, etc.19 Cattell pictures a "culture-free" intelligence test as: . . . one expressed in a type of performance in which (a) intelligence rather than physical or special aptitude factors are concerned and (b) life experience has brought the performance to its highest hereditary limit for the greatest number of people.20 Aiming at deriving a culture-free intelligence test from his research, he finally decided on six sub- 2i tests. The first, a classifications test, requires the ^ Raymond B. Cattell, "A Culture-Free Intelligence Test I," Journal of Educational Psychology, 31:161-179 March 1 9WI 20 Raymond B. Cattell, and others, "A Culture-Free Intelligence Test: II. Evaluation of Cultural Influence on Test Performance,1 1 Journal of Educational Psychology, 32:81-100, February 19^1T. 21 Raymond B. Cattell, Manual of Directions: A Cul ture Free Test (New York: The Psychological Corp., 1^44). 224 subject to identify in each row of six figures the two that do not belong with the others. In the second, called pool reflections* the subject identifies the one of six drawings which represents the specimen drawing as it would appear in a pool image. The third* called series* requires the subject to identify from six speci mens the one drawing that will complete a series of four members. The fourth* fifth* and sixth parts are called matrices. In these* the testee is required to identify the last number of a series of four or nine parts. 
The test was administered to about one hundred boys in a junior vocational high school and a like number in an academic high school in the same city. Each item was then tested for its ability to discriminate between the vocational and academic high school students. Those items which were most frequently answered correctly by the senior high school students and least frequently by the junior vocational high school students were retained in the test.22 The retained items were further screened on the basis of responses obtained from college students and from pupils in grades seven and eight. Finally, an analysis was made using the highest-scoring and lowest-scoring students among a group of two hundred students majoring in psychology.

22 Cattell, Manual of Directions: A Culture-Free Test, p. 3.

The test material consists of problems of the type known as perceptual, which are believed to measure Spearman's general factor of intelligence more accurately than any other type. Moreover, ability to do perceptual problems is said to be free from the influence of cultural and educational training. However, Wechsler in reviewing the tests said:

    The title of this test is misleading, for, apart from the arbitrariness of the author's definition of culture, there is little evidence to show that the test is free from over-all cultural influences, however defined.23

In any case, the test can be given to literate, illiterate, educated, uneducated, and to other groups of widely varying cultural backgrounds. It fills an urgent need for a discriminating, non-verbal, at least partially culture-free test of intelligence.

23 Oscar K. Buros, editor, The Third Mental Measurements Yearbook (New Brunswick: Rutgers University Press, 1949), p. 2.

The work of the Chicago group (Davis, Eells, Havighurst, and Haggard). Considering the influences of broad, inclusive cultural forces upon mental test performance, some investigators maintain that most of the current tests of intelligence include materials that favor the middle and upper socio-economic groups while handicapping the lower. Davis commented upon the problem of cultural bias in existing intelligence tests in an address delivered before the Midcentury White House Conference with these words:

    Using recent research, I should like to point out that socio-economic factors influence the school's diagnosis of a child's intelligence tests, lower-class children at ages six to ten have an average I.Q. which is eight to twelve points beneath the average I.Q. of the higher socio-economic group. For children 14, the lowest socio-economic group is 20 to 23 I.Q. points beneath that of the higher occupational groups. In the same way, the present tests define rural children, on the average, as much less intelligent than urban children; southern white children as much less intelligent than northern white children, and so on.

    There is no clear, scientific evidence, however, that these tests use chiefly problems which are far more frequently met in urban middle class culture. . . .

    During the last five years, at the University of Chicago, an intensive and cooperative study of the present intelligence tests has been carried out, on a grant from the General Education Board of the Rockefeller Foundation. The study revealed:

    (1) Ten of the most widely used standard tests of intelligence are composed of an overwhelming proportion of questions on which the higher occupational groups are superior.

    (2) This superiority is found, upon study, to be associated with the type of vocabulary used in these standard tests and with the greater training and motivation of the higher occupational groups with regard to these tests.24
The position of Davis and Havighurst25 with regard to current tests of intelligence has been summarized by them thus:

    (1) All responses to all items in all tests of general intelligence are necessarily and inevitably influenced by the culture of the respondent.

    (2) In a test of general mental ability to be used in the United States, the problems should be selected from the common culture, expressed in cultural symbols common to all native inhabitants of the U.S., and selected from that common culture only.

    (3) In all available tests of general intelligence, however, there are numerous items implying experience that is part of the culture of the higher socio-economic groups, but not equally a part of the culture of the approximately 60 per cent of all Americans who grow up in the lower socio-economic groups.

    (4) Therefore, the basic cultural flaws in all available tests of general intelligence may be overcome by including only those problems and symbols that imply experience that is part of the general American culture.26

24 Allison Davis, "Socio-economic Influence Upon Children's Learning." Reprint of address delivered at the Midcentury White House Conference on Children and Youth, National Guard Armory, Washington, D. C., December 1950, pp. 8-9.

25 Allison Davis and Robert J. Havighurst, "The Measurement of Mental System," Scientific Monthly, 66: 301-316, April 1948.

26 Davis and Havighurst, "The Measurement of Mental System," p. 303.

A valid comparison of mental ability of persons from dissimilar cultural groups must utilize those elements which are familiar to persons of both groups. In dealing with different socio-economic levels in the United States, it may be possible to distinguish the cultural elements which are common to both high and low levels as well as those elements which are characteristic of only one level. In a gross classification, the cultural experiences of individuals in the American society may be divided into three groups:

    1. The common American culture. These are the experiences and types of training which every normal American child undergoes.

    2. The social-class culture. This category refers to those experiences which are "class-typed," i.e., which are characteristic of a given level of American society. Also, some experiences apparently common to different socio-economic classes are actually quite dissimilar. A certain food, for example, may be served on both high and low-status tables, yet be cooked, served, and eaten in an entirely different fashion.

    3. The ethnic--including color caste--culture. This includes a variety of customs and cultural patterns which exist in minority groups not completely assimilated by the common American culture.27

A number of studies, under the leadership of Davis, attempted to isolate the effect of cultural status on intelligence tests. The "cultural bias" is removed by using only words, grammatical construction, and situations which are equally common in the environments of all socio-economic groups in the United States. Thus he hoped to find problems on which all individuals taking the test have had approximately the same amount of training and experience.28

Davis changed certain test items to eliminate the cultural loading of the content in such manner that the essential problems appeared to be unchanged. For example, a problem was taken:

    A symphony is to composer as a book is to what?
    ( ) paper, ( ) sculptor, ( ) author, ( ) musician, ( ) man.

On this problem 81 per cent of the upper socio-economic group answered correctly, while 52 per cent of the lower group were correct. The problem was changed in words but kept the same in nature as follows:

    A baker goes with bread the same way a carpenter goes with what?
    ( ) a saw, ( ) a house, ( ) a spoon, ( ) a nail, ( ) a man.29

The difference between the upper and lower socio-economic groups was eliminated.

27 Robert D. Hess, "An Experimental Culture-Fair Test of Mental Ability," (unpublished Doctor's dissertation, Department of Education, The University of Chicago, 1950), p. 21.

28 Allison Davis, "Education and the Conservation of Human Resources." Official Report 1949 Convention, American Association of School Administrators (Washington, D.C.: The Association, a Department of the National Education Association, 1949), pp. 74-83.

Another example is the syllogism type of problem, such as,

    A is shorter than B.
    B is shorter than C.
    Therefore, which is correct?
    ( ) B is taller than C.
    ( ) A is as tall as B or C.
    ( ) A is shorter than C.

On this item, 67 per cent of the higher socio-economic group and 45 per cent of the lower group answered correctly. The item was changed to,

    Jim can hit harder than Bill.
    Bill can hit harder than Ted, so which is true?
    ( ) Ted can hit harder than Bill.
    ( ) Bill can hit as hard as Jim and Ted.
    ( ) Jim can hit harder than Ted.30

On this form of the problem, the two socio-economic groups did equally well.

29 Davis, "Education and the Conservation of Human Resources," p. 78.

30 Ibid., p. 79.

Hess, as a result of his study and other studies sponsored by the Chicago group, constructed an experimental culture-fair intelligence test. It is an individual test for children of ages six to nine inclusive. The test's problems were taken from life experiences which measure reasoning, memory, observation, critical objectivity, and creativeness. It includes syllogisms, problems of logical classification, inductive reasoning, arithmetical reasoning, and problems of imaginative insight. After experimenting with his "culture-fair" test, Hess concluded on the basis of his results with the test that:

    1. There was no difference of performance between the high- and low-status white groups on the experimental total test at any of the four age levels to which the test was administered.31

    2. The experimental test demonstrated adequate efficiency in discriminating individual differences in both high and low socio-economic samples.32

The work of the Chicago group on intelligence and cultural differences was climaxed by the publication of the Davis-Eells Test of General Intelligence or Problem-Solving Ability.33 The test is designed for use with children in grades 1 through 6. The "Primary" is for use in grades 1 and 2; and the "Elementary" is for use in grades 3 through 6. It is composed of items in picture forms built around common child experiences. The test's problems include the following mental processes: association, logical classification, discrimination of differences, analogy, inferences, organizing the elements of a problem into a meaningful pattern, and gaining of insight.

31 Hess, op. cit., p. 185.

32 Ibid., p. 187.

The test's materials are made interesting to the children. This is done through the "child-oriented" problems which comprise the test, the "semi-humorous" style in which some of the items are expressed, and stressing a "game" rather than a test atmosphere while administering the test. Performance on the test is not dependent on such factors as reading skill, "in-school instruction," or speed of response. Provision is made for praising and reassuring the children while taking the test.

The test has been standardized on a group of children, more than nineteen thousand in grades 1 to 6, selected to be representative of the urban population of the United States with respect to various geographical and socio-economic factors.

33 Allison Davis and Kenneth Eells, Davis-Eells Test of General Intelligence or Problem-Solving Ability (Manual) (New York: World Book Company).
The validity of the test is indicated by analysis of the nature of the test and the nature of the problem solving ability which it seeks to measure, rather than by statistical comparisons with other measures which, in themselves, the authors of this test believe, are not satisfactory measures of mental capacity.

While the usual term I.Q. may be used with this test, it is suggested that a new term, Index of Problem-Solving Ability (IPSA), be used. This index is what statisticians call a "normalized standard score," with a mean of 100 and standard deviation of 16 for selected age groups in the population on which the test was standardized.

The test has not been used in any experimental work yet. Therefore, it should be regarded as a tentative and experimental device until its degree of fairness to different cultural groups is demonstrated by other investigators.

Summary and evaluation. There is no doubt that a culture-free or fair intelligence test is needed for purposes of genetical studies and comparisons of groups with differing cultural backgrounds. The elimination of the influences of training and environment would make scores from intelligence tests much more meaningful than they are at the present time when a score is made up of unknown proportions of innate capacity and environmental effects.

Since these tests make a deliberate attempt to include only content which is universally familiar in all cultures, they provide a means for studying the effects of racial, cultural, and educational influences on mental capacity, and perhaps even of obtaining some evidence concerning these influences on the constitutional factors responsible for mental ability. Anastasi says in respect to these tests:

    In actual practice, of course, such tests fall short of this goal (being universally familiar in all cultures). Moreover, the term 'culture common' tests, would probably be more accurate than 'culture free' since at best, performance on such items is free from cultural differences, but not from cultural influences.34

34 Anne Anastasi, "Some Implications of Cultural Factors for Test Construction," The 1949 Invitational Conference on Testing Problems, Educational Testing Service, p. 14.

However, these tests are limited in their range. They consist exclusively of visual perceptual items which some psychologists believe do not call for very complex mental processes, despite the claim of their authors that problems included in these tests require essentially the same type of intelligence performance as that required by abstract symbols of language.

Another difficulty is the low validity reported, if the other tests were taken as basic criteria. The acceptance of "face validity," which is usually understood to mean prima facie or "common-sense" validation, is objected to by many psychologists.

Finally, these tests have not been used in enough studies to give a valid conclusion concerning their fairness with various cultural groups; therefore, they must be regarded as tentative instruments pending further investigations on their usefulness with different social and cultural groups.

CHAPTER X

COMPARISON OF THE STANFORD-BINET WITH ITS ADAPTATION IN OTHER CULTURES

The influence of Binet's scales was widespread over Europe and the United States. From 1908 on, a more or less constant stream of revisions and modifications of either the 1908 or 1911 scale has been made. In England Burt, in Germany Bobertag, and in the United States Goddard, Kuhlman and Yerkes, were all engaged in further application and revision of the scale, but the most thoroughgoing effort to produce a standardized and adequately based version was made by Terman at Stanford University. The first account appeared in 1912 as a tentative revision and extension of the Binet scale.1 In 1916, Terman's The Measurement of Intelligence2 was published as an explanation of, and a complete guide for, the use of the Stanford Revision and Extension. In 1937, Terman and Merrill published Measuring Intelligence3 as a guide to the administration of the New Revised Stanford-Binet Tests of Intelligence. Here they have provided two scales which differ in content, but are mutually equivalent with respect to difficulty, range, reliability, and validity.

1 Lewis M. Terman and H. G. Childs, "A Tentative Revision and Extension of the Binet-Simon Measuring Scale of Intelligence," Journal of Educational Psychology, 3: 61-74, 133-143, 198-208, 277-289, February, March, April, May 1912.

2 Lewis M. Terman, The Measurement of Intelligence (New York: Houghton Mifflin Company, 1916), 362 pp.

3 Lewis M. Terman and Maud A. Merrill, Measuring Intelligence (New York: Houghton Mifflin Company, 1937).

The influence of Binet-Terman's intelligence scales is reflected in the numerous translations and adaptations which have appeared in many countries all over the world. Quite naturally, the Stanford-Binet scales contain items which are uniquely American. Even some pictorial items refer to things that only an American child could be expected to know. Thus an attempt was made to procure copies of adaptations of the Stanford-Binet from different countries. The purpose was to find out how the subtests on the Stanford-Binet have been adapted for use in cultures other than the American culture. Letters were sent to the Ministry of Education of many countries, stating the purpose of the study and asking for a copy of the Stanford-Binet in their native tongue. On one hand, only a few countries sent the requested copies. These include Egypt, Sweden, and India. On the other hand, other countries sent different tests which are not useful for the purpose of this chapter.
Still others replied, explaining that the translation is under process and has not yet been completed.

In this chapter, an attempt will be made to compare the 1916 Stanford-Binet test with its adaptation for use in other countries. These changes are shown in Table XXI. Then, a similar comparison will be made between the 1937 Stanford-Binet and its Swedish adaptation. These analyses will be made on the basis of age assignments and the content of the individual items. In addition, the Swedish-Stanford-Binet comparison will contain an analysis of the score points. These facts are presented in Table XXII. Instead of a summary of the tables, a discussion will be offered for each test individually.

TABLE XXI
COMPARISON BETWEEN THE 1916 STANFORD-BINET AND ITS ADAPTATIONS

Content marks are listed in the column order Egypt, Mexico, India, S. Africa. Where fewer than four marks are legible in the source, they are listed in the order found and their columns are uncertain.

No. of test | Name of test and year (S.B.) | Age assignment changed: India | Age assignment changed: S. Africa | Item's content modified or changed (Egypt, Mexico, India, S. Africa)

Year III
1 | Pointing to parts of the body | 3-1 | 3-1 |
2 | Naming familiar objects | 3-2 | 3-5 | -- --
3 | Enumeration of objects in pictures | 3-4 | 3-6 | -- x x ?
4 | Giving sex | Alt. 3 | 3-3 | -- -- -- --
5 | Giving the family name | Alt. 5 | 3-4 | -- --
6 | Repeating six to seven syllables | 3-5 | 4-6 | -- x -- x
Alt. | Repeating three digits | 4-1 | 4-1 | -- -- -- --

Year IV
1 | Comparison of lines | 3-6 | 4-2 |
2 | Discrimination of forms | 4-2 | -- | omitted
3 | Counting four pennies | 4-5 | 4-3 | x x x
4 | Copying a square | 4-6 | 5-1 | -- -- --
5 | Comprehension, first degree | 4-3 | 4-5 | -- x
6 | Repeating four digits | 6-1 | 5-4 | --
Alt. | Repeating twelve to thirteen syllables | 4-4 | 6-6 | -- x -- x

Year V
1 | Comparison of weights | Alt. 4 | 5-7 | -- -- -- --
2 | Naming colors | Alt. 6 | 5-8 | -- -- -- --
3 | Aesthetic comparison | 5-1 | 5-5 | -- -- -- --
4 | Giving definitions in terms of use | 5-2 | 6-1 | -- x x --
5 | The game of patience | 6-3 | 5-6 | -- -- -- --
6 | Three commissions | 5-3 | 6-2 | -- -- -- --
Alt. | Giving age | Alt. 5 | 5-2 | -- --

Year VI
1 | Distinguishing right and left | 5-4 | 6-7 | -- -- -- --
2 | Finding omissions in pictures | 6-6 | 7-2 | -- -- -- --
3 | Counting thirteen pennies | 5-6 | 6-8 | x x x --
4 | Comprehension, second degree | 6-2 | 6-3 | -- x -- x
5 | Naming four coins | 5-5 | 6-5 | -- -- -- --
6 | Repeating sixteen to eighteen syllables | 7-1 | 7-4 | x x x x
Alt. | Forenoon and afternoon | Alt. 5 | -- | omitted

Year VII
1 | Giving the number of fingers | 6-4 | 7-3 | -- -- -- --
2 | Description of pictures | 6-5 | 6-4 | x x ?
3 | Repeating five digits | 8-2 | 8-6 | -- --
4 | Tying a bow-knot | Alt. 8 | 7-6 | -- --
5 | Giving differences from memory | 7-6 | 8-5 | x x x
6 | Copying a diamond | 7-2 | 7-5 | -- --
Alt. 1 | Naming the days of the week | 7-4 | 8-2 | --
Alt. 2 | Repeating three digits reversed | 7-3 | 8-4 | -- -- --

Year VIII
1 | The ball-and-field test | Alt. 8 | 9-4 | --
2 | Counting backwards from 20 to 1 | 7-5 | 8-3 | -- -- --
3 | Comprehension, third degree | 8-3 | 8-1 | -- x -- --
4 | Giving similarities; two things | 9-3 | 9-2 | x x x
5 | Giving definitions superior to use | 8-4 | 10-1 | x x --
6 | Vocabulary; twenty definitions | Alt. 9 | | ? x x omitted
Alt. 1 | Naming six coins | 8-5 | -- | omitted
Alt. 2 | Writing from dictation | | 9-1 | x x omitted

Year IX
1 | Giving the date | Alt. 10 | -- | -- -- omitted
2 | Arranging five weights | 10-1 | 10-4 | -- --
3 | Making change | 9-2 | 11-1 | x x x
4 | Repeating four digits reversed | 9-1 | 9-3 | --
5 | Using three words in a sentence | 9-4 | 9-3 | x x
6 | Finding rhymes | 10-5 | 11-5 | x x x
Alt. 1 | Naming the months | 10-3 | 10-3 | -- -- --
Alt. 2 | Counting the value of stamps | -- | -- | omitted omitted

Year X
1 | Vocabulary (thirty definitions) | Alt. 12 | | ? x x omitted
2 | Detecting absurdities | 12-1 | 10-6 | x -- -- --
3 | Drawing designs from memory | 10-4 | 11-2 | -- -- -- --
4 | Reading for eight memories | 10-6 | 11-6 |
5 | Comprehension, fourth degree | Alt. 12 | 12-3 |
6 | Naming sixty words | Alt. A.A. | 12-2 |
Alt. 1 | Repeating six digits | Alt. 12 | 11-4 |
Alt. 2 | Repeating twenty to twenty-two syllables | 10-2 | 12-1 |
Alt. 3 | Construction puzzle A | 12-2 | 10-5 |
[Six x marks survive for Year X tests 4 to Alt. 3; their rows and columns cannot be recovered.]

Year XII
1 | Vocabulary (forty definitions) | Alt. 14 | | ? x x ?
2 | Defining abstract words | 12-3 | 14-1 | x x x
3 | The ball-and-field test (superior plan) | 14-6 | 12-4 | -- --
4 | Dissected sentences | 14-2 | 13-2 | -- x x --
5 | Interpretation of fables | 12-5 | 14-2 | x --
6 | Repeating five digits reversed | 12-4 | -- | omitted
7 | Interpretation of pictures | 12-6 | 13-3 | -- x x
8 | Giving similarities, three things | 14-5 | 13-1 | -- x x

Year XIV
1 | Vocabulary (fifty definitions) | | | ? x omitted ?
2 | Induction test: finding a rule | 14-1 | 15-4 | -- -- --
3 | Giving differences between a president and a king | -- | | omitted omitted
4 | Problem questions | A.A.5 | 13-4 | x x x x
5 | Arithmetical reasoning | 14-3 | 15-5 | x x -- --
6 | Reversing hands of clock | A.A.2 | -- | -- omitted
Alt. | Repeating seven digits | A.A.6 | 18-4 | -- -- -- --

Average Adult
1 | Vocabulary (sixty-five definitions) | Alt. V.S.A. | | ? x x ?
2 | Interpretation of fables | A.A.1 | -- | x -- omitted
3 | Differences between abstract terms | S.A.3 | 16-1 | x x x x
4 | Problem of enclosed boxes | 14-4 | 14-4 | -- -- --
5 | Repeating six digits reversed | A.A.4 | 19-4 | -- -- --
6 | Using a code | S.A.1 | | x x omitted
Alt. 1 | Repeating twenty-eight syllables | | 16-3 | x omitted x
Alt. 2 | Comprehension of physical relations | V.S.A.1 | -- | -- -- omitted

Superior Adult
1 | Vocabulary (seventy-five definitions) | | | ? x omitted ?
2 | Binet's paper cutting test | S.A.4 | 17-1 | -- --
3 | Repeating eight digits | V.S.A.2 | 20-2 | -- -- -- --
4 | Repeating thought of passage | V.S.A.3 | -- | -- -- omitted
5 | Repeating seven digits reversed | V.S.A.5 | 20-3 | -- -- -- --
6 | Ingenuity test | S.A.2 | 18-2 | x -- -- x

Key: -- Not changed or modified; x Changed or modified; ? Incomplete information

TABLE XXII
COMPARISON BETWEEN THE 1937 STANFORD-BINET AND ITS SWEDISH ADAPTATION

No. of test | Name of test and year (S.B.) | Age assignment (Swedish) | Item's content changed or modified | Score points: S.B. | Score points: Swedish

Year II
1 | Three-Hole Form Board | 2-1 | -- | 1+ | 1+
2 | Identifying Objects by Name | 2-2 | x | 4+ | 3+
3 | Identifying Parts of the Body | 2-3 | | 3+ | 1+
4 | Block Building: Tower | 2-4 | | 4 bl. | 4 bl.
5 | Picture Vocabulary | Alt. 2 | -- | 2+ | 2+
6 | Word Combinations | 2-6 | -- | |
Alt. | Obeying Simple Commands | 2-5 | -- | 2+ | 2+

Year II-6
1 | Identifying Objects by Use | 2-6-1 | -- | 3+ | 3+
2 | Identifying Parts of the Body | 2-6-2 | | 4+ | 3+
3 | Naming Objects | 2-6-3 | -- | 4+ | 4+
4 | Picture Vocabulary | 2-6-4 | -- | 9+ | 9+
5 | Repeating 2 Digits | 3-3 | -- | 1+ | 1+
6 | Three-Hole Form Board: Rotated | 2-6-6 | -- | 1+ | 2+
Alt. | Identifying Objects by Name | Alt. 2-6 | -- | 5+ | 5+

Year III
1 | Stringing Beads | 3-1 | -- | 4 b. | 6 b.
2 | Picture Vocabulary | 3-2 | -- | 12+ | 12+
3 | Block Building: Bridge | 2-6-5 | -- | |
4 | Picture Memories | 3-4 | | 1+ | 1+
5 | Copying a Circle | 3-5 | | 1+ | 1+
6 | Repeating 3 Digits | 3-5 | | 1+ | 1+
Alt. | Three-Hole Form Board: Rotated | omitted | | |

Year III-6
1 | Obeying Simple Commands | 3-6 | | 3+ | 3+
2 | Picture Vocabulary | 3-6-1 | -- | 15+ | 15+
3 | Comparison of Sticks | 3-6-2 | -- | 3+ | 3+
4 | Response to Picture I | 3-6-4 | -- | 2+ | 3+
5 | Identifying Objects by Use | 3-6-5 | -- | 5+ | 5+
6 | Comprehension I | 3-6-6 | -- | 1+ | 2+
Alt. | Drawing Designs: Cross | Alt. 3-6 | | |

Year IV
1 | Picture Vocabulary | 4-1 | | 16+ | 16+
2 | Naming Objects from Memory | 4-2 | -- | 2+ | 2+
3 | Picture Completion: Man | 4-3 | -- | 1 p. | 2 p.
4 | Pictorial Identification | 4-4 | -- | 3+ | 3+
5 | Discrimination of Forms | 4-5 | | 8+ | 10+
6 | Comprehension II | 4-6 | -- | 2+ | 1+
Alt. | Memory for Sentences I | Alt. 4 | x | 1+ | 1+

Year IV-6
1 | Aesthetic Comparison | 4-6-1 | -- | 3+ | 3+
2 | Repeating 4 Digits | 6-3 | -- | 1+ | 1+
3 | Pictorial Likenesses and Differences | 4-6-3 | -- | 3+ | 3+
4 | Materials | 5-3 | -- | 2+ | 1+
5 | Three Commissions | 4-6-5 | -- | 3+ | 3+
6 | Opposite Analogies I | 5-5 | -- | 2+ | 2+
Alt. | Pictorial Identification | 4-6-6 | -- | 4+ | 4+

Year V
1 | Picture Completion: Man | omitted | | |
2 | Paper Folding: Triangle | 4-6-2 | -- | |
3 | Definitions | 4-6-4 | -- | 2+ | 2+
4 | Copying a Square | 5-4 | -- | 1+ | 1+
5 | Memory for Sentences II | 5-2 | x | 1+ | 3+
6 | Counting Four Objects | 5-6 | -- | 2+ | 3+
Alt. | Knot | | | |

Year VI
1 | Vocabulary | 6-1 | x | 5+ | 6+
2 | Copying a Bead Chain from Memory I | 6-2 | -- | |
3 | Mutilated Pictures | 5-1 | -- | 4+ | 3+
4 | Number Concepts | 6-4 | -- | 5+ | 5+
5 | Pictorial Likenesses and Differences | 6-5 | | 2+ | 3+
6 | Maze Tracing | 6-6 | -- | 2+ | 3+

Year VII
1 | Picture Absurdities I | 7-1 | | 3+ | 2+
2 | Similarities: Two Things | 7-2 | x | 2+ | 1+
3 | Copying a Diamond | 7-3 | -- | 2+ | 1+
4 | Comprehension III | 7-5 | -- | 2+ | 2+
5 | Opposite Analogies I | 7-6 | -- | 5+ | 5+
6 | Repeating 5 Digits | 8-2 | -- | 1+ | 1+

Year VIII
1 | Vocabulary | 8-1 | x | 8+ | 10+
2 | Memory for Stories: The Wet Fall | 7-4 | -- | 5+ | 5+
3 | Verbal Absurdities I | 8-3 | -- | 3+ | 2+
4 | Similarities and Differences | 8-4 | x | 3+ | 3+
5 | Comprehension IV | 8-5 | -- | 2+ | 2+
6 | Memory for Sentences III | 8-6 | -- | 1+ | 1+

Year IX
1 | Paper Cutting I | 9-1 | -- | 1+ | 1+
2 | Verbal Absurdities II | 9-2 | x | 3+ | 2+
3 | Memory for Designs | 10-2 | -- | 1+ | 1+
4 | Rhymes: New Form | 9-4 | x | 3+ | 2+
5 | Making Change | 9-5 | -- | 2+ | 3+
6 | Repeating 4 Digits Reversed | 9-6 | -- | 1+ | 1+

Year X
1 | Vocabulary | 10-1 | x | 11+ | 13+
2 | Picture Absurdities II | 11-1 | | |
3 | Reading and Report | 10-3 | | 10 M. | 10 M.
4 | Finding Reasons I | 9-3 | -- | 2+ | 1+
5 | Word Naming | 10-5 | -- | 28 W. | 26 W.
6 | Repeating 6 Digits | 11-2 | -- | 1+ | 1+

Year XI
1 | Memory for Designs | 12-4 | -- | 1 1/2+ | 1 1/2+
2 | Verbal Absurdities III | 11-4 | | 2+ | 2+
3 | Abstract Words I | 11-3 | | 3+ | 3+
4 | Memory for Sentences IV | 10-4 | | 1+ | 1+
5 | Problem Situation | omitted | | |
6 | Similarities: Three Things | 11-6 | | 3+ | 2+

Year XII
1 | Vocabulary | 12-1 | x | 14+ | 17+
2 | Verbal Absurdities II | 13-2 | x | 4+ | 5+
3 | Response to Picture II | 13-1 | | |
4 | Repeating 5 Digits Reversed | 12-5 | -- | 1+ | 1+
5 | Abstract Words II | 14-6 | | 2+ | 3+
6 | Minkus Completion | 12-6 | | 2+ | 2+

Year XIII
1 | Plan of Search | 12-2 | | |
2 | Memory for Words | 12-3 | | 1+ | 1+
3 | Paper Cutting I | 13-3 | -- | 2+ | 2+
4 | Problems of Fact | 13-4 | -- | 2+ | 2+
5 | Dissected Sentences | 13-5 | -- | 2+ | 2+
6 | Copying a Bead Chain from Memory II | 14-3 | -- | |

Year XIV
1 | Vocabulary | 14-1 | x | 16+ | 19+
2 | Induction | 14-2 | -- | |
3 | Picture Absurdities III | 13-6 | | |
4 | Ingenuity | 14-4 | | 1+ | 1+
5 | Orientation: Direction I | 14-5 | | 3+ | 3+
6 | Abstract Words II | 14-6 | -- | 3+ | 3+

Average Adult
1 | Vocabulary | A.A.1 | x | 20+ | 21+
2 | Codes | A.A.2 | x | 1 1/2+ | 1+
3 | Differences Between Abstract Words | S.A.I.3 | -- | 2+ | 2+
4 | Arithmetical Reasoning | A.A.4 | -- | 2+ | 2+
5 | Proverbs | A.A.5 | -- | 2+ | 2+
6 | Ingenuity | S.A.I.5 | -- | 2+ | 2+
7 | Memory for Sentences V | A.A.7 | x | 1+ | 1+
8 | Reconciliation of Opposites | A.A.8 | -- | 3+ | 1+

Superior Adult I
1 | Vocabulary | S.A.I.1 | x | 23+ | 24+
2 | Enclosed Box Problem | S.A.I.2 | -- | 3+ | 3+
3 | Minkus Completion | A.A.3 | -- | 3+ | 3+
4 | Repeating 6 Digits Reversed | S.A.I.4 | -- | 1+ | 1+
5 | Sentence Building | A.A.6 | | 2+ | 1+
6 | Essential Similarities | S.A.I.6 | -- | 2+ | 1+

Superior Adult II
1 | Vocabulary | S.A.II.1 | x | 26+ | 28+
2 | Finding Reasons II | S.A.II.2 | -- | 2+ | 2+
3 | Repeating 8 Digits | S.A.II.3 | -- | 1+ | 1+
4 | Proverbs II | S.A.III.2 | x | 2+ | 2+
5 | Reconciliation of Opposites | S.A.II.5 | -- | 5+ | 5+
6 | Repeating Thought of Passage: Value of Life | S.A.III.4 | -- | |

Superior Adult III
1 | Vocabulary | S.A.III.1 | x | 30+ | 31+
2 | Orientation: Direction II | S.A.II.4 | -- | 2+ | 2+
3 | Opposite Analogies II | S.A.III.3 | -- | 2+ | 2+
4 | Paper Cutting II | S.A.II.6 | | |
5 | Reasoning | S.A.III.5 | | |
6 | Repeating 9 Digits | S.A.III.6 | | 1+ | 1+

Key: -- Not changed or modified; x Changed or modified; M. = Memories; W. = Words

The Indian Form (1940). The 1916 Stanford-Binet was adapted by Kamat for measuring the intelligence of Indian children.4 Though the test followed the original Stanford-Binet closely, the adapter found that some of the subtests and material of the scale were unsuitable for Indian children and had to be replaced, and that some had to be modified to suit Indian conditions. Thus, Indian coins were substituted for American coins; the pictures required for the aesthetic and "missing features" tests were given an Indian appearance, while retaining the original Binet features. Pictures representing Indian life were substituted for pictures of western life which were in the "description of pictures" test. The slip-knot was substituted for the bow-knot. The tests of repeating syllables were modified to fit Indian situations, while retaining the same number of syllables. In the tests of "definitions" and "differences of abstract words," words having the original meaning were selected, but in some cases the negative terms were used for the positive when the positive terms were found to be ambiguous or were used in more than one sense in the Indian languages. The test giving "differences between a patil and a Kulkarni" (village headman and village accountant) was substituted for the test "differences between a president and a king." An entirely new Indian code was substituted for the English code. The vocabulary tests are made up from the words in Kannada and Marathi (two Indian languages) dictionaries, and the number of words for some age levels differs from the original test.

4 V. V. Kamat, Measuring Intelligence of Indian Children (Bombay, India: The Times of India Press, 1951), 243 pp.

In addition, the following tests were omitted:

    Year VIII (Alt. 2). Writing from dictation
    Year IX (Alt. 1). Counting the value of stamps
    Year A.A. (Alt. 1). Repeating twenty-eight syllables

and the following tests were added:

    Year III (3). Repeating 2 digits
    Year III (Alt. 2). Giving proper name
    Year VII (Alt.). Giving day of week and day of month
    Year VIII (6). Reading and report (2 facts: 10 errors)
    Year IX (5). Reading and report (6 facts: 5 errors)
    Year IX (6). Free association, 35 words in 3 minutes
    Year S.A. (5). Repeating 30 syllables
    Year S.A. (6). Reversing triangle in imagination
    Year V.S.A. (4). Reversing triangle in imagination (Binet's form)
    Year V.S.A. (6). Free association, 80 words in 3 minutes

The Indian form was standardized on 1,074 children and adolescents of all ages from 2 to 20 and of both sexes. In allocating the tests to the proper ages, the method advocated by Burt was followed. This method requires that 50 per cent of all children who have just passed their last birthday and have not reached their next pass the test before it is allocated to the later age; that is, if a test is passed by 50 per cent of all children between 6 and 7 years of age, the test is located in year VII. Some of the tests not timed either by Binet or Terman were timed in this revision. The scale extends from the 3 year level to the very superior adult level.

The South African Form (1939). The 1916 Stanford-Binet was adapted by Fick5 for measuring the intelligence of children in South Africa (English Language). Since the information supplied with the booklet and the manual for instructions is lacking, no valid explanation may be given concerning the basis of assigning the tests to their proper ages, the basis for the extension of the scale to the 20 year level, or its standardization group. In any case, the scale ranges from the 3 year level to the 20 year level, with the number of the tests varying from one age level to another.

5 The National Bureau of Educational and Social Research of the Union Education Department, Instructions and Allocation of Marks for the Individual Scale of General Intelligence (Pretoria, South Africa: 1945), 13 pp.

In the tests of "repeating syllables," entirely new sentences replaced the original ones.
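The 50 per cent allocation rule described above for the Indian form can be sketched in a few lines of Python. This is only an illustration: the function name, the choice of the youngest qualifying age group, the "at least 50" comparison, and the pass rates are all assumptions made here, not details taken from Kamat's manual.

```python
from typing import Mapping, Optional


def allocate_year(pass_rate_by_age: Mapping[int, float],
                  criterion: float = 50.0) -> Optional[int]:
    """Return the year level for a test, or None if no group qualifies.

    Keys are ages at last birthday (6 means children between 6 and 7);
    values are the per cent of that age group passing the test.
    Assumption: the youngest group reaching the criterion decides.
    """
    for age in sorted(pass_rate_by_age):
        if pass_rate_by_age[age] >= criterion:
            # Children between 6 and 7 place the test in year VII.
            return age + 1
    return None


# Hypothetical pass rates for one test, invented for illustration.
hypothetical_rates = {5: 31.0, 6: 52.0, 7: 74.0}
print(allocate_year(hypothetical_rates))
```

With these invented figures the 6-to-7 group is the first to reach 50 per cent, so the test would be placed in year VII, as in the example given in the text.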
In the tests of "comprehension," "similarities," "problem situation," "differences between abstract words," "defining abstract words," and "using words in sentences," a few changes were made in the items composing the tests, that is, elimination of some of them and addition of new ones. The vocabulary tests contained many words which were dissimilar to the original Stanford-Binet list, and the number of words was sometimes changed. A glance at Table XXI reveals that the following tests were omitted.

    Year IV (2). Discrimination of forms.
    Year VIII (6). Vocabulary. (Alt.) Naming six coins.
    Year IX (1). Giving date. (Alt. 2) Counting the value of stamps.
    Year X (1). Vocabulary.
    Year XII (6). Repeating five digits reversed.
    Year XIV (3). Giving differences between a president and a king. (6) Reversing hands of clock.
    Year A.A. (2). Interpretation of fables. (6) Using code. (Alt. 2) Comprehension of physical relations.
    Year S.A. (4). Repeating thought of passage.

The following tests were added.

    Year III (2). Two digits.
    Year IV (4). Familiar objects (no error).
    Year VII (1). Knox C (1st attempt).
    Year IX (5). Knox D (1st attempt).
    Year X (2). Arithmetic (1 and 2).
    Year XIV (3). Reasoning test 1.
    Year XV (1). Knox E (2nd attempt). (3) Reasoning test 2.
    Year XVI (2). Absurdity 1. (4) Five digits backwards.
    Year XVII (2). Absurdity 2. (3) Drawing reversed triangle.
    Year XVIII (1). Disarranged sentences. (3) Reasoning test 3.
    Year XIX (2). Disarranged sentences. (3) Absurdity 3.

The Egyptian Form. The Egyptian form6 followed the 1916 Stanford-Binet very closely in every respect. It seems, though the information is lacking, that the translator did not standardize the tests on Egyptian children, nor did he make any pilot study before adapting the scale.

6 Ismaiel M. El-Kabani, The Stanford Revision of the Binet-Simon Intelligence Tests; Manual of Instructions (Cairo, Egypt: The Committee for Editing, Translations and Publishing, n.d.), 64 pp.
The few changes made by him were in the content of a few items to fit the Egyptian environment. Some of these changes are listed below.

Year VII (5)
  Stanford-Binet: Fly and butterfly.
  Stanford-Binet-Egyptian: Fly and cockroach.
Year VIII (4)
  Stanford-Binet: An apple and a peach. Iron and silver.
  Stanford-Binet-Egyptian: An apple and an orange. Copper and silver.
Year VIII (5)
  Stanford-Binet: Balloon, tiger, football, and soldier.
  Stanford-Binet-Egyptian: Airplane, lion, reeds, guard.
Year VIII (Alt. 2)
  Stanford-Binet: See the little boy.
  Stanford-Binet-Egyptian: The dog runs after his master.
Year X (2)
  Stanford-Binet: Detecting absurdities.
  Stanford-Binet-Egyptian: 1. Yesterday, we saw a tall and strong man walking in the street, his hands in his pockets and swinging his cane while walking. 2. A father sent his son a letter saying that he had sent him a dollar within the letter, adding, "If you did not receive it, please cable."
Year XII (2)
  Stanford-Binet: Charity, envy.
  Stanford-Binet-Egyptian: Gallantry, deception.

In addition to what is mentioned above, a new Egyptian code was substituted for the English code. The American coins were replaced by Egyptian coins. The problem questions test was changed, and new questions were put in.

The Mexican Form (1925). The Stanford-Binet-Mexico⁷ was essentially a translation of the 1916 Stanford-Binet with a little modification which was necessary to make it suitable for the Mexican child. The tests had not been standardized on Mexican children nor had any test been moved from one age level to another. Some of these modifications are mentioned below:

⁷ D. P. Boder and others, The Binet-Simon-Terman Scale in Its Provisional Adaptation for Mexico (Mexico, D.F.: National Graphic Work, 1925), 140 pp.

Year III (6)
  Stanford-Binet: I have a little dog. The dog runs after the cat. In the summer the sun is hot.
  Stanford-Binet-Mexican: I have a big dog. I eat "Tortilla" with salt. The girl cries a lot.
Year IV (Alt.)
  Stanford-Binet: The boy's name is John. He is a very good boy. When the train passes you will hear the whistle blow. We are going to have a good time in the country.
  Stanford-Binet-Mexican: The boy Juan cries. He looks for his mother. Yesterday my aunt came and brought a candy. My mother bought me a white shirt.
Year V (4)
  Stanford-Binet: Chair, horse, fork, doll, pencil, and table.
  Stanford-Binet-Mexican: Chair, donkey, knife, doll, pencil, and table.
Year VI (4)
  Stanford-Binet: What's the thing to do if it is raining when you start to school? What's the thing to do if you find that your house is on fire? What's the thing to do if you are going somewhere and miss your train (car)?
  Stanford-Binet-Mexican: What do you do if you have an apple and you want to give it to two children? What do you do if you want to light a fire and do not have matches? What do you do if you want to get your hat from the nail and cannot reach it?
Year VI (6)
  Stanford-Binet: We are having a fine time. We found a little mouse in the trap. Walter had a fine time on his vacation. He went fishing every day. We will go for a long walk. Please give me my pretty straw hat.
  Stanford-Binet-Mexican: Last night we had a good time. We had a pasade (party). I like to go for a walk with mother. She always buys me candies. Yesterday my friend came. We played all afternoon.
Year VII (5)
  Stanford-Binet: What is the difference between a fly and a butterfly?
  Stanford-Binet-Mexican: What is the difference between a hen and a duck?
Year VIII (3)
  Stanford-Binet: What's the thing for you to do when you notice on your way to school that you are in danger of being tardy?
  Stanford-Binet-Mexican: What must you do if a little girl goes to get matches from the table?
Year VIII (4)
  Stanford-Binet: An apple and a peach.
  Stanford-Binet-Mexican: Pear and peach.
Year VIII (5)
  Stanford-Binet: What is a balloon? tiger? football? soldier?
  Stanford-Binet-Mexican: What is an automobile? cat? ball? teacher?
Year IX (5)
  Stanford-Binet: Child, ball, river. Desert, rivers, lakes.
  Stanford-Binet-Mexican: Child, jar, milk. Day, moon, stars.
Year XII (2)
  Stanford-Binet: Pity.
  Stanford-Binet-Mexican: Hypocrisy.
Year XIII (8)
  Stanford-Binet: Smoke, cow, sparrow, knife blade, penny, piece of wire. Rose, potato, tree.
  Stanford-Binet-Mexican: Smoke, cow, chicken, needle, fifty cent piece, nail. Rose, bean, tree.
Year XIV (4)
  Stanford-Binet: An Indian who had to come to town . . .
  Stanford-Binet-Mexican: Many people are gathering around an electric train. The police can be seen there, and from afar an ambulance is coming. What happened?
Year A.A. (3)
  Stanford-Binet: Laziness and idleness.
  Stanford-Binet-Mexican: Laziness and fatigue.

In addition to these changes, entirely new fables replaced the original ones. The vocabulary tests contain words which, in many cases, are not similar to the Stanford-Binet list. In the test of "reading for eight memories" a new selection replaced the original one.

The Swedish Form (1948).⁸ The Swedish form is an adaptation of the 1937 Stanford-Binet. The scale has been standardized on 792 children, mostly from Stockholm and surrounding cities. As a result of her study, Hellistrom changed the number of words per age level and the order of difficulty in the vocabulary test. A comparison of the number of words with Terman's list is shown below:

⁸ Alice Hellistrom, Intelligence Measurement (Swedish Translation and Adaptation; Stockholm, Sweden: Esselte Aktiebolag, 1948), 504 pp.

Year        Stanford-Binet    Swedish Adaptation
6           5                 6
8           8                 10
10          11                13
12          14                17
14          16                19
A.A.        20                21
S.A. I      23                24
S.A. II     26                28
S.A. III    30                31

The adapter also based her age assignments and scoring system, which sometimes differ from those of the original test, on the results of her standardization group, using the increasing percentage of children passing each test with advancing age as a basis for allocating the tests to their age levels. An example of this method is shown below:

                                       Per cent passing at age
Test                                   2        3        4
1. Three-hole form board               57.1     86.6     100
2. Identifying objects by name         57.7     90.0     100
3. Identifying parts of the body       64.2     83.3     100
4. Block building                      60.7     90.0     100
5. Obeying simple commands             48.0     83.3     93.6
6. Word combinations                   71.4     96.6     100
Alt. Picture vocabulary                42.8     96.6     100
Average                                59.8     88.3     98.9

Finally, only a very few changes were made in the item content.
These were in the tests of definition of abstract words, verbal absurdities, and proverbs.

An Investigation of Certain Aspects of Bantu Intelligence.⁹ The South African study of Bantu intelligence is brought in, not because it relates to the topic, but to show the influence of culture on test items when an intelligence test is translated to another language. The investigator felt that the South African Group Intelligence test, in either its English or Afrikaans form, was quite unsuitable for use with Bantu children because the test is a language test, and the language subtests

. . . presume an analytic mode of verbal thinking, which might be quite unfamiliar to one speaking, like the Bantu, a polysynthetic tongue. Where we have words, the Bantu as often as not have roots, stems, prefixes and suffixes. . . .¹⁰

⁹ G. R. Dent, An Investigation of Certain Aspects of Bantu Intelligence (Pretoria, South Africa: Department of Education, Arts and Science, n.d.), 52 pp.
¹⁰ Ibid., p. 18.

The translation was carried out by Dent with the help of a number of native Zulu teachers. Some of the changes that they made are as follows:

Form I

Classification Test (exercise)
(2) "Apple" was replaced by "ikiwane" (wild fig)
    "Pear" was replaced by "idoni" (fruit of water-oak)
    "Plum" was replaced by "ithunduluka" (wild plum)
(3) "Shears" was replaced by "isizenze" (native axe)
    "Plate" was replaced by "uqwembe" (wooden meat trencher)
    "Jam" was replaced by "inyama" (meat)
    "Cheese" was replaced by "amasi" (curds)
    "Butter" was replaced by "amafutha" (fat)

Classification Test
(1) "Potato" was replaced by "amadumbi" (Colocasia antiquorum)
    "Grapes" was replaced by "amakiwane" (wild figs)
    "Apples" was replaced by "amathunduluka" (wild plums)
(2) "Syrup" was replaced by "amaswidi" (sweets)
    "Vinegar" was replaced by "umdokwe" (sour porridge)
(9) ". . ." was replaced by "izindluhu" (Jugo . . .)
    "Lucerne" was replaced by "ubontskisi" (beans)
    "Wheat" was replaced by "amohele" (kaffir corn)
(18) "Ellipse" was replaced by "umudwe" (line)
    "Cube" was replaced by "isigaxa" (solid lump)

Analysis Test
(7) "Nut" was replaced by "intongomane" (monkey nut)
(9) "Father" was replaced by "umzali" (parent)
(13) "Rose" was replaced by "iteke iquakazile" (arum in bloom)
(16) "Butter" was replaced by "amafutha" (fat)
(18) "Steep" was replaced by "uramango" (steep hill)
(19) "Kennel" was replaced by "itambo" (bone)¹¹

¹¹ Ibid., pp. 6-7.

Summary and evaluation. The purpose of this chapter is to find out what items or subtests are eliminated, changed, or retained in the Stanford-Binet adaptations in other cultures. Due to the scarcity of available material, valid conclusions may not be drawn concerning these modifications. However, it would seem that the Egyptian and the Spanish are mere translations with few changes, if any, while the Indian and the Swedish versions differ widely from the original Stanford-Binet in the age assignment of tests. This assignment was made on the basis of standardization with groups of children in the country concerned. In the case of the South African version, many changes have been made.
Many items have been added or eliminated, and the numbers of items for each age level have been varied. Whether or not these things were based on standardization is not known. In most of the cases, the changes have occurred in the verbal items, leaving the non-verbal material relatively untouched. These modifications do not appear to be in one direction, nor does there seem to be any consistency in the items that are modified. It would seem that the desire to remain as close as possible to the original Stanford-Binet has prevented the adapters from making broad adaptations to suit their cultures.

Finally, the Bantu study was brought in, not because it relates to the Stanford-Binet, but to show that the changes are mostly in the verbal material when an intelligence test is translated to another language.

CHAPTER XI

SUGGESTIONS FOR CONSTRUCTION OF AN INTELLIGENCE TEST FOR IRAQ

Although intelligence testing after Binet's initial experimentation has been employed extensively for more than fifty years in western countries, it is regrettable that until now no work in this field has been done in Iraq. This lack of interest in developing intelligence tests cannot be attributed to any one reason. An attempt will be made in this chapter to discuss the difficulties which test constructors are likely to encounter in devising or in adapting an intelligence test for school children in Iraq. In addition, an item evaluation of the Wechsler Intelligence Scale for Children and of the Stanford-Binet was made by a committee of five Iraqi graduate students studying at the University of Southern California to find, on a subjective basis, which items must be eliminated, and which items may be retained, in case one of them is adapted.

First, a brief discussion of each of the difficulties which face any intelligence test builder in Iraq will be presented.
Secondly, a general discussion of the problems of adapting a foreign intelligence test to a different culture will follow.

1. The language. Classic Arabic is the official language in Iraq. The spoken language, however, is not the classic Arabic language. It is a mixed dialect made up of classic forms, and forms derived from other languages. As a result of contributions from many sources, the spoken language is different in pronunciation, grammatical construction, and composition from the classic language. Yet the classic Arabic is highly respected. It is the official language used in the schools, the press, government, and in oratory. It holds, moreover, a religious value, being the language of the Koran, the Holy Book of the Moslems, who constitute the great majority of the population.

Another difficulty, in connection with the language, is that the spoken dialect in the northern region is not understood by the children of the southern region. In fact, the majority of the people in these provinces, located in the north and northeast, speak an entirely different language.

The difference between spoken and written Arabic makes it difficult to construct an intelligence test for school children, especially a group test. If the test is written in the classic language, it would not be using the children's mother tongue, and if it is written in the colloquial form, it would be the first time the subjects have seen it on paper. This problem is less severe for the older children in junior high or senior high, where the differences between the spoken and the written language have become very much less than at early ages.

2. The sampling. All school children in Iraq present only a small and biased sample of the total child population, and the degree, and perhaps also the nature, of the bias vary a great deal between schools in accordance with varying local demand and social conditions.
Neither do there exist in Iraq statistics such as are available in the United States concerning the proportion in the community of rich and poor, townsmen and countrymen, and so on.

One reason for this biased sample is that education is not universal, although it is free. There are many rural sections without schools, and in the rural areas, many children are needed for the assistance that they can give on the farms. The result of these circumstances is that only a few children attend school. So a standardization based on these few would be an unfair sampling of the total child population.

The second reason for a biased sample arises from the examination system. An examination is given at the end of each school year which every child must pass in order to proceed to the next grade. In addition, the Public Examination is given at the end of the primary, intermediate, and the preparatory stages of learning. Passing of these examinations is necessary to go from one level to the next. Generally speaking, fewer than half of the pupils who enter actually graduate from the public schools. In other words, the public examinations at the end of each stage of the educational ladder act as real hurdles, and many students of low standing are eliminated or at least retarded.

The number of students eliminated through examinations from one grade to the next is very large. In addition, uninteresting curricula and the financial inability of parents to keep their children in school play a part in this large elimination. To select a sample of children who remain in school, especially at the upper part of the educational ladder, would be to slant in favor of the upper mentality and income brackets.

Only a few children have birth certificates.
And one may suppose that children with birth certificates are not a representative sample even of their own school, for it is likely that they have higher intelligence than the average, since their parents tend to belong to the more successful classes of the community. Therefore, although school regulations require that a child be six years old to enter school, the parents are able to send him whenever they think he is ready. A typical first grade in the rural areas might, due to this fact, be composed of children from six to nine years or older, while in the cities the range would be lower. Thus, to assume that the children in any one grade are the same age would be incorrect.

Another difficulty, involved in obtaining a fair sex sampling, arises from the fact that, because of old traditions, girls are not encouraged to attend school while boys are.

3. The criteria of validity. Any intelligence test, to be regarded as valid, must correlate highly with outside reliable criteria of intelligence. The criteria often taken in the case of school children are another intelligence test, teachers' estimates, and examination marks. The test cannot be correlated with another believed to be valid, because there is none. There remains only correlation with teachers' estimates of intelligence and with examination results. The teachers' estimates are questionable, because most of the teachers are not familiar with the technical meaning of intelligence.

4. Problems of adapting a foreign intelligence test to a different culture. It is necessary that no one should endeavor to adapt an intelligence test based on a different culture unless he is thoroughly acquainted with the culture to which the test is to be adapted. Ethnological reports furnish more concrete information, but there is nothing that can take the place of actually living with the people and participating in their culture.
Only in this way can insight into their habits of thought and action be gained, and such insight is very necessary in order to frame a proper test.

In adapting an intelligence test, the test items must be wholly suited to the children's environment; and however well one knows the environment, it is impracticable to guess at the suitability; it must be discovered by experimentation. An intelligence test, for example the Stanford-Binet, undoubtedly is suited to the American culture, but, if it is used in Iraq, even with prior adaptation, many items will prove to be weak or wholly indiscriminative, some because they are too easy or too difficult, or because of misunderstanding due to association of ideas which differ in the different geographical and social environments, or because some items would fail to motivate the Iraqi child by seeming silly to him.

An item that is easy for a child in one culture may be difficult for a child in another. For example, in the Wechsler Intelligence Scale for Children, in the information test, the words "Hieroglyphic" and "Genghis Khan" are placed on the lower scale of the difficulty of the test. For an Iraqi child, these words would be easy, because they are a part of his culture. On the other hand, the American child is more likely to know the color of rubies and who wrote Romeo and Juliet than is an Iraqi child. Therefore, in allocating the tests, whether as to their order of difficulty or age assignment, a complete restandardization on new subjects must be used.

In Chapter IX, when comparing the Stanford-Binet with its adaptation in other cultures, it was mentioned that the desire to remain as close as possible to the original has hampered the adapters. It is easier merely to translate the items, but it is apparent that a more general adaptation must be made and tested by experimentation if a really valid test is to be made.

An intelligence test does not
necessarily measure all the intelligence a child has, but only all that he is willing and able at the time to show. The child must understand what he is asked to do, even if he cannot do it. He must be interested in doing it, and trying his best to do it, and he must not be disturbed by irrelevant emotional factors. The means to fulfill these conditions differ in different environments. What will work in the United States may well not work in Iraq. One can find the right way only by experiment.

General principles of importance in the choice and arrangement of test items. It seems clear that in devising or adapting an intelligence scale the following points should be kept in mind:

1. The items should be as varied as possible in order to "tap" the mental ability in various directions.

2. The number of items presented to any individual child should be neither too large nor too small. If the number of items is too small, the reliability of the I.Q. becomes less on account of the small number; while if it is too large, the reliability may also become less owing to the effects of fatigue that creep in.

3. The wording of the questions in the test should be within the range of comprehension of the children of the ages for whom they are intended.

4. The questions should be unequivocal and should call for a definite answer, although not necessarily a stereotyped one. In group tests most of the answers are either a definite word or a definite figure. This makes the assessing of the answers more objective and removes the possibility of subjective judgment. More often correct answers are various, especially in the individual test. They should, however, be suitable and intelligent. For example, take a question like, "What is the difference between a butterfly and a fly?" Different children give different answers and all may be sensible.

5. The questions should give no scope for guesswork.
They should stimulate the children to think and to find a suitable answer. From this point of view the Problem Questions test in the Stanford-Binet is of no great value.

6. The questions should be such as to keep up the interest of the children.

7. The items as well as the sub-tests should be arranged in such a way that the easier ones come first and lead gradually to more and more difficult ones. This has the advantage of encouraging the child to do his best.

8. No two tests of the same nature should follow each other immediately. In other words, there should be as much variety in consecutive tests as possible.

9. The tests should be such as to give no advantage to one child over another because of his better schooling or better home or social environment. It will be seen from what has been said in previous chapters that it is impossible to test the native intelligence of children except through what they have acquired or learned from their everyday experiences.

Analysis of the Wechsler Intelligence Scale for Children and the Stanford-Binet. A committee of five graduate students studying at the University of Southern California was chosen to help in analyzing each item on the Wechsler Intelligence Scale for Children and the Stanford-Binet. None of the five students had taken any intelligence test before. All of them were contacted individually. The purpose of the study was explained to them and full co-operation was obtained. Each one was asked to give his opinion as to whether or not the item, if translated, would fit the Iraqi culture.

The purpose of this inquiry was to investigate the responses of people from another culture to the items of two well-known intelligence scales. It is a widely accepted fact that most intelligence tests are culturally determined, and these two tests are no exception.
The answers of the five students on each item were studied in order to determine which items or subtests can be used for adaptation in Iraq.

This section of the study has further implications. The point of view here maintained is that since these tests cannot be used in their original form, it is desirable to determine tentatively whether they can be used at all for another cultural group, and if so, how effectively. In this study it is suggested that they may serve as basic material for developing such tests in Iraq.

An analysis of the answers showed that all of them agreed that the following items on the Wechsler Intelligence Scale for Children should be eliminated entirely and replaced by others.

Information
  How many pennies make a nickel?
  Who discovered America?
  What is the color of rubies?
  Who wrote Romeo and Juliet?
  What is celebrated on the Fourth of July?
  What does C.O.D. mean?
  How tall is the average American man?
  How far is it from New York to Chicago?
  When is Labor Day?
  What is a lien?

Comprehension
  Why is it better to build a house of brick than of wood?
  Why is it better to pay bills by checks than by cash?
  Why is it generally better to give money to an organized charity than to a street beggar?

Similarities
  In what way are beer and wine alike?
  In what way are piano and violin alike?

Picture Completion
  Card, Thermometer, Hat, Umbrella

Picture Arrangement
  Mother, Train, Scale, Fight, Fire, Burglar, Sleeper, Rain

Object Assembly
  Auto

On the rest of the items, with a few exceptions, full agreement was obtained that, if they are translated, they would suit the social and cultural environment of Iraq. Not a single objection was raised to the Arithmetic, Digit Span, Block Design, and Coding subtests. The vocabulary test was not submitted to this analysis, because it was felt that many of the words are difficult to understand.

In the case of the Stanford-Binet, the results of the analysis are shown in Table XXIII.
A glance at this table reveals that most of the suggested changes occur at the upper age level of the scale in the verbal material. The changes are, in most cases, in items that are not familiar to the Iraqi culture.

TABLE XXIII

ANALYSIS OF COMMITTEE RESPONSES TO STANFORD-BINET ITEMS

Column headings: Name of the test; The whole test is suitable; Item's content must be changed, modified, or eliminated; Articles needing change or modification. Each test below carries the committee's mark (X), followed by any entry in the last column.

Three-Hole Form Board: X
Identifying Objects by Name: X. Engine
Identifying Parts of the Body: X
Block Building: Tower: X
Picture Vocabulary: X. Clock, bed, fork, umbrella, stool
Word Combinations: X
Obeying Simple Commands: X
Identifying Objects of Use: X. Automobile, iron
Naming Objects: X
Repeating 2, 3, 5, 6, 8, 9 Digits: X
Stringing Beads: X
Block Building: Bridge: X
Picture Memories: X
Copying a Circle: X
Comparison of Sticks: X
Response to Picture I: X. Dutch home, river scene, Post Office
Comprehension I, II, III, and IV: X
Drawing Designs: Cross: X
Naming Objects from Memory: X. Automobile, engine

Picture Completion: Man: X
Pictorial Identification: X
Discrimination of Forms: X
Memory for Sentences I, II, III, IV, V: X
Aesthetic Comparison: X
Pictorial Likenesses and Differences: X
Materials: X
Three Commissions: X
Opposite Analogies: X
Paper Folding: Triangle: X
Definitions: X
Copying a Square: X
Counting Four Objects: X
Entries in the last column for the thirteen tests above, in the order given: Stove, umbrella. Sentences should be modified. A table is made of wood, a window of . . . The point of a cane is blunt; the point of a knife is . . . Hat.

Knot: X
Vocabulary: X
Copying a Bead Chain from Memory I, II: X
Mutilated Pictures: X
Number Concepts: X
Maze Tracing: X
Picture Absurdities I: X. Man with umbrella, man and women sitting in the rain
Similarities: Two Things: X. Ship and automobile
Copying a Diamond: X
Memory for Stories: The Wet Fall: X. The story must be modified
Verbal Absurdities I: X. A wheel came off Frank's automobile . . . An engineer said that the more cars . . .

Similarities and Differences: X. Baseball, ocean, penny and quarter
Paper Cutting I, II: X
Verbal Absurdities: X. Bill Jones' feet . . . Christopher Columbus . . . Icebergs
Memory for Designs: X
Rhymes: New Form: X. To be eliminated
Making Change: X
Repeating 4, 5, 6 Digits Reversed: X
Picture Absurdities II: X. The Feature of Frontier Days must be changed
Reading and Report: X
Word Naming: X
Finding Reasons I: X. Automobile and bicycle
Verbal Absurdities III: X. Little modification
Abstract Words I: X. Connection
Problem Situation: X. To be eliminated

Similarities: Three Things: X
Response to Picture II: X
Abstract Words II: X
Minkus Completion: X
Plan of Search: X
Memory for Words: X
Problems of Fact: X
Dissected Sentences: X
Induction: X
Picture Absurdities III: X
Ingenuity: X
Orientation: Direction I, II: X
Codes: X
Differences between Abstract Words: X
Arithmetical Reasoning: X
Proverbs I, II: X
Reconciliation of Opposites: X
Enclosed Box Problem: X
Sentence Building: X
Finding Reasons II: X
Entries in the last column for the twenty tests above, in the order given: Should be changed. Constant. To be eliminated. To be modified. To be changed. To be eliminated. To be modified. Character and Reputation. To be modified. Typewriters should be changed to . . .

Repeating Thought of Passage: Value of Life: X
Opposite Analogies II: X. To be eliminated
Reasoning: X

Summary and evaluation. In this chapter, an attempt was made to discuss the difficulties which face any test constructor in building an intelligence test in Iraq. These difficulties are the language, sampling, and validity. Mention was also made of the difficulties of adapting any test to another culture. Among these are that the adapter should be thoroughly acquainted with the other culture, and with the suitability of items for use in it. He must also be acquainted with the factors that will make the child willing and able to do his best. A brief discussion was also offered as to general principles of importance in the choice and arrangement of test items. Finally, two well-known intelligence tests were subjected to analysis by a committee of five Iraqi graduate students who gave their opinions as to the suitability of each item, if translated.
Although the group is small and has been exposed to the American culture, it is probable that the unanimous rejection of an item by the group would mean that it would not be useable, and unanimous acceptance would probably mean that use of the item is justifiable, at least for initial exploration.

It is hoped that this analysis may be of some help in developing an intelligence test in Iraq. This study, and others like it, may result in the selection of items from many of the better known tests that would be useful and that, with experimentation to test their validity and age-assignment, could be worked into a practical and useable intelligence test for Iraq.

CHAPTER XII

SUMMARY

I. THE PROBLEM AND PROCEDURE

The problem. The purpose of the study was to show the trends in the construction of intelligence tests since the early stages of the movement, with an attempt to show the influence of culture on test items and its implications for construction of an intelligence test for use in Iraq.

The procedure. Data were gathered by reviewing the related research in the field. In addition, copies of the Stanford-Binet adaptations were obtained from different countries. Last, a committee of five Iraqi graduate students studying at the University of Southern California evaluated each item on the Wechsler Intelligence Scale for Children and the Stanford-Binet in terms of its suitability for the Iraqi culture.

II. FINDINGS AND CONCLUSIONS

The need for accurate methods of measuring human potentiality has been recognized since the beginning of written records. Tests which permit objective measurement are relatively new, their roots extending to the development of experimental psychology in Germany, statistical methods in England, the Binet scale in France, and applied psychology in the United States.

As far as intelligence is concerned, the work of Binet deserves special consideration because of the great stimulus he gave intelligence test development.
His work of selecting candidates for special subnormal schools in Paris led him to work out the first intelligence test worthy of that name. In addition to his scales of 1905, 1908, and 1911, Binet made two outstanding contributions to the theory of psychological testing. He developed, first, the concept of general intelligence, and secondly, the concept of mental age.

Binet's work was very influential throughout Europe and the United States. This can be illustrated by the large number of intelligence tests which followed Binet's approach to this problem. However, revisions and departures from Binet's ideas in test construction can be noticed in many scales which appeared after him. The Stanford-Binet revisions, though they are extensions of Binet's work, have made many refinements and have adopted the idea of the I.Q. as suggested by Stern. Yerkes and others modified Binet's practices by omitting the grouping of tests into age groups and using the same tests, in many cases, at different age levels in increasing order of difficulty, with different scoring systems. Kuhlmann went further in his modification of Binet's work by arranging the tests in order of increasing difficulty and assigning an age value and a value in mental units to each test. Wechsler went far in his departure from Binet. He grouped all items of a particular type together, in order of difficulty, constituting a subtest of the whole. He disregarded the mental age concept. Thus, his I.Q. implies a different meaning from the Stanford-Binet because the derivation is different, as are the groups with whom the individual subject is being compared. Thorndike and others, in their CAVD test, made a complete departure from the Binet ideas by arranging the items in order of difficulty, providing seventeen different levels, in each of which the tasks of any one subtest are of nearly equal difficulty.
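The difference in derivation between the Stanford-Binet I.Q. and Wechsler's I.Q. noted above can be illustrated with a short sketch. It is offered only as an illustration, under stated assumptions: the ratio form (100 times mental age over chronological age) is the usual reading of Stern's quotient as used in the Stanford-Binet, and the mean of 100 and standard deviation of 15 used for the deviation form are the conventional Wechsler values, which the text itself does not state. The function names and sample figures are hypothetical.

```python
from statistics import mean, pstdev


def ratio_iq(mental_age_months: float, chronological_age_months: float) -> float:
    """Ratio I.Q.: 100 * mental age / chronological age (Stern's quotient)."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age_months / chronological_age_months


def deviation_iq(raw_score: float, age_group_scores: list[float],
                 mean_iq: float = 100.0, sd_iq: float = 15.0) -> float:
    """Deviation I.Q.: standing of a score within the subject's own age group.

    No mental age is involved. mean_iq and sd_iq are assumed conventions
    (100 and 15), not values taken from the text.
    """
    group_mean = mean(age_group_scores)
    group_sd = pstdev(age_group_scores)  # population standard deviation
    if group_sd == 0:
        raise ValueError("age-group scores must vary")
    z = (raw_score - group_mean) / group_sd
    return mean_iq + sd_iq * z
```

With these definitions, a child of eight years (96 months) with a mental age of ten years (120 months) has a ratio I.Q. of 125, while a child scoring one standard deviation above the mean of his own age group has a deviation I.Q. of 115 whatever his age. The two numbers answer different questions, which is the point made in the paragraph above.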
As a result of the intelligence measurement movement, the performance scales appeared as one means of measuring general intelligence. The language difficulty, inherent in the Binet and its various revisions, made it inadequate for the mental examination of non-English speaking people, speech defectives, the deaf, and those with language disability. Hence, non-language tests, which do not require language responses on the part of the child, were introduced as a substitute. Among those who first used a performance test were Healy and Fernald, Knox, and Pintner and Paterson. However, performance tests do not differ widely from each other in their construction. They are mostly of the form board variety or of various kinds of picture puzzles. Recently, performance scales have been used in testing concept formation. This result has come about principally from experimental work in clinical psychology with brain injuries, though it is used now with normal groups. Despite the advantages of performance tests, they are affected by the speed factor, the manipulative dexterity factor, and the systematic method of working.

Stimulated by the testing movement and the great interest in the possibility that environmental factors may greatly affect mental growth, tests solely for measuring early development were devised. They are of two kinds: infant tests, which are primarily observation scales of growth and development of motor skills, and the pre-school type, which include items of spatial perception, verbal skills, and performance skills. Although some individuals who designed infant and pre-school tests have found certain predictive values for their tests, research studies by others do not bear them out.

Early attempts at mental testing were largely individual in type. General opinion held that group tests could not be well enough controlled to merit consideration as a useful instrument in testing programs.
But during World War I the need was great for testing thousands of men in a short time, and this gave a great impetus to the construction and use of group tests. As a result, the now-famous Army Alpha test was introduced. As the Binet is the direct ancestor of many individual tests, so the Army Alpha is the direct ancestor of many group tests of today.

The evolution of group intelligence tests has been much less clear-cut than that of individual intelligence tests. The Army tests (Alpha and Beta) were the first major development in group intelligence testing. The Alpha scale was suitable for men who could read and understand the English language; the Beta test was intended for those unable to read English. Though many followed the Army practices in developing group intelligence tests, a few departed from this approach. The Otis Self-Administering Test of Mental Ability represents one departure from original Army practice. Otis did not divide the scale into subtests; instead he mixed items of different types throughout the test, beginning with easy items and proceeding to more difficult ones. Thorndike and others, in their CAVD, went far in departing from the Army practice by providing seventeen levels of intellect, from level A to level Q. Each level is represented by forty items or tasks, arranged in order of difficulty. Also, the steps between levels are of approximately equal difficulty. The lowest level is suitable for three-year-old children, while the highest levels are intended for superior adults. The tendency in recent group tests has been a complete departure from the Army practice, in the direction of breaking down the total score, mental age, or I.Q. into two or more aspects of mental ability. Thus the California Tests of Mental Maturity were constructed to measure language and non-language factors.

Group tests do not differ very much in their types of material. Certain types of material are found in many group tests.
These include opposites, analogies, best reasons, disarranged sentences, arithmetical problems, sentence completions, classifications, number completions, word knowledge, and non-verbal material.

It seems that in the last few years the development of intelligence tests has moved away from over-all tests and towards a breakdown into more adequate tests of "factors" or "functions." The idea of constructing intelligence tests on the basis of the factor analysis technique has roots extending back to 1904, when Spearman set forth his two-factor theory. This theory postulated a general factor, denoted by the letter "g," to explain the correlation found between abilities, and a specific factor, denoted by the letter "s," which is largely specific to a particular type of activity. Holzinger modified the theory to account for the fact that some kinds of mental tasks have common elements not present in other tasks. His modification, known as "bi-factor" analysis, sought to divide a set of tests into their general, group, and unique components. A competing theory, sponsored by Thurstone, proposed a relatively small number of moderately broad group factors, each of which may enter with different weights into different tests. The work of Spearman, Holzinger, and others has, however, received less attention than that of Thurstone; the claims and possibilities of factor analysis are best illustrated in his work. Thurstone culminated his investigations by constructing intelligence tests, in various editions and forms, which purport to measure what he calls the "Primary Mental Abilities," which are said to be relatively independent of each other. The idea of general intelligence was given up, and with it went the familiar global score or rating, whether in the form of a mental age, a percentile, or a standard score.
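The three factor theories described above can be written compactly. The notation is an assumption adopted here for illustration and is not taken from the works of Spearman, Holzinger, or Thurstone: z_{ij} is the standard score of person i on test j, all factors are taken to be uncorrelated with unit variance, and q(j) names the one group factor to which test j is assigned.

```latex
% Assumes the amsmath package.
\begin{align*}
% Spearman: one general factor g and a factor s_j specific to test j
z_{ij} &= a_j\, g_i + b_j\, s_{ij},
  & r_{jk} &= a_j a_k \quad (j \neq k) \\
% Holzinger (bi-factor): general, group, and unique components
z_{ij} &= a_j\, g_i + c_j\, h_{i,q(j)} + u_j\, e_{ij} \\
% Thurstone (multiple factors): m group factors, no general factor
z_{ij} &= \sum_{p=1}^{m} a_{jp}\, F_{ip} + u_j\, e_{ij},
  & r_{jk} &= \sum_{p=1}^{m} a_{jp}\, a_{kp} \quad (j \neq k)
\end{align*}
```

Under these assumptions the two-factor form lets every correlation between different tests rest on g alone, while the multiple-factor form spreads it over several group factors, which is why a single global score no longer summarizes the result.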
Since the advent of the intelligence testing movement, intelligence tests have been used to compare the mental ability of racial, cultural, and social groups. Educators and psychologists have debated and investigated the relationship of the I.Q. to the various cultural factors. These investigators have shown that there is a definite and measurable relationship between the scores pupils obtain on intelligence tests and their cultural background.

There is a wealth of evidence that intelligence test scores are related to socio-economic factors. It has been shown that intelligence test performance varies with socio-economic background no matter what tests, what measures of socio-economic background, what age level of pupils, or what statistical techniques are used. Likewise, studies analyzing intelligence test items, though few in number and in some cases inadequate, have shown that the performance of low-status pupils relative to high-status pupils varies to some extent with the particular items and test used. These studies have indicated that pupils from high socio-economic groups did better than pupils from low socio-economic groups on most of the test items, especially items related to verbal ability or school learning. They also showed that the status differences between the two groups are smaller at the younger age levels than at the older age levels.

The question now arises: are the differences found in intelligence test scores due to heredity or to environmental and cultural factors? One hypothesis maintains that the difference is due to heredity. It supposes that the relationship between intelligence test scores and social status is the result of selection: the less able members of an underprivileged group may tend to remain in it, while the able tend to move upward.
On the other hand, many educators and psychologists have attributed the differences to various cultural and social factors, though they do not deny the role of heredity. Some of the factors which may contribute to the status differences are: (1) genetic ability, which may be assumed to be directly inherited through the genetic structure of the individual; (2) developmental factors, meaning those elements of the environment which may contribute to the child's mental growth; (3) social-cultural bias in test items, meaning that certain aspects of the tests operate in favor of high-status pupils; (4) motivation; and (5) work habits.

It is difficult to determine which factor or factors are mainly responsible for status differences. To assume that all of the differences between the social-status groups in test performance are a result of cultural factors would run the risk of going beyond the supporting evidence, and to conclude that pupils from low socio-economic levels are genetically inferior is to assume that the tests are a fair measure of their ability. Data that seem to indicate the operation of one cause are often later interpreted as a sign of a different cause. Many of the findings may be explained in terms of any or all of the presumed factors. It seems likely that the status differences in response to intelligence test items are not due solely to any one simple cause but result from various combinations and types of factors.

It has been established that there is a definite relationship between socio-economic status and intelligence test scores and that culture influences test performance. A culture-free or culture-fair intelligence test is clearly needed if the aim is to test all groups fairly.
Such tests would be more meaningful than those of the present time, in which the score is made up of unknown proportions of innate capacity and of cultural and environmental effects. Since these tests make a deliberate attempt to include only content which is universally familiar in all cultures, they may provide a means for studying the effects of racial, cultural, and educational influences upon mental ability.

Many attempts have been made to devise intelligence tests which would be fair to subjects from different cultural backgrounds. The Draw-a-Man test by Goodenough was one of the early attempts to attack this problem. It was followed by the Leiter International Performance Scale and the Cattell Culture-Free Test. The most popular test of the present time which claims to be fair to all groups is the Davis-Eells Test of General Intelligence or Problem-Solving Ability, designed for use with children in grades 1 through 6. However, these tests are limited in their range. They consist exclusively of visual-perceptual items, which some psychologists believe do not call for very complex mental processes. Also, they have not been used in enough studies to support a valid conclusion concerning their fairness to various cultural groups.

Finally, an attempt was made to find out how the subtests of the Stanford-Binet have been adapted for use in cultures other than the American. Because of the lack of sufficient data and the very small number of copies received, no valid conclusion can be drawn concerning the modifications. Two of the adapted tests are merely translations with few changes, if any, while the other three differ from the original. But these differences do not appear to be in one direction, nor does there seem to be any consistency in the items that are modified. In any case, most of the changes have occurred in the verbal items, leaving the non-verbal material relatively untouched.
In view of work previously done in the field of intelligence testing, the difficulties that a test constructor in Iraq would encounter were mentioned. These difficulties have to do with the language, the test, and the criteria for validity. A brief discussion was also offered of general principles of importance in the choice and arrangement of test items.

Lastly, two well-known intelligence tests were subjected to analysis by a committee of five Iraqi graduate students. The purpose was to obtain their judgments on the suitability of each item in the Wechsler Intelligence Scale for Children and the Stanford-Binet for the cultural and social conditions in Iraq. Although the group was small, it seems probable that an item unanimously rejected by the group would not be usable, and that use of an item unanimously accepted may well be justified. Most of the changes suggested are in items that are not part of the Iraqi culture. It is hoped that this analysis may be of some help in developing or adapting an intelligence test for school children in Iraq.

It appears obvious that an intelligence test, to be useful in Iraq, must be carefully gone over and selected item by item from several of the presently used tests. Some new items will have to be employed, and the entire test will have to be standardized. Further research should be done in Iraq, with Iraqi children, to select items which fit the Iraqi culture and which will be suitable for all of the children, not favoring any one socio-economic group.

III. RECOMMENDATIONS

On the basis of this study, an outlined plan to be followed in the construction of such an intelligence test is offered below.

I. Test to be limited to ages 5-15
   A. Students above fifteen selective because of elimination through examinations
   B. Impossible to gain co-operation of parents of children under five
II. Test to follow Binet's concept of intelligence
   A. Ability to take and maintain definite direction
   B. Capacity to make adaptations to attain desired end
   C. Power to be self-critical
III. Selection of items
   A. Gather many items
      1. From existing United States intelligence tests
      2. New items from Iraqi culture
   B. Sifting of items
      1. Retain those thought to be suitable to Iraqi culture and familiar to all socio-economic groups
      2. Eliminate unsuitable items
   C. Preliminary try-out
      1. Determine approximate age location
      2. Criteria for retention of items
         a. Validity
         b. Ease and objectivity of scoring
         c. Practical considerations
            1. Time economy
            2. Interest to subjects
            3. Need for variety
IV. Selection of students
   A. All from Arabic-speaking homes
   B. Fair sample, as far as possible with lack of census, of all socio-economic and geographical groups
   C. All to be chosen from school children
   D. Age of children to be determined by parental conferences, because of lack of statistics, although this is not accurate
V. Validity of test
   A. Face validity of items
   B. Teachers' marks
   C. Increase of percentage of success in a given test situation with increasing chronological age
   D. Correlation between single test items and total score
VI. Final standardization
   A. One form of the test to be constructed and administered to a large sample
   B. Mean mental age of group should equal mean chronological age as nearly as possible
   C. Age norm will be established
   D. Test results to be expressed in mental age units
VII. Language of test
   A. Lower scale test should be in the spoken language
   B. Upper scale test should be as close as possible to the language taught in school
VIII. Training of test administrators
   A. A few students to be chosen and given a course in mental testing
   B. Trainees to be sent to different localities to give test

BIBLIOGRAPHY

A. BOOKS

Anastasi, Anne, Differential Psychology. New York: The Macmillan Company, 1937. 615 pp.
Arthur, Grace, A Point Scale of Performance, Volume I, Clinical Manual. New York: The Commonwealth Fund, 1943. 64 pp.
Baldwin, Bird T., and others, Farm Children; an Investigation of Rural Child Life in Selected Areas of Iowa. New York: D. Appleton and Co., 1939. 337 pp.
Binet, Alfred, and Th. Simon, The Development of Intelligence in Children, translated by Elizabeth S. Kite. Baltimore: Williams and Wilkins Company, 1916. 319 pp.
Boder, David P., and others, The Binet-Simon-Terman Scale in the Provisional Adaptation for Mexico. Mexico, D.F.: National Graphic Work, 1925. 140 pp.
Boring, Edwin G., A History of Experimental Psychology. New York: The Century Company, 1929. 699 pp.
Boynton, Paul L., Intelligence: Its Manifestation and Measurement. New York: D. Appleton and Company, 1933. 466 pp.
Brigham, Carl, A Study of American Intelligence. Princeton: Princeton University Press, 1923. 210 pp.
Buros, Oscar K., editor, The Nineteen Forty Mental Measurements Yearbook. Bridgeport, Conn.: Braunworth and Co., Inc., 1941. 674 pp.
_______, The Third Mental Measurements Yearbook. New Brunswick: Rutgers University Press, 1949. 1047 pp.
_______, The Fourth Mental Measurements Yearbook. Highland Park, New Jersey: The Gryphon Press, 1953. 1163 pp.
Burt, Cyril, Mental and Scholastic Tests. London: P. S. King and Son, Ltd., 1922. 432 pp.
Carmichael, Leonard, editor, Manual of Child Psychology. New York: John Wiley and Sons, 1945. 1055 pp.
Cattell, Psyche, The Measurement of Intelligence of Infants and Young Children. New York: Psychological Corporation, 1940. 274 pp.
Cornell, Ethel, and Warren Coxe, A Performance Ability Scale: Examination Manual. Yonkers, New York: World Book Company, 1939. pp.
Cronbach, Lee J., Essentials of Psychological Testing. New York: Harper and Brothers, 1949. 475 pp.
Davis, Allison, and Kenneth Eells, Davis-Eells Test of General Intelligence or Problem Solving Ability (Manual). New York: World Book Company, 1953. 72 pp.
Dent, G. R., An Investigation of Certain Aspects of Bantu Intelligence. Pretoria, South Africa: Department of Education, Arts and Science, n.d. 52 pp.
Eells, Kenneth W., and others, Intelligence and Cultural Differences. Chicago: University of Chicago Press, 1951. 388 pp.
El-Kabani, Ismael M., The Stanford Revision of the Binet-Simon Intelligence Tests; Manual of Instructions. Cairo, Egypt: The Committee for Editing, Translations and Publishing, n.d. 64 pp.
Freeman, Frank N., Mental Tests, Their History, Principles and Applications. Boston: Houghton Mifflin Company, 1939. 450 pp.
Freeman, Frank S., Theory and Practice of Psychological Testing. New York: Henry Holt and Company, 1950. 516 pp.
Galton, Francis, Hereditary Genius: An Inquiry into Its Laws and Consequences. London: Macmillan and Company, Ltd., 1914. 379 pp.
_______, Inquiries into Human Faculty and Its Development. London: Macmillan and Company, 1883. 387 pp.
Goodenough, Florence L., Mental Testing: Its History, Principles, and Applications. New York: Rinehart and Company, Inc., 1949. 509 pp.
Goodenough, Florence L., Measurement of Intelligence by Drawing. Chicago: World Book Company, 1926. 177 pp.
Hebb, Donald, The Organization of Behavior; a Neuropsychological Theory. New York: John Wiley and Sons, Inc., 1949. 335 pp.
Hellistrom, Alice, Intelligence Measurement; Swedish Translation and Adaptation. Stockholm, Sweden: Esselte Aktiebolag, 1948. 504 pp.
Herring, John, Herring Revision of the Binet-Simon Tests. Yonkers, New York: World Book Company, 1923. 56 pp.
Holzinger, Karl J., and Frances Swineford, A Study in Factor Analysis: The Reliability of Bi-Factors and Their Relation to Other Measures. Chicago: The University of Chicago Press, 1942. 88 pp.
_______, and others, The Estimation of Pupil Ability by Three Factorial Solutions. Berkeley: University of California Press, 1948. 252 pp.
Hull, Clark L., Aptitude Testing. New York: World Book Company, 1928. 535 pp.
Kamat, V. V., Measuring Intelligence of Indian Children. Bombay, India: The Times of India Press, 1951. 243 pp.
Kelley, Truman L., Crossroads in the Mind of Man: A Study of Differential Mental Abilities. Stanford University, Calif.: Stanford University Press, 1928. 238 pp.
Klineberg, Otto, Race Differences. New York: Harper and Brothers, 1935. 367 pp.
Kuhlmann, Frederick, A Handbook of Mental Tests: A Further Revision and Extension of the Binet-Simon Scale. Baltimore: Warwick and York, Inc., 1922. 208 pp.
_______, Tests of Mental Development: A Complete Scale for Individual Examination. Minneapolis: Educational Test Bureau, 1939. 314 pp.
Leiter, Russell G., The Leiter International Performance Scale, Volume I. Santa Barbara, Calif.: Santa Barbara State College Press, 1940. 95 pp.
McNemar, Quinn, The Revision of the Stanford-Binet Scale: An Analysis of the Standardization Data. New York: Houghton Mifflin Company, 1942. 185 pp.
Mursell, James, Psychological Testing. New York: Longmans, Green and Company, 1950. 488 pp.
National Bureau of Educational and Social Research of the Union Education Department, Instruction and Allocation of Marks for the Individual Scale of General Intelligence. Pretoria, South Africa, 1945. 13 pp.
Newman, H. H., and others, Twins: a Study of Heredity and Environment. Chicago: University of Chicago Press, 1937. 389 pp.
Pintner, Rudolph, Intelligence Testing: Methods and Results. New York: Henry Holt and Company, 1932. 555 pp.
_______, and Donald G. Paterson, A Scale of Performance Tests. New York: D. Appleton and Company, 1923. 218 pp.
Porteus, Stanley D., The Psychology of Primitive People. New York: Longmans, Green and Company, 1931. 438 pp.
_______, and others, The Practice of Clinical Psychology. New York: American Book Company, 1941. 579 pp.
Spearman, Charles E., The Abilities of Man, Their Nature and Measurement. New York: Rinehart and Company, 1949. 415 pp.
Stern, William, The Psychological Methods of Testing Intelligence, translated by Guy M. Whipple. Baltimore: Warwick and York, Inc., 1914. 160 pp.
Stoddard, George D., The Meaning of Intelligence. New York: The Macmillan Company, 1947. 504 pp.
Stoke, Stuart M., Occupational Groups and Child Development. Cambridge: Harvard University Press, 1927. 92 pp.
Stroud, James B., Educational Psychology. New York: The Macmillan Co., 1935. 496 pp.
Terman, Lewis M., The Measurement of Intelligence. Boston: Houghton Mifflin, 1916. 362 pp.
_______, and Maud A. Merrill, Measuring Intelligence. New York: Houghton Mifflin Company, 1937. 461 pp.
Thorndike, Edward L., An Introduction to the Theory of Mental and Social Measurements. New York: Teachers College, Columbia University, 1913. 277 pp.
_______, and others, The Measurement of Intelligence. New York: Teachers College, Columbia University, 1927. 616 pp.
Tryon, Robert C., Cluster Analysis: Correlation Profile and Orthometric (Factor) Analysis for the Isolation of Unities in Mind and Personality. Ann Arbor, Mich.: Edwards Brothers, 1939. 128 pp.
Thurstone, Louis, Multiple Factor Analysis; a Development and Expansion of the Vectors of Mind. Chicago: University of Chicago Press, 1947. 535 pp.
_______, Primary Mental Abilities. Chicago: University of Chicago Press, 1939. 121 pp.
_______, The Vectors of Mind: Multiple-Factor Analysis for the Isolation of Primary Traits. Chicago: University of Chicago Press, 1933. 256 pp.
_______, and Thelma G. Thurstone, Factorial Studies of Intelligence. Chicago: University of Chicago Press, 1941. 94 pp.
Warner, W. Lloyd, and Paul S. Lunt, Social Life of Modern Community. Yankee City Series, Vol. 2. New Haven, Conn.: Yale University Press, 1942. 246 pp.
Wechsler, David, The Measurement of Adult Intelligence. Baltimore: The Williams and Wilkins Company, 1944. 258 pp.
_______, Wechsler Intelligence Scale for Children-Manual. New York: The Psychological Corporation, 1952. 114 pp.
Whipple, Guy M., Manual of Mental and Physical Tests, Simple Processes. Baltimore: Warwick and York, Inc., 1915. 365 pp.
Wolfe, Dael, Factor Analysis to 1940. Chicago: University of Chicago Press, 1940. 69 pp.
Yerkes, Robert M., and Josephine Foster, A Point Scale for Measuring Mental Ability. Baltimore: Warwick and York, Inc., 1923. 219 pp.
_______, and Rose S. Hardwick, A Point Scale for Measuring Mental Ability. Baltimore: Warwick and York, Inc., 1915. 218 pp.
Yoakum, Clarence S., and Robert M. Yerkes, Army Mental Tests. New York: Henry Holt and Company, 1920.

B. PERIODICAL ARTICLES

Arlitt, Ada H., "Further Data on the Influence of Race and Social Status on the Intelligence Quotient," Psychological Bulletin, 18:95-96, February, 1921.
_______, "On the Need for Caution in Establishing Race Norms," Journal of Applied Psychology, 5:179-183, June, 1921.
Armstrong, Clairette P., "A Study of the Intelligence of Rural and Urban Children," Journal of Educational Sociology, 4:301-315, January, 1931.
Asher, Eston J., "The Inadequacy of Current Intelligence Tests for Testing Kentucky Mountain Children," Pedagogical Seminary, 46:480-486, June, 1935.
Arthur, Grace, "Experience in Testing Indian School Children," Mental Hygiene, 25:188-195, April, 1942.
_______, and John M. Cook, "Intelligence Ratings for 97 Mexican Children in St. Paul, Minn.," Journal of 18:14-15, October, 1951.
Ayres, Leonard, "The Binet-Simon Measuring Scale for Intelligence: Some Criticisms and Suggestions," Psychological Clinic, 5:187-196, November, 1911.
Bayley, Nancy, "Mental Growth During the First Three Years: A Developmental Study of Sixty-one Children by Repeated Tests," Genetic Psychology Monographs, 14:1-92, July, 1933.
Beckham, Albert S., "Intelligence of a Negro High School Population in Northern City," Pedagogical Seminary, 54:327-336, June, 1939.
Bennett, Mary W., "Factors Influencing Performance on Group and Individual Tests of Intelligence: I. Rate of Work," Genetic Psychology Monographs, 23:237-318, May, 1941.
Bickersteth, M. E., "Application of Mental Tests to Children of Various Ages," British Journal of Psychology, 9:23-73, December, 1917.
Bolton, Thaddeus L., "The Growth of Memory in School Children," American Journal of Psychology, 4:362-380, April, 1892.
Bridges, James, and Lillian Goler, "The Relation of Intelligence to Social Status," Psychological Review, 24:1-31, January, 1917.
Brown, Fred, "Experimental and Critical Study of the Intelligence of Negro and White Kindergarten Children," Pedagogical Seminary, 65:161-175, September, 1944.
Brown, William, and William Stephenson, "A Test of the Theory of Two Factors," British Journal of Psychology, 23:352-370, April, 1933.
Bruce, Myrtle, "Factors Affecting Intelligence Test Performance of Whites and Negroes in the Rural South," Archives of Psychology, 252:1-99, July, 1940.
Burt, Cyril, "Experimental Tests of General Intelligence," British Journal of Psychology, 3:94-177, December, 1909.
_______, "Factor Analysis by Sub-matrices," Journal of Psychology, 6:339-375, October, 1938.
_______, "Mental Abilities and Mental Factors," British Journal of Educational Psychology, 14:85-94, June, 1944.
Byrns, Ruth, and H. H. Henmon, "Parental Occupation and Mental Ability," Journal of Educational Psychology, 27:284-291, April, 1936.
Canady, Herman G., "The Intelligence of Negro College Students and Parental Occupation," American Journal of Sociology, 42:388-389, November, 1936.
Carlson, Hilding B., and Norman Henderson, "Intelligence of American Children of Mexican Parentage," Journal of Abnormal and Social Psychology, 45:544-551, July, 1950.
Carroll, Rebecca E., "Relation of Social Environment to the Moral Ideology and the Personal Aspiration of Negro Boys and Girls," School Review, 53:30-38, January, 1945.
Cattell, James McKeen, "Mental Tests and Measurements," Mind, 15:373-381, July, 1890.
Cattell, Raymond B., "A Culture-Free Intelligence Test I," Journal of Educational Psychology, 31:161-179, March, 1940.
_______, "Occupational Norms of Intelligence, and the Standardization of an Adult Intelligence Test," British Journal of Psychology, 25:1-28, July, 1934.
_______, and others, "A Culture-Free Intelligence Test: II. Evaluation of Cultural Influence on Test Performance," Journal of Educational Psychology, 32:81-100, February, 1941.
Chapman, Crosby J., and D. M. Wiggins, "Relation of Family Size to Intelligence of Offspring and Socio-Economic Status of Family," Pedagogical Seminary, 32:414-421, September, 1925.
Chauncey, Marlin R., "The Relation of Home Factor to Achievement and Intelligence Test Scores," Journal of Educational Research, 20:88-90, September, 1929.
Clarke, Daniel P., "Stanford-Binet Scale 'L' Response Patterns in Matched Racial Groups," Journal of Negro Education, 10:230-238, April, 1941.
Colvin, Stephen S., "Factors in the Achievement of College Freshmen," School and Society, 24:802-804, December, 1926.
Collins, J. E., "The Intelligence of School Children and Parental Occupation," Journal of Educational Research, 17:157-169, March, 1928.
Cuff, Noel B., "Relationship of Socio-Economic Status to Intelligence and Achievement," Peabody Journal of Education, 11:106-110, November, 1933.
Curtis, Margaret W., and others, "The Gesell Schedules Applied to One-, Two-, and Three-Year-Old Negro Children of Jamaica, B.W.I.," Journal of Comparative Psychology, 20:125-156, October, 1935.
Davenport, E. Lee, "The Intelligence Quotients of Mexican Siblings," School and Society, 36:304-306, September, 1932.
Davis, Allison, and Robert J. Havighurst, "The Measurement of Mental System," Scientific Monthly, 66:301-316, April, 1948.
Davis, Robert A., "Some Relations Between Amount of School Training and Intelligence among Negroes," Journal of Educational Psychology, 19:127-130, February, 1928.
Dennis, Wayne, "The Performance of Hopi Children on the Goodenough Draw-a-Man Test," Journal of Comparative Psychology, 34:341-348, December, 1942.
Dexter, Emily S., "The Relation between Occupation of Parent and Intelligence of Children," School and Society, 17:612-614, June, 1923.
Dubnoff, Belle, "A Comparative Study of Mental Development in Infancy," Pedagogical Seminary, 53:67-73, September, 1938.
Duff, James F., and Godfrey H. Thompson, "The Social and Geographical Distribution of Intelligence in Northumberland," British Journal of Psychology, 14:192-198, October, 1923.
Edward, A. S., and Leslie Jones, "An Experiment and Field Study of North Georgia Mountaineers," Journal of Social Psychology, 9:317-333, August, 1938.
Ellison, Mary L., and Harold A. Edgerton, "The Thurstone Primary Mental Abilities and College Marks," Educational and Psychological Measurement, 1:399-466, October, 1941.
English, Horace B., "An Experimental Study of Mental Capacities of School Children, Correlated with Social Status," Psychological Monographs, 23:266-331, 1917.
Ferguson, George O., "The Psychology of the Negro," Archives of Psychology, 36:1-138, April, 1916.
Fitzgerald, J. A., and W. W. Ludeman, "The Intelligence of Indian Children," Journal of Comparative Psychology, 6:319-328, August, 1926.
Fleming, Charlotte, "Socio-Economic Level and Test Performance," British Journal of Educational Psychology, 13:74-82, June, 1943.
Fruchter, Benjamin, "The Nature of Verbal Fluency," Educational and Psychological Measurement, 8:33-47, Spring, 1948.
Fukuda, Tonan, "A Survey of the Intelligence and Environment of School Children," American Journal of Psychology, 36:124-139, January, 1925.
Garretson, O. K., "A Study of Causes of Retardation Among Mexican Children in a Small Public School System in Arizona," Journal of Educational Psychology, 19:31-40, January, 1928.
Garth, Thomas R., "The Intelligence and Achievement of Southern Negro Children," School and Society, 32:431-435, September, 1930.
_______, "The Intelligence of Full Blood Indians," Journal of Applied Psychology, 9:382-389, 1925.
_______, "The Intelligence of Mixed Blood Indian," Journal of Applied Psychology, 11:268-275, 1927.
_______, "Intelligence of Mexican School Children," School and Society, 27:791-794, June, 1928.
_______, and James Garret, "A Comparative Study of the Intelligence of Indians in United States Indian Schools or in the Public or Common Schools," School and Society, 27:178-184, February, 1928.
_______, and Harper D. Johnson, "The Intelligence and Achievement of Mexican Children in the United States," Journal of Abnormal and Social Psychology, 29:222-229, July, 1934.
Garth, Thomas R., and others, "Administration of Non-Language Intelligence Tests to Mexicans," Journal of Abnormal and Social Psychology, 31:53-58, April, 1936.
_______, and others, "A Study of the Intelligence and Achievement of Full Blood Indians," Journal of Applied Psychology, 12:511-516, 1928.
Glass, Leroy C., "The Relation of the Intelligence of College Students to the Occupation of their Parents," Eugenical News, 21:1-2, January and February, 1936.
Goldstein, Kurt, and Martin Scheerer, "Abstract and Concrete Behavior: An Experimental Study with Special Tests," Psychological Monographs, 53, No. 2:1-151, 1941.
Goodenough, Florence, "Racial Differences in the Intelligence of School Children," Journal of Experimental Psychology, 9:388-397, October, 1926.
_______, and Dale B. Harris, "Studies in the Psychology of Children's Drawing: II. 1928-1949," Psychological Bulletin, 47:369-533, September, 1950.
Goodman, Charles H., "Prediction of College Success by Means of Thurstone's Primary Mental Abilities Tests," Educational and Psychological Measurement, 4:125-140, Summer, 1944.
Gray, J. L., and Pearl Moshinsky, "Ability and Educational Opportunity in Relation to Parental Occupation," Sociological Review, 27:281-327, July, 1935.
Harriman, Philip, "Irregularity of Success on the 1937 Stanford Revision," Journal of Consulting Psychology, 3:83-85, May-June, 1939.
Havighurst, Robert, and Rhea R. Hilkeritch, "Intelligence of Indian Children as Measured by Performance Scale," Journal of Abnormal and Social Psychology, 39:419-433, October, 1944.
_______, and Leota L. Janke, "Relations between Ability and Social Status in a Midwestern Community: I. Ten-Year-Old Children," Journal of Educational Psychology, 34:357-368, September, 1944.
Havighurst, Robert, and others, "Environment and the Draw-a-Man Test: The Performance of Indian Children," Journal of Abnormal and Social Psychology, 41:50-63, January, 1946.
Haught, Benjamin, "Language Difficulty of Spanish American Children," Journal of Applied Psychology, 15:92-95, February, 1931.
_______, "Mental Growth of the South Western Indian," Journal of Applied Psychology, 18:137-142, 1934.
Healy, William, and Grace M. Fernald, "Tests for Practical Mental Classification," Psychological Monographs, 13, No. 2:1-53, March, 1911.
Heidbreder, Edna E., "The Attainment of Concepts: VII. Conceptual Achievements During Card-Sorting," Journal of Psychology, 27:3-39, January, 1949.
Herring, Amanda, "Experimental Study of the Reliability of the Buhler Baby Tests," Journal of Experimental Education, 6:147-160, December, 1937.
Hirsch, Nathaniel, "An Experimental Study of the East Kentucky Mountaineers: a Study in Heredity and En vironment," Genetic Psychology Monographs, 3:183-244, March, 1928. _______, r , A Study of Natio-Racial Differences> " Genetic Psychology Monographs, 1:229-406, May and July, 1 9 2 6. Honzi&, Marjorie P., "Constancy of Mental Test Performance During the Pre-School Period," The Pedagogical Semin ary, 52:285-302, June, 1938. Hunter, Walter, and E. Sommermier, "The Relation of Degree - of Indian Blood to Score on the Otis Intelligence Test," Psychological Bulletin, 18:91-92, February, 1921. Jamieson, Elmer, and Peter Sandiford, "The Mental Capacity of Southern Ontario Indians," Journal of Educational Psychology, 19:313-328, May, 1 9 2 8. Janke, Leota L., and Robert J. Havighurst, "Relations between Ability and Social Status in a Midwestern Community: II. Sixteen-Year-Old Boys and Girls," Journal of Educational Psychology, 3 8: 49 9 -5 0 9 5 Novem ber",1S545. 318 Jordan, Arthur M., ’ 'Notes on Race Differences,” School and Society, 16:503-504, October, 1922. _______, "Parental Occupation and Childrens Intelligence Scores,” Journal of Applied Psychology, 17:103-119* Jastak, Joseph,”An Item Analysis of the Wechsler-Bellevue Tests,” Journal of Consulting Psychology, 14:88-94, April, 155^ Kellogg, Chester E., and N. W. Morton, ’ ’ Revised Beta Ex amination,” Personnel Journal, 13:94-100, August, 1934. Klineberg, Otto, "An Experimental Study of Speed and Other Factors in Racial Differences,” Archives of Psychology, 93:1-111, January, 1928. ■ ____, "A Study of Psychological Differences between Racial and National Groups in Europe,” Archives of Psychology, 20:1-58, September, 1931* Knox, Howard, A., ”A Scale Based on the Work at Ellis Island for Estimating Mental Defects,” Journal of American Medical Association, 62:741-747, March, 1914. 
Krugman, Morris, "Some Impressions of the Revised Stanford- Binet Scale,” Journal of Educational Psychology, 30:594-603, November,- 1939* Lacy, L. D., "Relative Intelligence of White and Colored Children," Elementary School Journal, 26:542-546, March, 1926T Leahy, Alice M., "Nature— Nurture and Intelligence,” Genetic Psychology Monographs, 17:235^308, August, 1935* Livesay, Thayne M., "The Relation of Test Intelligence of' High School Seniors in Hawaii to the Occupations of their Fathers,” Journal of Applied Psychology, 25:369-377* August, 1941. " Long, Howard H., "Test^Results of Third-Grade Negro Children Selected on the Basis of Socio-Economic Status," Journal of Negro Education, 4:192-212, 523- 552, April, October, 1935* Long., Howard H., "The Intelligence of Colored Elementary Pupils in Washington, D.C.," Journal of Negro Educa tion* 3.:205-222, April, 1934. MacDonald, Hector, , , r Phe Social Distribution of Intelli gence in the Isle of Wight,* 1 British Journal of Psy- chology, 16:123-129, October, 1925. Maddy, Nancy E., ’ ’ Comparison of Children1 s Personality Traits, Attitudes, and Intelligence with Parental Occupation, * * Genetic Psychology Monographs, 27:9-65, February, 1943. Mahakian, Charles, ’ ’ Measuring Intelligence and Reading Capacity of Spanish-Speaking Children," Elementary School Journal, 39^760-770, June, 1939. Manuel, Herschel T., and Lois S. Hughes, "Intelligence and Drawing Ability of Young Mexican Children," Journal of Applied Psychology, 1 6: 3 8 2-3 8 7, August, Maxfield, Francis N., "Trends in Testing Intelligence," Educational Research Bulletin, 15:134-141, May, 1936. Menzel, Emil W., "The Goodenough Intelligence Test in India," Journal of Applied Psychology, 19:615-624, October, 1935. Mitchell, Mildred B*, ’ irregularities of University Students on the Revised Stanford-Binet," Journal of Educational Psychology, 32:513-522, October, T94i. 
Newell, Constance D., "The Uses of the Form Board in the Mental Measurement of Children," Psychological Bul letin, 2 8: 3 0 9-3 1 8, April, 1931. Newland, T. Ernest, and William C. Lawrence, "Chicago- non-Verbal Examination Results on East Tennessee Negro Population," Journal of Clinical Psychology, 9:44-47* January, l35!T Oldham, Ernestine V., "The Socio-Economic Status and Personality of Negro Adolescent Girls," Journal of Negro Education, 4:514-522, October, 1935* Papavassiliow, I. Th., "Validity of the Goodenough Draw- a-Man Test in Greece," Journal of Educational Fsy- chology.,—44-:.244^248.,_April,—1953._____ I 320 Peterson, Joseph, and others, "Comparisons of White and Negro Children in Certain Ingenuity and Speed Tests," Journal of Comparative Psychology, 5:271-283, June, 1925. , "The Comparative Abilities of White and Negro Children," Comparative Psychology Monographs, l:l-l4l, July, 1923. Pintner, Rudolph, and Donald G. Paterson, "The Binet Scale and the Deaf Child," Journal of Educational Psychology, 6:201-209, April, 1915• , and Ruth Keller, "Intelligence Tests of Foreign Children," Journal of Educational Psychology, 13*214- 222, April, 1922. Pressey, Luella W., "The Influence of (a) Inadequate Schooling and (b) Poor Environment upon Results With Tests of Intelligence," Journal of Applied Psychology, 4:91-96, March, 1 9 2 0. Pressey, S. L., and Ruth Ralston, "The Relation of the General Intelligence of School Children to the Occupa tion of their Fathers," J ournal of Applied Psychology, 3: 3 6 6-3 7 3, December, 1919* Raven, John C., "Testing the Mental Ability of Adults," Lancet, 242:115-117* January, 1942. 
Rice, Joseph M., "The Futility of the Spelling Grid," Forum, 23:163-174, 409-419, April and June, 1897* Robinson, Mary L., and Max Meenes, "The Relationship Between Test Intelligence of Third Grade Negro Chil dren and the Occupation of Their Parents," Journal of Negro Education, 16:136-141, Spring, 1947* Saltzmann, Sara, "The Influence of Social and Economic Background on Stanford-Binet Performance," Journal of Social Psychology, 12:71-81, August, 1940. Sandiford, Peter, "Parental Occupation and Intelligence of Offspring,".School and Society, 23:117-119, January, 1 9 2 6. Saudek, Robert, "A British Pair of Identical Twins Reared Apart," Character and Personality, 3:17-39, September, 1934. ________________________________________________ , 321 Schwegler, R. A., and Edith Winn, "A Comparative Study of the Intelligence of White and Colored Children, 1 1 Journal of Educational Research, 2:838-848, December, w : -------------------------- Sheldon, William H., "The Intelligence of Mexican Chil dren," School and Society, 19:139-142, February, 1924. Sherman, Mandel, and Cora Key, "The Intelligence of Iso lated Mountain Children, Child Development, 3:279- 290, December, 1932. Spearman, Charles E., "General Intelligence, Objectively Determined and Measured," American Journal of Psy chology, 15:201-293^ April, 1904. ______, "Intelligence Tests," Eugenics Review, 30:249- 254, January, 1939. Speer, George S., "The Intelligence of Foster Children," Pedagogical Seminary, 57:49-55, September, 1940. Smith, Mapheus, "University Student Intelligence and Occupation of Father," American Sociological Review, 7:7 6 4-7 7 1, December, 194^1 Stoddard, George, "On the Meaning of Intelligence," Psychological Review, 48:250-260, May, 1941. Strachan, Lexie, "Distribution of Intelligence Quotients of Twenty-two Thousand Primary School Children," Journal of Educational Research, 14:169-177* October, 1926. Stroud, J. 
B., "Predictive Value of Obtained Intelligence Quotients of Group Favored and Unfavored in Socio- Economic Status," Elementary School Journal, 43:97- 104, October, 1§42. Stuit, Dewey B., and Harry H. Hudson, "The Relation of Primary Mental Abilities to Success in Professional Schools," Journal of Experimental Education, 10:179- 1 8 2, March^ 1942. Tate, Miriam E., "Influence of Cultural Factors on the Leiter International Performance Scale," Journal of Abnormal and Social Psychology, 47:497-50l7 Apri1, 1952. 322' ^ Terman, Lewis M., and H. G. Childs, “A Tentative Revision and Extension of the Binet-Simon Measuring Scale of Intelligence,” Journal of Educational Psychology, 3:61-74, 133-145* 19 8-2158“ , 277-289* February, March, April, May, 1912. _______, and others, “The Stanford Revision of the Binet- Simon Scale and Some Results from Its Application to 1000 Non-Selected Children,“ Journal of Educational Psychology, 6: 5 5 1-5 6 2, November, 1915. Thomson, Godfrey, “The Northumberland Mental Tests, 1 1 British Journal of Psychology, 12:201-222, December, 1921. Thurstone, Louis, “The Isolation of Seven Primary,” Psychological Bulletin, 33*780-781, December, 1936. “A New Rotational Method in Factor Analysis,” Psychometrika, 3*199-218, December, 1 9 3 8. Weigl, Egon, “On the Psychology of the So-called Pro cesses of Abstraction,” Journal of Abnormal and Social Psychology, 36:1-33* January^ 1941. Weintrob, Joseph, and Raleigh Weintrob, “The Influence of Environment of Mental Ability as Shown by Binet- Simon Tests,” Journal of Educational Psychology, 3:577-583* December, 19T 2T" ~ Wells, Frederic, “Army Alpha Revised,” Personnel Journal, 10:411-417* April, 1932. Wheeler, Lester, “A Comparative Study of the Intelligence of East Tennessee Mountain Children,” Journal of Edu cational Psychology, 33:321-334, May, 1942. , “The Intelligence of East Tennessee Mountain Children,” Journal of Educational Psychology, 23:351- 370, May, 1932. Winch, W. 
H., “Christian and Jewish Children in East-End Elementary Schools,” British Journal of Psychology, 29:261-273* January, 1930. Woodrow, Herbert, and Lawrence Wilson, ”A Simple Pro cedure for Approximate Factor Analysis,” Psychometrika 1:245-258, December, 1936. ’ 323 Wylie, Andrew T., nA Brief History of Mental Tests,” Teacher College Record, 23:19-33> January, 1922. - Yerkes, Robert, and Helen Anderson, "The Importance of Social Status as Indicated by the Results of the Point-Scale Method of Measuring Mental Capacity,” Journal of Educational Psychology, 6:137-150, March, 1915. Young, Kimbal, ”The History of Mental Testing,” The Pedagogical Seminary, 31:1-48, March, 1924. C. .PUBLICATIONS OP LEARNED ORGANIZATIONS Bayley, Nancy, "Factors Influencing the Growth of Intel ligence in Young Children,” Intelligence: Its Nature and Nurture: Part III. Original Studies and Experi ments, pp. 49”79* Thirty-ninth' Yearbook of "the National Society for the Study of Education. Bloom ington, Illinois: Public School Publishing Co., 1940. 409 pp. Buros, Oscar K., Chairman, "Influence of Cultural Back- ^ ground on Test Performance,” Proceedings, 1949 In vitational Conference on Testing Problems, pp. 13-34. Princeton, New 'Jersey: Educational Testing Service, 1950. 94 pp. ^Colvin, Stephen S., and. Andrew H. MacPhail/ Intelligence of Seniors in the High Schools of Massachusetts? United States Bureau of Education Bulletin No. 9. Washington: Government Printing Office, 1924. 39 pp. Davis, Allison, "Education and the Conservation of Human Resources," Education and the General Welfare, pp. 74-83. Official Report, 1949 Conventions, the American Association of School Administrators. Freeman, Frank N., and others, "The Influence of Environ ment on the Intelligence, School Achievement, and Conduct of Foster Children," Nature and Nurture: Part Their Influence upon Intelligence, pp. 103-217. The Twenty-seventh "Yearbook of the National Society for the Study of Education. 
Bloomington, Illinois: Public School Publishing Co., 1928. 465 pp. 3 24“! Gordon, Hugh, Mental and Scholastic Tests among Retarded Children. Board ofEducation, pamphlet No7 447 London: H. M. Stationery Off., 1923* 92 pp. Hones, H. E., and others, Environmental Handicap in Mental Test Performance. University of California Publiea- tion in Psychology, V. 5, No. 3. Berkeley, University of California Press, 1932. Pp. 6 3-9 9. Honzik, Marjorie P., "Age Changes in the Relationship between Certain Environmental Variables and Children's Intelligence," Intelligence: Its Nature and Nurture: Part II. Original Studies’ and Experiments, pp. "185- 205. Thirty-ninth Yearbook of the National Society for the Study of Education. Bloomington, Illinois: Public School Publishing Co., 1940. 409 PP. Loevinger, Jane, "Intelligence as Related to Socio- Economic Factors," Intelligence: Its Nature and Nurture: Part I. Comparative and Critical Exposition, pp. 159-210. Thirty-ninth Yearbook of the National Society for the Study of Education. Bloomington, Illinois: Public School Publishing Co., 1940. 471 PP. Manuel, Herschel T., Spanish and English Editions of the Stanford-Binet in Relation to the Abilities of the Mexican ChildT University oF Texas Bulletin, No. 3532. Austin: University of Texas Press, 1935. 63 PP. Stocke, Stuart M., Occupational Groups and Child Develop ment . Harvard Monograph in Education, No. 6. Cambridge, Massachusetts: Harvard University Press, 1927. 92 pp. Stoddard, George D., and Beth L. Wellman, "Environment // and the I.Q.," Intelligence: Its Nature and Nurture: Part I. Comparative and Critical Exposition, pp. 405- 442.. Thirty-ninth"Yearbook of the National Society for the Study of Education. Bloomington, Illinois: Public School Publishing Co., 1940. 471 pp. 325 D. UNPUBLISHED MATERIALS Craig, Ann L., "A Study of the Performance of Mexican Children on the Leiter International Performance Scale." 
Unpublished Master's thesis, the University of Southern California, Los Angeles, 1938. l4l pp. Driggs, Don F., "A Study of the Relationship Between In telligence Test Items and Occupation of Parents of School Children." Unpublished Doctor's dissertation, School of Education, North Carolina, 1952. 227 pp. Eells, Kenneth ¥., "Social-status Factors in Intelligence- test Items." Unpublished Doctor's dissertation, De partment of Education, University of Chicago, 1948. 686 pp. Goulard, Lowel J., "A Study of the Intelligence of Eleven and Twelve Year Old Mexicans by Means of the Leiter International Performance Scale." Unpublished Master's thesis, the University of Southern Califor nia, Los Angeles, 1949. 101 pp. Herriman, Grace W., "An Investigation Concerning the Effect of Language Handicap on Mental Development and Educa tional Progress." Unpublished Master's thesis, the University of Southern California, Los Angeles, 1932. 102 pp. Hess, Robert D., "An Experimental Culture-Fair Test of Mental Ability." Unpublished Doctor's Dissertation, Department of Education, University of Chicago, 1950. 250 pp. Murray, Walter I., "The Intelligence Test Performance of Negro Children of Different Social Classes." Un published Doctor's dissertation, Department of Edu cation, University of Chicago, 1947. 128 pp. Pratt, Philip S., "A Comparison of the School Achievement and Socio-Economic Background of Mexican and White Children in the Delta, Colorado." Unpublished Master's thesis, the University of Southern California, Los Angeles, 1938. 109 PP. 326“ Randals, Edwyna H., 1 1 A Comparative Study of the Intelli gence Test Results of Mexican and Negro Children in Two Elementary Schools.” Unpublished Master1s thesis, the University of Southern California, Los Angeles, 1929. 65 pp. Stone, David R., "Certain Verbal Factors in the Intelli gence-Test Performance of High and Low Social Status Groups," Unpublished Doctor’s dissertation, Depart ment of Education, University of Chicago, 1§46. 
104 pp. U rtw ntr of leathern CaJttonfe
Asset Metadata
Creator: Alzobaie, Abdul Jalil (author)
Core Title: Intelligence test development with special reference to a test for use in Iraq
Degree: Doctor of Philosophy
Degree Program: Education
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: education, tests and measurements, OAI-PMH Harvest
Language: English
Contributor: Digitized by ProQuest (provenance)
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c29-138343
Unique identifier: UC11218028
Identifier: DP25903.pdf (filename), usctheses-c29-138343 (legacy record id)
Legacy Identifier: DP25903.pdf
Dmrecord: 138343
Document Type: Dissertation
Rights: Alzobaie, Abdul Jalil
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the au...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA