THE USABILITY OF TEACHER-GROWTH SCORES VERSUS
CST/API/AYP MATH STATUS SCORES IN SIXTH
AND SEVENTH GRADE MATHEMATICS CLASSES
by
Garry Van Cameron
________________________________________
A Dissertation Presented to the
FACULTY OF THE USC ROSSIER SCHOOL OF EDUCATION
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF EDUCATION
May 2010
Copyright 2010 Garry Van Cameron
ACKNOWLEDGMENTS
I would like to acknowledge my immediate family, Victoria, Kristina, and Garry, who have sacrificed so that I could pursue this accomplishment. I would also like to thank the rest of my family and friends for their understanding and support throughout the process.
I would like to thank Dr. Hocevar, my
dissertation chair. Dr. Hocevar worked with me over
the past year and a half so I could get it right. He
helped me mature and grow through this experience. In addition, I would like to thank Dr. Horton and Dr. Brown for their time in assisting me.
This could not have taken place without the
approval of my former supervisor Mark Lenoir and
Golden Valley Unified School District. I am grateful
for the flexibility Mark Lenoir allowed me to have so
I could spend time on completing the program.
Finally, I would like to thank Jane Pryne and Michelle Reid for permitting me to spend time in Southern California while serving as a first-year principal at Port Angeles High School.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER 1 PROBLEM IDENTIFICATION
  Purpose of the Study
CHAPTER 2 REVIEW OF THE LITERATURE
  Models of School Accountability
    Status Models
    Value-Added-Model
    Growth Model
  Growth Versus Status Model
  Growth Models Currently Being Piloted by States
  Summary
CHAPTER 3 METHOD
  Research Questions
  Data Source
  Instrumentation
  Procedure and Analysis
  Algebraic Explanation of Residualized-Change Scores
CHAPTER 4 RESULTS
  Student Level Descriptive Findings
  Correlational Findings
    Answer to Research Question 1
    Answer to Research Question 2
    Answer to Research Question 3
  Teacher-Level Research Questions
    Answer to Research Question 4
    Answer to Research Question 5
    Answer to Research Question 6
CHAPTER 5 SUMMARY, IMPLICATIONS, AND DISCUSSION
  Summary
  Implications
    Student-Level Residualized-Change Scores
    Teacher-Level Residualized-Change Scores
  Discussion
    Advantages of Residualized-Change Score Models
    Disadvantages of Residualized-Change Scores
    Limitations
    Conclusions
    Residualized Change at the Student Level
    Teacher-Level Conclusions
REFERENCES
APPENDICES
  APPENDIX A STATISTICS
  APPENDIX B CORRELATIONS
LIST OF TABLES
1. API Scaled Scores
2. Evidence of Problem in Four Middle Schools
3. Current API Status Model
4. Tracking Growth and Status Models
5. On Track for Proficiency
6. Arizona Department of Education Example
7. Delaware Department of Education Example
8. Delaware Rejected Model
9. North Carolina Growth Model
10. Michigan Growth Model
11. Iowa Growth Model Example
12. Comparison of Eight States
13. Golden Valley Unified School District Middle School Backgrounds (2007-08 SARC)
14. Mathematics CST 2007 Conversions
15. Mathematics CST 2008 Conversions
16. CST Scores
17. Descriptive Statistics
18. Coefficients Grade Six to Grade Seven
19. Coefficients Grade Seven to Grade Eight
20. Degree of Correlation
21. Residual-Change Correlations
22. Correlations of Eighth Grade Algebra I
23. Teacher-Level Descriptive Statistics
24. Reliabilities: Seventh Grade Pre-Algebra Teachers
25. Reliabilities: Algebra I Teachers
26. Correlations: Status Scores to Change Scores Sixth to Seventh Grade
27. Correlations: Status Scores to Change Scores Seventh to Eighth Grade
28. Percentage of Students Below Proficient on Algebra 1 CST
29. Statistics
30. Descriptive Statistics
31. Correlations
LIST OF FIGURES
1. Number of Proficient Students
2. Standardized Residualized Change (6→7)
3. Standardized Residualized Change (7→8)
4. Distribution of Teacher-Growth Scores
5. Teacher Growth (6→7) by CST 6
6. Positive Residualized Change
ABSTRACT
With passage of The No Child Left Behind Act in
2001, the nation’s schools have been under pressure to
bring all students to a proficient level as defined by
their state. The State of California currently uses a
status model accountability system to measure whether
a student, school, or district is proficient. Other
states such as Arizona, Michigan, North Carolina, and
Tennessee are exploring an alternative to the status
model by piloting a growth model system to assess
whether their students are proficient. This case
study of a California school district examined the
usability of a growth model within seven middle
schools at the teacher-level. The growth model used
was a residualized-change score growth model that
accounted for the pretest when measuring growth.
Additionally, the relationship between sixth and
seventh grade mathematics student test scores and
eighth grade Algebra I student test scores was
explored.
Split-half reliability results were strong, suggesting that the residualized-change score growth model is reliable. Additionally, the data revealed a very strong correlation between teacher-level residualized-change scores and teacher-level CST, API, and AYP scores.
CHAPTER 1
PROBLEM IDENTIFICATION
“A growth model is not a way around accountability standards. It’s a way for states that are already raising achievement and following the bright-line principles of the law to strengthen accountability” (Spellings, 2005, p. 1). The No Child Left
Behind Act (NCLB) of 2001 (United States Department of
Education, 2002a) and the California Accountability
Act of 1999 (California Department of Education,
2009a) have, for the past decade, labeled California
schools and school districts by test scores and by
academic levels using status models as a metric. The
status model used at the federal level is Adequate Yearly Progress (AYP), and the status model used in the State of California is the Academic Performance Index (API).
If a school or school district does not meet the state level of proficiency, it may be subject to punitive measures by the State Department of Education. A school that does not meet the defined levels set by the federal government and the State of California may show defined academic progress toward each of the goals to avoid negative consequences from the state (United States Department of Education, 2003). However, the academic progress is measured by the overall performance level at the school and
level, year-over-year by each student. The growth
model proposed in this quantitative study allowed for
the measurement of year-to-year individual student
performance. This growth model was evaluated in
comparison to the California Standards Test (CST)
status models which the State of California currently
uses. Subject matter CST’s are used to measure
middle-school students in the State of California.
These CST scores are then used in calculation of the
federal AYP and state API scores.
The NCLB Act of 2001 mandates that each public
school is measured annually to determine whether all of its students are proficient. The level of
proficiency is assigned by the state and is to be in
alignment with state standards (California Department
of Education, 2008a). Each year the percent of
students required to be proficient increases until
2014 when the federal government has mandated that all
students must be proficient in grade levels 2-11 in
English and Mathematics (United States Department of
Education, 2002b). The AYP tests do not use a diagnostic measurement to identify a student’s ability upon entering school or measure a student’s growth from the previous school year. The AYP does not evaluate the amount of change over time in an individual student’s academic performance. The AYP
measures how many of the students in the school or
school district meet the state’s definition of
proficiency on a test administered toward the end of
the school year.
The API used in the state of California divides
student test scores into five categories: Advanced,
proficient, basic, below basic, and far below basic.
Each student’s test score is assigned a scaled score depending on which level he or she attains. A student’s API scaled score on a specific test can be determined using Table 1 (California Department of Education, 2009a).
Table 1
API Scaled Scores

Band:    Advanced   Proficient   Basic   Below Basic   Far Below Basic
Points:  1000       875          700     500           200
A school’s API is determined by averaging all
tests using specific weights determined by the
California Department of Education during the previous
year. California has set 800 as the score that a
school or district must meet in all significant
subgroups to meet the state’s growth target. A school
or district that does not make that number in a
significant subgroup, or overall, must make significant progress as a site to avoid state interventions (California Department of Education, 2009a).
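To make the arithmetic concrete, the following is a minimal Python sketch of the Table 1 point assignment. It assumes equal weights across students and tests; the actual API applies test-specific weights set by the California Department of Education, which are not reproduced here.

    # Minimal sketch of the API point assignment from Table 1,
    # assuming equal weights (the real API uses test-specific weights).
    API_POINTS = {
        "advanced": 1000,
        "proficient": 875,
        "basic": 700,
        "below basic": 500,
        "far below basic": 200,
    }

    def school_api(student_bands):
        """Average the API points earned by each student's band placement."""
        points = [API_POINTS[band] for band in student_bands]
        return sum(points) / len(points)

    # Example: one proficient, one basic, and one advanced student.
    print(school_api(["proficient", "basic", "advanced"]))  # 858.33...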
Like the federal AYP, the California API does not
take into account individual student growth nor does
it use a diagnostic test to establish a baseline for
student growth. Also, the California STAR program
does not analyze specific teacher or classroom test
scores. Finally, the California API fails to assign punitive labels or provide interventions to schools whose API is declining but remains above the scaled score of 800.
Due to this use of status models, many school
districts in California find themselves in program
improvement even though they may be making progress
with their students. This study analyzed the use of
the CST status model versus the CST growth model in
Golden Valley Unified School District. Specifically,
the study examined all of Golden Valley Unified’s
sixth and seventh grade middle-school mathematics 2008
and 2009 classroom CSTs. This study analyzed student test scores over a 3-year period to identify whether students were making year-over-year progress and to determine how much progress they were making.
The district was chosen because all of the middle
schools in the district were in program improvement
(PI). However, the high school with the largest
student population was not in PI and 9 of the 15
elementary schools were not in PI. One of the reasons
all of the middle schools were in PI was because they
all failed to make their AYP scores in the English
Language Learner subgroup in Mathematics as identified
in Table 2.
Table 2
Evidence of Problem in Four Middle Schools

                     Met 2008 criteria for:
School     API     English Language Arts     Math     AYP     PI Status
Arch       No      No                        No       No      Year 1
Dart       No      No                        No       No      Year 2
Rally      No      No                        No       No      Year 3
Brown      No      No                        No       No      Year 5
Problem Analysis/Interpretation
Mathematics sixth and seventh grade test scores,
using status and growth models, were chosen because
the Education Superintendent of the State of California has stated that “mastery of Algebra I is critical to success in today’s global economy” (O'Connell, 2008, p. 1). The Algebra I course is taken by all eighth graders in the district with the exception of a small number of accelerated students who take Geometry. The Algebra I California Standards Test (CST) is administered to the eighth graders toward the end of the school year. Since the mastery
of Algebra I is so crucial, an evaluation of the
Mathematics test results leading up to this course was
conducted.
Another issue with Algebra I is that it is one of
the most difficult required courses for students in
California to pass. For example, in Angel Unified School District, which comprises approximately one seventh of all students in California, students failed Algebra I at a rate of 44%, compared to 22% in their eighth-grade English class (Hefland, 2006). The high failure rate in the state-required Algebra I course is the reason for evaluating the Mathematics test scores of the Math 6 and Math 7 classes. It is likely that the problem in Algebra I can only be solved by intervening in Math 6 and Math 7. Two research reports support the conclusion that success in Algebra I begins in the courses leading up to it: an article by Cavanagh (2008) and a University of California journal article (Wu, 2001).
The API result is a score that may not accurately assess schools, districts, and classrooms that are improving across the state (Goldschmidt et al., 2005). This is because the API does not reward schools or students for moving students up within a bandwidth. Also, schools are not rewarded for keeping students in the same bandwidth even though standards become increasingly difficult from year to year. What needs to be known is whether a growth metric can be created that detects whether growth from sixth grade math to seventh grade math predicts Algebra I success. This metric could be helpful to
educators in two respects. First, instructors would
be able to note trends in the data to help guide
teachers and administrators with instructional
practice. If test scores show growth in certain
classrooms, then that instructional practice may be
replicated in other classrooms. An increase in
mathematical ability in the sixth and seventh grade
level may then translate into better results in
Algebra I.
Second, the metric would provide recognition for students, teachers, and schools that may not benefit from the dichotomous AYP model or from the complicated API. The API does not award points for students who
make progress but not enough to perform at a higher
bandwidth. The growth model in this study allows for
that progress to be measured.
A growth model is a metric that tracks the same
students from one year to the next to determine if on
average the students made progress (Goldschmidt et
al., 2005). This model can measure each individual
student, students at the teacher-level, a school of
students, or even students at the district or state
level. The key for the growth model is to measure the
same students from one year to the next, and a growth
model cannot measure growth unless at least two scores
from each student are attained. As Goldschmidt et al. (2005) state, “The basic question under this (growth) model is, how much, on average, did students’ performance change?”
The growth model used in this study was the residualized change model. The residualized change model differs from a raw gain score model in that it takes the pretest into account. The pretest/change correlation in the residualized change model is set at zero. In the raw gain score model, the pretest/gain correlation is not set at zero, and thus greater gains are usually associated with lower scores on the pretest.
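To make the contrast concrete, the following is a minimal Python sketch of one standard way to compute residualized-change scores: regress the posttest on the pretest and keep the residuals, which by least-squares construction correlate zero with the pretest. The variable names and sample scores are illustrative, not the study’s data.

    import numpy as np

    def residualized_change(pretest, posttest):
        """Return the part of each posttest score not predicted by the
        pretest. Unlike raw gains, these residuals are uncorrelated
        with the pretest by construction."""
        pre = np.asarray(pretest, dtype=float)
        post = np.asarray(posttest, dtype=float)
        slope, intercept = np.polyfit(pre, post, 1)  # simple linear regression
        return post - (intercept + slope * pre)      # residuals

    # Hypothetical CST-like scores for five students.
    pre = [254, 302, 351, 421, 300]
    post = [270, 340, 360, 430, 290]
    print(residualized_change(pre, post))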
The residualized change model is preferred to the
raw gain score model because it is a more precise
measure of student growth. By taking out the effects
of the pretest, residualized-change student growth scores are not as likely to be artificially inflated by factors beyond the teacher’s control. In
addition, teachers are more likely to embrace this
measure because the residualized-change score model
would offer meaningful feedback about specific prior
student performance. Meaningful feedback of student
performance as described by Kelley and Finnigan (2004)
is seen as a way to increase the effectiveness and
accountability of programs.
Purpose of the Study
One use of the study is to identify what trends may emerge from using a growth model to analyze sixth to eighth grade mathematics test scores in the district. The trends discovered could point to successful instructional methods or curricula that help students pass their sixth through eighth grade mathematics courses at a higher rate. Conversely, the study may also show that certain teachers or instructional methods were not successful.
This study analyzed the rate at which students were making progress in a class even when they remained below the proficiency level. The purpose of this study was to answer the following six research questions.
1. What is the correlation between student-level residualized-change scores and student-level CST scores, proficiency band (PB) placement scores, and proficiency (P) scores?
2. Do student-level growth scores for the sixth
to seventh grade transition predict performance in
eighth-grade Algebra I?
3. Do sixth and seventh grade student-level CST scores predict performance on the eighth grade Algebra I CST?
4. What is the reliability of teacher-level residualized-change scores, teacher-level CST scores, proficiency band (PB) placement scores, and proficiency (P) scores?
5. Are teacher-level residualized-change scores
correlated with teacher-level CST scores, API scores,
and AYP scores?
6. Do growth scores lead to unfair conclusions about teachers who initially have a low or high achieving group of students?
This study should have great importance to stakeholders at the school, district, and state levels. Middle school site administrators should be able to identify classroom averages and compare their performance to state averages. In addition, the district should be able to compare each site to the state and to the other sites and determine any trends. This dissertation is valuable to site teachers, who will be able to identify whether their instruction is aligned not only with state tests but with district and school regulations. The most fundamental of all stakeholders are the students. If teachers are able to take advantage of the data and improve instruction, mathematics performance should improve and each student should reach his or her mathematics potential.
The study can also have practical implications statewide. Under accountability pressures, groups seek methods to evaluate school and teacher effectiveness, and this study may demonstrate that growth models provide more desirable data than the current CST status models. Another important contribution of this dissertation is the ability to reward schools that show student growth rather than just schools with high-performing students (Goldschmidt et al., 2005). This academic growth can be recognized at the student, teacher, classroom, and school levels. As this student growth is revealed, teachers can strategically work on closing the achievement gap.
CHAPTER 2
REVIEW OF THE LITERATURE
Models of School Accountability
There are currently three types of accountability
models used for evaluating student and school data.
These models are: Status models, value-added-models,
and growth models (Goldschmidt, Roschewski, Choi,
Auty, Hebbler, Bank, & Williams, 2005).
A status model measures a student or school’s
proficiency at one point in time. Growth models track test scores of the same students from year to year. Value-added-models are a type of growth model in which the value added by schools or districts to a score is determined. This determination is based either on a student’s prior academic achievement or on a student’s background (Goldschmidt et al., 2005).
Status Models
There are two types of status models: a conditional and an unconditional model. A conditional status model attempts to account for out-of-school factors. Out-of-school factors could be socio-economic data, ethnicity, family status, transiency, or a variety of other non-school issues. An unconditional status model uses unadjusted mean school performance, or percentage proficient, as an indicator of performance (Goldschmidt et al., 2005).
An unconditional status model is the accountability
model used under the NCLB Act of 2001 (United States
Department of Education, 2002a) and the metric used in
the State of California. This status model is used to
determine the number of students in a school who meet
the state proficiency standard.
Goldschmidt et al. (2005) and Braden (2006) provide three explanations for why a status model is currently being used: “it is easy to understand, all students have the same goals (equity), and it leads to 100% proficiency by 2014” (Braden, 2006, p. 16).
Status models are also seen as the best way to explain
school performance to politicians (Table 3). “Most
policymakers and the public are familiar with a basic
status accountability model” (Goldschmidt & Choi,
2007, p. 2).
Table 3
Current API Status Model

                      Current-year band (points awarded)
Prior-year band       Far Below Basic   Below Basic   Basic   Proficient   Advanced
Far Below Basic       200               500           700     875          1000
Below Basic           200               500           700     875          1000
Basic                 200               500           700     875          1000
Proficient            200               500           700     875          1000
Advanced              200               500           700     875          1000

Source: California Department of Education, 2008a.
Table 3 is a display of the current status model
used by the State of California to measure academic
performance index (API) scores. This model awards
points only based on current-year accomplishments.
This model does not take into account any out-of-
school factors or in-school factors.
Hanushek and Raymond Evaluate Unconditional
Status Models. Hanushek and Raymond (2002) evaluated
the status model in 2002, the year after the NCLB Act
(United States Department of Education, 2002a) was
approved. They disagreed that the status model used
by the federal government to measure AYP would close the achievement gap. “Put another way, if one school has students who come to school with poorer preparation than another, that school must meet a higher standard in terms of its value-added to student learning” (Hanushek & Raymond, 2002, p. 203). Another
criticism Hanushek and Raymond raise is that the status model is used like a grade-point average and is nothing more than an average of outcomes. Year-to-year change, instructional design, and teacher factors are not measured using a status model (Hanushek & Raymond, 2002).
Lastly, Hanushek and Raymond affirm that status
models do not identify family differences or allow for
any measurement errors in performance. Status models
simply average the scores for the year, and the school
is left with the result for better or worse. Hanushek
and Raymond (2002) believe this basic confusion
between average-student achievement and the
contribution of schools is well known and it would be
useful to adjust scores to get a better idea of the
impact of the school (Goldschmidt et al., 2005).
Evaluation of the Unconditional Status Model. A
status model is less suitable for program evaluation
(Goldschmidt et al., 2005). Problems with the status model are that it fails to accommodate differences in student populations between schools, it fails to identify or reward progress toward proficiency, and, since it creates increasingly difficult targets, it yields high rates of failure (Goldschmidt et al., 2005).
The first issue of differences in student populations
is that the status model does not account for any
differences in socio-economic or racial school
composites. In addition, the status metric does not
account for out-of-school factors. These out-of-school factors significantly limit what schools can
accomplish on their own (Berliner, 2009). The status
model simply compares all schools without regard to
any of those important characteristics.
A second disadvantage of the status model that Goldschmidt et al. (2005) mention is that it does not count progress toward proficiency. This means that if a student does not reach proficiency, his/her growth is not tabulated. Another
shortcoming of the status model that Goldschmidt defines is that the model fails to take into account the increasing gap to the proficiency target. This means that a student who scores below proficient has to improve at a faster rate year-over-year than peers who score proficient in order to catch up to the proficiency target.
Value-Added-Model
A value-added-model (VAM) attempts to account for all of the factors that can affect a student’s performance on a standardized test (Goldschmidt et al., 2005). These factors can include family background, innate ability, peer influences, schooling, and luck (Hanushek, 1979). In using a VAM, it is important for evaluators to note that at any given time, the VAM is examining the accumulation of all these factors from when the student began school to the current analysis (Goldschmidt et al., 2005).
In addition to using a variety of variables, a VAM can judge performance against relative or absolute criteria. Relative could mean in relation to a district mean, whereas absolute could mean a fixed target such as the AYP (Goldschmidt et al., 2005).
Goldschmidt et al. (2005) listed the underlying purposes of six VAMs in the Council of Chief State School Officers’ study. The underlying purpose of two of the six VAMs was to implicitly account for the initial status of the student; a third model explicitly accounted for the initial status of the student. One of the other three VAMs used input and output trends, and the last two estimated current growth and the growth needed to pass the state test, or a probability of passing the standardized test. None of the models could draw the same inferences as the AYP, and all of the models had a difficult implementation process.
One of the most mathematically sophisticated and well-studied value-added models is the Tennessee Value-Added Assessment System (TVAAS), developed by Dr. William Sanders of Tennessee (Stone, 1999). The TVAAS model used multiple data sets to determine if value was added by a teacher or school (Goldschmidt et al., 2005).
The TVAAS model was designed using statistical
mixed-model methodologies to conduct
multivariate, longitudinal analyses of student
achievement to make estimates of school, class
size, teacher, and other effects. (Wright, Horn,
& Sanders, 1997, p. 1)
However, the TVAAS model has been highly
criticized for not adjusting for demographic
characteristics, since out-of-school factors have been
shown to influence school performance gains (Ballou,
Sanders, & Wright, 2004). In addition, Goldschmidt et al. (2005) state that the primary drawbacks of the TVAAS model include an extremely complex convergence, a difficult implementation process, and impracticality for program evaluation.
Value-Added-Model Evaluation. Another documented value-added-model was one used in a Texas analysis by Rivkin, Hanushek, and Kain (2004) and evaluated by McCaffrey, Lockwood, Koretz, and Hamilton (2003). This model was prepared for the Carnegie Corporation and measured cohorts of students, each with 3 years of test scores, to remove the effects of factors other than teachers. Much like the analytical framework examined in this dissertation, this framework takes the standardized scores from a state test and converts them into z-scores (McCaffrey, Lockwood, Koretz, & Hamilton, 2003).
Since the complex Texas accountability model studied by Rivkin, Hanushek, and Kain was a value-added-model, there was an attempt to account for non-teacher effects by differencing scores, using a linear regression model with the teacher turnover rate as the independent variable (McCaffrey, Lockwood, Koretz, & Hamilton, 2003). The model also included other school-level variables such as changes in district administration and school-fixed effects. A distinction between the value-added-model and the growth model is the attempt to determine effects by adding or subtracting student scores based on out-of-school or school-level variables.
Growth Model
According to Goldschmidt et al. (2005, p. 4), “a growth model measures progress by tracking the
achievement scores of the same students from one year
to the next.” With this model, achievement growth
over time can also be tracked at the classroom or
school level by combining the growth of students who
were present for the previous year’s test. In
addition, the growth model can define progress in
comparison to a statewide or local target (Goldschmidt
et al., 2005).
The underlying purpose of a growth model is to
rank or rate a classroom or school based on
performance change. This use of year-to-year
performance change is the characteristic that
distinguishes it from the status model. The purpose
of the status model is to rank or rate schools based
on current performance (Goldschmidt et al., 2005).
Also important, Goldschmidt et al. assert that a growth model is simultaneously suitable for program evaluation. This means that schools could test an intervention on a group of students to see the degree of achievement students were making in that particular program.
The growth model can also measure within-school
inequities in performance (Goldschmidt et al., 2005).
These inequities could stem from a variety of inconsistencies within a school. Inconsistencies at the site level could include different curriculum usage, teacher performance, student placement in classes, or even morning compared to afternoon course offerings.
level, a student must be provided with a unique ID,
and the test content should be similar from year-to-
year (Goldschmidt et al., 2005).
“The growth model provides the most concise picture of what is happening to students as they progress through a school” (Goldschmidt et al., 2005, p. 7). The picture is provided because a growth model can measure the learning growth of the same student and determine whether or not an individual student made progress. In addition, Goldschmidt et al. (2005) affirm
that the growth model can also be used to set a target for schools by comparing the amount of growth to a school or state measure. Furthermore, the growth model evaluated in this study analyzed individual performance over time, and these data can be used to track students throughout the State of California.
Growth Versus Status Model
Four categories of Tracking Growth and Status
Models (Table 4) illustrate how schools can be rated
with four possible outcomes (Goldschmidt et al.,
2005).
Table 4
Tracking Growth and Status Models

               Low Status    High Status
High Growth    Group III     Group IV
Low Growth     Group I       Group II
Group I shows a school that scored low on both types of tests, meaning that it had low achievement for the current year and did not improve from the previous year. Group IV is a school that is high achieving and still improving. Schools in Groups II and III are mixed, with divergent growth and status scores. The contrasting data of Group II and Group III shown in Table 4 exemplify the importance of utilizing both status and growth model data when analyzing school achievement for accountability purposes (Goldschmidt et al., 2005).
Growth Models Currently Being
Piloted by States
As of January 2009, 15 states had been approved to include a growth model in their AYP determination. Each of those states applied to the Department of Education and agreed to adhere to seven core principles, including proficiency by 2014. Once the states applied to the federal government, each application was evaluated by 11 peer reviewers who had to approve the proposal. The State of Delaware had to resubmit because its first submission did not contain each of those tenets (Spellings, 2005). This study analyzed the approved state growth models, and one growth model that was not approved, in Alaska, Arizona, Delaware I (not approved), Delaware II (resubmitted and approved), Florida, Michigan, North Carolina, Tennessee, and Iowa.
The State of Alaska uses a growth model only for students who do not meet the state standard for proficiency. Students who meet the state standard for proficiency fall under the status model. Each student who does not meet proficiency has 4 years, or until the 10th grade, to achieve the target of proficiency. In Alaska, a student must attain a scaled score of 300 to be considered proficient. This scaled score of 300 is consistent at each grade level and subject matter. The state labels a student who is not proficient as “on track to be proficient.” An example of how a student becomes “on track to be proficient” is in Table 5.
Table 5
On Track for Proficiency

An example of a student considered to be on track to become proficient, if they are in fourth grade or new to the state or LEA, follows:
1. A student last year in fourth grade had a score of 260.
2. (300 - 260)/4 = 10
3. If a student has 270 at the end of fifth grade, he/she is on track to become proficient.

Source: Alaska Department of Education and Early Development, 2007, p. 4.
The state of Alaska reduces the divisor by one
for each year that the student remains “on track to be
proficient” until the fourth year is reached. If the
student has not reached proficiency by scoring 300 on
the scaled metric, or did not reach proficiency using
the Alaska growth calculation, then the student is
counted as not proficient. The growth model is
adjusted for students in grade seven or higher as they
must demonstrate proficiency by tenth grade (Alaska
Department of Education and Early Development, 2007).
Arizona uses a growth model that is similar to
the Alaskan growth model. Two differences from
Alaska’s growth model are that Arizona’s students only
have 3 years to be proficient, and Arizona’s students
must be proficient by the eighth grade. In the state
of Arizona, students who are not proficient have their
scaled score subtracted from the lowest score needed
to be proficient. The result is then divided by three
to determine the level of growth the student needs to
attain to make adequate growth. Adequate growth is the label Arizona uses to describe a student whose growth toward proficiency is sufficient to satisfy AYP. If an Arizona student fails to make adequate growth, or fails to score proficient by the eighth grade, then the student is deemed as not making adequate growth, and this counts against a school’s AYP. An example provided by the Arizona Department of Education appears in Table 6.
Table 6
Arizona Department of Education Example

A student scores 469 on the 6th grade reading test in 2005. The passing score on the 8th grade test is 499. The student’s reading score must improve 15 points each year ((499 - 469)/(8 - 6) = 30/2 = 15) for her to reach proficiency by 8th grade.

Source: Horne, 2007, p. 3.
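The Alaska and Arizona calculations share the same trajectory arithmetic. The following minimal Python sketch reproduces both published examples; the function and parameter names are illustrative.

    def annual_growth_target(current_score, proficient_score, years_left):
        """Points per year a non-proficient student must gain to reach
        the proficiency cut score in the years remaining. Alaska starts
        with a divisor of 4, Arizona with 3, and the divisor shrinks as
        years pass."""
        return (proficient_score - current_score) / years_left

    # Arizona example from Table 6: (499 - 469) / (8 - 6) = 15 points per year.
    print(annual_growth_target(469, 499, 2))  # 15.0
    # Alaska example from Table 5: (300 - 260) / 4 = 10 points per year.
    print(annual_growth_target(260, 300, 4))  # 10.0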
The first Delaware growth model reviewed was the one approved by the federal government to be piloted by the state. Delaware’s growth model was closely aligned with the Hocevar (2010) growth model evaluated in this dissertation. Delaware divides its students into three scoring groups: “Well Below the Standard,” “Below the Standard,” and “Proficient.” The well-below group is divided into a lower level, 1A, and a higher level, 1B. The same distinction is made with the below-the-standard group, which is divided into a lower level, 2A, and a higher level, 2B. The proficient group is not given a number in the Delaware table. The Delaware value table is shown in Table 7.
Table 7
Delaware Department of Education Example

Year 1 Level    Year 2 Level
                Level 1A   Level 1B   Level 2A   Level 2B   Proficient
Level 1A        0          150        225        250        300
Level 1B        0          0          175        225        300
Level 2A        0          0          0          200        300
Level 2B        0          0          0          0          300
Proficient      0          0          0          0          300

Source: Delaware Department of Education, 2006, p. 13.
As evidenced by Table 7, the Delaware model only awards points for growth. These points are awarded when a student’s score grows enough to jump to a higher level.
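A minimal Python sketch of the Table 7 lookup; the level labels follow the table, and the dictionary encoding is an assumption for illustration.

    # Delaware value table (Table 7): points awarded for moving from a
    # Year 1 level (outer keys) to a Year 2 level (inner keys).
    VALUE_TABLE = {
        "1A":   {"1A": 0, "1B": 150, "2A": 225, "2B": 250, "Prof": 300},
        "1B":   {"1A": 0, "1B": 0,   "2A": 175, "2B": 225, "Prof": 300},
        "2A":   {"1A": 0, "1B": 0,   "2A": 0,   "2B": 200, "Prof": 300},
        "2B":   {"1A": 0, "1B": 0,   "2A": 0,   "2B": 0,   "Prof": 300},
        "Prof": {"1A": 0, "1B": 0,   "2A": 0,   "2B": 0,   "Prof": 300},
    }

    def growth_points(year1_level, year2_level):
        """Only upward jumps (or staying proficient) earn points."""
        return VALUE_TABLE[year1_level][year2_level]

    print(growth_points("1B", "2A"))  # 175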
Also, Delaware is replacing the AYP requirement that a given percentage of students be proficient each year with a model that allows schools and districts to meet AYP by averaging all scores to attain the proficiency rate. For instance, in 2013, 95% of all students will be required under NCLB (United States Department of Education, 2002a) to be proficient in reading. The state of Delaware is piloting a model that would allow a school or district to be counted as proficient if all of its students in all subgroups meet the average score of 285 out of 300 points, or 95% of the points required. This average would satisfy proficiency for AYP (Delaware Department of Education, 2006).
The growth model above was a resubmission for the state of Delaware. The original Delaware growth model was denied approval by the federal government for two reasons. The first was that its scoring table awarded points for going backwards. The second was that it also granted points for student scores that remained at the lowest scaled score level. The rejected model is displayed in Table 8.
Table 8
Delaware Rejected Model

Year 1 Level    Year 2 Level
                Level 1A  Level 1B  Level 2A  Level 2B  Level 3  Level 4  Level 5
Level 1A        25        125       225       250       300      300      300
Level 1B        25        75        175       225       300      300      300
Level 2A        0         25        125       200       300      300      300
Level 2B        0         0         50        125       300      300      300
Level 3         0         0         25        100       300      300      300
Level 4         0         0         0         25        300      300      300
Level 5         0         0         0         0         300      300      300

Source: Delaware Department of Education, 2006, p. 12.
The piloted Florida growth model uses a three-year growth trajectory to determine if students are making adequate progress to be counted as “on track
be proficient.” If a student is determined to be “on
track to be proficient,” then the student satisfies
AYP. However, a student in Florida’s proposed model
must be proficient within 3 years or by tenth grade,
whichever comes first. The Florida growth model is
very similar to Arizona’s growth model in that the
student’s current score is subtracted from the
proficient score and divided by three to determine the
proficiency score for the next year (Florida
Department of Education, 2006).
The Florida model emphasizes that a student
scoring farther below proficiency in the first year
will have to improve at a higher rate to be on track
to be proficient. This, however, is modified because
each year that the student is not proficient under the
status model, the growth rate is adjusted to include
the newest year of data. Florida, like all other states under NCLB, allows schools and districts to attain AYP by using the current status model or by reaching safe harbor (Florida Department of Education, 2006, p. 3).
The state of North Carolina uses a three- or four-year growth trajectory model for its non-proficient students. The growth model North Carolina chose to determine score trajectory is an equi-percentile model. The equi-percentile growth model determines equivalent performance on two state tests regardless of difficulty or scale.
The student’s score is calculated against a standard normal curve and converted into a z-score. Proficiency is established at a z-score of -1.0, which is at the 16th percentile. The gap between the student’s current-year z-score and -1.0 is divided by four to determine the growth target for the next year. For instance, if a student has a -1.8 z-score in the first year in North Carolina, the next year the student would need to attain a z-score of -1.6 to meet the state definition of adequate growth. Table 9 provides three examples of the North Carolina growth model (United States Department of Education, 2009).
Table 9
North Carolina Growth Model

                      Test 1                 Test 2
Mean (μ)              70                     75
Standard
deviation (σ)         15                     15
Proficiency goal      -1.00                  -1.00

Student A:
  Raw score           40                     50
  Z-score             (40-70)/15 = -2.00     (50-75)/15 = -1.67
Student B:
  Raw score           85                     80
  Z-score             (85-70)/15 = 1.00      (80-75)/15 = 0.33
Student C:
  Raw score           40                     45
  Z-score             (40-70)/15 = -2.00     (45-75)/15 = -2.00

Example of computation for the annual growth target of
student A: -(z1 - pg)/4 = -(-2.00 - (-1.00))/4 = 0.25.
Student A must grow 0.25 z-scale points from year 1 to
year 2. Actual growth: z2 - z1 = -1.67 - (-2.00) = 0.33.
The student gained more than 0.25 points, thus student A
met the target.

Source: United States Department of Education, 2009, p. 13.
In Table 9, student A was not proficient but is identified as meeting growth and is considered proficient for AYP. Student B is proficient, so the score is not calculated using the growth model. Student C improved the raw score, but the z-score remained the same, so the student did not make adequate growth and would count against the school or district AYP (United States Department of Education, 2009).
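A minimal Python sketch of the North Carolina calculation, reproducing the student A example from Table 9; the function names are illustrative.

    def z_score(raw, mean, sd):
        """Standardize a raw score against that test's distribution."""
        return (raw - mean) / sd

    def nc_annual_target(z1, proficiency_goal=-1.0, years=4):
        """Close the gap to the z = -1.0 proficiency cut in equal
        steps over the remaining years."""
        return (proficiency_goal - z1) / years

    z1 = z_score(40, 70, 15)        # -2.00 on test 1
    z2 = z_score(50, 75, 15)        # about -1.67 on test 2
    target = nc_annual_target(z1)   # 0.25 z-scale points per year
    print(z2 - z1 >= target)        # True: student A met the target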
Michigan’s growth model is used for grades 3-8 only, as the state does not test in grades 9 or 10. Its 11th grade high school exam is too far removed from the 8th grade to provide useful growth data. The Michigan growth model divides student test scores into 12 different categories. The scores are first separated into four groups: not proficient, partially proficient, proficient, and advanced. Within each group are three subgroups: low, mid, and high (Michigan Department of Education, 2008).
If a student’s score is not proficient, the score is analyzed for growth based on a 3-year trajectory. The trajectory is based on how many categories the student must pass through to reach the low-proficient classification. A student in the lowest classification, low and not proficient, must jump six groups over 3 years to be low proficient, an average of two groups a year. If the student jumps two groups, the student remains on trajectory to be proficient and would meet adequate growth in Michigan’s growth model. However, if a student fell back the second year and then advanced to the same level in the third year, the student would not be marked as making adequate progress because that level had already been attained (Michigan Department of Education, 2008).
The Michigan model is similar to the approved Delaware model in that it only looks at a comparison of two consecutive years. It differs from the Delaware model in that it assigns an amount of growth that the student must achieve to maintain a trajectory to proficiency. An example of the Michigan growth model is in Table 10.
40
Table 10
Michigan Growth Model

[Matrix of matched Fall 2005 (rows) by Fall 2006 (columns) MEAP Mathematics achievement counts for grades 3-7, across the 12 categories: Not Proficient, Partially Proficient, Proficient, and Advanced, each divided into Low, Mid, and High. The individual cell counts are garbled in the source text.]

Source: Michigan Department of Education, 2008, p. 11.
Within this model, Michigan also labels the different amounts of growth by color coding. Red stands for significant decline, light red for decline, yellow for no change, light green for improvement, and dark green for significant improvement. The state has identified color coding as a method to help indicate whether a particular or statewide intervention or curriculum adoption has improved test scores or hindered growth (Michigan Department of Education, 2008).
Tennessee is piloting a growth model called a projection model. The Tennessee model measures all students who are projected to be proficient within 3 years. The model assigns credit for a student who will be proficient in 3 years and does not provide credit for a student who is proficient but projected not to be proficient in 3 years. An example of a student who is proficient but not projected to remain proficient would be a student whose growth is declining over time but who scored proficient on the last test. The Tennessee model only applies to grades 3-8 and is not used at the high school level (Tennessee Department of Education, 2006).
The Tennessee projection model determines the amount of growth by using all of a student’s previous test scores and multiplying that score by a coefficient. The coefficient equates to the gain of a student receiving an average Tennessee education and is updated each year. Also, the Tennessee projection model does not necessarily grant year-over-year growth credit: credit for AYP is only achieved if a student attains proficiency or earns the amount of growth determined by the projection model (Tennessee Department of Education, 2006).
The State of Iowa proposes a growth model that allows students to achieve proficiency in 4 years and is not based on trajectory or projection. The Iowa growth model only counts the students in Iowa who are not labeled proficient under the status model. Iowa divides student test scores into three categories: high, intermediate, and low. The high and intermediate sections are labeled as proficient. The low classification is further divided into marginal and weak. The marginal group is then separated into high marginal and low marginal. The high marginal group is measured as one standard error below intermediate, the lowest proficient group (Iowa Department of Education, 2007).
Although the Iowa growth model allows students 4 consecutive years to achieve proficiency, they can only earn AYP growth in two of those 4 years. Iowa students who are counted as making growth toward proficiency, but are not yet proficient, are labeled as making adequate yearly growth, or AYG. The AYG students are then added to the AYP students to determine a school’s total AYP. The State of Iowa expects each school to have a gain of 8% to 10% in overall proficiency using the proposed growth model. An example of the growth model is shown in Table 11.
Table 11
Iowa Growth Model Example

Baseline        Year 1          Year 2          Year 3          Year 4          Decision
Weak            Low Marginal    High Marginal   Proficient      -               Counts in growth model, Yrs 1 and 2
Weak            Weak            Low Marginal    Low Marginal    High Marginal   Counts in growth model, Yrs 2 and 4
Weak            Low Marginal    Low Marginal    High Marginal   Proficient      Counts in growth model, Yrs 1 and 3
Weak            Low Marginal    High Marginal   Low Marginal    High Marginal   Counts in growth model, Yrs 1 and 2
Weak            High Marginal   Low Marginal    Proficient      -               Counts in growth model, Yr 1
Low Marginal    High Marginal   Proficient      -               -               Counts in growth model, Yr 1
High Marginal   Low Marginal    High Marginal   -               -               Does not count in growth model

Source: Iowa Department of Education, 2007, p. 8.
The growth targets in Iowa remain constant and
the targets are not reset.
Summary
There were eight types of growth models examined in the literature review. The research identified six of the models as growth and trajectory models; one was a year-over-year growth model, and the other was a projection model. With the exception of the year-over-year growth model, the other seven models allowed students 3 to 4 years to attain proficiency to make AYP. The trajectory models allowed students to be proficient by either eighth or tenth grade. Table 12 compares the growth models in all eight states.
Table 12
Comparison of Eight States

State            Model Type          Years to Prof    Grade Completion
Alaska           Growth/Trajectory   4                10
Arizona          Growth/Trajectory   3                8
Delaware         Growth              Year-to-year     10
Florida          Growth/Trajectory   3                10
North Carolina   Growth/Trajectory   3-4              -
Michigan         Growth/Trajectory   3                8
Tennessee        Projection          3                8
Iowa             Growth/Trajectory   4                8/11
CHAPTER 3
METHOD
For this study, mathematics results were used from the CST (California Standards Test), API (Academic Performance Index), and AYP (Adequate Yearly Progress) scores in the Golden Valley Unified School District across grade levels six, seven, and eight to answer the following research questions.
Research Questions
1. What is the correlation between student-level residualized-change scores and student-level CST scores, proficiency band (PB) placement scores, and proficiency (P) scores?
Research question one asked about the correlation between student-level residualized-change scores and student-level California Standards Test (CST) scores, proficiency band placement scores, and proficiency scores. This question was examined by evaluating CST test score data from all the math students in Golden Valley Unified School District
2. Do student-level growth scores for the sixth
to seventh grade transition predict performance in
eighth-grade Algebra I?
Research question two asked whether student-level growth scores can accurately predict performance on the eighth grade Algebra I CST. This question was examined using CST test score data gathered in 2009 from one cohort of math students in grades six, seven, and eight who attended Golden Valley Unified School District (GVUSD). The residualized-change score from sixth to seventh grade was correlated with the eighth grade CST scores, the eighth grade CST band performance, and the eighth grade CST proficiency rate.
3. Do sixth and seventh grade student-level CST scores predict performance on the eighth grade Algebra I CST?
The next question of this quantitative study required an analysis of whether status scores for sixth to seventh grade Mathematics California Standards Tests (CST) can predict performance on the Algebra I CST. This question correlated CST status scores, CST band performance scores, and CST proficiency performance in the sixth and seventh grades with CST eighth-grade status scores, band performance scores, and proficiency performance.
4. What is the reliability of teacher-level residualized-change scores, teacher-level CST scores, proficiency band (PB) placement scores, and proficiency (P) scores?
The reliability of the teacher-level residualized-change scores, teacher-level CST scores, proficiency band (PB) placement scores, and proficiency (P) scores was assessed. Reliability was tested to see if the test measured consistently (Salkind, 2007). In this study, the reliabilities of the teacher-level residualized-change scores, the teacher-level CST scores, proficiency band (PB) placement scores, and the proficiency (P) scores were measured. Split-half reliability via the Spearman-Brown split-half correlation was used to measure the internal consistency of measurements across random halves of each instructor’s students.
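A minimal Python sketch of that split-half procedure, assuming one list of student scores per teacher; the data are hypothetical.

    import numpy as np

    def split_half_reliability(teacher_scores, seed=0):
        """Randomly split each teacher's students into two halves,
        correlate the half means across teachers, then step the
        half-length correlation up with Spearman-Brown:
        r_sb = 2r / (1 + r)."""
        rng = np.random.default_rng(seed)
        half1, half2 = [], []
        for scores in teacher_scores:          # one list per teacher
            shuffled = rng.permutation(scores)
            mid = len(shuffled) // 2
            half1.append(shuffled[:mid].mean())
            half2.append(shuffled[mid:].mean())
        r = np.corrcoef(half1, half2)[0, 1]    # half-test correlation
        return 2 * r / (1 + r)                 # Spearman-Brown step-up

    # Hypothetical residualized-change scores for three teachers.
    teachers = [[5, -3, 8, 2], [-6, -1, -4, 0], [10, 7, 12, 9]]
    print(split_half_reliability(teachers))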
5. Are teacher-level residualized-change scores
correlated with teacher-level CST scores, API scores,
and AYP scores?
A correlation was measured from sixth to seventh
grade residualized-change scores and from seventh to
eighth grade residualized-change scores to determine
if there was a relationship between teacher-level
residualized-change scores, teacher-level CST scores,
API scores, and AYP scores. The type of correlation
that measured this relationship was the Pearson
product-moment correlation.
6. Do growth scores lead to unfair conclusions about teachers who initially have a low or high achieving group of students?
To help determine whether teachers with low or high achieving groups show a disparity in growth scores, growth charts on a Cartesian plane were constructed. The two charts plotted teacher-growth scores against the previous year’s CST scores: CST scores were plotted on the x-axis and teacher-growth scores on the y-axis. There was one chart for growth from sixth to seventh grade and one for growth from seventh to eighth grade.
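A minimal matplotlib sketch of one such chart; all plotted values are hypothetical.

    import matplotlib.pyplot as plt

    # Hypothetical teacher-level points: mean prior-year CST score (x)
    # against teacher-growth score (y).
    prior_cst = [310, 345, 290, 400, 360, 275]
    growth = [4.2, -1.5, 6.8, -0.3, 2.1, 5.5]

    plt.scatter(prior_cst, growth)
    plt.axhline(0, linewidth=0.5)  # zero-growth reference line
    plt.xlabel("Previous-year mean CST score")
    plt.ylabel("Teacher-growth score")
    plt.title("Teacher growth versus prior achievement (illustrative)")
    plt.show()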
Data Source
Two sets of mathematics CST data from Golden Valley Unified School District were used. The first set was each individual student’s mathematics CST data from sixth, seventh, and eighth grade for students who were enrolled during the 2007-2008 school year. These data were gathered from all of Golden Valley Unified School District’s middle school students who took the Mathematics CST, and the scores were grouped by individual teacher. The middle schools’ demographic data are provided in Table 13. The second set of Mathematics CST data was collected from all students who entered eighth grade in the 2008-2009 school year at these same schools.
Table 13
Golden Valley Unified School District Middle School Backgrounds (2007-08 SARC)

School   Location   Number of Students   School-wide API   Math Instructors
Arch     Valley     1,311                716               10
Wood     Rural      113 Gr. (6-8)        816               2
Dart     Valley     1,102                742               8
Rally    Valley     1,340                715               11
Hami     Rural      205 Gr. (6-8)        708               2
Snow     Rural      114 Gr. (6-8)        835               2
Ranch    Valley     1,353                652               11

School   Percent Hispanic/White/African-American   Percent EL   Free/Reduced Lunch 2008   Free/Reduced Lunch 2007
Arch     47/40/7                                   18%          72%                       68%
Wood     37/54/2                                   7%           66%                       66%
Dart     34/57/4                                   10%          56%                       51%
Rally    47/36/8                                   18%          70%                       67%
Hami     39/51/2                                   21%          72%                       61%
Snow     13/69/0                                   8%           51%                       42%
Ranch    52/29/10                                  26%          78%                       78%
Specifically, data were assembled on each of these students’ Mathematics CST scores from the previous two years. The five middle schools and two K-8 schools comprise a total of 5,425 grade 6-8 students, of whom 51.2% are male and 48.8% are female. There are 877 English Learners on the four campuses, or 16.1% of the total population. The overall racial composite for grades 6-8 is 46.3% Hispanic, 39.3% White, and 8.7% African-American. The average API among the middle schools is 714; all four middle schools and Hami K-8 failed to make their federal AYP goal in mathematics. Snow K-8 has made all of its federal and state academic testing goals and is not in program improvement.
Instrumentation
The measures that were used were the Mathematics
CST test data for grades six and seven, and the
Algebra I CST test data for grade eight. All eighth
graders in Golden Valley USD must take a mathematics
class that is Algebra I or higher.
The sixth and seventh grade Mathematics CST tests and the Algebra I CST test are multiple-choice tests that are 80 questions long; of those questions, 65 are scored. The raw score is then converted to a scaled score from 150-600. That scaled score is then sorted into five specific bandwidths: advanced, proficient, basic, below basic (BB), and far below basic (FBB). These bandwidths and their specific conversions for 2007 are displayed in Table 14. Each bandwidth is assigned a point total, with the key number being 800, which defines proficiency (California Department of Education, 2007a). The API target under 2009 AYP requirements was a 2009 Growth API of at least 650 or growth in the API of at least one point from 2008 to 2009 (California Department of Education, 2009a).
Table 15 describes the CST conversion process for 2008 and demonstrates how the varying bandwidths are determined. A noteworthy feature of this table is that the seventh grade and Algebra I test score bandwidths have been expanded at the higher levels to allow more students to attain advanced, proficient, and basic.
Table 14
Mathematics CST 2007 Conversions

                  Sixth Grade                Seventh Grade              Algebra I
Advanced          Raw 85-100%; SS 421-600    Raw 82-100%; SS 414-600    Raw 80-100%; SS 433-600
Proficient        Raw 65-84%; SS 351-417     Raw 63-81%; SS 352-413     Raw 58-79%; SS 352-432
Basic             Raw 46-64%; SS 302-350     Raw 45-62%; SS 302-351     Raw 43-57%; SS 303-351
Below Basic       Raw 29-45%; SS 254-301     Raw 29-44%; SS 258-301     Raw 29-42%; SS 254-302
Far Below Basic   Raw 0-28%; SS 150-253      Raw 0-28%; SS 150-257      Raw 0-28%; SS 150-253
Note. Raw = raw percent correct; SS = scaled score.
(California Department of Education, 2007a)
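The sorting of a scaled score into a bandwidth can be illustrated with a short sketch that hard-codes the 2007 sixth grade cutoffs from Table 14 (an illustrative helper, not part of the study's analysis; the published ranges leave small gaps between bands, which the sketch assigns to the lower band):

```python
# Minimal sketch: sort a 2007 sixth grade scaled score into a CST band
# using the cutoffs in Table 14.  Scores falling in the small gaps
# between published ranges (e.g., 418-420) are assigned to the lower band.
def band_2007_grade6(scaled_score: int) -> str:
    if scaled_score >= 421:
        return "Advanced"
    if scaled_score >= 351:
        return "Proficient"
    if scaled_score >= 302:
        return "Basic"
    if scaled_score >= 254:
        return "Below Basic"
    return "Far Below Basic"

print(band_2007_grade6(355))   # Proficient
```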
Table 15
Mathematics CST 2008 Conversions

                  Sixth Grade                Seventh Grade              Algebra I
Advanced          Raw 85-100%; SS 418-600    Raw 82-100%; SS 416-600    Raw 80-100%; SS 431-600
Proficient        Raw 66-84%; SS 353-417     Raw 62-81%; SS 350-415     Raw 58-79%; SS 352-430
Basic             Raw 46-65%; SS 300-352     Raw 43-61%; SS 300-349     Raw 42-57%; SS 300-351
Below Basic       Raw 29-45%; SS 253-299     Raw 29-45%; SS 260-299     Raw 28-41%; SS 254-299
Far Below Basic   Raw 0-28%; SS 150-252      Raw 0-28%; SS 150-259      Raw 0-27%; SS 150-253
Note. Raw = raw percent correct; SS = scaled score.
Source: California Department of Education, 2008a.
The AYP requirement in the No Child Left Behind legislation (United States Department of Education, 2002a) is designed so that each state sets targets that all students must meet at a proficient rate as determined by that state. Each state was left to provide its own definition of proficiency and rate of proficiency, so long as all students test at the proficient level by 2014. Since all students are not yet at proficient levels in all states, AYP is defined by how much progress the school is making in getting all of its students to that level of proficiency by 2014 (United States Department of Education, 2002b).
In the state of California, there are four AYP
requirements that schools have to meet: Participation
Rate, Percent Proficient in English and Mathematics,
Academic Performance Index as an Additional Indicator
of Achievement, and Graduation Rate. The percent proficient requirement is the federally mandated AYP target under which all students must attain proficiency (California Department of Education, 2009b). The AYP measure is a status-model metric and does not take growth from previous years into account (Goldschmidt et al., 2005). Figure 1 illustrates the percent of students that must be proficient for a California middle or high school in a unified school district to reach its target in mathematics (California Department of Education, 2009b).
Figure 1. Percent of students required to be proficient in mathematics to meet the AYP target, by year: 2002-03, 12.80; 2003-04, 12.80; 2004-05, 23.70; 2005-06, 23.70; 2006-07, 23.70; 2007-08, 34.60; 2008-09, 45.50; 2009-2010, 56.40; 2010-2011, 67.30; 2011-2012, 78.20; 2012-2013, 89.10; 2013-2014, 100.00.
To meet the AYP requirements, the state of California chose the CST as the test that measures whether a student is proficient in Mathematics or English Language Arts. With that in mind, the research questions at the student level required CST scores, student proficiency band (PB) scores, and the overall student proficiency (P) rate. These scores were further analyzed, using the research questions as a guide, to determine the residualized-change scores of each of those measures. At the teacher level, a composite of each of these metrics was tabulated.
The state of California also uses the CST to determine whether schools meet an API target. The API is used to measure the growth of the school and is comprised of a variety of different assessments, including three different math tests that can be administered to students at the middle school level (California Department of Education, 2009a). All Golden Valley Unified School District students measured in this study took the Mathematics CST in sixth and seventh grades and took the Algebra I assessment in eighth grade.
The state does rank schools of similar demographics and requires subgroups to be measured as well (California Department of Education, 2009a). Although the API measures school growth, it does not follow individual student achievement or teacher-growth scores.

    The API is a cross-sectional look at student achievement. It does not track individual student progress across years, but rather compares snapshots of school or LEA level achievement results from one year to the next. (California Department of Education, 2009a, p. 5)

This cross-sectional look does not take individual courses into account either (California Department of Education, 2009a).
Procedure and Analysis
Student data for this study were acquired through the Achieve Data Solutions' Data Director software application provided by Golden Valley Unified School District. The CST data collected were the CST student data in each mathematics teacher's class over the years from sixth through eighth grade. The data analyzed were: CST status scores, CST proficiency band performance, CST proficiency, and residualized-change scores.
The CST status scores were the student scores from their mathematics CST performance. The proficiency band metric was examined after assigning a point value for each level the student achieved on the CST proficiency scale identified in Tables 14 and 15. A student who scored Far Below Basic was assigned a point value of zero, a score of Below Basic was assigned one point, a score of Basic was assigned two points, a proficient score was assigned three points, and an advanced score was assigned four points. Thus, if a teacher-level growth score was 1.0, the class on average attained one level higher on the proficiency scale.
The teacher-level API performance was analyzed by assigning points to student mathematics CST scores based on Table 16 (California Department of Education, 2009a).
Table 16
API Point Values by CST Band

Advanced   Proficient   Basic   Below Basic   Far Below Basic
1000       875          700     500           200
This meant that if a student scored in the Far Below Basic range on the test, that student was assigned 200 points; Below Basic, 500 points; Basic, 700 points; Proficient, 875 points; and Advanced, 1,000 points. The proficiency or AYP measure assigned one point for scoring proficient or advanced and zero points for scoring basic, below basic, or far below basic.
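For illustration, the three scoring schemes just described (proficiency band points, API points, and the AYP proficiency indicator) can be sketched as follows (hypothetical class; not the study's data):

```python
# Minimal sketch of the three scoring schemes: PB points (0-4),
# API points (Table 16), and the AYP proficiency indicator (0/1).
PB_POINTS = {"FBB": 0, "BB": 1, "Basic": 2, "Proficient": 3, "Advanced": 4}
API_POINTS = {"FBB": 200, "BB": 500, "Basic": 700,
              "Proficient": 875, "Advanced": 1000}

def ayp_point(band: str) -> int:
    # One point for proficient or advanced, zero otherwise.
    return 1 if band in ("Proficient", "Advanced") else 0

bands = ["Basic", "Proficient", "BB", "Advanced"]   # hypothetical class
pb_mean = sum(PB_POINTS[b] for b in bands) / len(bands)
api_mean = sum(API_POINTS[b] for b in bands) / len(bands)
prof_rate = sum(ayp_point(b) for b in bands) / len(bands)
print(pb_mean, api_mean, prof_rate)   # 2.5 768.75 0.5
```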
The residualized-change scores were determined by regressing each year's scores on the prior year's scores, which adjusts the year-over-year change for differences in starting points. A positive change score meant the student's score rose more than predicted from one year to the next, a negative change score indicated a decline relative to prediction, and a zero indicated neither positive nor negative growth.
Correlations were performed with the six research questions as a guide. All correlations were measured using the Pearson product-moment formula. Reliability of these scores was measured using the Spearman-Brown split-half formula. Finally, two charts were constructed on a Cartesian plane, using CST scores and residualized-change scores on the x- and y-axes, to determine whether growth scores were unfair to teachers who initially had either low- or high-achieving students.
Algebraic Explanation of
Residualized-Change Scores
Using x as the independent variable and y as the dependent variable, a regression line can be fit to the data. A residual is the error, or vertical distance, between a point and the regression line, and the standard least-squares procedure chooses the line that minimizes the sum of the squares of all these errors.
For a given value of x, the predicted value y' is determined by the formula y' = bx + a, where b is the slope of the regression line and a is the vertical intercept. To find the residualized-change score for a value, we simply subtract the predicted value y' from the observed value y: y - y' = residual. If the point lies above the predicted regression line, the value or change score is positive; if it lies below the predicted regression line, the value or change score is negative.
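For illustration, this computation can be sketched in a few lines of Python (hypothetical pretest and posttest scores; the study's analyses were conducted in SPSS):

```python
# Minimal sketch of residualized-change scores: regress posttest (y)
# on pretest (x), then take residual = y - y' with y' = b*x + a.
import numpy as np

pretest = np.array([310.0, 345.0, 402.0, 289.0, 360.0])    # hypothetical
posttest = np.array([305.0, 352.0, 420.0, 300.0, 349.0])   # hypothetical

b, a = np.polyfit(pretest, posttest, 1)   # least-squares slope, intercept
predicted = b * pretest + a
residual_change = posttest - predicted    # positive = above the line
print(np.round(residual_change, 2))
```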
CHAPTER 4
RESULTS
In order to determine what trends may emerge from
using a growth model to analyze sixth to eighth grade
mathematics test scores in Golden Valley Unified
School District, many different scores were compared.
These scores included student and teacher-growth
scores and student and teacher status scores. These
scores were drawn from the CST scores for data sets
from 2007, 2008, and 2009. To discover if a
relationship existed between student and teacher-
growth scores and student and teacher status scores,
several correlational analyses were conducted.
For the first three research questions at the
student-level, residualized-change scores, CST scores,
proficiency band placement scores, and proficiency
scores were correlated. The residualized-change score is a score that measures the change in a student's score from one year to the next and adjusts for differences in the pretest scores. The student-level CST scores were scaled scores on the mathematics portion of the CST.
The proficiency band (PB) scores were the levels at which students fell in relation to the five CST bands of advanced, proficient, basic, below basic, and far below basic, with each band assigned a numerical value: advanced equals five, proficient equals four, basic equals three, below basic equals two, and far below basic equals one. The proficiency score measured whether a student scored proficient or above on the CST.
Regarding research questions 4, 5, and 6,
teacher-level change scores were determined by using a
composite score of the students assigned to each
teacher. As discussed in Chapter 3, CST scores,
proficiency band (PB) placement scores (i.e., API),
and proficiency (P) scores (i.e., AYP) for the
students were used for the teacher composite scores
(Table 17).
Student Level Descriptive Findings
Table 17
Descriptive Statistics

Measure                                       Mean       Std. Deviation   N
CST 6                                         346.26     55.725           956
CST 7                                         343.68     57.180           1095
CST 8                                         315.72     57.727           1261
CST 6B                                        2.3243     .95527           956
CST 7B                                        2.2603     .97446           1095
CST 8B                                        1.6677     1.02112          1261
CST 6P                                        .4289      .49517           956
CST 7P                                        .4100      .49207           1095
CST 8P                                        .2213      .41526           1261
Standardized Residual, 6th-7th Grade Change   .0000000   .99947410        952
Standardized Residual, 7th-8th Grade Change   .0000000   .99954286        1095
Table 17 displays the mean, standard deviation,
and number of students included in the study.
The means of the CST status scores are below the roughly 350-point threshold for proficiency, and the mean falls by about 28 points from the seventh grade to the eighth grade. The bandwidths also drop dramatically from seventh to eighth grade, as the average goes from between basic and proficient to between below basic and basic. The percent of students attaining proficiency on the Mathematics CST drops even more significantly, from 41% in seventh grade to just over 22% on the eighth grade Algebra I CST.
Residualized-change scores were estimated by
regressing the seventh grade CST scores on the sixth
grade CST scores (Table 18). The beta of the
standardized coefficient was .820. The correlation of
grade six and grade seven scores was substantial,
positive, and statistically significant, t(950) =
44.135, p = .001 (Table 18).
Table 18
Coefficients: Grade Six to Grade Seven

Model        B        Std. Error   Beta   t        Sig.
(Constant)   49.061   6.783               7.233    .000
CST 6        .854     .019         .820   44.135   .000
Dependent variable: CST 7

Table 19
Coefficients: Grade Seven to Grade Eight

Model        B        Std. Error   Beta   t        Sig.
(Constant)   64.120   7.494               8.557    .000
CST 7        .738     .022         .720   34.311   .000
Dependent variable: CST 8
Residualized-change scores were also estimated by
regressing the eighth grade Algebra I CST scores on
the seventh grade Mathematics CST scores. The
correlation was .720. The correlation of grade seven
to grade eight scores was substantial, positive, and
statistically significant, t(1,093) = 34.311, p =
.001.
Figure 2. Standardized Residualized Change (6→7)
Figure 2 shows the distribution of CST mathematics residualized-change scores from sixth to seventh grade. The skewness is near zero, and the kurtosis indicates a curve that peaks near zero. There were 952 students in the sample. Refer to Appendix A for more detailed statistics.
Figure 3. Standardized Residualized Change (7→8)
Figure 3, which displays the distribution of CST
mathematics residualized change scores from the
seventh grade Mathematics CST to the eighth grade
Algebra I CST, is slightly skewed to the left. This
skew to the left is because more students performed
worse on the Algebra I CST than expected given scores
on the seventh grade Mathematics CST. Detailed data
for the distribution in Figures 2 and 3 are found in
Appendix A. Appendix B shows the inter-correlations
of all variables. These correlations are discussed in
the next section.
Correlational Findings
Answer to Research Question 1
1. What is the correlation between student-level
residualized change (RC) scores and student-level CST
scores, proficiency band (PB) placement scores, and
proficiency (P) scores?
Table 20 shows correlations of student-level RC scores with student-level CST scores, PB scores, and P scores. Table 21 shows the correlations of the residualized-change scores with the eighth grade CST measures.
The first row of Table 20 displays the degree of correlation between status scores and the standardized residual scores from sixth to seventh grade. The correlations are .573, .548, and .432 for the CST scores, PB scores, and P scores, respectively. As expected, the degree of correlation decreases as the measure moves from a continuous CST score to a CST proficiency band (one of five categories) and then to CST proficiency status.

The third row of Table 20 shows the degree of correlation between status scores and the standardized residual from seventh to eighth grade. The correlations are .000, -.030, and -.025 for the CST scores, PB scores, and P status scores, respectively. These correlations are zero or near zero because a zero correlation between seventh-to-eighth grade growth and the seventh grade CST was pre-specified.
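This pre-specification reflects a mathematical property of least-squares residuals, which can be verified with a small check (hypothetical numbers, not the study's data):

```python
# Least-squares residuals are uncorrelated with the predictor by
# construction; a hypothetical pretest/posttest pair illustrates this.
import numpy as np

x = np.array([310.0, 345.0, 402.0, 289.0, 360.0])   # hypothetical pretest
y = np.array([305.0, 352.0, 420.0, 300.0, 349.0])   # hypothetical posttest

b, a = np.polyfit(x, y, 1)
residuals = y - (b * x + a)
print(np.corrcoef(x, residuals)[0, 1])   # ~0, up to floating-point error
```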
Table 20
Residual-Change Correlations with CST Scores, CST Bands, and CST Proficiency

                                      6-7       7-8
                                      Change    Change    CST6    CST7    CST6B   CST7B   CST6P   CST7P
Standardized Residual, 6th-7th Grade Change
  Pearson Correlation                 1.000     -.124     .000    .573    .000    .548    -.007   .432
  Sig. (2-tailed)                     .000      .000      1.000   .000    .996    .000    .818    .000
Standardized Residual, 7th-8th Grade Change
  Pearson Correlation                 -.124     1.000     .092    .000    .076    -.030   .057    -.025
  Sig. (2-tailed)                     .000      .000      .004    1.000   .019    .328    .079    .415
Table 21
Residual-Change Correlations of Change Scores with the Eighth Grade CST

                                      CST8   CST8B   CST8P
Standardized Residual, 6th-7th Grade Growth
  Pearson Correlation                 .336   .314    .235
  Sig. (2-tailed)                     .000   .000    .000
Standardized Residual, 7th-8th Grade Growth
  Pearson Correlation                 .694   .640    .529
  Sig. (2-tailed)                     .000   .000    .000
Answer to Research Question 2
2. Do student-level residualized-change scores
for the sixth to seventh grade transition predict
performance on the eighth grade Algebra I CST test?
In Table 21 the correlations of Algebra I CST
with the residualized-change score from sixth to
seventh grade and from seventh to eighth grade are
displayed. All are statistically significant. The
correlation of sixth to seventh grade residual change
with the eighth grade CST was .336, the correlation of
residual-change scores with the eighth grade CST
bandwidth was .314 and the correlation to eighth grade
Algebra I CST proficiency was .235. All correlations
were statistically significant (p<.05), but these
validity correlations would be considered low to
moderate.
Although not related to research question two, it is interesting to note that the correlations are higher for residual-change scores and the status scores from seventh to eighth grade, but this is due to the autocorrelation caused by using the same eighth grade Algebra I CST result in both measurements.
Residual-change scores had a .694 correlation with the
eighth grade CST score, .640 correlation with the
eighth grade CST bandwidth, and a .529 with the eighth
grade CST proficiency level. All correlations were
statistically significant (p=.001).
Answer to Research Question 3
3. Do sixth and seventh grade student-level CST
scores predict performance on the eighth grade Algebra
I CST test?
Table 22 provides the results of the correlations
of the eighth grade Algebra I CST with the results of
sixth and seventh grade Mathematics CST scores. The
results yielded high positive correlations that were
above .500 for each measurement with the exception of
the proficiency rates from sixth grade to eighth grade
(.414) and from seventh to eighth grade (.468). A
correlation greater than .50 is a large correlation
(Cohen, 1988). Results suggest that status scores in
sixth or seventh grade mathematics relate to status
scores on the Algebra I CST in the eighth grade.
Teacher-Level Research Questions
Research questions four through six required
teacher-level residual-change scores, teacher-level
CST scores, teacher-level API scores, teacher-level
proficiency band (PB) scores, and teacher-level
proficiency (P) scores. The following section of
statistics pertains to those three questions.
In the teacher-level descriptive statistics in
Table 23, growth scores were measured from grades six
to seven and from seven to eight.
Table 22
Correlations with Eighth Grade Algebra I

                              CST8    CST8B   CST8P
CST6    Pearson Correlation   .666    .629    .553
        Sig. (2-tailed)       .000    .000    .000
        N                     956     956     956
CST7    Pearson Correlation   .720    .688    .575
        Sig. (2-tailed)       .000    .000    .000
        N                     1095    1095    1095
CST6B   Pearson Correlation   .616    .596    .502
        Sig. (2-tailed)       .000    .000    .000
        N                     956     956     956
CST7B   Pearson Correlation   .654    .641    .510
        Sig. (2-tailed)       .000    .000    .000
        N                     1095    1095    1095
CST6P   Pearson Correlation   .514    .495    .414
        Sig. (2-tailed)       .000    .000    .000
        N                     956     956     956
CST7P   Pearson Correlation   .552    .543    .468
        Sig. (2-tailed)       .000    .000    .000
        N                     1095    1095    1095
Table 23
Teacher-Level Descriptive Statistics

         Min      Max      Mean       SE of Mean   SD          Skewness (SE)   Kurtosis (SE)
GROWTH   -.67     1.42     -.0017     .06804       .43567      1.356 (.369)    2.370 (.724)
CST      260.70   447.29   328.0774   5.84447      37.42289    1.190 (.369)    2.518 (.724)
API      381.48   948.53   664.9948   18.64210     119.36768   .198 (.369)     .576 (.724)
BAND     .63      3.61     1.9335     .09829       .62938      .665 (.369)     1.047 (.724)
PROF     .00      1.00     .3019      .03661       .23440      1.128 (.369)    1.195 (.724)
n = 41
The growth score range was 2.09 meaning that the
difference in average student growth from the highest
performing class (1.42) to the lowest performing class
(-.67) was slightly over two standard deviations.
The CST averages ranged from slightly over 260
points to just under 450 points and the CST band
performance had a range of just under three of the
bands. The API measured from a low of 381.48 to a
high of 948.53. This would suggest that teacher-level
scores had classrooms ranging from students who were
far below basic and below basic to a classroom where
almost every student scored in the advanced
proficiency band. The proficiency band measure showed
a similar difference with a low of .63 meaning that
the average students were far below basic to below
basic and a high of 3.61 where the average students
scored proficient or advanced. The proficiency rates
were spread further as classrooms ranged from having
zero students score proficient to all students scoring
proficient or above.
Figure 4 describes the distribution of the teacher-growth scores. The growth scores are positively skewed; thus, their distribution is not normal.
Figure 4. Distribution of Teacher-Growth Scores
The skewness is 1.356 (SE = .369) and the kurtosis is 2.370 (SE = .724). In this distribution, there are 3 teachers achieving high amounts of growth and 25 teachers seeing declines in student CST growth.
Answer to Research Question 4
4. What is the reliability of teacher-level residualized-change scores, teacher-level CST scores, teacher-level API scores, and teacher-level AYP scores?
Reliability was estimated using the Spearman-Brown split-half coefficient. The formula for the Spearman-Brown split-half coefficient is r_sb = 2r_xy / (1 + r_xy), where r_xy represents the correlation between the two halves, in this case half of the students for each teacher. The reliability was then subtracted from 1 to obtain the percent of error for each measurement. Reliability and percent of error were estimated for teacher-level residualized-change scores, teacher-level CST scores, proficiency band placement scores (API), and proficiency scores (AYP).
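For illustration, the split-half procedure can be sketched as follows (hypothetical teachers and scores; the study's reliabilities were computed from the actual rosters):

```python
# Minimal sketch of split-half reliability: split each teacher's
# students into random halves, correlate the half composites across
# teachers (r_xy), then apply Spearman-Brown: r_sb = 2*r_xy / (1 + r_xy).
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical rosters: eight teachers, 28 student scores each.
rosters = [rng.normal(320 + 10 * i, 55, size=28) for i in range(8)]

half_a, half_b = [], []
for scores in rosters:
    rng.shuffle(scores)
    mid = len(scores) // 2
    half_a.append(scores[:mid].mean())
    half_b.append(scores[mid:].mean())

r_xy = np.corrcoef(half_a, half_b)[0, 1]
r_sb = 2 * r_xy / (1 + r_xy)
print(f"r_sb = {r_sb:.3f}, percent of error = {1 - r_sb:.1%}")
```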
The sample size of teachers from sixth to seventh grade was 21, and the sample size from seventh to eighth grade was 20. Table 24 displays the reliability and percent of error for each measurement in the sample of teachers who taught Pre-Algebra, and Table 25 displays the same for the sample of teachers who taught Algebra I.
Table 24
Reliabilities: Seventh Grade Teachers (Sixth to Seventh Grade)

                                    Reliability (r_sb)   Percent of error (1 - r_sb)
Residual-change score               .822                 17.8%
Teacher-level CST score             .905                 9.5%
Proficiency band placement (API)    .879                 12.1%
Proficiency score (AYP)             .859                 14.1%
n = 21
Answer to Research Question 5
5. Are teacher-level residualized-change scores
correlated with teacher-level CST scores, API scores,
and AYP scores?
The correlation between teacher-level residualized-change scores and teacher-level CST scores, API scores, and AYP scores was computed using the Pearson product-moment formula.
Table 25
Reliabilities: Algebra I Teachers (Seventh to Eighth Grade)

                                    Reliability (r_sb)   Percent of error (1 - r_sb)
Residual-change score               .943                 5.7%
Teacher-level CST score             .959                 4.1%
Proficiency band placement (API)    .953                 4.7%
Proficiency score (AYP)             .901                 9.9%
n = 20
The results are shown in Table 26 for the sixth
to seventh grade measurements and in Table 27 for the
seventh to eighth grade measurements. The sample size
from sixth to seventh grade was 21 instructors and the
sample size from seventh to eighth grade was 20
instructors. The correlation coefficients show strong
to very strong relationships with the sizes ranging
from .698 to .922.
Table 26
Correlations: Status Scores to Change Scores, Sixth to Seventh Grade

                          Change Score
Teacher-level CST Score   .826
API Score                 .784
AYP Score                 .753

Table 27
Correlations: Status Scores to Change Scores, Seventh to Eighth Grade

                          Change Score
Teacher-level CST Score   .922
API Score                 .910
AYP Score                 .820
Answer to Research Question 6
6. Do growth scores provide unfair conclusions against teachers who initially have a low- or high-achieving group of students?
There are two figures in this section that display measured teacher-growth scores in reference to the pretest CST scores in Golden Valley Unified School District. Figure 5 shows teacher-class growth from sixth to seventh grade, using the CST 6 score on the x-axis and the residualized change from sixth to seventh grade on the y-axis. Figure 5 displays that seven of the nine teachers who had a positive residualized-change score were also above the district average on the CST 6 test. Further, 8 of the 12 who had negative residualized-change scores were below the district average on the CST 6 test. Thus, growth scores appear to be greater for teachers who begin with higher-achieving students. Notably, however, the teacher in the upper left quadrant of the figure had the highest amount of growth but began with the lowest-achieving class.
Figure 6 displays teacher-class growth from
seventh to eighth grade using the CST 7 score on the
x-axis and the residual growth from seventh to eighth
grade on the y-axis.
Figure 5. Teacher Growth (6→7) by CST 6
There is little evidence that there is bias in
the seventh to eighth grade test. However, the
teacher in the upper right hand quadrant of Figure 6
had the highest growth and began with the highest
group of students. This outlier may suggest that the
group was a beneficiary of tracking.
Figure 6. Teacher Growth (7→8) by CST 7
CHAPTER 5
SUMMARY, IMPLICATIONS, AND DISCUSSION
Summary
With the inception of the No Child Left Behind
(NCLB) legislation (United States Department of
Education, 2002a), states have been mandated under
federal law to have all students test at proficient or
better on state tests. The State of California has
determined student proficiency by using an assessment
tool known as the Standardized Testing and Reporting
Program (STAR) of which one of the components is the
California Standards Test (CST). Each year that the
CST is administered, students and schools are assigned
points based upon a state formula that calculates how
their students did during the past school year. The results of the CST scores are evaluated using a status model, which means that the state does not recognize student CST score growth or decline, only the current CST score for the year being evaluated (Goldschmidt et al., 2005). Therefore, schools and teachers who are either making progress on the CST or are seeing test scores diminish may be unfairly compared to their peer schools or instructors based upon students' prior knowledge.
Furthermore, the No Child Left Behind Act of 2001
dictates that all students must test at the proficient
level by 2014 in the areas of English Language Arts
and Mathematics regardless of background. The state
of California has additionally required that all
students must pass Algebra I to qualify for a high
school diploma. To meet this requirement, the state
of California has added the requirement that all
students must begin to take Algebra I in the eighth
grade. This mandate by the State of California has
been the toughest for state students to overcome as
they pursue graduation as demonstrated by proficiency
scores on the Algebra I CST (Table 28).
There are also scholars who have found that as
many as 28% of students are low achievers in Algebra I
and are missing many of the skills necessary to
succeed in an Algebra I class.
Table 28
Percentage of Students Below Proficient on Algebra 1
CST
Source: The Center for the Future of Teaching and
Learning (2008, July)
    Based on analysis of math scores on the National Assessment of Educational Progress (NAEP), the Brookings report contends that, as a result of a misguided national push, as many as 28% of such algebra students are low achievers, lacking prerequisite skills in arithmetic. It further contends that this unpreparedness harms those students and that their presence may weaken the instructional opportunities of highly proficient students. (Burris, 2008, p. 1)
This places even more pressure on students to
perform well in the sixth and seventh grade. In
addition, many experts believe that part of the
solution to create success in Algebra I is to better
prepare students in Pre-Algebra. The president of the
National Council of Teachers of Mathematics states,
“part of a basic formula for success is to provide
sixth and seventh graders with a solid grounding in
pre-algebra concepts” (Seeley, 2006, p. 2).
With the insufficiency of the status model and
the shortcoming of the Algebra I proficiency rates as
a backdrop, a line of inquiry has been taken that
seeks to determine if growth models using the current
STAR program data would yield better information for
schools and educators. This study was designed to
evaluate growth at the Pre-Algebra level and find out
if that led to growth at the Algebra I level. In other words, does success at Pre-Algebra lead to success in Algebra I? Using that framework, the
following six research questions guided this study:
1. What is the correlation between student-level residualized-change scores and student-level CST scores, proficiency band (PB) placement scores, and proficiency (P) scores?
2. Do student-level growth scores for the sixth
to seventh grade transition predict performance in
eighth-grade Algebra I?
3. Do sixth and seventh grade student level CST
scores predict performance on the eighth grade Algebra
I CST Test?
4. What is the reliability of teacher-level residualized-change scores, teacher-level CST scores, proficiency band (PB) placement scores, and proficiency (P) scores?
5. Are teacher-level residualized-change scores
correlated with teacher-level CST scores, API scores,
and AYP scores?
6. Do growth scores provide unfair conclusions against teachers who initially have a low- or high-achieving group of students?
Student data utilized for this study were
obtained through the Golden Valley Unified School
District’s (GVUSD) Achieve Data Solutions’ Data
Director account. Specifically, student CST data in
mathematics for 2007, 2008, and 2009 were collected.
The sample included scores from 1,261 students and 37
teachers.
Growth-model data consisting of each student’s
grade six (2007), grade seven (2008), and Algebra
I/grade eight (2009) individual test scores were
collected. Of the original 1,261 students who had
mathematical test data in the sixth grade, 956 of
these students had mathematical test data in the
seventh and eighth grades. Using this longitudinal
data, student growth scores were calculated in each
one of the 37 instructors’ classrooms. Four of the
instructors taught a combination of two of the grade
levels providing 41 classrooms of data. California
Standards Test (CST) growth scores were tabulated from
sixth to seventh grade and from seventh to eighth
grade using the group of 956 students.
Data analyses were conducted through the
Statistical Package for the Social Sciences (SPSS).
The independent variable in determining growth from
sixth to seventh grade was the sixth grade CST test
score and the dependent variable was the seventh grade
CST test score. The independent variable in
determining growth from the seventh grade to the
eighth grade/Algebra I was the seventh grade test
score, and the eighth grade/Algebra I CST test score
was the dependent variable.
Implications
Student-Level Residualized-
Change Scores
Research Question #1. Data analyses for the first
research question [what is the correlation between
student-level residual-change scores and student level
California Standards Test (CST) scores, proficiency
band (PB) placement scores, and proficiency (P)
scores] revealed that there was a relationship between
student-level residual-change scores at the sixth to
seventh grade level and CST scores, CST PB scores and
CST P scores. However, that relationship declined
from CST scores to CST PB scores and CST P scores.
From seventh to eighth grade student-level residual-
change scores, there was also a relationship
identified in CST scores, CST PB scores, and CST P
scores. This relationship also was weakest at the CST
PB level and strongest at the CST score level.
Because there is a degree of autocorrelation between a residual-change score and the posttest measurement, the correlations between the residual-change scores and the status scores are as expected. The decline in the degree of correlation between residual-change scores and CST scores, CST PB scores, and CST P scores is also as expected, since information is lost when continuous scores (CST scores) are reduced to categorical scores (CST PB) and then to dichotomous scores (CST P). Although Linn (2000), Abedi (2004), and numerous measurement experts issued early cautions against the use of simple proficiency rate scores for accountability purposes as required in NCLB, the policy has remained unchanged.
Research Question #2. When examining the second research question (do student-level growth scores for the sixth to seventh grade transition predict performance in eighth grade Algebra I), the data indicated that there was a small correlation between sixth to seventh grade growth on the seventh grade Mathematics CST and the eighth grade Algebra I CST. The correlation decreased from the CST 8 test score (.336) to the CST PB (.314), and finally to the CST P (.235). These results suggest that while change scores do predict Algebra I performance, the validity correlations are small.
Small validity correlations were expected for two
reasons. The first was that the preponderance of low
scores on the Algebra I test caused a restriction of
range problem, and correlations are lessened by skewed
distribution and restriction of range (Shadish, Cook,
& Campbell, 2002). The second reason for the small
correlation was that at the individual level change
scores were unreliable. Residualized-change scores
are almost always unreliable because two scores are
used to compute a change score and they both have
error (Shadish et al., 2002). The implication is that
teachers and administrators should not use individual
student-change scores for any purpose and that
researchers should proceed with caution when
individual-level change scores are required.
Research Question #3. In examination of research question three (do sixth and seventh grade student-level CST scores predict performance on the eighth grade Algebra I CST test), the data indicate that there is a large correlation between status scores on the CST in sixth and seventh grades and the eighth grade Algebra I CST. The data also indicate that the relationship is large when the CST sixth and seventh grade performance band (PB) is compared with the eighth grade Algebra I CST PB, and that the connection is moderate when the CST sixth and seventh grade proficiency (P) is compared with the eighth grade Algebra I CST P. When compared with the validity coefficients of the residualized-change scores immediately above (Research Question 2), individual prior test performance in grade 6 is the better predictor. Finally, individual test performance at grade 7 is the best predictor of eighth-grade algebra performance.
In examining the results for research question
three, it is obvious that seventh grade CST student
status scores are the best predictor of performance on
the Algebra I CST test. In addition, the data suggest
that seventh grade CST status proficiency band
placement also has a large correlation with Algebra I
scores. This important test information at the
seventh-grade level should be used to help educators
provide remedial support for students who are at-risk
for failure in Algebra I at the eighth-grade level.
Teacher-Level Residualized-
Change Scores
Research Question #4. The data analyses for the fourth research question [what is the reliability of teacher-level residual-change scores, teacher-level CST scores, proficiency band (PB) placement scores, and proficiency (P) scores] indicated that the four teacher-level scores were reliably measured. The split-half analysis conducted suggests that the teacher-level residualized-change scores, the teacher-level CST scores, CST PB scores, and CST P scores were reliable enough even to make high-stakes decisions. The results did, however, show greater reliability in the growth from seventh to eighth grade than from sixth to seventh grade.
The results from research question four indicate
that the residualized-change scores are reliable at
the teacher-level even when the student-level change
scores were likely very unreliable. Teacher-level
scores are more reliable because many student scores
are summed to get a composite measurement which
reduces the error, whereas with student-level scores,
there is only one score measured. The implications of
this finding are significant. Stakeholders can make
reliable decisions using teacher residualized-change
scores.
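The error-reduction from aggregation can be illustrated with the standard error of a class mean, which shrinks with the square root of the number of students (a sketch with a hypothetical student-level standard deviation):

```python
# Minimal sketch: the standard error of a class mean is sd / sqrt(n),
# so composites over many students carry far less error than a single
# student's score.  The student-level SD below is hypothetical.
import math

student_sd = 55.0   # hypothetical student-level SD of CST scaled scores
for n in (1, 10, 30):
    print(f"n = {n:2d}: standard error = {student_sd / math.sqrt(n):.1f}")
```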
Research Question #5. In examination of research
question five (are teacher-level residualized-change
scores correlated with teacher-level CST scores, API
scores, and AYP scores) the data indicate that
teacher-level residual-change scores have a large
correlation with CST status score performance. This
correlation was largest for the change from seventh to
eighth grade but still very large for the change from
sixth to seventh grade. These results suggest that
the current-year instructor has a significant and
measurable impact on the CST scores of their students
whether progress is measured by status or growth
scores.
Goldschmidt et al. (2005) suggest that the correlation between aggregate growth scores and aggregate status scores could be low.
the case, educators would face a dilemma. Which
scores are most usable? The present results suggest
that both types of scores could be used for
accountability purposes.
In light of current teacher practice, the results
are also noteworthy. These data imply that if a
teacher can improve the CST test score of her or his
students by targeting the next higher proficiency
band, then the improvement will be reflected in both
teacher-level growth and teacher-level status scores.
Research Question #6. The final research question (do growth scores provide unfair conclusions against teachers who initially have a low- or high-achieving group of students) was aimed at determining whether teacher-growth scores were unfair depending on whether a teacher had a low- or high-performing group of students.
What the grade 6-7 data indicated was that most of the classroom growth came from students who initially performed well on the CST. These data also revealed that teacher-level growth-score reductions occurred mostly in classrooms whose CST scores were low to begin with. More specifically, below-average sixth to seventh grade residualized change occurred in 9 of the 11 classes that had aggregate student mean scores below the district sixth grade CST average. In contrast, 7 of the 10 teachers who had students above the sixth grade CST average saw growth in their CST scores. This implies that if seventh grade teachers begin with weaker students, their predicted growth scores will, on average, be smaller. This result thus refutes the argument that growth-score accountability is inherently fairer at the teacher level.
In contrast, the teacher-level or classroom growth data from the seventh grade CST to the eighth grade CST indicated no relationship between initial status and growth.
Discussion
Advantages of Residualized-
Change Score Models
"A growth model explicitly connects each student's performance from one year to a subsequent year" (Goldschmidt et al., 2005, p. 20). Through residual growth measurement at the student and teacher levels, this study has shown that growth models have potential utility in our school system as a way to evaluate both teachers and schools.
The main reason why growth models are preferred
over status models for accountability purposes is
because of fairness. Millman (1997) states:
The single most frequent criticism of any attempt
to determine a teacher’s effectiveness by
measuring student learning, is that factors
beyond a teacher’s control affect the amounts
that students learn. . . . Educators want a level
playing field and do not believe such a thing is
possible. Many people would rather have their
fortunes determined by a roulette wheel, which is
invalid but fair, than by an evaluation system
that is not fair. (Millman, 1997, p. 244)
Growth models are fairer than status models for
two reasons (Hocevar, 2010). First, it makes sense to
base accountability judgments on students who have
been in a school at least 2 or more years. To use a
growth model, the same student’s score must be
measured over at least 2 years to determine their
change scores from one year to the next. With this
longitudinal data in hand, it is fairer to evaluate
school-level or teacher-level change scores.
Transiency bias, a serious accountability problem, is
solved at the middle-school level if only those
students who have been in the middle school for 3
years are included in the analysis.
The second reason growth models are fairer for
accountability applications, is that they control for
the downward bias of poverty. Since each student is
compared to themselves, there is no inherent bias
towards teachers who have a group of at-risk students
with which to begin. In fact, because at-risk
students have more room to improve, it is possible
that growth scores are unfair to teachers in low-
poverty neighborhoods.
Improvement models (Goldschmidt et al., 2005) are
often mistakenly described as school growth models.
Unfortunately, improvement models do not take into account individual student growth because they are not longitudinal (Linn, 2000). The California
accountability model and the models described in
chapter one of this dissertation illustrate the
problem. The API model in the State of California
actually sets a higher bar of growth for schools with
initially low-performing students than schools with
initially high-performing students. “For example,
. . . a school with an API of 400 had a growth target
of 20 points, while a school with an API of 600 had a
growth target of 10 points” (Hocevar, 2010, p. 2).
The problem with improvement models is still present
in California and most other states. In effect, the
status model can be detrimental to initially high-
achieving students because the bar is set too low and
the status model can be detrimental to initially low-
achieving students because the bar is set too high. A
growth model with the same amount of growth for every
student and every school would be more equitable for
all students.
In summary, a growth model is a metric that can be fair to all teachers and schools. The No Child Left Behind Act (NCLB) has created accountability for schools, but I believe both educators and the public will not back the punitive measures of NCLB unless the accountability model is seen as fair. A growth model is simply fairer than a status model: it analyzes individual students over time against each student's own baseline, not an artificial starting and ending point. As evidenced by this study, a growth model can be used to measure teacher and school performance.
Disadvantages of Residualized-
Change Scores
Although this study was established to determine
the usability of growth models, there is research in
opposition to using this type of measurement. One
view that does not support growth models is the
position that a growth model cannot be legitimate
because it fails to take into account inherent
psychometric difficulties. Psychometry is defined as,
“the branch of psychology that deals with the design,
administration, and interpretation of quantitative
tests for the measurement of psychological variables
such as intelligence, aptitude, and personality
traits” (Borg & Gall, 1983, p. 720). For many years,
psychometric experts have suggested that psychometric
problems may cause variance in the results of a simple
pretest-posttest growth score design.
The first psychometric problem that Borg and Gall
(1983) refer to is the ceiling effect. They define
the ceiling effect as the limitation on the range of
achievement on the posttest. An example of the
ceiling effect would be that a student who scored 90
on a 100 point pretest would have only 10 points of
improvement possible while a student who scored 30 on
a 100 point pretest would have 70 points of
improvement possible. This ceiling effect, according
to Borg and Gall (1983), would place an artificial
restriction on the distribution of growth scores.
The second psychometric problem with growth
scores that Borg and Gall (1983) point out is the
phenomenon of regression toward the mean. The
regression effect is when students who score high on
the pretest earn a somewhat lower score on the
posttest and those with a low score on the pretest
score a higher score on the posttest. The regression
effect occurs because the tests are unreliable. In
non-statistical terms, Borg and Gall (1983) describe
this regression effect as the operation of chance
factors. “Now it is unlikely that the student will
have the same good luck when he next takes a parallel
form of the test” (Borg & Gall, 1983, p. 721).
The third problem Borg and Gall (1983) have with growth scores is that growth scores assume equal intervals at all points of the test. For example, the growth from 90 to 100 on a 100-point test is assumed to equal that of 40 to 50 on the same 100-point test (Borg & Gall, 1983). The critique is that it is more difficult to earn the final points on a test than points in the middle of the scale, so the two gains cannot be assumed to be the same (Borg & Gall, 1983). An educator's response may be that even though it is easier to score at low levels, it still requires as much new knowledge to increase the score from the previous test.
The fourth problem with change scores that Borg
and Gall (1983) identify is the problem of the
unreliability of growth scores. Residualized-change
scores are almost always unreliable because two scores
are used to compute a change score and they both have
error (Shadish et al., 2002).
It is important to recognize that the Borg and Gall (1983) critique is of change scores based on raw difference scores, not of residualized-change scores. By using a residualized-change score growth model, this study has solved the problem of regression toward the mean, because, by virtue of the underlying mathematics, the correlation of residualized scores with pretest results is zero. However, the other psychometric problems that Borg and Gall (1983) identify (ceiling effects, low reliability, and the equal interval assumption) are not solved by residualized-change scores.
An additional negative effect of using the
residualized-change score is that many faculty members
and community stakeholders have a difficult time
understanding what a residualized-change score is.
Most people do not have the statistical background to recognize that the residualized-change score measures growth by reducing the effect of the pretest.
Ironically, eighth grade algebra standards in
California include an understanding of linear algebra
and residualized scores. Thus the problem of the
public’s misunderstanding of residualized scores is
hopefully solvable.
In addition to psychometric problems being a
disadvantage to growth models, another drawback to a
growth model measurement is the difficulty in
conducting a longitudinal study for all students. A
longitudinal study requires at least 2 years, and in
some cases 3 years, of same student data to establish
the most accurate rate of growth for a school or
district. With a transient student population, it is
complicated, if not impossible, to account for every
students’ test data over their academic career.
111
A final shortcoming of the growth model is that,
in my opinion, status models will always be needed.
Whether it is CST scores, API, or AYP scores,
stakeholders will want to know where a school or
student is in any given year. A status model is
perfect to answer that question. In this sense, a
growth model will always have to compete with the
status model. The intricacy of using two models is
that the community being served can have divergent
data which can cause a muddled interpretation of the
performance being measured.
Limitations
The first limitation in this study was that
only 956 of the 1,261 students who sat for the eighth
grade Algebra I test also sat for sixth grade and
seventh grade Mathematics tests. Most of these
students were lost to transiency, but a few others
were tracked and took Algebra I in the seventh grade
and Geometry in the eighth grade. These missing
student results could have changed the outcome of the
study. A second limitation was that students who
received an extra or second period of mathematics
classes at the sixth, seventh, or eighth grade level
were not identified for this study. These placements
were non-random and thus, teachers who had
proportionally more or fewer remedial students may
have had biased results.
An additional limitation was that this
dissertation was not shared with stakeholders at
Golden Valley Unified School District. Had this information been distributed to decision makers, they may have been able to provide conclusions that could validate the differences in teacher-level growth scores. A fourth limitation, suggested by a stakeholder, was not accounted for: "What if all the teachers were performing well in comparison to the state?" Had state-level data been used, they would have placed the teacher-change scores in a different context.
Conclusions
The purpose of this study was to determine if a
growth model would have usability at the student and
teacher level. The growth model used was the
residualized-change score model. The grade level and
subject matter chosen was middle school level
mathematics.
Residualized Change at the
Student Level
A conclusion from the research question one
analysis was that a relationship does exist between
individual student-change scores and status scores.
However, this relationship declined as information was
lost from CST scores, to performance band scores, and
even further with AYP scores (proficient/not
proficient). This result suggests that individual AYP
scores are not reliable enough for school stakeholders
to make reasonable decisions regarding individual
students.
From the research question two results, it can be
inferred that change scores from sixth to seventh
grade significantly predict eighth grade Algebra I
scores, but the correlation was too small for school
policymakers to make key decisions regarding
curriculum or individual students. In contrast, the
research question three data showed that status scores
from sixth and seventh grade are strongly correlated
with student CST performance on the eighth grade
Algebra I CST test. Taken together, these data from
research question two and three suggest that key
educational decisions about students be made using
prior year individual student status scores instead of
student-level residualized-change scores.
Teacher-Level Conclusions
A conclusion that can be made from the analysis
associated with the fourth research question is that
teacher-level residualized-change scores were reliably
measured. This suggests that high-stakes decisions can be made based on these change scores.
This means that schools could assign teachers to
students based on the academic growth need of the
students. In addition, schools could assign
professional development assistance to teachers who
are identified as struggling to attain student growth.
Finally, if a school district decided to base pay on
performance, a model such as this could be used.
The conclusion associated with research question
five is that growth and status scores are highly
correlated. The current-year instructor had a
measurable and significant impact on both aggregate
student growth and student status. With this
information, a principal may use either type of score
to assign looping of Pre-Algebra and Algebra I to a
successful teacher. In addition, a school or district
would be well served to place its highest performing
teachers in areas where the school and students need
to be held the most accountable, based on either
growth or status scores.
A conclusion based on the research question six results is that teachers in Golden Valley Unified School District whose students have low CST scores at the end of sixth grade need more support to succeed on the seventh grade CST. This problem was evidenced by the fact that greater growth was observed for teachers who began with higher-achieving students. From an accountability perspective, if this type of "bias" should generalize, it will be a challenge to advocates of the use of growth scores. A similar trend was not observed for seventh to eighth grade growth.
REFERENCES
Abedi, J. (2004, January). The No Child Left Behind Act and English language learners: Assessment and accountability issues. Educational Researcher, 33, 4-14.
Alaska Department of Education. (2007). Peer review guidance for the NCLB growth model applications, Alaska response. Retrieved July 19, 2009, from http://www.eed.state.ak.us/tls/Assessment/AKGrowthModel/May2007/AYP_Growth_Proposal050107.pdf
Ballou, D., Sanders, W., & Wright, P. (2004).
Controlling for student background in value-added
assessment of teachers. Journal of Educational
and Behavioral Statistics, 29(1), 37-65.
Berliner, D. C. (2009). Poverty and potential.
Retrieved August 1, 2009, from
http://epicpolicy.org/files/PB-Berliner-non-
school.pdf.
Borg, W.R., & Gall, M.D. (1983). Research design and
methodology experimental design. Columbus, OH:
McGraw Hill.
Braden, J. P. (2006, September). Status,
improvement, and growth models of school
accountability. Paper presented at the meeting
of the Assessing Student Growth to Foster School
Excellence. Raleigh, NC.
Burris, C. (2008). Argument against 8th grade algebra doesn't add up. Retrieved March 9, 2010, from greatlakescenter.org.
California Department of Education. (2004). API and
AYP key elements. Retrieved August 8, 2009, from
http://www.cde.ca.gov/ta/ac/ay/keyelements.asp.
California Department of Education. (2007a). API and
AYP key elements. Retrieved July 15, 2009, from
http://www.cde.ca.gov/ta/ac/ay/documents/infoguid
e08r.pdf.
California Department of Education. (2007b). CST
technical report. Retrieved August 10, 2009,
from http://www.cde.ca. gov/ta/tg/sr/documents/
csttechrpt07.pdf.
California Department of Education. (2008a). API and
AYP key elements. Retrieved August 8, 2009, from
http://www.cde.ca.gov/ta/ac/ay/keyelements.asp.
California Department of Education. (2008b). 2008 AYP
report on local educational agency list of
schools. Retrieved July 12, 2009, from
http://dq.cde.ca.gov/dataquest/AcntRpt2008/2008AY
PDst.aspx?cYear=&allCds=3367082&cChoice=AYP7a.
California Department of Education. (2008c).
California Standards Test technical report.
Retrieved July 15, 2009, from
http://www.cde.ca.gov/ta/tg/sr/
documents/csttechrpt08.pdf.
California Department of Education. (2009a, May).
2008-09 Academic Performance Index reports.
Retrieved August 1, 2009, from
http://www.cde.ca.gov/ta/ac/ap/
documents/infoguide08.pdf.
California Department of Education. (2009b, November).
Adequate Yearly Progress report information
guide. Retrieved January 17, 2010, from
http://www.cde.ca.gov/ta/ac/ay/
documents/infoguide09r.pdf.
Cavanagh, S. (2008). Catching up on algebra.
Education Week. Retrieved January 17, 2010, from http://www.edweek.org/products/spotlight/11022009/11022009SpotlightAlgebra.pdf
Cohen, J. (1988). Statistical power analysis for the
behavioral sciences (2nd ed.). New Jersey:
Lawrence Erlbaum.
Delaware Department of Education. (2006). Delaware's proposal for a growth model re-submitted to United States Department of Education. Retrieved July 19, 2009, from http://www.ade.state.az.us/azlearns/GrowthProposalArizona070702.pdf
Florida Department of Education. (2006). Florida's
proposal to pilot a growth model under NCLB.
Retrieved July 19, 2009, from
http://www.fldoe.org/news/2006/2006_02_17/Summary
OfGrowthProposal.pdf.
Geithman, B. (2009). Examining principal perceptions
and teachers school effectiveness through a value
added accountability model. Unpublished doctoral
dissertation, University of Southern California.
Goldschmidt, P., & Choi, K. (2007, April). The
practical benefits of growth models for
accountability and the limitations under NCLB.
Retrieved August 5, 2009, from
http://cse.ucla.edu/products/policy/
cresst_policy9_low.pdf.
Goldschmidt, P., Choi, K., Roschewski, W., Hebbler,
S., Blank, R., & Williams, A. (2005).
Policymakers guide to growth models for school
accountability: How do accountability models
differ? Washington, D.C.: The Council of Chief
State School Officers.
Hanushek, E. A. (1979). Conceptual and empirical
issues in the estimation of education production
functions. Journal of Human Resources, 14(3),
351-388.
Hanushek, E. A., & Raymond, M. E. (2002). Improving
educational quality: How best to evaluate our
schools? Retrieved August 1, 2009, from
http://www.bos.frb.org/economic/conf/conf47/conf4
7n.pdf.
Helfand, D. (2006). A formula for failure.
Retrieved July 10, 2009, from
http://www.latimes.com/news/education/
la-me-dropout30jan30,0,3211437.story.
Golden Valley Unified School District. (2008).
School Accountability Report Cards, 2008.
Retrieved August 1, 2009, from
http://goldenvalleyusd.k12.ca.us/edserv/
acdmc_achv/acntblty/acnt_crds.html.
Hocevar, D. (2010). Can state test data be used by
elementary school principals to make teacher-
level and grade-level instructional decisions?
Unpublished Paper.
Hocevar, D., Brown, R., & Tate, K. (2008). Leveled
assessment modeling project. Unpublished
manuscript, University of Southern California.
Horne, T. (2007). Proposal for growth model to
evaluate Adequate Yearly Progress for schools and
districts. Retrieved July 19, 2009, from
http://www.ade.state.az.
us/azlearns/GrowthProposalArizona070702.pdf.
Iowa Department of Education. (2007). No Child Left
Behind growth model pilot proposal. Retrieved
July 24, 2009, from http://iowa.gov/educate/
index.php?option=com_docman&...&gid=3817.
Kelley, C., & Finnigan, K. (2004, October).
Organizational context colors teacher expectancy.
Retrieved February 25, 2010, from
www.wcer.wisc.edu/news/coverStories/
organizational_context_colors.php.
Linn, R. L. (2000, March). Assessments and
accountability. Educational Researcher, 29(2),
4-16.
Linn, R. L. (2005). Fixing the NCLB accountability
system. Los Angeles: University of California,
National Center of Research on Evaluation,
Standards, and Student Testing.
McCaffrey, D. F., Lockwood, J. R., Koretz, D. M., &
Hamilton, L. S. (2003). Evaluating value-added
models for teacher accountability. Santa Monica,
CA: RAND Corporation.
Michigan Department of Education. (2008, May).
Michigan growth model pilot application.
Retrieved from
http://www.ed.gov/admins/lead/account/growthmodel
/mi/index.html.
Millman, J.E. (1997). Grading teachers, grading
schools: Is student achievement a valid
evaluation measure? Thousand Oaks, CA: Sage
Publications.
O'Connell , J. (2008). Algebra I success initiative.
Retrieved July 9, 2009, from
http://www.cde.ca.gov/nr/re/ht/algebrainitiative.
asp.
Rivkin, S., Hanushek, E., & Kain, J. (2004).
Disruption versus Tiebout improvement: The costs
and benefits of switching schools. Journal of
Public Economics, 88, 1721-1746.
Salkind, N. J. (2007). Statistics for people who
(think they) hate statistics. Thousand Oaks, CA:
Sage Publications.
Seeley, C. (2006, June). 8th grade algebra: Finding
a formula for success. Retrieved January 5,
2010, from http://www.districtadministration.com
/viewarticleaspx?articleid=208.
Shadish, W. R., Cook, T. D., & Campbell, D. T.
(2002). Experimental and quasi-experimental
designs for generalized causal inference. Boston,
MA: Houghton Mifflin.
Spellings, M. (2005). Secretary Spellings announces
growth model pilot. Retrieved July 12, 2009,
from http://www.ed.gov/news/pressreleases/
2005/11/11182005.html.
Spellings, M. (2009, January). Impact of the growth
model on AYP determinations. Retrieved July 19,
2009, from http://www.ed.gov/admins/lead/
account/growthmodel/gmeval0109.doc.
Stone, J. (1999). Value added assessment: An
accountability revolution. Retrieved August 10,
2009, from http://www.education-consumers.com/
articles/value_added_assessment.shtm.
Tennessee Department of Education. (2006). NCLB
Growth model pilot program. Retrieved July 24,
2009, from
http://www.tennessee.gov/education/nclb/doc/NCLB%
20GrowthModelProposal.pdf.
The Center for the Future of Teaching and Learning.
(2008, July). California’s approach to math
instruction still doesn’t add up. Retrieved
January 4, 2010, from
http://www.cftl.org/centerviews/july08.html.
Trochim, W. M. K. (2006). Research methods knowledge
base. Retrieved September 17, 2009, from
http://www.socialresearchmethods.net/kb/
index.php.
University of California Berkeley. (2001). How to
prepare students for algebra. Retrieved January
16, 2009, from
http://math.berkeley.edu/~wu/AE3.pdf.
United States Department of Education. (2002a). No
Child Left Behind Act of 2001. Retrieved July 8,
2009, from
http://www.nclb.gov/next/overview/index.html.
United States Department of Education. (2002b). Key
policy letter signed by Rod Paige July 24, 2002.
Retrieved July 10, 2009, from
http://www.ed.gov/policy/elsec/guid/secletter/020
724.html.
United States Department of Education. (2003, June).
Parents guide to No Child Left Behind. Retrieved
July 8, 2009, from
http://www.ed.gov/parents/academic/involve/
nclbguide/parentsguide.pdf.
United States Department of Education. (2009).
Evaluation of the 2005-06 growth model pilot
program. Retrieved July 21, 2009, from
http://www.ed.gov/admins/lead/
account/growthmodel/gmeval0109.doc.
Wright, S. P., Horn, S. P., & Sanders, W. L. (1997).
Teacher and classroom context effects on student
achievement: Implications for teacher
evaluation. Journal of Personnel Evaluation in
Education, 11, 57-67.
Wu, H. (2001). How to prepare students for Algebra.
Unpublished manuscript, University of California
at Berkeley.
APPENDIX A
STATISTICS
Table 29
Statistics
Variable             N     Min       Max      Mean     SE Mean  Std. Dev.  Skewness (SE)   Kurtosis (SE)
CST6                 956   202       567      346.26   1.802    55.725     .652 (.079)     .53 (.158)
CST7                 1095  200       600      343.68   1.728    57.18      .579 (.074)     .602 (.148)
CST8                 1261  187       600      315.72   1.626    57.727     .957 (.069)     1.751 (.138)
CST6B                956   0         4        2.3243   .0309    .95527     .007 (.079)     -.645 (.158)
CST7B                1095  0         4        2.2603   .02945   .97446     -.083 (.074)    -.496 (.148)
CST8B                1261  0         4        1.6677   .02876   1.02112    .198 (.069)     -.608 (.138)
CST6P                956   0         1        .4289    .01602   .49517     .288 (.079)     -1.921 (.158)
CST7P                1095  0         1        .41      .01487   .492       .366 (.074)     -1.869 (.148)
CST8P                1261  0         1        .2213    .01169   .41526     1.345 (.069)    -.192 (.138)
Std. Res. (6-7)      952   -3.93763  4.58268  0        .032393  .999474    .076 (.079)     1.004 (.158)
Std. Res. (7-8)      1095  -3.16555  5.48154  0        .030206  .999543    .661 (.074)     1.346 (.148)
Valid N (listwise)   952

Note. Std. Res. (6-7) and Std. Res. (7-8) are the
standardized residual growth scores for grades 6-7 and
7-8, respectively.
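As a consistency check on Table 29 (an illustrative
Python snippet, not part of the original analysis):
the standard errors of skewness and kurtosis in
SPSS-style output depend only on sample size, and the
large-sample approximations sqrt(6/N) and sqrt(24/N)
reproduce the (SE) values reported above for each N.

    import math

    # SE(skew) ~ sqrt(6/N) and SE(kurt) ~ sqrt(24/N)
    # reproduce the standard errors shown in Table 29.
    for n in (956, 1095, 1261):
        print(f"N={n}: SE(skew)={math.sqrt(6 / n):.3f}, "
              f"SE(kurt)={math.sqrt(24 / n):.3f}")

    # Output:
    # N=956: SE(skew)=0.079, SE(kurt)=0.158
    # N=1095: SE(skew)=0.074, SE(kurt)=0.148
    # N=1261: SE(skew)=0.069, SE(kurt)=0.138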
Appendix A, continued
Table 30
Descriptive Statistics
Variable             Mean       Std. Deviation   N
Std. Res. (6-7)      .0000000   .99947410        952
Std. Res. (7-8)      .0000000   .99954286        1095
CST8                 315.72     57.727           1261
CST8B                1.6677     1.02112          1261
APPENDIX B
CORRELATIONS
Table 31
       CST6   CST7   CST8   CST6B  CST7B  CST8B  CST6P  CST7P  CST8P  SRes1  SRes2
CST6   1      .82    .666   .934   .759   .629   .792   .662   .553   0      .092
CST7          1      .72    .766   .937   .688   .645   .79    .575   .573   0
CST8                 1      .616   .654   .939   .514   .552   .777   .336   .694
CST6B                       1      .732   .596   .855   .643   .502   0      .076
CST7B                              1      .641   .625   .841   .51    .548   -.03
CST8B                                     1      .495   .543   .774   .314   .64
CST6P                                            1      .626   .414   -.007  .057
CST7P                                                   1      .468   .432   -.025
CST8P                                                          1      .235   .529
SRes1                                                                 1      -.124
SRes2                                                                        1

Note. Entries are Pearson correlations (upper triangle
shown). SRes1 and SRes2 are the standardized residual
growth scores for grades 6-7 and 7-8, respectively.
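One pattern in Table 31 is worth a note: the 0 entries
in the SRes1 and SRes2 columns are not coincidences. An
ordinary least squares residual is uncorrelated with
its own predictor by construction, so SRes1 (the grade
6-7 residual) necessarily correlates 0 with CST6, and
SRes2 (the grade 7-8 residual) correlates 0 with CST7.
A minimal simulation of the property (illustrative
Python; the numbers are simulated, not study data):

    import numpy as np

    rng = np.random.default_rng(0)
    pre = rng.normal(350, 55, size=1000)   # simulated pretest
    post = 50 + 0.85 * pre + rng.normal(0, 25, size=1000)

    # Residualized-change scores: OLS residuals of post on pre.
    slope, intercept = np.polyfit(pre, post, 1)
    resid = post - (intercept + slope * pre)

    # Correlation of the residual with its own pretest is zero
    # (to floating-point precision) by least squares construction.
    print(np.corrcoef(pre, resid)[0, 1])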