ASSESSING AND ADDRESSING RANDOM AND SYSTEMATIC MEASUREMENT
ERROR IN PERFORMANCE INDICATORS OF INSTITUTIONAL
EFFECTIVENESS IN THE COMMUNITY COLLEGE
by
Robert J. Pacheco
______________________________________________________________________
A Dissertation Presented to the
FACULTY OF THE USC ROSSIER SCHOOL OF EDUCATION
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF EDUCATION
December 2012
Copyright 2012 Robert J. Pacheco
DEDICATION
For my mother and father, two highly reliable predictor variables in the
attainment of the successful outcomes of my life. Special dedication to Kenneth Meehan,
Ph.D., third member of my dissertation committee, who passed away between the time of
the defense and final approval of the dissertation.
ACKNOWLEDGMENTS
I would like to acknowledge my dissertation committee members, Dr. Richard
Brown, Dr. Robert Keim, the late Dr. Kenneth Meehan, and my chairperson, Dr. Dennis
Hocevar, who provided me unceasing support and guidance in this accomplishment. In
addition, I would like to thank the staff of the Research, Analysis and Accountability Unit
(RAA) of the California Community College Chancellor’s Office, especially Willard
Hom, Dr. Alice Van Ommeren, and Vice Chancellor Patrick Perry, for the camaraderie
they showed me from start to finish during this endeavor.
TABLE OF CONTENTS
DEDICATION
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER 1: INTRODUCTION
    Background of the Problem
    Problem Identification
    Purpose of the Study
    Theoretical Framework
        Presence of Random Error
        Presence of Systematic Error
    Statistical Analysis and Techniques
    Importance of the Study
CHAPTER 2: REVIEW OF THE LITERATURE
    History of Institutional Effectiveness in Higher Education
    Emerging Paradigms
        Maturation of the Construct of Institutional Effectiveness
        Emergence of Performance Indicators
        From Compliance to Outcomes Analysis
    California’s Response
    Basis for the Study
CHAPTER 3: RESEARCH METHODOLOGY
    Population and Sample
    Data Sources
    Instrumentation
    Predictor and Criterion Variables
        Levels of Measurement
        Predictor Variables (Input Measures)
        Performance Indicators (Output Measures)
    Research Questions
    Theoretical Framework for the Methodology
    Procedures
        Research Question 1
        Research Question 2
        Research Question 3
        Research Question 4
CHAPTER 4: RESULTS
    Predictor Variables
        Descriptive Analysis
        Correlation Analysis
    ARCC Outcome Measures
        Descriptive Analysis
        Correlation Analysis
    Predictor Variables with ARCC Outcome Measures
        Descriptive Analysis
        Summary
    Statistical and Multivariate Correlation Analysis
        Research Question 1
        Summary
        Research Question 2
            SPAR
            Persistence
            Thirty-Unit Completion
        Summary
        Research Question 3
            SPAR
            Persistence
            Thirty-Unit Completion
        Summary
        Research Question 4
            Persistence as a Predictor of SPAR
            Persistence as a Predictor of Thirty-Unit Completion
        Summary
CHAPTER 5: DISCUSSION
    Institutional Effectiveness
    Summary of the Findings
        Temporal Stability
        Systematic Error
        Persistence as a Tipping Point
    Implications
        Theory or Generalization
        Higher Education Practice
        Future Research
    Limitations
    Conclusion
REFERENCES
APPENDIX A: MULTIPLE REGRESSION OF SPAR
APPENDIX B: MULTIPLE REGRESSION OF PERSISTENCE
APPENDIX C: MULTIPLE REGRESSION OF THIRTY-UNITS
APPENDIX D: TOP TEN OVER- AND UNDER-PERFORMING INSTITUTIONS SPAR
APPENDIX E: TOP TEN OVER- AND UNDER-PERFORMING INSTITUTIONS PERSISTENCE
APPENDIX F: TOP TEN OVER- AND UNDER-PERFORMING INSTITUTIONS THIRTY-UNITS
LIST OF TABLES

Table
1. Descriptive Statistics of Predictor Variables for California Community Colleges Indicators
2. Linear Correlation Coefficients for Predictor Variables for California Community College Indicators
3. Descriptive Statistics of ARCC Performance Indicators
4. Linear Correlation Coefficients for Performance Indicators (Outputs) for California Community Colleges
5. Linear Correlation Coefficients for Predictor Variables with ARCC Performance
6. Test-Retest Reliability for Student Progress and Achievement Rate
7. Test-Retest Reliability for Persistence Rate
8. Test-Retest Reliability for Basic Skills Success Rate
9. Test-Retest Reliability for Basic Skills Improvement Rate
10. Test-Retest Reliability for Vocational Success Rate
11. Inter-Item Consistency By Year Reporting ARCC
12. Inter-Item Consistency Year-to-Year
13. Multiple Regression of Student Progress and Achievement Rate (SPAR) Four-Year Reporting Period
14. Multiple Regression of Student Persistence Rate Four-Year Reporting Period
15. Multiple Regression of Thirty-Unit Completion Rate Four-Year Reporting Period
16. Test-Retest Reliability of SPAR Residuals Over Four-Year Reporting Period
17. Descriptive Statistics for Standardized Residuals SPAR
18. Top Ten Over- and Under-Performing Institutions SPAR Four-Year Reporting Period
19. Test-Retest Reliability of Persistence Residuals Over Four-Year Reporting Period
20. Descriptive Statistics for Standardized Residuals for Persistence
21. Top Ten Over- and Under-Performing Institutions Persistence Four-Year Reporting Period
22. Test-Retest Reliability of Thirty-Unit Residuals Over Four-Year Reporting Period
23. Test-Retest Reliability of Thirty-Unit Completion Residuals Over Four-Year Reporting Period
24. Top Ten Over- and Under-Performing Institutions Thirty-Unit Completion Over Four-Year Reporting Period
25. Linear Regression of SPAR 2002-2008 on Persistence 2002-2003
26. Linear Regression of SPAR 2003-2009 on Persistence 2003-2009
27. Linear Regression of Thirty-Unit Completion 2002-2008 on Persistence 2002-2003
28. Linear Regression of Thirty-Unit Completion 2003-2009 on Persistence 2003-2004

LIST OF FIGURES

Figure
1. Sample distribution of residual SPAR year 4
2. Sample distribution of standardized residual for Persistence year 4 reporting period
3. Sample distribution curve for standardized residual for Thirty-Unit Completion year 4
4. BGD of Persistence 2002-2003 with SPAR 2002-2008
5. BGD of Persistence 2002-03 with Thirty-Unit Completion 2002-2008
6. Linear regression of SPAR for 2002-2008 on Persistence for 2002-2003
ABSTRACT
Over the past quarter-century, there has been an increased call for colleges and
universities to better demonstrate their institutional effectiveness. This quantitative study
assessed the existence and degree of random and systematic measurement error contained
in performance metrics used under an accountability system for community colleges to
determine the stability, consistency and validity of the indicators to measure institutional
effectiveness. A binary graphic display was also employed to visually represent the
impact of improving results on intermediate outcomes (momentum points), such as student
persistence, on subsequent success on later, terminal outcomes of student goal
achievement. Test-retest reliability and internal consistency results for the performance
indicators were strong, suggesting that the measures are stable and internally consistent
and that random measurement error was minimal for the measures. Multivariate
correlation analysis revealed that the performance measures did contain systematic error
due to socioeconomic factors irrelevant to the construct of institutional effectiveness.
Residual analyses were conducted to identify over- and underperformance after controlling
for the presence of systematic error, to more accurately understand how and why institutional
differences arise, and to assess how colleges that perform less well can be helped to achieve
better results.
CHAPTER 1
INTRODUCTION
The purpose of this study was to determine the presence and degree of
measurement error, both random and systematic, in indicators of institutional
effectiveness that are part of an established system of community college accountability.
To establish a more accurate and equitable standard of expected or predicted
achievement, statistical techniques and analytical methods were utilized to increase the
replicability and fit of the measures. The goal of the analysis was to better understand
why differences in institutional achievement arise and to determine how those
colleges that perform less well can improve (Thorndike, 1963). Finally, the
methodology employed in the study provides a model for accrediting commissions and
governmental agencies to make more accurate and informed policy decisions based on
performance results.
Background of the Problem
Over the past quarter-century, there has been an increased call for colleges and
universities to better demonstrate their institutional effectiveness (Ewell, 2011). The call
for accountability in higher education has intensified during the past decade as state and
federal governments and accrediting commissions have augmented their requirements for
institutional performance measurement and demanded greater transparency about the added
value that a college education provides (Ewell, 2008b; Pascarella, 2006). Gone are the
days when colleges and universities were allowed simply to self-monitor their internal
policies and practices when assessing institutional performance (Burke, 2004; Ewell,
2008b). In the new, more global century, measuring and reporting institutional
effectiveness have become national higher education priorities.
Prior to World War II, governmental oversight of postsecondary institutions was
limited to auditing expenditures and reviewing overall policy implementation (Burke,
2004), and assuring academic quality was the exclusive domain of the accrediting bodies
through a system of self-review and improvement. The post-World War II era witnessed
a dramatic expansion of higher education and the federal government’s role as a primary
financier of post-secondary study. As a result, regional accrediting agencies, which were
once purely academic bodies, have emerged as the primary vehicles through which the
federal government conducts policy oversight (Ewell, 2008b). While higher education
“accountability” might once have meant only an examination of the fiscal integrity of the
institution, the term now applies to results in the form of student success and achievement
milestones (Folger, 1977).
The manner and method of measuring institutional effectiveness have become
quantitative (Ewell, 2008b). The evolution of database technology and the increased
horsepower of statistical software have dramatically increased the construction of, access
to, and use of student success metrics. Student achievement data are now embedded in
the accreditation process and have become recognized proxies for institutional quality
(Goben, 2007).
Typically, performance metrics take the form of rates for educational milestone
completion, achievement threshold attainment, and academic benchmark clearance (Hasson &
Meehan, 2010). While most of the performance indicators concern success on terminal
educational outcomes, an emerging trend is to review intermediate (“momentum”)
milestones that act as tipping points to predict later success on student exit outcomes
(Leinbach & Jenkins, 2008).
All levels of postsecondary education utilize performance metrics to identify
institutional effectiveness. In 1994, the American Association of Community Colleges
(AACC), a pioneer in this emerging field of institutional effectiveness analysis, created a
set of core indicators to measure this effectiveness (Ewell, 1993). The efforts of the
AACC resulted in the Voluntary Framework of Accountability (VFA), a pilot system of
40 two-year colleges that implement a set of identified measures of institutional
effectiveness to define college success (AACC, 2012). At the baccalaureate level, the
American Association of State Colleges and Universities (AASCU), the Association of
Public and Land Grant Universities (APLU), and the National Association of
Independent Colleges and Universities (NAICU) have jointly established the Voluntary
System of Accountability (VSA), using a “college portrait” with a common set of data
elements for the purpose of institutional comparison.
The federal government now mandates that colleges disclose success data to
prospective students and their families (U.S. Department of Education, 2002). Higher
education policy centers, such as the National Center for Public Policy in Higher
Education and the Institute for Higher Education Leadership and Policy, also use college-
level student success metrics as the basis for policy recommendations to state and local
governments and to the academy itself. Many state governments have followed suit and
utilize college-level student achievement measures as part of accountability systems for
institutional reporting, performance funding, and performance budgeting (Burke, 2004).
Given the high-stakes decisions, based on performance results, that are being
made at the institutional, federal, and state levels, two overarching theoretical
considerations emerge. First, how is the theoretical construct of institutional
effectiveness operationalized by the accrediting commissions, policy centers, and state
and federal governments? Second, do the established performance metrics consistently
and stably map onto the construct of institutional effectiveness? Confidence in, and
the usability of, performance indicators to inform meaningful decision making rest on the
stability of the metrics and the extent to which factors irrelevant to the construct of
institutional effectiveness are present in the observed results, which affects the precision
of the measures.
Problem Identification
As a theoretical construct, “institutional effectiveness” has developed a high
degree of abstraction due to the divergent uses of the term by a wide range of
stakeholders, each with its own particular perspective on organizational performance
(Ewell, 2011; Head, 2011). Measuring institutional effectiveness has proven particularly
vexing for two-year institutions that attempt to implement consistent processes to sustain
quality improvement based on the multi-faceted missions of community colleges (Hasson
& Meehan, 2010). The term “institutional effectiveness” is sometimes used
interchangeably with “academic excellence” and “academic quality” or even with
assessment and evaluation (Hasson & Meehan, 2010; Head, 2011).
There is an inherent allure to using quantitative metrics, based on their efficiency
and numerical impact, to (a) self-assess performance and (b) benchmark improvement
with actual or aspirational peers. Yet, the object of all research methodology is to reduce
theoretical constructs into observable events that, in turn, can be measured and to
generate theories about the concept that can be tested and falsified so that knowledge can
be acquired (Crano & Brewer, 2004; Kuhn, 1996; Popper, 1959). Thus, the
operationalization of institutional effectiveness must successfully translate the concept to
an agreed-upon set of events that can be empirically observed. The more replicable an
instrument is and the more consistently it produces similar results over time, the better
the operationalization (Crano & Brewer, 2004). In sum, the more stable and consistent
the measures are over time, the more the measure overlaps with the construct and the
better the measurement of the idea or the better the “fit.” The desire to determine the
gulf between institutional effectiveness (the construct) and the performance indicators
(the instrument) provides the impetus for the study and forms the theoretical basis of the
research questions that guide the study.
Purpose of the Study
The purpose of the study, thus, was twofold. First, the researcher assessed the
existence and extent of measurement error contained in performance metrics to determine
the stability and validity of the measures. Second, the researcher employed statistical and
analytical techniques to establish a more accurate standard of predicted performance to
afford a more meaningful definition of institutional over- and underperformance. By
controlling for factors irrelevant to the construct of institutional effectiveness, the
researcher can make more meaningful comparisons among institutions and consider
factors that better predict performance. The ultimate goals of the study were to better
understand why differences in institutional achievement occur and to determine the
factors that affect college performance so that institutions can improve (Thorndike,
1963).
The system of accountability examined in the study was California’s
Accountability Reporting for Community Colleges (ARCC). In 2004, California enacted
the ARCC system to establish more rigorous accountability measures for two-year
institutional performance (Postsecondary Education Accountability, California Education
Code §84754.5 et seq.). Under the law that created ARCC, the Board of Governors of the
California Community Colleges (a) determined a set of educational priorities; (b) created
a workable framework of accountability; and (c) identified relevant indicators to measure
performance (Postsecondary Education Accountability, California Education Code
§84754.5 et seq.). The ARCC established two levels of performance indicators:
statewide and individual community college outcomes.
At the commencement of this study, the ARCC accountability system contained
six college-level performance metrics for which data could systematically be collected
and assessed, each tied to an established mission function of the community college
system. Specifically, the ARCC indicators include:
1. Student Progress and Achievement Rate (SPAR);
2. Percentage of Students Who Earned at Least 30 Units (Thirty-Unit Completion);
3. Persistence Rate (Persistence);
4. Annual Successful Course Completion Rate for Credit-Based Vocational Courses
(Vocational Success Rate);
5. Annual Successful Course Completion Rate for Credit-Based Basic Skills
Courses (Basic Skills Success Rate); and
6. Improvement Rates for Credit-Based Basic Skills Courses (Basic Skills
Improvement Rate).
The ARCC system is a statewide method of assessing institutional effectiveness
through the use of performance indicators as the evidentiary basis for institutions to (a)
make improvements to processes, policies, and procedures that affect student success and
(b) provide performance or scorecard data for the state to set higher education priorities
and inform budgetary and resource allocation decisions. Importantly, the ARCC
indicators include both terminal student outcomes and momentum point measures,
opening the opportunity to examine the degree to which performance on one measure
predicts performance on another, subsequently achieved measure. The ARCC reporting
system also was selected for the study due to its solid reputation in California and across
the nation and the number of years that the reporting system data could be accessed for
analysis.
Based on the problem identification and purpose of the study, four research
questions, which guided this study, were developed:
1. What is the degree of random error present in the ARCC performance metrics
(SPAR, Thirty-Unit Completion, Persistence, Basic Skills Success Rate, Basic
Skills Improvement Rate and Vocational Success Rate)? What is the temporal
stability of the performance indicators as measured by the test-retest reliability of
the metrics? Are the performance indicators internally consistent?
2. Are the ARCC performance indicators, SPAR, Persistence, and Thirty-Unit
Completion, subject to systematic measurement error? If so, what are the
confounding factors that impair assessment of institutional effectiveness by
consistently and artificially inflating or deflating scores among colleges?
3. If the ARCC performance indicators, SPAR, Persistence, and Thirty-Unit
Completion, are subject to systematic error, is there a viable method to control for
the confounding factors to make more meaningful comparisons among colleges
about institutional effectiveness?
4. Is Persistence, as an intermediate milestone, a tipping or momentum point
indicator of the terminal student outcome of SPAR? Is Persistence a tipping or
momentum point indicator of the Thirty-Unit Completion intermediate milestone?
Theoretical Framework
The study was grounded in classical measurement theory (CMT) and used a
quantitative research design. CMT posits that the observed score on any single
measurement is a random draw from a distribution of possible scores for a respondent on
the instrument that measures a construct of interest (Brennan, 2006). By adapting the
CMT framework to research methodology, an observed score is the sum of the
respondent’s true score plus error, represented by the modified formula:
O = T + ∑ e (r + s)
where O equals the observed score of a college on the performance indicator, T
corresponds to the institution’s true score on the metric and ∑ e (r + s) represents the sum
of error, which can be either random or systematic (Crano & Brewer, 2004).
All tests and instruments are imperfect measures of constructs, and the degree to
which the instrument misses the mark constitutes error, which can be random or
systematic (Crano & Brewer, 2004). The larger the error component in the observed
results, the more open to chance the outcomes are and the less reliable the conclusions
that can be drawn from the results. Error in the form of poor instrument design or
misalignment with the construct impairs the ability to attribute variation in the observed
scores to the presence or absence of the construct under consideration.
Presence of Random Error
The true-score component of an observed ARCC indicator is the stable
characteristic of institutional effectiveness of each college as measured by the metric.
Theoretically, no change in the true-score component of an ARCC performance indicator
should occur unless some real change has taken place in the college’s performance. The
degree of observed deviation in the scores from reporting period to reporting period is a
function of the random error contained in the ARCC as a measurement instrument.
Conversely, the degree of similarity between indicator rates from year to year reveals the
metric’s test-retest reliability. As a result, the degree of association represented by
Pearson’s correlation coefficient was computed in the form of test-retest reliability to
measure the “true score” variation in the outcomes measure. The greater the coefficient,
the less the proportional effect of extraneous factors in the observed score differences in
the indicators.
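To make this computation concrete, the following minimal sketch shows how a test-retest reliability coefficient for a single indicator across two adjacent reporting years could be obtained; the colleges, values, and column names are hypothetical illustrations rather than the actual ARCC data or the study's code.

```python
# Illustrative sketch only: test-retest reliability of one ARCC indicator
# across two adjacent reporting years. Colleges, values, and column names
# are hypothetical, not the actual ARCC extract.
import pandas as pd

spar = pd.DataFrame({
    "college":   ["A", "B", "C", "D", "E"],
    "spar_2007": [0.52, 0.47, 0.61, 0.39, 0.55],
    "spar_2008": [0.54, 0.45, 0.63, 0.41, 0.53],
})

# Pearson's r between adjacent reporting periods serves as the test-retest
# reliability coefficient: the closer to 1.0, the smaller the proportional
# contribution of random error to year-to-year differences in the indicator.
r = spar["spar_2007"].corr(spar["spar_2008"])  # Pearson is the default method
print(f"Test-retest reliability, 2007 vs. 2008: {r:.3f}")
```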
In addition to the temporal stability analysis, the internal consistency of the
ARCC indicators was determined by computing an average of all of the possible ways of
splitting “items” using Cronbach’s coefficient alpha. If all the ARCC performance
indicators point to the construct of institutional effectiveness, they must do so with
sufficient consistency.
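A minimal sketch of the coefficient alpha computation follows, treating each ARCC indicator as an "item" and each college as a "respondent"; the indicators shown, their values, and the choice to z-score the rates before computing alpha are illustrative assumptions, not the study's actual procedure.

```python
# Illustrative sketch only: Cronbach's coefficient alpha with each ARCC
# indicator treated as an "item" and each college as a "respondent."
# The indicator columns and values are hypothetical placeholders.
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

indicators = pd.DataFrame({      # one column per indicator, one row per college
    "spar":        [0.52, 0.47, 0.61, 0.39, 0.55],
    "persistence": [0.68, 0.63, 0.74, 0.58, 0.70],
    "thirty_unit": [0.71, 0.66, 0.78, 0.60, 0.73],
})

# Because the rates sit on different scales, alpha is computed here on
# z-scored columns (a standardized alpha); this scaling is an assumption.
z_scores = (indicators - indicators.mean()) / indicators.std(ddof=1)
print(f"Cronbach's alpha: {cronbach_alpha(z_scores):.3f}")
```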
Presence of Systematic Error
The presence and extent of systematic error in the ARCC performance indicators
and the magnitude and degree of relationship among various combinations of possible
explanatory or predictor input variables (Creswell, 2008) that are irrelevant to the
measure were analyzed using multivariate correlation techniques. Traditional
socioeconomic variables, including population density, income level and educational
attainment of the community, college size, race, ethnicity, gender, and age of students,
were used to predict performance on the ARCC measures.
Any degree of explanatory overlap or multicollinearity among the
explanatory variables was controlled, which enabled the identification of the best
combination of predictor variables for the ARCC outcomes measures. The multiple
regression technique was selected for this study due to the amount of information
generated about the variables in the procedure and the versatility and depth of the
findings, including both the magnitude and statistical significance of the relationships
(Gall, Gall, & Borg, 2007).
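The sketch below illustrates the general form of such a multiple regression in Python with statsmodels; the predictor names, simulated values, sample size, and the variance inflation factor check are hypothetical stand-ins for the study's actual variables and its handling of multicollinearity.

```python
# Illustrative sketch only: regress an ARCC outcome on socioeconomic
# predictors and inspect multicollinearity. Predictor names, simulated
# values, and the sample size are hypothetical stand-ins.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 100                                   # illustrative number of colleges
data = pd.DataFrame({
    "median_income":    rng.normal(55_000, 12_000, n),
    "pct_ba_or_higher": rng.uniform(0.10, 0.50, n),
    "college_size":     rng.integers(2_000, 35_000, n).astype(float),
})
data["spar"] = (0.30 + 0.000002 * data["median_income"]
                + 0.40 * data["pct_ba_or_higher"]
                + rng.normal(0.0, 0.05, n))

X = sm.add_constant(data[["median_income", "pct_ba_or_higher", "college_size"]])
model = sm.OLS(data["spar"], X).fit()
print(model.summary())                    # magnitude and significance of effects

# Variance inflation factors flag explanatory overlap among the predictors.
vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
)
print(vif)
```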
Statistical Analysis and Techniques
The total variation in the institutions after the irrelevant factors were controlled
for was determined by computing the difference between the sum of the observed scores
and the mean value of the ARCC measure. Explained variation was derived by
computing the difference between the predicted value and the mean value of the ARCC
indicator for each institution. Unexplained variation was determined by computing the
difference between the observed score and the predicted value on the ARCC metric. The
error, or residual, was standardized to show relative standing among colleges. Over- and
underperforming institutions were identified to make more meaningful comparisons
among institutions and to benchmark peers.
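Continuing from the regression sketch above (the hypothetical "data" and "model" objects), the variation partition and residual standardization described here could be computed as follows, with the variation components expressed as sums of squared deviations; this is an illustrative sketch, not the study's code.

```python
# Illustrative sketch only, continuing from the regression sketch above
# ("data" and "model"): partition of variation and standardized residuals.
observed = data["spar"]
predicted = model.fittedvalues
grand_mean = observed.mean()

total_variation       = ((observed - grand_mean) ** 2).sum()   # total
explained_variation   = ((predicted - grand_mean) ** 2).sum()  # explained
unexplained_variation = ((observed - predicted) ** 2).sum()    # residual

# Standardized residuals show each college's relative standing once the
# construct-irrelevant factors have been controlled for.
residuals = observed - predicted
standardized = (residuals - residuals.mean()) / residuals.std(ddof=1)

# The largest positive values are candidate over-performers; the largest
# negative values are candidate under-performers relative to prediction.
print(standardized.sort_values(ascending=False).head(10))
print(standardized.sort_values().head(10))
```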
Importance of the Study
Four theoretical assumptions provide the foundation for the importance of the
study. First, the performance metrics should be related, stable, and consistent if they are
correctly mapped to institutional effectiveness (Brennan, 2006). Second, a causal model
of analysis of the student success indicators can help explain institutional performance
and provide a basis for improvement (Goldschmidt & Hocevar, 2004). Third, the
removal of confounding factors increases the degree of fit between the measure and the
construct (Brennan, 2006; Creswell, 2008). Finally, benchmarking effective peers is a
viable way for an organization to improve its own performance level (Goldschmidt &
Hocevar, 2004; Tucker, 1996).
The importance of the study, then, lies in the attempt to fill the gap in existing
knowledge of institutional effectiveness performance measures by introducing recognized
and generally accepted psychometric operations to assess and address the presence of
random and systematic error. Additionally, the study attempted to generate new
understanding about the use of performance indicators to determine the factors that affect
positive educational outcomes. Importantly, the research design adopted a post-
positivistic, empirical orientation to inquiry with an identified theoretical construct
redefined and translated into observable measures and analyzed using a series of
generally accepted and objective research operations (Crano & Brewer, 2002; Creswell,
2009).
The model through which this study was conducted and the results presented are
based on the framework established under the auspices of the National Academy of
Sciences (NAS). In particular, the National Research Council (NRC) has established six
guiding principles for scientific inquiry in education (Shavelson & Towne, 2003). First,
the inquiry should pose significant questions that seek to fill the gap in existing
knowledge and help generate new understanding to pursue the causes or factors that
affect educational outcomes. The results must be empirically adequate and thus be
testable and refutable by professional peers. Second, the proposed research should link to
existing theoretical frameworks and build upon and even refine existing understanding.
Third, appropriate inquiry methods should be designed and employed to answer the
unique research questions posed. Fourth, the conclusions drawn from the data gathered
should follow a rational, consistent line of reasoning. The inferences made from the
results and observations must be critically reviewed in light of the inherent limitations of
the study, such as error, bias, or the internal and external challenges to validity and
reliability. Moreover, the study should consider and eliminate the possible counter-
explanations in a clear and convincing manner. Fifth, the research results should include
a cogent explanation of the findings to provide a means for replication and generalization
of findings across other studies. Finally, the results should be presented to the
educational profession to promote examination of the conclusions and critique of the
methods of investigation (Shavelson & Towne, 2003).
In this regard, the present study was conducted using the NRC framework for
scientific inquiry (Shavelson & Towne, 2003). The research used existing theoretical
frameworks and paradigms to help develop a better understanding of the construct of
institutional effectiveness and its operationalization as well as to inform practice in higher
education accountability. Results and observations were critically reviewed in light of
the inherent limitations of the study, and all scientific assumptions made and logical
conclusions drawn from the analysis of performance indicators are inherently testable and
falsifiable (Popper, 1959). Finally, the results will be presented to the educational
profession to promote examination of the conclusions and critique of the methods of
investigation and to expand the use of the results to improve student success and
achievement. The dissertation will be published and stored at the University of Southern
California library to permit public review and professional scrutiny by peers. In addition,
the findings of the study will be presented at either a professional symposium or research
conference and submitted to peer-reviewed journals for publication (Shavelson & Towne,
2003).
This dissertation sets forth in detail the steps of scientific inquiry used in the
study. The background of the problem, purpose of the study, and the research questions
were identified in this chapter. The existing theoretical models and body of knowledge
on the topic, which form the basis and rationale for the study, are set forth in Chapter 2,
Review of the Literature. The inquiry methods used to answer the research questions
posed are discussed in Chapter 3, Research Methodology. A summary and review of the
data are included in Chapter 4, Findings and Results. The conclusions drawn from an
examination of the results, the implications of the findings, the limitations of the study,
and recommendations for further study are set forth in detail in Chapter 5, Summary and
Conclusions.
CHAPTER 2
REVIEW OF THE LITERATURE
Over the past quarter-century, policymakers have put increased demands on
postsecondary institutions to better demonstrate their commitment to academic quality
(Ewell, 2011). As state and federal governments and accrediting commissions demand
greater disclosure from institutions about the value of the college experience in the
emerging economy and increase their requirements for demonstrating institutional
effectiveness, this call for accountability in higher education has intensified over the past
ten years (Ewell, 2008b). As a result, colleges and universities are no longer permitted
to self-monitor their internal policies and practices in regard to the assessment of
institutional performance (Burke, 2001; Ewell, 2008b). At the turn of the new century,
measuring and reporting higher education institutional effectiveness have become
national concerns.
History of Institutional Effectiveness in Higher Education
For nearly 250 years, colleges and universities in the United States were regarded
as autonomous institutions and, thus, allowed to independently review their own policies
and practices in regard to assessing institutional effectiveness (Ewell, 2008b). Unlike
many European nations, the United States has no ministry of education that sets the
curriculum or establishes the standards for approval of institutions. In the former
regulatory model in the United States, member institutions or individual programs
established their own standards for quality assurance, but with the guiding hand of the
federal government as the major financier of postsecondary study. Midway through the
last century, however, regional accrediting agencies emerged as the primary vehicles
through which federal oversight of institutional effectiveness was achieved (Ewell,
2008a).
In the second half of the 20th century, there was a dramatic expansion of
postsecondary study and college enrollments as veterans and the baby boomers saw a
college education as the pathway to prosperity in the post-World War II United States.
Importantly, the federal government continued to finance attendance in higher education
study, either through grants or student loans. With the use of greater federal tax dollars to
pay for higher education, the call for accountability has intensified, and colleges and
universities are being asked to steward resources and be publicly accountable for the use
of funds (Burke, 2004).
With the shift of accountability focus from institutional fiscal responsibility and
integrity to student achievements and outcomes, greater attention is being paid to the
concepts of student success and student development (Folger, 1977; Pascarella, 2006).
Thus, institutional effectiveness is now a critical component of the accreditation process
of institutional self-review and improvement (Goben, 2007).
The backdrop for the expanded focus on institutional effectiveness in higher
education has been the parallel push for greater accountability and transparency at all
educational levels. The landmark report A Nation at Risk (National Commission on
Excellence in Education, 1983) was a catalyst for the change in the systematic review of
elementary, middle, and high schools and culminated in the creation of legislation such as
No Child Left Behind (U.S. Department of Education, 2002). While focused on the K-12
system, the report’s call for public accountability mirrored nascent appeals for greater
scrutiny in the higher education sector, such as the Student Right to Know and Campus
Security Act of 1990 (Public Law No. 101-542, 20 U.S.C.A. sec. 1092(f)(7)), which
mandates that colleges that receive federal student aid compile and report graduation
rates. The law also provided a precedent for state and local governments that were
contemplating the establishment of more elaborate reporting systems. In this regard, the
higher education setting poses unique problems for accountability not found in K-12
venues.
The legislative changes also foreshadowed the creation of the Secretary’s
Commission on the Future of Higher Education, commonly known as the Spellings
Commission, which produced A Test of Leadership: Charting the Future of U.S. Higher
Education (Miller, 2006). Miller recommended, inter alia, a more complete disclosure of
student success and achievement data and the creation of a national database that contains
institutional statistics that can be used to assist families and individuals to make informed
decisions when matriculating to higher education institutions. The Commission,
however, is not without its critics, and some educational professionals are beginning to
question the location of the balance point on the accountability fulcrum between
unmonitored self-evaluation and governmental intrusiveness (Head, 2011).
Specifically, seven trends drive the increased call for heightened review of
academic quality assurance by accrediting commissions and the federal and state
governments (Ewell, 2008a). First, there is a growing alignment of national and societal
interests with traditionally understood elements of higher education academic quality
(e.g., National Center for Public Policy and Higher Education [NCPPHE], 2006).
Second, policymakers and the public at large are demanding that higher education
institutions account for the value that they add to student academic, social, and affective
development (Pascarella, 2006). Third, fiscal pressures and constraints will likely
continue to plague postsecondary institutions over the foreseeable future due to a
predicted stagnant economy (Ewell, 2008b). Fourth, changes in the demographic makeup
of student populations will lead to an increased expectation of equitable educational
outcomes for traditionally underrepresented groups. Fifth, evolving student enrollment
behavior, including reverse transfer, stop out, and multi-institutional attendance, is
altering established concepts of what constitutes the college degree pathway (Ewell,
2008b). Sixth, the expansion of distance education and other alternative learning
environments has fundamentally challenged the traditional classroom and face-to-face
teaching environments as the most viable means to learn. Finally, the increased
globalization of higher education poses new questions about how well-equipped U.S.
institutions are to compete in an ever-expanding international environment (European
Consortium for Accreditation, 2007).
Emerging Paradigms
Three converging movements provide the background for this study: (a) the
maturation of the concept of institutional effectiveness (Ewell, 2011); (b) the focus on
quantitative measures in the form of aggregate student outputs to operationalize the
construct (Alfred, 2011); and (c) the transformation of higher education accountability
from a compliance-driven system to an outcomes-based model (Burke, 2004). The point
of intersection of these sometimes disparate movements has caused a significant
paradigm shift for the field of higher education accountability and has set the stage for a
new, emerging era of educational evaluation (Kuhn, 1996).
Maturation of the Construct of Institutional Effectiveness
Although academic quality has been a major concern of the academy, the term
“institutional effectiveness” did not enter into the lexicon of higher education
accountability until 1984, when the Southern Association of Colleges and Schools
(SACS) introduced the term as part of its accreditation requirements (Head, 2011). The
term now is ubiquitous, but reduction of the concept to a generally accepted set of
observable events has proven daunting to policymakers and educational leaders. At the
community college level, where the institutional missions are often far-reaching and
diverse, there has been increased reliance on quantitative models of accountability in the
form of aggregated student and institutional data that reflect college milestones (Hasson
& Meehan, 2010). Student clearance of thresholds and benchmarks and completion of
terminal outcomes now establish the basis for causal inference that the college practices
produced positive results (Burke, 2004). Conversely, the failure of students to meet the
milestones is taken as a sign that colleges need to reevaluate programs and services to
improve results.
The consensus in the literature is that the greatest impetus for the philosophical
shift to open reporting and review of institutional quality has been the efforts of the
regional accrediting commissions to direct the attention of member institutions to quality
assurance as a major institutional priority (Ewell, 2011). The aim of the accrediting
commissions has not been the establishment of specific outputs or milestones for
investigation but, rather, the establishment of college-wide practices and processes that
demonstrate data-driven decision making in a continuous cycle of assessment, evaluation,
and improvement (Head, 2011). The accrediting commissions for the six higher
education regions in the United States address institutional effectiveness through
established sets of standards that aspiring institutions must meet for initial eligibility and
to which member institutions must adhere as a condition of successful reaffirmation
(Accrediting Commission for Senior Colleges and Universities, 2002; Commission on
Institutions of Higher Education, 2006; Middle States Commission on Higher Education,
2009; New England Association of Schools and Colleges, 2011; Northwest Commission
on Colleges and Universities, 2010; Southern Association of Colleges and Schools Commission
on Colleges, 2010; The Higher Learning Commission, 2003a, 2003b).
During the past decade, all accrediting agencies have, to varying degrees,
strengthened the portions of the standards related to institutional effectiveness and have,
as components of evaluation reports, begun to make formal institutional
recommendations and, in some cases, levy sanctions against member colleges for the
failure to make institutional effectiveness a sufficient priority (Head, 2011). Despite the
numerous incarnations that institutional effectiveness has taken in various regions of
the country, the overarching theme present in all of the accreditation standards is the critical
role that an institution’s mission plays in framing the question of what constitutes
institutional quality. The college’s mission acts as the beacon that guides all strategic
planning efforts. Therefore, institutional effectiveness, in its most general terms, refers to
the extent to which a college or university meets its stated mission (Hasson & Meehan,
2010). Many four-year institutions have unitary missions, making the assessment of
academic quality more straightforward; however, community colleges face a more acute
dilemma because institutional missions are multi-functioned (Ewell, 2011).
The League for Innovation in the Community College, a preeminent policy and
institutional leadership group for two-year institutions, has identified five critical
missions of community colleges: transfer, career preparation, basic skills, continuing
education and community service, and access (Doucette & Hughes, 1990). Ewell (2011)
noted that comprehensive community college missions include a broad range of
functions, including (a) the first two years of a baccalaureate study (transfer pathway); (b)
the attainment of an associate’s degree as a terminal milestone, especially in career and
technical education fields; (c) vocational training for immediate wage gain and
employment; (d) contract education for local businesses and employers; (e) pre-
collegiate, basic skills education for the large number of students who enroll unprepared
to produce collegiate level work; and (f) noncredit and community education services,
such as lifelong learning and second language acquisition (Ewell, 2011). As an added
layer of complexity, each diverse function of a community college occurs concurrently
with, if not independently of, each of the other functions.
While nuanced in the field, the definition of institutional effectiveness used in this
study is “the ability of an institution to match its performance to the purposes of the
mission and vision statements and to the needs and expectations of its stakeholders”
(Alfred, Shults, & Seybert, 2007, p. 27). In the community college context, institutional
missions are often closely aligned with statewide priorities. The creation and use of
carefully delineated performance metrics, each tied to a specific function stated in the
mission, has become the preferred method to assess institutional quality at the two-year
institutions (Ewell, 1993).
Emergence of Performance Indicators
The growing array of performance indicators used by community colleges to
measure institutional effectiveness is the result of the expanding partnerships between
two-year institutions and the stakeholders, including local businesses and employers,
nonprofit organizations, and state and local governments in the new economy (Kirsch,
Braun, Yamamoto, & Sum, 2007). The method used to assess the effectiveness of the
multiple missions of the community college has been to assign an indicator of aggregate
student success and achievement for each function as a form of credible evidence upon
which decision makers may rely when determining whether efforts are working
(Donaldson, Christie, & Mark, 2008).
This distinctly quantitative approach to measuring institutional effectiveness
provides a set of performance results by which state and federal governments may
compare achievement and for individual institutions to benchmark institutional
improvement against actual or aspirational peers. Indeed, the evolution of the use of
performance metrics at the community college has been robust over the past 25 years.
The root of the use of performance indicators can be found in the efforts of
AACC, a leading professional organization for two-year colleges. Ewell (1993)
authored the landmark text Core Indicators of Effectiveness for Community Colleges,
which identifies key performance measures as a way to assess organizational quality,
including such metrics as student persistence, graduation rates, goal attainment, and
transfer rates. The creation of performance metrics has resulted in an explosion of the
field of institutional research, as colleges garner resources to build the critical mass
necessary to generate and analyze the data. Although the treatise has gone through
multiple revisions and editions, and now includes a list of 16 effectiveness measures, the
report remains the vanguard thesis for output analysis in the community college (Alfred
et al., 2007).
Many states have followed suit in the use of quantitative outputs as part of
accountability reporting. Burke and Minassians (2003) conducted a review of higher
education accountability systems and found that 29 states used 158 performance
indicators in reporting systems and that 11 states incorporated 66 performance metrics in
funding models. The top measures for reporting models include graduation and retention
rates, diversity enrollments, transfer rates, and degrees awarded. The most common
measures for funding models include graduation and retention rates, job placement rates,
transfer rates, and faculty workload measures (Burke, 2004).
On the national level, the U.S. Department of Education mandates that
postsecondary institutions that receive federal assistance participate in the Integrated
Postsecondary Education Data System (IPEDS). The system is a series of data
submissions by institutions to the National Center for Education Statistics (NCES).
While not an accountability system, per se, IPEDS acts as a repository for data such as
enrollments, program completions, graduation rates, and cost of attendance. In turn,
these data are made available to students and parents for use when making decisions to
attend colleges and universities (NCES, 2004). Colleges use the navigator tools to create
peer grouping comparisons to benchmark innovation and improvement. The U.S.
Department of Education has established a committee on measures of student success to
address the emerging metrics of analysis.
As a response to state and federal governmental entry into the use of performance
metrics, professional associations and organizations also have begun to implement
nongovernmental, college-centered models that incorporate performance indicators at the
heart of the analysis. In 2007, AASCU and APLU, which comprise 520 institutions that
award approximately 70% of the bachelor’s degrees every year in the United States,
created VSA to identify and report educational outcomes and disseminate effective
college practices based on the findings (AASCU, 2012). NAICU (2012) created an
assessment and accountability framework to help prospective students and families make
more informed decisions. The largest concerted effort for accountability outside of state-
mandated systems is the VFA, established by the AACC. Under the VFA, AACC (2012)
has identified a series of core performance metrics that two-year institutions may use to
assess institutional effectiveness based on the unique missions of community and junior
colleges.
National higher education initiatives such as Achieving the Dream utilize student
success markers and have provided an investigation of the development of new measures
of student success to drive innovation and improvement (Baldwin, Bensimon, Dowd, &
Kleiman, 2011). The Equity for All initiative, led by the University of Southern
California Center for Urban Education (CUE), also uses performance indicators to effect
change. CUE has implemented the Equity Scorecard, an action research-based tool
designed to improve educational outcomes for historically underrepresented groups who
use the community college as the principal entry point into postsecondary education
(Dowd, 2005). By-products of this initiative have been a greater focus on developing
cultures of inquiry at community colleges and the expansion of the use of evidence by
college decision makers at all levels.
Performance indicators have been at the heart of the efforts of national and state
policymakers, as well. The National Center for Higher Education Management Systems
(NCHEMS) houses data drawn from a wide range of sources to help address policy and
strategic challenges (Ewell, 2006). NCPPHE (2006) produces a periodic update on
progress on performance metrics, referred to as the National Report Card for Higher
Education. To respond to the increased reliance on inter-college comparisons in
performance metrics, Johnson County Community College District established the
National Community College Benchmarking Institute to systematize peer institution
analysis (National Higher Education Benchmarking Institute, 2010). In California,
college-level indicators have been the primary measures employed by the Institute for
Higher Education Leadership and Policy at California State University, Sacramento
(Moore & Shulock, 2010; Shulock, Moore, & Offenstein, 2011).
While the rapid expansion of the use of performance indicators is clear, the
operationalization and use of the indicators as measures of institutional quality have been
uneven at best, calling into question the utility of the decision making based on the use of
metrics. Attempts have been made to consider the use of models that allow for more
equitable and fair comparison such as regression and peer grouping; however, the inputs
employed tend to be predictor variables that act as proxies for socioeconomic status
(Hom, 2008; Perry, 2005).
An emerging model for institutional comparison is the binomial graphic display (BGD), which shows the relative position of achievement points in relation to the best-fit regression line (Goldschmidt & Hocevar, 2004). Thus, intra-institutional comparisons that use BGD can predict performance changes in the output measures in terms of hypothetical changes in the input measures, based on the linear regression coefficient (Goldschmidt & Hocevar, 2004). Further, BGD provides both
statistical and actuarial information on improvement for potential benchmark schools.
From Compliance to Outcomes Analysis
The final trend that provides a foundation for this study is the transformation of
higher education accountability from compliance to outcomes analysis. In the past two
decades, three types of accountability models have emerged: performance reporting,
performance-based funding, and performance budgeting. Performance reporting models,
such as those in California and Florida, rely on the public disclosure of performance on a
set of indicators matched to state goals that are used to drive innovation and improvement
(Baldwin et al., 2011; Burke, 2004). Performance funding models, such as those found in
Connecticut and Maryland, tie state funding to institutional performance on the metrics
(Burke, 2004). Finally, performance budget models, such as systems implemented in
Pennsylvania and Texas, use performance on measures to inform future resource
allocation to close achievement gaps.
Given the greater emerging reliance of governments on performance indicators as
markers to foster innovation and improvement, and the use of these measures by
institutions to inform and refine internal processes and practices of self-evaluation, it is
essential that these measures be stable, consistent, and free from extraneous factors that
are irrelevant to the construct of interest, that is, institutional quality. To the extent that
the indicators measure artifacts other than academic quality, they are subject to error,
which may be either random or systematic. The seminal question for this study, then, is:
how well do the outcome measures overlap with the idea of institutional effectiveness,
and what factors mediate and moderate that alignment?
California’s Response
The State of California began its examination of community college
accountability in the 1990s with an agreement referred to as the Partnership For
Excellence (PFE) among three centers of authority: the Community College Chancellor’s
Office, the Governor’s Office, and the State Legislature. The PFE focused on overall
statewide performance on identified performance indicators, including transfer rates,
degree and certificate completion, successful course completion, workforce development,
and basic skills improvement. The results were intended to influence change at the
community college system’s office level. No real change could occur at the individual
district or college levels due to the aggregate nature of the outcome measures. A major
hurdle for the PFE was the lack of a fully functional data system to provide the requisite
information to measure community college effectiveness. Although the PFE made a
concerted statewide effort to examine community college performance, it lacked the
requisite tools to address the growing statewide concern from the government for
accountability for two-year institutions. As a result, funding for the PFE ceased, and a
more concerted effort was launched by the California Legislature to address institutional
effectiveness at the community college level.
In 2004, California enacted ARCC to establish more rigorous accountability
measures for district performance (Postsecondary Education Accountability, California
Education Code §84754.5 et seq.). Under the law, the Board of Governors of the
California Community Colleges was required to (a) determine a set of educational
priorities; (b) create a workable framework of accountability; and (c) identify relevant
indicators to measure performance (Postsecondary Education Accountability, California
Education Code §84754.5 et seq.). As amended, ARCC established two levels of
performance indicators: statewide and individual community college outcomes.
Importantly, ARCC established a technical advisory group (TAG) to operationalize the
performance indicators and implement the accountability program.
Relevant to the present study, California has identified six college-level
performance indicators:
1. Student Progress and Achievement Rate (SPAR);
2. Percentage of Students Who Earned at Least 30 Units (Thirty-Unit Completion);
3. Persistence Rate (Persistence);
4. Annual Successful Course Completion Rate for Credit-Based Vocational Courses
(Vocational Success Rate);
5. Annual Successful Course Completion Rate for Credit-Based Basic Skills
Courses (Basic Skills Success Rate); and
6. Improvement Rates for Credit-Based Basic Skills Courses (Basic Skills
Improvement Rate).
Each of the variables was selected to address the multiple missions of the community
colleges.
The methodology employed by ARCC is a significant improvement over the
measures implemented by PFE in a number of critical areas. First, ARCC generates an
annual report for public consumption that identifies the measures and methods
implemented. The report affords the colleges an opportunity to generate a narrative to
explain the results in detail. There are college-specific indicators as well as system-level
indicators to track improvement. Moreover, ARCC makes between-college comparisons
based on a peer-grouping model of cluster analysis (Hom, 2008). Importantly, there are
no fixed targets or goals, no rankings of the individual colleges, and no ostensible link to
state funding. In theory, ARCC is designed to place decision making with respect to
funding allocation in light of the performance indicators in the hands of local voters and
local boards of trustees. As an extension of this control, local districts make the
necessary decisions for the colleges, as they are provided with reliable and valid data
upon which to evaluate performance and base their decisions.
Basis for the Study
The presence and degree of measurement error, both random and systematic, in
institutional effectiveness performance indicators is a relevant inquiry. Moreover, it is
relevant to ask whether statistical techniques and analytical methods can be used to
increase the stability and fit of the measures to establish a more accurate and equitable
standard of expected or predicted institutional achievement that help to:
1. Better explain why differences between institutional performance arise and to
determine how those colleges that achieve less well can use their knowledge of
more accurate factors to improve (Thorndike, 1963).
2. Afford accrediting commissions and governmental agencies more accurate data to
make and inform policy decisions based on the performance results.
CHAPTER 3
RESEARCH METHODOLOGY
The research questions posed in the study drove the selection of the inquiry
methods used to collect and measure the data and analyze and interpret the results
(Shavelson & Towne, 2003). This chapter presents the inquiry methods.
Population and Sample
The target population for this study consisted of the 112 credit-awarding
institutions of the California Community College System (CCCS or “the System”). The
unit of analysis of the sample, therefore, was at the institutional level; however, the
colleges included in the sample enroll approximately three million students, which
represents almost a million full-time equivalent students (FTEs). CCCS remains the
primary pathway for entry into postsecondary education for most Californians, with close
to 84 out of every 1,000 residents participating in the System, including approximately
one-fourth of all 20- to 24-year-olds (California Community College Chancellor’s Office
[CCCCO], 2010).
The institutional performance data analyzed in this study are centrally housed in
the Management Information Systems Division (MIS) of CCCCO (“the Chancellor’s
Office”). A complete dataset that contains outcome results for California’s performance
measurement system was provided by the Research, Analysis and Accountability Unit
(RAA) of CCCCO. This opened a rare window of opportunity to maximize the
accessible population and accurately align the study sample with the universe of colleges
within the System.
With the traditional logistical hurdles of time, accessibility, and expense
fortuitously cleared, the largest possible sample of community colleges was constructed
to (a) increase the level of precision of the statistical and analytic tools employed in
analysis of random and systematic measurement error; (b) achieve a robust level of
population validity; and (c) be able to generalize the findings of the study to subgroups of
colleges within the population as well as to the population as a whole (Gall et al., 2007).
To this end, a sampling frame was generated using the dataset provided by RAA, which
consisted of a complete list of the colleges in the System as well as the outcome results
for each of the initial four years of implementation of the community college
accountability system.
Performance data could not be consistently collected for six of the 112 colleges in
CCCS because the institutions: (a) were chartered after the commencement of the study
or (b) existed at the start of the study but were not in operation long enough to generate
outputs for each of the performance indicators for the reporting periods under
consideration. Consequently, the six institutions with missing results were not included
in the sample, leaving a subset (N = 106) of community colleges for the study. The
exclusion of the six colleges was determined not to materially affect the conclusions
drawn from the findings due to the large number of included institutions in the sample as
a percentage of the total population of colleges within the System. The large pool of
included colleges also significantly reduced the impact of sampling error and raised the
probability that the college scores on the measured variables were representative of the
entire CCCS.
No college self-selected into the sample of institutions. A non-experimental, quantitative research design, based on publicly available data (CCCCO, 2011), was used to examine the variation in college performance on the outcome measures. Indicators varied freely, and the extent of any covariation or relationship was duly noted. Importantly, no treatment was administered, and no experiment was conducted.
Data Sources
Three sources of publicly available data were accessed and utilized to complete
the quantitative analysis for this study: (a) U.S. Census Bureau database (USCB or
“Census Bureau”), (b) the CCCCO Data Mart, and (c) ARCC performance measurement
datasets. Census Bureau data were accessed to retrieve demographic information on the
population density, income level, and educational attainment of the community where the
institution is located to determine whether these inputs predict performance on the
college outcomes measures. The data were mined directly from the official website of
USCB for the 2000 Decennial Census, the last full census for which figures on these
demographic metrics were available at the commencement of this study.
Similarly, the CCCCO Data Mart was used to retrieve college size, race, ethnicity,
gender, and age statistics for each college in the sample to determine whether these
institutional and student socioeconomic status inputs also predict performance on the
ARCC metrics. The statistics were extracted completely from datasets retrieved from the
MIS portal on the Chancellor’s Office official website. Finally, the ARCC datasets were
generated to collect the performance indicator results for the sample colleges and were
produced by staff of the RAA unit of CCCCO for specific use in this study.
Instrumentation
The instrument evaluated for sources of random and systematic measurement
error was the performance measurement system created for two-year institutions by the
State of California, officially referred to as ARCC. The performance indicators produced
under the ARCC framework are reported in the Focus on Results (CCCCO, 2010), a
report used by California to evaluate college-level performance in meeting statewide
educational outcome priorities.
For the time period of this study, data were collected and examined for each of the
colleges included in the sample on six of the eight ARCC indicators, including:
1. Student Progress and Achievement Rate (SPAR);
2. Percentage of Students Who Earned at Least 30 Units (Thirty-Unit Completion);
3. Persistence Rate (Persistence);
4. Annual Successful Course Completion Rate for Credit-Based Vocational Courses
(Vocational Success Rate);
5. Annual Successful Course Completion Rate for Credit-Based Basic Skills
Courses (Basic Skills Success Rate); and
6. Improvement Rates for Credit-Based Basic Skills Courses (Basic Skills
Improvement Rate).
After the commencement of the study, CCCCO added two further metrics as a
part of the ARCC performance measurement system: (a) Improvement Rates for Credit-
Based ESL Courses and (b) Career Development and College Preparation Progress and
Achievement Rate. Outcome results could not be collected for these performance
indicators for the sample colleges for each of the reporting years examined in this study
and, as a result, were not included in the analysis.
Special focus was placed on three of the outcomes metrics: SPAR, Persistence,
and Thirty-Unit Completion. SPAR was selected due to its role as a general or
comprehensive measure of student attainment of stated educational objectives.
Persistence was chosen based on its possible role as a tipping or momentum point to
reaching SPAR as a terminal educational outcome. Finally, Thirty-Unit Completion was
chosen as an additional tipping point indicator of success on later, terminal outcomes and
its role as a measure of wage gain, an elusive, but prioritized, outcome by the federal
government. In addition to the state-identified performance indicators in ARCC, a global
indicator was generated as a composite metric of the six performance indicators to
determine the internal consistency of the performance indicators and to determine
whether an overarching measure of institutional effectiveness could be derived.
Predictor and Criterion Variables
Levels of Measurement
Explanatory or predictor (input) variables were identified to determine whether
socioeconomic factors predict performance on SPAR, Persistence, and Thirty-Unit
Completion, impairing the fit of the measure to the idea of institutional effectiveness
(Brennan, 2006; Crano & Brewer, 2004). Operationalized definitions as well as the type
of variable and level of measurement for each input and output measure are set forth
below.
Variables with continuous or categorical data were recoded to create scales to
permit necessary statistical and quantitative analysis and analytic procedures. In some
instances, data were recoded to form dichotomous or scaled variables to create a method
to address the existence and degree of linear relationship within the variables and
between the input and output measures.
Predictor Variables (Input Measures)
The predictor variables analyzed in this study were population density, college
size, income, educational attainment, race and ethnicity, gender, and age.
Population density. Population density is an established demographic category of
interest by USCB and is defined as the population of an area divided by the number of
square miles of land for the area (U.S. Census Bureau, 2010). For this study, the
population density was determined by (a) identifying the ZIP code associated with the
principal address for the community college as reported to CCCCO; (b) matching the
identified population density for the ZIP code area as reported by USCB; and (c)
recording and assigning the population density figure to each college in the sample. The
population density figures were continuous, ratio data on the number of persons (rounded
to the hundredths decimal place) per square mile.
College size. College size is a recognized data element collected by CCCCO and
is defined as the number of FTEs at an institution. The FTEs for each college is a derived
element computed by the CCCCO MIS division, using enrollment data reported and
submitted by the individual colleges to the System’s MIS office. Specifically, the FTEs
for a college are computed by CCCCO by summing the total hours of positive attendance
in classes eligible for state apportionment and then dividing the total by 525, the total
number of hours for full-time status (15) for a 35-week academic year (15 x 35 = 525).
For this study, the college size was determined by (a) retrieving the FTEs for each college
in the sample for the years that ARCC has been in effect and (b) computing a mean FTEs
figure for the years under consideration. The college size data were continuous, ratio
data on the number of FTEs (rounded to the hundredths decimal place).
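To make the derivation concrete, the following is a minimal Python sketch of the FTES computation described above; the attendance-hour figure is hypothetical and chosen only for illustration.

```python
def fte_students(total_positive_attendance_hours: float) -> float:
    """Convert total hours of positive attendance into FTES.

    525 = 15 hours per week of full-time status x a 35-week academic year,
    per the CCCCO derivation described above.
    """
    return total_positive_attendance_hours / 525.0


# Hypothetical figure: a college reporting 5,222,925 attendance hours would be
# credited with roughly 9,948 full-time equivalent students.
print(round(fte_students(5_222_925), 2))
```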
Income. Income level is an established demographic category of interest by
USCB and is defined as the median household income for a housing unit. For this study,
the income variable was determined by (a) identifying the ZIP code associated with the
principal address for the college as reported to CCCCO; (b) matching the ZIP code with
the median income for that area as reported by USCB; and (c) recording and assigning
the median household income figure to each college in the sample. Income data were continuous, ratio data on median household income (rounded to the nearest dollar).
Educational attainment. Educational attainment is an established demographic
category of interest by USCB and is defined as the highest level of scholastic
achievement that an individual has achieved (U.S. Census Bureau, 2010). For this study,
educational attainment was determined by (a) identifying the ZIP code associated with
the principal address for the college as reported to the CCCCO; (b) matching the ZIP
code with the identified levels of education achievement as noted by the USCB; and (c)
recording and assigning the educational attainment figures to each college in the sample.
Educational attainment data were ordinal data on the percentage of persons who
reached the following scholastic milestones: less than 9th grade, 9th-12th grade, high
school graduate, some college, associate’s degree, bachelor’s degree, and
graduate/professional. The ordinal data were transformed into scales by assigning an
increasing value to the milestone attainment categories and computing an aggregate
figure for the metric as follows:
1. Percentage of less than 9th grade x 1;
2. Percentage of 9th-12th grade x 2;
3. Percentage of high school graduate x 3;
4. Percentage of some college x 4;
5. Percentage of associate’s degree x 5;
6. Percentage of bachelor’s degree x 6; and
7. Percentage of graduate/professional x 7.
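As a minimal illustration of this weighting scheme, the Python sketch below computes the aggregate attainment scale for a hypothetical ZIP-code distribution; the percentages are invented for the example and are not drawn from the study data.

```python
# Hypothetical attainment percentages for one ZIP code (proportions sum to 1.0).
attainment = {
    "less_than_9th": 0.06,
    "9th_to_12th": 0.10,
    "hs_graduate": 0.24,
    "some_college": 0.25,
    "associates": 0.08,
    "bachelors": 0.18,
    "graduate_professional": 0.09,
}

# Milestone weights 1-7, matching the list above.
weights = {
    "less_than_9th": 1,
    "9th_to_12th": 2,
    "hs_graduate": 3,
    "some_college": 4,
    "associates": 5,
    "bachelors": 6,
    "graduate_professional": 7,
}

# Weighted aggregate: sum of (percentage x milestone weight).
educational_attainment_scale = sum(attainment[k] * weights[k] for k in attainment)
print(round(educational_attainment_scale, 2))  # 4.09 for these hypothetical values
```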
Race and ethnicity. Race and ethnicity are recognized data elements measured by
CCCCO and were defined as the student’s self-declared racial or ethnic background.
Information about the student race or ethnicity is identified by the student on the
application for admission to the college. The colleges did not consistently report some
ethnic categories, and, as a result, the following groups were created: African American,
Asian, Caucasian, Hispanic, and “Other” (defined as 1 minus the sum of the identified category percentages).
Percentages for each category are computed and reported by CCCCO. For this study, the
racial and ethnic categories were determined by (a) retrieving the data element figures for
college in the sample for the years that ARCC has been in effect and (b) computing a
mean for each category for the years under consideration. Racial figures were ratio data
reported as a percentage of persons in each category.
Gender. Gender is a recognized data element measured by CCCCO and is
defined as the student’s self-declared gender identity. Information about gender is
reported by the student on the application for admission to the college. For this study,
the gender categories were determined by (a) retrieving the percentage of males and
females for each college in the sample for the years that ARCC has been in effect and (b)
computing a mean for each category for the years under consideration. Gender data were
ratio data reported as the percentage of persons in each category.
Age. Age is a recognized data element measured by CCCCO and is determined
by the student’s self-declared birth date for the academic year of interest. Age is a
derived element computed by using the student’s reported birth date on the application
for admission and then placing the student into established age categories, including 19 or
less, 20 to 24, 25 to 29, 30 to 34, 35 to 39, 40 to 49, and 50+. For this study, the age
categories were determined by (a) retrieving the age figures for college in the sample for
the years that ARCC has been in effect; (b) computing a mean for each category for the
years under consideration; and (c) placing the age groups into two categories:
“traditional” and “nontraditional” students. Traditional students were defined as students
aged 24 or younger during the year in question, and nontraditional students were defined as students aged 25 or older during the academic year. Age data were ratio data on the percentage of persons in each category.
Performance Indicators (Output Measures)
The output measures for this study include the state-identified ARCC
performance indicators as well as a derived global measure established by computing a
composite metric of institutional effectiveness using all of the ARCC indicators.
Student Progress and Achievement Rate (SPAR). SPAR is calculated as a
percentage of a cohort of first-time students who earn a minimum of 12 units, who have attempted a degree, certificate, or transfer threshold course within six years of entry, and who achieve any of the following successful outcomes:
1. Earned an associate’s degree or certificate;
2. Completed transfer to a Four-Year institution after enrollment in the California
community college;
3. Achieved transfer-directed status; or
4. Achieved transfer-prepared status.
For the purposes of this indicator, transfer directed means successful completion
of both transfer-level math and English courses. Transfer prepared for the SPAR
indicator includes successful completion of 60 University of California (UC) or
California State University (CSU) transferable units, with a grade point average in excess
of 2.0. Whether a student transferred to a Four-Year institution is determined using databases of CCCCO that include non-state schools as well as UC and
CSU transfers. Thus, a cohort of first-time students from 2001-2002 would be expected
to achieve triggering outcomes by the 2006-2007 academic year. This indicator is, in
essence, a global metric of student completion of educational objectives while enrolled at
a community college.
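At its core, the rate is the share of a qualifying cohort that reaches any of the four outcomes. The Python sketch below illustrates that calculation with a hypothetical four-student cohort; it does not reproduce CCCCO's full cohort-construction rules.

```python
# Hypothetical four-student cohort: each record flags whether a qualifying
# first-time student reached one of the four SPAR outcomes within six years.
cohort = [
    {"degree_or_certificate": True,  "transferred": False, "transfer_directed": False, "transfer_prepared": False},
    {"degree_or_certificate": False, "transferred": True,  "transfer_directed": False, "transfer_prepared": False},
    {"degree_or_certificate": False, "transferred": False, "transfer_directed": False, "transfer_prepared": False},
    {"degree_or_certificate": False, "transferred": False, "transfer_directed": True,  "transfer_prepared": True},
]

# SPAR: percentage of the cohort achieving any of the qualifying outcomes.
successes = sum(1 for student in cohort if any(student.values()))
spar = 100.0 * successes / len(cohort)
print(round(spar, 1))  # 75.0 for this hypothetical cohort
```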
Percentage of Students Who Earned at Least 30 Units (Thirty-Unit Completion).
The Thirty-Unit Completion performance indicator is calculated as a percentage of a
cohort of first-time students with a minimum of 12 units who have attempted a degree,
certificate, or transfer threshold course within six years of entry and who earn at least 30
units while in the CCCS in courses identified to be “value-added” to the student as
defined in wage analysis as having a positive effect on future earnings.
The value-added threshold of units as operationalized by CCCCO involves
courses determined in wage studies as having a positive effect on the future earnings
potential of the student. As with the SPAR performance indicator, for the Thirty-Unit
Completion indicator, the cohort of first-time students from 2001-2002 would be
expected to achieve triggering outcomes by the 2006-2007 academic year. This indicator
is an indicator of wage gain associated with coursework taken at the community college.
Persistence Rate (Persistence). Persistence is calculated as a percentage of a
cohort of first-time students with a minimum of 6 units earned in their first fall term who
return and reenroll in the subsequent fall term anywhere in the CCCS.
Thus, a student enrolled in the fall 2006 term with the requisite 6 units earned
would be measured by his or her enrollment in the fall 2007 term. This performance
indicator concerns the retention of the student in the system from year to year as an
indicator of likelihood of future completion of an educational objective.
Annual Successful Course Completion Rate–Vocational Courses for Credit
(Vocational Success Rate). Vocational Success Rate is calculated as a percentage of
students enrolled in a credit-based career and technical education course over the
previous three academic years who receive:
1. A final course grade of A, B, or C; or
2. A final assignment of Credit (CR) for the course.
The Vocational Success Rate excludes “special admit” students, including
students enrolled in K-12 when they enrolled in the vocational education course. The
Vocational Success Rate for the 2010 ARCC report, for example, would include students
enrolled in vocational classes for the 2006-2007, 2007-2008, and 2008-2009 academic
years.
Annual Successful Course Completion Rate for Credit-Based Basic Skills
Courses (Basic Skills Success Rate). Basic Skills Success Rate is calculated as a
percentage of students enrolled in a credit-based basic skills course over the previous
three academic years who receive:
1. A final course grade of A, B, or C; or
2. A final assignment of Credit (CR) for the course.
The Basic Skills Success Rate excludes “special admit” students, including
students enrolled in K-12 when they enrolled in the basic skills course. The Basic Skills
Success Rate for the 2010 ARCC report, for example, would include students enrolled in
basic skills classes for the 2006-2007, 2007-2008, and 2008-2009 academic years.
Improvement Rates for Credit-Based Basic Skills Courses (Basic Skills
Improvement Rate). Basic Skills Improvement Rate is calculated as the percentage of students enrolled in a basic skills math or English course who successfully complete that initial course and who, within three academic years, receive a final course grade of A, B, or C, or a final assignment of Credit (CR), in a course at a higher level within the same discipline.
The Basic Skills Improvement Rate includes only students who will start their
pre-collegiate instruction at least two levels below college or transfer level. The Basic
Skills Improvement Rate excludes “special admit” students, including students enrolled
in K-12 when they enrolled in the basic skills course. The Basic Skills Improvement
Rate for the 2010 ARCC report, for example, would include cohorts for 2004-2005
through 2006-2007, 2005-2006 through 2007-2008, and so forth.
Global Institutional Effectiveness Rate (Global Rate). The Global Rate is a
composite indicator synthesized from each of the ARCC performance metrics. The
measure was computed by summing the individual ARCC rates for a given year of the
accountability system and computing a mean for combined rates. The Global Rate was
created to assess the extent to which all of the performance measures in the ARCC
framework, as “items” in the instrument, reliably map to the theoretical construct of
institutional effectiveness and thus create an overarching indicator of college quality.
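The following is a minimal Python sketch of the Global Rate computation as described above; the indicator rates are hypothetical values for a single college and report year, used only for illustration.

```python
# Hypothetical ARCC rates (percentages) for one college in one report year.
arcc_rates = {
    "SPAR": 51.1,
    "Thirty-Unit Completion": 70.0,
    "Persistence": 66.2,
    "Vocational Success Rate": 76.4,
    "Basic Skills Success Rate": 60.3,
    "Basic Skills Improvement Rate": 51.0,
}

# Global Rate: the mean of the individual indicator rates for the year.
global_rate = sum(arcc_rates.values()) / len(arcc_rates)
print(round(global_rate, 2))  # 62.5 for these hypothetical rates
```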
Research Questions
The study addressed four research questions:
1. What is the degree of random error present in the ARCC performance metrics
(SPAR, Thirty-Unit Completion, Persistence, Basic Skills Success Rate, Basic
Skills Improvement Rate, and Vocational Success Rate)? What is the temporal
stability of the performance indicators as measured by the test-retest reliability of
the metrics? Are the performance indicators internally consistent?
2. Are the ARCC performance indicators, SPAR, Persistence, and Thirty-Unit
Completion, subject to systematic measurement error? If so, what are the
confounding factors that impair assessment of institutional effectiveness by
consistently and artificially inflating or deflating scores among colleges?
3. If the ARCC performance indicators, SPAR, Persistence, and Thirty-Unit
Completion, are subject to systematic error, is there a viable method to control for
the confounding factors to make more meaningful comparisons among colleges
about institutional effectiveness?
4. Is Persistence, as an intermediate milestone, a tipping or momentum point
indicator of the terminal student outcome of SPAR? Is Persistence a tipping or
momentum point indicator of the Thirty-Unit Completion intermediate milestone?
Theoretical Framework for the Methodology
The research questions posed in the study drove the selection of the inquiry
methods used to collect and measure the data and analyze and interpret the results
(Shavelson & Towne, 2003). The design of the study was grounded in CMT. CMT
posits that the observed score on any single measurement is a random draw from a
distribution of possible scores for a respondent on the instrument that measures a
construct of interest (Brennan, 2006). An observed score is the sum of the respondent’s
true score plus error represented by the hallmark test theory formula:
O = T + e,
where O represents the observed score, T equals the “true” score, and e equals the error
caused by chance fluctuation (Brennan, 2006).
The division of an observed result into two distinct components addresses the
inevitable imperfection inherent in any measurement instrument. The true score
component of the observed result represents the replicable part of O, the part of the
measure that repeats itself over time. To the extent it is reproducible, a true score
becomes temporally stable, and, as a result, the instrument’s precision of the measure is
optimized. The error component is the part of the score that is random, that is, irrelevant
and unrelated to the construct being measured. The greater the error, the greater the
degree of chance in the score, and, as a result, the less likely that the variance in the
observed scores is due to real, meaningful differences between respondents. By
definition, random error affects the accuracy of the measurement and the ability to detect
meaningful differences among groups (Crano & Brewer, 2004). In theory, with large
groups of respondents, random error sums to zero (cancels out) over groups because the
error is due to chance.
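The cancellation property can be illustrated with a small simulation. The Python sketch below assumes a hypothetical true score and normally distributed random error; it simply shows that the mean of many observed scores converges on the true score as the chance component averages toward zero.

```python
import numpy as np

rng = np.random.default_rng(0)

true_score = 52.0                            # T: the replicable component
errors = rng.normal(0.0, 2.0, size=10_000)   # e: chance fluctuation with mean zero
observed = true_score + errors               # O = T + e for each hypothetical draw

# Because the error is due to chance, it averages toward zero over many draws,
# so the mean observed score converges on the true score.
print(round(observed.mean(), 2))  # close to 52.0
print(round(errors.mean(), 3))    # close to 0.0
```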
For purposes of the examination of institutional effectiveness metrics, the classic
test theory formula discussed above was modernized to reflect augmentations in classical
test theory over the past half-century and its extension to research design. For this study,
the augmented formula was restated as follows:
O = T + ∑e_(r + s),
where O equals the observed score of a college on the performance indicator, T corresponds to the institution’s true score on the performance metric, and ∑e_(r + s) represents the sum of error, which can be either random or systematic (Crano & Brewer,
2004). As with the classic formula, the random error component calculates the degree to
which chance plays a role in producing the observed result. Importantly, however, the
modified test theory formula includes the added dimension of examining the sum of
systematic error, which is the presence and extent of measurement bias in the results.
The difference between random error and measurement bias is the systematicity
of the error. Systematic measurement error, because it is consistent, does not cancel
between groups and, typically, when bias is present in measurement processes, it
artificially pushes one group farther from its comparator. Systematic error reduces
observed score variance and expands the differences in true scores, creating a significant
difference between groups when none, in fact, exists. The difference observed between
colleges on a performance indicator would be a function of the biased measures and
operations, not a real difference in the institutional effectiveness. It is possible for
systematic error to artificially reduce group differences, making conclusions of similarity
erroneous; however, these types of measurement bias are rarer than sources of bias that exacerbate differences. In sum, systematic error attacks the validity of the measure
because it misidentifies the factors that cause group differences.
Procedures
Research Question 1
What is the degree of random error present in the ARCC performance metrics
(SPAR, Thirty-Unit Completion, Persistence, Basic Skills Success Rate, Basic Skills
Improvement Rate and Vocational Success Rate)? What is the temporal stability of the
performance indicators as measured by the test-retest reliability of the metrics? Are the
performance indicators internally consistent?
The temporal stability of the indicators (Brennan, 2006) was measured for each
year of the ARCC report by computing a Pearson-product moment coefficient (Pearson’s
r) as a descriptive measure to demonstrate the existence, direction, and relative strength
of the test-retest relationship, computed as a linear correlation coefficient represented by
x- and y- quantitative values. The formula for Pearson’s r is:
r = [nΣxy − (Σx)(Σy)] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]},
where r represents the test-retest reliability correlation coefficient from year to year, n equals the number of paired scores of x- and y- data contained in the analysis, and Σ refers to the sum of numbers. Thus, Σx and Σy refer to the additive sum of the x- and y- values, respectively. The notation Σxy indicates that each individual x- value was multiplied by the corresponding y- value, and the products were summed.
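The computation can be sketched in Python by implementing the formula directly; the year-to-year SPAR rates below are hypothetical values used only to illustrate the test-retest calculation, not results from the study.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson product-moment correlation, implemented from the formula above."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    numerator = n * np.sum(x * y) - np.sum(x) * np.sum(y)
    denominator = np.sqrt(
        (n * np.sum(x ** 2) - np.sum(x) ** 2) * (n * np.sum(y ** 2) - np.sum(y) ** 2)
    )
    return numerator / denominator


# Hypothetical SPAR rates for five colleges in two adjacent report years;
# the correlation between "testings" estimates test-retest reliability.
year_1 = [48.2, 51.5, 55.1, 43.9, 60.4]
year_2 = [47.8, 52.0, 54.3, 45.1, 59.8]
print(round(pearson_r(year_1, year_2), 3))
```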
Thus, the true-score component of the observed ARCC result is the stable
characteristic of institutional effectiveness of each college as measured by the indicator.
No change in the true score component of the ARCC rate occurs unless some real change
has taken place in the college’s performance. The degree of deviation in the scores from
reporting period to reporting period is a function of the random error contained in ARCC
as a measurement instrument. Conversely, the degree of similarity between indicator
rates from year to year reveals the metric’s test-retest reliability. The correlation value as
represented by the Pearson’s correlation coefficient reflects the “true score” variation in
the outcomes measure. The greater the coefficient, the lesser the proportional effect of
extraneous factors that play a part in the observed score differences in the indicators.
In addition to the stability analysis via test-retest reliability, the internal
consistency of the ARCC indicators was determined by computing an average of all of
the possible ways of splitting “items.” Using Cronbach’s alpha, the average inter-
correlation between ARCC metrics was computed and reported in the form of a global
descriptive measure of consistency among indicators. The formula for Cronbach’s alpha
is:
is:
α = [k / (k − 1)] × [1 − (Σσ²_i / σ²_t)],
where α represents the reliability estimate, k equals the number of performance indicators, σ²_t represents the variance of the global (total) score formed from the performance indicator scores, and Σσ²_i represents the sum of the variances of the individual ARCC performance indicators.
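A minimal Python sketch of this internal consistency computation is shown below; it assumes a hypothetical colleges-by-indicators score matrix and follows the formula above rather than any particular statistical package.

```python
import numpy as np


def cronbach_alpha(score_matrix):
    """Cronbach's alpha for a colleges-by-indicators matrix, following the
    formula above: alpha = k/(k - 1) * (1 - sum(item variances)/total variance)."""
    scores = np.asarray(score_matrix, dtype=float)
    k = scores.shape[1]                               # number of indicators
    item_variances = scores.var(axis=0, ddof=1)       # variance of each indicator
    total_variance = scores.sum(axis=1).var(ddof=1)   # variance of the summed score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)


# Hypothetical rows = colleges, columns = the six ARCC indicators.
example = [
    [51, 70, 66, 60, 51, 76],
    [48, 68, 63, 58, 49, 75],
    [55, 73, 70, 63, 54, 78],
    [44, 65, 58, 55, 46, 72],
]
print(round(cronbach_alpha(example), 3))
```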
Research Question 2
Are the ARCC performance indicators, SPAR, Persistence, and Thirty-Unit
Completion, subject to systematic measurement error? If so, what are the confounding
factors that impair assessment of institutional effectiveness by consistently and artificially
inflating or deflating scores among colleges?
The presence and degree of systematic error in the ARCC performance indicators of SPAR, Persistence, and Thirty-Unit Completion were analyzed using multivariate correlation techniques. Special focus was placed
on SPAR for its attempt to measure student attainment of stated educational objectives as
a general “g” factor. Persistence was chosen because of its role as an intermediate
milestone to achieving a terminal educational objective, as measured by SPAR. Further,
Thirty-Unit Completion was selected based on its role as a measure of wage gain and its
potential role as a tipping or momentum point for later, exit educational outcomes.
The multicorrelation design measured the degree of relationship among various
combinations of explanatory or predictor (input) variables on SPAR, Persistence, and
Thirty-Unit Completion as response variables (Creswell, 2008). The multiple regression
technique was selected for this study because of the amount of information generated
about the variables in the procedure and the versatility and depth of the findings,
including both the magnitude and observed probability of the relationships.
The formula used to regress the outcome measures with the predictor variables is:
Ŷ = b_0 + b_1X_1 + b_2X_2 + … + b_kX_k,
where Ŷ represents the predicted value of the variable being explained, X_1, X_2, …, X_k represent the predictor variables in the study, b_0 represents the estimate, based on the sample, of the point of intersection of the regression line with the y-axis (the value of the performance indicator when all of the predictor variables are 0), and b_1, b_2, …, b_k denote the regression coefficients used as multipliers for the predictor variables. Thus, the individual coefficients in the multiple regression analysis were used as weighted measures of the total effect of the predictor variables.
The initial step in the multiple regression analysis is the computation of the
correlation coefficient between the best predictor (input) variable and the ARCC
indicator under review. This analysis produces the multiple correlation coefficient (R),
which is a measure of the magnitude of a combination of the explanatory variables with
the outcome indicator. Sequentially, additional correlations are computed to identify the
remaining predictor variables contained in the matrix that best clarify the remaining
unexplained variance in the ARCC measure but that also possess the least correlation
with the previously identified best-predictor variables. This process minimizes the
degree of multicollinearity, or explanatory overlap, among the explanatory variables,
resulting in the identification of the best combination of predictor variables for the ARCC
outcomes measures. The squared value of the multiple correlation coefficient produces
an estimate of the common variation among the inputs and the ARCC outcomes,
commonly referred to as the coefficient of determination (R²).
The multiple regression used the least-squares regression line, which minimizes the sum of the squared errors, that is, the vertical distances between the observed values of the ARCC indicators and those predicted by the multiple regression line, as represented by the formula:
Min Σ(Y − Ŷ)².
The total variation was computed by determining the difference between the sum of the
observed scores and the mean value of the ARCC measure. Explained variation was
derived by computing the difference between the predicted value and the mean value of
the ARCC indicator. Finally, the unexplained variation was reached by computing the
difference between the observed score and the predicted value of the ARCC metric. The
coefficient of determination (R²) was computed by using the following formula:
R² = explained variation / total variation.
As a result, the observed score on each of the ARCC indicators was broken down into the portion that can be explained by the regression line and the portion that is explained by other factors, or by error. An adjusted multiple coefficient of determination, adjusted R², was computed to measure how well the multiple regression fits the sample, given the number of variables and sample size. The formula for the adjusted R² is:
adjusted R² = 1 − [(1 − R²)(n − 1) / (n − k − 1)],
where n represents the number of colleges in the sample (N = 106), k denotes the number of predictor variables, and R² signifies the coefficient of determination for the predictor variables. The adjusted R² further addresses the issue of chance in the multivariate
relationship of the variables. Finally, an observed probability (p-value) was computed for
each test statistic to determine the significance level of the results.
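The regression, R², and adjusted R² computations can be sketched in Python with ordinary least squares, as below; the predictor and outcome values are hypothetical, and the sketch omits the significance (p-value) tests reported in the study.

```python
import numpy as np


def multiple_regression(X, y):
    """Ordinary least-squares fit of y on predictors X; returns the coefficient
    vector (intercept first), R-squared, and adjusted R-squared as defined above."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])          # adds the intercept term b0
    coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)
    predicted = design @ coefficients
    ss_total = np.sum((y - y.mean()) ** 2)             # total variation
    ss_unexplained = np.sum((y - predicted) ** 2)      # unexplained variation
    r_squared = 1 - ss_unexplained / ss_total
    adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
    return coefficients, r_squared, adj_r_squared


# Hypothetical inputs (income in $10,000s, educational attainment scale) for six
# colleges, regressed on hypothetical SPAR rates.
X = [[4.6, 3.8], [5.2, 4.1], [6.1, 4.5], [3.9, 3.5], [7.0, 4.9], [4.4, 3.7]]
y = [48.0, 51.0, 55.0, 45.0, 60.0, 47.0]
b, r2, adj_r2 = multiple_regression(X, y)
print(np.round(b, 3), round(r2, 3), round(adj_r2, 3))
```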
Research Question 3
If the ARCC performance indicators, SPAR, Persistence, and Thirty-Unit
Completion, are subject to systematic error, is there a viable method to control for the
confounding factors to make more meaningful comparisons among colleges about
institutional effectiveness?
The determination of real differences, to make meaningful comparisons among
institutions, given the presence of confounding factors, was determined using the
residuals produced by the multiple regression equation. The difference between the
college’s observed score and the predicted score on the performance indicators produced
a residual value that consisted of the error present, based on the combined effect of the
input variables. The formula for the residual is represented as:
R = Y − Ŷ,
where R represents the residual score, Y represents the observed score, and Ŷ represents the predicted score based on the regression line formula. Colleges with positive residual
values (i.e., ARCC scores above the regression line) over-performed on the performance
metric, based on the effect of the inputs. In contrast, institutions with negative residual
values (i.e., ARCC scores below the regression line) under-performed on the performance
indicator, given the combined effect of the explanatory variables. Importantly,
standardized residual scores were computed to determine the relative position of
institutions for each of the ARCC performance indicators.
Once the multiple regression was performed and the predicted and observed
values computed, a set of case-wise diagnostics was run to discover the outliers, based on
the greatest difference between the individual college observed scores and the predicted
score relative to the best-fit (least squares) regression line. Institutions that were at least
one standard deviation above or below the predicted value on the performance indicator
were identified.
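A minimal Python sketch of the residual and case-wise outlier screening described above follows; it standardizes residuals by their sample standard deviation, which is a simplification of full regression diagnostics, and the observed and predicted scores are hypothetical.

```python
import numpy as np


def flag_outliers(observed, predicted, threshold=1.0):
    """Residuals (R = Y - Y-hat), residuals standardized by their sample standard
    deviation, and flags for colleges at least `threshold` SD above or below the
    regression line."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = observed - predicted
    standardized = residuals / residuals.std(ddof=1)
    over_performers = standardized >= threshold
    under_performers = standardized <= -threshold
    return residuals, standardized, over_performers, under_performers


# Hypothetical observed SPAR rates and regression-predicted values for five colleges.
observed = [48.0, 51.0, 55.0, 45.0, 60.0]
predicted = [50.0, 50.5, 53.0, 47.5, 56.0]
residuals, std_residuals, over, under = flag_outliers(observed, predicted)
print(np.round(std_residuals, 2), over, under)
```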
Research Question 4
Is Persistence, as an intermediate milestone, a tipping or momentum point
indicator of the terminal student outcome of SPAR? Is Persistence a tipping or
momentum point indicator of the Thirty-Unit Completion intermediate milestone?
The determination of whether Persistence Rate, as a transitional measure to
completion of a final educational objective, predicted institutional performance on (a)
SPAR and (b) the Thirty-Unit Completion metric was completed in a three-step process.
First, a linear regression was performed to establish the degree and direction of the
bivariate relationship between (a) Persistence and SPAR and (b) Persistence and Thirty-
Unit Completion. Second, a set of case-wise diagnostics was run to determine the
outliers, both over- and under-performers (± 1 SD), based on relative position from the
regression line. The computation of the residuals was performed in the same manner as
the initial multiple regression to identify the presence and degree of error in prediction.
Finally, the institutions were plotted on a scatter diagram in the form of a BGD,
with Persistence measured on the horizontal x-axis, and the performance indicators
(SPAR and Thirty-Unit Completion) measured on the vertical y-axis to show the relative
position of each college relative to the best-fit regression line (Goldschmidt & Hocevar,
2004). Based on an intra-institutional comparison using the BGD, predicted performance
changes in the output measures (SPAR and Thirty-Unit Completion) could logically be
made in terms of hypothetical changes in the input measure (Persistence) based on the
linear regression coefficient. Thus, a change in the college policies and procedures that
affect Persistence rates could inform us about gains that could be expected on either
SPAR or Thirty-Unit Completion as a result of hypothetical improvements of Persistence
(Goldschmidt & Hocevar, 2004). Further, the BGD provides both statistical and actuarial
information on potential benchmark institutions.
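The prediction logic behind this use of the best-fit line can be sketched as follows; the Persistence and SPAR rates are hypothetical, and the example simply fits a least-squares line and scales its slope by a hypothetical Persistence gain rather than reproducing the full BGD procedure.

```python
import numpy as np

# Hypothetical Persistence (x) and SPAR (y) rates for six colleges.
persistence = np.array([60.0, 64.0, 66.0, 70.0, 72.0, 75.0])
spar = np.array([45.0, 48.0, 50.0, 53.0, 55.0, 58.0])

# Best-fit (least squares) line: SPAR predicted from Persistence.
slope, intercept = np.polyfit(persistence, spar, deg=1)

# A hypothetical 2-point gain in Persistence implies roughly slope * 2 additional
# SPAR points for a college on the regression line.
hypothetical_gain = 2.0
print(round(slope, 3), round(intercept, 3), round(slope * hypothetical_gain, 2))
```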
CHAPTER 4
RESULTS
Before conducting the more rigorous statistical and analytical techniques required
to answer the research questions, three fundamental quantitative procedures were
performed: (a) summary descriptive statistics were computed, and the degree,
directionality, and strength of the associations among the explanatory variables were
determined, (b) summary descriptive statistics were computed, and the degree,
directionality, and strength of the associations among the ARCC outcomes measures
were determined, and (c) the degree, directionality, and strength of the associations
among explanatory variables with the ARCC outcomes measures were calculated.
This initial stage of the statistical analysis was designed to assess the normality of
the distributions and to determine the presence and degree of linearity of the associations
among the explanatory variables and the ARCC measures. The temporal stability and
multiple regression analysis conducted in this study are appropriate when relationships
among the explanatory and response variables are linear.
Predictor Variables
Descriptive Analysis
Measures of central tendency (including mean and median), standard deviation,
skewness, and kurtosis were computed for each explanatory variable to determine the
center, spread, symmetry, and shape of the distributions. Table 1 presents the summary
statistics for the predictor variables: Income, Educational Attainment, Population Density, College Size, and the percentage of Female, Nontraditional Aged, Caucasian, Asian,
Hispanic, and African-American students. To determine the significance of the observed
symmetry and variability of the distributions, t-values were computed (critical value = 2)
by dividing the observed skewness and kurtosis by their respective standard errors, based
on the sample size, to arrive at the test statistics.
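A minimal Python sketch of this screening, assuming a short list of hypothetical income figures, appears below; the standard-error formulas are conventional large-sample approximations and may differ slightly in rounding from the values reported in the tables.

```python
import numpy as np
from scipy import stats


def shape_diagnostics(values):
    """Skewness and excess kurtosis with conventional large-sample standard
    errors; returns the t-values (statistic / SE) used for screening (|t| > 2)."""
    x = np.asarray(values, dtype=float)
    n = len(x)
    skew = stats.skew(x, bias=False)
    kurt = stats.kurtosis(x, bias=False)   # excess kurtosis
    se_skew = np.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2.0 * se_skew * np.sqrt((n ** 2 - 1) / ((n - 3) * (n + 5)))
    return skew / se_skew, kurt / se_kurt


# Hypothetical median-income figures for a handful of college ZIP codes.
t_skew, t_kurt = shape_diagnostics([42000, 46000, 48000, 51000, 55000, 98000])
print(round(t_skew, 2), round(t_kurt, 2))
```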
Table 1
Descriptive Statistics of Predictor Variables for California Community College Indicators

                          M       Median    SD      Skewness  SE(Sk)  t(Sk)    Kurtosis  SE(Ku)  t(Ku)
Income                  50990     46480    22060      1.58     0.24    6.72      4.17     0.47    8.97
Educational Attainment   3.99      3.95     0.75      0.06     0.24    0.26     -0.18     0.47   -0.39
Population Density       4091      2657     4724      2.74     0.24   11.66     11.23     0.47   24.15
College Size             9949      8893     5967      0.90     0.24    3.83      0.56     0.47    1.20
Female                   0.56      0.57     0.07     -2.24     0.24   -9.53      8.49     0.47   18.26
Nontrad Aged Students    0.48      0.47     0.10      0.62     0.24    2.64      0.16     0.47    0.34
Caucasian                0.42      0.40     0.20      0.16     0.24    0.68     -0.64     0.47   -1.38
Asian                    0.11      0.07     0.10      1.22     0.24    5.19      0.58     0.47    1.25
Hispanic                 0.30      0.27     0.16      0.99     0.24    4.21      0.87     0.47    1.87
African American         0.09      0.06     0.10      2.85     0.24   12.13      9.77     0.47   21.01

N = 106
Source: California Community College Chancellor’s Office (2007-2010); United States Census Bureau (2000)

The mean exceeded the median for the Income, Population Density, College Size, Asian, and African American inputs. For all other explanatory variables, the mean closely approximated the median. The t-values for skewness, however, indicate that the sample distributions for Income (6.72), Population Density (11.66), College Size (3.83), Nontraditional Aged (2.64), Hispanic (4.21), and African American (12.13) were asymmetrical and positively skewed. The Female (-9.53) explanatory variable also had a non-normal distribution but was negatively skewed. The t-values for kurtosis show that the sample distributions for Income (8.97), Population Density (24.15), Female (18.26), and African American (21.01) were significantly leptokurtic (peaked).
Correlation Analysis
The existence, strength, and directionality of the inter-correlations among the
predictor variables were computed using Pearson’s r. The race and ethnicity categories
were excluded from this step due to their categorical nature. Table 2 displays the linear
correlation coefficients for Income, Educational Attainment, Population Density, College
Size, Nontraditional Aged, and Gender.
Table 2
Linear Correlation Coefficients for Predictor Variables for California Community College Indicators

                          Educational   Population    College                     Nontrad Aged
                          Attainment    Density       Size          Female        Students
Income                    0.71 (0.00)   -0.17 (0.08)   0.20 (0.04)  -0.09 (0.36)  -0.02 (0.83)
Educational Attainment                  -0.16 (0.10)   0.15 (0.13)  -0.01 (0.92)   0.02 (0.84)
Population Density                                     0.31 (0.00)   0.08 (0.44)   0.01 (0.91)
College Size                                                        -0.07 (0.47)  -0.41 (0.00)
Female                                                                            -0.30 (0.00)

Note. Observed probabilities are shown in parentheses.
n = 106
Source: California Community College Chancellor’s Office (2007-2010); United States Census Bureau (2002)

Results of the correlation analysis reveal that the Income predictor variable demonstrated a strong positive association with Educational Attainment (r = .71, p = .001). Population Density had a positive correlation with College Size (r = .31, p = .001). All other relationships either (a) could not exclude random chance as an explanation of an apparent association or (b) demonstrated weak or no linear correlation. The lack of association in the explanatory variables allowed discrimination among inputs to establish the cumulative effect of the explanatory variables on the ARCC outcome measures during the multiple regression procedure. The collinearity between income and educational attainment was addressed with weighted scores to separate out the individual effect of each input on the ARCC indicators.
ARCC Outcomes Measures
Descriptive Analysis
Measures of central tendency (including mean and median), standard deviation,
skewness, and kurtosis were computed for each ARCC outcome measure to determine
the center, variability, symmetry, and shape of the distributions. Table 3 presents the
summary statistics for the outcomes metrics predictor variables examined in this study:
SPAR, Thirty-Unit Completion, Persistence, Basic Skills Success, Basic Skills
Improvement, Vocational Success Rate and Global. To determine the significance of the
observed symmetry and variability of the distributions, t-values were computed (critical
value = 2) by dividing the observed skewness and kurtosis by their respective standard
errors to arrive at the test statistics.
Table 3
Descriptive Statistics of ARCC Performance Indicators

                          M       Median    SD     Skewness  SE(Sk)  t(Sk)    Kurtosis  SE(Ku)  t(Ku)
SPAR                    51.08     51.26    7.41     -0.17     0.24   -0.72      0.33     0.47    0.71
Thirty-Unit Comp.       69.97     70.31    4.81     -0.53     0.24   -2.26      0.86     0.47    1.85
Persistence             66.20     67.58    7.91     -1.15     0.24   -4.89      1.78     0.47    3.83
Basic Skills Success    60.30     60.84    6.21      0.45     0.24    1.91      2.12     0.47    4.56
Basic Skills Imprvmnt   50.96     51.48    6.20     -0.45     0.24   -1.91     -0.30     0.47   -0.65
Voc Success Rate        76.36     76.33    6.16      0.78     0.24    3.32      1.09     0.47    2.34
Global                  59.49     60.06    4.87     -0.39     0.24   -1.66      0.75     0.47    1.61

n = 106
Source: California Community College Chancellor’s Office (2007-2010); United States Census Bureau (2002)
Correlation Analysis
The existence, strength, and directionality of the inter-correlations among the
ARCC outcome measures were computed using Pearson’s r. The significance of the
estimate of correlation is noted with an observed probability of p < .01. Table 4 displays
the linear correlation coefficients for SPAR, Thirty-Unit Completion, Persistence, Basic
Skills Success, Basic Skills Improvement, Vocational Success Rate, and Global Rate.
Table 4
Linear Correlation Coefficients for Performance Indicators (Outputs) for California Community Colleges

                        Thirty-Unit                    Basic Skills    Basic Skills    Voc Success
                        Comp           Persistence     Success         Imp             Rate            Global
SPAR                    0.620 (0.001)  0.510 (0.001)   0.580 (0.001)   0.240 (0.014)   -0.080 (0.420)  0.790 (0.001)
Thirty-Unit Comp                       0.720 (0.001)   0.440 (0.001)   0.360 (0.001)    0.060 (0.540)  0.810 (0.001)
Persistence                                            0.410 (0.001)   0.430 (0.001)   -0.040 (0.690)  0.820 (0.001)
Basic Skills Success                                                   0.340 (0.001)    0.130 (0.170)  0.730 (0.001)
Basic Skills Imp                                                                        0.260 (0.007)  0.610 (0.001)
Voc Success Rate                                                                                       0.090 (0.380)

Note. Observed probabilities are shown in parentheses.
n = 106
Source: California Community College Chancellor’s Office (2007-2010); United States Census Bureau (2000)
The results of the correlation analysis indicate that SPAR demonstrated a strong
positive correlation with Global Rate (r = .79, p = .001) and a moderate positive
correlation with Thirty-Unit Completion (r = .62, p = .001), Persistence (r = .51, p =
.001), and Basic Skills Success (r = .58, p = .001). Persistence displayed a strong
correlation with Thirty-Unit Completion (r = .72, p = .001), Basic Skills Success (r = .41,
p = .001), and Basic Skills Improvement (r = .43, p = .001). Additionally, Thirty-Unit
Completion manifested a moderate positive association with Basic Skills Success (r =
.36, p = .001). Basic Skills Success showed a moderate positive association with Basic
Skills Improvement (r = .34, p = .001). Basic Skills Improvement demonstrated only a
slight positive correlation with Vocational Success (r = .26, p = .007).
Global Rate displayed either a moderate or strong positive association with all
other ARCC performance indicators, except Vocational Success Rate (r = .09, p = .38).
In fact, Vocational Success Rate displayed a weak or no correlation with any of the
ARCC performance indicators, except Basic Skills Improvement (r = .26, p = .007). The
lack of linear association among the ARCC performance indicators with Vocational
Success Rate prohibited the construction and use of Global Rate as a general, all-
inclusive measure of institutional effectiveness. Global Rate, as a derived performance
metric, was still examined for internal consistency of the items, using Cronbach’s alpha
as part of the reliability analysis. All other relationships either (a) could not exclude
random chance as an explanation of an apparent association or (b) demonstrated weak or
no linear correlation.
Predictor Variables with ARCC Outcome Measures
Correlation Analysis
The existence, strength, and directionality of the correlation among the
explanatory variables with the ARCC outcomes measures were computed using
Pearson’s r. The race and ethnicity categories were excluded from the analysis due to
their categorical nature. The significance of the estimate of association is noted with an
observed probability of p < .01. Table 5 presents the linear correlation coefficients of the
explanatory variables with the ARCC performance indicators, including Global Rate.
Table 5
Linear Correlation Coefficients for Predictor Variables with ARCC Performance Indicators

                          Global        SPAR          Thirty-Unit                  Basic Skills   Basic Skills   Vocational
                                                      Comp          Persistence   Success        Imp            Success Rate
Income                     0.61 (0.00)   0.54 (0.00)   0.40 (0.00)   0.46 (0.00)   0.49 (0.00)    0.39 (0.00)    0.22 (0.02)
Educational Attainment     0.56 (0.00)   0.68 (0.00)   0.37 (0.00)   0.36 (0.00)   0.40 (0.00)    0.29 (0.00)    0.06 (0.57)
Population Density        -0.05 (0.63)  -0.10 (0.32)   0.07 (0.47)   0.04 (0.68)  -0.02 (0.87)   -0.15 (0.12)   -0.11 (0.28)
College Size               0.49 (0.00)   0.29 (0.00)   0.50 (0.00)   0.55 (0.00)   0.27 (0.01)    0.24 (0.01)    0.00 (0.97)
Female                     0.00 (0.99)   0.08 (0.42)   0.12 (0.22)   0.02 (0.85)   0.01 (0.94)   -0.21 (0.03)   -0.64 (0.00)
Nontrad Aged Students     -0.35 (0.00)  -0.25 (0.01)  -0.41 (0.00)  -0.49 (0.00)  -0.09 (0.38)   -0.13 (0.19)    0.32 (0.00)

Note. Observed probabilities are shown in parentheses.
n = 106
Source: California Community College Chancellor’s Office (2007-2010); United States Census Bureau (2002)
The results of the correlation analysis of the explanatory variables with the ARCC
performance metrics indicate that Income, Educational Attainment, and College Size
showed moderate correlations with each of the ARCC performance indicators, except
Vocational Success Rate. This parallels the findings in the inter-correlations of outputs
performed above, which showed Vocational Success Rate as the sole indicator in the
ARCC matrix that did not have strong associations with the other metrics. Population
Density did not manifest any linear correlation with any of the ARCC metrics. Similarly,
the Female input variable did not show any linear association with any of the outcome
measures, except for a moderate to high negative correlation with Vocational Success
Rate (r = -.64, p = .001). Nontraditional Aged displayed a moderate negative correlation
with Persistence (r = -.49, p = .001) and Thirty-Unit Completion (r = -.41, p = .001).
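The pairwise coefficients in Table 5 come from standard bivariate correlation; a minimal sketch of the computation is shown below, assuming a hypothetical college-level file and illustrative column names (none of these names come from the ARCC data set itself). It pairs each explanatory variable with each ARCC outcome and reports Pearson's r with its observed probability.

```python
import pandas as pd
from scipy import stats

# Hypothetical college-level file (one row per college, n = 106);
# all column names below are placeholders, not actual ARCC field names.
df = pd.read_csv("arcc_college_level.csv")

predictors = ["income", "educational_attainment", "population_density",
              "college_size", "female_pct", "nontrad_aged_pct"]
outcomes = ["global_rate", "spar", "thirty_unit_comp", "persistence",
            "basic_skills_success", "basic_skills_improvement", "voc_success_rate"]

rows = []
for p in predictors:
    for o in outcomes:
        r, prob = stats.pearsonr(df[p], df[o])   # Pearson's r and observed probability
        rows.append({"predictor": p, "outcome": o, "r": round(r, 2), "p": round(prob, 2)})

# Pivot into a predictor-by-outcome matrix of correlation coefficients
print(pd.DataFrame(rows).pivot(index="predictor", columns="outcome", values="r"))
```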
Summary
The summary descriptive statistics of the explanatory variables and ARCC performance indicators and the linear correlation coefficients among the variables indicate that (a) there is sufficient linearity to conduct meaningful multiple correlation analysis and (b) there is adequate variation in the explanatory variables to discriminate their cumulative effect on the ARCC indicators through multiple regression techniques.
Statistical and Multivariate Correlation Analysis
Research Question 1
What is the degree of random error present in the ARCC performance metrics
(SPAR, Thirty-Unit Completion, Persistence, Basic Skills Success Rate, Basic Skills
Improvement Rate, and Vocational Success Rate)? What is the temporal stability of the
performance indicators as measured by the test-retest reliability of the metrics? Are the
performance indicators internally consistent?
The usability of the ARCC performance indicators as measures of institutional
effectiveness depends on the confidence in the stability and strength of the true score
component of the observed metric rates. Deviation in performance on the indicators is
expected from year to year; however, the objective is to maximize the proportion of true-
score variation in the ARCC indicator and to reduce the variation due to random chance.
A change in score must closely reflect a true change in the presence or absence of the
construct of institutional effectiveness being measured by the indicator. Thus, the linear
correlation between “testings” identifies the temporal stability of the ARCC measures.
The lower the reliability coefficient, the greater the proportional effect of random error
and the less confidence possessed in the metric as an indicator of institutional
effectiveness.
Separate Pearson’s r linear correlation coefficients were computed for each of the
ARCC indicators for each of the four years of reporting. Importantly, rates for cohort
metrics that extended beyond one year (i.e., SPAR, Thirty-Unit Completion) were
distinguished to examine the different cohorts as separate “test administrations” for each
individual year. The significance of the estimate of association is noted with an observed
probability of p < .01. Tables 6 through 10 present the linear correlation coefficients (test-retest reliability) for SPAR, Persistence, Basic Skills Success, Basic Skills Improvement, and Vocational Success, respectively.
Table 6
Test-Retest Reliability for Student Progress and Achievement Rate
SPAR
(2001-2007)
SPAR
(2002-2008)
SPAR
(2003-2009)
SPAR
(2000-2006)
Correlation 0.95 0.92 0.91
Obs. Prob 0.01 0.01 0.01
SPAR
(2001-2007)
Correlation 0.94 0.92
Obs. Prob 0.01 0.01
SPAR
(2002-2008)
Correlation 0.94
Obs. Prob 0.01
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Table 7
Test-Retest Reliability for Persistence Rate

                                       Persistence Rate   Persistence Rate   Persistence Rate
                                       Fall 2005 to       Fall 2006 to       Fall 2007 to
                                       Fall 2006          Fall 2007          Fall 2008
Persistence Rate         Correlation   0.89               0.86               0.85
Fall 2004 to Fall 2005   Obs. Prob     0.01               0.01               0.01
Persistence Rate         Correlation                      0.93               0.84
Fall 2005 to Fall 2006   Obs. Prob                        0.01               0.01
Persistence Rate         Correlation                                         0.84
Fall 2006 to Fall 2007   Obs. Prob                                           0.01
n = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2002)
Table 8
Test-Retest Reliability for Basic Skills Success Rate
Basic Skills
Success Rate
2006-2007
Basic Skills
Success
Rate
2007-2008
Basic Skills
Success
Rate
2008-2009
Basic Skills
Success Rate 2005-
2006
Correlation 0.88 0.79 0.76
Observed Prob
0.01 0.01 0.01
Basic Skills
Success Rate 2006-
2007
Correlation
0.89 0.81
Observed Prob
0.01 0.01
Basic Skills
Success Rate 2007-
2008
Correlation
0.89
Observed Prob
0.01
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Table 9
Test-Retest Reliability for Basic Skills Improvement Rate
Basic Skills
Improvement Rate
2004-05 to 2005-07
Basic Skills
Improvement Rate
2005-06 to 2006-08
Basic Skills
Improvement Rate
2006-07 to 2007-09
Basic Skills
Improvement
Rate 2003-04
to 2005-06
Correlation 0.87 0.81 0.68
Observed Prob 0.01 0.01 0.01
Basic Skills
Improvement
Rate 2004-05
to 2006-07
Correlation
0.81 0.70
Observed Prob
0.01 0.01
Basic Skills
Improvement
Rate 2005-06
to 2007-08
Correlation
0.76
Observed Prob
0.01
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Table 10
Test-Retest Reliability for Vocational Success Rate
Vocational
Success Rate
2006-2007
Vocational
Success Rate
2007-2008
Vocational
Success Rate
2008-2009
Vocational
Success Rate
2005-2006
Correlation 0.97 0.93 0.88
Observed Prob 0.01 0.01 0.01
Vocational
Success Rate
2006-2007
Correlation 0.94 0.90
Observed Prob
0.01 0.01
Vocational
Success Rate
2007-2008
Correlation
0.95
Observed Prob 0.01
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
The results of the correlation analysis indicate a strong degree of temporal stability for each of the ARCC performance indicators across the four-year span that the
ARCC reporting system has been in place. The strongest test-retest reliability
coefficients were demonstrated for SPAR (r > .91, p = .01) and Vocational Success (r >
.88, p = .01).
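To illustrate how such test-retest coefficients can be produced, the following minimal sketch correlates the four "administrations" of a single metric across the 106 colleges; the file and column names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical wide file: one row per college, one column per SPAR cohort period.
spar = pd.read_csv("spar_by_cohort.csv")
cohorts = ["spar_2000_2006", "spar_2001_2007", "spar_2002_2008", "spar_2003_2009"]

# Test-retest reliability: Pearson r between every pair of cohort "administrations".
retest = spar[cohorts].corr(method="pearson")
print(retest.round(2))
```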
In addition to measuring the temporal stability of the metrics, the inter-item
consistency of the ARCC performance indicators was computed to determine the degree
of inter-relation of the metrics and the corresponding ability to predict similar results.
Coefficient alphas were computed for each year of reporting of the ARCC and from year
to year. Table 11 displays the coefficient alphas for each performance indicator across
time.
Table 11
Inter-Item Consistency By Year Reporting ARCC
Alpha Number of Items
Year One ARCC Reporting 0.711 6
Corrected Item-Total Corr. Alpha (Item Deleted)
Thirty-Unit Completion 0.658 0.623
Basic Skills Success Rate 0.508 0.652
Persistence Rate 0.585 0.623
Vocational Success Rate 0.090 0.766
Basic Skills Improvement Rate 0.374 0.694
SPAR 0.520 0.646
Alpha No. of Items
Year Two ARCC Reporting 0.712 6
Corrected Item-Total Corr. Alpha (Item Deleted)
Thirty-Unit Completion 0.651 0.631
Basic Skills Success Rate 0.493 0.658
Persistence Rate 0.541 0.641
Vocational Success Rate 0.089 0.766
Basic Skills Improvement Rate 0.468 0.665
SPAR 0.519 0.648
Alpha No. of Items
Year Three ARCC Reporting 0.698 6
Corrected Item-Total Corr. Alpha (Item Deleted)
Thirty-Unit Completion 0.521 0.67
Basic Skills Success Rate 0.679 0.646
Persistence Rate 0.548 0.661
Vocational Success Rate 0.081 0.78
Basic Skills Improvement Rate 0.574 0.657
SPAR 0.455 0.69
Alpha Number of Items
Year Four ARCC Reporting 0.727 6
Corrected Item-Total Corr. Alpha (Item Deleted)
Thirty-Unit Completion 0.658 0.623
Basic Skills Success Rate 0.508 0.652
Persistence Rate 0.585 0.623
Vocational Success Rate 0.09 0.766
Basic Skills Improvement Rate 0.374 0.694
SPAR 0.52 0.646
Table 12
Inter-Item Consistency Year-to-Year
Alpha
No. of Items
SPAR 0.981 4
Corrected Item-Total Corr. Alpha (Item Deleted)
SPAR Yr 1
0.945 0.976
SPAR Yr 2
0.960 0.973
SPAR Yr 3
0.959 0.973
SPAR Yr 4
0.943 0.977
Alpha
No. of Items
Thirty-Unit Completion 0.964 4
Corrected Item-Total Corr. Alpha (Item Deleted)
Thirty-Unit Comp. Yr 1
0.912 0.952
Thirty-Unit Comp. Yr 2
0.943 0.943
Thirty-Unit Comp. Yr 3
0.902 0.956
Thirty-Unit Comp Yr 4
0.892 0.958
Alpha
Number of Items
Persistence
0.962 4
Corrected Item-Total Corr. Alpha (Item Deleted)
Persistence Yr 1
0.911 0.951
Persistence Yr 2
0.931 0.943
Persistence Yr 3
0.922 0.945
Persistence Yr 4
0.873 0.960
Alpha
Number of Items
Basic Skills Success Rate 0.954 4
Corrected Item-Total Corr. Alpha (Item Deleted)
Basic Skills Success Rate Yr 1
0.849 0.950
Basic Skills Success Rate Yr 2
0.918 0.930
Basic Skills Success Rate Yr 3
0.917 0.930
Basic Skills Success Rate Yr 4
0.864 0.946
Table 12
Inter-Item Consistency Year-to-Year (continued)
Alpha
Number of Items
Basic Skills Improvement Rate 0.928 4
Corrected Item-Total Corr. Alpha (Item Deleted)
Basic Skills Improvement Yr 1
0.855 0.899
Basic Skills Improvement Yr 2
0.868 0.895
Basic Skills Improvement Yr 3
0.867 0.897
Basic Skills Improvement Yr 4
0.753 0.935
Alpha
Number of Items
Vocational Success Rate 0.981 4
Corrected Item-Total Corr. Alpha (Item Deleted)
Vocational Success Rate Yr 1
0.949 0.975
Vocational Success Rate Yr 2
0.965 0.971
Vocational Success Rate Yr 3
0.965 0.971
Vocational Success Rate Yr 4
0.926 0.981
The inter-item consistency of the ARCC performance indicators from year to year demonstrated strong internal consistencies across time (α > .92) for each of the metrics. The inter-item consistency coefficients for the ARCC performance indicators with each other for each individual year of the report were not as robust but still relatively high (α ranging from .70 to .73), with strong corrected item-total correlations.
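Coefficient alpha itself follows directly from the college-by-item matrix using the standard formula α = [k/(k − 1)] × (1 − sum of item variances / variance of the total score). The sketch below, with placeholder file and column names, treats either the six indicators within a reporting year or the four yearly values of a single indicator as the "items."

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Coefficient alpha for a set of item columns measured on the same colleges."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Within-year consistency: the six ARCC indicators for one reporting year
# (column names are illustrative placeholders).
year_one = pd.read_csv("arcc_year1.csv")[["spar", "thirty_unit_comp", "persistence",
                                          "voc_success_rate", "basic_skills_success",
                                          "basic_skills_improvement"]]
print(round(cronbach_alpha(year_one), 3))

# Year-to-year consistency for a single metric: the four annual SPAR columns as items.
spar_years = pd.read_csv("spar_by_cohort.csv")[["spar_2000_2006", "spar_2001_2007",
                                                "spar_2002_2008", "spar_2003_2009"]]
print(round(cronbach_alpha(spar_years), 3))
```

Either call reproduces the kind of within-year and year-to-year coefficients reported in Tables 11 and 12.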
Summary
The high reliability coefficients for each of the metrics indicate strong temporal
stability in the ARCC indicators and that the proportional effect of random error as a
cause for the observed variation in the scores has been acceptably reduced. In addition,
the inter-item correlation of the metrics for each reporting year and for each metric from
year to year demonstrates that the ARCC performance indicators have strong internal
consistency within and between years. As a result, the confidence in the ARCC
indicators as representing the true values being measured is justified, and the metrics can
be considered fairly reliable measures of institutional effectiveness when summed within
and between years.
Research Question 2
Are the ARCC performance indicators, SPAR, Persistence, and Thirty-Unit
Completion, subject to systematic measurement error? If so, what are the confounding
factors that impair assessment of institutional effectiveness by consistently and artificially
inflating or deflating scores among colleges?
In addition to reducing the effects of random error on observed scores on the
ARCC indicators, the usability of the ARCC metrics as measures of institutional
effectiveness depends on the minimization of the effect of external factors that
systematically affect observed results but are irrelevant to the construct of institutional
effectiveness. Group differences in scores must be attributable, to the maximum extent
possible, to real differences in the presence or absence of the construct.
To this end, multivariate correlation analysis was performed, regressing SPAR,
Persistence, and Thirty-Unit Completion on the explanatory variables. The objective of
this stage of the study was to determine whether socioeconomic factors predict
institutional performance on SPAR, Persistence, and Thirty-Unit Completion.
Specifically, multiple regression procedures were completed for (a) SPAR, Persistence,
and Thirty-Unit Completion for every reporting year and (b) each performance indicator
as an aggregate of the four years.
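A minimal sketch of one such regression is given below, using ordinary least squares in statsmodels under the assumption of a merged college-level file; every file and column name is a placeholder, and the ten predictors simply mirror those named in the study.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical merged file: ARCC outcomes plus census-derived predictors, one row per college.
df = pd.read_csv("arcc_with_predictors.csv")

predictors = ["income", "population_density", "college_size", "nontrad_aged_pct",
              "female_pct", "african_american_pct", "hispanic_pct",
              "educational_attainment", "asian_pct", "other_pct"]

X = sm.add_constant(df[predictors])   # add the intercept term
y = df["spar_four_year"]              # SPAR aggregated over the four reporting years

model = sm.OLS(y, X).fit()
print(model.summary())                # R-squared, adjusted R-squared, F statistic, coefficients
print("Multiple R:", round(model.rsquared ** 0.5, 3))
```

The summary output contains the same kinds of quantities reported in Table 13: the coefficient of determination, its adjusted value, the ANOVA F, and the coefficient table.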
SPAR
The multiple correlation coefficient (R) for the regression of SPAR on the 10 predictor variables was .869, F(10, 95) = 29.266, p = .001. The R² and adjusted R² were .755 and .729, respectively. The nonsignificant indicators were Income, Population Density, College Size, Female, Asian, and Other. High SPARs were significantly associated with higher Educational Attainment and fewer Nontraditional Aged, African American, and Hispanic students.
Table 13 presents the multiple correlation coefficient (R), the coefficient of determination (R²), the adjusted R², the coefficients for each of the explanatory variables, the ANOVA results, and the statistical significance of each effect. The multiple regression coefficient for each year of SPAR paralleled the results for SPAR as an aggregate for the four years of reporting. Appendices A1 through A4 contain the multiple correlation coefficient (R), the coefficient of determination (R²), the adjusted R², the coefficients for each of the explanatory variables, the ANOVA results, and the statistical significance of each effect for each year of SPAR reporting.
Table 13
Multiple Regression of Student Progress and Achievement Rate (SPAR)
Four-Year Reporting Period
Regression
R R
Squared
Adj. R Squared SE of the
Estimate
0.869 0.755 0.729 3.856
Anova
Sum of
Squares
df Mean
Square
F
Regression
4352.444
10.000
435.244 29.266
Residual
1412.850
95.000
14.872
Total
5765.294
105.000
Unstandardized Standardized
Coefficients
Coefficients
B SE β t Sig.
(Constant)
45.717 6.093
7.503 0.000
Income
0.000 0.000 0.102 1.289 0.200
Population
Density
0.000 0.000 0.006 0.089 0.929
College Size
0.000 0.000 0.069 1.041 0.301
Nontrad Aged
Students
-17.890 4.829 -0.246 -3.705 0.000
Female
10.332 6.178 0.099 1.672 0.098
African
American
-21.281 4.081 -0.301 -5.215 0.000
Hispanic
-18.446 2.936 -0.399 -6.283 0.000
Educational
Attainment
2.899 0.803 0.293 3.609 0.000
Asian
11.574 5.060 0.157 2.287 0.024
Other
2.484 11.952 0.012 0.208 0.836
N = 106
Persistence
The multiple correlation coefficient (R) for the regression of Persistence on the 10 predictor variables was .778, F(10, 95) = 14.538, p = .001. The R² and adjusted R² were .605 and .563, respectively. The nonsignificant indicators were Income, Population Density, Hispanic, Educational Attainment, Female, Asian, and Other. High Persistence rates were significantly associated with larger College Size and fewer Nontraditional Students and African American students. Table 14 displays the multiple correlation coefficient (R), the coefficient of determination (R²), the adjusted R², the coefficients for each of the explanatory variables, the ANOVA results for Persistence as an aggregate, and the statistical significance of each effect. The multiple regression coefficient for each year of Persistence paralleled the results for Persistence as an aggregate for the four years of reporting. Appendices B1 through B4 contain the multiple correlation coefficient (R), the coefficient of determination (R²), the adjusted R², the coefficients for each of the explanatory variables, the ANOVA results, and the statistical significance of each effect for each year of Persistence reporting.
Table 14
Multiple Regression of Student Persistence Rate Four-Year Reporting Period

Regression
R       R Squared   Adj. R Squared   SE of the Estimate
0.778   0.605       0.563            5.225

Anova
             Sum of Squares   df    Mean Square   F
Regression   3968.295         10    396.830       14.538
Residual     2593.122         95    27.296
Total        6561.417         105

                         Unstandardized Coefficients   Standardized Coefficients
                         B          SE        β          t         Sig.
(Constant)               63.069     8.255                7.640     0.000
Income                   0.000      0.000     0.235      2.335     0.022
Population Density       0.000      0.000     0.012      0.154     0.878
College Size             0.000      0.000     0.268      3.195     0.002
Nontrad Aged Students    -24.010    6.542     -0.309     -3.670    0.000
Female                   5.815      8.370     0.052      0.695     0.489
African American         -19.563    5.529     -0.259     -3.539    0.001
Hispanic                 2.776      3.977     0.056      0.698     0.487
Educational Attainment   0.453      1.088     0.043      0.416     0.678
Asian                    12.518     6.856     0.160      1.826     0.071
Other                    17.049     16.192    0.079      1.053     0.295
n = 106
Thirty-Unit Completion
The multiple correlation coefficient (R) for the regression of Thirty-Unit Completion on the 10 predictor variables was .771, F(10, 95) = 13.886, p = .001. The R² and adjusted R² were .594 and .551, respectively. The nonsignificant indicators were Income, Population Density, Hispanic, Educational Attainment, Asian, and Other. High Thirty-Unit Completion rates were significantly associated with larger College Size, more Female students, and fewer Nontraditional and African American students. Table 15 presents the multiple correlation coefficient (R), the coefficient of determination (R²), the adjusted R², the coefficients for each of the explanatory variables, the ANOVA results for Thirty-Unit Completion, and the statistical significance of each effect.
Table 15
Multiple Regression of Thirty-Unit Completion Rate Four-Year Reporting Period
Regression
R R
Squared
Adj. R Squared SE of the
Estimate
0.771 0.594 0.551 3.221
Anova
Sum of
Squares
df Mean Square F
Regression
1440.357
10
144.036
13.886
Residual
985.434
95
10.373
Total
2425.791
105
Unstandardized
Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant)
62.433 5.089 12.269 0.000
Income
0.000 0.000 0.192 1.881 0.063
Population
Density
0.000 0.000 0.127 1.578 0.118
College Size
0.000 0.000 0.262 3.082 0.003
Nontrad Aged
Students
-10.632 4.033 -0.225 -2.636 0.010
Female
14.822 5.160 0.218 2.872 0.005
African
American
-21.089 3.408 -0.460 -6.188 0.000
Hispanic
-0.965 2.452 -0.032 -0.394 0.695
Educational
Attainment
0.541 0.671 0.084 0.807 0.422
Asian
3.531 4.226 0.074 0.835 0.406
Other
-8.673 9.982 -0.066 -0.869 0.387
N = 106
The multiple regression coefficient for each year of Thirty-Unit Completion paralleled the results for Thirty-Unit Completion as an aggregate for the four years of reporting. Appendices C1 through C4 contain the multiple correlation coefficient (R), the coefficient of determination (R²), the adjusted R², the coefficients for each of the explanatory variables, the ANOVA results, and the statistical significance of each effect for each year of Thirty-Unit Completion reporting.
Summary
The regression of SPAR, Persistence, and Thirty-Unit Completion, respectively,
on the 10 predictor variables displayed high multiple regression coefficients and
coefficients of determination, which demonstrates that socioeconomic factors are highly
predictive in the aggregate of ARCC performance. The extent to which socioeconomic
variables explain the likely performance on the measures significantly calls into question
the confidence that the indicators are successfully mapping on to the idea of institutional
effectiveness. As a result, it is difficult to separate the location of the institution, the income and educational attainment of the residents, and the student population makeup from what the institution is actually doing to meet its pledge of academic quality.
In conclusion, the multivariate correlation analysis indicates that SPAR, Persistence, and Thirty-Unit Completion are subject to significant levels of systematic error and that the metrics, while temporally stable, measure other factors that are irrelevant to the construct of institutional effectiveness. Thus, the consideration of a method to control for the socioeconomic variables and derive a more precise measure of the differences among institutions is appropriate.
Research Question 3
If the ARCC performance indicators, SPAR, Persistence, and Thirty-Unit
Completion, are subject to systematic error, is there a viable method to control for the
confounding factors to make more meaningful comparisons among colleges about
institutional effectiveness?
To remove the external, irrelevant factors that have a systematic effect on the
observed ARCC results on SPAR, Persistence, and Thirty-Unit Completion, residuals
generated from the multiple regression were standardized for relative comparison. Test-
retest reliability analysis was performed and linear correlation coefficients (Pearson’s R)
computed to determine the temporal stability of the residuals derived from the regression
over the four years of ARCC reporting. The significance of the estimate of association is
noted at the p < .01 level.
Measures of central tendency, standard deviation, skewness, and kurtosis were
computed for SPAR, Persistence, and Thirty-Unit Completion to determine the center,
variation, symmetry, and shape of the distributions. To determine the significance of the
observed symmetry and variability of the distributions, t-values were computed (critical
value = 2) by dividing the observed skewness and kurtosis by their respective standard
errors, to arrive at the test statistics.
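A minimal sketch of these test statistics is shown below. The standard-error formulas are the conventional ones (they reproduce the values of roughly .235 and .465 reported for n = 106 in the tables that follow), and the file and column names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def shape_t_values(x: pd.Series):
    """t-values for skewness and kurtosis: statistic divided by its standard error."""
    n = x.count()
    se_skew = np.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2.0 * se_skew * np.sqrt((n ** 2 - 1) / ((n - 3) * (n + 5)))
    # Compare each t-value against the critical value of 2.
    return x.skew() / se_skew, x.kurt() / se_kurt

residuals = pd.read_csv("spar_residuals.csv")["std_residual_2003_2009"]  # placeholder column
t_skew, t_kurt = shape_t_values(residuals)
print(round(t_skew, 2), round(t_kurt, 2))
```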
Case-wise diagnostics were run (± 1 SD) to determine which institutions over-
and under-perform on SPAR, Persistence, and Thirty-Unit Completion, controlling for
the cumulative impact of the socioeconomic variables. The institutions identified
demonstrate more extreme performance on the metrics and provide a basis for further examination of the practices and policies in place that give rise to certain levels of performance.
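The residual-based diagnostics can be sketched as follows, again with placeholder file and column names. The raw residuals from the regression are z-scored, which is one simple way to standardize them, and colleges falling more than one standard deviation above or below their predicted value are flagged.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("arcc_with_predictors.csv")   # hypothetical merged college-level file
predictors = ["income", "population_density", "college_size", "nontrad_aged_pct",
              "female_pct", "african_american_pct", "hispanic_pct",
              "educational_attainment", "asian_pct", "other_pct"]

fit = sm.OLS(df["spar_four_year"], sm.add_constant(df[predictors])).fit()

# Standardize the residuals so colleges can be compared on a common scale.
df["predicted"] = fit.fittedvalues
df["std_residual"] = (fit.resid - fit.resid.mean()) / fit.resid.std(ddof=1)

# Case-wise diagnostics: colleges more than 1 SD above or below their predicted value.
over = df[df["std_residual"] > 1].sort_values("std_residual", ascending=False)
under = df[df["std_residual"] < -1].sort_values("std_residual")
print(over[["college", "std_residual", "predicted", "spar_four_year"]])
print(under[["college", "std_residual", "predicted", "spar_four_year"]])
```

Sorting the flagged colleges by their standardized residual yields over- and under-performing lists of the kind shown in Tables 18, 21, and 24.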
SPAR
The results of the correlation analysis indicate a strong degree of temporal
stability of the SPAR residuals (r > .71, p = .01), which provides statistical support for the measurement of the effects of the explanatory variables. Table 16 shows the linear
correlation coefficients (test-retest reliability) for the standardized residuals for SPAR.
Table 16
Test-Retest Reliability of SPAR Residuals Over Four-Year Reporting Period
Standardized
Residual
2001-02 to 2006-07
Standardized
Residual
2002-03 to 2007-08
Standardized
Residual
2003-04 to 2008-09
Standardized
Residual 2000-01
to 2005-06
Correlation 0.83 0.75 0.71
Observed
Prob
0.01 0.01 0.01
Standardized
Residual 2001-02
to 2006-07
Correlation
0.80 0.74
Observed
Prob
0.01 0.01
Standardized
Residual 2002-03
to 2007-08
Correlation
0.80
Observed
Prob
0.01
N = 106
Descriptive analysis of the standardized residuals reveals that the mean closely
approximates the median for each year. The t-values for skewness indicate that the distribution for SPAR is approximately symmetrical, as no t-value exceeded the critical value of 2.
Figure 1 illustrates a sample distribution curve for the standardized residual for SPAR for the Year 4 reporting period.
Figure 1. Sample distribution of residual SPAR year 4.
Table 17 displays the descriptive statistics for the standardized residuals for
SPAR.
Table 17
Descriptive Statistics for Standardized Residuals SPAR
M Median SD Skewness SE
of
Skewness
t-value
for
Skewness
Kurtosis SE
of
Kurtosis
t-value
for
Kurtosis
Standardized
Residual
2000-01 to
2005-06
0 0.083 0.951 0.291 0.235 1.24 0.76 0.465 1.63
Standardized
Residual
2001-02 to
2006-07
0 0.038 0.951 0.075 0.235 0.32 0.301 0.465 0.65
Standardized
Residual
2002-03-to
2007-08
0 -0.014 0.951 -0.388 0.235 -1.65 -0.161 0.465 -0.35
Standardized
Residual
2003-04 to
2008-09
0 0.024 0.951 -0.396 0.235 -1.69 -0.15 0.465 -0.32
N=106
Table 18 displays the over-performing and underperforming institutions on the SPAR indicator over the four-year reporting period after the explanatory variables have been controlled. Appendices D1 through D4 display the over-performing and underperforming institutions on the SPAR indicator for each cohort year after the explanatory variables have been controlled.
Table 18
Top Ten Over and Under-Performing Institutions SPAR Four-Year Reporting Period
SPAR Over Four-Year Reporting Period
Over-Performing Institutions
Controlled Uncontrolled
College Stand. Residual
Predicted
Value Rate College SPAR
Barstow 2.30 45.13 54.00 De Anza 68.42
Reedley 1.86 47.91 55.06 Foothill 67.45
Oxnard 1.76 42.51 49.29 Irvine Valley 65.62
Irvine Valley 1.46 59.99 65.62 Diablo Valley 65.22
San Diego City 1.38 48.70 54.01 Moorpark 64.06
Napa Valley 1.33 50.06 55.20 Orange Coast 62.80
Allan Hancock 1.33 45.57 50.70 West Valley 62.60
Antelope Valley 1.31 49.25 54.31 Ohlone 62.23
Diablo Valley 1.31 60.18 65.22 San Mateo 60.78
Coastline 1.18 52.78 57.33 Saddleback 60.77
Underperforming Institutions
Controlled Uncontrolled
College
Stand.
Residual
Predicted
Value Rate College SPAR
-1.29 51.31 46.32 42.03
-1.34 52.96 47.80 41.42
-1.35 48.61 43.41 41.26
-1.47 56.95 51.26 40.45
-1.56 47.42 41.42 39.36
-1.63 50.43 44.16 38.62
-1.74 48.74 42.03 37.09
-1.98 50.87 43.23 37.07
-2.19 34.98 26.54 35.99
-2.37 55.23 46.08 26.54
N=106
Underperforming institution names are omitted.
Persistence
The results of the correlation analysis indicate a strong degree of temporal
stability of the Persistence residuals (r > .68, p = .01). Table 19 presents the linear
correlation coefficients (test-retest reliability) for the standardized residuals for
Persistence.
Table 19
Test-Retest Reliability of Persistence Residuals Over Four-Year Reporting
Period
Standardized
Residual Fall
2005 to Fall
2006
Standardized
Residual Fall
2006 to Fall
2007
Standardized
Residual Fall
2007 to Fall
2008
Standardized
Residual Fall
2004 to Fall 2005
Correlation
0.76
0.73
0.70
Observed Prob 0.01 0.01 0.01
Standardized
Residual Fall
2005 to Fall 2006
Correlation 0.86 0.68
Observed Prob 0.01 0.01
Standardized
Residual Fall
2006 to Fall 2007
Correlation 0.74
Observed Prob 0.01
N = 106
Descriptive analysis of the standardized residuals reveals that the mean closely
approximates the median. The t-values for skewness indicate that the distribution for
Persistence is negatively skewed for the second, third, and fourth year of ARCC
reporting. The distribution demonstrated leptokurtosis in the second year only (t = 4.65).
Figure 2 illustrates the sample distribution curve for the standardized residual for
Persistence Year 4 reporting period.
Figure 2. Sample distribution of standardized residual for Persistence year 4 reporting
period.
Table 20 displays the descriptive statistics for the standardized residuals for
the Persistence measure.
Table 20
Descriptive Statistics for Standardized Residuals for Persistence
M Median SD Skewness SE of
Skewness
t-value
for
Skewness
Kurtosis SE
of
Kurtosis
t-value
for
Kurtosis
Standardized
Residual
Fall 2004 to
Fall 2005
0 0.08 0.95 -0.43 0.24 -1.83 -0.22 0.47 -0.47
Standardized
Residual
Fall 2005 to
Fall 2006
0 0.14 0.95 -1.00 0.24 -4.25 2.16 0.47 4.65
Standardized
Residual
Fall 2006 to
Fall 2007
0 0.13 0.95 -0.63 0.24 -2.66 0.36 0.47 0.78
Standardized
Residual
Fall 2007 to
Fall 2008
0 0.08 0.95 -0.53 0.24 -2.26 0.40 0.47 0.85
N = 106
Table 21 displays the over-performing and underperforming institutions for the Persistence measure over the four-year reporting period after the explanatory variables have been controlled. Appendices E1 through E4 display the over-performing and underperforming institutions for the Persistence measure over each cohort period after the explanatory variables have been controlled.
Table 21
Top Ten Over- and Under-Performing Institutions Persistence Four-Year Reporting
Period
Persistence Over Four-Year Reporting Period
Over-Performing Institutions
Controlled Uncontrolled
College Stand. Residual
Predicted
Value Rate College Persistence
Las Positas
1.78 68.74 78.06
Las Positas
78.06
Gavilan
1.77 61.73 70.98
Orange Coast
77.78
Santa Ana
1.72 63.81 72.78
Evergreen Valley
77.13
Hartnell 1.47 63.40 71.06 Mt San Antonio 77.02
Napa Valley
1.41 61.79 69.15
Pasadena City
76.91
Cuyamaca 1.36 61.20 68.29 Moorpark 76.44
Allan Hancock 1.31 61.10 67.96 Diablo Valley 75.75
Long Beach City 1.25 68.57 75.11 Fullerton 75.63
Taft
1.22 53.30 59.66
El Camino
75.23
Contra Costa
1.08 61.75 67.37
Long Beach City
75.11
Underperforming Institutions
                     Controlled                                     Uncontrolled
College    Stand. Residual    Predicted Value    Rate               College    Persistence
           -1.22              73.22              66.86                         54.99
           -1.28              65.47              58.76                         54.55
           -1.38              67.04              59.85                         53.28
           -1.89              61.85              52.00                         52.42
           -1.94              63.40              53.28                         52.00
           -1.96              56.23              45.98                         47.20
           -2.27              81.37              69.53                         46.46
           -2.54              57.42              44.18                         45.98
           -2.66              60.35              46.46                         44.18
           -2.71              50.07              35.92                         35.92
N = 106
Thirty-Unit Completion
The results of the correlation analysis indicate a strong degree of temporal
stability of the Thirty-Unit Completion residuals (r > .69, p = .01). Table 22 presents the
linear correlation coefficients (test-retest reliability) for the standardized residuals for
Thirty-Unit Completion.
Table 22
Test-Retest Reliability of Thirty-Unit Residuals Over Four-Year Reporting Period
Standardized
Residual Fall 2005
to Fall 2006
Standardized
Residual Fall 2006
to Fall 2007
Standardized
Residual Fall 2007
to Fall 2008
Standardized
Residual Fall
2004 to Fall
2005
Correlation 0.85 0.69 0.70
Observed Prob 0.01 0.01 0.01
Standardized
Residual Fall
2005 to Fall
2006
Correlation 0.78 0.77
Observed Prob 0.01 0.01
Standardized
Residual Fall
2006 to Fall
2007
Correlation 0.72
Observed Prob 0.01
N = 106
Descriptive analysis of the standardized residuals reveals that the mean closely approximates the median. The t-values for skewness indicate that the distribution for Thirty-Unit Completion is approximately symmetrical except in the fourth year of ARCC reporting (t = -3.57). The distribution demonstrated leptokurtosis in the second, third, and fourth years (t = 2.31, 3.23, and 8.26, respectively).
Figure 3 presents the sample distribution curve for the standardized residual for Thirty-Unit Completion for the Year 4 reporting period.
Figure 3. Sample distribution curve for standardized residual for Thirty-Unit Completion
year 4.
Table 23 displays the descriptive statistics for the standardized residuals for
Thirty-Unit Completion.
Table 23
Descriptive Statistics for Standardized Residuals for Thirty-Unit Completion
M Median SD Skewness SE
of
Skewness
t-value
for
Skewness
Kurtosis SE
of
Kurtosis
t-value
for
Kurtosis
Standardized
Residual
2000-01 to
2005-06
0 0.05 0.95 -0.26 0.24 -1.09 0.48 0.47 1.02
Standardized
Residual
2001-02 to
2006-07
0 -0.02 0.95 -0.37 0.24 -1.57 1.07 0.47 2.31
Standardized
Residual
2002-03 to
2007-08
0 -0.04 0.95 -0.41 0.24 -1.76 1.50 0.47 3.23
Standardized
Residual
2003-04 to
2008-09
0 0.06 0.95 -0.84 0.24 -3.57 3.84 0.47 8.26
N=106
Table 24 displays the over-performing and underperforming institutions for the Thirty-Unit Completion measure after the explanatory variables have been controlled. Appendices F1 through F4 display the over-performing and underperforming institutions for the Thirty-Unit Completion measure for each cohort period after the explanatory variables have been controlled.
Table 24
Top Ten Over- and Under-Performing Institutions Thirty-Unit Completion Over
Four-Year Reporting Period
Thirty-Unit Over Four Reporting Year Period
Over-Performing Institutions
Controlled Uncontrolled
College
Stand.
Residual
Predicted
Value Rate College Thirty-Unit
Redwoods 2.31 68.842 76.28 De Anza 81.75
Gavilan 1.985 67.967 74.36 Orange Coast 79.43
Imperial Valley 1.802 71.057 76.86 Pasadena City 79.37
Glendale 1.576 74.151 79.23 Glendale 79.23
Chabot 1.535 68.542 73.49 Saddleback 76.96
Canyons 1.508 68.803 73.66 Imperial Valley 76.86
Southwest LA 1.453 58.739 63.42 Fullerton 76.64
Antelope Valley 1.432 68.529 73.14
Redwoods 76.28
De Anza 1.412 77.205 81.75 Cuesta 76.07
Cuesta 1.351 71.721 76.07 Mt San Antonio 75.79
Underperforming Institutions
                     Controlled                                     Uncontrolled
College    Stand. Residual    Predicted Value    Rate               College    Thirty-Unit
           -1.094             75.988             72.46                         63.42
           -1.361             60.52              56.14                         62.54
           -1.494             72.346             67.53                         62.49
           -1.528             72.147             67.23                         62.39
           -1.531             67.472             62.54                         61.67
           -1.786             70.904             65.15                         60.59
           -1.811             67.504             61.67                         58.81
           -1.893             64.911             58.81                         57.88
           -2.185             69.531             62.49                         56.14
           -3.944             68.207             55.51                         55.51
N = 106
Note: Underperforming college names are omitted.
Summary
Recasting the institutions along the regression line to control for the confounding factors enables more meaningful comparisons among colleges about institutional effectiveness. Once the irrelevant factors are controlled for, the total variation among the colleges can be partitioned into explained and unexplained components. The degree of predictive error, computed by subtracting the predicted value for the college on the ARCC indicator from the observed score on the performance indicator, creates a list of over- and underperforming institutions to benchmark improved performance.
Research Question 4
Is Persistence, as an intermediate milestone, a tipping or momentum point
indicator of the terminal student outcome of SPAR? Is Persistence a tipping or
momentum point indicator of the Thirty-Unit Completion intermediate milestone?
The final step in the analysis was to determine whether and to what extent
Persistence from fall to fall at a California community college, as an intermediate
milestone or tipping point, predicts subsequent success on SPAR and on Thirty-Unit
Completion. For the period of reporting available, two corresponding cohort periods
could be aligned with the output measures. First, Persistence Rate from fall 2002-2003
was aligned with the SPAR 2002-2008 cohort period. Additionally, Persistence Rate
from fall 2003-2004 was aligned with the SPAR cohort period of 2003-2009.
For Thirty-Unit Completion, Persistence Rate from fall 2002-2003 was aligned
with the Thirty-Unit Completion 2002-2008 cohort period. Additionally, Persistence
Rate from fall 2003-2004 was aligned with the Thirty-Unit Completion cohort period of
2003-2009. Thus, linear regression of SPAR on Persistence was completed for each of
the two possible correlation periods. Similarly, a linear regression of Thirty-Unit Completion on Persistence was completed for each of the two possible periods in this combination of metrics.
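Each of these simple regressions reduces to a single call in standard statistical software; the sketch below assumes a hypothetical file holding the aligned cohort columns (the names are illustrative only).

```python
import pandas as pd
from scipy import stats

df = pd.read_csv("aligned_cohorts.csv")   # hypothetical file with aligned cohort columns

# Persistence (fall 2002 to fall 2003) as the predictor of SPAR for the 2002-2008 cohort.
x = df["persistence_2002_2003"]
y = df["spar_2002_2008"]
result = stats.linregress(x, y)

print(f"R = {result.rvalue:.3f}, R-squared = {result.rvalue ** 2:.3f}, p = {result.pvalue:.3f}")
print(f"slope = {result.slope:.3f}  (expected change in SPAR per unit change in Persistence)")
```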
Persistence as a Predictor of SPAR
The correlation coefficient (R) for the regression of SPAR 2002-2008 on Persistence 2002-2003 was .481, F(1, 104) = 31.428, p = .001. The R² and adjusted R² were .232 and .224, respectively. Table 25 presents the linear correlation coefficient (R), the coefficient of determination (R²), and the adjusted R² for the linear regression of SPAR on Persistence for this relevant cohort period. By comparison, the R for the regression of SPAR 2003-2009 on Persistence 2003-2004 was .474, F(1, 104) = 30.247, p = .001. The R² and adjusted R² were .225 and .218, respectively.
Table 25
Linear Regression of SPAR 2002-2008 on Persistence 2002-2003
Regression
R R Squared
Adjusted
R
Squared
Std. Error
of the
Estimate
0.481736 0.23207 0.224686 0.066544
Anova
Sum of
Squares
df Mean
Square
F Sig
Regression 0.13917 1 0.13917 31.42894 0.001
Residual 0.460521 104 0.004428
Total 0.599691 105
N = 106
Table 26 displays the linear correlation coefficient (R), the coefficient of determination (R²), and the adjusted R² for the linear regression of SPAR on Persistence for this relevant cohort period.
Table 26
Linear Regression of SPAR 2003-2009 on Persistence 2003-2004
Regression
R R
Squared
Adjusted R
Squared
Std. Error of the
Estimate
0.474664 0.225306 0.217857 0.07064
Anova
Sum of
Squares
df Mean
Square
F Sig
Regression 0.150932 1 0.150932 30.24656 0.001
Residual 0.518967 104 0.00499
Total 0.669899 105
N = 106
Because Persistence is a predictor of SPAR, the next step in the process is to plot
the institutions on a scatter diagram in the form of a BGD (Goldschmidt & Hocevar,
2004). The BGD reveals the institutions with similar Persistence rates and the predicted and observed outcome of SPAR for the same relevant cohort period. Figure 4 is a BGD of Persistence 2002-2003 and SPAR 2002-2008.
Figure 4. BGD of Persistence 2002-2003 with SPAR 2002-2008.
Persistence as a Predictor of Thirty-Unit Completion
The correlation coefficient (R) for the regression of Thirty-Unit Completion 2002-2008 on Persistence 2002-2003 was .650, F(1, 104) = 76.230, p = .001. The R² and adjusted R² were .423 and .417, respectively. Table 27 presents the linear correlation coefficient (R), the coefficient of determination (R²), and the adjusted R² for the linear regression of Thirty-Unit Completion on Persistence for this relevant cohort period. By comparison, the R for the regression of Thirty-Unit Completion 2003-2009 on Persistence 2003-2004 was .692, F(1, 104) = 95.568, p = .001. The R² and adjusted R² were .479 and .474, respectively.
Table 27
Linear Regression of Thirty-Unit Completion 2002-2008 on Persistence 2002-2003
Regression
R R
Squared
Adjusted R
Squared
Std. Error of
the Estimate
0.650353 0.422959 0.41741 0.035916
Anova
Sum of
Squares
df Mean
Square
F Sig
Regression 0.098334 1 0.098334 76.22981 0.001
Residual 0.134157 104 0.001290
Total 0.232492 105
N = 106
Table 28 displays the linear correlation coefficient (R), the coefficient of determination (R²), and the adjusted R² for the linear regression of Thirty-Unit Completion on Persistence for this relevant cohort period.
Table 28
Linear Regression of Thirty-Unit Completion 2003-2009 on Persistence 2003-2004
Regression
R R
Squared
Adjusted
R Squared
Std. Error of
the Estimate
0.692007 0.478874 0.473863 0.037729
Anova
Sum of
Squares
df Mean
Square
F Sig
Regression 0.136037 1 0.136037 95.56793 0.001
Residual 0.148040 104 0.001423
Total 0.284076 105
N = 106
Because Persistence is a predictor of Thirty-Unit Completion, the next step in the process is to plot the institutions on a scatter diagram in the form of a BGD. The BGD reveals the institutions with similar Persistence rates and the predicted and observed outcome of Thirty-Unit Completion for the same relevant cohort period.
Figure 5 is a binomial graphic display of Persistence 2002-2003 and Thirty-Unit Completion 2002-2008.
Figure 5. BGD of Persistence 2002-03 with Thirty-Unit Completion 2002-2008.
Summary
Persistence, as an intermediate milestone, is a tipping point indicator of SPAR because it explains approximately 23% of the common variance in the two measures. Additionally, Persistence, as an intermediate milestone, is a tipping point indicator of Thirty-Unit Completion because it explains approximately 42% of the common variance in the two measures. Based on this, the examination of the college policies and practices that affect Persistence at an institution will likely yield a movement in the SPAR or Thirty-Unit Completion indicator. Institutions can examine the near-term indicator as a way to predict future success on the more terminal student educational objectives.
CHAPTER 5
DISCUSSION
The purpose of this study was to examine whether and to what extent college-
level student success performance indicators used to measure institutional effectiveness
are prone to measurement error. In the past 25 years, policymakers and educational
leaders have strengthened their call to postsecondary institutions to demonstrate the value
that they add to student academic, social, and affective development. The answer to the
call for greater academic quality in higher education has taken a decidedly quantitative
direction. Metrics such as degree and certificate completion rates, clearance of
achievement or unit thresholds, and employment or licensure pass rates act as proxies for
institutional effectiveness. Advancements in database technology have increased access
to student data, and performance measures include both intermediate milestones and
traditional student terminal outcomes. The causal inference drawn from the use of these
measures is that higher performance on the metrics translates into greater institutional
quality.
Institutional Effectiveness
The concept of institutional effectiveness has developed a significant degree of
abstraction due to the varied use of the term and divergent perspectives of higher
education stakeholders (Ewell, 2011; Head, 2011). Measuring institutional effectiveness
has proven particularly troublesome for community colleges due to the multiple missions
of two-year institutions. The usability of the performance indicators depends on the
stability of the metrics over time and the internal consistency within measures when
summed both within and between reporting periods. The more replicable the
performance indicators are, the greater the operationalization of the idea. Phrased
differently, the more the measures stably and consistently map onto the construct of
institutional effectiveness, the better the measurement of the concept and the better the
fit.
Six performance indicators implemented by ARCC were analyzed for temporal
stability and internal consistency. Each metric is designed to measure one of the many
functions of a community college. Specifically, the indicators included:
1. Student Progress and Achievement Rate (SPAR);
2. Percentage of Students Who Earned at Least 30 Units (Thirty-Unit Completion);
3. Persistence Rate (Persistence);
4. Annual Successful Course Completion Rate for Credit-Based Vocational Courses
(Vocational Success Rate);
5. Annual Successful Course Completion Rate for Credit-Based Basic Skills
Courses (Basic Skills Success Rate); and
6. Improvement Rates for Credit-Based Basic Skills Courses (Basic Skills
Improvement Rate).
The presence and degree of error in prediction were examined, in particular, for
SPAR, Persistence, and Thirty-Unit Completion. SPAR was selected due to its role as a
general or “g” factor measure of student attainment of stated educational objectives.
Persistence was chosen based on its role as a momentum or tipping point to reaching
SPAR as a terminal educational outcome. Finally, Thirty-Unit Completion was chosen
because it is a measure of gainful employment and because of its potential role as an
additional tipping point indicator for subsequent exit outcomes. Importantly, statistical
techniques and analytical methods were employed to improve the degree of replicability
of the measures with the goal of enabling higher education leaders to make more
trustworthy and credible decisions based on the results.
Using the principles of CMT, the researcher divided observed scores on the
performance metrics into two component parts: true score and error (Crano & Brewer,
2004). The true-score component of the observed ARCC indicators is the replicable
characteristic of institutional effectiveness of each college, as measured by the metric.
Error represents the deviation or variation in scores due to chance or systematically due
to factors irrelevant to institutional effectiveness. Theoretically, changes in the true score
component of the ARCC rate should take place only when some real change has occurred
in the college’s performance. The degree of observed deviation in the scores over
reporting periods is a function of the random error contained in ARCC as a measurement
instrument. The extent of similarity between indicator rates from period to period reveals
the metric’s temporal stability measured in terms of test-retest reliability. Thus,
correlation coefficients were computed to measure the “true score” variation in the
outcomes measures. The lower the coefficient, the greater the proportional effect that
extraneous factors play in the observed score differences. As an added measure of
reliability, the internal consistency of the ARCC indicators was determined using
Cronbach’s coefficient alpha by computing an average of all of the possible ways of
splitting “items.”
The presence and degree of systematic error in the ARCC performance indicators
were analyzed using multivariate correlation methods. Explanatory variables irrelevant
to the measure were correlated and the magnitude and degree of relationship noted to
predict performance on SPAR, Persistence, and Thirty-Unit Completion. Predictor
variables used to explain performance on the ARCC measures included population
density, income level and educational attainment of the community, college size, race,
ethnicity, gender, and age of students. Explanatory overlap or multicollinearity among
the explanatory variables was controlled to produce the best combination of predictor
variables for the ARCC outcomes measures. The multiple regression technique was
selected for this study due to the quantity of data generated on the variables and the
flexibility and strength of the findings, including both the size and significance of the
relationships.
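The study does not spell out the specific screening procedure used for this step; one conventional diagnostic for this kind of explanatory overlap is the variance inflation factor, sketched below under the same hypothetical file and column names used earlier.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

predictors = ["income", "population_density", "college_size", "nontrad_aged_pct",
              "female_pct", "african_american_pct", "hispanic_pct",
              "educational_attainment", "asian_pct", "other_pct"]
X = sm.add_constant(pd.read_csv("arcc_with_predictors.csv")[predictors])

# A VIF above roughly 5-10 is a conventional signal of problematic multicollinearity;
# the VIF for the constant term is not meaningful and can be ignored.
vifs = pd.Series([variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
                 index=X.columns)
print(vifs.round(2))
```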
Once the irrelevant factors were controlled for, the total variation among the colleges was partitioned into explained and unexplained components. The degree of predictive error was computed by subtracting the predicted value for the college on the ARCC indicator from the observed score on the performance indicator. The residual was standardized to show relative standing among institutions. Over- and underperforming institutions were identified to benchmark peer institutions for improved performance.
Based on the problem identification and purpose of the study, four research questions, which guided this study, were developed:
1. What is the degree of random error present in the ARCC performance metrics
(SPAR, Thirty-Unit Completion, Persistence, Basic Skills Success Rate, Basic
Skills Improvement Rate, and Vocational Success Rate)? What is the temporal
stability of the performance indicators as measured by the test-retest reliability of
the metrics? Are the performance indicators internally consistent?
2. Are the ARCC performance indicators, SPAR, Persistence, and Thirty-Unit
Completion, subject to systematic measurement error? If so, what are the
confounding factors that impair assessment of institutional effectiveness by
consistently and artificially inflating or deflating scores among colleges?
3. If the ARCC performance indicators, SPAR, Persistence, and Thirty-Unit
Completion, are subject to systematic error, is there a viable method to control for
the confounding factors to make more meaningful comparisons among colleges
about institutional effectiveness?
4. Is Persistence, as an intermediate milestone, a tipping or momentum point
indicator of the terminal student outcome of SPAR? Is Persistence a tipping or
momentum point indicator of the Thirty-Unit Completion intermediate milestone?
Summary of the Findings
Temporal Stability
The test-retest reliability analysis resulted in high reliability coefficients for each
of the metrics, which demonstrated a strong temporal stability in the ARCC indicators
and, thus, minimized the proportional effect of random error as a cause for the observed
variation in the scores. In addition, the inter-item correlation of the metrics for each
reporting year and for each metric from year to year displayed high alpha coefficients for
the ARCC performance indicators, which suggested strong internal consistency within
and between reporting periods. As a result, there is satisfactory confidence in the ARCC
indicators as stable and consistent measures of institutional effectiveness.
Systematic Error
The regression of SPAR, Persistence, and Thirty-Unit Completion, respectively,
on the 10 predictor variables displayed high multiple regression coefficients and
coefficients of determination, which demonstrated that socioeconomic factors are highly
predictive in the aggregate of ARCC performance. The extent to which socioeconomic
variables explain likely performance on the measures significantly calls into question the
confidence that the indicators are successfully mapping on to the idea of institutional
effectiveness. As a result, it is difficult to separate the location of the institution, income
and educational attainment of the residents, and student population makeup from what
the institution is actually doing to meet its commitment to academic quality. Thus, the
use of analytic methods to tease out the impact of the socioeconomic variables and derive
a more precise measure of the differences among institutions is appropriate.
Persistence as a Tipping Point
Linear regression coefficients indicate that Persistence is an intermediate
milestone indicator by explaining approximately 23% of the common variance with
SPAR and approximately 42% of the common variance with Thirty-Unit Completion.
Thus, the examination of the college policies and practices that affect Persistence at an
institution will likely yield a movement in the SPAR or Thirty-Unit Completion
performance indicator. Institutions can evaluate the components and factors that affect
the near-term indicator as a way to predict future success on the more terminal student
educational objectives.
Implications
The findings have implications for (a) theory or generalization, (b) higher
education practice, and (c) future research.
Theory or Generalization
The temporal stability and the internal consistency of the performance indicators
were strong in the ARCC indicators over the four years of the system’s existence.
Importantly, the metrics are at such an aggregate level that the amount of institutional
change it would take to move a sufficient number of students into the numerator of the
outcome ratio to increase the rate in any significant way would be a marked challenge for
most two-year institutions. The ARCC indicators use system office data elements that provide efficient components for the measurement of the construct. As the data collection
procedures improve and the number of elements expand, greater refinement in the
definitions of the performance indicators might yield different results.
SPAR, for example, consists of a cohort of first-time students who earn a
minimum of 12 units, who have attempted a degree, certificate, or transfer course within
six years of entry, and who (a) earn an associate’s degree or certificate, (b) transfer to a
four-year institution, (c) complete both transfer-level math and English courses, or (d) complete 60 transferable units with a grade point average in excess of 2.0. Redefinition of SPAR by shortening the time to complete the outcome or changing the cohort unit threshold might affect the stability and consistency of the measure.
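To make the definition concrete, the sketch below computes a SPAR-like rate from a hypothetical student-level extract. Every column name is a placeholder rather than an actual system data element, and the thresholds simply mirror the definition above.

```python
import pandas as pd

# Hypothetical student-level extract; all column names are illustrative placeholders.
students = pd.read_csv("student_records.csv")

# Cohort: first-time students who earned at least 12 units and attempted a degree-,
# certificate-, or transfer-applicable course within six years of entry.
cohort = students[students["first_time"]
                  & (students["units_earned_6yr"] >= 12)
                  & students["attempted_deg_cert_transfer_course"]]

# A cohort member counts as a success if any one of the four outcomes is reached.
success = (cohort["earned_degree_or_certificate"]
           | cohort["transferred_to_four_year"]
           | (cohort["completed_transfer_math"] & cohort["completed_transfer_english"])
           | ((cohort["transferable_units"] >= 60) & (cohort["gpa"] > 2.0)))

# College-level SPAR-like rate: share of the cohort reaching any qualifying outcome.
spar_by_college = success.groupby(cohort["college_id"]).mean()
print(spar_by_college.round(3))
```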
Thirty-Unit Completion depends heavily on the degree of positive effect that the
identified units have on future earning potential. Different course identification or more
refined measurement of the gainful employment potential of the courses could likewise
affect stability and consistency. Similarly, a redefinition of the Thirty-Unit matrix, for
example, to the most common transfer courses taken might yield a different result, as
well.
The cohort for Persistence rate comprised first-time students with a minimum of 6
units earned in their first fall term who return and reenroll in the subsequent fall term
anywhere in the California Community College system. It is unclear whether the volatility of the measure, and thus its stability and consistency, would increase if the definition of Persistence were altered to span two fall terms or semester-to-semester enrollment, or if the unit threshold were to change. Clearly, the
performance indicators as defined are stable and consistent, but increased precision or
alternative definitions of the measures might yield lower reliability coefficients as
operationalization of institutional effectiveness becomes more refined or if time periods
for completion were to change.
Higher Education Practice
The use of residuals and the BGD provides three key advantages as a method of
investigation. First, the model provides an indicator of the degree of underperformance
relative to peer institutions. Second, the model provides a way to predict future success
on terminal educational outcomes by knowing the degree of success on more
intermediate milestones. Finally, the model offers a method to benchmark organizational
improvement by examining institutions that implement practices that result in over-
performance, given student inputs. Colleges can reverse-engineer the success of peers by
examining the practices and policies in place that yield positive results.
For example, Table 29 displays the Over-performing institutions for the
Persistence measure for the four-year reporting period of ARCC. Examination of the
table reveals that two of the ten institutions, Long Beach City College (LBCC) and Las
Positas College (LPC), over-performed on this metric in both controlled and uncontrolled
settings. These data suggest that these two institutions might be worthy of examination
to conduct further action research, perhaps through technical assistance models, to
discover what these institutions are doing to re-enroll students in California community
colleges. The remaining eight institutions in the top ten are exceeding expected achievement for reasons other than the socioeconomic factors of the service area where the colleges are located. Further investigation of these institutions might reveal practices that could be transferred to other colleges that experience less success in re-enrolling students. Counterfactual analysis of the underperforming institutions would reveal whether the same or similar practices are in place at poorly performing colleges, to better triangulate whether the practices are associated with positive institutional performance on student persistence.
Table 29
Over-Performing Institutions on Persistence Measure over the Four-Year
Reporting Period
Over-Performing Institutions
Controlled
Uncontrolled
College
Stand.
Residual
Predicted
Value Rate
College Persistence
Las Positas
1.784 68.739 78.06
Las Positas 78.06
Gavilan
1.772 61.726 70.98
Orange Coast 77.78
Santa Ana
1.717 63.805 72.78
Evergreen Valley 77.13
Hartnell
1.467 63.395 71.06
Mt San Antonio 77.02
Napa Valley
1.409 61.79 69.15
Pasadena City 76.91
Cuyamaca
1.357 61.196 68.29
Moorpark 76.44
Allan Hancock
1.313 61.096 67.96
Diablo Valley 75.75
Long Beach City
1.251 68.574 75.11
Fullerton 75.63
Taft
1.217 53.30 59.66
El Camino 75.23
Contra Costa
1.076 61.749 67.37
Long Beach City 75.11
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
The BGD also is a useful tool for examining the impact that changes in one metric
may have on another. Under the BGD model, college data on performance drivers
(inputs) and institutional targets (outputs) can be displayed graphically by plotting inputs on
the x-axis and outputs on the y-axis. In the case of the ARCC variables, Persistence as a
recognized tipping point metric would be plotted on the x-axis and SPAR would be
plotted on the y-axis as the output. The nonstandardized regression coefficient is the
expected change in SPAR, given a change of one unit in Persistence.
Figure 6 displays the BGD for the regression of SPAR for 2002-2008 on
Persistence Rate for 2002-2003.
Figure 6. Linear regression of SPAR for 2002-2008 on Persistence for 2002-2003.
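A BGD of this kind can be produced with a few lines of plotting code; the sketch below assumes the same hypothetical aligned-cohort file used earlier and overlays the fitted regression line on the college-level scatter.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

df = pd.read_csv("aligned_cohorts.csv")        # hypothetical aligned-cohort file
x, y = df["persistence_2002_2003"], df["spar_2002_2008"]
fit = stats.linregress(x, y)

# Binomial graphic display: the driver (Persistence) on x, the target (SPAR) on y,
# with the regression line giving the expected SPAR for a given Persistence rate.
plt.scatter(x, y, alpha=0.6)
xs = np.linspace(x.min(), x.max(), 100)
plt.plot(xs, fit.intercept + fit.slope * xs)
plt.xlabel("Persistence rate (fall 2002 to fall 2003)")
plt.ylabel("SPAR (2002-2008 cohort)")
plt.title("BGD of Persistence with SPAR")
plt.show()
```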
The slope of the regression line between Persistence and SPAR is an indicator of
the strength of Persistence as an intermediate measure of student success on the terminal
outcome measure of SPAR (Goldschmidt & Hocevar, 2004). The greater the slope of the
regression of SPAR on Persistence, the greater the chance that changing factors that impact Persistence might also yield a corresponding positive change in the SPAR
measure as a terminal or exit outcome for students.
Persistence as a metric can thus be analyzed and manipulated by a college in ways
that are logically consistent with reaching a higher SPAR target based on benchmarking
efforts. Increasing fall-to-fall reenrollment practices will likely have a positive impact on
future SPAR results based on the predictive value of Persistence on SPAR as noted in the
binomial graphic display.
Future Research
Recommendations for future research that are logical extensions of this study
include qualitative study of the institutions that over-perform on the metrics, in which
socioeconomic status is held constant. The quantitative nature of this study does not
address the potential reasons why the differences between institutions exist or how efforts
could be made by underperforming institutions to improve. Direct inquiry into college
practices from interviews, campus visits, and examination of college policies might provide insights into institutional success that could be shared with struggling institutions.
Another potential area for future research is an examination of the manner in
which the metrics are operationalized to discover what behavioral markers of success
students are actually attaining for inclusion as the positive result. For example, data
available through the Systems’ office can be used to tease out the number of students
securing degrees, certificates or being identified as transfer-directed or transfer-prepared
in the SPAR measure. Other research could include applying the techniques and methods used in this study to the performance indicators in other accountability systems, such as the VFA, to determine whether and to what extent random and systematic error is present in
those measures. The findings from such an extension of this study would add credibility
to the application of CMT to the use of performance indicators to a larger number of
settings.
Limitations
The design of the study and the method of data analysis present some limitations to the findings. First, the socioeconomic data used to characterize the particular colleges consisted of census data for the primary ZIP code where each college is located. While this method may provide accurate information about institutions that have distinct service areas, it loses power in areas such as Los Angeles, where students shop for course offerings at institutions outside their geographic location. Additionally, many students attend multiple institutions to match course offerings with personal and work schedules. For example, some colleges have extensive programs for military students who work at installations within the service area. Still other colleges have extensive online programs whose students reside across the state and nation, so the socioeconomic model may not match the actual student populations. Finally, the timing of the study fell between census periods; as a result, demographic data from the 2000 census were used, which may no longer reflect the current population of the service area.
The study addressed exogenous factors, those outside the control of the institutions, that affect ARCC outcomes. An examination of the practices and procedures in place at institutions would yield endogenous factors over which colleges could have some control. For example, examining the impact of practices such as mandatory assessment in math and English, the development of educational plans, or the extent of transfer center counseling might reveal which matriculation services, and in what combination, have the largest impact on improving student success rates and, thus, institutional effectiveness.
Conclusion
Measuring and reporting institutional effectiveness has become a national higher education priority, and the performance indicators that serve as proxies for academic quality must be temporally stable and internally consistent. High reliability coefficients for the ARCC performance metrics demonstrate strong temporal stability in the ARCC indicators, which minimizes the proportional effect of random error, or chance, as a cause of the observed variation in the scores. Inter-item correlations of the metrics within and between years displayed high alpha coefficients, which suggests strong internal consistency among the measures. Socioeconomic variables significantly predict college performance on SPAR, Persistence, and Thirty-Unit Completion, which reduces the fit that the indicators have with the concept of institutional effectiveness. The use of analytic tools such as multiple regression provides colleges with a method to reduce the effect that socioeconomic variables have on observed ARCC rates and offers more precision on how the metrics measure institutional effectiveness at their colleges.
Finally, the BGD provides a viable method for colleges to examine the likelihood that
changing the factors that constitute momentum point milestones such as Persistence
might change later SPAR results.
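As a rough sketch of the regression-adjustment idea summarized above, the code below regresses an observed outcome on socioeconomic predictors and ranks colleges by standardized residuals, in the spirit of the residual analyses reported in Appendices D through F. The file name and column names are assumptions for illustration only, not the study's actual data or procedure.

```python
# Sketch of residual analysis: regress an observed outcome (e.g., SPAR) on
# socioeconomic predictors, then rank colleges by standardized residuals so
# over- and under-performance is judged against a socioeconomically
# predicted value. File and column names are hypothetical placeholders.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("college_level_rates.csv")  # hypothetical college-level data set
predictors = ["income", "pct_nontraditional_age", "pct_female",
              "pct_african_american", "pct_hispanic", "educational_attainment"]

X = sm.add_constant(df[predictors])
model = sm.OLS(df["spar"], X).fit()

df["predicted"] = model.fittedvalues
# Simple standardization of residuals (residual / residual SD); a fuller
# treatment would use studentized residuals.
df["std_residual"] = model.resid / model.resid.std(ddof=1)

# Largest positive residuals: colleges outperforming their prediction;
# largest negative residuals: colleges underperforming it.
ranked = df.sort_values("std_residual", ascending=False)
print(ranked[["college", "predicted", "spar", "std_residual"]].head(10))
```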
REFERENCES
Accrediting Commission for Senior Colleges and Universities. (2002). Evidence guide: A
guide to using evidence in the accreditation process. Alameda, CA: Accrediting
Commission for Senior Colleges and Universities, Western Association of
Schools and Colleges. Retrieved from http://www.wascsenior.org/findit/files/
forms/EvidenceGuidejan_i-02.pdf
Alfred, R. L. (2011). The future of institutional effectiveness. New Directions for
Community Colleges, 153, 103-113.
Alfred, R. L., Shults, C., & Seybert, J. A. (2007). Core indicators of effectiveness for
community colleges. Washington, DC: Community College Press, American
Association of Community Colleges.
American Association of Community Colleges. (2012). Voluntary framework of
accountability. Retrieved from
http://www.aacc.nche.edu/resources/aaccprograms/vfaweb/default.aspx
American Association of State Colleges and Universities. (2012). The voluntary system of
accountability (VSA). Retrieved from http://www.voluntarysystem.org/index.cfm
Baldwin, C., Bensimon, E. M., Dowd, A. C., & Kleiman, L. (2011). Measuring student
success. In R. Head (Ed.), Institutional effectiveness (Vol. 153, pp. 75-88). San
Francisco, CA: Jossey-Bass.
Brennan, R. L. (Ed.). (2006). Educational measurement (4th ed.). Westport, CT:
ACE/Praeger.
Burke, J. C. (2004). Achieving accountability in higher education: Balancing public,
academic, and market demands. San Francisco, CA: Jossey-Bass.
Burke, J. C., & Minassians, H. (2003). Reporting higher education results: Missing links
in the performance chain. San Francisco, CA: Jossey-Bass. Retrieved from
http://www.rockinst.org/education/accountability_higher_ed.aspx
California Community College Chancellor’s Office. (2010). Focus on results. Retrieved
from
http://extranet.cccco.edu/Portals/1/TRIS/Research/Accountability/ARCC/ARCC
%202010,%20March%202010.pdf
Commission on Institutions of Higher Education. (2006). Standards for accreditation.
Bedford, MA: Author. Retrieved from
http://cihe.neasc.org/downloads/Standards/Standards_for_Accreditation_2006.pdf
Crano, W. D., & Brewer, M. B. (2004). Principles and methods of social research.
Mahwah, NJ: Lawrence Erlbaum.
Creswell, J. W. (2008). Research design: Qualitative, quantitative, and mixed methods
approaches. Thousand Oaks, CA: Sage.
Donaldson, S. I., Christie, C. A., & Mark, M. M. (2008). What counts as credible
evidence in applied research and evaluation practice? Thousand Oaks, CA: Sage.
Doucette, D., & Hughes, B. (Eds.). (1990). Assessing institutional effectiveness in
community colleges. Laguna Hills, CA: League for Innovation in the Community
College.
Dowd, A. C. (2005). Data don't drive: Building a practitioner-driven culture of inquiry
to assess community college performance. Retrieved from
http://cue.usc.edu/tools/Dowd_Data%20Don%27t%20Drive.pdf
European Consortium for Accreditation. (2007). Advancing mutual recognition of
accreditation decisions. Retrieved from
http://www.ecaconsortium.net/main/documents/publications#recognition
Ewell, P. T. (1993). The role of states and accreditors in shaping assessment practice. In
T. W. Banta (Ed.), Making a difference: Outcomes of a decade of assessment in
higher education (pp. 339-356). San Francisco, CA: Jossey-Bass.
Ewell, P. T. (2008a). Assessment and accountability in America today: Background and
context. New Directions for Institutional Research, S1, 7-17.
Ewell, P. T. (2008b). US accreditation and the future of quality assurance: A tenth
anniversary report from the Council for Higher Education Accreditation.
Washington, DC: Council for Higher Education Accreditation.
Ewell, P. T. (2011). Accountability and institutional effectiveness in the community
college. New Directions for Community Colleges, 153, 23-36.
Folger, J. K. (1977). Increasing the public accountability of higher education. San
Francisco, CA: Jossey-Bass. Retrieved from
http://books.google.com/books/about/Increasing_the_public_accountability_of.ht
ml?id=LKy0AAAAIAAJ
Gall, M. D., Gall, J. P., & Borg, W. R. (2007). Educational research: An introduction
(8th ed.). Boston, MA: Pearson/Allyn & Bacon.
Goben, A. (2007). FanSAStic! results: collaboratively leading institutional effectiveness
efforts in higher education institutions. Paper presented at the SAS Global Forum,
Las Vegas, NV.
Goldschmidt, N. P., & Hocevar, D. H. (2004). How Oregon uses data to drive
improvement: The binary graphic display. Paper presented at the annual meeting
of the Association for Institutional Research, Boston, MA.
Hasson, C., & Meehan, K. (2010). Assessing and planning for institutional effectiveness.
Berkeley, CA: RP Group. Retrieved from
http://www.rpgroup.org/sites/default/files/INQUIRY%20GUIDE%20-
%20Assessing%20and%20Planning%20for%20Institutional%20Effectiveness.pdf
Head, R. B. (2011). The evolution of institutional effectiveness in the community college.
New Directions for Community Colleges, 153, 5-11.
Hom, W. (2008). Peer grouping: The refinement of performance indicators. The Journal
of Applied Research in the Community College, 16(1), 45-51.
Kirsch, I., Braun, H., Yamamoto, K., & Sum, A. (2007). America’s perfect storm.
Princeton, NJ: Educational Testing Service.
Kuhn, T. S. (1996). The structure of scientific revolutions. Chicago, IL: University of
Chicago Press.
Leinbach, D. T., & Jenkins, D. (2008). Using longitudinal data to increase community
college student success: A guide to measuring milestone and momentum point
attainment (CCRC Research Tools No. 2). Retrieved from
http://ccrc.tc.columbia.edu/Publication.asp?UID=570
Middle States Commission on Higher Education. (2009). Characteristics of excellence in
higher education (12th ed.). Philadelphia, PA: Author. Retrieved from
http://www.msche.org/publications/CHX06_Aug08REVMarch09.pdf
Miller, C. (2006). A test of leadership: Charting the future of US higher education (A
Report of the Commission Appointed by the Secretary of Education Margaret
Spellings). Retrieved from
http://www.ed.gov/about/bdscomm/list/hiedfuture/reports/finalreport.pdf
Moore, C., & Shulock, N. (2010). Divided we fail: Improving completion and closing
racial gaps in California’s community colleges. Retrieved from
http://www.csus.edu/ihelp/PDFs/R_Div_We_Fail_1010.pdf
National Association of Independent Colleges and Universities. (2012). NAICU.
Retrieved from http://www.naicu.edu/special_initiatives/a-brief-overview
National Center for Educational Statistics. (2004). College navigator. Retrieved from
http://nces.ed.gov/collegenavigator
National Center for Public Policy and Higher Education. (2006). Measuring up!
Retrieved from
http://www.highereducation.org/reports/reports_center_2006.shtml
National Commission on Excellence in Education. (1983). A nation at risk. Retrieved
from http://www.ed.gov/pubs/NatAtRisk/risk.html
National Higher Education Benchmarking Institute. (2012). Benchmark project.
Retrieved from http://www.nccbp.org/national-higher-education-benchmarking-
institute
New England Association of Schools and Colleges. (2011). Standards for accreditation.
Retrieved from
http://cihe.neasc.org/standards_policies/standards/standards_html_version
Northwest Commission on Colleges and Universities. (2010). NWCCU standards of
accreditation. Redmond, WA: Author. Retrieved from
http://www.nwccu.org/Pubs%20Forms%20and%20UpdatesfPublications/Standar
ds%20for%20Accreditation.pdf
Pascarella, E. T. (2006). How college affects students: Ten directions for future research.
Journal of College Student Development, 47(5), 508-520.
Perry, P. (2005). College transfer performance: A methodology for equitable
measurement and comparison. Journal for Applied Research in Community
Colleges, 13(1), 73-87.
Popper, K. R. (1959). The logic of scientific discovery. London, England: Hutchinson.
Postsecondary Education Accountability, California Education Code §84754.5 (2004).
Shavelson, R. J., & Towne, L. (Eds.). (2003). Scientific research in education.
Washington, DC: National Academies Press.
Shulock, N., Moore, C., & Offenstein, J. (2011). The road less travelled: Realizing the
potential of career technical education in the California community colleges.
Retrieved from
http://www.csus.edu/ihelp/PDFs/R_Road_Less_Traveled_02_11.pdf
Southern Association of Colleges and Schools Commission on Colleges. (2010).
Principles of accreditation: Foundations for quality enhancement. Decatur, GA:
Author. Retrieved from
http://www.sacscoc.org/pdf/2010PrinciplesofAccreditation.pdf
Student Right to Know and Campus Security Act of 1990. Public Law No. 101-542, 20
U.S.C.A. sec. 1092(f)(7) (1990).
The Higher Learning Commission. (2003a). Commission statement on assessment of
student learning. Chicago, IL: Author. Retrieved from
http://content.springcm.com/content/DownloadDocuments.ashx?Selection=Docu
ment%2C201 77502%3B&accountld=5968
The Higher Learning Commission. (2003b). Handbook of accreditation. (3rd ed.).
Chicago, IL: Author. Retrieved from
http://content.springcm.com/content/DownloadDocuments.ashx?Selection=Docu
ment%2C10611003%3B&accountld=5968
Thorndike, R. L. (1963). The concepts of over- and underachievement. New York, NY:
Teachers College, Columbia University.
Tucker, S. (1996). Benchmarking: A guide for educators. Thousand Oaks, CA: Corwin
Press.
U.S. Census Bureau. (2010). American FactFinder. Retrieved from
http://factfinder2.census.gov/faces/nav/jsf/pages/index.xhtml
U.S. Department of Education. (2002). No Child Left Behind Act of 2001 (Public Law
No. 107-110, 115 Stat. 1425). Retrieved from
http://www.ed.gov/policy/elsec/leg/esea02/index.html
APPENDIX A
MULTIPLE REGRESSION OF SPAR
Appendix A1
Multiple Regression of Student Progress and Achievement Rate (SPAR) 2000-01 to
2005-06 Reporting Periods
Regression
R R Squared Adjusted R Squared Std. Error of the Estimate
0.835 0.698 0.666 4.357
Anova
Sum of
Squares df Mean Square F
Regression 4167.012 10 416.701 21.950
Residual 1803.467 95 18.984
Total 5970.479 105
Unstandardized Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant) 40.178 6.884 5.836 0.000
Income 0.000 0.000 0.083 0.943 0.348
Population Density 0.000 0.000 -0.017 -0.249 0.804
College Size 0.000 0.000 0.101 1.381 0.171
Nontraditional Aged -15.230 5.456 -0.206 -2.791 0.006
Female 18.333 6.980 0.172 2.626 0.010
African American -21.416 4.611 -0.298 -4.645 0.000
Hispanic -16.836 3.317 -0.358 -5.076 0.000
Educational Attainment 2.749 0.908 0.273 3.029 0.003
Asian 11.005 5.717 0.147 1.925 0.057
Other 10.517 13.504 0.051 0.779 0.438
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix A2
Multiple Regression of Student Progress and Achievement Rate (SPAR) 2001-02 to
2006-07 Reporting Periods
Regression
R R Squared
Adjusted R
Squared
Std. Error of
the Estimate
0.846 0.716 0.686 4.112
Anova
Sum of
Squares df Mean Square F
Regression 4053.65 10 405.365 23.972
Residual 1606.469 95 16.910
Total 5660.119 105
Unstandardized Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant) 45.934 6.497 7.070 0.000
Income 0.000 0.000 0.119 1.402 0.164
Population Density 0.000 0.000 0.000 -0.003 0.997
College Size 0.000 0.000 0.059 0.824 0.412
Nontraditional Aged -14.567 5.149 -0.202 -2.829 0.006
Female 8.631 6.588 0.083 1.310 0.193
African American -22.111 4.352 -0.316 -5.081 0.000
Hispanic -18.282 3.131 -0.400 -5.840 0.000
Educational Attainment 2.608 0.857 0.266 3.045 0.003
Asian 10.170 5.396 0.140 1.885 0.063
Other 5.402 12.745 0.027 0.424 0.673
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix A3
Multiple Regression of Student Progress and Achievement Rate (SPAR) 2002-03 to
2007-08 Reporting Periods
Regression
R R Squared Adjusted R Squared Std. Error of the Estimate
0.860 0.739 0.711 4.066
Anova
Sum of Squares df Mean Square F
Regression 4443.791 10 444.379 26.885
Residual 1570.229 95 16.529
Total 6014.020 105
Unstandardized Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant) 49.506 6.424 7.707 0.000
Income 0.000 0.000 0.115 1.409 0.162
Population Density 0.000 0.000 0.009 0.147 0.884
College Size 0.000 0.000 0.056 0.820 0.414
Nontraditional Aged -20.575 5.091 -0.277 -4.041 0.000
Female 6.216 6.513 0.058 0.954 0.342
African American -19.090 4.302 -0.264 -4.437 0.000
Hispanic -19.584 3.095 -0.415 -6.327 0.000
Educational Attainment 2.619 0.847 0.259 3.093 0.003
Asian 12.201 5.335 0.162 2.287 0.024
Other 7.689 12.600 0.037 0.610 0.543
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix A4
Multiple Regression of Student Progress and Achievement Rate (SPAR) 2003-04
to 2008-09 Reporting Periods
Regression
R R Squared Adjusted R Squared Std. Error of the Estimate
0.851 0.725 0.696 4.416
Anova
Sum of
Squares df Mean Square F
Regression 4883.805 10 488.381 25.042
Residual 1852.758 95 19.503
Total 6736.564 105
Unstandardized Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant) 47.250 6.978 6.772 0.000
Income 0.000 0.000 0.081 0.971 0.334
Population Density 0.000 0.000 0.028 0.426 0.671
College Size 0.000 0.000 0.052 0.751 0.455
Nontraditional Aged -21.187 5.530 -0.269 -3.831 0.000
Female 8.148 7.075 0.072 1.152 0.252
African American -22.506 4.673 -0.294 -4.816 0.000
Hispanic -19.082 3.362 -0.382 -5.676 0.000
Educational Attainment 3.619 0.920 0.339 3.934 0.000
Asian 12.919 5.795 0.163 2.229 0.028
Other -13.673 13.687 -0.062 -0.999 0.320
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
APPENDIX B
MULTIPLE REGRESSION OF PERSISTENCE
Appendix B1
Multiple Regression of Student Persistence Rate (Persistence) Fall 2004 to Fall 2005
Reporting Periods
Regression
R R Squared
Adjusted R
Squared
Std. Error of
the Estimate
0.776 0.601 0.559 4.912
Anova
Sum of Squares df Mean Square F
Regression 3458.644 10 345.864 14.336
Residual 2291.906 95 24.125
Total 5750.550 105
Unstandardized Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant) 62.447 7.761 8.047 0
Income 0.00006959 0 0.207 2.056 0.043
Population Density 0.00007981 0 0.051 0.64 0.524
College Size 0 0 0.282 3.353 0.001
Nontrad Aged Students -23.584 6.15 -0.324 -3.835 0
Female 6.879 7.869 0.066 0.874 0.384
African American -16.048 5.198 -0.227 -3.088 0.003
Hispanic 0.348 3.739 0.008 0.093 0.926
Educational Attainment 0.826 1.023 0.084 0.807 0.422
Asian 10.811 6.445 0.147 1.677 0.097
Other 10.722 15.223 0.053 0.704 0.483
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix B2
Multiple Regression of Student Persistence Rate (Persistence) Fall 2005 to Fall 2006
Reporting Periods
Regression
R R Squared
Adjusted R
Squared
Std. Error of
the Estimate
0.757 0.573 0.528 6.073
Anova
Sum of
Squares df Mean Square F
Regression 4708.300 10 470.830 12.767
Residual 3503.539 95 36.879
Total 8211.838 105
Unstandardized
Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant) 59.511 9.595 6.202 0.000
Income 0.000 0.000 0.183 1.754 0.083
Population Density 0.000 0.000 0.010 0.117 0.907
College Size 0.000 0.000 0.232 2.662 0.009
Nontraditional Aged -23.672 7.604 -0.272 -3.113 0.002
Female 5.791 9.729 0.046 0.595 0.553
African American -25.067 6.426 -0.297 -3.901 0.000
Hispanic 5.457 4.623 0.099 1.180 0.241
Educational Attainment 1.085 1.265 0.092 0.858 0.393
Asian 14.668 7.969 0.167 1.841 0.069
Other 26.018 18.821 0.108 1.382 0.170
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix B3
Multiple Regression of Student Persistence Rate (Persistence) Fall 2006 to Fall 2007
Reporting Periods
Regression
R R Squared
Adjusted R
Squared
Std. Error of
the Estimate
0.731 0.535 0.486 6.109
Anova
Sum of
Squares df Mean Square F
Regression 4075.429 10 407.543 10.922
Residual 3544.827 95 37.314
Total 7620.256 105
Unstandardized Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant) 68.083 9.651 7.054 0.000
Income 0.000 0.000 0.164 1.506 0.135
Population Density 0.000 0.000 -0.111 -1.287 0.201
College Size 0.000 0.000 0.252 2.768 0.007
Nontraditional Aged -23.265 7.649 -0.278 -3.042 0.003
Female -2.910 9.787 -0.024 -0.297 0.767
African American -16.852 6.464 -0.207 -2.607 0.011
Hispanic 2.405 4.650 0.045 0.517 0.606
Educational Attainment 0.658 1.272 0.058 0.517 0.607
Asian 19.923 8.015 0.236 2.486 0.015
Other 16.743 18.932 0.072 0.884 0.379
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix B4
Multiple Regression of Student Persistence Rate (Persistence) Fall 2007 to Fall 2008
Reporting Periods
Regression
R R Squared
Adjusted R
Squared
Std. Error of
the Estimate
0.732 0.535 0.486 6.116
Anova
Sum of
Squares df Mean Square F
Regression 4092.654 10 409.265 10.940
Residual 3554.034 95 37.411
Total 7646.689 105
Unstandardized Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant) 62.236 9.664 6.440 0.000
Income 0.000 0.000 0.336 3.081 0.003
Population Density 0.000 0.000 0.102 1.181 0.241
College Size 0.000 0.000 0.256 2.816 0.006
Nontraditional Aged -25.519 7.659 -0.304 -3.332 0.001
Female 13.498 9.799 0.112 1.377 0.172
African American -20.286 6.472 -0.249 -3.134 0.002
Hispanic 2.892 4.656 0.054 0.621 0.536
Educational Attainment -0.756 1.274 -0.066 -0.594 0.554
Asian 4.672 8.026 0.055 0.582 0.562
Other 14.713 18.957 0.063 0.776 0.440
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
APPENDIX C
MULTIPLE REGRESSION OF STUDENT THIRTY-UNITS
Appendix C1
Multiple Regression of Student Thirty-Unit Completed Rate (Thirty-Unit) 2000-01 to
2005-06 Reporting Periods
Regression
R R Squared
Adjusted R
Squared
Std. Error of
the Estimate
0.765 0.585 0.541 3.594
Anova
Sum of
Squares df Mean Square F
Regression 1726.176 10 172.618 13.364
Residual 1227.070 95 12.917
Total 2953.246 105
Unstandardized Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant) 60.466 5.678 10.648 0.000
Income 0.000 0.000 0.233 2.264 0.026
Population Density 0.000 0.000 0.125 1.541 0.127
College Size 0.000 0.000 0.327 3.806 0.000
Nontraditional Aged -8.395 4.500 -0.161 -1.865 0.065
Female 18.553 5.758 0.248 3.222 0.002
African American -25.022 3.803 -0.494 -6.579 0.000
Hispanic -2.187 2.736 -0.066 -0.799 0.426
Educational Attainment -0.023 0.749 -0.003 -0.031 0.975
Asian -0.311 4.716 -0.006 -0.066 0.948
Other -8.532 11.139 -0.059 -0.766 0.446
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix C2
Multiple Regression of Student Thirty-Unit Completed Rate (Thirty-Unit) 2001-02 to
2006-07 Reporting Periods
Regression
R R Squared
Adjusted R
Squared
Std. Error of
the Estimate
0.743 0.552 0.505 3.571
Anova
Sum of
Squares df Mean Square F
Regression 1493.726 10 149.373 11.715
Residual 1211.268 95 12.750
Total 2704.994 105
Unstandardized Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant) 62.376 5.642 11.056 0.000
Income 0.000 0.000 0.186 1.740 0.085
Population Density 0.000 0.000 0.140 1.659 0.100
College Size 0.000 0.000 0.201 2.249 0.027
Nontrad Aged Students -11.581 4.471 -0.232 -2.590 0.011
Female 15.240 5.721 0.213 2.664 0.009
African American -23.801 3.779 -0.491 -6.299 0.000
Hispanic -0.975 2.718 -0.031 -0.359 0.721
Educational Attainment 0.785 0.744 0.116 1.056 0.294
Asian 0.539 4.685 0.011 0.115 0.909
Other -13.005 11.067 -0.094 -1.175 0.243
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix C3
Multiple Regression of Student Thirty-Unit Completed Rate (Thirty-Unit) 2002-03 to
2007-08 Reporting Periods
Regression
R R Squared
Adjusted R
Squared
Std. Error of
the Estimate
0.725 0.526 0.476 3.343
Anova
Sum of
Squares df Mean Square F
Regression 1175.862 10 117.586 10.521
Residual 1061.726 95 11.176
Total 2237.589 105
Unstandardized Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant) 65.260 5.282 12.355 0.000
Income 0.000 0.000 0.193 1.751 0.083
Population Density 0.000 0.000 0.142 1.639 0.105
College Size 0.000 0.000 0.233 2.539 0.013
Nontrad Aged Students -11.933 4.186 -0.263 -2.851 0.005
Female 11.096 5.356 0.170 2.072 0.041
African American -16.852 3.538 -0.382 -4.764 0.000
Hispanic -1.323 2.545 -0.046 -0.520 0.604
Educational Attainment 0.454 0.696 0.074 0.652 0.516
Asian 3.260 4.387 0.071 0.743 0.459
Other -4.200 10.361 -0.033 -0.405 0.686
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix C4
Multiple Regression of Student Thirty-Unit Completed Rate (Thirty-Unit) 2003-04 to
2008-09 Reporting Periods
Regression
R R Squared
Adjusted R
Squared
Std. Error of
the Estimate
0.727 0.529 0.480 3.765
Anova
Sum of
Squares df Mean Square F
Regression 1513.854 10 151.385 10.681
Residual 1346.522 95 14.174
Total 2860.376 105
Unstandardized Coefficients
Standardized
Coefficients
B SE β t Sig.
(Constant) 61.629 5.948 10.361 0.000
Income 0.000 0.000 0.117 1.068 0.288
Population Density 0.000 0.000 0.078 0.901 0.370
College Size 0.000 0.000 0.231 2.524 0.013
Nontrad Aged Stds -10.619 4.714 -0.207 -2.253 0.027
Female 14.398 6.032 0.195 2.387 0.019
African American -18.681 3.984 -0.375 -4.689 0.000
Hispanic 0.624 2.866 0.019 0.218 0.828
Educ Attainment 0.950 0.784 0.136 1.211 0.229
Asian 10.633 4.940 0.205 2.152 0.034
Other -8.955 11.668 -0.063 -0.767 0.445
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
APPENDIX D
TOP TEN OVER AND UNDER-PERFORMING INSTITUTIONS SPAR
Appendix D1
Top Ten Over and Under-Performing Institutions SPAR 2000-01 to 2005-06
Reporting Periods
SPAR Over 2000-01 to 2005-06 Reporting Periods
Over-Performing Institutions
Controlled Uncontrolled
College Stand.
Residual
Predicted Value Rate College SPAR
Reedley 3.30 48.98 63.33 Foothill 67.57
Barstow 2.69 46.05 57.76 De Anza 67.52
San Diego City 1.86 49.71 57.83 Irvine Valley 66.28
Oxnard 1.75 43.78 51.39 Diablo Valley 65.51
Irvine Valley 1.37 60.30 66.28 Moorpark 64.07
Gavilan 1.37 46.06 52.03 Reedley 63.33
Monterey 1.28 52.66 58.22 Ohlone 62.47
Diablo Valley 1.17 60.41 65.51 West Valley 61.71
Canyons 1.17 51.59 56.67 Las Positas 61.36
Antelope Valley 1.11 50.11 54.97 Orange Coast 61.23
Underperforming Institutions
Controlled Uncontrolled
College Stand. Residual Predicted Value Rate College SPAR
-1.43 55.77 49.53 41.75
-1.46 35.95 29.61 41.5
-1.52 36.78 30.14 41.04
-1.57 48.66 41.83 40.41
-1.57 57.42 50.58 39.18
-1.58 55.96 49.06 38.17
-1.67 58.14 50.88 36.36
-1.79 51.23 43.43 36.07
-1.82 51.57 43.65 30.14
-1.84 48.43 40.41 29.61
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix D2
Top Ten Over and Under-Performing Institutions SPAR 2001-02 to 2006-07
Reporting Periods
SPAR Over 2001-02 to 2007-08 Reporting Periods
Over-Performing Institutions
Controlled Uncontrolled
College Stand.
Residual
Predicted
Value
Rate College SPAR
Reedley 2.72 47.49 58.69 Foothill 66.99
Barstow 2.53 45.48 55.86 De Anza 66.36
San Diego City 1.78 48.50 55.83 Irvine Valley 66.14
Napa Valley 1.66 50.72 57.53 Diablo Valley 65.16
Irvine Valley 1.62 59.47 66.14 West Valley 64.35
Allan Hancock 1.62 46.11 52.77 Orange Coast 63.92
Orange Coast 1.49 57.80 63.92 Moorpark 62.10
Oxnard 1.44 42.70 48.64 Ohlone 61.31
Diablo Valley 1.37 59.53 65.16 San Mateo 60.87
Southwest LA 1.36 34.79 40.38 Saddleback 60.50
Underperforming Institutions
Controlled Uncontrolled
College Stand.
Residual
Predicted
Value
Rate College SPAR
-1.17 55.35 50.55 41.91
-1.31 52.35 46.96 41.45
-1.35 50.60 45.04 41.44
-1.38 48.45 42.79 40.40
-1.43 47.33 41.44 40.38
-1.44 57.95 52.03 39.98
-1.82 51.35 43.87 39.37
-1.95 50.66 42.62 37.39
-2.31 55.12 45.60 33.80
-2.37 34.66 24.90 24.90
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix D3
Top Ten Over and Under-Performing Institutions SPAR 2002-03 to 2007-08
SPAR Over 2002-03 to 2007-08 Reporting Periods
Over-Performing Institutions
Controlled Uncontrolled
College Stand.
Residual
Predicted
Value
Rate College SPAR
Mira Costa 1.80 52.28 59.61 De Anza 69.35
Oxnard 1.79 42.09 49.38 Foothill 67.97
Napa Valley 1.57 49.36 55.72 Diablo Valley 66.11
San Diego Mesa 1.51 56.15 62.27 Irvine Valley 64.15
Diablo Valley 1.45 60.23 66.11 Moorpark 63.69
Barstow 1.35 44.52 50.00 Orange Coast 62.85
Santa Barbara City 1.31 56.13 61.48 San Diego Mesa 62.27
Cuyamaca 1.29 50.38 55.62 Santa Barbara City 61.48
Coastline 1.27 51.82 56.98 Ohlone 61.16
Laney 1.26 47.79 52.91 West Valley 60.99
Underperforming Institutions
Controlled Uncontrolled
College Stand.
Residual
Predicted
Value
Rate College SPAR
-1.26 50.84 45.74 41.45
-1.45 50.37 44.48 41.44
-1.51 48.92 42.78 41.38
-1.53 49.89 43.69 41.35
-1.61 56.95 50.42 38.07
-1.71 49.59 42.63 37.53
-1.83 51.57 44.13 37.14
-2.24 50.47 41.38 36.92
-2.36 35.27 25.68 36.32
-2.67 54.49 43.63 25.68
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix D4
Top Ten Over and Under-Performing Institutions SPAR 2003-04 to 2008-09
Reporting Periods
SPAR Over 2003-04 to 2008-09 Reporting Periods
Over-Performing Institutions
Controlled Uncontrolled
College Stand.
Residual
Predicted
Value
Rate College SPAR
Coastline 2.10 53.16 62.40 De Anza 70.50
Barstow 1.79 44.47 52.40 Foothill 67.30
Antelope Valley 1.51 49.16 55.80 Moorpark 66.40
Mira Costa 1.49 53.43 60.00 Irvine Valley 65.90
Oxnard 1.42 41.49 47.70 Santa Monica 65.30
Moorpark 1.29 60.68 66.40 Diablo Valley 64.10
San Diego City 1.21 48.21 53.50 Ohlone 64.00
Santa Monica 1.20 60.00 65.30 West Valley 63.40
Southwestern 1.20 45.83 51.10 Orange Coast 63.20
Grossmont 1.18 54.35 59.50 San Mateo 62.50
Underperforming Institutions
Controlled Uncontrolled
College Stand.
Residual
Predicted
Value
Rate College SPAR
-1.33 52.01 46.10 40.00
-1.34 50.22 44.30 39.90
-1.42 58.71 52.40 39.80
-1.46 53.20 46.80 39.00
-1.46 47.41 41.00 38.50
-1.58 47.56 40.60 37.80
-1.83 34.06 26.00 37.70
-2.11 55.35 46.00 37.70
-2.34 49.32 39.00 37.50
-2.78 50.10 37.80 26.00
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
APPENDIX E
TOP TEN OVER AND UNDER-PERFORMING INSTITUTIONS PERSISTENCE
Appendix E1
Top Ten Over and Under-Performing Institutions Persistence Fall 2004 to Fall
2005 Reporting Periods
Persistence Over Fall 2004 to Fall 2005 Reporting Periods
Over-Performing Institutions
Controlled Uncontrolled
College Stand.
Residual
Predicted
Value
Rate College Persistence
Gavilan 2.36 61.06 72.65 Orange Coast 79.06
Chabot 1.58 67.77 75.52 Evergreen Valley 77.76
Las Positas 1.54 68.45 76.00 Diablo Valley 77.71
Evergreen Valley 1.40 70.88 77.76 West Valley 76.98
Long Beach City 1.38 68.57 75.36 Pasadena City 76.82
Allan Hancock 1.21 60.82 66.76 Mt San Antonio 76.46
Siskiyous 1.19 57.30 63.13 Las Positas 76.00
San Joaquin Delta 1.16 69.39 75.07 Chabot 75.52
Napa Valley 1.16 61.58 67.25 Long Beach City 75.36
Underperforming Institutions
Controlled Uncontrolled
College Stand.
Residual
Predicted
Value
Rate College Persistence
-1.46 65.83 58.64 57.14
-1.62 65.65 57.70 54.88
-1.67 80.62 72.42 54.13
-1.83 63.86 54.88 52.96
-1.85 60.03 50.92 50.97
-1.88 50.11 40.85 50.92
-1.89 62.26 52.96 50.11
-2.01 60.82 50.97 46.15
-2.21 56.68 45.81 45.81
-2.38 57.87 46.15 40.85
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix E2
Top Ten Over and Under-Performing Institutions Persistence Fall 2005 to Fall
2006 Reporting Periods
Persistence Over Fall 2005 to Fall 2006 Reporting Periods
Over-Performing Institutions
Controlled Uncontrolled
College
Stand.
Residual
Predicted
Value Rate College Persistence
Gavilan 1.89 61.40 72.89 Las Positas 79.34
Las Positas 1.80 68.44 79.34 Orange Coast 79.11
Cuyamaca 1.67 61.12 71.26 Mt San Antonio 77.11
Merritt 1.57 54.51 64.07 Skyline 76.79
Santa Ana 1.35 63.48 71.67 Evergreen Valley 76.64
Laney 1.32 57.06 65.08 Golden West 76.40
Taft 1.26 52.89 60.57 Pasadena City 76.10
Imperial Valley 1.23 67.67 75.17 El Camino 75.37
Napa Valley 1.23 61.76 69.22 Moorpark 75.23
Merced 1.13 63.08 69.92 Cypress 75.23
Underperforming Institutions
Controlled Uncontrolled
College
Stand.
Residual
Predicted
Value Rate College Persistence
-1.02 62.47 56.25 54.46
-1.09 72.82 66.21 54.31
-1.40 61.49 53.01 53.01
-1.42 71.20 62.57 52.36
-1.51 63.46 54.31 50.56
-1.96 56.33 44.41 50.21
-2.50 81.94 66.77 44.41
-2.56 49.98 34.41 37.67
-2.93 48.96 31.18 34.41
-3.62 59.65 37.67 31.18
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix E3
Top Ten Over and Under-Performing Institutions Persistence Fall 2006 to Fall
2007 Reporting Periods
Persistence Over Fall 2006 to Fall 2007 Reporting Periods
Over-Performing Institutions
Controlled Uncontrolled
College
Stand.
Residual
Predicted
Value Rate College Persistence
Santa Ana 2.04 64.12 76.59 Orange Coast 80.61
Cuyamaca 1.65 60.76 70.82 Las Positas 78.12
Napa Valley 1.61 62.13 71.99 Golden West 78.05
Hartnell 1.61 63.88 73.70 Moorpark 78.05
Imperial Valley 1.50 66.56 75.72 Ohlone 78.02
Las Positas 1.41 69.51 78.12 Pasadena City 77.77
Alameda 1.33 65.95 74.09 Evergreen Valley 77.66
Contra Costa 1.31 62.00 69.98 West Valley 77.37
Allan Hancock 1.19 61.87 69.13 Fullerton 77.23
Laney 1.07 62.04 68.60 Mt San Antonio 76.82
Underperforming Institutions
Controlled Uncontrolled
College
Stand.
Residual
Predicted
Value Rate College Persistence
-1.39 69.78 61.32 52.96
-1.46 73.15 64.21 52.82
-1.47 62.71 53.71 52.56
-1.51 66.33 57.14 51.96
-1.63 57.69 47.73 50.83
-1.69 64.13 53.82 47.73
-2.35 83.80 69.47 45.24
-2.53 58.28 42.82 45.02
-2.59 53.45 37.63 42.82
-2.60 60.90 45.02 37.63
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix E4
Top Ten Over and Under-Performing Institutions Persistence Fall 2007 to Fall
2008 Reporting Periods
Persistence Over Fall 2007 to Fall 2008 Reporting Periods
Over-Performing Institutions
Controlled Uncontrolled
College Stand. Residual Predicted Value Rate College Persistence
Hartnell 1.87 63.42 74.90 Las Positas 78.80
Irvine Valley 1.73 65.34 75.90 Mt San Antonio 77.70
Monterey 1.71 57.53 68.00 Moorpark 77.40
Las Positas 1.67 68.56 78.80 Fullerton 77.30
Santa Ana 1.65 64.16 74.30 Ohlone 77.00
Allan Hancock 1.55 60.80 70.30 Pasadena City 76.90
Siskiyous 1.34 55.20 63.40 El Camino 76.90
Long Beach City 1.28 68.41 76.20 Evergreen Valley 76.50
Gavilan 1.22 62.27 69.70 Long Beach City 76.20
Taft 1.14 50.24 57.20 San Francisco City 76.20
Underperforming Institutions
Controlled Uncontrolled
College Stand. Residual Predicted Value Rate Persistence
-1.24 68.78 61.20 53.70
-1.29 60.05 52.20 53.00
-1.41 64.93 56.30 52.20
-1.58 79.12 69.40 50.10
-1.97 62.16 50.10 49.20
-2.06 60.94 48.30 48.30
-2.25 47.76 34.00 48.00
-2.27 57.21 43.30 43.30
-2.50 55.13 39.80 39.80
-2.77 64.99 48.00 34.00
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
APPENDIX F
TOP TEN OVER AND UNDER-PERFORMING INSTITUTIONS THIRTY-UNITS
Appendix F1
Top Ten Over and Under-Performing Institutions Thirty-Unit 2000-01 to
2005-06 Reporting Periods
Thirty-Unit Over 2000-01 to 2005-06 Reporting Periods
Over-Performing Institutions
Controlled Uncontrolled
College Stand. Residual Predicted Value Rate College Thirty-Unit
Redwoods 2.32 68.46 76.80 Pasadena City 80.80
Canyons 2.05 68.10 75.50 De Anza 79.50
Gavilan 1.92 67.79 74.70 Glendale 79.00
Pasadena City 1.78 74.43 80.80 Orange Coast 77.90
Santa Ana 1.61 68.39 74.20 Saddleback 77.30
Antelope Valley 1.56 67.81 73.40 Redwoods 76.80
Glendale 1.40 73.98 79.00 Fullerton 75.70
Chabot 1.38 67.52 72.50 Cuesta 75.70
Cuesta 1.35 70.83 75.70 Moorpark 75.50
Lake Tahoe 1.34 69.74 74.60 Canyons 75.50
Underperforming Institutions
Controlled Uncontrolled
College Stand. Residual Predicted Value Rate College Thirty-Unit
-1.12 73.72 69.70 60.90
-1.13 68.22 64.20 60.20
-1.26 77.10 72.60 59.30
-1.47 64.11 58.80 59.10
-1.81 71.83 65.30 58.80
-2.07 66.78 59.30 58.40
-2.12 71.50 63.90 56.90
-2.14 69.47 61.80 56.60
-2.28 67.27 59.10 55.40
-2.80 67.01 56.90 54.30
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix F2
Top Ten Over and Under-Performing Institutions Thirty-Unit 2001-02 to 2006-07
Reporting Periods
Thirty-Unit Over 2001-02 to 2006-07 Reporting Periods
Over-Performing Institutions
Controlled Uncontrolled
College Stand. Residual Predicted Value Rate College Thirty-Unit
Redwoods 2.21 68.83 76.70 De Anza 81.10
Canyons 1.83 68.42 75.00 Orange Coast 78.90
Gavilan 1.81 67.91 74.40 Pasadena City 78.50
Southwest LA 1.80 57.52 64.00 Cuesta 78.00
Imperial Valley 1.73 71.27 77.40 Glendale 77.80
Cuesta 1.67 72.03 78.00 Imperial Valley 77.40
De Anza 1.47 75.80 81.10 Saddleback 77.00
Laney 1.46 61.99 67.20 Redwoods 76.70
Orange Coast 1.26 74.35 78.90 Moorpark 75.50
Siskiyous 1.26 66.12 70.60 Fullerton 75.30
Underperforming Institutions
Controlled Uncontrolled
College
Stand.
Residual
Predicted
Value Rate College Thirty-Unit
-1.18 58.55 54.40 62.70
-1.19 71.33 67.10 62.50
-1.42 72.00 66.90 61.50
-1.48 59.36 54.10 61.50
-1.48 66.78 61.50 61.50
-1.49 70.91 65.60 60.50
-1.72 67.62 61.50 56.70
-2.41 69.13 60.50 54.80
-2.74 64.53 54.80 54.40
-3.20 68.14 56.70 54.10
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix F3
Top Ten Over and Under-Performing Institutions Thirty-Unit 2002-03 to 2007-08
Reporting Periods
Thirty-Unit Over 2002-03 to 2007-08 Reporting Periods
Over-Performing Institutions
Controlled Uncontrolled
College Stand.
Residual
Predicted
Value
Rate College Thirty-Unit
Imperial Valley 2.40 70.28 78.30 De Anza 82.70
Redwoods 1.88 68.92 75.20 Orange Coast 79.80
De Anza 1.74 76.84 82.70 Glendale 78.40
Glendale 1.48 73.46 78.40 Imperial Valley 78.30
Antelope Valley 1.48 68.86 73.80 Pasadena City 77.70
Copper Mountain 1.46 67.64 72.50 Saddleback 76.60
Orange Coast 1.45 74.99 79.80 Mt San Antonio 76.20
Santa Rosa 1.43 71.28 76.10 Santa Rosa 76.10
Chabot 1.34 69.22 73.70 Fullerton 75.70
LA Trade Tech 1.31 64.42 68.80 Cuesta 75.40
Underperforming Institutions
Controlled Uncontrolled
College
Stand.
Residual
Predicted
Value Rate College Thirty-Unit
-1.18 67.13 63.20 63.30
-1.23 72.17 68.10 63.20
-1.43 69.08 64.30 63.20
-1.51 68.33 63.30 62.00
-1.61 72.38 67.00 61.40
-1.71 65.22 59.50 61.20
-1.78 67.95 62.00 60.40
-1.86 70.50 64.30 60.40
-1.87 69.39 63.20 59.50
-3.76 68.74 56.20 56.20
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
Appendix F4
Top Ten Over and Under-Performing Institutions Thirty-Unit 2003-04 to
2008-09 Reporting Periods
Thirty-Unit Over 2003-04 to 2008-09 Reporting Periods
Over-Performing Institutions
Controlled Uncontrolled
College
Stand.
Residual
Predicted
Value Rate College
Thirty-
Unit
Gavilan 2.33 68.51 77.30 De Anza 83.80
Redwoods 1.92 69.16 76.40 Glendale 81.70
Southwest LA 1.92 60.15 67.40 Orange Coast 81.10
Glendale 1.73 75.15 81.70 Pasadena City 80.50
Chabot 1.61 70.20 76.30 Fullerton 79.80
Imperial Valley 1.61 72.12 78.20 Imperial Valley 78.20
Taft 1.60 60.15 66.20 Santa Rosa 77.80
Santa Rosa 1.45 72.34 77.80 West Valley 77.80
Orange Coast 1.19 76.65 81.10 Santa Monica 77.70
Fullerton 1.07 75.80 79.80 Mt San Antonio 77.40
Underperforming Institutions
Controlled Uncontrolled
College
Stand.
Residual
Predicted
Value Rate College
Thirty-
Unit
-1.14 71.56 67.30 63.30
-1.21 68.84 64.30 63.10
-1.29 69.09 64.20 63.00
-1.34 68.00 62.90 62.90
-1.49 70.14 64.50 62.70
-1.49 68.08 62.50 62.50
-1.50 77.13 71.50 62.20
-2.05 61.62 53.90 60.10
-2.10 71.16 63.30 53.90
-4.45 68.93 52.20 52.20
N = 106
Source: California Community College Chancellor’s Office (2007-2010)
United States Census Bureau (2000)
ABSTRACT
Over the past quarter-century, there has been an increased call for colleges and universities to better demonstrate their institutional effectiveness. This quantitative study assessed the existence and degree of random and systematic measurement error contained in performance metrics used under an accountability system for community colleges in order to determine the stability, consistency, and validity of the indicators as measures of institutional effectiveness. A binary graphic display was also employed to visually represent the impact that improving intermediate outcome results (momentum points), such as student persistence, may have on later, terminal outcomes of student goal achievement. Test-retest reliability and internal consistency results for the performance indicators were strong, suggesting that the measures are stable and internally consistent and that random measurement error was minimal. Multivariate correlation analysis revealed that the performance measures did contain systematic error due to socioeconomic factors irrelevant to the construct of institutional effectiveness. Residual analyses were conducted to identify over- and underperformance after controlling for systematic error, to more accurately understand how and why institutional differences arise, and to assess how colleges that perform less well can be helped to achieve better results.