MAKING APPROPRIATE DECISIONS FOR NONNORMAL DATA:
WHEN SKEWNESS AND KURTOSIS MATTER
FOR THE NOMINAL RESPONSE MODEL
Skye Nichole Parral
A Thesis Submitted in Partial Fulfillment of the
Requirements for the Degree of
Master of Arts
in Psychology
at
The University of Southern California
August 2018
TABLE OF CONTENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGMENTS
Chapter 1 - Introduction
Nominal Response Model
Estimation
EM and ML
Empirical Histograms
Item Parameter Recovery
Research Design
Chapter 2 - Methods
Design
Factors Influencing Item Parameter Recovery
Latent Trait Distribution
Sample Size
CBD Parameters
Data Generation
Model
Outcome Measures
Chapter 3 - Results
Descriptive Statistics
CBD Parameter Recovery
Absolute bias
RMSE
Chapter 4 - Discussion
APPENDICES
A. Tables
B. Figures
REFERENCES
ABSTRACT
The use of Item Response Theory has risen rapidly in the field of psychology over
the past decade. Psychological research is based on measuring unobservable processes
in the mind using various data collection techniques. This paper focuses on using
survey data to measure these latent traits. The latent traits measured within
psychology are most often nonnormally distributed, which presents a problem because
conventional statistical methods are based on the normal distribution. Newer
estimation techniques such as empirical histograms allow the shape of the latent trait
distribution to be estimated. When sample sizes are large and the latent trait
distribution is highly skewed and kurtotic, this technique provides more accurate item
parameter estimates for the survey than the traditional maximum likelihood method.
LIST OF TABLES
1. Demonstration of Item Parameter Recovery for Nonnormal Latent Trait Distributions
2. Test Conditions for the Latent Trait Distributions
3. Average Proportion of converged conditions
4. Average CBD parameter recovery and CBD parameter standard deviation for ML and EH estimation
5. CBD Absolute Bias
6. RMSE of CBD Parameters
LIST OF FIGURES
1. Equal CBD parameters
2. Unequal CBD parameters
ACKNOWLEDGMENTS
K.S.J.P. – It’s finally done.
CHAPTER 1
Introduction
Over the last decade, Item Response Theory (IRT) has gained popularity within the
field of psychometrics for constructing and validating applied psychological
measurements (Reise & Henson, 2003). Psychological instruments assess,
diagnose, and inform researchers about issues exhibited in clinical and
non-clinical populations. The essence of psychological science is measuring the
unobservable processes in the mind, the latent trait(s), and identifying which behaviors
and emotions are indicators of a person's current state. The latent trait distribution
refers to the true shape of the latent trait (e.g., depression, happiness, ADHD), which is
often not assessed by researchers. All latent traits are unobservable; therefore,
researchers ask questions about observable behaviors to make inferences about the latent
trait of interest. The latent trait distribution is not the distribution of the sample data,
which is often investigated using frequencies or histograms. Moreover, the latent trait
distribution is typically not normal for clinical populations (e.g., depression is most
often skewed; Preston & Reise, 2013). IRT allows estimation of polytomous models, which
match the Likert-type response format commonly used for psychological tools.
Additionally, polytomous IRT models provide information about each response category
and its relationship to the latent trait. The nominal response model (NRM; Bock, 1972,
1997) is the least constrained of the polytomous IRT models, being the most
general of the divide-by-total models (Thissen & Steinberg, 1986). The NRM has the
unique ability to investigate category functioning, which informs the researcher about
unnecessary categories that are not meaningful to the participant.
Studying human behavior and accurately predicting future behaviors or outcomes
is the basis for psychological research. Measuring these behaviors with little error is the
goal of psychological constructs. These constructs are assumed to be normally distributed,
yet researchers often do not investigate the distribution of the latent trait. Normality
of a construct is essential to many statistical techniques because of the special
properties of the normal distribution, which allow the null hypothesis to be correctly
rejected or retained. When the distribution of the latent trait is not normal, as indicated
by its skewness and kurtosis, the use of these statistical techniques is in jeopardy. Thus,
investigating the distribution of the latent trait prior to completing statistical analyses is
imperative, because it affects how accurately human behavior can be understood and
predicted and how useful the resulting behavioral interventions are.
Previous simulation studies have examined the functioning of IRT models using
extreme distributions of the latent trait (de Ayala & Sava-Bolesta, 1999; DeMars, 2003;
van den Oord, 2005; Preston & Reise, 2013; Woods, 2006, 2007; Woods & Thissen, 2006).
Woods (2007) investigated a near-normal distribution, with a
skew of 0.53 and a kurtosis of 3.75, which produced results similar to those for a normal
distribution. The near-normal test condition is rarely seen in the growing body of
literature on item parameter recovery, and very little of this research has been done
using the NRM. Moreover, the variety of latent trait distributions that may exist
between normal and extremely skewed or kurtotic has yet to be evaluated. However,
these intermediate distributions are more likely to be representative of the psychological
constructs being measured; thus, it is essential to examine the effects of skewness
and kurtosis when estimating item parameters using polytomous IRT models, specifically
the NRM.
The purpose of this study is two-fold: 1) to investigate how the accuracy of item
parameter recovery is affected by the shape of the latent trait distribution when using
Maximum Likelihood (ML; Bock & Lieberman, 1970) estimation for the NRM, and 2)
to investigate how well empirical histogram (EH; Bock & Aitkin, 1981; Mislevy, 1984)
estimation for the NRM, which estimates the shape of the latent trait distribution, corrects
the bias. The sections below provide a detailed description of the NRM, ML, and EH
estimation and their contributions to investigating the effects of nonnormality on item
parameters.
Nominal Response Model
The NRM informs a researcher about the relationship between an individual’s latent
trait and the probability of endorsing a particular item category, which is useful for
clinical populations. Unique to the NRM is the ability to evaluate the functioning of
each category within an item (Preston, Reise, Cai, & Hayes, 2011). Additionally, the
NRM can be used to test whether all response option categories are equally informative,
whether an item contains too many response options, and whether the response options are
ordered. A plot of category response curves (CRC; Embretson & Reise, 2000; see Figure 1)
displays the probability of endorsing a category given the level of the participant’s latent
trait. The probability of endorsing a particular category is estimated as more or less of
the latent trait is present. Figure 1 displays the CRCs for an item with a 4-point
Likert-type response format: ‘strongly disagree’, ‘disagree’, ‘agree’, and ‘strongly agree’.
The probability of responding ‘strongly disagree’ monotonically decreases as a function
of one’s increasing latent trait level. The two middle categories display curves that are
parabolic, representing an increase and then a decrease in the probability of endorsing each
particular category given one’s trait level. The peak of each of these parabolic functions
represents the highest probability of a participant endorsing that category, given the
corresponding level on the latent trait. The probability of responding in the last category
monotonically increases as a function of one’s trait level. The level of the latent trait
needed to endorse a category increases along the x-axis.
Using the NRM, the CRC plots inform the researcher about category functioning
and whether participants are making meaningful decisions between two adjacent categories.
As shown in Figure 1, the categories are ordered such that a subject with increased levels
of the latent trait is more likely to endorse “strongly agree.” The
intersections of the CRCs represent the within-item category boundary
discrimination (CBD). This value determines the amount of information given by
endorsing category 1 versus the adjacent category 2 (e.g., ‘strongly disagree’ versus
‘disagree’). The value of the CBD parameter indicates how similar the adjacent
curves are: values closer to zero suggest that the distinction between the categories
is not as meaningful, whereas larger values indicate that a clear choice is being made when
choosing a category. The CBD parameters in Figure 1 are 1.21, 1.21, and 1.11; thus each
category is almost equally discriminating across the latent trait. Conversely, in Figure 2
the CBD parameters vary widely across the latent trait (.38, 1.57, and .74,
respectively). Here the curves for ‘strongly disagree’ versus ‘disagree’
are much more similar than the curves for ‘disagree’ versus ‘agree’. Thus,
participants are making a clear distinction between broadly agreeing and disagreeing with
the item’s statement, but the severity of the disagreement (disagree versus
strongly disagree) is not as meaningful.
The NRM produces two parameter values used to create the aforementioned plots
of CRCs. The conditional probability of a participant at $\theta$ endorsing category $x$
($x = 0, \ldots, m_i$) for item $i$ is calculated using
$$P_{ix}(\theta) = \frac{\exp(a_{ix}\theta + c_{ix})}{\sum_{h=0}^{m_i} \exp(a_{ih}\theta + c_{ih})} \tag{1}$$
Recall the CRC plots have two monotonic functions, one for each category on the
extremes of the response format; these functions are the result of identifying the model.
To identify the model, $\sum a_{ix} = \sum c_{ix} = 0$, creating the monotonic functions for the
extreme categories. Each category within each item produces parameters a and c, as depicted
by the numerator, which represents the value of a specific response category. The
denominator contains the summation over all categories; hence the ‘divide-by-total’
model membership. The $a_{ix}$ and $c_{ix}$ parameters depict a linear relationship between the
latent trait and the log odds of responding in a specific category. The $a_{ix}$ represents
the slope of the line as it changes concavity, and $c_{ix}$ is the category intercept parameter,
which represents the frequency of endorsement for a specific category; larger numbers
indicate more individuals endorsing the category. An increased frequency of participants
endorsing a category can be an indicator of skewness and kurtosis within the latent trait
because that response option is the best descriptor of the participants’ latent
trait. The $a_{ix}$ and $c_{ix}$ from Equation 1 are not directly interpreted but are used to
calculate the CBDs and intersections of the CRCs, explained below.
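As a minimal sketch of Equation 1, the category probabilities for a single item can be computed as follows. The function name `nrm_probabilities` and the parameter values are hypothetical and chosen only for illustration (they satisfy the sum-to-zero identification constraint); they are not taken from the thesis.

```python
import math

def nrm_probabilities(theta, a, c):
    """Category probabilities for one item under Eq. 1 (divide-by-total form).

    theta: latent trait level; a, c: per-category slope and intercept lists.
    """
    z = [a_x * theta + c_x for a_x, c_x in zip(a, c)]
    z_max = max(z)  # subtract the maximum for numerical stability
    numerators = [math.exp(v - z_max) for v in z]
    total = sum(numerators)  # the "total" in divide-by-total
    return [n / total for n in numerators]

# Hypothetical 4-category item; both parameter sets sum to zero.
a = [-1.5, -0.5, 0.5, 1.5]
c = [-0.5, 0.5, 0.5, -0.5]

for theta in (-2.0, 0.0, 2.0):
    probs = nrm_probabilities(theta, a, c)
    print(theta, [round(p, 3) for p in probs])
```

Evaluating the function over a grid of theta values and plotting each category's probability would trace out CRCs of the kind described for Figure 1.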
A reformulation of the NRM was created by Thissen, Steinberg, and Fitzpatrick
(1989) for a more useful interpretation. This version of the NRM lends itself to the
conclusion that a participant endorsing a particular category is choosing category 2
versus category 1 (i.e., “Disagree” versus “Strongly Disagree”), thus indicating that
the participant who endorses category 2 possesses more of the latent trait than a
participant who would endorse category 1. Typically, participants are making a choice
between two adjacent categories rather than considering all of the response options.
Under the NRM, the probability of choosing category $x$ over the adjacent category
$x' = x - 1$ is
$$P_{ix} \mid (x = x \text{ or } x') = \frac{1}{1 + \exp(-a_j^{*}\theta + d_j)} \tag{2}$$
where $a_j^{*} = a_x - a_{x'}$ and $d_j = c_{x'} - c_x$.
Equation 2 implies the probability of endorsing category $x$ is a monotonically
increasing function above and beyond the probability of endorsing category $x'$, when
participants are deciding between two adjacent response options. Endorsing category $x$
is reflective of a higher level of the latent trait than endorsing $x'$. Within the
conditional comparison outlined in Eq. 2, the other response option, $x'$, has a
probability of 1 minus this expression. Here $a_j^{*}$ is the CBD
parameter. It should be noted that CBD parameters can have negative values, which
indicate that the CRCs are not ordered. For example, if the CBD parameter between
‘agree’ and ‘strongly agree’ is -1.59, then participants who ‘agree’ are on the highest end
of the latent trait while participants who ‘strongly agree’ are lower, which confounds the
measurement. The $d_j$ term (Eq. 2) is the intercept, which is used for calculating the
intersection parameters $c_j^{*}$,
$$c_j^{*} = \frac{d_j}{a_x - a_{x'}} \tag{3}$$
The intersection of the CRCs of two adjacent categories is the point on the latent trait
at which the two categories are equally likely; that is, participants at this point on the
latent trait have the same probability of endorsing either category.
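The relationships in Equations 2 and 3 can be sketched in a few lines. The helper name `cbd_and_intersections` and the parameter values below are hypothetical; the sketch simply applies $a_j^{*} = a_x - a_{x'}$, $d_j = c_{x'} - c_x$, and $c_j^{*} = d_j / a_j^{*}$ to each adjacent pair, and then uses Equation 1 to confirm that the two adjacent categories are equally likely at the intersection.

```python
import math

def nrm_probabilities(theta, a, c):
    """Eq. 1: category probabilities for one item."""
    z = [a_x * theta + c_x for a_x, c_x in zip(a, c)]
    z_max = max(z)
    numerators = [math.exp(v - z_max) for v in z]
    total = sum(numerators)
    return [n / total for n in numerators]

def cbd_and_intersections(a, c):
    """Return (CBD, intersection) for each adjacent pair x' = x - 1, x."""
    results = []
    for x in range(1, len(a)):
        a_star = a[x] - a[x - 1]   # CBD parameter, Eq. 2
        d = c[x - 1] - c[x]        # intercept term, Eq. 2
        # Eq. 3; undefined when the two slopes are equal
        intersection = d / a_star if a_star != 0 else float("nan")
        results.append((a_star, intersection))
    return results

# Hypothetical 4-category item (sum-to-zero constraints hold).
a = [-1.5, -0.5, 0.5, 1.5]
c = [-0.5, 0.5, 0.5, -0.5]

for x, (cbd, cross) in enumerate(cbd_and_intersections(a, c), start=1):
    probs = nrm_probabilities(cross, a, c)
    print(f"categories {x - 1} vs {x}: CBD={cbd:.2f}, intersection={cross:.2f}, "
          f"P={probs[x - 1]:.3f} and {probs[x]:.3f}")
```

A negative CBD returned by this function would correspond to the unordered-category case described above.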
Estimation
Traditional estimation of the NRM uses the expectation-maximization (EM)
algorithm (Bock & Aitkin, 1981), which allows estimation of the item parameters when
the shape of the latent trait distribution is known. The EM algorithm is an iterative
process of expectation (E) and maximization (M) steps (see EM and ML section below).
In 1997, Bock incorporated the ML algorithm into the M-step for the NRM. The NRM is
often estimated using the ML algorithm, which assumes that the examinees are independent,
that item responses are independent conditional on the latent trait, and that the
probability distribution of the population of examinees is specified prior to estimation of
the item parameters (Bock & Aitkin, 1981; Bock & Lieberman, 1970). Unfortunately,
computer programs commonly use a normal distribution for estimation in place of a
population latent trait distribution chosen by the researcher. The actual shape of the
latent trait distribution is rarely evaluated and rarely known; expecting a researcher who
is operating within an IRT framework to know it is unrealistic (Preston & Reise, 2013).
The shape of the latent trait distribution is often unknown prior to estimation because
the shape is unobservable. Therefore, improperly imposing a normal prior can bias the
estimates of the item parameters when the latent trait is nonnormally distributed
(Preston & Reise, 2013).
EM and ML
The EM algorithm computes expected vectors of response frequencies $\bar{r}_{ik}$,
$$\bar{r}_{ik} = \sum_{l=1}^{s} \frac{r_l\, \mathbf{x}_{il}\, L_l(X_k)\, A(X_k)}{\tilde{P}_l} \tag{4}$$
iteratively, item by item for the entire instrument represented by Eq. (1) for
mutually exclusive categories. Eq. (4) represents the “expected frequency” of correct
responses for item $i$ at latent trait level $k$ given the response pattern $\mathbf{x}_{il}$,
that is, the expected number of persons at each $\theta$ level for a given item. Here $r_l$ is
the frequency of response pattern $l$ in the sample of N subjects, with $l = 1, 2, \ldots, s$,
where $s$ is the total number of distinct response patterns and $s \le \min(N, S)$ with $S$
possible response patterns; $\mathbf{x}_{il}$ is the response to item $i$ in response pattern
$l$; $L_l(X_k)$ is the likelihood at each point; $A(X_k)$ is the weight of theta at the
quadrature point; and $\tilde{P}_l$ is the marginal probability (Bock & Aitkin, 1981;
Bock, 1997). Currently, $A(X_k)$ is assumed to be normally distributed in computer software.
The expected number of persons endorsing the item at level k is $\bar{N}_k$,
$$\bar{N}_k = \sum_{l=1}^{s} \frac{r_l\, L_l(X_k)\, A(X_k)}{\tilde{P}_l} \tag{5}$$
where $X_q$, $q = 1, 2, \ldots, Q$ (indexed by $k$ in Equations 4 and 5), represents the
quadrature points used for estimation. These equations (4 and 5) provide the expectations
for the estimation equations (E-step). The provisional estimates of item parameters for
the $i^{th}$ item are obtained in the M-step via the Newton-Raphson solution (Bock & Aitkin,
1981). Here, $\gamma_i$ represents the c parameter and $\alpha_i$ represents the a parameter.
The gradient $2(m_i - 1)$ vector is
$$\mathbf{G}\begin{pmatrix}\gamma_i \\ \alpha_i\end{pmatrix} = \sum_{k=1}^{Q} T_i\left[\bar{r}_{ik} - \bar{N}_k P_i(X_k)\right] \otimes \begin{bmatrix}1 \\ X_k\end{bmatrix}, \tag{6}$$
the elements of $P_i(X_k)$, an $m_i$-vector, are the category probabilities for the $i^{th}$
item at $\theta = X_k$. The $T_i$ matrix can incorporate various constraints on the model,
such as constraining all CBD values within an item to be equal, which produces one of
several constrained versions of the NRM, the generalized partial credit model (GPCM;
Muraki, 1992). The resulting products are then summed across the quadrature points. The
Hessian matrix is
$$\mathbf{H}\begin{pmatrix}\gamma_i \\ \alpha_i\end{pmatrix} = -\sum_{k=1}^{Q} \bar{N}_k\, T_i W_i(X_k) T_i' \otimes \begin{bmatrix}1 \\ X_k\end{bmatrix}\begin{bmatrix}1 & X_k\end{bmatrix}, \tag{7}$$
and element $(g,h)$ of $W_i(X_k)$ is computed as $P_{ig}(X_k)[\delta_{gh} - P_{ih}(X_k)]$, where
$\delta_{gh}$ is Kronecker’s delta (Bock, 1975, 1985 p. 524 as cited in Bock, 1997). The M-step
performs probit analysis, which is cumulative, on the results obtained from equations 4
and 5; these estimates are fitted to a probit regression line. These item parameter
estimates are then input into the E-step again, and updated expectations are created for the
next M-step. This process of cycling through the E and M steps continues iteratively
until convergence, which is specified by a predetermined value set by default within the
computer program or by the researcher.
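The E-step quantities in Equations 4 and 5 can be illustrated with a short sketch. Everything named here (`e_step`, `normal_quadrature`, the rectangular grid on [-4, 4], and the item parameters) is a hypothetical choice for illustration rather than the thesis's or any software's implementation; the sketch only accumulates, for each response pattern, its posterior share at every quadrature point.

```python
import math

def nrm_probs(theta, a, c):
    """Eq. 1: category probabilities for one item."""
    z = [a_x * theta + c_x for a_x, c_x in zip(a, c)]
    z_max = max(z)
    e = [math.exp(v - z_max) for v in z]
    total = sum(e)
    return [v / total for v in e]

def normal_quadrature(n_points=21, lo=-4.0, hi=4.0):
    """Equally spaced nodes X_k with normalized standard-normal weights A(X_k)."""
    step = (hi - lo) / (n_points - 1)
    nodes = [lo + k * step for k in range(n_points)]
    dens = [math.exp(-0.5 * x * x) for x in nodes]
    total = sum(dens)
    return nodes, [d / total for d in dens]

def e_step(patterns, counts, items, nodes, weights):
    """Return (r_bar, n_bar): expected category counts (Eq. 4) and
    expected number of persons (Eq. 5) at each quadrature point.

    patterns: distinct response patterns (tuples of category indices);
    counts: frequency r_l of each pattern; items: list of (a, c) pairs.
    """
    n_items, n_nodes = len(items), len(nodes)
    probs = [[nrm_probs(x, a, c) for x in nodes] for a, c in items]
    n_bar = [0.0] * n_nodes
    r_bar = [[[0.0] * len(items[i][0]) for _ in range(n_nodes)]
             for i in range(n_items)]
    for pattern, r_l in zip(patterns, counts):
        # L_l(X_k): likelihood of the whole pattern at each node
        like = [math.prod(probs[i][k][pattern[i]] for i in range(n_items))
                for k in range(n_nodes)]
        # marginal probability of the pattern
        p_marg = sum(l_k * w_k for l_k, w_k in zip(like, weights))
        for k in range(n_nodes):
            share = r_l * like[k] * weights[k] / p_marg
            n_bar[k] += share
            for i in range(n_items):
                r_bar[i][k][pattern[i]] += share
    return r_bar, n_bar

# Hypothetical two-item example with four categories per item.
items = [([-1.5, -0.5, 0.5, 1.5], [-0.5, 0.5, 0.5, -0.5])] * 2
patterns = [(0, 0), (1, 2), (3, 3), (2, 1)]
counts = [5, 3, 7, 2]
nodes, weights = normal_quadrature(11)
r_bar, n_bar = e_step(patterns, counts, items, nodes, weights)
print("total expected persons:", round(sum(n_bar), 6))
```

In a full EM cycle these expected counts would be passed to the M-step (Equations 6 and 7); that step is omitted here.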
Bock and Lieberman (1970) developed the unconditional maximum likelihood
estimation procedure, which integrates over the latent trait distribution and estimates the
item parameters using the maximum likelihood in the marginal distribution. The
advantage of this procedure is that goodness-of-fit tests for the model and standard error
estimates for the item parameters can be calculated. For dichotomous items, the ML
approach to item parameter estimation notes the pattern of responses for each participant
and assigns it to one of $2^n$ mutually exclusive categories for the response pattern.
The frequency of response pattern j occurring in a sample of N subjects is represented
by $r_j$, where j = 1, 2, …, f and f ≤ min(N, F) with F possible response patterns. If N
subjects are randomly sampled from a normally distributed latent trait at the population
level, the number of subjects observed in each category, $r_j = N p_j$, where $p_j$ is the
unconditional probability of observing pattern j, will be multinomially distributed with
parameters $N$ and $P_j$, where $E(p_j) = P_j$. Therefore, the probability of the sample may
be expressed in terms of the item parameters by means of the multinomial law, which, when
interpreted as a function of the parameters, gives the likelihood function (Bock &
Lieberman, 1970) measuring the likelihood of a participant demonstrating a particular
response pattern given all of the possible response patterns
$$L = \frac{N!}{\prod_{j} r_j!} \prod_{j=1}^{2^n} P_j^{\,r_j}. \tag{8}$$
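The likelihood equations for the slope, $a_i$, and intercept, $c_i$, parameters for the
$i^{th}$ item as presented in equation 4 are
$$\frac{\partial \log L}{\partial c_i} = N \sum_{j=1}^{2^n} \frac{p_j}{P_j}\,\frac{\partial P_j}{\partial c_i} = 0, \text{ and} \tag{9}$$
$$\frac{\partial \log L}{\partial a_i} = N \sum_{j=1}^{2^n} \frac{p_j}{P_j}\,\frac{\partial P_j}{\partial a_i} = 0. \tag{10}$$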
Empirical Histograms
The ML algorithm assumes that the data are a random sample
from a known population; thus, the latent trait distribution of theta is known prior to
estimation. In psychological research, the shape of the latent trait distribution is often
unknown prior to estimation of the item parameters, and it is unlikely that the
distribution is normal. However, most IRT software programs impose a normal
distribution for the latent trait, theta, when estimating the item parameters (Preston &
Reise, 2013). Using a normal latent trait distribution for theta is likely to bias the
estimates of the item parameters, which will introduce unnecessary measurement error for
the psychological construct. Previous studies (de Ayala & Sava-Bolesta, 1999; DeMars,
2003; Preston & Reise, 2013) have shown that using a normal distribution for theta
biases the item parameter estimates under the NRM. Alternatively, EH estimation can
be used to estimate the latent trait distribution of theta while
simultaneously estimating the item parameters. FlexMIRT 2.0 (Cai, 2013)
provides researchers an option to use EH estimation, allowing them to specify a
series of equally spaced quadrature points to estimate the latent trait distribution. As the
number of quadrature points increases, the accuracy of the estimation also increases and
the resulting item parameters are improved (Woods, 2007).
This technique estimates the area “under the curve” of the latent trait distribution
using the heights of rectangles determined by the quadrature points. This allows the
researcher to obtain the shape of the latent trait distribution by plotting the estimated
heights of the rectangles. EH estimation occurs in the E-step of the EM
algorithm. As seen in Equation 11, the quadrature weight $A(X_k)$ from Equation 4 is
replaced with $g(X_q)$, which represents the posterior density given the data (Bock &
Aitkin, 1981). Thus, the definite integral is now
$$g(X_q) \cong \frac{\sum_{l=1}^{s} r_l\, L_l(X_q)\, A(X_q)}{\sum_{h=1}^{Q}\sum_{l=1}^{s} r_l\, L_l(X_h)\, A(X_h)}. \tag{11}$$
Equation 11 promises to provide improved estimates of the item
parameters (Bock & Aitkin, 1981). This modified EM algorithm empirically
estimates the density of the latent trait distribution at each quadrature point rather than
imposing the traditionally implemented normal prior. The densities of the latent
trait distribution for theta at the selected points are used as weights for item parameter
estimation in the M-step. These values are then used to calculate updated estimates of the
item parameters.
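A rough sketch of the empirical histogram idea follows. It assumes one common formulation in which the updated weights are the expected counts $\bar{N}_k$ of Equation 5 rescaled to sum to one; that choice, the function name `update_eh_weights`, and the toy inputs are assumptions of this sketch and are not taken from the thesis or from flexMIRT.

```python
def update_eh_weights(likelihoods, counts, weights):
    """One empirical-histogram update of the quadrature weights.

    likelihoods[l][q]: likelihood of response pattern l at quadrature point q;
    counts[l]: frequency of pattern l; weights[q]: current weight at point q.
    Returns new weights proportional to the expected number of persons
    at each point (assumed formulation), normalized to sum to one.
    """
    n_nodes = len(weights)
    n_bar = [0.0] * n_nodes
    for like, r_l in zip(likelihoods, counts):
        # marginal probability of the pattern under the current weights
        p_marg = sum(l_q * w_q for l_q, w_q in zip(like, weights))
        for q in range(n_nodes):
            n_bar[q] += r_l * like[q] * weights[q] / p_marg
    total = sum(n_bar)
    return [n / total for n in n_bar]

# Toy example: three quadrature points, two response patterns.
likelihoods = [[0.6, 0.3, 0.1],   # pattern more likely at low theta
               [0.1, 0.2, 0.7]]   # pattern more likely at high theta
counts = [20, 80]
weights = [0.25, 0.50, 0.25]      # symmetric starting weights
new_weights = update_eh_weights(likelihoods, counts, weights)
print([round(w, 3) for w in new_weights])
```

Because most of the toy sample shows the high-theta pattern, the updated histogram shifts mass toward the upper quadrature point instead of keeping the symmetric starting shape.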
Item Parameter Recovery
Simulation analyses investigating item parameter recovery demonstrate that the
shape of the latent trait distribution, the CBD parameters, the symmetry of the intersection
parameters, and sample size affect the accuracy of the item parameter estimates (de
Ayala & Sava-Bolesta, 1999; DeMars, 2003; Preston & Reise, 2013). De Ayala and
Sava-Bolesta (1999) investigated the accuracy of item parameter estimation under the
NRM, examining maximum item information and item parameter recovery. Results
indicated that bias in item parameter recovery increased as the sample size to parameter
ratio decreased. Moreover, the CBD parameters were biased when the latent trait
distribution was manipulated to exhibit extreme skewness and kurtosis. To decrease the
severity of nonnormality, de Ayala and Sava-Bolesta (1999) recommended increasing the
sample size in hopes of increasing the number of participants endorsing less popular
response options, which can influence the nonnormality of the latent trait.
Item parameter recovery under the NRM was also evaluated by DeMars (2003);
these results suggest that fewer response options would lead to less error variance in the
item parameter estimates. A skewed latent trait distribution biases item
parameter recovery. DeMars suggests that the ratio of sample size to the number of
response options is a large factor in item parameter accuracy because the error variance
of the item parameter estimates increases as the number of response options increases.
The distance between the intersection parameters also affected item parameter recovery,
especially when the latent trait distribution was nonnormal, because the distances between
the intersections are partly determined by the frequency of the responses for each category.
Preston and Reise (2013) specifically investigated within-item CBD variation,
sample size, and shape of the latent trait distribution. Results indicate that ignoring the
shape of the latent trait distribution biases the item parameter estimates when
compared to the generated item parameters. The bias at the test level was computed
using the difference between the true and estimated values. Conditions that ignore the
true shape of the latent trait distribution demonstrate this bias: the negative extreme of
$\theta$ is underestimated while the positive extreme is overestimated (Preston & Reise, 2013).
Woods (2007) tested item parameter recovery using a simulation study
under the graded response model with standard normal, near normal, and two skewed
conditions representing the distribution of the latent trait, theta. Woods concluded that
item parameter recovery is less sensitive to near normal data than to the two
extremely skewed conditions. The sensitivity of IRT item parameters to nonnormally
distributed data has not been investigated thoroughly for latent traits that span normal to
extremely nonnormal distributions. NRM item parameter estimates for normal to
extremely nonnormal latent trait distributions have not been presented in the
literature. This knowledge will help researchers identify when ML and
EH estimation should be implemented for the most accurate calculations of item
parameters.
Research Design
To demonstrate the effects of nonnormally distributed latent traits, a brief
simulation study was conducted to investigate the inaccuracy of item parameter recovery.
For this brief demonstration, responses to a test of ten items with 4 response categories
each were generated for 1,000 participants, and CBD parameters were allowed to vary within
an item over a specific range of 0.50-1.50 or 0.75-1.25. The shape of the latent trait
distribution was simulated as normal (skew = 0, kurtosis = 0), near normal (skew = 0.25,
kurtosis = 0.75), moderately nonnormal (skew = 0.75, kurtosis = 1.75), highly nonnormal
(skew = 1.25, kurtosis = 2.75), and extremely nonnormal (skew = 1.50, kurtosis = 3.75).
Data for 10 conditions were simulated and estimated under the NRM using ML
estimation, ignoring the true shape of the latent trait distribution, for 1,000 datasets each.
The proportion of converged iterations was assessed. Item parameter recovery was
investigated using the average absolute bias of the CBD parameters and the average root
mean square error (RMSE) over items and replications. RMSE is calculated as
$RMSE = \sqrt{(\Theta - \theta)^2}$, where $\Theta$ represents the true CBD parameter and
$\theta$ represents the estimated CBD parameter. This paper refers to bias as an assessment
of the effect of a normal distribution being imposed on a nonnormally distributed latent
trait, theta. When theta is normally distributed, the expected value for bias should be very
close to zero. An increase in bias indicates that the estimated CBD parameters differ from
the true, known CBD parameters when the shape of the latent trait distribution is ignored.
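The two outcome measures can be sketched as below. The sketch assumes that absolute bias is the absolute value of the mean signed error per parameter and that RMSE is the square root of the mean squared error per parameter, each then averaged over parameters; these aggregation choices, the function name `recovery_summary`, and the toy values are assumptions for illustration, not the thesis's code or results.

```python
import math

def recovery_summary(true_values, estimates):
    """Average absolute bias and average RMSE across parameters.

    true_values[p]: generating value of parameter p;
    estimates[r][p]: estimate of parameter p in replication r.
    """
    n_reps = len(estimates)
    abs_bias, rmse = [], []
    for p, true in enumerate(true_values):
        errors = [rep[p] - true for rep in estimates]
        abs_bias.append(abs(sum(errors) / n_reps))              # |mean signed error|
        rmse.append(math.sqrt(sum(e * e for e in errors) / n_reps))
    n_params = len(true_values)
    return sum(abs_bias) / n_params, sum(rmse) / n_params

# Toy example: two CBD parameters, three replications (made-up numbers).
true_cbd = [1.00, 0.75]
estimated = [[1.10, 0.70],
             [0.95, 0.80],
             [1.05, 0.78]]
bias, rmse = recovery_summary(true_cbd, estimated)
print(f"average absolute bias = {bias:.3f}, average RMSE = {rmse:.3f}")
```

Under this definition, estimates that scatter symmetrically around the true value give near-zero bias but a nonzero RMSE, which is why both measures are reported.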
The demonstration results are presented in Table 1. The first three columns
indicate the values of skewness and kurtosis for the latent trait distribution and the range
of the CBD parameters. The remaining columns present the bias of the CBD
parameters, the RMSE, and the proportion of converged iterations. When the true shape of
the latent trait distribution is normal, the recovered CBD parameters are highly
accurate. Alternatively, as the latent trait distribution deviates further from normality,
the CBD parameter estimates become increasingly biased. For both ranges of
the estimated CBDs, the bias for the CBD parameters is largest when the latent trait
distribution is the most extreme. The smaller of the two ranges for the CBDs exhibits a
larger jump in bias for CBD parameters when the latent trait distribution is slightly
nonnormal (0.014) compared to the wider range of CBD parameters (0.008).
Therefore, the purpose of this research is to assess the sensitivity of item
parameter recovery as estimated under the NRM for latent trait distributions that are
skewed, kurtotic, or mixtures of skewness and kurtosis. It is hypothesized that item
parameter recovery for IRT models using ML estimation will be biased and that the use of
EH estimation will aid in the correction of these biases. Conditions combining normal or
near normal distributions with large sample sizes will be the most robust, with the
highest level of item parameter recovery for ML and EH estimation. Additionally, item
parameter recovery will improve as the sample size increases for EH estimation.
CHAPTER 2
Methods
Design
The simulation study manipulates multiple variables to investigate the
robustness of item parameter recovery under nonnormally distributed latent traits. Based
on previous research and the aforementioned demonstration, the following variables are
included in this study: 1) within-item CBD variation, 2) sample size, and 3) latent trait
distribution. All conditions were simulated with 10 items, each with 4 response
categories. The distances between the CBD parameters were manipulated based on the
latent trait distribution. The null condition for this study is a normally distributed
latent trait, with no skewness or kurtosis.
Factors Influencing Item Parameter Recovery
Latent Trait Distributions
The main focus of the present study is item parameter recovery, under the NRM,
when the latent trait distribution is nonnormal. The latent trait distribution can be
affected by skewness and kurtosis independently and in combination. Since positively
skewed and negatively skewed data are reflections of each other, only positively skewed
distributions are investigated. However, because leptokurtic and platykurtic
distributions do not share the same mirror-image relationship, both shapes are included. To
generate the shape of the latent trait distribution, Fleishman's (1978) power
method weights were used to create specific mixtures of skewness and kurtosis.
Fleishman includes a table of weights (a, b, c, and d) to apply to the equation
$Y = a + bX + cX^2 + dX^3$ to simulate nonnormal distributions. Fleishman acknowledges that no
solution exists for certain combinations of skewness and kurtosis using the power
method, so kurtosis values ≤ -1.20 were not obtained.
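The power transformation can be sketched as follows. This is an illustrative sketch rather than the generation code used in the study: the function names are hypothetical, the moment equations are the standard ones associated with the power method (with a = -c so the transformed variable has mean zero), and the example weights are used only because they satisfy those equations for skewness 1.75 and excess kurtosis 3.75, which is not one of the conditions in this study.

```python
import random

def power_transform(x, a, b, c, d):
    """Apply Y = a + bX + cX^2 + dX^3 to a standard normal deviate x."""
    return a + b * x + c * x ** 2 + d * x ** 3

def implied_moments(b, c, d):
    """Variance, skewness, and excess kurtosis implied by weights b, c, d (a = -c)."""
    variance = b ** 2 + 6 * b * d + 2 * c ** 2 + 15 * d ** 2
    skewness = 2 * c * (b ** 2 + 24 * b * d + 105 * d ** 2 + 2)
    kurtosis = 24 * (
        b * d
        + c ** 2 * (1 + b ** 2 + 28 * b * d)
        + d ** 2 * (12 + 48 * b * d + 141 * c ** 2 + 225 * d ** 2)
    )
    return variance, skewness, kurtosis

def simulate_theta(n, b, c, d, seed=0):
    """Draw n nonnormal latent trait values by transforming standard normal draws."""
    rng = random.Random(seed)
    return [power_transform(rng.gauss(0.0, 1.0), -c, b, c, d) for _ in range(n)]

# Illustrative weights only (skewness 1.75, excess kurtosis 3.75); not a study condition.
B, C, D = 0.92966052480111, 0.39949667453766, -0.03646699281275
variance, skewness, kurtosis = implied_moments(B, C, D)
theta = simulate_theta(1000, B, C, D)
```

Checking `implied_moments` before simulating is a cheap way to confirm that a set of weights actually targets the intended skewness and kurtosis; with b = 1 and c = d = 0 the transformation leaves a standard normal variable unchanged.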
However, since Fleishman's table does not include power weights for platykurtic
distributions, this researcher will create the remaining distributions needed for the simulation using
mixtures of normal distributions. To simulate conditions researchers may realistically
encounter with applied data, the study includes the influences of skewness and kurtosis
independently and in combination. A series of no skew (skew = 0), slight skew (skew =
.25), moderate skew (skew = .75), high skew (skew = 1.25), and extreme skew (skew =
1.50) conditions is used. For the kurtosis conditions, a similar framework of no kurtosis
(kurtosis = 0), slight kurtosis (kurtosis = .75), moderate kurtosis (kurtosis = 1.75), high
kurtosis (kurtosis = 2.75), and extreme kurtosis (kurtosis = 3.75) will be implemented. All
plausible distributions can be seen in Table 2, along with indicators of which distributions
will be created by this researcher.
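As a rough sketch of the mixture approach, the snippet below draws values from a two-component mixture of normal distributions and checks the resulting sample skewness and excess kurtosis. The component weights, means, and standard deviations are hypothetical and are not the values used to build the distributions in Table 2; the function names are likewise illustrative.

```python
import random

def sample_mixture(n, components, seed=0):
    """Draw n values from a mixture of normals.

    components: list of (weight, mean, sd) tuples; weights should sum to 1.
    """
    rng = random.Random(seed)
    weights = [w for w, _, _ in components]
    draws = []
    for _ in range(n):
        _, mu, sd = rng.choices(components, weights=weights, k=1)[0]
        draws.append(rng.gauss(mu, sd))
    return draws

def skew_and_excess_kurtosis(values):
    """Moment-based sample skewness and excess kurtosis (a normal gives 0 and 0)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical mixture: a large component on the left and a smaller, wider one on the right.
theta = sample_mixture(5000, [(0.8, -0.3, 0.8), (0.2, 1.2, 1.2)])
skew, kurt = skew_and_excess_kurtosis(theta)
```

In practice the component parameters would be tuned until the sample moments match the target skewness and kurtosis for a given condition.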
Sample Size
Based on research by Preston & Reise (2013), de Ayala & Sava-Bolesta (1999),
and DeMars (2003), three sample size conditions were chosen: N = 250, N = 500, and N =
2,000. The two smaller sample sizes are more representative of a
psychological study, and the largest is representative of educational testing (Preston &
Reise, 2013).
CBD Parameters
The CBD parameters will be allowed to vary from .3 to 3.5, spanning low
to very high discrimination between ordered response categories. This range,
adapted from Preston & Reise (2013), simulates realistic CBD conditions.
These values are randomly generated from a uniform distribution.
Data Generation
All conditions presented in Table 2 will be generated under the NRM, and item
parameters will be estimated using the NRM as implemented in FlexMIRT 2.0. Each test
condition will have 1,000 datasets generated.
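To make the generation step concrete, the sketch below computes category probabilities for one item under a slope-intercept form of the NRM and samples responses. It is not the FlexMIRT procedure used in the study: the intercept symbol `c`, the item values, and the function names are assumptions, and the CBD parameters are taken here to be differences between adjacent category slopes.

```python
import math
import random

def nrm_probabilities(theta, a, c):
    """Category probabilities for one item given per-category slopes a and intercepts c."""
    logits = [a_k * theta + c_k for a_k, c_k in zip(a, c)]
    top = max(logits)  # subtract the maximum for numerical stability
    expo = [math.exp(z - top) for z in logits]
    total = sum(expo)
    return [e / total for e in expo]

def cbd_parameters(a):
    """Differences between adjacent category slopes (assumed CBD definition)."""
    return [a[k] - a[k - 1] for k in range(1, len(a))]

def simulate_item(thetas, a, c, seed=0):
    """Sample one categorical response per simulee for a single item."""
    rng = random.Random(seed)
    categories = list(range(len(a)))
    return [
        rng.choices(categories, weights=nrm_probabilities(t, a, c), k=1)[0]
        for t in thetas
    ]

# Hypothetical 4-category item; these slopes and intercepts are illustrative only.
a = [0.0, 1.0, 2.5, 3.0]
c = [0.0, 0.5, 0.2, -0.8]
probs = nrm_probabilities(0.0, a, c)
cbds = cbd_parameters(a)
responses = simulate_item([-1.0, 0.0, 1.0], a, c)
```

Repeating `simulate_item` over 10 items and a full sample of latent trait values would give one simulated dataset for a condition.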
Model
Each test condition will be estimated using ML and EH estimation techniques
with the program defaults for maximum likelihood estimation,
specifying the estimation of the Fisher information function using 801 quadrature points
for stability of the standard errors, and expected a posteriori (EAP) theta score estimation,
within FlexMIRT 2.0. The a parameters will be used to calculate the CBD parameters
using the equation for $a_k^*$ as described in the NRM section.
Outcome Measure
The proportion of converged replications will be calculated for each test condition.
The CBD parameter recovery will be evaluated (de Ayala & Sava-Bolesta, 1999; Woods,
2006) by computing the mean and standard deviation of the recovered parameters, the
average of the absolute bias averaged over items and replications, and RMSE averaged
over items and replications. The recovered latent trait distribution will be compared to
the generated latent trait distribution by calculating the bias using the quadrature points
from the EH estimation output.
CHAPTER 3
Results
The results of the simulation study are presented in Tables 3 through 6, where each
table presents the results for all conditions on one outcome measure. The first two
columns of these tables list the skewness and kurtosis of the generated latent trait
distribution. The next section of each table reports the outcome measure for ML
and EH estimation. Under each estimation technique, the sample sizes (250, 500,
2,000) are grouped together. Each outcome measure will be discussed separately. The
first row of each table represents a normal latent trait distribution (skewness = 0 and
kurtosis = 0). The rows of these tables can be thought of as blocks grouped by
kurtosis. Each block is 5 rows long, with skewness increasing while
kurtosis is held constant. This is so that the reader can more easily navigate the
influences of skewness and kurtosis in combination and in the absence of one another.
Descriptive statistics
Table 3 reports the proportion of converged replications for each condition, which was
generally excellent. The proportion of converged replications was high over all conditions,
with a minimum of 0.89 occurring at a sample size of 250 when
skewness = 1.5 and kurtosis = 2.75 and when skewness = 1.25 and kurtosis = 3.75. As the sample size
increases, the proportion of converged replications generally increases. The
largest sample size displays a proportion of 1.0 for nearly all of the latent trait
distributions.
Table 4 contains the means and standard deviations of the recovered CBD
parameters. For CBD parameters generated between .3 and 3.5, the expected values
of the mean and standard deviation are 1.9 and .9, respectively.
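These expected values follow from the moments of a uniform distribution, assuming the CBD parameters (written here as $a^*$) are drawn from a uniform distribution on [.3, 3.5] as described in the Methods:

```latex
\[
E(a^*) = \frac{0.3 + 3.5}{2} = 1.9,
\qquad
SD(a^*) = \frac{3.5 - 0.3}{\sqrt{12}} \approx 0.92 .
\]
```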
ML Estimation. The small sample size condition (N = 250) displays means and
standard deviations that are relatively close to the expected values until the latent trait
distribution becomes more nonnormal. When skewness = 1.25 and kurtosis = 1.75, the
mean of the CBD parameters is underestimated (M = 1.720, SD = 1.183). As the distribution
deviates further from normality, the CBD parameters continue to be underestimated. There
are two instances in which the standard deviation of the recovered CBD parameters is much
higher than anticipated (SD = 1.400 and SD = 2.314); for these distributions kurtosis =
2.75 and skewness = 1.25 or 1.5, respectively.
The increase in sample size to 500, using ML estimation, provides a loose extension of
these results, given that the mean and standard deviation begin to show smaller estimates
when skewness = 1.25 and kurtosis = 1.75. This particular distribution also produces the
largest value for the standard deviation (SD = 2.492). The remaining combinations of
skewness and kurtosis display a nonlinear pattern of increasing then decreasing mean
and standard deviation values within each grouping. The large sample size (N = 2,000) is
an extension of the sample size of 500. The lowest mean value for the CBD parameters occurs
when skewness = 1.5 and kurtosis = 2.75 (M = 1.589).
EH Estimation. The use of EH estimation for the small sample size produced
standard deviations that are highly biased when the latent trait distribution has skewness
of 0.25, 0.75, 1.25, and 1.5 while kurtosis is zero (SD = 4.189, 3.952, 3.457, and 3.442,
respectively). These standard deviations are too large
and fall outside the range of what is possible, given that the true CBD parameters were never
larger than 3.5. Increasing the sample size to 500 and 2,000 provides estimates that
are closer to the expected values of the CBD parameters. The estimates for these sample
sizes are appropriately within the range of realistic CBD parameter estimates.
CBD parameter recovery
Absolute bias. Table 5 displays the absolute bias in CBD parameter recovery for
each condition. Bias is an assessment of the effect of a normal distribution being
imposed on a nonnormally distributed latent trait, theta. When theta is normally
distributed, the expected value for bias should be very close to zero. An increase in bias
indicates that the estimated CBD parameters differ from the true, known CBD
parameters when the shape of the latent trait distribution is ignored.
ML Estimation. As expected, parameters are less biased under the larger sample
size condition (mean bias = -0.008) than the small sample size condition (mean bias =
-0.246). Focusing on the smallest sample size, the bias gradually increases as the latent trait
distribution deviates further from normality. Bias increases faster as
kurtosis increases with skewness held at zero than as skewness increases with
kurtosis held at zero. Bias is the largest when skewness = 1.5 and kurtosis = 2.75 (mean
bias = -0.246). Ignoring the nonnormality produces considerable negative bias once the
kurtosis value reaches 1.75 when paired with skewness = 1.25.
Increasing the sample size to 500 reduces the amount of bias for latent trait
distributions that are close to normal (skewness < .75 and kurtosis < .75).
However, when skewness > .75 and kurtosis > 1.75 there is an increase in bias compared
to the sample size of 250. The largest sample size is an extension of these results: the
downward bias increases with the sample size, with the exception of the normal
distribution and the near-normal condition of skewness = .25 and kurtosis = 0.
EH Estimation. For the sample size of 250, the unexpected result of the normal latent trait
distribution having a mean bias of .242 falls in line with the trend of the values for this sample size:
they are large. The few exceptions to the pattern are when skewness = 1.25 and
kurtosis = 1.75 (mean bias = -0.008), skewness = 1.25 and kurtosis = 2.75 (mean bias =
0.049), and skewness = 1.25, 1.5 and kurtosis = 3.75 (mean bias = 0.083, -0.038). As the
sample size increases to 2,000, the bias decreases to values ranging from -0.093 to
0.004.
RMSE. Table 6 displays the RMSE of the CBD parameters for each condition,
with lower values indicating more accurate parameter estimates. As with the
absolute bias of the CBD parameters, RMSE is smaller under the largest sample size condition than
under the smallest sample size condition.
ML Estimation. For these results, the focus will be on the sample size of 2,000
first for simplicity. The RMSE was small for the normal distribution condition
(skewness = 0 and kurtosis = 0; mean RMSE = 0.183), indicating the CBD
parameters were most accurately recovered when estimated under the correct
distribution, which is contrary to the results summarized using absolute bias. The RMSE
increases as the latent trait distribution shape deviates further from normality. The largest
values for RMSE occur when skewness > 1.25 and kurtosis > 1.75. Decreasing the
sample size to 500 and 250 amplifies the pattern of results seen for 2,000 participants.
EH Estimation. The sample size of 2,000 produced a narrow range of RMSE
values, from 0.205 to 0.314 across all conditions. The bias for these conditions
was low, which appears contradictory when compared with the RMSE results. As
the sample size decreases, the RMSE increases across all conditions, with a
maximum value (mean RMSE = 4.027) that is well outside the range of tolerable
values for CBD parameter recovery. These contrary results between the absolute bias and
the RMSE outcome measures can be explained by the way the two outcome measures
are calculated. The absolute bias was determined from the average deviation between the
true CBD parameters and the estimated CBD parameters, whereas RMSE was computed
from the square root of the average squared deviations. Therefore, the sign of the
deviations was taken into account when calculating the absolute bias, so positive and
negative deviations could cancel, whereas the squared deviations used when calculating the RMSE could not.
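One way to make this relationship concrete is the standard decomposition of mean squared error, written here with $e_i = \hat{\Theta}_i - \Theta_i$ denoting the deviation for a single item and replication and $\bar{e}$ denoting the mean deviation (this notation is introduced only for illustration):

```latex
\[
\mathrm{RMSE}^2 = \frac{1}{n}\sum_{i=1}^{n} e_i^2
= \bar{e}^{\,2} + \frac{1}{n}\sum_{i=1}^{n}\left(e_i - \bar{e}\right)^2 .
\]
```

The first term is the squared bias and the second is the variance of the deviations, so a condition can show bias near zero and still have a large RMSE when the estimates are highly variable.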
CHAPTER 4
Discussion
IRT models are on the rise as a method of constructing and evaluating
psychological measurements. This increasing popularity includes the use of IRT models
that factor in the response format to study psychological latent traits that are
likely not normally distributed in the population (Preston & Reise, 2013). Psychological
assessments are being used to make high-stakes decisions about individuals, such as
clinical diagnoses or special education (Preston & Reise, 2013). The scores derived
from surveys are informative of life satisfaction, stress, and affect, which are highly
important when piecing together a portrait of an individual for therapy. As noted in the
introduction, the inherent nonnormality of these constructs is frequently ignored during
estimation using the ML/EM estimation algorithm, potentially leading to incorrect
conclusions about an individual. These results indicate that the accuracy of the estimated
CBD parameters varies depending on the combination of skewness and kurtosis,
sample size, and estimation method used to calculate the CBD parameters.
Overall, the effects of ignoring nonnormality in the distribution of the latent trait
under the NRM are seen in this paper. To investigate the effects of nonnormality on
CBD parameter estimation and latent trait distribution recovery, data were generated
under normal, skewed, kurtotic, and combinations of skewed and kurtotic distributions.
These data were estimated using ML estimation in FlexMIRT 2.0, which does not take the
shape of the true latent trait distribution into account while calculating the parameters for
the NRM. The same data were estimated using EH, which does account for the shape of
the latent trait while calculating the CBD parameters and updates the shape with every
iteration. Results showed that CBD parameters were very inaccurate and downwardly
biased, leading to large differences between estimated and true item parameters when
using ML estimation.
The effects of ignoring nonnormality followed a nonlinear
pattern of results. Specifically, some combinations of skewness and kurtosis were less
biased even when the values of skewness and kurtosis were high in combination. The
pattern follows a cubic shape at the extreme end of the spectrum, which will be a
topic of further investigation. Inaccuracies in the CBD parameter estimates were magnified
under a small sample size. Overall, the consequences of ignoring nonnormality while
using ML estimation in both the CBD parameter estimates and the θ estimates were
moderate to severe. The estimated CBD parameters represent the end result of combining
a distribution (either imposed by the computer software or estimated through
EH estimation) with the generated survey data. The ability of the data to overcome the use
of a normal distribution during the calculations is seen in the differences between the true and
estimated CBD parameters.
New developments in IRT estimation procedures, such as EH estimation (Cai,
2013) in FlexMIRT 2.0, are valuable tools that potentially allow for improvements in
the estimation of CBD parameters and θ estimates, as seen here. The purpose of this
research was to evaluate item parameter recovery and latent trait distribution recovery
using FlexMIRT's implementation of EH estimation with the NRM, a lateral extension
of Preston and Reise (2013). To study the improvement of estimation when accounting
for the shape of the latent trait distribution, normal, skewed, kurtotic, and
combined skewed and kurtotic distributions were estimated. The CBD parameter
and θ estimate recovery was improved when using EH. Specifically, an increase in
sample size when the latent trait is nonnormally distributed will result in
stable CBD parameters. These estimated parameters increase in accuracy as sample size
increases when EH estimation is implemented. This stresses the need for an increased
sample size to be able to accommodate EH estimation. Results from EH estimation for
nonnormal latent trait distributions with sample sizes under or around 500 should be
interpreted with extreme caution. The sample size of 500 performs well for absolute bias,
mean, and standard deviation values, yet for RMSE a sample size of
2,000 outperforms it, which is consistent with Woods (2006). When researchers
acknowledge the shape of the latent trait distribution, they can improve
estimation precision. Correctly recovering the θ estimates leads to increased accuracy in
recovering the CBD parameter estimates. EH estimation outperformed ML for
CBD parameter recovery when sample sizes were 500 and 2,000. The focus on CBD
parameter recovery is due to the decision participants are making to endorse a particular
category. The distinction between these categories should be clear (higher CBD
parameter) rather than the probability curves for each category overlapping each other
(lower CBD parameter).
Based on these results, it is recommended that EH estimation be used when a researcher
considers that the construct being measured has the potential to be severely nonnormally
distributed and the sample size is ≥ 2,000. If previous literature indicates that a latent trait
deviates only slightly from normality, then it would be reasonable to
use ML estimation for sample sizes of 500. The cost of EH estimation is computation time
and sample size. These results are an extension of the previous literature
in education, where IRT grew up, and by psychometricians of the past decade. IRT was
developed for the field of education, where sample sizes are most often over 10,000.
Therefore, the results of this research project were not unexpected. Extending
EH estimation to psychological research with small sample sizes is not attainable.
This presents a common problem most researchers face: the need for more data and
participants. However, this need can be minimized by the advancement of statistical
techniques and by understanding how these techniques affect the data we analyze. These types
of investigations are important, but quantitative researchers need to make these
advancements in techniques attainable for applied researchers. The field of psychology
can only move forward if we can all understand and learn from the applied and
quantitative sides of the research cycle.
Future research can test software being developed to implement Ramsay curve
item response theory (RC-IRT; Woods & Thissen, 2006) for nonnormal latent trait
distributions. Advancements in IRT software, such as EQSIRT (Multivariate
Software, 2010), will allow this future research to be conducted. RC-IRT estimates the
shape of the latent trait distribution, similar to EH estimation, while estimating the CBD
parameters. The estimation of the latent trait distribution is controlled by the user by
manipulating the splines the program uses to measure the shape of the latent trait.
Furthermore, future research should also consider the type of distribution
implemented in the first iteration of the calculations. Based on the Bayesian principle of
stable estimation, the data will overcome the distribution that is given in the computer
program and converge on an answer even if it is wrong. However, will
convergence be faster and more reliable if researchers provide a more realistic guess of
the shape of the latent trait distribution? This would be a vein of further exploration to see
if researchers in psychology would be able to use this technique to “get around” the large
sample sizes needed for accurate CBD parameter estimation.
Table 1
Demonstration of Item Parameter Recovery
Latent Trait Distribution
Skew   Kurtosis   CBD Range   Bias    RMSE    Proportion Converged
0      0          0.50-1.50   0.007   0.163   1.00
0.25   0.75       0.50-1.50   0.008   0.165   1.00
0.75   1.75       0.50-1.50   0.037   0.192   1.00
1.25   2.75       0.50-1.50   0.086   0.254   1.00
1.5    3.75       0.50-1.50   0.113   0.296   1.00
0      0          0.75-1.25   0.008   0.157   1.00
0.25   0.75       0.75-1.25   0.014   0.159   1.00
0.75   1.75       0.75-1.25   0.046   0.191   1.00
1.25   2.75       0.75-1.25   0.083   0.25    1.00
1.5    3.75       0.75-1.25   0.119   0.281   1.00
Table 2
Test Conditions for the Latent Trait Distributions
                   Kurtosis
                   Extreme   High   Moderate   Slight   None
Skewness           3.75      2.75   1.75       0.75     0
None       0       x         x      x          x        x
Slight     0.25    x         x      x          x        x
Moderate   0.75    x         x      x          x        x
High       1.25    x         x      x          *        *
Extreme    1.5     x         x      *          *        *
Note: x = Fleishman's power weights; * = generated by this researcher.
Table 3
Average Proportion of converged conditions
Converged, All Conditions
Skewness   Kurtosis   N = 250   500    2,000
0          0          0.95      1.0    1.0
0.25       0          0.91      1.0    1.0
0.75       0          0.93      0.96   1.0
1.25       0          0.93      0.97   1.0
1.5        0          0.91      0.97   1.0
0          0.75       0.98      1.0    1.0
0.25       0.75       0.95      0.98   1.0
0.75       0.75       0.99      0.98   1.0
1.25       0.75       0.98      0.96   0.98
1.5        0.75       0.95      0.94   1.0
0          1.75       0.94      0.98   1.0
0.25       1.75       0.96      1.0    1.0
0.75       1.75       0.93      0.98   1.0
1.25       1.75       0.94      0.94   1.0
1.5        1.75       0.93      0.95   1.0
0          2.75       0.95      0.97   1.0
0.25       2.75       0.94      0.98   1.0
0.75       2.75       0.96      0.98   1.0
1.25       2.75       0.95      0.99   1.0
1.5        2.75       0.89      0.95   0.98
0          3.75       0.95      1.0    1.0
0.25       3.75       0.93      0.98   1.0
0.75       3.75       0.97      0.98   1.0
1.25       3.75       0.89      0.97   1.0
1.5        3.75       0.98      0.99   1.0
Table 4
Average CBD parameter recovery and CBD parameter standard deviation for ML and EH estimation
CBD Mean (SD)
ML Estimation EH estimation
Skewness Kurtosis N = 250 500 2,000 250 500 2,000
0 0 1.992 (1.198) 1.965 (1.031) 1.927 (0.953) 2.131 (1.489) 2.011 (1.099) 1.935 (0.964)
0.25 0 1.981 (1.181) 1.915 (1.022) 1.911 (0.958) 2.200 (4.189) 1.958 (1.122) 1.919 (0.960)
0.75 0 1.894 (1.250) 1.883 (1.104) 1.845 (1.014) 2.015 (3.952) 1.934 (1.068) 1.911 (0.955)
1.25 0 1.877 (1.265) 1.842 (1.113) 1.800 (1.147) 2.004 (3.457) 1.910 (1.003) 1.937 (0.941)
1.5 0 1.756 (1.408) 1.859 (1.167) 1.802 (1.211) 1.997 (3.442) 1.879 (1.207) 1.944 (0.925)
0 0.75 1.963 (1.099) 1.886 (0.989) 1.834 (0.913) 2.133 (1.324) 1.987 (1.092) 1.878 (0.950)
0.25 0.75 1.928 (1.138) 1.905 (0.991) 1.827 (0.910) 2.102 (1.637) 1.988 (1.095) 1.869 (0.938)
0.75 0.75 1.884 (1.167) 1.877 (1.055) 1.851 (0.950) 2.051 (3.471) 1.942 (1.105) 1.899 (0.934)
1.25 0.75 1.877 (1.205) 1.779 (1.210) 1.863 (0.978) 2.195 (2.849) 1.954 (1.119) 1.921 (0.948)
1.5 0.75 1.760 (1.156) 1.898 (1.103) 1.871 (1.006) 2.195 (2.951) 1.982 (1.478) 1.871 (0.978)
0 1.75 1.846 (1.057) 1.854 (0.962) 1.794 (0.862) 2.022 (1.255) 1.987 (1.082) 1.889 (0.929)
0.25 1.75 1.855 (1.079) 1.829 (0.955) 1.772 (0.866) 2.083 (1.620) 1.972 (1.091) 1.858 (0.926)
0.75 1.75 1.862 (1.080) 1.836 (0.982) 1.805 (0.910) 2.042 (1.448) 1.948 (1.072) 1.897 (0.949)
1.25 1.75 1.720 (1.183) 1.733 (2.492) 1.643 (1.008) 1.857 (1.290) 1.891 (1.568) 1.807 (0.938)
1.5 1.75 1.898 (1.089) 1.702 (2.170) 1.734 (0.958) 1.964 (1.374) 1.875 (1.423) 1.821 (0.954)
0 2.75 1.812 (1.041) 1.731 (0.903) 1.703 (0.850) 2.122 (2.987) 1.889 (1.026) 1.816 (0.926)
0.25 2.75 1.798 (1.046) 1.761 (0.911) 1.724 (0.848) 2.037 (1.619) 1.940 (1.067) 1.845 (0.927)
0.75 2.75 1.809 (1.060) 1.767 (0.934) 1.714 (0.861) 2.026 (1.533) 1.920 (1.064) 1.840 (0.933)
1.25 2.75 1.772 (1.400) 1.750 (0.994) 1.725 (0.898) 1.942 (2.865) 1.888 (1.055) 1.880 (0.921)
1.5 2.75 1.638 (2.314) 1.638 (1.098) 1.589 (1.015) 1.768 (1.604) 1.816 (1.055) 1.817 (0.930)
0 3.75 1.778 (1.013) 1.699 (0.902) 1.699 (0.821) 2.040 (1.449) 1.864 (1.029) 1.851 (0.918)
0.25 3.75 1.758 (0.993) 1.703 (0.898) 1.669 (0.829) 2.040 (1.497) 1.888 (1.049) 1.815 (0.923)
0.75 3.75 1.780 (1.067) 1.730 (0.915) 1.680 (0.839) 2.073 (1.638) 1.911 (1.102) 1.830 (0.926)
1.25 3.75 1.744 (1.031) 1.707 (0.945) 1.686 (0.884) 1.998 (2.154) 1.843 (0.996) 1.853 (0.936)
1.5 3.75 1.709 (1.156) 1.694 (1.010) 1.641 (0.921) 1.852 (1.256) 1.832 (1.008) 1.823 (0.910)
Table 5
CBD Absolute bias
Bias in item parameters
                      Normal Prior                 EH estimation
Skewness   Kurtosis   N = 250   500      2,000     N = 250   500      2,000
0          0          0.103     0.048    0.008     0.242     0.094    0.017
0.25       0          0.078     0.027    0.011     0.297     0.07     0.019
0.75       0          0.016     -0.019   -0.062    0.137     0.032    0.004
1.25       0          0.021     -0.031   -0.042    0.124     0.024    0.01
1.5        0          0.028     -0.035   -0.027    0.097     0.017    0.015
0          0.75       0.04      -0.008   -0.54     0.21      0.093    -0.009
0.25       0.75       0.026     -0.012   -0.048    0.2       0.071    -0.05
0.75       0.75       -0.002    -0.024   -0.061    0.165     0.042    -0.013
1.25       0.75       -0.015    -0.027   -0.074    0.147     0.035    -0.02
1.5        0.75       -0.04     -0.031   -0.079    0.142     0.027    -0.024
0          1.75       -0.039    -0.089   -0.108    0.137     0.045    -0.13
0.25       1.75       -0.031    -0.83    -0.119    0.196     0.06     -0.033
0.75       1.75       -0.058    -0.072   -0.102    0.122     0.04     -0.01
1.25       1.75       -0.145    -0.147   -0.216    -0.008    0.011    -0.051
1.5        1.75       -0.178    -0.152   -0.237    -0.001    0.021    -0.037
0          2.75       -0.1      -0.137   -0.164    0.21      0.02     -0.051
0.25       2.75       -0.093    -0.136   -0.164    0.146     0.044    -0.043
0.75       2.75       -0.11     -0.132   -0.171    0.106     0.022    -0.044
1.25       2.75       -0.122    -0.146   -0.198    0.049     -0.007   -0.043
1.5        2.75       -0.246    -0.276   -0.32     -0.115    -0.097   -0.093
0          3.75       -0.131    -0.2     -0.213    0.131     -0.035   -0.062
0.25       3.75       -0.139    -0.189   -0.205    0.143     -0.004   -0.059
0.75       3.75       -0.131    -0.173   -0.197    0.163     0.008    -0.047
1.25       3.75       -0.171    -0.193   -0.215    0.083     -0.57    -0.048
1.5        3.75       -0.18     -0.22    0.251     -0.038    -0.082   -0.069
Table 6
RMSE of CBD parameters
CBD RMSE estimates
                      ML Estimation                EH estimation
Skewness   Kurtosis   N = 250   500      2,000     N = 250   500      2,000
0          0          0.661     0.399    0.183     1.044     0.505    0.207
0.25       0          0.670     0.401    0.212     4.027     0.572    0.239
0.75       0          0.810     0.621    0.497     3.085     0.511    0.257
1.25       0          0.871     0.718    0.654     3.149     0.481    0.241
1.5        0          0.901     0.845    0.762     3.184     0.412    0.211
0          0.75       0.566     0.381    0.192     0.858     0.534    0.239
0.25       0.75       0.612     0.395    0.209     1.238     0.545    0.205
0.75       0.75       0.700     0.486    0.336     3.295     0.547    0.233
1.25       0.75       0.759     0.512    0.491     3.194     0.541    0.224
1.5        0.75       0.802     0.604    0.511     3.098     0.537    0.209
0          1.75       0.571     0.416    0.224     0.780     0.530    0.214
0.25       1.75       0.573     0.396    0.237     1.221     0.501    0.218
0.75       1.75       0.613     0.422    0.301     1.037     0.482    0.219
1.25       1.75       0.829     2.334    0.639     0.872     1.238    0.268
1.5        1.75       0.849     2.487    0.749     0.986     1.157    0.271
0          2.75       0.576     0.401    0.271     2.799     0.456    0.215
0.25       2.75       0.603     0.407    0.274     1.255     0.514    0.224
0.75       2.75       0.629     0.425    0.319     1.169     0.494    0.216
1.25       2.75       1.089     0.55     0.471     2.672     0.522    0.314
1.5        2.75       2.187     0.819    0.756     1.352     0.565    0.311
0          3.75       0.564     0.434    0.320     1.033     -0.035   0.225
0.25       3.75       0.574     0.428    0.315     1.109     0.471    0.220
0.75       3.75       0.652     0.463    0.338     1.272     0.585    0.233
1.25       3.75       0.638     0.529    0.429     1.895     0.447    0.270
1.5        3.75       0.824     0.648    0.559     0.857     0.493    0.300
Figure 1
Equal CBD Parameters
Figure 2
Unequal CBD Parameters
References
Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in
two or more nominal categories. Psychometrika, 37, 29-51.
Bock, R. D. (1997). The nominal categories model. In W. J. van der Linden & R. K. Hambleton
(Eds.), Handbook of modern item response theory (pp. 33-49). New York: Springer.
Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters:
Application of an EM algorithm. Psychometrika, 46, 443-459.
Cai, L. (2013). flexMIRT® version 2: Flexible multilevel multidimensional
item analysis and test scoring [Computer software]. Chapel Hill, NC: Vector
Psychometric Group.
de Ayala, R.J., & Sava-Bolesta, M. (1999). Item parameter recovery for the nominal response
model. Applied Psychological Measurement, 23, 3-19.
DeMars, C. E. (2003). Sample size and the recovery of the nominal response model item
parameters. Applied Psychological Measurement, 27, 275-288.
Fleishman, A. I. (1978). A method for simulating non-normal distributions. Psychometrika, 43,
521-532.
Mislevy, R. J., & Wilson, M. (1996). Marginal maximum likelihood estimation for a
psychometric model of discontinuous development. Psychometrika, 61(1), 41-71.
doi:10.1007/BF02296958
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied
Psychological Measurement, 16, 159-176.
Preston, K. S. J. & Reise, S. P. (2013). Estimating the Nominal Response Model Under
Nonnormal Conditions. Educational and Psychological Measurement, XX, XXX-XXX.
Preston, K. S. J., Reise, S. P., Cai, L., & Hays, R. D. (2011). Evaluating the Discrimination of
Response Categories in the PROMIS Emotional Distress Item Pools. Educational and
Psychological Measurement, 71, 523-550.
Reise, S. P., & Henson, J. M. (2003). A discussion of modern versus traditional psychometrics as
applied to personality assessment scales. Journal Of Personality Assessment, 81(2), 93-
103. doi:10.1207/S15327752JPA8102_01
van den Oord, E. J. C. G. (2005). Estimating Johnson curve population distributions in
MULTILOG. Applied Psychological Measurement, 29, 23-30.
Thissen, D., & Steinberg, L. (1986). A taxonomy of item response models. Psychometrika, 49,
501-519.
Thissen, D., Steinberg, L., & Fitzpatrick, A. R. (1989). Multiple-choice models: The distractors
are also part of the item. Journal of Educational Measurement, 26, 161-176.
The Regents of the University of Michigan. (2014). HRS 2012 Core (EarlyV1.0). Retrieved from
http://hrsonline.isr.umich.edu/index.php?p=shoavail&iyear=NC
Woods, C. M. (2007). Ramsay curve IRT for likert-type data. Applied Psychological
Measurement, 31, 195-212.
Woods, C. M. (2008). Likelihood-ratio DIF testing: Effects of nonnormality. Applied
Psychological Measurement, 20, 1-16.
Woods, C. M., & Thissen, D. (2006). Item response theory with estimation of the latent
population distribution using spline-based densities. Psychometrika, 71, 1-22.
Abstract
The use of Item Response Theory in psychology has risen rapidly over the past decade. Psychological research is based on measuring unobservable processes in the mind using various data collection techniques. The focus of this paper is the use of survey data for measuring these latent traits. The latent traits measured within psychology are most often nonnormally distributed, which presents a problem because conventional statistical methods are based on the normal distribution. New estimation techniques such as empirical histograms allow for the estimation of the shape of the latent trait distribution. When sample sizes are large and skewness and kurtosis are high, this technique provides more accurate item parameter estimates for the survey than the traditional maximum likelihood method.
Asset Metadata
Creator: Parral, Skye Nichole (author)
Core Title: Making appropriate decisions for nonnormal data: when skewness and kurtosis matter for the nominal response model
School: College of Letters, Arts and Sciences
Degree: Master of Arts
Degree Program: Psychology
Publication Date: 08/10/2018
Defense Date: 08/10/2018
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: kurtosis, nominal response model, OAI-PMH Harvest, skewness
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Wilcox, Rand (committee chair); John, Richard Sheffield (committee member); Schwartz, David (committee member)
Creator Email: parral@usc.edu, skye.parral@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c89-65281
Unique identifier: UC11671725
Identifier: etd-ParralSkye-6727.pdf (filename), usctheses-c89-65281 (legacy record id)
Legacy Identifier: etd-ParralSkye-6727.pdf
Dmrecord: 65281
Document Type: Thesis
Rights: Parral, Skye Nichole
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA