Five quantile test: A comparison of independent groups
INFORMATION TO USERS

This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer.

The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleedthrough, substandard margins, and improper alignment can adversely affect reproduction.

In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion.

Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-hand corner and continuing from left to right in equal sections with small overlaps. Each original is also photographed in one exposure and is included in reduced form at the back of the book.

Photographs included in the original manuscript have been reproduced xerographically in this copy. Higher quality 6" x 9" black and white photographic prints are available for any photographs or illustrations appearing in this copy for an additional charge. Contact UMI directly to order.

UMI, A Bell & Howell Information Company, 300 North Zeeb Road, Ann Arbor MI 48106-1346 USA, 313/761-4700, 800/521-0600

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
FIVE QUANTILE TEST: A COMPARISON OF INDEPENDENT GROUPS

by Jan Muska

A Thesis Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree MASTER OF ARTS (Psychology)

December 1996

Copyright 1996 Jan Muska

UMI Number: 1383539. Copyright 1996 by Muska, Jan. All rights reserved. UMI Microform 1383539. Copyright 1997, by UMI Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code. UMI, 300 North Zeeb Road, Ann Arbor, MI 48103

UNIVERSITY OF SOUTHERN CALIFORNIA, THE GRADUATE SCHOOL, UNIVERSITY PARK, LOS ANGELES, CALIFORNIA 90007

This thesis, written by Jan Muska under the direction of the Thesis Committee, and approved by all its members, has been presented to and accepted by the Dean of The Graduate School, in partial fulfillment of the requirements for the degree of Master of Arts.

Date: November 1996

THESIS COMMITTEE
Chairman

Table of Contents

1. Introduction
1.1. Shift Function
1.2. Multiple Quantiles Test
1.3. Proposed Five Quantile (FQ) Test
1.3.1. Jackknife vs. Bootstrap
1.3.2. Five Quantiles vs. Nine Quantiles
1.3.3. Equal Probability of Rejection Confidence Band
1.3.4. Conclusions
2. Method
2.1. Performance of the Jackknife
2.2. Critical Value Function
2.2.1. Simulation of the Empirical Distribution for the $TM_p$ Statistic
2.2.2. Estimation of the Critical Value $CR_p$
2.2.3. Construction of the Critical Value Function
2.3. Type I Error Performance
2.3.1. Distribution Type
2.3.2. Sample Size
2.3.3. Number of Independent Pairwise Comparisons
2.4. Power of the FQ Test
2.4.1. Distribution Models
3. Results
3.1. Jackknife
3.2. Type I Error
3.3. Power
4. Discussion
5. References

List of Tables

Table 1: The Computational Summary of the FQ Test
Table 2: Estimated Bias Ratio and Efficiency for the Jackknife and Bootstrap Estimator of Variance
Table 3: Estimated Critical Value for the Test of Hypothesis ($T_p = 0$), Employing the Jackknife and Bootstrap Variance Estimator, with Random Variables X and Y Sampled from the Standard Normal Distribution
Table 4: Estimated Critical Value for the Test of Hypothesis ($T_p = 0$), Employing the Jackknife and Bootstrap Variance Estimator, with Random Variables X and Y Sampled from the g-and-h Distribution (g = 1.0, h = .5)
Table 5: Estimated Probability of the Type I Error for s Number of Independent Groups, Using All Pairwise Comparisons
Table 6: Estimated Probability of the Type I Error for s Number of Independent Groups, Using Comparison with the Control Group $n_1$
Table 7: Estimated Power of the FQ, Trimmed Mean Fixed Effect (TM), and One Way ANOVA (AN) Tests, for the Five Non-Null Distribution Models (DM)

List of Figures

Figure 1: Plot of quantile differences between experimental and control group vs. quantiles of the control group
Figure 2: Plot of the right half of the EPR and Wilcox simultaneous 95% confidence band for the null hypothesis, n = 10
Figure 3: Plot of the right half of the EPR and Wilcox simultaneous 95% confidence band for the null hypothesis, n = 40
Figure 4: Estimated probability of the type I error for the g-and-h distribution (g = 0, h = 0) - the all pairwise comparisons
Figure 5: Estimated probability of the type I error for the g-and-h distribution (g = 1, h = .5) - the all pairwise comparisons
Figure 6: Power for comparison of two independent groups with 25 observations each

Five Quantile Test: A Comparison of Independent Groups

1. Introduction

Comparison of independent groups is a standard problem in experimental psychology. Typically, each group is described with some measure of location (e.g., mean or trimmed mean). This measure of location is then used to judge whether or not a difference exists. The problem with this approach is an unwarranted, implicit assumption that the groups differ equally throughout their distributions. That is, the difference found between groups with the measure of location is assumed to be representative of the differences at the tails.

[Figure 1. Plot of the quantile differences between experimental and control group vs. quantiles of the control group. Horizontal axis: quantiles for the control group.]

Salk's (1973) data exemplify the inadequacy of using a measure of location to judge differences among groups. Salk (1973) studied the effect of exposing infants to a normal adult heart beat on infants' weight gain. Figure 1 shows differences between the quantiles of the experimental and control groups plotted against the quantiles of the control group. The estimated quantiles p = .1, .2, ..., .9 used in the plot are Harrell and Davis (1982) (HD) estimators (defined in section 1.2). The quantiles represent weight gain in grams.
Notice that the exposure to the adult heart beat produced the greatest increase in weight gain for the infants with low weight gain and the lowest increase in weight gain for the infants with large weight gain. Using only the measure of location would prevent the researcher from finding this trend.

Statistical tests allowing the researcher to compare two independent groups based on more than one location exist in the literature. For direct relevance to the current work, two such tests are mentioned as examples: (1) the shift function; (2) the multiple quantiles test.

1.1. Shift Function

The shift function was introduced by Doksum (1974) and Doksum & Sievers (1976). The idea is that the random variable X can be shifted by some amount $\Delta(X)$ to have an identical distribution with the random variable Y, or $\Delta(X) + X = Y$. The random variables X and Y come from the distributions F and G respectively. Doksum (1974) defined the shift function as

\[ \Delta(x) = \inf\{\Delta : F(x) \le G(x + \Delta)\}. \]

The two independent groups are different when the shift function $\Delta(x)$ is at any point non-zero. Doksum (1974) and Doksum & Sievers (1976) suggested the use of the Kolmogorov-Smirnov confidence band to evaluate differences between two independent groups.

1.2. Multiple Quantiles Test

Wilcox (1995a), with his multiple quantiles method, introduced a variation on the idea of the shift function proposed by Doksum (1974) and Doksum & Sievers (1976). That is, Wilcox (1995a) replaced the classic estimator of the pth quantile, $F^{-1}(p)$, with the HD estimator $x_p$ in the definition of the shift function used by Doksum (1974), or $\Delta(x_p) + x_p = y_p$. The HD estimator $x_p$ is defined as

\[ x_p = \sum_{i=1}^{n} W_i X_{(i)}, \]

where $X_{(1)}, \ldots, X_{(n)}$ are the order statistics composing the empirical distribution F. The weights are

\[ W_i = \Pr\left\{ \frac{i-1}{n} \le Y \le \frac{i}{n} \right\}, \]

where Y is a random variable with the beta distribution with parameters $\alpha = p(n + 1)$ and $\beta = (1 - p)(n + 1)$ (Wilcox, 1996). The idea of using the HD estimator was to utilize this estimator's advantage in smaller variance over the classic quantile estimator $F^{-1}(p)$ for small sample sizes (less than 50) (Parrish, 1990). In addition, Wilcox's shift function is not continuous but discrete, built around nine quantiles (i.e., p = .1, .2, ..., .9). This idea of using discrete points for the comparison of two distributions apparently stems from the quantile-quantile plots of Wilk & Gnanadesikan (1968).

The results in Wilcox (1995a) show that the Kolmogorov-Smirnov and multiple quantiles tests are close competitors. For example, comparing the power obtained by Wilcox (1995b) for the Kolmogorov-Smirnov test and by Wilcox (1995a) for the multiple quantiles test reveals little difference between the two tests. Since the HD quantile estimator offers a more efficient estimate than the traditional quantile estimator $F^{-1}(p)$ (Harrell & Davis, 1982; Parrish, 1990), it seems reasonable that Wilcox's (1995a) multiple quantiles test would be preferable for small sample sizes. However, the literature offers nothing to support or reject this surmise. More work is needed to understand the advantages or disadvantages of one test over the other.

1.3. Proposed Five Quantile (FQ) Test

Wilcox's (1995a) multiple quantiles test was developed for comparisons of two independent groups only. However, many researchers face the problem of comparing three or more independent groups. This work offers an extension of Wilcox's (1995a) multiple quantiles test to multiple independent groups (up to 10).
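As a rough illustration of the HD estimator defined in section 1.2, the following Python sketch computes each weight by numerically integrating the beta density over the interval from (i - 1)/n to i/n. The function names (`beta_pdf`, `hd_weights`, `hd_quantile`), the use of Simpson's rule, and the `steps` parameter are assumptions made for this example; the thesis does not describe its own implementation.

```python
import math


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x."""
    if x <= 0.0 or x >= 1.0:
        return 0.0  # correct at the endpoints only when a > 1 and b > 1
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def hd_weights(n, p, steps=200):
    """Weights W_i = Pr{(i-1)/n <= Y <= i/n}, Y ~ Beta(p(n+1), (1-p)(n+1)).

    Assumption of this sketch: both beta parameters exceed 1, so the
    density is finite on [0, 1] and Simpson's rule can be applied.
    `steps` must be even.
    """
    a, b = p * (n + 1), (1 - p) * (n + 1)
    if a <= 1 or b <= 1:
        raise ValueError("sketch requires p(n+1) > 1 and (1-p)(n+1) > 1")
    weights = []
    for i in range(1, n + 1):
        lo, hi = (i - 1) / n, i / n
        h = (hi - lo) / steps
        # Composite Simpson's rule on [lo, hi].
        total = beta_pdf(lo, a, b) + beta_pdf(hi, a, b)
        for k in range(1, steps):
            total += (4 if k % 2 else 2) * beta_pdf(lo + k * h, a, b)
        weights.append(total * h / 3)
    return weights


def hd_quantile(sample, p):
    """Weighted sum of the order statistics of `sample`."""
    ordered = sorted(sample)
    weights = hd_weights(len(ordered), p)
    return sum(w * x for w, x in zip(weights, ordered))
```

For a symmetric sample such as 1, 2, 3, 4, 5 the weights for p = .5 are symmetric, so `hd_quantile` agrees with the sample median, 3, up to integration error. A production implementation would more likely call a library routine for the beta distribution function than integrate the density directly.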
There were minor changes to the computational machinery of the test: (1) using the jackknife instead of the bootstrap; (2) five as opposed to nine quantiles; and (3) a simultaneous confidence band that rejects the null hypothesis (if the null is true) for each quantile with equal probability. Because it uses only five quantiles, the proposed test is called the five quantile (FQ) test.

1.3.1. Jackknife vs. Bootstrap

Wilcox's (1995a) multiple quantiles test uses the bootstrap to estimate the HD standard error. Throughout this work, the term HD standard error refers to the variance of the Harrell & Davis quantile estimator. The bootstrap, a resampling method, evaluates any statistic (like the HD estimator) for each of B random subsamples of size n. Each subsample is created by random sampling with replacement from the empirical distribution. The B random subsamples compose a subset of the universe of all possible subsamples of size n that could be generated from the empirical distribution of size n (Efron, 1981, 1982). Wilcox (1995a) used B = 200.

The jackknife, also a resampling method, evaluates a statistic for each of the n possible subsamples of size n - 1. The jackknife subsamples are generated by removing one observation at a time from the empirical distribution (Efron, 1981, 1982). For sample sizes less than B, the jackknife method is faster than the bootstrap in terms of computation time. However, the literature is unclear as to the adequacy of the jackknife for the estimation of the HD standard error. For example, Wilcox & Charlin (1986) reported some difficulty with the jackknife. On the other hand, Harrell & Davis (1982) reported successful use of the jackknife. This work offers evidence that the jackknife competes well with the bootstrap as an estimator of the HD standard error.

1.3.2. Five Quantiles vs. Nine Quantiles
The extension of the Wilcox (1995a) multiple quantiles test proposed in this work uses only five quantiles (p = .1, .25, .5, .75, .9). This particular selection of quantiles, as opposed to the nine quantiles used by Wilcox (1995a), was motivated by the time-expensive estimation of the HD standard error. The variance has to be estimated for each quantile. Thus, use of five quantiles reduced the computer time required for the estimation of variance by almost a half. Yet, the power of the test based on these five quantiles in detecting a difference between two independent distributions was not reduced. Perhaps the opposite: the selection of fewer quantiles means a narrower simultaneous confidence band. However, how much narrower this band would be, as compared to the multiple quantiles test, is unclear.

Why the selection of the quantiles p = .1, .25, .5, .75, .9? The quantile p = .5 was included because the center of the distribution has traditionally been of interest to the researcher. The quantiles p = .25, .75 have a prominent role in data analysis, as exemplified by the inclusion of these quantiles in the construction of graphical display tools like the boxplot. Also, Lunneborg (1986) pointed out advantages of comparing two distributions at the quantile p = .25. As to the quantiles p = .1, .9, I have found that these extreme points have a substantial influence on the performance of Wilcox's (1995a) multiple quantiles test. Also, R. R. Wilcox (personal communication, November 1995) stated that extreme quantiles (p = .1, .9) are important in understanding the behavior in the tails.

1.3.3. Equal Probability of Rejection Confidence Band

Wilcox's (1995a) simultaneous $1 - \alpha$ confidence band for his multiple quantiles test is defined as

\[ (y_p - x_p) \pm t_{1-\alpha} \sqrt{S_B^2(y_p) + S_B^2(x_p)}, \]

where $t_{1-\alpha}$ is the $1 - \alpha$ quantile of the distribution of

\[ T = \max\{|T_{.1}|, |T_{.2}|, \ldots, |T_{.9}|\}, \]

and $S_B^2(x_p)$ is the bootstrap estimate of the HD standard error, with B = 200 iterations. Figure 2 shows Wilcox's (1995a) simultaneous 95% confidence band for the $T_p$ statistics as a straight, solid line. Figure 2 also shows the simultaneous 95% confidence band that rejects the null hypothesis $\{G^{-1}(p) = F^{-1}(p)\}$, if the null is true, with equal probability for each $T_p$ statistic. Let $p = p_1, p_2, \ldots, p_q$, where q is the number of selected quantiles. In this case, p = .1, .2, ..., .9. Then

\[ \Pr\{T_{p_1} > CR_{p_1}\} = \Pr\{T_{p_2} > CR_{p_2}\} = \cdots = \Pr\{T_{p_q} > CR_{p_q}\}, \tag{1} \]

where $CR_p$ is the $1 - \alpha_p$ quantile of the $T_p$ statistic. To guarantee the $1 - \alpha$ simultaneous coverage of this confidence band, $\alpha_p$ was required to satisfy the following equation:

\[ (1 - \alpha_{p_1}) \cap (1 - \alpha_{p_2}) \cap \cdots \cap (1 - \alpha_{p_q}) = 1 - \alpha, \tag{2} \]

where $\alpha_{p_i} = \Pr\{T_{p_i} > CR_{p_i}\}$ and $1 \le i \le q$; that is, the joint probability that none of the q statistics exceeds its critical value equals $1 - \alpha$. The band that satisfies equations (1) and (2) is called the equal probability of rejection (EPR) band.

[Figure 2. Plot of the right half of the EPR and Wilcox simultaneous 95% confidence band for the null hypothesis, n = 10. Horizontal axis: quantile p.]

Notice in Figure 2 that the Wilcox band makes it harder to reject the null hypothesis $\{G^{-1}(p) = F^{-1}(p)\}$ for quantiles p = .3, .4, .5, .6, .7 while making the rejection easier for the extreme quantiles p = .1 or .9 compared to the EPR band. The EPR band offers an equal probability of rejection of the null hypothesis, if the null is true, for each $T_p$ statistic. Therefore, the Wilcox confidence band rejects the $T_p$ statistic for the extreme quantiles with greater probability than for the central quantiles. This is an undesirable trend. For that reason, the FQ test uses the EPR band. Both bands reported in Figure 2 were computed for comparison of two independent groups, with 10 observations in each group. Asymptotically, the two confidence bands are identical. A simulation study showed that there is no substantial difference between these two bands for samples with approximately 40 or more observations (see Figure 3).

[Figure 3. Plot of the right half of the EPR and Wilcox simultaneous 95% confidence band for the null hypothesis, n = 40. Horizontal axis: quantile p.]

1.3.4. Conclusions

The FQ test, substantially faster in terms of computer time than the multiple quantiles test, equaled the performance of the multiple quantiles test in terms of power and type I error. In addition, regarding power, the FQ test outperformed the one way ANOVA for all but normal distributions with location shift only. The FQ test was also a good competitor to the trimmed mean fixed effect model (Wilcox, 1996).

This work was composed of several steps. First, the jackknife's performance as an estimator of the HD standard error was evaluated. Second, the function returning critical values for the 95% confidence band was constructed. The input for this function is the number of independent pairwise comparisons and the sample size. Third, the FQ test's type I error was assessed using the g-and-h distribution family. Fourth, the power of the FQ test was evaluated for several non-null alternatives utilizing contaminated normal and normal distribution models.

2. Method

2.1. Performance of the Jackknife

A Monte Carlo study was conducted to resolve the ambiguity in the literature regarding the appropriateness of the jackknife as the estimator of the HD standard error.
The performance of the jackknife was evaluated by comparing the jackknife with the bootstrap in terms of their variances and the bias ratio. The variance estimates of the jackknife and bootstrap were obtained from a simulation study based on 5000 iterations. Following the methodology of Harrell & Davis (1982), the bias ratio was defined as the ratio of the mean jackknife (or bootstrap) estimate to the HD estimator's variance. The mean jackknife (or bootstrap) estimate and the HD estimator's variance were obtained from a simulation study also based on 5000 iterations.

Additionally, the performance of the jackknife was assessed by looking at the convergence of the critical value for the statistic

\[ T_p = \frac{y_p - x_p - (G^{-1}(p) - F^{-1}(p))}{\sqrt{S_J^2(y_p) + S_J^2(x_p)}} \tag{3} \]

to the value 1.96. $S_J^2(x_p)$ is the jackknife estimate of the HD standard error, or [equation not legible in the source], where $x_{p,i}$ is the HD estimator computed for the ith subsample. Let $X_1, X_2, \ldots, X_K$ be a random sample from F, let $1 \le i \le K$, and let K be the number of observations in the sample. Then the ith subsample is defined as $X_1, X_2, \ldots, X_{i-1}, X_{i+1}, \ldots, X_K$ (Efron, 1979).

2.2. Critical Value Function

Let $X_s$ be a random variable from the distribution $F_s$, where s = 1, 2, ..., 10 is the number of independent groups. Then

\[ TM_p = \max_{ij} \left\{ \frac{|x_{ip} - x_{jp} - (F_i^{-1}(p) - F_j^{-1}(p))|}{\sqrt{S_J^2(x_{ip}) + S_J^2(x_{jp})}} \right\}, \tag{4} \]

where $x_{ip}$ is the HD estimator of the pth quantile for the distribution $F_i$. The notation $\max_{ij}\{\}$ refers to maximizing the value in brackets over all combinations of i and j, given $i < j$. To construct a function approximating the critical value $CR_p$ for the $TM_p$ statistic, given the input (i.e., the sample size and the number of independent pairwise comparisons, NIPC), the distribution of the $TM_p$ statistic has to be known.
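The leave-one-out scheme and the statistics (3) and (4) described above can be sketched in Python as follows. The variance formula used here, (K - 1)/K times the sum of squared deviations of the leave-one-out estimates from their mean, is the standard jackknife form and is assumed rather than taken from the thesis. The absolute value in `tm_statistic`, the function names, and the `estimator` argument (any function of a sample, for example an HD quantile estimator) are likewise illustrative assumptions, and the null difference between population quantiles is taken to be zero by default.

```python
import math
from itertools import combinations
from statistics import mean


def jackknife_variance(sample, estimator):
    """Standard jackknife variance of `estimator` (assumed form)."""
    data = list(sample)
    k = len(data)
    # Evaluate the estimator on each subsample with one observation removed.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(k)]
    center = mean(leave_one_out)
    return (k - 1) / k * sum((v - center) ** 2 for v in leave_one_out)


def t_statistic(x, y, estimator, null_diff=0.0):
    """Two-group statistic in the spirit of equation (3)."""
    numerator = estimator(y) - estimator(x) - null_diff
    denominator = math.sqrt(
        jackknife_variance(y, estimator) + jackknife_variance(x, estimator)
    )
    return numerator / denominator


def tm_statistic(groups, estimator):
    """Largest absolute pairwise statistic over all pairs i < j, as in (4)."""
    return max(
        abs(t_statistic(gi, gj, estimator)) for gi, gj in combinations(groups, 2)
    )
```

With the sample mean as the estimator, the jackknife variance reduces to the familiar sample variance divided by the sample size; for the sample 1, 2, 3, 4, 5 it equals 0.5, which gives a quick sanity check before plugging in a quantile estimator.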
Thus, the development procedure of the critical value function involved: (1) simulation of the empirical distribution for the $TM_p$ statistic; (2) estimation of the critical value $CR_p$; and (3) construction of the critical value function.

2.2.1. Simulation of the Empirical Distribution for the $TM_p$ Statistic

Since the distribution of the $TM_p$ statistic is unknown, this distribution was simulated on the computer using 5000 iterations. In the spirit of the method developed by Wilcox (1995a) for the estimation of the critical value, the empirical distribution of the $TM_p$ statistic was estimated for all the combinations of the sample size (n = 10, 11, 13, 14, 17, 20, 24, 29, 38, 50, 69, 92, 100) and the number of independent samples (s = 2, 3, ..., 10). The samples were randomly drawn from the standard normal distribution.

2.2.2. Estimation of the Critical Value $CR_p$

Using the EPR confidence band poses the problem of evaluating the critical values $CR_p$. Due to the dependency among the $TM_p$ statistics (i.e., $TM_{.10}$, $TM_{.25}$, $TM_{.50}$, $TM_{.75}$, $TM_{.90}$), the critical value $CR_p$ was estimated iteratively. The iterative algorithm was built to satisfy the two conditions for $\alpha_p$ that were mentioned in the introduction (see equations 1 and 2).

2.2.3. Construction of the Critical Value Function

Step one, following Wilcox (1995a): the relationship between the sample size n and the critical value $CR_p$ was approximated by the function

\[ CR_p = A_p + B_p \, n_{\min}^{C_p}, \tag{5} \]

where $n_{\min} = \min_j\{n_j\}$ and $n_j$ is the sample size for the group j, $1 \le j \le s$. This approximation was done without regard to the number of independent comparisons for which each critical value $CR_p$ was evaluated. The constants $A_p$, $B_p$, and $C_p$ were evaluated iteratively. The goal was to minimize the sum of squares by first adjusting the constant $C_p$ and then computing the constants $A_p$ and $B_p$ using ordinary least squares.
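Step one can be illustrated with a small Python sketch: for each candidate exponent C on a grid, A and B are obtained by ordinary least squares on the transformed predictor n_min raised to the power C, and the exponent with the smallest residual sum of squares is kept. The grid search stands in for the thesis's unspecified iterative adjustment of the exponent, and the function names and the synthetic critical values in the usage lines are invented for illustration; they are not results from the thesis.

```python
def ols_line(u, y):
    """Ordinary least squares fit of y = a + b * u; returns (a, b, sse)."""
    n = len(u)
    u_bar = sum(u) / n
    y_bar = sum(y) / n
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    sxy = sum((ui - u_bar) * (yi - y_bar) for ui, yi in zip(u, y))
    b = sxy / sxx
    a = y_bar - b * u_bar
    sse = sum((yi - a - b * ui) ** 2 for ui, yi in zip(u, y))
    return a, b, sse


def fit_power_curve(n_min, cr, c_grid):
    """Fit cr = A + B * n_min**C by a grid search over C (an assumption)."""
    best = None
    for c in c_grid:
        transformed = [n ** c for n in n_min]
        a, b, sse = ols_line(transformed, cr)
        if best is None or sse < best[3]:
            best = (a, b, c, sse)
    return best


# Synthetic check: made-up critical values generated from a known curve.
sizes = [10, 11, 13, 14, 17, 20, 24, 29, 38, 50, 69, 92, 100]
fake_cr = [2.0 + 5.0 * n ** -0.75 for n in sizes]
grid = [-k / 100 for k in range(10, 201)]  # exponents from -0.10 to -2.00
a_hat, b_hat, c_hat, sse = fit_power_curve(sizes, fake_cr, grid)
```

Separating the exponent from the two linear constants keeps each inner fit a closed-form least squares problem, which is presumably why the text adjusts the exponent first and solves for the other two constants afterwards.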
Step two involved estimation of the relationship between the slope ratio

\[ SR_p = B_{NIPC,p} / B_p \tag{6} \]

and the number of independent pairwise comparisons (NIPC), where NIPC = s(s - 1)/2. First, the $CR_p$ statistics were grouped based on the NIPC involved in the computation of these $CR_p$'s. Then the function (5) was rewritten as

\[ CR_p = A_{NIPC,p} + B_{NIPC,p} \, n_{\min}^{C_p}. \]

The unknown constants $A_{NIPC,p}$ and $B_{NIPC,p}$ were estimated using ordinary least squares. Finally, the relationship between the slope ratio $SR_p$ and the NIPC was approximated by the function

\[ SR_p = A_{SR,p} + B_{SR,p} \, NIPC^{E_p}. \tag{7} \]

The constants $A_{SR,p}$, $B_{SR,p}$, and $E_p$ were evaluated iteratively, like the constants in the function (5).

Step three consisted of defining two temporary variables,

\[ W_p = n_{\min}^{C_p} / SR_p \tag{8} \]

and

\[ V_p = NIPC^{F_p}. \tag{9} \]

The constant $F_p$ was iteratively evaluated (using the same method as for the constant $C_p$) for the relationship

\[ RES_p = A_{RES,p} + B_{RES,p} \, NIPC^{F_p} + e, \]

where the residuals are

\[ RES_p = CR_p - (A_{W,p} + B_{W,p} W_p). \]

The constants $A_{W,p}$ and $B_{W,p}$ were evaluated using ordinary least squares.

Finally, step four involved producing the critical value function returning the estimates of the critical values based on the temporary variables $W_p$ and $V_p$. This function is

\[ CR_p = B0_p + B1_p W_p + B2_p V_p. \tag{10} \]

The constants $B0_p$, $B1_p$, and $B2_p$ were evaluated using ordinary least squares.

2.3. Type I Error Performance

The type I error performance of the FQ test was evaluated using a Monte Carlo study with 5000 iterations. This study involved manipulation of three factors potentially affecting the type I error performance. These factors were: (1) distribution type; (2) sample size; and (3) number of independent pairwise comparisons (NIPC).

2.3.1. Distribution Type
The g-and-h distribution family was used to generate six distributions of varying skewness and kurtosis (g = 0, 1 and h = 0, .2, .5). The parameters g and h affect skewness and kurtosis respectively. When g = 0, the random variable $X_i$ is symmetrically distributed, and $X_i$ is defined as

\[ X_i = Z \exp\left( \frac{h Z^2}{2} \right), \]

where Z is a random variable with the standard normal distribution. In the special case h = 0, the g-and-h distribution becomes the standard normal distribution. When $g \ne 0$, then

\[ X_i = \frac{\exp(gZ) - 1}{g} \exp\left( \frac{h Z^2}{2} \right). \]

If in addition to g the parameter h = 1, $X_i$ has a distribution with Cauchy-like tails (Wilcox, 1995a, 1994).

2.3.2. Sample Size

Let $n = (n_1, n_2, \ldots, n_s)$ be a vector, and let the element $n_i$ be the sample size of the ith group ($1 \le i \le s$). The Monte Carlo study involved three equal sample size conditions, where n = (10, 10, ..., 10), n = (25, 25, ..., 25), and n = (40, 40, ..., 40). Also, there was one unequal sample size condition, where n = (10, 40, 40, ..., 40).

2.3.3. Number of Independent Pairwise Comparisons

The $1 - \alpha$ confidence band was constructed using all pairwise comparisons. Similarly, the type I error performance of the FQ test was examined for s = 2, 4, 6, 8, 10 independent groups, using all pairwise comparisons. That is, NIPC = 1, 6, 15, 28, 45. To justify the use of the critical value function for situations other than those involving all pairwise comparisons, the type I error performance was also evaluated using the pairwise comparisons with the control group. That is, NIPC = 1, 3, 5, 7, 9. Given the vector $n = (n_1, n_2, \ldots, n_s)$, the group with sample size $n_1$ was designated as the control group.

2.4. Power of the FQ Test

The power of the FQ test was estimated using a Monte Carlo study with 5000 iterations for five distribution models; two, six, and ten independent groups; and four sample size conditions. The selection of the sample size was identical to the selection used for the evaluation of the type I error. For comparative purposes, the trimmed mean fixed effect (TM) test (Wilcox, 1996) and the one way ANOVA were included. The power of the TM test and the ANOVA was estimated based on 5000 and 1000 iterations respectively.

2.4.1. Distribution Models

Let $\{X_1, X_2, \ldots, X_s\}$ be a set of random variables with respective distributions $\{F_1, F_2, \ldots, F_s\}$. Let N(M, SD) be the normal distribution with the mean M and the standard deviation SD. Let CN1(M) and CN2(M) be standard normal distributions with 10% contamination from N(0, 10) and N(0, 20) respectively. Let the distribution mean $M_i = (i - 1)/(s - 1)$, where $1 \le i \le s$. Then the distribution models used to estimate the power are defined as:

1. $F = \{N(M_1, 1), N(M_2, 1), \ldots, N(M_s, 1)\}$.
2. $F = \{CN1(M_1), CN1(M_2), \ldots, CN1(M_s)\}$.
3. $F = \{CN2(M_1), CN2(M_2), \ldots, CN2(M_s)\}$.
4. $F = \{N(0, 1), N(0, 3), \ldots, N(0, 3)\}$.
5. $F = \{N(0, 1), N(2, 3), \ldots, N(2, 3)\}$.

The distribution models four and five were used in addition to the models one through three employed by Wilcox (1995a). Model four was included to show that the FQ test is sensitive to differences among distributions in scale when the location is identical. Finally, distribution model five offers the condition closest to everyday research, providing a shift in both scale and location.

3. Results

First, the jackknife method was shown to perform comparably to the bootstrap method as the estimator of the HD standard error. Considering the jackknife's speed in variance estimation for small sample sizes, the jackknife method was chosen over the bootstrap as the variance estimator for the FQ test.
Second, the simulation study showed that the FQ test, somewhat conservative with heavy tails, provides good control of the type I error for light-tailed distributions. Third, the FQ test outperformed the one way ANOVA and generally performed well across a variety of distribution models. Table 1 offers a computational summary for the FQ test.

Table 1
The Computational Summary of the FQ Test

                                              Quantile p
Estimated constants                     p = .10, .90   p = .25, .75   p = .50
$CR_p = B0_p + B1_p W_p + B2_p V_p$
  $B0_p$                                   -54.241        -2.482        -8.223
  $B1_p$                                   153.420         7.731         4.308
  $B2_p$                                    57.030         5.139        10.809
$V_p = NIPC^{F_p}$
  $F_p$                                       .008          .060          .027
$W_p = n_{\min}^{C_p} / SR_p$
  $C_p$                                      -1.64          -.74          -.53
$SR_p = A_{SR,p} + B_{SR,p} NIPC^{E_p}$
  $A_{SR,p}$                                  .409          .528          .529
  $B_{SR,p}$                                 2.145         2.052         2.031
  $E_p$                                       -.52          -.61          -.60

3.1. Jackknife

Overall, for the normal distribution, the jackknife exhibited slightly lower bias and slightly more variability than the bootstrap (see Table 2). Note in Table 2 that a bias ratio less than one indicates underestimation of the true variance, while a bias ratio greater than one indicates overestimation. Table 2 also reports the variance of the jackknife and bootstrap estimators (a lower value means more efficiency).

Table 2
Estimated Bias Ratio and Efficiency for the Jackknife and Bootstrap Estimator of Variance

                  Jackknife                 Bootstrap
 n     p     Bias Ratio   Variance    Bias Ratio   Variance
10    0.5     0.969089    0.111213     1.07268      0.1032
30    0.5     0.988426    0.057749     0.95977      0.0485
50    0.5     0.997308    0.042642     1.09622      0.0386
10    0.75    0.951339    0.128628     1.01901      0.1344
30    0.75    0.982486    0.066719     0.96064      0.0546
50    0.75    0.996084    0.048308     1.04301      0.0419
10    0.9     0.909488    0.248670     0.74109      0.1812
30    0.9     0.956033    0.103256     0.88964      0.0842
50    0.9     0.990320    0.074719     0.93681      0.0596

For the standard normal distribution, as seen in Table 3, the critical value of the $T_p$ statistic (3) employing the jackknife converged at a slower rate than the critical value using the bootstrap. For the heavy-tailed, skewed distribution (simulated with the g-and-h distribution, where g = 1 and h = .5), the critical value of the $T_p$ statistic fell below the nominal value 1.96. This trend was especially evident for the extreme quantile p = .9 (see Table 4).

Table 3
Estimated Critical Value for the Test of Hypothesis ($T_p = 0$), Employing the Jackknife and Bootstrap Variance Estimator, with Random Variables X and Y Sampled from the Standard Normal Distribution

                      Critical Value Using
Quantile p     n     Jackknife    Bootstrap
0.5           10      2.3286       1.9587
0.5           30      2.1462       2.1856
0.5          100      2.0757       2.0091
0.5          150      2.1210       1.9022
0.75          10      2.2894       2.0905
0.75          30      2.2295       2.2119
0.75         100      2.1069       2.1189
0.75         150      2.0834       1.9288
0.9           10      2.6250       2.9137
0.9           30      2.2058       2.2887
0.9          100      2.1219       1.9560
0.9          150      1.9982       1.8975

Table 4
Estimated Critical Value for the Test of Hypothesis ($T_p = 0$), Employing the Jackknife and Bootstrap Variance Estimator, with Random Variables X and Y Sampled from the g-and-h Distribution (g = 1.0, h = .5)

                      Critical Value Using
Quantile p     n     Jackknife    Bootstrap
0.5           10      1.7889       1.1158
0.5           30      1.8957       1.8438
0.5          100      1.8944       1.7791
0.5          150      2.0230       1.7842
0.75          10      1.7256       1.6548
0.75          30      1.7917       1.5697
0.75         100      1.8360       1.6551
0.75         150      1.9593       1.6692
0.9           10      2.4035       2.7879
0.9           30      1.7472       1.7604
0.9          100      1.8584       1.5375
0.9          150      1.8129       1.6182

3.2. Type I Error

For normal distributions, the estimated probability of the type I error for the FQ test approached the nominal value .05 (see Figure 4). As the distribution grew heavier in the tails, the FQ test became increasingly conservative. For example, for the g-and-h distribution (g = 1 and h = .5),
the estimated probability of the type I error vacillated between .009 and .026 (see Figure 5). Generally, skewness did not have a substantial effect on the type I error. Even for large skewness (g = 1), the estimated probability of the type I error decreased only moderately (see Table 5). Any effect that skewness had on α was most pronounced when the distribution had light tails. For example, consider the results for two independent groups with sample sizes n1 = n2 = 10 reported in Table 5. When h = 0 and g = 0, the estimated α was .051. When g = 1, the estimated α decreased to .036. However, for the heavy-tailed distribution (h = .5), the estimated α was .023 when g = 0 and essentially identical (.019) when g = 1.

[Figure 4: plot not reproduced. Horizontal axis: Number of Independent Groups. Legend: group sizes (10, 10, ...), (20, 20, ...), (40, 40, ...), (10, 40, ...).]
Figure 4. Estimated probability of the type I error for the g-and-h distribution (g = 0, h = 0), all pairwise comparisons.

[Figure 5: plot not reproduced. Horizontal axis: Number of Independent Groups. Legend: group sizes (10, 10, ...), (20, 20, ...), (40, 40, ...), (10, 40, ...).]
Figure 5. Estimated probability of the type I error for the g-and-h distribution (g = 1, h = .5), all pairwise comparisons.

Figure 5 revealed another trend: for distributions with heavy tails, the estimated probability of the type I error decreased moderately as the number of independent comparisons (or groups) increased. The estimated probability of the type I error using all pairwise comparisons is reported in Table 5. Table 6 shows the estimated probability of the type I error when pairwise comparisons with the control group were conducted. There is virtually no difference between these two types of estimates. This result supports the use of the critical value function for any number of independent pairwise comparisons.

Table 5
Estimated Probability of the Type I Error for s Independent Groups, Using All Pairwise Comparisons

                                      Number of Independent Groups (s) and
                                 Number of Independent Pairwise Comparisons (NIPC)
                                  s = 2      s = 4      s = 6      s = 8      s = 10
 g    h    n = (n1, n2, ..., ns)  NIPC = 1   NIPC = 6   NIPC = 15  NIPC = 28  NIPC = 45
 0   0.0   (10, 10, ..., 10)      0.0512     0.0488     0.0454     0.0454     0.0558
 0   0.0   (25, 25, ..., 25)      0.0496     0.0522     0.0460     0.0448     0.0504
 0   0.0   (40, 40, ..., 40)      0.0560     0.0532     0.0506     0.0536     0.0502
 0   0.0   (10, 40, ..., 40)      0.0562     0.0510     0.0462     0.0510     0.0512
 0   0.2   (10, 10, ..., 10)      0.0362     0.0310     0.0336     0.0326     0.0328
 0   0.2   (25, 25, ..., 25)      0.0324     0.0338     0.0286     0.0336     0.0344
 0   0.2   (40, 40, ..., 40)      0.0386     0.0410     0.0400     0.0304     0.0330
 0   0.2   (10, 40, ..., 40)      0.0414     0.0316     0.0284     0.0312     0.0286
 0   0.5   (10, 10, ..., 10)      0.0232     0.0232     0.0222     0.0206     0.0188
 0   0.5   (25, 25, ..., 25)      0.0272     0.0220     0.0162     0.0190     0.0146
 0   0.5   (40, 40, ..., 40)      0.0258     0.0246     0.0222     0.0150     0.0166
 0   0.5   (10, 40, ..., 40)      0.0240     0.0208     0.0166     0.0172     0.0166
 1   0.0   (10, 10, ..., 10)      0.0362     0.0344     0.0356     0.0352     0.0344
 1   0.0   (25, 25, ..., 25)      0.0368     0.0368     0.0350     0.0244     0.0336
 1   0.0   (40, 40, ..., 40)      0.0386     0.0364     0.0306     0.0342     0.0350
 1   0.0   (10, 40, ..., 40)      0.0416     0.0320     0.0324     0.0346     0.0272
 1   0.2   (10, 10, ..., 10)      0.0318     0.0240     0.0246     0.0258     0.0244
 1   0.2   (25, 25, ..., 25)      0.0342     0.0254     0.0248     0.0206     0.0222
 1   0.2   (40, 40, ..., 40)      0.0322     0.0302     0.0300     0.0246     0.0226
 1   0.2   (10, 40, ..., 40)      0.0340     0.0264     0.0230     0.0218     0.0188
 1   0.5   (10, 10, ..., 10)      0.0186     0.0208     0.0176     0.0132     0.0180
 1   0.5   (25, 25, ..., 25)      0.0188     0.0154     0.0124     0.0126     0.0110
 1   0.5   (40, 40, ..., 40)      0.0260     0.0200     0.0138     0.0160     0.0146
 1   0.5   (10, 40, ..., 40)      0.0234     0.0146     0.0122     0.0110     0.0088

Table 6
Estimated Probability of the Type I Error for s Independent Groups, Using Comparisons with the Control Group

                                      Number of Independent Groups (s) and
                                 Number of Independent Pairwise Comparisons (NIPC)
                                  s = 2      s = 4      s = 6      s = 8      s = 10
 g    h    n = (n1, n2, ..., ns)  NIPC = 1   NIPC = 3   NIPC = 5   NIPC = 7   NIPC = 9
 0   0.0   (10, 10, ..., 10)      0.0512     0.0528     0.0478     0.0484     0.0450
 0   0.0   (25, 25, ..., 25)      0.0496     0.0510     0.0518     0.0500     0.0486
 0   0.0   (40, 40, ..., 40)      0.0560     0.0566     0.0522     0.0538     0.0530
 0   0.0   (10, 40, ..., 40)      0.0562     0.0430     0.0364     0.0394     0.0328
 0   0.2   (10, 10, ..., 10)      0.0362     0.0358     0.0338     0.0346     0.0320
 0   0.2   (25, 25, ..., 25)      0.0324     0.0370     0.0316     0.0348     0.0320
 0   0.2   (40, 40, ..., 40)      0.0386     0.0364     0.0408     0.0368     0.0346
 0   0.2   (10, 40, ..., 40)      0.0414     0.0314     0.0224     0.0218     0.0230
 0   0.5   (10, 10, ..., 10)      0.0232     0.0248     0.0256     0.0212     0.0160
 0   0.5   (25, 25, ..., 25)      0.0272     0.0262     0.0214     0.0214     0.0162
 0   0.5   (40, 40, ..., 40)      0.0258     0.0262     0.0214     0.0202     0.0196
 0   0.5   (10, 40, ..., 40)      0.0240     0.0164     0.0140     0.0126     0.0132
 1   0.0   (10, 10, ..., 10)      0.0362     0.0354     0.0348     0.0344     0.0332
 1   0.0   (25, 25, ..., 25)      0.0368     0.0348     0.0366     0.0272     0.0356
 1   0.0   (40, 40, ..., 40)      0.0386     0.0390     0.0350     0.0380     0.0374
 1   0.0   (10, 40, ..., 40)      0.0416     0.0280     0.0252     0.0234     0.0238
 1   0.2   (10, 10, ..., 10)      0.0318     0.0248     0.0254     0.0274     0.0220
 1   0.2   (25, 25, ..., 25)      0.0342     0.0282     0.0268     0.0256     0.0264
 1   0.2   (40, 40, ..., 40)      0.0322     0.0318     0.0300     0.0286     0.0252
 1   0.2   (10, 40, ..., 40)      0.0340     0.0200     0.0214     0.0000     0.0138
 1   0.5   (10, 10, ..., 10)      0.0186     0.0226     0.0224     0.0188     0.0188
 1   0.5   (25, 25, ..., 25)      0.0188     0.0202     0.0188     0.0138     0.0124
 1   0.5   (40, 40, ..., 40)      0.0260     0.0208     0.0174     0.0176     0.0180
 1   0.5   (10, 40, ..., 40)      0.0234     0.0152     0.0102     0.0100     0.0092

Table 7
Estimated Power of the FQ, Trimmed Mean Fixed Effect (TM), and One-Way ANOVA (AN) Tests, for the Five Non-Null Distribution Models (DM)

                                      Number of Independent Groups (s) and
                                 Number of Independent Pairwise Comparisons (NIPC)
                                s = 2, NIPC = 1        s = 6, NIPC = 15       s = 10, NIPC = 45
 DM   n = (n1, n2, ..., ns)    FQ     TM     AN       FQ     TM     AN       FQ     TM     AN
  1   (10, 10, ..., 10)       0.411  0.439  0.560    0.231  0.349  0.463    0.220  0.426  0.516
  2   (10, 10, ..., 10)       0.248  0.354  0.193    0.126  0.260  0.066    0.119  0.335  0.069
  3   (10, 10, ..., 10)       0.224  0.335  0.131    0.109  0.258  0.038    0.109  0.327  0.043
  4   (10, 10, ..., 10)       0.253  0.055  0.070    0.165  0.076  0.061    0.144  0.094  0.051
  5   (10, 10, ..., 10)       0.370  0.134  0.507    0.288  0.207  0.262    0.259  0.238  0.199
  1   (25, 25, ..., 25)       0.831  0.889  0.940    0.628  0.816  0.903    0.624  0.881  0.961
  2   (25, 25, ..., 25)       0.641  0.796  0.280    0.399  0.690  0.140    0.382  0.769  0.137
  3   (25, 25, ..., 25)       0.605  0.789  0.140    0.368  0.671  0.053    0.347  0.752  0.075
  4   (25, 25, ..., 25)       0.775  0.048  0.057    0.703  0.057  0.053    0.644  0.051  0.063
  5   (25, 25, ..., 25)       0.871  0.300  0.861    0.904  0.480  0.783    0.898  0.508  0.649
  1   (40, 40, ..., 40)       0.966  0.984  0.991    0.879  0.976  0.992    0.888  0.990  0.998
  2   (40, 40, ..., 40)       0.859  0.949  0.366    0.684  0.928  0.202    0.683  0.961  0.220
  3   (40, 40, ..., 40)       0.838  0.941  0.169    0.671  0.911  0.071    0.657  0.961  0.074
  4   (40, 40, ..., 40)       0.951  0.060  0.063    0.963  0.056  0.063    0.952  0.050  0.060
  5   (40, 40, ..., 40)       0.985  0.437  0.974    0.997  0.751  0.987    0.998  0.791  0.966
  1   (10, 40, ..., 40)       0.610  0.623  0.807    0.674  0.850  0.948    0.756  0.964  0.987
  2   (10, 40, ..., 40)       0.409  0.529  0.218    0.431  0.741  0.142    0.505  0.896  0.171
  3   (10, 40, ..., 40)       0.389  0.521  0.112    0.418  0.731  0.067    0.487  0.890  0.081
  4   (10, 40, ..., 40)       0.567  0.049  0.001    0.247  0.055  0.031    0.162  0.054  0.041
  5   (10, 40, ..., 40)       0.763  0.348  0.561    0.505  0.400  0.217    0.412  0.397  0.164

3.3. Power

For model 1 (normal distribution, shift in location only), the ANOVA test had the highest power. The FQ test performed reasonably for this model (see Table 7). For models 2 and 3 (contaminated normal, shift in location only), the TM test performed the best.
The FQ test followed the performance of the TM test very closely when two independent groups were compared. The performance of the FQ test grew somewhat worse with an increasing number of independent groups. The ANOVA test completely broke down, having power not much larger than the probability of the type I error (see Table 7). For model 4 (normal distribution, shift in scale only), both the ANOVA and TM tests failed as expected, rejecting the null hypothesis at approximately the α level. For model 5 (normal distribution, shift in location and scale), the FQ test performed the best (see Table 7).

4. Discussion

The FQ test, though conservative with heavy tails, performed satisfactorily over all distribution models. For two independent groups, the FQ test equaled the performance of the multiple quantiles test (Wilcox, 1995a). For some distribution models (1, 2, and 3), the FQ test performed worse than either the ANOVA or the TM test. However, for any model, the FQ test never completely broke down, as shown in Figure 6. Figure 6 displays the power of the FQ, TM, and ANOVA tests over the five distribution models for two independent groups. Each group had 25 observations.

[Figure 6: plot not reproduced. Horizontal axis: Distribution Model.]
Figure 6. Power for the comparison of two independent groups with 25 observations each.

Researchers do not know the true distribution of their data. For that reason, a test that performs well over all distributions is desirable. The FQ test is not perfect: it is conservative with heavy-tailed distributions and has a moderate tendency to lose power as the number of comparisons increases. However, the FQ test never completely fails, unlike the ANOVA or TM test. Also, the FQ test is a tool that allows the researcher to investigate tail behavior in addition to central tendency.
For these reasons, the FQ test offers a valuable alternative to tests like the ANOVA or TM test. As for the conservative tendency of the FQ test with heavy tails, this seems to be the result of overestimation of the HD standard error. That is why the critical values reported in Table 4 are lower than those reported in Table 3. This overestimation of variance affects both resampling methods, the bootstrap and the jackknife. At the moment there seems to be no other viable alternative for estimating the variance of the HD estimator.

5. References

Doksum, K. A. (1974). Empirical probability plots and statistical inference for nonlinear models in the two-sample case. The Annals of Statistics, 2, 267-277.
Doksum, K. A., & Sievers, G. L. (1976). Plotting with confidence: Graphical comparisons of two populations. Biometrika, 63, 421-434.
Efron, B. (1979). The 1977 Rietz lecture: Bootstrap methods: Another look at the jackknife. The Annals of Statistics, 7, 1-26.
Efron, B. (1981). Nonparametric estimates of standard error: The jackknife, the bootstrap and other methods. Biometrika, 68, 589-599.
Harrell, F. E., & Davis, C. E. (1982). A new distribution-free quantile estimator. Biometrika, 69, 635-640.
Lunneborg, C. E. (1986). Confidence intervals for a quantile contrast: Application of the bootstrap. Journal of Applied Psychology, 71, 451-456.
Parrish, R. S. (1990). Comparisons of quantile estimators in normal samples. Biometrics, 46, 247-258.
Wilk, M. B., & Gnanadesikan, R. (1968). Probability plotting methods for the analysis of data. Biometrika, 55, 1-17.
Wilcox, R. R., & Charlin, V. L. (1986). Comparing medians: A Monte Carlo study. Journal of Educational Statistics, 11, 263-274.
Wilcox, R. R. (1994b). The percentage bend correlation coefficient. Psychometrika, 59, 601-616.
Wilcox, R. R. (1995a). Comparing two independent groups via multiple quantiles. The Statistician, 44, 91-99.
Wilcox, R. R. (1995b). Some practical reasons for reconsidering the Kolmogorov-Smirnov test. Unpublished manuscript, University of Southern California, Department of Psychology.
Wilcox, R. R. (1996). Statistics for the social sciences. San Diego: Academic Press.
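Two computations that the results above lean on can be sketched in a few lines of Python: the critical value function summarized in Table 1, and the g-and-h transform used to generate the skewed and heavy-tailed samples. This is an illustrative sketch under stated assumptions, not the thesis's own code. The function names (`fq_critical_value`, `g_and_h`, `sample_g_and_h`) are hypothetical. The layout of the Table 1 formulas, in particular W_p = n_min^(C_p) / SR_p, is a reading of a damaged scan and may not match the original exactly. The g-and-h transform is the standard definition of that family, which this section uses but does not restate.

```python
import math
import random

# Constants transcribed from Table 1. Quantiles p and 1 - p share a column.
# ASSUMPTION: the formula layout used below is reconstructed from a damaged scan.
TABLE_1 = {
    0.10: dict(B0=-54.241, B1=153.420, B2=57.030, F=0.008,
               C=-1.64, A_SR=0.409, B_SR=2.145, E=-0.52),
    0.25: dict(B0=-2.482, B1=7.731, B2=5.139, F=0.060,
               C=-0.74, A_SR=0.528, B_SR=2.052, E=-0.61),
    0.50: dict(B0=-8.223, B1=4.308, B2=10.809, F=0.027,
               C=-0.53, A_SR=0.529, B_SR=2.031, E=-0.60),
}


def fq_critical_value(p, n_min, nipc):
    """Critical value CR_p for quantile p, smallest group size n_min,
    and nipc independent pairwise comparisons (hypothetical helper)."""
    key = round(min(p, 1.0 - p), 2)  # map .90 -> .10 and .75 -> .25
    if key not in TABLE_1:
        raise ValueError("p must be one of .10, .25, .50, .75, .90")
    c = TABLE_1[key]
    v = nipc ** c["F"]                           # V_p = NIPC^(F_p)
    sr = c["A_SR"] + c["B_SR"] * nipc ** c["E"]  # SR_p = A_SR,p + B_SR,p * NIPC^(E_p)
    w = n_min ** c["C"] / sr                     # W_p = n_min^(C_p) / SR_p (assumed reading)
    return c["B0"] + c["B1"] * w + c["B2"] * v


def g_and_h(z, g, h):
    """Transform a standard normal deviate z into a g-and-h deviate.
    g controls skewness, h controls tail heaviness; g = h = 0 returns z."""
    tail = math.exp(h * z * z / 2.0)
    if g == 0:
        return z * tail
    return (math.exp(g * z) - 1.0) / g * tail


def sample_g_and_h(n, g, h, rng=random):
    """Draw n observations from the g-and-h distribution."""
    return [g_and_h(rng.gauss(0.0, 1.0), g, h) for _ in range(n)]
```

Under this reading of Table 1, `fq_critical_value(0.5, 10, 1)` evaluates to roughly 3.08, and the value shrinks toward B_0p + B_2p (about 2.59 for the median with a single comparison) as the smallest group size grows. A call such as `sample_g_and_h(25, 1, 0.5, random.Random(0))` produces one seeded sample of 25 observations from the most extreme distribution considered in Tables 5 and 6.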
Asset Metadata
Creator
Muska, Jan
(author)
Core Title
Five quantile test: A comparison of independent groups
School
Graduate School
Degree
Master of Arts
Degree Program
Psychology
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
OAI-PMH Harvest,psychology, psychometrics
Language
English
Contributor
Digitized by ProQuest
(provenance)
Advisor
[illegible] (
committee chair
), Baker, Laura A. (
committee member
), Earleywine, Mitchell (
committee member
)
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c16-8744
Unique identifier
UC11341233
Identifier
1383539.pdf (filename),usctheses-c16-8744 (legacy record id)
Legacy Identifier
1383539.pdf
Dmrecord
8744
Document Type
Thesis
Rights
Muska, Jan
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the au...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA
Tags
psychology, psychometrics