TUTORIAL ON QUANTILE REGRESSION
Transcripción
TUTORIAL ON QUANTILE REGRESSION
TUTORIAL ON QUANTILE REGRESSION Xuming He Department of Statistics, University of Illinois at Urbana-Champaign Ying Wei Department of Biostatistics, Columbia University ENAR 2005 TUTORIAL ON QUANTILE REGRESSION – p. 1 Motivation for quantile regression • Regression — to obtain a summary of the relationship between a response variable y and a set of covariates x. • Least squares regression captures how the mean of y changes with x. • Sometimes, a single mean curve is not informative enough. • Conditional quantile functions provide a more complete view. TUTORIAL ON QUANTILE REGRESSION – p. 2 10 8 12 Motivation for quantile regression 6 8 0.9 4 Y 6 Y 0.75 4 0.5 0.9 2 0.75 2 0.25 0.1 0.0 0.2 0.4 0.6 0.8 0 0.5 0.25 0.1 1.0 X The distribution of y may be asymmetric around the mean. 0.6 0.8 1.0 1.2 1.4 1.6 X Heteroscedasticity may exist in the data. TUTORIAL ON QUANTILE REGRESSION – p. 3 Example 1 – Risk factors for low birthweight • Low birthweight is known to be associated with ◦ higher infant mortality (Abreveya, 2001). ◦ higher health-care cost (Lewit et al., 1995). ◦ a wide range of subsequent health problems (Hack et al., 1995). ◦ long-term educational attainment and even labor market outcomes (Corman and Chaikind, 1998, and Currie and Hyson, 1999). • Investigate the factors influencing birthweight, especially the ones that may help reduce the incidence of low birthweight infants. TUTORIAL ON QUANTILE REGRESSION – p. 4 Example 1 – Risk factors for low birthweight • The research question can be rephrased as exploring the covariate effects on the lower quantiles of birthweight. • Potential covariates include ◦ Mother’s education ◦ Mother’s prenatal care ◦ Mother’s age ◦ Mother’s weight gain ◦ ... • Covariate effects on lower quantiles may differ from those on the mean or median birthweight. • Reference: Abreveya (2001) and Koenker and Hallock (2001). TUTORIAL ON QUANTILE REGRESSION – p. 5 Example 2 – Growth charts 20 95% 75% 15 25% 10 5% 5 Weight (kg) 50% 0.0 0.5 1.0 1.5 • Display the distribution of a certain measurement within a certain range of time for a certain population. 2.0 Age TUTORIAL ON QUANTILE REGRESSION – p. 6 Example 2 – Growth charts 20 95% 75% B 5% measurements from an individual subject in the context of population values. 10 15 25% • Focused on the entire 5 Weight (kg) 50% • Used to screen the A 0.0 0.5 1.0 1.5 2.0 distribution, especially the extreme quantiles. Age TUTORIAL ON QUANTILE REGRESSION – p. 7 Example 2 – Growth charts • Widely used in medical sciences: Pediatrics: to monitor an infant’s growth status by screening his/her height, weight and head circumference. (Hamill et al., 1979) HIV research: to monitor CD4 lymphocyte counts in uninfected children born to HIV-1 infected women. Genetics: to help determine the gene frequency of the most common mutations of the HFE gene causing hereditary hemochromatosis.(Koziol et al., 2001) TUTORIAL ON QUANTILE REGRESSION – p. 8 Outlines • Quantile regression ◦ ◦ ◦ Estimation Properties Computational aspects • Software in SAS and R • Examples (continue) • Inference in quantile regression • Additional topics ◦ Beyond the linear models ◦ Quantile regression with censored data TUTORIAL ON QUANTILE REGRESSION – p. 9 Univariate Sample Quantile via Optimization A random sample {y1 , y2 , ..., yn }, Median = argminξ The τ th sample quantile = argminξ where X X |yi − ξ|; ρτ (yi − ξ), τ τ−1 ρτ (u) = u(τ − 1(u<0) ). 0 TUTORIAL ON QUANTILE REGRESSION – p. 10 Linear Regression Quantile (Koenker and Bassett, 1978) • A linear model for the τ th quantile yi = x⊤ i β τ + ei , i = 1, . . . , n, (1.1) where the τ th quantile of ei is zero. • Underlying assumption of model (1.1): Qτ (Y |X) = X ⊤ β τ • Estimator of β τ β̂ τ = argminβ ∈Rp n X i=1 ρτ (yi − x⊤i β). • Regression Quantile: Q̂τ (Y |X) = X ⊤ β̂ τ . TUTORIAL ON QUANTILE REGRESSION – p. 11 Basic properties of regression quantiles • Scale equivariant: for any a > 0, β̂ τ (ay, X) = aβ̂ τ (y, X), β̂ τ (−ay, X) = aβ̂ 1−τ (y, X). • Regression Shift: for any γ ∈ Rp , β̂ τ (y + Xγ, X) = β̂ τ (y, X) + γ. • Reparameterization of design: for any |A| = 6 0, β̂ τ (y, AX) = A−1 β̂ τ (y, X). • Equivariant to monotone transformation, Qτ (h(Y )|X) = h(Qτ (Y |X)). TUTORIAL ON QUANTILE REGRESSION – p. 12 Asymptotic properties of regression quantiles • Asymptotics for the univariate sample quantile. Let {y1 , y2 , ..., yn } be an i.i.d. random sample with distribution F , and X ρτ (yi − ξ), ξˆτ = argminξ i then √ τ (1 − τ ) ˆ , n(ξτ − ξτ ) ∼ N 0, 2 −1 f (F (τ )) where f = F ′ . TUTORIAL ON QUANTILE REGRESSION – p. 13 Asymptotic properties of regression quantiles • Asymptotics for linear models. yi = x⊤ i βτ + ei , where ei ∼ Fi and Fi−1 (τ ) = 0 then the join distribution √ n Vn−1/2 (β̂τk − βτk )m k=1 ∼ N (0, In ) , (1.2) where ◦ Hn (τ ) = n−1 P xi x⊤ i fi (0), and Dn = n−1 P xi x⊤ i ; ◦ Vn = [(τk ∧ τl − τk τl )Hn (τk )−1 Dn Hn (τl )−1 ]m . k,l=1 • Reference: Koenker and Bassett (1978), He and Shao (1996). TUTORIAL ON QUANTILE REGRESSION – p. 14 Computational aspects • β̂ τ can be calculated by means of linear programming, which is efficient in computation. ◦ Primal formulation as a Linear Program 2n }. min{τ 1⊤ u+(1−τ )1⊤ v|y = Xb+u−v, (b, u, v) ∈ Rp ×R+ ◦ Dural formulation as a Linear Program max{y ⊤ d|X ⊤ d = (1 − τ )X ⊤ 1, d ∈ [0, 1]n }. ◦ Solutions are characterized by an exact fit to p observations. TUTORIAL ON QUANTILE REGRESSION – p. 15 Existing algorithms and practical recommendations • Simplex method (Koenker and D’Orey, 1987 and 1993) ◦ Start at a random vertex, and search around the edge of a circumscribed polygon, until the optimum is found. ◦ For moderate data size. • Interior Point method (Portnoy and Koenker, 1997) ◦ Start from the interior of the circumscribed polygon to avoid crossing the boundary. ◦ Computationally efficient for large data size. TUTORIAL ON QUANTILE REGRESSION – p. 16 Existing algorithms and practical recommendations • Interior point method with preprocessing (Portnoy and Koenker, 1997) ◦ For very large datasets, e.g., n > 105 . • Smoothing method (Chen, 2004) ◦ Search the optimized solution by smoothing the objective function ρτ (·). TUTORIAL ON QUANTILE REGRESSION – p. 17 Estimation in SAS • Package: SAS/STAT PROC QUANTREG ◦ Available as an experimental procedure for SAS 9.1 (Windows only); ◦ Downloadable from http://www.sas.com/statistics • Basic syntax PROC QUANTREG DATA = sas-data-set <options>; BY variables; CLASS variables; MODEL response = independents </ options>; RUN; TUTORIAL ON QUANTILE REGRESSION – p. 18 Estimation in SAS • To specify the quantile level: ◦ Use the option QUANTILE in the MODEL statement MODEL Y = X / QUANTILE = <number list | ALL>; choice of quantile(s) example a single quantile QUANTILE = 0.25 several quantiles QUANTILE = 0.25 0.5 0.75 entire quantile process QUANTILE = ALL ◦ Default value is 0.5, the median. TUTORIAL ON QUANTILE REGRESSION – p. 19 Estimation in SAS • To specify the algorithm: ◦ Use the option ALGORITHM in the PROC QUANTREG statement method option Simplex ALGORITHM = SIMPLEX Interior point ALGORITHM = INTERIOR Interior point with preprocessing ALGORITHM = INTERIOR PP Smoothing ALGORITHM = SMOOTHING ◦ By default, the Simplex method is used. TUTORIAL ON QUANTILE REGRESSION – p. 20 Estimation in SAS • Default Output — report the name of the data set and the response variable, the number of covariates, the number of observations, algorithm of optimization and the method for confidence intervals. Model information — report the sample mean and standard deviation, sample median, MAD and interquartile range for each variable included in the MODEL statement. Summary statistics — report the quantile level to be estimated and the optimized objective function. Quantile objective function — report the estimated coefficients and their 95% confidence intervals. Parameter Estimates TUTORIAL ON QUANTILE REGRESSION – p. 21 Estimation in R • R package quantreg ( contributed by Koenker). ◦ Official releases of R and the install package of quantreg are available at http://lib.stat.cmu.edu/R/CRAN/ • The syntax of rq(). library(quantreg) rq(formula, tau=.5, data, weights, na.action, method="br", ...) TUTORIAL ON QUANTILE REGRESSION – p. 22 Estimation in R • Arguments formula model statement, e.g. y ∼ x1 + x2 tau the quantile(s) to be estimated: ◦ If tau is a single number ∈ (0,1), return a single quantile function. ◦ If tau is a vector ∈ (0,1), return several quantile functions. ◦ If tau 6∈ (0,1), return the entire quantile process. TUTORIAL ON QUANTILE REGRESSION – p. 23 Estimation in R method • method="br" Simplex method (default). • method="fn" the Frisch−Newton interior point method. • method="pfn" the Frisch−Newton approach with preprocessing. TUTORIAL ON QUANTILE REGRESSION – p. 24 Example 1: Exploring the risk factors of low birthweight • Data (NATALITY1997) : ◦ National Center for Health Statistics (NCHS). ◦ Provide information on live births in the United States during calender year 1997, including infant birthweight. ◦ Also provide – socioeconomic and demographic characteristics of mothers; – Mother’s prenatal care and pregnancy history TUTORIAL ON QUANTILE REGRESSION – p. 25 Example 1: Exploring the risk factors of low birthweight • Data (NATALITY1997) used in our example: ◦ Subset: live singleton birthes; mothers between the ages of 18 and 45, either black or white, residing in the U.S. ◦ Sample size: 198,377 observations. ◦ More detailed description of the data is available at http://www.cdc.gov/nchs/births.htm TUTORIAL ON QUANTILE REGRESSION – p. 26 A quantile regression model for birthweight Model: Qτ (Birthweight) = Boy + Married + Black + High School + Some College + College + No Prenatal + Prenatal Second + Prenatal Third + Smoker + Ciga Per Day + Mother’s Age + Mother’s Age 2 + Mother’s weight gain + Mother’s weight gain 2 Choice of quantile levels τ = (0.05, 0.1, 0.15, 0.2, 0.25, ..., 0.8, 0.85, 0.9, 0.95) TUTORIAL ON QUANTILE REGRESSION – p. 27 SAS codes for the birthweight model ods graphics on; PROC QUANTREG DATA=LIBRARY.natality; MODEL weight = Boy Married Black ... Mother_Wt_Gain2/QUANTILE = 0.05 0.1 ... 0.9 0.95 PLOT = QUANTPLOT; RUN; ods graphics off; TUTORIAL ON QUANTILE REGRESSION – p. 28 Output from SAS / PROC QUANTREG TUTORIAL ON QUANTILE REGRESSION – p. 29 Output from SAS / PROC QUANTREG TUTORIAL ON QUANTILE REGRESSION – p. 30 Output from SAS / PROC QUANTREG TUTORIAL ON QUANTILE REGRESSION – p. 31 Output from SAS / PROC QUANTREG TUTORIAL ON QUANTILE REGRESSION – p. 32 Some conclusions for Example 1 • Heteroscedasticity exists in the Natality data. • Mother’s race, education and prenatal care have much more significant impact on lower quantiles of birthweight than the upper quantiles. • These effects will be underestimated by least squares regression. TUTORIAL ON QUANTILE REGRESSION – p. 33 Example 2: Growth charts • Data: Berkeley data of Tuddenham and Snyder (1954) • Number of subjects: 136 • Measurements: longitudinal measurements of height and weight collected ◦ quarterly between ages 0 and 3; ◦ yearly between ages 2 and 8; ◦ semi-yearly between ages 8 and 21. TUTORIAL ON QUANTILE REGRESSION – p. 34 Growth charts Unconditional Reference Centiles −− Boy’s weight, 0−18 years 80 0.95 0.9 0.75 0.5 60 0.25 0.1 40 20 0 Weight (kg) 0.05 0 5 10 15 Age (years) TUTORIAL ON QUANTILE REGRESSION – p. 35 Conditional growth chart • Screening an individual subject given his/her prior path or other information (Wei, Pere, Koenker and He, 2004). • e.g., the rapid weight gain during infancy could lead to obesity later in the childhood (Stettler et al., 2002). • A simple AR(2) model Qτ (Wt ) = βτ 0 + βτ 1 Wt−1 + βτ 2 Wt−2 + βτ 3 Ht , where ◦ Wt is the current weight at time t. ◦ Wt−1 and Wt−2 are two prior weights at time t − 1 and t − 2, respectively. ◦ Ht is the current height at time t. TUTORIAL ON QUANTILE REGRESSION – p. 36 10 8 Wt = 9.8kg 6 Wt−1 = 7.2kg Wt−2 = 5.4kg Ht = 70.6cm 4 Weight (Kg) 12 14 An illustrative example 0.0 0.2 0.4 0.6 0.8 1.0 Age (year) TUTORIAL ON QUANTILE REGRESSION – p. 37 Example of conditional growth chart (cont.) • Model: Qτ (W0.75 ) = β0 + β1 W0.5 + β2 W0.25 + β3 H0.75 The resulting estimated conditional quantiles τ 0.05 0.1 0.25 0.5 0.75 0.9 0.95 Q̂τ 8.04 8.18 8.38 8.74 9.05 9.36 9.55 • Conclusion: His current weight = 9.8 > 9.55 = 0.95th conditional quantile. =⇒ The subject gained an unusual amount of weight at age 0.75 given his prior weights and current height. TUTORIAL ON QUANTILE REGRESSION – p. 38 Comparison with Least Squares Analysis • Gaussian model (Cole, 1994) Zt = β0 + β1 Zt−1 + Ut where Ut ∼ N (0, σ 2 ). where Zt is the Box-Cox transformed weight, i.e., ( λ Wt −1 λ 6= 0 ; λ , Zt = ln(Wt ), λ = 0. • The conditional quantiles Q̃τ (Zt ) = β̂0 + β̂1 Zt−1 + σΦ−1 (τ ). • Marginal normality =⇒ Joint normality ? TUTORIAL ON QUANTILE REGRESSION – p. 39 Normal Q−Q Plot 0 1 2 1 −1 −2 −2 −1 0 1 2 −2 −1 0 1 Theoretical Quantiles Theoretical Quantiles t = 9.5 t−1 = 9 t = 9.5 ; t−1 = 9 normality of Z(t) normality of Z(t−1) Normal Q−Q Plot −1 0 1 2 −2 −1 0 1 2 0 e+00 2 −1 e−04 Sample Quantiles 1 e−04 Theoretical Quantiles Sample Quantiles −2 0 Sample Quantiles 230 Sample Quantiles 190 −1 0.9714 0.9718 0.9722 0.9726 Sample Quantiles −2 0.9714 0.9718 0.9722 0.9726 210 230 210 190 Sample Quantiles 2 3 250 normality of Z(t−1) 250 normality of Z(t) −2 −1 0 1 Theoretical Quantiles Theoretical Quantiles Theoretical Quantiles t = 14 t−1 = 13.5 t = 14 ; t−1 = 13.5 2 TUTORIAL ON QUANTILE REGRESSION – p. 40 Inference in quantile regression 1. Direct estimation • Based on the asymptotic normality of the estimators and the direct estimation of their variance-covariance matrix. • In an i.i.d. error model, yi = x⊤ i β + ei , ei ∼ F. ◦ The variance covariance matrix of β̂τ is (τ (1 − τ )/f (F −1 (τ ))2 )(X ⊤ X)−1 , where f (F −1 (τ )) is the common error density evaluated at F −1 (τ ). TUTORIAL ON QUANTILE REGRESSION – p. 41 Inference in quantile regression 1. Direct estimation • In a non i.i.d. error model, yi = x⊤ i β + ei ei ∼ Fi . • The variance covariance matrix of β̂τ is (τ (1 − τ ))(X ⊤ F X)−1 (X ⊤ X)(X ⊤ F X)−1 , where F = diag{f1 (F1−1 (τ )), f2 (F2−1 (τ )), ..., fn (Fn−1 (τ ))}, and fi is the density function of ei evaluated at its τ th quantile Fi−1 (τ ). TUTORIAL ON QUANTILE REGRESSION – p. 42 Inference in quantile regression 2. Rank score method • Avoid direct estimation of the error densities. • Introduced by Gutenbrunner, Jurěcková, Koenker, and Portnoy (1993) for an i.i.d error model yi = x⊤ i β + ei , ei i.i.d. • Extended by Koenker and Machado (1999) to location-scale regression models ⊤ yi = x⊤ i β + (xi γ)ei . • Generalization of sign tests. TUTORIAL ON QUANTILE REGRESSION – p. 43 Inference in quantile regression 3. Resampling method • To avoid direct estimation of variance-covariance matrix. • Resampling methods ◦ Pairwise bootstrap (Efron and Tibshirani, 1998). ◦ Bootstrapping the estimating equations (Parzen, Wei and Ying, 1993). ◦ Markov chain marginal bootstrap (MCMB) (He and Hu, 2002). TUTORIAL ON QUANTILE REGRESSION – p. 44 Inference in quantile regression 3. Resampling method (cont.) • Confidence interval based on bootstrap sample {β̂τ∗1 , ..., β̂τ∗m } ◦ SD-based [β̂τ ± zα/2 SD∗ (β̂τ∗ )]. ◦ Percentile based [2β̂τ − Q∗1−α/2 , 2β̂τ + Q∗α/2 ]. TUTORIAL ON QUANTILE REGRESSION – p. 45 Inference in quantile regression 4. Summary • Direct estimation ◦ The asymptotic validity holds generally. ◦ The performance in finite samples is sensitive to the choice of smoothing. ◦ Computationally efficient. • Rank score method ◦ The asymptotic validity holds for certain models. ◦ The performance is stable, and robust against common deviations from the model assumptions. ◦ Time consuming for large data sets. TUTORIAL ON QUANTILE REGRESSION – p. 46 Inference in quantile regression 4. Summary • Resampling ◦ The asymptotic validity holds generally. ◦ Could be time consuming for large data sets. ◦ MCMB considerably relieves the computation burden by reducing a high-dimension problem to several one-dimensional problems. TUTORIAL ON QUANTILE REGRESSION – p. 47 Inference in quantile regression 5. Practical recommendations • Use rank score method for relatively small problems, e.g. n ≤ 1000 and p ≤ 10. • Use MCMB for moderately large problem, e.g. 1 × 104 < np < 2 × 106 . • Use direct estimation with non i.i.d. error assumption for very large problems. • Defaults in SAS PROC QUANTREG are similar. • Reference: Kocherginsky, He and Mu (2005). TUTORIAL ON QUANTILE REGRESSION – p. 48 Inference in quantile regression 6. Computation packages • SAS — specify at PROC QUANTREG Statement PROC QUANTREG CI= <NONE|RANK|...> ALPHA = value ◦ By default, the rank score method is used for small data sets (n < 5000 and p < 20). Otherwise, the resampling method is used. • R – use function summary.rq(). summary.rq(object, se = “nid”, ...) ◦ By default, the rank score method is used. If rq(..., iid = F, ), then the direct method with non i.i.d. error model is used. TUTORIAL ON QUANTILE REGRESSION – p. 49 Inference in quantile regression List of inference options in the computation packages Options Methods subcategory R SAS Direct i.i.d. model se = "iid" CI = SPARCITY/IID n.i.d. model se = "nid" CI = SPARCITY se = "rank", CI = RANK se = "boot", Not avaiable Rank Score Resampling Pairwise bsmethod ="xy" Parzen, Wei and Ying se = "boot", Not available bsmethod ="pxy" MCMB use R package rqmcmb2 CI = RESAMPLING TUTORIAL ON QUANTILE REGRESSION – p. 50 Examples (cont.) • Birthwegiht example: ◦ Using the MCMB method implemented in SAS, construct confidence band of the estimated coefficients. ◦ Comparison of computing times with the other methods. • Growth chart example: ◦ Using the Rank Score method implemented in R, to test the significance of prior weights and current height. ◦ Comparison of confidence intervals using different methods. TUTORIAL ON QUANTILE REGRESSION – p. 51 Comparison of inference methods • Comparison of computing time for the birthweight example. List of total computing time for τ = 0.5. CPU time (min.) Methods R SAS Direct with n.i.d. model 0.4 0.5 Rank Score > 30 > 30 Resampling with MCMB 3.6 3.0 ◦ The CPU time includes both estimation (using interior point algorithm with preprocessing) and confidence interval calculation. ◦ CPU 1.4 GHz and Memory 512 M. TUTORIAL ON QUANTILE REGRESSION – p. 52 Growth chart example Qτ (W0.75 ) = βτ 0 + βτ 1 W0.5 + βτ 2 W0.25 + βτ 3 H0.75 , W0.5 W0.25 H0.75 τ βτ 1 90% CI βτ 2 90% CI βτ 3 90% CI 0.05 0.89 (0.42,1.41) 0.07 (-1.20,0.26) 0.07 (-0.06,0.10) 0.10 1.00 (0.68,1.34) -0.05 (-0.48, 0.20) 0.05 (-0.01,0.10) 0.25 1.18 (0.73,1.32) -0.27 (-0.43,0.05) 0.02 (0.00,0.11) 0.50 0.90 (0.77,1.11) -0.05 (-0.45,0.05) 0.04 (0.03,0.06) 0.75 1.02 (0.94,1.13) -0.21 (-0.46,0.10) 0.04 (-0.01,0.09) 0.90 1.12 (0.58,1.30) -0.34 (-0.39,0.56) 0.01 (-0.05,0.17) 0.95 0.62 (0.42,1.56) 0.31 (-0.68,0.66) 0.08 (-0.10,0.19) * (n = 54 boys.) TUTORIAL ON QUANTILE REGRESSION – p. 53 Comparison of inference methods • Using growth chart example Rank method Direct method, nid Paired bootstrap τ β̂τ 1 90% CI length 90% CI length 90% CI length 0.05 0.89 (0.42,1.41) 0.99 (0.66,1.13) 0.47 ( 0.60,1.19) 0.59 0.10 1.00 (0.68,1.34) 0.66 (0.67,1.33) 0.66 ( 0.73,1.27) 0.55 0.25 1.19 (0.73,1.32) 0.59 (0.95,1.43) 0.47 ( 0.91,1.47) 0.56 0.50 0.90 (0.77,1.11) 0.34 (0.66,1.14) 0.48 ( 0.63,1.18) 0.55 0.75 1.02 (0.94,1.13) 0.20 (0.70,1.34) 0.65 ( 0.80,1.24) 0.44 0.90 1.12 (0.58,1.30) 0.72 (0.78,1.47) 0.69 ( 0.67,1.57) 0.90 0.95 0.62 (0.42,1.56) 1.14 (-.18,1.42) 1.60 (-0.01,1.26) 1.27 TUTORIAL ON QUANTILE REGRESSION – p. 54 Additional topics • Nonparametric quantile regression model Qτ (Yt ) = gτ (t) ◦ A simple idea: approximate gτ (t) by a linear combination of spline basis functions, and then use PROC QUANTREG with the basis functions as ”covariates". ◦ Reference: He and Shi (1994) and Koenker, Ng and Portnoy (1994). TUTORIAL ON QUANTILE REGRESSION – p. 55 • Quantile regression with censored data. The proportional hazard model h(t, x; β) = h0 (t)exp{x⊤ β} uses one global coefficient for the effect of each covariate on all quantile functions. Available software on censored quantile regression: (a) the censoring time known for all observations; or (b) right censoring all above one quantile function; or (c) censoring time independent of covariates. Reference: Portnoy (2003). TUTORIAL ON QUANTILE REGRESSION – p. 56 References [1] Abrevaya, J. (2001), The effects of demographics and maternal behavior on the disgtribution of birth outcomes. Empirical Economics, Vol. 26, 247 - 257. [2] Chen, C. (2004), An adaptive algorithm for quantile regression. Thoery and applications of recent robust methods. ed. M. Hubert, G. Pison, A. Struyf and S. Van Aelst, Series: Statistics for Industry and Technology, Birkhauser, Basel, 39 - 48. [3] Cole, T. J. (1994), Growth charts for both cross-sectional and longitudinal data. Statitics in Medicine, Vol. 13, 2477 - 2492. [4] Corman, H. and Chaikind, S. (1998), The effect of low birthweight on the school performance and behavior of school-aged children. Economics of Education Review, Vol. 17, 307 - 316. [5] Currie, J. and Hyson, R. (1999), Is the impack of health shocks cushioned by socioeconomic status? the case of low birthweight. American Economic Review, Vol. 89, 245 - 250. [6] Efron, B. and Tibshirani, R. (1998), An introduction to the Bootstrap. CRC Press, New York. [7] Gutenbrunner, C., Jurěcková, J., Koenker, R. and Portnoy, S. (1993), Tests of linear hypotheses based on regression rank scores. Journal of Nonparametric Statistics Vol. 2, 307 - 333. TUTORIAL ON QUANTILE REGRESSION – p. 57 References [8] Hack, M., Klein, N. K., and Taylor, G. (1995), Long-term developmental outcomes of low birth weight infants. The Future of Children, Vol. 5, 19 - 34. [9] Hamill, P. V. V., Dridzd, T. A., Johnson, C. L., Reed, R. B., Roche, A. F. and Moore, W. M. (1979), Physical grwoth: National Center for Health Statistics percentiles. American Journal of Clinical Nutrition, Vol. 32, 607 - 629. [10] He, X. and Hu, F. (2002), Markov Chain Marginal Bootstrap. Jounral of the American Statistical Association, Vol. 97, 783 - 795. [11] He, X. and Shao, Q. (1996), A general Bahadur representation of M -estimators and its application to linear regression with nonstochastic designs. The Annals of Statistics, Vol. 24, 2608 - 2630. [12] He, X. and Shi, P. (1994), Convergence rate of B-spline estimators of nonparametric conditional quantile functions. Journal of Nonparametric Statistics, Vol. 3, 299 - 308. [13] Lewit, E. M., Baker, L. S., Corman, H. and Shiono, P. H. (1995), The direct costs of low birth weight. The future of Children, Vol. 5, 35 - 51. [14] Kocherginsky, M., He, X. and Mu, Y. (2005), Practical Confidence Intervals for Regression Quantiles. Journal of Computational and Graphical Statistics, Vol. 14, 41 - 55. TUTORIAL ON QUANTILE REGRESSION – p. 58 References [15] Koenker, R. and Bassett, G. J. (1978), Regression quantiles. Econometrica, Vol. 46, 33-50. [16] Koenker,R. and Hallock, K. (2001), Quantile Regression. Journal of Economic Perspectives, Vol. 51, No. 4, 143-156. [17] Koenker, R. and D’Orey, V. (1987), Computing regression quantiles. Applied Statistics, Vol. 36, 383-393. [18] Koenker, R. and D’Orey, V. (1993), A remark on computing regression quantiles. Applied Statistics, Vol. 43, 410-414. [19] Koenker, R., Ng, P. and Portnoy, S. (1994), Quantile smoothing splines. Biometrika, Vol. 81, 673-680. [20] Koenker, R and Machado, J. A. F. (1999), Goodness of fit and related inference processes for quantile regression. Journal of the American Statistical Association, Vol. 94, 1296-1310. [21] Koziol, J. A., Ho, N. J., Felitti, V. J. and Beutler, E. (2001), Reference centiles for serum ferritin and percentage of transferrin saturation, with application to mutations of the HFE gene. Clinical Chemistry, Vol. 47, No. 10, 1804-1810. TUTORIAL ON QUANTILE REGRESSION – p. 59 References [22] Stettler, N., Zemel, B. S., Kumanyika, S. and Stallings, V. (2002), Infant weight gain and childhood overweight status in a multicenter cohort study. Pediatrics, Vol. 109, No. 2, 194-199. [23] Parzen, M. I., Wei, L. J. and Ying, Z. (1994), A resampling method based on pivotal estimating functions. Biometrika, Vol. 81, 341-350. [24] Portnoy, S. (2003), Censored Regression Quantiles, Jounral of the American Statistical Association, Vol. 98, 1001 - 1012; [25] Portnoy, S. and Koenker, R. (1997), The Gaussian Hare and the Laplacian Tortoise: Computability of squared-error versus absolute-error estimators. Statistical Science, Vol. 12, 279-300. [26] Tuddenham, R. D. and Snyder, M. M. (1954), Physical growth of California boys and girls from birth to age 18. California Public Child Development, Vol. 1, 183-364. [27] Wei, Y., Pere, A., Koenker, R., and He, X. (2004), Quantile regression methods for reference growth charts. Preprint. TUTORIAL ON QUANTILE REGRESSION – p. 60