Lecture 12
Ordinal Logistic Regression

This lecture briefly introduces ordinal logistic regression
• The context and data type
• The ordinal logistic regression equation
• Fitting an ordinal logistic regression
• Results and interpretation
• An illustrative example of fertility analysis using Stata

The context
• There are many contexts in which the dependent variable is ordinal, with three or more categories.
• Some typical examples are health status (very good, good, so-so, bad, very bad), political ideology (very liberal, slightly liberal, moderate, slightly conservative, very conservative), and fertility intention (the more the better, two, one, none).
• In these examples, the distance between categories is not equal.
• One option is to treat the variable as though it were continuous and simply use OLS regression. This is widely done, particularly when the dependent variable has 5 or more categories. However, it will often result in biased estimates of the regression parameters.
• Another option is to ignore the ordering of the categories and treat the variable as nominal, i.e., use the multinomial logit model (MNLM). The key problem is a loss of efficiency. By ignoring the fact that the categories are ordered, you fail to use some of the information available to you, and you may estimate many more parameters than is necessary. This increases the risk of getting insignificant results, but your parameter estimates should still be unbiased.

Data type
• As in other logistic regression models, the predictors in ordinal logistic regression may be quantitative, categorical, or a mixture of the two. The dependent variable should be discrete and ordinal with three or more categories.
• In SPSS, discrete (categorical) variables are entered as factors, and continuous variables as covariates.

The Ordered Logit Model (OLM)
• Say Y is an ordinal dependent variable with c categories. Let Pr(Y ≤ j) denote the probability that the response on Y falls in category j or below (i.e., in category 1, 2, …, or j). This is called a cumulative probability. It equals the sum of the probabilities in category j and below:
  Pr(Y ≤ j) = Pr(Y = 1) + Pr(Y = 2) + … + Pr(Y = j)
• A dependent variable Y with c categories has c cumulative probabilities: Pr(Y ≤ 1), Pr(Y ≤ 2), …, Pr(Y ≤ c). The final cumulative probability uses the entire scale; as a consequence, Pr(Y ≤ c) = 1. The order in which the cumulative probabilities are formed reflects the ordering of the dependent variable scale, and those probabilities satisfy:
  Pr(Y ≤ 1) ≤ Pr(Y ≤ 2) ≤ … ≤ Pr(Y ≤ c) = 1
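The definition of cumulative probabilities can be illustrated with a minimal Python sketch. The category probabilities in `probs` are hypothetical values chosen for illustration (they are not taken from any data in this lecture), and the variable names are assumptions of this sketch.

```python
from itertools import accumulate

# Hypothetical category probabilities Pr(Y = 1), ..., Pr(Y = 4)
# for an ordinal variable with c = 4 categories (illustrative only).
probs = [0.5, 0.25, 0.125, 0.125]

# Pr(Y <= j) is the running sum of Pr(Y = 1), ..., Pr(Y = j).
cumulative = list(accumulate(probs))

# The category probabilities can be recovered by differencing
# adjacent cumulative probabilities.
recovered = [cumulative[0]] + [
    cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))
]

print(cumulative)  # the last entry is Pr(Y <= c) = 1
```

The cumulative probabilities are non-decreasing and end at 1, which is the ordering property stated above.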
• In ordered logit, an underlying score that determines the response category of an observation is estimated as a linear function of the independent variables, along with a set of threshold points (also called cut points).
• The probability of observing response category i corresponds to the probability that the estimated linear function, plus random error, falls within the range of the threshold points estimated for that response:
\[
\Pr(\text{response category for the } j\text{th outcome} = i) = \Pr(k_{i-1} < b_1X_{1j} + b_2X_{2j} + \dots + b_kX_{kj} + u_j \le k_i)
\]
• One estimates the coefficients b_1, b_2, …, b_k along with the threshold points k_1, k_2, …, k_{i-1}, where i is the number of possible response categories of the dependent variable. All of this is a direct generalization of the binary logistic model.
• The coefficients and threshold points are estimated using maximum likelihood. In the parameterization used by SPSS, no constant appears because its effect is absorbed into the threshold points.
• The SPSS output provides single values for the b coefficients. The b coefficients (one for each X variable) are the main items of interest in the ordered logit table. (One of the advantages of using Stata is that odds ratios are available.)
• When b = 0, X has no effect on Y. The effect of X increases as the absolute value of b increases. There are no separate b coefficients for each of the outcomes (or for the number of outcomes minus one, as we saw in multinomial logistic regression, where we considered logistic regression with a nominal dependent variable).
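The threshold formulation can be sketched in Python as follows. This is a minimal illustration, assuming the random error u follows a standard logistic distribution, so that Pr(Y ≤ i) = F(k_i − xb) with F the logistic CDF. The cut points, coefficients, and covariate values are hypothetical, and the function names are assumptions of this sketch rather than part of any package.

```python
import math


def logistic_cdf(z):
    """Cumulative distribution function of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(xb, cutpoints):
    """Return Pr(Y = i) for i = 1, ..., len(cutpoints) + 1.

    xb is the linear predictor b_1*X_1 + ... + b_k*X_k and
    cutpoints is an increasing list of threshold points.
    """
    # Pr(Y <= i) = Pr(xb + u <= k_i) = F(k_i - xb); the last one is 1.
    cumulative = [logistic_cdf(k - xb) for k in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # Pr(k_{i-1} < xb + u <= k_i)
        previous = c
    return probs


# Hypothetical cut points and coefficients (illustrative only).
cutpoints = [-1.0, 0.5, 2.0]   # three thresholds -> four response categories
b = [0.8, -0.3]
x = [1.0, 2.0]
xb = sum(bi * xi for bi, xi in zip(b, x))

probs = category_probabilities(xb, cutpoints)
print(probs)
```

With this parameterization, a larger linear predictor shifts probability toward the higher response categories, while the cut points themselves stay fixed.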
The ordered logit model has the form:
\[
\begin{aligned}
\ln\left[\frac{p_1}{1 - p_1}\right] &= a_1 + b_1X_1 + b_2X_2 + \dots + b_nX_n \\
\ln\left[\frac{p_1 + p_2}{1 - (p_1 + p_2)}\right] &= a_2 + b_1X_1 + b_2X_2 + \dots + b_nX_n \\
&\;\;\vdots \\
\ln\left[\frac{p_1 + p_2 + \dots + p_{c-1}}{1 - (p_1 + p_2 + \dots + p_{c-1})}\right] &= a_{c-1} + b_1X_1 + b_2X_2 + \dots + b_nX_n
\end{aligned}
\]
where p_j = Pr(Y = j) and p_1 + p_2 + … + p_c = 1.
• In the OLM, a particular b coefficient takes the same value in the logit equation for each cumulative probability. That is, the model assumes that the effect of X is the same for each cumulative probability. This cumulative logit model with common effects is often called a “proportional odds” model.

Estimating an ordered logit model
• The explication of the OLM is facilitated by considering an example using the 1997 data. Suppose that the response variable is the health status of children, which is captured by question 302F:
  F. Health conditions of live births?
  1) Healthy
  2) Basically healthy
  3) Sick but not disabled
  4) Congenitally disabled
  5) Disabled after birth
  6) Dead
  7) N/A
• We are going to examine the effect on child health of maternal age at childbearing, residence, ethnicity, education, duration of breastfeeding, and child sex.
• We recode the health status variable into 4 categories: (1) healthy, (2) basically healthy, (3) sick or disabled, and (4) dead, as shown in the following table (we restrict our sample to children aged 0-5):

HEALTH4
                               Frequency   Percent   Valid Percent   Cumulative Percent
Valid     healthy                   1121      75.8            89.3                 89.3
          basically healthy           90       6.1             7.2                 96.5
          sick or disabled            15       1.0             1.2                 97.7
          dead                        29       2.0             2.3                100.0
          Total                     1255      84.9           100.0
Missing   System                     224      15.1
Total                               1479     100.0
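The percentage columns of the HEALTH4 table follow directly from the frequencies. The Python sketch below recomputes them from the counts reported in the table and also derives the observed cumulative logits ln[P(Y ≤ j) / (1 − P(Y ≤ j))] for the first three categories. The variable names are assumptions of this sketch; only the counts come from the table.

```python
import math

# Frequencies from the HEALTH4 table.
counts = {
    "healthy": 1121,
    "basically healthy": 90,
    "sick or disabled": 15,
    "dead": 29,
}
missing = 224

valid_total = sum(counts.values())      # valid cases
grand_total = valid_total + missing     # all cases, including missing

rows = []              # (label, percent, valid percent, cumulative percent)
observed_logits = []   # cumulative logits for categories 1..3
running = 0
for label, n in counts.items():
    running += n
    rows.append((
        label,
        round(100 * n / grand_total, 1),
        round(100 * n / valid_total, 1),
        round(100 * running / valid_total, 1),
    ))
    # The last cumulative probability is 1, so its logit is undefined.
    if running < valid_total:
        observed_logits.append(math.log(running / (valid_total - running)))

for row in rows:
    print(row)
print(observed_logits)
```

The observed cumulative logits increase with j because the cumulative probabilities do; in the fitted model this ordering is carried by the intercepts (threshold points).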
We are going to fit the following equation:
\[
\ln\left[\frac{P(Y \le j)}{1 - P(Y \le j)}\right] = a_j + b_1X_1 + b_2X_2 + \dots + b_nX_n
\]
Dependent variable: health status, denoted as health4 (4 categories: healthy, basically healthy, sick or disabled, and dead).

• Independent variables:
  Mac: maternal age at childbearing, an interval variable
  Par_num: parity, an interval variable
  Bfeed: duration of breastfeeding, an interval variable
  Chdsex: child sex, 1 if a girl, 0 otherwise
  Urban: place of residence, 1 if urban, 0 otherwise
  Han: 1 if Han, 0 otherwise
  Primary: 1 if primary school, 0 otherwise
  Junior: 1 if junior middle school, 0 otherwise
  Sencol: 1 if senior middle school and over, 0 otherwise

• Our hypothesis is that both child and maternal characteristics affect child survival. Women in higher socio-economic categories will be more likely to have healthier children. Prolonged breastfeeding is associated with an increased probability that a child is healthy. The practice of discrimination against girls suggests that a girl is more likely to be in a worse state of health than a boy.

The ordinal logistic regression equation in our example:
\[
\begin{aligned}
\ln\left[\frac{P(Y \le j)}{1 - P(Y \le j)}\right] = a_j &+ b_1\,\mathit{Mac} + b_2\,\mathit{Par\_num} + b_3\,\mathit{Bfeed} \\
&+ b_4\,\mathit{Chdsex} + b_5\,\mathit{Urban} + b_6\,\mathit{Han} \\
&+ b_7\,\mathit{Primary} + b_8\,\mathit{Junior} + b_9\,\mathit{Sencol}
\end{aligned}
\]
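To make the example equation concrete, the Python sketch below evaluates it for two otherwise identical children, a boy and a girl. All intercepts and coefficients are hypothetical placeholders chosen for illustration; they are not estimates from the lecture data, and the function names are assumptions of this sketch. Note the sign convention of this form: because the left-hand side is the logit of P(Y ≤ j), a positive b raises the probability of the lower (healthier) categories.

```python
import math


def inv_logit(z):
    """Logistic function: maps a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical values for illustration only; NOT estimates from the data.
intercepts = [2.0, 3.0, 3.5]   # a_1 < a_2 < a_3 for the 4 categories of health4
coefs = {
    "Mac": 0.01, "Par_num": -0.10, "Bfeed": 0.02,
    "Chdsex": -0.30, "Urban": 0.40, "Han": 0.20,
    "Primary": 0.10, "Junior": 0.25, "Sencol": 0.50,
}


def cumulative_probabilities(child):
    """P(Y <= j), j = 1..4, from ln[P / (1 - P)] = a_j + sum of b * X."""
    eta = sum(coefs[name] * child[name] for name in coefs)
    return [inv_logit(a + eta) for a in intercepts] + [1.0]


def category_probabilities(child):
    """P(Y = j), j = 1..4, by differencing the cumulative probabilities."""
    cum = cumulative_probabilities(child)
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]


def cumulative_odds(child):
    """Odds P(Y <= j) / (1 - P(Y <= j)) for j = 1..3."""
    return [p / (1.0 - p) for p in cumulative_probabilities(child)[:-1]]


# A hypothetical child profile; the girl differs from the boy only in Chdsex.
boy = {"Mac": 25, "Par_num": 1, "Bfeed": 12, "Chdsex": 0, "Urban": 0,
       "Han": 1, "Primary": 0, "Junior": 1, "Sencol": 0}
girl = dict(boy, Chdsex=1)

# Under proportional odds, the girl/boy ratio of cumulative odds is the
# same for every j and equals exp(b_4).
odds_ratios = [g / b for g, b in zip(cumulative_odds(girl), cumulative_odds(boy))]

print(category_probabilities(boy))
print(category_probabilities(girl))
print(odds_ratios, math.exp(coefs["Chdsex"]))
```

The constant odds ratio across the three cumulative splits is what the "proportional odds" label refers to, and exp(b) is the quantity reported when odds ratios are requested.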