Description Usage Arguments Value Author(s) References See Also Examples
Fit binary and proportional odds ordinal
logistic regression models using maximum likelihood estimation or
penalized maximum likelihood estimation. See cr.setup
for how to
fit forward continuation ratio models with lrm
.
For the print
method, format of output is controlled by the
user previously running options(prType="lang")
where
lang
is "plain"
(the default), "latex"
, or
"html"
.
1 2 3 4 5 6 7 8 9 10  lrm(formula, data=environment(formula),
subset, na.action=na.delete, method="lrm.fit",
model=FALSE, x=FALSE, y=FALSE, linear.predictors=TRUE, se.fit=FALSE,
penalty=0, penalty.matrix, tol=1e7,
strata.penalty=0, var.penalty=c('simple','sandwich'),
weights, normwt, scale=FALSE, ...)
## S3 method for class 'lrm'
print(x, digits=4, strata.coefs=FALSE,
coefs=TRUE, title='Logistic Regression Model', ...)

formula 
a formula object. An 
data 
data frame to use. Default is the current frame. 
subset 
logical expression or vector of subscripts defining a subset of observations to analyze 
na.action 
function to handle 
method 
name of fitting function. Only allowable choice at present is 
model 
causes the model frame to be returned in the fit object 
x 
causes the expanded design matrix (with missings excluded)
to be returned under the name 
y 
causes the response variable (with missings excluded) to be returned
under the name 
linear.predictors 
causes the predicted X beta (with missings excluded) to be returned
under the name 
se.fit 
causes the standard errors of the fitted values to be returned under
the name 
penalty 
The penalty factor subtracted from the log likelihood is
0.5 β' P β, where β is the vector of regression
coefficients other than intercept(s), and P is

penalty.matrix 
specifies the symmetric penalty matrix for nonintercept terms.
The default matrix for continuous predictors has
the variance of the columns of the design matrix in its diagonal elements
so that the penalty to the log likelhood is unitless. For main effects
for categorical predictors with c categories, the rows and columns of
the matrix contain a c1 \times c1 submatrix that is used to
compute the
sum of squares about the mean of the c parameter values (setting the
parameter to zero for the reference cell) as the penalty component
for that predictor. This makes the penalty independent of the choice of
the reference cell. If you specify 
tol 
singularity criterion (see 
strata.penalty 
scalar penalty factor for the stratification
factor, for the experimental 
var.penalty 
the type of variancecovariance matrix to be stored in the 
weights 
a vector (same length as 
normwt 
set to 
scale 
set to 
... 
arguments that are passed to 
digits 
number of significant digits to use 
strata.coefs 
set to 
coefs 
specify 
title 
a character string title to be passed to 
The returned fit object of lrm
contains the following components
in addition to the ones mentioned under the optional arguments.
call 
calling expression 
freq 
table of frequencies for 
stats 
vector with the following elements: number of observations used in the
fit, maximum absolute value of first
derivative of log likelihood, model likelihood ratio
chisquare, d.f.,
Pvalue, c index (area under ROC curve), Somers' D_{xy},
GoodmanKruskal gamma, Kendall's taua rank
correlations
between predicted probabilities and observed response, the
Nagelkerke R^2 index, the Brier score computed with respect to
Y > its lowest level, the gindex, gr (the
gindex on the odds ratio scale), and gp (the gindex
on the probability scale using the same cutoff used for the Brier
score). Probabilities are rounded to the nearest 0.0002
in the computations or rank correlation indexes.
In the case of penalized estimation, the 
fail 
set to 
coefficients 
estimated parameters 
var 
estimated variancecovariance matrix (inverse of information matrix).
If 
effective.df.diagonal 
is returned if 
u 
vector of first derivatives of loglikelihood 
deviance 
2 log likelihoods (counting penalty components) When an offset variable is present, three deviances are computed: for intercept(s) only, for intercepts+offset, and for intercepts+offset+predictors. When there is no offset variable, the vector contains deviances for the intercept(s)only model and the model with intercept(s) and predictors. 
est 
vector of column numbers of 
non.slopes 
number of intercepts in model 
penalty 
see above 
penalty.matrix 
the penalty matrix actually used in the estimation 
Frank Harrell
Department of Biostatistics, Vanderbilt University
fh@fharrell.com
Le Cessie S, Van Houwelingen JC: Ridge estimators in logistic regression. Applied Statistics 41:191–201, 1992.
Verweij PJM, Van Houwelingen JC: Penalized likelihood in Cox regression. Stat in Med 13:2427–2436, 1994.
Gray RJ: Flexible methods for analyzing survival data using splines, with applications to breast cancer prognosis. JASA 87:942–951, 1992.
Shao J: Linear model selection by crossvalidation. JASA 88:486–494, 1993.
Verweij PJM, Van Houwelingen JC: Crossvalidation in survival analysis. Stat in Med 12:2305–2314, 1993.
Harrell FE: Model uncertainty, penalization, and parsimony. ISCB Presentation on UVa Web page, 1998.
lrm.fit
, predict.lrm
,
rms.trans
, rms
, glm
,
latex.lrm
,
residuals.lrm
, na.delete
,
na.detail.response
,
pentrace
, rmsMisc
, vif
,
cr.setup
, predab.resample
,
validate.lrm
, calibrate
,
Mean.lrm
, gIndex
, prModFit
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258  #Fit a logistic model containing predictors age, blood.pressure, sex
#and cholesterol, with age fitted with a smooth 5knot restricted cubic
#spline function and a different shape of the age relationship for males
#and females. As an intermediate step, predict mean cholesterol from
#age using a proportional odds ordinal logistic model
#
n < 1000 # define sample size
set.seed(17) # so can reproduce the results
age < rnorm(n, 50, 10)
blood.pressure < rnorm(n, 120, 15)
cholesterol < rnorm(n, 200, 25)
sex < factor(sample(c('female','male'), n,TRUE))
label(age) < 'Age' # label is in Hmisc
label(cholesterol) < 'Total Cholesterol'
label(blood.pressure) < 'Systolic Blood Pressure'
label(sex) < 'Sex'
units(cholesterol) < 'mg/dl' # uses units.default in Hmisc
units(blood.pressure) < 'mmHg'
#To use prop. odds model, avoid using a huge number of intercepts by
#grouping cholesterol into 40tiles
ch < cut2(cholesterol, g=40, levels.mean=TRUE) # use mean values in intervals
table(ch)
f < lrm(ch ~ age)
options(prType='latex')
print(f, coefs=4) # write latex code to console
m < Mean(f) # see help file for Mean.lrm
d < data.frame(age=seq(0,90,by=10))
m(predict(f, d))
# Repeat using ols
f < ols(cholesterol ~ age)
predict(f, d)
# Specify population model for log odds that Y=1
L < .4*(sex=='male') + .045*(age50) +
(log(cholesterol  10)5.2)*(2*(sex=='female') + 2*(sex=='male'))
# Simulate binary y to have Prob(y=1) = 1/[1+exp(L)]
y < ifelse(runif(n) < plogis(L), 1, 0)
cholesterol[1:3] < NA # 3 missings, at random
ddist < datadist(age, blood.pressure, cholesterol, sex)
options(datadist='ddist')
fit < lrm(y ~ blood.pressure + sex * (age + rcs(cholesterol,4)),
x=TRUE, y=TRUE)
# x=TRUE, y=TRUE allows use of resid(), which.influence below
# could define d < datadist(fit) after lrm(), but data distribution
# summary would not be stored with fit, so later uses of Predict
# or summary.rms would require access to the original dataset or
# d or specifying all variable values to summary, Predict, nomogram
anova(fit)
p < Predict(fit, age, sex)
ggplot(p) # or plot()
ggplot(Predict(fit, age=20:70, sex="male")) # need if datadist not used
print(cbind(resid(fit,"dfbetas"), resid(fit,"dffits"))[1:20,])
which.influence(fit, .3)
# latex(fit) #print nice statement of fitted model
#
#Repeat this fit using penalized MLE, penalizing complex terms
#(for nonlinear or interaction effects)
#
fitp < update(fit, penalty=list(simple=0,nonlinear=10), x=TRUE, y=TRUE)
effective.df(fitp)
# or lrm(y ~ \dots, penalty=\dots)
#Get fits for a variety of penalties and assess predictive accuracy
#in a new data set. Program efficiently so that complex design
#matrices are only created once.
set.seed(201)
x1 < rnorm(500)
x2 < rnorm(500)
x3 < sample(0:1,500,rep=TRUE)
L < x1+abs(x2)+x3
y < ifelse(runif(500)<=plogis(L), 1, 0)
new.data < data.frame(x1,x2,x3,y)[301:500,]
#
for(penlty in seq(0,.15,by=.005)) {
if(penlty==0) {
f < lrm(y ~ rcs(x1,4)+rcs(x2,6)*x3, subset=1:300, x=TRUE, y=TRUE)
# True model is linear in x1 and has no interaction
X < f$x # saves time for future runs  don't have to use rcs etc.
Y < f$y # this also deletes rows with NAs (if there were any)
penalty.matrix < diag(diag(var(X)))
Xnew < predict(f, new.data, type="x")
# expand design matrix for new data
Ynew < new.data$y
} else f < lrm.fit(X,Y, penalty.matrix=penlty*penalty.matrix)
#
cat("\nPenalty :",penlty,"\n")
pred.logit < f$coef[1] + (Xnew %*% f$coef[1])
pred < plogis(pred.logit)
C.index < somers2(pred, Ynew)["C"]
Brier < mean((predYnew)^2)
Deviance< 2*sum( Ynew*log(pred) + (1Ynew)*log(1pred) )
cat("ROC area:",format(C.index)," Brier score:",format(Brier),
" 2 Log L:",format(Deviance),"\n")
}
#penalty=0.045 gave lowest 2 Log L, Brier, ROC in test sample for S+
#
#Use bootstrap validation to estimate predictive accuracy of
#logistic models with various penalties
#To see how noisy crossvalidation estimates can be, change the
#validate(f, \dots) to validate(f, method="cross", B=10) for example.
#You will see tremendous variation in accuracy with minute changes in
#the penalty. This comes from the error inherent in using 10fold
#cross validation but also because we are not fixing the splits.
#20fold cross validation was even worse for some
#indexes because of the small test sample size. Stability would be
#obtained by using the same sample splits for all penalty values
#(see above), but then we wouldn't be sure that the choice of the
#best penalty is not specific to how the sample was split. This
#problem is addressed in the last example.
#
penalties < seq(0,.7,length=3) # really use by=.02
index < matrix(NA, nrow=length(penalties), ncol=11,
dimnames=list(format(penalties),
c("Dxy","R2","Intercept","Slope","Emax","D","U","Q","B","g","gp")))
i < 0
for(penlty in penalties)
{
cat(penlty, "")
i < i+1
if(penlty==0)
{
f < lrm(y ~ rcs(x1,4)+rcs(x2,6)*x3, x=TRUE, y=TRUE) # fit whole sample
X < f$x
Y < f$y
penalty.matrix < diag(diag(var(X))) # save time  only do once
}
else
f < lrm(Y ~ X, penalty=penlty,
penalty.matrix=penalty.matrix, x=TRUE,y=TRUE)
val < validate(f, method="boot", B=20) # use larger B in practice
index[i,] < val[,"index.corrected"]
}
par(mfrow=c(3,3))
for(i in 1:9)
{
plot(penalties, index[,i],
xlab="Penalty", ylab=dimnames(index)[[2]][i])
lines(lowess(penalties, index[,i]))
}
options(datadist=NULL)
# Example of weighted analysis
x < 1:5
y < c(0,1,0,1,0)
reps < c(1,2,3,2,1)
lrm(y ~ x, weights=reps)
x < rep(x, reps)
y < rep(y, reps)
lrm(y ~ x) # same as above
#
#Study performance of a modified AIC which uses the effective d.f.
#See Verweij and Van Houwelingen (1994) Eq. (6). Here AIC=chisq2*df.
#Also try as effective d.f. equation (4) of the previous reference.
#Also study performance of Shao's crossvalidation technique (which was
#designed to pick the "right" set of variables, and uses a much smaller
#training sample than most methods). Compare crossvalidated deviance
#vs. penalty to the gold standard accuracy on a 7500 observation dataset.
#Note that if you only want to get AIC or Schwarz Bayesian information
#criterion, all you need is to invoke the pentrace function.
#NOTE: the effective.df( ) function is used in practice
#
## Not run:
for(seed in c(339,777,22,111,3)){
# study performance for several datasets
set.seed(seed)
n < 175; p < 8
X < matrix(rnorm(n*p), ncol=p) # p normal(0,1) predictors
Coef < c(.1,.2,.3,.4,.5,.6,.65,.7) # true population coefficients
L < X %*% Coef # intercept is zero
Y < ifelse(runif(n)<=plogis(L), 1, 0)
pm < diag(diag(var(X)))
#Generate a large validation sample to use as a gold standard
n.val < 7500
X.val < matrix(rnorm(n.val*p), ncol=p)
L.val < X.val %*% Coef
Y.val < ifelse(runif(n.val)<=plogis(L.val), 1, 0)
#
Penalty < seq(0,30,by=1)
reps < length(Penalty)
effective.df < effective.df2 < aic < aic2 < deviance.val <
Lpenalty < single(reps)
n.t < round(n^.75)
ncv < c(10,20,30,40) # try various no. of reps in crossval.
deviance < matrix(NA,nrow=reps,ncol=length(ncv))
#If model were complex, could have started things off by getting X, Y
#penalty.matrix from an initial lrm fit to save time
#
for(i in 1:reps) {
pen < Penalty[i]
cat(format(pen),"")
f.full < lrm.fit(X, Y, penalty.matrix=pen*pm)
Lpenalty[i] < pen* t(f.full$coef[1]) %*% pm %*% f.full$coef[1]
f.full.nopenalty < lrm.fit(X, Y, initial=f.full$coef, maxit=1)
info.matrix.unpenalized < solve(f.full.nopenalty$var)
effective.df[i] < sum(diag(info.matrix.unpenalized %*% f.full$var))  1
lrchisq < f.full.nopenalty$stats["Model L.R."]
# lrm does all this penalty adjustment automatically (for var, d.f.,
# chisquare)
aic[i] < lrchisq  2*effective.df[i]
#
pred < plogis(f.full$linear.predictors)
score.matrix < cbind(1,X) * (Y  pred)
sum.u.uprime < t(score.matrix) %*% score.matrix
effective.df2[i] < sum(diag(f.full$var %*% sum.u.uprime))
aic2[i] < lrchisq  2*effective.df2[i]
#
#Shao suggested averaging 2*n crossvalidations, but let's do only 40
#and stop along the way to see if fewer is OK
dev < 0
for(j in 1:max(ncv)) {
s < sample(1:n, n.t)
cof < lrm.fit(X[s,],Y[s],
penalty.matrix=pen*pm)$coef
pred < cof[1] + (X[s,] %*% cof[1])
dev < dev 2*sum(Y[s]*pred + log(1plogis(pred)))
for(k in 1:length(ncv)) if(j==ncv[k]) deviance[i,k] < dev/j
}
#
pred.val < f.full$coef[1] + (X.val %*% f.full$coef[1])
prob.val < plogis(pred.val)
deviance.val[i] < 2*sum(Y.val*pred.val + log(1prob.val))
}
postscript(hor=TRUE) # along with graphics.off() below, allow plots
par(mfrow=c(2,4)) # to be printed as they are finished
plot(Penalty, effective.df, type="l")
lines(Penalty, effective.df2, lty=2)
plot(Penalty, Lpenalty, type="l")
title("Penalty on 2 log L")
plot(Penalty, aic, type="l")
lines(Penalty, aic2, lty=2)
for(k in 1:length(ncv)) {
plot(Penalty, deviance[,k], ylab="deviance")
title(paste(ncv[k],"reps"))
lines(supsmu(Penalty, deviance[,k]))
}
plot(Penalty, deviance.val, type="l")
title("Gold Standard (n=7500)")
title(sub=format(seed),adj=1,cex=.5)
graphics.off()
}
## End(Not run)
#The results showed that to obtain a clear picture of the penalty
#accuracy relationship one needs 30 or 40 reps in the crossvalidation.
#For 4 of 5 samples, though, the super smoother was able to detect
#an accurate penalty giving the best (lowest) deviance using 10fold
#crossvalidation. Crossvalidation would have worked better had
#the same splits been used for all penalties.
#The AIC methods worked just as well and are much quicker to compute.
#The first AIC based on the effective d.f. in Gray's Eq. 2.9
#(Verweij and Van Houwelingen (1994) Eq. 5 (note typo)) worked best.

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.