The purpose of this short TP is to get some practice with survival analysis.
As usual, make sure that you read the help
for any new functions that you use.
Load the ISwR and survival
packages into R,
then load the melanom data and start to explore it
using the numerical and graphical summaries you have learned about this week
(e.g. functions like summary and hist).
Also be sure to look at the help for melanom.
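As a starting point, the loading and exploration step might look like the sketch below. It assumes both packages are already installed; the particular summaries chosen here are only suggestions, not a required list.

```r
# Load the packages and the data set (assumes ISwR and survival are installed)
library(survival)
library(ISwR)
data(melanom)

# Numerical summaries
str(melanom)             # variable names and types
summary(melanom)         # five-number summaries for each column
table(melanom$status)    # how many patients fall in each status category
table(melanom$sex)       # group sizes for the later comparison

# Graphical summary of the follow-up times
hist(melanom$days, xlab = "Days", main = "Observation time")

# The help page explains the coding of status, sex, ulc, etc.
# ?melanom
```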
Create a Surv object
To carry out survival analysis, we need to create a Surv object.
Only status 1 (death from melanoma) counts as an event; the values 2 and 3 for the status are to be considered as censored:
mel.surv <- with(melanom, Surv(days, status == 1))
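To see what a Surv object contains, it can help to print a few entries. The sketch below builds the object with with(), so that days and status are looked up inside melanom without attaching the data frame; that choice is an assumption of this example rather than a requirement.

```r
library(survival)
library(ISwR)
data(melanom)

# Event indicator is TRUE only for status 1; statuses 2 and 3 become censored
mel.surv <- with(melanom, Surv(days, status == 1))

# Censored observation times are printed with a trailing "+"
head(mel.surv)

# A Surv object is stored as a two-column matrix: time and status (1 = event)
table(event = mel.surv[, "status"])
```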
We can get the Kaplan-Meier estimate of the survival function with survfit, which expects a formula (use ~ 1 for a single curve):
surv.all <- survfit(mel.surv ~ 1)
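To make the Kaplan-Meier estimate less of a black box, here is a small sketch that computes the product-limit estimate by hand and compares it with what survfit returns. The helper names (times, event, t.ev, n.risk, n.event, km) are introduced only for this illustration.

```r
library(survival)
library(ISwR)
data(melanom)

times <- melanom$days
event <- melanom$status == 1          # TRUE = death from melanoma

# Distinct times at which at least one event occurred
t.ev <- sort(unique(times[event]))

# Number still at risk just before each event time, and number of events at it
n.risk  <- sapply(t.ev, function(t) sum(times >= t))
n.event <- sapply(t.ev, function(t) sum(times == t & event))

# Product-limit estimate: multiply the conditional survival fractions
km <- cumprod(1 - n.event / n.risk)

# Compare with survfit (summary reports event times only)
fit <- summary(survfit(Surv(times, event) ~ 1))
all.equal(km, fit$surv)
```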
The result of summary only gives estimates for event times.
The censoring times can also be shown if you set the censored argument of summary to TRUE.
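The two forms of the summary could be compared as in the following sketch, which rebuilds the fit so that it runs on its own.

```r
library(survival)
library(ISwR)
data(melanom)

surv.all <- survfit(Surv(days, status == 1) ~ 1, data = melanom)

# Default: one row per event time
s.ev <- summary(surv.all)

# With censored = TRUE: rows for censoring times are included as well
s.all <- summary(surv.all, censored = TRUE)

# The second table is longer because of the extra censoring-only rows
length(s.ev$time)
length(s.all$time)
```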
Usually it's more interesting to look at a plot than at the numerical values:
plot(surv.all)
The short vertical lines on the curve show where censoring has occurred,
and the dashed bands around the curve give approximate confidence intervals.
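Depending on the installed version of the survival package, the censoring marks may not be drawn by default, so the sketch below requests them explicitly with mark.time. The axis labels are this example's own choice.

```r
library(survival)
library(ISwR)
data(melanom)

surv.all <- survfit(Surv(days, status == 1) ~ 1, data = melanom)

# mark.time = TRUE draws a tick at each censoring time;
# conf.int = TRUE adds the pointwise confidence limits around the curve
plot(surv.all, mark.time = TRUE, conf.int = TRUE,
     xlab = "Days", ylab = "Estimated survival probability")
```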
We can look at the curves separately for each sex (colored differently for males and females):
surv.sex <- survfit(mel.surv ~ sex, data = melanom)
Make sure you can tell which color corresponds to which gender.
Does one group appear to have longer survival than the other?
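One way to keep track of which curve is which is to add a legend, as sketched below. The colors are arbitrary choices for this example; the coding 1 = female, 2 = male is taken from the melanom help page.

```r
library(survival)
library(ISwR)
data(melanom)

surv.sex <- survfit(Surv(days, status == 1) ~ sex, data = melanom)

# Curves are drawn in the order of the strata: sex=1 first, then sex=2
names(surv.sex$strata)

cols <- c("red", "blue")   # arbitrary colors chosen for this sketch
plot(surv.sex, col = cols, mark.time = TRUE,
     xlab = "Days", ylab = "Estimated survival probability")
legend("bottomleft", legend = c("female (sex = 1)", "male (sex = 2)"),
       col = cols, lty = 1)
```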
We can carry out the log-rank test to test whether the population survival curves are the same
using the survdiff function:
survdiff(Surv(days, status == 1) ~ sex, data = melanom)
Is the observed difference statistically significant?
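The printed output already shows the chi-squared statistic and its p-value. If you want the p-value as a number, it can be recomputed from the stored statistic, as in this sketch (the names lr, df and p.value are introduced here for illustration).

```r
library(survival)
library(ISwR)
data(melanom)

lr <- survdiff(Surv(days, status == 1) ~ sex, data = melanom)
lr                                   # observed vs expected events per group

# Log-rank statistic is chi-squared with (number of groups - 1) df
df <- length(lr$n) - 1
p.value <- pchisq(lr$chisq, df = df, lower.tail = FALSE)
p.value
```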
Carrying out Cox modeling and understanding the output you get is
beyond the scope of the course, but here is a brief summary.
Cox modeling is carried out in a similar manner to regression modeling you have
already done with lm,
but with linearity assumed on the log hazard scale.
In R, you use the function coxph
with a formula including the variables that you want to include in the model.
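As an illustration only, a Cox model with sex as the single explanatory variable might be fitted as below. The choice of covariate is an assumption of this sketch; other variables from melanom can be added to the right-hand side of the formula in the same way as with lm.

```r
library(survival)
library(ISwR)
data(melanom)

# Same Surv response as before; the formula syntax mirrors lm
cox.sex <- coxph(Surv(days, status == 1) ~ sex, data = melanom)
summary(cox.sex)

# Coefficients are on the log hazard scale; exponentiate for a hazard ratio
exp(coef(cox.sex))
```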