Survival Analysis in R

Survival analysis, also known as time-to-event analysis, is used to analyze the expected duration of time until a specific event of interest occurs. Common applications include analyzing time-to-failure for machinery, time until purchase in marketing, or time-to-death in medical studies.

In this tutorial, we'll use R to perform basic survival analysis using the survival package.

1. Installation and Loading the Package:

install.packages("survival")
library(survival)

2. The Survival Object:

The Surv() function creates a survival object, which represents the data in a format suitable for survival analysis.

# Sample data
time <- c(2, 3, 5, 8, 12)    # time until event or censoring
event <- c(1, 1, 0, 1, 0)    # 1 if event occurred, 0 if censored

# Create a survival object
s_obj <- Surv(time, event)
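To see how the object encodes censoring, it can help to print it and pull out its two columns. The following is a minimal sketch using the same sample vectors as above; nothing here goes beyond the survival package itself.

```r
library(survival)

time  <- c(2, 3, 5, 8, 12)
event <- c(1, 1, 0, 1, 0)
s_obj <- Surv(time, event)

print(s_obj)         # censored observations are printed with a trailing "+"
is.Surv(s_obj)       # TRUE for objects created by Surv()
s_obj[, "time"]      # follow-up times
s_obj[, "status"]    # 1 = event, 0 = censored
```

A Surv object built this way is a two-column matrix with some extra attributes, which is why it can be indexed by column name.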

3. Kaplan-Meier Estimation:

The survfit() function fits survival curves using the Kaplan-Meier method.

fit <- survfit(s_obj ~ 1)
print(fit)

# Plotting the survival curve
plot(fit)
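To make the Kaplan-Meier method less of a black box, the estimate can be reproduced by hand: at each observed time, the running survival probability is multiplied by (1 - d / n), where d is the number of events at that time and n is the number of subjects still at risk. The sketch below assumes the five-observation sample data from section 2 and relies on there being no tied times.

```r
library(survival)

time  <- c(2, 3, 5, 8, 12)
event <- c(1, 1, 0, 1, 0)

# Hand computation of the Kaplan-Meier estimate
ord     <- order(time)
at_risk <- length(time) - seq_along(time) + 1   # 5, 4, 3, 2, 1 (valid because no times are tied)
km_hand <- cumprod(1 - event[ord] / at_risk)

# Same estimate from survfit()
fit <- survfit(Surv(time, event) ~ 1)

# Side-by-side comparison
cbind(time = fit$time, survfit = fit$surv, by_hand = km_hand)
```

Censored times (5 and 12 here) leave the estimate unchanged but still shrink the risk set for later events.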

4. Cox Proportional Hazards Model:

For analyzing the relationship between survival time and one or more predictor variables, we use the Cox proportional hazards model.

Let's use the lung dataset from the survival package.

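# The lung dataset ships with survival and is available once the package is loaded
# (status is coded 1 = censored, 2 = dead)

# Fit the model
cox_model <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

# Summary of the model
summary(cox_model)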
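The coefficients of a Cox model are on the log-hazard scale, so they are usually exponentiated and reported as hazard ratios. A short sketch, assuming the same model on the lung data:

```r
library(survival)

cox_model <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

# Hazard ratios with 95% confidence intervals
hr <- cbind(HR = exp(coef(cox_model)), exp(confint(cox_model)))
round(hr, 3)
```

A hazard ratio below 1 indicates a lower hazard per unit increase in the covariate, and above 1 a higher hazard, holding the other covariates fixed.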

5. Checking Proportional Hazards Assumption:

We can visually inspect whether the proportional hazards assumption holds by plotting the scaled Schoenfeld residuals.

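install.packages("ggplot2")
library(ggplot2)

# Compute residuals: one row per event, one column per covariate
resid <- residuals(cox_model, type = "scaledsch")

# A coxph object has no $time component; the event times are the row names of the residual matrix
resid_df <- data.frame(time = as.numeric(rownames(resid)),
                       resid = resid[, 1])   # column 1 is age, the first covariate

# Plot
ggplot(resid_df, aes(x = time, y = resid)) +
  geom_point() +
  geom_smooth(se = FALSE)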

A horizontal line in the plot suggests the assumption holds, while a non-horizontal trend suggests it might be violated.
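The survival package also provides cox.zph(), which tests the same assumption formally and has its own plot method. A brief sketch, assuming the model from section 4:

```r
library(survival)

cox_model <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

zph <- cox.zph(cox_model)
print(zph)   # one test per covariate plus a global test
plot(zph)    # smoothed scaled Schoenfeld residuals against time, one panel per covariate
```

Small p-values in the printed table are evidence against proportional hazards for the corresponding covariate.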

6. Predicting with the Cox Model:

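You can use the fitted Cox model to predict the relative risk and the survival curve for a new subject:

# New subject: age 65, sex = 1 (male), ECOG score 1
newdata <- data.frame(age = 65, sex = 1, ph.ecog = 1)

# Relative risk (hazard ratio scale) compared with the model's reference covariate values
predict(cox_model, newdata, type = "risk")

# Predicted survival curve; type = "expected" would also require time and status in newdata
surv_pred <- survfit(cox_model, newdata = newdata)
print(surv_pred)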
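Predicted survival probabilities at specific follow-up times are often more useful than the whole curve. The sketch below assumes the section 4 model and reads the predicted curve at 180 and 365 days (time in lung is measured in days); the two time points are arbitrary choices for illustration.

```r
library(survival)

cox_model <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)
newdata   <- data.frame(age = 65, sex = 1, ph.ecog = 1)

# Model-based survival curve for the new subject
sf <- survfit(cox_model, newdata = newdata)

# Survival probability and confidence limits at chosen times
summary(sf, times = c(180, 365))
```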

7. Other Considerations:

  • Right Censoring: In survival analysis, it's common for the event of interest not to occur for all subjects during the study period. Such subjects are "censored".

  • Multiple Events: Some subjects may experience the event more than once. There are extensions of basic survival analysis to handle such "repeated events".

  • Time-varying Covariates: If covariates change over time, they're "time-varying". They are typically handled by splitting each subject's follow-up into (start, stop] intervals, which coxph() accepts through Surv(start, stop, event).
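As an illustration of the time-varying case, the sketch below builds a counting-process survival object from a small made-up data frame. The data and the column names (tstart, tstop, treated) are hypothetical: subject 1 switches treatment at day 30 and has the event at day 50, while subject 2 never switches and is censored at day 60.

```r
library(survival)

# Hypothetical long-format data: one row per interval during which covariates are constant
tv_data <- data.frame(
  id      = c(1, 1, 2),
  tstart  = c(0, 30, 0),
  tstop   = c(30, 50, 60),
  event   = c(0, 1, 0),    # event status at the end of each interval
  treated = c(0, 1, 0)     # covariate value during the interval
)

# Three-argument form of Surv(): (start, stop] intervals
tv_surv <- with(tv_data, Surv(tstart, tstop, event))
tv_surv
```

With a realistically sized dataset in this layout, the model would be fitted as coxph(Surv(tstart, tstop, event) ~ treated, data = tv_data); three rows are far too few to fit anything meaningful.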

Conclusion:

Survival analysis is a powerful technique for analyzing time-to-event data. With R and the survival package, you have the tools to explore, model, and predict such data effectively.

Examples

  1. Introduction to Survival Analysis with R:

    • Survival analysis is a statistical approach for analyzing time-to-event data, often used in medical or reliability studies.
    # Example: Creating a survival object
    library(survival)
    survival_object <- Surv(time = c(5, 10, 15), event = c(1, 1, 0))
  2. Kaplan-Meier Survival Curves in R:

    • Kaplan-Meier curves estimate survival probabilities over time.
    # Example: Kaplan-Meier curve
    km_curve <- survfit(survival_object ~ 1)
    plot(km_curve, main = "Kaplan-Meier Survival Curve")
  3. Cox Proportional Hazards Model in R:

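    • The Cox PH model assesses the impact of covariates on the hazard of the event.
    # Example: Cox Proportional Hazards Model
    # (my_data is a placeholder data frame with time, event, age, and treatment columns)
    cox_model <- coxph(Surv(time, event) ~ age + treatment, data = my_data)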
  4. R Survival Analysis for Clinical Data:

    • Apply survival analysis to clinical data, considering time and event variables.
    # Example: Analyzing clinical survival data
    survival_object <- Surv(time = clinical_data$follow_up_time,
                            event = clinical_data$outcome)
  5. Time-to-Event Analysis in R:

    • Analyze time-to-event data using survival analysis techniques.
    # Example: Time-to-event analysis
    km_curve <- survfit(Surv(time, event) ~ 1, data = time_data)
  6. Survival Analysis Plotting in R:

    • Plot survival curves and other relevant visualizations.
    # Example: Plotting survival curves
    plot(km_curve, main = "Survival Analysis", xlab = "Time", ylab = "Survival Probability")
  7. Log-Rank Test in R:

    • Assess differences in survival curves using the log-rank test.
    # Example: Log-rank test
    logrank_test <- survdiff(Surv(time, event) ~ group, data = survival_data)
  8. Comparing Survival Curves in R:

    • Compare multiple survival curves visually and statistically.
    # Example: Comparing survival curves
    multi_km_curve <- survfit(Surv(time, event) ~ group, data = survival_data)
  9. Handling Censored Data in Survival Analysis with R:

    • Address censored data points using the Surv object.
    # Example: Handling censored data
    survival_object <- Surv(time = my_data$time, event = my_data$status)
  10. Stratified Survival Analysis in R:

    • Perform survival analysis within strata or subgroups.
    # Example: Stratified survival analysis
    stratified_km_curve <- survfit(Surv(time, event) ~ strata(group), data = survival_data)
  11. Parametric Survival Models in R:

    • Fit parametric survival models like exponential or Weibull distribution.
    # Example: Fitting a Weibull survival model
    weibull_model <- survreg(Surv(time, event) ~ covariate, data = survival_data, dist = "weibull")
  12. Multivariate Survival Analysis in R:

    • Assess the impact of multiple covariates on survival.
    # Example: Multivariate survival analysis
    coxph_model <- coxph(Surv(time, event) ~ covariate1 + covariate2, data = survival_data)
  13. Survival Analysis with Competing Risks in R:

    • Analyze survival data in the presence of competing risks.
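    # Example: Competing risks survival analysis
    # time, status, and group are vectors: follow-up time, event-type code
    # (0 = censored by default), and grouping variable
    library(cmprsk)
    competing_risks_model <- cuminc(time, status, group)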
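Several of the snippets above refer to placeholder data frames (my_data, survival_data, and so on), so they will not run as written. The following sketch strings three of them together on the lung dataset that ships with survival, using sex as the grouping variable; that choice is only for illustration.

```r
library(survival)

# Kaplan-Meier curves by group (in lung, sex is coded 1 = male, 2 = female)
km_by_sex <- survfit(Surv(time, status) ~ sex, data = lung)
plot(km_by_sex, lty = 1:2, xlab = "Days", ylab = "Survival probability")
legend("topright", legend = c("Male", "Female"), lty = 1:2)

# Log-rank test for a difference between the two curves
logrank <- survdiff(Surv(time, status) ~ sex, data = lung)
print(logrank)

# Weibull model with the same covariate
weibull_model <- survreg(Surv(time, status) ~ sex, data = lung, dist = "weibull")
summary(weibull_model)
```

Note that survreg() fits an accelerated failure time model, so a positive coefficient means longer survival times, the opposite sign convention from coxph().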
