R Program to Drop Columns from a Dataframe


1. Introduction

Data manipulation is a fundamental step in data analysis. A dataframe often contains redundant or unnecessary columns that are worth removing for clarity. In R, there are several ways to drop columns from a dataframe. This guide focuses on the select function from the dplyr package and also shows a base R alternative.

2. Program Overview

1. Create a sample dataframe.

2. Drop columns using negative selection with dplyr's select.

3. Drop a column by name using base R indexing.

3. Code Program

# Load necessary library
library(dplyr)

# Create a sample dataframe
df <- data.frame(
  Name = c('John', 'Jane', 'Doe'),
  Age = c(25, 28, 22),
  Gender = c('Male', 'Female', 'Male'),
  Score = c(85, 90, 78)
)

# Display the original dataframe
print("Original Dataframe:")
print(df)

# Drop the 'Gender' and 'Score' columns using negative selection
df1 <- df %>% select(-c(Gender, Score))

# Display the dataframe after dropping columns
print("Dataframe after Dropping 'Gender' and 'Score' Columns:")
print(df1)

# Another method: Drop the 'Age' column by name
df2 <- df[, -which(names(df) %in% c("Age"))]

# Display the dataframe after dropping the 'Age' column
print("Dataframe after Dropping 'Age' Column:")
print(df2)

Output:

[1] "Original Dataframe:"
  Name Age Gender Score
1 John  25   Male    85
2 Jane  28 Female    90
3  Doe  22   Male    78
[1] "Dataframe after Dropping 'Gender' and 'Score' Columns:"
  Name Age
1 John  25
2 Jane  28
3  Doe  22
[1] "Dataframe after Dropping 'Age' Column:"
  Name Gender Score
1 John   Male    85
2 Jane Female    90
3  Doe   Male    78

4. Step By Step Explanation

- We start by creating a sample dataframe df with four columns: Name, Age, Gender, and Score.

- To drop columns, we pipe df into the select function from the dplyr package. Placing a - in front of the column name(s) we wish to exclude tells select to keep all columns except those specified, so select(-c(Gender, Score)) returns only Name and Age.

- Alternatively, to drop columns without the dplyr package, use base R's negative indexing: names(df) %in% c("Age") flags the matching columns, which converts those flags to column positions, and the leading - excludes those positions.
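One caveat of the base R pattern is worth illustrating: if none of the listed names exists in the dataframe, which returns an empty vector, and negating an empty vector selects no columns at all rather than all of them. The sketch below uses only base R and rebuilds the sample data from the program above. The column name "Height" and the variable names drop_cols, df_mask, df_setdiff, and df_null are assumptions introduced for illustration, not part of the original program.

```r
# Same sample data as in the program above
df <- data.frame(
  Name = c("John", "Jane", "Doe"),
  Age = c(25, 28, 22),
  Gender = c("Male", "Female", "Male"),
  Score = c(85, 90, 78)
)

# Pitfall: "Height" (hypothetical) is not a column, so which() returns
# integer(0); negating it is still integer(0), which selects zero columns
missing_drop <- df[, -which(names(df) %in% c("Height"))]
print(ncol(missing_drop))  # 0

# Safer: a logical mask keeps every column that is not listed,
# and drop = FALSE keeps the result a dataframe even with one column left
drop_cols <- c("Age", "Height")
df_mask <- df[, !(names(df) %in% drop_cols), drop = FALSE]
print(names(df_mask))

# Equivalent: select the set difference of the column names
df_setdiff <- df[, setdiff(names(df), drop_cols), drop = FALSE]
print(names(df_setdiff))

# Remove a single column by assigning NULL (done here on a copy)
df_null <- df
df_null$Age <- NULL
print(names(df_null))
```

The logical mask and setdiff variants simply ignore names that are not present, which makes them a reasonable default when the list of columns to drop comes from user input or configuration. For dplyr users, select(-any_of(drop_cols)) offers comparable tolerance for missing names.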
