R Programming We will be starting shortly … Welcome to the Digital Regenesys course in
Data Types
Data Types • Data types are used to define the type of data that a variable can hold. There are several data types in R, including:
Data Types
• Numeric: Numeric data types are used for storing numeric values. They can be integers or decimal numbers. Suppose we want to calculate the average temperature in degrees Celsius for the month of June. We can create a numeric vector called "temp_june" and assign it some temperature values in degrees Celsius, like this:
    temp_june <- c(20, 22.5, 23, 24, 25.5)
  We can then use the mean() function to calculate the average temperature:
    mean_temp_june <- mean(temp_june)
Data Types
• Character: Character data types are used for storing text values. They are represented by a sequence of characters enclosed in quotes. Suppose we have a dataset of customer names and addresses, and we want to extract just the first name of each customer. We can create a character vector called "customer_names" and assign it some customer names, like this:
    customer_names <- c("John Smith", "Mary Johnson", "Bob Lee")
  We can then use the strsplit() function to split each name at the space character, and use sapply() with the "[" extraction operator to take the first piece of each split name:
    first_names <- sapply(strsplit(customer_names, " "), "[", 1)
Data Types
• Logical: Logical data types are used for storing true/false values. They are represented by the values TRUE and FALSE. Suppose we want to filter a dataset to only include rows where a certain condition is true. We can create a logical vector called "filter_condition" that checks if a certain variable in our dataset is greater than 5, like this:
    filter_condition <- my_dataset$my_variable > 5
  We can then use this logical vector to filter our dataset:
    filtered_dataset <- my_dataset[filter_condition, ]
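The filtering step can be tried end to end with a small, made-up data frame. This is only a sketch: the data frame contents and the id column are invented for illustration, and my_dataset and my_variable simply reuse the names from the slide.

```r
# Hypothetical data frame standing in for my_dataset (values are made up)
my_dataset <- data.frame(
  id = 1:5,
  my_variable = c(2, 7, 5, 9, 4)
)

# Logical vector: TRUE where my_variable is greater than 5
filter_condition <- my_dataset$my_variable > 5
print(filter_condition)

# Keep only the rows where the condition is TRUE; the empty slot after
# the comma means "keep all columns"
filtered_dataset <- my_dataset[filter_condition, ]
print(filtered_dataset)
```

Only the rows with id 2 and 4 remain, because those are the rows whose my_variable exceeds 5.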
Data Types
• Factor: Factor data types are used for storing categorical data. A factor records each value as one of a fixed set of categories, called levels. Suppose we have a dataset of customer information, and we want to summarize the number of customers in each age group. We can create a factor variable called "age_group" that categorizes customers into different age groups, like this:
    age_group <- cut(my_dataset$age, breaks = c(0, 18, 30, 50, Inf), labels = c("18 and under", "19-30", "31-50", "Over 50"))
  By default, cut() includes the upper end of each interval, so an age of exactly 18 falls in the first group and an age of exactly 30 in the second. We can then use the table() function to summarize the number of customers in each age group:
    customer_age_summary <- table(age_group)
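To see how cut() assigns values at the interval boundaries, here is a small sketch with invented ages. The ages vector is hypothetical, and the labels are chosen to match cut()'s default behaviour of including the upper end of each interval.

```r
# Hypothetical ages (made up for illustration)
ages <- c(12, 18, 25, 30, 31, 47, 50, 64)

# cut() uses right-closed intervals by default: (0,18], (18,30], (30,50], (50,Inf]
age_group <- cut(
  ages,
  breaks = c(0, 18, 30, 50, Inf),
  labels = c("18 and under", "19-30", "31-50", "Over 50")
)

# The result is a factor whose levels are the labels, in order
print(levels(age_group))

# Count how many ages fall in each group
customer_age_summary <- table(age_group)
print(customer_age_summary)
```

With these ages the counts are 2, 2, 3 and 1: the boundary values 18, 30 and 50 each land in the lower of the two neighbouring groups.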
Data Types
• Date and Time: Date and time data types are used for storing dates and times. R provides the Date class for calendar dates and the POSIXct and POSIXlt classes for date-times.
• Suppose we want to calculate the number of days between two dates. We can create two date variables called "start_date" and "end_date", and calculate the number of days between them using the as.numeric() and difftime() functions, like this:
    start_date <- as.Date("2022-01-01")
    end_date <- as.Date("2022-01-31")
    days_between <- as.numeric(difftime(end_date, start_date, units = "days"))
  Here days_between is 30.
Data Types in R
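One way to check which data type a value has is the class() function. The sketch below creates one value of each type described above; all variable names and values are made up for illustration.

```r
# One example value per data type (names and values are illustrative)
temperature <- 22.5                              # numeric (decimal)
count       <- 5L                                # integer (note the L suffix)
name        <- "Mary Johnson"                    # character
is_member   <- TRUE                              # logical
size        <- factor(c("small", "large", "small"))  # factor
start_date  <- as.Date("2022-01-31")             # Date

# class() reports the type of each object
print(class(temperature))  # "numeric"
print(class(count))        # "integer"
print(class(name))         # "character"
print(class(is_member))    # "logical"
print(class(size))         # "factor"
print(class(start_date))   # "Date"

# A factor stores its distinct categories as levels, sorted alphabetically by default
print(levels(size))        # "large" "small"
```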
Using variables • Performing arithmetic operations on numeric variables • Performing character operations
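The slide lists the two kinds of operation without examples, so here is a brief sketch of each. The variable names and values are invented; the operators and functions (%/%, %%, paste(), nchar(), toupper(), substr()) are standard base R.

```r
# Arithmetic on numeric variables
a <- 17
b <- 5
print(a + b)    # 22
print(a - b)    # 12
print(a * b)    # 85
print(a / b)    # 3.4
print(a %/% b)  # 3  (integer division)
print(a %% b)   # 2  (remainder)
print(a ^ 2)    # 289

# Operations on character variables
first_name <- "Mary"
last_name  <- "Johnson"
full_name  <- paste(first_name, last_name)  # joins with a space by default
print(full_name)                # "Mary Johnson"
print(nchar(full_name))         # 12 characters, counting the space
print(toupper(full_name))       # "MARY JOHNSON"
print(substr(full_name, 1, 4))  # "Mary"
```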
Using variables • Performing relational and logical operations • == • > • < • >= • <= • != • | • & • ! • isTRUE(x)
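A short sketch of these operators in use follows. The values are made up. Note that the function is spelled isTRUE() in R, and it returns TRUE only for a single TRUE value, not for a longer logical vector.

```r
x <- 7
y <- 10

# Relational operators compare values and return TRUE or FALSE
print(x == y)  # FALSE
print(x != y)  # TRUE
print(x < y)   # TRUE
print(x >= y)  # FALSE

# Logical operators combine or negate logical values
print((x > 5) & (y > 5))  # TRUE  (both conditions hold)
print((x > 8) | (y > 8))  # TRUE  (at least one holds)
print(!(x > 5))           # FALSE (negation)

# On vectors, the comparison is applied element by element
v <- c(3, 8, 12)
print(v > 5)              # FALSE TRUE TRUE

# isTRUE() is TRUE only for a single TRUE value
print(isTRUE(x < y))      # TRUE
print(isTRUE(v > 5))      # FALSE, because v > 5 has three elements
```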
Input-Output
Input/Output in R
• In R, the print() function is used to display the value of an object on the console. Here's the basic syntax of the print() function:
    print(x)
• Here, x is the object you want to print. It can be a scalar, a vector, a matrix, a data frame, or any other R object.
• By default, the print() function prints the value of the object preceded by an index such as [1], which gives the position of the first element shown on that line. For example:
    x <- 3
    print(x)
• Output:
    [1] 3
Input/Output in R
• You can also customize the output of the print() function by specifying additional arguments. For example, you can use the quote argument to specify whether character strings should be printed with quotes around them:
    x <- "Hello, world!"
    print(x, quote = FALSE)
• Output:
    [1] Hello, world!
Input/Output in R
• To print a string and a value together in the same print() statement in R, you can use the paste() or paste0() function to combine the string and value into a single character string, and then pass that string to the print() function. Here's an example:
    x <- 42
    print(paste("The answer is", x))
• Output:
    [1] "The answer is 42"
• In this example, we first create a variable x with the value 42. Then, we use the paste() function to combine the string "The answer is" with the value of x into a single character string "The answer is 42". Finally, we pass that string to the print() function to print it to the console.
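The difference between paste() and paste0() is the separator: paste() puts a space between its arguments by default, while paste0() puts nothing. The sketch below shows both, plus cat(), a base R function not covered on the slide that writes text without the [1] prefix or quotes. Variable names are illustrative.

```r
x <- 42

# paste() inserts a space between its arguments by default
with_space <- paste("The answer is", x)
print(with_space)   # [1] "The answer is 42"

# paste0() inserts nothing, so any space must be written explicitly
no_space <- paste0("The answer is", x)
print(no_space)     # [1] "The answer is42"

# The sep argument of paste() chooses a different separator
date_text <- paste("2022", "01", "31", sep = "-")
print(date_text)    # [1] "2022-01-31"

# cat() writes the text as-is, without the index or the quotes
cat("The answer is", x, "\n")
```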
Input/Output in R
• In R, you can use the scan() function to read numbers from the console. Here's the basic syntax of the scan() function:
    x <- scan()
• When you execute this code, R will prompt you to enter some input at the console. You can enter one or more numbers separated by spaces or newlines; pressing Enter on an empty line ends the input. R will read your input and store it in the variable x as a numeric vector.
    x <- scan()
    1 2 3 4 5
• In this example, we use the scan() function to input some integers from the console. We type "1 2 3 4 5", press Enter, and then press Enter again on the empty line. R reads our input and stores it in the variable x as a numeric vector:
    > x
    [1] 1 2 3 4 5
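Because scan() normally waits for keyboard input, it is awkward to demonstrate in a script. As a sketch, the same parsing can be shown with scan()'s text argument, which reads from a string instead of the console; quiet = TRUE suppresses the "Read N items" message. The input strings here are made up.

```r
# Read numbers from a string instead of the console
x <- scan(text = "1 2 3 4 5", quiet = TRUE)
print(x)         # [1] 1 2 3 4 5
print(class(x))  # "numeric": scan() reads decimal numbers by default

# Ask for integers explicitly with the what argument
y <- scan(text = "1 2 3", what = integer(), quiet = TRUE)
print(class(y))  # "integer"
```

In an interactive session, dropping the text argument gives the console behaviour described on the slide.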
Conditional Statements
Conditional Statements in R
In R, conditional statements can be written using the if...else or if...else if...else syntax. Here is the general syntax for if...else:
    if (condition) {
      # code to execute if condition is TRUE
    } else {
      # code to execute if condition is FALSE
    }
The condition is an expression that evaluates to either TRUE or FALSE. If the condition is TRUE, the code inside the first set of curly braces {} is executed. Otherwise, the code inside the second set of curly braces {} is executed.
Conditional Statements in R
Here is the general syntax for if...else if...else:
    if (condition1) {
      # code to execute if condition1 is TRUE
    } else if (condition2) {
      # code to execute if condition1 is FALSE and condition2 is TRUE
    } else {
      # code to execute if both condition1 and condition2 are FALSE
    }
Each condition is an expression that evaluates to either TRUE or FALSE. If condition1 is TRUE, the code inside the first set of curly braces {} is executed. If condition1 is FALSE, the program checks condition2. If condition2 is TRUE, the code inside the second set of curly braces {} is executed. If both condition1 and condition2 are FALSE, the code inside the third set of curly braces {} is executed.
Conditional Statements in R
Example
Here's an example of how to use the if...else syntax to print a message saying whether a number is positive or negative:
    # Assign a number to the variable 'num'
    num <- -5
    # Check if 'num' is positive or negative
    if (num > 0) {
      print(paste(num, "is positive"))
    } else {
      print(paste(num, "is negative"))
    }
Note that this two-way test also reports zero as negative; handling zero separately needs an else if branch.
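The if...else if...else form can handle the zero case as well. The sketch below wraps the test in a small helper, describe_sign(), which is a hypothetical name introduced here only so the same check can be run on several values.

```r
# Hypothetical helper: returns a message describing the sign of num
describe_sign <- function(num) {
  if (num > 0) {
    paste(num, "is positive")
  } else if (num < 0) {
    paste(num, "is negative")
  } else {
    paste(num, "is zero")
  }
}

print(describe_sign(-5))  # [1] "-5 is negative"
print(describe_sign(0))   # [1] "0 is zero"
print(describe_sign(3))   # [1] "3 is positive"
```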
Loops
Loops in R
• In R, there are several types of loops, including for loops, while loops, and repeat loops. Here is the general syntax for each type of loop:
• for loop
    for (variable in sequence) {
      # code to execute for each value of the variable
    }
• In a for loop, the variable is assigned each value of the sequence, and the code inside the curly braces {} is executed for each value of the variable.
Loops in R
• while loop
    while (condition) {
      # code to execute while the condition is TRUE
    }
• In a while loop, the condition is checked before each iteration of the loop. If the condition is TRUE, the code inside the curly braces {} is executed. This process repeats until the condition is no longer TRUE.
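As a minimal sketch of a while loop, the code below adds up the numbers 1 to 5. The variable names are illustrative.

```r
count <- 1
total <- 0

# The condition is checked before every iteration
while (count <= 5) {
  total <- total + count  # add the current number to the running total
  count <- count + 1      # move on, so the condition eventually becomes FALSE
}

print(total)  # [1] 15
```

Forgetting to update count inside the loop would leave the condition TRUE forever, so the loop would never stop.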
Loops in R
• repeat loop
    repeat {
      # code to execute indefinitely
      if (condition) {
        break
      }
    }
• In a repeat loop, the code inside the curly braces {} is executed indefinitely. However, the loop can be exited using the break statement when a certain condition is met.
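A small sketch of a repeat loop: the code below keeps doubling a number and leaves the loop with break once the value exceeds 100. The starting value and the threshold are arbitrary choices for illustration.

```r
n <- 1

repeat {
  n <- n * 2       # double the value on every pass
  if (n > 100) {
    break          # exit the loop once the threshold is passed
  }
}

print(n)  # [1] 128
```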
Loops in R
• Here is an example of how to use a for loop to iterate through a sequence of numbers:
    # Create a vector of numbers from 1 to 5
    nums <- c(1, 2, 3, 4, 5)
    # Use a for loop to print each number in the vector
    for (num in nums) {
      print(num)
    }
• In this example, the nums vector contains the numbers 1 through 5. The for loop assigns each value in the nums vector to the num variable, and the print() function is used to print each value of num to the console.
• Output:
    [1] 1
    [1] 2
    [1] 3
    [1] 4
    [1] 5
Functions
What are functions • In R, functions are a set of instructions that can be called and executed to perform a specific task. • A function in R takes input values, called arguments, and returns an output value. • Functions can be built-in functions that are already available in R or user-defined functions that are created by the user.
A Few Built-in Functions
• Here are some commonly used built-in functions in R, along with examples:
• sum() - Calculates the sum of a vector of numbers
    # Calculate the sum of numbers from 1 to 5
    sum(1:5)  # Output: 15
• mean() - Calculates the mean of a vector of numbers
    # Calculate the mean of numbers from 1 to 5
    mean(1:5)  # Output: 3
User Defined Functions
• In R, you can create your own functions using the function keyword. Here's the basic syntax for creating a function:
    my_function <- function(arg1, arg2, ...) {
      # code block
      return(output)
    }
• Here's what each part of the syntax means:
• my_function is the name you give to your function.
• arg1 and arg2 are the input arguments to your function. You can have as many input arguments as you need; the ... stands for any additional arguments.
• The code block within the curly braces {} contains the code that your function will execute.
• output is the value that your function will return when it is called.
User Defined Functions
• Here's an example of a user-defined function in R that calculates the area of a rectangle:
    # Define a function called "rectangle_area"
    rectangle_area <- function(length, width) {
      area <- length * width
      return(area)
    }
    # Call the function to calculate the area of a rectangle with length 5 and width 3
    rectangle_area(5, 3)
• In this example, we define a function called rectangle_area that takes two arguments: length and width. The function then calculates the area of the rectangle using the formula length * width and returns the result. We then call the function and pass in values for the length and width arguments to calculate the area of a rectangle with length 5 and width 3.
Define and call a function
• Here's an example of a simple function that calculates the sum of two numbers. The result is stored in a variable called total, so that the name of the built-in sum() function is not reused:
    sum_numbers <- function(x, y) {
      total <- x + y
      return(total)
    }
• Once you've defined your function, you can call it just like any other function in R:
    result <- sum_numbers(3, 4)  # result is 7
• In this example, the function sum_numbers() takes two input arguments (x and y), calculates their sum, and returns the result. You can modify the code block within the function to perform any operation you want, such as statistical analysis, data manipulation, or plotting.
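Two further details of calling functions can be sketched with a variant of the rectangle example: arguments may be passed by name in any order, and an argument can be given a default value. The default width = 1 is an assumption added here for illustration and is not part of the original example. The function also relies on R returning the value of the last evaluated expression when return() is not called.

```r
# Variant of rectangle_area with an illustrative default for width
rectangle_area <- function(length, width = 1) {
  length * width  # value of the last expression is returned
}

# Positional arguments: length = 5, width = 3
print(rectangle_area(5, 3))                   # [1] 15

# Named arguments can be given in any order
print(rectangle_area(width = 3, length = 5))  # [1] 15

# Omitting width uses the default value of 1
print(rectangle_area(5))                      # [1] 5
```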
Sample Program with R
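Working with Diamond Mispricing Data
    # Choose the data file interactively and read it into a data frame
    mydata <- read.csv(file.choose())
    # Install ggplot2 (only needed once), then load it
    install.packages("ggplot2")
    library(ggplot2)
    # Plot price against carat for diamonds under 2.5 carats, coloured by clarity
    ggplot(data = mydata[mydata$carat < 2.5, ], aes(x = carat, y = price, colour = clarity)) +
      geom_point(alpha = 0.1) +
      geom_smooth()
File: Diamonds_Pricelist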
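The expression mydata[mydata$carat < 2.5, ] in the plot call is the same logical row filter described under the Logical data type. The sketch below shows what it does using a tiny, made-up data frame in place of the Diamonds_Pricelist file; the column names follow the program above, but every value is invented for illustration.

```r
# Made-up stand-in for the diamonds data (values are not real prices)
mydata <- data.frame(
  carat   = c(0.3, 0.9, 1.5, 2.6, 3.0),
  price   = c(500, 4000, 11000, 16000, 18000),
  clarity = c("SI1", "VS2", "VS1", "I1", "SI2")
)

# Keep only rows with carat below 2.5; the empty slot after the comma keeps all columns
small_diamonds <- mydata[mydata$carat < 2.5, ]
print(small_diamonds)

# Three of the five made-up rows pass the filter
print(nrow(small_diamonds))  # [1] 3
```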
basic commands of R programming for code
