Basics of R Programming Mr. S.K.Patil
Basics of R Programming What is R? –  R is a programming language for statistical computing and data visualization.  R is an integrated suite of software facilities for data manipulation, calculation and graphical display.  R is a programming language & software environment which becomes the first choice for statistical computing and data analysis.  R is a programming language & an analytics tool.  R was developed in 1993 by Robert Gentleman and Ross Ihaka at the University of Auckland, New Zealand.  R was started by professors as a programming language to teach introductory statistics.  R was built to simplify complex data manipulation & create clear, customizable visualizations.
Basics of R Programming What is R? –  It has gained popularity among statisticians, data scientists and researchers because of its capabilities and the availability of large amount of packages.  R has established itself as an important tool in various industries, including finance and healthcare, due to its ability to handle large datasets and perform in-depth statistical analysis.  R is a wonderful tool for statistical analysis, visualization and reporting.
Basics of R Programming Why R Programming? –  R is a unique language that offers a wide range of features for data analysis, making it an essential tool for professionals in various fields.  Reasons to prefer R –  Free and Open-Source: R is open to everyone, meaning users can modify, share and distribute their work freely.  Designed for Data: R is built for data analysis, offering a comprehensive set of tools for statistical computing and graphics.  Large Package Repository: The Comprehensive R Archive Network (CRAN) offers thousands of add-on packages for specialized tasks.  Cross-Platform Compatibility: R can work on Windows, Mac and Linux operating systems.  Great for Visualization: With packages like ggplot2 , R makes it easy to create informative, interactive charts and plots.
Basics of R Programming Features of R –  Free and Open-Source: R is a free and open-source programming language distributed under the GNU General Public License.  Cross-Platform Interoperability: R has distributions that run on Windows, Linux, and Mac, allowing for easy porting of R code across different platforms.  Interpreter-Based Development: R uses an interpreter, which facilitates the development of code by allowing for immediate execution and testing.  Database Integration: R effectively connects to various databases and can import data from sources like Microsoft Excel, Access, MySQL, SQLite, Oracle, etc.  Bridging Software Development and Data Analysis: R serves as a flexible language that bridges the gap between software development and data analysis tasks.
Basics of R Programming Features of R –  Rich Package Ecosystem: R provides a vast collection of packages with diverse codes, functions, and features tailored for data analysis and modeling tasks.  Statistical Modeling Capabilities: It offers a range of tools for statistical modeling, allowing for advanced analysis of data and generation of models.  Data Visualization: R is equipped with powerful tools for creating a wide array of visualizations to aid in data exploration and interpretation.  Machine Learning Capabilities: R supports machine learning with various libraries and packages, making it suitable for building and evaluating ML models.  Data Import and Manipulation: R provides robust functionality for importing, cleaning, and manipulating data, making it suitable for data pre-processing tasks.
Basics of R Programming Features of R –  Report Generation: It integrates tools for generating reports in formats such as CSV, XML, HTML, and PDF, and can also create interactive web-based reports.
Basics of R Programming Applications of R –  R is used in a variety of fields, including:  Data Science and Analytics:  R is widely used in data science due to its statistical and graphical capabilities.  Applications:  Data preprocessing and wrangling.  Exploratory Data Analysis (EDA).  Building predictive models (regression, classification).  Visualizing trends and patterns.  Popular Packages: ggplot2, dplyr, tidyverse
Basics of R Programming Applications of R –  R is used in a variety of fields, including:  Finance & Economics:  R is heavily used for modeling financial systems and analyzing market data.  Applications:  Time series analysis and forecasting.  Portfolio optimization and risk assessment.  Stock price prediction and algorithmic trading.  Econometric modeling and hypothesis testing.  Popular Packages: quantmod, TTR, xts
Basics of R Programming Applications of R –  R is used in a variety of fields, including:  Bioinformatics and Computational Biology:  R is a standard tool in genomics and life sciences for handling complex biological data.  Applications:  Gene expression analysis.  Sequence alignment and annotation.  Protein structure and function analysis.  DNA microarray data analysis.  Popular Packages: Bioconductor, edgeR, limma
Basics of R Programming Applications of R –  R is used in a variety of fields, including:  Healthcare and Epidemiology:  R is extensively used for medical statistics and public health modeling.  Applications:  Clinical trial data analysis.  Survival analysis.  Disease modeling.  Risk factor identification and patient data mining.  Popular Packages: survival, survminer (which provides ggsurvplot()), epitools, epiR
Basics of R Programming Applications of R –  R is used in a variety of fields, including:  Marketing and Customer Analytics:  R is widely used for analyzing customer behavior and marketing performance.  Applications:  Customer segmentation (clustering).  Predicting churn and customer lifetime value.  A/B testing for campaign analysis.  Sentiment analysis and opinion mining from reviews or social media.  Popular Packages: caret, tm, text2vec, cluster
Basics of R Programming Applications of R –  R is used in a variety of fields, including:  Academia and Education:  R is a preferred tool in academic institutions for teaching and research.  Applications:  Teaching statistics and programming.  Data collection and research analysis.  Academic paper visualizations and simulations.  Publishing reproducible research.  Popular Packages: RMarkdown
Basics of R Programming Applications of R –  R is used in a variety of fields, including:  Manufacturing and Industrial Engineering:  R supports quality control and optimization tasks in manufacturing environments.  Applications:  Process control and Six Sigma analysis.  Root cause analysis.  Predictive maintenance and downtime analysis.  Inventory and supply chain optimization.  Popular Packages: qcc, SixSigma, forecast
Basics of R Programming Applications of R –  R is used in a variety of fields, including:  Logistics and Supply Chain:  R helps in optimizing logistics operations and forecasting demand.  Applications:  Demand forecasting using time series.  Route and network optimization.  Warehouse operations analytics.  Vendor and inventory analysis.  Popular Packages: forecast, tidyverse, lpSolve
Basics of R Programming Applications of R –  R is used in a variety of fields, including:  Environmental Science and Climate Research:  R is used in modeling environmental and geographical data.  Applications:  Climate pattern forecasting.  Pollution level analysis.  Remote sensing and spatial data analysis.  Popular Packages: raster, sp, rgdal, leaflet
Basics of R Programming Applications of R –  R is used in a variety of fields, including:  Media and Entertainment:  R is used for analyzing consumer preferences and digital media content.  Applications:  Viewer behavior prediction and segmentation.  Recommendation engines.  Sentiment analysis of reviews.  Social media trend analytics.  Popular Packages: tm, sentiment, textdata
Basics of R Programming Introduction to R Studio –  R Studio is an integrated development environment (IDE) for R.  An IDE is a GUI where we can write our code, see the results and also see the variables that are generated during the course of programming.  R Studio is available as both open-source and commercial software.  R Studio is also available as both Desktop and Server versions.  R Studio is also available for various platforms such as Windows, Linux, and macOS.  R Studio is an open-source tool that provides an IDE for the R language, as well as enterprise-ready professional software for data science teams to develop & share their work.
Basics of R Programming Why use R Studio? –  R Studio is preferred for several reasons:  Integrated Development Environment (IDE): It provides a user-friendly interface designed specifically for R programming, offering features like syntax highlighting, code completion, and debugging tools that enhance productivity and code quality.  Project Management: R Studio facilitates efficient project organization with its project-based workflow, allowing users to manage multiple scripts, data files, and plots within a cohesive workspace, which is essential for organizing projects after the R language installation.  Data Visualization: It supports powerful data visualization capabilities through integrated plotting tools and compatibility with popular visualization libraries like ggplot2. It enables users to create insightful graphs and charts effortlessly, enhancing data representation after the R language installation.
Basics of R Programming Why use R Studio? –  R Studio is preferred for several reasons:  Package Management: R Studio simplifies package management with tools like the Package Manager and integrated CRAN repository access, making it easy to install, update, and manage R packages crucial for various analytical tasks, supporting package management post R language installation.  Markdown and R Markdown Support: It seamlessly integrates with Markdown and R Markdown, enabling users to create dynamic reports, presentations, and documents that combine code, visualizations, and narrative text in a single file, facilitating report generation after R language installation.  Collaboration and Sharing: R Studio facilitates collaboration by supporting version control systems like Git and enabling seamless sharing of projects and analyses through R Studio Server and R Studio Cloud, promoting collaboration after the R language installation.
Basics of R Programming How to Download R and R Studio? –  To Install R and R Studio on Windows we will have to download R and R Studio with the following steps.  Step 1: First, we need to set up an R environment in our local machine. We can download the same from https://posit.co/download/rstudio-desktop/
Basics of R Programming How to Download R and R Studio? –  Step 2: We have to download both applications: first install base R, then R Studio. After clicking on the install R link, we get a new page. Here we select Linux, macOS or Windows according to the user's system.
Basics of R Programming How to Download R and R Studio? –  Step 3: After selecting OS, we get below page where we have to select R base.
Basics of R Programming How to Download R and R Studio? –  Step 4: Now click on the download link on that page so that R base starts downloading.
Basics of R Programming How to Download R and R Studio? –  Step 5: Again go to main page and download R Studio Desktop for Windows.
Basics of R Programming Steps to Install R and R Studio –  Step 1: After downloading R for the Windows platform, install it by double-clicking the installer.  Step 2: Download R Studio from its official page. Note: It is free of cost (under AGPL licensing).  Step 3: After downloading, we will get a file named "RStudio-x.x.xxxx.exe" in our Downloads folder.  Step 4: Double-click the installer, and install the software.  Step 5: Test the R Studio installation.  Search for R Studio in the Windows search bar on the Taskbar. Start the application & enter the following code in the console. print("Hello World")  Step 6: If the console prints Hello World, the installation is successful.
Basics of R Programming R Studio interface overview –  After the installation process is over, the R Studio interface looks like:
Basics of R Programming R Studio interface overview –  The console panel(left panel) is the place where R is waiting for us to tell it what to do, and see the results that are generated when we type in the commands.  To the top right, we have the Environment / History panel. It contains 2 tabs:  Environment tab: It shows the variables that are generated during the course of programming in a workspace that is temporary.  History tab: In this tab, we will see all the commands that are used till now from the start of usage of R Studio.  To the right bottom, we have another panel, which contains multiple tabs, such as files, plots, packages, help, and viewer.  The Files tab shows the files and directories that are available within the default workspace of R.  The Plots tab shows the plots that are generated during the course of programming.
Basics of R Programming R Studio interface overview –  To the right bottom, we have another panel, which contains multiple tabs, such as files, plots, packages, help, and viewer.  The Packages tab shows the packages that are already installed and also gives a user interface to install new packages.  The Help tab is the most important one, where we can get help from the R documentation on the functions that are built into R.  The last tab is the Viewer tab, which can be used to see the local web content that is generated using R.
Basics of R Programming Creation and Execution of R File in R Studio –  Creating an R file –  There are two ways to create an R file in R studio:  Click on the File tab, it will give a drop-down menu, where we can select the new file and then R script, so that, we will get a new file open.
Basics of R Programming Creation and Execution of R File in R Studio –  Creating an R file –  There are two ways to create an R file in R studio:  Use the plus button, which is just below the file tab and choose R script, from there, to open a new R script file.
Basics of R Programming Creation and Execution of R File in R Studio –  Creating an R file –  Once we open an R script file, this is how an R Studio with the script file open looks like.
Basics of R Programming Creation and Execution of R File in R Studio –  Creating an R file –  So, the three panels (console, environment/history and files/plots) are still there.  On the top left we have a new window, which is the opened script file. Now we are ready to write a script file or some program in R Studio.  Writing Scripts in an R file –  Writing scripts to an R file is demonstrated here with an example:  In the first line of the code, a variable 'x' is assigned the value 11; in the second line, 'y' is defined as 'x' times 10.  Here, the code evaluates x times 10 and assigns the result to y.
Basics of R Programming Creation and Execution of R File in R Studio –  Writing Scripts in an R file –  The third statement, print(c(x, y)), concatenates x and y into one vector and prints the result.
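The three-line script described above can be sketched as follows. This is a reconstruction from the description (the original slide image is not available), so the exact layout of the author's script is an assumption.

```r
# demo.R: a three-line script matching the description above
x <- 11          # assign the value 11 to x
y <- x * 10      # evaluate x times 10 and assign the result to y
print(c(x, y))   # c() concatenates x and y into one vector, then print it
```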
Basics of R Programming Creation and Execution of R File in R Studio –  Saving an R File –  After writing a script file, there is a need to save this file before execution.  To save the R file, from the file menu click either save or save as button.  When we click the save button, it will automatically save the file with untitled x. The x can be 1 or 2 depending on how many R scripts already opened.  It is a nice idea to use the Save as button, so that, we can rename the script file according to our convenience.  When we click Save as button, it will pop out a window, where we can rename the script file as demo.R.  Once we rename, then by clicking the save button we can save the script file.
Basics of R Programming Creation and Execution of R File in R Studio –  Saving an R File –  The window will look like below –
Basics of R Programming Creation and Execution of R File in R Studio –  Execution of an R File –  There are several ways in which the execution of the commands that are available in the R file is done.
Basics of R Programming Creation and Execution of R File in R Studio –  Execution of an R File –  Using the run command: This "run" command can be executed using the GUI, by pressing the run button there, or we can use the Shortcut key control + enter. What does it do? It will execute the line in which the cursor is there.  Using the source command: This "source" command can be executed using the GUI, by pressing the source button there, or we can use the Shortcut key control + shift + S. What does it do? It will execute the whole R file and only print the output which we wanted to print.  Using the source with echo command: This "source with echo" command can be executed using the GUI, by pressing the source with echo button there, or we can use the Shortcut key control + shift + enter. What does it do? It will print the commands also, along with the output we are printing.
Basics of R Programming Creation and Execution of R File in R Studio –  Execution of an R File –  The output will be printed in the console. The values of x, y and z are also shown in the environment panel.
Basics of R Programming Creation and Execution of R File in R Studio –  Execution of an R File –  Run command over Source command:  Run can be used to execute the selected lines of R code.  Source and Source with echo can be used to run the whole file.  The advantage of using Run is, we can troubleshoot or debug the program when something is not behaving according to our expectations.  The disadvantage of using run command is, it populates the console and makes it messy unnecessarily.
Basics of R Programming Clear the Console and the Environment in R Studio –  Clearing the Console –  We can clear the console in R and R Studio. In some cases, when we run code using "source" or "source with echo", the console becomes messy & needs to be cleared.  The console can be cleared using the shortcut key "ctrl + L".  Note: Clearing the console does not delete the variables in the workspace. Even after the console is cleared, the Environment tab still shows the variables that were created earlier.
Basics of R Programming Clear the Console and the Environment in R Studio –  Clearing the Environment –  Variables on the R environment can be cleared in two ways:  Using rm() command:  When we want to clear a single variable from R environment we can use the "rm()" command followed by the variable we want to remove.  rm (variable) e.g. rm (x)  If we want to delete all the variables that are there in the environment then we can use the "rm" with an argument "list" is equal to "ls" followed by a parenthesis.  rm (list = ls())
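A minimal sketch of both rm() forms follows. To avoid wiping the reader's real workspace, it works inside a scratch environment (the name scratch is an assumption made for illustration); at the console, rm(x) and rm(list = ls()) act on the global environment in the same way.

```r
# Create a scratch environment holding three variables
scratch <- new.env()
assign("x", 1, envir = scratch)
assign("y", 2, envir = scratch)
assign("z", 3, envir = scratch)

rm("x", envir = scratch)                  # remove a single variable
after_one <- ls(scratch)                  # names that remain: "y" "z"

rm(list = ls(scratch), envir = scratch)   # remove everything that is left
after_all <- ls(scratch)                  # now empty

print(after_one)
print(after_all)
```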
Basics of R Programming Clear the Console and the Environment in R Studio –  Clearing the Environment –  Using the GUI:  We can also clear all the variables in the environment using the GUI in the environment pane.  When we press the brush button, it will pop up a window asking "Do you want to remove all the objects from the environment?".  If we say yes, it will clear all the variables & we can see that the environment is now empty.
Basics of R Programming Basic Syntax in R Programming –  A program in R is made up of three things: Variables, Comments & Keywords.  Variables are used to store the data.  Comments are used to improve code readability.  Keywords are reserved words that hold a specific meaning to the interpreter.  Variables in R –  A variable is a name given to a reserved memory location that can store any type of data.  In R, the assignment can be denoted in three ways:  = (Simple Assignment)  <- (Leftward Assignment)  -> (Rightward Assignment)
Basics of R Programming Basic Syntax in R Programming –  Variables in R –  The rightward assignment is less common & can be confusing, so it is generally recommended to use <- or = operator for assigning values in R.
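The three assignment forms can be tried side by side. The variable names a, b and d are chosen only for illustration.

```r
a = 10        # simple assignment
b <- 20       # leftward assignment (the most common style in R code)
30 -> d       # rightward assignment: the value is on the left, the name on the right

print(c(a, b, d))
```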
Basics of R Programming Basic Syntax in R Programming –  Comments in R –  Comments are a way to improve the code's readability; they are meant only for the reader, so the interpreter ignores them.  R has only single-line comments, but a multi-line comment can be written with a simple trick: start every line of the comment with #.  Single-line comments are written by using # at the beginning of the statement.  When the code runs, all comments are ignored by the interpreter.
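A short sketch of single-line and multi-line comments follows. The variable total is illustrative only.

```r
# This is a single-line comment

# A "multi-line" comment is simply
# several single-line comments
# written one after another

total <- 2 + 3   # a comment can also follow code on the same line
print(total)
```

In R Studio, selecting several lines and pressing Ctrl + Shift + C comments or uncomments them all at once.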
Basics of R Programming Basic Syntax in R Programming –  Keywords in R –  Keywords are the words reserved by a program because they have a special meaning thus a keyword can't be used as a variable name, function name, etc.  We can view these keywords by using either help(reserved) or ?reserved.
Basics of R Programming Basic Syntax in R Programming –  Keywords in R –  if, else, repeat, while, function, for, in, next and break are used for control-flow statements and declaring user-defined functions.  The remaining keywords are constants; for example, TRUE and FALSE are the boolean constants.  NaN denotes a Not a Number value, NULL denotes an undefined (empty) value and NA denotes a missing value.  Inf is used for infinity.
Basics of R Programming R Commands –  The following R commands provide an overview of different application areas in R programming.  Depending on specific needs and projects, we can pick and match the commands that suits.  Reading and Writing Commands –  Reading and writing data are fundamental tasks in data analysis and manipulation.  In R, several functions & packages can help to handle different types of data sources.  read.csv() - Read data from a CSV file.  write.csv() - Write data to a CSV file.
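As a minimal sketch, the two functions can be used for a round trip through a temporary file. The data frame scores and its columns are made up for illustration.

```r
# A small data frame to write out
scores <- data.frame(name = c("Asha", "Ravi"), marks = c(82, 91))

path <- tempfile(fileext = ".csv")           # temporary file, removed with the session
write.csv(scores, path, row.names = FALSE)   # write without the row-number column

loaded <- read.csv(path)                     # read the file back into a data frame
print(loaded)
```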
Basics of R Programming R Commands –  Dataframe Operations Commands -  Dataframe operations in R are essential for data manipulation and analysis.  Here are some common operations we might perform on data frames using base R & dplyr package, which is part of the tidyverse collection.  data.frame() - Create a data frame.  subset() - Filter data based on specific conditions.  merge() - Merge data from different data frames.  aggregate() - Aggregate data based on specific criteria.  transform() - Create new variables in a data frame.  sort() - Sort vectors; to sort the rows of a data frame, use order().  unique() - Identify unique values in a vector or column.
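The base R operations listed above can be sketched on two small data frames. The data (students, hostel) is invented for illustration.

```r
students <- data.frame(id    = c(1, 2, 3, 4),
                       dept  = c("CS", "IT", "CS", "IT"),
                       marks = c(70, 85, 90, 60))
hostel   <- data.frame(id = c(1, 2, 3, 4), hostel = c("A", "B", "A", "C"))

passed   <- subset(students, marks >= 65)                          # filter rows
joined   <- merge(students, hostel, by = "id")                     # join on the id column
dept_avg <- aggregate(marks ~ dept, data = students, FUN = mean)   # mean marks per department
scaled   <- transform(students, fraction = marks / 100)            # add a derived column
ranked   <- students[order(students$marks, decreasing = TRUE), ]   # sort rows by marks
depts    <- unique(students$dept)                                  # distinct departments

print(dept_avg)
print(ranked)
```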
Basics of R Programming R Commands –  Applying Functions Commands -  Applying functions to data frames is a powerful technique in R for data transformation and analysis.  Here are various ways to apply functions to data frames.  apply() - Apply a function to rows or columns of matrices or data frames.  lapply(), sapply(), mapply() - Apply functions to lists or vectors.
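A brief sketch of the apply family follows, using a small matrix and list made up for illustration.

```r
m <- matrix(1:6, nrow = 2)          # 2 rows, 3 columns, filled column by column

row_sums <- apply(m, 1, sum)        # MARGIN = 1 applies the function to each row
col_sums <- apply(m, 2, sum)        # MARGIN = 2 applies the function to each column

nums <- list(a = 1:3, b = 4:6)
as_list   <- lapply(nums, mean)     # always returns a list
as_vector <- sapply(nums, mean)     # simplifies the result to a vector when possible

# mapply() walks over several vectors in parallel
pair_sums <- mapply(function(x, y) x + y, 1:3, 4:6)

print(row_sums)
print(as_vector)
print(pair_sums)
```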
Basics of R Programming R Commands –  Using dplyr for Data Manipulation -  The dplyr is a powerful package in R designed to make data manipulation easy and intuitive.  It provides a set of verbs that allow you to solve the most common data manipulation challenges.  dplyr::filter() - Filter data in data frames.  dplyr::mutate() - Create new variables in data frames.  dplyr::select() - Select specific columns from a data frame.  dplyr::summarize() - Summarize data by applying functions.
Basics of R Programming R Commands –  Data Visualizations Commands -  Data visualization is a critical part of data analysis, and R offers powerful libraries like ggplot2 for creating various types of visualizations.  Base R Plotting Functions –  In R, base plotting functions provide a straightforward way to create a wide range of plots.  plot() - Create scatter plots and other basic plot types.  hist() - Create histograms.  barplot() – Create bar charts.  boxplot() – Create box plots.  ggplot2::ggplot() - Create sophisticated and customizable visualizations.
Basics of R Programming R Commands –  Data Visualizations Commands -  Specialized Plots –  Specialized plots cater to specific data visualization needs, offering more advanced and tailored representations.  qqnorm(), qqline() - Create quantile-quantile plots.  acf() - Display autocorrelation functions.  density() - Compute kernel density estimates, which can then be drawn with plot().  heatmap() - Create heat maps.
Basics of R Programming R Commands –  Statistical Analysis Commands –  Statistical analysis in R involves a wide range of techniques & commands.  Descriptive Statistics –  To compute descriptive statistics such as mean, median, standard deviation, and quartiles, we can use the summary and quantile functions.  summary() - Get a summary of data, including statistical metrics.  mean(), median(), sd() - Calculate mean, median, and standard deviation.
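The descriptive statistics functions can be tried on a small numeric vector. The values in marks are made up for illustration.

```r
marks <- c(55, 62, 70, 78, 85)

avg   <- mean(marks)                      # arithmetic mean
mid   <- median(marks)                    # middle value
sdev  <- sd(marks)                        # sample standard deviation
quart <- quantile(marks, c(0.25, 0.75))   # first and third quartiles

print(summary(marks))   # min, quartiles, median, mean and max in one call
print(quart)
```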
Basics of R Programming R Commands –  Statistical Analysis Commands –  Hypothesis Testing –  Hypothesis testing is a fundamental concept in statistics used to make inferences about a population based on sample data.  t.test() - Perform t-tests for hypothesis testing.  anova() - Perform analysis of variance (ANOVA).  chisq.test() - Perform chi-square tests.
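A minimal sketch of the three tests follows. The two groups and the 2 x 2 table of counts are invented for illustration.

```r
group_a <- c(5.1, 4.9, 5.6, 5.8, 6.0)
group_b <- c(6.2, 6.8, 7.1, 6.5, 7.0)

# Two-sample t-test: do the two groups have the same mean?
tt <- t.test(group_a, group_b)

# ANOVA table for a linear model with one grouping factor
values <- c(group_a, group_b)
group  <- factor(rep(c("a", "b"), each = 5))
tab    <- anova(lm(values ~ group))

# Chi-square test of independence on a 2 x 2 table of counts
counts <- matrix(c(30, 10, 20, 40), nrow = 2)
cs     <- chisq.test(counts)

print(tt$p.value)
print(tab)
print(cs$p.value)
```

Each test returns an object whose p.value (or, for anova(), the Pr(>F) column) can be compared with the chosen significance level.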
Basics of R Programming R Commands –  Statistical Analysis Commands –  Regression and Correlation –  Regression and correlation are statistical techniques used to analyze the relationship between variables.  lm() - Perform linear regressions.  cor() - Calculate correlation coefficients between variables.
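As a brief sketch, lm() and cor() can be applied to two related vectors. The hours and score data are made up for illustration.

```r
hours <- c(1, 2, 3, 4, 5)
score <- c(52, 55, 61, 64, 68)

fit <- lm(score ~ hours)    # fit score = intercept + slope * hours
r   <- cor(hours, score)    # Pearson correlation coefficient

print(coef(fit))            # intercept and slope of the fitted line
print(r)
```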
Basics of R Programming R Commands –  Data Import and Export Commands –  Data import and export are essential tasks in R for working with external data sources.  R Data Objects –  In R, there are several data objects commonly used for storing and working with data.  readRDS(), saveRDS() - Read and save R data objects.  Reading and Writing Various Formats –  Reading and writing data in various formats is a common task in R.  read.table() - Read data from a file.  write.table() - Write data to a file.
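A minimal sketch of saving and restoring a single R object with saveRDS() and readRDS() follows. The object obj is illustrative.

```r
obj  <- list(id = 1:3, label = "demo")
path <- tempfile(fileext = ".rds")   # temporary file for the example

saveRDS(obj, path)                   # serialise one R object to disk
restored <- readRDS(path)            # read it back under any name we like

print(identical(obj, restored))
```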
Basics of R Programming R Commands –  Data Import and Export Commands –  Excel Files –  Working with Excel files in R is a common task, and there are multiple packages available for reading and writing Excel files.  readxl::read_excel() - Read data from Excel files.  writexl::write_xlsx() - Write data to Excel files.
Basics of R Programming R Commands –  Control Structures and Conditionals –  Control structures and conditionals in R allow you to control the flow of execution in your code based on certain conditions.  Conditional Statements –  Conditional statements in R execute specific code blocks based on whether certain conditions are true or false.  ifelse() - Evaluate a condition element by element on a vector and return one of two values.  Loops –  Loops in R execute a block of code repeatedly.  for - Loop over a sequence.  while - Loop while a condition remains true.  repeat - Execute a loop indefinitely until a break statement is reached.
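The conditional and the three loop forms can be sketched together. The vector marks and the counters are made up for illustration.

```r
marks <- c(35, 72, 48, 90)

# ifelse() is vectorised: one result per element
result <- ifelse(marks >= 40, "pass", "fail")

# for: loop over a sequence
total <- 0
for (m in marks) {
  total <- total + m
}

# while: loop as long as the condition holds
n <- 0
while (n < 3) {
  n <- n + 1
}

# repeat: loops forever unless break is reached
count <- 0
repeat {
  count <- count + 1
  if (count == 5) break
}

print(result)
print(c(total, n, count))
```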
Basics of R Programming Variables and scope of variables –  In R, variables are the containers for storing data values.  They are references, or pointers, to an object in memory, i.e. whenever a variable is assigned to an instance, it gets mapped to that instance.  A variable in R can store a vector, a group of vectors or a combination of many R objects.  Naming convention for Variables –  A variable name in R may contain alphanumeric characters plus two special characters: underscore ('_') and dot ('.').  A variable name must start with a letter, or with a dot that is not followed by a digit.  Other special characters like ('!', '@', '#', '$') are not allowed in variable names.
Basics of R Programming Variables and scope of variables –  Scope of a Variable –  The location where we can find a variable and also access it if required is called the scope of a variable.  There are mainly two types of variable scopes:  Global Variables: Global variables are those variables that exist throughout the execution of a program. It can be changed and accessed from any part of the program.  Local Variables: Local variables are those variables that exist only within a certain part of a program like a function and are released when the function call ends.
Basics of R Programming Variables and scope of variables–  Scope of a Variable –  Global Variable:  As the name suggests, Global Variables can be accessed from any part of the program.  They are available throughout the lifetime of a program.  They are declared anywhere in the program outside all of the functions or blocks.  Declaring global variables: Global variables are usually declared outside of all of the functions and blocks. They can be accessed from any portion of the program.
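A minimal sketch of a global variable follows. The original slide's code is not available, so this is a reconstruction; the function name display() is an assumption.

```r
global <- 5            # declared outside every function, so it is global

display <- function() {
  print(global)        # the function can read the global variable
}

display()              # prints the current value
global <- 10           # update the variable at the top level
display()              # the function now sees the new value
```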
Basics of R Programming Variables and scope of variables–  Scope of a Variable –  Global Variable:  In above code, the variable 'global' is declared at the top of program outside all of the functions so it is a global variable and can be accessed or updated from anywhere in the program.
Basics of R Programming Variables and scope of variables–  Scope of a Variable –  Local Variable:  Variables defined within a function or block are said to be local to those functions.  Local variables do not exist outside the block in which they are declared, i.e. they can not be accessed or used outside that block.  Declaring local variables: Local variables are declared inside a block.
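A sketch of a local variable being accessed outside its function follows. It is a reconstruction, since the original slide's code is not available. The tryCatch() wrapper is an addition so that the script keeps running and the error message can be shown.

```r
func <- function() {
  age <- 18            # age is local to func()
  invisible(NULL)
}

func()

# Accessing age outside func() fails; capture the error message instead of stopping
msg <- tryCatch(
  {
    print(age)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```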
Basics of R Programming Variables and scope of variables–  Scope of a Variable –  Local Variable:  The above program displays an error saying "object 'age' not found".  The variable age was declared within the function "func()" so it is local to that function & not visible to the portion of the program outside this function.  To correct the above error we have to display the value of variable age from the function "func()" only.
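One way to apply the correction described above is to print age inside func(), where the variable is visible. This is a sketch, not necessarily the exact code on the original slide.

```r
func <- function() {
  age <- 18
  print(age)     # printed inside the function, where age exists
}

func()
```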
Basics of R Programming Data Types in R –  Data types in R define the kind of values that variables can hold.  Choosing the right data type helps to optimize memory usage & computation.  R does not require explicit data type declarations, and variables can change their type dynamically during execution.  R Programming language has the following basic R-data types:  Numeric Data type in R –  Decimal values are called numeric in R. It is the default R data type for numbers.  If we assign a decimal value to a variable x then it will be of numeric type.  Real numbers with a decimal point are represented using this data type in R.  It uses the double-precision floating-point format to represent numerical values.
Basics of R Programming Data Types in R –  Numeric Data type in R –  Even if an integer is assigned to a variable, it is still saved as a numeric value.
Basics of R Programming Data Types in R –  Numeric Data type in R –  When R stores a number in a variable, it stores it as a double-precision floating-point value (a "double") by default.  It means that a value such as 5 is stored as the double 5.0, with a type of double and a class of numeric, even though it prints as 5.  It is therefore not an integer, which can be confirmed with the is.integer() function.
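This can be checked directly at the console. The variable x is illustrative.

```r
x <- 5

class(x)        # "numeric"
typeof(x)       # "double"
is.integer(x)   # FALSE: 5 without the L suffix is stored as a double
```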
Basics of R Programming Data Types in R –  Integer Data type in R –  R supports integer data types which are the set of all integers.  We can create as well as convert a value into an integer type using the as.integer() function.  We can also use the capital 'L' notation as a suffix to denote that a particular value is of the integer R data type.
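Both ways of creating an integer can be sketched briefly. The variable names are illustrative.

```r
a <- as.integer(5)        # convert a numeric value to integer
b <- 5L                   # the L suffix creates an integer directly

class(a)                  # "integer"
identical(a, b)           # TRUE: both are the integer 5

d <- as.integer(3.9)      # conversion drops the fractional part
print(d)
```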
Data Types in R – Logical Data Type
• R has a logical (Boolean) data type whose values are TRUE or FALSE.
• A logical value is often created by comparing variables.
Data Types in R – Complex Data Type
• R supports a complex data type that covers the complex numbers.
• It stores numbers that have an imaginary component.
Data Types in R – Character Data Type
• R supports a character data type that covers letters, digits and special characters.
• It stores character values, or strings, which can contain letters, numbers and symbols.
• The easiest way to mark a value as character is to wrap it in single or double quotes.
Data Types in R – Raw Data Type
• To store and work with data at the byte level in R, use the raw data type.
• A raw vector holds a sequence of unprocessed bytes, which enables low-level operations on binary data.
• In the slide's example, the raw vector x has five elements, each representing one raw byte value.
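The logical, complex, character and raw types from the last four slides can be sketched together. The original examples are not preserved, so all values here are assumptions. Only the name x and its length of five follow the slide text.

```r
is_greater <- 7 > 3                # logical, created by a comparison
z <- 3 + 2i                        # complex number with imaginary part 2
name <- "R language"               # character string in double quotes
x <- as.raw(c(1, 2, 3, 4, 5))      # raw vector of five bytes (illustrative values)

print(class(is_greater))           # "logical"
print(class(z))                    # "complex"
print(class(name))                 # "character"
print(x)                           # 01 02 03 04 05
```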
Data Types in R – Find the Data Type of an Object
• To find the data type of an object, pass the object as an argument to the class() function.
• Syntax: class(object)
Data Types in R – Type Verification
• We can verify the data type of an object if we are in doubt about it.
• To do that, use the prefix "is." before the name of the data type.
• Syntax: is.data_type(object)
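A small sketch of class() and the is.* checks. The sample values are chosen only for illustration.

```r
values <- list(3.14, 2L, TRUE, "text", 1 + 2i)

# class() reports the data type of each object
classes <- vapply(values, class, character(1))
print(classes)   # "numeric" "integer" "logical" "character" "complex"

# is.<type>() verifies a specific type and returns TRUE or FALSE
check_chr <- is.character("text")   # TRUE
check_num <- is.numeric("text")     # FALSE
check_int <- is.integer(2L)         # TRUE
```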
Data Types in R – Convert the Data Type of an Object to Another
• Changing the data type of an object to another type is called data type conversion.
• It is a common operation in many programming languages, used to reshape data before computing with it.
• The programmer performs the conversion explicitly.
• Syntax: as.data_type(object)
• Not all conversions are possible. An attempted conversion that fails returns NA.
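The conversion example from the slide is not preserved. The following sketch uses arbitrary values to show a few successful conversions and one that fails.

```r
num_from_text <- as.numeric("42.5")    # character -> numeric
text_from_num <- as.character(10)      # numeric -> character
int_from_lgl  <- as.integer(TRUE)      # logical -> integer (TRUE becomes 1)

# "abc" cannot be read as a number, so the result is NA (R also warns)
failed <- suppressWarnings(as.numeric("abc"))

print(num_from_text)   # 42.5
print(text_from_num)   # "10"
print(failed)          # NA
```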
Operators in R
• Operators are symbols that tell R to perform a particular operation on its operands.
• They carry out mathematical, logical and decision operations on operands such as integers, numeric values and complex numbers.
• R supports four main kinds of binary operators: arithmetic, logical, relational and assignment.
Arithmetic Operators
• An arithmetic operator is evaluated between operands that may be scalar values, complex numbers or vectors.
• On vectors, the operation is performed element-wise at corresponding positions.
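Operators in R – Arithmetic Operators
• Addition operator (+): adds the corresponding elements of the operands.
• Subtraction operator (-): subtracts the second operand from the first.
• Multiplication operator (*): multiplies the corresponding elements of the operands.
• Division operator (/): divides the first operand by the second.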
Operators in R – Arithmetic Operators
• Power operator (^): raises the first operand to the power of the second.
• Modulo operator (%%): returns the remainder after dividing the first operand by the second.
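The arithmetic operators can be tried on two small vectors. The vectors a and b are illustrative; each operator works element by element.

```r
a <- c(2, 7, 10)
b <- c(3, 2, 4)

add_res <- a + b     # 5  9 14
sub_res <- a - b     # -1 5  6
mul_res <- a * b     # 6 14 40
div_res <- a / b     # element-wise division
pow_res <- a ^ b     # 8 49 10000
mod_res <- a %% b    # 2  1  2

print(pow_res)
print(mod_res)
```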
Operators in R – Logical Operators
• Logical operators perform decision operations on their operands, and each result is either TRUE or FALSE.
• Any non-zero numeric value, real or complex, is treated as TRUE; zero is treated as FALSE.
• Element-wise logical AND operator (&): returns TRUE where both operands are TRUE.
• Element-wise logical OR operator (|): returns TRUE where either operand is TRUE.
Operators in R – Logical Operators
• NOT operator (!): a unary operator that negates each element of its operand.
• Logical AND operator (&&): works on single (length-one) values and returns a single result, which is TRUE only if both operands are TRUE.
• Logical OR operator (||): works on single values and returns a single result, which is TRUE if either operand is TRUE.
• Older versions of R applied && and || to the first elements of longer vectors; since R 4.3.0, an operand longer than one is an error.
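A brief sketch of the logical operators on two illustrative vectors. The && and || operators are applied to single elements so that the code also runs on recent versions of R.

```r
p <- c(TRUE, FALSE, TRUE)
q <- c(TRUE, TRUE, FALSE)

and_res <- p & q          # TRUE FALSE FALSE (element-wise)
or_res  <- p | q          # TRUE TRUE  TRUE  (element-wise)
not_res <- !p             # FALSE TRUE FALSE

single_and <- p[1] && q[1]   # TRUE, single values only
single_or  <- p[2] || q[3]   # FALSE, both operands are FALSE

print(and_res)
print(single_and)
```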
Operators in R – Relational Operators
• Relational operators compare the corresponding elements of their operands.
• Each comparison returns TRUE if the first operand satisfies the relation with respect to the second, and FALSE otherwise.
• TRUE is always considered greater than FALSE.
• Less than (<)
• Less than or equal to (<=)
Operators in R – Relational Operators
• Greater than (>)
• Greater than or equal to (>=)
• Not equal to (!=)
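The relational operators listed on these two slides, applied to illustrative vectors x and y:

```r
x <- c(1, 5, 9)
y <- c(4, 5, 6)

lt <- x < y      # TRUE  FALSE FALSE
le <- x <= y     # TRUE  TRUE  FALSE
gt <- x > y      # FALSE FALSE TRUE
ge <- x >= y     # FALSE TRUE  TRUE
ne <- x != y     # TRUE  FALSE TRUE

true_vs_false <- TRUE > FALSE   # TRUE, because TRUE counts as 1 and FALSE as 0
print(lt)
```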
Operators in R – Assignment Operators
• Assignment operators assign values to data objects in R. The objects may be integers, vectors or functions.
• The value is then stored under the assigned variable name.
• There are two kinds of assignment operators: left and right.
• Left assignment (<-, <<- or =): assigns the value on the right to the name on the left.
• Right assignment (-> or ->>): assigns the value on the left to the name on the right.
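A minimal sketch of the assignment forms. All names are illustrative. The <<- form is shown inside a small hypothetical function, bump(), because it assigns to a variable in an enclosing environment rather than creating a local one.

```r
v1 <- c(1, 2, 3)        # left assignment with <-
v2 = c(4, 5, 6)         # left assignment with =
c(7, 8, 9) -> v3        # right assignment with ->

counter <- 0
bump <- function() {
  counter <<- counter + 1   # updates 'counter' defined outside the function
}
bump()

print(v3)        # 7 8 9
print(counter)   # 1
```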
Operators in R – Miscellaneous Operators
• Miscellaneous operators handle tasks that fall outside the groups above, such as membership tests and matrix multiplication.
• %in% operator: checks whether an element belongs to a vector or list, returning TRUE if the value is present and FALSE otherwise.
• %*% operator: performs matrix multiplication. The example here multiplies a matrix by its transpose.
• The transpose of a matrix is obtained by interchanging its rows and columns.
Operators in R – Miscellaneous Operators – %*% Operator
• The number of columns of the first matrix must equal the number of rows of the second matrix.
• Multiplying a matrix A by its transpose B produces a square matrix.
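Both operators can be sketched as follows. The matrix values are assumptions; only the names A and B follow the slide.

```r
val <- 3
in_set <- val %in% c(1, 3, 5)    # TRUE, because 3 is in the vector

A <- matrix(1:6, nrow = 2)       # 2 x 3 matrix, filled column by column
B <- t(A)                        # transpose: 3 x 2
P <- A %*% B                     # matrix product: 2 x 2 (square)

print(in_set)
print(P)
```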
Keywords in R
• R keywords are reserved words that have a special meaning in the language.
• They help to control program flow, define functions and represent special values.
• We can list the reserved words with help(reserved) or ?reserved.
• if: used for decision-making, to execute code only if a condition is true.
• else: executes code when the if condition is false.
• while: a loop that runs a block repeatedly while a condition remains true.
• repeat: runs a block indefinitely until explicitly stopped with break.
• for: a loop that iterates over a sequence, running code for each element.
• function: defines reusable blocks of code.
Keywords in R
• next: skips the current iteration of a loop and continues with the next one.
• break: stops the execution of a loop immediately.
• TRUE/FALSE: Boolean constants representing logical true and false values.
• NULL: represents an empty or undefined object.
• Inf and NaN: Inf and -Inf represent positive and negative infinity; NaN means "Not a Number" and results from undefined numerical operations.
• NA: represents missing or unavailable data.
Control Structures in R Programming
• Control statements are expressions that control the execution and flow of a program based on the conditions they contain.
• These structures make a decision after evaluating a variable or expression.
• R programming has 8 types of control statements:
  1. if condition
  2. if-else condition
  3. for loop
  4. nested loops
  5. while loop
  6. repeat and break statements
  7. return statement
  8. next statement
Control Structures in R Programming – if Condition
• This control structure checks whether the expression in parentheses is TRUE.
• If it is, the statements inside the braces {} are executed.
• Syntax: if (expression) { statements }
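The slide's own example is not preserved. A minimal sketch with an illustrative value of x:

```r
x <- 100
note <- NULL

if (x > 10) {
  # runs only because the condition x > 10 is TRUE
  note <- paste(x, "is greater than 10")
  print(note)
}
```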
Control Structures in R Programming – if-else Condition
• It is similar to the if condition, but when the test expression is FALSE, the statements in the else block are executed instead.
• Syntax: if (expression) { statements } else { statements }
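A short sketch of if-else, again with an illustrative value. Here the condition fails, so the else branch runs.

```r
x <- 5

if (x > 10) {
  outcome <- "greater than 10"
} else {
  outcome <- "not greater than 10"   # this branch runs, since 5 > 10 is FALSE
}
print(outcome)
```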
Control Structures in R Programming – for Loop
• A for loop executes a sequence of statements once for each element of a vector, and stops when the elements are exhausted.
• Syntax: for (value in vector) { statements }
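As an illustration, the loop below adds up the elements of a small vector. The numbers are arbitrary.

```r
total <- 0
for (value in c(2, 4, 6, 8)) {
  total <- total + value   # body runs once per element
}
print(total)   # 20
```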
Control Structures in R Programming – Nested Loops
• Nested loops work like simple loops; "nested" means one loop placed inside another.
• Nested loops are often used to work through the rows and columns of a matrix.
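A minimal sketch of nested loops filling a matrix. The 2 x 3 size and the rule i * j are assumptions chosen for illustration.

```r
m <- matrix(0, nrow = 2, ncol = 3)

for (i in seq_len(nrow(m))) {        # outer loop: rows
  for (j in seq_len(ncol(m))) {      # inner loop: columns
    m[i, j] <- i * j
  }
}
print(m)
```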
Control Structures in R Programming – while Loop
• It is another kind of loop, which keeps iterating as long as its condition is TRUE.
• The test expression is checked before each execution of the loop body.
• Syntax: while (expression) { statements }
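A short while loop that collects the squares of 1 to 5. The names and the limit are illustrative.

```r
count <- 1
squares <- c()

while (count <= 5) {               # condition is tested before every pass
  squares <- c(squares, count^2)
  count <- count + 1               # without this the loop would never end
}
print(squares)   # 1 4 9 16 25
```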
Control Structures in R Programming – repeat Loop and break Statement
• repeat is a loop that can iterate any number of times and has no exit condition of its own.
• The break statement is therefore used to leave the loop. It can be used in any type of loop.
• Syntax: repeat { statements … if (expression) { break } }
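The original example is missing from the text. A minimal sketch that stops after three passes:

```r
n <- 0
repeat {
  n <- n + 1
  if (n >= 3) {
    break        # the only way out of a repeat loop
  }
}
print(n)   # 3
```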
Control Structures in R Programming – return Statement
• It returns the result of a function and hands control back to the caller.
• Syntax: return(expression)
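A small sketch of return(). The function square() is a hypothetical example, not taken from the slides.

```r
# Hypothetical helper used only for illustration
square <- function(x) {
  return(x^2)     # sends the result back to the caller
}

answer <- square(4)
print(answer)   # 16
```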
Control Structures in R Programming – next Statement
• It skips the rest of the current iteration and continues with the next iteration, without terminating the loop.
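As an illustration, the loop below uses next to skip even numbers and keep the odd ones. The range 1:6 is arbitrary.

```r
odds <- c()
for (i in 1:6) {
  if (i %% 2 == 0) {
    next                 # skip even numbers; the loop itself continues
  }
  odds <- c(odds, i)
}
print(odds)   # 1 3 5
```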
Functions in R Programming
• A function accepts input arguments and produces output by executing the valid R commands inside it.
• Functions are useful when we want to perform a certain task multiple times.
• In R, the function name and the name of the file that contains it need not be the same, and one file can hold one or more functions.
Creating a Function in R
• Functions are created in R with the function keyword.
• The general structure of a function is: func = function(arguments) { statements }
Functions in R Programming – Parameters or Arguments in R Functions
• In programming, parameters and arguments refer to the values passed into a function.
• The terms are often used interchangeably, but there is a subtle difference:
• Parameters are the variables defined in the function definition.
• Arguments are the actual values passed to the function when it is called.
• A function can have multiple parameters, separated by commas within the parentheses.
Functions in R Programming – Function Parameter Rules
• Number of parameters: a function should be called with the correct number of arguments. Passing more arguments than the function defines, or omitting a required argument that the function uses, causes an error.
• Default parameter values: some functions have default values for parameters. If no argument is passed, these defaults are used.
• Return value: the return() function sends the result back from the function. If return() is not called, the value of the last evaluated expression is returned.
Functions in R Programming – Calling a Function in R
• After creating a function, we have to call it in order to use it.
• A function is called by writing its name and passing values for its parameters, if any.
Passing Arguments to Functions in R
• There are several ways to pass arguments to a function:
• Case 1: Generally, arguments are passed in the same order as in the function definition.
• Case 2: If we do not want to follow that order, we can pass the arguments by name, in any order.
• Case 3: If arguments are not passed, the default values are used to execute the function.
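The three cases can be sketched with one function. rectangle_area() and its default values are hypothetical, introduced only for this illustration.

```r
# Hypothetical function with default values for both parameters
rectangle_area <- function(length = 5, width = 4) {
  length * width          # last expression is the return value
}

by_position <- rectangle_area(2, 3)                   # Case 1: positional order
by_name     <- rectangle_area(width = 8, length = 4)  # Case 2: named, any order
by_default  <- rectangle_area()                       # Case 3: defaults are used

print(c(by_position, by_name, by_default))   # 6 32 20
```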
Functions in R Programming – Types of Function in R
• Built-in function: a pre-defined function that performs a common task or operation.
• User-defined function: R allows us to write our own functions.
Built-in Functions in R
• Built-in functions already exist in R; we only need to call them to use them.
• Examples: sum(), max(), min(), etc.
Functions in R Programming – Types of Function in R – Other Built-in Functions in R
The list of built-in R functions and their uses:

Category                          Functions
Mathematical functions            abs(), sqrt(), round(), exp(), log(), cos(), sin(), tan()
Statistical functions             mean(), median(), cor(), var()
Data manipulation functions       unique(), subset(), aggregate(), order()
File input/output functions       read.csv(), write.csv(), read.table(), write.table()
Functions in R Programming – Types of Function in R – User-defined Functions in R
• User-defined functions are functions created by the user.
• The user defines how the function works, its parameters, their default values, and so on.
• They are available only in the code where they are defined.
Packages in R Programming
• Packages in R are collections of R functions, compiled code and sample data.
• They are stored in a directory called "library" within the R environment.
• R installs a set of packages by default during installation.
• When we start the R console, only the default packages are available.
• Other installed packages must be loaded explicitly before an R program can use them.
What are Repositories?
• A repository is a place where packages are stored so that we can install them from it.
• Organizations and developers may keep a local repository, but repositories are typically online and accessible to everyone.
Packages in R Programming – What are Repositories?
Some of the most popular repositories for R packages are:
CRAN (Comprehensive R Archive Network):
• It is the official repository.
• It is a network of FTP and web servers maintained by the R community around the world.
• The R community coordinates it. To be published on CRAN, a package must pass several tests that check it follows CRAN policies.
Packages in R Programming – What are Repositories?
Bioconductor:
• It is a specialized repository for open-source bioinformatics software.
• Like CRAN, it has its own submission and review processes. Its community is very active and holds several conferences and meetings per year to maintain quality.
Packages in R Programming – What are Repositories?
GitHub:
• It is the most popular repository for open-source projects.
• Its popularity comes from unlimited space for open-source projects and its integration with Git, a version control system.
• It makes it easy to share and collaborate with others.
Packages in R Programming – Get Library Locations Containing R Packages
• The .libPaths() function gets or sets the library paths.
• These are the directories in which R searches for installed packages.
• Syntax: .libPaths()
Get the List of All Installed R Packages
• Calling library() with no arguments lists the packages installed in those libraries.
• Calling library() with a package name loads that package, which makes its functions and objects available in the session.
• Syntax: library()
Packages in R Programming – Install R Packages
There are multiple ways to install an R package. Some of them are:
Installing R Packages from CRAN:
• To install a package from CRAN, we need the name of the package and the following command: install.packages("package name")
• Installing from CRAN is the most common and easiest way, since it takes only one command.
• To install more than one package at a time, pass their names as a character vector in the first argument of install.packages(): install.packages(c("package1", "package2"))
Packages in R Programming – Install R Packages
Installing Bioconductor Packages with BiocManager:
• The BiocManager package should be used to install and manage Bioconductor packages.
• To install the BiocManager package, use the following command: install.packages("BiocManager")
• Once BiocManager is installed, we can use it to install Bioconductor packages.
• Example: to install the edgeR package and its dependencies from the Bioconductor repository, use: BiocManager::install("edgeR")
Packages in R Programming – Update, Remove and Check Installed Packages in R
• To check which packages are installed on our computer, type the command: installed.packages()
• To update all the packages, type the command: update.packages()
• To update a specific package, type the command: install.packages("PACKAGE NAME")
Packages in R Programming – Installing Packages Using the R Studio UI
• In R Studio, go to Tools → Install Packages. A pop-up window opens where we can type the name of the package we want to install.
• Under Packages, type and search for the package we want, then click the Install button.
Packages in R Programming – How to Load Packages in R
• Once an R package is installed, we are ready to use its functionality.
• If we only need occasional use of a few functions or data sets inside a package, we can access them with the packagename::functionname() notation, without loading the whole package.
• Example: load a package using the library function: library(dplyr)
• Alternatively, load a package using the require function: require(dplyr)
• Both functions attempt to load the specified package, but there is a subtle difference between the two:
• library() raises an error if the package is not found or cannot be loaded.
• require() gives a warning and returns FALSE instead.
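The difference can be sketched without installing anything. The example uses the stats package, which ships with R, and a made-up package name, noSuchPackage12345, that is assumed not to exist on the machine.

```r
# A package that is present: both calls succeed
library(stats)
loaded_ok <- require("stats", character.only = TRUE, quietly = TRUE)   # TRUE

# A package assumed to be missing (hypothetical name)
missing_pkg <- "noSuchPackage12345"
require_result <- suppressWarnings(
  require(missing_pkg, character.only = TRUE, quietly = TRUE)          # FALSE
)
library_result <- tryCatch(
  library(missing_pkg, character.only = TRUE),
  error = function(e) "library() raised an error"
)

# Calling one function without attaching the package
med <- stats::median(c(1, 3, 10))   # 3

print(require_result)
print(library_result)
```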
Packages in R Programming – Difference Between a Package and a Library
• The terms package and library are often confused, and people frequently call packages libraries.
• Library: the place where installed packages are kept, usually a folder on our computer. library() is the command used to load a package from it.
• Package: a collection of functions bundled conveniently. A package is an appropriate way to organize our own work and share it with others.
Data Structures in R Programming
• A data structure is a particular way of organizing data in a computer so that it can be used effectively.
• The idea is to reduce the space and time complexity of different tasks.
• Data structures in R are tools for holding multiple values.
• R's base data structures are often organized by their dimensionality (1D, 2D or nD) and by whether they are homogeneous (all elements must be of the same type) or heterogeneous (elements can be of different types).
• This gives rise to the six data structures most frequently used in data analysis: vectors, lists, data frames, matrices, arrays and factors.
Data Structures in R Programming – Vectors
• A vector is an ordered collection of values of a basic data type, with a given length.
• All the elements of a vector must be of the same data type.
• Vectors are one-dimensional data structures.
• They are created using the c() function.
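A short sketch of vectors. The values are illustrative. The second vector shows what the same-type rule means in practice: mixed inputs are coerced to one common type.

```r
marks <- c(72, 85, 90)          # numeric vector of length 3
mixed <- c(1, "a", TRUE)        # everything is coerced to character

print(length(marks))   # 3
print(class(marks))    # "numeric"
print(mixed)           # "1" "a" "TRUE"
```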
Data Structures in R Programming – Lists
• A list is an ordered collection of elements, which can be of different types.
• Lists are also one-dimensional data structures.
• A list can hold vectors, matrices, characters, functions and so on.
• They are created using the list() function.
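A minimal list holding elements of three different types. The element names and values are made up for illustration.

```r
student <- list(name = "Asha", marks = c(72, 85), passed = TRUE)

print(student$name)        # "Asha"
print(student$marks[2])    # 85
element_types <- vapply(student, class, character(1))
print(element_types)       # character, numeric, logical
```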
Data Structures in R Programming – Data Frames
• A data frame is a table-like structure in which each column can contain a different type of data.
• Data frames are two-dimensional, heterogeneous data structures. They are lists of vectors of equal length.
• They are created using the data.frame() function.
Data Structures in R Programming – Data Frames
Data frames have the following constraints:
• A data frame must have column names, and every row should have a unique name.
• Each column must have the same number of items.
• Each item in a single column must be of the same data type.
• Different columns may have different data types.
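A small data frame that follows these constraints. The column names and values are illustrative.

```r
df <- data.frame(
  name = c("A", "B", "C"),       # character column
  age  = c(21L, 25L, 30L),       # integer column of the same length
  stringsAsFactors = FALSE
)

print(df)
print(dim(df))          # 3 rows, 2 columns
print(rownames(df))     # default unique row names "1" "2" "3"
```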
Data Structures in R Programming – Matrices
• A matrix is a two-dimensional data structure with rows and columns, in which all elements are of the same type.
• Matrices are two-dimensional, homogeneous data structures.
• They are created using the matrix() function.
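A minimal matrix sketch with illustrative values. Note that matrix() fills column by column unless told otherwise.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)   # filled column by column

print(m)
print(dim(m))     # 2 3
print(m[2, 3])    # 6: row 2, column 3
```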
Data Structures in R Programming – Arrays
• Arrays are R data objects that can store data in more than two dimensions.
• Arrays are n-dimensional data structures.
• They are homogeneous data structures.
• For example, an array of dimensions (2, 3, 3) contains 3 rectangular matrices, each with 2 rows and 3 columns.
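The (2, 3, 3) example can be sketched as follows. The fill values 1:18 are an assumption.

```r
arr <- array(1:18, dim = c(2, 3, 3))   # three 2 x 3 matrices

print(dim(arr))        # 2 3 3
print(arr[, , 1])      # first matrix, holding the values 1 to 6
first_of_second <- arr[1, 1, 2]   # 7: first element of the second matrix
last_element    <- arr[2, 3, 3]   # 18
```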
Data Structures in R Programming – Factors
• Factors are data objects used to categorize data and store it as levels.
• They are useful for storing categorical data.
• They can store both strings and integers.
• They are useful for categorizing unique values in columns such as ("TRUE" or "FALSE") or ("MALE" or "FEMALE").
• They are useful in data analysis for statistical modeling.
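A short factor sketch using the MALE/FEMALE categories mentioned above. The sample data is made up.

```r
gender <- factor(c("MALE", "FEMALE", "MALE", "FEMALE", "MALE"))

print(levels(gender))      # "FEMALE" "MALE" (levels are sorted)
counts <- table(gender)    # how many values fall in each level
print(counts)
codes <- as.integer(gender)   # internal integer codes: 2 1 2 1 2
```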

Introduction to the R Programming Language

  • 1.
    Basics of RProgramming Mr. S.K.Patil
  • 2.
    Basics of RProgramming What is R? –  R is a programming language for statistical computing and data visualization.  R is an integrated suite of software facilities for data manipulation, calculation and graphical display.  R is a programming language & software environment which becomes the first choice for statistical computing and data analysis.  R is a programming language & an analytics tool.  R was developed in 1993 by Robert Gentleman and Ross Ihaka at the University of Auckland, New Zealand.  R was started by professors as a programming language to teach introductory statistics.  R was built to simplify complex data manipulation & create clear, customizable visualizations.
  • 3.
    Basics of RProgramming What is R? –  It has gained popularity among statisticians, data scientists and researchers because of its capabilities and the availability of large amount of packages.  R has established itself as an important tool in various industries, including finance and healthcare, due to its ability to handle large datasets and perform in-depth statistical analysis.  R is a wonderful tool for statistical analysis, visualization and reporting.
  • 4.
    Basics of RProgramming Why R Programming? –  R is a unique language that offers a wide range of features for data analysis, making it an essential tool for professionals in various fields.  Reasons to prefer R –  Free and Open-Source: R is open to everyone, meaning users can modify, share and distribute their work freely.  Designed for Data: R is built for data analysis, offering a comprehensive set of tools for statistical computing and graphics.  Large Package Repository: The Comprehensive R Archive Network (CRAN) offers thousands of add-on packages for specialized tasks.  Cross-Platform Compatibility: R can work on Windows, Mac and Linux operating systems.  Great for Visualization: With packages like ggplot2 , R makes it easy to create informative, interactive charts and plots.
  • 5.
    Basics of RProgramming Features of R –  Free and Open-Source: R is a free and open-source programming language distributed under the GNU General Public License.  Cross-Platform Interoperability: R has distributions that run on Windows, Linux, and Mac, allowing for easy porting of R code across different platforms.  Interpreter-Based Development: R uses an interpreter, which facilitates the development of code by allowing for immediate execution and testing.  Database Integration: R effectively connects to various databases and can import data from sources like Microsoft Excel, Access, MySQL, SQLite, Oracle, etc.  Bridging Software Development and Data Analysis: R serves as a flexible language that bridges the gap between software development and data analysis tasks.
  • 6.
    Basics of RProgramming Features of R –  Rich Package Ecosystem: R provides a vast collection of packages with diverse codes, functions, and features tailored for data analysis and modeling tasks.  Statistical Modeling Capabilities: It offers a range of tools for statistical modeling, allowing for advanced analysis of data and generation of models.  Data Visualization: R is equipped with powerful tools for creating a wide array of visualizations to aid in data exploration and interpretation.  Machine Learning Capabilities: R supports machine learning with various libraries and packages, making it suitable for building and evaluating ML models.  Data Import and Manipulation: R provides robust functionality for importing, cleaning, and manipulating data, making it suitable for data pre-processing tasks.
  • 7.
    Basics of RProgramming Features of R –  Report Generation: It integrates tools for generating reports in formats such as CSV, XML, HTML, and PDF, and can also create interactive web-based reports.
  • 8.
    Basics of RProgramming Applications of R –  R is used in a variety of fields, including:  Data Science and Analytics:  R is widely used in data science due to its statistical and graphical capabilities.  Applications:  Data preprocessing and wrangling.  Exploratory Data Analysis (EDA).  Building predictive models (regression, classification).  Visualizing trends and patterns.  Popular Packages: ggplot2, dplyr, tidyverse
  • 9.
    Basics of RProgramming Applications of R –  R is used in a variety of fields, including:  Finance & Economics:  R is heavily used for modeling financial systems and analyzing market data.  Applications:  Time series analysis and forecasting.  Portfolio optimization and risk assessment.  Stock price prediction and algorithmic trading.  Econometric modeling and hypothesis testing.  Popular Packages: quantmod, TTR, xts
  • 10.
    Basics of RProgramming Applications of R –  R is used in a variety of fields, including:  Bioinformatics and Computational Biology:  R is a standard tool in genomics and life sciences for handling complex biological data.  Applications:  Gene expression analysis.  Sequence alignment and annotation.  Protein structure and function analysis.  DNA microarray data analysis.  Popular Packages: Bioconductor, edgeR, limma
  • 11.
    Basics of RProgramming Applications of R –  R is used in a variety of fields, including:  Healthcare and Epidemiology:  R is extensively used for medical statistics and public health modeling.  Applications:  Clinical trial data analysis.  Survival analysis.  Disease modeling.  Risk factor identification and patient data mining.  Popular Packages: survival, epitools, epiR, ggsurvplot
  • 12.
    Basics of RProgramming Applications of R –  R is used in a variety of fields, including:  Marketing and Customer Analytics:  R is widely used for analyzing customer behavior and marketing performance.  Applications:  Customer segmentation (clustering).  Predicting churn and customer lifetime value.  A/B testing for campaign analysis.  Sentiment analysis and opinion mining from reviews or social media.  Popular Packages: caret, tm, text2vec, cluster
  • 13.
    Basics of RProgramming Applications of R –  R is used in a variety of fields, including:  Academia and Education:  R is a preferred tool in academic institutions for teaching and research.  Applications:  Teaching statistics and programming.  Data collection and research analysis.  Academic paper visualizations and simulations.  Publishing reproducible research.  Popular Packages: RMarkdown
  • 14.
    Basics of RProgramming Applications of R –  R is used in a variety of fields, including:  Manufacturing and Industrial Engineering:  R supports quality control and optimization tasks in manufacturing environments.  Applications:  Process control and Six Sigma analysis.  Root cause analysis.  Predictive maintenance and downtime analysis.  Inventory and supply chain optimization.  Popular Packages: qcc, SixSigma, forecast
  • 15.
    Basics of RProgramming Applications of R –  R is used in a variety of fields, including:  Logistics and Supply Chain:  R helps in optimizing logistics operations and forecasting demands.  Applications:  Demand forecasting using time series.  Route and network optimization.  Warehouse operations analytics.  Vendor and inventory analysis.  Popular Packages: forecast, tidyverse, lpsolve
  • 16.
    Basics of RProgramming Applications of R –  R is used in a variety of fields, including:  Environmental Science and Climate Research:  R is used in modeling environmental and geographical data.  Applications:  Climate pattern forecasting.  Pollution level analysis.  Remote sensing and spatial data analysis.  Popular Packages: raster, sp, rgdal, leaflet
  • 17.
    Basics of RProgramming Applications of R –  R is used in a variety of fields, including:  Media and Entertainment:  R is used for analyzing consumer preferences and digital media content.  Applications:  Viewer behavior prediction and segmentation.  Recommendation engines.  Sentiment analysis of reviews.  Social media trend analytics.  Popular Packages: tm, sentiment, textdata
  • 18.
    Basics of RProgramming Introduction to R Studio –  R Studio is an integrated development environment (IDE) for R.  IDE is a GUI, where we can write our quotes, see the results and also see the variables that are generated during the course of programming.  R Studio is available as both Open source and Commercial software.  R Studio is also available as both Desktop and Server versions.  R Studio is also available for various platforms such as Windows, Linux, and macOS.  R Studio is an open-source tool that provides IDE to use R language, and enterprise-ready professional software for data science teams to develop & share the work with their team.
  • 19.
    Basics of RProgramming Why use R Studio? –  R Studio is preferred for several reasons:  Integrated Development Environment (IDE): It provides a user-friendly interface designed specifically for R programming, offering features like syntax highlighting, code completion, and debugging tools that enhance productivity and code quality.  Project Management: R Studio facilitates efficient project organization with its project-based workflow, allowing users to manage multiple scripts, data files, and plots within a cohesive workspace, which is essential for organizing projects after the R language installation.  Data Visualization: It supports powerful data visualization capabilities through integrated plotting tools and compatibility with popular visualization libraries like ggplot2. It enables users to create insightful graphs and charts effortlessly, enhancing data representation after the R language installation.
  • 20.
    Basics of R Programming Why use R Studio? –  R Studio is preferred for several reasons:  Package Management: R Studio simplifies package management with tools like the Package Manager and integrated CRAN repository access, making it easy to install, update, and manage the R packages needed for various analytical tasks.  Markdown and R Markdown Support: It integrates with Markdown and R Markdown, enabling users to create dynamic reports, presentations, and documents that combine code, visualizations, and narrative text in a single file.  Collaboration and Sharing: R Studio supports version control systems like Git and enables sharing of projects and analyses through R Studio Server and R Studio Cloud.
  • 21.
    Basics of RProgramming How to Download R and R Studio? –  To Install R and R Studio on Windows we will have to download R and R Studio with the following steps.  Step 1: First, we need to set up an R environment in our local machine. We can download the same from https://posit.co/download/rstudio-desktop/
  • 22.
    Basics of R Programming How to Download R and R Studio? –  Step 2: We have to download both applications: first R base, then R Studio. After clicking on Install R, we get a new page like this. Here we can select Linux, macOS or Windows according to the user's system.
  • 23.
    Basics of RProgramming How to Download R and R Studio? –  Step 3: After selecting OS, we get below page where we have to select R base.
  • 24.
    Basics of R Programming How to Download R and R Studio? –  Step 4: Now click on the link shown in the image above, and R base will start downloading.
  • 25.
    Basics of RProgramming How to Download R and R Studio? –  Step 5: Again go to main page and download R Studio Desktop for Windows.
  • 26.
    Basics of R Programming Steps to Install R and R Studio –  Step 1: After downloading R for the Windows platform, install it by double-clicking the installer.  Step 2: Download R Studio from its official page. Note: It is free of cost (under AGPL licensing).  Step 3: After downloading, we will get a file named "RStudio-x.x.xxxx.exe" in our Downloads folder.  Step 4: Double-click the installer, and install the software.  Step 5: Test the R Studio installation.  Search for R Studio in the Windows search bar on the Taskbar. Start the application and enter the following code in the console: print('Hello World')  Step 6: If the console prints the message, the installation is successful.
  • 27.
    Basics of RProgramming R Studio interface overview –  After the installation process is over, the R Studio interface looks like:
  • 28.
    Basics of RProgramming R Studio interface overview –  The console panel(left panel) is the place where R is waiting for us to tell it what to do, and see the results that are generated when we type in the commands.  To the top right, we have the Environment / History panel. It contains 2 tabs:  Environment tab: It shows the variables that are generated during the course of programming in a workspace that is temporary.  History tab: In this tab, we will see all the commands that are used till now from the start of usage of R Studio.  To the right bottom, we have another panel, which contains multiple tabs, such as files, plots, packages, help, and viewer.  The Files tab shows the files and directories that are available within the default workspace of R.  The Plots tab shows the plots that are generated during the course of programming.
  • 29.
    Basics of R Programming R Studio interface overview –  To the right bottom, we have another panel, which contains multiple tabs, such as files, plots, packages, help, and viewer.  The Packages tab shows the packages that are already installed and also gives a user interface to install new packages.  The Help tab is the most important one, where we can get help from the R documentation on the functions that are built into R.  The last tab is the Viewer tab, which can be used to see local web content that is generated using R.
  • 30.
    Basics of RProgramming Creation and Execution of R File in R Studio –  Creating an R file –  There are two ways to create an R file in R studio:  Click on the File tab, it will give a drop-down menu, where we can select the new file and then R script, so that, we will get a new file open.
  • 31.
    Basics of RProgramming Creation and Execution of R File in R Studio –  Creating an R file –  There are two ways to create an R file in R studio:  Use the plus button, which is just below the file tab and choose R script, from there, to open a new R script file.
  • 32.
    Basics of RProgramming Creation and Execution of R File in R Studio –  Creating an R file –  Once we open an R script file, this is how an R Studio with the script file open looks like.
  • 33.
    Basics of R Programming Creation and Execution of R File in R Studio –  Creating an R file –  The three panels (console, environment/history and files/plots) are still there.  On the top left we have a new window, which has been opened as a script file. Now we are ready to write a script or program in R Studio.  Writing Scripts in an R file –  Writing scripts to an R file is demonstrated here with an example:  In the first line of the code, a variable 'x' is assigned the value 11, and in the second line 'y' is defined as 'x' times 10.  Here, the code evaluates x times 10 and assigns the result to y.
  • 34.
    Basics of R Programming Creation and Execution of R File in R Studio –  Writing Scripts in an R file –  The third statement, print(c(x, y)), concatenates x and y with c() and prints the result.
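The three-line script described above can be sketched as follows. This is a reconstruction from the slide text, assuming the concatenation step uses R's c() function; the original screenshot is not available.

```r
x = 11            # first line: assign the value 11 to x
y = x * 10        # second line: y is x times 10
print(c(x, y))    # third line: concatenate x and y, then print both values
```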
  • 35.
    Basics of RProgramming Creation and Execution of R File in R Studio –  Saving an R File –  After writing a script file, there is a need to save this file before execution.  To save the R file, from the file menu click either save or save as button.  When we click the save button, it will automatically save the file with untitled x. The x can be 1 or 2 depending on how many R scripts already opened.  It is a nice idea to use the Save as button, so that, we can rename the script file according to our convenience.  When we click Save as button, it will pop out a window, where we can rename the script file as demo.R.  Once we rename, then by clicking the save button we can save the script file.
  • 36.
    Basics of RProgramming Creation and Execution of R File in R Studio –  Saving an R File –  The window will look like below –
  • 37.
    Basics of RProgramming Creation and Execution of R File in R Studio –  Execution of an R File –  There are several ways in which the execution of the commands that are available in the R file is done.
  • 38.
    Basics of R Programming Creation and Execution of R File in R Studio –  Execution of an R File –  Using the run command: The "run" command can be executed from the GUI by pressing the Run button, or with the shortcut key Ctrl + Enter. It executes the line on which the cursor is placed, or the selected lines.  Using the source command: The "source" command can be executed from the GUI by pressing the Source button, or with the shortcut key Ctrl + Shift + S. It executes the whole R file and prints only the output that we explicitly print.  Using the source with echo command: The "source with echo" command can be executed from the GUI by choosing Source with Echo, or with the shortcut key Ctrl + Shift + Enter. It executes the whole file and prints the commands along with the output.
  • 39.
    Basics of RProgramming Creation and Execution of R File in R Studio –  Execution of an R File –  The output will be printed in the console. The values of x, y and z are also shown in the environment panel.
  • 40.
    Basics of RProgramming Creation and Execution of R File in R Studio –  Execution of an R File –  Run command over Source command:  Run can be used to execute the selected lines of R code.  Source and Source with echo can be used to run the whole file.  The advantage of using Run is, we can troubleshoot or debug the program when something is not behaving according to our expectations.  The disadvantage of using run command is, it populates the console and makes it messy unnecessarily.
  • 41.
    Basics of R Programming Clear the Console and the Environment in R Studio –  Clearing the Console –  We can clear the console in R and R Studio. When we run code using "source" or "source with echo", the console can become messy and need clearing.  The console can be cleared using the shortcut key "Ctrl + L".  Note: Clearing the console does not delete the variables in the workspace. The Environment tab still shows the variables created earlier, even after the console has been cleared.
  • 42.
    Basics of RProgramming Clear the Console and the Environment in R Studio –  Clearing the Environment –  Variables on the R environment can be cleared in two ways:  Using rm() command:  When we want to clear a single variable from R environment we can use the "rm()" command followed by the variable we want to remove.  rm (variable) e.g. rm (x)  If we want to delete all the variables that are there in the environment then we can use the "rm" with an argument "list" is equal to "ls" followed by a parenthesis.  rm (list = ls())
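A minimal sketch of both forms of rm() follows. It runs inside local(), which is an addition made here so that trying the example does not wipe the reader's own workspace; typed directly at the console, the same rm() calls act on the global environment.

```r
remaining <- local({
  x <- 11
  y <- x * 10
  rm(x)              # remove a single variable
  print(ls())        # only "y" is left
  rm(list = ls())    # remove every variable that is left
  ls()               # nothing remains
})
print(remaining)
```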
  • 43.
    Basics of R Programming Clear the Console and the Environment in R Studio –  Clearing the Environment –  Using the GUI:  We can also clear all the variables in the environment using the GUI in the environment pane.  When we press the brush button, it will pop up a window asking whether we want to remove all the objects from the environment.  If we say yes, it will clear all the variables, and we can see that the environment is now empty.
  • 44.
    Basics of RProgramming Clear the Console and the Environment in R Studio –  Clearing the Environment –  Using the GUI:
  • 45.
    Basics of RProgramming Clear the Console and the Environment in R Studio –  Clearing the Environment –  Using the GUI:
  • 46.
    Basics of R Programming Basic Syntax in R Programming –  A program in R is made up of three things: Variables, Comments & Keywords.  Variables are used to store the data.  Comments are used to improve code readability.  Keywords are reserved words that hold a specific meaning to the interpreter.  Variables in R –  A variable is a name given to a reserved memory location that can store any type of data.  In R, the assignment can be denoted in three ways:  = (Simple Assignment)  <- (Leftward Assignment)  -> (Rightward Assignment)
  • 47.
    Basics of RProgramming Basic Syntax in R Programming –  Variables in R –  The rightward assignment is less common & can be confusing, so it is generally recommended to use <- or = operator for assigning values in R.
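The three assignment forms can be compared side by side in a short sketch; the variable names are chosen here only for illustration.

```r
a = 10        # simple assignment
b <- 20       # leftward assignment
30 -> d       # rightward assignment: the value is on the left, the name on the right
print(c(a, b, d))
```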
  • 48.
    Basics of RProgramming Basic Syntax in R Programming –  Comments in R –  Comments are a way to improve your code's readability and are only meant for the user so the interpreter ignores it.  Only single-line comments are available in R but we can also use multiline comments by using a simple trick which is shown below.  Single line comments can be written by using # at the beginning of the statement.  From the below output, we can see that both comments were ignored by the interpreter.
  • 49.
    Basics of RProgramming Basic Syntax in R Programming –  Comments in R –
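A short sketch of commenting in R follows. The slide's original example is not available, so the if (FALSE) block is only an assumption about the "simple trick" it refers to; it is a commonly used workaround, not a real comment syntax.

```r
# This is a single-line comment
x <- 5  # a comment can also follow code on the same line

# A multi-line comment is usually written as
# several single-line comments,
# one '#' per line.

if (FALSE) {
  "Workaround: text or code wrapped in if (FALSE) is parsed
   but never executed, so it behaves like a block comment."
}

print(x)
```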
  • 50.
    Basics of RProgramming Basic Syntax in R Programming –  Keywords in R –  Keywords are the words reserved by a program because they have a special meaning thus a keyword can't be used as a variable name, function name, etc.  We can view these keywords by using either help(reserved) or ?reserved.
  • 51.
    Basics of R Programming Basic Syntax in R Programming –  Keywords in R –  if, else, repeat, while, function, for, in, next and break are used for control-flow statements and for declaring user-defined functions.  The remaining keywords are constants: TRUE and FALSE are the boolean constants.  NaN represents a "Not a Number" value and NULL represents an undefined value.  Inf is used for infinity values.
  • 52.
    Basics of RProgramming R Commands –  The following R commands provide an overview of different application areas in R programming.  Depending on specific needs and projects, we can pick and match the commands that suits.  Reading and Writing Commands –  Reading and writing data are fundamental tasks in data analysis and manipulation.  In R, several functions & packages can help to handle different types of data sources.  read.csv() - Read data from a CSV file.  write.csv() - Write data to a CSV file.
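As a minimal sketch, the round trip below writes a small data frame to a CSV file and reads it back. The data and column names are made up for illustration, and tempfile() is used so that no file is left in the working directory.

```r
# Hypothetical data used only for this example
df <- data.frame(id = 1:3, score = c(90.5, 78, 88))

path <- tempfile(fileext = ".csv")       # temporary file path
write.csv(df, path, row.names = FALSE)   # write without the row-name column

df2 <- read.csv(path)                    # read the file back into a data frame
print(df2)
```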
  • 53.
    Basics of R Programming R Commands –  Dataframe Operations Commands -  Dataframe operations in R are essential for data manipulation and analysis.  Here are some common operations we might perform on data frames using base R; the dplyr package, which is part of the tidyverse collection, offers alternatives.  data.frame() - Create a data frame.  subset() - Filter data based on specific conditions.  merge() - Merge data from different data frames.  aggregate() - Aggregate data based on specific criteria.  transform() - Create new variables in a data frame.  sort() - Sort vectors (to sort the rows of a data frame, use order()).  unique() - Identify unique values in a vector or column.
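The base R functions listed above might be combined as in the following sketch. The two data frames and all column names are hypothetical.

```r
# Hypothetical example data
students <- data.frame(
  name  = c("Asha", "Ravi", "Meena", "John"),
  dept  = c("CS", "IT", "CS", "IT"),
  marks = c(82, 67, 91, 74)
)
depts <- data.frame(dept = c("CS", "IT"), building = c("A", "B"))

high   <- subset(students, marks > 70)                # keep rows with marks above 70
joined <- merge(students, depts, by = "dept")         # join on the shared dept column
avg    <- aggregate(marks ~ dept, data = students, FUN = mean)  # mean marks per dept
students <- transform(students, percent = marks / 100)          # add a new column
by_marks <- students[order(students$marks), ]         # sort rows by marks (ascending)
dept_names <- unique(students$dept)                   # distinct department names

print(avg)
```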
  • 54.
    Basics of RProgramming R Commands –  Applying Functions Commands -  Applying functions to data frames is a powerful technique in R for data transformation and analysis.  Here are various ways to apply functions to data frames.  apply() - Apply a function to rows or columns of matrices or data frames.  lapply(), sapply(), mapply() - Apply functions to lists or vectors.
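The apply family can be illustrated with a small matrix and a list; the objects below are made up for the example.

```r
m <- matrix(1:6, nrow = 2)       # 2 x 3 matrix; rows are (1, 3, 5) and (2, 4, 6)
row_sums <- apply(m, 1, sum)     # margin 1 = rows:    9 12
col_sums <- apply(m, 2, sum)     # margin 2 = columns: 3 7 11

lst <- list(a = 1:3, b = 4:6)
l_res <- lapply(lst, mean)       # always returns a list
s_res <- sapply(lst, mean)       # simplifies to a named vector where possible
m_res <- mapply(function(x, y) x + y, 1:3, 4:6)   # element-wise over two vectors: 5 7 9

print(s_res)
```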
  • 55.
    Basics of RProgramming R Commands –  Using dplyr for Data Manipulation -  The dplyr is a powerful package in R designed to make data manipulation easy and intuitive.  It provides a set of verbs that allow you to solve the most common data manipulation challenges.  dplyr::filter() - Filter data in data frames.  dplyr::mutate() - Create new variables in data frames.  dplyr::select() - Select specific columns from a data frame.  dplyr::summarize() - Summarize data by applying functions.
  • 56.
    Basics of R Programming R Commands –  Data Visualizations Commands -  Data visualization is a critical part of data analysis, and R offers powerful libraries like ggplot2 for creating various types of visualizations.  Base R Plotting Functions –  In R, base plotting functions provide a straightforward way to create a wide range of plots.  plot() - Create scatter plots and other basic plot types.  hist() - Create histograms.  barplot() – Create bar charts.  boxplot() – Create box plots.  ggplot2::ggplot() - Create sophisticated and customizable visualizations (this function comes from the ggplot2 package, not from base R).
  • 57.
    Basics of R Programming R Commands –  Data Visualizations Commands -  Specialized Plots –  Specialized plots cater to specific data visualization needs, offering more advanced and tailored representations.  qqnorm(), qqline() - Create quantile-quantile plots.  acf() - Compute and display autocorrelation functions.  density() - Compute kernel density estimates, which can then be drawn with plot().  heatmap() - Create heat maps.
  • 58.
    Basics of RProgramming R Commands –  Statistical Analysis Commands –  Statistical analysis in R involves a wide range of techniques & commands.  Descriptive Statistics –  To compute descriptive statistics such as mean, median, standard deviation, and quartiles, we can use the summary and quantile functions.  summary() - Get a summary of data, including statistical metrics.  mean(), median(), sd() - Calculate mean, median, and standard deviation.
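A brief sketch of the descriptive statistics functions, using a made-up numeric vector:

```r
v <- c(12, 15, 11, 18, 14)   # hypothetical sample

mean(v)                      # 14
median(v)                    # 14
sd(v)                        # sample standard deviation, sqrt(7.5)
quantile(v, c(0.25, 0.75))   # first and third quartiles
summary(v)                   # minimum, quartiles, median, mean and maximum
```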
  • 59.
    Basics of R Programming R Commands –  Statistical Analysis Commands –  Hypothesis Testing –  Hypothesis testing is a fundamental concept in statistics used to make inferences about a population based on sample data.  t.test() - Perform t-tests for hypothesis testing.  anova() - Compute analysis of variance (ANOVA) tables for fitted models.  chisq.test() - Perform chi-square tests.
  • 60.
    Basics of RProgramming R Commands –  Statistical Analysis Commands –  Regression and Correlation –  Regression and correlation are statistical techniques used to analyze the relationship between variables.  lm() - Perform linear regressions.  cor() - Calculate correlation coefficients between variables.
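As a minimal sketch, lm() and cor() can be tried on a small made-up data set in which y is an exact linear function of x, so the expected coefficients are easy to check by hand.

```r
x <- c(1, 2, 3, 4, 5)
y <- 2 * x + 1          # hypothetical data with an exact linear relation

fit <- lm(y ~ x)        # fit the model y = intercept + slope * x
coef(fit)               # intercept close to 1, slope close to 2
r <- cor(x, y)          # correlation coefficient, 1 for a perfect positive relation
print(r)
```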
  • 61.
    Basics of RProgramming R Commands –  Data Import and Export Commands –  Data import and export are essential tasks in R for working with external data sources.  R Data Objects –  In R, there are several data objects commonly used for storing and working with data.  readRDS(), saveRDS() - Read and save R data objects.  Reading and Writing Various Formats –  Reading and writing data in various formats is a common task in R.  read.table() - Read data from a file.  write.table() - Write data to a file.
  • 62.
    Basics of RProgramming R Commands –  Data Import and Export Commands –  Excel Files –  Working with Excel files in R is a common task, and there are multiple packages available for reading and writing Excel files.  readxl::read_excel() - Read data from Excel files.  writexl::write_xlsx() - Write data to Excel files.
  • 63.
    Basics of R Programming R Commands –  Control Structures and Conditionals –  Control structures and conditionals in R allow you to control the flow of execution in your code based on certain conditions.  Conditional Statements –  Conditional statements in R allow us to execute specific code blocks based on whether certain conditions are true or false.  ifelse() - Evaluate a condition element-wise over a vector and return one of two values for each element.  Loops –  Loops in R allow us to execute a block of code repeatedly.  for - Loop over a sequence.  while - Loop as long as a condition remains true.  repeat - Execute a loop indefinitely until it is stopped with break.
  • 64.
    Basics of R Programming Variables and scope of variables –  In R, variables are the containers for storing data values.  They are references, or pointers, to objects in memory, i.e. whenever a variable is assigned to an instance, it gets mapped to that instance.  A variable in R can store a vector, a group of vectors or a combination of many R objects.  Naming convention for Variables –  A variable name in R may contain alphanumeric characters plus two special characters: underscore ('_') and dot ('.').  The variable name must start with a letter, or with a dot that is not followed by a digit; it cannot start with a digit or an underscore.  Other special characters like '!', '@', '#', '$' are not allowed in variable names.
  • 65.
    Basics of RProgramming Variables and scope of variables –  Scope of a Variable –  The location where we can find a variable and also access it if required is called the scope of a variable.  There are mainly two types of variable scopes:  Global Variables: Global variables are those variables that exist throughout the execution of a program. It can be changed and accessed from any part of the program.  Local Variables: Local variables are those variables that exist only within a certain part of a program like a function and are released when the function call ends.
  • 66.
    Basics of RProgramming Variables and scope of variables–  Scope of a Variable –  Global Variable:  As the name suggests, Global Variables can be accessed from any part of the program.  They are available throughout the lifetime of a program.  They are declared anywhere in the program outside all of the functions or blocks.  Declaring global variables: Global variables are usually declared outside of all of the functions and blocks. They can be accessed from any portion of the program.
  • 67.
    Basics of RProgramming Variables and scope of variables–  Scope of a Variable –  Global Variable:  In above code, the variable 'global' is declared at the top of program outside all of the functions so it is a global variable and can be accessed or updated from anywhere in the program.
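The slide's original code is not available; the following is a sketch of what such a program might look like. The variable name global comes from the text above, while the function name display is an assumption.

```r
global <- 5               # declared outside all functions: a global variable

display <- function() {
  print(global)           # a global variable can be read inside a function
}

display()                 # prints 5
global <- 10              # update the global variable at top level
display()                 # prints 10
```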
  • 68.
    Basics of RProgramming Variables and scope of variables–  Scope of a Variable –  Local Variable:  Variables defined within a function or block are said to be local to those functions.  Local variables do not exist outside the block in which they are declared, i.e. they can not be accessed or used outside that block.  Declaring local variables: Local variables are declared inside a block.
  • 69.
    Basics of RProgramming Variables and scope of variables–  Scope of a Variable –  Local Variable:  The above program displays an error saying "object 'age' not found".  The variable age was declared within the function "func()" so it is local to that function & not visible to the portion of the program outside this function.  To correct the above error we have to display the value of variable age from the function "func()" only.
  • 70.
    Basics of RProgramming Variables and scope of variables–  Scope of a Variable –  Local Variable:
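Since the original program is not shown, here is a hedged reconstruction using the names func() and age from the text. tryCatch() is added only so that the script keeps running after the expected error.

```r
func <- function() {
  age <- 18               # local to func()
  print(age)              # works: age is visible inside the function
}

func()                    # prints 18

# Accessing age outside the function fails, because it no longer exists there
err <- tryCatch(print(age), error = function(e) conditionMessage(e))
print(err)                # the error message mentions that 'age' was not found
```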
  • 71.
    Basics of R Programming Data Types in R –  Data types in R define the kind of values that variables can hold.  Choosing the right data type helps to optimize memory usage & computation.  R does not require explicit data type declarations, and variables can change their type dynamically during execution.  R has the following basic data types:  Numeric Data type in R –  Decimal values are called numeric in R. It is the default data type for numbers in R.  If we assign a decimal value to a variable x then it will be of numeric type.  Real numbers with a decimal point are represented using this data type in R.  It uses the double-precision floating-point format to represent numerical values.
  • 72.
    Basics of RProgramming Data Types in R –  Numeric Data type in R –  Even if an integer is assigned to a variable, it is still saved as a numeric value.
  • 73.
    Basics of R Programming Data Types in R –  Numeric Data type in R –  When R stores a number in a variable, it stores it by default as a "double", i.e. a double-precision floating-point value.  It means that a value such as 5 is stored with a type of double and a class of numeric, even though it is printed as 5.  It is not an integer, which can be confirmed with the is.integer() function.
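The behaviour described above can be checked directly with class(), typeof() and is.integer():

```r
x <- 5            # no decimal point, but still stored as a double
class(x)          # "numeric"
typeof(x)         # "double"
is.integer(x)     # FALSE

y <- 5.6
class(y)          # "numeric"
```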
  • 74.
    Basics of RProgramming Data Types in R –  Integer Data type in R –  R supports integer data types which are the set of all integers.  We can create as well as convert a value into an integer type using the as.integer() function.  We can also use the capital 'L' notation as a suffix to denote that a particular value is of the integer R data type.
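Both ways of creating an integer can be sketched as follows; note that as.integer() truncates a decimal value toward zero rather than rounding it.

```r
a <- as.integer(5)     # convert a numeric value to integer
b <- 5L                # the L suffix creates an integer directly

class(a)               # "integer"
is.integer(b)          # TRUE
as.integer(3.9)        # 3, the fractional part is dropped
```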
  • 75.
    Basics of R Programming Data Types in R –  Logical Data type in R –  R has a logical data type that takes a value of either TRUE or FALSE.  A logical value is often created via a comparison between variables.  Boolean values, which have two possible values, are represented by this R data type: FALSE or TRUE.
  • 76.
    Basics of RProgramming Data Types in R –  Complex Data type in R –  R supports complex data types that are set of all the complex numbers.  The complex data type is to store numbers with an imaginary component.
  • 77.
    Basics of RProgramming Data Types in R –  Character Data type in R –  R supports character data types where we have all the alphabets and special characters.  It stores character values or strings. Strings in R can contain alphabets, numbers, and symbols.  The easiest way to denote that a value is of character type in R data type is to wrap the value inside single or double inverted commas.
  • 78.
    Basics of RProgramming Data Types in R –  Raw Data type in R –  To save and work with data at the byte level in R, use the raw data type.  By displaying a series of unprocessed bytes, it enables low-level operations on binary data.  Five elements make up this raw vector x, each of which represents a raw byte value.
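The slide's example is not shown; a raw vector x with five elements might be created as in this sketch, where the byte values are chosen arbitrarily.

```r
x <- as.raw(c(1, 2, 3, 4, 5))   # five raw bytes
print(x)                        # bytes are printed in hexadecimal: 01 02 03 04 05
class(x)                        # "raw"
length(x)                       # 5

charToRaw("R")                  # byte value of the character "R": 52 (hexadecimal)
```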
  • 79.
    Basics of RProgramming Data Types in R –  Find Data Type of an Object in R –  To find the data type of an object we have to use class() function.  The syntax for doing that is we need to pass the object as an argument to the function class() to find the data type of an object.  Syntax - class (object)
  • 80.
    Basics of R Programming Data Types in R –  Type Verification in R –  We can verify the data type of an object if we are in doubt about its data type.  To do that, we use the prefix "is." before the data type as a command.  Syntax – is.data_type(object)
  • 81.
    Basics of R Programming Data Types in R –  Convert the Data Type of an Object to Another –  The process of altering the data type of an object to another type is referred to as data type conversion.  This is a common operation in many programming languages that is used to alter data and perform various computations.  The conversion is performed explicitly by the programmer.  Syntax – as.data_type(object)  Not all conversions are possible; an impossible conversion returns an "NA" value, usually with a warning.
  • 82.
    Basics of RProgramming Data Types in R –  Convert the Data Type of an Object to Another –
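The class(), is.data_type() and as.data_type() patterns described above can be sketched together. suppressWarnings() is used here only to silence the warning that accompanies the failed conversion.

```r
v <- "42"
class(v)                  # "character"
is.character(v)           # TRUE
is.numeric(v)             # FALSE

n <- as.numeric(v)        # a character string that looks like a number converts cleanly
class(n)                  # "numeric"

bad <- suppressWarnings(as.numeric("abc"))   # not convertible
is.na(bad)                # TRUE: the result is NA
```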
  • 83.
    Basics of R Programming Operators in R –  Operators are the symbols directing the interpreter to perform various kinds of operations between the operands.  Operators simulate various mathematical, logical, & decision operations performed on complex numbers, integers, & numerics as input operands.  R mainly supports four kinds of binary operators between a set of operands.  Arithmetic Operators –  They evaluate the specified operator between operands, which may be scalar values, complex numbers, or vectors.  They are performed element-wise at the corresponding positions of the vectors.
  • 84.
    Basics of RProgramming Operators in R –  Arithmetic Operators –  Addition Operator (+) –  Subtraction Operator (-) –  Multiplication Operator (*) –  Division Operator (/) -
  • 85.
    Basics of RProgramming Operators in R –  Arithmetic Operators –  Power Operator (^) – The first operand is raised to the power of the second operand.  Modulo Operator (%%) - It returns the remainder after dividing the first operand by the second operand.
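The arithmetic operators act element-wise on vectors, as this sketch with two made-up vectors shows:

```r
a <- c(2, 5, 9)
b <- c(4, 2, 3)

a + b     #  6    7   12
a - b     # -2    3    6
a * b     #  8   10   27
a / b     #  0.5  2.5  3
a ^ b     # 16   25  729
a %% b    #  2    1    0
```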
  • 86.
    Basics of RProgramming Operators in R –  Logical Operators –  It simulates element-wise decision operations based on the specified operator between the operands, which are then evaluated to either a True or False boolean value.  Any non-zero integer value is considered as a TRUE value, be it a complex or real number.  Element-wise Logical AND operator (&) - Returns True if both the operands are True.  Element-wise Logical OR operator (|) - Returns True if either of the operand is True.
  • 87.
    Basics of R Programming Operators in R –  Logical Operators –  NOT operator (!) - A unary operator that negates the status of the elements of the operand.  Logical AND operator (&&) - Returns TRUE if both operands are TRUE. It works on single (length-one) values and gives a single output; since R 4.3.0, operands longer than one element raise an error.  Logical OR operator (||) - Returns TRUE if either operand is TRUE, with the same single-value rule. (Single output)
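A short sketch contrasts the element-wise operators with the single-value ones. The operands of && and || are indexed down to one element, which works in every R version.

```r
p <- c(TRUE, FALSE, TRUE)
q <- c(TRUE, TRUE, FALSE)

p & q            # element-wise AND: TRUE FALSE FALSE
p | q            # element-wise OR:  TRUE TRUE  TRUE
!p               # negation:         FALSE TRUE FALSE

p[1] && q[1]     # single-value AND: TRUE
p[2] || q[3]     # single-value OR:  FALSE
```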
  • 88.
    Basics of RProgramming Operators in R –  Relational Operators –  The Relational Operators in R carry out comparison operations between the corresponding elements of the operands.  Returns a boolean TRUE value if the first operand satisfies the relation compared to the second.  A TRUE value is always considered to be greater than the FALSE.  Less than (<) –  Less than equal to (<=) –
  • 89.
    Basics of RProgramming Operators in R –  Relational Operators –  Greater than (>) –  Greater than equal to (>=) –  Not equal to (!=) -
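The relational operators compare corresponding elements and return logical vectors. The sketch below also includes the equality operator ==, which is not listed on the slides but belongs to the same family.

```r
a <- c(1, 5, 3)
b <- c(2, 5, 1)

a < b      # TRUE  FALSE FALSE
a <= b     # TRUE  TRUE  FALSE
a > b      # FALSE FALSE TRUE
a >= b     # FALSE TRUE  TRUE
a != b     # TRUE  FALSE TRUE
a == b     # FALSE TRUE  FALSE

TRUE > FALSE   # TRUE, because TRUE counts as 1 and FALSE as 0
```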
  • 90.
    Basics of RProgramming Operators in R –  Assignment Operators –  They are used to assigning values to various data objects in R. The objects may be integers, vectors, or functions.  These values are then stored by the assigned variable names.  There are two kinds of assignment operators: Left and Right  Left Assignment (<- or <<- or =) - Assigns a value to a vector.  Right Assignment (-> or ->> ) - Assigns a value to a vector.
  • 91.
    Basics of R Programming Operators in R –  Miscellaneous Operators –  These are special-purpose operators in R, used for tasks such as membership tests and matrix multiplication.  %in% Operator - Checks if an element belongs to a vector or list and returns a boolean value TRUE if the value is present, else FALSE.  %*% Operator –  This operator performs matrix multiplication; here it is used to multiply a matrix by its transpose.  The transpose of a matrix is obtained by interchanging its rows and columns.
  • 92.
    Basics of RProgramming Operators in R –  Miscellaneous Operators –  %*% Operator –  The number of columns of the first matrix must be equal to the number of rows of the second matrix.  Multiplication of the matrix A with its transpose, B, produces a square matrix.
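Both operators can be tried in a few lines. The matrix A below is made up for the example, and t() is the base R function that computes the transpose.

```r
3 %in% c(1, 2, 3)            # TRUE: 3 is present in the vector
7 %in% c(1, 2, 3)            # FALSE

A <- matrix(1:6, nrow = 2)   # 2 x 3 matrix; rows are (1, 3, 5) and (2, 4, 6)
B <- t(A)                    # transpose of A, a 3 x 2 matrix
P <- A %*% B                 # (2 x 3) times (3 x 2) gives a 2 x 2 square matrix
print(P)
```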
  • 93.
    Basics of R Programming Keywords in R –  R keywords are reserved words that have special meaning in the language.  They help to control program flow, define functions, and represent special values.  We can check which words are keywords by using help(reserved) or ?reserved.  if - used for decision-making to execute code only if a condition is true.  else - executes code when the if condition is false.  while - loop which runs a block repeatedly while a condition remains true.  repeat - runs a block indefinitely until explicitly stopped with break.  for - loop iterates over a sequence, running code for each element.  function - defines reusable blocks of code.
  • 94.
    Basics of RProgramming Keywords in R –  next - skips the current iteration in a loop and continues with the next.  break - stops the execution of a loop immediately.  TRUE/FALSE – Boolean constants representing logical true and false values.  NULL - Represents an empty or undefined object.  Inf and NaN - Inf and -Inf represent positive and negative infinity. NaN means “Not a Number” and occurs in undefined numerical operations.  NA - Represents missing or unavailable data.
  • 95.
    Basics of RProgramming Control Structures in R Programming –  Control statements are expressions used to control the execution and flow of the program based on the conditions provided in the statements.  These structures are used to make a decision after assessing the variable.  In R programming, there are 8 types of control statements as follows:  if condition  if-else condition  for loop  nested loops  while loop  repeat and break statements  return statement  next statement
  • 96.
    Basics of RProgramming Control Structures in R Programming –  if Condition –  This control structure checks the expression provided in parenthesis is true or not.  If true, the execution of the statements in braces {} continues.  Syntax – if (expression) { statements …. }  Example -
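The slide's example is not available; a minimal sketch of an if statement, with made-up values, could look like this:

```r
x <- 100
msg <- NULL

if (x > 10) {
  msg <- paste(x, "is greater than 10")   # runs only when the condition is TRUE
  print(msg)
}
```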
  • 97.
    Basics of RProgramming Control Structures in R Programming –  if-else Condition –  It is similar to if condition but when the test expression in if condition fails, then statements in else condition are executed.  Syntax – if (expression) { statements } else { statements }  Example -
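As an illustrative sketch (the original example is not shown), an if-else statement might be written as follows. At the top level of a script, else must appear on the same line as the closing brace of the if block.

```r
x <- 5

if (x > 10) {
  result <- "greater than 10"
} else {
  result <- "10 or less"      # runs because the test expression is FALSE
}

print(result)                 # "10 or less"
```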
  • 98.
    Basics of RProgramming Control Structures in R Programming –  for Loop –  It is a type of loop or sequence of statements executed repeatedly until exit condition is reached.  Syntax – for (value in vector) { statements }  Example -
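A small sketch of a for loop that follows the syntax above; the vector and the running total are made up for the example.

```r
total <- 0

for (value in c(2, 4, 6)) {
  print(value)                # runs once for each element of the vector
  total <- total + value
}

print(total)                  # 12
```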
  • 99.
    Basics of R Programming Control Structures in R Programming –  Nested loops –  A nested loop is a loop placed inside another loop.  Nested loops are commonly used to traverse or manipulate a matrix, for example with one loop over the rows and another over the columns.  Example -
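A minimal sketch of nested loops filling a matrix cell by cell; the name `grid` and the rule "row index times column index" are assumptions made for illustration.

```r
grid <- matrix(0, nrow = 2, ncol = 3)

# Outer loop walks the rows, inner loop walks the columns
for (i in seq_len(nrow(grid))) {
  for (j in seq_len(ncol(grid))) {
    grid[i, j] <- i * j
  }
}

print(grid)
```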
  • 100.
    Basics of R Programming Control Structures in R Programming –  while loops –  It is another kind of loop, which repeats as long as a condition remains true.  The test expression is checked before each execution of the loop body, so the body may not run at all.  Syntax – while(expression){ statement ….. }  Example -
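A minimal sketch of a while loop that counts from 1 to 3; the counter name is illustrative.

```r
count <- 1

# The condition is tested before every pass through the body
while (count <= 3) {
  print(count)
  count <- count + 1
}
```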
  • 101.
    Basics of R Programming Control Structures in R Programming –  repeat loop and break statement –  The repeat loop iterates indefinitely because it has no built-in exit condition.  So, a break statement is used to exit from the loop. break can be used in any type of loop to exit from it.  Syntax – repeat { statements ….. if(expression){ break } }
  • 102.
    Basics of R Programming Control Structures in R Programming –  repeat loop and break statement –  Example –
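A minimal sketch of repeat with break; the stopping value 5 is an arbitrary assumption.

```r
i <- 0

repeat {
  i <- i + 1
  # Without this break the loop would never end
  if (i >= 5) {
    break
  }
}

print(i)
```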
  • 103.
    Basics of R Programming Control Structures in R Programming –  return statement –  It is used to return the result of a function and hand control back to the caller.  Syntax – return(expression)  Example -
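A minimal sketch of return() inside a function; `square` is a hypothetical helper.

```r
# Hypothetical helper that returns the square of its input
square <- function(n) {
  return(n^2)
}

result <- square(4)
print(result)
```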
  • 104.
    Basics of R Programming Control Structures in R Programming –  next statement –  It is used to skip the rest of the current iteration and continue with the next iteration, without terminating the loop.  Example -
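A minimal sketch of next, used here to skip even numbers; the names are illustrative.

```r
odds <- c()

for (k in 1:6) {
  # Skip the rest of this iteration when k is even
  if (k %% 2 == 0) {
    next
  }
  odds <- c(odds, k)
}

print(odds)
```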
  • 105.
    Basics of R Programming Functions in R Programming –  A function accepts input arguments and produces output by executing the valid R commands inside it.  Functions are useful when we want to perform a certain task multiple times.  In R programming, the function name and the name of the file in which the function is defined need not be the same, and a single file can contain one or more functions.  Creating a Function in R –  Functions are created in R by using the command function().  The general structure of the function file is as follows: func = function(arguments){ statements }
  • 106.
    Basics of R Programming Functions in R Programming –  Parameters or Arguments in R Functions –  In programming, parameters and arguments refer to the values passed into a function.  They are often used interchangeably, but there is a subtle difference:  Parameters are the variables defined in the function definition.  Arguments are the actual values passed to the function when it is called.  A function can have multiple parameters, and these are separated by commas within the parentheses.
  • 107.
    Basics of R Programming Functions in R Programming –  Function Parameter Rules –  Number of Parameters: A function should be called with the correct number of arguments. Passing too many arguments, or omitting one that has no default and is used in the body, raises an error.  Default Parameter Values: Some functions have default values for parameters. If no argument is passed, these defaults are used.  Return Value: The return() function sends the result back from the function; if return() is omitted, the value of the last evaluated expression is returned.
  • 108.
    Basics of R Programming Functions in R Programming –  Calling a Function in R –  After creating a function, we have to call it to use it.  A function is called by writing its name and passing values for its parameters, if any.  Passing Arguments to Functions in R Programming Language –  There are several ways to pass arguments to a function:  Case 1: Generally in R, the arguments are passed to the function in the same order as in the function definition.  Case 2: If we don't want to follow that order, we can pass the arguments by name, in any order.  Case 3: If an argument is not passed, its default value is used to execute the function.
  • 109.
    Basics of R Programming Functions in R Programming –  Calling a Function in R –  Passing Arguments to Functions in R Programming Language –  Example:
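The three ways of passing arguments (by position, by name, and by default) can be sketched as follows. The function `rect_area` and its default width of 2 are hypothetical, introduced only for illustration.

```r
# Hypothetical function: `width` has a default value, `length` does not
rect_area <- function(length, width = 2) {
  length * width
}

by_position <- rect_area(5, 3)                  # Case 1: same order as the definition
by_name     <- rect_area(width = 4, length = 5) # Case 2: named, in any order
by_default  <- rect_area(5)                     # Case 3: width falls back to 2

print(c(by_position, by_name, by_default))
```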
  • 110.
    Basics of R Programming Functions in R Programming –  Types of Function in R –  Built-in Function: Built-in functions in R are pre-defined functions that are used to perform common tasks or operations.  User-defined Function: The R language allows us to write our own functions.  Built-in Function in R Programming Language –  Built-in functions are functions that already exist in the R language; we just need to call them to use them.  Example - sum(), max(), min(), etc.
  • 111.
    Basics of R Programming Functions in R Programming –  Types of Function in R –  Other Built-in Function in R –  The list of built-in R functions and their uses, by category:
      Mathematical Functions: abs(), sqrt(), round(), exp(), log(), cos(), sin(), tan()
      Statistical Functions: mean(), median(), cor(), var()
      Data Manipulation Functions: unique(), subset(), aggregate(), order()
      File Input/Output Functions: read.csv(), write.csv(), read.table(), write.table()
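A short sketch that calls a few of the built-in functions named above on made-up data.

```r
x <- c(4, 9, 16, 25)   # hypothetical data

roots    <- sqrt(x)                 # mathematical
avg      <- mean(x)                 # statistical
mid      <- median(x)               # statistical
distinct <- unique(c(1, 1, 2, 3, 3))   # data manipulation
ranking  <- order(c(30, 10, 20))       # positions that would sort the vector

print(roots)
print(c(avg, mid))
print(distinct)
print(ranking)
```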
  • 112.
    Basics of R Programming Functions in R Programming –  Types of Function in R –  User-defined Functions in R Programming Language –  User-defined functions are the functions that are created by the user.  The user defines the behavior, parameters, default parameter values, etc. of the function.  They are available only in the script or session in which they are defined, unless they are shared through a sourced file or a package.
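A minimal sketch of a user-defined function; the name `celsius_to_fahrenheit` is hypothetical and chosen only for illustration.

```r
# Hypothetical user-defined function: converts Celsius to Fahrenheit
celsius_to_fahrenheit <- function(celsius) {
  celsius * 9 / 5 + 32   # the last evaluated expression is returned
}

boiling <- celsius_to_fahrenheit(100)
print(boiling)
```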
  • 113.
    Basics of R Programming Packages in R Programming –  Packages in R programming are a set of R functions, compiled code, and sample data.  These are stored under a directory called "library" within the R environment.  By default, R installs a group of packages during installation.  Once we start the R console, only the default packages are available.  Other packages that are already installed need to be loaded explicitly before the R program that is going to use them can do so.  What are Repositories?  A repository is a place where packages are located and stored so we can install R packages from it.  Organizations and developers may host their own local repositories, but most repositories are online and accessible to everyone.
  • 114.
    Basics of R Programming Packages in R Programming –  What are Repositories?  Some of the most popular repositories for R packages are:  CRAN (Comprehensive R Archive Network):  It is the official repository.  It is a network of FTP and web servers maintained by the R community around the world.  The R community coordinates it, and for a package to be published on CRAN, the package needs to pass several tests to ensure that it follows CRAN policies.
  • 115.
    Basics of R Programming Packages in R Programming –  What are Repositories?  Some of the most popular repositories for R packages are:  Bioconductor:  It is a specialized repository, intended for open-source software for bioinformatics.  Similar to CRAN, it has its own submission and review processes, and its community is very active, holding several conferences and meetings per year in order to maintain quality.
  • 116.
    Basics of R Programming Packages in R Programming –  What are Repositories?  Some of the most popular repositories for R packages are:  GitHub:  It is the most popular repository for open-source projects.  Its popularity comes from unlimited space for open-source projects and its integration with git, a version control system.  It makes it easy to share and collaborate with others.
  • 117.
    Basics of R Programming Packages in R Programming –  Get library locations containing R packages -  The .libPaths() function handles the management of library paths.  These are the directories where R searches for installed packages.  Syntax - .libPaths()  Get the list of all the R packages installed –  Calling library() with no arguments lists the packages installed in all libraries. When we load a package using library(packagename), the functions and objects in that package become available on the search path.  Syntax – library()
  • 118.
    Basics of R Programming Packages in R Programming –  Install an R package -  There are multiple ways to install an R package, some of them are:  Installing R Packages From CRAN:  To install an R package from CRAN we need the name of the package and use the following command: install.packages("package name")  Installing a package from CRAN is the most common and easiest way, as we have to use only one command.  To install more than one package at a time, we pass them as a character vector in the first argument of the install.packages() function: install.packages(c("package1", "package2"))
  • 119.
    Basics of R Programming Packages in R Programming –  Install an R package -  Installing BiocManager Packages:  The BiocManager package should be used to install and manage Bioconductor packages.  To install the BiocManager package, use the following command: install.packages("BiocManager")  Once BiocManager is installed, we can use it to install Bioconductor packages.  Example - to install the edgeR package and its dependencies from the Bioconductor repository, use: BiocManager::install("edgeR")
  • 120.
    Basics of R Programming Packages in R Programming –  Update, Remove and Check Installed Packages in R -  To check what packages are installed on our computer, type command: installed.packages()  To update all the packages, type command: update.packages()  To update a specific package, type command: install.packages("PACKAGE NAME")  To remove a package, type command: remove.packages("PACKAGE NAME")
  • 121.
    Basics of R Programming Packages in R Programming –  Installing Packages Using RStudio UI -  In RStudio go to Tools → Install Packages, and we will get a pop-up window in which to type the package we want to install:  Under Packages, type the name of the package we want to install and then click the Install button.
  • 122.
    Basics of R Programming Packages in R Programming –  How to Load Packages in R Programming Language -  Once an R package is installed, we are ready to use its functionality.  If we need only occasional use of a few functions or data inside a package, we can access them with the packagename::functionname() notation, without loading the whole package.  Example - Load a package using the library function - library(dplyr) Alternatively, load a package using the require function - require(dplyr)  Both functions attempt to load the specified package, but there is a subtle difference between the two:  library() raises an error if the package is not found or cannot be loaded.  require() issues a warning and returns FALSE, so its result can be tested in code.
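The difference between library() and require() can be sketched as follows. This assumes that no package called `notARealPackage123` is installed; that name is a made-up placeholder. The `stats` package ships with R, so it is used here instead of dplyr to keep the sketch runnable without installing anything.

```r
# Loading a package that ships with R succeeds silently
library(stats)

# Calling a single function without attaching its package
middle <- stats::median(c(1, 2, 3))

# require() returns FALSE (with a warning) when the package is missing
loaded <- suppressWarnings(
  require("notARealPackage123", character.only = TRUE, quietly = TRUE)
)

# library() raises an error instead, which tryCatch() can intercept
outcome <- tryCatch(
  {
    library("notARealPackage123", character.only = TRUE)
    "loaded"
  },
  error = function(e) "error"
)

print(loaded)
print(outcome)
```

Because require() returns a logical value, it is sometimes used inside an if condition to check whether a package is available, while library() is the usual choice at the top of a script where a missing package should stop execution.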
  • 123.
    Basics of R Programming Packages in R Programming –  Difference Between a Package and a Library -  There is often confusion between a package and a library, and we find people calling packages libraries.  Library: It is the place where packages are stored, usually a folder on our computer. library() is the command used to load a package from it.  Package: It is a collection of functions bundled conveniently. The package is an appropriate way to organize our own work and share it with others.
  • 124.
    Basics of R Programming Data Structures in R Programming –  A data structure is a particular way of organizing data in a computer so that it can be used effectively.  The idea is to reduce the space and time complexities of different tasks.  Data structures in R programming are tools for holding multiple values.  R's base data structures are often organized by their dimensionality (1D, 2D or nD) and whether they are homogeneous (all elements must be of the identical type) or heterogeneous (the elements can be of different types).  This gives rise to the six data structures which are most frequently utilized in data analysis – Vectors, Lists, Data Frames, Matrices, Arrays, Factors
  • 125.
    Basics of R Programming Data Structures in R Programming –  Vectors –  A vector is an ordered collection of values of a basic data type, with a given length.  All the elements of a vector must be of the identical data type.  Vectors are one-dimensional data structures.  Created using the c() function.
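A minimal sketch of creating and indexing vectors with c(), using made-up values. It also shows what happens when types are mixed: R coerces all elements to a single common type.

```r
ages <- c(21, 35, 42)          # hypothetical numeric vector
second_age <- ages[2]          # indexing starts at 1

# Mixing types forces coercion to one common type (character here)
mixed <- c(1, "two", TRUE)
mixed_class <- class(mixed)

print(second_age)
print(mixed_class)
```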
  • 126.
    Basics of R Programming Data Structures in R Programming –  Lists –  Ordered collection of elements, which can be of different types.  These are also one-dimensional data structures.  A list can be a list of vectors, list of matrices, a list of characters and a list of functions and so on.  Created using the list() function.
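A minimal sketch of a list holding elements of different types; the record below is made-up sample data.

```r
# Hypothetical record mixing character, numeric and logical elements
person <- list(name = "Asha", scores = c(90, 85), passed = TRUE)

person_name <- person$name              # access by name with $
first_score <- person[["scores"]][1]    # [[ ]] extracts the element itself

print(person_name)
print(first_score)
```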
  • 127.
    Basics of R Programming Data Structures in R Programming –  Data Frames –  Table-like structure where each column can contain different types of data.  They are two-dimensional, heterogeneous data structures. These are lists of vectors of equal lengths.  Created using the data.frame() function.
  • 128.
    Basics of R Programming Data Structures in R Programming –  Data Frames –  Data frames have the following constraints placed upon them:  A data frame must have column names, and every row must have a unique name.  Each column must have the identical number of items.  Each item in a single column must be of the same data type.  Different columns may have different data types.
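A minimal sketch of a data frame whose columns have different types but equal lengths; the student data is made up for illustration.

```r
# Hypothetical data: one character column and one numeric column
students <- data.frame(
  name  = c("A", "B", "C"),
  marks = c(72, 88, 65),
  stringsAsFactors = FALSE   # keep names as character on older R versions too
)

n_rows <- nrow(students)
n_cols <- ncol(students)
top_marks <- max(students$marks)

str(students)
print(top_marks)
```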
  • 129.
    Basics of R Programming Data Structures in R Programming –  Matrices –  A matrix is a two-dimensional, homogeneous data structure: it has rows and columns, and all elements are of the same type.  Created using the matrix() function.
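A minimal sketch of creating and indexing a matrix. By default matrix() fills values column by column; the name `mat` is illustrative.

```r
mat <- matrix(1:6, nrow = 2, ncol = 3)   # filled column-wise by default

corner <- mat[2, 3]      # row 2, column 3
flipped <- t(mat)        # transpose: 3 rows, 2 columns

print(mat)
print(corner)
```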
  • 130.
    Basics of R Programming Data Structures in R Programming –  Arrays –  Arrays are the R data objects which can store data in more than two dimensions.  Arrays are n-dimensional data structures.  They are homogeneous data structures.  For example, if we create an array of dimensions (2, 3, 3) then it creates 3 rectangular matrices each with 2 rows and 3 columns.
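The (2, 3, 3) example above can be sketched with array(); the values 1 to 18 are arbitrary filler data.

```r
arr <- array(1:18, dim = c(2, 3, 3))   # three 2 x 3 matrices

first_slice <- arr[, , 1]   # the first 2 x 3 matrix
last_value  <- arr[2, 3, 3] # row 2, column 3 of the third matrix

print(dim(arr))
print(first_slice)
print(last_value)
```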
  • 131.
    Basics of R Programming Data Structures in R Programming –  Factors –  Factors are the data objects which are used to categorize the data and store it as levels.  They are useful for storing categorical data.  They can be built from both character and integer values.  They are useful to categorize unique values in columns like ("TRUE" or "FALSE") or ("MALE" or "FEMALE"), etc.  They are useful in data analysis for statistical modeling.
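A minimal sketch of a factor built from made-up categorical data. By default the levels are the sorted unique values.

```r
# Hypothetical categorical data
gender <- factor(c("MALE", "FEMALE", "MALE", "FEMALE", "MALE"))

gender_levels <- levels(gender)   # sorted unique categories
gender_counts <- table(gender)    # how often each level occurs

print(gender_levels)
print(gender_counts)
```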