R programming language: conceptual overview
Maxim Litvak
2016-06-10

Outline
1 Introduction
2 Statistical computing
3 Functional programming
4 Dynamic
5 OOP
6 Statistical computing - revision I
7 Statistical computing - revision II
8 Statistical computing - revision III
9 Statistical computing - revision IV
10 Appendix

R description
R is a dynamic language for statistical computing that combines lazy functional features and object-oriented programming.

R Properties
Properties:
- Dynamic
- Statistical computing
- Lazy functional
- OOP
R users usually focus on statistical computing; however, understanding the rest is crucial to boost productivity.

Statistical computing
- You already know how it works :-)

Functional - Basics I
- Functional programming (FP) is a paradigm in which a task is broken down into the evaluation of (mathematical) functions
- FP is not about organizing code into subroutines (also called functions, but in a different sense); that is procedural programming
- It's about organizing the whole program as a function

Functional - Basics II
- Functions are first-class objects; they can be
  - passed as an argument
  - returned from a function
  - assigned to a variable
- Think of examples for the points above!

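One possible set of examples for the three points above, as a minimal sketch. The names `square`, `apply_twice`, `make_adder` and `add_five` are made up for illustration.

```r
# Assigned to a variable: a function is a value like any other.
square <- function(x) x^2

# Passed as an argument: apply_twice() receives a function and calls it twice.
apply_twice <- function(fn, x) fn(fn(x))
apply_twice(square, 3)   # (3^2)^2

# Returned from a function: make_adder() builds and returns a new function.
make_adder <- function(n) function(x) x + n
add_five <- make_adder(5)
add_five(10)

# Built-in higher-order functions such as sapply() take functions as well.
sapply(1:3, square)
```
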
Functional - Scoping

Functional - Lazy
- Lazy (or call-by-need) means that evaluation is delayed until the value is needed
- Do you think the following piece of code will work?
    f <- function(){g()}

Functional - Lazy
- It's valid even though we use the function g(), which isn't defined
- We kind of promise that it's going to be defined by the time f is called
- ... but if we don't keep our promise:
    f()
    Error in f() : could not find function "g"

Functional - Lazy
- Now, let's define the function g() before calling the function f()
    g <- function() 0 # now g() is defined
    f()
    [1] 0
- Now it works

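The same delayed behaviour can be seen in two places, sketched below: function arguments are only evaluated when their value is used, and names inside a function body are only looked up when the body runs. The names `first_arg`, `caller` and `helper` are hypothetical and chosen for this illustration only.

```r
# Arguments are promises: they are evaluated only when their value is used.
first_arg <- function(x, y) x
first_arg(1, stop("y is never evaluated, so this error is never raised"))

# Function names in a body are looked up when the body runs,
# not when the function is defined.
caller <- function() helper()
result_before <- tryCatch(caller(), error = function(e) conditionMessage(e))
result_before   # an error message, since helper() does not exist yet

helper <- function() 0
caller()        # works now that helper() is defined
```
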
Function - Referential transparency I
- An expression is referentially transparent if it can be replaced with its value without changing the behaviour of the program (i.e. it has no side effects)
- In R it's up to the developer, who should, however, be aware of whether their code produces side effects
- Assume function g returns 0 and function f returns its only argument (f <- function(x) x). Is there a difference between
  - f(0)
  - f(g())

Function - Referential transparency IIa
- Which of the following 2 cases are referentially transparent?

Function - Referential transparency IIb
- (cont.)
- I
    executed <- FALSE
    g <- function(){
      executed <- TRUE
      return(0)
    }
    f(g())
- II
    executed <- TRUE
    g <- function(){
      executed <- FALSE
      return(0)
    }
    f(g())

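The answer hinges on whether calling g changes anything outside g. A minimal sketch with two hypothetical variants, `g_local` and `g_global`: inside a function, `<-` creates a local variable, whereas `<<-` modifies a variable in an enclosing environment, which is a side effect.

```r
f <- function(x) x

executed <- FALSE

# Variant with a local assignment: the outer `executed` is untouched.
g_local <- function() {
  executed <- TRUE    # local to g_local
  0
}
f(g_local())
executed_after_local <- executed    # still FALSE

# Variant with a super-assignment: the outer `executed` is modified.
g_global <- function() {
  executed <<- TRUE   # side effect on the enclosing environment
  0
}
f(g_global())
executed_after_global <- executed   # now TRUE
```

Replacing `f(g_global())` with `f(0)` would leave `executed` at FALSE, so that call is not referentially transparent, while `f(g_local())` can be replaced with `f(0)` safely.
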
Dynamic: Typing - I
- Type declarations are optional, and the type of a variable can change
Code
    var <- FALSE
    class(var)
    [1] "logical"
    var
    [1] FALSE
    var[3] <- 1
    class(var)
    [1] "numeric"
    var
    [1]  0 NA  1

Dynamic: Typing - II
- What do you think the type of the variable var will be after the following actions?
    var <- "!"
    var[3] <- 1

Dynamic: Typing - III
- Types are implicitly there (assigned by the interpreter)
- Types can be changed (implicitly by the interpreter)

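A short sketch of such an implicit type change, using a hypothetical variable `v`: each assignment of a more general value coerces the whole vector, in the order logical, numeric, character.

```r
v <- FALSE
class(v)        # "logical"

v[3] <- 1       # assigning a number coerces the vector to numeric
class(v)        # "numeric"

v[4] <- "!"     # assigning a string coerces every element to character
class(v)        # "character"
v               # all elements are now strings (or NA)
```
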
Dynamic: Evaluation (Language abstraction)
- With eval you can dynamically evaluate code, e.g. eval(parse(text = "f <- function(x) x"))
- It gives more freedom in code manipulation (an example will follow), but beware of the performance cost!
- R allows you to abstract over the language itself

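A minimal sketch of the parse-then-eval cycle; the variable names `code`, `expr` and `op` are made up for illustration.

```r
# Build source code as a string, parse it into an expression, then evaluate it.
code <- "f <- function(x) x"
expr <- parse(text = code)
class(expr)          # "expression"
eval(expr)           # defines f in the current environment
f(42)

# Code can also be assembled from pieces before it is evaluated.
op <- "+"
eval(parse(text = paste("1", op, "2")))
```
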
OOP - Basics
- Object-oriented programming is a paradigm in which a task is broken down into objects with particular behaviour and data.

OOP in R
- Competing OOP standards in R: S3 (old), S4 (newer), reference classes, special libraries (R6, proto)
- xkcd: [comic image]

OOP in R: S4
- Assume an object of class Company has 2 properties: headcount (HC) and earnings (EBIT)
- If you add (i.e. merge) 2 companies, you add up their earnings and increase the sum by 20% (synergy effects), and you add up their headcount and reduce the sum by 20% (economies of scale)

OOP in R: S4
- Solution
    setClass("Company",
      representation(HC = "numeric",
                     EBIT = "numeric")
    )
    setMethod("+",
      signature("Company", "Company"),
      function(e1, e2){
        new("Company",
          HC = (e1@HC + e2@HC)*0.8,
          EBIT = (e1@EBIT + e2@EBIT)*1.2
        )
    })
    Microsoft <- new("Company", HC = 50, EBIT = 95)
    LinkedIn <- new("Company", HC = 2, EBIT = 5)

OOP in R: S4
- Result
    Microsoft + LinkedIn
    An object of class "Company"
    Slot "HC":
    [1] 41.6

    Slot "EBIT":
    [1] 120

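For comparison with S4, the same merge rule could also be written in S3, the older of the object systems available in R. This is only a sketch: the constructor `company` and the class name `CompanyS3` are made up here so that they do not clash with the S4 class.

```r
# Constructor: an S3 object is just a list with a class attribute.
company <- function(HC, EBIT) {
  structure(list(HC = HC, EBIT = EBIT), class = "CompanyS3")
}

# S3 method for `+`, found by its name: <generic>.<class>
"+.CompanyS3" <- function(e1, e2) {
  company(HC = (e1$HC + e2$HC) * 0.8,
          EBIT = (e1$EBIT + e2$EBIT) * 1.2)
}

microsoft <- company(HC = 50, EBIT = 95)
linkedin <- company(HC = 2, EBIT = 5)
merged <- microsoft + linkedin
merged$HC     # (50 + 2) * 0.8
merged$EBIT   # (95 + 5) * 1.2
```

S3 needs less ceremony than S4, but it does not check the types of the fields, whereas the S4 class rejects non-numeric slots.
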
Comparison to other languages
- Python
    class Company():
        def __init__(self, HC, EBIT):
            self.HC = HC
            self.EBIT = EBIT
        def __add__(self, other):
            return Company((self.HC+other.HC)*0.8,
                           (self.EBIT + other.EBIT)*1.2)
        def __repr__(self):
            out = "HC:%s,EBIT:%s" % (self.HC, self.EBIT)
            return out
    Microsoft = Company(50, 95)
    LinkedIn = Company(2, 5)

Comparison to other languages
- C#
    class Company {
        private double HC;
        private double EBIT;
        public Company(double HC, double EBIT) { this.HC = HC; this.EBIT = EBIT; }
        // operator overloads must declare their return type
        public static Company operator +(Company A, Company B) {
            double HC = (A.HC + B.HC)*0.8;
            double EBIT = (A.EBIT + B.EBIT)*1.2;
            return new Company(HC, EBIT);
        }
    }

Statistical computing - revision
- Example: given a distribution X (e.g. norm)
  - pX is its cumulative distribution function
  - dX is its density function
  - qX is its quantile function
- How to abstract X?
- Construct a function that takes the name of a distribution with 2 parameters as an argument (e.g. "norm", "unif") and returns its quantile function with the parameters fixed to 0 and 1 (hint: use eval)

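The naming convention itself can be tried out directly. A short sketch for X = norm and X = unif, using only functions from the stats package that ships with R:

```r
# X = norm: the prefix selects the function, the suffix the distribution.
pnorm(0)            # cumulative distribution function: P(X <= 0)
dnorm(0)            # density at 0, i.e. 1 / sqrt(2 * pi)
qnorm(0.5)          # quantile function, the inverse of pnorm

# The quantile function undoes the cumulative distribution function.
qnorm(pnorm(1.5))

# The same prefixes work for other distributions, e.g. X = unif.
punif(0.8, 0, 1)
qunif(0.8, 0, 1)
```
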
Possible solution
- 1st step: how it could look for a particular function
    eval(parse(text="function(x) qnorm(x,0,1)"))
- 2nd step: separate the distribution parameter
    eval(
      parse(
        text=paste0(
          "function(x) q", "norm", "(x,0,1)"
        )
      )
    )
    function (x) qnorm(x, 0, 1)

Possible solution (cont.)
- 3rd step: abstract the distribution as an argument and return the result as a function
    F <- function(dist){
      eval(parse(
        text=paste0(
          "function(x) q", dist, "(x,0,1)"
        )
      ))
    }
- Now you can get quantiles for different distributions
- Log-normal
    F("lnorm")(0.5)
    [1] 1
- Uniform
    F("unif")(0.8)
    [1] 0.8

Last remark
- Furthermore, it can be generalized to distributions with a different number of parameters by passing the parameters as an argument

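One way such a generalization might look, as a sketch rather than the author's solution. It swaps eval(parse()) for get() and do.call(), and the helper name `quantile_fun` is hypothetical. The distribution parameters are passed through `...` and fixed when the quantile function is created.

```r
# Hypothetical helper: returns the quantile function of `dist` with the
# distribution parameters given in `...` fixed in advance.
quantile_fun <- function(dist, ...) {
  qfun <- get(paste0("q", dist), mode = "function")  # e.g. "norm" -> qnorm
  params <- list(...)
  function(p) do.call(qfun, c(list(p), params))
}

quantile_fun("norm", 0, 1)(0.5)      # two parameters: mean and sd
quantile_fun("unif", 0, 1)(0.8)      # two parameters: min and max
quantile_fun("exp", rate = 2)(0.5)   # one parameter
quantile_fun("t", df = 5)(0.5)       # one parameter
```

Looking the function up by name avoids building source code as a string, so a malformed distribution name fails with a lookup error instead of a parse error.
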
References
- Morandat, Floréal, et al. "Evaluating the design of the R language." ECOOP 2012 – Object-Oriented Programming. Springer Berlin Heidelberg, 2012. 104-131.

Repository
- You can find the latest version of this presentation here:
  - github.com/maxlit/workshops/tree/master/R/r-advanced-overview
