Why Python? (For Stats People) @__mharrison__ © 2013
About Me ● 12+ years Python ● Worked in Data Analysis, HA, Search, Open Source, BI, and Storage ● Author of multiple Python books
Book
Book Treading on Python Volume 1 meant to make people proficient in Python quickly
Why Python?
General Purpose Language “I’d rather do math in a general-purpose language than do general-purpose programming in a math language.” John D Cook
Who's Using Python? ● Startups (on HN) ● Data Scientists (Strata) ● Big Companies
Who ● Google ● NASA ● ILM ● Red Hat ● Finance ● Instagram ● Pinterest ● YouTube ● ...
Open Source Free in both senses of the word
Batteries Included ● Text ● Network ● JSON ● Command Line ● Files ● XML
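A minimal sketch of what "batteries included" can look like in practice, using only standard library modules (json, xml.etree.ElementTree, argparse). The sample data, the helper names (load_json, load_xml, parse_args), and the --format flag are made up for illustration and are not from the talk.

```python
import argparse
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this sketch
RAW_JSON = '{"samples": [2, 4, 8]}'
RAW_XML = "<samples><value>2</value><value>4</value><value>8</value></samples>"


def load_json(text):
    """Return the list of numbers stored under the 'samples' key."""
    return json.loads(text)["samples"]


def load_xml(text):
    """Return the numbers stored in the <value> child elements."""
    root = ET.fromstring(text)
    return [int(node.text) for node in root.findall("value")]


def parse_args(argv):
    """Parse a (hypothetical) --format flag from a list of arguments."""
    parser = argparse.ArgumentParser(description="Sum some samples")
    parser.add_argument("--format", choices=["json", "xml"], default="json")
    return parser.parse_args(argv)


# Simulate running the script as: script.py --format xml
args = parse_args(["--format", "xml"])
values = load_xml(RAW_XML) if args.format == "xml" else load_json(RAW_JSON)
print(sum(values))
```

Nothing here needs a third-party install; the same script could read real files or sys.argv with a few more lines.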
Large Community PyPI - Python Package Index ● Web ● Database ● GUI ● Scientific ● Network Programming ● Games
Large Community ● User Groups ● PyLadies ● Conferences
Local ● utahpython.org - 2nd Thurs. 7pm ● Utah Open Source Conference
Tooling ● Editors ● Testing ● Profiling ● Debugging ● Documentation
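As one possible illustration of the testing and documentation bullets, the standard library's doctest module runs the examples embedded in a docstring. The mean function below is a hypothetical example, not something from the talk.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> mean([2, 4, 8, 10])
    6.0
    >>> mean([1, 2])
    1.5
    """
    return sum(values) / len(values)


# Collect the examples from the docstring and run them
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(mean, name="mean", globs={"mean": mean}):
    runner.run(test)

# summarize() reports how many examples were attempted and how many failed
results = runner.summarize(verbose=False)
print(results.attempted, results.failed)
```

The docstring serves as both documentation and a small regression test, so the two cannot silently drift apart.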
Optimizes for Programmer Time “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.” Donald Knuth
Executable Pseudocode
function quicksort(array)
    if length(array) ≤ 1
        return array  // an array of zero or one elements is already sorted
    select and remove a pivot element pivot from 'array'  // see '#Choice of pivot' below
    create empty lists less and greater
    for each x in array
        if x ≤ pivot then append x to less
        else append x to greater
    return concatenate(quicksort(less), list(pivot), quicksort(greater))  // two recursive calls
http://en.wikipedia.org/wiki/Quicksort
Executable Pseudocode
>>> def quicksort(array):
...     if len(array) <= 1:
...         return array
...     # pop() removes the pivot from the input list; // keeps the index an int in Python 3
...     pivot = array.pop(len(array) // 2)
...     lt = []
...     gt = []
...     for item in array:
...         if item < pivot:
...             lt.append(item)
...         else:
...             gt.append(item)
...     return quicksort(lt) + [pivot] + quicksort(gt)
But... Python has Timsort. Optimized for real world (takes advantage of inherent order) and written in C. (Stolen by Java, Android, and Octave)
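In practice this means calling the built-in sorted() or list.sort() rather than writing a sort by hand. The sketch below uses made-up records to show one property these built-ins guarantee: the sort is stable, so items with equal keys keep their original relative order.

```python
from operator import itemgetter

# Hypothetical (name, score) records, invented for this sketch
records = [("carol", 90), ("alice", 85), ("bob", 90), ("dave", 85)]

# sorted() returns a new list; the key function picks the value to compare
by_score = sorted(records, key=itemgetter(1), reverse=True)

# Ties keep their input order: carol stays ahead of bob, alice ahead of dave
print(by_score)

# list.sort() sorts in place instead of returning a new list
numbers = [3, 1, 2]
numbers.sort()
print(numbers)
```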
Multi-paradigm Language ● Imperative ● Object Oriented ● Functional
Imperative
>>> def sum(items):  # shadows the built-in sum() for this demo
...     total = 0
...     for item in items:
...         total = total + item
...     return total
>>> sum([2, 4, 8])
14
OO
>>> class Summer:
...     def __init__(self):
...         self.items = []
...     def add_item(self, item):
...         self.items.append(item)
...     def sum(self):
...         return sum(self.items)
>>> s = Summer()
>>> s.add_item(2)
>>> s.add_item(3)
>>> s.sum()
5
Functional
>>> import operator
>>> from functools import reduce  # reduce is a built-in only in Python 2
>>> sum = lambda x: reduce(operator.add, x)
>>> sum([4, 8, 22])
34
Why Not Python?
Slow Sometimes you have to optimize. Good C integration
If it ain't broke don't fix it Don't replace existing solutions for fun
R has more depth Though Python is catching up in some areas
Going Forward
IPython Notebook ● Notebook w/ integrated graphs
Libraries ● NumPy - matrix math ● SciPy - scientific libraries ● scipy.stats - stats ● statsmodels - modeling ● pandas - dataframe ● matplotlib - graphing ● scikit-learn - ml
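To give a flavor of two of these libraries, here is a minimal sketch that assumes NumPy and SciPy are installed. It draws two simulated samples and compares their means with Welch's t-test; the data, seed, and variable names are invented for illustration.

```python
import numpy as np
from scipy import stats

# Simulated data (an assumption for this sketch, not real measurements)
rng = np.random.default_rng(seed=0)
control = rng.normal(loc=10.0, scale=2.0, size=200)
treated = rng.normal(loc=11.0, scale=2.0, size=200)

# NumPy arrays carry vectorized summary methods
print(control.mean(), control.std(ddof=1))
print(treated.mean(), treated.std(ddof=1))

# Welch's t-test: equal_var=False drops the equal-variance assumption
result = stats.ttest_ind(control, treated, equal_var=False)
print(result.statistic, result.pvalue)
```

For model fitting or labeled tabular data, statsmodels and pandas would be the natural next step.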
That's all Questions? Tweet me For beginning Python secrets see Treading on Python Volume 1 @__mharrison__ http://hairysun.com
