Must Know: List & Dict & Set Comprehensions; Lambda Functions; Map; Filter; Zip; *args & **kwargs; Unpack variables; Generator (map, filter, zip); Closure & Decorator; Context Manager; Magic Method; Metaclasses; Threading & Multiprocessing
Classes: self (class instance); variables (class & instance); method vs. classmethod vs. staticmethod; _ (private) vs. __ (name mangling); @property (getter, setter); LEGB (local, enclosing, global, builtins); Abstract class; Dataclasses; Classes in Dynamic Language
Functions: Enclosing function; Attrs; Functions in Dynamic Language

Collections: defaultdict; OrderedDict; Counter; namedtuple; deque
Itertools: count; cycle; repeat; accumulate; chain; compress; filterfalse; groupby; islice; starmap; takewhile; dropwhile; zip_longest; product; permutations; combinations; combinations_with_replacement
Functools: reduce

String: f-string
Int: Underscore Placeholders
Set: Search
Tuple: Swap

Conditional: Ternary operator
For-Loop: Enumerate; For-Else
Try-Except: TEEF
Design: Annotation; Typing; Pass and Ellipsis
IPython: VSCode Python Interactive window; Time Measure; Memory Measure

Built-ins: pathlib

Numpy: Create Array or Matrix; Basic Operations; Indexing and Slicing; Shape Manipulation; Copying; Broadcasting
Pandas: Creation and Viewing; Selection; Setting, Deleting, and Handling; Operations and Apply Functions; Concat and Merge; Grouping and Categorical Data Type; Other Pandas Tricks
Matplotlib (Pyplot): Basic (Single Plot); Multiple Figures and Axes; Line Plots and Filling Area; Time Series; Scatter Plots; Bar Charts; Pie Charts; Histograms; Stack Plots; Image; Styles, Colors, Colormaps

Seaborn: Basic (Seaborn); Color Palette; Multiple Plots

Must Know

List & Dict & Set Comprehensions

[(i, j) for i in range(3) for j in range(3) if i > j] # [(1, 0), (2, 0), (2, 1)]

Details 🔥

Lambda Functions

li = [1, 2, 3]
li = [*map(lambda x: x * 10, li)]  # li = [10, 20, 30]

Details 🔥

Map

num1 = [100, 1, 20]
num2 = [19, 4, 94]
num3 = [40, 6, 30]
[*map(lambda x, y, z: max(x, y, z), num1, num2, num3)]  # [100, 6, 94]

Details 🔥

Filter

names = ['Liam', 'Olivia', 'Noah', 'Emma', 'Oliver', 'Ava']
choice = filter(lambda x: x.startswith('O'), names)
print(*choice, sep=', ')  # Olivia, Oliver

Details 🔥

Zip

a = [1, 2, 3]
b = [4, 5, 6]
c = [*zip(a, b)]  # [(1, 4), (2, 5), (3, 6)]
a, b = zip(*c)    # a=(1, 2, 3), b=(4, 5, 6)

Details 🔥

*args & **kwargs

Defining Functions with *arg and **kwarg

def example(a, *arg, b=0, **kwarg):
    print(a)      # 1
    print(arg)    # (2, 3)
    print(b)      # 1
    print(kwarg)  # {'x': 'a', 'y': [1, 2, 3]}

example(1, 2, 3, b=1, x='a', y=[1, 2, 3])

Calling Functions with *arg and **kwarg

def func(greet, time, name):
    print(greet, time, name)

func(*["Good", "Morning"], **{"name": "Jay"})  # Good Morning Jay

Details 🔥

Unpack Variables

Unpacking Iterable

a, b, *_ = [1, 2, 3, 4, 5] # 1, 2, [3, 4, 5]

Unpacking Generator

first, *amid, last = map(lambda x: x**2, range(1, 10000))
first  # 1
last   # 99980001

Unpacking in For-loop

sales = [("Pencil", 0.22, 1500), ("Notebook", 1.30, 550)]
for product, *_ in sales:
    print(product)
# Pencil
# Notebook

Unpacking Function

def compute(i):
    return i, i ** 2, i ** 3, i ** 4, i ** 5

num, power, cube, *_ = compute(3)
power  # 9
cube   # 27

Combining Dicts

number = {"one": 1, "two": 2}
letter = {"a": "A", "b": "B"}
combine = {**number, **letter}
combine  # {'one': 1, 'two': 2, 'a': 'A', 'b': 'B'}

Details 🔥

Generator (map, filter, zip)

def square_it(value):
    for i in range(value):
        yield i**2

li = square_it(10_000_000)
[i for i in li if i < 50]  # [0, 1, 4, 9, 16, 25, 36, 49]
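The comprehension above still pulls all ten million values from the generator, because a plain filter cannot know that the squares only grow. A minimal sketch, assuming the values are known to be increasing, that stops at the first value that fails the test by using `itertools.takewhile`. The `stats` dictionary is a hypothetical counter added only to show how many values were actually produced.

```python
from itertools import takewhile

stats = {"consumed": 0}  # hypothetical counter, for illustration only


def square_it(value):
    """Yield i**2 lazily for i in range(value)."""
    for i in range(value):
        stats["consumed"] += 1
        yield i**2


# takewhile stops pulling from the generator at the first value >= 50
small = list(takewhile(lambda sq: sq < 50, square_it(10_000_000)))

print(small)              # [0, 1, 4, 9, 16, 25, 36, 49]
print(stats["consumed"])  # 9
```

Only nine values are generated: the eight that pass, plus the one that ends the iteration.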

Details 🔥

Closure & Decorator

def count_decorator(count):  # new decorator with argument
    def decorator(orig_func):
        def wrapper(*args, **kwargs):
            print(f"func name: {orig_func.__name__}")
            print(f"func args: {args}, {kwargs}")
            for _ in range(count):  # use the argument
                orig_func(*args, **kwargs)
        return wrapper
    return decorator  # return the original decorator

@count_decorator(2)
def greet(msg):
    print(msg)

greet("hello")
# func name: greet
# func args: ('hello',), {}
# hello
# hello
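One side effect of wrapping is that the decorated name now points at `wrapper`, so `greet.__name__` and `greet.__doc__` are lost. A minimal sketch of the usual remedy, `functools.wraps`; the decorator name `repeat` and the returned list are assumptions made for this illustration, not part of the example above.

```python
from functools import wraps


def repeat(count):
    """Decorator factory: call the wrapped function `count` times."""
    def decorator(orig_func):
        @wraps(orig_func)  # copy __name__, __doc__, etc. onto the wrapper
        def wrapper(*args, **kwargs):
            # collect the return value of every call
            return [orig_func(*args, **kwargs) for _ in range(count)]
        return wrapper
    return decorator


@repeat(2)
def greet(msg):
    """Return a greeting."""
    return f"hello, {msg}"


print(greet("Jay"))    # ['hello, Jay', 'hello, Jay']
print(greet.__name__)  # greet
print(greet.__doc__)   # Return a greeting.
```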

Details 🔥

Context Manager

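import os
from contextlib import contextmanager

@contextmanager
def enterFolder(folderName):
    home = os.getcwd()
    os.chdir(folderName)
    try:
        yield
    finally:
        os.chdir(home)  # always return to the original directory

with enterFolder('folder1'), open('example1.txt', 'w') as f:
    f.write('file1')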
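The same idea can be written as a class with `__enter__` and `__exit__`, which makes the cleanup step explicit. This is a sketch under assumptions: the class name `EnterFolder` is hypothetical, and a temporary directory stands in for `folder1` so the snippet runs anywhere.

```python
import os
import tempfile


class EnterFolder:
    """Change into a folder on entry and return to the old one on exit."""

    def __init__(self, folder_name):
        self.folder_name = folder_name

    def __enter__(self):
        self.home = os.getcwd()
        os.chdir(self.folder_name)
        return self

    def __exit__(self, exc_type, exc, tb):
        os.chdir(self.home)  # runs even when the body raised
        return False         # do not swallow the exception


start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    try:
        with EnterFolder(tmp):
            raise RuntimeError("boom")
    except RuntimeError:
        pass
    restored = os.getcwd() == start

print(restored)  # True
```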

Details 🔥

Magic Method

class BinaryInt(str):
    def __new__(cls, val):
        return str.__new__(cls, f"{val:b}")  # binary digits, no padding

    def __add__(self, val):
        val += int(self, 2)
        return f"{val:b}"

a = BinaryInt(2)
print(a)      # 10
print(a + 4)  # 110

Details 🔥

Metaclasses

class Meta(type):
    def __new__(mtcls, name, bases, attrs):
        if name != "Base" and "must_to_do" not in attrs:
            raise TypeError("Bad Class: must_to_do() is needed")
        return super().__new__(mtcls, name, bases, attrs)

class Base(metaclass=Meta):
    def server_func(self):
        return self.must_to_do()

class Derived(Base): ...
# TypeError: Bad Class: must_to_do() is needed

Details 🔥

Threading & Multiprocessing

import concurrent.futures

# load_url(url, timeout) and URLS are assumed to be defined elsewhere

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(load_url, url, 60) for url in URLS]
    for future in concurrent.futures.as_completed(futures):
        result = future.result()
        print(len(result))

with concurrent.futures.ProcessPoolExecutor() as executor:
    results = executor.map(load_url, URLS, [60] * len(URLS), chunksize=4)
    for result in results:
        print(len(result))

Details 🔥

Classes

self (class instance)

class Person:
    def __init__(self, name):
        self.name = name

    def say(self):
        return f"I'm {self.name}"

p = Person("Jay")
p.say() == Person.say(p)  # True

Details 🔥

variables (class & instance)

class Employee:
    num_emp = 0  # Class variable

    def __init__(self, pay):
        self.pay = pay  # Instance variable
        Employee.num_emp += 1

e1 = Employee(100)
e2 = Employee(200)

e1.num_emp        # 2
Employee.num_emp  # 2
e1.pay            # 100
Employee.pay      # AttributeError: type object 'Employee' has no attribute 'pay'
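A common pitfall follows from the lookup order: reading `e1.num_emp` falls back to the class, but assigning to it creates a new instance attribute that shadows the class variable for that one object. A small sketch reusing the `Employee` class:

```python
class Employee:
    num_emp = 0  # class variable shared by all instances

    def __init__(self, pay):
        self.pay = pay  # instance variable
        Employee.num_emp += 1


e1 = Employee(100)
e2 = Employee(200)

e1.num_emp = 99          # creates an instance attribute on e1 only

print(Employee.num_emp)  # 2
print(e2.num_emp)        # 2
print(e1.num_emp)        # 99
print(e1.__dict__)       # {'pay': 100, 'num_emp': 99}
```

This is why the constructor updates `Employee.num_emp` rather than `self.num_emp`.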

Details 🔥

method vs. classmethod vs. staticmethod

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    @staticmethod
    def splitPersonString(string, split_sign="-"):
        return string.split(split_sign)

    @classmethod
    def fromString(cls, cls_str):
        return cls(*cls.splitPersonString(cls_str, ", "))

p1 = Person.fromString("Jay, 99")
p1.name  # 'Jay'
p1.age   # '99' (still a string; convert with int() if a number is needed)

Details 🔥

_ (private) vs. __ (name mangling)

class Dog:
    _weight = 5  # private by convention only

    def __bark(self):  # name mangling function
        print("bark")

dog = Dog()
dog._weight       # 5
dog.__bark()      # AttributeError: 'Dog' object has no attribute '__bark'
dog._Dog__bark()  # bark
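Name mangling exists mainly to keep a subclass from accidentally overwriting an attribute of its parent. A minimal sketch with two hypothetical classes, `Base` and `Child`, that both use an attribute called `__token`:

```python
class Base:
    def __init__(self):
        self.__token = "base"   # stored as _Base__token

    def base_token(self):
        return self.__token


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = "child"  # stored as _Child__token, no clash

    def child_token(self):
        return self.__token


c = Child()
print(c.base_token())   # base
print(c.child_token())  # child
print(sorted(vars(c)))  # ['_Base__token', '_Child__token']
```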

Details 🔥

@property (getter, setter)

class User:
    def __init__(self, first_name, last_name, password):
        self.first_name = first_name
        self.last_name = last_name
        self.password = password  # goes through the setter below

    @property
    def fullname(self):
        return f"{self.first_name} {self.last_name}"

    @property
    def password(self):
        raise AttributeError("password is not readable.")

    @password.setter
    def password(self, password):
        from hashlib import md5
        self.password_hash = md5(password.encode()).hexdigest()

user = User("Mimi", "Wang", "0000")
user.fullname       # Mimi Wang
user.password_hash  # 32-character hex digest of "0000"
user.password       # AttributeError: password is not readable.

Details 🔥

LEGB (local, enclosing, global, builtins)

min([1, 2, 31])  # builtins min

min = "global min"

def outer():
    # we can do "global min" here to change global
    min = "enclosing min"

    def inner():
        # we can do "nonlocal min" here to change enclosing
        min = "local min"
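To see the difference between shadowing a name and rebinding it, the following sketch compares a plain assignment with a `nonlocal` assignment inside nested functions. The helper names `shadow` and `rebind` are made up for this illustration.

```python
def outer():
    name = "enclosing"

    def shadow():
        name = "local"  # plain assignment creates a new local name
        return name

    def rebind():
        nonlocal name   # rebind the variable that lives in outer()
        name = "rebound"
        return name

    seen = [shadow(), name]   # the enclosing value is untouched by shadow()
    seen += [rebind(), name]  # rebind() changed the enclosing value
    return seen


print(outer())  # ['local', 'enclosing', 'rebound', 'rebound']
```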

Details 🔥

Abstract class

from abc import ABC, abstractmethod

class Base(ABC):
    @property
    @abstractmethod
    def foo(self): ...

    @abstractmethod
    def do(self): ...
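What the abstract markers buy you is a check at instantiation time: the base class cannot be instantiated, while a subclass that implements every abstract member can. A short sketch, where `Concrete` is a hypothetical subclass:

```python
from abc import ABC, abstractmethod


class Base(ABC):
    @property
    @abstractmethod
    def foo(self): ...

    @abstractmethod
    def do(self): ...


class Concrete(Base):
    @property
    def foo(self):
        return "foo"

    def do(self):
        return f"doing {self.foo}"


try:
    Base()
    error = None
except TypeError as exc:
    error = exc  # abstract classes cannot be instantiated

print(type(error).__name__)  # TypeError
print(Concrete().do())       # doing foo
```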

Details 🔥

Dataclasses

from dataclasses import InitVar, dataclass, field
from typing import List

@dataclass
class InventoryItem:
    name: str
    unit_price: float = field(default=0.0)
    quantity_on_hand: int = field(default=0, repr=False)
    parts: List[str] = field(default_factory=list)
    parts_number: InitVar[int] = 0

    def __post_init__(self, parts_number):
        self.parts.extend([f"part{i}" for i in range(1, parts_number + 1)])

item = InventoryItem("product", parts_number=2)
# InventoryItem(name='product', unit_price=0.0, parts=['part1', 'part2'])

Details 🔥

Classes in Dynamic Language

def getClass(x):
    if x == 1:
        for i in range(11):
            class Example:
                a = i
        return Example

cls = getClass(1)
cls.b = "123"
print(cls.a, cls.b)  # 10 123

Details 🔥

Functions

Enclosing function

def add_with_b(b):
    def add(a):
        return a + b
    return add

add4 = add_with_b(4)
add4(3)  # 7
add4(7)  # 11

Details 🔥

Attrs

class Cat:
    def __repr__(self):
        return f"({self.name}: {self.age})"

listOfCats = []
attrs = [{"name": "meow1", "age": 5}, {"name": "meow2", "age": 10}]

for attr in attrs:
    cat = Cat()
    for key, val in attr.items():
        setattr(cat, key, val)
    listOfCats.append(cat)

print(listOfCats)  # [(meow1: 5), (meow2: 10)]

Details 🔥

Functions in Dynamic Language

for i in range(100):
    def say():
        print(i)

def returnFunc(a):
    if a < 100:
        def mul(b):
            print(a * b)
        return mul
    else:
        def add(b):
            print(a + b)
        return add

Details 🔥

Collections

defaultdict

from collections import defaultdict

d = defaultdict(list)
d["a"] = [1, 2, 3]
d["b"].append(4)
d["c"].extend([5, 6])
# defaultdict(<class 'list'>, {'a': [1, 2, 3], 'b': [4], 'c': [5, 6]})

Details 🔥

OrderedDict

from collections import OrderedDict

location = ["C", "B", "A"]
population = [32, 46, 12]

d = OrderedDict({l: p for l, p in zip(location, population)})
# OrderedDict([('C', 32), ('B', 46), ('A', 12)])

d["D"] = 44
# OrderedDict([('C', 32), ('B', 46), ('A', 12), ('D', 44)])

d.popitem(last=False)
# OrderedDict([('B', 46), ('A', 12), ('D', 44)])

d.move_to_end("D", last=False)
# OrderedDict([('D', 44), ('B', 46), ('A', 12)])

Details 🔥

Counter

from collections import Counter

c = Counter(cats=4, dogs=8)
# Counter({'dogs': 8, 'cats': 4})

c.update(birds=10)
# Counter({'birds': 10, 'dogs': 8, 'cats': 4})

c = c - Counter({"birds": 5})
# Counter({'dogs': 8, 'birds': 5, 'cats': 4})

c.most_common(2)
# [('dogs', 8), ('birds', 5)]

Details 🔥

namedtuple

from collections import namedtuple

Dog = namedtuple("Dog", "name, age")
d1 = Dog("funny", 4)

features = ["happy", 3]
d2 = Dog._make(features)
# Dog(name='happy', age=3)

d2._asdict()
# {'name': 'happy', 'age': 3}  (a plain dict on Python 3.8+, an OrderedDict on older versions)

Details 🔥

deque

from collections import deque

li = [40, 30, 50, 46, 39, 44]
d = deque(li[:2])

# Let's compute the moving average with range=3
d.appendleft(0)
s = sum(d)
for elem in li[2:]:
    s += elem - d.popleft()
    d.append(elem)
    print(s / 3)
# 40.0, 42.0, 45.0, 43.0
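A variation on the moving average lets the deque manage the window size itself through `maxlen`, so no padding element is needed. This is a sketch, not the only way to do it; the function name `moving_average` and its `window` parameter are assumptions for this example.

```python
from collections import deque


def moving_average(values, window=3):
    """Yield the mean of each full sliding window over `values`."""
    buf = deque(maxlen=window)  # the oldest item is dropped automatically
    total = 0
    for value in values:
        if len(buf) == window:
            total -= buf[0]     # this item is about to be evicted
        buf.append(value)
        total += value
        if len(buf) == window:
            yield total / window


print(list(moving_average([40, 30, 50, 46, 39, 44])))
# [40.0, 42.0, 45.0, 43.0]
```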

Details 🔥

Itertools

Infinite iterators

count

from itertools import count

gen = count(2.5, 0.5)
for x in gen:
    print(x)
# 2.5, 3.0, 3.5, 4.0, ... non-stop

cycle

from itertools import cycle

gen = cycle([1, 2, 3])
for x in gen:
    print(x)
# 1, 2, 3, 1, 2, ... non-stop

repeat

from itertools import repeat

class Cat: ...

gen = repeat(Cat(), 2)
for cat in gen:
    print(cat)
# <__main__.Cat object at 0x0000019AC1C5D348>
# <__main__.Cat object at 0x0000019AC1C5D348>

Details 🔥

Iterators terminating on the shortest input sequence

accumulate

import operator
from itertools import accumulate

gen = accumulate([1, 2, 3, 4])
list(gen)  # [1, 3, 6, 10]

gen = accumulate([1, 2, 3, 4], func=operator.mul)
list(gen)  # [1, 2, 6, 24]

chain

from itertools import chain

gen = chain([1, 2], [3, 4])
list(gen)  # [1, 2, 3, 4]

gen = chain("AB", "CD")
list(gen)  # [A, B, C, D]

compress

from itertools import compress

gen = compress([1, 2, 3], [1, 0, 1])
gen = compress([1, 2, 3], [True, False, True])  # same
list(gen)  # [1, 3]

filterfalse

from itertools import filterfalse

gen = filterfalse(lambda x: x%2 == 0, [1, 2, 3])
list(gen)  # [1, 3]

groupby

from itertools import groupby

gen = groupby("AABBCCCAA")  # default func = lambda x: x
for k, g in gen:
    print(k, list(g))
# A [A, A]
# B [B, B]
# C [C, C, C]
# A [A, A]

gen = groupby([1, 2, 3, 4], lambda x: x // 3)
for k, g in gen:
    print(k, list(g))
# 0 [1, 2]
# 1 [3, 4]

gen = groupby([("A", 100), ("B", 200), ("C", 600)], lambda x: x[1] > 500)
for k, g in gen:
    print(k, list(g))
# False [(A, 100), (B, 200)]
# True [(C, 600)]
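As the "AABBCCCAA" case shows, `groupby` only merges consecutive items with the same key, so "A" appears twice. When one group per key is wanted, the usual approach is to sort by the same key first. A small sketch with made-up data:

```python
from itertools import groupby

words = ["apple", "bob", "avocado", "banana", "cherry"]


def first_letter(word):
    return word[0]


# Without sorting, "a" and "b" each show up as two separate groups.
unsorted_keys = [key for key, _ in groupby(words, key=first_letter)]
print(unsorted_keys)  # ['a', 'b', 'a', 'b', 'c']

# Sorting by the same key first yields one group per key.
grouped = {
    key: list(group)
    for key, group in groupby(sorted(words, key=first_letter), key=first_letter)
}
print(grouped)
# {'a': ['apple', 'avocado'], 'b': ['bob', 'banana'], 'c': ['cherry']}
```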

islice

from itertools import islice

gen = islice([1, 2, 3], 2)  # equivalent to A[:2]
list(gen)  # [1, 2]

gen = islice("ABCD", 2, 4)  # equivalent to A[2:4]
list(gen)  # [C, D]

gen = islice("ABCD", 0, None, 2)  # equivalent to A[::2]
list(gen)  # [A, C]

starmap

from itertools import starmap

# with only one argument
gen = starmap(lambda x: x.lower(), "ABCD")
list(gen)  # [a, b, c, d]

# with 2 arguments
gen = starmap(lambda x, y: x + y, [(1, 2), (3, 4)])
list(gen)  # [3, 7]

# with a varying number of arguments
gen = starmap(lambda *keys: sum(keys) / len(keys), [[3, 8, 3], [4, 2]])
list(gen)  # [4.6666667, 3.0]

takewhile

from itertools import takewhile

gen = takewhile(lambda x: x < 2, [1, 2, 3, 2, 1])
list(gen)  # [1]

gen = takewhile(lambda x: x.isupper(), "ABCdefgHIJ")
list(gen)  # [A, B, C]

dropwhile

from itertools import dropwhile

gen = dropwhile(lambda x: x < 2, [1, 2, 3, 2, 1])
list(gen)  # [2, 3, 2, 1]

gen = dropwhile(lambda x: x.isupper(), "ABCdefgHIJ")
list(gen)  # [d, e, f, g, H, I, J]

zip_longest

from itertools import zip_longest

gen = zip_longest("ABC", ("X", "Y"))
list(gen)  # [('A', 'X'), ('B', 'Y'), ('C', None)]

gen = zip_longest("ABC", [1, 2], fillvalue=-1)
list(gen)  # [('A', 1), ('B', 2), ('C', -1)]

Details 🔥

Combinatoric iterators

product

from itertools import product

gen = product("AB", "CD")
list(gen)  # [AC, AD, BC, BD]

gen = product("AB", repeat=2)
list(gen)  # [AA, AB, BA, BB]

gen = product("AB", "CD", repeat=2)
list(gen)
# [ACAC, ACAD, ACBC, ACBD,
#  ADAC, ADAD, ADBC, ADBD,
#  BCAC, BCAD, BCBC, BCBD,
#  BDAC, BDAD, BDBC, BDBD]

permutations

from itertools import permutations

gen = permutations("ABC")  # same as r=3
list(gen)  # [ABC, ACB, BAC, BCA, CAB, CBA]

gen = permutations("ABC", r=2)
list(gen)  # [AB, AC, BA, BC, CA, CB]

gen = permutations("ABC", r=1)
list(gen)  # [A, B, C]

combinations

from itertools import combinations

gen = combinations("ABC", 1)
list(gen)  # [A, B, C]

gen = combinations("ABC", 2)
list(gen)  # [AB, AC, BC]

gen = combinations("ABC", 3)
list(gen)  # [ABC]

combinations_with_replacement

from itertools import combinations_with_replacement

gen = combinations_with_replacement("ABC", 1)
list(gen)  # [A, B, C]

gen = combinations_with_replacement("ABC", 2)
list(gen)
# [AA, AB, AC,
#  BB, BC,
#  CC]

gen = combinations_with_replacement("ABC", 3)
list(gen)
# [AAA, AAB, AAC, ABB, ABC, ACC,
#  BBB, BBC, BCC,
#  CCC]

Details 🔥

Functools

Reduce

from functools import reduce

reduce(lambda x, y: x - y, [1, 2, 3, 4, 5], 100)  # 85
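To make the result of 85 less opaque, here is a hand-written left fold that behaves like `reduce` with an initial value. The helper `fold_left` is hypothetical and only illustrates the order of evaluation.

```python
from functools import reduce


def fold_left(func, items, initial):
    """Hand-written equivalent of functools.reduce with an initial value."""
    acc = initial
    for item in items:
        acc = func(acc, item)
    return acc


def subtract(x, y):
    return x - y


# ((((100 - 1) - 2) - 3) - 4) - 5
print(fold_left(subtract, [1, 2, 3, 4, 5], 100))  # 85
print(reduce(subtract, [1, 2, 3, 4, 5], 100))     # 85

# Without an initial value, the first item seeds the accumulator.
print(reduce(subtract, [1, 2, 3, 4, 5]))          # -13
```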

Details 🔥

String

f-string

first_name = "Kain"
last_name = "Mccarthy"
print(f"Hi, I'm {first_name} {last_name}.")  # Hi, I'm Kain Mccarthy.

pi = 3.14159265359
print(f"{pi:.2f}")  # 3.14

d = {"name": "Shelly"}
print(f"She is {d['name']}")  # She is Shelly

i = 1000000
print(f"{i:,}")  # 1,000,000

# Ref:
# * https://youtu.be/nghuHvKLhJA
# * https://blog.louie.lu/2017/08/08/outdate-python-string-format-and-fstring/

Int

Underscore Placeholders

a = 100_000_000
b = 10_000_000
c = 1_0_0
print(f"{a+b+c:,}")  # 110,000,100

# Ref:
# * https://youtu.be/C-gEQdGVXbk&t=140

Set

Search

long_list = [i for i in range(100_000_000)]
long_set = set(long_list)

%%time
100_000_000 in long_list
# False
# Wall time: 1.26 s

%%time
100_000_000 in long_set
# False
# Wall time: 0 ns

# Ref:
# * https://stackoverflow.com/questions/2831212/python-sets-vs-lists/17945009
# * https://youtu.be/r3R3h5ly_8g?t=1010
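The `%%time` magic only works inside IPython or Jupyter. A rough equivalent for a plain Python script, sketched with the standard `timeit` module and a smaller collection so it finishes quickly; the size `n` and the repeat count are arbitrary choices, and the timings depend on the machine.

```python
import timeit

n = 100_000
long_list = list(range(n))
long_set = set(long_list)

# Look up a value that is absent, which is the worst case for the list:
# the list scans every element, the set does a single hash lookup.
list_time = timeit.timeit(lambda: n in long_list, number=100)
set_time = timeit.timeit(lambda: n in long_set, number=100)

print(f"list: {list_time:.6f} s, set: {set_time:.6f} s")
```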

Tuple

Swap

a, b = 1, 2
a  # 1
b  # 2

a, b = b, a
a  # 2
b  # 1

# Ref:
# * https://youtu.be/VBokjWj_cEA?list=LL&t=445

Condition

Ternary operator

if x < 1:
    x += 1
else:
    x -= 1

# equivalent to:
x = (x + 1) if (x < 1) else (x - 1)

# Ref:
# * https://www.youtube.com/watch?v=C-gEQdGVXbk&t=34s

For-Loop

Enumerate

arr = ["a", "b", "c"]

for index, element in enumerate(arr):
    print(index, element)
# 0 a
# 1 b
# 2 c

for index, element in enumerate(arr, start=3):
    print(index, element)
# 3 a
# 4 b
# 5 c

# Ref:
# * https://youtu.be/VBokjWj_cEA?list=LL&t=190

For-Else

for text in "to be or not to be".split():
    if text.strip().startswith("o"):
        print(f"Found it! `{text}`")
        break
else:
    print("Not found")
# Found it! `or`

# Ref:
# * https://www.youtube.com/watch?v=Dh-0lAyc3Bc
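The loop above only exercises the `break` path. The sketch below wraps the same search in a hypothetical helper, `find_word`, so that both outcomes can be seen: the `else` block runs only when the loop finishes without hitting `break`.

```python
def find_word(sentence, prefix):
    """Return the first word starting with `prefix`, or None."""
    for text in sentence.split():
        if text.startswith(prefix):
            result = text
            break
    else:
        # Runs only when the loop finished without hitting `break`.
        result = None
    return result


print(find_word("to be or not to be", "o"))  # or
print(find_word("to be or not to be", "x"))  # None
```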

Try-Except

TEEF

try:
    print(1/1)
except Exception as e:
    print(e)
else:
    print("Safe")  # executed when except didn't happen
finally:
    print("Done")  # Always executed
# 1.0
# Safe
# Done

# Ref:
# * https://youtu.be/VBokjWj_cEA?list=LL&t=1331
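For comparison, here is a sketch that records which branches run on both the success path and the failure path. The function `divide` and its `trace` list are made up for this illustration.

```python
def divide(a, b):
    """Return (result, trace), where trace lists the branches that ran."""
    trace = []
    result = None
    try:
        result = a / b
    except ZeroDivisionError as e:
        trace.append(f"except: {e}")
    else:
        trace.append("else")     # only when no exception was raised
    finally:
        trace.append("finally")  # always
    return result, trace


print(divide(1, 1))  # (1.0, ['else', 'finally'])
print(divide(1, 0))  # (None, ['except: division by zero', 'finally'])
```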

Design

Annotation

def func(a: str, b: int = 3) -> str:
    return a*b

func.__annotations__  # {'a': <class 'str'>, 'b': <class 'int'>, 'return': <class 'str'>}
func("hi")            # hihihi
func("hi", 5)         # hihihihihi

def func(a: "str longer than 5", b: 1+2 = 3) -> "str longer b times":
    return a*b

func.__annotations__  # {'a': 'str longer than 5', 'b': 3, 'return': 'str longer b times'}
func("hi")            # hihihi
func("ohayou", 2)     # ohayouohayou

Ref

https://mozillazg.com/2016/01/python-function-argument-type-check-base-on-function-annotations.html
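Annotations are not enforced at runtime, but they can be read back and checked by hand. The following is a minimal sketch of that idea, not a full type checker: the decorator name `check_types` is hypothetical, and it only validates annotations that are plain classes such as `str` or `int`.

```python
import functools
import inspect


def check_types(func):
    """Hypothetical decorator: validate arguments against class annotations."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = func.__annotations__.get(name)
            # Only plain classes are checked; other annotations are ignored.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        return func(*args, **kwargs)

    return wrapper


@check_types
def repeat_text(a: str, b: int = 3) -> str:
    return a * b


print(repeat_text("hi"))     # hihihi
print(repeat_text("hi", 2))  # hihi
```

Calling `repeat_text("hi", "2")` raises `TypeError` because `"2"` is not an `int`.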

Typing

from typing import Any, Dict, Iterable, List, Union

def func(a: List[int],
         b: Union[str, int],
         c: Dict[str, int],
         d: Iterable,
         e: Any):
    print(len(a))
    print(f"{b} can be str or int.")
    print(f"{c['something']} will return int.")
    for i in d:
        print(i)
    print(f"{type(e)} can be any type.")

# Ref:
# * https://myapollo.com.tw/zh-tw/python-typing-module/

Pass and Ellipsis

# Style 1
def my_abstract_method(self):
    pass

# Style 2
def my_abstract_method(self):
    ...

# Style 3
def my_abstract_method(self):
    """
    This function is ...
    """

# Ref:
# * https://stackoverflow.com/questions/55274977/when-is-the-usage-of-the-python-ellipsis-to-be-preferred-over-pass
# * https://stackoverflow.com/questions/772124/what-does-the-ellipsis-object-do

IPython

VSCode Python Interactive window

#%%
1+1  # 2

# Ref:
# * https://code.visualstudio.com/docs/python/jupyter-support-py

Time Measure

One Line

from time import sleep

%time sleep(0.3)
# Wall time: 310 ms

%timeit sleep(0.3)
# 311 ms ± 2.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Multiple Lines

%%time
for i in range(10):
    sleep(0.1)
# Wall time: 1.09 s

%%timeit
for i in range(10):
    sleep(0.1)
# 1.09 s ± 2.07 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Memory Measure

Installation

!pip install -U memory_profiler
%load_ext memory_profiler

One Line

%memit [i for i in range(1000)]
# peak memory: 51.31 MiB, increment: 0.36 MiB

Multiple Lines

%%memit
l = []
for x in range(10000):
    l.append(x*2)
# peak memory: 52.76 MiB, increment: 0.70 MiB

Modules

pathlib

from pathlib import Path

sub_folder = Path("subfolder/subfolder")
sub_folder.mkdir(parents=True, exist_ok=True)

file_ = sub_folder / "test.txt"
file_.touch()
file_.write_text("Hello")
file_.read_text()  # 'Hello'
file_.unlink()

Path("subfolder/subfolder").rmdir()  # removes only the inner (empty) folder

Details 🔥

Numpy

Create Array or Matrix

import numpy as np

np.array([[1, 2], [3, 4], [5, 6]])  # create from list
np.zeros((3, 3))                    # create filled with 0's
np.ones((2, 4, 4))                  # create filled with 1's
np.empty((5, 2))                    # create without initializing (fast, contents are arbitrary)
np.arange(2, 10, 3)                 # create array from range (start, end, step_size)
np.linspace(5, 50, 20)              # create a linear space (start, end, num_elements)

# create from random generator
rng = np.random.default_rng(seed=42)
rng.random((2, 4))
rng.normal(3, 2.5, size=(2, 4))             # sample from N(3, 6.25)
rng.integers(low=2, high=10, size=(10, 2))  # random integer matrix

Details 🔥

Basic Operations

Sort and Concatenate

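np.sort(a, axis=None)      # flatten, then sort
np.sort(a, axis=-1)[::-1]  # sort along the last axis, then reverse the first axis (descending for 1-D)
a.sort()                   # sort in place
a[::-1].sort()             # sort the reversed view in place, leaving a descending (1-D)

np.concatenate((a, b), axis=None)  # flatten both, then join
np.concatenate((a, b), axis=2)     # join along the third axis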

Element-wise

a = np.arange(5)            # [0, 1, 2, 3, 4]
b = np.ones(5, dtype=int)   # [1, 1, 1, 1, 1]

a + b       # [1 2 3 4 5]
a - b       # [-1 0 1 2 3]
a ** 2      # [ 0 1 4 9 16]  (note: a ^ 2 is bitwise XOR, not a power)
a * 10      # [ 0 10 20 30 40]
a > 2       # [False False False True True]
np.sqrt(a)  # [0. , 1. , 1.41421356, 1.73205081, 2. ]

a*b         # [0 1 2 3 4]
a@b         # 10

All (None), Column-wise (0), Row-wise (1)

A = np.random.default_rng(42).random((2, 4))
# [[0.77395605, 0.43887844, 0.85859792, 0.69736803],
#  [0.09417735, 0.97562235, 0.7611397 , 0.78606431]]

A.max()         # 0.97562235
A.max(axis=0)   # [0.77395605, 0.97562235, 0.85859792, 0.78606431]
A.max(axis=1)   # [0.85859792, 0.97562235]

A.mean()        # 0.6732255180088094
A.mean(axis=0)  # [0.4340667 , 0.7072504 , 0.80986881, 0.74171617]
A.mean(axis=1)  # [0.69220011, 0.65425093]

Details 🔥

Indexing and Slicing

# Index and slicing arrays
x[1, 3] == x[1][3]
y[1:5:2, ::3]

# Indexing arrays
x[np.array([0, 1, 2, -1, -2])]
y[np.array([1, 2, 3]), 1:4:2]
y[np.array([1, 2]), np.array([-1, -1])]

# Masking arrays
x[x>5]
x[(x%2==0) | (x>7)]
y[[True]*3 + [False] + [True] + [False], 2::2]

# Ellipsis syntax
x[-1, ..., 3]                  # same as x[-1, :, 3]
x[:3, ...]                     # same as x[0:3, :, :] and x[0:3] and x[:3]
x[::2, ..., np.array([0, 2])]  # same as x[0:5:2, :, np.array([0, 2])]

Details 🔥

Shape Manipulation

A = np.array([[[1, 2, 3], [4, 5, 6]], [[4, 6, 8], [2, 1, 6]]])
A.shape                        # (2, 2, 3)

A = A.reshape(3, 2, 2)         # (3, 2, 2)
A = A[np.newaxis, ...]         # (1, 3, 2, 2)
A = np.expand_dims(A, axis=4)  # (1, 3, 2, 2, 1)
A = A.flatten()                # (12,)
A = A.reshape(2, -1, 2)        # (2, 3, 2)

Details 🔥

Copying

# shallow copy (view): a change made through one variable is visible through all of them
a = np.arange(10).reshape(5, 2)
b = a.view()
c = a.reshape(-1)
d = a[:3, :1]

# deep copy: copy and create an entirely new array
a = np.arange(10000000)
b = a[:100].copy()
del a

Details 🔥

Broadcasting

# scalar broadcasting
a = np.array([1, 2, 3])
a * 3  # [3, 6, 9]

# general broadcasting
a = np.ones((8, 1, 6, 1))
b = np.zeros((7, 1, 5))
(a*b).shape  # (8, 7, 6, 5)

# outer sum
a = np.arange(4)[:, np.newaxis]  # (4, 1)
b = np.array([1, 2, 3])          # (3,)
a + b                            # (4, 3)
# [0]               [1 2 3]
# [1] + [1, 2, 3] = [2 3 4]
# [2]               [3 4 5]
# [3]               [4 5 6]

Details 🔥

Pandas

Creation and Viewing

import numpy as np
import pandas as pd

# Create Series
pd.Series([1, 2, 3, 4, 5])
pd.Series(np.arange(1, 6), index=list("abcde"))
pd.Series({"a": 100, "b": 50, "c": 120})
pd.Series("hi", index=list("12345"))

# Create DataFrame
pd.DataFrame({
    "col_1": [1, 2, 3, 4, 5],
    "col_2": np.arange(1, 6),
    "col_3": pd.Series(np.arange(1, 7), index=list("abc123")),
}, index=list("abcde"))

pd.DataFrame(
    [
        {"a": 1, "b": 2},
        {"b": 10, "c": 5},
        {"a": 55, "b": 489, "c": 32, "d": 590},
    ],
    index=["first", "second", "third"],
    columns=list("ab")
)

pd.DataFrame(
    np.arange(10).reshape(2, 5),  # [[0,1,2,3,4], [5,6,7,8,9]]
    index=pd.date_range("20200101", periods=2),
    columns=list("abcde"))

# Viewing
df.head(2)
df.tail(3)
df.index
df.columns
df.to_numpy()
df.sort_index()
df.sort_values("col_name")

Details 🔥

Selection

	Single Column	Multiple Columns	Continuous Columns	All Columns
Single Row	`df.loc[row, column]` or `df.at[row, column]`	`df.loc[row, [column, column]]`	`df.loc[row, column:column]`	`df.loc[row]`
Multiple Rows	`df.loc[[row, row], column]`	`df.loc[[row, row], [column, column]]`	`df.loc[[row, row], column:column]`	`df.loc[[row, row]]`
Continuous Rows	`df.loc[row:row, column]`	`df.loc[row:row, [column, column]]`	`df.loc[row:row, column:column]`	`df[row:row]`
All Rows	`df[column]`	`df[[column, column]]` or `df.loc[:, [column, column]]`	`df.loc[:, column:column]`	`df`

df["col1"]
df[["col1", "col2"]]
df["row1":"row5"]

df.loc["row1", "col1"]            # df.iloc[0, 0]
df.at["row1", "col1"]             # df.iat[0, 0]
df.loc["row1", ["col1", "col2"]]  # df.iloc[0, [0, 1]]
df.loc["row1", "col1":"col5"]     # df.iloc[0, 0:5] (label slices include the end, position slices do not)
df.loc[["row1", "row2"]]          # df.iloc[[0, 1]]
df.loc["row1":"row5", "col1"]     # df.iloc[0:5, 0]

df[(df["col1"] > 18)]
df[(df > 6) & (df < 25)]
df[df["col1"].isin([10, 15, 0])]

df.iloc works like df.loc but selects by integer position instead of label.
df.iat works like df.at but selects by integer position instead of label.
Details 🔥

Setting, Deleting, and Handling

# Modify columns
df["col1"] += 10
df.loc[:, "col1"] = "bar"
df.loc[:, ["col1", "col3"]] = np.arange(12).reshape(6, 2)

# Modify single element
df.loc["row1", "col1"] = 0
df.iloc[0, 0] = 1

# Modify by boolean indexing
df[df < 100] = -df

# Append
df["total"] = df.sum(axis=1).to_numpy()
df["gt"] = df["total"] > 50000
df["foo"] = "bar"

# Insert
df.insert(0, "col0", df["col2"][:2])  # col_index, col_name, values

# Delete column
del df["total"]
df.drop(columns=["foo"], inplace=True)  # same as `df.drop(["foo"], axis=1)`
gt50000 = df.pop("gt")                  # remove the "gt" column and return it

# Delete row
df.drop(["e", "d"], inplace=True)

# Handle NaN
miss_df.dropna(how='any')
miss_df.fillna(value=10000000)

Details 🔥

Operations and Apply Functions

# Arithmetic
df + df2
df - df.iloc[0]
1 / df

# Numpy
np.sqrt(df)
np.max(df, axis=1)

# Built-in
df.mean()
df.max(axis=1)

# Apply
df.apply(np.cumsum, axis=1)
df.apply(lambda x: x.sum() / x.size)  # x is each column (a Series)

# Series
s.value_counts()
s.str.upper()
s.str.split("-").str.get(0)

Details 🔥

Concat and Merge

# Concat rows
pd.concat([df[:3], df.iloc[7:, :2]])

# Merge two DataFrame
pd.merge(df, df2, on="name", how="right")

Details 🔥

Grouping and Categorical Data Type

# Groupby
df.groupby("col_A").sum()
df.groupby(["col_A", "col_B"]).max()

# Categorical - discrete
df["grade"] = df["grade"].astype("category")
# assigning to .cat.categories no longer works in pandas 2.x; rename instead
df["grade"] = df["grade"].cat.rename_categories(["Bad", "Good", "Excellent"])
df.sort_values(by="grade")
df.groupby("grade").size()

# Categorical - continuous
df["grade-labels"] = pd.cut(df["score"], bins=range(0, 120, 20), labels=list("EDCBA"))

Details 🔥

Other Pandas Tricks

# Rename Columns
df.columns = ["col_one", "col_two"]
df = df.add_prefix("Xx_")
df = df.add_suffix("_xX")
df.columns = df.columns.str.replace("Xx", "Oo")
df.columns = df.columns.str.replace("xX", "oO")

# Reverse Row or Column Order
df.loc[::-1].reset_index(drop=True)  # reverse rows
df.loc[:, ::-1]                      # reverse columns

# Split DataFrame into 2 random subsets
sub1 = df.sample(frac=0.75, random_state=42)
sub2 = df.drop(sub1.index)
sub1 = sub1.sort_index()  # reorder the rows; assigning a sorted index would mislabel them
sub2 = sub2.sort_index()

# Filter by Category (or Largest Category)
df[df.genre.isin(["A", "D"])]
df[~df.genre.isin(["A", "D"])]
df[df.genre.isin(df.genre.value_counts().nlargest(1).index)]

# Split String into Multiple Columns
df[["first", "last"]] = df["name"].str.split(' ', expand=True)
df["city"] = df["location"].str.split(", ", expand=True)[0]

# Change Display Options (Not Change Data)
pd.set_option("display.float_format", "${:.2f}".format)
pd.reset_option("display.float_format")

# Style a DataFrame
# (.hide(axis="index") replaces .hide_index(), which was removed in pandas 2.0)
style = {"Date": "{:%Y/%m/%d}", "Value": "${:d}", "Volume": "{:,}"}
df.style.format(style) \
    .hide(axis="index") \
    .highlight_max("Value", color="red") \
    .highlight_min("Value", color="green") \
    .bar("Area", color="orange", align="zero") \
    .background_gradient(subset="Volume", cmap="Greens") \
    .set_caption("Random Chart")

Details 🔥

Matplotlib (Pyplot)

Basic (Single Plot)

import matplotlib.pyplot as plt

# with this magic function, we can skip `plt.show()`
%matplotlib inline

plt.plot(np.sin(np.linspace(0, 10, 100)), "*-b",
         lw=2, markersize=5, label="sin(x)")
plt.plot(np.log(np.arange(100)), c="g", ls="--", marker=".",
         lw=2, markersize=5, label="log(x)")

plt.xlabel("X here")
plt.ylabel("Y here")
plt.title("sin(x) and log(x)")
plt.grid()
plt.legend()

plt.text(x=70, y=-1, s="hahahaha")
plt.annotate("wow \nmax", xy=(16, 1), xytext=(40, 0.9),
             arrowprops={"facecolor": "orange", "shrink": 0.05})
plt.annotate("wow \nmax again", xy=(78, 1), xytext=(95, 0.9),
             arrowprops={"facecolor": "red", "shrink": 0.05})

Details 🔥

Multiple Figures and Axes

# Object-oriented style
fig1, ax = plt.subplots()
ax.plot(...)

fig2, axs = plt.subplots(2, 1)
axs[0].plot(...)
axs[1].plot(...)

# Pyplot style
plt.figure(1)
plt.title("Figure 1")

plt.figure(2)
plt.subplot(311)
plt.title("Figure 2")
plt.subplot(323)
plt.subplot(324)
plt.subplot(337)
plt.subplot(338)
plt.subplot(339)

Details 🔥

Line Plots and Filling Area

years = [1.1, 1.3, 1.5, 2.0, 2.2, ...]
salary = [39343.00, 46205.00, 37731.00, 43525.00, 39891.00, ...]
salary_mean = np.mean(salary)

# Line Plots
plt.plot(years, salary,
         marker="o",
         markersize=5,
         lw=2,
         ls="-",
)

# Filling Areas
plt.fill_between(years, salary, salary_mean,
                 where=(np.asarray(salary) > salary_mean),  # a plain list cannot be compared element-wise
                 alpha=.4,
                 color="green",
                 edgecolor="black",
                 interpolate=True,
                 label="On Average"
)

Details 🔥

Time Series

import matplotlib.dates as mdates

dates = np.arange(np.datetime64("2021-01-01"), np.datetime64("2021-01-22"))
prices = np.random.default_rng(42).normal(500, 30, len(dates))

plt.gca().xaxis.set_major_formatter(mdates.DateFormatter("%a, %d %m"))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=7))
plt.gca().xaxis.set_minor_locator(mdates.DayLocator())

# plt.plot handles dates directly; plt.plot_date is deprecated in recent Matplotlib
plt.plot(dates, prices, ls="solid", c="orange", marker="^", markersize=10)
plt.grid()
plt.tight_layout()

Details 🔥

Scatter Plots

temperature = [14.2, 16.4, 11.9, 15.2, ...]
ice_cream_sales = [215, 325, 185, 332, ...]
colors = np.array(ice_cream_sales) / np.linalg.norm(ice_cream_sales)

plt.scatter(temperature, ice_cream_sales,
            s=ice_cream_sales,  # set the size according to the prices of the ice cream
            c=colors,           # set the colors according to the prices of the ice cream
            cmap="Greens",      # preferred color type
            edgecolor="black",  # the edge color of points
            lw=0.5,             # the edge width of points
            alpha=.75,
)

plt.xlabel("temperature")
plt.ylabel("ice cream price")
plt.yscale("log")  # use log scale on y-axis to handle outliers

cbar = plt.colorbar()
cbar.set_label("Expensive")
plt.tight_layout()

Details 🔥

Bar Charts

# Bar Charts
ages = [25, 26, 27, 28, 29, ...]
salary_all = [38496, 42000, 46752, 49320, 53200, ...]
# salary_py and salary_js are assumed to be lists of the same length as salary_all

index = np.arange(len(ages))
width = 0.25

plt.bar(index - width, salary_all, width=width, label="All Devs")
plt.bar(index, salary_py, width=width, label="Python")
plt.bar(index + width, salary_js, width=width, label="JavaScript")

plt.xticks(ticks=index, labels=ages)
plt.title("Median Salary (USD) by Age")
plt.xlabel("Ages")
plt.ylabel("Median Salary (USD)")
plt.legend()
plt.tight_layout()

# Horizontal Bar Charts
language = ['JavaScript', 'HTML/CSS', 'SQL', 'Python', ...]
popularity = [59219, 55466, 47544, 36443, ...]

plt.barh(language, popularity)
plt.title("Most Popular Languages")
plt.xlabel("Number of People Who Use")
plt.tight_layout()

Details 🔥

Pie Charts

grade = ["A", "B", "C", "D", "E"]
number = [10, 18, 23, 8, 5]
explode = [0.1, 0, 0, 0, 0]

plt.pie(number,
        labels=grade,
        shadow=True,
        autopct="%1.1f%%",
        pctdistance=0.6,
        startangle=90,
        explode=explode
)
plt.title("Test Grade")
plt.tight_layout()

Details 🔥

Histograms

height_stats = np.random.default_rng(42).normal(160, 15, 1000)
interval_bin = [120, 130, 140, 150, 160, 170, 180, 190, 200]

plt.hist(height_stats, bins=interval_bin, edgecolor="black", lw=1, density=True)

# Plot the probability density curve
import scipy.stats as ss
density = ss.gaussian_kde(height_stats)  # the ss.kde submodule path is deprecated
index = np.arange(120, 200)
plt.plot(index, density.evaluate(index),
         color="pink", lw=3, ls="--", label="Probability Density")

# Plot the mean line
plt.axvline(np.mean(height_stats), c="orange", lw=5, label="Height Mean")

plt.legend()
plt.title("Height Stats")
plt.xlabel("Heights")
plt.ylabel("Probability Density")
plt.tight_layout()

Details 🔥

Stack Plots

years = [1950, 1960, 1970, 1980, 1990, 2000, 2010, 2018]
population_by_continent = {
    'africa': [228, 284, 365, 477, 631, 814, 1044, 1275],
    'americas': [340, 425, 519, 619, 727, 840, 943, 1006],
    'asia': [1394, 1686, 2120, 2625, 3202, 3714, 4169, 4560],
    'europe': [220, 253, 276, 295, 310, 303, 294, 293],
    'oceania': [12, 15, 19, 22, 26, 31, 36, 39],
}

y = population_by_continent.values()
labels = population_by_continent.keys()
colors = ["#96ceb4", "#ffeead", "#ff6f69", "#ffcc5c", "#88d8b0"]

plt.style.use("seaborn-v0_8")  # this style was named "seaborn" before Matplotlib 3.6
plt.stackplot(years, y, labels=labels, colors=colors)
plt.legend(loc="upper left")
plt.title("World Population")
plt.xlabel("Year")
plt.ylabel("Population (Millions)")
plt.tight_layout()

Details 🔥

Image

import matplotlib.image as mpimg

# recent Matplotlib versions may not accept a URL here; download the file first if this fails
img = mpimg.imread("https://www.catster.com/wp-content/uploads/1970/01/Am-ShortHair-breed_getty1140883355-768x513.png")
plt.imshow(img)

# Applying pseudocolor schemes
plt.imshow(img[..., 0], cmap="gray")
plt.colorbar()

# Flipping Photos Vertically or Horizontally
plt.imshow(img[::-1])     # Reverse at the first axis == vertical flip
plt.imshow(img[:, ::-1])  # Reverse at the second axis == horizontal flip

Details 🔥

Styles, Colors, Colormaps

import sklearn.linear_model

# Switch Style
plt.style.use("seaborn-v0_8-pastel")  # named "seaborn-pastel" before Matplotlib 3.6

# Data
x = np.random.default_rng(42).integers(0, 100, 100)
y = (2*x+1) * np.random.default_rng(43).normal(5, 1, 100)

regr = sklearn.linear_model.LinearRegression()
regr.fit(x[:, np.newaxis], y[:, np.newaxis])
regr_line = regr.predict(x[:, np.newaxis])

# Plotting with fancy color and colormap
plt.scatter(x, y, c=y, alpha=0.25, cmap="plasma")
plt.plot(x, regr_line, color="darkviolet", alpha=0.5, lw=5, ls="-", label="regression line")

plt.title("Linear Regression Test")
plt.xlabel("X")
plt.ylabel("y")
plt.legend()
plt.colorbar()

Seaborn

Basic (Seaborn)

import seaborn as sns

x = np.array(range(1, 5))
y = x**2
df = pd.DataFrame(zip(x, y), columns=["col_1", "col_2"])

# Plotting with data parameter
def plot():
    sns.lineplot(x="col_1", y="col_2", data=df)

# Seaborn Styles
sns.set_style("white")

# Scaling the plots
sns.set_context("paper", font_scale=1.5)

# Changing the figure Size
plt.figure(figsize=(8, 4))  # width, height

# Using Seaborn with Matplotlib
plt.subplot(211)
plt.title("Square X")
plot()

# Seaborn Styles Context Manager
with sns.axes_style("darkgrid"):
    plt.subplot(212)
    plot()

plt.tight_layout()

Details 🔥

Color Palette

# Sequential Palette
palette = sns.color_palette("YlGn")
sns.palplot(palette)
plt.title("YlGn Colormap (Sequential)")

# Diverging Palette
palette = sns.color_palette("coolwarm")
sns.palplot(palette)
plt.title("coolwarm Colormap (Diverging)")

# Qualitative Palette
palette = sns.color_palette("Pastel2")
sns.palplot(palette)
plt.title("Pastel2 Colormap (Qualitative)")

Details 🔥

Multiple Plots

Details 🔥

Using Matplotlib

data = sns.load_dataset("iris")

plt.figure(figsize=(11, 3))
plt.subplot(121)
sns.lineplot(x="sepal_length", y="sepal_width", data=data)
plt.subplot(122)
sns.lineplot(x="petal_length", y="petal_width", data=data)

Using Seaborn

FacetGrid

grid = sns.FacetGrid(data, col="species")
grid.map(plt.plot, "sepal_width")

PairGrid

x_vars = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
y_vars = ["species"]
grid = sns.PairGrid(data, x_vars=x_vars, y_vars=y_vars)
grid.map(sns.barplot)

windsuzu/PythonUniverse