Python Cheat Sheet

This cheat sheet provides a quick reference for essential Python concepts, focusing on practical use cases for data analysis and programming. It covers fundamental topics like variables, arithmetic, data types, and expands into key areas such as lists, dictionaries, functions, and control flow.

Examples throughout the cheat sheet are drawn from the Mobile App Store Dataset and illustrate common operations, from basic string manipulation to building frequency tables and working with dates and times.

Each section is designed to give you a concise, actionable overview of Python's core functionality in the context of real-world data.

Table of Contents

Basics: comments, arithmetical operations, variables, data types
Data Structures: lists, dictionaries, frequency tables
Functions: definition, arguments, return statements, parameters
Strings: formatting, transforming, cleaning
Control Flow: if, else, elif, for loops
OOP Basics: class, instantiate, init, method
Dates and Times: datetime, date, time, strftime, strptime
Python Cheat Sheet Free resources at dataquest.io/guide/introduction-to-python-tutorial
Basics

Syntax for: Comments
How to use:
    # print(1 + 2)
    print(5 * 10)
    # This program will only print 50
Explained: We call the sequence of characters that follows the # a code comment; any code that follows # will not be executed

Syntax for: Arithmetical Operations
How to use:
    1 + 2     # output: 3
Explained: Addition

How to use:
    4 - 5     # output: -1
Explained: Subtraction

How to use:
    30 * 2    # output: 60
Explained: Multiplication

How to use:
    20 / 3    # output: 6.666666666666667
Explained: Division

How to use:
    4 ** 3    # output: 64
Explained: Exponentiation

How to use:
    (4 * 18) ** 2 / 10    # output: 518.4
Explained: Use parentheses to control the order of operations

Syntax for: Initializing Variables
How to use:
    cost = 20
    total_cost = 20 + 2**5
    currency = 'USD'
    1_app = 'Facebook'    # this will cause an error
Explained: Variable names can only contain letters, numbers, and underscores; they cannot begin with a number

Syntax for: Updating Variables
How to use:
    x = 30
    print(x)    # 30 is printed
    x = 50
    print(x)    # 50 is printed
Explained: To update a variable, use the = assignment operator to set a new value

Syntax for: Operation Shortcuts
How to use:
    x += 2     # Addition
    x -= 2     # Subtraction
    x *= 2     # Multiplication
    x /= 2     # Division
    x **= 2    # Exponentiation
Explained: Augmented assignment operators: used to update a variable in place without repeating the variable name; for example, instead of writing x = x + 2, you can use x += 2

Syntax for: Data Types
How to use:
    x = [1, 2, 3]
    y = 4
    print(type(x))      # output: <class 'list'>
    print(type(y))      # output: <class 'int'>
    print(type('4'))    # output: <class 'str'>
Explained: Use the type() function to determine the data type of a value or variable.

Syntax for: Converting Data Types
How to use:
    int('4')        # casting a string to an integer
    str(4)          # casting an integer as a string
    float('4.3')    # casting a string as a float
    str(4.3)        # casting a float as a string
Explained: Converting between data types is also referred to as casting
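As a rough illustration of how the shortcuts and casting rules above behave together, the sketch below chains a few augmented assignments and shows one cast that fails. The starting value and the variable names are arbitrary choices for this example and do not come from the cheat sheet.

```python
x = 10
x += 2     # x is now 12
x **= 2    # x is now 144
x /= 2     # x is now 72.0; the / operator always produces a float

as_int = int('4')          # 4
as_float = float('4.3')    # 4.3

# int() cannot parse a decimal string directly, so this raises ValueError
try:
    int('4.3')
    cast_failed = False
except ValueError:
    cast_failed = True

# Casting through float first works; int() then drops the fractional part
truncated = int(float('4.3'))    # 4
```

Note that division turns an integer variable into a float even when the result is a whole number.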
Data Structures

Syntax for: Lists
How to use:
    a_list = [1, 2]
    a_list.append(3)
    print(a_list)    # output: [1, 2, 3]
Explained: Creating a list and appending a value to it

How to use:
    row_1 = ['Facebook', 0.0, 'USD', 2974676]
    row_2 = ['Instagram', 4.5, 'USD', 2161558]
Explained: Creating a list of data points; lists can store multiple data types at the same time

Syntax for: Indexing
How to use:
    print(row_1[0])    # output: 'Facebook'
    print(row_2[1])    # output: 4.5
    print(row_1[3])    # output: 2974676
    print(row_2[3])    # output: 2161558
Explained: Retrieving an element from a list using each item's index number; note that list indexing begins at 0

Syntax for: Negative Indexing
How to use:
    print(row_1[-1])    # output: 2974676
    print(row_2[-1])    # output: 2161558
    print(row_1[-3])    # output: 0.0
    print(row_2[-4])    # output: 'Instagram'
Explained: Negative list indexing works by counting backwards from the last element, beginning with -1

How to use:
    num_ratings = [row_1[-1], row_2[-1]]
    print(num_ratings)
    # output: [2974676, 2161558]
Explained: Retrieving multiple list elements to create a new list

Syntax for: List Slicing
How to use:
    row_3 = ['Clash of Clans', 0.0,
             'USD', 2130805, 4.5]
    print(row_3[:2])
    # output: ['Clash of Clans', 0.0]
    print(row_3[1:4])
    # output: [0.0, 'USD', 2130805]
    print(row_3[3:])
    # output: [2130805, 4.5]
Explained: List slicing includes the start index but excludes the end index; when the start is omitted, the slice begins at the start of the list; when the end is omitted, it continues to the end of the list

Syntax for: List of Lists
How to use:
    from csv import reader
    opened_file = open('AppleStore.csv')
    read_file = reader(opened_file)
    apps_data = list(read_file)
Explained: Opening a dataset file and using it to create a list of lists

How to use:
    row_1 = ['Facebook', 'USD', 2974676, 3.5]
    row_2 = ['Instagram', 'USD', 2161558, 8.2]
    row_3 = ['Clash', 0.0, 'USD', 2130805, 4.5]
    row_4 = ['Fruit', 1.99, 'USD', 698516, 9.1]
    lists = [row_1, row_2, row_3, row_4]
Explained: Creating a list of lists by initializing a new list whose elements are themselves lists

Syntax for: Indexing a List of Lists
How to use:
    first_row_first_element = lists[0][0]
    # output: 'Facebook'
    second_row_third_element = lists[1][2]
    # output: 2161558
    third_row_last_element = lists[-2][4]
    # output: 4.5
    last_row_last_element = lists[-1][-1]
    # output: 9.1
Explained: Retrieving an element from a list of lists by first selecting the row, then the element within that row

Syntax for: Slicing List of Lists
How to use:
    first_two_rows = lists[:2]
    last_two_rows = lists[-2:]
    all_but_first_row = lists[1:]
    second_row_partial = lists[1][:3]
    # output: ['Instagram', 'USD', 2161558]
    last_row_partial = lists[-1][1:-2]
    # output: [1.99, 'USD']
Explained: Slicing lists of lists allows extracting full rows or specific elements from a single row; positive indices select from the start, and negative indices select from the end
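The file-reading pattern above can be tried without the real dataset. The sketch below is a minimal stand-in that feeds csv.reader from an in-memory string; the three columns and their values are hypothetical and are not the actual layout of AppleStore.csv.

```python
import csv
import io

# Hypothetical sample standing in for the contents of AppleStore.csv
sample = "track_name,price,currency\nFacebook,0.0,USD\nInstagram,0.0,USD\n"

# io.StringIO behaves like an opened text file, so csv.reader accepts it
with io.StringIO(sample) as opened_file:
    apps_data = list(csv.reader(opened_file))

header = apps_data[0]    # first row holds the column names
rows = apps_data[1:]     # slicing skips the header row

# csv.reader returns every field as a string, so numbers must be cast
first_price = float(rows[0][1])
```

Because every field arrives as a string, numeric columns need a cast such as float() before any arithmetic.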
Syntax for: Dictionaries
How to use:
    # First way
    dictionary = {'key_1': 1, 'key_2': 2}
    # Second way
    dictionary = {}
    dictionary['key_1'] = 1
    dictionary['key_2'] = 2
Explained: Creating a dictionary by defining key:value pairs at time of initialization (first way) or by creating an empty dictionary and setting the value for each key (second way)

How to use:
    dictionary = {'key_1': 100, 'key_2': 200}
    dictionary['key_1']    # outputs 100
    dictionary['key_2']    # outputs 200
Explained: Retrieve individual dictionary values by specifying the key; keys can be strings, numbers, or tuples, but not lists or sets

How to use:
    dictionary = {'key_1': 100, 'key_2': 200}
    'key_1' in dictionary    # outputs True
    'key_5' in dictionary    # outputs False
    100 in dictionary        # outputs False
Explained: Use the in operator to check for dictionary key membership

How to use:
    dictionary = {'key_1': 100, 'key_2': 200}
    dictionary['key_1'] += 600
    dictionary['key_2'] = 400
    print(dictionary)
    # output: {'key_1': 700, 'key_2': 400}
Explained: Update dictionary values by specifying the key and assigning a new value

Syntax for: Frequency Tables
How to use:
    frequency_table = {}
    for row in a_data_set:
        a_data_point = row[5]
        if a_data_point in frequency_table:
            frequency_table[a_data_point] += 1
        else:
            frequency_table[a_data_point] = 1
Explained: Builds a frequency table by counting occurrences of values in the 6th column (row[5]) of a_data_set, incrementing the count if the value exists, or adding it if not

Syntax for: Defined Intervals
How to use:
    data_sizes = {'0 - 10 MB': 0,
                  '10 - 50 MB': 0,
                  '50 - 100 MB': 0,
                  '100 - 500 MB': 0,
                  '500 MB +': 0}
    for row in apps_data[1:]:
        data_size = float(row[2])
        if data_size <= 10000000:
            data_sizes['0 - 10 MB'] += 1
        elif 10000000 < data_size <= 50000000:
            data_sizes['10 - 50 MB'] += 1
        elif 50000000 < data_size <= 100000000:
            data_sizes['50 - 100 MB'] += 1
        elif 100000000 < data_size <= 500000000:
            data_sizes['100 - 500 MB'] += 1
        elif data_size > 500000000:
            data_sizes['500 MB +'] += 1
Explained: Categorizes app sizes from apps_data into predefined ranges (e.g., '0 - 10 MB') and increments the corresponding count based on each app's size inside the data_sizes dictionary
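The rule about which values may serve as dictionary keys is easier to see in action. The following sketch uses a hypothetical ratings dictionary (not taken from the cheat sheet) to show that strings, numbers, and tuples are accepted as keys while a list is rejected with a TypeError.

```python
ratings = {}
ratings['Facebook'] = 3.5           # string key
ratings[2017] = 'release year'      # integer key
ratings[('Games', 'USD')] = 10      # tuple key

# Lists are mutable and therefore unhashable, so they cannot be keys
try:
    ratings[['Games', 'USD']] = 10
    list_key_rejected = False
except TypeError as error:
    list_key_rejected = True
    message = str(error)    # the exact wording varies by Python version

# The in operator checks keys only, never values
has_tuple_key = ('Games', 'USD') in ratings
has_value_as_key = 3.5 in ratings
```

A common workaround, when a list-like key is needed, is to convert the list to a tuple first.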
Functions

Syntax for: Basic Functions
How to use:
    def square(number):
        return number**2

    print(square(5))    # output: 25
Explained: Create a function with a single parameter: number

How to use:
    def add(x, y):
        return x + y

    print(add(3, 14))    # output: 17
Explained: Create a function with more than one parameter: x and y

How to use:
    def freq_table(list_of_lists, index):
        frequency_table = {}
        for row in list_of_lists:
            value = row[index]
            if value in frequency_table:
                frequency_table[value] += 1
            else:
                frequency_table[value] = 1
        return frequency_table
Explained: This function creates a frequency table for any given column index of the provided list_of_lists

Syntax for: Arguments
How to use:
    def subtract(a, b):
        return a - b

    print(subtract(a=10, b=7))    # output: 3
    print(subtract(b=7, a=10))    # output: 3
    print(subtract(10, 7))        # output: 3
Explained: Use named arguments and positional arguments

Syntax for: Helper Functions
How to use:
    def find_sum(lst):
        a_sum = 0
        for element in lst:
            a_sum += float(element)
        return a_sum

    def find_length(lst):
        length = 0
        for element in lst:
            length += 1
        return length

    def mean(lst):
        return find_sum(lst) / find_length(lst)

    print(mean([1, 2, 4, 6, 2]))    # output: 3.0
Explained: Define helper functions to find the sum and length of a list; the mean function reuses these to calculate the average by dividing the sum by the length
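To show how the freq_table function might be called, here is a small sketch that repeats its definition and applies it to two different columns. The sample_rows data and its three-column layout are made up for illustration.

```python
def freq_table(list_of_lists, index):
    """Count how often each value appears in the given column."""
    frequency_table = {}
    for row in list_of_lists:
        value = row[index]
        if value in frequency_table:
            frequency_table[value] += 1
        else:
            frequency_table[value] = 1
    return frequency_table

# Hypothetical rows in the form [name, genre, currency]
sample_rows = [
    ['Facebook', 'Social Networking', 'USD'],
    ['Instagram', 'Photo & Video', 'USD'],
    ['Clash of Clans', 'Games', 'USD'],
    ['Fruit Ninja Classic', 'Games', 'USD'],
]

genre_counts = freq_table(sample_rows, 1)             # positional argument
currency_counts = freq_table(sample_rows, index=2)    # named argument
```

Passing the column index as a parameter is what lets one function serve every column, instead of hard-coding row[5] as in the earlier loop.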
Syntax for: Multiple Arguments
How to use:
    def price(item, cost):
        return "The " + item + " costs $" + str(cost) + "."

    print(price("chair", 40.99))
    # output: 'The chair costs $40.99.'
Explained: Define a function that accepts multiple arguments and returns a formatted string combining both inputs

How to use:
    def price(item, cost):
        print("The " + item + " costs $" + str(cost) + ".")

    price("chair", 40.99)
    # output: 'The chair costs $40.99.'
Explained: Similar to the previous function, but uses print() to display the string immediately rather than returning it for further use

Syntax for: Default Arguments
How to use:
    def add_value(x, constant=3.14):
        return x + constant

    print(add_value(6, 3))    # output: 9
    print(add_value(6))       # output: 9.14
Explained: Define a function with a default argument; if no second argument is provided, the default value is used in the calculation

Syntax for: Multiple Return Statements
How to use:
    def sum_or_difference(a, b, return_sum=True):
        if return_sum:
            return a + b
        else:
            return a - b

    print(sum_or_difference(10, 7))
    # output: 17
    print(sum_or_difference(10, 7, False))
    # output: 3
Explained: This function uses multiple return statements to either return the sum or the difference of two values, depending on the return_sum argument, which defaults to True

How to use:
    def sum_or_difference(a, b, return_sum=True):
        if return_sum:
            return a + b
        return a - b

    print(sum_or_difference(10, 7))
    # output: 17
    print(sum_or_difference(10, 7, False))
    # output: 3
Explained: This function is similar to the previous one but omits the else clause, returning the difference directly when return_sum is False, simplifying the logic

Syntax for: Returning Multiple Values
How to use:
    def sum_and_difference(a, b):
        a_sum = a + b
        difference = a - b
        return a_sum, difference

    sum_1, diff_1 = sum_and_difference(15, 10)
Explained: This function returns multiple values (sum and difference) at once by separating them with commas, allowing them to be unpacked into separate variables when called
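It may help to see what a comma-separated return actually produces. In the sketch below, which reuses sum_and_difference with the same arguments, the function hands back a single tuple, and unpacking is just one way to consume it; the names result and total are introduced only for this example.

```python
def sum_and_difference(a, b):
    a_sum = a + b
    difference = a - b
    return a_sum, difference    # the commas build a single tuple

# Without unpacking, the caller receives the tuple itself
result = sum_and_difference(15, 10)

# Unpacking assigns each tuple element to its own variable
sum_1, diff_1 = sum_and_difference(15, 10)

# Individual elements can also be read by index
total = result[0]
```

Unpacking requires the number of variables on the left to match the number of returned values; otherwise Python raises a ValueError.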
Strings

Syntax for: Formatting
How to use:
    continents = "France is in {} and China is in {}".format("Europe", "Asia")
    # France is in Europe and China is in Asia
Explained: Insert values by order into placeholders for simple string formatting

How to use:
    squares = "{0} times {0} equals {1}".format(3, 9)
    # 3 times 3 equals 9
Explained: Use indexed placeholders to repeat or position values

How to use:
    population = "{name}'s population is {pop} million".format(name="Brazil", pop=209)
    # Brazil's population is 209 million
Explained: Insert values into a string by name: assign values to named placeholders using variable names

How to use:
    two_decimal_places = "I own {:.2f}% of the company".format(32.5548651132)
    # I own 32.55% of the company
Explained: Format a float to two decimal places for precise output

How to use:
    india_pop = "The approx pop of {} is {:,}".format("India", 1324000000)
    # The approx pop of India is 1,324,000,000
Explained: Insert a number with commas as a thousand separator by position

How to use:
    balance_string = "Your bank balance is ${:,.2f}".format(12345.678)
    # Your bank balance is $12,345.68
Explained: Format a number with commas and two decimal places for currency formatting

Syntax for: String Cleaning
How to use:
    green_ball = "red ball".replace("red", "green")
    # green ball
Explained: Replace parts of a string by specifying the old and new values

How to use:
    friend_removed = "hello there friend!".replace(" friend", "")
    # hello there!
Explained: Remove a specified substring from a string by replacing it with an empty string

How to use:
    bad_chars = ["'", ",", ".", "!"]
    string = "We'll remove apostrophes, commas, periods, and exclamation marks!"
    for char in bad_chars:
        string = string.replace(char, "")
    # Well remove apostrophes commas periods and exclamation marks
Explained: Use a loop to remove multiple specified characters from a string by replacing them with an empty string

How to use:
    print("hello, my friend".title())
    # Hello, My Friend
Explained: Capitalize the first letter of each word in the string

How to use:
    split_on_dash = "1980-12-08".split("-")
    # ['1980', '12', '08']
Explained: Split a string into a list of substrings based on the specified delimiter

How to use:
    first_four_chars = "This is a long string."[:4]
    # This
Explained: Slice the string to return the first four characters; missing indices default to the start or end of the string

How to use:
    superman = "Clark" + " " + "Kent"
    # Clark Kent
Explained: Concatenate strings using the + operator to join them with a space
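Cleaning and formatting are often used as a pair: strip unwanted characters, cast to a number, then format the result for display. The sketch below walks through that round trip on a hypothetical price string; the value and the variable names are assumptions made for this example.

```python
raw_price = "$1,299.99"    # hypothetical raw value, as it might appear in a file

# Remove the characters that would make float() fail
cleaned = raw_price
for char in ["$", ","]:
    cleaned = cleaned.replace(char, "")

price = float(cleaned)    # now usable in arithmetic

# Format the number back into a currency string for display
formatted = "${:,.2f}".format(price)
```

The cleaned string is "1299.99", and formatting the resulting float reproduces the original display form.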
Control Flow

Syntax for: For Loops
How to use:
    row_1 = ['Facebook', 0.0, 'USD', 2974676]
    for element in row_1:
        print(element)
Explained: With each iteration, this loop will print an element from row_1, in order

How to use:
    rating_sum = 0
    for row in apps_data[1:]:
        rating = float(row[7])
        rating_sum = rating_sum + rating
Explained: Convert a column of strings (row[7]) in a list of lists (apps_data) to a float and keep a running sum of ratings

How to use:
    apps_names = []
    for row in apps_data[1:]:
        name = row[1]
        apps_names.append(name)
Explained: Append values with each iteration of a for loop

Syntax for: Conditional Statements
How to use:
    price = 0
    print(price == 0)    # Outputs True
    print(price == 2)    # Outputs False
Explained: Use comparison operators to check if a value equals another, returning True or False

How to use:
    print('Games' == 'Music')          # Outputs False
    print('Games' != 'Music')          # Outputs True
    print([1,2,3] == [1,2,3])          # Outputs True
    print([1,2,3] == [1,2,3,4])        # Outputs False
Explained: Compare strings and lists using == for equality and != for inequality, returning True or False

Syntax for: If Statements
How to use:
    if True:
        print('This will always be printed.')
Explained: The condition True always executes the code inside the if block

How to use:
    if True:
        print(1)
    if 1 == 1:
        print(2)
        print(3)
Explained: Both conditions evaluate to True, so all print statements are executed

How to use:
    if True:
        print('First Output')
    if False:
        print('Second Output')
    if True:
        print('Third Output')
Explained: Only the blocks with True conditions are executed, so the second print statement is skipped

Syntax for: Else Statements
How to use:
    if False:
        print(1)
    else:
        print('The condition above was false.')
Explained: The code in the else clause is always executed when the if condition is False
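Loops and conditionals are usually combined when working through a dataset. The following sketch, using a made-up two-column apps_data rather than the real App Store file, separates free apps from paid ones while keeping a running total of paid prices.

```python
# Hypothetical dataset: a header row followed by [name, price] rows
apps_data = [
    ['track_name', 'price'],
    ['Facebook', '0.0'],
    ['Clash of Clans', '0.0'],
    ['Fruit Ninja Classic', '1.99'],
]

free_apps = []
paid_total = 0.0

for row in apps_data[1:]:        # skip the header row
    price = float(row[1])        # prices are stored as strings
    if price == 0:
        free_apps.append(row[0])
    else:
        paid_total += price
```

Each row passes through exactly one branch, so the two results together account for every app in the list.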
How to use:
    if "car" in "carpet":
        print("The substring was found.")
    else:
        print("The substring was not found.")
Explained: The in operator checks if a substring exists in a string, executing the corresponding if or else block

Syntax for: Elif Statements
How to use:
    if 3 == 1:
        print('3 equals 1.')
    elif 3 < 1:
        print('3 is less than 1.')
    else:
        print('Both conditions above are false.')
Explained: The elif statement allows for multiple conditions to be tested; if the if condition is False, the elif condition is checked, and if both are False, the else block is executed

Syntax for: Multiple Conditions
How to use:
    if 3 > 1 and 'data' == 'data':
        print('Both conditions are true!')
    if 10 < 20 or 4 >= 5:
        print('At least one condition is true.')
Explained: Use and to require both conditions to be True, and or to require at least one condition to be True

How to use:
    if (20 > 3 and 2 != 1) or 'Games' == 'Game':
        print('At least one condition is true.')
Explained: Use parentheses to group conditions and control the order of evaluation in complex logical expressions

Object-Oriented Programming Basics

Syntax for: Defining Classes
How to use:
    class MyClass:
        pass
Explained: Define an empty class

Syntax for: Instantiating Class Objects
How to use:
    class MyClass:
        pass

    mc_1 = MyClass()
Explained: Instantiate an object from the class by calling the class name followed by parentheses

Syntax for: Setting Class Attributes
How to use:
    class MyClass:
        def __init__(self, param_1):
            self.attribute = param_1

    mc_2 = MyClass("arg_1")
    # mc_2.attribute is set to "arg_1"
Explained: Use the __init__ method to initialize an object's attributes during instantiation by passing arguments

Syntax for: Defining Class Methods
How to use:
    class MyClass:
        def __init__(self, param_1):
            self.attribute = param_1

        def add_20(self):
            self.attribute += 20

    mc_3 = MyClass(10)    # mc_3.attribute is 10
    mc_3.add_20()         # mc_3.attribute is now 30
Explained: Define a method within the class to modify an attribute; add_20 increases the value of attribute by 20 when called
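The class pattern above can carry more than one attribute and more than one method. As a loose sketch, the hypothetical App class below (its name, attributes, and methods are all invented for this example) stores two values in __init__ and offers one method that changes state and one that only reads it.

```python
class App:
    def __init__(self, name, price):
        # Attributes are set once per object at instantiation
        self.name = name
        self.price = price

    def apply_discount(self, percent):
        # Modifies the attribute in place, like add_20 above
        self.price = self.price * (1 - percent / 100)

    def is_free(self):
        # Reads an attribute and returns a value without changing anything
        return self.price == 0

paid_app = App('Fruit Ninja Classic', 4.0)
paid_app.apply_discount(50)    # price drops from 4.0 to 2.0

free_app = App('Facebook', 0.0)
```

Each object keeps its own attribute values, so discounting paid_app leaves free_app untouched.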
Dates and Times

Syntax for: Importing Datetime Examples
How to use:
    import datetime
    current_time = datetime.datetime.now()
Explained: Import the module, requiring the full path to access functions or classes

How to use:
    import datetime as dt
    current_time = dt.datetime.now()
Explained: Import the module with alias dt for shorter references, a common practice

How to use:
    from datetime import datetime
    current_time = datetime.now()
Explained: Import only the datetime class, enabling direct access without the module name prefix

How to use:
    from datetime import datetime, date
    current_time = datetime.now()
    current_date = date.today()
Explained: Import multiple classes from the module, allowing direct use of their respective methods

How to use:
    from datetime import *
    current_time = datetime.now()
    current_date = date.today()
    min_year = MINYEAR
    max_year = MAXYEAR
Explained: Import all classes and functions, making every definition and all constants accessible without using a prefix; this is not advised for this module

Syntax for: Creating Datetime Objects
How to use:
    import datetime as dt
    eg_1 = dt.datetime(1985, 3, 13, 14, 30, 45)
Explained: Create a datetime object with both date (March 13, 1985) and time (14:30:45) components

How to use:
    eg_2 = dt.datetime.strptime("15/08/1990 08:45:30",
                                "%d/%m/%Y %H:%M:%S")
Explained: Convert a formatted string into a datetime object; the "p" stands for parsing

How to use:
    eg_2_str = eg_2.strftime("%B %d, %Y at %I:%M %p")
    # "August 15, 1990 at 08:45 AM"
Explained: Convert a datetime object into a formatted string; the "f" stands for formatting

How to use:
    eg_3 = dt.time(hour=5, minute=23,
                   second=45, microsecond=123456)
    # 05:23:45.123456
Explained: Create a time object that includes microseconds

How to use:
    eg_4 = dt.timedelta(weeks=3)
    future_date = eg_1 + eg_4
    # 1985-04-03 14:30:45
Explained: Add a timedelta object representing 3 weeks to a datetime object to calculate a future date

Syntax for: Accessing Datetime Attributes
How to use:
    eg_1.year           # returns 1985
    eg_1.month          # returns 3
    eg_2.day            # returns 15
    eg_2.hour           # returns 8
    eg_3.minute         # returns 23
    eg_3.microsecond    # returns 123456
Explained: Access specific components directly from datetime and time objects using their built-in attributes

How to use:
    eg_2_time = eg_2.time()
    # 08:45:30
Explained: Extract the time component from a datetime object that contains both date and time using the .time() method
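Parsing, date arithmetic, and formatting tend to appear together in practice. The sketch below chains them using the same date string as the strptime example; the variable names and the numeric output format are choices made for this illustration only.

```python
import datetime as dt

# Parse a day/month/year string into a datetime object
released = dt.datetime.strptime("15/08/1990 08:45:30", "%d/%m/%Y %H:%M:%S")

# Adding a timedelta yields a new datetime; the original is unchanged
followup = released + dt.timedelta(weeks=3)

# Subtracting two datetimes yields a timedelta
elapsed = followup - released

# Format with numeric codes only, which do not depend on locale settings
label = followup.strftime("%Y-%m-%d %H:%M")
```

Here elapsed.days is 21, and label holds the follow-up date three weeks after August 15, 1990.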