APRIORI ALGORITHM
pip install mlxtend
Collecting mlxtend
Requirement already satisfied: scipy>=1.2.1 in c:\users\cathy\anaconda3\lib\site-packages (from mlxtend) (1.10.1)
Requirement already satisfied: numpy>=1.16.2 in c:\users\cathy\anaconda3\lib\site-packages (from mlxtend) (1.24.3)
Requirement already satisfied: pandas>=0.24.2 in c:\users\cathy\anaconda3\lib\site-packages (from mlxtend) (1.5.3)
Collecting scikit-learn>=1.3.1 (from mlxtend)
  Downloading scikit_learn-1.6.1-cp311-cp311-win_amd64.whl (11.1 MB)
Requirement already satisfied: matplotlib>=3.0.0 in c:\users\cathy\anaconda3\lib\site-packages (from mlxtend) (3.7.1)
Requirement already satisfied: joblib>=0.13.2 in c:\users\cathy\anaconda3\lib\site-packages (from mlxtend) (1.2.0)
Requirement already satisfied: contourpy>=1.0.1 in c:\users\cathy\anaconda3\lib\site-packages (from matplotlib>=3.0.0->mlxtend) (1.0.5)
Requirement already satisfied: cycler>=0.10 in c:\users\cathy\anaconda3\lib\site-packages (from matplotlib>=3.0.0->mlxtend) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in c:\users\cathy\anaconda3\lib\site-packages (from matplotlib>=3.0.0->mlxtend) (4.25.0)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\cathy\anaconda3\lib\site-packages (from matplotlib>=3.0.0->mlxtend) (1.4.4)
Requirement already satisfied: packaging>=20.0 in c:\users\cathy\anaconda3\lib\site-packages (from matplotlib>=3.0.0->mlxtend) (23.0)
Requirement already satisfied: pillow>=6.2.0 in c:\users\cathy\anaconda3\lib\site-packages (from matplotlib>=3.0.0->mlxtend) (10.0.1)
Requirement already satisfied: pyparsing>=2.3.1 in c:\users\cathy\anaconda3\lib\site-packages (from matplotlib>=3.0.0->mlxtend) (3.0.9)
Requirement already satisfied: python-dateutil>=2.7 in c:\users\cathy\anaconda3\lib\site-packages (from matplotlib>=3.0.0->mlxtend) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in c:\users\cathy\anaconda3\lib\site-packages (from pandas>=0.24.2->mlxtend) (2022.7)
Collecting threadpoolctl>=3.1.0 (from scikit-learn>=1.3.1->mlxtend)
  Downloading threadpoolctl-3.6.0-py3-none-any.whl (18 kB)
Requirement already satisfied: six>=1.5 in c:\users\cathy\anaconda3\lib\site-packages (from python-dateutil>=2.7->matplotlib>=3.0.0->mlxtend) (1.16.0)
Installing collected packages: threadpoolctl, scikit-learn, mlxtend
  Attempting uninstall: threadpoolctl
    Found existing installation: threadpoolctl 2.2.0
    Uninstalling threadpoolctl-2.2.0:
      Successfully uninstalled threadpoolctl-2.2.0
  Attempting uninstall: scikit-learn
    Found existing installation: scikit-learn 1.2.2
    Uninstalling scikit-learn-1.2.2:
      Successfully uninstalled scikit-learn-1.2.2
Successfully installed mlxtend-0.23.4 scikit-learn-1.6.1 threadpoolctl-3.6.0
Note: you may need to restart the kernel to use updated packages.
QUESTION 1:
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder

# Each inner list is one transaction (a basket of items)
dataset = [
    ['Milk', 'Eggs', 'Butter'],
    ['Milk', 'Eggs'],
    ['Eggs', 'Butter'],
    ['Milk', 'Butter'],
    ['Eggs', 'Butter'],
]

# One-hot encode the transactions into a boolean DataFrame
te = TransactionEncoder()
oht = te.fit_transform(dataset)
df_oht = pd.DataFrame(oht, columns=te.columns_)

# Apply Apriori algorithm to find frequent itemsets
frequent_itemsets = apriori(df_oht, min_support=0.4, use_colnames=True)

# Keep only the rules whose confidence is at least 0.5
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.5)
print("Frequent Itemsets:")
print(frequent_itemsets)
print("\nAssociation Rules:")
print(rules)
Frequent Itemsets:
support itemsets
0 0.8 (Butter)
1 0.8 (Eggs)
2 0.6 (Milk)
3 0.6 (Eggs, Butter)
4 0.4 (Milk, Butter)
5 0.4 (Milk, Eggs)
Association Rules:
  antecedents consequents  antecedent support  consequent support  support  \
0      (Eggs)    (Butter)                 0.8                 0.8      0.6
1    (Butter)      (Eggs)                 0.8                 0.8      0.6
2      (Milk)    (Butter)                 0.6                 0.8      0.4
3    (Butter)      (Milk)                 0.8                 0.6      0.4
4      (Milk)      (Eggs)                 0.6                 0.8      0.4
5      (Eggs)      (Milk)                 0.8                 0.6      0.4

   confidence      lift  representativity  leverage  conviction  \
0    0.750000  0.937500               1.0     -0.04         0.8
1    0.750000  0.937500               1.0     -0.04         0.8
2    0.666667  0.833333               1.0     -0.08         0.6
3    0.500000  0.833333               1.0     -0.08         0.8
4    0.666667  0.833333               1.0     -0.08         0.6
5    0.500000  0.833333               1.0     -0.08         0.8

   zhangs_metric  jaccard  certainty  kulczynski
0      -0.250000      0.6  -0.250000    0.750000
1      -0.250000      0.6  -0.250000    0.750000
2      -0.333333      0.4  -0.666667    0.583333
3      -0.500000      0.4  -0.250000    0.583333
4      -0.333333      0.4  -0.666667    0.583333
5      -0.500000      0.4  -0.250000    0.583333
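
As a cross-check on the numbers printed above, the support, confidence, and lift columns can be recomputed by hand from the five transactions. The following is only a minimal sketch in plain Python: the helper names `support` and `rule_metrics` are made up for illustration, and the itemsets are found by brute-force enumeration rather than by the level-wise pruning that Apriori itself performs.

```python
from itertools import combinations

# Same five baskets as in the cell above, stored as sets
transactions = [
    {"Milk", "Eggs", "Butter"},
    {"Milk", "Eggs"},
    {"Eggs", "Butter"},
    {"Milk", "Butter"},
    {"Eggs", "Butter"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


# Brute-force enumeration (hypothetical helper logic, not Apriori's pruning):
# keep every combination of items whose support reaches min_support = 0.4
items = sorted(set().union(*transactions))
frequent = {
    frozenset(combo): support(combo)
    for k in range(1, len(items) + 1)
    for combo in combinations(items, k)
    if support(combo) >= 0.4
}


def rule_metrics(antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    joint = support(set(antecedent) | set(consequent))
    confidence = joint / support(antecedent)
    lift = confidence / support(consequent)
    return joint, confidence, lift


joint, confidence, lift = rule_metrics({"Eggs"}, {"Butter"})
print(len(frequent), joint, confidence, lift)
```

For the rule (Eggs) -> (Butter) this gives a support of 3/5, a confidence of 0.6/0.8, and a lift of 0.75/0.8, in line with row 0 of the table. A lift below 1 means the two items occur together slightly less often than independence would predict.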
QUESTION 2:
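import warnings

import pandas as pd
import seaborn as sns
from mlxtend.frequent_patterns import apriori, association_rules

# Suppress RuntimeWarning messages
warnings.filterwarnings('ignore', category=RuntimeWarning)

# Load the tips dataset from seaborn
tips = sns.load_dataset('tips')

# One-hot encode the categorical columns. `size` is numeric, so get_dummies
# leaves it as a single integer column instead of one column per party size.
transaction = pd.get_dummies(tips[['day', 'time', 'sex', 'smoker', 'size']])

# Convert to booleans for apriori. Every party size is > 0, so `size` is True
# in every row, which is why (size) has support 1.0 in the output.
transaction = transaction.apply(lambda x: x > 0)

frequent_itemsets = apriori(transaction, min_support=0.1, use_colnames=True)
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1.0)

# Drop rules whose lift or confidence is undefined (NaN)
rules = rules.dropna(subset=['lift', 'confidence'])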
print("Frequent Itemsets:")
print(frequent_itemsets)
print("\nAssociation Rules:")
print(rules)
Frequent Itemsets:
support itemsets
0 1.000000 (size)
1 0.254098 (day_Thur)
2 0.356557 (day_Sat)
3 0.311475 (day_Sun)
4 0.278689 (time_Lunch)
.. ... ...
100 0.131148 (time_Dinner, day_Sat, smoker_No, sex_Male)
101 0.176230 (time_Dinner, sex_Male, smoker_No, day_Sun)
102 0.110656 (day_Sat, smoker_Yes, size, sex_Male, time_Din...
103 0.131148 (day_Sat, size, smoker_No, sex_Male, time_Dinner)
104 0.176230 (size, day_Sun, smoker_No, sex_Male, time_Dinner)
Association Rules:
antecedents consequents \
0 (size) (day_Thur)
1 (day_Thur) (size)
2 (day_Sat) (size)
3 (size) (day_Sat)
4 (size) (day_Sun)
.. ... ...
543 (size) (time_Dinner, sex_Male, smoker_No, day_Sun)
544 (day_Sun) (time_Dinner, sex_Male, smoker_No, size)
545 (smoker_No) (sex_Male, time_Dinner, size, day_Sun)
546 (sex_Male) (time_Dinner, smoker_No, size, day_Sun)
547 (time_Dinner) (sex_Male, smoker_No, size, day_Sun)
     antecedent support  consequent support   support  confidence      lift  \
0              1.000000            0.254098  0.254098    0.254098  1.000000
1              0.254098            1.000000  0.254098    1.000000  1.000000
2              0.356557            1.000000  0.356557    1.000000  1.000000
3              1.000000            0.356557  0.356557    0.356557  1.000000
4              1.000000            0.311475  0.311475    0.311475  1.000000
..                  ...                 ...       ...         ...       ...
543            1.000000            0.176230  0.176230    0.176230  1.000000
544            0.311475            0.315574  0.176230    0.565789  1.792891
545            0.618852            0.237705  0.176230    0.284768  1.197990
546            0.643443            0.233607  0.176230    0.273885  1.172421
547            0.721311            0.176230  0.176230    0.244318  1.386364

     representativity  leverage  conviction  zhangs_metric   jaccard  \
0                 1.0  0.000000    1.000000       0.000000  0.254098
1                 1.0  0.000000         inf       0.000000  0.254098
2                 1.0  0.000000         inf       0.000000  0.356557
3                 1.0  0.000000    1.000000       0.000000  0.356557
4                 1.0  0.000000    1.000000       0.000000  0.311475
..                ...       ...         ...            ...       ...
543               1.0  0.000000    1.000000       0.000000  0.176230
544               1.0  0.077936    1.576254       0.642303  0.390909
545               1.0  0.029125    1.065801       0.433608  0.259036
546               1.0  0.025917    1.055472       0.412457  0.251462
547               1.0  0.049113    1.090102       1.000000  0.244318

certainty kulczynski
0 0.000000 0.627049
1 0.000000 0.627049
2 0.000000 0.678279
3 0.000000 0.678279
4 0.000000 0.655738
.. ... ...
543 0.000000 0.588115
544 0.365585 0.562116
545 0.061739 0.513074
546 0.052556 0.514136
547 0.082655 0.622159
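
The `(size)` itemset with support 1.0, and the many lift-1.0 rules that involve it, come from `size` being numeric: `pd.get_dummies` passes it through as one column and the `> 0` test then marks every row True. One possible way around this, sketched below on a tiny made-up frame rather than on the real tips data, is to cast `size` to a string first so that each party size gets its own indicator column. The frame `toy` and its values are purely illustrative assumptions.

```python
import pandas as pd

# Tiny hypothetical stand-in for two of the tips columns (not the real dataset)
toy = pd.DataFrame({
    "day": ["Sat", "Sun", "Sat", "Thur"],
    "size": [2, 4, 2, 3],
})

# Numeric columns pass through get_dummies unchanged, so "size" stays a single
# integer column; thresholding at > 0 then makes it True for every row.
naive = pd.get_dummies(toy)
naive_bool = naive.apply(lambda col: col > 0)

# Casting "size" to str first yields one indicator column per party size.
encoded = pd.get_dummies(toy.astype({"size": str})).astype(bool)

print(sorted(naive_bool.columns))
print(sorted(encoded.columns))
```

With this encoding an itemset such as `size_2` has a support that actually varies across rows, so rules involving party size can carry information. Whether that suits the exercise depends on what the question intended `size` to represent.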
QUESTION 3:
import warnings

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Suppress DeprecationWarning messages
warnings.filterwarnings('ignore', category=DeprecationWarning)

# Example of a small transaction dataset:
# one row per transaction, one 0/1 column per item
data = {
    'milk': [1, 0, 1, 1, 0],
    'bread': [1, 1, 1, 0, 1],
    'butter': [0, 1, 1, 1, 0],
    'cheese': [1, 0, 1, 1, 1],
}
df = pd.DataFrame(data)

# Convert the 0/1 values to booleans, as apriori expects
transaction = df.apply(lambda x: x > 0)

frequent_itemsets = apriori(transaction, min_support=0.1, use_colnames=True)

# Keep only the rules with lift of at least 1.0
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1.0)
print("Frequent Itemsets:")
print(frequent_itemsets)
print("\nAssociation Rules:")
print(rules)
Frequent Itemsets:
support itemsets
0 0.6 (milk)
1 0.8 (bread)
2 0.6 (butter)
3 0.8 (cheese)
4 0.4 (milk, bread)
5 0.4 (milk, butter)
6 0.6 (milk, cheese)
7 0.4 (bread, butter)
8 0.6 (cheese, bread)
9 0.4 (cheese, butter)
10 0.2 (milk, bread, butter)
11 0.4 (milk, bread, cheese)
12 0.4 (cheese, milk, butter)
13 0.2 (cheese, bread, butter)
14 0.2 (cheese, milk, bread, butter)
Association Rules:
                antecedents              consequents  antecedent support  \
0                    (milk)                 (butter)                 0.6
1                  (butter)                   (milk)                 0.6
2                    (milk)                 (cheese)                 0.6
3                  (cheese)                   (milk)                 0.8
4             (milk, bread)                 (cheese)                 0.4
5           (cheese, bread)                   (milk)                 0.6
6                    (milk)          (cheese, bread)                 0.6
7                  (cheese)            (milk, bread)                 0.8
8            (cheese, milk)                 (butter)                 0.6
9          (cheese, butter)                   (milk)                 0.4
10           (milk, butter)                 (cheese)                 0.4
11                 (cheese)           (milk, butter)                 0.8
12                   (milk)         (cheese, butter)                 0.6
13                 (butter)           (cheese, milk)                 0.6
14  (cheese, bread, butter)                   (milk)                 0.2
15    (milk, bread, butter)                 (cheese)                 0.2
16         (cheese, butter)            (milk, bread)                 0.4
17            (milk, bread)         (cheese, butter)                 0.4
18                 (cheese)    (milk, bread, butter)                 0.8
19                   (milk)  (cheese, bread, butter)                 0.6

    consequent support  support  confidence      lift  representativity  \
0                  0.6      0.4    0.666667  1.111111               1.0
1                  0.6      0.4    0.666667  1.111111               1.0
2                  0.8      0.6    1.000000  1.250000               1.0
3                  0.6      0.6    0.750000  1.250000               1.0
4                  0.8      0.4    1.000000  1.250000               1.0
5                  0.6      0.4    0.666667  1.111111               1.0
6                  0.6      0.4    0.666667  1.111111               1.0
7                  0.4      0.4    0.500000  1.250000               1.0
8                  0.6      0.4    0.666667  1.111111               1.0
9                  0.6      0.4    1.000000  1.666667               1.0
10                 0.8      0.4    1.000000  1.250000               1.0
11                 0.4      0.4    0.500000  1.250000               1.0
12                 0.4      0.4    0.666667  1.666667               1.0
13                 0.6      0.4    0.666667  1.111111               1.0
14                 0.6      0.2    1.000000  1.666667               1.0
15                 0.8      0.2    1.000000  1.250000               1.0
16                 0.4      0.2    0.500000  1.250000               1.0
17                 0.4      0.2    0.500000  1.250000               1.0
18                 0.2      0.2    0.250000  1.250000               1.0
19                 0.2      0.2    0.333333  1.666667               1.0

    leverage  conviction  zhangs_metric   jaccard  certainty  kulczynski
0       0.04    1.200000       0.250000  0.500000   0.166667    0.666667
1       0.04    1.200000       0.250000  0.500000   0.166667    0.666667
2       0.12         inf       0.500000  0.750000   1.000000    0.875000
3       0.12    1.600000       1.000000  0.750000   0.375000    0.875000
4       0.08         inf       0.333333  0.500000   1.000000    0.750000
5       0.04    1.200000       0.250000  0.500000   0.166667    0.666667
6       0.04    1.200000       0.250000  0.500000   0.166667    0.666667
7       0.08    1.200000       1.000000  0.500000   0.166667    0.750000
8       0.04    1.200000       0.250000  0.500000   0.166667    0.666667
9       0.16         inf       0.666667  0.666667   1.000000    0.833333
10      0.08         inf       0.333333  0.500000   1.000000    0.750000
11      0.08    1.200000       1.000000  0.500000   0.166667    0.750000
12      0.16    1.800000       1.000000  0.666667   0.444444    0.833333
13      0.04    1.200000       0.250000  0.500000   0.166667    0.666667
14      0.08         inf       0.500000  0.333333   1.000000    0.666667
15      0.04         inf       0.250000  0.250000   1.000000    0.625000
16      0.04    1.200000       0.333333  0.333333   0.166667    0.500000
17      0.04    1.200000       0.333333  0.333333   0.166667    0.500000
18      0.04    1.066667       1.000000  0.250000   0.062500    0.625000
19      0.08    1.200000       1.000000  0.333333   0.166667    0.666667
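
Several rows above show a conviction of `inf`. That follows from the usual definition, conviction = (1 - consequent support) / (1 - confidence), whose denominator is zero when confidence is exactly 1. The sketch below recomputes conviction and leverage for two of the rules from the printed supports. The function names are hypothetical helpers written for this illustration, not part of mlxtend.

```python
import math


def conviction(confidence, consequent_support):
    """(1 - consequent support) / (1 - confidence); infinite at confidence 1."""
    if confidence >= 1.0:
        return math.inf
    return (1.0 - consequent_support) / (1.0 - confidence)


def leverage(support, antecedent_support, consequent_support):
    """Observed joint support minus the joint support expected under independence."""
    return support - antecedent_support * consequent_support


# Rule 0, (milk) -> (butter): support 0.4, antecedent 0.6, consequent 0.6
conv_0 = conviction(0.4 / 0.6, 0.6)
lev_0 = leverage(0.4, 0.6, 0.6)

# Rule 2, (milk) -> (cheese): support 0.6, antecedent 0.6, consequent 0.8.
# Confidence is 0.6 / 0.6 = 1.0, so conviction is infinite.
conv_2 = conviction(0.6 / 0.6, 0.8)
lev_2 = leverage(0.6, 0.6, 0.8)

print(conv_0, lev_0, conv_2, lev_2)
```

Up to floating-point rounding, these agree with rows 0 and 2 of the table: conviction 1.2 and leverage 0.04 for rule 0, conviction `inf` and leverage 0.12 for rule 2.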