Conversation

akkadhim commented Dec 25, 2024

This PR adds the recommendation system experiments. Please ignore any changes outside the examples/recomm_system directory.

BooBSD commented Dec 26, 2024

@akkadhim Could you please export your noisy datasets to a CSV file for testing in other languages?

akkadhim (Author) commented:
> @akkadhim Could you please export your noisy datasets to a CSV file for testing in other languages?

Sure, below are different datasets for different noise ratios.

noisy_dataset_0.05.csv
noisy_dataset_0.005.csv
noisy_dataset_0.02.csv
noisy_dataset_0.2.csv
noisy_dataset_0.01.csv
noisy_dataset_0.1.csv

BooBSD commented Dec 27, 2024

@akkadhim Thank you!

BooBSD commented Dec 27, 2024

@akkadhim Is it correct that, after one-hot booleanization, your input data consists of 10709 bits? This includes 1350 unique product_ids + 317 categories + 9042 user_ids.

akkadhim (Author) commented:
> @akkadhim Is it correct that, after one-hot booleanization, your input data consists of 10709 bits? This includes 1350 unique product_ids + 317 categories + 9042 user_ids.

After expanding the original dataset and adding the noise, the unique features are:

  • Users: 1193
  • Items: 1350
  • Categories: 211

I used one-hot encoding for the TM classifier, and at that step the dataset was split into train and test portions.
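
For readers who want to reproduce this step, a minimal sketch of the one-hot booleanization and split described above, assuming pandas and scikit-learn; the column names follow this thread, while the label column (rating) and filename are illustrative rather than taken from the PR:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("noisy_dataset_0.05.csv")  # any of the noise ratios above

# One-hot encode the categorical columns: each unique user, item, and
# category value becomes a single bit of the Boolean input vector.
X = pd.get_dummies(df[["user_id", "product_id", "category"]]).to_numpy(dtype="uint8")
y = df["rating"].to_numpy()  # hypothetical label column

# Split into train and test portions at this stage, as described above.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```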

BooBSD commented Dec 27, 2024

@akkadhim
Got it. However, the columns category and user_id contain lists of categories and users, joined by the "|" and "," characters (for example: "Computers&Accessories|Accessories&Peripherals|Cables&Accessories|Cables|USBCables" or "AH4BURHCF5UQFZR4VJQXBEQCTYVQ,AGSJLPK6HU2FB4HII64NQ3OYFFFA,AGG75KFRXNLCYVRAPA6D4ZBNTNSA"). Why weren't they split into individual unique categories and user IDs? Could you confirm whether your booleanization method is correct?
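
For comparison, a minimal sketch of the token-level booleanization suggested here, assuming pandas and NumPy; the multi_hot helper is illustrative, and the column names and separators follow the examples above:

```python
import numpy as np
import pandas as pd

def multi_hot(series: pd.Series, sep: str) -> np.ndarray:
    """One bit per unique token appearing anywhere in the column."""
    token_lists = [str(v).split(sep) for v in series.fillna("")]
    vocab = {t: i for i, t in enumerate(sorted({t for ts in token_lists for t in ts if t}))}
    bits = np.zeros((len(token_lists), len(vocab)), dtype=np.uint8)
    for row, tokens in enumerate(token_lists):
        for t in tokens:
            if t:
                bits[row, vocab[t]] = 1
    return bits

df = pd.read_csv("noisy_dataset_0.05.csv")

# Split the "|"-joined category paths and ","-joined user lists into
# individual tokens, then concatenate the bit blocks into one input vector.
X = np.hstack([multi_hot(df["category"], "|"), multi_hot(df["user_id"], ",")])
```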

BooBSD commented Dec 27, 2024

@akkadhim
I tested both booleanization methods (yours and mine) and obtained approximately the same validation accuracy.
I split your dataset such that the first 80% is used for training, and the last 20% for validation.

My best validation accuracy:

  • noisy_dataset_0.005.csv: 99.73%
  • noisy_dataset_0.2.csv: 84.87%

Here is the proof:

```
#1   Accuracy: 83.81%  Best: 83.81%  Training: 1.946s  Testing: 0.107s
#2   Accuracy: 96.69%  Best: 96.69%  Training: 0.609s  Testing: 0.009s
#3   Accuracy: 99.69%  Best: 99.69%  Training: 0.442s  Testing: 0.008s
#4   Accuracy: 99.69%  Best: 99.69%  Training: 0.350s  Testing: 0.007s
#5   Accuracy: 99.69%  Best: 99.69%  Training: 0.279s  Testing: 0.007s
#6   Accuracy: 99.69%  Best: 99.69%  Training: 0.238s  Testing: 0.006s
#7   Accuracy: 99.69%  Best: 99.69%  Training: 0.192s  Testing: 0.006s
#8   Accuracy: 99.69%  Best: 99.69%  Training: 0.178s  Testing: 0.006s
#9   Accuracy: 99.69%  Best: 99.69%  Training: 0.173s  Testing: 0.006s
#10  Accuracy: 99.69%  Best: 99.69%  Training: 0.147s  Testing: 0.005s
....
#300 Accuracy: 99.73%  Best: 99.73%  Training: 0.085s  Testing: 0.003s
#301 Accuracy: 99.69%  Best: 99.73%  Training: 0.090s  Testing: 0.003s
#302 Accuracy: 99.73%  Best: 99.73%  Training: 0.086s  Testing: 0.003s
#303 Accuracy: 99.73%  Best: 99.73%  Training: 0.084s  Testing: 0.003s
#304 Accuracy: 99.73%  Best: 99.73%  Training: 0.081s  Testing: 0.003s
#305 Accuracy: 99.73%  Best: 99.73%  Training: 0.089s  Testing: 0.003s
#306 Accuracy: 99.73%  Best: 99.73%  Training: 0.080s  Testing: 0.003s
#307 Accuracy: 99.73%  Best: 99.73%  Training: 0.081s  Testing: 0.003s
#308 Accuracy: 99.73%  Best: 99.73%  Training: 0.089s  Testing: 0.003s
#309 Accuracy: 99.73%  Best: 99.73%  Training: 0.088s  Testing: 0.003s
#310 Accuracy: 99.69%  Best: 99.73%  Training: 0.083s  Testing: 0.003s
#311 Accuracy: 99.69%  Best: 99.73%  Training: 0.081s  Testing: 0.003s
#312 Accuracy: 99.73%  Best: 99.73%  Training: 0.082s  Testing: 0.003s
#313 Accuracy: 99.73%  Best: 99.73%  Training: 0.079s  Testing: 0.003s
#314 Accuracy: 99.69%  Best: 99.73%  Training: 0.081s  Testing: 0.003s
#315 Accuracy: 99.73%  Best: 99.73%  Training: 0.083s  Testing: 0.003s
#316 Accuracy: 99.73%  Best: 99.73%  Training: 0.088s  Testing: 0.003s
#317 Accuracy: 99.73%  Best: 99.73%  Training: 0.085s  Testing: 0.003s
#318 Accuracy: 99.73%  Best: 99.73%  Training: 0.086s  Testing: 0.003s
#319 Accuracy: 99.73%  Best: 99.73%  Training: 0.088s  Testing: 0.003s
#320 Accuracy: 99.73%  Best: 99.73%  Training: 0.091s  Testing: 0.003s
```

These results were obtained on a CPU, and it works quite fast.
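
For reference, a minimal sketch of the ordered split described above (first 80% for training, last 20% for validation, with no shuffling), assuming pandas; the filename is one of the datasets attached earlier:

```python
import pandas as pd

df = pd.read_csv("noisy_dataset_0.005.csv")
cut = int(len(df) * 0.8)                     # first 80% of rows for training,
train, valid = df.iloc[:cut], df.iloc[cut:]  # last 20% for validation
```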

akkadhim (Author) commented:
> @akkadhim Got it. However, the columns category and user_id contain lists of categories and users, joined by the "|" and "," characters [...]. Why weren't they split into individual unique categories and user IDs? Could you confirm whether your booleanization method is correct?

For user_id, standard CSV quoting handles such cases: a value containing commas is enclosed in double quotes, so the list survives as a single field. The category column likewise keeps the original hierarchical structure of the dataset. Splitting these fields would alter the representation of the hierarchical categories and their associated user IDs.
Yes, the booleanization is correct.
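
A minimal sketch of that quoting behaviour, using Python's standard csv module; the values are shortened from the examples quoted above, and the product id is hypothetical:

```python
import csv
import io

buf = io.StringIO()
# QUOTE_MINIMAL (the default) wraps any field containing the delimiter,
# so the comma-joined user_id list is written as one quoted field.
csv.writer(buf).writerow(
    ["P001", "Computers&Accessories|Cables|USBCables", "AH4BURHC...,AGSJLPK6..."]
)

row = next(csv.reader(io.StringIO(buf.getvalue())))
assert row[2] == "AH4BURHC...,AGSJLPK6..."  # read back as a single field
```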

akkadhim (Author) commented:
> @akkadhim I tested both booleanization methods (yours and mine) and obtained approximately the same validation accuracy. [...] These results were obtained on a CPU, and it works quite fast.

Very impressive! Nice work, @BooBSD!

BooBSD commented Jan 29, 2025

@akkadhim Hey, can you please share your presentation on GloVe, Word2Vec, etc., from the latest meeting call?

akkadhim (Author) commented:
> @akkadhim Hey, can you please share your presentation on GloVe, Word2Vec, etc., from the latest meeting call?

Sure, please send me your email address.

BooBSD commented Jan 30, 2025
