Skip to content

Conversation

@govarsha
Copy link
Contributor

@govarsha govarsha commented Jan 23, 2024

ODSC-52180

Right now anomaly operator supports only one target category column. Added code to support more than 1.

Screenshots:

image image image image image image
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Jan 23, 2024
@govarsha govarsha changed the title Added support for multiple target category columns Anomaly: Added support for multiple target category columns Jan 23, 2024
@github-actions
Copy link

📌 Cov diff with main:

Coverage-0%

📌 Overall coverage:

Coverage-23.19%

@govarsha govarsha changed the title Anomaly: Added support for multiple target category columns Anomaly: Added support for >1 target category columns Jan 23, 2024
@github-actions
Copy link

📌 Cov diff with main:

Coverage-0%

📌 Overall coverage:

Coverage-23.19%

@ahosler ahosler merged commit 8c1f1a9 into feature/anomaly_detect Jan 25, 2024
for group, df in self.data.groupby(spec.target_category_columns[0])
# Merge target category columns

self.data["__Series__"] = utils._merge_category_columns(self.data, spec.target_category_columns)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the "Series" column gets added to self.data and then is referenced throughout the rest of the code.
Could you add to the docstring of this method "Merges all target category columns into a single column named "Series"" Or something equivalent.

I also see that the same code is run on line 180 of base_model.py. Do we need to do this twice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add the docstring.
Here _load_data in anomaly_dataset doesn't load/preprocess validation data. We are loading the validation data in base_model and the merge_category_columns is done there on validation data.

if data.empty:
return total_metrics, summary_metrics, None

data["__Series__"] = utils._merge_category_columns(data, self.spec.target_category_columns)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this redundant?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intended behaviour when there is no target_category_column present?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method _load_data in anomaly_dataset doesn't load/preprocess validation data. We are loading the validation data here and the merge_category_columns is done here for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yeah I forgot to handle when no target_category_column is present for validation data. I will do that change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

OCA Verified All contributors have signed the Oracle Contributor Agreement.

2 participants