@gustavocidornelas
Contributor

Summary

Adds two new validations for datasets:

  1. Check that the labels are zero-indexed integers;
  2. Check that every column dtype is one of the supported_dtypes.

Both validations were already happening in the backend. This PR adds them to the Python API to fail early.
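
As a rough illustration of the two checks above, here is a minimal sketch, not the code from this PR. It assumes the dataset is a pandas DataFrame and that "zero-indexed" means the labels are integers whose smallest value is 0. The names `SUPPORTED_DTYPES`, `find_label_problems`, `find_dtype_problems` and `validate_dataset` are hypothetical, and the allow-list contents are a placeholder since the real supported_dtypes are not listed here.

```python
from typing import List

import pandas as pd

# Hypothetical allow-list: the real supported_dtypes are not shown in this PR.
SUPPORTED_DTYPES = {"bool", "float32", "float64", "int32", "int64", "object"}


def find_label_problems(df: pd.DataFrame, label_column: str) -> List[str]:
    """Return error messages if the labels are not zero-indexed integers."""
    labels = df[label_column]
    if not pd.api.types.is_integer_dtype(labels):
        return [
            f"Column '{label_column}' must contain integers, "
            f"found dtype '{labels.dtype}'."
        ]
    # Assumption: "zero-indexed" means the smallest label is 0.
    if labels.min() != 0:
        return [f"Labels in '{label_column}' must start at 0."]
    return []


def find_dtype_problems(df: pd.DataFrame) -> List[str]:
    """Return one error message per column whose dtype is not supported."""
    return [
        f"Column '{column}' has unsupported dtype '{dtype}'."
        for column, dtype in df.dtypes.items()
        if str(dtype) not in SUPPORTED_DTYPES
    ]


def validate_dataset(df: pd.DataFrame, label_column: str) -> None:
    """Raise ValueError on the client before anything is uploaded."""
    problems = find_label_problems(df, label_column) + find_dtype_problems(df)
    if problems:
        raise ValueError("Dataset validation failed:\n" + "\n".join(problems))
```

Collecting all messages before raising lets the user fix every problem in one pass rather than one failed upload at a time.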

There is a related PR for the backend here.

Testing

Tested locally by introducing problems into the dataset that make the validations fail.

Examples:

1. [Screenshot: Screen Shot 2022-09-28 at 18 55 32]
2. [Screenshot: Screen Shot 2022-09-28 at 18 54 21]

@gustavocidornelas gustavocidornelas force-pushed the cid/dataset-column-validations branch from db17ec6 to 4665c11 Compare September 30, 2022 12:06
@whoseoyster whoseoyster merged commit e7e458f into main Oct 4, 2022
@whoseoyster whoseoyster deleted the cid/dataset-column-validations branch October 4, 2022 22:55
