Skip to content

Conversation

@gustavocidornelas
Copy link
Contributor

Updated docstrings to be consistent with the project.add_XXX dynamic and with the deprecation of categorical_features_map in favor of categorical_feature_names.

Points that still need to be resolved:

  • We need to consider moving the docstrings for add_model, add_dataset, and add_dataframe to projects.py. Otherwise, the organization is not ideal, as it seems like add_XXX is a method of the UnboxClient class (it is, but that's not the dynamic we'd like to communicate). I didn't move it all because inside projects.py, the add_XXX methods don't have the arguments properly defined like the methods in the UnboxClient class (i.e., we use *args, **kwargs).
@linear
Copy link

linear bot commented Jul 18, 2022

UNB-2346 Update Python API reference

Deprecation of arguments and new project behavior

>>> model = client.add_model(
... name='Churn Classifier',
>>> model = project.add_model(
... name='Linear classifiers',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this name singular? I don't want users to think "Linear classifiers" should be used to compare several different candidates

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

on that new tree. In the future, if you'd like to commit a new version to that same lineage, you can simply call `add_model`
using ``name='Linear classifier'`` again and use ``description`` with the new commit message. Alternatively, if you'd like
to start a new separate lineage inside that project, you can call the ``add_model`` method with a different ``name``, e.g.,
``name ='Nonlinear classifiers'``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here you confuse "Linear classifier" with "Linear classifiers"

Make everything just "Linear Classifier" pls

categorical_features_map : Dict[str, List[str]], default {}
A dict containing a list of category names for each feature that is categorical. E.g. `{'Weather': ['Hot', 'Cold']}`.
categorical_feature_names : List[str], default []
A list containing the feature names for each feature that is categorical. E.g. `["Gender", "Geography"]`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you improve this wording? "The feature names for each feature" is awkward

"A list containing the names of all categorical features used by the model. ..."

Examples
--------
Instantiate the client:
The datasets are uploaded directly into a **project** on the Unbox platform, using the ``project.add_dataset()`` method.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Datasets are uploaded. No need for "The datasets"

--------
Instantiate the client
The dataframes are uploaded directly into a **project** on the Unbox platform, using the ``project.add_dataframe()`` method.
Therefore, you need to get the project object to start uploading your dataframes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can get rid of these first two sentences in all the examples sections. Show, don't tell.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I'll remove them

.. important::
The project name must be unique in a user's account. Furthermore, this is the name
that is used to identify the project when uploading models and datasets.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reword: This name must be unique within a user's collection of projects.

Don't think the second line is needed

.. important::
Versioning models on the Unbox platform happens via the ``name`` argument. If ``add_model`` is called
with a ``name`` that still does not exist inside the project, Unbox treats it as the **first version** of a new model lineage.
On the other hand, if the ``name`` argument value already exists inside the project, Unbox treats it as a **new version** of an existing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"if the name argument value already exists" --> "if a model with the specified name already exists"

The ``model`` arg must be the actual trained model object, and the ``input_features`` arg must be a 2D numpy array
containing a batch of features that will be passed to the model as inputs.
You can optionally include other kwargs in the function, including tokenizers, variables, encoders etc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why you removed tokenizers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, just because this is inside the example for tabular classification. I kept it for the example for NLP.

I'm trying to think now of uses of a tokenizer on a tabular task type... I think it would be a bit unusual, but I guess it can exist. I'll add it back just in case

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no, you're right. Keep it removed for tabular :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok!

Copy link
Contributor

@whoseoyster whoseoyster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to merge after addressing my comments

>>> model = client.add_model(
... name='Churn Classifier',
>>> model = project.add_model(
... name='Linear classifiers',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still plural

categorical_features_map : Dict[str, List[str]], default {}
A dict containing a list of category names for each feature that is categorical. E.g. `{'Weather': ['Hot', 'Cold']}`.
categorical_feature_names : List[str], default []
A list containing the names of all categorical features on the dataset. E.g. `["Gender", "Geography"]`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

general grammar comment -- you occasionally use "on" instead of "in." This should be "A list containing the names of all categorical features in the dataset"

categorical_features_map : Dict[str, List[str]], default {}
A dict containing a list of category names for each feature that is categorical. E.g. `{'Weather': ['Hot', 'Cold']}`.
categorical_feature_names : List[str], default []
A list containing the names of all categorical features on the dataframe. E.g. `["Gender", "Geography"]`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on -> in

@gustavocidornelas gustavocidornelas merged commit b0b6e00 into cid/error-handling Jul 25, 2022
@gustavocidornelas gustavocidornelas deleted the cid/updated-api-reference branch July 25, 2022 20:15
@whoseoyster whoseoyster restored the cid/updated-api-reference branch August 1, 2022 20:59
@whoseoyster whoseoyster deleted the cid/updated-api-reference branch August 1, 2022 21:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants