Merged
4 changes: 2 additions & 2 deletions docs/source/index.rst
@@ -5,9 +5,9 @@

.. module:: openlayer

*******************
***********************
Openlayer Documentation
*******************
***********************

**Date**: |today| **Version**: |version|

5 changes: 2 additions & 3 deletions docs/source/reference/upload.rst
@@ -45,11 +45,10 @@ Version control flow
OpenlayerClient.status
OpenlayerClient.restore

Dataset / Task types
--------------------
Task types
----------
.. autosummary::
:toctree: api/

DatasetType
TaskType

80 changes: 45 additions & 35 deletions openlayer/__init__.py
@@ -516,46 +516,51 @@ def add_dataset(

The YAML file with the dataset config must have the following fields:

- ``columnNames`` : List[str]
columnNames : List[str]
List of the dataset's column names.
- ``classNames`` : List[str]
classNames : List[str]
List of class names indexed by label integer in the dataset.
E.g. ``[negative, positive]`` when ``[0, 1]`` are in your label column.
- ``labelColumnName`` : str
labelColumnName : str
Column header in the csv containing the labels.

.. important::
The labels in this column must be zero-indexed integer values.
- ``label`` : str
label : str
Type of dataset. E.g. ``'training'`` or
``'validation'``.
- ``featureNames`` : List[str], default []
featureNames : List[str], default []
List of input feature names. Only applicable if your ``task_type`` is
:obj:`TaskType.TabularClassification` or :obj:`TaskType.TabularRegression`.
- ``textColumnName`` : str, default None
textColumnName : str, default None
Column header in the csv containing the input text. Only applicable if
your ``task_type`` is :obj:`TaskType.TextClassification`.
- ``predictionsColumnName`` : str, default None
predictionsColumnName : str, default None
Column header in the csv containing the predictions. Only applicable if you
are uploading a model as well with the :obj:`add_model` method.

.. important::
Each cell in this column must contain a list of
class probabilities. For example, for a binary classification
task, the cell values should look like this:
.. csv-table::
:header: ..., predictions
..., "[0.6650292861587155, 0.3349707138412845]"
..., "[0.8145561636482788, 0.18544383635172124]"
task, the column with the predictions should look like this:

- ``categoricalFeatureNames`` : List[str], default []
**predictions**

``[0.1, 0.9]``

``[0.8, 0.2]``

``...``

categoricalFeatureNames : List[str], default []
A list containing the names of all categorical features in the dataset.
E.g. ``["Gender", "Geography"]``. Only applicable if your ``task_type`` is
:obj:`TaskType.TabularClassification` or :obj:`TaskType.TabularRegression`.
- ``language`` : str, default 'en'
language : str, default 'en'
The language of the dataset in ISO 639-1 (alpha-2 code) format.
- ``sep`` : str, default ','
sep : str, default ','
Delimiter to use. E.g. `'\\t'`.

force : bool
If :obj:`add_dataset` is called when there is already a dataset of the same type
in the staging area, when ``force=True``, the existing staged dataset will be
@@ -727,46 +732,51 @@ def add_dataframe(

The YAML file with the dataset config must have the following fields:

- ``columnNames`` : List[str]
columnNames : List[str]
List of the dataset's column names.
- ``classNames`` : List[str]
classNames : List[str]
List of class names indexed by label integer in the dataset.
E.g. ``[negative, positive]`` when ``[0, 1]`` are in your label column.
- ``labelColumnName`` : str
Column header in the csv containing the labels.
labelColumnName : str
Column header in the dataframe containing the labels.

.. important::
The labels in this column must be zero-indexed integer values.
- ``label`` : str
label : str
Type of dataset. E.g. ``'training'`` or
``'validation'``.
- ``featureNames`` : List[str], default []
featureNames : List[str], default []
List of input feature names. Only applicable if your ``task_type`` is
:obj:`TaskType.TabularClassification` or :obj:`TaskType.TabularRegression`.
- ``textColumnName`` : str, default None
Column header in the csv containing the input text. Only applicable if your
``task_type`` is :obj:`TaskType.TextClassification`.
- ``predictionsColumnName`` : str, default None
Column header in the csv containing the predictions. Only applicable if you
textColumnName : str, default None
Column header in the dataframe containing the input text. Only applicable if
your ``task_type`` is :obj:`TaskType.TextClassification`.
predictionsColumnName : str, default None
Column header in the dataframe containing the predictions. Only applicable if you
are uploading a model as well with the :obj:`add_model` method.

.. important::
Each cell in this column must contain a list of
class probabilities. For example, for a binary classification
task, the cell values should look like this:
.. csv-table::
:header: ..., predictions
..., "[0.6650292861587155, 0.3349707138412845]"
..., "[0.8145561636482788, 0.18544383635172124]"
task, the column with the predictions should look like this:

- ``categoricalFeatureNames`` : List[str], default []
**predictions**

``[0.1, 0.9]``

``[0.8, 0.2]``

``...``

categoricalFeatureNames : List[str], default []
A list containing the names of all categorical features in the dataset.
E.g. ``["Gender", "Geography"]``. Only applicable if your ``task_type`` is
:obj:`TaskType.TabularClassification` or :obj:`TaskType.TabularRegression`.
- ``language`` : str, default 'en'
language : str, default 'en'
The language of the dataset in ISO 639-1 (alpha-2 code) format.
- ``sep`` : str, default ','
sep : str, default ','
Delimiter to use. E.g. `'\\t'`.

force : bool
If :obj:`add_dataframe` is called when there is already a dataset of the same
type in the staging area, when ``force=True``, the existing staged dataset will
@@ -993,7 +1003,7 @@ def push(self, project_id: int):
Notes
-----
- To use this method, you must first have committed your changes with the :obj:`commit`
method.
method.

Examples
--------
10 changes: 10 additions & 0 deletions openlayer/validators.py
@@ -483,6 +483,11 @@ class DatasetValidator:
dataset_df : pd.DataFrame, optional
The dataset to validate.

Methods
-------
validate:
Runs all dataset validations.

Examples
--------

@@ -936,6 +941,11 @@ class ModelValidator:
sample_data : pd.DataFrame
Sample data to be used for the model validation.

Methods
-------
validate:
Runs all model validations.

Examples
--------

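Note on the dataset config fields documented in the ``add_dataset`` and ``add_dataframe`` docstrings above: the two ``.. important::`` constraints (zero-indexed integer labels, and a list of class probabilities in each predictions cell) can be checked in plain Python before uploading. The sketch below is only an illustration. ``check_dataset_config`` and ``REQUIRED_FIELDS`` are hypothetical names, not part of the ``openlayer`` package. Treating the four fields documented without a default as required, and expecting one probability per entry in ``classNames``, are assumptions rather than documented behavior.

```python
# Hypothetical helper, NOT part of the openlayer package. It only mirrors the
# constraints described in the docstrings above.

# Assumption: the fields documented without a default value are required.
REQUIRED_FIELDS = ("columnNames", "classNames", "labelColumnName", "label")


def check_dataset_config(config, rows):
    """Return a list of problems found in `config` (dict) and `rows` (list of dicts)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in config]
    if problems:
        return problems

    label_col = config["labelColumnName"]
    if label_col not in config["columnNames"]:
        return [f"label column {label_col!r} is not in columnNames"]

    n_classes = len(config["classNames"])
    pred_col = config.get("predictionsColumnName")  # documented default: None

    for i, row in enumerate(rows):
        label = row.get(label_col)
        # Labels must be zero-indexed integers; bool is excluded explicitly
        # because it is a subclass of int.
        if isinstance(label, bool) or not isinstance(label, int) or not 0 <= label < n_classes:
            problems.append(f"row {i}: label {label!r} is not an integer in [0, {n_classes})")

        if pred_col is not None:
            probs = row.get(pred_col)
            # Assumption: one probability per class name, each within [0, 1].
            ok = (
                isinstance(probs, list)
                and len(probs) == n_classes
                and all(isinstance(p, (int, float)) and 0.0 <= p <= 1.0 for p in probs)
            )
            if not ok:
                problems.append(f"row {i}: {pred_col!r} is not a list of {n_classes} probabilities")
    return problems


# Example config using the documented field names (values are made up).
example_config = {
    "columnNames": ["text", "label", "predictions"],
    "classNames": ["negative", "positive"],
    "labelColumnName": "label",
    "label": "validation",
    "textColumnName": "text",
    "predictionsColumnName": "predictions",
}
example_rows = [
    {"text": "great product", "label": 1, "predictions": [0.1, 0.9]},
    {"text": "did not work", "label": 0, "predictions": [0.8, 0.2]},
]
print(check_dataset_config(example_config, example_rows))
```

The helper returns a list of messages instead of raising, so every problem in a file can be reported in one pass.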