Merged
Changes from 1 commit
Adjusts the operators documentation.
  • Loading branch information
mrDzurb committed Oct 23, 2023
commit d18fca05b6c863a6b603fafc48a83719bd06e8c8
25 changes: 18 additions & 7 deletions ads/opctl/operator/cmd.py
@@ -39,7 +39,11 @@
from .__init__ import __operators__
from .common import utils as operator_utils
from .common.backend_factory import BackendFactory
from .common.errors import OperatorCondaNotFoundError, OperatorImageNotFoundError
from .common.errors import (
OperatorCondaNotFoundError,
OperatorImageNotFoundError,
OperatorSchemaYamlError,
)
from .common.operator_loader import _operator_info_list


@@ -160,18 +164,20 @@ def init(
output = os.path.join(tempfile.TemporaryDirectory().name, "")

# generating operator specification
operator_config = None
operator_config = {}
try:
operator_cmd_module = runpy.run_module(
f"{operator_info.type}.cmd", run_name="init"
)
operator_config = operator_cmd_module.get("init", lambda: "")(
**{**kwargs, **{"type": type}}
)
with fsspec.open(
os.path.join(output, f"{operator_info.type}.yaml"), mode="w"
) as f:
f.write(yaml.dump(operator_config))

if not merge_config:
with fsspec.open(
os.path.join(output, f"{operator_info.type}.yaml"), mode="w"
) as f:
f.write(yaml.dump(operator_config))
except Exception as ex:
logger.info(
"The operator's specification was not generated "
@@ -407,7 +413,12 @@ def verify(
run_name="verify",
)
operator_module.get("verify")(config, **kwargs)

except OperatorSchemaYamlError as ex:
logger.debug(ex)
raise ValueError(
f"The operator's specification is not valid for the `{operator_info.type}` operator. "
f"{ex}"
)
except Exception as ex:
logger.debug(ex)
raise ValueError(
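The new `OperatorSchemaYamlError` handler above re-raises schema failures as a `ValueError` that names the operator. A minimal, self-contained sketch of that pattern (the exception class and `verify` body are simplified stand-ins, not the real ADS implementation):

```python
class OperatorSchemaYamlError(Exception):
    """Raised when an operator's YAML spec fails schema validation."""


def verify(config, operator_type):
    """Validate `config`; translate schema errors into user-facing ValueErrors."""
    try:
        # stand-in check; the real code runs a Cerberus schema validation
        if "spec" not in config:
            raise OperatorSchemaYamlError("missing required `spec` section")
    except OperatorSchemaYamlError as ex:
        raise ValueError(
            f"The operator's specification is not valid for the "
            f"`{operator_type}` operator. {ex}"
        ) from ex


try:
    verify({}, "forecast")
except ValueError as ex:
    print(ex)  # explains which operator's spec failed and why
```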
1 change: 1 addition & 0 deletions ads/opctl/operator/common/operator_config.py
@@ -61,6 +61,7 @@ def _validate_dict(cls, obj_dict: Dict) -> bool:
"""
schema = cls._load_schema()
validator = OperatorValidator(schema)
validator.allow_unknown = True
result = validator.validate(obj_dict)

if not result:
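The single line added above (`validator.allow_unknown = True`) tells Cerberus to accept keys that are not declared in the schema. A toy illustration of that semantic, using a hand-rolled validator rather than Cerberus itself:

```python
# Minimal illustration of `allow_unknown` semantics: with the flag off,
# keys absent from the schema fail validation; with it on, they pass.
# This is a toy stand-in, not the real Cerberus Validator.

def validate(document, schema, allow_unknown=False):
    for key in document:
        if key not in schema and not allow_unknown:
            return False  # unknown key rejected
    # type-check only the keys that are declared in the schema
    return all(
        isinstance(document[k], t) for k, t in schema.items() if k in document
    )


schema = {"kind": str, "version": str}
doc = {"kind": "operator", "version": "v1", "extra_field": 42}

print(validate(doc, schema))                      # False: `extra_field` unknown
print(validate(doc, schema, allow_unknown=True))  # True: unknown keys tolerated
```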
1 change: 1 addition & 0 deletions ads/opctl/operator/lowcode/forecast/schema.yaml
@@ -227,6 +227,7 @@ spec:
required: true
schema:
type: string
default: ["Column1"]

horizon:
required: true
12 changes: 6 additions & 6 deletions docs/source/user_guide/cli/opctl/configure.rst
@@ -28,6 +28,7 @@ This will prompt you to setup default ADS CLI configurations for each OCI profil
[OCI]
oci_config = ~/.oci/config
oci_profile = ANOTHERPROF
auth = api_key # security_token, instance_principal, resource_principal

[CONDA]
conda_pack_folder = </local/path/for/saving/condapack>
@@ -137,7 +138,7 @@ To generate starter specification run -

ads opctl init --help

The resource type is a mandatory attribute that needs to be provided. Currently supported resource types - `dataflow`, `deployment`, `job` and `pipeline`.
The resource type is a mandatory attribute that needs to be provided. Currently supported resource types - ``dataflow``, ``deployment``, ``job`` and ``pipeline``.
For instance to generate starter specification for the Data Science job, run -

.. code-block::
@@ -149,10 +150,10 @@ The resulting YAML will be printed in the console. By default the ``python`` run

**Supported runtimes**

- For a ``job`` - `container`, `gitPython`, `notebook`, `python` and `script`.
- For a ``pipeline`` - `container`, `gitPython`, `notebook`, `python` and `script`.
- For a ``dataflow`` - `dataFlow` and `dataFlowNotebook`.
- For a ``deployment`` - `conda` and `container`.
- For a ``job`` - ``container``, ``gitPython``, ``notebook``, ``python`` and ``script``.
- For a ``pipeline`` - ``container``, ``gitPython``, ``notebook``, ``python`` and ``script``.
- For a ``dataflow`` - ``dataFlow`` and ``dataFlowNotebook``.
- For a ``deployment`` - ``conda`` and ``container``.


If you want to specify a particular runtime use -
@@ -166,4 +167,3 @@ Use the ``--output`` attribute to save the result in a YAML file.
.. code-block::

ads opctl init job --runtime-type container --output job_with_container_runtime.yaml

142 changes: 133 additions & 9 deletions docs/source/user_guide/operators/common/run.rst
@@ -19,7 +19,49 @@ The first step is to generate starter kit configurations that simplify the execu

.. code-block:: bash

ads operator init --type <operator-type>
ads operator init --help

.. figure:: figures/operator_init.png
:align: center

.. admonition:: Important
:class: warning

If the ``--merge-config`` flag is set to ``true``, the ``<operator-type>.yaml`` file will be merged with the backend configuration, which contains pre-populated infrastructure and runtime sections. You don't need to provide backend information separately in this case.

.. code-block:: bash

ads operator run -f <operator-type>.yaml

Alternatively, the ``ads opctl run`` command can be used:

.. code-block:: bash

ads opctl run -f <operator-type>.yaml

The operator will be run in the chosen environment without requiring additional modifications.


Different Ways To Run Operator
------------------------------

The operator can be run in two different ways:

.. code-block:: bash

ads operator run -f <operator-config>.yaml

Or alternatively:

.. code-block:: bash

ads opctl run -f <operator-config>.yaml

Although the two commands above look equivalent, the ``ads operator run`` command is more flexible.
A few restrictions apply when running the operator with the ``ads opctl run`` command:

- The ``<operator-config>.yaml`` file must contain all the information necessary to run the operator. This means it must include a ``runtime`` section describing the backend configuration for the operator.
- If the ``<operator-config>.yaml`` file does not contain a ``runtime`` section, the ``ads opctl run`` command can still be used in a restricted mode with the ``-b`` option, which specifies the backend to run the operator on. The ``-b`` option can be used with the following backends: ``local``, ``dataflow`` and ``job``. However, you will not be able to use the ``-b`` option with the local ``container`` backend or the Data Science Jobs ``container`` backend.
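For illustration, a merged ``<operator-config>.yaml`` containing a ``runtime`` section could look roughly like the following; the field names and values here are illustrative, not an exact ADS schema:

.. code-block:: yaml

    kind: operator
    type: forecast
    version: v1
    spec:
      # operator-specific settings (input data, output folder, horizon, ...)
      historical_data:
        url: oci://bucket@namespace/data.csv
    runtime:
      # backend section merged in by `ads operator init --merge-config`
      kind: operator.local
      type: container
      version: v1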


Run Operator Locally
@@ -34,20 +76,26 @@ To run the operator locally, follow these steps:

1. Create and activate a new conda environment named ``<operator-type>``.
2. Install all the required libraries listed in the ``environment.yaml`` file generated by the ``ads operator init --type <operator-type>`` command.
3. Review the ``<operator-type>.yaml`` file generated by the ``ads operator init`` command and make necessary adjustments to input and output file locations.
3. Review the ``<operator-type>.yaml`` file generated by the ``ads operator init`` command and make necessary adjustments to input and output file locations. Note that the ``<operator-type>.yaml`` file will not be generated if the ``--merge-config`` flag is set to ``true``.
4. Verify the operator's configuration using the following command:

.. code-block:: bash

ads operator verify -f <operator-type>.yaml
ads operator verify -f <operator-config>.yaml

5. To run the operator within the ``<operator-type>`` conda environment, use this command:

.. code-block:: bash

ads operator run -f <operator-type>.yaml -b local

The operator will be run in your local environment without requiring additional modifications.
Alternatively, the operator can be run with the ``ads opctl run`` command:

.. code-block:: bash

ads opctl run -f <operator-type>.yaml -b local

See the `Different Ways To Run Operator <#different-ways-to-run-operator>`_ section for more details.

Within Container
~~~~~~~~~~~~~~~~
@@ -83,6 +131,21 @@ Following is the YAML schema for validating the runtime YAML using `Cerberus <ht

ads operator run -f <operator-type>.yaml -b backend_operator_local_container_config.yaml

Or, using the shorthand command:

.. code-block:: bash

ads operator run -f <operator-type>.yaml -b local.container


Alternatively, the operator can be run with the ``ads opctl run`` command. However, in this case the runtime information needs to be merged into the operator's config. See the `Different Ways To Run Operator <#different-ways-to-run-operator>`_ section for more details.

.. code-block:: bash

ads opctl run -f <operator-type>.yaml

If the backend runtime information is not merged into the operator's config, the operator cannot be run with the ``ads opctl run`` command using the container runtime. Use the ``ads operator run`` command instead.


Run Operator In Data Science Job
--------------------------------
@@ -122,22 +185,40 @@ To publish ``<operator-type>:<operator-version>`` to OCR, use this command:

After publishing the container to OCR, you can use it within the Data Science jobs service. Check the ``backend_job_container_config.yaml`` configuration file built while initializing the starter configs for the operator. It should contain pre-populated infrastructure and runtime sections. The runtime section should have an image property, like ``image: iad.ocir.io/<tenancy>/<operator-type>:<operator-version>``.

1. Adjust the ``<operator-type>.yaml`` configuration with the proper input/output folders. When running operator in a Data Science job, it won't have access to local folders, so input data and output folders should be placed in the Object Storage bucket. Open the ``<operator-type>.yaml`` and adjust the data path fields.
3. Adjust the ``<operator-type>.yaml`` configuration with the proper input/output folders. When running the operator in a Data Science job, it won't have access to local folders, so input data and output folders should be placed in an Object Storage bucket. Open the ``<operator-type>.yaml`` and adjust the data path fields.

2. Run the operator on the Data Science jobs using this command:
4. Run the operator on the Data Science jobs using this command:

.. code-block:: bash

ads operator run -f <operator-type>.yaml -b backend_job_container_config.yaml

You can run the operator within the ``--dry-run`` attribute to check the final configs that will be used to run the operator on the service.
Or, using the shorthand command:

.. code-block:: bash

ads operator run -f <operator-type>.yaml -b job.container

In this case the backend config will be built on the fly.
However, the recommended approach is to use explicit configurations for both the operator and the backend.

Alternatively, the operator can be run with the ``ads opctl run`` command. However, in this case the runtime information needs to be merged into the operator's config. See the `Different Ways To Run Operator <#different-ways-to-run-operator>`_ section for more details.

.. code-block:: bash

ads opctl run -f <operator-type>.yaml

If the backend runtime information is not merged into the operator's config, the operator cannot be run with the ``ads opctl run`` command using the container runtime. Use the ``ads operator run`` command instead.

You can use the ``--dry-run`` attribute to check the final configs that will be used to run the operator on the service. This will not run the operator; it only prints the final configs.

Running the operator will return a command to help you monitor the job's logs:

.. code-block:: bash

ads opctl watch <OCID>


Run With Conda Environment
~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -173,7 +254,28 @@ For more details on configuring the CLI, refer to the :doc:`Explore & Configure

.. code-block:: bash

ads operator run -f <operator-type>.yaml --backend-config backend_job_python_config.yaml
ads operator run -f <operator-type>.yaml -b backend_job_python_config.yaml

Or, using the shorthand command:

.. code-block:: bash

ads operator run -f <operator-type>.yaml -b job

In this case the backend config will be built on the fly.
However, the recommended approach is to use explicit configurations for both the operator and the backend.

Alternatively, the operator can be run with the ``ads opctl run`` command. However, in this case the runtime information needs to be merged into the operator's config. See the `Different Ways To Run Operator <#different-ways-to-run-operator>`_ section for more details.

.. code-block:: bash

ads opctl run -f <operator-type>.yaml

Or, if the backend runtime information is not merged into the operator's config:

.. code-block:: bash

ads opctl run -f <operator-type>.yaml -b job

6. Monitor the logs using the ``ads opctl watch`` command::

@@ -217,7 +319,29 @@ After publishing the conda environment to Object Storage, you can use it within

.. code-block:: bash

ads operator run -f <operator-type>.yaml --backend-config backend_dataflow_dataflow_config.yaml
ads operator run -f <operator-type>.yaml -b backend_dataflow_dataflow_config.yaml

Or, using the shorthand command:

.. code-block:: bash

ads operator run -f <operator-type>.yaml -b dataflow

In this case the backend config will be built on the fly.
However, the recommended approach is to use explicit configurations for both the operator and the backend.

Alternatively, the operator can be run with the ``ads opctl run`` command. However, in this case the runtime information needs to be merged into the operator's config. See the `Different Ways To Run Operator <#different-ways-to-run-operator>`_ section for more details.

.. code-block:: bash

ads opctl run -f <operator-type>.yaml

Or, if the backend runtime information is not merged into the operator's config:

.. code-block:: bash

ads opctl run -f <operator-type>.yaml -b dataflow


5. Monitor the logs using the ``ads opctl watch`` command::
