Preprocessing and postprocessing with models

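This topic describes how to create models, log them to the Snowflake Model Registry, and deploy them, using a range of model types and scenarios as examples. These include:

In-memory scikit-learn models and pipelines.
Your own custom models.
Multiple models.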

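In-memory scikit-learn models and pipelines

Snowflake ML lets you integrate in-memory scikit-learn models with the Model Registry by passing them as keyword arguments to the ModelContext class. The following example passes an in-memory scikit-learn model to the model context as a keyword argument and calls it from a custom model class.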

    from sklearn import datasets, svm
    import pandas as pd
    from snowflake.ml.model import custom_model

    # Step 1: Import the Iris dataset
    iris_X, iris_y = datasets.load_iris(return_X_y=True)

    # Step 2: Initialize a scikit-learn LinearSVC model and train it
    svc = svm.LinearSVC()
    svc.fit(iris_X, iris_y)

    # Step 3: Initialize ModelContext with keyword arguments
    mc = custom_model.ModelContext(
        my_model=svc,
    )

    # Step 4: Define a custom model class to utilize the context
    class ExampleSklearnModel(custom_model.CustomModel):
        def __init__(self, context: custom_model.ModelContext) -> None:
            super().__init__(context)

        @custom_model.inference_api
        def predict(self, input: pd.DataFrame) -> pd.DataFrame:
            # Use the model from the context for predictions
            model_output = self.context['my_model'].predict(input)
            # Return the predictions in a DataFrame
            return pd.DataFrame({'output': model_output})


Using `scikit-learn` pipelines in Snowflake ML

The following example shows how to use a scikit-learn pipeline in Snowflake ML. The pipeline contains preprocessing steps (imputation and scaling) followed by a predictive model, all managed inside a custom model class through ModelContext.

    from sklearn import datasets
    from sklearn.svm import SVC
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.impute import SimpleImputer
    import pandas as pd
    from snowflake.ml.model import custom_model

    # Step 1: Load the Iris dataset
    iris_X, iris_y = datasets.load_iris(return_X_y=True)

    # Step 2: Create a scikit-learn pipeline
    # The pipeline includes:
    # - A SimpleImputer to handle missing values
    # - A StandardScaler to standardize the data
    # - A Support Vector Classifier (SVC) for predictions
    pipeline = Pipeline([
        ('imputer', SimpleImputer(strategy='mean')),
        ('scaler', StandardScaler()),
        ('classifier', SVC(kernel='linear', probability=True))
    ])

    # Step 3: Fit the pipeline to the dataset
    pipeline.fit(iris_X, iris_y)

    # Step 4: Initialize ModelContext with the pipeline
    mc = custom_model.ModelContext(
        pipeline_model=pipeline,
    )

    # Step 5: Define a custom model class to utilize the pipeline
    class ExamplePipelineModel(custom_model.CustomModel):
        def __init__(self, context: custom_model.ModelContext) -> None:
            super().__init__(context)

        @custom_model.inference_api
        def predict(self, input: pd.DataFrame) -> pd.DataFrame:
            # Use the pipeline from the context to process input and make predictions
            predictions = self.context['pipeline_model'].predict(input)
            probabilities = self.context['pipeline_model'].predict_proba(input)

            # Return predictions and probabilities as a DataFrame.
            # Iris has three classes, so predict_proba returns three columns.
            return pd.DataFrame({
                'predictions': predictions,
                'probability_class_0': probabilities[:, 0],
                'probability_class_1': probabilities[:, 1],
                'probability_class_2': probabilities[:, 2]
            })

    # Example usage:
    # Convert new input data into a DataFrame
    new_input = pd.DataFrame(iris_X[:5])  # Using the first 5 samples for demonstration

    # Initialize the custom model and run predictions
    custom_pipeline_model = ExamplePipelineModel(context=mc)
    result = custom_pipeline_model.predict(new_input)
    print(result)
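To make the two preprocessing steps concrete, here is a minimal pure-Python sketch of mean imputation followed by standardization on a single column. It is an illustration only, not scikit-learn's implementation: the helper names `impute_mean` and `standardize` are hypothetical, and the sketch assumes that missing values are represented as `None` and that scaling divides by the population standard deviation.

```python
import statistics

def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in column]

def standardize(column):
    """Shift to zero mean and scale by the population standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    # Assumption for this sketch: a constant column is centered but not scaled
    if std == 0:
        return [v - mean for v in column]
    return [(v - mean) / std for v in column]

raw = [1.0, None, 3.0]           # hypothetical column with one missing value
imputed = impute_mean(raw)       # [1.0, 2.0, 3.0]
scaled = standardize(imputed)    # centered on 0, symmetric around it

print(imputed)
print([round(v, 4) for v in scaled])
```

Because the pipeline stores the fitted steps, the same imputation mean and scaling statistics learned during `fit` are reused at prediction time, which is why the whole pipeline is passed to the context rather than the classifier alone.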


Using your own model

The following example uses your own model as a custom model.

    from snowflake.ml.model import custom_model
    import pandas as pd

    # your_own_model is a placeholder for a model object you have already created
    mc = custom_model.ModelContext(
        my_model=your_own_model,
    )

    class ExampleYourOwnModel(custom_model.CustomModel):
        def __init__(self, context: custom_model.ModelContext) -> None:
            super().__init__(context)

        @custom_model.inference_api
        def predict(self, input: pd.DataFrame) -> pd.DataFrame:
            # Call the model stored in the context on the input DataFrame
            model_output = self.context['my_model'].predict(input)
            return pd.DataFrame({'output': model_output})


Using multiple models

The following custom model combines multiple models and applies a bias, using a configuration file, when generating predictions.

    mc = custom_model.ModelContext(
        model1=model1,
        model2=model2,
        feature_preproc=preproc
    )


Note

model1 and model2 are objects of any model type natively supported by the registry. feature_preproc is a scikit-learn pipeline object.

    from snowflake.ml.model import custom_model
    import pandas as pd

    # Skeleton of a custom model class
    class ExamplePipelineModel(custom_model.CustomModel):
        @custom_model.inference_api
        def predict(self, input: pd.DataFrame) -> pd.DataFrame:
            ...
            return pd.DataFrame(...)

    # Here is the fully-functional custom model that uses both model1 and model2
    class ExamplePipelineModel(custom_model.CustomModel):
        def __init__(self, context: custom_model.ModelContext) -> None:
            super().__init__(context)

        @custom_model.inference_api
        def predict(self, input: pd.DataFrame) -> pd.DataFrame:
            # Preprocess the raw input with the pipeline stored in the context
            features = self.context['feature_preproc'].transform(input)
            # model2 runs on the features; model1 runs on model2's output
            model_output = self.context['model1'].predict(
                self.context['model2'].predict(features)
            )
            return pd.DataFrame({'output': model_output})
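The order of calls in `predict` can be easier to follow with plain Python stand-ins. The sketch below is an illustration under stated assumptions, not Snowflake ML code: `StubPreproc`, `StubModel`, and the plain `dict` used in place of `ModelContext` are all hypothetical. It only shows that the input passes through `feature_preproc` first, then `model2`, and that `model1` receives `model2`'s output.

```python
class StubPreproc:
    """Hypothetical stand-in for a fitted preprocessing pipeline."""

    def transform(self, rows):
        # Multiply every value by 10 so this step is easy to spot in the output
        return [[v * 10 for v in row] for row in rows]


class StubModel:
    """Hypothetical stand-in for a model that exposes predict()."""

    def __init__(self, offset):
        self.offset = offset

    def predict(self, rows):
        # Add a fixed offset so each model leaves a recognizable trace
        return [[v + self.offset for v in row] for row in rows]


# A plain dict plays the role of the model context (assumption for illustration)
context = {
    "feature_preproc": StubPreproc(),
    "model1": StubModel(offset=100),
    "model2": StubModel(offset=1),
}


def predict(rows):
    features = context["feature_preproc"].transform(rows)
    intermediate = context["model2"].predict(features)  # model2 runs first
    return context["model1"].predict(intermediate)      # model1 consumes model2's output


result = predict([[1, 2]])
print(result)  # [[111, 121]]
```

Each stage leaves a distinct mark on the values (times 10, plus 1, plus 100), so the final numbers show the order in which the three objects were applied.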
