Classification overview

A common use case for machine learning is classifying new data by using a model trained on similar labeled data. For example, you might want to predict whether an email is spam, or whether a customer product review is positive, negative, or neutral.

You can use any of the following models in combination with the ML.PREDICT function to perform classification:

Logistic regression models: use logistic regression by setting the MODEL_TYPE option to LOGISTIC_REG.
Boosted tree models: use a gradient boosted decision tree by setting the MODEL_TYPE option to BOOSTED_TREE_CLASSIFIER.
Random forest models: use a random forest by setting the MODEL_TYPE option to RANDOM_FOREST_CLASSIFIER.
Deep neural network (DNN) models: use a neural network by setting the MODEL_TYPE option to DNN_CLASSIFIER.
Wide & Deep models: use wide & deep learning by setting the MODEL_TYPE option to DNN_LINEAR_COMBINED_CLASSIFIER.
AutoML models: use an AutoML classification model by setting the MODEL_TYPE option to AUTOML_CLASSIFIER.
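
As a minimal sketch of how the pieces above fit together, the following statements train a logistic regression classifier and then classify new rows with ML.PREDICT. The dataset, table, and column names (`mydataset`, `labeled_emails`, `new_emails`, `subject_length`, `num_links`, `is_spam`) are hypothetical and stand in for your own data. Any of the other MODEL_TYPE values listed above could be substituted, although some of them may need additional options.

```sql
-- Train a classifier on labeled data.
-- All dataset, table, and column names here are hypothetical placeholders.
CREATE OR REPLACE MODEL `mydataset.spam_classifier`
OPTIONS (
  MODEL_TYPE = 'LOGISTIC_REG',      -- swap in another classifier type if preferred
  INPUT_LABEL_COLS = ['is_spam']    -- the column that holds the known label
) AS
SELECT
  subject_length,
  num_links,
  is_spam
FROM `mydataset.labeled_emails`;

-- Classify new, unlabeled rows with the trained model.
-- The input query supplies the same feature columns used in training.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.spam_classifier`,
  (
    SELECT subject_length, num_links
    FROM `mydataset.new_emails`
  )
);
```

With this setup, the ML.PREDICT output should include a `predicted_is_spam` column holding the predicted class and a `predicted_is_spam_probs` column holding the per-class probabilities, alongside the input columns.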

Recommended knowledge

By using the default settings in the CREATE MODEL statements and the ML.PREDICT function, you can create and use a classification model even without much ML knowledge. However, having basic knowledge about ML development helps you optimize both your data and your model to deliver better results. We recommend using the following resources to develop familiarity with ML techniques and processes: