Classification overview
A common use case for machine learning is classifying new data by using a model trained on similar labeled data. For example, you might want to predict whether an email is spam, or whether a customer product review is positive, negative, or neutral.
You can use any of the following models in combination with the ML.PREDICT function to perform classification:
- Logistic regression models: use logistic regression by setting the MODEL_TYPE option to LOGISTIC_REG.
- Boosted tree models: use a gradient boosted decision tree by setting the MODEL_TYPE option to BOOSTED_TREE_CLASSIFIER.
- Random forest models: use a random forest by setting the MODEL_TYPE option to RANDOM_FOREST_CLASSIFIER.
- Deep neural network (DNN) models: use a neural network by setting the MODEL_TYPE option to DNN_CLASSIFIER.
- Wide & Deep models: use wide & deep learning by setting the MODEL_TYPE option to DNN_LINEAR_COMBINED_CLASSIFIER.
- AutoML models: use an AutoML classification model by setting the MODEL_TYPE option to AUTOML_CLASSIFIER.
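As a minimal sketch of how the MODEL_TYPE option and the ML.PREDICT function fit together, the following statements train a logistic regression classifier for the spam example and then classify new rows. The dataset, model, table, and column names (mydataset, spam_model, labeled_emails, new_emails, subject_length, num_links, is_spam) are hypothetical placeholders, not names from this documentation. Switching to another model in the list should only require changing the MODEL_TYPE value, although each model type has its own additional options.

```sql
-- Train a classifier. All dataset, table, and column names are
-- hypothetical placeholders for illustration.
CREATE OR REPLACE MODEL `mydataset.spam_model`
OPTIONS (
  MODEL_TYPE = 'LOGISTIC_REG',     -- or another classifier type from the list
  INPUT_LABEL_COLS = ['is_spam']   -- the labeled column the model learns to predict
) AS
SELECT
  subject_length,
  num_links,
  is_spam
FROM
  `mydataset.labeled_emails`;

-- Classify new, unlabeled rows with the trained model.
SELECT
  *
FROM
  ML.PREDICT(
    MODEL `mydataset.spam_model`,
    (
      SELECT
        subject_length,
        num_links
      FROM
        `mydataset.new_emails`
    )
  );
```

With a label column named is_spam, the prediction query is expected to return the input columns together with a predicted_is_spam column that holds the predicted class.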
Recommended knowledge
By using the default settings in the CREATE MODEL statements and the ML.PREDICT function, you can create and use a classification model even without much ML knowledge. However, having basic knowledge about ML development helps you optimize both your data and your model to deliver better results. We recommend using the following resources to develop familiarity with ML techniques and processes: