The ML.FEATURE_IMPORTANCE function
This document describes the ML.FEATURE_IMPORTANCE function, which lets you see feature importance scores for a model. These scores indicate how useful or valuable each feature was in the construction of a boosted tree or random forest model during training. For more information, see the feature_importances property in the XGBoost library.
Syntax
ML.FEATURE_IMPORTANCE( MODEL `PROJECT_ID.DATASET.MODEL` )
Arguments
ML.FEATURE_IMPORTANCE takes the following arguments:
- PROJECT_ID: your project ID.
- DATASET: the BigQuery dataset that contains the model.
- MODEL: the name of the model.
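Putting the arguments together, a fully qualified call might look like the following sketch. The project, dataset, and model names here are placeholders, not real resources:

```sql
-- Hypothetical fully qualified model reference:
-- project ID, dataset, and model name separated by dots.
SELECT *
FROM ML.FEATURE_IMPORTANCE(
  MODEL `myproject.mydataset.mymodel`)
```

If the model is in your default project, you can omit the project ID, as shown in the example later in this document.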
Output
ML.FEATURE_IMPORTANCE returns the following columns:
- feature: a STRING value that contains the name of the feature column in the input training data.
- importance_weight: a FLOAT64 value that contains the number of times a feature is used to split the data across all trees.
- importance_gain: a FLOAT64 value that contains the average gain across all splits in which the feature is used.
- importance_cover: a FLOAT64 value that contains the average coverage across all splits in which the feature is used.
If the TRANSFORM clause was used in the CREATE MODEL statement that created the model, ML.FEATURE_IMPORTANCE returns importance information for the pre-transform columns from the query_statement clause of the CREATE MODEL statement, not for the engineered columns produced by the TRANSFORM clause.
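For example, in the following hedged sketch (the model, table, and column names are hypothetical), ML.FEATURE_IMPORTANCE would report importance for the pre-transform columns f1 and f2, rather than for the engineered bucketized_f1 column:

```sql
-- Hypothetical model: the TRANSFORM clause engineers a feature
-- from the raw column f1.
CREATE OR REPLACE MODEL `mydataset.mymodel`
  TRANSFORM (
    ML.QUANTILE_BUCKETIZE(f1, 5) OVER () AS bucketized_f1,
    f2,
    label
  )
  OPTIONS (
    model_type = 'BOOSTED_TREE_CLASSIFIER',
    input_label_cols = ['label'])
AS
SELECT f1, f2, label
FROM `mydataset.training_data`;
```

After training, `SELECT * FROM ML.FEATURE_IMPORTANCE(MODEL \`mydataset.mymodel\`)` returns rows keyed on f1 and f2, the columns from the query_statement clause.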
Permissions
You must have the bigquery.models.create and bigquery.models.getData Identity and Access Management (IAM) permissions to run ML.FEATURE_IMPORTANCE.
Limitations
ML.FEATURE_IMPORTANCE is only supported with boosted tree models and random forest models.
Example
This example retrieves feature importance from mymodel in mydataset. The dataset is in your default project.
SELECT * FROM ML.FEATURE_IMPORTANCE(MODEL `mydataset.mymodel`)
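Because the function returns one row per feature, you can sort the output by any of the importance metrics. The following variant (using the same hypothetical model name) ranks features by importance_gain:

```sql
-- Rank features by average gain, most important first.
SELECT feature, importance_gain
FROM ML.FEATURE_IMPORTANCE(MODEL `mydataset.mymodel`)
ORDER BY importance_gain DESC;
```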
What's next
- For more information about Explainable AI, see BigQuery Explainable AI overview.
- For more information about supported SQL statements and functions for ML models, see End-to-end user journeys for ML models.