The ML.ADVANCED_WEIGHTS function
This document describes the ML.ADVANCED_WEIGHTS function, which lets you see the underlying weights that a linear or binary logistic regression model uses during prediction, along with the p-value and standard error associated with each weight. ML.ADVANCED_WEIGHTS is an extended version of ML.WEIGHTS for linear and binary logistic regression models.
Usage requirements
You can only use ML.ADVANCED_WEIGHTS on linear and binary logistic regression models that are trained with the following option settings:
- The CALCULATE_P_VALUES value is TRUE.
- The CATEGORY_ENCODING_METHOD value is DUMMY_ENCODING.
- The L1_REG value is 0.
It's common to want standard errors or p-values for the regression coefficients or other estimated quantities of penalized regression methods. In principle, such standard errors can be calculated, for example, by using the bootstrap. In practice, this calculation isn't done, for reasons that the authors of the R package explain.
Multiclass logistic regression models aren't supported.
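As an illustration of the training requirements above, the following Python sketch renders a CREATE MODEL statement that sets the three required options. The helper name render_create_model, the label column label, and the table mydataset.training_data are hypothetical placeholders, not part of the original documentation. The sketch only builds a string and doesn't check that the label is binary for logistic regression.

```python
def render_create_model(model: str, model_type: str, label: str, source_table: str) -> str:
    """Render a CREATE MODEL statement with the options ML.ADVANCED_WEIGHTS requires.

    All argument values are placeholders supplied by the caller.
    """
    if model_type not in ("LINEAR_REG", "LOGISTIC_REG"):
        raise ValueError(
            "ML.ADVANCED_WEIGHTS needs a linear or binary logistic regression model"
        )
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        "OPTIONS(\n"
        f"  MODEL_TYPE = '{model_type}',\n"
        f"  INPUT_LABEL_COLS = ['{label}'],\n"
        "  CALCULATE_P_VALUES = TRUE,\n"
        "  CATEGORY_ENCODING_METHOD = 'DUMMY_ENCODING',\n"
        "  L1_REG = 0\n"
        ") AS\n"
        f"SELECT * FROM `{source_table}`"
    )


# Hypothetical model, label column, and training table.
statement = render_create_model(
    "mydataset.mymodel", "LINEAR_REG", "label", "mydataset.training_data"
)
print(statement)
```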
Syntax
ML.ADVANCED_WEIGHTS( MODEL `PROJECT_ID.DATASET.MODEL`, STRUCT( [STANDARDIZE AS standardize]))
Arguments
ML.ADVANCED_WEIGHTS takes the following arguments:
- PROJECT_ID: your project ID.
- DATASET: the BigQuery dataset that contains the model.
- MODEL: the name of the model.
- STANDARDIZE: a BOOL value that specifies whether the model weights should be standardized to assume that all features have a mean of zero and a standard deviation of one. Standardizing the weights allows the absolute magnitudes of the weights to be compared to each other. The default value is FALSE.
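To see why standardized weights are easier to compare, the following Python sketch converts weights on the original feature scale to the scale of standardized features. It assumes the usual relationship z = (x - mean) / stddev for a linear model. This is an illustrative assumption, not a description of how BigQuery computes standardized weights internally, and the feature names and numbers are made up.

```python
def standardize_weights(weights, intercept, means, stddevs):
    """Convert raw-scale linear weights to the scale of standardized features.

    Assumes z = (x - mean) / stddev, so that
    intercept + sum(w * x) == std_intercept + sum(std_w * z).
    """
    std_weights = {name: w * stddevs[name] for name, w in weights.items()}
    std_intercept = intercept + sum(w * means[name] for name, w in weights.items())
    return std_weights, std_intercept


# Hypothetical features measured on very different scales.
weights = {"sqft": 0.5, "age": -20.0}
means = {"sqft": 1500.0, "age": 30.0}
stddevs = {"sqft": 400.0, "age": 10.0}

std_weights, std_intercept = standardize_weights(weights, 100.0, means, stddevs)
print(std_weights, std_intercept)
```

In this made-up example the raw weights differ by a factor of 40, but after standardization both features have the same absolute weight, which is the kind of comparison the STANDARDIZE argument is meant to support.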
Output
ML.ADVANCED_WEIGHTS returns the following columns:
- processed_input: a STRING value that contains the name of the feature column. The value of this column is the name of the feature column that's provided in the query_statement clause used during model training. If the feature is non-numeric, then there are multiple rows with the same processed_input value, one for each category of the feature.
- category: a STRING value that contains the category name if the column identified in the processed_input value is non-numeric. Returns a NULL value for numeric columns.
- weight: a FLOAT64 value that contains the weight of each feature.
- standard_error: a FLOAT64 value that contains the standard error of the weight.
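- p_value: a FLOAT64 value that contains the p-value of the weight, tested against the null hypothesis that the weight is zero. The p-value for feature $j$ is calculated using the following formula:
  $$ p(j) = 2 \left(1 - \Phi\left(\frac{|\hat\beta_j|}{\sigma_j}\right)\right) $$
  where $\hat\beta_j$ is the weight of feature $j$ after training, $\sigma_j$ is its standard error, and $\Phi$ is the cumulative distribution function of the standard normal distribution. This is equivalent to 2 * (1 - stats.norm.cdf(abs($\hat\beta_j$), loc=0, scale=$\sigma_j$)).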
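The p-value formula above is a two-sided test against a normal distribution centered at zero with the standard error as its scale. The following minimal Python sketch reproduces it with the standard library only. The function name two_sided_p_value is hypothetical, and the sketch uses math.erfc in place of a normal CDF call, relying on the identity 2 * (1 - Φ(z)) = erfc(z / √2).

```python
import math


def two_sided_p_value(weight: float, standard_error: float) -> float:
    """Two-sided p-value under a normal null: 2 * (1 - Phi(|weight| / standard_error))."""
    if standard_error <= 0:
        raise ValueError("standard_error must be positive")
    z = abs(weight) / standard_error
    # 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)); erfc avoids the loss of
    # precision that 1 - cdf suffers when z is large.
    return math.erfc(z / math.sqrt(2.0))


# A weight about 1.96 standard errors from zero sits near the 5% level.
print(two_sided_p_value(weight=1.96, standard_error=1.0))
```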
If the TRANSFORM clause was used in the CREATE MODEL statement that created the model, ML.ADVANCED_WEIGHTS outputs the weights of the TRANSFORM output features. The weights are denormalized by default, with the option to get normalized weights, exactly like models that are created without TRANSFORM.
Permissions
You must have the bigquery.models.create and bigquery.models.getData Identity and Access Management (IAM) permissions in order to run ML.ADVANCED_WEIGHTS.
Limitations
The total cardinality of training features must be less than 1,000. This limit stems from the constraints on computing p-values and standard errors when you set the CALCULATE_P_VALUES option to TRUE when training the model.
Examples
The following examples demonstrate ML.ADVANCED_WEIGHTS with and without standardization.
Without standardization
The following example retrieves weight information from mymodel in mydataset where the dataset is in your default project.
The query returns the weight, standard error, and p-value for each feature. For a non-numeric input column such as input_col, it returns one row for each dummy-encoded category.
SELECT * FROM ML.ADVANCED_WEIGHTS(MODEL `mydataset.mymodel`, STRUCT(FALSE AS standardize))
With standardization
The following example retrieves weight information from mymodel in mydataset. The dataset is in your default project.
The query retrieves standardized weights, which assume all features have a mean of 0 and a standard deviation of 1.0.
SELECT * FROM ML.ADVANCED_WEIGHTS(MODEL `mydataset.mymodel`, STRUCT(TRUE AS standardize))
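Once the standardized weights are retrieved, a common follow-up is to keep only the statistically significant rows and rank them by absolute weight. The following Python sketch does this on plain dictionaries that mimic the output columns. The rows, the feature names, the 0.05 threshold, and the helper name significant_by_magnitude are all illustrative assumptions rather than real query results.

```python
def significant_by_magnitude(rows, alpha=0.05):
    """Keep rows with p_value below alpha, sorted by descending absolute weight."""
    kept = [r for r in rows if r["p_value"] is not None and r["p_value"] < alpha]
    return sorted(kept, key=lambda r: abs(r["weight"]), reverse=True)


# Made-up rows shaped like the ML.ADVANCED_WEIGHTS output columns.
rows = [
    {"processed_input": "sqft", "category": None,
     "weight": 0.8, "standard_error": 0.1, "p_value": 0.0001},
    {"processed_input": "age", "category": None,
     "weight": -1.2, "standard_error": 0.3, "p_value": 0.002},
    {"processed_input": "input_col", "category": "a",
     "weight": 0.4, "standard_error": 0.5, "p_value": 0.42},
]

for row in significant_by_magnitude(rows):
    print(row["processed_input"], row["category"], row["weight"], row["p_value"])
```

Ranking by absolute weight is only meaningful for standardized weights, because unstandardized weights depend on the units of each feature.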
What's next
- For information about Explainable AI, see BigQuery Explainable AI overview.
- For more information about supported SQL statements and functions for ML models, see End-to-end user journeys for ML models.