Dataproc sets special metadata values for the instances that run in your cluster:
| Metadata key | Value | 
|---|---|
| dataproc-bucket | Name of the cluster's staging bucket | 
| dataproc-region | Region of the cluster's endpoint | 
| dataproc-worker-count | Number of worker nodes in the cluster. The value is 0 for single node clusters. |
| dataproc-cluster-name | Name of the cluster | 
| dataproc-cluster-uuid | UUID of the cluster | 
| dataproc-role | Instance's role, either Master or Worker |
| dataproc-master | Hostname of the first master node. The value is either [CLUSTER_NAME]-m in a standard or single node cluster, or [CLUSTER_NAME]-m-0 in a high-availability cluster, where [CLUSTER_NAME] is the name of your cluster. |
| dataproc-master-additional | Comma-separated list of hostnames for the additional master nodes in a high-availability cluster, for example, [CLUSTER_NAME]-m-1,[CLUSTER_NAME]-m-2 in a cluster that has 3 master nodes. |
| SPARK_BQ_CONNECTOR_VERSION or SPARK_BQ_CONNECTOR_URL | The version or URL that points to a Spark BigQuery connector version to use in Spark applications, for example, 0.42.1 or gs://spark-lib/bigquery/spark-3.5-bigquery-0.42.1.jar. A default Spark BigQuery connector version is pre-installed in Dataproc 2.1 and later image version clusters. For more information, see Use the Spark BigQuery connector. |
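These keys are stored as instance metadata on the cluster's VMs, so one way to read a value from inside a node is to query the Compute Engine metadata server. The following sketch reads the cluster's staging bucket name; the endpoint and the required Metadata-Flavor header are standard Compute Engine metadata server behavior.

```bash
# Read the dataproc-bucket metadata value from inside a cluster VM.
# The Metadata-Flavor header is required by the Compute Engine metadata server.
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/attributes/dataproc-bucket"
```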
You can use these values to customize the behavior of initialization actions.
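For example, a minimal initialization action sketch might branch on dataproc-role so that master and worker nodes run different setup steps; the echo statements below are placeholders for whatever per-role setup you need.

```bash
#!/bin/bash
# Sketch of an initialization action that branches on the instance's role.
set -euo pipefail

ROLE=$(curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/attributes/dataproc-role")

if [[ "${ROLE}" == "Master" ]]; then
  echo "Running master-only setup..."   # placeholder for master-node steps
else
  echo "Running worker setup..."        # placeholder for worker-node steps
fi
```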
You can use the --metadata flag in the gcloud dataproc clusters create command to provide your own metadata:
```bash
gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    --metadata=name1=value1,name2=value2... \
    ... other flags ...
```
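For example, the following call uses placeholder values (the cluster name, region, and the two custom keys are illustrative only). An initialization action could later read these keys from the metadata server in the same way as the Dataproc-set keys above.

```bash
# Placeholder cluster name, region, and metadata keys for illustration.
gcloud dataproc clusters create example-cluster \
    --region=us-central1 \
    --metadata=environment=staging,team=data-eng
```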