The Imagen API lets you edit images in seconds, using text prompts, masks, and existing images to guide the edits.
View Imagen for Editing and Customization model card
Supported model versions
The Imagen API supports the following models:
- imagen-3.0-capability-001
For more information about the features that the model supports, see Imagen models.
HTTP request
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict \
-d '{
  "instances": [
    {
      "referenceImages": [
        {
          "referenceType": "REFERENCE_TYPE_RAW",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": string
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_MASK",
          "referenceId": 2,
          "referenceImage": {
            "bytesBase64Encoded": string
          },
          "maskImageConfig": {
            "maskMode": "MASK_MODE_USER_PROVIDED"
          }
        }
      ],
      "prompt": string
    }
  ],
  "parameters": {
    "addWatermark": boolean,
    "baseSteps": integer,
    "editMode": string,
    "guidanceScale": integer,
    "includeRaiReason": boolean,
    "includeSafetyAttributes": boolean,
    "language": string,
    "negativePrompt": string,
    "outputOptions": {
      "mimeType": string,
      "compressionQuality": integer
    },
    "personGeneration": string,
    "safetySetting": string,
    "sampleCount": integer,
    "seed": integer,
    "storageUri": string
  }
}'
Instances
| Instances | |
|---|---|
| prompt | Optional. The text prompt for the image. If a |
| referenceImages | List of Required. For mask editing, exactly two reference images must be specified, one with |
referenceImages object
 The referenceImages object describes the image assets for Imagen to edit.
| Parameters | |
|---|---|
| referenceType | Required. The type of reference image. One of the following: |
| referenceId | Required. A unique identifier for the reference image. Not used for masked editing. |
| referenceImage.bytesBase64Encoded | Required. Base64-encoded image bytes. Accepts PNG, JPEG, GIF, and BMP files. The maximum size is 20MB after transcoding to PNG. If you provide a mask image, it must be the same dimensions as the base image. |
| maskImageConfig.maskMode | Required when |
| maskImageConfig.dilation | Optional. Range: [0, 1]. The percentage of image width to dilate (grow) the mask by. This can help compensate for imprecise masks. For best results, we recommend the following |
| maskImageConfig.maskClasses | Optional. Mask classes for |
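The table above requires base64-encoded image bytes and a mask with the same dimensions as the base image. The following Python sketch shows one way to check both before building the `referenceImages` list. It assumes both inputs are PNG files (other accepted formats would need their own size check), and the helper names `png_size` and `build_reference_images` are hypothetical, not part of any Imagen client library.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of a PNG file."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers at byte offsets 16 and 20.
    return struct.unpack(">II", data[16:24])


def build_reference_images(base_png: bytes, mask_png: bytes) -> list[dict]:
    """Build the two-entry referenceImages list for a user-provided mask."""
    if png_size(base_png) != png_size(mask_png):
        raise ValueError("mask and base image must have the same dimensions")

    def encode(data: bytes) -> str:
        return base64.b64encode(data).decode("ascii")

    return [
        {
            "referenceType": "REFERENCE_TYPE_RAW",
            "referenceId": 1,
            "referenceImage": {"bytesBase64Encoded": encode(base_png)},
        },
        {
            "referenceType": "REFERENCE_TYPE_MASK",
            "referenceId": 2,
            "referenceImage": {"bytesBase64Encoded": encode(mask_png)},
            "maskImageConfig": {"maskMode": "MASK_MODE_USER_PROVIDED"},
        },
    ]
```

In practice the byte strings would come from reading the image files in binary mode, for example `open("base.png", "rb").read()`.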
Parameters
| Parameters | |
|---|---|
| addWatermark | Optional. Add an invisible watermark to the generated images. The default value is |
| baseSteps | Optional. The number of sampling steps. A higher value has better image quality, while a lower value has better latency. Defaults to For smaller mask areas or for removal or insert modes, use |
| editMode | Required for mask editing. An enum with one of the following values: |
| guidanceScale | Optional. Controls how much the model adheres to the text prompt. Large values increase output and prompt alignment, but might compromise image quality. Accepted range: Default: |
| includeRaiReason | Optional. Whether to include a safety reason for filtered images in the response. The default value is |
| includeSafetyAttributes | Optional. Whether to report the safety scores of each image in the response. The default value is |
| language | Optional. The language code that corresponds to your text prompt language. The following values are supported: |
| negativePrompt | Optional. A description of what to discourage in the generated images. |
| outputOptions | Optional. Describes the output image format in an |
| personGeneration | Optional. Allow generation of people by the model. The following values are supported: For mask-based editing |
| sampleCount | Optional. The number of images to generate. The default value is 4. |
| seed | Optional. The random seed for image generation. This isn't available when |
| safetySetting | Optional. Adds a filter level to safety filtering. The following values are supported: The default value is |
| storageUri | Optional. The Cloud Storage URI to store the generated images. |
Output options object
The outputOptions object describes the image output.
| Parameters | |
|---|---|
| outputOptions.mimeType | Optional. The image format that the output should be saved as. The following values are supported: The default value is |
| outputOptions.compressionQuality | Optional. The level of compression if the output type is |
Sample request
REST
Before using any of the request data, make the following replacements:
-  REGION: The region that your project is located in. For more information about supported regions, see Generative AI on Vertex AI locations.
-  PROJECT_ID: Your Google Cloud project ID.
-  TEXT_PROMPT: Optional. A text prompt to guide the images that the model generates. For best results, use a description of the masked area and avoid single-word prompts. For example, use "a cute corgi" instead of "corgi".
-  B64_BASE_IMAGE: A base64-encoded image of the image being edited that is 10MB or less in size. For more information about base64-encoding, see Base64 encode and decode files.
-  B64_MASK_IMAGE: A base64-encoded black and white mask image that is 10MB or less in size.
-  MASK_DILATION: Optional. A float value between 0 and 1, inclusive, that represents the percentage of the image width to grow the mask by. Using dilation helps compensate for imprecise masks. We recommend a value of 0.01.
-  EDIT_STEPS: Optional. An integer that represents the number of sampling steps. A higher value offers better image quality; a lower value offers better latency. We recommend that you try 35 steps to start. If the quality doesn't meet your requirements, then we recommend increasing the value towards an upper limit of 75.
-  SAMPLE_COUNT: Optional. An integer that describes the number of images to generate. The accepted range of values is 1-4. The default value is 4.
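As a rough illustration of how these placeholders map onto the request body for this sample, the Python sketch below fills them in and rejects values outside the ranges stated in the list. The function name `build_edit_request` is hypothetical, and the defaults simply reuse the values recommended in the list (0.01 dilation, 35 steps, 4 samples).

```python
import json


def build_edit_request(prompt, b64_base_image, b64_mask_image,
                       mask_dilation=0.01, edit_steps=35, sample_count=4):
    """Return the request body as a dict, with the placeholders filled in."""
    if not 0 <= mask_dilation <= 1:
        raise ValueError("MASK_DILATION must be between 0 and 1, inclusive")
    if not 1 <= sample_count <= 4:
        raise ValueError("SAMPLE_COUNT must be between 1 and 4")
    if edit_steps < 1:
        raise ValueError("EDIT_STEPS must be a positive integer")

    instance = {
        "referenceImages": [
            {
                "referenceType": "REFERENCE_TYPE_RAW",
                "referenceId": 1,
                "referenceImage": {"bytesBase64Encoded": b64_base_image},
            },
            {
                "referenceType": "REFERENCE_TYPE_MASK",
                "referenceImage": {"bytesBase64Encoded": b64_mask_image},
                "maskImageConfig": {
                    "maskMode": "MASK_MODE_USER_PROVIDED",
                    "dilation": mask_dilation,
                },
            },
        ],
    }
    if prompt:  # TEXT_PROMPT is optional, so only include it when given.
        instance["prompt"] = prompt

    return {
        "instances": [instance],
        "parameters": {
            "editConfig": {"baseSteps": edit_steps},
            "editMode": "EDIT_MODE_INPAINT_INSERTION",
            "sampleCount": sample_count,
        },
    }


# Placeholder strings stand in for real base64 data here.
body = build_edit_request("a cute corgi", "B64_BASE_IMAGE", "B64_MASK_IMAGE",
                          sample_count=2)
print(json.dumps(body, indent=2))
```

Writing the result of `json.dumps(body)` to a file named request.json gives a body that can be sent with the curl or PowerShell commands in this section.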
HTTP method and URL:
POST https://REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/publishers/google/models/imagen-3.0-capability-001:predict
Request JSON body:
{
  "instances": [
    {
      "prompt": "TEXT_PROMPT",
      "referenceImages": [
        {
          "referenceType": "REFERENCE_TYPE_RAW",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": "B64_BASE_IMAGE"
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_MASK",
          "referenceImage": {
            "bytesBase64Encoded": "B64_MASK_IMAGE"
          },
          "maskImageConfig": {
            "maskMode": "MASK_MODE_USER_PROVIDED",
            "dilation": MASK_DILATION
          }
        }
      ]
    }
  ],
  "parameters": {
    "editConfig": {
      "baseSteps": EDIT_STEPS
    },
    "editMode": "EDIT_MODE_INPAINT_INSERTION",
    "sampleCount": SAMPLE_COUNT
  }
}
To send your request, choose one of these options:
curl
 Save the request body in a file named request.json, and execute the following command: 
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/publishers/google/models/imagen-3.0-capability-001:predict"
PowerShell
 Save the request body in a file named request.json, and execute the following command: 
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/publishers/google/models/imagen-3.0-capability-001:predict" | Select-Object -Expand Content
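The same call can be sketched in Python with only the standard library. The helpers below are hypothetical names, not part of an official client: `build_predict_request` mirrors the URL and headers of the curl and PowerShell commands, and `decode_predictions` assumes the response shape documented in this section, a `predictions` array whose entries carry `mimeType` and `bytesBase64Encoded`. The access token is assumed to come from `gcloud auth print-access-token`.

```python
import base64
import json
import urllib.request

MODEL = "imagen-3.0-capability-001"


def build_predict_request(region, project_id, access_token, body):
    """Prepare (but do not send) the POST request for the predict endpoint."""
    url = (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{region}/publishers/google/models/{MODEL}:predict"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def decode_predictions(response_json):
    """Return a list of (mime_type, image_bytes) pairs from a response dict."""
    return [
        (p["mimeType"], base64.b64decode(p["bytesBase64Encoded"]))
        for p in response_json.get("predictions", [])
    ]


# Sending the request and saving the images could then look like this:
#
#   with open("request.json") as f:
#       req = build_predict_request("REGION", "PROJECT_ID", token, json.load(f))
#   with urllib.request.urlopen(req) as resp:
#       images = decode_predictions(json.load(resp))
#   for i, (mime_type, data) in enumerate(images):
#       with open(f"edited_{i}.png", "wb") as out:
#           out.write(data)
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without making a network call.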
"sampleCount": 2. The response returns two prediction objects, with the generated image bytes base64-encoded.  { "predictions": [ { "bytesBase64Encoded": "BASE64_IMG_BYTES", "mimeType": "image/png" }, { "mimeType": "image/png", "bytesBase64Encoded": "BASE64_IMG_BYTES" } ] } Class IDs
Use the following object class IDs to automatically create an image mask based on specific objects.
| Class ID ( class_) | Object | 
|---|---|
| 0 | backpack | 
| 1 | umbrella | 
| 2 | bag | 
| 3 | tie | 
| 4 | suitcase | 
| 5 | case | 
| 6 | bird | 
| 7 | cat | 
| 8 | dog | 
| 9 | horse | 
| 10 | sheep | 
| 11 | cow | 
| 12 | elephant | 
| 13 | bear | 
| 14 | zebra | 
| 15 | giraffe | 
| 16 | animal (other) | 
| 17 | microwave | 
| 18 | radiator | 
| 19 | oven | 
| 20 | toaster | 
| 21 | storage tank | 
| 22 | conveyor belt | 
| 23 | sink | 
| 24 | refrigerator | 
| 25 | washer dryer | 
| 26 | fan | 
| 27 | dishwasher | 
| 28 | toilet | 
| 29 | bathtub | 
| 30 | shower | 
| 31 | tunnel | 
| 32 | bridge | 
| 33 | pier wharf | 
| 34 | tent | 
| 35 | building | 
| 36 | ceiling | 
| 37 | laptop | 
| 38 | keyboard | 
| 39 | mouse | 
| 40 | remote | 
| 41 | cell phone | 
| 42 | television | 
| 43 | floor | 
| 44 | stage | 
| 45 | banana | 
| 46 | apple | 
| 47 | sandwich | 
| 48 | orange | 
| 49 | broccoli | 
| 50 | carrot | 
| 51 | hot dog | 
| 52 | pizza | 
| 53 | donut | 
| 54 | cake | 
| 55 | fruit (other) | 
| 56 | food (other) | 
| 57 | chair (other) | 
| 58 | armchair | 
| 59 | swivel chair | 
| 60 | stool | 
| 61 | seat | 
| 62 | couch | 
| 63 | trash can | 
| 64 | potted plant | 
| 65 | nightstand | 
| 66 | bed | 
| 67 | table | 
| 68 | pool table | 
| 69 | barrel | 
| 70 | desk | 
| 71 | ottoman | 
| 72 | wardrobe | 
| 73 | crib | 
| 74 | basket | 
| 75 | chest of drawers | 
| 76 | bookshelf | 
| 77 | counter (other) | 
| 78 | bathroom counter | 
| 79 | kitchen island | 
| 80 | door | 
| 81 | light (other) | 
| 82 | lamp | 
| 83 | sconce | 
| 84 | chandelier | 
| 85 | mirror | 
| 86 | whiteboard | 
| 87 | shelf | 
| 88 | stairs | 
| 89 | escalator | 
| 90 | cabinet | 
| 91 | fireplace | 
| 92 | stove | 
| 93 | arcade machine | 
| 94 | gravel | 
| 95 | platform | 
| 96 | playingfield | 
| 97 | railroad | 
| 98 | road | 
| 99 | snow | 
| 100 | sidewalk pavement | 
| 101 | runway | 
| 102 | terrain | 
| 103 | book | 
| 104 | box | 
| 105 | clock | 
| 106 | vase | 
| 107 | scissors | 
| 108 | plaything (other) | 
| 109 | teddy bear | 
| 110 | hair dryer | 
| 111 | toothbrush | 
| 112 | painting | 
| 113 | poster | 
| 114 | bulletin board | 
| 115 | bottle | 
| 116 | cup | 
| 117 | wine glass | 
| 118 | knife | 
| 119 | fork | 
| 120 | spoon | 
| 121 | bowl | 
| 122 | tray | 
| 123 | range hood | 
| 124 | plate | 
| 125 | person | 
| 126 | rider (other) | 
| 127 | bicyclist | 
| 128 | motorcyclist | 
| 129 | paper | 
| 130 | streetlight | 
| 131 | road barrier | 
| 132 | mailbox | 
| 133 | cctv camera | 
| 134 | junction box | 
| 135 | traffic sign | 
| 136 | traffic light | 
| 137 | fire hydrant | 
| 138 | parking meter | 
| 139 | bench | 
| 140 | bike rack | 
| 141 | billboard | 
| 142 | sky | 
| 143 | pole | 
| 144 | fence | 
| 145 | railing banister | 
| 146 | guard rail | 
| 147 | mountain hill | 
| 148 | rock | 
| 149 | frisbee | 
| 150 | skis | 
| 151 | snowboard | 
| 152 | sports ball | 
| 153 | kite | 
| 154 | baseball bat | 
| 155 | baseball glove | 
| 156 | skateboard | 
| 157 | surfboard | 
| 158 | tennis racket | 
| 159 | net | 
| 160 | base | 
| 161 | sculpture | 
| 162 | column | 
| 163 | fountain | 
| 164 | awning | 
| 165 | apparel | 
| 166 | banner | 
| 167 | flag | 
| 168 | blanket | 
| 169 | curtain (other) | 
| 170 | shower curtain | 
| 171 | pillow | 
| 172 | towel | 
| 173 | rug floormat | 
| 174 | vegetation | 
| 175 | bicycle | 
| 176 | car | 
| 177 | autorickshaw | 
| 178 | motorcycle | 
| 179 | airplane | 
| 180 | bus | 
| 181 | train | 
| 182 | truck | 
| 183 | trailer | 
| 184 | boat ship | 
| 185 | slow wheeled object | 
| 186 | river lake | 
| 187 | sea | 
| 188 | water (other) | 
| 189 | swimming pool | 
| 190 | waterfall | 
| 191 | wall | 
| 192 | window | 
| 193 | window blind | 
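To show how these IDs might be passed to `maskImageConfig.maskClasses`, here is a small Python sketch that looks up object names and builds a mask configuration. It rests on several assumptions: `maskClasses` is taken to be a list of integer class IDs, the `MASK_MODE_SEMANTIC` value is not confirmed by the text on this page, and the helper name `mask_config_for` is hypothetical. Only a handful of rows from the table are copied into the lookup.

```python
# A few rows copied from the class ID table; extend the mapping as needed.
CLASS_IDS = {"cat": 7, "dog": 8, "person": 125, "sky": 142, "car": 176}


def mask_config_for(objects, mask_mode="MASK_MODE_SEMANTIC"):
    """Build a maskImageConfig that selects the given object classes.

    The default mask_mode is an assumption; check the maskMode
    documentation for the value that enables class-based masks.
    """
    unknown = [name for name in objects if name not in CLASS_IDS]
    if unknown:
        raise KeyError(f"no class ID recorded for: {unknown}")
    return {
        "maskMode": mask_mode,
        "maskClasses": [CLASS_IDS[name] for name in objects],
    }


print(mask_config_for(["dog", "cat"]))
```

Failing on an unknown name, rather than skipping it, avoids silently sending a mask request that covers fewer objects than intended.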
What's next
- For more information, see Imagen on Vertex AI.