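Generate images using Imagen

Using the Firebase AI Logic SDKs, you can access Imagen models (via the Imagen API) to generate images from a text prompt. With this capability, you can:

  • Generate images from prompts written in natural language
  • Generate images in a wide range of formats and styles
  • Render text in images

This guide describes how to generate images with Imagen by providing only a text prompt.

Note, however, that Imagen can also generate images based on a reference image by using its customization capability (currently only for Android and Flutter). In the request, you provide a text prompt and a reference image that guides the model to generate a new image based on a specified style, subject (such as a product, person, or animal), or control. For example, you can generate a new image from a photo of a cat or from a drawing of a rocket and the moon.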


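Choose between Gemini and Imagen models

The Firebase AI Logic SDKs support generating and editing images with either a Gemini model or an Imagen model.

For most use cases, start with Gemini, and choose Imagen only for specialized tasks where image quality is critical.

Choose Gemini when you want to:

  • Use world knowledge and reasoning to generate contextually relevant images.
  • Blend text and images seamlessly, or interleave text and image output.
  • Embed accurate visuals within long text sequences.
  • Edit images conversationally while maintaining context.

Choose Imagen when you want to:

  • Prioritize image quality, photorealism, artistic detail, or a specific style (for example, impressionism or anime).
  • Incorporate branding or style, or generate logos and product designs.
  • Explicitly specify the aspect ratio or format of the generated images.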

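Before you begin

Click your Gemini API provider to view provider-specific content and code on this page.

If you haven't already, complete the getting started guide. It describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen API provider, and create an ImagenModel instance.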

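Models that support this capability

The Gemini Developer API supports image generation with the latest stable Imagen models. This limit on the supported Imagen models applies regardless of how you access the Gemini Developer API.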

  • imagen-4.0-generate-001
  • imagen-4.0-fast-generate-001
  • imagen-4.0-ultra-generate-001
  • imagen-3.0-generate-002
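
Because only the model names listed above are supported, it can help to fail fast on a typo before sending a request. The following is a minimal sketch of such a guard. The helper name `assertSupportedImagenModel` is hypothetical and not part of any Firebase SDK, and the list is copied from this page, so it needs updating whenever the supported models change.

```javascript
// Model names copied from the list above.
const SUPPORTED_IMAGEN_MODELS = Object.freeze([
  "imagen-4.0-generate-001",
  "imagen-4.0-fast-generate-001",
  "imagen-4.0-ultra-generate-001",
  "imagen-3.0-generate-002",
]);

// Hypothetical guard (not an SDK function): returns the model name if it is
// in the list above, otherwise throws a RangeError.
function assertSupportedImagenModel(modelName) {
  if (!SUPPORTED_IMAGEN_MODELS.includes(modelName)) {
    throw new RangeError(
      `Unsupported Imagen model "${modelName}". ` +
        `Expected one of: ${SUPPORTED_IMAGEN_MODELS.join(", ")}`
    );
  }
  return modelName;
}
```

You could then pass the checked name wherever a model name is expected, for example `assertSupportedImagenModel("imagen-4.0-generate-001")`.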

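Generate images from text-only input

You can ask an Imagen model to generate images from a text prompt alone. You can generate one image or multiple images.

You can also set many different configuration options for image generation, such as the aspect ratio and the image format.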
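
As a rough illustration of such configuration options, the sketch below builds a plain configuration object with an aspect ratio and an image format. It rests on several assumptions: the field names `aspectRatio`, `imageFormat`, and `mimeType` reflect one reading of the Web SDK's generation config and should be checked against the model parameters documentation, the MIME types `image/png` and `image/jpeg` are assumed output formats, and `buildGenerationConfig` is a hypothetical helper. The aspect ratio values come from the specifications listed later in this guide.

```javascript
// Aspect ratios listed in this guide's specifications.
const ASPECT_RATIOS = ["1:1", "3:4", "4:3", "9:16", "16:9"];
// Assumed output formats; verify against the SDK reference.
const IMAGE_MIME_TYPES = ["image/png", "image/jpeg"];

// Hypothetical helper: validates the options and returns a plain object
// shaped like a generation config (field names are assumptions).
function buildGenerationConfig({ aspectRatio = "1:1", mimeType = "image/png" } = {}) {
  if (!ASPECT_RATIOS.includes(aspectRatio)) {
    throw new RangeError(`Unsupported aspect ratio: ${aspectRatio}`);
  }
  if (!IMAGE_MIME_TYPES.includes(mimeType)) {
    throw new RangeError(`Unsupported image format: ${mimeType}`);
  }
  return { aspectRatio, imageFormat: { mimeType } };
}
```

Validating locally keeps an obviously invalid option from costing a network round trip; the service remains the authority on what it accepts.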

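Generate one image from text-only input

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask an Imagen model to generate a single image from a text prompt alone.

Make sure to create an ImagenModel instance and call generateImages.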

Swift

    import FirebaseAILogic

    // Initialize the Gemini Developer API backend service
    let ai = FirebaseAI.firebaseAI(backend: .googleAI())

    // Create an `ImagenModel` instance with a model that supports your use case
    let model = ai.imagenModel(modelName: "imagen-4.0-generate-001")

    // Provide an image generation prompt
    let prompt = "An astronaut riding a horse"

    // To generate an image, call `generateImages` with the text prompt
    let response = try await model.generateImages(prompt: prompt)

    // Handle the generated image
    guard let image = response.images.first else {
      fatalError("No image in the response.")
    }
    let uiImage = UIImage(data: image.data)

Kotlin

    suspend fun generateImage() {
      // Initialize the Gemini Developer API backend service
      val ai = Firebase.ai(backend = GenerativeBackend.googleAI())

      // Create an `ImagenModel` instance with an Imagen model that supports your use case
      val model = ai.imagenModel("imagen-4.0-generate-001")

      // Provide an image generation prompt
      val prompt = "An astronaut riding a horse"

      // To generate an image, call `generateImages` with the text prompt
      val imageResponse = model.generateImages(prompt)

      // Handle the generated image
      val image = imageResponse.images.first()
      val bitmapImage = image.asBitmap()
    }

Java

    // Initialize the Gemini Developer API backend service
    // Create an `ImagenModel` instance with an Imagen model that supports your use case
    ImagenModel imagenModel = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .imagenModel(
            /* modelName */ "imagen-4.0-generate-001");

    ImagenModelFutures model = ImagenModelFutures.from(imagenModel);

    // Provide an image generation prompt
    String prompt = "An astronaut riding a horse";

    // To generate an image, call `generateImages` with the text prompt
    Futures.addCallback(model.generateImages(prompt), new FutureCallback<ImagenGenerationResponse<ImagenInlineImage>>() {
        @Override
        public void onSuccess(ImagenGenerationResponse<ImagenInlineImage> result) {
            if (result.getImages().isEmpty()) {
                Log.d("TAG", "No images generated");
                // Nothing to display, so stop before reading the first image
                return;
            }
            Bitmap bitmap = result.getImages().get(0).asBitmap();
            // Use the bitmap to display the image in your UI
        }

        @Override
        public void onFailure(Throwable t) {
            // ...
        }
    }, Executors.newSingleThreadExecutor());

Web

    import { initializeApp } from "firebase/app";
    import { getAI, getImagenModel, GoogleAIBackend } from "firebase/ai";

    // TODO(developer) Replace the following with your app's Firebase configuration
    // See: https://firebase.google.com/docs/web/learn-more#config-object
    const firebaseConfig = {
      // ...
    };

    // Initialize FirebaseApp
    const firebaseApp = initializeApp(firebaseConfig);

    // Initialize the Gemini Developer API backend service
    const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

    // Create an `ImagenModel` instance with an Imagen model that supports your use case
    const model = getImagenModel(ai, { model: "imagen-4.0-generate-001" });

    // Provide an image generation prompt
    const prompt = "An astronaut riding a horse.";

    // To generate an image, call `generateImages` with the text prompt
    const response = await model.generateImages(prompt);

    // If fewer images were generated than were requested,
    // then `filteredReason` will describe the reason they were filtered out
    if (response.filteredReason) {
      console.log(response.filteredReason);
    }

    if (response.images.length === 0) {
      throw new Error("No images in the response.");
    }

    const image = response.images[0];

Dart

    import 'package:firebase_ai/firebase_ai.dart';
    import 'package:firebase_core/firebase_core.dart';
    import 'firebase_options.dart';

    // Initialize FirebaseApp
    await Firebase.initializeApp(
      options: DefaultFirebaseOptions.currentPlatform,
    );

    // Initialize the Gemini Developer API backend service
    final ai = FirebaseAI.googleAI();

    // Create an `ImagenModel` instance with an Imagen model that supports your use case
    final model = ai.imagenModel(model: 'imagen-4.0-generate-001');

    // Provide an image generation prompt
    const prompt = 'An astronaut riding a horse.';

    // To generate an image, call `generateImages` with the text prompt
    final response = await model.generateImages(prompt);

    if (response.images.isNotEmpty) {
      final image = response.images[0];
      // Process the image
    } else {
      // Handle the case where no images were generated
      print('Error: No images were generated.');
    }

Unity

    using System;
    using Firebase.AI;

    // Initialize the Gemini Developer API backend service
    var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

    // Create an `ImagenModel` instance with a model that supports your use case
    var model = ai.GetImagenModel(modelName: "imagen-4.0-generate-001");

    // Provide an image generation prompt
    var prompt = "An astronaut riding a horse";

    // To generate an image, call `GenerateImagesAsync` with the text prompt
    var response = await model.GenerateImagesAsync(prompt: prompt);

    // Handle the generated image
    if (response.Images.Count == 0) {
      throw new Exception("No image in the response.");
    }
    var image = response.Images[0].AsTexture2D();

Learn how to choose a model appropriate for your use case and app.
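
The Web sample above stops once it has the first image. A common next step is to display it. The sketch below turns an image object into a data URL that can be assigned to the `src` of an `<img>` element. It assumes that each image returned by the Web SDK carries `mimeType` and `bytesBase64Encoded` fields; check the SDK reference before relying on those names. `toDataUrl` is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper: builds a data URL from an inline image object.
// Assumes the object has `mimeType` and `bytesBase64Encoded` string fields.
function toDataUrl(image) {
  if (
    !image ||
    typeof image.mimeType !== "string" ||
    typeof image.bytesBase64Encoded !== "string"
  ) {
    throw new TypeError("Expected an image with mimeType and bytesBase64Encoded");
  }
  return `data:${image.mimeType};base64,${image.bytesBase64Encoded}`;
}
```

In a browser, you might then write `document.querySelector("img").src = toDataUrl(image);`.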

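Generate multiple images from text-only input

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

By default, Imagen models generate only one image per request. However, you can ask an Imagen model to generate multiple images per request by providing an ImagenGenerationConfig when you create the ImagenModel instance.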
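
The specifications later in this guide cap the output at 4 images per request, so it can be useful to validate the requested count before building the configuration. The following is a minimal sketch: `multiImageConfig` is a hypothetical helper, and the returned object uses only the `numberOfImages` field shown in the samples in this section.

```javascript
// Upper bound taken from the specifications in this guide.
const MAX_IMAGES_PER_REQUEST = 4;

// Hypothetical helper: returns a config object for `numberOfImages`,
// or throws if the count is not an integer between 1 and 4.
function multiImageConfig(numberOfImages) {
  if (
    !Number.isInteger(numberOfImages) ||
    numberOfImages < 1 ||
    numberOfImages > MAX_IMAGES_PER_REQUEST
  ) {
    throw new RangeError(
      `numberOfImages must be an integer from 1 to ${MAX_IMAGES_PER_REQUEST}`
    );
  }
  return { numberOfImages };
}
```

The result can be supplied as the generation config when creating the model, for example `generationConfig: multiImageConfig(4)` in the Web SDK.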

Make sure to create an ImagenModel instance and call generateImages.

Swift

    import FirebaseAILogic

    // Initialize the Gemini Developer API backend service
    let ai = FirebaseAI.firebaseAI(backend: .googleAI())

    // Create an `ImagenModel` instance with a model that supports your use case
    let model = ai.imagenModel(
      modelName: "imagen-4.0-generate-001",
      // Configure the model to generate multiple images for each request
      // See: https://firebase.google.com/docs/ai-logic/model-parameters
      generationConfig: ImagenGenerationConfig(numberOfImages: 4)
    )

    // Provide an image generation prompt
    let prompt = "An astronaut riding a horse"

    // To generate images, call `generateImages` with the text prompt
    let response = try await model.generateImages(prompt: prompt)

    // If fewer images were generated than were requested,
    // then `filteredReason` will describe the reason they were filtered out
    if let filteredReason = response.filteredReason {
      print(filteredReason)
    }

    // Handle the generated images
    let uiImages = response.images.compactMap { UIImage(data: $0.data) }

Kotlin

    suspend fun generateImage() {
      // Initialize the Gemini Developer API backend service
      val ai = Firebase.ai(backend = GenerativeBackend.googleAI())

      // Create an `ImagenModel` instance with an Imagen model that supports your use case
      val model = ai.imagenModel(
        modelName = "imagen-4.0-generate-001",
        // Configure the model to generate multiple images for each request
        // See: https://firebase.google.com/docs/ai-logic/model-parameters
        generationConfig = ImagenGenerationConfig(numberOfImages = 4)
      )

      // Provide an image generation prompt
      val prompt = "An astronaut riding a horse"

      // To generate images, call `generateImages` with the text prompt
      val imageResponse = model.generateImages(prompt)

      // If fewer images were generated than were requested,
      // then `filteredReason` will describe the reason they were filtered out
      if (imageResponse.filteredReason != null) {
        Log.d(TAG, "FilteredReason: ${imageResponse.filteredReason}")
      }

      for (image in imageResponse.images) {
        val bitmap = image.asBitmap()
        // Use the bitmap to display the image in your UI
      }
    }

Java

    // Configure the model to generate multiple images for each request
    // See: https://firebase.google.com/docs/ai-logic/model-parameters
    ImagenGenerationConfig imagenGenerationConfig = new ImagenGenerationConfig.Builder()
        .setNumberOfImages(4)
        .build();

    // Initialize the Gemini Developer API backend service
    // Create an `ImagenModel` instance with an Imagen model that supports your use case
    ImagenModel imagenModel = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .imagenModel(
            /* modelName */ "imagen-4.0-generate-001",
            /* imageGenerationConfig */ imagenGenerationConfig);

    ImagenModelFutures model = ImagenModelFutures.from(imagenModel);

    // Provide an image generation prompt
    String prompt = "An astronaut riding a horse";

    // To generate images, call `generateImages` with the text prompt
    Futures.addCallback(model.generateImages(prompt), new FutureCallback<ImagenGenerationResponse<ImagenInlineImage>>() {
        @Override
        public void onSuccess(ImagenGenerationResponse<ImagenInlineImage> result) {
            // If fewer images were generated than were requested,
            // then `filteredReason` will describe the reason they were filtered out
            if (result.getFilteredReason() != null) {
                Log.d("TAG", "FilteredReason: " + result.getFilteredReason());
            }

            // Handle the generated images
            List<ImagenInlineImage> images = result.getImages();
            for (ImagenInlineImage image : images) {
                Bitmap bitmap = image.asBitmap();
                // Use the bitmap to display the image in your UI
            }
        }

        @Override
        public void onFailure(Throwable t) {
            // ...
        }
    }, Executors.newSingleThreadExecutor());

Web

    import { initializeApp } from "firebase/app";
    import { getAI, getImagenModel, GoogleAIBackend } from "firebase/ai";

    // TODO(developer) Replace the following with your app's Firebase configuration
    // See: https://firebase.google.com/docs/web/learn-more#config-object
    const firebaseConfig = {
      // ...
    };

    // Initialize FirebaseApp
    const firebaseApp = initializeApp(firebaseConfig);

    // Initialize the Gemini Developer API backend service
    const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

    // Create an `ImagenModel` instance with an Imagen model that supports your use case
    const model = getImagenModel(
      ai,
      {
        model: "imagen-4.0-generate-001",
        // Configure the model to generate multiple images for each request
        // See: https://firebase.google.com/docs/ai-logic/model-parameters
        generationConfig: {
          numberOfImages: 4
        }
      }
    );

    // Provide an image generation prompt
    const prompt = "An astronaut riding a horse.";

    // To generate images, call `generateImages` with the text prompt
    const response = await model.generateImages(prompt);

    // If fewer images were generated than were requested,
    // then `filteredReason` will describe the reason they were filtered out
    if (response.filteredReason) {
      console.log(response.filteredReason);
    }

    if (response.images.length === 0) {
      throw new Error("No images in the response.");
    }

    // Keep all generated images, not just the first one
    const images = response.images;

Dart

    import 'package:firebase_ai/firebase_ai.dart';
    import 'package:firebase_core/firebase_core.dart';
    import 'firebase_options.dart';

    // Initialize FirebaseApp
    await Firebase.initializeApp(
      options: DefaultFirebaseOptions.currentPlatform,
    );

    // Initialize the Gemini Developer API backend service
    final ai = FirebaseAI.googleAI();

    // Create an `ImagenModel` instance with an Imagen model that supports your use case
    final model = ai.imagenModel(
      model: 'imagen-4.0-generate-001',
      // Configure the model to generate multiple images for each request
      // See: https://firebase.google.com/docs/ai-logic/model-parameters
      generationConfig: ImagenGenerationConfig(numberOfImages: 4),
    );

    // Provide an image generation prompt
    const prompt = 'An astronaut riding a horse.';

    // To generate images, call `generateImages` with the text prompt
    final response = await model.generateImages(prompt);

    // If fewer images were generated than were requested,
    // then `filteredReason` will describe the reason they were filtered out
    if (response.filteredReason != null) {
      print(response.filteredReason);
    }

    if (response.images.isNotEmpty) {
      final images = response.images;
      for (var image in images) {
        // Process the image
      }
    } else {
      // Handle the case where no images were generated
      print('Error: No images were generated.');
    }

Unity

    using System.Linq;
    using Firebase.AI;

    // Initialize the Gemini Developer API backend service
    var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

    // Create an `ImagenModel` instance with a model that supports your use case
    var model = ai.GetImagenModel(
      modelName: "imagen-4.0-generate-001",
      // Configure the model to generate multiple images for each request
      // See: https://firebase.google.com/docs/ai-logic/model-parameters
      generationConfig: new ImagenGenerationConfig(numberOfImages: 4)
    );

    // Provide an image generation prompt
    var prompt = "An astronaut riding a horse";

    // To generate images, call `GenerateImagesAsync` with the text prompt
    var response = await model.GenerateImagesAsync(prompt: prompt);

    // If fewer images were generated than were requested,
    // then `filteredReason` will describe the reason they were filtered out
    if (!string.IsNullOrEmpty(response.FilteredReason)) {
      UnityEngine.Debug.Log("Filtered reason: " + response.FilteredReason);
    }

    // Handle the generated images
    var images = response.Images.Select(image => image.AsTexture2D());

Learn how to choose a model appropriate for your use case and app.
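
When you request several images, the response may contain fewer than you asked for, with `filteredReason` explaining why. The sketch below summarizes such a response so that calling code can decide whether to retry or show a partial result. It operates on a plain object with the `images` and `filteredReason` fields used in the Web sample above; `summarizeImagenResponse` and the `missingCount` field are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: reports how many requested images are missing and why.
// `response` is expected to have an `images` array and, optionally,
// a `filteredReason` string, as in the Web sample above.
function summarizeImagenResponse(response, requestedCount) {
  const images = Array.isArray(response.images) ? response.images : [];
  return {
    images,
    // Never negative, even if more images arrive than were requested
    missingCount: Math.max(0, requestedCount - images.length),
    filteredReason: response.filteredReason ?? null,
  };
}
```

For example, a caller might log `filteredReason` only when `missingCount` is greater than zero.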



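Supported features and requirements

Imagen models offer many features related to image generation. This section describes what is supported when you use the models with Firebase AI Logic.

Supported capabilities and features

Firebase AI Logic supports the following features of Imagen models:

  • Generating people, faces, and text within generated images

  • Editing images or including images in the request when using the Vertex AI Gemini API (currently only for Android and Flutter)

  • Adding a watermark to generated images

  • Verifying digital watermarks when using the Vertex AI Gemini API
    If you want to verify that an image has a watermark, you can upload the image to Vertex AI Studio using its Media tab.

  • Configuring image generation parameters, such as the number of generated images, the aspect ratio, and watermarking

  • Configuring safety settings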

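Firebase AI Logic does not support the following advanced features of Imagen models:

  • Setting the language of the input text

  • Disabling the prompt rewriter (the enhancePrompt parameter). This means that an LLM-based prompt rewriting tool always adds more detail to the provided prompt automatically, in order to deliver higher-quality images that better reflect the prompt.

  • Writing a generated image directly to Google Cloud Storage as part of the model's response (the storageUri parameter). Instead, images are always returned in the response as base64-encoded image bytes.
    If you want to upload a generated image to Cloud Storage, you can use Cloud Storage for Firebase.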
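
Since images always come back as base64-encoded bytes, uploading one usually starts with decoding it. The following is a minimal sketch of that decoding step using the standard `atob` function, which is available in browsers and in recent Node.js versions. `base64ToBytes` is a hypothetical helper name, and the upload itself is not shown.

```javascript
// Hypothetical helper: decodes a base64 string into raw bytes.
function base64ToBytes(base64) {
  // `atob` yields a binary string with one character per byte
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

The resulting `Uint8Array` is the kind of value that byte-oriented upload functions typically accept; consult the Cloud Storage for Firebase documentation for the exact upload call on your platform.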

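Specifications and limitations

Property (per request): Value
Maximum number of input tokens: 480 tokens
Maximum number of output images: 4 images
Supported output image resolutions (pixels):
  • 1024x1024 (1:1 aspect ratio)
  • 896x1280 (3:4 aspect ratio)
  • 1280x896 (4:3 aspect ratio)
  • 768x1408 (9:16 aspect ratio)
  • 1408x768 (16:9 aspect ratio)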
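
If your UI needs to reserve space before an image arrives, the resolutions above can be kept in a small lookup table. The sketch below is one way to do that. The constant and the `resolutionFor` helper are hypothetical names, and the values are copied from the list above.

```javascript
// Output resolutions copied from the list above, keyed by aspect ratio.
const IMAGEN_OUTPUT_RESOLUTIONS = Object.freeze({
  "1:1": { width: 1024, height: 1024 },
  "3:4": { width: 896, height: 1280 },
  "4:3": { width: 1280, height: 896 },
  "9:16": { width: 768, height: 1408 },
  "16:9": { width: 1408, height: 768 },
});

// Hypothetical helper: returns { width, height } for a supported aspect ratio.
function resolutionFor(aspectRatio) {
  const resolution = IMAGEN_OUTPUT_RESOLUTIONS[aspectRatio];
  if (!resolution) {
    throw new RangeError(`Unsupported aspect ratio: ${aspectRatio}`);
  }
  return resolution;
}
```

Note that the pixel dimensions only approximate the nominal ratios (for example, 896/1280 is 0.7 rather than 0.75), so lay out placeholders using the listed pixel sizes rather than the ratio labels.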



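What else can you do?

Learn how to control content generation

Learn more about the supported models

Learn about the models available for various use cases and their quotas and pricing.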

