Developer Insights with Apple's Foundation Models and Image Playground
I built a bedtime stories app for kids using Apple's Foundation Models and Image Playground, and here are the key takeaways for other developers. Part of Apple Intelligence from WWDC 2025, these frameworks let you generate text and images on-device, keeping things fast and private. This article walks through the backend logic for creating structured stories with a custom `StoryResponse` struct and generating an image, sharing what worked and what didn't.
Step 1: Set Up the Environment
Takeaway: Check compatibility first to avoid crashes. Target iOS 26+ or macOS 26+ on Apple Silicon (A17 Pro, M1, or later) with Apple Intelligence enabled. In Xcode, import the frameworks:
```swift
import FoundationModels
import ImagePlayground
```
`FoundationModels` handles text generation; `ImagePlayground` does images. Models download on first use, but verify availability to catch unsupported devices early.
Step 2: Define the Structured Output
Takeaway: Keep your struct simple to avoid confusing the LLM. The `@Generable` macro enforces structure, but too many fields can break things. Here's the `StoryResponse` struct I used:
```swift
@Generable
struct StoryResponse {
    @Guide(description: "Story title, catchy, relevant, under 5 words.")
    let title: String
    @Guide(description: "Story body, whimsical style for kids.")
    let paragraph: [Paragraph]
    @Guide(description: "Image description, complements story, whimsical style.")
    let imageDescription: String
    @Guide(description: "Category like adventure, fantasy, or general if unsure.")
    let category: String
    @Guide(description: "One-sentence story summary.")
    let summary: String
    @Guide(description: "Main characters, comma-separated.")
    let characters: String

    @Generable
    struct Paragraph {
        @Guide(description: "Paragraph order in story.")
        let order: Int
        @Guide(description: "Paragraph text.")
        let text: String
    }
}
```
This gives you a short title, 3 to 5 paragraphs, an image description, a category, a summary, and characters. `@Guide` annotations tell the LLM exactly what you want, cutting down on bad outputs.
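If you don't need streaming, a single `respond` call returns the whole struct at once, which is handy for testing the schema. A minimal sketch, assuming the `session` and `options` configured in Steps 3 and 4; `generateStory` is a hypothetical wrapper:

```swift
// One-shot generation: the whole StoryResponse arrives at once.
// Assumes `session` and `options` are set up as shown in Steps 3 and 4.
func generateStory(session: LanguageModelSession,
                   prompt: String,
                   options: GenerationOptions) async throws -> StoryResponse {
    let response = try await session.respond(
        to: prompt,
        generating: StoryResponse.self,
        options: options
    )
    return response.content  // Typed StoryResponse, shaped by the @Guide hints
}
```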
Step 3: Initialize Model and Session
Takeaway: Clear session instructions save time. Without them, I got random stories. Here's how to set up the model and session:
```swift
let model = SystemLanguageModel.default
guard case .available = model.availability else {
    fatalError("Model unavailable: \(model.availability)")
}

let session = LanguageModelSession(
    instructions: """
    You're a storyteller for kids aged 3 to 7. Use simple, dreamy language \
    and positive themes. Output a title under 5 words, 3 to 5 paragraphs, \
    an image description, a category like fantasy, a one-sentence summary, \
    and comma-separated characters.
    """
)
```
Checking `model.availability` early caught issues with disabled Apple Intelligence or old hardware.
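In production you'll want something gentler than `fatalError`. A hedged sketch that branches on the unavailable reason; the `show...` helpers are hypothetical UI hooks, and the cases reflect the documented `UnavailableReason` values as I understand them:

```swift
switch SystemLanguageModel.default.availability {
case .available:
    break  // Safe to create a LanguageModelSession
case .unavailable(.deviceNotEligible):
    showUnsupportedDeviceScreen()      // Hypothetical UI helper
case .unavailable(.appleIntelligenceNotEnabled):
    showEnableAppleIntelligenceHint()  // Point users to Settings
case .unavailable(.modelNotReady):
    showModelDownloadingSpinner()      // Model still downloading; retry later
case .unavailable:
    showUnsupportedDeviceScreen()      // Catch-all for future reasons
}
```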
Step 4: Craft the Prompt
Takeaway: Specific prompts beat vague ones, especially when you need a clean, safe response from Apple's local models. In the final app I had to be careful building both the instructions for the model and the prompt to get usable results:
```swift
let childName = "Emma"
let theme = "brave little dragon learning to fly"

let prompt = """
Generate a bedtime story for \(childName) about \(theme). Include:
1. Title under 5 words.
2. 3 to 5 paragraphs, 50 to 100 words each, whimsical style for kids 3 to 7.
3. Image description for a key scene, like a dragon over a glowing forest.
4. Category, like fantasy or adventure.
5. One-sentence summary.
6. Characters, comma-separated.
Make it happy, soothing, with a perseverance moral.
"""
```
Set options for better control:
```swift
let options = GenerationOptions(
    temperature: 0.7,          // Keeps it creative but not wild
    maximumResponseTokens: 600 // Around 300 to 400 words
)
```
A `temperature` of 0.7 worked best; anything above 0.8 got too random.
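While tuning, it also helped to remove randomness entirely for repeatable tests. A hedged sketch, assuming `GenerationOptions` accepts a greedy sampling mode as documented:

```swift
// Deterministic output for snapshot-style tests: greedy sampling means
// the same prompt should yield the same story every run (assumption:
// GenerationOptions exposes a .greedy sampling mode as documented).
let testOptions = GenerationOptions(
    sampling: .greedy,
    maximumResponseTokens: 600
)

// Production: a little randomness keeps stories fresh night after night.
let liveOptions = GenerationOptions(
    temperature: 0.7,
    maximumResponseTokens: 600
)
```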
Step 5: Stream the Response and Generate Image
Takeaway: Streaming is great but needs careful buffer handling. Adding `ImageCreator` tied text to visuals cleanly, but before requesting the image, it's better to feed it a summarized version of the story and add some restrictions: Apple Image Playground is very picky when you try to create images with people and requires a persona to be defined.
```swift
do {
    var storyBuffer = StoryResponse(
        title: "",
        paragraph: [],
        imageDescription: "",
        category: "general",
        summary: "",
        characters: ""
    )

    for try await partialResponse in session.streamResponse(
        to: prompt,
        generating: StoryResponse.self,
        options: options
    ) {
        storyBuffer = partialResponse
        // Log for debugging
        print("Title: \(storyBuffer.title)")
        print("Paragraphs: \(storyBuffer.paragraph.map { "\($0.order): \($0.text)" })")
        print("Image Description: \(storyBuffer.imageDescription)")
        print("Category: \(storyBuffer.category)")
        print("Summary: \(storyBuffer.summary)")
        print("Characters: \(storyBuffer.characters)")
    }

    // Final story
    let finalStory = storyBuffer
    print("Complete story: \(finalStory)")

    // Generate image: images(for:style:limit:) yields results asynchronously
    let imageCreator = try await ImageCreator()
    let style = ImagePlaygroundStyle.animation
    for try await image in imageCreator.images(
        for: [.text(finalStory.imageDescription)],
        style: style,
        limit: 1
    ) {
        // One animation-style image per story
        print("Generated image for: \(finalStory.imageDescription)")
        _ = image
    }
} catch {
    print("Error: \(error)")
}
```
Streaming Tip
Streaming `StoryResponse` objects lets you process output in real time, but initialize `storyBuffer` with defaults to avoid nil issues. `@Generable` keeps the output on track.
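Note that, depending on the SDK you build against, the stream may yield a `StoryResponse.PartiallyGenerated` value whose fields are optional, rather than a complete `StoryResponse`. A hedged sketch of consuming those optionals safely:

```swift
// Hedged sketch: on SDKs where the stream yields the generated
// PartiallyGenerated projection, every field is optional until
// the model has finished producing it.
for try await partial in session.streamResponse(
    to: prompt,
    generating: StoryResponse.self,
    options: options
) {
    // Nil-coalesce so logs/UI never show "nil" mid-stream
    print("Title so far: \(partial.title ?? "…")")
    print("Paragraphs so far: \(partial.paragraph?.count ?? 0)")
}
```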
Image Tip
The `imageDescription` field feeds `ImageCreator`. `ImagePlaygroundStyle.animation` gives kid-friendly visuals, but short, specific descriptions (like "dragon on glowing hill") work better than long ones. Sticking to one image keeps things fast.
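Putting the two tips together, a second `respond` call can compress the story into a short, people-free description before handing it to `ImageCreator`. A hedged sketch reusing the Step 3 session and the `finalStory` from Step 5:

```swift
// Hedged sketch: rewrite the image description so it only mentions
// animals, objects, and scenery -- Image Playground tends to reject
// prompts involving people unless a persona is configured.
let safePrompt = """
Rewrite this image description in under 20 words. \
Mention only animals, objects, and scenery; no people: \
\(finalStory.imageDescription)
"""
let safeDescription = try await session.respond(to: safePrompt).content

let creator = try await ImageCreator()
for try await image in creator.images(
    for: [.text(safeDescription)],
    style: .animation,
    limit: 1
) {
    // Hand the created image to your view layer
    _ = image
}
```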
Step 6: Error Handling and Optimization
Takeaway: Expect errors and plan for them. I hit these (a catch sketch follows the list):
- Text: `.tokenLimitExceeded` from long prompts; capping at 600 tokens fixed it.
- Image: Bad `imageDescription` inputs; tweaking prompts helped.
- Streaming: Partial responses; `storyBuffer` accumulation solved it.
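To handle the text errors in code, catching the session's typed error before a generic catch worked well. A hedged sketch, assuming `LanguageModelSession.GenerationError` is the documented error type; `save(_:)` is a hypothetical persistence helper:

```swift
do {
    let response = try await session.respond(
        to: prompt,
        generating: StoryResponse.self,
        options: options
    )
    save(response.content)  // Hypothetical persistence helper
} catch let error as LanguageModelSession.GenerationError {
    // Typed generation errors: context overflow, guardrail refusals, etc.
    // A reasonable recovery is retrying with a shorter prompt or softer theme.
    print("Generation failed: \(error)")
} catch {
    print("Unexpected error: \(error)")
}
```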
Optimization notes:
- Text: Stick to `temperature` 0.6 to 0.8, and reuse sessions for context (see the sketch after this list).
- Image: Short `imageDescription`, single image for speed.
- Performance: Test on real devices; older Apple Silicon lags on images.
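For the session-reuse point above, keeping one long-lived session per storytime (and prewarming it) is a cheap win. A minimal sketch, assuming `prewarm()` behaves as documented; `storytellerInstructions` stands in for the Step 3 instructions:

```swift
// One long-lived session per storytime so sequels share context.
let storytellerInstructions = "You're a storyteller for kids aged 3 to 7..."
let session = LanguageModelSession(instructions: storytellerInstructions)

// Assumption: prewarm() loads model resources ahead of the first request,
// hiding first-token latency before the child taps "Tell me a story".
session.prewarm()
```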
Step 7: Advanced Enhancements
Takeaway: Chaining prompts adds polish. Generating the title first improved flow:
```swift
let titlePrompt = "Generate a title under 5 words for a story about \(theme)."
let title = try await session.respond(to: titlePrompt).content

let fullPrompt = "Using title '\(title)', generate a story for \(childName)..."
```
Personalizing with user inputs (like favorite animals) and reusing sessions for sequels made stories feel connected.
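Because a reused session keeps context, a sequel can reference the previous story directly. A minimal sketch, assuming the `session`, `options`, and `childName` from earlier steps; `favoriteAnimal` is a hypothetical user preference:

```swift
// Hypothetical user preference gathered during onboarding
let favoriteAnimal = "red panda"

// Same session, so the model remembers tonight's story and characters
let sequelPrompt = """
Continue last night's story for \(childName). \
Introduce a new friend: a \(favoriteAnimal). \
Keep the same characters, style, and moral tone.
"""
let sequel = try await session.respond(
    to: sequelPrompt,
    generating: StoryResponse.self,
    options: options
)
```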
Conclusion
Building this bedtime stories app showed how to combine Apple's Foundation Models and Image Playground for a fast, private storytelling engine. The `@Generable` macro kept `StoryResponse` outputs structured, `streamResponse` enabled real-time text generation, and `ImageCreator` added kid-friendly visuals. Key lessons for developers: verify device compatibility early, craft precise prompts, and simplify structs to avoid LLM issues. Testing on target devices caught performance hiccups, and tweaking `imageDescription` was critical for solid images, with the best results coming from descriptions limited to animals, objects, and environments.
There's plenty of room to improve. You can add multi-language support with prompt tweaks or integrate Apple's speech APIs for narration. For images, experimenting with other `ImagePlaygroundStyle` options or generating multiple images per story could enhance visuals. Chaining inferences, like creating character backstories first, can deepen narratives. Start small, test frequently, and refine prompts to maximize LLM output. Want to see it in action? Check out my app (Bedtime Snuggles) on the App Store to try the stories and images for yourself.