TL;DR
We will create an AI tool that generates slides from a PDF. I'll show you how to build a backend service that generates PowerPoint slides asynchronously using Python, Celery, and python-pptx. The backend simply accepts a PDF and returns the slides as a .pptx file. Exciting stuff, isn't it?
The architecture of this tool is heavily inspired by what we work on at SlideSpeak. SlideSpeak is an AI tool to create slides from PDF and more. The code for this tutorial is available here:
Here's what the results of the PDF to slides AI generator look like:
But since we all absolutely love PowerPoint slides, let's get into it.
What You'll Build
This tutorial will walk you through creating a backend service that:
- Provides a RESTful API to request slide generation
- Processes slide requests asynchronously with Celery
- Creates professional PowerPoint slides with python-pptx
- Supports multiple slide layouts (title, content, bullet points, etc.)
- Extracts text from PDF files
- Uses OpenAI to generate presentation content automatically
- Scales efficiently to handle multiple requests
Tech Stack
- FastAPI: For creating the RESTful API endpoints
- Celery: For handling asynchronous tasks
- Redis: As message broker and result backend for Celery
- python-pptx: For programmatically creating PowerPoint files
- PyPDF2: For extracting text from PDF files
- OpenAI API: For intelligent content generation
- Docker & Docker Compose: For containerizing the application
Architecture
Getting Started
Before diving into the code, let's understand the project structure:
presentation_generator/
├── app/
│   ├── __init__.py
│   ├── main.py              # FastAPI application
│   ├── models.py            # Pydantic models
│   ├── config.py            # Configuration
│   ├── ppt_generator.py     # slide generation logic
│   └── pdf_processor.py     # PDF processing and OpenAI integration
├── celery_app/
│   ├── __init__.py
│   ├── tasks.py             # Celery tasks
│   └── celery_config.py     # Celery configuration
├── requirements.txt
└── docker-compose.yml
Step 1: Setting Up the Environment
Let's start by creating our project directory and installing the required dependencies:
mkdir presentation_generator
cd presentation_generator
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate
Now, create a requirements.txt file with the following dependencies:
fastapi==0.103.1
uvicorn==0.23.2
celery==5.3.4
redis==5.0.0
python-pptx==0.6.21
python-multipart==0.0.6
pydantic==2.3.0
pydantic-settings==2.0.3
pypdf2==3.0.1
openai==1.6.0
python-dotenv==1.0.0
Install these dependencies:
pip install -r requirements.txt
Step 2: Setting Up Configuration
Let's create a configuration file to manage our application settings. Create app/config.py:
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    APP_NAME: str = "Presentation Generator"
    REDIS_URL: str = "redis://localhost:6379/0"
    RESULT_BACKEND: str = "redis://localhost:6379/0"
    STORAGE_PATH: str = "./storage"
    OPENAI_API_KEY: str = ""

    class Config:
        env_file = ".env"


settings = Settings()
This configuration can be overridden with environment variables or values in a .env file. Note that we've added an OPENAI_API_KEY setting that we'll use later.
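To make the override order concrete: pydantic-settings gives a real environment variable priority over the same key in the .env file, and falls back to the class default when neither is set. The sketch below imitates that order with the standard library only. `parse_dotenv` and `resolve_setting` are hypothetical helpers written for illustration; they are not how pydantic-settings is implemented, and unlike the library they match names case-sensitively.

```python
import os
from typing import Dict, Mapping, Optional


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_setting(
    name: str,
    default: str,
    dotenv: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Environment variable first, then the .env entry, then the default."""
    environ = os.environ if environ is None else environ
    if name in environ:
        return environ[name]
    if name in dotenv:
        return dotenv[name]
    return default


# Example: the .env file points Redis at the Docker service name.
dotenv_values = parse_dotenv("REDIS_URL=redis://redis:6379/0\n# a comment\n")
redis_url = resolve_setting("REDIS_URL", "redis://localhost:6379/0", dotenv_values, {})
```

This is why the same code can run locally against localhost and inside Docker against the redis service without changes to app/config.py.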
Step 3: Creating Data Models
Next, let's define our data models with Pydantic. Create app/models.py:
from pydantic import BaseModel, Field
from typing import List, Optional
from enum import Enum


class SlideType(str, Enum):
    TITLE = "title"
    CONTENT = "content"
    IMAGE = "image"
    BULLET_POINTS = "bullet_points"
    TWO_COLUMN = "two_column"


class SlideContent(BaseModel):
    type: SlideType
    title: str
    content: Optional[str] = None
    image_url: Optional[str] = None
    bullet_points: Optional[List[str]] = None
    column1: Optional[str] = None
    column2: Optional[str] = None


class PresentationRequest(BaseModel):
    title: str
    author: str
    slides: List[SlideContent]
    theme: Optional[str] = "default"


# New model for PDF-based presentation requests
class PDFPresentationRequest(BaseModel):
    title: Optional[str] = None
    author: Optional[str] = "Generated Presentation"
    theme: Optional[str] = "default"
    num_slides: Optional[int] = 5


class PresentationResponse(BaseModel):
    task_id: str
    status: str = "pending"


class PresentationStatus(BaseModel):
    task_id: str
    status: str
    file_url: Optional[str] = None
    message: Optional[str] = None
We've added a new PDFPresentationRequest model for handling PDF uploads. This model allows customizing the title, author, theme, and number of slides to generate.
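The num_slides field accepts any integer, while the prompt built in Step 7 asks for num_slides - 1 content slides, so very small or very large values are worth normalizing before they reach the model. A minimal sketch, assuming you want to clamp rather than reject: `normalize_num_slides` is a hypothetical helper, and the upper bound of 20 is an arbitrary illustrative choice, not something the tutorial specifies.

```python
from typing import Optional

DEFAULT_NUM_SLIDES = 5  # same default as PDFPresentationRequest
MIN_NUM_SLIDES = 2      # one title slide plus at least one content slide
MAX_NUM_SLIDES = 20     # hypothetical upper bound, not from the tutorial


def normalize_num_slides(value: Optional[int]) -> int:
    """Fall back to the default for None and clamp everything else."""
    if value is None:
        return DEFAULT_NUM_SLIDES
    return max(MIN_NUM_SLIDES, min(MAX_NUM_SLIDES, value))
```

Rejecting out-of-range values with a validation error would be an equally reasonable design; clamping just keeps the endpoint forgiving.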
Step 4: Implementing the AI Slide Generator
Now, let's create the core AI slide generation logic. Create app/ppt_generator.py:
import os
import uuid

from pptx import Presentation

from app.models import SlideType, SlideContent, PresentationRequest
from app.config import settings


class PPTGenerator:
    def __init__(self):
        # Ensure storage directory exists
        os.makedirs(settings.STORAGE_PATH, exist_ok=True)

    def generate_presentation(self, request: PresentationRequest) -> str:
        """Generate a PowerPoint presentation based on the request"""
        prs = Presentation()

        # Add title slide
        title_slide_layout = prs.slide_layouts[0]
        slide = prs.slides.add_slide(title_slide_layout)
        title = slide.shapes.title
        subtitle = slide.placeholders[1]
        title.text = request.title
        subtitle.text = f"By {request.author}"

        # Add content slides
        for slide_content in request.slides:
            self._add_slide(prs, slide_content)

        # Save the presentation under a unique file name
        file_id = str(uuid.uuid4())
        file_path = os.path.join(settings.STORAGE_PATH, f"{file_id}.pptx")
        prs.save(file_path)
        return file_path

    def _add_slide(self, prs: Presentation, content: SlideContent):
        """Add a slide based on its type and content"""
        if content.type == SlideType.TITLE:
            slide_layout = prs.slide_layouts[0]
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            subtitle = slide.placeholders[1]
            title.text = content.title
            if content.content:
                subtitle.text = content.content

        elif content.type == SlideType.CONTENT:
            slide_layout = prs.slide_layouts[1]
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            body = slide.placeholders[1]
            title.text = content.title
            if content.content:
                body.text = content.content

        elif content.type == SlideType.BULLET_POINTS:
            slide_layout = prs.slide_layouts[1]
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            body = slide.placeholders[1]
            title.text = content.title
            if content.bullet_points:
                tf = body.text_frame
                for i, point in enumerate(content.bullet_points):
                    # Reuse the placeholder's first paragraph so the list
                    # does not start with an empty bullet
                    p = tf.paragraphs[0] if i == 0 else tf.add_paragraph()
                    p.text = point
                    p.level = 0

        elif content.type == SlideType.TWO_COLUMN:
            slide_layout = prs.slide_layouts[3]  # Assuming layout 3 is two-content
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            title.text = content.title
            # Handle columns - this may vary based on your pptx template
            left = slide.placeholders[1]
            right = slide.placeholders[2]
            if content.column1:
                left.text = content.column1
            if content.column2:
                right.text = content.column2

        elif content.type == SlideType.IMAGE:
            # Basic image slide
            slide_layout = prs.slide_layouts[5]  # Title-only layout in the default template
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            title.text = content.title
            # Note: In a real application, you would handle image downloads
            # and insertion here. For simplicity, we're omitting this.
This class handles the creation of PowerPoint slides using the python-pptx library. It supports different slide types and saves the generated files with unique IDs.
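The if/elif chain in _add_slide encodes one decision per slide type: which layout of the default python-pptx template to use. As the number of slide types grows, that mapping can be pulled out into a lookup table. The sketch below is one possible refactoring, not the tutorial's code: it redefines SlideType locally so it runs on its own, and `LAYOUT_INDEX` and `layout_index_for` are hypothetical names. The indices are the ones used in the class above and may differ in a custom template.

```python
from enum import Enum


class SlideType(str, Enum):
    # Mirrors app/models.py so the sketch is self-contained.
    TITLE = "title"
    CONTENT = "content"
    IMAGE = "image"
    BULLET_POINTS = "bullet_points"
    TWO_COLUMN = "two_column"


# Layout indices copied from PPTGenerator._add_slide; they refer to
# python-pptx's default template and may differ in a custom one.
LAYOUT_INDEX = {
    SlideType.TITLE: 0,
    SlideType.CONTENT: 1,
    SlideType.BULLET_POINTS: 1,
    SlideType.TWO_COLUMN: 3,
    SlideType.IMAGE: 5,
}


def layout_index_for(slide_type) -> int:
    """Return the template layout index for a slide type, or raise ValueError."""
    try:
        return LAYOUT_INDEX[slide_type]
    except KeyError:
        raise ValueError(f"No layout registered for slide type {slide_type!r}") from None
```

With a table like this, adding a slide type means adding one entry plus a small function that fills the placeholders, rather than another branch in a long conditional.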
Step 5: Setting Up Celery
Now, let's configure Celery for asynchronous task processing. First, create celery_app/celery_config.py:
from app.config import settings

broker_url = settings.REDIS_URL
result_backend = settings.RESULT_BACKEND
task_serializer = 'json'
result_serializer = 'json'
accept_content = ['json']
timezone = 'UTC'
task_track_started = True
worker_hijack_root_logger = False
Next, initialize the Celery application in celery_app/__init__.py:
from celery import Celery
from app.config import settings

app = Celery('presentation_generator')
app.config_from_object('celery_app.celery_config')

# Import tasks to ensure they're registered
from celery_app import tasks
Step 6: Creating Celery Tasks
Let's define our asynchronous task for generating slides. Create celery_app/tasks.py:
import os
import logging

from celery import shared_task

from app.models import PresentationRequest
from app.ppt_generator import PPTGenerator

logger = logging.getLogger(__name__)


@shared_task(bind=True)
def generate_presentation_task(self, request_dict):
    """Generate a PowerPoint presentation asynchronously"""
    try:
        # Convert dict back to PresentationRequest
        request = PresentationRequest(**request_dict)
        logger.info(f"Starting presentation generation for: {request.title}")

        # Generate the presentation
        generator = PPTGenerator()
        file_path = generator.generate_presentation(request)

        # In a real application, you might upload to S3 or similar
        file_url = f"/download/{os.path.basename(file_path)}"

        return {
            "status": "completed",
            "file_url": file_url,
            "message": "Presentation generated successfully"
        }
    except Exception as e:
        logger.error(f"Error generating presentation: {str(e)}")
        # Re-raise so Celery marks the task as FAILURE and stores the
        # exception in the result backend; setting the FAILURE state by
        # hand is unnecessary because the raise overrides it anyway
        raise
This task will be processed asynchronously by Celery workers.
Step 7: Creating the PDF Processor
Now, let's add the PDF processing functionality. Create app/pdf_processor.py:
import json
import os
import tempfile
from typing import Any, Dict, Optional

from PyPDF2 import PdfReader
from openai import OpenAI

from app.config import settings


class PDFProcessor:
    def __init__(self):
        self.client = OpenAI(api_key=settings.OPENAI_API_KEY)

    def extract_text_from_pdf(self, pdf_content: bytes) -> str:
        """Extract text content from PDF bytes"""
        with tempfile.NamedTemporaryFile(delete=False) as temp:
            temp.write(pdf_content)
            temp_path = temp.name
        try:
            pdf = PdfReader(temp_path)
            text = ""
            for page in pdf.pages:
                # Guard against pages that yield no text (e.g. scanned images)
                text += (page.extract_text() or "") + "\n"
            return text
        finally:
            # Clean up the temp file
            if os.path.exists(temp_path):
                os.unlink(temp_path)

    def generate_presentation_content(
        self, text: str, title: Optional[str] = None, num_slides: int = 5
    ) -> Dict[str, Any]:
        """Generate presentation content using OpenAI"""
        # Prepare the system message
        system_message = f"""
        You are an expert presentation creator. Your task is to create a well-structured
        presentation from the provided text content. Extract the key points and organize
        them into a cohesive presentation.

        Create a presentation with the following:
        1. A title slide with an engaging title (if not provided) and subtitle
        2. {num_slides-1} content slides

        Structure the presentation logically and extract the most important information.
        """

        # Limit text to avoid token limits. The limit is applied here, outside
        # the prompt, so that this comment is not sent to the model.
        excerpt = text[:10000]

        # Prepare the user message
        user_message = f"""
        Create a presentation based on the following content:

        {excerpt}

        Please structure your response in JSON format with the following structure:
        {{
            "title": "Main Title of Presentation",
            "slides": [
                {{
                    "type": "title",
                    "title": "Presentation Title",
                    "content": "Subtitle - e.g. Author's Name"
                }},
                {{
                    "type": "bullet_points",
                    "title": "Key Point 1",
                    "bullet_points": ["Point 1", "Point 2", "Point 3"]
                }},
                ...
            ]
        }}

        Ensure all slide content is concise and impactful.
        Use different slide types appropriately:
        - title: For title slides with a subtitle
        - content: For slides with paragraphs of text
        - bullet_points: For key points in a list format
        - two_column: For comparing information side by side
        """

        if title:
            user_message += f"\nUse '{title}' as the presentation title."

        # Call the OpenAI API
        response = self.client.chat.completions.create(
            model="gpt-4o",
            response_format={"type": "json_object"},
            messages=[
                {"role": "system", "content": system_message},
                {"role": "user", "content": user_message}
            ]
        )

        # Extract the response content and parse it as JSON
        content = response.choices[0].message.content
        presentation_data = json.loads(content)
        return presentation_data
This class handles the extraction of text from PDF files and uses OpenAI to generate presentation content based on that text. It uses PyPDF2 to read the PDF and extract text, then sends that text to OpenAI's API with specific instructions to create a well-structured presentation.
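Because the slide list comes from a language model, it is worth checking the parsed JSON before handing it to the slide generator. The following is a minimal sketch of such a check, assuming the reply follows the structure requested in the prompt. `parse_slides` and `ALLOWED_TYPES` are hypothetical names, and silently skipping bad slides is a design choice made here for illustration, not something the tutorial prescribes.

```python
import json
from typing import Any, Dict, List

# The four slide types the prompt tells the model it may use.
ALLOWED_TYPES = {"title", "content", "bullet_points", "two_column"}


def parse_slides(raw: str) -> Dict[str, Any]:
    """Parse the model's JSON reply and keep only slides that look renderable."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")

    raw_slides = data.get("slides")
    if not isinstance(raw_slides, list):
        raw_slides = []

    slides: List[Dict[str, Any]] = []
    for item in raw_slides:
        if not isinstance(item, dict):
            continue
        if item.get("type") not in ALLOWED_TYPES:
            continue  # unknown slide type: skip it rather than fail the deck
        title = item.get("title")
        if not isinstance(title, str) or not title.strip():
            continue  # SlideContent requires a title
        slides.append(item)

    if not slides:
        raise ValueError("Model reply contained no usable slides")
    return {"title": data.get("title"), "slides": slides}
```

A check like this would sit between generate_presentation_content and the construction of the PresentationRequest, so that one malformed slide does not fail the whole task during Pydantic validation.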
Step 8: Updating Celery Tasks
Next, let's update our Celery tasks to handle PDF processing. Modify celery_app/tasks.py:
import os
import logging

from celery import shared_task

from app.models import PresentationRequest, PDFPresentationRequest
from app.ppt_generator import PPTGenerator
from app.pdf_processor import PDFProcessor

logger = logging.getLogger(__name__)


@shared_task(bind=True)
def generate_presentation_task(self, request_dict):
    """Generate a PowerPoint presentation asynchronously"""
    try:
        # Convert dict back to PresentationRequest
        request = PresentationRequest(**request_dict)
        logger.info(f"Starting presentation generation for: {request.title}")

        # Generate the presentation
        generator = PPTGenerator()
        file_path = generator.generate_presentation(request)

        # In a real application, you might upload to S3 or similar
        file_url = f"/download/{os.path.basename(file_path)}"

        return {
            "status": "completed",
            "file_url": file_url,
            "message": "Presentation generated successfully"
        }
    except Exception as e:
        logger.error(f"Error generating presentation: {str(e)}")
        # Re-raise so Celery marks the task as FAILURE and stores the exception
        raise


@shared_task(bind=True)
def generate_presentation_from_pdf_task(self, pdf_text, request_dict):
    """Generate a PowerPoint presentation from PDF text asynchronously"""
    try:
        # Convert dict back to PDFPresentationRequest
        request = PDFPresentationRequest(**request_dict)
        logger.info("Starting presentation generation from PDF")

        # Process the PDF text with OpenAI
        processor = PDFProcessor()
        presentation_data = processor.generate_presentation_content(
            pdf_text,
            title=request.title,
            num_slides=request.num_slides
        )

        # Create a PresentationRequest from the generated content
        presentation_request = PresentationRequest(
            title=presentation_data.get("title", request.title or "Generated Presentation"),
            author=request.author,
            theme=request.theme,
            slides=presentation_data.get("slides", [])
        )

        # Generate the presentation
        generator = PPTGenerator()
        file_path = generator.generate_presentation(presentation_request)

        # In a real application, you might upload to S3 or similar
        file_url = f"/download/{os.path.basename(file_path)}"

        return {
            "status": "completed",
            "file_url": file_url,
            "message": "Presentation generated successfully from PDF"
        }
    except Exception as e:
        logger.error(f"Error generating presentation from PDF: {str(e)}")
        # Re-raise so Celery marks the task as FAILURE and stores the exception
        raise
We've added a new task generate_presentation_from_pdf_task that takes the extracted PDF text and request details, then uses the PDF processor to generate presentation content with OpenAI.
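One detail worth noticing: PPTGenerator.generate_presentation always adds its own title slide, and the prompt also asks the model to start its slide list with a title slide, so the generated deck may open with two title slides. If that turns out to be the case in your output, a small filter before building the PresentationRequest avoids it. This is a sketch under that assumption; `drop_leading_title_slide` is a hypothetical helper, not part of the tutorial's code.

```python
from typing import Any, Dict, List


def drop_leading_title_slide(slides: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return a copy of the slide list without a title slide in first position."""
    if slides and slides[0].get("type") == "title":
        return slides[1:]
    return list(slides)
```

In the task it could be applied as drop_leading_title_slide(presentation_data.get("slides", [])). Title slides later in the deck are kept, since they can act as section dividers.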
Step 9: Updating the FastAPI Application
Now, let's update our FastAPI application to add the PDF upload endpoint. Modify app/main.py:
import os
from typing import Optional

from celery.result import AsyncResult
from fastapi import FastAPI, HTTPException, UploadFile, File, Form
from fastapi.responses import FileResponse
from fastapi.staticfiles import StaticFiles

from app.models import PresentationRequest, PDFPresentationRequest, PresentationResponse, PresentationStatus
from app.config import settings
from app.pdf_processor import PDFProcessor
from celery_app.tasks import generate_presentation_task, generate_presentation_from_pdf_task

app = FastAPI(title=settings.APP_NAME)

# StaticFiles raises at startup if the directory is missing, so create it first
os.makedirs(settings.STORAGE_PATH, exist_ok=True)

# Mount storage directory for file downloads
app.mount("/download", StaticFiles(directory=settings.STORAGE_PATH), name="download")


@app.post("/api/presentations", response_model=PresentationResponse)
async def create_presentation(request: PresentationRequest):
    """Submit a new presentation generation task"""
    # Submit task to Celery
    task = generate_presentation_task.delay(request.model_dump())
    return PresentationResponse(task_id=task.id)


@app.post("/api/presentations/from-pdf", response_model=PresentationResponse)
async def create_presentation_from_pdf(
    pdf_file: UploadFile = File(...),
    title: Optional[str] = Form(None),
    author: str = Form("Generated Presentation"),
    theme: str = Form("default"),
    num_slides: int = Form(5)
):
    """Submit a presentation generation task from PDF file"""
    # filename can be missing, and the extension may be upper case
    if not (pdf_file.filename or "").lower().endswith('.pdf'):
        raise HTTPException(status_code=400, detail="File must be a PDF")

    # Read PDF file content
    pdf_content = await pdf_file.read()

    # Extract text from PDF
    processor = PDFProcessor()
    pdf_text = processor.extract_text_from_pdf(pdf_content)

    # Create request object
    request = PDFPresentationRequest(
        title=title or f"Presentation based on {pdf_file.filename}",
        author=author,
        theme=theme,
        num_slides=num_slides
    )

    # Submit task to Celery
    task = generate_presentation_from_pdf_task.delay(pdf_text, request.model_dump())
    return PresentationResponse(task_id=task.id)


@app.get("/api/presentations/{task_id}", response_model=PresentationStatus)
async def get_presentation_status(task_id: str):
    """Get the status of a presentation generation task"""
    task_result = AsyncResult(task_id)

    if task_result.state == 'PENDING':
        return PresentationStatus(
            task_id=task_id,
            status="pending",
            message="Task is pending"
        )
    elif task_result.state == 'FAILURE':
        # For failed tasks, info holds the raised exception, not a dict
        return PresentationStatus(
            task_id=task_id,
            status="failed",
            message=str(task_result.info) or "Unknown error"
        )
    elif task_result.state == 'SUCCESS':
        result = task_result.get()
        return PresentationStatus(
            task_id=task_id,
            status="completed",
            file_url=result.get('file_url'),
            message=result.get('message')
        )
    else:
        return PresentationStatus(
            task_id=task_id,
            status=task_result.state.lower(),
            message="Task is in progress"
        )


@app.get("/api/download/{file_id}")
async def download_presentation(file_id: str):
    """Download a generated presentation"""
    file_path = os.path.join(settings.STORAGE_PATH, file_id)
    if not os.path.exists(file_path):
        raise HTTPException(status_code=404, detail="File not found")
    return FileResponse(path=file_path, filename=f"presentation_{file_id}")
We've added a new endpoint /api/presentations/from-pdf that accepts PDF file uploads along with optional parameters like title, author, theme, and the number of slides to generate.
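On the client side, the upload returns only a task_id, so the caller has to poll GET /api/presentations/{task_id} until the status becomes "completed" or "failed". Here is a minimal polling sketch. `wait_for_presentation` is a hypothetical helper, and the HTTP call is deliberately passed in as a function so the loop itself has no network dependency; the interval and timeout defaults are arbitrary.

```python
import time
from typing import Callable, Dict


def wait_for_presentation(
    fetch_status: Callable[[], Dict[str, str]],
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Call fetch_status until the task completes or fails, or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status()
        # "completed" and "failed" are the terminal statuses the API returns
        if payload.get("status") in ("completed", "failed"):
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("Presentation was not ready before the timeout")
        sleep(interval)
```

In a real client, fetch_status would issue the GET request for the task and return the decoded JSON body; on a "completed" result, the file_url field points at the generated .pptx file.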
Step 10: Containerizing with Docker
Let's update our Docker configuration to include the OpenAI API key. First, create a .env file:
APP_NAME=Presentation Generator
REDIS_URL=redis://redis:6379/0
RESULT_BACKEND=redis://redis:6379/0
STORAGE_PATH=/app/storage
OPENAI_API_KEY=your_openai_api_key_here
Next, update the docker-compose.yml file to include the OpenAI API key:
version: '3'

services:
  api:
    build: .
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
    volumes:
      - .:/app
      - presentation_data:/app/storage
    ports:
      - "8000:8000"
    depends_on:
      - redis
    environment:
      - REDIS_URL=redis://redis:6379/0
      - RESULT_BACKEND=redis://redis:6379/0
      - OPENAI_API_KEY=${OPENAI_API_KEY}

  worker:
    build: .
    command: celery -A celery_app worker --loglevel=info
    volumes:
      - .:/app
      - presentation_data:/app/storage
    depends_on:
      - redis
    environment:
      - REDIS_URL=redis://redis:6379/0
      - RESULT_BACKEND=redis://redis:6379/0
      - OPENAI_API_KEY=${OPENAI_API_KEY}

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

volumes:
  presentation_data:
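This setup passes your OpenAI API key from the .env file to the containerized services. Note that build: . expects a Dockerfile in the project root.
The resulting architecture has a few useful properties:
- Separation of concerns - API, task processing, and presentation generation are separate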
- Asynchronous processing - Long-running tasks don't block the API
- Containerization - Easy deployment and scaling
- Type safety - Pydantic models ensure data validation
You can extend this project in many ways, such as adding more slide types, integrating with data visualization libraries, or implementing template management.
Feel free to customize this service to fit your specific needs and save yourself from the drudgery of creating presentations manually!
GitHub Repository
The complete code for this tutorial is available on GitHub.
If you found this tutorial helpful, give it a ❤️ and share it with others who might benefit from creating slides with AI!
Happy coding!