Posted on Apr 21

Create a PDF to Slide AI Generator with Python, Celery, and python-pptx 🔥🚀

TL;DR

We will build an AI tool that creates slides from a PDF. I'll show you how to build a backend service that generates PowerPoint slides asynchronously using Python, Celery, and python-pptx. The backend simply accepts a PDF and returns slides as a .pptx file. Exciting stuff, isn't it?

The architecture of this tool is heavily inspired by what we work on at SlideSpeak. SlideSpeak is an AI tool to create slides from PDF and more. The code for this tutorial is available here:

Here's what the results of the PDF to slides AI generator look like:

But since we all absolutely love PowerPoint slides, let's get into it.

What You'll Build

This tutorial will walk you through creating a backend service that:

Provides a RESTful API to request slide generation
Processes slide requests asynchronously with Celery
Creates professional PowerPoint slides with python-pptx
Supports multiple slide layouts (title, content, bullet points, etc.)
Extracts text from PDF files
Uses OpenAI to generate presentation content automatically
Scales efficiently to handle multiple requests

Tech Stack

FastAPI: For creating the RESTful API endpoints
Celery: For handling asynchronous tasks
Redis: As message broker and result backend for Celery
python-pptx: For programmatically creating PowerPoint files
PyPDF2: For extracting text from PDF files
OpenAI API: For intelligent content generation
Docker & Docker Compose: For containerizing the application

Architecture

Getting Started

Before diving into the code, let's understand the project structure:

presentation_generator/
├── app/
│   ├── __init__.py
│   ├── main.py            # FastAPI application
│   ├── models.py          # Pydantic models
│   ├── config.py          # Configuration
│   ├── ppt_generator.py   # Slide generation logic
│   └── pdf_processor.py   # PDF processing and OpenAI integration
├── celery_app/
│   ├── __init__.py
│   ├── tasks.py           # Celery tasks
│   └── celery_config.py   # Celery configuration
├── requirements.txt
└── docker-compose.yml

Step 1: Setting Up the Environment

Let's start by creating our project directory and installing the required dependencies:

mkdir presentation_generator
cd presentation_generator
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate

Now, create a requirements.txt file with the following dependencies:

fastapi==0.103.1
uvicorn==0.23.2
celery==5.3.4
redis==5.0.0
python-pptx==0.6.21
python-multipart==0.0.6
pydantic==2.3.0
pydantic-settings==2.0.3
pypdf2==3.0.1
openai==1.6.0
python-dotenv==1.0.0

Install these dependencies:

pip install -r requirements.txt

Step 2: Setting Up Configuration

Let's create a configuration file to manage our application settings. Create app/config.py:

from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    APP_NAME: str = "Presentation Generator"
    REDIS_URL: str = "redis://localhost:6379/0"
    RESULT_BACKEND: str = "redis://localhost:6379/0"
    STORAGE_PATH: str = "./storage"
    OPENAI_API_KEY: str = ""

    class Config:
        env_file = ".env"


settings = Settings()

This configuration can be overridden with environment variables or values in a .env file. Note that we've added an OPENAI_API_KEY setting that we'll use later.

Step 3: Creating Data Models

Next, let's define our data models with Pydantic. Create app/models.py:

from pydantic import BaseModel, Field
from typing import List, Optional
from enum import Enum


class SlideType(str, Enum):
    TITLE = "title"
    CONTENT = "content"
    IMAGE = "image"
    BULLET_POINTS = "bullet_points"
    TWO_COLUMN = "two_column"


class SlideContent(BaseModel):
    type: SlideType
    title: str
    content: Optional[str] = None
    image_url: Optional[str] = None
    bullet_points: Optional[List[str]] = None
    column1: Optional[str] = None
    column2: Optional[str] = None


class PresentationRequest(BaseModel):
    title: str
    author: str
    slides: List[SlideContent]
    theme: Optional[str] = "default"


# New model for PDF-based presentation requests
class PDFPresentationRequest(BaseModel):
    title: Optional[str] = None
    author: Optional[str] = "Generated Presentation"
    theme: Optional[str] = "default"
    num_slides: Optional[int] = 5


class PresentationResponse(BaseModel):
    task_id: str
    status: str = "pending"


class PresentationStatus(BaseModel):
    task_id: str
    status: str
    file_url: Optional[str] = None
    message: Optional[str] = None

We've added a new PDFPresentationRequest model for handling PDF uploads. This model allows customizing the title, author, theme, and number of slides to generate.
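To see what the validation buys us, here is a minimal sketch of slide dictionaries (for example, ones parsed from a model's JSON reply) being checked against the models. It uses trimmed copies of SlideType and SlideContent so it runs on its own; the parse_slides helper and the sample input are hypothetical and not part of the tutorial code.

```python
from enum import Enum
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class SlideType(str, Enum):
    TITLE = "title"
    CONTENT = "content"
    BULLET_POINTS = "bullet_points"


class SlideContent(BaseModel):
    type: SlideType
    title: str
    content: Optional[str] = None
    bullet_points: Optional[List[str]] = None


def parse_slides(raw_slides: list) -> tuple:
    """Split raw dicts into validated slides and human-readable error messages."""
    valid, errors = [], []
    for index, raw in enumerate(raw_slides):
        try:
            valid.append(SlideContent.model_validate(raw))
        except ValidationError as exc:
            errors.append(f"slide {index}: {exc.error_count()} validation error(s)")
    return valid, errors


# Hypothetical input: the second entry uses a slide type the enum does not define.
slides, problems = parse_slides([
    {"type": "bullet_points", "title": "Goals", "bullet_points": ["Fast", "Simple"]},
    {"type": "video", "title": "Demo"},
])
```

Collecting errors instead of failing on the first bad slide is one possible design; the tutorial's tasks simply let the validation error fail the whole job.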

Step 4: Implementing the AI Slide Generator

Now, let's create the core AI slide generation logic. Create app/ppt_generator.py:

import os
import uuid

from pptx import Presentation

from app.models import SlideType, SlideContent, PresentationRequest
from app.config import settings


class PPTGenerator:
    def __init__(self):
        # Ensure storage directory exists
        os.makedirs(settings.STORAGE_PATH, exist_ok=True)

    def generate_presentation(self, request: PresentationRequest) -> str:
        """Generate a PowerPoint file based on the request and return its path"""
        prs = Presentation()

        # Add title slide
        title_slide_layout = prs.slide_layouts[0]
        slide = prs.slides.add_slide(title_slide_layout)
        title = slide.shapes.title
        subtitle = slide.placeholders[1]
        title.text = request.title
        subtitle.text = f"By {request.author}"

        # Add content slides
        for slide_content in request.slides:
            self._add_slide(prs, slide_content)

        # Save the presentation under a unique file name
        file_id = str(uuid.uuid4())
        file_path = os.path.join(settings.STORAGE_PATH, f"{file_id}.pptx")
        prs.save(file_path)
        return file_path

    def _add_slide(self, prs: Presentation, content: SlideContent):
        """Add a slide based on its type and content"""
        if content.type == SlideType.TITLE:
            slide_layout = prs.slide_layouts[0]
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            subtitle = slide.placeholders[1]
            title.text = content.title
            if content.content:
                subtitle.text = content.content

        elif content.type == SlideType.CONTENT:
            slide_layout = prs.slide_layouts[1]
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            body = slide.placeholders[1]
            title.text = content.title
            if content.content:
                body.text = content.content

        elif content.type == SlideType.BULLET_POINTS:
            slide_layout = prs.slide_layouts[1]
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            body = slide.placeholders[1]
            title.text = content.title
            if content.bullet_points:
                tf = body.text_frame
                tf.clear()  # Remove default text, leaving one empty paragraph
                for i, point in enumerate(content.bullet_points):
                    # Reuse the first (empty) paragraph so no blank bullet is left
                    p = tf.paragraphs[0] if i == 0 else tf.add_paragraph()
                    p.text = point
                    p.level = 0

        elif content.type == SlideType.TWO_COLUMN:
            slide_layout = prs.slide_layouts[3]  # Assuming layout 3 is two-content
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            title.text = content.title
            # Handle columns - this may vary based on your pptx template
            left = slide.placeholders[1]
            right = slide.placeholders[2]
            if content.column1:
                left.text = content.column1
            if content.column2:
                right.text = content.column2

        elif content.type == SlideType.IMAGE:
            # Basic image slide
            slide_layout = prs.slide_layouts[5]  # Title-only layout
            slide = prs.slides.add_slide(slide_layout)
            title = slide.shapes.title
            title.text = content.title
            # Note: In a real application, you would handle image downloads
            # and insertion here. For simplicity, we're omitting this.

This class handles the creation of PowerPoint slides using the python-pptx library. It supports different slide types and saves the generated files with unique IDs.

Step 5: Setting Up Celery

Now, let's configure Celery for asynchronous task processing. First, create celery_app/celery_config.py:

from app.config import settings

broker_url = settings.REDIS_URL
result_backend = settings.RESULT_BACKEND
task_serializer = 'json'
result_serializer = 'json'
accept_content = ['json']
timezone = 'UTC'
task_track_started = True
worker_hijack_root_logger = False

Next, initialize the Celery application in celery_app/__init__.py:

from celery import Celery
from app.config import settings

app = Celery('presentation_generator')
app.config_from_object('celery_app.celery_config')

# Import tasks to ensure they're registered
from celery_app import tasks

Step 6: Creating Celery Tasks

Let's define our asynchronous task for generating slides. Create celery_app/tasks.py:

import os
import logging

from celery import shared_task

from app.models import PresentationRequest
from app.ppt_generator import PPTGenerator

logger = logging.getLogger(__name__)


@shared_task(bind=True)
def generate_presentation_task(self, request_dict):
    """Generate a PowerPoint presentation asynchronously"""
    try:
        # Convert dict back to PresentationRequest
        request = PresentationRequest(**request_dict)
        logger.info(f"Starting presentation generation for: {request.title}")

        # Generate the presentation
        generator = PPTGenerator()
        file_path = generator.generate_presentation(request)

        # In a real application, you might upload to S3 or similar
        file_url = f"/download/{os.path.basename(file_path)}"

        return {
            "status": "completed",
            "file_url": file_url,
            "message": "Presentation generated successfully"
        }
    except Exception:
        # Log the traceback and re-raise: Celery then marks the task as
        # FAILURE and stores the exception in the result backend.
        logger.exception("Error generating presentation")
        raise

This task will be processed asynchronously by Celery workers.

Step 7: Creating the PDF Processor

Now, let's add the PDF processing functionality. Create app/pdf_processor.py:

import json
import os
import tempfile
from typing import Any, Dict, Optional

from PyPDF2 import PdfReader
from openai import OpenAI

from app.config import settings


class PDFProcessor:
    def __init__(self):
        self.client = OpenAI(api_key=settings.OPENAI_API_KEY)

    def extract_text_from_pdf(self, pdf_content: bytes) -> str:
        """Extract text content from PDF bytes"""
        with tempfile.NamedTemporaryFile(delete=False) as temp:
            temp.write(pdf_content)
            temp_path = temp.name

        try:
            pdf = PdfReader(temp_path)
            text = ""
            for page in pdf.pages:
                # Guard against pages that yield no text (e.g. scanned images)
                text += (page.extract_text() or "") + "\n"
            return text
        finally:
            # Clean up the temp file
            if os.path.exists(temp_path):
                os.unlink(temp_path)

    def generate_presentation_content(
        self, text: str, title: Optional[str] = None, num_slides: int = 5
    ) -> Dict[str, Any]:
        """Generate presentation content using OpenAI"""
        # Prepare the system message
        system_message = f"""
        You are an expert presentation creator. Your task is to create a well-structured
        presentation from the provided text content. Extract the key points and organize
        them into a cohesive presentation.

        Create a presentation with the following:
        1. A title slide with an engaging title (if not provided) and subtitle
        2. {num_slides - 1} content slides

        Structure the presentation logically and extract the most important information.
        """

        # Limit the text to the first 10,000 characters to avoid token limits
        excerpt = text[:10000]

        # Prepare the user message
        user_message = f"""
        Create a presentation based on the following content:

        {excerpt}

        Please structure your response in JSON format with the following structure:
        {{
            "title": "Main Title of Presentation",
            "slides": [
                {{
                    "type": "title",
                    "title": "Presentation Title",
                    "content": "Subtitle - e.g. Author's Name"
                }},
                {{
                    "type": "bullet_points",
                    "title": "Key Point 1",
                    "bullet_points": ["Point 1", "Point 2", "Point 3"]
                }},
                ...
            ]
        }}

        Ensure all slide content is concise and impactful.
        Use different slide types appropriately:
        - title: For title slides with a subtitle
        - content: For slides with paragraphs of text
        - bullet_points: For key points in a list format
        - two_column: For comparing information side by side
        """

        if title:
            user_message += f"\nUse '{title}' as the presentation title."

        # Call the OpenAI API
        response = self.client.chat.completions.create(
            model="gpt-4o",
            response_format={"type": "json_object"},
            messages=[
                {"role": "system", "content": system_message},
                {"role": "user", "content": user_message}
            ]
        )

        # Extract the response content and parse it as JSON
        content = response.choices[0].message.content
        presentation_data = json.loads(content)
        return presentation_data

This class handles the extraction of text from PDF files and uses OpenAI to generate presentation content based on that text. It uses PyPDF2 to read the PDF and extract text, then sends that text to OpenAI's API with specific instructions to create a well-structured presentation.
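The processor caps the PDF text with a plain character slice, which can cut the input mid-sentence. As one possible refinement, here is a minimal sketch that trims at the last paragraph or sentence break inside the limit. The truncate_at_paragraph helper is hypothetical, and note that a character count is only a rough stand-in for the model's token limit.

```python
def truncate_at_paragraph(text: str, max_chars: int = 10000) -> str:
    """Cut text to at most max_chars, preferring the last paragraph or sentence break."""
    if len(text) <= max_chars:
        return text
    window = text[:max_chars]
    # Prefer a blank-line break, then a single newline, then a sentence end.
    for separator in ("\n\n", "\n", ". "):
        cut = window.rfind(separator)
        # Only accept a break in the second half so most of the window is kept.
        if cut >= max_chars // 2:
            return window[:cut].rstrip() + ("." if separator == ". " else "")
    # No suitable break found: fall back to the hard cut.
    return window
```

It could be swapped in where the prompt is built, in place of the fixed slice.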

Step 8: Updating Celery Tasks

Next, let's update our Celery tasks to handle PDF processing. Modify celery_app/tasks.py:

import os
import logging

from celery import shared_task

from app.models import PresentationRequest, PDFPresentationRequest
from app.ppt_generator import PPTGenerator
from app.pdf_processor import PDFProcessor

logger = logging.getLogger(__name__)


@shared_task(bind=True)
def generate_presentation_task(self, request_dict):
    """Generate a PowerPoint presentation asynchronously"""
    try:
        # Convert dict back to PresentationRequest
        request = PresentationRequest(**request_dict)
        logger.info(f"Starting presentation generation for: {request.title}")

        # Generate the presentation
        generator = PPTGenerator()
        file_path = generator.generate_presentation(request)

        # In a real application, you might upload to S3 or similar
        file_url = f"/download/{os.path.basename(file_path)}"

        return {
            "status": "completed",
            "file_url": file_url,
            "message": "Presentation generated successfully"
        }
    except Exception:
        # Log the traceback and re-raise: Celery then marks the task as
        # FAILURE and stores the exception in the result backend.
        logger.exception("Error generating presentation")
        raise


@shared_task(bind=True)
def generate_presentation_from_pdf_task(self, pdf_text, request_dict):
    """Generate a PowerPoint presentation from PDF text asynchronously"""
    try:
        # Convert dict back to PDFPresentationRequest
        request = PDFPresentationRequest(**request_dict)
        logger.info("Starting presentation generation from PDF")

        # Process the PDF text with OpenAI
        processor = PDFProcessor()
        presentation_data = processor.generate_presentation_content(
            pdf_text,
            title=request.title,
            num_slides=request.num_slides
        )

        # Create a PresentationRequest from the generated content
        presentation_request = PresentationRequest(
            title=presentation_data.get("title", request.title or "Generated Presentation"),
            author=request.author,
            theme=request.theme,
            slides=presentation_data.get("slides", [])
        )

        # Generate the presentation
        generator = PPTGenerator()
        file_path = generator.generate_presentation(presentation_request)

        # In a real application, you might upload to S3 or similar
        file_url = f"/download/{os.path.basename(file_path)}"

        return {
            "status": "completed",
            "file_url": file_url,
            "message": "Presentation generated successfully from PDF"
        }
    except Exception:
        # Log the traceback and re-raise so Celery records the failure
        logger.exception("Error generating presentation from PDF")
        raise

We've added a new task generate_presentation_from_pdf_task that takes the extracted PDF text and request details, then uses the PDF processor to generate presentation content with OpenAI.
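Model output is not guaranteed to match the requested structure, so it can help to clean the reply before building the presentation request. The following is a minimal sketch under that assumption: normalize_slides and ALLOWED_TYPES are hypothetical names, and the allowed types simply mirror the four slide types listed in the prompt.

```python
import json

# Slide types the prompt asks the model to use.
ALLOWED_TYPES = {"title", "content", "bullet_points", "two_column"}


def normalize_slides(raw_json: str, num_slides: int = 5) -> dict:
    """Parse the model's JSON reply and keep only slides the generator can render."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")

    slides = []
    for slide in data.get("slides", []):
        # Skip anything that is not an object with a known type and a title.
        if not isinstance(slide, dict):
            continue
        if slide.get("type") not in ALLOWED_TYPES or not slide.get("title"):
            continue
        slides.append(slide)

    return {
        "title": data.get("title") or "Generated Presentation",
        "slides": slides[:num_slides],
    }
```

Dropping unusable slides keeps one bad entry from failing the whole task, at the cost of silently producing a shorter deck; raising an error instead would be an equally reasonable choice.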

Step 9: Updating the FastAPI Application

Now, let's build the FastAPI application, including the PDF upload endpoint. Create app/main.py:

import os
from typing import Optional

from celery.result import AsyncResult
from fastapi import FastAPI, File, Form, HTTPException, UploadFile
from fastapi.responses import FileResponse
from fastapi.staticfiles import StaticFiles

from app.config import settings
from app.models import (
    PDFPresentationRequest,
    PresentationRequest,
    PresentationResponse,
    PresentationStatus,
)
from app.pdf_processor import PDFProcessor
from celery_app.tasks import (
    generate_presentation_from_pdf_task,
    generate_presentation_task,
)

app = FastAPI(title=settings.APP_NAME)

# StaticFiles raises at startup if the directory is missing, so create it first
os.makedirs(settings.STORAGE_PATH, exist_ok=True)

# Mount storage directory for file downloads
app.mount("/download", StaticFiles(directory=settings.STORAGE_PATH), name="download")


@app.post("/api/presentations", response_model=PresentationResponse)
async def create_presentation(request: PresentationRequest):
    """Submit a new presentation generation task"""
    # Submit task to Celery
    task = generate_presentation_task.delay(request.model_dump())
    return PresentationResponse(task_id=task.id)


@app.post("/api/presentations/from-pdf", response_model=PresentationResponse)
async def create_presentation_from_pdf(
    pdf_file: UploadFile = File(...),
    title: Optional[str] = Form(None),
    author: str = Form("Generated Presentation"),
    theme: str = Form("default"),
    num_slides: int = Form(5)
):
    """Submit a presentation generation task from PDF file"""
    # The filename may be missing, and the extension may be upper case
    if not (pdf_file.filename or "").lower().endswith(".pdf"):
        raise HTTPException(status_code=400, detail="File must be a PDF")

    # Read PDF file content
    pdf_content = await pdf_file.read()

    # Extract text from PDF
    processor = PDFProcessor()
    pdf_text = processor.extract_text_from_pdf(pdf_content)

    # Create request object
    request = PDFPresentationRequest(
        title=title or f"Presentation based on {pdf_file.filename}",
        author=author,
        theme=theme,
        num_slides=num_slides
    )

    # Submit task to Celery
    task = generate_presentation_from_pdf_task.delay(pdf_text, request.model_dump())
    return PresentationResponse(task_id=task.id)


@app.get("/api/presentations/{task_id}", response_model=PresentationStatus)
async def get_presentation_status(task_id: str):
    """Get the status of a presentation generation task"""
    task_result = AsyncResult(task_id)

    if task_result.state == 'PENDING':
        return PresentationStatus(
            task_id=task_id,
            status="pending",
            message="Task is pending"
        )
    elif task_result.state == 'FAILURE':
        # For a failed task, info is normally the raised exception, not a dict
        info = task_result.info
        if isinstance(info, dict):
            message = info.get('message', 'Unknown error')
        else:
            message = str(info) or 'Unknown error'
        return PresentationStatus(
            task_id=task_id,
            status="failed",
            message=message
        )
    elif task_result.state == 'SUCCESS':
        result = task_result.get()
        return PresentationStatus(
            task_id=task_id,
            status="completed",
            file_url=result.get('file_url'),
            message=result.get('message')
        )
    else:
        return PresentationStatus(
            task_id=task_id,
            status=task_result.state.lower(),
            message="Task is in progress"
        )


@app.get("/api/download/{file_id}")
async def download_presentation(file_id: str):
    """Download a generated presentation"""
    file_path = os.path.join(settings.STORAGE_PATH, file_id)
    if not os.path.exists(file_path):
        raise HTTPException(status_code=404, detail="File not found")
    return FileResponse(path=file_path, filename=f"presentation_{file_id}")

We've added a new endpoint /api/presentations/from-pdf that accepts PDF file uploads along with optional parameters like title, author, theme, and the number of slides to generate.
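Because generation runs in the background, a client has to poll the status endpoint until the task finishes. Here is a minimal client-side sketch using only the standard library. The helper names, the polling interval, and the timeout are assumptions for illustration; the status values ("completed", "failed") are the ones the status endpoint returns.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_status(base_url: str, task_id: str) -> dict:
    """GET /api/presentations/{task_id} and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/presentations/{task_id}") as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_presentation(
    task_id: str,
    fetch: Callable[[str], dict],
    interval: float = 2.0,
    timeout: float = 120.0,
) -> dict:
    """Poll until the task is completed or failed, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch(task_id)
        if status.get("status") in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Task {task_id} still {status.get('status')!r} after {timeout}s"
            )
        time.sleep(interval)
```

With the API running locally on port 8000, a call might look like wait_for_presentation(task_id, lambda t: fetch_status("http://localhost:8000", t)). Passing the fetch function in keeps the polling loop easy to test without a running server.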

Step 10: Containerizing with Docker

Let's update our Docker configuration to include the OpenAI API key. First, create a .env file:

APP_NAME=Presentation Generator
REDIS_URL=redis://redis:6379/0
RESULT_BACKEND=redis://redis:6379/0
STORAGE_PATH=/app/storage
OPENAI_API_KEY=your_openai_api_key_here

Next, update the docker-compose.yml file to include the OpenAI API key:

version: '3'

services:
  api:
    build: .
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
    volumes:
      - .:/app
      - presentation_data:/app/storage
    ports:
      - "8000:8000"
    depends_on:
      - redis
    environment:
      - REDIS_URL=redis://redis:6379/0
      - RESULT_BACKEND=redis://redis:6379/0
      - OPENAI_API_KEY=${OPENAI_API_KEY}

  worker:
    build: .
    command: celery -A celery_app worker --loglevel=info
    volumes:
      - .:/app
      - presentation_data:/app/storage
    depends_on:
      - redis
    environment:
      - REDIS_URL=redis://redis:6379/0
      - RESULT_BACKEND=redis://redis:6379/0
      - OPENAI_API_KEY=${OPENAI_API_KEY}

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

volumes:
  presentation_data:

This setup will pass your OpenAI API key from the .env file to the containerized services.

Separation of concerns - API, task processing, and presentation generation are separate
Asynchronous processing - Long-running tasks don't block the API
Containerization - Easy deployment and scaling
Type safety - Pydantic models ensure data validation

You can extend this project in many ways, such as adding more slide types, integrating with data visualization libraries, or implementing template management.

Feel free to customize this service to fit your specific needs and save yourself from the drudgery of creating presentations manually!

GitHub Repository

The complete code for this tutorial is available on GitHub.

If you found this tutorial helpful, give it a ❤️ and share it with others who might benefit from creating slides with AI!

Happy coding! 🚀
