DEV Community

Cover image for Connecting Cassandra with Django: The Complete Guide (With & Without Docker)
sizan mahmud0
sizan mahmud0

Posted on

Connecting Cassandra with Django: The Complete Guide (With & Without Docker)

Part 1: Setup Without Docker (Traditional Method)

Why Cassandra with Django?

Before we dive in, let's understand why you'd want to use Cassandra with Django:

Cassandra strengths:

  • Handles massive write-heavy workloads
  • Linear scalability (add nodes, get more capacity)
  • No single point of failure
  • Perfect for time-series data, logs, IoT data

Django + Cassandra use cases:

  • Analytics platforms
  • Real-time messaging systems
  • IoT data collection
  • Event logging systems
  • Social media feeds

Important note: Cassandra is NOT a replacement for PostgreSQL/MySQL for typical Django apps. Use it for specific high-scale scenarios.


Installing Cassandra Locally

On Ubuntu/Debian

# Add Cassandra repository echo "deb https://debian.cassandra.apache.org 41x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list # Add repository keys curl https://downloads.apache.org/cassandra/KEYS | sudo apt-key add - # Update and install sudo apt-get update sudo apt-get install cassandra # Start Cassandra sudo systemctl start cassandra sudo systemctl enable cassandra # Verify installation nodetool status 
Enter fullscreen mode Exit fullscreen mode

On macOS

# Install via Homebrew brew install cassandra # Start Cassandra brew services start cassandra # Verify cqlsh 
Enter fullscreen mode Exit fullscreen mode

On Windows

  1. Download Apache Cassandra from official website
  2. Extract to C:\cassandra
  3. Set JAVA_HOME environment variable
  4. Run C:\cassandra\bin\cassandra.bat

Setting Up Django Project

Step 1: Install Required Packages

# Create virtual environment python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate # Install Django and Cassandra driver pip install django pip install cassandra-driver pip install django-cassandra-engine # Create requirements.txt pip freeze > requirements.txt 
Enter fullscreen mode Exit fullscreen mode

Your requirements.txt should include:

Django==4.2.7 cassandra-driver==3.28.0 django-cassandra-engine==1.8.0 
Enter fullscreen mode Exit fullscreen mode

Step 2: Create Django Project

django-admin startproject myproject cd myproject python manage.py startapp myapp 
Enter fullscreen mode Exit fullscreen mode

Step 3: Configure Django Settings

Open myproject/settings.py:

# Add cassandra engine to installed apps INSTALLED_APPS = [ 'django.contrib.admin', 'django.contrib.auth', 'django.contrib.contenttypes', 'django.contrib.sessions', 'django.contrib.messages', 'django.contrib.staticfiles', 'django_cassandra_engine', 'myapp', ] # Configure database connections DATABASES = { 'default': { 'ENGINE': 'django.db.backends.sqlite3', 'NAME': BASE_DIR / 'db.sqlite3', }, 'cassandra': { 'ENGINE': 'django_cassandra_engine', 'NAME': 'myapp_keyspace', 'HOST': '127.0.0.1', 'PORT': 9042, 'OPTIONS': { 'replication': { 'strategy_class': 'SimpleStrategy', 'replication_factor': 1 }, 'connection': { 'consistency': 'ONE', 'retry_connect': True, 'idle_heartbeat_interval': 30, 'idle_heartbeat_timeout': 10, } } } } # Set default database for cassandra engine DATABASE_ROUTERS = ['django_cassandra_engine.routers.CassandraRouter'] 
Enter fullscreen mode Exit fullscreen mode

Step 4: Create Cassandra Models

Create myapp/models.py:

from cassandra.cql import TimeUUID from django_cassandra_engine.models import DjangoCassandraModel from django_cassandra_engine import columns import uuid class UserActivity(DjangoCassandraModel): """ Store user activity logs in Cassandra Perfect for time-series data """ user_id = columns.UUID(primary_key=True, default=uuid.uuid4) timestamp = columns.DateTime(primary_key=True, clustering_order="DESC") activity_type = columns.Text() description = columns.Text() metadata = columns.Map(columns.Text, columns.Text) class Meta: get_pk_field = 'user_id' class SensorData(DjangoCassandraModel): """ IoT sensor data example """ sensor_id = columns.Text(primary_key=True) recorded_at = columns.DateTime(primary_key=True, clustering_order="DESC") temperature = columns.Float() humidity = columns.Float() location = columns.Text() class Meta: get_pk_field = 'sensor_id' class ChatMessage(DjangoCassandraModel): """ Chat/messaging system """ room_id = columns.UUID(primary_key=True) message_id = columns.TimeUUID(primary_key=True, default=uuid.uuid1, clustering_order="DESC") user_id = columns.UUID() username = columns.Text() message = columns.Text() timestamp = columns.DateTime() class Meta: get_pk_field = 'room_id' 
Enter fullscreen mode Exit fullscreen mode

Step 5: Sync Models with Cassandra

# Create keyspace and tables python manage.py sync_cassandra # Verify in CQL shell cqlsh # Check keyspace USE myapp_keyspace; # Show tables DESCRIBE TABLES; # Check table structure DESCRIBE TABLE user_activity; 
Enter fullscreen mode Exit fullscreen mode

Step 6: Create Views to Use Cassandra Models

Create myapp/views.py:

from django.http import JsonResponse from django.views.decorators.csrf import csrf_exempt from .models import UserActivity, SensorData, ChatMessage import json import uuid from datetime import datetime @csrf_exempt def log_activity(request): """Log user activity""" if request.method == 'POST': data = json.loads(request.body) activity = UserActivity.create( user_id=uuid.UUID(data['user_id']), timestamp=datetime.now(), activity_type=data['activity_type'], description=data['description'], metadata=data.get('metadata', {}) ) return JsonResponse({ 'status': 'success', 'activity_id': str(activity.user_id) }) def get_user_activities(request, user_id): """Get all activities for a user""" activities = UserActivity.objects.filter( user_id=uuid.UUID(user_id) ).limit(100) results = [{ 'user_id': str(a.user_id), 'timestamp': a.timestamp.isoformat(), 'activity_type': a.activity_type, 'description': a.description, 'metadata': a.metadata } for a in activities] return JsonResponse({'activities': results}) @csrf_exempt def save_sensor_data(request): """Save IoT sensor data""" if request.method == 'POST': data = json.loads(request.body) sensor = SensorData.create( sensor_id=data['sensor_id'], recorded_at=datetime.now(), temperature=data['temperature'], humidity=data['humidity'], location=data['location'] ) return JsonResponse({'status': 'success'}) def get_sensor_readings(request, sensor_id): """Get recent sensor readings""" readings = SensorData.objects.filter( sensor_id=sensor_id ).limit(50) results = [{ 'sensor_id': r.sensor_id, 'recorded_at': r.recorded_at.isoformat(), 'temperature': r.temperature, 'humidity': r.humidity, 'location': r.location } for r in readings] return JsonResponse({'readings': results}) @csrf_exempt def send_message(request): """Send chat message""" if request.method == 'POST': data = json.loads(request.body) message = ChatMessage.create( room_id=uuid.UUID(data['room_id']), message_id=uuid.uuid1(), user_id=uuid.UUID(data['user_id']), username=data['username'], message=data['message'], timestamp=datetime.now() ) return JsonResponse({'status': 'success'}) def get_room_messages(request, room_id): """Get messages from a chat room""" messages = ChatMessage.objects.filter( room_id=uuid.UUID(room_id) ).limit(100) results = [{ 'message_id': str(m.message_id), 'user_id': str(m.user_id), 'username': m.username, 'message': m.message, 'timestamp': m.timestamp.isoformat() } for m in messages] return JsonResponse({'messages': results}) 
Enter fullscreen mode Exit fullscreen mode

Step 7: Configure URLs

Create myapp/urls.py:

from django.urls import path from . import views urlpatterns = [ # User activity endpoints  path('activity/log/', views.log_activity), path('activity/<str:user_id>/', views.get_user_activities), # Sensor data endpoints  path('sensor/save/', views.save_sensor_data), path('sensor/<str:sensor_id>/', views.get_sensor_readings), # Chat endpoints  path('chat/send/', views.send_message), path('chat/room/<str:room_id>/', views.get_room_messages), ] 
Enter fullscreen mode Exit fullscreen mode

Update myproject/urls.py:

from django.contrib import admin from django.urls import path, include urlpatterns = [ path('admin/', admin.site.urls), path('api/', include('myapp.urls')), ] 
Enter fullscreen mode Exit fullscreen mode

Step 8: Test Your Setup

# Run Django development server python manage.py runserver # Test endpoints using curl # Log activity curl -X POST http://localhost:8000/api/activity/log/ \ -H "Content-Type: application/json" \ -d '{ "user_id": "550e8400-e29b-41d4-a716-446655440000", "activity_type": "login", "description": "User logged in", "metadata": {"ip": "192.168.1.1"} }' # Get user activities curl http://localhost:8000/api/activity/550e8400-e29b-41d4-a716-446655440000/ # Save sensor data curl -X POST http://localhost:8000/api/sensor/save/ \ -H "Content-Type: application/json" \ -d '{ "sensor_id": "SENSOR001", "temperature": 25.5, "humidity": 60.0, "location": "Room 101" }' 
Enter fullscreen mode Exit fullscreen mode

Part 2: Setup With Docker

Docker makes Cassandra setup much cleaner and reproducible. Let's containerize everything.

Step 1: Create Docker Configuration Files

Create docker-compose.yml in your project root:

version: '3.8' services: cassandra: image: cassandra:4.1 container_name: cassandra_db ports: - "9042:9042" - "9160:9160" environment: - CASSANDRA_CLUSTER_NAME=MyCluster - CASSANDRA_DC=datacenter1 - CASSANDRA_RACK=rack1 - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch - CASSANDRA_NUM_TOKENS=128 volumes: - cassandra_data:/var/lib/cassandra healthcheck: test: ["CMD-SHELL", "cqlsh -e 'describe cluster'"] interval: 30s timeout: 10s retries: 5 networks: - django_network django: build: . container_name: django_app command: python manage.py runserver 0.0.0.0:8000 volumes: - .:/app ports: - "8000:8000" depends_on: cassandra: condition: service_healthy environment: - CASSANDRA_HOST=cassandra - CASSANDRA_PORT=9042 networks: - django_network volumes: cassandra_data: networks: django_network: driver: bridge 
Enter fullscreen mode Exit fullscreen mode

Step 2: Create Dockerfile

Create Dockerfile in project root:

FROM python:3.11-slim # Set environment variables ENV PYTHONUNBUFFERED=1 ENV PYTHONDONTWRITEBYTECODE=1 # Set work directory WORKDIR /app # Install system dependencies RUN apt-get update && apt-get install -y \  gcc \  python3-dev \  && rm -rf /var/lib/apt/lists/* # Install Python dependencies COPY requirements.txt /app/ RUN pip install --upgrade pip RUN pip install -r requirements.txt # Copy project COPY . /app/ # Run migrations and start server CMD ["sh", "-c", "python manage.py sync_cassandra && python manage.py runserver 0.0.0.0:8000"] 
Enter fullscreen mode Exit fullscreen mode

Step 3: Update Django Settings for Docker

Update myproject/settings.py:

import os # ... existing settings ...  DATABASES = { 'default': { 'ENGINE': 'django.db.backends.sqlite3', 'NAME': BASE_DIR / 'db.sqlite3', }, 'cassandra': { 'ENGINE': 'django_cassandra_engine', 'NAME': 'myapp_keyspace', 'HOST': os.environ.get('CASSANDRA_HOST', '127.0.0.1'), 'PORT': int(os.environ.get('CASSANDRA_PORT', 9042)), 'OPTIONS': { 'replication': { 'strategy_class': 'SimpleStrategy', 'replication_factor': 1 }, 'connection': { 'consistency': 'ONE', 'retry_connect': True, 'idle_heartbeat_interval': 30, 'idle_heartbeat_timeout': 10, } } } } 
Enter fullscreen mode Exit fullscreen mode

Step 4: Create .dockerignore

Create .dockerignore:

__pycache__ *.pyc *.pyo *.pyd .Python env/ venv/ .env .git .gitignore *.sqlite3 *.log 
Enter fullscreen mode Exit fullscreen mode

Step 5: Build and Run with Docker

# Build and start all services docker-compose up --build # In another terminal, check Cassandra is ready docker exec -it cassandra_db cqlsh # Verify keyspace and tables USE myapp_keyspace; DESCRIBE TABLES; # Check Django logs docker-compose logs -f django # Access your Django app # Open browser: http://localhost:8000 
Enter fullscreen mode Exit fullscreen mode

Step 6: Useful Docker Commands

# Start services docker-compose up -d # Stop services docker-compose down # Stop and remove volumes (clean slate) docker-compose down -v # View logs docker-compose logs -f docker-compose logs cassandra docker-compose logs django # Execute commands in containers docker exec -it cassandra_db cqlsh docker exec -it django_app python manage.py shell # Restart specific service docker-compose restart django # Scale Cassandra (multi-node cluster) docker-compose up -d --scale cassandra=3 
Enter fullscreen mode Exit fullscreen mode

Step 7: Production Docker Setup

For production, create docker-compose.prod.yml:

version: '3.8' services: cassandra: image: cassandra:4.1 container_name: cassandra_prod ports: - "9042:9042" environment: - CASSANDRA_CLUSTER_NAME=ProdCluster - MAX_HEAP_SIZE=4G - HEAP_NEWSIZE=800M volumes: - cassandra_prod_data:/var/lib/cassandra deploy: resources: limits: memory: 8G cpus: '4' networks: - prod_network django: build: . container_name: django_prod command: gunicorn myproject.wsgi:application --bind 0.0.0.0:8000 --workers 4 volumes: - static_volume:/app/staticfiles expose: - 8000 depends_on: - cassandra environment: - DEBUG=False - CASSANDRA_HOST=cassandra networks: - prod_network nginx: image: nginx:alpine container_name: nginx_prod volumes: - ./nginx.conf:/etc/nginx/nginx.conf - static_volume:/app/staticfiles ports: - "80:80" depends_on: - django networks: - prod_network volumes: cassandra_prod_data: static_volume: networks: prod_network: driver: bridge 
Enter fullscreen mode Exit fullscreen mode

Comparison: Docker vs Without Docker

Without Docker

Pros:

  • ✅ Direct control over installation
  • ✅ Easier debugging initially
  • ✅ No containerization overhead
  • ✅ Good for learning Cassandra

Cons:

  • ❌ Manual installation and configuration
  • ❌ Environment inconsistencies
  • ❌ Harder to replicate on other machines
  • ❌ Cleanup is messy

With Docker

Pros:

  • ✅ Consistent environment everywhere
  • ✅ Easy setup and teardown
  • ✅ Perfect for team collaboration
  • ✅ Production-ready configuration
  • ✅ Easy to scale

Cons:

  • ❌ Learning curve for Docker
  • ❌ Resource overhead
  • ❌ Network complexity

Best Practices

1. Connection Pooling

# In settings.py 'OPTIONS': { 'connection': { 'consistency': 'ONE', 'connections_per_host': 5, 'protocol_version': 4, } } 
Enter fullscreen mode Exit fullscreen mode

2. Proper Data Modeling

# Good: Partition by natural key class UserActivity(DjangoCassandraModel): user_id = columns.UUID(primary_key=True) timestamp = columns.DateTime(primary_key=True, clustering_order="DESC") # Bad: Everything in one partition class BadModel(DjangoCassandraModel): id = columns.UUID(primary_key=True) # Will create hot partitions 
Enter fullscreen mode Exit fullscreen mode

3. Query Optimization

# Good: Query by partition key activities = UserActivity.objects.filter(user_id=user_id).limit(100) # Bad: Full table scan all_activities = UserActivity.objects.all() # NEVER do this! 
Enter fullscreen mode Exit fullscreen mode

4. Error Handling

from cassandra.cluster import NoHostAvailable from django.core.exceptions import ObjectDoesNotExist try: activity = UserActivity.objects.get(user_id=user_id) except ObjectDoesNotExist: return JsonResponse({'error': 'Not found'}, status=404) except NoHostAvailable: return JsonResponse({'error': 'Database unavailable'}, status=503) 
Enter fullscreen mode Exit fullscreen mode

Troubleshooting

Issue: Cassandra won't start

# Check logs docker-compose logs cassandra # Increase memory in docker-compose.yml environment: - MAX_HEAP_SIZE=2G 
Enter fullscreen mode Exit fullscreen mode

Issue: Django can't connect

# Add connection retry logic 'OPTIONS': { 'connection': { 'retry_connect': True, 'connect_timeout': 30, } } 
Enter fullscreen mode Exit fullscreen mode

Issue: Slow queries

# Check with TRACING in cqlsh TRACING ON; SELECT * FROM user_activity WHERE user_id = ...; 
Enter fullscreen mode Exit fullscreen mode

Conclusion

You now have two ways to integrate Cassandra with Django:

  1. Without Docker: Better for learning and local development
  2. With Docker: Better for teams and production

Choose based on your needs. Start simple, scale when necessary.

Remember: Cassandra is powerful but complex. Use it for the right use cases—high-write scenarios, time-series data, and distributed systems.


Questions about Cassandra + Django? Drop them in the comments!

Top comments (0)