In previous articles, we explored how to download and install PrestoDB locally on your machine. In this guide, we take it a step further: you'll learn how to set up and run a single-node Presto cluster using Docker, and connect it to Apache Superset. We'll walk through querying data from multiple sources like MySQL and MongoDB via PrestoDB. Whether you're a developer, data engineer, or BI enthusiast, this step-by-step tutorial will help you build a modern analytics stack with open-source tools and Docker.
Prerequisites:
- Docker Application (I am using OrbStack).
- Knowledge of Basic Docker Commands.
Step -1: Project Structure:
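The layout can be inferred from the relative volume paths used in the Compose file of Step 2 (`./presto/etc/catalog/...`). A minimal sketch that creates it is shown here; the top-level directory name `presto-superset-demo` is a hypothetical choice, and the files start out empty so they can be filled in during Steps 2 and 3.

```shell
# Create the project skeleton.
# "presto-superset-demo" is a hypothetical directory name; use any name you like.
mkdir -p presto-superset-demo/presto/etc/catalog
cd presto-superset-demo

# Empty placeholders; their contents come from Steps 2 and 3.
touch docker-compose.yml
touch presto/etc/catalog/mysql.properties
touch presto/etc/catalog/mongodb.properties

# List the files that were created.
find . -type f | sort
```

Creating the two catalog files before the first start matters: when a bind-mount source path does not exist, Docker creates a directory at that path, and Presto then cannot read it as a properties file.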
Step -2: Setting Up Docker Compose:
version: "3.8"

services:
  superset:
    image: apache/superset:latest
    container_name: superset
    ports:
      - "8088:8088"
    environment:
      SUPERSET_SECRET_KEY: 'supersecretkey'
      PYTHONUNBUFFERED: 1
    depends_on:
      - db
    volumes:
      - superset_home:/app/superset_home
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8088/health"]
      interval: 30s
      timeout: 10s
      retries: 5
    command: >
      /bin/bash -c "
      sleep 10 &&
      superset db upgrade &&
      superset fab create-admin --username admin --firstname Admin --lastname User --email admin@superset.com --password admin &&
      superset init &&
      superset run -h 0.0.0.0 -p 8088
      "

  db:
    image: postgres:15
    container_name: superset_db
    environment:
      POSTGRES_DB: superset
      POSTGRES_USER: superset
      POSTGRES_PASSWORD: superset
    volumes:
      - db_data:/var/lib/postgresql/data

  mysql:
    image: mysql:latest
    container_name: mysql
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: testdb
    ports:
      - "3307:3306"
    volumes:
      - mysql_data:/var/lib/mysql

  mongo:
    image: mongo:latest
    container_name: mongodb
    ports:
      - "27018:27017"
    volumes:
      - mongo_data:/data/db

  presto:
    image: prestodb/presto:latest
    container_name: presto
    ports:
      - "8081:8080"
    volumes:
      - ./presto/etc/catalog/mongodb.properties:/opt/presto-server/etc/catalog/mongodb.properties
      - ./presto/etc/catalog/mysql.properties:/opt/presto-server/etc/catalog/mysql.properties
    depends_on:
      - mysql
      - mongo

volumes:
  superset_home:
  db_data:
  mysql_data:
  mongo_data:
Step -3: Creating Presto Catalog Files:
mysql.properties (To connect MySQL Database)
connector.name=mysql
connection-url=jdbc:mysql://mysql:3306
connection-user=root
connection-password=root
mongodb.properties (To connect MongoDB Database)
connector.name=mongodb
mongodb.seeds=mongodb:27017
Step -4: Start all the Services:
- Open a terminal and navigate to the directory that contains docker-compose.yml.
- Run the command below. It starts all the services automatically; the first run can take 3-5 minutes while Docker pulls the images.
docker-compose up -d
- Once all the images are pulled, run the command below to check the status of the containers.
docker ps
The output should list the five containers defined in the Compose file (superset, superset_db, mysql, mongodb, and presto) with an "Up" status. Next, confirm that Apache Superset and PrestoDB are reachable on their published ports.
Open a browser and check that Apache Superset responds on port 8088 (http://localhost:8088/) and Presto on port 8081 (http://localhost:8081/).
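The same check can be scripted instead of done in a browser. The sketch below defines a small polling helper; `wait_for_url` is a hypothetical name, not part of Docker, Superset, or Presto. It assumes `curl` is installed on the host, reuses the `/health` path from the Superset healthcheck in the Compose file, and uses Presto's `/v1/info` REST endpoint.

```shell
# Poll a URL until it answers with a successful HTTP status,
# or give up after a number of attempts.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
  url=$1
  attempts=${2:-30}   # default: 30 tries
  delay=${3:-5}       # default: 5 seconds between tries
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -sS hides progress but keeps error messages
    if curl -fsS -o /dev/null "$url"; then
      echo "up: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out: $url" >&2
  return 1
}
```

With the helper loaded in the current shell, `wait_for_url http://localhost:8088/health && wait_for_url http://localhost:8081/v1/info` returns once both services respond. Superset usually takes longer because it runs its database migrations on startup.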
Step -5: Connecting PrestoDB as a database to Apache Superset:
- Superset doesn't ship with the Presto driver by default, so it has to be installed manually. Run the command below to open a shell inside the Superset container.
docker exec -it superset bash
You are now inside the Superset container.
Install pyhive[presto], the Python package Superset uses to connect to PrestoDB, with the command below.
pip install "pyhive[presto]"
- Once the installation is complete, leave the container with the exit command and restart the Superset container.
docker restart superset
- Open Superset in a browser at http://localhost:8088 and log in with the admin credentials created by the Compose file.
Username: admin
Password: admin
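- Navigate to Settings -> Database Connections -> Database, choose Presto from the list of supported databases, and enter the SQLAlchemy URI of the Presto service.
Click TEST CONNECTION, then click CONNECT once you see "Connection looks good".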
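The connection dialog needs a SQLAlchemy URI. As a hedged sketch, the snippet below assembles one in the `presto://host:port/catalog` form that Superset documents for Presto. The variable names are illustrative, and `mysql` as the default catalog is an assumption; any catalog defined in Step 3 would work.

```shell
# Build the SQLAlchemy URI that Superset uses to reach Presto.
# Superset talks to Presto over the Compose network, so it uses the
# container name and the container-internal port, not localhost:8081.
PRESTO_HOST=presto     # container name from docker-compose.yml
PRESTO_PORT=8080       # internal port; 8081 is only the host-side mapping
PRESTO_CATALOG=mysql   # assumed default catalog (mongodb would also work)

PRESTO_URI="presto://${PRESTO_HOST}:${PRESTO_PORT}/${PRESTO_CATALOG}"
echo "$PRESTO_URI"
```

Entering `localhost:8081` here is a common mistake: inside the Superset container, `localhost` refers to the Superset container itself, not to the host machine.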
Congratulations, everything is running smoothly and Presto is now connected to Apache Superset.
Step -6: Run a SQL Query and Verify That MySQL and MongoDB Are Visible as Catalogs:
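In Superset, the check amounts to running `SHOW CATALOGS;` in SQL Lab against the Presto connection. The same statement can also be run from a terminal. The sketch below assumes the `prestodb/presto` image provides the Presto CLI on its PATH as `presto-cli`; `presto_sql` is a hypothetical wrapper name.

```shell
# Run a single SQL statement through the Presto CLI inside the
# "presto" container (container name from docker-compose.yml).
# Assumption: the image exposes the CLI as "presto-cli".
presto_sql() {
  docker exec presto presto-cli --execute "$1"
}
```

With the wrapper loaded, `presto_sql "SHOW CATALOGS"` should list `mysql` and `mongodb` among the catalogs, and `presto_sql "SHOW SCHEMAS FROM mysql"` should include the `testdb` database created by the Compose file.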
Conclusion:
Follow Presto on its official website, LinkedIn, and YouTube, and join the Slack channel to interact with the community.