This project allows you to generate custom playlists based on a user-defined prompt, like “sad girl wistful Friday evening” or “midwestern emo puppy love.” By using Deeplearning4j to embed song lyrics into vectors and MongoDB Atlas's Vector Search to find similar songs, you can create playlists that match the semantic vibe of your chosen playlist name.
This repo demonstrates how to:
- Embed song lyrics using GloVe pre-trained embeddings with Deeplearning4j.
- Store and query embeddings in MongoDB Atlas using Vector Search.
- Generate playlists based on a descriptive prompt that captures the "vibe."
- Custom Playlist Names: Create playlists based on any prompt, and retrieve songs that match the semantic meaning of the prompt.
- Deeplearning4j Integration: Use Deeplearning4j to process and embed song lyrics into vectors.
- MongoDB Atlas Vector Search: Perform vector searches in MongoDB to find songs with similar lyrical content.
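
To make "songs with similar lyrical content" concrete, here is a minimal plain-Java sketch of similarity ranking over vectors. It is an illustration only, not this project's code: it assumes cosine similarity and does a brute-force scan of an in-memory map, whereas Atlas Vector Search runs server-side against a vector index with whichever similarity function the index is configured for. The class name, song titles, and toy 3-dimensional vectors are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Illustrative stand-in for a vector search: brute-force cosine ranking in memory.
class CosineRanking {

    // Cosine similarity between two vectors of equal length (0.0 if either is all zeros).
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the titles of the k songs whose vectors are most similar to the query.
    static List<String> topK(double[] query, Map<String, double[]> songs, int k) {
        List<String> titles = new ArrayList<>(songs.keySet());
        // Negate the score so the most similar song sorts first.
        titles.sort(Comparator.comparingDouble((String t) -> -cosine(query, songs.get(t))));
        return titles.subList(0, Math.min(k, titles.size()));
    }

    public static void main(String[] args) {
        // Hypothetical toy data; real GloVe-based embeddings have 300 dimensions.
        Map<String, double[]> songs = Map.of(
            "Hypothetical Sad Song", new double[] {0.9, 0.1, 0.0},
            "Hypothetical Party Song", new double[] {0.0, 0.2, 0.9},
            "Hypothetical Wistful Song", new double[] {0.7, 0.3, 0.1});
        double[] prompt = {1.0, 0.2, 0.0};
        System.out.println(topK(prompt, songs, 2));
        // prints [Hypothetical Sad Song, Hypothetical Wistful Song]
    }
}
```

A full scan like this is only practical for small collections, which is the reason to delegate the search to an indexed vector store.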
- Lyrics Embedding: We use Deeplearning4j with GloVe embeddings to convert song lyrics into vectors.
- Vector Storage: These embeddings are stored in MongoDB Atlas in a `songs` collection.
- Playlist Generation: When a playlist name is provided, it is also embedded into a vector, and MongoDB’s Vector Search feature is used to find songs with similar embeddings.
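
The embedding step can be sketched without any library. The example below is a hedged, plain-Java approximation rather than the project's Deeplearning4j code: it assumes a piece of text is embedded by averaging the GloVe vectors of its known words, which is a common approach but is not confirmed by this README. The class and method names are hypothetical, and the two-dimensional vectors are toy data in GloVe's text format (a token followed by its components, separated by spaces).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: averages word vectors to get one vector per text.
class LyricsEmbedder {

    // Parses GloVe's text format: one token per line, then its vector components.
    static Map<String, double[]> loadVectors(Reader source) {
        Map<String, double[]> vectors = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.trim().split(" ");
                if (parts.length < 2) {
                    continue; // skip blank or malformed lines
                }
                double[] vector = new double[parts.length - 1];
                for (int i = 1; i < parts.length; i++) {
                    vector[i - 1] = Double.parseDouble(parts[i]);
                }
                vectors.put(parts[0], vector);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return vectors;
    }

    // Averages the vectors of all known tokens; unknown tokens are skipped.
    // Returns an all-zero vector when no token is in the vocabulary.
    static double[] embed(String text, Map<String, double[]> vectors, int dim) {
        double[] sum = new double[dim];
        int known = 0;
        for (String token : text.split("[^\\p{L}\\p{N}']+")) {
            // Try the token as written, then lowercased (assumed fallback).
            double[] vector = vectors.get(token);
            if (vector == null) {
                vector = vectors.get(token.toLowerCase(Locale.ROOT));
            }
            if (vector == null || vector.length != dim) {
                continue;
            }
            for (int i = 0; i < dim; i++) {
                sum[i] += vector[i];
            }
            known++;
        }
        if (known > 0) {
            for (int i = 0; i < dim; i++) {
                sum[i] /= known;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, double[]> vectors =
            loadVectors(new StringReader("sad 1.0 0.0\nevening 0.0 1.0\n"));
        double[] embedding = embed("Sad evening, unknownword", vectors, 2);
        System.out.println(Arrays.toString(embedding)); // prints [0.5, 0.5]
    }
}
```

The same `embed` call would be applied to both the lyrics and the playlist name, so that the two land in the same vector space. For the real `glove.840B.300d.txt` file, `dim` would be 300, and the parser may need hardening for unusual tokens.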
Before you start, ensure you have the following:
- Java 11+ (Java 21 recommended)
- Maven 3.9.6+
- MongoDB Atlas with a deployed cluster (M0+)
- GloVe embeddings (`glove.840B.300d.txt`) from GloVe: Global Vectors for Word Representation
- Kaggle Genius Lyrics dataset (`song_lyrics.csv`) from Kaggle
- Clone this repository:

      git clone https://github.com/yourusername/ai-powered-playlist-generator.git
      cd ai-powered-playlist-generator
- Add the necessary resources:
  - Download the GloVe embeddings (`glove.840B.300d.txt`) and place the file in the `src/main/resources` folder.
  - Download the Genius Song Lyrics dataset and place `song_lyrics.csv` in the `src/main/resources` folder.
- Set up MongoDB Atlas:
  - Create a MongoDB cluster.
  - Configure your `.env` or `application.properties` with the MongoDB URI and database/collection information:

        mongodb.uri=mongodb+srv://<username>:<password>@<cluster-url>/?retryWrites=true&w=majority
        mongodb.database=music
        mongodb.collection=songs
- Run the project:

      mvn spring-boot:run
- Load the song data into MongoDB by sending a request to the `/loadSampleData` endpoint:

      curl -X GET "http://localhost:8080/loadSampleData?fileName=song_lyrics.csv"
- Generate a playlist by sending a request to `/newPlaylist` with your desired playlist name:

      curl -X GET "http://localhost:8080/newPlaylist?playlistName=sad%20girl%20wistful%20Friday%20evening"
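
If you would rather call the endpoint from Java than from curl, the request can be sketched with the JDK's built-in `java.net.http.HttpClient` (Java 11+). This is a minimal example under stated assumptions: the `PlaylistClient` class and `newPlaylistUri` helper are hypothetical names, the base URL is the local default used above, and the response body is printed as-is because its format is not described here.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client for the /newPlaylist endpoint shown above.
class PlaylistClient {

    // Builds the request URI, percent-encoding the playlist name.
    static URI newPlaylistUri(String baseUrl, String playlistName) {
        // URLEncoder uses form encoding ("+" for spaces); swap in "%20"
        // so the URI matches the curl example. Literal "+" signs are
        // already encoded as "%2B" at this point, so the swap is safe.
        String encoded = URLEncoder.encode(playlistName, StandardCharsets.UTF_8)
            .replace("+", "%20");
        return URI.create(baseUrl + "/newPlaylist?playlistName=" + encoded);
    }

    public static void main(String[] args) throws Exception {
        // Requires the application to be running locally on port 8080.
        URI uri = newPlaylistUri("http://localhost:8080", "sad girl wistful Friday evening");
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Encoding the name in code avoids hand-writing `%20` sequences for prompts that contain spaces or punctuation.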
- `src/main/java`: Contains all Java classes for embedding, MongoDB interaction, and playlist generation.
- `src/main/resources`: Stores external resources like GloVe embeddings and song lyrics.
- `pom.xml`: Contains the necessary dependencies for Deeplearning4j, MongoDB, and other libraries.
- Java: Backend logic
- Spring Boot: REST API for playlist generation
- Deeplearning4j: Lyrics embedding using GloVe
- MongoDB Atlas: Vector Search for finding similar songs
- Maven: Dependency management
- The current implementation uses a pre-trained GloVe model, so results might not be as precise as a custom model trained on song lyrics.
- We are not using audio data or user listening history, which would improve the playlist recommendations further.
- Custom Model Training: Fine-tune a model on song lyrics for better embedding accuracy.
- Audio Data Integration: Use audio features to enhance playlist generation.
- User Personalization: Incorporate user preferences and listening habits.
This project is licensed under the MIT License. See the LICENSE file for details.