Open Source AI Finder

Discover the latest open-source models for your projects.

FoleyControl

text-to-audio

A controllable text-to-Foley sound generation system from Stability AI. It generates everyday sound effects from text, with control over timing and acoustic properties.

film post-production, game sound design, video editing, creating sound effects

📄 Research Project

MiniMax Music Model

text-to-audio

A music generation model from the Chinese company MiniMax. It creates music and vocals from text, supports instrument selection, and can produce songs up to 3 minutes long.

music production, songwriting, content creation, generating jingles

📄 Proprietary

Udio

text-to-audio

A platform and model for generating high-quality music from text prompts. It can create full tracks with vocals and instrumentals in a wide variety of styles.

music production, songwriting, generating background music, vocal creation

📄 Proprietary

GRAG Image Editing

image-to-image

An image editing method that uses Retrieval-Augmented Generation (RAG) and guided attention. It allows for precise edits based on a text prompt and a reference image for context.

photo editing, style transfer, object manipulation in images, reference-based editing

📄 Not specified

Game Tars

ai-agent

An autonomous agent designed to play complex real-time strategy (RTS) games like StarCraft II by perceiving the game state and making strategic decisions.

game-playing agents, RTS game automation, AI in gaming, decision-making models

📄 Research Project

IGGT

text-generation

IGGT stands for Iterative Generation of Grammatical Text. It is a method that improves the grammatical correctness of LLM-generated text by iteratively refining the output.

improving text quality, grammar correction, content writing, enhancing LLM outputs

📄 Not specified

Hierarchical SVG Generation

text-to-image

A model that generates images as Scalable Vector Graphics (SVGs) from text prompts. It uses a hierarchical generation process to create structured and editable vector graphics.

logo design, icon generation, creating scalable graphics, digital art

📄 Research Project

MoCha

text-to-3d

A motion-controllable model for generating 3D animations of human-object interactions from text descriptions and motion control signals.

3D animation, game development, virtual reality, robotics simulation

📄 Research Project

PoMELLI

tool

A preference optimization method from Google for fine-tuning LLMs. It is designed to be a more efficient and effective alternative to methods like DPO for aligning models with human feedback.

fine-tuning LLMs, aligning models with human preferences, improving model helpfulness, AI safety

📄 Research Project

Nitro-E

tool

A ROCm-based framework developed by AMD for efficient Large Language Model (LLM) inference on AMD GPUs. It aims to optimize performance when deploying LLMs on AMD hardware.

deploying LLMs, optimizing LLM inference, running LLMs on AMD GPUs

📄 Open Source (Apache 2.0)
