Last night, my side project CocoIndex hit Trending in the Rust category on GitHub — and my notifications exploded. It turns out a lot of developers are quietly fighting the same problem: keeping AI agents connected to data that changes every minute, not every quarter.
This post is the story behind the project, but more importantly, it's about a simple idea: if we want autonomous agents to be useful in the real world, we need to take their memory and data freshness as seriously as we take their models.
The problem: agents are reasoning over yesterday's world
Most agent demos assume the world stands still while the model thinks. In production, nothing stands still:
- Issues are closed, tickets move stages, alerts fire and resolve.
- Product catalogs change, prices update, docs get refactored.
- Logs, events, and sensor data stream in continuously.
If your "knowledge" is a static dump you refresh once in a while, your agent ends up planning on a reality that no longer exists. You see this as:
- Hallucinations that are actually just stale context.
- Agents repeating actions that already succeeded or failed.
- Suggestions that would have been great… last week.
That was the itch that led to CocoIndex: a way to keep derived data—embeddings, graphs, tables—always in sync with the live sources feeding your agents.
What CocoIndex does (without the buzzwords)
CocoIndex is a Rust-powered engine with a Python-first API that lets you describe how raw data turns into the "memory" your agents query. You point it at sources (files, object storage, APIs, databases), define transformations (chunking, embedding, LLM extraction, graph building), and route the results to your targets (vector stores, SQL, graph stores, custom sinks).
Three principles shaped the design:
- Dataflows, not glue scripts: You declare a flow: "take these documents, split them, embed them, write the results here," instead of wiring up ad-hoc scripts and cron jobs.
- Everything observable: You can inspect inputs and outputs of every transformation step, with lineage to track where each piece of derived data came from.
- Incremental by default: When something changes, CocoIndex figures out the minimal work needed and only recomputes what's necessary, reusing cached heavy steps like embeddings.
Under the hood, Rust gives the engine the performance and safety to run these flows continuously without choking when data or traffic spikes.
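Here's roughly what a flow looks like in code. This is a sketch adapted from the text-embedding example in CocoIndex's docs; treat exact module and parameter names (for instance, `cocoindex.storages` vs. `cocoindex.targets` for export destinations) as version-dependent:

```python
import cocoindex

@cocoindex.flow_def(name="TextEmbedding")
def text_embedding_flow(
    flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
):
    # Source: a directory of markdown files.
    data_scope["documents"] = flow_builder.add_source(
        cocoindex.sources.LocalFile(path="markdown_files")
    )

    doc_embeddings = data_scope.add_collector()
    with data_scope["documents"].row() as doc:
        # Transform: split each document into chunks...
        doc["chunks"] = doc["content"].transform(
            cocoindex.functions.SplitRecursively(),
            language="markdown", chunk_size=2000, chunk_overlap=500,
        )
        with doc["chunks"].row() as chunk:
            # ...then embed each chunk.
            chunk["embedding"] = chunk["text"].transform(
                cocoindex.functions.SentenceTransformerEmbed(
                    model="sentence-transformers/all-MiniLM-L6-v2"
                )
            )
            doc_embeddings.collect(
                filename=doc["filename"], location=chunk["location"],
                text=chunk["text"], embedding=chunk["embedding"],
            )

    # Target: export chunk embeddings to Postgres (pgvector).
    doc_embeddings.export(
        "doc_embeddings",
        cocoindex.storages.Postgres(),
        primary_key_fields=["filename", "location"],
    )
```

The point of declaring it this way is that the engine, not you, decides how much of the flow to re-run when a source changes.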
How memory actually works in agents
If you've been following agent architectures, you've probably seen memory split into three layers: short-term, long-term, and working memory.
- Short-term memory: The immediate context — the last few messages, the current task, the current tool outputs. This lives in the model's context window and is cheap but extremely limited.
- Long-term memory: Everything you want the agent to "remember" across sessions: docs, conversations, events, user state, world knowledge. This typically lives in vector stores, databases, or knowledge graphs.
- Working memory: The scratchpad where the agent mixes both short-term and long-term information to reason, plan, and decide what to do next.
Most of the hype goes into working memory tricks and clever planning loops. But for real systems, long-term memory is where everything breaks: if that layer is incomplete, stale, or inconsistent, the nicest planner in the world can't save you.
CocoIndex lives in that long-term layer. Its job is to keep that memory store accurate and up to date, no matter how messy or fast-changing your raw data is.
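To make the three layers concrete, here's a minimal, hypothetical sketch (plain Python, no particular framework; `retrieve` stands in for whatever vector or graph store backs long-term memory):

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    # Short-term: recent turns, bounded like a context window.
    short_term: list[str] = field(default_factory=list)

    def remember(self, turn: str, limit: int = 20) -> None:
        self.short_term = (self.short_term + [turn])[-limit:]

    def working_context(self, task: str, retrieve) -> str:
        # Long-term: pull relevant facts from the external store
        # (vector DB, SQL, graph) behind the `retrieve` callable.
        facts = retrieve(task, top_k=5)
        # Working memory: the prompt-sized blend the planner reasons over.
        return "\n".join([
            "## Task", task,
            "## Recent turns", *self.short_term,
            "## Retrieved facts", *facts,
        ])
```

Notice that `working_context` is only as good as what `retrieve` returns; nothing in the planning loop can compensate for a stale store behind it.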
Why incremental indexing is a superpower, not an implementation detail
Rebuilding everything from scratch sounds simple… until you try it at scale. In a world of embeddings and LLM calls, "just re-run the pipeline" means:
- Re-paying for embeddings that haven't changed.
- Re-processing entire document sets because a single section changed.
- Re-publishing whole indexes when only a few rows were touched.
Incremental indexing flips that:
- When a source object changes, CocoIndex identifies only the affected pieces.
- It recomputes just the transformations that depend on those pieces and reuses cached outputs for everything else.
- It updates the targets (vector store, tables, graph) with minimal edits and cleans up stale entries via lineage tracking.
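Stripped down to the core idea, the mechanism looks like this. This is not CocoIndex's actual engine, just a hypothetical sketch: key every derived artifact by a content hash of its input, recompute only when the hash changes, and delete derived rows whose source disappeared:

```python
import hashlib


def content_key(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def incremental_update(
    sources: dict[str, str],        # doc_id -> current raw text
    cache: dict[str, list[float]],  # content hash -> cached embedding
    index: dict[str, list[float]],  # doc_id -> embedding (the live target)
    embed,                          # the expensive call (model / API)
) -> None:
    # Lineage cleanup: drop derived entries whose source was deleted,
    # so no ghost entries linger in the target.
    for doc_id in list(index):
        if doc_id not in sources:
            del index[doc_id]
    for doc_id, text in sources.items():
        key = content_key(text)
        if key not in cache:
            # Changed or new content: recompute just this piece.
            cache[key] = embed(text)
        # Unchanged content hits the cache and costs nothing.
        index[doc_id] = cache[key]
```

A real engine also has to track dependencies across multi-step transformations (a changed chunk invalidates its own embedding, not its siblings'), but the cost model is the same: work proportional to the change, not to the corpus.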
For agents, this has two huge implications:
- Freshness at lower cost: You can update continuously instead of batching once a day, without blowing up your compute or API bill.
- Trustworthy memory: When an item is updated or deleted at the source, the derived memory reflects that change quickly and correctly, instead of leaving ghost entries around.
The more you let agents act autonomously, the more they rely on this guarantee. Otherwise you're handing them a map that never quite matches the terrain.
Why this matters for autonomous "driving" agents
Think of "autonomous driving agents" not just as physical cars, but as any agent "driving" a complex system: infrastructure, customer support, growth experiments, financial operations, internal tools. These agents:
- Observe a constantly changing environment.
- Retrieve relevant history and facts.
- Plan and execute actions over long horizons.
All three steps depend on a reliable memory substrate that evolves with the system. If your index lags by hours, your agent is effectively driving while looking in the rear-view mirror.
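In loop form the dependency is explicit at every iteration (a hypothetical sketch; `observe`, `retrieve`, `plan`, and `act` stand in for your own integrations):

```python
def drive(observe, retrieve, plan, act, max_steps: int = 100) -> None:
    for _ in range(max_steps):
        event = observe()                   # 1. constantly changing environment
        context = retrieve(event, top_k=5)  # 2. history and facts from the index
        action = plan(event, context)       # 3. decide the next move
        if action is None:                  # planner signals completion
            break
        act(action)
        # If `retrieve` reads an index that lags by hours, steps 2 and 3
        # reason over yesterday's world on every pass through this loop.
```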
CocoIndex aims to be that substrate: a Rust engine that keeps your long-term memory aligned with reality through incremental updates, transparent lineage, and AI-native transformations. Rust Trending is a fun milestone — but the real goal is to make "fresh, trustworthy memory" a default property of every agent stack.
If this resonates
If you've ever shipped an agent that broke because "the data wasn't updated yet," this is probably your life too. In that case:
- Check out the project on GitHub and browse the examples for semantic search, PDFs, code, and graphs.
- Steal ideas for your own pipelines, or open issues if your use case pushes the limits.
- If you like this direction, a star or share helps more developers discover tools that treat memory and freshness as first-class concerns.
The models are getting better; now it's on us to make sure the world they see is actually up to date.