DEV Community

Trent Brew

What I Learned Building a Knowledge Graph for AI Agents

AI assistants scrape through TODO files, commit messages, and scattered notes, trying to piece together what blocks a feature. They guess. They miss critical dependencies and recent decisions.

The fix: let agents query project knowledge like a database instead of parsing human prose.

Before: Context scattered across files

```bash
# “What’s blocking the auth feature?”
# AI scrapes TODO.md, commit messages, Slack. Guesses. You verify manually.
```

After: Query the graph

```bash
code-hq query 'FIND ?b WHERE urn:task:auth-123 dependsOn* ?b AND ?b.taskStatus != "Done"'
# → urn:task:db-456 (Setup database, assigned to Bob)
```

The core problems

  • Missing relationships: TODO lists describe tasks in isolation. Real work is about dependencies, ownership, and ripple effects.
  • Context brittleness: Rephrase a comment or move a task, and the AI's understanding breaks. No stable way to reference project state.
  • Translation overhead: Humans use Markdown. Agents need structured data.

Solution: maintain two layers - one for humans (Markdown, UIs) and one for machines (structured graph). Keep them synchronized.

The approach

1) Everything as a graph

Projects aren't lists - they're webs of relationships. Represent everything as JSON-LD entities with clear connections:

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@id": "urn:task:auth-123",
      "@type": "Task",
      "name": "Fix authentication bug",
      "taskStatus": "InProgress",
      "priority": "high",
      "assignee": { "@id": "urn:person:alice" },
      "dependsOn": [{ "@id": "urn:task:db-456" }],
      "dateModified": "2025-10-24T04:12:00Z"
    },
    {
      "@id": "urn:task:db-456",
      "@type": "Task",
      "name": "Setup database",
      "taskStatus": "Blocked",
      "assignee": { "@id": "urn:person:bob" }
    }
  ]
}
```
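To make the "clear connections" concrete, here is a minimal sketch of how a consumer might resolve `@id` references in a graph like the one above. The field names come from the example; the helpers `indexById` and `directDependencies` are hypothetical names for illustration, not part of code-hq.

```typescript
// Minimal shapes covering only the fields used in the JSON-LD example.
interface Ref {
  "@id": string;
}

interface Entity {
  "@id": string;
  "@type": string;
  name: string;
  taskStatus?: string;
  assignee?: Ref;
  dependsOn?: Ref[];
}

const graph: Entity[] = [
  {
    "@id": "urn:task:auth-123",
    "@type": "Task",
    name: "Fix authentication bug",
    taskStatus: "InProgress",
    assignee: { "@id": "urn:person:alice" },
    dependsOn: [{ "@id": "urn:task:db-456" }],
  },
  {
    "@id": "urn:task:db-456",
    "@type": "Task",
    name: "Setup database",
    taskStatus: "Blocked",
    assignee: { "@id": "urn:person:bob" },
  },
];

// Build an @id -> entity lookup so references resolve without scanning.
function indexById(entities: Entity[]): { [id: string]: Entity } {
  const byId: { [id: string]: Entity } = {};
  for (const entity of entities) {
    byId[entity["@id"]] = entity;
  }
  return byId;
}

// Follow the direct dependsOn references of one entity.
function directDependencies(byId: { [id: string]: Entity }, id: string): Entity[] {
  const entity = byId[id];
  if (!entity || !entity.dependsOn) return [];
  const resolved: Entity[] = [];
  for (const ref of entity.dependsOn) {
    const target = byId[ref["@id"]];
    if (target) resolved.push(target); // skip dangling references
  }
  return resolved;
}

const taskIndex = indexById(graph);
const authDeps = directDependencies(taskIndex, "urn:task:auth-123");
console.log(authDeps.map((d) => d.name)); // logs the single dependency, "Setup database"
```

Because every entity has a stable `@id`, renaming a task or rewording its description leaves the references intact.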

2) A query language for exploring relationships

A simple query language that traverses relationships and filters results:

```
FIND ?t WHERE
  ?t a Task ;
     dependsOn* ?b .
  ?b taskStatus != "Done" .
  FILTER (?t = urn:task:auth-123)
```
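For intuition, the traversal in that query can be sketched as a plain graph walk. This is not how TQL is implemented; it is a hand-rolled illustration that reads `dependsOn*` as "follow `dependsOn` edges transitively". The post does not say whether `*` also matches the starting task itself, so this sketch assumes it does not. The `urn:task:infra-789` task and the `openBlockers` helper are made up for the example.

```typescript
interface TaskNode {
  taskStatus: string;
  dependsOn: string[];
}

// Hypothetical in-memory graph keyed by task id. The first two tasks mirror the
// JSON-LD example; urn:task:infra-789 is invented to show a multi-hop chain.
const taskGraph: { [id: string]: TaskNode } = {
  "urn:task:auth-123": { taskStatus: "InProgress", dependsOn: ["urn:task:db-456"] },
  "urn:task:db-456": { taskStatus: "Blocked", dependsOn: ["urn:task:infra-789"] },
  "urn:task:infra-789": { taskStatus: "Done", dependsOn: [] },
};

// Returns every task reachable from `start` via dependsOn edges whose status
// is not "Done". The visited map guards against dependency cycles.
function openBlockers(all: { [id: string]: TaskNode }, start: string): string[] {
  const visited: { [id: string]: boolean } = {};
  visited[start] = true; // assumption: the start task is not its own blocker
  const blockers: string[] = [];
  const stack: string[] = all[start] ? all[start].dependsOn.slice() : [];
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (visited[id]) continue;
    visited[id] = true;
    const node = all[id];
    if (!node) continue; // dangling reference
    if (node.taskStatus !== "Done") blockers.push(id);
    // Keep walking even through finished tasks; their dependencies may still be open.
    for (const next of node.dependsOn) stack.push(next);
  }
  return blockers;
}

const authBlockers = openBlockers(taskGraph, "urn:task:auth-123");
console.log(authBlockers); // only urn:task:db-456 is still open
```

The point of a query language is that agents do not need to write this loop themselves; they state the relationship and the filter, and the engine does the walk.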

3) CLI for humans

A CLI that handles the tedious parts:

```bash
code-hq init
code-hq create task "Fix auth bug" --priority high --assignee trent
code-hq tasks --status blocked
code-hq people --role developer
code-hq show --view kanban
```

4) Generated views

Generate human-readable views from the structured data:

```markdown
# Tasks (Generated)

## In Progress
- Fix authentication bug (@trent, high)

## Blocked
- Setup database (@trent)
```

Humans use Markdown. Agents get structured data. Both stay synchronized through the CLI.
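A view generator along these lines can be quite small. The sketch below groups tasks by status and emits Markdown in the same shape as the generated view above. The `renderTaskView` function, the status ordering, and the heading labels are assumptions for illustration; the post does not show how code-hq renders views internally.

```typescript
interface ViewTask {
  name: string;
  taskStatus: string;
  assignee: string;
  priority?: string;
}

// Assumed mapping from stored status values to human-friendly headings.
const STATUS_HEADINGS: { [status: string]: string } = {
  InProgress: "In Progress",
  Blocked: "Blocked",
  Done: "Done",
};
// Assumed display order for the sections.
const STATUS_ORDER = ["InProgress", "Blocked", "Done"];

function renderTaskView(tasks: ViewTask[]): string {
  const lines: string[] = ["# Tasks (Generated)"];
  for (const status of STATUS_ORDER) {
    const group = tasks.filter((t) => t.taskStatus === status);
    if (group.length === 0) continue; // omit empty sections
    lines.push("", "## " + STATUS_HEADINGS[status]);
    for (const t of group) {
      const details = t.priority ? "@" + t.assignee + ", " + t.priority : "@" + t.assignee;
      lines.push("- " + t.name + " (" + details + ")");
    }
  }
  return lines.join("\n");
}

const viewTasks: ViewTask[] = [
  { name: "Fix authentication bug", taskStatus: "InProgress", assignee: "trent", priority: "high" },
  { name: "Setup database", taskStatus: "Blocked", assignee: "trent" },
];

const renderedView = renderTaskView(viewTasks);
console.log(renderedView);
```

Since the Markdown is derived from the graph, it can be regenerated at any time and never needs to be parsed back.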

Migration

```bash
# Parse TODO.md → tasks with inferred priority/assignee/dependencies
code-hq import todo ./TODO.md

# Import GitHub issues (replace org/repo with your repository)
code-hq import github --repo org/repo

# Keep files in sync (graph → Markdown views)
code-hq render --views kanban,timeline
```
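The post does not describe how the TODO importer infers fields, so the following is only a rough sketch of the idea, under invented conventions: checkbox list items, `@name` for the assignee, and `!high` / `!medium` / `!low` for priority. The `Todo` status value, the `urn:task:todo-N` ids, and the function names are likewise assumptions. Dependency inference is left out.

```typescript
interface ImportedTask {
  "@id": string;
  "@type": "Task";
  name: string;
  taskStatus: "Todo" | "Done"; // "Todo" is an assumed status value
  priority?: string;
  assignee?: { "@id": string };
}

// Parse one Markdown checkbox line such as "- [ ] Fix auth bug @alice !high".
// Returns null for lines that are not checkbox items.
function parseTodoLine(line: string, seq: number): ImportedTask | null {
  const match = /^\s*[-*]\s+\[( |x|X)\]\s+(.*)$/.exec(line);
  if (!match) return null;
  let text = match[2];
  const task: ImportedTask = {
    "@id": "urn:task:todo-" + seq, // hypothetical id scheme
    "@type": "Task",
    name: "",
    taskStatus: match[1] === " " ? "Todo" : "Done",
  };
  const assignee = /(?:^|\s)@([A-Za-z0-9_-]+)/.exec(text);
  if (assignee) {
    task.assignee = { "@id": "urn:person:" + assignee[1].toLowerCase() };
    text = text.replace(assignee[0], " ");
  }
  const priority = /(?:^|\s)!(high|medium|low)\b/i.exec(text);
  if (priority) {
    task.priority = priority[1].toLowerCase();
    text = text.replace(priority[0], " ");
  }
  // Whatever remains after removing the markers is the task name.
  task.name = text.replace(/\s+/g, " ").trim();
  return task;
}

function parseTodo(markdown: string): ImportedTask[] {
  const imported: ImportedTask[] = [];
  for (const line of markdown.split("\n")) {
    const task = parseTodoLine(line, imported.length + 1);
    if (task) imported.push(task);
  }
  return imported;
}

const sampleTodo = "# TODO\n- [ ] Fix auth bug @Alice !high\n- [x] Write README\nnot a task";
const importedTasks = parseTodo(sampleTodo);
console.log(JSON.stringify(importedTasks, null, 2));
```

A real importer would need to handle many more TODO styles; the value is in doing the inference once, at import time, rather than on every agent request.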

Trade-offs

  • Authoring cost → CLI verbs (create, link, set) and smart importers make structured data feel natural
  • Schema drift → JSON-LD contexts with validation on write
  • Review friction → Small, machine-stable diffs in .code-hq/
  • Team adoption → Humans keep Markdown, agents get structure
  • VSCode only → Visualization layer depends on VSCode extension for now (web client coming soon): code-hq-vscode
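To illustrate the "validation on write" idea from the schema drift item, here is a minimal sketch that checks a task before it enters the graph. It is plain field checking, not real JSON-LD context processing, and it is not code-hq's validator. The allowed status values are the ones that appear in this post; the priority list (including `low`) and the function names are assumptions.

```typescript
// Status values seen in this post; a real schema may allow more.
const ALLOWED_STATUS = ["InProgress", "Blocked", "Done"];
// "high" and "medium" appear in the post; "low" is an assumption.
const ALLOWED_PRIORITY = ["high", "medium", "low"];

// Returns a list of human-readable problems; an empty list means the task is valid.
function validateTask(task: { [key: string]: unknown }): string[] {
  const errors: string[] = [];
  const id = task["@id"];
  if (typeof id !== "string" || id.indexOf("urn:task:") !== 0) {
    errors.push('"@id" must be a string starting with "urn:task:"');
  }
  if (task["@type"] !== "Task") {
    errors.push('"@type" must be "Task"');
  }
  const taskName = task["name"];
  if (typeof taskName !== "string" || taskName.trim() === "") {
    errors.push('"name" must be a non-empty string');
  }
  const taskStatus = task["taskStatus"];
  if (typeof taskStatus !== "string" || ALLOWED_STATUS.indexOf(taskStatus) === -1) {
    errors.push('"taskStatus" must be one of ' + ALLOWED_STATUS.join(", "));
  }
  const priority = task["priority"];
  if (priority !== undefined && (typeof priority !== "string" || ALLOWED_PRIORITY.indexOf(priority) === -1)) {
    errors.push('"priority" must be one of ' + ALLOWED_PRIORITY.join(", "));
  }
  return errors;
}

// Refuse to store anything that fails validation, so drift is caught at write time.
function writeTask(store: Array<{ [key: string]: unknown }>, task: { [key: string]: unknown }): void {
  const errors = validateTask(task);
  if (errors.length > 0) {
    throw new Error("Rejected write: " + errors.join("; "));
  }
  store.push(task);
}

const taskStore: Array<{ [key: string]: unknown }> = [];
writeTask(taskStore, {
  "@id": "urn:task:db-456",
  "@type": "Task",
  name: "Setup database",
  taskStatus: "Blocked",
});

// A free-form status like "in progress" is exactly the drift this catches.
const driftErrors = validateTask({
  "@id": "urn:task:auth-123",
  "@type": "Task",
  name: "Fix authentication bug",
  taskStatus: "in progress",
});
console.log(driftErrors);
```

Rejecting bad writes up front keeps queries simple: a filter such as `taskStatus != "Done"` only works if nobody has stored `done` or `Finished`.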

Getting started

```bash
# Start with a simple graph
code-hq init

# Add a few tasks with relationships
code-hq create task "Test semantic workflows" --priority medium

# Query to understand the current state
code-hq query 'FIND ?t WHERE ?t a Task ; priority "high"'

# See it from a human perspective
code-hq show --view kanban
```

Architecture

```
┌─────────────────────────┐
│  CLI (Human Interface)  │  Simple commands that update understanding
└────────────┬────────────┘
             │
┌────────────▼────────────┐
│  Semantic Graph Layer   │  JSON-LD with clear relationships
└────────────┬────────────┘
             │
┌────────────▼────────────┐
│      Query Engine       │  TQL for exploring connections
└────────────┬────────────┘
             │
┌────────────▼────────────┐
│     Agent Workflows     │  Standup summaries, PR reviews, planning
└─────────────────────────┘
```
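As an example of what sits in the bottom layer, a standup summary can be derived from the graph's `dateModified` field alone. The sketch below is a hypothetical workflow, not one shipped with code-hq: the `standupSummary` function, the 24-hour window, and the second task's timestamp are all assumptions for illustration.

```typescript
interface DatedTask {
  name: string;
  taskStatus: string;
  dateModified: string; // ISO 8601 timestamp, as in the JSON-LD example
}

// List tasks modified within the last `windowHours` hours before `now`.
function standupSummary(tasks: DatedTask[], now: Date, windowHours: number): string {
  const cutoff = now.getTime() - windowHours * 60 * 60 * 1000;
  const recent = tasks.filter((t) => Date.parse(t.dateModified) >= cutoff);
  if (recent.length === 0) {
    return "No task changes in the last " + windowHours + " hours.";
  }
  const lines = recent.map((t) => "- " + t.name + " [" + t.taskStatus + "]");
  return "Changed in the last " + windowHours + " hours:\n" + lines.join("\n");
}

const standupTasks: DatedTask[] = [
  // Timestamp taken from the JSON-LD example in this post.
  { name: "Fix authentication bug", taskStatus: "InProgress", dateModified: "2025-10-24T04:12:00Z" },
  // Invented timestamp, older than the window, to show filtering.
  { name: "Setup database", taskStatus: "Blocked", dateModified: "2025-10-20T09:00:00Z" },
];

// `now` is passed in explicitly so the result is reproducible.
const standupText = standupSummary(standupTasks, new Date("2025-10-24T12:00:00Z"), 24);
console.log(standupText);
```

An agent producing this summary never has to read commit messages or chat logs; it only reads fields the graph already guarantees.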

IDE Integration

Cursor

Add to .cursorrules:

```
This project uses code-hq for task management.
Reference .code-hq/prompts/task-management.md for commands.
Always check existing tasks before creating duplicates.
Update task status when starting/finishing work.
```

Windsurf

Create .windsurf/workflows/codehq.md:

```markdown
---
description: code-hq commands reference
---

Read `.code-hq/prompts/_index.md` for overview.
See `.code-hq/prompts/task-management.md` for details.
```

Claude Code

Automatically sees .code-hq/prompts/ as context.
Ask: "How do I manage tasks in this project?"

What's next

code-hq solves developer workflows. The next step is specialized AI agents for marketing, finance, and ops, all sharing the same knowledge graph. One source of truth for the entire company.


CodeHQ: The CLI

Source Code: https://github.com/trentbrew/code-hq
NPM Package: https://www.npmjs.com/package/code-hq

CodeHQ: The VSCode Extension

https://open-vsx.org/extension/codehq/codehq-vscode

TQL: The Query Language

Source Code: https://github.com/trentbrew/TQL

JSON-LD: JSON For Linked Data

Docs: https://json-ld.org/
