nhickster

Description

Add read_graph_summary function to the memory MCP server for efficient entity overview without loading the entire knowledge graph. This new function provides entity names, types, observation counts, and most recent observations while preserving full relation data, significantly reducing context usage for overview scenarios.

This allows LLMs to get an overview of the graph and make better decisions about which entities to load fully into context with open_nodes.

Server Details

  • Server: memory
  • Changes to: tools (added new read_graph_summary tool)

Motivation and Context

The existing read_graph function returns complete entity data including all observations, which can become very large and consume significant context space when users only need an overview of their knowledge graph. This creates inefficiency when:

  • LLMs want to quickly scan what entities exist and their recent activity
  • LLM context is being consumed by large observation arrays
  • LLMs need to identify entities of interest before diving into full details

The read_graph_summary function solves this by providing:

  • Entity names and types for quick identification
  • Observation counts to understand data richness
  • Most recent observation for context about latest activity
  • All relations preserved for understanding connections

LLMs can then use the existing open_nodes function to get full details for specific entities of interest.
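
A minimal sketch of the summary shape described above, for illustration rather than as the PR's actual code. The names EntitySummary, KnowledgeGraphSummary, observationCount and lastObservation come from this description. The Relation field names, the summarizeGraph helper, and the use of null for an entity without observations are assumptions. It also assumes observations are appended in order, so the last array element is the most recent one.

```typescript
// Full graph shapes. Relation field names are an assumption for this sketch.
interface Entity {
  name: string;
  entityType: string;
  observations: string[];
}

interface Relation {
  from: string;
  to: string;
  relationType: string;
}

interface KnowledgeGraph {
  entities: Entity[];
  relations: Relation[];
}

// Lightweight view of an entity: a count plus the latest observation only.
interface EntitySummary {
  name: string;
  entityType: string;
  observationCount: number;
  lastObservation: string | null; // assumption: null when there are no observations
}

interface KnowledgeGraphSummary {
  entities: EntitySummary[];
  relations: Relation[]; // relations are passed through unchanged
}

// Hypothetical helper: assumes the last array element is the most recent observation.
function summarizeGraph(graph: KnowledgeGraph): KnowledgeGraphSummary {
  const entities = graph.entities.map((entity): EntitySummary => {
    const count = entity.observations.length;
    const last = count > 0 ? entity.observations[count - 1] : undefined;
    return {
      name: entity.name,
      entityType: entity.entityType,
      observationCount: count,
      lastObservation: last ?? null,
    };
  });
  return { entities, relations: graph.relations };
}
```

In the server this would presumably run on the result of the existing loadGraph() call, with the summary returned as the tool result.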

How Has This Been Tested?

Comprehensive testing completed using VSCode with MCP integration:

  1. Functional Testing:

    • Created test entities with varying observation counts (3-24 observations)
    • Verified correct entity summary structure (name, entityType, observationCount, lastObservation)
    • Confirmed most recent observation is correctly identified and returned
    • Tested with empty graph (returns empty arrays)
  2. Performance Verification:

    • Tested with entity containing 24 observations
    • read_graph: Returns full 24-observation array
    • read_graph_summary: Returns count + last observation only
  3. Integration Testing:

    • Verified tool registration in MCP protocol
    • Confirmed proper JSON schema validation
    • Tested workflow: read_graph_summary → identify entities → open_nodes for details
  4. Edge Case Testing:

    • Empty entities arrays
    • Entities with single observations
    • Relations-only graphs
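
For readers who want to picture the tool registration checked under Integration Testing, here is a rough sketch of how the new tool could sit next to the existing ones in a tool list. It is plain data with no SDK calls. The ToolDescriptor interface, the description strings, and the findTool and hasRequiredArgs helpers are hypothetical, and the required-key check is far simpler than real JSON Schema validation.

```typescript
// Hypothetical descriptor shape: a name, a description and a JSON-Schema-like input schema.
interface ToolDescriptor {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

// Description strings are illustrative, not copied from the server.
const tools: ToolDescriptor[] = [
  {
    name: "read_graph",
    description: "Read the entire knowledge graph",
    inputSchema: { type: "object", properties: {} },
  },
  {
    name: "read_graph_summary",
    description: "Read entity names, types, observation counts and latest observations, plus all relations",
    inputSchema: { type: "object", properties: {} }, // takes no arguments
  },
  {
    name: "open_nodes",
    description: "Open specific nodes by name",
    inputSchema: {
      type: "object",
      properties: { names: { type: "array", items: { type: "string" } } },
      required: ["names"],
    },
  },
];

// Look up a tool by name, as a call handler would before dispatching.
function findTool(name: string): ToolDescriptor | undefined {
  return tools.find((tool) => tool.name === name);
}

// Minimal argument check: every required key must be present in the arguments.
function hasRequiredArgs(tool: ToolDescriptor, args: Record<string, unknown>): boolean {
  return (tool.inputSchema.required ?? []).every((key) => key in args);
}
```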

Breaking Changes

No breaking changes - This is a purely additive feature that:

  • Adds a new optional tool (read_graph_summary)
  • Does not modify existing tool behavior
  • Maintains backward compatibility
  • Requires no configuration changes

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Protocol Documentation
  • My changes follow MCP security best practices
  • I have updated the server's README accordingly
  • I have tested this with an LLM client
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have documented all environment variables and configuration options

Additional context

Implementation Details:

  • Uses existing loadGraph() infrastructure for consistency
  • Creates lightweight EntitySummary interface separate from full Entity
  • Preserves all relations unchanged for compatibility
  • Returns KnowledgeGraphSummary type for clear API contracts
  • Follows existing async/await patterns and error handling

Performance Benefits:

  • Significant reduction in context usage for overview scenarios
  • Enables efficient "browse then deep-dive" workflows
  • Maintains full functionality through complementary tool usage
  • Zero performance impact on existing tools
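
To make the context saving concrete, the sketch below compares the serialized size of one entity in full form and in summary form, mirroring the 24-observation test case. The observation text is made up, and JSON character count is only a rough proxy for token usage, so the numbers this produces should be read as indicative only.

```typescript
// Made-up data: one entity with 24 observations, as in the test scenario above.
const observations: string[] = Array.from(
  { length: 24 },
  (_, i) => `Observation ${i + 1}: some note recorded about the project`,
);

// What read_graph would carry for this entity.
const fullEntity = {
  name: "example_project",
  entityType: "project",
  observations,
};

// What read_graph_summary would carry for it: the count and latest observation only.
const summaryEntity = {
  name: fullEntity.name,
  entityType: fullEntity.entityType,
  observationCount: observations.length,
  lastObservation: observations[observations.length - 1] ?? null,
};

// Serialized length in characters, used here as a rough stand-in for context cost.
function jsonChars(value: unknown): number {
  return JSON.stringify(value).length;
}

const fullChars = jsonChars(fullEntity);
const summaryChars = jsonChars(summaryEntity);
const savedFraction = 1 - summaryChars / fullChars;

console.log(`full: ${fullChars} chars, summary: ${summaryChars} chars`);
console.log(`saved: ${(savedFraction * 100).toFixed(1)}%`);
```

The saving grows with the number and length of observations per entity. Relations are returned in full by both tools, so they do not change the comparison.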

Design Decision:

  • Returns the most recent observation rather than the first or a random one, to provide the most current context
  • Includes observation count to help users understand data richness
  • Keeps relations unchanged to maintain graph connectivity information
  • Uses separate interface to clearly distinguish from full graph structure
@nhickster
Author

This PR addresses the core issue raised in #2415 - that read_graph consumes excessive LLM context when dealing with large memory files containing many observations.

While #2415 proposed implementing a more comprehensive approach, this solution provides an efficient graph index (summary) for the LLM to use. The new read_graph_summary function gives LLMs just enough context (entity names, types, observation counts, and most recent observations) to make informed decisions about which specific entities to examine in detail using open_nodes.

This approach:

  • Solves the immediate context consumption problem
  • Maintains full functionality through the existing open_nodes workflow
  • Keeps the implementation simple and maintainable
  • Leverages LLMs' natural strength at analyzing and prioritizing information

The result is an efficient "browse then deep-dive" pattern that most LLMs seem to handle pretty well.
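
The "browse then deep-dive" pattern can be sketched from the client side as follows. In practice the LLM itself decides which entities look relevant; the keyword filter here is only a stand-in for that decision. The pickEntityNames and buildOpenNodesArgs helpers are hypothetical, and the names argument for open_nodes is assumed to be a list of entity names.

```typescript
// Summary entry as returned by read_graph_summary (fields per the PR description).
interface EntitySummary {
  name: string;
  entityType: string;
  observationCount: number;
  lastObservation: string | null;
}

// Step 1 (browse): choose entities whose name, type or latest observation mentions a keyword.
// This stands in for the model's own judgement about relevance.
function pickEntityNames(entities: EntitySummary[], keyword: string): string[] {
  const needle = keyword.toLowerCase();
  return entities
    .filter((entity) =>
      [entity.name, entity.entityType, entity.lastObservation ?? ""].some((field) =>
        field.toLowerCase().includes(needle),
      ),
    )
    .map((entity) => entity.name);
}

// Step 2 (deep-dive): build the arguments for a follow-up open_nodes call.
function buildOpenNodesArgs(names: string[]): { names: string[] } {
  return { names };
}

// Example with made-up summary data.
const summaryEntities: EntitySummary[] = [
  { name: "alice", entityType: "person", observationCount: 12, lastObservation: "Joined the billing team" },
  { name: "billing_service", entityType: "project", observationCount: 24, lastObservation: "Deployed v2" },
  { name: "office_plants", entityType: "misc", observationCount: 3, lastObservation: null },
];

const openNodesArgs = buildOpenNodesArgs(pickEntityNames(summaryEntities, "billing"));
console.log(JSON.stringify(openNodesArgs));
```

Only the selected entities are then loaded in full, so the cost of their complete observation arrays is paid only where it is needed.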

skynet commented Aug 26, 2025

Claude Desktop suffers enormously from token scarcity. Any plan to use this with a vector database?
