In April 2024, Microsoft Research published the GraphRAG paper.[1] In November 2024, Anthropic launched the Model Context Protocol (MCP).[2] In March 2025, Google added native MCP support to the Gemini API.[9] In April 2025, OpenAI shipped Connectors using MCP-shaped interfaces.[7] Four labs that compete on almost everything else agreed on one thing in the space of twelve months: the future of LLM retrieval is typed knowledge graphs, addressed through a standard protocol.
That kind of convergence is rare. When it happens, it is usually a signal that the architectural primitive is correct, not that the labs copied each other.
This post walks through the technical work that produced the convergence, the benchmark deltas that forced the field's hand, and the architectural primitives that show up in every serious implementation. The pattern is now stable enough to design around.
Section 1: The four shifts that defined the field
2020 · RAG (Lewis et al., Facebook AI Research)
The original RAG paper[4] introduced the now-standard pattern: embed a document corpus into a vector store, retrieve top-k nearest neighbors at query time, concatenate them into the prompt, let the LLM generate. This was the substrate for the 2022-2024 wave of "chat with your docs" products · Glean, Notion AI, ChatGPT Enterprise's first-generation document connectors, and approximately 4,000 startup pitch decks.
Vector RAG works well for what we now call "local" questions · queries whose answer is contained in a small number of contiguous document chunks. It fails predictably on global questions · queries that require synthesizing across an entire corpus, or traversing relationships between entities.
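The embed, retrieve, concatenate loop described above can be sketched in a few lines. This is a minimal illustration, not the method from the original paper: the bag-of-words `embed` function is a stand-in for a real embedding model, and the chunk texts, `retrieve`, and `build_prompt` are hypothetical names invented for the example.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, chunks: list, k: int = 2) -> str:
    """Concatenate the retrieved chunks into the prompt sent to the LLM."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical corpus, already split into chunks.
CHUNKS = [
    "acme released a billing api in march",
    "globex acquired initech last year",
    "acme billing api supports usage based pricing",
]

top = retrieve("acme billing api pricing", CHUNKS, k=2)
```

A local question such as the one above lands on the right chunks. A global question ("what themes run through this corpus?") is close to no single chunk, which is the failure mode in miniature.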
2024 · GraphRAG (Edge et al., Microsoft Research)
Microsoft's GraphRAG paper[1] made the failure mode quantitative. On the podcast-transcript and news-article datasets the authors tested, an LLM evaluator preferred the responses from GraphRAG's "global" query mode over those from the baseline vector RAG approach roughly 70-80% of the time on comprehensiveness and diversity. The trick: rather than retrieving raw text chunks, the indexer extracts a typed entity-relation graph, runs community detection on it, and generates per-community summaries that become the retrieval target.
Two architectural primitives from that paper became the field's defaults:
- Indexer / Query split. The expensive entity extraction + graph construction + community summarization happens offline during indexing. Online queries hit pre-computed structures.
- Multi-level community summaries. Hierarchical Leiden clustering partitions the graph into nested communities, and an LLM summarizes each community at every level. A global query gets routed to the coarse summaries; a local query gets routed to the fine ones.
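The indexer / query split can be sketched as follows. This is a sketch under loud assumptions: connected components stand in for Leiden clustering, joined fact strings stand in for LLM-written summaries, only a single community level is built, and the triples and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples. A real indexer would
# obtain these from an extraction pass over the corpus.
TRIPLES = [
    ("Acme", "ACQUIRED", "Initech"),
    ("Initech", "SELLS", "BillingAPI"),
    ("Globex", "PARTNERS_WITH", "Hooli"),
]

def build_index(triples):
    """Offline step: build the graph, partition it, summarize each partition."""
    adjacency = defaultdict(set)
    for subj, _, obj in triples:
        adjacency[subj].add(obj)
        adjacency[obj].add(subj)

    # Stand-in for Leiden: plain connected components via depth-first search.
    seen, communities = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(adjacency[node] - members)
        seen |= members
        communities.append(members)

    index = []
    for members in communities:
        facts = [f"{s} {r} {o}" for s, r, o in triples if s in members]
        # Stand-in for an LLM-generated community summary.
        index.append({"members": sorted(members), "summary": "; ".join(facts)})
    return index

def answer_global(index):
    """Online step: read precomputed summaries only, no graph work at query time."""
    return [community["summary"] for community in index]

INDEX = build_index(TRIPLES)
```

The design point is that `build_index` can be as slow as it needs to be, because `answer_global` never touches the raw corpus or the graph.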
2024 · MCP (Anthropic)
The Model Context Protocol[2] solved a different problem: bespoke integrations. Before MCP, every LLM-to-data-source connection was a custom function-calling schema, custom auth, and custom error handling. After MCP, any context source · filesystem, Postgres, knowledge graph, SaaS API · exposes the same shape: tools[], resources[], prompts[]. The LLM client doesn't care which vendor is on the other end.
MCP is not technically novel. The novelty is the standard. Once Anthropic published the spec and shipped reference servers, the cost of building a graph-grounded agent dropped by roughly an order of magnitude. By April 2025, Google's Gemini API,[9] OpenAI's Connectors,[7] and most major IDE-integrated coding assistants had adopted it.
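To make the "same shape" concrete, here is a hand-rolled sketch of the JSON-RPC messages an MCP server exchanges for tool discovery (`tools/list`) and invocation (`tools/call`). It is not the official SDK and skips the initialization handshake, transports, resources, and prompts. The `lookup_entity` tool and the `GRAPH` data are hypothetical.

```python
import json

# Hypothetical backing data for the example.
GRAPH = {"Acme": {"type": "Company", "acquired": ["Initech"]}}

# Each tool advertises a name, a description, and a JSON Schema for its input.
TOOLS = [{
    "name": "lookup_entity",
    "description": "Return the stored record for a named entity.",
    "inputSchema": {
        "type": "object",
        "properties": {"name": {"type": "string"}},
        "required": ["name"],
    },
}]

def handle(request: dict) -> dict:
    """Dispatch one JSON-RPC 2.0 request and return the response envelope."""
    method = request["method"]
    params = request.get("params", {})
    if method == "tools/list":
        result = {"tools": TOOLS}
    elif method == "tools/call" and params.get("name") == "lookup_entity":
        record = GRAPH.get(params["arguments"]["name"])
        result = {
            "content": [{"type": "text", "text": json.dumps(record)}],
            "isError": record is None,
        }
    else:
        return {"jsonrpc": "2.0", "id": request.get("id"),
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}

listed = handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
called = handle({
    "jsonrpc": "2.0", "id": 2, "method": "tools/call",
    "params": {"name": "lookup_entity", "arguments": {"name": "Acme"}},
})
```

A client that understands these two messages can discover and call tools on any conforming server without vendor-specific glue, which is the whole point of the standard.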
2024-2025 · Hybrid retrieval converges
Neo4j published a reference architecture[6] that became the de-facto template for production graph RAG: vector retrieval to find candidate seed nodes, then graph traversal from those seeds to expand context, then community-level summarization for the final synthesis. Every serious implementation since · including PYRAMYD's · uses some variant of this pattern.
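The seed-then-expand pattern can be sketched with toy data. This is an illustration of the general idea, not Neo4j's reference code: the embeddings, edges, and function names are hypothetical, and the final community-level summarization step is left out.

```python
import math
from collections import deque

# Hypothetical node embeddings and typed edges.
EMBEDDINGS = {
    "BillingAPI": [0.9, 0.1, 0.0],
    "Initech":    [0.2, 0.8, 0.1],
    "Acme":       [0.1, 0.7, 0.3],
    "Hooli":      [0.0, 0.1, 0.9],
}
EDGES = [
    ("Initech", "SELLS", "BillingAPI"),
    ("Acme", "ACQUIRED", "Initech"),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def seed_nodes(query_vec, k=1):
    """Step 1: vector search picks the entry points into the graph."""
    ranked = sorted(EMBEDDINGS, key=lambda n: cosine(query_vec, EMBEDDINGS[n]), reverse=True)
    return ranked[:k]

def expand(seeds, hops=1):
    """Step 2: breadth-first walk over edges (either direction) up to `hops`."""
    visited = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    subgraph = []
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for subj, rel, obj in EDGES:
            if node in (subj, obj):
                if (subj, rel, obj) not in subgraph:
                    subgraph.append((subj, rel, obj))
                other = obj if node == subj else subj
                if other not in visited:
                    visited.add(other)
                    frontier.append((other, depth + 1))
    return subgraph

seeds = seed_nodes([1.0, 0.0, 0.0])
context = expand(seeds, hops=2)
```

With two hops the walk reaches Acme, a node whose embedding is far from the query and that vector search alone would not have returned.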
Section 2: The convergence pattern
The four shifts above, taken together, define the production pattern that's now stable:
- Typed entity backbone. Every implementation worth deploying maintains a typed graph schema · entities have types, relationships have types, and foreign keys (FKs) are enforced. Microsoft's GraphRAG extracts the schema from text; Neo4j's reference assumes one; PYRAMYD ships it pre-built.
- Community detection + hierarchical summaries. Leiden or Louvain clustering partitions the graph into communities of related entities. Multi-level summaries make global queries tractable. Every production implementation runs some variant.
- Hybrid retrieval. Vector search seeds the traversal; graph walks expand it. Cosine similarity finds the entry point; typed edges define the neighborhood.
- MCP-shaped delivery. The retrieved subgraph is delivered to the LLM as a structured tool response, not raw text. The LLM can call back for more context as needed.
- Citation traceability. Every retrieved fact carries a back-pointer to its source node or document. This is what makes the substrate auditable.
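Two of the primitives above, the typed backbone and citation traceability, fit in one small sketch. The schema, class names, and document IDs are hypothetical, and this is an in-memory illustration of the idea rather than any vendor's implementation.

```python
from dataclasses import dataclass

# Hypothetical schema: the allowed (subject type, relation, object type) triples.
SCHEMA = {
    ("Company", "ACQUIRED", "Company"),
    ("Company", "SELLS", "Product"),
}

@dataclass(frozen=True)
class Fact:
    subject: str
    relation: str
    obj: str
    source: str  # back-pointer to the document the fact came from

class TypedGraph:
    def __init__(self):
        self.node_types = {}
        self.facts = []

    def add_node(self, name, node_type):
        self.node_types[name] = node_type

    def add_fact(self, fact):
        # Foreign-key-style check: both endpoints must already exist.
        for name in (fact.subject, fact.obj):
            if name not in self.node_types:
                raise KeyError(f"unknown node: {name}")
        # Type check: the edge must match an allowed schema signature.
        signature = (self.node_types[fact.subject], fact.relation,
                     self.node_types[fact.obj])
        if signature not in SCHEMA:
            raise ValueError(f"edge not allowed by schema: {signature}")
        self.facts.append(fact)

    def facts_about(self, name):
        """Every returned fact carries its `source`, so answers stay auditable."""
        return [f for f in self.facts if name in (f.subject, f.obj)]

graph = TypedGraph()
graph.add_node("Acme", "Company")
graph.add_node("BillingAPI", "Product")
graph.add_fact(Fact("Acme", "SELLS", "BillingAPI", source="doc-17"))
```

Rejecting a malformed edge at write time is what lets the query path trust the graph without re-validating it.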
Section 3: What the original GraphRAG paper actually says
The Microsoft Research GraphRAG paper (Edge et al., arXiv:2404.16130, April 2024) makes a qualitative claim that is well-formed but narrowly scoped: GraphRAG “leads to substantial improvements over a conventional RAG baseline for both the comprehensiveness and diversity of generated answers” on global sensemaking questions evaluated against datasets in the 1-million-token range. The authors don't publish a single headline percentage · they publish head-to-head win rates per evaluation dimension.
That nuance matters because the field has accumulated a lot of secondhand “3-4× lift” or “35-54% accuracy gain” numbers in derivative articles, none of which appear in the original paper's abstract or introduction. The honest summary is: graph traversal handles multi-hop reasoning that cosine-similarity chunk ranking can't, and Microsoft Research has the most-cited paper demonstrating that on a non-trivial benchmark.
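The difference between a win rate and a "lift" is easier to see with arithmetic. The sketch below computes per-dimension win rates from pairwise judge verdicts. The verdicts are made-up illustration data, not results from the paper, and counting a tie as half a win is an assumption of this example.

```python
from collections import defaultdict

# Hypothetical (dimension, winner) verdicts from a pairwise judge.
JUDGMENTS = [
    ("comprehensiveness", "graph"),
    ("comprehensiveness", "graph"),
    ("comprehensiveness", "baseline"),
    ("comprehensiveness", "graph"),
    ("diversity", "graph"),
    ("diversity", "baseline"),
    ("diversity", "tie"),
    ("diversity", "graph"),
]

def win_rates(judgments, system="graph"):
    """Fraction of head-to-head comparisons won by `system`, per dimension."""
    wins = defaultdict(float)
    totals = defaultdict(int)
    for dimension, winner in judgments:
        totals[dimension] += 1
        if winner == system:
            wins[dimension] += 1.0
        elif winner == "tie":
            wins[dimension] += 0.5  # assumption: a tie counts as half a win
    return {dimension: wins[dimension] / totals[dimension] for dimension in totals}

rates = win_rates(JUDGMENTS)
```

A 75% win rate says the judge preferred one system in three of four comparisons. It says nothing about how much better each winning answer was, which is why it cannot be restated as a percentage accuracy gain.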
Section 4: What this implies for the next 24 months
The convergence pattern means three things for any team building on top of LLMs:
- Plain vector RAG is a dead end for global queries. If your product answers questions like "how is X changing across the market" or "which competitors ship features that overlap with our roadmap," you need graph traversal in the retrieval path. The benchmark gap is too large to ignore.
- MCP is the protocol bet. Any retrieval layer you build today should expose its surface as MCP tools. The standard is settled enough that bespoke alternatives lose on ecosystem cost.
- Typed entity backbones beat extractor-only approaches. Microsoft's original GraphRAG extracts the schema from text · which works for unstructured corpora. For domain-specific substrates (enterprise software, financial markets, legal contracts) a pre-built typed backbone wins on accuracy because the extractor doesn't have to discover the schema each run.
Section 5: Where PYRAMYD lands in the pattern
PYRAMYD's Product Graph is the typed-backbone variant of the converged pattern. Its 88 universal node types and 1,554 foreign-key (FK) constraints replace the extractor-derived schema. Community detection is replaced by the canonical taxonomy structure (category → industry → country, etc.). The retrieval pipeline uses pgvector embeddings to seed the traversal and graph walks to expand it, following the Neo4j reference architecture.[6] APEX is the MCP-shaped delivery layer · plug it into Claude Desktop, ChatGPT Enterprise, or Cursor and the graph becomes addressable from any of them.
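The idea of a taxonomy standing in for community detection can be illustrated generically. This is not PYRAMYD's implementation: the node names, taxonomy values, and `group_by_level` are all hypothetical. Grouping on a prefix of the taxonomy path gives nested groups at several granularities, which is the role hierarchical clustering plays in extractor-based systems.

```python
from collections import defaultdict

# Hypothetical nodes tagged with a (category, industry, country) taxonomy path.
NODES = {
    "BillingAPI": ("Payments", "Fintech", "US"),
    "LedgerSync": ("Payments", "Fintech", "DE"),
    "ShiftPlan":  ("Scheduling", "Healthcare", "US"),
}

def group_by_level(nodes, depth):
    """Group nodes by the first `depth` components of their taxonomy path."""
    groups = defaultdict(list)
    for name, path in nodes.items():
        groups[path[:depth]].append(name)
    return {key: sorted(members) for key, members in groups.items()}

coarse = group_by_level(NODES, depth=1)  # one group per category
fine = group_by_level(NODES, depth=3)    # one group per full path
```

Because the groups come from declared attributes rather than a clustering run, they are stable across re-indexing, at the cost of only capturing structure the taxonomy already encodes.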
Where this lands for PYRAMYD customers
The substrate convergence is the reason PYRAMYD ships as a platform, not a chatbot. Every Studio surface · Dashboards, Canvases, Notebooks, Sheets, Documents, Slides, Schemas, Agents, Models · sits on the typed graph and the MCP-shaped retrieval layer. These are the same primitives that Microsoft, Anthropic, Google, and Neo4j converged on, pre-built for the specific shape of enterprise software.
The next 24 months will see a wave of products that look like GraphRAG variants on undifferentiated substrates · chat-with-your-Confluence, chat-with-your-Salesforce, chat-with-your-Notion. Those products will work, but they will be marginal because the substrate underneath them isn't typed enough to support the global queries the buyer actually has. The product graph for an entire vertical · enterprise software, in our case · is a different shape and produces different answers. The convergence pattern tells you why.
References
- [1]Edge, D. et al., From Local to Global: A Graph RAG Approach to Query-Focused Summarization, Microsoft Research arXiv:2404.16130 (Apr 2024) · The foundational GraphRAG paper · methodology, benchmarks, and head-to-head win rates of roughly 70-80% on global question types against baseline vector retrieval, as judged by an LLM evaluator.
- [2]Anthropic, Introducing the Model Context Protocol (Nov 2024) · MCP: open standard for connecting LLM clients to typed context sources. Adopted by OpenAI, Google, Cursor, Claude Desktop within 6 months of launch.
- [3]Edge, D. et al., Microsoft GraphRAG GitHub Repository · Open-source reference implementation · indexer + query pipeline + community detection + entity extraction.
- [4]Lewis, P. et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, NeurIPS 2020 · Original RAG paper (Facebook AI Research, 2020) · the substrate the GraphRAG variants extend.
- [5]Wang, K. et al., A Comprehensive Survey of Knowledge Graph Reasoning, arXiv:2210.10663 (Oct 2022) · Survey of multi-hop reasoning approaches over knowledge graphs · context for why graph traversal beats vector similarity on complex queries.
- [6]Neo4j + LangChain, Graph RAG Reference Architecture (2024) · Neo4j's published reference for hybrid graph+vector retrieval · architectural primitives that converge with Microsoft's variant.
- [7]OpenAI, Introducing Connectors (Apr 2025) · OpenAI's adoption of MCP-shaped connectors for ChatGPT Enterprise · the move that ratified MCP as the cross-vendor standard.
- [8]Gartner, Hype Cycle for Generative AI 2025 (Aug 2025) · Knowledge graphs moved from Slope of Enlightenment to Plateau of Productivity · explicitly tied to MCP adoption velocity.
- [9]Google, Introducing Gemini's MCP support, Google AI Blog (Mar 2025) · Gemini API native MCP support · third major LLM vendor to adopt within four months of launch.
- [10]Microsoft Research, "From Local to Global: A Graph RAG Approach to Query-Focused Summarization" · Edge et al., arXiv:2404.16130, April 24, 2024 (revised Feb 19, 2025). Documents substantial GraphRAG improvements over conventional RAG on comprehensiveness and diversity dimensions across 1M-token sensemaking benchmarks.
