Google Cloud introduced the Open Knowledge Format (OKF) v0.1 on June 12, 2026 — a vendor-neutral, open specification that formalizes how organizations store and share knowledge with AI agents, using nothing more than plain markdown files and YAML frontmatter.
Key Highlights
- OKF standardizes organizational knowledge as a portable directory of markdown files that agents can read directly
- Only one required YAML field:
type— no proprietary SDK, no vendor accounts, no lock-in - Available on GitHub under
GoogleCloudPlatform/knowledge-catalogwith three sample bundles and two reference implementations - Directly addresses the "context-assembly problem" that forces every AI team to rebuild knowledge integration from scratch
- Draws on Andrej Karpathy's "LLM wiki" pattern, now formalized into an interoperable open standard
The Problem OKF Solves
Every organization building AI agents faces the same painful reality: institutional knowledge is fragmented. It lives across wikis, buried in code comments, locked in metadata catalog APIs, scattered across shared drives, and stored in the heads of a few senior engineers.
This fragmentation forces agent builders to repeatedly solve the same data integration challenge before their agents can function intelligently. OKF aims to end that cycle by establishing a neutral, portable format that anyone can produce without an SDK — and anyone can consume without a vendor integration.
Sam McVeety (Tech Lead, Data Analytics, Google Cloud) and Amir Hormati (Tech Lead, BigQuery, Google Cloud) authored the specification, drawing inspiration from the "LLM wiki" pattern that has emerged organically among teams building internal knowledge bases for their AI agents.
How OKF Works
At its core, OKF represents an organization's knowledge as a directory tree of markdown files:
- Each file represents a single "concept" — a database table, an API endpoint, a metric, or a runbook
- A minimal YAML frontmatter block tags the concept with
type,title,description,resource, andtags - Two reserved filenames keep bundles navigable:
index.mdenumerates a directory's contents;log.mdrecords change history with ISO 8601 date headings - Cross-linking uses standard markdown links, keeping relationships readable by both humans and agents
The result is a knowledge bundle that lives in version control alongside code, renders on GitHub, and requires no backend to host. The full v0.1 specification spans 451 lines and 14.7 kilobytes — intentionally compact.
Karpathy's LLM Wiki Pattern Made Official
The specification explicitly formalizes the "LLM wiki" pattern — teams maintaining shared markdown libraries that agents read, update, and cross-reference. As Andrej Karpathy noted in the context of this approach:
"LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass. The bookkeeping that causes humans to abandon personal wikis is exactly what LLMs are good at."
By standardizing this community-discovered pattern, Google Cloud aims to prevent organizations from building incompatible knowledge silos as they scale their agentic systems.
What Ships with v0.1
The GitHub repository (GoogleCloudPlatform/knowledge-catalog) includes:
- The full OKF v0.1 specification — minimally opinionated, only the
typefield is required - Enrichment agent — walks BigQuery datasets, drafts OKF concept documents for tables and views, enriches them with citations, schemas, and join paths
- Static HTML visualizer — converts OKF bundles into interactive graph views with no backend required
- Three sample bundles — GA4 e-commerce data, Stack Overflow public datasets, and Bitcoin public datasets, all produced by the reference agent
Why This Matters for Agent Builders
OKF's minimalism is a deliberate design choice. Rather than standardizing taxonomies, storage infrastructure, or domain-specific schemas, it provides just enough structure for agents to navigate and consume knowledge reliably.
This positions OKF as a format, not a platform — no vendor lock-in, no proprietary accounts, and no forced migration path. Organizations can adopt it incrementally, starting with a single dataset and expanding as their agent ecosystems grow.
For MENA-region organizations building agents over Arabic-language content or multilingual enterprise data, OKF's markdown-native, encoding-agnostic design is particularly relevant: the format imposes no language-level constraints on document content.
What's Next
The v0.1 release is explicitly a starting point. Google Cloud is inviting the community to write producers and consumers, propose extensions, and submit pull requests via GitHub. Backward-compatible growth is built into the specification's design philosophy.
Google Cloud Knowledge Catalog has already been updated to ingest OKF bundles natively, giving enterprise teams an immediate integration path — but the open specification means any agent framework, orchestration platform, or CI pipeline can adopt it independently.
Source: Google Cloud Blog