6 Jun 2026

A filesystem for institutional knowledge

The Context team

Ask how to give an agent your company's knowledge and the default answer is a vector database: take every document, chop it into chunks, embed the chunks, and retrieve the nearest ones to a query. It works, to a point. It also throws away two things agents are unusually good at using: structure, and the ability to write.

We took a different shape. The institutional-context layer is a filesystem the agent traverses like a code repository, with a place at every level for the agent to record what it learns.

Why a filesystem, not a vector store

Large language models are trained on a vast amount of real code, and with it on the layout of filesystems and repositories. They reliably traverse directories, understand that this folder belongs under that one, and reason with paths. It is a native skill, and a vector store does not use any of it.

Chunk-and-embed flattens a structured corpus into a bag of fragments and asks the model to reason over similarity scores instead of over the structure the company actually organizes its work by. A filesystem keeps that structure intact and hands the agent a way of moving through it that it already knows cold. The agent does not need a special interface to your knowledge. It needs the one it learned from a million repositories.

The tree is not the only structure the store keeps. Alongside the directories, it holds cross-reference links between entities (an account, a ticket, a policy, a prior decision), so the agent can follow a relationship across the tree the way it would follow an import in a codebase instead of only walking a folder hierarchy. Structure here means both the hierarchy and the graph laid over it.
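A minimal sketch of following those cross-references, assuming they are available as a plain mapping from one path to the paths it links to. The post does not say how links are stored; the paths, the LINKS table, and the follow_links helper are all hypothetical.

```python
from collections import deque

# Hypothetical cross-reference links laid over the directory tree.
# Keys and values are paths in the store; none of them come from a real system.
LINKS = {
    "org/finance/accounts/acme": [
        "org/support/tickets/1042",
        "org/finance/policies/revenue",
    ],
    "org/support/tickets/1042": ["org/finance/decisions/2025-pricing"],
    "org/finance/policies/revenue": [],
    "org/finance/decisions/2025-pricing": ["org/finance/accounts/acme"],
}


def follow_links(start, max_hops):
    """Breadth-first walk over cross-references, returning {path: hops from start}."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if seen[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for target in LINKS.get(current, []):
            if target not in seen:  # the link graph may contain cycles
                seen[target] = seen[current] + 1
                queue.append(target)
    return seen


reachable = follow_links("org/finance/accounts/acme", max_hops=2)
for path, hops in sorted(reachable.items(), key=lambda item: item[1]):
    print(hops, path)
```

The hop limit keeps a relationship walk bounded, and the visited set matters because a decision can link back to the account that prompted it.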

Retrieval that reads the source, not the chunk

Embeddings are not discarded. They are one tool of several. Retrieval is hybrid: semantic search alongside traditional information retrieval, used to ground a query in the full breadth of the store. The difference is what happens next. The agent then explores the source documents directly, rather than being handed pre-embedded fragments and nothing else.

The two halves cover each other's blind spots. Embeddings are good at fuzzy, conceptual matches and bad at exact ones: for an account number, a policy code, or any specific identifier, a near match is a wrong match. Traditional retrieval catches exactly those. Hybrid means a vague question about a concept and a precise lookup of an identifier both land on the right place.
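The interplay can be sketched in a few lines. Everything here is an illustration under stated assumptions: character-trigram cosine similarity stands in for a real embedding model, exact token overlap stands in for traditional retrieval, and the two rankings are merged with reciprocal rank fusion, a common fusion method the post does not claim to use. The documents and policy codes are made up.

```python
import math
import re
from collections import Counter

# Hypothetical corpus; the two policy codes differ by a transposed digit.
DOCS = {
    "acme/notes.txt": "Acme prefers adjusted EBITDA in every earnings deck",
    "policy/FIN-204.txt": "Policy FIN-204 covers revenue recognition for multi-year contracts",
    "policy/FIN-240.txt": "Policy FIN-240 covers travel expense approval",
}


def tokens(text):
    return re.findall(r"[a-z0-9-]+", text.lower())


def trigrams(text):
    joined = " ".join(tokens(text))
    return Counter(joined[i:i + 3] for i in range(len(joined) - 2))


def cosine(a, b):
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def fuzzy_rank(query):
    """Toy stand-in for semantic search: ranks every document by trigram similarity."""
    q = trigrams(query)
    return sorted(DOCS, key=lambda doc: cosine(q, trigrams(DOCS[doc])), reverse=True)


def exact_rank(query):
    """Toy stand-in for traditional retrieval: only documents sharing a whole token."""
    q = set(tokens(query))
    overlap = {doc: len(q & set(tokens(text))) for doc, text in DOCS.items()}
    return sorted((d for d in DOCS if overlap[d] > 0), key=lambda d: overlap[d], reverse=True)


def hybrid_rank(query, k=60):
    """Merge both rankings with reciprocal rank fusion (k=60 is a conventional default)."""
    fused = Counter()
    for ranking in (fuzzy_rank(query), exact_rank(query)):
        for position, doc in enumerate(ranking, start=1):
            fused[doc] += 1.0 / (k + position)
    return [doc for doc, _ in fused.most_common()]


print(hybrid_rank("FIN-204"))
print(hybrid_rank("adjusted ebitda earnings"))
```

The fuzzy ranking still lists FIN-240 as a candidate for the query FIN-204, because the two codes look alike. The exact ranking returns only the document that contains the identifier, which is what lets the fused result put the right policy first.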

That matters whenever the answer depends on the surroundings of a document: the section it sits in, the file next to it, the note in the same folder. A similarity match on an isolated chunk throws those surroundings away. An agent that navigates to the right place and reads keeps them, the way a person who knows the repository would read around a result instead of stopping at it.

The part that is writable

Here is the part a vector store cannot do well. Every directory in the tree can contain a .context subdirectory, and that is where an agent writes what it learns while doing the work. An observation about how this account is handled. A procedural note about the step everyone gets wrong. A correction a human just made.

These persist as structured context that the next agent sees at retrieval time. The store is not a frozen index of documents. It is a living layer the agents grow. They do not only consume institutional knowledge; they produce it, and they leave it in the folder where the next agent will look.

org/
    .context/                    org-wide standards and procedures
    finance/
        .context/                how this desk reads the policy
        accounts/
            acme/
                .context/        Acme uses adjusted EBITDA; risks first
                2025-Q4-filing.pdf
                earnings-model.xlsx

Institutional context is a filesystem the agent traverses. A .context directory at every level is where agents write what they learn.

An annotation, a quarter later

A concrete version makes it click. An agent is asked to build a quarterly earnings deck for an account. It navigates the store, reads the relevant filings and the prior deck, and produces a draft. An expert reviews it and makes two corrections: this account's decks always use adjusted EBITDA, not GAAP net income, and the waterfall chart should compare quarter over quarter, not year over year.

Those corrections are written into the account's .context, with a note and a pointer to the task that produced them. A quarter later, the same request comes in. This time the agent navigates straight to the account, reads the precedent it left for itself last time, applies adjusted EBITDA and the right waterfall without being told, and needs no corrections. The work got easier because the last run wrote down what it learned, in the place this run would look.
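As a rough sketch of that loop, the snippet below writes the two corrections into an account's .context and reads them back on the next run. The file name notes.jsonl, the record fields, the task identifier, and the directory layout are assumptions made for illustration; the post does not describe the on-disk format.

```python
import json
import tempfile
from pathlib import Path


def write_note(directory, text, task_id, author):
    """Append one note to <directory>/.context/notes.jsonl (file name is an assumption)."""
    context_dir = Path(directory) / ".context"
    context_dir.mkdir(parents=True, exist_ok=True)
    record = {"text": text, "task": task_id, "author": author}
    # Mode "a" only ever adds lines; existing notes are never rewritten.
    with open(context_dir / "notes.jsonl", "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def read_notes(directory):
    """Return the notes recorded for a directory, oldest first."""
    notes_file = Path(directory) / ".context" / "notes.jsonl"
    if not notes_file.exists():
        return []
    with open(notes_file, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Hypothetical account folder inside a throwaway directory.
acme = Path(tempfile.mkdtemp()) / "org" / "finance" / "accounts" / "acme"

# First quarter: the reviewer's corrections are recorded with a pointer to the task.
write_note(acme, "Use adjusted EBITDA, not GAAP net income.", "task-001", "reviewer")
write_note(acme, "Waterfall chart compares quarter over quarter, not year over year.",
           "task-001", "reviewer")

# A quarter later: the agent reads the precedent before drafting.
precedent = read_notes(acme)
for note in precedent:
    print(f"[{note['task']}] {note['text']}")
```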

How an agent reads down the tree

Because the store is hierarchical, an agent does not read one note. It reads a path of them. Approaching a task, it walks from the org level down to the specific thing it is working on, picking up context at each step. At the org level, the standards everyone follows. At the team level, how this desk interprets them. At the account or project level, the specifics that apply only here.

By the time it reaches the file it needs, it is carrying all three layers at once, the same stack of context a person builds up over years of working somewhere. The partition into org, team, and individual is not only for permissions. It is how general procedure and local exception get composed into the context for one particular piece of work.
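The walk down the tree can be sketched as a loop over the ancestors of the target, collecting whatever each level's .context holds. This assumes the layout org/finance/accounts/acme and plain text notes; the notes.md file name and the collect_context helper are hypothetical.

```python
import tempfile
from pathlib import Path


def collect_context(root, target):
    """Gather .context notes from root down to target, most general first."""
    root, target = Path(root), Path(target)
    levels = [root]
    for part in target.relative_to(root).parts:
        levels.append(levels[-1] / part)

    layers = []
    for level in levels:
        context_dir = level / ".context"
        if not context_dir.is_dir():
            continue  # a level without notes contributes nothing
        for note in sorted(p for p in context_dir.iterdir() if p.is_file()):
            label = level.relative_to(root.parent).as_posix()
            layers.append((label, note.read_text(encoding="utf-8")))
    return layers


# Build a throwaway tree with notes at the org, team, and account levels.
base = Path(tempfile.mkdtemp())
org = base / "org"
acme = org / "finance" / "accounts" / "acme"
for directory, text in [
    (org, "org-wide standards and procedures"),
    (org / "finance", "how this desk reads the policy"),
    (acme, "Acme uses adjusted EBITDA; risks first"),
]:
    (directory / ".context").mkdir(parents=True)
    (directory / ".context" / "notes.md").write_text(text, encoding="utf-8")

layers = collect_context(org, acme)
for label, text in layers:
    print(f"{label}: {text}")
```

Ordering the result from general to specific lets a local exception appear after the general procedure it refines, which is one simple way to compose the two.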

It sits beside your data; it does not overwrite it

An important boundary: the .context layer sits alongside the company's canonical data, never on top of it. The source of truth in the CRM or the warehouse stays the source of truth. What the agents write is a separate, additive layer of observations, procedures, and corrections.

It is append-only, with semantic deduplication so the same lesson is not recorded a hundred times, and full provenance so every note traces back to the task and the person that produced it. You can always see where a piece of learned context came from, and you can always set it aside and read the canonical record underneath. The agents annotate the library. They do not rewrite the books.
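A small sketch of those three properties together. The deduplication here uses difflib's string similarity as a crude stand-in for semantic comparison, which in practice would more plausibly use embeddings; the ContextLog class, the 0.85 threshold, and the field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Note:
    text: str
    task_id: str  # provenance: the task that produced the note
    author: str   # provenance: the person or agent behind it


class ContextLog:
    """Append-only list of notes that skips near-duplicates."""

    def __init__(self, threshold=0.85):
        self._notes = []
        self.threshold = threshold

    def append(self, note):
        """Store the note unless a similar one exists; return the note that is kept."""
        for existing in self._notes:
            ratio = SequenceMatcher(None, existing.text.lower(), note.text.lower()).ratio()
            if ratio >= self.threshold:
                return existing  # same lesson already recorded
        self._notes.append(note)
        return note

    def notes(self):
        # A tuple copy: callers can read the history but not edit or delete it.
        return tuple(self._notes)


log = ContextLog()
first = log.append(Note("Acme decks use adjusted EBITDA, not GAAP net income.", "task-001", "reviewer"))
repeat = log.append(Note("Acme decks use adjusted EBITDA, not GAAP net income", "task-007", "agent"))
other = log.append(Note("Waterfall charts compare quarter over quarter.", "task-001", "reviewer"))

for note in log.notes():
    print(f"{note.text}  (from {note.task_id}, {note.author})")
```

There is deliberately no update or delete method. A lesson that arrives twice resolves to the original record, so its provenance still points at the task where it was first learned.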

It inherits permissions from the source

Knowledge is only safe if access to it respects the rules that already govern it. The store ingests from the systems a company already uses, and it inherits each source's permissions: role-based access, object-level controls, the identity hierarchy. An agent authenticates through the customer's identity provider and inherits the invoking user's access, scoped by role, team, and sensitivity.

So an agent retrieving from the store sees exactly what the person behind it would see, across every connected system, and the .context notes are partitioned the same way, by org, team, and individual. Learned context never becomes a backdoor around the permissions on the data it was learned from. If you could not see the document, you cannot see the note an agent wrote about it either.
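One way to picture that rule is a filter with two conditions: the note's partition must include the user, and the user must be able to read the document the note was learned from. This is a simplified sketch; the User and Note types, the scope names, and the file names are hypothetical, and a real deployment would ask the identity provider and the source systems instead of using in-memory sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    teams: frozenset
    readable_docs: frozenset  # stand-in for permissions inherited from source systems


@dataclass(frozen=True)
class Note:
    text: str
    scope: str   # "org", "team", or "individual"
    owner: str   # team name for "team", user name for "individual", unused for "org"
    source: str  # the document the note was learned from


def visible_notes(user, notes):
    """Keep notes whose partition includes the user and whose source the user can read."""
    def in_partition(note):
        if note.scope == "org":
            return True
        if note.scope == "team":
            return note.owner in user.teams
        if note.scope == "individual":
            return note.owner == user.name
        return False  # unknown scopes are denied by default

    return [n for n in notes if in_partition(n) and n.source in user.readable_docs]


notes = [
    Note("Org-wide deck standards", "org", "", "handbook.pdf"),
    Note("How the finance desk reads the policy", "team", "finance", "policy.pdf"),
    Note("Acme uses adjusted EBITDA", "team", "finance", "acme-filing.pdf"),
    Note("My own reminder", "individual", "dana", "handbook.pdf"),
]
dana = User("dana", frozenset({"finance"}), frozenset({"handbook.pdf", "policy.pdf"}))
sam = User("sam", frozenset({"sales"}), frozenset({"handbook.pdf", "acme-filing.pdf"}))

visible = visible_notes(dana, notes)
for note in visible:
    print(note.text)
```

Dana is on the finance team but cannot read the Acme filing, so the note learned from it stays hidden. Sam can read the filing but is outside the finance partition, so the same note stays hidden for a different reason.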

It gets richer while no one is working

The store does not only grow during tasks. Between active tasks, idle compute distills the raw traces of recent work into cleaner structured context, turning a messy record of what happened into the procedural note a future agent can use directly.

That moves work from inference time, when latency is expensive and a user is waiting, to idle time, when neither is true. The retrieval an agent does tomorrow is better because of the distillation that ran on last night's idle cycles. It is a subject worth its own post; the point here is that the store improves even when no task is touching it.

Why this compounds

Put the pieces together and the store is a different kind of asset than an index. It is structured the way the company works. It is written to by every task. It inherits the permissions of the data it covers. And it improves on idle compute. Each of those makes it harder to replicate and more useful over time.

An index of last year's documents is replaceable; anyone can rebuild one from the same files. A year of annotations about how the work actually goes, sitting in the right folders, under the right permissions, is not. The filesystem is not just where the knowledge is stored. It is where the knowledge accumulates.

Move through it, read it, write to it

A vector database answers a narrow question well: which chunks look like this query. An agent doing real work needs more than that. It needs to move through a company's knowledge the way someone who works there would, to read the source and not a fragment of it, and to write down what it learns so the next agent starts ahead. A filesystem with a place to take notes does all three. So that is what we built.