COSTA is a Deep Context Model. A Large Language Model is broad by design. A Deep Context Model is specific by design. The difference is not capability. It is orientation.
One-to-one notes, meeting transcripts, tasks, the calendar, the journal. These files exist in almost every working life, in some form. The question is what happens to them. In most tools they are attachments — documents you bring to a conversation, paste into a window, or upload for a single session. When the session ends, they are gone. The next session starts from nothing again.
COSTA treats these files as a corpus. They are stored as plain Markdown, on your machine, in a consistent structure. That structure is not incidental. It is what allows both AI layers built on top of them to work. A folder of unstructured notes contains the same information. It does not have the same queryability.
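The difference is concrete enough to show. A minimal sketch in Python, assuming a hypothetical layout of `people/<name>/1-1/YYYY-MM-DD.md` — the path scheme and folder names here are illustrative, not COSTA's actual schema:

```python
from pathlib import Path

# Hypothetical layout, for illustration only:
#   corpus/people/<name>/1-1/<YYYY-MM-DD>.md
CORPUS = Path.home() / "corpus"

def one_to_ones(person: str, since: str) -> list[Path]:
    """Every 1-1 note for a person on or after an ISO date.

    This works because the filenames are ISO dates: lexicographic
    order is chronological order, so a string comparison is enough.
    """
    folder = CORPUS / "people" / person / "1-1"
    return sorted(p for p in folder.glob("*.md") if p.stem >= since)

# "What have Alex and I discussed since September?" is one call.
# Against an unstructured folder, the same question means reading
# and classifying every file first.
notes = one_to_ones("alex", since="2024-09-01")
```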
The value of the corpus is deferred. On day one it contains one entry. At six months it contains a picture of every working relationship, every open commitment, and a record of how the work has felt to do. That is not a note-taking application. It is an accumulation engine, and the AI layers are what it accumulates for.
The brief generates each morning without prompting. It reads the task list, the calendar for the week, the 1-1 notes for anyone appearing in today's meetings, and a selection of recent briefs to avoid repetition. It produces a synthesis before the working day starts. You did not ask for it.
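A sketch of the shape of that job, not COSTA's implementation: a scheduled script that stitches those sources into a single prompt and asks the model for the synthesis. The folder layout is the same hypothetical one as above, and the model ID is just a current Claude model; the client call is the standard Anthropic messages API.

```python
from datetime import date
from pathlib import Path

import anthropic  # the document names Claude as the underlying model

CORPUS = Path.home() / "corpus"  # hypothetical root, as above

def read_folder(folder: Path) -> str:
    """Concatenate every Markdown file in a folder, oldest first."""
    return "\n\n".join(p.read_text() for p in sorted(folder.glob("*.md")))

def morning_brief() -> str:
    # The same inputs the text names: tasks, the week's calendar, and
    # recent briefs so the model can avoid repeating itself. (Pulling
    # 1-1 notes for today's attendees is omitted here for brevity.)
    context = "\n\n".join([
        read_folder(CORPUS / "tasks"),
        read_folder(CORPUS / "calendar"),
        read_folder(CORPUS / "briefs"),
    ])
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env
    reply = client.messages.create(
        model="claude-sonnet-4-20250514",  # any current Claude model
        max_tokens=1024,
        messages=[{"role": "user", "content": (
            f"{context}\n\nWrite a brief for {date.today()}: what matters "
            "today, what is at risk, what was agreed and not yet done."
        )}],
    )
    return reply.content[0].text

if __name__ == "__main__":
    # Run from cron before the working day starts; no one prompts it.
    (CORPUS / "briefs" / f"{date.today()}.md").write_text(morning_brief())
```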
Meeting prep works the same way. Opening a person's page assembles a synthesis from every previous note involving them before you read a word. Task extraction runs on incoming meeting transcripts: commitments made in a meeting become tasks because they were captured, not because you transferred them manually. Org sentiment analysis runs weekly over recent notes and produces a scored read on team health.
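Task extraction has the same shape. A hedged sketch, with an illustrative prompt and the same hypothetical folder layout: the transcript goes in, and whatever commitments the model finds come out as task entries.

```python
from pathlib import Path

import anthropic

CORPUS = Path.home() / "corpus"  # same hypothetical root as above

def extract_tasks(transcript: Path) -> None:
    """Append commitments found in a meeting transcript to the task list.

    Commitments become tasks because they were captured in the
    transcript, not because anyone transferred them by hand.
    """
    client = anthropic.Anthropic()
    reply = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=512,
        messages=[{"role": "user", "content": (
            f"{transcript.read_text()}\n\n"
            "List every concrete commitment made above, one per line, as "
            "'- [ ] <who>: <what> (due <when, if stated>)'. "
            "If there are none, reply with an empty message."
        )}],
    )
    tasks = reply.content[0].text.strip()
    if tasks:
        with (CORPUS / "tasks" / "inbox.md").open("a") as f:
            f.write(f"\n{tasks}\n")
```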
This is the layer that has no equivalent in a generic AI tool. ChatGPT surfaces nothing proactively. It waits. COSTA runs on the corpus continuously and surfaces what is relevant when it is relevant. The difference is not capability. It is architecture.
The assistant panel is the part of COSTA that most closely resembles a generic AI tool. It is a chat interface. You ask questions, it responds. The underlying model is Claude — the same model available on Claude.ai.
The difference is the context assembled before you ask. When you open the assistant on a person's page, it has already loaded every 1-1 note for that person, their context file, and any relevant meeting notes. When you ask about an open project, it reads the linked document history before answering. When you ask it to help prep a message, it reads the writing style guide and applies it without being told to.
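A sketch of that assembly step, under the same hypothetical layout as before. The work happens before the question is sent: the user types a short question, and the system supplies the long context.

```python
from pathlib import Path

import anthropic

CORPUS = Path.home() / "corpus"  # same hypothetical layout as above

def person_context(person: str) -> str:
    """Everything the assistant preloads when a person's page opens."""
    root = CORPUS / "people" / person
    context_file = root / "context.md"
    sections = [context_file.read_text() if context_file.exists() else ""]
    for folder in ("1-1", "meetings"):
        sections += [p.read_text()
                     for p in sorted((root / folder).glob("*.md"))]
    return "\n\n---\n\n".join(s for s in sections if s)

def ask(person: str, question: str) -> str:
    """A short question, answered against a long, pre-assembled context."""
    client = anthropic.Anthropic()
    reply = client.messages.create(
        model="claude-sonnet-4-20250514",  # any current Claude model
        max_tokens=1024,
        messages=[{"role": "user",
                   "content": f"{person_context(person)}\n\n{question}"}],
    )
    return reply.content[0].text
```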
A generic tool can do none of this automatically. You could provide all of it manually — paste the notes, describe the context, supply the history. Most people do not, because the overhead is too high. So they ask vaguer questions and get vaguer answers. The assistant is not smarter than Claude.ai. It has more to work with.
When you ask a generic AI tool about a direct report, it answers from what it knows about management in general. The response is coherent, often reasonable, and specific to no one. That is inference: the model reasoning from patterns in its training data toward a plausible answer about a situation it cannot actually see.
When you ask COSTA the same question, it reads the last six months of actual notes before responding. It knows the last three sessions circled the same problem without resolution. It knows a commitment was made in February and not followed up. It knows the pattern that no individual note makes visible on its own. That is recall: the model working from your actual record, not from general knowledge about people like yours.
The difference is not the model. It is what the model is reading. Inference produces answers that are plausible in general. Recall produces answers that are accurate in particular. For the questions a manager asks continuously — how is this person doing, what did we agree, what should I prepare — the gap between plausible and accurate is the whole point.
The same distinction holds in the automatic layer. The brief is not inference about what someone with your calendar and job title might care about today. It reads your actual task list, your actual calendar, the actual notes from the actual relationships those meetings involve. The output is specific because the input is specific. Specificity is a function of the corpus, not the model.
Recall is not binary. A DCM with one day of context recalls one day. The answer is more specific than a generic tool's, but not by much. The corpus is too thin to show patterns. What it knows about a person is a first impression, not a picture.
At six months the picture is different. The model has read sixty or more notes on the same relationship. It knows which issues recur without resolving. It knows which commitments were made and not followed up. It knows how a person responds under pressure — not because it infers this from general knowledge about people, but because it has observed it across dozens of sessions. That is not a property of the model. It is a property of the corpus.
The quality of recall scales with the depth of the corpus. On day one a DCM is marginally better than a generic tool for questions about your people and your work. At six months it is categorically better, because it has something a generic tool can never have: a record of what actually happened.
Building a corpus inside a specific application creates a dependency. The honest version of this is: you are accumulating something valuable inside a system you do not control, and that is worth naming plainly rather than glossing over.
The corpus itself is not the dependency. Everything COSTA stores is plain Markdown, on your machine, with no proprietary format. If the application were abandoned tomorrow, the notes would still be there — readable in any editor, greppable, portable, complete. The data does not belong to the application. That is a deliberate design decision, made precisely because this risk is real.
The extraction layer is the dependency. The briefs, the meeting prep synthesis, the proactive surfacing — these are produced by COSTA's processing of the corpus on a schedule. They do not port automatically. A richer corpus in a different system does not automatically become a richer brief. The value extraction mechanism is coupled to the application even though the underlying data is not.
That is the bet worth understanding. Not that COSTA will be the only tool that can process these files — any sufficiently capable system could eventually do what COSTA does with the same corpus — but that building the corpus now, with the structure COSTA enforces, is worth doing before the question of which tool processes it is fully settled. The corpus is the durable asset. The extraction layer is the current best way to use it, and the one most likely to improve.
COSTA is a Deep Context Model. A Large Language Model is broad by design: trained on vast general knowledge, optimised to respond well to any question from any person about any subject. A Deep Context Model is specific by design: built around a single person's corpus, optimised to answer questions that only make sense with that corpus present.
Worth saying plainly: what this describes is retrieval-augmented generation on a curated personal corpus. The underlying model is an LLM. The retrieval layer is structured Markdown files. A technically capable person could build something similar themselves — a folder of notes, a script to prepopulate a context window, a cron job to generate a morning summary. The claim is not that this architecture is novel. The claim is narrower: that the combination of an opinionated schema, a persistent corpus, and an automatic processing layer produces something that no amount of clever prompting of a stateless tool can replicate — and that the hard part is not the technology but the design decisions about what to capture, how to structure it, and what to surface without being asked. Those decisions are where COSTA's value actually lives.
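To make that concession concrete: the do-it-yourself version really is small. A sketch, every name illustrative, of a folder of Markdown, a script that prepopulates a context window, and one crontab line to run it each morning.

```python
#!/usr/bin/env python3
"""Prepopulate a context window from a folder of notes.

Scheduled with a single crontab line, e.g. weekday mornings:
    0 7 * * 1-5  python3 ~/bin/context.py "Write my morning brief."
"""
import sys
from pathlib import Path

NOTES = Path.home() / "notes"  # any folder of Markdown files

def build_prompt(question: str) -> str:
    """Every note, labelled by path, followed by the question."""
    corpus = "\n\n".join(
        f"## {p.relative_to(NOTES)}\n{p.read_text()}"
        for p in sorted(NOTES.rglob("*.md"))
    )
    return f"{corpus}\n\n{question}"

if __name__ == "__main__":
    # Pipe the output into any model you like; the technology is the
    # easy part. The design decisions -- what to capture, how to
    # structure it, what to surface unasked -- are what this script
    # does not give you.
    print(build_prompt(" ".join(sys.argv[1:]) or "Write my morning brief."))
```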
The comparison between a DCM and an LLM is not a comparison between two versions of a chat interface. It is a comparison between a tool with one layer and a tool with three. A generic AI tool gives you the interactive layer. Its answers are as good as the context you bring to each session. That context is yours to assemble, every time, with no persistence between sessions and nothing surfacing unless you ask. Every answer is inference, because that is all there is.
COSTA adds the corpus — the infrastructure that accumulates context over time — and the automatic layer that processes it continuously without being asked. The model is the same. What is different is what the model has access to, and what the system does before you arrive at the question.
The gap between them compounds with every day that more goes into the system. That is the argument for building the infrastructure before asking questions of it.