COSTA is a Deep Context Model. A Large Language Model is broad by design. A Deep Context Model is specific by design. The difference is not capability. It is orientation.
One-to-one notes, meeting transcripts, tasks, the calendar, the journal. These files exist in almost every working life, in some form. The question is what happens to them. In most tools they are attachments — documents you bring to a conversation, paste into a window, or upload for a single session. When the session ends, they are gone. The next session starts from nothing again.
COSTA treats these files as a corpus. They are stored as plain Markdown, on your machine, in a consistent structure. That structure is not incidental. It is what allows both AI layers built on top of them to work. A folder of unstructured notes contains the same information. It does not have the same queryability.
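The difference is concrete enough to show. A minimal sketch in Python, assuming a hypothetical layout of `people/<name>/1-1/YYYY-MM-DD.md` — the path scheme and folder names here are illustrative, not COSTA's actual schema:

```python
from pathlib import Path

# Hypothetical layout, for illustration only:
#   corpus/people/<name>/1-1/<YYYY-MM-DD>.md
CORPUS = Path.home() / "corpus"

def one_to_ones(person: str, since: str) -> list[Path]:
    """Every 1-1 note for a person on or after an ISO date.

    This works because the filenames are ISO dates: lexicographic
    order is chronological order, so a string comparison is enough.
    """
    folder = CORPUS / "people" / person / "1-1"
    return sorted(p for p in folder.glob("*.md") if p.stem >= since)

# "What have Alex and I discussed since September?" is one call.
# Against an unstructured folder, the same question means reading
# and classifying every file first.
notes = one_to_ones("alex", since="2024-09-01")
```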
The value of the corpus is deferred. On day one it contains one entry. At six months it contains a picture of every working relationship, every open commitment, and a record of how the work has felt to do. That is not a note-taking application. It is an accumulation engine, and the AI layers are what it accumulates for.
The brief generates each morning without prompting. It reads the task list, the calendar for the week, the 1-1 notes for anyone appearing in today's meetings, and a selection of recent briefs to avoid repetition. It produces a synthesis before the working day starts. You did not ask for it.
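A sketch of the shape of that job, not COSTA's implementation: a scheduled script that stitches those sources into a single prompt and asks the model for the synthesis. The folder layout is the same hypothetical one as above, and the model ID is just a current Claude model; the client call is the standard Anthropic messages API.

```python
from datetime import date
from pathlib import Path

import anthropic  # the document names Claude as the underlying model

CORPUS = Path.home() / "corpus"  # hypothetical root, as above

def read_folder(folder: Path) -> str:
    """Concatenate every Markdown file in a folder, oldest first."""
    return "\n\n".join(p.read_text() for p in sorted(folder.glob("*.md")))

def morning_brief() -> str:
    # The same inputs the text names: tasks, the week's calendar, and
    # recent briefs so the model can avoid repeating itself. (Pulling
    # 1-1 notes for today's attendees is omitted here for brevity.)
    context = "\n\n".join([
        read_folder(CORPUS / "tasks"),
        read_folder(CORPUS / "calendar"),
        read_folder(CORPUS / "briefs"),
    ])
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env
    reply = client.messages.create(
        model="claude-sonnet-4-20250514",  # any current Claude model
        max_tokens=1024,
        messages=[{"role": "user", "content": (
            f"{context}\n\nWrite a brief for {date.today()}: what matters "
            "today, what is at risk, what was agreed and not yet done."
        )}],
    )
    return reply.content[0].text

if __name__ == "__main__":
    # Run from cron before the working day starts; no one prompts it.
    (CORPUS / "briefs" / f"{date.today()}.md").write_text(morning_brief())
```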
Meeting prep works the same way. Opening a person's page assembles a synthesis from every previous note involving them before you read a word. Task extraction runs on incoming meeting transcripts: commitments made in a meeting become tasks because they were captured, not because you transferred them manually. Org sentiment analysis runs weekly over recent notes and produces a scored read on team health.
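Task extraction has the same shape. A hedged sketch, with an illustrative prompt and the same hypothetical folder layout: the transcript goes in, and whatever commitments the model finds come out as task entries.

```python
from pathlib import Path

import anthropic

CORPUS = Path.home() / "corpus"  # same hypothetical root as above

def extract_tasks(transcript: Path) -> None:
    """Append commitments found in a meeting transcript to the task list.

    Commitments become tasks because they were captured in the
    transcript, not because anyone transferred them by hand.
    """
    client = anthropic.Anthropic()
    reply = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=512,
        messages=[{"role": "user", "content": (
            f"{transcript.read_text()}\n\n"
            "List every concrete commitment made above, one per line, as "
            "'- [ ] <who>: <what> (due <when, if stated>)'. "
            "If there are none, reply with an empty message."
        )}],
    )
    tasks = reply.content[0].text.strip()
    if tasks:
        with (CORPUS / "tasks" / "inbox.md").open("a") as f:
            f.write(f"\n{tasks}\n")
```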
This is the layer that has no equivalent in a generic AI tool. ChatGPT surfaces nothing proactively. It waits. COSTA runs on the corpus continuously and surfaces what is relevant when it is relevant. The difference is not capability. It is architecture.
The assistant panel is the part of COSTA that most closely resembles a generic AI tool. It is a chat interface. You ask questions, it responds. The underlying model is Claude — the same model available on Claude.ai.
The difference is the context assembled before you ask. When you open the assistant on a person's page, it has already loaded every 1-1 note for that person, their context file, and any relevant meeting notes. When you ask about an open project, it reads the linked document history before answering. When you ask it to help prep a message, it reads the writing style guide and applies it without being told to.
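A sketch of that assembly step, under the same hypothetical layout as before. The work happens before the question is sent: the user types a short question, and the system supplies the long context.

```python
from pathlib import Path

import anthropic

CORPUS = Path.home() / "corpus"  # same hypothetical layout as above

def person_context(person: str) -> str:
    """Everything the assistant preloads when a person's page opens."""
    root = CORPUS / "people" / person
    context_file = root / "context.md"
    sections = [context_file.read_text() if context_file.exists() else ""]
    for folder in ("1-1", "meetings"):
        sections += [p.read_text()
                     for p in sorted((root / folder).glob("*.md"))]
    return "\n\n---\n\n".join(s for s in sections if s)

def ask(person: str, question: str) -> str:
    """A short question, answered against a long, pre-assembled context."""
    client = anthropic.Anthropic()
    reply = client.messages.create(
        model="claude-sonnet-4-20250514",  # any current Claude model
        max_tokens=1024,
        messages=[{"role": "user",
                   "content": f"{person_context(person)}\n\n{question}"}],
    )
    return reply.content[0].text
```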
A generic tool can do none of this automatically. You could provide all of it manually — paste the notes, describe the context, supply the history. Most people do not, because the overhead is too high. So they ask vaguer questions and get vaguer answers. The assistant is not smarter than Claude.ai. It has more to work with.
When you ask a generic AI tool about a direct report, it answers from what it knows about management in general. The response is coherent, often reasonable, and specific to no one. That is inference: the model reasoning from patterns in its training data toward a plausible answer about a situation it cannot actually see.
When you ask COSTA the same question, it reads the last six months of actual notes before responding. It knows the last three sessions circled the same problem without resolution. It knows a commitment was made in February and not followed up. It knows the pattern that no individual note makes visible on its own. That is recall: the model working from your actual record, not from general knowledge about people like yours.
The difference is not the model. It is what the model is reading. Inference produces answers that are plausible in general. Recall produces answers that are accurate in particular. For the questions a manager asks continuously — how is this person doing, what did we agree, what should I prepare — the gap between plausible and accurate is the whole point.
The same distinction holds in the automatic layer. The brief is not inference about what someone with your calendar and job title might care about today. It reads your actual task list, your actual calendar, the actual notes from the actual relationships those meetings involve. The output is specific because the input is specific. Specificity is a function of the corpus, not the model.
Recall is not binary. A DCM with one day of context recalls one day. The answer is more specific than a generic tool's, but not by much. The corpus is too thin to show patterns. What it knows about a person is a first impression, not a picture.
At six months the picture is different. The model has read sixty or more notes on the same relationship. It knows which issues recur without resolving. It knows which commitments were made and not followed up. It knows how a person responds under pressure — not because it infers this from general knowledge about people, but because it has observed it across dozens of sessions. That is not a property of the model. It is a property of the corpus.
The quality of recall scales with the depth of the corpus. On day one a DCM is marginally better than a generic tool for questions about your people and your work. At six months it is categorically better, because it has something a generic tool can never have: a record of what actually happened.
Building a corpus inside a specific application creates a dependency. The honest version of this is: you are accumulating something valuable inside a system you do not control, and that is worth naming plainly rather than glossing over.
The corpus itself is not the dependency. Everything COSTA stores is plain Markdown, on your machine, with no proprietary format. If the application were abandoned tomorrow, the notes would still be there — readable in any editor, greppable, portable, complete. The data does not belong to the application. That is a deliberate design decision, made precisely because this risk is real.
The extraction layer is the dependency. The briefs, the meeting prep synthesis, the proactive surfacing — these are produced by COSTA's processing of the corpus on a schedule. They do not port automatically. A richer corpus in a different system does not automatically become a richer brief. The value extraction mechanism is coupled to the application even though the underlying data is not.
That is the bet worth understanding. Not that COSTA will be the only tool that can process these files — any sufficiently capable system could eventually do what COSTA does with the same corpus — but that building the corpus now, with the structure COSTA enforces, is worth doing before the question of which tool processes it is fully settled. The corpus is the durable asset. The extraction layer is the current best way to use it, and the one most likely to improve.
COSTA is a Deep Context Model. A Large Language Model is broad by design: trained on vast general knowledge, optimised to respond well to any question from any person about any subject. A Deep Context Model is specific by design: built around a single person's corpus, optimised to answer questions that only make sense with that corpus present.
Worth saying plainly: what this describes is retrieval-augmented generation on a curated personal corpus. The underlying model is an LLM. The retrieval layer is structured Markdown files. A technically capable person could build something similar themselves — a folder of notes, a script to prepopulate a context window, a cron job to generate a morning summary. The claim is not that this architecture is novel. The claim is narrower: that the combination of an opinionated schema, a persistent corpus, and an automatic processing layer produces something that no amount of clever prompting of a stateless tool can replicate — and that the hard part is not the technology but the design decisions about what to capture, how to structure it, and what to surface without being asked. Those decisions are where COSTA's value actually lives.
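To make that concession concrete: the do-it-yourself version really is small. A sketch, every name illustrative, of a folder of Markdown, a script that prepopulates a context window, and one crontab line to run it each morning.

```python
#!/usr/bin/env python3
"""Prepopulate a context window from a folder of notes.

Scheduled with a single crontab line, e.g. weekday mornings:
    0 7 * * 1-5  python3 ~/bin/context.py "Write my morning brief."
"""
import sys
from pathlib import Path

NOTES = Path.home() / "notes"  # any folder of Markdown files

def build_prompt(question: str) -> str:
    """Every note, labelled by path, followed by the question."""
    corpus = "\n\n".join(
        f"## {p.relative_to(NOTES)}\n{p.read_text()}"
        for p in sorted(NOTES.rglob("*.md"))
    )
    return f"{corpus}\n\n{question}"

if __name__ == "__main__":
    # Pipe the output into any model you like; the technology is the
    # easy part. The design decisions -- what to capture, how to
    # structure it, what to surface unasked -- are what this script
    # does not give you.
    print(build_prompt(" ".join(sys.argv[1:]) or "Write my morning brief."))
```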
The comparison between a DCM and an LLM is not a comparison between two versions of a chat interface. It is a comparison between a tool with one layer and a tool with three. A generic AI tool gives you the interactive layer. Its answers are as good as the context you bring to each session. That context is yours to assemble, every time, with no persistence between sessions and nothing surfacing unless you ask. Every answer is inference, because that is all there is.
COSTA adds the corpus — the infrastructure that accumulates context over time — and the automatic layer that processes it continuously without being asked. The model is the same. What is different is what the model has access to, and what the system does before you arrive at the question.
The gap between them compounds with every day that more goes into the system. That is the argument for building the infrastructure before asking questions of it.