Engineering · June 30, 2026 · 4 min read

The transcript table is append-only. The database enforces it, not us.

Every dispute about what an AI agent said gets settled by the transcript, which makes the transcript worth protecting from everyone, including the vendor. Why conversation logs should be append-only at the database layer, what the column-whitelist exception looks like, and the one-line question that tests whether a vendor's 'immutable logs' claim is architecture or policy.

Vorel Engineering · Last updated July 5, 2026

TL;DR

The conversation log is the evidence base for every dispute, audit, and quality investigation involving your AI agent, which means its integrity has to hold against every writer, including the vendor's own application code and the vendor's own engineers on a bad day. Application-level discipline ('our code never updates messages') is a policy, and policies have exceptions under pressure. The durable version is a database-level rule: a trigger on the messages table that rejects UPDATE and DELETE outright, with a narrow whitelist for the few operational columns that legitimately change after the fact (delivery status, moderation flags), and a separately-audited administrative role for the rare lawful correction. Immutability you can demonstrate with a failed write beats immutability you assert in a compliance document.

Six months into running AI agents on real phone lines and chat widgets, a pattern becomes clear: whenever something goes wrong enough that people are upset (a caller says they were promised a refund, a manager swears the agent quoted the wrong price, a regulator asks what exactly was said on March 3rd), every party converges on the same artifact. The transcript. It is the ground truth everyone appeals to, which means the interesting question is not whether you keep transcripts. It is what, exactly, prevents a transcript from being different today than it was on March 3rd.

Policy is not protection

Most systems answer that question with application discipline: the codebase contains no UPDATE statements against the messages table, everyone agrees editing transcripts is unthinkable, and the compliance page says logs are immutable. This holds right up until it is inconvenient. A migration needs to backfill a column. A bug wrote garbled text and someone wants to 'clean it up.' A well-meaning engineer with production access and a support escalation finds it faster to fix the row than fix the process. None of these people are malicious. Every one of these fixes is an edit to the evidence.

The problem with 'our code never modifies transcripts' is that it is a statement about the current contents of a repository, enforced by review culture. The claim a buyer actually needs is stronger: this table cannot be modified, by anyone's code, including code that has not been written yet.


Put the rule where the data lives

The mechanical fix is small: a trigger on the messages table that fires before UPDATE and DELETE and rejects the statement. Not a soft-delete convention, not an ORM guard, a database object that turns mutation into an error. Once it exists, the immutability claim changes category. It is no longer a description of how the application behaves; it is a property of the substrate the application runs on. An engineer under deadline pressure does not get a different answer than an attacker with a stolen connection string: the write fails.
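A minimal sketch of such a trigger, using Python's built-in sqlite3 module as a stand-in database so the example runs anywhere. The table name, column names, and error text are illustrative assumptions, not the schema described in this article, and a production database such as PostgreSQL would express the same rule in its own trigger syntax.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from a real system.
SCHEMA = """
CREATE TABLE messages (
    id         INTEGER PRIMARY KEY,
    speaker    TEXT NOT NULL,
    body       TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER messages_no_update
BEFORE UPDATE ON messages
BEGIN
    SELECT RAISE(ABORT, 'messages is append-only: UPDATE rejected');
END;

CREATE TRIGGER messages_no_delete
BEFORE DELETE ON messages
BEGIN
    SELECT RAISE(ABORT, 'messages is append-only: DELETE rejected');
END;
"""


def try_write(conn, sql, params=()):
    """Run one statement; return the database's error text if refused, else None."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return None
    except sqlite3.DatabaseError as exc:
        return str(exc)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Appending is allowed.
insert_error = try_write(
    conn,
    "INSERT INTO messages (speaker, body) VALUES (?, ?)",
    ("agent", "Your refund is approved."),
)

# Rewriting or removing history is refused by the database itself.
update_error = try_write(
    conn, "UPDATE messages SET body = ? WHERE id = 1", ("No refund was promised.",)
)
delete_error = try_write(conn, "DELETE FROM messages WHERE id = 1")

print(update_error)
print(delete_error)
```

The refusal comes from the database object, not from `try_write`: any other client that connects to the same database and issues the same UPDATE gets the same error.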

The demo is the best part. When a security review asks about log integrity, the answer is a live session: run an UPDATE against a message row as the application role, watch the database refuse it, read the error. That sixty-second demonstration ends a line of questioning that a policy document can only prolong. It is the difference between 'we promise' and 'we can't.'

The whitelist, or why purity fails

A naive version of this rule breaks production in a week, because a few columns on a message row legitimately change after the row is written. A WhatsApp message gets delivered, then read: its delivery status advances. A moderation review flags a message after the fact. These are annotations about the message's lifecycle, not revisions of what was said. The trigger has to draw exactly that line: the content columns (text, speaker, timestamps, the evidence trail) are frozen forever, while a short, explicit whitelist of operational columns stays writable. The whitelist is the design; it forces the schema to be honest about which fields are record and which are state, and every column added to it is a decision someone has to defend in review.
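One way to sketch the whitelist, again with sqlite3 as a stand-in. The column names and the `WRITABLE_COLUMNS` set are assumptions made for illustration. The trigger is generated from the table's column list minus the whitelist, so anything not explicitly listed as writable is frozen.

```python
import sqlite3

# Hypothetical whitelist: the only columns allowed to change after insert.
WRITABLE_COLUMNS = {"delivery_status", "moderation_flag"}


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE messages (
            id              INTEGER PRIMARY KEY,
            speaker         TEXT NOT NULL,
            body            TEXT NOT NULL,
            created_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            delivery_status TEXT NOT NULL DEFAULT 'sent',
            moderation_flag INTEGER NOT NULL DEFAULT 0
        );
    """)
    # Freeze every column that is NOT on the whitelist.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(messages)")]
    frozen = ", ".join(c for c in columns if c not in WRITABLE_COLUMNS)
    conn.executescript(f"""
        CREATE TRIGGER messages_frozen_columns
        BEFORE UPDATE OF {frozen} ON messages
        BEGIN
            SELECT RAISE(ABORT, 'messages: content columns are frozen');
        END;

        CREATE TRIGGER messages_no_delete
        BEFORE DELETE ON messages
        BEGIN
            SELECT RAISE(ABORT, 'messages: DELETE rejected');
        END;
    """)
    return conn


conn = open_db()
conn.execute("INSERT INTO messages (speaker, body) VALUES ('agent', 'The price is $40.')")
conn.commit()

# Lifecycle annotation: allowed.
conn.execute("UPDATE messages SET delivery_status = 'read' WHERE id = 1")
conn.commit()

# Revision of what was said: refused.
try:
    conn.execute("UPDATE messages SET body = 'The price is $30.' WHERE id = 1")
    content_error = None
except sqlite3.DatabaseError as exc:
    content_error = str(exc)
    conn.rollback()

print(content_error)
```

In SQLite, an `UPDATE OF` trigger fires whenever a listed column appears in the SET clause, even if the new value equals the old one, so a statement that touches both a frozen and a writable column is rejected as a whole. Because the frozen list is computed when the trigger is created, this sketch would need the trigger regenerated after any schema change.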

The other escape hatch that has to exist is lawful correction: a data-subject erasure request under GDPR-style law is a legitimate reason for message content to change (redaction is a write). This does not go through the application role. It goes through a separate administrative role that bypasses the trigger, is not reachable from the application's connection pool, and whose every use is itself logged. The result is a two-tier truth: the application physically cannot edit history, and the rare administrative edit leaves its own history behind.
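SQLite has no roles, so the sketch below cannot reproduce a separate administrative role. It substitutes a different, simpler mechanism to illustrate the same property: an UPDATE is refused unless an unused audit entry already names the row, and applying the correction consumes that entry. The `correction_audit` table, its columns, and the `redact` helper are all hypothetical. In a real deployment the assumption is that only the administrative role, on its own connection, could write to the audit table.

```python
import sqlite3

SCHEMA = """
CREATE TABLE messages (
    id      INTEGER PRIMARY KEY,
    speaker TEXT NOT NULL,
    body    TEXT NOT NULL
);

-- Hypothetical audit table. Assumption: in production only the
-- administrative role can insert here; SQLite cannot enforce that.
CREATE TABLE correction_audit (
    id          INTEGER PRIMARY KEY,
    message_id  INTEGER NOT NULL,
    operator    TEXT NOT NULL,
    legal_basis TEXT NOT NULL,
    applied_at  TEXT
);

-- Reject every UPDATE unless an unused audit entry names this row.
CREATE TRIGGER messages_no_update
BEFORE UPDATE ON messages
BEGIN
    SELECT RAISE(ABORT, 'messages is append-only: UPDATE rejected')
    WHERE NOT EXISTS (
        SELECT 1 FROM correction_audit
        WHERE message_id = OLD.id AND applied_at IS NULL
    );
END;

-- Consume the audit entry so it authorises exactly one correction.
CREATE TRIGGER messages_correction_applied
AFTER UPDATE ON messages
BEGIN
    UPDATE correction_audit
    SET applied_at = CURRENT_TIMESTAMP
    WHERE message_id = OLD.id AND applied_at IS NULL;
END;
"""


def app_update(conn, message_id, body):
    """Application path: returns the error text if the write is refused."""
    try:
        with conn:
            conn.execute("UPDATE messages SET body = ? WHERE id = ?", (body, message_id))
        return None
    except sqlite3.DatabaseError as exc:
        return str(exc)


def redact(conn, message_id, operator, legal_basis):
    """Administrative path: the audit row and the redaction commit together."""
    with conn:
        conn.execute(
            "INSERT INTO correction_audit (message_id, operator, legal_basis) VALUES (?, ?, ?)",
            (message_id, operator, legal_basis),
        )
        conn.execute("UPDATE messages SET body = '[redacted]' WHERE id = ?", (message_id,))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
with conn:
    conn.execute("INSERT INTO messages (speaker, body) VALUES ('caller', 'My address is 12 Elm St.')")

before = app_update(conn, 1, "edited")             # refused: no audit entry
redact(conn, 1, "dpo@example.com", "erasure request")
after = app_update(conn, 1, "edited again")        # refused: entry already consumed
```

The audit row deliberately stores who acted and on what basis, not the original text, since keeping the erased content would defeat the erasure.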

Why this matters more for AI than it did before

Append-only logs are an old idea. What changed is who writes the log and what rides on it. When the speaker is an AI agent, the transcript is not just a customer-service record; it is the input to every quality system downstream. Per-turn graders re-read it to catch hallucinations. Replay testing re-runs it to validate prompt changes. Disputes about what the agent promised are settled from it. If the transcript can drift, every one of those systems inherits the drift, and a vendor grading its own mutable homework is not producing evidence, it is producing marketing. Freeze the transcript and the whole tower above it (grading, replay, audit, dispute) stands on rock.

The one-line test

If a vendor tells you their conversation logs are immutable, ask them the layer: 'Show me what happens when your own application code tries to update a message row.' The strong answer is an error message and the DDL of the trigger that produced it. The weak answer is a description of code review practices. Both vendors will pass your procurement questionnaire, because the questionnaire asks whether logs are immutable and both will say yes. Only one of them has made the yes physical.
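The question can be turned into a small probe. The sketch below builds two hypothetical databases with sqlite3, one protected by a trigger and one protected only by convention, and attempts the same UPDATE against each. The function names and the schema are assumptions made for illustration. The probe always rolls back, so it never alters data even where the write would have succeeded.

```python
import sqlite3


def make_db(with_trigger: bool) -> sqlite3.Connection:
    """Build a throwaway database, optionally guarded by an append-only trigger."""
    conn = sqlite3.connect(":memory:")
    script = """
        CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
        INSERT INTO messages (body) VALUES ('The price is $40.');
    """
    if with_trigger:
        script += """
        CREATE TRIGGER messages_no_update
        BEFORE UPDATE ON messages
        BEGIN
            SELECT RAISE(ABORT, 'messages is append-only: UPDATE rejected');
        END;
        """
    conn.executescript(script)
    return conn


def probe_update(conn: sqlite3.Connection) -> str:
    """Attempt an UPDATE on a message row, then roll back regardless of outcome."""
    try:
        conn.execute("UPDATE messages SET body = body || ' (edited)' WHERE id = 1")
    except sqlite3.DatabaseError as exc:
        return f"refused: {exc}"
    finally:
        conn.rollback()
    return "accepted: immutability here is policy, not architecture"


enforced = make_db(with_trigger=True)
by_convention = make_db(with_trigger=False)

print(probe_update(enforced))
print(probe_update(by_convention))
```

Both databases would answer "yes" to a questionnaire item about immutable logs; only the first produces an error when the write is attempted.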
