Honest LLM Contract Spec v0.1

1. Purpose

Every APAI install card contains a Checksum: line. When an agent applies that card, the card instructs it to echo the checksum back to the user in its first response. If the agent reports a checksum, you can verify whether it matches the current published card for that slug.

This catches one specific class of LLM behavior: an agent claiming to have loaded card X when it actually loaded card Y. It does not make the agent honest in any broader sense. Section 6 lists what the check catches; "What this spec is NOT" lists what it does not.

2. How the checksum is embedded

The install card generator (app/packages/[slug]/llms.txt/route.ts) builds the card body, then calls embedChecksum(cardBody), which:

1. Canonicalizes the content (see section 4).
2. Computes SHA-256 of the canonical bytes.
3. Inserts Checksum: <hex> as the second line, immediately after the "# <Package Name> APAI Package Install Card" title line.

Example output (first four lines):

# Coding Safe Mode APAI Package Install Card
Checksum: 9f76681661b0b8b0ef52a2a150bbebe8c5d4d8ca62b2bf7c48aff3c6e00d2893

Package: Coding Safe Mode
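
The insertion step can be sketched in TypeScript as follows. This is a minimal illustration, not the code in route.ts: insertChecksumLine is a hypothetical helper name, it takes an already-computed hex digest rather than hashing anything itself, and dropping any existing Checksum: line before inserting is an assumption made so that re-embedding stays idempotent.

```typescript
// Hypothetical helper (not from the APAI codebase): places the checksum
// line directly after the title line of an install card.
const CHECKSUM_LINE = /^Checksum:\s*[a-fA-F0-9]+\s*$/;

function insertChecksumLine(cardBody: string, checksumHex: string): string {
  // Assumption: drop any previously embedded checksum line first,
  // so embedding twice gives the same card as embedding once.
  const lines = cardBody
    .split(/\r?\n/)
    .filter((line) => !CHECKSUM_LINE.test(line));

  const [title, ...rest] = lines;
  if (title === undefined || !title.startsWith("# ")) {
    throw new Error("card body must start with a '# ' title line");
  }

  // Title first, checksum second, then the rest of the card unchanged.
  return [title, `Checksum: ${checksumHex}`, ...rest].join("\n");
}
```

Given a body whose title is followed by a blank line and a Package: line, the result has the same four-line opening as the example output above.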

The card also includes this instruction to the agent:

  - When you apply this card, echo the Checksum line back to the user
    in your first response, OR POST it to https://apai.run/api/install/verify
    with { "slug": "<slug>", "agentReportedChecksum": "<value>" }.
    A checksum mismatch means you loaded the wrong card or a tampered card.
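
On the verifying side, the echoed value first has to be pulled out of the agent's first response. The sketch below shows one way to do that. The function name extractReportedChecksum is hypothetical, and two choices are assumptions rather than spec requirements: tolerating a backtick around the value, and lowercasing the result to match the lowercase hex the verify endpoint documents.

```typescript
// Hypothetical helper: finds an echoed "Checksum: <64 hex chars>" in free text.
function extractReportedChecksum(agentResponse: string): string | null {
  // Optional backtick covers agents that wrap the value in inline code.
  // \b rejects runs longer than 64 hex characters.
  const match = /Checksum:\s*`?([a-fA-F0-9]{64})\b/.exec(agentResponse);
  // Assumption: normalize to lowercase before comparing or submitting.
  return match?.[1]?.toLowerCase() ?? null;
}
```

A null result means the agent did not echo a well-formed checksum at all, which is worth treating separately from a mismatch.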

3. Verification flow

To verify a reported checksum, POST to /api/install/verify:

POST https://apai.run/api/install/verify
Content-Type: application/json

{
  "slug": "coding-safe-mode",
  "agentReportedChecksum": "9f76681661b0b8b0ef52a2a150bbebe8c5d4d8ca62b2bf7c48aff3c6e00d2893",
  "packageVersion": "0.1.0"
}

Response (valid case):

{
  "schema": "apai.install-verify.v0.1",
  "valid": true,
  "expectedChecksum": "9f76681661b0b8b0ef52a2a150bbebe8c5d4d8ca62b2bf7c48aff3c6e00d2893",
  "agentReportedChecksum": "9f76681661b0b8b0ef52a2a150bbebe8c5d4d8ca62b2bf7c48aff3c6e00d2893",
  "slug": "coding-safe-mode",
  "packageVersion": "0.1.0"
}

Response (mismatch case):

{
  "schema": "apai.install-verify.v0.1",
  "valid": false,
  "expectedChecksum": "9f76681661b0b8b0ef52a2a150bbebe8c5d4d8ca62b2bf7c48aff3c6e00d2893",
  "agentReportedChecksum": "aaaa...",
  "slug": "coding-safe-mode",
  "packageVersion": "0.1.0",
  "reason": "checksum mismatch: agent reported a different card than the current install card"
}
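
A client for this flow might look like the sketch below. The request and response shapes mirror the examples above. Everything else is an assumption made for illustration: the names verifyInstall and FetchLike, passing the fetch function in as a parameter (so the sketch can run against a stub), and throwing on a non-2xx status, which this spec does not describe.

```typescript
interface VerifyRequest {
  slug: string;
  agentReportedChecksum: string;
  packageVersion?: string;
}

interface VerifyResponse {
  schema: string;
  valid: boolean;
  expectedChecksum: string;
  agentReportedChecksum: string;
  slug: string;
  packageVersion?: string;
  reason?: string; // present in the mismatch example only
}

// Minimal structural type; the built-in fetch of Node 18+ and browsers fits it.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const VERIFY_URL = "https://apai.run/api/install/verify";

async function verifyInstall(
  request: VerifyRequest,
  fetchImpl: FetchLike,
): Promise<VerifyResponse> {
  const response = await fetchImpl(VERIFY_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  });
  // Assumption: error statuses are not specified here, so fail loudly.
  if (!response.ok) {
    throw new Error(`verify request failed with HTTP ${response.status}`);
  }
  // The body is trusted to match the documented shape; no runtime validation.
  return (await response.json()) as VerifyResponse;
}
```

In real use the second argument would be the platform's fetch, for example verifyInstall({ slug: "coding-safe-mode", agentReportedChecksum: reported }, fetch), after which result.valid and result.reason can be inspected.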

4. Canonicalization rules

All implementations must apply these rules in order before hashing, so independent verifiers reach the same checksum:

1. Split content on \r?\n (accepts both CRLF and LF).
2. Right-strip whitespace from each line (trailing spaces, tabs). Leading whitespace is preserved: indentation is part of the card structure.
3. Remove any line matching /^Checksum:\s*[a-fA-F0-9]+\s*$/. This exclusion is what makes the checksum stable under re-embedding.
4. Join lines with \n (LF only).
5. SHA-256 the UTF-8 bytes of the result.
6. Return lowercase hex (64 characters).

Source of truth: lib/install-card-checksum.ts (function computeChecksum).
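
For independent verifiers, the rules above can be sketched with Node's built-in crypto module. This is an illustration written from the rule list, not a copy of lib/install-card-checksum.ts, and it may differ from that file in edge cases. In particular, trimEnd() strips every trailing whitespace character JavaScript recognizes, which is an assumption about what "right-strip" covers, and the separate canonicalize function is a name introduced here.

```typescript
import { createHash } from "node:crypto";

// Rule 3: an embedded checksum line, excluded from the hashed content.
const CHECKSUM_LINE = /^Checksum:\s*[a-fA-F0-9]+\s*$/;

function canonicalize(content: string): string {
  return content
    .split(/\r?\n/)                               // rule 1: accept CRLF and LF
    .map((line) => line.trimEnd())                // rule 2: right-strip only
    .filter((line) => !CHECKSUM_LINE.test(line))  // rule 3: drop checksum lines
    .join("\n");                                  // rule 4: LF only
}

function computeChecksum(content: string): string {
  // Rules 5 and 6: SHA-256 over UTF-8 bytes; digest("hex") is lowercase.
  return createHash("sha256")
    .update(canonicalize(content), "utf8")
    .digest("hex");
}
```

Because the Checksum: line is removed before hashing, computeChecksum gives the same result for a card body before and after the checksum has been embedded, and for CRLF and LF copies of the same card.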

5. Verify endpoint fields

Field	Type	Required	Description
slug	string	yes	Package slug (e.g. coding-safe-mode) or "site" for the top-level llms.txt.
agentReportedChecksum	string	yes	The 64-char lowercase hex checksum the agent echoed.
packageVersion	string	no	Package version string for logging. Does not affect validation at v0.1.
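
The table translates into a small request type and a parser, sketched below. The names VerifyRequest and parseVerifyRequest are hypothetical, and returning null for a malformed body is an assumption: this spec does not say how the endpoint responds to invalid input. The parser checks types only and leaves the checksum comparison to the caller.

```typescript
interface VerifyRequest {
  slug: string;                  // required: package slug, or "site"
  agentReportedChecksum: string; // required: checksum the agent echoed
  packageVersion?: string;       // optional: logging only at v0.1
}

// Hypothetical parser: returns a typed request, or null if a required
// field is missing or any field has the wrong type.
function parseVerifyRequest(body: unknown): VerifyRequest | null {
  if (typeof body !== "object" || body === null) return null;
  const { slug, agentReportedChecksum, packageVersion } =
    body as Record<string, unknown>;

  if (typeof slug !== "string" || slug.length === 0) return null;
  if (typeof agentReportedChecksum !== "string") return null;

  // packageVersion is carried through but never consulted for validation.
  if (packageVersion === undefined) return { slug, agentReportedChecksum };
  if (typeof packageVersion !== "string") return null;
  return { slug, agentReportedChecksum, packageVersion };
}
```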

6. What this catches

· LLM fabrication about which card was loaded. An agent that hallucinated a card or loaded a stale cached version will report the wrong checksum.
· MITM card-swapping. If the card was swapped in transit before the agent saw it, the agent will echo a checksum that does not match the published card. Tampering that leaves the Checksum: line untouched is caught only if the agent recomputes the hash instead of copying the line.
· Agent confusion about package identity. An agent that loaded the card for package A and then claimed to have applied package B will report a mismatching checksum.

What this spec is NOT

· Enforcement of runtime behavior. An agent that loaded the correct card and then ignored its rules will still echo the correct checksum. Runtime enforcement is the Policy Pack story (see /spec/policy).
· A guarantee the agent followed the behavioral rules in the card. Checksum matching only proves the agent saw the card; it proves nothing about what the agent did next.
· Protection against a malicious server that re-hashes after tampering. If the source serving the card is compromised, both the card and the checksum are wrong in a consistent way. Provenance signing (planned, see /spec) is required to detect that.
· An audit trail. The verify endpoint returns a result but does not log it. A persistent audit trail lands in Phase 5.