Methodology

How Modelmark records AI model change

A source-backed record of meaningful AI model releases, updates, pricing, access, deprecations, and capability changes. Here's how a change becomes a record — shown through examples.

Methodology v1 · Public draft

The short version

Every record answers five questions.

What changed
The specific, factual change — stated plainly, not dramatised.
Who changed it
The lab or provider responsible for the model or family.
When it happened
The dates around the change, kept distinct from each other.
Where it came from
The source the change is traceable to, snapshotted when possible.
How confident we are
A required confidence label, so certainty is never assumed.

Methodology in practice

Illustrative

How four kinds of change get handled.

Generic examples — not real model or lab data. Each shows whether it enters the record, the preferred source, the confidence, and what happens when the facts change later.

A new model is released

Recorded

Record?: Yes — if it's source-backed and meaningful.
Source: Official release note, model card, docs, or model-hub page.
Confidence: Official, if a provider source directly supports it.
Fact: Model name, lab, release date, availability.
Interpretation: Why the release may matter.
If it changes: If details change later, supersede or correct the record.

A pricing change appears

Recorded

Record?: Yes — if pricing materially changes.
Source: Official pricing page or changelog.
Confidence: Official, if the provider page supports it.
Fact: Old and new price, unit, effective date.
Interpretation: Who may care about the change.
If it changes: If clarified later, add a correction or superseding record.

A model is deprecated

Recorded

Record?: Yes.
Source: Provider deprecation notice, migration guide, API docs, or changelog.
Confidence: Official, if the provider announces it.
Fact: Model affected, replacement if given, effective date.
Interpretation: Migration impact for people building on it.
If it changes: If the timeline changes, supersede with a newer record.

A benchmark claim appears

Recorded as a claim

Record?: Carefully — only as a claim.
Source: Official technical report or paper.
Confidence: Official or corroborated, depending on the source.
Fact: That the provider made the claim.
Interpretation: A claim is not independent verification.
If it changes: If disputed or revised, correct, supersede, or retract.

What enters · what stays out

Restraint keeps the record trustworthy.

Enters the record

New models, updates, and renames
Capability changes
Pricing changes
Availability and access changes
Deprecations and retirements
Open-weights releases
Benchmark claims — kept as claims

Stays out

Rumours without reliable sourcing
Founder drama, funding news, and company gossip
General AI market commentary
Social speculation about unreleased models
Benchmark chatter without a clear source or defined claim
Generic product launches that don't affect model access or behaviour
Minor documentation edits that don't change model understanding

Full event type list

New model released (model_released): A new model or model family is announced or made available.
Model updated (model_updated): An existing model receives a meaningful, documented update.
Model renamed (model_renamed): A model name, slug, version, or public identity changes.
Capability changed (capability_changed): Modalities, tool use, context window, reasoning mode, or supported uses change.
Pricing changed (pricing_changed): Input, output, cache, batch, image, audio, fine-tuning, or hosting prices change.
Availability changed (availability_changed): Access shifts across app, API, cloud provider, region, waitlist, or general availability.
Access changed (access_changed): Permissions, account requirements, rate limits, tiering, or gating change.
Deprecation announced (deprecation_announced): A provider signals future removal, replacement, or migration guidance.
Model retired (model_retired): A model is removed, disabled, or no longer available for new use.
Open weights released (open_weights_released): Weights are released, relicensed, mirrored, or made available through a hub.
Benchmark claim (benchmark_claim): A provider makes a material benchmark claim; it is recorded as a claim, not a finding.
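
The event types listed above form a closed set, which maps naturally onto an enumeration. The following is a minimal Python sketch, not Modelmark's implementation: the EventType class and the parse_event_type helper are hypothetical names introduced for illustration, and only the string identifiers come from the list.

```python
from enum import Enum


class EventType(str, Enum):
    """Identifiers taken from the event type list; the class name is illustrative."""

    MODEL_RELEASED = "model_released"
    MODEL_UPDATED = "model_updated"
    MODEL_RENAMED = "model_renamed"
    CAPABILITY_CHANGED = "capability_changed"
    PRICING_CHANGED = "pricing_changed"
    AVAILABILITY_CHANGED = "availability_changed"
    ACCESS_CHANGED = "access_changed"
    DEPRECATION_ANNOUNCED = "deprecation_announced"
    MODEL_RETIRED = "model_retired"
    OPEN_WEIGHTS_RELEASED = "open_weights_released"
    BENCHMARK_CLAIM = "benchmark_claim"


def parse_event_type(raw: str) -> EventType:
    """Normalise case and whitespace, then reject anything outside the list."""
    try:
        return EventType(raw.strip().lower())
    except ValueError:
        # Hypothetical policy: fail loudly instead of guessing a nearest match.
        raise ValueError(f"unknown event type: {raw!r}") from None
```

Rejecting unknown identifiers, rather than mapping them to a nearest match, keeps out-of-scope items such as funding news from entering the record by accident.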

Sources and confidence

Closest to the provider, and honest about certainty.

Source preference

1. Official provider docs, pricing, changelogs, and model cards
2. Official model hubs and provider listings where access actually lives
3. Corroborated third-party reporting, then clearly labelled inferences

A link alone is not enough. When possible, Modelmark stores a snapshot of the source at capture time so the record stays verifiable if the live page changes later.
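
A snapshot at capture time can be sketched as the page body stored alongside a content hash and a timestamp. This is an illustrative Python sketch under stated assumptions: the Snapshot type, the take_snapshot and has_changed helpers, and the choice of SHA-256 are all hypothetical, and fetching the page is deliberately left out so the example stays self-contained.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Snapshot:
    url: str
    captured_at: datetime  # timezone-aware UTC capture time
    sha256: str            # hex digest of the captured body
    body: bytes            # the captured content itself


def take_snapshot(url: str, body: bytes) -> Snapshot:
    """Store the body with its hash so the record stays verifiable later."""
    return Snapshot(
        url=url,
        captured_at=datetime.now(timezone.utc),
        sha256=hashlib.sha256(body).hexdigest(),
        body=body,
    )


def has_changed(previous: Snapshot, current_body: bytes) -> bool:
    """Compare a fresh fetch against the stored digest (assumed SHA-256)."""
    return hashlib.sha256(current_body).hexdigest() != previous.sha256
```

Keeping the body as well as the digest matters here: the digest shows that the live page has changed, while the stored body shows what it said when the record was made.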

Confidence labels

official: Directly supported by an official source from the lab, provider, or model owner.
corroborated: Supported by multiple independent sources that agree.
single_source: Supported by one non-official source.
inferred: Derived by Modelmark from observed source changes, and clearly labelled.

Unknown is better than guessed. If a field is unavailable, Modelmark marks it as unavailable rather than filling the gap with speculation.
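
The four labels can be read as a small decision procedure. The sketch below is one possible reading, not Modelmark's actual rule set: SourceRef and confidence_label are hypothetical names, independence is approximated by distinct publisher names, and the function assumes a reviewer has already confirmed that the sources agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRef:
    publisher: str     # who published the source (illustrative field)
    is_official: bool  # True if from the lab, provider, or model owner


def confidence_label(sources: list[SourceRef], derived_by_modelmark: bool = False) -> str:
    """Pick one of: official, corroborated, single_source, inferred."""
    if not sources:
        # No record without a source, so there is nothing to label.
        raise ValueError("no record without a source")
    if derived_by_modelmark:
        # Inferences are always labelled as such, whatever they were derived from.
        return "inferred"
    if any(s.is_official for s in sources):
        return "official"
    # Assumption: two or more distinct publishers count as independent sources.
    publishers = {s.publisher.strip().lower() for s in sources}
    return "corroborated" if len(publishers) >= 2 else "single_source"
```

Two articles from the same outlet count once under this sketch, so they yield single_source rather than corroborated.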

Full source hierarchy

01. Official provider docs, pricing pages, changelogs, model cards, release notes, or blog posts
02. Official model-hub repositories controlled by the lab or author
03. Official cloud/provider listings where access is actually provided
04. Research papers or technical reports from the provider or authors
05. Corroborated third-party reporting from reputable publications
06. Official social posts from labs or clearly identified model authors
07. Clearly labelled Modelmark inferences from documented source changes
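
When several sources support the same change, the hierarchy above suggests citing the highest-ranked one first. A minimal Python sketch of that ordering follows; the source type keys and the preferred_source helper are hypothetical names chosen for illustration, and only the ranking itself comes from the list.

```python
# Rank 1 is most preferred; keys are illustrative, the order follows the hierarchy.
SOURCE_RANK = {
    "official_provider_docs": 1,
    "official_model_hub": 2,
    "official_cloud_listing": 3,
    "provider_paper_or_report": 4,
    "corroborated_third_party": 5,
    "official_social_post": 6,
    "modelmark_inference": 7,
}


def preferred_source(source_types: list[str]) -> str:
    """Return the highest-ranked source type among those available."""
    if not source_types:
        raise ValueError("no record without a source")
    unknown = [s for s in source_types if s not in SOURCE_RANK]
    if unknown:
        # Source type is required, so an unrecognised type is an error, not a guess.
        raise ValueError(f"unknown source types: {unknown}")
    return min(source_types, key=SOURCE_RANK.__getitem__)
```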

Facts, interpretation & time

The record states what changed. The note says why it matters.

Facts

release date
source URL
pricing value
access status
context window
availability
event type

Interpretation

why it matters
useful-for tags
importance level
editorial note

API customers should eventually be able to request facts only — the factual record without interpretive guidance.
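
A facts-only view could be produced by dropping the interpretation fields from a record before it is returned. The Python sketch below assumes records are plain dictionaries; the snake_case field names and the facts_only helper are hypothetical, and the sample values are placeholders rather than real model data.

```python
# Interpretation fields from the list above, written as assumed snake_case keys.
INTERPRETATION_FIELDS = frozenset(
    {"why_it_matters", "useful_for_tags", "importance_level", "editorial_note"}
)


def facts_only(record: dict) -> dict:
    """Return a copy of the record without interpretive guidance."""
    return {key: value for key, value in record.items() if key not in INTERPRETATION_FIELDS}


example_record = {
    "event_type": "pricing_changed",
    "source_url": "https://example.com/pricing",  # placeholder URL
    "confidence": "official",
    "why_it_matters": "Placeholder editorial text.",
    "importance_level": "medium",
}
facts_view = facts_only(example_record)
```

Filtering by exclusion keeps provenance fields such as the source URL and confidence label in the facts-only view, and the original record is left untouched.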

One change, several moments

Happened
happened_at
When the change happened, according to the source.
Effective
effective_at
When the change takes effect, if different from the announcement.
Detected
detected_at
When Modelmark first detected the change.
Published
published_at
When Modelmark approved and published the record.
Last verified
last_verified_at
When the record or its source was last checked.
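
The five timestamps can be kept as separate fields on one structure so that none is silently substituted for another. This is a hedged Python sketch: the EventTimes class and its takes_effect_at helper are hypothetical, and the check that publication cannot precede detection is an assumption rather than a stated rule.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class EventTimes:
    happened_at: Optional[datetime]  # None when the source gives no date
    detected_at: datetime
    published_at: datetime
    last_verified_at: datetime
    effective_at: Optional[datetime] = None  # set only if different from the announcement

    def __post_init__(self) -> None:
        # Assumed invariant: a record cannot be published before it was detected.
        if self.published_at < self.detected_at:
            raise ValueError("published_at cannot precede detected_at")

    def takes_effect_at(self) -> Optional[datetime]:
        """Fall back to happened_at when no separate effective date was given."""
        if self.effective_at is not None:
            return self.effective_at
        return self.happened_at
```

Leaving happened_at as None when the source gives no date follows the rule that unknown is better than guessed.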

Corrections and history

We don't silently rewrite history.

Corrected
The original record contained a factual error, fixed with a visible note.
Superseded
A later record replaces or changes the previous understanding.
Retracted
The event should no longer be treated as valid.

A visible correction log is not an embarrassment. It is proof that the record is maintained.
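
One way to avoid silent rewrites is to keep corrections in an append-only log and derive an event's current status from it. The sketch below is illustrative only: Correction, CorrectionLog, and the rule that a superseding entry must name its replacement are assumptions, not documented Modelmark behaviour.

```python
from dataclasses import dataclass
from typing import Optional

VALID_KINDS = ("corrected", "superseded", "retracted")


@dataclass(frozen=True)
class Correction:
    event_id: str
    kind: str                            # one of VALID_KINDS
    note: str                            # the visible explanation
    superseded_by: Optional[str] = None  # id of the newer record, if any


class CorrectionLog:
    """Append-only: entries can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._entries: list[Correction] = []

    def append(self, correction: Correction) -> None:
        if correction.kind not in VALID_KINDS:
            raise ValueError(f"unknown correction kind: {correction.kind!r}")
        if not correction.note.strip():
            raise ValueError("a correction needs a visible note")
        if correction.kind == "superseded" and not correction.superseded_by:
            raise ValueError("a superseded event must name the newer record")
        self._entries.append(correction)

    def history(self, event_id: str) -> tuple[Correction, ...]:
        return tuple(c for c in self._entries if c.event_id == event_id)

    def status(self, event_id: str) -> str:
        """Latest correction kind, or 'published' if none was ever logged."""
        past = self.history(event_id)
        return past[-1].kind if past else "published"
```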

How AI helps, and where it stops

AI assists ingestion. People publish.

AI may help extract and structure drafts as records are ingested. The path runs once, at ingestion — never at request time.

Source Registry → Scheduled Checker → Snapshot + Hash → Change Detector → AI Extractor → Human Review Queue → Published Event Store → App / API / MCP

The website is not generating facts live with AI.
AI is not used at request time to invent or rewrite records.
Human review is required before anything is published.
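
The human review gate can be sketched as a store that refuses to publish any draft without a named reviewer, and whose read path does nothing but return stored records. All names here (Draft, PublishedEventStore, approved_by) are hypothetical, and the sketch is a simplification rather than a description of the real pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Draft:
    event_id: str
    summary: str
    extracted_by_ai: bool = True       # drafts may come from the AI extractor
    approved_by: Optional[str] = None  # set by a human reviewer, never by the extractor


class PublishedEventStore:
    def __init__(self) -> None:
        self._events: dict[str, Draft] = {}

    def publish(self, draft: Draft) -> None:
        """Refuse anything that has not passed human review."""
        if not draft.approved_by:
            raise PermissionError("human review is required before publishing")
        self._events[draft.event_id] = draft

    def read(self, event_id: str) -> Draft:
        """Request-time path: a plain lookup with no AI call involved."""
        return self._events[event_id]
```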

App, API & MCP

One record, three ways to read it.

App

Gives users a calm, readable view of what changed and why it may matter.

API

Gives builders structured, source-backed model-change data.

MCP

Will let AI agents retrieve model-change records with provenance inline.

Dropping provenance from an API or MCP response should be treated as a product bug.
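
Treating missing provenance as a bug suggests a check that can run in tests against every API or MCP response. A minimal Python sketch, assuming responses are dictionaries; the key names and the missing_provenance helper are hypothetical.

```python
# Assumed provenance keys; source, source type, and confidence are all required.
REQUIRED_PROVENANCE = ("source_url", "source_type", "confidence")


def missing_provenance(payload: dict) -> list[str]:
    """List the provenance keys that are absent or empty in a response payload."""
    return [key for key in REQUIRED_PROVENANCE if payload.get(key) in (None, "")]
```

A test suite could assert that this list is empty for every serialised record, so that a dropped field fails the build instead of reaching a consumer.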

Technical methodology details

ChangeEvent: An immutable, append-only record of one material change to one model or family.
Source: The evidence behind a change. No event is published without at least one, snapshotted when possible.
Model: The current canonical state of a model, rebuilt from its ChangeEvents.
Lab: The entity responsible for a model or model family.
Correction: A public record of a factual change to a previously published event.
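
Rebuilding a Model from its ChangeEvents amounts to replaying the events in order and keeping the latest value of each field. The Python sketch below assumes each event carries a mapping of changed fields and that events are applied in append order; the changes field, the rebuild_model helper, and the sample values are placeholders introduced for illustration.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ChangeEvent:
    model_id: str
    event_type: str
    changes: Mapping[str, Any]  # field name -> new value (assumed shape)


def rebuild_model(model_id: str, events: list[ChangeEvent]) -> dict[str, Any]:
    """Replay events in append order; later events override earlier values."""
    state: dict[str, Any] = {}
    for event in events:
        if event.model_id == model_id:
            state.update(event.changes)
    return state


events = [
    ChangeEvent("example-model", "model_released", {"status": "available", "context_window": 8000}),
    ChangeEvent("other-model", "model_released", {"status": "available"}),
    ChangeEvent("example-model", "capability_changed", {"context_window": 16000}),
]
current = rebuild_model("example-model", events)
# current == {"status": "available", "context_window": 16000}
```

Because the events themselves are never edited, the same replay over a prefix of the list gives the model's state at any earlier point.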

Versioning

The methodology is versioned. Every record stores the methodology version used when it was published — breaking changes get a new major version, additive clarifications a minor one.
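
The versioning rule can be expressed as a small function. This sketch assumes a two-part "major.minor" version string, which the page does not specify, and next_version is a hypothetical helper.

```python
def next_version(current: str, breaking: bool) -> str:
    """Breaking changes bump the major version; additive clarifications the minor."""
    major_text, minor_text = current.split(".")
    major, minor = int(major_text), int(minor_text)
    if breaking:
        return f"{major + 1}.0"
    return f"{major}.{minor + 1}"
```

For example, next_version("1.0", breaking=False) returns "1.1", while next_version("1.3", breaking=True) returns "2.0".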

Non-negotiables

No record without a source.
Snapshot sources when possible.
Records are append-only.
Corrections are public.
Confidence is required.
Source type is required.
Unknown is better than guessed.
Fact and interpretation stay separate.
AI assists ingestion only, never at request time.
App, API, and MCP read from the same data model.
Provenance travels with the data.
The methodology is public and versioned.

The goal

Modelmark's job is not to make model releases feel dramatic. Its job is to make them easier to verify, understand, and return to over time.
