Modelmark

Methodology

How Modelmark records AI model change

A source-backed record of meaningful AI model releases, updates, pricing, access, deprecations, and capability changes. Here's how a change becomes a record — shown through examples.

Methodology v1 · Public draft

The short version

Every record answers five questions.

  • What changed

    The specific, factual change — stated plainly, not dramatised.

  • Who changed it

    The lab or provider responsible for the model or family.

  • When it happened

    The dates around the change, kept distinct from each other.

  • Where it came from

    The source the change is traceable to, snapshotted when possible.

  • How confident we are

    A required confidence label, so certainty is never assumed.

Methodology in practice

Illustrative

How four kinds of change get handled.

Generic examples — not real model or lab data. Each shows whether it enters the record, the preferred source, the confidence, and what happens when the facts change later.

A new model is released

Recorded
Record?
Yes — if it's source-backed and meaningful.
Source
Official release note, model card, docs, or model-hub page.
Confidence
Official, if a provider source directly supports it.
Fact
Model name, lab, release date, availability.
Interpretation
Why the release may matter.
If it changes
If details change later, supersede or correct the record.

A pricing change appears

Recorded
Record?
Yes — if pricing materially changes.
Source
Official pricing page or changelog.
Confidence
Official, if the provider page supports it.
Fact
Old and new price, unit, effective date.
Interpretation
Who may care about the change.
If it changes
If clarified later, add a correction or superseding record.

A model is deprecated

Recorded
Record?
Yes.
Source
Provider deprecation notice, migration guide, API docs, or changelog.
Confidence
Official, if the provider announces it.
Fact
Model affected, replacement if given, effective date.
Interpretation
Migration impact for people building on it.
If it changes
If the timeline changes, supersede with a newer record.

A benchmark claim appears

Recorded as a claim
Record?
Carefully — only as a claim.
Source
Official technical report or paper.
Confidence
Official or corroborated, depending on the source.
Fact
That the provider made the claim.
Interpretation
A claim is not independent verification.
If it changes
If disputed or revised, correct, supersede, or retract.

What enters · what stays out

Restraint keeps the record trustworthy.

Enters the record

  • New models, updates, and renames
  • Capability changes
  • Pricing changes
  • Availability and access changes
  • Deprecations and retirements
  • Open-weights releases
  • Benchmark claims — kept as claims

Stays out

  • Rumours without reliable sourcing
  • Founder drama, funding news, and company gossip
  • General AI market commentary
  • Social speculation about unreleased models
  • Benchmark chatter without a clear source or defined claim
  • Generic product launches that don't affect model access or behaviour
  • Minor documentation edits that don't change model understanding

Full event type list

New model released (model_released)
A new model or model family is announced or made available.
Model updated (model_updated)
An existing model receives a meaningful, documented update.
Model renamed (model_renamed)
A model name, slug, version, or public identity changes.
Capability changed (capability_changed)
Modalities, tool use, context window, reasoning mode, or supported uses change.
Pricing changed (pricing_changed)
Input, output, cache, batch, image, audio, fine-tuning, or hosting prices change.
Availability changed (availability_changed)
Access shifts across app, API, cloud provider, region, waitlist, or general availability.
Access changed (access_changed)
Permissions, account requirements, rate limits, tiering, or gating change.
Deprecation announced (deprecation_announced)
A provider signals future removal, replacement, or migration guidance.
Model retired (model_retired)
A model is removed, disabled, or no longer available for new use.
Open weights released (open_weights_released)
Weights are released, relicensed, mirrored, or made available through a hub.
Benchmark claim (benchmark_claim)
A provider makes a material benchmark claim; it is recorded as a claim, not a finding.
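
For builders who consume these slugs, the list can be sketched as a closed enumeration. This is a minimal illustration, not Modelmark's actual schema: the slugs come from the list above, while the class name `EventType` and the helper `parse_event_type` are assumptions made for the example.

```python
from enum import Enum


class EventType(str, Enum):
    """Event type slugs copied from the list above (class name is hypothetical)."""

    MODEL_RELEASED = "model_released"
    MODEL_UPDATED = "model_updated"
    MODEL_RENAMED = "model_renamed"
    CAPABILITY_CHANGED = "capability_changed"
    PRICING_CHANGED = "pricing_changed"
    AVAILABILITY_CHANGED = "availability_changed"
    ACCESS_CHANGED = "access_changed"
    DEPRECATION_ANNOUNCED = "deprecation_announced"
    MODEL_RETIRED = "model_retired"
    OPEN_WEIGHTS_RELEASED = "open_weights_released"
    BENCHMARK_CLAIM = "benchmark_claim"


def parse_event_type(slug: str) -> EventType:
    """Return the matching EventType; unknown slugs raise ValueError."""
    return EventType(slug)
```

A closed set means anything that "stays out" (funding news, rumours) has no slug to be filed under, so it cannot slip in as a record by accident.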

Sources and confidence

Closest to the provider, and honest about certainty.

Source preference

  1. Official provider docs, pricing, changelogs, and model cards
  2. Official model hubs and provider listings where access actually lives
  3. Corroborated third-party reporting, then clearly labelled inferences

A link alone is not enough. When possible, Modelmark stores a snapshot of the source at capture time so the record stays verifiable if the live page changes later.
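
One way to picture snapshotting is a content hash stored next to the captured bytes. The sketch below is an assumption about how such a step could look, not a description of Modelmark's implementation: `Snapshot`, `take_snapshot`, and `has_changed` are hypothetical names, and fetching the page is left out on purpose.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Snapshot:
    """A captured copy of a source page (hypothetical structure)."""

    url: str
    captured_at: datetime
    sha256: str
    body: bytes


def take_snapshot(url: str, body: bytes) -> Snapshot:
    """Store the page bytes together with their SHA-256 digest."""
    return Snapshot(
        url=url,
        captured_at=datetime.now(timezone.utc),
        sha256=hashlib.sha256(body).hexdigest(),
        body=body,
    )


def has_changed(previous: Snapshot, current_body: bytes) -> bool:
    """True when the live page no longer matches the stored snapshot."""
    return hashlib.sha256(current_body).hexdigest() != previous.sha256
```

Because the digest is kept with the record, a reader can still check what the source said at capture time even if the live page is later edited.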

Confidence labels

official
Directly supported by an official source from the lab, provider, or model owner.
corroborated
Supported by multiple independent sources that agree.
single_source
Supported by one non-official source.
inferred
Derived by Modelmark from observed source changes, and clearly labelled.

Unknown is better than guessed. If a field is unavailable, Modelmark marks it as unavailable rather than filling the gap with speculation.
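
The four labels can be read as a small decision rule. The sketch below is illustrative only: the label values come from the list above, but the function `suggest_confidence`, its parameters, and the order of precedence are assumptions, and a human reviewer would still make the final call.

```python
from enum import Enum
from typing import Optional


class Confidence(str, Enum):
    OFFICIAL = "official"
    CORROBORATED = "corroborated"
    SINGLE_SOURCE = "single_source"
    INFERRED = "inferred"


def suggest_confidence(
    official_sources: int,
    independent_sources: int,
    inferred: bool = False,
) -> Optional[Confidence]:
    """Suggest a label from source counts (assumed precedence, not a spec).

    independent_sources counts non-official sources that agree with each other.
    Returns None when nothing supports the change: no label, and no record.
    """
    if official_sources > 0:
        return Confidence.OFFICIAL
    if independent_sources >= 2:
        return Confidence.CORROBORATED
    if independent_sources == 1:
        return Confidence.SINGLE_SOURCE
    if inferred:
        return Confidence.INFERRED
    return None


def display_field(value: Optional[str]) -> str:
    """Show a missing field as 'unavailable' instead of guessing a value."""
    return value if value is not None else "unavailable"
```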

Full source hierarchy
  1. Official provider docs, pricing pages, changelogs, model cards, release notes, or blog posts
  2. Official model-hub repositories controlled by the lab or author
  3. Official cloud/provider listings where access is actually provided
  4. Research papers or technical reports from the provider or authors
  5. Corroborated third-party reporting from reputable publications
  6. Official social posts from labs or clearly identified model authors
  7. Clearly labelled Modelmark inferences from documented source changes
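
When a change is backed by several sources, the hierarchy above suggests which one to cite first. A minimal sketch, assuming each source carries a type key; the keys in `SOURCE_RANK` and the function `preferred_source` are hypothetical names chosen for this example, with ranks mirroring the seven tiers.

```python
# Lower rank means closer to the provider. Keys are hypothetical labels
# for the seven tiers listed above.
SOURCE_RANK = {
    "provider_docs": 1,
    "official_model_hub": 2,
    "official_cloud_listing": 3,
    "provider_paper": 4,
    "third_party_reporting": 5,
    "official_social_post": 6,
    "modelmark_inference": 7,
}


def preferred_source(source_types: list[str]) -> str:
    """Return the highest-ranked source type among those available."""
    if not source_types:
        raise ValueError("no record without a source")
    return min(source_types, key=lambda kind: SOURCE_RANK[kind])
```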

Facts, interpretation & time

The record states what changed. The note says why it matters.

Facts

  • release date
  • source URL
  • pricing value
  • access status
  • context window
  • availability
  • event type

Interpretation

  • why it matters
  • useful-for tags
  • importance level
  • editorial note

API customers should eventually be able to request facts only — the factual record without interpretive guidance.
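
A facts-only view could be as simple as dropping the interpretive fields before returning a record. This is a sketch under assumptions: the snake_case field names and the helper `facts_only` are invented for illustration and do not describe a shipped API.

```python
from typing import Any

# Hypothetical field names for the interpretation group listed above.
INTERPRETATION_FIELDS = {
    "why_it_matters",
    "useful_for",
    "importance",
    "editorial_note",
}


def facts_only(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the record without interpretive fields."""
    return {
        key: value
        for key, value in record.items()
        if key not in INTERPRETATION_FIELDS
    }
```

The original record is left untouched, so the same stored data can serve both the full view and the facts-only view.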

One change, several moments

  1. Happened

    happened_at

    When the change happened, according to the source.

  2. Effective

    effective_at

    When the change takes effect, if different from the announcement.

  3. Detected

    detected_at

    When Modelmark first detected the change.

  4. Published

    published_at

    When Modelmark approved and published the record.

  5. Last verified

    last_verified_at

    When the record or its source was last checked.
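
Keeping the five moments as separate fields avoids collapsing "announced" into "in effect". The sketch below uses the field names from the list above; the class `EventTimes`, the choice of which fields may be empty, and the helper `is_in_effect` are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class EventTimes:
    """The five timestamps of one change; None means unavailable."""

    happened_at: Optional[datetime]
    effective_at: Optional[datetime]
    detected_at: datetime
    published_at: Optional[datetime]
    last_verified_at: Optional[datetime]

    def is_in_effect(self, now: datetime) -> Optional[bool]:
        """True or False once a start date is known, None when it is not.

        Falls back to happened_at when no separate effective date is given.
        """
        start = self.effective_at or self.happened_at
        if start is None:
            return None
        return now >= start
```

For example, a deprecation announced in January with an effective date in June is not in effect in March, even though the record was already published.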

Corrections and history

We don't silently rewrite history.

  • Corrected

    The original record contained a factual error, fixed with a visible note.

  • Superseded

    A later record replaces or changes the previous understanding.

  • Retracted

    The event should no longer be treated as valid.

A visible correction log is not an embarrassment. It is proof that the record is maintained.
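
An append-only log makes the three outcomes concrete: nothing is edited in place, and each correction, supersession, or retraction is a new entry with a visible note. The names `Entry` and `EventLog` and the `kind` strings are assumptions for this sketch.

```python
from dataclasses import dataclass

KINDS = ("published", "corrected", "superseded", "retracted")


@dataclass(frozen=True)
class Entry:
    entry_id: int
    event_id: str
    kind: str
    note: str


class EventLog:
    """Append-only history; earlier entries are never modified or removed."""

    def __init__(self) -> None:
        self._entries: list[Entry] = []

    def add(self, event_id: str, kind: str, note: str) -> Entry:
        if kind not in KINDS:
            raise ValueError(f"unknown kind: {kind}")
        if not note:
            raise ValueError("every entry needs a visible note")
        entry = Entry(len(self._entries) + 1, event_id, kind, note)
        self._entries.append(entry)
        return entry

    def history(self, event_id: str) -> list[Entry]:
        """All entries for one event, oldest first."""
        return [e for e in self._entries if e.event_id == event_id]

    def status(self, event_id: str) -> str:
        """The kind of the most recent entry for this event."""
        entries = self.history(event_id)
        if not entries:
            raise KeyError(event_id)
        return entries[-1].kind
```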

How AI helps, and where it stops

AI assists ingestion. People publish.

AI may help extract and structure drafts as records are ingested. The path runs once, at ingestion — never at request time.

Source Registry → Scheduled Checker → Snapshot + Hash → Change Detector → AI Extractor → Human Review Queue → Published Event Store → App / API / MCP
  • The website is not generating facts live with AI.
  • AI is not used at request time to invent or rewrite records.
  • Human review is required before anything is published.
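
The review gate can be sketched as a store that refuses unapproved drafts and serves reads from stored data alone. `Draft`, `PublishedStore`, and the `approved_by` field are hypothetical names; the real pipeline stages are only summarised here.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Draft:
    """An extracted draft waiting in the review queue (hypothetical shape)."""

    source_url: str
    extracted: dict[str, Any]
    approved_by: Optional[str] = None  # set by a human reviewer


class PublishedStore:
    def __init__(self) -> None:
        self._published: list[Draft] = []

    def publish(self, draft: Draft) -> None:
        """Accept a draft only after a human has approved it."""
        if draft.approved_by is None:
            raise PermissionError("human review is required before publishing")
        self._published.append(draft)

    def read(self) -> list[Draft]:
        """Serve stored records as they are; no extraction runs at request time."""
        return list(self._published)
```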

App, API & MCP

One record, three ways to read it.

App

Gives users a calm, readable view of what changed and why it may matter.

API

Gives builders structured, source-backed model-change data.

MCP

Will let AI agents retrieve model-change records with provenance inline.

Dropping provenance from an API or MCP response should be treated as a product bug.
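
Treating missing provenance as a bug suggests a check that can run in tests or before a response is sent. The field names in `REQUIRED_PROVENANCE` and the function `missing_provenance` are assumptions for this sketch; the snapshot is left out because it is stored only when possible.

```python
from typing import Any

# Hypothetical field names for the provenance every response should carry.
REQUIRED_PROVENANCE = (
    "source_url",
    "source_type",
    "confidence",
    "methodology_version",
)


def missing_provenance(response: dict[str, Any]) -> list[str]:
    """List the provenance fields that are absent or empty in a response."""
    return [
        name
        for name in REQUIRED_PROVENANCE
        if response.get(name) in (None, "")
    ]
```

An empty list means the response carries its provenance; anything else names exactly what was dropped.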

Technical methodology details
ChangeEvent
An immutable, append-only record of one material change to one model or family.
Source
The evidence behind a change. No event is published without at least one, snapshotted when possible.
Model
The current canonical state of a model, rebuilt from its ChangeEvents.
Lab
The entity responsible for a model or model family.
Correction
A public record of a factual change to a previously published event.
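
The relationship between ChangeEvent and Model can be sketched as a fold: the current state is whatever results from replaying a model's events in order. The fields `changes` and `sequence` and the function `rebuild_model` are assumptions made for illustration, and the values are generic.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ChangeEvent:
    """One material change to one model (field names are hypothetical)."""

    model_id: str
    event_type: str
    changes: dict[str, Any]
    sequence: int  # position in the append-only log


def rebuild_model(model_id: str, events: list[ChangeEvent]) -> dict[str, Any]:
    """Replay a model's events in order to derive its current state."""
    state: dict[str, Any] = {"model_id": model_id}
    own_events = (e for e in events if e.model_id == model_id)
    for event in sorted(own_events, key=lambda e: e.sequence):
        state.update(event.changes)
    return state
```

Because the state is derived rather than edited, the history behind every current value stays available in the events themselves.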

Versioning

The methodology is versioned. Every record stores the methodology version used when it was published — breaking changes get a new major version, additive clarifications a minor one.
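
The major/minor rule can be expressed as a small comparison. The version string format (for example "1.2", with an optional leading "v") is an assumption, as are the names `parse_version` and `change_kind`.

```python
def parse_version(text: str) -> tuple[int, int]:
    """Parse 'v1', '1', or '1.2' into (major, minor); minor defaults to 0."""
    major, _, minor = text.lstrip("v").partition(".")
    return int(major), int(minor or 0)


def change_kind(old: str, new: str) -> str:
    """Classify a methodology change as 'breaking', 'additive', or 'unchanged'."""
    old_version, new_version = parse_version(old), parse_version(new)
    if new_version < old_version:
        raise ValueError("methodology versions only move forward")
    if new_version[0] > old_version[0]:
        return "breaking"
    if new_version[1] > old_version[1]:
        return "additive"
    return "unchanged"
```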

Non-negotiables

  • No record without a source.
  • Snapshot sources when possible.
  • Records are append-only.
  • Corrections are public.
  • Confidence is required.
  • Source type is required.
  • Unknown is better than guessed.
  • Fact and interpretation stay separate.
  • AI assists ingestion only, never at request time.
  • App, API, and MCP read from the same data model.
  • Provenance travels with the data.
  • The methodology is public and versioned.

The goal

Modelmark's job is not to make model releases feel dramatic. Its job is to make them easier to verify, understand, and return to over time.