Service / Data Registration API
Sealed datasets for AI vendors.
A vendor-integrable HTTP API that turns an AI training dataset manifest into a publicly verifiable, cryptographically sealed artifact. Your enterprise customers verify the dataset's contents, your assertions about it, and the timestamp — without trusting your build pipeline.
Live v1 · per-record metered · 1M records/month free tier
What this is
A service, not a product.
Your data labeling pipeline calls a single endpoint with a structured manifest (dataset id, schema, per-record SHA-256 hashes, vendor claims). VLI canonicalizes it (RFC 8785), Merkle-hashes the records, signs the envelope (Ed25519), anchors the root to vli-registry, and returns a verification URL that you deliver to your customer alongside the dataset.
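To make the Merkle step concrete, here is a minimal sketch of folding per-record SHA-256 hashes into a single root. The tree layout is an assumption (pair adjacent nodes, carry an unpaired last node up unchanged); the construction VLI actually uses is not specified on this page and may differ, for example by adding leaf and node prefixes as RFC 6962 does. The function names are illustrative.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(leaf_hashes: list[bytes]) -> bytes:
    """Fold leaf hashes into one root hash.

    Assumed scheme (not confirmed by VLI): hash each adjacent pair,
    and promote an unpaired last node to the next level unchanged.
    """
    if not leaf_hashes:
        raise ValueError("need at least one record hash")
    level = list(leaf_hashes)
    while len(level) > 1:
        next_level = []
        # Combine nodes two at a time: (0,1), (2,3), ...
        for i in range(0, len(level) - 1, 2):
            next_level.append(sha256(level[i] + level[i + 1]))
        # An odd node out has no sibling, so it moves up as-is.
        if len(level) % 2 == 1:
            next_level.append(level[-1])
        level = next_level
    return level[0]


# Stand-in record contents; a real pipeline would hash each dataset row.
record_hashes = [sha256(r) for r in (b"record-0", b"record-1", b"record-2")]
root = merkle_root(record_hashes)
print(root.hex())
```

Changing, reordering, or dropping any record changes the root, which is why anchoring only the root is enough to pin down the whole record set.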
The dataset itself never touches VLI. We hold kilobytes of sealed manifest, not gigabytes of data. No PII liability, no IP custody, no training-data legal exposure.
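To show what "kilobytes, not gigabytes" can look like, here is a minimal sketch of building a manifest that carries record hashes instead of record content. The field names (dataset_id, schema, records, id, sha256, claims) mirror the manifest parts listed above but are assumptions; the exact wire format is not given on this page. The serialization choice (sorted keys, compact separators) is also an assumption.

```python
import hashlib
import json


def build_manifest(dataset_id: str, schema: dict, rows: list[dict], claims: dict) -> dict:
    """Build a manifest that holds one SHA-256 hash per row, never the row itself."""
    records = []
    for i, row in enumerate(rows):
        # Serialize deterministically so the same row always yields the same hash.
        content = json.dumps(row, sort_keys=True, separators=(",", ":")).encode("utf-8")
        records.append({"id": f"rec_{i}", "sha256": hashlib.sha256(content).hexdigest()})
    return {
        "dataset_id": dataset_id,
        "schema": schema,
        "records": records,
        "claims": claims,
    }


rows = [
    {"text": "My invoice is wrong", "label": "billing"},
    {"text": "App crashes on login", "label": "bug"},
]
manifest = build_manifest(
    "support_tickets_v3",
    {"fields": ["text", "label"]},
    rows,
    {"no_pii": True, "human_labeled": True},
)
# Only ids and hashes appear in the output; the ticket text stays local.
print(json.dumps(manifest, indent=2))
```

Each record costs a fixed 64 hex characters in the manifest regardless of how large the underlying row is.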
Why it matters
Three audiences, one truth.
The same sealed bundle answers a different question for each side of the deal.
Integrate
Three paths, one API.
Pick whichever fits your pipeline. SDKs hash records client-side so you don't ship raw content.
npm install @vli/register
import { VLIRegister } from "@vli/register";

const client = new VLIRegister({ apiKey: process.env.VLI_API_KEY });
const sealed = await client.registerDataset({
  datasetId: "support_tickets_v3",
  schema: { fields: ["text", "label"] },
  records: dataset.map((row, i) => ({
    id: `rec_${i}`,
    content: JSON.stringify(row), // SDK hashes
  })),
  claims: {
    no_pii: true,
    human_labeled: true,
    double_reviewed: true,
  },
});
console.log("Verify at:", sealed.verification_url);

pip install vli-register
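import asyncio
import json
import os

from vli_register import VLIRegister, ManifestInput

async def main():
    client = VLIRegister(api_key=os.environ["VLI_API_KEY"])
    # dataset: your iterable of rows. register_dataset is awaited,
    # so the call has to run inside a coroutine.
    sealed = await client.register_dataset(ManifestInput(
        dataset_id="support_tickets_v3",
        schema={"fields": ["text", "label"]},
        records=[
            # json.dumps rather than str(row): a portable serialization, like the Node example
            {"id": f"rec_{i}", "content": json.dumps(row)}  # SDK hashes
            for i, row in enumerate(dataset)
        ],
        claims={"no_pii": True, "human_labeled": True, "double_reviewed": True},
    ))

asyncio.run(main())

POST /register/v1/datasets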
curl -X POST https://verifylinkinfra.com/register/v1/datasets \
  -H "Authorization: Bearer $VLI_API_KEY" \
  -H "Content-Type: application/json" \
  --data @manifest.json

Verification
How customers verify.
Three independent verification paths. VLI is not a trust intermediary — the math checks itself.
- Browser. Open verification_url. The page fetches the sealed bundle and re-runs Ed25519 + Merkle root verification client-side via the Web Crypto API. No server trust.
- Offline CLI. Run clearkey verify-dataset --file bundle.vli. Same checks, no network. ClearKey is open source.
- Independent log query. The bundle includes a registry_tx reference. An auditor queries vli-registry directly to confirm the manifest hash was logged at the claimed timestamp.
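Alongside the three paths above, a customer can tie the delivered files back to the sealed manifest by re-hashing each row and comparing it with the sealed per-record hash. The sketch below assumes the sealed records are a list of entries with hypothetical id and sha256 fields, in delivery order, and that rows were serialized as sorted-key compact JSON before hashing; none of this is confirmed by the page. It does not replace the Ed25519 and Merkle root checks, which the browser page and ClearKey perform.

```python
import hashlib
import json


def record_hash(row: dict) -> str:
    """Hash a row the same (assumed) way the vendor did when sealing."""
    content = json.dumps(row, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(content).hexdigest()


def find_mismatches(sealed_records: list[dict], delivered_rows: list[dict]) -> list[str]:
    """Return labels for every row that is altered, missing, or unsealed."""
    problems = []
    for i, entry in enumerate(sealed_records):
        missing = i >= len(delivered_rows)
        if missing or record_hash(delivered_rows[i]) != entry["sha256"]:
            problems.append(entry["id"])
    # Rows delivered beyond the sealed list were never covered by the seal.
    for i in range(len(sealed_records), len(delivered_rows)):
        problems.append(f"extra_row_{i}")
    return problems


rows = [{"text": "a", "label": "x"}, {"text": "b", "label": "y"}]
sealed = [{"id": f"rec_{i}", "sha256": record_hash(r)} for i, r in enumerate(rows)]

# Flip one label after sealing; only that record should be reported.
tampered = [rows[0], {"text": "b", "label": "z"}]
print(find_mismatches(sealed, rows))
print(find_mismatches(sealed, tampered))
```

An empty result means every delivered row matches its sealed hash under the assumed serialization.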
Claims
What goes in claims.
Claims are vendor-asserted, freeform, hashed and bound to the manifest signature. Once sealed, they cannot be retroactively edited without breaking verification. Suggested keys for AI training data:
- no_pii: confirms PII removal pipeline ran
- human_labeled: distinguishes from synthetic / model-generated
- double_reviewed: confirms second-reviewer pass
- provenance: high-level source description
- license: internal-use, MIT, CC-BY-4.0, etc.
The value is what your customer relies on. Don't assert what you can't defend.
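Why a sealed claim cannot be quietly edited: a digest over the canonical form of the claims changes as soon as any value does. The sketch below uses sorted-key compact JSON as a stand-in for RFC 8785 canonicalization; the two agree for simple claims like these (ASCII keys, booleans, plain strings) but not in general, and the way VLI actually binds claims into the signature is not detailed on this page.

```python
import hashlib
import json


def claims_digest(claims: dict) -> str:
    """SHA-256 over a deterministic JSON encoding of the claims."""
    canonical = json.dumps(
        claims, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


sealed = {"no_pii": True, "human_labeled": True, "license": "CC-BY-4.0"}
reordered = {"license": "CC-BY-4.0", "human_labeled": True, "no_pii": True}
edited = dict(sealed, human_labeled=False)

# Key order is normalized away, so reordering keeps the digest stable.
print(claims_digest(sealed) == claims_digest(reordered))  # True
# Flipping a single claim yields a different digest.
print(claims_digest(sealed) == claims_digest(edited))  # False
```

Because the signature covers the digest, an edited claim set no longer matches what was signed.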
Pricing
Pricing.
- Free tier: 1,000,000 records sealed per SAPI org per month
- Pay-as-you-go: $0.001 per record above the free tier (placeholder; final pricing TBD)
- Billing: Stripe checkout via POST /register/v1/billing/portal
- Self-host: the entire service is open source under Apache 2.0, so you can run your own anchor and registry if preferred
Track usage at GET /register/v1/usage.
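For budgeting, the tiers above reduce to simple arithmetic. This sketch uses the placeholder rate of $0.001 per record, which the pricing list marks as not final, and assumes the free allowance resets monthly and that overage is billed per record with no rounding rules.

```python
from decimal import Decimal

FREE_RECORDS_PER_MONTH = 1_000_000
PRICE_PER_RECORD = Decimal("0.001")  # placeholder rate; final pricing TBD


def estimated_monthly_cost(records_sealed: int) -> Decimal:
    """Estimate one month's charge in USD for a given number of sealed records."""
    # Only records above the free allowance are billable.
    billable = max(0, records_sealed - FREE_RECORDS_PER_MONTH)
    return billable * PRICE_PER_RECORD


print(estimated_monthly_cost(800_000))    # within the free tier
print(estimated_monthly_cost(3_500_000))  # 2,500,000 billable records
```

Decimal keeps the per-record rate exact, which avoids the small rounding drift that floats would introduce at high volumes.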
Get started
Ready to integrate?
Self-serve from the Vault: sign in with your SAPI ID, then create a Data Registration key from the developer console. The plaintext key is shown only once, so copy it into VLI_API_KEY in your SDK environment.
Get an API key from the Vault →
SDK source on GitHub →
Need a higher rate limit or contract terms? Reach out — happy to talk volume.