An AI-powered detection platform built to identify synthetic identity documents — addressing a national-scale vulnerability in institutional verification.
In 2025, generative AI reduced the cost of producing convincing synthetic identity documents to near zero. Existing verification systems — built for an era of analog forgery — cannot reliably distinguish authentic credentials from machine-generated fakes at scale.
The institutions that depend on document verification — banks, immigration authorities, government agencies, healthcare systems, notarial bodies, employers running background checks — face a rapidly widening trust gap. As of 2026, no widely deployed verification stack adequately handles diffusion-model artifacts, C2PA-stripped metadata, or transformer-generated micro-tampering.
This is not a hypothetical threat. It is an operational vulnerability with measurable downstream cost: financial fraud, illegal entry, identity misuse, and erosion of institutional trust at national scale.
SentinelVerify combines five independent detection signals, each targeting a different class of synthetic document artifact. No single signal is reliable in isolation; together, they form a calibrated confidence layer that institutions can integrate into existing KYC and identity verification pipelines.
Metadata forensics
EXIF inspection, C2PA provenance verification, and detection of AI-generator software markers (Midjourney, DALL-E, Stable Diffusion, and others).
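As a hedged sketch of this pass, the check below scans metadata values for known generator markers. The marker list, field names, and `scan_metadata` function are illustrative placeholders, not SentinelVerify's actual detection set:

```python
# Illustrative marker set: substrings that AI image generators are known
# to leave in metadata fields such as EXIF "Software". Not exhaustive.
GENERATOR_MARKERS = {
    "midjourney": "Midjourney",
    "dall-e": "DALL-E",
    "stable diffusion": "Stable Diffusion",
    "sdxl": "Stable Diffusion",
}

def scan_metadata(exif: dict) -> list:
    """Return the generators whose markers appear anywhere in the
    metadata values (case-insensitive substring match)."""
    hits = set()
    for value in exif.values():
        text = str(value).lower()
        for marker, name in GENERATOR_MARKERS.items():
            if marker in text:
                hits.add(name)
    return sorted(hits)
```

Absence of markers proves nothing (metadata is trivially stripped), which is one reason the signal only contributes to, rather than decides, the verdict.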
Error Level Analysis
ELA isolates re-encoding compression artifacts, surfacing edited or fabricated regions invisible to the human eye.
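A minimal sketch of the comparison at the heart of ELA, assuming the JPEG re-encode at a fixed quality happens upstream (e.g. via Pillow); `error_level_map` and its threshold are illustrative, not production values:

```python
def error_level_map(original, reencoded, threshold=12):
    """Per-pixel absolute error between an image and its re-encoded copy.

    `original` and `reencoded` are equal-shape 2-D grids of grayscale
    values. Regions already compressed once change little when
    re-encoded, while pasted or generated regions shift more, so pixels
    whose error exceeds `threshold` are flagged for inspection.
    Returns a list of (x, y) coordinates of flagged pixels.
    """
    flagged = []
    for y, (row_o, row_r) in enumerate(zip(original, reencoded)):
        for x, (a, b) in enumerate(zip(row_o, row_r)):
            if abs(a - b) > threshold:
                flagged.append((x, y))
    return flagged
```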
FFT spectral analysis
Frequency-domain fingerprinting detects diffusion model signatures — the periodic artifacts that current generators leave behind.
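The idea can be sketched with a naive 1-D DFT; the production pass would operate on 2-D image spectra, and `has_periodic_peak` with its threshold is an illustrative stand-in:

```python
import cmath

def dft_magnitudes(signal):
    """Naive O(n^2) 1-D DFT magnitudes; stands in for a 2-D FFT here."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)))
        for k in range(n)
    ]

def has_periodic_peak(signal, factor=4.0):
    """Flag a strong non-DC spectral peak: the kind of periodic
    fingerprint that upsampling and diffusion pipelines can leave
    behind. `factor` is an illustrative threshold, not calibrated."""
    mags = dft_magnitudes(signal)
    non_dc = mags[1 : len(mags) // 2 + 1]   # skip DC, keep one half-spectrum
    mean = sum(non_dc) / len(non_dc)
    return max(non_dc) > factor * mean
```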
Vision reasoning
A transformer-based vision model (Claude 4.7) inspects each document for semantic inconsistencies and structural anomalies.
A fifth layer, currently in integration, fine-tunes convolutional neural networks against the HuggingFace AI-image detection corpus — providing a learned signal that complements the deterministic ones.
The system is built on a Python/FastAPI core with PostgreSQL for case persistence and audit trails. Documents enter the pipeline through a single ingestion endpoint and run through parallel forensic passes; signals are aggregated through a calibrated heuristic layer specialized for ID cards and passports.
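A hedged sketch of the aggregation step described above; the weights, signal names, and verdict thresholds are illustrative placeholders, not SentinelVerify's calibrated values:

```python
# Example weights for combining per-signal suspicion scores.
# Placeholder values: the real layer is calibrated per document type
# (ID cards vs. passports), per the text above.
SIGNAL_WEIGHTS = {
    "metadata": 0.15,
    "ela": 0.25,
    "fft": 0.25,
    "vision": 0.35,
}

def aggregate(signals: dict) -> dict:
    """Combine per-signal suspicion scores (each in [0, 1]) into a
    weighted confidence and a three-way verdict. Missing signals are
    skipped and the remaining weights renormalized."""
    present = {k: v for k, v in signals.items() if k in SIGNAL_WEIGHTS}
    total_w = sum(SIGNAL_WEIGHTS[k] for k in present)
    score = sum(SIGNAL_WEIGHTS[k] * v for k, v in present.items()) / total_w
    if score >= 0.7:
        verdict = "likely_synthetic"
    elif score >= 0.4:
        verdict = "needs_review"
    else:
        verdict = "likely_authentic"
    return {"score": round(score, 3), "verdict": verdict}
```

Renormalizing over present signals keeps the pipeline usable when one pass fails or is skipped, at the cost of resting the verdict on fewer independent checks.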
Cross-verification connects to authoritative external sources — initial integrations target IRS taxpayer verification and OpenCorporates entity validation — allowing the system to confirm not just that a document is authentic, but that the data it asserts matches reality.
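The data-matching half of that claim might look like the sketch below, where `cross_verify` and the field names are hypothetical and the authoritative record is assumed to have been fetched upstream from a source such as those named above:

```python
def cross_verify(extracted: dict, authoritative: dict) -> dict:
    """Compare fields extracted from the document against an
    authoritative record fetched upstream (e.g. a registry lookup).

    A document can be forensically unaltered yet assert false data;
    this pass catches that case. Returns per-field match results and
    an overall flag. Field names here are illustrative.
    """
    results = {
        field: extracted.get(field) == value
        for field, value in authoritative.items()
    }
    return {"fields": results, "all_match": all(results.values())}
```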
Every case persists with full audit trail: input artifacts, signal outputs, intermediate scores, final verdict, and reviewer overrides. This is non-negotiable for institutional deployment — verification decisions must be reconstructable for legal and compliance review.
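One way to shape such a record is sketched below; `CaseRecord` and its fields are illustrative, and the production schema (PostgreSQL-backed, per the architecture description) will differ:

```python
import json
from dataclasses import dataclass, field, asdict

# Illustrative shape of a persisted case record covering the audit
# fields named above: input artifacts, signal outputs, intermediate
# scores, final verdict, and reviewer overrides.
@dataclass
class CaseRecord:
    case_id: str
    input_sha256: str                  # content hash of the submitted artifact
    signal_outputs: dict               # raw per-signal results
    intermediate_scores: dict          # scores prior to aggregation
    final_verdict: str
    reviewer_overrides: list = field(default_factory=list)

    def to_audit_json(self) -> str:
        """Serialize deterministically (sorted keys) so that any
        verification decision can be reconstructed byte-for-byte."""
        return json.dumps(asdict(self), sort_keys=True)
```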
SentinelVerify is built in public against a published roadmap. Items below are tracked in real time — status reflects current state, not aspiration.
Phase 1 targets a Florida pilot; Phase 2 expands to Texas, California, and New York. Together, these four states account for the majority of identity fraud volume in the United States.
The threat surface is national. Synthetic identity documents are not a futuristic concern; they are a present-day attack vector against financial institutions, immigration systems, and any infrastructure that depends on government-issued credentials. The cost of detection systems has not kept pace with the cost of generation.
SentinelVerify is one attempt — among many that will be needed — to close that gap. The work is technical, but the stakes are institutional: the integrity of verification is the foundation of trust in modern infrastructure.
The SentinelVerify production codebase is intentionally private. The system processes real identity documents — driver’s licenses, passports, and government-issued IDs — submitted for fraud verification. Public availability of detection logic would (1) expose techniques to adversaries seeking to evade detection, and (2) introduce unnecessary risk to the data subjects whose documents pass through the pipeline.
Architecture documentation, methodology, and research artifacts are available in the public companion repository. Source code is available under NDA to qualified institutional partners and academic reviewers on request.