MT19937 in Cryptography — Forensic Research Project

A structured, evidence-based investigation into cryptographic implementations that use the Mersenne Twister PRNG (mt19937)

Overview

The Mersenne Twister (mt19937), published by Matsumoto & Nishimura in 1998, is one of the most widely deployed general-purpose pseudo-random number generators in software history. It offers excellent statistical properties and an extraordinarily long period (2¹⁹⁹³⁷ − 1), making it highly attractive for simulations, games, and statistical sampling.

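However, mt19937 is not a cryptographically secure PRNG (CSPRNG). Its output tempering is invertible, so the full internal state (624 × 32-bit words) can be reconstructed from 624 consecutive 32-bit outputs; from that point every future output is predictable, and the state transition can also be run backwards to recover earlier outputs. It therefore provides neither forward secrecy nor backtracking resistance. Seeds are also frequently low-entropy (timestamps, PIDs), which allows brute-force recovery from far fewer outputs. Its use in any security-sensitive context (key generation, IV/nonce derivation, session token creation, password salting) constitutes a serious cryptographic vulnerability.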
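The state-reconstruction claim can be checked directly against CPython's `random` module, which is backed by mt19937. The following is a minimal sketch, not project tooling: the helper names (`untemper`, `clone_from_outputs`, `demo`) are illustrative, and it assumes the attacker observes 624 consecutive full 32-bit outputs.

```python
import random

def undo_right_xor(z: int, shift: int) -> int:
    """Invert y ^= y >> shift by fixed-point iteration (top bits settle first)."""
    y = z
    for _ in range(32 // shift + 1):
        y = z ^ (y >> shift)
    return y

def undo_left_xor(z: int, shift: int, mask: int) -> int:
    """Invert y ^= (y << shift) & mask by fixed-point iteration (low bits settle first)."""
    y = z
    for _ in range(32 // shift + 1):
        y = z ^ ((y << shift) & mask)
    return y

def untemper(output: int) -> int:
    """Recover one raw 32-bit state word from one tempered mt19937 output."""
    y = undo_right_xor(output, 18)
    y = undo_left_xor(y, 15, 0xEFC60000)
    y = undo_left_xor(y, 7, 0x9D2C5680)
    y = undo_right_xor(y, 11)
    return y

def clone_from_outputs(outputs: list) -> random.Random:
    """Build a generator that continues the stream after 624 observed outputs."""
    if len(outputs) != 624:
        raise ValueError("exactly 624 consecutive 32-bit outputs are required")
    state = tuple(untemper(o) for o in outputs)
    clone = random.Random()
    # CPython state layout: (version, 624 state words + position index, gauss_next).
    # Index 624 forces a state regeneration on the next draw.
    clone.setstate((3, state + (624,), None))
    return clone

def demo() -> bool:
    victim = random.Random()          # seeded from OS entropy, unknown to the attacker
    for _ in range(1000):             # arbitrary earlier usage
        victim.getrandbits(32)
    observed = [victim.getrandbits(32) for _ in range(624)]
    clone = clone_from_outputs(observed)
    return all(clone.getrandbits(32) == victim.getrandbits(32) for _ in range(1000))

if __name__ == "__main__":
    print(demo())
```

Note that the seed is never recovered here; the observed outputs alone are sufficient, which is why a high-entropy seed does not rescue mt19937.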

Despite this, mt19937 has repeatedly appeared in cryptographic and security-sensitive codebases, sometimes by oversight, sometimes due to language defaults (e.g., Python's random module, PHP's mt_rand(), Ruby's Random), and sometimes through deliberate but misinformed design decisions.

This project systematically documents every such instance, at forensic quality, for academic and security research purposes.

Research Objectives

Catalogue all known cryptographic libraries, frameworks, protocols, and products that have used mt19937 (or a language's default PRNG backed by mt19937) in security-sensitive contexts.
Classify each instance by severity, exploitability, and remediation status.
Provide reproducible evidence for every finding, anchored to primary sources (source code, CVEs, advisories, academic papers).
Track remediation: identify which findings have been patched, when, and how.
Produce a public-facing dataset suitable for citation in security advisories and academic publications.

Background: Why mt19937 Is Cryptographically Unsafe

Property	mt19937	CSPRNG Requirement
State reconstruction	✗ Possible from 624 outputs	Must be computationally infeasible
Forward secrecy	✗ None	Required
Backtracking resistance	✗ None	Required
Seed entropy	✗ Often low (timestamp, PID)	Must be ≥128 bits
NIST/FIPS approval	✗ Not approved	Required for regulated contexts
Output bias	✓ Negligible (statistical)	N/A

Reference: Matsumoto, M. & Nishimura, T. (1998). Mersenne Twister: A 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation, 8(1), 3–30.

Relevant CWEs:

CWE-338 — Use of Cryptographically Weak Pseudo-Random Number Generator (PRNG)
CWE-330 — Use of Insufficiently Random Values
CWE-334 — Small Space of Random Values
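The "Seed entropy" row above and CWE-334 describe the same practical weakness: when the seed comes from a small space such as a Unix timestamp, an attacker does not need 624 outputs at all. The sketch below assumes a hypothetical application that seeds Python's `random` with the issue time and emits a 128-bit token; `make_token` and `recover_seed` are illustrative names, not code from any catalogued finding.

```python
import random
from typing import Optional

def make_token(seed: int) -> str:
    """Hypothetical vulnerable pattern: token derived from a timestamp-seeded mt19937."""
    rng = random.Random(seed)
    return f"{rng.getrandbits(128):032x}"

def recover_seed(token: str, earliest: int, latest: int) -> Optional[int]:
    """Try every candidate seed in [earliest, latest] and return the one that matches."""
    for candidate in range(earliest, latest + 1):
        if make_token(candidate) == token:
            return candidate
    return None

if __name__ == "__main__":
    issued_at = 1_700_000_000                 # assumed issue time (Unix seconds)
    token = make_token(issued_at)
    # The attacker only knows the token was issued within a two-hour window.
    print(recover_seed(token, issued_at - 3600, issued_at + 3600) == issued_at)
```

A two-hour window is only 7,201 candidates, so the search completes almost instantly; once the seed is known, every other value drawn from the same generator can be reproduced.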

Repository Structure

mt19937-research/
│
├── AGENTS.md                        ← Agent operational instructions (read first)
├── README.md                        ← This file
├── CHANGELOG.md                     ← Running log of significant updates
│
├── sources/                         ← Raw, unmodified evidence artifacts
│   ├── code/                        ← Archived source code snapshots
│   │   └── <impl-name>_<version>/
│   ├── advisories/                  ← CVE/advisory snapshots
│   ├── papers/                      ← Academic papers (PDF + BibTeX)
│   └── web/                         ← Archived web pages
│
├── candidates/                      ← Unverified candidates awaiting analysis
│   └── <impl-name>.candidate.json
│
├── findings/                        ← Verified, classified findings
│   ├── registry.jsonl               ← Master NDJSON registry
│   ├── SUMMARY.md                   ← Human-readable dashboard
│   ├── DISPUTES.md                  ← Contested findings
│   │
│   ├── evidence/                    ← One EvidenceRecord per finding
│   │   └── <impl-name>_<id>.evidence.json
│   │
│   ├── reports/                     ← Narrative report per implementation
│   │   └── <impl-name>.md
│   │
│   └── verification/                ← Independent sign-offs
│       └── <impl-name>_<id>.verify.json
│
├── taxonomy/
│   ├── severity.md                  ← Severity classification guide
│   ├── cwe-mapping.md               ← CWE cross-reference table
│   └── language-patterns.md        ← Language-specific mt19937 fingerprints
│
├── scripts/
│   ├── search_github.py             ← GitHub code search helper
│   ├── validate_record.py           ← JSON schema validator
│   └── build_summary.py            ← Generates SUMMARY.md from registry
│
└── meta/
    ├── agent-assignments.md
    ├── coordination-log.md
    └── open-questions.md

Getting Started

For Human Researchers

Read AGENTS.md in full before contributing any findings.
Check meta/agent-assignments.md to understand current role assignments.
Review findings/SUMMARY.md to understand the current state of the dataset.
Open questions are tracked in meta/open-questions.md — consider tackling one.

For AI Research Agents

Your primary instruction document is AGENTS.md. Read it completely before taking any action.
Identify your assigned role (Discovery, Forensic, Classification, Documentation, or Verification).
Check meta/coordination-log.md for pending tasks and recent decisions.
All output files must conform to the schemas defined in AGENTS.md §6.
Do not produce findings without primary source citations. Do not speculate.

Running the Validation Script

Before committing any JSON record, validate it:

python scripts/validate_record.py --file candidates/my-lib.candidate.json
python scripts/validate_record.py --file findings/evidence/my-lib_EVID-0001.evidence.json

Rebuilding the Summary Dashboard

python scripts/build_summary.py
# Output: findings/SUMMARY.md

Research Methodology

Phase 1 — Discovery

Systematic search across code repositories (GitHub, GitLab, Sourcegraph), vulnerability databases (NVD, OSV, MITRE CVE, exploit-db), package registries (PyPI, npm, Maven, RubyGems, Packagist, crates.io), and academic literature (Google Scholar, Semantic Scholar, USENIX, ACM DL, IEEE Xplore).

Phase 2 — Forensic Source Analysis

Each candidate undergoes direct source code examination. The exact file, line numbers, version, and commit hash are recorded. The context of the mt19937 usage (what it generates, how it is seeded, whether any documentation disclaims cryptographic use) is assessed against a standardized questionnaire.
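As a rough illustration of how file and line locations might be captured during this phase, the sketch below scans source text for a few mt19937 fingerprints. The pattern set and the `Hit` record are assumptions made for this example; they are not the contents of taxonomy/language-patterns.md or the EvidenceRecord schema in AGENTS.md, and a match only flags a line for manual review.

```python
import re
from dataclasses import dataclass

# Hypothetical fingerprints; a real list would be broader and language-aware.
PATTERNS = {
    "cpp": re.compile(r"\bstd::mt19937(_64)?\b"),
    "php": re.compile(r"\bmt_s?rand\s*\("),
    "python": re.compile(r"^\s*(import random\b|from random import\b)"),
}

@dataclass(frozen=True)
class Hit:
    path: str      # file the match was found in
    line: int      # 1-based line number
    pattern: str   # key into PATTERNS
    text: str      # the matching source line, stripped

def scan_text(path: str, source: str) -> list:
    """Return one Hit per (line, pattern) match in the given source text."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, regex in PATTERNS.items():
            if regex.search(line):
                hits.append(Hit(path, lineno, name, line.strip()))
    return hits

if __name__ == "__main__":
    sample = "#include <random>\nstd::mt19937 gen(time(nullptr));\nint key = gen();\n"
    for hit in scan_text("keygen.cpp", sample):
        print(f"{hit.path}:{hit.line} [{hit.pattern}] {hit.text}")
```

The version and commit hash still have to be recorded separately from the archived snapshot, and whether a flagged line is security-relevant remains a human or agent judgment.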

Phase 3 — Classification

Confirmed findings are classified using a five-tier severity scale (CRITICAL → INFORMATIONAL) and mapped to relevant CWEs and CVEs. See taxonomy/severity.md for criteria.

Phase 4 — Independent Verification

All CRITICAL and HIGH findings are independently re-verified by a separate agent or researcher before being added to the finalized registry.

Phase 5 — Reporting

Per-implementation narrative reports are generated in findings/reports/. The master registry (findings/registry.jsonl) serves as the canonical, machine-readable dataset.
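Because the registry is newline-delimited JSON, it can be consumed one record at a time. The sketch below tallies records by severity; the `severity` field name and the sample record IDs are assumptions for illustration, since the authoritative schema lives in AGENTS.md §6, and this is not the implementation of scripts/build_summary.py.

```python
import json
from collections import Counter
from typing import Iterable

def count_by_severity(lines: Iterable[str]) -> Counter:
    """Tally NDJSON records by their (assumed) 'severity' field."""
    counts = Counter()
    for raw in lines:
        raw = raw.strip()
        if not raw:                     # tolerate blank lines between records
            continue
        record = json.loads(raw)
        counts[record.get("severity", "UNCLASSIFIED")] += 1
    return counts

if __name__ == "__main__":
    # Hypothetical records; real entries would follow the project schema.
    sample = [
        '{"id": "EXAMPLE-1", "severity": "CRITICAL"}',
        '{"id": "EXAMPLE-2", "severity": "HIGH"}',
        '',
        '{"id": "EXAMPLE-3", "severity": "CRITICAL"}',
    ]
    for severity, n in count_by_severity(sample).most_common():
        print(severity, n)
```

Against the real file, the same function could be called as `count_by_severity(open("findings/registry.jsonl", encoding="utf-8"))`.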

Severity Scale (Summary)

Severity	Description
CRITICAL	mt19937 directly used for key, IV, or nonce generation; practically exploitable
HIGH	mt19937 used for session tokens or password salts; predictability demonstrated
MEDIUM	Security-adjacent usage; exploitation requires additional prerequisites
LOW	Present near security functions but not directly used cryptographically
INFORMATIONAL	Non-cryptographic use within a security library; noted for completeness

Known Language-Default Risks

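The table below lists common non-cryptographic default generators. Only some are backed by mt19937 (Python, PHP's mt_rand(), Ruby, R, MATLAB, and C++'s std::mt19937); the others are annotated so that they are not misattributed. Security-sensitive code that calls an mt19937-backed default without explicitly selecting a CSPRNG is potentially in scope:

Language / Runtime	Unsafe Default	Safe Alternative
Python 2 & 3	`random` module (mt19937)	`secrets` (Python 3.6+), `os.urandom()`
PHP	`mt_rand()` (mt19937); `rand()` (alias of `mt_rand()` since PHP 7.1)	`random_bytes()`, `random_int()` (PHP 7.0+)
Java	`java.util.Random` (48-bit LCG, not mt19937)	`java.security.SecureRandom`
C / C++	`std::mt19937` (mt19937); `rand()` (implementation-defined, usually not mt19937)	`/dev/urandom`, `getrandom()`, OpenSSL
Ruby	`Random` / `rand` (mt19937)	`SecureRandom`
JavaScript	`Math.random()` (implementation-defined; xorshift128+ in V8, not mt19937)	`crypto.getRandomValues()`
R	Default RNG (Mersenne-Twister)	`sodium` package
MATLAB	`rand()` (mt19937ar by default)	N/A (not a security platform)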
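For the Python row, the difference between the unsafe default and the safe alternative is a one-line change. The sketch below contrasts the two; the function names are illustrative only.

```python
import random
import secrets

def predictable_token(rng: random.Random) -> str:
    """Unsafe: 128 bits drawn from mt19937; reproducible from the seed or state."""
    return f"{rng.getrandbits(128):032x}"

def session_token() -> str:
    """Safer: 16 bytes from the operating system's CSPRNG, hex-encoded."""
    return secrets.token_hex(16)

if __name__ == "__main__":
    # Two generators with the same seed emit the same "secret".
    print(predictable_token(random.Random(1234)) == predictable_token(random.Random(1234)))
    print(len(session_token()))
```

`secrets` requires Python 3.6 or later; on older interpreters `os.urandom(16)` provides the same source of randomness.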

Prior Work & Related Resources

Argyros & Kiayias (2012) — I Forgot Your Password: Randomness Attacks Against PHP Applications — demonstrated practical exploitation of mt_rand() in PHP web applications (USENIX Security 2012).
Goldberg & Wagner (1996) — Randomness and the Netscape Browser — early precedent for PRNG failures in security contexts.
NCC Group (various) — Multiple advisories citing improper PRNG usage.
CWE-338 — MITRE's canonical classification for this class of vulnerability.
NIST SP 800-90A — Standard for approved DRBG mechanisms; mt19937 is not listed.

Contribution Guidelines

All contributions must include primary source evidence. Unsourced claims will not be accepted.
Use the JSON schemas defined in AGENTS.md §6 exactly — do not add or remove fields without team consensus.
Record all significant decisions in meta/coordination-log.md.
Disputed findings go to findings/DISPUTES.md until resolved.
Responsible disclosure: do not share unpatched CRITICAL or HIGH findings externally without project lead approval.

License & Citation

This dataset is released for academic and security research purposes.

If you use this dataset in a publication, please cite:

@misc{mt19937-research-2026,
  title        = {MT19937 in Cryptography: A Forensic Research Dataset},
  year         = {2026},
  howpublished = {\url{https://github.com/your-org/mt19937-research}},
  note         = {Collaborative forensic research project}
}

Contact & Coordination

All inter-agent and inter-researcher communication must be logged in meta/coordination-log.md. For urgent escalations (actively exploited unpatched vulnerabilities), flag the entry with [URGENT] and notify the project lead immediately.

Schema version: 1.0.0 | Initialized: 2026-03-28
