Voynich Manuscript — Theories and Decipherment Attempts

A detailed breakdown of every serious theory about the Voynich Manuscript, rated by current scholarly credibility, with recent evidence for and against. See concept-voynich-manuscript for the full background.

Status: No decipherment has been accepted by the scholarly community as of April 2026.

Theory Rankings (As of 2026)

Theory	Scholarly Standing	Main Proponents	Main Objection
Homophonic substitution cipher (Latin/Italian)	Mainstream, viable	Naibbe study (2025), many others	No decipherment works
Hoax / deliberate gibberish	Mainstream, growing	Schinner (2007), Rugg (2004)	Production cost; Zipfian stats
Constructed language	Plausible minority	Zandbergen, others	Would expect different stats
Unknown natural language	Minority	Various	A real language should have yielded at least a partial decipherment by now
Hebrew encoded text	Fringe-to-minority	Hauer and Kondrak (2019 AI study)	Methodology criticized heavily
Old Turkish	Fringe	Hraçya Altounian et al. (2022)	No accepted reconstruction
Alien / horoscope / magical	Fringe	Popular media	No serious scholarly support

The Naibbe Cipher (2025) — Most Important Recent Development

Standing: Emerging; published in a peer-reviewed journal but not yet independently evaluated by the wider cryptology community
Publication: Cryptologia, November 2025
Author: Greshko (code available on GitHub)

The Naibbe cipher is named after naibbe, an early tarot/card game played in 14th-century Italy. It is a verbose homophonic substitution cipher: each plaintext letter can map to several different Voynich glyph strings, which makes frequency analysis harder, and extra glyphs are inserted to pad word lengths.

How it works:

Take a Latin or Italian plaintext
Roll a die to break the text into chunks
Draw a playing card to select one of six encryption tables
Use the table to convert letter chunks into Voynich glyph sequences

All required tools (dice, playing cards) were available in 15th-century Europe. The cipher can be performed entirely by hand.
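
The steps above can be sketched in a few lines of Python. This is a toy illustration only: the glyph strings, the six tables, and the die rule (1 to 3 gives a one-letter chunk, 4 to 6 gives a two-letter chunk) are all invented assumptions, not the tables or probabilities of the published cipher, and the padding step is left out.

```python
import random

LETTERS = "abcdefghijklmnopqrstuvwxyz"

# Hypothetical building blocks: invented placeholders, NOT the published tables.
PREFIXES = ["qo", "ch", "sh", "ot", "ok", "da"]  # one prefix per table
BODY = "eoaiyn"                                  # two body glyphs encode the letter


def make_tables():
    """Build six made-up tables; each maps a letter to a 4-glyph string."""
    tables = []
    for prefix in PREFIXES:
        table = {
            letter: prefix + BODY[i // 6] + BODY[i % 6]
            for i, letter in enumerate(LETTERS)
        }
        tables.append(table)
    return tables


def encrypt(plaintext, tables, rng):
    """Chunk the plaintext by die roll, then encode each chunk with a drawn table."""
    text = [c for c in plaintext.lower() if c in LETTERS]
    words = []
    i = 0
    while i < len(text):
        # Assumed die rule: 1-3 -> one letter, 4-6 -> two letters.
        size = 1 if rng.randint(1, 6) <= 3 else 2
        chunk = text[i:i + size]
        # Stand-in for the card draw: pick one of the six tables at random.
        table = tables[rng.randrange(len(tables))]
        words.append("".join(table[c] for c in chunk))
        i += size
    return " ".join(words)


if __name__ == "__main__":
    print(encrypt("herba", make_tables(), random.Random(0)))
```

Because the table is chosen afresh for every chunk, the same plaintext letter can surface as six different glyph strings, which is the homophonic property that flattens letter frequencies.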

What it reproduces:

Symbol frequency distributions matching Voynichese
Word length distributions
Character positional constraints (certain glyphs only appear at word beginnings/ends)
Entropy levels
Zipfian word-frequency distribution
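
As a rough illustration of what "entropy levels" means here, the sketch below computes two common measures on a transliterated string: the first-order character entropy and the entropy of a character given the one before it. The function names are our own, and any input string is assumed to be a transliteration supplied by the reader; nothing here reproduces the study's actual measurements.

```python
import math
from collections import Counter


def _entropy(counts, total):
    """Shannon entropy in bits for a table of counts."""
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def char_entropy(text):
    """First-order entropy: how unpredictable a single character is."""
    return _entropy(Counter(text), len(text))


def conditional_entropy(text):
    """Entropy of a character given its predecessor: H(pair) - H(first)."""
    total = len(text) - 1                      # number of adjacent pairs
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])                # first element of each pair
    return _entropy(pairs, total) - _entropy(firsts, total)


if __name__ == "__main__":
    sample = "abababab"                        # perfectly predictable toy input
    print(char_entropy(sample), conditional_entropy(sample))
```

A cipher candidate can be compared against the manuscript by running both texts through the same functions; matching values are necessary for plausibility but say nothing about whether the mechanism is the right one.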

What it does NOT do:

Actually decode the Voynich Manuscript
Prove this was the actual cipher used
Explain the illustrations

Significance: Shows that the cipher hypothesis remains viable by demonstrating a historically plausible mechanism that can be carried out by hand. It also introduces a tarot/playing-card connection: if a cipher of this kind was used, the manuscript’s unknown author would have needed dice, playing cards, and a set of encryption tables, which would point to a cultural context in the Italian card-game tradition.

Critics note: Many cipher systems can reproduce statistical signatures of Voynichese. Reproducing statistics doesn’t prove the specific mechanism. The author explicitly stresses this is a proof of concept, not a claimed solution.

The Multispectral Imaging Discovery (September 2024)

Standing: Established finding
Discovered by: Lisa Fagin Davis (Medieval Academy of America), Roger Easton (Rochester Institute of Technology)
Published: Blog post + follow-on research, September 2024

Multispectral images taken in 2014 by The Lazarus Project were reprocessed in 2024, revealing three previously invisible columns of letters on folio 1r:

Column 1: Roman alphabet (A through Z)
Column 2: Voynich characters
Column 3: Roman alphabet offset by one letter

The offset between the two Roman alphabet columns is a classic single-shift Caesar cipher, the simplest kind of substitution. Fagin Davis concluded that this represents an early owner’s failed decipherment attempt, probably from the 17th century and consistent with the Marci/Baresch correspondence. The would-be decoder may have been testing two substitution alphabets at once, or may have been building a cipher key against the Voynich characters.
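
For readers unfamiliar with the term, a single-shift Caesar alphabet can be generated as below. This is a minimal sketch: it assumes a 26-letter alphabet and a forward shift, neither of which is confirmed for the columns on folio 1r, and it omits the Voynich column because the glyph pairings are not given here.

```python
import string


def shifted_alphabet(alphabet, shift=1):
    """Rotate the alphabet left by `shift` positions (A -> B for shift=1)."""
    return alphabet[shift:] + alphabet[:shift]


def caesar(text, shift=1, alphabet=string.ascii_uppercase):
    """Apply a Caesar shift; characters outside the alphabet pass through."""
    table = str.maketrans(alphabet, shifted_alphabet(alphabet, shift))
    return text.translate(table)


if __name__ == "__main__":
    plain = string.ascii_uppercase
    # Print the plain and shifted alphabets side by side, as two columns.
    for a, b in zip(plain, shifted_alphabet(plain)):
        print(a, b)
```

With only 25 non-trivial shifts to try, a cipher of this kind falls to exhaustive testing in minutes, which is why its failure against the manuscript is informative.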

This is significant because (1) it shows that early owners also suspected a substitution cipher, and (2) the attempt failed, suggesting that the actual cipher, if any, is more complex than a simple substitution.

The Sex and Women’s Medicine Theory (2024)

Standing: Emerging scholarly hypothesis
Publication: Social History of Medicine (Oxford University Press), 2024
Authors: Keagan Brewer, Michelle L. Lewis

Core argument: The manuscript encodes gynecological and reproductive medicine, deliberately hidden using cipher as recommended by contemporaneous medical practitioners.

Evidence:

Johannes Hartlieb (~1410–68), a Bavarian physician active in roughly the same period and region in which the manuscript is thought to have been produced, explicitly recommended using “secret letters” (a cipher, secret alphabet, or similar) to conceal medical texts on reproduction, contraception, and abortion, topics considered too sensitive for open circulation
The “Rosette page” (a nine-panel fold-out diagram) is interpreted as representing coitus and conception — anatomical details in the upper-left panel allegedly correspond to Abu Bakr Al-Rāzī’s (Rhazes) description of female reproductive anatomy, including “five small veins in the vaginas of virgins”
The balneological (bathing women) section is reinterpreted as a gynecological diagram, not a leisure bath scene

Objections:

The anatomical reading requires significant interpretive latitude
No text has been deciphered to confirm the content independently
Alternative readings of the Rosette page remain equally plausible

Significance: Shifts the question from “what language” to “what subject” — if the content purpose is established independently, it constrains candidate languages and cipher types.

The AI / Machine Learning Approaches

Hebrew AI Study (2019) — Now Considered Flawed

Authors: Bradley Hauer and Grzegorz Kondrak (University of Alberta)
Claimed: AI analysis found 80%+ of Voynich words in Hebrew dictionaries
Method: Used statistical language-identification algorithms to test candidate languages; Hebrew scored highest

Why it was criticized:

The Hebrew matching used a permissive method that removed vowels (treating words as consonant-skeleton matches), inflating match rates
80% dictionary coverage can be achieved for many texts in many languages using sufficiently loose matching
No coherent Hebrew translation was produced for any continuous passage
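
The inflation effect described in the first criticism is easy to demonstrate. The sketch below uses a tiny invented English word list, not the study's data or code, and compares exact dictionary matching with matching on consonant skeletons.

```python
VOWELS = set("aeiou")


def skeleton(word):
    """Strip vowels, keeping only the consonant skeleton."""
    return "".join(c for c in word if c not in VOWELS)


def match_rates(candidates, lexicon):
    """Return (exact match rate, consonant-skeleton match rate)."""
    exact = set(lexicon)
    skeletons = {skeleton(w) for w in lexicon}
    n = len(candidates)
    exact_rate = sum(w in exact for w in candidates) / n
    loose_rate = sum(skeleton(w) in skeletons for w in candidates) / n
    return exact_rate, loose_rate


if __name__ == "__main__":
    # Hypothetical toy data: none of the candidates is in the lexicon.
    lexicon = ["bat", "bit", "cat", "cot", "dog", "dig", "man", "pin"]
    candidates = ["bot", "cut", "dag", "mon", "pan", "xyz"]
    print(match_rates(candidates, lexicon))
```

None of the candidate strings is a lexicon word, yet most of them "match" once vowels are ignored. A high coverage figure under loose matching is therefore weak evidence on its own.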

Topic Modeling (2021)

A Yale-affiliated research group applied LDA, LSA, and NMF to find vocabulary clusters across the manuscript. Results suggested genuine topical variation across sections — the herbal section vocabulary differs from balneological vocabulary in structured ways. This supports the manuscript having intentional organization, but doesn’t identify the content.

Large Language Models (2024–2026)

GPT-class models have been applied in various informal and semi-formal experiments. None have produced accepted decipherments. The fundamental epistemological problem: without an external ground truth (a known translation of even a few words), LLMs cannot distinguish the correct interpretation from infinitely many wrong ones that match the same statistical constraints. The Voynich Manuscript is a case where more powerful pattern-matching doesn’t help because the problem is not pattern-matching but pattern-validation.

The Gibberish Studies (Strengthening Skepticism)

A 2025 experiment (cited by The Art Newspaper) had volunteers write pages of “convincing gibberish”, fake text meant to look linguistic. The resulting texts showed Zipfian-like word-frequency distributions, word-length patterns similar to Voynichese, and some statistical properties previously thought to prove linguistic structure. This does not prove the Voynich Manuscript is gibberish, but it weakens the statistical evidence that was once considered definitive proof of real language.

The key finding: human beings are surprisingly good at generating statistically language-like text even when deliberately producing nonsense. Intelligent gibberish is hard to distinguish from language by purely statistical means.
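
A typical "Zipfian" check looks like the sketch below: fit a straight line to log frequency against log rank and see whether the slope is near -1. The function name and the toy token list are our own assumptions, and this is a generic test, not the procedure used in the experiment.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two distinct word types, otherwise the fit is undefined.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Toy tokens whose frequencies fall off as 12 / rank.
    tokens = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3
    print(zipf_slope(tokens))
```

The test looks only at the frequency profile, so any token stream with the right profile passes, whether or not it carries meaning. That is the limitation the gibberish experiment exposes.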

The “Currier A / Currier B” Two-Language Problem

Any theory must explain this: the manuscript contains two statistically distinct text types (Currier A and Currier B) and was written by five distinct scribes (per Lisa Fagin Davis’s paleographic analysis). A appears mainly in the herbal/pharmaceutical sections and B mainly in the balneological/astrological sections.
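
One generic way to quantify "statistically distinct" is to compare the word-frequency profiles of two sets of pages. The sketch below uses Jensen-Shannon divergence, which is our own choice of measure rather than Currier's original method, and the token lists are illustrative placeholders, not counts taken from the manuscript.

```python
import math
from collections import Counter


def js_divergence(tokens_a, tokens_b):
    """Jensen-Shannon divergence in bits: 0 = identical, 1 = no shared words."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    na, nb = len(tokens_a), len(tokens_b)
    total = 0.0
    for word in set(ca) | set(cb):
        p, q = ca[word] / na, cb[word] / nb
        m = (p + q) / 2                      # mixture of the two profiles
        if p:
            total += 0.5 * p * math.log2(p / m)
        if q:
            total += 0.5 * q * math.log2(q / m)
    return total


if __name__ == "__main__":
    # Placeholder tokens standing in for transliterated pages.
    pages_a = "daiin chol chor daiin chol".split()
    pages_b = "chedy qokedy shedy chedy daiin".split()
    print(js_divergence(pages_a, pages_b))
```

A useful baseline is the divergence between two random halves of the same section: a clearly larger value between sections than within them is what "distinct" should mean in practice.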

Implications:

If it’s a cipher: either two different ciphers were used, or the same cipher was applied to two different languages
If it’s a hoax: a sophisticated one requiring multiple collaborators to maintain statistical consistency
If it’s a constructed language: possibly two dialects or two stages of the language
If it’s a real language: possibly two different source languages, two regional dialects, or texts from two different time periods stitched together

The five-scribe analysis is particularly important — it argues against the “lone eccentric hoaxer” theory and suggests collaborative institutional production, possibly a scriptorium or a group with shared purpose.

Key Fringe Theories (Not Seriously Maintained)

Roger Bacon as author: Definitively disproved by radiocarbon dating. Bacon died in 1292; the vellum dates to 1404–1438
John Dee forgery: Dee was an Elizabethan mathematician; the vellum predates him by a century
Alien origin / extraterrestrial: No serious scholarly support
Lost Aztec or Mesoamerican language: Botanicals in the herbal section have been compared to New World plants, but early-15th-century vellum predates Columbus

Open Conference: 2026

A 2026 International Conference on the Voynich Manuscript is being organized under the auspices of the Medieval Academy of America. This represents the first large-scale interdisciplinary scholarly gathering specifically focused on the manuscript — a sign that mainstream scholarship is investing seriously in a resolution.

What Would Actually Settle This?

Scholars have identified what a valid decipherment would require:

A proposed key that decodes at least one full coherent passage into a known language
The decoded text should be independently verifiable (e.g., match a known historical recipe, prayer, or medical formula)
The key should work for multiple unconnected passages, not just isolated words
External corroboration — ideally, finding a contemporary reference to the cipher system used

No proposed decipherment has met these criteria.
