Voynich Manuscript — Theories and Decipherment Attempts

A detailed breakdown of every serious theory about the Voynich Manuscript, rated by current scholarly credibility, with recent evidence for and against. See concept-voynich-manuscript for the full background.

Status: No decipherment has been accepted by the scholarly community as of April 2026.


Theory Rankings (As of 2026)

Theory | Scholarly Standing | Main Proponents | Main Objection
Homophonic substitution cipher (Latin/Italian) | Mainstream, viable | Naibbe study (2025), many others | No decipherment works
Hoax / deliberate gibberish | Mainstream, growing | Schinner (2007), Rugg (2004) | Production cost; Zipfian stats
Constructed language | Plausible minority | Zandbergen, others | Would expect different stats
Unknown natural language | Minority | Various | Would expect at least a partial decipherment by now
Hebrew encoded text | Fringe-to-minority | Hauer and Kondrak (2019 AI study) | Methodology criticized heavily
Old Turkish | Fringe | Hraçya Altounian et al. (2022) | No accepted reconstruction
Alien / horoscope / magical | Fringe | Popular media | No serious scholarly support

The Naibbe Cipher (2025) — Most Important Recent Development

Standing: Emerging; published in a peer-reviewed journal but not yet independently evaluated by the wider cryptology community
Publication: Cryptologia, November 2025
Author: Greshko (code available on GitHub)

The Naibbe cipher is named after naibbe, an early Italian card game attested in the 14th century. It is a verbose homophonic substitution cipher: each plaintext letter can be encrypted as any of several Voynich glyph strings, which frustrates frequency analysis, and extra glyphs are inserted to pad word lengths.

How it works:

  1. Take a Latin or Italian plaintext
  2. Roll a die to break the text into chunks
  3. Draw a playing card to select one of six encryption tables
  4. Use the table to convert letter chunks into Voynich glyph sequences

All required tools (dice, playing cards) were available in 15th-century Europe. The cipher can be performed entirely by hand.
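
The four steps above can be sketched in a few lines of Python. This is a toy illustration only, not the published Naibbe tables or the author's code: the glyph strings, the rule that a die roll of 1 to 3 gives a one-letter chunk and 4 to 6 a two-letter chunk, and the card-to-table rule are all assumptions invented for the example, and a seeded random generator stands in for the physical die and deck.

```python
import random
import string
from itertools import product

# Hypothetical tables (NOT the published ones): each of the six tables marks
# its output with one prefix glyph, followed by a two-glyph core per letter.
PREFIXES = "qcsptf"
CORES = ["".join(p) for p in product("oaey", "dklrstn")][:26]
CORE_TO_LETTER = dict(zip(CORES, string.ascii_lowercase))

# TABLES[t][letter] -> three-glyph string, so every letter has six homophones.
TABLES = [
    {letter: prefix + core for letter, core in zip(string.ascii_lowercase, CORES)}
    for prefix in PREFIXES
]

def encrypt(plaintext, rng):
    """Encrypt letters only; each chunk becomes one space-separated 'word'."""
    letters = [c for c in plaintext.lower() if c in string.ascii_lowercase]
    words, i = [], 0
    while i < len(letters):
        size = 1 if rng.randint(1, 6) <= 3 else 2   # die roll sets chunk length (assumed rule)
        table = TABLES[rng.randrange(52) % 6]       # drawn card picks a table (assumed rule)
        words.append("".join(table[c] for c in letters[i:i + size]))
        i += size
    return " ".join(words)

def decrypt(ciphertext):
    """Ignore the table prefix; the two-glyph core identifies the letter."""
    out = []
    for word in ciphertext.split():
        for k in range(0, len(word), 3):
            out.append(CORE_TO_LETTER[word[k + 1:k + 3]])
    return "".join(out)

cipher = encrypt("Herba sanat", random.Random(1))
print(cipher)
print(decrypt(cipher))  # herbasanat
```

Because the homophones are distinguishable without knowing which card was drawn, decryption needs only the tables, while the same plaintext encrypts differently on every run.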

What it reproduces:

  • Symbol frequency distributions matching Voynichese
  • Word length distributions
  • Character positional constraints (certain glyphs only appear at word beginnings/ends)
  • Entropy levels
  • Zipfian word-frequency distribution
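
Two of the statistics in this list are easy to compute for any transliterated text. The sketch below is illustrative and not taken from the study: the function names are hypothetical, and "entropy" is taken here to mean character-level unigram entropy (h1) and the conditional entropy of a character given the one before it (h2).

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of a Counter of observations."""
    total = sum(counts.values())
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def char_entropies(text):
    """Return (h1, h2): unigram entropy and entropy of a char given the previous char."""
    pairs = Counter(zip(text, text[1:]))   # adjacent character pairs
    firsts = Counter(text[:-1])            # first member of each pair
    h1 = entropy(Counter(text))
    h2 = entropy(pairs) - entropy(firsts)  # H(X2 | X1) = H(X1, X2) - H(X1)
    return h1, h2

def word_length_counts(text):
    """Histogram of word lengths for a space-separated transliteration."""
    return Counter(len(w) for w in text.split())

h1, h2 = char_entropies("abababab")
print(round(h1, 3), round(h2, 3))  # 1.0 0.0
```

In the toy string, each character is fully determined by its predecessor, so h2 is zero even though h1 is one bit. A generated ciphertext can be compared with a transcription of the manuscript on the same measures.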

What it does NOT do:

  • Actually decode the Voynich Manuscript
  • Prove this was the actual cipher used
  • Explain the illustrations

Significance: Shows that the cipher hypothesis remains viable by demonstrating a historically plausible mechanism that can be executed by hand. It also raises a possible playing-card connection: if a scheme of this kind was actually used, the manuscript’s unknown author would have needed playing cards and a set of encryption tables, which would point toward the Italian card-game tradition as cultural context.

Critics note: Many cipher systems can reproduce statistical signatures of Voynichese. Reproducing statistics doesn’t prove the specific mechanism. The author explicitly stresses this is a proof of concept, not a claimed solution.


The Multispectral Imaging Discovery (September 2024)

Standing: Established finding
Discovered by: Lisa Fagin Davis (Medieval Academy of America), Roger Easton (Rochester Institute of Technology)
Published: Blog post + follow-on research, September 2024

Multispectral images taken in 2014 by The Lazarus Project were reprocessed in 2024, revealing three previously invisible columns of letters on folio 1r:

  • Column 1: Roman alphabet (A through Z)
  • Column 2: Voynich characters
  • Column 3: Roman alphabet offset by one letter

The one-letter offset between the two Roman alphabet columns is what a shift-by-one Caesar cipher would produce, one of the simplest possible substitutions. Fagin Davis concluded that the columns record an early owner’s failed decipherment attempt, probably from the 17th century, which is consistent with the Marci/Baresch correspondence. The would-be decoder may have been testing two substitution alphabets at once, or building a cipher key from Voynich characters.

This is significant because: (1) it shows that early owners also suspected a substitution cipher, and (2) the attempt failed, suggesting the actual cipher, if any, is more complex than a simple substitution.
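
The three-column layout can be reconstructed schematically. This is a minimal sketch under stated assumptions: the direction of the offset and the wrap-around at the end of the alphabet are guesses, the modern 26-letter alphabet is used for simplicity, and the Voynich column is left as "?" placeholders because the glyph assignments are not given here.

```python
import string

def caesar_shift(alphabet, shift):
    """Rotate an alphabet left by `shift` positions."""
    shift %= len(alphabet)
    return alphabet[shift:] + alphabet[:shift]

roman = string.ascii_uppercase
offset = caesar_shift(roman, 1)                       # assumed: shifted by one letter
rows = list(zip(roman, ["?"] * len(roman), offset))   # "?" = unknown Voynich glyph

for a, glyph, b in rows[:3]:
    print(a, glyph, b)
# A ? B
# B ? C
# C ? D
```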


The Sex and Women’s Medicine Theory (2024)

Standing: Emerging scholarly hypothesis
Publication: Social History of Medicine (Oxford University Press), 2024
Authors: Keagan Brewer, Michelle L. Lewis

Core argument: The manuscript encodes gynecological and reproductive medicine, deliberately hidden using cipher as recommended by contemporaneous medical practitioners.

Evidence:

  • Johannes Hartlieb (~1410–68), a Bavarian physician active in the same period and, according to the authors, a similar cultural milieu to the manuscript, explicitly recommended using “secret letters” (a cipher, secret alphabet, or similar) to conceal medical texts on reproduction, contraception, and abortion, topics considered too sensitive for open circulation
  • The “Rosette page” (a nine-panel fold-out diagram) is interpreted as representing coitus and conception — anatomical details in the upper-left panel allegedly correspond to Abu Bakr Al-Rāzī’s (Rhazes) description of female reproductive anatomy, including “five small veins in the vaginas of virgins”
  • The balneological (bathing women) section is reinterpreted as gynecological illustration rather than a leisure bathing scene

Objections:

  • The anatomical reading requires significant interpretive latitude
  • No text has been deciphered to confirm the content independently
  • Alternative readings of the Rosette page remain equally plausible

Significance: Shifts the question from “what language” to “what subject”. If the subject matter can be established independently, it constrains the candidate languages and cipher types.


The AI / Machine Learning Approaches

Hebrew AI Study (2019) — Now Considered Flawed

Authors: Bradley Hauer and Grzegorz Kondrak (University of Alberta)
Claimed: AI analysis found 80%+ of Voynich words in Hebrew dictionaries
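Method: Used statistical language-identification algorithms to compare the text against several hundred candidate languages, then attempted to decode words as anagrams; Hebrew scored highest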

Why it was criticized:

  • The Hebrew matching used a permissive method: words were compared as vowel-less consonant skeletons, with letter order allowed to vary (anagramming), which inflates match rates
  • 80% dictionary coverage can be achieved for many texts in many languages using sufficiently loose matching
  • No coherent Hebrew translation was produced for any continuous passage
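
The inflation effect behind the first two criticisms is easy to demonstrate. The sketch below is a toy, not a reproduction of the study's method: it uses a tiny English word list as a stand-in dictionary, invented nonsense tokens, and a hypothetical `loose_key` that drops vowels and ignores letter order.

```python
VOWELS = set("aeiou")

def loose_key(word):
    """Drop vowels and sort the remaining letters (order-insensitive skeleton)."""
    return "".join(sorted(c for c in word if c not in VOWELS))

def coverage(tokens, dictionary, key=lambda w: w):
    """Fraction of tokens whose key matches the key of some dictionary word."""
    keys = {key(w) for w in dictionary}
    return sum(key(t) in keys for t in tokens) / len(tokens)

dictionary = ["rose", "stone", "water", "land", "bread", "salt"]
tokens = ["sor", "tnos", "retaw", "dnal", "xyz"]  # invented nonsense strings

print(coverage(tokens, dictionary))             # 0.0
print(coverage(tokens, dictionary, loose_key))  # 0.8
```

None of the tokens is a word, yet four of five "match" once vowels and letter order are ignored, which is why a high coverage figure alone says little about the underlying language.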

Topic Modeling (2021)

A Yale-affiliated research group applied LDA, LSA, and NMF to find vocabulary clusters across the manuscript. Results suggested genuine topical variation across sections — the herbal section vocabulary differs from balneological vocabulary in structured ways. This supports the manuscript having intentional organization, but doesn’t identify the content.
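
A much simpler stand-in for the idea of section-level vocabulary differences is a cosine similarity between bag-of-words vectors. This is not LDA, LSA, or NMF, and the three "pages" below are invented strings of Voynichese-looking tokens rather than real transcriptions; the sketch only shows the kind of signal a topic model would pick up.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between the word-count vectors of two texts."""
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[w] * cb[w] for w in ca)  # missing words count as 0
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

# Invented sample pages (hypothetical, for illustration only).
herbal_1 = "daiin chol chor daiin"
herbal_2 = "chol daiin chor shol"
bath_1 = "qokeedy qokain shedy qokeedy"

print(round(cosine(herbal_1, herbal_2), 3))  # 0.816
print(cosine(herbal_1, bath_1))              # 0.0
```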

Large Language Models (2024–2026)

GPT-class models have been applied in various informal and semi-formal experiments. None have produced accepted decipherments. The fundamental epistemological problem: without an external ground truth (a known translation of even a few words), LLMs cannot distinguish the correct interpretation from infinitely many wrong ones that match the same statistical constraints. The Voynich Manuscript is a case where more powerful pattern-matching doesn’t help because the problem is not pattern-matching but pattern-validation.


The Gibberish Studies (Strengthening Skepticism)

A 2025 experiment (reported by The Art Newspaper) had volunteers write pages of “convincing gibberish”, fake text meant to look linguistic. The resulting texts showed Zipfian-like word frequency distributions, word-length patterns similar to Voynichese, and some statistical properties previously thought to prove linguistic structure. This does not prove the Voynich Manuscript is gibberish, but it weakens the statistical evidence that was once considered definitive proof of real language.

The key finding: human beings are surprisingly good at generating statistically language-like text even when deliberately producing nonsense. Intelligent gibberish is hard to distinguish from language by purely statistical means.
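
One common statistical check of this kind is the slope of the rank-frequency curve on log-log axes, where a value near -1 is the classic Zipfian signature. The function below is a minimal sketch with a hypothetical name, using an ordinary least-squares fit; the synthetic sample is constructed to be ideally Zipfian and carries no meaning at all.

```python
import math
from collections import Counter

def zipf_slope(words):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(Counter(words).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den  # needs at least two distinct words

# Synthetic sample: the word at rank r occurs 60 // r times (60, 30, 20, 15, 12).
sample = [f"w{r}" for r in range(1, 6) for _ in range(60 // r)]
print(round(zipf_slope(sample), 3))  # -1.0
```

The sample passes the test perfectly while being pure filler, which is the point made above: a Zipf-like slope is compatible with language but does not demonstrate it.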


The “Currier A / Currier B” Two-Language Problem

Any theory must explain this: the manuscript contains two statistically distinct text types, Currier A and Currier B, with A appearing mainly in the herbal/pharmaceutical sections and B mainly in the balneological/astrological sections. Separately, Lisa Fagin Davis’s paleographic analysis identifies five distinct scribes.

Implications:

  • If it’s a cipher: either two different ciphers were used, or the same cipher was applied to two different languages
  • If it’s a hoax: a sophisticated one requiring multiple collaborators to maintain statistical consistency
  • If it’s a constructed language: possibly two dialects or two stages of the language
  • If it’s a real language: possibly two different source languages, two regional dialects, or texts from two different time periods stitched together

The five-scribe analysis is particularly important — it argues against the “lone eccentric hoaxer” theory and suggests collaborative institutional production, possibly a scriptorium or a group with shared purpose.


Key Fringe Theories (Not Seriously Maintained)

  • Roger Bacon as author: Definitively disproved by radiocarbon dating: Bacon died in 1292, and the vellum dates to 1404–1438
  • John Dee forgery: Dee was an Elizabethan mathematician; the vellum predates him by a century
  • Alien origin / extraterrestrial: No serious scholarly support
  • Lost Aztec or Mesoamerican language: Botanicals in the herbal section have been compared to New World plants, but early-15th-century vellum predates Columbus

Open Conference: 2026

A 2026 International Conference on the Voynich Manuscript is being organized under the auspices of the Medieval Academy of America. It is not the first such gathering (the University of Malta hosted an online International Conference on the Voynich Manuscript in 2022), but it is a further sign that mainstream scholarship is investing seriously in a resolution.


What Would Actually Settle This?

Scholars have identified what a valid decipherment would require:

  1. A proposed key that decodes at least one full coherent passage into a known language
  2. The decoded text should be independently verifiable (e.g., match a known historical recipe, prayer, or medical formula)
  3. The key should work for multiple unconnected passages, not just isolated words
  4. External corroboration — ideally, finding a contemporary reference to the cipher system used

No proposed decipherment has met these criteria.


See Also