How AI Read a Herculaneum Scroll: Vesuvius Challenge Breakthrough
TL;DR
– Vesuvius Challenge read an unopened Herculaneum scroll — 2,000 Greek characters unseen since ancient times.
– A convolutional neural network detected ancient ink ghosting inside carbonized papyrus using X-ray CT scans from Diamond Light Source.
– Luke Farritor (Nebraska undergrad) found the first word and won a significant prize for it. The full pipeline was built by Farritor, Nader, and Schilliger.
– The AI knew zero Greek. It only detected ink texture differences. Human scholars did the reading.
– Open-source win: all code, data, and model weights dropped publicly.
– Many scrolls may still be sealed in the Herculaneum library.
—
Nobody could open it. That’s where the story starts.
The Vesuvius Challenge just cracked open a Herculaneum scroll sealed for centuries.
No hands. No unrolling. A machine learning model found the ghost of ancient ink inside a carbonized papyrus roll. And human scholars read 2,000 Greek characters that hadn’t been seen in nearly two millennia. The text: a philosophical argument about pleasure, scarcity, why we always want what we can’t have. Philodemus of Gadara wrote it.
An undergrad from Nebraska found the first word.
Here’s why that matters for the problems you’re actually wrestling with in AI right now.
What Is the Vesuvius Challenge?
An open competition.
Real money. Defined goal: read text inside a sealed scroll without unfolding it.
The Herculaneum scrolls survived Vesuvius. But as black, brittle carbon sticks. No ink visible. No way to unroll them without turning them to dust. For a long time, that was the whole story.
The breakthrough wasn’t a fancier camera. Ink: carbon-based. Papyrus: carbon-based. On standard imaging, the ink is literally invisible.
A ghost pressed into a ghost.
Then someone scanned the scrolls at Diamond Light Source in the UK with high-resolution X-ray CT. Fed those 3D volumetric images into a convolutional neural network. Trained it to spot minute texture and density differences between papyrus with ink underneath and papyrus without.
The machine learned to see what physics alone couldn’t separate.
That’s it. That’s the whole trick.
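To make “trained it to spot texture differences” concrete, here’s a toy sketch in Python. It is not the Challenge’s model: instead of a convolutional network on CT volumes, it fits a one-feature logistic regression on synthetic strips of numbers. The helper names (`make_patch`, `texture_feature`, `train`, `predict_ink`) and the assumption that inked regions are rougher than blank ones are invented for illustration. What it shows is the core idea: both classes have the same average density, so only texture can separate them.

```python
import math
import random

def texture_feature(patch):
    """Mean absolute jump between neighbouring values, scaled up by 100."""
    jumps = [abs(b - a) for a, b in zip(patch, patch[1:])]
    return 100.0 * sum(jumps) / len(jumps)

def make_patch(roughness, rng, n=256):
    """Synthetic strip of 'CT values': mean density 0.5, tunable roughness."""
    return [0.5 + rng.uniform(-roughness, roughness) for _ in range(n)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(features, labels, lr=0.2, epochs=3000):
    """Fit p(ink) = sigmoid(w * feature + b) by batch gradient descent."""
    w = b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(features, labels):
            error = sigmoid(w * x + b) - y
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

rng = random.Random(0)
# Assumption (illustrative only): "inked" strips are rougher than "blank" ones.
patches = [make_patch(0.06, rng) for _ in range(20)]
patches += [make_patch(0.02, rng) for _ in range(20)]
labels = [1] * 20 + [0] * 20
w, b = train([texture_feature(p) for p in patches], labels)

def predict_ink(patch):
    """Probability that an unseen strip is 'inked' under the toy model."""
    return sigmoid(w * texture_feature(patch) + b)

fresh_ink = make_patch(0.06, rng)
fresh_blank = make_patch(0.02, rng)
print(round(predict_ink(fresh_ink), 2), round(predict_ink(fresh_blank), 2))
```

A real detector learns its own features from 3D data instead of being handed one. The shape of the problem is the same, though: labeled examples in, a probability of ink out.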
How Machine Learning Detected Ancient Ink
Here’s what the pipeline actually did:
– Ran each scroll through an X-ray CT scanner at Diamond Light Source
– Generated 3D volumetric image stacks (hundreds of gigabytes per scroll)
– Trained a convolutional neural network on labeled samples of ink vs. blank papyrus
– The model output a probability map: likely ink here, likely blank there
– Human scholars used those maps to read the text
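The steps above can be sketched roughly like this. Everything in it is a hypothetical stand-in: `synthetic_volume` fakes a tiny CT stack indexed as `volume[z][y][x]`, `stand_in_model` is a hand-written texture score where a trained CNN would sit, and the tile size and constants are arbitrary. It only shows how a per-patch detector turns a 3D volume into a 2D probability map.

```python
import math

def extract_patch(volume, y0, x0, size):
    """Cut a (depth x size x size) block out of volume[z][y][x]."""
    return [[row[x0:x0 + size] for row in layer[y0:y0 + size]]
            for layer in volume]

def stand_in_model(patch):
    """Placeholder for a trained CNN: squashes a texture score into (0, 1)."""
    diffs = [abs(row[i + 1] - row[i])
             for layer in patch for row in layer
             for i in range(len(row) - 1)]
    texture = sum(diffs) / len(diffs)
    return 1.0 / (1.0 + math.exp(-200.0 * (texture - 0.05)))

def ink_probability_map(volume, size, model):
    """Tile the volume in non-overlapping patches; one probability per tile."""
    height, width = len(volume[0]), len(volume[0][0])
    return [[model(extract_patch(volume, y0, x0, size))
             for x0 in range(0, width - size + 1, size)]
            for y0 in range(0, height - size + 1, size)]

def synthetic_volume(depth=4, height=16, width=16):
    """Flat 'papyrus' at 0.5, with a textured 'ink' square in the top-left."""
    volume = []
    for z in range(depth):
        layer = []
        for y in range(height):
            row = []
            for x in range(width):
                inked = y < 8 and x < 8
                wiggle = 0.05 if (x + y + z) % 2 == 0 else -0.05
                row.append(0.5 + wiggle if inked else 0.5)
            layer.append(row)
        volume.append(layer)
    return volume

prob_map = ink_probability_map(synthetic_volume(), size=8, model=stand_in_model)
for row in prob_map:
    print(" ".join(f"{p:.2f}" for p in row))
```

On this synthetic stack the top-left tile scores near 1 and the other three near 0. The map is where the machine’s job ends. Reading it is still a human task.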
No magic. No general intelligence.
A narrow detector doing one specific thing.
The same pattern shows up everywhere AI actually delivers in production.
Medical imaging finds early-stage tumors in noisy scans. Satellite imagery spots illegal fishing boats from thermal wake signatures. The AI doesn’t understand the domain. It finds the ghost pattern.
Your data has ghost patterns too. More on that later.
Prize Breakdown: How the Incentives Worked
The Vesuvius Challenge launched with prize money that actually got people’s attention:
– Luke Farritor (then a student in Nebraska): first word detection, “πορφυρας” (purple cloth/dye). Significant prize for about ten characters.
– Youssef Nader (working independently from Berlin): found the same word. Received a separate prize.
– Grand Prize team (Farritor, Nader, Schilliger): built the end-to-end pipeline that read larger passages.
The structure mattered. Open data, open competition, measurable results. Nat Friedman funded it because the problem was well-defined and the progress was testable: either you read text or you didn’t.
Give a small team a specific detection task with a clear metric. They’ll outwork a large team with vague direction every single time. Vesuvius is the extreme version of that principle. But the principle is identical.
AI Reading Ancient Scrolls: Business Takeaways
Here’s what stopped me.
The models used in this project had zero built-in knowledge of ancient Greek.
They detected ink. Only ink. Human scholars read what the machine found. The pipeline was ink detection, not translation.
If you’re building AI into a business process right now, that should be loud.
The wins aren’t coming from one model that does everything. They’re coming from narrow, specific models trained on your specific data, doing one detection task reliably. The Vesuvius pipeline read a scroll that sat in the dark for centuries because three people built a model that did one thing: detect ghost ink. They didn’t try to build a model that understood ancient Greek.
Your data has ghost ink too. Not a metaphor.
In every ops database, every customer interaction log, every sensor feed your business runs, there are signals that look like noise until a specifically trained model surfaces them.
The reason most AI projects fail: teams try to solve too much at once. Pick one detection task. Get the data right. Train the model narrow. Measure the output. That’s the pipeline that just read a book that sat for centuries.
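“Measure the output” can be as small as this. A minimal sketch, assuming your detector emits a score per item and you have ground-truth labels for a held-out sample. The function name `precision_recall`, the scores, and the 0.5 threshold are all made up for illustration.

```python
def precision_recall(scores, labels, threshold=0.5):
    """Precision and recall of a detector that flags scores >= threshold."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label:
            tp += 1          # flagged, and it really was a hit
        elif flagged and not label:
            fp += 1          # flagged, but it was noise
        elif not flagged and label:
            fn += 1          # missed a real hit
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical detector scores and hand-checked labels (1 = real signal).
scores = [0.92, 0.81, 0.40, 0.65, 0.10, 0.05, 0.72, 0.30]
labels = [1, 1, 1, 0, 0, 0, 1, 0]

precision, recall = precision_recall(scores, labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Two numbers, one threshold. If neither moves when you change the model, you haven’t defined the task narrowly enough yet.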
Open Source Won. That’s the Real Story
Code: public. Data: open. Preprint: online. Nobody charging for model weights.
Nobody suing over methodology.
This is what open AI development looks like when it actually produces frontier results.
The Herculaneum library holds many scrolls still sealed. Nobody knows what’s inside. Researchers are already talking about applying the same approach to recycled papyrus wrappings from Egyptian mummies. Texts unreadable for millennia. If even a fraction of that library yields readable text, we’ve just cracked open a window on the ancient world that was closed for centuries.
Next time someone tells you only a closed, expensive, enterprise AI platform can solve your problem, remember that an open competition with a significant prize just read a book that empires couldn’t open. The structure of the problem matters more than the size of your budget.
What to Do With This
Look at your data first. What’s the ghost signal you’ve been calling noise? What detection task have you been skipping because the manual process was “good enough”?
Vesuvius didn’t succeed just by having better technology than centuries of scholars. It succeeded because someone defined the problem as a narrow detection task, built a model for that task specifically, and opened the data for others to test.
Copy the structure. The rest follows.
Frequently Asked Questions
What is the Vesuvius Challenge?
An open competition launched with prize money funded by Nat Friedman. The goal: develop AI that can read text inside sealed Herculaneum scrolls without physically unrolling them. The first word was detected in 2023, and the Grand Prize was awarded in February 2024.
How did AI read the ancient scrolls?
Researchers scanned the scrolls using high-resolution X-ray CT imaging at Diamond Light Source in the UK. The 3D volumetric images were fed into a convolutional neural network trained to detect subtle texture and density differences between papyrus with ink underneath and papyrus without. The AI detected ink. It did not translate Greek. Human scholars read the output.
What did the first readable scroll say?
The first successfully read passage was a work by Philodemus of Gadara, a philosophical text arguing about pleasure, scarcity, and human desire. Approximately 2,000 Greek characters were decoded.
How much were the Vesuvius Challenge prizes?
Luke Farritor received $40,000 for first detecting a word in an unopened scroll. Youssef Nader received $10,000 for an independent detection of the same word. The $700,000 Grand Prize went to the team that built the end-to-end readable pipeline.
What are the business applications of this AI method?
The same pattern, narrow detection models finding ghost signals in noisy data, applies directly to medical imaging, satellite analysis, industrial quality control, and any domain where important information is buried in fine-grained noise that human eyes or traditional processing miss. The lesson for operators: specific, measurable detection tasks with real data beat vague, general-purpose AI initiatives.
Sources
– Vesuvius Challenge official site. Vesuviuschallenge.com
– Diamond Light Source X-ray CT scanning methodology. Diamond.ac.uk
– Preprint: “Restoration of Herculaneum papyri reveals ancient Greek text”. Arxiv.org
– Nat Friedman announcement. twitter.com/natfriedman
