Rasayana: a living atlas of Indian medicine, built as a graph
I wanted one place where a plant connects to its molecules, those molecules connect to the proteins they act on, and every single link shows where it came from. So I built it.
Traditional Indian medicine holds centuries of patient, accumulated observation about plants.
The problem is where that knowledge lives.
It lives in prose. In commentaries, formularies, and oral lineage. It sits far from the molecular pharmacology that could test it, explain it, or simply place it next to a known mechanism.
The plant and the molecule are rarely in the same room.
I wanted to build that room.
Rasayana is a Sanskrit term from Ayurveda for rejuvenation, the path of longevity. I borrowed the name for a living atlas of Indian medicine, built as a knowledge graph.
It is live now at rasayana.vayuai.ai.
Curiosity you can wander, with provenance you can check.
What it actually is
Rasayana is a graph of 9,291 plants, about 89,800 phytochemicals, and a set of protein targets, joined by close to 1.35 million sourced links.
A link is an edge. Every edge says one specific thing: this plant contains this molecule, or this molecule acts on this target.
The point is not the size of the number. The point is what each edge carries.
Every single one carries its source.
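As a minimal sketch of that idea, and not Rasayana's actual schema, an edge can be modelled as a small record that refuses to exist without a source. The class name `Edge`, its field names, and the placeholder values `plant_a`, `molecule_x`, and `Duke` are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One link in the graph. All field names here are hypothetical."""
    subject: str   # a plant or a molecule
    relation: str  # "contains" or "acts_on"
    obj: str       # a molecule or a protein target
    source: str    # the database the link was reported in

    def __post_init__(self):
        # An edge without a source is rejected at construction time.
        if not self.source.strip():
            raise ValueError("every edge must carry its source")

    def describe(self) -> str:
        return f"{self.subject} {self.relation} {self.obj} [source: {self.source}]"


# Placeholder values, not rows taken from the real graph.
edge = Edge("plant_a", "contains", "molecule_x", "Duke")
print(edge.describe())
```

Making the source a required field, rather than an optional annotation, is what lets every later step assume it is present.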
The graph is fused from open scientific databases, not invented. Plant-to-molecule chemistry comes mostly from Dr. Duke's Phytochemical and Ethnobotanical Database. Molecule-to-target activity comes from CMAUP, the Collective Molecular Activities of Useful Plants. Plant context and naming come from Wikipedia.
Three open sources, reconciled into one navigable structure.
Picture the whole idea as one diagram: a plant on the left, a molecule in the middle, a protein on the right, and the name of the database on every arrow.
Why I built it
I keep coming back to the same gap.
Many useful systems are not blocked by a lack of knowledge. They are blocked by the knowledge being unconnected.
Ethnobotany knows that a plant has a use. Pharmacology knows that a molecule has a target. But the two bodies of knowledge mostly grew up apart, in different languages and different file formats.
So a simple question becomes hard:
This plant is traditionally used. What is actually in it, and what do those molecules touch?
That question should take one query, not a literature review.
Rasayana exists to make that one specific kind of curiosity cheap. Start at a plant you are curious about, and walk outward to its chemistry and to the protein targets that chemistry has been reported to act on. Or start at a molecule and walk back to every plant that contains it.
Wander the connective tissue. Check the citation at every step.
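The two walks described above can be sketched in a few lines of Python. This is an in-memory toy under stated assumptions, not how the site runs its queries: the edge tuples, the relation names `contains` and `acts_on`, the function names, and the placeholder entities are all hypothetical.

```python
from collections import defaultdict

# Hypothetical edges as (subject, relation, object, source).
EDGES = [
    ("plant_a", "contains", "molecule_x", "Duke"),
    ("plant_a", "contains", "molecule_y", "Duke"),
    ("plant_b", "contains", "molecule_x", "Duke"),
    ("molecule_x", "acts_on", "target_p", "CMAUP"),
]


def build_index(edges):
    """Index edges in both directions so either walk is a dictionary lookup."""
    forward = defaultdict(list)
    backward = defaultdict(list)
    for subj, rel, obj, source in edges:
        forward[(subj, rel)].append((obj, source))
        backward[(obj, rel)].append((subj, source))
    return forward, backward


def walk_outward(plant, forward):
    """Plant -> molecules -> targets, keeping the source of each hop."""
    paths = []
    for mol, mol_src in forward.get((plant, "contains"), []):
        for tgt, tgt_src in forward.get((mol, "acts_on"), []):
            paths.append((plant, mol, mol_src, tgt, tgt_src))
    return paths


def plants_containing(molecule, backward):
    """Molecule -> every plant with a 'contains' edge pointing at it."""
    return sorted(p for p, _ in backward.get((molecule, "contains"), []))


forward, backward = build_index(EDGES)
print(walk_outward("plant_a", forward))
print(plants_containing("molecule_x", backward))
```

Note that `molecule_y` has no target edge in this toy data, so it produces no path. That mirrors the uneven coverage of any real graph: a missing path means nothing was reported, not that nothing exists.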
How it works
The stack is deliberately ordinary, because the value is in the data and the honesty, not in exotic infrastructure.
A Vite and React front end. A FastAPI back end. Postgres 17 with pgvector for embeddings.
It deploys on Vercel as a static front end plus Python serverless functions, which talk to a single free Neon Postgres instance that holds the entire graph. The full 1.35 million edges live there, not a sampled subset.
Provenance is not a feature bolted on at the end. It is a column. Source travels with the edge through every join, so it is always there to render.
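Here is a minimal sketch of what "source is a column" can look like. It uses Python's built-in `sqlite3` as a stand-in so the example runs anywhere; the real system is on Postgres, and the table names `plant_molecule` and `molecule_target`, the column names, and the rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: each edge table has a NOT NULL source column.
    CREATE TABLE plant_molecule (
        plant    TEXT NOT NULL,
        molecule TEXT NOT NULL,
        source   TEXT NOT NULL
    );
    CREATE TABLE molecule_target (
        molecule TEXT NOT NULL,
        target   TEXT NOT NULL,
        source   TEXT NOT NULL
    );
    INSERT INTO plant_molecule VALUES ('plant_a', 'molecule_x', 'Duke');
    INSERT INTO molecule_target VALUES ('molecule_x', 'target_p', 'CMAUP');
""")

# The join selects both source columns, so each hop keeps its citation.
rows = conn.execute(
    """
    SELECT pm.plant,
           pm.molecule,
           pm.source AS contains_source,
           mt.target,
           mt.source AS acts_on_source
    FROM plant_molecule AS pm
    JOIN molecule_target AS mt ON mt.molecule = pm.molecule
    WHERE pm.plant = ?
    """,
    ("plant_a",),
).fetchall()

for row in rows:
    print(row)
```

Because the column is declared `NOT NULL`, an unsourced edge fails at insert time instead of surfacing later as a blank citation in the interface.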
Three ways in
I built three doors into the same graph, because different people arrive with different questions.
The Library is for wandering. You move between plants, molecules, and targets and follow the links by hand. Click a plant, see its phytochemicals. Click a molecule, see its targets and every other plant that contains it. There is no single right path. The structure is the experience.
The Course is for starting at zero. It is a six-lesson interactive "Learn" path with live widgets. It connects the classical ideas of Ayurveda, the doshas and the six tastes, to real molecules and to modern pharmacology. It is written for someone who has never opened a chemistry textbook, and it earns each connection rather than asserting it.
Ask is for the direct question. It is a natural-language assistant that answers strictly from the graph.
The assistant, honestly
Ask is the part I want to describe most carefully, because this is where AI tools usually start overpromising.
Ask is grounded in the graph. It does not free-associate from a language model's memory.
It works by translation. It takes your plain-English question and writes read-only SQL over the database. It runs that query. Then it answers from the rows that come back, streams the response, and cites where each fact came from.
The constraint is the design. The assistant has read-only access. It cannot modify the data, drop a table, or write a row. It can look, and only look.
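One way to sketch a look-only constraint is in two layers: a cheap check on the statement text, backed by a connection that cannot write even if the check is fooled. The code below is an illustration under assumptions, not the site's implementation. It uses `sqlite3` and its `PRAGMA query_only` as a stand-in; on Postgres the second layer would more plausibly be a database role granted only `SELECT`. The function names and the table are hypothetical.

```python
import sqlite3


def looks_read_only(sql: str) -> bool:
    """Cheap first check: a single statement that starts with SELECT or WITH.

    This is deliberately conservative and not sufficient on its own
    (a WITH clause can precede a write), hence the second layer below.
    """
    stripped = sql.strip().rstrip(";").strip()
    if not stripped or ";" in stripped:
        return False  # empty input or more than one statement
    first_word = stripped.split(None, 1)[0].lower()
    return first_word in {"select", "with"}


def run_read_only(conn, sql, params=()):
    """Run a query only if it passes the text check."""
    if not looks_read_only(sql):
        raise PermissionError("only single read-only statements are allowed")
    return conn.execute(sql, params).fetchall()


# Set up hypothetical data, then switch the connection to query-only mode.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plant_molecule (plant TEXT, molecule TEXT, source TEXT)")
conn.execute("INSERT INTO plant_molecule VALUES ('plant_a', 'molecule_x', 'Duke')")
conn.commit()
conn.execute("PRAGMA query_only = ON")  # second layer: the connection itself refuses writes

print(run_read_only(conn, "SELECT plant, molecule, source FROM plant_molecule"))
```

The text check exists to give a clear error early. The permission on the connection is the part that actually holds, because it does not depend on parsing the generated SQL correctly.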
That constraint is also a boundary on what it is good for.
Ask is good at "what molecules does this plant contain, and what do those molecules target." That is a question the graph can answer from its rows.
Ask is not a clinician. It will not tell you what to take, what dose, or what it will do to your body. Those are not facts in the graph, so they are not facts it will invent.
If it is not in the rows, it does not get said.
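That rule can be reduced to a very small sketch. In the real assistant a language model phrases the answer; the deterministic function below is only a stand-in to show the shape of the contract, and the function name, the row layout, and the wording of the refusal are assumptions.

```python
def answer_from_rows(rows):
    """Build an answer strictly from query results.

    rows: list of (molecule, target, source) tuples, a hypothetical layout.
    """
    if not rows:
        # No rows means no claim, rather than a guess.
        return "The graph has no rows for that question, so there is nothing to report."
    lines = [
        f"{molecule} is reported to act on {target} (source: {source})"
        for molecule, target, source in rows
    ]
    return "\n".join(lines)


print(answer_from_rows([("molecule_x", "target_p", "CMAUP")]))
print(answer_from_rows([]))
```

Every sentence in the output maps to exactly one row and names that row's source, which is what makes the answer checkable.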
The provenance stance
This is the core value, so I will state it plainly.
Every claim in the graph carries its source. That is the entire point of the thing.
A traditional use is shown as a traditional use, not as proven pharmacology.
A molecular target is shown with the database it came from, not as a settled clinical fact.
The honest framing is this: Rasayana maps and connects open data. It does not validate folk claims. It does not turn observation into proof by putting it next to a molecule. And nothing in it is medical advice.
This is a research and education tool. It is a way to see how centuries of plant observation line up against the modern molecular record, with every link checkable. It is not a recommendation engine, and it is not a substitute for a doctor.
I think that distinction is what makes the tool trustworthy rather than impressive.
The limits, stated plainly
A tool is only as honest as its limitations section, so here is mine.
The data is only as good, and only as current, as the open sources it was fused from. If Duke or CMAUP has a gap or an error, Rasayana inherits it.
Coverage is uneven across plants. Some species are richly documented. Others have a handful of edges. The graph reflects what has been studied and published, which is not the same as what is true.
A link between a molecule and a target is a reported activity, not a clinical outcome. "Has been observed to act on" is a long way from "treats." Rasayana shows the former and never claims the latter.
So the value is not a verdict. The value is the connective tissue and the citations.
Rasayana does not tell you what works. It shows you what connects, and exactly where each connection was reported, and then it gets out of your way.
That is a smaller claim than "AI decodes ancient medicine."
It is also a true one, which I care about more.
What I learned building it
Two things stand out.
First, provenance has to be load-bearing, not decorative. If the source is a column that travels through every join, honesty is automatic and free. If it is an afterthought you try to attach later, it quietly disappears under the first complicated query. I put the citation in the schema, not in the copy.
Second, grounding an assistant is mostly about saying no. The hard part of Ask was not getting a language model to write SQL. It was constraining it to answer from the rows and to refuse the questions the graph cannot support. A grounded assistant is defined as much by what it declines as by what it returns.
Where this goes next
There is more graph to build, and more of it to explain.
Deeper sourcing, so the citations get richer and the coverage gaps shrink.
More target and pathway context, so a molecule-to-protein edge can sit inside the biology it belongs to instead of standing alone.
And more of the Course, because the connective tissue is only useful to someone who has been taught how to read it.
The direction is the same as the first day. Put the plant and the molecule in the same room. Show the source on every link. Let people wander, and let them check.
A connection you cannot trace is a rumor. A connection with its source attached is a starting point.
For me, that is the whole difference, and it is the line Rasayana is built on.