How to read a psychedelic clinical trial without getting fooled.
A working operator's guide to the eight things that matter in a Phase 2 readout, and the four that almost always get overhyped. Bring a pen.
Every week a new headline lands: a Phase 2 result described as "promising," a Phase 3 readout called "groundbreaking," a trial that "could change everything." Most of the people writing those headlines have never opened a clinical trial registry. Neither have most of the people sharing them. This piece is for everyone who wants to do better.
This is not academic. It is the framework I use every Tuesday before I write a word.
Start with the primary endpoint. Not the press release.
The first thing I do when a trial result drops is go to the registration on ClinicalTrials.gov. Why? Because the primary endpoint is locked before the trial starts. When a company issues a press release and leads with a secondary outcome, or mentions an unplanned subgroup analysis, that is a signal worth noting.
Primary endpoints are the reason the trial was designed. Secondary endpoints are hypotheses worth exploring. The two are not the same thing. A press release that leads with a secondary outcome while the primary endpoint missed is spinning a miss into a win, and you should call it that.
Check the registration date against the results date. Check whether the endpoint definition changed mid-trial. This is called outcome switching, and it is more common in early-stage psychedelic research than most people acknowledge publicly.
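If you do this every week, the lookup is worth scripting. Below is a minimal sketch against the public ClinicalTrials.gov v2 API. The NCT number is a placeholder, and the field names reflect the v2 schema as I understand it, so check them against a live response before trusting the output.

```python
# Pull the locked registration details for one trial from the ClinicalTrials.gov
# v2 API. The NCT number is a placeholder; verify field names against a live
# response, since the schema is the API's, not mine.
import requests

NCT_ID = "NCT00000000"  # placeholder: substitute the trial you are checking

resp = requests.get(f"https://clinicaltrials.gov/api/v2/studies/{NCT_ID}", timeout=30)
resp.raise_for_status()
protocol = resp.json()["protocolSection"]

# Registration date vs. results date: the gap where outcome switching hides.
status = protocol["statusModule"]
print("First submitted:     ", status.get("studyFirstSubmitDate"))
print("Results first posted:", status.get("resultsFirstPostDateStruct", {}).get("date"))

# The registered primary endpoint(s): compare against what the press release leads with.
for outcome in protocol["outcomesModule"]["primaryOutcomes"]:
    print("Primary outcome:", outcome.get("measure"), "| time frame:", outcome.get("timeFrame"))
```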
Eight things that actually matter in a Phase 2 readout
These are in rough order of importance:
- Sample size and statistical power. A Phase 2 study with 40 participants was never designed to prove efficacy. It was designed to generate signal for Phase 3. Treat it that way. (A back-of-envelope power check follows this list.)
- Blinding quality. Psychedelic trials have a fundamental blinding problem. Active placebos (like low-dose niacin or a low dose of the study drug) are not the same as inert placebos. When you read that a trial was "double-blind," ask what the active control was.
- Rater independence. Who assessed the outcome? Were raters blinded to treatment allocation? In trials where participants often know whether they received the active treatment, rater blinding matters enormously.
- The comparator. Waitlist controls and treatment-as-usual controls are not the same thing. A drug that outperforms a waitlist is doing something. Whether it outperforms an active control with a therapist and structured sessions is a different question.
- Responder rates versus mean change. Headline effect sizes often report mean change on a depression scale. Responder rates (the percentage of participants who hit a clinically meaningful threshold) tell you something different and often more useful. (A toy example after this list makes the difference concrete.)
- Safety data, not just efficacy data. Adverse event tables are rarely what journalists report on. I read them anyway. Serious adverse events, discontinuation rates, and cardiovascular data matter, particularly in at-risk populations.
- Duration of follow-up. A four-week endpoint is not the same as a six-month endpoint. In psychedelic trials especially, the durability of response is the open question. If a trial does not report long-term follow-up, note that gap.
- Industry sponsorship and investigator ties. This is not an accusation. It is context. Sponsored trials are not automatically biased, but they run through different incentive structures than independent academic studies. Both types of evidence belong in your reading, and neither should be taken at face value.
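On the first item, here is the kind of back-of-envelope power check I mean. It is a sketch that assumes a simple two-arm, two-sided t-test, which a real trial design may not match.

```python
# Quick power check for the "40 participants" scenario above. This is a sketch,
# not a substitute for the trial's own power analysis; it assumes a two-arm,
# two-sided t-test with equal allocation.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# 40 participants split 20/20, looking for a moderate effect (Cohen's d = 0.5):
power = analysis.solve_power(effect_size=0.5, nobs1=20, alpha=0.05, ratio=1.0)
print(f"Power to detect d=0.5 with n=20 per arm: {power:.2f}")  # well below the conventional 0.80

# Sample size per arm you would need for 80% power at that effect size:
n_needed = analysis.solve_power(effect_size=0.5, power=0.80, alpha=0.05, ratio=1.0)
print(f"Needed per arm for 80% power: {n_needed:.0f}")  # roughly 64
```

The point is not the exact numbers. It is that a 40-person study hunting a moderate effect was never going to settle the efficacy question.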
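And on responder rates versus mean change, a toy example with invented numbers. Two arms can post the same mean improvement while telling completely different clinical stories.

```python
# Invented data: two hypothetical arms with the same mean improvement on a
# depression scale but very different distributions of who improved.
import numpy as np

rng = np.random.default_rng(0)

uniform_arm = rng.normal(loc=6.0, scale=1.5, size=200)   # everyone improves a bit
bimodal_arm = np.concatenate([
    rng.normal(loc=12.0, scale=1.5, size=100),           # half improve a lot
    rng.normal(loc=0.0, scale=1.5, size=100),            # half barely move
])

THRESHOLD = 8.0  # hypothetical clinically meaningful improvement

for name, arm in [("uniform arm", uniform_arm), ("bimodal arm", bimodal_arm)]:
    print(f"{name}: mean change = {arm.mean():.1f}, "
          f"responder rate = {(arm >= THRESHOLD).mean():.0%}")
# Roughly the same mean change; responder rates near 10% versus near 50%.
```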
The four things that almost always get overhyped
Effect size without context. A Cohen's d of 0.8 sounds impressive. Without knowing what the comparator was, how long the effect lasted, or whether the scale used has clinical relevance, the number is not informative.
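For reference, the arithmetic behind that number is simple, which is exactly why it travels so well without its context. A minimal sketch:

```python
# The standard arithmetic behind a headline effect size. Nothing in this
# function knows what the comparator was, how durable the effect is, or
# whether the scale is clinically meaningful.
import numpy as np

def cohens_d(treatment, control):
    """Standardized mean difference using a pooled standard deviation."""
    treatment, control = np.asarray(treatment), np.asarray(control)
    n1, n2 = len(treatment), len(control)
    pooled_var = ((n1 - 1) * treatment.var(ddof=1) +
                  (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2)
    return (treatment.mean() - control.mean()) / np.sqrt(pooled_var)

# d = 0.8 against a waitlist and d = 0.8 against an active control print the
# same number. They are not the same claim.
```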
"Statistically significant" as a threshold for clinical significance. A p-value below 0.05 means the result is unlikely to be chance. It does not mean the effect is large enough to matter to a patient.
Open-label results presented alongside controlled results. Some of the most-cited positive findings in psychedelic research come from open-label studies. Open-label studies cannot establish causality. They generate hypotheses. Treating them as proof is a category error.
Anecdotal experience from trial participants. Patients who improve often attribute improvement to the drug. Patients often cannot separate drug effect from expectation effect from therapist effect. This is not a criticism of patients. It is a limitation of the study design. Qualitative data from trial participants is interesting. It is not evidence of mechanism.
How to read the numbers quickly
When a Phase 2 result drops, I spend about ten minutes on this checklist before I form an opinion:
- Go to ClinicalTrials.gov. Read the registration.
- Check the primary endpoint. Did the trial hit it?
- What was the comparator? Active control or waitlist?
- What is the sample size? What was the trial powered to detect?
- Read the adverse event table. Any patterns?
- How long was follow-up? Is there a long-term cohort?
- Who funded it? Who were the investigators?
Ten minutes is not sufficient for a full assessment. But it is enough to know whether a press release is representing the data fairly.
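If you want to speed up the registry-facing items on that checklist (primary endpoint, sample size, sponsor), they can be pulled in one call. Same caveats as the sketch earlier: placeholder NCT number, v2 field names as I understand them.

```python
# One-call version of the registry-facing checklist items: sponsor, enrollment,
# and the registered primary endpoints. Placeholder NCT; verify field names
# against a live response.
import requests

def quick_look(nct_id: str) -> None:
    url = f"https://clinicaltrials.gov/api/v2/studies/{nct_id}"
    protocol = requests.get(url, timeout=30).json()["protocolSection"]

    sponsor = protocol["sponsorCollaboratorsModule"]["leadSponsor"]["name"]
    enrollment = protocol["designModule"].get("enrollmentInfo", {}).get("count")
    primaries = [o.get("measure", "?") for o in protocol["outcomesModule"]["primaryOutcomes"]]

    print(f"Sponsor:    {sponsor}")
    print(f"Enrollment: {enrollment}")
    print(f"Primary:    {'; '.join(primaries)}")

quick_look("NCT00000000")  # placeholder: substitute the trial you are checking
```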
A closing note on the state of the field
The psychedelic medicine field is producing genuinely interesting science. Some of it is rigorous. Some of it is not yet ready for the weight being placed on it by investors, patients, and advocates. The gap between those two categories is exactly where misinformation lives.
Reading primary sources is not a superpower. It is just the work. If you want to form informed views about which companies are building on durable evidence and which ones are building on press releases, this is where that work starts.
If you want this kind of analysis in your inbox every Tuesday, the signup is below.
Get this analysis every Tuesday, in your inbox.
Join 8,400+ readers who rely on Psilosignal for honest, cited coverage of psychedelic medicine.
Subscribe free