The Unjournal

Making impactful research more rigorous — and rigorous research more impactful

David Reinstein · Founder & Co-Director, unjournal.org

A talk for university researchers

In one slide

The Unjournal (unjournal.org) — a grant-funded nonprofit that pays experts to publicly evaluate and rate research, and assesses Pivotal Questions for stakeholders.

We aim to make impactful research more rigorous, and academic work more useful.

We support open science, open access & transparency.

We work to improve peer-review — aligning research incentives with truth-seeking and social value.

Focus: economics, policy & quantitative social science with global-impact potential  ·  unjournal.pubpub.org

A bit about me

A typical day in academia — before I left to chase impact.

1 · The limitations of journal peer-review

An old system, still running the show

Journals do real things — curation, dissemination, community, a trusted signal.

But we already disseminate ourselves — working papers, arXiv, SSRN, dynamic docs — and a 17th-century filter still governs careers and credibility.

So the scarce, valuable part is increasingly the evaluation — the expert judgement — and we throw most of it away.

The biggest cost isn’t fees or paywalls: it’s the game

The average economics paper goes to 3–4 journals before it’s placed.

Reviewer time alone is a back-of-envelope ~$150M/year in economics.

But the biggest cost is authors’ time — reformatting, resubmitting, journal-shopping, strategising instead of just improving the work.

…e.g. “spin it as a hamburgers-economics paper for the American Hamburger Journal”

★ THE PUBLISH-OR-PERISH SLOT MACHINE ★
insert: 1 finished paper · ~6 months / spin
1 · pick an arm (which journal?)
AER · EJ · JDE
2 · pull the lever → wait ~6 months →
reject · R&R · accept
PAYOUT: one line on your CV
Careers staked on a noisy, slow spin — each spin ~6 months; placement takes 2–6 years.

“Playing this game diverts us from producing the most credible, useful research.”

“Published — so stop bothering me about it”

Journals take one format: ~30 static pages.

Publication says “done” — so we slice off the next paper.

Little room to improve, correct, or build in place.

Research as a living document

Evaluation needn’t be chained to a 30-page PDF:

Any format — dynamic, interactive, replicable documents.

Improve in place → then ask for further evaluation.

Open evaluation feeds open science & replication.

An interactive specification-curve / “multiverse” document: easy to build now, far more useful than static pages.

Separating evaluation from publishing → a world of benefits

Decouple the evaluation from the 30-page PDF and it becomes a citable, first-class object — DOI, metadata, discoverable — instead of a one-shot stamp in a “PDF prison”.

2 · What The Unjournal is

unjournal.org  ·  info.unjournal.org  ·  unjournal.pubpub.org

We are not a journal; we don’t “publish papers”

A non-profit commissioning open evaluation of publicly-hosted research with potential for global impact.

We commission and pay for expert evaluation — and authors can still publish in a journal too.

Multiple evaluations, structured ratings, and an author response — all public, with DOIs.

Credible, citable peer review — not tied to any journal’s accept/reject.

Funders: Survival & Flourishing Fund, Long-Term Future Fund, EA Infrastructure Fund

What an evaluation gives you

We don’t accept/reject or assign a tier, so we benchmark instead. Two halves, equally important:

The substance

  • Detailed referee-style reports
  • The authors’ response
  • Editorial summary

The structured ratings

  • Percentile ranking
  • Journal-tier equivalent (0–5)
  • Nine criteria, with quantified uncertainty
  • Claim identification and assessment

All public, citable, and comparable.

Inside the evaluation form

What an evaluator actually fills in — every rating elicited with a 90% credible interval, not a point score.
Percentile rating  ·  Methods: justification, reasonableness, validity, robustness
0 · 25 · 50 · 75 · 100
Midpoint 72  ·  90% CI 66 – 78  (tight — a confident rating; one of nine criteria)
Journal-tier rating (0–5)  — elicited as two separate ratings, each with its own 90% CI
“Should” — normative merit  → Midpoint 3.8, 90% CI 2.7 – 4.6 (wide — more uncertain)
“Will” — predicted placement  → Midpoint 3.2, 90% CI 2.5 – 3.9
0 · 1 · 2 · 3 · 4 · 5
0 won’t publish · 1 OK · 2 marginal-B · 3 top-B · 4 marginal-A · 5 top-A (top-5: AER, QJE, Econometrica). Non-integers encouraged; gap between “should” and “will” = the placement lottery.
Claim identification & assessment — evaluators pull out the paper’s key claims and rate, for each, the strength of evidence and its implications.
Rebuilt from the live instrument · open the evaluator form →
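
A rough illustration of how a rating with a credible interval could be represented. This is a minimal sketch: the class name `IntervalRating` and its fields are hypothetical, not The Unjournal's actual data model, and the numbers are simply the example values shown on the form above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalRating:
    """A rating elicited as a midpoint plus a 90% credible interval (hypothetical structure)."""
    midpoint: float
    lower: float
    upper: float

    def __post_init__(self) -> None:
        # The midpoint must sit inside the stated interval.
        if not (self.lower <= self.midpoint <= self.upper):
            raise ValueError("expected lower <= midpoint <= upper")

    @property
    def width(self) -> float:
        """Interval width: a wider interval means a less certain evaluator."""
        return self.upper - self.lower


# Example values from the form above.
methods = IntervalRating(midpoint=72, lower=66, upper=78)    # percentile scale, 0-100
should = IntervalRating(midpoint=3.8, lower=2.7, upper=4.6)  # journal tier, 0-5
will = IntervalRating(midpoint=3.2, lower=2.5, upper=3.9)    # journal tier, 0-5

# Gap between the merited ("should") and predicted ("will") tier.
tier_gap = round(should.midpoint - will.midpoint, 2)
print(methods.width, round(should.width, 2), round(will.width, 2), tier_gap)
```

Storing the interval alongside the midpoint keeps the evaluator's stated uncertainty available for any later comparison, rather than collapsing it to a point score.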

Why this can work now

Some research users want more than “which journal published it” — and they want it faster:

Funders & research users who need evidence to act — e.g. Coefficient Giving, Survival & Flourishing Fund.

They want credible expert judgment, transparent reasoning, quantified beliefs & uncertainty.

What they’re really after: decision-relevance and value of information.

Many don’t yet know which questions matter most — or what the evidence already says. Helping surface and answer those is part of the job (prioritisation; Pivotal Questions).

And researchers keep their independence: we evaluate and prioritise existing public work — we don’t commission, own, or steer the research itself.

A different demand signal from the journal system.

↓ “but doesn’t this need everyone to move at once?”

Solving the coordination problem

Academics broadly agree open evaluation is better, but no one can move first alone.

Funding & grantmaker incentives can tip the balance.

We work to be highly visible — so evaluations & ratings are seen before conventional reviewers weigh in.

Building a bridge, not asking you to jump off one: Fear of Standing Out → Fear of Missing Out.

Making it discoverable where it counts

Unjournal evaluations are indexed in Google Scholar — surfacing with the working paper, not years later.  search “source:unjournal” →

Speed

One round of public evaluation → a credible output now.

A publicly citable signal after one round.

A traditional journal: 6+ months, R&R at best — then maybe accepted after substantial revisions.

For fast-moving topics, that lag means missing the decision window entirely.

AI capabilities · AI’s impact on labour markets · policy windows

How long does it take?

Target ~2–3 months · prioritisation → published package

  • Recruit ~2 evaluators — ~1–2 weeks
  • Evaluations (reports + ratings) — 5+ weeks  (~3-week turnaround target each)
  • Author response: ~2 weeks  (longer if revising)
  • Total target: ~7–10 weeks

Versus a traditional economics journal: ~1–3 years (often 24+ months to acceptance).

Self-reported evaluator effort ≈ 8–32 hours per evaluation. The target above is from our process docs; we track the dates but haven’t yet published a measured median end-to-end.
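
The stage durations above can be added up as a quick sanity check on the target. A small sketch: the stage names are shorthand, and because "5+ weeks" has no stated upper bound, the high end for that stage is an assumption.

```python
# Stage durations in weeks as (low, high), taken from the list above.
stages = {
    "recruit evaluators": (1, 2),
    "evaluations": (5, 6),      # assumption: treat "5+ weeks" as 5 to 6 weeks
    "author response": (2, 2),  # "~2 weeks", ignoring the "longer if revising" case
}

low_total = sum(low for low, _ in stages.values())
high_total = sum(high for _, high in stages.values())
print(f"end-to-end: {low_total} to {high_total} weeks")
```

Under these assumptions the stages sum to a range that sits inside the stated ~7–10 week target; a longer evaluation or revision stage would push the high end out.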

How it works

  1. Find / receive the research
  2. Prioritise for decision-relevance (as a team)
  3. Recruit an evaluation manager → ~2 paid expert evaluators
  4. Reports + ratings + author response (evaluators may adjust)
  5. Manager synthesis → publish the package, with a DOI

Who evaluates? Domain experts from our 180+ pool (½+ hold doctorates, ~40 professors), matched to each paper; paid, named or anonymous.

Our workflow

Watch the 2-minute explainer

A short narrated walk-through of the Unjournal evaluation process  ·  youtu.be/ZCSeAmzMB50

We prioritise research for impact-potential

Prioritisation is triage, not evaluation

First question: will better evidence here change real decisions?

We do prioritise influential, widely-read work — but we don’t chase the merely clever.

How the triage runs

How the team votes

Every candidate paper gets a team vote on impact-potential — Strong Yes / Weak Yes / Unsure / Weak No / Strong No, with vote counts and an average. This is the actual voting board (Coda).
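
One way the vote counts and average could be computed, as a minimal sketch. The numeric coding of the five labels (+2 to -2) and the function name `summarise_votes` are assumptions for illustration, not the scale the actual voting board uses, and the votes are made up.

```python
from collections import Counter

# Hypothetical numeric coding of the five vote labels (an assumption).
SCORES = {
    "Strong Yes": 2,
    "Weak Yes": 1,
    "Unsure": 0,
    "Weak No": -1,
    "Strong No": -2,
}


def summarise_votes(votes: list[str]) -> tuple[Counter, float]:
    """Return the count of each vote label and the average score."""
    if not votes:
        raise ValueError("no votes to summarise")
    counts = Counter(votes)
    average = sum(SCORES[v] for v in votes) / len(votes)
    return counts, average


# Made-up votes for one candidate paper.
example_votes = ["Strong Yes", "Weak Yes", "Weak Yes", "Unsure", "Weak No"]
counts, average = summarise_votes(example_votes)
print(dict(counts), average)
```

Reporting the counts next to the average matters: a split between Strong Yes and Strong No can give the same average as a board full of Unsure votes.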

Some considerations

3 · What we’ve done

Where we are now

57 evaluation packages on PubPub

100+ expert evaluations

180+ evaluators (120+ PhDs, ~40 profs)

~$450 avg evaluator payment

1,000+ structured ratings recorded

40+ field specialists

ISSN 3071-2173 · 501(c)(3) · DOIs

Plus a live prioritized-research pipeline, and Pivotal-Questions workshops & belief elicitations underway.  ·  Founded 2022, public since 2023.

Every rating comes with a credible interval

Dots = evaluator medians; bars = stated uncertainty. Published, decision-relevant evaluations, sorted by midpoint.

One number hides too much

Each paper is rated on ~8 dimensions — Overall, Open Science, Advances Knowledge, Methods, Logic & Communication, Real-World relevance, Global Relevance — each with uncertainty. A spider plot shows strengths and weaknesses a single tier would flatten.  explore the dashboard →

Benchmarking existing signals: a known currency

Predicted vs. merited journal tier (0–5). A translation layer — not an endorsement of placement as the right endpoint.

Overall ratings by research area

Every published evaluation’s overall percentile rating, grouped by research area (✗ marks the area median).  dashboard →

What we’ve evaluated — 57 packages by area

Global health & wellbeing: 15
Development & governance: 10
Economics, welfare & policy: 7
Environment & climate: 6
Meta-science & methods: 6
Animal welfare & markets: 5
Catastrophic & long-term risk: 4
AI & emerging tech: 2
Behaviour & attitudes: 2
Published packages (n=57). Health, development, environment & applied micro — the wheelhouse of many economics departments.

A concrete example: an award-winning evaluation

2024–25 Evaluator Prize (1st) · Evaluation of “Water Treatment & Child Mortality: a meta-analysis”

GiveWell water team (Teryn Mattox): “Very influential.” They had been weighing commissioning their own replication; the evaluation informs chlorination grantmaking.
The paper’s authors, who revised the framing in response: “Thorough and thoughtful… extensive write-up and precise recommendations.”

Do authors find it useful?

Across tracked evaluations

  • 19 of 57 tracked evaluations drew an author response (16 formal)
  • Of 22 closely assessed: 15 showed a positive signal; ~a third were substantively revised
  • For 8 papers we compared drafts: a median of ~22% of changes traced to our feedback (LLM-assisted)
  • Author survey (n≈8): quality ratings ranged 30–90; one respondent: “as good as a standard referee report, or better.”

Author responses · author survey

Did authors adapt? All 57, tracked

Each square = 1 of 57 tracked papers, by combined evidence tier. Green = LLM-analysed, shaded by share of major post-evaluation changes attributed to our feedback · blue = manually-confirmed update · orange = mixed / weak signal · grey = not yet assessed.  LLM attribution via Claude Opus 4.6 — indicative, human verification ongoing.

The people behind The Unjournal

A management team (7) and advisory board (16) govern the process and standards; field specialist team members (~60) source and prioritize research:

David Reinstein · Founding Director
Anirudh Tagat · Co-Director
Gavin Taylor · Management
Bob Kubinec · Management
Hansika Kapoor · Management
Ryan Briggs · Management
Alexander Herwix · Management

Each paper is evaluated by ~2 domain experts, often matched from our evaluator pool:

180+ evaluators in the pool

½+ are economists

½+ hold doctorates

40+ field specialists · 8 areas

The advisory board

Our advisory board — methodologists, forecasters & meta-science researchers across economics, statistics, and policy.

Field-specialist teams (8 areas)

Development economics Anirudh Tagat · Ryan Briggs · Michael Wiebe · Nathan Fiala · Emmanuel Orkoh · Robert Kubinec · Masyhur Hilmy · Wayne Sandholtz · Lee Crawfurd · Yannick Dupraz · Leena Bhattacharya · William Seitz

Global health & well-being Jake Eaton · Rosie Bettle · Charlotte Lane · Shobhit Kulshreshtha · Jonah Goldberg · Valentin Klotzbücher · Priya Lall · Francesco Ramponi · Sarah Reynolds

Economics, welfare & governance David Reinstein · Julian Jamison · Tabaré Capitán · Joel Christoph · Andrei Potlogea · Greg Sasso · Brian Weber · Daniel Horn · Moritz Hennecke · Seth Benzell

Psychology, behavioral science, attitudes Hansika Kapoor · Jonathan Berman · Mattie Toma · Carina Ines Hausladen · Hannah Metzler

Innovation, meta-science, social impact of technology Daniela Cialfi · Jordan Dworkin · Kris Gulati · Andrew Kao · Gavin Taylor · Gary McDowell

Environmental economics Tanya O’Garra · Ben Balmford

Animal welfare (markets, attitudes) Josh Tasoff · Kevin Kuruc · Florian Habermacher · Nicolas Treich · Ash Mader · Brinda Poojary

Catastrophic risks, AI governance & safety David Manheim · Anca Hanea · Alexander Herwix · Tristan Williams

These specialists span many universities and institutions worldwide — a good chance some are already in your department.

4 · Pivotal Questions

The Pivotal Questions project

From single papers → identifying stakeholders’ specific ‘operationalised’ questions that matter:

What would change key decisions — and what research evidence informs it?

What do experts believe now, and how uncertain are they?

Researchers + practitioners + stakeholders — incl. Founders Pledge, Animal Charity Evaluators; participants from Coefficient Giving & more.

Beliefs on our platforms · overview · workshops: cultured-meat · wellbeing

What 10 workshop forecasts looked like

n = 10 expert forecasts from our cultured-meat workshop — medians with 80% credible intervals.

Voices from the workshops

Cultured-meat workshop — Oana Kubinyecz on cell-line cost drivers (top). Wellbeing workshop — Matt Lerner’s DALY ↔︎ life-satisfaction comparison & Michael Plant (HLI) on imperfect-but-usable metrics (bottom).

5 · Would this be useful to you?

Maybe people here are already involved?

Our field specialists and evaluators are spread across many universities — there’s a good chance some are already in your department.

Some are field specialists who help us prioritise and recruit.

Others sit in our 180+ evaluator pool, matched to papers in their area.

And we’ve evaluated work co-authored by university economists like those in this room.

Where a department’s strengths might fit

A department strength… → …maps onto Unjournal work
Behavioural & experimental → belief elicitation; the Pivotal-Questions forecasts
Environmental & climate → natural-capital valuation; our climate & animal-welfare evaluations
Health, wellbeing & development → cost-effectiveness, WELLBYs, the RCTs we prioritise
Econometrics & methods → meta-science; calibrating our ratings
Open research & reproducibility → public disciplinary judgement alongside repositories & compliance
Development → the field RCTs we prioritise
AI, technological change & labour → AI’s societal & labour-market impact (a fast-growing Unjournal priority)

Ways to engage and adopt this

  • Join our team or evaluator pool
  • Suggest research (and pivotal questions)
  • Use our outputs & data
  • Bring students in
  • Recognise better signals

Evaluate — paid

Staff · postdocs · advanced PhDs

Paid (~$450 avg) — for work you partly do already.

Faster and more visible than a report that vanishes into a journal.

Named or anonymous; counts as service; citable with a DOI.

A referee report you’re proud of becomes a public, citable output.

Submit or suggest research

Authors

Submit a working paper → credible public evaluations and ratings.

The journal path stays open — get feedback and a public signal before it resolves.

Or suggest others’ high-impact work — anonymously if you like.

Why request public evaluation?

A public commitment — and a signal.

“I’m willing to have this evaluated openly.”

Requesting open evaluation can itself carry information — strong-but-under-credited work has the most to gain.

Try the model live

Interactive: adjust the prior, the selection effect, and the evaluation’s informativeness.  open in a new tab → (if the embed doesn’t load)
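
The three levers named above can be illustrated with a toy Bayesian update. This is only a sketch under stated assumptions, not the linked interactive model: the function `bayes_update`, the two-type (strong / weak) setup, and every parameter value are hypothetical.

```python
def bayes_update(prior: float, p_if_strong: float, p_if_weak: float) -> float:
    """Posterior P(strong | observation) via Bayes' rule."""
    numerator = prior * p_if_strong
    denominator = numerator + (1 - prior) * p_if_weak
    if denominator == 0:
        raise ValueError("observation has zero probability under both hypotheses")
    return numerator / denominator


# All parameter values below are illustrative assumptions.
prior = 0.3            # reader's prior that the paper is strong
request_strong = 0.6   # selection effect: strong work requests evaluation more often...
request_weak = 0.2     # ...than weak work does
accuracy = 0.8         # informativeness: P(favourable | strong) = P(unfavourable | weak)

# Step 1: the act of requesting open evaluation is itself a signal.
after_request = bayes_update(prior, request_strong, request_weak)
# Step 2: a favourable evaluation updates the belief again.
after_good_eval = bayes_update(after_request, accuracy, 1 - accuracy)
print(round(after_request, 3), round(after_good_eval, 3))
```

With no selection effect (equal request rates) the first step leaves the prior unchanged, and with `accuracy = 0.5` the evaluation carries no information; the size of each update depends entirely on these assumed inputs.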

Evaluation unlocks credibility — wherever you’re from

A paper from a famous department is trusted on pedigree. Strong work from a less-prominent university can stay locked behind prestige filters — a credible public evaluation is the key that opens it: portable, structured evidence that travels independently of where the author sits.

When is requesting worth it?

Most valuable when your work is strong but under-credited — or sitting just below the bar.

If committing to open evaluation becomes a positive signal, you’ll want in early.

Less of a clear win when the work already clears the bar and it’s a sensitive career moment.

Timing concerns? Talk to us — we can embargo or schedule.

Full “model” (v. preliminary, ~Fable-generated with human feedback): unjournal-reluctance-note.netlify.app

Students & early-career researchers

See what economists, funders and practitioners actually care about — a methodological conversation that sharpens your own work.

Do real peer review, and get feedback on your evaluation from us and often from the authors.

Gain visibility within a network of funders, grantmakers and impact-minded researchers.

And potential RA / fellowship roles: evaluation, meta-research, Pivotal-Questions support.

Use our outputs

Evaluation packages & prioritisation → a vetted evidence base to build on, teach, cite, and discuss.

Pivotal Questions & workshops → framing for agendas, grants, collaborations, research-impact cases.

Public evaluations → possible research-assessment, grant, or esteem evidence (e.g. the UK’s REF).

The ratings dataset → meta-analysis, and field-experiment collaborations on the evaluation process itself.

Visibility to research users

Funders and nonprofits read these evaluations.

Some use them in grantmaking and methodology.

A route to feedback, uptake, and sometimes collaboration.

A way to put careful work in front of people who actually use evidence.

Recognise better signals

A strong public evaluation speaks to quality and usefulness directly — alongside what the venue signals, not only where it landed.

Multidimensional ratings with uncertainty, expert reports and discussion, an author response, a citable DOI.

For research leaders & managers — encouraging engagement signals a commitment to rigour, transparency and innovation.

And it opens the research-impact channel: our funder and practitioner network, including Pivotal Questions.

6 · Looking ahead

AI makes evaluation more important

A flood of plausible AI-generated papers — some correct and useful, many not.

So more need for efficient, transparent evaluation — connected to real stakeholders and impact.

AI can help: scalable code and data checks.

But the current consensus: keep a human in the loop for the final calls.

Not “does it fit a top-5 template” — but “is it true, and does it matter?”

How does AI evaluation compare to humans?

One exploratory pilot · ~45 papers

A “frontier” (Jan. 2026) LLM vs. our human ratings: only modest rank agreement (r ≈ 0.3).

Human–human agreement still exceeds human–LLM.

On written critiques, LLMs catch some of the human evaluators’ concerns, but ~half of their flags aren’t substantive.

Not yet a substitute — but an open question, and we’re exploring AI prioritisation, research reasoning, and alignment here.

Preliminary methods & results: llm-uj-research-eval.netlify.app/methods
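
For readers unfamiliar with rank agreement, here is a minimal sketch of one common measure, Spearman's rank correlation. Treating the reported r as Spearman's is an assumption (the slide does not name the statistic), the function names are illustrative, and the ratings below are made up rather than pilot data.

```python
def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    return pearson(average_ranks(x), average_ranks(y))


# Made-up percentile ratings for five papers (not pilot data).
human = [72, 55, 80, 40, 65]
llm = [70, 60, 62, 58, 75]
print(round(spearman(human, llm), 2))
```

The same function applied to two human evaluators' ratings of the same papers would give the human–human benchmark that the human–LLM figure is compared against.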

Questions for you

  • What would make an evaluation count as evidence of quality — reliable, meaningful, valued?
  • Where would faster public evaluation be most useful?
  • What would make public evaluation feel safe and valuable for authors?
  • How could this invigorate teaching and research training?
  • How could it help build agendas, attract funding, and demonstrate value (e.g. research-assessment exercises like the UK’s REF)?
  • Which of your department’s strengths connect most naturally?

Thank you

What does open (Unjournal) evaluation provide?

Now: faster, useful feedback + a credible public signal, and useful inputs to practitioners and funders.

Soon: it starts to carry career value.

Eventually: it can replace much or all of what we ask the journal stamp to do.

Which of these would actually help your work?

David Reinstein · contact@unjournal.org · unjournal.org · unjournal.pubpub.org
