Using LLMs for IV discovery and data

LLMs excel at search and discovery. Why not use them to find instrumental variables (IVs) for causal models?

A new section of Causal Book, "Using LLMs for IV discovery and data," offers a prompt template for discovering candidate IVs along with actual data for them. We tested it with Gemini 2.5 Pro Preview 06-05-2025, the latest version at the time of writing, and the results are promising.
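To give a rough idea of what a prompt for this task could look like, here is a minimal Python sketch. It is not the template from the book: the function name `build_iv_prompt`, its parameters, the wording of the prompt, and the schooling/wage example are all hypothetical and chosen for illustration only. The sketch simply asks the model to justify each candidate against the standard IV assumptions (relevance, exclusion, independence) and to point to a data source.

```python
from textwrap import dedent


def build_iv_prompt(treatment: str, outcome: str, context: str) -> str:
    """Return a prompt asking an LLM for candidate instruments and data sources.

    Hypothetical helper for illustration; not the template from Causal Book.
    """
    template = dedent("""\
        You are assisting with an instrumental variable (IV) analysis.
        Treatment: {treatment}
        Outcome: {outcome}
        Context: {context}

        List candidate instruments. For each candidate, explain:
        1. Relevance: why it should affect the treatment.
        2. Exclusion: why it should affect the outcome only through the treatment.
        3. Independence: why it is plausibly unrelated to unobserved confounders.
        4. Data: a publicly available dataset that contains it, if one exists.
        If you are not sure a dataset exists, say so instead of guessing.
        """)
    return template.format(treatment=treatment, outcome=outcome, context=context)


# Example inputs are made up for illustration.
prompt = build_iv_prompt(
    treatment="years of schooling",
    outcome="hourly wage",
    context="US adults, cross-sectional survey data",
)
print(prompt)
```

The resulting string can be pasted into any chat interface or sent through an API client. Whatever the model returns should be treated as a list of leads to verify, since both the instruments and the suggested datasets need to be checked by hand.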

This section is the latest addition to the IV design pattern chapter of Causal Book. The book itself aims to:

  1. provide solution patterns and their code implementations in R and Python,
  2. discuss different approaches to the same pattern on the same data (Statistics, Machine Learning, Bayesian),
  3. demystify some surprising (or seemingly surprising) challenges in applying the causal design patterns.

See the full table of contents here.

Next up is the regression discontinuity design pattern, which I hope will be even more fun thanks to the newly added support in DoubleML.