LLMs excel at search and discovery. Why not use them to find instrumental variables (IVs) for causal models?
In a new section of Causal Book, "Using LLMs for IV discovery and data," we offer a prompt template for discovering candidate IVs and locating actual data for them. We tested it with the latest Gemini (2.5 Pro Preview 06-05-2025), and the results are promising.
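To give a feel for what a prompt template of this kind might look like, here is a minimal sketch in Python. It is not the template from the book: the function name `build_iv_prompt`, its parameters, and the wording of the prompt are all hypothetical, chosen only to illustrate the idea of asking for candidates alongside the standard IV conditions (relevance, exclusion, independence) and a possible data source.

```python
def build_iv_prompt(treatment: str, outcome: str, context: str, n_candidates: int = 5) -> str:
    """Assemble an illustrative IV-discovery prompt (hypothetical wording, not the book's template)."""
    lines = [
        "You are assisting with an instrumental variable (IV) analysis.",
        f"Setting: {context}",
        f"Treatment: {treatment}",
        f"Outcome: {outcome}",
        "",
        f"Propose up to {n_candidates} candidate instruments. For each one:",
        "1. Relevance: explain why it should shift the treatment.",
        "2. Exclusion: explain why it should affect the outcome only through the treatment.",
        "3. Independence: note confounders that could threaten its validity.",
        "4. Data: name a public data source where it might be measured, "
        "and say so if you are unsure one exists.",
    ]
    return "\n".join(lines)


# Example usage with made-up inputs; the resulting text can be pasted into any LLM chat.
example_prompt = build_iv_prompt(
    treatment="years of schooling",
    outcome="hourly wage",
    context="US labor market",
    n_candidates=3,
)
print(example_prompt)
```

Whatever the model returns should be treated as a list of hypotheses: each candidate still needs the usual checks of instrument strength and plausibility, and any data source it names should be verified before use.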
The new section is the latest addition to the IV design pattern chapter of Causal Book. The book itself aims to:
- provide solution patterns and their code implementations in R and Python,
- discuss different approaches to the same pattern on the same data (Statistics, Machine Learning, Bayesian),
- demystify some surprising (or seemingly surprising) challenges in applying the causal design patterns.
See the full table of contents here.
Next, we'll dive into the regression discontinuity design pattern, which I hope will be even more fun with the newly added support in DoubleML.