This work shows that the complex nonlinear computation LLMs perform for attribute extraction can be well approximated by a simple linear function, and, more importantly, without requiring a conceptual model of that computation.
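To make the claim concrete, here is a minimal sketch of what such a linear approximation looks like: an affine map o ≈ Ws + b is fit by least squares from subject representations to attribute representations, and its faithfulness is checked with R². The data below are synthetic stand-ins, not real LLM hidden states, and the dimensions and fitting procedure are illustrative assumptions rather than the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 200  # hidden-state dimensionality and number of pairs (placeholders)

# Synthetic stand-ins for hidden states; in the study these would be
# read out of the LLM for prompts like "The capital of France is".
W_true = rng.normal(size=(d, d)) / np.sqrt(d)
b_true = rng.normal(size=d)
S = rng.normal(size=(n, d))                                  # subject states
O = S @ W_true.T + b_true + 0.05 * rng.normal(size=(n, d))   # attribute states

# Fit the affine approximation o ≈ W s + b by least squares,
# appending a constant column to absorb the bias term.
S_aug = np.hstack([S, np.ones((n, 1))])
coeffs, *_ = np.linalg.lstsq(S_aug, O, rcond=None)
W_hat, b_hat = coeffs[:-1].T, coeffs[-1]

# How faithfully does the linear map reproduce the model's outputs?
pred = S @ W_hat.T + b_hat
r2 = 1.0 - np.sum((O - pred) ** 2) / np.sum((O - O.mean(axis=0)) ** 2)
print(f"R^2 of the affine approximation: {r2:.3f}")
```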
The study has two main findings:
1. Some of the implicit knowledge is represented in a simple, interpretable, and structured format.
2. This representation is not universally used: superficially similar facts can be encoded and extracted in very different ways.
This is an interesting study that highlights the simplistic, associative nature of language models and the resulting unpredictability in their outputs.