Exception handling in data science

I started coding in C and used C# before switching to Java (those were the dark days). Today, R and Python make up my daily stack (and I dabble in Julia for fun). In retrospect, one habit I wish the data science community had developed better is exception handling. More often than not, I come across code that is doomed to crash (hey, it’s called an exception for a reason).

The next time you write code to analyze data, remember this video of division by zero. This is the pain and suffering your computer goes through when your denominator hits zero. It is also a reminder to appreciate (rather than complain about) the errors and warnings in R and Python (my former and future students, hear me?).
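
In that spirit, here is a minimal sketch of the habit I am asking for, in Python (the same logic works with tryCatch() in R): anticipate the zero denominator and handle it, instead of letting the whole analysis crash.

```python
def safe_ratio(numerator, denominator):
    """Return the ratio, or None with a warning, instead of crashing on a zero denominator."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        print("Warning: denominator is zero; returning None instead of crashing.")
        return None

# Example: conversion rates where one day had zero visits.
pairs = [(10, 200), (3, 0), (7, 140)]
conversion_rates = [safe_ratio(sales, visits) for sales, visits in pairs]
print(conversion_rates)  # [0.05, None, 0.05]
```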

I received the video via personal communication and would be happy to add a source.

LLMs are most useful for experts on a topic…

…because experts are more likely to know what they don’t know. When users don’t know what they don’t know, so-called "hallucinations" are less likely to be detected. This seems to be a growing problem, likely exacerbated by the Dunning-Kruger effect. That, at least, is my take.

In the study cited in the article, several LLMs were asked to summarize news articles to measure how often they "hallucinated," or made up facts.

The LLMs showed different rates of "hallucination", with OpenAI having the lowest (about 3%), followed by Meta (about 5%), Anthropic’s Claude 2 system (over 8%), and Google’s Palm chat with the highest (27%).

Source

Some advice from my assistant on how to analyze data

Approach your data with a mix of skepticism and curiosity. Treat data as a storyteller that doesn’t directly tell the truth, but offers clues that, with rigorous analysis and critical thinking, reveal the deeper narrative. Cultivate the art of asking the right questions, not only of the data, but also of the stakeholders. Understand the context and underlying processes that generate the data to avoid making inference on seeming signals that may actually be noise. And never forget the value of communication and storytelling; your insights are only as valuable as your ability to effectively communicate them.

Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence

The Executive Order defines “AI” as:
“a machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments.”

This means that the scope is not limited to generative AI, which is good. Using "AI" as an umbrella term may still not be a good idea, for the reasons my assistant lists below, but I hope this is a first step in the right direction.

"AI" as a blanket term

On the positive side, it simplifies communication by grouping together a wide range of models that emulate human cognitive functions such as learning, problem solving, and pattern recognition. This simplification can be beneficial for educational purposes, policymaking, and promoting public understanding. It provides a convenient shorthand for discussing innovations ranging from simple algorithms to complex neural networks without getting bogged down in technical details.

On the downside, however, the term can be misleading because of its broad scope and the public’s varying interpretations of what AI encompasses. It can conflate rudimentary software with advanced machine learning models, leading to inflated expectations or undue fear. In addition, the broad use of the term can obscure the nuanced ethical, legal, and socioeconomic implications specific to different AI applications, thereby hindering focused debate and thoughtful regulation. The blanket term can also obscure the significant differences in the capabilities and risks of different AI models, potentially leading to a one-size-fits-all approach to policy and governance.

Source

Baggage handling at airports

Every time I fly, I am struck by how archaic airport baggage handling still is. Sure, the airline industry is infamous for maintaining its legacy Fortran and COBOL software, but that’s because aviation is a pioneer in using computers to run its operations. Meanwhile, baggage handling remains a bottleneck in air travel because the process is highly manual and inefficient (the one exception being the conveyor system, which seems to have been in use since 1971).

When robots take over (or assist with) baggage handling, overall passenger satisfaction is likely to improve. Increased use of robots to solve such low-stakes bottleneck problems may also help improve the public perception of robots.

evoBot looks like one of the robots that can solve this problem. The robot achieves excellent balance using an inverted pendulum design, and can reach speeds of 37 mph and carry over 220 pounds. Pretty impressive.

Not shown in the video, but it can also lift luggage off the ground and deliver it to its destination (airplane or the 1971 conveyor belt). Munich Airport seems to have tested it already. I hope to see it in action soon.

Does ChatGPT know Chinese?

If you ask it, its answer is "Yes." If you ask whether it "understands" Chinese, its answer is again "Yes," without hesitation. Searle’s 1980 Chinese Room argument is more relevant than ever in the age of LLMs:

Suppose a model (the box in the picture) behaves as if it understands Chinese. It takes Chinese characters as input and produces other Chinese characters as output. The model performs its task so convincingly that it comfortably passes the Turing test: it convinces a human Chinese speaker that the model is itself a live Chinese speaker. To all of the questions that the person asks, it makes appropriate responses, such that any Chinese speaker would be convinced that they are talking to another Chinese-speaking human being. In this case, does the machine literally understand Chinese? Or is it merely simulating the ability to understand Chinese?

More recently, in his book, Searle linked his original argument to consciousness, but that’s probably a higher bar than needed to argue that ChatGPT is a box that has no idea what it’s talking about.

Autonomous taxis are boring

I took several rides in Google’s Waymo robotaxi. This is a short video of the experience, which is great, almost flawless. One problem is that it gets boring really fast.

About half the time during my trip, I used Uber or Lyft instead of a Waymo, and I met a professional dancer, a retired chef, a compliance officer, a criminal justice expert, an Amazon truck driver, and a painter.

I had really fun conversations that touched on "AI" and dance music, the best old-school restaurants in town, the private equity fundraising process, cybersecurity, privacy, more "AI," and so on. As a bonus, almost all of the conversations included some useful local information about the city.

None of the robotaxi rides had any of this; serendipity was nonexistent. The robot feels friendly, sure, but that’s about it. The longer the ride, the more boring it gets.

Replacing influencers with generative models

Replacing influencers with generative AI looks like a great use case. The real question is whether influencers who promote “AI” will also be replaced by AI.

Some details:
With just a few minutes of sample video from the person to be cloned and a payment of $1,000, brands can clone a human streamer to work 24/7.

The AI videobots may already be having some economic impact: the average salary for livestream hosts in China is down 20% from 2022 (just another YoY figure, not a causal effect).

Source

AI as an umbrella term

This is based on a recent Nature study, and it’s useful with a caveat that may make the findings and visuals less striking than they look:

"Nature searched for articles, reviews and conference papers in Scopus, with titles, abstracts, or keywords containing the terms 'machine learning'; 'neural net*', 'deep learning', 'random forest', 'deep learning', 'support vector machine', 'artificial intelligence', 'dimensionality reduction', 'gaussian processes', 'naïve bayes', 'large language models', 'llm*', 'chatgpt', 'gaussian mixture models', 'ensemble methods'."

So SVMs, naive Bayes, random forests, and ensemble methods are all called AI (not untrue, but…). Gaussian processes? Then any paper with a GP regression counts. Dimensionality reduction? Then papers using PCA or LDA count, too.

Unfortunately, this feeds the trend of using AI as an umbrella term.

Source

Using predictive modeling as a hammer when the nail needs more thinking

The business problem is to put a lifeguard station on a beach to save some lives (i.e., find the best location for the station). This is not really a predictive modeling problem. But that’s the hammer our data scientists have, and they have access to fancy libraries. There is also some historical data: swimmers rescued and drowned at other beaches. It all checks out. Resistance to pip install prophet is futile.

Transforming the problem into an objective function could have signaled that this is an optimization problem (a prescriptive modeling problem), but that step was skipped. In the picture shown, we may need a solution that:

– minimizes distance => Solved using pip install fancy_library
while also…
– minimizes time => The domain expert enters the room
– minimizes swimming => The labor union intervenes
– minimizes time to ice cream => The executive leadership steps in
– [not shown] minimizes walking on sand => The Department of Labor requirement
and hopefully not…
– maximizes time => A junior data scientist solves the problem

So, the ideal solution requires more thinking about the problem. For example, maximizing the number of lives saved may actually require constraints on how to minimize time so that lifeguards don’t risk their lives during the rescue.
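
As a hedged sketch of what writing the objective down might look like (toy numbers and a made-up beach geometry, not a real formulation), we could place the station to minimize the expected rescue time over historical incident locations, given different running and swimming speeds:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Toy incident locations: (x along the beach, meters offshore), shoreline at y = 0.
incidents = np.array([[20.0, 30.0], [65.0, 10.0], [80.0, 45.0], [120.0, 25.0]])
run_speed, swim_speed = 5.0, 1.5  # assumed meters per second on sand vs. in water

def expected_rescue_time(station_x):
    # Simplification: the lifeguard runs along the beach, then swims straight out.
    run_time = np.abs(incidents[:, 0] - station_x) / run_speed
    swim_time = incidents[:, 1] / swim_speed
    return np.mean(run_time + swim_time)

best = minimize_scalar(expected_rescue_time, bounds=(0, 150), method="bounded")
print(f"Place the station at x = {best.x:.1f} m; expected rescue time = {best.fun:.1f} s")
```

A fuller formulation would let the lifeguard enter the water at an angle and add the safety constraints mentioned above, but even this toy version forces the objective to be stated explicitly.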

The law of the instrument works a little too well in predictive modeling (and more generally in machine learning). Objective functions are often lost in translation when they should be an explicit step in the modeling process. Best practice tends to favor performance metrics, even though achieving the highest performance on the wrong function is clearly useless (and sometimes detrimental).

More focus on objective functions and less obsession with "better performance" may be what we need. This would underline the importance of problem formulation and domain knowledge, and undermine the pip install prophet solution.

A combination of Warren Powell’s writing and the accompanying xkcd comic inspired this post (courtesy of xkcd.com).

Scott Cunningham’s “Mixtape”

I have had a copy of Scott Cunningham’s "Mixtape" since it came out. I’ve skimmed through it before, but last night, while putting together a few slides, I read an entire chapter and loved it. It has just enough detail to keep the reader from having to go to the cited work. It is also candid and fun. The print version is nicely sized and designed, so you can blend in and look cool while others around you are reading the latest fiction. The book has already made its impact, and this is probably a late call, but I had to share.

Source

Here is a little reflection and this one is seriously about AI

We seem to have a growing barrier to discussing AI: the use of AI as an umbrella term. My former students will say “Here we go again,” but if something means everything, it means nothing. If we take the time to define what we mean when we refer to AI, it will probably help the conversation.

Attached is a figure I’ve been using in my classes since 2017 to make this point (sorry, not the cat picture but the figure that follows, a timeline from AI to ML to deep learning). We might be better off referring to specific models and algorithms (or at least to a group of models, such as LLMs, rather than to AI).

Over the weekend, I attended a series of discussions on "AI" at the Academy of Management’s annual conference. I had the opportunity to hear the perspectives of great scholars from a variety of backgrounds. Once again, I was puzzled as to what was meant by "AI" in most of the discussions.

Source

What if parallel trends are not so parallel?

In Data Duets, Duygu Dagli and I offered our take on Ashesh Rambachan and Jonathan Roth’s recently published (but long overdue) paper, now titled "A more credible approach to parallel trends."

Problem:
We want to estimate the causal effect of a promotion, say a coupon, on sales. The coupon was sent to newer customers. Did the coupon increase sales? Or would the new customers have bought more anyway?

Solution:
We’ll never know the answer to the last question, but we can answer the first one after making some assumptions. More on this in the post.
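
To make the setup concrete, here is a toy sketch (my own numbers, not data from the post) of the classic difference-in-differences calculation that the parallel trends assumption licenses: compare the treated group’s before-after change in sales to the control group’s change.

```python
import pandas as pd

# Toy group-level means: average sales per customer before and after the coupon.
means = pd.DataFrame({
    "group": ["newer (got coupon)", "older (no coupon)"],
    "pre":  [100.0, 140.0],
    "post": [130.0, 150.0],
})

means["change"] = means["post"] - means["pre"]
did = means.loc[0, "change"] - means.loc[1, "change"]  # treated change minus control change
print(means)
print("Diff-in-diff estimate of the coupon effect:", did)  # 30 - 10 = 20
# This equals the causal effect only if, absent the coupon, newer customers' sales
# would have moved in parallel with older customers' sales; that is the assumption
# the paper shows how to relax and stress-test.
```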

This is less elaborate than our earlier posts on synthetic controls and Lord’s paradox. We will probably keep it this way so that we can post more often.

Source

Public trust in generative models

The fact that 73% of consumers trust content created by generative AI models is intriguing. And it’s not just people playing around with a chatbot for trivial conversations:*

– 67% believe they could benefit from receiving medical advice from a generative AI model
– 66% would seek advice from a generative AI model on relationships (work, friendships, romantic relationships) or life/career plans
– 64% are open to buying new products or services recommended by a generative AI model
– 53% trust generative AI-assisted financial planning

To put this into perspective, only 62% of people say their doctor is the source they trust most for medical advice.**

* 2023 survey by Capgemini of 10,000 consumers
** 2023 survey by OnePoll for Bayer of over 2,000 adults

Source

tidylog

H/T to Travis Gerke: I’ve just discovered the wonderful work of Benjamin Elbers. tidylog provides feedback for dplyr and tidyr operations in R. It is another simple and powerful idea: wrapper functions around the dplyr and tidyr verbs that report what each operation did. This will help greatly with both teaching and "doing."
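
For readers outside the R world, the core idea translates directly: wrap a data manipulation function so that it reports what it just did. A rough Python sketch of the same pattern (not tidylog itself; the function name here is my own) could look like this:

```python
import pandas as pd

def logged_merge(left, right, **kwargs):
    """Wrap pandas.merge and report how many rows matched, in the spirit of tidylog."""
    result = pd.merge(left, right, indicator=True, **kwargs)
    counts = result["_merge"].value_counts()
    print(f"merge: {counts.get('both', 0)} rows matched, "
          f"{counts.get('left_only', 0)} only in left, "
          f"{counts.get('right_only', 0)} only in right")
    return result.drop(columns="_merge")

customers = pd.DataFrame({"id": [1, 2, 3], "segment": ["new", "new", "old"]})
orders = pd.DataFrame({"id": [1, 1, 4], "amount": [10, 12, 7]})
merged = logged_merge(customers, orders, on="id", how="left")
# Prints something like: merge: 2 rows matched, 2 only in left, 0 only in right
```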

Source

Pandas AI

Pandas AI is an interesting and somewhat natural direction for embedding large language models into data science and analytics. It is less of a black box than automated exploratory data analysis tools, yet it still makes things easier.

We will likely see more ideas like Gabriele Venturi’s here. For any serious project, though, we’ll still need skilled humans who understand how the algorithm responds to queries and who can check and confirm that it responds as expected.
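
For a sense of the workflow, here is a rough sketch based on the library’s early interface (treat the exact class and function names as assumptions; the API has been evolving quickly):

```python
import pandas as pd
from pandasai import PandasAI            # early PandasAI interface; newer releases differ
from pandasai.llm.openai import OpenAI

df = pd.DataFrame({
    "country": ["US", "DE", "TR", "JP"],
    "gdp_trillions": [21.4, 4.2, 0.9, 5.1],
})

llm = OpenAI(api_token="YOUR_API_KEY")   # placeholder key
pandas_ai = PandasAI(llm)

# The LLM translates the question into pandas code and runs it against df.
pandas_ai.run(df, prompt="Which two countries have the highest gdp_trillions?")
```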

Source

Experimental data analysis and the importance of conceptual models

In this new post, Duygu Dagli and I took a quick look at the analysis of experimental data. I really enjoyed writing this piece because Lord’s revelation is one of my favorites (pardon the pun).

Lord’s paradox is related to the better-known Simpson’s paradox, and it highlights the importance of constructing the right conceptual model before moving on to modeling the data. In the post, I speculated about one potential conceptual model and discussed its implications for modeling the data at hand.

Frankly, the example in the post had much to unpack. I just picked up on the part that relates to causal models and Lord’s paradox. I also ended up touching on an interesting discussion around the use of diff-in-diff vs. lagged regression models.

After running an experiment, how do you estimate the average treatment effect (ATE)? Which model do you choose to use? In this post, we use five different models with different assumptions to answer the same question. We find five different ATEs… Which one is the correct average treatment effect in this experiment? How do we decide?
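
To make the "same data, different ATEs" point concrete, here is a small simulated illustration (a toy example of mine, not the data from the post) contrasting two of the modeling choices mentioned above: a change-score (diff-in-diff style) model and a lagged-dependent-variable regression. When the groups differ at baseline, the two estimates diverge, which is Lord’s paradox in miniature.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 2000

# Treated units start from a higher baseline (e.g., newer, more active customers).
treated = rng.integers(0, 2, n)
pre = 50 + 10 * treated + rng.normal(0, 5, n)
post = 0.8 * pre + 15 + 3.0 * treated + rng.normal(0, 5, n)  # true effect = 3.0

df = pd.DataFrame({"treated": treated, "pre": pre, "post": post, "change": post - pre})

# Model 1: change-score / diff-in-diff style estimate.
m_change = smf.ols("change ~ treated", data=df).fit()
# Model 2: lagged-dependent-variable (ANCOVA) estimate.
m_lagged = smf.ols("post ~ treated + pre", data=df).fit()

print("change-score ATE:     ", round(m_change.params["treated"], 2))  # about 1
print("lagged-regression ATE:", round(m_lagged.params["treated"], 2))  # about 3
```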

In this post, Gorkem Turgut Ozer and I explore these questions (and more) by discussing the differences across models and potential implications. We ended up covering an interesting paradox I enjoyed learning about!

To me, a main takeaway is that business value from data is maximized when the right conceptual model meets the right method. For this to happen more often, data science and pricing leaders need the technical skills to be able to ask the right questions. They also need to build a trusting relationship with their teams so that they can delegate to them and learn from them.

Source

Human learner vs. ChatGPT in taking tests designed for humans

Across the board, ChatGPT is passing exams by answering a mix of short-answer, essay, and multiple-choice questions:

– U.S. medical licensing exam (says the attached study)
– Wharton School MBA exam on Operations Management
– University of Minnesota Law School exams in Constitutional Law, Employee Benefits, Taxation, and Torts

If ChatGPT is able to pass these exams, it is not because ChatGPT is revolutionary (though it is surely impressive) but because they are just bad exams. These exams must lack enough of the components that require some form of creative thinking and imagination.

Source

ChatGPT excitement

What is demonstrated here is a successful translation from human language to code. OpenAI has another project for this purpose: Codex. Microsoft’s GitHub Copilot serves as a specialized version (both are descendants of GPT-3). DeepMind’s AlphaCode and the open-source PolyCoder also target English-to-code translation.

What is missing (and provided by Marco) is the articulation of a solution that stems from a conceptual model, which, in turn, is informed by causal links. For example: diversification reduces asset-specific risk.

Unless ChatGPT reasonably limits the weight of each individual stock based only on the objective stated at the beginning (minimize the standard deviation), without being explicitly instructed to do so, we’d better curb our enthusiasm here.
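
To see why this matters, here is a hedged sketch (toy covariance numbers of my own, not Marco’s example) of a minimum-variance portfolio: with "minimize the standard deviation" as the only objective, the optimizer is free to pile weight into the least volatile stock unless a diversification constraint, such as a cap on individual weights, is added explicitly.

```python
import numpy as np
from scipy.optimize import minimize

# Toy annualized covariance matrix for three stocks (illustrative numbers only).
cov = np.array([
    [0.04, 0.01, 0.01],
    [0.01, 0.09, 0.02],
    [0.01, 0.02, 0.16],
])

def portfolio_sd(w):
    return np.sqrt(w @ cov @ w)

budget = {"type": "eq", "fun": lambda w: w.sum() - 1}  # weights sum to 1
w0 = np.repeat(1 / 3, 3)

# Objective only: minimize SD, long-only.
no_cap = minimize(portfolio_sd, w0, bounds=[(0, 1)] * 3, constraints=[budget])
# Same objective plus an explicit diversification constraint: no stock above 40%.
capped = minimize(portfolio_sd, w0, bounds=[(0, 0.4)] * 3, constraints=[budget])

print("min-SD weights, no cap: ", no_cap.x.round(2))   # concentrates in the least volatile stock
print("min-SD weights, 40% cap:", capped.x.round(2))   # forced to diversify
```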

Source

Just tried out ChatGPT…

Just tried out ChatGPT, the new large language model trained by OpenAI, and I was blown away by its capabilities! It can generate human-like text responses to any prompt, making it a powerful tool for conversation simulation, language translation, and more.

I also had a chance to play around with the code, and it’s surprisingly simple to use. Here’s a quick example of how to generate a response from ChatGPT using the Python API:
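
(For context, a minimal call with the OpenAI Python library available at the time looked roughly like the sketch below; it is not the post’s own snippet, and the model name and parameters are illustrative.)

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

response = openai.Completion.create(
    model="text-davinci-003",    # a completion model available at the time
    prompt="Write a two-sentence post about trying out ChatGPT.",
    max_tokens=60,
)

print(response.choices[0].text.strip())
```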

Not bad, but a bit too excited (blown away, really?). Also, the shameless self-promotion using my voice without any disclosure. We’re not off to a good, trusting start.