Does ChatGPT know Chinese?

If you ask it, its answer is “Yes.” If you ask whether it “understands” Chinese, its answer is again “Yes,” without hesitation. Searle’s 1980 Chinese Room argument is more relevant than ever in the age of LLMs:

๐˜š๐˜ถ๐˜ฑ๐˜ฑ๐˜ฐ๐˜ด๐˜ฆ ๐˜ข ๐˜ฎ๐˜ฐ๐˜ฅ๐˜ฆ๐˜ญ (๐˜ฃ๐˜ฐ๐˜น ๐˜ช๐˜ฏ ๐˜ต๐˜ฉ๐˜ฆ ๐˜ฑ๐˜ช๐˜ค๐˜ต๐˜ถ๐˜ณ๐˜ฆ) ๐˜ต๐˜ฉ๐˜ข๐˜ต ๐˜ฃ๐˜ฆ๐˜ฉ๐˜ข๐˜ท๐˜ฆ๐˜ด ๐˜ข๐˜ด ๐˜ช๐˜ง ๐˜ช๐˜ต ๐˜ถ๐˜ฏ๐˜ฅ๐˜ฆ๐˜ณ๐˜ด๐˜ต๐˜ข๐˜ฏ๐˜ฅ๐˜ด ๐˜Š๐˜ฉ๐˜ช๐˜ฏ๐˜ฆ๐˜ด๐˜ฆ. ๐˜๐˜ต ๐˜ต๐˜ข๐˜ฌ๐˜ฆ๐˜ด ๐˜Š๐˜ฉ๐˜ช๐˜ฏ๐˜ฆ๐˜ด๐˜ฆ ๐˜ค๐˜ฉ๐˜ข๐˜ณ๐˜ข๐˜ค๐˜ต๐˜ฆ๐˜ณ๐˜ด ๐˜ข๐˜ด ๐˜ช๐˜ฏ๐˜ฑ๐˜ถ๐˜ต ๐˜ข๐˜ฏ๐˜ฅ ๐˜ฑ๐˜ณ๐˜ฐ๐˜ฅ๐˜ถ๐˜ค๐˜ฆ๐˜ด ๐˜ฐ๐˜ต๐˜ฉ๐˜ฆ๐˜ณ ๐˜Š๐˜ฉ๐˜ช๐˜ฏ๐˜ฆ๐˜ด๐˜ฆ ๐˜ค๐˜ฉ๐˜ข๐˜ณ๐˜ข๐˜ค๐˜ต๐˜ฆ๐˜ณ๐˜ด ๐˜ข๐˜ด ๐˜ฐ๐˜ถ๐˜ต๐˜ฑ๐˜ถ๐˜ต. ๐˜›๐˜ฉ๐˜ช๐˜ด ๐˜ฎ๐˜ฐ๐˜ฅ๐˜ฆ๐˜ญ ๐˜ฑ๐˜ฆ๐˜ณ๐˜ง๐˜ฐ๐˜ณ๐˜ฎ๐˜ด ๐˜ช๐˜ต๐˜ด ๐˜ต๐˜ข๐˜ด๐˜ฌ ๐˜ด๐˜ฐ ๐˜ค๐˜ฐ๐˜ฏ๐˜ท๐˜ช๐˜ฏ๐˜ค๐˜ช๐˜ฏ๐˜จ๐˜ญ๐˜บ ๐˜ต๐˜ฉ๐˜ข๐˜ต ๐˜ช๐˜ต ๐˜ค๐˜ฐ๐˜ฎ๐˜ง๐˜ฐ๐˜ณ๐˜ต๐˜ข๐˜ฃ๐˜ญ๐˜บ ๐˜ฑ๐˜ข๐˜ด๐˜ด๐˜ฆ๐˜ด ๐˜ต๐˜ฉ๐˜ฆ ๐˜›๐˜ถ๐˜ณ๐˜ช๐˜ฏ๐˜จ ๐˜ต๐˜ฆ๐˜ด๐˜ต: ๐˜ช๐˜ต ๐˜ค๐˜ฐ๐˜ฏ๐˜ท๐˜ช๐˜ฏ๐˜ค๐˜ฆ๐˜ด ๐˜ข ๐˜ฉ๐˜ถ๐˜ฎ๐˜ข๐˜ฏ ๐˜Š๐˜ฉ๐˜ช๐˜ฏ๐˜ฆ๐˜ด๐˜ฆ ๐˜ด๐˜ฑ๐˜ฆ๐˜ข๐˜ฌ๐˜ฆ๐˜ณ ๐˜ต๐˜ฉ๐˜ข๐˜ต ๐˜ต๐˜ฉ๐˜ฆ ๐˜ฎ๐˜ฐ๐˜ฅ๐˜ฆ๐˜ญ ๐˜ช๐˜ด ๐˜ช๐˜ต๐˜ด๐˜ฆ๐˜ญ๐˜ง ๐˜ข ๐˜ญ๐˜ช๐˜ท๐˜ฆ ๐˜Š๐˜ฉ๐˜ช๐˜ฏ๐˜ฆ๐˜ด๐˜ฆ ๐˜ด๐˜ฑ๐˜ฆ๐˜ข๐˜ฌ๐˜ฆ๐˜ณ. ๐˜›๐˜ฐ ๐˜ข๐˜ญ๐˜ญ ๐˜ฐ๐˜ง ๐˜ต๐˜ฉ๐˜ฆ ๐˜ฒ๐˜ถ๐˜ฆ๐˜ด๐˜ต๐˜ช๐˜ฐ๐˜ฏ๐˜ด ๐˜ต๐˜ฉ๐˜ข๐˜ต ๐˜ต๐˜ฉ๐˜ฆ ๐˜ฑ๐˜ฆ๐˜ณ๐˜ด๐˜ฐ๐˜ฏ ๐˜ข๐˜ด๐˜ฌ๐˜ด, ๐˜ช๐˜ต ๐˜ฎ๐˜ข๐˜ฌ๐˜ฆ๐˜ด ๐˜ข๐˜ฑ๐˜ฑ๐˜ณ๐˜ฐ๐˜ฑ๐˜ณ๐˜ช๐˜ข๐˜ต๐˜ฆ ๐˜ณ๐˜ฆ๐˜ด๐˜ฑ๐˜ฐ๐˜ฏ๐˜ด๐˜ฆ๐˜ด, ๐˜ด๐˜ถ๐˜ค๐˜ฉ ๐˜ต๐˜ฉ๐˜ข๐˜ต ๐˜ข๐˜ฏ๐˜บ ๐˜Š๐˜ฉ๐˜ช๐˜ฏ๐˜ฆ๐˜ด๐˜ฆ ๐˜ด๐˜ฑ๐˜ฆ๐˜ข๐˜ฌ๐˜ฆ๐˜ณ ๐˜ธ๐˜ฐ๐˜ถ๐˜ญ๐˜ฅ ๐˜ฃ๐˜ฆ ๐˜ค๐˜ฐ๐˜ฏ๐˜ท๐˜ช๐˜ฏ๐˜ค๐˜ฆ๐˜ฅ ๐˜ต๐˜ฉ๐˜ข๐˜ต ๐˜ต๐˜ฉ๐˜ฆ๐˜บ ๐˜ข๐˜ณ๐˜ฆ ๐˜ต๐˜ข๐˜ญ๐˜ฌ๐˜ช๐˜ฏ๐˜จ ๐˜ต๐˜ฐ ๐˜ข๐˜ฏ๐˜ฐ๐˜ต๐˜ฉ๐˜ฆ๐˜ณ ๐˜Š๐˜ฉ๐˜ช๐˜ฏ๐˜ฆ๐˜ด๐˜ฆ-๐˜ด๐˜ฑ๐˜ฆ๐˜ข๐˜ฌ๐˜ช๐˜ฏ๐˜จ ๐˜ฉ๐˜ถ๐˜ฎ๐˜ข๐˜ฏ ๐˜ฃ๐˜ฆ๐˜ช๐˜ฏ๐˜จ. 
๐˜๐˜ฏ ๐˜ต๐˜ฉ๐˜ช๐˜ด ๐˜ค๐˜ข๐˜ด๐˜ฆ, ๐˜ฅ๐˜ฐ๐˜ฆ๐˜ด ๐˜ต๐˜ฉ๐˜ฆ ๐˜ฎ๐˜ข๐˜ค๐˜ฉ๐˜ช๐˜ฏ๐˜ฆ ๐˜ญ๐˜ช๐˜ต๐˜ฆ๐˜ณ๐˜ข๐˜ญ๐˜ญ๐˜บ ๐˜ถ๐˜ฏ๐˜ฅ๐˜ฆ๐˜ณ๐˜ด๐˜ต๐˜ข๐˜ฏ๐˜ฅ ๐˜Š๐˜ฉ๐˜ช๐˜ฏ๐˜ฆ๐˜ด๐˜ฆ? ๐˜–๐˜ณ ๐˜ช๐˜ด ๐˜ช๐˜ต ๐˜ฎ๐˜ฆ๐˜ณ๐˜ฆ๐˜ญ๐˜บ ๐˜ด๐˜ช๐˜ฎ๐˜ถ๐˜ญ๐˜ข๐˜ต๐˜ช๐˜ฏ๐˜จ ๐˜ต๐˜ฉ๐˜ฆ ๐˜ข๐˜ฃ๐˜ช๐˜ญ๐˜ช๐˜ต๐˜บ ๐˜ต๐˜ฐ ๐˜ถ๐˜ฏ๐˜ฅ๐˜ฆ๐˜ณ๐˜ด๐˜ต๐˜ข๐˜ฏ๐˜ฅ ๐˜Š๐˜ฉ๐˜ช๐˜ฏ๐˜ฆ๐˜ด๐˜ฆ?

More recently, in his book, Searle linked the original argument to consciousness, but that is probably a higher bar than needed to argue that ChatGPT is a box with no idea what it is talking about.

Autonomous taxis are boring

I took several rides in Google’s Waymo robotaxi. This is a short video of the experience, which was great, almost flawless. The one problem is that it gets boring really fast.

During my trip, about half of my rides were with Uber or Lyft instead of Waymo, and I met a professional dancer, a retired chef, a compliance officer, a criminal justice expert, an Amazon truck driver, and a painter.

I had really fun conversations that touched on “AI” and dance music, the best old school restaurants in town, the private equity fundraising process, cybersecurity, privacy, more “AI” and so on. As a bonus, almost all of the conversations included some useful, local information about the city.

None of the robotaxi rides had any of this; the serendipity was nonexistent. The robot feels friendly, sure, but that’s about it. The longer the ride, the more boring it gets.

Replacing influencers with generative models

Replacing influencers with generative AI looks like a great use case. The real question is whether influencers who promote “AI” will also be replaced by AI.

Some details:
With just a few minutes of sample video from the person to be cloned and a payment of $1,000, brands can clone a human streamer to work 24/7.

The AI videobots may already be having some economic impact: the average salary for livestream hosts in China is down 20% from 2022 (just another YoY figure, not a causal effect).

Source

AI as an umbrella term

This is based on a recent Nature study, and it’s useful with a caveat that may make the findings and visuals less striking than they look:

“๐‘๐‘Ž๐‘ก๐‘ข๐‘Ÿ๐‘’ ๐‘ ๐‘’๐‘Ž๐‘Ÿ๐‘โ„Ž๐‘’๐‘‘ ๐‘“๐‘œ๐‘Ÿ ๐‘Ž๐‘Ÿ๐‘ก๐‘–๐‘๐‘™๐‘’๐‘ , ๐‘Ÿ๐‘’๐‘ฃ๐‘–๐‘’๐‘ค๐‘  ๐‘Ž๐‘›๐‘‘ ๐‘๐‘œ๐‘›๐‘“๐‘’๐‘Ÿ๐‘’๐‘›๐‘๐‘’ ๐‘๐‘Ž๐‘๐‘’๐‘Ÿ๐‘  ๐‘–๐‘› ๐‘†๐‘๐‘œ๐‘๐‘ข๐‘ , ๐‘ค๐‘–๐‘กโ„Ž ๐‘ก๐‘–๐‘ก๐‘™๐‘’๐‘ , ๐‘Ž๐‘๐‘ ๐‘ก๐‘Ÿ๐‘Ž๐‘๐‘ก๐‘ , ๐‘œ๐‘Ÿ ๐‘˜๐‘’๐‘ฆ๐‘ค๐‘œ๐‘Ÿ๐‘‘๐‘  ๐‘๐‘œ๐‘›๐‘ก๐‘Ž๐‘–๐‘›๐‘–๐‘›๐‘” ๐‘กโ„Ž๐‘’ ๐‘ก๐‘’๐‘Ÿ๐‘š๐‘  โ€˜๐‘š๐‘Ž๐‘โ„Ž๐‘–๐‘›๐‘’ ๐‘™๐‘’๐‘Ž๐‘Ÿ๐‘›๐‘–๐‘›๐‘”โ€™; โ€˜๐‘›๐‘’๐‘ข๐‘Ÿ๐‘Ž๐‘™ ๐‘›๐‘’๐‘ก*โ€™, โ€˜๐‘‘๐‘’๐‘’๐‘ ๐‘™๐‘’๐‘Ž๐‘Ÿ๐‘›๐‘–๐‘›๐‘”โ€™, โ€˜๐‘Ÿ๐‘Ž๐‘›๐‘‘๐‘œ๐‘š ๐‘“๐‘œ๐‘Ÿ๐‘’๐‘ ๐‘กโ€™, โ€˜๐‘‘๐‘’๐‘’๐‘ ๐‘™๐‘’๐‘Ž๐‘Ÿ๐‘›๐‘–๐‘›๐‘”โ€™, โ€˜๐‘ ๐‘ข๐‘๐‘๐‘œ๐‘Ÿ๐‘ก ๐‘ฃ๐‘’๐‘๐‘ก๐‘œ๐‘Ÿ ๐‘š๐‘Ž๐‘โ„Ž๐‘–๐‘›๐‘’โ€™, โ€˜๐‘Ž๐‘Ÿ๐‘ก๐‘–๐‘“๐‘–๐‘๐‘–๐‘Ž๐‘™ ๐‘–๐‘›๐‘ก๐‘’๐‘™๐‘™๐‘–๐‘”๐‘’๐‘›๐‘๐‘’โ€™, โ€˜๐‘‘๐‘–๐‘š๐‘’๐‘›๐‘ ๐‘–๐‘œ๐‘›๐‘Ž๐‘™๐‘–๐‘ก๐‘ฆ ๐‘Ÿ๐‘’๐‘‘๐‘ข๐‘๐‘ก๐‘–๐‘œ๐‘›โ€™, โ€˜๐‘”๐‘Ž๐‘ข๐‘ ๐‘ ๐‘–๐‘Ž๐‘› ๐‘๐‘Ÿ๐‘œ๐‘๐‘’๐‘ ๐‘ ๐‘’๐‘ โ€™, โ€˜๐‘›๐‘Ž๐‘–ฬˆ๐‘ฃ๐‘’ ๐‘๐‘Ž๐‘ฆ๐‘’๐‘ โ€™, โ€˜๐‘™๐‘Ž๐‘Ÿ๐‘”๐‘’ ๐‘™๐‘Ž๐‘›๐‘”๐‘ข๐‘Ž๐‘”๐‘’ ๐‘š๐‘œ๐‘‘๐‘’๐‘™๐‘ โ€™, โ€˜๐‘™๐‘™๐‘š*โ€™, โ€˜๐‘โ„Ž๐‘Ž๐‘ก๐‘”๐‘๐‘กโ€™, โ€˜๐‘”๐‘Ž๐‘ข๐‘ ๐‘ ๐‘–๐‘Ž๐‘› ๐‘š๐‘–๐‘ฅ๐‘ก๐‘ข๐‘Ÿ๐‘’ ๐‘š๐‘œ๐‘‘๐‘’๐‘™๐‘ โ€™, โ€˜๐‘’๐‘›๐‘ ๐‘’๐‘š๐‘๐‘™๐‘’ ๐‘š๐‘’๐‘กโ„Ž๐‘œ๐‘‘๐‘ โ€™.”

So SVM, naive Bayes, random forest, and ensemble methods are all called AI (not untrue, but…). Gaussian processes? Then any paper with a GP regression counts. Dimensionality reduction? Then papers using PCA or LDA count.
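To see how wide that net is, here is a toy version of such a keyword filter (made-up abstracts and a shortened term list, not the actual Scopus query): classical statistics papers get swept into the “AI” count.

```python
# A shortened, illustrative subset of the search terms quoted above.
terms = ["machine learning", "random forest", "support vector machine",
         "dimensionality reduction", "gaussian processes", "naive bayes"]

# Hypothetical abstracts, not real papers.
abstracts = [
    "We apply PCA, a dimensionality reduction technique, to survey data.",
    "A Gaussian processes regression model of soil moisture.",
    "A qualitative study of newsroom practices.",
]

# Any abstract mentioning any term is counted as an "AI" paper.
ai_papers = [a for a in abstracts if any(t in a.lower() for t in terms)]
print(len(ai_papers))  # 2
```

Two of the three toy abstracts count as “AI,” even though one uses PCA and the other a GP regression, methods that predate the current hype by decades.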

This unfortunately feeds the trend of using AI as an umbrella term.

Source

Using predictive modeling as a hammer when the nail needs more thinking

The business problem is to place a lifeguard station on a beach to save lives (i.e., find the best location for the station). This is not really a predictive modeling problem. But predictive modeling is the hammer our data scientists have, and they have access to fancy libraries. There is also some historical data: swimmers rescued and drowned at other beaches. It all checks out. Resistance to “pip install prophet” is futile.

Transforming the problem into an objective function could have signaled that this is an optimization problem (a prescriptive modeling problem), but that step was skipped. In the picture shown, we may need a solution that:

– minimizes distance => Solved using “pip install fancy_library”
while also…
– minimizing time => The domain expert enters the room
– minimizing swimming => The labor union intervenes
– minimizing time to ice cream => The executive leadership steps in
– [not shown] minimizing walking on sand => The Department of Labor requirements
and hopefully not…
– maximizing time => A junior data scientist solves the problem

So the ideal solution requires more thinking about the problem. For example, maximizing the number of lives saved may actually require constraints on minimizing rescue time so that lifeguards don’t risk their own lives during a rescue.

The law of the instrument works a little too well in predictive modeling (and in machine learning more generally). Objective functions are often lost in translation when they should be an explicit step in the modeling process. Best practice tends to favor performance metrics, even though achieving the highest performance on the wrong objective function is clearly useless (and sometimes detrimental).

More focus on objective functions and less obsession with “better performance” may be what we need. This would underline the importance of problem formulation and domain knowledge, and undermine the “pip install prophet” solution.

A combination of Warren Powell’s writing and the accompanying xkcd comic inspired this post (comic courtesy of xkcd.com).

Scott Cunningham’s “Mixtape”

I have had a copy of Scott Cunningham’s “Mixtape” since it came out. I had skimmed it before, but last night, while putting together a few slides, I read an entire chapter and loved it. It has just enough detail to keep the reader from having to go to the cited work. It is also candid and fun. The print version is nicely sized and designed, so you can blend in and look cool while others around you are reading the latest fiction. The book has already made its impact, and this is probably a late call, but I had to share.

Source

Here is a little reflection and this one is seriously about AI

We seem to have a growing barrier to discussing AI: the use of AI as an umbrella term. My former students will say “Here we go again,” but if something means everything, it means nothing. If we take the time to define what we mean when we refer to AI, it will probably help the conversation.

Attached is a figure I’ve been using in my classes since 2017 to make this point (sorry, not the cat picture but the following timeline from AI to ML to deep learning). We might be better off referring to specific models and algorithms (or at least a family of models, such as LLMs) instead of AI.

Over the weekend, I attended a series of discussions on “AI” at the Academy of Management’s annual conference. I had the opportunity to hear the perspectives of great scholars from a variety of backgrounds. Once again, I was puzzled as to what was meant by “AI” in most of the discussions.

Source

What if parallel trends are not so parallel?

In Data Duets, Duygu Dagli and I offered our take on Ashesh Rambachan and Jonathan Roth’s recently published but long overdue paper, now titled “A more credible approach to parallel trends.”

Problem:
We want to estimate the causal effect of a promotion, say a coupon, on sales. The coupon was sent to newer customers. Did the coupon increase sales? Or would the new customers have bought more anyway?

Solution:
We’ll never know the answer to the last question, but we can answer the first one after making some assumptions. More on this in the post.
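For the record, the difference-in-differences estimate behind this kind of setup is simple arithmetic; a sketch with made-up numbers (all figures hypothetical):

```python
# Hypothetical average sales per customer, before and after the coupon.
treated_pre, treated_post = 50.0, 70.0   # newer customers (received the coupon)
control_pre, control_post = 40.0, 48.0   # comparison customers (no coupon)

# Difference-in-differences: the treated group's change minus the control
# group's change. This isolates the coupon's effect only if both groups
# would have trended in parallel absent the coupon -- the assumption
# that Rambachan and Roth's paper makes more credible to work with.
did = (treated_post - treated_pre) - (control_post - control_pre)
print(did)  # 12.0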

This is less elaborate than our earlier posts on synthetic controls and Lord’s paradox. We will probably keep it this way so that we can post more often.

Source

Public trust in generative models

The fact that 73% of consumers trust content created by generative AI models is intriguing. And it’s not just people playing around with a chatbot for trivial conversations:*

– 67% believe they could benefit from receiving medical advice from a generative AI model
– 66% would seek advice from a generative AI model on relationships (work, friendships, romantic relationships) or life/career plans
– 64% are open to buying new products or services recommended by a generative AI model
– 53% trust generative AI-assisted financial planning

To put this into perspective, only 62% of people trust their doctor the most for medical advice.**

* 2023 survey by Capgemini of 10,000 consumers
** 2023 survey by OnePoll for Bayer of over 2,000 adults

Source

tidylog

H/T Travis Gerke: I’ve just discovered the wonderful work of Benjamin Elbers. tidylog provides feedback for dplyr and tidyr operations in R. It is another simple and powerful idea: wrapper functions around the dplyr and tidyr functions with feedback added. This will help greatly with both teaching and “doing.”
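tidylog itself is R, but the underlying wrapper idea carries over anywhere. Here is a rough Python analogy (a hypothetical helper, not part of tidylog) that adds feedback to a filtering step:

```python
import functools

def with_feedback(func):
    """Wrap a row-transforming function and report how many rows it kept."""
    @functools.wraps(func)
    def wrapper(rows, *args, **kwargs):
        result = func(rows, *args, **kwargs)
        print(f"{func.__name__}: {len(rows)} rows in, {len(result)} rows out "
              f"({len(rows) - len(result)} removed)")
        return result
    return wrapper

@with_feedback
def keep_adults(rows):
    # Analogous to a dplyr filter() step.
    return [r for r in rows if r["age"] >= 18]

people = [{"age": 25}, {"age": 12}, {"age": 40}]
adults = keep_adults(people)
# keep_adults: 3 rows in, 2 rows out (1 removed)
```

The appeal is the same as tidylog’s: the pipeline code stays unchanged, and the feedback comes for free from the wrapper.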

Source