Scott Cunningham’s “Mixtape”

I have had a copy of Scott Cunningham’s “Mixtape” since it came out. I’ve skimmed through it before, but last night, while putting together a few slides, I read an entire chapter and loved it. It has just enough detail to keep the reader from having to go to the cited work. It is also candid and fun. The print version is nicely sized and designed, so you can blend in and look cool while others around you are reading the latest fiction. The book has already made its impact, and this is probably a late call, but I had to share.

Source

Here is a little reflection and this one is seriously about AI

We seem to have a growing barrier to discussing AI: the use of “AI” as an umbrella term. My former students will say “Here we go again,” but if a term means everything, it means nothing. If we take the time to define what we mean when we refer to AI, it will probably help the conversation.

Attached is a figure I’ve been using in my classes since 2017 to make this point (sorry, not the cat picture but the following figure of a timeline from AI to ML to Deep Learning). We might be better off referring to specific models and algorithms (or at least a group of models, such as LLMs, instead of AI).

Over the weekend, I attended a series of discussions on “AI” at the Academy of Management’s annual conference. I had the opportunity to hear the perspectives of great scholars from a variety of backgrounds. Once again, I was puzzled as to what was meant by “AI” in most of the discussions.

Source

What if parallel trends are not so parallel?

In Data Duets, Duygu Dagli and I offered our take on Ashesh Rambachan and Jonathan Roth’s recently published but long overdue paper now titled “A more credible approach to parallel trends.”

Problem:
We want to estimate the causal effect of a promotion, say a coupon, on sales. The coupon was sent to newer customers. Did the coupon increase sales? Or would the new customers have bought more anyway?

Solution:
We’ll never know the answer to the last question, but we can answer the first one after making some assumptions. More on this in the post.
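To make the setup concrete, here is a minimal sketch in Python. The numbers are made up, and the one-parameter sensitivity band is only a loose nod to the paper’s idea of relaxing exact parallel trends, not its actual method.

```python
# Hypothetical average sales per customer, before and after the coupon.
# Newer customers received the coupon; older customers did not.
new_pre, new_post = 40.0, 55.0   # treated group (got the coupon)
old_pre, old_post = 60.0, 63.0   # comparison group (no coupon)

# Difference-in-differences: treated change minus comparison change.
# Valid only under the parallel-trends assumption.
did = (new_post - new_pre) - (old_post - old_pre)  # 15 - 3 = 12

# In the spirit of Rambachan & Roth: instead of assuming exactly parallel
# trends, allow the counterfactual trends to diverge by at most M and
# report a range of effects rather than a single number.
M = 5.0  # assumed maximum divergence; a judgment call, not data
effect_bounds = (did - M, did + M)
print(did, effect_bounds)
```

Under exact parallel trends (M = 0), the band collapses back to the usual diff-in-diff point estimate.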

This is less elaborate than our earlier posts on synthetic controls and Lord’s paradox. We will probably keep it this way so that we can post more often.

Source

Public trust in generative models

The fact that 73% of consumers trust content created by generative AI models is intriguing. And it’s not just people playing around with a chatbot for trivial conversations:*

– 67% believe they could benefit from receiving medical advice from a generative AI model
– 66% would seek advice from a generative AI model on relationships (work, friendships, romantic relationships) or life/career plans
– 64% are open to buying new products or services recommended by a generative AI model
– 53% trust generative AI-assisted financial planning

To put this into perspective, only 62% of people say their doctor is the source they trust most for medical advice.**

* 2023 survey by Capgemini of 10,000 consumers
** 2023 survey by OnePoll for Bayer of over 2,000 adults

Source

tidylog

H/T to Travis Gerke: I’ve just discovered the wonderful work of Benjamin Elbers. tidylog provides feedback for dplyr and tidyr operations in R. It is another simple but powerful idea: wrapper functions around the dplyr and tidyr verbs that report what each operation did. This will help greatly with both teaching and “doing.”

Source

Pandas AI

Pandas AI is an interesting and somewhat natural direction for embedding large language models into data science/analytics. This is less of a black box than automated exploratory data analysis tools, but still makes things easier.

We will likely see more ideas like Gabriele Venturi’s here. For any serious project, though, we’ll still need skilled humans who understand how the algorithm responds to queries and can check and confirm that it responds as expected.

Source

Experimental data analysis and the importance of conceptual models

In this new post, Duygu Dagli and I took a quick look at the analysis of experimental data. I really enjoyed writing this piece because Lord’s revelation is one of my favorites (pardon the pun).

Lord’s paradox is related to the better-known Simpson’s paradox, and it highlights the importance of constructing the right conceptual model before moving on to modeling the data. In the post, I speculated about one potential conceptual model and discussed its implications for modeling the data at hand.

Frankly, the example in the post had a lot to unpack. I picked out the part that relates to causal models and Lord’s paradox. I also ended up touching on an interesting discussion around the use of diff-in-diff vs. lagged regression models.

After running an experiment, how do you estimate the average treatment effect (ATE)? Which model do you choose to use? In this post, we use five different models with different assumptions to answer the same question. We find five different ATEs… Which one is the correct average treatment effect in this experiment? How do we decide?

In this post, Gorkem Turgut Ozer and I explore these questions (and more) by discussing the differences across models and potential implications. We ended up covering an interesting paradox I enjoyed learning about!

To me, a main takeaway is that business value from data is maximized when the right conceptual model meets the right method. For this to happen more often, data science and pricing leaders need the technical skills to ask the right questions. They also need to build trusting relationships with their teams so they can delegate and learn from them.
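As a rough sketch of why model choice matters here (with simulated numbers, not the data from the post): when the two groups start from different baselines and outcomes regress toward the mean, a change-score diff-in-diff and a lagged regression (ANCOVA) return different ATEs from the very same data. That is the Lord’s paradox setting.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
true_effect = 2.0

# Two groups with different baselines: treated starts low, control starts high.
pre_t = rng.normal(10, 2, n)
pre_c = rng.normal(20, 2, n)

# Post-period outcome tracks the baseline with slope 0.5 (regression to the
# mean), plus the treatment effect for the treated group.
post_t = 0.5 * pre_t + true_effect + rng.normal(0, 1, n)
post_c = 0.5 * pre_c + rng.normal(0, 1, n)

# Model 1: diff-in-diff on change scores.
did_ate = (post_t - pre_t).mean() - (post_c - pre_c).mean()

# Model 2: lagged regression (ANCOVA): post ~ 1 + pre + treated.
pre = np.concatenate([pre_t, pre_c])
post = np.concatenate([post_t, post_c])
treated = np.concatenate([np.ones(n), np.zeros(n)])
X = np.column_stack([np.ones(2 * n), pre, treated])
coef, *_ = np.linalg.lstsq(X, post, rcond=None)
ancova_ate = coef[2]

print(round(did_ate, 2), round(ancova_ate, 2))
```

In this simulation ANCOVA recovers the effect only because the data were generated from a lagged-outcome model; generate the data from a parallel-trends model instead and the ranking flips. Which estimate is “right” depends on the conceptual model, not the software.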

Source

Human learner vs. ChatGPT in taking tests designed for humans

Across the board, ChatGPT is passing exams by answering a mix of short-answer, essay, and multiple-choice questions:

– U.S. medical licensing exam (says the attached study)
– Wharton School MBA exam on Operations Management
– University of Minnesota Law School exams in Constitutional Law, Employee Benefits, Taxation, and Torts

If ChatGPT is able to pass these exams, it is not because ChatGPT is revolutionary (though it is surely impressive) but because they are just bad exams. These exams must lack components that require some form of creative thinking and imagination.

Source

ChatGPT excitement

What is demonstrated here is a successful translation from human language to code. OpenAI has another project for this purpose: Codex. Microsoft’s GitHub Copilot serves as a specialized version (both are descendants of GPT-3). DeepMind’s AlphaCode and the open-source PolyCoder also target English-to-code translation.

What is missing (and provided by Marco) is the articulation of a solution that stems from a conceptual model, which, in turn, is informed by causal links. For example: diversification reduces asset-specific risk.

Unless ChatGPT reasonably limits the weight of each individual stock based only on the objective stated at the beginning (minimize the portfolio’s standard deviation), without being explicitly instructed to, we’d better curb our enthusiasm here.
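For reference, the “minimize SD” objective has a clean closed form once a covariance matrix is assumed. Here is a minimal sketch with a hypothetical three-stock covariance matrix (the numbers are invented for illustration):

```python
import numpy as np

# Hypothetical covariance matrix of returns for three stocks.
cov = np.array([
    [0.10, 0.02, 0.01],
    [0.02, 0.08, 0.03],
    [0.01, 0.03, 0.12],
])

# Minimum-variance weights: w is proportional to inverse(cov) @ 1,
# normalized so the weights sum to 1 (short selling allowed).
ones = np.ones(cov.shape[0])
raw = np.linalg.solve(cov, ones)
weights = raw / raw.sum()

port_var = weights @ cov @ weights          # variance of the min-SD portfolio
equal_var = (ones / 3) @ cov @ (ones / 3)   # variance of an equal-weight mix
print(weights, port_var, equal_var)
```

Note that capping any individual weight, the concern raised above, turns this into a constrained quadratic program rather than this one-line closed form, which is exactly the kind of unstated structure we would want the model to respect on its own.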

Source

Just tried out ChatGPT…

Just tried out ChatGPT, the new large language model trained by OpenAI, and I was blown away by its capabilities! It can generate human-like text responses to any prompt, making it a powerful tool for conversation simulation, language translation, and more.

I also had a chance to play around with the code, and it’s surprisingly simple to use. Here’s a quick example of how to generate a response from ChatGPT using the Python API:

Not bad, but a bit too excited (blown away, really?). Also, the shameless self-promotion in my voice without any disclosure. We’re not off to a good start for a trusting relationship.