Moving up the ladder of data centricity: From descriptive statistics to modeling

Most organizations use data in one way or another. Some still collect it using mechanical tally counters, but that is data too. Simple counts of visits to different exhibits in a museum can provide the director and curator with valuable insights (or not, depending on how strong the thumb muscles of staff are). Descriptive analysis is probably the most common approach to data analysis across organizations. For example, year-over-year (YoY) comparisons, a favorite across industries, are essentially descriptive. A YoY comparison of revenue is no more informative than the data collected by tally counters in a museum: “More visitors entered Exhibit A this year than last year.” These descriptive findings are like figuring out that today is hotter than yesterday. They can be useful, albeit to a limited extent, and they can be misleading when the difference is falsely attributed to a factor, sometimes just because that explanation is readily available even though the factor was never measured (e.g., the YoY revenue is down because the sales team is underperforming). Measuring relevant factors and including them in a statistical model is another step up on the ladder of data centricity, no matter how simple the model is.
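To make the contrast concrete, here is a minimal sketch in Python of a YoY comparison next to a simple linear model that includes a measured factor. Everything in it is an assumption made for illustration: the monthly visit counts, the "free admission days" factor, and the function names are invented, and the data is deliberately noise-free so the point is easy to see.

```python
import numpy as np

# Hypothetical data (invented for illustration): the number of
# free-admission days in each month, six months per year.
free_days = np.array([1, 2, 1, 3, 2, 1, 4, 3, 5, 4, 3, 5], dtype=float)
year = np.array([0] * 6 + [1] * 6, dtype=float)  # 0 = last year, 1 = this year

# By construction, visits depend only on free-admission days, not on the year.
visits = 200.0 + 50.0 * free_days


def yoy_difference(values, year):
    """Descriptive step: mean of this year minus mean of last year."""
    return values[year == 1].mean() - values[year == 0].mean()


def fit_linear_model(visits, year, free_days):
    """Modeling step: least-squares fit of visits on year and free days."""
    X = np.column_stack([np.ones_like(year), year, free_days])
    coef, *_ = np.linalg.lstsq(X, visits, rcond=None)
    return coef  # intercept, year effect, effect per free-admission day


descriptive_gap = yoy_difference(visits, year)
intercept, year_effect, free_day_effect = fit_linear_model(visits, year, free_days)

print(f"YoY gap in mean monthly visits: {descriptive_gap:.1f}")
print(f"Year effect after including free days: {year_effect:.1f}")
print(f"Extra visits per free-admission day: {free_day_effect:.1f}")
```

In this toy setup, the YoY gap is large, yet the year effect shrinks to roughly zero once the measured factor enters the model. Real data is noisy and rarely this clean, so treat this as a sketch of the idea rather than a recipe.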

Most of the finance industry competes on models and algorithms today. Even so, Renaissance Technologies, a pioneer of quantitative trading in hedge funds, manages $110 billion in assets with the help of a linear regression model:

“Nick Patterson, who spent a decade as a researcher at Renaissance, says, ‘One tool that Renaissance uses is linear regression, which a high school student could understand’ (OK, a particularly smart high school student; it’s about finding the relationship between two variables). He adds: ‘It’s simple, but effective if you know how to avoid mistakes just waiting to be made.’”
Source: Computer Models Won’t Beat the Stock Market Any Time Soon

Simplicity helps interpretation and is preferable if the desired outcome (e.g., high prediction accuracy) is still achieved. In most cases, however, value creation using data involves causal reasoning and inference, which sit on higher steps of the ladder. This is one of a series of posts on “data centricity,” a framework I have been developing. Comments and feedback are welcome in any form.

The rise of unobtrusive learning in the age of big data

One advantage of living in the age of big data is a diminishing need to ask customers explicitly for feedback. A variety of methods for unobtrusive learning from customers have emerged thanks to digitalization (vs. digitization). For example, customers now write reviews every day about products and services without being asked to do so. The behavior of customers can be captured by tracking their website visits. Sensors are now so cheap that a retailer can put them all over the floor in its stores and track the physical movements of customers. Compared to the tools and technologies used in an Amazon Go store, such a data collection initiative can be considered a small step today. Motion sensors for store shelves, neural network-powered cameras, and wireless beacons can easily be added as complements.

From a managerial perspective, the phenomenon is more than a shift from a push mindset to a pull mindset. Leveraging it fully requires careful planning and execution. This is probably why “data centric” companies are capturing more value from unobtrusive methods while most retailers still struggle to learn from the reviews on their own product pages. Capturing most of the value also requires a systematic effort rather than ad hoc attempts, ideally starting at product development and continuing through the full product life cycle. For example, when it launched in 2004, Yelp required users to ask friends for recommendations. Users could not write reviews without being explicitly asked. Yelp switched to the current model four months after the launch, based on data on how early users behaved on the site.

This is a short intro to a series of posts on “data centricity,” a concept I have been developing. Comments and feedback are welcome in any form.