How Hollywood horror inspires us to focus on problems that yield measurable results

Data scientists are curious, creative, and resourceful. They come from disparate backgrounds, and they’re capable of finding wonder in the mundane. They’re constantly asking why and what if and finding pleasure in the process. Sometimes these questions yield interesting results, but interesting does not imply impactful. If data scientists want to prove the value of their work, they’ll need to occasionally highlight its impact. They must apply their curiosity to scenarios that will yield measurable results. They must focus on scenarios that scream for data science.

What do these scenarios look like?

A good way to start identifying them is by looking for attributes that are:

  • Inherent to the design of a system and assigned arbitrarily. Like the placement of recommendations on a webpage. If there are any recommendations at all, they’ll have to be presented in some order on the page. Sometimes that placement is decided arbitrarily, for example, by ordering the recommendations numerically by the random integers assigned to each one. Sometimes the recommendations are placed in alphabetical order for easy lookup by their name s — even if the end users are unfamiliar with the content and wouldn’t know what name to look up in the first place. In both of these cases, the recommendations need to be placed somewhere, but no one has ever given much thought to precisely where that place should be. In other cases, you’ll find that people have given A LOT of thought to the placement of objects, but they based their decisions on outdated or sub-optimal metrics, like when baseball managers relied solely on a player’s batting average to determine his place in the batting lineup instead of using his on-base percentage, too. Other examples of these placement attributes might include the physical location of your regional sales offices; the operating hours of maintenance shops used by your service fleet; the aisle, shelf, and bin in which goods are placed inside a distribution center; or the assignment of expert human resources to sales teams across the world. All of these attributes are inherent to the design of the parent system (after all, the expert employees need to be assigned somewhere), but they’re often placed based on what’s convenient, or what is known when the system was first designed.
  • Mutable. With enough time and money, you can change anything. Nothing is permanently permanent. Ideally, you should look for attributes that you can change for an acceptable cost. And if you can’t change it yourself, consider the influence you have with your stakeholder teams and the costs they will incur. It’s much cheaper to change the sequence of the assembly line on the shop floor than to pack up all the machinery and move it to a new location entirely.
  • Of sufficient volume. It’s great if you can reorder the recommendations on your webpage, but if no one is clicking any of them in the first place, then you’re yelling into an empty room. You should first focus on your funnel and develop your acquisition channel. Take on problems that have scale. If no problems of scale exist in your domain, you should consider that to be your first problem.

Have you ever noticed that the masked killers in the movies this time of year chase after the people who scream? These fictional killers ignore silent victims because the entertainment value is too low. They want a spectacle. They want a show. They want to optimize for the thrill. Learn from the horror movies and find the problems that scream. Look for things with mutable attributes of sufficient volume that were assigned arbitrarily as a part of a larger system. These scenarios scream for data science. They really put on a show.