Lately I’ve been thinking about thinking; specifically, critical thinking skills and how important they are for everyone, especially testers. When testers can’t think critically, they aren’t able to diagnose software problems quickly or find good solutions to testing challenges. In light of this, I’ve decided to focus on critical thinking in my blog posts this year!
Each month, I’ll be writing about a different logical fallacy. A logical fallacy is a common reasoning error that most of us make when thinking about a problem. People often commit logical fallacies when arguing for a specific side in a debate, but fallacies also creep in when we are trying to get at the root cause of a problem. In each of my blog posts, I’ll describe the logical fallacy of the month, give a typical example, and then describe how it can show up in software testing. Then I’ll invite you to look for ways you have used that fallacy in your own testing. This month, we begin with the Causal Fallacy.
The Causal Fallacy happens when someone takes two separate events and concludes that one causes the other simply because they correlate. For example, let’s say that researchers on Amity Island are investigating why there have been so many shark attacks this summer. They take a look at other data they have for the island, and they discover that ice cream sales are up this summer as well. They come to the conclusion that all that ice cream consumption must be causing the increase in shark attacks.
Obviously this is ridiculous! A correlation between two sets of data does not mean that one causes the other. (For some really funny examples of data correlation, see Tyler Vigen’s website, Spurious Correlations.)
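To see how easily two unrelated measurements can line up, here is a minimal Python sketch. All of the numbers are made up for illustration, and the idea that warm weather drives both ice cream sales and beach visits (and therefore shark encounters) is my assumption, not something the Amity Island researchers measured.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly figures, invented for this example.
temperature = [18, 21, 24, 27, 30, 33]            # assumed hidden driver
ice_cream_sales = [120, 150, 185, 210, 250, 275]
shark_attacks = [0, 1, 1, 2, 3, 3]

r_sales_attacks = pearson(ice_cream_sales, shark_attacks)
r_temp_sales = pearson(temperature, ice_cream_sales)
r_temp_attacks = pearson(temperature, shark_attacks)

print(f"ice cream vs. shark attacks:   {r_sales_attacks:.2f}")
print(f"temperature vs. ice cream:     {r_temp_sales:.2f}")
print(f"temperature vs. shark attacks: {r_temp_attacks:.2f}")
```

All three coefficients come out strongly positive. The number alone cannot tell you which relationship, if any, is causal; that takes knowledge of how the system actually works.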
Let’s take a look at a software testing example. Imagine there is a social media site, Cute Kitten Photos, where users can create an account and share photos of their kittens. Every Friday, the data collection team at Cute Kitten Photos runs a series of reports that determine the weekly usage of the platform and the most liked photos for the week. The IT team has discovered that, also every Friday, CPU usage on the servers spikes to dangerous levels, so much so that some users get 500 errors when they try to go to the site.
It’s pretty clear that these data collection reports are causing the CPU spikes, right? That’s what the IT team thinks! But the data collection team is sure that they are not causing the problem. They point to data that shows that their reports cause very little load on the system.
Finally, a deeper investigation discovers that one user of the platform has been sharing TIFF images of their kittens. The software is not handling this image type well, which is causing several retries, and retries of the retries, putting the heavy load on the servers.
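It may seem surprising that one user could overload the servers, but “retries of the retries” multiply quickly. The sketch below is a hypothetical model, not how Cute Kitten Photos is actually built: it assumes several nested layers (say, a client, an API, and an image worker), each of which retries a failed call a fixed number of times.

```python
def attempts_for_failing_upload(retries_per_layer, layers):
    """Count processing attempts for one upload that never succeeds.

    Assumes every layer makes one initial try plus `retries_per_layer`
    retries, and that each try re-runs all the layers beneath it.
    """
    if layers == 0:
        return 1  # the innermost operation itself runs once per call
    tries_at_this_layer = 1 + retries_per_layer
    return tries_at_this_layer * attempts_for_failing_upload(
        retries_per_layer, layers - 1
    )


# With 3 retries per layer, one bad image costs:
for layers in (1, 2, 3):
    total = attempts_for_failing_upload(3, layers)
    print(f"{layers} retrying layer(s): {total} attempts")
```

Under these assumptions a single unsupported image triggers 4, 16, or 64 processing attempts depending on how many layers retry, which is how a handful of uploads can look like a platform-wide load problem.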
From this example, it’s easy to see that correlation does not mean causation, and that the first and most obvious cause of a problem is not always the right one.
In the next few weeks, when you encounter an odd bug that seems to have an obvious cause, ask yourself, “What else could this mean?”