Anyone who has ever done laundry has most likely folded a load of freshly washed clothes only to discover that one sock is missing. Sometimes the sock is missing because it never made it into the laundry basket. Sometimes the sock was left in the washing machine. There are even jokes about how the clothes dryer sends socks into another dimension!
What’s interesting is the reaction that people have to the missing sock. Some people shrug their shoulders and figure that the missing sock will turn up eventually. Others will spend most of their day looking for that missing sock: they’ll search through the laundry room, all of the undone laundry, their closet, under the bed, etc.
This is a great metaphor for what testers do when they encounter a strange and hard-to-reproduce bug! Some testers decide that since the bug is hard to reproduce, they should move on and test something else. Other testers decide to devote every moment to finding the cause of the odd behavior, to the exclusion of all other testing. Which is the correct behavior? The answer is: “It depends.” In this post, I’ll list three reasons why you might want to hunt for the elusive bug, and three reasons why you might want to put off the hunt until later.
Reasons to Hunt for That Bug:
When the bug happens, it’s a big deal
You might be testing a system where everything works just fine most of the time. But when the bug occurs, the system crashes, or data is lost, or a customer can’t submit an order. This is a serious problem. Even if the bug happens just 1% of the time, it’s important to figure out what’s going on, because you will lose users as a result of this issue.
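To see why a 1% failure rate adds up, here is a rough back-of-the-envelope sketch. It assumes that each attempt fails independently with the same probability, which real intermittent bugs often don’t do, and the function name is made up for illustration.

```python
def chance_of_hitting_bug(failure_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` tries fails.

    Assumption: every attempt fails independently with the same rate.
    """
    return 1 - (1 - failure_rate) ** attempts


# How likely is a user to see a 1% bug at least once?
for attempts in (1, 10, 50, 100):
    chance = chance_of_hitting_bug(0.01, attempts)
    print(f"{attempts:>3} attempts: {chance:.0%}")
```

Under that assumption, a user who performs the action 50 times has roughly a 39% chance of hitting the bug at least once, and at 100 times the chance is roughly 63%.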
The problem is intermittent, but happens frequently
Currently I’m plagued by a bug in some software I’m using. I’m logged into the software, but about 50% of the time the application forgets who I am. This is really annoying. If someone were to ask me about this software, do you think I’d recommend it? If I didn’t need this tool for my work, I’d have switched to something else a while ago. You don’t want your users to give you poor reviews or stop using your application!
The problem hints at an important performance issue
Perhaps your software works just great when you test it with one or two users in your test environment, but you’re seeing strange behavior in your Production environment. Don’t just shrug this off with a “Works for me” statement! This bug could indicate that your application has a problem under load. Perhaps there’s a memory leak that gets worse the longer the application is used. Or maybe calls to the server are taking too long, and the problem compounds as more and more calls pile up and lock the database. Whatever the reason, it’s important to find the root cause and fix it before more of your customers run into it.
Reasons to Save the Hunt for Later:
You’ve been testing for weeks, and you only saw the problem once
We all know that software and hardware aren’t perfect. Strange glitches can happen, including service interruptions in hosting environments, power surges on equipment, and loss of electricity or Internet connection. The bug that you saw just once could have been caused by any one of these things. This is the kind of bug to watch for, but not to chase after. If it happens again, then you can start looking into it.
You’re pretty sure you know the cause already
If you think you saw the bug because a team member forgot to deploy one part of the application, or someone forgot to turn a toggle on, or your test user was deleted, and the bug went away as soon as that was fixed, then there’s probably no reason to hunt any further.
You are in a time-sensitive situation, and you think the issue doesn’t pose a risk
It’s very aggravating to have a giant bug show up in Production and then hear a tester say, “Oh, I saw that bug, but we were in a hurry and didn’t have time to investigate.” But if the bug is triggered by something obscure that you think a user will never do, and you need to meet an important deadline, it might be okay to wait until after the release to dig in further.
Here’s an example: a team is trying to release some new search functionality in their application. The team has been testing for weeks, and things appear to be working well. On the day of the release, a tester discovers an intermittent bug that appears when she enters a 1000-character search term. Rather than calling off the release and spending hours hunting for the bug, the team should probably go ahead with the release and investigate later, because it’s very unlikely that end users will search on 1000-character terms.
Do you have a “missing sock”?
The next time you encounter a strange bug, ask yourself: is this something that should be investigated now, or something that can wait until later? The severity of the bug, its frequency, and the likelihood that users will see it can help you decide.
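If it helps to make that question concrete, here is one possible way to write it down as a checklist in code. This is only a sketch: the field names, the 1-to-5 severity scale, and every threshold are assumptions made up for illustration, and it leaves out judgment calls such as whether the bug hints at a load problem.

```python
from dataclasses import dataclass


@dataclass
class BugSighting:
    severity: int           # hypothetical scale: 1 (cosmetic) to 5 (crash, data loss)
    frequency: float        # fraction of attempts where the bug appeared, 0.0 to 1.0
    user_likelihood: float  # rough guess that real users trigger it, 0.0 to 1.0


def should_hunt_now(bug: BugSighting) -> bool:
    """Return True to investigate now, False to save the hunt for later.

    All thresholds below are illustrative guesses, not recommendations.
    """
    if bug.user_likelihood < 0.01:
        return False  # obscure path that users will almost never take
    if bug.severity >= 4:
        return True   # a big deal when it happens, even if it is rare
    if bug.frequency >= 0.25:
        return True   # intermittent, but frequent enough to annoy users
    return False      # keep watching for it, but don't chase it yet


# The three situations described in this post, with made-up numbers
lost_order = BugSighting(severity=5, frequency=0.01, user_likelihood=0.9)
forgotten_login = BugSighting(severity=3, frequency=0.5, user_likelihood=0.9)
long_search_term = BugSighting(severity=3, frequency=0.3, user_likelihood=0.001)

for name, bug in [("lost order", lost_order),
                  ("forgotten login", forgotten_login),
                  ("1000-character search", long_search_term)]:
    print(name, "-> hunt now" if should_hunt_now(bug) else "-> later")
```

With these made-up numbers, the lost order and the forgotten login are worth hunting now, while the 1000-character search can wait until after the release. Your own team would need to choose its own scale and cutoffs.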