New API and UI Test App!

I am writing a book on software testing, which I’m hoping to publish by the end of the year. In order to help my readers learn about manual testing, API testing, and API and UI automation, I decided I’d like to have an example application to accompany the book.

There are a lot of great practice sites out there for API testing and UI automation, but not a lot of sites that offer both, and some of those sites are a little too complex for someone learning the basics. So I decided to create my own!

The Contact List App is a simple web application that allows testers to create an account, log in, and add, edit, and delete a list of contacts. The web elements are easy to access, so it’s great for getting started with UI automation. The application includes an API for testers to practice GET, POST, PUT, PATCH, and DELETE operations.

This was my first time creating a complete application, and it was quite a learning experience! I’ll share some of the things I learned in a future post. I hope that you will find the application helpful as you hone your testing skills, and please email me with any feedback.

If you’d like to stay informed about my upcoming book, and perhaps even preview chapters, you can sign up for my mailing list here. In the meantime, enjoy the Contact List App!

HTTP Standards

Have you ever been testing an API and gotten involved in a dispute about what a response code should be?  Perhaps you witnessed a disagreement between two developers, or perhaps a developer was insisting the response code should be one thing and you thought it should be something different.  You might have gone to Stack Overflow to see what others think, and discovered people referencing something called an RFC.  

RFC stands for “Request for Comments”. The RFCs related to HTTP are produced by the IETF, the Internet Engineering Task Force, which operates under the umbrella of the ISOC: the Internet Society. The Internet Society consists of tens of thousands of members, a staff, and a board of trustees. The IETF produces the standards that most developers use when developing websites and web applications, and those standards are written up in the form of RFCs.

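The first HTTP RFC was published in 1996, and RFC 2616, which defines HTTP/1.1 and its response codes, was published in 1999. RFC 2616 has since been superseded, first by RFCs 7230 through 7235 in 2014 and then by RFC 9110 in 2022, so check the newer documents when the exact wording matters. These RFCs are surprisingly easy to read! Let’s take a look at some of the contents of RFC 2616 and see how we can apply them to real-life scenarios.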

Responses to POST requests:

In section 9.5, the RFC states “The action performed by the POST method might not result in a resource that can be identified by a URI. In this case, either 200 (OK) or 204 (No Content) is the appropriate response status, depending on whether or not the response includes an entity that describes the result.” So when you are doing a POST that doesn’t create a new resource, such as a request that checks for the existence of a resource, you could use a 200 or a 204 response code. But if your POST is returning a response body, you must use the 200 response code rather than the 204 response code, because the 204 response code cannot include a response body.

Responses to PUT requests:

In section 9.6, the RFC says “If an existing resource is modified, either the 200 (OK) or 204 (No Content) response codes SHOULD be sent to indicate successful completion of the request.” So as with the POST request, if you are modifying a resource and not returning anything in the body of the response, you could use a 200 or a 204 response code. But if you were returning something in the response body, you’d definitely want to use a 200.
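
To make the POST and PUT guidance concrete, here is a minimal Python sketch of how a handler might pick its success code. The function name success_status is my own invention for illustration; it is not from the RFC or from any framework. It assumes you prefer 204 whenever there is no body to return, which is one of the two options the RFC allows in that case.

```python
from http import HTTPStatus
from typing import Optional


def success_status(response_body: Optional[str]) -> HTTPStatus:
    """Pick a success code for a POST or PUT that did not create a resource.

    Hypothetical helper: a response with a body uses 200 (OK), because a
    204 (No Content) response cannot include a body.
    """
    if response_body:
        return HTTPStatus.OK
    # Nothing to send back, so this sketch chooses 204 (200 is also allowed).
    return HTTPStatus.NO_CONTENT
```

A tester can apply the same rule in reverse: if a 204 response arrives with a body, something is wrong.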

Responses where the user is not allowed access to a resource:

Section 10.4.4 says of a 403 “Forbidden” response code: “The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity.  If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.” So if your user does not have permission to view a resource, and you don’t mind if the user knows that the resource exists, then you can use a 403 response code. But if you don’t want the user to even know that the resource exists, because perhaps that might give a malicious user too much information, then you’d want to return a 404 so that all they’d know was that the resource was not found.
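
Here is a small Python sketch of the 403-versus-404 decision described above, written as the kind of lookup a tester might use to work out the expected response code. The function and parameter names are hypothetical, and the hide_existence flag stands in for whatever policy your team has agreed on.

```python
from http import HTTPStatus


def expected_status(resource_exists: bool, user_has_access: bool,
                    hide_existence: bool) -> HTTPStatus:
    """Return the status code a GET on a protected resource should produce."""
    if not resource_exists:
        return HTTPStatus.NOT_FOUND
    if user_has_access:
        return HTTPStatus.OK
    # The resource exists but the user may not see it. A 403 admits that it
    # exists; a 404 keeps that information from a possibly malicious user.
    return HTTPStatus.NOT_FOUND if hide_existence else HTTPStatus.FORBIDDEN
```

With hide_existence set to True, a missing resource and a forbidden one look identical to the caller, which is exactly the point.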

Hopefully these examples have shown how useful the HTTP RFC documents can be. So the next time you are testing an API and you’re wondering if you’re seeing the right response code, try going to the source!

Your Lone Wolf Days Are Over!

The very first test automation job I had was for a company that had no QA Engineer before I was hired. I’d never done automation before, but I convinced the company that with my rudimentary knowledge of Java, I’d be able to figure it out. This was long before there were awesome online resources like Test Automation University, so it took me a long time and a lot of trial and error before I had automated tests that would run and pass. My tests were long, flaky, hard to maintain and filled with implicit waits and duplicated code, but they were my tests, and I had really enjoyed solving the problem of test automation on my own.

Then the company hired a new software developer, and our manager thought it would be a great idea for him to learn about our software by looking at my tests. Without consulting me, the new developer completely rewrote my tests. I was really annoyed, until I looked at the tests and saw that he had reorganized them using page object models and methods so that no code was duplicated. It would be so much easier to maintain the tests now! That week I learned that it’s always best to work with others rather than going it alone, because others often have skills or information that we don’t.

Software developers know this already, because they are required to collaborate with others through feature planning and code reviews. But often testers aren’t required to do this. Test automation code is just as important as production code because of the value it provides, and for that reason, we shouldn’t be lone wolves, even if we enjoy it! Here are some lone wolves that you may encounter or recognize in yourself.

The Gollum: In The Lord of the Rings, Gollum loved his ring so much he called it his “precious”. For the test automation Gollum, her automated tests are her “precious”. She’s worked long and hard at those tests and is very proud of them! Unfortunately, because she was the only one who worked on the tests, they make sense only to her. Nobody wants to help maintain the tests because they are so hard to understand. As a result, she is the only one who can fix the tests when they break, and therefore she has become the bottleneck for automated testing on her team.

How to stop being The Gollum: Share your test automation with others and get feedback on how they could be more useful and maintainable. Implement the feedback and repeat.

The Frank Sinatra: Just as the real Frank Sinatra sang about doing it “My Way”, the test automation Frank wants to do it all his way. He is convinced that he alone knows the right way to do test automation. And unsurprisingly, the right way is with his favorite tool! Every other tool falls short in his estimation, so he’s going to stick with his tool even if the rest of the company is using something different. As a result, he can’t collaborate and share ideas with testers on other teams, and his test automation never improves.

How to stop being The Frank Sinatra: Try some new test automation tools! You don’t have to love every single one, just give yourself enough time to really understand their strong points. You may be surprised at how much test tools have improved in recent years!

The Magpie: This species of bird is attracted to shiny objects. The test automation Magpie is attracted to new test automation frameworks! If it’s new, she wants to try it out, and she loves writing test automation from scratch. In her opinion, any problems with her existing automation must be problems with the framework, and the problems become an excuse to scrap the project and start over again with a new framework. This means her team never has a complete test suite that they can run and rely upon. It also means that the team has to keep up with all the framework changes, which will make them reluctant to contribute to the automation.

How to stop being The Magpie: Get input from your team about what framework would be best for your application, then stick with it. When you run into trouble, ask your team for help, or ask other test automation engineers. You don’t have to keep using the framework forever, but use it at least long enough to see it working in CI/CD.

The Hermit: The Hermit simply loves working alone. He’s very busy working on his automation all by himself, so he doesn’t want to take the time to explain what he’s doing to the rest of his team or to any other testers. He hates asking for help and sees it as a sign of weakness; he’d rather figure everything out on his own. As a result, his automation never improves, and no one at the company ever benefits from his expertise.

How to stop being The Hermit: Find a developer or test automation engineer that you admire and trust, and ask them for their opinion on your automation work. Implement some of their suggestions. Volunteer to lead a workshop on something you’re really good at. Try to help one person a week who is stuck on a thorny automation problem.

Software is a collaborative adventure! Building software is complicated. There are many facets of software quality to consider, while at the same time delivering features that will put your company ahead of its competitors. That’s why we don’t have room for lone wolves anymore. Software testers need to work together to contribute to test automation projects that deliver fast, accurate results. And software testers need to work with software developers to make sure that quality goes both ways: we need quality production code and quality test code.

Reliability Engineering

Imagine that you are working on a team that is creating a new feature that allows users to submit and watch videos. The application uses a third party (we’ll call it Encodurz) that encodes the videos that are uploaded to your application. You work hard to test the feature, making sure that the UI is flawless, that the user journeys make sense, and that the videos play correctly, among many other things.

It’s time for your new feature to be released to the world, and you’re excited! But on the day of the release, Encodurz has an outage. No videos can be encoded, so none of the uploaded videos will display! Your users don’t know that it’s not your fault; all they see is that the new feature doesn’t work. They call and complain to customer service and they post complaints on Twitter. This is why reliability engineering is important!

Reliability engineering focuses on the ability of applications to be as available as possible. It aims to offer a good user experience, even in the following situations:
* The server goes down
* The database is unavailable
* An API that your application relies on is unavailable
* A third-party provider that your application depends on is unavailable

Do you know how your application will behave in those scenarios? If not, it sounds like it’s a good time to test those scenarios! There are two ways to test:

Bring the service down
You can bring a server down by unplugging it, but chances are your server is not nearby for you to do that. Instead, you can bring a server, a web service, or a database down by shutting it down using scripted commands. If you don’t have permission to do that, you can find someone in DevOps at your company who has the correct permissions.

Change your connection strings
It’s really easy to simulate an outage of a server, a database, an API or any other service your application depends on simply by changing the way your application connects to it. For example, if your app connects to your company database with a username and password, all you need to do is simply send in a bad password. Or you could change the URI needed for the connection so that it’s incorrect.
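
As a rough illustration of the connection-string approach, here is a Python sketch. Everything in it is hypothetical: the URIs are made up, and fake_connect is a stand-in for a real database driver so the example can run anywhere. The idea is simply to point the application at a wrong address and check that it degrades gracefully instead of crashing.

```python
GOOD_URI = "db://videos.test.example:5432"  # hypothetical working address
BAD_URI = "db://videos.test.example:9"      # deliberately wrong address


def fake_connect(uri: str) -> list:
    """Stand-in for a real driver: only GOOD_URI is reachable."""
    if uri != GOOD_URI:
        raise ConnectionError(f"cannot reach {uri}")
    return ["cat.mp4", "dog.mp4"]


def list_videos(uri: str) -> dict:
    """Application code under test: return a friendly error on an outage."""
    try:
        return {"videos": fake_connect(uri), "error": None}
    except ConnectionError:
        return {"videos": [], "error": "Videos are temporarily unavailable."}
```

Calling list_videos(BAD_URI) plays the part of the outage; the check is that the user gets a readable message rather than a stack trace.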

NOTE that it is a bad idea to test either of the above strategies in Production, at least until your application is VERY resilient. You will want to do this testing in your test environment.

Once you have discovered what happens when your application or one of its dependencies has an outage, it’s time to make your app as resilient as possible. Here are seven strategies for doing that:

  1. Use a “circuit-breaker”
    This method puts logic in the code that tries connecting to a resource a few times, and when it is unable to connect, switches over to a different resource. For example, if your application usually points to Server A, and you have a backup server called Server B, when the circuit-breaker is tripped the connection changes over to Server B.
  2. Use retries
    Sometimes a third-party app will fail temporarily for an unknown reason. You don’t want your request to the app to fail and never try again. So you can build in some retries: perhaps if the request fails, you wait 30 seconds and try again; if it fails again, you wait 60 seconds and try again; and so on. You don’t want to retry indefinitely, so set some sort of limit; if the request still hasn’t succeeded when the limit is reached, an error is returned.
  3. Use cached data
    It’s a great idea to have some kind of caching service that will be able to serve up data if a request fails for some reason. When the request fails, your application just grabs the slightly-stale cached data and returns that instead.
  4. Enter into read-only mode
    If your application detects that there’s a problem writing to a data source, you can configure it to go into a read-only mode so that your users can at least see their data. You should set a message to display when this is the case to explain to users why they can’t update their data at the moment.
  5. Provide messaging that something isn’t quite right
    It is so annoying to get a cryptic error like “Error: T-128556” when using an application. That’s not helpful at all! Instead, provide your users with as much detail as you can about what’s wrong. In the example at the beginning of this post, there could have been a message that read “Sorry, we are having an issue connecting to our video encoding software at the moment. Please try again in a few minutes.”
  6. Have a status page that explains what is going on
    If your application goes down completely or is very degraded, it’s a great idea to have a status page (hosted on a different server) that provides a way to communicate to your end users what’s happening. You could include a timestamp with your time zone, a list of the features that are affected, and a message about what’s going wrong. Then you can keep the status page updated at regular intervals until the problem is fixed.
  7. After the problem has ended, do a post-mortem to see what lessons you’ve learned
    If the outage was very brief or only affected a few users, you might not need to make the post-mortem public. But if it was a big outage, it’s a great idea to communicate to customers what happened, why it happened, and how you are going to prevent it in the future. See Slack’s message about their January 4th outage for a really good example of this.
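
The first three strategies above can be sketched in a few lines of Python. This is a simplified illustration, not production code: the function names are my own, the 30- and 60-second waits come from the retry example above, and the sleep parameter exists only so a test can skip the real waiting. A production circuit-breaker would usually also remember that Server A is down so that later requests skip it; that state is left out here.

```python
import time
from typing import Callable, Optional


def call_with_retries(
    request: Callable[[], str],
    max_attempts: int = 3,
    first_wait: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Strategy 2: retry a failing request, doubling the wait each time."""
    wait = first_wait
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up rather than retrying forever
            sleep(wait)
            wait *= 2
    raise ValueError("max_attempts must be at least 1")


def fetch_with_fallbacks(
    primary: Callable[[], str],
    backup: Callable[[], str],
    cached: Optional[str] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Strategies 1 and 3: try Server A, then Server B, then cached data."""
    for server in (primary, backup):
        try:
            return call_with_retries(server, sleep=sleep)
        except ConnectionError:
            continue  # this server is down; move on to the next option
    if cached is not None:
        return cached  # slightly stale, but better than an error
    raise ConnectionError("primary, backup, and cache are all unavailable")
```

Passing in the primary, the backup, and the cached value as plain arguments keeps the sketch easy to test: you can hand it a function that always raises ConnectionError to play the part of an outage.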

No application can run perfectly 100% of the time; servers are imperfect and our apps are almost always dependent upon outside forces. But it’s important to know exactly what will happen when services are down and to figure out the best ways to respond to those issues before they happen. As testers, we can encourage our team to participate in this process.

A New Year Challenge

It’s a new year, and a new year means it’s time for new challenges! My challenge to you is: this year, learn- and I mean REALLY learn- a programming language. Don’t just learn “enough to be dangerous”, or enough to write test automation; really understand the components of the language and how to use them.

I did this last year, and it was one of the most rewarding things I did! I’ve always liked Node, so I decided to learn Node.js. I took this awesome course; it took about nine months for me to complete it, and it was definitely worth the time it took.

Why should you take the time to really learn a programming language? Here are four reasons:

  1. You will understand what your test code is doing
    We’ve all done it- we want to get something done in our test automation, we do a quick search on Stack Overflow to see how other people have accomplished it, and we copy and paste their code. But when we do this, we don’t really understand what our test code is doing. This can result in flaky tests that we don’t know how to fix. When we fully understand our code, however, it’s easy to fix.
  2. You will write better code, which means more maintainable tests.
    When you understand the principles of writing code in your chosen language, you’ll write clean code. You’ll know how to set things up like the page object model, and refactor code to make sure you’re not repeating yourself. You’ll put values into a config file so that your code will be easily reusable by everyone on your team. This will result in tests that are easy to maintain.
  3. You will have sympathy for developers.
    Early in my testing career, I’m ashamed to say that I had a certain disdain for developers who couldn’t figure out how to validate date fields or phone numbers. When I tried coding these things myself, however, I discovered just how difficult they were. Coding is hard! Once you’ve tried writing an application and struggled with what seems on the surface to be a simple problem, you’ll respect the developers you work with so much more. This will enable you to work together more effectively.
  4. You will understand your product’s code better.
    Understanding how to code in your chosen language won’t just make you better at writing code, it will make you better at reading code. You’ll be able to look at your production code and see what it’s doing and why. This will give you ideas for how to test it. For example, if you looked at the code for a form field and you saw that one of the values was expecting an enum, you might decide that it would be a good idea to try entering a value that wasn’t included in the enum, or a value that wasn’t in all caps, to see if the program would respond appropriately.
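
Here is what that enum example might look like in Python. The ContactType enum and both functions are hypothetical, invented only to show how reading the code suggests test cases: once you see that the field accepts exactly three upper-case values, a lower-case value and an unlisted value are obvious things to try.

```python
from enum import Enum


class ContactType(Enum):
    HOME = "HOME"
    WORK = "WORK"
    MOBILE = "MOBILE"


def parse_contact_type(raw: str) -> ContactType:
    """Production-style code: accept only exact enum values."""
    try:
        return ContactType(raw)
    except ValueError:
        raise ValueError(f"'{raw}' is not a valid contact type") from None


def is_accepted(raw: str) -> bool:
    """Test helper: report whether the form field would accept this value."""
    try:
        parse_contact_type(raw)
        return True
    except ValueError:
        return False
```

Trying is_accepted("work") and is_accepted("FAX") covers the two cases mentioned above: a value that isn’t in all caps and a value that isn’t in the enum.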

So this year, I challenge you to really learn a programming language! And once you’ve completed whatever course you’ve chosen, try writing an application on your own. I’ll be using the skills I learned in the Node.js course I took to create a simple application that readers of my forthcoming book will be able to use for testing practice. Let’s all become better coders together!

Book Review: Leading Quality

I’ve been reading and reviewing one quality-related book a month for all of 2020, and it’s been a lot of fun! It was a great way to motivate myself to read books on software testing. My final book review of the year is “Leading Quality”, by Ronald Cummings-John and Owais Peer. The authors started their own business, which provides global testing services and test strategy for companies. In Leading Quality, they discuss the important lessons they’ve learned about how businesses think about and generate quality in their applications.

The book is designed for quality leaders, and it is made up of three sections. In Section I, Becoming a Leader of Quality, the authors discuss the three quality narratives that often define a company’s test strategy. None of these narratives are “wrong”; they are simply a way to define how your company is currently thinking. The Ownership Narrative focuses on who is responsible for quality. Ideally, the answer should be “Everyone”! The “How to Test” Narrative focuses on what tools and techniques are used in testing. It’s important to have a clear understanding of what your options are and what the maturity of your product is when using this narrative. The Value Narrative focuses on what the return on investment is for quality activities. This could include metrics like revenue potential, savings, and risk mitigation.

Section II, Mastering Your Strategic Decisions, focuses on making decisions about when and how to test. There’s a discussion of the three stages of product maturity: the validation stage, where a new product is in its infancy and the team is rushing to get it to market and prove the idea; the predictability stage, where the product has been proven and now the team is focused on cleaning up tech debt in order to be ready to scale; and the scaling stage, where the product has a high number of users and the team is focused on minimizing the negative impact of any bug.

Section III, Leading Your Team to Accelerate Growth, discusses the steps for creating quality at your company: set the vision, assess your starting point, and determine a strategy to achieve the vision. Expect that the strategy will change as you go. Determine what company metrics you’d like to align with to measure your success. Examples of metrics include: attention-based, which measures how much time a user spends on the platform; transaction-based, which measures sales; and productivity-based, which measures how quickly a user can be successful at an activity.

Leading Quality is a short book that is packed full of valuable suggestions for any team or company leader. I recommend it to anyone who would like to think about developing an effective quality strategy for their company.

Now that I’ve completed my twelve book reviews for the year, I thought it would be helpful to show the whole list here:
Agile Testing Condensed, by Lisa Crispin and Janet Gregory
The Unicorn Project, by Gene Kim
Enterprise Continuous Testing, by Wolfgang Platz with Cynthia Dunlop
Continuous Testing for DevOps Professionals, by Eran Kinsbruner and contributors
Perfect Software and Other Illusions About Testing, by Gerald Weinberg
Unit Testing Principles, Practices, and Patterns, by Vladimir Khorikov
Performance Testing- A Practical Guide and Approach, by Albert Witteveen
Explore It! Reduce Risk and Increase Confidence with Exploratory Testing, by Elizabeth Hendrickson
Accelerate: The Science of Lean Software and DevOps, by Nicole Forsgren, Jez Humble, and Gene Kim
The Way of the Web Tester, by Jonathan Rasmusson
Clean Code, by Robert C. Martin and contributors
and of course, Leading Quality, by Ronald Cummings-John and Owais Peer

And now for the BIG news! All year I have been working on a book of my own: a comprehensive look at software testing! I’m nearly finished with the first draft and will be refining it and publishing it in the next year. In order to have the time to work on this massive project, I will be scaling back my blog posts to one post a month. If you’d like to stay informed about my progress on my book, be sure to sign up for my email list using this form.

I wish you all a Happy 2021, and happy reading!

Exploring the Cypress Real World App

Last February, I checked out Cypress for the first time, and I was astounded! I wrote a post about just how easy it was to get started running Cypress. Cypress is so easy to set up because it runs within your native browser (or headless), so you don’t need to bother with browser drivers as you do with Selenium.

In June, the folks at Cypress announced that they had created a “Real World App”, which is an app designed to help people learn how to use Cypress automation for both API and UI testing. The instructions for getting started with the Real World App are fairly simple, but I thought I’d try to make them even simpler in this blog post.

Prerequisites: To use the Real World App, you’ll need to have Git, Node, and Yarn installed. You can check to see what you already have installed by going to your command line and typing:
git --version
node --version
yarn --version

If you get a version number in response to each of these commands, you are off to a great start! Note that you will need to have Node version 12 or above in order to run the Real World App, so if you have a version below that, you can follow the same instructions to update Node as you would to install Node.

To install Git, go to this page and follow the instructions for your operating system.
To install Node, go to this page and click on the LTS tab. Download either the Windows Installer or the MacOS Installer, open the installer, and follow the prompts.
To install Yarn, go to this page and follow the instructions for your operating system. For MacOS, I recommend using the Homebrew option. For Windows, I recommend downloading the msi installer.

Once you have followed all of the installation instructions, check one more time to make sure that the installations were successful by running these commands:
git --version
node --version
yarn --version

Cloning the App: You are now ready to copy the Cypress Real World App to your machine. First you’ll need to clone the project in Git:
* Go to this page, and at the top of the page you’ll see a button that says “Code”.
* Click on this button, and in the dropdown, click on the clipboard icon. This will copy the Git address of the Real World App code.
* Navigate in your command line to where you would like to install the app. (If you don’t know how to do this, check out this post.) Then type git clone, and paste the Git address that you copied in the previous step. Press Enter, and the app will be cloned to your directory.

Installing and Running the App: To install the application, first navigate to your new project location in the command line and then type yarn install. Yarn will look at all of the dependencies needed to run the app and install them for you. Next, type yarn dev in the command line. This script will start up the application on your machine. You’ll see a new tab open up in your browser with the app’s login page!

Get to Know the App: Take a moment to log into the app and see what you can do with it. When the application started up, it created some sample users that you can use to log in. To find a username, open a new tab in the command line window, navigate to your project location, and then type yarn list:dev:users. Choose one of the usernames, and log in with the password s3cret. Once you’re logged in, try doing things like creating a new bank account and sending money to another user.

Start Cypress: At this point, we’ve started the app, but we haven’t started Cypress! Let’s do that now. Type yarn cypress:open in the command line. You should see the Cypress window open, and the window will list several API and UI tests.

Run the Tests: In the upper right corner of the Cypress window, you’ll see a play button with the words “Run 16 integration specs”. Click on this button. A test runner window will open, and you’ll see the tests fly by! Marvel at the fact that you can run over 100 tests in just over 2 minutes. The first set of tests are the API tests, so you won’t see anything happening in the right side of the test runner window. When the UI tests begin, you’ll see everything that’s happening in the UI in the right side of the window. When the tests are done, try running just a single test script by clicking on it in the Cypress window.

Look at the Test Code: Now it’s time to look at the code that runs the tests. Open up the cypress-real-world-app folder using your favorite code editor (I’m a huge fan of Visual Studio Code, which works in both Windows and MacOS). To find the test scripts, click on the cypress folder in the file explorer of the code editor, then click on the tests folder. You should see two folders of test scripts: one for the API tests and one for the UI tests. These should match what you see in the Cypress window. Click on a UI test and on an API test to see what they look like. See how much of the code you can understand.

Make a Small Change to the Code: Let’s try making a change to the code. First, let’s run a test. In the Cypress window, click on the auth.spec.ts test. A new test window will open and run that test script. Now open up the auth.spec.ts file in your code editor. Make a simple change to the code, such as changing the last name in the userInfo (line 39). Save the change, and watch the tests automatically run again in the test window. In the test window, if you open up the test called “should allow a visitor to sign-up, login, and logout”, you can scroll over the different steps of the test and see snapshots of what was happening at each step, including adding the new last name from line 39.

Continue Your Exploration: Keep playing around with the code to see how much you can understand. When you are feeling more confident, try creating your own test script using the elements in the existing tests. Simply add a new file to the api or ui folder, and make sure it has .spec.ts at the end of the name. You should see your new test added to the Cypress window, and you can run it by clicking on it.

I hope that this exercise has shown you just how easy it is to get started with Cypress! For more information, check out Cypress’ excellent documentation.

Five Reasons You’re Not Ready For Continuous Deployment

Continuous Deployment (CD) is often seen as the “Holy Grail” of software development. A developer checks in code, and it is miraculously deployed and tested in the QA, Stage and Production environments, without needing any human intervention at all. This sounds great- and it is- but only if you are ready for it! Here are five reasons that your team might not be ready for Continuous Deployment.

Reason One: You Don’t Have Enough Test Coverage

Sometimes teams can be so excited to set up Continuous Deployment that they don’t pay attention to what they are testing. It’s great to have tests pass and deployments automatically go all the way to Production, but if you are missing tests for important functionality you’re going to need to remember to do manual testing with every deployment. Otherwise, something could break and the automated tests won’t pick up on the problem.

The remedy: Make sure you have all the tests you need before you set up CD.

Reason Two: Your Tests Are Flaky

If your tests aren’t reliable, you are going to get all sorts of false failures. With CD set up, this means that deployments will fail. If your developers are trying to deploy to the QA environment, but they can’t get their code there because of your flaky tests, they will be annoyed. And no one wants to have to stop what they are doing to investigate why your automation failed in Production.

The remedy: Make sure your tests are reliable. If there are flaky tests, pull them out of the test suite until you can fix them, and make sure that you are manually testing anything that’s no longer covered by automation.
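
As one possible way to pull a flaky test out of the suite without deleting it, here is a sketch using Python’s built-in unittest module. The test names are made up for illustration; the point is that the skipped test stays visible in the results, with a reason, so nobody forgets that it still needs fixing and manual coverage in the meantime.

```python
import unittest


class CheckoutTests(unittest.TestCase):
    def test_total_is_sum_of_items(self):
        self.assertEqual(sum([5, 10]), 15)

    # Quarantined: the skip reason shows up in the test report.
    @unittest.skip("flaky; covered by manual testing until fixed")
    def test_payment_popup_appears(self):
        self.fail("this test fails intermittently")


def run_suite() -> unittest.TestResult:
    """Run the suite in-process and return the collected results."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CheckoutTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

After run_suite(), the quarantined test appears in result.skipped rather than in result.failures, so the deployment is not blocked by it.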

Reason Three: Your Tests Take Too Long

UI tests can take a very long time. If you really want to set up CD, you’ll have to consider just how much time they are taking. If developer A checks in code which kicks off the tests, and then has to wait for an hour to find out if the tests have passed, and meanwhile developer B checks in code which now has to wait until the first deployment has completed, soon you will have a mess on your hands.

The remedy: Make sure your tests are fast. See which tests you can shorten through strategies like: switching to API tests for testing back-end logic, setting up your test data ahead of time, using API and other service calls to set up conditions for tests, running tests in parallel, and eliminating redundant tests.

Reason Four: You Don’t Understand the Deploy Process

Having CD set up won’t be helpful at all if you and your team don’t understand how it works. When things go wrong with a critical deployment, you don’t want to have to find someone in DevOps to help you diagnose the issue. That will waste the DevOps member’s time and cause stress for everyone on the team.

The remedy: Make sure everyone on the team understands the deploy process. Learn how to configure the deploys, what common errors mean, how to fix a hung deploy, and so on. Take turns monitoring the deploys and solving problems so you aren’t dependent on one team member who can then never take a vacation.

Reason Five: You Don’t Have Alerting Set Up

Just because your deployments are now automatic doesn’t mean you can sit back, relax, and never think of them again! Sometimes your tests will fail, sometimes your connections to dependencies won’t get configured properly, and sometimes a flaky thing will happen that will fail the deployment. You don’t want to find this out from your CEO, or someone in DevOps, or your customers!

The remedy: Make sure that you have alerting and paging set up for when deployments fail. You could have the person who made the code change get paged when there’s a failure, or you could have everyone on the team take turns being the one on duty for a week. Make sure everyone takes their paging seriously; if they’re on call during a week when they’re going to be on vacation, they should find someone to substitute for them.

Continuous Deployment, when done correctly, is a valuable tool that makes it easier for teams to quickly produce quality software. But be sure that you are completely ready for this step by taking an honest, objective look at these five reasons with your team.

Should You Hunt for That Bug?

Anyone who has ever done laundry has most likely faced the issue where they are folding their recently washed clothes and they discover that they are missing one sock. Sometimes the sock is missing because it never made it into the laundry basket. Sometimes the sock was left in the washing machine. There are even jokes about how the clothes dryer sends socks into another dimension!

What’s interesting is the reaction that people have to the missing sock. Some people shrug their shoulders and figure that the missing sock will turn up eventually. Others will spend most of their day looking for that missing sock: they’ll search through the laundry room, all of the undone laundry, their closet, under the bed, etc.

This is a great metaphor for what testers do when they encounter a strange and hard-to-reproduce bug! Some testers decide that since the bug is hard to reproduce, they should go on and test something else. Other testers decide to devote every moment to finding the cause of the odd behavior, to the exclusion of all other testing. Which is the correct behavior? The answer is: “It depends”. In this post, I’ll list three reasons why you might want to hunt for the elusive bug, and three reasons why you might want to put off the hunt for later.

Reasons to Hunt for That Bug:

When the bug happens, it’s a big deal
You might be testing a system where everything works just fine most of the time. But when the bug occurs, the system crashes, or data is lost, or a customer can’t submit an order. This is a serious problem. Even if the bug happens just 1% of the time, it’s important to figure out what’s going on, because you will lose users as a result of this issue.
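
To see why a 1% bug matters, it can help to do the arithmetic. This small sketch assumes that every attempt fails independently with the same probability, which real bugs may not do, and the function name is made up for illustration.

```python
def chance_of_hitting_bug(failure_rate, attempts):
    """Probability of seeing the bug at least once in the given number of attempts.

    Assumes each attempt fails independently with the same failure_rate.
    """
    return 1 - (1 - failure_rate) ** attempts
```

Under those assumptions, a user who performs the action 100 times has roughly a 63% chance of running into the bug at least once.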

The problem is intermittent, but happens frequently
Currently I’m plagued by a bug in some software I’m using. I’m logged into the software, but about 50% of the time the application forgets who I am. This is really annoying. If someone were to ask me about this software, do you think I’d recommend it? If I didn’t need this tool for my work, I’d have switched to something else a while ago. You don’t want your users to give you poor reviews or stop using your application!

The problem hints at an important performance issue
Perhaps your software works just great when you test it with one or two users in your test environment, but you’re seeing strange behavior in your Production environment. Don’t just shrug this off with a “Works for me” statement! This bug could indicate that there is a problem with your application when it’s under load. Perhaps there’s a memory leak that gets worse the longer the application is used. Or maybe the calls to the server are taking too long, and the problem is compounded as more and more calls are made, locking the database. Whatever the reason, it’s important to find out the root cause of the problem and fix it before your customers see it.

Reasons to Save the Hunt for Later:

You’ve been testing for weeks, and you only saw the problem once
We all know that software and hardware aren’t perfect. Strange glitches can happen, including service interruptions in hosting environments, power surges on equipment, and loss of electricity or Internet connection. The bug that you saw just once could have been caused by any one of these things. This is the kind of bug to watch for, but not to chase after. If it happens again, then you can start looking into it.

You’re pretty sure you know the cause already
If you think you saw a bug because a team member forgot to deploy one part of the application, or someone forgot to turn a toggle on, or your test user was deleted, and the bug went away as soon as those things were fixed, then there’s probably no reason to hunt any further.

You are in a time-sensitive situation, and you think the issue doesn’t pose a risk
It’s very aggravating to have a giant bug show up in Production and then hear from a tester, “Oh, I saw that bug, but we were in a hurry and didn’t have time to investigate.” But if the bug involves something obscure that you think a user will never do, and you need to meet an important deadline, it might be okay to wait until after the release to dig in further.

Here’s an example: a team is trying to release some new search functionality to their application. The team has been testing for weeks and it appears that things are working well. The day of the release, a tester discovers that if she enters a search term with 1000 characters, there is an intermittent bug. Rather than calling off the release and spending hours looking for the bug, the team should probably go forward with the release and investigate later, because it’s very unlikely that end users will search for 1000-character terms.

Do you have a “missing sock”?
The next time you encounter a strange bug, ask yourself: is this something that should be investigated now, or something that can wait until later? The size of the bug, the frequency of the bug, and the likelihood that it will be seen by users can help you decide.

Book Review: Clean Code

“Clean Code”, by Robert C. Martin and guest contributors, is frequently mentioned as one of the top books that software developers should read. I had been wanting to read the book for years, but I knew that it would require a big time commitment. Since I had a goal of reading one tech book a month this year, I decided it was time to take the plunge and finally read it!

I was fairly certain that I would learn a lot about good coding practices, and I definitely did. But what really fascinated me was the emphasis on testing! The author states that it’s not possible to refactor code well unless you have good tests in place. He cites this example from when he was coaching a team who decided that their test code didn’t need to be maintained to the same standards as the production code:

“From release to release the cost of maintaining my team’s test suite rose. Eventually it became the single biggest complaint among the developers. When managers asked why their estimates were getting so large, the developers blamed the tests. In the end, they were forced to discard the test suite entirely. But without a test suite they lost the ability to make sure that changes to their code base worked as expected. Without a test suite they could not ensure that changes to one part of their system did not break other parts of their system. So their defect rate began to rise. As the number of unintended defects rose, they started to fear making changes…Their production code began to rot.”

As someone who is passionate about software quality, I knew that automated tests were crucial. But I didn’t realize just how crucial they were for simply keeping production code clean and updated!

In one of the main sections of the book, the author gives an example of completely refactoring a parsing tool that he had created. One step at a time, he pulls out arguments from functions and puts them in new functions, creates interfaces to reduce code repetition, and so on. The reason he is able to make all these changes to clean up his code is that he has tests in place! He begins his refactoring by making sure that all the tests pass. Then he makes one change at a time, running the tests after each change. As long as the tests continue to pass, he knows he’s not breaking anything.
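
Here is a small sketch of that workflow in Python. It is not the example from the book: the flag parser and its tests are hypothetical, and are only meant to show a behavior-preserving change being checked by rerunning the same tests.

```python
import unittest


# Before: one function that does several small jobs at once.
def parse_flags_original(args):
    flags = {}
    for arg in args:
        if arg.startswith("-") and len(arg) > 1:
            flags[arg[1:]] = True
    return flags


# After one refactoring step: the small jobs get their own named functions.
def is_flag(arg):
    return arg.startswith("-") and len(arg) > 1


def flag_name(arg):
    return arg[1:]


def parse_flags_refactored(args):
    return {flag_name(arg): True for arg in args if is_flag(arg)}


def make_tests(parse):
    """Build the same test case for whichever version of the parser is given."""

    class ParseFlagsTests(unittest.TestCase):
        def test_reads_flags(self):
            self.assertEqual(parse(["-l", "-p"]), {"l": True, "p": True})

        def test_ignores_non_flags(self):
            self.assertEqual(parse(["file.txt", "-"]), {})

    return ParseFlagsTests


def all_tests_pass(parse):
    """Run the tests against one version of the parser."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(make_tests(parse))
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()
```

Calling `all_tests_pass(parse_flags_original)` before the change and `all_tests_pass(parse_flags_refactored)` after it is the safety net: if the second call fails, the last change broke something and can be undone right away.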

Software testers might be reading this review and saying to themselves, “This is great! I’ll tell all my developers to read this book.” While that is a great idea, software testers need to read the book too. Because now that we know that the quality of test code is just as important as the quality of production code, we need to write really clean code as well!

Some of the important clean coding principles outlined in this book are:
• the name of a variable, function, or class should tell you why it exists, what it does, and how it is used
• functions should do only one thing, and they should do it well
• functions should either do something or answer something, but not both
• functions should be read from the top down: if function A calls function B, function B should be listed below function A
• strive to use as few arguments in a function as possible; two should be the maximum
• avoid writing comments as much as possible; the code itself should clearly state what it does through good naming
• standardize on common spacing practices, so everyone’s code looks the same; this will make it easier to read
• don’t leave commented-out lines in the code; either delete them or fix them. The longer they are there, the more others will be afraid to touch them for fear of breaking something, and they will clutter up the code
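
A couple of these principles are easier to see in code. The sketch below uses a hypothetical contact list, not code from the book, to illustrate descriptive naming and the rule that a function should either do something or answer something.

```python
# Before: vague names, and the function both changes the data and
# answers a question (was the contact new?).
def check(d, e, n):
    if e in d:
        return False
    d[e] = n
    return True


# After: each method has a descriptive name and a single job.
class ContactList:
    def __init__(self):
        self._names_by_email = {}

    def has_contact(self, email):
        """Answer a question; change nothing."""
        return email in self._names_by_email

    def add_contact(self, email, name):
        """Do one thing: store the contact. Return nothing."""
        self._names_by_email[email] = name

    def contact_count(self):
        """Answer a question; change nothing."""
        return len(self._names_by_email)
```

With the second version, a test can read almost like a sentence: check `has_contact`, call `add_contact`, then check `has_contact` again.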

I learned all this and so much more in “Clean Code”. I recommend that everyone who writes any type of code, production or test, take the time to read this book and put its recommendations into practice.