The One Question to Ask to Improve Your Testing Skills

We’ve all been in this situation: we’ve tested something, we think it’s working great, and after it goes to Production a customer finds something obvious that we missed.  We can’t find all the bugs 100% of the time, but we can increase the number of bugs we find with this one simple question:

“What haven’t I tested yet?”

I have asked this question of myself many times; I make a habit of asking it before I move any feature to Done.  It almost always results in my finding a bug.  The conversation with myself usually goes like this:

Good Tester Me:  “What haven’t we tested yet?”  “Well, we haven’t tested with an Admin user.”
Lazy Tester Me: “Why should that make a difference?  This feature doesn’t have anything to do with user privileges.”
Good Tester Me: “That may be the case, but we should really test it anyway, to be thorough.”
Lazy Tester Me: “But I’ve been testing this feature ALL DAY!  I want to move on to something else.”
Good Tester Me: “You know that we always find the bugs in the last things we think of to test.  TEST IT!”

And I’m always happy I did.  Even if I don’t find a bug, I have the peace of mind that I tested everything I could think of, and I’ve gained valuable product knowledge that I can share with others.

When I ask myself this question, here are thirteen follow-up questions I ask:

Did I test with more than one user? 
It seems so obvious, but we are often so embroiled in testing a complicated feature that we don’t think to test it with more than our favorite test user.  Even something as simple as the first letter of a last name could be enough to trigger different behavior in a feature.

Did I test with different types of users?
Users often come with different privileges.  When I was first starting out in testing, I would often test with an admin user, because it was the easiest thing to do.  Finding out I’d missed a bug where a regular user couldn’t access a feature they should have been able to use taught me a valuable lesson!

Did I test with more than one account/company? 
For those of us testing B2B applications, we often have customers from different accounts or companies. I missed a bug once where the company ID started with a 0, and the new feature hadn’t been coded to handle that.
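As a rough illustration of how a leading zero can get lost, here is a minimal sketch.  The ID value and the normalize_company_id helper are hypothetical, and parsing the ID as a number is only one plausible cause of a bug like this.

```python
# Hypothetical company ID that begins with a zero.
company_id = "04127"

# Bug pattern: treating the ID as a number silently drops the leading zero.
as_number = int(company_id)
round_tripped = str(as_number)  # "4127", which no longer matches the real ID


def normalize_company_id(raw: str) -> str:
    """Keep the ID as text so that leading zeros survive (assumed fix)."""
    return raw.strip()


ids_match_after_parsing = round_tripped == company_id
ids_match_as_text = normalize_company_id(" 04127 ") == company_id
```

A test account whose ID starts with 0 is cheap to keep around and exercises exactly this path.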

Did I test this on mobile?
Anyone who has ever tested an application on mobile or tablet knows that it can behave very differently from what is seen on a laptop.  You don’t want your users to be unable to click a “Submit” button because it’s off-screen and can’t be accessed.

Did I test this on more than one browser? 

Browsers have more parity in behavior than they did a few years ago, but even so, you will occasionally be surprised by a link that works in some browsers but not others.

Did I try resizing the browser?
I often forget to do this.  One thing I’ve discovered when resizing is that the scroll bar can disappear, making it impossible for users to scroll through records.

Did I test with the Back button? 

This seems so simple, but a lot of bugs can crop up here!  Also be sure to test the Cancel button on a form.

Is this feature on any other pages, and have we tested on those pages? 
This one recently tripped up my team.  We forgot to test our feature on a new page that’s currently in beta.  Be sure to mentally run through all the pages in your application and ask yourself if your feature will be on those pages.  If you have a really large application, you may want to ask testers from other teams in your organization.

Did I test to make sure that this feature works with other features? 
Always think about combining your features.  Will your search feature work with your notification feature?  Will your edit feature work with your sorting feature? And so on.

Have I run negative tests on this feature? 
This is one that’s easy to forget when you are testing a complicated feature.  You may be so focused on getting your application configured correctly for testing that you don’t think about what happens when bad data is passed in.  For UI tests, be sure to test the limits of every text field, and verify that the user gets appropriate error messages.  For API tests, be sure to pass in invalid data in the request body, and try using bad query parameters.  Verify that you get 400-level responses for invalid requests rather than a generic 500 response.
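The API half of that advice could be sketched as a table of invalid payloads, each of which must produce a 400-level status.  The create_contact function below is a hypothetical in-memory stand-in for a real endpoint, and its validation rules and status codes are assumptions made only so the example runs by itself.

```python
from typing import Any, List, Tuple


def create_contact(payload: Any) -> Tuple[int, dict]:
    """Hypothetical stand-in for a POST /contacts call; returns (status, body)."""
    if not isinstance(payload, dict):
        return 400, {"error": "body must be a JSON object"}
    email = payload.get("email")
    if not isinstance(email, str) or "@" not in email:
        return 422, {"error": "email is missing or invalid"}
    return 201, {"id": 1, "email": email}


# Each entry is a request body that should be rejected.
INVALID_PAYLOADS = [None, [], {}, {"email": 12345}, {"email": "not-an-email"}]


def check_negative_cases() -> List[int]:
    """Send every invalid payload and insist on a 4xx status with an error body."""
    statuses = []
    for payload in INVALID_PAYLOADS:
        status, body = create_contact(payload)
        assert 400 <= status < 500, f"{payload!r} returned {status}"
        assert "error" in body
        statuses.append(status)
    return statuses
```

Against a real API, the same loop would send each payload over HTTP; the assertion on the status range is the part that carries over.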

Have I run security tests on this feature?
It’s a sad fact of life that not all of our end users will be legitimate users of our application.  There will be bad actors looking for security flaws to exploit.  This is especially true for financial applications and ones with a lot of personally identifiable information (PII).  Protect your customers by running security scans on your features.

Have I checked the back-end database to make sure that data is being saved as I expected?

When you fill out and submit a form in your application, a success message is not necessarily an indication that the data’s been saved.  There could be a bug in your software that causes an error when writing to the database.  Even if the data has been saved, it could have been saved inaccurately, or there may be an error when retrieving the data.  For example, a phone number might be saved with parentheses and dashes, but when the data is retrieved the front-end doesn’t know how to parse those symbols, so the phone number isn’t displayed.  Always check your back-end data for accuracy.
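A back-end check like this can be sketched with Python’s built-in sqlite3 module.  The phones table, the save_phone function, and the sample number are all hypothetical; in a real project the query would run against your application’s own database.

```python
import sqlite3


def save_phone(conn, user_id, phone):
    """Hypothetical save step, standing in for submitting the application's form."""
    conn.execute("INSERT INTO phones (user_id, phone) VALUES (?, ?)", (user_id, phone))
    conn.commit()
    return "Saved successfully"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phones (user_id INTEGER, phone TEXT)")

message = save_phone(conn, 1, "(555) 123-4567")

# Do not stop at the success message: read the row back and inspect it.
rows = conn.execute("SELECT phone FROM phones WHERE user_id = ?", (1,)).fetchall()
stored_phone = rows[0][0]

# If the front end expects digits only, this shows what it would have to parse.
digits_only = "".join(ch for ch in stored_phone if ch.isdigit())
```

Comparing stored_phone with what the front end displays is what reveals a formatting mismatch like the one described above.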

How is the end user going to use this feature?  Have I run through that scenario?

It’s so easy to get wrapped up in our day-to-day tasks of testing, writing automation, and working with our team that we forget about the end user of our application.  You should ALWAYS understand how your user will be using your feature.  Think about what journey they will take.  For example, in an e-commerce app, if you’re testing that you can pay with PayPal, make sure you also run through a complete journey where you add a product to your cart, go to the checkout page, and then pay with PayPal.

Missing a bug that then makes it to Production can be humbling!  But it happens to everyone.  The good news is that every time this happens, we learn a new question to ask ourselves before we stop testing, making it more likely that we’ll catch that bug next time.

What questions do you ask yourself before you call a feature Done?  Let me know in the comments section!  

Five Strategies for Managing Test Automation Data

Has this ever happened to you?  You arrive at work in the morning to find that many of your nightly automated tests have failed.  Upon investigation, you discover that your test user has been edited or deleted.  Your automation didn’t find a bug, and your test isn’t flaky; it simply didn’t work because the data you were expecting wasn’t there.  In this week’s post, I’ll take a look at five different strategies for managing test data, and when you might use each.

Strategy One: Using data that is already present in the system

This is the easiest strategy, because there’s nothing to do for setup, but it is also the riskiest.  Even if you label your user with “DO NOT REMOVE”, there’s always a chance that some absent-minded person will delete it.  
However, this strategy can work well if you are just making simple requests.  For example, if you are testing getting a list of contacts, you can assert that contacts were returned.  For the purposes of your test, it doesn’t matter what contacts were returned; you just need to know that some contacts were returned.  
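A test in this style asserts only on the shape of the response.  In this minimal sketch, get_contacts is a hypothetical stand-in for the real request, and the records it returns are made up.

```python
def get_contacts():
    """Hypothetical stand-in for a GET /contacts call against shared data."""
    return [{"id": 17, "name": "Pat"}, {"id": 42, "name": "Sam"}]


def check_contacts_are_returned():
    contacts = get_contacts()
    # Assert that something came back, not which specific records came back,
    # so the test survives other people editing the shared data.
    assert len(contacts) > 0
    assert all("id" in contact for contact in contacts)
    return True
```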

Strategy Two: Updating or creating data as a setup step

Most automated test platforms offer the ability to create a setup step that either runs before each test or before a suite of tests.  This strategy works well if it’s easy to create or update the record you want to use.  I have a suite of automated API tests that test adding and updating a user’s contact information.  Before the tests begin, I run requests that delete the user’s email addresses and phone numbers.  
The downside to this strategy is that sometimes my requests to delete the user’s contact information fail.  When this happens, my tests fail.  Also, updating data as a setup step adds more time to the test suite, which is something to consider when you need fast results.
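Here is one way a setup step like this might look using Python’s unittest module.  FakeContactApi is a hypothetical in-memory stand-in for the real API, and the status codes are assumptions.  Checking the result of the setup request makes it obvious when a failure comes from the setup rather than from the feature.

```python
import unittest


class FakeContactApi:
    """Hypothetical in-memory stand-in for the real contact API."""

    def __init__(self):
        self.emails = {"testuser": ["old@example.com"]}

    def delete_emails(self, user):
        self.emails[user] = []
        return 204

    def add_email(self, user, email):
        self.emails[user].append(email)
        return 201

    def get_emails(self, user):
        return list(self.emails[user])


class AddEmailTests(unittest.TestCase):
    def setUp(self):
        # Setup step: runs before every test and puts the user in a known state.
        self.api = FakeContactApi()
        status = self.api.delete_emails("testuser")
        self.assertEqual(status, 204, "setup failed: could not clear emails")

    def test_add_email(self):
        self.assertEqual(self.api.add_email("testuser", "new@example.com"), 201)
        self.assertEqual(self.api.get_emails("testuser"), ["new@example.com"])


def run_suite():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddEmailTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```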

Strategy Three: Using test steps to create and delete data

This is a good strategy when you are testing CRUD (Create, Read, Update, Delete) operations, because you can use the actual tests to create and delete your test data.  If I were testing an API for a contact list, for example, I would have my first test create the contact and assert that the contact was created.  Then I would update the contact and assert that the contact was updated.  Finally, I would delete the contact and assert that the contact was deleted.  There is no lasting impact to the database, because I am both creating and destroying the data.  
However, if one of the tests fails, it’s likely the others will as well.  If for some reason the application was unable to create the contact, the second test would fail, because there would be nothing to update.  And the third test would fail because the record would not exist to be deleted.  So even though there was only one bug, you’d have three test failures.
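The create, update, delete sequence might be sketched as follows.  ContactStore is a hypothetical in-memory stand-in for the contact API, and the contact names are made up.

```python
class ContactStore:
    """Hypothetical in-memory stand-in for a contact list API."""

    def __init__(self):
        self._contacts = {}
        self._next_id = 1

    def create(self, name):
        contact_id = self._next_id
        self._next_id += 1
        self._contacts[contact_id] = {"name": name}
        return contact_id

    def get(self, contact_id):
        return self._contacts.get(contact_id)

    def update(self, contact_id, name):
        self._contacts[contact_id]["name"] = name

    def delete(self, contact_id):
        del self._contacts[contact_id]

    def count(self):
        return len(self._contacts)


def run_crud_lifecycle(store):
    """Create, update, and delete one contact, asserting after each step."""
    before = store.count()
    contact_id = store.create("Pat Example")
    assert store.get(contact_id) == {"name": "Pat Example"}
    store.update(contact_id, "Pat Q. Example")
    assert store.get(contact_id) == {"name": "Pat Q. Example"}
    store.delete(contact_id)
    assert store.get(contact_id) is None
    # The data set ends up exactly as large as it started.
    return store.count() == before
```

Because the later steps depend on the contact created in the first one, a failure in create surfaces as a KeyError in update, which is the cascading failure described above.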

Strategy Four: Taking a snapshot of the database and restoring it after the tests

This strategy is helpful when your tests are doing a lot of data manipulation.  You take a snapshot of the database as a setup step for the test suite.  Then you can manipulate all the data you want, and as a cleanup step, you restore the database to its original state.  The advantage to this method is that you don’t need to write a lot of steps to undo all the changes to your data.  
But this method relies on having the right data there to begin with.  For instance, if you are planning to do a lot of processing on John Smith’s records, and someone happened to delete John Smith before you ran your tests, taking a snapshot of the database won’t help; John Smith simply won’t be there to test on.  It’s also possible that taking a snapshot will be time-consuming, depending on the size of your database.
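To show the snapshot and restore steps in miniature, this sketch uses SQLite’s backup feature through Python’s sqlite3 module.  It is only an illustration: the records table and its contents are hypothetical, and a real QA database would use its own engine’s snapshot or backup tooling.

```python
import sqlite3

# A working database with the data the tests rely on (hypothetical contents).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE records (owner TEXT, amount INTEGER)")
db.execute("INSERT INTO records VALUES ('John Smith', 100)")
db.commit()

# Setup step: take the snapshot before any test touches the data.
saved = sqlite3.connect(":memory:")
db.backup(saved)

# The tests manipulate the data freely.
db.execute("UPDATE records SET amount = 0")
db.execute("DELETE FROM records WHERE owner = 'John Smith'")
db.commit()
rows_during_test = db.execute("SELECT * FROM records").fetchall()

# Cleanup step: copy the snapshot back over the working database.
saved.backup(db)
rows_after_restore = db.execute("SELECT * FROM records").fetchall()
```

Note that the snapshot only preserves what was there when it was taken; if John Smith had already been deleted, the restore would faithfully bring back a database without him.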

Strategy Five: Creating a mini-database with the data you need for your tests

In this strategy, you spin up your own database with only the data you need for testing, and when your tests have finished, you destroy the database.  If you are using Microsoft technologies, you could do this with their DACPAC functionality; or if you are using Docker, you could create your own database in a Docker container.  With this strategy, your data is always brand-new and exactly how you configured it, so no one else can have edited or deleted it.  Also, because your database will be smaller than your real QA environment database, your tests will likely execute more quickly.  
The downside to this strategy is that it requires a lot of preparation.  You may have to do a lot of research on how your data tables relate to each other in order to determine what data you need.  And you’ll need to do a fair amount of coding or configuration to set up the creation and destruction steps.  But in a situation where you want to be sure that your data is right for testing, such as when a developer has just committed new code, this solution is ideal.
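The create, use, and destroy cycle can be sketched with an in-memory SQLite database.  This swaps in SQLite purely for illustration and is not a DACPAC or Docker setup; the tables and seed rows are hypothetical.

```python
import sqlite3
from contextlib import contextmanager

# Only the tables and rows the tests need (hypothetical schema and data).
SEED_SQL = """
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE contacts (user_id INTEGER REFERENCES users(id), email TEXT);
INSERT INTO users VALUES (1, 'Test User');
INSERT INTO contacts VALUES (1, 'test.user@example.com');
"""


@contextmanager
def mini_database():
    """Create a fresh database, hand it to the tests, then destroy it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SEED_SQL)
    try:
        yield conn
    finally:
        conn.close()  # an in-memory database disappears when it is closed


with mini_database() as db:
    emails = db.execute("SELECT email FROM contacts WHERE user_id = 1").fetchall()
```

Working out what belongs in the seed script is where the research into table relationships comes in.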

All of these strategies can be useful, depending on your testing needs.  When you evaluate how accurate you need your data to be, how likely it is that it will be altered by someone else, how quickly you need the tests to run, and how much you can tolerate the occasional failure, it will be clear which strategy to choose.  

What to Put in a Smoke Test

The term “smoke test” is usually used to describe a suite of basic tests that verify that all the major features of an application are working.  Some use the smoke test to determine whether a build is stable and ready for further testing.  I usually use a smoke test as the final check in a deploy to production.  In today’s post, I’ll share a cautionary tale about what can happen if you don’t have a smoke test.  Then I’ll continue that tale and talk about how smoke tests can go wrong.

Early in my testing career, I worked for a company that had a large suite of manual regression tests, but no smoke test.  Each software release was difficult, because it was impossible to run all the regression tests in a timely fashion.  With each release, we picked which tests we thought would be most relevant to the software changes and executed those tests.

One day, in between releases, we heard that there had been a customer complaint that our Global Search feature wasn’t working.  We investigated and found that the customer was correct.  We investigated further and discovered that the feature hadn’t worked in weeks, and none of us had noticed.  This was quite embarrassing for our QA team!

To make sure that this kind of embarrassment never happened again, one of our senior QA engineers created a smoke test to run whenever there was a release to production.  It included all the major features, and could be run fairly quickly.  We felt a lot better about our releases after that.

However, the tester who created the test kept adding test steps to the smoke test.  Every time a new feature was created, a step was added to the smoke test.  If we found a new bug in a feature, even if it was a small one, a step checking for the bug was added to the smoke test.  As the months went on, the smoke test took longer and longer to execute and became more and more complicated.  Eventually the smoke test itself took so much time that we didn’t have time to run our other regression tests.

Clearly there needs to be a happy medium between having no smoke test at all, and having one that takes so long to run that it’s no longer a smoke test.  In order to decide what goes in a smoke test, I suggest asking these three questions:

1. What would absolutely embarrass us if it were broken in this application?

Let’s use an example of an e-commerce website to consider this question.  For this type of website, it would be embarrassing or even catastrophic if a customer couldn’t:

  • search for an item they were looking for
  • add an item to their cart
  • log in to their account
  • edit their information

So at the very least, a smoke test for this site should include a test for each of these features.

2. Is this a main feature of the application?

Examples of features in an e-commerce website that would be main features, but less crucial ones, might be:

  • wish list functionality
  • product reviews
  • recommendations for the user

If these features were broken, it wouldn’t be catastrophic, but they are features that customers expect.  So a test for each one should be added.

3. If there was a bug here, would it stop the application from functioning?

No one wants to have bugs in their application!  But some bugs are more important than others.  If the e-commerce website had an issue where their “Add to Cart” button was off-center, it might look funny, but it wouldn’t stop customers from shopping.  

But a bug where a customer couldn’t remove an item from their cart might keep them from checking out with the items they want, which would affect sales.  So a test to check that items can be removed from a cart would be important in a smoke test.

With these questions in mind, here is an example of a smoke test that could be created for an e-commerce site:

1. Log in
2. Verify product recommendations are present
3. Do a search for a product
4. Read a review of a product
5. Add an item to the cart
6. Add a second item to the cart and then delete it
7. Edit customer information
8. Check out
9. Write a review

A smoke test like this wouldn’t take very long to execute manually, and it would also be easy to automate.  

Whenever new features are added to the application, you should ask yourself the first two questions to determine whether a test for the feature should be added to the smoke test.  And whenever a bug is found in the product, you should ask yourself the third question to determine whether a test for that issue should be added to the smoke test.

Because we want our applications to be of high quality, it’s easy to fall into the trap of wanting to test everything, all the time.  But that can create a test burden that keeps us so busy that we don’t have time for anything else.  Creating a simple, reliable smoke test can free us up for other activities, such as doing exploratory testing on new features or creating nightly automated tests.  
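The nine-step example above could be automated along these lines.  This is only a sketch: FakeShop is a hypothetical stand-in for the site under test, and in practice each step would drive the real application through its UI or API.

```python
class FakeShop:
    """Hypothetical stand-in for the e-commerce site under test."""

    def __init__(self):
        self.logged_in = False
        self.cart = []
        self.reviews = ["Bounces well"]
        self.customer = {"city": "Boston"}
        self.orders = []

    def log_in(self):
        self.logged_in = True
        return self.logged_in

    def recommendations(self):
        return ["Superball"] if self.logged_in else []

    def search(self, term):
        return [p for p in ("Superball", "Yo-yo") if term.lower() in p.lower()]

    def add_to_cart(self, item):
        self.cart.append(item)
        return item in self.cart

    def remove_from_cart(self, item):
        self.cart.remove(item)
        return item not in self.cart

    def edit_customer(self, city):
        self.customer["city"] = city
        return self.customer["city"] == city

    def check_out(self):
        self.orders.append(list(self.cart))
        self.cart.clear()
        return len(self.orders) == 1

    def write_review(self, text):
        self.reviews.append(text)
        return text in self.reviews


def build_steps(shop):
    """One entry per smoke test step, in the order listed above."""
    return [
        ("Log in", shop.log_in),
        ("Verify product recommendations are present",
         lambda: len(shop.recommendations()) > 0),
        ("Do a search for a product", lambda: "Superball" in shop.search("super")),
        ("Read a review of a product", lambda: len(shop.reviews) > 0),
        ("Add an item to the cart", lambda: shop.add_to_cart("Superball")),
        ("Add a second item to the cart and then delete it",
         lambda: shop.add_to_cart("Yo-yo") and shop.remove_from_cart("Yo-yo")),
        ("Edit customer information", lambda: shop.edit_customer("Chicago")),
        ("Check out", shop.check_out),
        ("Write a review", lambda: shop.write_review("Great")),
    ]


def run_smoke_test(steps):
    """Run every step and record pass or fail; an exception counts as a failure."""
    results = []
    for name, step in steps:
        try:
            passed = bool(step())
        except Exception:
            passed = False
        results.append((name, passed))
    return results


results = run_smoke_test(build_steps(FakeShop()))
```

Keeping the steps in one short list makes it easy to see when the smoke test is starting to grow beyond its purpose.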

How to Log a Bug

Last week, we talked about all the things you should check before you log a bug, in order to make sure that what you are seeing is really a bug.  Once you have run through all your checks and you are sure you have a bug, it’s time to log it.  But just throwing a few sentences in your team’s bug-tracking software is not a good idea!  The way you log a bug and the details that you include can mean the difference between a bug being prioritized or left on the backlog; or the difference between a developer being able to find the problem, or closing the bug out with a “cannot repro” message.  In this post, I’ll outline some best practices for logging a bug, with examples of what to do and what not to do.

Let’s take an example bug from the hypothetical Superball Sorter that I described a few weeks ago. Imagine that when you tested the feature, you discovered that if three of the children have a rule where they accept only large balls of some color, the small purple ball is never sorted.

Here are the components of a well-logged bug:

Title: The title of the bug should begin with the area of the application it is referring to.  For example, if a bug was found in the Contacts section of your application, you could begin the title with “Contacts”.  In this case, the area of the application is the Superball Sorter.  After the application area, you should continue the title with a concise statement of the problem.

RIGHT:  Superball Sorter: Small purple ball is not sorted when three children have large ball rules

WRONG: Small purple ball not sorted

While the second example gives at least a clue as to what the bug is about, it will be hard to find among dozens of other bugs later, when you might try to search by “Superball”.  Moreover, it doesn’t state what the conditions are when the ball is not sorted, so if there is another bug found later where this same ball isn’t sorted, there could be confusion as to which bug is which.

Description: The description should begin with a single sentence that describes the issue.  This sentence can provide a bit more detail than the title.  I often start this sentence with “When”, as in “when I am testing x, then y happens”.

RIGHT: When three children have sorting rules involving large balls, the small purple ball is not sorted.

WRONG: Doug doesn’t get the small purple ball

There are a number of things wrong with this second example.  Firstly, the issue happens regardless of which three children have rules where they get only large balls, so referring to Doug here could be misleading.  Secondly, the statement doesn’t describe what rules have been set up.  A developer could read this sentence and assume that the small purple ball is never sorted, regardless of what rules are set up.

Environment and browser details:  These can be optional if it’s assumed that the issue was found in the QA environment and if the issue occurs on all browsers.  But if there’s any chance that the developer won’t know what environment you are referring to, be sure to include it.  And if the issue is found on one browser but not others, be sure to mention that detail.

RIGHT: This is happening in the Staging environment, on the Edge browser only

Steps to reproduce: The steps should include any configuration and login information, and clearly defined procedures to display the problem.

RIGHT:
1. Log in with the following user:
username: foobar
password: mebs47
2. Assign Amy a rule where she gets large red balls only
Assign Bob a rule where he gets large orange balls only
Assign Carol a rule where she gets large yellow balls only
3. Create a set of superballs to be sorted, and ensure that there is at least one small purple ball
4. Run the superball sorter
5. Check each child’s collection for the small purple ball or balls

WRONG:
Everyone has a rule but Doug, and no one is getting the small purple ball

The above example doesn’t provide nearly enough information.  The developer won’t know the login credentials for the user, and won’t know that the three rules should be for large balls.

ALSO WRONG:
1. Open the application
2. Type “foobar” in the username field
3. Type “mebs47” in the password field
4. Click the login button
5. Go to the Superball Sorter rules page
6. Click on Amy’s name
7. Click on the large ball button
8. Click on the red ball button
9. Click the save button
10. Click on Bob’s name
etc. etc. etc.

These steps provide WAY too much information.  It can be safely assumed that the developer knows how to log in; simply providing the credentials should be enough.  It can also be safely assumed that the dev knows how to set a rule in the Superball Sorter, since he or she wrote the code.

Expected and Actual Result: For the sake of clarity, it’s often a good idea to state what behavior you were expecting, and what behavior you got instead.  This can help prevent misunderstandings, and it’s also helpful when a bug has been sitting on the backlog for months and you’ve forgotten how the feature is supposed to work.  

RIGHT:
Expected result: Doug gets the small purple ball, because he is the only child configured to accept it
Actual result: Doug does not get the small purple ball, and the ball remains unsorted

Screenshot or Stack Trace: Include this information only if it will be helpful.  

RIGHT: 
Exception: Ball was not recognized
Caused by: Ball.sort(Ball.java:11)

WRONG: 
Exception: An unknown error occurred 

ALSO WRONG:
[Screenshot: Doug’s collection, with no small purple ball in it]

The above screenshot is not particularly helpful.  While it shows that Doug doesn’t have a small purple ball, that is easily conveyed by the Actual Result that has already been described; no picture is needed.

A clearly written bug is helpful to everyone: to the product owner, who prioritizes bug fixes; to the developer, who has to figure out what’s wrong and fix it; and to the tester, who will need to retest the issue once it’s fixed.  When you take care to log bugs well, it prevents frustration and saves everyone time.  
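If your team files bugs through a script or an API, the components above could be captured in a small structure like the one sketched here.  The BugReport class and its field names are hypothetical and not tied to any particular bug-tracking tool; the sample values come from the Superball Sorter example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BugReport:
    area: str
    summary: str
    description: str
    steps: List[str]
    expected: str
    actual: str
    environment: str = ""  # optional, as described above

    @property
    def title(self) -> str:
        # The title starts with the application area, then a concise statement.
        return f"{self.area}: {self.summary}"

    def render(self) -> str:
        lines = [f"Title: {self.title}", f"Description: {self.description}"]
        if self.environment:
            lines.append(f"Environment: {self.environment}")
        lines.append("Steps to reproduce:")
        lines += [f"{n}. {step}" for n, step in enumerate(self.steps, start=1)]
        lines += [f"Expected result: {self.expected}", f"Actual result: {self.actual}"]
        return "\n".join(lines)


bug = BugReport(
    area="Superball Sorter",
    summary="Small purple ball is not sorted when three children have large ball rules",
    description="When three children have sorting rules involving large balls, "
                "the small purple ball is not sorted.",
    steps=[
        "Log in with the test user",
        "Assign Amy, Bob, and Carol rules for large red, orange, and yellow balls",
        "Create a set of superballs with at least one small purple ball",
        "Run the superball sorter",
        "Check each child's collection for the small purple ball",
    ],
    expected="Doug gets the small purple ball",
    actual="Doug does not get the small purple ball, and the ball remains unsorted",
)
report_text = bug.render()
```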

Before You Log That Bug…

Have you heard the ancient fable, “The Boy Who Cried Wolf”?  In the tale, a shepherd boy repeatedly tricks the people of his village by crying out that a wolf is about to eat his sheep.  The villagers run to his aid, only to find that it was a prank.  One day, the boy really does see a wolf.  He cries for help, but none of the villagers come because they are convinced that it must be another trick.

Similarly, we do not want to be “The Tester Who Cried Bug”.  As software testers, if we repeatedly report bugs that are really user error, our developers won’t believe us when we really find a bug.  To keep this from happening, let’s take a look at the things we should check before we report a bug.  These tasks fall into two categories: checking for user error and gathering information for developers.

Check for User Error

  • The first thing you should always check is to verify that the code you should be testing has actually been deployed.  When I was a new tester, I constantly made this mistake.  It’s such a waste of time to keep investigating an issue with code when the problem is actually that the code isn’t there!
  • Are you testing in the right environment?  When you have multiple environments to test in, and they all look similar, it’s easy to think you are testing in one environment when you are actually in another.  Take a quick peek at your URL if you are testing a web app, or at your build number if you are testing a mobile app.
  • Do you understand the feature?  In a perfect world, we would all have great documentation and really well-written acceptance criteria.  In the real world, this often isn’t the case!  Check with your developer and product owner to make sure that you understand exactly how the feature is supposed to behave.  Maybe you misunderstood something when you started to test.
  • Have you configured the test correctly?  Maybe the feature only works when certain settings are enabled.  Think about what those settings are and go back and check them.  
  • Are you testing with the right user?  Maybe this feature is only available to admin users or paid users.  Verify the criteria of the feature and check your user.
  • Does the back-end data support the test?  Let’s say you are testing that a customer’s information is displayed.  You are expecting to see the customer’s email address on the page, but the email is not there.  Maybe the problem is actually that the email address is null, and that is why it is not displaying.

If you have checked all of the above, and you are still seeing an issue, then it’s time to think about reporting the bug.  But before you do, consider the questions the developer might ask you when he or she begins to investigate the issue.  It will save time for both of you if you have all of those questions answered ahead of time.

Information for the Developer

  • Are you able to reproduce the issue?  You should be able to reproduce the issue at least once before logging the bug.  This doesn’t mean that you shouldn’t log intermittent issues, as they are important as well; but it does mean that you should have as much information as possible about when the issue occurs and when it doesn’t.
  • Do you have clear, reproducible steps to demonstrate the issue?  It is incredibly frustrating to a developer to hear that something is wrong with their software, but have only vague instructions available to use for investigation.  For best results, give the developer a specific user, complete with login credentials, and clear steps that they can use to reproduce the problem.
  • Is this issue happening in Production?  Maybe this isn’t a new bug; maybe the issue was happening already.  This is especially possible when you are testing old code that no one has looked at or used in a while.  See last week’s post, The Power of Pretesting, for ideas on testing legacy software.
  • Does the issue happen on every browser?  This information can be very helpful in narrowing down the possible cause of an issue.
  • Does the issue happen with more than one user?  It’s possible that the user you are testing with has some kind of weird edge case in their configuration or their data.  This doesn’t mean that the issue you are seeing isn’t a bug; but if you can show that there are some users where the issue is not happening, it will help narrow the scope of the problem.
  • Does the issue happen if the data is different?  Try varying the data and see if the issue goes away.  Maybe the problem is caused by a data point that is larger than the UI is expecting, or a field that is missing a value.  The more narrowly you can pinpoint the problem, the faster the developer can fix it.

The ideal relationship between a tester and a developer is one of mutual trust.  If you make sure to investigate each issue carefully before reporting it, and if you are able to report issues with lots of helpful details, your developer will trust that when you cry “Bug”, it’s something worth investigating!

The Power of Pretesting

Having been in the software testing business for a few years now, I’ve become accustomed to various types of testing: Acceptance Testing, Regression Testing, Exploratory Testing, Smoke Testing, etc.  But in the last few weeks, I’ve been introduced to a type of testing I hadn’t thought of before: Pretesting.

On our team, we are working to switch some automatically delivered emails from an old system to a new system.  When we first started testing, we were mainly focused on whether the emails in the new system were being delivered.  It didn’t occur to us to look at the content of the new emails until later, and then we realized we had never really looked at the old emails.  Moreover, because the emails contained a lot of detail, we found that we kept missing things: some extra text here, a missing date there.  We discovered that the best way to prevent these mistakes was to start testing before the new feature was delivered, and thus, Pretesting was born.

Now whenever an email is about to be converted to the new system, we first test it in the old system.  We take screenshots, and we document any needed configuration or unusual behavior.  Then when the email is ready for the new system, it’s easy to go back and compare with what we had before.  This is now a valuable tool in our testing arsenal, and we’ve used it in a number of other areas of our application.

When should you use Pretesting?

  • when you are testing a feature you have never tested before
  • when no one in your company seems to know how a feature works
  • when you suspect that a feature is currently buggy
  • when you are revamping an existing feature
  • when you are testing a feature that has a lot of detail

Why should you Pretest?

Pretesting will save you the headache of trying to remember how something “used to work” or “used to look”.  If customers will notice the change you are about to make, it’s good to make note of how extensive the change will be.  If there are bugs in the existing feature, it would be helpful to know that before the development work starts, because the developer could make those bug fixes while in the code.  Pretesting is also helpful for documenting how the feature works, so you can share those details with others who might be working on or testing the feature.

How to Pretest:

1. Conduct exploratory testing on the old feature.  Figure out what the Happy Path is.
2. Document how the Happy Path works and include any necessary configuration steps
3. Continue to test, exploring the boundaries and edge cases of the feature
4. Document any “gotchas” you may find; these are not bugs, but are instead areas of the feature that might not work as someone would expect
5. Log any bugs you find and discuss them with your team to determine whether they should be fixed with the new feature or left as is
6. Take screenshots of any complicated screens, such as emails, messages, or screens with a lot of text, images, or buttons. Save these screenshots in an easily accessible place such as a wiki page.

When the New Feature is Ready:

1. Run through the same Happy Path scenario with the new feature, and verify that it behaves the same way as the old feature
2. Test the boundaries and edge cases of the feature, and verify that it behaves in the same way as the old feature
3. Verify that any bugs the team has decided to fix have indeed been fixed
4. Compare the screenshots of the old feature with the screenshots of the new feature, and verify that they are the same (with the exception of anything the team agreed to change)
5. Do any additional testing needed, such as testing new functionality that the old feature didn’t have or verifying that the new feature integrates with other parts of the application
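
For text-heavy content such as emails, the comparison in step 4 can be partly automated by diffing the old text against the new.  This is a minimal sketch using Python’s difflib module; the email bodies are made-up examples, and screenshots would still need a visual comparison.

```python
import difflib

# Text captured from the old system during pretesting (hypothetical content).
old_email = """Hello Pat,
Your order shipped on 2019-03-04.
Thank you for shopping with us."""

# Text produced by the new system (hypothetical content).
new_email = """Hello Pat,
Your order shipped.
Thank you for shopping with us.
Visit our store again soon!"""


def email_differences(old, new):
    """Return only the removed (-) and added (+) lines between two texts."""
    diff = difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="old system", tofile="new system", lineterm="", n=0,
    )
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


differences = email_differences(old_email, new_email)
```

Here the diff surfaces both kinds of mistake mentioned earlier: a missing date and some extra text.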

The power of Pretesting is that it helps you notice details you might otherwise miss in a new feature, and as a bonus, find existing bugs in the old feature.  Moreover, testing the new feature will be easy, because you will have already created a test plan.  Your work will help the developer do a better job, and your end users will appreciate it!

Automating Tests for a Complicated Feature

Two weeks ago, I introduced a hypothetical software feature called the Superball Sorter, which would sort out different colors and sizes of Superballs among four children.  I discussed how to create a test plan for the feature, and then last week I followed up with a post about how to organize a test plan using a simple spreadsheet.  This week I’ll be describing how to automate tests for the feature.

In case you don’t have time to view the post where I introduced the feature, here’s how it works:

  • Superballs can be sorted among four children- Amy, Bob, Carol, and Doug
  • The balls come in two sizes: large and small
  • The balls come in six colors: red, orange, yellow, green, blue, and purple
  • The children can be assigned one or more rules for sorting: for example, Amy could have a rule that says that she only accepts large balls, or Bob could have a rule that says he only accepts red or orange balls
  • Distribution of the balls begins with Amy and then proceeds through the other children in alphabetical order, and continues in the same manner as if one was dealing a deck of cards
  • Each time a new ball is sorted, distribution continues with the next child in the list
  • The rules used must result in all the balls being sortable; if they do not, an error will be returned
  • Your friendly developer has created a ball distribution engine that will create balls of various sizes and colors for you to use in testing

When automating tests, we want to keep our tests as simple as possible.  This means sharing code whenever we can.  In examining the manual test plan, we can see that there are three types of tests here:

  • A test where none of the children have any rules
  • Tests where all the children have rules, but the rules result in some balls not being sortable
  • Tests where one or more children have rules, and all of the balls are sortable
We can create a separate test class for each of these types, which will make it easy for us to share code inside each class.  We’ll also have methods that will be shared among the classes:

child.deleteBalls()- this will clear all the distributed balls, getting ready for the next test

child.deleteRules()- this will clear out all the existing rules, getting ready for the next test

distributeBalls(numberOfBalls)- this will randomly generate a set number of balls of various sizes and colors and distribute them one at a time to the children, according to the rules

verifyEvenDistribution(numberOfBalls)- this is for scenarios where none of the children have rules; it will take the number of balls distributed and verify that each child has one-fourth of the balls

child.addRule(Size size, Color color)- this will set a rule for a child; each child can have more than one rule, and either the size or color (but not both) can be null

child.verifyRulesRespected()- for the specified child, this will iterate through each ball the child holds and verify that the ball is allowed by the child’s rules

child.addFourthRuleAndVerifyError(Size size, Color color)- for the tests in the InvalidRules class, it will always be the fourth child’s rules that will trigger the error, because it’s only with the fourth child that the Sorter realizes that there will be balls that can’t be sorted.  So this method will assert that an error is returned.
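To make these helper methods more concrete, here is a minimal Python sketch of an in-memory stand-in for the Sorter.  Everything in it is an assumption made for illustration, since the real feature's code isn't shown here: the names are snake_case versions of the methods above, a child with several rules is assumed to accept a ball that matches any one of them, and a child with no rules is assumed to accept every ball.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional

SIZES = ["large", "small"]
COLORS = ["red", "orange", "yellow", "green", "blue", "purple"]


@dataclass(frozen=True)
class Ball:
    size: str
    color: str


@dataclass(frozen=True)
class Rule:
    # None means "any"; either size or color (but not both) may be None.
    size: Optional[str] = None
    color: Optional[str] = None

    def matches(self, ball: Ball) -> bool:
        return self.size in (None, ball.size) and self.color in (None, ball.color)


@dataclass
class Child:
    name: str
    rules: List[Rule] = field(default_factory=list)
    balls: List[Ball] = field(default_factory=list)

    def add_rule(self, size=None, color=None):
        if size is None and color is None:
            raise ValueError("a rule needs a size, a color, or both")
        self.rules.append(Rule(size, color))

    def delete_rules(self):
        self.rules.clear()

    def delete_balls(self):
        self.balls.clear()

    def accepts(self, ball):
        # Assumption: no rules means the child accepts every ball.
        return not self.rules or any(rule.matches(ball) for rule in self.rules)

    def verify_rules_respected(self):
        for ball in self.balls:
            assert self.accepts(ball), f"{self.name} holds {ball} against the rules"


def distribute_balls(children, number_of_balls, rng=random):
    """Deal random balls like cards; a refused ball passes to the next child."""
    turn = 0
    for _ in range(number_of_balls):
        ball = Ball(rng.choice(SIZES), rng.choice(COLORS))
        for offset in range(len(children)):
            index = (turn + offset) % len(children)
            if children[index].accepts(ball):
                children[index].balls.append(ball)
                turn = index + 1  # the next ball is offered to the following child
                break
        else:
            raise ValueError(f"no child accepts {ball}")
```

A test could then build the four children with `[Child(name) for name in ("Amy", "Bob", "Carol", "Doug")]`, add rules, call `distribute_balls`, and finish with `verify_rules_respected()` on each child.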

Each test class should have setup and cleanup steps to avoid repetitive code.  However, I would use child.deleteBalls() and child.deleteRules() for each child in both my setup and cleanup steps, just in case there is an error in a test that causes the cleanup step to be missed.

For the DistributionWithNoRules test class, each test would consist of:

distributeBalls(numberOfBalls);
verifyEvenDistribution(numberOfBalls);

All that would vary in each test would be the number of balls passed in.

For the DistributionWithRules test class, each test would consist of:

child.addRule(Size size, Color color); –repeated as many times as needed for each child
distributeBalls(numberOfBalls);
child.verifyRulesRespected(); –repeated for each child that has one or more rules

Finally, for the InvalidRules test class, each test would consist of:

child.addRule(Size size, Color color); –repeated as many times as needed for the first three children
child.addFourthRuleAndVerifyError(Size size, Color color); –verifying that the fourth rule triggers an error
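
One way to picture the check behind the InvalidRules tests is sketched below in Python.  It is only a stand-in: the real Sorter would return the error itself, whereas this hypothetical `unsortable_balls` helper works out which of the twelve possible balls no child would accept.  The function names, the tuple format for rules, and the assumption that a child with no rules accepts everything are all mine.

```python
from itertools import product

SIZES = ("large", "small")
COLORS = ("red", "orange", "yellow", "green", "blue", "purple")


def unsortable_balls(rules_by_child):
    """Return every (size, color) ball that no child would accept.

    rules_by_child maps a child's name to a list of (size, color) rules,
    where None means "any". An empty list means the child has no rules
    and is assumed to accept everything.
    """
    def accepts(rules, ball):
        return not rules or any(
            size in (None, ball[0]) and color in (None, ball[1])
            for size, color in rules
        )

    return [
        ball for ball in product(SIZES, COLORS)
        if not any(accepts(rules, ball) for rules in rules_by_child.values())
    ]


def add_fourth_rule_and_verify_error(rules_by_child, child, size, color):
    """Add the final rule and assert that it leaves some ball unsortable."""
    rules_by_child[child].append((size, color))
    stranded = unsortable_balls(rules_by_child)
    assert stranded, "expected an error, but every ball is still sortable"
    return stranded


# Amy, Bob, and Carol accept only large balls; Doug has no rules yet.
rules = {
    "Amy": [("large", None)],
    "Bob": [("large", None)],
    "Carol": [("large", None)],
    "Doug": [],
}
stranded = add_fourth_rule_and_verify_error(rules, "Doug", "large", None)
```

Once Doug also accepts only large balls, `stranded` holds the six small balls, which is the situation that should make the Sorter return an error.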

The nice thing about organizing the tests like this is that it will be easy to vary the tests.  For example, you could have the DistributionWithNoRules test class run with 40, 400, and 800 balls; or you could set up a random number generator that would generate any multiple of four for each test.
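
As a rough sketch of that idea in Python's built-in unittest module, the test class below runs the same check with 40, 400, and 800 balls and with a random multiple of four.  The `distribute_balls` function here is a hypothetical stand-in that simply deals balls in order, and unlike the helpers described above it returns the counts so the verification step can inspect them.

```python
import random
import unittest

CHILDREN = ("Amy", "Bob", "Carol", "Doug")


def distribute_balls(number_of_balls):
    """Hypothetical stand-in for the Sorter when no child has rules."""
    counts = {name: 0 for name in CHILDREN}
    for ball_number in range(number_of_balls):
        counts[CHILDREN[ball_number % len(CHILDREN)]] += 1
    return counts


def verify_even_distribution(counts, number_of_balls):
    expected = number_of_balls // len(CHILDREN)
    for name in CHILDREN:
        assert counts[name] == expected, f"{name} has {counts[name]}, expected {expected}"


class DistributionWithNoRules(unittest.TestCase):
    def test_fixed_ball_counts(self):
        for number_of_balls in (40, 400, 800):
            with self.subTest(number_of_balls=number_of_balls):
                verify_even_distribution(distribute_balls(number_of_balls), number_of_balls)

    def test_random_multiple_of_four(self):
        number_of_balls = 4 * random.randint(1, 250)
        # subTest reports the value on failure so the run can be reproduced.
        with self.subTest(number_of_balls=number_of_balls):
            verify_even_distribution(distribute_balls(number_of_balls), number_of_balls)
```

Using `subTest` means a failure with one ball count is reported on its own, without hiding the results for the other counts.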

You could also set up your DistributionWithRules and InvalidRules test classes to take in rule settings from a separate table, varying the table occasionally for greater test coverage.
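
Here is one way such a table might look, sketched in Python with the standard csv module.  The column names and the `load_rule_settings` helper are assumptions for illustration; the three test cases are taken from the manual test plan.  In practice the table would probably live in its own file rather than in a string.

```python
import csv
import io

# An empty size or color cell means "any".
RULE_TABLE = """\
test_case,child,size,color
amy_large_only,Amy,large,
carol_large_doug_small,Carol,large,
carol_large_doug_small,Doug,small,
amy_red_bob_blue,Amy,,red
amy_red_bob_blue,Bob,,blue
"""


def load_rule_settings(table_text):
    """Group the table's rows into {test_case: [(child, size, color), ...]}."""
    settings = {}
    for row in csv.DictReader(io.StringIO(table_text)):
        rule = (row["child"], row["size"] or None, row["color"] or None)
        settings.setdefault(row["test_case"], []).append(rule)
    return settings


settings = load_rule_settings(RULE_TABLE)
```

Each test would then loop over the rules for its test case and make one addRule call per row, so changing coverage only means editing the table.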

Astute readers may have noticed that there are a few holes in my test plan:

  • how would I assert even distribution in a test scenario where no children have rules and the number of balls is not evenly divisible by four?
  • how can I show that a child who doesn’t have a rule is still getting the correct number of balls, when there are one or more children with a rule?  For example, if Amy has a rule that she only gets red balls, and Bob, Carol, and Doug have no rules, how can I prove that Bob, Carol and Doug get an even distribution of balls?
  • will the Superball Sorter work with more or fewer than four children? How would I adjust my test plan for this?
I’ll leave it to you to think about handling these things, and perhaps I will tackle them in future posts.
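
As a starting point for the first question, here is one possible approach sketched in Python.  It assumes that with no rules the balls are dealt strictly in order starting with Amy, so when the count isn't divisible by four, the children at the front of the list each end up with one extra ball.  The `expected_counts` helper is hypothetical.

```python
CHILDREN = ("Amy", "Bob", "Carol", "Doug")


def expected_counts(number_of_balls, children=CHILDREN):
    """Expected balls per child when no child has rules."""
    base, extra = divmod(number_of_balls, len(children))
    return {
        name: base + (1 if position < extra else 0)
        for position, name in enumerate(children)
    }
```

With 22 balls, this gives Amy and Bob six balls each and Carol and Doug five each.  Because the list of children is a parameter, the same helper could also be a first step toward the third question.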

I’ve had a lot of fun over the last few weeks creating the idea of the Superball Sorter and thinking of ways to test it!  I’ve started to actually write code for this, and someday when it’s finished I will share it with you.

Organizing a Test Plan

In last week’s post, we took a look at a hypothetical software feature which would sort out Superballs among four children according to a set of rules.  I came up with forty-five different test cases from simple to complicated that would test various combinations of the rules. 

But a blog post is not a very easy way to read or execute on a test plan!  So in this week’s post, I’ll discuss the techniques I use to organize a test plan, and also discuss ways that I find less effective.

What I Don’t Do

1. I don’t write up step by step instructions, such as

Navigate to the login page
Enter the username into the username field
etc. etc.

Unless you are writing a test plan for someone you have never met, who you will never talk to, and who has never seen the application, this is completely unnecessary.  While some level of instruction is important when your test plan is meant for other people, it’s safe to assume that you can provide some documentation about the feature elsewhere. 

2. I don’t add screenshots to the instructions.

While having screenshots in documentation is helpful, when they are in a test plan it just makes the plan larger and harder to read.

3. I don’t use a complicated test tracking system.

In my experience, test tracking systems require more time to maintain than the time needed to actually run the tests.  If there are regression tests that need to be run periodically, they should be automated.  Anything that can’t be automated can be put in a one-page test plan for anyone to use when the need arises. 

What I Do:

1. I use a spreadsheet to organize my tests.

Spreadsheets are so wonderful, because the table cells are already built in.  They are easy to use, easy to edit, and easy to share.  For test plans that I will be running myself, I use an Excel spreadsheet.  For plans that I will be sharing with my team, I use a table in a Confluence page that we can all edit. 

You can see the test spreadsheet I created for our hypothetical Superball sorter here.  I’ll also be including some screenshots in this post. 

2. I keep the instructions simple. 

In the screenshot below, you can see the first two sections of my test:

In the second test case, I’ve called the test “Amy- Large balls only”.  This is enough for me to know that what I’m doing here is setting a rule for Amy that she should accept Large balls only.  I don’t need to write “Create a rule for Amy that says that she should accept Large balls only, and then run the ball distribution to pass out the balls”.  All of that is assumed from the description of the feature that I included in last week’s post.

Similarly, I’ve created a grouping of four columns called “State at End of Test Pass”.  There is a column for each child, and in each cell I’ve included what the expected result should be for that particular test case.  For example, in the third test case, I’ve set Amy, Carol, and Doug to “any balls”, and Bob to “Small balls only”.  This means that Amy, Carol, and Doug can have any kind of balls at all at the end of the test pass, and Bob should have only Small balls.  I don’t need to write “Examine all of Bob’s balls and verify that they are all Small”.  “Small balls only” is enough to convey this.

3. I use headers to make the test readable. 

Because this test plan has forty-five test cases, I need to scroll through it to see all the tests.  Because of this, I make sure that every section has good headers so I don’t have to remember what the header values are at the top of the page. 

In the above example, you can see the end of one test section and the beginning of the final test section, “Size and Color Rules”.  I’ve put in the headers for Amy, Bob, Carol, and Doug, so that I don’t have to scroll back up to the top to see which column is which. 

4. I keep the chart cells small to keep the test readable. 

As you can see below, as the test cases become more complex, I’ve added abbreviations so that the test cases don’t take up too much space:

After running several tests, it’s pretty easy to remember that “A” equals “Amy” and “LR” equals “Large Red”. 

5. I use colors to indicate a passing or failing test.

The great thing about a simple test plan is that it’s easy to use it as a report for others.  I’ll be talking about bug and test reports in a future post, but for now it’s enough to know that a completed test plan like this will be easy for anyone to read.  Here’s an example of what the third section of the test might look like when it’s completed, if all of the tests pass:

If a test fails, it’s marked in red.  If there are any extra details that need to be added, I don’t put them in the cell, making the chart hard to read; instead, I add notes on the side that others can read if they want more detail. 

6. I use tabs for different environments or scenarios. 

The great thing about spreadsheets such as Google Sheets or Excel is that they offer the use of tabs.  If you are testing a feature in your QA, Staging, and Prod environments, you can have a tab for each environment, and copy and paste the test plan in each.  Or you can use the tabs for different scenarios.  In the case of our Superball Sorter test plan, we might want to have a tab for testing with a test run of 20 Superballs, one for 100 Superballs, and one for 500 Superballs. 

Test plans should be easy to read, easy to follow, and easy to use when documenting results.  You don’t need fancy test tools to create them; just a simple spreadsheet, an organized mindset, and an ability to simplify instructions are all you need.

Looking over the forty-five test cases in this plan, you may be saying to yourself, “This would be fine to run once, but I wouldn’t want to have to run it with every release.”  That’s where automation comes in!  In next week’s post, I’ll talk about how I would automate this test plan so regression testing will take care of itself.

How to Design a Test Plan

Being a software tester means much more than just running through Acceptance Criteria on a story.  We need to think critically about every new feature and come up with as many ways as we can to test it.  When there are many permutations possible in a feature, we need to balance being thorough with testing in a reasonable amount of time.  Automation can help us test many permutations quickly, but too many people jump to automation without really thinking about what should be tested.

In this week’s post, I’m going to describe a hypothetical feature and talk about how I would design a test plan for it.

The feature is called the Superball Sorter, and here is how it works:

  • Superballs can be sorted among four children- Amy, Bob, Carol, and Doug
  • The balls come in two sizes: large and small
  • The balls come in six colors: red, orange, yellow, green, blue, and purple
  • The children can be assigned one or more rules for sorting: for example, Amy could have a rule that says that she only accepts large balls, or Bob could have a rule that says he only accepts red or orange balls
  • Distribution of the balls begins with Amy and then proceeds through the other children in alphabetical order, and continues in the same manner as if one was dealing a deck of cards
  • Each time a new ball is sorted, distribution continues with the next child in the list
  • The rules used must result in all the balls being sortable; if they do not, an error will be returned
  • Your friendly developer has created a ball distribution engine that will create balls of various sizes and colors for you to use in testing

Here’s a quick example:  let’s say that Carol has a rule that she only accepts small balls.

The first ball presented for sorting is a large red ball.  Amy is first in the list, and she doesn’t have any rules, so the red ball will go to her.

The next ball presented is a small blue ball.  Bob is second on the list, and he doesn’t have any rules, so the blue ball will go to him.

The third ball is a large purple ball.  Carol is next on the list, BUT she has a rule that says that she only accepts small balls, so the ball will not go to her.  Instead the ball is presented to Doug, who doesn’t have any rules, so the large purple ball will go to him.

So what we have after a first pass is:
Amy: large red ball
Bob: small blue ball
Carol: no ball
Doug: large purple ball

Since Doug got the most recent ball, we’d continue the sorting by offering a ball to Amy.
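
The walkthrough above can be replayed in a few lines of Python.  This is only a sketch of the dealing logic as described: the `sort_balls` function is hypothetical, and to keep it short it supports just one size rule per child.

```python
def sort_balls(balls, rules):
    """Deal balls in order; rules maps a child's name to the one size accepted."""
    children = ["Amy", "Bob", "Carol", "Doug"]
    received = {name: [] for name in children}
    turn = 0
    for size, color in balls:
        for offset in range(len(children)):
            index = (turn + offset) % len(children)
            name = children[index]
            # A child without a rule accepts any ball.
            if rules.get(name) in (None, size):
                received[name].append((size, color))
                turn = index + 1  # the next ball is offered to the following child
                break
        else:
            raise ValueError(f"no child accepts a {size} {color} ball")
    return received


balls = [("large", "red"), ("small", "blue"), ("large", "purple")]
result = sort_balls(balls, {"Carol": "small"})
```

After these three balls, `result` shows Amy with the large red ball, Bob with the small blue ball, Carol with nothing, and Doug with the large purple ball.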

How should we test this?  Before I share my plan, you may want to take a moment and see what sort of test plan you would design yourself.  Then you can compare your plan with mine.

My test plan design philosophy always begins with testing the simplest possible option, and then gradually adding more complex scenarios.  So, I will begin with:

Part One:  No children have any rules

  • If no children have any rules, we should see that the balls are always evenly distributed among Amy, Bob, Carol, and Doug in that order.  If we send in twenty balls, for example, we should see that each child winds up with five.  

Next,  I will move on to testing just one type of rule.  There are only two parameters for the size rule, but six parameters for the color rule, so I will start with the size rule:

Part Two: Size rules only

We could have anywhere from one child to four children with a rule.  We’ll start with one child, and work up to four children.  Also, one child could have two rules, although that would be a bit silly, since the two rules would be accepting large balls only and accepting small balls only, which would be exactly the same as having no rules.  So let’s write up some test cases:

A.  One child has a rule

  • Amy has a rule that she only accepts large balls.  At the end of the test pass, she should only have large balls.
  • Bob has a rule that he only accepts small balls.  At the end of the test pass, he should only have small balls.
B. Two children have rules
  • Amy and Bob both have rules that they only accept large balls.  At the end of the test pass, they should only have large balls.
  • Carol has a rule that she only accepts large balls, and Doug has a rule that he only accepts small balls.  At the end of the test pass, Carol should have only large balls, and Doug should have only small balls.
C. Three children have rules
  • Amy, Bob, and Carol have rules that they only accept small balls. At the end of the test pass, they should only have small balls.
  • Amy and Bob have rules that they only accept small balls, and Carol has a rule that she only accepts large balls.  At the end of the test pass, Amy and Bob should have only small balls, and Carol should have only large balls.
  • Amy has a rule that she accepts both large balls and small balls, and Bob and Carol have rules that they only accept large balls.  At the end of the test pass, Amy should have both large and small balls (assuming that both were distributed during the test pass), and Bob and Carol should have only large balls.
D. Four children have rules
  • Amy and Bob have rules that they only accept large balls, and Carol and Doug have rules that they only accept small balls.  
  • All four children have a rule that they only accept large balls- this rule should return an error

Now that we have extensively tested the size rule, it’s time to test the other rule in isolation:

Part Three: Color rules only

As with the size rule, anywhere from one to four children could have a color rule.  But this rule type is a bit more complex, because each child could have from one to six color rules.  Let’s start simple with one child and one rule:

A. One child has one rule
  • Bob accepts only red balls
  • Bob accepts only orange balls
  • Bob accepts only yellow balls
  • Bob accepts only green balls
  • Bob accepts only blue balls
  • Bob accepts only purple balls
This tests that each color rule will work correctly on its own.  
B. One child has more than one rule
  • Carol accepts only red and orange balls
  • Carol accepts only red, orange, and yellow balls
  • Carol accepts only red, orange, yellow, and green balls
  • Carol accepts only red, orange, yellow, green, and blue balls
  • Carol accepts only red, orange, yellow, green, blue and purple balls (which, again, is sort of silly, because it’s like having no rule at all)
C. Two children have color rules
  • Amy and Bob both accept only red balls
  • Amy accepts only red balls and Bob accepts only blue balls
  • Amy accepts only red and green balls and Bob accepts only blue and yellow balls
  • Carol accepts only red, orange, and yellow balls, and Doug accepts only green balls
Note that there are MANY more possibilities here than what we are actually testing.  We are merely trying out a few different scenarios, such as one where one child has three rules and one child has one.
D. Three children have color rules
  • Amy, Bob, and Carol accept only red balls
  • Amy accepts only red balls, Bob accepts only orange balls, and Carol accepts only yellow balls
  • Amy accepts only red balls, Bob accepts only red and orange balls, and Carol accepts only red, orange, and yellow balls
The last scenario above exercises a path where the children share one rule but not other rules.
E. Four children have color rules
  • Amy, Bob, Carol, and Doug only accept purple balls- this should return an error
  • Amy only accepts red and yellow balls, Bob only accepts orange balls, Carol only accepts yellow and blue balls, and Doug only accepts green balls- this should also return an error, because no one is accepting purple balls
  • Amy only accepts red balls, Bob only accepts red and orange balls, Carol only accepts yellow balls, and Doug only accepts yellow, green, blue, and purple balls

Now that we’ve exercised both rule types separately, it’s time to try testing them together!  Here’s where things get really complicated.  Let’s try to start simply with scenarios where each child has either a color rule or a size rule, but not both, and move on to more complex scenarios from there:

Part Four: Size and color rules

A. Children have one size rule or one color rule
  • Doug only accepts large balls, and Bob only accepts red balls
  • Doug only accepts large balls, Bob only accepts red balls, and Carol only accepts small balls
  • Doug only accepts large balls, Bob only accepts red balls, and Carol only accepts small balls, and Amy only accepts yellow balls
  • Amy and Doug only accept large balls, Bob only accepts small balls, and Carol only accepts purple balls
  • Amy and Doug only accept large balls, and Bob and Carol only accept purple balls- this should return an error, because there’s no one to accept any small balls other than the small purple balls
B. Children have both a size and a color rule
  • Amy only accepts large red balls
  • Amy only accepts large red balls, and Bob only accepts small blue balls
  • Amy only accepts large red balls, Bob only accepts small blue balls, and Carol only accepts large green balls
  • Amy only accepts large red balls, Bob only accepts small blue balls, Carol only accepts large green balls, and Doug only accepts small yellow balls- this should return an error
C. Children have a size rule and more than one color rule
  • Amy only gets large red, orange, and yellow balls, Bob only gets small red, orange, and yellow balls, Carol only gets large green, blue, and purple balls, and Doug only gets small green, blue and purple balls
  • Try the above scenario, but remove the large yellow ball from Amy’s list- this should return an error, because there’s no one to accept the large yellow balls

D. Children have more than one size rule and more than one color rule

  • Amy: large red, large blue, small yellow; Bob: large orange, large purple, small green; Carol: large yellow, small red, small blue; Doug: large green, small orange, small purple
  • Try the above scenario, but add a small purple ball to Amy’s rules
  • Try the first scenario, but change Doug’s small purple rule to a large purple rule- this should return an error, because now there’s no one to accept the small purple balls
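
When the rules get this detailed, it's easy to make a mistake about which scenarios should return an error.  A quick Python sketch like the one below can double-check the expectations for section D before any testing starts.  The set-based representation and the `unsortable` helper are assumptions for illustration, and they only work when every child has explicit size-and-color rules, as is the case here.

```python
from itertools import product

ALL_BALLS = set(product(
    ("large", "small"),
    ("red", "orange", "yellow", "green", "blue", "purple"),
))

first_scenario = {
    "Amy": {("large", "red"), ("large", "blue"), ("small", "yellow")},
    "Bob": {("large", "orange"), ("large", "purple"), ("small", "green")},
    "Carol": {("large", "yellow"), ("small", "red"), ("small", "blue")},
    "Doug": {("large", "green"), ("small", "orange"), ("small", "purple")},
}


def unsortable(scenario):
    """Balls that no child's rules accept."""
    return ALL_BALLS - set().union(*scenario.values())


# Third scenario: Doug's small purple rule becomes a large purple rule.
third_scenario = dict(first_scenario)
third_scenario["Doug"] = {("large", "green"), ("small", "orange"), ("large", "purple")}
```

Here `unsortable(first_scenario)` is empty, while `unsortable(third_scenario)` contains only the small purple ball, which matches the expected results in the list above.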

And there you have it!  Forty-five test cases that exercise a great many of the options and permutations that the Superball Sorter offers.  If you have read this far, you must enjoy testing as much as I do!  

How did my test plan compare to yours?  Did you think of things I didn’t?  Be sure to let me know in the comments section!

You may have noticed that while this test plan is complete, it’s not that easy to read, and it doesn’t provide any way to keep track of test results.  Next week I’ll talk about how to organize the test plan so you can run it quickly and record results easily.  

Localization Testing

If your app is used anywhere outside your country of origin, chances are it uses some kind of localization strategy.  Many people assume that localization simply means translation to another language, but this is not the case.  Here are some examples of localization that your application might use:

Language: different countries speak different languages.  But different regions can speak different languages as well.  An example of this would be Canada: in the province of Quebec, the primary language spoken is French, and in the other provinces, the primary language spoken is English.

Spelling: even when two areas speak the same language, the spelling of words can be different.  For example, “color” in the US, as opposed to “colour” in Canada and the UK.

Words and idioms: words can vary even in a common language.  In the UK, a truck is a lorry, and a car’s trunk is a boot.  In the US, to “table” a topic means to stop talking about it until a later meeting.  But in the UK and Canada, to “table” a topic means to start talking about it in the current meeting- the complete opposite of what it means in the US!

Currency: different countries will use different currencies.  But this doesn’t just mean using a different symbol in front of the amount, like $ or £.  The amounts can also be formatted differently.  In the US and the UK, fractions of a dollar or pound are separated with a dot, and thousands are separated with a comma.  In Germany and many other European countries, it’s the opposite.  So what would be written as 1,000.00 in the US would be written as 1.000,00 in Germany.

Date and Time Formats:  in the US, dates are written as month/day/year, but in the UK, dates are written as day/month/year.  The US generally writes times using AM and PM, but many other countries use 24-hour time, so what would be 1:00 PM in the US would be 13:00 elsewhere.

Units of Measure: the US usually uses US Customary units, such as pounds for weight and feet and inches for height.  Most other countries will use the metric system for these measurements.  Most countries measure air temperature in Celsius, while the US uses Fahrenheit.

Postal Codes and Phone Numbers: these vary widely from country to country.  See my posts on international phone numbers and postal codes for some examples.

Images: pictures in an application might need to vary from country to country, or be chosen so that they work everywhere.  For example, if your application were to be used internationally, you might not want to include a picture of a building with an American flag in the front.  Or if your app were to be used in religiously conservative countries, you might not want a picture of a person in a sleeveless shirt.
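
To make the number and date differences concrete, here is a small Python sketch.  The `format_amount` helper is hypothetical and takes the separators as arguments rather than looking them up from a locale, which keeps the example independent of the locales installed on a given machine.

```python
from datetime import date, time


def format_amount(amount, thousands_sep, decimal_sep):
    """Format a number with two decimals and the given separators."""
    dot_decimal = f"{amount:,.2f}"  # comma for thousands, dot for decimals
    # Swap via a placeholder so the two separators do not clobber each other.
    return (dot_decimal.replace(",", "\0")
                       .replace(".", decimal_sep)
                       .replace("\0", thousands_sep))


dot_style = format_amount(1000.0, ",", ".")    # "1,000.00"
comma_style = format_amount(1000.0, ".", ",")  # "1.000,00"

fifth_of_march = date(2024, 3, 5)
month_first = fifth_of_march.strftime("%m/%d/%Y")  # "03/05/2024"
day_first = fifth_of_march.strftime("%d/%m/%Y")    # "05/03/2024"

one_pm_24_hour = time(13, 0).strftime("%H:%M")     # "13:00"
```

The date example shows why this matters for testing: the same day is 03/05/2024 in one format and 05/03/2024 in the other, and a user could easily read it as a different date.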

Testing for Localization

The first step in localization testing is to determine exactly what will be localized.  Your company may decide to localize for date and time, postal codes, and phone numbers, but not for language.  Or a mobile app may choose to only use other languages that are built into the device, so that the text of the app would be in one language, but the buttons for the app would be in the user’s language.

If your app will be using other languages, gather all the texts you will need to be checking.  For example, if your app has menu items such as “Home”, “Search”, “Your Account”, and “About Us”, and your app will be localized for French and Spanish, find out what those menu items should be in French and Spanish.  It goes without saying that whoever has done the translations should have consulted with a native speaker to make sure that the translations are correct.

Next, create a test plan.  The simplest way to do this would be to create a spreadsheet, where the left column lists the different localization types you need to test and the top row lists the different countries.  Here is a very basic example:

Once your matrix is created, it should be very simple to run through your tests.  If you are testing on mobile, here’s a helpful hint: When you switch your mobile device to a different language, make sure you know exactly how to switch it back if you don’t recognize the words in the language you are switching to.  When I was testing localization for Mandarin, this was especially important; since I didn’t know any of the characters, I had no idea what any of the menu items said.  I memorized the order of the menu items so I knew which item I needed to click on to get back to English.
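
If you'd rather track the matrix in code than in a spreadsheet, the same idea can be sketched in Python.  The localization types and countries below are placeholders chosen for illustration, and the `build_matrix` and `remaining` helpers are hypothetical.

```python
from itertools import product

localization_types = ["Language", "Currency", "Date and time", "Postal code", "Phone number"]
countries = ["US", "UK", "Canada"]


def build_matrix(types, countries):
    """One cell per localization type and country, all untested to begin with."""
    return {(kind, country): "not run" for kind, country in product(types, countries)}


def remaining(matrix):
    """Cells that still need to be tested."""
    return [cell for cell, status in matrix.items() if status == "not run"]


matrix = build_matrix(localization_types, countries)
matrix[("Currency", "UK")] = "pass"
```

Calling `remaining(matrix)` at any point lists the combinations that haven't been run yet.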

Another important thing to watch for as you are testing is that translated items fit well in the app.  For example, your Save button might look perfectly fine in English, but in German it could look like this:

Once you have completed your localization testing, you’ll want to automate it.  This could be done with UI automation tools such as Selenium.  You could have a separate test suite for each language, where the setup step would be to set the desired country on the browser or device, and each test would validate one aspect of localization, such as verifying that button texts are in the correct language, or validating that you can enter a postal code in the format of that country.  It would be very helpful to use a tool like Applitools to validate that buttons are displaying correctly or that the correct flag icon is displaying for the location.

Localization is a tricky subject, and just like software, it’s hard to make it perfect.  But if you and your development team clarify exactly what you want to localize, and if you are methodical in your testing, you’ll ensure that a majority of your users will be satisfied with your application.