This week I’m writing about three things not often associated with testing: logging, monitoring, and alerting. Perhaps you’ve taken advantage of logging in your testing, but monitoring and alerting seem like a problem for IT or DevOps. However, a bug-free application doesn’t mean a thing if your users can’t get to it because the server crashed! For this reason, it’s important to understand logging, monitoring, and alerting so that we as testers can participate in ensuring the health of our applications.
Logging:
Logging is simply recording information about what happens in an application, usually by writing to a file or a database. Developers often include logging statements in their code to help determine what's going on with the application below the UI. This is especially helpful in applications that make calls to a number of servers or databases.
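To make that concrete, here's a minimal sketch in Python of what application logging can look like; the service name and function are made up for illustration:

```python
import logging

# Write log entries to a file, including a timestamp and severity level.
logging.basicConfig(
    filename="notifications.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s - %(message)s",
)
logger = logging.getLogger("notification-service")

def send_notification(user_id, channel):
    # Hypothetical function: log lines like these let us see what's happening below the UI.
    logger.info("Sending notification to user %s via the %s channel", user_id, channel)
    try:
        # ...call out to the channel's API here...
        pass
    except ConnectionError:
        logger.exception("Failed to send notification to user %s via %s", user_id, channel)
        raise

send_notification("12345", "email")
```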
Recently I tested a notification system that passed a message from a function to a number of different channels. Logging was enormously helpful because it let me follow the message through the channels; without good logging, I would have had no way to figure out where the bug was when a message I was expecting never arrived.
Good log messages should be easy to access and easy to search. You shouldn't have to log on to some obscure remote desktop and sift through tens of thousands of entries with no line breaks. One helpful option is Kibana, an open-source tool that lets you search and sort through logs in an easy-to-read format.
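One common way to make logs easy to search in a tool like Kibana is to write each entry as a single structured JSON line. Here's a rough sketch using only Python's standard library; a real setup would also need to ship these lines to wherever Kibana reads them from (typically Elasticsearch):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON line so it can be indexed and searched easily."""
    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(entry)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)

# Each entry comes out as a single searchable JSON line.
logging.getLogger("payments").info("Charge submitted for order %s", "1234")
```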
Good log messages should also be easy to understand and provide helpful information. It's so frustrating to find a log message about an error and discover that it says "An unknown error occurred" or "Error TSGB-45667". Ask your developers if they can provide log messages that make it clear what went wrong and where in the code it happened.
Another helpful tactic is to assign each event a unique GUID as an identifier. The GUID stays associated with everything that happens to the event, so you can follow it as it moves from one area of an application to another.
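Here's a rough sketch of both ideas together: descriptive error messages and a GUID that travels with the event in every log line. The function names and the channel are hypothetical:

```python
import logging
import uuid

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s [event=%(event_id)s] %(message)s",
)
base_logger = logging.getLogger("notification-service")

def deliver_to_email_channel(payload):
    # Placeholder for the real channel integration.
    pass

def handle_event(payload):
    # Generate one GUID per event and attach it to every log line with a LoggerAdapter,
    # so the event can be followed as it moves from one part of the application to another.
    log = logging.LoggerAdapter(base_logger, {"event_id": str(uuid.uuid4())})

    log.info("Received notification request for user %s", payload["user_id"])
    try:
        deliver_to_email_channel(payload)
    except ConnectionError as err:
        # A descriptive message: what failed, for whom, and why, instead of "an unknown error occurred".
        log.error("Email channel delivery failed for user %s: %s", payload["user_id"], err)
        raise

handle_event({"user_id": "12345"})
```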
Monitoring:
Monitoring means setting up automated processes to watch the health of your application and the servers that run it. Good monitoring ensures that potential problems can be discovered and dealt with before they reach the end user. For example, if it becomes clear that a server is approaching maximum capacity, additional servers can be added to handle the load.
Things to monitor include:
- server response times
- load on the server
- server errors, such as 500-level response errors
- CPU usage
- memory usage
- disk space
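As a rough illustration, here's how a few of the metrics in this list could be collected in Python. It assumes the third-party psutil and requests packages are installed, and the health-check URL is made up:

```python
import time

import psutil    # third-party: CPU, memory, and disk statistics
import requests  # third-party: HTTP health checks

HEALTH_URL = "https://example.com/health"  # hypothetical health-check endpoint

def collect_metrics():
    """Take one snapshot of some of the metrics listed above."""
    start = time.monotonic()
    response = requests.get(HEALTH_URL, timeout=5)
    elapsed = time.monotonic() - start

    return {
        "response_seconds": round(elapsed, 3),
        "status_code": response.status_code,          # 500-level codes are server errors
        "cpu_percent": psutil.cpu_percent(interval=1),
        "memory_percent": psutil.virtual_memory().percent,
        "disk_percent": psutil.disk_usage("/").percent,
    }

if __name__ == "__main__":
    # A real monitoring system would collect these continuously and store them over time.
    print(collect_metrics())
```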
Alerting:
Alerting builds on monitoring: when a monitored value crosses a threshold or a check fails, the team is notified automatically so the problem can be addressed before end users are affected. Conditions that commonly trigger an alert include:
- CPU or memory usage goes above a certain threshold
- Disk space goes below a certain threshold
- The number of 500 errors goes above a certain level
- A health check fails twice in a row
- Response times are slower than expected
- Load is higher than normal
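Here's a simple sketch of alert rules built on the metrics snapshot from the monitoring example above; the thresholds and the notification mechanism are placeholders, not recommendations:

```python
# Builds on the metrics dictionary returned by collect_metrics() in the monitoring sketch.
THRESHOLDS = {
    "cpu_percent": 90,        # CPU usage above 90%
    "memory_percent": 90,     # memory usage above 90%
    "disk_percent": 85,       # disk usage above 85%
    "response_seconds": 2.0,  # responses slower than expected
}

def send_alert(message):
    # In a real system this might post to a chat channel or page whoever is on call.
    print(f"ALERT: {message}")

def evaluate(metrics, consecutive_failures=0):
    """Compare one metrics snapshot against the alert rules."""
    consecutive_failures = consecutive_failures + 1 if metrics["status_code"] >= 500 else 0

    # Alert only after two failed health checks in a row, to avoid noise from one-off blips.
    if consecutive_failures >= 2:
        send_alert(f"Health check failed {consecutive_failures} times in a row")

    for name, limit in THRESHOLDS.items():
        if metrics[name] > limit:
            send_alert(f"{name} is {metrics[name]}, above the threshold of {limit}")

    return consecutive_failures
```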
So why should testers care about any of this? Because good logging, monitoring, and alerting help us answer questions like these:
- How can we troubleshoot user issues?
- How do we know that we have enough servers to handle our application’s load?
- How will we know if our API is responding correctly?
- How will we know if a DDoS attack is being attempted on our application?
- How will we know if our end users are experiencing long wait times?
- How will we know if we are running out of disk space?
Comments:
So glad to see you encourage testers to get involved with logging, monitoring, and alerting. I would add to that: observability. We testers have good skills for spotting patterns in data and identifying risks; it's another way we can make valuable contributions to our team and product.
That is an excellent point, Lisa! Thanks for including it.
Monitoring is the new testing!