Increasing the Reliability of Automated Tests

In software testing, test automation is the use of software or tools, separate from the software being tested, to control the execution of tests and to compare actual results with expected results. With these tools we can automate repetitive but necessary tasks and tests, and we can also run tests that are too complex to perform manually. Test automation is critical for continuous delivery and continuous testing in the Agile/DevOps world.

Automated tests can be unreliable, or flaky, for a variety of reasons. Ensuring their reliability is important, especially when the tests run in continuous integration and testing cycles, so that the entire development process is robust. We have all encountered automated tests whose results are unpredictable, especially GUI tests. Common symptoms include the following:
  • Tests run as expected in isolation but fail in a batch run where they are part of a suite of tests
  • Tests fail due to the state of the GUI element; however, inspection shows that the element is present
  • Tests fail due to causes that cannot be determined; on re-run, the tests pass
  • Many tests fail due to the same cause
  • Automated tests take a long time to execute, more than manual execution of the same set of tests
The unstable behavior of automated tests is very frustrating to their authors. Quite often they ignore these issues, which decreases the value that test automation brings in increasing test coverage and mitigating product risks. However, analyzing the root causes of these common symptoms reveals patterns that help stabilize the tests to a large extent and reduce false failures.

Not isolating the test and its associated data properly: This is one major reason why tests fail in a batch run. Does my test consume data from another test? Does it need unique data during every execution? If so, how can I make sure that the data is refreshed automatically during each execution? Is my suite sharing test data with others (developers, other testers)? How do I prevent concurrent usage that can leave my tests in an unknown state? Strategies that address these questions are important for reliable test results.

Not understanding and accommodating the prerequisites and postconditions properly: Very often, test data is not initialized properly. For example, when the current test needs data from the execution of another test, that data must be initialized properly in the test data repository, database, or file. Data that depends on system dates or configuration data must be set up automatically. At the end of the tests, the postconditions must be implemented properly. This could involve cleaning up the test data, setting up data for other dependent tests, and so on.

Not paying enough attention to synchronization: This causes tests to fail sometimes and pass at other times. It is not always possible to predict how long a page load will take on each run. Load conditions on the server vary, network speed has an impact, and sometimes the page load reports success while the elements on the page are not yet in the right state.
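The data isolation and setup/cleanup points above can be sketched as follows. This is only a minimal illustration under stated assumptions, not a prescribed design: the `customers` table and the `unique_name` helper are hypothetical, and an in-memory SQLite database stands in for whatever test data store a real suite uses.

```python
import sqlite3
import unittest
import uuid


def unique_name(prefix: str) -> str:
    """Hypothetical helper: build a value that will not collide across tests or runs."""
    return f"{prefix}-{uuid.uuid4().hex[:8]}"


class CustomerDataTest(unittest.TestCase):
    def setUp(self):
        # Precondition: every test gets its own private database, so no data
        # leaks in from other tests or from people sharing the environment.
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE customers (name TEXT UNIQUE)")
        self.name = unique_name("customer")
        self.db.execute("INSERT INTO customers VALUES (?)", (self.name,))

    def tearDown(self):
        # Postcondition: discard the data so the next test starts from a known state.
        self.db.close()

    def test_customer_exists(self):
        row = self.db.execute(
            "SELECT COUNT(*) FROM customers WHERE name = ?", (self.name,)
        ).fetchone()
        self.assertEqual(row[0], 1)

    def test_only_own_data_is_present(self):
        # Passes in any execution order because nothing is shared between tests.
        row = self.db.execute("SELECT COUNT(*) FROM customers").fetchone()
        self.assertEqual(row[0], 1)
```

Saved in a file, the tests can be run with `python -m unittest`. Because each test creates and discards its own data, the results do not depend on execution order or on what other tests left behind.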
The solution to these synchronization problems is to introduce waits that poll for a certain condition. For example, we could set flags on the elements when an HTTP request completes, and the automation code can check for those flags. One could also poll for known error conditions. Every wait needs a timeout, so that a condition that never becomes true fails the test promptly instead of stalling the run.

Many tests fail due to the same cause: These failures are usually caused by transient conditions. For example, the URL is not accessible, the database is not in a known state, services required by the application are unavailable, or an element locator (XPath/CSS) has changed in the application. A small set of pipe-cleaning tests can be run before the actual test run to ensure that the environment is ready for execution.

When the test results show false failures, it is important to perform Root Cause Analysis (RCA) every time to ascertain why the tests are unstable. Most often, the cause is not the test framework but the way we organize the execution. Often, the solution to these issues also needs support from the developers. After all, quality is the responsibility of everyone on the team!
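As a rough sketch of the two techniques described above (a polling wait bounded by a timeout, and a small set of environment checks run before the suite), the following uses only the Python standard library. The names `wait_until` and `failed_preflight_checks` are hypothetical helpers invented for this example; GUI automation tools typically ship their own wait utilities, which should be preferred where available.

```python
import time
from typing import Callable, Dict, List


def wait_until(condition: Callable[[], bool], timeout: float = 10.0,
               interval: float = 0.2) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False  # give up so the run is not stalled indefinitely
        time.sleep(interval)


def failed_preflight_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run each named environment check and return the names of those that failed."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a check that raises (e.g. service unreachable) counts as failed
        if not ok:
            failed.append(name)
    return failed
```

A test would call something like `wait_until(lambda: element_is_ready(), timeout=5)` (where `element_is_ready` is whatever readiness check the application exposes) instead of sleeping for a fixed time. A non-empty result from `failed_preflight_checks` is a signal to skip the run and fix the environment, rather than collect many failures with the same cause.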