3 ways to use data, analytics, and machine learning in test automation

Just 10 years ago, most application development testing strategies focused on unit testing for validating business logic, manual test cases to certify user experiences, and separate load testing scripts to confirm performance and scalability. The development and release of features were relatively slow compared to today’s development capabilities built on cloud infrastructure, microservice architectures, continuous integration and continuous delivery (CI/CD) automations, and continuous testing capabilities.

Furthermore, many applications are developed today by configuring software as a service (SaaS) or building low-code and no-code applications that also require testing the underlying business flows and processes.

Agile development teams in devops organizations aim to reduce feature cycle time, increase delivery frequencies, and ensure high-quality user experiences. The question is, how can they reduce risks and shift-left testing without creating new testing complexities, deployment bottlenecks, security gaps, or significant cost increases?

Esko Hannula, product line manager at Copado, spoke to me about the recent acquisition of Qentinel and the testing challenges facing devops organizations. He believes machine learning is key to handling increasing test volumes. “The quality of digital business is the quality of the code and testing that runs it. The more code there is to test, the more important it gets to marry machine learning with test automation. QA people and machine intelligence can support each other in making wise decisions based on data rather than a mere gut feeling.”

I recently wrote about using service virtualization to develop more robust web service tests when building microservices or interfacing with many third-party APIs. I then looked a step further and researched testing capabilities based on data, analytics, and machine learning that development teams and QA test automation engineers can leverage to develop and support more robust testing.

These capabilities are emerging, with some testing platforms offering robust functionality today while others are in early adopter phases. Development teams should research and plan for these testing functions as they will all become mainstream capabilities.

Generating tests using natural language processing

Test quality has improved significantly during the last decade as QA platforms analyze a webpage’s document object model (DOM), leverage computer vision to detect user interface changes, and utilize optical character recognition to extract text elements. But developing tests often requires test engineers to click through user interfaces manually, input data in forms, and navigate workflows while QA platforms record the test case.

An emerging approach is to use natural language processing (NLP) to document test cases. Sauce Labs recently acquired AutonomIQ, a tool that lets users describe testing steps in natural language and then automatically creates the test cases from those descriptions.

John Kelly, CTO of Sauce Labs, describes why this capability is important as more organizations develop customer relationship management customization, business process management workflows, and low-code applications. He describes the experience from a business perspective: “I have internal business processes that subject matter experts can describe in natural language, which NLP machine learning can then convert to test cases that can run as often as desired. I can then demonstrate to outside auditors that controls are followed properly. So, a codeless approach to creating test cases is an emerging way to document and validate business processes.”
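To make the idea concrete, here is a minimal sketch of turning plain-English steps into structured test steps. It is not how AutonomIQ or any other product works: it swaps the NLP model for a handful of hand-written regular expressions, and the phrase patterns, the `Step` type, and the action names are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Step:
    action: str
    target: str
    value: str = ""


# Hypothetical phrase patterns. A real NLP-based tool would rely on a trained
# language model rather than fixed regular expressions like these.
PATTERNS = [
    (re.compile(r'^open (?P<target>\S+)$', re.I), "open"),
    (re.compile(r'^enter "(?P<value>[^"]*)" in(?:to)? the (?P<target>.+?) field$', re.I), "type"),
    (re.compile(r'^click (?:the )?(?P<target>.+?) button$', re.I), "click"),
    (re.compile(r'^verify (?:that )?the page shows "(?P<target>[^"]*)"$', re.I), "assert_text"),
]


def parse_step(sentence):
    """Convert one English sentence into a Step, or raise if no pattern matches."""
    text = sentence.strip().rstrip(".")
    for pattern, action in PATTERNS:
        match = pattern.match(text)
        if match:
            groups = match.groupdict()
            return Step(action, groups["target"], groups.get("value", ""))
    raise ValueError(f"Unrecognized step: {sentence!r}")


def parse_script(description):
    """Parse a multi-line description, one step per non-empty line."""
    return [parse_step(line) for line in description.splitlines() if line.strip()]


if __name__ == "__main__":
    script = """
    Open https://example.test/login.
    Enter "alice" in the username field.
    Click the Sign in button.
    Verify the page shows "Welcome".
    """
    for step in parse_script(script):
        print(step)
```

The resulting list of steps could be handed to whatever browser automation layer a team already uses. Rejecting unrecognized sentences instead of guessing keeps the subject matter expert in the loop when a description is ambiguous.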

Expanding tests with synthetic test data generation

Once QA engineers capture test cases, the next task is to generate sufficient test data to validate the underlying business rules and boundary conditions. Test data generation can be particularly challenging for open-ended experiences like search engines, complicated multifield forms, document uploads, and testing with personally identifiable information or other sensitive data.

Tools from Curiosity Software, Datprof, Delphix, GenRocket, Torana (iCEDQ), K2View, and others provide test data automation capabilities for different applications and data flows, including functional testing, API testing, dataops, data lakes, and business intelligence.
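For a sense of what generating data around boundary conditions means in practice, here is a small sketch that is not tied to any of the tools above. The age rule of 18 to 120 and the function names are hypothetical; the seeded name generator stands in for replacing personally identifiable information with synthetic values.

```python
import random
import string


def boundary_ints(minimum, maximum):
    """Return values at, just inside, and just outside an inclusive integer range."""
    candidates = [minimum - 1, minimum, minimum + 1, maximum - 1, maximum, maximum + 1]
    # dict.fromkeys drops duplicates while preserving order (matters for tiny ranges).
    return list(dict.fromkeys(candidates))


def age_cases(minimum=18, maximum=120):
    """Pair each boundary value with the result a validator should produce."""
    return [
        {"age": value, "expect_valid": minimum <= value <= maximum}
        for value in boundary_ints(minimum, maximum)
    ]


def synthetic_names(count, seed=0):
    """Generate repeatable fake names so tests never need real personal data."""
    rng = random.Random(seed)
    names = []
    for _ in range(count):
        length = rng.randint(3, 10)
        letters = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        names.append(letters.capitalize())
    return names


if __name__ == "__main__":
    for case in age_cases():
        print(case)
    print(synthetic_names(3, seed=42))
```

Seeding the random generator makes a failing test reproducible, which matters more than realism for most functional tests.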

Optimizing continuous testing practices

Several platforms are looking to help agile development teams and QA automation engineers optimize their testing practices.

Failure analysis helps development teams research the root causes when tests fail. Kelly describes the challenge: “You have a thousand Selenium tests, run them all, and get 300 failures. The team doesn’t know if it’s a broken API or something else and whether the problem will happen in production, knowing the test environment doesn’t fully reflect it. They’re interested in the root causes of test failures. Our models cohort the failed tests and report which tests are related to the same problem.”
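The cohorting idea can be approximated without machine learning. The sketch below is not the Sauce Labs model: it groups failures by a normalized error message, a crude heuristic that ignores numbers and memory addresses. The function names and sample messages are invented for illustration.

```python
import re
from collections import defaultdict


def signature(message):
    """Normalize an error message so variable details do not split a cohort."""
    sig = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)  # memory addresses
    sig = re.sub(r"\d+", "<n>", sig)                    # timings, ids, status codes
    sig = re.sub(r"\s+", " ", sig)                      # collapse whitespace
    return sig.strip().lower()


def cohort_failures(failures):
    """Group (test_name, error_message) pairs by signature, largest cohort first."""
    groups = defaultdict(list)
    for test_name, message in failures:
        groups[signature(message)].append(test_name)
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)


if __name__ == "__main__":
    failed = [
        ("test_login", "HTTP 503 from /api/auth after 3012 ms"),
        ("test_checkout", "HTTP 503 from /api/auth after 2988 ms"),
        ("test_profile", "Element #save not found"),
    ]
    for sig, tests in cohort_failures(failed):
        print(f"{len(tests)} test(s): {sig} -> {tests}")
```

Because every number is masked, this heuristic would also merge a 500 and a 503 response into one cohort. A learned model can weigh such details; the sketch only shows why grouping turns 300 failures into a short list of distinct problems.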

Another challenge is optimizing the test suite and determining which tests to run based on a release’s code changes. Testing teams can heuristically design a “smoke test,” a regression test around the essential app functionalities and flows. But for devops teams implementing continuous testing, there’s an opportunity to connect the data between tests, code changes, and production systems and apply machine learning to choose which tests to run. Optimizing the tests in a build is a much-needed capability for dev teams that release code frequently on mission-critical applications.

One solution targeting this challenge is YourBase, which creates a dependency graph that maps test cases to their code paths. When developers change the code, the tool uses the dependency graph to determine which test cases need to run. Yves Junqueira, CEO of YourBase, told me, “We see companies that have tens or even hundreds of thousands of tests. They want to improve their lead time to get code to production and improve developer productivity. These teams must make smart decisions about which tests are really necessary for their changes and want a better understanding of test failures.”
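The general technique of selecting tests from a dependency graph can be sketched in a few lines. This is an illustration of the concept, not YourBase's implementation: the graph here is a hand-written dictionary of file-level imports, whereas a real tool would build it by tracing or static analysis. All file names are hypothetical.

```python
from collections import defaultdict, deque


def affected_tests(depends_on, tests, changed):
    """Return the tests that depend, directly or transitively, on a changed file.

    depends_on maps each file to the set of files it imports.
    """
    # Invert the graph: for each file, which files depend on it directly?
    dependents = defaultdict(set)
    for node, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(node)

    # Breadth-first walk from the changed files up through their dependents.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)

    return sorted(test for test in tests if test in seen)


if __name__ == "__main__":
    graph = {
        "test_cart.py": {"cart.py"},
        "test_auth.py": {"auth.py"},
        "cart.py": {"pricing.py"},
        "auth.py": set(),
    }
    all_tests = {"test_cart.py", "test_auth.py"}
    print(affected_tests(graph, all_tests, ["pricing.py"]))
```

A change to `pricing.py` selects only `test_cart.py`, because the cart module imports it and the auth tests never touch it. The savings grow with the size of the suite, which is the point Junqueira makes about teams with hundreds of thousands of tests.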

A third approach operates outside the testing environment and helps device engineers and software developers trace production errors, exceptions, and critical events. Backtrace provides this capability. Development teams use its aggregate error reporting and deduplication analytics to quickly find and resolve issues in gaming, mobile, or other embedded applications.

The key for devops organizations is recognizing that driving frequent releases on more mission-critical applications requires a parallel effort to increase the automation, robustness, and intelligence in testing. AIops platforms help IT service management teams support microservices and complex application dependencies by centralizing operational data and enabling machine learning capabilities. In a similar manner, QA platforms aim to provide agile development teams with automation, analytics, NLP, and machine learning capabilities to improve testing.
