Improved access to quality data means better testing, and better software

Contributor: Leon Edmonds, CTO

Data accessibility and quality are among the most significant factors in test success, and they directly influence how long effective testing takes. It’s not hard to see why: if the data isn’t available, testing can’t happen. If your people can’t access the data they need, or if access is complicated, they test less and grow frustrated while trying to establish the preconditions for doing their jobs. And if testing is conducted utilising wrong, inappropriate, or just plain outdated data, it won’t be effective. It can’t be.

Curiously, despite how frequently these problems occur and how significant data is to testing success, they often go unsolved. And as software development becomes more agile, testing frequency ramps up, with the iterative cadence requiring a continual supply of fresh, easily accessed data for effective testing. The second part of the problem often causes, or at least influences, the first: if access is difficult, then once a data set is secured, the tester sticks with it.

Running the same tests on the same data simply doesn’t cut it. To keep pace with your CI/CD pipeline, you need ‘Just In Time’ (JIT) datasets, within easy reach, so testers can focus on testing rather than data extraction, transformation, and load activities.
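
As a concrete illustration, here is a minimal sketch of what JIT data provisioning can look like in a CI pipeline: a pytest fixture requests a freshly generated dataset for each run and disposes of it afterwards. The test-data service endpoint, the ‘checkout-regression’ profile, and the environment variables are hypothetical placeholders, not references to any real system.

    # A minimal sketch of Just In Time data provisioning in CI (hypothetical
    # service and endpoints; adapt to whatever test-data service you run).
    import os
    import uuid

    import pytest
    import requests

    TEST_DATA_SERVICE = os.environ.get("TEST_DATA_SERVICE", "https://testdata.example.com")

    @pytest.fixture(scope="session")
    def jit_dataset():
        """Request a freshly generated dataset for this pipeline run."""
        run_id = os.environ.get("CI_PIPELINE_ID", uuid.uuid4().hex)
        response = requests.post(
            f"{TEST_DATA_SERVICE}/datasets",
            json={"profile": "checkout-regression", "run_id": run_id},
            timeout=30,
        )
        response.raise_for_status()
        dataset = response.json()
        yield dataset
        # Dispose of the dataset so the next run starts clean.
        requests.delete(f"{TEST_DATA_SERVICE}/datasets/{dataset['id']}", timeout=30)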

Testers often encounter common data issues in their battle to access quality material for their work:

•  Data refreshes that render test data sets obsolete.

•  Encrypted data that is unusable, or significantly harder to work with.

•  Limited accessibility that hinders testers from extracting and interpreting data.

•  The Pesticide Paradox, a personal favourite: running the same tests against the same data eliminates only the same bugs, so issues a static data set can’t expose go undetected (see the sketch after this list).
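
One well-established way to soften the Pesticide Paradox is property-based testing, which generates varied inputs on every run. The sketch below uses the Hypothesis library; validate_order() is a hypothetical stand-in for the code under test.

    # A minimal property-based test with Hypothesis: each run generates
    # different inputs, so the test isn't limited to the bugs one static
    # record would surface. validate_order() is a hypothetical example.
    from hypothesis import given, strategies as st

    def validate_order(quantity: int, unit_price: float) -> bool:
        """Hypothetical system under test: reject nonsensical orders."""
        return quantity > 0 and unit_price >= 0.0

    @given(
        quantity=st.integers(min_value=-10, max_value=10_000),
        unit_price=st.floats(min_value=-1.0, max_value=1e6, allow_nan=False),
    )
    def test_order_validation(quantity, unit_price):
        # Mirror the expected rule so generated edge cases (zero, negatives)
        # are all checked, not just one hand-picked record.
        expected = quantity > 0 and unit_price >= 0.0
        assert validate_order(quantity, unit_price) == expected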

In addition to those issues, testers face challenges such as:

•  Perishable data: once a tester receives a data set, its relevance decays over time.

•  The need to reserve or ring-fence data, so one tester’s preconfigured test data isn’t accidentally consumed by another (a simple leasing sketch follows this list).

•  Incomplete or absent context for the test data, leaving testers unsure of exactly what they are testing.

•  The fact that one set of data may not suit testing across various environments and different types of tests.
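
To make the ring-fencing idea concrete, here is a simple leasing sketch: a dataset is reserved for one holder for a limited time, which also acknowledges that the data is perishable. All names here are illustrative, not part of any real tool.

    # Illustrative sketch only: reserve (ring-fence) a dataset for one
    # tester with a time-limited lease, so it can't be consumed by others
    # and stale reservations expire on their own.
    import time
    from dataclasses import dataclass, field

    @dataclass
    class DataLease:
        dataset_id: str
        holder: str
        expires_at: float

    @dataclass
    class LeaseRegistry:
        leases: dict = field(default_factory=dict)

        def reserve(self, dataset_id: str, holder: str, ttl_seconds: int = 3600) -> DataLease:
            current = self.leases.get(dataset_id)
            # Refuse the reservation while an unexpired lease is held.
            if current and current.expires_at > time.time():
                raise RuntimeError(f"{dataset_id} is already reserved by {current.holder}")
            lease = DataLease(dataset_id, holder, time.time() + ttl_seconds)
            self.leases[dataset_id] = lease
            return lease

    registry = LeaseRegistry()
    registry.reserve("customers-uat", holder="alice")   # succeeds
    # registry.reserve("customers-uat", holder="bob")   # would raise: still leased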

All these issues boil down to one necessity: a feed of accurate, up-to-date, and appropriate test data, available to the tester with absolute ease – again, so they can focus on the task of testing rather than performing the sort of data acrobatics more reasonably associated with business intelligence or data warehousing specialists.

Our answer to this problem is a data preparation solution we call Assurity Data. At a high level, it is a staged database holding an up-to-date copy of the environment being tested. The beauty of it is that, while a complete dataset, it is entirely separate from the application-under-test environment and so doesn’t interfere with the testing process. Through a user-friendly web interface, Assurity Data presents testers with a set of ‘cartridges’ suited to the various test cases and test types they need to run. It’s a bit like a big virtual VCR (remember those?): choose your cartridge, load it up, and conduct your tests. There’s no manual extraction, no data massaging, and no hassle. In the background, the cartridges are refreshed with randomised JIT data, so the tests take place on data that is real, current, and relevant.

It isn’t only human testers benefiting from Assurity Data. The solution supports automated testing, too, with a code-level interface providing data cartridges for automated and performance testing scenarios.
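
Purely as an illustration of the cartridge concept at code level (the class and method names below are assumptions made for this sketch, not Assurity Data’s actual API), an automated test might ask for a named cartridge and receive current, isolated data:

    # Hypothetical sketch of a cartridge-style interface for automated
    # tests; not the real Assurity Data API. A test requests a named
    # cartridge and receives isolated data for that run.
    import pytest

    class CartridgeClient:
        """Illustrative stand-in for a test-data cartridge service."""

        def load(self, name: str) -> dict:
            # A real service would return freshly randomised data here;
            # this stub returns a fixed record to keep the sketch runnable.
            return {"name": name, "customers": [{"id": 1, "status": "active"}]}

    @pytest.fixture
    def customer_cartridge():
        return CartridgeClient().load("active-customers")

    def test_customer_report(customer_cartridge):
        customers = customer_cartridge["customers"]
        assert all(c["status"] == "active" for c in customers)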

The result of Assurity Data is a substantial improvement in test performance, with efficiency gains of some 30% in time taken. Probably more important is that testing achieves its goal: more effective risk reduction through detecting and eliminating a wider spectrum of bugs and other deficiencies.

And that means better software quality, faster, and at a reduced cost. And it doesn’t hurt that, relieved of data access and quality hassles, testers using Assurity Data are happier, too.
