A Detailed Guide to Data-Driven Testing | Examples & FAQs

Data Driven Testing in Agile Industry

A product can speak for itself if it's engineered to support its customers' needs with the highest quality. Quality, therefore, is not the responsibility of one but of many! In an organization, this responsibility rarely belongs to a single team; it should be a collective and collaborative exercise. Most organizations find this hard without effective communication across teams, which is hard to come by. This problem directly or indirectly affects the way a product works. Agile aims to solve this!

Agile is a methodology that supports building software efficiently through iterative development. The philosophy allows frequent inspection to figure out what needs to change, and upholds the willingness to adapt to that change quickly! It is a process that encourages collaboration between teams to lay out requirements and solutions through a set of engineering best practices.

Agile these days adopts the policy of embracing change, promising frequent delivery of high-quality software, improved team performance and technical excellence. Agile also takes the business approach into consideration to properly align development with customer needs and company goals. These policies are not confined to software development; rather, they are a manifesto for the way our industry works today. Needless to say, this is accompanied by a lot of changes and challenges. One of the major challenges we still face today is maintaining quality with such frequent delivery.

“Quality in a service or product is not about what you put into it. It is what your client or customer gets out of it.” - Peter Drucker

Now that we have established how important ‘Quality’ is, we need to explore solutions that give us that level of quality. ‘Testing’, therefore, was and will remain one of the most important standalone branches of the software development process for assuring quality. This is one field that takes as much heat as development itself when deadlines get tighter.

Testing comes in many types and depends broadly on the kind of software product or application being built. However, one aspect remains the same - testing often and testing better. While this can be done in many automated ways, testing applications through data remains one of the most widely accepted choices.

This guide explains what we call Data-Driven Testing (DDT).

What Is Data-Driven Testing?

Imagine a scenario where a tester has to test a workflow in an application that has multiple entry points for inputs. The test engineer is likely to input one permutation of numbers and execute the application, only to realise that it might fail for a different permutation. To keep doing this manually over months and to maintain records of it is not just cumbersome and confusing, it’s sheer pain! For all that the software industry stands for, we have got to have better solutions for such problems. Don’t you agree?

If we could have all the usable test data stored or documented in a single place (a storage facility), it would save us a lot of time that would otherwise be spent creating different test cases using different types of test data. And if we were to find a way to build an automated process that uses the data in this file to run multiple times without any manual effort, we would have the perfect solution!

This is exactly what Data-Driven testing tries to achieve.

The concept is centered around separating the test data from the test script, and it cannot get simpler than that! First, a tester would ideally document all input values in a storage file (could be a .csv, .xls, .xml and so on) covering best case, worst case, positive and negative test scenarios; second, we would develop a test script that uses this data as variables (by substituting values in the scripted placeholders) and runs iteratively until all data sets have been used.

Data-Driven testing, therefore, describes an automated framework where testing is driven by a set of data kept in a storage facility (files, databases). This framework removes the lengthy and time-consuming process of creating individual tests for each data set.

DDT is a methodology where a sequence of test steps (structured in test scripts as reusable test logic) is automated to run repeatedly for different permutations of data (picked up from data sources) in order to compare actual and expected results for validation.

In this process, four important operations are observed:

  1. Collecting different sets of test data in a storage facility like a file or a database

  2. Creating scripts that are capable of reading this data and passing it through the required layers of the script, which then automatically trigger the simulation of other action items

  3. Storing the retrieved results (from point 2) and then reviewing them (should they contain errors) in terms of what was expected and what was actually obtained

  4. Continued testing with the next set of input data
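
The four operations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (a, b, expected), the add function standing in for the application, and the sample rows are all hypothetical, and the data is held in a string rather than a real file so the sketch stays self-contained.

```python
import csv
import io

# Hypothetical test data: two inputs and the expected sum per row.
# The last row is deliberately wrong to show a reported mismatch.
TEST_DATA = """a,b,expected
1,2,3
-1,1,0
10,5,16
"""

def add(a, b):
    """Stand-in for the application logic under test (an assumption)."""
    return a + b

def run_data_driven(data_file):
    results = []
    # Operation 1: the data lives outside the test logic, here in CSV form.
    for row in csv.DictReader(data_file):
        # Operation 2: read a row and pass its values into the test logic.
        actual = add(int(row["a"]), int(row["b"]))
        expected = int(row["expected"])
        # Operation 3: store the outcome and compare actual with expected.
        results.append({"inputs": (row["a"], row["b"]),
                        "expected": expected,
                        "actual": actual,
                        "passed": actual == expected})
        # Operation 4: the loop continues with the next data set.
    return results

results = run_data_driven(io.StringIO(TEST_DATA))
for r in results:
    print(r)
```

Adding a new test case here means adding a row to the data, not editing the script.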

Data-Driven Testing: Use Case Example

To explain what we just spoke about, consider the following examples.

Example 1: Login Panel

A basic Login Panel (post sign-up) consists of an email (could be a business email for business applications) and a password. Allowing only registered users is critical here, as the login feature exists for security reasons: its job is to block unauthorized users from accessing data within the platform/product. Some of the best cases one can think of for authenticating the email and password could be –

And to check these individual cases one would have to input different variables every time a script is run to test the authentication/login process. This is where a data-driven testing framework can be useful!

Writing these different variables in a storage file and calling these values individually in a functional test script is the most efficient way of approaching this problem.

A test script could look something like this -

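# Install Selenium once from the shell (not inside the script): pip install selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()

def site_login(password):
    # "URL" and "ID" are placeholders for the login page address and the field's id;
    # the password value is supplied from the test file
    driver.get("URL")
    driver.find_element(By.ID, "ID").send_keys(password)

# If you are using FB then,
def fb_login():
    # Wait until the page title contains "home"; the 10-second timeout is an assumed value
    WebDriverWait(driver, 10).until(EC.title_contains("home"))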

A test file could look something like this -

[Image: sample test file]

Please note that in this automated framework, test scripts that test the functional workflow are kept separate from the test values.
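
As a rough sketch of that separation, the snippet below keeps the login attempts in CSV form and the test logic in a function that accepts any login callable. Everything specific in it is hypothetical: the column names, the sample credentials, and fake_login, which stands in for a browser-driven login so the example runs without Selenium.

```python
import csv
import io

# Hypothetical contents of the test file: one login attempt per row.
LOGIN_DATA = """email,password,should_succeed
user@example.com,Correct#Pass1,yes
user@example.com,wrong-pass,no
unknown@example.com,Correct#Pass1,no
user@example.com,,no
"""

# Stand-in for the application: a single registered account (assumption).
REGISTERED = {"user@example.com": "Correct#Pass1"}

def fake_login(email, password):
    """Stub login; a real script would drive the browser here instead."""
    return password != "" and REGISTERED.get(email) == password

def run_login_tests(data_file, login):
    """Run every row through `login` and collect the rows that misbehave."""
    failures = []
    for row in csv.DictReader(data_file):
        expected = row["should_succeed"] == "yes"
        actual = login(row["email"], row["password"])
        if actual != expected:
            failures.append(row)
    return failures

failures = run_login_tests(io.StringIO(LOGIN_DATA), fake_login)
print("failed rows:", failures)
```

Because the login function is passed in, the same data and loop could be pointed at a real browser-driven login without touching the test file.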

Example 2: Testing an agreed upon format in a CSV upload

In many instances, we use the ‘upload a file’ feature in an application. One of the most important requirements in an ‘upload a file’ feature is to test for the format/template of the file that is being uploaded. Assuming that we use a .csv file to store data, then the test should ideally help validate the data-type in every cell of the csv file.

The data in the file is expected to be picked up by an application that processes it for a specific use case (such as, to power a graph!). Here, testing the content-type in every cell according to a certain type takes precedence.

Most importantly, the test should check the format of supported and unsupported files and report those that don’t meet requirements. Supported files should be accepted and uploaded without any data loss. For instance, if the expected upload is a text file, an image upload should not be allowed. Or if the upload file should have just two columns with specific headers, then a file with any other layout should be rejected.

Let’s say the requirement is to upload a file with two columns, each with a single header. The rest of the content in the file should not have any special characters within a cell.

Should the file format be correct, the application is triggered to read the right data in every cell to be processed further. This test has to run for all cells, across all rows and columns. This is where DDT can help!

A sample template could look something like -


All of the checks mentioned above can be tested using a data-driven automated framework.
Create a test script with placeholders for the data from the test file (the content that’s normally present in a file upload), then have the script called with the different values, and the test framework will run seamlessly!

With the examples covered, let’s look at how a DDT framework is structured to provide this coverage.
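
A minimal sketch of such a check follows, assuming the two-column requirement described above. The header names (name, value) and the exact rule for "no special characters" (letters, digits, space, dot, underscore, hyphen) are assumptions made for the example, as are the sample uploads in CASES.

```python
import csv
import io
import re

EXPECTED_HEADER = ["name", "value"]             # hypothetical header names
ALLOWED_CELL = re.compile(r"[A-Za-z0-9 ._-]+")  # assumed meaning of "no special characters"

def validate_upload(text):
    """Return a list of problems; an empty list means the file is accepted."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows or rows[0] != EXPECTED_HEADER:
        return ["header must be exactly: " + ",".join(EXPECTED_HEADER)]
    problems = []
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != 2:
            problems.append(f"line {line_no}: expected 2 columns, got {len(row)}")
            continue
        for cell in row:
            if not ALLOWED_CELL.fullmatch(cell):
                problems.append(f"line {line_no}: invalid cell {cell!r}")
    return problems

# Data-driven part: each case is (file content, should the upload be accepted?).
CASES = [
    ("name,value\nalpha,10\nbeta,20\n", True),
    ("name,value,extra\nalpha,10,x\n", False),
    ("name,value\nal$pha,10\n", False),
    ("name,value\nalpha\n", False),
]
for text, should_pass in CASES:
    accepted = validate_upload(text) == []
    print("as expected" if accepted == should_pass else "UNEXPECTED", repr(text))
```

The validator is written once; covering a new malformed-file scenario only means adding another entry to CASES.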

Architecture of DDT in an Automated Framework

A DDT framework is an automation testing framework that allows you to use a single test script to validate a test case against different types of test data. The test data corresponding to positive as well as negative testing is stored in a file, and the test script uses all these values as inputs to execute the tests. Therefore, the framework provides re-usable logic that improves test coverage.

Since the focus is on test data in an automated framework, this section explains how data is fed in, what to expect as output from the type of framework described above and, most importantly, how to structure this data.

Take a look at the flow-diagram below. This replicates a typical Data-Driven Testing framework.

[Figure: a typical Data-Driven Testing framework]

Data File and the Test Data

A Data File is the first point of input for any type of DDT framework. A typical Data File is loaded with test data covering scenarios such as positive test cases, negative test cases, exception-throwing and exception-handling test cases, min-max limit cases and, not to forget, the permutations of data that contribute to appropriate data coverage.

Now that we know about the test data and the Data File, note that the file counts as ‘ready’ only if the Driver Script can parse its data according to the requirements of the application being tested. These data sets can be stored in data storage files such as Excel, XML, JSON or YAML files, or even an HDP database.

Sometimes, it’s wise to use Hashes (key, value pairs), Arrays and Data Tables to structure large data sets that are to be tested. Testsigma’s flagship product enables all these capabilities in a single module. For more, refer to https://testsigma.com/ai-driven-test-automation

The Test File does not just contain input values; it may also contain the output or expected results that correspond to a successful run of the application under test. The idea of having the expected results in the same file is to make sure that the results obtained on testing an application are on par with the expected ones.

This is technically termed analyzing Actual Results vs Expected Results.

Driver Script

The Driver Script is the second most important piece in a DDT framework. It is like any other script that is run as an executable. The Driver Script, as the name indicates, is a piece of code that replicates the test case of an application. It reads data from the data files so that it can be used in the test scripts to execute the corresponding tests on the application. The Driver Script then outputs the results of the execution.

In other words, its structure contains placeholders for variables (test data) picked up from the Test File. The output that the script generates is compared with the ‘expected results’.

A driver script is often a simple piece of code written efficiently. A driver script in a DDT framework usually contains the application code and the implementation code, which work together.
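
The placeholder idea can be illustrated with Python's string.Template and a JSON data file. This is a sketch under assumptions: the field names (city, nights, expected), the step wording, and app_under_test are invented for the example and do not describe any particular tool.

```python
import json
from string import Template

# Hypothetical data file content in JSON form; field names are assumptions.
DATA_FILE = """[
  {"city": "Paris", "nights": 2, "expected": "Paris x2"},
  {"city": "Oslo",  "nights": 5, "expected": "Oslo x5"}
]"""

# A scripted step with placeholders that are filled from each data row.
STEP = Template("search hotel in $city for $nights nights")

def app_under_test(city, nights):
    """Stand-in for the real application call (an assumption)."""
    return f"{city} x{nights}"

def drive(data_text):
    """Driver: read rows, fill the placeholders, run, and compare."""
    outcomes = []
    for row in json.loads(data_text):
        step = STEP.substitute(city=row["city"], nights=row["nights"])
        actual = app_under_test(row["city"], row["nights"])
        outcomes.append((step, actual == row["expected"]))
    return outcomes

outcomes = drive(DATA_FILE)
for step, passed in outcomes:
    print(step, "->", "PASS" if passed else "FAIL")
```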

Overall, DDT revolves around how efficiently an application can be tested. Technically, it’s about how the test data works with the test script to bring out expected results.

Data-Driven Scripts are similar to application specific scripts (pyspark script, JavaScript) programmed to accommodate variable datasets.

Some of the must-have key features are –

This brings us to the last part of the architectural design, which is comparing results.

Expected vs Actual Results

DDT framework allows testers and stakeholders to understand the performance of the existing design in the application.

In other words, when a system gives out an output, it becomes necessary that the output needs validation. This validation is achieved by comparing the actual results with that of the expected ones. If there are any differences, the root cause is found and evaluated to be corrected. Further, these corrections should not affect the expected workflow of the product. For this, a valid feedback loop is enabled in the organization’s process which directs these corrections to the corresponding development teams.

Please note that, in this process, there could be new test cases that are to be added in order to validate an instance more thoroughly. In such cases, the Data File and the Driver Script are modified to fit requirements.
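
One simple way to express the comparison step is a field-by-field diff that reports only what differs, which is the information a feedback loop would pass on. The field names and values below are hypothetical.

```python
def diff_results(expected, actual):
    """Return {field: (expected_value, actual_value)} for every mismatch."""
    fields = sorted(set(expected) | set(actual))
    return {f: (expected.get(f), actual.get(f))
            for f in fields
            if expected.get(f) != actual.get(f)}

# Hypothetical expected result (from the data file) and actual result (from a run).
expected = {"status": "confirmed", "total": 240, "currency": "EUR"}
actual = {"status": "confirmed", "total": 250, "currency": "EUR"}

mismatches = diff_results(expected, actual)
print(mismatches if mismatches else "actual matches expected")
```

An empty dictionary means the run matched; anything else names the exact fields to investigate.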

With an efficient architecture as discussed above, a DDT framework offers many advantages. Some of them are listed as follows.

Advantages of Data-Driven testing

Data-Driven testing offers many advantages; some of them are:

Disadvantages of DDT

DDT has no jarring problems or cons as such. The cons listed below are more of limitations rather than disadvantages.

Best Practices of Data-Driven Testing

To make the Data-Driven testing as efficient as it can get, here is a checklist of best practices you might want to take a note of.

How does Data-Driven Testing work?

Complementary to the architecture explained in the previous section, DDT can comfortably be tagged as a test automation framework in an agile environment. DDT accommodates both positive and negative test cases in a single test flow.

Example Of DDT with its implementation structure -

Let us see an example for Data Driven Testing.

Consider the Login Page of a Hotel Reservation website. A pseudo workflow could be something as follows:

  1. A test data file is created as TestData_HotelReservation.csv (Comma Separated Values)

  2. This file contains inputs given to the driver script and expected results in a table

  3. The driver script for the above data file will be: data = open('TestData_HotelReservation.csv').read(); lines = data.splitlines()

  4. Steps performed for above driver scripts are as follows:

    • Read Value1

    • Read Value2

    • Read Operator

  5. Calculate the result using the operator on Value1 and Value2

  6. Finally, compare the expected result with the actual result
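
The workflow above could be sketched as follows. The rows are hypothetical stand-ins for the contents of TestData_HotelReservation.csv, which the article does not show, and the column names (Value1, Value2, Operator, Expected) and the set of supported operators are assumptions.

```python
import csv
import io
import operator

# Hypothetical rows standing in for TestData_HotelReservation.csv.
DATA = """Value1,Value2,Operator,Expected
2,3,+,5
10,4,-,6
6,7,*,42
9,3,/,3
"""

# Map the operator symbol found in the data file to a Python function.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def run(data_file):
    report = []
    for row in csv.DictReader(data_file):
        # Steps 4-5: read Value1, Value2 and Operator, then calculate.
        v1, v2 = float(row["Value1"]), float(row["Value2"])
        actual = OPERATORS[row["Operator"]](v1, v2)
        # Step 6: compare the expected result with the actual result.
        report.append(actual == float(row["Expected"]))
    return report

report = run(io.StringIO(DATA))
print(report)
```

With a real file, the same function could be called as run(open("TestData_HotelReservation.csv", newline="")).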

Data Driven Testing with Testsigma

Testsigma enables data driven testing for below storage types:

  1. Data tables created in Testsigma
  2. Excel file
  3. json file

The Excel and JSON files can be easily imported into Testsigma. A test case in Testsigma can then be configured to read data from these data files at the toggle of a button.

Types of Data Driven Testing

As you already know, DDT uses classes of iterative data to cover portions of an application. The method is straightforward: the test scripts (programmed in a scripting language) are executed for test data stored in a test file.

At the core of this is the way the scripting language supports the data file. This is where DDT varies slightly; i.e., based on the types of data files it uses.

Some of them are as follows -

Importance of Data Driven Testing

Once you get the hang of how a product application should work, you will understand the nuances of handling things in a way that covers all scenarios, both positive and negative.

A test engineer at this point will understand the kind of variables to add to make the data file holistic. With this in place, a script representing a test runs in a loop over all test data in the file. The outputs are compared with verifiable data, making the process a success in an automated environment.

Testsigma supports a data driven framework (with our flagship product) that allows users to navigate into different project tabs (modules), allowing them to document inputs for specific use cases. Additionally, the feature allows storage of outputs that are obtained at runtime such that there is scope to compare.

Some of the outstanding attributes of this offering are as follows :

If a codeless testing tool is used for this automation framework, the coding hassles can be forgotten. When such a tool also provides easy integration with data files, automated testing becomes as easy as creating a manual test. Testsigma is one such tool that offers these features and is recommended for implementing data driven testing.

Create a Data Driven Automation Framework

Let’s consider a rather interesting concept that leverages DDT to function better - Test Automation Framework.

A framework is basically a set of guidelines to be followed in order to obtain beneficial results.

In an automation environment, following a set of guidelines such as best coding practices, effective test-data handling, object and class repository treatment and so on leads to better results, less maintenance and increased re-usability. This is the optimum way of supporting an application with DDT.

There are different types of automated frameworks -

  1. Linear Scripting

  2. The Test Library Architecture Framework

  3. The Data-Driven Testing Framework

  4. The Keyword-Driven or Table-Driven Testing Framework

  5. The Hybrid Test Automation Framework

We have already discussed DDT Framework. Let’s discuss a bit of Keyword Driven Test Automation.

Keyword Driven Test Automation

Keyword Driven Test Automation is also known as Table Driven Test Automation or Action Word Based Testing.

The framework revolves around dividing the Test Case into four processes -

You can maintain all these categorizations on an excel sheet.

Let’s consider an example of an Online Shopping Store

What you will need in terms of process :

  1. To enable Opening a Browser

  2. Navigate to URL from a registered domain

  3. Access My Account button to change countries

  4. Enter Username and Password of standard specifics

  5. Enable LogIn and LogOut buttons

  6. Redirection to Homepage if loading a child page takes more than 5 seconds.

Requirements in terms of resources:

generic workflow of keyword driven testing

Some features of Keyword Driven Test Automation are -

The difference between Keyword Driven Testing and Data Driven Testing

Automated testing aims at covering large test scenarios. An automated testing framework can support both Keyword Driven Testing and Data Driven Testing; the two differ in approach but serve the same objective of testing an application efficiently.

Data Driven Testing refers to a storage file from which values are copied into a script that runs automatically several times on different cases; the actual output is then stored in the same file base to be compared with the expected output. The number of data rows in the storage file is therefore directly proportional to the number of test cases executed at run time. Ex: Booking and Reservation Functions

On the other hand, Keyword Driven Testing enables you to use a keyword to represent an action item. A sequence of keywords, therefore, drives a script. Further, you could use the same set of keywords to build a variety of test scripts. Ex: Real-time machines that operate on time-bound basis
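
To make the keyword idea concrete, here is a small sketch in which each table row names a keyword and a dispatcher maps it to an action. The keyword names, the step table, the example URL and FakeBrowser (which only records what it was asked to do) are all hypothetical; a real framework would call a browser automation library inside these actions.

```python
# Hypothetical keyword table, as it might be kept in a spreadsheet:
# each row is (keyword, target object, test data).
STEPS = [
    ("open_browser", "", ""),
    ("navigate", "", "https://shop.example.com"),
    ("enter_text", "username", "alice"),
    ("enter_text", "password", "secret"),
    ("click", "login_button", ""),
]

class FakeBrowser:
    """Records actions instead of driving a real browser (an assumption)."""
    def __init__(self):
        self.log = []
    def open_browser(self, target, data):
        self.log.append("browser opened")
    def navigate(self, target, data):
        self.log.append(f"went to {data}")
    def enter_text(self, target, data):
        self.log.append(f"typed into {target}")
    def click(self, target, data):
        self.log.append(f"clicked {target}")

def run_keywords(steps, browser):
    # The keyword-to-action mapping is the heart of the approach.
    actions = {
        "open_browser": browser.open_browser,
        "navigate": browser.navigate,
        "enter_text": browser.enter_text,
        "click": browser.click,
    }
    for keyword, target, data in steps:
        if keyword not in actions:
            raise ValueError(f"unknown keyword: {keyword}")
        actions[keyword](target, data)
    return browser.log

log = run_keywords(STEPS, FakeBrowser())
print(log)
```

The same four keywords can be rearranged into other step tables to build different test scripts, whereas in data driven testing the steps stay fixed and only the values change.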

At this point we have a clear understanding of what Data Driven Testing and Keyword Driven Testing mean in an automated environment. Let’s take a look at the ‘data’ part of it for a bit, with questions like what test data is, where it originates and so on. If you are already familiar with this piece, skip to the bottom of the page for conclusions and takeaways.

What is Test Data?

Test data is data that is meant exclusively for use in tests. It feeds a script with inputs, and the script produces outputs by measuring actual results from the application being developed.

Test data is selected to cover best case scenarios, worst case scenarios, exceptions and anything in between. The variables or containers (the placeholders in the script where these values are substituted) could hold numbers, names or a combination of both. They can also be correlated.

These data sets can be created by the tester, a program or a function, or could be computer-generated. They carry characteristics such as reusability, redundancy and so on.
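
As an illustration of data created by a program, the sketch below generates rows from boundary values and permutations. The age range, the plan names and the expected_valid rule are hypothetical; the point is only that best case, worst case and in-between rows can be produced systematically instead of typed by hand.

```python
from itertools import product

def boundary_values(minimum, maximum):
    """Values just outside, on, and just inside an inclusive integer range."""
    return [minimum - 1, minimum, minimum + 1,
            maximum - 1, maximum, maximum + 1]

def generate_rows(age_range, plans):
    """Combine every boundary age with every plan and label the expectation."""
    low, high = age_range
    rows = []
    for age, plan in product(boundary_values(low, high), plans):
        rows.append({"age": age,
                     "plan": plan,
                     "expected_valid": low <= age <= high})
    return rows

# Hypothetical rule: ages 18 to 65 inclusive are valid for both plans.
rows = generate_rows((18, 65), ["basic", "premium"])
print(len(rows), "rows generated")
print(rows[0])
```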

How does test data generation work?

Now that we have an understanding of what test data is, we can agree that test data is a critical piece of programming or testing an application.

Test data cannot be random all the time, and therefore, there are places to look for data. Data can come from past operations that were collected and archived. You can borrow archived data anytime you want. Other ways to gather data are -

With that clarified, an obvious question remains: what if one is not technically sound? How does one validate such test data and therefore test cases? This is where BDT comes into play. In other words, Behavior Driven Testing supports management staff with limited technical knowledge.

What is Behavior Driven Testing?

Testing need not be a technical function all the time. In complex systems, there are applications that require logical test data such as dates in terms of duration, discounts and other mathematical calculations, temperature measurements and so on.

Such kinds of data are most often required in real-time operational systems, and areas related to logistics and inventory management. Please note that although we are talking about Software systems/applications, they ultimately serve an industry purpose as exemplified.

The skill set here leans more towards logical, rational and research-based expertise than tech skills. These kinds of tasks are often handed to management personnel or product owners. Needless to say, the test strategy also changes with the changing audience. This is where Behavior Driven Testing comes into play!

High level prognosis of BDT in today’s world

Behavior Driven Testing is most often focused on the behavior of users for an application rather than the technical functions of a software. This test takes into consideration business objectives.

The test cases are written in a more natural language which is verbose and easily understood by anyone. This encourages communication across teams with ease.

BDT is designed to establish an agreed-upon expectation even though Stakeholders and Delivery Teams could have different perspectives. It starts as a business goal and translates into features and user stories. These are then approved by the non-tech stakeholders before reaching development.

The objective is to create a business-readable, domain-specific language that lets you describe a system’s behavior without having to be involved in how the behavior is implemented.
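
A toy illustration of such a business-readable language is shown below. It uses the Given/When/Then phrasing that is common in behavior-driven tools, but the tiny step registry here is a hand-written sketch, not the API of any real BDT library, and the room-pricing scenario and step wordings are invented for the example.

```python
import re

STEPS = []  # (compiled pattern, function) pairs registered below

def step(pattern):
    """Decorator that links a plain-language sentence to a Python function."""
    def register(fn):
        STEPS.append((re.compile(pattern), fn))
        return fn
    return register

@step(r"a room costs (\d+) per night")
def set_price(ctx, price):
    ctx["price"] = int(price)

@step(r"the guest books (\d+) nights")
def book(ctx, nights):
    ctx["total"] = ctx["price"] * int(nights)

@step(r"the total should be (\d+)")
def check_total(ctx, total):
    assert ctx["total"] == int(total), f"expected {total}, got {ctx['total']}"

def run_scenario(text):
    ctx = {}
    for line in text.strip().splitlines():
        # Drop the leading Given/When/Then/And keyword.
        sentence = re.sub(r"^\s*(Given|When|Then|And)\s+", "", line).strip()
        for pattern, fn in STEPS:
            match = pattern.fullmatch(sentence)
            if match:
                fn(ctx, *match.groups())
                break
        else:
            raise LookupError(f"no step matches: {line.strip()}")
    return ctx

SCENARIO = """
Given a room costs 120 per night
When the guest books 3 nights
Then the total should be 360
"""
ctx = run_scenario(SCENARIO)
print(ctx)
```

The scenario text is what a non-technical stakeholder reads and approves; the step functions are where the delivery team decides how the behavior is checked.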


Agile methodology promises to deliver faster and better! This is only possible if a quality check happens in parallel with the development process. Testing therefore becomes critical in delivering applications without compromising on quality.

No one solution fits all! - As honest as it sounds, it is quite an expected factor in the ever-changing ecosystem of software. Something that works for one may not work for another. Ultimately, it is left to the organization's discretion to choose the type of test methodology that would serve them well in the long run.

Testsigma leverages this idea to help businesses perform at their maximum efficiency. Through our testing products, we enhance business intelligence by reducing risks, providing easy access to development and tests, and sharing information/feedback based on real-time analysis.

The end purpose of any testing process is to speed up decision making which is formulated by rapid automation testing as explained throughout. Needless to say, “Speed is Power!” When you can run, why crawl?


What are the types of data driven testing?

DDT varies slightly with the type of data storage file that is used. Based on this, we have

  • CSV Files (Comma Separated Values)

  • Excel Sheets

  • Script Arrays

  • Database Tables

  • Table Variables
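
The sketch below shows three of these sources (a CSV file, a script array and a database table) feeding the same rows to a test. The column names and sample credentials are hypothetical, the CSV is held in a string, and an in-memory SQLite database stands in for a real database table.

```python
import csv
import io
import sqlite3

SAMPLE = [("alice", "pw1"), ("bob", "pw2")]  # hypothetical credentials

def from_csv(text):
    """CSV source: parse comma separated values into (username, password)."""
    reader = csv.DictReader(io.StringIO(text))
    return [(r["username"], r["password"]) for r in reader]

def from_array():
    """Script array source: the data is written directly in the script."""
    return list(SAMPLE)

def from_database():
    """Database table source, using an in-memory SQLite table as a stand-in."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logins (username TEXT, password TEXT)")
    conn.executemany("INSERT INTO logins VALUES (?, ?)", SAMPLE)
    rows = conn.execute(
        "SELECT username, password FROM logins ORDER BY rowid").fetchall()
    conn.close()
    return rows

csv_rows = from_csv("username,password\nalice,pw1\nbob,pw2\n")
array_rows = from_array()
db_rows = from_database()
print(csv_rows == array_rows == db_rows)
```

Whatever the storage type, the test logic only needs rows in a common shape, so the source can be swapped without rewriting the test.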

What are the features to look for in the new/codeless/advanced data driven automation framework tools?

  • Clone a module - so that testers/authors can re-use repetitive steps

  • Automatic reporting - an auto generated report with details about the test module, occurrence, screenshots or provision to add video links.

  • Writing code within the tool - to help cover corner cases and for easy integration

  • Browser Support - to test an application’s functionalities on other browsers.

  • Adding Assertions - to make the UI of the automation tool more intuitive thereby eliminating manual effort.

What are some effective tips for test data management?

Test data is transactional information that is generated in considerable volume. Maintaining it effectively helps reduce the lifecycle time of an application. Data consistency, privacy, sub-setting, and validity are some of the known challenges. Here’s how you can manage them better -

  • Plan the test data on par with the test coverage. This saves time and you can use the data to test a module immediately.

  • Mask data that is confidential. Here, you may have to write code to encrypt it.

  • Refresh your data source regularly. Delete or replace what’s irrelevant and has changed. Create alternatives to existing ones in case you need extensive coverage.
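
As one possible take on the masking tip, the sketch below replaces the personal part of an email address with a salted SHA-256 digest. Note that this is one-way hashing (pseudonymization) rather than reversible encryption, and that the salt value, the user_ prefix, the field names and the sample record are assumptions made for the example.

```python
import hashlib

def mask_email(email, salt="demo-salt"):
    """Replace the local part of an email with a short salted hash.

    The same input always yields the same masked value, so relationships
    between records survive, while the original name is not exposed.
    """
    local, _, domain = email.partition("@")
    digest = hashlib.sha256((salt + local).encode("utf-8")).hexdigest()[:10]
    return f"user_{digest}@{domain}"

def mask_rows(rows, field="email"):
    """Return copies of the rows with the given field masked."""
    return [{**row, field: mask_email(row[field])} for row in rows]

# Hypothetical production-like record that should not appear in test data as-is.
original = [{"email": "jane.doe@example.com", "plan": "premium"}]
masked = mask_rows(original)
print(masked)
```

In practice the salt would be kept secret and out of the test repository; it is hard-coded here only to keep the sketch short.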