Testing the Data Warehouse

Learn How To Improve the Quality of Your Data Warehouse Through Testing

David Haertzen David Haertzen, Principal Enterprise Architect

Testing the data warehouse and business intelligence system is critical to success.  Without testing, the data warehouse could produce incorrect answers and quickly lose the faith of the business intelligence users. Effective testing requires putting together the right processes, people and technology and deploying them in productive ways.

Data Warehouse Testing Responsibilities

Who should be involved with testing?  The right team is essential to success:

  • Business Analysts gather and document requirements
  • QA Testers develop and execute test plans and test scripts
  • Infrastructure people set up test environments
  • Developers perform unit tests of their deliverables
  • DBAs test for performance and stress
  • Business Users perform functional tests including User Acceptance Tests (UAT)

Business Requirements and Testing

When should your project begin to think about testing?  The answer is simple - at the beginning of the project.  Successful testing begins with the gathering and documentation of requirements.  Without requirements there is no measure of system correctness.

Expect to produce a Requirements Traceability Matrix (RTM) that cross references data warehouse and business intelligence features to business requirements.  The RTM is a primary input to the Test Plan.

Data Warehousing Test Plan

The Test Plan, typically prepared by the QA Testers, describes the tests that must be performed to validate the data warehousing and business intelligence system.  It describes the types of tests and the coverage of required system features.

Test Cases are details that enable implementation of the Test Plan.  The Test Case itemizes steps that must be taken to test the system along with expect results.  A Text Execution Log tracks each test along with the results (pass or fail) of each test item.

Testing Environments and Infrastructure

Multiple environments must typically be created and maintained to support the system during its lifecycle:

  • Development
  • QA
  • Staging / Performance
  • Production

These kinds of tools can facilitate testing and problem correction:

  • Automated test tool
  • Test data generator
  • Test data masker
  • Defect manager
  • Automated test scripts

Unit Testing for the Data Warehouse

Developers perform tests on their deliverables during and after their development process.  The unit test is performed on individual components and is based on the developer's knowledge of what should be developed.

Unit testing should definitely be performed before deliverables are turned over to QA by developers.  Tested components are likely to have fewer bugs.

QA Testers Perform Many Types of Tests

QA Testers design and execute a number of tests:

Integration Test

Test the systems operation from beginning to end, focusing on how data flows through the system.  This is sometimes called "system testing" or "end-to-end testing".

Regression Test Validate that the system continues to function correctly after being changed.  Avoid "breaking" the system.

Can the Data Warehouse Perform?

Tests can be designed and executed that show how well the system performs with heavy loads of data:

Extract Performance Test

Test the performance of the system when extracting a large amount of data.

Transform and Load Performance Test

Test the performance of the system when transforming and loading a large amount of data.  Testing with a high volume is sometimes called a "stress test".

Analytics Performance Test Test the performance of the system when manipulating the data through calculations. 

Business Users Test Business Intelligence

Does the system produce the results desired by business users?  The main concern is functionality, so business users perform functional tests to make sure that the system meets business requirements.  The testing is performed through the user interface (UI) which includes data exploration and reporting.

Correctness Test

The system must be produce correct results.  The measures and supporting context need to match numbers in other systems and be calculated correctly.

Usability Test The system should be as easy to use as possible.  It involves a controlled experiment about how business users can use the business intelligence system to reach stated goals.
Performance Test

The system must be able to return results quickly without bogging down other resources.

Business Intelligence Must Be Believed

Quality must be baked into the data warehouse or users will quickly lose faith in the business intelligence produced.  It then becomes very difficult to get people back on board.

Putting the quality in requires both the testing described in this article and data quality at the source described in the article, Data Sources for Data Warehousing, to launch a successful data warehousing / business intelligence effort.


Copyright© 1999-2015, First Place Software, Inc.