Friday, December 31, 2010

Database testing example

A vast majority of business applications store and retrieve data. Many applications use databases for this purpose, owing to a variety of reasons such as a standard and methodical way of storing information, performance and security. In addition to testing the application software, we should also be testing the database. You can avoid a number of bugs if the database schema used by your application is created correctly. What is more, a correct database schema prevents having to implement a complex logic in your application to handle the different scenarios that can lead to data problems, and consequently application bugs affecting its users. First, view my video, Database testing and SQL tutorial for beginners.

Now, work with the example below.


Now let us look at a simple database that may be used by a school to store its student information. Since the number of rows and columns is small, we can inspect the data and look for problems manually. Of course, when testing tables with a large number of rows, you need to design and execute SQL queries, but let us not worry about that right now. The intention is to look for database issues, both existing and potential.

Here are the problems with this database:

1. Missing tables
One can see that the Grades (Classes) exist just as a column in the Students table. Grades are really a different entity. This school may have a Grade 3 even if there were no students enrolled in it yet. And this is not apparent from the current data. There should be another table for Grades and each row in the Students table should point to the corresponding Grade via a foreign key.

2. Missing or incorrectly named columns
The Fees table has an ID column. But what ID is this? Is it the Fees ID or the Student ID? If it is the Fees ID, then there is no linkage between the Students and Fees table. If it is the Student ID, it should be named as such to avoid confusion for the developers.

3. ID columns
The Students ID column is has a text datatype. This means that every query reading this column will have to parse the value to rid the leading zero or zeroes. A more severe problem is that text based indexes are slower than numeric indexes leading to slower performance.

4. Dates
One cannot make out the date format from the Date of Joining column values. It could be mm/dd/yyyy or dd/mm/yyyy. This means that every query on the Date of Joining column has to know the date format before it can parse and process the value.

These are not the only problems with the data presented. There are more. Can you spot them?

I hope that this analysis would have piqued your interest in database testing and you would show this interest in your own software testing.