Saturday, October 29, 2011

Database Normalization: What to test for First Normal Form?

Designing a relational database to minimize data redundancy (and therefore maximize data integrity) is called normalization. View my video on Normalization and First Normal Form. Then read on...

The concept of data normalization was introduced by Edgar Codd in the years after he invented the relational model of storing data. There are various degrees of normalization: 1NF (First Normal Form), 2NF, 3NF and so on. Each degree of normalization is stricter than the previous one, e.g. if a table is in 3NF then it is automatically in 1NF and 2NF. In this article, I will explain the First Normal Form and what to test for it. Articles on testing the other normal forms will follow.
Years ago, when I used to work as a database developer, I remembered 1NF as each table having only relevant columns, a primary key and no repeating groups. It is still useful to remember 1NF like this because most definitions of 1NF agree on these points. Let us now see examples of a few tables that are not in 1NF.

a. Having columns of other entities
Customer (CustomerID, FirstName, LastName, Address, SupportContactName)
Here the Support Contact does not belong to the Customer entity. Such a design gives rise to data modification and querying problems. For example, if a Customer row is deleted, then the Support Contact data is also deleted. Also, it is not possible to store available Support Contacts that are not yet assigned to any customer.

b. Having no primary key
Employee (FirstName, LastName, JobTitle)
Here it is possible for two or more employees to have the same column values, resulting in duplicate rows. There are update/ delete problems with such a table design. Which row should be updated if only one of the employees is promoted? How can just one row be deleted if one of the employees leaves the company? Also, each query written on this table needs logic to handle duplicate rows.

c. Having a repeating group
Customer (CustomerID, FirstName, LastName, Address, PhoneNumber1, PhoneNumber2, PhoneNumber3)
With such a table, there are many problems. For example, many customers may have only one phone number so the space for the other two phone numbers is wasted. Each query on this table fetching phone number values needs to include three columns, impacting performance. This table cannot accommodate a customer who has four phone numbers.

Another example of a repeating group is the table
Customer (CustomerID, FirstName, LastName, Address, PhoneNumbers)
Here it is possible to store any number of phone numbers within the limit of the column width. If there is more than one phone number, the list of phone numbers is comma separated. One problem is that, with a fixed-width column, space for a potentially large list of phone numbers is reserved for every row and mostly wasted. Each query on this table fetching phone number(s) has to implement the logic to handle either one phone number or a list of phone numbers. Frankly, the PhoneNumbers column is not a single column but really a group of columns for the Customer table. Therefore, the table is not even relational.

In order to convert the Customer table to 1NF, it is necessary to design two tables:
Customer (CustomerID, FirstName, LastName, Address)
CustomerPhoneNumber (CustomerID, PhoneNumber)
Note that now a customer can have any number of phone numbers (zero, one or more than one). Also, the Customer data is stored in the Customer table and the PhoneNumber data in the CustomerPhoneNumber table. The Customer table has CustomerID as the primary key. The CustomerPhoneNumber table has CustomerID and PhoneNumber together as the primary key. No space is wasted. There are no data modification problems, e.g. deleting a phone number does not require deleting a customer. No query needs to include multiple columns for the phone number values, which helps performance. No query needs logic to handle a single phone number or to extract individual phone numbers from a list.
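
Here is a minimal sketch of this two-table design. It assumes Python's built-in sqlite3 module with an in-memory database; the column types, the foreign key and the sample rows are illustrative assumptions and are not part of the design above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,
    FirstName  TEXT NOT NULL,
    LastName   TEXT NOT NULL,
    Address    TEXT
);
CREATE TABLE CustomerPhoneNumber (
    CustomerID  INTEGER NOT NULL REFERENCES Customer(CustomerID),
    PhoneNumber TEXT NOT NULL,
    PRIMARY KEY (CustomerID, PhoneNumber)  -- composite primary key
);
""")

# Sample data (made up): customers with zero, one and four phone numbers.
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?, ?)", [
    (1, "Ann", "Lee", "1 Main St"),
    (2, "Bob", "Ray", "2 Oak Ave"),
    (3, "Cy", "Fox", "3 Elm Rd"),
])
conn.executemany("INSERT INTO CustomerPhoneNumber VALUES (?, ?)", [
    (2, "555-0100"),
    (3, "555-0101"), (3, "555-0102"), (3, "555-0103"), (3, "555-0104"),
])

# Deleting one phone number does not touch the Customer row.
conn.execute(
    "DELETE FROM CustomerPhoneNumber WHERE CustomerID = ? AND PhoneNumber = ?",
    (3, "555-0104"),
)

customer_count = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
phone_counts = conn.execute("""
    SELECT c.CustomerID, COUNT(p.PhoneNumber)
    FROM Customer AS c
    LEFT JOIN CustomerPhoneNumber AS p ON p.CustomerID = c.CustomerID
    GROUP BY c.CustomerID
    ORDER BY c.CustomerID
""").fetchall()
print(customer_count, phone_counts)
```

The LEFT JOIN keeps the customer who has no phone number in the result, which is the "zero phone numbers" case mentioned above.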

With the above understanding in mind, here are the tests that should be applied to check 1NF on every table in the database:
1. Is each column an attribute of the table's entity?
2. Is there a primary key?
3. Are there no duplicate rows?
4. Are there no repeating columns?
5. Are there no multiple values in any column of any row?
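
Tests 2 to 5 can be partly automated. The sketch below assumes a SQLite database accessed through Python's sqlite3 module; the helper names, the trailing-digit heuristic for repeating columns and the comma separator are my assumptions for illustration. Test 1 still needs human judgment about what belongs to the entity. Table and column names are interpolated into the SQL text, so they must come from trusted test code only.

```python
import re
import sqlite3
from collections import Counter

def column_names(conn, table):
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

def has_primary_key(conn, table):
    # Test 2: the pk field is greater than 0 for primary key columns
    return any(row[5] > 0 for row in conn.execute(f"PRAGMA table_info({table})"))

def duplicate_row_groups(conn, table):
    # Test 3: count the groups of rows that are identical in every column
    cols = ", ".join(column_names(conn, table))
    sql = (f"SELECT COUNT(*) FROM "
           f"(SELECT 1 FROM {table} GROUP BY {cols} HAVING COUNT(*) > 1)")
    return conn.execute(sql).fetchone()[0]

def repeating_column_groups(conn, table):
    # Test 4 (heuristic): names that differ only by a trailing number
    stems = Counter(re.sub(r"\d+$", "", name) for name in column_names(conn, table))
    return sorted(stem for stem, count in stems.items() if count > 1)

def rows_with_list_values(conn, table, column, separator=","):
    # Test 5 (heuristic): rows whose value contains the list separator
    sql = f"SELECT COUNT(*) FROM {table} WHERE instr({column}, ?) > 0"
    return conn.execute(sql, (separator,)).fetchone()[0]

# Deliberately bad sample tables, modelled on the examples above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (FirstName TEXT, LastName TEXT, JobTitle TEXT);
INSERT INTO Employee VALUES ('John', 'Smith', 'Tester'), ('John', 'Smith', 'Tester');
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, FirstName TEXT,
                       PhoneNumber1 TEXT, PhoneNumber2 TEXT, PhoneNumbers TEXT);
INSERT INTO Customer VALUES (1, 'Ann', '555-0100', NULL, '555-0100,555-0101');
""")

print(has_primary_key(conn, "Employee"))
print(duplicate_row_groups(conn, "Employee"))
print(repeating_column_groups(conn, "Customer"))
print(rows_with_list_values(conn, "Customer", "PhoneNumbers"))
```

These checks only flag candidates for review. A column named Address2 or a value that legitimately contains a comma would also be reported.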

Want to learn more? See more explanation with multiple examples in my video on Normalization and First Normal Form

Friday, October 28, 2011

Performance Test Scripts Sections

Performance test scripts model the virtual user's expected interaction with the system. A performance test script is usually created within the performance testing tool. The default performance test script generated by the tool needs to be refactored, parameterized, correlated and unit tested before it can be used in a performance test. Each performance test script contains various sections. It is important to know about these in order to create robust scripts that work correctly.

1. Environment: This section of a performance test script contains the general information related to the entire script. Examples of data in the environment section are repository description of the scripts, protocol used, browser used and time units (e.g. ms) used.

2. Include: This section references other pre-existing scripts that contain functions, constants and variables used in the current performance test script. An example of an include script is the file containing all browser response status codes (e.g. 200, 404 and 500).

3. Variables: These are used when it is not possible to know the data value in advance. For example, a performance test script modeled to work with any username/ password would use variables to read these values at run-time from a data source (e.g. a CSV file) and subsequently use these variables to perform user actions. Another example is using a variable to store the cookie value, which cannot be predicted in advance. 
The scope of a variable can differ. A variable can be local to a script and a virtual user. Or it can be local to a particular virtual user across all scripts executed by this virtual user. Or the variable can be global in scope across all scripts and all virtual users in the load test.

4. Constants: These are defined once in the performance test script and may be used multiple times in the script. They provide configuration control. A change in a constant value is automatically reflected wherever the constant is used in the entire script.

5. Timers: These are special variables that track the time elapsed between sending a request to the system and loading of the responses received from the system. Timer values are aggregated to determine the response times of an entire user transaction or a part thereof.

6. Code: This is the main section of the performance test script. It contains script instructions that model a user performing a transaction in the system. It also contains the validation checks on the responses given by the system. The code is written in the scripting language generated by the performance testing tool or any scripting language that is supported by the performance testing tool.

7. Waits: These are commonly used to model the pauses given by users between operations in the system. The performance testing tool does nothing during the wait period. Note that if all Wait statements were removed, it would put an unrealistic load on the system due to the non-stop issuing of requests by the user.

8. Comments: These are useful to explain the sections in a performance test script. Comments are especially important in scripts representing lengthy user transactions.
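
To make the sections concrete, here is a rough sketch of how they might look in one script. It is written in plain Python rather than in any particular tool's scripting language. The URL, the credentials, the think time and the send_request function are all hypothetical stand-ins, and no real network request is made.

```python
# --- Include: shared modules (here, only the standard library) ---
import csv
import io
import time
from http import HTTPStatus  # named response status codes such as 200, 404, 500

# --- Environment: general information about the whole script (made-up values) ---
ENVIRONMENT = {"description": "Login transaction", "protocol": "HTTP", "time_unit": "ms"}

# --- Constants: defined once, used wherever needed ---
BASE_URL = "http://test.example/login"  # hypothetical system under test
THINK_TIME_SECONDS = 0.01               # kept tiny so the sketch runs quickly

# --- Variables: values that are read at run-time from a data source ---
CREDENTIALS_CSV = "username,password\njohn,Smith99\nmary,Pa55word\n"  # stands in for a CSV file

def send_request(url, username, password):
    """Hypothetical stand-in for the tool's request function; no network is used."""
    return {"status": HTTPStatus.OK, "cookie": f"session-{username}"}

def run_virtual_user(username, password):
    # --- Timers: measure the time between sending the request and the response ---
    started = time.perf_counter()
    # --- Code: the user action plus a validation check on the response ---
    response = send_request(BASE_URL, username, password)
    elapsed_ms = (time.perf_counter() - started) * 1000
    assert response["status"] == HTTPStatus.OK
    session_cookie = response["cookie"]  # cannot be predicted in advance, so stored in a variable
    # --- Waits: model the pause a real user takes before the next operation ---
    time.sleep(THINK_TIME_SECONDS)
    return session_cookie, elapsed_ms

results = [run_virtual_user(row["username"], row["password"])
           for row in csv.DictReader(io.StringIO(CREDENTIALS_CSV))]
print(results)
```

The comments marking each section play the role of the Comments section itself.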

Saturday, October 22, 2011

SQL Test Online

Structured Query Language (SQL) is the programming language used to query and modify the data stored in a relational database. Knowledge of designing SQL queries is important in software testing. Take the test below to find out your level of knowledge and the areas in which you should improve. Each question has exactly one correct answer. There is no need to consult any reference to answer these questions.

1. SELECT statement can fetch data from _______ table(s) of the database.
a. exactly one
b. at least two
c. one or two
d. any

2. What values can be included in the SELECT expression (the list following the SELECT keyword)?
a. Any columns
b. All columns
c. Computed values e.g. Price * 10
d. All of the above

3. Which function gives the total number of rows in a table?
a. SUM
b. COUNT
c. ROWCOUNT
d. This has to be done indirectly by executing a SQL query (e.g. SELECT * FROM Authors) and noticing the number of rows.

4. Which of the following SQL queries is correct?
a. SELECT * FROM Books WHERE Price BETWEEN 10 AND 25
b. SELECT * FROM Books WHERE Price BETWEEN 10, 25
c. SELECT * FROM Books WHERE Price BETWEEN (10, 25)
d. SELECT * FROM Books WHERE Price >10 AND Price < 25

5. Which JOIN clause returns only the matching values from two tables?
a. CROSS
b. INNER
c. LEFT OUTER
d. RIGHT OUTER

6. Which statement is correct for the GROUP BY clause?
a. GROUP BY allows grouping by only one column
b. GROUP BY needs to precede the WHERE clause
c. An aggregate function needs to be specified based on the column specified in GROUP BY
d. HAVING clause can be used in place of GROUP BY clause

7. What is true about Normalization?
a. It avoids data duplicities within and across tables.
b. It is easier to extend the database structure of a normalized database.
c. A normalized database structure is better than a de-normalized one when the SQL queries against it cannot be predicted in advance.
d. All of the above.

8. Which of these SQL queries is correct?
a. SELECT * FROM Employees ORDER BY LastName + FirstName
b. SELECT * FROM Employees ORDER BY LastName ORDER BY FirstName
c. SELECT FirstName, LastName FROM Employees ORDER BY LastName, FirstName DESCENDING
d. SELECT FirstName FROM Employees ORDER BY LastName, FirstName

9. Which of these statements is incorrect for the UNION operator?
a. Both SELECT statements have the same number of columns.
b. The UNION operator returns values that are duplicated in the two resultsets.
c. The column names returned by the UNION operator are taken from the first SELECT statement.
d. Either of the two SELECT statements can have WHERE, GROUP BY, HAVING and ORDER BY clauses.

10. Which of these is valid for a correlated sub query?
a. It is specified in the WHERE clause of the outer query.
b. It is specified in the FROM clause of the outer query.
c. It uses value in the outer query in its WHERE clause.
d. It is mentioned in the outer query's SELECT clause.


Sunday, October 9, 2011

Risk Management in Software Testing


Risk management is a critical activity in software test planning and tracking. See my short video, Risk Management in Projects or read on.

It includes the identification, prioritization/analysis and treatment of risks faced by the business. Risk management is performed at various levels: project level, program level, organization level, industry level and even national or international level. In this article, risk management is understood to be done at a project level within the context of software testing. Risks arise from a variety of perspectives like project failure, safety, security, legal liabilities and non-compliance with regulations. An important thing to understand is that risks are potential problems that have not yet occurred. A problem that has already occurred is an issue and is treated differently in software test planning. Risk management in software testing consists of the following activities:

Risk Identification
Risks are identified within the scope of the project. Risks can be identified using a number of resources, e.g. project objectives, risk lists of past projects, prior system knowledge, understanding of system usage, understanding of system architecture and design (see my video, Introduction to Software Architecture), prior customer bug reports/ complaints, project stakeholders and industry practices. For example, if certain areas of the system are unstable and those areas are being developed further in the current project, this should be listed as a risk.
It is good to document the identified risks in detail so that they stay in project memory and can be clearly communicated to project stakeholders. Usually risk identification is an iterative process. It is important to re-visit the risk list whenever the project objectives change or new business scenarios are identified. As the project proceeds, some new risks appear and some old risks disappear.

Risk Prioritization
It is simpler to prioritize a risk if the risk is understood accurately. Two measures, Risk Impact and Risk Probability, are applied to each risk. Risk Impact is estimated in tangible terms (e.g. dollar value) or on a scale (e.g. 10 to 1 or High to Low). Risk Probability is estimated somewhere between 0 (no probability of occurrence) and 1 (certain to occur) or on a scale (10 to 1 or High to Low). For each risk, the product of Risk Impact and Risk Probability gives the Risk Magnitude. Sorting the risks by Risk Magnitude in descending order gives a list in which the risks at the top are the more serious risks and need to be managed closely.
Adding all the Risk Magnitudes gives an overall Risk Index of the project. If the same Risk Prioritization scale is used across projects, it is possible to identify the riskier projects by comparing their Risk Indexes.
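
As a small worked example of this arithmetic, here is a sketch in Python. The three risks and their impact (on a 10 to 1 scale) and probability (between 0 and 1) values are made up for illustration.

```python
# (risk description, Risk Impact on a 10-to-1 scale, Risk Probability between 0 and 1)
risks = [
    ("New test team lacks system knowledge", 5, 0.5),
    ("Single test server may crash", 9, 0.1),
    ("Unstable module is being extended", 8, 0.6),
]

# Risk Magnitude = Risk Impact x Risk Probability, sorted with the most serious risk first
ranked = sorted(
    ((name, impact * probability) for name, impact, probability in risks),
    key=lambda item: item[1],
    reverse=True,
)

# Risk Index of the project = sum of all Risk Magnitudes
risk_index = sum(magnitude for _, magnitude in ranked)

for name, magnitude in ranked:
    print(f"{magnitude:4.1f}  {name}")
print(f"Risk Index: {risk_index:.1f}")
```

With these numbers, the unstable module comes out on top even though the server crash has the highest impact, because its probability is much higher.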

Risk Treatment
Each risk in the risk list is subject to one or more of the following Risk Treatments.
 a. Risk Avoidance: For example, if there is a risk related to a new component, it is possible to postpone this component to a later release. Risk Avoidance is uncommon because it impacts the project objectives, e.g. delivery of new features.
 b. Risk Transfer: For example, if the risk is insufficient security testing of the system, it may be possible to hire a specialized company to perform the security testing. Risk Transfer takes place when this vendor is held accountable for ample security testing of the system. Risk Transfer increases the project cost.
 c. Risk Mitigation: This is a common risk treatment. The objective of Risk Mitigation is to reduce the Risk Impact or Risk Probability or both. For example, if the testing team is new and does not have prior system knowledge, a risk mitigation treatment may be to have a knowledgeable team member join the team to train others on-the-fly. Risk Mitigation also increases the project cost.
 d. Risk Acceptance: Any risk not treated by any prior treatments has to be accepted. This happens when there is no viable mitigation available due to reasons such as cost. For example, if the test environment has only one server, risk acceptance means not building another server. If the existing server crashes, there will be down-time and it will be a real issue in the project.

A few other points are:
1. Risk management brings clarity and focus to the team and other stakeholders. However, the team should avoid spending more time on risk management once it stops providing additional value.
2. The risk list should be a live document, consisting of current risks, their prioritization and treatment plans. The test approach and test plan should be synchronized with the risk list whenever the latter is updated.
3. Bigger projects commonly involve more stakeholders and have a more formal risk management process.


Image: jscreationzs / FreeDigitalPhotos.net

Tuesday, October 4, 2011

SQL Injection

If you have read my earlier article, Code injection attacks, you would have some idea about SQL injection attack. This post explains SQL injection in detail so that you may understand it well.

What is the SQL injection vulnerability? Vulnerability is a weakness in the application software under test that can be attacked to cause the application (or even the underlying operating system) to behave in an undesirable manner. The SQL injection (SQLi in short) vulnerability lives in the middle-layer or the database layer of the application. It exists when the application executes a dynamic SQL query against the database without validating, escaping or rejecting the unexpected inputs given by the attacker. These inputs become a part of the dynamic SQL query and are executed against the database.

What is the SQL injection attack? It occurs when some text or even another SQL query is inserted into the application's SQL query. Attacks can be successful or unsuccessful depending on the application and the underlying database. A successful SQL injection attack may show confidential data to the attacker, allow the attacker to impersonate another user, increase the attacker's privileges to higher levels, insert/ modify/ delete data in the database tables or even perform administrative operations on the database like shutting down the database instance.

With this background, let us see examples showing SQL injection.

Example 1. The application query is
SELECT * from Salaries WHERE EmployeeName = ? AND EmployeeId = ?

The EmployeeName and EmployeeId are strings that are obtained from the client-side application. The intention of the developer is that an EmployeeName, say John, and an EmployeeId, say A100, are sent by the client to the server. After concatenating these values, the query becomes
SELECT * from Salaries WHERE EmployeeName = 'John' AND EmployeeId = 'A100'

The application then shows the salary details of this particular employee. Now, the attacker passes  "name" as the EmployeeName and "Id' OR 'x'='x" for EmployeeId. After concatenating these values, the query becomes
SELECT * from Salaries WHERE EmployeeName = 'name' AND EmployeeId = 'Id' OR 'x'='x'

The result is that the WHERE clause now contains an OR condition that is always true. Because AND is evaluated before OR, the clause is read as (EmployeeName = 'name' AND EmployeeId = 'Id') OR 'x'='x', so this query returns all rows of the Salaries table.
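
Example 1 can be reproduced with a short sketch. It assumes an in-memory SQLite database accessed through Python's sqlite3 module; the Salary column, the sample rows and the function names are my assumptions. The second function passes the same inputs as bound parameters, which is one common way to keep input out of the SQL text, shown here only for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Salaries (EmployeeName TEXT, EmployeeId TEXT, Salary INTEGER);
INSERT INTO Salaries VALUES ('John', 'A100', 5000), ('Mary', 'A101', 6000), ('Raj', 'A102', 7000);
""")

def salaries_by_concatenation(name, employee_id):
    # Vulnerable: the inputs become part of the SQL text itself
    sql = ("SELECT * FROM Salaries WHERE EmployeeName = '" + name +
           "' AND EmployeeId = '" + employee_id + "'")
    return conn.execute(sql).fetchall()

def salaries_by_bound_parameters(name, employee_id):
    # Contrast: the inputs are sent separately and treated as plain values
    sql = "SELECT * FROM Salaries WHERE EmployeeName = ? AND EmployeeId = ?"
    return conn.execute(sql, (name, employee_id)).fetchall()

expected_use = salaries_by_concatenation("John", "A100")
attack = salaries_by_concatenation("name", "Id' OR 'x'='x")
same_input_bound = salaries_by_bound_parameters("name", "Id' OR 'x'='x")

print(len(expected_use), len(attack), len(same_input_bound))
```

The concatenated query returns every salary row for the attack input, while the bound-parameter version finds no employee whose Id is literally that string.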

Example 2. The application query on the login page of the application is
SELECT UserName FROM Users WHERE UserName = ? AND Password = ?

As in the previous example, there are two inputs obtained from the client-side application. Here, the developer expects two strings, say John and Smith99, are sent by the client to the server. In such a case, the query becomes
SELECT UserName FROM Users WHERE UserName = 'John' AND Password = 'Smith99'

If this UserName and Password combination exists in the Users table, the query returns one or more rows and the login succeeds. Instead, the attacker passes "Administrator'--" and "password" values. After concatenating these values, the query becomes
SELECT UserName FROM Users WHERE UserName = 'Administrator'--' AND Password = 'password'

After removing the commented part, the query is effectively
SELECT UserName FROM Users WHERE UserName = 'Administrator'

The result is that if there is a user named Administrator, the attacker logs into the application impersonating this user and has all of the user's privileges on which to base further attacks.
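
The login bypass in Example 2 can be sketched the same way, again assuming an in-memory SQLite database and Python's sqlite3 module. The stored passwords and the login function are hypothetical; storing plain-text passwords is done here only to mirror the query in the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (UserName TEXT PRIMARY KEY, Password TEXT);
INSERT INTO Users VALUES ('Administrator', 's3cret'), ('John', 'Smith99');
""")

def build_login_query(user_name, password):
    # Vulnerable: string concatenation, as in the example above
    return ("SELECT UserName FROM Users WHERE UserName = '" + user_name +
            "' AND Password = '" + password + "'")

def login(user_name, password):
    # Returns the logged-in user name, or None when no row matches
    row = conn.execute(build_login_query(user_name, password)).fetchone()
    return row[0] if row else None

valid_login = login("John", "Smith99")
wrong_password = login("Administrator", "guess")
bypass = login("Administrator'--", "password")

print(build_login_query("Administrator'--", "password"))
print(valid_login, wrong_password, bypass)
```

The attacker never supplies the real password: everything after the -- is treated as a comment, so the password check is simply not part of the executed query.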

Example 3. The application query on the search page of the application is
SELECT * FROM Products WHERE ProductName = ?

If the attacker provides the value "name'; DELETE FROM Products;--", the query becomes
SELECT * FROM Products WHERE ProductName = 'name'; DELETE FROM Products;--'

The trailing -- comments out the closing quote added by the application. If the database allows batch execution (of multiple queries), the latter query deletes all rows from the Products table.
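
Whether the batch runs depends on the database API that the application uses. The sketch below assumes Python's sqlite3 module and made-up product rows: its execute() method accepts only a single statement and rejects the attack text, while its executescript() method runs batches and lets the DELETE through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProductName TEXT);
INSERT INTO Products VALUES ('pen'), ('ink'), ('pad');
""")

def build_search_query(product_name):
    # Vulnerable: string concatenation, as in the example above
    return "SELECT * FROM Products WHERE ProductName = '" + product_name + "'"

def count_products():
    return conn.execute("SELECT COUNT(*) FROM Products").fetchone()[0]

attack_sql = build_search_query("name'; DELETE FROM Products;--")

# A single-statement API refuses the batch. The exception type differs
# between Python versions, so both are caught here.
try:
    conn.execute(attack_sql)
    single_statement_api_rejected = False
except (sqlite3.Warning, sqlite3.ProgrammingError):
    single_statement_api_rejected = True
rows_after_execute = count_products()

# A batch-capable API runs both statements, including the injected DELETE.
conn.executescript(attack_sql)
rows_after_batch = count_products()

print(single_statement_api_rejected, rows_after_execute, rows_after_batch)
```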

I hope that you understand SQL injection now. A few other points to note are:
1. SQL injection is further divided into two categories, SQL manipulation and Code injection. Strictly speaking, examples 1 and 2 involve SQL manipulation. Example 3 involves Code injection, because extra SQL code is inserted there.
2. If the attacker knows the application's SQL in advance, he can devise complex attacks. This is possible if the application throws error messages showing the SQL query text, if it is an Open Source application or if the attacker somehow obtains access to the application source code.
3. Testing SQL injection can be automated. A variety of tools are available for this. Examples include SQLiX (SQL Injection Scanner) and sqlmap (Open Source Penetration testing tool).

Image: Ambro / FreeDigitalPhotos.net

Sunday, October 2, 2011

Team Productivity - 10 ways to ensure that your team members excel and grow

This post is about the softer skills of software test management. It is about how to have your people excel in their jobs. It is something about which I feel strongly. The end result of a project is not the only important thing. Even more important is the career benefit to your team member. Is it possible to run a project such that throughout the project, your team member matures his skills, his attitude and his professionalism? Executing projects consistently like this will ensure that your team member grows professionally. Better performance on subsequent projects will be a given. So, how does a test manager go about having their team members excel and grow? It's not easy. It's also not very difficult. Here is how.

1. Know your team member's career objectives: Every time you start working with a team member with whom you have not worked before, set aside some time for a frank discussion. You may already have some data about their strengths/ weaknesses and past performance. However, have them describe their prior experiences in their own words. Observe the things they talk about and the things they focus upon. Avoid judging them at this time. Ask them about their passions and interests. Find out the situations that challenge them. Know the things in which they take pride. Take notes to refer to later.

2. Determine his preferred learning style: People have different favorite styles when it comes to learning new things. One person may prefer reading the new material, another may prefer having someone talk about it and another may prefer experimenting with it. Knowing your team member's preferred learning style will enable you to decide the best option, give them time to read new material, organize a classroom training or give them time and resources to play with it. Also, your own preferred learning style may be different from your team member's. So, do not impose your preferred style on him.

3. Identify his favored working style: Different people have different favored working styles. A team member may prefer crisp and clear instructions; another may prefer an outline to decide the details himself. One person may be fast (and mistake-prone) and another may be slow (but thorough). One team member may prefer being assigned only one task and another may be quite comfortable with multiple tasks with varying priorities. One may be able to maintain focus on the given task and another may need periodic reminders. When executing the task, one person may communicate regularly with you and another may give you infrequent updates. Understanding your team member's favored working style will enable you to know what to expect from him as "normal".

4. Remember that every individual is unique: Understanding the motivations, challenges and favorite styles of your team member will enable you to deploy him on the project tasks that benefit both the project and the team member the most. For example, a technically oriented person would contribute the most to the technical tasks in the project, a fast team member would contribute most to the effort-intensive tasks of the project and a good communicator would contribute most to project tasks requiring co-operation with other teams.

5. For every assignment, ask him to confirm the objectives, approach and deliverables as understood by him: Knowing your team member's favorite working style will enable you to design and communicate an assignment to him in the most helpful way. But it is possible that some objectives, approach or deliverables have not registered one hundred percent. So, always ask him to re-iterate these in his own words. This will enable you to identify and close any gaps in his understanding.

6. Don't simplify the work: As a manager, your role is to distribute the work assignments and track them. You may give your suggestions to simplify the work. But, always encourage the team member to think about the challenges, identify appropriate approaches to tackle them and choose the best approach. Providing him ready-made solutions would rob him of this opportunity to grow. Also, increase your expectations gradually. Raising the bar will ensure that your team member does not run out of problems and challenges.

7. Never accept failure as-is: There will be situations in which the outcome is not successful. Even so, such situations will make the team member learn what does not work (well). When faced with failure, provide your team member with the required encouragement and help until he does come up with an acceptable success.

8. Never neglect communication: Regular communication is critical to strong relationships. Keep your team member informed about relevant developments. Contact him from time to time, even if it is just to catch up. At the very least, always be available should your team member wish to contact you for an update, an approval or even as a sounding board for an idea.

9. Maintain a consistent management approach yourself: Your team member should regard you as a fair and trustworthy person, who treats team members uniformly and says and does the same things. So, do not play favorites within the team and do not go after anyone. If you are inconsistent, your team member will be worried and unsure about you and this will impact their performance and growth.

10. Always respect your team member: Finally, always be respectful to your team member. Be ready to listen to him while suspending judgment. The thing about respect is that it's mutual; if you respect your team member, he will automatically respect you. Respect will ensure that your team member stays focused on excelling and growth.

If you are a manager, follow the above 10 points over and over again. You will have an exclusive understanding of your team and you will make prudent business decisions. Also, your team members will grow professionally.
If you are a team member, show these 10 points to your manager. Understanding, open communication and consistently working in challenging situations are sure to make you grow faster.

Image Courtesy: Salvatore Vuono / FreeDigitalPhotos.net