Sunday, October 9, 2011

Risk Management in Software Testing

Risk management is a critical activity in software test planning and tracking. See my short video, Risk Management in Software Projects or read on.

It includes the identification, prioritization/analysis and treatment of risks faced by the business. Risk management is performed at various levels, project level, program level, organization level, industry level and even national or international level. In this article, risk management is understood to be done at a project level within the context of software testing. Risks arise from a variety of perspectives like project failure, safety, security, legal liabilities and non-compliances with regulations. An important thing to understand is that risks are potential problems, not yet occurred. A problem that has already occurred is an issue and is treated differently in software test planning. Risk management in software testing consists of the following activities:

Risk Identification
Risks are identified within the scope of the project.  Risks can be identified using a number of resources e.g. project objectives, risk lists of past projects, prior system knowledge, understanding of system usage, understanding of system architecture (see my video, Introduction to Software Architecture)/ design, prior customer bug reports/ complaints, project stakeholders and industry practices. For example, if certain areas of the system are unstable and those areas are being developed further in the current project, it should be listed as a risk.
It is good to document the identified risks in detail so that it stays in project memory and can be clearly communicated to project stakeholders. Usually risk identification is an iterative process. It is important to re-visit the risk list whenever the project objectives change or new business scenarios are identified. As the project proceeds, some new risks appear and some old risks disappear.

Risk Prioritization
It is simpler to prioritize a risk if the risk is understood accurately. Two measures, Risk Impact and Risk Probability, are applied to each risk. Risk Impact is estimated in tangible terms (e.g. dollar value) or on a scale (e.g. 10 to 1 or High to Low). Risk Probability is estimated somewhere between 0 (no probability of occurrence) and 1 (certain to occur) or on a scale (10 to 1 or High to Low).  For each risk, the product of Risk Impact and Risk Probability gives the Risk Magnitude.  Sorting the Risk Magnitude in descending order gives a list in which the risks at the top are the more serious risks and need to be managed closely.
Adding all the Risk Magnitudes gives an overall Risk Index of the project. If the same Risk Prioritization scale is used across projects, it is possible to identify the riskier projects by comparing the Risk Magnitudes.

Risk Treatment
Each risk in the risk list is subject to one or more of the following Risk Treatments.
 a. Risk Avoidance: For example, if there is a risk related to a new component, it is possible to postpone this component to a later release. Risk Avoidance is uncommon because it impacts the project objectives e.g.  delivery of new features.
 b. Risk Transfer: For example, if the risk is insufficient security testing of the system, it may be possible to hire a specialized company to perform the security testing. Risk Transfer takes place when this vendor is held accountable for ample security testing of the system. Risk Transfer increases the project cost.
 c. Risk Mitigation: This is a common risk treatment. The objective of Risk Mitigation is to reduce the Risk Impact or Risk Probability or both. For example, if the testing team is new and does not have prior system  knowledge, a risk mitigation treatment may be to have a knowledgeable team member join the team to train others on-the-fly. Risk Mitigation also increases the project cost.
 d. Risk Acceptance: Any risk not treated by any prior treatments has to be accepted. This happens when there is no viable mitigation available due to reasons such as cost. For example, if the test environment has only  one server, risk acceptance means not building another server. If the existing server crashes, there will be down-time and it will be a real issue in the project.

Few other points are:
1. Risk management brings clarity and focus to the team and other stakeholders. Though the team should avoid burning more time on risk management if it is not providing more value.
2. The risk list should be a live document, consisting of current risks, their prioritization and treatment plans. The test approach and test plan should be synched with the risk list whenever the latter is updated.
3. Bigger projects commonly involve more stakeholders and have more formal risk management process.

Image: jscreationzs /