
CPSC 546 Modern Software Management – week 5

“The opposite of risk management is crisis management”
Tim Lister

Module 5: Risk Management
Introduction to risk management in software projects
Identifying and categorizing software project risks
Threat and Risk Analysis (TRA)
Risk assessment and prioritization
Mitigation strategies and contingency planning

Module 6: Continuous Integration and Delivery


DevOps culture: mindset and practices

Page 2
Real world example
https://d3.harvard.edu/platform-rctom/submission/the-failed-launch-of-www-healthcare-gov/

“I’m going to try and download every movie ever made, and you’re going to try to sign up for Obamacare, and we’ll see which happens first” –
Jon Stewart challenging Kathleen Sebelius (former Secretary of Health and Human Services) to a race.

Root Causes
The many problems experienced during the website's rollout were due to several factors, primarily:
• Lack of relevant experience. HHS employees and managers had a lot of experience with private insurance markets and maintaining large government
projects but did not have the required experience in technology product launches. Key technical positions were unfilled, and project managers had little
knowledge of the amount of work required or of typical product development processes, leaving very little time to test and troubleshoot the website.
• Lack of leadership. There was no formal division of responsibilities between the many government offices involved, which delayed
key decisions or left them poorly communicated once they were made. For example, the contractor responsible for the login system
estimated low demand because the initial website plan included the option to shop for products without creating an account or logging in. However,
due to technical delays, that functionality was removed from the initial website launch (so all users had to log in) without any increase in capacity.
• Schedule pressure. Since the launch date was mandated in the Affordable Care Act, HHS employees were pressured to launch on time regardless of
completion or the amount (and results) of testing and troubleshooting performed.

The key issues discussed above caused the cost of the healthcare.gov rollout to balloon from the initial $93.7M budget to
an ultimate cost of $1.7B.

Page 3
Qualitative vs Quantitative

Qualitative risk analysis tends to be more subjective. It focuses on identifying risks to measure both the likelihood of a specific risk
event occurring during the project life cycle and the impact it will have on the overall schedule should it hit.
The goal is to determine severity. Results are then recorded in a risk assessment matrix (or any other form of an intuitive graphical report)
in order to communicate outstanding hazards to stakeholders.

Risk: Frequent changes in project requirements.
Likelihood: High
Impact: High
Risk Exposure: High
Risk Mitigation: Establish a robust change management process, involve stakeholders in requirement discussions, and educate stakeholders about the
impact of frequent changes.

Quantitative risk analysis uses verifiable data to analyze the effects of risk in terms of cost overruns, scope creep, resource
consumption, and schedule delays.

Risk: Potential cost overrun due to inaccurate initial cost estimates.
Likelihood: 30% (based on historical data).
Impact: $100,000 (estimated additional cost if the risk materializes).
Risk Exposure: $30,000 ($100,000 * 30%).
Risk Mitigation: Use advanced cost estimation techniques, conduct sensitivity analysis, and allocate a contingency budget.

Page 4 https://www.safran.com/blog/whats-the-difference-between-qualitative-and-quantitative-risk-analysis
Before we go into the matrix, risk is all about:

Impact (severity): The consequences if the risk materializes, such as delays in project delivery, cost overruns, compromised software quality,
decreased customer satisfaction, or project failure.
Probability: Each risk has a certain likelihood or probability of occurring. Some risks are more
likely to materialize than others. If you know something will happen, it is an issue (a fact), not a risk!
Mitigation: Risk mitigation involves taking proactive measures to reduce the probability or
impact of identified risks. Strategies for mitigation may include contingency planning, risk
avoidance, risk transfer (such as insurance), or risk acceptance (choosing not to take action if
the risk is deemed acceptable).

Page 5
Exposure = Severity * Probability
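
A minimal Python sketch of this formula follows. The function name `risk_exposure` and its parameter names are illustrative assumptions, not part of the course material; the numbers come from the cost-overrun example on the Qualitative vs Quantitative slide ($100,000 impact at 30% likelihood).

```python
def risk_exposure(severity: float, probability: float) -> float:
    """Return exposure = severity * probability.

    `severity` is the impact in any unit (dollars, weeks, ...);
    `probability` is a likelihood expressed as a fraction between 0 and 1.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return severity * probability


# Cost-overrun example: $100,000 impact at 30% likelihood.
exposure = risk_exposure(100_000, 0.30)
print(f"Risk exposure: ${exposure:,.0f}")  # prints "Risk exposure: $30,000"
```

Because exposure is an expected value, it keeps the unit of the severity: dollars in, dollars out; weeks in, weeks out.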

Page 6
Risk examples
Risk: Limited availability of skilled developers.
1. Likelihood: Medium (availability can vary based on market conditions).
2. Impact: High (can lead to project delays and reduced quality).
3. Risk Exposure: Medium.
Mitigation: Cross-train team members, explore outsourcing options, and maintain
a contingency plan for resource shortages.

Risk: Relying on third-party libraries or services.
1. Likelihood: Medium (dependability varies with third-party providers).
2. Impact: Medium (service disruptions can affect project functionality).
3. Risk Exposure: Medium.
Mitigation: Evaluate third-party providers, have backup plans in case of service
disruptions, and consider building fallback mechanisms.

Risk: Relying heavily on a specific vendor or proprietary software tool.
1. Likelihood: Low (unless the project is highly dependent on a single vendor).
2. Impact: High (vendor discontinuation can disrupt the project).
3. Risk Exposure: Medium.
Mitigation: Consider open-source alternatives, maintain vendor relationships,
and periodically review vendor options.

Risk: Potential security vulnerabilities in the software.
1. Likelihood: Medium (depends on the complexity of the software and security measures).
2. Impact: High (can lead to data breaches or system compromises).
3. Risk Exposure: Medium.
Mitigation: Conduct regular security audits, follow best practices for secure
coding, and implement security patches promptly.

Risk: Using new or unproven technologies in the project.
1. Likelihood: Medium (depends on the chosen technologies).
2. Impact: Medium (technological challenges can lead to delays or quality issues).
3. Risk Exposure: Medium.
Mitigation: Conduct thorough technology assessments, pilot new technologies in
controlled environments, and have fallback plans if the technology doesn't work
as expected.
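
One way to make the qualitative ratings on this slide repeatable is a small lookup that combines likelihood and impact into an exposure level. The sketch below is only an illustration: the 1-3 scoring scale, the thresholds, and the name `qualitative_exposure` are assumptions chosen so that the results agree with the examples above, not a matrix given in the course.

```python
# Ordinal scores for the qualitative levels (an assumed 1-3 scale).
LEVELS = {"Low": 1, "Medium": 2, "High": 3}


def qualitative_exposure(likelihood: str, impact: str) -> str:
    """Combine two Low/Medium/High ratings into one exposure rating.

    The thresholds are assumptions; a real project would agree on
    its own risk assessment matrix.
    """
    score = LEVELS[likelihood] * LEVELS[impact]  # one of 1, 2, 3, 4, 6, 9
    if score >= 9:
        return "High"
    if score >= 3:
        return "Medium"
    return "Low"


# (risk, likelihood, impact) taken from the examples on this slide.
examples = [
    ("Limited availability of skilled developers", "Medium", "High"),
    ("Third-party libraries or services", "Medium", "Medium"),
    ("Single vendor or proprietary tool", "Low", "High"),
]
for name, likelihood, impact in examples:
    print(f"{name}: {qualitative_exposure(likelihood, impact)}")
```

With these assumed thresholds, all three examples come out as Medium, matching the slide, and only High likelihood combined with High impact is rated High.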

Page 7
Quantitative

Risk: Defects and quality issues in the software.
1. Likelihood: 15% (based on historical defect rates).
2. Impact: $50,000 (estimated cost to fix defects if the risk materializes).
Risk Exposure: $7,500 ($50,000 * 15%).
Risk Mitigation: Implement rigorous testing processes, conduct code reviews, and invest in automated testing tools.

Risk: A data breach leading to financial losses and reputation damage.
1. Likelihood: 10% (based on security assessments).
2. Impact: $500,000 (estimated cost of a data breach).
Risk Exposure: $50,000 ($500,000 * 10%).
Risk Mitigation: Invest in robust security measures, conduct regular security audits, and have an incident response plan in place.

Risk: Key team members leaving the project.
1. Likelihood: 5% (based on historical turnover rates).
2. Impact: 4 weeks (estimated delay due to replacement and onboarding).
Risk Exposure: 0.2 weeks (4 weeks * 5%).
Risk Mitigation: Cross-train team members, maintain knowledge transfer documentation, and have a talent acquisition plan in place.
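
The three quantitative examples above can be kept in a simple risk register and ranked by exposure. The following is a sketch under assumptions: the `Risk` class, its field names, and the `unit` labels are hypothetical and only illustrate the arithmetic on this slide.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # likelihood as a fraction, e.g. 0.15 for 15%
    impact: float       # size of the loss if the risk materializes
    unit: str           # "USD" or "weeks"; exposures only compare within a unit

    @property
    def exposure(self) -> float:
        return self.impact * self.probability


register = [
    Risk("Defects and quality issues", 0.15, 50_000, "USD"),
    Risk("Data breach", 0.10, 500_000, "USD"),
    Risk("Key team members leaving", 0.05, 4, "weeks"),
]

# Rank only the monetary risks; the schedule risk is in a different unit.
monetary = sorted(
    (r for r in register if r.unit == "USD"),
    key=lambda r: r.exposure,
    reverse=True,
)
for r in monetary:
    print(f"{r.name}: ${r.exposure:,.0f}")
```

Note that the data breach ranks first even though it is the less likely of the two monetary risks, because its impact is ten times larger. The staffing risk is measured in weeks, so it cannot be ranked against dollar exposures without first converting the delay into a cost.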

Page 8
Technical Risks

Technical Risks: Technical risks are related to the development and implementation of the
software itself. These risks can have a significant impact on the project's success, quality,
and timeline.
Technology Risks: Risks associated with the choice of technology stack, tools, and
frameworks, including the risk of using new or unproven technologies.
Software Complexity: Risks stemming from the complexity of the software architecture,
design, or coding, which can lead to development delays or increased defect rates.
Integration Risks: Risks associated with integrating various software components or
third-party systems, such as compatibility issues and data transfer challenges.
Performance Risks: Risks related to the software's performance, scalability, and
responsiveness, including bottlenecks and resource constraints.
Security Risks: Risks related to vulnerabilities, data breaches, and cyberattacks, which
can compromise the confidentiality and integrity of data.

Page 9
Operational Risks

Operational Risks: Operational risks involve issues that can affect the software's
functionality, maintenance, and usability once it is in production.
Maintenance and Support: Risks related to the ongoing maintenance, updates, and
support of the software, including the availability of skilled personnel.
Scalability and Capacity Planning: Risks associated with accommodating growth in
user numbers or data volume without degradation in system performance.
Data Backup and Recovery: Risks associated with data loss and the ability to recover
data in case of system failures or disasters.
User Training and Adoption: Risks related to user acceptance and proficiency in using
the software, which can impact productivity and satisfaction.

Page 10
External risks
External risks are beyond the control of the project team and are influenced by external factors or
stakeholders.
Market Risks: Risks associated with changes in market conditions, competition, or shifts in user
preferences that can affect the software's relevance and market share.
Regulatory and Compliance Risks: Risks related to changes in laws, regulations, or industry standards
that impact the software's compliance and legal standing.
Supplier and Vendor Risks: Risks arising from dependencies on external suppliers, vendors, or
third-party services that may not meet expectations or experience disruptions.
Economic Risks: Risks tied to economic conditions, such as currency fluctuations, inflation, and
economic downturns, which can impact project budgets and costs.

Project Management Risks: These risks are related to project planning, execution, and resource
management.
Scope Creep: The risk of uncontrolled changes to project scope, potentially leading to timeline and
budget overruns.
Resource Constraints: Risks related to the availability and allocation of human and material
resources, including skilled team members.
Schedule Risks: Risks associated with delays in project milestones or deadlines.
Budget Risks: Risks related to cost overruns or financial constraints that may impact project success.
Page 11
Risk Avoidance:
Risk avoidance involves taking proactive steps to eliminate or completely avoid the risk and its potential negative consequences. This strategy
is often used when the risk is deemed too significant or when it conflicts with the project's objectives or constraints.
Example: If a software project team identifies a high-risk feature that is not essential for the project's success, they may choose to
avoid the risk by excluding that feature from the project scope.

Risk Transfer:
Risk transfer involves shifting the responsibility for managing a risk to another party. This strategy is commonly used when a third party, such
as an insurance provider or a subcontractor, can better handle the risk.
Example: Purchasing insurance for a project to transfer the financial risk of certain events, like natural disasters, to the insurance
company.

Risk Acceptance:
Risk acceptance is a strategy where the project team or organization acknowledges the risk and its potential impact but decides not to take
any specific action to mitigate it. This strategy is typically used when the risk is low in impact or likelihood, and the cost of mitigation exceeds
the potential losses.
Example: Accepting the risk of minor delays in a software project due to potential team member illness because it's not
cost-effective to have a backup for every team member.

Risk Mitigation:
Risk mitigation aims to reduce the likelihood or impact of a risk by taking proactive measures. It's one of the most common risk management
strategies and is often applied when a risk is significant but manageable.
Example: Implementing regular code reviews and automated testing processes to mitigate the risk of software defects and
vulnerabilities.

Reduce risk, monitor and act, training and skill development
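
The four strategies above can be read as rough rules of thumb. The sketch below encodes them in Python purely as an illustration: the function `choose_strategy`, its parameters, the ordering of the checks, and the use of risk exposure as a stand-in for "potential losses" are all assumptions, and real decisions also depend on judgement that a function like this cannot capture.

```python
from enum import Enum


class Strategy(Enum):
    AVOID = "avoid"
    TRANSFER = "transfer"
    ACCEPT = "accept"
    MITIGATE = "mitigate"


def choose_strategy(
    exposure: float,
    mitigation_cost: float,
    avoid_threshold: float,
    third_party_can_handle: bool = False,
) -> Strategy:
    """Suggest a response strategy using simplified rules of thumb."""
    if exposure >= avoid_threshold:
        return Strategy.AVOID       # too significant to carry at all
    if third_party_can_handle:
        return Strategy.TRANSFER    # e.g. an insurer or subcontractor
    if mitigation_cost > exposure:
        return Strategy.ACCEPT      # mitigating costs more than the expected loss
    return Strategy.MITIGATE        # significant but manageable


# Hypothetical figures: a $7,500 exposure that would cost $20,000 to mitigate.
print(choose_strategy(7_500, 20_000, avoid_threshold=100_000))  # Strategy.ACCEPT
```

With the same threshold, a $50,000 exposure that costs $10,000 to mitigate would be suggested for mitigation instead.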


Page 12
DevOps culture
"Wall of Confusion" is a term used to describe the disconnect or
misalignment that can occur between development (Dev) and
operations (Ops) teams during the software development and delivery
process.

In traditional software development practices, development and operations teams operated in silos. Developers
focused on creating and enhancing software, while operations teams were responsible for deploying and maintaining it
in production environments.

Differences in understanding and approach

Deployment Frequency: Developers often aim for frequent releases, while operations prefer fewer, more stable deployments.
Risk Tolerance: Developers may be more willing to take risks to introduce new features, whereas operations prioritize risk mitigation.
Monitoring and Alerts: Developers and operations have different expectations regarding monitoring, alerting thresholds, and incident
response.
Documentation: Operations teams often prioritize thorough documentation for system configurations and maintenance, which may not
align with developer practices.
Tooling: Developers and operations may have different preferences for the tools and technologies used in the development and
deployment pipelines.

Page 13
Mitigating the Wall of Confusion:

Cultural Shift: DevOps emphasizes a cultural shift where teams collaborate, share
responsibilities, and collectively own the entire software delivery lifecycle.
Automation: Automating manual tasks, including testing, deployment, and
infrastructure provisioning, can reduce miscommunication and errors.
Shared Metrics: Establishing shared performance metrics and KPIs that both Dev
and Ops teams can monitor helps align their goals.
Cross-Functional Teams: Create cross-functional teams where developers and
operations professionals work together, fostering mutual understanding and trust.

Page 14
DevOps culture - terminology
Continuous Delivery – A software engineering approach in which software is produced in short cycles, ensuring that it can be reliably released at any time; the release itself is triggered manually. It
contrasts with continuous deployment.
Continuous Deployment – A software engineering approach in which software functionality is delivered frequently through automated deployments. It contrasts with
continuous delivery.
Continuous Integration – A software engineering practice of merging all developers' working copies to a shared mainline several times a day.
Continuous Testing – A process of executing automated tests as part of the software delivery pipeline to obtain immediate feedback on risks associated with a software release
candidate.
Deployment Management – Planning, scheduling, and control over the movement of software releases in test and live environments.
Fail Fast – A trial-and-error strategy that involves trying something, failing quickly, implementing feedback, and adapting accordingly.
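
The difference between continuous delivery and continuous deployment comes down to who triggers the production release. The sketch below illustrates that difference under stated assumptions: `should_release` and its `mode` and `approved` parameters are hypothetical names, not part of any CI/CD tool.

```python
def should_release(tests_passed: bool, mode: str, approved: bool = False) -> bool:
    """Decide whether a build goes to production.

    mode="delivery":   every passing build is releasable, but a person
                       must approve the release (manual trigger).
    mode="deployment": every passing build is released automatically.
    """
    if mode not in ("delivery", "deployment"):
        raise ValueError(f"unknown mode: {mode!r}")
    if not tests_passed:
        return False  # continuous testing gates both modes
    if mode == "deployment":
        return True
    return approved


print(should_release(True, "delivery"))                 # False: waiting for approval
print(should_release(True, "delivery", approved=True))  # True
print(should_release(True, "deployment"))               # True: no manual step
```

In both modes a failing test run blocks the release, which is where continuous testing fits into the pipeline.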

The CALMS Solution


To overcome most of the barriers to implementing DevOps, teams often turn to the CALMS framework.
Culture – The DevOps culture strives to remove walls and barriers between teams, so they can identify and
address potential problems faster. A successful DevOps implementation starts with addressing culture.
Automation – Automation can help a DevOps team streamline its day-to-day tasks, and minimize
downtime, outages, or other critical incidents. For example, automation can help teams eliminate manual
configuration errors.
Lean – Lean principles applied to DevOps refer to striving for continuous improvement and accepting
failure as part of a systematic approach to everyday operations. For instance, creating
feedback loops and adapting to what they reveal are core elements of DevOps.
Measurement – Measurement is vital for assessing the effectiveness of standard operating procedures (SOPs) and identifying opportunities
for improvement. With the right daily, weekly, monthly, and annual metrics, teams can better understand
their strengths and weaknesses, and explore ways to turn their weaknesses into strengths.
Sharing – DevOps teams operate in an environment where information sharing makes it simple for team
members to stay updated on important issues. For example, they consistently share information to keep all
team members informed about all aspects of an incident from onset to resolution.

Page 15 https://project-management.com/the-definitive-guide-to-devops/
Page 16 https://www.harness.io/blog/devops-tools-lifecycle-mesh
