Page 2
Real world example
https://d3.harvard.edu/platform-rctom/submission/the-failed-launch-of-www-healthcare-gov/
“I’m going to try and download every movie ever made, and you’re going to try to sign up for Obamacare, and we’ll see which happens first” –
Jon Stewart challenging Kathleen Sebelius (former Secretary of Health and Human Services) to a race.
Root Causes
The myriad of problems experienced during the website's rollout were due to several factors, primarily:
• Lack of relevant experience. HHS employees and managers had extensive experience with private insurance markets and with maintaining large
government projects, but not the experience required for technology product launches. Key technical positions were unfilled, and project managers
knew little about the amount of work required or about typical product development processes, which left very little time to test and troubleshoot the website.
• Lack of leadership. There was no formal division of responsibilities between the many government offices involved, which delayed key decisions
or left them poorly communicated once made. For example, the contractor responsible for the login system estimated low demand because the
initial website plan let users shop for products without creating an account or logging in. However, due to technical delays, that functionality
was removed from the initial launch (so all users had to log in), and login capacity was not increased to match.
• Schedule pressure. Because the launch date was mandated in the Affordable Care Act, HHS employees were pressured to launch on time regardless
of how complete the site was or of the amount (and results) of testing and troubleshooting performed.
Page 3
Qualitative vs Quantitative
Qualitative risk analysis tends to be more subjective. It focuses on identifying risks to measure both the likelihood of a specific risk
event occurring during the project life cycle and the impact it will have on the overall schedule should it hit. The goal is to determine
severity. Results are then recorded in a risk assessment matrix (or any other form of an intuitive graphical report) in order to
communicate outstanding hazards to stakeholders.
Risk: Frequent changes in project requirements.
Likelihood: High
Impact: High
Risk Exposure: High
Risk Mitigation: Establish a robust change management process, involve stakeholders in requirement discussions, and educate
stakeholders about the impact of frequent changes.
Quantitative risk analysis uses verifiable data to analyze the effects of risk in terms of cost overruns, scope creep, resource
consumption, and schedule delays.
Risk: Potential cost overrun due to inaccurate initial cost estimates.
Likelihood: 30% (based on historical data).
Impact: $100,000 (estimated additional cost if the risk materializes).
Risk Exposure: $30,000 ($100,000 * 30%)
Risk Mitigation: Use advanced cost estimation techniques, conduct sensitivity analysis, and allocate a contingency budget.
Page 4 https://www.safran.com/blog/whats-the-difference-between-qualitative-and-quantitative-risk-analysis
Before we go into the matrix, risk is all about:
Impact (severity): delays in project delivery, cost overruns, compromised software quality,
decreased customer satisfaction, or project failure.
Probability: Each risk has a certain likelihood or probability of occurring. Some risks are more
likely to materialize than others. If you know something will happen, that is an issue/fact!
Mitigation: Risk mitigation involves taking proactive measures to reduce the probability or
impact of identified risks. Strategies for mitigation may include contingency planning, risk
avoidance, risk transfer (such as insurance), or risk acceptance (choosing not to take action if
the risk is deemed acceptable).
Page 5
Exposure = Severity * Probability
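As a minimal sketch of the formula above, the function below computes exposure for the cost-overrun example from the quantitative slide (30% likelihood, $100,000 impact). The name `risk_exposure` is illustrative and not taken from any particular tool.

```python
def risk_exposure(probability: float, impact: float) -> float:
    """Return exposure = severity (impact) * probability.

    probability is a fraction between 0 and 1; impact is in currency units.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * impact


# Cost-overrun example: 30% likelihood, $100,000 impact.
exposure = risk_exposure(0.30, 100_000)
print(f"Risk exposure: ${exposure:,.0f}")
```

Expressing probability as a fraction rather than a percentage keeps the multiplication direct; the range check catches a percentage passed in by mistake.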
Page 6
Risk examples
Risk: Limited availability of skilled developers.
1. Likelihood: Medium (availability can vary based on market conditions).
2. Impact: High (can lead to project delays and reduced quality).
3. Risk Exposure: Medium.
Mitigation: Cross-train team members, explore outsourcing options, and maintain a contingency plan for resource shortages.
Risk: Potential security vulnerabilities in the software.
1. Likelihood: Medium (depends on the complexity of the software and security measures).
2. Impact: High (can lead to data breaches or system compromises).
3. Risk Exposure: Medium.
Mitigation: Conduct regular security audits, follow best practices for secure coding, and implement security patches promptly.
Risk: Relying on third-party libraries or services.
1. Likelihood: Medium (dependability varies with third-party providers).
2. Impact: Medium (service disruptions can affect project functionality).
3. Risk Exposure: Medium.
Mitigation: Evaluate third-party providers, have backup plans in case of service disruptions, and consider building fallback mechanisms.
Risk: Using new or unproven technologies in the project.
1. Likelihood: Medium (depends on the chosen technologies).
2. Impact: Medium (technological challenges can lead to delays or quality issues).
3. Risk Exposure: Medium.
Mitigation: Conduct thorough technology assessments, pilot new technologies in controlled environments, and have fallback plans if the
technology doesn't work as expected.
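The qualitative ratings above can be read as lookups in a 3x3 risk matrix. The sketch below is one possible encoding, not a standard: the ordinal scores and the thresholds are assumptions chosen so that the results agree with the ratings given in these examples (Medium likelihood with High impact gives Medium exposure).

```python
# Ordinal scores for the qualitative levels (an assumed 3-point scale).
LEVELS = {"Low": 1, "Medium": 2, "High": 3}


def qualitative_exposure(likelihood: str, impact: str) -> str:
    """Rate exposure on a 3x3 matrix by multiplying ordinal scores.

    Assumed thresholds: score 1-2 -> Low, 3-6 -> Medium, 9 -> High.
    """
    score = LEVELS[likelihood] * LEVELS[impact]
    if score <= 2:
        return "Low"
    if score <= 6:
        return "Medium"
    return "High"


# The four risks listed above, as (likelihood, impact) pairs.
examples = {
    "Limited availability of skilled developers": ("Medium", "High"),
    "Potential security vulnerabilities": ("Medium", "High"),
    "Relying on third-party libraries or services": ("Medium", "Medium"),
    "Using new or unproven technologies": ("Medium", "Medium"),
}
for risk, (likelihood, impact) in examples.items():
    print(f"{risk}: {qualitative_exposure(likelihood, impact)}")
```

A team that wants Medium likelihood with High impact to rate as High would only need to lower the upper threshold; the matrix is a communication aid, so the cut-offs should match what stakeholders agree on.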
Page 7
Quantitative
Risk: Risk of a data breach leading to financial losses and reputation damage.
1. Likelihood: 10% (based on security assessments).
2. Impact: $500,000 (estimated cost of a data breach).
Risk Exposure: $50,000 ($500,000 * 10%).
Risk Mitigation: Invest in robust security measures, conduct regular security audits, and have an incident response plan in place.
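With more than one quantified risk, the exposures can be kept in a small risk register and ranked. The sketch below uses the data-breach figures above together with the cost-overrun example from the earlier quantitative slide. The `Risk` class is hypothetical, and treating the summed exposure as a starting point for a contingency budget is an assumption, not something these slides prescribe.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # fraction between 0 and 1
    impact: float       # estimated cost in dollars if the risk materializes

    @property
    def exposure(self) -> float:
        # Exposure = impact (severity) * probability
        return self.probability * self.impact


register = [
    Risk("Cost overrun from inaccurate estimates", 0.30, 100_000),
    Risk("Data breach", 0.10, 500_000),
]

# Rank risks so the largest exposure is reviewed first.
ranked = sorted(register, key=lambda r: r.exposure, reverse=True)
for risk in ranked:
    print(f"{risk.name}: ${risk.exposure:,.0f}")

# Summed exposure; using it as a contingency baseline is an assumption.
total_exposure = sum(r.exposure for r in register)
print(f"Total exposure: ${total_exposure:,.0f}")
```

Note that the data breach ranks first even though it is the less likely of the two: its impact is five times larger.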
Page 8
Technical Risks
Technical Risks: Technical risks are related to the development and implementation of the
software itself. These risks can have a significant impact on the project's success, quality,
and timeline.
Technology Risks: Risks associated with the choice of technology stack, tools, and
frameworks, including the risk of using new or unproven technologies.
Software Complexity: Risks stemming from the complexity of the software architecture,
design, or coding, which can lead to development delays or increased defect rates.
Integration Risks: Risks associated with integrating various software components or
third-party systems, such as compatibility issues and data transfer challenges.
Performance Risks: Risks related to the software's performance, scalability, and
responsiveness, including bottlenecks and resource constraints.
Security Risks: Risks related to vulnerabilities, data breaches, and cyberattacks, which
can compromise the confidentiality and integrity of data.
Page 9
Operational Risks
Operational Risks: Operational risks involve issues that can affect the software's
functionality, maintenance, and usability once it is in production.
Maintenance and Support: Risks related to the ongoing maintenance, updates, and
support of the software, including the availability of skilled personnel.
Scalability and Capacity Planning: Risks associated with accommodating growth in
user numbers or data volume without degradation in system performance.
Data Backup and Recovery: Risks associated with data loss and the ability to recover
data in case of system failures or disasters.
User Training and Adoption: Risks related to user acceptance and proficiency in using
the software, which can impact productivity and satisfaction.
Page 10
External risks
External risks are beyond the control of the project team and are influenced by external factors or
stakeholders.
Market Risks: Risks associated with changes in market conditions, competition, or shifts in user
preferences that can affect the software's relevance and market share.
Regulatory and Compliance Risks: Risks related to changes in laws, regulations, or industry standards
that impact the software's compliance and legal standing.
Supplier and Vendor Risks: Risks arising from dependencies on external suppliers, vendors, or third-
party services that may not meet expectations or experience disruptions.
Economic Risks: Risks tied to economic conditions, such as currency fluctuations, inflation, and
economic downturns, which can impact project budgets and costs.
Project Management Risks: These risks are related to project planning, execution, and resource
management.
Scope Creep: The risk of uncontrolled changes to project scope, potentially leading to timeline and
budget overruns.
Resource Constraints: Risks related to the availability and allocation of human and material
resources, including skilled team members.
Schedule Risks: Risks associated with delays in project milestones or deadlines.
Budget Risks: Risks related to cost overruns or financial constraints that may impact project success.
Page 11
Risk Avoidance:
Risk avoidance involves taking proactive steps to eliminate or completely avoid the risk and its potential negative consequences. This strategy
is often used when the risk is deemed too significant or when it conflicts with the project's objectives or constraints.
Example: If a software project team identifies a high-risk feature that is not essential for the project's success, they may choose to
avoid the risk by excluding that feature from the project scope.
Risk Transfer:
Risk transfer involves shifting the responsibility for managing a risk to another party. This strategy is commonly used when a third party, such
as an insurance provider or a subcontractor, can better handle the risk.
Example: Purchasing insurance for a project to transfer the financial risk of certain events, like natural disasters, to the insurance
company.
Risk Acceptance:
Risk acceptance is a strategy where the project team or organization acknowledges the risk and its potential impact but decides not to take
any specific action to mitigate it. This strategy is typically used when the risk is low in impact or likelihood, and the cost of mitigation exceeds
the potential losses.
Example: Accepting the risk of minor delays in a software project due to potential team member illness because it's not cost-
effective to have a backup for every team member.
Risk Mitigation:
Risk mitigation aims to reduce the likelihood or impact of a risk by taking proactive measures. It's one of the most common risk management
strategies and is often applied when a risk is significant but manageable.
Example: Implementing regular code reviews and automated testing processes to mitigate the risk of software defects and
vulnerabilities.
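The choice between acceptance and mitigation can be sketched as a comparison between what a mitigation costs and how much exposure it removes. This is a simplified illustration: the function name and all dollar figures are hypothetical, and avoidance and transfer are not modeled.

```python
def choose_response(
    exposure_before: float,
    exposure_after: float,
    mitigation_cost: float,
) -> str:
    """Return "mitigate" if the exposure reduction exceeds the cost, else "accept"."""
    reduction = exposure_before - exposure_after
    return "mitigate" if reduction > mitigation_cost else "accept"


# Hypothetical figures, for illustration only.
# Code reviews and automated tests: exposure drops from $50,000 to $20,000
# at a cost of $15,000.
print(choose_response(50_000, 20_000, 15_000))
# A backup for every team member: exposure drops from $5,000 to $1,000
# but costs $40,000.
print(choose_response(5_000, 1_000, 40_000))
```

The first case favors mitigation and the second favors acceptance, mirroring the illness example above. In practice the decision also depends on factors a single number does not capture, such as reputation or legal obligations.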
In traditional software development practices, development and operations teams operated in silos. Developers
focused on creating and enhancing software, while operations teams were responsible for deploying and maintaining it
in production environments.
Deployment Frequency: Developers often aim for frequent releases, while operations prefer fewer, more stable deployments.
Risk Tolerance: Developers may be more willing to take risks to introduce new features, whereas operations prioritize risk mitigation.
Monitoring and Alerts: Developers and operations have different expectations regarding monitoring, alerting thresholds, and incident
response.
Documentation: Operations teams often prioritize thorough documentation for system configurations and maintenance, which may not
align with developer practices.
Tooling: Developers and operations may have different preferences for the tools and technologies used in the development and
deployment pipelines.
Page 13
Mitigating the Wall of Confusion:
Cultural Shift: DevOps emphasizes a cultural shift where teams collaborate, share
responsibilities, and collectively own the entire software delivery lifecycle.
Automation: Automating manual tasks, including testing, deployment, and
infrastructure provisioning, can reduce miscommunication and errors.
Shared Metrics: Establishing shared performance metrics and KPIs that both Dev
and Ops teams can monitor helps align their goals.
Cross-Functional Teams: Create cross-functional teams where developers and
operations professionals work together, fostering mutual understanding and trust.
Page 14
DevOps culture - terminology
Continuous Delivery – A software engineering approach in which software is produced in short cycles, ensuring that software can be reliably released manually at any time. It
contrasts with continuous deployment.
Continuous Deployment – A software engineering approach in which software functionalities are delivered frequently through automated deployments. It contrasts with
continuous delivery.
Continuous Integration – A software engineering practice of merging all developers' working copies into a shared mainline several times a day.
Continuous testing – A process of executing automated tests as part of the software delivery pipeline to obtain immediate feedback on risks associated with a software release
candidate.
Deployment management – Planning, scheduling, and control over the movement of software releases in test and live environments.
Fail fast – A trial-and-error strategy that involves trying something, failing quickly, implementing feedback, and adapting accordingly.
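The difference between continuous delivery and continuous deployment comes down to the final release step. The sketch below is a toy model, not a real CI/CD tool: `run_pipeline`, the `auto_deploy` and `approved` flags, and the stand-in stages are all hypothetical names introduced for illustration.

```python
from typing import Callable, List


def run_pipeline(
    stages: List[Callable[[], bool]],
    auto_deploy: bool,
    approved: bool = False,
) -> str:
    """Run build/test stages, then decide whether to release.

    auto_deploy=True models continuous deployment (release is automatic).
    auto_deploy=False models continuous delivery (release waits for a
    manual approval, represented here by the `approved` flag).
    """
    for stage in stages:
        if not stage():
            # Stop at the first failing stage; nothing is released.
            return f"failed at {stage.__name__}"
    if auto_deploy or approved:
        return "deployed"
    return "ready for manual release"


def build() -> bool:
    return True  # stand-in for a real build step


def run_tests() -> bool:
    return True  # stand-in for an automated test suite


print(run_pipeline([build, run_tests], auto_deploy=True))
print(run_pipeline([build, run_tests], auto_deploy=False))
print(run_pipeline([build, run_tests], auto_deploy=False, approved=True))
```

In both modes the automated stages are identical, which reflects the definitions above: continuous delivery keeps the software releasable at any time, and continuous deployment removes the manual decision.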
Page 15 https://project-management.com/the-definitive-guide-to-devops/
Page 16 https://www.harness.io/blog/devops-tools-lifecycle-mesh