Contributors
Bruce Cowper, Microsoft Trustworthy Computing
Cristin Goodwin, Microsoft Trustworthy Computing
Tim Rains, Microsoft Trustworthy Computing
The Microsoft Malware Protection Center
Andrew Cushman, Microsoft Trustworthy Computing
William Howerton, Good Harbor Security Risk Management
Travis Scoles, Schireson Associates
Dave Forstrom, Microsoft Trustworthy Computing
Jacob Olcott, Good Harbor Security Risk Management
Neil Shah, Schireson Associates
This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT. This document is provided as-is. Information and views expressed in this document, including URL and other Internet website references, may change without notice. You bear the risk of using it. Copyright 2013 Microsoft Corporation. All rights reserved. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
Foreword
This special edition of the Microsoft Security Intelligence Report (SIR) was authored by Microsoft's Global Security Strategy and Diplomacy (GSSD) team. GSSD works collaboratively with governments, multilateral organizations, industry, and non-profit groups to enhance security across the cyber ecosystem. Leveraging technical depth and public policy expertise, GSSD supports public and private sector initiatives that promote trustworthy plans and policies, resilient operations, and investments in innovation. While Microsoft has long reported on the technical measures of cybersecurity through the SIR and other sources of information, we have been looking to better understand the full environment that leads to a given cybersecurity outcome. We believe that outcome depends on a range of technical and non-technical measures, including use of modern technology, mature processes, user education, law enforcement, and public policies related to cyberspace. Each of these measures may contribute directly or indirectly to the cybersecurity performance measures reported in the SIR. This paper introduces a methodology for examining how the non-technical socio-economic factors in a country or region impact cybersecurity performance. With this methodology we can build a model that we hope can help predict the expected cybersecurity performance of a given country or region based on non-technical socio-economic data. From that prediction, we can attempt to better understand the public policies that distinguish the performance of different countries and regions. We are excited by the initial results of our research, which demonstrate significant differences in security outcomes between countries that have, for example, signed or ratified the Council of Europe Convention on Cybercrime and those that have not. Both policy makers and technology experts face increasing demands for innovation and impact.
It is our hope that this work catalyzes additional research into the holistic factors impacting cybersecurity around the world as well as a data-driven approach to policy making.
Paul Nicholas, Senior Director of Global Security Strategy and Diplomacy, Trustworthy Computing, Microsoft
Tim Rains, Director, Trustworthy Computing, Microsoft
Introduction
The world is in the midst of an unprecedented technological transition, characterized by growth in the volume and diversity of people, devices, and data connected to the Internet. Across the globe, billions of people are using information and communications technology (ICT) infrastructure to conduct business and to interact with governments and each other. The World Economic Forum recently observed that more than 70 percent of the world's citizens live in societies that have just begun their digitization journeys.[1] With so many people moving towards an increasingly digital lifestyle, the world that emerges at the conclusion of this transition will likely be very different from the world we know today. Cybersecurity is critical for the success of the world's digital future. Building a safer, more trusted Internet nationally and internationally requires policymakers, business decision makers, and ICT providers to collectively develop technical and policy solutions that will enable citizens, enterprises, and governments to meet their computing objectives in a secure, private, and reliable manner. Over the past decade, national policymakers and the international policy community have undertaken a variety of initiatives that have been fundamental to establishing effective non-technical cybersecurity public policy. As a company, Microsoft has participated in many of these initiatives because we believe these efforts improve and enhance global cybersecurity. Through our participation, we have come to appreciate the difficulty that policymakers face when evaluating the success of initiatives designed to reduce cyber risks today and in the future. Understanding whether certain policies can measurably reduce cyber risks at a national level is a critical exercise for policymakers seeking effective solutions to these challenges. In this vein, Microsoft set out to create a methodology to evaluate the impact of policy solutions on national cybersecurity efforts.
Using a reasonable statistical measurement for evaluating cybersecurity on a national level, a framework was created to examine various factors that distinguish levels of cybersecurity performance among countries and to identify whether adoption of certain policies or strategic actions is related to cybersecurity performance. The results of our analysis have implications for current and future policy initiatives. We found that countries adopting or implementing certain policies, including international treaties like the Council of Europe Convention on Cybercrime and voluntary codes of conduct like the London Action Plan, are more likely to over-perform on a key cybersecurity metric compared to countries that have not adopted the same policies. For policymakers seeking ways to improve national cybersecurity, these policies represent activities that are likely to have a meaningful and measurable impact. While we believe that these specific policy actions are critical steps for policymakers to consider when addressing cybersecurity on a national level, the manner in which these policies were created and adopted, through international partnership or joint public/private efforts, likely serves as an important model for how successful cybersecurity policies might be created in the future. Recognizing the limitations of our study, we nevertheless hope that this whitepaper adds value to other efforts to form more reliable risk reduction metrics in cyberspace and serves as a useful tool for national policymakers considering various approaches towards achieving greater cybersecurity.
[1] http://www3.weforum.org/docs/Global_IT_Report_2012.pdf
[2] Cybersecurity:
Since Q1 of 2011, the CCM has been reported based on geographic location rather than the administrator-defined location. http://blogs.technet.com/b/security/archive/2011/11/15/determining-the-geolocation-of-systems-infected-with-malware.aspx
Microsoft Security Intelligence Report Volume 12: July - December 2011. http://www.microsoft.com/security/sir/archive/default.aspx
Figure 1 - Infection rates by country/region in 4Q11, by CCM
CCM, like other technical cybersecurity metrics used in the industry, is an imperfect one. For instance, CCM does not measure and report important cybersecurity outcomes, including actual damage caused by infections. While we chose to use the CCM metric as an indicator of cybersecurity for purposes of our study, we hope that industry, government, and academia continue developing other useful metrics in order to create a more complete understanding of the impact of cyber risk.
[Table: Correlation with CCM for selected indicators: -0.6, -0.5, -0.5, 0.6, -0.5, -0.6, -0.5, -0.3, -0.3, -0.5]
The elements of the graph include:
2011 Average CCM - Along the X-axis is the average of the quarterly CCM figures reported in the SIR for 2011.
Expected/Predicted CCM - Along the Y-axis is the predicted level of cybersecurity for each country. This accounts for the variation among countries and gives an expected/predicted CCM number based on the 34 variables identified above.
Model Line - The diagonal line from the lower-left to the upper-right of the graph represents a perfect fit of the model. If we were able to perfectly predict the level of cybersecurity performance for each country, each country would fall on this line.
Strength of Our Predictive Model
The strength of this model is expressed by the term R², which describes how much of the predicted value can be explained by the regression formula. R² generally ranges from 0 to 1: an R² of 0 indicates no predictive power, 0.1-0.3 weak prediction, 0.4-0.6 moderate prediction, and 0.7-1 strong prediction. Our model has an R² of 0.68, indicating moderate predictive ability. While purely scientific studies may strive for R² values of 0.9 or above, we consider our model to be a good starting point for this discussion.
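As an illustration of the measure discussed above, the following minimal sketch computes R² from actual and predicted values using the standard definition, 1 minus the ratio of residual to total sum of squares. The function names and sample numbers are hypothetical and are not taken from the study. The band boundaries follow the rough ranges quoted in the text, with the gaps between bands assigned to the lower band as an assumption (consistent with 0.68 being described as moderate).

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values have no variance")
    return 1.0 - ss_res / ss_tot


def describe_strength(r2):
    """Map R² onto the rough bands quoted in the text.

    Assumption: values that fall between two quoted bands are
    assigned to the lower band.
    """
    if r2 >= 0.7:
        return "strong"
    if r2 >= 0.4:
        return "moderate"
    if r2 >= 0.1:
        return "weak"
    return "none"


if __name__ == "__main__":
    # Hypothetical CCM values for four countries, for illustration only.
    actual = [10.0, 6.0, 8.0, 4.0]
    predicted = [9.0, 7.0, 7.0, 5.0]
    r2 = r_squared(actual, predicted)
    print(round(r2, 2), describe_strength(r2))
```

With this mapping, the R² of 0.68 reported for the model falls in the moderate band.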
Since the model is not perfect, individual countries are on, above, or below the model line. Countries above the line are considered to be out-performing the model. That is, their actual levels of cybersecurity performance are better (lower CCM) than our model predicts based on the non-technical indicators. Conversely, countries located below the line are under-performing the model.[6] Their actual levels of cybersecurity are worse (higher CCM) than our model had predicted. We then used latent class segmentation[7] to classify each country into one of three clusters, based on both their actual and predicted CCM. The end result is a model with three distinct clusters of countries, which we call Maximizers, Aspirants, and Seekers.
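The comparison against the model line can be sketched as a simple check on the gap between predicted and actual CCM. This is only an illustration: the function name, the optional tolerance, and the sample values are hypothetical, and the study itself groups countries with latent class segmentation rather than with a fixed threshold.

```python
def performance(actual_ccm, predicted_ccm, tolerance=0.0):
    """Compare actual CCM with the model's prediction.

    Lower CCM is better, so a country whose actual CCM is below its
    predicted CCM sits above the model line (out-performing).
    The tolerance parameter is an assumption added for illustration.
    """
    gap = predicted_ccm - actual_ccm
    if gap > tolerance:
        return "out-performing"
    if gap < -tolerance:
        return "under-performing"
    return "on the model line"


if __name__ == "__main__":
    # Hypothetical values: predicted CCM of 7.0 in each case.
    for actual in (4.0, 7.0, 12.0):
        print(actual, performance(actual, 7.0))
```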
Note on our methodology: We expect that countries' positions on the chart will change over time as both non-technical and technical conditions evolve. We also expect that CCM changes will be more frequent and erratic relative to some of the other indicator variables; this is based on past observations of CCM fluctuating between quarters relatively more than other government indicators, such as GDP. For this reason, we have chosen to model and report on annualized averages where possible, as this minimizes potentially misleading data that is a direct result of quarterly fluctuation. In some cases, the predicted CCM is extremely low, and potentially below 0, which cannot happen from a practical standpoint. This is a result of using a linear regression model: the model cannot understand that the practical floor for CCM is 0. From a real-world standpoint, negative CCM results should be interpreted as a small positive number that is approaching zero.
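The two adjustments described in this note can be sketched as follows. The function names and sample values are hypothetical. Flooring a negative prediction at exactly 0 is a simplification assumed here; the note only says such values should be read as a small positive number approaching zero.

```python
def annual_average(quarterly_ccm):
    """Average the quarterly CCM values to smooth quarter-to-quarter swings."""
    if not quarterly_ccm:
        raise ValueError("at least one quarterly value is required")
    return sum(quarterly_ccm) / len(quarterly_ccm)


def interpret_prediction(predicted_ccm, floor=0.0):
    """Read a prediction from a linear model against the practical floor.

    Assumption: negative predictions are reported at the floor (0),
    standing in for 'a small positive number approaching zero'.
    """
    return max(predicted_ccm, floor)


if __name__ == "__main__":
    # Hypothetical quarterly CCM values for one country in one year.
    print(annual_average([8.0, 6.0, 7.0, 7.0]))
    print(interpret_prediction(-1.3))
    print(interpret_prediction(2.5))
```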
[7] Vermunt, Jeroen K. and Jay Magidson. Latent Class Models for Classification.
In latent class segmentation, we create variables (known as latent variables) and assign each country to one of them. The latent variables act to explain the variance between actual and predicted CCM; countries with similar variance are grouped together. The optimal clustering model is determined by maximizing the explainable difference, and is found by testing varying numbers of latent variables (that is, varying numbers of clusters) and varying combinations of countries included in each cluster.
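To make the idea of latent classes more concrete, here is a minimal sketch that fits a one-dimensional Gaussian mixture with the EM algorithm, which is one simple kind of latent class model. It is not the procedure used in the study: the software and settings used there are not described, the sketch clusters only the gap between predicted and actual CCM rather than both values, and it fixes the number of classes at three instead of testing several. All names and data values are hypothetical.

```python
import math


def posteriors(x, weights, means, variances):
    """Probability that value x belongs to each latent class."""
    dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    if total == 0.0:  # numerically far from every class
        return [1.0 / len(dens)] * len(dens)
    return [d / total for d in dens]


def fit_mixture(values, k=3, iterations=100):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, variances)."""
    data = sorted(values)
    n = len(data)
    if n < k:
        raise ValueError("need at least as many values as classes")
    # Deterministic start: spread the initial means across the sorted data.
    means = [data[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    spread = (data[-1] - data[0]) or 1.0
    variances = [(spread / (2 * k)) ** 2] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: class membership probabilities for every value.
        resp = [posteriors(x, weights, means, variances) for x in data]
        # M-step: re-estimate each class from its weighted members.
        for j in range(k):
            total = sum(r[j] for r in resp)
            if total < 1e-12:
                continue  # leave an empty class unchanged
            weights[j] = total / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / total
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / total
            variances[j] = max(var, 1e-6)  # guard against collapse
    return weights, means, variances


def assign(x, model):
    """Index of the most probable class for value x."""
    post = posteriors(x, *model)
    return max(range(len(post)), key=post.__getitem__)


if __name__ == "__main__":
    # Hypothetical gaps (predicted CCM minus actual CCM) for 13 countries.
    gaps = [-6.1, -5.4, -5.0, -4.6, -0.5, -0.2, 0.1, 0.3, 0.4,
            4.8, 5.1, 5.5, 6.0]
    model = fit_mixture(gaps, k=3)
    print([assign(g, model) for g in gaps])
```

In this toy setup, the class with the most negative mean gap would correspond to under-performance and the class with the most positive mean gap to over-performance. Selecting the number of classes, which the text describes as part of finding the optimal clustering model, is left out of the sketch.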
Figure 3 - Cluster Analysis of Cybersecurity Performance
Maximizers: Maximizers are countries with more effective cybersecurity capabilities that outperform the model. This cluster has a moderate level of predicted cybersecurity, but it has the best relative cybersecurity performance of all clusters. This over-performance of the model is the defining attribute of the cluster. Within the countries that comprise the cluster, we see that they often have better performance in key indicator variables (as defined by CHAID analysis,[8] which determines the strength of relationship between predictor variables and cluster membership), including personal computers in use per capita, health expenditure per capita, regime stability, and broadband penetration. Maximizers include a relatively high percentage of European countries.
[8] G. V. Kass, Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 29, No. 2 (1980), pp. 119-127. Published by Wiley for the Royal Statistical Society. Stable URL: http://www.jstor.org/stable/2986296
Aspirants: Aspirants are countries that are on par with the model and are still developing cybersecurity capabilities. This cluster has a moderate level of predicted cybersecurity, and in reality it performs on par with those predictions. This predictability of cybersecurity performance is the defining attribute of the cluster. Of the three clusters, Aspirants is also the largest. Within the countries that comprise the cluster, we see that they often have average to above-average performance in key indicator variables, including broadband speed, secure Internet servers per capita, R&D expenditure, and consumer telecommunications expenditure. Countries from around the world comprise the Aspirants cluster, but it contains a slightly higher percentage of Latin American/Caribbean nations than the others.
Seekers: Seekers are countries with higher cybersecurity risk that underperform on model expectations. While this cluster has a moderate to low level of predicted cybersecurity, in reality it has a low level of cybersecurity, as measured by high CCM. As such, Seekers underperform with regard to their cybersecurity potential. Of the three, the Seekers cluster is the smallest. The countries that comprise the cluster often perform poorly in key indicator variables, including literacy, offences (crime) per capita, broadband speed, and broadband penetration. Compared to the key attributes of Aspirants, we see that Seekers may be less likely to invest in technological infrastructure development. Countries from around the world comprise the Seekers cluster, but it contains a higher percentage of Middle Eastern/African nations than the others.
[9] http://londonactionplan.org/the-london-action-plan/
[10] http://www.bsa.org/country/Research%20and%20Statistics/~/media/5536D2D93FA746E69CBC12ECBCE0F319.ashx
[11] http://portal.bsa.org/globalpiracy2011/downloads/study_pdf/2011_BSA_Piracy_Study-InBrief.pdf
[12] http://www.microsoft.com/security/sir/story/default.aspx#!unsecure_distribution
resilience. Sophisticated attacks against the U.S. government resulted in the creation of the Comprehensive National Cybersecurity Initiative (CNCI) in 2008, an effort representing a significant increase in policy, operational, and financial commitments that spanned the whole of government. As attacks continued, militaries increasingly looked to develop specific military doctrines, policy statements, or military strategies related to cyberspace.[13] By 2011, 33 countries had done so.
                                      Maximizers   Aspirants   Seekers
Piracy                                             62%         68%
London Action Plan Membership                      20%         10%
COE Convention on Cybercrime                       17%         7%
Defense Strategy for Cybersecurity                 15%         21%
[13] James A. Lewis and Katrina Timlin, Center for Strategic and International Studies, Cybersecurity and Cyberwar: Preliminary Assessment of National Doctrine and Organization, in Resources: Ideas for Peace and Security (U.N. Inst. for Disarmament Research, 2011), http://www.unidir.org/pdf/ouvrages/pdf-1-929045-011-J-en.pdf.
Plan aims to promote international cooperation in addressing spam, online fraud, and malware.[9] Rather than create new legally binding obligations for members, the Plan outlines activities for both public and private sector participants to fight spam, fostering better cooperation between organizations in order to defend against cyber threats. Forty-six percent of the countries in the over-performing cluster are members of the London Action Plan. As with the COE signatory trends, membership in the London Action Plan is linked with CCM performance relative to expectations, which implies a relationship between membership and relative cybersecurity. While the relationship between CCM performance and the London Action Plan may not be causal, we can say that membership in the London Action Plan is part of the profile of a country that has relatively good cybersecurity.
Piracy Rate
Though we did not evaluate individual policy approaches towards reducing piracy, the average piracy rate of countries in the low-CCM cluster was drastically lower than in the other clusters. The implications of this observation are complex. Countries that do a better job managing cybersecurity may also do a better job mitigating piracy, or countries with higher piracy rates may have a more difficult time containing malware and other cyber threats. This is a topic for further research, but we found the relationship between piracy rates and CCM scores compelling enough to highlight here. Unlike the other profiling factors discussed above, piracy rate is an outcome rather than a policy tool. However, this does show the potential benefit of protecting intellectual property, as higher rates of piracy are positively correlated with higher CCM. This is unsurprising, as pirated software poses a serious security risk to its users. A 2008 study by the Harrison Group found that companies that used unlicensed software were 73 percent more likely than companies that use fully licensed software to experience loss or damage of sensitive data, and were 43 percent more likely to suffer critical computer failures.[14]
[14] http://go.microsoft.com/fwlink/?LinkId=143927
Figure 5 - Progression of Cybersecurity Policy
As policymakers consider future initiatives designed to impact national cybersecurity, it will be important to draw lessons from the policy discussions of the previous decade. Policymakers should pay particular attention to the lessons from policies that this study identifies as positively correlated with national cybersecurity performance, such as the Council of Europe Convention on Cybercrime and the London Action Plan. As a participant in some of these initiatives, our company has observed firsthand the reasons for their effectiveness, and offers the following impressions:
Evolving Context for Cybersecurity Policy: New Demographics of Global Internet Users
In considering cybersecurity policy initiatives, it is important for policymakers to consider the global demographics of Internet users. During the creation of many of the initiatives noted above, such as the Council of Europe Convention on Cybercrime and the London Action Plan, Internet users were largely concentrated in North America and Western Europe. Because of this and other factors, countries in those regions took leading roles in developing and driving global cybersecurity policy initiatives. However, in coming years, shifts in Internet user demographics will create new centers of gravity in the global online population. As demonstrated in the data visualization (figure 6),[16] which shows a map of the world in 2020 with countries sized by their relative population of Internet users and colored according to the total number of Internet users relative to their population, countries such as China, India, Nigeria, and other emerging economies will be home to the bulk of global Internet users.
This shift in demographics does not mean that these new centers of gravity will necessarily drive policy initiatives, but it does mean that global-scale initiatives as well as some regional and national-level initiatives will need to be responsive to these emerging demographic changes. More than ever, policymakers will have to consider the unique and diverse perspectives that different countries bring to cybersecurity while maintaining currently established policy frameworks that have proven key to promoting the growth of the global ICT industry.
[16] Please see Appendix C for this map-style data visualization, which includes an explanation of the relative sizing and coloring of countries. Additionally, Appendix C includes similar data visualizations for a subset of countries during the years 2000, 2005, 2010, and 2015, to demonstrate the growth of Internet user populations.
Conclusion
Though it is hard to predict exactly what the digital world will look like in the decades ahead, strong cybersecurity will be critical to its successful existence. Policymakers around the globe are faced with the difficult challenge of creating policies that positively impact their national cybersecurity. Knowing which types of initiatives have the greatest positive impact on cybersecurity will allow policymakers to make informed, results-based policy decisions. In reviewing qualitative and quantitative impacts on national cybersecurity, this paper seeks to place policy decisions alongside a framework of technical and demographic projections to create a view of what the future environment for policymaking could look like. By identifying the underlying principles of certain policies that are correlated with over-performance in cybersecurity, such as intergovernmental frameworks for cooperation and voluntary codes of conduct, policymakers can develop future approaches that are more likely to be effective in combating the evolving threats in cyberspace. To meet our future security challenges in cyberspace, Microsoft urges governments to participate in a broader dialogue on normative standards to better protect citizens on the Internet that includes perspectives from the ICT industry. This process develops rules of behavior in cyberspace that can reduce threats, increases confidence and trust, and helps improve security of the cyber ecosystem at the international level. As discussed in this paper, CCM is a rough approximation of the attack surface for a particular country or region. Industry and governments can work in partnership to reduce this attack surface and make the computing infrastructure less susceptible to attack and compromise.
Appendix A: Methodology
In order to test the predictability of CCM given non-technical measures, we used linear regression modeling. A regression analysis allows us to build a model that shows the predicted impact on CCM as the various indicator variables (such as GDP, Computers Per Capita, etc.) fluctuate. By solving for a universal starting point (known in regression analysis as the constant), we were then able to use the model to predict CCM at the country level, with differences in predicted CCM across countries being driven by differences in the indicator variables (e.g., given the GDP Per Capita, Computers Per Capita, etc., we can predict CCM). There are several existing approaches to regression modeling, each with its own set of advantages. The type of analysis we used to build the model was Correlated Component Regression (CCR). CCR modeling differs from other regression techniques in that, instead of constructing a relationship between the dependent variable (in this case CCM) and the individual predictors (in this case, the indicator variables), CCR constructs relationships between the dependent variable and a number of components, which are latent variables created by the model. Each component includes all of the predictor variables in the model (GDP, etc.), but the weighting of each predictor varies from component to component (a similar concept to principal component analysis). As a result, some components may be more heavily representative of particular indicators, such as GDP, while other components are more heavily representative of other indicators, such as Facebook or IE6 usage.
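The structure described above, a constant plus contributions from components that each weight every predictor, can be sketched as follows. This shows only how a prediction is assembled once the weights are known, and how such a model collapses to one coefficient per indicator; it does not show how CCR estimates the weights. All names and numbers are hypothetical.

```python
def component_scores(component_weights, indicators):
    """Each component is a weighted sum of every indicator."""
    return [sum(w[name] * indicators[name] for name in w)
            for w in component_weights]


def predict_ccm(constant, component_coefs, component_weights, indicators):
    """Predicted CCM = constant + sum of coefficient * component score."""
    scores = component_scores(component_weights, indicators)
    return constant + sum(b * s for b, s in zip(component_coefs, scores))


def collapsed_coefficients(component_coefs, component_weights):
    """Equivalent per-indicator coefficients of the same linear model."""
    names = component_weights[0].keys()
    return {name: sum(b * w[name]
                      for b, w in zip(component_coefs, component_weights))
            for name in names}


if __name__ == "__main__":
    # Hypothetical, already-scaled indicator values for one country.
    indicators = {"gdp_per_capita": 4.0, "computers_per_capita": 2.0}
    # Two hypothetical components, each weighting both indicators.
    weights = [
        {"gdp_per_capita": 0.5, "computers_per_capita": 0.25},
        {"gdp_per_capita": -0.25, "computers_per_capita": 1.0},
    ]
    coefs = [-2.0, -1.0]   # hypothetical coefficient for each component
    constant = 10.0        # hypothetical regression constant
    print(predict_ccm(constant, coefs, weights, indicators))
    print(collapsed_coefficients(coefs, weights))
```

Because every step is linear, the component form and the collapsed per-indicator form give the same prediction, which is why the results can still be read as one coefficient per predictor.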
We chose to use CCR modeling because it offers an advantage over other techniques: it reduces potential error created by datasets that have a large number of correlated predictors (such as computers per capita and % of population with an Internet-connected computer) relative to data points. This was beneficial for this dataset, given that we used 34 predictors to predict CCM, based on 106 countries/regions. The first stage of analysis was a step down analysis, designed to identify those indicators that are most important to the model. By using step down analysis, we were able to reduce the number of indicators from 80 to 34. In this case, step down analysis was run by creating a model with all indicators, identifying the 1% of indicators that were least important to the model, and removing those variables from the model. This process was repeated until the model with the best fit was identified. When identifying the model of best fit, we used a methodology commonly known as cross validation. We did this to measure not only how well our model could predict the data we fed into it (the 106 locations' predictor variable data), but also how well it could predict data outside the sample (e.g., how well it could predict CCM performance for countries/regions that we aren't testing). Cross validation, commonly known as K-Fold cross validation, works by using pieces of the dataset to test results. In cross validation, we divide the data into a number (represented by the variable K) of equally sized folds of random cases, and then apply the model to see how well K-1 folds predict the final fold. In this analysis, K=10. In simpler terms, we repeatedly tested how well a random 90% of the data predicted the remaining 10% to determine fit. We used this methodology to optimize the model tuning parameters (number of components and predictor variables), as well as to identify the model of best fit. As the final step, we ran a cross-validated 5-component model.
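The K-fold procedure described above can be sketched in a few lines. To keep the example self-contained, an ordinary one-predictor least-squares line stands in for the CCR model, and mean squared error on the held-out folds stands in for the fit measure; both are assumptions made for illustration, as are the function names, the seed, and the data.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_mse(xs, ys, k=10, seed=0):
    """Mean squared prediction error over K held-out folds."""
    if not 2 <= k <= len(xs):
        raise ValueError("k must be between 2 and the number of cases")
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)        # random assignment of cases
    folds = [order[i::k] for i in range(k)]   # K folds of near-equal size
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in order if i not in held]
        # Fit on the other K-1 folds, then predict the held-out fold.
        intercept, slope = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        squared_errors.extend((ys[i] - (intercept + slope * xs[i])) ** 2
                              for i in held_out)
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    # Hypothetical data: one indicator and a CCM-like outcome for 20 cases.
    xs = [float(i) for i in range(20)]
    ys = [12.0 - 0.5 * x + (1.0 if i % 4 == 0 else -1.0)
          for i, x in enumerate(xs)]
    print(k_fold_mse(xs, ys, k=10))
```

With K=10, each round fits on roughly 90% of the cases and scores the remaining 10%, matching the description in the text. In a tuning setting, the same routine would be run for each candidate configuration and the one with the lowest held-out error kept.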
The results were interpreted in the same way as other regressions: each predictor coefficient determined the impact on CCM. Coefficients may not have been directionally consistent with correlations; this is because some of the predictor variables in the model help explain otherwise unexplainable variance in other predictor variables, as opposed to directly predicting CCM. These types of variables are commonly known as suppressor variables, since they suppress otherwise inherent error in some of the predictor variables included in the model, and help to improve overall model accuracy. As a result, the accuracy of the model lies in the overall prediction in aggregate, and not in the direct relationship with any specific indicator.
-0.6
Fixed broadband connections per 100 people
The contracted capacity of international connections between countries for transmitting Internet traffic
2010
-0.3
2008
-0.6
Corruption
-0.5
0.6 -0.3
Percent of households who own a personal computer
Corruption perceptions index relates to perceptions of the degree of corruption as seen by business people and country analysts, and ranges between 10 (highly clean) and 0 (highly corrupt).
Pressures on the population, such as disease and natural disasters, that make it difficult for the government to protect its citizens or that demonstrate a lack of capacity or will.
Number of Facebook users
Foreign direct investment is net inflows of investment to acquire a lasting management interest in an enterprise operating in an economy other than that of the investor.
Adjusted by the moving average volatility.
2010
2010
Transparency International
2009 2011
Failed States Index
Socialbakers.com
World Development Indicators
International Monetary Fund
2008
GDP Per Capita
Government Type
Gross Income Per Capita
Health Expenditure Per Person
-0.3
Gross domestic product per capita, current prices
The extent to which a society is autocratic or democratic.
Income before taxes from all sources.
Health expenditure per capita with external aid
High-tech exports as a percentage of manufactured exports.
Information and Communication Technology exports as a percentage of total goods exports.
Internet Explorer 6 usage share
2011
Polity IV
Euromonitor International
World Health Organization
World Development Indicators
World Development Indicators
Microsoft
World Development Indicators
Hi-tech Exports
-0.3
2008
-0.2 -0.2
2008 2011
-0.2
2008
-0.4 -0.5
Life Expectancy at Birth
Adult Literacy Rate
Domestic consumption plus country exports minus country imports.
Number of offences per 100,000 people. Offence refers to any act which is punishable under law. The number includes both criminal and administrative offences.
Percent of households with a broadband Internet connection via home computer.
Refers to labor productivity, i.e. output of goods and services in the economy per employed person
Expenditures for research and development are current and capital expenditures (both public and private) on creative work undertaken systematically to increase knowledge, including knowledge of humanity, culture, and society, and the use of knowledge for new applications.
The number of years since the most recent regime change.
Measures the extent of regulation within the business sector. It captures general regulation with respect to investment and competition.
Royalty and license fees are payments and receipts between residents and non-residents for the authorized use of intangible, non-produced, non-financial assets and proprietary rights.
The extent to which individuals within a society respect property rights, the police, and the judiciary system, as well as the quality of police and legal safeguards
2010 2010
Market Size
-0.6
2008
Euromonitor International
Euromonitor International
World Development Indicators
Euromonitor International
Euromonitor International
Euromonitor International
-0.5 -0.4
2008 2008
World Development Indicators
Polity IV
World Bank Governance Indicators
World Development Indicators
World Bank Governance Indicators
World Development Indicators
Euromonitor International
World Development Indicators
World Development Indicators
Euromonitor International
Regulation
-0.5
2008
Royalty Receipts
-0.4
2008
Rule of Law
-0.5
2008
-0.3 -0.4
2008 2010
-0.5
Secure Internet servers per one million people.
Start-up business costs measured as share of gross national income per capita
Consumer Expenditure on Telecommunications Services
Group-based inequality, or perceived inequality, in education, jobs, and economic status. Also measured by group-based poverty levels, infant mortality rates, and education levels.
2008
0.2 -0.3
2009 2010
0.6
2009
-0.3
2008