Centralizing the Decentralized: The Value Implications of Single Sign-on Services

Kartik Rishi School of Information Informatics - HCI kartik@kartikrishi.com Teresa Lam School of Information Informatics - HCI teresahl@uw.edu

Abstract
The decentralized nature of the internet is its greatest strength, driving its continued growth and value. A developing trend, however, is the use of points of authority that house a user's identity and interface with other web services to authenticate that identity. These fall under the title of Single Sign-On (SSO) services, which allow a user to log in on many different sites with one account. We examine major SSO integrators to see how they use SSO to provide value to users and how they benefit from having that system in place. We also examine the data use policies of SSO providers to understand how the industry in general treats users and their data. We then follow up with a study of SSO usage through the lens of actual users. By combining all of this data, we develop a set of best practices that help users be more informed about how their data is used and how they can serve their own personal values and interests.

Scott Kuehnert School of Information Informatics - HCI jscottkuehnert@gmail.com

Augustus Yuan School of Information Informatics - HCI augbog@uw.edu

Copyright is held by the author/owner(s). INFO 444, Autumn 2012 School of Information University of Washington

Keywords
VSD, Single Sign-on Services, SSO, Values, SSO Integrators, SSO Providers, data, services, privacy, best practices


Introduction
The internet as we know it is growing at an incredible pace, and with it, new services are popping up everywhere with a new solution to any and all of our old problems. Need to shop for clothes online? There is a website for that. Want to listen to a variety of music? There is a web application for that too! Are you interested in having a discussion with your friends and family? You bet there is a way to do it online! With the expanding role of the internet in our day-to-day lives, we develop manifestations of ourselves throughout the internet via user accounts tied to email addresses, with passwords we may not even remember! Wouldn’t it be nice if all you had to remember was one account, one email, one password? The premise behind a Single Sign-On (SSO) service is that a user only has to establish their account in one place and is then able to use it on many other sites. The user no longer has to provide different credentials to each site in order to establish their identity there and access the service it provides. In this day and age, where users provide so much information about themselves, a single site can develop a significant idea of who the user is and, in the process, become an SSO Provider: a site whose new service is that it can establish the user’s unique identity anywhere. On the other side of the relationship are SSO Integrators, sites that offer a service the user wants and that communicate with SSO Providers to conveniently authenticate the user and let them continue on with what they intended to do. What does this present to the user in terms of benefits? The user is now able to consolidate their various user

accounts into one convenient account that allows access to various services. On top of that, because their information is shared, their preferences and trends carry over, making the services that SSO Integrators provide very personal to the user. To further personalize their services, Integrators can also use geographic and friend data to provide content that is dynamic and far more relevant to the user’s immediate location and friends. SSO also presents the opportunity for various SSO Integrators to work with each other to improve the level of service provided, simply because the user has a “global” identity shared among all of them. Our research began with a story about a man named Bogomil Shopov, an online IT marketing and community management professional from Bulgaria. This individual was able to purchase 1,500,000 entries of first and last name, email, and private Facebook profile IDs for $5 USD (http://talkweb.eu/openweb/1819). That’s five bucks, straight and simple. This brings us to the negative side of SSO services: while a user’s data may be integrated with various other sites, what data is truly transferred, and what actually happens to it? Our team intends to explore the various aspects behind Single Sign-On services, since an understanding of those services can give us insight into the users that utilize them. We will begin our study by identifying some of the direct and indirect stakeholders involved with SSO services, to determine the key players and the motivations behind how these systems are set up. Following that, we will take a glimpse into some well-established web services that are SSO Integrators and how they use data provided by SSO Providers to serve users. From


there we will expand to establish who the top three SSO Providers are, and then discuss each one in detail to understand how their systems work and to synthesize their data use policies. By establishing a profile of the top three Providers, we intend to compare and contrast their approaches to SSO and come to an understanding of which user values are implicated by how those systems were developed. After some insight into common Integrators and Providers, we intend to develop an understanding of how users approach SSO services in their day-to-day lives, to establish a better idea of the relevance and prevalence of the technology. Having surveyed a broad user base, we intend to analyze how common users utilize SSO services and what that means in terms of the values implicated. Now you may be asking, what’s our true purpose behind all of this work? We hope to combine our technical study of SSO services with an empirical measurement of the penetration of SSO technology in our peer groups, and to develop a strong understanding of what values are truly at stake for users in this Provider-Integrator-User relationship. Once we understand what those values are, we intend to develop a best practice guideline that users can quickly read to understand key aspects of SSO services and how they can better protect themselves. With those guidelines, users can exercise more control over their information by knowing more about how it spreads, and can improve their leverage in the Provider-Integrator-User relationship.

Methodologies & Stakeholders
The basis of our work is rooted in the principles of Value Sensitive Design (VSD), a “tripartite methodology, consisting of iteratively applied conceptual, empirical, and technical investigations; an emphasis on considering indirect as well as direct stakeholders (that is, people who are affected by a technical system but don’t use it directly, as well as those who do); and an interactional theory of the relationship between values and technology” (Borning). To begin, the direct stakeholders include the users and providers of SSOs. The indirect stakeholders include SSO integrators such as deal sites like Groupon and LivingSocial, data-aggregation services, and marketing agencies. The benefit for users is that they get to use one service to sign into many different websites. This saves them the time of creating a new account each time they visit a new website. In addition, users only have to memorize one username and password rather than several, which can get confusing. They may also benefit from personalized ads that are more relevant to them. The harms for users include the possibility of third-party websites obtaining information that the user did not wish to provide. Another harm is that SSO integrators may have permission to access all the information that a user provides to the social network, which could be more than the user wants to share with these sites. As for the SSO integrators, they benefit by creating more personalized ads targeted to users, which in turn increases the likelihood of a user buying a product on the site. They also perform analytics and conduct customer research. The SSO provider benefits from users


continuing to use their social media website, which increases its traffic and, in turn, its revenue. In summary, users benefit from simplicity, time efficiency, and personalization, in tension with the values of privacy and consent. SSO integrators benefit from gaining valuable information, while SSO providers benefit from popularity.

      

SSO Integrators
In this section, we investigate how certain websites integrate Single Sign-On from social media sites and use it to their advantage. Single Sign-On services such as Facebook Connect can carry a lot of data from a user’s Facebook account into the service. Data such as interests, gender, likes, and friends in a user’s network become far more visible to the Integrator, and while Integrators may use this information for the user’s gain, they may also use it for their own. For this reason, this section looks more deeply into the privacy and data use policies stated by SSO Integrators regarding the data they collect from users and the benefits they provide in exchange. One example of an information technology that makes use of SSO is StackExchange, a large network of 90 Q&A sites that are all linked together. We were interested in StackExchange because, despite having its own StackExchange account that gives access to all ninety sites, it also lets users sign in through a variety of other services, including Google, Facebook, Yahoo, MyOpenID, LiveJournal, WordPress, Blogger, Verisign, ClaimID, ClickPass, Google-Profile, and AOL.

This raises privacy questions about how much data StackExchange is collecting from all these sites, and what it is using the data for. In its privacy policy, StackExchange states that it will tell you how it is using the data, that it will give this notice in “clear and conspicuous language when you are asked to first provide [StackExchange] with personal information,” and that it will “notify [the user] before [StackExchange] uses the information for something other than the purpose for which it was originally collected.” StackExchange does, however, use this information to its own benefit in ways listed in its privacy policy, such as allowing the user “to register to [StackExchange] websites, online communities, and other services,” communicating with users effectively, and evaluating the quality of its services. StackExchange also uses this information to “help employers find or contact users who post profiles on the Careers site,” and to “transfer information to others as described in this policy to satisfy our legal, regulatory, compliance, or auditing requirements.” In exchange, the user gets access to the many services StackExchange offers, including its huge network, all of it easily accessible through one simple sign-on. Another example is deal sites such as Jackthreads, PLNDR, and Zappos, which focus primarily on marketing clothes at very low prices.


They, too, have Single Sign-On services that allow users to connect via Facebook or other social media websites. The major benefit they gain from this is access to your social media profile and anything you allow through their application. Here is an example of the prompt shown by Groupon, a website focused on delivering coupon deals to its users:
Figure 1: A prompt that informs the user of all the data pieces that Groupon requests from Facebook during the Facebook Connect session.
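The exchange that a prompt like the one in Figure 1 represents can be sketched as a toy model: a provider holds a profile, and an integrator receives only the fields that the user approves. The `Provider` class, the `authorize` method, and the field names below are hypothetical illustrations, not the actual Facebook Connect or Groupon API.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Hypothetical SSO provider holding user profiles (not a real API)."""
    profiles: dict = field(default_factory=dict)

    def authorize(self, user_id, requested, approved):
        """Return only the profile fields that were both requested and approved."""
        profile = self.profiles[user_id]
        granted = set(requested) & set(approved)
        return {name: profile[name] for name in sorted(granted) if name in profile}


# Illustrative profile; every value here is made up.
provider = Provider({
    "u1": {
        "name": "Alex",
        "email": "alex@example.com",
        "likes": ["music"],
        "friends": ["u2", "u3"],
    }
})

# The integrator asks for four fields, but the user approves only two.
shared = provider.authorize(
    "u1",
    requested=["name", "email", "likes", "friends"],
    approved=["name", "email"],
)
print(shared)
```

In this sketch, what the integrator ends up holding depends entirely on what the user approves at the prompt, which is why the wording and granularity of that prompt matter.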

Groupon’s business model revolves around providing deals for a diverse range of local activities (including restaurants, events, fitness, health, education, etc.) to their users. Groupon profits largely by making deals with businesses to advertise for them so that those businesses can get more customers. With so many businesses doing different


things and so many users, information about the users is extremely helpful for Groupon. Specifically, Groupon can use a variety of information you make publicly available to decide which coupons to send to you. They use the information for things such as maintaining the website, providing personalized ads, evaluating you for certain offers, and performing analytics for customer research. In their privacy policy they state that if you want to limit the information they obtain, “you may manage the sharing of certain Personal Information with [Groupon] when you connect through social networking platforms or applications” and that adjusting permissions for that personal information depends on the privacy policy of the social networking platform. In this situation, the main advantage of Single Sign-On is personalizing the Groupon experience, with less of an emphasis on convenience. One final example of how Single Sign-On is utilized is Wolfram|Alpha, a computational knowledge engine that uses your Facebook account to deliver very precise analytics including habits, charts, graphs, and statistics. Wolfram|Alpha mentions that the main purpose for which they use the information is to “help enhance and refine [Wolfram’s] content” and that “information collected about you through your experience and queries is used to better understand the entire population that is utilizing our website and how we might improve our services to improve the collective experience.” They also make it explicitly clear that “personally identifiable information Wolfram|Alpha is allowed to access is affected by the privacy settings you have established at the TPS” and that “the linkage between any TPS and Wolfram|Alpha is completely voluntary, and our ability

to access your information at the TPS requires that linkage, you have a choice whether or not to disclose such information.” It goes to show just how much information Wolfram|Alpha has at its disposal and many third party sites can potentially benefit from it. The user also benefits from this because he/she can gain knowledge of the different habits he/she exhibits and can focus on fixing them if necessary. Some snippets of Wolfram|Alpha analytics have been included:

Figure 2: An example of the analytics that Wolfram|Alpha derives from Facebook data.

For example in the above image, you can see the user’s activity during the week. We can see in the second


graph that there is a lot of time spent on Facebook around 2-3 AM on Friday morning. We can also see a large variety of application usage in the first graph. Wolfram|Alpha also makes the information very accessible for the user by providing different ways to download the data. They also have a way to monetize by allowing users to obtain RAW data from Wolfram|Alpha if users purchase the Pro plan.
Figure 3: The prompt that shows that you can download analytic information if you subscribe to Wolfram|Alpha Pro

In the terms of use, Wolfram|Alpha explicitly states that they “will not attempt to associate individual Wolfram|Alpha inputs with individual human users, and will not release individual or aggregated lists of inputs, or any personally identifiable information, to any third party, except in response to lawful court orders. We will not attempt to assert intellectual property rights over anything given as input to Wolfram|Alpha simply on the basis of its having been given to us as input.” However, by generating content through Wolfram|Alpha, the user agrees that Wolfram|Alpha can “store [user’s data] in log files, and use [user data] to generate the results.”

Overall, we can see that different SSO integrators use data collected through Single Sign-On in various ways. In StackExchange, most of the use is for the convenience of the user: with so many different Q&A sites, having one account that allows you to access all of them is extremely convenient and makes all the Q&A sites easily accessible. In Groupon, the data collected through Single Sign-On is used to create a very personalized experience for the user and to target specific coupons based on the user’s data. Finally, Wolfram|Alpha makes collected data very accessible to the user, and also uses the data to better their own


website or search engine. All of the SSO integrators we examined state somewhere in their privacy policy that they will not openly reveal user data to third parties unless required to by court order, and they primarily use the data for the convenience of the user. Next, we take a look into how users approach the use of SSOs through a detailed empirical investigation.

Empirical Investigation on SSO Users
We decided to conduct a survey for our empirical investigation to gain insight into how users feel about their privacy online. We wanted to determine whether users actually care about the privacy of their information online. In addition, we wanted to see to what extent participants are willing to give up their privacy for other values.

Procedure
For our study, we gave our participants a survey that consisted of 23 questions. These questions asked about their demographics, their use of Single Sign-On services, how much time they spent on the computer and Internet, and privacy and security. We posted our survey online on various websites including Amazon Mechanical Turk (Amazon Mechanical Turk, 2005) and Reddit. The majority of our responses came from Amazon Mechanical Turk, a paid crowdsourcing service that connects companies to a large body of people willing to do small tasks for a small sum of money. These tasks are typically ones that are difficult for computers to accomplish but easy for humans. This platform is well suited to reaching a large variety of individuals.

Participants
We had a total of 170 participants for our survey. Of the 170 participants, only 142 gave valid responses to the survey questions, and we based our analysis on those 142 responses. Of the 142 people, 48 were female and 94 were male, so about a third of our respondents were female and two thirds were male. A wide variety of age groups took our survey; around 60% of the respondents were aged 22-30. As for location, about 82% of our responses came from India.

Demographics

Age      n     %
18-21    17    12%
22-25    43    30.3%
26-30    41    28.9%
31-40    28    19.7%
41-50    9     6.3%
51-60    3     2.1%
61-70    1     0.7%

Figure 4: A series of tables displaying the demographic information for the empirical investigation survey we conducted.

Gender   n     %
Male     94    66.2%
Female   48    33.8%

Country    n     %
India      116   81.7%
USA        17    12%
Pakistan   2     1.4%
Other      7     4.9%


1: While this question may seem quite broad, the context around it is a series of questions related to internet usage and SSOs, so there exists some implicit framing to the question

Results
One of the questions that we asked in our survey was “Why do you use Single Sign-On services?” and we received many consistent answers from our participants. A male aged 26-30 from India responded, “It’s easy and convenient”. Another male, also aged 26-30, stated, “It provides security as one time login and logout. Also [there’s] no need to remember all the passwords every time”. We used a website called Many Eyes (Many Eyes, 2007), which is a “graphical tool that uses techniques to create a graphical network representation of patterns of reference in collaborative discourse” (Wikipedia, 2011). One of the options on this site is a graphical representation called a tag cloud, which counts the frequency of the words within our data. Below is a tag cloud for the responses to the question “Why do you use Single Sign-On services?” The word “easy” is the biggest word, which means that it is the most frequent word in the responses.
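The word counting behind a tag cloud can be sketched in a few lines. This is a minimal illustration and not the Many Eyes implementation; the first two responses are the ones quoted above, while the third response and the stop-word list are assumptions added for the example.

```python
import re
from collections import Counter

responses = [
    "It's easy and convenient",
    "It provides security as one time login and logout. Also there's no need "
    "to remember all the passwords every time",
    "Easy to use, easy to remember",  # hypothetical response, not from the survey
]

# Assumed stop-word list; a real tool would use a much longer one.
STOP_WORDS = {"it", "s", "and", "as", "to", "the", "all", "no", "also", "there"}


def word_frequencies(texts):
    """Count lowercase words across all texts, skipping stop words."""
    words = []
    for text in texts:
        words.extend(re.findall(r"[a-z]+", text.lower()))
    return Counter(word for word in words if word not in STOP_WORDS)


freqs = word_frequencies(responses)
# In a tag cloud, the font size of each word grows with its count.
print(freqs.most_common(3))
```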

One of the relationships that we explored was between the usage of SSOs and worry about privacy being violated in the future. We asked the participants to answer the question “Do you ever worry that your privacy might be violated in the future?” (see footnote 1) on a scale from 1 to 5: 1 - Not Worried At All, 2 - Somewhat Worried, 3 - Neutral, 4 - Worried, 5 - Extremely Worried. Of the 142 participants, 47 responded “1 - Not Worried At All”. Of those 47 respondents, 32 use SSO services. This means that 68.1% of the participants who are not worried about their privacy being violated in the future use SSOs. As for those who answered “5 - Extremely Worried”, 10 out of the 19 use SSOs, which is 52.6%. There is a difference of 15.5 percentage points in SSO usage between those who answered “1” and those who answered “5”. It is worth noting that, out of the participants who

Figure 5: This is a word cloud comprised of user responses to the question, “Why do you use Single Sign-On services?” The larger the word the more frequent that response occurred in survey responses.


use Single Sign-On services, there are more participants who are not worried about their privacy being violated in the future than there are participants who are extremely worried. Another relationship that we explored was between the usage of SSOs and having had one’s privacy violated in the past. Of the 142 participants in the sample, 11 said that their privacy had been violated in the past. Of those 11 participants, 5 said they used SSO, which is 45.5%. Unfortunately, we were not able to determine whether SSOs played a part in violating the participants’ privacy, since about half of the users whose privacy had been violated used SSOs and the other half did not. We also looked at whether having had their privacy violated in the past affected how much participants worried about having their privacy violated in the future. Of the 11 participants whose privacy had been violated in the past, 72.7% answered either “5 - Extremely Worried” or “4 - Worried” for their privacy being violated in the future. This suggests that people who had their privacy violated in the past are more concerned about their future privacy, although the group is small. This makes sense because people who had a bad experience in the past tend to be more worried and cautious in the future.

Limitations
We had some limitations because some of our questions could have been too broad or ambiguous for the participant. For example, we did not restrict the question “Do you ever worry that your privacy might be violated in the future?” to online privacy. We did believe that users

could determine that it was about online privacy because of the wording and flow of our previous questions, but it’s possible that not everyone understood it that way. Also, our data came mostly from participants in India, who may answer differently than users from the US because of cultural differences.

Future Work
In the future, we would try to have more females take the survey to get a 50/50 male-to-female ratio. In addition, the majority of the sample for our survey was from India, but in the future we would like the majority to be from the USA for consistency. We would also ask more detailed questions to get richer data from our participants, and reword some of our survey questions.

Now that we have set up the foundation of knowledge about both the SSO Integrators and the SSO Users, we have an idea of the essential “front-end” of this industry. Next, we take an in-depth look into the “back-end”: how SSOs work from the provider perspective.
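The proportions reported in the Results of this investigation can be recomputed from the raw counts. The sketch below uses only counts stated in the text; the dictionary keys and the `share` helper are names introduced here for illustration.

```python
# (SSO users, respondents in the group), as reported in the Results.
counts = {
    "not_worried_at_all": (32, 47),
    "extremely_worried": (10, 19),
    "privacy_violated_in_past": (5, 11),
}


def share(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


shares = {group: share(*pair) for group, pair in counts.items()}

# Difference in SSO usage between the two extremes, in percentage points.
gap = round(shares["not_worried_at_all"] - shares["extremely_worried"], 1)

for group, value in shares.items():
    print(f"{group}: {value}% use SSO")
print(f"gap: {gap} percentage points")
```

With groups as small as 19 and 11 respondents, these figures are best read as descriptive rather than as evidence of a statistically reliable difference.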

SSO Providers
A central goal of this research is to be useful to users of Single Sign-On (SSO) services when they decide what data they share and with whom. In order to get an overview of the abilities of SSO providers with respect to data usage, identify problem areas for users, and draft best practices for users to follow when deciding whether or not to use an SSO service, we analyzed the data use policies of each of the large SSO providers. This analysis forms the core of our technical investigation for this project.


We observed that data use policies tend to be hard to read because of a variety of factors including their size, the vocabulary used in them, and their overall complexity. So, part of the motivation for this research was to expose details of those policies in a way that’s easy to understand for users of those services. Another reason we performed this analysis was to inform the creation of best practices for users to follow when deciding whether or not to use a Single Sign-On service.

Methods
Before beginning our formal research, we sketched out a few questions we had pertaining to the data use policies of SSO providers. These included questions such as “How and when do SSO providers collect user data? How and when do they share user data? What sorts of control do users have over the sharing and collection of their data?” We then gathered the data use policies of the top three Single Sign-On providers across the web: Facebook, Google, and Twitter (Gigya). The next task was to come up with a list of categories, potentially of interest to users, into which sections of the data use policies could be classified. Our research and early brainstorms guided the creation of high level categories such as “allows for collection of user data” and “allows for sharing of user data”. We used those high level categories in a first-pass reading of the data use policies for each major SSO provider, in which we identified general regions of text that relate to the high level categories. Then, we used the insights

gained from the first readings to produce a more detailed list of allowances that may be of interest to users. The word “allowance” is used to refer to practices that are allowed by a company’s data use policy. The list includes abilities that we consider to be concerning, reassuring, or neutral (bad, good, or neither for the user). “Concerning” in this case means potentially causing harm to users. “Reassuring” in this case means potentially protecting users from harm. “Harm” is defined to be any occurrence that is detrimental to a valued quantity (such as physical health, income, reputation, mobility, etc.). Groups like http://knowprivacy.org and http://www.privacychoice.org/ served as inspiration for our policy analysis, and some of the practices on the list (such as “Allows users to delete data” and “Notifies users when government requests access to their data”) came from those websites (Know Privacy, PrivacyChoice). The final list of practices of interest can be seen in the appendix under item Appendix A. The list is broken into data collection, data sharing, ad targeting, user control, and SSO. However, the majority of the allowances on the list are related to the first two categories, data collection and sharing, because we are primarily concerned with the values of privacy and security. User control and SSO relate to the value of informed consent. During a second read-through, we tagged specific clauses in the data use policies that relate to allowances on the list with numbers such as “[1]”, “[2]”, and “[3]”, and placed the tags within a table next to the allowance they relate to. The final result is a table that shows each instance of a given allowance


within each data use policy, and the clauses that relate to that allowance. For example, the cell for the intersection of the allowance “allows collection of IP address” and the SSO provider Google may contain “[3], [4], [7]” indicating that clauses marked [3], [4], and [7] in Google’s data use policy relate to the given allowance.
Figure 6: A visual example of the codifying of an existing data use policy and how it fits in the categories we established.
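One way to picture the table described above is as a mapping from each allowance and provider to the list of clause tags found in that provider's policy. The sketch below is an assumption about representation, not necessarily the format used in the study. Apart from the Google example taken from the text, the tag numbers are invented for illustration.

```python
from collections import defaultdict

# table[allowance][provider] -> list of clause tags from that provider's policy
table = defaultdict(lambda: defaultdict(list))


def tag_clause(allowance, provider, clause_number):
    """Record that a numbered clause in a provider's policy relates to an allowance."""
    table[allowance][provider].append(clause_number)


def format_cell(allowance, provider):
    """Render one table cell in the "[3], [4], [7]" style used in the text."""
    return ", ".join(f"[{n}]" for n in table[allowance][provider])


# The Google example from the text.
for number in (3, 4, 7):
    tag_clause("allows collection of IP address", "Google", number)

# Hypothetical entry, included only to show a second provider.
tag_clause("allows collection of IP address", "Facebook", 2)

print(format_cell("allows collection of IP address", "Google"))
```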

In the case that a data use policy mentioned the collection or sharing of “Basic,” “Personal,” or “Sensitive” information, the meaning of the words in the context of the particular policy was parsed as necessary for entry into the table. For example, “Basic information” is defined in the Facebook data use policy as follows: “basic info includes your User ID, as well your friends' User IDs (or your friend list) and your public information.” (Facebook) All clauses referencing the collection of basic info were broken into the categories that relate to user ID, friend information, and public information.

Results
The results show some interesting trends. The first is that Google and Facebook have somewhat inverse priorities in their data use policies. Facebook’s policy is oriented more toward the sharing of data than the collection of data, whereas Google’s data use policy references the collection of data more than the sharing of data. Google’s data use policy has many references to the types of data the company may collect from users and when, but the policy only mentions the sharing of that information with third parties in a few limited circumstances. On the other hand, Facebook’s data use policy allows for the sharing of data in multiple places, and only discusses data collection a handful of times at the beginning of the policy. This relationship can be seen in the stack histogram, which shows the number of occurrences of clauses within each privacy policy that allow for a practice on our list of allowances (appendix item Appendix A). The blue bars represent data from Facebook’s data use policy, the red bars represent data from Google’s data use policy, and the green bars

It is important to note that the number of times an allowance occurs in a data use policy does not necessarily reflect the degree to which a company performs a given action. It’s tempting to see the quantity of references as an indicator of a company’s actions in that area. Instead, it’s more useful to think about the number of references as a measure of the number of ways a company may possibly allow for a given practice. Just because an allowance exists doesn’t mean they use it – for example, a company may reserve the ability to share personal data with governments who request it, but never exercise that ability on a user’s account.
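The counts plotted in the histogram are simply the number of tagged clauses per allowance and provider. A minimal sketch follows, with invented tag lists standing in for the real table; only the allowance names and the Google tags come from the text.

```python
PROVIDERS = ["Facebook", "Google", "Twitter"]

# Hypothetical tagged clauses: allowance -> provider -> clause tags.
tagged = {
    "allows collection of IP address": {
        "Facebook": [2],
        "Google": [3, 4, 7],
        "Twitter": [5],
    },
    "allows for sharing of user data": {
        "Facebook": [9, 12, 15],
        "Google": [11],
    },
}


def histogram_counts(tagged_clauses):
    """Number of clauses per allowance and provider (0 when none were tagged)."""
    return {
        allowance: {p: len(cells.get(p, [])) for p in PROVIDERS}
        for allowance, cells in tagged_clauses.items()
    }


counts = histogram_counts(tagged)
for allowance, row in counts.items():
    bars = "  ".join(f"{p}: {'#' * n} ({n})" for p, n in row.items())
    print(f"{allowance}\n  {bars}")
```

A count of 3 in this sketch means that three clauses permit the practice, not that the practice is carried out three times as often.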

Figure 8: A display of those allowances compared over the three SSO Providers and grouped according to the buckets they fall under. An expanded version is available in the appendix.


represent data from Twitter’s data use policy. The x-axis is hidden, but each bin is an allowance from the list, in the same order as presented in the appendix under item Appendix A. The full histogram can be viewed in the appendix under item Appendix C. Since the list was broken into data collection, data sharing, ad targeting, user control, and SSO, the histogram was drawn in clusters representing data collection, data sharing, and user control and consent, indicated by magenta, cyan, and yellow regions respectively. The complementary focuses of Facebook’s and Google’s data use policies are evident in the distribution of values near the beginning and the end of the histogram. Facebook is blue, and Google is red. Notice how Google has more values near the beginning (where the bins represent data-collection allowances), and Facebook has more values near the middle and end (where the bins represent data-sharing allowances). The circled blue bar on the far left of the graph represents the category for “Allows collection of data generated on/with the website (such as game characters, scores, application usage etc)”. Since Facebook is a service

that largely revolves around content generation and use of third-party applications, it’s unsurprising that there are many places in the data use policy for Facebook that refer to the ability to collect data generated with use of the website. The second trend is that the policies appear to focus more on the companies’ abilities rather than on the users’ abilities. This is evident in the large proportion of “concerning” practices over “reassuring” practices that each company’s policy allows for. The reassuring practices largely reflect users’ abilities, such as the ability to delete their data or the ability to opt in or out of a data collection/sharing, whereas the concerning practices largely reflect abilities of the services, such as the ability to collect data or share it with third parties. The following pie charts illustrate the ratio of concerning policies, neutral, and reassuring policies contained within the data use policies of the top three SSO providers:


Figure 9: A series of pie charts showing the breakdown of “concerning,” “neutral,” and “reassuring” allowances in the providers’ respective policies.

Another finding is that none of the top three SSO providers’ data use policies mentions two allowances that we deemed to be of interest to users. Those allowances were “Allows sharing of data that third parties share with the provider about you with third parties” and “Notifies users when government requests access to their data.” Those categories were inspired by the privacy policy analysis tools on privacychoice.org.

Figure 10: A zoom on the allowance chart bringing focus to the lack of any policy addressing those categories.


One final observation that stuck out was that the only SSO provider in the top three to mention single sign-on services explicitly in its data use policy was Facebook. Google and Twitter may have clauses that apply generally enough to cover Single Sign-On usage, but they never directly address SSO in their data use policies.

Figure 11: Another observation of a lack of a service addressing a specific category.

Conclusion
This form of analysis has its advantages and disadvantages. One of the most important cons of our approach is that it gives us no insight into how data is actually used by these companies, only how data may be used. Viewers of the results may be misled into thinking that companies with higher scores for a certain allowance engage more in the allowed activity. The upshot is that very different documents may be directly compared with a common metric. This is potentially very helpful for anyone interested in understanding and comparing data use policies, and it gives us a framework to aggregate and compare data pertaining to many companies at once. The data is quantitative, so many quantitative analysis techniques can be used to tease results out of it. For example, if we had enough data to perform the statistics confidently, we could look for a correlation between the presence of one type of allowance and the presence of another within the policies. An issue with our study in particular is that we missed some allowances that users may be interested in. For example, the length of time companies hold on to user data before deleting it is of concern to some people, but we did not cover it. Other researchers may want to look into holes in our allowances, find out how confidently we can equate terms across privacy policies, and investigate whether there is a correlation between the number of times a privacy policy mentions an ability and the number of ways that ability is used in practice.
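As a rough illustration of the counting that underlies the histogram, the following Python sketch tallies the bracketed clause markers for a few rows copied from Appendix B. It is not the tooling used in the study: the function names (count_clauses, totals_by_bucket) and the data layout are hypothetical, chosen only for this example, and the three rows are an excerpt rather than the full table.

```python
import re
from collections import defaultdict

# Excerpt of Appendix B, transcribed for illustration only.
# Each row is (bucket, allowance, {provider: cell text}); the layout is an
# assumption made for this sketch, not the format used in the study.
APPENDIX_B_EXCERPT = [
    ("Data collection", "Allows collection of IP address",
     {"Facebook": "[8]", "Google": "[3][4][7]", "Twitter": "[21]"}),
    ("Data collection", "Allows collection of location data",
     {"Facebook": "[8][11]", "Google": "[2][9]", "Twitter": "[6][15][17][19]"}),
    ("SSO", "Specifically mentions single sign on in the data use policy",
     {"Facebook": "[35][36]", "Google": "None", "Twitter": "None"}),
]

# A clause marker is a number in square brackets, e.g. "[13]".
CLAUSE_MARKER = re.compile(r"\[(\d+)\]")


def count_clauses(cell):
    """Return how many bracketed clause markers appear in one table cell."""
    return len(CLAUSE_MARKER.findall(cell))


def totals_by_bucket(rows):
    """Sum clause counts for every (provider, bucket) pair."""
    totals = defaultdict(int)
    for bucket, _allowance, cells in rows:
        for provider, cell in cells.items():
            totals[(provider, bucket)] += count_clauses(cell)
    return dict(totals)


if __name__ == "__main__":
    for (provider, bucket), n in sorted(totals_by_bucket(APPENDIX_B_EXCERPT).items()):
        print(f"{provider:8s} {bucket:16s} {n}")
```

As noted above, a higher count only means that more clauses allow for a practice; it says nothing about how often a company actually engages in it.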


Best Practices & Conclusion
At this point we have taken an in-depth look at the three main aspects of SSO services: Integrators, Users, and Providers. We have established an understanding of how SSO Integrators use SSO systems in a practical manner to provide better services for users, and we have seen how they treat users’ data. Through our survey we learned more about the prevalence of SSOs in a typical user’s life and about users’ views on how values such as privacy are treated. Finally, we took a close look at the way SSO Providers approach their services and how users’ data is treated. While the information provided can be used to develop critical thoughts on various aspects of SSOs and even internet usage in general, our original goal was to serve users by informing them, based on our research into SSO services, of how they can better serve themselves when dealing with their data online.

BE MINDFUL OF THE VALUE OF “YOU”
We cannot stress enough the value that an individual has, and in particular the value of their identifying information. A trend that we have come to see is that individuals tend not to value the privacy and security of their data until something harms those values. We urge users to take their online identity seriously to avoid leaks of their data to undesired third parties.

STAY UP-TO-DATE
During our investigation, Facebook, one of our SSO Providers, changed its privacy and data use policies. While adjusting our work, we realized that it is incredibly important for users to stay on top of changes to the privacy and data use policies of the services they use. While the changes that we encountered for Facebook were minimal, it’s very easy for services to change their stance quickly. If anything is apparent from the data we gathered, and more so from the purpose of this paper, it is that these services don’t work to inform users on the details of their policies.

MANAGE THE ACCESS TO YOUR INFORMATION
Over extended periods of time, a user is likely to establish many different connections between their SSO Providers and various SSO Integrators. While some may be valuable to the user and their day-to-day life, others aren’t necessary for the user to stay connected to. We advise disconnecting or shutting down accounts for services that are used less often, so that those services no longer have active access to your data and you have one less service to manage.

EVALUATE THE VALUE OF SERVICES USED
We investigated how three well-known services utilized SSO systems and what they provided in terms of value to users. While these services do indeed offer great value and protection of user data, that is not the case with others. Therefore we advise that you take time to evaluate for yourself whether a service you intend to sign up with provides the right value for you, and to learn how it handles your information by reading its policies. In addition, you can take some time to investigate online whether those services have had any violations with regard to user data.

MANAGE DIFFERENT KINDS OF DATA
Although this should be fairly straightforward, it’s something that should always be kept in mind. The purpose of this best practice is for you to keep in mind what kinds of information you have made available and to whom. While information like your personal email and your name may not be that important or potentially insecure, having your address or social security information shared around can be quite detrimental to the security of your identity. Conduct a self-audit of what potentially harmful information you can find about yourself and work toward eliminating that data from the internet as best you can.

While our investigations can be picked through for further conclusions, we believe that we have established a fair foundation for informing users about quite a few aspects of their online identities. As points of authority on the internet grow further and start showing up in other aspects of our lives, an informed user is paramount to the overall security of individuals on the internet.

References
"Facebook Data Use Policy." Facebook. N.p., 08 June 2012. Web. 06 Dec. 2012. <https://www.facebook.com/full_data_use_policy>.
"Groupon: Privacy Statement." Privacy Statement. Groupon, 13 Sept. 2012. Web. 05 Dec. 2012. <http://www.groupon.com/privacy>.
Stack Exchange, Inc. Official Privacy Policy. Stack Exchange, Inc., 28 June 2012. Web. 5 Dec. 2012. <http://stackexchange.com/legal/privacy-policy>.
"Which Identities Are We Using to Sign in Around the Web?" Gigya. N.p., n.d. Web. 06 Dec. 2012. <http://info.gigya.com/Identity.html>.
"Wolfram|Alpha Privacy Policy." Wolfram|Alpha. Wolfram|Alpha, 5 Mar. 2009. Web. 5 Dec. 2012. <http://www.wolframalpha.com/privacypolicy.html>.
"Your Privacy. Simplified." PrivacyChoice. N.p., n.d. Web. 06 Dec. 2012. <http://www.privacychoice.org/>.


Appendix

Appendix A

This is an expanded list of the categories on which we compared the SSO Providers with each other. They are listed by the bucket they fall into and are coded by whether they are “Concerning,” “Reassuring,” or “Neutral.”

Data Collection
Allows collection of personally identifiable information (name, birthday, address, phone, email, gender)
Allows collection of information about contacts/friends
Allows collection of information others have shared about you
Allows collection of profile information (such as user ID, personal description, likes, interests, etc)
Allows collection of IP address
Allows collection of location data
Allows collection of data generated on/with the website (such as game characters, scores, application usage etc)
Allows collection of browsing history/health history/religion/political orientation (Potentially sensitive information)
Allows collection of uploaded media (images, video, text, etc)
Allows collection of data that third parties share with the provider about you

Data Sharing
Allows sharing of personally identifiable information (name, birthday, address, phone, email, gender)
Allows sharing of information about contacts/friends
Allows sharing of information others have shared about you
Allows sharing of profile information (such as user ID, personal description, likes, interests, etc)
Allows sharing of IP address
Allows sharing of location data
Allows sharing of data generated on/with the website (such as game characters, scores, application usage etc)
Allows sharing of browsing history/health history/religion/political orientation (Potentially sensitive information)
Allows sharing of uploaded media (images, video, text, etc)
Allows sharing of data that third parties share with the provider about you with third parties
Allows sharing of untagged data (unassociated with users’ profiles) with third parties
Requires that receivers of data follow certain guidelines/rules
Notifies users when government requests access to their data

Ad Targeting
Uses data to target users with advertisements (but does not share that data with advertisers)

User Control
Allows for opt-out and opt-in for data collection/sharing
Allows users to delete their data

SSO
Specifically mentions single sign on in the data use/privacy policy

Appendix B

The bracketed numbers in the provider columns represent locations in the corresponding privacy policies (included with the appendix of this document with item numbers: _____) in which the specific clauses related to a data collection practice exist. To view specific clauses, reference the privacy policy and look for a number highlighted in yellow with the value of interest. The text that comes after the number is the clause referenced.
Allowances | Provider 1: Facebook | Provider 2: Google | Provider 3: Twitter

Data collection
Allows collection of personally identifiable information (name, birthday, address, phone, email, gender) | [1][13][16] | [1][2][3][4][6] | [1][3][7][12][13][15]
Allows collection of information about contacts/friends | [3][15][24][25] | [2][3] | [11][13][15]
Allows collection of information others have shared about you | [24] | None | None
Allows collection of profile information (such as user ID, personal description, likes, interests, etc) | [12] | [1][2][3] | [2][5][13][16]
Allows collection of IP address | [8] | [3][4][7] | [21]
Allows collection of location data | [8][11] | [2][9] | [6][15][17][19]
Allows collection of data generated on/with the website (such as game characters, scores, application usage etc) | [2][3][6][7][9][11][19][24] | [3][5][10] | [14][15][16]
Allows collection of browsing history/health history/religion/political orientation (Potentially sensitive information) | None | [3][5][10] | [20]
Allows collection of uploaded media (images, video, text, etc) | [5][14] | [3][2] | [7]
Allows collection of data that third parties share with the provider about you | [10][37] | [3] | [13][22]

Data Sharing
Allows sharing of personally identifiable information (name, birthday, address, phone, email, gender) with third parties | [13][19][29][30][36][44] | [2][16] | [9][25][26]
Allows sharing of information about contacts/friends with third parties | [16][19][29][30][36][38][44] | None | [15][25][26]
Allows other users to share information about you with third parties | [24][32][33] | None | [15]
Allows sharing of profile information (such as user ID, personal description, likes, interests, etc) with third parties | [15][19][26][29][30][36][38][41][44] | [2] | [4][9][22][25][26]
Allows sharing of IP address with third parties | [44] | None | [24][25][26]
Allows sharing of location data with third parties | [19][28][29][30][44] | None | [15][17][19][25][26]
Allows sharing of data generated on/with the website (such as game characters, scores, application usage etc) with third parties | [16][29][30][44] | [2] | [9][14][15][25][26]
Allows sharing of browsing history/health history/religion/political orientation (Potentially sensitive information) with third parties | [29][30][44] | [16] | [25][26]
Allows sharing of uploaded media (images, video, text, etc) with third parties | [15][19][29][30][44] | None | [14][15][25][26]
Allows sharing of data that third parties share with the provider about you with third parties | None | None | None
Allows sharing of untagged data (unassociated with users’ profiles) with third parties | [12][42][45] | [18] | [29]
Requires that receivers of data follow certain guidelines/rules | [45][46] | None | [29]
Notifies users when government requests access to their data | None | None | None

Advertisements
Targets users with specific advertisements (but does not share that data with advertisers) | [4][43] | [11] | [20]

User Control
Allows for opt-out or opt-in for data collection and sharing | [20][22][27][34][39] | [12][13] | [10][18][23]
Allows users to delete their data | [21][23][31][40] | [14][15] | [30][31]

SSO
Specifically mentions single sign on in the data use policy | [35][36] | None | None


Appendix C


Appendix D

SSO Survey for Users
1) Are you male or female?
[ ] Male  [ ] Female
2) What age group do you fall under?
[ ] 17 and under  [ ] 18-21  [ ] 22-25  [ ] 26-30  [ ] 31-40  [ ] 41-50  [ ] 51-60  [ ] 61-70  [ ] 71 and over
3) What country do you live in? ______________________
4) How many hours a week do you spend on a computer? ________
5) How many hours a week do you spend on the internet? ________
6) What percent of the time do you use the internet for personal and business uses? (Your responses should sum to 100.)
______% Personal  ______% Business
7) Please estimate the number of hours you spend per week on the following services:
Services | Number of Hours
Email | ________
Facebook | ________
Twitter | ________
Google Account | ________
Other: __________ | ________
Other: __________ | ________
Other: __________ | ________
8) What are your primary uses for the internet?
[ ] Shopping  [ ] Research  [ ] Communication  [ ] News  [ ] Other (please list) __________________
9) Has your privacy ever been violated on the Internet?
[ ] Yes  [ ] No
10) If yes, please briefly describe the most recent time that your privacy was violated.
___________________________________________________________________
11) Do you ever worry that your privacy might be violated in the future? Please mark the scale from 1-5.
[ ] 1- Not worried at all  [ ] 2- Somewhat worried  [ ] 3- Neutral  [ ] 4- Worried  [ ] 5- Extremely worried
12) Please briefly describe a situation where your privacy might be violated online.
13) Are you familiar with Single Sign-On Services (SSO)? (For example: Facebook Connect or Google Accounts)
[ ] Yes  [ ] No
14) Do you use Single Sign-On services?
[ ] Yes  [ ] No
15) If yes, which SSO do you use?
[ ] Facebook Connect  [ ] Google Account  [ ] Twitter  [ ] Other (please list) ___________________
16) If yes, why do you use Single Sign-On service?
__________________________________________________________________________
17) Please describe how you think Single Sign-On services work.
__________________________________________________________________________
18) Do you have any privacy or security concerns related to your use of Single Sign-On services?
[ ] Yes  [ ] No
19) Do you have multiple identities online?
[ ] Yes  [ ] No
20) How many email addresses do you have? ______
21) Do you typically link your payment/credit card information to your personal identity online?
[ ] Yes  [ ] No
22) How do you typically pay for things you purchase online?
[ ] Direct credit card  [ ] PayPal  [ ] Google Wallet  [ ] Other (please list) _____________

23) Where did you hear about this survey?
[ ] Facebook  [ ] Reddit  [ ] Search Engine  [ ] Other (please list) _____________

Appendix E

For access to other pertinent data points please reference:

Data | Link
RAW Survey Data | https://docs.google.com/spreadsheet/ccc?key=0AqKTq25pswcgdG1ZTldvRVE3TnNMdEg3M0IyamNNSlE
Facebook Policy (Codified) | https://docs.google.com/open?id=0B6ANIPyq21eMcW5mclFmR3ZKb2M
Twitter Policy (Codified) | https://docs.google.com/open?id=0B6ANIPyq21eMMVYxSVZ6MW10SEk
Google Policy (Codified) | https://docs.google.com/open?id=0B6ANIPyq21eMR00xT2tfcFVid2c
