Privacy and Integrity in the Untrusted Cloud
Ariel Joseph Feldman
A Dissertation
Presented to the Faculty
of Princeton University
in Candidacy for the Degree
of Doctor of Philosophy
Recommended for Acceptance
by the Department of
Advisor: Edward W. Felten
© Copyright 2012 by Ariel Joseph Feldman.
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/us/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
Abstract
Cloud computing has become increasingly popular because it oﬀers users the illusion of having inﬁnite computing resources, of which they can use as much as they need, without having to worry about how those resources are provided. It also provides greater scalability, availability, and reliability than users could achieve with their own resources. Unfortunately, adopting cloud computing has required users to cede control of their data to cloud providers, and a malicious provider could compromise the data’s conﬁdentiality and integrity. Furthermore, the history of leaks, breaches, and misuse of customer information at providers has highlighted the failure of government regulation and market incentives to fully mitigate this threat. Thus, users have had to choose between trusting providers or forgoing cloud computing’s beneﬁts entirely.
This dissertation aims to overcome this trade-oﬀ. We present two systems, SPORC and Frientegrity, that enable users to beneﬁt from cloud deployment without having to trust the cloud provider. Their security is rooted not in the provider’s good behavior, but in the users’ cryptographic keys. In both systems, the provider only observes encrypted data and cannot deviate from correct execution without detection. Moreover, for cases when the provider does misbehave, SPORC introduces a mechanism, also applicable to Frientegrity, that enables users to recover. It allows users to switch to a new provider and repair any inconsistencies that the provider’s misbehavior may have caused.
SPORC is a framework for building a wide variety of user-facing applications, from collaborative word processing and calendaring to email and instant messaging, with an untrusted provider. It allows concurrent, low-latency editing of shared state, permits disconnected operation, and supports dynamic access control even in the presence of concurrency. Frientegrity extends SPORC’s model to online social networking. It introduces novel mechanisms for verifying the provider’s correctness and access control that scale to hundreds of friends and tens of thousands of posts while still providing the same security guarantees as SPORC. By eﬀectively returning control of users’ data to the users themselves, these systems do much to mitigate the risks of cloud deployment. Thus, they clear the way for greater adoption of cloud applications.
During my time at Princeton, I have beneﬁted from the support, insight, encouragement, and friendship of many people. Not only did they make this dissertation possible, they also made my years in graduate school some of the best of my life.
First, I wish to thank my advisor, Ed Felten. When I initially applied to Princeton, I asked to be considered for its masters program. But Ed knew that what I really wanted was a Ph.D., even before I did, and took the unusual step of getting me admitted to the doctoral program instead. Throughout my graduate career, I have beneﬁted immensely from his insight and his uncanny ability to quickly distill the essence of any complex issue. I have also gained much from his advice — he has always seemed to know the right thing to do in any situation, academic or otherwise.
I also thank Mike Freedman who, as an unoﬃcial second advisor, pointed me towards secure distributed systems. His help not only led to the most productive period of my graduate career, it also resulted in the work that comprises this dissertation.
I have been particularly lucky that my colleagues in my research group and in the Center for Information Technology Policy as a whole were also my friends. Joe Calandrino, Will Clarkson, Ian Davey, Deven Desai, Shirley Gaw, Alex Halderman, Joe Hall, Nadia Heninger, Josh Kroll, Tim Lee, David “The Admiral” Robinson, Steve Schultze, Harlan Yu, the late Bill Zeller, and honorary member Jeﬀ Dwoskin (who also provided this excellent dissertation template) made our lab a free-wheeling place where we were just as likely to play practical jokes and make horrible puns as we were to collaborate on interesting and relevant research. I especially wish to acknowledge Bill’s contributions and loyal friendship. He was a coauthor of the SPORC work presented in this dissertation, but he also enriched our lives with his brilliance, humor, generosity, and incisive wit. I miss him dearly.
In addition, I wish to thank Aaron Blankstein, coauthor of this dissertation’s Frientegrity work, for his insights, hard work, and friendship. My Princeton experience was also enhanced by the friendship and research ideas of Tony Capra, Forrester Cole, Wyatt Lloyd, Haakon Ringberg, Sid Sen, Daniel Schwartz-Narbonne, Jeﬀ Terrace, and Yi Wang. Beyond Princeton, I thank my summer mentors, Umesh Shankar of Google and Josh Benaloh of Microsoft Research, for their valuable guidance. Special thanks to Yotam Gingold. Although Yotam and I graduated from Brown the same year, he began his CS Ph.D. program two years ahead of me, and his advice and example have been invaluable as I navigate the challenges of academia.
I thank my committee — Andrew Appel, Brian Kernighan, and Dave Walker along with Ed and Mike — for their constructive comments throughout the dissertation process. Thanks as well to Melissa Lawson and Laura Cummings-Abdo for their help with all things administrative and for making the CS Department and CITP run smoothly. I am also grateful for the ﬁnancial support that made this dissertation’s research possible: Princeton’s Upton Fellowship and grants from Microsoft, Google, and the Oﬃce of Naval Research.
Most importantly, I would like to thank my family. I would not have been able to reach this milestone without the education that my parents, Ella and Arthur, provided for me, along with their boundless advice, support, and encouragement.
Thanks also to my late grandparents Charles, Lillian, Zecharia, and Rina, as well as to my aunts, uncles, cousins, and many others for their support and their keen interest in my progress over the years. Finally, I would like to thank Racquel for the happiness that she has brought to my life over the past two years. Despite all of the times that I have been stressed or unavailable while ﬁnishing this dissertation, she has always supported me. Her love, understanding, and belief in me, even when I doubted myself, have allowed me to achieve this goal.
Introduction
The past decade has seen the rise of cloud computing, an arrangement in which businesses and individual users utilize the hardware, storage, and software of third-party companies called cloud providers instead of running their own computing infrastructure. Cloud computing oﬀers users the illusion of having inﬁnite computing resources, of which they can use as much or as little as they need, without having to concern themselves with precisely how those resources are provided or maintained.
Cloud computing encompasses a wide range of services that vary according to the degree to which they abstract away the details of the underlying hardware and software from users. At the lowest level of abstraction, often referred to as infrastructure as a service, the provider only virtualizes the hardware and storage while leaving users responsible for maintaining the entire software stack from operating system to applications. Examples of such services include Amazon EC2 and competing offerings from IBM, Microsoft, and Rackspace. At the opposite end of the spectrum, called software as a service, the provider oﬀers speciﬁc applications such as word processing, email, and calendaring directly to end users, usually via the Web, and manages all of the necessary hardware and software. Although this category typically refers to services intended to replace desktop applications, such as Google Apps and Microsoft Oﬃce Live, it can also cover applications with no desktop analogs, such as social networking services like Facebook and Twitter.
Regardless of type, users have increasingly adopted cloud deployment to perform functions that they cannot carry out themselves or that cloud providers can execute more eﬃciently. More speciﬁcally, cloud computing oﬀers users the following beneﬁts:
Scalability: To operate their own computing infrastructure, users must make a ﬁxed up-front investment in hardware and software. If the demands on their systems later increase, they must invest in additional resources and bear the burden of integrating them with their existing infrastructure. Furthermore, if load subsequently declines, users are left with unused capacity. With cloud deployment, however, users can purchase only the resources they need in small increments and can easily adjust the resources allocated to them in response to changes in demand.
Availability, reliability, and global accessibility: Because cloud providers are in the business of oﬀering computing resources to many customers, they typically have greater expertise in managing systems and beneﬁt from greater economies of scale than their users. As a result, their systems often have higher availability and reliability than systems users could ﬁeld on their own. Moreover, storing users’ data on cloud providers’ servers allows users to access the data from anywhere without having to run their own always-on server with a globally-reachable IP address.
Maintainability and convenience: By abstracting away the details of the underlying hardware and, in some cases, the software, cloud providers free users from having to maintain those resources. In particular, software-as-a-service users often can use an application without having to explicitly install any software, simply by navigating to a Web site.
Unfortunately, these beneﬁts have come at a signiﬁcant cost. By delegating critical functions to third-party cloud providers, and, in many cases, transferring data that had previously resided on their own systems to providers’ servers, users have been forced to give up control over their data. They must trust the providers to preserve their data’s conﬁdentiality and integrity, and a provider that is malicious or has been subject to attack or legal pressure can compromise them. Providers typically promise to safeguard users’ data through privacy policies or service level agreements which are sometimes backed by the force of law. But as we explain below, these measures have been inadequate. As a result, users are currently faced with a dilemma: either forgo the many advantages of cloud deployment or subject their data to a myriad of new threats.
1.1 The Risks of Cloud Deployment
The recent history of cloud computing is rife with data leaks, both accidental and deliberate, and has led to a widespread recognition of the privacy risks of cloud deployment. The ﬁrst of these risks is unplanned data disclosure that occurs as a result of bugs or design errors in a cloud provider’s software. For example, a ﬂaw in Google Docs and Spreadsheets allowed documents to be viewed by unauthorized users, while two of the most popular photo hosting sites, Flickr and Facebook, have suﬀered from ﬂaws that have leaked users’ private pictures [76, 40]. Second, cloud providers’ centralization of information makes them attractive targets for attack by malicious insiders and outsiders, and their record of protecting users’ data has been disappointing. Indeed, in 2011, Twitter reached a settlement with the United States Federal Trade Commission over its lax security practices that allowed outside attackers to impersonate any user in the system and read their private messages. Moreover, numerous sites have experienced break-ins in which users’ email addresses, passwords, and credit card numbers were stolen [98, 80].
Third, cloud providers face pressure from government agencies worldwide to release users’ data on demand. For example, Google receives thousands of requests per year to turn over its users’ private data, and complies with most of them. In addition, several countries including India, Saudi Arabia, the United Arab Emirates, and Indonesia have threatened to block Research in Motion’s email service unless it gave their governments access to users’ information. Furthermore, users’ data may be disclosed to authorities without users’ knowledge, often without warrants. The U.S. Electronic Communications Privacy Act states that a warrant is not required to access data that has been stored on cloud providers’ servers for longer than six months. Moreover, under the “third-party doctrine,” U.S. courts have held that information stored on providers’ servers is not entitled to the Fourth Amendment’s protection against unreasonable searches and seizures because by giving their data to third parties voluntarily, users have given up any expectation of privacy.
Finally, cloud providers often have an economic incentive to voluntarily disclose data that their users thought was private. Providers such as Google [97, 119] and Facebook [87, 95] have repeatedly weakened their privacy policies and default privacy settings in order to promote new services, and providers frequently stand to gain by selling users’ information to marketers. Furthermore, even if users’ data is stored with a provider that keeps its promises, the data is still at risk. If the provider is later acquired by another company or its assets are sold in bankruptcy, the new owners could “repurpose” the data.
These risks to the conﬁdentiality of users’ data have received much publicity.