The Yale Law Journal
Volume 133, 2023-2024
Forum

How to Get the Property Out of Privacy Law

22 Apr 2024

abstract. Privacy law emphasizes control over “your” data, but requiring consent for each data use is unprincipled, not to mention utterly impractical in the AI era. American lawmakers should reject the property model and use a framework that creates defined zones of privacy and clear safe harbors, irrespective of consent.

Introduction

In the United States, multiple attempts to pass an omnibus privacy law have faltered.1 Explanations for these repeated failures usually home in on specific features of the proposals: a controversial preemption of state law, or a private right of action that was unacceptable to business-oriented legislators, for example.2 These explanations are accurate in a sense; they correctly identify the portions of the bills that divide active stakeholders and break apart political alliances. But there is also a deeper explanation—a latent tension between a property-based approach to privacy law and a torts-based one.

Property frameworks give people significant control over whether their data is collected and how it is used. Under this model, loss of control is a harm in itself (like loss of property), in addition to whatever downstream harms might also follow. By contrast, the torts framework manages risks related to activities. It assumes that nobody automatically has the right to exclude others from collecting, creating, or using information about them, but that everyone is entitled to protection from unjustified risks and from misuses of personal data that will foreseeably lead to physical, economic, and dignitary harms. The two frameworks have irreconcilable differences with respect to who decides how data will be collected or used, and how those decisions will be made. The conflict has smoldered and kept American lawmakers in paralysis.

Privacy advocates typically use the property/control framework,3 and this tendency has only increased over the last ten years under the influence of European and Californian privacy laws.4 European law treats personal data as something that belongs to the people it describes. The data subjects have exclusive control over processing in most circumstances, just as individuals have a fundamental right to control access to and use of their property.5 Thus, under the European Union’s General Data Protection Regulation (GDPR), any time a data controller wants to reuse data for a new purpose, it must seek the consent of the data subject.6 Deviations from the property frame occur only to make it even more difficult for individuals to sell or give away control of personal data, lest they trade away their privacy too easily.7 For example, the California Consumer Privacy Act (CCPA) and the GDPR put limitations on what types of quid pro quo trades a data controller can offer in exchange for the consent of a data subject.8 And even after permission is given, both legal regimes allow data subjects to renege and claw back their data under certain circumstances.9 Thus, privacy law is attempting to make personal data an extra-sticky form of property.10

But the property frame, popular as it may be, is unworkable and unprincipled.11 Consider how Europe’s privacy law has already affected the region’s approach to Artificial Intelligence (AI). Shortly after OpenAI released ChatGPT to the general public, Italian privacy regulators forbade its access to the Italian market because the company collected and automatically analyzed users’ queries.12 Yet as advances in AI and machine learning place more power in the hands of end users to plot and commit a wide range of acts, both good and bad, AI safety, to say nothing of performance, will require AI companies to monitor their clients’ uses and misuses of these systems in order to avoid catastrophic risks.13 Europe’s AI Act will require all AI systems to guard against bias and other risks (which requires companies to take into account “characteristics or elements that are particular to the specific geographical, behavioural or functional setting”),14 to maintain traceable logs of input data,15 and to engage in post-market monitoring of users and information-sharing about threats.16 Each of these decisions to collect, analyze, and occasionally disclose personal information undermines the data-subject control that was supposed to be so critical under the GDPR. Europe’s pretzel-shaped path for regulating the technology sector has a very uncertain future because the promise of fundamental rights to control personal information is simply not tenable.

Beyond its impracticality, treating personal information as property belonging to the data subject is unsound in principle, notwithstanding the widespread habit of referring to personal information as “my data.” Privacy laws that attempt to create sticky privacy interests in personal data are not merely impractical. They are also incompatible with the philosophy of property rights. Treating personal data like sticky property—something that is hard for the data subject to relinquish and easy to claw back in most circumstances—lacks historical and logical foundation. Property rules rest on an assumption that the rights-holder has superior knowledge about the best uses of the property—when to exclude, when to share, and when to sell—and would exercise that knowledge without causing significant problems for others.17 Outside of the special case of intellectual property, these conditions almost never hold when the object of the right is speech.18 Just as “my” ideas, “my” opinions, and “my” observations are not really mine—not in any sense that allows me to exclude you from using them too—“my” data is also not my data.19

Privacy law should return to its roots in tort theory, where legal rules are intended to mediate conflicts between legitimate activities and interests without assigning veto power to anybody.20 This is still the right frame. Good privacy policy will require lawmakers—courts, federal agencies, or what have you—to proactively protect people against risks that they may not have reason to know about. It will also require lawmakers to permit data processing that provides some benefit to the data processor, data subjects, or third parties without the necessity of getting the data subject’s permission. This basic structure follows the American tradition of treating privacy as one of many objectives in a bustling zone of conflicting activities and interests.

This Essay argues that the American tradition of treating privacy as part of the management of social risks, rather than as a sticky property right bestowed on data subjects, is a virtue that should not be cast aside in the rush to rein in technology companies.

This Essay proceeds as follows. Part I distinguishes a tort-style, risk-based treatment of privacy law from a property-style, rights-based framework and traces the historical, meandering path through both. Part II explains why a tort rule is more fitting for personal information than a property rule. Part III describes in broad strokes how a risk-based privacy regime would work. Courts, legislators, or regulators would need to establish some clear zones of protection (per se violations) and zones of liberty (safe harbors or per se nonviolations) in order to serve the foreseeable and obvious needs of data subjects and data users. They would also need to establish some benchmarks for analyzing novel forms of data processing. I provide a more elaborate discussion of the zones of liberty (safe harbors) because the breadth of these allowances is what distinguishes a risk-based approach from a rights-based approach that has some exceptions.

A return to the torts frame will set the United States up for success as privacy law is forced to respond to new uses of personal data in AI, autonomous vehicles, health innovations, and other areas where meaningful systems of privacy self-management will be impractical and undesirable.

I. tort versus property and the battle for privacy

Across the many ways to define property, the common feature is the right to exclude.21 As Guido Calabresi and Douglas Melamed put it, property laws give an individual or group enough control over something that they have a limited veto power over when or how it is used.22 They differentiated property entitlements from liability rules by focusing on the nature of the legal remedies: an individual who contests the actions of another under a liability rule might be awarded compensation based on their damages, and even then only if they can also prove fault.23 The property owner has a different remedy. Because the property owner alone has the right to decide whether the action should be taken, the property owner can have the court completely undo the other’s actions—by, for example, enjoining the defendant to return the item or not take the action again—and can demand punitive damages or other strong deterrents to reinforce the right of exclusive control.24

For the purposes of this Essay, I want to focus particularly on who gets to manage behavior by recognizing when a wrong occurs.25 With tort-liability rules, when two parties disagree over whether an action was wrong, that disagreement is resolved by a disinterested rule maker. This would be a judge under the common law, and the determination would be made ex post, after the putative harm has occurred. But I will also count as a “tort” framework other forms of lawmaking that identify wrongs without allocating a property interest, even if that work is done by legislators or administrative agencies. The important point is that a tort approach to managing behavior requires a disinterested public entity to decide what sort of conduct is wrongful based on its assessment of a myriad of societal benefits and risks. By contrast, a property framework does not require a disinterested assessment of wrongs. One of the interested parties—the one who holds the property right—has the final word on whether the other may use the property. If you need another person’s consent to do something, a property-style interest is involved.26

It is natural to assume that tort rules attach to activities while property rules govern things. For example, “driving” is an activity that multiple people can pursue without asking for your permission, but they do have to ask permission before driving your car.27 However, the distinction between activities and things becomes muddy with intangible or nonrivalrous things. Nowhere is this more obvious than in the context of privacy. If a company creates a log of your movements throughout a store, is this an activity (“creating” a log) or an invasion (creating a log of “your” movements)?

American privacy law has wrestled with this question for over a century. In The Right to Privacy, the famous article by Samuel D. Warren and Louis D. Brandeis that started it all, privacy was conceived as control: “The common law secures to each individual the right of determining, ordinarily, to what extent his thoughts, sentiments, and emotions shall be communicated to others.”28 Warren and Brandeis explicitly rejected the idea of treating privacy like a property right, but only to distinguish it from the colloquial sense of property as something commoditized and valued through its distribution and sale.29 As for the formal definition of a property right—under which the law offers exclusive control, including the hoarding of the property and the perpetual exclusion of others—this is precisely what Warren and Brandeis had in mind.

However, the earliest instantiation of privacy claims that could be brought against a private party emerged through common-law tort.30 Privacy-related tort claims were recognized only when plaintiffs could show they suffered “outrageous” intrusions or disclosures that would “be offensive and objectionable to a reasonable man of ordinary sensibilities.”31 In other words, people were generally at liberty to observe each other, share gossip, and otherwise invade what some could consider to be their personal bubble, just as they were at liberty to pursue other private activities like driving or playing frisbee in a park—even when those acts caused annoyance, delay, or accidental injury. But if the activities of collecting or sharing information are unreasonable, and if the target of those activities suffers distress or concrete harm as a result, then tort law recognizes a wrong.32

The emphasis on data-subject control reemerged in the 1960s, when computers became more common. The efficiency of computers changed the quality and quantity of personal-data collection, especially by the government. In response to the anxiety around computers, an influential report from the U.S. Department of Health, Education, and Welfare (the HEW Report) highlighted data-subject control more than American tort law traditionally had. According to the “Fair Information Practice Principles” (FIPPs) promulgated in the report, a data processor should not be able to share personal data with another entity without the informed consent of the data subject.33 However, a closer read of the HEW Report reveals more nuance. The report explicitly rejected formulations of privacy that assume the data subject has exclusive control and instead favored the concept of “mutuality.”34 As it explained:

[Some of the privacy formulations] speak[] of the data subject as having a unilateral role in deciding the nature and extent of his self-disclosure. None accommodates the observation that records of personal data usually reflect and mediate relationships in which both individuals and institutions have an interest, and are usually made for purposes that are shared by institutions and individuals. In fact, it would be inconsistent with this essential characteristic of mutuality to assign the individual record subject a unilateral role in making decisions about the nature and use of his record.35

The 1977 report Personal Privacy in an Information Society, produced by the Privacy Protection Study Commission, further refined the FIPPs to make clear that they do not place data in the data subject’s absolute control. For example, where the HEW Report says that “[t]here must be a way for an individual to prevent information about him obtained for one purpose from being used or made available for other purposes without his consent,”36 the 1977 report uses the principle that “[t]here shall be limits on the external disclosures of information about an individual a record-keeping organization may make (the Disclosure Limitation Principle).”37 Swapping data-subject consent for “limits” of some unspecified origin marks a shift away from the property frame.

Nevertheless, privacy statutes that were modeled after the FIPPs—starting with the Privacy Act38 (which constrains how the federal government handles personal data) and including privacy statutes related to healthcare (HIPAA39), credit reporting (FCRA40), and electronic communications (ECPA41)—prioritized a consent-based regime while leaving enough leeway and loopholes for the regulated industries to achieve some minimal level of innovation and operational efficiency.42 Recent proposals for federal privacy legislation have pushed for more data-subject control, with fewer allowances, and the thrust of most legal scholarship runs in the same direction.43

Now, to be clear, the division between the property and tort frameworks is not so sharp. Nearly every privacy advocate and scholar understands that privacy ensures a personal sphere that is shielded, but not absolutely closed off, from the uses and needs of others. Even when privacy law is built around a right of control, the right of a data subject to lock away information has been understood as a limited one that must be reconciled with, and sometimes superseded by, other compelling social needs. Leading privacy law scholars including Alan Westin,44 Daniel J. Solove,45 Helen Nissenbaum,46 and Neil Richards47 have recognized that information control is not always in the best interest of society or of the data subjects themselves, and privacy does not and should not require consent in every single conceivable case.48

So, the debate boils down to how wide or narrow the scope of freedom is for the data user—the potential privacy-violator, that is. It might be useful to separate the torts framework from the property one by thinking of defaults: Is it the case that individuals generally have control over the information that describes them, and exceptions are made to that general rule? (This would be the property frame.) Or is it instead more accurate to say that individuals generally have the freedom to observe, collect, and share information about others, and that this general rule of permissiveness is limited under circumstances that are harmful or risky? (This would be the torts frame.)49

Put this way, the torts framework is anathema to nearly every serious piece of scholarship or privacy proposal put out over the last several decades.50 Even Daniel Solove and Ignacio Cofone, who have advocated for regulating privacy based on risk rather than property-style user self-management (and for reasons very similar to my own), have not veered very far from the property frame’s center of gravity.51 Cofone would treat unexpected repurposing of personal data as a form of privacy harm per se that can support liability.52 Solove has embraced such a capacious definition of harm that his proposals would still require data-dependent firms to seek consent or stop what they are doing in order to avoid exposure to debilitating liability in a wide variety of real-world scenarios.53 For example, Solove suggests that use of personal data to create predictions should be regarded as risky based on the chance of error.54 He has also argued that anxiety about downstream consequences of a revelation should be recognized as a harm—even the fallout when a person is exposed as a liar.55 If firms face credible threats of liability in these scenarios based on the risk of anxiety or error, the scope of freedom becomes severely limited to a range that looks, to me, not much different than that afforded by the GDPR.56

The privacy scholarship has created a misimpression for the general public that strong control-based privacy laws do not pose serious limitations on legitimate and useful activities—on freedoms to experiment and innovate, to perform research, to speak, to compete against dominant technology firms, or to offer content and services at a price that is heavily subsidized by behavioral advertising. A tort approach that emphasizes these liberties, and that creates legal liability only when a data practice foreseeably causes unjustified and concrete harm to others, offers much more promise for an enduring form of consumer protection.57 A tort approach deters and provides recourse for activities that are harmful in a meaningful sense of causing real welfare reductions, and it also frees the data users to pursue activities that are not likely to cause harm.

II. privacy law needs to manage risks while recognizing legitimate data activities

Why is it better to have a risk-management system overseen by a judge or regulator rather than a sticky property right managed by individual data subjects? After all, if the data economy is good for consumers, they can always choose to license or sell access to their personal information. Why should decision-making be taken out of their hands?

The theory and scholarship coming out of the law-and-economics movement continue to offer the richest and most sustained attention to the question of when human interactions should be regulated through private property rights and when through liability rules. Generally speaking, law should recognize a property interest when individuals have special information about how to get the best value out of a resource and transaction costs are low enough to allow everybody to trade and rearrange their entitlements so that the resource is used for its most valuable purposes.58 Conversely, the scale tips against recognizing a property interest if transaction costs are high, or if the decisions of the rights-holder are likely to impose negative externalities on third parties.

For the reasons I explain below, these factors cut against recognizing a property interest in personal data.

A. Information Gaps and Transaction Costs

Would data subjects know how to maximize the value of their personal data if they had full control over its uses? It is hard to believe they would. One problem is that people might not have stable values for their own privacy, as suggested by the so-called “privacy paradox.” This is the frequently replicated phenomenon where individuals report high levels of concern about their privacy but are also willing to give it up for small payments or perks.59 While there are several theories that could explain the paradox,60 the most parsimonious explanation is that abstract questions about the value of privacy cannot account for what is actually a pretty utilitarian calculation. Concerns about privacy are adjusted up or down depending on the likely consequences of each data practice in context.61 People will generally allow a data practice if they believe the benefits outweigh the risks.62

In other words, for most people, the value that they derive from personal data being used or not used depends on what they get out of it, what they lose from it, and whether it helps other people. This would be a good match for a property and contract/market system if the type and value of data processing were obvious to data subjects and if the consent process were low cost. The trouble is that it is costly and time-consuming not only to manage consent processes, but even to analyze the risks and benefits of each data use to figure out whether to consent in the first place.63

The main benefits that Big Data brings to consumers are the same ones that the data-using companies want, too: access to personal data helps drive down transaction costs.64 Within the broad set of factors that can cause markets to be inefficient or unfair, or even to fail, many spring from the fact that the participants in the market lack relevant information to make the best choices. These include search and matching costs (the costs of making sure that prospective buyers know about what is offered for sale and that prospective sellers know where there is demand for their products and services); verification costs (the costs of making sure that goods and services offered for sale really are what sellers purport them to be); the costs of bargaining; and the costs of policing or enforcing performance of a contract.65 When these costs are reduced, the surplus is usually shared by all participants in the market.

For example, one longstanding source of concern is the use of greater amounts of personal data for credit scoring and lending decisions. The unspoken presumption is that a scoring system that uses a greater amount of personal data creates a privacy imposition on the loan applicants and a financial benefit to the banks. But this is not so. A study of mortgages in the San Francisco Bay Area found that loan applicants living in counties with greater privacy protections (that set privacy as the default) paid higher interest rates and also defaulted more often than the applicants living in the counties that set data flow as the default, even after controlling for confounders.66 Banks could not match applicants to loans as well, so there was more risk, and the costs of risk were, of course, passed along to the consumers.

Consider another example of reduced matching costs (though it is rarely discussed in quite this way): quarantine during a pandemic. The goal of a quarantine, whether voluntary or compulsory, is to restrict the movements of individuals who are most likely to be infected and contagious without interfering with the activities of those who are least likely to be so. In an environment with strong privacy defaults and information friction, this is very hard to do, with the result being that more people are quarantined, more people are infected, or both. South Korea’s public health authority took the unusual step of publicly disclosing the time-stamped geolocation of individuals who later tested positive for COVID, allowing residents to self-assess whether they had been exposed to the virus and quarantine if necessary.67 This marked a loss of control for the individuals whose location histories were shared automatically, and for every South Korean resident who may in the future become COVID-positive. But in exchange for this loss, Korean residents avoided an estimated 200,000 cases and 7,700 deaths during the subsequent four months.68

Transaction costs have become visible in the wake of the implementation of the GDPR in Europe. The GDPR caused investment in research and development for new technology startups in the EU to falter, and the productivity of EU firms fell as compared to U.S. counterparts.69 Meanwhile, the internal budgets for privacy offices at companies of all sizes (including in the United States) increased twenty-nine percent.70 Costs to American companies that comply with all aspects of GDPR-style laws are estimated to be approximately $480 per data subject.71 Companies respond by raising prices for consumers—a cost that may well be worth it to some people and in some contexts, but probably not as a general rule. And this ignores the costs of inconvenience not only in the form of the time required to click through and manage consents, but also in terms of the degraded service that results from a less customized experience. For example, the introduction of GDPR seems to have resulted in consumers having to use twenty-one percent more search terms and to access sixteen percent more websites before making their online transactions.72

That said, when personal data winds up in the vault of a data company, there is no guarantee it will be used for net-beneficial purposes (like matching loans to applicants) rather than welfare-reducing purposes (like creating sucker lists).73 It’s not just consumers but the data collectors, too, who do not know how information that is stored or shared can be used in the future—to the data subject’s benefit or detriment.74 The information vacuum leaves data subjects with no clear preference between sharing their personal information and trying to lock it away. Privacy discussions in the media and legal journals do not usually even attempt to net this out by comparing the chance of harm from lost control over personal data to the chance of benefits to the data subjects and to others. Our collective instincts about the Big Data economy may be overly pessimistic because so much of the benefit comes in an invisible form of lowering needless transaction costs. But the larger point is that the holders of the property right, if one were to exist, would not know which disclosures and uses inure to their benefit and which do not. If risk management were handed to a trusted and trustworthy authority—a judge or an agency that had the right incentives to identify and weed out harmful practices—the data subject would be freed from worry and from the relentless queue of consent requests.75 Otherwise, data subjects are destined to be habitual consenters or nonconsenters. The habitual consenters will suffer the costs of oversharing (e.g., greater risk of identity theft), and the habitual nonconsenters will bear the costs of undersharing (e.g., higher interest rates).

B. Externalities and Collective Action Problems

If a property rule is likely to cause harmful externalities—that is, harm to individuals who are not represented by the parties bargaining over the use of the property—liability rules should apply instead.76 A veto over the use of personal data will cause externalities, both bad and good. The good spillover effects include the protection third parties receive when a harmful practice—such as imputing information or predicting others’ behavior for a malicious purpose—is stymied because data subjects refuse to supply their data during the training stage. These spillover effects of privacy are discussed in the privacy literature.77 But there are also several types of negative externalities, and these are likely to be more common. The negative externalities include (1) evading detection of fraud, (2) frustrating attempts to improve accuracy and to test for and reduce bias, and (3) reducing competition in the technology industry.

Evading detection of fraud. Harmful externalities would arise from personal-data ownership any time a data subject can exercise veto power during the course of an investigation into fraud, crime, or other misbehavior. If a fraudster exploits the privacy of their communications or bank accounts to obscure their bad intentions and gain the trust of a mark, the costs of privacy are borne not by the banks or the communications providers but by the third-party fraud victim. This problem was vividly captured in Richard A. Posner’s The Right of Privacy, which cautioned proponents of privacy rights that a right to privacy can become a right to commit a wide range of formal and informal frauds.78

Reducing accuracy and bias correction. Machine-learning applications and basic social-science research depend on representative samples of data to perform well and to avoid biased results.79 These goals would be frustrated if some (nonrandom) set of data subjects refused to allow access to their data.80 Indeed, the timely and worthy goal of tackling unintentional bias in AI systems will require more personal data to avoid biased and unnecessary error in predictions.81 In theory, because everybody benefits from a more accurate and fair AI or machine-learning system, everybody would be willing to pay their share, via money or reductions in privacy, to ensure that the AI system has access to an adequate amount of training and context data. But nobody has the incentive to pay extra to make up for data holdouts. Even AI service providers, who would have some incentive to achieve a minimum level of accuracy to have a viable product, are likely to lack the incentive to pay for the optimal amount of personal data because a biased or error-prone system can still be marketable as long as it performs better than the available alternatives.

Reducing competition. Consent-based privacy controls also frustrate competition in the digital marketplace. The companies that are in the best position to collect and manage consents and to combine a variety of types of data are the largest companies that already dominate their markets—Google and Amazon, for example.82 This can be understood as another collective-action problem:83 Consumers as a whole know that they would benefit in the long run from more companies in more robust competition for their attention and money. But on the margin, no data subject would volunteer to spread their data around and subject themselves to the risks of misuse if the goal of greater competition is likely to fail due to the inaction of the other data subjects.

These problems of collective inaction are mitigated when data users are allowed to operate in a limited zone of freedom where their activities can proceed as long as they aren’t likely to cause harm—a classic liability rule rather than a property rule.

C. Speech as a Quintessential Liberty Zone

Debates about data-privacy laws are so steeped in the language of consumer protection and digital markets that they obscure the fact that data privacy is a direct restriction on information, and on the means of producing knowledge.84 In other words, privacy laws are speech restrictions. Modern free-speech law is rooted in the theory that speech and information create thorny collective-action problems: because the benefits of free speech are often amorphous, hard to predict in advance, and spread across large numbers of people, speech tends to be undervalued, and constitutional scrutiny helps counterbalance that tendency.85

The First Amendment effectively requires a risk-management approach to the regulation of speech. Outside of intellectual property, which has a distinct claim to matching the theory of property rights and its own constitutional basis for doing so,86 speech has been treated as a special activity that should be constrained only when the harms are serious and nonspeculative.87

Consider the rules allowing plaintiffs to sue for defamation. Defamation law permits the subject of a communication to sue the speaker if the communication is false, impugns the character or reputation of the subject, and is made to a third party with the requisite level of fault.88 The real-world harms that can be caused by spreading lies about a person are obvious. A defamatory lie harms both the subject of the lie and the listeners who may be duped by it. While harm can also occur from spreading true statements about a person, as a rough rule of thumb, we might expect that reputation damage is more unjust, and therefore more harmful, when the harsh judgments of character are based on fabrications. In other words, there is clear, concrete harm when actors engage in spreading falsehoods. And yet, under the pressure of constitutional law and the logic of tort law, a claim for defamation is highly constrained. A plaintiff cannot simply say, “This information pertains to me, ergo I can demand the removal and deletion of the publication and compensation for my lost reputation.”

Defamation law looks nothing like a property claim. A plaintiff has to prove that the defendant’s disclosure was false, was published with at least negligence with respect to its veracity, and (in most cases) caused concrete harm.89 Even then, defendants can raise several legal privileges that operate to ensure that frank conversations and normal operations are not impeded by fear of defamation liability. A person or company is privileged to share information about another person, even if it is false, if the sharing is done in the course of an official proceeding,90 if the information is offered in self-defense,91 if they are warning others of danger, if they have a common interest with the listener,92 if the disclosure is in the public interest,93 or if the disclosure occurs entirely within a corporation.94

Thus, when it comes to falsehoods, the law is about as far from a property frame as one could get: not only do the subjects of falsehoods have no veto power over when they are being discussed, but the liability rule that applies to the information sharers is narrow, crafted with a good deal of concern about chilling information flows to the detriment of everyone. Defamation law has recognized several safe-harbor privileges that ensure speech activities are not chilled when the free flow of information is net beneficial, even if it isn’t perfectly beneficial for each and every person.

Defamation is not the best model for privacy law, but it is a useful guide because it showcases the theoretical, practical, and constitutional reasons to avoid assigning property rights in information.

D. Case Study: Facebook

Many of the problems I have described with treating personal information as property can be seen in the political crosswinds that are jostling the major U.S. technology companies. The demand for property-style privacy law spiked shortly after the revelation that Facebook had allowed third-party companies like Cambridge Analytica to collect a rich trove of Facebook user data (as well as some more basic information about the users’ Facebook friends).95 The immediate legal response, including the ballot-initiative campaign that produced the CCPA, was to create sticky property rights for data subjects so that data could not be disclosed or used for a new commercial purpose without a renewed and salient consent procedure.96

However, the intervening years have also brought a good deal of concern that the largest technology companies, including Facebook, were amassing such a rich trove of personal data about their users that startup companies would not be able to compete—hence efforts in Europe and the United States to affirmatively require digital platforms to make user data available to third-party companies.97 There has also been an increased understanding that small content producers are dependent, to varying degrees, on behavioral-advertising revenues, and that the costs of consent and high friction affect a large ecosystem of journalists and entertainment firms.98 A risk-based legal rule would avoid these problems by forcing lawmakers and judges to be more honest and concrete about collective priorities when consumers’ goals and interests are in tension.

We can see some of the logic of a torts framework refracted through the facts of In re Facebook.99 Facebook was tracking the web-browsing behavior of Facebook users when they visited websites with an embedded Facebook “like” button. The websites cooperated with the practice because they got an advertising boost if visitors clicked the “like” button,100 and Facebook of course got access to more particularized user-behavior data that it could leverage in its ad-exchange business.101 Perhaps the third-party websites provided notice in their privacy policies,102 but the practices were not made salient. As a result, Facebook users were subjected to cross-site tracking without realizing it.103

To proponents of property-style privacy rights, these facts constituted a straightforward violation of privacy rights. Facebook engaged in an unconsented observation and collection of personal data where its users weren’t expecting it, and this interfered with Facebook users’ exclusive control over that data. Case closed.

Tort principles, by contrast, required a different sort of analysis. Applying the intrusion-upon-seclusion tort,104 the court’s opinion made clear that the plaintiffs first had to prove that the personal information the defendant collected was private (“seclusion”) to have any chance of recovery.105 This requires plaintiffs to prove that the defendant’s observations and data collection violated the norms and reasonable expectations of the observed.106 The inquiry will not always be straightforward: in physical space, real property and architectural features can double as markers of expectations, but in digital space those expectations are far less visible.107

But even if the data is private, the defendant could still prevail if the collection was not unreasonable (“highly offensive”).108 This element essentially acknowledges that making observations and using information is an activity people are generally permitted to do. Thus, the consequences of observing or using somebody’s personal information have to be significant in order for courts to curtail that freedom. As the court put it:

“[P]laintiffs must show more than an intrusion upon reasonable privacy expectations. Actionable invasions of privacy also must be ‘highly offensive’ to a reasonable person, and ‘sufficiently serious’ and unwarranted so as to constitute an ‘egregious breach of the social norms.’” Determining whether a defendant’s actions were “highly offensive to a reasonable person” requires a holistic consideration of factors such as the likelihood of serious harm to the victim, the degree and setting of the intrusion, the intruder’s motives and objectives, and whether countervailing interests or social norms render the intrusion inoffensive. While analysis of a reasonable expectation of privacy primarily focuses on the nature of the intrusion, the highly offensive analysis focuses on the degree to which the intrusion is unacceptable as a matter of public policy.109

The guiding light is risk management. When courts ask how the plaintiff might be harmed by the alleged practices, what the defendant intended to do with the information, and what society gets out of the whole affair, they are recognizing that the legitimate interests of defendants and others are also at stake, and any disagreements that arise must be managed without giving a veto or exclusive control to anyone.

III. a practical guide to risk-based privacy law

Privacy is part of a larger social contract.110 American privacy law has been stuck in a perennial state of contestation because it is part of a complex set of trade-offs and coordinated actions that data subjects have with each other, with industry, and with the government.111 While it may very well be that most people prefer, in general, and all else being equal, to have control over their data, these abstract preferences say little about where legal rights and obligations should be drawn in real-world contexts, where personal interests in privacy come into conflict with other pressing or pragmatic concerns. When control-based privacy rights come at a cost to threat detection, machine-learning applications, or even consumer convenience, “all else” is not equal, and consumers will be better off in many scenarios without weighing in on data processing.

The HEW Report stated that a respect for privacy requires data collectors to anticipate and behave responsibly when conflicts arise between their interests and those of the data subjects. Some data uses are in the mutual interests of the controllers and the data subjects, some are in their mutual interests but are “not perceived as such,” and some are in direct conflict.112 This observation was ahead of its time and explains why privacy rules are so difficult to articulate in advance. Add to this third parties’ legitimate interests, and it is clear that the interests people have in a data practice sometimes run together and sometimes run against each other, and that data subjects will have a hard time knowing when their welfare is in jeopardy.

A risk- or harm-based approach to American privacy law is the right course. However, because the consent model has dominated privacy discourse for so long, lawmakers and legal scholars have not been focused on designing effective risk-based frameworks.

Risk-based privacy law needs to develop three categories of practices: (1) per se privacy violations, which are harmful practices that should not be conducted unless the data processor has received clear and well-informed consent (and possibly not even then); (2) safe-harbor data practices, which can be thought of as per se nonviolations and are the data practices that are clearly warranted because of public needs or because of their benefit to the data subject, data processor, or others; and (3) a messy middle category, consisting of practices that are neither obviously harmful nor obviously desirable. The legality of the practices in the messy middle should depend on the procedures that are in place to provide notice or transparency, the sensitivity of the data or inferences, the ability of data subjects to avoid the practice if they wish, and the costs and benefits of the practice. Liability for the middle category will depend on a common-law-like process that sorts new use cases into the first two categories (per se violations or safe harbors).113

A. Per Se Violations

Per se privacy violations involve observations, disclosures, or uses of data that are nearly universally unwanted and disturbing or that are unnecessary for the welfare of the data subject and the community. Lawmakers have a head start in identifying per se violations based on existing privacy torts, rules, and statutory laws that have stood the test of time. Examples of established per se privacy violations include intrusive observations or recordings of private spaces and conversations—such as upskirt photography114 and the wiretapping of phone lines and other private communication channels115—and the needless infliction of embarrassment.116

A few themes can be discerned from this collection of longstanding privacy violations. First, the collection of information by uninvited observers will cause people to have less candor with one another, so it is imperative to define some spaces and contexts that are private—where those within the seclusion zone can be assured they are not monitored by outsiders, and where outsiders have effective notice that the usual freedoms to observe and record are not available unless they are invited in. Second, while the zones of seclusion will not be easy to define in all cases, there are some areas and contexts (e.g., bathroom stalls) for which there is near-universal agreement. And third, the law can and should recognize when personal information is disclosed to an audience that will foreseeably take advantage of vulnerable data subjects or will foreseeably overreact and retaliate against them.

I suspect other categories of per se violation could be added to a new statute without much controversy based on these principles. For example, uses of data that are purely extractive and designed to facilitate fraud or addiction could be recognized as per se violations.117 The indiscriminate publication of large amounts of private information (for no apparent public purpose) may be another. A firm’s knowing or reckless noncompliance with its own privacy policies could also be considered a per se violation, with statutory damages or actual damages awarded depending on the firm’s mental state.118 Legislators or regulators could also prohibit the use of data that has the purpose or the unjustified effect of discriminating against protected classes of individuals. For example, using personal data to infer an employment applicant’s race, pregnancy status, or religious affiliation in order to affect the chance of hiring would be a violation no matter what type of data was used, how it was collected, or how it was processed.119

B. Safe Harbors (or Privileges, or Per Se Nonviolations)

Existing statutes and precedents can also supply a partial list of data practices that the law has historically exempted from privacy-related restrictions. These should be treated categorically as nonviolations in a risk-based system. Data practices that have become commonplace and that could not be forbidden without a significant shock to popular digital-media services may also be good candidates for legal privileges. As a general rule, safe harbors should be created if there would be broad agreement, even if not universal agreement, among well-informed observers that a data practice is good for society on balance.120 Together, a set of safe harbors can establish a zone of liberty for data processors and innovators.

What follows is a starter set of data collection and processing safe harbors based on common privacy-law exceptions, free-speech-related privileges, and common industry practices. They are listed from least to most controversial.

1. With Consent

A data practice done with the knowledge and voluntary consent of the data subject should be considered a per se nonviolation, as long as the practice is not on the list of per se violations. Few would find this controversial, since this exception replicates the key feature of popular privacy-law proposals—data-subject control. Operationalizing consent is another matter, though. There will be ambiguities over whether a data subject has sufficient knowledge about the bargain and whether the consent of a data subject is voluntary or given under an unacceptable level of duress. Data-use disclosures that are buried in an end-user agreement may not constitute evidence of “knowledge” unless the practices are also well publicized or generally known to data subjects. And consent that is obtained in circumstances where the data subject has little choice—such as at a hospital or with an employer—could also fall short of the proper definition of consent. But these issues are not intractable, and they are not unique to the privacy context. Consent doctrines in battery,121 intellectual property,122 and police searches123 can serve as models.

The government could simplify and encourage effective consent by developing voluntary labeling schemes that help firms quickly signal the sorts of data practices the firm has committed to using (or not using). This would allow firms to compete on privacy more efficiently as a salient feature of their services.124 But the important thing, for this Essay, is that consent offers just one of many routes for companies to avoid legal complications.

2. For the Direct Benefit of the Data Subject

When a wallet is returned to its owner, the owner will not resent the Good Samaritan who looked inside to find an ID. The same will be true when personal data is used to locate individuals suffering from a mental-health crisis, track down displaced children, find individuals for the purpose of relaying payments to them (such as class-action settlements, child-support payments, or refunds), provide warnings about known or credible threats to health or safety, or complete forms or bypass red tape for a transaction that the data subject initiated. To generalize, services performed under the reasonable and good-faith belief that the services will assist and benefit the data subject should be per se nonviolations of privacy. They can be analogized to the tort doctrine of “presumed consent.”125

3. For Self-Protection or the Protection of Others

Occasionally, the data subject is an aggressor who is attempting to use deceit to harm others.126 In these situations, the past or future victims of their deceit have more compelling interests in discovering the misrepresentation than the data subject has in hiding it.127 The law should permit data to be used without consent for detecting or warning about fraud, criminal activity, or other misbehavior, and for complying with federal or local “Know Your Customer” laws.128 To be sure, detecting fraud and crime requires analyzing the data of many innocent individuals. But when personal data is accessed and analyzed for the purpose of exposing misconduct, the data processor does not need consent.

For example, Apple once had plans to check all images uploaded to iCloud against known child-pornography images.129 If (and only if) an accountholder’s photos produced ten matches, the software would have automatically alerted Apple employees so they could share the information with authorities. Apple has since abandoned its plans in response to criticisms and concerns related to privacy.130 The safe harbor I suggest here would allow Apple to proceed with the program without the threat of legal penalties because the purpose of the program is to detect threats to third parties. Of course, this safe harbor would not require Apple or any other firm to scan for threats to third parties. But if a firm chooses to do so, that act would be immunized from privacy-related suits.

4. For the Purposes of Statistical Research or Internal Research and Development

Most privacy laws allow data processors to use data for internal research, product improvement, and new-product development, as well as for general statistical-research purposes.131 They also allow data processors to prepare deidentified versions of the data and share them with other researchers. These research uses of personal data often fall outside the statutory definitions of “personal data” because the data is expected to be used in a manner that does not directly link back to the individual data subjects. The reason this exception to privacy-related liability is controversial at all is that there is significant concern over the potential for deidentified data to be reidentified, or to be used in a manner that is highly stigmatizing for a particular identity group.132 Thus, the crafting of this safe harbor will depend on a thoughtful approach to make sure data processors are making reasonable efforts to prevent the reidentification of data subjects.

5. Cooperating with Civil, Criminal, or Regulatory Investigations

Every privacy statute has an exemption for firms that respond to subpoenas, summonses, or warrants as long as the judicial or law-enforcement requests are consistent with Fourth Amendment law and other applicable statutes.133 Legal scholars have been critical of these exemptions,134 but they are part of an enduring trade-off between privacy and public safety. At its best, data can be used by police not only to detect or deter a perpetrator of a serious crime, but also to clear a suspect or exonerate a criminal defendant.135 Thus, it probably makes sense to provide a law-enforcement safe harbor in a generally applicable privacy law, and to encourage lawmakers to place appropriate restrictions on law-enforcement access through more targeted legislation.

6. For Matching, with the Direct Participation of the Data Subject

Finding reliable information about potential clients, customers, or business partners is a market transaction cost that can frustrate matching between the two sides of a market or search process.136 Just as consumers often need information about businesses to have confidence that they know enough about the quality of goods and services, businesses, too, need information to find their customers and clients.

Sometimes, this matching is performed with the proactive participation of the data subject, as when the data subject applies for a loan, seeks admission to a school, or enters search terms into a flight aggregator website or a search engine. When the data subject initiates a matching process, the data processor responding to the request should be permitted to use independently sourced data—that is, information that goes beyond what the applicant has supplied—in order to find the best match between the applicant and the supplied product, service, or content.

Let us use lending as an example, since the matching process in this market is more familiar than in others. When lenders use credit reports to make lending decisions, these independent sources of information can reassure them about the creditworthiness of a loan applicant.137 Credit reports help the applicants as much as they help the lenders. While there may be some threshold beyond which additional data is not useful for improving matching and performance, society has not reached that threshold.

The value of matching extends well beyond credit markets. In the health sector, for example, advances in machine learning are changing the practice of medicine because new programs can digest and learn from vast amounts of data about both the patient and others to make customized, time-sensitive recommendations.138 Data-driven medical-adherence scoring systems, which use information about patients to predict whether they are likely to stick with a prescribed treatment, can improve health by better matching patients to services, such as treatments and pharmacy interventions.139 Some of the benefits of personal-data use can be harnessed with the data subject’s consent and active participation, as when a health or wellness app asks users for permission to access data from sensors or other digital services. But this is not always the case. For one thing, managing consent and permissions adds a layer of costs that will often be impractical or financially detrimental. For example, in the United States, healthcare-privacy regulation caused hospitals to slow or stop the adoption of electronic medical records due to the costs of consent and compliance burdens, preventing implementation of this cost-saving technology.140 The transactions would be even more difficult and costly if every company had to negotiate with each individual data subject over payment and other contract terms. The practical effect of such a rule would be that all industries would simply have to operate with much less information and would be much more wasteful and higher priced as a result.141

To be sure, the effects on individual data subjects will be mixed when personal data is used to match people to scarce resources. Not every data subject will be made better off. Some will receive less favorable treatment (e.g., worse home-mortgage terms) than they would in the absence of secondary sources of personal data. But as long as the purpose of analyzing the data subjects is legitimate (e.g., to match credit-card offers, dating-site users, or search results), access to more complete data will result in more “winners” than “losers,” and the average user of the matching service (and society at large) will be well served.142

This recommendation does not currently exist in any privacy statute, so far as I am aware. And the concept of matching, which is often performed through algorithmic predictions and scoring, has tended to provoke the suspicion and fear of consumers.143 Thus, I count this as one of the more controversial recommendations for a safe harbor. Nevertheless, the demonstrated benefits of using additional data to perform better matching (in terms of both reduced error and reduced bias), as compared to alternative methods, are sizable enough to justify it. And that’s to say nothing of the free-speech interests involved.144

7. For Personalizing or Targeting Speech, Even Without the Direct Participation of the Data Subject

Finally, lawmakers should seriously consider recognizing a safe harbor for data collection and processing that is performed for the purpose of tailoring the creation or delivery of speech. This category includes some exciting and high-value applications, such as the personalized diagnosis and recommendation tools emerging in Health AI.145 But it also includes some of the more controversial practices in the Big Data economy—behavioral advertising and hyperpersonalized social media newsfeeds—which have long attracted the attention and ire of regulators.146 Yet these practices deserve protection from the threat of litigation outside especially harmful circumstances. First, the use of personal data to tailor or target messaging—including marketing—is fully protected speech under the First Amendment.147 Second, services often depend on meeting the niche demands of their audiences, or on the extra income from targeted advertising, to subsidize the zero-price goods and services that internet users have come to love.148 A law that makes these popular business models illegal will diminish the quantity and quality of content, perhaps in ways that users cannot fully appreciate. The case that targeted advertising or recommender systems have, on balance, done consumers more harm than good has not been substantiated, with the possible exception of the wide-ranging (but mixed) evidence of special harms to adolescents from social media.149

* * *

Privacy law (and society at large) would benefit from the delineation of safe harbors. Establishing safe harbors will spur activity and innovation in these zones of liberty and will reduce the economy-wide costs of legal uncertainty and consent rituals. The set of per se nonviolations I have recommended here ranges from the banal (e.g., for the direct benefit of the data subject) to the more controversial (e.g., for cooperation with law enforcement, or for targeted content and marketing), but each has some claim to legitimacy based on logic, tradition, or constitutional protection.

C. The Messy Middle

Between the per se violations and safe harbors resides an indeterminate middle category where a wide range of factors will have to be used to determine whether a particular practice, carried out in its real-world context, causes unjustified harm. In this zone, reasonable minds may disagree on whether a data practice is appropriate, just as they disagree in difficult negligence cases about whether the defendant behaved reasonably. In other words, these are the hard cases. But before diving into the factors that will determine whether a privacy violation has occurred, let us first reflect on how much has already been resolved. Many of the data practices that provoke privacy debates—from behavioral advertising to identity theft—have already been sorted into one of the per se categories as either automatically permissible or automatically impermissible. What is left are practices that are not terribly common or that have not yet emerged, and that do not obviously belong in one per se category or the other.

For these hard cases, most privacy experts would home in on a few factors that cut either for or against the recognition of a privacy harm. The analysis of “highly offensive” in the In re Facebook case provides a starting place for the factors that should be relevant—the motivation of the data processor, the risk of harm to the data subject, and the impact on third parties, among others.150 Other relevant factors include whether the data processor overcollected (recording more data than is likely to be useful for legitimate primary and secondary purposes),151 used security best practices,152 and provided salient forms of notice.153 The costs of obtaining consent (both financial costs and the costs to the utility of the data for the particular purpose) should also be considered, as should the actual and perceived costs and benefits to the individual, to the data processor, and to third parties. Identifying a privacy violation requires a consideration of the totality of these factors. The hallmark of a violation is when the privacy risks cannot be justified by the benefits of a practice.154

Conclusion

This Essay has argued that privacy law, in order to be meaningful and workable in a technologically advanced environment, must be crafted around principles of risk-mitigation rather than data ownership. Data processing should operate in a general zone of permissiveness, with limitations based on the foreseeable risks that a particular practice will create.

This proposal will be foreign to a privacy culture accustomed to advocating for data-subject control, and some may worry that a risk-based privacy regime puts too much faith in the companies that collect and use data—allowing them, rather than the data subject, to decide what to do with it. Viewed this way, a risk-based approach could be mistaken for little more than codified self-regulation. But this is not so. A risk-based privacy framework would not leave the scope and meaning of privacy protection to the priorities and whims of data users. Instead, a neutral intermediary—a judge or a federal agency, for example—would craft rules for safe harbors and per se violations, would determine the propriety of other practices case by case, and would assess new data practices and business models as they emerge. Data processors would be no more in control of the definition of “privacy” than manufacturers are in control of the definition of “negligent design.”

Nevertheless, there is a deep question hidden in the objection. Who should be considered knowledgeable enough and reasonable enough to define which practices are harmful, beneficial, or more or less a wash? And on what basis could we feel confident that the decision maker has all the necessary information? There are viable arguments in favor of common-law courts, of legislatures, and of expert agencies like the Federal Trade Commission. I am not sure which has the best case. But current and future debates about who should craft privacy rules of reasonable care should not obscure the main point: data processing is a presumptively valid, net-positive activity. Safe harbors should be ample enough to cover nearly every low-risk activity involving the collection and use of information. Close cases should require the decision makers—whichever branch of government they are in—to take account not only of the data subject’s interests but also of the interests of processors and third parties. And unnecessarily risky practices involving personal data should be forbidden and deterred, no matter how many unthinking click-through consents the data user may have been able to collect.

Brechner Eminent Scholar and Professor of Law, University of Florida. I am tremendously grateful for comments and feedback on an early draft from CJ Pommier, Derek Bambauer, Neil Chilson, Jorge Contreras, James Cooper, Erika Douglas, Barbara Evans, Dan Gilman, Jim Harper, Gus Hurwitz, Ginger Jin, DeBrae Kennedy-Mayo, Geoff Manne, Will Rinehart, Eugene Volokh, Peter Winn, Andrew Woods, Christopher Yoo, and John Yun. I also benefitted greatly from the patience and wisdom of Dorrin Akbari and the editors of the Yale Law Journal Forum.