Models, Race, and the Law
abstract. Capitalizing on recent advances in algorithmic sampling, The Race-Blind Future of Voting Rights explores the implications of the long-standing conservative dream of certified race neutrality in redistricting. Computers seem promising because they are excellent at not taking race into account—but computers only do what you tell them to do, and the rest of the authors’ apparatus for measuring minority electoral opportunity failed every check of robustness and numerical stability that we applied. How many opportunity districts are there in the current Texas state House plan? Their methods can give any answer from thirty-four to fifty-one, depending on invisible settings. But if we focus only on major technical flaws, we might miss the fundamental fact that race-blind districting would devastate minority political opportunity no matter how it is deployed, just due to the mathematics of single-member districts. In the end, the Article develops an extreme interpretation of a dubious idea proposed by Judge Easterbrook through an empirical study that is unsupported by the methods.
Introduction
The Voting Rights Act of 1965 (VRA) guarantees that all American citizens, regardless of race or ethnicity, should have an equal opportunity to participate in the political process and to elect representatives of their choice.1 The VRA frequently interacts with single-member districts, which serve as the electoral system for congressional and nearly all state legislative races and are the go-to remedy in local VRA enforcement. It has long been known in the redistricting literature that random boundary placement puts minorities at a major structural disadvantage.2 Single-member districts can secure electoral opportunity for minorities, but only if the minority population is sufficiently concentrated and the boundaries are favorably aligned. The ability of the VRA to remediate historical discrimination and underrepresentation thus depends on proactive redistricting. As a matter of practice, when a set of districts empowers minority communities to elect representatives in rough proportion to their population, courts have held the promise of political equality to have been fulfilled.3 However, proportionality has functionally operated as a ceiling even when viewed as normatively desirable: White voters will never be represented by less than their share of the population while minority communities nearly invariably will.4
In The Race-Blind Future of Voting Rights (henceforth, the Article), Jowei Chen and Nicholas Stephanopoulos sketch out a less proactive future of districting, including a mechanism that stands to needlessly sabotage minority political power and undermine the signal remedial goal of the VRA.5 The authors devote their Article to delineating a new baseline of opportunity provided by a randomized redistricting protocol that operates with no regard to race.6 Their project is strategic and pragmatic, motivated by the prediction that an increasingly conservative Supreme Court is likely to effect “avulsive change” for the VRA in the near term, quite possibly by dropping any role for rough proportionality and elevating race-blind mapping as a new ideal.7 Their Article thus seeks to provide a roadmap for voting-rights advocates to navigate a new nominally race-blind landscape.
To present their approach as a manageable standard, Chen and Stephanopoulos go big—modeling voter preferences in 1,903 districts and evaluating 38,000 districting plans spanning 19 states—and describe their outputs as the race-blind baseline, full stop. Their particular setup is said to be capable of capturing the full dynamics of non-racial redistricting.
We find that most—though not all—enacted state-house plans overrepresent minority voters relative to the race-blind baseline. For example, numerous plans in the Deep South include substantially more African American opportunity districts than would typically emerge from a nonracial redistricting process, while a few plans in the Border South include fewer such districts. Similarly, several western states feature extra Hispanic opportunity districts compared to the race-blind baseline, while only one western state underrepresents Hispanic voters.8
As we show below, the authors’ methodology does not warrant these kinds of conclusive statements, much less the slippage into the unmistakably normative language of over- and underrepresentation.
We certainly share the authors’ enthusiasm about the burgeoning ensemble method. The central counterfactual problem in vote dilution law for many decades has been that of conceptualizing the undiluted baseline, or understanding how districts might convert votes into seats in a state of nature, absent manipulation. In recent years, algorithms that generate large samples, or “ensembles,” of plausible districting plans have been increasingly used to approach that question. Using ensembles made to conform to legal rules, but without regard to race or partisan data, can provide a non-gerrymandered baseline. Unfortunately, the approach taken by Chen and Stephanopoulos does not conform to best practices in mathematical modeling.9
First, the authors’ ambitious scope leads them to take many shortcuts in methodology as they build their ensembles and their label of opportunity. They borrow tools from mathematical and statistical modeling (notably the randomized districting algorithm developed in the research group that one of us runs10) but do not provide a detailed description of their design choices; do not report any convergence metrics to confirm that their ensembles of districting plans are representative of any particular weighting of plans; and do not provide any control of errors that propagate through their workflow, especially through their idiosyncratic use of ecological inference.
There are quite a few junctures where their modeling decisions should be flagged. For example, the nineteen states under consideration all have different statutory and constitutional rules for redistricting. Therefore, a one-size-fits-all modeling approach cannot come close to the mark of capturing legal nuance. This is not simply a question of whether to take each rule or principle into account, but of how to operationalize that priority. For example, the legal language around county preservation is markedly different across these states: Texas mentions county preservation,11 North Carolina12 and Ohio13 have extremely specific language about how to measure it, and Delaware14 and Illinois15 do not have any county preservation rules at all. Nevertheless, the same kind of (very strong) county filter is applied by Chen and Stephanopoulos in generating districts in all states. The details, impacts, and alternatives are left completely undiscussed, even though the particular filter they use sacrifices the properties needed for representative sampling. Perhaps more fundamentally, the authors rely on a single presidential election (Obama versus Romney in 2012) to infer voter preferences, immediately decoupling their findings from VRA practice, where attorneys would never claim to identify minority opportunity based on Obama’s reelection numbers alone. Beyond this, the authors consider only a single plausible definition of opportunity district; they do not compare their “opportunity” label against the ground truth of recent district performance; and they provide no significant robustness checks at any step in their modeling. Because the authors package their series of complex and computationally intensive functions into a single statistic (the median number of opportunity districts) with very little discussion about their modeling choices, readers may not appreciate the extent to which many of the ingredients are arbitrary, approximate, or numerically unstable.
We unpack some of the workflow complexity in Table 2. Do these many choices have effects that cancel out in the end somehow, leaving the finding of over- or underrepresentation intact even if the numbers shift? Do their design choices systematically bias estimates upwards or downwards relative to what would be possible if more elections were taken into account or state laws were handled differently? Chen and Stephanopoulos, when they do address these questions, do so glibly.16
Second, the authors misuse the ensembles that they do generate. Ensembles are not suited to identifying a single ideal value of a score, as Chen and Stephanopoulos implicitly do by assigning a designation of under- or overrepresentation based on the median value alone.17 Rather, ensembles are a powerful tool for understanding baseline ranges for valid districting plans and are useful for clarifying decisionmaking tradeoffs. As the Supreme Court held in 1994, “no single statistic provides courts with a shortcut to determine whether a set of single-member districts unlawfully dilutes minority strength.”18 The single statistic presented by Chen and Stephanopoulos is no exception.
One of the challenges of introducing novel technical methods in a law review is that the blueprints that are especially important for validation—the details of algorithm design, the magnitude of uncertainty, convergence metrics, alternative specifications, and other robustness checks—are not likely to draw needed scrutiny from law review editors or indeed to hold the attention of most readers. The temptation is thus to gloss over or omit these technical details altogether, even in an eighty-six-page article and its fifty-three-page appendix. But transparency is all the more important for a project that has not been subject to rigorous peer review. This worry about law review publication is not new. Nearly twenty years ago, Lee Epstein and Gary King wrote an important piece in which they reviewed the legal literature and sounded the alarm that “the current state of empirical legal scholarship is deeply flawed.”19 The lack of attention to sound methodology, they warned, would lead readers to “learn considerably less accurate information about the empirical world than the studies’ stridently stated, but overly confident, conclusions suggest.”20
This is exactly what generates our grave concerns about the current Article and its placement in a flagship law review. Chen and Stephanopoulos’s style of leveraging technical tools while ignoring the scientific standards surrounding their development and deployment risks creating an unnecessarily muddy legal terrain. And the stakes are high: they have provided a recipe that may well devastate electoral opportunity for minority groups just as public opinion and voting behavior are pushing the other way.
In sum, we find that The Race-Blind Future of Voting Rights is a provocative proof of concept that stands on a shaky empirical foundation. The Article uses the promising ensemble method of random district generation to deliver a baseline for minority electoral opportunity; this Response both flags technical issues and questions the conceptual alignment of the methods with their application to voting rights law.
See, e.g., Bernard Grofman, For Single-Member Districts, Random is Not Equal, in Representation and Redistricting Issues 55, 55-58 (Bernard Grofman, Arend Lijphart, Robert McKay & Howard Scarrow eds., 1982). Jowei Chen also coauthored a ground-breaking study of the interplay of geography and this well-known majority seat bonus. See Jowei Chen & Jonathan Rodden, Unintentional Gerrymandering: Political Geography and Electoral Bias in Legislatures, 8 Q.J. Pol. Sci. 239 (2013).
See, e.g., Johnson v. De Grandy, 512 U.S. 997, 1000 (1994); see also Ellen D. Katz, Margaret Aisenbrey, Anna Baldwin, Emma Cheuse & Anna Weisbrodt, Documenting Discrimination in Voting: Judicial Findings Under Section 2 of the Voting Rights Act Since 1982, 39 U. Mich. J.L. Reform 643, 654-60 (2006) (documenting and analyzing section 2 decisions).
See, e.g., Nicholas O. Stephanopoulos, The Relegation of Polarization, 83 U. Chi. L. Rev. Online 160, 168 (2017) (explaining that a “more accurate statement of the [dominant theory of vote dilution] is that minority voters should be able to elect their preferred candidates to the extent permitted by their geographic distribution up to a ceiling of proportionality”).
One attempt to model the Voting Rights Act (VRA) compliance in a Markov chain can be found in a collaborative effort by data scientists and a voting-rights attorney, see Amariah Becker, Moon Duchin, Dara Gold & Sam Hirsch, Computational Redistricting and the Voting Rights Act, Metric Geometry & Gerrymandering Group (2020), https://mggg.org/VRA [https://perma.cc/8WJ4-KRPD].
This Markov chain algorithm, called recombination or ReCom, is discussed at greater length infra Part III.2 and Appendix A.1. For a detailed discussion of ReCom, see Daryl DeFord, Moon Duchin & Justin Solomon, Recombination: A Family of Markov Chains for Redistricting, Metric Geometry & Gerrymandering Group (Mar. 27, 2020), https://mggg.org/ReCom.pdf [https://perma.cc/6N3Z-B5G7].
Stephenson v. Bartlett, 582 S.E.2d 247, 250 (N.C. 2003) (interpreting Article 2 of the state constitution that “no county shall be divided” to permit county splits for VRA compliance or when necessary to comply with the one-person-one-vote standard so long as county groupings are minimized and resulting districts fall within five percent of population equality).
Ohio Const. art. 19, § 2(B)(5) (“Of the eighty-eight counties in this state, sixty-five counties shall be contained entirely within a district, eighteen counties may be split not more than once, and five counties may be split not more than twice.”); id. § 2(B)(7) (“No two congressional districts shall share portions of the territory of more than one county, except for a county whose population exceeds four hundred thousand.”); id. § 2(B)(8) (“The authority drawing the districts shall attempt to include at least one whole county in each congressional district. This division does not apply to a congressional district that is contained entirely within one county or that cannot be drawn in that manner while complying with federal law.”).
Del. Code tit. 29, § 804 (“In determining the boundaries of the several representative and senatorial districts within the State, the General Assembly shall use the following criteria. Each district shall, insofar as is possible: (1) Be formed of contiguous territory; (2) Be nearly equal in population; (3) Be bounded by major roads, streams or other natural boundaries; and (4) Not be created so as to unduly favor any person or political party.”).
It is of course insufficient to assert that design choices are applied for measuring opportunity in both the enacted plan and its comparator maps, as the authors do. Chen & Stephanopoulos, supra note 5, at 901 n.174 (“Any idiosyncrasies in our particular ecological inference run are reflected in the numbers of opportunity districts we report for both the enacted plans and the simulated [sic] maps.”). We demonstrate this inadequacy in infra Figure 5, where we show that instability may affect the measurement of the enacted plan while leaving the ensemble unchanged.
For a discussion of their reliance on the median, see infra Section II.2. It was sleight of hand of just this kind—treating a single number based on piles of political modeling choices as an authoritative indicator—that earned the memorable label of sociological gobbledygook. Transcript of Oral Argument at 40, Gill v. Whitford, 138 S. Ct. 1916 (No. 16-1161) (“CHIEF JUSTICE ROBERTS: . . . the whole point is you’re taking these issues away from democracy and you’re throwing them into the courts pursuant to, and it may be simply my educational background, but I can only describe as sociological gobbledygook.”).
Some examples of more subtle forms of racial discrimination in voting include moving from single-member districts to at-large voting or vice versa, changing elected positions to appointed positions, prohibiting “bullet voting,” and vote dilution via cracking and packing when redistricting. See, e.g., Presley v. Etowah Cty. Comm’n, 502 U.S. 491 (1992); Allen v. State Bd. of Elections, 393 U.S. 544 (1969).
The political-science literature on this topic, where this effect goes by the name of a “winner’s bonus” or “seat bonus” for the majority, is too large to survey here. For just one important example, see Pippa Norris, Choosing Electoral Systems: Proportional, Majoritarian and Mixed Systems, 18 Int’l Pol. Sci. Rev. 297 (1997). For a few other key themes in the literature, see Moon Duchin, Taissa Gladkova, Eugene Henninger-Voss, Ben Klingensmith, Heather Newman & Hannah Wheelen, Locating the Representational Baseline: Republicans in Massachusetts, 18 Election L.J. 388 (2019).
See, e.g., Korematsu v. United States, 323 U.S. 214, 216 (1944) (“It should be noted, to begin with, that all legal restrictions which curtail the civil rights of a single racial group are immediately suspect. That is not to say that all such restrictions are unconstitutional. It is to say that courts must subject them to the most rigid scrutiny.”); United States v. Carolene Prods., 304 U.S. 144, 152 n.4 (1938) (calling for a “more searching judicial inquiry” in cases where the ordinary political process fails to address prejudice against “discrete and insular minorities”).
Id. at 919 (referring to the proportionality baseline as “an upper limit to how much representation minority groups can legally claim”); see also Katz et al., supra note 3; Stephanopoulos, supra note 4, at 168 (explaining that a “more accurate statement of the theory [of rough proportionality] is that minority voters should be able to elect their preferred candidates to the extent permitted by their geographic distribution up to a ceiling of proportionality”).
See, e.g., Justin Levitt, Quick and Dirty: The New Misreading of the Voting Rights Act, 43 Fla. St. U. L. Rev. 573, 578 (2016) (“Proper focus on local nuance and meaningful political power—as precedent demands—can restore the Voting Rights Act to a vehicle for fighting both racial discrimination and racial essentialism.”).
See, e.g., Holder v. Hall, 512 U.S. 874, 928 (1994) (Thomas, J., concurring in the judgment) (referring to proportional representation as “the most logical ratio for assessing a claim of vote dilution” and noting that other standards would have “less intuitive appeal”); Thornburg v. Gingles, 478 U.S. 30, 84 (1986) (O’Connor, J., concurring) (“[A]ny theory of vote dilution must necessarily rely to some extent on a measure of minority voting strength that makes some reference to the proportion between the minority group and the electorate at large.”).
See Chen & Stephanopoulos, supra note 5, at 872 (citing to Justice Thomas’s concurring opinion in Holder v. Hall, 512 U.S. 874 (1994), and the majority opinion in Shaw v. Reno, 509 U.S. 630, 657 (1993), which referred to remedial racial districting as “political apartheid” that may “balkanize us into competing racial factions”).
For other examples of VRA theater, in which redistricting actors proclaim one set of data-driven aims while targeting another set of political and racial aims, see Levitt, supra note 39, at 605, which notes that “a state may have incorrectly attempted to comply with section 2 and yet still have drawn lines that provide an equal opportunity for minority voters to elect candidates of choice”; and Shelby County v. Holder, 679 F.3d 848, 885 (D.C. Cir. 2012) (Williams, J., dissenting), which criticizes the reverse-engineered coverage formula in § 4(b) of the VRA by noting that “sometimes a skilled dart-thrower can hit the bull’s eye throwing a dart backwards over his shoulder . . . . Congress hasn’t proven so adept.”
Chen & Stephanopoulos, supra note 5, at 877; see Lani Guinier, The Tyranny of the Majority 121 (1994) (“It’s districting in general—not race-conscious districting in particular—that is the problem.”). Guinier and others have looked to alternative voting systems precisely for their promise in this regard, and ranked-choice voting in particular is currently seeing a surge of interest, from Maine to Alaska. Gerdus Benade, Ruth Buck, Moon Duchin, Dara Gold & Thomas Weighill, Ranked Choice Voting and Minority Representation, Metric Geometry & Gerrymandering Group, https://mggg.org/RCV [https://perma.cc/9995-RN7X].
See Heather K. Gerken, Understanding the Right to an Undiluted Vote, 114 Harv. L. Rev. 1663, 1723 (2001) (“The right to an undiluted vote does not fit easily into either a group rights or an individual rights category. While it is certainly true that an individual’s right is linked to the status of the group, that is because the injury being asserted by an individual is the inability to aggregate her vote. The only way to measure that individual harm is to evaluate the position of other group members with whom she wishes to coalesce.”).
This “outlier analysis” has been the focus of recent litigation about partisan gerrymandering in state and federal courts. See, e.g., Rucho v. Common Cause, 139 S. Ct. 2484, 2517-18 (2019) (Kagan, J., dissenting) (“[T]he plaintiffs demonstrated the districting plan’s effects mostly by relying on what might be called the ‘extreme outlier approach.’”); League of Women Voters v. Pennsylvania, 178 A.3d 737, 828 (Pa. 2018) (Baer, J., concurring in part) (“[A] petitioner may establish that partisan considerations predominated in the drawing of the map by, inter alia, introducing expert analysis and testimony that the adopted map is a statistical outlier in contrast with other maps drawn utilizing traditional districting criteria.”).
Rucho v. Common Cause, 139 S. Ct. 2484, 2518 (2019) (Kagan, J., dissenting); see also Moon Duchin, How to Reason from the Universe of Maps (The Normative Logic of Map Sampling), Election L. Blog (July 5, 2019), https://electionlawblog.org/?p=106069 [https://perma.cc/2WQJ-3BMU].
This histogram shows the outcome of 100,000 simulation trials with a true fair coin, approximating a familiar bell curve. If we want to test four coins for fairness, suppose we flip each one 1,000 times. Coin 1 gives 504 heads; Coin 2 gives 508 heads; Coin 3 gives 473 heads; and Coin 4 gives 586 heads. What can we conclude?
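One way to make the comparison concrete is to ask, for each coin, how far its head count sits from the fair-coin expectation of 500 and how often a truly fair coin would stray at least that far. The sketch below is illustrative only. It uses an exact binomial tail in place of the simulated histogram, and the helper name `two_sided_tail` is an assumption of this example rather than anything drawn from the underlying analysis.

```python
from math import comb, sqrt

FLIPS = 1000
# Head counts for the four coins described in the caption.
observed = {"Coin 1": 504, "Coin 2": 508, "Coin 3": 473, "Coin 4": 586}

def two_sided_tail(heads, flips=FLIPS):
    """Probability that a fair coin lands at least as far from flips/2 as `heads`."""
    gap = abs(heads - flips / 2)
    favorable = sum(
        comb(flips, k) for k in range(flips + 1) if abs(k - flips / 2) >= gap
    )
    return favorable / 2 ** flips

mean = FLIPS / 2             # expected heads for a fair coin
std_dev = sqrt(FLIPS) / 2    # standard deviation of the head count

results = {}
for name, heads in observed.items():
    z = (heads - mean) / std_dev
    results[name] = (z, two_sided_tail(heads))
    print(f"{name}: {heads} heads, z = {z:+.2f}, tail probability = {results[name][1]:.2g}")
```

Under these assumptions, Coins 1 and 2 land well within the range that a fair coin routinely produces, Coin 3 sits toward the edge of that range, and Coin 4 lies more than five standard deviations out, an outcome a fair coin would almost never generate.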
Each column is a sample of outcomes from repeated trials with an actually fair coin (up to the limits of a computer’s ability to randomize). The more trials in our sample, the more predictable the results. (Note that if there is a tie for the most frequently observed value, the smallest of these values is reported as the mode.)
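The tie-breaking rule for the mode can be stated precisely in a few lines. The following is a minimal sketch, not the code behind the figure: the sample sizes, the random seed, and the function names `mode_smallest` and `simulate_head_counts` are all hypothetical choices made for illustration.

```python
import random
from collections import Counter

def mode_smallest(values):
    """Most frequent value; ties are broken by taking the smallest value."""
    counts = Counter(values)
    top = max(counts.values())
    return min(v for v, c in counts.items() if c == top)

def simulate_head_counts(trials, flips=1000, seed=0):
    """Head counts from `trials` repetitions of `flips` fair-coin tosses."""
    rng = random.Random(seed)
    # Each of the `flips` random bits is one toss; a 1 bit counts as heads.
    return [bin(rng.getrandbits(flips)).count("1") for _ in range(trials)]

# Hypothetical sample sizes: larger samples give a steadier mode.
for trials in (10, 100, 1000):
    sample = simulate_head_counts(trials)
    print(trials, "trials: mode =", mode_smallest(sample))
```

With only a handful of trials, many values tie at a single occurrence, so the reported mode is simply the smallest observed value; the rule matters less as the sample grows.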
All ensembles that we generate in this Response use the implementation of ReCom in the high-performance programming language Julia, which is publicly available at GerryChain, GitHub, https://github.com/mggg/GerryChainJulia [https://perma.cc/C82D-UNZF].
Chen & Stephanopoulos, supra note 5, at 890, n.145. The pressures of the authors’ one-size-fits-all modeling begin to show with these kinds of exceptions. The authors also hard-code various exceptional cases in their programs, for instance by manually loosening the intact-county threshold and the compactness threshold in some states.
The shaded range shows every seat outcome observed in the ensemble, regardless of its frequency, and whole numbers of districts are shown as small dots. As an example, of the two million maps made for Louisiana’s congressional delegation, just six districting plans included a majority-Black district. The remaining 1,999,994 plans had zero majority-minority districts. Despite this outcome’s extremely low frequency, Figure 2 includes this one seat. This is a good reminder that sub-sampling, or skipping over many plans to thin the ensemble, may not be the best practice for these ensemble applications, even though it is frequently used in other domains of applied statistics. If we sample only every 10,000th plan visited by the random walk, we may miss rare events entirely and subvert the exploration features of Markov chain sampling.
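A rough back-of-the-envelope calculation shows how easily thinning can erase a rare outcome like the Louisiana one. The sketch below rests on a simplifying assumption that is ours alone: the retained positions are treated as a random subset of the chain, unrelated to where the six rare plans occur. A real Markov chain visits similar plans in clusters, which this toy calculation does not model.

```python
TOTAL_PLANS = 2_000_000   # ensemble size quoted for Louisiana
RARE_PLANS = 6            # plans containing a majority-Black district
THIN_EVERY = 10_000       # keep one plan out of every 10,000

kept = TOTAL_PLANS // THIN_EVERY  # plans that survive thinning

# Chance that none of the rare plans lands on a kept position, treating the
# kept positions as a random subset (hypergeometric, without replacement).
p_miss_all = 1.0
for i in range(RARE_PLANS):
    p_miss_all *= (TOTAL_PLANS - kept - i) / (TOTAL_PLANS - i)

print(f"plans kept after thinning: {kept}")
print(f"chance the rare outcome vanishes: {p_miss_all:.4%}")
```

Under that assumption, the thinned ensemble keeps only 200 plans and loses the rare seat outcome about 99.9 percent of the time, so the observed range would almost always shrink.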
We also note that as the granularity of districting gets finer (more and smaller districts, like in state Houses), the range of seat-share outcomes observed in a neutral ensemble is reliably narrower, but the mean and median seat-share creep higher. Jonathan Rodden and Thomas Weighill have a similar finding that increased granularity results in lower variance in their study of scale effects in Pennsylvania districting. However, fascinatingly, they find that in the specific case of Pennsylvania and a partisan measure rather than the racial measure considered here, the ensemble average is stable at every scale—there is no “sweet spot” of district size for Democrats in Pennsylvania. Jonathan Rodden & Thomas Weighill, Political Geography and Representation: A Case Study of Districting in Pennsylvania, in Political Geometry (Moon Duchin & Olivia Walch eds., forthcoming 2021), https://mggg.org/gerrybook [https://perma.cc/LV7B-VCKN].
Figure 2 depicts shortfalls from proportionality, viewed with comparator ensembles of two million districting plans for Congress (top), state Senate (middle), and state House (bottom). Blue line: proportionality (BVAP share). Bracket: share of majority-Black districts (BVAP > 50%) in enacted plan. Colored dots and range: share of majority-Black districts in neutral ensemble plans, with large dot marking median. Note: Delaware has a single seat in the U.S. House of Representatives and is thus not included in the top panel. Arizona and Maryland employ multimember districts in their state lower House and are thus not included in the bottom panel.
Figure 3 depicts Congressional (top), state Senate (middle), and state House (bottom) districting and Black population. Blue line: proportionality (BVAP share). Bracket: share of districts with BVAP > 40% in enacted plan. Colored dots and range: share of districts with BVAP > 40% in neutral ensemble plans, with large dot marking median. Note: Delaware has a single seat in the U.S. House of Representatives and is thus not included in the top panel. Arizona and Maryland employ multimember districts in their state lower House and are thus not included in the bottom panel.
It is crucial to remember that if race is considered among proactive redistricting goals, it is easy to outperform a neutral algorithm, and indeed it is often easy to outperform the enacted plans. For an automated search technique for majority-minority districts, see Sarah Cannon, Ari Goldbloom-Helzner, Varun Gupta, JN Matthews & Bhushan Suwal, Voting Rights, Markov Chains, and Optimization by Short Bursts, arXiv (2020), https://arxiv.org/abs/2011.02288 [https://perma.cc/LL95-A7FU].
Justin Levitt particularly and sharply observes this. See Levitt, supra note 39, at 575-76 (“In some circumstances, the jurisdictions’ reliance on crude demographic targets over-concentrates real minority political power; in other circumstances, it under-concentrates real minority political power. In still other circumstances, the real political effects are unclear, because the lure of the demographic assumption means that nobody has bothered to examine the real political effects.” (internal citation omitted)).
See Jowei Chen, The Impact of Political Geography on Wisconsin Redistricting: An Analysis of Wisconsin’s Act 43 Assembly Districting Plan, 16 Election L.J. 443 (2017); Jowei Chen & David Cottrell, Evaluating Partisan Gains from Congressional Gerrymandering: Using Computer Simulations to Estimate the Effect of Gerrymandering in the U.S. House, 44 Electoral Stud. 329 (2016); Jowei Chen & Jonathan Rodden, Cutting Through the Thicket: Redistricting Simulations and the Detection of Partisan Gerrymanders, 14 Election L.J. 331 (2015).
By definition, a Markov chain is a random walk without memory, meaning that the position at time n+1 is governed by a probabilistic choice based only on the location at time n and not on the previous history. Many kinds of dynamical system have steady states; Markov chains are remarkable because, when designed carefully, there is a unique steady-state distribution for the system, and the random walk process beginning at any initial configuration will always converge to it. This means that the empirical distribution drawn from a large enough sample of observations will converge to the same long-term shape, no matter what the initial position. We discuss Markov chain theory in more detail infra Appendix A.1.
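The convergence property can be seen in a toy example. The three-state chain below is entirely hypothetical and has nothing to do with districting plans; its transition probabilities were chosen only so that the arithmetic is easy to follow. Each step uses nothing but the current distribution, which is the memoryless property, and two opposite starting points end up at the same long-run distribution.

```python
# Hypothetical three-state chain. Row i lists the probabilities of moving
# from state i to each state; the numbers are for illustration only.
TRANSITIONS = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

def step(dist, matrix=TRANSITIONS):
    """Advance a probability distribution by one step of the chain."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]

def run(dist, steps):
    """Apply `step` repeatedly; only the current distribution is ever used."""
    for _ in range(steps):
        dist = step(dist)
    return dist

from_first = run([1.0, 0.0, 0.0], 50)  # start with certainty in state 0
from_last = run([0.0, 0.0, 1.0], 50)   # start with certainty in state 2
print(from_first)
print(from_last)
```

Both runs settle on the distribution (0.25, 0.5, 0.25), which is the unique steady state of this particular chain. A chain whose state space has been disconnected loses this guarantee, because a walk that starts in one piece can never reach the other.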
Charles J. Geyer, Introduction to Markov Chain Monte Carlo, in Handbook of Markov Chain Monte Carlo (Steve Brooks, Andrew Gelman, Galin L. Jones & Xiao-Li Meng, eds. 2011), http://www.mcmchandbook.net/HandbookChapter1.pdf [https://perma.cc/S5QX-34UY].
For an overview of court approaches to compactness before and during the Shaw line of cases, see generally Richard H. Pildes & Richard G. Niemi, Expressive Harms, “Bizarre Districts,” and Voting Rights: Evaluating Election-District Appearances After Shaw v. Reno, 92 Mich. L. Rev. 483, 484 (1993), which notes that compactness violations are found “[w]hen physical geography is stretched too thin.” For a discussion of how ReCom compactness fits into the legal history, see Moon Duchin & Bridget Eileen Tenner, Discrete Geometry for Electoral Geography (Aug. 23, 2018) (unpublished manuscript), https://arxiv.org/pdf/1808.05860.pdf [https://perma.cc/G9XM-DCZJ].
Since 2018, Chen has incorporated Flip chains into his expert work, but only in a hill-climbing manner which is designed for optimization, not representative sampling. See, e.g., Rucho, 318 F. Supp. 3d; Whitford, 218 F. Supp. 3d. The work from the research teams of Duke’s Jonathan Mattingly and Harvard’s Kosuke Imai is particularly notable in targeting a prescribed distribution. For an extended discussion of challenges and sophisticated fixes for Flip chains, see Daryl DeFord & Moon Duchin, Random Walks and the Universe of Districting Plans, in Political Geometry (Moon Duchin & Olivia Walch eds., forthcoming 2021), https://mggg.org/gerrybook [https://perma.cc/LV7B-VCKN].
To be precise, the stationary probability of selecting a plan in the ReCom chain is approximately proportional to its spanning tree score, a measure of compactness that draws from clustering theory. See DeFord, Duchin & Solomon, supra note 10; Duchin & Tenner, supra note 86. A small adjustment to the Markov procedure makes the chain reversible and makes it target exactly the spanning tree distribution. See Sarah Cannon, Moon Duchin, Dana Randall & Parker Rule, A Reversible Recombination Chain for Graph Partitions, Metric Geometry & Gerrymandering Group (2020), https://mggg.org/ReCom [https://perma.cc/4WA4-DPMT]. The simplicity for the modeler and the speed of heuristic convergence recommend ReCom over Flip-based Markov chains. ReCom “is more computationally costly than Flip at each step in the Markov chain, but this tradeoff is net favorable thanks to superior convergence and distributional qualities.” This piece is not a suitable place for a full introduction to these methods, but we refer the reader to DeFord & Duchin, supra note 90.
Since noncompact plans are exponentially more numerous, the probability of selecting them approaches 100% as the problem size expands. And in addition to being undesirable for compactness reasons, sampling from a uniform distribution has been proven to be computationally intractable. Lorenzo Najt, Daryl DeFord, and Justin Solomon have shown that if you could create an algorithm that samples districting plans approximately uniformly, then you have solved a suite of problems long believed to be impossible. In particular, the solution would give you a way to crack internet encryption! Lorenzo Najt, Daryl DeFord & Justin Solomon, Complexity and Geometry of Sampling Connected Graph Partitions 1-2 (Aug. 23, 2019) (unpublished manuscript), https://arxiv.org/pdf/1908.08881.pdf [https://perma.cc/WZ8L-Y39M].
In particular, the authors clearly break the key property that sample statistics converge to the same target distribution regardless of initial position. Here, desirable properties like compactness and county integrity are playing the role of altitude in our exploration metaphor; their customizations to favor high altitude end up forbidding bridges altogether, and this literally disconnects the landscape we are trying to explore—agents can no longer explore effectively in any amount of time. For details, see infra Appendix A.1 and Figure A5.
Chen & Stephanopoulos, supra note 5, at 882. Also “a modified version of a MCMC redistricting algorithm that one of us has previously employed in expert testimony.” Id. at 891. The footnote in support of these claims says, of Chen’s prior methods: “Under this related approach, a recombination MCMC algorithm developed by one of us was used to create a single map that satisfied the specified parameters. This process was repeated hundreds or thousands of times to generate a large number of maps. In other words, the maps were the endpoints of hundreds or thousands of separate Markov chains, not way-points along a single, very long Markov chain.” There is simply no evidence of any setup that is capable of representative sampling in Chen’s earlier work. Since expert witnesses can certainly update their methods when better ones become available, there is no need for this flagrantly misleading description. See infra Appendix A.1 for more on Markov chain theory.
Figure 4 depicts the racial voting gap in nine elections over the 150 districts of the Texas House of Representatives. Black and Hispanic voters agree on the candidate of choice in all nine elections. Each histogram plots the estimated difference between Black+Hispanic support for the candidate of choice and White support using the “Preferred EI” method described in the text. General elections show massive polarization of about sixty percentage points, while Democratic primaries and runoffs show broad agreement between White and minority voters. In six of these elections (marked with *), the minority-preferred candidate is either Black or Hispanic.
To give their fuller description, “we define an opportunity district as one where (1) the minority-preferred candidate wins the general [Obama-Romney] election, and (2) minority voters who support the minority-preferred candidate outnumber white voters backing that candidate, provided that (3) minority voters of different racial groups are aggregated only if each group favors the same candidate.” Chen & Stephanopoulos, supra note 5, at 899. The data demonstrations in this Part are based on applying their definition in full, after correcting small coding errors. See infra Appendix A.2. In their data, the groups considered are Black, Hispanic, and Other, a category that includes everyone else—White, Asian, two or more races, and so on. We found that the effects of replacing this super-category with only non-Hispanic White voters amount to about a one-seat difference in the Texas House, but the effect size would surely be higher in other parts of the country. In particular, three of the states treated in the Article (CA, NV, NY) are at or near 10% Asian population share, and Asian voters are far more likely to align with Black and Hispanic than White voters. See Christopher S. Elmendorf & Douglas M. Spencer, Administering Section 2 of the VRA After Shelby County, 115 Colum. L. Rev. 2143, 2210-11 (2015).
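The three-part definition quoted above can be sketched in Python as follows. This is an illustrative reading only, not the authors' replication code: the data structure, the function names, the two-candidate simplification, and the treatment of disagreeing groups (each group is tested on its own) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class GroupVotes:
    """Hypothetical container: estimated votes cast by one group
    for each of two candidates, labeled "A" and "B"."""
    for_a: float
    for_b: float

    def preferred(self):
        return "A" if self.for_a > self.for_b else "B"

    def votes_for(self, candidate):
        return self.for_a if candidate == "A" else self.for_b


def is_opportunity_district(black, hispanic, white, winner):
    """Apply the three quoted conditions to one district.

    `winner` is the general-election winner ("A" or "B").
    Assumption: when the two minority groups disagree, each is
    evaluated separately rather than aggregated.
    """
    # (3) Aggregate minority groups only if they favor the same candidate.
    if black.preferred() == hispanic.preferred():
        coalitions = [(black.preferred(), [black, hispanic])]
    else:
        coalitions = [(black.preferred(), [black]),
                      (hispanic.preferred(), [hispanic])]

    for candidate, groups in coalitions:
        # (1) The minority-preferred candidate wins the general election.
        wins = winner == candidate
        # (2) Minority supporters of that candidate outnumber
        #     white voters backing the same candidate.
        minority_support = sum(g.votes_for(candidate) for g in groups)
        controls = minority_support > white.votes_for(candidate)
        if wins and controls:
            return True
    return False


# Hypothetical district: both minority groups prefer candidate A, who wins.
example = is_opportunity_district(
    black=GroupVotes(300, 20),
    hispanic=GroupVotes(400, 100),
    white=GroupVotes(500, 900),
    winner="A",
)
print(example)
```

Because condition (2) compares estimated vote counts by group, the outcome depends directly on how the ecological inference step produces those estimates.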
Jowei Chen, Data Files, Replication Code, and Simulated Districting Plans, U. Mich., http://www-personal.umich.edu/~jowei/race [https://perma.cc/MFN5-RERB].
Compare this to Figure 5, in which their style of ecological inference (EI) reports forty-four or forty-five minority opportunity districts (MODs). As compared to their Article, which reports forty-six MODs, our districting ensemble in Figure 5 was generated using Texas Legislative Council precinct units rather than the custom units from the replication materials, which may account for the discrepancy.
See Estimates of the Total Populations of Counties and Places in Texas for July 1, 2019 and January 1, 2020, Tex. Demographic Ctr. tbl.1 (2020), https://demographics.texas.gov/Resources/TPEPP/Estimates/2019/2019_txpopest_county.pdf [https://perma.cc/28XU-LD9H].
Figure 5 depicts a comparison for TX House (150 districts) based on applying various counting questions to a single precinct-level ensemble of two million plans. Radii of colored disks are proportional to the frequency with which an outcome was observed in the ensemble; bracket (X) marks the number of each kind of district in the currently enacted plan. The definition of MOD begins with an Obama win as in (1) and layers a group control requirement on top. (2)-(4) show that this group control condition depends heavily on the way EI is run. (5)-(7) compare starting points with broader electoral history as an alternative to relying on the vote pattern from a single election.
See Texas House of Representatives District 46, Ballotpedia, https://ballotpedia.org/Texas_House_of_Representatives_District_46 [https://perma.cc/YF4V-V32J].
In such an attack, an adversary uses a simple computational technique to reconstruct the Census person-by-person data file and then pairs it with commercially available data to match names, phone numbers, and addresses with all the information included on the Census form. See Disclosure Avoidance and the 2020 Census, U.S. Census Bureau, https://www.census.gov/about/policies/privacy/statistical_safeguards/disclosure-avoidance-2020-census.html [https://perma.cc/AGX2-88RA]; Michael Hawes, Differential Privacy and the 2020 Decennial Census, U.S. Census Bureau (Mar. 5, 2020), https://www2.census.gov/about/policies/2020-03-05-differential-privacy.pdf [https://perma.cc/CF97-RCND].
John Abowd, Robert Ashmead, Simson Garfinkel, Daniel Kifer, Philip Leclerc, Ashwin Machanavajjhala, Brett Moran, William Sexton & Pavel Zhuravlev, Census TopDown: Differentially Private Data, Incremental Schemas, and Consistency with Public Knowledge (Nov. 14, 2019) (unpublished manuscript), https://github.com/uscensusbureau/census2020-das-2010ddp/blob/master/doc/20191020_1843_Consistency_for_Large_Scale_Differentially_Private_Histograms.pdf [https://perma.cc/V888-VLSK].
Charles J. Geyer, Introduction to Markov Chain Monte Carlo, in Handbook of Markov Chain Monte Carlo (Steve Brooks, Andrew Gelman, Galin L. Jones & Xiao-Li Meng eds., 2011), http://www.mcmchandbook.net/HandbookChapter1.pdf [https://perma.cc/5N5X-ZRPL].
As to the rebranding of earlier methods as “a recombination MCMC algorithm,” every expert report and publication of Dr. Chen’s uses either a Petri dish method or, from 2018 onwards, a series of hill-climbing Flip runs. Indeed, the word “optimize” appears repeatedly; there are no occurrences of “recombination” or “spanning tree” and there is no discussion of convergence or the relative weighting of plans. See, e.g., Expert Report of Jowei Chen, Common Cause v. Rucho, 279 F. Supp. 3d 587 (M.D.N.C.) (No. 1:16-CV-1026), vacated, 138 S. Ct. 2679 (2018); Expert Report of Jowei Chen, Ph.D., Whitford v. Gill, No. 15-cv-421-jdp (W.D. Wis. Oct. 15, 2018); sources cited supra note 82.
Code snippet to identify “minority opportunity districts,” separated and commented for readability. Black and Hispanic opportunity districts are defined for object “ag” (an aggregate of precinct-level vote estimates) by multivariate compound statements; in the circled expressions, DEMS should be replaced by REPS to match the description in the Article.
The estimate of Black voters’ support for Obama in each precinct as we toggle only the statewide/by-county setting for EI. The scatterplot shows the 9,082 precincts of Texas. Statewide EI, as in the “Preferred” style, reports high levels of Obama support; running the same script looped over the individual counties, as in the Article, reports a large share of precincts where Black voters are roughly evenly split between Obama and Romney, which seems fairly implausible.
Stephen Ansolabehere & Brian Schaffner, CCES Common Content, 2012, Harv. Dataverse, https://doi.org/10.7910/DVN/HQEVPK [https://perma.cc/KGR5-29U2] (vote validated dataset, variables = race, hispanic, inputstate and CC410a). Estimated support is 94.7%, with a 95% confidence interval of [90.8, 97.3] based on an exact binomial test.
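For readers unfamiliar with exact binomial intervals of the kind reported above, the following is a minimal Python sketch of the standard Clopper-Pearson construction. The counts used in the example (90 supporters among 100 respondents) are hypothetical; the underlying survey sample size is not stated here, so this sketch does not reproduce the reported interval.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p). Suitable for modest n only."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, iters=60):
    """Bisection on [0, 1] for an increasing g with g(0) < 0 < g(1)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval for a binomial
    proportion, given k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    if k == 0:
        lower = 0.0
    else:
        lower = _solve_increasing(
            lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
        )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    if k == n:
        upper = 1.0
    else:
        upper = _solve_increasing(lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


# Hypothetical counts, for illustration only.
low, high = clopper_pearson(90, 100)
print(f"point estimate 0.900, 95% interval [{low:.3f}, {high:.3f}]")
```

The interval is asymmetric around the point estimate when the proportion is near 0 or 1, which matches the shape of the interval reported in the note.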
There are many ways of handling a county criterion. Each column shows an ensemble of 100,000 maps, counting the number of House districts with more Obama than Romney votes. Given this variability, it would be reasonable to declare that 51-57 Obama districts, or even 50-58, is the normal range. It is far less reasonable to declare 54 as the median, and to declare a plan with 55 such seats to “overrepresent” minorities, which would be the conclusion from the method in the Article.
Multiple seed plans are available in our replication materials. Models-Race-Law, GitHub, https://github.com/mggg/models-race-law [https://perma.cc/KR3G-86Y3].
Cal. Const. art. 21, § 2(d)(4) (“The geographic integrity of any city, county, city and county, local neighborhood, or local community of interest shall be respected in a manner that minimizes their division to the extent possible without violating the requirements of any of the preceding subdivisions.”).
2011 Redistricting Guidelines, S.C. Senate (Apr. 13, 2011), redistricting.scsenate.gov/Documents/RedistrictingGuidelinesAdopted041311.pdf [https://perma.cc/JRV7-CHTJ]; 2011 Guidelines and Criteria for Congressional and Legislative Redistricting, S.C. House Judiciary Comm., redistricting.schouse.gov/6334-1500-2011-Redistricting-Guidelines-(A0404871).pdf [https://perma.cc/FAC5-NY72].