As courts, libraries, and archives move to make court records available online, the increased ease of public access raises concerns about privacy. Little work has been done, however, to study how often sensitive information appears in court records and the context in which it appears. This Article fills this gap by analyzing a large corpus of briefs and appendices submitted to the North Carolina Supreme Court from 1984 to 2000. Based on a survey of privacy laws and privacy scholarship, we created a taxonomy of 140 types of sensitive information, grouped into thirteen categories. We then coded a stratified random sample of 504 court filings in order to determine the frequency of appearance of each sensitive information type and to identify relationships, patterns, and correlations between information types and various case and document characteristics.
We present several important findings. First, although a wide variety of sensitive information appears in the court records we sampled, it is not uniformly distributed throughout the records. Most of the documents contained relatively few instances of sensitive information, while a handful of documents contained a large number. Second, court records vary substantially in the types and frequency of sensitive information they contain. Sensitive information in seven of the categories (“Location,” “Identity,” “Criminal Proceedings,” “Health,” “Assets,” “Financial Information,” and “Civil Proceedings”) appeared much more frequently than information in the other categories in our taxonomy. Third, information associated with criminal proceedings, such as witness and crime victim names, is pervasive in court records, appearing in all types of cases and records. Fourth, criminal cases have disproportionately more sensitive information than civil or juvenile cases, with death penalty cases far exceeding all other case types. Fifth, appendices are generally not quantitatively different from legal briefs in terms of the frequency and types of sensitive information they contain, a finding that goes against the intuition of many privacy advocates. Sixth, there were no overarching trends in the frequency of sensitive information during the seventeen-year period we studied.
Although we found a substantial amount of sensitive information in the court records we studied, we do not take a position regarding what information, if any, courts or archivists should redact or what documents should be withheld from online access or otherwise managed for privacy protection. These largely normative questions must be answered based on a careful balancing of the competing public access and privacy interests. Nevertheless, we expect that this highly granular view of the occurrence of sensitive information in these North Carolina Supreme Court records will help policymakers and judges evaluate the potential harms to privacy interests that might arise from online access to court records. We also hope that scholars will draw on our taxonomy and empirical data to develop and ground normative arguments about the proper approach for balancing government transparency and personal privacy.
David S. Ardia and Anne Klinefelter, Privacy and Court Records: An Empirical Study, 30 Berkeley Tech. L.J. 1807