Daniel P. Lopresti

“Quantifying Information Leakage in Document Redaction” (with A. L. Spitz), Proceedings of the First ACM Workshop on Hardcopy Document Processing (in association with the Thirteenth Conference on Information and Knowledge Management), November 2004, Washington, DC, pp. 63-69.

In this paper, we examine ways in which sensitive information might leak through the process of redaction. Attacks that exploit such leaks apply known methods from document image analysis and natural language processing to recover text thought to have been obliterated for the purposes of public release. Systematically identifying and testing these weaknesses is a first step towards designing effective countermeasures. We describe our development of a prototype semi-automated system that accepts a redacted document as input and gives the user feedback on whether the document might suffer from such leaks.

Paper (PDF, 525 kbytes)


