In "Once a data loss report, always a data loss report?" Dissent asks about what we should be collecting and analyzing.
Scenario 1: “We thought we had lost a computer with sensitive customer records, but it turns out we didn’t lose it.”
Should that entry in a breach list be removed? I think the answer depends, in part, on the stated inclusion criteria for the list, on its stated or anticipated purpose and intended usage, and on whether the list compiler has been given a statement by the agency or business supporting the claim of no loss.
If the inclusion criteria are worded to include only agencies or businesses whose records were actually compromised or might have been accessed, then one might see some merit in an argument to remove the entry in our hypothetical case. Common sense would dictate that if I say “I lost my wallet!” but then find it an hour later in another room in my house or under a pile of papers on my desk, it wasn’t really “lost” and no harm, no foul, right?
But what if one of the purposes of the list is to enable tracking and analysis of costs associated with notifications and our hypothetical company had already made a notification before discovering the hardware on their premises?
I just want to chime in and say that errors and recoveries are fascinating numbers to learn about as well. How many lost tapes are recovered in a week or a month? Is anyone wrapping their backups in tamper-evident tape? (What would that even do to a drive's read mechanism?) Laptops are clearly not tamper evident, and the Dataloss forensics page explains how a thief could silently pull data from a machine using well-known techniques.
Also, if some fraction of reports is erroneous, what is the source of those errors? That seems like a useful question to ask, and we can't answer it if databases are redacted. Finally, scientific reproducibility, that is, the ability of a researcher to examine the data and check whether the same results emerge, requires that the data be made available.