Hillary's Email: How to deal with deleting selected items in a large amount of email.
People have been talking about the 30,000 or so emails deleted from Hillary Clinton's private server. Some are saying that every item should have been read by someone before a decision was made. Well, if you think about the time required to read and analyze that many emails, it's easy to understand why that wouldn't work very well at all.
In the first place, the reviewers would be people who had no role in writing or receiving the emails, and it would take a number of them to finish the review in a timely way. It would take the originator far too long to do it alone. So, how would they decide which emails were related to the Secretary of State's (SOS) office and which were private?
That's simple: The reviewers would be given a long list of what to look for, including senders and recipients, terms that might indicate a SOS-related subject and other information that would need to be watched for. Quite a long list, really. Then, as they looked at the emails, they'd have to check to see if the senders, recipients or subject matter were in that list. If so, then the decision to delete or preserve could be made.
Not simple? You're right. Very difficult, actually, if people had to do that review. Now, imagine this: Instead of handing that list to people, you set up a computerized search through that very large mass of email, looking for matches. The computer can easily compare every part of each email, such as the sender, recipients, subject and text, to a list of any length. If a match indicates that the email is SOS-related, it is preserved. If no match occurs, then it can safely be deleted.
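To make that concrete, here is a minimal sketch in Python of the kind of list-matching search I'm describing. It is only an illustration, not the process anyone actually used: the `Email` class, the function names, and the idea of plain substring matching on terms are all my own assumptions, and a real review would need much longer lists and more careful matching.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Email:
    # Hypothetical, simplified email record used only for this sketch.
    sender: str
    recipients: List[str]
    subject: str
    body: str


def is_work_related(email: Email,
                    work_addresses: Iterable[str],
                    work_terms: Iterable[str]) -> bool:
    """Return True if any sender/recipient or any term matches the lists."""
    addresses = {a.lower() for a in work_addresses}
    people = {email.sender.lower()} | {r.lower() for r in email.recipients}
    # Rule 1: a listed sender or recipient marks the email as work-related.
    if people & addresses:
        return True
    # Rule 2: a listed term anywhere in the subject or body also counts.
    text = f"{email.subject} {email.body}".lower()
    return any(term.lower() in text for term in work_terms)


def sort_emails(emails: Iterable[Email],
                work_addresses: Iterable[str],
                work_terms: Iterable[str]) -> Tuple[List[Email], List[Email]]:
    """Split emails into (preserve, private) using the two lists."""
    addresses = list(work_addresses)
    terms = list(work_terms)
    preserve: List[Email] = []
    private: List[Email] = []
    for email in emails:
        if is_work_related(email, addresses, terms):
            preserve.append(email)
        else:
            private.append(email)
    return preserve, private
```

You would call `sort_emails(all_emails, address_list, term_list)` once, and it would check every email against both lists in the same way, no matter how long the lists get.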
It's the same process, whether people or a computer does the comparison. The difference is really the time required, accuracy, completeness and objectivity. People are slow at such things. In the first place, they can't remember the entire list. In the second, they might not be entirely objective in making decisions. Finally, people miss things in doing such comparisons. The computer, on the other hand, does such comparisons quickly, completely, and without any human failings. Of course, the quality of the review depends on the data being compared and the skill of the programmer who designed the algorithm. But the same thing is true if people do the comparison.
Here's another thing to consider: Most of us have email client applications that check incoming emails and move some into a spam folder. This uses the same sort of data analysis. I use Yahoo mail. It has been a very long time since I found an email in my spam folder that did not belong there. Months, at least. I review the spam folder every day before permanently deleting everything in it. If there's any failing in that algorithm, it is in sending spam sometimes to my Inbox. It fails safe. If it doesn't know, it errs on the side of showing me the email. I add to its comparison data by marking such mail as spam after my own review.
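The "fails safe" idea can be sketched in a few lines of Python. This is a toy example under my own assumptions, not how Yahoo Mail or any real spam filter works (real ones use far more than a sender list). The class name `SimpleSpamFilter` and its methods are hypothetical. The point is only that anything the filter doesn't recognize goes to the Inbox, and that marking a message as spam adds to the comparison data.

```python
class SimpleSpamFilter:
    """Toy filter: known spam senders go to spam, everything else to the inbox."""

    def __init__(self, known_spam_senders=()):
        # Comparison data: a set of lowercase sender addresses.
        self.known_spam_senders = {s.lower() for s in known_spam_senders}

    def classify(self, sender: str) -> str:
        # Fail safe: only a positive match is treated as spam.
        # An unknown sender is shown to the user in the inbox.
        if sender.lower() in self.known_spam_senders:
            return "spam"
        return "inbox"

    def mark_as_spam(self, sender: str) -> None:
        # User feedback after a manual review grows the comparison data.
        self.known_spam_senders.add(sender.lower())
```

After `mark_as_spam("someone@junk.example")`, later mail from that address would be classified as spam, while mail from any address the filter has never seen still lands in the Inbox.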
Computers do a much better job at this type of comparison, and that's how Hillary Clinton's emails were reviewed. Data analysis and comparison are the sort of jobs that are ideal for computers. Humans do a crappy job of that sort of thing.