I would guess that most of us have seen the movie “I, Robot”, where the plastic guy develops a little further than the designers intended and starts to evolve and learn from its mistakes. Now Symantec has an e-Discovery tool that learns from the patterns of discovery in each case and applies them to the ESI (Electronically Stored Information) to extract the relevant information.
The Symantec Clearwell system uses a technology called “Transparent Predictive Coding”, which relies on machine learning to train a computer to “predict” how documents should be classified. It works by taking input (or “training”) from human reviewers, who teach the computer to classify documents as either “responsive” or “nonresponsive” to a particular legal matter.
The technology is exciting for organizations attempting to manage skyrocketing legal costs, because the ability to expedite the document review process and find key documents faster has the potential to save thousands of hours. In a profession where the cost of reviewing a single gigabyte of data has been estimated to be around $18,000 (R163 000), narrowing days, weeks, or even months of tedious document review into more reasonable time frames means massive savings for thousands of organizations struggling to keep litigation expenditures in check.
How does this Transparent Predictive Coding work?
Let’s start with a manual review of electronically stored information. You would sift through electronic systems, check for documents that are relevant to the case, and then somehow tag those information sources.
Predictive coding technology relies on humans to review only a small fraction of the overall document population, which ultimately results in a fraction of the review costs. The process entails feeding decisions about how to classify a small number of case documents, called a training set, into a computer system.
The computer then relies on the human training decisions to generate a model that is used to predict how the remaining documents should be classified. The information generated by the model can be used to rank, analyze, and review the documents quickly and efficiently.
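The train-then-predict flow described above can be sketched in a few lines of Python. This is only an illustration: the article does not say which algorithm Clearwell uses, so a small naive Bayes word-count classifier stands in for it, and the sample documents, the labels and the function names (train, prediction_score) are all hypothetical.

```python
import math
from collections import Counter

LABELS = ("responsive", "nonresponsive")

def tokenize(text):
    # Very crude tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def train(training_set):
    """Build a model from (text, label) pairs coded by human reviewers."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in training_set:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set().union(*(word_counts[label] for label in LABELS))
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}

def prediction_score(model, text):
    """Estimate the probability (0.0 to 1.0) that a document is responsive."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    log_probs = {}
    for label in LABELS:
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        log_prob = math.log(model["doc_counts"][label] / total_docs)
        for word in tokenize(text):
            if word in model["vocab"]:  # ignore words never seen in training
                # Add-one smoothing so unseen word/label pairs are not zero.
                log_prob += math.log((counts[word] + 1) / (total_words + vocab_size))
        log_probs[label] = log_prob
    # Convert the two log scores into a normalized probability.
    highest = max(log_probs.values())
    weights = {label: math.exp(lp - highest) for label, lp in log_probs.items()}
    return weights["responsive"] / sum(weights.values())

# Hypothetical training set: a handful of documents coded by reviewers.
training_set = [
    ("merger pricing agreement with supplier", "responsive"),
    ("confidential pricing terms for the merger", "responsive"),
    ("lunch menu for friday", "nonresponsive"),
    ("office party on friday afternoon", "nonresponsive"),
]
model = train(training_set)

# Hypothetical unreviewed documents, ranked by predicted responsiveness.
unreviewed = ["friday lunch plans", "revised pricing agreement"]
ranked = sorted(unreviewed, key=lambda doc: prediction_score(model, doc), reverse=True)
for doc in ranked:
    print(f"{prediction_score(model, doc):.0%}  {doc}")
```

A real system would use far larger training sets and richer document features, but the shape is the same: human decisions go in, a model comes out, and the model scores everything the humans have not read.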
Although documents can be coded with multiple designations that relate to various issues in the case during eDiscovery, predictive coding technology is often used simply to segregate responsive and privileged documents from nonresponsive documents in order to expedite and simplify the document review process.
Training the predictive coding system is an iterative process that requires attorneys and their legal teams to evaluate the accuracy of the computer’s document prediction scores at each stage. A prediction score is simply a percentage value assigned to each document that is used to rank all the documents by degree of responsiveness.
If the accuracy of the computer-generated predictions is insufficient, additional training documents can be selected and reviewed to help improve the system’s performance. Multiple training sets are commonly reviewed and coded until the desired performance levels are achieved. At that point, informed decisions can be made about which documents to produce.
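One way to picture the accuracy check at each stage is to compare the computer’s prediction scores against a sample that humans have already coded. The sketch below does that with precision and recall. The article does not name the metrics Clearwell reports, so the choice of metrics, the 0.5 cutoff, the 0.8 recall target and the sample scores are all assumptions made for illustration.

```python
def precision_recall(scores, human_labels, cutoff=0.5):
    """Compare prediction scores with human coding decisions.

    scores: prediction scores between 0.0 and 1.0
    human_labels: True where the reviewer coded the document responsive
    """
    true_pos = false_pos = false_neg = 0
    for score, is_responsive in zip(scores, human_labels):
        predicted_responsive = score >= cutoff
        if predicted_responsive and is_responsive:
            true_pos += 1
        elif predicted_responsive and not is_responsive:
            false_pos += 1
        elif not predicted_responsive and is_responsive:
            false_neg += 1
    predicted_total = true_pos + false_pos
    actual_total = true_pos + false_neg
    precision = true_pos / predicted_total if predicted_total else 0.0
    recall = true_pos / actual_total if actual_total else 0.0
    return precision, recall

# Hypothetical sample of six documents that humans have also reviewed.
scores = [0.92, 0.81, 0.65, 0.40, 0.30, 0.10]
human_labels = [True, True, False, True, False, False]

precision, recall = precision_recall(scores, human_labels)

# Assumed target: keep training until at least 80% of the responsive
# documents in the sample are found.
TARGET_RECALL = 0.8
needs_more_training = recall < TARGET_RECALL
print(f"precision={precision:.2f} recall={recall:.2f} "
      f"more training needed: {needs_more_training}")
```

In this made-up sample the system finds two of the three responsive documents, which falls short of the assumed target, so another training set would be selected and reviewed.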
For example, if the legal team’s analysis of the computer’s predictions reveals that within a population of 1 million documents, only the 300,000 documents with prediction scores of 70 percent and higher appear to be responsive, the team may elect to produce only those documents to the requesting party. The financial consequences of this approach are significant because a majority of the documents can be excluded from expensive manual review by humans. The simple rule of thumb in eDiscovery is that the fewer documents requiring human review, the more money saved, since document review is typically the most expensive facet of eDiscovery.
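The arithmetic behind that example is simple enough to check directly. The sketch below uses only the two figures from the example (1 million documents, 300,000 at or above the cutoff); the function name review_reduction is hypothetical, and no cost per document is assumed.

```python
def review_reduction(total_documents, documents_to_produce):
    """Return how many documents skip manual review, and their share of the total."""
    excluded = total_documents - documents_to_produce
    return excluded, excluded / total_documents

# Figures from the example: 1 million documents, of which 300,000 have
# prediction scores of 70 percent and higher.
excluded, share_excluded = review_reduction(1_000_000, 300_000)
print(f"{excluded:,} documents ({share_excluded:.0%}) excluded from manual review")
```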
In a recent interview, Alison Walton, Symantec’s eDiscovery lawyer, said that due to the nature of our legal system, processes and reasons for litigation, South Africa is way behind other countries of the world. She added, however, that the increase in fraud and corruption in the country will drive products like Clearwell into corporations that want to eradicate such problems. This will then spill over to the law firms, which will integrate and use these systems to assist in their cases.
The global village of international trading will also assist in spreading the use of e-Discovery systems in South Africa, where adoption is rather poor compared to the rest of the world. (Editor’s comment) It does seem that South Africa is way behind, but the growth in e-Discovery, both for management and archiving in corporates and for discovery in law firms, is going to be huge in the next 3 to 5 years.
“Symantec is no stranger to South Africa,” says Mark Smissen, Business Development Manager for Southern Africa, who was interviewed with Walton. “Our Enterprise Vault.cloud has a number of clients on the service. Our next step with the Clearwell e-Discovery product is to spread the word to existing customers and to introduce this solution to industry leaders. It’s time to release this best-kept secret to the South African market.”