Having immersed themselves in an array of deep dive demonstrations of new Generative AI (GenAI) products for eDiscovery displayed during Legal Technology events in London and New York, Chantelle Jalland and Marybeth Kings reflect on the latest technology developments in eDiscovery. What is clear is that all leading document review platform providers have developed, or are in the process of developing, GenAI extensions to enhance their core product.
Each software developer has approached the opportunity of incorporating GenAI into its existing products in a subtly different way.
Before exploring new GenAI features, it is important to take a step back and look at the journey of AI adoption in the form of machine learning in eDiscovery.
Predictive coding (also known as technology-assisted review or TAR) uses machine learning to identify and categorise potentially relevant documents, learning from human coding decisions on other documents within the review pool. It was originally recommended only for large cases containing over 50,000 documents, but as the technology improved to automate parts of the workflow, that advice evolved, and there can now be benefits on matters with as few as 1,000 documents.
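To make the mechanics concrete, the sketch below shows a heavily simplified predictive coding loop, assuming scikit-learn and a tiny invented document set. It is an illustration of the general technique only, not any review platform's actual implementation.

```python
# Minimal illustration of a predictive coding (TAR) loop: a classifier is
# trained on human coding decisions and used to rank the unreviewed pool.
# Hypothetical data and labels; not any vendor's actual workflow.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

reviewed_docs = [
    "invoice for consulting services rendered in March",
    "minutes of the board meeting discussing the merger",
    "office social club newsletter and bake sale rota",
]
reviewed_labels = [1, 1, 0]  # 1 = relevant, 0 = not relevant (human decisions)

unreviewed_docs = [
    "draft share purchase agreement for the merger",
    "canteen menu for the week commencing 5 August",
]

# Convert text to features and train on the human-coded seed set.
vectoriser = TfidfVectorizer()
X_reviewed = vectoriser.fit_transform(reviewed_docs)
model = LogisticRegression().fit(X_reviewed, reviewed_labels)

# Score the remaining pool; the highest-scoring documents are prioritised for
# human review, and those decisions feed back into the next training round.
scores = model.predict_proba(vectoriser.transform(unreviewed_docs))[:, 1]
for doc, score in sorted(zip(unreviewed_docs, scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {doc}")
```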
It has been 15 years since predictive coding was released [1] as part of a document review platform for use in eDiscovery. It took three years for the use of predictive coding to be accepted by Judge Andrew J. Peck in federal court in New York [2] and a further four years for formal acceptance by the High Court in the UK. [3] That is not to say that predictive coding wasn’t used before those formal decisions were issued, as parties could still adopt the technology by private agreement. That said, it is clear that adoption was slow, and lawyers were (and still are) cautious about using the technology.
The Civil Procedure Rules in the UK [4] set out the overriding objective of dealing with cases justly and at proportionate cost. We understand that not every matter is suitable for predictive coding workflows; however, there is an argument that it should at the very least be considered for use in all cases.
Unfortunately, the reality is that predictive coding is not being used as much as it should be, due to a lack of trust in, and education and understanding of, the technology. One study suggests that only 30% of practitioners use predictive coding in all or most of their cases. [5] All document review tools that incorporate predictive coding also include statistical validation techniques, which should, in theory, mitigate any trust issues with the technology.
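As a simplified illustration of what that validation looks like, the sketch below estimates recall from a random sample of the documents a model has set aside. All figures are invented, and real protocols add confidence intervals and platform-specific steps.

```python
# Illustrative recall estimate from a random validation sample.
# All figures are hypothetical placeholders.

relevant_found_by_model = 9_000   # documents the model flagged and reviewers confirmed
discard_pile_size = 100_000       # documents the model set aside as not relevant

# A random sample of the discard pile is reviewed by humans.
sample_size = 2_000
relevant_in_sample = 20

# Scale the sample rate up to estimate the relevant documents missed overall.
estimated_missed = relevant_in_sample / sample_size * discard_pile_size
estimated_recall = relevant_found_by_model / (relevant_found_by_model + estimated_missed)

print(f"Estimated documents missed: {estimated_missed:.0f}")
print(f"Estimated recall: {estimated_recall:.1%}")
```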
The term ‘AI’ became mainstream when 100 million users were introduced to OpenAI’s chatbot, ChatGPT, within two months of its public release [6]. Since that launch 18 months ago, there has been greater awareness of the availability of sophisticated AI technology across all industries. It is no surprise to the eDiscovery industry that existing technologies would expand to utilise GenAI to streamline document review.
Here, it is important to pause and define what a Large Language Model (LLM) is. According to Merriam-Webster, an LLM is a language model that utilizes deep methods on an extremely large data set as a basis for predicting and constructing natural-sounding text [7]. In other words, you can ask an LLM a question as you would a human, and it will return a natural, human-sounding response. In the background, the LLM stores vast amounts of information as static connections, with probabilities between words; LLMs themselves do not learn when questions provide new information.
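As a toy illustration only, the snippet below captures the idea of static word-to-word probabilities that are consulted but never updated at question time. A real LLM encodes these relationships across billions of learned parameters rather than a small lookup table; the vocabulary here is invented purely for demonstration.

```python
# Toy illustration of static connections with probabilities between words:
# a frozen table of next-word probabilities, queried but never updated.
next_word_probabilities = {
    "document": {"review": 0.6, "retention": 0.3, "shredder": 0.1},
    "review":   {"platform": 0.5, "protocol": 0.4, "party": 0.1},
}

def most_likely_continuation(word: str) -> str:
    # Pick the highest-probability next word from the frozen table.
    options = next_word_probabilities.get(word, {})
    return max(options, key=options.get) if options else "<unknown>"

print(most_likely_continuation("document"))  # -> review
print(most_likely_continuation("review"))    # -> platform
```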
In eDiscovery, it is not as simple as saying ‘let’s use GenAI.’ We need to understand how GenAI can be used as part of an end-to-end workflow across different use cases. Each document review tool has adopted GenAI in a different way. Some technology companies openly advise that they connect to GPT-4, others withhold which LLM they connect to, and some connect to multiple LLMs or create their own Small Language Models (SLMs). These SLMs are slimmed-down language models that are generally easier to train, fine-tune, and deploy, as well as being cheaper to run [8]. Some SLMs are built for very specific tasks (for example, summarisation) where it is not possible to alter the prompt, which raises the question of how the output can be validated. With so many options out there, each geared towards tackling a specific problem, we must first look at the unique challenges our data sets present to know which AI will best provide a solution.
Let’s explore five GenAI applications below:
While these are simply five examples and by no means an exhaustive list, the variety of forms and use cases demonstrated by the above makes one wonder, what else? Could it be used by the receiving party to query what is new in the received production versus their own? Will it eventually make judgements on the likelihood of winning a case? Will there soon be a wider repository of case studies from which it can draw? For example, there could soon be a custom LLM that has the ability to reference actual cases and their judgements from within a review platform (one that isn’t as subject to hallucinations and making up case studies [9]).
The real gap in the commentary appears to be the lack of concrete evidence comparing GenAI solutions to existing TAR solutions. Yes, there have been a few eDiscovery experiments that analyse the different approaches with reference to a case study, generally reporting recall of approximately 95% for the GenAI model against approximately 85% for TAR. But arguably, the statistical advantage of GenAI needs to be evaluated in conjunction with cost to understand the true benefit.
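A rough, entirely hypothetical comparison shows why recall and cost have to be read together; the figures below are placeholders, not results from any actual experiment.

```python
# Back-of-the-envelope comparison of recall against cost, using entirely
# hypothetical figures, to show why the two must be weighed together.
relevant_docs_in_population = 10_000

scenarios = {
    # name: (recall, assumed processing cost per document, documents processed)
    "TAR":   (0.85, 0.05, 500_000),
    "GenAI": (0.95, 0.50, 500_000),
}

for name, (recall, cost_per_doc, docs) in scenarios.items():
    missed = relevant_docs_in_population * (1 - recall)
    spend = cost_per_doc * docs
    print(f"{name}: ~{missed:.0f} relevant documents missed, "
          f"~{spend:,.0f} in processing cost (hypothetical units)")
```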
The pricing point is complex, as most GenAI offerings rely on third-party LLM token costs. Without going into too much technical detail, each LLM has a context window, essentially an input limit, from which the LLM generates a response. The text that is input into an LLM is broken down into segments called tokens, which are used to calculate costs. Currently, doubling the size of a context window quadruples the cost, which is a particular pain point for the eDiscovery sector as we would hope to use entire databases of documents as context.
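To illustrate the arithmetic, the sketch below estimates the input-token cost of processing a large document population through a commercial LLM. The document count mirrors the five-million-document example discussed below, while the average token count and per-token price are hypothetical placeholders rather than any provider's actual rates.

```python
# Rough token-cost arithmetic for putting a document population through an
# LLM. Token counts and prices are hypothetical placeholders; check your
# provider's current pricing before relying on any estimate.
documents = 5_000_000
avg_tokens_per_document = 1_500       # roughly 1,000 words of text
price_per_1k_input_tokens = 0.01      # hypothetical, in USD

total_tokens = documents * avg_tokens_per_document
estimated_cost = total_tokens / 1_000 * price_per_1k_input_tokens

print(f"Total input tokens: {total_tokens:,}")
print(f"Estimated input cost: ${estimated_cost:,.0f}")
```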
Technology providers are comparing the cost of adopting this approach against human review, though perhaps we should be comparing it against the cost of utilising existing technologies. Some are packaging the cost into a higher overall per-GB rate, while others charge per prompt run or mirror the token cost. What we can say for certain is that it would be cost-prohibitive to run 5 million documents through a GenAI model that links to GPT-4 at its current pricing. Therefore, a workflow still needs to involve existing eDiscovery filtering techniques. Others are building their own SLMs, which can lead to more economical cost models, but potentially at a cost to quality due to the smaller data pool they draw from.
Whether exploring technology that connects to LLMs or SLMs, it is imperative to feel confident that sensitive client data is protected at all times, by asking your provider detailed privacy and security questions.
It is evident that technology providers are developing purpose-built technology solutions, and not simply utilising GenAI for the sake of using it.
The evolution of eDiscovery expertise seems to be shifting from analysis of appropriate keywords to prompt engineering. Or will prompt engineering soon be a term of the past as custom models are developed with pre-generated prompts?
The question is not how we can replace humans with GenAI, but rather how we can adopt LLMs alongside existing technologies in the most efficient way. Unsurprisingly, it comes down to what works best for your dataset and workflow.
So, what will we see in the next 12 months? Will GenAI replace first-pass review? We certainly need more evidence of its effectiveness across a variety of matters. Our prediction is that it will not be a one-for-one overnight replacement – and not because of the technology, but because of current pricing. It will take time to understand the true benefit and to ensure we are able to trust the existing validation techniques. However, will it take 15 years to get to 30% adoption, as was the case with TAR? We don’t think so.
We would like to thank Chantelle Jalland and Marybeth Kings for providing insight and expertise that greatly assisted this research.
Marybeth Kings is a Senior Director in the EMEA Digital Investigations & Discovery team at J.S. Held. Based in London, she has more than a decade of experience assisting government, legal, and corporate clients with their eDiscovery matters, from internal and regulatory investigations, arbitrations, and data subject access requests to pro-bono, charitable, and non-standard applications of eDiscovery software. Examples of projects she has worked on include data subject access requests, litigation, arbitrations, internal investigations, regulatory investigations, voluntary disclosure, as well as claims tracking and management.
Marybeth can be reached at [email protected] or +44 20 4534 0415.
[1] Axcelerate Document Review Platform developed by Recommind was released in 2009 and is now owned by OpenText https://www.opentext.com/products/axcelerate
[2] Da Silva Moore v. Publicis Groupe & MSL Group, No. 11 Civ. 1279 (ALC) (AJP) (S.D.N.Y. Feb. 24, 2012)
[3] Pyrrho Investments Ltd v MWB Property Ltd [2016] EWHC 256 (Ch)
[4] Civil Procedure Rules, Rule 1.2.
[5] 2024 State of the Industry Report, 9 January 2024 https://ediscoverytoday.com/2024/01/09/2024-state-of-the-industry-report-is-out-heres-how-to-get-it-ediscovery-trends/
[6] ChatGPT sets record for fastest-growing user base, 2 February 2023 https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/
[7] Merriam-Webster Online Dictionary https://www.merriam-webster.com/dictionary/large%20language%20model
[8] The Rise of Small Language Models, The New Stack https://thenewstack.io/the-rise-of-small-language-models/