The case for OCR in Insurance Compliance – Is 98% good enough?

  • 3 October 2023

No system can promise a perfect success rate over an extended period of time, but not striving towards it is something that industries such as insurance compliance simply cannot afford. While a 98% success rate may seem great in some contexts, like exam grades or how full a glass of wine is, it falls short in critical areas such as reviewing insurance policy documents. In industries where precision is paramount, nothing less than a 100% success rate will suffice. 

Artificial Intelligence is being paraded as the most significant disruptor since the dotcom boom, and it is described with equal reverence as having the ability to impact almost all aspects of commerce and society. McKinsey, in its 2021 report, described the relationship between AI and insurance by stating that the former could “transform every aspect of the (insurance) industry”. Surely, a technology with such promise must be able to demonstrate its effectiveness.

In several cases, AI, through its many uniquely powerful branches, has performed tasks and operations far better than traditional non-AI software solutions. In some cases, such as with ChatGPT, it has even surpassed humans in the time taken to learn new things and to pass academic tests that have a reputation for being extremely challenging. In insurance compliance, OCR is being marketed as a potent data extraction and organization tool, while technologies based on Machine Learning algorithms are being considered as highly effective fraud detection mechanisms.

But no AI-driven bot has yet replaced a doctor, and no ChatGPT-fueled prompt responder has been granted a license to practice law. Even the most basic of tasks, such as bookkeeping and data logging, have yet to be automated to the extent that they require no human intervention of any kind. In insurance compliance, AI simply hasn’t shown a consistent ability to extract data from documents at the level of accuracy, speed, and reliability this industry demands.

Our argument is not that AI will replace bookkeepers and data loggers sooner than insurance compliance reviewers because the former’s work is basic and the latter’s is complex. It is that any job that has required human checks for 100% accuracy will continue to do so regardless of how powerful AI becomes, at least in the near future. It is also that automation requires large, useful datasets on which ML models are trained, and no dataset of this kind exists for the insurance industry.

Insurance compliance can greatly benefit from sophisticated technologies that make it a more efficient, structured, and accurate process and several AI technologies have shown promise to that effect. To present our case, we shall first explore which specific AI technology has the potential to assist insurance compliance reviewers and why we believe its role will remain limited to that extent. 

Optical Character Recognition – A game changer for data extraction, assimilation, and organization 

Optical Character Recognition (OCR) is a computer vision and pattern recognition technology that leverages complex algorithms to perform the task of converting printed or handwritten text from physical or digital images into machine-readable text data. 

The process begins with image pre-processing, involving operations such as noise reduction, binarization, and skew correction to enhance the clarity and uniformity of the input image. Subsequently, feature extraction methods like contour analysis and edge detection are employed to identify and isolate individual characters or symbols within the image. OCR systems then use neural networks, a class of machine learning models, to recognize the characters (letters and words) in images or scans.

Additionally, deep learning techniques enable the system to learn complex patterns and variations in characters, improving accuracy. To further enhance OCR accuracy, post-processing techniques like language modeling, spell-checking, and contextual analysis may be applied to correct recognition errors and improve the overall quality of the extracted text. OCR technology finds applications in diverse domains, including document digitization, text mining, automated data entry, and accessibility solutions, making it a fundamental tool in the modern digital ecosystem. 
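To make two of the stages above concrete, here is a minimal sketch of binarization (a pre-processing step) and dictionary-based correction (a post-processing step) on toy data. The function names, the fixed threshold, the tiny pixel grid, and the vocabulary are all assumptions chosen for illustration; production OCR systems use far more elaborate versions of both steps.

```python
from difflib import get_close_matches


def binarize(gray, threshold=128):
    """Map each grayscale pixel (0-255) to 1 (ink) or 0 (background)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def correct_tokens(tokens, vocabulary, cutoff=0.75):
    """Replace a token with its closest vocabulary word when one is similar enough."""
    corrected = []
    for tok in tokens:
        match = get_close_matches(tok.lower(), vocabulary, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else tok)
    return corrected


# Hypothetical 3x3 grayscale patch: dark pixels are ink, light pixels are paper.
gray = [[250, 30, 245],
        [240, 25, 250],
        [20, 35, 15]]
binary = binarize(gray)

# Hypothetical recognizer output containing typical misreads ("1" for "l", "rn" for "m").
vocabulary = ["policy", "premium", "deductible", "insured"]
raw_tokens = ["po1icy", "prernium", "insured", "2023"]
cleaned = correct_tokens(raw_tokens, vocabulary)
print(binary)
print(cleaned)
```

Note that a dictionary can repair a misread word, but it has no way of knowing whether a number such as a coverage limit or a date was read correctly. That kind of error has to be caught some other way.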

Applying OCR to real-world Insurance Compliance  

OCR is often characterized as a ‘vision and pattern’ recognition system, akin to human capabilities. Extracting data from standardized documents is relatively straightforward because the required information is always found in the same place. However, insurance compliance reviewers face documents that differ considerably from one another based on several factors, such as the issuing carrier.

OCR must be trained on a large quantity of data before it can perform vision and pattern recognition at the desired level of accuracy, and policy documents are simply not available as a dataset of this kind. As a result, no OCR model on the market is capable of accurately discerning the correct information in policy documents and then organizing it correctly.

The structure of documents such as term sheets and declaration pages varies by carrier, while policy documents vary based on the variant of the ACORD forms used. Folks whose day-to-day job is reviewing policy documents will tell you that the documents they receive from carriers and agents are almost always in unstructured formats, on top of being unstandardized to begin with. Agents sometimes edit PDFs to add notes, write outside the space available for a specific piece of information, and add useful information on pages that are far apart in long-form policy documents.

Additionally, OCR might struggle when faced with variations in typed font styles, font sizes, spacing, and document completeness. These inconsistencies contribute to the unstructured nature of the data, making accurate extraction challenging. Similarly, suboptimal scan quality, characterized by noise, creases, irregular boundaries, and stains, can significantly affect OCR’s ability to extract data accurately. Handwritten elements further degrade OCR’s accuracy.

In the best-case scenario, the OCR tool being employed extracts 98% of the data correctly and saves the reviewer the time of performing this task manually. However, since errors remain a real possibility and nothing marks which extracted values are wrong, the reviewer would still be required to manually check and verify each piece of information.

This is only the first step in the insurance compliance process. It is followed by analyzing the information to check its level of compliance with the chosen rules program. This process is not based on simply matching phrases found in policy documents with rules. It involves understanding the context of the information and then making logical deductions to ascertain the level of insurance compliance. This essentially reduces OCR to an assistive tool for one specific segment of the comprehensive set of processes that make up insurance compliance protocols.

OCR is, at best, a game changing assistive technology in Insurance Compliance 

In the realm of insurance compliance, the demand for precision and reliability remains paramount. Clients with loans, often in the tens or hundreds of millions, expect nothing less than 100% accuracy, an aspiration that even the most advanced OCR, boasting a remarkable 98% success rate, struggles to attain. 

Even if it were not for the 2%, and even if modern OCR were to achieve a staggering 99.99% accuracy, the lingering uncertainty of that elusive 0.01% would still pose an insurmountable risk in domains like insurance compliance, where errors can swiftly translate into substantial financial consequences.
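To put rough numbers on why a small per-field error rate matters, consider a simplified model in which each extracted field is read correctly with the same probability, independently of the others. Both the independence assumption and the figure of 50 fields per document are ours, chosen only for illustration. Under that model, the chance that an entire document is extracted without a single error is the per-field accuracy raised to the number of fields:

```python
def prob_document_error_free(field_accuracy, n_fields):
    """Probability that every one of n_fields is extracted correctly,
    assuming each field is independent and equally likely to be correct."""
    return field_accuracy ** n_fields


N_FIELDS = 50  # hypothetical number of data points pulled from one policy document

for accuracy in (0.98, 0.999, 0.9999):
    p = prob_document_error_free(accuracy, N_FIELDS)
    print(f"{accuracy:.2%} per field -> {p:.1%} of {N_FIELDS}-field documents fully correct")
```

At 98% per field, only about 36% of such documents would come through with no errors at all. Even at 99.99%, roughly one document in two hundred would still contain at least one wrong value, and nothing in the output indicates which one.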

What makes matters worse is that the alternative to an OCR-based solution is a traditional insurance consultant or reviewer who has been doing this for years, is available in the market, and is eager to take your client’s business.

To circumvent the perils of this fractional yet formidable margin of error, the human touch emerges as an indispensable safeguard. The attentive eye of an insurance reviewer, coupled with the discerning capability to scrutinize each data point extracted from a policy document based on years of training, becomes the linchpin of reliability. While the expedience of OCR-driven extraction surpasses manual data entry, the essential role of a human verifier remains irreplaceable. 

By the same reasoning, the seemingly insignificant error-prone range of OCR (between 2% and 0.01%) can become a problem that is worth tens of thousands or even millions of dollars. Policies that do not comply with the required rules but are, due to errors, presented as compliant can be construed as fraud and can lead to lawsuits, fines, revocations of coverage, and loan cancellations. Likewise, carriers may limit payouts or deny them altogether if, due to errors, the information used at the time of acquiring the policy either does not match the information used for claims or is incorrect to begin with.

Such errors may also give lenders grounds to seek punitive and compensatory damages through litigation by claiming fraudulent business practices on the part of borrowers. The reputational damage alone might be enough to cause irreparable losses to organizations that experience errors due to operational lapses that OCR may contribute to.

In this context, Optical Character Recognition (OCR) assumes its rightful place not as a revolutionary game changer, but rather as a formidable assistive tool, adept at economizing time and effort. By entrusting humans with the pivotal tasks of accuracy validation and compliance assurance, OCR harmonizes seamlessly with our pursuit of uncompromising excellence in the intricate landscape of information management that forms the bedrock of modern insurance compliance. 

Advocate harmonizes technology and human involvement in Insurance Compliance 

As a technology-driven insurance compliance consultant, Advocate has built sophisticated systems that leverage technology to enable faster and more accurate insurance compliance. These systems rely on a harmonious collaboration between automation and insurance expertise.

Tech-enabled disruption need not rest on the shoulders of AI alone. Advocate Compliance Engine (ACE) exemplifies the capabilities of a tech-expert collaboration within the insurance compliance domain.

Our team of insurance experts meticulously scrutinizes every paragraph, line, and comma, extracting and structuring valuable information from policy documents, term sheets, and declaration pages. This data is then seamlessly entered and run against rigorous automated checklists, customized to meet each lender’s specific insurance requirements and regulatory guidelines, ensuring 100% compliance. This combination of technology-driven compliance and insurance expertise guarantees unparalleled accuracy in insurance policies processed by Advocate.  
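As a rough illustration of what running structured data against an automated checklist can look like, here is a minimal sketch. It is not a description of how ACE is implemented. The `Requirement` class, the field names, and the dollar amounts are hypothetical, and real lender requirements involve far more than minimum limits.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    """One hypothetical lender requirement: a field and its minimum acceptable value."""
    field: str
    minimum: float


def check_policy(policy, requirements):
    """Return a list of findings; an empty list means every check passed."""
    findings = []
    for req in requirements:
        value = policy.get(req.field)
        if value is None:
            findings.append(f"{req.field}: missing")
        elif value < req.minimum:
            findings.append(f"{req.field}: {value} is below required {req.minimum}")
    return findings


# Hypothetical checklist for one lender.
requirements = [
    Requirement("general_liability_limit", 1_000_000),
    Requirement("property_limit", 5_000_000),
]

# Hypothetical data entered by a reviewer after reading the policy documents.
policy = {"general_liability_limit": 1_000_000, "property_limit": 4_500_000}

findings = check_policy(policy, requirements)
print(findings)
```

A check like this is only as reliable as the values fed into it, which is why the accuracy of the extraction step matters so much.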

