POC Prototype for Digital Account Opening

Document Classifier (ID, Driver’s License, Passport) using a Python model & OCR.

Digital account opening has been held back by the difficulty of accepting and verifying documentation and of capturing and storing metadata, amongst other legislative requirements.

In this POC I have created a PROTOTYPE that:

  • classifies the 3 document types;
  • extracts text from the image;
  • and will eventually need to verify the barcode.

The key building blocks to create this end-to-end process are: document classification, text extraction, verification and storage.
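
To make the flow between these four building blocks concrete, here is a minimal sketch of how they could be chained. It is not the prototype's actual code: `DocumentRecord`, `run_pipeline` and the placeholder stage functions are hypothetical names chosen for illustration, and each stage is passed in as a plain function so that a real classifier, OCR engine, verifier and data store can be swapped in later.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DocumentRecord:
    """Everything the pipeline learns about one uploaded image (hypothetical)."""
    image_path: str
    doc_type: str = ""
    text: str = ""
    verified: bool = False


def run_pipeline(
    image_path: str,
    classify: Callable[[str], str],
    extract_text: Callable[[str], str],
    verify: Callable[[DocumentRecord], bool],
    store: Callable[[DocumentRecord], None],
) -> DocumentRecord:
    """Run the four building blocks in order and return the final record."""
    record = DocumentRecord(image_path=image_path)
    record.doc_type = classify(image_path)   # 1. document classification
    record.text = extract_text(image_path)   # 2. text extraction
    record.verified = verify(record)         # 3. verification
    store(record)                            # 4. storage
    return record


# Usage with placeholder stages; real stages would replace the lambdas.
saved: List[DocumentRecord] = []
result = run_pipeline(
    "id_card.jpg",                            # made-up file name
    classify=lambda path: "ID",
    extract_text=lambda path: "SURNAME DOE",
    verify=lambda rec: bool(rec.text),
    store=saved.append,
)
print(result)
```

Passing the stages in as functions keeps each block independently testable, which suits a POC where the stages mature at different speeds.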

Machine learning models that classify document types from images (IDs, passports, driver's licenses) can be trained for both the Personal market and Business customers.

Nuances in the Business/Wholesale market include documents such as the MoA and registration number supplied as images; text extraction there will cover 'trading as', legal name, registration number, date of registration and country of registration.
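
As a rough illustration of pulling those business fields out of OCR text, the sketch below assumes the text arrives as "Label: value" lines. That layout, the `FIELD_LABELS` mapping, the `parse_business_fields` function and the sample company are all assumptions made for this example; real registration documents vary by country and issuer and will need their own patterns.

```python
import re
from typing import Dict

# Hypothetical labels: output key -> label text expected in the OCR output.
FIELD_LABELS = {
    "legal_name": "Legal name",
    "trading_as": "Trading as",
    "registration_number": "Registration number",
    "date_of_registration": "Date of registration",
    "country_of_registration": "Country of registration",
}


def parse_business_fields(ocr_text: str) -> Dict[str, str]:
    """Return the fields found in 'Label: value' lines; missing ones are skipped."""
    fields: Dict[str, str] = {}
    for key, label in FIELD_LABELS.items():
        # Anchor at the start of a line so 'Date of registration' is never
        # mistaken for 'Registration number'.
        pattern = rf"^[ \t]*{re.escape(label)}[ \t]*[:\-][ \t]*(.+?)[ \t]*$"
        match = re.search(pattern, ocr_text, flags=re.IGNORECASE | re.MULTILINE)
        if match:
            fields[key] = match.group(1)
    return fields


# Made-up sample text, not taken from a real document.
sample = """Legal name: Acme Holdings (Pty) Ltd
Trading as: Acme Foods
Registration number: 2015/123456/07
Date of registration: 2015-03-01
Country of registration: South Africa
"""
print(parse_business_fields(sample))
```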


1. Document Classification:

   For IDs, Driver’s Licenses & Passports.

A classifier is a type of model that you can use to automate the identification and classification of a document type.

Model Performance Training - precision and recall were above 85%.


Model Performance Prediction - accuracy in predicting the right document type was generally above 94%.
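
For readers unfamiliar with these metrics, here is a small sketch of how precision, recall and accuracy can be computed from true and predicted labels. The labels and predictions below are invented purely to show the arithmetic; they are not the POC's results, and `precision_recall` and `accuracy` are illustrative helper names rather than the functions used in the prototype.

```python
from typing import Dict, List, Tuple


def precision_recall(
    y_true: List[str], y_pred: List[str]
) -> Dict[str, Tuple[float, float]]:
    """Per-class (precision, recall) for a multi-class classifier."""
    scores: Dict[str, Tuple[float, float]] = {}
    pairs = list(zip(y_true, y_pred))
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Precision: of everything predicted as this class, how much was right.
        precision = tp / (tp + fp) if tp + fp else 0.0
        # Recall: of everything truly in this class, how much was found.
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores[label] = (precision, recall)
    return scores


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Share of documents whose predicted type matches the true type."""
    if not y_true:
        return 0.0
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented labels for illustration only.
truth = ["id", "id", "passport", "licence", "passport", "licence"]
preds = ["id", "passport", "passport", "licence", "passport", "id"]
print(precision_recall(truth, preds))
print(accuracy(truth, preds))
```

Precision and recall are reported per class because a model can look accurate overall while still confusing two specific document types.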


2. Text Extraction:

Text is extracted from an image; in this scenario, the image was my ID.
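
A minimal sketch of this step is shown below. It assumes the Tesseract engine with the `pytesseract` and Pillow packages, which is one common way to do OCR in Python and not necessarily the library used in this POC. The `clean_ocr_text` helper and the sample string are hypothetical additions for tidying raw OCR output.

```python
import re


def extract_text(image_path: str) -> str:
    """Run OCR on an image file (assumes Tesseract, pytesseract and Pillow)."""
    # Imported lazily so the cleaning helper below works without OCR installed.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path))


def clean_ocr_text(raw: str) -> str:
    """Collapse runs of spaces and tabs and drop the empty lines OCR often emits."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


# Made-up raw output standing in for what an OCR engine might return.
raw_sample = "  SURNAME   DOE \n\n\n NAMES  JANE\t ANN \n"
print(clean_ocr_text(raw_sample))
```

On a real image the two functions would be combined as `clean_ocr_text(extract_text("my_id.jpg"))`, where the file name is a placeholder.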

3. Verification:

The barcode itself does not store information but points to the information stored in the database software. Using the code that was used to create the barcode will establish whether the document is fraudulent. This can work for international IDs as well, as that information is available.
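
Since verification is still to be built, the following is only a sketch of the idea described above: treat the decoded barcode value as a key, look up the record it points to, and compare that record with the text extracted from the image. The `REGISTRY` dictionary stands in for the real database, and the function name, field names and values are all invented for illustration.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the database the barcode points to.
REGISTRY: Dict[str, Dict[str, str]] = {
    "ABC123": {"surname": "DOE", "id_number": "0000000000000"},
}


def verify_document(
    barcode_code: str,
    extracted: Dict[str, str],
    registry: Optional[Dict[str, Dict[str, str]]] = None,
) -> bool:
    """Return True only if the code exists and every stored field matches."""
    registry = REGISTRY if registry is None else registry
    record = registry.get(barcode_code)
    if record is None:
        # Unknown code: nothing on record to compare against.
        return False
    return all(
        extracted.get(field, "").strip().upper() == value.upper()
        for field, value in record.items()
    )


# Fields as they might come out of the text-extraction step (made up).
print(verify_document("ABC123", {"surname": "Doe ", "id_number": "0000000000000"}))
print(verify_document("ZZZ999", {"surname": "Doe"}))
```

Treating an unknown code as a failed check is a deliberately cautious choice; a production system would probably separate "not found" from "mismatch" so that each case can be handled differently.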


For more, follow me on Medium: https://medium.com/@aveshnee7