Input Fields Recognition in Documents Using Deep Learning Techniques
Abstract
Identifying the input fields that appear on a document is a crucial requirement when digitizing it.
This paper presents a Deep Learning based approach to detecting input fields in a form or
document that contains text, images and input fields such as textboxes and checkboxes. The forms were
crawled and labelled manually to generate a dataset for training Deep Learning models. A YOLO
V3 model is trained on the labelled dataset, which has four classes (static text, static image, input text,
checkbox) and 1500 instances. We used bounding boxes to label the dataset. The paper
covers detection of a limited set of input field types that generally appear on printed forms. We also
discuss how such detection models can scale and sustain higher loads. Given a labelled dataset
for other types of input fields, the existing YOLO V3 model can be trained for them as well. The model is
trained for 3500 iterations and achieves an accuracy of 71 percent.
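
To illustrate the bounding-box labelling step, the following is a minimal sketch of how one annotated field could be written as a label line in the normalized format commonly used with Darknet-style YOLO training (class index, then box centre, width and height as fractions of the image size). The abstract does not state the annotation format or the class ordering the authors used, so the `CLASSES` order, the function name `to_yolo_line` and the example coordinates are assumptions made for illustration only.

```python
# Hypothetical class order: the paper names these four classes
# but does not say which index each one was given.
CLASSES = ["static text", "static image", "input text", "checkbox"]


def to_yolo_line(label, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line.

    The returned string is "<class_id> <x_center> <y_center> <width> <height>",
    with all four box values normalized to the range 0..1.
    """
    x_min, y_min, x_max, y_max = box
    if not (0 <= x_min < x_max <= img_w and 0 <= y_min < y_max <= img_h):
        raise ValueError("box must lie inside the image and have positive area")

    class_id = CLASSES.index(label)  # raises ValueError for an unknown label
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 40x40 pixel checkbox on a 1000x2000 pixel scanned form (made-up values).
    print(to_yolo_line("checkbox", (100, 200, 140, 240), 1000, 2000))
```

Normalizing by the image size keeps the labels valid when forms are scanned or resized to different resolutions, which is one reason this representation is widely used for detector training.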
Paper Details
PaperID: 2468
Authors' Names: Atharv Nagarikar, Rahul Singh Dangi, Samrit Kumar Maity, Ashish Kuvelkar and Sanjay Wandhekar
Volume: Volume 11
Issues: Volume 11
Keywords: Deep Learning, YOLO, OCR, Forms, Document’s Input Fields.
Year: 2021
Month: August
Pages: 4405-4415