Fine-tuned LayoutLMv3 for Indonesian receipts extraction
Oka Sudana, Ayu Wirdiani, Andre Dwi Winama Putra
Abstract
Shopping transactions are typically recorded on a payment receipt. A receipt is usually a small piece of paper that is easily lost, so it is essential to store its transaction information digitally, where it remains easily accessible. Currently, receipt information is still transferred into digital form manually, and a system that extracts this information automatically can speed up digitalization considerably. This research proposes a method that fine-tunes the LayoutLMv3 model and, with the help of optical character recognition (OCR) from Google Vision, extracts the transaction information contained in a receipt. The system uses Google Vision to parse and segment every word on the receipt together with its bounding box. The LayoutLMv3 model then assigns a label to each word, and the words labeled as important are extracted. The fine-tuned LayoutLMv3 model achieved an accuracy of 97.98% on training data and 90% accuracy in real-time test scenarios for extracting information from receipts written in Indonesian.
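The word-level flow described in the abstract (OCR words with bounding boxes, one label per word, then extraction of the labeled words) can be illustrated with a minimal sketch. This is not the authors' implementation: the sample words, pixel boxes, image size, and label names (`STORE`, `TOTAL`, `O`) are hypothetical, and the `labels` list merely stands in for the predictions of a fine-tuned model. The sketch assumes the usual LayoutLM-family convention of scaling box coordinates to the 0-1000 range before they are passed to the model.

```python
from collections import defaultdict


def normalize_box(box, width, height):
    """Scale a pixel box (x0, y0, x1, y1) to the 0-1000 range."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]


def collect_fields(words, labels, ignore_label="O"):
    """Join the words that share a label, keeping their reading order."""
    fields = defaultdict(list)
    for word, label in zip(words, labels):
        if label != ignore_label:  # skip words the model marks as unimportant
            fields[label].append(word)
    return {label: " ".join(parts) for label, parts in fields.items()}


# Hypothetical OCR output for a 200 x 400 pixel receipt image.
words = ["TOKO", "MAJU", "Total", "25.000"]
boxes = [(10, 10, 60, 30), (65, 10, 120, 30), (10, 200, 50, 220), (150, 200, 200, 220)]

# Stand-in for per-word predictions from a fine-tuned model (assumed labels).
labels = ["STORE", "STORE", "O", "TOTAL"]

norm = [normalize_box(b, width=200, height=400) for b in boxes]
fields = collect_fields(words, labels)
print(norm)
print(fields)
```

In a real system the normalized boxes and words would be fed to the model, and `collect_fields` would run on its predicted labels rather than on a hand-written list.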
Keywords
Finetuning; Google Vision; LayoutLMv3; Mobile application; Optical character recognition; Receipt extractions
DOI:
https://doi.org/10.11591/eei.v15i2.10127
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Bulletin of Electrical Engineering and Informatics (BEEI), ISSN: 2089-3191, e-ISSN: 2302-9285. This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).