Toward Digital Invoicing: A Layered Approach

Automated invoice digitization presents substantial challenges due to extreme layout variability, degraded scan quality, and vendor-specific formatting conventions. Traditional approaches based on optical character recognition (OCR) or neural Intelligent Document Processing (IDP) models frequently fail to maintain structural consistency and logical validity under such heterogeneous conditions. In this paper, we introduce a unified three-layer architecture for invoice digitization that integrates visual, semantic, and structural information within a coherent reasoning framework. The architecture comprises three sequential stages: (i) a Pre-Processing Layer that standardizes and prepares scanned invoice documents for reliable downstream analysis; (ii) a Processing Layer that performs structured information extraction through integrated visual layout analysis, text understanding, and semantic reasoning, including specialized modules for non-tabular field extraction and a Convolutional Neural Network (CNN) model for table detection and reconstruction; and (iii) a Post-Processing and Validation Layer that enforces domain-specific constraints, validates extracted fields against business rules, and normalizes outputs into a standardized structured format. Experimental evaluation demonstrates that our implemented system achieves consistent performance across diverse invoice layouts, formats, and quality levels. Compared to commercial OCR solutions and state-of-the-art IDP models, our approach provides measurable improvements in critical field extraction accuracy and table reconstruction while maintaining competitive computational efficiency. These results indicate that a modular architecture integrating semantic reasoning with domain-aware validation is practical at scale for invoice processing applications.
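
The three-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain text stands in for a scanned invoice, simple regular expressions stand in for the visual, semantic, and CNN-based modules, and a single business rule (line items must sum to the stated total) stands in for the validation layer. All function names, field names, and patterns below are hypothetical.

```python
import re
from decimal import Decimal
from typing import Dict, List, Optional, Tuple

# Hypothetical patterns; a real system would use layout analysis and learned models.
NUMBER_RE = re.compile(r"invoice\s*no\.?\s*:?\s*(\S+)", re.IGNORECASE)
TOTAL_RE = re.compile(r"total\s*:?\s*(\d+\.\d{2})$", re.IGNORECASE)
ITEM_RE = re.compile(r"(.+?)\s+(\d+\.\d{2})$")


def preprocess(raw: str) -> List[str]:
    """Layer 1 (stand-in): collapse whitespace and drop empty lines."""
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return [line for line in lines if line]


def process(lines: List[str]) -> Dict[str, object]:
    """Layer 2 (stand-in): extract non-tabular fields and table rows."""
    number: Optional[str] = None
    total: Optional[Decimal] = None
    items: List[Tuple[str, Decimal]] = []
    for line in lines:
        m_number = NUMBER_RE.match(line)
        m_total = TOTAL_RE.match(line)
        m_item = ITEM_RE.match(line)
        if m_number:
            number = m_number.group(1)
        elif m_total:
            total = Decimal(m_total.group(1))
        elif m_item:
            items.append((m_item.group(1), Decimal(m_item.group(2))))
    return {"number": number, "total": total, "items": items}


def postprocess(fields: Dict[str, object]) -> Dict[str, object]:
    """Layer 3 (stand-in): apply business rules and normalize the output."""
    errors: List[str] = []
    items = fields["items"]
    total = fields["total"]
    if fields["number"] is None:
        errors.append("missing invoice number")
    if total is None:
        errors.append("missing total")
    elif sum((amount for _, amount in items), Decimal("0")) != total:
        errors.append("line items do not sum to total")
    return {
        "invoice_number": fields["number"],
        "total": None if total is None else str(total),
        "line_items": [{"description": d, "amount": str(a)} for d, a in items],
        "valid": not errors,
        "errors": errors,
    }


def digitize(raw: str) -> Dict[str, object]:
    """Run the three layers in sequence."""
    return postprocess(process(preprocess(raw)))


SAMPLE = """
  Invoice No:   INV-001
Widget A      10.00
Widget B       5.50
TOTAL:  15.50
"""

if __name__ == "__main__":
    print(digitize(SAMPLE))
```

Keeping each layer as a separate function mirrors the modularity the abstract emphasizes: an extraction module could be replaced without touching the validation rules, and a record that fails a rule is flagged rather than silently emitted.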

Elmehdi Hassani
ENSIAS, University Mohammed V in Rabat, Morocco
Morocco

Noureddine Kerzazi
ENSIAS, University Mohammed V in Rabat, Morocco
Morocco