INTELLIDOC - An Adaptive Transformer-Powered Pipeline For Intelligent Document Processing And Entity Extraction

Santhanalakshmi K; A Jameer Basha; R Geetha Rajakumari; Premkumar C D

doi:10.22399/ijcesen.2481

Authors

Santhanalakshmi K, PG Student, Department of Computer Science and Engineering, Hindusthan Institute of Technology, Coimbatore
A Jameer Basha, Professor, Department of Computer Science and Engineering, Hindusthan Institute of Technology, Coimbatore
R Geetha Rajakumari, Assistant Professor, Department of Artificial Intelligence and Data Science, Sri Eshwar College of Engineering, Coimbatore
Premkumar C D, Assistant Professor, Department of Information Technology, Hindusthan College of Engineering and Technology

DOI:

https://doi.org/10.22399/ijcesen.2481

Keywords:

Intelligent Document Processing, Adaptive OCR, Legal Text Processing, Transformer Models, Flan-T5, BERT

Abstract

Efficient and accurate processing of unstructured document data is crucial for legal, enterprise, and academic applications, where vast amounts of textual information must be extracted, summarized, and analyzed. Traditional Optical Character Recognition (OCR) and Named Entity Recognition (NER) methods often struggle with handwritten text, scanned documents, and complex legal structures, leading to data loss and misclassification. To address these limitations, we propose IntelliDoc, an adaptive, transformer-powered document processing pipeline designed to improve accuracy, efficiency, and contextual understanding in document intelligence tasks. IntelliDoc employs a hybrid multi-stage pipeline that integrates an adaptive OCR layer, which dynamically adjusts to document characteristics to ensure high extraction accuracy across diverse document types. Experimental evaluations on a benchmark dataset comprising legal, financial, and administrative documents demonstrate that IntelliDoc achieves an OCR accuracy of 98.2%, an NER precision of 94.7%, and a summarization coherence score of 91.5%, significantly outperforming conventional document processing frameworks. Additionally, the pipeline's parallel architecture reduces processing time by 35% compared to sequential models, making IntelliDoc suitable for real-time applications. Future work will explore integrating domain-specific large language models to further enhance interpretability and accuracy across specialized document categories.
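
The abstract does not specify how the stages are wired together, so the following is only a minimal sketch of one plausible reading: an OCR step followed by entity extraction and summarization running in parallel on the recognized text. Every name here (`adaptive_ocr`, `extract_entities`, `summarize`, `process_document`, the `kind` field) is hypothetical, and the three stage functions are simple stand-ins, not the authors' OCR engine or their BERT and Flan-T5 models.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class DocumentResult:
    text: str
    entities: list
    summary: str


def adaptive_ocr(page: dict) -> str:
    """Hypothetical stand-in for an adaptive OCR layer.

    The only "adaptation" here is a cleanup rule chosen by document kind;
    a real system would select an OCR engine or model instead.
    """
    text = page["content"]
    if page.get("kind") == "scanned":
        # Collapse the irregular whitespace and line breaks typical of scans.
        text = re.sub(r"\s+", " ", text)
    return text.strip()


def extract_entities(text: str) -> list:
    """Placeholder for a transformer NER model: multi-word capitalized spans."""
    return re.findall(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b", text)


def summarize(text: str) -> str:
    """Placeholder for an abstractive summarizer: returns the first sentence."""
    first = text.split(". ", 1)[0]
    return first if first.endswith(".") else first + "."


def process_document(page: dict) -> DocumentResult:
    """Run OCR first, then NER and summarization concurrently on its output."""
    text = adaptive_ocr(page)
    with ThreadPoolExecutor(max_workers=2) as pool:
        entities_future = pool.submit(extract_entities, text)
        summary_future = pool.submit(summarize, text)
        return DocumentResult(
            text=text,
            entities=entities_future.result(),
            summary=summary_future.result(),
        )
```

Because the two downstream stages depend only on the OCR text and not on each other, they can be submitted together. With real models, any speedup would depend on the hardware and on how the models release the interpreter lock, so this sketch says nothing about the processing-time figure reported above.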
