Smart ERP: Scalable Data Engineering Frameworks Using Artificial Intelligence

Authors

  • Yuvaraj Kavala

DOI:

https://doi.org/10.70153/IJCMI/2023.15302

Keywords:

AI in ERP, Data Engineering, Software Engineering, Intelligent ETL, Metadata-Driven Architecture, Enterprise Systems

Abstract

Enterprise Resource Planning (ERP) systems are foundational to large organizations, integrating business functions such as finance, procurement, human resources, inventory, and sales. However, the rapid growth in the volume, velocity, and variety of data generated by these modules has rendered traditional data engineering practices inadequate. This paper presents an AI-driven data engineering framework that addresses data quality, scalability, and pipeline agility in modern ERP environments. The framework uses machine learning models for metadata-aware ETL rule generation, unsupervised anomaly detection, and dynamic orchestration of end-to-end data pipelines. It also embeds core software engineering principles, including modular architecture, CI/CD automation, observability, and testing, to ensure reliability and maintainability. A proof-of-concept system was implemented using Apache Airflow, Spark, AWS SageMaker, and Amundsen, and was evaluated on synthetic ERP data across procurement, payroll, and inventory modules. In these experiments, anomaly detection accuracy improved from 71% to 94%, ETL rule generation time fell from two hours to 15 minutes, pipeline execution latency dropped by 45%, and downstream model accuracy improved from 82% to 91% after automated data cleaning. These results indicate that the framework improves data quality and operational efficiency. The findings support integrating AI-powered pipelines into ERP systems to enable scalable, intelligent, and high-fidelity data processing for next-generation enterprise software.
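
To make the idea of unsupervised anomaly detection on ERP records concrete, the following is a minimal sketch and not the paper's method: the abstract does not say which model the framework uses, so a simple modified z-score (median and median absolute deviation) stands in for it here. The function name `mad_anomalies`, the 3.5 cutoff, and the invoice amounts are all illustrative assumptions.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    Stand-in technique (assumption): the abstract names no specific model.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # No spread around the median, so this method cannot flag anything.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the data are roughly normal.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Hypothetical procurement invoice amounts; one entry is far out of range.
invoice_amounts = [1020.0, 980.0, 1005.0, 995.0, 1010.0, 990.0, 25000.0, 1000.0]
flagged = mad_anomalies(invoice_amounts)
print(flagged)
```

The median-based score is used here because a single extreme record barely shifts the median, whereas it would inflate a mean and standard deviation and could mask itself. A production pipeline would likely use a multivariate model over many fields rather than one numeric column.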

Author Biography

  • Yuvaraj Kavala

    Data Architect

    Petabyte Technologies

    7460 Warren Parkway, Suite 100, Frisco, TX 75034

    E-Mail: kavalayuvaraj@gmail.com

Published

2023-11-30

How to Cite

[1]
Y. Kavala, “Smart ERP: Scalable Data Engineering Frameworks Using Artificial Intelligence”, IJCMI, vol. 15, no. 1, pp. 1248–1262, Nov. 2023, doi: 10.70153/IJCMI/2023.15302.
