A Framework for Automating Sales Logic Generation Using OCR and LLM

  • Hongdae Kim (Graduate School of Business IT, Kookmin University)
  • Yoon Han (Graduate School of Business IT, Kookmin University)
  • Namgyu Kim (Graduate School of Business IT, Kookmin University)
  • Received: 2025.07.02
  • Reviewed: 2025.08.26
  • Published: 2025.11.30

Abstract

The home appliance retail environment is rapidly shifting from offline, face-to-face purchases to online, contactless transactions. As a result, consumers increasingly explore and compare information on their own rather than relying on in-store sales staff. Home appliances, however, are typical high-involvement products: they are generally expensive and technologically complex, and their purchase involves extensive pre-purchase information search, so a simple listing of specifications is often insufficient to influence the purchase decision. To address this issue, this study proposes an OCR-LLM-based framework that automatically generates persuasive sales logic in the FAB (Features-Advantages-Benefits) structure from product images, with the aim of stimulating consumer purchase motivation. The framework consists of two modules, a text extraction module and a sales logic generation module, and operates in four automated stages: image preprocessing, OCR, LLM-based text refinement, and sales logic generation. To verify its effectiveness, experiments were conducted using real product detail images from a major domestic retailer. The generated sales logic contained an average of 1.95 purchase motivations per sentence and covered an average of 86.64% of key motivation items per product, forming a multi-layered persuasive structure that goes beyond a simple listing of specifications. These persuasive sentences are expected to serve as valuable knowledge resources for AI agent systems, such as LLM-based semantic search and chatbots, that interact with consumers in contactless purchase environments and aim to generate responses that drive purchase conversion.
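The two reported statistics can be illustrated with a small calculation. The sketch below is not the authors' evaluation code. It assumes that every generated sentence has already been labelled with the set of purchase motivations it expresses, and that each product has a reference set of motivation items. The motivation labels, the sample data, and the function names are all hypothetical.

```python
from statistics import mean


def motivations_per_sentence(sentences):
    """Average number of distinct motivation labels per generated sentence."""
    return mean(len(set(labels)) for labels in sentences)


def coverage(sentences, reference_items):
    """Percentage of a product's reference motivation items that appear
    in at least one generated sentence."""
    covered = set().union(*sentences) & set(reference_items)
    return 100.0 * len(covered) / len(reference_items)


# Hypothetical labels for one product (illustrative only, not from the paper).
reference = {"safety", "comfort", "savings", "novelty"}
sentences = [
    {"comfort", "savings"},  # sentence 1 appeals to two motivations
    {"safety"},              # sentence 2 appeals to one
    {"comfort", "savings"},  # sentence 3 repeats the first two
]

avg = motivations_per_sentence(sentences)
cov = coverage(sentences, reference)
print(f"{avg:.2f} motivations per sentence, {cov:.2f}% coverage")
```

Under these assumptions, a per-product figure comparable to the reported coverage would be obtained by averaging `coverage` over all products, and the per-sentence figure by averaging over all generated sentences.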
