Towards a small language model powered chain-of-reasoning for open-domain question answering

  • Jihyeon Roh (Language Intelligence Research Group, Electronics and Telecommunications Research Institute) ;
  • Minho Kim (Language Intelligence Research Group, Electronics and Telecommunications Research Institute) ;
  • Kyoungman Bae (Language Intelligence Research Group, Electronics and Telecommunications Research Institute)
  • Received : 2023.08.26
  • Accepted : 2023.12.20
  • Published : 2024.02.20

Abstract

We focus on open-domain question-answering tasks that involve a chain of reasoning, which are primarily implemented using large language models. With an emphasis on cost-effectiveness, we designed EffiChainQA, an architecture centered on the use of small language models. We employed a retrieval-based language model to address the limitations of large language models, such as the hallucination issue and the lack of updated knowledge. To enhance reasoning capabilities, we introduced a question decomposer that leverages a generative language model and serves as a key component in the chain-of-reasoning process. To generate training data for our question decomposer, we leveraged ChatGPT, which is known for its data augmentation ability. Comprehensive experiments were conducted using the HotpotQA dataset. Our method outperformed several established approaches, including the Chain-of-Thought approach, which is based on large language models. Moreover, our results are on par with those of state-of-the-art Retrieve-then-Read methods that utilize large language models.
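The decompose-retrieve-read pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper uses trained small language models for each component, whereas the functions below (`toy_decompose`, `toy_retrieve`, `toy_read`, and the `#PREV` placeholder convention) are hypothetical stand-ins chosen for this sketch.

```python
from typing import Callable, List

def chain_of_reasoning(
    question: str,
    decompose: Callable[[str], List[str]],  # question -> ordered sub-questions
    retrieve: Callable[[str], str],         # sub-question -> supporting passage
    read: Callable[[str, str], str],        # (sub-question, passage) -> answer
) -> str:
    """Answer a multi-hop question by solving its sub-questions in order,
    substituting each intermediate answer into the next sub-question."""
    answer = ""
    for sub_q in decompose(question):
        # Later hops may reference the previous hop's answer via a placeholder.
        sub_q = sub_q.replace("#PREV", answer)
        passage = retrieve(sub_q)
        answer = read(sub_q, passage)
    return answer

# Toy components over a two-fact corpus, for illustration only.
corpus = {
    "Scott Derrickson": "Scott Derrickson is an American director.",
    "Ed Wood": "Ed Wood was an American director.",
}

def toy_decompose(q: str) -> List[str]:
    # A real decomposer would be a generative model; this is hard-coded.
    return ["What nationality is Scott Derrickson?",
            "Was Ed Wood also #PREV?"]

def toy_retrieve(q: str) -> str:
    # A real retriever would rank passages; this matches entity names.
    for name, passage in corpus.items():
        if name in q:
            return passage
    return ""

def toy_read(q: str, passage: str) -> str:
    return "American" if "American" in passage else "unknown"

print(chain_of_reasoning(
    "Were Scott Derrickson and Ed Wood of the same nationality?",
    toy_decompose, toy_retrieve, toy_read,
))  # prints "American"
```

The key design point, per the abstract, is that each component can be a small, specialized model rather than one large model prompted end to end; the loop structure itself carries the chain of reasoning.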

Acknowledgement

This research was supported by the Institute for Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (no. 2022-0-00369, [Part 4] Development of AI Technology to Support Expert Decision-making that can Explain the Reasons/Grounds for Judgment Results based on Expert Knowledge).

References

  1. J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. H. Chi, Q. V. Le, and D. Zhou, Chain-of-Thought prompting elicits reasoning in large language models, Adv. Neural Info. Process. Syst. 35 (2022), 24824-24837.
  2. J. Maynez, S. Narayan, B. Bohnet, and R. McDonald, On faithfulness and factuality in abstractive summarization, arXiv preprint, 2020, DOI 10.48550/arXiv.2005.00661
  3. A. Lazaridou, E. Gribovskaya, W. Stokowiec, and N. Grigorev, Internet-augmented language models through few-shot prompting for open-domain question answering, arXiv preprint, 2022, DOI 10.48550/arXiv.2203.05115
  4. H. He, H. Zhang, and D. Roth, Rethinking with retrieval: faithful large language model inference, arXiv preprint, 2022, DOI 10.48550/arXiv.2301.00303
  5. X. Ma, Y. Gong, P. He, H. Zhao, and N. Duan, Query rewriting for retrieval-augmented large language models, arXiv preprint, 2023, DOI 10.48550/arXiv.2305.14283
  6. W. Shi, S. Min, M. Yasunaga, M. Seo, R. James, M. Lewis, L. Zettlemoyer, and W. Yih, REPLUG: retrieval-augmented blackbox language models, arXiv preprint, 2023, DOI 10.48550/arXiv.2301.12652
  7. G. Izacard, P. Lewis, M. Lomeli, L. Hosseini, F. Petroni, T. Schick, J. Dwivedi-Yu, A. Joulin, S. Riedel, and E. Grave, Atlas: Few-shot learning with retrieval augmented language models, arXiv preprint, 2022, DOI 10.48550/arXiv.2208.03299
  8. S. Min, W. Shi, M. Lewis, X. Chen, W. Yih, H. Hajishirzi, and L. Zettlemoyer, Nonparametric masked language modeling, (Findings of the Association for Computational Linguistics), 2023, pp. 2097-2118, DOI 10.18653/v1/2023.findings-acl.132
  9. W. Yu, Retrieval-augmented generation across heterogeneous knowledge, (Proc. NAACL: Human Language Technologies: Student Research Workshop), 2022, pp. 52-58, DOI 10.18653/v1/2022.naacl-srw.7.
  10. E. Perez, P. Lewis, W. Yih, K. Cho, and D. Kiela, Unsupervised question decomposition for question answering, (Proc. Empirical Methods in Natural Language Processing), 2020, pp. 8864-8880.
  11. H. Dai, Z. Liu, W. Liao, X. Huang, Y. Cao, Z. Wu, L. Zhao, S. Xu, W. Liu, N. Liu, S. Li, D. Zhu, H. Cai, L. Sun, Q. Li, D. Shen, T. Liu, and X. Li, AugGPT: leveraging ChatGPT for text data augmentation, arXiv preprint, 2023, DOI 10.48550/arXiv.2302.13007
  12. X. Wang, J. Wei, D. Schuurmans, Q. V. Le, E. H. Chi, S. Narang, A. Chowdhery, and D. Zhou, Self-consistency improves chain of thought reasoning in language models, (International Conference on Learning Representations, Kigali, Rwanda), 2023.
  13. G. Izacard and E. Grave, Leveraging passage retrieval with generative models for open domain question answering, (Proc. of the 16th Conf. of the European Chapter of the Association for Computational Linguistics), 2021, pp. 874-880.
  14. A. Asai, M. Gardner, and H. Hajishirzi, Evidentiality-guided generation for knowledge-intensive NLP tasks, (Proc. of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Seattle, WA, USA), 2022, pp. 2226-2243.
  15. S. Hofstatter, J. Chen, K. Raman, and H. Zamani, Fid-light: efficient and effective retrieval-augmented text generation, arXiv preprint, 2022, DOI 10.48550/arXiv.2209.14290
  16. Y. Levine, O. Ram, D. Jannai, B. Lenz, S. Shalev-Shwartz, A. Shashua, K. Leyton-Brown, and Y. Shoham, Huge frozen language models as readers for open-domain question answering, (ICML 2022 Workshop on Knowledge Retrieval and Language Models), 2022.
  17. S. Zheng, J. Huang, and K. C.-C. Chang, Why does ChatGPT fall short in answering questions faithfully? arXiv preprint, 2023, DOI 10.48550/arXiv.2304.10513
  18. Z. Deng, Y. Zhu, Y. Chen, M. Witbrock, and P. Riddle, Interpretable AMR-based question decomposition for multi-hop question answering, arXiv preprint, 2022, DOI 10.48550/arXiv.2206.08486
  19. Y. Liu, S. Yavuz, R. Meng, D. Radev, C. Xiong, and Y. Zhou, HPE: Answering complex questions over text by hybrid question parsing and execution, arXiv preprint, 2023, DOI 10.48550/arXiv.2305.07789
  20. J. Li, M. Ren, Y. Gao, and Y. Yang, Ask to understand: question generation for multi-hop question answering, arXiv preprint, 2022, DOI 10.48550/arXiv.2203.09073
  21. S. Min, V. Zhong, L. Zettlemoyer, and H. Hajishirzi, Multi-hop reading comprehension through question decomposition and rescoring, (Proc. Annual Meeting of the Association for Computational Linguistics, Florence, Italy), 2019, pp. 6097-6109.
  22. M. Bevilacqua, R. Blloshmi, and R. Navigli, One spring to rule them both: symmetric AMR semantic parsing and generation without a complex pipeline, (Proc. AAAI Technical Track on Speech and Natural Language Processing), Vol. 35, 2021, pp. 12564-12573.
  23. E. Chung and J. G. Park, Sentence-chain based seq2seq model for corpus expansion, ETRI J. 39 (2017), no. 4, 455-466.
  24. H. You, R. Sun, Z. Wang, L. Chen, G. Wang, H. A. Ayyubi, K.-W. Chang, and S.-F. Chang, IdealGPT: iteratively decomposing vision and language reasoning via large language models, arXiv preprint, 2023, DOI 10.48550/arXiv.2305.14985
  25. S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. R. Narasimhan, and Y. Cao, ReAct: Synergizing reasoning and acting in language models, (International Conference on Learning Representations), 2022.
  26. W. Yu, Z. Zhang, Z. Liang, M. Jiang, and A. Sabharwal, Improving language models via plug-and-play retrieval feedback, arXiv preprint, 2023, DOI 10.48550/arXiv.2305.14002
  27. P. Lu, B. Peng, H. Cheng, M. Galley, K.-W. Chang, Y. N. Wu, S.-C. Zhu, and J. Gao, Chameleon: Plug-and-play compositional reasoning with large language models, arXiv preprint, 2023, DOI 10.48550/arXiv.2304.09842
  28. Y. Qin, S. Hu, Y. Lin, W. Chen, N. Ding, G. Cui, Z. Zeng, Y. Huang, C. Xiao, C. Han, and Y. R. Fung, Tool learning with foundation models, arXiv preprint, 2023, DOI 10.48550/arXiv.2304.08354
  29. T. Schick, J. Dwivedi-Yu, R. Dessi, R. Raileanu, M. Lomeli, L. Zettlemoyer, N. Cancedda, and T. Scialom, Toolformer: Language models can teach themselves to use tools, arXiv preprint, 2023, DOI 10.48550/arXiv.2302.04761
  30. K. Ma, H. Cheng, X. Liu, E. Nyberg, and J. Gao, Open-domain question answering via chain of reasoning over heterogeneous knowledge, (Findings of the Association for Computational Linguistics: EMNLP), 2022, pp. 5360-5374.
  31. J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, BERT: pre-training of deep bidirectional transformers for language understanding, (Proc. NAACL-HLT, Minneapolis, MN, USA), 2019, pp. 4171-4186.
  32. T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. Amodei, Language models are few-shot learners, Adv. Neural Info. Process. Syst. 33 (2020), 1877-1901.
  33. G. Lample and A. Conneau, Cross-lingual language model pretraining, arXiv preprint, 2019, DOI 10.48550/arXiv.1901.07291
  34. T. Kwiatkowski, J. Palomaki, O. Redfield, M. Collins, A. Parikh, C. Alberti, D. Epstein, I. Polosukhin, J. Devlin, K. Lee, and K. Toutanova, Natural questions: a benchmark for question answering research, Trans. Assoc. Comput. Linguist. 7 (2019), 452-466.
  35. Z. Yang, P. Qi, S. Zhang, Y. Bengio, W. W. Cohen, R. Salakhutdinov, and C. D. Manning, HotpotQA: a dataset for diverse, explainable multi-hop question answering, (Proc. Conf. Empirical Methods in Natural Language Processing), 2018, pp. 2369-2380, DOI 10.18653/v1/D18-1259.