2024 Training Language Models to Generate Text with Citations via Fine-grained Rewards
- (Huang et al., 2024) ⇒ Chengyu Huang, Zeqiu Wu, Yushi Hu, and Wenya Wang. (2024). “Training Language Models to Generate Text with Citations via Fine-grained Rewards.” doi:10.48550/arXiv.2402.04315
Subject Headings: Fine-Grained Rewarding, LLM Hallucination.
Notes
- It introduces a novel training framework to enhance the accuracy and credibility of LLMs in generating text with in-text citations.
- It addresses hallucination and the lack of credibility in LLM-generated text by grounding responses in citations to external documents.
- It employs fine-grained rewards during training that score answer correctness as well as the support and relevance of each citation.
- It shows through experiments that this framework surpasses conventional training methods in producing correct answers with supportive, relevant citations.
- It combines rejection sampling and reinforcement learning, with the fine-grained rewards supplying the training signal for both (see the sketch after this list).
- It demonstrates the method's effectiveness on QA datasets from the ALCE benchmark and validates its generalizability on EXPERTQA.
- It reports that a fine-tuned LLaMA-2-7B outperforms the baselines, including GPT-3.5-turbo, marking progress toward more reliable and credible LLMs for text generation.
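
The sketch below is a minimal, hypothetical illustration of how such fine-grained reward signals could be combined and used for best-of-n rejection sampling; the scoring functions, names, and weights are assumptions for illustration, not the paper's exact formulation.

```python
# A minimal sketch (assumed names, weights, and scoring functions; not the
# paper's exact formulation) of combining fine-grained reward signals --
# answer correctness, citation support, citation relevance -- and using them
# for best-of-n rejection sampling over candidate generations.
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Candidate:
    answer: str                 # generated answer text containing in-text citations
    cited_passages: List[str]   # passages referenced by those citations


def fine_grained_reward(
    candidate: Candidate,
    correctness_fn: Callable[[str], float],      # e.g. match against a gold answer
    support_fn: Callable[[str, str], float],     # does the cited passage support the answer?
    relevance_fn: Callable[[str, str], float],   # is the cited passage relevant to the answer?
    weights: Sequence[float] = (1.0, 1.0, 1.0),  # assumed equal weighting
) -> float:
    """Combine per-aspect scores into a single scalar reward."""
    correctness = correctness_fn(candidate.answer)
    if candidate.cited_passages:
        n = len(candidate.cited_passages)
        support = sum(support_fn(candidate.answer, p) for p in candidate.cited_passages) / n
        relevance = sum(relevance_fn(candidate.answer, p) for p in candidate.cited_passages) / n
    else:
        support = relevance = 0.0  # answers without citations earn no citation credit
    w_c, w_s, w_r = weights
    return w_c * correctness + w_s * support + w_r * relevance


def rejection_sample(
    candidates: List[Candidate],
    reward_fn: Callable[[Candidate], float],
) -> Candidate:
    """Keep the highest-reward sample; kept samples can then be used for
    supervised fine-tuning, and the same rewards reused as an RL signal."""
    return max(candidates, key=reward_fn)
```

In the paper's setup the reward signals come from automatic evaluators of answer correctness and citation quality; here they are abstracted as callables so the combination and selection logic stays self-contained.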
Cited By
Quotes
Abstract
While recent Large Language Models (LLMs) have proven useful in answering user queries, they are prone to hallucination, and their responses often lack credibility due to missing references to reliable sources. An intuitive solution to these issues would be to include in-text citations referring to external documents as evidence. While previous works have directly prompted LLMs to generate in-text citations, their performances are far from satisfactory, especially when it comes to smaller LLMs. In this work, we propose an effective training framework using fine-grained rewards to teach LLMs to generate highly supportive and relevant citations, while ensuring the correctness of their responses. We also conduct a systematic analysis of applying these fine-grained rewards to common LLM training strategies, demonstrating its advantage over conventional practices. We conduct extensive experiments on Question Answering (QA) datasets taken from the ALCE benchmark and validate the model's generalizability using EXPERTQA. On LLaMA-2-7B, the incorporation of fine-grained rewards achieves the best performance among the baselines, even surpassing that of GPT-3.5-turbo.
References
 | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year
---|---|---|---|---|---|---|---|---|---|---
2024 TrainingLanguageModelstoGenerat | Chengyu Huang, Zeqiu Wu, Yushi Hu, Wenya Wang | | 2024 | Training Language Models to Generate Text with Citations via Fine-grained Rewards | | | | 10.48550/arXiv.2402.04315 | | 2024