LLM-based System User Preference Dataset
An LLM-based System User Preference Dataset is an annotated dataset that contains human preference records for LLM outputs.
- Context:
- It can (often) include Human Ratings on aspects like helpfulness, informativeness, or creativity of LLM outputs.
- It can be used to Train LLMs or to Fine-Tune LLMs.
- It can range from being Single-LLM Preference Data to being LLM Comparison Preference Data that compares outputs from different LLMs (see the record sketch after the See list below).
- It can contain click-through rate (CTR) data to assess user engagement with LLM outputs.
- It can serve as a benchmark for evaluating LLMs' understanding and replication of human preferences and behavior.
- …
- Example(s):
- the BeaverTails Dataset (Ji et al., 2023), which pairs question-answer outputs with human helpfulness and harmlessness preference annotations.
- …
- Counter-Example(s):
- ...
- See: Text Completion, Language Model Evaluation, Human-in-the-Loop, Dataset.
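The context items above describe what a single preference record can contain. The following is a minimal, hypothetical sketch of such a record in Python; all field names are illustrative and are not taken from any specific dataset.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PreferenceRecord:
    """One hypothetical record in an LLM user-preference dataset."""
    prompt: str                    # the input given to the LLM(s)
    responses: list[str]           # one output (single-LLM) or several (comparison)
    model_ids: list[str]           # which LLM produced each response
    ratings: dict[str, float] = field(default_factory=dict)  # e.g. {"helpfulness": 4.0}
    preferred_index: Optional[int] = None  # index of the human-preferred response, if compared
    click_through: Optional[bool] = None   # engagement signal (CTR), if logged

# Example: a pairwise comparison between two models' outputs.
record = PreferenceRecord(
    prompt="Explain photosynthesis to a 10-year-old.",
    responses=["Plants eat sunlight...", "Photosynthesis is a process..."],
    model_ids=["model-a", "model-b"],
    ratings={"helpfulness": 4.0},
    preferred_index=0,
)
```

Under this sketch, a single-LLM record carries one response and its ratings, while a comparison record carries several responses plus a preferred_index.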
References
2023
- (GBard, 2023)
- LLM Preference Data is a broad term that encompasses various kinds of data used to align the behavior of Large Language Models (LLMs) with human preferences. This data can be used for various purposes, such as:
- 1. Training LLMs: Preference data can be used to train LLMs to produce outputs that are more aligned with human preferences. This can be done through techniques like reinforcement learning, where the LLM is rewarded for generating outputs that are judged to be "good" by humans (see the reward-model sketch after this list).
- 2. Fine-tuning LLMs: After an LLM has been trained on a general dataset, it can be fine-tuned on a smaller dataset of preference data specific to a particular task or domain. This can help the LLM to better understand the nuances of human preferences in that context.
- 3. Evaluating LLMs: Preference data can be used to evaluate the performance of LLMs. This can involve comparing the outputs of different LLMs on the same set of prompts and seeing which one produces outputs that are judged to be more consistent with human preferences.
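A common way to implement the reinforcement-learning use in item 1 is to first fit a reward model on pairwise preference comparisons and then optimize the LLM against it. The following is a minimal sketch of the standard pairwise (Bradley-Terry) reward-model loss; the reward_model callable is a hypothetical placeholder that scores a prompt-response pair, and this is a generic technique rather than a method from the quoted source.

```python
import torch
import torch.nn.functional as F

def pairwise_preference_loss(reward_model, prompt, chosen, rejected):
    """Bradley-Terry style loss: push the reward of the human-preferred
    response above the reward of the rejected one.
    `reward_model` is a hypothetical callable returning a scalar tensor."""
    r_chosen = reward_model(prompt, chosen)      # scalar reward for the preferred output
    r_rejected = reward_model(prompt, rejected)  # scalar reward for the dispreferred output
    # -log sigmoid(r_chosen - r_rejected); minimized when chosen outscores rejected
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```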
- Here are some specific examples of what LLM Preference Data might include:
- Human ratings or judgments: This could be ratings of how helpful, informative, or creative a particular LLM output is.
- Comparisons between multiple LLM outputs: This could involve asking humans to choose which of two or more LLM outputs they prefer.
- Click-through rate (CTR) data: This data can be used to see which LLM outputs are more likely to be clicked on by users.
- Engagement data: This data can be used to see which LLM outputs are more likely to keep users engaged with a particular task or application.
- LLM Preference Data is a valuable resource for improving the performance and usability of LLMs. By incorporating this data into the training, fine-tuning, and evaluation process, researchers and developers can create LLMs that are more likely to meet the needs and expectations of users.
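For the evaluation use in item 3, a simple summary statistic is one model's win rate over another across the judged prompts. Below is a small sketch that reuses the hypothetical PreferenceRecord fields from the earlier example.

```python
def win_rate(records, model_a, model_b):
    """Fraction of pairwise comparisons in which humans preferred model_a over model_b."""
    wins, total = 0, 0
    for rec in records:
        if rec.preferred_index is None or set(rec.model_ids) != {model_a, model_b}:
            continue  # skip single-output records and other model pairs
        total += 1
        if rec.model_ids[rec.preferred_index] == model_a:
            wins += 1
    return wins / total if total else float("nan")
```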
2023
- (Ji et al., 2023) ⇒ J. Ji, M. Liu, J. Dai, X. Pan, C. Zhang, C. Bian, … (2023). “BeaverTails: Towards Improved Safety Alignment of LLM Via a Human-Preference Dataset.” In: arXiv preprint.
- NOTE: It describes the "Beavertails" project which focuses on improving the safety alignment of large language models (LLMs) using a human-preference dataset. The paper outlines a two-stage annotation process, involving over 70 crowdworkers to annotate the dataset with human preference data efficiently.
- ABSTRACT: In this paper, we introduce the BeaverTails dataset, aimed at fostering research on safety alignment in large language models (LLMs). This dataset uniquely separates annotations of helpfulness and harmlessness for question-answering pairs, thus offering distinct perspectives on these crucial attributes. In total, we have gathered safety meta-labels for 333,963 question-answer (QA) pairs and 361,903 pairs of expert comparison data for both the helpfulness and harmlessness metrics. We further showcase applications of BeaverTails in content moderation and reinforcement learning with human feedback (RLHF), emphasizing its potential for practical safety measures in LLMs. We believe this dataset provides vital resources for the community, contributing towards the safe development and deployment of LLMs. Our project page is available at the following URL: this https URL.
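Based only on the abstract's description (separate helpfulness and harmlessness annotations per QA pair, plus expert comparison pairs), a hypothetical in-memory representation might look like the following; the field names are illustrative and do not reflect the dataset's actual schema.

```python
from dataclasses import dataclass

@dataclass
class SafetyAnnotatedQA:
    """One QA pair with separate safety meta-labels, as described for BeaverTails."""
    question: str
    answer: str
    is_helpful: bool      # helpfulness annotation
    is_harmless: bool     # harmlessness (safety) annotation

@dataclass
class ExpertComparison:
    """One expert comparison between two answers to the same question."""
    question: str
    answer_a: str
    answer_b: str
    preferred_for_helpfulness: str   # "a" or "b"
    preferred_for_harmlessness: str  # "a" or "b"
```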