2004 LookingforaFewGoodMetricsAutoma

(Lin, 2004) ⇒ Chin-Yew Lin. (2004). "Looking for a Few Good Metrics: Automatic Summarization Evaluation - How Many Samples Are Enough?". In: Proceedings of the Fourth NTCIR Workshop on Research in Information Access Technologies Information Retrieval, Question Answering and Summarization (NTCIR 2004).

Subject Headings: Recall-Oriented Understudy For Gisting Evaluation (ROUGE) Metrics; ROUGE-N; ROUGE-L; ROUGE-W; ROUGE-S; ROUGE-SU; ROUGE Summarization Evaluation Software Package; Document Understanding Conference (DUC); Text Summarization Task.

Notes

Other Link(s):
- DBLP: https://dblp.org/rec/html/conf/ntcir/Lin04
- Microsoft: https://www.microsoft.com/en-us/research/publication/looking-for-a-few-good-metrics-automatic-summarization-evaluation-how-many-samples-are-enough/

Cited By

Google Scholar: ~ 100 Citations. Retrieved: 2020-06-13.
Semantic Scholar: ~ 81 Citations. Retrieved: 2020-06-13.
MS Academic: ~ 111 Citations. Retrieved: 2020-06-13.

Quotes

Author Keywords

Summarization; automatic evaluation; Document Understanding Conference; DUC; ROUGE.

Abstract

ROUGE stands for Recall-Oriented Understudy for Gisting Evaluation. It includes measures to automatically determine the quality of a summary by comparing it to other (ideal) summaries created by humans. The measures count the number of overlapping units such as n-gram, word sequences, and word pairs between the computer-generated summary to be evaluated and the ideal summaries created by humans. This paper discusses the validity of the evaluation method used in the Document Understanding Conference (DUC) and evaluates five different ROUGE metrics: ROUGE-N, ROUGE-L, ROUGE-W, ROUGE-S, and ROUGE-SU included in the ROUGE summarization evaluation package using data provided by DUC. A comprehensive study of the effects of using single or multiple references and various sample sizes on the stability of the results is also presented.
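
The overlap counting described in the abstract can be illustrated with a minimal sketch of an n-gram recall score in the spirit of ROUGE-N. This is not the ROUGE package itself: the function names `ngrams` and `rouge_n_recall` are hypothetical, tokenization is assumed to be lowercased whitespace splitting, and multiple references are assumed to be pooled by summing clipped matches and reference n-gram counts.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) found in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n=1):
    """Recall-oriented n-gram overlap of one candidate against reference summaries.

    Assumption: whitespace tokenization and lowercasing; no stemming or
    stopword removal, which a full evaluation package may apply.
    """
    cand_counts = ngrams(candidate.lower().split(), n)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        # Clip each match by the candidate's count so a repeated n-gram
        # is not credited more often than the candidate contains it.
        matched += sum(min(count, cand_counts[gram])
                       for gram, count in ref_counts.items())
        total += sum(ref_counts.values())
    # With no reference n-grams at all, report 0.0 instead of dividing by zero.
    return matched / total if total else 0.0


if __name__ == "__main__":
    candidate = "the cat sat on the mat"
    references = ["the cat was on the mat"]
    print(rouge_n_recall(candidate, references, n=1))
    print(rouge_n_recall(candidate, references, n=2))
```

Because the denominator counts reference n-grams only, the score rewards covering what the human summaries say rather than penalizing extra candidate text, which is the sense in which the measure is recall-oriented.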

References

BibTeX

@inproceedings{2004_LookingforaFewGoodMetricsAutom,
  author    = {Chin-Yew Lin},
  editor    = {Noriko Kando and
               Haruko Ishikawa},
  title     = {Looking for a Few Good Metrics: Automatic Summarization Evaluation
               - How Many Samples Are Enough?},
  booktitle = {Proceedings of the Fourth NTCIR Workshop on Research in Information
               Access Technologies Information Retrieval, Question Answering and
               Summarization (NTCIR 2004) National Center of Sciences, Tokyo, Japan,
               June 2-4, 2004},
  publisher = {National Institute of Informatics (NII)},
  year      = {2004},
  url       = {http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings4/OPEN/NTCIR4-OPEN-LinCY.pdf},
}

	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2004 LookingforaFewGoodMetricsAutoma	Chin-Yew Lin			Looking for a Few Good Metrics: Automatic Summarization Evaluation - How Many Samples Are Enough?						2004