2009 CitationsInTheDLofClassis

From GM-RKB

Subject Headings: Canonical Bibliographic Citation, Ancient Written Artifact.

Notes

Quotes

Abstract

1 Introduction

  • In the field of Classics, canonical references are the traditional way established by scholars to cite primary sources within secondary sources. By primary sources we mean essentially the ancient texts that are the specific research object of Philology, whereas by secondary sources we indicate all the modern publications containing scholarly interpretations about those ancient texts. This specific characteristic strongly differentiates canonical references from the typical references we usually find within research papers.
  • Canonical references are used to refer concisely to the research object itself (in this case ancient texts) rather than to the existing literature about a certain topic, as happens with references to other secondary sources. Given this distinction, canonical references assume a role of primary importance as the main entry point to the information contained in scholarly digital libraries of Classics. To find a parallel with other research fields, the role played by those references is somewhat analogous to that played by protein names in the medical literature or by notations of chemical compounds in the field of Chemistry. As was recently shown by Doms and Schroeder (2005), protein names can be used to semantically index documents and thus to enhance information retrieval from a digital library of texts, provided that they are properly organized by using an ontology or a controlled vocabulary.
  • Moreover, by analyzing and indexing such references as if they were backlinks (Lester, 2007) from a secondary to a primary source, it is possible to provide quantitative data about the impact of an ancient author for research in a particular disciplinary field, or in relation to a limited corpus of texts (e.g., the papers published by scholarly journals in a given time interval).
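The backlink idea above can be illustrated with a minimal sketch. It assumes the canonical references have already been extracted; the document identifiers, the sample data, and the function names are hypothetical and serve only to show how per-author counts could be derived.

```python
from collections import Counter

# Hypothetical, already-extracted references:
# (identifier of the citing secondary source, cited ancient author)
extracted = [
    ("paper-1", "Homer"),
    ("paper-1", "Homer"),
    ("paper-1", "Aeschylus"),
    ("paper-2", "Homer"),
    ("paper-3", "Aeschylus"),
]

def citation_counts(refs):
    """Total number of references pointing to each ancient author."""
    return Counter(author for _, author in refs)

def citing_document_counts(refs):
    """Number of distinct secondary sources that cite each author."""
    # set() collapses repeated references from the same source
    return Counter(author for _, author in set(refs))

totals = citation_counts(extracted)
spread = citing_document_counts(extracted)
```

Restricting `extracted` to the papers of a given journal or time interval would yield counts for a limited corpus, as the passage suggests.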

3 Canonical Text References

  • Canonical references present unique characteristics when compared to bibliographic references to modern publications. First of all, they do not refer to physical features of the cited work (such as publication date or page number), but rather to its logical and hierarchical structure. In addition, canonical references often provide additional information needed by the reader to resolve the reference. For example, “Archestr. fr. 30.1 Olson-Sens” means line 1 of fragment 30 of the comic poet Archestratus in the edition published by S. D. Olson and A. Sens in 1994.
  • The specification of the edition according to which a source is cited is an important piece of information to be considered. Indeed, since the aim of Philology is to reconstruct for ancient works a text that is as close as possible to the original one (given that the original text may have been corrupted over centuries of manuscript tradition), editors and scholars often disagree substantially as to what readings and conjectures have to be included in the established text.
  • Although some well-established sets of abbreviations exist, scholars’ practice of citing primary sources may differ noticeably according to style preferences and the typographical needs of publishers, journals or research groups. Aeschylus’ name might appear in the abridged forms “A.”, “Aesch.” or “Aeschyl.”, and similarly a collection of fragments like Jacoby’s Die Fragmente der Griechischen Historiker may be abbreviated either as FrGrHist or FGrHist.
  • Moreover, some highly specialized branches of research exist within the field of Classics, such as those dedicated to Epic poetry or Tragedy, or even to a single author like Aeschylus or Homer. In those specialized branches, a common tendency can be observed to use shorter references with a higher semantic density for the most cited authors. For example, in publications containing thousands of references to Homer’s Iliad and Odyssey, references to these texts are often expressed with Greek letters indicating the book number along with the verse number (e.g., “α 1” stands for the first verse of the first book of Homer’s Odyssey). Lowercase letters are used to refer to books of the Odyssey, whereas uppercase letters refer to the books of the Iliad, according to a practice developed in the IV century B.C. by scholars of the library at Alexandria.
  • In the actual practice of scholarly writing, canonical references can appear in slightly different forms according to the needs of the narrative. Along with complete canonical references to a single text passage, expressed as either a single value or a range of values, other references can often be found that lack one or more of the components normally present in canonical references, such as the author name, the work title or the editor name (e.g., “Hom. Od. 9.1, 9.2-3; Il 1.100”). This happens particularly in subsequent references to passages of the same work.
  • These differences in the appearance of canonical references require us to apply a different processing strategy to each case. We focus on the task of automatically identifying complete references to primary sources. Once those references have been identified in the input document, we can find other anaphoric references by applying some scope-based parsing. Indeed, a canonical reference in the text constitutes the reference scope for subsequent text passage indications referring to the same work.
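The scope-based resolution of incomplete references described above can be sketched as follows. This is only an illustration under stated assumptions: the lookup tables `AUTHORS` and `WORKS`, the `Ref` type, and the `resolve` function are hypothetical, and the simple token matching used here stands in for the automatic identification step, which the paper's title indicates is done with conditional random fields.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Illustrative abbreviation tables (assumed; real lists vary by publisher)
AUTHORS = {"Hom.": "Homer", "Aesch.": "Aeschylus", "Aeschyl.": "Aeschylus"}
WORKS = {"Od.": "Odyssey", "Il.": "Iliad", "Il": "Iliad"}

# A passage is a dotted number, optionally followed by a range end: 9.1, 9.2-3
PASSAGE = re.compile(r"\d+(\.\d+)*(-\d+(\.\d+)*)?")

@dataclass
class Ref:
    author: Optional[str]
    work: Optional[str]
    passage: str

def resolve(text: str) -> List[Ref]:
    """Fill in missing author/work components from the current scope."""
    refs: List[Ref] = []
    author: Optional[str] = None  # current reference scope
    work: Optional[str] = None
    for token in re.split(r"[;,\s]+", text.strip()):
        if not token:
            continue
        if token in AUTHORS:
            author = AUTHORS[token]
            work = None  # a new author opens a new scope
        elif token in WORKS:
            work = WORKS[token]
        elif PASSAGE.fullmatch(token):
            # Bare passage: inherit author and work from the scope
            refs.append(Ref(author, work, token))
    return refs

resolved = resolve("Hom. Od. 9.1, 9.2-3; Il 1.100")
```

Applied to the example string from the text, the bare passage “9.2-3” inherits Homer and the Odyssey from the preceding complete reference, while “Il 1.100” keeps the author but switches the work to the Iliad.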


Key: 2009 CitationsInTheDLofClassis
Author: Matteo Romanello, Federico Boschetti, Gregory Crane
Title: Citations in the Digital Library of Classics: extracting canonical references by using conditional random fields
Year: 2009