Morphological Analysis System

A Morphological Analysis System is a natural language processing system that implements a morphological analysis algorithm to solve a morphological analysis task.

AKA: Morpheme Detection System.
Context:
- It can range from being a Computer-Assisted Morphological Analysis System, to being a Supervised Morphological Analysis System, to being an Unsupervised Morphological Analysis System.
- It can range from being a Morphological Parsing System to being a Non-concatenative Morphological Analysis System.
- It can include the following NLP sub-systems:
- It can implement the following algorithms:
  - an Adaptor Grammar Algorithm,
  - a Gibbs Sampling Algorithm,
  - a Suffix Stripping Algorithm,
  - a Zipfian Sparsity Algorithm.
Example(s):
Counter-Example(s):
See: Natural Language Syntactic Analysis Task, Morphological Tag, Morphological Inflection, Morphological Derivation, Part-of-Speech Tagging System, Word Sense Disambiguation, Minimum Description Length, Zipfian Sparsity, Gibbs Sampling, Non-concatenative Morphology, Allomorphy, Morphophonology, Recurrent Neural Network Language Model.

References

2017

(Goldsmith et al., 2017) ⇒ John A. Goldsmith, Jackson L. Lee, and Aris Xanthos. (2017). “Computational Learning of Morphology.” In: Annual Review of Linguistics Journal, 3. doi:10.1146/annurev-linguistics-011516-034017
- QUOTE: Most of the more successful work is based fundamentally on the metaphorical understanding that grammar learning consists of a search through grammar space, typically one small step at a time. That is, we can imagine the specification of a grammar as locating it as a point in a space of very high dimensionality, and the task of finding the correct grammar is conceived of as one of traveling through that space. Methods differ as to where in grammar space the search should start: some assume that we start in a random location, while other methods allow one to start at a grammar that is reasonably close to the final solution. In this section we will briefly describe three approaches that have been used in this literature, Minimum Description Length (MDL) analysis, Gibbs sampling, and adaptor grammars.
  All of these approaches have been developed in the context of probabilistic models, and involve different aspects of a search algorithm through the space of possible grammars (here, morphologies) to find one or more grammars that score high on a test based on probability. Probability assigned to training data is used as a way to quantify the notion of “goodness of fit”, in the sense that the higher the probability is that a grammar assigns to a set of data, the better the goodness of fit. The three approaches are not, strictly speaking, alternatives; one could adopt any subset of the three in implementing a system.
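
The idea of scoring candidate morphologies by how well they fit the data can be illustrated with a toy two-part description length. The sketch below is not the formulation used by Goldsmith et al.; it assumes a hypothetical cost model in which each distinct stem and suffix is spelled out at a fixed number of bits per character (an alphabet of 26 letters plus an end marker), and each token is encoded with independent maximum-likelihood stem and suffix probabilities. The word list, the suffix inventory, and all function names are invented for illustration.

```python
import math
from collections import Counter

def description_length(analyses, alphabet_size=27):
    """Total bits (model + data) for a list of (stem, suffix) token analyses."""
    stems = Counter(stem for stem, _ in analyses)
    suffixes = Counter(suffix for _, suffix in analyses)
    bits_per_char = math.log2(alphabet_size)
    # Model cost: spell out every distinct stem and suffix, plus one end marker each.
    model_bits = sum((len(m) + 1) * bits_per_char
                     for m in list(stems) + list(suffixes))
    # Data cost: negative log2 probability of each token under
    # independent maximum-likelihood stem and suffix distributions.
    n = len(analyses)
    data_bits = -sum(math.log2(stems[s] / n) + math.log2(suffixes[x] / n)
                     for s, x in analyses)
    return model_bits + data_bits

# Hypothetical toy corpus and suffix inventory (assumptions, not from the source).
WORDS = ["walk", "walks", "walked", "walking",
         "jump", "jumps", "jumped", "jumping"]
SUFFIXES = ("ing", "ed", "s")

def split(word):
    """Split off the first matching suffix, or return the word with a null suffix."""
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            return word[: -len(suffix)], suffix
    return word, ""

unsegmented = [(w, "") for w in WORDS]   # grammar 1: every word is its own stem
segmented = [split(w) for w in WORDS]    # grammar 2: stem + suffix

print(round(description_length(unsegmented), 1))
print(round(description_length(segmented), 1))
```

Under these assumptions the segmented grammar receives the shorter description, because two stems and four suffixes are cheaper to list than eight whole words. A search procedure of the kind described in the quote would use such a score to decide whether a small step in grammar space is an improvement.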

2008a

(Gasser, 2008) ⇒ Michael Gasser. (2008). “Morphological Analysis and Generation in Computer-Assisted Teaching of Indigenous Languages.” School of Informatics, Indiana University.
- QUOTE: Morphological analysis:
  - Converts a surface form to a lexical grammatical form;
  - A surface form is analyzed into its constituent morphemes:
    - kinawilo → k-in-aw-il-o
  - A surface form is analyzed into a representation of its grammatical features:
    - kinawilo →
      [root=‘il’,
      abs=[prs=1,num=sing],
      erg=[prs=2,num=sing,-form],
      tam=incmpl]
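
The two outputs described above (a morpheme segmentation and a feature structure) can be sketched with a fixed slot template. This is a minimal illustration and not Gasser's system: the slot order, the pairing of each morpheme with a feature, and the names `SLOTS` and `segment_and_gloss` are assumptions read off the single quoted example. The final `o` is kept as an unglossed suffix, and the `-form` feature from the quote is omitted.

```python
# Hypothetical slot inventories, inferred from the one quoted example.
SLOTS = [
    ("tam", {"k": "incmpl"}),
    ("abs", {"in": {"prs": 1, "num": "sing"}}),
    ("erg", {"aw": {"prs": 2, "num": "sing"}}),
    ("root", {"il": "il"}),
    ("suffix", {"o": "o"}),  # left unglossed: the quote does not label it
]

def segment_and_gloss(surface):
    """Return (morphemes, features), or None if the form does not fit the template."""
    morphemes, features, rest = [], {}, surface
    for slot, inventory in SLOTS:
        # Greedy match, longest candidate first; no backtracking.
        candidates = sorted(inventory, key=len, reverse=True)
        match = next((m for m in candidates if rest.startswith(m)), None)
        if match is None:
            return None
        morphemes.append(match)
        features[slot] = inventory[match]
        rest = rest[len(match):]
    # Reject forms with unanalyzed material left over.
    return (morphemes, features) if not rest else None

result = segment_and_gloss("kinawilo")
if result is not None:
    morphemes, features = result
    print("-".join(morphemes))
    print(features)
```

A real analyzer would need a full lexicon, optional slots, and a way to handle ambiguity; finite-state transducers are a common choice for that, but nothing in this sketch depends on one.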

2008b

(Saranya, 2008) ⇒ S. K. Saranya. (2008). “Morphological Analyzer for Malayalam Verbs.” In: M. Tech Thesis, Amrita School of Engineering, Coimbatore.
- QUOTE: Morphological Analysis: Individual words are analyzed into their components and nonword tokens such as punctuation are separated from the words(...)
  Suppose we have an English interface to an operating system and the following sentence is typed: I want to print Bill’s .init file. Morphological analysis must do the following things:
  - Pull apart the word “Bill’s” into proper noun “Bill” and the possessive suffix “’s”.
  - Recognize the sequence “.init” as a file extension that is functioning as an adjective in the sentence.

  This process will usually assign syntactic categories to all the words in the sentence. Consider the word “prints”. This word is either a plural noun or a third person singular verb (he prints)(...)

  Morphological analyzer and morphological generator are two essential and basic tools for building any language processing application. Morphological Analysis is the process of providing grammatical information of a word given its suffix. Morphological analyzer is a computer program which takes a word as input and produces its grammatical structure as output. A morphological analyzer will return its root/stem word along with its grammatical information depending upon its word category. For nouns it will provide gender, number, and case information and for verbs, it will be tense, aspects, and modularity(...)
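
The behaviour described in this quote (a word goes in, a stem plus grammatical information comes out, possibly with more than one reading) can be sketched with a small suffix-stripping routine. This is an illustrative sketch only, not Saranya's analyzer: the rule table, the feature names, and the function `strip_suffixes` are hypothetical, and there is no lexicon check, so any word ending in "s" receives both readings.

```python
# Hypothetical suffix rules: (suffix, category, grammatical features).
RULES = [
    ("'s", "noun", {"case": "possessive"}),
    ("s", "noun", {"number": "plural"}),
    ("s", "verb", {"person": 3, "number": "singular", "tense": "present"}),
]

def strip_suffixes(word):
    """Return every (stem, category, features) reading licensed by RULES."""
    word = word.replace("\u2019", "'")  # treat a curly apostrophe like a straight one
    readings = []
    for suffix, category, features in RULES:
        if not word.endswith(suffix) or len(word) <= len(suffix):
            continue
        stem = word[: -len(suffix)]
        # Require a purely alphabetic stem so "Bill's" is not also split as "Bill'" + "s".
        if stem.isalpha():
            readings.append((stem, category, features))
    return readings

for token in ["Bill's", "prints"]:
    print(token, strip_suffixes(token))
```

For "prints" the routine returns both a plural-noun and a third-person-singular-verb reading, which mirrors the ambiguity noted in the quote; choosing between them is left to a later stage such as part-of-speech tagging.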

  Various NLP research groups have developed different methods and algorithm for morphological analysis. Some of the algorithms are language dependent and some of them are language independent. A brief survey of various methods involved in Morphological Analysis includes the following:

2004

(Diab et al., 2004) ⇒ Mona Diab, Kadri Hacioglu, and Daniel Jurafsky. (2004). “Automatic Tagging of Arabic Text: From Raw Text to Base Phrase Chunks.” In: Proceedings of HLT-NAACL 2004: Short Papers. ISBN:1-932432-24-8
- QUOTE: Morphological analysis may be characterized as the process of segmenting a surface word form into its component derivational and inflectional morphemes.

2001

(Goldsmith, 2001) ⇒ John Goldsmith. (2001). “Unsupervised Learning of the Morphology of a Natural Language.” In: Computational Linguistics Journal, 27(2). doi:10.1162/089120101750300490