Deep Neural Network-based Text Segmentation Algorithm

Context:
- It can be trained on large datasets to recognize complex patterns in text for accurate segmentation.
- It can be designed as a Supervised Learning Algorithm, requiring labeled data for training.
- It can be supported by techniques such as Transfer Learning and Word Embeddings.
- ...
Example(s):
Counter-Example(s):
- a Rule-based Text Segmentation Algorithm, which relies on predefined rules rather than learning from data.
- a Frequency-based Text Clustering Algorithm, which groups text based on word frequencies and does not involve deep learning.
See: Natural Language Processing (NLP), Deep Learning, Text Analytics.

References

(Zhai et al., 2017) ⇒ Feifei Zhai, Saloni Potdar, Bing Xiang, and Bowen Zhou. (2017). “Neural Models for Sequence Chunking.” In: Proceedings of the AAAI Conference on Artificial Intelligence, 31(1).
- ABSTRACT: Many natural language understanding (NLU) tasks, such as shallow parsing (i.e., text chunking) and semantic slot filling, require the assignment of representative labels to the meaningful chunks in a sentence. Most of the current deep neural network (DNN) based methods consider these tasks as a sequence labeling problem, in which a word, rather than a chunk, is treated as the basic unit for labeling. These chunks are then inferred by the standard IOB (Inside-Outside-Beginning) labels. In this paper, we propose an alternative approach by investigating the use of DNN for sequence chunking, and propose three neural models so that each chunk can be treated as a complete unit for labeling. Experimental results show that the proposed neural sequence chunking models can achieve state-of-the-art performance on both the text chunking and slot filling tasks.
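
The abstract notes that, in the sequence-labeling formulation, chunks are inferred from per-word IOB labels. A minimal sketch of that inference step is given below. It is not taken from the paper: the function name `iob_to_chunks`, the example sentence, and its labels are illustrative assumptions, and the handling of a stray `I-` label (treated as the start of a new chunk) is one common lenient convention rather than a fixed standard.

```python
from typing import List, Tuple


def iob_to_chunks(labels: List[str]) -> List[Tuple[str, int, int]]:
    """Convert per-word IOB labels into (chunk_type, start, end) spans.

    `end` is exclusive. Hypothetical helper for illustration only.
    """
    chunks: List[Tuple[str, int, int]] = []
    start, ctype = None, None  # currently open chunk, if any
    for i, label in enumerate(labels):
        if label == "O":
            prefix, ltype = "O", None
        else:
            # "B-NP" -> prefix "B", type "NP"
            prefix, _, ltype = label.partition("-")

        # An I- label continues the open chunk only if the types match.
        continues = prefix == "I" and ctype is not None and ltype == ctype
        if ctype is not None and not continues:
            chunks.append((ctype, start, i))
            start, ctype = None, None

        # Open a chunk on B-, or on an I- with nothing to continue
        # (lenient handling of malformed sequences; an assumption).
        if prefix == "B" or (prefix == "I" and ctype is None):
            start, ctype = i, ltype

    if ctype is not None:
        chunks.append((ctype, start, len(labels)))
    return chunks


# Illustrative input, not drawn from any dataset.
tokens = ["The", "cat", "sat", "on", "the", "mat"]
labels = ["B-NP", "I-NP", "B-VP", "B-PP", "B-NP", "I-NP"]
for ctype, start, end in iob_to_chunks(labels):
    print(ctype, tokens[start:end])
```

In this formulation the word is the labeling unit and chunk boundaries are recovered afterwards; the models proposed by Zhai et al. instead treat each chunk as a complete unit for labeling.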