2009 NaturalLanguageProcessingwithPy


Subject Headings: NLTK Python Toolkit, Linguistic Resource, Text Classification, Information Extraction System.

Notes

Cited By

Quotes

Abstract

This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication.

Natural Language Processing with Python is packed with examples and exercises.

This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open-source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages, or if you're simply curious to have a programmer's perspective on how human language works, you'll find Natural Language Processing with Python both fascinating and immensely useful.
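
As a rough illustration of the kind of program the abstract describes, the sketch below counts word frequencies in a short piece of raw text. It is not taken from the book and deliberately uses only the Python standard library rather than NLTK; the function names `tokenize` and `word_frequencies`, the regular expression, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out word tokens with a simple regex.

    The pattern keeps an internal apostrophe so that "didn't" stays one token.
    It is a deliberately crude tokenizer, meant only for illustration.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def word_frequencies(text):
    """Return a Counter mapping each token to the number of times it occurs."""
    return Counter(tokenize(text))


# Hypothetical sample text, invented for this example.
sample = "The cat sat on the mat. The mat didn't mind."
freq = word_frequencies(sample)

# Show the two most frequent tokens.
print(freq.most_common(2))  # prints [('the', 3), ('mat', 2)]
```

A toolkit such as NLTK provides more careful tokenizers and frequency-distribution classes for the same task; the standard-library version here is only meant to show the overall shape of such a program.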

Chapter 1 Language Processing and Python

Computing with Language: Texts and Words

A Closer Look at Python: Texts as Lists of Words

Computing with Language: Simple Statistics

Back to Python: Making Decisions and Taking Control

Automatic Natural Language Understanding

Summary

Further Reading

Exercises

Chapter 2 Accessing Text Corpora and Lexical Resources

Accessing Text Corpora

Conditional Frequency Distributions

More Python: Reusing Code

Lexical Resources

WordNet

Summary

Further Reading

Exercises

Chapter 3 Processing Raw Text

Accessing Text from the Web and from Disk

Strings: Text Processing at the Lowest Level

Text Processing with Unicode

Regular Expressions for Detecting Word Patterns

Useful Applications of Regular Expressions

Normalizing Text

Regular Expressions for Tokenizing Text

Segmentation

Formatting: From Lists to Strings

Summary

Further Reading

Exercises

Chapter 4 Writing Structured Programs

Back to the Basics

Sequences

Questions of Style

Functions: The Foundation of Structured Programming

Doing More with Functions

Program Development

Algorithm Design

A Sample of Python Libraries

Summary

Further Reading

Exercises

Chapter 5 Categorizing and Tagging Words

Using a Tagger

Tagged Corpora

Mapping Words to Properties Using Python Dictionaries

Automatic Tagging

N-Gram Tagging

Transformation-Based Tagging

How to Determine the Category of a Word

Summary

Further Reading

Exercises

Chapter 6 Learning to Classify Text

Supervised Classification

Further Examples of Supervised Classification

Evaluation

Decision Trees

Naive Bayes Classifiers

Maximum Entropy Classifiers

Modeling Linguistic Patterns

Summary

Further Reading

Exercises

Chapter 7 Extracting Information from Text

Information Extraction

Chunking

Developing and Evaluating Chunkers

Recursion in Linguistic Structure

Named Entity Recognition

Relation Extraction

Summary

Further Reading

Exercises

Chapter 8 Analyzing Sentence Structure

Some Grammatical Dilemmas

What’s the Use of Syntax?

Context-Free Grammar

Parsing with Context-Free Grammar

Dependencies and Dependency Grammar

Grammar Development

Summary

Further Reading

Exercises

Chapter 9 Building Feature-Based Grammars

Grammatical Features

Processing Feature Structures

Extending a Feature-Based Grammar

Summary

Further Reading

Exercises

Chapter 10 Analyzing the Meaning of Sentences

Natural Language Understanding

Propositional Logic

First-Order Logic

The Semantics of English Sentences

Discourse Semantics

Summary

Further Reading

Exercises

Chapter 11 Managing Linguistic Data

Corpus Structure: A Case Study

The Life Cycle of a Corpus

Acquiring Data

Working with XML

Working with Toolbox Data

Describing Language Resources Using OLAC Metadata

Summary

Further Reading

Exercises

Appendix Afterword: The Language Challenge

Language Processing Versus Symbol Processing

Contemporary Philosophical Divides

NLTK Roadmap

References


2009 NaturalLanguageProcessingwithPy
Author: Ewan Klein, Steven Bird, Edward Loper
Title: Natural Language Processing with Python
Year: 2009