Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit

"O'Reilly Media, Inc.", Jun 12, 2009 - 504 pages

This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication.

Packed with examples and exercises, Natural Language Processing with Python will help you:

  • Extract information from unstructured text, either to guess the topic or identify "named entities"
  • Analyze linguistic structure in text, including parsing and semantic analysis
  • Access popular linguistic databases, including WordNet and treebanks
  • Integrate techniques drawn from fields as diverse as linguistics and artificial intelligence


This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful.
 

Contents

Chapter 1 Language Processing and Python
Chapter 2 Accessing Text Corpora and Lexical Resources
Chapter 3 Processing Raw Text
Chapter 4 Writing Structured Programs
Chapter 5 Categorizing and Tagging Words
Chapter 6 Learning to Classify Text
Chapter 7 Extracting Information from Text
Chapter 8 Analyzing Sentence Structure
Chapter 9 Building Feature-Based Grammars
Chapter 10 Analyzing the Meaning of Sentences
Chapter 11 Managing Linguistic Data
The Language Challenge
Bibliography
NLTK Index
General Index

About the author (2009)

Ewan Klein is Professor of Language Technology in the School of Informatics at the University of Edinburgh. He completed a PhD on formal semantics at the University of Cambridge in 1978. After some years working at the Universities of Sussex and Newcastle upon Tyne, Ewan took up a teaching position at Edinburgh. He was involved in the establishment of Edinburgh's Language Technology Group in 1993, and has been closely associated with it ever since. From 2000 to 2002, he took leave from the University to act as Research Manager for the Edinburgh-based Natural Language Research Group of Edify Corporation, Santa Clara, where he was responsible for spoken dialogue processing. Ewan is a past President of the European Chapter of the Association for Computational Linguistics and was a founding member and Coordinator of the European Network of Excellence in Human Language Technologies (ELSNET).

Edward Loper has recently completed a PhD on machine learning for natural language processing at the University of Pennsylvania. Edward was a student in Steven Bird's graduate course on computational linguistics in the fall of 2000, and went on to be a TA and share in the development of NLTK. In addition to NLTK, he has helped develop two packages for documenting and testing Python software, epydoc and doctest.
