Python 2.6 Text Processing Beginner's Guide
Categorizing types of text data
Ensuring you have Python installed
Time for action – implementing a ROT13 encoder
Time for action – processing as a filter
Time for action – skipping over markup tags
Supporting third-party modules
Time for action – installing SetupTools
Time for action – configuring a virtual environment
Time for action – generating transfer statistics
Time for action – introducing a new log format
Time for action – accessing files directly
Time for action – handling compressed files
Time for action – spell-checking HTML content
Time for action – spell-checking live HTML pages
Time for action – handling urllib2 errors
Understanding the basics of string objects
Time for action – employee management
Time for action – customizing log processor output
Time for action – adding status code data
Time for action – displaying warnings on malformed lines
Time for action – simple manipulation with string methods
Text Processing Using the Standard Library
Time for action – processing Excel formats
Time for action – CSV and formulas
Time for action – processing custom CSV formats
Time for action – creating a spreadsheet of UNIX users
Modifying application configuration files
Time for action – adding basic configuration read support
Time for action – relying on configuration value interpolation
Time for action – configuration defaults
Time for action – generating a configuration file
Time for action – creating an egg-based package
Time for action – writing JSON data
Time for action – testing an HTTP URL
Time for action – regular expression grouping
Implementing Python-specific elements
Time for action – reading DNS records
Time for action – event-driven processing
Time for action – driving incremental processing
Time for action – creating a dungeon adventure game
Time for action – updating our game to use DOM processing
Time for action – using XPath in our adventure
Time for action – displaying links in an HTML page
Time for action – installing Mako
Time for action – loading a simple Mako template
Time for action – reformatting the date with Python code
Time for action – defining Mako def tags
Time for action – converting mail message to use namespaces
Inheriting from base templates
Time for action – updating base template
Time for action – adding another inheritance layer
Time for action – creating custom Mako tags
Overviewing alternative approaches
Understanding Encodings and i18n
Understanding basic character encodings
Time for action – manually decoding
Time for action – copying Unicode data
Time for action – fixing our copy application
Time for action – changing encodings
Internationalization and Localization
Time for action – preparing for multiple languages
Time for action – providing translations
Dealing with PDF files using PLATYPUS
Time for action – installing ReportLab
Time for action – writing PDF with basic layout and style
Time for action – installing xlwt
Time for action – generating XLS data
Working with OpenDocument files
Time for action – installing ODFPy
Time for action – generating ODT data
Time for action – installing PyParsing
Time for action – implementing a calculator
Time for action – handling type translations
Time for action – suppressing portions of a match
Processing data using the Natural Language Toolkit
Time for action – installing NLTK
Understanding search complexity
Time for action – implementing a linear search
Time for action – installing Nucular
Time for action – full text indexing
Time for action – measuring index benefit
Time for action – field-qualified indexes
Time for action – performing advanced Nucular queries
Indexing and searching other data
Time for action – indexing Open Office documents
Looking for Additional Resources