- Author: David Mertz
- Format: plain text
- Price: free (print edition available on Amazon)
Text Processing in Python describes techniques for manipulation of text using the Python programming language. At the broadest level, text processing is simply taking textual information and doing something with it. This might be restructuring or reformatting it, extracting smaller bits of information from it, or performing calculations that depend on the text. Text processing is arguably what most programmers spend most of their time doing.
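As a rough illustration of those three kinds of task (restructuring, extracting, and calculating), here is a minimal sketch using only the Python standard library. It is not taken from the book; the sample log text and the function names are invented for this example.

```python
import re

# Hypothetical sample data, invented for this illustration.
LOG = """\
2024-01-05 alice  uploaded 3 files
2024-01-05 bob    uploaded 12 files
2024-01-06 alice  uploaded 7 files
"""

def reformat(text):
    """Restructure: turn each run of spaces into a single tab."""
    return "\n".join(re.sub(r" +", "\t", line) for line in text.splitlines())

def extract_users(text):
    """Extract: take the user name (second field) from each line."""
    return [line.split()[1] for line in text.splitlines()]

def total_files(text):
    """Calculate: sum the numbers that precede the word 'files'."""
    return sum(int(n) for n in re.findall(r"(\d+) files", text))

if __name__ == "__main__":
    print(reformat(LOG))
    print(extract_users(LOG))
    print(total_files(LOG))
```

Each function is only a few lines long, which is part of the case the description makes for Python as a text processing language.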
Because Python is clear, expressive, and object-oriented, it is an excellent language for text processing, arguably even better than Perl. As the amount of data everywhere continues to grow, processing it becomes more and more of a challenge for programmers.
This book is not a tutorial on Python. Instead, it has two goals: helping the programmer get the job done pragmatically and efficiently, and helping the reader understand, both theoretically and conceptually, why some approaches work and others do not. Mertz provides practical pointers and tips that emphasize efficient, flexible, and maintainable approaches to the text processing tasks that working programmers face daily.
Chapters and appendices:
- Python Basics
- Basic String Operations
- Regular Expressions
- Parsers and State-Machines
- Internet Tools and Techniques
- A Selective and Impressionistic Short Review of Python
- A Data Compression Primer
- Understanding Unicode
- A State-Machine for Adding Markup to Text