
PyCon2006/Tutorials/TextProcessing

Intended Audience

Beginning to intermediate programmers. A basic working knowledge of Python is assumed.

Summary

This tutorial introduces beginning to intermediate programmers to Python's tools & techniques for text and data processing. Topics include regular expressions, filtering data with generators, and parsing.
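
As a rough illustration of how two of these topics combine, the sketch below filters log lines with a generator and a regular expression. The log format, the name `error_lines`, and the sample data are assumptions made up for this example; they are not taken from the tutorial materials.

```python
import re

# Assumed (hypothetical) log format: "<LEVEL> <message>"
LEVEL_RE = re.compile(r"^(?P<level>[A-Z]+)\s+(?P<message>.*)$")

def error_lines(lines):
    """Lazily yield the message part of each ERROR line."""
    for line in lines:
        match = LEVEL_RE.match(line.rstrip("\n"))
        # Lines that do not fit the assumed format are skipped.
        if match and match.group("level") == "ERROR":
            yield match.group("message")

# Any iterable of lines works here, including an open file object.
sample = ["INFO started\n", "ERROR disk full\n", "garbage\n", "ERROR timeout\n"]
errors = list(error_lines(sample))
```

Because `error_lines` is a generator, it handles one line at a time, so the same function could be pointed at a large file without loading it into memory.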

Outline

  • Common data sources needing processing:
    • log files
    • CSV
    • tabular data
    • email
    • XML
  • Tools & techniques:
    • lists & dictionaries
    • s.join(list) instead of accumulating
    • for line in file
    • filters, large data sources: generators
    • decorate-sort-undecorate
    • StringIO
  • Regular expressions:
    • pattern matching
    • filtering
    • substitution
    • splitting
  • Parsing:
    • text.split()
    • text.find()
    • regular expressions
    • "real" parsers (including XML)
    • state machines
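
A few of the outline items can be sketched in a handful of lines. This is only an illustrative sample using the standard library; the variable names and sample data are invented for the example and do not come from the tutorial itself.

```python
import csv
import re
from io import StringIO

# s.join(list): collect the pieces in a list, then join once at the end
# instead of growing a string with repeated concatenation.
words = ["text", "processing", "in", "Python"]
sentence = " ".join(words)

# decorate-sort-undecorate: pair each item with its sort key,
# sort the pairs, then strip the keys off again.
topics = ["parsing", "csv", "email"]
decorated = [(len(topic), topic) for topic in topics]
decorated.sort()
by_length = [topic for _, topic in decorated]

# StringIO: treat an in-memory string as a file, here as input to csv.reader.
data = StringIO("name,year\nPyCon,2006\n")
rows = list(csv.reader(data))

# Regular expressions: substitution and splitting.
collapsed = re.sub(r"\s+", " ", "too   many\tspaces")
fields = re.split(r"\s*[,;]\s*", "a, b;c ,d")
```

In current Python, `sorted(topics, key=len)` is the usual shorthand for the decorate-sort-undecorate step, though the two differ on ties: the explicit version falls back to comparing the items themselves, while `key=len` keeps the original order.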

Please send feedback & ideas for further specific topics to the trainer, David Goodger (email, home page).


2026-02-14 16:12







