Source From Here
Preface
This module defines a class HTMLParser which serves as the basis for parsing text files formatted in HTML (HyperText Mark-up Language) and XHTML. Unlike the parser in htmllib, this parser is not based on the SGML parser in sgmllib.
Notes.
- class HTMLParser.HTMLParser
An exception is defined as well:
- exception HTMLParser.HTMLParseError
Example HTML Parser Application
As a basic example, below is a simple HTML parser that uses the HTMLParser class to print out start tags, end tags and data as they are encountered:
    from HTMLParser import HTMLParser

    # create a subclass and override the handler methods
    class MyHTMLParser(HTMLParser):
        def handle_starttag(self, tag, attrs):
            print "Encountered a start tag:", tag
        def handle_endtag(self, tag):
            print "Encountered an end tag :", tag
        def handle_data(self, data):
            print "Encountered some data  :", data

    # instantiate the parser and feed it some HTML
    parser = MyHTMLParser()
    parser.feed('<html><head><title>Test</title></head>'
                '<body><h1>Parse me!</h1></body></html>')

The output will then be:

    Encountered a start tag: html
    Encountered a start tag: head
    Encountered a start tag: title
    Encountered some data  : Test
    Encountered an end tag : title
    Encountered an end tag : head
    Encountered a start tag: body
    Encountered a start tag: h1
    Encountered some data  : Parse me!
    Encountered an end tag : h1
    Encountered an end tag : body
    Encountered an end tag : html
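The handlers in the basic example ignore the attrs argument passed to handle_starttag. As a rough sketch (not part of the original example), the subclass below records each event in a list instead of printing it, and keeps the attributes, which arrive as a list of (name, value) pairs. The class name RecordingParser and its events attribute are hypothetical names chosen for illustration. The try/except import is an assumption added so the snippet also runs on Python 3, where the module is named html.parser.

```python
try:
    from HTMLParser import HTMLParser   # Python 2 module name
except ImportError:
    from html.parser import HTMLParser  # Python 3 module name (assumption: Py3 support wanted)


class RecordingParser(HTMLParser):
    """Hypothetical helper: stores parse events instead of printing them."""

    def __init__(self):
        # Call the base initializer explicitly; in Python 2 the base
        # class is old-style, so super() cannot be used here.
        HTMLParser.__init__(self)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, e.g. [('href', 'x.html')]
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


parser = RecordingParser()
parser.feed('<p class="intro">See <a href="x.html">this page</a></p>')
parser.close()  # flush any buffered text

for event in parser.events:
    print(event)
```

Collecting events in a list rather than printing them makes the parser's behaviour easier to inspect or test, and converting attrs to a dict is convenient when each attribute name appears only once on a tag.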