Package web2py :: Package gluon :: Module html :: Class web2pyHTMLParser

Class web2pyHTMLParser

markupbase.ParserBase --+    
                        |    
    HTMLParser.HTMLParser --+
                            |
                           web2pyHTMLParser

obj = web2pyHTMLParser(text) parses HTML/XML text into a tree of web2py helpers. obj.tree holds the root of that tree, and the tree can be manipulated after parsing:
>>> str(web2pyHTMLParser('hello<div a="b" c=3>wor&lt;ld<span>xxx</span>y<script/>yy</div>zzz').tree)
'hello<div a="b" c="3">wor&lt;ld<span>xxx</span>y<script></script>yy</div>zzz'
>>> str(web2pyHTMLParser('<div>a<span>b</div>c').tree)
'<div>a<span>b</span></div>c'
>>> tree = web2pyHTMLParser('hello<div a="b">world</div>').tree
>>> tree.element(_a='b')['_c']=5
>>> str(tree)
'hello<div a="b" c="5">world</div>'
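
The behaviour in the examples above (unquoted attributes normalized, unclosed tags balanced, entities preserved) can be illustrated with a small standalone sketch. This is not the web2py implementation: it assumes Python 3 and the standard library's html.parser, and the names Node and MiniTreeParser are hypothetical stand-ins for the web2py helpers and for web2pyHTMLParser. It only shows how overriding the five handler methods listed on this page can build and balance a tree.

```python
from html import escape, unescape
from html.parser import HTMLParser


class Node:
    """Hypothetical stand-in for a web2py helper object."""

    def __init__(self, tag, attrs=(), parent=None, void=False):
        self.tag = tag                # '' marks the anonymous root node
        self.attrs = dict(attrs)
        self.children = []            # child Nodes and plain text strings
        self.parent = parent
        self.void = void              # True for tags that never get an end tag

    def __str__(self):
        inner = ''.join(
            str(c) if isinstance(c, Node) else escape(c, quote=False)
            for c in self.children)
        if not self.tag:
            return inner
        attrs = ''.join(
            ' %s="%s"' % (k, escape('' if v is None else str(v)))
            for k, v in self.attrs.items())
        if self.void:
            return '<%s%s/>' % (self.tag, attrs)
        return '<%s%s>%s</%s>' % (self.tag, attrs, inner, self.tag)


class MiniTreeParser(HTMLParser):
    """Illustrative tree builder; assumed design, not web2py's code."""

    def __init__(self, text, closed=('input', 'link')):
        # convert_charrefs=False so the charref/entityref handlers are called
        super().__init__(convert_charrefs=False)
        self.tree = self.parent = Node('')
        self.closed = closed
        self.feed(text)
        self.close()

    def handle_starttag(self, tagname, attrs):
        node = Node(tagname, attrs, parent=self.parent,
                    void=tagname in self.closed)
        self.parent.children.append(node)
        if not node.void:
            self.parent = node        # descend into the new element

    def handle_data(self, data):
        self.parent.children.append(data)

    def handle_charref(self, name):
        # name is e.g. '62' or 'x3e'
        code = int(name[1:], 16) if name[0] in 'xX' else int(name)
        self.parent.children.append(chr(code))

    def handle_entityref(self, name):
        self.parent.children.append(unescape('&%s;' % name))

    def handle_endtag(self, tagname):
        # Walk up to the nearest open ancestor with this name, which
        # implicitly closes any unclosed elements nested inside it.
        node = self.parent
        while node.tag and node.tag != tagname:
            node = node.parent
        if node.tag:                  # found; stray end tags are ignored
            self.parent = node.parent


print(MiniTreeParser('<div>a<span>b</div>c').tree)
# <div>a<span>b</span></div>c
tree = MiniTreeParser('hello<div a="b">world</div>').tree
tree.children[1].attrs['c'] = 5      # mutate the parsed tree in place
print(tree)
# hello<div a="b" c="5">world</div>
```

In this sketch the closed argument plays the role suggested by the constructor signature: tags named there are treated as having no end tag, so the parser does not descend into them.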


Instance Methods
 
__init__(self, text, closed=('input', 'link'))
Initialize and reset this instance.
 
handle_starttag(self, tagname, attrs)

handle_data(self, data)

handle_charref(self, name)

handle_entityref(self, name)

handle_endtag(self, tagname)

Inherited from HTMLParser.HTMLParser: check_for_whole_start_tag, clear_cdata_mode, close, error, feed, get_starttag_text, goahead, handle_comment, handle_decl, handle_pi, handle_startendtag, parse_endtag, parse_pi, parse_starttag, reset, set_cdata_mode, unescape, unknown_decl

Inherited from markupbase.ParserBase: getpos, parse_comment, parse_declaration, parse_marked_section, updatepos

Inherited from markupbase.ParserBase (private): _parse_doctype_attlist, _parse_doctype_element, _parse_doctype_entity, _parse_doctype_notation, _parse_doctype_subset, _scan_name

Class Variables

Inherited from HTMLParser.HTMLParser: CDATA_CONTENT_ELEMENTS

Inherited from markupbase.ParserBase (private): _decl_otherchars

Method Details

__init__(self, text, closed=('input', 'link'))
(Constructor)

Initialize and reset this instance.
Overrides: HTMLParser.HTMLParser.__init__
(inherited documentation)

handle_starttag(self, tagname, attrs)

Overrides: HTMLParser.HTMLParser.handle_starttag

handle_data(self, data)

Overrides: HTMLParser.HTMLParser.handle_data

handle_charref(self, name)

Overrides: HTMLParser.HTMLParser.handle_charref

handle_entityref(self, name)

Overrides: HTMLParser.HTMLParser.handle_entityref

handle_endtag(self, tagname)

Overrides: HTMLParser.HTMLParser.handle_endtag