ly want to validate the document, use this::

    >>> parser = XHTMLParser(dtd_validation=True)

For catalog support, see http://www.xmlsoft.org/catalog.html. c