Making Xerces ignore a DTD

I’m trying to write a SAX parser for the XML log files that the Java Logging API generates when you use the XMLFormatter.

There are a couple of problems with the XML generated. One is that the XML is not well formed until the log file is “completed”, because until then it lacks the closing </log> tag (not surprising really).

The second is that it declares a DTD as follows:

<?xml version="1.0" encoding="windows-1252" standalone="no"?>
<!DOCTYPE log SYSTEM "logger.dtd">

Now it’s fairly easy to find out what the DTD should be: it’s in the JavaDoc, and a quick search on Google reveals it. But I can’t guarantee that the file will be in the right place for the parser to load it.

No problem, I thought, I’ll just turn off validation and then it won’t matter. Wrong! Even with validation switched off, the parser still tries to load the DTD and I get a FileNotFoundException.

However, by delving into the source code I finally found that if you switch off two features, the parser does ignore the external DTD. So add this to your code if you want to ignore DTDs:

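    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.XMLReader;

    final SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
    final SAXParser saxParser = saxParserFactory.newSAXParser();
    final XMLReader parser = saxParser.getXMLReader();

    // Ignore the DTD declaration: skip loading the external DTD and disable validation.
    // Both features must be set before calling parser.parse(...).
    parser.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    parser.setFeature("http://xml.org/sax/features/validation", false);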
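
For anyone who wants to try this end to end, here is a minimal sketch that parses a small log document whose DOCTYPE points at a logger.dtd that does not exist on disk. The class name IgnoreDtdDemo, the countRecords helper and the sample XML are made up for illustration, and it assumes the parser in use is Xerces (or the Xerces-derived parser bundled with the JDK), since the load-external-dtd feature is Xerces-specific.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of the Logging API or of Xerces.
final class IgnoreDtdDemo {

    /** Counts <record> elements while ignoring the external DTD named in the DOCTYPE. */
    static int countRecords(String xml) {
        final int[] count = {0};
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

            // The two features from the post: do not load the external DTD, do not validate.
            parser.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            parser.setFeature("http://xml.org/sax/features/validation", false);

            parser.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes attributes) {
                    // The factory is not namespace-aware by default, so qName holds the element name.
                    if ("record".equals(qName)) {
                        count[0]++;
                    }
                }
            });

            parser.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped so that callers of this sketch need not handle checked exceptions.
            throw new IllegalStateException("Could not parse log XML", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        // Sample input assumed for illustration; logger.dtd is deliberately absent.
        String xml = "<?xml version=\"1.0\" standalone=\"no\"?>\n"
                + "<!DOCTYPE log SYSTEM \"logger.dtd\">\n"
                + "<log><record/><record/></log>";
        System.out.println("records: " + countRecords(xml));
    }
}
```

If the underlying parser is not Xerces-based, setFeature may throw SAXNotRecognizedException for the first feature, which this sketch surfaces as an IllegalStateException.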

If anyone knows a better way then please let me know!

11 thoughts on “Making Xerces ignore a DTD”

  1. Hello forum & poster of this solution.
    Been reading, trying various things and searching the web for two days before coming across this!

    Thanks it worked a treat.
    Java newbie.

  2. Yeah, even I spent quite a lot of time wandering on different sites before I came across this. Works perfectly. Thanks.

  3. You are a star, scholar and gentleman. I was at the point of doing some string manipulation to get rid of the frigging DOCTYPE !

Thanks a lot.
    It was quite useful for me.
    I searched like hell for a solution and this works fine.
    But are there any disadvantages to this solution?
    Does XML validation still happen, or is it switched off as well?

  5. Where did you add this? Sorry, total newb question, but I’ve needed to do this for years, and I’ve no one to walk me through it.

    1. Hi smac, what exactly do you mean when you say “where did you add this”?

      To make Xerces ignore the DTD, just after you’ve created the XMLReader (assigned to parser in my example above), you set the two features shown and then it ignores the DTD (but obviously this means that it won’t check to ensure that the XML conforms to it).

      That’s all you have to do.

  6. Pass the parser an EntityResolver that returns an empty DTD:

    final SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
    saxParserFactory.setValidating(false);
    final SAXParser saxParser = saxParserFactory.newSAXParser();
    final XMLReader parser = saxParser.getXMLReader();
    
    parser.setEntityResolver(new EntityResolver() {
        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            // An empty stream stands in for the DTD, so nothing is read from disk.
            return new InputSource(new ByteArrayInputStream(new byte[0]));
        }
    });
    
    1. I should have added… if you want to validate, bundle the DTD in your JAR, in the same folder as your parsing utility class. That way, you know exactly where it is, and after all, it is a class resource in this context. Then have your utility class implement EntityResolver thus:

      @Override
      public InputSource resolveEntity(String publicId, String systemId) {
          return new InputSource(MyUtilityClass.class.getResourceAsStream("logger.dtd"));
      }

      And set the entity resolver of the parser to the instance of your utility class (e.g. this).
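
A rough, self-contained sketch of the validating variant described above might look like this. To keep it runnable without a JAR, the DTD is held in a string rather than loaded with getResourceAsStream, and it is a cut-down stand-in, not the real logger.dtd. The class name BundledDtdDemo and the isValid helper are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class illustrating validation against a DTD supplied by an EntityResolver.
final class BundledDtdDemo {

    // Assumption: a simplified DTD for illustration only; the real logger.dtd declares more elements.
    static final String DTD =
            "<!ELEMENT log (record*)>\n"
            + "<!ELEMENT record (#PCDATA)>\n";

    /** Returns true if the document parses and validates against the in-memory DTD. */
    static boolean isValid(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // validation must be on for the DTD to be enforced
            XMLReader parser = factory.newSAXParser().getXMLReader();

            // Serve the DTD ourselves whenever the parser asks for logger.dtd;
            // returning null lets the parser fall back to its default lookup.
            parser.setEntityResolver((publicId, systemId) ->
                    systemId != null && systemId.endsWith("logger.dtd")
                            ? new InputSource(new StringReader(DTD))
                            : null);

            // Validation errors are only reported, not thrown, unless a handler rethrows them.
            parser.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });

            parser.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Covers both validation errors and documents that are not well formed.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Could not set up or run the parser", e);
        }
    }
}
```

In a real utility class, the string-backed InputSource would be swapped for the getResourceAsStream call shown in the comment above.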
