I’ve always felt that XML processing in Java is overly complex. Surely something that so many business applications need to do on a regular basis should be simpler and easier by now.
Recently I needed access to the raw XML being retrieved, in this case from an Amazon Web Service, for two reasons. First, our application needed to store the raw XML in a database for caching and audit purposes. Second, we wanted to look at the raw XML to research some corner cases where the data we received wasn't quite what we were expecting.
Having done a:
Document doc = db.parse(requestUrl);
You would have thought that Document would have a method along the lines of:
String rawXml = doc.getRawMessage();
It's at that point that you really start to question why something so obvious and so simple is missing from the standard Java libraries. I did some Googling around and found lots of people asking similar questions. The solution isn't difficult, it's just unbelievably clumsy.
Basically what I do now is read my XML source into a String. Once it's in a String I can do whatever I need with it. For some reason DocumentBuilder.parse doesn't have a version that accepts XML content as a String; the overload that takes a String treats it as a URI to fetch, not as the XML itself. You therefore have to wrap your String in a StringReader, wrap that in an InputSource, and pass the InputSource to parse.
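// Needs java.io.InputStream, java.io.StringReader, java.net.URL, java.util.Scanner,
// javax.xml.parsers.DocumentBuilder, javax.xml.parsers.DocumentBuilderFactory,
// org.w3c.dom.Document and org.xml.sax.InputSource.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();

// Fetch the XML into a String
URL myURL = new URL(requestUrl);
InputStream myInputStream = myURL.openStream();
Scanner myScanner = new Scanner(myInputStream, "UTF-8");
// "\\A" matches only the start of the input, so next() returns the whole stream
myScanner.useDelimiter("\\A");
// hasNext() guards against an empty response, where next() would throw
String myRawXMLRequest = myScanner.hasNext() ? myScanner.next() : "";
// Closing the Scanner also closes the underlying InputStream
myScanner.close();

// Output the raw XML to prove we have access to it
System.out.println(myRawXMLRequest);

// Now instead of passing the request URL directly to the parser we
// wrap the raw XML String in a StringReader-backed InputSource and pass that in.
// Old code was:
// Document doc = db.parse(requestUrl);
// New code:
Document doc = db.parse(new InputSource(new StringReader(myRawXMLRequest)));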
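If you want to try the same idea without hitting a real web service, here is a minimal self-contained sketch. The class name RawXmlDemo, the helper names readAll and parse, and the sample XML are all made up for illustration, and an in-memory stream stands in for myURL.openStream(). Treat it as one way of packaging the steps above, not as the definitive way to do it.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names here are illustrative, not from any library.
public class RawXmlDemo {

    // Reads the whole stream into a String using the Scanner "\\A" delimiter trick.
    // Closing the Scanner also closes the stream it wraps.
    static String readAll(InputStream in) {
        try (Scanner scanner = new Scanner(in, "UTF-8")) {
            scanner.useDelimiter("\\A");
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    // Parses XML held in a String by wrapping it in a StringReader-backed InputSource.
    static Document parse(String rawXml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(rawXml)));
    }

    public static void main(String[] args) throws Exception {
        // Sample XML standing in for the web service response (an assumption for this demo).
        String sample = "<order id=\"42\"><item>book</item></order>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));

        String rawXml = readAll(in);   // the raw text, available for caching or auditing
        Document doc = parse(rawXml);  // the parsed DOM, built from the same text

        System.out.println(rawXml);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Keeping the read and the parse as two separate steps is the whole point: the String is yours to log or store before the parser ever sees it.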