My Solr client talks to a proxy application, which communicates with a remote Solr server to get data.
In a previous post, Solr: Use JSON(GSon) Streaming to Reduce Memory Usage, I described the problem we faced, how to use JSON (Gson) streaming, and some other approaches to reduce memory usage.
In the post Solr: Use SAX Parser to Read XML Response to Reduce Memory Usage, I described how to use SAX to parse the response for better performance.
In this post, I will show how to use the StAX parser to parse the XML response.
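For readers new to StAX, it may help to see the pull model in isolation first. The following is a minimal sketch, not the code from this project: the class name `StaxPullDemo` and the sample XML are assumptions, with the sample shaped like a Solr XML response that holds two `doc` elements. The reader advances only when the application calls `next()`, so documents can be handled one at a time without building a tree in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxPullDemo {

    // Hypothetical sample data, shaped like a Solr XML response.
    public static final String SAMPLE =
        "<response>"
        + "<result name=\"response\" numFound=\"3\" start=\"0\">"
        + "<doc><str name=\"contentid\">a1</str></doc>"
        + "<doc><str name=\"contentid\">b2</str></doc>"
        + "</result>"
        + "</response>";

    /** Returns {numFound, number of doc elements seen in the stream}. */
    public static int[] countDocs(InputStream in) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        int numFound = -1;
        int docs = 0;
        try {
            while (reader.hasNext()) {
                // The application pulls the next event from the parser.
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("result".equals(name)) {
                        // A null namespace means "match on the local name only".
                        numFound = Integer.parseInt(reader.getAttributeValue(null, "numFound"));
                    } else if ("doc".equals(name)) {
                        docs++;
                    }
                }
            }
        } finally {
            reader.close();
        }
        return new int[] {numFound, docs};
    }

    public static void main(String[] args) throws XMLStreamException {
        InputStream in = new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8));
        int[] counts = countDocs(in);
        System.out.println("numFound=" + counts[0] + ", docs in this page=" + counts[1]);
    }
}
```

Running `main` reports the `numFound` attribute of the sample and the number of `doc` elements it contains.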
Implementation
The code below uses StAX to read documents one by one from the HTTP stream.
It uses the StAX parser together with Java executor Futures to wait until all threads have finished, that is, until all docs are imported.
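```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

import org.apache.solr.request.SolrQueryRequest;

private static ImportedResult handleXMLResponseViaStax(
    SolrQueryRequest request, InputStream in, int fetchSize)
    throws XMLStreamException, InterruptedException, ExecutionException {
  XMLInputFactory factory = XMLInputFactory.newInstance();
  XMLStreamReader reader = null;
  ImportedResult importedResult = new ImportedResult();
  List<Future<Void>> futures = new ArrayList<Future<Void>>();
  try {
    reader = factory.createXMLStreamReader(in);
    int fetchedSize = 0;
    int numFound = -1, start = -1;
    while (reader.hasNext()) {
      if (reader.next() != XMLStreamConstants.START_ELEMENT) {
        continue;
      }
      String name = reader.getLocalName();
      if ("result".equals(name)) {
        // A null namespace matches the attribute by local name only.
        numFound = Integer.parseInt(reader.getAttributeValue(null, "numFound"));
        // Standard Solr XML carries start as an attribute of <result>.
        String startAttr = reader.getAttributeValue(null, "start");
        if (startAttr != null) {
          start = Integer.parseInt(startAttr);
        }
      } else if ("start".equals(name)) {
        // Also accept a separate <start start="..."> element.
        start = Integer.parseInt(reader.getAttributeValue(null, "start"));
      } else if ("doc".equals(name)) {
        ++fetchedSize;
        futures.add(readOneDoc(request, reader));
      }
    }
    // Wait until every document has been imported.
    for (Future<Void> future : futures) {
      future.get();
    }
    importedResult.setFetched(fetchedSize);
    importedResult.setHasMore((fetchedSize + start) < numFound);
    importedResult.setImportedData(fetchedSize != 0);
    return importedResult;
  } finally {
    if (reader != null) {
      reader.close();
    }
  }
}

private static Future<Void> readOneDoc(SolrQueryRequest request, XMLStreamReader reader)
    throws XMLStreamException {
  String contentid = null, bindoc = null;
  // The reader is positioned on <doc>; consume events up to the matching </doc>.
  while (reader.hasNext()) {
    int event = reader.next();
    if (event == XMLStreamConstants.START_ELEMENT) {
      if ("str".equals(reader.getLocalName())) {
        // In Solr XML, the only attribute of <str> is the field name.
        String fieldName = reader.getAttributeValue(0);
        if ("contentid".equals(fieldName)) {
          contentid = reader.getElementText();
        } else if ("bindoc".equals(fieldName)) {
          bindoc = reader.getElementText();
        }
      }
    } else if (event == XMLStreamConstants.END_ELEMENT
        && "doc".equals(reader.getLocalName())) {
      break;
    }
  }
  // importData returns a Future that completes when this doc has been imported.
  return CVSyncDataImporter.getInstance().importData(request, contentid, bindoc);
}
```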
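The listing above depends on project classes such as `CVSyncDataImporter` and `ImportedResult`, so it cannot be run on its own, and how `importData` schedules its work is not shown in this post. The self-contained sketch below illustrates only the "collect the Futures, then wait for all of them" pattern. It assumes the import work is submitted to an `ExecutorService`; the class `WaitForImportsDemo`, the method `importAll`, and the pool size are hypothetical stand-ins, not the project's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WaitForImportsDemo {

    /** Submits one task per id, blocks until all tasks finish, returns the imported count. */
    public static int importAll(List<String> contentIds)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(4); // assumed pool size
        Set<String> imported = ConcurrentHashMap.newKeySet();
        List<Future<Void>> futures = new ArrayList<>();
        try {
            for (final String id : contentIds) {
                // Hypothetical stand-in for the real import of one document.
                Callable<Void> task = () -> {
                    imported.add(id);
                    return null;
                };
                futures.add(executor.submit(task));
            }
            // get() blocks until the task is done and rethrows a task failure
            // wrapped in an ExecutionException.
            for (Future<Void> future : futures) {
                future.get();
            }
        } finally {
            executor.shutdown();
        }
        return imported.size();
    }

    public static void main(String[] args) throws Exception {
        int count = importAll(Arrays.asList("a1", "b2", "c3"));
        System.out.println("imported " + count + " docs");
    }
}
```

Waiting on each `Future` in submission order is enough here because the method needs all of them to finish before it reports the result; the order in which the tasks actually complete does not matter.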
Resources
Parsing XML using DOM, SAX and StAX Parser in Java
Java SAX vs. StAX