My Solr client talks to a proxy application, which communicates with a remote Solr server to fetch data.
In a previous post, Solr: Use JSON(GSon) Streaming to Reduce Memory Usage, I described the problem we faced, how to use JSON (Gson) streaming, and some other approaches to reduce memory usage.
In the post Solr: Use SAX Parser to Read XML Response to Reduce Memory Usage, I also described how to use SAX to parse the response for better performance.
In this post, I will show how to use the StAX parser to parse the XML response.
Implementation
The code below uses StAX to read documents one by one from the HTTP stream.
It combines the StAX parser with the Futures returned by a Java executor to wait until all threads have finished, that is, until all docs are imported.
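import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.apache.solr.request.SolrQueryRequest;

// Reads the XML response doc by doc and returns once every doc is imported.
private static ImportedResult handleXMLResponseViaStax(
    SolrQueryRequest request, InputStream in, int fetchSize)
    throws XMLStreamException, InterruptedException, ExecutionException {
  XMLInputFactory factory = XMLInputFactory.newInstance();
  XMLStreamReader reader = null;
  ImportedResult importedResult = new ImportedResult();
  List<Future<Void>> futures = new ArrayList<Future<Void>>();
  try {
    reader = factory.createXMLStreamReader(in);
    int fetchedSize = 0;
    int numFound = -1, start = -1;
    while (reader.hasNext()) {
      int event = reader.next();
      switch (event) {
        case XMLStreamConstants.START_ELEMENT: {
          if ("result".equals(reader.getLocalName())) {
            // In the standard Solr XML response, numFound and start are
            // both attributes of <result>; a null namespace means
            // "do not check the namespace".
            numFound = Integer.parseInt(
                reader.getAttributeValue(null, "numFound"));
            start = Integer.parseInt(
                reader.getAttributeValue(null, "start"));
          } else if ("doc".equals(reader.getLocalName())) {
            ++fetchedSize;
            // readOneDoc consumes the stream up to the matching </doc>.
            futures.add(readOneDoc(request, reader));
          }
          break;
        }
        default:
          break;
      }
    }
    // Wait for all import tasks to finish: all docs imported.
    for (Future<Void> future : futures) {
      future.get();
    }
    importedResult.setFetched(fetchedSize);
    importedResult.setHasMore((fetchedSize + start) < numFound);
    importedResult.setImportedData((fetchedSize != 0));
    return importedResult;
  } finally {
    if (reader != null) {
      reader.close();
    }
  }
}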
// Reads the fields of one <doc> and hands them to the importer,
// which returns a Future for the pending import.
private static Future<Void> readOneDoc(SolrQueryRequest request,
    XMLStreamReader reader) throws XMLStreamException {
  String contentid = null, bindoc = null;
  OUTER: while (reader.hasNext()) {
    int event = reader.next();
    switch (event) {
      case XMLStreamConstants.START_ELEMENT: {
        if ("str".equals(reader.getLocalName())) {
          // Look up the field name by attribute name, not by position.
          String fieldName = reader.getAttributeValue(null, "name");
          if ("contentid".equals(fieldName)) {
            contentid = reader.getElementText();
          } else if ("bindoc".equals(fieldName)) {
            bindoc = reader.getElementText();
          }
        }
        break;
      }
      case XMLStreamConstants.END_ELEMENT: {
        // </doc> marks the end of the current document.
        if ("doc".equals(reader.getLocalName())) {
          break OUTER;
        }
        break;
      }
      default:
        break;
    }
  }
  return CVSyncDataImporter.getInstance().importData(request, contentid,
      bindoc);
}
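The methods above depend on Solr and on my own importer classes, so they cannot be run in isolation. To experiment with the same pattern, here is a minimal self-contained sketch. It is not the production code: the class name StaxFutureDemo, the SAMPLE response, the STORE map (a stand-in for CVSyncDataImporter) and the pool of two threads are all assumptions made up for illustration. The sample XML only mimics the shape of a standard Solr XML response.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxFutureDemo {
  // Hypothetical sample, shaped like a Solr XML response with two docs.
  static final String SAMPLE =
      "<response><result name=\"response\" numFound=\"5\" start=\"0\">"
      + "<doc><str name=\"contentid\">a1</str><str name=\"bindoc\">AAAA</str></doc>"
      + "<doc><str name=\"contentid\">b2</str><str name=\"bindoc\">BBBB</str></doc>"
      + "</result></response>";

  // Stand-in for the real import target; written to by worker threads.
  static final Map<String, String> STORE = new ConcurrentHashMap<>();

  /** Returns {fetched, start, numFound} once all import tasks are done. */
  static int[] importAll(String xml)
      throws XMLStreamException, InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    List<Future<Void>> futures = new ArrayList<>();
    XMLStreamReader reader = XMLInputFactory.newInstance()
        .createXMLStreamReader(new StringReader(xml));
    int fetched = 0, start = -1, numFound = -1;
    try {
      while (reader.hasNext()) {
        if (reader.next() != XMLStreamConstants.START_ELEMENT) {
          continue;
        }
        String name = reader.getLocalName();
        if ("result".equals(name)) {
          numFound = Integer.parseInt(reader.getAttributeValue(null, "numFound"));
          start = Integer.parseInt(reader.getAttributeValue(null, "start"));
        } else if ("doc".equals(name)) {
          ++fetched;
          futures.add(readOneDoc(reader, pool));
        }
      }
      // Block until every submitted import task has completed.
      for (Future<Void> future : futures) {
        future.get();
      }
    } finally {
      reader.close();
      pool.shutdown();
    }
    return new int[] {fetched, start, numFound};
  }

  // Pulls events until </doc>, then submits the import as a background task.
  static Future<Void> readOneDoc(XMLStreamReader reader, ExecutorService pool)
      throws XMLStreamException {
    String contentid = null, bindoc = null;
    while (reader.hasNext()) {
      int event = reader.next();
      if (event == XMLStreamConstants.START_ELEMENT
          && "str".equals(reader.getLocalName())) {
        String field = reader.getAttributeValue(null, "name");
        if ("contentid".equals(field)) {
          contentid = reader.getElementText();
        } else if ("bindoc".equals(field)) {
          bindoc = reader.getElementText();
        }
      } else if (event == XMLStreamConstants.END_ELEMENT
          && "doc".equals(reader.getLocalName())) {
        break;
      }
    }
    final String id = contentid, body = bindoc;
    Callable<Void> task = () -> {
      STORE.put(id, body); // the "import" step in this sketch
      return null;
    };
    return pool.submit(task);
  }

  public static void main(String[] args) throws Exception {
    int[] r = importAll(SAMPLE);
    boolean hasMore = (r[0] + r[1]) < r[2];
    System.out.println("fetched=" + r[0] + " hasMore=" + hasMore
        + " stored=" + STORE.size());
  }
}
```

Because parsing stays on the calling thread and only the import work is handed to the pool, the parser never holds more than one doc in memory, while slow imports can overlap with reading the rest of the stream.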
Resources
Parsing XML using DOM, SAX and StAX Parser in Java
Java SAX vs. StAX