Build Web Service APIs to Update Solr's Managed Resources (stop words, synonyms)

Use Case
Solr provides a REST API for updating managed resources such as stop words and synonyms. Usually, though, we wrap it and expose our own REST API in the application layer, so that administrators can perform Solr admin operations from the UI.
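To make the shape of those managed-resource requests concrete, here is a minimal sketch that builds the endpoint paths used later in this post and a JSON array body for a list of stop words. The class and method names (ManagedResourceRequests, stopWordsPath, synonymsPath, toJsonArray) are hypothetical helpers for illustration only. The JSON escaping is deliberately minimal (quotes and backslashes only); the actual code in this post leaves serialization to SolrJsonRequest.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Solr or spring-data-solr.
final class ManagedResourceRequests {

    // Path of the managed stop word list for a language, relative to the collection URL.
    static String stopWordsPath(final String language) {
        return "/schema/analysis/stopwords/" + language;
    }

    // Path of the managed synonym map for a language, relative to the collection URL.
    static String synonymsPath(final String language) {
        return "/schema/analysis/synonyms/" + language;
    }

    // Builds a JSON array such as ["a","an"]; only quotes and backslashes are escaped.
    static String toJsonArray(final List<String> words) {
        return words.stream()
                .map(w -> "\"" + w.replace("\\", "\\\\").replace("\"", "\\\"") + "\"")
                .collect(Collectors.joining(",", "[", "]"));
    }
}
```

A PUT of the array to the stop words path adds the words, and a DELETE on the path with a word appended removes that word. The change only takes effect after the collection is reloaded, which is why the methods below take a reloadCollection flag.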

We also usually use CloudSolrClient and prefer the SolrJ API over calling the Solr server's REST API directly, because our application typically knows only the addresses of the ZooKeeper servers, not those of the Solr servers.

The Implementation
We first create generic APIs: getManagedResource, addManagedResource and deleteManagedResource. We then call them to manage stop words and synonyms.

We use spring-data-solr's SolrJsonRequest in getManagedResource and addManagedResource; it helps parse the JSON response.

deleteManagedResource is more complex: we can't use SolrJ directly, because org.apache.solr.client.solrj.SolrRequest.METHOD (in the SolrJ version used here) only supports GET, POST and PUT, not DELETE.

Here I use CloudSolrClient's underlying Apache HttpClient to send an HttpDelete request, and use ZkStateReader and ClusterState to get the address of one live Solr node.
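The node-selection step can be sketched without a running cluster. Live node names in ZooKeeper look like host:port_context (for example 127.0.0.1:8983_solr). The sketch below approximates what ZkStateReader.getBaseUrlForNodeName does with such a name; it is an assumption-laden illustration, not Solr's actual code. LiveNodeUrls, baseUrlForNodeName and pickOne are hypothetical names, and the URL scheme is passed in explicitly here rather than read from the cluster properties.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Collection;
import java.util.TreeSet;

// Hypothetical helper that mimics the node-name-to-URL conversion for illustration.
final class LiveNodeUrls {

    // Converts "host:port_context" into "scheme://host:port/context".
    static String baseUrlForNodeName(final String nodeName, final String urlScheme) {
        final int sep = nodeName.indexOf('_');
        if (sep < 0) {
            throw new IllegalArgumentException("Missing '_' separator in node name: " + nodeName);
        }
        final String hostAndPort = nodeName.substring(0, sep);
        final String context;
        try {
            // The context part is URL-encoded in the node name.
            context = URLDecoder.decode(nodeName.substring(sep + 1), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
        return urlScheme + "://" + hostAndPort + (context.isEmpty() ? "" : "/" + context);
    }

    // Picks the lexicographically first live node and appends the collection name.
    static String pickOne(final Collection<String> liveNodes, final String urlScheme,
            final String collection) {
        if (liveNodes.isEmpty()) {
            throw new IllegalStateException("No live nodes");
        }
        final String first = new TreeSet<>(liveNodes).first();
        return baseUrlForNodeName(first, urlScheme) + "/" + collection;
    }
}
```

Sorting the node names makes the choice deterministic, at the cost of always sending admin requests to the same node while it is alive.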

The REST APIs in our application just call these methods.

public String getManagedResource(final String path) {
    try {
        final SolrJsonRequest request = new SolrJsonRequest(METHOD.GET, path);
        return request.process(this.getSolrClient()).getJsonResponse();
    } catch (SolrServerException | IOException e) {
        throw new MyServerException(e, "Failed to get " + path);
    }
}
public void addManagedResource(final String path, final Object content, final boolean reloadCollection) {
    final SolrJsonRequest request = new SolrJsonRequest(METHOD.PUT, path);
    request.addContentToStream(content);
    try {
        final SolrJsonResponse response = request.process(this.getSolrClient());
        final int status = response.getStatus();
        logger.info(MessageFormat.format("add resource: {0}, status: {1}, result: {2}", path, status,
                response.getJsonResponse()));
        if (status != 0) {
            throw new MyServerException(ErrorCode.data_access_error,
                    MessageFormat.format("Failed to add resource, path: {0}, status: {1}", path, status));
        }
        if (reloadCollection) {
            reloadCollection();
        }
    } catch (SolrServerException | IOException | InterruptedException e) {
        throw new MyServerException(e, "Failed to add resource: " + path);
    }
}
public void deleteManagedResource(@Nonnull final List<String> paths, final boolean reloadCollection) {
    try {
        Preconditions.checkNotNull(paths);
        final String solrUrl = getOneSolrServerUrl(getSolrClient());
        final List<String> done = new ArrayList<>(paths.size());
        for (final String path : paths) {
            final HttpDelete request = new HttpDelete(solrUrl + path);

            final HttpResponse response = getSolrClient().getHttpClient().execute(request);
            final String entity = EntityUtils.toString(response.getEntity());
            logger.info(MessageFormat.format("delete path: {0}, result: {1}", path, entity));
            final ObjectMapper objectMapper = Util.createFailSafeObjectmapper();
            final Map<String, Object> resultMap = objectMapper.readValue(entity,
                    objectMapper.getTypeFactory().constructMapLikeType(Map.class, String.class, Object.class));
            final Map<String, Object> responseHeader = (Map<String, Object>) resultMap.get("responseHeader");
            if (responseHeader != null) {
                final int status = Integer.valueOf(responseHeader.get("status").toString());
                // ignore 404 which means it's already deleted
                if (status != 0 && status != 404) {
                    throw new MyServerException(MessageFormat.format(
                            "Failed to delete path: {0}, status: {1}, already deleted: {2}", path, status, done));
                }
            }
            done.add(path);
        }
        if (reloadCollection) {
            this.reloadCollection();
        }
    } catch (IOException | SolrServerException | InterruptedException e) {
        throw new MyServerException(ErrorCode.data_access_error, e, "Failed to delete path: " + paths);
    }
}

public String getSynonyms(final String language) {
    return getManagedResource("/schema/analysis/synonyms/" + language);
}
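public void addSynonyms(final String language, final List<Object> synonymEntries, final boolean reloadCollection) {
    final String path = "/schema/analysis/synonyms/" + language;
    for (int i = 0; i < synonymEntries.size(); i++) {
        // Reload at most once, after the last entry has been added.
        final boolean isLast = i == synonymEntries.size() - 1;
        addManagedResource(path, synonymEntries.get(i), reloadCollection && isLast);
    }
}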
public void deleteSynonyms(final String language, final List<String> synonyms, final boolean reloadCollection) {
    if (CollectionUtils.isNotEmpty(synonyms)) {
        deleteManagedResource(synonyms.stream().map(synonym -> {
            return MessageFormat.format("/schema/analysis/synonyms/{0}/{1}", language,
                    Util.encodeAsUtf8(synonym));
        }).collect(Collectors.toList()), reloadCollection);
    }
}

public void addStopWords(final String language, final List<String> stopWords, final boolean reloadCollection) {
    addManagedResource("/schema/analysis/stopwords/" + language, stopWords, reloadCollection);
}
public String getStopWords(final String language) {
    return getManagedResource("/schema/analysis/stopwords/" + language);
}
public void deleteStopWords(final String language, final List<String> stopWords, final boolean reloadCollection) {
    if (CollectionUtils.isNotEmpty(stopWords)) {
        deleteManagedResource(stopWords.stream().map(stopWord -> {
            return MessageFormat.format("/schema/analysis/stopwords/{0}/{1}", language,
                    Util.encodeAsUtf8(stopWord));
        }).collect(Collectors.toList()), reloadCollection);
    }
}


public static String getOneSolrServerUrl(final CloudSolrClient solrClient) {
    final ZkStateReader zkReader = solrClient.getZkStateReader();
    final ClusterState clusterState = zkReader.getClusterState();
    final Set<String> liveNodes = clusterState.getLiveNodes();

    if (liveNodes.isEmpty()) {
        throw new MyServerException(ErrorCode.data_access_error, "No live nodes");
    }
    // Sort the node names so that the same node is picked deterministically.
    return zkReader.getBaseUrlForNodeName(new TreeSet<>(liveNodes).iterator().next()) + "/"
            + solrClient.getDefaultCollection();
}
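
The delete methods above rely on Util.encodeAsUtf8, which is not shown in this post. A minimal sketch of what such a helper might look like follows, assuming its job is to make a term safe to append as a URL path segment. PathSegments and synonymPath are hypothetical names. URLEncoder produces form encoding, where a space becomes "+", so the sketch rewrites that to "%20" for use in a path.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical stand-in for the Util.encodeAsUtf8 helper used above.
final class PathSegments {

    // Percent-encodes a term as UTF-8 so it can be appended to a URL path.
    static String encodeAsUtf8(final String term) {
        try {
            // A literal '+' in the term is already encoded as %2B at this point,
            // so every remaining '+' stands for a space.
            return URLEncoder.encode(term, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds the child path that addresses one synonym entry.
    static String synonymPath(final String language, final String term) {
        return "/schema/analysis/synonyms/" + language + "/" + encodeAsUtf8(term);
    }
}
```

Without this step, a term containing a space, a slash or a non-ASCII character would produce an invalid or misrouted DELETE URL.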