Solr - Tips and Tricks

Admin UI
http://127.0.0.1:8983/solr/#/~cloud?view=tree
bin/solr help
bin/solr status
bin/solr healthcheck
bin/solr stop -all
(bin/solr start -cloud -s example/cloud/node1/solr -p 8983 -h 127.0.0.1)  && (bin/solr start -cloud -s example/cloud/node2/solr -p 7574 -z 127.0.0.1:9983 -h 127.0.0.1) && (bin/solr start -cloud -s example/cloud/node3/solr -p 6463 -z 127.0.0.1:9983 -h 127.0.0.1)

Delete docs:
Replace *:* with your own query.
update?commit=true&stream.body=<delete><query>*:*</query></delete>
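
The query has to survive both XML and URL parsing, so it is safer to build the URL in code than by hand. Below is a minimal sketch using only the JDK; the class name, the host and the collection name "mycollection" are assumptions for illustration. Depending on the Solr version, stream.body may also have to be enabled in the request parser settings.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Builds a delete-by-query update URL. Host and collection below are hypothetical. */
class DeleteByQueryUrl {

    // Escape the characters that are special inside XML text content.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Wrap the query in <delete><query>...</query></delete> and URL-encode the body.
    static String build(String updateUrl, String query) {
        String body = "<delete><query>" + escapeXml(query) + "</query></delete>";
        return updateUrl + "?commit=true&stream.body="
                + URLEncoder.encode(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Deletes only the docs matching id:MA* instead of everything.
        System.out.println(build("http://127.0.0.1:8983/solr/mycollection/update", "id:MA*"));
    }
}
```

Try the same query against /select first to see what would be deleted.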

https://wiki.apache.org/solr/SearchHandler
Use invariants to lock options: they override any value the client passes.
Use appends to add values on top of what the client sends, and defaults to supply values the client leaves out.
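
To make the precedence concrete, here is a small simulation of how the three sections combine with the request parameters. This is not Solr's own code, just a sketch of the merge order described above; the class name and the sample parameters are made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model of defaults / appends / invariants (not Solr's implementation). */
class HandlerParams {

    static Map<String, List<String>> effective(
            Map<String, List<String>> request,
            Map<String, List<String>> defaults,
            Map<String, List<String>> appends,
            Map<String, List<String>> invariants) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        // defaults apply only where the request is silent: the request overwrites them
        defaults.forEach((k, v) -> out.put(k, new ArrayList<>(v)));
        request.forEach((k, v) -> out.put(k, new ArrayList<>(v)));
        // appends are added to whatever values are already there
        appends.forEach((k, v) -> out.computeIfAbsent(k, x -> new ArrayList<>()).addAll(v));
        // invariants win over everything the client sent
        invariants.forEach((k, v) -> out.put(k, new ArrayList<>(v)));
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> result = effective(
                Map.of("rows", List.of("1000"), "fq", List.of("type:a"), "wt", List.of("xml")),
                Map.of("rows", List.of("10"), "df", List.of("text")),
                Map.of("fq", List.of("inStock:true")),
                Map.of("wt", List.of("json")));
        System.out.println(result);
    }
}
```

In this example the client keeps its own rows value, gets the extra fq added to its own, and has its wt replaced by the invariant.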

Request Parameters
distrib=false - query only the current core

debugQuery
debug=query, debug=results or debug=timing (debug=all turns on all of them)

explainOther
debug=results&explainOther=id:MA*

Range Query
inclusive: [a TO b]
exclusive: {a TO b} - curly braces, not parentheses.
mixed: [a TO b} or {a TO b]
TO must be uppercase.
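
A tiny helper keeps the bracket choice explicit when ranges are built in code. This is only a sketch; the class name and the field name "price" are assumptions. Note that Solr expects the TO keyword in uppercase.

```java
/** Builds Solr range clauses with inclusive or exclusive bounds. */
class RangeQuery {

    // '[' and ']' include the bound, '{' and '}' exclude it; the two sides may differ.
    static String range(String field, String low, boolean lowInclusive,
                        String high, boolean highInclusive) {
        return field + ":" + (lowInclusive ? "[" : "{") + low + " TO " + high
                + (highInclusive ? "]" : "}");
    }

    public static void main(String[] args) {
        System.out.println(range("price", "10", true, "20", true));   // both bounds included
        System.out.println(range("price", "10", false, "20", false)); // both bounds excluded
        System.out.println(range("price", "10", true, "20", false));  // 10 included, 20 excluded
    }
}
```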

Negative Query
Query empty fields: -field:*
Field is empty or is abc: (*:* -field:*) OR field:abc
(*:* -id:1) OR id:1 - returns all docs
http://stackoverflow.com/questions/634765/using-or-and-not-in-solr-query
Solr rewrites a top-level -foo into (*:* -foo).
The big caveat is that Solr only checks whether the top-level query is a pure negative query! A nested negative clause needs the explicit *:* in front of it.
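
Since the rewrite only happens at the top level, one way to stay safe is to always anchor negative clauses to *:* yourself when building queries in code. A minimal sketch (class and method names are made up):

```java
/** Helpers that make negative clauses safe to nest inside larger queries. */
class NegativeClause {

    // Anchor the negation to the match-all query so it works at any nesting level.
    static String not(String clause) {
        return "(*:* -" + clause + ")";
    }

    // Matches docs where the field is missing or equals the given value.
    static String emptyOr(String field, String value) {
        return not(field + ":*") + " OR " + field + ":" + value;
    }

    public static void main(String[] args) {
        System.out.println(emptyOr("field", "abc"));
    }
}
```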

zkcli
zkcli.sh -zkhost zooServer:port  -cmd putfile /configs/solrconfig.xml solrconfig.xml
zkcli.sh -zkhost zooServer:port  -cmd get /configs/schema.xml

Get SolrCloud node info (such as IP addresses):
java -classpath "*" org.apache.solr.cloud.ZkCLI -zkhost myzkhost -cmd get /clusterstate.json | grep base_url
zkcli.sh -zkhost myzkhost:port -cmd get /clusterstate.json

REST API
solr/collection/config
solr/collection/config/requestHandler
solr/collection/schema
solr/collection/schema/version
solr/admin/collections?action=RELOAD&name=$NAME

SolrJ Field Annotation
Map dynamic fields to Java fields
@Field("supplier_*")
Map<String, List<String>> supplier;

@Field("sup_simple_*")
Map<String, String> supplier_simple;

@Field("allsupplier_*")
private String[] allSuppliers;

@Field(child = true)
Child[] child;

Starting Solr
-m 2g: Start Solr with the defined value as the min (-Xms) and max (-Xmx) heap size for the JVM.

bin/solr stop -all

(bin/solr start -cloud -s example/cloud/node1/solr -p 8983 -h 127.0.0.1 -m 2g)  && (bin/solr start -cloud -s example/cloud/node2/solr -p 7574 -z 127.0.0.1:9983 -h 127.0.0.1 -m 2g) && (bin/solr start -cloud -s example/cloud/node3/solr -p 6463 -z 127.0.0.1:9983 -h 127.0.0.1 -m 2g)
-- Use -h 127.0.0.1 so Solr keeps working even if the machine's IP address changes.

Extending Solr
Implement the SolrCoreAware interface in a custom RequestHandler; Solr then passes the SolrCore to its inform(SolrCore) method.

Customize and extend DocumentObjectBinder

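Get Solr static fields
import java.util.List;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.common.util.SimpleOrderedMap;

// SolrJ 4.x class names; from 5.0 on they are SolrClient / HttpSolrClient.
SolrServer solrClient = new HttpSolrServer("http://{host:port}/solr/core-name");
SolrQuery query = new SolrQuery();
// Send the request to the Schema API handler instead of /select.
query.setRequestHandler("/schema/fields");
// query.add(CommonParams.QT, "/schema/fields");
QueryResponse response = solrClient.query(query);
NamedList<?> responseHeader = response.getResponseHeader();
@SuppressWarnings("unchecked")
List<SimpleOrderedMap<Object>> fields = (List<SimpleOrderedMap<Object>>) response.getResponse().get("fields");
for (SimpleOrderedMap<Object> field : fields) {
    Object fieldName = field.get("name"); // name of one statically defined field
}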

Solr Internals
replicationFactor

The Solr replicationFactor has nothing to do with quorum. Solr uses ZooKeeper's quorum sensing to ensure that all Solr nodes have a consistent picture of the cluster.

openSearcher and hard commit
- A soft commit always opens a new searcher.
- openSearcher only makes sense for hard commits.

Use the Config API to change Solr settings dynamically.

Use the JSON API, but be aware that SolrJ may not work with it in some cases.

Solr Facet APIs
http://yonik.com/json-facet-api/
http://yonik.com/solr-facet-functions/

Don't forget facet.mincount=1

Solr Nested Objects
Define _root_ field

Use [child] - ChildDocTransformerFactory to return child documents

Admin and Collection APIs
/admin/collections?action=CLUSTERSTATUS

curl http://localhost:8983/solr/mycollection/update -X POST -H 'Content-Type: application/json' --data-binary @atomic.json
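
The atomic.json file is not shown above. One plausible shape, assuming the unique key field is id and we want to set a single field with an atomic update, can be generated like this. The class name and the field name "title" are hypothetical, and the escaping only handles quotes and backslashes.

```java
/** Builds the JSON body for an atomic "set" update of one field. */
class AtomicUpdateJson {

    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Produces [{"id":"<id>","<field>":{"set":"<value>"}}]
    static String setField(String id, String field, String value) {
        return "[{" + quote("id") + ":" + quote(id) + ","
                + quote(field) + ":{" + quote("set") + ":" + quote(value) + "}}]";
    }

    public static void main(String[] args) {
        // Save the output as atomic.json and post it with the curl command above.
        System.out.println(setField("1", "title", "New title"));
    }
}
```

Only the named field is changed; the other stored fields of the document are kept.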

Zookeeper
Clean ZK data
run java -cp zookeeper-3.4.6.jar:conf org.apache.zookeeper.server.PurgeTxnLog  ../zoo_data/ ../zoo_data/ -n 3

Access SolrCloud via SSH tunnel
Create a tunnel to the ZooKeeper and Solr nodes.
- But when SolrJ asks ZooKeeper for the nodes, it still gets back the external Solr addresses, which we can't reach directly.
Workaround: add a conditional breakpoint in CloudSolrClient.sendRequest(SolrRequest, String)
- just before  LBHttpSolrClient.Req req = new LBHttpSolrClient.Req(request, theUrlList);
- and use the code below as the condition. It rewrites the URL list to the tunneled ports, and returning false means the debugger never actually suspends.
theUrlList.clear();
theUrlList.add("http://localhost:18983/solr/searchItems/");
theUrlList.add("http://localhost:28983/solr/searchItems/");

return false;

Solr suggester
It supports filtering on multiple fields: just copy those fields into the suggester's context filter field (contextField).

Troubleshooting
400 Unknown Version - when querying Solr with curl
- You may need to URL-encode the query parameters.
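
A likely cause is an unencoded space in the URL: the HTTP request line is split on spaces, so whatever follows the space gets read as the protocol version. Encoding the query before it goes into the URL avoids this. A minimal sketch with the JDK's URLEncoder (class name, host and collection are assumptions):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Builds a /select URL with a properly encoded q parameter. */
class SelectUrl {

    // URL-encode the raw query so spaces, quotes and brackets are safe in the request line.
    static String build(String collectionUrl, String q) {
        return collectionUrl + "/select?q=" + URLEncoder.encode(q, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr/mycollection",
                "title:\"solr cloud\" AND id:[1 TO 5]"));
    }
}
```

With curl itself, -G together with --data-urlencode "q=..." does the same job.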

Debug Solr Query
http://splainer.io/
