Solr Boost to Improve Search Relevancy

An important step when using Solr is tuning search relevancy.
Boosting Fields: qf
Not all fields are equally important. We can boost some of them, for example the keyword or title fields.
qf=keyword^10 title^5
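As a rough sketch of how these weights might be sent to Solr, the snippet below builds an edismax request with Python's standard library. The host, the core name `mycore`, the helper `build_qf_params`, and the user query `solr boost` are assumptions made for illustration only.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: adjust the host and core name to your own setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_qf_params(user_query, field_boosts):
    """Return edismax parameters whose qf is built from a field -> boost mapping."""
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return {"defType": "edismax", "q": user_query, "qf": qf}


params = build_qf_params("solr boost", {"keyword": 10, "title": 5})
request_url = f"{SOLR_SELECT_URL}?{urlencode(params)}"

print(params["qf"])   # the raw qf value
print(request_url)    # the URL-encoded request
```

Keeping the boosts in a mapping makes it easy to try different weights while tuning.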
Boosting Phrases: field~slop^boost
pf boosts documents in which the query terms appear close together as a phrase. An optional slop factor can be specified directly in pf with the syntax field~slop.
pf=title^20 content^10
This translates a user query like foo bar into:
title:"foo bar"^20 OR content:"foo bar"^10
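To make the translation concrete, here is a small illustration that mimics the expansion in plain Python. It is not Solr's parser: `expand_pf` is a hypothetical helper, and it simply reproduces the simplified OR form shown above, including the optional field~slop^boost syntax.

```python
def expand_pf(user_query, pf):
    """Mimic how pf wraps the whole user query in boosted phrase clauses."""
    clauses = []
    for spec in pf.split():
        # Each spec looks like field, field^boost, field~slop or field~slop^boost.
        field_part, _, boost = spec.partition("^")
        field, _, slop = field_part.partition("~")
        clause = f'{field}:"{user_query}"'
        if slop:
            clause += f"~{slop}"
        if boost:
            clause += f"^{boost}"
        clauses.append(clause)
    return " OR ".join(clauses)


print(expand_pf("foo bar", "title^20 content^10"))
print(expand_pf("foo bar", "title~2^20"))
```

The second call shows a slop of 2, which lets the two terms sit a couple of positions apart and still earn the phrase boost.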
Boost Queries
A boost query is an additional query that Solr adds to the user query as an optional clause. It does not change which documents match, but documents that also match the boost query get a higher score.
bq=sponsored:true or bq=instock:true
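Solr accepts the bq parameter more than once, so both boosts can be sent in the same request. The sketch below assumes a user query of `laptop` and reuses the field names from the examples above; it passes the parameters as a list of pairs so that the repeated key survives encoding.

```python
from urllib.parse import urlencode

# A list of pairs (rather than a dict) keeps both bq entries.
bq_params = [
    ("defType", "edismax"),
    ("q", "laptop"),              # assumed user query
    ("qf", "keyword^10 title^5"),
    ("bq", "sponsored:true"),
    ("bq", "instock:true"),
]

bq_query_string = urlencode(bq_params)
print(bq_query_string)
```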
Boost Functions
bf=recip(ms(NOW,publicationDate),3.16e-11,1,1)
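Solr's recip(x,m,a,b) function computes a / (m*x + b), and ms(NOW,publicationDate) is the document's age in milliseconds. Since 3.16e-11 is roughly one divided by the number of milliseconds in a year, the boost starts at 1 for a brand-new document and falls to about one half after a year. The following sketch reproduces that arithmetic outside Solr; the names `recency_boost` and `MS_PER_YEAR` are illustrative.

```python
MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000  # about 3.156e10 milliseconds


def recip(x, m, a, b):
    """Same formula as Solr's recip function: a / (m * x + b)."""
    return a / (m * x + b)


def recency_boost(age_ms):
    """Boost produced by recip(ms(NOW,publicationDate),3.16e-11,1,1)."""
    return recip(age_ms, 3.16e-11, 1, 1)


for years in (0, 1, 2, 5):
    print(years, round(recency_boost(years * MS_PER_YEAR), 3))
```

Because bf is additive in dismax and edismax, its effect depends on how large the other score components are.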
The boost Parameter in edismax
Boost exact match
Copy the content into two fields: one analyzed with LowerCaseFilterFactory for case-insensitive search, and one without it for exact matching. Then give the exact-match field the higher boost in qf, for example:
qf=title titleExact^10
or
qf=title^10 titleExact^100
Minimum 'Should' Match: mm



Boost records that contain all terms: boost when mm=100%
Use pf to boost phrase matches in the same fields that qf searches (author and title in the example below).
Set up a boost query (bq) that raises the score when all the search terms are present.
'q'='_query_:"{!dismax qf=$f1 mm=$mm1 pf=$f1 bq=$bq1 v=$q1}"',
'mm1'='0%',
'f1'='author^3 title^1',
'q1'='Dueber Constructivism',
'bq1'='_query_:"{!dismax qf=$f1 mm=\'100%\' v=$q1 }"^5',
'fl' ='score,*'
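The same parameter set can be written as a Python dictionary and URL-encoded before being sent to Solr. This is only a sketch of the request construction: the values are copied from the example above, and it is up to Solr to resolve references such as $f1 and $q1 through its parameter dereferencing.

```python
from urllib.parse import urlencode, parse_qs

nested_params = {
    # Main query: dismax over $f1 with a permissive mm of 0%.
    "q": '_query_:"{!dismax qf=$f1 mm=$mm1 pf=$f1 bq=$bq1 v=$q1}"',
    "mm1": "0%",
    "f1": "author^3 title^1",
    "q1": "Dueber Constructivism",
    # Boost query: the same search with mm=100%, so only documents
    # containing every term match it and receive the ^5 boost.
    "bq1": "_query_:\"{!dismax qf=$f1 mm='100%' v=$q1 }\"^5",
    "fl": "score,*",
}

nested_query_string = urlencode(nested_params)

# Decoding the query string should give back the original values.
decoded = {key: values[0] for key, values in parse_qs(nested_query_string).items()}
print(decoded == nested_params)
```

Keeping the user's text in q1 means it is written once and shared by both the main query and the boost query.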
Boosting Documents in Solr by Recency, Popularity and User Preferences
Comparing boost methods in Solr

Resources
http://www.lucidworks.com/about-us/news/options-tune-document%E2%80%99s-relevance-solr
http://everydaydeveloper.blogspot.com/2012/02/solr-improve-relevancy-by-boosting.html
http://robotlibrarian.billdueber.com/using-localparams-in-solr-sst-2/#An_example_boost_records_that_contain_all_terms
http://wiki.apache.org/solr/ExtendedDisMax
http://nolanlawson.com/2012/06/02/comparing-boost-methods-in-solr/
http://www.slideshare.net/LucidImagination/boosting-documents-in-solr-by-recency-popularity-and-user-preferences