Solr: Creating a DocTransformer to Show Doc Offset in Response

On the client, a doc's offset can be easily computed: start + (offset in the current response), because Solr's start parameter is already a zero-based document offset rather than a page number.
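As a minimal sketch of that client-side arithmetic: the class and method names below are hypothetical, and the only assumption about Solr is that start counts documents from zero. If the client tracks a zero-based page number instead, start is first derived as page * rows.

```java
public class OffsetMath {
    /** Absolute zero-based position of a doc, given Solr's start and its index in this response. */
    static long absoluteOffset(long start, int indexInResponse) {
        return start + indexInResponse;
    }

    /** Converts a zero-based page number (a client-side notion, not a Solr parameter) to start. */
    static long startForPage(int page, int rows) {
        return (long) page * rows;
    }

    public static void main(String[] args) {
        // start=150&rows=50: the doc at index 6 of this response
        System.out.println(absoluteOffset(150, 6));                   // prints 156
        // the same doc, reached via page 3 with 50 rows per page
        System.out.println(absoluteOffset(startForPage(3, 50), 6));   // prints 156
    }
}
```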

But in some cases it is useful to show the offset explicitly. For example, when debugging search relevancy, a tester may ask why some doc is not shown on the first page. In this case, we may run the query with a large rows value, for example q="Nexus 7"&rows=500, and then search for the doc in the response XML. It would be helpful if the response showed the offset directly, like below:
<doc>
 <str name="id">id3456</str>
 <long name="[offset]">156</long>
</doc>
The field name is [transformername], which is [offset] in this case.
Solr DocTransformers
Solr DocTransformers let us add, change, or remove fields in each doc before the response is returned to the client. Out of the box, Solr provides [explain], [value], [shard], and [docid]. We can easily add our own DocTransformer implementation.

OffsetTransformerFactory Implementation
The implementation is modeled on ValueAugmenterFactory. To use it, we add the transformer to the fl parameter: q="Nexus 7"&fl=id,[offset]
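import java.io.IOException;

import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.params.CommonParams;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.transform.DocTransformer;
import org.apache.solr.response.transform.TransformerFactory;

public class OffsetTransformerFactory extends TransformerFactory {
  private boolean enabled = false;

  public void init(NamedList args) {
    if (args != null) {
      SolrParams params = SolrParams.toSolrParams(args);
      // Stays disabled unless solrconfig.xml sets <bool name="enabled">true</bool>.
      enabled = params.getBool("enabled", false);
      if (!enabled) return;
    }
    super.init(args);
  }

  /*
   * field is [offset] in this case.
   * Note that augmenterArgs holds the local params of this transformer, as in
   * [myTransformer foo=1 bar=good], not the parameters of the SolrQueryRequest.
   */
  public DocTransformer create(String field, SolrParams augmenterArgs,
      SolrQueryRequest req) {
    // start is already a zero-based document offset (not a page number), so it
    // is the offset of the first doc in this response; it defaults to 0 when absent.
    long startOffset = req.getParams().getInt(CommonParams.START, 0);
    return new OffsetTransformer(field, startOffset);
  }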
  
  class OffsetTransformer extends DocTransformer {
    private String field;
    private long startOffset;
    private long offset = 0;
    
    public OffsetTransformer(String field, long startOffset) {
      this.field = field;
      this.startOffset = startOffset;
    }
    public void transform(SolrDocument doc, int docid) throws IOException {
      if (enabled) {
        doc.setField(field, startOffset + offset);
        ++offset;
      }
    }
    public String getName() {
      return OffsetTransformer.class.getName();
    }  
  }
}
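The counter inside OffsetTransformer can be tried without a Solr instance. The sketch below is a plain-Java stand-in, not Solr code: OffsetStamper and stampPage are hypothetical names, and a Map replaces SolrDocument. It assumes, as the transformer does, that docs are transformed one by one in response order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OffsetCounterSketch {
    /** Stand-in for OffsetTransformer: stamps each doc with startOffset + running count. */
    static class OffsetStamper {
        private final String field;
        private final long startOffset;
        private long offset = 0;

        OffsetStamper(String field, long startOffset) {
            this.field = field;
            this.startOffset = startOffset;
        }

        void transform(Map<String, Object> doc) {
            doc.put(field, startOffset + offset);
            ++offset;
        }
    }

    /** Builds a fake page of docs with the given ids and stamps them in order. */
    static List<Map<String, Object>> stampPage(long start, String... ids) {
        OffsetStamper stamper = new OffsetStamper("[offset]", start);
        List<Map<String, Object>> page = new ArrayList<>();
        for (String id : ids) {
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("id", id);
            stamper.transform(doc);
            page.add(doc);
        }
        return page;
    }

    public static void main(String[] args) {
        // Simulates start=154 with three docs in the response.
        for (Map<String, Object> doc : stampPage(154, "id3454", "id3455", "id3456")) {
            System.out.println(doc);
        }
    }
}
```

Because the counter lives in the transformer instance, a new instance is needed for every request, which is what create() provides by returning a fresh OffsetTransformer each time.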
Configuration in solrconfig.xml
<transformer name="offset"
  class="OffsetTransformerFactory">
  <bool name="enabled">true</bool>
</transformer>
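Once the transformer is registered under the name offset, a request only needs [offset] in its fl list. As a rough sketch, the query string for the debugging scenario could be assembled like this; TransformerQuery and buildQuery are hypothetical helper names, and only the q, rows, and fl parameters come from the text above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class TransformerQuery {
    /** Builds a URL-encoded Solr query string; fl entries are joined with commas. */
    static String buildQuery(String q, int rows, String... flEntries)
            throws UnsupportedEncodingException {
        String fl = String.join(",", flEntries);
        return "q=" + URLEncoder.encode(q, "UTF-8")
            + "&rows=" + rows
            + "&fl=" + URLEncoder.encode(fl, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Ask for the id plus the value produced by the [offset] transformer.
        System.out.println(buildQuery("\"Nexus 7\"", 500, "id", "[offset]"));
    }
}
```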
Resources
Solr DocTransformers