Extend Guava TreeMultimap to Get Topn elements Sorted by Values


It is a common requirement that we have a Hashmap, we want only return the topn elements sorted on the values, for example, in the following HashMap, the key is username, the value is user's score. We want to return top 2 users with highest scores.
David->85, Jeffery->85, Tom->90, Vivian->95

In previous article, I extend TreeSet to add <key, value> pair as Map.Entry, and sort it by value, and override add method to limit TreeSet size.


It works well, and with good time and space complexity. But one of my colleague argued that: in a map, we should sort by key not by value, so the key should be score, the value should be user name, the data structure should be sorted multi-value map: there can be multiple values for a same key, and it's sorted by key.


Of course this works, but the performance would be a little worse, as for a same key, there can be multiple values, we have to iterate the value list.


Neither am I good at arguing, nor like arguing. However arguing and negotiation are important skills for a software engineer: they are a part of our life, so I should try my best to improve these skills.  

I think my solution is better, but I can't clearly express why at that time.
Less Learned: when compare 2 implementations, we should compare: time/space complexity, implementation complexity.

Anyway, I would like to implement my colleague's idea. Just curious, and want to know how to implement a sorted multi-value TreeMap, and limit its size.


Google Guava provides TreeMultimap: there can be multiple values for a same key, and it's sorted by keys.  


We only need extend put method to limit its size. It's easy, but the bad part is that TreeMultimap's constructor is package level, to extend it, the subclass must be in the same package.


The code is like below: you can review the complete code at Github.

public class TopnTreeMultimap<K extends Comparable, V extends Comparable>
  extends TreeMultimap<K, V> {
 private final boolean DEBUG = true;
 private static final long serialVersionUID = 1L;

 private int maxSize;

 public TopnTreeMultimap(Comparator<? super K> keyComparator,
   Comparator<? super V> valueComparator, int maxSize) {
  super(keyComparator, valueComparator);
  this.maxSize = maxSize;
 }

 public static <K extends Comparable, V extends Comparable> TopnTreeMultimap<K, V> create(
   int maxSize) {
  return new TopnTreeMultimap<K, V>(Ordering.natural(), Ordering.natural(),
    maxSize);
 }

 @SuppressWarnings("unchecked")
 @Override
 public boolean put(K newKey, V newValue) {
  boolean added = false;
  if (this.size() > maxSize)
   throw new RuntimeException("Code error, should never happen.");
  if (this.size() == maxSize) {
   Iterator<K> keyIterator = this.keySet().iterator();
   K currentMinKey = keyIterator.next();
   if (newKey.compareTo(currentMinKey) <= 0) {
    if (DEBUG) {
     System.out.println("Ignore the put element: " + newKey + " : "
       + newValue);
    }
   } else {
    added = super.put(newKey, newValue);

    if (added) {
     // remove the first element - the min
     Iterator entryiIterator = this.entryIterator();
     entryiIterator.next();
     entryiIterator.remove();
    }
   }
  } else {
   added = super.put(newKey, newValue);
  }
  return added;
 }

 public static void main(String[] args) throws Exception {
  TopnTreeMultimap<Integer, String> instance = TopnTreeMultimap.create(2);
  instance.put(85, "David");
  instance.put(85, "Jeffery");
  instance.put(90, "Tom");
  instance.put(95, "Vivian");
  instance.put(95, "Vivian");

  System.err.println(instance);
  // print {90=[Tom], 95=[Vivian]}
 }
}

Labels

adsense (5) Algorithm (69) Algorithm Series (35) Android (7) ANT (6) bat (8) Big Data (7) Blogger (14) Bugs (6) Cache (5) Chrome (19) Code Example (29) Code Quality (7) Coding Skills (5) Database (7) Debug (16) Design (5) Dev Tips (63) Eclipse (32) Git (5) Google (33) Guava (7) How to (9) Http Client (8) IDE (7) Interview (88) J2EE (13) J2SE (49) Java (186) JavaScript (27) JSON (7) Learning code (9) Lesson Learned (6) Linux (26) Lucene-Solr (112) Mac (10) Maven (8) Network (9) Nutch2 (18) Performance (9) PowerShell (11) Problem Solving (11) Programmer Skills (6) regex (5) Scala (6) Security (9) Soft Skills (38) Spring (22) System Design (11) Testing (7) Text Mining (14) Tips (17) Tools (24) Troubleshooting (29) UIMA (9) Web Development (19) Windows (21) xml (5)