Java: Get HashMap's Topn elements Sorted by Values


It is a common requirement that we have a Hashmap, we want to only return the topn elements sorted by the value.
For example, in the following HashMap, the key is username, the value is user's score. We want to return top 2 users with highest scores.
David->85, Jeffery->85, Tom->90, Vivian->95

We may consider using TreeMap, but it is sorted by the key not the value.

In this article, I will construct a new TreeSet subclass, its element is Entry: a key-value pair, we will sort it by value via a custom Comparator:
new Comparator<Map.Entry<K, V>>() {
 public int compare(Map.Entry<K, V> e1, Map.Entry<K, V> e2) {
  int res = e1.getValue().compareTo(e2.getValue());
  if (res == 0) {
   res = e1.getKey().compareTo(e2.getKey());
  }
  return res;
 }
};
We will limit its size to n, by override add method: if its size is less than n, add the entry, if not:
1. If the value of the new entry is less than the minimum-value element(the first element in the TreeSet), then discard the new entry.
2. If the value of the new entry is larger than the first element in the TreeSet, remove the first element, and add the new entry.

The code is like below. You can review the complete source code at Github:

class TreeSetTopnAscendingByValue<K extends Comparable<? super K>, V extends Comparable<? super V>>
 extends TreeSet<Map.Entry<K, V>> {
private static final boolean DEBUG = false;
private static final long serialVersionUID = 1L;

private int maxSize = Integer.MAX_VALUE;
private Comparator<Map.Entry<K, V>> comparator;

public TreeSetTopnAscendingByValue(Comparator<Map.Entry<K, V>> comparator,
  int maxSize) {
 super(comparator);
 this.comparator = comparator;
 this.maxSize = maxSize;
}

public TreeSetTopnAscendingByValue(int maxSize) {
 this(new Comparator<Map.Entry<K, V>>() {
  public int compare(Map.Entry<K, V> e1, Map.Entry<K, V> e2) {
   int res = e1.getValue().compareTo(e2.getValue());
   if (res == 0) {
    res = e1.getKey().compareTo(e2.getKey());
   }
   return res;
  }
 }, maxSize);
}

public boolean add(K newKey, V newValue) {
 return add(new SimpleEntry<K, V>(newKey, newValue));
}

public boolean add(IKeyValue<K, V> obj) {
 return add(new SimpleEntry<K, V>(obj.getKey(), obj.getValue()));
}

@Override
public boolean add(Entry<K, V> newValue) {
 boolean added = false;
 if (this.size() > maxSize)
  throw new RuntimeException("Code error, should never happen.");
 if (this.size() == maxSize) {
  // we can first check whether the map already contains this element, if
  // yes, ignore.
  // or we can first add, then if the size is increased, remove the first
  // element again, maybe the latter is better : we can run test to compare
  // it.
  Iterator<Entry<K, V>> iterator = this.iterator();
  Entry<K, V> currentMin = iterator.next();
  // only remove currentMin if newValue > currentMin. if equals, do nothing
  // for better performance.
  if (comparator.compare(newValue, currentMin) > 0) {
   added = super.add(newValue);
   if (DEBUG) {
    System.out.println("size: " + this.size() + ", add "
      + newValue.toString() + ", remove: " + currentMin);
   }
   // the added element may already exist, if the map doesn't have this
   // element(added is true), remove currentMin
   // or we can us this.size() > maxSize
   if (added) {
    iterator = this.iterator();
    iterator.next();
    iterator.remove();
   }
  } else {
   // don't add newValue;
   if (DEBUG) {
    System.out.println("size: " + this.size() + ", Skip "
      + newValue.toString());
   }
  }
 } else {
  added = super.add(newValue);
 }
 return added;
}


interface IKeyValue<K, V> {
 public K getKey();

 public V getValue();
}
The test code is like below:
public class HashMapSorterByValue {
// http://stackoverflow.com/questions/2864840/treemap-sort-by-value
public static <K extends Comparable<? super K>, V extends Comparable<? super V>> SortedSet<Map.Entry<K, V>> getTopnbyValue(
  Map<K, V> map, int topn) {
 TreeSetTopnAscendingByValue<K, V> sortedEntries = new TreeSetTopnAscendingByValue<K, V>(
   topn);
 sortedEntries.addAll(map.entrySet());
 return sortedEntries;
}

private static void testSortByValues() {
 TreeSetTopnAscendingByValue<String, Integer> sortedEntries = new TreeSetTopnAscendingByValue<String, Integer>(
   3);
 sortedEntries.add("David", 85);
 sortedEntries.add("Jeffery", 85);
 sortedEntries.add("Tom", 90);
 sortedEntries.add("Vivian", 95);
 sortedEntries.add("Jacky", 95);
 sortedEntries.add("Jacky", 95);

 System.out.println(sortedEntries);

 // test iterate
 Iterator<Entry<String, Integer>> iterator = sortedEntries.iterator();
 while (iterator.hasNext()) {
  Entry<String, Integer> entry = iterator.next();
  System.out.println(entry.getKey() + " : " + entry.getValue());
 }
 sortedEntries.remove(new SimpleEntry<String, Integer>("Jacky", 95));
 System.out.println(sortedEntries);
}

private static void testSrotedByPeopleScore() {
 TreeSetTopnAscendingByValue<String, Double> sortedEntries = new TreeSetTopnAscendingByValue<String, Double>(
   2);
 Person p = new Person("David", 85);
 sortedEntries.add(p);
 p = new Person("Jeffery", 85);
 sortedEntries.add(p);
 p = new Person("Tom", 90);
 sortedEntries.add(p);
 p = new Person("Vivian", 95);
 sortedEntries.add(p);

 assertEquals(2, sortedEntries.size());
 Iterator<Entry<String, Double>> iterator = sortedEntries.iterator();
 assertEquals(90.0, iterator.next().getValue());
 assertEquals(95.0, iterator.next().getValue());
 System.out.println(sortedEntries);
}

class Person implements IKeyValue<String, Double> {
 private String name;
 private double score;

 public Person(String name, double score) {
  this.name = name;
  this.score = score;
 }

 public String getName() {
  return name;
 }

 public String getKey() {
  return getName();
 }

 public Double getValue() {
  return score;
 }

 @Override
 public String toString() {
  return name + ":" + score;
 }
}

Labels

adsense (5) Algorithm (69) Algorithm Series (35) Android (7) ANT (6) bat (8) Big Data (7) Blogger (14) Bugs (6) Cache (5) Chrome (19) Code Example (29) Code Quality (7) Coding Skills (5) Database (7) Debug (16) Design (5) Dev Tips (63) Eclipse (32) Git (5) Google (33) Guava (7) How to (9) Http Client (8) IDE (7) Interview (88) J2EE (13) J2SE (49) Java (186) JavaScript (27) JSON (7) Learning code (9) Lesson Learned (6) Linux (26) Lucene-Solr (112) Mac (10) Maven (8) Network (9) Nutch2 (18) Performance (9) PowerShell (11) Problem Solving (11) Programmer Skills (6) regex (5) Scala (6) Security (9) Soft Skills (38) Spring (22) System Design (11) Testing (7) Text Mining (14) Tips (17) Tools (24) Troubleshooting (29) UIMA (9) Web Development (19) Windows (21) xml (5)