Java equals and hashcode


Java equals and hashcode

Obey the general contract when overriding equals
When we don't need to implement equals?
• Each instance of the class is inherently unique. This is true for classes such as Thread that represent active entities rather than values.
• You don’t care whether the class provides a “logical equality” test, java.util.Random doesn't implement equals.
• A superclass has already overridden equals, and the superclass behavior
is appropriate for this class.
• The class is private or package-private, and you are certain that its equals method will never be invoked.
One kind of value class that does not require the equals method to be overridden is a class that uses instance control(Singleton) to ensure that at most one object exists with each value. Enum types fall into this category. For these classes, logical equality is the same as object identity, so Object’s equals method functions as a logical equals method.
When is it appropriate to override Object.equals?
When a class has a notion of logical equality that differs from mere object identity and a superclass has not already overridden equals to implement the desired behavior.
Not only is overriding the equals method necessary to satisfy programmer expectations; it enables instances to serve as map keys or set elements with predictable, desirable behavior.
The equals method implements an equivalence relation. It is:
Reflexive: For any non-null reference value x, x.equals(x) must return true.
If you were to violate it and then add an instance of your class to a collection, the collection’s contains method might well say that the collection didn’t contain the instance that you just added.
Symmetric: For any non-null reference values x and y, x.equals(y) must re-
turn true if and only if y.equals(x) returns true.
public final class CaseInsensitiveString {
    private final String s;
    public CaseInsensitiveString(String s) {
       if (s == null) throw new NullPointerException();
       this.s = s;
    }
    // Broken - violates symmetry!
    @Override public boolean equals(Object o) {
       if (o instanceof CaseInsensitiveString) return s
           .equalsIgnoreCase(((CaseInsensitiveString) o).s);
       if (o instanceof String) // One-way interoperability!
       return s.equalsIgnoreCase((String) o);
       return false;
    }
    // To eliminate the problem, merely remove the ill-conceived attempt to
    // inter-operate with String from the equals method.
    // @Override public boolean equals(Object o) {
    // return o instanceof CaseInsensitiveString
    // && ((CaseInsensitiveString) o).s.equalsIgnoreCase(s);
    // }
    public static void main(String[] args) {
       CaseInsensitiveString cis = new CaseInsensitiveString(“Polish”);
       String s = “polish”;
       List list = new ArrayList();
       list.add(cis);
       // this would depend on how contains is implemented.
       // it calls CaseInsensitiveString.equals(true), or String.equals(false)
       // in sun jdk, it would return false.
       System.err.println(list.contains(s));
    }
}
Transitive: For any non-null reference values x, y, z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) must return true.
Consider the case of a subclass that adds a new value component to its superclass, the subclass adds a piece of information that affects equals comparisons.
There is no way to extend an instantiable class and add a value component while preserving the equals contract, unless you are willing to forgo the benefits of object-oriented abstraction.
public class ColorPoint extends Point {
    private final Color color;
    public ColorPoint(int x, int y, Color color) {
       super(x, y);
       this.color = color;
    }
    // Broken - violates symmetry!
    // System.err.println(p.equals(cp)); // return true
    // System.err.println(cp.equals(p)); // return false
    // @Override public boolean equals(Object o) {
    // if (!(o instanceof ColorPoint)) return false;
    // return super.equals(o) && ((ColorPoint) o).color == color;
    // }
    // Broken - violates transitivity!
    // System.err.println(p1.equals(p2));// return true
    // System.err.println(p3.equals(p2));// return true
    // System.err.println(p1.equals(p3));// return false
    // @Override public boolean equals(Object o) {
    // if (!(o instanceof Point)) return false;
    // // If o is a normal Point, do a color-blind comparison
    // if (!(o instanceof ColorPoint)) return o.equals(this);
    // // o is a ColorPoint; do a full comparison
    // return super.equals(o) && ((ColorPoint) o).color == color;
    // }
    // Broken - violates Liskov substitution principle (page 40)
    @Override public boolean equals(Object o) {
       if (o == null || o.getClass() != getClass()) return false;
       ColorPoint p = (ColorPoint) o;
       return p.getX() == getX() && p.getY() == getY()
           && p.getColor() == getColor();
    }
    public static void main(String[] args) {
       ColorPoint p1 = new ColorPoint(1, 2, Color.RED);
       Point p2 = new Point(1, 2);
       ColorPoint p3 = new ColorPoint(1, 2, Color.RED);
       System.err.println(p1.equals(p2));// return false
       System.err.println(p3.equals(p2));// return false
       System.err.println(p1.equals(p3));// return true
    }
    public Color getColor() {
       return color;
    }
}
Consistent: For any non-null reference values x and y, multiple invocations
of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
• For any non-null reference value x, x.equals(null) must return false, all objects must be unequal to null.
public abstract class AbstractList extends AbstractCollection implements List {
  public boolean equals(Object o) {
    if (o == this)
      return true;
    if (!(o instanceof List))
      return false;
    ListIterator e1 = listIterator();
    ListIterator e2 = ((List) o).listIterator();
    while(e1.hasNext() && e2.hasNext()) {
      E o1 = e1.next();
      Object o2 = e2.next();
      if (!(o1==null ? o2==null : o1.equals(o2)))
       return false;
    }
    return !(e1.hasNext() || e2.hasNext());
  }
  public int hashCode() {
    int hashCode = 1;
    Iterator i = iterator();
    while (i.hasNext()) {
      E obj = i.next();
      hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode());
    }
    return hashCode;
}
}
List ia = new ArrayList();
List ib = new LinkedList();
System.err.println(ia.equals(ib)); // return true
To implement high-quality equals method:
1. Use the == operator to check if the argument is a reference to this object.
If so, return true. This is just a performance optimization, but one that is worth doing if the comparison is potentially expensive.
2. Use the instanceof operator to check if the argument has the correct type.
If not, return false. Typically, the correct type is the class in which the method occurs. Occasionally, it is some interface implemented by this class.
3. Cast the argument to the correct type.
4. For each “significant” field in the class, check if that field of the argument matches the corresponding field of this object.
For primitive fields whose type is not float or double, use the == operator for comparisons; for object reference fields, invoke the equals method recursive-
ly; for float fields, use the Float.compare method; and for double fields, use
Double.compare.
5. When you are finished writing your equals method, ask yourself three questions: Is it symmetric? Is it transitive? Is it consistent? And don’t just ask yourself; write unit tests to check that these properties hold! If they don’t, figure out why not, and modify the equals method accordingly
Always override hashCode when you override equals
Don’t substitute another type for Object in the equals declaration.
You must override hashCode in every class that overrides equals. Failure to do so will result in a violation of the general contract for Object.hashCode, which will prevent your class from functioning properly in conjunction with all hash-based collections, including HashMap, HashSet, and Hashtable.
Here is the contract, copied from the Object specification [JavaSE6]:
Whenever it is invoked on the same object more than once during an execution of an application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execu-
tion of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
Equal objects must have equal hash codes.
Objects with equal hashcode is not required to be equal.
What would happen if 2 equals object have different hashcode? CRUDJ
2 equals object would be insert into same hashmap, we can query it and delete it.
Map m = new HashMap();
m.put(new PhoneNumber(707, 867, 5309), “Jenny”);
// return null
System.err.println(m.get(new PhoneNumber(707, 867, 5309)));
// return false
System.err.println(m.containsKey(new PhoneNumber(707, 867, 5309)));
m.put(new PhoneNumber(707, 867, 5309), “Jenny”);
System.err.println(m.size()); // return 2
m.remove(“Jenny”);
System.err.println(m.size()); // return 2
The get method is likely to look for the phone number in a different hash bucket from the one in which it was stored by the put method. Even if the two instances happen to hash to the same bucket, the get method will almost certainly return null, as HashMap has an optimization that caches the hash code associated with each entry and doesn’t bother checking for object equality if the hash codes don’t match.
What if 2 un-equal objects have same hashcode?
This would decrease hashset. hashmap performance.
// The worst possible legal hash function - never use!
@Override public int hashCode() { return 42; }
Hash tables degenerate to linked lists.
A good hash function tends to produce unequal hash codes for unequal objects.
Ideally, a hash function should distribute any reasonable collection of unequal instances uniformly across all possible hash values.
Do not be tempted to exclude significant parts of an object from the hash code computation to improve performance.
Principals to implement hashcode:
1. Store some constant nonzero value, say, 17, in an int variable called result.
2. For each significant field f in your object (each field taken into account by the equals method, that is), do the following:
a. Compute an int hash code c for the field:
i. If the field is a boolean, compute (f?1:0).
ii. If the field is a byte, char, short, or int, compute (int) f.
iii. If the field is a long, compute (int) (f ^ (f >>> 32)).
iv. If the field is a float, compute Float.floatToIntBits(f).
v. If the field is a double, compute Double.doubleToLongBits(f), and then hash the resulting long as in step 2.a.iii.
vii. If the field is an array, treat it as if each element were a separate field.
That is, compute a hash code for each significant element by applying
these rules recursively, and combine these values per step 2.b. If every
element in an array field is significant, you can use one of the
Arrays.hashCode methods added in release 1.5.
b. Combine the hash code c computed in step 2.a into result as follows:
    result = 31 * result + c;
3. Return result.
4. When you are finished writing the hashCode method, ask yourself whether equal instances have equal hash codes. Write unit tests to verify your intuition!
If equal instances have unequal hash codes, figure out why and fix the problem.
You must exclude any fields that are not used in equals comparisons, or you risk violating the second provision of the hashCode contract.

The multiplication in step 2.b makes the result depend on the order of the fields, yielding a much better hash function if the class has multiple similar fields.

The multiplication in step 2.b makes the result depend on the order of the fields, yielding a much better hash function if the class has multiple similar fields.
For example, if the multiplication were omitted from a String hash function, all anagrams would have identical hash codes. The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance:
31*i==(i<<5)-i. Modern VMs do this sort of optimization automatically.
Post a Comment

Labels

Java (159) Lucene-Solr (112) Interview (61) All (58) J2SE (53) Algorithm (45) Soft Skills (38) Eclipse (33) Code Example (31) Linux (25) JavaScript (23) Spring (22) Windows (22) Web Development (20) Tools (19) Nutch2 (18) Bugs (17) Debug (16) Defects (14) Text Mining (14) J2EE (13) Network (13) Troubleshooting (13) PowerShell (11) Chrome (9) Design (9) How to (9) Learning code (9) Performance (9) Problem Solving (9) UIMA (9) html (9) Http Client (8) Maven (8) Security (8) bat (8) blogger (8) Big Data (7) Continuous Integration (7) Google (7) Guava (7) JSON (7) Shell (7) ANT (6) Coding Skills (6) Database (6) Lesson Learned (6) Programmer Skills (6) Scala (6) Tips (6) css (6) Algorithm Series (5) Cache (5) Dynamic Languages (5) IDE (5) System Design (5) adsense (5) xml (5) AIX (4) Code Quality (4) GAE (4) Git (4) Good Programming Practices (4) Jackson (4) Memory Usage (4) Miscs (4) OpenNLP (4) Project Managment (4) Spark (4) Testing (4) ads (4) regular-expression (4) Android (3) Apache Spark (3) Become a Better You (3) Concurrency (3) Eclipse RCP (3) English (3) Happy Hacking (3) IBM (3) J2SE Knowledge Series (3) JAX-RS (3) Jetty (3) Restful Web Service (3) Script (3) regex (3) seo (3) .Net (2) Android Studio (2) Apache (2) Apache Procrun (2) Architecture (2) Batch (2) Bit Operation (2) Build (2) Building Scalable Web Sites (2) C# (2) C/C++ (2) CSV (2) Career (2) Cassandra (2) Distributed (2) Fiddler (2) Firefox (2) Google Drive (2) Gson (2) How to Interview (2) Html Parser (2) Http (2) Image Tools (2) JQuery (2) Jersey (2) LDAP (2) Life (2) Logging (2) Python (2) Software Issues (2) Storage (2) Text Search (2) xml parser (2) AOP (1) Application Design (1) AspectJ (1) Chrome DevTools (1) Cloud (1) Codility (1) Data Mining (1) Data Structure (1) ExceptionUtils (1) Exif (1) Feature Request (1) FindBugs (1) Greasemonkey (1) HTML5 (1) Httpd (1) I18N (1) IBM Java Thread Dump Analyzer (1) JDK Source Code (1) JDK8 (1) JMX (1) Lazy Developer (1) Mac (1) Machine Learning (1) Mobile (1) My Plan for 2010 (1) Netbeans (1) Notes (1) Operating System (1) Perl (1) Problems (1) Product Architecture (1) Programming Life (1) Quality (1) Redhat (1) Redis (1) Review (1) RxJava (1) Solutions logs (1) Team Management (1) Thread Dump Analyzer (1) Visualization (1) boilerpipe (1) htm (1) ongoing (1) procrun (1) rss (1)

Popular Posts