Don't Use Mutable Objects in HashSet

We know that we should not use a mutable object as a key in a HashMap, and should not use mutable fields in hashCode() if we plan to put the object into a HashSet or HashMap.
Check Consequences when using Mutable Fields in hashCode()
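
As a minimal sketch of that second point (the Tag class and its names are hypothetical, not taken from the code in this post), the following shows a HashSet losing track of an element whose hash code changes after it was added:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class MutableKeyDemo {
    // Hypothetical class for illustration: hashCode() depends on a mutable field.
    static final class Tag {
        private String label;

        Tag(final String label) {
            this.label = label;
        }

        void setLabel(final String label) {
            this.label = label;
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(label);
        }

        @Override
        public boolean equals(final Object obj) {
            return obj instanceof Tag && Objects.equals(label, ((Tag) obj).label);
        }
    }

    /** Returns whether the set still finds the element after it was mutated in place. */
    static boolean foundAfterMutation() {
        final Set<Tag> tags = new HashSet<>();
        final Tag tag = new Tag("a");
        tags.add(tag);
        tag.setLabel("b"); // the hash code changes, but the entry was stored under the old one
        return tags.contains(tag);
    }

    public static void main(final String[] args) {
        System.out.println(foundAfterMutation()); // false
    }
}
```

The element is still physically in the set (its size stays 1), but lookups by the new hash code can no longer reach it.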

TL;DR
Be careful when using mutable objects in a HashSet: bugs hide there.
Instead, use a HashMap.

But subtle bugs may still hide when we put mutable objects into a HashSet.

For example, we save a set of configuration objects to the database (as a JSON string); later we read it back and update it, changing the value of field1 from f1 to f2.

Suppose we define the hashCode and equals methods of the configuration class as below, using only name (the primary field).
@Test
public void testMutableHashSet() throws Exception {
    final Set<Confiuguration> sets = new HashSet<>();

    // We add conf1 to the HashSet and persist it as a JSON string to a db or file.
    Confiuguration conf1 = new Confiuguration();
    conf1.setName("name1");
    conf1.setField1("f1");
    sets.add(conf1);
    System.out.println(sets);

    // Later we want to change conf1's field1 to f2: we read the set from the db and
    // deserialize it into a HashSet.
    // The following code will not work.
    conf1 = new Confiuguration();
    conf1.setName("name1");
    conf1.setField1("f2");

    // Not added, because name1-f1 and name1-f2 have the same hash code and are equal.
    final boolean added = sets.add(conf1);
    System.out.println(added);// false
    System.out.println(sets);// [Configuration [name=name1, field1=f1]]

    // we have to remove it first
    sets.remove(conf1);
    sets.add(conf1);
    System.out.println(sets);
}

class Confiuguration {
    private String name;
    private String field1;
    // getters, setters, toString() and other fields omitted
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }

    @Override
    public boolean equals(final Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null) {
            return false;
        }
        if (getClass() != obj.getClass()) {
            return false;
        }
        final Confiuguration other = (Confiuguration) obj;
        if (name == null) {
            if (other.name != null) {
                return false;
            }
        } else if (!name.equals(other.name)) {
            return false;
        }
        return true;
    }
}

HashSet is actually backed by a HashMap: each set element is stored as a key, and the value is a dummy object:
    private transient HashMap<E,Object> map;

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

Check the put method in the JDK 7 HashMap code: if the map already contains an equal key (same hash code, and equals returns true), it updates the value but does not replace the existing key. HashSet.add calls map.put(e, PRESENT), so the element already in the set is kept and the new one is discarded.
public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
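
To make the "value updated, key kept" behavior concrete, here is a small sketch. The Key class is a hypothetical stand-in whose equals and hashCode use only name, like the configuration class above:

```java
import java.util.HashMap;
import java.util.Map;

final class PutKeepsKeyDemo {
    // Hypothetical key type: equality and hash code use only the name.
    static final class Key {
        final String name;
        final String field1;

        Key(final String name, final String field1) {
            this.name = name;
            this.field1 = field1;
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }

        @Override
        public boolean equals(final Object obj) {
            return obj instanceof Key && name.equals(((Key) obj).name);
        }
    }

    /** Puts two equal keys and returns the single entry the map ends up holding. */
    static Map.Entry<Key, String> putTwice() {
        final Map<Key, String> map = new HashMap<>();
        map.put(new Key("name1", "f1"), "first");
        map.put(new Key("name1", "f2"), "second"); // value replaced, original key kept
        return map.entrySet().iterator().next();
    }

    public static void main(final String[] args) {
        final Map.Entry<Key, String> entry = putTwice();
        System.out.println(entry.getKey().field1 + " -> " + entry.getValue()); // f1 -> second
    }
}
```

Because a HashSet stores its elements as the keys of such a map, the f1 object is the one that survives a second add.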

If we instead define hashCode using only name, but equals using both name and field1, as below:
@Test
public void testMutableHashSet2() throws Exception {
    final Set<Confiuguration1> sets = new HashSet<>();

    // We add conf1 to the HashSet and persist it as a JSON string to a db or file.
    Confiuguration1 conf1 = new Confiuguration1();
    conf1.setName("name1");
    conf1.setField1("f1");
    sets.add(conf1);
    System.out.println(sets);

    // Later we want to change conf1's field1 to f2: we read the set from the db and
    // deserialize it into a HashSet.
    // The following code will not work.
    conf1 = new Confiuguration1();
    conf1.setName("name1");
    conf1.setField1("f2");

    // final boolean added = sets.add(conf1);
    // System.out.println(added);// true
    // System.out.println(sets);// [Confiuguration [name=name1, field1=f1], Confiuguration
    // // [name=name1, field1=f2]]

    // This has no effect: it can't remove name1-f1, because the hash codes are the same
    // but the objects are not equal (the field1 values differ).
    sets.remove(conf1);
    final boolean added = sets.add(conf1);
    System.out.println(added);

    // [Confiuguration [name=name1, field1=f1], Confiuguration [name=name1, field1=f2]]
    System.out.println(sets);
}
class Confiuguration1 {
    private String name;
    private String field1;
    // getters, setters, toString() and other fields omitted
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }

    @Override
    public boolean equals(final Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null) {
            return false;
        }
        if (getClass() != obj.getClass()) {
            return false;
        }
        final Confiuguration1 other = (Confiuguration1) obj;
        if (name == null) {
            if (other.name != null) {
                return false;
            }
        } else if (!name.equals(other.name)) {
            return false;
        }
        if (field1 == null) {
            if (other.field1 != null) {
                return false;
            }
        } else if (!field1.equals(other.field1)) {
            return false;
        }
        return true;
    }
}
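
The TL;DR advice to use a HashMap instead could look roughly like the sketch below. It assumes the name is the identity of a configuration and keys the map by it; the Config class is a simplified, hypothetical stand-in for the classes above:

```java
import java.util.HashMap;
import java.util.Map;

final class ConfigByNameDemo {
    // Simplified, hypothetical stand-in for the configuration class.
    static final class Config {
        final String name;
        final String field1;

        Config(final String name, final String field1) {
            this.name = name;
            this.field1 = field1;
        }
    }

    /** Stores a configuration, then overwrites it with an updated one of the same name. */
    static Map<String, Config> update() {
        final Map<String, Config> byName = new HashMap<>();
        byName.put("name1", new Config("name1", "f1"));
        // put() with an existing key replaces the value, so no remove() is needed
        byName.put("name1", new Config("name1", "f2"));
        return byName;
    }

    public static void main(final String[] args) {
        final Map<String, Config> byName = update();
        System.out.println(byName.size() + " " + byName.get("name1").field1); // 1 f2
    }
}
```

With the name as an immutable String key, the update no longer depends on how the configuration class defines equals and hashCode.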

Data structures:
If order doesn't matter, use a Set rather than a List, and a HashMap rather than a TreeMap.

Using Java Code Quality Tools to Identify Bugs

The Problem
The following code will throw a NullPointerException when the Integer instance field integerFlag is null, because the comparison auto-unboxes it, but the error is difficult to spot when reviewing the code.
    public void method() {
        if (integerFlag == 0) { // or booleanFlag == true
            return;
        }
        //...
    }

The fix is to change the == comparison to Objects.equals(integerFlag, 0).
-- Use Objects.equals to compare for equality, as it is null-safe.
-- Use common utility libraries such as CollectionUtils.isEmpty, etc.
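
A minimal sketch of the difference (the class and method names here are made up for illustration):

```java
import java.util.Objects;

final class NullSafeCompareDemo {
    /** Auto-unboxes the flag, so a null flag throws NullPointerException. */
    static boolean isZeroUnsafe(final Integer flag) {
        return flag == 0;
    }

    /** Null-safe: a null flag is simply not equal to 0. */
    static boolean isZeroSafe(final Integer flag) {
        return Objects.equals(flag, 0);
    }

    public static void main(final String[] args) {
        System.out.println(isZeroSafe(null)); // false
        System.out.println(isZeroSafe(0));    // true
        try {
            isZeroUnsafe(null);
        } catch (final NullPointerException e) {
            System.out.println("NPE from unboxing a null Integer");
        }
    }
}
```

One caveat: Objects.equals compares boxed values with equals(), so both sides must have the same wrapper type; a Long flag compared against the int literal 0 would never be equal.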

But how can we use code analysis tools to catch this kind of error for us?

In Java, we can integrate FindBugs, PMD, and Sonar into Maven, then run mvn site:site site:stage. Developers then have to scan the changed code and fix the reported problems, if needed, before sending it out for review.

This will make life easier for both developers and reviewers.

Tools to help detect bugs
Github link: https://github.com/jefferyyuan/code-quality-mvn

FindBugs
findbugs:findbugs, findbugs:gui, findbugs:check
Extensions
fb-contrib

PMD
pmd:pmd, pmd:cpd

mvn site:site site:stage
Integrate FindBugs and PMD into Maven.
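
One possible way to wire both tools into the site report is a reporting section in pom.xml, roughly as sketched below. This is an assumption about the setup, not the configuration from the linked repository, and versions are left out on purpose: pin ones that match your build.

```xml
<!-- Sketch only: add inside <project> in pom.xml and pin plugin versions for your build. -->
<reporting>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>findbugs-maven-plugin</artifactId>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pmd-plugin</artifactId>
    </plugin>
  </plugins>
</reporting>
```

With something like this in place, mvn site:site should include the FindBugs and PMD reports alongside the other project reports.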

Facebook Infer
brew upgrade opam
brew update && brew upgrade opam
./build-infer.sh java
-- If it fails due to missing packages, use opam install to install them.
Run Infer on a Maven project
mvn clean && infer --debug -- mvn compile -o

Too many open files on OS X
sudo sysctl -w kern.maxfiles=20480
sudo sysctl -w kern.maxfilesperproc=22480
ulimit -S -n 2048

Google Error Prone
https://github.com/google/error-prone/issues/376

Sonar
Code Analysis with SonarQube Plugin
Install and run Sonar server
mvn clean verify sonar:sonar
mvn verify -Pcoverage,jenkins -Dsonar.host.url=http://localhost:9000 sonar:sonar

Install plugins
http://localhost:9000/updatecenter/installed
https://wiki.jenkins-ci.org/display/JENKINS/Static+Code+Analysis+Plug-ins

Checker Framework
Run the Maven example first.
Install the Checker Framework Eclipse plugin.
Use annotations in comments
/*>>>
import org.checkerframework.checker.nullness.qual.*;
import org.checkerframework.checker.regex.qual.*;
*/

Configure Eclipse Compiler Warnings
Enable null analysis, unboxing conversion, missing default in switch, etc.
Leveraging JSR-305 null annotations to prevent NullPointerExceptions
Use @CheckForNull, @Nonnull

Misc && Issues
Use mvn -X to print debug logs, then check them.

maven-compiler-plugin Unsupported major.minor version 52.0
Some plugins may only work with JDK 8 or JDK 7; use export to point JAVA_HOME at JDK 8 or 7 and rerun.


Spring Solr: Using Custom Converters to Serialize object to Json String in Solr

The Problem
There are two kinds of data when saving to Solr: fields that we will search on, and other fields that we will never search on.

Schemaless design
We can save all non-searchable fields into one JSON string field in Solr; this way, we don't have to change the Solr schema every time we add or remove a field.

We want to use an object in the Java class, serialize it to a JSON string when saving to Solr, and deserialize the JSON string back to an object when reading from Solr.

Solution:
As we use Spring Solr to save and restore data to and from Solr, all we need to do is implement and register custom Solr converters that convert the object to a JSON string when saving to Solr, and convert the JSON string back to an object when reading from Solr.

We store the following Message data in Solr:
class Message {
  long id;
  // searchable fields such as activeDate, city etc
  MessageDetails details;
}

class MessageDetails implements Jsonable, Serializable {
  // all non-searchable fields are defined here.
}

// Marker interface: instances are serialized to a JSON string when saved to Solr.
public interface Jsonable {}

Solr Converters
Then we implement the Solr converters as below.
JsonableToStringConverter converts the object into a string; StringToMessageDetailsConverter converts a string to MessageDetails.
public class Converters {
  @WritingConverter
  public enum JsonableToStringConverter implements Converter<Jsonable, String> {
      INSTANCE;
      @Override
      public String convert(final Jsonable source) {
          if (source == null) {
              return null;
          }
          final ObjectMapper mapper = new ObjectMapper();

          try {
              return mapper.writeValueAsString(source);
          } catch (final JsonProcessingException e) {
              logger.error(MessageFormat.format("Unable to serialize to json; source: {0}", source), e);
              throw new BusinessException(ErrorCode.INTERNAL_ERROR, "Unable to serialize to json.");
          }
      }
  }

  @ReadingConverter
  public enum StringToMessageDetailsConverter implements Converter<String, MessageDetails> {
      INSTANCE;

      @Override
      public MessageDetails convert(final String source) {
          if (source == null) {
              return null;
          }
          try {
              return new ObjectMapper().readValue(source, MessageDetails.class);
          } catch (final IOException e) {
              logger.error(MessageFormat.format("Unable to deserialize from json; source: {0}", source), e);
              throw new BusinessException(ErrorCode.INTERNAL_ERROR, "Unable to deserialize from json.");
          }
      }
  }
}
Finally, we register these custom converters with the SolrConverter.
  @Bean
  public static SolrConverter mappingSolrConverter() {
      final MappingContext<? extends SolrPersistentEntity<?>, SolrPersistentProperty> mappingContext =
              new SimpleSolrMappingContext();
      final MappingSolrConverter converter = new MappingSolrConverter(mappingContext);
      final List<Object> converters = new ArrayList<Object>();

      converters.add(JsonableToStringConverter.INSTANCE);
      converters.add(StringToMessageDetailsConverter.INSTANCE);        
      converter.setCustomConversions(new CustomConversions(converters));
      return converter;
  }
