Learning Solr Code: Import Data to Solr

The main logic is in org.apache.solr.servlet.SolrRequestParsers.parse(SolrCore, String, HttpServletRequest):
public SolrQueryRequest parse( SolrCore core, String path, HttpServletRequest req ) throws Exception
  {
    SolrRequestParser parser = standard;
  
    // Collect the request parameters and any content streams.
    ArrayList<ContentStream> streams = new ArrayList<ContentStream>(1);
    SolrParams params = parser.parseParamsAndFillStreams( req, streams );
    SolrQueryRequest sreq = buildRequestFrom( core, params, streams );
    // Handlers may need to know the request path.
    sreq.getContext().put( "path", path );
    return sreq;
  }  
There are multiple SolrRequestParser implementations: FormDataRequestParser, MultipartRequestParser, RawRequestParser, SimpleRequestParser, StandardRequestParser. By default, it uses StandardRequestParser.

In StandardRequestParser.parseParamsAndFillStreams, a GET or HEAD request has its query string parsed into a SolrParams. For how the query string is parsed, see "How SolrParams is created" below.

If it is a POST request carrying ordinary form data, StandardRequestParser.parseParamsAndFillStreams uses FormDataRequestParser to parse the form data and create a SolrParams.
The following curl request is handled by FormDataRequestParser:
curl -d "stream.body=<add><doc><field name='contentid'>content1</field></doc></add>&clientId=client123&batchId=1" http://host:port/solr/update

If the data is uploaded as a file, like below:
curl http://host:port/solr/update -F "fieldName=@data.xml"
The fieldName doesn't matter and can be anything.

StandardRequestParser.parseParamsAndFillStreams uses MultipartRequestParser, which uses Apache Commons FileUpload to create a fileupload.FileItem and then wraps it in a servlet.FileItemContentStream.
How does it determine whether the request is multipart?
It calls ServletFileUpload.isMultipartContent(req), which checks whether the content type starts with "multipart/".

For a POST request that matches none of the formats mentioned above, it uses RawRequestParser, which creates a servlet.HttpRequestContentStream from the request.
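The dispatch rules described above can be summarized in a short sketch. It is not Solr's code: the class ParserDispatchSketch, the enum ParserKind, and the method choose are hypothetical names. Treating application/x-www-form-urlencoded as the "normal" form POST and rejecting other HTTP methods are assumptions made for illustration.

```java
import java.util.Locale;

/** Hypothetical sketch of the dispatch described above; not Solr's actual code. */
class ParserDispatchSketch {
    /** Which kind of parser would handle the request. */
    enum ParserKind { QUERY_STRING, FORM_DATA, MULTIPART, RAW }

    static ParserKind choose(String method, String contentType) {
        String m = method.toUpperCase(Locale.ROOT);
        if (m.equals("GET") || m.equals("HEAD")) {
            // Only the query string is parsed.
            return ParserKind.QUERY_STRING;
        }
        if (m.equals("POST")) {
            String ct = contentType == null ? "" : contentType.toLowerCase(Locale.ROOT);
            if (ct.startsWith("multipart/")) {
                return ParserKind.MULTIPART;   // e.g. curl -F
            }
            if (ct.startsWith("application/x-www-form-urlencoded")) {
                return ParserKind.FORM_DATA;   // e.g. curl -d (assumed content type)
            }
            return ParserKind.RAW;             // anything else: raw request body
        }
        // Assumption: other methods are rejected in this sketch.
        throw new IllegalArgumentException("Unsupported method: " + method);
    }

    public static void main(String[] args) {
        System.out.println(choose("POST", "multipart/form-data; boundary=x"));
        System.out.println(choose("POST", "text/xml"));
    }
}
```

In this sketch, curl -d maps to FORM_DATA because curl sends application/x-www-form-urlencoded by default, while curl -H "Content-Type: text/xml" -d ... falls through to RAW.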

Then SolrRequestParsers.buildRequestFrom reads the stream.file, stream.body, and stream.url parameters and constructs a ContentStreamBase.FileStream, StringStream, or URLStream respectively. The file that stream.file points to must be local to the Solr server.
Subclasses of ContentStreamBase
HttpRequestContentStream
Wrap an HttpServletRequest as a ContentStream
public InputStream getStream() throws IOException {
return req.getInputStream();
}
FileItemContentStream
Wrap a org.apache.commons.fileupload.FileItem as a ContentStream
ContentStreamBase.FileStream
ContentStreamBase.URLStream
ContentStreamBase.StringStream
DocumentAnalysisRequestHandlerTest.ByteStream
Using curl to send request to Solr
curl -d "stream.body=<add><doc><field name=\"id\">id1</field></doc></add>&clientId=client123" http://host:port/solr/update
curl -d "stream.body=<add><commit/></add>&clientId=client123" http://host:port/solr/update
Error:
The value of -d must be enclosed in double quotes because it contains special characters such as <. Otherwise the shell reports an error:
curl -d stream.body=<add><doc><field name=\"id\">id1</field></doc></add>&clientId=client123 http://host:port/solr/update
< was unexpected at this time.

In the stream body, attribute values such as the field name must be enclosed in quotes, like \"id\". The following request fails:
curl -d "stream.body=<add><doc><field name=id>id1</field></doc></add>&clientId=client123" http://host:port/solr/update
org.apache.solr.common.SolrException: Unexpected character 'i' (code 105) in start tag Expected a quote
at [row,col {unknown-source}]: [1,23]
Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character 'i' (code 105) in start tag Expected a quote
at [row,col {unknown-source}]: [1,23]
at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:648)

Correct Usage:
Enclose the value of -d in double quotes.
Use \ to escape special characters: write " as \" and \ as \\.
curl -d "stream.body=2,0,1,0,1,\"c:\\\",1,0,\"c:\",0,1,16 %0D%0A 2,0,1,0,1,\"x:\\\",2,0,\"x:\",0,1,16 &separator=,&fieldnames=omiited&literal.id=9000&stream.contentType=text/csv;charset=utf-8&commit=true" http://localhost:8080/solr/update/csv
Code:
The "Expected a quote" error above is raised by this check in com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(char c):
 // And then a quote:
 if (c != '"' && c != '\'') {
  throwUnexpectedChar(c, SUFFIX_IN_ELEMENT+" Expected a quote");
 }
Upload csv content
curl -d "stream.body=id1&clientId=client123&fieldnames=id" http://host:port/solr/update/csv
Upload XML File
curl -F "fieldName=@data.xml" http://host:port/solr/update
curl -F "fieldName=@data.xml;type=application/xml" http://host:port/solr/update
The ;type=application/xml suffix sets the MIME content type of the file.
curl -F fieldName=@data.xml -F clientId=client123 -F batchId=2 http://host:port/solr/update
Use a separate -F for each form field. The format -F "key1=value1&key2=value2" doesn't work: it sets only one pair, in which the value of key1 is value1&key2=value2.
Delete Data
curl -d "stream.body=<delete><query>*:*</query></delete>&commit=true" http://host:port/solr/update
Curl Usage
-d, --data <data>
(HTTP) Sends the specified data in a POST request to the HTTP server.
--data-binary <data>
(HTTP) This posts data exactly as specified with no extra processing whatsoever.
curl http://host:port/solr/update -H "Content-Type: text/xml" -d @C:\jeffery\data.xml
-F, --form <name=content>
curl -F password=@/etc/passwd www.mypasswords.com
curl -F "name=daniel;type=text/foo" url.com

Set the Request Method: -X POST
Set Request Headers: -H "Authorization: OAuth 2c4419d1aabeec"
View Response Headers: -i
Debug request: -v

-o (lowercase o): the result is saved to the filename provided on the command line
-O (uppercase O): the filename in the URL is used as the filename to store the result
Follow HTTP Location Headers: -L

To POST to a page
curl -d "item=bottle&category=consumer&submit=ok" www.example.com/process.php

Referer & User Agent
curl -e http://some_referring_site.com http://www.example.com/
curl -A "Mozilla/5.0 (compatible; MSIE 7.01; Windows NT 5.0)" http://www.example.com

Limit the Rate of Data Transfer
curl --limit-rate 1000B -O http://www.gnu.org/software/gettext/manual/gettext.html

Continue/Resume a Previous Download: -C -
curl -C - -O http://www.gnu.org/software/gettext/manual/gettext.html

Pass HTTP Authentication in cURL
curl -u username:password URL

Download Files from FTP server
curl -u ftpuser:ftppass -O ftp://ftp_server/public_html/xss.php

List/Download using Ranges
curl ftp://ftp.uk.debian.org/debian/pool/main/[a-z]/

Upload Files to FTP Server
curl -u ftpuser:ftppass -T myfile.txt ftp://ftp.testserver.com
curl -u ftpuser:ftppass -T "{file1,file2}" ftp://ftp.testserver.com
curl ftp://username:password@example.com

Use Proxy to Download a File
curl -x proxysever.test.com:3128 http://google.co.in

References
http://curl.haxx.se/docs/manpage.html
http://curl.haxx.se/docs/httpscripting.html
9 uses for cURL worth knowing
6 essential cURL commands for daily use
15 Practical Linux cURL Command Examples (cURL Download Examples)

Learning Solr Code: SolrParams and NamedList

Design Principle: SolrParams defines the common behavior but does not define how it is implemented, that is, which data structure backs it.

SolrParams.wrapDefaults combines two SolrParams: the primary params and an additional set of defaults. If get(name) can't find a value in the primary params, it falls back to the default SolrParams.
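The fallback lookup can be illustrated with plain maps. This is a minimal sketch, not Solr's implementation: the Params interface, fromMap, and the class name DefaultsLookupSketch are hypothetical stand-ins, and the null handling is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of default-wrapping lookups; not Solr's implementation. */
class DefaultsLookupSketch {
    /** Minimal stand-in for the read side of SolrParams. */
    interface Params {
        String get(String name);
    }

    /** Backs a Params with a plain map. */
    static Params fromMap(Map<String, String> map) {
        return map::get;
    }

    /** Looks in primary first and falls back to defaults when the value is missing. */
    static Params wrapDefaults(Params primary, Params defaults) {
        if (primary == null) return defaults;   // assumption: nothing to wrap
        if (defaults == null) return primary;
        return name -> {
            String value = primary.get(name);
            return value != null ? value : defaults.get(name);
        };
    }

    public static void main(String[] args) {
        Map<String, String> request = new HashMap<>();
        request.put("clientId", "client123");
        Map<String, String> defaults = new HashMap<>();
        defaults.put("update.contentType", "application/xml");
        defaults.put("clientId", "unknown");

        Params params = wrapDefaults(fromMap(request), fromMap(defaults));
        System.out.println(params.get("clientId"));            // value from the request
        System.out.println(params.get("update.contentType"));  // value from the defaults
    }
}
```

The caller only sees one Params object, which is why a request handler can read configured defaults and request parameters through the same get(name) call.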

Before org.apache.solr.handler.RequestHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse) runs, the params from the request are wrapped with the defaults, appends, and invariants params configured for the request handler in solrconfig.xml.

In org.apache.solr.handler.RequestHandlerBase.init(NamedList), it reads the params in the defaults section into the variable defaults, the params in the appends section into the variable appends, and the params in the invariants section into the variable invariants.

In org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse), it wraps defaults, appends, and invariants around the SolrParams in the request:
SolrPluginUtils.setDefaults(req,defaults,appends,invariants);
Example like below:

<requestHandler name="/update/asyncXML" class="solr.AsyncXmlUpdateRequestHandler">
 <str name="unique_field_name">contentid</str>
 <str name="clientId_param_name">clientId</str>
 <lst name="defaults">
  <str name="update.contentType">application/xml</str>
 </lst>
</requestHandler>
The SolrParams seen in RequestHandler.handleRequestBody is a DefaultSolrParams; through it, you can get the key/value pairs defined in solrconfig.xml.
NamedList
Inside the RequestHandler.init(NamedList) method, if you want to access a value defined in the defaults section, you can do either of the following:
1. Run super.init(args);, which reads the defaults section into the SolrParams variable defaults, then call defaults.get("update.contentType");
2. Or run NamedList<?> defaultsNl = (NamedList<?>) args.get("defaults"); and then read defaultsNl.

If you run SolrParams params = SolrParams.toSolrParams(args); the result is a MapSolrParams, not a DefaultSolrParams. This means it doesn't merge the top-level configuration with defaults, appends, and invariants into one SolrParams, so params.get("update.contentType"); returns null.
Change SolrParams in a request
// Copy the request's params into a mutable instance, change it as needed, and set it back.
ModifiableSolrParams newParams = new ModifiableSolrParams(req.getParams());
req.setParams(newParams);
Set Content Type
1. Set default content stream type for a request handler in solrconfig.xml:
<lst name="defaults">
<str name="update.contentType">application/xml</str>
</lst>
2. Set default content stream type for a request handler in code.
In init(NamedList) method of the requestHandler.
setAssumeContentType("application/xml");
3. Set content stream type in url: 
&stream.contentType=application/xml
How SolrParams is created:
org.apache.solr.servlet.SolrRequestParsers.parseQueryString(String, Map<String, String[]>)
        // this input stream emulates to get the raw bytes from the URL as passed to servlet container, it disallows any byte > 127 and enforces to %-escape them:
        final InputStream in = new InputStream() {
          int pos = 0;
          @Override
          public int read() {
            if (pos < len) {
              final char ch = queryString.charAt(pos);
              if (ch > 127) {
                throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "URLDecoder: The query string contains a not-%-escaped byte > 127 at position " + pos);
              }
              pos++;
              return ch;
            } else {
              return -1;
            }
          }
        };  
SolrRequestParsers converts the query string into an InputStream.
SolrRequestParsers.parseFormDataContent(InputStream, long, Charset, Map<String, String[]>) then reads the InputStream, using the variable currentStream to switch between two ByteArrayOutputStream2 instances: keyStream and valueStream. It handles these special characters: &, =, %, +.
/** Makes the buffer of ByteArrayOutputStream available without copy. */
static final class ByteArrayOutputStream2 extends ByteArrayOutputStream {
 byte[] buffer() {
   return buf;
 }
}
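The key/value stream switching described above can be sketched as follows. This is a simplified illustration, not Solr's parseFormDataContent: the class FormDecodeSketch and the method decode are hypothetical, the input is a String rather than an InputStream, and UTF-8 is assumed as the charset.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified sketch of form-data decoding; not Solr's code. */
class FormDecodeSketch {
    static Map<String, List<String>> decode(String data) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        ByteArrayOutputStream keyStream = new ByteArrayOutputStream();
        ByteArrayOutputStream valueStream = new ByteArrayOutputStream();
        ByteArrayOutputStream current = keyStream; // where decoded bytes currently go

        for (int i = 0; i <= data.length(); i++) {
            int ch = i < data.length() ? data.charAt(i) : -1; // -1 marks end of input
            if (ch == '&' || ch == -1) {
                // End of a pair: store it and start over with the key stream.
                if (keyStream.size() > 0 || valueStream.size() > 0) {
                    String key = new String(keyStream.toByteArray(), StandardCharsets.UTF_8);
                    String value = new String(valueStream.toByteArray(), StandardCharsets.UTF_8);
                    result.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
                }
                keyStream.reset();
                valueStream.reset();
                current = keyStream;
            } else if (ch == '=' && current == keyStream) {
                current = valueStream; // the first '=' switches from key to value
            } else if (ch == '+') {
                current.write(' ');
            } else if (ch == '%') {
                if (i + 2 >= data.length()) {
                    throw new IllegalArgumentException("Incomplete %-escape at position " + i);
                }
                int hi = Character.digit(data.charAt(i + 1), 16);
                int lo = Character.digit(data.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid %-escape at position " + i);
                }
                current.write((hi << 4) | lo);
                i += 2; // skip the two hex digits
            } else {
                if (ch > 127) {
                    throw new IllegalArgumentException("Not %-escaped byte > 127 at position " + i);
                }
                current.write(ch);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(decode("q=a+b&fq=x%3Ay&fq=z"));
    }
}
```

Decoding into byte streams first and converting to a String only at the end of each pair is what lets multi-byte %-escaped sequences be interpreted in the chosen charset.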

Snake - Using Trie to Find Words Comprised of Provided Characters

At the Chinese New Year celebration party at our company, the host gave us a word puzzle:
Given 5 characters, S, N, A, K, E (as this is the Year of the Snake), write down all words that are composed only of these 5 characters; each character can occur zero or more times.

This is a fun algorithm question, and it can be solved using a Trie as shown below.

We read words from a dictionary file and build a Trie. To get all words composed of the candidate characters, we traverse the Trie depth-first: for each candidate character present in the first layer, we iterate over the candidate characters present in the second layer, and so on.

When constructing this Trie:
If the trie is going to be searched multiple times for different candidate characters, we insert all words into it.
If we only answer this question once, we insert only the words that are composed solely of the candidate characters.
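For the second option, the filtering step before insertion might look like the sketch below. The class CandidateFilterSketch and its methods are hypothetical names used for illustration; the dictionary here is a small in-memory list rather than a file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch: keep only words made up solely of the candidate characters. */
class CandidateFilterSketch {
    /** Returns true if every character of word is one of the candidates. */
    static boolean isComposedOf(String word, Set<Character> candidates) {
        if (word == null || word.isEmpty()) return false;
        for (char c : word.toCharArray()) {
            if (!candidates.contains(c)) return false;
        }
        return true;
    }

    /** Lower-cases both sides and returns the words worth inserting into the Trie. */
    static List<String> filter(List<String> dictionary, String letters) {
        Set<Character> candidates = new HashSet<>();
        for (char c : letters.toLowerCase(Locale.ROOT).toCharArray()) {
            candidates.add(c);
        }
        List<String> kept = new ArrayList<>();
        for (String word : dictionary) {
            String lower = word.trim().toLowerCase(Locale.ROOT);
            if (isComposedOf(lower, candidates)) {
                kept.add(lower);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("snake", "Sneaks", "snack", "seen", "ask");
        System.out.println(filter(words, "SNAKE"));
    }
}
```

With this pre-filter, every word left in the Trie is already an answer, so the depth-first search is only needed when the Trie holds the full dictionary.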

The code is like below: You can review complete code in Github.
Class Snake

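package org.codeexample.jefferyyuan.algorithm.wordPuzzles;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import org.codeexample.jefferyyuan.common.WordTree;
public class Snake extends WordTree {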
 public Set<String> getValidWords(List<Character> candidates) {
  // change all chars to lower case.
  List<Character> tmp = new ArrayList<>(candidates.size());
  for (Character character : candidates) {
   tmp.add(Character.toLowerCase(character));
  }
  WordNode currentNode = root;
  Map<Character, WordNode> children = currentNode.getChildrenMap();
  Set<String> words = new HashSet<>();
  for (Character candidate : tmp) {
   words.addAll(getValidWords(children.get(candidate), tmp));
  }
  return words;
 }
 private Set<String> getValidWords(WordNode node, List<Character> candidates) {
  Set<String> words = new HashSet<>();
  if (node == null)return words;
  if (node.isWord()) {
   words.add(node.getWord());
  }
  Map<Character, WordNode> children = node.getChildrenMap();
  for (Character candidate : candidates) {
   WordNode childNode = children.get(candidate);
   words.addAll(getValidWords(childNode, candidates));
  }
  return words;
 }

 public Snake(String dictFile) throws IOException, InterruptedException {
  init(dictFile);
 }
 // Insert into the Trie only the words from dictFile that are composed solely of chars
 private void init(String dictFile, List<String> chars) throws Exception {
 }
 // Insert each word from the dictFile into the Trie
 private void init(String dictFile) throws IOException, InterruptedException {
 }
}

Class WordTree
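package org.codeexample.jefferyyuan.common;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
public class WordTree {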
 protected WordNode root;
 public WordTree() {
  root = new WordNode(null, WordNode.TYPE_ROOT);
 }
 public void addWord(String word) {
  if (word == null) return;
  word = word.trim();
  word = fixString(word);
  if ("".equals(word)) return;
  WordNode parentNode = root, currentNode;
  for (int i = 0; i < word.length(); i++) {
   char character = word.charAt(i);
   Map<Character, WordNode> children = parentNode.getChildrenMap();
   if (children.containsKey(character)) {
    currentNode = children.get(character);
   } else {
    currentNode = new WordNode(character, WordNode.TYPE_NON_WORD);
    parentNode.addChild(currentNode);
   }
   parentNode = currentNode;
  }
  parentNode.thisIsAWord();
 }
 /**
  * This method comes from
  * http://logos.cs.uic.edu/340/assignments/Solutions/Wordpopup/curso/trie.java
  */
 public String fixString(String str) {
  int index = 0; // starting index is 0

  // convert the string to lower case
  str = str.toLowerCase();

  // convert the String to an array of chars to easily
  // manipulate each char
  char[] myChars = str.toCharArray(); // holds the old String
  char[] newChars = new char[str.length()]; // will make up the new String

  // loop until every char in myChars is tested
  for (int x = 0; x < myChars.length; x++) {
   // accept all alphabetic characters only
   if (myChars[x] >= 'a' && myChars[x] <= 'z') {
    newChars[index++] = myChars[x];
   }
  }

  // return a String of only the characters copied into newChars;
  // using the whole array would append '\0' for every dropped character
  return new String(newChars, 0, index);
 }

 /**
  * @param prefix
  * @return all words in this tree that starts with the prefix, <br>
  *         if prefix is null, return an empty list, if prefix is empty string,
  *         return all words in this word tree.
  */
 public List<String> wordsPrefixWith(String prefix) {
  List<String> words = new ArrayList<String>();
  if (prefix == null)
   return words;
  prefix = prefix.trim();
  WordNode currentNode = root;
  for (int i = 0; i < prefix.length(); i++) {
   char character = prefix.charAt(i);
   Map<Character, WordNode> children = currentNode.getChildrenMap();
   if (!children.containsKey(character)) {
    return words;
   }
   currentNode = children.get(character);
  }
  return currentNode.subWords();
 }

 /**
  * @param word
  * @return whether this tree contains this word, <br>
  *         if the word is null return false, if word is empty string, return
  *         true.
  */
 public boolean hasWord(String word) {
  if (word == null) return false;
  word = word.trim();
  if ("".equals(word)) return true;
  WordNode currentNode = root;
  for (int i = 0; i < word.length(); i++) {
   char character = word.charAt(i);
   Map<Character, WordNode> children = currentNode.getChildrenMap();
   if (!children.containsKey(character)) {
    return false;
   }
   currentNode = children.get(character);
  }
  // addWord marks the end of a word via the node type, so the word is
  // present only if the node we reached is marked as a word.
  return currentNode.isWord();
 }

 public static class WordNode {
  private Character character;
  private WordNode parent;
  private Map<Character, WordNode> childrenMap = new HashMap<Character, WordNode>();

  private int type;
  public static final int TYPE_ROOT = 0;
  public static final int TYPE_NON_WORD = 1;
  public static final int TYPE_WORD = 2;

  public WordNode(Character character, int type) {
   this.character = character;
   this.type = type;
  }

  /**
   * @return all strings of this sub tree
   */
  public List<String> subWords() {
   List<String> subWords = new ArrayList<String>();
   String prefix = getPrefix();
   List<String> noPrefixSubWords = subWordsImpl();
   for (String noPrefixSubWord : noPrefixSubWords) {
    subWords.add(prefix + noPrefixSubWord);
   }
   return subWords;
  }

  public boolean isWord() {
   return type == TYPE_WORD;
  }

  /**
   * Indicate this node represents a valid word.
   */
  public void thisIsAWord() {
   type = TYPE_WORD;
  }
  public String getWord() {
   if (isWord()) {
    return getPrefix() + character;
   } else {
    throw new RuntimeException("Not a valid word.");
   }
  }
  private String getPrefix() {
   StringBuilder sb = new StringBuilder();
   WordNode parentNode = this.parent;
   while (parentNode != null) {
    if (parentNode.getCharacter() != null) {
     sb.append(parentNode.getCharacter());
    }
    parentNode = parentNode.parent;
   }
   return sb.reverse().toString();
  }

  private List<String> subWordsImpl() {
   List<String> words = new ArrayList<String>();
   // addWord marks word ends via the node type (no null-key child is ever
   // added), so check the type of this node here.
   if (isWord()) {
    words.add(convertToString(this.character));
   }
   for (WordNode node : childrenMap.values()) {
    for (String childWord : node.subWordsImpl()) {
     words.add(convertToString(this.character) + childWord);
    }
   }
   return words;
  }
  public void addChild(WordNode child) {
   child.parent = this;
   childrenMap.put(child.getCharacter(), child);
  }
  public Character getCharacter() {
   return character;
  }
  public Map<Character, WordNode> getChildrenMap() {
   return childrenMap;
  }
  public String toString() {
   return "WordNode [character=" + character + ", type=" + typeToString()
     + ", childrenMap.size=" + childrenMap.size() + "]";
  }
  private String convertToString(Character character) {
   return (character == null) ? "" : String.valueOf(character);
  }
  private String typeToString() {
   String result = "";
   if (type == TYPE_ROOT)
    result = "ROOT";
   else if (type == TYPE_NON_WORD)
    result = "NOT_WORD";
   else if (type == TYPE_WORD)
    result = "WORD";
   return result;
  }
 }
}
References:
TRIE data structure
http://stevedaskam.wordpress.com/2009/05/28/trie-structures/
http://www.technicalypto.com/2010/04/trie-in-java.html

Radix/PATRICIA Trie
Radix/PATRICIA Trie is a space-optimized trie data structure where each node with only one child is merged with its child.
This makes them much more efficient for small sets (especially if the strings are long) and for sets of strings that share long prefixes.
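To make the merging concrete, here is a minimal sketch that builds a plain character trie and then compresses it into a radix tree. All names (RadixSketch, TrieNode, RadixNode, compress) are hypothetical. The sketch merges a single-child node only when it does not itself end a word, which is an assumption needed to keep word boundaries.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: compress a plain trie into a radix tree and compare node counts. */
class RadixSketch {
    /** Plain trie node: one character per edge. */
    static final class TrieNode {
        final Map<Character, TrieNode> children = new TreeMap<>();
        boolean isWord;
    }

    /** Radix node: each edge carries a whole string label. */
    static final class RadixNode {
        final Map<String, RadixNode> children = new TreeMap<>();
        boolean isWord;
    }

    static void insert(TrieNode root, String word) {
        TrieNode node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new TrieNode());
        }
        node.isWord = true;
    }

    /** Merges every chain of non-word, single-child nodes into one labeled edge. */
    static RadixNode compress(TrieNode node) {
        RadixNode result = new RadixNode();
        result.isWord = node.isWord;
        for (Map.Entry<Character, TrieNode> entry : node.children.entrySet()) {
            StringBuilder label = new StringBuilder().append(entry.getKey());
            TrieNode child = entry.getValue();
            // Follow the chain while the child is a pure pass-through node.
            while (!child.isWord && child.children.size() == 1) {
                Map.Entry<Character, TrieNode> only = child.children.entrySet().iterator().next();
                label.append(only.getKey());
                child = only.getValue();
            }
            result.children.put(label.toString(), compress(child));
        }
        return result;
    }

    static int countNodes(TrieNode node) {
        int count = 1;
        for (TrieNode child : node.children.values()) count += countNodes(child);
        return count;
    }

    static int countNodes(RadixNode node) {
        int count = 1;
        for (RadixNode child : node.children.values()) count += countNodes(child);
        return count;
    }

    public static void main(String[] args) {
        TrieNode root = new TrieNode();
        for (String w : new String[] {"snake", "snakes", "sneak"}) insert(root, w);
        System.out.println("trie nodes:  " + countNodes(root));
        System.out.println("radix nodes: " + countNodes(compress(root)));
    }
}
```

For the three words in main, the shared prefix "sn" becomes a single edge and the tails "ake" and "eak" become one edge each, which is where the space saving comes from.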

Start Stop Embedded Jetty Programmatically

Recently I have been trying to package embedded Jetty, solr.war, and solr.home in one package, and to start and shut down the embedded Jetty server dynamically.

For how to package embedded Jetty, solr.war, and solr.home in one package and reduce its size, please refer to:
Part 1: Shrink Solr Application Size
Part 2: Use Proguard to Shrink Solr Application Size
Part 3: Use Pack200 to Shrink Solr Application Size
This article covers how to start and shut down the embedded Jetty server dynamically.

There are several things we need to consider:
1. How to make sure only one instance is running?
For each unzipped package, the user should only run it once; if the user clicks run.bat again, it should report that the application is already running.

The basic idea, taken from an article I found, is to create a lock file and lock it while the application is running. When the application ends or is closed, it releases the lock and deletes the lock file. If the application is not able to create and lock the file, it means the application is already running.
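The lock-file idea can be sketched in isolation as follows. The class SingleInstanceLockSketch and its method names are hypothetical, and the sketch wraps IOException in UncheckedIOException purely to keep the example short.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

/** Hypothetical sketch of a single-instance guard based on a locked file. */
class SingleInstanceLockSketch {
    private final File lockFile;
    private FileChannel channel;
    private FileLock lock;

    SingleInstanceLockSketch(File lockFile) {
        this.lockFile = lockFile;
    }

    /** Returns true if the lock was obtained, false if someone else already holds it. */
    boolean tryAcquire() {
        try {
            channel = new RandomAccessFile(lockFile, "rw").getChannel();
            try {
                lock = channel.tryLock(); // null when another process holds the lock
            } catch (OverlappingFileLockException e) {
                lock = null;              // already locked from inside this JVM
            }
            if (lock == null) {
                channel.close();
                channel = null;
                return false;
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Releases the lock and deletes the lock file; intended for a shutdown hook. */
    void release() {
        try {
            if (lock != null) {
                lock.release();
                lock = null;
            }
            if (channel != null) {
                channel.close();
                channel = null;
            }
            lockFile.delete();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File file = new File(System.getProperty("java.io.tmpdir"), "app.lock");
        SingleInstanceLockSketch guard = new SingleInstanceLockSketch(file);
        System.out.println(guard.tryAcquire() ? "Started" : "Application is already running");
        guard.release();
    }
}
```

Calling release() from a hook registered with Runtime.getRuntime().addShutdownHook(...) covers normal exits; if the process is killed, the operating system drops the lock, so a stale file alone does not block the next start.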
2. Dynamic Port
If the port the user gives is not available, perhaps because it is already occupied by another application, we try to find a free port in a range.
How to check whether a port is available?
We can try to create a ServerSocket bound to that port: if it throws an exception, the port is already occupied; if not, the port is free.
ServerSocket socket = new ServerSocket(port);
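A minimal sketch of this check, plus a linear scan over a port range, is shown below. The class PortFinderSketch and the methods isPortFree and findFreePort are hypothetical names; returning -1 when nothing is free is a choice made for this sketch.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical sketch: probe ports by trying to bind a ServerSocket. */
class PortFinderSketch {
    /** Returns true if a ServerSocket could be bound to the port right now. */
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;  // bound successfully; the socket is closed on exit
        } catch (IOException e) {
            return false; // bind failed, most likely because the port is in use
        }
    }

    /** Returns the first free port in [from, to], or -1 if none is free. */
    static int findFreePort(int from, int to) {
        for (int port = from; port <= to; port++) {
            if (isPortFree(port)) {
                return port;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("First free port: " + findFreePort(5000, 50000));
    }
}
```

The probe socket is closed before the real server binds, so another process can still grab the port in between; retrying the server start with a new port when binding fails covers that window.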
3. How to start and shut down embedded Jetty
The code is shown below.
To make sure Jetty binds to the port exclusively, we need to create a SelectChannelConnector and call setReuseAddress(false).

To make the Jetty server shut itself down on a valid request, add a ShutdownHandler with SHUTDOWN_PASSWORD.

Then later, users can send a POST request to http://host:port/shutdown?token=SHUTDOWN_PASSWORD&_exitJvm=true. The _exitJvm=true parameter makes the JVM exit as well, so the application ends.

Server server = new Server();
// add solr.war
WebAppContext webapp1 = new WebAppContext();
webapp1.setContextPath("/solr");
webapp1.setWar(solrWarPath);

HandlerList handlers = new HandlerList();
handlers.setHandlers(new Handler[] {webapp1,
  new ShutdownHandler(server, SHUTDOWN_PASSWORD)});
server.setHandler(handlers);

SelectChannelConnector connector = new SelectChannelConnector();
connector.setReuseAddress(false);
connector.setPort(port);
server.setConnectors(new Connector[] {connector});

server.start();
Complete Code
The main code to start and shutdown jetty is like below. 
The complete code including scripts to start and shutdown jetty, and configuration file is in Github.
package org.codeexample.jeffery.misc;
public class EmbeddedSolrJettyServer {
  
  private static final String SYS_APP_HOME = "app.home";
  private static final String SYS_SOLR_SOLR_HOME = "solr.solr.home";
  /**
   * The following four parameters can be configured on the command line
   */
  private static final String ARG_EXIT_AFTER = "exitAfter";
  private static final String ARG_PORT = "port";
  private static final String ARG_DYNAMIC_PORT = "dynamicPort";
  private static final String ARG_LOCALONLY = "localOnly";
  
  private static final String ARG_SEARCH_PORT_START_RANGE = "searchPortStartRange";
  private static final String ARG_SEARCH_PORT_END_RANGE = "searchPortEndRange";
  
  /**
   * hidden parameter, which will only take effect once, will be removed after
   * that.
   */
  private static final String ARG_DEBUG = "debug";
  
  private static final int DEFAULT_PORT = 8080;
  private static final String CONFILE_FILE = "/etc/config.properties";
  private static final String LOCK_FILE  = "app.lock";
  private static final String RESULT_FILE = "result";
  
  private static final int DEFAULT_SEARCH_PORT_START_RANGE = 5000;
  private static final int DEFAULT_SEARCH_PORT_END_RANGE = 50000;
  
  private static File lockFile;
  private static FileChannel channel;
  private static FileLock lock;
  
  private static String SHUTDOWN_PASSWORD = "shutdown_passwd";
  private String appBaseLocation;
  boolean debug = false;
  public static void main(String[] args) throws Exception {
    
    EmbeddedSolrJettyServer instance = new EmbeddedSolrJettyServer();
    instance.handleRequest(args);
  }
  
  private void handleRequest(String[] args) throws Exception {
    appBaseLocation = getBaseLocation();
    
    if (args.length < 1) {
      exitWithError("No arguments.");
    }
    
    String str = args[0];
    if ("start".equalsIgnoreCase(str)) {
      startServer(args);
    } else if ("shutdown".equalsIgnoreCase(str)) {
      stopServer();
    } else if ("-h".equalsIgnoreCase(str)) {
      printUsage();
    }
  }
  
  private void stopServer() throws FileNotFoundException, IOException {
    Properties properties = readProperties();
    
    String str = properties.getProperty(ARG_PORT);
    if (str != null) {
      shutdown(Integer.valueOf(str), SHUTDOWN_PASSWORD);
    } else {
      System.err.println("Can't read port from properties file.");
    }
  }
  
  private void printUsage() {
    System.err.println("Usage:");
    System.err
        .println("Start Server: java -classpath \"lib\\*;startjetty.jar\" org.codeexample.jeffery.misc.EmbeddedSolrJettyServer start -port port -exitAfter number -dynamicPort");
    System.err
        .println("Shutdown Server: java -classpath \"lib\\*;startjetty.jar\" org.codeexample.jeffery.misc.EmbeddedSolrJettyServer shutdown");
  }
  
  /**
   * First read from the config file, then use command line arguments to
   * overwrite it; if a value does not exist, set the default value.
   */
  private Properties getOptions(String[] args) throws IOException {
    Properties properties = new Properties();
    // put default values
    properties.setProperty(ARG_PORT, String.valueOf(DEFAULT_PORT));
    properties.setProperty(ARG_DYNAMIC_PORT, "false");
    
    properties.setProperty(ARG_SEARCH_PORT_START_RANGE,
        String.valueOf(DEFAULT_SEARCH_PORT_START_RANGE));
    properties.setProperty(ARG_SEARCH_PORT_END_RANGE,
        String.valueOf(DEFAULT_SEARCH_PORT_END_RANGE));
    
    properties.putAll(readProperties());
    // remove exitAfter, as this is a dangerous value; it should take effect
    // only when the user sets it explicitly via the command line, not from config.properties
    properties.remove(ARG_EXIT_AFTER);
    
    // ignore localOnly from config.properties
    properties.setProperty(ARG_LOCALONLY, "false");
    // code comes from
    // http://journals.ecs.soton.ac.uk/java/tutorial/java/cmdLineArgs/parsing.html
    
    // the first arg is start
    int i = 1;
    String arg;
    while (i < args.length && args[i].startsWith("-")) {
      arg = args[i++];
      if (arg.equalsIgnoreCase("-" + ARG_DYNAMIC_PORT)) {
        if (i < args.length) {
          arg = args[i++];
          Boolean dynamicPort = Boolean.FALSE;
          try {
            dynamicPort = Boolean.valueOf(arg);
          } catch (Exception e) {
            dynamicPort = Boolean.FALSE;
          }
          properties.setProperty(ARG_DYNAMIC_PORT, dynamicPort.toString());
          
        } else {
          exitWithError("No value is specified for " + ARG_DYNAMIC_PORT);
        }
      } else if (arg.equalsIgnoreCase("-" + ARG_LOCALONLY)) {
        if (i < args.length) {
          arg = args[i++];
          Boolean localOnly = Boolean.FALSE;
          try {
            localOnly = Boolean.valueOf(arg);
          } catch (Exception e) {
            localOnly = Boolean.FALSE;
          }
          properties.setProperty(ARG_LOCALONLY, localOnly.toString());
          
        } else {
          exitWithError("No value is specified for " + ARG_LOCALONLY);
        }
      } else if (arg.equalsIgnoreCase("-" + ARG_PORT)) {
        if (i < args.length) {
          arg = args[i++];
          try {
            int port = Integer.parseInt(arg);
            if (port < 0) {
              exitWithError("Parameter " + ARG_PORT + ":" + arg
                  + " is not valid.");
            }
            properties.setProperty(ARG_PORT, arg);
          } catch (Exception e) {
            exitWithError("Parameter " + ARG_PORT + ":" + arg + " is not valid.");
          }
        } else {
          exitWithError("No value is specified for " + ARG_PORT);
        }
      } else if (arg.equalsIgnoreCase("-" + ARG_EXIT_AFTER)) {
        if (i < args.length) {
          arg = args[i++];
          try {
            long seconds = Long.parseLong(arg);
            if (seconds < 0) {
              exitWithError("Parameter " + ARG_EXIT_AFTER + ":" + arg
                  + " is not valid.");
            }
            properties.setProperty(ARG_EXIT_AFTER, arg);
          } catch (Exception e) {
            exitWithError("Parameter " + ARG_EXIT_AFTER + ":" + arg
                + " is not valid.");
          }
        } else {
          exitWithError("No value is specified for " + ARG_EXIT_AFTER);
        }
      } else if (arg.equalsIgnoreCase("-" + ARG_DEBUG)) {
        if (i < args.length) {
          arg = args[i++];
          debug = false;
          try {
            debug = Boolean.parseBoolean(arg);
          } catch (Exception e) {
            debug = false;
          }
          properties.setProperty(ARG_DEBUG, Boolean.toString(debug));
          
        } else {
          exitWithError("No value is specified for " + ARG_DEBUG);
        }
      }
    }
    return properties;
  }
  
  @SuppressWarnings("resource")
  private void startServer(String[] args) throws Exception {
    Properties properties = getOptions(args);
    // check whether the application is already running
    lockFile = new File(appBaseLocation + LOCK_FILE);
    // Check if the lock exist
    if (lockFile.exists()) {
      // if exist try to delete it
      lockFile.delete();
    }
    // Try to get the lock
    channel = new RandomAccessFile(lockFile, "rw").getChannel();
    lock = channel.tryLock();
    if (lock == null) {
      // File is locked by another application
      channel.close();
      properties = readProperties();
      String portStr = properties.getProperty(ARG_PORT);

      if (portStr != null) {
        printSuccess(portStr, "Application is already running");
        return;
      }
    }
    // Add shutdown hook to release lock when application shutdown
    ShutdownHook shutdownHook = new ShutdownHook();
    Runtime.getRuntime().addShutdownHook(shutdownHook);
    
    try {
      Integer port = Integer.parseInt(properties.getProperty(ARG_PORT));
      boolean dynamicPort = Boolean.valueOf(properties
          .getProperty(ARG_DYNAMIC_PORT));
      boolean localOnly = Boolean
          .valueOf(properties.getProperty(ARG_LOCALONLY));
      
      int searchFrom = Integer.parseInt(properties
          .getProperty(ARG_SEARCH_PORT_START_RANGE));
      int searchTo = Integer.parseInt(properties
          .getProperty(ARG_SEARCH_PORT_END_RANGE));
      
      String sorlHome = appBaseLocation + "solr-home";
      if (!new File(sorlHome).exists()) {
        exitWithError("Solr home " + sorlHome
            + " doesn't exist or is not a folder.");
      }      
      // create logs directory
      File logs = new File(appBaseLocation, "logs");
      if (!logs.isDirectory()) {
        logs.mkdir();
      }
      
      Server server = null;
      if (dynamicPort) {
        // try 10 times
        for (int i = 0; i < 10; i++) {
          if (port == null) {
            port = findUnusedPort(searchFrom, searchTo);
          }
          if (port == null) {
            continue;
          }
          try {
            server = doStartEmbeddedJetty(sorlHome, port, localOnly, properties);
            // break if start the server successfully.
            break;
          } catch (Throwable e) {
            port = null;
            continue;
          }
        }
      } else {
        if (port == null) {
          // should not happen
          exitWithError("In no-dynamicPort mode, a valid port must be specified in command line or config.properties.");
        }
        server = doStartEmbeddedJetty(sorlHome, port, localOnly, properties);
      }
      if (server != null) {
        properties.setProperty(ARG_PORT, port.toString());
        writeProperties(properties);
        printSuccess(port.toString(), "Server is started at port: " + port);
        server.join();
      } else {
        exitWithError("Unable to find available port.");
      }
    } catch (Throwable e) {
      if (debug) {
        e.printStackTrace();
      }
      exitWithError(e.getMessage());
    }
  }
  
  /**
   * From http://download.eclipse.org/jetty/stable-9/apidocs/org/eclipse/jetty/
   * server/handler/ShutdownHandler.html
   * 
   * @param shutdownCookie
   */
  private static void shutdown(int port, String shutdownCookie) {
    try {
      URL url = new URL("http://localhost:" + port + "/shutdown?token="
          + shutdownCookie + "&_exitJvm=true");
      HttpURLConnection connection = (HttpURLConnection) url.openConnection();
      connection.setRequestMethod("POST");
      connection.getResponseCode();
      System.out.println("Success=True");
      System.out.print("Message=Server (" + port + ") is shutdown");
    } catch (SocketException e) {
      System.out.println("Success=True");
      System.out.print("Message=Server is already not running.");
    } catch (Exception e) {
      System.out.println("Success=False");
      System.out.print("Message=" + e.getMessage());
    }
  }
  
  private Server doStartEmbeddedJetty(String sorlHome, int port, boolean localOnly, Properties properties) throws Throwable {
    System.setProperty(SYS_APP_HOME, appBaseLocation);
    System.setProperty(SYS_SOLR_SOLR_HOME, sorlHome);
    System.setProperty("jetty.port", String.valueOf(port));
    String localhost = "localhost";
    if (localOnly) {
      System.setProperty("jetty.host", localhost);
    } else {
      System.setProperty("jetty.host", "0.0.0.0");
    }
    Server server = null;
    Properties oldProperties = readProperties();
    try {
      // write new properties
      writeProperties(properties);      
      server = new Server();
      
      File jettyXmlFile = new File(appBaseLocation + "/etc/", "jetty.xml");
      XmlConfiguration configuration = new XmlConfiguration(jettyXmlFile
          .toURI().toURL());
      configuration.configure(server);
      
      HandlerList handlers = new HandlerList();
      
      // add solr.war
      // WebAppContext webapp1 = new WebAppContext();
      // webapp1.setContextPath("/solr");
      // webapp1.setWar(solrWar);
      
      File webapps = new File(appBaseLocation + "/webapps/");
      if (webapps.isDirectory()) {
        File[] wars = webapps.listFiles(new FilenameFilter() {
          @Override
          public boolean accept(File dir, String name) {
            return name.endsWith(".war");
          }
        });
        for (File war : wars) {
          String context = war.getName();
          context = context.substring(0, context.length() - 4);
          if (context.length() != 0) {
            WebAppContext webappContext = new WebAppContext();
            webappContext.setContextPath("/" + context);
            webappContext.setWar(war.getAbsolutePath());
            handlers.addHandler(webappContext);
          }
        }
      }
      
      // later add handlers from etc/jetty.xml
      handlers.addHandler(new ShutdownHandler(server, SHUTDOWN_PASSWORD));
      // handlers.setHandlers(new Handler[] {
      // // webapp1,
      // new ShutdownHandler(server, SHUTDOWN_PASSWORD)});
      // handler configured in jetty.xml
      Handler oldHandler = server.getHandler();
      if (oldHandler != null) {
        handlers.addHandler(oldHandler);
      }
      
      server.setHandler(handlers);
      SelectChannelConnector connector = new SelectChannelConnector();
      if (localOnly) {
        connector.setHost(localhost);
      } else {
        connector.setHost(null);
      }
      connector.setReuseAddress(false);
      connector.setPort(port);
      server.setConnectors(new Connector[] {connector});
      
      server.start();
    } catch (Throwable e) {
      if (server != null) {
        server.stop();
      }
      // if failed, restore old properties
      writeProperties(oldProperties);
      if (e instanceof InvocationTargetException) {
        Throwable tmp = e.getCause();
        if (tmp != null) {
          e = tmp;
        }
      }
      throw e;
    }
    return server;
  }
  
  private Integer findUnusedPort(int searchFrom, int searchTo) {
    ServerSocket s = null;
    for (int port = searchFrom; port < searchTo; port++) {
      try {
        s = new ServerSocket(port);
        // Not likely to happen, but if so: s.close throws exception, we will
        // continue and choose another port
        s.close();
        return port;
      } catch (Exception e) {
        continue;
      }
    }
    // No unused port was found in the range.
    return null;
  }
  
  private String getBaseLocation() throws UnsupportedEncodingException {
    File jarPath = new File(EmbeddedSolrJettyServer.class.getProtectionDomain()
        .getCodeSource().getLocation().getPath());
    // startjetty.jar is under /lib directory
    String baseLocation = jarPath.getParentFile().getParent();
    // handle non-ascii character in path, such as Chinese
    baseLocation = URLDecoder.decode(baseLocation,
        System.getProperty("file.encoding"));
    if (!baseLocation.endsWith(File.separator)) {
      baseLocation = baseLocation + File.separator;
    }
    return baseLocation;
  }
  
  private void unlockFile() {
    // release and delete file lock
    try {
      if (lock != null) {
        lock.release();
        channel.close();
        lockFile.delete();
      }
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
  
  /**
   * Write to a file
   * 
   * @param msg
   */
  private void exitWithError(String msg) throws IOException {
    BufferedWriter bw = null;
    try {
      bw = new BufferedWriter(new FileWriter(appBaseLocation + RESULT_FILE));
      bw.write("Success=False");
      System.err.println("Success=False");
      bw.newLine();
      bw.write("Port=Unkown");
      System.err.println("Port=Unkown");
      bw.newLine();
      bw.write("Message=" + msg);
      System.err.println("Message=" + msg);
      bw.flush();
    } finally {
      if (bw != null) {
        bw.close();
      }
    }
    System.exit(-1);
  }
  
  private void printSuccess(String port, String msg) throws IOException {
    BufferedWriter bw = null;
    try {
      bw = new BufferedWriter(new FileWriter(appBaseLocation + RESULT_FILE));
      bw.write("Success=True");
      System.out.println("Success=True");
      bw.newLine();
      bw.write("Port=" + port);
      System.out.println("Port=" + port);
      bw.newLine();
      bw.write("Message=" + msg);
      System.out.println("Message=" + msg);
      bw.flush();
    } finally {
      if (bw != null) {
        bw.close();
      }
    }
  }
  
  private Properties readProperties() throws FileNotFoundException, IOException {
    String propertyFile = appBaseLocation + CONFILE_FILE;
    InputStream is = null;
    
    Properties properties = new Properties();
    try {
      is = new FileInputStream(propertyFile);
      properties.load(is);
      
    } finally {
      if (is != null) {
        is.close();
      }
    }
    return properties;
  }
  
  private class ShutdownHook extends Thread {
    public void run() {
      unlockFile();
    }
  }
  
  /**
   * Only save properties when it starts the application successfully.
   * 
   * @param properties
   * @throws IOException
   */
  private void writeProperties(Properties properties) throws IOException {
    String propertyFile = appBaseLocation + CONFILE_FILE;
    OutputStream os = null;
    try {
      // remove hidden, one-time only parameter.
      properties.remove(ARG_DEBUG);
      os = new FileOutputStream(propertyFile);
      properties.store(os, "");
    } finally {
      if (os != null) {
        os.close();
      }
    }
  }
  
  static class NullPrintStream extends PrintStream {
    public NullPrintStream() {
      super(new OutputStream() {
        public void write(int b) {
          // DO NOTHING
        }
      });
      
    }
    
    @Override
    public void write(int b) {
      // do nothing
    }
    
  }
}

class ConcatOutputStream extends OutputStream {
  private OutputStream stream1, stream2;
  
  public ConcatOutputStream(OutputStream stream1, OutputStream stream2) {
    super();
    this.stream1 = stream1;
    this.stream2 = stream2;
  }
  
  @Override
  public void write(int b) throws IOException {
    stream1.write(b);
    stream2.write(b);
  } 
}

Common Java Code Bugs

When writing code, I have made many simple mistakes, so I am writing them down here to remind myself not to make the same ones again.
Copy & Paste is evil.
Most of the time, the problem happens when I copy and paste code, modify it, but forget to change some places.
Take time to check and review the code before compiling or running tests. This can save a lot of time.
Boolean condition
if(!valid) or if(valid). 
if(a.equals(b)) or if(!a.equals(b))
if(map.isEmpty()) or if(!map.isEmpty())
Use &&, Not &
str != null & str.equalsIgnoreCase("true")
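This throws a NullPointerException when str is null, because & always evaluates both operands, while && stops as soon as the left side is false.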
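The difference can be seen in a minimal sketch. The class and method names below are made up for illustration; only the two operators come from the note above.

```java
class ShortCircuitDemo {
    // Safe: && stops evaluating as soon as the left side is false.
    static boolean isTrueSafe(String str) {
        return str != null && str.equalsIgnoreCase("true");
    }

    // Buggy: & always evaluates both sides, so a null str still reaches equalsIgnoreCase.
    static boolean isTrueBuggy(String str) {
        return str != null & str.equalsIgnoreCase("true");
    }

    public static void main(String[] args) {
        System.out.println(isTrueSafe(null));
        try {
            isTrueBuggy(null);
        } catch (NullPointerException e) {
            System.out.println("NPE from the & version");
        }
    }
}
```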
Forgetting the else statement.
Think about what should be done in the else branch.

NullPointerException
Forgetting to initialize a variable, especially an instance variable.
Check for null, and handle that case.
NPE when unboxing
int value = Long or (Long)obj;
The Long or obj may be null.
Float maxScore = null;
maxScore = docList.maxScore(); // if a primitive float is used here, it may throw NullPointerException
Check whether the collection is null before using a for-loop or iterator.
for(String str: strList) 
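One possible way to guard against both cases is sketched below. The helper names and the choice of default values are assumptions for illustration, not part of any library.

```java
import java.util.Collections;
import java.util.List;

class NullSafetyDemo {
    // Unboxing a null Long throws NullPointerException, so fall back to a default.
    static long toPrimitive(Long boxed, long defaultValue) {
        return boxed == null ? defaultValue : boxed;
    }

    // A for-each loop over a null list throws NullPointerException; treat null as empty.
    static int totalLength(List<String> strList) {
        List<String> safe = strList == null ? Collections.<String>emptyList() : strList;
        int total = 0;
        for (String str : safe) {
            total += str.length();
        }
        return total;
    }
}
```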

Forgetting to shut down the thread pool and wait for it to finish
executor.shutdown();
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.MINUTES);

Where to put executor.shutdown() or server.shutdown()
We have to wait until all tasks have been submitted (or are done) before calling it.
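A minimal sketch of this ordering follows. The pool size, the one-minute timeout, and the class name are arbitrary choices for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ExecutorShutdownDemo {
    // Runs taskCount trivial tasks and returns how many completed.
    static int runTasks(int taskCount) {
        final AtomicInteger completed = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            for (int i = 0; i < taskCount; i++) {
                executor.submit(new Runnable() {
                    @Override
                    public void run() {
                        completed.incrementAndGet();
                    }
                });
            }
        } finally {
            // Call shutdown() only after every task has been submitted:
            // already-submitted tasks still run, later submissions are rejected.
            executor.shutdown();
        }
        try {
            // Block until the submitted tasks finish or the timeout expires.
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```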

Add object more than one time
if (obj != null && (Long) obj == 0) {
  sortedNL.add(label, queryValue);
  queryValue = new NamedList<Object>();
}

sortedNL.add(label, queryValue == null ? new NamedList<Object>() : queryValue);
Forgetting to check preconditions and null pointers - defensive programming
Defensive programming teaches us to check method arguments explicitly whenever we are in doubt.
When to call super.method()
Understand when we should call super.method() and why.
In MyUpdateRequestHandler.init(NamedList):
public void init(NamedList args) {
  if (args != null) {
    SolrParams params = SolrParams.toSolrParams(args);
    Object obj = params.get(PARAM_CLIENTID_PARAM_NAME);
    if (obj != null) clientIdParamName = obj.toString();
  }
  super.init(args);
}
In this case, I have to call super.init(args) last, because the init method in the parent class calls createDefaultLoaders, and my subclass overrides createDefaultLoaders, which needs the clientIdParamName field.
If I called super.init(args) first, clientIdParamName would still be null in createDefaultLoaders, which is not expected.
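The same ordering issue can be reproduced without Solr. In the sketch below, BaseHandler and MyHandler are hypothetical stand-ins for the real handler classes, and a plain Map replaces NamedList.

```java
import java.util.Map;

// Hypothetical stand-in for the parent handler: init() calls an overridable hook.
class BaseHandler {
    private String loaderDescription;

    public void init(Map<String, String> args) {
        loaderDescription = createDefaultLoaders();
    }

    protected String createDefaultLoaders() {
        return "default";
    }

    public String getLoaderDescription() {
        return loaderDescription;
    }
}

class MyHandler extends BaseHandler {
    private String clientIdParamName;

    @Override
    public void init(Map<String, String> args) {
        if (args != null && args.get("clientIdParamName") != null) {
            clientIdParamName = args.get("clientIdParamName");
        }
        // Call the parent last: it invokes createDefaultLoaders(),
        // which reads the field assigned just above.
        super.init(args);
    }

    @Override
    protected String createDefaultLoaders() {
        return "loader for " + clientIdParamName;
    }
}
```

If super.init(args) were moved to the first line of MyHandler.init, the description would read "loader for null".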
Map Key
Once you put a key/value pair in a hash map, you should never change the key in any way that changes its hash code. If the key is changed so that it produces a different hash code, you will no longer be able to locate the bucket in the HashMap that contains the key/value pair.
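A small sketch of the problem, using a deliberately mutable key class invented for this example:

```java
import java.util.HashMap;
import java.util.Map;

class MutableKeyDemo {
    // A deliberately mutable key whose hash code depends on its field.
    static final class Key {
        String name;

        Key(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Returns true if the entry can still be found after the key is mutated.
    static boolean foundAfterMutation() {
        Map<Key, String> map = new HashMap<Key, String>();
        Key key = new Key("a");
        map.put(key, "value");
        key.name = "b"; // the hash code changes, but the entry stays filed under the old one
        return map.containsKey(key);
    }
}
```

Here the lookup fails even though the very same object is used as the key, and the entry is still stored in the map.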
Throw exceptions to signal exceptional conditions instead of using Null flags

Windows Commands/Batch All In One

Command
http://ss64.com/nt/
type  filename
Sleep some time
sleep 5
timeout 5
These two commands are not available on every Windows machine; in practice, we can use ping to cause a delay.
ping 1.1.1.1 -n 1 -w 1000 >NUL 2>NUL
start command: run command in a separate window.
call: Calls one batch program from another without stopping the parent batch program. 
find
find [/v] [/c] [/n] [/i] "string" [[Drive:][Path]FileName[...]]
/v(reverse), /c(count), /n(show line number), /i(case-insensitive)
for %f in (*.bat) do find "PROMPT" %f 
dir c:\ /s /b | find "CPU"
sort
/r, /+n, 
more
+n : Displays first file beginning at the line specified by n.
/c, /s
The following commands are accepted at the more prompt:
SPACEBAR Display next page
ENTER Display next line
f Display next file
= Show line number
p n Display next n lines
s n Skip next n lines
q, ?
Windows Batch

Read until there are 3 lines in file result
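@ECHO OFF
setlocal EnableDelayedExpansion
:readFileLoop
if exist result (
    set /a "x = 0"
    for /F "tokens=*" %%L in (result) do set /a "x = x + 1"
    rem !x! is expanded at run time; %x% would be expanded when the whole block is parsed
    if "!x!" EQU "3" (
        goto :end
    )
) > NUL 2>&1
goto :readFileLoop
:end
endlocal
@ECHO ON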
Using batch files
If statement
CompareOp : 
EQU, NEQ, LSS, LEQ, GTR, GEQ
For for /?
for {%variable|%%variable} in (set) do command [ CommandLineOptions]
Use %variable to carry out for from the command prompt. Use %%variable to carry out the for command within a batch file.
Directories only: for /D
Recursive: for /R
Iterating a range of values: for /L
Iterating and file parsing: for /F
eol=c Specifies an end of line character (just one character).
skip=n Specifies the number of lines to skip at the beginning of the file.
delims=xxx
tokens=x,y,m-n
for /F "eol=; tokens=2,3* delims=, " %i in (myfile.txt) do @echo %i %j %k
This command parses each line in Myfile.txt, ignoring lines that begin with a semicolon and passing the second and third token from each line to the FOR body (tokens are delimited by commas or spaces). The body of the FOR statement references %i to get the second token, %j to get the third token, and %k to get all of the remaining tokens.
Using batch parameters
%0-%9
When you use batch parameters in a batch file, %0 is replaced by the batch file name, and %1 through %9 are replaced by the corresponding arguments that you type at the command line. To access arguments beyond %9, you need to use the shift command. 
modifiers: %~f1, %~d1, %~p1, %~n1, %~x1, %~s1, %~a1, %~t1, %~z1
setlocal/endlocal
Use setlocal to change environment variables when you run a batch file. Environment changes made after you run setlocal are local to the batch file. Cmd.exe restores previous settings when it either encounters an endlocal command or reaches the end of the batch file.
Parse Command Args
:parseParams
set key=%~1
set value=%~2
if "%key%" == "" goto :eof
if "%key%" == "-MY_JAVA_OPTIONS" (
    shift
    shift
    if "%value%" == "" (
        echo Empty value for -MY_JAVA_OPTIONS
        goto end
    )
    SET MY_JAVA_OPTIONS=%value%
    goto parseParams %*
) else (
    rem skip some other parameters
    shift
    goto parseParams %*
)
goto :eof

:parseArgs
REM recursive procedure to split off the first two tokens from the input
if "%*" NEQ "" (
    for /F "tokens=1,2,* delims== " %%i in ("%*") do call :assignKeyValue %%i %%j & call :parseArgs %%k
)
goto :eof
:assignKeyValue
if /i "%1" EQU "-Xmx" (
    SET Xmx=%2
) else if /i "%1" EQU "-Xms" (
    SET Xms=%2
)
goto :eof
SubString: SET var=%var:~10,5%
http://geekswithblogs.net/SoftwareDoneRight/archive/2010/01/30/useful-dos-batch-functions-substring-and-length.aspx
Save current path and restore it later
set currentPath=%cd%
cd /d "%currentPath%"

set MYPATH=%~dp0

Auto Completion - Using Trie to Find Strings Starting with Prefix

We are all familiar with the auto completion function provided by IDEs. For example, in Eclipse, if we type Collections.un, Eclipse lists all methods that start with "un", such as unmodifiableCollection, unmodifiableList, etc.

So how can we implement this function?
How can we repeatedly and efficiently find all strings that start with a given prefix?

Answer:
We need to preprocess the list of strings so that we can search it quickly later.

One way is to sort the string list in alphabetical order. Then, when searching with a prefix (say "app"), we binary search the list for the lower index, the first string that is not less than "app", and for the higher index, the first string that is not less than "apq" (the prefix with its last character incremented). All strings in the range [lower index, higher index) start with the prefix.
Each search takes O(log n), where n is the length of the string list.
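The sorted-list approach can be sketched as follows. This is an illustration rather than the code from the original project: the class name is made up, and it assumes the prefix does not end with the largest possible char value, since the upper bound is built by incrementing the last character.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedPrefixSearch {
    private final List<String> sorted;

    SortedPrefixSearch(List<String> words) {
        sorted = new ArrayList<String>(words);
        Collections.sort(sorted); // preprocess once
    }

    // Index of the first element that is >= key (the list size if there is none).
    private int lowerBound(String key) {
        int lo = 0, hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid).compareTo(key) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    List<String> wordsPrefixWith(String prefix) {
        if (prefix.isEmpty()) {
            return new ArrayList<String>(sorted);
        }
        // Smallest string greater than every string that starts with the prefix:
        // increment the last character, e.g. "app" -> "apq".
        char last = prefix.charAt(prefix.length() - 1);
        String upper = prefix.substring(0, prefix.length() - 1) + (char) (last + 1);
        return new ArrayList<String>(sorted.subList(lowerBound(prefix), lowerBound(upper)));
    }
}
```

Both bounds are found with the same lower-bound search, so each query costs two binary searches plus copying the matches.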

A better way is to build a tree (a trie) from the string list. For example, for the string "append", it would look like this:
  [root node(flag)]
         /
        a
       / \
     [ST] p
          \
          p -- return all strings from this sub tree
         /
         e
         \
         n
        / \
        d [Sub Tree]
       /
[leaf node(flag)]
So when we search for all strings that start with "app", we walk down the tree to the second p node and return all strings in its subtree. The lookup time depends on the length of the prefix and has nothing to do with the length of the string list. This is much better.

Code:
The complete algorithm and test code, along with many other algorithm problems and solutions, are available on GitHub.

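package org.codeexample.jefferyyuan.autocomplete;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
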
public class WordTree {
 private WordNode root;
 public WordTree() {
  root = new WordNode(null);
 }
 /**
  * Add a string into this word tree, if word is null or an empty string, do
  * nothing
  */
 public void addWord(String word) {
  if (word == null)
   return;
  word = word.trim();
  if ("".equals(word)) {
   return;
  }
  WordNode parentNode = root, currentNode;
  for (int i = 0; i < word.length(); i++) {
   char character = word.charAt(i);
   Map<Character, WordNode> children = parentNode
     .getChildrenMap();
   if (children.containsKey(character)) {
    currentNode = children.get(character);
   } else {
    currentNode = new WordNode(character);
    parentNode.addChild(currentNode);
   }
   parentNode = currentNode;
  }
  // at last, add a leaf node - whose character value is null to indicate
  // the end of the word
  currentNode = new WordNode(null);
  parentNode.addChild(currentNode);
 }
 /**
  * @param prefix
  * @return all words in this tree that starts with the prefix, <br>
  *         if prefix is null, return an empty list, if prefix is empty
  *         string, return all words in this word tree.
  */
 public List<String> wordsPrefixWith(String prefix) {
  List<String> words = new ArrayList<String>();
  if (prefix == null)
   return words;
  prefix = prefix.trim();
  WordNode currentNode = root;
  for (int i = 0; i < prefix.length(); i++) {
   char character = prefix.charAt(i);
   Map<Character, WordNode> children = currentNode
     .getChildrenMap();
   if (!children.containsKey(character)) {
    return words;
   }
   currentNode = children.get(character);
  }
  return currentNode.subWords();
 }
 /**
  * @param word
  * @return whether this tree contains this word, <br>
  *         if the word is null return false, if word is empty string, return
  *         true.
  */
 public boolean hasWord(String word) {
  if (word == null)
   return false;
  word = word.trim();
  if ("".equals(word))
   return true;
  WordNode currentNode = root;
  for (int i = 0; i < word.length(); i++) {
   char character = word.charAt(i);
   Map<Character, WordNode> children = currentNode
     .getChildrenMap();
   if (!children.containsKey(character)) {
    return false;
   }
   currentNode = children.get(character);
  }
  // at last, check whether the parent node contains one null key - the
  // leaf node, if so return true, else return false.
  return currentNode.getChildrenMap().containsKey(
    null);
 }
}
class WordNode {
 private Character character;
 private WordNode parent;
 private Map<Character, WordNode> childrenMap = new HashMap<Character, WordNode>();
 public WordNode(Character character) {
  this.character = character;
 }
 /**
  * @return all strings of this sub tree
  */
 public List<String> subWords() {
  List<String> subWords = new ArrayList<String>();
  String prefix = getPrefix();
  List<String> noPrefixSubWords = subWordsImpl();
  for (String noPrefixSubWord : noPrefixSubWords) {
   subWords.add(prefix + noPrefixSubWord);
  }
  return subWords;
 }
 private List<String> subWordsImpl() {
  List<String> words = new ArrayList<String>();
  Iterator<Character> keyIterator = childrenMap
    .keySet().iterator();
  while (keyIterator.hasNext()) {
   Character key = keyIterator.next();
   if (key == null) {
    words.add(convertToString(this.character));
   } else {
    WordNode node = childrenMap.get(key);
    List<String> childWords = node
      .subWordsImpl();
    for (String childWord : childWords) {
     words
       .add(convertToString(this.character)
         + childWord);
    }
   }
  }
  return words;
 }
 public void addChild(WordNode child) {
  child.parent = this;
  childrenMap.put(child.getCharacter(), child);
 }
 public Character getCharacter() {
  return character;
 }
 public WordNode getParent() {
  return parent;
 }
 public Map<Character, WordNode> getChildrenMap() {
  return childrenMap;
 }
 private String convertToString(Character character) {
  return (character == null) ? "" : String
    .valueOf(character);
 }
 private String getPrefix() {
  StringBuilder sb = new StringBuilder();
  WordNode parentNode = this.parent;
  while (parentNode != null) {
   if (parentNode.getCharacter() != null) {
    sb.append(parentNode.getCharacter());
   }
   parentNode = parentNode.parent;
  }
  return sb.reverse().toString();
 }
}
From my old blog.

Use UTF-8 Without BOM for Java Source File

Today, I used Notepad++ to create a Java source file, saved it as UTF-8, and used Ant to compile and run it. But Ant failed with the error:
SimpleClass.java:1: illegal character: \191
public class SimpleClass {
  ^
2 errors

The problem is that the encoding Notepad++ calls UTF-8 is in fact UTF-8 with BOM.
When javac or java reads a file encoded in UTF-8, it assumes the file does not have a BOM, so if a BOM is present it is not discarded and is treated as data.

This is why it reports an illegal character.

To fix this problem, just change the encoding to UTF-8 without BOM.
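When the encoding of an input file cannot be controlled, the BOM can also be dropped in code. The following is a minimal sketch with made-up class and method names; it relies on the fact that the UTF-8 BOM is the byte sequence EF BB BF, which Java's UTF-8 decoder turns into the character U+FEFF instead of discarding it.

```java
import java.nio.charset.StandardCharsets;

class BomUtil {
    private static final char BOM = '\uFEFF';

    // True if the bytes start with the UTF-8 BOM (EF BB BF).
    static boolean hasUtf8Bom(byte[] bytes) {
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }

    // Removes a leading BOM character from already decoded text.
    static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    // Decodes UTF-8 bytes and drops the BOM if one is present.
    static String decodeUtf8(byte[] bytes) {
        return stripBom(new String(bytes, StandardCharsets.UTF_8));
    }
}
```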

According to the Unicode standard, the BOM for UTF-8 files is not recommended.

So Notepad++ should make UTF-8 without BOM its default UTF-8 encoding, and offer UTF-8 with BOM as a separate encoding in case the user needs it.
For us, avoid UTF-8 with BOM if possible.

