How To Encode A URL String in Java

Recently, I am using Apache Http Client and HttpURLConnection to send the following request to remote Solr server:
http://localhost:8080/solr/select?q=extractingat:[2012-11-14T04:08:54.000Z TO 2013-11-14T04:11:05.000Z]&start=0&rows=100
It got IllegalArgumentException like below:
java.lang.IllegalArgumentException: Illegal character in query at index 74: http://localhost:8080/solr/select?q=extractingat:[2012-11-14T04:08:54.000Z TO 2013-11-14T04:11:05.000Z]&start=0&rows=100
 at java.net.URI.create(URI.java:859)
 at org.apache.http.client.methods.HttpGet.<init>(HttpGet.java:69)
Caused by: java.net.URISyntaxException: Illegal character in query at index 74: http://localhost:8080/solr/select?q=extractingat:[2012-11-14T04:08:54.000Z TO 2013-11-14T04:11:05.000Z]&start=0&rows=100
 at java.net.URI$Parser.fail(URI.java:2829)
 at java.net.URI$Parser.checkChars(URI.java:3002)
 at java.net.URI$Parser.parseHierarchical(URI.java:3092)
The problem is because url special characters which should be encoded. Please read URL Encoding about what characters need to be encoded and why?

In Java, we can use URLEncoder to encode special characters. 
To use URLEncoder, we just need pay attention to one thing: which parts should be encoded.
Basic rule is that if these special characters are used for special use, then don't encode them.

In url: <scheme name> : <hierarchical part> [ ? <query> ] [ # <fragment> ], we usually need encode the <query> and <fragment> part. 

For the 2 formats of query string: 
Semicolon format: key1=value1;key2=value2;key3=value3
Ampersand format: key1=value1&key2=value2&key3=value3
We should not encode the ? =, & or ; which is used to separate multiple key value pairs. We should just encode the key and value field.

Resource
URL Encoding
URLEncoder Javadoc
URI scheme
Post a Comment

Labels

Java (159) Lucene-Solr (110) All (60) Interview (59) J2SE (53) Algorithm (37) Eclipse (35) Soft Skills (35) Code Example (31) Linux (26) JavaScript (23) Spring (22) Windows (22) Web Development (20) Tools (19) Nutch2 (18) Bugs (17) Debug (15) Defects (14) Text Mining (14) J2EE (13) Network (13) PowerShell (11) Chrome (9) Continuous Integration (9) How to (9) Learning code (9) Performance (9) UIMA (9) html (9) Design (8) Dynamic Languages (8) Http Client (8) Maven (8) Security (8) Trouble Shooting (8) bat (8) blogger (8) Big Data (7) Google (7) Guava (7) JSON (7) Problem Solving (7) ANT (6) Coding Skills (6) Database (6) Scala (6) Shell (6) css (6) Algorithm Series (5) Cache (5) IDE (5) Lesson Learned (5) Miscs (5) Programmer Skills (5) System Design (5) Tips (5) adsense (5) xml (5) AIX (4) Code Quality (4) GAE (4) Git (4) Good Programming Practices (4) Jackson (4) Memory Usage (4) OpenNLP (4) Project Managment (4) Python (4) Spark (4) Testing (4) ads (4) regular-expression (4) Android (3) Apache Spark (3) Become a Better You (3) Concurrency (3) Eclipse RCP (3) English (3) Firefox (3) Happy Hacking (3) IBM (3) J2SE Knowledge Series (3) JAX-RS (3) Jetty (3) Restful Web Service (3) Script (3) regex (3) seo (3) .Net (2) Android Studio (2) Apache (2) Apache Procrun (2) Architecture (2) Batch (2) Build (2) Building Scalable Web Sites (2) C# (2) C/C++ (2) CSV (2) Career (2) Cassandra (2) Distributed (2) Fiddler (2) Google Drive (2) Gson (2) Html Parser (2) Http (2) Image Tools (2) JQuery (2) Jersey (2) LDAP (2) Life (2) Logging (2) Software Issues (2) Storage (2) Text Search (2) xml parser (2) AOP (1) Application Design (1) AspectJ (1) Bit Operation (1) Chrome DevTools (1) Cloud (1) Codility (1) Data Mining (1) Data Structure (1) ExceptionUtils (1) Exif (1) Feature Request (1) FindBugs (1) Greasemonkey (1) HTML5 (1) Httpd (1) I18N (1) IBM Java Thread Dump Analyzer (1) JDK Source Code (1) JDK8 (1) JMX (1) Lazy Developer (1) Mac (1) Machine Learning (1) Mobile (1) My Plan for 2010 (1) Netbeans (1) Notes (1) Operating System (1) Perl (1) Problems (1) Product Architecture (1) Programming Life (1) Quality (1) Redhat (1) Redis (1) Review (1) RxJava (1) Solutions logs (1) Team Management (1) Thread Dump Analyzer (1) Troubleshooting (1) Visualization (1) boilerpipe (1) htm (1) ongoing (1) procrun (1) rss (1)

Popular Posts