Recently, I was using Apache HttpClient and HttpURLConnection to send the following request to a remote Solr server:
http://localhost:8080/solr/select?q=extractingat:[2012-11-14T04:08:54.000Z TO 2013-11-14T04:11:05.000Z]&start=0&rows=100
It failed with an IllegalArgumentException like the one below:
java.lang.IllegalArgumentException: Illegal character in query at index 74: http://localhost:8080/solr/select?q=extractingat:[2012-11-14T04:08:54.000Z TO 2013-11-14T04:11:05.000Z]&start=0&rows=100
    at java.net.URI.create(URI.java:859)
    at org.apache.http.client.methods.HttpGet.<init>(HttpGet.java:69)
Caused by: java.net.URISyntaxException: Illegal character in query at index 74: http://localhost:8080/solr/select?q=extractingat:[2012-11-14T04:08:54.000Z TO 2013-11-14T04:11:05.000Z]&start=0&rows=100
    at java.net.URI$Parser.fail(URI.java:2829)
    at java.net.URI$Parser.checkChars(URI.java:3002)
    at java.net.URI$Parser.parseHierarchical(URI.java:3092)

The problem is that the URL contains special characters that must be encoded. Please read URL Encoding to learn which characters need to be encoded and why.
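The stack trace shows the HttpGet constructor calling URI.create, so the failure can be reproduced with the JDK alone, without any HTTP client. A minimal sketch; the class name IllegalQueryDemo and the helper parseError are illustrative and not part of the original code:

```java
import java.net.URI;

class IllegalQueryDemo {
    // The unencoded request URL from the post.
    static final String RAW_URL = "http://localhost:8080/solr/select?q=extractingat:"
            + "[2012-11-14T04:08:54.000Z TO 2013-11-14T04:11:05.000Z]&start=0&rows=100";

    /** Returns the parser's error message, or null if the URL parsed cleanly. */
    static String parseError(String url) {
        try {
            URI.create(url);
            return null;
        } catch (IllegalArgumentException e) {
            // URI.create wraps the underlying URISyntaxException.
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseError(RAW_URL));
    }
}
```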
In Java, we can use URLEncoder to encode special characters.
To use URLEncoder, we just need to pay attention to one thing: which parts should be encoded.
The basic rule is that if a special character is being used for its reserved purpose (for example, as a delimiter), we don't encode it.
In a URL of the form <scheme name> : <hierarchical part> [ ? <query> ] [ # <fragment> ], we usually need to encode the <query> and <fragment> parts.
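To make these parts concrete, java.net.URI can split an already-valid URL into its components. A small sketch; the sample URL, its query value, and its fragment are made up for illustration:

```java
import java.net.URI;

class UrlParts {
    // Hypothetical, already-valid URL used only to show the components.
    static final URI SAMPLE =
            URI.create("http://localhost:8080/solr/select?q=solr&rows=100#results");

    public static void main(String[] args) {
        System.out.println("scheme:    " + SAMPLE.getScheme());       // http
        System.out.println("authority: " + SAMPLE.getRawAuthority()); // localhost:8080
        System.out.println("path:      " + SAMPLE.getRawPath());      // /solr/select
        System.out.println("query:     " + SAMPLE.getRawQuery());     // q=solr&rows=100
        System.out.println("fragment:  " + SAMPLE.getRawFragment());  // results
    }
}
```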
For the two query string formats:
Semicolon format: key1=value1;key2=value2;key3=value3
Ampersand format: key1=value1&key2=value2&key3=value3
We should not encode the ?, =, &, or ; characters that separate the key-value pairs. We should encode only the keys and the values.
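This rule can be sketched as a small helper that encodes each key and value separately and then joins them with the unencoded = and & delimiters. The class QueryStringBuilder and its methods are hypothetical names for illustration; only URLEncoder.encode comes from the JDK:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryStringBuilder {
    /** Encodes a single key or value; delimiters are never passed through here. */
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Builds key1=value1&key2=value2 in the map's iteration order. */
    static String build(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&'); // separator between pairs stays literal
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "extractingat:[2012-11-14T04:08:54.000Z TO 2013-11-14T04:11:05.000Z]");
        params.put("start", "0");
        params.put("rows", "100");
        System.out.println("http://localhost:8080/solr/select?" + build(params));
    }
}
```

Note that URLEncoder implements application/x-www-form-urlencoded encoding, so a space in a value becomes + rather than %20.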
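import java.net.URLEncoder;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;
import org.junit.Test;

@Test
public void urlspecailchars2() throws Exception {
    // Encode only the value of q; the ?, = and & delimiters stay literal.
    String url = "http://localhost:8080/solr/select?q="
            + URLEncoder.encode(
                    "extractingat:[2013-11-14T04:08:54.000Z TO 2013-11-14T04:11:05.000Z]",
                    "UTF-8")
            + "&start=0&rows=100";
    DefaultHttpClient httpClient = new DefaultHttpClient();
    System.out.println(url);
    HttpGet get = new HttpGet(url);
    HttpResponse response = httpClient.execute(get);
    System.out.println(EntityUtils.toString(response.getEntity()));
}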
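The same encoded URL should also work with HttpURLConnection, the other client mentioned at the start. A rough sketch under that assumption; the class SolrUrlConnectionSketch and its methods are hypothetical, fetch needs a Solr server listening on localhost:8080, and readAllBytes requires Java 9 or later:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SolrUrlConnectionSketch {
    /** Builds the request URL, encoding only the value of the q parameter. */
    static String buildUrl(String query) {
        try {
            return "http://localhost:8080/solr/select?q="
                    + URLEncoder.encode(query, "UTF-8")
                    + "&start=0&rows=100";
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Prepares a GET connection; nothing is sent until the response is read. */
    static HttpURLConnection open(String url) throws IOException {
        // URI.create rejects illegal characters before any network access.
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        return conn;
    }

    /** Sends the request and returns the response body as a string. */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = open(url);
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling fetch(buildUrl("extractingat:[2013-11-14T04:08:54.000Z TO 2013-11-14T04:11:05.000Z]")) would then issue the same request as the HttpClient test.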
Resources
URL Encoding
URLEncoder Javadoc
URI scheme