Handling gzip Response in Apache HttpClient 4.2

The Problem
My application uses Apache HttpClient 4.2, but when it sends a request to certain web pages, the response body comes back as garbled characters.

Replaying the request in Fiddler's Composer showed that the response is gzip-compressed. The response headers include:
Content-Encoding: gzip
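To illustrate why a gzip-encoded body looks garbled, here is a minimal sketch that uses only the JDK, not HttpClient. The class name GzipBodyDemo and its helper methods are hypothetical and exist only for this illustration. It compresses a string the way a server might, shows that decoding the raw bytes as text is unreadable, and then restores the text with GZIPInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class: simulates a gzip-encoded response body with the JDK only.
class GzipBodyDemo {

  // Compress text roughly the way a server would for "Content-Encoding: gzip".
  static byte[] gzip(String text) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
      gz.write(text.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // The gzip trailer is written when the stream closes, so read the buffer afterwards.
    return buffer.toByteArray();
  }

  // Inflate a gzip-encoded body back into text (assumes the text is UTF-8).
  static String gunzip(byte[] body) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(body))) {
      byte[] chunk = new byte[4096];
      int n;
      while ((n = gz.read(chunk)) != -1) {
        out.write(chunk, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] body = gzip("<html>hello</html>");
    // Decoding the compressed bytes directly gives unreadable text.
    System.out.println(new String(body, StandardCharsets.ISO_8859_1));
    // Decompressing first restores the original text.
    System.out.println(gunzip(body));
  }
}
```

A client that does not decompress the body ends up doing the equivalent of the first println, which is where the garbled output comes from.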

The Solution
In Apache HttpClient 4.2, DefaultHttpClient does not handle content compression, so a gzip-encoded response body is returned without being decompressed. To get readable text, wrap the client in DecompressingHttpClient:
public void usingDefaultHttpClient() throws Exception {
  // In HttpClient 4.2 this prints garbled characters for a gzip-encoded response.
  HttpClient httpClient = new DefaultHttpClient();
  getContent(httpClient, new URI(URL_STRING));
}

public void usingDecompressingHttpClient() throws Exception {
  // Wrap the client in DecompressingHttpClient to handle gzip responses in HttpClient 4.2.
  HttpClient httpClient = new DecompressingHttpClient(
      new DefaultHttpClient());
  getContent(httpClient, new URI(URL_STRING));
}

// ClientProtocolException is a subclass of IOException, so declaring IOException is enough.
private void getContent(HttpClient httpClient, URI url) throws IOException {
  HttpGet httpGet = new HttpGet(url);
  HttpResponse httpRsp = httpClient.execute(httpGet);
  String text = EntityUtils.toString(httpRsp.getEntity());

  for (Header header : httpRsp.getAllHeaders()) {
    System.out.println(header);
  }
  System.out.println(text);
}
The problem can also be fixed by upgrading HttpClient to 4.3.5: in that version, a client created with HttpClientBuilder handles compressed responses by default.

In HttpClient 4.3.5, DefaultHttpClient is deprecated, and it is recommended to use HttpClientBuilder instead:
public void usingHttpClientBuilderIn43() throws Exception {
  HttpClientBuilder builder = HttpClientBuilder.create();
  CloseableHttpClient httpClient = builder.build();
  getContent(httpClient, new URI(URL_STRING));
}
