The Problem
In one Solr-related application, we use SolrJ's HttpSolrServer to push data to a remote Solr server. On one load-testing machine, we intermittently saw a SocketException on requests to the remote Solr server:
Caused by: org.apache.solr.client.solrj.SolrServerException: java.net.SocketException: Connection reset
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:340)
Although we retry each request several times, we still wanted to know what caused this exception and what the network usage looked like on the remote Solr server.
Use TCPView in Windows to Monitor Network
On Linux, we can use lsof -a -i -p <pid> to list the network connections a process has open (without -a, lsof ORs the -i and -p filters instead of combining them). On Windows, we can use TCPView from Windows Sysinternals to monitor TCP connections.
We saw thousands of connections in the TIME_WAIT, CLOSE_WAIT, and FIN_WAIT2 states on both the client and the remote Solr server. This looked suspicious.
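TCPView shows connection states interactively. For a quick per-state tally, the output of netstat -an can be counted instead. The following is a minimal sketch, not part of the original investigation: the class name TcpStateCounter and the method countStates are hypothetical, and it assumes each TCP line of the netstat output starts with the protocol and ends with the connection state.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Tallies TCP connection states from "netstat -an" style output (hypothetical helper). */
final class TcpStateCounter {

    /** Counts lines whose first column starts with "tcp" (any case), keyed by the last column. */
    static Map<String, Integer> countStates(List<String> netstatLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : netstatLines) {
            String[] cols = line.trim().split("\\s+");
            // Assumed layout: proto [queues] local-address foreign-address state
            if (cols.length < 4 || !cols[0].regionMatches(true, 0, "tcp", 0, 3)) {
                continue; // headers, UDP lines, and anything else are skipped
            }
            counts.merge(cols[cols.length - 1], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumes a netstat binary is on the PATH (true on most Windows and Linux installs).
        Process process = new ProcessBuilder("netstat", "-an").redirectErrorStream(true).start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        countStates(lines).forEach((state, n) -> System.out.println(state + " " + n));
    }
}
```

Running this before and after a code change gives a rough number to compare, for example how many sockets sit in TIME_WAIT.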
HttpSolrServer Usage Pattern
Since the client's main job is to send requests to the Solr server through HttpSolrServer, we suspected that something was wrong with how we used HttpSolrServer.
We checked the code: the client program uses a Java thread pool to send requests, and each thread creates an HttpSolrServer, sends a request, and then shuts the HttpSolrServer down.
The SolrJ wiki page says:
HttpSolrServer is thread-safe and if you are using the following constructor, you *MUST* re-use the same instance for all requests. If instances are created on the fly, it can cause a connection leak. The recommended practice is to keep a static instance of HttpSolrServer per solr server url and share it for all requests.
We changed the code to use a global static HttpSolrServer instance, then used TCPView to monitor network usage after the change. Now fewer than 20 connections are open.
The same logic applies to Apache HttpClient: use a single HttpClient instance for each remote host.
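When an application talks to more than one Solr server, the wiki's advice of one instance per URL can be sketched as a small registry. This is an illustration only: the class name SharedClients is hypothetical, and it is generic over the client type so that it runs without SolrJ on the classpath. In the application, the type parameter would be HttpSolrServer.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hands out one shared client per base URL (hypothetical helper, Java 8+). */
final class SharedClients<T> {
    private final ConcurrentMap<String, T> clients = new ConcurrentHashMap<>();
    private final Function<String, T> factory;

    /** @param factory creates a client for a base URL; called at most once per URL */
    SharedClients(Function<String, T> factory) {
        this.factory = factory;
    }

    /** Returns the existing client for baseUrl, creating it on first use. */
    T get(String baseUrl) {
        // ConcurrentHashMap.computeIfAbsent is atomic, so concurrent callers
        // asking for the same URL all receive the same instance.
        return clients.computeIfAbsent(baseUrl, factory);
    }

    /** Number of distinct base URLs seen so far. */
    int size() {
        return clients.size();
    }
}
```

With SolrJ available, the registry could be held in a static field and built with a factory such as url -> new HttpSolrServer(url), so that every worker thread calls get(url) instead of constructing its own instance. Shutting the cached instances down at application exit is left out of this sketch.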
Lessons Learned
Use the Windows Sysinternals tools, such as TCPView, to inspect network connections.
Reuse HttpSolrServer and Apache HttpClient instances.
Test Program
If you run the following test program while monitoring with TCPView, you can immediately see the difference.
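    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.junit.Test;

    // Correct: one HttpSolrServer instance shared by all threads.
    @Test
    public void correctUsageSolrServer() throws InterruptedException {
        final String baseURL = "SOLRURL";
        final HttpSolrServer solrServer = new HttpSolrServer(baseURL);
        List<Thread> threads = new ArrayList<Thread>();
        int i = 0;
        while (i++ < 1000) {
            Thread thread = new Thread() {
                @Override
                public void run() {
                    SolrQuery query = new SolrQuery("*:*");
                    try {
                        QueryResponse rsp = solrServer.query(query);
                        System.out.println(rsp);
                    } catch (Exception e) {
                        e.printStackTrace();
                        throw new RuntimeException(e);
                    }
                }
            };
            thread.start();
            threads.add(thread);
            Thread.sleep(10);
        }
        // Wait for in-flight queries so the shared instance is not closed under them.
        for (Thread thread : threads) {
            thread.join();
        }
        // be sure to shutdown solrServer at the end.
        solrServer.shutdown();
    }

    // Wrong: every thread creates and shuts down its own HttpSolrServer.
    @Test
    public void wrongUsageSolrServer() throws InterruptedException {
        final String baseURL = "SOLRURL";
        int i = 0;
        while (i++ < 1000) {
            Thread thread = new Thread() {
                @Override
                public void run() {
                    final HttpSolrServer solrServer = new HttpSolrServer(baseURL);
                    SolrQuery query = new SolrQuery("*:*");
                    try {
                        QueryResponse rsp = solrServer.query(query);
                        System.out.println(rsp);
                    } catch (Exception e) {
                        e.printStackTrace();
                        throw new RuntimeException(e);
                    } finally {
                        solrServer.shutdown();
                    }
                }
            };
            thread.start();
            Thread.sleep(10);
        }
    }
Resources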
SolrJ Wiki
Connection Close in HttpClient
HttpClient Connection Management FAQ and Explanation of TIME_WAIT status
TCP Connection Termination Order