How to Read Internal Docs to Solve Problems

Often we need to find docs on an internal website and read them to figure out how to do a task.

Internal vs Internet
This is different from searching the internet, where we can usually find plenty of resources; after reading multiple articles, we usually get a sense of how to do it. It's fine if we don't read every article carefully: not ideal, but we can usually find related or similar articles and understand it then.

For internal docs, there are usually only a few pages: when they are well written, they give all the information we need.

But if you don't read them carefully or miss some key information, you will not be able to solve the problem from the docs alone. You can still ask others for help.

Be organized when reading the docs
- Maybe open all related docs in a separate browser window
- Maybe start with the entry page (perhaps given by someone), and follow the links while reading it

Note down what you don't understand or what seems strange to you
- Usually these are the keys to solving the problem

Use tools like Evernote to help read the docs
- add notes, highlight content, etc.

First, find all the internal websites that are useful
Search them and read them carefully
Examples: docker sidecar, yubikey-ssh

Tips and Tricks for Docker

docker run -d -p 80:80 --name webserver nginx
-v host-dir:container-dir
-p host-port:container-port

docker ps
Attach to an existing container's shell
docker exec -it container_id /bin/sh

docker kill/stop $(docker ps -q)
docker build --no-cache .
docker-compose build --no-cache mysql

Delete all containers
docker rm $(docker ps -a -q)
Delete all images
docker rmi -f $(docker images -q)

Clean up disk space used by Docker
docker system df
docker system prune

Increase Docker memory
In the Advanced tab of Preferences, change the memory or CPU.
Pass --memory 6g to docker run to set a different value for one container.

docker stats
- check memory and CPU usage

RUN creates an intermediate container, runs the command, and freezes the new state of that container in a new intermediate image.
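As an illustrative sketch (the image, package, and file names here are made up), each instruction below produces one cached layer, i.e. one intermediate image:

```dockerfile
# Hypothetical image: each instruction becomes one cached layer.
FROM alpine:3.19
RUN apk add --no-cache curl      # layer 1: an intermediate container runs apk, result is frozen
RUN mkdir -p /opt/app            # layer 2: built on top of layer 1
COPY app.sh /opt/app/            # COPY is enough here; ADD's extra features aren't needed
CMD ["/opt/app/app.sh"]
```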


Prefer exec form over shell form
docker run --entrypoint [my_entrypoint] container_name [command1] [arg1] [arg2]

Use COPY if you don't need ADD's special features

Use a Dockerfile to build each image, and a Docker Compose file to assemble them.

EXPOSE 8983-8986
USER builder

Change hostname
docker run -it -h myhost ...
docker run --rm -it --cap-add SYS_ADMIN ...

Tips and Tricks for Atom Editor

Autosave: disabled by default; find the autosave package, go to its settings and select "enabled".
Find whitespace package, uncheck Ensure Single Trailing Newline option
- Auto Reveal
Atom -> Preferences -> Editor -> Enable soft wrap
Atom -> Config
    enabled: true
    softWrap: true
    ensureSingleTrailingNewline: false

- It can format JSON, XML, and other languages such as Java.
dictionary: ctrl-cmd-k
- Copy JSON data and save it as a.json; jsonlint will automatically run and check the JSON data.
Line Ending Converter

Command Palette:        Cmd+Shift+P
Go to Line:             Ctrl+G
Go to Matching Bracket: Ctrl+M
Toggle Tree View:       Cmd+\
Fuzzy Find Files
Increase Font Size:     Cmd++
Decrease Font Size:     Cmd+-

Convert to Upper Case: Cmd+K, U
Convert to Lower Case: Cmd+K, L
Cut to End of Line:    Ctrl+K
Delete Line:           Ctrl+Shift+K

Cmd+Shift+: brings up the list of spelling corrections

How Solr Creates a Collection - Learning Solr Code

Test code to create collections
MiniSolrCloudCluster cluster = new MiniSolrCloudCluster(4 /*numServers*/, testBaseDir, solrXml, JettyConfig.builder().setContext("/solr").build());
cluster.createCollection(collectionName, 2/*numShards*/, 2/*replicationFactor*/, "cie-default", null);

CollectionsHandler.handleRequestBody(SolrQueryRequest, SolrQueryResponse)

CollectionAction action = CollectionAction.get(a); // CollectionAction .CREATE(true)
CollectionOperation operation = CollectionOperation.get(action); //CollectionOperation .CREATE_OP(CREATE)

Map result =, rsp, this);
It returns a map like this:
{name=collectionName, fromApi=true, replicationFactor=2, collection.configName=configName, numShards=2, stateFormat=2}

ZkNodeProps props = new ZkNodeProps(result);
if (operation.sendToOCPQueue) handleResponse(operation.action.toLower(), props, rsp, operation.timeOut);

CollectionsHandler.handleResponse
QueueEvent event = coreContainer.getZkController().getOverseerCollectionQueue().offer(Utils.toJSON(m), timeout);

This uses DistributedQueue.offer(byte[] data, long timeout) to add a task to /overseer/collection-queue-work/qnr-numbers.

It uses LatchWatcher to wait until this task is processed.

Overseer and OverseerCollectionProcessor

OverseerCollectionProcessor.processMessage(ZkNodeProps, String operation /*create*/)
OverseerCollectionProcessor.createCollection(ClusterState, ZkNodeProps, NamedList)

  ClusterStateMutator.getShardNames(numSlices, shardNames);
   positionVsNodes = identifyNodes(clusterState, nodeList, message, shardNames, repFactor); // round-robin if rule not set

  createConfNode(configName, collectionName, isLegacyCloud);
// This message will be processed by ClusterStateUpdater
// wait for a while until we do see the collection in the clusterState

  for (Map.Entry e : positionVsNodes.entrySet()) {
    if (isLegacyCloud) {
      shardHandler.submit(sreq, sreq.shards[0], sreq.params);
    } else {
      coresToCreate.put(coreName, sreq);
    }
  }

This sends an HTTP call that is handled by CoreAdminHandler.handleRequestBody.

  // if there were any errors while processing
  // the state queue, items would have been left in the
  // work queue so let's process those first
  byte[] data = workQueue.peek();
  boolean hadWorkItems = data != null;
  while (data != null)  {
    final ZkNodeProps message = ZkNodeProps.load(data);
    clusterState = processQueueItem(message, clusterState, zkStateWriter, false, null);
    workQueue.poll(); // poll-ing removes the element we got by peek-ing
    data = workQueue.peek();
  }


  zkWriteCommand = processMessage(clusterState, message, operation);

  clusterState = zkStateWriter.enqueueUpdate(clusterState, zkWriteCommand, callback);

  case CREATE:
    return new ClusterStateMutator(getZkStateReader()).createCollection(clusterState, message);

overseer.ClusterStateMutator.createCollection(ClusterState, ZkNodeProps)

Gradle Tips and Tricks - 2017

Run tasks on a sub-project only
./gradlew sub-project:build

./gradlew classes

Skip tasks with -x
-x test -x findbugsMain -x findbugsTest -x findbugsIntegrationTest -x pmdMain -x pmdTest -x pmdIntegrationTest -x testClasses -x integrationTestClasses -x javadoc -x javadocJar


-s, --stacktrace

Run specific tests
gradle test --tests org.gradle.SomeTest.someMethod
gradle test --tests org.gradle.SomeTest
gradle test --tests org.gradle.internal*
//select all ui test methods from integration tests by naming convention
gradle test --tests *IntegTest*ui*
//selecting tests from different test tasks
gradle test --tests *UiTest integTest --tests *WebTest*ui

Install an artifact locally
apply plugin: 'maven-publish'
gradle publishToMavenLocal
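A minimal sketch of the publication block that maven-publish needs before publishToMavenLocal has something to install (the publication name `mavenJava` is arbitrary):

```groovy
apply plugin: 'java'
apply plugin: 'maven-publish'

publishing {
    publications {
        mavenJava(MavenPublication) {
            from components.java   // publish the jar produced by the java plugin
        }
    }
}
```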

Using artifacts from the local maven repository
apply plugin: "maven"
allprojects {
  repositories {
    mavenLocal()
  }
}

Run tasks in remote debug mode
export GRADLE_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5005"
- Setting the org.gradle.debug=true property does the same, but is more flexible.

gw - run gradle in sub folders
brew install gdub

settings.gradle - figures out which projects take part in a build:
include ':repository', ':services', ':web-app'

Change subproject name
include 'foo'
project(':foo').name = 'foofoo'

allprojects {}
subprojects {}

Use doLast instead of <<
task copyJarToBin(type: Copy) {
    from createJar // shortcut for createJar.outputs.files
    into "d:/tmp"
}

task stopTomcat(type: Exec)

Config files
- settings.gradle in the root folder
- build.gradle for each module
Startup script

Maven Tips and Tricks - 2016

Lessons Learned about Programming and Soft Skills - 2017

How to compare different approaches
- Always think about different approaches (even if you have already finished/committed the code)

- Don't just choose one that looks good
- List them and compare them
- Always ask yourself why you chose this approach
- Try hard to find problems in your current approach, and figure out how to fix them
For small coding tasks
- Implement them if possible
- Then compare which makes the code cleaner, requires less change, etc.
Example: Exclude sources and javadoc jars when selecting the app jar
APP_BIN=$APP_BIN_DIR/$(ls $APP_BIN_DIR | grep -E 'jarName-version.*jar' | grep -v sources | grep -v javadoc | grep -v pom)
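The same filter chain can be sanity-checked with made-up artifact names:

```shell
# Hypothetical artifact names; keep only the main application jar.
printf '%s\n' myapp-1.0.jar myapp-1.0-sources.jar myapp-1.0-javadoc.jar myapp-1.0.pom \
  | grep -E 'myapp-1\.0.*jar' \
  | grep -v sources | grep -v javadoc | grep -v pom
# prints: myapp-1.0.jar
```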

How to quickly scan/learn new classes
Sometimes we need to quickly scan a bunch of related classes to check how to implement a function, use a method, etc.
- Check the class's Javadoc
- Check the class signature
- Check main methods:
  - static methods
  - using Ctrl+O or the outline view
- Check call hierarchy in source code
- Check test code/examples
- Google search code example

When refactoring/changing code, also check/change/improve its related code.

Find related doc, check/read the doc carefully.
- Mark/Note the important part of the doc.

For some tasks, we can use the trial-and-error approach: just do it, then fix it.
But for some tasks (production or physical-hardware related), it's better to figure out the right way to do it first.

Evaluate the outcome of the action. 
- best/worst outcome

How to implement/work on a feature
- what's goal, what to achieve
- how to test/verify/deploy/enable it in test or production environment easily 
  - useful tricks: dry-run, 
  - able to enable/disable default configuration automatically, but override it manually
- how to measure whether the change makes improvement

Think of different approaches (3 or more)
Compare them
Don't stop until you find a solution that looks good to you
Use tools (notebook, whiteboard)
- Usually we are not happy with the first approach that comes to mind, so we look for a different one: maybe a little better, maybe just different. In some cases we stop there: perhaps we start to talk it over with others or present it (to get their ideas)
- Example: add Explanation to transform actions

Before asking a question
- Try to solve it by yourself
- Make sure you read all related code/docs from top to bottom (quickly; scan, but don't ignore any code that may be important)

- Example: NightlyTestRunner

Verify the assumption
- Be aware of the assumptions we or others made in the design or the code.
- Verify whether they are true or not
- Example: one line one record in csv, one-to-one between tms id and bam_id

Check and realize alternatives
- Sometimes we want A, but maybe B also works and is even better.
- Example: asset letter or account statement

Use tools
Write things down on a whiteboard, in a notebook, or in an app
Take a picture right away
- always bring your phone

Check carefully and verify your claim before blaming others or thinking others are wrong
We're inclined to think others are wrong or made a mistake even when they told us they did it: we do a very brief search without checking carefully, then start to think they are wrong.

Prefer using code to enforce a rule over documentation
Example: all tests must extend XbaseUnitTest.

Make API/feature easier
- to use
- to test/rollback in production (feature flag)

Realize your assumptions/decisions and verify them first
- Otherwise you may go far, but on the wrong path
Go step by step and verify each step

Identify useful info and act on it quickly (or you may forget about it)
Don't let your past experience affect you
- Try it, it may be different this time
Example: big item delivery

When something totally doesn't make sense:
- maybe they are totally different things: you are comparing apples with bananas
Example: 12/15, 1/15

Lessons Learned about Programming and Soft Skills - 2016

Bash Scripting Essentials

Brace Expansion
chown root /usr/{ucb/{ex,edit},lib/{ex?.?*,how_ex}}
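A quick demo of how the expansion multiplies out (throwaway names):

```shell
echo file{1,2}.txt        # each alternative is expanded: file1.txt file2.txt
echo {a,b}{x,y}           # expansions combine left to right: ax ay bx by
mkdir -p /tmp/braces/{src,test}/{main,util}   # creates four directories at once
```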

Special Variables
$? Exit value of last executed command.
wait $pid

$! Process number of last background command.
$0 First word; that is, the command name. This will have the full pathname if it was found via a PATH search.
$n Individual arguments on command line (positional parameters).
$# Number of command-line arguments.

“$*” All arguments on command line as one string (“$1 $2…”). The values are separated by the first character in $IFS.
“$@” All arguments on command line, individually quoted (“$1” “$2” …).
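A small throwaway function showing $#, $*, and "$@" in action:

```shell
show_args() {
  echo "count=$#"            # number of arguments
  echo "joined=$*"           # all args as one string
  for a in "$@"; do          # "$@" preserves each argument, spaces included
    echo "arg=[$a]"
  done
}
show_args "one two" three
# count=2
# joined=one two three
# arg=[one two]
# arg=[three]
```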

-n string is not null.
-z string is null, that is, has zero length

Testing for File Characteristics
-d File is a directory
-e File exists
-f File is a regular file
-s File has a size greater than zero
-r, -w, -x File is readable / writable / executable
-S File is a socket

[ -d "$dir" ] && echo "$dir exists." || echo "$dir doesn't exist."

Testing with Pattern Matches
== pattern
=~ ere
if [[ "${MYFILENAME}" == *.jpg ]]
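Both match styles in one throwaway example (the filename is made up):

```shell
f="report.jpg"
if [[ "$f" == *.jpg ]]; then echo "glob match"; fi                 # glob pattern
if [[ "$f" =~ ^report\.(jpg|png)$ ]]; then echo "regex match"; fi  # extended regex
```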

-a, &&
-o, ||

if [ ! -d "$param" ]
if [ $? -ne 0 ]

HEAP_DUMP_DIR=$(sed 's/-XX:HeapDumpPath=\([^ ]*\)/\1/' <<< $param)
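Note the command above replaces only the matched part of the line; anchoring the pattern with .* on both sides keeps just the captured path. The JVM flags below are made up for the demo:

```shell
param='-Xms1g -XX:HeapDumpPath=/var/dumps -Xmx2g'
HEAP_DUMP_DIR=$(sed 's/.*-XX:HeapDumpPath=\([^ ]*\).*/\1/' <<< "$param")
echo "$HEAP_DUMP_DIR"      # /var/dumps
```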

Use variables $1, $2...$n to access arguments passed to the function.

Hello () {
   echo "Hello $1 $2"
   return 10
}

Hello a b

for i in $( command ); do command $i; done

for i in $( command ); do
  command $i
done

Google Shell Style Guide
- Quote your variables; prefer "${var}" over "$var"
- Use "$@" unless you have a specific reason to use $*
- Use $(command) instead of backticks
- [[ ... ]] is preferred over [ (test)

[[ ... ]] reduces errors, as no pathname expansion or word splitting takes place between [[ and ]], and [[ ... ]] allows regular expression matching where [ ... ] does not.

Use readonly or declare -r to ensure they're read only.

Use Local Variables
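A throwaway function contrasting local with readonly (names are illustrative):

```shell
greet() {
  local name="$1"            # local: visible only inside greet
  echo "hello $name"
}
greet world                  # hello world

readonly MAX_RETRIES=3       # any later assignment to MAX_RETRIES fails
echo "$MAX_RETRIES"          # 3
```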

Use set -o errexit (a.k.a. set -e) to make your script exit when a command fails.
Then add || true to commands that you allow to fail.
set -e - enable exit immediately
set +e - disable exit immediately
set -x - print a trace
Use set -o nounset (a.k.a. set -u) to exit when your script tries to use undeclared variables.
Use set -o xtrace (a.k.a set -x) to trace what gets executed. Useful for debugging.
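A minimal sketch of how errexit interacts with the || true guard:

```shell
set -o errexit               # same as set -e
false || true                # guarded: this failure doesn't abort the script
echo "still running"         # still running
set +o errexit               # turn it back off
false                        # now a failing command no longer exits the script
echo "done"                  # done
```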

Use $(( ... )), not expr, for arithmetic expressions; it is also more forgiving about spaces.
Use (( ... )) or let, not $(( ... )), when you don't need the result.
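A few throwaway lines contrasting $(( )) with (( )):

```shell
i=3
i=$((i+2))                   # arithmetic expansion; spacing is flexible
echo "$i"                    # 5
(( i = i * 2 ))              # (( )) for the side effect only, no result needed
echo "$i"                    # 10
if (( i > 9 )); then echo "big"; fi   # big
```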

Identify common problems with shellcheck.

while true; do some_commands_here; done

while true; do
  some_commands_here
done

Essential Linux Commands for Developers

How DistributedUpdateProcessor Works - Learning Solr Code

Case Study
Case 1: Update request is first sent to a follower (can be any node)
I: The coordinator node receives the add request:

DistribPhase phase = DistribPhase.parseParam(req.getParams().get(DISTRIB_UPDATE_PARAM));

phase is NONE

DistributedUpdateProcessor.setupRequest(String, SolrInputDocument, String)

ClusterState cstate = zkController.getClusterState();
DocCollection coll = cstate.getCollection(collection);
Slice slice = coll.getRouter().getTargetSlice(id, doc, route, req.getParams(), coll);
String shardId = slice.getName();

This decides in which shard the doc should be stored

Replica leaderReplica = zkController.getZkStateReader().getLeaderRetry(
    collection, shardId);
isLeader = leaderReplica.getName().equals(

Whether I am the leader that should store the doc: false

2. Forward to the leader that should store the doc
// I need to forward onto the leader...
nodes = new ArrayList<>(1);

3. DistributedUpdateProcessor.processAdd(AddUpdateCommand)
           (isLeader || isSubShardLeader ?
            DistribPhase.FROMLEADER.toString() :
            DistribPhase.TOLEADER.toString())); ==> TOLEADER
params.set(DISTRIB_FROM, ZkCoreNodeProps.getCoreUrl(
    zkController.getBaseUrl(), req.getCore().getName()));
cmdDistrib.distribAdd(cmd, nodes, params, false, replicationTracker);

II: The leader receives the request:
org.apache.solr.update.processor.UpdateRequestProcessorChain.createProcessor(SolrQueryRequest, SolrQueryResponse)
final String distribPhase = req.getParams().get(DistributingUpdateProcessorFactory.DISTRIB_UPDATE_PARAM);
skipToDistrib is true
// skip anything that doesn't have the marker interface UpdateRequestProcessorFactory.RunAlways

DistribPhase phase = TOLEADER
String fromCollection = updateCommand.getReq().getParams().get(DISTRIB_FROM_COLLECTION);
if (isLeader || isSubShardLeader) {
          // that means I want to forward onto my replicas...
          // so get the replicas...
          forwardToLeader = false;
nodes = follower nodes

2. The leader adds the doc locally first
boolean dropCmd = false;
if (!forwardToLeader) {    // forwardToLeader false
  dropCmd = versionAdd(cmd); // usually return false

private void doLocalAdd(AddUpdateCommand cmd) throws IOException {

if (willDistrib) { // true
  cmd.solrDoc = clonedDoc;

3. The leader forwards the add request to its followers
params = new ModifiableSolrParams(filterParams(req.getParams()));
           (isLeader || isSubShardLeader ?
            DistribPhase.FROMLEADER.toString() :
params.set(DISTRIB_FROM, ZkCoreNodeProps.getCoreUrl(
    zkController.getBaseUrl(), req.getCore().getName()));

if (replicationTracker != null && minRf > 1)
  params.set(UpdateRequest.MIN_REPFACT, String.valueOf(minRf));

cmdDistrib.distribAdd(cmd, nodes, params, false, replicationTracker);

III: A follower receives the request:
1. org.apache.solr.update.processor.UpdateRequestProcessorChain.createProcessor(SolrQueryRequest, SolrQueryResponse)
final String distribPhase = req.getParams().get(DistributingUpdateProcessorFactory.DISTRIB_UPDATE_PARAM); //FROMLEADER
final boolean skipToDistrib = distribPhase != null; // true

if (DistribPhase.FROMLEADER == phase && !couldIbeSubShardLeader(coll)) {
  if (req.getCore().getCoreDescriptor().getCloudDescriptor().isLeader()) {
    // locally we think we are leader but the request says it came FROMLEADER
    // that could indicate a problem, let the full logic below figure it out
  } else {
    isLeader = false;     // we actually might be the leader, but we don't want leader-logic for these types of updates anyway.
    forwardToLeader = false;
    return nodes;

return empty nodes
if (!forwardToLeader) {
  dropCmd = versionAdd(cmd);

Case 2: The update request is sent to leader which should store this doc
DistributedUpdateProcessor.setupRequest(String, SolrInputDocument, String)
DistribPhase phase: none

Replica leaderReplica = zkController.getZkStateReader().getLeaderRetry(
    collection, shardId);
isLeader = leaderReplica.getName().equals(

if (isLeader || isSubShardLeader) {
          // that means I want to forward onto my replicas...
          // so get the replicas...
          forwardToLeader = false;
nodes = followers

It will forward the request to its followers with params:

if (!forwardToLeader) { // false
  dropCmd = versionAdd(cmd);
It adds the doc locally at this stage

// It doesn't forward this request to itself again, so no stage update.distrib=TOLEADER

Cases 3 & 4: The add request is sent to a node which should not own this doc
The coordinator node forwards the add request to the leader of the shard that should store the doc

DistributedUpdateProcessor.setupRequest(String, SolrInputDocument, String)
ClusterState cstate = zkController.getClusterState();
DocCollection coll = cstate.getCollection(collection);
Slice slice = coll.getRouter().getTargetSlice(id, doc, route, req.getParams(), coll);
String shardId = slice.getName();

This decides which shard the doc belongs to.
It returns nodes - the leader that should store the doc.


cmdDistrib.distribAdd(cmd, nodes, params, false, replicationTracker);

Case 5: Send multiple docs in one command to a follower
XMLLoader.processUpdate(SolrQueryRequest, UpdateRequestProcessor, XMLStreamReader)

while (true) {
  if ("doc".equals(currTag)) {
    if(addCmd != null) {
      log.trace("adding doc...");
      addCmd.solrDoc = readDoc(parser);
    } else {
      throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Unexpected <doc> tag without an <add> tag surrounding it.");

It calls processAdd for each doc.

Related Code

UpdateRequestProcessorChain.createProcessor(SolrQueryRequest, SolrQueryResponse)

If the chain includes the RunUpdateProcessorFactory but does not include an implementation of the DistributingUpdateProcessorFactory interface, then an instance of DistributedUpdateProcessorFactory will be injected immediately prior to the RunUpdateProcessorFactory.
if (0 <= runIndex && 0 == numDistrib) {
  // by default, add distrib processor immediately before run
  DistributedUpdateProcessorFactory distrib
    = new DistributedUpdateProcessorFactory();
  distrib.init(new NamedList());
  list.add(runIndex, distrib);
}


DistribPhase phase = DistribPhase.parseParam(req.getParams().get(DISTRIB_UPDATE_PARAM))

boolean isOnCoordinateNode = (phase == null || phase == DistribPhase.NONE);

How to Improve Design Skills

How to Improve Problem Solving Skills

How to improve problem solving skills (from Yun Yuan):
How to test it (locally) and easily
How to verify whether it works easily
What are the shortcomings of your current approach?
Is there a better approach?

How To Ask Questions The Smart Way

How To Conduct a Technical Interview Effectively

Technical Skills
- Problem solving: not-easy algorithm questions
- Coding
- Design
Soft skills
- Communication
- Retrospect
  - Mistakes related with design/decision
  - What you learned from your mistake
  - Bugs/troubleshooting
- Eager to learn
- Be flexible, willing to listen, not stubborn

What questions to ask
- ask interesting/challenging questions
- Or questions that are not difficult but focus on coding (bug free)
- Ask questions that can be solved in different ways
- Avoid questions that can only be solved with one specific approach, unless it's obvious (binary search, etc.) and you are testing coding skills, not problem solving skills

Don't ask 
- brain teasers, puzzles, riddles
- problems you ask only because you are interested in them, happen to know them, or just learned them recently

Know the questions very well
- Different approaches
- Expect different approaches that you don't even know
  - Verify it (use examples, proof); if it works, the candidate did a good job and you also learned something new

Know common causes of bugs
- Be able to detect bugs in the candidate's code quickly

Give candidates the opportunity to prove themselves and shine
We are trying to evaluate the candidate's skills thoroughly: what he/she is good at, and what not.
If you plan to ask 2 coding questions, one simple and one more difficult, tell the candidate.
Let the candidates know your expectations.

Make the candidates learn something
- If the candidate doesn't give the right solution/answer and, at the end of the interview, wants to know how to approach it, tell him/her.
- Candidates put a lot of effort into the interview (a day off and the commute); they want to learn something, and learning something makes them feel good.
- Prove that you know the solution and have a reasonable answer; don't ask questions you don't know much about yourself.

No surprise
If you find issues/bugs in the candidate's code or design, point them out
The candidate should have a rough idea of how he/she performed in this interview

Be fair

Phone interview
Prefer coding questions over design questions
- design is partly about communication, and it's hard to test communication skills over the phone

About me - Jeffery Yuan (2017)

This is a short list of what I am good at and what I should improve.
- I will keep updating it, and I hope when I look back after a year, I will realize that I have improved and learned a lot.

Retrospect and Learning Logs
- I like to summarize what I have learned, and write them down

Sharing Knowledge

Problem Solving and troubleshooting
- I like to solve difficult problems, as I can always learn something from them.
- I also summarize the steps I take to solve problems, and what I learned that can help me solve problems quicker later.
- Search and find resource needed to solve the problem
- See more at my blog: Troubleshooting

Proactively find problems and fix them
- such as finding problems in existing design and code, and thinking about how to improve them

Be honest
- with myself and colleagues about what I know and what I don't
Be modest
- I know there are still a lot of things I should learn and improve.
- I like to learn from others

Proactive learning
- I have a Safari Books Online account
- I like to learn from books and from people
- When I used Cassandra and Kafka in our projects, I took time to learn not only how to use them but, more importantly, their high-level design.
- Read more at my log System Design
Programmer: Lifelong Learning

Weaknesses - things that need improving
System design
Knowledge about distributed system
Public Speaking

How to Review and Discuss - Software Design

Talk/think about all related aspects
- how we store data
- client API
- UI changes
- backward compatibility: how to handle old data/clients

But focus on most important stuff (first)

Talk/think about design principles/practices
- such as idempotency, parallelization, monitoring, etc.
- Check more at System Design - Summary

What's the impact of other (internal and cross-team) components?

How do other components use it?

What are the known and potential constraints/issues/flaws in the current design?
Don't talk only about its advantages;
also talk about issues, and don't hide them

What are alternatives?
Think of alternative and different approaches; this can help find a better solution
We can't really review and compare if there are no alternatives

Welcome different approaches
- although it doesn't mean it's better, or we will use it

Development Cost
- How difficult is it to implement?

What may change and How to evolve

What may change in (very) near future?

How can we know whether the new feature works or doesn't work?
How can we know when problems happen?

Feature Flag
Can we enable/disable the feature at runtime

Be Prepared
Ok to have informal/impromptu discussion with one or two colleagues

But make sure everyone is prepared for the formal team design discussion
All attendees should know the topic: how they would design it

Don't make design decisions immediately - for things that really matter
Take time to reflect, let disagreement develop, and talk it over again later

Listen first

When you don't agree with other's approaches
Don't get too defensive
Talk about ideas not people

Be prepared

Make API/Feature easier
- to use
- to test/rollback in production (feature flag)

System Design - Summary

Problem Solving Practice - Redis cache.put Hangs

The Issue
After deploying the change "Multi Tiered Caching - Using in-process EhCache in front of Distributed Redis" to the test environment (along with some other changes, and someone making changes on the server such as a restart), we found that cache.put hangs when saving data to redis.

Troubleshooting Process
First we tried to reproduce the issue in my local setup, where it always worked. But we could easily reproduce it in the test environment.

This made me think it might be something related to the test environment.

Then I used kill -3 processId to generate several thread dumps while reproducing the issue on the test machine. I found a suspect:
"ajp-nio-8009-exec-10" #91 daemon prio=5 os_prio=0 tid=0x00007f49c400a800 nid=0x75db waiting on condition [0x00007f495333e000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at  RedisCache$RedisCachePutCallback(RedisCache$AbstractRedisCacheCallback).waitForLock(RedisConnection) line: 600
RedisCache$RedisCachePutCallback(RedisCache$AbstractRedisCacheCallback).doInRedis(RedisConnection) line: 564
at com.lifelong.example.MultiTieredCache.lambda$put$40(
at com.lifelong.example.MultiTieredCache$$Lambda$18/1283186866.accept(Unknown Source)
at java.util.ArrayList.forEach(
at com.lifelong.example.MultiTieredCache.put(
at org.springframework.cache.interceptor.AbstractCacheInvoker.doPut(
at org.springframework.cache.interceptor.CacheAspectSupport$CachePutRequest.apply(
at org.springframework.cache.interceptor.CacheAspectSupport.execute(
at org.springframework.cache.interceptor.CacheAspectSupport.execute(
at org.springframework.cache.interceptor.CacheInterceptor.invoke(

Checking the code in RedisCache$AbstractRedisCacheCallback to understand how it works:
for operations like put/putIfAbsent/evict/clear, and @Cacheable with sync=true (RedisWriteThroughCallback), it checks whether a key like cacheName~lock exists in redis; if it does, it waits until the key is gone.

This lock is created and deleted for @Cacheable with sync=true in RedisWriteThroughCallback, which calls the lock and unlock methods.

This made me check the data in redis: after creating a tunnel to redis, I ran the command keys cacheName~lock and found that the key was indeed there.

Now everything makes sense:
- We set sync=true and ran a performance test, then restarted the server and removed the setting. The cacheName~lock key was probably left behind because of the server restart. Due to this stale lock, all redis update calls stopped working.

After removing cacheName~lock in redis, everything worked fine.

Takeaway
- When using a feature (@Cacheable(sync=true) in this case), know how it's implemented.

Multi Tiered Caching - Using in-process Cache in front of Distributed Cache

Why Multi Tiered Caching?
  To improve an application's performance, we usually cache data in a distributed cache like redis/memcached or an in-process cache like EhCache.

  Each has its own strengths and weaknesses:
  An in-process cache is faster, but it's hard to maintain consistency and it can't store a lot of data. A distributed cache solves these problems easily, but it's slower due to network latency and serialization/deserialization.

  In some cases, we may want to use both: mainly use a distributed cache to cache data, but also cache data that is small and doesn't change often (or at all) such as configuration in in-process cache.
The Implementation
  Spring uses a CacheManager to determine which cache implementation to use.
  We define our own MultiTieredCacheManager and MultiTieredCache as below.
public class MultiTieredCacheManager extends AbstractCacheManager {
    private final List<CacheManager> cacheManagers;

    /**
     * @param cacheManagers the order matters: when fetching data, check the first cache;
     *        if it's not there, check the second one, then back-fill the first one
     */
    public MultiTieredCacheManager(final List<CacheManager> cacheManagers) {
        this.cacheManagers = cacheManagers;
    }
    protected Collection<? extends Cache> loadCaches() {
        return new ArrayList<>();
    }
    protected Cache getMissingCache(final String name) {
        return new MultiTieredCache(name, cacheManagers);
    }
}

public class MultiTieredCache implements Cache {
    private static final Logger logger = LoggerFactory.getLogger(MultiTieredCache.class);

    private final List<Cache> caches = new ArrayList<>();
    private final String name;

    public MultiTieredCache(final String name, @Nonnull final List<CacheManager> cacheManagers) { = name;
        for (final CacheManager cacheManager : cacheManagers) {
            caches.add(cacheManager.getCache(name));
        }
    }

    public ValueWrapper get(final Object key) {
        ValueWrapper result = null;
        final List<Cache> cachesWithoutKey = new ArrayList<>();
        for (final Cache cache : caches) {
            result = cache.get(key);
            if (result != null) {
                break;                // found in a faster tier
            } else {
                cachesWithoutKey.add(cache);
            }
        }
        if (result != null) {
            for (final Cache cache : cachesWithoutKey) {
                cache.put(key, result.get());   // back-fill the tiers that missed
            }
        }
        return result;
    }

    public <T> T get(final Object key, final Class<T> type) {
        T result = null;
        final List<Cache> noThisKeyCaches = new ArrayList<>();
        for (final Cache cache : caches) {
            result = cache.get(key, type);
            if (result != null) {
                break;
            } else {
                noThisKeyCaches.add(cache);
            }
        }
        if (result != null) {
            for (final Cache cache : noThisKeyCaches) {
                cache.put(key, result);
            }
        }
        return result;
    }
    // called when sync = true is set in @Cacheable
    public <T> T get(final Object key, final Callable<T> valueLoader) {
        T result = null;
        for (final Cache cache : caches) {
            result = cache.get(key, valueLoader);
            if (result != null) {
                break;
            }
        }
        return result;
    }
    public void put(final Object key, final Object value) {
        caches.forEach(cache -> cache.put(key, value));
    public void evict(final Object key) {
        caches.forEach(cache -> cache.evict(key));
    public void clear() {
        caches.forEach(cache -> cache.clear());
    public String getName() {
        return name;
    public Object getNativeCache() {
        return this;

@Configuration
public class CacheConfig extends CachingConfigurerSupport {
  @Bean
  public CacheManager cacheManager(EhCacheCacheManager ehCacheCacheManager, RedisCacheManager redisCacheManager) {
      if (!cacheEnabled) {
          return new NoOpCacheManager();
      }
      // Be careful when making changes - the order matters
      ArrayList<CacheManager> cacheManagers = new ArrayList<>();
      if (ehCacheEnabled) {
          cacheManagers.add(ehCacheCacheManager);
      }
      if (redisCacheEnabled) {
          cacheManagers.add(redisCacheManager);
      }
      return new MultiTieredCacheManager(cacheManagers);
  }

  @Bean
  public EhCacheCacheManager ehCacheCacheManager() {
      final EhCacheManagerFactoryBean ehCacheManagerFactoryBean = new EhCacheManagerFactoryBean();
      ehCacheManagerFactoryBean.setConfigLocation(new ClassPathResource("ehcache.xml"));

      final EhCacheManagerWrapper ehCacheManagerWrapper = new EhCacheManagerWrapper();
      return ehCacheManagerWrapper;
  }

  @Bean(name = "redisCacheManager")
  public RedisCacheManager redisCacheManager(final RedisTemplate<String, Object> redisTemplate) {
      final RedisCacheManager redisCacheManager =
              new RedisCacheManager(redisTemplate, Collections.<String>emptyList(), true);
      return redisCacheManager;
  }
}

Other things we can do when using a multi-tiered cache in a CacheManager:
- Use a cache name prefix to determine which cache to use.
- Add logic to only cache some kinds of data in a specific cache.
- Allow using only the distributed cache, or only the in-process cache.
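The prefix-based routing idea can be sketched without Spring. The sketch below is a minimal illustration, not our actual implementation: the `PrefixCacheRouter` class, the prefixes `local.`/`remote.`, and the plain `HashMap` tiers are all assumptions standing in for EhCache and Redis.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch (no Spring): route a cache name to a tier by its prefix.
// The "local"/"remote" prefixes and the HashMap tiers are illustrative assumptions.
public class PrefixCacheRouter {
    private final Map<String, Map<String, Object>> tiers = new HashMap<>();

    public PrefixCacheRouter() {
        tiers.put("local", new HashMap<>());   // stands in for an in-process cache (e.g. EhCache)
        tiers.put("remote", new HashMap<>());  // stands in for a distributed cache (e.g. Redis)
    }

    // Choose the backing store by the prefix before the first '.'; default to "local".
    Map<String, Object> tierFor(String cacheName) {
        int dot = cacheName.indexOf('.');
        String prefix = dot > 0 ? cacheName.substring(0, dot) : "local";
        return tiers.getOrDefault(prefix, tiers.get("local"));
    }

    public void put(String cacheName, String key, Object value) {
        tierFor(cacheName).put(key, value);
    }

    public Object get(String cacheName, String key) {
        return tierFor(cacheName).get(key);
    }
}
```

In a real CacheManager the same decision would live in getMissingCache(name), returning a different Cache implementation per prefix.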

Making Child Documents Work with Spring-data-solr

The Problem
We use spring-data-solr in our project because we like its conversion feature, which can convert a string to an enum, an entity to JSON data, etc., and vice versa. Recently we needed to use Solr's nested documents feature, which spring-data-solr doesn't support.

Issues in Spring-data-solr
The SolrInputDocument class contains both a Map (_fields) and a List (_childDocuments).

Spring-data-solr converts a Java entity class to a SolrDocument. It provides two converters: MappingSolrConverter and SolrJConverter.

MappingSolrConverter converts the entity to a Map: MappingSolrConverter.write(Object, Map, SolrPersistentEntity).

SolrJConverter uses Solr's DocumentObjectBinder to convert the entity to a SolrInputDocument,
and it converts fields annotated with @Field(child = true) to child documents.
- This also means that spring-data-solr's conversion features will not work with SolrJConverter.

BUT SolrJConverter still treats the SolrInputDocument as just a map and adds all of its entries into the destination Map sink:
- SolrJConverter.write(Object, Map)

After this, the child documents are discarded.
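The discard can be illustrated with plain Java, without SolrJ. `DocWithChildren` below is a hypothetical stand-in for SolrInputDocument: map-like field entries plus a separate child list. Any code that copies only the Map view keeps the fields and silently loses the children.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for SolrInputDocument: a map of fields plus a separate child list.
class DocWithChildren extends HashMap<String, Object> {
    final List<DocWithChildren> children = new ArrayList<>();
}

public class ChildDiscardDemo {
    // Mimics a converter that only sees the Map view: field entries survive, children do not.
    static Map<String, Object> copyAsPlainMap(DocWithChildren doc) {
        return new HashMap<>(doc); // copies only the map entries
    }

    public static void main(String[] args) {
        DocWithChildren parent = new DocWithChildren();
        parent.put("id", "1");
        DocWithChildren child = new DocWithChildren();
        child.put("id", "1-1");
        parent.children.add(child);

        Map<String, Object> sink = copyAsPlainMap(parent);
        System.out.println(sink.containsKey("id"));        // true: fields survive the copy
        System.out.println(sink.size());                   // 1: the child list was never a map entry
    }
}
```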

The Fix
We still want to use spring-data-solr's conversion functions - partly because we don't want to rewrite everything to use SolrJ directly.

So when saving to Solr, we use spring-data-solr's MappingSolrConverter to convert the parent entity to a SolrInputDocument, then convert the child entities to SolrInputDocuments and add them into the parent's SolrInputDocument.

When reading from Solr, we read the SolrDocument as the parent entity, then read its child documents as child entities and add them into the parent entity.
public class ParentEntity {
  @Field(child = true)
  private List<ChildEntity> children;
}

protected SolrClient solrClient;

// we add our own converters into MappingSolrConverter
// for more, please check 
protected MyMappingSolrConverter solrConverter;

public void save(@Nonnull final ParentEntity parentEntity) {
    final SolrInputDocument solrInputDocument = solrConverter.createAndWrite(parentEntity);
    addChildDocuments(parentEntity, solrInputDocument);
    try {
        solrClient.add(getCollection(), solrInputDocument);
    } catch (SolrServerException | IOException e) {
        throw new BusinessException(e, "failed to save " + parentEntity);
    }
}

protected void addChildDocuments(@Nonnull final ParentEntity parentEntity,
        @Nonnull final SolrInputDocument solrInputDocument) {
    solrInputDocument.addChildDocuments(parentEntity.getChildren().stream()
            .map(child -> solrConverter.createAndWrite(child)).collect(Collectors.toList()));
}

public List<ParentEntity> querySolr(final SolrParams query) {
    try {
        final QueryResponse response = solrClient.query(getCollection(), query);
        return convertFromSolrDocs(response.getResults());
    } catch (final Exception e) {
        throw new BusinessException(e, "data retrieve failed: " + query);
    }
}
/**
 * Also returns child documents in the Solr response as ChildEntity, if they exist.
 */
protected List<ParentEntity> convertFromSolrDocs(final SolrDocumentList docList) {
    List<ParentEntity> result = new ArrayList<>();
    if (docList != null) {
        result = -> {
            final ParentEntity parentEntity =, solrDoc);
            final List<SolrDocument> childDocs = solrDoc.getChildDocuments();
            if (childDocs != null) {
                parentEntity.setChildren(
                        -> solrConverter.read(ChildEntity.class, childDoc))
                        .collect(Collectors.toList()));
            }
            return parentEntity;
        }).collect(Collectors.toList());
    }
    return result;
}
Related posts:
- Mix Spring Data Solr and SolrJ in Solr Cloud 5
- SolrJ: Support Converter and make it easier to extend DocumentObjectBinder

