How to Troubleshoot and Debug - Programmer Skills


Related: How to Solve Problems
Troubleshooting is fun
We learn something new and avoid making the same mistake in the future.
It also brings a sense of accomplishment and satisfaction.

How to Troubleshoot and Debug

Check and save the current state (logging)
We may not be able to recreate the problem easily, so analyze the problem and save the logs first.

Understand the problem first
-- Especially when troubleshooting a problem for others (for example, a rookie).
First take some time to understand the symptom: what it means, and whether it's really a problem.
Don't make changes blindly without understanding the symptom.
Check the error code, and understand how the user is writing the code.

Understand the environment first
-- Especially when asked to help troubleshoot someone else's project that you are not familiar with at all
-- Don't blindly assume that others made a simple/stupid mistake, or the same mistake you made before
  -- If you didn't find a configuration file, it may be because you looked in the wrong folder
-- Know how to quickly search/debug in Linux
-- Most of the time, the root cause is simple: it just needs a configuration change, or there is no problem at all

List/think about what may be the root cause
- Troubleshooting is thinking about what may have gone wrong
- How likely is it that this caused the problem?
- How easily can we verify it?

- If a cause is unlikely and not easy to verify, list/try other possible root causes and approaches first
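
The ordering idea above can be sketched in code. This is only an illustration: the Hypothesis type, the 1-5 scales, and the "likelihood minus cost" score are assumptions made for the example, not a method from any library.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class HypothesisRanking {
    /** A candidate root cause with rough 1-5 estimates (both scales are illustrative). */
    record Hypothesis(String description, int likelihood, int verificationCost) {}

    /** Returns a copy ordered so that likely, cheap-to-verify causes come first. */
    static List<Hypothesis> rank(List<Hypothesis> candidates) {
        List<Hypothesis> sorted = new ArrayList<>(candidates);
        // Higher likelihood and lower verification cost give a higher score.
        sorted.sort(Comparator
                .comparingInt((Hypothesis h) -> h.likelihood() - h.verificationCost())
                .reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up candidates for a made-up incident.
        List<Hypothesis> ranked = rank(List.of(
                new Hypothesis("Race condition in cache refresh", 2, 5),
                new Hypothesis("Wrong value in config file", 4, 1),
                new Hypothesis("Library version mismatch", 3, 2)));
        ranked.forEach(h -> System.out.println(h.description()));
    }
}
```

In practice the list usually lives in our head or a notebook; the point is only that the config file gets checked before anyone starts hunting a race condition.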

Read/Check the Log
Change the logging level to learn more about the internals.
For example, if we are using the Spring Security library, change the level of org.springframework.security to debug in the dev environment.
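
How the level gets changed depends on the logging backend the project uses. In a Spring Boot project this is commonly a single property, logging.level.org.springframework.security=DEBUG; other setups configure Logback or Log4j instead. The sketch below shows the same idea with only the JDK's java.util.logging, so it runs without any dependency. The logger name com.example.security is a hypothetical placeholder.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class VerboseLogging {
    // Hypothetical name; substitute the package of the library under investigation.
    static final String SUSPECT_LOGGER = "com.example.security";

    /** Raises a single logger to FINE (debug) and leaves every other logger untouched. */
    static Logger enableDebug(String loggerName) {
        Logger logger = Logger.getLogger(loggerName);
        logger.setLevel(Level.FINE);

        // Handlers filter as well, and ConsoleHandler defaults to INFO.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE);
        logger.addHandler(handler);

        // Skip the root handler so INFO and above are not printed twice.
        logger.setUseParentHandlers(false);
        return logger;
    }

    public static void main(String[] args) {
        Logger logger = enableDebug(SUSPECT_LOGGER);
        logger.fine("debug detail that was hidden at the default level");
    }
}
```

Raising the level for one package rather than globally keeps the extra output readable.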

Check/Use IDE
- The IDE can help detect problems.
- Eclipse's Problems view may report problems that prevent the project from building, or mistakes in configuration files such as web.xml.

Reproduce the problem (faster)
Try to reproduce the problem locally in our own setup
-- It's much easier to troubleshoot in our local setup, where we can use all the debugger tricks: add breakpoints, change variable values, force a return, drop frames to rerun, etc.
-- Sometimes the problem is related to data: it only happens in a particular line, with that line's data. We can change the local setup to point to the remote data (via a tunnel, etc.). This is much easier than remote debugging code on a remote server.

Simplify the suspect code
Create simple test cases to reproduce it
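
A minimal sketch of what such a test case can look like. The bug here is invented for illustration: a hypothetical helper that reads the value from a key=value line and breaks when the value itself contains '='. The suspect logic is copied into a tiny class together with the smallest input that shows the symptom.

```java
public class MinimalRepro {
    /** Stand-in for the suspect code: mishandles '=' inside the value. */
    static String valueOfBuggy(String line) {
        return line.split("=")[1];
    }

    /** Candidate fix: limit the split to two parts so the value stays whole. */
    static String valueOfFixed(String line) {
        return line.split("=", 2)[1];
    }

    public static void main(String[] args) {
        // Smallest input that still shows the problem.
        String input = "url=http://host/search?q=test";
        System.out.println("buggy: " + valueOfBuggy(input)); // buggy: http://host/search?q
        System.out.println("fixed: " + valueOfFixed(input)); // fixed: http://host/search?q=test
    }
}
```

Once the problem is reduced to a few lines like this, it runs in seconds, and it can be kept as a regression test after the fix.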

If possible, debug and test locally
For example, we can change the local setup to use data from a remote setup (test or sometimes production lines).

Remote debug if we can't debug locally
- But treat it as a last resort: it is slow and sometimes impossible

Use conditional breakpoints to change behavior dynamically
Use the Display view/force return to change behavior
Force the execution of the suspect path
-- Throw an exception, change a value, etc.
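
The debugger does this interactively, but the same trick works in a small test when the failing dependency can be swapped out. A minimal sketch, assuming a hypothetical method loadOrDefault whose failure branch rarely runs in practice: we pass a lookup that always throws, so the suspect path executes on every run.

```java
import java.util.function.Supplier;

public class ForceErrorPath {
    /** Hypothetical production method: falls back to a default when the lookup fails. */
    static String loadOrDefault(Supplier<String> lookup, String fallback) {
        try {
            return lookup.get();
        } catch (RuntimeException e) {
            // The rarely taken branch we want to exercise.
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Force the failure branch by supplying a lookup that always throws.
        String result = loadOrDefault(
                () -> { throw new IllegalStateException("simulated outage"); },
                "default");
        System.out.println(result); // prints "default"
    }
}
```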

Source code is always the ultimate truth
- Don't assume or think ..., check the code to verify it.
- Always find the related code and understand it
  - how it works, how it gets called, how to configure it
  - You can't troubleshoot without understanding the code first
- We can find examples/working code
- We can understand how/why the code works by running and debugging it

Check the code and log together

Change the Code to Help Debug
-- Add logging


Construct and test hypotheses

- First verify the most-likely, easy-to-verify ones


Search Google using the error message together with the library, class, and method name
Search the source code in GitHub/Eclipse
Search the log: use the Linux command (f|e)grep
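
On a machine without grep, or when the search is part of a larger test, the same filtering fits in a few lines of Java. This is a rough sketch of what egrep does for a simple pattern, not a replacement for it. The sample log lines are made up.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class LogSearch {
    /** Returns the lines that contain a match for the regex, similar to egrep. */
    static List<String> grep(List<String> lines, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return lines.stream()
                .filter(line -> pattern.matcher(line).find())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample; a real log could be loaded with Files.readAllLines.
        List<String> log = List.of(
                "INFO  server started",
                "ERROR NullPointerException at com.example.Foo.bar",
                "WARN  retrying request",
                "ERROR timeout calling backend");
        grep(log, "ERROR|WARN").forEach(System.out::println);
    }
}
```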

Compare the working code with the failed code
-- Compare different versions in Git/GitHub.
-- Search the company's code base to find similar code
-- Others may have already fixed the same problem

If still stuck:
Take time to learn the framework/feature, etc.
Sleep on it and try to solve it later.

Scripting
Write test code using scripting or other languages

Ask on Stack Overflow or the product's forum
Read the Doc/JavaDoc

Solve problems quickly but find root cause slowly
- Solve problems quickly so others can move on
- But take time to find the root cause and reflect on it, if it matters

Collaboration
Ask questions
-- When helping others solve problems or debugging others' code.
Get as much information as possible when helping others
Provide more information when asking for help

When working on urgent issues with others
-- Collaborate with others in a timely manner
-- Let others know what you are testing and what your progress is, so there is no overlap.

Reflection: Lesson Learned
How we found the root cause, why it took so long

What we learned
What's the root cause

Why we made the mistake

How we can prevent this from happening again

Share the knowledge in the team
