What to Look for in Code Review

Good PR
Small Change/Diff/Pr
Stacked Change
Review often, merge often

What we want to achieve via Code review
Capture design and code issues
Sharing of knowledge

What to focus in Code review
Business logic
High level design
Coding sytle

Code review should be productive
– both for reviewers and the developer
Usually code review should be finished timely – within 3 to 5 days, not in weeks or months
- Unless we don’t need the function or have disagreement about the design and the code

- should be a pleasure for reviewers and the developer
- help the author to capture mistakes, bugs
- Reviewer can learn somethings such as:
   - domain knowledges, the new feature or others

Who do the code review?
How to pick code reviewers?
Who merge the code?

Bug always happens, how to prevent the developer and code reviewers make same mistake?

Before Send Code to Review
Only one change in one pr
Build succeed
Include new unit tests, integration tests
All tests passed
Existing functions work

New functions work

Use tools to analyze code first - eclipse compile warnings, findbugs, pmd, mvn site report.

Carefully reviewed your PR first
- Fix common issues such as commented-out code, indentation
- Code looks clean and readable to yourself
- Use tools such as FindBugs or Eclipse itself to check and clean your code
- It saves time for you and reviewers
- The reviewers don't pay attention to the trivial stuff and can focus on logic or things that can't be easily found out by tools

Your PR should be pleasant for others to review.
Every PR should have clear description what has changed, and why (if needed)
Write descriptions/comments
- for reviewer
- for clients who is going to call your APIs
- share knowledge
Add comments to your code - such as why the code is removed

Highlight potential issues, or things you want reviewers to pay attention

- Don't hide them, instead highlight them
- it's for your won good as reviewers help improve your code, find potential issues, also this makes yourself a better engineer

Your Code
Your code should be pleasant for you and reviewers to read
- We not only write code, we write better code every day, write code that others want to use/call or read and maintain.

Write code for maintainer, reviewers, you in the future
Add comments to your code
- why you made the change this way: especially when there re other obvious approach, but maybe not work
- how you find the bug (internet link)
- when you think others(maybe yourself in the future) maybe confused
- write comments for your client/caller, how they should use it, what's the trick part

When others/reviewers ask questions about the code or when they feel difficult to understand it
It may be because others lack of the specific knowledge of the domain(logic) or it may (more likely) be a sign that we should improve the code, make it more readable

Share your knowledge to the team
- Such as in order to implement the PR, you learned some things new
- You don't have to share if it's something basic or we can find it via simple google search

- It's good to share if it takes you a lot of time, or you just want to share :)

During Code Review
Prefer to use PR to communicate
- as others would also know the conversation
- unless it's difficult to describe in words
- transparent and share knowledge
- What bugs we made, and how someone detect in code review

- how the reviewer reviews the code and finds out the issue

Don't forget the big picture
- What's the goal/new function of the PR
- Whether it solves the problem, implements the new function

What questions to ask?
- Not trivial things such as What is XX?
- Google it first, unless you still didn't find your answers

- Or you are the manager, tech skills/knowledge is not your primary goals any more

Use IDE to help understand the code if necessary
- Check the code change in Eclipse can help you to understand the code if you are not familiar with the function,

What to Look For
What may go wrong, what code may look suspicious, dangerous, have potential bugs, need pay extra attention, common source of bugs.

Whether the change improves code quality or make it worse?
Is there enough tests?
- Unit/Integration tests
- Happy, bad paths or corner cases

Admin what you don't know and ask others to review that part.
Ask for clarification/explanation if can't understand it.

Does it need documentation or notify other teams?

Using new features in the language?
- such as JDK8 time api, stream, new methods: getOrDefault, compute, putIfAbsent, computeIfAbsent etc

Don't invent here
- Whether we should use some well-known libraries, instead write our own code.

- Things that looks strange, that I don't understand or can't understand easily

Whether the new code is consistent (with existing code)
-- such as method name, order of parameters.

Focus on Main Function/Logic
-- Focus on the code that looks complicated, difficult to understand
-- Don't skip it if you can't understand it.

Whether the api is performance sensitive?
- Like database/networks calls
- If so, how long it usually takes in normal/extreme case?
- Is there duplication? do we need cache? or save the result first into a variable or a map?
--Did the code do enough optimization? such as use thread pool, call third part services asynchronously, in parallel etc. 

Do we need provide timeout for block call?
If so, whether the timeout value makes sense or is configurable?

Thread safe
Concurrent programming is hard, so pay more attention.
Example: Don't Use Mutable Variables in Multiple Threads
Variables may be Changed Unexpectedly in Multiple Threads
- Use mutable instance fields in services.
-- SimpleDateFormat

Does it reuse threadpool, is it configured correctly?
- Don't recreate threadpool and shutdown in each method, reuse it.
Is there really benefit to use multiple threads?
Is there race condition?
Is lock used correctly?

Code Quality
Whether the method/class does only one thing?
Big interfaces: split them
use polymorphism to remove if.
Preconditions cannot be strengthened in a subtype.
Postconditions cannot be weakened in a subtype.

Review the data model
- As it's much harder to change data model after deployed to production.
-- is there any not needed fields?
How it is going to be used?
-- use redundant field/data to boost performance (avoid join query?)
-- will we search on it, define it as separate field, or store it as part of json object?

Is it using Right Data structures?
What's the frequent operations on the data?
If need search often: uses Set/Map instead of  List
If there is a Key in the data, use Map instead of List.
Don't use mutable object inSet, or as Key in Map
Consider using Guava Multimap instead of Map<K, List<V>>
-- Multimap put: the case when the key doesn't exist, get returns empty collection instead of null.

Is class in right place - module/application or package?
Whether it should be an instance field?

Do we need protect the API? 
- If so, which role?
Are we storing sensitive info?
Should we audit the operation?

Code Quality
Bad Smell
Magic Number/String 
- change to use constant/enum and give a meaningful name.
Limit the number of parameters per unit to at most 4.
- extract parameters into objects.
Too many parameters
Parameters are easily to be mixed

Use boolean as parameter -- consider to use enum or use different names.
Big Class/Methods

Usually method should be short and easy to read, no too many nests.
Complex methods
-- extract small methods

Separate Concerns to different classes/functions

API Design
Think from client perspective.

Unmodifiable of parameter, return value 
- defensive copy if needed

Whether API is easier to use, test/rollback in production

Code Readability
Whether the code is easy to read and understand?
 -- If not, ask the developer to clarify or add comments to the code, or refactor the code.
Bad signs - use boolean parameters, return Pair, long parameter lists

Naming(variable/function/class name) is important to make the code readable.
Whether names of variables, method, class are meaningful?
-- 20 Tips for Better Naming
Good names should answer all the big questions, be precise and thorough, reveal intent, and follow conventions.
-- Avoid use vague variable names.
waitTime -> waitTimeInMs

Number of Method Arguments should be as small as possible.
- smaller is better, remove unused parameters
Order of arguments
- especially when the methods/constructors are overloaded. 

Order of statements
-- Put related statement together
-- whether some commands/functions should be always last one? If so, is it documented? 

API design
Unnecessary code
Is there any unnecessary code(field/methods/function)?

Configuration Driven
Should it be define as constant, in property file or stored in db?
Prefer to store in db, so we can change it dynamically.

Do we need wire on/off feature?
- Consider adding turn-on flag for for new features

Code Duplication - Don't repeat yourself
Does it cause duplication?

Is there anything that can be extracted and reused in future?
Whether this is common/util function?
Whether the function is put in right place?

Tools like PMD/CPD can help find copy-paste code.
Similar Methods:
Extract common logic?
Use Holder object to contain all data?
Before create a new constant/util methods, search the code base first to make sure it's not already there.

Common Bugs
- Change to use Optional, or use annotations: such as @Notnull, @CheckforNull
-- if(anint == anInteger), anint == null, 
anInteger = boolean?anint:anInteger
-- Use NullSafe API:
Objects.equals, MoreObjects.firstNonNull, StringUtils...

< or <+, off-by-one error
isEmpty or isNotEmpty?
Dangling If/Missing else
Missing default in switch

Allocate Unlimited Resources
-- Executors.newCachedThreadPool(); 
-- Create threads, make db/network calls in loop
Call expensive operations in for loop or for every record.

Look for signs of questionable code
Deep nesting of loops and conditionals.
AutoDetectParser instead of JpegParser

Not Needed Code
Remove commented code
Developer may write code that is not needed - such unused field/functions/features
Don't guess at future functionality.

Be cautious of clever implementation.
Is it over-engineered?

Is there enough Logging?
Whether there is enough logging to help debug if the function doesn't work as expected or to prove the code work.
Logging more info in lower log level.

Should we add monitor info such as API's latency/frequency? 

Don't return array

- Non-static inner class, anonymous class, lambda uses method or field from the enclosing class
  - Prefer to use static inner class
  - Whether it may cause memory leak

- Use Optional instead return null 
- Use Collections.emptyList/List.of()/ etc instead of new empty ArrayList
- Use new features unnecessary or inappropriately
- In some cases, the code is more efficient to just use regular for loop instead of stream
- we can collect/update all variables in one loop rather than use multiple streams(traverse) to update them separately
- Miss @Override

Json and Java serialization and deserialization of Http API response data

How to write the perfect pull request
thoughtbot - Code Review
Post a Comment


Java (161) Lucene-Solr (112) Interview (64) All (58) J2SE (53) Algorithm (45) Soft Skills (39) Eclipse (33) Code Example (31) JavaScript (23) Linux (22) Spring (22) Tools (22) Windows (22) Web Development (20) Nutch2 (18) Bugs (17) Debug (16) Defects (14) Text Mining (14) Troubleshooting (14) J2EE (13) Network (13) Tips (12) PowerShell (11) Chrome (10) Problem Solving (10) Design (9) How to (9) Learning code (9) Performance (9) Security (9) UIMA (9) html (9) Http Client (8) Maven (8) bat (8) blogger (8) Big Data (7) Database (7) Google (7) Guava (7) JSON (7) Shell (7) System Design (7) ANT (6) Coding Skills (6) Lesson Learned (6) Programmer Skills (6) Scala (6) css (6) Algorithm Series (5) Cache (5) Continuous Integration (5) IDE (5) adsense (5) xml (5) AIX (4) Become a Better You (4) Code Quality (4) Concurrency (4) GAE (4) Git (4) Good Programming Practices (4) Jackson (4) Life (4) Memory Usage (4) Miscs (4) OpenNLP (4) Project Managment (4) Review (4) Spark (4) Testing (4) ads (4) regular-expression (4) Android (3) Apache Spark (3) Distributed (3) Dynamic Languages (3) Eclipse RCP (3) English (3) Happy Hacking (3) IBM (3) J2SE Knowledge Series (3) JAX-RS (3) Jetty (3) Mac (3) Python (3) Restful Web Service (3) Script (3) regex (3) seo (3) .Net (2) Android Studio (2) Apache (2) Apache Procrun (2) Architecture (2) Batch (2) Bit Operation (2) Build (2) Building Scalable Web Sites (2) C# (2) C/C++ (2) CSV (2) Career (2) Cassandra (2) Fiddler (2) Google Drive (2) Gson (2) How to Interview (2) Html Parser (2) Http (2) Image Tools (2) JQuery (2) Jersey (2) LDAP (2) Logging (2) Software Issues (2) Storage (2) Text Search (2) xml parser (2) AOP (1) Application Design (1) AspectJ (1) Chrome DevTools (1) Cloud (1) Codility (1) Data Mining (1) Data Structure (1) ExceptionUtils (1) Exif (1) Feature Request (1) FindBugs (1) Firefox (1) Greasemonkey (1) HTML5 (1) Httpd (1) I18N (1) IBM Java Thread Dump Analyzer (1) Invest (1) JDK Source Code (1) JDK8 (1) JMX (1) Lazy Developer (1) Machine Learning (1) Mobile (1) My Plan for 2010 (1) Netbeans (1) Notes (1) Operating System (1) Perl (1) Problems (1) Product Architecture (1) Programming Life (1) Quality (1) Redhat (1) Redis (1) RxJava (1) Solutions logs (1) Team Management (1) Thread Dump Analyzer (1) Visualization (1) boilerpipe (1) htm (1) ongoing (1) procrun (1) rss (1)

Popular Posts