The previous post covered 4 Types of Code Reviews. In this post I’ll cover 6 Best Practices for Conducting ETL Code Reviews – to use before, during and after the code review.
1. Assemble the Code Review Team
Before deciding who should be on the team, think about when to put the team together. Don’t wait until the code review document is done! If you wait, a lot of time could be wasted before the actual code review occurs. The meeting might need to be scheduled several days later to work around everyone’s schedules. Bring the team together before the code review document is complete. If it takes four days to create the document, plan the meeting for one or two days after that. I prefer to meet soon after the document is ready so it’s still fresh in my mind. Also, I don’t like delaying code. I want it in production as soon as possible so I can move on to the next project.
The Team
Possible participants include:
- Developer of the code
- Senior Developer(s)
- Facilitator
- Junior Developer (optional)
- Technical Team Lead (optional)
- QA Resource (optional)
- Business Analyst (optional)
The first two on the list should be obvious but the other five may not be so I’ll cover each of them.
Facilitator – There should be someone assigned the task of managing the flow of the meeting and if necessary function as a referee. This could be a Senior Developer or even the Team Lead. Dual roles are OK.
Junior Developer – I prefer having a junior developer attend a code review so they can understand what to expect when their code is evaluated. This can also be a chance to teach the junior developer about the coding standards for the project. However, research indicates that this approach doesn’t always yield positive results. Therefore, don’t rely on it as the main reason to include them in the code review team.
Technical Team Lead – If the other developers don’t have enough experience then the Technical Team Lead may need to sit in and possibly run the meeting.
QA Resource – You may want QA in this meeting so they can get an idea of what they’ll be testing.
Business Analyst – This person will be reviewing the code to ensure that it meets the requirements.
There is no magic number for how many participants to have in the meeting. I usually expect to see at least three people (including the developer) in the meeting. Bigger isn’t always better. Remember, the more people you try to schedule to attend the meeting the more difficult it is to find a time when they can all meet. Use the smallest team possible to reduce resource usage and improve execution time.
Lawrence Votta produced this graph in his 1993 paper titled “Does Every Inspection Need a Meeting?”. The point of it is that the more reviewers in the meeting and the longer the meeting (in hours) the further out in time you have to go to find an open spot in all the reviewer’s schedule.
The takeaway from the graph: the fewer people in the meeting, the easier is to schedule the meeting.
2. Schedule the Code Review
Schedule the meeting after unit testing is complete and before QA starts. Remember to leave enough time:
- To complete the code review documentation
- For reviewers to read the document
- To fix issues found during the meeting
How long should the meeting be? I try to keep the meeting to one hour. If that isn’t long enough then schedule a second meeting. Most of the participants will lose focus after an hour. We don’t want any bugs to slip through just because the meeting was too long. If there is any doubt that a single meeting won’t be long enough, then schedule a second meeting up front. It’s easier to cancel a meeting than it is to add one.
3. Prepare for the Code Review
The developer is responsible for constructing a document that shows and explains each step of the ETL job. The document should be the sole source of code during the meeting. The ideal document will have:
- Screen prints of all ETL objects. Include everything necessary to explain the batch job.
- A list of the objects to be reviewed and version control labels of those objects
Why a document? Why not just have the developer lead the team through the job using a PC and a projector? Because the developer needs to take the time to thoroughly review the job and present an organized story. I don’t think there has been a single time when I am preparing the document that I haven’t found some issue with my code. It could be adding comments to clarify complex code or fixing bugs that weren’t found during unit testing. I look at the document construction step as a way to ensure that my code is shown in the best possible light.
The document also serves as a place to record the bugs or issues identified during the meeting. When it is my code that is being reviewed, I go into the meeting with a pad of sticky notes, a highlighter and a red pen. Every fault gets a sticky note which will be placed on the page where the fault was found such that it is sticking slightly outside the page. This allows me to find every issue after the meeting without having to remember where it was in the document. Every issue will get some sort of annotation on the page so I can remember the details surrounding the issue. All participants should be making similar annotations, not just the developer.
Once the document is complete it is published to the members of the code review team. At this point the ETL code must have been checked in to the Central Repository and labeled. Consider this as a code freeze. Reviewers don’t want to look at code that is a moving target.
4. Code Review Ground Rules
- Check your ego at the door – this goes for everyone on the code review team
- It’s about the code, not you
- No personal attacks
- There may not be a single right answer
- Review in one hour segments
- Record each defined bug or fault – suspected or confirmed
- Use the checklist
5. Code Review Checklist
This is a list of common faults to look for. For example, for every target table check to see if a primary key is defined and if not then make sure the “Use input keys” checkbox is enabled. Use the checklist to make sure you don’t gloss over something.
A subsequent post will go into much greater detail.
6. Follow Up
There must be a “final” code review to go over all the fixes resulting from the initial code review. For simple issues, this should be an Over The Shoulder type code review with one of the participants from the meeting. Complicated issues may require assembling the review team again.
There are no code review tools available for Data Services like there are for Java or C++ type projects. We have to work a little harder. Using the annotations made in the document during the meeting we can review each issue to make sure it has been resolved. Ideally, each issue would be formally recorded so we can go back over time to identify the top common faults for the team and for each developer. This allows us to gather metrics to show that – over time – the developers and the project are producing fewer bugs going into QA.
The next post in my ETL Code Review series will cover the Code Review Checklist.
References
• Votta, Lawrence G. “Does Every Inspection Need a Meeting?” ACM SIGSOFT Software Engineering Notes 18.5 (1993): 107-14. Print.