Cracking the Chaos - Tips on reading and debugging other programmers' code
Most of the IT firms where I worked as a programmer were in the IT services industry. There was very little greenfield work, and most of the time we had to maintain or fix code written by other engineers who, in some cases, couldn't even be found!
As a result, reading and debugging existing code was something we had to get used to. Based on my experience, here is a brief overview of how the process generally works, along with some tricks to make your life easier:
1. Equip yourself beforehand with all the domain knowledge and documentation you can find. Reading code written by someone else can be excruciatingly painful when you don't know what the code is about or what exactly it is supposed to do. This may not always be easy. You may need to gather bits of information from people wearing all kinds of hats, such as designers, testers, clients and domain experts, and you may also have to scavenge through your company's intranet or documentation systems. If you are lucky enough to get hold of test cases from a previous release, that's even better!
2. Study the back-end database (tables, views, etc.). A large share of the code in existence (I'd guess about 90%) takes input from somewhere and puts it in a database table, or takes information from a database table and either puts it on the user's screen or sends it in response to an API request. Understanding the back-end data structure first helps a lot when you later try to understand what the code is doing. This luxury may not always be available though! I remember that many times I had to start studying the code before database and server access had been provided to me. But if you have that access, go ahead and study the back-end tables first by running SQL statements such as DESC (or your database's equivalent) and SELECT.
3. Skim through the modules/classes briefly first, without going into each individual function/procedure. Assuming you followed the last two steps, you should have at least some idea of what the app is supposed to do by now. Now, resisting the urge to scrutinize the entire code line by line, try to get a sense of the app's "lay of the land". Start from the app's entry point (the main method in Java/C#, or its equivalent in other languages) and see what happens from there. As you skim through the code, look for comments of all kinds; sometimes there is valuable information in them. Keep separate notes of what the code is doing. It's quite helpful to write these notes as pseudo-code, for example:
start: main() function
    input: distro_file
    call check_pattern: distro_file
    if not check_successful:
        print("pattern doesn't match with any known distros")
        exit
    call verify(pattern_result)
The pseudo-code above is a brief example of the functionality of distro-verify, an open source Python app I've written.
4. Start debugging the code, and repeat as many times as necessary. Debugging is the bread and butter of every programmer worth their salt! Just as you need reins to control an unruly horse, you need a debugging tool to get effective control over a program's source code. It doesn't necessarily have to be step-by-step debugging with breakpoints in an IDE, though that's a great way and preferred if possible. You can also add print statements at important locations in the code. The more you debug the code, the more you'll understand what it's doing. This is the part where you scrutinize every nook and cranny of the code, because the debugging process will take you there.
5. Try all kinds of inputs and variations. Usually your tester, your designer or the documentation will tell you what the inputs are supposed to be. As you debug through the code, keep notes of whatever you don't understand (a designer or domain expert may later help you with that part). Also try various kinds of inputs and observe what the app does with each one. If you have the test cases, run the app through each of them and see whether they pass.
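For step 2, here is a minimal sketch of what "study the tables first" can look like from a script. It assumes a SQLite database purely for illustration, where `PRAGMA table_info` plays roughly the role that DESC plays in other databases. The `orders` table and the `describe_database` helper are hypothetical, made up for this example.

```python
import sqlite3


def describe_database(conn, sample_rows=3):
    """Return {table: {"columns": [...], "sample": [...]}} for every table."""
    overview = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info lists one row per column: (cid, name, type, ...).
        columns = [
            (row[1], row[2])  # (column name, declared type)
            for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        # A few sample rows often explain more than the schema alone.
        sample = conn.execute(
            f'SELECT * FROM "{table}" LIMIT ?', (sample_rows,)
        ).fetchall()
        overview[table] = {"columns": columns, "sample": sample}
    return overview


# Hypothetical in-memory database standing in for the real back end.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.execute("INSERT INTO orders (customer, total) VALUES ('alice', 19.5)")

overview = describe_database(conn)
for table, info in overview.items():
    print(table, info["columns"], info["sample"])
```

On other databases the idea is the same, but the catalog queries differ, so treat the SQL here as SQLite-specific.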
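Steps 4 and 5 can be combined in a small throwaway script: add trace output at the important locations, then feed the code a spread of inputs and note what comes back. The sketch below uses Python's standard `logging` module instead of bare print statements. The `check_pattern` function is a hypothetical stand-in for whatever unfamiliar function you are studying (it is not the real distro-verify code), and `probe` is a helper invented for this example.

```python
import logging

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger("exploration")


def check_pattern(distro_file):
    """Hypothetical stand-in for the unfamiliar function under study."""
    log.debug("check_pattern called with %r", distro_file)
    name = distro_file.strip().lower()
    for distro in ("ubuntu", "fedora", "debian"):
        if name.startswith(distro) and name.endswith(".iso"):
            log.debug("matched distro %r", distro)
            return distro
    log.debug("no known distro matched")
    return None


def probe(func, inputs):
    """Run func on each input and record the result or the exception type."""
    observations = {}
    for value in inputs:
        try:
            observations[value] = func(value)
        except Exception as exc:  # record the failure instead of stopping
            log.debug("input %r raised %r", value, exc)
            observations[value] = type(exc).__name__
    return observations


# Mix expected inputs with odd ones: stray whitespace, wrong type, empty.
inputs = ["ubuntu-22.04.iso", "  Fedora-39.ISO ", "notes.txt", "", None]
observations = probe(check_pattern, inputs)
for value, outcome in observations.items():
    print(repr(value), "->", outcome)
```

The resulting table of inputs and outcomes is also a handy list of questions to take to a designer or domain expert, for example whether a missing file name is really meant to crash.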
This is a very generic process. Your particular code and domain might have peculiarities that call for additional steps. Do let me know in the comments below how you fare in reading other programmers' code!
Thanks for providing this helpful article. I'm a maintenance programmer, and these are great tips.
If you see strange or unexplainable code that just seems wrong or unnecessary, be careful. Usually there's a reason it was put there. You might be tempted to remove it or "fix it", but be very careful. Just because a code base is super messy doesn't mean that the coders who worked on it were stupid; there are probably reasons for what they did.
You're right. It's a great idea to actually contact them, if they can be found, and have a chat about the code they've written. You'll usually get remarkable insights from that.