
How do you guys approach the "start" of reading a code base? I never know where to start looking. Especially if it's a language I'm not too familiar with, I have no idea where to begin, and sometimes I can't even tell where program execution starts.


Mitchell Hashimoto has published a blog post describing how he approaches complex codebases. That might give you an idea where to start.

https://mitchellh.com/writing/contributing-to-complex-projec...


Great guide! I would add fixing bugs. I often learn most about a code base by fixing my bugs. A good debugger can be a blessing. Profiling is part of debugging to me. Questions can come up about why something is taking a long time that lead to more debugging and thinking about what is going on.


The best runs I've had working on others' codebases have come from jumping into documenting them. Many projects love having someone read, ask questions about, and document the code, even (or especially) from a naive standpoint, since naive readers are the ones who'll benefit the most from it. In the process you learn how the code is structured, track where references lead, and more often than not kick over some bugs worth fixing.


Good advice! I especially like the first piece of advice:

> The first step to understanding the internals of any project is to become a user of the project.

It's normally easier to figure out complex behaviour from the spec/doc/interaction than from the code.


I watched an interview on MSDN with one of the developers of .NET (I think she was responsible for the GC), who also used to work on Windows, writing those famous workarounds that kept games working on newer Windows releases even when they relied on old kernel bugs. I think she said that the best way to get familiar with a complex new codebase is to step through it with a debugger, going through several scenarios. I think it's a great idea. I only wish I had a working visual debugger in my day-to-day work.

EDIT: Found the interview: <https://docs.microsoft.com/en-us/shows/Careers-Behind-the-Co...>


Besides the good methods others have posted, and a really nice method (if you have it) of having someone else familiar with the code give you a tour, you can also do pretty well just by brute forcing it.

Get a list of all the files, sorted however you like (`find . -name '*.foo'` works), and start going through them top to bottom, or bottom to top if that's the clearer convention in the language. Maybe shuffle the order a bit if you discover unit tests (nearby, or by asking a tool to cross-reference a call) so you can read the code and its tests around the same time, but resist the urge to jump around too much or too deeply. Jot down short notes about what seems to be the main purpose(s) of each file, and move on. Keep going and keep track of what you've seen. Your first goal is a complete survey of all the files, so don't get too distracted by fully understanding new syntax (Java annotations and Python decorators can both be understood as high-level declarative tags, even though under the hood they're quite different) or by endless note revisions as new insights arrive and you start seeing connections or finally understand the terminology ("wtf is a 'hero'?").

You'd be surprised how fast you can do a single (high-level, shallow, skimming in places) pass, even for larger code bases. By the end of it you'll also have found the (or an) entry point, and you'll be in a better place for follow-up study or for producing materials that can help the next person (like an architecture diagram that lists the files involved in each element, at least at that moment, or just some important cross-references you've noted that a tool isn't necessarily going to make clear). And for easy code, a single pass may be all you ever need, even if you read it in a strange order. A completed puzzle is perfectly clear regardless of the order you put the pieces down.
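For what it's worth, the survey step can be scripted. This is only a rough sketch of the idea above, not a tool anyone in this thread uses: it lists every file with a given extension in sorted order and prints a checklist with an empty note slot per file. The function names `survey` and `notes_skeleton` are made up for illustration, and `.foo` stands in for whatever extension the project uses.

```python
from pathlib import Path


def survey(root: str, suffix: str) -> list[str]:
    """Return sorted, root-relative paths of files under root ending in suffix."""
    base = Path(root)
    return sorted(
        p.relative_to(base).as_posix()
        for p in base.rglob(f"*{suffix}")
        if p.is_file()
    )


def notes_skeleton(paths: list[str]) -> str:
    """Build a plain-text checklist with one empty note slot per file."""
    return "\n".join(f"[ ] {path}: " for path in paths)
```

Something like `print(notes_skeleton(survey(".", ".foo")))` gives you a file to fill in as you read; tick a box and write one line about the file's main purpose before moving on.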


Short answer -

git clone <repo>

Open the project in an editor/IDE.

Read the readme.md to get an idea of the author's view of the project.

Start at `func main(){}` and see what I find.

The longer answer can be learned by borrowing the patterns from

https://www.goodreads.com/book/show/567610.How_to_Read_a_Boo...
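If you don't know where `main` lives, a crude search usually turns up candidates. The sketch below is an assumption-heavy illustration: the regexes are my own guesses at common entry-point markers for a handful of languages, the name `find_entry_points` is hypothetical, and plenty of programs start somewhere these patterns won't catch (frameworks, build-tool configs, scripts without a main guard).

```python
import re
from pathlib import Path

# Assumed, incomplete heuristics: one regex per file extension.
ENTRY_PATTERNS = {
    ".go": re.compile(r"^\s*func\s+main\s*\("),
    ".py": re.compile(r"^\s*if\s+__name__\s*==\s*['\"]__main__['\"]"),
    ".rs": re.compile(r"^\s*(pub\s+)?fn\s+main\s*\("),
    ".c": re.compile(r"\bint\s+main\s*\("),
    ".java": re.compile(r"\bstatic\s+void\s+main\s*\("),
}


def find_entry_points(root: str) -> list[tuple[str, int]]:
    """Return (relative path, line number) for lines that look like entry points."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        pattern = ENTRY_PATTERNS.get(path.suffix)
        if pattern is None or not path.is_file():
            continue
        # Decode leniently so an oddly encoded file doesn't abort the scan.
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits
```

Calling `find_entry_points(".")` from the repo root gives a short list of places to start reading. Treat the hits as hints; a repo with several binaries or example programs will have more than one.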


I came up with the idea of ENTRYPOINT comments to solve this problem: https://gist.github.com/gushogg-blake/247b1bf2ed46b035d1c8a2...



