Post History
Answer
#4: Attribution notice removed
Source: https://writers.stackexchange.com/a/36259 License name: CC BY-SA 3.0 License URL: https://creativecommons.org/licenses/by-sa/3.0/
#3: Attribution notice added
Source: https://writers.stackexchange.com/a/36259 License name: CC BY-SA 3.0 License URL: https://creativecommons.org/licenses/by-sa/3.0/
#2: Initial revision
I am a professor and PhD who has been coding for over 40 years. I'll restrict this comment to documenting code, which is different enough to warrant its own answer from a professional.

I "grew up" (my first real programming job, 40 years ago) on IBM operating system code, written entirely in assembly language. The assembly language of the time was rather cryptic: "MOV R1,R2" for example, or "ADD 1,R3". Variable names were restricted to eight characters. If you have looked at the modern assembly generated by your compiler, it can be even more cryptic, due to a variety of addressing modes and instructions present today that did not exist back then.

As a result, it was close to impossible to "read the code" and understand a single thing about what it was supposed to be doing, or what the programmer thought he was doing, with the instructions given.

A consequence of that was the IBM coding standard, which for all I know had been around since the 1960s: block comments on every "function" (subroutine at the time) AND a comment on every assembly line.

The block comment provided an overview of what the code was doing and how. Such as,

> Parsing an interrupt from a controller; the interrupt number is in R0 and the controller code and sub-command are mapped into the upper and lower halves of R7. We find a function table for this interrupt, and call the routine indexed by the controller code, with the sub-command in R0.

The actual code had a comment on EVERY line, with no exceptions, explaining what the assembler code was doing. Example:

> MOV R7, R1 # Make a copy of controller code and sub-command.
> AND 0xFF, R1 # Isolate just sub-command, for later.
> SHR $5,R7 # find offset into function address table.

And so on. This is also the origin (I am pretty sure) of the rule "don't repeat what you are doing in the comments." This is not helpful:

> SHR $5,R7 # shift R7 5 bits right!

Presume I can read code; I don't need you to tell me that. I need you to tell me WHY you are shifting R7 5 bits right.

Reading that early OS code, all the comments were aligned in column 40 (of an 80 column screen): reading down the left half was **WHAT** you were doing, reading down the right half was **WHY** you were doing it.

I am not recommending this for C, C# or C++. To a **small** extent they can be self-documenting:

> sec += uSec / 1000000.; // no comment necessary.

But if you can see why this pervasive commenting was necessary when thousands of coders were writing straight-up assembly for millions of lines of code with hundreds of devices, and any of them might quit any day, you can understand the spirit of how your modern coding commentary should be:

### It should explain the code and what you are doing well enough that another programmer, of your skill but unfamiliar with the coding problem being solved, can get into the routine and the details and find a bug.

Without wondering WTF you are thinking or checking with:

> if (x & 0xA013 || y & 0x03) var3+=7;

Your code should not be cryptic.

It is difficult for comments on lines to get stale; if the line is fixed I fix the comment at the same time. It is easy for block comments on functions (or whole files) to get stale; it takes an effort to fix a function, test it, and then go back and fix the block comment. So it is better to keep the block comment informative but not **detailed**. For example, you may do a table search, but you don't have to specify that you are doing a binary search, or hash table search, or whatever. You search! Exactly how is in the code; commentary can be found there if appropriate (and may not be needed if your code calls "binSearch(&Table, N, Key);" or something like that). You aren't commenting for novices, but programmers.

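
As a rough illustration of that split between an abstract block comment and WHY comments on individual lines, here is a minimal sketch in C. Everything in it is hypothetical and invented for the example: the `Device` table, the function names, and the layout of the request word (device id above the low byte, sub-command in the low byte). Only `bsearch()` is a real standard library call.

```c
#include <stdlib.h>

/* Hypothetical device table entry; the table is kept sorted by id. */
typedef struct {
    int id;
    const char *name;
} Device;

static const Device device_table[] = {
    { 3, "disk0" }, { 7, "tape0" }, { 12, "console" }, { 20, "printer" },
};

/* Ordering callback required by bsearch(): key first, table entry second. */
static int cmp_device(const void *key, const void *entry)
{
    int id = *(const int *)key;
    const Device *d = entry;
    return (id > d->id) - (id < d->id);
}

/*
 * Find the device configured under `id`.
 * Returns its table entry, or NULL if no such device is configured.
 * (An abstract for the caller; HOW we search is left to the code.)
 */
const Device *find_device(int id)
{
    size_t n = sizeof device_table / sizeof device_table[0];
    return bsearch(&id, device_table, n, sizeof device_table[0], cmp_device);
}

/*
 * Split a request word into its device and sub-command.
 * Returns the device, or NULL if it is unknown; the sub-command
 * is stored in *subcmd either way.
 */
const Device *decode_request(unsigned reg, unsigned *subcmd)
{
    *subcmd = reg & 0xFFu;      /* save the sub-command now; the handler needs it later */
    int id = (int)(reg >> 8);   /* what remains above the low byte names the device */
    return find_device(id);
}
```

Note that neither block comment mentions a binary search: if the table later becomes a hash table, only the body of `find_device` changes and the comment stays true. With this made-up layout, `decode_request(0x0C05, &sub)` yields the "console" entry and a sub-command of 5.
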
In short, the block comments are an **abstract** of what the function does for the caller, kind of like how you would think of it when you are calling it in other code; its reason for being there. For example:

> It initializes all the disk controller chips, using the table defined by the machine configuration, and leaves them in a ready state or disabled state if the disk hardware fails its self-test.

That said, modern code can contain a plethora of library or 3rd party package calls that even a programmer will not know, so if your library calls or methods (or their arguments) are complex or have cryptic names, it may be helpful (both to other programmers, and to yourself in six months) to explain what a called routine is supposed to be doing, in particular if it does a lot of work and takes a ton of arguments (like a JavaScript library animating a chart on the user's screen, using several tables and levels of data).

In such cases, I tend to go multi-line on the function call, putting one or a few arguments on each line with commentary explaining them, as I see necessary.

> ret = rend3D3LSv( // render chart, 3-D, 3 level, side-viewer,
> cardata, nCar, nDim, // the car data we just selected, rows and columns,
> pX, pY, lX, lY, // screen box chart window dimensions,
> 0.5, // Limited rotation to 180d,

And so on.

Commentary should be maintained with changes, but it should not be HARD to maintain. Even in C, it should be designed to help debug or update the code when that happens a few years from now, when you've written a hundred thousand **other** lines of code (everything changes sooner or later, from the trivial like how input data is formatted, to the major like which libraries or rendering packages your software must use).

Comments are a record of your analytic thought, so a future programmer (including yourself) is not stuck trying to read your mind-of-the-moment in order to use, debug, or update your code.

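
The multi-line call style described above might look like the following in complete C. This is only a sketch: `render_chart` is a hypothetical stand-in written for this example (it is not the `rend3D3LSv` routine, whose real signature is not given here), and the data and window values are made up.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a third-party rendering call. It only
 * validates its arguments and reports how many cells it would draw.
 */
static int render_chart(const double *data, int rows, int cols,
                        int x, int y, int width, int height,
                        double max_rotation)
{
    if (data == NULL || rows <= 0 || cols <= 0)   return -1;
    if (width <= 0 || height <= 0)                return -1;
    if (max_rotation < 0.0 || max_rotation > 1.0) return -1;
    (void)x; (void)y;   /* position is accepted but unused in this stub */
    return rows * cols;
}

/* Draw the currently selected cars; returns cells drawn, or -1 on rejection. */
int draw_selected_cars(void)
{
    /* Made-up data: 2 cars, 3 measurements each. */
    static const double car_data[2 * 3] = { 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 };

    int ret = render_chart(
        car_data, 2, 3,     /* the car data just selected: rows, columns */
        10, 20, 640, 480,   /* chart window on screen: origin x/y, then width/height */
        0.5);               /* fraction of a full turn: limits rotation to 180 degrees */

    return ret;
}
```

Grouping related arguments on one line keeps each comment next to the values it explains, so changing the window size or the rotation limit touches one line and one comment.
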
To the extent it is truly obvious what you are doing in your code, you don't have to supply redundant comments. Determining what will BE truly obvious to another person, or to yourself in a year, is just a skill you will have to learn; but it is better to err on the side of caution and comment. A few seconds now can save you an hour of frustration later. Certainly, any function call **you had to look up** to be able to call it, you should comment as you call it.

Like IBM, I am still in the habit of writing code and comment together, while the thought of "why" is in my head. I do not go back over written code and try to come up with the comments then. Likewise, my block comments on functions are written BEFORE the function (although I have usually written the prototype and args first and know what I intend to DO with the function).

However, I warrant there is leeway on complex calls used all the time in some field; if I have 200 calls to a linear algebra library in my code, I don't comment every time I call a matrix-multiply, on the grounds that a programmer reading this code knows (or should know) what a DGEMM() looks like and doesn't need me holding their hand. Or my comment isn't about DGEMM(), it is something helpful like **_// get first intermediate vector from partial right hand side_**.

This is quite different from writing user manuals and documentation; I've done that but have no particular insight or advice, so I will leave it for others to explain.