Reading Code: From Abstraction to Details
There are two questions I am asked quite often. The first is how to structure code so that it is easy to read and manipulate; the second is how to read code effectively (in a review or as part of a maintenance task). Clearly, these two questions are related. In fact, they are two sides of the same issue. Code should be written so that it is easy to read and work with. Therefore, if we have a preferred way to read code, the author should write the code so that it supports this code-reading technique.
So, what is the most effective way to read (and write) code? You might argue that this is a question of taste and habit. However, I believe we can easily find some high-level guidelines, which can help both the author of the code and its reader in their tasks.
The human mind is great at capturing abstractions. At the same time, it is limited in the number of things it can capture, understand, and remember at any given time. These two attributes can guide us in reading and writing code. As a general rule, it is easier to understand source code when you can concentrate on a limited number of details, and then drill down into more details (at a finer resolution) as needed.
Take for example the task of understanding how a car works. When you first approach a car, you don’t want (or need) all the little details such as the internal structure of the engine, where the fuel pump is located, etc. If you had to know all these details to understand the concept of a car, you would be quite frustrated. So, when you first approach a car, all you need to be aware of are some high-level concepts such as the steering wheel, brakes, gas pedal, gears, etc. These abstract concepts are enough to get you started. You can easily get a notion of the system you are trying to run. Of course, you might be interested in finer details, such as how the brakes perform their part in the system. This information should be quite easy to locate when you need it, after you understand the system from a high-level perspective.
Back to code. Take for example the following code snippet (in C#). Don’t invest more than a couple of minutes in trying to understand it…
    public class FileServer {
        private Socket m_ServerSocket;
        private const int LENGTH_FIELD_SIZE = 2;
        public FileServer(ushort listeningPort, int readTimeout)
        public void Run()
        m_ServerSocket.Bind(new IPEndPoint(0, m_ListeningPort));
As you probably saw, this (not too long) piece of code is cluttered with information. It performs several tasks and it forces you to delve into all the technical low-level details the moment you start reading it. This makes it difficult to read, understand, and later manipulate. If that’s not enough, as a special "bonus" this piece of code also includes a bug, which hides somewhere among all these details.
Now, try to see if the following version of the same application is somewhat easier to understand:
    public void ServeRequest()
    {
        try
        {
            byte[] requestBuffer = ReadRequest();
            string targetFileName = ExtractFileName(requestBuffer);
            byte[] fileContent = LoadFileContent(targetFileName);
            byte[] encryptedContent = EncryptData(fileContent);
            SendResponse(encryptedContent);
        }
        catch (Exception ex)
        {
            // TODO: Handle Error
        }
        finally
        {
            m_ActiveConnection.Close();
        }
    }
I bet you found this version of the code to be much simpler. Now it is clear at first glance what the code should do. The details are not important when you first encounter the code of this application. When some details become relevant, you know exactly where to find them. If, for example, you need to know how the name of the file is extracted from the request, you can go directly to the ExtractFileName method and find the relevant two lines of code. As a special bonus, the bug that was part of the first version of the code cannot possibly happen in the second version. The fact that each method has a single, well-defined responsibility prevents such a bug from happening (if you still haven’t found it, have a look at the "encryption" sequence in the original version… if you can locate it).
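To make the "relevant two lines" concrete, here is a minimal sketch of what such a method might look like. The request format is not described above, so this assumes a hypothetical layout: a two-byte big-endian length field (suggested by the LENGTH_FIELD_SIZE constant in the first version) followed by the file name encoded as UTF-8. The RequestParser class name is also an assumption made for this illustration.

```csharp
using System;
using System.Text;

public static class RequestParser
{
    // Assumed wire format (not specified in the article):
    // [2-byte big-endian name length][file name bytes, UTF-8]
    private const int LENGTH_FIELD_SIZE = 2;

    public static string ExtractFileName(byte[] requestBuffer)
    {
        // Read the length field, then decode exactly that many bytes after it.
        int nameLength = (requestBuffer[0] << 8) | requestBuffer[1];
        return Encoding.UTF8.GetString(requestBuffer, LENGTH_FIELD_SIZE, nameLength);
    }
}
```

Under these assumptions, a request consisting of the bytes 0x00 0x05 followed by the five bytes of "a.txt" yields the string "a.txt". A reader of ServeRequest never has to look at this arithmetic unless the file-name handling is what they came for.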
The conclusion from this little experiment is that code should be read from abstractions to details. Naturally, it must be written in a manner that supports this approach. Now, when I talk about abstractions in this context, I am not referring only to design-level abstractions such as abstract classes and interfaces. The example above has nothing to do with these entities. The term ‘abstraction’ in the example above refers to abstract operations (such as EncryptData) which do not force the reader (or the writer) to deal with their implementation details. The fact that these details are enclosed in a different method in the same class is not a problem in the context of this discussion.
Of course, design-level abstractions are also a powerful tool for achieving the same goal in a broader scope. If you are reading code which uses many well-defined interfaces, reading it should be easier. You should not delve into all the possible implementations at first. You only need to understand the role of the abstraction in the context in which it is used. Later, when you need to deal with concrete details, you can easily find the relevant implementation and read it in isolation.
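As a rough illustration of the same idea at the design level, the sketch below puts the encryption step behind an interface. All the names here (IEncryptor, XorEncryptor, ResponseBuilder) are hypothetical and do not come from the original application, and the XOR implementation is a toy stand-in rather than real encryption.

```csharp
using System;

// Hypothetical abstraction: the reader of the calling code only needs its role.
public interface IEncryptor
{
    byte[] Encrypt(byte[] data);
}

// Toy implementation for illustration only; XOR with a fixed key is not secure.
public sealed class XorEncryptor : IEncryptor
{
    private readonly byte m_Key;

    public XorEncryptor(byte key)
    {
        m_Key = key;
    }

    public byte[] Encrypt(byte[] data)
    {
        byte[] result = new byte[data.Length];
        for (int i = 0; i < data.Length; i++)
        {
            result[i] = (byte)(data[i] ^ m_Key);
        }
        return result;
    }
}

public sealed class ResponseBuilder
{
    private readonly IEncryptor m_Encryptor;

    public ResponseBuilder(IEncryptor encryptor)
    {
        m_Encryptor = encryptor;
    }

    // Reads the same regardless of which IEncryptor implementation is plugged in.
    public byte[] BuildResponse(byte[] fileContent)
    {
        return m_Encryptor.Encrypt(fileContent);
    }
}
```

Someone reading BuildResponse can understand it fully without opening XorEncryptor. The concrete class can be read later, in isolation, when the encryption details themselves become the subject of interest.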
Following this guideline when writing code leads to better design and better structure of the code. The code is much easier to understand, and safer to work with.











