I do agree that visualizations are lacking. In part this is because complete analysis is difficult, and rather than do half a job, tool authors don't do the job at all. In part it's because different tools do a better job: for example, instrumenting profilers are better at showing control flow.
I generally want three kinds of things in my head when reading code: control flow, data flow and data shape.
But there are different resolutions to this. Control flow may be simple, at the method level; or it may be more complex, with asynchronous callbacks, queuing systems, RPC, web service requests and orchestrations controlled by configuration.
Data flow may be simply knowing where in the code a particular attribute is updated and where it is read and used to make a decision. But it's also about where the data came from ultimately - what all the ingredients are that go into its formation - and also what other data it in turn affects.
Data shape has to do with the simple shapes of structures, but diagrams are scarcely needed for that - a glance at the definition of the structure is enough to commit it to short-term memory. More interesting are global invariants, local invariants, longer chains (how you get to one distant structure from another via links), the database model, configuration data model, static data vs dynamic data.
Most of the interesting information is at a higher level than can reasonably be analyzed in most Turing-complete languages. The best info comes from profiling or debugging.
Looking at Coati.io, it can't do those languages yet, but the graphical flow chart looks really amazing to me and I'd love to try it out on my own code.
For more current approaches, see "Code Maps" in the VS 2015 release notes here - https://www.visualstudio.com/en-us/news/releasenotes/vs2015-...
Proper control flow analysis is done less often, for the reasons I explained. Profilers do it better. See e.g. the Call Graph in AQTime: https://support.smartbear.com/viewarticle/43205/ - it lets you drill through the callers and callees of each function as seen in practice. This means it can see through polymorphism, reflection, dynamic loading, function pointers - all the things that make static analysis infeasible.
Even the debugger fails when you come across indirect calls. Also, in gdb, multiple threads make it easy to miss something when stepping through code.
For exploring control flow, an instrumenting profiler is really useful, if you can easily isolate a representative code run.
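To make the "callers and callees as seen in practice" idea concrete, here is a rough sketch of what instrumentation boils down to. It is not how AQTime or any real profiler is implemented; CallGraphRecorder, ScopedProbe and the Shape classes are hypothetical names made up for illustration. Each instrumented function drops a probe on entry, and the recorder counts caller -> callee edges from whatever is actually on the stack at that moment.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical recorder: counts caller -> callee edges as they occur at run time.
class CallGraphRecorder {
 public:
  using Edge = std::pair<std::string, std::string>;

  void Enter(const std::string& name) {
    const std::string caller = stack_.empty() ? "<root>" : stack_.back();
    ++edges_[Edge(caller, name)];
    stack_.push_back(name);
  }

  void Leave() {
    assert(!stack_.empty());
    stack_.pop_back();
  }

  // Number of times `caller` was observed calling `callee`.
  int Count(const std::string& caller, const std::string& callee) const {
    const auto it = edges_.find(Edge(caller, callee));
    return it == edges_.end() ? 0 : it->second;
  }

 private:
  std::vector<std::string> stack_;  // names of the currently active probes
  std::map<Edge, int> edges_;
};

// RAII probe: records entry in the constructor and exit in the destructor.
class ScopedProbe {
 public:
  ScopedProbe(CallGraphRecorder& rec, const std::string& name) : rec_(rec) {
    rec_.Enter(name);
  }
  ~ScopedProbe() { rec_.Leave(); }
  ScopedProbe(const ScopedProbe&) = delete;
  ScopedProbe& operator=(const ScopedProbe&) = delete;

 private:
  CallGraphRecorder& rec_;
};

struct Shape {
  virtual ~Shape() = default;
  virtual void Draw(CallGraphRecorder& rec) const = 0;
};

struct Circle : Shape {
  void Draw(CallGraphRecorder& rec) const override {
    ScopedProbe probe(rec, "Circle::Draw");
  }
};

struct Square : Shape {
  void Draw(CallGraphRecorder& rec) const override {
    ScopedProbe probe(rec, "Square::Draw");
  }
};

void Render(CallGraphRecorder& rec, const Shape& shape) {
  ScopedProbe probe(rec, "Render");
  shape.Draw(rec);  // virtual call: the target is only known at run time
}
```

After calling Render twice with a Circle and once with a Square, the recorder holds the edges Render -> Circle::Draw (2) and Render -> Square::Draw (1). A static tool looking at Render alone only sees a call to Shape::Draw.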
Downside was it took a long time (many hours) to scan the entire codebase.
Once during the start of my career I was asked to implement a caching layer in an HTTP library. I had no idea what it was, so there was some reading up to do. There were the excellent guides from mnot.net, as well as the nicely written RFC 2616. But Chrome's code was the best of all - https://github.com/adobe/chromium/blob/master/net/http/http_.... If you want to know for a fact how Chrome decides caching, that code is kind of the heart of it.
Recently I needed to retrieve the "Rendered Font Name" that is available in the DevTools "Computed CSS" section. This is the name of the system font that Chrome finally picks, based on the Font-Family property. This is platform specific and so not in the DOM, nor available to Chrome Extensions directly. The only way it could be done was by making an extension that runs in debug mode and communicates with the browser through its remote debugger protocol. (This part of the documentation is lacking, but it is an esoteric topic anyway.) The good news was that the code that does this is well encapsulated and could be easily extracted into a command line utility. For the curious: https://chromium.googlesource.com/chromium/src/+/master/thir... (there is a nugget about real-world software in the comments)
This is, at a surface level, good code to me. Things are easy to find, there is a rhyme and rhythm to the system, and it feels welcoming. The thoughts of the people who designed that system over years would be great to hear. Most writing about good code on the internet comes from an OO background, mostly wrt information systems. I wonder what people who've written these systems have to say about building and engineering complex software.
I would say that at the beginning of the project (2006-2008) we didn't have so much of a focus on platform design, just on shipping a browser as quickly as possible. Some of the abstractions from that era haven't stood the test of time as the project has scaled to many platforms, features etc.
Over the course of time we've had various refactoring projects to try and pay down some of the technical debt. The first major one was the "content refactor" from 2011. This led to the separation of the multi-process browser shell from the UI layer, which has allowed for other chromium-derived browser apps to emerge.
Today, we've observed that even this layer is a bit too complicated, so we're running more projects to try and modularize it a bit more. My mental model is that the browser is kind of like a set of system services for an ephemeral app runtime, and it's good to imagine what the APIs & separation between those things should be. To aid this we've developed a new suite of IPC tools which are way more useful than the original stuff we have used for much of the lifetime of Chrome.
Anyway this kind of thing requires an ongoing investment and a set of people who thrive on the art of API design and in grungy, challenging refactoring work. I probably have many more thoughts on this topic but this'll do for right now :-)
There is a dearth of quality conversations on the internet about good code in a real-world messy context, mostly because the people who're doing serious work don't have the time to talk about it. It would be a good thing if you wrote more. In fact you folks should be writing books!
// Of course, there are other factors that can force a response to always be validated or re-fetched.
Gee thanks, what might those factors be? A comment like that is worse than not commenting at all.
Or the real ones that get my goat, comments which just tell you what the code that follows obviously does:
// If there is no Date header, then assume that the server response was
// generated at the time when we received the response.
Time date_value;
if (!GetDateValue(&date_value))
  date_value = response_time;
There's even a line (1019) where they divide by 10 but don't explain why they do it. That's what I personally would comment.
The code itself is fairly easy to read, though I do wonder why they bothered using TimeDelta, as it just seems to make the code more complicated and results in confusing code like this:
return TimeDelta(); // not fresh
If they just returned an int, you'll sooner or later see it passed through five different places, and at the end location, nobody remembers if it's ms, time ticks, seconds, or even a unix time instead of a delta. TimeDelta removes that question.
It also does things that have subtle issues you might miss if you did it "by hand", like saturated adds, multiplying with an integer value while handling overflow correctly, etc.
These things might be overkill for smaller projects, but once you have something with hundreds of contributors, every little bit helps keep the code base sane.
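For anyone who hasn't worked with such a type, here is a minimal sketch of the idea. It is not Chromium's TimeDelta; Delta and all of its members are hypothetical, and it assumes microseconds stored in a 64-bit integer. The unit only appears at the edges (FromSeconds, InMilliseconds), and addition clamps instead of overflowing.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a typed duration. Not Chromium's actual TimeDelta.
class Delta {
 public:
  Delta() = default;  // zero-length delta

  // The unit is named at construction, so callers never pass a bare integer.
  static Delta FromMilliseconds(int64_t ms) { return FromScaled(ms, kMicrosPerMilli); }
  static Delta FromSeconds(int64_t s) { return FromScaled(s, kMicrosPerSecond); }
  static Delta Max() { return Delta(kMax); }
  static Delta Min() { return Delta(kMin); }

  // The unit is named again when a raw number is taken back out.
  int64_t InMilliseconds() const { return us_ / kMicrosPerMilli; }
  int64_t InSeconds() const { return us_ / kMicrosPerSecond; }

  // Saturated add: clamps to Max() or Min() instead of overflowing.
  Delta operator+(Delta other) const {
    if (other.us_ > 0 && us_ > kMax - other.us_) return Max();
    if (other.us_ < 0 && us_ < kMin - other.us_) return Min();
    return Delta(us_ + other.us_);
  }

  // Division by a positive integer, e.g. taking a tenth of an interval.
  Delta operator/(int64_t divisor) const {
    assert(divisor > 0);
    return Delta(us_ / divisor);
  }

  bool operator==(Delta other) const { return us_ == other.us_; }
  bool operator<(Delta other) const { return us_ < other.us_; }

 private:
  static constexpr int64_t kMicrosPerMilli = 1000;
  static constexpr int64_t kMicrosPerSecond = 1000000;
  static constexpr int64_t kMax = std::numeric_limits<int64_t>::max();
  static constexpr int64_t kMin = std::numeric_limits<int64_t>::min();

  explicit Delta(int64_t us) : us_(us) {}

  // Multiplies by a positive scale, clamping if the product would overflow.
  static Delta FromScaled(int64_t value, int64_t scale) {
    if (value > kMax / scale) return Max();
    if (value < kMin / scale) return Min();
    return Delta(value * scale);
  }

  int64_t us_ = 0;  // always microseconds internally
};
```

With this, Delta::FromSeconds(1) + Delta::FromMilliseconds(500) is 1500 ms without anyone multiplying by 1000 by hand, and Delta::Max() + Delta::FromSeconds(1) stays at Delta::Max(). A default-constructed Delta() is the zero value, which is what a "not fresh" return would be in a type like this.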
As for the division by 10 - look up at line 953 :)
Yes, it should be a named constant. And some of the comments could certainly be better. It's a work in progress. (And if you want to help with that work, we happily accept patches!)
The GetFreshnessLifetime function below it then covers the additional cases where it returns that. Such as the headers being set to not cache, or the expiry time being earlier than the response's time (or current time if none is provided).
I think it also makes sense to assume the comment is letting us know that just because HttpResponseHeaders::RequiresValidation returns false, that doesn't mean that's the only thing that can make it require/not require a re-fetch.
> There's even a line (1019) where they divide by 10 but don't explain why they do it. That's what I personally would comment.
This is covered near the head of the function, line 951. Using a constant such as heuristic_scalar instead of simply using '10' would be more readable though.
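A sketch of what the named-constant version could look like. This is not the Chromium source: ResponseTimes, kHeuristicFreshnessDivisor and HeuristicFreshnessLifetime are hypothetical names, and the header handling is reduced to plain fields. The 10 matches the 10% fraction that RFC 2616 (section 13.2.4) mentions as a typical heuristic setting when only Last-Modified is available.

```cpp
#include <chrono>

using Seconds = std::chrono::seconds;

// Hypothetical, simplified view of the relevant header values.
// Times are seconds since an arbitrary epoch.
struct ResponseTimes {
  bool has_date = false;
  Seconds date{0};
  bool has_last_modified = false;
  Seconds last_modified{0};
  Seconds response_time{0};  // when we received the response
};

// Heuristic freshness is one tenth (10%) of the time since the resource
// was last modified. Naming the constant documents the intent at the
// point of use.
constexpr int kHeuristicFreshnessDivisor = 10;

Seconds HeuristicFreshnessLifetime(const ResponseTimes& r) {
  // If there is no Date header, assume the response was generated when
  // we received it.
  const Seconds date_value = r.has_date ? r.date : r.response_time;

  // Without a usable Last-Modified value there is nothing to base the
  // heuristic on.
  if (!r.has_last_modified || r.last_modified > date_value)
    return Seconds::zero();  // not fresh

  return (date_value - r.last_modified) / kHeuristicFreshnessDivisor;
}
```

For a response received at t=1000 with Last-Modified at t=0 and no Date header, this gives a lifetime of 100 seconds.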
The example uses the author mentions ("Following code paths from method to method", "Finding where an interface is implemented and which methods get overridden", "Exploring dependencies between types and functions") all sound like pretty standard features today. Plus with an IDE you get the benefit of having them right there in the editor/debugger/etc. and much more.
I would however really like something for inspecting the run-time structure of an application's objects. Most debugger views are really clunky for looking at large amounts of data, and even the pretty-print features often don't help much. Having a zoomable graph with the objects right there in front of me would really bring my productivity to the next level.
(I've been thinking about ways of getting to deal with arbitrary object graphs, but an important requirement was to keep it language-independent, so it just deals with common data structure formats at the moment: lists, trees, tables, graphs, hashmaps, etc. My thinking is that most difficult-to-debug algos are performing operations on these anyhow.)
This specific screenshot however - a linked list - shows a situation where you most likely don't care about the exact structure like this - you only care that it's a list of elements, so most debuggers would show it to you like that - just as a list. The specific raw structure only obscures the parts you care about.
Visual Studio allows describing your custom data structures like this using XML files (https://msdn.microsoft.com/en-us/library/jj620914.aspx), or even custom graphical controls (https://code.msdn.microsoft.com/windowsdesktop/Writing-graph...) and gdb allows you to write a pretty-printer in Python (using an API which I can't find any documentation for :/ ), but it's all kinda clunky and still doesn't deal well with large amounts of data.
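Conceptually, what such a custom view does for the linked-list case is small. The sketch below is not the natvis or gdb API; Node, Flatten and Summarize are hypothetical names, and it assumes an acyclic, singly linked list. It walks the raw next pointers once and presents the result as a flat, truncated list.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical raw structure, as a debugger would see it in memory.
struct Node {
  int value;
  Node* next;
};

// Follows the `next` links and collects the payloads in order.
// Assumes the list is acyclic; a cycle would make this loop forever.
std::vector<int> Flatten(const Node* head) {
  std::vector<int> out;
  for (const Node* n = head; n != nullptr; n = n->next) out.push_back(n->value);
  return out;
}

// Renders the list the way a summary view might: the element count plus
// at most `max_items` elements, hiding the node and pointer plumbing.
std::string Summarize(const Node* head, std::size_t max_items) {
  const std::vector<int> items = Flatten(head);
  const std::size_t shown = std::min(items.size(), max_items);
  std::string s = "list(" + std::to_string(items.size()) + ") [";
  for (std::size_t i = 0; i < shown; ++i) {
    if (i > 0) s += ", ";
    s += std::to_string(items[i]);
  }
  if (items.size() > shown) s += (shown > 0 ? ", ..." : "...");
  s += "]";
  return s;
}
```

The truncation parameter is the part that matters for large data: the view has to decide what to leave out, which is exactly where the existing tools get clunky.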
My ideal tool would be something that combines both of these things - you have
- An object graph that you can zoom in/out of that shows raw objects (just like in that DDD screenshot).
- An easy way to describe a custom view for your own objects (you can also switch to the raw view for an object, or switch between different defined views, etc.). Just like VS/gdb/lldb allow you to do, but a lot easier and more powerful. You could for example view a specific dictionary that contains complex objects in a tabbed interface, where the keys are tab titles, etc.
- A way to live edit these custom views - so that you can rapidly create them without restarting the debugger and restoring the state many times (Visual Studio supports this for the .natvis files).
- A powerful searching/filtering/transformation mechanism (e.g. for every object that satisfies this condition, show only this property and sort by that, etc.).
- Some way to save these configured views + filters + transformations, etc.
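The filtering/transformation item is the easiest to sketch. Assuming the debugger could hand over a flat list of live objects (ObjectInfo and Query are hypothetical names, not part of any existing debugger API), "for every object that satisfies this condition, show only this property and sort by that" is roughly:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical record a debugger might expose for each live object.
struct ObjectInfo {
  std::string type;
  std::string name;
  std::size_t size_bytes;
};

// Keeps the objects matching `keep`, sorts them by size (largest first),
// and returns only their names.
std::vector<std::string> Query(
    const std::vector<ObjectInfo>& objects,
    const std::function<bool(const ObjectInfo&)>& keep) {
  std::vector<ObjectInfo> matches;
  for (const ObjectInfo& o : objects) {
    if (keep(o)) matches.push_back(o);
  }
  // stable_sort keeps the original order among objects of equal size.
  std::stable_sort(matches.begin(), matches.end(),
                   [](const ObjectInfo& a, const ObjectInfo& b) {
                     return a.size_bytes > b.size_bytes;
                   });
  std::vector<std::string> names;
  for (const ObjectInfo& o : matches) names.push_back(o.name);
  return names;
}
```

Asking for all "Buffer" objects of at least 20 bytes is then a single predicate. The hard part in a real tool is getting that object list out of the process cheaply, not the query itself.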
I agree that some pieces are only really comprehensible at runtime, but I applaud tools that reflect the need to learn a codebase without necessarily having to (or being able to) bring all the code into your IDE.
I used early versions of Coati (0.5 I think). It used clang for the backend, and loading a new project took ages, probably longer than compiling it in my case. I should try again, as they released 1.0 not too long ago.
Also -- sort of offtopic, but motivated by TFA -- I'd love some way to find out about quirks that native speakers of $lang have when they speak $otherlang.
A few common "tells" that I know, for $otherlang = English, are Hyphenating-Things-Like-This and also spaces before exclamation points or question marks !
This is probably related to typographic rules. For example, in French you put a space before and after double punctuation marks (!?:;" etc.) and a space after single punctuation marks (.,). One exception is the single quote, which should be neither preceded nor followed by a space.
It gives an overview and interpretation of a body of neuroscience research in the context of teaching programming. I can't quite summarize the whole talk succinctly (and don't want to lure anyone with catchy titles either) — but my takeaway from it is that those "visual" programming tools are mostly useless and not going to help significantly.
The reason for that being how the brain works: switching back and forth between "visual" and "linguistic" cognition is hard, and requires specific training to do efficiently. Please turn to the talk for references.
Otherwise I'd rather use whatever IDE JetBrains has for it. It might not have the exact same capabilities (or maybe it does, I didn't look closely enough), but why use another tool and context if the current one is good enough?
"Looks" promising.
I wonder if something similar is available for C/C++.