Lately, I've noticed a pattern emerging that I think John is referring to in the second part. The situation is that often a large function will be composed of many smaller, clearly separable steps that involve temporary, intermediate results. These are clear candidates to be broken out into smaller functions. But, a conflict arises from the fact that they would each only be invoked at exactly one location. So, moving the tiny bits of code away from their only invocation point has mixed results on the readability of the larger function. It becomes more readable because it is composed of only short, descriptive function names, but less readable because deeper understanding of the intermediate steps requires disjointly bouncing around the code looking for the internals of the smaller functions.
The compromise I have often found is to reformat the intermediate steps in the form of control blocks that resemble function definitions. The pseudocode below is not a great example because, to keep it brief, the control flow is so simple that it could have been just a chain of method calls on anonymous return values.
AwesomenessT largerFunction(Foo1 foo1, Foo2 foo2)
{
    // state the purpose of step1
    ResultT1 result1; // inline ResultT1 step1(Foo1 foo1)
    {
        Bar bar = barFromFoo1(foo1);
        Baz baz = bar.makeBaz();
        result1 = baz.awesome(); // return baz.awesome();
    } // bar and baz no longer require consideration
    // state the purpose of step2
    ResultT2 result2; // inline ResultT2 step2(Foo2 foo2)
    {
        Bar bar = barFromFoo2(foo2); // second bar's lifetime does not overlap with the 1st
        result2 = bar.awesome(); // return bar.awesome();
    }
    return result1.howAwesome(result2);
}
I make a point to call out that the temp objects are scope-blocked to the minimum necessary lifetimes primarily because doing so reduces the amount of mental register space required for my brain to understand the larger function. When I see that the first bar and baz go out of existence just a few lines after they come into existence, I know I can discard them from short term memory when parsing the rest of the function. I don't get confused by the second bar. And, I don't have to check the correctness of the whole function with regard to each intermediate value.
Now I have to spend days and possibly weeks refactoring dozens of functions and breaking them apart into manageable services so we can not only use them, but also extend and test them.
I'm afraid what Carmack was talking about was meant to be taken with a grain of salt and not applied as a "General Rule" but people will anyway after reading it.
This approach of raising the bar for introducing functions might do well with my "trace tests". I'm going to try it.
[1] Sorry, I've temporarily turned off my site while we wait for clarity on Shellshock.
I think maybe the answer is that you want to do the development all piecemeal, so you can test each individual bit in isolation, and /then/ inline everything...
Does that sound like it might be effective?
But, it sounds like what you are dealing with is not inline blocks of separable functionality. Sounds like a bunch of good-old, giant, messy functions.
I don't think that's the kind of "inlining" being discussed -- to me that's the sign of a program that was transferred from BASIC or COBOL into a more modern language, but without any refactoring or even a grasp of its operation.
I think the similarity between inlining for speed, and inlining to avoid thinking very hard, is more a qualitative than a quantitative distinction.
Indeed. The definition of the DRY principle [1] is:
Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.
and not:
Don't type the same characters into the keyboard multiple times.
People often forget that.
One nice thing that falls out of Scala's syntax is that it makes this style possible without using mutation:
val result1 = {
    val bar = barFromFoo1(foo)
    // ...
    baz.awesome()
}
val result2 = {
    val bar = ... // unrelated to 'bar' above
    bar.awesome()
}
All names introduced in the block go out of scope at the end of the block, but the 'value' of the block is just the value of the last line, so you can assign the whole thing to a final / constant variable. This style has all the advantages listed above, and makes it easier to avoid mutation and having uninitialized variables in scope -- I wish more languages had it.
In Java you can separate the declaration and assignment of a final variable as long as every branch provably assigns to the variable. For example:
final int x;
if (someCondition) {
    final int a = 1;
    final int b = 2;
    x = a + b;
} else {
    x = 0;
}
As a side note, I'd love to see Scala make its way into game development. I've been messing around trying to get libgdx working with it. But I would love something that lets me take full advantage of the Scala language.
AwesomenessT largerFunction(Foo1 foo1, Foo2 foo2)
{
    ResultT1 result1 = [&] {
        Bar bar = barFromFoo1(foo1);
        Baz baz = bar.makeBaz();
        return baz.awesome();
    } ();
    ResultT2 result2 = [&] {
        Bar bar = barFromFoo2(foo2);
        return bar.awesome();
    } ();
    return result1.howAwesome(result2);
}
Bonus: you can initialize 'const' variables with multiple statements:
const auto values = [&] {
    std::vector <int> v (n);
    std::iota (begin (v), end (v), 0);
    std::shuffle (begin (v), end (v), std::mt19937 {seed});
    return v;
} ();
I also try to explicitly name which variables I'm capturing (within reason) as it makes it obvious at a glance what can and can't be modified within the lambda. I really wish it was possible to force constness on captured variables :/
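On the wish for const captures: one workaround, assuming C++17 is available, is an init-capture that binds the name to `std::as_const(...)`. This is only a sketch; `sumAndCount`, `values` and `calls` are made-up names for illustration.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical example: sums `values` and counts how often the block ran.
int sumAndCount(std::vector<int>& values, int& calls) {
    assert(calls >= 0);  // sanity check on the counter
    // The capture list names exactly what the lambda may touch:
    //   v     -> const view of `values` (read-only inside the lambda)
    //   calls -> plain reference (the only thing the lambda may modify)
    const int total = [&v = std::as_const(values), &calls] {
        ++calls;
        // v.push_back(1);  // would not compile: v refers to a const vector
        return std::accumulate(v.begin(), v.end(), 0);
    }();
    return total;
}
```

The capture list then reads like a small signature: anything not listed is off limits, and anything bound through `std::as_const` is read-only.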
> Some practical Matters --- Using large comment blocks inside the major function to delimit the minor functions is a good idea for quick scanning, and often enclosing it in a bare braced section to scope the local variables and allow editor collapsing of the section is useful. I know there are some rules of thumb about not making functions larger than a page or two, but I specifically disagree with that now -- if a lot of operations are supposed to happen in a sequential fashion, their code should follow sequentially.
He is absolutely correct in this. However, he's wrong with regards to the level of abstraction. Those "operations" should be part of functions that could be scattered all over the code base in whatever order they were written. But at the end of the day, they will be called sequentially right next to each other.
I've often found this to be the case. The developers I see that make these "god" functions are unable to write and compose their code in layers. They instead see the entire run (start->finish) of their programs as one giant series of "sequential" "operations". So what ends up happening is they've got high-level code, interspersed with low level io/networking/db calls.
I hadn't understood this "maybe leave functions inlined" rant a couple years ago when I first heard about it - it makes a lot of sense now.
It can look a bit daunting at first sight - if you're not used to this style of code, it just looks like your average rambling stream-of-consciousness function - but it's actually pretty easy to keep on top of. And if people really complain, it's super-easy to split out into functions :)
let result = {
    let b = foo(a);
    let mut c = b.see();
    while (c) {
        c.frob();
    }
    baz(c)
};
The very nice upside would be that you could make the inputs to the blocks explicit. In contrast, the fact that foo1 is an input to step1 and foo2 is an input to step2 can only be understood by careful examination.
https://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html
This allows you to actually write what you want. Nicer still, these internal functions are aware of surrounding context, so they're full closures, and thus you can take their address and pass it around as a callback without needing any void* cookie. I've been using these more and more.
The downsides are that this is a nonstandard gcc-only extension and that it's available in C only, not C++. Depending on what you're doing, these can be deal-breakers.
(Edited to fix minor typo)
One thing that I found beneficial is that by dividing the big functions into smaller ones, the resulting functions have the same abstraction level. To give an exaggerated example: I try to split my functions so that I never deal with Countries, Provinces, Cities, Buildings, People, Body Members, Organs & Cells in the same scope. I try to only deal with one of them per function (sometimes two, in parent-child cases).
I find that 1-abstraction-level functions are easier to understand, and I gladly "pay the price" of having extra one-use names around for this reason. I do try to restrict the scope of those "extra functions" as much as I can, put related functions together, and reduce side-effects as much as I humanly can.
Personally I'd like to see something like Haskell's "where", to create a scope for private functions in the "B style":
void largerFunction(void) {
    small1();
    small2();
} where {
    void small1(void) {
        ...
    }
    void small2(void) {
        ...
    }
}
Until that becomes possible, we can always just make sufficiently small modules with few public functions.
One can read the high level code in small terms, and dive into details inside the where bound expression.
Please don't call this programming style SSA, that's confusing it with the compiler IR pattern. "Immutable variables" is the popular jargon.
In VS13 the R-Click -> Go to Definition interface is OKay... but it definitely could be better
http://www.chris-granger.com/images/lightable/light-full.png
"The function that is least likely to cause a problem is one that doesn't exist, which is the benefit of inlining it."
That's the equivalent of saying "the faster you drive the safer you are b/c you're spending less time in danger"
You'll just end up with larger monster functions that are harder to manage. "Method C" will always be a disaster for code organization b/c your commented off "MinorFunctions" will start to bleed into each other when the interface isn't well defined.
" For instance, having one check in the player think code for health <= 0 && !killed is almost certain to spawn less bugs than having KillPlayer() called in 20 different places"
I don't completely get his example, but I see what he's saying about state and bugs that arise from that. You call a method 20 times and it has a non-obvious assumption about state that can crop up at a later point - and it can be hard to track down. However the flip side is that when you do track it down, you will fix several bugs you didn't even know about.
The alternative of rewriting or reengineering the same solution each time is simply awful and you'll screw up way more often
I'm starting to think object oriented programming is a bit overrated. It's hard to express why exactly, but I'm finding plain functions that operate on data can be clearer, less complicated, and more efficient than methods. Blasphemous as it may seem, a switch statement does the equivalent of simple polymorphism and can be kept inline.
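To make the switch-versus-polymorphism comparison concrete, here is a small C++ sketch of both styles computing the same thing. The shape types and the `area` functions are invented for illustration and not taken from any real codebase.

```cpp
#include <cassert>

// Style 1: plain data plus a switch, with the logic kept in one function.
enum class ShapeKind { Square, Rect };

struct ShapeData {
    ShapeKind kind;
    int w;
    int h;  // ignored for squares
};

int area(const ShapeData& s) {
    switch (s.kind) {
        case ShapeKind::Square: return s.w * s.w;
        case ShapeKind::Rect:   return s.w * s.h;
    }
    assert(false && "unknown ShapeKind");
    return 0;
}

// Style 2: virtual methods, with the logic spread across the types.
struct Shape {
    virtual ~Shape() = default;
    virtual int area() const = 0;
};

struct Square : Shape {
    int w;
    explicit Square(int side) : w(side) {}
    int area() const override { return w * w; }
};

struct Rect : Shape {
    int w, h;
    Rect(int width, int height) : w(width), h(height) {}
    int area() const override { return w * h; }
};
```

Adding a new operation is a one-function change in the first style; adding a new kind of shape is a one-type change in the second.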
There isn't anything close to unanimous agreement, but the dominant view is that something like single inheritance is a useful tool to have in your language. But all the high-end OO philosophy stuff is flat-out held in distaste by the majority of high-end engine programmers. (In many cases because they bought into it and tried it and it made a big mess.)
In the statically compiled languages that most people think of when they hear "OO" (C++ and Java), yeah, switch statements vs. virtual methods (performance differences aside) is basically a matter of code style (do you want to organize by function/method, or by type/object?)
However, the original proponents of OO intended it to be used in dynamically compiled languages where it could be used as a way to implement open/extensible systems. For instance, if a game entity has true update, animate, etc. methods, then anyone can implement new entities at run time; level designers can create one-off entities for certain levels, modders can pervasively modify behaviors without needing the full source code, programmers trying to debug some code can easily update methods while the game is still running, etc. You can get a similar effect in C or C++ with dynamic linking (Quake 2 did this for some large subsystems), but it's a pain and kinda breaks down at fine (entity-level) granularity.
The other, "dual" (I think I'm using that word correctly?) approach famously used by emacs is to create hooks that users can register their own functions with, and extend the program that way. Like switch statements, it basically amounts to storing functions at the call site instead of inside the object, except with an extensible data structure rather than burning everything directly into the machine code.
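The hook approach can be sketched in C++ as a list of registered functions that the program runs at a fixed point. This is a rough illustration of the idea, not how emacs implements it; `SaveHooks` and its methods are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical hook point: users register functions, and the program calls
// all of them whenever the event fires.
class SaveHooks {
public:
    using Hook = std::function<void(std::string&)>;

    void add(Hook hook) {
        assert(hook && "hook must be callable");
        hooks_.push_back(std::move(hook));
    }

    // Runs the hooks in registration order; each one may edit the text.
    void run(std::string& text) const {
        for (const Hook& hook : hooks_) {
            hook(text);
        }
    }

private:
    std::vector<Hook> hooks_;  // extensible at run time, unlike a switch
};
```

The functions live at the call site rather than inside the object, but the set of them can grow without recompiling the code that calls `run`.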
Obviously you can't really take advantage of any of this if you're writing some state of the art hyper-optimized rendering code or whatever like Carmack, I'm just saying that OO's defining characteristics make a lot more sense when you drift away from C++ back to its early days at Xerox PARC.
What OOP nicely brings to the table is polymorphism and type extension. Two things not doable just with modules.
Although generics help with static polymorphism.
The problem was that the IT world went overboard with Java and C#, influenced by Smalltalk, Eiffel and other pure OO languages.
Along the way, the world forgot about the other programming languages that offered both modules and objects.
> Blasphemous as it may seem, a switch statement does the equivalent of simple polymorphism and can be kept inline.
Except it is not extendable.
I think this is why Go has become so popular. It deliberately is not object-oriented in the traditional sense, yet it gives you most of the advantages of OOP (except for people who are into deep and intricate inheritance hierarchies, I guess). (I don't know how many people are actually using it, but now that I think of it, the same could be said of Erlang - the language itself does not offer any facilities for OOP, but in a way Erlang is way OOP, if you think of processes as objects that send and respond to messages.)
So I think there is nothing blasphemous about your statement (in fact, Go allows you to switch on the type of a value).
(I am not saying that OOP is bad - there are plenty of problems for which OOP-based solutions are pretty natural, and I will use them happily in those cases; but I get the feeling that at some point people got a bit carried away by the feeling that OOP is the solution to every problem and then got burnt when reality asserted itself. The best example is probably Java.)
In the area I've worked in, I've seen numerous semantic systems come and go, all built around essentially one giant model of the slice of the world it's supposed to represent, and the projects all end up dead after a couple years because:
a) the model does a poor job representing the world
b) nobody seems to have a use-case that lines up perfectly with the model (everybody needs some slightly different version of the model to work with)
c) attempts to rectify b by just adding in more and more to the model just leaves you with an unusable messy model.
More recent systems seem to be working at a meta-model level, with fewer abstract concepts and links, rather than getting down into the weeds with such specificity, and letting people muck around in the details as they see fit. But lots of the aggregate knowledge that large-scale semantic systems are supposed to offer gets lost in this kind of approach.
I think OOP at its heart is just another case of this -- it's managed to turn a programming problem into an ontology problem. You can define great models for your software that mostly make sense, but then the promise of code-reuse means you're suddenly trying to shoehorn a model meant for a different purpose into this new project. The result is code that either tries to ignore "good" OOP practices to just get the damn job done, or over-specified models that end up so complicated nobody can really work with them and introduce unbelievable organizational and software overhead.
It's not to say that OOP and other semantic modeling approaches don't have merit. They're very useful tools. I think the answer might be separate models on the same problem/data, each representing a facet of the problem. But I haven't quite gotten the impression that anybody else in industry has arrived at this; they are instead just going for higher levels of abstraction or dumping the models altogether.
Again, OOP manages to turn programming problems into ontology problems -- which is hardly a field that's well understood, while the goal is and always has been to turn programming problems into engineering problems -- which are much more readily understood.
Most programming concepts are really about code organization and not expressiveness or the ability to express an algorithm clearly.
Object oriented programming only really starts to make sense when you are working on something that will take thousands of man-hours. If you are working alone, or on a small project, it can be completely irrelevant.
The work flow you are describing is what MATLAB guys do. It's an absolute nightmare once the project gets too large. It is however very fast and flexible for prototyping.
I agree. The simplicity comes from the fact that you are focusing on different aspects at different times. I find that I will start off with defining my data structure and only focusing on the data structure. What information do I need, what is the best way to organize the data. Those sorts of issues. Once I have the data structure then I focus on what I want to do with it. This may result in some functions attached to the data structure using the object oriented features and sometimes the functions live apart from the data structure. The benefit comes from mentally decoupling the data from the functions.
To use Minecraft as an example, a player may die from falling from too high, drowning, getting attacked by a monster. If killPlayer() is called separately for each of those cases, he asserts that it may cause bugs due to differing context or sequencing relative to other parts of the code. If OTOH you just decrement player health in each of those places and then check for health<=0 at only one place, you eliminate that class of bugs.
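A minimal C++ sketch of the "decrement health everywhere, check in one place" idea follows. The `Player` struct, the damage functions and `playerThink` are hypothetical names chosen for illustration, not code from any actual game.

```cpp
#include <cassert>

// Hypothetical player state.
struct Player {
    int health = 100;
    bool killed = false;
    int deathEvents = 0;  // how many times the death sequence ran
};

// Damage sources only decrement health; none of them decides about death.
void applyFallDamage(Player& p, int amount) { assert(amount >= 0); p.health -= amount; }
void applyDrowning(Player& p, int amount)   { assert(amount >= 0); p.health -= amount; }
void applyAttack(Player& p, int amount)     { assert(amount >= 0); p.health -= amount; }

// The single place that turns "health <= 0" into a death. It is meant to run
// once per frame, after all damage for that frame has been applied.
void playerThink(Player& p) {
    if (p.health <= 0 && !p.killed) {
        p.killed = true;
        ++p.deathEvents;
    }
}
```

However many damage sources fire in a frame, and in whatever order, the death sequence runs at one known point and at most once.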
Another way to put it is that Method C is the least bad solution when factoring fails. I had a conflict with a coworker several months ago over a difficult piece of functionality that I had implemented Method C style. It was giving him headaches and he complained incessantly about the fact that the code was written in a linear "non-factored" style. I tried to explain to him that the problem was simply that hard, and the code organization wasn't making it worse, but was rather making the best of a bad situation. (Basically, if he thought the code was hard to understand, then he obviously hadn't tried to understand the problem it was solving!) He ignored me and refactored the code Method B style. A month later he was still struggling (because it was a truly complex problem) and he called me over to help him out. The code was now unfamiliar to me, so I'd point at a method call and say, "What does this method do?" "Uh... let's see. <click>" "What does that method it's calling do?" "Hold on. <click>" And so on, all the way down the call chain.
The refactored code had become "easy to read" in the sense that the methods were short, but it had also become impossible to read in the sense that reading a method didn't give you any useful information unless you went on to read all the code in all the methods it called. We ended up reading the code exactly as we would have read Method C code, except with a lot of clicking and no visual continuity or context. Abstraction didn't protect us from the details; it just made it harder to see how they fit together into the whole.
while true:
    ...
    if (!wasDead && dead) startFadeOut()
    ...
    wasDead = dead
    doPhysics()
and then months later someone adds fall damage to the physics engine, and suddenly there's a way to die where the screen doesn't fade out.
He might not have communicated it completely correctly, but I believe he wasn't advocating for getting rid of functions to reduce redundancy.
He instead was advocating getting rid of functions that simply provide documentation of the process, and instead find a way to inline those functions clearly.
> However the flip side is that when you do track it down, you will fix several bugs you didn't even know about.
I think he is saying a class of bugs is avoided. For instance if I do X, Y and Z where all are only run when the player is alive and Y might kill the player, leaving the player alive avoids a bug in Z if it assumes that the player is alive.
I get the impression that he understands it quite well, which is why he avoids it.
Can you share a bit about your background here? In the absence of more context, to me, this reads sort of like a guy who plays football on weekends saying that Lionel Messi "doesn't really get" football.
> "The function that is least likely to cause a problem is one that doesn't exist, which is the benefit of inlining it."
> That's the equivalent of saying "the faster you drive the safer you are b/c you're spending less time in danger"
What I believe he means is that function calls at different places can be a source of trouble when you're not side-effect free.
algorithm() {
    do_small_thing_one();
    do_small_thing_two();
    do_small_thing_three();
    ...
    do_small_thing_X();
}
Advantages over Carmack's Style C:
1. Replaces comments with accurate and specific function names. Why better? Because comments can get out of sync with the code.
2. You can quickly see the sub-steps of the algorithm, rather than reading a multi-page-long giant function with a ton of comments.
When using this style, the inner functions are not visible outside of that source file (you can arrange this depending on your programming language). Then it's easy to make sure they are only called once within the source file, or only called appropriately.
That's because I agree with Carmack that functions called from lots of unrelated places are a terrible thing.
(Edited for clarity after people pointed out that it seemed like I was just advocating for Carmack's style A or B.)
You can't see if you are repeating the same stuff in multiple small things.
You achieve that by not exposing them in header files.
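In C++, one common way to keep helpers out of the header is an unnamed namespace in the source file. The sketch below assumes a single public entry point; `formatLine` and the two helpers are made-up names.

```cpp
#include <cassert>
#include <string>

namespace {  // internal linkage: these names are invisible to other files

std::string trimLeadingSpaces(const std::string& s) {
    const auto first = s.find_first_not_of(' ');
    return first == std::string::npos ? std::string() : s.substr(first);
}

std::string addPrompt(const std::string& s) {
    assert(s.find('\n') == std::string::npos);  // single-line input expected
    return "> " + s;
}

}  // namespace

// The only function that would be declared in the header. Each helper is
// called from exactly one place, which is easy to verify within this file.
std::string formatLine(const std::string& raw) {
    const std::string trimmed = trimLeadingSpaces(raw);
    return addPrompt(trimmed);
}
```

Because the helpers cannot be reached from other translation units, a search within one file is enough to confirm who calls them.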
Can you explain this a bit better? In my experience function names can just as easily get out of sync with the code. I've worked on many codebases that were full of small functions with misleading names.
Actually, there is an additional and probably much more important reason that I prefer "functions over comments", and that is the functions can be nested.
With comments, I often find myself needing a big comment that describes the next several chunks of things, and each of those chunks has some comments, too. So call those "level 1" and "level 2" comments.
People tend to distinguish these cases like:
/* ***************************
level-one comment
*************************** */
and then
/* level-two comment */
But what if you have something that really "should" be a level-two comment (because it's just one small thing that doesn't have subdivisions), but it comes after a level-one comment? Now you need an end delimiter for level-one comments, like this:
/* *****************
done with that
******************* */
Now at this point, your code is nasty and non-readable and it's not obvious that things have been kept in sync over time. We could just avoid all this ambiguity by using nested functions that are nonetheless only called from one place and thus _could have_ been manually inlined in theory (but, per my practice, are not).
We do inline our code in Haskell sometimes, but usually the real gains (in my limited experience, with numerics code) are to be had by unboxing, FWIW.
Firstly, inlining has nothing to do with state mutation. He just happens to be talking about a codebase that does a lot of state mutation - eg. a video game. Therefore the functions he's talking about inlining are state mutating functions.
It also sounds like performance is a secondary, but still important, concern in this post. What he's really getting at is what the typical modular programming style does to our awareness of details and therefore our ability to understand and optimize our programs. The commonly held conviction is that smaller functions are better for writing understandable and correct code. He's saying this isn't necessarily the case - especially not when you're trying to optimize code and understand the interrelations between the state it's mutating. Modularity can often produce deep stacks while hiding and scattering state mutations. It can also obscure interrelations you should be aware of. In the kinds of scenarios where correctness, comprehensibility, and performance are paramount, having one big long function might just be better.
Heavy modularization makes a ton of sense with utility code made to create and tear down fixtures, as well as with general utility functions like navigation through a user interface to get to the initial point of testing.
However, in the test, where you need to have complete understanding of the sequence of events in order to keep the test valid, it's much better to inline nearly everything that would affect simulated user or client flow even if that means duplication between tests.
That's exponentially more true when more than one person/team/org/whatever would be maintaining different tests or test areas. The last thing you want to do at that point is share test sequencing code, since it's so easy to subvert the flow in other tests using it by mistake.
It's a hard argument to make, because everyone gets SPOT, yo, Fowler rules, etc. They aren't wrong, but it really only applies when the interface is everything and how you fulfill the interface is irrelevant.
For some types of code--frame-accurate games being one, tests being another--the order of events is paramount. IMO, even mature patterns like Selenium's Page Object Model get this one wrong by encouraging test flow code to live in POM methods.
There are absolutely times where optimizing for understandability and being paranoid about implementation changes is the way to go.
Not exactly. He points out that in real time systems, worst case execution time is more important than average execution time. You have to render a video frame in 1/60 of a second or it will need to be displayed twice. In that case, getting the job done faster on average doesn't matter. This can change a bit if something like power consumption becomes relevant - then you've got conflicting requirements.
Real time keeps performance a top priority, we just look at a different metric.
As for pure code, the reason he has become more bullish about functional programming is that it by definition is less susceptible to all these subtle order-of-execution issues. You are free to structure your code in whatever way you like and split it into as many functions as you want and still have the peace of mind that you will never access an uninitialized variable or use a dead object.
So, if I have a value A, and it's a reference type (either by pointer, or by just the nature of the language), then if I pass that object into function DoSomething, DoSomething may change A without my knowledge, and cause behaviors further on that I wasn't expecting.
If I inline DoSomething, I see exactly what I'm doing to A, what I'm changing on it.
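A small C++ sketch of the concern: with a mutable reference parameter, nothing at the call site reveals that the argument may change, while a const reference makes the promise visible in the signature. `normalize` and `normalized` are hypothetical stand-ins for DoSomething.

```cpp
#include <cassert>
#include <vector>

// Mutable reference: a call like normalize(a) gives no hint that `a`
// is modified. Here negative entries are clamped to zero in place.
void normalize(std::vector<int>& values) {
    for (int& v : values) {
        if (v < 0) v = 0;
    }
}

// Const reference plus return value: the signature promises that the
// argument is left untouched and the result is a separate object.
std::vector<int> normalized(const std::vector<int>& values) {
    std::vector<int> out = values;  // work on a copy
    normalize(out);
    assert(out.size() == values.size());
    return out;
}
```

Inlining the body is one way to see what happens to A; a const-qualified parameter is another way to get a similar guarantee without reading the callee.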
let a = f () in a + a
f () + f ()
In side effecting languages, inlining cannot make function calls disappear, unless the compiler can convince itself that it's safe to do so.
If an expression calls f() twice, and f has a side effect, then the inlined version of the code has to do that side effect twice, exactly as if the non-inlined function were called, just without some of the overhead of a function call.
(On a different note, in a language like C, inlining a side effect actually improves safety. If we have f() { global++ } then f() + f() is safe and has the effect of incrementing global twice, whereas global++ + global++ is undefined behavior.)
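The difference between the two expressions shows up as soon as f has an observable side effect. A minimal C++ sketch, where `callCount`, `twice` and `shared` are names invented for this example:

```cpp
#include <cassert>

int callCount = 0;  // global state touched by f

// A function with a side effect: every call bumps callCount.
int f() {
    ++callCount;
    return 21;
}

// Calls f twice, so the side effect has to happen twice.
int twice() { return f() + f(); }

// Calls f once and reuses the value: same result, fewer side effects.
int shared() {
    const int a = f();
    assert(a == 21);
    return a + a;
}
```

Both functions return the same number, but they leave `callCount` in different states, which is why a compiler may not rewrite one into the other unless it can prove f is free of side effects.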
He is talking about
f _ = do_work 12
let a = do_work 12 in a + a
(And, for that matter, whether it can convert that to "f()<<1".)
But there are very real advantages. I learned through game programming and still do some for fun and I absolutely prefer having a main loop that puts its fingers into all the components of the game than to have a main loop which delegates everything to mysterious entity.update()-style functions. The lack of architecture allows me to structure the logic of the game more clearly for exactly the reasons Carmack outlines. Everything is sequenced - what has already happened in the frame can be seen by scrolling up a bit instead of digging through a half-dozen files.
But the real win here is for the beginner programmer. I strongly dislike the trend these days towards programming education being done in a "fill in the blanks" manner where the student takes an existing framework and writes a number of functions. The problem is that the student rarely has any idea what the framework is doing. I would rather not have beginners write games by making on_draw(), on_tick(), etc. functions but much rather have them write a for loop and have to call gfx_library_init() at program start and gfx_library_swap_buffers() at the end of a frame. That way they can say "The program starts here and steps through these lines and then exits here" versus having magic frameworks do the work for them. There is plenty of magic done these days behind the scenes for any beginner, but it is too much to have a completely opaque flow-control.
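As a rough sketch of that explicit style in C++: the gfx_library_* functions below are stand-ins that only record that they were called, since no real graphics library is assumed, and `update_game`, `draw_game` and `run_game` are likewise hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-ins for a hypothetical graphics library; they only log the call.
std::vector<std::string> trace;
void gfx_library_init()         { trace.push_back("init"); }
void gfx_library_swap_buffers() { trace.push_back("swap"); }

// Hypothetical game steps, also just logged.
void update_game(int frame) { trace.push_back("update " + std::to_string(frame)); }
void draw_game()            { trace.push_back("draw"); }

// The whole program flow is visible top to bottom: start here, loop over
// the frames, exit at the end.
void run_game(int frames) {
    assert(frames >= 0);
    gfx_library_init();
    for (int frame = 0; frame < frames; ++frame) {
        update_game(frame);
        draw_game();
        gfx_library_swap_buffers();
    }
}
```

Nothing calls back into the student's code from somewhere out of sight; the order of events is exactly the order of the lines.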
Abstraction is good for people experienced with a codebase, but as codebases grow larger and more complex that usually means the first few people, because newcomers are rarely able to attain the same grasp of the big picture.
This suggests that authors of say open source projects who want to gain more collaborators might want to go against their instincts for abstraction.
(This thought crystallized for me after conversations on HN in the past couple of weeks: https://news.ycombinator.com/item?id=8308881 and https://news.ycombinator.com/item?id=8327008)
I'm not a fan of excessive abstraction either - the main thing I use it for is to reduce code duplication, which IMHO is one of the real benefits. Code that contains lots of functions-called-once or classes-used-once feels like a terribly inefficient and obfuscated way to do something, and far less straightforward than it could be.
The whole "abstraction is good because it allows us to build large complex systems" notion is all too common in beginning courses in programming/software engineering architecture, and it tends to make people think that large complex systems are also good. Thus the feeling that somehow all software should be large and complex, and the resulting architecture-astronautism and disturbingly inefficient software. I completely disagree - abstraction should be taught as being a necessary evil for managing complexity, for use only when that complexity is actually justified and cannot be simplified further. Abstraction hides complexity but does not eliminate it; in fact it could be said that it probably adds to it. Code hidden by abstraction is still code that gets executed, consumes resources, and could contain bugs.
Also, open source projects could seriously do with some dev docs. A UML diagram or two wouldn't hurt either.
If you use a framework (or at least a common pump_messages->update_ents->render cycle), it's a hell of a lot easier for other people to work on your code for a longer period of time (and those other people include yourself). Imagine if you decide to change your gfx_library with something new, especially if you had decided to scatter its draw calls all over the "game logic" (instead of, say, a dedicated entity draw() method). What about when you start allowing data-driven entity updates? What about <insert practical concern here>?
To be fair, take it too far (as I once did) and you destroy performance. So, eventually, you work out how to have nice architectures that still support the fast hacky stuff.
Anyways, the beginner is allowed to make such mudballs for a while--but no longer than necessary!
Welcome to the club. We don't have cookies, but we have flour, sugar, and eggs, so you can make cookies. :)
As a side note, the original code contained large numbers of manual loop unrolling optimizations like those noted in the email. I actually saw a performance increase from removing them. The same was true in some cases for manually inlined function calls. From what I could tell, writing simpler, inline code made the optimizer much more efficient.
To come back to the point at hand: regularly writing all kinds of code, from embedded to desktop to mobile, I can only nod vigorously at Carmack's observations.
Always writing minimal functions doesn't work 100% of the time, but is unfortunately often enforced vehemently in large teams, as the ones in charge don't trust the rest to be sensible. There should always be a good reason to follow this or that rule, but when the rules of the team are broken intentionally, it'd better be documented to avoid confusion later on!
Admittedly I am not a very experienced programmer, but I thought the general line of thought with regards to making your program easier to test would be that each function should do as little as possible.
That depends on the nature of the inline code. If it simply unrolls a local loop that has no side effects, then it's structurally identical to the original loop, but (in some cases) faster. This is demonstrated by the fact that some compilers can be coerced to inline certain repetitive actions originally written in the form of loops, as a speed optimization.
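To make "structurally identical" concrete, here is a small sketch of a side-effect-free loop next to a hand-unrolled version of it. The function names are made up for illustration. Both return the same sum; whether the unrolled one is faster depends on the compiler and target, and as noted above it can even be slower than letting the optimizer handle the simple form.

```cpp
#include <cstddef>
#include <vector>

// Straightforward loop: leaves any unrolling or vectorization to the optimizer.
int sumSimple(const std::vector<int>& v) {
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Manually unrolled by four, with a tail loop for the leftover elements.
int sumUnrolled(const std::vector<int>& v) {
    int total = 0;
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    for (; i < n; ++i) {
        total += v[i];
    }
    return total;
}
```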
[1] https://medium.com/@evnbr/coding-in-color-3a6db2743a1e
[2] http://www.reddit.com/r/programming/comments/1w76um/coding_i...
The 1993 crash was due to Pilot Induced Oscillation (PIO). This is a general term for situations when the pilot makes inputs to stabilize an airplane, but the inputs instead end up exacerbating the instability. A simple example of how this could happen is if the control inputs for some reason take effect with a time delay: the airplane pitches up, so the pilot tries to push it down; after a moment the transient passes and the plane pitches down, so the pilot pulls up, but the previous input amplifies the downwards movement, so the pilot pulls up harder, etc.
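The time-delay mechanism can be shown with a toy model. This is purely an illustrative sketch, not a model of any real aircraft or flight control software: the "pilot" applies a correction proportional to the pitch error, but the correction is based on what the pitch was `delay` steps ago. The names `simulatePitch` and `peakAmplitude` and the unit gain are assumptions chosen to keep the arithmetic easy to follow.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy model: pitch starts with a disturbance of 1.0. At each step the correction
// applied is -gain * (the pitch observed `delay` steps earlier).
std::vector<double> simulatePitch(double gain, std::size_t delay, std::size_t steps) {
    std::vector<double> pitch{1.0};
    for (std::size_t t = 0; t < steps; ++t) {
        // Before any delayed observation is available, no correction is applied.
        double observed = (t >= delay) ? pitch[t - delay] : 0.0;
        pitch.push_back(pitch[t] - gain * observed);
    }
    return pitch;
}

// Largest absolute pitch excursion over the whole run.
double peakAmplitude(const std::vector<double>& v) {
    double peak = 0.0;
    for (double x : v) {
        peak = std::max(peak, std::fabs(x));
    }
    return peak;
}
```

With `gain = 1` and no delay, the disturbance is cancelled in one step. With the same gain and a delay of two steps, the same corrections arrive late and the pitch swings back and forth with growing amplitude, reaching -7 after 14 steps in this model.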
Several of the first generation of unstable fly-by-wire airplanes had problems with PIO. Unlike conventional aircraft, where the rudder positions exactly follow the position of the stick, here the desired rudder position is calculated as the sum of two inputs, one calculated from the stick position, and one calculated by flight control software to dampen instabilities.
Early versions of the software were "rate limited", i.e. at each iteration of the main loop the software calculates the desired rudder position and then moves the rudders towards that position at the fastest rate the rudder actuators allow. However, that leads to problems when there are very large transient stick inputs: because the rudders take some time to move, the largest rudder deflection occurs with some delay after the largest stick input (see figure 6 in [1]).
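A rough sketch of what rate limiting does to a large transient input. This is a toy loop under made-up numbers, not the actual flight control code; `rateLimitedStep`, `simulateActuator`, and `indexOfPeak` are hypothetical names. Each iteration moves the actuator toward the commanded position by at most `maxStep`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Move `current` toward `target`, but by no more than `maxStep` per iteration.
double rateLimitedStep(double current, double target, double maxStep) {
    double delta = std::max(-maxStep, std::min(maxStep, target - current));
    return current + delta;
}

// Run the rate limiter over a sequence of commanded positions, starting from 0.
std::vector<double> simulateActuator(const std::vector<double>& commands, double maxStep) {
    std::vector<double> positions;
    double pos = 0.0;
    for (double cmd : commands) {
        pos = rateLimitedStep(pos, cmd, maxStep);
        positions.push_back(pos);
    }
    return positions;
}

// Index of the first maximum element.
std::size_t indexOfPeak(const std::vector<double>& v) {
    return static_cast<std::size_t>(std::max_element(v.begin(), v.end()) - v.begin());
}
```

For the command sequence `{0, 10, 8, 6, 4, 2, 0}` and `maxStep = 2`, the actuator positions come out as `{0, 2, 4, 6, 4, 2, 0}`: the command peaks at index 1, but the largest deflection happens at index 3, and it never reaches the commanded 10.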
In the 1993 crash there was a wind gust causing a pitch movement, and both the pilot and the flight control software provided a compensation. The sum of the two signals was big enough to hit the rate-limitation, so the response of the airplane was strange, the pilot gave several more large inputs, and there was PIO.
Incidentally, one of the YF-22 prototypes crashed for basically the same reason, even though they run different software. The solution was to develop some new "phase compensation" methods for designing controllers [1].
[1] http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=558586&...
Basically, the plane would have been better off cutting off input from the pilot and flying on its own.
I just read more about PIO, and it's funny: it reminds me of a short story I read a long, long time ago, written by Arthur C. Clarke. I remember a scene in which a pilot is remote-controlling a big spaceship or plane of some kind while standing on the ground. Because the remote signal goes through a satellite, there is a feedback delay of a second or so. So when he tries to compensate for a wrong turn, a flavor of PIO is induced and the ship crashes. I think that particular story was written in the '40s; good foresight on Mr. Clarke's part.
The way we have traditionally measured performance and optimized our games encouraged a lot of conditional operations [...] This gives better demo timing numbers, but a huge amount of bugs are generated because skipping the expensive operation also usually skips some other state updating that turns out to be needed elsewhere.
Now that we are firmly decided on a 60hz game, worst case performance is more important than average case performance, so highly variable performance should be looked down on even more.
Two words: Battery life. In case of mobile devices, this is not sound advice.
To make things more complicated, the "do always, then
inhibit or ignore" strategy, while a very good idea for
high reliability systems, is less appropriate in power and
thermal constrained environments like mobile.
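For anyone unsure what the two styles look like side by side, here is a small sketch. The `Turret` type and both update functions are invented for illustration, and it assumes the cooldown is supposed to keep ticking whether or not a target is present.

```cpp
struct Turret {
    int cooldown;    // frames until the next shot is allowed
    int shotsFired;
};

// Conditional style: with no target nothing runs, so the cooldown stops ticking too.
void updateConditional(Turret& t, bool hasTarget) {
    if (!hasTarget) return;  // skips the bookkeeping along with the shot
    if (t.cooldown > 0) --t.cooldown;
    if (t.cooldown == 0) {
        ++t.shotsFired;
        t.cooldown = 3;
    }
}

// "Do always, then inhibit": the cooldown ticks every frame; only the shot is suppressed.
void updateAlways(Turret& t, bool hasTarget) {
    if (t.cooldown > 0) --t.cooldown;
    bool fire = (t.cooldown == 0) && hasTarget;  // inhibit the effect, not the work
    if (fire) {
        ++t.shotsFired;
        t.cooldown = 3;
    }
}
```

After three frames without a target and one frame with a target, the conditional version has not fired (its cooldown never moved while idle), while the always version fires immediately. The cost is that the always version does work on every frame, which is exactly the power concern raised for mobile.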
Or do I misunderstand your objection?
It's just that two days ago, I had to deal with exactly that issue, so it was fresh on my mind while reading the article...
Explain?
As an example, performing an operation every other frame can cause your CPU to thrash due to always taking the wrong path. (I am of course oversimplifying, and OOO CPUs are quite complex.)
Or you can look for a more enlightened employer. Mine is hiring, for example.
Imagine two months later someone writing largeFunctionB() is browsing around the code and finds smallFunction(), thinking it will do the job he requires but actually it has a hidden bug that was never triggered under the context of it executing in largeFunctionA or under the limited input range that largeFunctionA was using.
See in particular this paragraph from the article:
Besides awareness of the actual code being executed, inlining functions also
has the benefit of not making it possible to call the function from other
places. That sounds ridiculous, but there is a point to it. As a codebase grows
over years of use, there will be lots of opportunities to take a shortcut and
just call a function that does only the work you think needs to be done. There
might be a FullUpdate() function that calls PartialUpdateA(), and
PartialUpdateB(), but in some particular case you may realize (or think) that
you only need to do PartialUpdateB(), and you are being efficient by avoiding
the other work. Lots and lots of bugs stem from this. Most bugs are a result
of the execution state not being exactly what you think it is.
Seems like it's the fault of the second developer for using the smallFunction() without understanding what it does.
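The FullUpdate()/PartialUpdateB() scenario from the quoted paragraph can be sketched like this. The `Inventory` type and what each partial update does are assumptions made up for the example; the article does not say what the real functions did.

```cpp
#include <numeric>
#include <vector>

struct Inventory {
    std::vector<int> weights;
    int totalWeight;  // cached sum, only valid after partialUpdateA()
    bool overloaded;  // derived from totalWeight by partialUpdateB()
};

// Recomputes the cached total from the item weights.
void partialUpdateA(Inventory& inv) {
    inv.totalWeight = std::accumulate(inv.weights.begin(), inv.weights.end(), 0);
}

// Silently assumes totalWeight is already up to date.
void partialUpdateB(Inventory& inv) {
    inv.overloaded = inv.totalWeight > 100;
}

void fullUpdate(Inventory& inv) {
    partialUpdateA(inv);
    partialUpdateB(inv);
}
```

If someone adds an item and then calls only `partialUpdateB()` "to be efficient", `overloaded` is computed from a stale total and stays false. Nothing in `partialUpdateB()` is buggy on its own; the execution state just isn't what the caller thinks it is. Had both steps been written inline in `fullUpdate()`, there would be no `partialUpdateB()` to call.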
We all have our own programming dogma that we love and defend religiously, but we should never stop asking whether our code is truthfully, objectively clear and easy to read, whether it is prone to bugs, and whether it runs efficiently. "Best practices" can get you 80% of the way there, but a developer should never stop questioning the quality of their code, even if it contradicts the sacred rules.
Having noticed that he has a problem when multiple functions are all interacting with that same shared global state, it's kind of amusing that Carmack's reaction is to reduce the number of functions, rather than remove the global state.
And so games live within this environment of confusion over which things happen when. At every point there are a few defensive tools - queue up actions in a buffer, poll values instead of copying them, etc. - but the overall management of these concurrent, overlapping systems remains a challenging task lacking in silver bullets.
That statement (at least taken in isolation) is false. Inlining it means that you're still executing the exact same code. If it had problems as a function, it still has problems when inlined.
But that isn't the problem that Carmack is trying to address. He's concerned about bugs caused by lack of programmer comprehension of the code's role in the larger function. It's a valid concern. But inlining it makes it harder to find problems in the details of what the inlined function does (or even to realize that that's where the problem is, or maybe even to realize that there's a problem at all).
All styles help with some problems and make others worse. The answer isn't a style, it's good taste and experience to know when to use which style.
But if it's inlined, it's no longer a function. ;)
What he's saying here is that the function itself was fine and free of bugs but that there is a problem for the programmer. The programmer's understanding and expectation of what the function does isn't correct, in that it affects state in a way the programmer didn't know or expect. What the function does becomes much more obvious and controllable when you inline the function's body.
"Avoid globals" is a fairly common (and good) truism for programmers of all stripes. But casting it in light of the central (and easily digestible) tenet of functional programming makes it much more approachable. Smart (but sometimes insufferably pompous) proponents of functional programming should take notes.
That being said, I've seen procedures that followed this sort of approach that were thousands of lines long. Even if we could have cut down on the ridiculous number of conditionals in that code, most of the state at that scale asymptotically approaches an undifferentiated mass of global variables. The result is testable and maintainable only via heroic effort. There have got to be limits to this kind of approach. (For Mr Edwards the solution was to break the whole thing up into a sequence of composable views, or lenses, with the interface between each stage being well-defined.)
I wonder to what extent Mr Carmack's pivot to pure functions is simply an acknowledgement that there were much better ways to refactor the code than the mess of one-timer procedures that probably seemed like a good idea the first time through...
The email was written in 2007. In there, he advocates the inlining of single-use functions into god functions as it reduces the risk of these functions being opted into other routines, especially when they all deal with shared mutable data.
Single-use functions are explicitly singled out in his email; he mentions that he does not encourage duplicating code to avoid functions.
| "In almost all cases, code duplication is a greater evil than whatever second order problems arise from functions being called in different circumstances, so I would rarely advocate duplicating code to avoid a function"
The blurb at the front indicates the intent of his post. Since then he has favoured a functional-programming approach - don't inline your functions, but avoid making your functions rely on mutable/larger scope states. Pass in everything that is needed by the function through parameters. Avoid functions with side-effects, encourage idempotence. That way, reusing the function does not lead to unintentional side-effects.
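A small sketch of the difference, with made-up names (`Player`, `g_player`, `applyDamageGlobal`, `applyDamage`) and made-up damage rules. The first function reads and mutates shared state, so what it does depends on everything that ran before it. The second takes everything it needs as parameters and returns a new value, so the same inputs always give the same result and reusing it elsewhere cannot disturb anyone else's state.

```cpp
#include <algorithm>

struct Player {
    int health;
    int armor;
};

// Shared mutable state: every caller affects every other caller.
Player g_player = {100, 50};

// Depends on and modifies the global; calling it twice gives different results.
void applyDamageGlobal(int amount) {
    int absorbed = std::min(g_player.armor, amount / 2);
    g_player.armor -= absorbed;
    g_player.health -= amount - absorbed;
}

// Pure: inputs come in through parameters, the result goes out through the return value.
Player applyDamage(Player p, int amount) {
    int absorbed = std::min(p.armor, amount / 2);
    p.armor -= absorbed;
    p.health -= amount - absorbed;
    return p;
}
```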
He also mentions that should you still decide to inline, "you should be made constantly aware of the full horror of what you are doing."
A lot of things change within a decade. =)
Funny because it's backward. If code is duplicated, consider making a function if the pieces of code are the same semantically. (Two pieces of code which are the same at a given time can diverge over time, and you don't want to miss that. Only analyzing the sense of what you're doing (= semantics) gives you the answer.)
Never add fancy things (like adding a function which is not a function) to your code, because code is not supposed to be fancy; fancy causes bugs.
> If a function is called from multiple places, see if it is possible to arrange for the work to be done in a single place, perhaps with flags, and inline that.
Well yeah, fix the semantics if needed; otherwise do nothing.
> If there are multiple versions of a function, consider making a single function with more, possibly defaulted, parameters.
Well yeah, fix the semantics if needed; otherwise do nothing.
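For reference, the quoted suggestion about defaulted parameters looks roughly like this in C++. The `pad` function and the variants it stands in for (say `padTo()`, `padCentered()`, `padWithFill()`) are hypothetical; the sketch only shows the mechanics, and whether merging is right still depends on the variants meaning the same thing.

```cpp
#include <cstddef>
#include <string>

// One function with defaulted parameters in place of several near-identical variants.
std::string pad(const std::string& text,
                std::size_t width = 0,
                bool centered = false,
                char fill = ' ') {
    if (text.size() >= width) return text;  // already wide enough
    std::size_t extra = width - text.size();
    std::size_t left = centered ? extra / 2 : 0;
    return std::string(left, fill) + text + std::string(extra - left, fill);
}
```

Callers that only need the simple behavior write `pad("ab", 4)`, and the less common cases spell out the extra arguments, as in `pad("ab", 5, true, '*')`.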
>Minimize control flow complexity and "area under ifs", favoring consistent execution paths and times over "optimally" avoiding unnecessary work.
The right thing to do is to never optimize unless it's too slow and you've identified the first bottleneck. "Never optimize" means: write the naive code correctly (without performance aberrations like adding elements to an array).
> To sum up:
Stop being fancy. Stop optimizing. Stop thinking about code syntactically (= the succession of operations gives the right result). Think constantly about your code semantically.
> I do believe that there is real value in pursuing functional programming, but it would be irresponsible to exhort everyone to abandon their C++ compilers and start coding in Lisp, Haskell, or, to be blunt, any other fringe language.
"Here, let me dismiss functional programming, and by the way OCaml and other 'non-pure' functional languages don't exist, and functional programming languages aren't useful for anything 'real' so you should do your functional programming in C, and also you may want to dump everything in one long-ass function because I don't like deep stacks".
He's just rationalizing C traditions.
One of my biggest gripes about OO programming was that you had no idea what the other components were doing unless you investigated each component directly. Sometimes the dependencies and the chain would get so large and complicated, you'd spend more time figuring out how to wrap your head around the whole thing than doing things that result in direct business benefit.
But every interview you go to will tell you otherwise: inflating technical debt is a great way for managers to keep their jobs and for sales teams to boast that six-digit LOC = obviously state of the art.