You would be just as well off making computeWidth const/pure/readonly/whatever.
The compiler can even detect if it modifies anything and mark it for you. In fact, better compilers will compute mod/ref information and know that m_cachedWidth is not touched over that call.
However, LLVM's basic stateless alias analysis (LLVM being what at least Apple uses) is not good enough to do this in this kind of case (there are some exceptions where it might be able to; none apply here).
This is actually a great example of how improving pointer and alias analysis in a compiler buys you gains, and not an example of "how you should modify your code", since you generally should not modify your code to work around temporary single-compiler deficiencies without very good reason.
Especially considering how quickly compiler releases are pushed by Apple/et al.
Even if it is not, I still think this is an example of "how you should modify your code". Reason? Doing
temp = foo();
temp *= bar();
m_member = temp;
keeps your state consistent in case bar() throws. I would even use it if bar() is known not to throw, because you can't know what the future will bring, and because what you 'know' sometimes isn't true. Defensive programming, when it is as easy as in this case, is a net benefit.

Some comments: The compiler does not do magic. If you call a virtual function or call an address from another library, there is simply no way for the compiler to know the side effects at compile time.
The keyword "const" does not help the compiler to optimize this kind of code. Const is mostly a tool for developers, you can ignore it with mutable, const_cast, etc.
In many cases, the kind of code in the example will be optimized properly:
- If the code is simple enough and all the functions are in the same compilation unit, Clang will figure out the dependencies and optimize it properly.
- If the code is simple enough and the code is spread across several compilation units, Clang's link-time optimizer does an amazing job at finding dependencies.
The point of this example was more to illustrate a point about code clarity. We should not hesitate to make code more explicit, use extra temporary variables, etc. Those extra lines help both the other hackers and the compiler.
False. There are annotations on plenty of these libraries in header files, and you can determine virtual call side effects in plenty of cases through various forms of analysis.
"The keyword "const" does not help the compiler to optimize this kind of code. Const is mostly a tool for developers, you can ignore it with mutable, const_cast, etc."
Let's just say I understand what const and the various attributes I mentioned do, where they are useful, and how compilers use them. I disagree with your statement. Note that const where I mentioned it (i.e. computeWidth() const) would make the this pointer const, and const_casting that away would not be legal, AFAIK.
Unless the member variable was marked as mutable, the compiler would know it is readonly.
"The point of this example was more to illustrate a point about code clarity. We should not hesitate to make code more explicit, use extra temporary variables, etc. Those extra lines help both the other hackers and the compiler."
This is not what was written in the article (though I'd agree with your statement about code clarity), and as I showed, those extra lines do not help a good compiler.
C++ const has no effect on optimization, since it can be cast away.
And being aware of aliasing issues is a good idea in general; nothing the compiler does can fix this in the general case (e.g. if computeWidth() is located in an external shared library it's basically impossible for the compiler to determine that it can't modify m_cachedWidth)
What did that 6% binary size reduction buy them in startup time? 10ms? 300ms?
What about memory consumption after startup?
Did tests indicate that the memory locality had improved CPU performance in noticeable manner?
Nobody kept all the numbers and digging them backwards is more work than I have time for.
I am sorry I no longer have the actual numbers. To give an idea of the order of magnitude:
- For startup speed, measuring the cold start of a new WebProcess, the size of the WebCore binary seems to have a direct relation to the time it takes to start the process. Cutting 5% of the binary gave about a 5% reduction in startup time.
- The inlining improvements gave a runtime boost on the order of a few (single-digit) percent. It was usually an improvement across many benchmarks rather than being specific to one part of WebCore.
- Some changes had surprising results. I don't remember specifics, but some changes (unrelated to initialization) improved startup time without changing runtime performance in any measurable way.
I'm all for removing old code, but if you're going to claim performance gains then why not measure those?
Here's an example, Thrift (as used in Hector, a Cassandra client), had someone make a performance improvement:
https://issues.apache.org/jira/browse/THRIFT-959
The discussion has a lot of "shoulds", and one measurement of latency distributions, but no measurement of typical workloads or bulk inserts. Turns out, that caused at least a 30% performance regression:
If I understand WebKit's architecture correctly, that doesn't even include chrome (the visible UI, not Google's browser), JavaScriptCore, platform-specific glue, and especially no auxiliary files (certificates, icons, the "broken image" sign, ...).
Sometimes I long for the good old days where a browser used to fit on a floppy disc (Opera).
I wonder if someone has done analysis on what features make browsers so complicated. I could imagine that 20% of the code could handle 80% of the features (as so often). You could have a 'lite' HTML subset that's targeted on rich documents, rather than rich client webapps. Something like that would be great for older computers or mobile computers.
Going a bit further, I know there is a lot of crazy stuff in WebKit... e.g. neural networks try to predict which links you'll click on, based on previous behavior, mouse movements, etc., and then the browser prefetches likely pages. There are runtimes for NaCl, PNaCl, and Flash; there's a PDF browser (some of these are plugins); there is a VNC client; support for a bunch of different rendering models (layered HTML elements, Canvas, 3D); media support (codecs); support for webcams and microphones; peer-to-peer communication; and much more. Phew.
I guess a large chunk of this stuff should be in the OS, so that other apps could benefit from it. And another large part of it should be in plugins, so the browser can benefit from all the codecs on the system, for example.
There is a visualisation of the chrome binary here: http://neugierig.org/software/chromium/bloat/ I'm not sure how up to date it is now, but it gives a vague idea.
>I guess a large chunk of this stuff should be in the OS, so that other apps could benefit from it. And another large part of it should be in plugins, so the browser can benefit from all the codecs on the system, for example.
That is the case for Safari for example (using PDFKit for pdfs etc and system codecs for video and audio). Mozilla like to bundle everything with Firefox because they view Firefox as an OS itself, rather than just another application for viewing html documents. Most of the huge size and complexity in modern browsers is due to people trying to turn the browser into an operating system: https://en.wikipedia.org/wiki/Inner-platform_effect
But there are a lot of new APIs under the HTML5 umbrella. They're all accessed using JavaScript; for some, a lot of the code will be in WebCore. Here's a bunch: http://www.netmagazine.com/features/developer-s-guide-html5-...
I'm sure Canvas and WebGL add a lot to WebCore.
Really? While Chrome has all of those built in, the other WebKit browsers don't so why would it be in the WebKit source tree (especially after the mutual purges of the Blink/WebKit split.)
Also, all of those (except maybe the PDF viewer) in Chrome are plugins, and they are PPAPI (which was Chromium specific, not used by other WebKit browsers) not NPAPI plugins, and Flash and maybe the PDF viewer aren't bundled with Chromium (just Chrome), so it's really weird that anything related to them would remain in WebKit.
inline void updateCachedWidth() { m_cachedWidth = computeWidth() * deviceScaleFactor(); }
Gee, was that line so hard to read? No, it's easier! (Might have just been an example though.)
foo.h:
int bar(int i) { return i; /* awesome function! */ }
foo.cpp: #include "foo.h" … bar(5) …
At this point everything is fine, but lets say there's also:
wiffle.cpp: #include "foo.h" … bar(6) …
now both foo.o and wiffle.o will contain a function named bar - because they picked up the definition in foo.h, and C/C++ don't (technically) see any difference between a function that has been written inline and a function that was included from a header.
By slapping the inline keyword on the function bar, the linkage of the function changes so that bar won't be exported from every object file that includes it, and so the name collision will no longer occur at link time.
There are a bunch of other benefits, mostly along the lines of "the function doesn't need to be exported, therefore if it's not used I don't need to include it in the object".
Try putting
void hi() { int x = 3; }
in a header included from two .cpp files without the inline, and you will get a complaint that hi() is multiply defined at link time. So, one use case.
A lot of the plain-C APIs with a CoreFoundation style interface are actually C++ underneath. (No insider information necessary, this is easy to see in stack traces and process call stack samples.)
I wonder how their team could swing that culture without tripping over legal at every turn.
This remains the case to a large extent. This makes working with Apple and Microsoft in standards groups a bit challenging at times, since they won't actually comment on whether they're even thinking about implementing a standard, or whether they would be willing to implement it as written, until they suddenly ship it.
And if they're _not_ shipping it you have no way to tell whether that's because they never will due to some fundamental issue they perceive, or whether they're basically fine with the idea but just haven't gotten around to finding resources to implement it yet.
Isn't this why Objective-C++ is a thing?
The major downsides are that you can only optimize what the profiler can see and running the thing to make a build takes forever.
And clang's PGO support is not very good so far, so there isn't much to talk about...
The function in example 1 is modifying a member variable, and there is indeed a keyword that requires functions not to modify the class they operate on: const. It's very powerful, and by a long shot my favourite feature of C++.
That said, the function in question actually has to modify a member variable.
int square (int) __attribute__ ((pure));
But I can't say how smart the compiler is in handling those attributes.

[1] http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html
http://llvm.org/docs/LangRef.html#function-attributes
https://github.com/llvm-mirror/llvm/blob/master/lib/Transfor...
What is the future of WebKit now that Blink has been introduced? Will Apple spend considerable resources keeping an open-source project at the bleeding-edge considering it doesn't really make them any money? Should Safari just be scrapped? It only accounts for < 4% of the market share of non-mobile browsers.
http://www.netmarketshare.com/browser-market-share.aspx?qpri...
http://en.wikipedia.org/wiki/Usage_share_of_web_browsers
Here is StatCounter for Mobile for the last 6 months, the iPhone browser with 23%: http://gs.statcounter.com/#mobile_browser-ww-monthly-201207-...
The C++ community has brought upon itself all kinds of complexity and long compile times, all in the name of performance, which, in my mind, was always pretty suspect.
* Try to be explicit { rather than implicit }
* Carefully consider inlining { large blocks of code }
* Do not use static initializers { for infrequent or trivial cases }

GCC docs sound like the trick would be -fprofile-use, -freorder-functions and -freorder-blocks-and-partition, after a representative profiling run.
A representative profiling run for a shipping binary is a problem of course, JITs win here. DEC had a dynamic binary reoptimization framework in the 90s called DYNAMO that could do it for AOT compiled binaries.
> The second big drop is the result of removing features and cleaning code that came from the Chromium project.
In the graph, the second big drop is ~5% of the initial code size; removing the Chromium code actually reduced binary size more than the inlining fixes did.
With the clang or gcc compilers, you can easily link ObjC and C++ together. To some extent you can even mix them in one file (ObjC++), though I don't have experience with that.