But in most cases, such as ASP.NET MVC actions or an API call in a desktop app, the maintainability and high-level simplification you get from these techniques far outweigh the low-level "cost" of the code the compiler generates.
It's good to know what's going on under the covers though.
So later I compared it to a very simple for loop operating against a byte[], with both versions working against a 10MB buffer (not unrealistic input in my use case). The throughput of the "yield return" code was roughly a fifth of the loop's. I didn't try too hard to track down the precise cause at the time (maybe it was GC pressure from lots of temporaries, as you say? I was pretty CPU-bound, and suspected it had more to do with the straightforward loop JITting to code with fewer jumps); I just took the faster version.
1: for (i = 0; i < arr.Length; i++) sum += arr[i];
2: foreach (var x in generate(count)) sum += x;
In the first case, you end up with a small, relatively tight loop (the machine code has a lot of extra stuff I don't quite understand). In the second case, you're literally doing a virtual call (and I don't think the CLR inlines interface calls) to get_Current() in a loop, followed by MoveNext(). So there are two function-call overheads per element, not to mention the actual bodies of get_Current and MoveNext.
Here's the sample program[1]. I get about 600% runtime for the enumerable version versus the array. Given all the extra work, I'm sorta impressed it's only 6x. The 32-bit JIT is a bit slower doing the array method than the 64-bit, which surprises me. Here's the code for the loops[2].
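The two loop shapes above can be sketched in Java, where the same contrast exists: an indexed array loop versus driving an iterator through interface calls (hasNext()/next() playing the role of MoveNext()/get_Current()). The generate() here is a hand-written stand-in for what a compiler-generated "yield return" state machine roughly looks like; it isn't the C# compiler's actual output.

```java
import java.util.Iterator;

public class LoopShapes {
    // Hand-written analogue of a compiler-generated "yield return" state
    // machine: each element costs two interface calls, hasNext() and next().
    static Iterable<Integer> generate(int count) {
        return () -> new Iterator<Integer>() {
            int i = 0;
            public boolean hasNext() { return i < count; }
            public Integer next() { return i++; } // also boxes the int
        };
    }

    static long sumArray(int[] arr) {
        long sum = 0;
        for (int i = 0; i < arr.length; i++) sum += arr[i]; // tight indexed loop
        return sum;
    }

    static long sumIterable(int count) {
        long sum = 0;
        for (int x : generate(count)) sum += x; // interface dispatch per element
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        int[] arr = new int[n];
        for (int i = 0; i < n; i++) arr[i] = i;
        // Same result either way; timing the two loops shows the overhead.
        System.out.println(sumArray(arr) == sumIterable(n)); // prints "true"
    }
}
```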
var myBytes = GetMyBytesItr();
for (var i = 0; i < myBytes.Count(); ++i)
{
ProcessByte(myBytes.ElementAt(i));
}
and you'll be in a world of hurt, especially if GetMyBytesItr() allocates a memory buffer. Count() causes the "get the bytes data" action to occur, as does every call to ElementAt(). Now I'm not saying this is definitely what you were experiencing, but it's a common pitfall. Also, calling ElementAt() on each iteration is, of course, completely contrived for this example (you'd want to foreach instead, which would cause only one execution of the "get the bytes data" action).

ILSpy is a pretty interesting application/library to look at under the hood. The decompilation logic that transforms .NET bytecode (MSIL) into higher-level data structures is split into a bunch of well-defined transform phases that run in a pipeline, which means you can actually step through the pipeline and watch it improve the readability and semantic clarity of the IL one step at a time. It's an incredibly valuable debugging tool, really useful for understanding how this kind of decompiler works, and it was a big influence on how I designed the similar parts of my compiler.
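The Count()/ElementAt() pitfall described above can be sketched in Java. The count() and elementAt() helpers here are hypothetical naive analogues of the LINQ methods, and getMyBytesItr() stands in for a lazily re-evaluated source: each of them enumerates from the start, so the indexed loop turns O(n) work into O(n²) and re-runs the "get the bytes data" side effect over and over.

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyPitfall {
    static final AtomicInteger fetches = new AtomicInteger();

    // Stand-in for GetMyBytesItr(): every new iterator re-runs the
    // "get the bytes data" work (counted here instead of allocating).
    static Iterable<Byte> getMyBytesItr(int n) {
        return () -> {
            fetches.incrementAndGet(); // side effect repeats per enumeration
            return new Iterator<Byte>() {
                int i = 0;
                public boolean hasNext() { return i < n; }
                public Byte next() { return (byte) i++; }
            };
        };
    }

    // Naive analogues of LINQ's Count() and ElementAt():
    // each one enumerates the sequence from the beginning.
    static int count(Iterable<Byte> xs) {
        int c = 0;
        for (Byte ignored : xs) c++;
        return c;
    }

    static byte elementAt(Iterable<Byte> xs, int index) {
        int i = 0;
        for (Byte b : xs) if (i++ == index) return b;
        throw new IndexOutOfBoundsException();
    }

    public static void main(String[] args) {
        Iterable<Byte> myBytes = getMyBytesItr(100);
        long sum = 0;
        for (int i = 0; i < count(myBytes); ++i)   // count() re-enumerates every check
            sum += elementAt(myBytes, i);          // elementAt() walks from the start
        // 100 elements, but the source was re-fetched far more than once:
        System.out.println(fetches.get() > 100);   // prints "true"
    }
}
```

A single foreach over myBytes would trigger exactly one fetch, which is the fix suggested above.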
As a whole, I think ILSpy demonstrates just how valuable it is to have a really well-specified instruction set sitting beneath your compiler and runtime. MSDN's documentation for the instruction set is clear and understandable, and libraries like Cecil and ILSpy make it easy to load, manipulate, and save IL for whatever your purposes might be: runtime code generation, machine transforms of compiled code, obfuscation, deobfuscation, or outright cross-compilation.
I could generate JS from raw IL (and other projects like Volta did just that) but ILSpy gives me a huge head start in terms of producing JS that actually looks like what you'd write by hand. For loops instead of while loops, switch statements instead of cascading ifs, etc.
e.g. the C# compiler turns "var a = 3;" into "int a = 3;". It turns anonymous lambdas into methods with generated names (i.e. it turns one line of code into 3-4 lines), and generates classes with constructors that capture the environment when needed. It turns "yield return" generator methods into objects that hold state.
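The lambda lowering mentioned above can be sketched in Java: a lambda that captures a local becomes, conceptually, a generated class whose constructor does the environment capture and whose method holds the lambda body. The class name Closure$0 is made up for illustration; real compilers use their own naming schemes.

```java
// What the programmer writes (conceptually): x -> x + offset
// A sketch of what a compiler can lower that to.
import java.util.function.IntUnaryOperator;

public class Lowering {
    // Generated closure class (name is hypothetical)
    static final class Closure$0 implements IntUnaryOperator {
        private final int offset; // captured local, hoisted into a field

        Closure$0(int offset) { this.offset = offset; } // environment capture

        @Override
        public int applyAsInt(int x) { return x + offset; } // the lambda body
    }

    public static void main(String[] args) {
        int offset = 10;
        IntUnaryOperator addOffset = new Closure$0(offset); // "capture" site
        System.out.println(addOffset.applyAsInt(32)); // prints "42"
    }
}
```

One line of source becomes a class, a field, a constructor, and a method, which is exactly the kind of expansion being described.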
While async/await is a cool feature and is worth using, it's worth noting the upward trend in the complexity of generated code: expressed in the simpler language without the feature, async/await expands into significantly more code than previous new features did.
How much code does an await statement give rise to? It looks like about 40-50 lines to me.
Async/await makes it easier to write asynchronous code in a traditional linear manner. It's easy to shoot yourself in the foot, but it's just as easy with the alternatives.
Here's an article that I wrote a while back on what Async & Await generates: http://blog.filipekberg.se/2013/01/16/what-does-async-await-...
var result = await client.GetStringAsync("http://msdn.microsoft.com");
From the perspective of the caller, that code is still blocking. It solves the problem of threads being blocked, but that's only one of the problems you have.

The real and more complete alternative is to work with a well-designed Future/Promise framework, like the one in Scala [1], which is also usable from Java [2]. Doing concurrent computations by means of Futures/Promises is like working with Lego blocks.
Let me give an example with a piece of code that's similar to what I'm using in production. Let's say you want to make 2 requests to 2 different endpoints that serve similar results. You want the first one that completes to win (like an auction), but in case of a returned error, you want to fall back on the other one (type annotations and extra verbosity added for clarity):
val url1 = "http://some.url.com"
val url2 = "http://other.url.com"

// initiating the first concurrent request
val fr1: Future[Response] = client.get(url1)
// initiating the second concurrent request
val fr2: Future[Response] = client.get(url2)

// the first falls back to the second in case of error
val firstWithFallback: Future[Response] =
  fr1.fallbackTo(fr2)
// the second falls back to the first in case of error
val secondWithFallback: Future[Response] =
  fr2.fallbackTo(fr1)

// pick a winner
val finalResponse: Future[Response] =
  Future.firstCompletedOf(firstWithFallback :: secondWithFallback :: Nil)

// process the result
val string: Future[String] = finalResponse.map(r => r.body)

// in case both endpoints failed, log the error and
// fall back on a local copy
val stringWithFallback: Future[String] = string.recover {
  case ex: Exception =>
    logger.error(ex)
    File.read("/something.txt")
}
Given an HTTP client that's based on NIO, the above code is totally non-blocking. You can do many other crazy things: wait for several requests to complete and get a list of results in return, or try the first URL, and if it fails, try the second, and so on until one of them succeeds.

In the context of web apps, you can prepare Future responses this way and return them when they are ready. That works great with the async response support in Java Servlets 3.0, or with a framework such as Play Framework 2.x, where you can simply return a Future[Response] straight from your controllers [3].
[1] http://docs.scala-lang.org/sips/pending/futures-promises.htm...
[2] http://doc.akka.io/docs/akka/snapshot/java/futures.html
[3] http://www.playframework.com/documentation/2.1.0/ScalaAsync
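The same race-with-fallback composition can be sketched on the JVM without Scala, using java.util.concurrent.CompletableFuture (Java 12+ for exceptionallyCompose). The get() method here is a hypothetical stand-in for a non-blocking HTTP call, simulated with pre-completed futures so the example is self-contained.

```java
import java.util.concurrent.CompletableFuture;

public class RaceWithFallback {
    // Hypothetical async "request": here just an already-completed
    // or already-failed future standing in for a real NIO client.
    static CompletableFuture<String> get(String url, boolean fail) {
        return fail
            ? CompletableFuture.failedFuture(new RuntimeException("error from " + url))
            : CompletableFuture.completedFuture("body from " + url);
    }

    public static void main(String[] args) {
        CompletableFuture<String> fr1 = get("http://some.url.com", true);   // fails
        CompletableFuture<String> fr2 = get("http://other.url.com", false); // succeeds

        // each request falls back to the other on error (fallbackTo analogue)
        CompletableFuture<String> firstWithFallback =
            fr1.exceptionallyCompose(ex -> fr2);
        CompletableFuture<String> secondWithFallback =
            fr2.exceptionallyCompose(ex -> fr1);

        // pick a winner (firstCompletedOf analogue)
        CompletableFuture<String> finalResponse =
            firstWithFallback.applyToEither(secondWithFallback, r -> r);

        // in case both endpoints failed, recover with a local copy
        CompletableFuture<String> withRecovery =
            finalResponse.exceptionally(ex -> "local copy");

        System.out.println(withRecovery.join()); // prints "body from http://other.url.com"
    }
}
```

The blocking join() at the end is only there to print a result; in a web app you would hand the future itself back to the container, as described above.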
Also, although it is possible to extend C# using things like custom LINQ providers [1], I still prefer Lisp-style macros.
[0] http://www.linuxjournal.com/article/2821
[1] http://msdn.microsoft.com/en-us/library/bb546158.aspx
Though that may change with C# 5 and Roslyn (http://en.wikipedia.org/wiki/Microsoft_Roslyn, compiler as a service).
This is the case with all high-level language features. Full lexical scoping with first-class functions forces the PL implementation either to generate a <code, environment> pair for every function value, or to additionally perform aggressive closure conversion (up to Stalin levels). Virtual method dispatch gives rise to inline caches, often polymorphic ones. Lazy evaluation forces you to generate thunks. Pattern matching forces you to generate a decision tree or a similar structure. The list goes on. You already have such things in C#, even without async calls.
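For one of those examples, the thunks that lazy evaluation generates can be sketched as a memoizing Supplier in Java. This is a deliberately simplified, non-thread-safe sketch of the idea: suspend a computation, force it at most once, cache the result.

```java
import java.util.function.Supplier;

public class Thunk<T> implements Supplier<T> {
    private Supplier<T> compute; // the suspended computation
    private T value;             // cached result once forced
    private boolean forced = false;

    public Thunk(Supplier<T> compute) { this.compute = compute; }

    @Override
    public T get() {
        if (!forced) {           // evaluate at most once
            value = compute.get();
            forced = true;
            compute = null;      // drop the closure so it can be collected
        }
        return value;
    }

    public static void main(String[] args) {
        int[] evaluations = {0};
        Thunk<Integer> t = new Thunk<>(() -> { evaluations[0]++; return 6 * 7; });
        // Forcing twice runs the computation only once:
        System.out.println(t.get() + " " + t.get() + " " + evaluations[0]); // prints "42 42 1"
    }
}
```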
IMHO this means that the fruit of these features is no longer so low-hanging.