But in most cases, such as ASP.NET MVC actions or an API call in a desktop app, the maintainability and high-level simplification you get from these techniques far outweigh the low-level "cost" of the code the compiler generates.
It's good to know what's going on under the covers though.
So later I compared it to a very simple for loop operating against a byte[], with both versions working against a 10MB buffer (not unrealistic input in my use case). The throughput of the "yield return" code was roughly a fifth of the loop's. I didn't try too hard to track down the precise cause at the time (maybe it was GC pressure from lots of temporaries, as you say? I was pretty CPU-bound, and suspected it had more to do with the straightforward loop JITting to code with fewer jumps); I just took the faster version.
1: for (i = 0; i < arr.Length; i++) sum += arr[i];
2: foreach (var x in generate(count)) sum += x;
In the first case, you end up with a small, relatively tight loop (the machine code has a lot of extra stuff I don't quite understand). In the second case, you're literally doing a virtual call (and I don't think the CLR inlines interface calls) to get_Current() in a loop, followed by MoveNext(). So there are two function-call overheads per element, not to mention the actual bodies of get_Current and MoveNext.
Here's the sample program[1]. I get about 600% runtime for the enumerable version versus the array. Given all the extra work, I'm sorta impressed it's only 6x. The 32-bit JIT is a bit slower doing the array method than the 64-bit, which surprises me. Here's the code for the loops[2].
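The two loop shapes above can be sketched in Java, where the same contrast exists: an indexed array loop versus driving an iterator through interface calls (hasNext()/next() playing the role of MoveNext()/get_Current()). The generate() here is a hand-written stand-in for what a compiler-generated "yield return" state machine roughly looks like; it isn't the C# compiler's actual output.

```java
import java.util.Iterator;

public class LoopShapes {
    // Hand-written analogue of a compiler-generated "yield return" state
    // machine: each element costs two interface calls, hasNext() and next().
    static Iterable<Integer> generate(int count) {
        return () -> new Iterator<Integer>() {
            int i = 0;
            public boolean hasNext() { return i < count; }
            public Integer next() { return i++; } // also boxes the int
        };
    }

    static long sumArray(int[] arr) {
        long sum = 0;
        for (int i = 0; i < arr.length; i++) sum += arr[i]; // tight indexed loop
        return sum;
    }

    static long sumIterable(int count) {
        long sum = 0;
        for (int x : generate(count)) sum += x; // interface dispatch per element
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        int[] arr = new int[n];
        for (int i = 0; i < n; i++) arr[i] = i;
        // Same result either way; timing the two loops shows the overhead.
        System.out.println(sumArray(arr) == sumIterable(n)); // prints "true"
    }
}
```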
var myBytes = GetMyBytesItr();
for (var i = 0; i < myBytes.Count(); ++i)
{
ProcessByte(myBytes.ElementAt(i));
}
and you'll be in a world of hurt, especially if GetMyBytesItr() allocates a memory buffer. Count() causes the "get the bytes data" action to occur, as does every call to ElementAt(). Now I'm not saying this is definitely what you were experiencing, but it's a common pitfall. Also, calling ElementAt() on each iteration is, of course, completely contrived for this example (you'd want to foreach instead, which would cause only one execution of the "get the bytes data" action).

ILSpy is a pretty interesting application/library to look at under the hood. The decompilation logic that transforms .NET bytecode (MSIL) into higher-level data structures is split into a bunch of well-defined transform phases that run in a pipeline, which means you can actually step through the pipeline and watch it improve the readability and semantic clarity of the IL one step at a time. It's an incredibly valuable debugging tool, really useful for understanding how this kind of decompiler works, and it was a big influence on how I designed the similar parts of my compiler.
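The Count()/ElementAt() pitfall described above can be sketched in Java. The count() and elementAt() helpers here are hypothetical naive analogues of the LINQ methods, and getMyBytesItr() stands in for a lazily re-evaluated source: each of them enumerates from the start, so the indexed loop turns O(n) work into O(n²) and re-runs the "get the bytes data" side effect over and over.

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyPitfall {
    static final AtomicInteger fetches = new AtomicInteger();

    // Stand-in for GetMyBytesItr(): every new iterator re-runs the
    // "get the bytes data" work (counted here instead of allocating).
    static Iterable<Byte> getMyBytesItr(int n) {
        return () -> {
            fetches.incrementAndGet(); // side effect repeats per enumeration
            return new Iterator<Byte>() {
                int i = 0;
                public boolean hasNext() { return i < n; }
                public Byte next() { return (byte) i++; }
            };
        };
    }

    // Naive analogues of LINQ's Count() and ElementAt():
    // each one enumerates the sequence from the beginning.
    static int count(Iterable<Byte> xs) {
        int c = 0;
        for (Byte ignored : xs) c++;
        return c;
    }

    static byte elementAt(Iterable<Byte> xs, int index) {
        int i = 0;
        for (Byte b : xs) if (i++ == index) return b;
        throw new IndexOutOfBoundsException();
    }

    public static void main(String[] args) {
        Iterable<Byte> myBytes = getMyBytesItr(100);
        long sum = 0;
        for (int i = 0; i < count(myBytes); ++i)   // count() re-enumerates every check
            sum += elementAt(myBytes, i);          // elementAt() walks from the start
        // 100 elements, but the source was re-fetched far more than once:
        System.out.println(fetches.get() > 100);   // prints "true"
    }
}
```

A single foreach over myBytes would trigger exactly one fetch, which is the fix suggested above.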
As a whole, I think ILSpy demonstrates just how valuable it is to have a really well-specified instruction set sitting beneath your compiler and runtime. MSDN's documentation for the instruction set is clear and understandable, and libraries like Cecil and ILSpy make it easy to load, manipulate, and save IL for whatever your purposes might be: runtime code generation, machine transforms of compiled code, obfuscation, deobfuscation, or outright cross-compilation.
I could generate JS from raw IL (and other projects like Volta did just that) but ILSpy gives me a huge head start in terms of producing JS that actually looks like what you'd write by hand. For loops instead of while loops, switch statements instead of cascading ifs, etc.
e.g. the C# compiler turns "var a = 3;" into "int a = 3;". It turns anonymous lambdas into methods with generated names (i.e. it turns one line of code into 3-4 lines), and generates classes with constructors that capture the environment when needed. It turns "yield return" generator methods into objects that hold state.
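The lambda lowering mentioned above can be sketched in Java: a lambda that captures a local becomes, conceptually, a generated class whose constructor does the environment capture and whose method holds the lambda body. The class name Closure$0 is made up for illustration; real compilers use their own naming schemes.

```java
// What the programmer writes (conceptually): x -> x + offset
// A sketch of what a compiler can lower that to.
import java.util.function.IntUnaryOperator;

public class Lowering {
    // Generated closure class (name is hypothetical)
    static final class Closure$0 implements IntUnaryOperator {
        private final int offset; // captured local, hoisted into a field

        Closure$0(int offset) { this.offset = offset; } // environment capture

        @Override
        public int applyAsInt(int x) { return x + offset; } // the lambda body
    }

    public static void main(String[] args) {
        int offset = 10;
        IntUnaryOperator addOffset = new Closure$0(offset); // "capture" site
        System.out.println(addOffset.applyAsInt(32)); // prints "42"
    }
}
```

One line of source becomes a class, a field, a constructor, and a method, which is exactly the kind of expansion being described.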
While async/await is a cool feature and is worth using, it's worth noting the upward trend in the complexity of generated code: expressed in the simpler language without the feature, async/await expands into significantly more code than previous new features did.
How much code does an await statement give rise to? It looks like about 40-50 lines to me.
Async/await makes it easier to write asynchronous code in a traditional linear manner. It's easy to shoot yourself in the foot, but it's just as easy with the alternatives.
Here's an article that I wrote a while back on what Async & Await generates: http://blog.filipekberg.se/2013/01/16/what-does-async-await-...
var result = await client.GetStringAsync("http://msdn.microsoft.com");
From the perspective of the caller, that code is still blocking. It solves the problem of threads being blocked, but that's only one of the problems you have.

The real and more complete alternative is to work with a well-designed Future/Promise framework, like the one in Scala [1], which is also usable from Java [2]. Doing concurrent computations by means of Futures/Promises is like working with Lego blocks.
Let me give an example with a piece of code that's similar to what I'm using in production. Let's say you want to make 2 requests to 2 different endpoints that serve similar results. You want the first one that completes to win (like an auction), but in case of a returned error, you want to fall back on the other one (type annotations and extra verbosity added for clarity):
val url1 = "http://some.url.com"
val url2 = "http://other.url.com"

// initiating the first concurrent request
val fr1: Future[Response] = client.get(url1)
// initiating the second concurrent request
val fr2: Future[Response] = client.get(url2)

// the first falls back to the second in case of error
val firstWithFallback: Future[Response] =
  fr1.fallbackTo(fr2)
// the second falls back to the first in case of error
val secondWithFallback: Future[Response] =
  fr2.fallbackTo(fr1)

// pick a winner
val finalResponse: Future[Response] =
  Future.firstCompletedOf(firstWithFallback :: secondWithFallback :: Nil)

// process the result
val string: Future[String] = finalResponse.map(r => r.body)

// in case both endpoints failed, log the error and
// fall back on a local copy
val stringWithFallback: Future[String] = string.recover {
  case ex: Exception =>
    logger.error(ex)
    File.read("/something.txt")
}
Given an HTTP client that's based on NIO, the above code is totally non-blocking. You can do many other crazy things: wait for several requests to complete and get a list of results in return, or try the first URL, and if it fails, try the second, and so on until one of them succeeds.

In the context of web apps, you can prepare Future responses this way and return them when they are ready. That works great with the async response support in Java Servlets 3.0, or with a framework such as Play Framework 2.x, where you can simply return a Future[Response] straight from your controllers [3].
[1] http://docs.scala-lang.org/sips/pending/futures-promises.htm...
[2] http://doc.akka.io/docs/akka/snapshot/java/futures.html
[3] http://www.playframework.com/documentation/2.1.0/ScalaAsync
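The same race-with-fallback composition can be sketched on the JVM without Scala, using java.util.concurrent.CompletableFuture (Java 12+ for exceptionallyCompose). The get() method here is a hypothetical stand-in for a non-blocking HTTP call, simulated with pre-completed futures so the example is self-contained.

```java
import java.util.concurrent.CompletableFuture;

public class RaceWithFallback {
    // Hypothetical async "request": here just an already-completed
    // or already-failed future standing in for a real NIO client.
    static CompletableFuture<String> get(String url, boolean fail) {
        return fail
            ? CompletableFuture.failedFuture(new RuntimeException("error from " + url))
            : CompletableFuture.completedFuture("body from " + url);
    }

    public static void main(String[] args) {
        CompletableFuture<String> fr1 = get("http://some.url.com", true);   // fails
        CompletableFuture<String> fr2 = get("http://other.url.com", false); // succeeds

        // each request falls back to the other on error (fallbackTo analogue)
        CompletableFuture<String> firstWithFallback =
            fr1.exceptionallyCompose(ex -> fr2);
        CompletableFuture<String> secondWithFallback =
            fr2.exceptionallyCompose(ex -> fr1);

        // pick a winner (firstCompletedOf analogue)
        CompletableFuture<String> finalResponse =
            firstWithFallback.applyToEither(secondWithFallback, r -> r);

        // in case both endpoints failed, recover with a local copy
        CompletableFuture<String> withRecovery =
            finalResponse.exceptionally(ex -> "local copy");

        System.out.println(withRecovery.join()); // prints "body from http://other.url.com"
    }
}
```

The blocking join() at the end is only there to print a result; in a web app you would hand the future itself back to the container, as described above.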
Also, although it is possible to extend C# using things like custom LINQ providers [1], I still prefer Lisp-style macros.
[0] http://www.linuxjournal.com/article/2821
[1] http://msdn.microsoft.com/en-us/library/bb546158.aspx
Though that may change with C# 5 and Roslyn (http://en.wikipedia.org/wiki/Microsoft_Roslyn, compiler as a service).
This is the case with all high-level language features. Full lexical scoping with first-class functions forces the PL implementation either to generate a <code, environment> pair for every function value, or to additionally perform aggressive closure conversion (up to Stalin levels). Virtual method dispatch gives rise to inline caches, often polymorphic ones. Lazy evaluation forces you to generate thunks. Pattern matching forces you to generate a decision tree or a similar structure. The list goes on. You already have such things in C#, even without async calls.
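For one of those examples, the thunks that lazy evaluation generates can be sketched as a memoizing Supplier in Java. This is a deliberately simplified, non-thread-safe sketch of the idea: suspend a computation, force it at most once, cache the result.

```java
import java.util.function.Supplier;

public class Thunk<T> implements Supplier<T> {
    private Supplier<T> compute; // the suspended computation
    private T value;             // cached result once forced
    private boolean forced = false;

    public Thunk(Supplier<T> compute) { this.compute = compute; }

    @Override
    public T get() {
        if (!forced) {           // evaluate at most once
            value = compute.get();
            forced = true;
            compute = null;      // drop the closure so it can be collected
        }
        return value;
    }

    public static void main(String[] args) {
        int[] evaluations = {0};
        Thunk<Integer> t = new Thunk<>(() -> { evaluations[0]++; return 6 * 7; });
        // Forcing twice runs the computation only once:
        System.out.println(t.get() + " " + t.get() + " " + evaluations[0]); // prints "42 42 1"
    }
}
```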
IMHO this means that the fruit of these features is no longer so low-hanging.