Each GC algorithm has its worst-case behavior; saying "avoid at all costs" because of a single scenario is not very helpful.
Also, what were the JVM flags for each test? All I can see in these graphs is the bump at the beginning before the heap has sized to a stable level.
Incorrect: a realtime system only needs execution times below a certain threshold. malloc/free is just as non-deterministic as GC-allocated memory (in practical systems). If you put effort into it you can make malloc/free do really stupid things too. If I wrote a program that allocated and freed strings of just the right sizes, the program would eventually crash.
With any sort of modern CPU it would be almost impossible to produce non-trivial programs with deterministic execution times. I'd hazard a guess that there are very few 'realtime' programs that couldn't be written using the JVM. If you can do HFT on the JVM it's realtime enough.
To make systems with deterministic execution times you'd have to start looking at extremely limited processors that lack RAM (or have really exotic RAM that doesn't refresh), as well as removing other resources that are essential to building software the modern way.
All this proves is that for this particular problem the JVM is probably unsuitable for a realtime system. The fortunate thing is that I don't think any real time program actually needs to allocate strings in this manner.
In all seriousness, most of the reason that the JVM is not used on realtime systems is that there aren't very many cheap CPUs capable of executing Java bytecode that are certified for operation in the harsh environments that a lot of realtime systems operate in. My friend builds realtime systems and he codes in C because that's the only compiler available for the CPU.
He'd start using Netduino in a heartbeat if it could survive being next to an electric generator inside the Hoover dam.
This is actually a trading platform but if you can run the exchange on the JVM you could certainly write a client for it on the JVM.
One of the things to consider in engineering a real-time environment is the central data-flow load of the system. For example, in a medical data-acquisition environment, allocating an object for each tick would likely be unwise. In fact, even malloc might not be called for; statically allocating buffers is often the better avenue.
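To make the "statically allocate the buffers" point concrete on the JVM side, here is a minimal sketch of a ring buffer that allocates its arrays once and never again per tick. The class name SampleRing and its fields are made up for illustration, and it is not thread-safe; it only shows the allocation pattern.

```java
/** Hypothetical fixed-capacity sample buffer: all memory is allocated in the constructor. */
final class SampleRing {
    private final long[] timestampsNanos; // primitive arrays, so no per-sample objects
    private final short[] values;
    private int next = 0;   // index the next sample will be written to
    private long count = 0; // total samples recorded so far

    SampleRing(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        timestampsNanos = new long[capacity];
        values = new short[capacity];
    }

    /** Stores one sample, overwriting the oldest once the buffer is full. Allocates nothing. */
    void record(long timestampNanos, short value) {
        timestampsNanos[next] = timestampNanos;
        values[next] = value;
        next = (next + 1) % values.length;
        count++;
    }

    /** Number of samples currently held (at most the capacity). */
    int size() {
        return (int) Math.min(count, values.length);
    }

    /** Value of the most recently recorded sample. */
    short latestValue() {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        return values[(next - 1 + values.length) % values.length];
    }
}
```

Because record() only writes into primitive arrays, the steady-state sampling loop gives the collector nothing new to trace.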
In the case of a low-latency financial trading system written in a garbage-collected language, it might be prudent to allocate all the objects before 0830 (if that is when trading starts) and free them after the market close.
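A rough sketch of that "allocate everything before the open" idea, assuming a simple single-threaded object pool. OrderPool and Order are hypothetical names, not taken from any real trading system.

```java
import java.util.ArrayDeque;

/** Hypothetical pool: every Order is created up front and recycled during the session. */
final class OrderPool {
    /** Illustrative mutable message type that is reused instead of reallocated. */
    static final class Order {
        long id;
        long priceTicks;
        int quantity;

        void clear() {
            id = 0;
            priceTicks = 0;
            quantity = 0;
        }
    }

    private final ArrayDeque<Order> free;

    /** Allocates all Order instances now, e.g. before trading starts. */
    OrderPool(int size) {
        free = new ArrayDeque<>(size);
        for (int i = 0; i < size; i++) {
            free.push(new Order());
        }
    }

    /** Returns a pooled Order, or null if the pool is exhausted (the caller decides what to do). */
    Order acquire() {
        return free.poll();
    }

    /** Resets the Order and puts it back in the pool. */
    void release(Order order) {
        order.clear();
        free.push(order);
    }

    int available() {
        return free.size();
    }
}
```

The pool never grows, so running dry shows up as a null from acquire() rather than as a surprise allocation in the middle of the trading day. A multi-threaded system would need a different structure; this one is deliberately not thread-safe.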
There is a wide range of real-time response requirements (for financial exchanges, response times of less than a hundred milliseconds (e.g., 395 from BATS) for full turnaround are expected). Electrocardiogram data needs to be sampled once each millisecond, but the jitter in sampling times needs to be low.
Looks like some good tools there, but without the engineering context it is hard to draw a useful conclusion.
As for your points about realtime - I am completely with you. I had in mind that one might have a realtime system on the RTJVM or in C++ which needs to periodically communicate with a standard JVM. There are three approaches I can see for this. One is to send messages and decouple that way. Another is to make the communication abort if it looks like it might cause a deadline miss. The third is to briefly (for milliseconds) disable the GC in the standard JVM so the communication can occur, and then turn it back on again immediately afterwards.
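The first two approaches (decouple through messages, and give up rather than miss a deadline) can be sketched in plain Java roughly like this. Handoff and trySend are made-up names, and the example keeps both sides in one process purely for illustration. ArrayBlockingQueue takes a lock internally, so treat this as a picture of the pattern rather than a hard-realtime handoff.

```java
import java.util.concurrent.ArrayBlockingQueue;

/** Hypothetical bounded mailbox between a time-critical sender and a slower receiver. */
final class Handoff<T> {
    private final ArrayBlockingQueue<T> queue;

    Handoff(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
    }

    /**
     * Time-critical side: never waits for space. Returns false (message not sent)
     * if the deadline has already passed or the mailbox is full.
     */
    boolean trySend(T message, long deadlineNanos) {
        // Compare via subtraction, as System.nanoTime() values may wrap around.
        if (deadlineNanos - System.nanoTime() <= 0) {
            return false;
        }
        return queue.offer(message); // offer() returns immediately when the queue is full
    }

    /** Slower side: takes the next message if there is one, otherwise returns null. */
    T poll() {
        return queue.poll();
    }
}
```

The sender always gets an immediate yes or no, so a stalled receiver (for instance one sitting in a GC pause) costs dropped messages rather than a missed deadline.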
Now - that sounds like a really bad idea and it probably is - but it is an interesting idea to play with.
So - please don't think I am proposing realtime programming on the standard JVM, just some ideas for larger systems integration.
The solution, provided by the maker of the above-mentioned Windows software, is an external keying module. With this arrangement, Windows sends characters to the hardware brick, which, latency-free, sends out the characters quite nicely. (Amazing how the trained Morse-code ear can distinguish ms delays in character starts.)
Personally, I now think that way whenever a system has a latency-rich component connected to a latency-free one. This of course has implications for the overall solution.
Reason 1. G1 uses an SATB (snapshot-at-the-beginning) write barrier, which is more expensive than the card-marking barrier that HotSpot's other collectors use.
Reason 2. G1 has to use an STW pause to move objects around. E.g. if your heap is half full of live objects, you have to physically move 1MiB of live data in memory to reclaim 1MiB of free space. Cleaning sparse regions first effectively reduces this proportion, but G1 still has to work very hard to reclaim each meg of free space.
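The arithmetic behind that 1MiB-for-1MiB figure can be written down as a back-of-the-envelope model: evacuating a region whose live fraction is p copies p of it and frees the remaining 1 - p, so the copy cost per reclaimed byte is p / (1 - p). The helper below is only that simplified model (the class and method names are invented); it is not G1's actual region-selection heuristic and ignores marking and remembered-set overhead.

```java
/** Simplified model of copying-collector cost; not G1's real heuristic. */
final class EvacuationCost {
    private EvacuationCost() {}

    /**
     * Bytes of live data that must be copied for each byte of space reclaimed,
     * for a region whose live fraction is liveFraction (0 inclusive to 1 exclusive).
     */
    static double bytesCopiedPerByteReclaimed(double liveFraction) {
        if (liveFraction < 0.0 || liveFraction >= 1.0) {
            throw new IllegalArgumentException("liveFraction must be in [0, 1)");
        }
        return liveFraction / (1.0 - liveFraction);
    }
}
```

With liveFraction = 0.5 the ratio is 1.0, the half-full case described above. At 0.75 it rises to 3.0, and it drops toward zero for sparse regions, which is why collecting the sparsest regions first helps.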
If you need low-pause GC on the JVM today, use concurrent mark sweep: http://java.dzone.com/articles/how-tame-java-gc-pauses http://aragozin.blogspot.com/2011/07/gc-check-list-for-data-...
I don't know the semantics of "real-time" (argued in other comments), but if you are coming from that angle, maybe you are looking for guarantees -- it seems like there ought to be a way to write GC so that it just works no matter what you throw at it, and it seems like G1 is not that.