I would love to find some kind of stripped-down vector graphics engine that is fully geared towards real-time rendering of path-based graphics for games, for example to write retro-style games that use 2D vector graphics instead of textured quads. I've been writing a simple game in this graphics style, and so far I've simply been using pre-rendered textures for sprites. Being able to render these things directly by composing vector-based layers would allow all kinds of interesting effects and image-quality improvements, e.g. seamless scaling, morphing, etc. Obviously this would be way, way slower than emulating these things with bitmap textures and/or SDFs for scaling, but for the simple graphics my game uses that should probably be fine; even on mid-range hardware there is plenty of frame time I could dedicate to vector rendering and still hit at least 30 fps.
It's also worth noting that the HTML canvas implementations in most browsers are not too bad performance-wise. Quartz is also really fast, but Apple made it really easy to accidentally hit a slow rendering path (especially for bitmap operations).
This AR demo on MagicLeap shows the potential: https://twitter.com/asajeffrey/status/1106667615622180864
Pathfinder3 looks promising, but comes with its own tradeoffs. It's an open area of research.
[0]: https://github.com/google/skia/tree/master/src/compute/skc
A 320 PPI display (a.k.a. "Retina") has roughly ten times more pixels than an old, classic 96 PPI one ((320/96)² ≈ 11).
So merely attaching such a monitor requires roughly ten times more CPU power to render the same UI - that is, if you use CPU rasterizers only.
Obviously that is not an option. That's why Direct2D and Skia contain both GPU and CPU rasterizers. That dualism complicates things quite a lot: two alternative renderers under the same API roof must produce similar results.
So Blend2D, to be a viable solution for UI, must be roughly ten times faster at rasterizing than the current alternatives.
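As a quick sanity check of the density arithmetic above (a sketch; pixel count scales with the square of PPI at a fixed physical screen size):

```python
# Pixel-count ratio between a 320 PPI ("Retina") panel and a classic
# 96 PPI panel of the same physical size. Density scales quadratically.
def pixel_ratio(ppi_new: float, ppi_old: float) -> float:
    return (ppi_new / ppi_old) ** 2

ratio = pixel_ratio(320, 96)
print(round(ratio, 1))  # ~11.1, i.e. roughly an order of magnitude more pixels
```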
There was also the NV_path_rendering OpenGL extension from NVIDIA, aimed at 2D path rasterization, but that effort seems to be fading, as is OpenGL itself. The OpenGL architecture, created to run hardware-accelerated full-screen apps, is far from adequate for windowed UI.
So far, Microsoft's Direct2D is the best thing we have for hardware-accelerated UI. And WARP mode in Direct2D (its CPU rasterizer) is pretty close to Blend2D: they also use a JIT for rasterizing, AFAIK.
Blend2D has multithreaded rendering on the roadmap. I have experience in this area, and everything in Blend2D was designed with multithreading in mind (banding, for example). The implementation I'm planning would scale very well.
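A minimal sketch of why banding is multithreading-friendly (my own illustration, not Blend2D code): the framebuffer is split into fixed-height horizontal bands, and since each band touches a disjoint set of scanlines, bands can be rendered on separate threads without synchronizing on pixel data.

```python
from concurrent.futures import ThreadPoolExecutor

BAND_HEIGHT = 32  # Blend2D currently uses 32-scanline bands

def bands(height: int, band_height: int = BAND_HEIGHT):
    """Yield (y0, y1) scanline ranges covering the framebuffer, one per band."""
    for y0 in range(0, height, band_height):
        yield y0, min(y0 + band_height, height)

def render_band(y0: int, y1: int) -> list:
    # Placeholder for "rasterize all edges intersecting scanlines y0..y1".
    return list(range(y0, y1))

def render(height: int, workers: int = 4) -> list:
    # Bands are independent, so they can be farmed out to a thread pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda b: render_band(*b), bands(height))
    return [y for band in results for y in band]

print(render(100) == list(range(100)))  # every scanline rendered exactly once
```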
NV_path_rendering - I haven't seen any detailed comparison, to be honest. Frame rate alone is not enough to compare a CPU against a GPU technology; memory consumption and power consumption matter as well, e.g. to calculate frame rate per watt.
I cannot comment on Direct2D, as it's not open source and runs on only a single operating system, so I don't consider it competition at the moment.
GPU rasterization, from the application's perspective, is near O(1): in ideal circumstances it does not depend on the number of pixels.
And using multiple threads to render UI is not desirable - there are already too many CPU consumers on a modern desktop (the online radio that's playing right now, etc.).
I am not saying that CPU rasterization makes no sense - quite the contrary. As a practical example: in Sciter on Linux/GTK I use the Cairo backend by default, as OpenGL inside GTK windows is horrible. So Skia does not help there at all - Cairo and its CPU rasterizer are used.
If we had something that could rasterize paths 5-10 times faster than current Cairo, it would solve all current desktop needs, I think.
In principle, 192 PPI is OK for desktop monitors of practical sizes (24 inch, 3840x2160 pixels): the human eye will not be able to see separate pixels. Mobile devices have pretty much the same number of pixels (iPad Pro: 2732x2048). These are the targets that need to be considered.
Practical requirements:
Take the HN site in a browser and open it full screen. A decent 2D library should be able to rasterize that amount of text at 60 FPS (e.g. for kinetic scrolling).
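To put a number on that requirement (a back-of-the-envelope check, assuming the 3840x2160 desktop target mentioned earlier):

```python
# Throughput needed to repaint a full 4K frame at 60 FPS.
width, height, fps = 3840, 2160, 60

pixels_per_frame = width * height           # ~8.3 million pixels
pixels_per_second = pixels_per_frame * fps  # ~498 million pixels/s
budget_ns_per_pixel = 1e9 / pixels_per_second

print(f"{pixels_per_second / 1e6:.0f} Mpx/s, {budget_ns_per_pixel:.2f} ns/pixel")
```

So a full-screen repaint leaves a budget of only about 2 ns per pixel, which is why skipping untouched areas matters so much.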
I do wonder, though, whether these kinds of libraries shouldn't be supplanted by hardware-accelerated rendering.
The author of Antigrain passed away a few years back, sadly.
It was and still probably is one of the best graphics libraries out there, and he was really hoping for a time when antialias-free, vector-based UI on high-pixel-density screens would become reality.
[1] Announcement in Russian: https://rsdn.org/forum/life/5377743.flat
"Maxim Shemanarev 1966-2013. Tragically, unexpectedly passed away in his home at 47." The discussion that follows names epileptic seizure as the cause of death.
[2] For example, this glorious text on text rasterization from 2007 http://www.antigrain.com/research/font_rasterization/index.h...
$ python tools/git-sync-deps
$ gn gen out/release -args="is_official_build=true is_component_build=false"
$ ninja -C out/release
This will build it as a static library. I believe by default it will try to use system libraries; you can make it use its own versions of those libraries with this slightly more complicated process (which could be done with a script, but I'm too lazy to write one right now):
$ gn args out/release --list --short
This will print all the current build arguments. Copy all the use_system_XXX = true lines, change them to = false, and paste them into the text editor that opens once you run:
$ gn args out/release
After that, run ninja to build:
$ ninja -C out/release
For the include files, you have to add at least these to your include directory: include
include/config
include/core
include/effects
include/gpu
On Windows, you might want to have two builds linking against different versions of the standard library (because of the whole iterator-debug-level issue). Add extra_cflags=["/MTd"] for the debug build and extra_cflags=["/MT"] for the release build (statically linked in this case). Of course, a better format in the first place (mentioned in another comment) would solve this anyway.
- Dense cell buffer, one 32-bit integer per cell (FreeType/Qt use a sparse cell buffer with two 32-bit integers per cell; font-rs uses floats, if I'm not mistaken)
- Blend2D builds edges before the rasterization step; the edge builder is optimized and performs clipping and curve flattening
- Rasterization and composition happen in bands, so the storage required by the cell buffer is quite small (currently a band is 32 scanlines, but we will make it adjustable based on width)
- To complement the dense cell buffer, Blend2D also uses a shadow bit-buffer to mark cells that were altered by the rasterizer
- The compositor scans the bit-buffer instead of the cell-buffer to quickly skip areas that were not touched by the rasterizer
- The compositor is SIMD-optimized and calculates multiple masks at the same time (at the moment it works on 4 pixels at a time, but this can easily be extended to 8 or 16 pixels)
- The compositor clears both the cell-buffer and the shadow bit-buffer during composition, so when it finishes these buffers are zeroed and ready for another band
- Blend2D maintains a small pool of zeroed memory that is used to quickly allocate the cell and bit buffers when you create the rendering context; they are returned to the pool when you destroy it
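A toy sketch of the shadow bit-buffer idea described above (my own illustration, not Blend2D code; the group size is made up): each bit marks a group of cells the rasterizer touched, so the compositor can skip whole zero regions and clear both buffers as it goes, leaving them ready for the next band.

```python
CELLS_PER_BIT = 4  # granularity of the shadow bit-buffer (illustrative)

def composite_band(cells, bits):
    """Scan the bit-buffer, visit only touched cell groups, and clear
    both buffers so the band storage is ready for reuse."""
    touched = []
    for i, dirty in enumerate(bits):
        if not dirty:
            continue  # whole group of cells is zero -> skipped entirely
        start = i * CELLS_PER_BIT
        for j in range(start, min(start + CELLS_PER_BIT, len(cells))):
            if cells[j]:
                touched.append(j)  # stand-in for compositing this pixel run
            cells[j] = 0           # clear cell-buffer during composition
        bits[i] = False            # clear shadow bit-buffer as well
    return touched

# The "rasterizer" touched cells 5 and 13; each bit covers 4 cells.
cells = [0] * 16
bits = [False] * 4
cells[5] = 255; bits[1] = True
cells[13] = 128; bits[3] = True

print(composite_band(cells, bits))  # [5, 13]
print(any(cells), any(bits))        # False False -> buffers zeroed for reuse
```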
There are probably more differences, like parametrization of NonZero and EvenOdd fill rules, etc, but these are really implementation details to minimize the number of pipelines generated by common rendering operations.
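For readers unfamiliar with the two fill rules mentioned: both decide whether a point is inside a path from its accumulated winding number, and differ only in the final test. A minimal sketch:

```python
def nonzero_fill(winding: int) -> bool:
    # NonZero: inside whenever the signed edge crossings don't cancel out.
    return winding != 0

def evenodd_fill(winding: int) -> bool:
    # EvenOdd: inside whenever the crossing count is odd, sign ignored.
    return winding % 2 != 0

# A self-overlapping path (e.g. a five-pointed star drawn in one stroke)
# yields winding 2 in the overlap: NonZero fills it, EvenOdd leaves a hole.
print(nonzero_fill(2), evenodd_fill(2))  # True False
```

The difference only shows up on self-intersecting or nested contours, which is why the rule is just a parameter of the generated pipeline rather than a separate rasterizer.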
The advantage of font-rs is in rendering small paths; large paths incur increasing overhead, as a lot of computation happens on zero cells. The Blend2D rasterizer is universal and was tuned to perform well from small glyphs up to 4K framebuffers.
When I was designing Blend2D's rasterizer I wrote around 20 rasterizers and benchmarked them against each other at various resolutions. I had rasterizers similar to font-rs (but not using floats), and these were only competitive at very small resolutions like 8x8 and 16x16 pixels. When shifting to larger resolutions, these rasterizers would always lose, as the shadow bit-buffer scan is much quicker than walking through zero cells, especially if you do pixel compositing.
There are demo samples in the bl-samples-qt repository that render animated content and offer both Blend2D and Qt rendering backends. You can check them out to see how the rasterizer compares against Qt.
Let me know if that answers your question.
I get the impression that these rendering bugs are often partly caused by old optimizations that break under later spec changes, or that missed edge cases the first time around. So I hope this engine does better than most, since it's more recent, built from the ground up, and can learn from the mistakes of others.
The world needed a vector version of PNG far more than it needed a vector version of HTML/CSS.
The SMIL stuff has made the SVG DOM a bloody mess, and it's deprecated everywhere, which is a sad outcome, but it's still useful.