Compositor-related performance improvements #3

vancluever · 2024-03-29T17:23:08Z

#2 resulted in (expected) performance losses for normal draw operations (e.g., Path.fill() and Path.stroke()), due to the extra composition operations now involved in performing the draw. Rather than painting pixel-by-pixel (and only those pixels), we now have to paint the mask (functionally equivalent to the old, aforementioned, non-composited paint), downsample the mask, and then composite that mask twice: once to a temporary source surface, and then again to the canvas/destination surface.

I have some ideas I want to explore to try and help improve performance here:

SIMD Processing

This feels like "low-hanging fruit", but I think there's some nuance in how the implementation needs to be done. Rudimentary updates to allow the composition operations to use vectors (see 18df15d) did not yield any performance increase (measured very naively by timing a run of the spec tests). I'm guessing this is due to the number of conversions necessary to actually make this happen with our current memory layout in a surface. We don't want to lose our type system here, so I'm wondering if MultiArrayList could help: hopefully, it would allow us to process entire strides of pixel memory with SIMD operations, at the very least at the channel level.
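To make the MultiArrayList idea concrete, here is a minimal sketch of what channel-level SIMD over a structure-of-arrays layout could look like. It is not z2d code: the RGBA struct, the scaleChannel helper, and the 16-lane width are all hypothetical names and values chosen for illustration, and it assumes Zig 0.11 or later builtins (initialization syntax for MultiArrayList may differ between releases).

```zig
const std = @import("std");

// Hypothetical pixel type for illustration only; not z2d's real definition.
const RGBA = struct { r: u8, g: u8, b: u8, a: u8 };

const lanes = 16; // assumed vector width; tune per target
const Vec8 = @Vector(lanes, u8);
const Vec16 = @Vector(lanes, u16);

/// Scales every value in one channel by an 8-bit coverage value,
/// computing (c * coverage + 127) / 255 per element.
fn scaleChannel(channel: []u8, coverage: u8) void {
    const cov: Vec16 = @splat(@as(u16, coverage));
    const half: Vec16 = @splat(127);
    const max: Vec16 = @splat(255);

    var i: usize = 0;
    // Vector path: the channel is already contiguous, so no per-pixel
    // unpacking is needed before loading a full vector.
    while (i + lanes <= channel.len) : (i += lanes) {
        const src: Vec8 = channel[i..][0..lanes].*;
        const wide: Vec16 = @intCast(src);
        const out: Vec8 = @intCast((wide * cov + half) / max);
        channel[i..][0..lanes].* = out;
    }
    // Scalar tail for the remaining elements, using the same formula.
    while (i < channel.len) : (i += 1) {
        channel[i] = @intCast((@as(u16, channel[i]) * coverage + 127) / 255);
    }
}

test "scale one channel of a structure-of-arrays pixel buffer" {
    const allocator = std.testing.allocator;
    var pixels: std.MultiArrayList(RGBA) = .{};
    defer pixels.deinit(allocator);

    // 20 pixels exercises both the vector path and the scalar tail.
    for (0..20) |_| {
        try pixels.append(allocator, .{ .r = 255, .g = 100, .b = 0, .a = 255 });
    }

    // items(.r) is a plain []u8 covering only the red channel.
    scaleChannel(pixels.items(.r), 128);

    for (pixels.items(.r)) |r| try std.testing.expectEqual(@as(u8, 128), r);
    for (pixels.items(.g)) |g| try std.testing.expectEqual(@as(u8, 100), g);
}
```

The point of the layout is that the pixel type stays a typed struct while each channel becomes a contiguous slice, so the vector loop avoids the per-pixel conversions suspected above. Whether this actually beats the current layout would still need measuring.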

Other Ideas (Memory, etc)

These are just off-the-cuff ideas for other possible improvements:

Tweaks of where we alloc/de-alloc memory?
Improvements to our super-sampling algorithm?

One thing I did notice: adding an ArenaAllocator to paintComposite immediately got us a nearly 40% performance increase (the real figure is possibly higher, given that my rudimentary benchmark included all tests, not just the ones using AA). I will be submitting this fix shortly.
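For readers unfamiliar with the pattern, the following is a minimal sketch of wrapping a routine's scratch allocations in std.heap.ArenaAllocator. It is not the actual paintComposite change: downsampledCoverage, the fixed 4x scale, and the fully covered mask are hypothetical stand-ins, and the code assumes Zig 0.11 or later.

```zig
const std = @import("std");

const scale = 4; // supersampling factor assumed for this sketch

/// Hypothetical stand-in for a compositing step that needs several
/// short-lived buffers. Returns the first downsampled coverage value.
fn downsampledCoverage(backing: std.mem.Allocator, width: usize, height: usize) !u8 {
    // Every allocation below comes from the arena and is released by the
    // single deinit call, instead of one free per buffer.
    var arena = std.heap.ArenaAllocator.init(backing);
    defer arena.deinit();
    const alloc = arena.allocator();

    // Supersampled mask: scale x scale samples per output pixel.
    const mask = try alloc.alloc(u8, width * scale * height * scale);
    @memset(mask, 255);

    // Downsampled mask: average each scale x scale block.
    const small = try alloc.alloc(u8, width * height);
    for (small, 0..) |*px, idx| {
        const x0 = (idx % width) * scale;
        const y0 = (idx / width) * scale;
        var sum: u32 = 0;
        for (0..scale) |dy| {
            for (0..scale) |dx| {
                sum += mask[(y0 + dy) * width * scale + x0 + dx];
            }
        }
        px.* = @intCast(sum / (scale * scale));
    }
    return small[0];
}

test "arena frees all scratch buffers at once" {
    // std.testing.allocator reports leaks, so this also checks that
    // arena.deinit() released both buffers.
    const got = try downsampledCoverage(std.testing.allocator, 2, 2);
    try std.testing.expectEqual(@as(u8, 255), got);
}
```

The likely benefit in a hot path is fewer round trips to the backing allocator, since the arena hands out memory from larger blocks and frees them in one step. How much of the measured gain comes from that is not something this sketch can show.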

vancluever · 2024-03-29T17:50:18Z

ArenaAllocator for paintComposite now in with #4.

vancluever · 2024-04-23T21:31:23Z

Pasting some v0.1.0 benchmark results for rendering of 017_stroke_star_round.zig. We are definitely rendering worse than Cairo, but that's to be expected, as there is lots of room in the compositor for optimization. Cairo is also using pixman under the hood, which is purpose-built for this kind of thing (and additionally is SIMD-optimized).

That the performance lag is compositor-related is pretty evident from the good performance on the non-AA z2d test (Cairo, as far as I know, builds a mask even when not using AA, whereas we don't, for the time being anyway).

vancluever · 2024-08-29T18:48:09Z

Removing the milestone in favor of a longer-term burn on this one. There are other things that I think I want to prioritize first (e.g., missing features, particularly in support of some degree of a complete SVG feature set).

vancluever changed the title ~~Compositor-related performance increases~~ Compositor-related performance improvements Mar 29, 2024

vancluever mentioned this issue Mar 29, 2024

v0.1.0 pre-release push #5

Closed

9 tasks

vancluever added this to the v0.2.0 milestone Apr 24, 2024

vancluever removed this from the v0.2.0 milestone Aug 29, 2024

vancluever added the enhancement New feature or request label Aug 29, 2024
