[Ohrrpgce] SVN: teeemcee/12814 rasterizer: optimise Tex2DSampler/FPInt, which was killing GCC's optimis

subversion at HamsterRepublic.com
Mon Feb 21 00:09:13 PST 2022


teeemcee
2022-02-21 00:09:06 -0800 (Mon, 21 Feb 2022)
676
rasterizer: optimise Tex2DSampler/FPInt, which was killing GCC's optimiser

For me, this speeds up 8bit -> 32bit non-blended draws (of a backdrop) by ~10x
on x86_64 (GCC 10) and over 4x on x86, and gives a small 15% boost to clang too.

Speedup for blended blits about half that.

Surprisingly, converting to FPInts just for a couple lines inside the sampler is
much faster (up to 2x) than using floats there instead. Getting the floor of a
float is slower than I thought.
I also tried converting TexCoord in the IncTypes to use FPInt to eliminate the
float->FPInt conversions, but bizarrely that was 2-3x slower.
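
To illustrate the idea of converting to fixed point only around the floor
operation, here is a minimal sketch. It is not the code from this commit:
Fixed16, fromFloat, floorInt and texelIndex are hypothetical names, the 16.16
layout is an assumption (the real FPInt in wip/fpInt.hpp may differ), and the
coordinates are assumed to be in texel units with clamp addressing.

```cpp
#include <cstdint>

// Hypothetical 16.16 fixed-point value (an assumption for illustration;
// not the actual FPInt type from wip/fpInt.hpp).
struct Fixed16 {
    int32_t raw;

    // Scale by 2^16 and truncate toward zero. Values closer to zero than
    // 1/65536 become 0, so tiny negative inputs floor to 0 rather than -1.
    static Fixed16 fromFloat(float f) {
        return Fixed16{ static_cast<int32_t>(f * 65536.0f) };
    }

    // Floor as a single shift instead of a call to std::floor.
    // Right-shifting a negative signed value is arithmetic on GCC and clang
    // (and guaranteed so from C++20), which rounds toward negative infinity.
    int32_t floorInt() const { return raw >> 16; }
};

// Hypothetical sampler helper: u and v are assumed to be in texel units.
// Returns the index of the texel in a row-major width*height texture,
// clamping out-of-range coordinates to the edge.
inline int texelIndex(float u, float v, int width, int height) {
    int x = Fixed16::fromFloat(u).floorInt();
    int y = Fixed16::fromFloat(v).floorInt();
    if (x < 0) x = 0;
    if (x >= width) x = width - 1;
    if (y < 0) y = 0;
    if (y >= height) y = height - 1;
    return y * width + x;
}
```

The interpolated coordinates stay as floats here, and only the last step inside
the sampler goes through fixed point, which is the arrangement described above.
Whether it is actually faster will depend on compiler and target.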

Also adds some commented-out failed optimisations to Color scaling
---
U   wip/fpInt.hpp
U   wip/gfxRender.hpp
U   wip/rasterizer.cpp


