sortfast -- a laboratory to explore sorting performance

https://cantrip.org/sortfast.html

Examples:

    make all
    make CC=g++-9 INDEXED
    make CC=clang++-9 ANDANDOR
    make CC=clang++-9 INDEXED_PESSIMAL_ON_GCC
    make CC=g++-9 CHECK=CHECK BOG

Same input, same algorithm, varying performance:

| Variant | Compiler | Haswell | SkylakeX | Zen2 (`-march=znver2`) \* |
|---|---|---|---|---|
| STD | G++ | 1 (10.2s) | 1 (7.3s) | 1 (9.0s) |
| | Clang | 1 (10.0s) | 1 (7.3s) | 1 (8.5s) |
| BOG | G++ | 1.1 (11.5s) | 1.1 (8.2s) | 1.1 (10.3s) |
| | Clang | 1.1 (11.3s) | 1.1 (7.9s) | 1.1 (9.4s) |
| &&\| | G++ | .56 (5.7s) | .61 (4.5s) | .82 (7.4s) |
| | Clang | .49 (4.9s) | .47 (3.4s) | .75 (6.4s) ! |
| IX | G++ | .79 (8.1s) | .89 (6.5s) | .94 (8.5s) |
| | Clang | .66 (6.6s) | .63 (4.6s) | .87 (7.4s) |
| IX-P | G++ | 1.2 (12.0s) | 1.2 (8.4s) | 2.7 (24s) ! |
| | Clang | .64 (6.4s) | .60 (4.4s) | .77 (7.4s) |
| ROT | G++ | .72 (7.3s) | .70 (5.1s) | .73 (6.6s) |
| | Clang | .68 (6.8s) | .66 (4.8s) | 1.2 (9.9s) |
| CMOV | G++ | 1.1 (11.3s) | 1.1 (8.0s) | .82 (7.4s) |
| | Clang | .49 (4.9s) | .47 (3.4s) | .74 (6.3s) ! |
| IX1 | G++ | 1.6 (15.9s) | 1.6 (11.8s) | .82 (7.4s) ! |
| | Clang | .61 (6.1s) | .62 (4.5s) | .73 (6.6s) |
| IX2 | G++ | 1.1 (11.2s) | 1.1 (8.2s) | 2.1 (19s) ! |
| | Clang | .69 (6.9s) | .70 (5.1s) | 2.2 (19s) ! |

Note that compiler tunings for Zen2 are very immature, resulting in severely suboptimal code generation in several cases seen here.
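The leading figure in each cell appears to be the variant's runtime divided by the STD runtime for the same compiler and machine. This is an inference from the numbers themselves, not something the README states. A minimal sketch of that arithmetic follows; the function names `relative_runtime` and `round_2sig` are hypothetical and are not part of sortfast.

```cpp
#include <cmath>

// Assumption: each ratio is variant time / STD time, taken for the same
// compiler on the same machine. Values below 1 mean faster than STD.
double relative_runtime(double variant_seconds, double std_seconds) {
    return variant_seconds / std_seconds;
}

// Round to two significant digits, roughly the precision used in the tables.
double round_2sig(double x) {
    if (x == 0.0) return 0.0;
    // Decimal exponent of the leading digit, e.g. 0 for 1.127, -1 for 0.559.
    int exponent = static_cast<int>(std::floor(std::log10(std::fabs(x))));
    double scale = std::pow(10.0, 1 - exponent);
    return std::round(x * scale) / scale;
}
```

For example, `round_2sig(relative_runtime(5.7, 10.2))` gives 0.56, which matches the Haswell G++ entry for `&&|`, and `round_2sig(relative_runtime(24.0, 9.0))` gives 2.7, which matches the Zen2 G++ entry for IX-P.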

Radix sorts, for reference

| Variant | Compiler | Haswell | SkylakeX | Zen2 (`-march=znver2`) \* |
|---|---|---|---|---|
| RX256 | G++ | .13 (1.3s) | .14 (1.0s) | .20 (1.8s) ! |
| | Clang | .12 (1.2s) | .12 (0.9s) | .21 (1.8s) ! |
| RX2 | G++ | .66 (6.7s) | .62 (4.5s) | 1.4 (12.7s) |
| | Clang | .71 (7.1s) | .66 (4.8s) | 1.4 (11.7s) |