librempeg/libswscale
Rémi Denis-Courmont 3312c631bb riscv: probe for Zbb extension at load time
Due to hysterical raisins, most RISC-V Linux distributions target a
RV64GC baseline excluding the Bit-manipulation ISA extensions, most
notably:
- Zba: address generation extension and
- Zbb: basic bit manipulation extension.
Most CPUs that would make sense to run FFmpeg on support Zba and Zbb
(including the current FATE runner), so it makes sense to optimise for
them. In fact a large chunk of existing assembler optimisations relies
on Zba and/or Zbb.

Since we cannot patch shared library code, the next best thing is to
carry a flag initialised at load time and check it whenever needed.
This adds an overhead of 3 instructions for an isolated use, e.g.:
1:  AUIPC rd, %pcrel_hi(ff_rv_zbb_supported)
    LBU   rd, %pcrel_lo(1b)(rd)
    BEQZ  rd, non_Zbb_fallback_code
    // Zbb code here

The C compiler will typically load the flag ahead of time to reduce
latency, and can also keep it around if Zbb is used multiple times in a
single optimisation scope. For this to work, the flag symbol must be
hidden; otherwise the optimisation degrades with a GOT look-up to
support interposition:
1:  AUIPC rd, GOT_OFFSET_HI
    LD    rd, GOT_OFFSET_LO(rd)
    LBU   rd, (rd)
    BEQZ  rd, non_Zbb_fallback_code
    // Zbb code here
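
For illustration only, the same pattern can be sketched in C. This is a
minimal sketch, not the code from this patch: the names
example_zbb_supported, example_probe_zbb and example_bswap32 are
hypothetical, the probe is a placeholder that always reports "not
supported", and __builtin_bswap32 merely stands in for the Zbb code path.
It assumes GCC or Clang for the visibility and constructor attributes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag. Hidden visibility rules out interposition, so the
 * compiler may address it PC-relatively instead of through the GOT. */
__attribute__((visibility("hidden"))) bool example_zbb_supported;

/* Hypothetical probe. A real one would ask the operating system; this
 * placeholder always reports that Zbb is unavailable. */
static bool example_probe_zbb(void)
{
    return false;
}

/* Runs once when the library or program is loaded. */
__attribute__((constructor)) static void example_init_zbb(void)
{
    example_zbb_supported = example_probe_zbb();
}

/* Portable fallback: byte-swap with shifts and masks. */
static uint32_t example_bswap32_generic(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

uint32_t example_bswap32(uint32_t x)
{
    if (example_zbb_supported)
        return __builtin_bswap32(x); /* stand-in for the Zbb path */
    return example_bswap32_generic(x); /* non-Zbb fallback */
}
```

Calling example_bswap32(0x11223344) yields 0x44332211 on either path; only
the instructions used to get there differ.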

This patch adds code to provision the flag in libraries using bit
manipulation functions from libavutil: byte-swap, bit-weight and
counting leading or trailing zeroes.
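
As a rough idea of what the non-Zbb fallback side of those functions can
look like, here is a sketch of portable 32-bit versions. The function
names are hypothetical and this is not the libavutil implementation. The
zero-input result of 32 is chosen to match what the Zbb CLZW and CTZW
instructions return.

```c
#include <stdint.h>

/* Bit-weight (population count) using parallel partial sums. */
static inline unsigned example_popcount32(uint32_t x)
{
    x -= (x >> 1) & 0x55555555u;                      /* 2-bit sums */
    x  = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
    x  = (x + (x >> 4)) & 0x0f0f0f0fu;                /* 8-bit sums */
    return (x * 0x01010101u) >> 24;                   /* add the bytes */
}

/* Count trailing zeroes; returns 32 when x is zero. */
static inline unsigned example_ctz32(uint32_t x)
{
    unsigned n = 0;

    if (!x)
        return 32;
    while (!(x & 1u)) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Count leading zeroes; returns 32 when x is zero. */
static inline unsigned example_clz32(uint32_t x)
{
    unsigned n = 0;

    if (!x)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}
```

The loops favour readability over speed; a production fallback would
usually use a table or a binary search instead.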

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-11 21:52:15 +02:00
aarch64 swscale/aarch64: Add rgb24 to yuv implementation 2024-06-11 16:59:43 +02:00
arm
loongarch swscale/la: Add following builtin optimized functions 2023-05-25 21:05:15 +02:00
ppc swscale/ppc/swscale_ppc_template: Reindent after the previous commit 2024-04-06 18:08:21 +02:00
riscv riscv: probe for Zbb extension at load time 2024-06-11 21:52:15 +02:00
tests swscale/tests/swscale: Add help text 2024-02-24 15:27:40 +01:00
x86 swscale/x86/rgb_2_rgb: add missing wrap to ff_uyvytoyuv422_avx2 2024-06-11 16:59:43 +02:00
alphablend.c
bayer_template.c
gamma.c avutil/common: Don't auto-include mem.h 2024-04-01 19:51:37 +02:00
half2float.c
hscale_fast_bilinear.c
hscale.c avutil/common: Don't auto-include mem.h 2024-04-01 19:51:37 +02:00
input.c swscale: add GBRAP14 format support 2023-09-28 19:37:58 +02:00
libswscale.v
log2_tab.c
Makefile
options.c all: use designated initializers for AVOption.unit 2024-02-24 15:27:39 +01:00
output.c swscale/output: Fix integer overflow in yuv2rgba64_full_1_c_template() 2024-05-06 19:53:15 +02:00
rgb2rgb_template.c
rgb2rgb.c sws/rgb2rgb: RISC-V V shuffle_bytes_xxxx functions 2022-09-30 07:24:09 +02:00
rgb2rgb.h sws/rgb2rgb: RISC-V V shuffle_bytes_xxxx functions 2022-09-30 07:24:09 +02:00
slice.c avutil/common: Don't auto-include mem.h 2024-04-01 19:51:37 +02:00
swscale_internal.h sws/input: R-V V rgb24ToY & bgr24ToY 2024-06-09 00:01:54 +02:00
swscale_unscaled.c swscale: add GBRAP14 format support 2023-09-28 19:37:58 +02:00
swscale.c sws/input: R-V V rgb24ToY & bgr24ToY 2024-06-09 00:01:54 +02:00
swscale.h swscale: document some missing arguments 2022-10-17 09:56:47 +02:00
swscaleres.rc
utils.c swscale/utils: Fix xInc overflow 2024-04-06 18:08:22 +02:00
version_major.h libs: bump major version for all libraries 2024-03-17 19:46:28 +01:00
version.c lib*/version: Use static_assert for static asserts 2024-04-01 19:50:54 +02:00
version.h Bump after 7.0 branch point 2024-03-27 19:21:50 +01:00
vscale.c avutil/common: Don't auto-include mem.h 2024-04-01 19:51:37 +02:00
yuv2rgb.c swscale/yuv2rgb: Use 64bit for brightness computation 2024-05-28 16:28:07 +02:00