6291 Commits

Author SHA1 Message Date
Tomas Härdin
a1c96cbe0f lavu/intmath.h: Fix UB in ff_ctz_c() and ff_ctzll_c()
Found by value analysis

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-14 15:05:01 +02:00
Tomas Härdin
c717e525ba lavu/common.h: Fix UB in av_clip_uintp2_c()
Found by value analysis

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-14 15:05:01 +02:00
Tomas Härdin
90e108a2e9 lavu/common.h: Fix UB in av_clip_intp2_c()
Found by value analysis

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-14 15:05:00 +02:00
Tomas Härdin
ef6ea6e31a lavu/common.h: Fix UB in av_clipl_int32_c()
Found by value analysis

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-14 15:05:00 +02:00
James Almer
f23c47f7b6 avutil/common: assert that bit position in av_zero_extend is valid
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-14 15:05:00 +02:00
James Almer
c24d1b47c3 avutil: rename av_mod_uintp2 to av_zero_extend
It's more descriptive of what it does.
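
A minimal sketch of what "zero extend" means here, together with the kind of validity assertion the follow-up commit mentions. The name zero_extend_sketch and the bound p < 32 are assumptions made for illustration; this is not the libavutil source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for av_zero_extend(): keep the p low bits of a
 * and clear everything above them. The bound p < 32 is an assumption,
 * chosen so that the shift below is always defined for a 32-bit value. */
static inline uint32_t zero_extend_sketch(uint32_t a, unsigned p)
{
    assert(p < 32);
    return a & (((uint32_t)1 << p) - 1);
}
```

With p = 8, for instance, only the lowest byte of the input survives.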

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-14 15:05:00 +02:00
Rémi Denis-Courmont
10285e8875 lavu/bswap: remove some inline assembler
C code or compiler built-ins are preferable over inline assembler for
byte-swaps as it allows for better optimisations (e.g. instruction
scheduling) which would otherwise be impossible.

As with f64c2e710f for x86 and Arm,
this removes the inline assembler on GCC (and Clang) since we now
require recent enough compiler versions. This indeed seems to work on
AArch64, SuperH and, if Zbb is enabled, RISC-V. (AVR32 was not tested
since it has no known working compilers at this time.)
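
As an illustration of the trade-off described above, here is a sketch of a 32-bit byte swap written with a compiler built-in and a plain C fallback. The function names are hypothetical; __builtin_bswap32 is the GCC/Clang built-in, and the preprocessor guard is an assumption, not the check used in libavutil.

```c
#include <stdint.h>

/* Plain C fallback: move each byte to its mirrored position. */
static inline uint32_t bswap32_c_sketch(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* With the built-in, the compiler selects the instruction itself and is
 * free to schedule or fold it, which opaque inline assembler prevents. */
static inline uint32_t bswap32_sketch(uint32_t x)
{
#if defined(__GNUC__)
    return __builtin_bswap32(x);
#else
    return bswap32_c_sketch(x);
#endif
}
```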

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-13 21:50:47 +02:00
Rémi Denis-Courmont
b3583f545b lavu/x86: remove GCC 4.4- stuff
Since C11 support is required, those GCC versions can no longer be
supported anyhow. (Clang pretends to be GCC 4.4, but it looks like the
code was intended for old GCC specifically.)

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-13 21:50:47 +02:00
Rémi Denis-Courmont
e6bdfc1f0c lavu/arm: remove GCC 4.6- stuff
Since C11 support is required, those GCC versions can no longer be
supported anyhow. (Clang pretends to be GCC 4.4, but the removed code
does not seem to have been intended for Clang.)

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-13 21:50:47 +02:00
Haihao Xiang
da223773ff lavu/hwcontext_vulkan: Support write on drm frame
Otherwise nothing is written into the destination when a write mapping
is requested.

For example, when a vulkan frame mapped from a drm frame (which is wrapped
as a vaapi frame in the example) is used as the output of the scale_vulkan
filter, the result is always a green screen without this patch.

ffmpeg -init_hw_device vaapi=va -init_hw_device vulkan=vulkan@va
-filter_hw_device vulkan -f lavfi -i testsrc=size=352x288,format=nv12
-vf
"hwupload,scale_vulkan,hwmap=derive_device=vaapi:reverse=1,format=vaapi,hwdownload,format=nv12"
-f nut - | ffplay -

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-12 09:14:15 +02:00
Rémi Denis-Courmont
c33c1c8820 lavu/riscv: use Zbb CLZ/CTZ/CLZW/CTZW at run-time
          Zbb static    Zbb dynamic   I baseline
clz       0.668032642   1.336072283   19.552376803
clzl      0.668092643   1.336181786   26.110855571
ctz       1.336208533   3.340209702   26.054869008
ctzl      1.336247784   3.340362457   26.055266290
(seconds for 1 billion iterations on a SiFive-U74 core)

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-11 21:52:15 +02:00
Rémi Denis-Courmont
47381f35a8 lavu/riscv: use Zbb CPOP/CPOPW at run-time
          Zbb static    Zbb dynamic   I baseline
popcount  1.336129286   3.469067758   20.146362909
popcountl 1.336322291   3.340292968   20.224829821
(seconds for 1 billion iterations on a SiFive-U74 core)

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-11 21:52:15 +02:00
Rémi Denis-Courmont
934c2cd5fb lavu/riscv: use Zbb REV8 at run-time
This adds runtime support to use Zbb REV8 for 32- and 64-bit byte-wise
swaps. The result is about five times slower than if targeting Zbb
statically, but still a lot faster than the default bespoke C code or a
call to GCC run-time functions.

For 16-bit swap, this is however unsurprisingly a lot worse, and so this
sticks to the baseline. In fact, even using REV8 statically does not
seem to be beneficial in that case.

         Zbb static    Zbb dynamic   I baseline
bswap16:  0.668184765   3.340764069   0.668029012
bswap32:  0.668174014   3.340763319   9.353855435
bswap64:  0.668221765   3.340496313  14.698672283
(seconds for 1 billion iterations on a SiFive-U74 core)

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-11 21:52:15 +02:00
Rémi Denis-Courmont
3312c631bb riscv: probe for Zbb extension at load time
Due to hysterical raisins, most RISC-V Linux distributions target a
RV64GC baseline excluding the Bit-manipulation ISA extensions, most
notably:
- Zba: address generation extension and
- Zbb: basic bit manipulation extension.
Most CPUs that would make sense to run FFmpeg on support Zba and Zbb
(including the current FATE runner), so it makes sense to optimise for
them. In fact a large chunk of existing assembler optimisations relies
on Zba and/or Zbb.

Since we cannot patch shared library code, the next best thing is to
carry a flag initialised at load-time and check it on an as-needed basis.
This results in an overhead of 3 instructions on isolated use, e.g.:
1:  AUIPC rd, %pcrel_hi(ff_rv_zbb_supported)
    LBU   rd, %pcrel_lo(1b)(rd)
    BEQZ  rd, non_Zbb_fallback_code
    // Zbb code here

The C compiler will typically load the flag ahead of time to reduce
latency, and can also keep it around if Zbb is used multiple times in a
single optimisation scope. For this to work, the flag symbol must be
hidden; otherwise the optimisation degrades with a GOT look-up to
support interposition:
1:  AUIPC rd, GOT_OFFSET_HI
    LD    rd, GOT_OFFSET_LO(rd)
    LBU   rd, (rd)
    BEQZ  rd, non_Zbb_fallback_code
    // Zbb code here

This patch adds code to provision the flag in libraries using bit
manipulation functions from libavutil: byte-swap, bit-weight and
counting leading or trailing zeroes.
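
The flag check described above can be sketched in C roughly as follows. All names are hypothetical, the flag is set by hand rather than probed at load time, and __builtin_ctz merely stands in for a Zbb instruction, so this shows the shape of the dispatch only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag; in the scheme above it would be initialised once at
 * load time. Hidden visibility avoids the GOT look-up on ELF targets. */
#if defined(__GNUC__)
__attribute__((visibility("hidden")))
#endif
bool zbb_supported_sketch = false;

/* Baseline path: count trailing zeroes bit by bit. v must be non-zero. */
static int ctz_baseline_sketch(uint32_t v)
{
    int n = 0;
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Dispatch on the flag: one load and one branch before the fast path. */
static inline int ctz_sketch(uint32_t v)
{
#if defined(__GNUC__)
    if (zbb_supported_sketch)
        return __builtin_ctz(v); /* stand-in for the Zbb instruction */
#endif
    return ctz_baseline_sketch(v);
}
```

Both paths must agree on every non-zero input, so flipping the flag changes speed but not results.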

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-11 21:52:15 +02:00
Zhao Zhili
00c8f6ae03 avutil/timer: Add clock_gettime as a fallback of AV_READ_TIME
Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-11 16:59:43 +02:00
Zhao Zhili
2c0562679e avutil/aarch64: Skip define AV_READ_TIME for apple
It will fall back to mach_absolute_time inside libavutil/timer.h

Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-11 16:59:43 +02:00
James Almer
ab735c5efe x86/float_dsp: add SSE2 and AVX versions of scalarproduct_double
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-04 08:38:17 +02:00
Lynne
f90110848a lavu: bump minor and add APIchanges entries for the new channel positions
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-02 21:40:38 +02:00
Lynne
9572560805 channel_layout: add new channel positions supported by xHE-AAC
apichanges will be updated upon merging, as well as a version bump.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-02 21:40:36 +02:00
Rémi Denis-Courmont
191c20953b lavu/lls: R-V V update_lls
update_lls_8_c:        7.5
update_lls_8_rvv_f64:  4.2
update_lls_12_c:      14.5
update_lls_12_rvv_f64: 5.7

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-01 18:25:44 +02:00
James Almer
e0d08f3e6e avutil/float_dsp.h: fix doxy for scalarproduct_double
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-01 18:25:43 +02:00
James Almer
85e821ce1c avutil/float_dsp: revert accidental doxy removal
done by accident in 6a7c4d60a1498929c2a366f2ef4ccc35621a4358.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-06-01 18:25:43 +02:00
Rémi Denis-Courmont
0d80d74ed6 lavu/float_dsp: R-V V scalarproduct_double
C908:
scalarproduct_double_c:       39.2
scalarproduct_double_rvv_f64: 10.5

X60:
scalarproduct_double_c:       35.0
scalarproduct_double_rvv_f64:  5.2

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-31 21:53:43 +02:00
Rémi Denis-Courmont
0bcdaf5164 lavu/lls: use ff_scalarproduct_double_c()
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-31 21:53:43 +02:00
Rémi Denis-Courmont
a8304ef327 lavu/float_dsp: add double-precision scalar product
The function pointer is appended to the structure for backward binary
compatibility. Fortunately, this is allocated by libavutil, not by the
user, so increasing the structure size is safe.
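
The compatibility argument can be illustrated with a small sketch. The structure, field and function names below are hypothetical and much simpler than the real float DSP context; the point is only that a member appended at the end leaves existing offsets unchanged, and that allocation inside the library hides the size change from callers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical context. The first member is the "old" layout. */
typedef struct DSPSketch {
    float  (*dot_float)(const float *a, const float *b, size_t len);
    /* Appended later: the offset of dot_float is unaffected. */
    double (*dot_double)(const double *a, const double *b, size_t len);
} DSPSketch;

static float dot_float_c(const float *a, const float *b, size_t len)
{
    float sum = 0.0f;
    for (size_t i = 0; i < len; i++)
        sum += a[i] * b[i];
    return sum;
}

static double dot_double_c(const double *a, const double *b, size_t len)
{
    double sum = 0.0;
    for (size_t i = 0; i < len; i++)
        sum += a[i] * b[i];
    return sum;
}

/* The library allocates the structure, so callers never depend on its
 * size and an old binary keeps working against the grown layout. */
DSPSketch *dsp_sketch_alloc(void)
{
    DSPSketch *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;
    ctx->dot_float  = dot_float_c;
    ctx->dot_double = dot_double_c;
    return ctx;
}
```

Had callers been allowed to declare the structure on their own stack, growing it would have broken them, which is why the allocation site matters.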

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-31 21:53:42 +02:00
Rémi Denis-Courmont
78cbaf076e riscv: allow passing addend to vtype_vli macro
A constant (-1) is added to the length value, so we can have an addend
for free, and optimise the addition away if the addend is exactly 1.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-31 10:09:00 +02:00
Michael Niedermayer
a4e522f3f6 avutil/tests/opt: Check av_set_options_string() for failure
This is test code after all so it should test things

Fixes: CID1518990 Unchecked return value

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-28 16:28:07 +02:00
Michael Niedermayer
94bcba18ad avutil/tests/dict: Check av_dict_set() before get for failure
Failure is possible due to strdup()

Fixes: CID1516764 Dereference null return value

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-28 16:28:07 +02:00
Michael Niedermayer
8a92c1087c avutil/random_seed: Avoid dead returns
Fixes: CID1538296 Structurally dead code

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-28 16:28:07 +02:00
Michael Niedermayer
3523a22b91 qsv: Initialize impl_value
Fixes: The warnings from CID1598553 Uninitialized scalar variable

Passing partly initialized structs is ugly and asks for hard-to-reproduce bugs.
The uninitialized fields were not used.

Reviewed-by: "Xiang, Haihao" <haihao.xiang-at-intel.com@ffmpeg.org>
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-28 16:28:07 +02:00
James Almer
5efc42b5b2 avutil/channel_layout: add a helper function to get the ambisonic order of a layout
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-24 22:47:41 +02:00
oltolm
045c3e65de avutil/hwcontext_qsv: fix GCC 14.1 warnings
Tested-by: Tong Wu <tong1.wu@intel.com>
Signed-off-by: oltolm <oleg.tolmatcev@gmail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-23 22:02:58 +02:00
Haihao Xiang
e54d367547 lavu/hwcontext_qsv: add support for dynamic frame pool in qsv_map_to
Make it work with the source which has a dynamic frame pool.

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-21 00:02:15 +02:00
Haihao Xiang
cc5e4a2fef lavu/hwcontext_qsv: add support for dynamic frame pool in qsv_frames_derive_to
Make it work with the source which has a dynamic frame pool.

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-21 00:02:14 +02:00
Haihao Xiang
6c572e9fda lavu/hwcontext_qsv: create dynamic frame pool if required
When AVHWFramesContext.initial_pool_size is 0, a dynamic frame pool is
required. We may support this under certain conditions, e.g. oneVPL 2.9+
supports dynamic frame allocation, so we needn't provide a fixed frame pool
in the mfxFrameAllocator.Alloc callback.

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-21 00:02:14 +02:00
Haihao Xiang
acfb3545e0 lavu/hwcontext_qsv: update AVQSVFramesContext to support dynamic frame pool
Add AVQSVFramesContext.info and update the description.

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-21 00:02:14 +02:00
Haihao Xiang
2243ca5a42 lavu/version: fix minor version
The latest version should be 59.18.100 since commit 01c5f4ad

$ git diff 01c5f4ad~1..HEAD doc/APIchanges
...
+2024-05-10 - xxxxxxxxx - lavu 59.18.100 - cpu.h
+  Add AV_CPU_FLAG_RV_ZVBB.
+

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-21 00:02:14 +02:00
Rémi Denis-Courmont
cb16d687c3 lavu/riscv: add assembler macros for adjusting vector LMUL
vtype_vli computes the VTYPE value with the optimal LMUL for a given
element width, tail and mask policies and a run-time vector length.

vtype_ivli does the same, but with the compile-time constant vector
length.

vwtypei and vntypei can be used to widen or narrow a VTYPE value for
use in mixed-width vector-optimised functions.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-19 22:36:24 +02:00
Brad Smith
9b0e097abd avutil/ppc/cpu: Also use the machdep.altivec sysctl on NetBSD
Use the machdep.altivec sysctl on NetBSD for AltiVec detection
as is done with OpenBSD.

Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-18 19:59:12 +02:00
Rémi Denis-Courmont
77ca08983c lavu/riscv: fix parsing the unaligned access capability
Pointed-out-by: Stefan O'Rear <sorear@fastmail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-18 19:59:10 +02:00
Rémi Denis-Courmont
45f49ade44 lavu/riscv: remove bogus B extension
The B Bit manipulation extension was not defined to this day, and
probably never will be. Instead it was broken down into Zba, Zbb, Zbc and
Zbs with no particular blessed set to make up B.

This removes the bogus field test. Linux never set this bit, nor
(AFAICT) did FreeBSD or any other OS. We can always add it back in the
unlikely event that it gets taken into use.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-14 22:49:49 +02:00
Rémi Denis-Courmont
8382c30c65 lavu/riscv: CPU flag for fast misaligned accesses
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-14 22:49:48 +02:00
Rémi Denis-Courmont
95c222ac6d lavu/riscv: fallback to raw hwprobe() system call
Not all C run-times support this, and even then, it will be a while
before distributions provide recent enough versions thereof.

Since this is a trivial system call wrapper, we might just as well call
the corresponding kernel system call directly where the C run-time lacks
support but the kernel headers are new enough (as is the case on Debian
Unstable at the time of writing). In doing so, we need to add a few more
guards as the first suitable kernel (headers) release did not expose the
V, Zba and Zbb extensions.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-14 22:49:48 +02:00
Rémi Denis-Courmont
db547b9d6c lavu/riscv: add ff_rv_vlen_least()
This inline function checks that the vector length is at least a given
value. With this, most run-time VLEN checks can be optimised away.
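
One way such a check could let the compiler drop run-time tests is sketched below. This is an assumption about the mechanism, not the actual ff_rv_vlen_least() code: the macro MIN_VLEN_BITS_SKETCH stands for a vector length guaranteed at compile time, and runtime_vlen_bits_sketch() is a placeholder for reading the real vector length.

```c
#include <stdbool.h>

/* Hypothetical: vector length in bits guaranteed by the build target. */
#ifndef MIN_VLEN_BITS_SKETCH
#define MIN_VLEN_BITS_SKETCH 128
#endif

/* Placeholder for a run-time query of the vector length in bits. */
static unsigned runtime_vlen_bits_sketch(void)
{
    return 256;
}

/* When bits is a constant no larger than the guaranteed minimum, the
 * first test is decided at compile time and the query disappears. */
static inline bool vlen_least_sketch(unsigned bits)
{
    if (bits <= MIN_VLEN_BITS_SKETCH)
        return true;
    return bits <= runtime_vlen_bits_sketch();
}
```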

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-13 23:44:15 +02:00
Michael Niedermayer
d8f841658a avutil/tests/base64: Check with too short output array
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-13 10:54:08 +02:00
Michael Niedermayer
77d1200880 libavutil/base64: Try not to write over the array end
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-13 10:54:08 +02:00
Rémi Denis-Courmont
0f5357fd85 lavu/riscv: add Zvbb CPU capability detection
This requires Linux kernel version 6.8 or later.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-11 18:22:55 +02:00
Rémi Denis-Courmont
3bc242e279 riscv: add Zvbb vector bit manipulation extension
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-11 18:22:55 +02:00
Rémi Denis-Courmont
cb5ae4420b lavu/riscv: remove bespoke assembler for MIN
This is no longer necessary as Zbb is now always explicitly required.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-11 18:22:53 +02:00
Rémi Denis-Courmont
e2f2fce2ae lavu/riscv: allow requesting a second extension
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-05-11 18:22:53 +02:00