librempeg

mirror of https://github.com/librempeg/librempeg synced 2024-11-22 00:51:37 +00:00

Author	SHA1	Message	Date
Ramiro Polla	f3837d7e21	checkasm/sw_range_convert: indent after previous couple of commits Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-10-31 08:30:24 +01:00
Ramiro Polla	d1cf450895	checkasm/sw_range_convert: test all supported bit depths This commit also reduces the number of times ff_sws_init_scale() gets called (only once per bit depth), and the number of times randomize_buffers() gets called (only if the function must be checked). Benchmarks are only performed on bit depths 8 and 16 (since they are different functions, and not only different constants). Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-10-31 08:30:24 +01:00
Ramiro Polla	e916b70b15	checkasm/sw_range_convert: only run benchmarks on largest input width Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-10-31 08:30:24 +01:00
Ramiro Polla	e912eeba81	checkasm/sw_range_convert: reduce number of input sizes tested Reduce input sizes to 8 (to test that the function works with widths smaller than the vector length) and 1920 (raising the largest input size to improve benchmark results). Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-10-31 08:30:24 +01:00
Ramiro Polla	37f0cd8d05	checkasm/sw_range_convert: use YUV pixel formats instead of YUVJ We are already setting the range, so we can use regular YUV pixel formats instead of YUVJ. Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-10-31 08:30:23 +01:00
Ramiro Polla	1113b2c658	checkasm: use FF_ARRAY_ELEMS instead of hardcoding size of arrays Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-10-31 08:30:23 +01:00
Niklas Haas	7d8a5e3aee	swscale: rename SwsContext to SwsInternal And preserve the public SwsContext as separate name. The motivation here is that I want to turn SwsContext into a public struct, while keeping the internal implementation hidden. Additionally, I also want to be able to use multiple internal implementations, e.g. for GPU devices. This commit does not include any functional changes. For the most part, it is a simple rename. The only complications arise from the public facing API functions, which preserve their current type (and hence require an additional unwrapping step internally), and the checkasm test framework, which directly accesses SwsInternal. For consistency, the affected functions that need to maintain a distinction have generally been changed to refer to the SwsContext as sws, and the SwsInternal as c. In an upcoming commit, I will provide a backing definition for the public SwsContext, and update `sws_internal()` to dereference the internal struct instead of merely casting it. Sponsored-by: Sovereign Tech Fund Signed-off-by: Niklas Haas <git@haasn.dev> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-10-26 09:25:17 +02:00
James Almer	6127e1611e	tests/checkasm/sw_rgb: don't write random data past the end of the buffer Should fix fate-checkasm-sw_rgb under gcc-ubsan. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-10-18 10:04:11 +02:00
Martin Storsjö	5e84f3d5dd	checkasm: lls: Use relative tolerances rather than absolute ones Depending on the magnitude of the output values, the potential errors can be larger. This fixes errors in the lls tests on x86_32 for some seeds, observed with GCC 11 (on Ubuntu 22.04, with the distro compiler, with -m32). Signed-off-by: Martin Storsjö <martin@martin.st> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-10-10 13:35:27 +02:00
Martin Storsjö	e5a14ae4e6	checkasm: Print the SVE vector length at startup Signed-off-by: Martin Storsjö <martin@martin.st> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-09-28 18:06:25 +02:00
Martin Storsjö	eadd7fbb05	aarch64: Add CPU feature flags for SVE and SVE2 Add code for detecting the feature on Linux and Windows. Signed-off-by: Martin Storsjö <martin@martin.st> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-09-28 18:06:25 +02:00
Martin Storsjö	6c0404e6b4	checkasm/sw_rgb: Revert test additions from e18b46d95fadcbaaf450bda9f1871849f2b0c586 The unaligned width test cases fail on i386; we have an assembly function of rgb24toyv12 which is enabled only within "#if ARCH_X86_32 && HAVE_7REGS", which seems to fail these new test cases for unaligned widths. As that assembly function has existed for a long time in that form, the issue probably isn't very recent, thus skip testing these cases for now. Once the assembly function has been fixed, these test cases can be readded. Signed-off-by: Martin Storsjö <martin@martin.st> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-09-26 21:45:11 +02:00
Zhao Zhili	95e3e4cdb2	swscale/aarch64: Fix rgb24toyv12 only works with aligned width Since c0666d8b, rgb24toyv12 is broken for width non-aligned to 16. Add a simple wrapper to handle the non-aligned part. Co-authored-by: johzzy <hellojinqiang@gmail.com> Signed-off-by: Zhao Zhili <zhilizhao@tencent.com> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-09-25 21:37:11 +02:00
Ramiro Polla	b564d62366	checkasm/sw_rgb: add rgb24toyv12 tests Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-09-08 20:51:02 +02:00
Ramiro Polla	50df07f149	checkasm/sw_rgb: add deinterleaveBytes Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-09-08 20:51:02 +02:00
James Almer	4ab512f6e0	fate/checkasm/sw_gbrp: don't randomly set internal values They are set by sws_init_context(). May help with signed integer overflows reported by gcc-ubsan. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-09-08 20:50:58 +02:00
Rémi Denis-Courmont	2d03f3af32	checkasm/riscv: print official extension names Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-09-08 20:50:56 +02:00
Anton Khirnov	4d6e1e09dd	lavc/opus*: move to opus/ subdir Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-09-03 10:22:59 +02:00
Ramiro Polla	81a3528647	avcodec/mpegvideoencdsp: convert stride parameters from int to ptrdiff_t Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-09-03 10:22:58 +02:00
Nuo Mi	f39ee4f57c	checkasm: add vvc_bdof test apply_bdof_8_8x16_c: 5776.5 apply_bdof_8_8x16_avx2: 396.2 apply_bdof_8_16x8_c: 5722.0 apply_bdof_8_16x8_avx2: 216.0 apply_bdof_8_16x16_c: 11213.2 apply_bdof_8_16x16_avx2: 434.5 apply_bdof_10_8x16_c: 5657.7 apply_bdof_10_8x16_avx2: 1096.0 apply_bdof_10_16x8_c: 5531.7 apply_bdof_10_16x8_avx2: 212.5 apply_bdof_10_16x16_c: 11043.7 apply_bdof_10_16x16_avx2: 1252.7 apply_bdof_12_8x16_c: 5680.0 apply_bdof_12_8x16_avx2: 1096.5 apply_bdof_12_16x8_c: 5646.2 apply_bdof_12_16x8_avx2: 624.5 apply_bdof_12_16x16_c: 11076.0 apply_bdof_12_16x16_avx2: 1241.5 Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-09-03 10:22:55 +02:00
J. Dekker	ea29a27b56	checkasm: add wildcompares for test & functions Added: --test=<pattern> Filter tests by glob style pattern. --bench[=<pattern>] Run benchmark and optionally filter functions by glob style pattern. Example: $ ./tests/checkasm/checkasm --bench=yuva* [...] yuva420p_bgr24_8_c: 34.5 ( 1.00x) yuva420p_bgr24_8_ssse3: 31.1 ( 1.11x) yuva420p_bgr24_128_c: 310.6 ( 1.00x) yuva420p_bgr24_128_ssse3: 178.1 ( 1.74x) yuva420p_bgr24_1080_c: 2509.6 ( 1.00x) yuva420p_bgr24_1080_ssse3: 1471.5 ( 1.71x) yuva420p_bgr24_1920_c: 4462.6 ( 1.00x) yuva420p_bgr24_1920_ssse3: 2331.1 ( 1.91x) [...] Ported from dav1d. Signed-off-by: J. Dekker <jdek@itanimul.li> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-08-30 18:02:54 +02:00
J. Dekker	550c48eac8	checkasm: improve print format Port dav1d's checkasm output format to FFmpeg's checkasm, includes relative speedups and aligns results. Signed-off-by: J. Dekker <jdek@itanimul.li> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-08-30 18:02:54 +02:00
J. Dekker	045f9e52ca	checkasm: print only results to stdout Signed-off-by: J. Dekker <jdek@itanimul.li> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-08-30 18:02:53 +02:00
J. Dekker	859462e8b7	checkasm: add csv/tsv bench output When collecting performance information from checkasm it is common to parse the output for use in graphs to compare vs different architectures. Signed-off-by: J. Dekker <jdek@itanimul.li> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-08-30 18:02:53 +02:00
Ramiro Polla	59fb24fa79	checkasm/mpegvideoencdsp: add pix_sum, pix_norm1, and draw_edges Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-08-30 18:02:52 +02:00
Ramiro Polla	b263720204	checkasm/yuv2yuv: add tests for semiplanar unscaled converters Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-08-30 18:02:50 +02:00
Ramiro Polla	4b87fd8a49	swscale/yuv2rgb: add yuv42{0,2}p -> gbrp unscaled colorspace converters Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-08-30 18:02:41 +02:00
Nuo Mi	b6b22d5f56	checkasm: add tests for vvc dmvr dmvr_8_12x20_c: 186.2 dmvr_8_12x20_avx2: 25.7 dmvr_8_20x12_c: 181.7 dmvr_8_20x12_avx2: 25.2 dmvr_8_20x20_c: 283.2 dmvr_8_20x20_avx2: 32.0 dmvr_10_12x20_c: 90.0 dmvr_10_12x20_avx2: 15.7 dmvr_10_20x12_c: 41.0 dmvr_10_20x12_avx2: 14.7 dmvr_10_20x20_c: 81.5 dmvr_10_20x20_avx2: 26.7 dmvr_12_12x20_c: 190.7 dmvr_12_12x20_avx2: 20.2 dmvr_12_20x12_c: 187.2 dmvr_12_20x12_avx2: 20.2 dmvr_12_20x20_c: 292.7 dmvr_12_20x20_avx2: 27.2 dmvr_h_8_12x20_c: 317.0 dmvr_h_8_12x20_avx2: 37.0 dmvr_h_8_20x12_c: 340.0 dmvr_h_8_20x12_avx2: 41.0 dmvr_h_8_20x20_c: 540.7 dmvr_h_8_20x20_avx2: 64.0 dmvr_h_10_12x20_c: 322.7 dmvr_h_10_12x20_avx2: 30.7 dmvr_h_10_20x12_c: 344.2 dmvr_h_10_20x12_avx2: 34.0 dmvr_h_10_20x20_c: 529.0 dmvr_h_10_20x20_avx2: 51.5 dmvr_h_12_12x20_c: 326.7 dmvr_h_12_12x20_avx2: 33.5 dmvr_h_12_20x12_c: 331.7 dmvr_h_12_20x12_avx2: 51.2 dmvr_h_12_20x20_c: 534.0 dmvr_h_12_20x20_avx2: 62.7 dmvr_hv_8_12x20_c: 650.0 dmvr_hv_8_12x20_avx2: 57.2 dmvr_hv_8_20x12_c: 676.2 dmvr_hv_8_20x12_avx2: 70.0 dmvr_hv_8_20x20_c: 1068.5 dmvr_hv_8_20x20_avx2: 103.2 dmvr_hv_10_12x20_c: 649.0 dmvr_hv_10_12x20_avx2: 48.2 dmvr_hv_10_20x12_c: 677.7 dmvr_hv_10_20x12_avx2: 59.7 dmvr_hv_10_20x20_c: 1093.5 dmvr_hv_10_20x20_avx2: 91.7 dmvr_hv_12_12x20_c: 660.0 dmvr_hv_12_12x20_avx2: 58.7 dmvr_hv_12_20x12_c: 682.7 dmvr_hv_12_20x12_avx2: 72.0 dmvr_hv_12_20x20_c: 1094.0 dmvr_hv_12_20x20_avx2: 113.2 dmvr_v_8_12x20_c: 325.7 dmvr_v_8_12x20_avx2: 31.2 dmvr_v_8_20x12_c: 326.2 dmvr_v_8_20x12_avx2: 38.5 dmvr_v_8_20x20_c: 538.5 dmvr_v_8_20x20_avx2: 54.2 dmvr_v_10_12x20_c: 318.5 dmvr_v_10_12x20_avx2: 23.7 dmvr_v_10_20x12_c: 330.7 dmvr_v_10_20x12_avx2: 40.5 dmvr_v_10_20x20_c: 567.5 dmvr_v_10_20x20_avx2: 48.0 dmvr_v_12_12x20_c: 335.2 dmvr_v_12_12x20_avx2: 30.0 dmvr_v_12_20x12_c: 330.2 dmvr_v_12_20x12_avx2: 39.5 dmvr_v_12_20x20_c: 535.2 dmvr_v_12_20x20_avx2: 60.0 Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-08-15 18:17:55 +02:00
Rémi Denis-Courmont	32d04f137a	lavu/riscv: drop probing for zba CPU capability Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-08-15 10:48:30 +02:00
Rémi Denis-Courmont	ab10f641ec	lavc/riscv: drop probing for F & D extensions F and D extensions are included in all RISC-V application profiles ever made (so starting from RV64GC a.k.a. RVA20). Realistically they need to be selected at compilation time. Currently, there are no consumers for these two flags. If there is ever a need to reintroduce F- or D-specific optimisations, we can always use __riscv_f or __riscv_d compiler predefined macros respectively. Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-08-04 17:29:13 +02:00
Rémi Denis-Courmont	04a777d62b	checkasm/riscv: preserve T1 whilst calling... This preserves T1 whilst calling the instrumented function. In a Sci-Fi setting where type-based Control Flow Integrity (CFI) is supported, the calling code (i.e., the `checkasm` test case) will set T1 to the expected value of the landing pad label (LPL) of the instrumented function. The call wrapper will always use LPL zero which is a wild card. We should preserve the value of T1 at least until the indirect call to the instrumented function. Of course this is Sci-Fi, because: 1) there is no hardware (or even QEMU) support yet, 2) all our assembler functions currently use LPL zero anyway. This uses T3 rather than T2 because indirect branches with T2 is reserved for notionally direct calls made with an indirect call instruction (e.g. due to GOT indirection), and are exempted from forward-edge CFI checks. Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-08-04 17:29:12 +02:00
Rémi Denis-Courmont	5774d62dfc	checkasm/riscv: align the landing pads Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-07-31 17:43:39 +02:00
Rémi Denis-Courmont	8bd0d30c02	checkasm/riscv: add forward-edge CFI landing pads Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-07-31 17:43:39 +02:00
Rémi Denis-Courmont	a113208ab0	lavu/riscv: add CPU flag for B bit manipulations The B extension was finally ratified in May 2024, encompassing: - Zba (addresses), - Zbb (basics) and - Zbs (single bits). It does not include Zbc (base-2 polynomials). Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-07-31 17:43:37 +02:00
Martin Storsjö	df2787881f	checkasm: Increase the tolerance for ac3_sum_square_butterfly_float Increase the tolerance from 10 ulp to 11 ulp. This fixes occasional errors for some inputs; the errors could be reproduced on aarch64/neon builds, with "checkasm --test=ac3dsp 3446175925". Signed-off-by: Martin Storsjö <martin@martin.st> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-07-24 11:27:51 +02:00
Rémi Denis-Courmont	d01914cfa0	checkasm/h264dsp: test TX bypass Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-07-24 00:53:52 +02:00
James Almer	4e3dc972c3	checkasm/lls: increase epsilon value for the update_lls test Should fix failures for some seeds on x86_32. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-07-24 00:53:47 +02:00
Ramiro Polla	112cbeea83	checkasm: add tests for yuv2rgb Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-28 22:20:31 +02:00
Nuo Mi	5294f78afb	checkasm/vvc_alf: ensure right and bottom boundaries are not overwritten by asm Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-27 09:03:07 +02:00
Nuo Mi	925011ad3d	checkasm/vvc_alf: random select alf virtual boundaries position A picture's virtual boundaries will split a CTU into 4 ALF blocks. The ALF virtual boundary may or may not cross an ALF block. Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-27 09:03:07 +02:00
Nuo Mi	6406fffad7	checkasm/vvc_alf: only check the valid filter and classify sizes Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-27 09:03:07 +02:00
Andreas Rheinhardt	31c95c91ab	avcodec/me_cmp: Zero MECmpContext in ff_me_cmp_init() Not every function will be set, so zero the context to initialize everything. This also allows to remove an initialization in dvenc.c. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-20 20:27:43 +02:00
Andreas Rheinhardt	3933dcd491	avcodec/me_cmp,dvenc,mpegvideo: Move ildct_cmp to its users MECmpContext.ildct_cmp is an array of function pointers that are not set by ff_me_cmp_init(), but that are set by users to one of the other arrays via ff_set_cmp(). Remove these pointers from MECmpContext and add pointers for the actually used functions to its users. (The DV encoder already did so.) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-20 20:27:43 +02:00
Andreas Rheinhardt	d2c568bd77	avcodec/me_cmp, mpegvideo: Move frame_skip_cmp to MpegEncContext MECmpContext has several arrays of function pointers that are not set by ff_me_cmp_init(), but that are set by users to one of the other arrays via ff_set_cmp(). One of these other users is mpegvideo_enc; it is the only user of MECmpContext.frame_skip_cmp and it only uses one of these function pointers at all. This commit therefore moves this function pointer to MpegEncContext; and removes the array from MECmpContext. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-20 20:27:43 +02:00
Andreas Rheinhardt	840eb90aba	avcodec/me_cmp, motion_est: Move me_(pre_)?_cmp etc. to MotionEstContext MECmpContext has several arrays of function pointers that are not set by ff_me_cmp_init(), but that are set by users to one of the other arrays via ff_set_cmp(). One of these other users is the motion estimation API. It uses MECmpContext.(me_pre\|me\|me_sub\|mb)_cmp. It is basically the only user of these arrays. This commit therefore moves these arrays to MotionEstContext; this has the additional advantage of making motion_est.c more independent from MpegEncContext. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-20 20:27:43 +02:00
Zhao Zhili	4e1c8f1a93	tests/checkasm: Remove check on linux perf fd in uninit The check should be >= 0, not > 0. The check itself is redundant, since uninit is only called after init succeeds. Signed-off-by: Zhao Zhili <zhilizhao@tencent.com> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-19 17:49:28 +02:00
Ramiro Polla	2250a5963e	checkasm: add tests for {lum,chr}ConvertRange Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-17 13:37:29 +02:00
James Almer	5200d0b509	checkasm/lls: add missing random values to the test buffers Fixes valgrind warnings after 18adaf9fe558587cb1b707c647af83015b69da48. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-13 21:50:47 +02:00
Rémi Denis-Courmont	3ea49a7f29	checkasm/lls: adjust buffer sizes and alignments var must be padded. param has `order + 1`, not `order` elements and is not over-aligned. Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-11 21:52:14 +02:00
Zhao Zhili	d25c581ec4	tests/checkasm: Fix build error when enable linux perf on Android B0 is defined by system header, see `f0f596dbc6` for ref. Reviewed-by: Martin Storsjö <martin@martin.st> Signed-off-by: Zhao Zhili <zhilizhao@tencent.com> Signed-off-by: Paul B Mahol <onemda@gmail.com>	2024-06-11 16:59:43 +02:00
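
Commit 5e84f3d5dd above switches the lls test from absolute to relative tolerances because the possible error grows with the magnitude of the output. The difference can be sketched as follows. This is only an illustration: the helper names `within_abs` and `within_rel` are hypothetical and are not taken from checkasm, whose actual comparison code is not shown in this log.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: absolute tolerance.
 * Accepts the result only if |ref - out| <= eps, whatever the magnitude
 * of the values, so large outputs with small rounding errors fail. */
static bool within_abs(double ref, double out, double eps)
{
    return fabs(ref - out) <= eps;
}

/* Hypothetical helper: relative tolerance.
 * The allowed error scales with the magnitude of the reference value.
 * Note that a reference of exactly 0 then requires an exact match. */
static bool within_rel(double ref, double out, double rel_eps)
{
    return fabs(ref - out) <= rel_eps * fabs(ref);
}
```

With these definitions, an error of 1.0 on a reference value of 1.0e9 is rejected by `within_abs(ref, out, 1e-3)` but accepted by `within_rel(ref, out, 1e-6)`, which is the kind of case the commit message describes.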


568 Commits