117119 Commits

Author SHA1 Message Date
Niklas Haas
346f2fe5e8 avutil/dovi_meta: document static vs dynamic ext blocks
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:37 +02:00
Niklas Haas
19c36e98ef fate/scalechroma: switch to standard chroma location
Replace the manually specified chroma location by one using standard
notation, arbitrarily "bottomleft" as it is a less common path.

Required if we want to phase out the use of manual chroma locations.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:36 +02:00
Niklas Haas
ccdca6e4f2 avfilter/vf_zscale: remove unused fields
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:36 +02:00
Niklas Haas
8c9327c119 avfilter/vf_scale: fix 4:1:0 interlaced chroma pos
The current logic hard-coded a check for v_sub == 1. We can extend this
logic slightly to cover the case of interlaced 4:1:0 (which has v_sub ==
2).

Here is a diagram explaining this scenario (with center-siting):

a   a   a   a   a   a   a   a

b   b   b   b   b   b   b   b
      X               X
a   a   a   a   a   a   a   a

b   b   b   b   b   b   b   b

a   a   a   a   a   a   a   a

b   b   b   b   b   b   b   b
      Y               Y
a   a   a   a   a   a   a   a

b   b   b   b   b   b   b   b

a = even luma rows
b = odd luma rows
X = even chroma sample
Y = odd chroma sample

In progressive mode, both chroma samples sit at (384, 384).

Relative to the 8x4 grid of even luma samples (a), the X sample sits at:
  h_chr_pos = 384
  v_chr_pos = 192

Relative to the 8x4 grid of odd luma samples (b), the Y sample sits at:
  h_chr_pos = 384
  v_chr_pos = 576

The new code calculates the correct values in all circumstances.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:36 +02:00
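
Note on 8c9327c119: the field positions quoted in the message can be reproduced with a little arithmetic. The following is a minimal sketch, not the code from vf_scale.c. The name field_v_chr_pos and its parameters are made up for illustration, and it assumes positions are expressed in 1/256ths of a luma row, as the numbers in the message suggest.

```c
/*
 * Hypothetical helper (not taken from vf_scale.c).
 *
 * v_pos: vertical chroma position in the progressive frame, in units of
 *        1/256 luma row (384 for center-sited 4:1:0).
 * v_sub: log2 of the vertical subsampling factor (2 for 4:1:0).
 * odd:   0 for the even field (rows "a"), 1 for the odd field (rows "b").
 */
static int field_v_chr_pos(int v_pos, int v_sub, int odd)
{
    if (odd) {
        /* The odd field's chroma row lies one chroma row (2^v_sub luma
         * rows) further down, while the field's first luma row is frame
         * row 1, so the net shift is 2^v_sub - 1 luma rows. */
        v_pos += (256 << v_sub) - 256;
    }
    /* Rows of a field are two frame rows apart, so distances halve. */
    return v_pos / 2;
}
```

With v_pos = 384 and v_sub = 2 this yields 192 for the even field and 576 for the odd field, the two v_chr_pos values listed in the message.
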
Niklas Haas
6f30aee978 avfilter/vf_scale: add in/out_chroma_loc
Currently, this just functions as a more principled and user-friendly
replacement for the (undocumented and hard to use) *_chr_pos fields.

However, the goal is to automatically infer these values from the input
frames' chroma location, and deprecate the manual use of *_chr_pos
altogether. (Indeed, my plans for an swscale replacement will most
likely also end up limiting the set of legal chroma locations to those
permissible by AVFrame properties)

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:36 +02:00
Niklas Haas
b90b5f7def avfilter/swscale: always fix interlaced chroma location
The current logic only fixes it when the user does not explicitly
specify the chroma location. However, this does not make a lot of sense.
Since there is no way to specify this property per-field, it effectively
*prevents* the user from being able to correctly scale interlaced frames
with top-aligned chroma.

It makes more sense to consider the user setting in the progressive case
only, and automatically adapt it to the correct interlaced field
positions, following the details of the MPEG specification.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:36 +02:00
Niklas Haas
7e188369dc swscale/options: relax src/dst_h/v_chr_pos value range
When dealing with 4x subsampling ratios (log2 == 2), such as can arise
with 4:1:1 or 4:1:0, a value range of 512 is not enough to cover the
range of possible scenarios.

For example, bottom-sited chroma in 4:1:0 would require an offset of 768
(three luma rows). Simply double the limit to 1024. I don't see any
place in initFilter() that would experience overflow as a result of this
change, especially since get_local_pos() right-shifts it by the
subsampling ratio again.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:36 +02:00
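
Note on 7e188369dc: the 768 figure follows directly from the subsampling ratio. A minimal sketch, assuming offsets are counted in 1/256ths of a luma sample. The helper name max_chr_pos is hypothetical and does not exist in swscale.

```c
/* Hypothetical helper: largest chroma offset, in 1/256 luma-sample units,
 * for a plane subsampled by a factor of 2^log2_sub. */
static int max_chr_pos(int log2_sub)
{
    /* The last luma sample covered by one chroma sample is
     * (2^log2_sub - 1) samples away from the first one. */
    return ((1 << log2_sub) - 1) * 256;
}
```

For log2_sub == 1 the result is 256, which fits in the old range of 512. For log2_sub == 2 it is 768, which only fits once the limit is raised to 1024.
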
Niklas Haas
c83e625e0b avfilter/vf_setparams: allow setting chroma location
Shockingly, there isn't currently _any_ filter for overriding this.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:36 +02:00
Niklas Haas
4285516123 swscale: document SWS_FULL_CHR_H_* flags
Based on my best understanding of what they do, given the source code.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:35 +02:00
Lynne
8c095f85ae vulkan_filter: don't require the storage flag for the base frames format
We check for whether subformats support storage immediately below.
Those are the ones we require storage for, rather than the base format
itself.

This permits better reuse of AVHWFrame contexts.

The patch also removes an always-false check in the subformat check.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:35 +02:00
Lynne
dd093087c2 vulkan_filter: allow reusing frame contexts with DRM tiling
There's no reason not to permit this, particularly if a user wants
to manipulate images which will be exported back to DRM.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:35 +02:00
Lynne
d6bf4c7d73 hwcontext_vulkan: align host mapping size to minImportedHostPointerAlignment
This was left out of the recent rewrite of the system.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:35 +02:00
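
Note on d6bf4c7d73: aligning a mapping size means rounding it up to the next multiple of the alignment. A minimal sketch of that rounding, not the code from hwcontext_vulkan.c. The name align_up is hypothetical, and the alignment is assumed to be a power of two.

```c
#include <stddef.h>

/* Round size up to the next multiple of align.
 * Assumption: align is a nonzero power of two. */
static size_t align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}
```

With an alignment of 4096, a 5000-byte mapping becomes 8192 bytes, while a 4096-byte mapping is left unchanged.
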
Lynne
9a5fcad214 vulkan: enable encoding of images if video_maintenance1 is enabled
Vulkan encoding was designed in a very... consolidated way.
You had to know the exact codec and profile that the image was going to
eventually be encoded as at... image creation time. Unfortunately, as good
as our code is, glimpsing into the exact future isn't what it's capable of.

video_maintenance1 removed that requirement, which only then made encoding
images practically possible.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:35 +02:00
Lynne
aea34a3bf7 hwcontext_vulkan: enable VK_KHR_video_maintenance1
We require it for encoding.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:35 +02:00
Lynne
874fbbb3ab hwcontext_vulkan: setup extensions before features
The issue is that enabling features requires that the device
extension is supported. The extensions bitfield was set later,
so it was always 0, leading to no features being added.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:35 +02:00
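
Note on 874fbbb3ab: the ordering problem can be shown in isolation. This is a sketch with made-up names (DemoDevice, the EXT_/FEAT_ flags and all functions are hypothetical). It only models the dependency described in the message: a feature is enabled only if the bit for its extension is already set.

```c
#include <stdint.h>

/* Hypothetical flags, for illustration only. */
#define EXT_VIDEO_MAINTENANCE1  (1u << 0)
#define FEAT_VIDEO_MAINTENANCE1 (1u << 0)

typedef struct DemoDevice {
    uint32_t extensions; /* bitfield of enabled extensions */
    uint32_t features;   /* bitfield of enabled features */
} DemoDevice;

/* A feature is only enabled when its extension bit is already set. */
static void setup_features(DemoDevice *dev)
{
    if (dev->extensions & EXT_VIDEO_MAINTENANCE1)
        dev->features |= FEAT_VIDEO_MAINTENANCE1;
}

static void setup_extensions(DemoDevice *dev)
{
    dev->extensions |= EXT_VIDEO_MAINTENANCE1;
}

/* Buggy order: features are derived from a bitfield that is still 0. */
static uint32_t init_features_first(void)
{
    DemoDevice dev = { 0 };
    setup_features(&dev);
    setup_extensions(&dev);
    return dev.features;
}

/* Fixed order: extensions are known before features are derived. */
static uint32_t init_extensions_first(void)
{
    DemoDevice dev = { 0 };
    setup_extensions(&dev);
    setup_features(&dev);
    return dev.features;
}
```

init_features_first() returns 0 because no feature bit is ever set, whereas init_extensions_first() returns the expected feature bit.
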
Lynne
d68db2d412 hwcontext_vulkan: don't enable deprecated VK_KHR_sampler_ycbcr_conversion extension
It was promoted to core in Vulkan 1.1 a long time ago.
The validation layer warns if it is enabled.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:35 +02:00
Lynne
aaf57716fc hwcontext_vulkan: fix user layers, add support for different debug modes
The validation layer option only supported GPU-assisted validation.
This is mutually exclusive with shader debug printfs, so we need to
differentiate between the two.

This also fixes issues with user-given layers, and leaks in case of
errors.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:34 +02:00
Lynne
6778183db2 vulkan_decode: use the correct queue family for decoding ops
In 680d969a305c0927480573a1b455024088b51aeb, the new API was
used to find a queue family for dispatch, but the found queue
family was not used for decoding, just for dispatching.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:34 +02:00
Anton Khirnov
5ab4548c03 lavfi: move AVFilterLink.graph to FilterLink
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:34 +02:00
Anton Khirnov
fe12430a86 lavfi: move AVFilterLink.frame_wanted_out to FilterLinkInternal
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:34 +02:00
Anton Khirnov
87b0b94bde lavfi: move AVFilterLink.{frame,sample}_count_{in,out} to FilterLink
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:33 +02:00
Anton Khirnov
49989556f1 lavfi: move AVFilterLink.frame_rate to FilterLink
Co-developed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:33 +02:00
Anton Khirnov
b0be4b7aeb lavfi: move AVFilterLink.current_pts(_us) to FilterLink
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:33 +02:00
Anton Khirnov
c969da071d lavfi: move AVFilterLink.hw_frames_ctx to FilterLink
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:33 +02:00
Anton Khirnov
88d4d7fe15 lavfi/vf_*_cuda: do not access hw contexts before checking they exist
The checks are performed in init_processing_chain().

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:33 +02:00
Anton Khirnov
5e8bd1219f lavfi: move AVFilterLink.m{ax,in}_samples to FilterLink
Also, document who sets these fields and when.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:33 +02:00
Anton Khirnov
d805cdb19b lavfi: add a new struct for private link properties
Specifically those that should be visible to filters, but hidden from
API callers. Such properties are currently located at the end of the
public AVFilterLink struct, demarcated by a comment marking them as
private. However it is generally better to hide them explicitly, using
the same pattern already employed in avformat or avcodec.

The new struct is currently trivial, but will become more useful in
following commits.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:32 +02:00
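
Note on d805cdb19b: the pattern of hiding private properties behind a wrapper struct can be sketched as follows. The names PublicLink, PrivateLink and private_link are hypothetical stand-ins and the fields are chosen arbitrarily; this is not the actual lavfi definition.

```c
/* Public part, visible to API callers (stand-in for a public struct). */
typedef struct PublicLink {
    int w, h;
} PublicLink;

/* Filter-visible wrapper: the public struct must be the first member so
 * that a pointer to it can be converted back to the wrapper. */
typedef struct PrivateLink {
    PublicLink pub;
    int frame_count_in; /* hidden from API callers */
} PrivateLink;

/* Valid only for links that were allocated as a PrivateLink. */
static PrivateLink *private_link(PublicLink *link)
{
    return (PrivateLink *)link;
}
```

API callers only ever see a PublicLink pointer, while code inside the library calls private_link() to reach the hidden fields. The cast is well defined in C because a pointer to a struct also points to its first member.
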
Anton Khirnov
7e9cd2e966 lavfi: set AVFilterLink.graph on link creation
There is no reason to delay this.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-30 18:02:32 +02:00
Wu Jianhua
07dba79218 avcodec/vvc/dsp: prefix TxType and TxSize with VVC
See https://patchwork.ffmpeg.org/project/ffmpeg/patch/TYSPR06MB64337C4A9ADF5312E6648543AA62A@TYSPR06MB6433.apcprd06.prod.outlook.com/#81892

Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:57 +02:00
Wu Jianhua
ad62ab63e2 avcodec/vvc_parser: move avctx->has_b_frames initialization to dec
From Jun Zhao <mypopydev@gmail.com>:
> Should we relocate this to the decoder? Other codecs typically set this
> parameter in the decoder.

Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:56 +02:00
Nuo Mi
46866a56b2 avcodec/vvcdec: move frame tab memset from the main thread to worker threads
Calling memset on the tables in the main thread can become a bottleneck for the decoder.
For example, if it takes 1% of the processing time for one core, the maximum achievable FPS will be 100.
Moving the memset to worker threads fixes the issue.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:56 +02:00
Nuo Mi
72f1ce6008 avcodec/vvcdec: do not zero frame qp table
For luma, the QP can only change at the CU level, so the QP table size is related to the CU.
For chroma, considering joint CbCr, the QP table size is related to the TU.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:56 +02:00
Nuo Mi
63afa7dbc9 avcodec/vvcdec: do not zero frame msf mmi table
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:56 +02:00
Nuo Mi
e57949afe5 avcodec/vvcdec: do not zero frame cpm table
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:56 +02:00
Nuo Mi
95da160818 avcodec/vvcdec: check_available, use && instead of &= for shortcut evaluation
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:56 +02:00
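
Note on 95da160818: the difference between the two operators can be demonstrated with a call counter. A minimal sketch with hypothetical names (check, calls and both wrapper functions are illustrative and unrelated to check_available in the decoder).

```c
/* Counts how often the (hypothetically expensive) check runs. */
static int calls;

static int check(int result)
{
    calls++;
    return result;
}

/* "&=" evaluates every operand, even after the result is known to be 0. */
static int all_with_and_assign(void)
{
    int ok = check(0);
    ok &= check(1);
    ok &= check(1);
    return ok;
}

/* "&&" stops at the first operand that evaluates to 0. */
static int all_with_logical_and(void)
{
    return check(0) && check(1) && check(1);
}
```

Both functions return 0, but the first one runs check() three times and the second only once. The two forms give the same result only when every operand is 0 or 1, since "&=" is a bitwise operation.
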
Nuo Mi
78efbaf27c avcodec/vvcdec: do not zero frame mvf table
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:56 +02:00
Nuo Mi
5011dfbfbb avcodec/vvcdec: refact out is_available from is_a0_available
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:56 +02:00
Nuo Mi
dec4a78f85 avcodec/vvcdec: split ctu table to zero init and no zero init parts
The CUs need to be initialized to zero; the other parts do not.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:55 +02:00
Nuo Mi
5b64355aa5 avcodec/vvcdec: remove unnecessary perframe initializations
deblock, sao, alf
skip, imtf, ipm, cqt_depth, cb_pos_x, cb_pos_y, cb_height, cp_mv,
tb_pos_x0, tb_pos_y0, tb_width, tb_height

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:55 +02:00
Nuo Mi
e6e3891a7f avcodec/vvcdec: refact, combine bs tab with tu tab
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:55 +02:00
Nuo Mi
92c4710e4c avcodec/vvcdec: thread, ensure the parse stage gets the highest priority
The parse stage is not parallelizable.
We need to schedule it as soon as possible to create the later stages, which are more parallelizable.

clips                                       | before | after | delta
--------------------------------------------|--------|-------|------
RitualDance_1920x1080_60_10_420_37_RA.266   | 342.7  | 365.3 |  6.59%
NovosobornayaSquare_1920x1080.bin           | 321.7  | 400   | 24.34%
Tango2_3840x2160_60_10_420_27_LD.266        |  82.3  |  91.7 | 11.42%
RitualDance_1920x1080_60_10_420_32_LD.266   | 323.7  | 319.3 | -1.36%
Chimera_8bit_1080P_1000_frames.vvc          | 364    | 411.3 | 12.99%
BQTerrace_1920x1080_60_10_420_22_RA.vvc     | 162.7  | 185.7 | 14.14%

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:55 +02:00
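
Note on 92c4710e4c: one way to give a stage the highest priority is to test for it first in the task ordering function. The following is a sketch under that assumption, not the decoder's scheduler. DemoStage, DemoTask and task_runs_first are hypothetical, and the tie-breaking rules after the parse check are chosen arbitrarily.

```c
/* Hypothetical stage list; lower values run earlier in the pipeline. */
typedef enum DemoStage {
    DEMO_STAGE_PARSE,
    DEMO_STAGE_INTER,
    DEMO_STAGE_RECON,
    DEMO_STAGE_FILTER,
} DemoStage;

typedef struct DemoTask {
    DemoStage stage;
    int frame_order; /* decode order of the frame the task belongs to */
} DemoTask;

/* Returns nonzero if a should be scheduled before b. */
static int task_runs_first(const DemoTask *a, const DemoTask *b)
{
    int a_parse = a->stage == DEMO_STAGE_PARSE;
    int b_parse = b->stage == DEMO_STAGE_PARSE;

    /* Parse tasks beat everything else, regardless of frame. */
    if (a_parse != b_parse)
        return a_parse;
    /* Otherwise prefer the earlier frame, then the earlier stage. */
    if (a->frame_order != b->frame_order)
        return a->frame_order < b->frame_order;
    return a->stage < b->stage;
}
```

Under this ordering a parse task for a later frame still runs before a reconstruction task for an earlier frame, so the serial stage keeps producing work for the parallel stages.
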
Nuo Mi
b6b22d5f56 checkasm: add tests for vvc dmvr
dmvr_8_12x20_c: 186.2
dmvr_8_12x20_avx2: 25.7
dmvr_8_20x12_c: 181.7
dmvr_8_20x12_avx2: 25.2
dmvr_8_20x20_c: 283.2
dmvr_8_20x20_avx2: 32.0
dmvr_10_12x20_c: 90.0
dmvr_10_12x20_avx2: 15.7
dmvr_10_20x12_c: 41.0
dmvr_10_20x12_avx2: 14.7
dmvr_10_20x20_c: 81.5
dmvr_10_20x20_avx2: 26.7
dmvr_12_12x20_c: 190.7
dmvr_12_12x20_avx2: 20.2
dmvr_12_20x12_c: 187.2
dmvr_12_20x12_avx2: 20.2
dmvr_12_20x20_c: 292.7
dmvr_12_20x20_avx2: 27.2
dmvr_h_8_12x20_c: 317.0
dmvr_h_8_12x20_avx2: 37.0
dmvr_h_8_20x12_c: 340.0
dmvr_h_8_20x12_avx2: 41.0
dmvr_h_8_20x20_c: 540.7
dmvr_h_8_20x20_avx2: 64.0
dmvr_h_10_12x20_c: 322.7
dmvr_h_10_12x20_avx2: 30.7
dmvr_h_10_20x12_c: 344.2
dmvr_h_10_20x12_avx2: 34.0
dmvr_h_10_20x20_c: 529.0
dmvr_h_10_20x20_avx2: 51.5
dmvr_h_12_12x20_c: 326.7
dmvr_h_12_12x20_avx2: 33.5
dmvr_h_12_20x12_c: 331.7
dmvr_h_12_20x12_avx2: 51.2
dmvr_h_12_20x20_c: 534.0
dmvr_h_12_20x20_avx2: 62.7
dmvr_hv_8_12x20_c: 650.0
dmvr_hv_8_12x20_avx2: 57.2
dmvr_hv_8_20x12_c: 676.2
dmvr_hv_8_20x12_avx2: 70.0
dmvr_hv_8_20x20_c: 1068.5
dmvr_hv_8_20x20_avx2: 103.2
dmvr_hv_10_12x20_c: 649.0
dmvr_hv_10_12x20_avx2: 48.2
dmvr_hv_10_20x12_c: 677.7
dmvr_hv_10_20x12_avx2: 59.7
dmvr_hv_10_20x20_c: 1093.5
dmvr_hv_10_20x20_avx2: 91.7
dmvr_hv_12_12x20_c: 660.0
dmvr_hv_12_12x20_avx2: 58.7
dmvr_hv_12_20x12_c: 682.7
dmvr_hv_12_20x12_avx2: 72.0
dmvr_hv_12_20x20_c: 1094.0
dmvr_hv_12_20x20_avx2: 113.2
dmvr_v_8_12x20_c: 325.7
dmvr_v_8_12x20_avx2: 31.2
dmvr_v_8_20x12_c: 326.2
dmvr_v_8_20x12_avx2: 38.5
dmvr_v_8_20x20_c: 538.5
dmvr_v_8_20x20_avx2: 54.2
dmvr_v_10_12x20_c: 318.5
dmvr_v_10_12x20_avx2: 23.7
dmvr_v_10_20x12_c: 330.7
dmvr_v_10_20x12_avx2: 40.5
dmvr_v_10_20x20_c: 567.5
dmvr_v_10_20x20_avx2: 48.0
dmvr_v_12_12x20_c: 335.2
dmvr_v_12_12x20_avx2: 30.0
dmvr_v_12_20x12_c: 330.2
dmvr_v_12_20x12_avx2: 39.5
dmvr_v_12_20x20_c: 535.2
dmvr_v_12_20x20_avx2: 60.0

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:55 +02:00
Nuo Mi
a16ce8f94d x86/vvcdec: add dmvr avx2 code
Decoder-Side Motion Vector Refinement accounts for about 4~8% of CPU usage for some clips.

Here are the results of a single test run:
clips                                     | before| after | delta
------------------------------------------|-------|-------|------
RitualDance_1920x1080_60_10_420_37_RA.266 | 338.7 | 354.3 |4.61%
NovosobornayaSquare_1920x1080.bin         | 320.3 | 329.3 |2.81%
Tango2_3840x2160_60_10_420_27_LD.266      | 83.3  | 83.7  |0.48%
RitualDance_1920x1080_60_10_420_32_LD.266 | 320.7 | 327.3 |2.06%
Chimera_8bit_1080P_1000_frames.vvc        | 360.7 | 381.0 |5.63%
BQTerrace_1920x1080_60_10_420_22_RA.vvc   | 161.7 | 163.0 |0.80%

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:55 +02:00
Nuo Mi
d2e280039c avcodec/vvcdec: Use av_image_copy_plane for DMVR 10-bit integer pixels
There is no need to shift and interpolate for 10-bit integer pixels;
av_image_copy_plane is enough.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:55 +02:00
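
Note on d2e280039c: when no shift or interpolation is needed, the operation reduces to a strided plane copy. The sketch below is a stand-in that mirrors the argument order of av_image_copy_plane() so that it stays self-contained. The name copy_plane is hypothetical, and this is not the libavutil implementation.

```c
#include <stdint.h>
#include <string.h>

/* Row-by-row plane copy. Line sizes and bytewidth are in bytes, and the
 * two planes may use different line sizes. */
static void copy_plane(uint8_t *dst, int dst_linesize,
                       const uint8_t *src, int src_linesize,
                       int bytewidth, int height)
{
    for (int y = 0; y < height; y++) {
        memcpy(dst, src, bytewidth);
        dst += dst_linesize;
        src += src_linesize;
    }
}
```

For 10-bit content stored in 16-bit samples, bytewidth would be twice the block width in pixels.
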
eaphone
9ad368b397 libavdevice/gdigrab: change hwnd tail check fail logic to !=null
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:54 +02:00
gnattu
04c70739c1 avutil/hwcontext_videotoolbox: silence warning for RGB
Hardware frames with an RGB colorspace do not have a YCbCrMatrixKey.
Currently, this spams the console with warnings whenever an RGB frame is
uploaded.

Signed-off-by: Gnattu OC <gnattuoc@me.com>
Reviewed-by: Marvin Scholz <epirat07@gmail.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:54 +02:00
Araz Iusubov
64df2b148a avcodec/amfenc: new encoder features support
Implemented:
New usage modes for AV1 encoder.
Latency mode for H264, HEVC and AV1 encoders.
Adaptive Quantization (AQ) mode in AV1 encoder.
Signed-off-by: Dmitrii Ovchinnikov <ovchinnikov.dmitrii@gmail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:54 +02:00
Gyan Doshi
f970195fa5 lavc/libx265: unbreak build for X265_BUILD >= 210
x265 added support for alpha starting with build 210.
While doing so, x265_encoder_encode() changed its fifth arg to
an array of pointers to x265_picture. This broke building lavc/libx265.c

This patch simply unbreaks the build and maintains existing single-layer
non-alpha encoding support.

Fixes #11130

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 18:17:54 +02:00
Paul B Mahol
06201b1dac avfilter/vf_v360: add dual square projection
2024-08-15 18:13:47 +02:00
James Almer
88f24e2a61 avformat/iamf_parse: ignore Audio Elements with an unsupported type
Better fix for the NULL pointer dereference from d7f83fc2f423.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-08-15 10:48:38 +02:00