librempeg/doc/multithreading.txt
Andreas Rheinhardt 6e3bbd0062 avcodec/codec_internal: Remove FF_CODEC_CAP_ALLOCATE_PROGRESS
Before commit f025b8e110,
every frame-threaded decoder used ThreadFrames, even when
they did not have any inter-frame dependencies at all.
In order to distinguish those decoders that need the AVBuffer
for progress communication from those that do not (to avoid
the allocation for the latter), the former decoders were marked
with the FF_CODEC_CAP_ALLOCATE_PROGRESS internal codec cap.

Yet distinguishing these two can be done in a more natural way:
Don't use ThreadFrames when not needed and split ff_thread_get_buffer()
into a core function that calls the user's get_buffer2 callback
and a wrapper around it that also allocates the progress AVBuffer.
This has been done in 02220b88fc
and since that commit the ALLOCATE_PROGRESS cap was nearly redundant.

The only exception was WebP and VP8. WebP can contain VP8
and uses the VP8 decoder directly (i.e. they share the same
AVCodecContext). Both decoders are frame-threaded and VP8
has inter-frame dependencies (in general, not in valid WebP)
and therefore the ALLOCATE_PROGRESS cap. In order to avoid
allocating progress in case of a frame-threaded WebP decoder
the cap and the check for the cap has been kept in place.

Yet now the VP8 decoder has been switched to use ProgressFrames
and therefore there is just no reason any more for this check
and the cap. This commit therefore removes both.

Also change the value of FF_CODEC_CAP_USES_PROGRESSFRAMES
to leave no gaps.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2024-04-20 14:20:10 +02:00


FFmpeg multithreading methods
==============================================

FFmpeg provides two methods for multithreading codecs.

Slice threading decodes multiple parts of a frame at the same time, using
AVCodecContext execute() and execute2().

Frame threading decodes multiple frames at the same time.
It accepts N future frames and delays decoded pictures by N-1 frames.
The later frames are decoded in separate threads while the user is
displaying the current one.
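
The N-1 frame delay can be illustrated with a small arithmetic model. This is
only a sketch: frames_received() is a hypothetical helper, not an FFmpeg
function, and it assumes one frame per packet and no flushing.

```c
#include <assert.h>

/* Simplified model (hypothetical helper, not part of FFmpeg): number of
 * decoded frames a client has received after sending packets_sent packets
 * to a frame-threaded decoder, before flushing.
 * Assumes one frame per packet and a delay of thread_count - 1 frames. */
static int frames_received(int packets_sent, int thread_count)
{
    assert(packets_sent >= 0 && thread_count >= 1);
    int delay = thread_count - 1;
    return packets_sent > delay ? packets_sent - delay : 0;
}
```

Under these assumptions, a decoder with 4 threads returns nothing for the
first 3 packets and its first frame on the 4th; the remaining frames only
come out when the decoder is drained.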

Restrictions on clients
==============================================

Slice threading -
* The client's draw_horiz_band() must be thread-safe according to the comment
  in avcodec.h.

Frame threading -
* The restrictions for slice threading also apply.
* Custom get_buffer2() and get_format() callbacks must be thread-safe.
* There is one frame of delay added for every thread beyond the first one.
  Clients must be able to handle this; the pkt_dts and pkt_pts fields in
  AVFrame will work as usual.
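
As an illustration of the thread-safety requirement on callbacks, the sketch
below updates shared client state with a C11 atomic. ClientStats and
client_on_get_buffer() are hypothetical names; the real get_buffer2()
signature and the buffer allocation itself are deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical client state shared by all decoding threads. */
typedef struct ClientStats {
    atomic_uint buffers_requested;
} ClientStats;

/* Stand-in for the body of a custom buffer callback. It may run on several
 * decoder threads at once, so the shared counter is updated atomically
 * instead of with a plain increment. */
static int client_on_get_buffer(ClientStats *stats)
{
    assert(stats != NULL);
    atomic_fetch_add_explicit(&stats->buffers_requested, 1,
                              memory_order_relaxed);
    return 0;
}
```

A mutex around the shared state would serve equally well; the atomic is used
here only because the state is a single counter.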

Restrictions on codec implementations
==============================================

Slice threading -
 None except that there must be something worth executing in parallel.

Frame threading -
* Codecs can only accept entire pictures per packet.
* Codecs similar to ffv1, whose streams don't reset across frames,
  will not work because their bitstreams cannot be decoded in parallel.
* The contents of buffers must not be read before ff_progress_frame_await()
  has been called on them. reget_buffer() and buffer age optimizations no longer work.
* The contents of buffers must not be written to after ff_progress_frame_report()
  has been called on them. This includes draw_edges().
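
The two buffer rules above can be pictured with a toy progress counter. The
sketch below is not the ProgressFrame implementation: ModelFrame and the
model_*() functions are hypothetical, progress is assumed to be counted in
rows, a single reporting thread is assumed, and readiness is exposed as a
non-blocking check rather than a blocking wait.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a frame with decoding progress. */
typedef struct ModelFrame {
    atomic_int progress; /* highest fully decoded row, -1 = none yet */
} ModelFrame;

static void model_frame_init(ModelFrame *f)
{
    atomic_init(&f->progress, -1);
}

/* Producer side: rows 0..row are final and will not be written again.
 * Assumes a single reporting thread per frame. */
static void model_report(ModelFrame *f, int row)
{
    assert(row >= 0);
    /* Progress only moves forward; stale reports are ignored. */
    if (atomic_load_explicit(&f->progress, memory_order_relaxed) < row)
        atomic_store_explicit(&f->progress, row, memory_order_release);
}

/* Consumer side: true once rows 0..row may be read. */
static bool model_is_ready(ModelFrame *f, int row)
{
    return atomic_load_explicit(&f->progress, memory_order_acquire) >= row;
}

/* Owner side: a row may be written only while it is still unreported. */
static bool model_may_write(ModelFrame *f, int row)
{
    return atomic_load_explicit(&f->progress, memory_order_acquire) > row - 1
               ? false
               : true;
}
```

In this model, anything that modifies already reported rows, such as edge
drawing, has to run before the corresponding model_report() call.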

Porting codecs to frame threading
==============================================

Find all context variables that are needed by the next frame. Move all
code changing them, as well as code calling get_buffer(), up to before
the decode process starts. Call ff_thread_finish_setup() afterwards. If
some code can't be moved, have update_thread_context() run it in the next
thread.

Add AV_CODEC_CAP_FRAME_THREADS to the codec capabilities. There will be very little
speed gain at this point but it should work.

Use ff_thread_get_buffer() (or ff_progress_frame_get_buffer()
in case you have inter-frame dependencies and use the ProgressFrame API)
to allocate frame buffers.

Call ff_progress_frame_report() after some part of the current picture has been decoded.
A good place to put this is where draw_horiz_band() is called; if it isn't
called anywhere, add it, as it is useful too and the implementation is trivial
when you're doing this. Note that draw_edges() needs to be called before reporting progress.

Before accessing a reference frame or its MVs, call ff_progress_frame_await().
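
When awaiting a reference frame, the decoder has to decide how much of it
must be ready. A minimal sketch, assuming progress is reported in pixel rows
and motion vectors point at most max_mv_down rows below the current block
row; ref_row_to_await() is a hypothetical helper, not an FFmpeg function:

```c
#include <assert.h>

/* Hypothetical helper: last pixel row of the reference frame that must be
 * decoded before block row block_row of the current frame can be predicted.
 * Assumes square blocks of block_size rows and motion vectors reaching at
 * most max_mv_down rows below the block. The result is clamped to the frame. */
static int ref_row_to_await(int block_row, int block_size,
                            int max_mv_down, int frame_height)
{
    assert(block_row >= 0 && block_size > 0);
    assert(max_mv_down >= 0 && frame_height > 0);
    int last_row = (block_row + 1) * block_size - 1 + max_mv_down;
    return last_row < frame_height ? last_row : frame_height - 1;
}
```

A codec using this scheme would pass the returned row as the progress value
to wait for. Waiting for no more than the rows actually referenced is what
lets the next thread start before the reference frame is finished.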