Dear ab and experts,
I am using FPC to call the FFmpeg C library on an aarch64 Linux PC in order to decode 1080p video.
Result: 70 milliseconds per frame.
Why is the efficiency so low? Normally, it should be 10 milliseconds per frame.
On the same machine, using Qt Creator to call the FFmpeg C library for the same task gives an OK result.
My PC runs Ubuntu Linux on aarch64.
If necessary, I can open TeamViewer for remote control.
Offline
I do not know what to say, since I don't know much about your problem.
Are you sure the time is spent in ffmpeg and not in your FPC code?
Also look at how the ffmpeg library is initialized (e.g. memory or mutex settings).
Offline
Thank you, ab. I should explain more clearly.
1. The target task is to use FPC to decode 1080p video files.
2. The issue I am facing is that on the aarch64 Linux platform (Ubuntu), decoding by calling the FFmpeg C library from FPC is too slow: 70 milliseconds per frame.
As a comparison, I used Qt Creator with the same FFmpeg C library and the same calling method, which takes 10 milliseconds per frame. This is OK.
3. In my past work experience, I have mostly worked under Windows and Linux, both on x86 architectures.
There I used FPC to call the FFmpeg C library and completed this task very smoothly.
4. I placed the code at FTP://1: Test1234@121.40.151.139 /. Please refer to readme.txt.
5. If necessary, please let me know and I can open TeamViewer for remote control.
Last edited by vster (2024-02-20 04:42:51)
Offline
I have some doubts about whether it is related to the compilation of FPC on arm64.
I feel that it has little to do with the Lazarus FPC version. Currently, I am using Lazarus 2.2.2+FPC 3.2.2
Offline
The ffmpeg and sdl2 dynamic libraries are both loaded in the standard FPC way.
The key point is that avcodec.decode_video2 takes too long.
Last edited by vster (2024-02-20 03:44:52)
Offline
Perhaps ask on the FPC/Lazarus forums.
I don't see any reason in FPC why it should be slow to call a C library.
There is no overhead in FPC to call a C function, even on aarch64.
It calls the function with the standard ABI, with no redirection whatsoever.
You have to be SURE of the slowdown location.
Use some profiling, e.g. put some QueryPerformanceMicroseconds() around the library calls, and see the timing.
Perhaps the slowdown is not in the decode library, but in how you store your result in the buffer, or something like that.
Without profiling, we can't say.
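To illustrate the kind of timing suggested above, here is a minimal C sketch that wraps a single call with a monotonic wall-clock timer. It assumes a POSIX system with clock_gettime. The names now_us, time_call_us and work_under_test are hypothetical and not part of ffmpeg or FPC; work_under_test only stands in for the real library call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic wall-clock time in microseconds (POSIX clock_gettime). */
static int64_t now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Hypothetical stand-in for the library call being measured. */
static int64_t work_under_test(void)
{
    volatile int64_t acc = 0;
    for (int64_t i = 0; i < 1000000; i++)
        acc += i;
    return acc;
}

/* Calls fn once, stores its return value in *result, and returns
 * the elapsed wall-clock time in microseconds. */
static int64_t time_call_us(int64_t (*fn)(void), int64_t *result)
{
    assert(fn != NULL && result != NULL);
    int64_t t0 = now_us();
    *result = fn();
    return now_us() - t0;
}
```

Timing the decode call, the buffer copy and the display step separately in this way should show which of them accounts for the 70 milliseconds.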
Offline
Thanks, ab. I raised this question in the Lazarus forums before, but there is no result so far.
https://forum.lazarus.freepascal.org/in … 953.0.html
Offline
Using gettickcount for simple timing, avcodec_decode_video2 takes 70 milliseconds, mainly spent decoding H.264.
Offline
It is confusing: why is it OK when Qt calls the FFmpeg C library?
My confidence in Pascal has wavered a bit.
Offline
I debugged the source code of ffmpeg's avcodec_decode_video2. The files involved are mainly h264dec.c, h264_slice.c, and h264_cabac.c.
The main call chain is h264_decode_frame -> decode_nal_units -> decode_slice.
The function that takes the longest time is decode_slice. Internally, its process is briefly:
ff_init_cabac_decoder -> ff_h264_init_cabac_states -> ff_h264_decode_mb_cabac.
In the FPC environment, the execution efficiency of ff_h264_init_cabac_states and ff_h264_decode_mb_cabac is only 1/6 of that in the Qt environment.
Below is the source code of ff_h264_init_cabac_states; most of its time is spent in the for loop. Likewise, ff_h264_decode_mb_cabac is called in a large for loop, because an H.264 frame is composed of many macroblocks and ff_h264_decode_mb_cabac decodes only a single macroblock.
void ff_h264_init_cabac_states(const H264Context *h, H264SliceContext *sl)
{
    int ii;
    const int8_t (*tab)[2];
    const int slice_qp = av_clip(sl->qscale - 6*(h->ps.sps->bit_depth_luma-8), 0, 51);

    if (sl->slice_type_nos == AV_PICTURE_TYPE_I)
        tab = cabac_context_init_I;
    else
        tab = cabac_context_init_PB[sl->cabac_init_idc];

    /* calculate pre-state */
    for (ii = 0; ii < 1024; ii++) {
        int pre = 2*(((tab[ii][0] * slice_qp) >> 4) + tab[ii][1]) - 127;

        pre ^= pre >> 31;
        if (pre > 124)
            pre = 124 + (pre & 1);

        sl->cabac_state[ii] = pre;
    }
}
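For reference, the per-entry arithmetic of the loop above can be isolated in a small standalone C sketch, which makes it easy to benchmark outside ffmpeg under both toolchains. The names cabac_pre_state and init_states are hypothetical and not part of ffmpeg, and the table passed in is caller-supplied rather than ffmpeg's cabac_context_init_I.

```c
#include <assert.h>
#include <stdint.h>

/* One iteration of the pre-state loop for a table entry (m, n).
 * Like the original, this assumes >> is an arithmetic shift for
 * negative int values (true for GCC and Clang). */
static uint8_t cabac_pre_state(int m, int n, int slice_qp)
{
    /* The original clips slice_qp to 0..51 with av_clip. */
    assert(slice_qp >= 0 && slice_qp <= 51);

    int pre = 2 * (((m * slice_qp) >> 4) + n) - 127;

    pre ^= pre >> 31;           /* negative pre becomes -pre - 1 */
    if (pre > 124)
        pre = 124 + (pre & 1);  /* clamp to 124 or 125, keeping the low bit */

    return (uint8_t)pre;
}

/* Fills out[0..count-1] from a caller-supplied table of (m, n) pairs,
 * the same way the original loop fills 1024 states. */
static void init_states(const int8_t (*tab)[2], int count,
                        int slice_qp, uint8_t *out)
{
    for (int i = 0; i < count; i++)
        out[i] = cabac_pre_state(tab[i][0], tab[i][1], slice_qp);
}
```

As a worked value, cabac_pre_state(20, -15, 26) computes 2*((520 >> 4) - 15) - 127 = -93, which the XOR step turns into 92. The loop is only 1024 iterations of integer arithmetic, so timing this sketch in isolation would help tell whether the slowdown really comes from the loop itself or from the measurement around it.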
Offline
Under Qt, ff_h264_init_cabac_states costs 6 clock_t ticks, while under FPC it costs 36 clock_t ticks.
@ab, @all, please help me. Thanks.
Last edited by vster (2024-02-27 05:54:59)
Offline
Further debugging shows that ff_h264_init_cabac_states spends most of its time in the for loop.
Offline
Did you ask on some ffmpeg dedicated forum?
My wild guess (probably wrong) is that the library context is not properly initialized, so non-optimal code (e.g. pure C code) is used.
Don't try to follow the code. IIRC a lot of ffmpeg functions are renamed via macros in the source code, so it is likely that the source you look at is not what is executed.
Offline
Thanks, ab.
I wrote the timing code in the ffmpeg C file and compiled it into libraries such as libavcodec.so.
The Lazarus demo then loads libavcodec.so through loadlibrary, runs it,
and prints out the timing results. The testing process for Qt and Lazarus is the same, and the timing code is executed in both cases,
but the timing result for Qt is OK, while the Lazarus result is too slow.
Offline