FFMedia by Firefly: Ultra-Flow Video Processing Framework

What is FFMedia?

Rockchip RK3588 series SoCs have powerful video encoding and decoding capabilities and perform especially well when processing many video channels concurrently. In practice, however, developing video processing applications on them runs into two problems. General-purpose frameworks such as GStreamer and FFmpeg fail to make full use of the chip's performance, while the official native APIs sit too close to the hardware, which means high learning costs, long development cycles, and a heavy development workload.

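FFMedia, Firefly's video processing framework, addresses these problems by building applications out of three kinds of units. The main components of each are as follows: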

  • Input units: RTSP, RTMP, WHEP, camera, file, and other inputs.
  • Processing units: hardware decoding, encoding, image processing, inference, and other units that support hardware acceleration.
  • Output units: RTSP, RTMP, WHIP, DRM display, GB28181, file, and other outputs.

Functions And Features

Core Architecture

  • Modular architecture: The entire framework adopts the Producer/Consumer model, and each unit is abstracted into the ModuleMedia class.
  • Efficient memory management technology: Data interaction between units and hardware is implemented using zero copy.
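
The Producer/Consumer wiring described above can be sketched in a few lines of Python. This is only an illustration of the pattern: the `Module` class, its methods, and `run_pipeline` are hypothetical names invented for this sketch and are not the FFMedia ModuleMedia API. Frames here are plain strings handed along by reference, which merely hints at the idea of avoiding copies between units.

```python
import queue
import threading


class Module:
    """Hypothetical pipeline unit; NOT the FFMedia ModuleMedia API."""

    def __init__(self, name, work=None):
        self.name = name
        self.work = work or (lambda frame: frame)
        # Bounded queue: a slow consumer applies back-pressure to its producer.
        self.inbox = queue.Queue(maxsize=8)
        self.consumers = []
        self.thread = threading.Thread(target=self._run, daemon=True)

    def add_consumer(self, other):
        self.consumers.append(other)

    def start(self):
        self.thread.start()

    def push(self, frame):
        self.inbox.put(frame)

    def _run(self):
        while True:
            frame = self.inbox.get()
            if frame is None:  # sentinel: pass shutdown downstream and stop
                for consumer in self.consumers:
                    consumer.push(None)
                return
            out = self.work(frame)
            for consumer in self.consumers:
                consumer.push(out)  # the result object is handed on by reference


def run_pipeline(frames):
    """Wire input -> decode -> output and return what the output unit saw."""
    results = []
    source = Module("input")
    decode = Module("decode", lambda f: f.upper())  # stand-in for real decoding
    sink = Module("output", lambda f: results.append(f) or f)
    source.add_consumer(decode)
    decode.add_consumer(sink)
    for module in (sink, decode, source):
        module.start()
    for frame in frames:
        source.push(frame)
    source.push(None)
    for module in (source, decode, sink):
        module.thread.join()
    return results
```

Each unit only knows its own inbox and its list of consumers, so adding a second output (for example a file writer next to a display) would be one more `add_consumer` call in this sketch.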

Media Processing Capabilities

  • Format support: supports parsing and encapsulation of mainstream container formats such as MP4/MKV/FLV/TS and mainstream protocols such as RTSP/RTMP/GB28181/WebRTC.
  • Transcoding and processing: supports video transcoding, cropping, splicing, watermarking and other processing.
  • Streaming media processing: supports pulling media streams from sources such as cameras and network streams for real-time processing, forwarding and storage.

Performance Optimization

  • Low load and low latency: Data stream processing and transmission are deeply optimized, giving lower CPU usage and lower latency than GStreamer and FFmpeg.
  • Efficient Python module: Seamless interoperability between C++ and Python is achieved through pybind11.
  • Unified interface: Complex underlying operations are hidden and optimized, giving users an efficient, unified interface.

Platform Compatibility

  • Chip-level adaptation: supports all Rockchip-based models on the Firefly platform.
  • System support: supports different system versions such as Buildroot/Ubuntu/Debian.

Download Source Code

Pull Source Code:

  git clone https://github.com/Firefly-rk-linux-utils/ffmedia_release.git

Development Interface

All interfaces support C++ and Python calls.

C++ usage pattern:

  auto rtsp_c = make_shared<ModuleRtspClient>("rtsp://xxx");
  auto ret = rtsp_c->init();

Python usage pattern:

  import ff_pymedia as m  # module name assumed; check the Python package shipped in ffmedia_release
  rtsp_c = m.ModuleRtspClient("rtsp://xxx")
  ret = rtsp_c.init()

Typical Scenarios And Performance Tests

Test environment: ITX-3588J ITX Motherboard

Low Latency Live Streaming

This test plays back an H265 1080p@30fps RTSP live stream using the following modules:

  • RTSP Client: A self-implemented lightweight RTSP client module is used; it takes about 0.03 milliseconds to fetch one frame of the stream.
  • MPP Decoding: Decoding module based on MPP implementation; decoding one frame takes about 1.2 milliseconds (can be as low as 0.7 milliseconds in multi-channel mode).
  • DRM Display: Display module based on DRM framework; sending and displaying one frame takes about 0.9 milliseconds.

The latency of an H265 1080p live stream (with a sequential P-frame series) can be estimated as follows. From receiving the data off the network to decoding it into a raw YUV frame takes about 1.3 milliseconds. Display latency also depends on the screen refresh rate: at 60 Hz the refresh interval is 16.667 milliseconds, so the display latency falls between 0.9 and 16.667 milliseconds. In summary, the minimum latency of a 1080p live stream is about 2.4 milliseconds.
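
The estimate above can be reproduced as a small calculation. The sketch below uses only the per-frame times quoted in the module list; the function and constant names are made up for illustration. Note that the plain sum of the three quoted stage times comes to about 2.13 milliseconds, slightly under the rounded 2.4 millisecond figure; the source does not break down where the difference comes from, so treat this as an approximation rather than a reproduction of the measurement.

```python
# Per-frame stage times quoted in the module list above, in milliseconds.
RTSP_FETCH_MS = 0.03
MPP_DECODE_MS = 1.2
DRM_DISPLAY_MS = 0.9


def refresh_interval_ms(refresh_hz):
    """Time between two screen refreshes, in milliseconds."""
    return 1000.0 / refresh_hz


def latency_bounds_ms(refresh_hz=60):
    """Return (network-to-YUV, best-case, worst-case) latency in milliseconds."""
    to_yuv = RTSP_FETCH_MS + MPP_DECODE_MS
    # Best case: the frame is shown right after it is sent to the display.
    best = to_yuv + DRM_DISPLAY_MS
    # Worst case: the frame waits a full refresh interval before it is shown.
    worst = to_yuv + refresh_interval_ms(refresh_hz)
    return to_yuv, best, worst


to_yuv, best, worst = latency_bounds_ms(60)
print(f"network to YUV: {to_yuv:.2f} ms")
print(f"best case:      {best:.2f} ms")
print(f"worst case:     {worst:.2f} ms")
```

Passing a different refresh rate, for example `latency_bounds_ms(120)`, shows how a faster screen narrows the worst-case bound while leaving the decode path unchanged.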

The performance indicators are shown in the following table:

CPU Usage    Memory Usage    Frame Processing Time
1.0%         72 MB           2.4 ms

The simple test command is as follows:

  ./demo rtsp://xxx -d 0

Playing back 32 channels of H265 1080p@30fps RTSP live streams at the same time gives the following performance indicators:

CPU Usage    Memory Usage
23.5%        2.8 GB

The simple test command is as follows:

  ./demo rtsp://xxx -d 0 -c 32
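
To put the 32-channel figures next to the single-channel ones, the totals can be divided by the channel count. This is simple arithmetic on the numbers in the two tables, not an additional measurement; it assumes 1 GB = 1024 MB and that the single-channel "72M" means 72 MB, neither of which the source states explicitly.

```python
# Figures taken from the two tables above.
CHANNELS = 32
TOTAL_CPU_PERCENT = 23.5
TOTAL_MEMORY_GB = 2.8
SINGLE_CPU_PERCENT = 1.0
SINGLE_MEMORY_MB = 72  # assumption: "72M" in the source means 72 MB

MB_PER_GB = 1024  # assumption: binary gigabytes

cpu_per_channel = TOTAL_CPU_PERCENT / CHANNELS
memory_per_channel_mb = TOTAL_MEMORY_GB * MB_PER_GB / CHANNELS

print(f"CPU per channel:    {cpu_per_channel:.2f}% (single channel: {SINGLE_CPU_PERCENT}%)")
print(f"Memory per channel: {memory_per_channel_mb:.1f} MB (single channel: {SINGLE_MEMORY_MB} MB)")
```

Under these assumptions the per-channel CPU share works out to roughly 0.73%, and the per-channel memory to roughly 89.6 MB.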

Real-time Video Streaming Transcoding And Broadcasting

This test transcodes an H265 1080p@30fps RTSP live stream into an H264 RTSP stream using the following modules:

  • RTSP client: lightweight RTSP client module; it takes about 0.03 milliseconds to fetch one frame.
  • MPP decoding: decoding module based on MPP implementation; it takes about 1.2 milliseconds to decode one frame (can be as low as 0.7 milliseconds in multi-channel mode).
  • MPP encoding: encoding module based on MPP implementation; it takes about 4.8 milliseconds to encode one frame (can be as low as 2.5 milliseconds in multi-channel mode).
  • RTSP server: lightweight RTSP server module; it takes about 0.1 milliseconds to push one frame.

As a rough estimate, the theoretical time for one video frame to pass from stream pulling, through transcoding, to stream pushing is about 6.3 milliseconds.
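
The same kind of estimate can be written out per stage, which also shows where the time goes. The sketch below sums the four stage times quoted in the list above; the dictionary keys and variable names are illustrative only. The plain sum is about 6.13 milliseconds, close to the rounded 6.3 millisecond figure, with encoding accounting for roughly 78% of it.

```python
# Per-frame stage times quoted in the module list above, in milliseconds.
STAGES_MS = {
    "RTSP client": 0.03,
    "MPP decoding": 1.2,
    "MPP encoding": 4.8,
    "RTSP server": 0.1,
}

total_ms = sum(STAGES_MS.values())
slowest_stage = max(STAGES_MS, key=STAGES_MS.get)
frame_interval_ms = 1000.0 / 30  # time budget per frame at 30 fps

for name, cost in STAGES_MS.items():
    print(f"{name:<13} {cost:5.2f} ms  {100 * cost / total_ms:5.1f}%")
print(f"total         {total_ms:5.2f} ms of a {frame_interval_ms:.2f} ms frame interval")
print(f"slowest stage: {slowest_stage}")
```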

The performance indicators are shown in the following table:

CPU Usage    Memory Usage    Frame Processing Time
1.0%         112 MB          6.3 ms

The simple test command is as follows:

  ./demo rtsp://xxx -e h264 -p 8554
  # You can use demo or other software to pull the transcoded RTSP stream: rtsp://ip:8554/live/0
