Transcoding
Data model
This section explains the ffmpeg/fffw data model in detail.
ffmpeg command line structure
Let’s look at the short command line produced by fffw in Write your first command:
ffmpeg -loglevel level+info -y \
-t 5.0 -i input.mp4 \
-filter_complex "[0:v]scale=w=1280:h=720[vout0]" \
-map "[vout0]" -c:v libx264 \
-map 0:a -c:a aac \
output.mp4
- The first section contains common ffmpeg flags:
  - -loglevel: logging setup
  - -y: overwrite mode
- The second section contains parameters related to input files:
  - -t: total input read duration
  - -i: input file name
- After that comes the -filter_complex parameter, which describes the stream processing graph. It is discussed in detail in Filter graph definition.
- The next section contains codec parameters:
  - -map: the input for this codec, either an input stream or a graph edge
  - -c:v: video codec identifier
  - -c:a: audio codec identifier
  This section usually contains a lot of codec-specific parameters, such as bitrate or number of audio channels.
- The last section is the output file definition. Usually it is just the output file name (output.mp4), but it may also contain muxer parameters.
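The sections described above can be pictured as plain Python lists concatenated in order. This is only a sketch of the command line layout, not how fffw renders arguments internally; the variable names (common, inputs, filter_graph, codecs, output) are made up for the illustration.

```python
import shlex

# Each section of the command line as its own argument list.
common = ["-loglevel", "level+info", "-y"]
inputs = ["-t", "5.0", "-i", "input.mp4"]
filter_graph = ["-filter_complex", "[0:v]scale=w=1280:h=720[vout0]"]
codecs = [
    "-map", "[vout0]", "-c:v", "libx264",  # video codec fed by a graph edge
    "-map", "0:a", "-c:a", "aac",          # audio codec fed by an input stream
]
output = ["output.mp4"]

# Sections are concatenated in the order described above.
args = ["ffmpeg"] + common + inputs + filter_graph + codecs + output

# shlex.join quotes arguments that contain shell metacharacters
# (here the bracketed graph edges).
command = shlex.join(args)
print(command)
```

Keeping the arguments as a list until the last moment avoids manual quoting of values like the filter graph, which contains characters the shell would otherwise interpret.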
Filter graph definition
ffmpeg provides a very powerful tool for video and audio stream processing:
the filter graph. This graph contains filters, which are nodes connected with named
edges.
- A filter is a node that receives one or more input streams and produces one or more output streams.
- Each stream is a sequence of frames (video or audio).
- Another node type is an input stream: it is a starting node for the graph that starts from a decoder (a thing that receives chunks of encoded video from the demuxer and decodes them into a raw image / audio sample sequence).
- The last node type is a codec: it is an output node for the graph that receives a raw video/audio stream from the filter graph, compresses it and passes it to a muxer, which writes the resulting file.
There are two syntaxes to define edges between graph nodes:
- Short syntax describes a linear sequence of filters:
  deint,crop=0:10:1920:1060,scale=1280:720
  This syntax has no named edges and means that three filters (deint, crop and scale) are applied sequentially to a single video stream.
- Full syntax describes a complicated filter graph:
  [0:v]scale=100:100[logo]; [1:v][logo]overlay=x=1800:y=100[vout0]
  This syntax has named input stream identifiers ([0:v], [1:v]) and named edges ([logo], [vout0]) to control how nodes are connected to each other and to codecs.
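To make the full syntax concrete, here is a minimal sketch that renders it from a list of nodes. FilterNode and render_graph are hypothetical names invented for this illustration; they are not part of fffw or ffmpeg.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FilterNode:
    """A hypothetical graph node: a filter with labelled inputs and outputs."""
    name: str            # ffmpeg filter name, e.g. "scale"
    params: str          # filter arguments, e.g. "100:100"
    inputs: List[str]    # input stream identifiers or edge names
    outputs: List[str]   # edge names produced by this filter

    def render(self) -> str:
        # Labels are wrapped in square brackets on both sides of the filter.
        sources = "".join(f"[{label}]" for label in self.inputs)
        targets = "".join(f"[{label}]" for label in self.outputs)
        args = f"={self.params}" if self.params else ""
        return f"{sources}{self.name}{args}{targets}"


def render_graph(nodes: List[FilterNode]) -> str:
    """Join rendered filters with ';' as in the full syntax."""
    return "; ".join(node.render() for node in nodes)


graph = [
    FilterNode("scale", "100:100", ["0:v"], ["logo"]),
    FilterNode("overlay", "x=1800:y=100", ["1:v", "logo"], ["vout0"]),
]
print(render_graph(graph))
# prints: [0:v]scale=100:100[logo]; [1:v][logo]overlay=x=1800:y=100[vout0]
```

The edge named logo appears once as an output and once as an input, which is how the two filters are connected.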
Implementation
Let’s look at how this command line structure is implemented in fffw.
Common ffmpeg flags
The FFMPEG class is responsible for rendering
common flags like overwrite or loglevel. Many other flags
are not covered by the provided implementation and should be added manually
via FFMPEG inheritance as discussed in Extending fffw.
from fffw.encoding import FFMPEG
ff = FFMPEG(overwrite=True)
Input file flags
Input files in fffw are described by
Input which stores a list of
Stream objects. When Input is a
file, Stream is a video or audio sequence in this file. An Input could
also be a capture device like x11grab or a network client like hls.
You may initialize Input directly or use
input_file helper.
Each Stream can contain metadata: information about dimensions, duration,
bitrate and other characteristics described by
VideoMeta and
AudioMeta.
For an input file you can set flags such as fast seek or input format.
from pymediainfo import MediaInfo
from fffw.encoding import *
from fffw.graph.meta import *
# detect information about input file
mi = MediaInfo.parse('input.mp4')
# initializing streams with metadata
streams = []
for track in from_media_info(mi):
    if isinstance(track, VideoMeta):
        streams.append(Stream(VIDEO, meta=track))
    else:
        streams.append(Stream(AUDIO, meta=track))
# initialize input file
source = Input(input_file='input.mp4', streams=tuple(streams))
# if no metadata is required, just use text variant
ff = FFMPEG(input='logo.png')
# add another input to ffmpeg
ff < source
Filter complex
FilterComplex hides all the
complexity of properly linking filters together. It is also responsible for
tracing metadata transformations (like a dimensions change in the Scale filter or
a duration change in Trim).
from fffw.encoding import *
ff = FFMPEG()
source = ff < input_file('input.mp4')
logo = ff < input_file('logo.png')
# pass first video stream (from source input file) as bottom
# layer to overlay filter.
overlay = ff.video | Overlay(x=1720, y=100)
# scale logo to 100x100 and pass as top layer to overlay filter
logo | Scale(width=100, height=100) | overlay
# output video with logo to destination file
output = overlay > output_file('output.mp4', VideoCodec('libx264'))
# tell ffmpeg that it'll output something to destination file
ff > output
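The metadata tracing mentioned above can be pictured with a small standalone sketch. It is not fffw's implementation: Meta, scale and trim are hypothetical stand-ins that only show the idea of each filter deriving output metadata from its input metadata.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Meta:
    """Hypothetical stand-in for video metadata (not fffw's VideoMeta)."""
    width: int
    height: int
    duration: float  # seconds


def scale(meta: Meta, width: int, height: int) -> Meta:
    """A scale-like filter changes dimensions and keeps duration."""
    return replace(meta, width=width, height=height)


def trim(meta: Meta, start: float, end: float) -> Meta:
    """A trim-like filter keeps dimensions and shortens duration."""
    end = min(end, meta.duration)  # cannot read past the end of the stream
    return replace(meta, duration=max(end - start, 0.0))


source = Meta(width=1920, height=1080, duration=60.0)
result = trim(scale(source, 1280, 720), start=10.0, end=25.0)
print(result)
```

Because every step returns new metadata, the expected dimensions and duration of the output are known before ffmpeg is run.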
Output files
ffmpeg results are defined by the
Output class, which contains
a list of Codec objects representing
video and audio streams in the destination file encoded by some codecs.
- Each codec has a -map parameter which links it either to an input stream or to a destination node in the filter graph.
- Codec defines a set of encoding parameters like bitrate or number of audio channels. These parameters are not defined by the fffw library and should be defined via inheritance as discussed in Extending fffw.
- The codec list definition is followed by a set of muxing parameters (like format) and the destination file name. These parameters are kept by the Output instance.
- FFMPEG may have multiple outputs.
from fffw.encoding import *
from fffw.graph import VIDEO
ff = FFMPEG(input='input.mp4')
split = ff.video | Split(VIDEO, output_count=4)
# define video codecs
vc1 = VideoCodec('libx264', bitrate=4_000_000)
split | Scale(1920, 1080) > vc1
vc2 = VideoCodec('libx264', bitrate=2_000_000)
split | Scale(1280, 720) > vc2
vc3 = VideoCodec('libx264', bitrate=1_000_000)
split | Scale(960, 480) > vc3
vc4 = VideoCodec('libx264', bitrate=500_000)
split | Scale(640, 360) > vc4
# add an audio codec for each quality
ac1, ac2, ac3, ac4 = [AudioCodec('aac') for _ in range(4)]
# tell ffmpeg to take single audio stream and encode
# it 4 times for each output
audio_stream = ff.audio
audio_stream > ac1
audio_stream > ac2
audio_stream > ac3
audio_stream > ac4
# define outputs as a filename with codec set
ff > output_file('full_hd.mp4', vc1, ac1)
ff > output_file('hd.mp4', vc2, ac2)
ff > output_file('middle.mp4', vc3, ac3)
ff > output_file('low.mp4', vc4, ac4)
Usage
To process something with fffw you need to:
- Create an FFMPEG instance
- Add one or more Input files to it
- If necessary, initialize a processing graph
- Add one or more Output files
- Run the command with FFMPEG.run