
Building and Using the AV1 Codec

Reference

https://aomedia.googlesource.com/aom/

Build tools and dependencies

  1. CMake

  2. Git

  3. Perl

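    • Download and install from the official site: https://www.perl.org/get.html

    • Alternatively, download from my Baidu Cloud share: https://pan.baidu.com/s/1ZSsE2nevhW7ZMMSY1QzE-Q?pwd=unel
      Extraction code: unel

      After installation, run perl -v in a terminal. If it prints the Perl version information, the installation succeeded.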

  4. yasm

    Link: https://yasm.tortall.net/Download.html

    After downloading, rename the executable to yasm.exe and add its directory to the PATH environment variable.

  5. Doxygen

    Link: https://doxygen.nl/download.html (version 1.8.10 or later is required)

  6. EMSDK

    Link: https://github.com/emscripten-core/emsdk (requires Python 3.6.0 or later)

    Open a terminal in the emsdk source directory and run the following commands:

    REM Update emsdk
    emsdk.bat update
    REM Install the latest emsdk
    emsdk.bat install latest
    REM Activate it
    emsdk.bat activate latest
    REM Set the environment variables
    emsdk_env.bat
    

    After installation, run emcc -v in a terminal. If it prints the Emscripten version information, the installation succeeded.

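    On Windows, to register the Emscripten environment variables globally, run emsdk.bat activate latest --global as administrator, or edit the system environment variables by hand. After that you no longer need to run emsdk_env.bat each time. This approach has a potential side effect: it points the environment variables at the Node.js, Python, and Java bundled with Emscripten, which may conflict with other versions of those components installed on the system.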
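Before moving on, it can help to confirm that every tool in the list above is reachable from the terminal. The following is a minimal sketch, assuming the executables are named cmake, git, perl, yasm, doxygen and emcc; those names are an assumption and may differ on your machine.

```python
import shutil

# Executable names are assumptions based on the tools listed above;
# adjust them if your installation uses different names.
REQUIRED_TOOLS = ["cmake", "git", "perl", "yasm", "doxygen", "emcc"]


def find_tools(names, which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


def missing_tools(names, which=shutil.which):
    """Return the tool names that could not be found on PATH, in input order."""
    found = find_tools(names, which)
    return [name for name in names if found[name] is None]


def report(names=REQUIRED_TOOLS):
    """Print one line per tool with its path or NOT FOUND."""
    for tool, path in find_tools(names).items():
        print(f"{tool:8s} {path or 'NOT FOUND'}")
```

Calling report() prints one line per tool. Anything reported as NOT FOUND usually means the tool's directory has not been added to PATH yet.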

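Download the source code

  • Official repository: git clone https://aomedia.googlesource.com/aom

  • Even with a VPN, I was never able to clone from the official link, so I went to Google's AV1 discussion group (https://groups.google.com/a/aomedia.org/g/av1-discuss) and found a release tarball: https://storage.googleapis.com/aom-releases/libaom-3.4.0.tar.gz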
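If you go the tarball route, the download and unpacking can be scripted. This is a rough sketch using only the Python standard library. The URL is the one given above; the local file name and the output directory in the usage note are illustrative choices, not anything libaom requires.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the text above.
LIBAOM_URL = "https://storage.googleapis.com/aom-releases/libaom-3.4.0.tar.gz"


def download(url, dest):
    """Download url to dest unless the file already exists; return the path."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive, out_dir):
    """Extract a .tar.gz archive into out_dir and return the member names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(out_dir)
    return names
```

For example, extract(download(LIBAOM_URL, "libaom-3.4.0.tar.gz"), "aom") would leave the sources under a directory named aom (a hypothetical name), which you can then point CMake at.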

Build

Generate the Visual Studio solution with CMake

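Open the CMake GUI, set the source path and the build (output) path, then click the Configure button.

Select your Visual Studio version.

Adjust the configuration by hand if needed (leaving the defaults also works), then click the Generate button.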

Generation succeeded if the CMake log ends with two lines such as "Configuring done" and "Generating done".
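The same configure-and-generate step can also be driven from a script instead of the GUI. The sketch below only assembles the cmake command line. The generator name "Visual Studio 17 2022" is an assumption and must match the Visual Studio version you have installed, the -S/-B options need CMake 3.13 or later, and the directory names and the ENABLE_DOCS option in the usage note are illustrative.

```python
import subprocess


def cmake_configure_args(source_dir, build_dir,
                         generator="Visual Studio 17 2022", options=None):
    """Build the argument list for a CMake configure-and-generate run."""
    args = ["cmake", "-S", source_dir, "-B", build_dir, "-G", generator]
    # Each cache option becomes a -DNAME=VALUE argument.
    for key, value in (options or {}).items():
        args.append(f"-D{key}={value}")
    return args


def run(args):
    """Run a command and raise CalledProcessError if it exits with an error."""
    subprocess.run(args, check=True)
```

For instance, run(cmake_configure_args("aom", "build", options={"ENABLE_DOCS": 0})) would configure the sources in aom into a build directory while skipping the documentation target.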

Compile

  1. Go to the build directory and open AOM.sln

  2. Set aomenc/aomdec as the startup project, then right-click it and choose Build
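Building from the IDE is not the only option: cmake --build can compile the generated solution from a terminal. A small sketch of the command line follows, assuming the build directory is named build and that a Release configuration is wanted (both are assumptions). Passing several names after --target needs CMake 3.15 or later.

```python
def cmake_build_args(build_dir, config="Release", targets=("aomenc", "aomdec")):
    """Build the argument list for `cmake --build`, limited to the given targets."""
    args = ["cmake", "--build", build_dir, "--config", config]
    if targets:
        # Only the listed targets (and their dependencies) are built.
        args += ["--target", *targets]
    return args


def as_command_line(args):
    """Join the arguments for display; quoting is left to the reader's shell."""
    return " ".join(args)
```

as_command_line(cmake_build_args("build")) gives a line you can paste into a terminal, or the list can be passed to subprocess.run directly.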

Usage

Encoding example:

aomenc.exe --fps=30/1 --width=320 --height=256 --max-q=22 --psnr --obu --bit-depth=8 -y -o str.bin E:\Sequence\foreman-cif_320x256.yuv
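When the encoder is launched from a script, building the arguments as a list keeps every option in its own element, so two options cannot accidentally run together. The sketch below reproduces the options from the example above; the function name and its defaults are illustrative, and aomenc.exe is assumed to be on PATH or in the current directory.

```python
import subprocess


def aomenc_args(src, dst, width, height, fps="30/1", max_q=22, bit_depth=8):
    """Argument list for the raw-YUV to OBU encode shown above."""
    return [
        "aomenc.exe",
        f"--fps={fps}",
        f"--width={width}",        # raw YUV has no header, so the size is required
        f"--height={height}",
        f"--max-q={max_q}",
        "--psnr",
        "--obu",
        f"--bit-depth={bit_depth}",
        "-y",                      # show warnings without prompting
        "-o", dst,
        src,
    ]


def encode(src, dst, width, height, **kwargs):
    """Run aomenc and raise CalledProcessError if it exits with an error."""
    subprocess.run(aomenc_args(src, dst, width, height, **kwargs), check=True)
```

encode(r"E:\Sequence\foreman-cif_320x256.yuv", "str.bin", 320, 256) would run the same encode as the command line above.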

Decoding example:

aomdec.exe --i420 -o rec.yuv str.bin
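Once rec.yuv exists, it can be compared with the original input as a quick sanity check. The following is a minimal sketch, assuming both files are 8-bit raw I420 of the same resolution. It computes a per-frame luma PSNR in pure Python, which is slow but has no dependencies, and its numbers need not match aomenc's own PSNR report exactly.

```python
import math


def i420_frame_bytes(width, height):
    """Bytes per 8-bit I420 frame: a full-size Y plane plus two subsampled chroma planes."""
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    return width * height + 2 * chroma_w * chroma_h


def psnr(a, b, peak=255):
    """PSNR in dB between two equal-length byte sequences; inf if they are identical."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return math.inf if mse == 0 else 10 * math.log10(peak * peak / mse)


def luma_psnr_per_frame(ref_path, rec_path, width, height):
    """Yield the Y-plane PSNR for each pair of frames in two raw I420 files."""
    frame = i420_frame_bytes(width, height)
    luma = width * height
    with open(ref_path, "rb") as ref, open(rec_path, "rb") as rec:
        while True:
            a, b = ref.read(frame), rec.read(frame)
            if len(a) < frame or len(b) < frame:
                break  # stop at the first incomplete frame
            yield psnr(a[:luma], b[:luma])
```

For the 320x256 sequence used above, i420_frame_bytes(320, 256) is 122880, so the input file size divided by that number gives the frame count.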

All available encoder options:

Options:
            --help                      Show usage options and exit
  -c <arg>, --cfg=<arg>                 Config file to use
  -D,       --debug                     Debug mode (makes output deterministic)
  -o <arg>, --output=<arg>              Output filename
            --codec=<arg>               Codec to use
  -p <arg>, --passes=<arg>              Number of passes (1/2/3)
            --pass=<arg>                Pass to execute (1/2/3)
            --fpf=<arg>                 First pass statistics file name
            --limit=<arg>               Stop encoding after n input frames
            --skip=<arg>                Skip the first n input frames
            --good                      Use Good Quality Deadline
            --rt                        Use Realtime Quality Deadline
            --allintra                  Use all intra mode
  -q,       --quiet                     Do not print encode progress
  -v,       --verbose                   Show encoder parameters
            --psnr=<arg>                Show PSNR in status line (0: Disable PSNR status line display, 1: PSNR calculated using input bit-depth (default), 2: PSNR calculated using stream bit-depth); takes default option when arguments are not specified
            --webm                      Output WebM (default when WebM IO is enabled)
            --ivf                       Output IVF
            --obu                       Output OBU
            --q-hist=<arg>              Show quantizer histogram (n-buckets)
            --rate-hist=<arg>           Show rate histogram (n-buckets)
            --disable-warnings          Disable warnings about potentially incorrect encode settings
  -y,       --disable-warning-prompt    Display warnings, but do not prompt user to continue
            --test-decode=<arg>         Test encode/decode mismatch
                                          off, fatal, warn

Encoder Global Options:
            --nv12                      Input file is NV12
            --yv12                      Input file is YV12
            --i420                      Input file is I420 (default)
            --i422                      Input file is I422
            --i444                      Input file is I444
  -u <arg>, --usage=<arg>               Usage profile number to use (0: good, 1: rt, 2: allintra)
  -t <arg>, --threads=<arg>             Max number of threads to use
            --profile=<arg>             Bitstream profile number to use
  -w <arg>, --width=<arg>               Frame width
  -h <arg>, --height=<arg>              Frame height
            --forced_max_frame_width=<arg>
                                        Maximum frame width value to force
            --forced_max_frame_height=<arg>
                                        Maximum frame height value to force
            --stereo-mode=<arg>         Stereo 3D video format
                                          mono, left-right, bottom-top, top-bottom, right-left
            --timebase=<arg>            Output timestamp precision (fractional seconds)
            --fps=<arg>                 Stream frame rate (rate/scale)
            --global-error-resilient=<arg>
                                        Enable global error resiliency features
  -b <arg>, --bit-depth=<arg>           Bit depth for codec
                                          8, 10, 12
            --input-bit-depth=<arg>     Bit depth of input
            --lag-in-frames=<arg>       Max number of frames to lag
            --large-scale-tile=<arg>    Large scale tile coding (0: off (default), 1: on (ivf output only))
            --monochrome                Monochrome video (no chroma planes)
            --full-still-picture-hdr    Use full header for still picture
            --use-16bit-internal        Force use of 16-bit pipeline
            --annexb=<arg>              Save as Annex-B

Rate Control Options:
            --drop-frame=<arg>          Temporal resampling threshold (buf %)
            --resize-mode=<arg>         Frame resize mode
            --resize-denominator=<arg>  Frame resize denominator
            --resize-kf-denominator=<arg>
                                        Frame resize keyframe denominator
            --superres-mode=<arg>       Frame super-resolution mode
            --superres-denominator=<arg>
                                        Frame super-resolution denominator
            --superres-kf-denominator=<arg>
                                        Frame super-resolution keyframe denominator
            --superres-qthresh=<arg>    Frame super-resolution qindex threshold
            --superres-kf-qthresh=<arg> Frame super-resolution keyframe qindex threshold
            --end-usage=<arg>           Rate control mode
                                          vbr, cbr, cq, q
            --target-bitrate=<arg>      Bitrate (kbps)
            --min-q=<arg>               Minimum (best) quantizer
            --max-q=<arg>               Maximum (worst) quantizer
            --undershoot-pct=<arg>      Datarate undershoot (min) target (%)
            --overshoot-pct=<arg>       Datarate overshoot (max) target (%)
            --buf-sz=<arg>              Client buffer size (ms)
            --buf-initial-sz=<arg>      Client initial buffer size (ms)
            --buf-optimal-sz=<arg>      Client optimal buffer size (ms)
            --bias-pct=<arg>            CBR/VBR bias (0=CBR, 100=VBR)
            --minsection-pct=<arg>      GOP min bitrate (% of target)
            --maxsection-pct=<arg>      GOP max bitrate (% of target)

Keyframe Placement Options:
            --enable-fwd-kf=<arg>       Enable forward reference keyframes
            --kf-min-dist=<arg>         Minimum keyframe interval (frames)
            --kf-max-dist=<arg>         Maximum keyframe interval (frames)
            --disable-kf                Disable keyframe placement
            --sframe-dist=<arg>         S-Frame interval (frames)
            --sframe-mode=<arg>         S-Frame insertion mode (1..2)

AV1 Specific Options:
            --cpu-used=<arg>            Speed setting (0..6 in good mode, 5..10 in realtime mode, 0..9 in all intra mode)
            --auto-alt-ref=<arg>        Enable automatic alt reference frames
            --sharpness=<arg>           Bias towards block sharpness in rate-distortion optimization of transform coefficients (0..7), default is 0
            --static-thresh=<arg>       Motion detection threshold
            --row-mt=<arg>              Enable row based multi-threading (0: off, 1: on (default))
            --tile-columns=<arg>        Number of tile columns to use, log2
            --tile-rows=<arg>           Number of tile rows to use, log2
            --enable-tpl-model=<arg>    RDO based on frame temporal dependency (0: off, 1: backward source based); required for deltaq mode
            --enable-keyframe-filtering=<arg>
                                        Apply temporal filtering on key frame (0: no filter, 1: filter without overlay (default), 2: filter with overlay - experimental, may break random access in players)
            --arnr-maxframes=<arg>      AltRef max frames (0..15)
            --arnr-strength=<arg>       AltRef filter strength (0..6)
            --tune=<arg>                Distortion metric tuned with
                                          psnr, ssim, vmaf_with_preprocessing, vmaf_without_preprocessing, vmaf, vmaf_neg, butteraugli
            --cq-level=<arg>            Constant/Constrained Quality level
            --max-intra-rate=<arg>      Max I-frame bitrate (pct)
            --max-inter-rate=<arg>      Max P-frame bitrate (pct)
            --gf-cbr-boost=<arg>        Boost for Golden Frame in CBR mode (pct)
            --lossless=<arg>            Lossless mode (0: false (default), 1: true)
            --enable-cdef=<arg>         Enable the constrained directional enhancement filter (0: false, 1: true (default), 2: disable for non-reference frames)
            --enable-restoration=<arg>  Enable the loop restoration filter (0: false (default in realtime mode), 1: true (default in non-realtime mode))
            --enable-rect-partitions=<arg>
                                        Enable rectangular partitions (0: false, 1: true (default))
            --enable-ab-partitions=<arg>
                                        Enable ab partitions (0: false, 1: true (default))
            --enable-1to4-partitions=<arg>
                                        Enable 1:4 and 4:1 partitions (0: false, 1: true (default))
            --min-partition-size=<arg>  Set min partition size (4:4x4, 8:8x8, 16:16x16, 32:32x32, 64:64x64, 128:128x128); with 4k+ resolutions or higher speed settings, min partition size will have a minimum of 8
            --max-partition-size=<arg>  Set max partition size (4:4x4, 8:8x8, 16:16x16, 32:32x32, 64:64x64, 128:128x128)
            --enable-dual-filter=<arg>  Enable dual filter (0: false, 1: true (default))
            --enable-chroma-deltaq=<arg>
                                        Enable chroma delta quant (0: false (default), 1: true)
            --enable-intra-edge-filter=<arg>
                                        Enable intra edge filtering (0: false, 1: true (default))
            --enable-order-hint=<arg>   Enable order hint (0: false, 1: true (default))
            --enable-tx64=<arg>         Enable 64-pt transform (0: false, 1: true (default))
            --enable-flip-idtx=<arg>    Enable extended transform type (0: false, 1: true (default)) including FLIPADST_DCT, DCT_FLIPADST, FLIPADST_FLIPADST, ADST_FLIPADST, FLIPADST_ADST, IDTX, V_DCT, H_DCT, V_ADST, H_ADST, V_FLIPADST, H_FLIPADST
            --enable-rect-tx=<arg>      Enable rectangular transform (0: false, 1: true (default))
            --enable-dist-wtd-comp=<arg>
                                        Enable distance-weighted compound (0: false, 1: true (default))
            --enable-masked-comp=<arg>  Enable masked (wedge/diff-wtd) compound (0: false, 1: true (default))
            --enable-onesided-comp=<arg>
                                        Enable one sided compound (0: false, 1: true (default))
            --enable-interintra-comp=<arg>
                                        Enable interintra compound (0: false, 1: true (default))
            --enable-smooth-interintra=<arg>
                                        Enable smooth interintra mode (0: false, 1: true (default))
            --enable-diff-wtd-comp=<arg>
                                        Enable difference-weighted compound (0: false, 1: true (default))
            --enable-interinter-wedge=<arg>
                                        Enable interinter wedge compound (0: false, 1: true (default))
            --enable-interintra-wedge=<arg>
                                        Enable interintra wedge compound (0: false, 1: true (default))
            --enable-global-motion=<arg>
                                        Enable global motion (0: false, 1: true (default))
            --enable-warped-motion=<arg>
                                        Enable local warped motion (0: false, 1: true (default))
            --enable-filter-intra=<arg> Enable filter intra prediction mode (0: false, 1: true (default))
            --enable-smooth-intra=<arg> Enable smooth intra prediction modes (0: false, 1: true (default))
            --enable-paeth-intra=<arg>  Enable Paeth intra prediction mode (0: false, 1: true (default))
            --enable-cfl-intra=<arg>    Enable chroma from luma intra prediction mode (0: false, 1: true (default))
            --enable-diagonal-intra=<arg>
                                        Enable diagonal (D45 to D203) intra prediction modes, which are a subset of directional modes; has no effect if enable-directional-intra is 0 (0: false, 1: true (default))
            --force-video-mode=<arg>    Force video mode (0: false, 1: true (default))
            --enable-obmc=<arg>         Enable OBMC (0: false, 1: true (default))
            --enable-overlay=<arg>      Enable coding overlay frames (0: false, 1: true (default))
            --enable-palette=<arg>      Enable palette prediction mode (0: false, 1: true (default))
            --enable-intrabc=<arg>      Enable intra block copy prediction mode (0: false, 1: true (default))
            --enable-angle-delta=<arg>  Enable intra angle delta (0: false, 1: true (default))
            --disable-trellis-quant=<arg>
                                        Disable trellis optimization of quantized coefficients (0: false 1: true  2: true for rd search 3: true for estimate yrd search (default))
            --enable-qm=<arg>           Enable quantisation matrices (0: false (default), 1: true)
            --qm-min=<arg>              Min quant matrix flatness (0..15), default is 8
            --qm-max=<arg>              Max quant matrix flatness (0..15), default is 15
            --reduced-tx-type-set=<arg> Use reduced set of transform types
            --use-intra-dct-only=<arg>  Use DCT only for INTRA modes
            --use-inter-dct-only=<arg>  Use DCT only for INTER modes
            --use-intra-default-tx-only=<arg>
                                        Use Default-transform only for INTRA modes
            --quant-b-adapt=<arg>       Use adaptive quantize_b
            --coeff-cost-upd-freq=<arg> Update freq for coeff costs. 0: SB, 1: SB Row per Tile, 2: Tile, 3: Off
            --mode-cost-upd-freq=<arg>  Update freq for mode costs. 0: SB, 1: SB Row per Tile, 2: Tile, 3: Off
            --mv-cost-upd-freq=<arg>    Update freq for mv costs. 0: SB, 1: SB Row per Tile, 2: Tile, 3: Off
            --frame-parallel=<arg>      Enable frame parallel decodability features (0: false (default), 1: true)
            --error-resilient=<arg>     Enable error resilient features (0: false (default), 1: true)
            --aq-mode=<arg>             Adaptive quantization mode (0: off (default), 1: variance 2: complexity, 3: cyclic refresh)
            --deltaq-mode=<arg>         Delta qindex mode (0: off, 1: deltaq objective (default), 2: deltaq placeholder, 3: key frame visual quality, 4: user rating based visual quality optimization); requires --enable-tpl-model=1
            --deltaq-strength=<arg>     Deltaq strength for --deltaq-mode=4 (%)
            --delta-lf-mode=<arg>       Enable delta-lf-mode (0: off (default), 1: on)
            --frame-boost=<arg>         Enable frame periodic boost (0: off (default), 1: on)
            --noise-sensitivity=<arg>   Noise sensitivity (frames to blur)
            --tune-content=<arg>        Tune content type
                                          default, screen, film
            --cdf-update-mode=<arg>     CDF update mode for entropy coding (0: no CDF update, 1: update CDF on all frames (default), 2: selectively update CDF on some frames)
            --color-primaries=<arg>     Color primaries (CICP) of input content:
                                          bt709, unspecified, bt601, bt470m, bt470bg, smpte240, film, bt2020, xyz, smpte431, smpte432, ebu3213
            --transfer-characteristics=<arg>
                                        Transfer characteristics (CICP) of input content:
                                          unspecified, bt709, bt470m, bt470bg, bt601, smpte240, lin, log100, log100sq10, iec61966, bt1361, srgb, bt2020-10bit, bt2020-12bit, smpte2084, hlg, smpte428
            --matrix-coefficients=<arg> Matrix coefficients (CICP) of input content:
                                          identity, bt709, unspecified, fcc73, bt470bg, bt601, smpte240, ycgco, bt2020ncl, bt2020cl, smpte2085, chromncl, chromcl, ictcp
            --chroma-sample-position=<arg>
                                        The chroma sample position when chroma 4:2:0 is signaled:
                                          unknown, vertical, colocated
            --min-gf-interval=<arg>     Min gf/arf frame interval (default 0, indicating in-built behavior)
            --max-gf-interval=<arg>     Max gf/arf frame interval (default 0, indicating in-built behavior)
            --gf-min-pyr-height=<arg>   Min height for GF group pyramid structure (0 (default) to 5)
            --gf-max-pyr-height=<arg>   Maximum height for GF group pyramid structure (0 to 5 (default))
            --sb-size=<arg>             Superblock size to use
                                          dynamic, 64, 128
            --num-tile-groups=<arg>     Maximum number of tile groups, default is 1
            --mtu-size=<arg>            MTU size for a tile group, default is 0 (no MTU targeting), overrides maximum number of tile groups
            --timing-info=<arg>         Signal timing info in the bitstream (model only works for no hidden frames, no super-res yet):
                                          unspecified, constant, model
            --film-grain-test=<arg>     Film grain test vectors (0: none (default), 1: test-1  2: test-2, ... 16: test-16)
            --film-grain-table=<arg>    Path to file containing film grain parameters
            --denoise-noise-level=<arg> Amount of noise (from 0 = don't denoise, to 50)
            --denoise-block-size=<arg>  Denoise block size (default = 32)
            --enable-dnl-denoising=<arg>
                                        Apply denoising to the frame being encoded when denoise-noise-level is enabled (0: false, 1: true (default))
            --max-reference-frames=<arg>
                                        Maximum number of reference frames allowed per frame (3 to 7 (default))
            --reduced-reference-set=<arg>
                                        Use reduced set of single and compound references (0: off (default), 1: on)
            --enable-ref-frame-mvs=<arg>
                                        Enable temporal mv prediction (default is 1)
            --target-seq-level-idx=<arg>
                                        Target sequence level index. Possible values are in the form of "ABxy". AB: Operating point (OP) index, xy: Target level index for the OP. E.g. "0" means target level index 0 (2.0) for the 0th OP, "1019" means target level index 19 (6.3) for the 10th OP.
            --set-tier-mask=<arg>       Set bit mask to specify which tier each of the 32 possible operating points conforms to. Bit value 0 (default): Main Tier, 1: High Tier.
            --min-cr=<arg>              Set minimum compression ratio. Take integer values. Default is 0. If non-zero, encoder will try to keep the compression ratio of each frame to be higher than the given value divided by 100.
            --vbr-corpus-complexity-lap=<arg>
                                        Set average corpus complexity per mb for single pass VBR using lap. (0..10000), default is 0
            --input-chroma-subsampling-x=<arg>
                                        Chroma subsampling x value
            --input-chroma-subsampling-y=<arg>
                                        Chroma subsampling y value
            --dv-cost-upd-freq=<arg>    Update freq for dv costs. 0: SB, 1: SB Row per Tile, 2: Tile, 3: Off
            --partition-info-path=<arg> Partition information read and write path
            --enable-directional-intra=<arg>
                                        Enable directional intra prediction modes (0: false, 1: true (default))
            --enable-tx-size-search=<arg>
                                        Enable transform size search to find the best size for each block. If false, transforms always have the largest possible size (0: false, 1: true (default))
            --loopfilter-control=<arg>  Control loop filtering (0: Loopfilter disabled for all frames, 1: Enable loopfilter for all frames (default), 2: Disable loopfilter for non-reference frames, 3: Disable loopfilter for frames with low motion)
            --auto-intra-tools-off=<arg>
                                        Automatically turn off several intra coding tools for allintra mode; only in effect if --deltaq-mode=3
  -p <arg>, --passes=<arg>              Number of passes (1/2/3)
            --two-pass-output=<arg>     The output file for the first two passes for three-pass encoding
  -spf <arg>, --second-pass-log=<arg>   Log file from second pass
            --fwd-kf-dist=<arg>         Set distance between forward keyframes. A value of -1 (default) means no repetitive forward keyframes.
            --strict-level-conformance=<arg>
                                        When set to 1, exit the encoder when it fails to encode to a given target level
            --dist-metric=<arg>         Distortion metric to use for in-block optimization
                                          psnr, qm-psnr
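
The "ABxy" encoding described for --target-seq-level-idx is easier to follow with a worked example. The sketch below only reproduces the arithmetic behind the two examples in the help text ("0" and "1019"). The function names are illustrative, and no range checking is done on the level index.

```python
def parse_target_seq_level_idx(value):
    """Split an "ABxy" value into (operating_point_index, level_index)."""
    number = int(value)
    if number < 0:
        raise ValueError("value must be non-negative")
    # The last two decimal digits are the level index; the rest is the operating point.
    return number // 100, number % 100


def level_name(level_index):
    """Map a level index to its "major.minor" name, e.g. 0 -> "2.0" and 19 -> "6.3"."""
    return f"{2 + level_index // 4}.{level_index % 4}"
```

With these, parse_target_seq_level_idx("1019") gives (10, 19) and level_name(19) gives "6.3", matching the help text.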

Stream timebase (--timebase):
  The desired precision of timestamps in the output, expressed
  in fractional seconds. Default is 1/1000.
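
To see how --fps and --timebase relate, the sketch below converts a frame index into a timestamp expressed in timebase ticks. It illustrates the arithmetic only, under the assumption that timestamps are rounded down; it is not taken from the encoder's source.

```python
from fractions import Fraction


def parse_ratio(text):
    """Parse a "num/den" string such as "30/1" or "1/1000" into a Fraction."""
    num, den = text.split("/")
    return Fraction(int(num), int(den))


def frame_timestamp(frame_index, fps, timebase=Fraction(1, 1000)):
    """Start time of a frame in timebase ticks, rounded down."""
    seconds = Fraction(frame_index) / fps   # exact start time in seconds
    return int(seconds / timebase)          # number of whole ticks
```

At 30/1 fps with the default 1/1000 timebase, frame 1 starts at tick 33 and frame 30 at tick 1000.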

Added: 2022-06-29 19:00:05  Updated: 2022-06-29 19:03:18
 