WebM File Format Standard
Original English document: http://www.webmproject.org/docs/container/#?
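I. Overview
WebM is a digital multimedia container file format put forward by the WebM Project. It is based on a subset of the Matroska container file format specification.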
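II. Goals
1. Short-term goals
(1) Choose a container format well suited to the VP8 codec, to improve the open web.
(2) Make it easier for web content providers to create and publish VP8 video.
2. Long-term goal
Encourage wide adoption of this open-source format, so that users can conveniently use it anywhere.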
III. Naming
| Property | Value |
| Container format name | WebM |
| File extension | .webm |
| MIME type | video/webm |
| Audio-only MIME type | audio/webm |
| Video codec | VP8 |
| Audio codec | Vorbis |
IV. HTML5 Video Type Parameters
1. Video codec
VP8: vp8.X denotes the VP8 codec with bitstream version X. The only VP8 bitstream version at present is 0, and the FourCC is VP8X, which matches vp8.X.
2. Audio codec
Vorbis: files that contain only audio use the MIME type "audio/webm".
3. The canPlayType function
(1) canPlayType('video/webm') should return "maybe"
(2) canPlayType('audio/webm') should return "maybe"
(3) canPlayType('video/webm; codecs="vp8, vorbis"') should return "probably"
(4) canPlayType('video/webm; codecs="vp8.0, vorbis"') should return "probably"
(5) canPlayType('audio/webm; codecs="vorbis"') should return "probably"
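The expectations listed here follow a simple rule: a bare WebM MIME type yields "maybe", while a type whose codecs parameter names only known WebM codecs yields "probably". The Python sketch below models that rule for illustration. The function `expected_can_play_type` and the `KNOWN_CODECS` set are hypothetical names; they are not part of any browser API, and a real client may answer differently.

```python
# Hypothetical model of the canPlayType expectations for WebM types.
KNOWN_CODECS = {"vp8", "vp8.0", "vorbis"}  # assumption: codecs named in this document
WEBM_TYPES = {"video/webm", "audio/webm"}


def expected_can_play_type(type_string: str) -> str:
    """Return the answer a WebM-capable client would be expected to give."""
    mime, _, params = type_string.partition(";")
    if mime.strip().lower() not in WEBM_TYPES:
        return ""  # not a WebM type
    params = params.strip()
    if not params:
        return "maybe"  # container is known, codecs are not stated
    name, _, value = params.partition("=")
    if name.strip().lower() != "codecs":
        return "maybe"
    codecs = [c.strip().lower() for c in value.strip().strip('"').split(",")]
    return "probably" if all(c in KNOWN_CODECS for c in codecs) else ""
```

For example, `expected_can_play_type('video/webm; codecs="vp8, vorbis"')` evaluates to "probably" under this model.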
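V. Smart Clients
One of the main goals is to give content creators advanced playback capabilities, such as fast seek and fast start, using nothing more than an HTTP server. To achieve this, content creators should follow the guidelines below.
VI. WebM Guidelines
These guidelines currently target file streaming over HTTP connections, and they point out where they relate to the Matroska Specification.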
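1. Demuxer and Muxer Guidelines
(1) The DocType element should be "webm".
(2) The video codec should be VP8; the CodecID is "V_VP8" and there is no CodecPrivate data.
(3) The audio codec should be Vorbis. The WebM Project will develop detailed guidance on using Vorbis in the WebM format, covering profile, bitrate, channels, and so on.
(4) The initial WebM format does not support subtitles. WHATWG/W3C RFC guidance on subtitles and overlays in HTML5 <video> is expected in the near future.
(5) DocTypeReadVersion should follow the Matroska Specification; for example, a file that contains v2 elements should have a DocTypeReadVersion of 2.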
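2. Muxer Guidelines
(1) A WebM file should contain a SeekHead element, so that the client knows the file contains a Cues element.
(2) A WebM file should contain a keyframe-only Cues element. The Cues element should list video keyframes only, which keeps the file header small. Placing the Cues element before all Clusters is recommended, so that the client can seek to a point that has not yet been downloaded.
(3) The absolute timecodes (block + cluster) must be monotonically increasing. All timecodes refer to the start time of a block.
(4) TimecodeScale should be set to its default value of 1,000,000 nanoseconds. Block timecodes within a Cluster can then take positive values up to 32.767 seconds.
(5) Keyframes should be placed at the start of a cluster, which makes seeking easier and faster for the client.
(6) Audio blocks that contain the timecode of a video keyframe should be in the same cluster as that keyframe.
(7) Audio blocks with the same absolute timecode as video blocks should be placed before the video blocks.
(8) WebM files must support only pixels for the DisplayUnit element.
(9) VP8 frames should be muxed in SimpleBlock elements.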
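The timecode rules in this list can be checked with a little arithmetic. A block stores its timecode as a signed 16-bit offset from the cluster timecode, so with the default TimecodeScale the largest positive offset is 32767 ticks of one millisecond each. The Python sketch below is illustrative only; the function names are hypothetical.

```python
TIMECODE_SCALE_NS = 1_000_000  # default TimecodeScale: one tick is one millisecond
MAX_BLOCK_OFFSET = 32767       # largest positive signed 16-bit value


def absolute_timecode_ns(cluster_timecode: int, block_offset: int,
                         scale_ns: int = TIMECODE_SCALE_NS) -> int:
    """Absolute block time in nanoseconds: (cluster + block) * TimecodeScale."""
    return (cluster_timecode + block_offset) * scale_ns


def is_monotonic(timecodes_ns: list[int]) -> bool:
    """True when absolute timecodes never decrease from one block to the next."""
    return all(a <= b for a, b in zip(timecodes_ns, timecodes_ns[1:]))


max_offset_seconds = MAX_BLOCK_OFFSET * TIMECODE_SCALE_NS / 1e9
print(max_offset_seconds)  # 32.767
```

A muxer could run a check like `is_monotonic` over the blocks of each track before writing a cluster, and start a new cluster whenever an offset would exceed `MAX_BLOCK_OFFSET`.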
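3. VP8 Alternate Reference Frames
When alternate reference frames are enabled, the VP8 encoder inserts an alternate reference frame (AR) into its output, ahead of the frames that depend on it. At most one such frame is added between I/P frames. The dependent frame (D) is usually a P frame. The codec SDK marks the alternate reference frame as invisible: it must be decoded before the dependent frame, but it is not output.
The timestamp of the AR should be set as close as possible to the timestamp of the previous frame.
Encoding example:
| Input | F0 | (none) | F1 |
| Output | I/P | AR | D |
| PTS | 0 | 1 | 2 |
Decoding example:
| Input | I/P | AR | D |
| Output | F0 | (none) | F1 |
| PTS | 0 | (none) | 2 |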
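The decoding behavior in these tables can be sketched as a loop that decodes every frame in order but emits only the visible ones. The `Frame` class and `shown_pts` function below are hypothetical stand-ins for illustration; they do not call a real VP8 decoder.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    kind: str        # "I/P", "AR" or "D", as in the tables
    pts: int
    invisible: bool  # True for alternate reference frames


def shown_pts(frames: list[Frame]) -> list[int]:
    """Walk the frames in decode order and collect the PTS of each shown frame."""
    output = []
    for frame in frames:
        # An AR frame is decoded so that later frames can reference it,
        # yet it produces no output picture.
        if not frame.invisible:
            output.append(frame.pts)
    return output


stream = [Frame("I/P", 0, False), Frame("AR", 1, True), Frame("D", 2, False)]
print(shown_pts(stream))  # [0, 2]
```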
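4. Demuxer Guidelines
(1) A demuxer must open only files whose DocType is "webm".
(2) Once the demuxer has determined that the header and metadata of a WebM file are valid and the player has started playing the file, the demuxer must make its best effort to parse the file so that playback proceeds as correctly as possible.
(3) Seeking is not currently possible in a WebM file that has no Cues element. The WebM Project is, however, considering support for seeking without a Cues element.
(4) If the Cues element is not located at the beginning of the file, retrieving it is deferred so that playback can start as soon as possible.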
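5. WebVTT Guidelines
(1) Storing WebVTT data in a WebM track
The content of a WebVTT file is stored as a track in the WebM file. Information that would otherwise appear as attributes of the HTML5 track tag can be embedded in the WebM track element as follows:
a. The TrackType sub-element is 0x11 for WebVTT SUBTITLES and CAPTIONS, and 0x21 for WebVTT DESCRIPTIONS and METADATA.
b. The label attribute is stored in the Name sub-element.
c. The srclang attribute is stored in the Language sub-element.
The CodecID for WebVTT is "D_WEBVTT/kind", where kind is one of SUBTITLES, CAPTIONS, DESCRIPTIONS, or METADATA.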
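The mapping described in this subsection can be written down as a small lookup. In the Python sketch below, the returned dictionary is a hypothetical stand-in for a TrackEntry, not the output of a real muxer library; only the TrackType values and the CodecID pattern come from the text.

```python
# TrackType values and CodecID pattern as described for WebVTT tracks.
TRACK_TYPE_BY_KIND = {
    "SUBTITLES": 0x11,
    "CAPTIONS": 0x11,
    "DESCRIPTIONS": 0x21,
    "METADATA": 0x21,
}


def webvtt_track_entry(kind: str, label: str, srclang: str) -> dict:
    """Build a plain-dict stand-in for a WebM TrackEntry (hypothetical shape)."""
    kind = kind.upper()
    if kind not in TRACK_TYPE_BY_KIND:
        raise ValueError(f"unsupported WebVTT kind: {kind}")
    return {
        "TrackType": TRACK_TYPE_BY_KIND[kind],
        "CodecID": f"D_WEBVTT/{kind}",
        "Name": label,        # from the label attribute
        "Language": srclang,  # from the srclang attribute
    }
```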
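(2) Storing WebVTT cues in WebM Blocks
WebVTT cues are stored in the track as the data of Block elements, in the format described below. All WebVTT data must be encoded as UTF-8 before being stored in a WebM block. The block's timestamp and duration are derived from the start time and end time of the WebVTT cue.
If the WebVTT cue has a WebVTT cue identifier, the identifier is written to the WebM block, followed by a WebVTT line terminator. If the cue has no identifier, only the WebVTT line terminator is written to the block. The resulting empty line indicates that the original WebVTT cue had no cue identifier.
WebVTT cue timings are not stored in the WebM block; the start and end times of the cue are reconstructed from the start time and duration of the WebM Block.
If the WebVTT cue has WebVTT cue settings, the settings are written to the WebM block, followed by a WebVTT line terminator. If the cue has no settings, only the WebVTT line terminator is written to the block.
Finally, the WebVTT cue payload is written to the block.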
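The block layout described in this subsection is an identifier line, a settings line, and then the cue payload, with the timing carried by the block itself. The Python sketch below shows one way to serialize a cue under the assumption that a single LF is used as the line terminator; the function names are hypothetical.

```python
LINE_TERMINATOR = "\n"  # assumption: LF is used as the WebVTT line terminator


def webvtt_block_payload(identifier: str, settings: str, payload: str) -> bytes:
    """Serialize one cue as: identifier line, settings line, then cue payload."""
    # An empty identifier or settings string still contributes its terminator,
    # so a reader can tell the three parts apart by position.
    text = identifier + LINE_TERMINATOR + settings + LINE_TERMINATOR + payload
    return text.encode("utf-8")


def block_timing(start_ms: int, end_ms: int) -> tuple[int, int]:
    """Cue timings live on the block: returns (timestamp, duration)."""
    return start_ms, end_ms - start_ms
```

A cue with neither identifier nor settings therefore begins with two line terminators, which is how a demuxer can tell that both were absent.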
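(3) WebVTT chapter cues
WebVTT chapter cues are used for navigation and are therefore handled differently, because they must be kept together and be available immediately. For this reason, WebVTT chapter cues should not be embedded in the same way as timed cues; instead, they should be converted to Matroska chapters and embedded using that mechanism. Matroska chapters are a superset of WebVTT chapter cues, so the conversion is lossless.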
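VII. Implementation Details
In its initial version, WebM supports a subset of the Matroska Specification; other Matroska features are still under consideration. The elements currently supported, with their descriptions, are listed below.
1. EBML Basics
| Status | Element Name | Description |
| Supported | EBML | Top-level element containing the file description. |
| Supported | EBMLVersion | Version of the EBML parser used to create the file. |
| Supported | EBMLReadVersion | Minimum EBML parser version required to read the file. |
| Supported | EBMLMaxIDLength | Maximum length of the IDs in the file (4 or less in Matroska files). |
| Supported | EBMLMaxSizeLength | Maximum length of the sizes in the file (8 or less in Matroska files). An element whose size is longer than EBMLMaxSizeLength is considered invalid. |
| Supported | DocType | Describes the type of document that follows the EBML header (here, 'webm'). |
| Supported | DocTypeVersion | Version of the DocType interpreter used to create the file. |
| Supported | DocTypeReadVersion | Minimum DocType interpreter version required to read the file. |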
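To illustrate how a demuxer might check the DocType, the Python sketch below walks an EBML header and returns the DocType string. The element IDs (0x1A45DFA3 for EBML, 0x4282 for DocType) and the variable-length integer encoding come from the EBML and Matroska specifications rather than from this document, and the function names are hypothetical. It is a minimal sketch that assumes well-formed input and does not handle unknown-size elements.

```python
from typing import Optional

EBML_ID = 0x1A45DFA3
DOCTYPE_ID = 0x4282


def vint_length(first_byte: int) -> int:
    """Number of bytes in an EBML variable-length integer, from its first byte."""
    if first_byte == 0:
        raise ValueError("invalid EBML vint: lengths above 8 bytes are not allowed")
    return 9 - first_byte.bit_length()


def read_id(data: bytes, pos: int) -> tuple[int, int]:
    """Read an element ID (length marker kept). Returns (id, next position)."""
    n = vint_length(data[pos])
    return int.from_bytes(data[pos:pos + n], "big"), pos + n


def read_size(data: bytes, pos: int) -> tuple[int, int]:
    """Read an element size (length marker cleared). Returns (size, next position)."""
    n = vint_length(data[pos])
    value = int.from_bytes(data[pos:pos + n], "big")
    return value & ((1 << (7 * n)) - 1), pos + n


def read_doctype(data: bytes) -> Optional[str]:
    """Return the DocType string from an EBML header, or None if absent."""
    element_id, pos = read_id(data, 0)
    if element_id != EBML_ID:
        raise ValueError("not an EBML document")
    size, pos = read_size(data, pos)
    end = pos + size
    while pos < end:
        child_id, pos = read_id(data, pos)
        child_size, pos = read_size(data, pos)
        if child_id == DOCTYPE_ID:
            return data[pos:pos + child_size].decode("ascii")
        pos += child_size
    return None


# Hand-built header for illustration: EBML { DocType = "webm" }
sample = bytes.fromhex("1A45DFA3" "87" "4282" "84") + b"webm"
print(read_doctype(sample))  # webm
```

A demuxer following the guidelines would refuse the file when `read_doctype` returns anything other than "webm".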
2. Global Elements (used throughout the file format)
| Status | Element Name | Description |
| Unsupported | CRC-32 | All level 1 elements should include a CRC-32. Placing the CRC at the start of the Master element is recommended, so that it is easier to read. |
| Supported | Void | Used to void damaged data, to avoid unexpected behavior. Can also be used to reserve space in a sub-element for later use. |
| | Signature Start | |
| Unsupported | SignatureSlot | Contains the signature of some elements in the stream. |
| Unsupported | SignatureAlgo | Signature algorithm used (1=RSA, 2=elliptic). |
| Unsupported | SignatureHash | Hash algorithm used (1=SHA1-160, 2=MD5). |
| Unsupported | SignaturePublicKey | The public key to use with the algorithm (for PKI-based signatures). |
| Unsupported | Signature | The signature of the data. |
| Unsupported | SignatureElements | Contains the elements used to compute the signature. |
| Unsupported | SignatureElementList | A list made up of consecutive elements. |
| Unsupported | SignedElement | An element ID whose data is used to compute the signature. |
| | Signature End | |
3. Segment
| Status | Element Name | Description |
| Supported | Segment | This element contains all other top-level (level 1) elements. A typical Matroska file contains only one Segment element. |
4. Meta Seek Information
| Status | Element Name | Description |
| Supported | SeekHead | Contains the positions of other level 1 elements. |
| Supported | Seek | Contains a single seek entry pointing to an EBML element. |
| Supported | SeekID | The ID corresponding to the element name. |
| Supported | SeekPosition | Position of the element in the Segment (0 = first level 1 element). |
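Because SeekPosition is relative to the Segment rather than to the file, a client has to add the offset at which the Segment's data begins before issuing a seek or an HTTP range request. The Python sketch below shows that conversion. The dictionary form of the SeekHead and the function names are hypothetical; the Cues element ID is taken from the Matroska specification, and the example values are made up.

```python
from typing import Optional

CUES_ID = 0x1C53BB6B  # Matroska element ID of Cues


def absolute_offset(segment_data_start: int, seek_position: int) -> int:
    """Convert a SeekPosition into an offset from the start of the file."""
    return segment_data_start + seek_position


def locate(seek_head: dict[int, int], element_id: int,
           segment_data_start: int) -> Optional[int]:
    """Look up an element in a parsed SeekHead, given as {SeekID: SeekPosition}."""
    position = seek_head.get(element_id)
    if position is None:
        return None
    return absolute_offset(segment_data_start, position)


# Example values are made up for illustration.
seek_head = {CUES_ID: 4_000}
print(locate(seek_head, CUES_ID, segment_data_start=48))  # 4048
```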
5. Segment Information
| Status | Element Name | Description |
| Supported | Info | Contains miscellaneous general information and statistics on the file. |
| Unsupported | SegmentUID | A randomly generated unique ID to identify the current segment between many others (128 bits). |
| Unsupported | SegmentFilename | A filename corresponding to this segment. |
| Unsupported | PrevUID | A unique ID to identify the previous chained segment (128 bits). |
| Unsupported | PrevFilename | An escaped filename corresponding to the previous segment. |
| Unsupported | NextUID | A unique ID to identify the next chained segment (128 bits). |
| Unsupported | NextFilename | An escaped filename corresponding to the next segment. |
| Unsupported | SegmentFamily | A randomly generated unique ID that all segments related to each other must use (128 bits). |
| Unsupported | ChapterTranslate | A tuple of corresponding ID used by chapter codecs to represent this segment. |
| Unsupported | ChapterTranslateEditionUID | Specify an edition UID on which this correspondence applies. When not specified, it means for all editions found in the segment. |
| Unsupported | ChapterTranslateCodec | The chapter codec using this ID (0: Matroska Script, 1: DVD-menu). |
| Unsupported | ChapterTranslateID | The binary value used to represent this segment in the chapter codec data. The format depends on the ChapProcessCodecID used. |
| Supported | TimecodeScale | Timecode scale in nanoseconds (1,000,000 means all timecodes in the segment are expressed in milliseconds). |
| Supported | Duration | Duration of the segment (based on TimecodeScale). |
| Supported | DateUTC | Date of the origin of timecode (value 0), i.e. production date. |
| Supported | Title | General name of the segment. |
| Supported | MuxingApp | Muxing application or library (“libmatroska-0.4.3”). |
| Supported | WritingApp | Writing application (“mkvmerge-0.3.3”). |
6. Cluster
| Status | Element Name | Description |
| Supported | Cluster | The lower level element containing the (monolithic) Block structure. |
| Supported | Timecode | Absolute timecode of the cluster (based on TimecodeScale). |
| Unsupported | SilentTracks | The list of tracks that are not used in that part of the stream. It is useful when using overlay tracks on seeking. Then you should decide what track to use. |
| Unsupported | SilentTrackNumber | One of the track numbers that are not used from now on in the stream. It could change later if not specified as silent in a further Cluster. |
| Unsupported | Position | Position of the Cluster in the segment (0 in live broadcast streams). It might help to resynchronise offset on damaged streams. |
| Supported | PrevSize | Size of the previous Cluster, in octets. Can be useful for backward playing. |
| Supported | BlockGroup | Basic container of information containing a single Block or BlockVirtual, and information specific to that Block/VirtualBlock. |
| Supported | Block | Block containing the actual data to be rendered and a timecode relative to the Cluster Timecode. |
| Unsupported | BlockVirtual | A Block with no data. It must be stored in the stream at the place the real Block should be in display order. |
| Unsupported | BlockAdditions | Contain additional blocks to complete the main one. An EBML parser that has no knowledge of the Block structure could still see and use/skip these data. |
| Unsupported | BlockMore | Contain the BlockAdditional and some parameters. |
| Unsupported | BlockAddID | An ID to identify the BlockAdditional level. |
| Unsupported | BlockAdditional | Interpreted by the codec as it wishes (using the BlockAddID). |
| Supported | BlockDuration | The duration of the Block (based on TimecodeScale). This element is mandatory when DefaultDuration is set for the track. When not written and with no DefaultDuration, the value is assumed to be the difference between the timecode of this Block and the timecode of the next Block in "display" order (not coding order). This element can be useful at the end of a Track (as there is no other Block available), or when there is a break in a track, as with subtitle tracks. |
| Unsupported | ReferencePriority | This frame is referenced and has the specified cache priority. In cache only a frame of the same or higher priority can replace this frame. A value of 0 means the frame is not referenced. |
| Supported | ReferenceBlock | Timecode of another frame used as a reference (ie: B or P frame). The timecode is relative to the block it’s attached to. |
| Unsupported | ReferenceVirtual | Relative position of the data that should be in position of the virtual block. |
| Unsupported | CodecState | The new codec state to use. Data interpretation is private to the codec. This information should always be referenced by a seek entry. |
| Unsupported | Slices | Contains slices description. |
| Unsupported | TimeSlice | Contains extra time information about the data contained in the Block. While there are a few files in the wild with this element, it is no longer in use and has been deprecated. Being able to interpret this element is not required for playback. |
| Supported | LaceNumber | The reverse number of the frame in the lace (0 is the last frame, 1 is the next to last, etc). While there are a few files in the wild with this element, it is no longer in use and has been deprecated. Being able to interpret this element is not required for playback. |
| Unsupported | FrameNumber | The number of the frame to generate from this lace with this delay (allow you to generate many frames from the same Block/Frame). |
| Unsupported | BlockAdditionID | The ID of the BlockAdditional element (0 is the main Block). |
| Unsupported | Delay | The (scaled) delay to apply to the element. |
| Unsupported | Duration | The (scaled) duration to apply to the element. |
| Supported | SimpleBlock | Similar to Block but without all the extra information, mostly used to reduce overhead when no extra feature is needed. |
| Unsupported | EncryptedBlock | Similar to SimpleBlock but the data inside the Block are Transformed (encrypt and/or signed). |
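Since VP8 frames are carried in SimpleBlock elements, a demuxer needs to decode the small header at the start of each one. The Python sketch below follows the byte layout given in the Matroska specification (a variable-length track number, a signed 16-bit timecode relative to the Cluster, and a flags byte); that layout is not spelled out in this document, so treat the field positions and flag masks as assumptions to verify against the specification. The function name is hypothetical.

```python
import struct


def parse_simple_block_header(data: bytes) -> dict:
    """Decode track number, relative timecode and flags of a SimpleBlock."""
    first = data[0]
    if first == 0:
        raise ValueError("invalid track number vint")
    n = 9 - first.bit_length()  # vint length from the leading bits
    track = int.from_bytes(data[:n], "big") & ((1 << (7 * n)) - 1)
    (timecode,) = struct.unpack_from(">h", data, n)  # signed 16-bit, big-endian
    flags = data[n + 2]
    return {
        "track": track,
        "timecode": timecode,              # relative to the Cluster Timecode
        "keyframe": bool(flags & 0x80),
        "invisible": bool(flags & 0x08),
        "header_size": n + 3,              # frame data starts at this offset
    }


header = parse_simple_block_header(bytes([0x81, 0x00, 0x21, 0x80]) + b"frame")
print(header["track"], header["timecode"], header["keyframe"])  # 1 33 True
```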
7. Track
| Status | Element Name | Description |
| Supported | Tracks | Top-level element. |
| Supported | TrackEntry | Describes a track. |
| Supported | TrackNumber | The track number as used in the Block Header (using more than 127 tracks is not recommended, although the design imposes no limit). |
| Supported | TrackUID | A UID that identifies the Track. It must not be 0. |
| Supported | TrackType | Track type, 8 bits (1: video, 2: audio, 3: complex, 0x10: logo, 0x11: subtitle, 0x12: buttons, 0x20: control). |
| Supported | FlagEnabled | Set if the track is enabled. |
| Supported | FlagDefault | Set to 1 if the track should be selected by default. |
| Supported | FlagForced | Set to 1 if the track must be played. If several tracks have FlagForced set to 1, the player selects the one whose language matches the user's preference. |
| Supported | FlagLacing | Set to 1 if the track may contain Blocks that use lacing. |
| Unsupported | MinCache | The minimum number of frames a player should be able to cache during playback. |
| Unsupported | MaxCache | The maximum number of frames a player should be able to cache during playback; 0 means no cache. |
| Supported | DefaultDuration | Duration of each frame, in nanoseconds. |
| Unsupported | TrackTimecodeScale | The Block timecode is multiplied by this value to obtain the actual timecode. Typically used to adjust video speed. |
| Unsupported | TrackOffset | A value that can be added to the Block timecode. Typically used to adjust the playback position of a track. |
| Unsupported | MaxBlockAdditionID | The maximum value of BlockAddID. 0 means the track has no BlockAdditions. |
| Supported | Name | Name of the track. |
| Supported | Language | Language of the track, in the Matroska languages form. |
| Supported | CodecID | ID of the codec. |
| Supported | CodecPrivate | Private data of the codec. |
| Supported | CodecName | Name of the codec. |
| Unsupported | AttachmentLink | UID of an attachment used by the codec. |
| Unsupported | CodecSettings | A string describing the encoding settings. |
| Unsupported | CodecInfoURL | A URL for finding information about the codec. |
| Unsupported | CodecDownloadURL | A URL for downloading the codec. |
| Unsupported | CodecDecodeAll | The codec can decode potentially damaged data. |
| Unsupported | TrackOverlay | Specify that this track is an overlay track for the Track specified (in the u-integer). That means when this track has a gap (see SilentTracks) the overlay track should be used instead. The order of multiple TrackOverlay matters, the first one is the one that should be used. If not found it should be the second, etc. |
| Unsupported | TrackTranslate | The track identification for the given Chapter Codec. |
| Unsupported | TrackTranslateEditionUID | Specify an edition UID on which this translation applies. When not specified, it means for all editions found in the segment. |
| Unsupported | TrackTranslateCodec | The chapter codec using this ID (0: Matroska Script, 1: DVD-menu). |
| Unsupported | TrackTranslateTrackID | The binary value used to represent this track in the chapter codec data. The format depends on the ChapProcessCodecID used. |
| Unsupported | Video Start | |
| Supported | Video | Video settings. |
| Supported | FlagInterlaced | Set to 1 if the video is interlaced. |
| Supported | StereoMode | Stereo-3D video mode. Supported modes: 0: mono, 1: side by side (left eye is first), 2: top-bottom (right eye is first), 3: top-bottom (left eye is first), 11: side by side (right eye is first). Unsupported modes: 4: checkboard (right is first), 5: checkboard (left is first), 6: row interleaved (right is first), 7: row interleaved (left is first), 8: column interleaved (right is first), 9: column interleaved (left is first), 10: anaglyph (cyan/red). |
| Supported | PixelWidth | Width of the encoded video frames, in pixels. |
| Supported | PixelHeight | Height of the encoded video frames, in pixels. |
| Supported | PixelCropBottom | Number of pixels to remove from the bottom. |
| Supported | PixelCropTop | Number of pixels to remove from the top. |
| Supported | PixelCropLeft | Number of pixels to remove from the left. |
| Supported | PixelCropRight | Number of pixels to remove from the right. |
| Supported | DisplayWidth | Width of the video frames to display. |
| Supported | DisplayHeight | Height of the video frames to display. |
| Supported | DisplayUnit | Unit type of DisplayWidth/Height (0: pixels, 1: centimeters, 2: inches). Only pixels are currently supported. |
| Supported | AspectRatioType | Specifies the possible modifications to the aspect ratio (0: free resizing, 1: keep aspect ratio, 2: fixed). |
| Unsupported | ColourSpace | Same value as in AVI files (32 bits). |
| Unsupported | GammaValue | Gamma value. |
| Supported | FrameRate | Frame rate. |
| | Video End | |
| | Audio Start | |
| Supported | Audio | Audio settings. |
| Supported | SamplingFrequency | Sampling frequency, in Hz. |
| Supported | OutputSamplingFrequency | Real output sampling frequency, in Hz (used for SBR techniques). |
| Supported | Channels | Number of channels. |
| Unsupported | ChannelPositions | Table of horizontal angles for each successive channel, see appendix. |
| Supported | BitDepth | Bits per sample, mostly used for PCM. |
| | Audio End | |
| | Content Encoding Start | |
| Unsupported | All | All elements about Content Encoding |
| | Content Encoding End | |
8. Cueing Data
| Status | Element Name | Description |
| Supported | Cues | Top-level element that speeds up seeking access. |
| Supported | CuePoint | Contains all information relative to a seek point. |
| Supported | CueTime | Absolute timecode, based on the segment time base. |
| Supported | CueTrackPositions | Positions of the different tracks corresponding to the timecode. |
| Supported | CueTrack | The track for which a position is given. |
| Supported | CueClusterPosition | Position of the Cluster containing the required Block. |
| Supported | CueBlockNumber | Number of the Block in the specified Cluster. |
| Unsupported | CueCodecState | Position of the Codec State corresponding to this Cue element. 0 means that the data is taken from the initial Track Entry. |
| Unsupported | CueReference | The Clusters containing the required referenced Blocks. |
| Unsupported | CueRefTime | Timecode of the referenced Block. |
| Unsupported | CueRefCluster | Position of the Cluster containing the referenced Block. |
| Unsupported | CueRefNumber | Number of the referenced Block of Track X in the specified Cluster. |
| Unsupported | CueRefCodecState | Position of the Codec State corresponding to this referenced element. 0 means that the data is taken from the initial Track Entry. |
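A client typically uses the Cues to seek by finding the last CuePoint at or before the requested time and then fetching the Cluster at its CueClusterPosition. The Python sketch below shows that lookup with a binary search. The list-of-tuples representation and the function name are hypothetical, and the example values are made up.

```python
from bisect import bisect_right


def find_cue(cues: list[tuple[int, int]], target_time: int) -> tuple[int, int]:
    """Pick the cue to seek to.

    cues holds (CueTime, CueClusterPosition) pairs sorted by CueTime.
    Returns the last cue at or before target_time, or the first cue
    when target_time precedes every cue.
    """
    if not cues:
        raise ValueError("no cue points available")
    times = [cue_time for cue_time, _ in cues]
    index = bisect_right(times, target_time) - 1
    return cues[max(index, 0)]


cues = [(0, 4096), (5000, 150000), (10000, 310000)]
print(find_cue(cues, 7200))  # (5000, 150000)
```

Because the Cues should list video keyframes only, the Cluster found this way starts at a frame the decoder can begin from.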
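9. Attachment (Unsupported)
10. Chapters (Unsupported)
11. Tagging
(1) The Tags element should be placed at the end of the file, so that minor updates are easy to make.
(2) TagName should be placed before the Tag data.
(3) A SimpleTag should not contain other SimpleTag elements.
(4) When concatenating multiple files, merging tags across files should be avoided.
| Status | Element Name | Description |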
| Supported | Tags | Element containing elements specific to Tracks/Chapters. |
| Supported | Tag | Element containing elements specific to Tracks/Chapters. |
| Supported | Targets | Contain all UIDs where the specified meta data apply. It is void to describe everything in the segment. |
| Supported | TargetTypeValue | A number to indicate the logical level of the target. |
| Supported | TargetType | An informational string that can be used to display the logical level of the target like “ALBUM”, “TRACK”, “MOVIE”, “CHAPTER”, etc |
| Supported | TrackUID | This value SHOULD be 0, meaning the tags apply to all tracks in the Segment. |
| Unsupported | EditionUID | A unique ID to identify the EditionEntry(s) the tags belong to. If the value is 0 at this level, the tags apply to all editions in the Segment. |
| Unsupported | ChapterUID | A unique ID to identify the Chapter(s) the tags belong to. If the value is 0 at this level, the tags apply to all chapters in the Segment. |
| Unsupported | AttachmentUID | A unique ID to identify the Attachment(s) the tags belong to. If the value is 0 at this level, the tags apply to all the attachments in the Segment. |
| Supported | SimpleTag | Contains general information about the target. |
| Supported | TagName | The name of the Tag that is going to be stored. |
| Supported | TagLanguage | Specifies the language of the tag specified. |
| Supported | TagDefault | Indication to know if this is the default/original language to use for the given tag |
| Supported | TagString | The value of the Tag. |
| Supported | TagBinary | The values of the Tag if it is binary. Note that this can not be used in the same SimpleTag as TagString |