XDCAM

XDCAM is a series of products for digital recording using random access optical disc and solid-state memory media, introduced by Sony in 2003. Four product lines, XDCAM SD, XDCAM HD, XDCAM EX and XDCAM HD422, differ in the type of encoder used, frame size, container type and recording media. The XDCAM range includes cameras and decks which act as drop-in replacements for traditional VTRs, allowing XDCAM discs to be used within a traditional videotape-based workflow. These decks can also serve as random access computer hard drives for easy import of the video data files into nonlinear editing systems (NLE), allowing for convenient file-sharing for uses such as archiving and transcription.

The XDCAM format uses multiple video compression methods and media container formats. Video is recorded with DV, MPEG-2 Part 2 or MPEG-4 compression schemes. DV is used for standard definition video. MPEG-2 is used both for standard and high definition video, while MPEG-4 is used for proxy video. Audio is recorded in uncompressed PCM form for all formats except proxy video, which uses A-Law compression.

DVCAM uses standard DV encoding and is compatible with most editing systems. Some camcorders that allow DVCAM recording can record progressive-scan video. MPEG IMX allows recording in standard definition, using MPEG-2 encoding at data rates of 30, 40 or 50 megabits per second. Unlike most other MPEG-2 implementations, IMX uses intra-frame compression only, with every frame having exactly the same size in bytes to simplify recording onto videotape.
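
Because IMX is intra-frame only at a constant data rate, the byte budget of every frame follows directly from the bitrate and the frame rate. A minimal sketch of that arithmetic (the 25 fps PAL frame rate used here is an illustrative assumption):

```python
# Fixed frame size for MPEG IMX: a constant bitrate divided evenly across frames.
# Assumes 25 frames per second (PAL); NTSC material would use ~29.97 fps.

def imx_frame_size_bytes(bitrate_mbps: float, fps: float = 25.0) -> float:
    """Bytes occupied by one intra-coded frame at a constant bitrate."""
    bits_per_frame = bitrate_mbps * 1_000_000 / fps
    return bits_per_frame / 8

for rate in (30, 40, 50):
    print(f"IMX {rate} Mbit/s: {imx_frame_size_bytes(rate):,.0f} bytes per frame")
# IMX 50 Mbit/s at 25 fps works out to 250,000 bytes per frame.
```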

MPEG HD is used in all product lines except XDCAM SD. This format supports multiple frame sizes, frame rates, scanning types and quality modes. MPEG HD422 doubles the chroma resolution compared to previous generations of high-definition XDCAM formats; it is used only in XDCAM HD422 products. Proxy AV is used to record low-resolution proxy videos and employs MPEG-4 video encoding.

The Professional Disc was chosen by Sony as its medium for professional nonlinear video acquisition. Its format is similar to Blu-ray Disc and was deemed to be cost effective, reliable and robust, and suitable for field work, which had been a problem with many previous disc-based systems. In 2008 Sony introduced a new recording medium to their XDCAM range, SxS Pro, a solid-state memory card implemented as an ExpressCard module.

Equipment that uses Professional Disc as its recording medium employs the MXF container to store digital audio/video streams as well as metadata, including subtitles and closed captioning. Tapeless camcorders that record onto solid-state memory cards use the MP4 container for high definition audio/video, and the DV-AVI container for DV video. JVC camcorders that use the XDCAM EX recording format are also capable of recording into the QuickTime container in addition to the MP4 container.

ProRes / Apple Intermediate / Final Cut

ProRes is a lossy video compression format developed by Apple, Inc., for use in post-production. It is the successor to the Apple Intermediate Codec and was introduced in 2007 with Final Cut Studio 2. ProRes is a line of intermediate codecs, which means they are intended for use during video editing, including synchronization of audio and video and the addition of subtitles or closed captioning, rather than for end-user viewing. ProRes retains higher quality than end-user codecs while requiring much less expensive disk systems than uncompressed video. It is comparable to Avid’s DNxHD codec or CineForm, which offer similar bitrates and are also intended to be used as intermediate codecs.

ProRes 422 is a DCT-based intra-frame-only codec and is simpler to decode than distribution-oriented formats like H.264. ProRes 4444 is another lossy video compression format developed by Apple and introduced with Final Cut Studio in 2009. It shares many features with other codecs of Apple’s ProRes family but provides better quality than its predecessors, particularly in the area of color. In August 2008, Apple introduced a free ProRes QuickTime Decoder for both Mac and Windows that allows playback of ProRes files through QuickTime. Installing Final Cut Pro will install the ProRes codecs for encoding files on OS X.

The Apple Intermediate Codec is a high-quality 8-bit 4:2:0 video codec used mainly as a less processor-intensive way of working with long-GOP MPEG-2 footage such as HDV. The Apple Intermediate Codec was designed by Apple, Inc., to be an intermediate format in an HDV and AVCHD workflow. Unlike MPEG-2-based HDV and similar to the standard-definition DV codec, the Apple Intermediate Codec does not use temporal compression, enabling every frame to be decoded immediately without decoding other frames.

Final Cut Studio is a professional video and audio production suite for Mac OS X from Apple, Inc., and a competitor to Avid Media Composer in the high-end movie production industry. Final Cut Studio version 3 contains six main applications and several smaller applications used in editing video. The major applications are Final Cut Pro 7 for realtime editing of DV, SD and HD video; Motion 4 for realtime motion graphics design; Soundtrack Pro 3 for advanced audio editing and sound design; DVD Studio Pro 4 for encoding, authoring and burning; Color 1.5 for color grading (adapted from Silicon Color’s FinalTouch); and Compressor 3.5, a video-encoding tool for outputting projects in different formats, allowing for easy file-sharing for uses such as archiving and transcription. Final Cut Studio was introduced at the National Association of Broadcasters convention in 2005.

 

What is a WMV file?

WMV (Windows Media Video) is a file type which can contain video, audio and subtitles (including closed captioning) in one of several video compression formats developed by Microsoft. WMV was originally designed for Internet streaming applications as a competitor to RealVideo. Other formats, such as WMV Screen and WMV Image, cater for specialized content. Through standardization by the Society of Motion Picture and Television Engineers (SMPTE), WMV 9 has gained adoption for physical-delivery formats such as HD DVD and Blu-ray Disc. The first version of the format, WMV 7, was introduced in 1999 and was built upon Microsoft’s implementation of MPEG-4 Part 2.

A WMV file is in most circumstances encapsulated in the Advanced Systems Format (ASF) container format. The file extension .wmv describes ASF files that use Windows Media Video. Microsoft recommends that ASF files containing non-Windows Media formats use the generic .asf file extension.

Although WMV is generally packed into the ASF container format, it can also be put into the Matroska or AVI container format. The resulting files have the .mkv and .avi file extensions, respectively. One common way to store WMV in an AVI file is to use the WMV 9 Video Compression Manager (VCM) codec implementation.

Software that can play WMV files includes Windows Media Player, RealPlayer, MPlayer, Media Player Classic, VLC Media Player and K-Multimedia Player. The Microsoft Zune media management software supports the WMV format, but uses a Zune-specific variation of the Windows Media DRM scheme used by PlaysForSure. Many third-party players exist for various platforms such as Linux that use the FFmpeg implementation of the WMV format. Flip4Mac WMV is a third-party QuickTime component which allows Macintosh users to play WMV files in any player that uses the QuickTime framework; viewing files is free of charge, while converting formats requires a paid upgrade. All of these players make it possible to save filmed material, which is useful for such purposes as archiving and transcription.

Software that exports video in WMV format includes CyberLink PowerDirector, Avid (PC version), Windows Movie Maker, Windows Media Encoder, Microsoft Expression Encoder, Sorenson Squeeze, Sony Vegas Pro, Adobe Premiere Pro, Adobe After Effects, AVS Video Editor, Telestream Episode, Total Video Converter and Telestream FlipFactory. Programs that encode using the WMV Image format include Windows Media Encoder, AVS Video Editor and Photo Story.

The ASF container format, in which a WMV stream may be encapsulated, can support digital rights management. Windows Media DRM, which can be used in conjunction with WMV, supports time-limited subscription video services such as those offered by CinemaNow. Windows Media DRM, a component of PlaysForSure and Windows Media Connect, is supported on many portable video devices and streaming media clients such as the Xbox 360.

 

What is VOB format?

VOB (Video Object) is the container format used on DVD-Video media. A VOB can contain digital video, digital audio, subtitles (including closed captioning), DVD menus and navigation content multiplexed together into a single stream. Files in VOB format may be encrypted and have the .vob filename extension.

The VOB format is based on the MPEG program stream format. The MPEG program stream has provisions for nonstandard data in the form of private streams. VOB files are a subset of the MPEG program stream standard: while all VOB files are MPEG program streams, not all MPEG program streams comply with the definition of a VOB file. On the DVD, all the content for one title set is contiguous, but it is broken up into 1 GB VOB files in the computer-compatible file system for the convenience of the various operating systems.

Each VOB file must be less than or equal to 1 GB. VOB files may be accompanied by IFO and BUP files, which have the .ifo and .bup filename extensions respectively. Images, video and audio used in DVD menus are stored in VOB files. IFO (information) files contain all the information a DVD player needs to navigate and play the DVD properly, such as where a chapter starts, where a certain audio or subtitle stream is located, and information about menu functions and navigation. BUP (backup) files are exact copies of IFO files, supplied to help in case of corruption. Video players may not allow DVD navigation when IFO or BUP files are absent.
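
As a rough illustration of this layout, the sketch below walks a DVD’s VIDEO_TS folder and groups the navigation (IFO/BUP) and video (VOB) files by title set; the VTS_nn naming follows the standard DVD-Video convention, and the directory path is hypothetical:

```python
# Minimal sketch: inventory a DVD-Video VIDEO_TS folder, pairing each title
# set's IFO/BUP navigation files with its 1 GB VOB segments.
from collections import defaultdict
from pathlib import Path

def inventory_video_ts(video_ts: Path) -> dict:
    title_sets = defaultdict(list)
    for f in sorted(video_ts.iterdir()):
        name = f.name.upper()
        if name.startswith("VTS_") and name.endswith((".VOB", ".IFO", ".BUP")):
            title_set = name.split("_")[1]   # e.g. "01" from VTS_01_1.VOB
            title_sets[title_set].append(name)
    return dict(title_sets)

# Example (hypothetical mount point):
# inventory_video_ts(Path("/media/dvd/VIDEO_TS"))
```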

Almost all commercially produced DVD-Video titles use some restriction or copy protection method, which also affects VOB files. Copy protection is usually used for copyrighted content. Many DVD-Video titles are encrypted with CSS (Content Scramble System). This is a data encryption and communications authentication method designed to prevent copying video and audio data directly from DVD-Video discs.

Decryption and authentication keys needed for playing back encrypted VOB files are stored in the normally inaccessible lead-in area of the DVD and are used only by CSS decryption software in a DVD player or software player. If someone tries to copy the VOB files of an encrypted DVD-Video disc to a hard drive, an error can occur because the DVD has not been authenticated in the drive by CSS decryption software. This makes VOB inconvenient to use for purposes of file-sharing for uses such as reference, archiving filmed material, editing and transcription.

A player of generic MPEG-2 files can usually play unencrypted VOB files that contain MPEG-1 Audio Layer II audio. MPlayer, VLC media player, GOM Player, Media Player Classic and more platform-specific players such as ALLPlayer play VOB files.

 

What is RTMP?

RTMP (Real Time Messaging Protocol) is a proprietary protocol developed by Macromedia for streaming audio, video and data (including metadata such as subtitles and closed captioning) over the Internet between a Flash player and a server. While the primary motivation for RTMP was to be a protocol for playing Flash video, it is also used in other applications, such as Adobe LiveCycle Data Services ES.

RTMP is a TCP (Transmission Control Protocol)-based protocol which maintains persistent connections and allows low-latency communication. To deliver streams smoothly while transmitting as much information as possible, it splits streams into fragments whose size is negotiated dynamically between the client and server, although it is sometimes left unchanged. Fragments from different streams may then be interleaved and multiplexed over a single connection.

However, in practice individual fragments are not typically interleaved.

Instead, the interleaving and multiplexing is done at the packet level, with RTMP packets across several different active channels being interleaved in such a way as to ensure that each channel meets its bandwidth, latency, and other quality-of-service requirements. Packets interleaved in this fashion are treated as indivisible, and are not interleaved on the fragment level. This allows for convenient file-sharing for such uses as archiving, reference and transcription.

RTMP defines several virtual channels on which packets may be sent and received, and which operate independently of each other. For example, there is a channel for handling RPC (remote procedure call) requests and responses, a channel for video stream data, a channel for audio stream data, and a channel for out-of-band control messages such as fragment size negotiation. During a typical RTMP session, several channels may be active simultaneously.

When RTMP data is encoded, a packet header is generated. The packet header specifies, among other things, the ID of the channel on which the packet is to be sent, a timestamp of when it was generated, and the size of the packet’s payload. The header is followed by the actual payload content of the packet, which is fragmented according to the currently agreed-upon fragment size before it is sent over the connection.
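
A hedged sketch of that header layout follows, using the field widths of a full "type 0" chunk header from Adobe’s published RTMP specification (channel ID, 24-bit timestamp, 24-bit payload length, message type, stream ID). It is an illustration, not a complete implementation (no extended timestamps or compressed header types):

```python
# Simplified RTMP "type 0" chunk header: channel (chunk stream) ID,
# 24-bit big-endian timestamp and payload length, one-byte message type
# (e.g. 8 = audio, 9 = video), and a little-endian message stream ID.
import struct

def rtmp_type0_header(chunk_stream_id: int, timestamp: int,
                      payload_len: int, msg_type: int, stream_id: int) -> bytes:
    assert 2 <= chunk_stream_id <= 63, "one-byte basic header form only"
    basic = bytes([(0 << 6) | chunk_stream_id])   # fmt=0 plus channel ID
    ts = timestamp.to_bytes(3, "big")             # 24-bit timestamp
    length = payload_len.to_bytes(3, "big")       # 24-bit payload size
    type_id = bytes([msg_type])
    msid = struct.pack("<I", stream_id)           # message stream ID
    return basic + ts + length + type_id + msid

# A video-data header on channel 6 for a 1024-byte payload:
hdr = rtmp_type0_header(6, timestamp=0, payload_len=1024, msg_type=9, stream_id=1)
```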

At a higher level, RTMP encapsulates MP3 or AAC audio and FLV1 video multimedia streams, and can make remote procedure calls using the Action Message Format. Any RPC services required are made asynchronously, using a single client/server request/response model, so that real-time communication is not required.

The most widely adopted RTMP client is Adobe Flash Player, which supports playback of audio and video streamed from RTMP servers when installed as a web browser plug-in.

 

What is SMPTE?

The Society of Motion Picture and Television Engineers (SMPTE) is a professional membership association focused on the advancement of the art, science, and craft of the image, sound, and metadata (such as subtitles and closed captioning) ecosystem, worldwide. Since its founding in 1916, SMPTE has published the SMPTE Motion Imaging Journal and developed more than 800 standards, recommended practices, and engineering guidelines.

The Society is sustained by motion-imaging executives, engineers, creative and technology professionals, researchers, scientists, educators and students throughout the world. Through the Society’s partnership with the Hollywood Post Alliance (HPA), this membership is complemented by the professional community of businesses and individuals who provide expertise, support, tools, and the infrastructure for the creation and finishing of motion pictures, television, commercials, digital media, and other dynamic media content.

SMPTE innovations include the color bars test pattern, which provides a consistent reference point to ensure color is calibrated correctly on broadcast monitors, in programs, and on video cameras. SMPTE also fostered timecode, which gives every frame of video its own unique identifying number, makes digital editing possible, and enables the association of other data to make audio and video even more meaningful, accurate, and repeatable, whether in post for a major studio release, in hard news environments, in live sports production, or for archival uses such as reference and transcription. Timecode can also synchronize music and is often used to automate lighting, pyrotechnics, video, and other effects in live concert production.
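
A timecode label is simple arithmetic on the frame count. A minimal sketch for non-drop-frame material (drop-frame NTSC timecode adds frame-skipping rules not shown here; 25 fps is an illustrative choice):

```python
# Convert a running frame number into an HH:MM:SS:FF SMPTE-style timecode.

def smpte_timecode(frame: int, fps: int = 25) -> str:
    ff = frame % fps
    total_seconds = frame // fps
    ss = total_seconds % 60
    mm = (total_seconds // 60) % 60
    hh = total_seconds // 3600
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"

print(smpte_timecode(90_125))   # -> "01:00:05:00" at 25 fps
```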

Other SMPTE innovations include digital camera standards, transport of high bit rate media signals over IP networks, and timed text, which makes broadcast content more easily accessible to tens of millions of people in the U.S. with disabilities. SMPTE Timed Text is also the basis for subtitles and closed captioning in the digital entertainment content ecosystem’s UltraViolet format for commercial movie and television content and is used by several video services and Internet video players.

SMPTE was founded in 1916 by C. F. Jenkins and a group of engineers to create a society of engineering specialists in the motion picture field. A constitution was then created, and Jenkins was named chairman of the Society of Motion Picture Engineers (the T was added in the 1950s with the advent of television). Today, SMPTE is recognized as the global leader in the development of standards and authoritative practices for film, television, video and multimedia.

To accomplish its educational goals, SMPTE organizes annual conferences and seminars. It also publishes the SMPTE Motion Imaging Journal, which includes technical papers, tutorials, practical application articles, standards updates and SMPTE Section Reports.

 

What are the WebVTT and SubRip formats?

WebVTT (Web Video Text Tracks) is a file format for marking up external text tracks. Used in conjunction with HTML5’s <track> element, it allows information such as subtitles, closed captioning and descriptions for a media resource such as audio or video to be displayed synchronized with that resource. The ability to add textual information in this way gives the viewer more options and makes media content more accessible to those who may be unable to listen to a video’s audio track due to auditory issues or language difficulties.

WebVTT files contain several types of information about the video including subtitles (transcription or translation of dialogue), captions (similar to subtitles but also including sound effects and other audio information), as well as descriptions, chapters and metadata, which provide information about the video and aid the viewer in navigation through the video.

WebVTT files are text files, encoded as UTF-8, with a .vtt file extension. The format is an offshoot of WebSRT (Web Subtitle Resource Tracks), which is itself an adaptation of SubRip; WebSRT and SubRip both share the .srt file extension.

SubRip is a software program for Windows which extracts subtitles and their timings from video. It is free software, released under the GNU GPL (General Public License). SubRip is also the name of the widely used and broadly compatible subtitle text file format created by this software.

Using optical character recognition, SubRip can extract subtitles from live video, video files and DVDs, then record the extracted subtitles and timings as a SubRip-format text file. It can optionally save the recognized subtitles as bitmaps for later subtraction or erasure from the source video.

In practice, SubRip is configured with the correct codec for the video source, then trained by the user on the specific text area, fonts, styles, colors and video processing requirements to recognize subtitles. After trial and fine tuning, SubRip can automatically extract subtitles for the whole video source file during its playback. SubRip records the beginning and end times and text for each subtitle in the output text .srt file.
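
A minimal sketch of reading that layout, assuming well-formed cues (a counter line, a "start --> end" timing line, then the subtitle text):

```python
# Parse SubRip (.srt) cues into (start_seconds, end_seconds, text) tuples.
import re

CUE_TIME = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})")

def parse_srt(text: str):
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        match = CUE_TIME.search(lines[1]) if len(lines) > 1 else None
        if match:
            h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
            start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
            end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
            cues.append((start, end, "\n".join(lines[2:])))
    return cues

sample = "1\n00:00:01,000 --> 00:00:03,500\nHello, world.\n"
print(parse_srt(sample))   # [(1.0, 3.5, 'Hello, world.')]
```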

The SubRip .srt file format is supported by most software video players. For Windows software video players that do not support subtitle playback directly, the VSFilter DirectX filter displays SubRip and other subtitle formats. The SubRip format is supported directly by many subtitle creation and editing tools and some home media players. In 2008, YouTube added subtitle support to its Flash video player under the closed captioning option; content producers can upload subtitles in SubRip format.

What is the SWF file format?

SWF is an Adobe Flash file format used for multimedia, vector graphics and ActionScript. SWF files deliver graphics (including text, such as subtitles and closed captioning) and animation over the Internet. The SWF file format was designed as a very efficient delivery format, not as a format for exchanging graphics between graphics editors. SWF originated with FutureWave Software, was transferred to Macromedia, and then came under the control of Adobe. Originally, the term SWF was used as an abbreviation for ShockWave Flash. SWF files are stored with the extension .swf.

SWF format is primarily intended for on-screen display and so it supports anti-aliasing, fast rendering to a bitmap of any color format, animation and interactive buttons. SWF is a tagged format, so the format can be evolved with new features while maintaining backwards compatibility with older players, making possible convenient file-sharing for uses such as reference, archiving and transcription. SWF files can be delivered over a network with limited and unpredictable bandwidth. SWF files are compressed to be small and support incremental rendering through streaming. Files can be displayed without any dependence on external resources such as fonts.

SWF files can be generated from within several Adobe products, including Flash, Flash Builder and After Effects, as well as through MXMLC, a command-line application compiler which is part of the freely available Flex SDK. Although Adobe Illustrator can generate SWF files through its export function, it cannot open or edit them. SWF files can also be built with the open-source Motion-Twin ActionScript 2 Compiler (MTASC), the open-source Ming library and the free software suite SWFTools, as well as various third-party programs.

FutureWave Software originally defined the file format with the objective of creating small files for displaying entertaining animations. The idea was a format which player software could run on any system and which would work over slower network connections. FutureWave released FutureSplash Animator in May 1996. When Macromedia acquired FutureWave, FutureSplash Animator became Macromedia Flash, and later SWF when Adobe acquired Macromedia.

Adobe makes available plugins such as Adobe Flash Player and Adobe Integrated Runtime to play SWF files in web browsers on many desktop operating systems including Microsoft Windows, Mac OS X and Linux.

Adobe has incorporated SWF playback and authoring into other products and technologies of its own, including Adobe Shockwave, which renders more complex documents. SWF can also be embedded in PDF files, which are viewable with Adobe Reader 9 or later.

Sony PlayStation Portable consoles can play limited SWF files in Sony’s web browser, beginning with firmware version 2.71. The Nintendo Wii and the Sony PS3 consoles can run SWF files through their Internet browsers.

What is bitrate?

In telecommunications and computing, bitrate is the number of bits that are conveyed or processed per unit of time. A bit is the basic unit of information in computing and digital communications. A bit can have only one of two values and may therefore be physically implemented with a two-state device. These values are most commonly represented as 0 and 1. The term bit is a portmanteau of the words binary digit. Simply put, the bitrate is the speed at which digital information (text, image, video, audio or metadata such as subtitles and closed captioning) can be transferred across a particular communication link.

In digital communication systems, the physical layer gross bitrate, raw bitrate, data signaling rate, gross data transfer rate or uncoded transmission rate is the total number of physically transferred bits per second over a communication link, including useful data as well as protocol overhead.

In digital multimedia, bitrate often refers to the number of bits used per unit of playback time to represent a continuous medium such as audio or video after source coding (data compression). The encoding bit rate of a multimedia file is the size of a multimedia file in bytes divided by the playback time of the recording (in seconds), multiplied by eight.
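
That formula is easy to check directly. A minimal sketch, with an illustrative 4-minute, 3.8 MB audio file:

```python
# Encoding bitrate per the formula above: file size in bytes, divided by
# playback seconds, times eight to convert bytes to bits.

def encoding_bitrate_kbps(file_size_bytes: int, duration_seconds: float) -> float:
    return file_size_bytes / duration_seconds * 8 / 1000

# A 4-minute song stored in a 3.8 MB file:
print(f"{encoding_bitrate_kbps(3_800_000, 240):.0f} kbit/s")   # ~127 kbit/s
```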

For realtime streaming multimedia, the encoding bitrate is the goodput required to avoid interruption; in the case of file transfer, goodput corresponds to the achieved file transfer rate. The term average bitrate is used in the case of variable-bitrate multimedia source coding schemes. In this context, the peak bitrate is the maximum number of bits required for any short-term block of compressed data.

In digital multimedia, bitrate represents the amount of information, or detail, that is stored per unit of time of a recording. The bitrate depends on several factors, including whether the original material is sampled at different frequencies, uses different numbers of bits, is encoded by different schemes, or may be digitally compressed by different algorithms. Generally, choices are made between these in order to achieve the desired trade-off between minimizing the bitrate and maximizing the quality of the material when it is played.

If lossy data compression is used on audio or visual data (such as when transferring data for reference purposes, archiving and transcription), differences from the original signal will be introduced; if the compression is substantial, or lossy data is decompressed and recompressed, this may become noticeable in the form of compression artifacts. Whether these affect the perceived quality, and if so how much, depends on the compression scheme, encoder power, the characteristics of the input data, the listener’s perceptions, the listener’s familiarity with artifacts, and the listening or viewing environment.

 

What are codecs?

A codec is a computer program capable of encoding or decoding a digital data stream or signal. A codec encodes a data stream or signal for transmission, storage or encryption, or decodes it for playback or editing. Codecs are used in videoconferencing, streaming media and video editing applications. An audio codec converts analog audio signals into digital signals for transmission or storage. A receiving device then converts the digital signals back to analog using an audio decompressor for playback. A video codec accomplishes the same task for video signals, and other codecs do the same for metadata such as subtitles and closed captioning.

Many of the more popular codecs in the software world are lossy, meaning that they reduce quality by some amount in order to achieve compression. Often, this type of compression is virtually indistinguishable from the original uncompressed sound or images, depending on the codec and the settings used. Smaller data sets ease the strain on relatively expensive storage sub-systems such as non-volatile memory and hard disk, as well as write-once-read-many formats such as CD-ROM, DVD and Blu-ray Disc. Lower data rates also reduce cost and improve performance when the data is transmitted.

There are also many lossless codecs, which are typically used for archiving data (for such uses as reference or transcription) in a compressed form while retaining all of the information present in the original stream. If preserving the original quality of the stream is more important than eliminating the correspondingly larger data sizes, lossless codecs are preferred. This is especially true if the data is to undergo further processing such as editing, in which case the repeated application of lossy encoding and decoding will progressively degrade the quality of the resulting data.
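
That generation-loss effect is easy to simulate. In the hedged sketch below, each "generation" applies an edit (a gain change) and then a lossy re-encode, modeled crudely as coarse quantization; the error that accumulates is what a lossless codec would avoid:

```python
# Illustrative generation loss: edit, lossy re-encode, repeat, and watch the
# error against the original signal generally grow across generations.

def quantize(samples, levels=256):
    """Crude stand-in for a lossy encode: snap samples to a coarse grid."""
    step = 2.0 / (levels - 1)
    return [round(s / step) * step for s in samples]

signal = [i / 1000 - 0.5 for i in range(1000)]     # simple ramp in [-0.5, 0.5)
work = signal[:]
for generation in range(1, 6):
    work = [s * 0.9 for s in work]                 # an "edit" (gain down)
    work = quantize(work)                          # lossy re-encode
    work = [s / 0.9 for s in work]                 # undo the edit for comparison
    error = max(abs(a - b) for a, b in zip(signal, work))
    print(f"generation {generation}: peak error {error:.5f}")
```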

Two principal techniques are used in codecs, pulse-code modulation and delta modulation. Codecs are often designed to emphasize certain aspects of the media to be encoded, such as motion, color or surface texture. There are thousands of audio and video codecs, ranging in cost from free to hundreds of dollars or more. This variety of codecs can create compatibility and obsolescence issues, although the impact is lessened for older formats, for which free or nearly-free codecs exist.

Many multimedia data streams contain both audio and video and often some metadata that permit synchronization of audio and video. Each of these three streams may be handled by different programs, processes, or hardware; but for the multimedia data streams to be useful in stored or transmitted form, they must be encapsulated together in a container format.

 

Avid DNxHD codec

Avid DNxHD (Digital Nonlinear Extensible High Definition) is a lossy high-definition video post-production codec engineered for multi-generation compositing with reduced storage and bandwidth requirements for audio, video and metadata (including subtitles and closed captioning) source material. It is an implementation of the SMPTE VC-3 standard. The DNxHD codec was developed by Avid Technology, Inc., and first supported in Avid DS Nitris in 2004 and approved as compliant with the SMPTE VC-3 standard in 2008.

Avid DNxHD is engineered to create mastering-quality HD media at reduced file sizes, increasing real-time HD productivity, whether using local storage or in real-time collaborative workflows. While native HD camera compression formats are efficient, they aren’t engineered to maintain quality during complex post production effects processing.

Uncompressed HD delivers superior image quality, but its data rates and file sizes can greatly hamper workflow. Avid DNxHD allows for optimal mastering picture quality, minimal degradation over multiple generations, reduced storage requirements, realtime HD sharing and collaboration for editing as well as for other uses such as archiving and transcription, and improved multi-stream performance. The DNxHD QuickTime codec also exports much faster from Media Composer than other QuickTime codecs and offers superior playback performance and picture quality in Pro Tools.

DNxHD is intended to be usable both as an intermediate format suitable for use while editing and as a presentation format. DNxHD data is typically stored in an MXF container, although it can also be stored in a QuickTime container. The source code for the Avid DNxHD codec is licensable free of charge, and the QuickTime DNxHD codec is downloadable free of charge for Mac OS X, Windows XP, Windows Vista and Windows 7. The codec has been commercially licensed to a number of companies including Ikegami, FilmLight, Harris Corporation, JVC, Seachange, and EVS Broadcast Equipment.

Ikegami’s Editcam camera system is unique in its support for DNxHD and records directly to DNxHD-encoded video. Such material is immediately accessible by editing platforms that directly support the DNxHD codec. DNxHD is also supported by the Arri Alexa, Blackmagic Design’s HyperDeck Shuttle 2 and HyperDeck Studio, Media Composer, NewsCutter, Symphony, Avid DS and Interplay Assist. A standalone QuickTime codec for Windows XP and Mac OS X is available to create and play QuickTime files containing DNxHD material.

DNxHD is very similar to JPEG: every frame is independent and consists of VLC-coded DCT coefficients. The DNxHD codec was submitted to SMPTE as the framework for the VC-3 family of standards and was approved as SMPTE VC-3 after a two-year testing and validation process in 2008 and 2009.

 

What are VP6, VP7, VP8 and VP9 formats?

VP6, VP7, VP8 and VP9 are high-definition video compression formats and codecs developed by On2 Technologies and used in platforms such as Adobe Flash Player 8 and above, Adobe Flash Lite, JavaFX and other mobile and desktop video platforms. A video codec is a device or software that enables compression or decompression of digital video, audio and metadata, including subtitles and closed captioning.

Digital video codecs are found in DVD players and recorders, video CD systems, in emerging satellite and digital terrestrial broadcast systems and in various digital devices and software products with video recording or playing capability. Online video material is encoded by a variety of codecs, and this has led to the availability of codec packs, a pre-assembled set of commonly used codecs combined with an installer available as a software package for personal computers.

On2 TrueMotion VP6 is a proprietary lossy video compression format and video codec. It is an incarnation of TrueMotion, a series of video codecs developed by On2 Technologies, and is commonly used in Adobe Flash, Flash Video and JavaFX media files. The VP6 codec was introduced in 2003; later incarnations are VP7 and VP8. With Google’s acquisition of On2, VP8 was open-sourced, while earlier versions such as VP6 remain proprietary.

VP7 is a proprietary lossy video compression format and video codec introduced as a successor to VP6 in 2005 with improvements in compression of digital video, making file-sharing for such uses as archiving and transcription of video source material more convenient. Move Networks used the VP7 codec in its Move Media Player plug-in for Firefox and Internet Explorer, used by ABC and Fox networks for streaming of full network television shows.

VP8 is a video compression format owned by Google and created by On2 Technologies as a successor to VP7. In 2010, after the purchase of On2 Technologies, Google provided an irrevocable patent promise on its patents for implementing the VP8 format and released a specification of the format under the Creative Commons Attribution 3.0 license. That same year, Google also released libvpx, the reference implementation of VP8, under a BSD license.

VP9 is an open and royalty-free video compression standard developed by Google. VP9 is a successor to VP8. Chromium, Chrome, Firefox and Opera support playing VP9 video format in the HTML5 video tag. Development of VP9 started in 2011 with one of its goals being to reduce the bitrate by 50% compared to VP8 while having the same video quality.

 

What is Windows Media Player?

Windows Media Player (WMP) is a media player and media library application developed by Microsoft that is used for playing audio and video (including subtitles and closed captioning) and for viewing images on personal computers. Windows Media Player replaced an earlier application called Media Player, introduced in 1991, adding features beyond simple video or audio playback.

In addition to being a media player, Windows Media Player includes the ability to rip music from compact discs and burn recordable discs in Audio CD format or as data discs with playlists such as an MP3 CD, allowing for convenient file-sharing for uses such as archiving audio material and transcription. Windows Media Player also synchronizes content with digital audio players and other mobile devices, and enables users to purchase or rent music from a number of online music stores.

The default file formats for Windows Media Player are Windows Media Video (WMV), Windows Media Audio (WMA), and Advanced Systems Format (ASF) as well as an XML-based playlist format called Windows Playlist (WPL). The player is also able to utilize a digital rights management service in the form of Windows Media DRM.

Windows Media Player supports playback of audio, video and pictures, along with fast forward, reverse, file markers and variable playback speed. It supports local playback, streaming playback with multicast streams and progressive downloads. Items in a playlist can be skipped over temporarily at playback time without removing them from the playlist. Full keyboard-based operation is possible in the player.

The player includes intrinsic support for Windows Media codecs and also WAV and MP3 media formats. Support for any media codec and container format can be added using Media Foundation codecs. Windows Media Player Mobile 10 on Windows Mobile 6.5 supports MP3, ASF, WMA and WMV using WMV or MPEG-4 codecs.

Windows Media Player features integrated audio CD burning support as well as data CD burning support. Data CDs can have any of the media formats supported by the player. While burning Data CDs, the media can be transcoded into WMA format and playlists can be added to the CD as well.

Windows Media Player features universal brightness, contrast, saturation and hue adjustments and pixel aspect ratio for supported video formats. Windows Media Player can also have attached audio and video DSP plug-ins which process the output audio or video data. The player supports subtitles and closed-captioning for local media, video on demand streaming or live streaming scenarios. Windows Media captions support the SAMI file format but can also carry embedded closed caption data.


What is Theora format?

Theora is a free and open lossy video compression format from the Xiph.Org Foundation. It can be used to distribute film and video online (as well as audio and metadata, including subtitles and closed captioning) and on disc without the licensing and royalty fees or vendor lock-in associated with other formats. Xiph.Org Foundation is a non-profit organization that produces free multimedia formats and software tools. Theora is named after Theora Jones, a character in the Max Headroom television program.

Theora scales from postage-stamp to HD resolution and is considered particularly competitive at low bitrates. It is in the same class as MPEG-4/DivX, and like the Vorbis audio codec it has room for improvement as encoder technology develops. Theora has been in full public release since November 3, 2008. The bitstream format for Theora was frozen in 2004, and all bitstreams encoded since that date remain compatible with future releases.

Theora is a variable bitrate DCT (discrete cosine transform)-based video compression scheme. Like most common video codecs, Theora also uses chroma subsampling and block-based motion compensation. Pixels are grouped into various structures, namely super-blocks, blocks and macroblocks. Theora supports intra-coded frames and forward-predictive frames, but not the bi-predictive frames which are found in H.264 and VC-1. Theora also does not support interlacing.

The Theora video compression format is essentially a backward-compatible superset of the VP3 format: VP3 streams can be converted into Theora streams without recompression, but not vice versa. VP3 video can be decoded using Theora implementations, but Theora video usually cannot be decoded using VP3 implementations. The libtheora reference implementation provides the standard encoder and decoder under a BSD license.

Theora video streams can be stored in any suitable container format, allowing convenient file-sharing of source material for such uses as archiving and transcription of original video programming. Most commonly it is found in the Ogg container with Vorbis or FLAC audio streams which provide a completely open, royalty-free multimedia format. It can also be used with the Matroska container.

Theora is derived from the proprietary VP3 codec, released into the public domain by On2 Technologies. It is comparable in design and bitrate efficiency to MPEG-4 Part 2, early versions of Windows Media Video, and RealVideo.

Theora is established as a video format in open-source applications and is the format used for Wikipedia’s video content. Theora is supported by browsers such as Mozilla Firefox 3.5 and later, Google Chrome, Tizen, SeaMonkey, Konqueror, Opera and Midori. Theora is also supported by DirectShow, GStreamer, Phonon, QuickTime, Silverlight, FFmpeg, Helix Player, Miro Media Player, MPlayer, VLC, xine and Dragon Player.

 

What is MPEG-1?

MPEG-1 is a standard for lossy compression of video and audio. It is designed to compress VHS-quality raw digital video and CD audio, along with metadata such as subtitles and closed captioning, without excessive quality loss, making Video CDs, digital cable and satellite TV and digital audio broadcasting (DAB) possible, as well as allowing for easy file-sharing of moving pictures for such uses as reference, archiving and transcription. MPEG-1 is used in a large number of products and technologies, most notably the MP3 audio format. MPEG-1 was developed by the Moving Picture Experts Group (MPEG) beginning in 1988 to address the need for standard video and audio formats and to build on the H.261 standard, achieving better quality through more complex encoding methods.

In July 1990, before the first draft of the MPEG-1 standard had been completed, work began on a second standard, MPEG-2, intended to extend MPEG-1 technology to provide full broadcast-quality video at high bitrates and to address shortcomings of the original MPEG-1 standard: an audio compression system limited to two channels (stereo), no standardized support for interlaced video (and poor compression when used for interlaced video), and only one standardized profile, which was unsuited to higher-resolution video.

The MPEG-1 standard very strictly defines the bitstream and decoder function but does not define how MPEG-1 encoding is to be performed, although a reference implementation is provided in ISO/IEC 11172-5. This means that MPEG-1 coding efficiency can vary drastically depending on the encoder used, and that newer encoders generally perform significantly better than their predecessors.

Most popular software for video playback includes MPEG-1 decoding in addition to any other supported formats. Virtually all digital audio devices can play back MPEG-1 Audio. Before the formulation of MPEG-2, many digital satellite/cable TV services used MPEG-1 exclusively. The widespread popularity of MPEG-2 with broadcasters means MPEG-1 can be played by most digital cable and satellite converter boxes and digital disc and tape players due to backwards compatibility.

MPEG-1 is the exclusive video and audio format used on Green Book CD-i, the first consumer digital video format, and on Video CD (VCD), still a very popular format around the world. The Super Video CD standard, based on VCD, uses MPEG-1 audio exclusively, as well as MPEG-2 video. Most DVD players also support Video CD and MP3 CD playback, which use MPEG-1.

The primary file extension for MPEG-1 or MPEG-2 audio and video compression is .mpg. The most common file extension for MPEG-1 Layer 3 audio is .mp3. An MP3 file is typically an uncontained stream of raw audio.
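
Because a raw MP3 stream has no container, players locate frames by scanning for the frame-sync pattern. A minimal sketch (real parsers also validate the bitrate and sample-rate fields that follow the sync, omitted here):

```python
# Find MPEG audio frame-sync points in a raw (uncontained) stream: a frame
# header begins with 11 set bits, i.e. 0xFF followed by a byte whose top
# three bits are set (0xE0 mask).

def find_frame_syncs(data: bytes, limit: int = 5):
    offsets = []
    for i in range(len(data) - 1):
        if data[i] == 0xFF and (data[i + 1] & 0xE0) == 0xE0:
            offsets.append(i)
            if len(offsets) == limit:
                break
    return offsets

# Example: find_frame_syncs(open("song.mp3", "rb").read())
```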

 

What is Adobe Flash Player?

Adobe Flash Player (labeled Shockwave Flash in Internet Explorer and Firefox) is freeware for viewing multimedia, executing rich Internet applications, and streaming video, audio and metadata content such as subtitles and closed captioning created on the Adobe Flash platform. Flash Player can run from a web browser as a browser plug-in or on supported mobile devices.

Flash Player is a common format for games, animations, and GUIs (graphical user interfaces) embedded into web pages. Flash Player can be downloaded for free, and its plug-in version is available for recent versions of web browsers such as Internet Explorer, Mozilla Firefox, Google Chrome, Opera and Safari on selected platforms. Each version of Adobe Flash Player is backwards-compatible, making it possible to access material archived on older systems for such uses as reference and transcription.

Flash Player can run from a web browser as a plug-in or on supported mobile devices, and standalone application versions are also available for Windows and Mac OS X. Flash Player runs SWF files that can be created by Adobe Flash Professional, by Adobe Flex or by a number of other Macromedia and third-party tools. Flash Player was created by Macromedia and developed and distributed by Adobe Systems after its acquisition. Flash Player supports vector and raster graphics, 3-D graphics, an embedded scripting language called ActionScript, and streaming of video and audio.

Flash Player executes and displays content from a provided SWF file, although it has no in-built features to modify the SWF file at runtime. It can execute software written in the ActionScript programming language which enables the runtime manipulation of text, data, graphics, sound and video. The player can also access certain connected hardware devices, including web cameras and microphones.

Flash Player is used internally by Adobe Integrated Runtime (Adobe AIR) to provide a cross-platform runtime environment for desktop and mobile applications. Flash applications must be built specifically for Adobe AIR in order to use its additional features, such as file system integration, native client extensions, native window/screen integration, taskbar/dock integration, and hardware integration with connected accelerometer and GPS devices.

MP3, Flash Video (FLV and F4V), PNG (Portable Network Graphics), JPEG and GIF files can be accessed and played back via Flash Player from a server via HTTP or embedded inside an SWF file.

Flash Player is available for many major desktop platforms, including Windows XP and later, OS X 10.6 and later, and Linux, as well as Google Chrome. Flash Player is supported on mobile and tablet devices from vendors such as Acer, BlackBerry, Dell, HTC, Lenovo, Logitech, LG, Motorola, Samsung, Sharp, SoftBank, Sony and Toshiba.

What are H.263 / MPEG-4 AVC video formats?

H.263 and H.264 (also known as MPEG-4 Part 10, Advanced Video Coding, or MPEG-4 AVC) are video compression formats used for the recording, compression, and distribution of video content as well as audio and metadata such as subtitles or closed captioning. H.263 was developed by the ITU-T Video Coding Experts Group (VCEG) in 1996. The first version of H.264 was completed in May 2003, and various extensions of its capabilities have been added in subsequent editions.

H.263 was originally designed for use in videoconferencing, but as H.264 provides significant improvement in capability beyond H.263, the H.263 standard is now considered a legacy design. Most new videoconferencing products now include H.264 as well as H.263 and H.261 capabilities.

H.264 (MPEG-4 AVC) is a block-oriented, motion-compensation-based video compression standard developed by VCEG together with the ISO/IEC JTC 1 Moving Picture Experts Group (MPEG). H.264 is best known as one of the video-encoding standards for Blu-ray Discs. It is also widely used by streaming Internet sources such as Vimeo, YouTube and the iTunes Store, by web software such as Adobe Flash Player and Microsoft Silverlight, and by various HDTV broadcasts over terrestrial, cable and satellite networks.

The intent of the H.264/AVC project was to create a standard capable of providing good video quality at substantially lower bitrates than previous standards, without increasing design complexity so much that implementation would be impractical or excessively expensive, and with enough flexibility to allow the standard to be applied to a wide variety of applications, allowing for convenient file-sharing for such uses as archiving and transcription of original video.

The H.264 video format has a broad application range that covers all forms of digital compressed video from low bitrate Internet streaming applications to HDTV broadcast and Digital Cinema applications with nearly lossless coding. To ensure compatibility and problem-free adoption of H.264 AVC, many standards bodies have amended or added to their video-related standards so that users of these standards can employ H.264 AVC.

The Digital Video Broadcast project (DVB) approved the use of H.264 AVC for broadcast television in 2004. The Advanced Television Systems Committee (ATSC) standards body in the United States approved the use of H.264 AVC for broadcast television in 2008, although the standard is not yet used for fixed ATSC broadcasts within the United States. It has also been approved for use with the more recent ATSC-M/H (Mobile/Handheld) standard, using the AVC and SVC portions of H.264. H.264 is also used as part of recording formats by Sony, Panasonic, Canon and Nikon as well as in closed circuit TV and video surveillance products.

What is SAMI?

SAMI (Synchronized Accessible Media Interchange) is a file format designed to deliver synchronized text such as closed captioning, subtitles, or audio descriptions with digital media content. SAMI was released by Microsoft in 1998. The files may have either .smi or .sami file extensions.

At its most basic level SAMI can be used as an intermediate interchange format for encoding closed captions in the form of Line-21 for National Television System Committee (NTSC) and MPEG for DVDs. The audio, video and animation content is stored separately and synchronized with the SAMI content. Thus, Web content can come from different sites.

For example, video content may come from a main server while the captioning content may arrive from a local server. This allows for tailored captioning for different regions of the world. Furthermore, dynamic content like news clips can be posted to a Web site as they arrive while closed captioning can then be added by a local caption provider at a later time from any location in the world.

SAMI allows captioning content to be independent of the audio format. For example, although Audio Video Interleave (AVI) cannot store caption data while QuickTime can, SAMI captioning will work with both formats. As long as a metric of overall duration and elapsed time is generated, captions can be synchronized to audio content.
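
For illustration, a minimal SAMI document might look like the one written below; the class name ENUSCC and the output file name are illustrative conventions, not requirements of the format:

```python
# Write a minimal SAMI (.smi) caption file: <SYNC Start=...> blocks give the
# caption times in milliseconds, a per-language class selects the track, and
# an &nbsp; cue clears the previous caption.
SAMI_DOC = """\
<SAMI>
<HEAD><STYLE TYPE="text/css"><!--
P { font-family: sans-serif; }
.ENUSCC { Name: English; lang: en-US; }
--></STYLE></HEAD>
<BODY>
<SYNC Start=1000><P Class=ENUSCC>Hello, world.</P></SYNC>
<SYNC Start=3500><P Class=ENUSCC>&nbsp;</P></SYNC>
</BODY>
</SAMI>
"""

with open("captions.smi", "w", encoding="utf-8") as f:
    f.write(SAMI_DOC)
```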

SAMI captions can also be seen by a user on a text-only device. Over a slow link, a user could access audio content without having to download the entire image or sound file. These would simply show up as a linear transcription. Furthermore, if a serial out option is provided, users can still enjoy the content of a movie or captioned radio via their Braille display.

SAMI supports 64 or more languages as well as many media formats, so it is far more efficient to use SAMI for content indexing and retrieval than to build filters for every format and manner of encoding employed. A SAMI file can be easily searched for content, which then allows the associated video, animation or audio content to be accessed.

In addition to standard fonts, SAMI can support other text styles, such as different colors, sizes, or languages, to aid a variety of users. SAMI can be particularly useful for individuals who are hard of hearing or visually impaired. The SAMI format can also serve educational purposes, such as teaching beginning readers or students learning a second language.

SAMI files are supported by Windows Media Player, Chameleo, GOM Player, K-Multimedia Player, Media Player Classic Home Cinema, MPlayer, PBS Cove, Perian, Plex, VLC Media Player, XBMC and xine.

CEA-708

CEA-708 (Consumer Electronics Association-708), also called EIA-708, is the current standard for closed captioning (textual transcription of the audio content of a program) for Advanced Television Systems Committee (ATSC) digital television (DTV) streams in the United States and Canada. It was developed by the Electronic Industries Alliance.

Unlike run-length-encoded digital video broadcasting (RLE DVB) and DVD subtitles, CEA-708 captions are low-bandwidth and textual, like traditional CEA-608 captions and EBU Teletext/Ceefax subtitles. Building upon the prior CEA-608 standard, CEA-708 has support for more character sets and better caption-positioning options.

CEA-708 captions are injected into MPEG-2 video streams in the picture user data. The packets are in picture order and must be rearranged just as picture frames are. This is known as the DTVCC Transport Stream, a fixed-bandwidth channel that accommodates backward-compatible line-21 captions and CEA-708 captions. The main form of signalling is a caption descriptor in the Program and System Information Protocol (PSIP) event information table (EIT), which indicates, on a per-event basis, the language of each caption and whether it is formatted for “easy reader” (third-grade level, for language learners).

CEA-708 caption decoders are required in the U.S. by Federal Communications Commission regulation in all 13-inch diagonal or larger digital televisions. Some broadcasters are required by FCC regulations to caption a percentage of their broadcasts.

CEA-608, also known as EIA-608 and line-21 captions, is the former standard for closed captioning for NTSC (National Television System Committee) TV broadcasts in the United States, Canada and Mexico. It was developed by the Electronic Industries Alliance and was required by law to be implemented in most television receivers made in the United States. Scenarist Closed Caption (SCC), with the file extension .scc, is the preferred file format when captions are based on CEA-608 features.

CEA-608 captions are transmitted on either the odd or even fields of line 21, with an odd parity bit, in the non-visible active video data area of NTSC broadcasts, and are also sometimes present in the picture user data of ATSC transmissions. The odd-field captions relate to the primary audio track, while the even-field captions relate to the second audio program (SAP), usually a second-language translation of the primary audio, such as a French or Spanish translation of an English-language TV show. CEA-608 was later revised with the addition of extended character sets to fully support the representation of Spanish, French and other Western European languages, as well as two-byte characters for the Korean and Japanese markets.
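
The line-21 odd-parity rule is simple to express in code: each transmitted byte carries 7 data bits plus a parity bit chosen so that the total number of set bits is odd. A minimal sketch:

```python
# Apply CEA-608 odd parity: set the eighth bit only when the 7 data bits
# contain an even number of ones, so every byte has odd overall parity.

def with_odd_parity(ch: int) -> int:
    ch &= 0x7F                                  # keep the 7 data bits
    ones = bin(ch).count("1")
    return ch | 0x80 if ones % 2 == 0 else ch   # add parity bit if needed

# 'A' (0x41) has two set bits, so the parity bit is set: prints 0xc1.
print(hex(with_odd_parity(0x41)))
```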

World Wide Web Consortium (W3C)

The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. W3C was founded in 1994 by Tim Berners-Lee, inventor of the World Wide Web, author of the first Web server and client program, and author of HyperText Markup Language (HTML), the computer language primarily used to publish information on the Web. W3C is composed of member organizations, from Apple, Inc. to Zhejiang University, who work together to develop and maintain standards for the Web, as well as to educate, develop software and serve as a forum for discussion about the Web. Members of W3C include businesses, nonprofit organizations, universities, governmental entities, and individuals.

W3C’s purpose is to create standards that allow common accessibility among all users of the Web, particularly by addressing incompatibility issues between industry vendors, essentially making it a quality control apparatus for the Web. Standards recommended by member organizations proceed through a four-step process before they are certified as being W3C-compatible.

According to the W3C’s Web site, their mission is “to lead the World Wide Web to its full potential by developing protocols and guidelines that ensure the long-term growth of the Web.” This includes the democratization of the Web, making it easier to share information regardless of differences in technical capacity (hardware, software, and the multitude of devices with which one can access the Web), language, location or physical limitations, as illustrated by the W3C’s standards for the use of closed captioning over the Web.

W3C’s standards also help maintain the open-endedness of the Web by working to maintain a trustworthy structure that allows individuals and networks from all over the world to communicate and share files safely for a multitude of purposes from participating in social networks to viewing filmed entertainment to archiving recorded material for reference and transcription.

W3C is administered by the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) in the United States, the European Research Consortium for Informatics and Mathematics (ERCIM) in France, Keio University in Japan and Beihang University in China. The W3C also has world offices in sixteen regions around the world, working with regional Web communities to promote W3C technologies in local languages, broaden W3C’s geographical base, and encourage international participation in W3C Activities.

Jeffrey Jaffe, former CTO of Novell, is the CEO of W3C. The W3C has a small staff, and most of its work is done by experts in the consortium’s working groups.


What is DivX?

DivX is the brand name of products created originally by DivX LLC, including the DivX Codec which became popular due to its ability to compress lengthy video segments into small sizes while maintaining high visual quality. The three DivX codecs include the original MPEG-4 Part 2 DivX, the H.264/MPEG-4 AVC DivX Plus HD codec and the DivX high-efficiency video coding (HEVC) Ultra HD codec. It is one of several codecs commonly associated with ripping, in which audio and video multimedia as well as metadata including subtitles and closed captioning are transferred to a hard disk and transcoded.

DivX 6 added an optional media container format called DivX Media Format (DMF), with a .divx extension, that includes DVD-Video- and VOB-like container features. This media container format is used for the MPEG-4 Part 2 codec. Some features of DMF include multiple video streams, multiple audio tracks, and multiple subtitles and other metadata.

DivX Plus HD is a file type using the standard Matroska media container format, file extension .mkv, rather than the proprietary DivX Media Format. DivX Plus HD files contain an H.264 video bitstream, AAC Surround Sound audio, and a number of XML-based attachments defining chapters, subtitles and metadata. This media container format is used for the H.264/MPEG-4 AVC codec.

DivX Video on Demand (DivX VOD) is DivX’s version of digital rights management (DRM), which allows content copyright holders to control distribution. DivX, Inc. has format approval from major Hollywood studios including Sony, Paramount and Lionsgate, which have allowed content retailers to sell protected videos that will play on current and previous generations of DivX certified devices.

DivX Plus for Windows is a suite of software for Microsoft Windows that contains the DivX codec, a standalone media player, and a video converter and media player plug-in for web browsers. It was released on 16 March 2010.

DivX Plus Player is a standalone media player which allows the conversion and transfer of video files via USB or optical disk, allowing for convenient file sharing for uses such as editing, archiving and transcription. DivX Plus Player also features a media library as well as a set of DRM features that help authorize purchased commercial content for the consumer’s computer and DivX-certified devices.

The DivX Plus Converter can convert AVI, MP4, QuickTime video, Windows Media Video, AVCHD and RMVB files to DivX Plus HD format. With the purchase of a license, it can also convert video files to DMF files. With the purchase of an additional plug-in, DivX Plus Converter can also support MPEG video, MPEG-2 video, DVD Video and Video CD.

What is a Motion JPEG (MJPEG or M-JPEG)?

Motion JPEG (MJPEG or M-JPEG) is a video compression format in which each video frame or interlaced field of a digital video sequence (including video and metadata such as subtitles and closed captioning) is compressed separately as a JPEG image. Originally developed for multimedia PC applications, MJPEG is now used by video-capture devices such as digital cameras, IP cameras, webcams, and by nonlinear video editing systems. It is supported by the QuickTime Player, the PlayStation console and browsers such as Safari, Google Chrome and Mozilla Firefox. MJPEG was first used by the QuickTime Player in the mid-1990s.

MJPEG is an intra-frame-only compression scheme. Because frames are compressed independently of one another, MJPEG imposes lower processing and memory requirements on hardware devices. The image quality of MJPEG is therefore a direct function of each frame’s spatial complexity: frames with large smooth transitions or monotone surfaces compress well and are more likely to hold their original detail with few visible compression artifacts, while frames exhibiting complex textures, fine curves and lines are prone to DCT artifacts such as ringing, smudging and blocking. Because no frame depends on another, rapid motion between frames does not degrade quality as it can with interframe compression schemes, which also require more hardware to meet the memory demands of interframe compression.

MJPEG is frequently used in non-linear video editing systems. Desktop CPUs are powerful enough to work with high-definition MJPEG video without special hardware, and because every frame is compressed independently, editors get native random access to any frame. MJPEG support is also widespread in video capture and editing equipment, allowing for easy file-sharing for uses such as archiving and transcription.

Prior to the recent rise in MPEG-4 encoding in consumer devices, a progressive scan form of MJPEG saw widespread use in the movie modes of digital still cameras, allowing video encoding and playback through the integrated JPEG compression hardware with only software modification. The AMV video format is a modified version of MJPEG.

Many network-enabled cameras provide MJPEG streams that network clients can connect to. Mozilla and Webkit-based browsers have native support for viewing MJPEG streams. Some network-enabled cameras provide their own MJPEG interfaces as part of the normal feature set. For cameras that don’t provide this feature natively, a server can be used to transcode the camera pictures into an MJPEG stream and then provide that stream to other network clients.
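
Because an MJPEG stream is essentially a series of concatenated JPEG images, a client can recover individual frames by scanning for the JPEG start-of-image (FF D8) and end-of-image (FF D9) markers. A minimal sketch, assuming a camera at the hypothetical URL below:

```python
import urllib.request

# Hypothetical camera URL; real cameras expose their own MJPEG endpoints.
stream = urllib.request.urlopen("http://camera.example/video.mjpg")

buf = b""
while True:
    buf += stream.read(4096)
    start = buf.find(b"\xff\xd8")            # JPEG start-of-image marker
    end = buf.find(b"\xff\xd9", start + 2)   # JPEG end-of-image marker
    if start != -1 and end != -1:
        with open("frame.jpg", "wb") as f:
            f.write(buf[start:end + 2])      # one complete JPEG frame
        break
```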

The MJPEG standard emerged from a market-adoption process rather than a standards body and thus enjoys broad client support. Most major web browsers and video players provide native support and plug-ins are available for the rest.

Synchronized Multimedia Integration Language (SMIL) markup language

Synchronized Multimedia Integration Language (SMIL) is a World Wide Web Consortium Extensible Markup Language (XML) standard that enables authoring of interactive audiovisual presentations. SMIL is typically used for multimedia presentations which integrate streaming audio and video with images, text or any other type of media, including animations, visual transitions, and metadata such as subtitles and closed captioning. SMIL is an easy-to-learn HTML-like language, and many SMIL presentations are written using a simple text editor.

SMIL allows the author to present media items such as text, images, video, audio, links to other SMIL presentations, and files from multiple web servers, allowing for convenient file sharing for uses such as editing, archiving and transcription. SMIL markup is written in XML and has similarities to HTML. SMIL files commonly take the .smil file extension, since the alternative .smi extension is shared by other programs and formats.

Authoring and rendering tools for SMIL include RealSlideshow Basic by RealNetworks, GoLive6 by Adobe and TransTool, an open-source transcription tool. SMIL players include Adobe Media Player, QuickTime Player, RealPlayer and Windows Media Player. SMIL presentations can be accessed via a computer’s browser with the use of a plug-in.

Some browsers, including Mozilla, are incorporating SMIL and other XML-related technologies into their browsers. SMIL is also able to access scalable vector graphics (SVG) animation. SMIL can be used on handheld and mobile devices and has also engendered the Multimedia Messaging Service (MMS), a video and picture equivalent of Short Message Service (SMS). SMIL is also one of the underlying technologies used by HD-DVD for advanced interactivity. The internet video site Hulu uses SMIL as part of its media-playing technology.

SMIL documents are similar in structure to HTML documents in that they are typically divided between a required body section, which contains timing information, and an optional head section, which contains layout and metadata information. SMIL refers to media objects by URLs, allowing them to be shared between presentations and stored on different servers for load balancing. The language can also associate different media objects with different bandwidth requirements.
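
A minimal, illustrative SMIL document (file names hypothetical): the optional head carries layout, while the required body carries timing, with the par element playing its children in parallel.

```xml
<smil>
  <head>
    <layout>
      <root-layout width="640" height="480"/>
    </layout>
  </head>
  <body>
    <par>
      <video src="movie.mpg"/>
      <audio src="narration.wav"/>
    </par>
  </body>
</smil>
```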

SMIL can be used as a script or playlist tying sequential pieces of multimedia together which can then be syndicated through RSS or Atom. In addition, the combination of multimedia-laden .smil files with RSS or Atom syndication is useful for accessibility to audio-enabled podcasts by the hard-of-hearing through Timed Text closed captions, and can also turn multimedia into hypermedia that can be hyperlinked to other linkable audio and video multimedia.

What is a WAV file?

Waveform Audio File Format (WAVE, more commonly known as WAV due to its filename extension .wav) is a Microsoft and IBM audio file format standard for storing an audio bitstream on PCs. It is an application of the Resource Interchange File Format (RIFF) bitstream format method for storing data in chunks. It is the main format used on Windows systems for raw and typically uncompressed audio without video or metadata such as subtitles or closed captioning. The usual bitstream encoding is the linear pulse-code modulation (LPCM) format. Both WAVs and AIFFs (Audio Interchange File Format) are compatible with Windows, Macintosh and Linux operating systems. The RIFF format acts as a “wrapper” for various audio coding formats.

Though a WAV file can contain compressed audio, the most common WAV audio format is uncompressed audio in the LPCM format. LPCM is also the standard audio coding format for audio CDs. Since LPCM is uncompressed and retains all of the samples of an audio track, professional users or audio experts may use the WAV format with LPCM audio for maximum audio quality. WAV files can also be edited and manipulated with relative ease using software. The WAV format supports compressed audio using Audio Compression Manager in Windows. Any ACM codec can be used to compress a WAV file.
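
As a small illustration of how approachable LPCM WAV files are in software, this sketch writes one second of 16-bit stereo silence using Python’s standard-library wave module (the file name is hypothetical):

```python
import wave

with wave.open("silence.wav", "wb") as w:
    w.setnchannels(2)                        # stereo
    w.setsampwidth(2)                        # 16-bit (2-byte) samples
    w.setframerate(44100)                    # CD-quality sample rate
    w.writeframes(b"\x00\x00" * 2 * 44100)   # one second of silence
```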

Uncompressed WAV files are large, so file sharing of WAV files over the Internet is uncommon. However, it is a commonly used file type, suitable for retaining first-generation archived files of high quality, for use on a system where disk space is not a constraint, or in applications such as audio editing and transcription, where the time involved in compressing and uncompressing data is a concern.

Broadcast Wave Format (BWF) is an extension of the Microsoft WAVE audio format and is the recording format of most file-based nonlinear digital recorders used for motion picture, radio and television production. It was first specified by the European Broadcasting Union in 1997 and updated in 2001 and 2003. The purpose of this file format is the addition of metadata to facilitate the seamless exchange of sound data between different computer platforms and applications. It specifies the format of metadata, allowing audio processing elements to identify themselves, document their activities, and synchronize with other recordings. This metadata is stored as extension chunks in a standard digital audio WAV file. Since the only difference between a BWF and a normal WAV file is the extended information in the file header, a BWF does not require a special player for playback.

Pop-on and Roll-up captioning defined

Pop-on and roll-up are the two styles of presentation commonly used in television captioning.  Paint-on captions are sometimes used for special effects but are much less common.  Accurate Secretarial LLC endorses the use of the pop-on style for all television captioning.  WebVTT supports pop-on as well as paint-on captions.

Pop-on captions, as their name suggests, appear in boxes one or two lines long at the bottom of the screen, though they may be placed on the screen in such a way as to indicate the current speaker, and then pop off again to be replaced by the next caption box. In the case of audio content that cannot be confined to a single pop-on box, one pop-on box is replaced by another at logical breaks in the text (ends of phrases or sentences). Pop-on captions include edited audio information to aid viewer comprehension of the program. This may include dialog, narration, sound effects, indications of offscreen activity, etc. Pop-on captions are synchronized with the audio content of the program and thus are typically used for prerecorded programs that include multiple speakers.

Roll-up captions are used predominantly in conjunction with live television broadcasts, such as news reports and live sporting events. Up to four lines of captioned text appear in a box at the bottom of the screen. While the box remains stable on the screen, captions appear word by word and scroll to the top of the box, where the top line disappears and is replaced by the next line. Essentially, new lines appear at the bottom of the box as they are created and push the older text upward.

Because roll-up captions are created to reflect audio content generated in real time, there is always some lack of synchronization between the text and the words being spoken onscreen. This can be disconcerting to some viewers, not unlike the effect of watching a program and discovering the audio is on a several-second delay. Viewers who use lip-reading in addition to reading captions may also find this problematic.

In addition, the time constraints involved in actively transcribing a program as it is being broadcast invariably result in a loss of accuracy in transcription, including misspellings, misunderstood words, or simply mistaken keystrokes.  The time pressure also frequently requires substantial editing by the transcriptionist, for example, in the case of multiple speakers talking at the same time or whenever multiple audio inputs overlap one another.

Digital Entertainment Content Ecosystem (DECE) and UltraViolet

The Digital Entertainment Content Ecosystem (DECE) is a cross-industry initiative developing the next generation digital media experience based on open, licensable specifications. DECE members include network hardware manufacturers such as Cisco, computer, television and mobile device manufacturers such as Sony, Samsung and HP, content producers such as Fox and Warner Brothers, audio and visual encoding companies such as Dolby and DivX, cable companies and content distributors such as Comcast and Netflix, and big box retailers such as Best Buy. These companies have joined together to develop and operate UltraViolet, which enables consumers to purchase digital video content from a choice of online retailers and play it on a variety of devices and platforms made by different manufacturers.

UltraViolet is a free, cloud-based digital rights library that allows users of digital home entertainment content to stream and download licensed content to multiple platforms and devices. An UltraViolet account is a Digital Rights Locker where licenses for purchased content are stored and managed regardless of point of sale. The UltraViolet digital locker does not store video files and is not a cloud storage platform. Rather, UltraViolet coordinates and manages the licenses for each account but not the content itself. In this way, UltraViolet bypasses the cost of storage and bandwidth used when the media is accessed and insulates itself from future technological advances, allowing users to continue watching content they have purchased even when the players become outdated.

UltraViolet content is available from many existing movie-streaming services using their existing streaming and DRM technologies. Some services offer downloads that can be saved on notebook PCs, tablets, gaming consoles or phones for offline viewing. Content can also be streamed over the Internet to an unlimited number of devices, depending on the content license rights held by the streaming provider.

With the UltraViolet Common File Format (CFF), downloaded files can be copied between devices, stored on physical media such as DVDs or flash memory, and then played on any UltraViolet device or software player registered to the household account. The Common File Format uses the Common Encryption (CENC) system. This ensures that a consistent set of codecs, media formats, DRMs, subtitling and metadata such as closed captioning is used across the whole UltraViolet ecosystem. The CFF uses the .uvu file extension, is based on existing standards from MPEG, SMPTE and others, and was originally derived from the Microsoft Protected Interoperable File Format (PIFF) specification. The goal was to avoid the problem of different file formats for different players and to make it possible to copy files from player to player, allowing for convenient file sharing for uses such as editing, archiving and transcription.

What are DV and DVCPRO?

DV is a format for storing digital video, audio and metadata such as subtitles and closed captioning. It was launched in 1995 through the joint efforts of multiple producers of video camera recorders. DV uses lossy compression of video, while audio is stored uncompressed. An intra-frame video compression scheme is used to compress video on a frame-by-frame basis. All DV variants except DVCPRO Progressive are recorded to tape within an interlaced video stream; film-like frame rates are possible by using pulldown transfer of film to video.

DVCPRO, also known as DVCPRO25, is a variation of DV developed by Panasonic and introduced in 1995 for use in electronic news-gathering (ENG) equipment. In 1996 Sony responded with its own professional version of DV called DVCAM. Like DVCPRO, DVCAM uses locked audio, which prevents audio synchronization drift that may happen on DV if several generations of copies are made.

DVCPRO50 was introduced by Panasonic in 1997 for high-value electronic news gathering and digital cinema and is often described as two DV codecs working in parallel. Comparable formats include Sony’s Digital Betacam, launched in 1993, and MPEG IMX, launched in 2001. DVCPRO Progressive was introduced by Panasonic for news gathering, sports journalism and digital cinema. Like HDV-SD, it was meant as an intermediate format during the transition time from standard definition to high definition video.

DVCPRO HD, also known as DVCPRO 100, is a high-definition video format that can be thought of as four DV codecs that work in parallel. While technically DVCPRO HD is a direct descendant of DV, it is used almost exclusively by professionals. Tape-based DVCPRO HD cameras exist only in shoulder mount variant. A similar format, Digital-S (D-9 HD), is offered by JVC and uses videocassettes with the same form-factor as VHS. The main competitor to DVCPRO HD is HDCAM, offered by Sony. It uses a similar compression scheme but at higher bitrate.

Tape-based DV variants, except for DVCPRO Progressive, do not support native progressive recording; therefore, progressively acquired video is recorded within an interlaced video stream using the pulldown transfer technique, the same technique used in television to broadcast movies.

DV was originally designed for recording onto magnetic tape. Tape is enclosed into small, medium, large and extra-large videocassettes. Any DV cassette can record any variant of DV video; nevertheless, manufacturers often label cassettes with DV, DVCAM, DVCPRO, DVCPRO50 or DVCPRO HD and indicate recording time with regard to the label posted. With the proliferation of tapeless camcorder video recording, DV video can be recorded on optical discs, solid-state flash memory cards and hard disk drives and used as computer files. This allows for easy file-sharing for such uses as archiving and transcription.

What are RealVideo and RealPlayer?

RealVideo is a suite of proprietary video compression formats developed by RealNetworks and first released in 1997. RealVideo is supported on many platforms, including Windows, Mac, Linux, Solaris, and several mobile phones. RealVideo is usually paired with RealAudio and packaged in a RealMedia container, which can also include metadata including subtitles and closed captioning, with the file extension .rm. RealMedia is suitable for use as a streaming media format. Streaming video can be used to watch live television, since it does not require downloading the entire video in advance. The official player for RealVideo is RealNetworks RealPlayer SP, which is available for various platforms including Windows, Macintosh and Linux.

To facilitate real-time streaming, RealVideo and RealAudio use constant bitrate encoding, sending the same amount of data over the network each second. RealMedia Variable Bitrate (RMVB) allows for better video quality but is less suited for streaming because it is difficult to predict how much network capacity a certain video stream will need. Video with fast motion or rapidly changing scenes requires a higher bitrate.

RealPlayer, formerly RealOne Player, is a cross-platform adware media player created by RealNetworks and primarily used for playing recorded media. The media player is compatible with numerous multimedia formats, including MP3, MPEG-4, QuickTime, Windows Media, and multiple versions of the proprietary RealAudio and RealVideo formats. RealPlayer is also available for Macintosh, and Linux, Unix, Palm OS, Windows Mobile and Symbian versions have been released. The software is powered by an underlying open-source media engine called Helix.

The first version of RealPlayer was introduced in April 1995 as RealAudio Player and was one of the first media players capable of streaming media over the Internet. RealPlayer was initially accessed by many users as a plug-in to watch streaming video or listen to streaming audio but has since been overtaken in popularity by Adobe Flash.

Features of RealPlayer include a web browser, graphical animations, equalizer and video controls including cross-fade and gapless playback, audio recording, LivePause (which pauses streaming video clips), CD ripping, and a media converter that converts media to the proper format and transfers video to iPods, cell phones, Xbox, PS3 and other devices. This allows for convenient file sharing of filmed material for such uses as editing, archiving and transcription. Users of RealPlayer can also post videos to Facebook and Twitter directly from the software.

RealPlayer SP for Windows includes audio CD burning capabilities, DVR-style playback buffering, multimedia search, Internet radio, a jukebox-style file library, an embedded web browser using Microsoft Internet Explorer, and the ability to convert and transfer media to a wide range of devices.

What is TTML?

TTML (Timed Text Markup Language), formerly specified as DFXP, is a standard published in November 2010 by the World Wide Web Consortium (W3C) that covers timed text on the Web, with the goal of defining a nonproprietary, standardized format for displaying text synchronized with other elements such as audio and video. Timed text refers to the presentation of text media in conjunction with other media, such as audio and video. Typical applications of timed text are the real-time subtitling of foreign-language movies on the Web, closed captioning for people lacking audio devices or having hearing issues, karaoke, scrolling news items and teleprompter applications.
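
A minimal, illustrative TTML document, showing a single caption timed against the media clock (the text and timings are hypothetical):

```xml
<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en">
  <body>
    <div>
      <p begin="00:00:01.000" end="00:00:04.000">Hello, world.</p>
    </div>
  </body>
</tt>
```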

Similar to other professional sectors with a highly automated production process, the broadcast industry depends on reliable and stable standards to guarantee the quality of its services. The benefit of a formal standard that uses XML (Extensible Markup Language), an established technology, is driven in this context by the large increase in video content distributed over IP-based networks. The demand for subtitles to accompany video is rising as well, and in some regions providing subtitles with online content is also an obligation imposed by the regulator.

Incompatible, nonstandardized formats for captioning, subtitling and other forms of timed text are frequently found on the Web. Often this means that when creating a Synchronized Multimedia Integration Language (SMIL) presentation, the text portion needs to be targeted to a particular playback environment. In addition, the accessibility community relies heavily on captioning to make audiovisual content accessible, and the lack of an interoperable format adds significant cost to captioning Web content. For professional content providers, including broadcast stations, a standardized format such as TTML provides the means to guarantee the quality of their online services and to satisfy the expectations of their audience.

Standardization committees from the broadcast and movie domain have adopted and promoted TTML as their format for subtitles. The Society of Motion Picture & Television Engineers (SMPTE) extended TTML to SMPTE-TT, the Digital Entertainment Content Ecosystem consortium (DECE) defined a TTML profile for the common file format (CFF-TT), and the European Broadcasting Union (EBU) published the TTML subset EBU-TT for the interchange, archiving and production of subtitles.

TTML was adopted as an XML standard by the EBU because it has well-documented Unicode support, which was missing in the prior binary EBU STL format. While the translation of spoken text into subtitles requires a large amount of manual work for transcription and formatting, the deployment of subtitles in different subtitle formats for linear and non-linear TV is only practically feasible when it is automated.

What is the MXF file format?

MXF (Material eXchange Format) is a file format for the exchange of professional digital video and audio media between servers, tape streamers and digital archives, defined by a set of SMPTE (Society of Motion Picture & Television Engineers) standards. MXF supports a number of different streams encoded in any of a variety of video and audio compression formats, together with metadata (such as specific file information, subtitles and closed captioning) which describes the material contained within the MXF file. The file extension for MXF files is .mxf.

MXF bundles together video, audio and program data such as text along with metadata and places them into a wrapper. Its body is stream-based and carries the essence (audio and video) and some of the metadata. It holds a sequence of video frames, each complete with associated audio and data essence plus frame-based metadata. The latter typically comprises timecode and file format information for each of the video frames.
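
MXF wraps its contents as KLV (key-length-value) triplets keyed by 16-byte SMPTE universal labels, all of which begin with the bytes 06 0E 2B 34. A hedged Python sketch (the helper is ours; a real parser would decode the full partition-pack key rather than just this prefix):

```python
SMPTE_UL_PREFIX = bytes.fromhex("060e2b34")  # common prefix of SMPTE universal labels

def looks_like_mxf(path: str) -> bool:
    """Crude check: MXF files may start with a short run-in, so scan the
    opening bytes for the SMPTE universal-label prefix of the header partition."""
    with open(path, "rb") as f:
        return SMPTE_UL_PREFIX in f.read(65536)
```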

MXF was developed in response to the changing technology of television production and digital services to viewers. As the ways of moving program video and audio changed, there was far greater use of computers and IT-related products such as servers, as well as an expanded reliance on automation and the re-use of material. Besides carrying metadata, files need to be transferable to fit with computer operations and streamable for real-time operations, making possible convenient file-sharing for uses such as reference, archiving filmed material, editing and transcription.

MXF Operational Pattern 1a (OP1a) is used for a file with a single playable essence comprising a single essence element or interleaved essence elements, in which the content may consist of multiple, interleaved tracks of picture and sound. OP1a files are self-contained and work well in applications where each file represents a complete program or take. However, it may be less applicable to content-authoring steps such as nonlinear editing, where programs are created by editing different sections of source material. MXF OP-Atom is used for applications where source information (picture and sound) is kept separate and later edited together.

MXF is openly available to all interested parties. It is not compression-scheme specific and simplifies the integration of systems using MPEG and DV as well as potential future compression strategies. The transportation of these different files is independent of content, not dictating the use of specific manufacturers’ equipment. Any required processing can be achieved by invoking the appropriate hardware or software codec.

MXF is supported by Adobe After Effects, Adobe Premiere Pro, Apple Final Cut Pro (via XDCAM Transfer), Autodesk Smoke, Avid, Dalet, Harris, Omneon, Quantel, Rhozet, Sony Vegas Pro, Sorenson Squeeze, Telestream FlipFactory, Grass Valley EDIUS and Grass Valley K2.

What is the HDV format?

HDV is a format for recording high-definition video on DV (digital video) cassette tape. The format was originally developed by JVC and supported by Sony, Canon, and Sharp; the four companies formed the HDV consortium in September 2003. HDV and the HDV logo are trademarks of JVC and Sony. In HDV, video and audio are encoded in digital form, using lossy interframe compression. Video is encoded with the H.262/MPEG-2 Part 2 compression scheme. Stereo audio is encoded with the MPEG-1 Layer 2 compression scheme. The compressed audio and video are multiplexed into an MPEG-2 transport stream, which is typically recorded onto magnetic tape but can also be stored in a computer file, allowing for convenient file-sharing for reference, archiving filmed material, editing (including the addition of metadata such as subtitles and closed captioning) and transcription.

Two major versions of HDV are HDV 720p and HDV 1080i. The former is used by JVC and is informally known as HDV1. The latter is preferred by Sony and Canon and is sometimes referred to as HDV2. HDV 720p format allows recording high definition video (HDV-HD) as well as progressive-scan standard definition video (HDV-SD). HDV-HD closely matches broadcast 720p progressive scan video standard in terms of scanning type, frame size, aspect ratio and data rate. Early HDV 720p camcorders could shoot only at 24, 25 and 30 frames per second. Later models offer both film-like (24p, 25p, 30p) and reality-like (50p, 60p) frame rates.

Sony adapted HDV, originally conceived as progressive-scan format by JVC, to interlaced video. Interlaced video has been a useful compromise for decades due to its ability to display motion smoothly while reducing recording and transmission bandwidth. Interlaced video is still being used in acquisition and broadcast, but interlaced display devices are being phased out. All modern computer monitors use progressive scanning as well.

Because HDV video is recorded in digital form, original content can be copied onto another tape or captured to a computer for editing without quality degradation. Depending on the capturing software and the computer’s file system, either a whole tape is captured into one contiguous file, the video is split into smaller 4-GB or 2-GB segments, or a separate file is created for each take. How files are named depends on the capturing software. Some systems convert HDV video into a proprietary intermediate format on the fly while capturing, so the original format is not preserved.

HDV footage can be natively edited by most non-linear editors, with real-time playback being possible on modern mainstream personal computers. Slower computers may exhibit reduced performance compared to other formats such as DV because of high resolution and interframe compression of HDV video.

What is a FLV file?

FLV (Flash Video) is a container file format used to deliver video over the Internet using Adobe Flash Player version 6 and newer. Flash Video content may also be embedded within SWF (an Adobe Flash file format used for multimedia, vector graphics and ActionScript files). There are two different video file formats known as Flash Video, .flv and .f4v. The audio and video data and metadata (such as subtitles and closed captioning) within FLV files are encoded in the same manner as they are within SWF files. The F4V file format is based on the ISO base media file format. Both formats were developed by Adobe Systems; FLV was originally developed by Macromedia.
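
An FLV file opens with a 9-byte header: the signature “FLV”, a version byte, a flags byte whose bits announce audio and video streams, and a 32-bit header size. A minimal sketch with Python’s struct module (the file name is hypothetical):

```python
import struct

with open("clip.flv", "rb") as f:
    sig, version, flags, header_size = struct.unpack(">3sBBI", f.read(9))

assert sig == b"FLV"
has_audio = bool(flags & 0x04)   # bit 2: audio tags present
has_video = bool(flags & 0x01)   # bit 0: video tags present
print(version, has_audio, has_video, header_size)   # header_size is typically 9
```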

Flash Video is the standard for web-based streaming video over RTMP (Real Time Messaging Protocol). Notable users include YouTube, Hulu, VEVO, Yahoo! Video, metacafe, Reuters.com, and many other news providers. The FLV and F4V formats make possible convenient file-sharing for uses such as reference, archiving filmed material, editing and transcription.

Flash Video FLV files usually contain material encoded with codecs following the Sorenson Spark or VP6 video compression formats. The most recent public releases of Flash Player also support H.264 video and HE-AAC audio. All of these compression formats are restricted by patents.

Flash Video is viewable on most operating systems via the Adobe Flash Player web browser plugin or one of several third-party programs. Apple’s iOS devices do not support the Flash Player plugin and so require other delivery methods, such as those provided by the Adobe Flash Media Server.

Commonly, Flash Video FLV files contain video bit streams which are a proprietary variant of the H.263 video standard, under the name of Sorenson Spark. Sorenson Spark is an older codec for FLV files but is also a widely available and compatible one, because it was the first video codec supported in Flash Player. It is the required video compression format for Flash Player 6 and 7.

Flash Player 8 and newer revisions also support the playback of On2 TrueMotion VP6 video bit streams (FourCC VP6F or FLV4). On2 VP6 can provide higher visual quality than Sorenson Spark, especially at lower bit rates. However, it is computationally more complex and therefore will not run as well on certain older system configurations.

Desktop-based applications that support Flash Video for Microsoft Windows, Mac OS X and Unix-based systems include Adobe Media Player, Media Player Classic, MPlayer, RealPlayer, VLC media player and Winamp. Mac OS devices can play Flash videos in QuickTime with the help of additional software. PDA and smartphone-based applications that support Flash Video include Windows Mobile and Palm OS using the Core Pocket Media Player.

What is an AVI file?

AVI (Audio Video Interleaved) is a multimedia container format introduced by Microsoft in November 1992 as part of its Video for Windows software. AVI files contain audio and video data and metadata (such as subtitles and closed captioning) in a file container that allows synchronous audio-with-video playback. Like the DVD video format, AVI files support multiple streams of audio and video, allowing for convenient file-sharing for uses such as reference, archiving filmed material, editing and transcription. Most AVI files also use the file format extensions developed by the Matrox OpenDML group in February 1996. AVI files use the file extension .avi.

AVI is a derivative of the Resource Interchange File Format (RIFF), which divides a file’s data into blocks. Each block is identified by a FourCC tag. An AVI file takes the form of a single block in a RIFF formatted file, which is then subdivided into two mandatory blocks and one optional block.
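
A hedged sketch that walks those top-level RIFF blocks with Python’s struct module, printing each FourCC tag and size (the file name is hypothetical; RIFF chunks are padded to even byte boundaries):

```python
import struct

with open("movie.avi", "rb") as f:
    riff, size, form = struct.unpack("<4sI4s", f.read(12))
    assert riff == b"RIFF" and form == b"AVI "
    while (header := f.read(8)) and len(header) == 8:
        fourcc, chunk_size = struct.unpack("<4sI", header)
        if fourcc == b"LIST":
            print("LIST", f.read(4))                  # list type, e.g. b'hdrl', b'movi'
            f.seek(chunk_size - 4 + (chunk_size & 1), 1)
        else:
            print(fourcc, chunk_size)
            f.seek(chunk_size + (chunk_size & 1), 1)
```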

By way of the RIFF format, the audio-visual data contained in the blocks can be encoded or decoded by an encoder/decoder (codec). Upon creation of the file, the codec translates between raw data and the compressed data format used inside the block. An AVI file may carry audio/visual data inside the chunks in virtually any compression scheme, including Full Frames (Uncompressed), Intel Real Time (Indeo), Cinepak, Motion JPEG, Editable MPEG, VDOWave, ClearVideo/RealVideo, QPEG and MPEG-4 Video.

In addition, AVI files can embed Extensible Metadata Platform (XMP). By design, any RIFF file can legally include additional chunks of data, each identified by a four-character code. As such, it is theoretically possible to expand any RIFF file format, including AVI, to support almost any conceivable metadata.

Since its introduction in the early 90s, new computer video techniques have been introduced which the original AVI specification did not anticipate. For example, there are several competing approaches to including a timecode in AVI files, which affects usability of the format in film and television post-production, although it is widely used. Also, AVI was not intended to contain video using any compression technique which requires access to future video frame data beyond the current frame. More recent container formats (such as Matroska, Ogg and MP4) solve these problems, although software is freely available to both create and correctly replay AVI files.

DV AVI is an AVI file where the video has been compressed to conform with DV standards. There are two types of DV-AVI files. In type 1, multiplexed audio and video are saved together in the video section of the AVI file. This saves space, but Windows applications based on the VfW API do not support it. Type 2 is supported by VfW applications, although the file size is slightly larger.

What is the WebM video file format?

WebM is a video file format intended primarily for royalty-free use in the HTML-5 video tag. The WebM Project releases WebM-related software under a BSD (Berkeley Software Distribution) license, and all users are granted a worldwide, non-exclusive, no-charge, royalty-free patent license. WebM files consist of video streams (as well as metadata including subtitles and closed captioning) compressed with the VP8 video codec and audio streams compressed with the Ogg Vorbis audio codec. WebM is a royalty-free alternative to the patented H.264 codec (which powers everything from online videos to Blu-ray discs) and the MPEG-4 standards, and is suitable for commercial and non-commercial applications. WebM files use a .webm file extension.

WebM videos can also be easily coded to autoplay and endlessly loop without an additional video player, which makes them a potential competitor to the GIF, which currently reigns supreme for easily shared, easily embedded animations. The WebM project is sponsored by Google and was first unveiled in 2010. The WebM container is based on a profile of Matroska, which stores video in .mkv files. WebM initially supported VP8 video and Vorbis audio streams and was later updated to accommodate VP9 video and Opus audio.

VP8 is a highly efficient video compression technology that was developed by On2 Technologies; Google acquired On2 in February 2010. Vorbis is an open-source audio compression technology and an independent project of the Xiph.Org Foundation.

WebM was built for the web. After testing hundreds of thousands of videos with widely varying characteristics, it was found that the VP8 video codec delivered high-quality video while efficiently adapting to varying processing and bandwidth conditions across a broad range of devices. The relative simplicity of VP8 makes it easy to integrate into existing environments and requires comparatively little manual tuning in the encoder to produce high-quality results, excellent for the sharing and archiving of filmed and audio material, as well as for such uses as quick reference and transcription.

WebM video plays directly in your web browser using HTML-5. No plug-ins are needed, but a modern web browser that supports HTML-5 and WebM is required. YouTube is supporting WebM in addition to its existing formats as part of its HTML-5 experiment.
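
For example, a minimal embed might look like the snippet below (the file name is hypothetical); browsers that understand WebM play the source, and others fall back to the inner text:

```html
<video controls width="640">
  <source src="clip.webm" type="video/webm">
  Your browser does not support WebM playback.
</video>
```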

WebM is supported by Mozilla Firefox 4 and later versions, Opera 10.60 and later versions, Google Chrome 6 and later versions, and Microsoft Internet Explorer 9 and later versions. Other media players and components that support WebM include jetAudio Basic, Media Player Classic, Moovida Core, Perian, VLC, Winamp and XBMC.

What is timecode or TC?

A timecode is a sequence of numeric codes generated at regular intervals by a timing synchronization system; timecodes provide a time reference for editing, synchronization, identification and transcription. In video production and filmmaking, SMPTE (Society of Motion Picture and Television Engineers) timecode is a form of media metadata (along with subtitles and closed captioning) used extensively for synchronization, logging and identifying material in recorded media. Timecodes are added to film, video or audio material and have also been adapted to synchronize music. Linear and vertical-interval timecodes were developed in 1967 by EECO, an electronics company that developed video recorders and later video production systems. EECO assigned its intellectual property to permit public use.

The SMPTE family of timecodes are used in film, video and audio production, and can be encoded in many different formats, including linear timecode (LTC), vertical interval timecode (VITC), AES-EBU embedded timecode used with digital audio, burnt-in timecode, control track timecode (CTL) and MIDI timecode. Timecodes for purposes other than video and audio production include inter-range instrumentation group timecode (IRIG), which is used for military, governmental and commercial purposes.
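
SMPTE timecode labels each frame as HH:MM:SS:FF (hours, minutes, seconds, frames). A minimal sketch of the arithmetic for non-drop-frame timecode at an integer frame rate (the function names are ours):

```python
def frames_to_tc(total: int, fps: int = 25) -> str:
    """Convert a frame count to non-drop-frame SMPTE timecode."""
    ff = total % fps
    ss = (total // fps) % 60
    mm = (total // (fps * 60)) % 60
    hh = total // (fps * 3600)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"

def tc_to_frames(tc: str, fps: int = 25) -> int:
    """Convert HH:MM:SS:FF back to a frame count."""
    hh, mm, ss, ff = map(int, tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff

assert frames_to_tc(90000) == "01:00:00:00"   # one hour at 25 fps
assert tc_to_frames("01:00:00:00") == 90000
```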

During filmmaking or video production shoot, the camera assistant will typically log the start and end timecodes of shots, and the data generated will be sent on to the editorial department for use in referencing those shots. This shot-logging process is typically done using shot-logging software running on a laptop computer that is connected to the timecode generator or the camera itself.

Linear timecode (LTC) is designed to be recorded on an audio channel or carried over audio wiring; this is how it is distributed within a studio to synchronize recorders and cameras. To read LTC, the recording must be moving, meaning that LTC is nonfunctional when the recording is stationary or nearly stationary. This shortcoming led to the development of VITC (vertical interval timecode), which is recorded directly into the VBI (vertical blanking interval) of the video signal on each frame of video. The advantage of VITC is that, since it is a part of the playback video, it can be read when the tape is stationary.

CTL timecode is SMPTE timecode embedded in a videotape’s control track. Burnt-in timecode (BITC) is a timecode in which the numbers are burnt into the video image so the timecode can be easily read. Videotapes that are duplicated with these timecode numbers burnt in to the video are known as window dubs.

What is 3GP?

3GP (3GPP file format) is a multimedia container format defined by the Third Generation Partnership Project (3GPP) for 3G UMTS (Universal Mobile Telecommunication System) multimedia services. It is used primarily on 3G mobile phones but can also be played on some 2G and 4G phones. 3GP is a required file format for video and associated speech/audio media types and timed text (including closed captioning) in ETSI 3GPP technical specifications for IP Multimedia Subsystem (IMS), Multimedia Messaging Service (MMS), Multimedia Broadcast/Multicast Service (MBMS) and Transparent end-to-end Packet-switched Streaming Service (PSS).

3G2 (3GPP2 file format) is a multimedia container format defined by the 3GPP2 for 3G CDMA2000 multimedia services. It is very similar to the 3GP file format but has some extensions and limitations in comparison to 3GP.

The 3rd Generation Partnership Project (3GPP) is a collaboration between groups of telecommunications associations, known as the Organizational Partners. The purpose of 3GPP was to make a globally applicable third-generation (3G) mobile phone system specification within the scope of the International Mobile Telecommunications-2000 project of the International Telecommunication Union (ITU). 3GPP standardization encompasses Radio, Core Network and Service architecture. The project was established in December 1998.

The 3GP and 3G2 file formats are both structurally based on the ISO base media file format defined in ISO/IEC 14496-12 – MPEG-4 Part 12, but older versions of the 3GP file format did not use some of its features. 3GP and 3G2 are container formats similar to MPEG-4 Part 14 (MP4), which is also based on MPEG-4 Part 12. The 3GP and 3G2 file formats were designed to decrease storage and bandwidth requirements to accommodate mobile phones. 3GPP file format was designed for GSM-based phones and may have the filename extension .3gp. 3GPP2 file format was designed for CDMA-based phones and may have the filename extension .3g2. Some cell phones use the .mp4 extension for 3GP video.

Most 3G-capable mobile phones support the playback and recording of video in 3GP format (memory, maximum file size for playback and recording, and resolution limits exist and vary). Some newer or higher-end phones without 3G capabilities may also play back and record in this format. Audio imported from CD onto a PlayStation 3 set to encode to the MPEG-4 AAC codec copies onto USB devices in the 3GP format. The Nintendo DSi supports .3gp files on an SD card.

When transferred to a computer, 3GP movies can be viewed on Linux, Mac and Windows platforms with MPlayer and VLC media player, making possible the transcription and archiving of filmed material. Programs such as Media Player Classic, K-Multimedia Player, Totem, Real Player, QuickTime and GOM Player can also be used. 3GP files can be encoded and decoded with open source software FFmpeg.

What is MPEG-2?

MPEG-2 (H.222/H.262 as defined by the ITU) is a standard for the generic coding of moving pictures and associated audio information and metadata such as subtitles and closed captioning. It describes a combination of lossy video compression and lossy audio data compression methods which permit storage and transmission of movies using currently available storage media and transmission bandwidth for uses such as reference, archiving filmed material, editing and transcription. MPEG-2 is widely used as the format of digital television signals that are broadcast by terrestrial (over-the-air), cable and direct broadcast satellite TV systems.

It also specifies the format of movies and other programs that are distributed on DVD and similar discs. TV stations, TV receivers, DVD players, and other equipment are often designed to this standard. MPEG-2 was the second of several standards developed by the Moving Pictures Expert Group (MPEG) and is an international standard (ISO/IEC 13818).

The Systems part of MPEG-2 addresses how video is delivered, not how it is encoded: whether a video is carried in a transport stream or a program stream says nothing about the quality of the video encoding or the MPEG-2 GOP structure. The format for delivery is independent of the content. Different formats exist because they serve different applications: someone saving MPEG-2 to a file on a computer is not concerned with transmission robustness, whereas someone transmitting MPEG-2 over an unreliable channel is very concerned with it. The MPEG-2 standards address both of these concerns.

MPEG-2 includes a Systems section, Part 1, that defines two distinct but related container formats. One is the transport stream, a data-packet format designed for streaming digital video and audio over fixed or mobile transmission mediums where the beginning and the end of the stream may not be identified, such as radio frequency, cable and linear recording mediums; examples include ATSC, DVB, ISDB and SBTVD broadcasting, and HDV recording on tape. (Its packet was sized so that one transport packet fits in four ATM data packets.) The other is the program stream, an extended version of the MPEG-1 container format without the extra overhead of the transport stream, designed for random-access storage mediums such as hard disk drives, optical discs and flash memory. An MPEG-2 program stream contains only one content channel; an MPEG-2 transport stream can contain one or more content channels.

MPEG-2 is not as efficient as newer standards such as H.264/AVC and H.265/HEVC. However, its backwards compatibility with existing hardware and software means it is still widely used, for example in the DVD-Video standard. Some of the filename extensions related to MPEG-2 audio and video file formats are .mpg, .mpeg, .m2v, .mp2 and .mp3.

What is an MPEG transport stream?

MPEG transport stream (MPEG-TS, MTS or TS) is a standard format for transmission and storage of audio, video and Program and System Information Protocol (PSIP) data. It is used in broadcast systems such as DVB, ATSC and IPTV.

Transport Stream is specified in MPEG-2 Part 1, Systems. Transport stream specifies a container format encapsulating packetized elementary streams (video, audio and metadata such as subtitles and closed captioning) with error correction and stream synchronization features for maintaining transmission integrity when the signal is degraded.

Transport streams differ from program streams in several ways. Program streams are designed for reasonably reliable media, such as discs (like DVDs), while transport streams are designed for less reliable transmission, namely terrestrial or satellite broadcast. Transport streams may also carry multiple programs. Multiple MPEG programs are combined and sent to a transmitting antenna. In the U.S. broadcast digital TV system, a receiver then decodes the TS and displays it. In most other parts of the world, transmission would be accomplished by one or more variants of the modular DVB system.

Each program is described by a Program Map Table (PMT) which has a unique PID (packet identifier), and the elementary streams associated with that program have PIDs listed in the PMT. For instance, a transport stream used in digital television might contain three programs, representing three television channels. Each channel may consist of one video stream, one or two audio streams and any necessary metadata. A receiver wishing to decode a particular channel decodes the payloads of each PID associated with its program. It can discard the contents of all other PIDs.
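
Each transport packet is 188 bytes, begins with the sync byte 0x47, and carries its 13-bit PID in the low 5 bits of byte 1 plus all of byte 2. A minimal sketch (the function name is ours):

```python
def iter_pids(ts_data: bytes):
    """Yield the PID of every 188-byte packet in a captured transport stream."""
    for off in range(0, len(ts_data) - 187, 188):
        packet = ts_data[off:off + 188]
        if packet[0] != 0x47:                 # sync byte check
            raise ValueError("lost packet sync at offset %d" % off)
        yield ((packet[1] & 0x1F) << 8) | packet[2]
```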

Transport Stream was originally designed for broadcast but was later adapted for use with digital video cameras, recorders and players by adding a 4-byte timestamp to each standard 188-byte packet. This is what is informally called an M2TS stream. The timestamp allows quick access to any part of the stream either from a media player or from a non-linear video editing system (useful for quick reference and transcription of filmed material). It is also used to synchronize video streams from several cameras in a multi-camera shoot.

The filename extension .m2ts is used on Blu-ray Disc Video for files which contain a BDAV MPEG-2 transport stream. Blu-ray Disc employs the MPEG-2 transport stream recording method, which enables transport streams of a BDAV converted digital broadcast to be recorded as they are, with minimal alteration of the packets. It also enables simple stream-cut-style editing of a BDAV converted digital broadcast that is recorded as is, where the data can be edited just by discarding unwanted packets from the stream.

What is a DFXP file?

DFXP files are XML-based closed caption files. They generally contain formatting information such as font color and alignment, among others. One DFXP file can contain captions in multiple languages. The viewer will show either the language it normally displays by default or whatever language the user chooses (if the file contains that language).

Ooyala video systems will only use DFXP files for closed captions. For more info on Ooyala as well as an example of the DFXP format, visit Ooyala’s official web page on the topic.

DFXP is certainly a much less common format than .srt files, but most professional closed caption creators can export them. As always, make sure the company you’re working with can supply closed caption files in the format you need.

What are .scc files?

.SCC files are one of the oldest closed caption file formats. The name comes from Scenarist Closed Captions. Unlike many newer closed caption files, the text is encoded in hexadecimal format.
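
As a rough, hypothetical illustration, here is a minimal .SCC file. The header line identifies the format, and each cue pairs a timecode with hex words (a tab separates them); every byte carries the CEA-608 odd parity bit. The control codes here load a caption into off-screen memory (9420), erase the non-displayed buffer (94ae), spell “Hello.” (c8e5 ecec efae), flip it on screen (942f) and later clear it (942c); control codes are doubled for reliability, and positioning codes are omitted for brevity.

```
Scenarist_SCC V1.0

00:00:02:00	9420 9420 94ae 94ae c8e5 ecec efae 942f 942f

00:00:06:00	942c 942c
```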

This format is used primarily by television broadcasters. The file itself is encoded into line 21 of the video (which is generally not visible) by using an encoder, a piece of hardware that does only this. On the viewer’s end, their television decodes the information encoded on line 21 and displays it as closed captions.

The information that can be contained in a .SCC file is limited to time, content, and position on screen. Most sets can interpret text color information as well. Font is determined by the television set. If the file is encoded with the wrong font on the captioner’s end, characters may disappear or be displayed incorrectly. This is especially common with apostrophes.

.SCC files are used by other applications as well. Many DVD authoring programs can import .SCC files into their workflow.

Converting from Analog to Digital Closed Captioning Services

Did you know there is a difference between the captioning of analog television (608) and the captioning of digital television (708)? Comparing the two, many more advances have been made in 708 captions than in 608 captions, and analog captioning has fewer capabilities than digital television captioning. Analog television captioning is becoming less common as most channels are digital, and viewers who have experienced 708 captioning want it for every channel.

In order to see 608 captioning, one must have a decoder built into the television or as a separate device, and it is capable of displaying languages such as English, French, Spanish, Dutch, German, Italian, and Portuguese. With 608 closed captioning, the caption appears in white capital letters within a black box, whereas 708 captioning offers three different text sizes, eight different fonts, 64 colored backgrounds, and many more languages for closed captioning to choose from. Digital television captioning allows many more viewers to watch programs. With closed captioning services, providers can convert 608 closed captioning to 708 closed captioning but cannot convert 708 closed captioning to 608 closed captioning; this isn’t possible because the 708 captioning has many capabilities that would not be supported by 608 closed captioning.

Analog televisions with 608 captioning are becoming a thing of the past, and high-definition and standard-definition digital broadcasts with 708 captioning are making their way to the top. Soon, with the extinction of analog televisions, people will no longer accept 608 captioning converted to 708 captioning. Viewers will notice a difference between true 708 captioning and a 608 conversion and will want it all to be 708 closed captioning. Welcome to the era where everyone depends on digital.

What is a MP4 file?

MP4 is an abbreviated term for MPEG-4 Part 14. This format for working with video files was first introduced in 1998. (It is sometimes confused with MPEG-4 AVC, or Advanced Video Coding, which is the MPEG-4 Part 10 video compression standard often carried inside MP4 files, rather than the file format itself.) MPEG refers to the Moving Picture Experts Group, a working group of authorities established in 1988 by Hiroshi Yasuda and Leonardo Chiariglione to set standards for audio and video compression and transmission. Since the first MPEG meeting in May 1988, MPEG has grown to include approximately 350 members per meeting from various industries, universities and research institutions.

MP4, first published in 1998, was designed to encompass all the features that were part of earlier releases of MPEG files and to add a few more that would prove helpful with advancing online technology. MP4 is a container format, allowing a combination of audio, video, subtitles (such as closed captioning) and still images to be held in a single file. It also allows for advanced content such as 3D graphics, menus and user interactivity.
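
Under the hood, an MP4 file is a sequence of “boxes,” each headed by a 32-bit big-endian size and a four-character type such as ftyp, moov or mdat. A hedged sketch that lists the top-level boxes (the file name is hypothetical; boxes with size 1 carry a 64-bit extended size, and size 0, meaning “to end of file,” is ignored here):

```python
import struct

with open("movie.mp4", "rb") as f:
    while (header := f.read(8)) and len(header) == 8:
        size, box_type = struct.unpack(">I4s", header)
        if size == 1:                                   # 64-bit extended size follows
            size = struct.unpack(">Q", f.read(8))[0] - 8
        print(box_type.decode("ascii", "replace"), size)
        f.seek(size - 8, 1)                             # skip the box payload
```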

Because it requires a relatively low amount of bandwidth, the introduction of MP4 allowed the audience for online media to continue growing by providing faster, higher-quality media for the average user, particularly as advancing technology made it possible to build more powerful desktop and laptop systems with larger hard drives.

There are all sorts of ways to enjoy the benefits of MP4. First, online consumers can enjoy all sorts of recorded video and audio, whether professional or amateur. Also, these recordings can be saved to a hard drive for later copying or transcription of archival materials. Some sites that use this streaming application also make it possible for users to send a copy of the recording via the Internet directly to a friend or acquaintance.

Software compatible with MP4 includes 3ivx, ALLPlayer, Amarok, Audacious Media Player, Augen Prizm, Banshee Music Player, Dell MediaDirect, Exaile, foobar2000, GOM Player, iTunes, iPods (all versions), jetAudio, J.River Media Jukebox, J.River Media Center, The KMPlayer, KSP Sound Player, Media Player Classic, Music Player Daemon, MPlayer, Nero Burning ROM (Nero ShowTime), Nintendo DSi Sound, Nokia PC Suite, Photo Channel (Wii), Playstation Portable XMB, QuickTime Player, RealPlayer, Rhythmbox, Songbird (software), VLC Media Player, Winamp, Windows Media Player 12, XBMC Media Center, Xine, Zoom Player, and Zune.