AI-Based Closed Captioning Solution for New Streaming Platform Requirements

Over the past couple of years, growing consumer demand has driven up content consumption on streaming media, and a wave of new players has launched streaming services. Recent examples include the launches of Quibi and Peacock in April 2020, and HBO Max plans to expand into Latin America by June 2021.

This demand has left content creators and owners scrambling for new or repurposed content for these platforms while meeting each platform’s standards for video and its corresponding metadata. That metadata includes closed captions, which, like the video itself, must meet the standards and style guides mandated by each streaming platform.

The tremendous growth in content consumption and the widespread acceptance of closed captions beyond the hearing-impaired community have driven up the output volume required of captioners and introduced the need to provide captions that can be used on different platforms.

One of our clients, a leading U.S.-based mobile video platform content producer headquartered in California, faced a similar challenge. They were obligated to include closed captions in line with the platform’s standards and style guidelines for all their content.

To address these challenges, the client assessed Digital Nirvana’s Trance, an AI-driven, cloud-based, enterprise-grade solution for transcription, captioning, and translation. There are many capable closed captioning applications on the market, so what makes Trance different? While the basic functionality of other applications may be similar, Trance is offered via the MediaService-IQ platform, the gateway to Digital Nirvana’s suite of machine learning (ML) and artificial intelligence (AI) capabilities.

The platform gives users access to Trance’s collection of sophisticated yet straightforward AI modules. These modules simplify captioning for the user and support an evolving captioning and processing workflow. Be it automatic speech-to-text, automatic caption generation based on style guides, or translation, each module has been designed to reduce the effort required to create the output.

Because it is an enterprise-grade application, Trance comes with an orchestration layer that enables easy project management, automatic assignment of tasks to users, and a holistic view of day-to-day operations. Combining these future-ready functionalities with superior ease of use, Trance was the customer’s top choice.

Digital Nirvana’s solution enhances efficiencies by using various AI modules to address the needs of transcription, caption generation, and translation based on the target streaming platform’s style guide preference, enabling users to confirm compliance with output requirements automatically.

Once media is ingested, a speech-to-text output is automatically generated and then displayed alongside the video in the user interface as a time-synced transcript. The operator can quickly review and correct the transcript, then convert it to closed captions based on the profile set, e.g., Netflix, Quibi, Prime, etc.

This process enables adherence to character count, line count, text frame, gaps, maximum words per minute, and more. Once the initial review is completed, the content is displayed in a captioning professional window. Users can review it along with the video and confirm how the content appears on platforms.
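As a rough illustration of the transcript-to-caption step described above, the sketch below groups time-synced words into caption cues under a profile’s line limits. It is not Trance’s implementation: the `Profile` and `Cue` types, the `words_to_cues` function, and the limit values are all hypothetical, and real style guides add further rules (reading speed, line-break preferences, minimum durations) that are omitted here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Profile:
    """Hypothetical per-platform limits; real values come from each style guide."""
    max_chars_per_line: int
    max_lines: int


@dataclass
class Cue:
    start: float        # seconds
    end: float          # seconds
    lines: List[str]


def words_to_cues(words: List[Tuple[str, float, float]], profile: Profile) -> List[Cue]:
    """Greedily pack (text, start, end) words into lines, and lines into cues."""
    cues: List[Cue] = []
    lines: List[str] = []              # completed lines of the cue being built
    current = ""                       # line currently being filled
    cue_start: Optional[float] = None
    cue_end = 0.0

    for text, start, end in words:
        candidate = text if not current else f"{current} {text}"
        if len(candidate) <= profile.max_chars_per_line:
            current = candidate
        else:
            # The word does not fit: close the current line.
            if current:
                lines.append(current)
            # If the cue has reached its line limit, emit it and start a new one.
            if len(lines) == profile.max_lines:
                cues.append(Cue(cue_start, cue_end, lines))
                lines = []
                cue_start = None
            current = text  # an over-long single word is kept on its own line
        if cue_start is None:
            cue_start = start
        cue_end = end

    # Flush whatever remains after the last word.
    if current:
        lines.append(current)
    if lines:
        cues.append(Cue(cue_start, cue_end, lines))
    return cues


if __name__ == "__main__":
    demo_words = [("Hello", 0.0, 0.4), ("world", 0.5, 0.9), ("this", 1.0, 1.2),
                  ("is", 1.3, 1.4), ("a", 1.5, 1.6), ("test", 1.7, 2.0)]
    for cue in words_to_cues(demo_words, Profile(max_chars_per_line=11, max_lines=1)):
        print(cue)
```

Switching the target platform then amounts to passing a different `Profile`, which is the idea behind selecting a profile before conversion.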

Once the caption review is completed, the user can automatically generate translation in the same window alongside the video and source-language closed captions. This feature eliminates the need to recheck conformance on style-guide-based parameters and allows users to review automatic translations in line with the source language captions.

Trance also has a built-in caption conformance module that helps users repurpose existing captions, correct them, and reformat them to comply with new streaming media requirements. This feature generates time-synced alerts on any nonconformance so the user can easily navigate to the occurrence and review.
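To make the idea of time-synced nonconformance alerts concrete, here is a minimal sketch of a conformance check over existing captions. It assumes a simplified rule set (characters per line, lines per caption, minimum gap between captions, and maximum words per minute); the `StyleGuide`, `Cue`, and `Alert` types and the `check_conformance` function are hypothetical names for illustration and do not reflect Trance’s actual module or any platform’s real thresholds.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    start: float        # seconds
    end: float          # seconds
    lines: List[str]


@dataclass
class StyleGuide:
    """Hypothetical thresholds; actual values depend on the target platform."""
    max_chars_per_line: int
    max_lines: int
    min_gap_seconds: float
    max_words_per_minute: float


@dataclass
class Alert:
    time: float         # cue start time, so a reviewer can seek to the problem
    cue_index: int
    message: str


def check_conformance(cues: List[Cue], guide: StyleGuide) -> List[Alert]:
    """Return one alert per rule violation, in caption order."""
    alerts: List[Alert] = []
    for i, cue in enumerate(cues):
        if len(cue.lines) > guide.max_lines:
            alerts.append(Alert(cue.start, i, f"{len(cue.lines)} lines exceeds {guide.max_lines}"))
        for line in cue.lines:
            if len(line) > guide.max_chars_per_line:
                alerts.append(Alert(cue.start, i, f"line has {len(line)} characters"))
        duration = cue.end - cue.start
        word_count = sum(len(line.split()) for line in cue.lines)
        if duration > 0 and word_count / duration * 60 > guide.max_words_per_minute:
            alerts.append(Alert(cue.start, i, "reading rate exceeds words-per-minute limit"))
        if i > 0 and cue.start - cues[i - 1].end < guide.min_gap_seconds:
            alerts.append(Alert(cue.start, i, "gap to previous caption is too short"))
    return alerts


if __name__ == "__main__":
    guide = StyleGuide(max_chars_per_line=10, max_lines=2,
                       min_gap_seconds=0.1, max_words_per_minute=200)
    captions = [
        Cue(0.0, 2.0, ["Hello", "there"]),
        Cue(2.0, 3.0, ["This line is far too long"]),
        Cue(5.0, 7.0, ["a", "b", "c"]),
    ]
    for alert in check_conformance(captions, guide):
        print(f"{alert.time:6.2f}s  caption {alert.cue_index}: {alert.message}")
```

Because each alert carries the caption’s start time, a review interface could use it to jump the video to the flagged caption.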

After completing caption generation or repurposing captions with the conformance module, users can download caption output in formats based on the profile set, including customized WebVTT or TTML formats suitable for various streaming platforms. Users can also choose to download multiple formats that conform to different broadcast and streaming platforms.
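For readers unfamiliar with WebVTT, the sketch below shows roughly what exporting captions to that format involves: a `WEBVTT` header followed by blank-line-separated cues, each with an `HH:MM:SS.mmm --> HH:MM:SS.mmm` timing line. The `format_timestamp` and `to_webvtt` helpers are hypothetical and cover only the basic file layout; platform-specific customizations (positioning, styling, cue identifiers) are not shown.

```python
from typing import List, Tuple

# Each cue is (start_seconds, end_seconds, text_lines); this shape is an assumption.
CueTuple = Tuple[float, float, List[str]]


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def escape_text(line: str) -> str:
    """Escape characters that have special meaning in WebVTT cue text."""
    return line.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def to_webvtt(cues: List[CueTuple]) -> str:
    """Serialize cues as a minimal WebVTT document."""
    blocks = ["WEBVTT"]
    for start, end, lines in cues:
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        text = "\n".join(escape_text(line) for line in lines)
        blocks.append(f"{timing}\n{text}")
    # Blocks are separated by a blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(to_webvtt([(0.0, 1.5, ["Hello"]), (2.0, 4.25, ["Tom & Jerry", "return"])]))
```

A TTML export would follow the same pattern with an XML serializer in place of `to_webvtt`.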

Beyond this client, our other clients have used the following key features of cloud-based Trance to accelerate their captioning process:

Web User Interface: Trance provides a simple and intuitive UI with user-specific access and customizations. Users can access work items through an easy-to-use dashboard and can even customize keyboard shortcuts.

Transcription Page: Trance uses advanced speech-to-text (STT) engines equipped to handle various content types. Low-confidence text is color-coded for easy identification, and interactive text enables easy navigation within the edit area. Users can also import existing scripts.

Automatic Formatting (Presets): Trance presets apply Natural Language Processing (NLP) based on grammar and proper nouns. Users can customize the technical parameters of these presets to accommodate various styles and can create multiple presets under one account for various projects.

Pro Captioning Page: The Trance pro captioning page lets users view and edit automatically formatted captions, sync them automatically to the video, generate alerts on nonconformance with preset guidelines, and import existing closed-caption sidecar files.

Text Localization/Translations: Trance supports translation into 100+ languages. Users can copy the formatting of the source or create a new preset for each language. A dual-pane display allows them to view source language text alongside translations.

24/7 support: Digital Nirvana’s experience, backed by a worldwide support team, ensures that our customers get application-level availability, security, comprehensive visibility, and quick responses.

Digital Nirvana’s Trance is an easy-to-use, web-based application for the generation of transcripts, closed captions/subtitles, and translations for content localization. Purpose-built for media and entertainment operators, the solution empowers users to unlock the power of AI with no significant upfront capital expense or in-house expertise.

Our adaptive technology is designed to handle future industry standards, while our open API architecture makes integration with existing workflows seamless and easy.

Our solutions empower broadcasters and independent content producers to enhance content value, meet regulatory captioning requirements, and prepare content for publishing to different distribution channels. They offer an interface through which customers can submit their job requests and access customized, flexible services that fit their business needs.
