M&E Journal: Language Codes for All

“What code should we use for Latin American Spanish?” an HBO developer asked me in 2017. “We’re setting up a new system and we aren’t sure what it should be.” It turned out that nothing consistent had been implemented across HBO.

Working with Laura Dawson, then HBO’s metadata analyst, I developed a consistent language taxonomy. Dawson recalled, “We thought this project would take a few weeks. It wound up taking over 16 months and resulted in codes for 127 languages. The solution, an implementation of IETF BCP (Internet Engineering Task Force Best Current Practice) 47, became the foundation for the Language Metadata Table.”

HBO’s table of languages was introduced at MESA’s 2018 Smart Content Summit in New York.

The response was so enthusiastic that MESA asked Dawson and me to chair the Language Metadata Table (LMT) working group.

“Superficially, language encoding seems like a trivial problem, but it’s not,” said Craig Seidel, SVP of technology at MovieLabs. “In the digital supply chain, encoding mistakes can cause confusion about metadata, audio tracks, and subtitle tracks. Ultimately, this wastes time and impacts consumer experience. LMT solves this problem by providing simple direct information that can be used to consistently encode language. By removing the guesswork, everything goes where it should.”

As of October 2020, LMT has more than 200 languages and continues to grow. Rajesh Ramachandran, president and CTO of Qube Cinema, is leading a sub-committee devoted to languages of the Indian subcontinent, confirming existing languages and adding new ones.

After that, LMT will also be expanded to include more Southeast Asian languages.

MESA and SMPTE have also partnered on LMT. Bruce Devlin, SMPTE’s VP of standards, is creating a tool that validates LMT codes. Formal adoption of LMT as SMPTE’s language code solution is making its way through the SMPTE processes. Upon approval, the code will be publicly available.

LMT adheres to IETF BCP 47, from which more than 40,000 different language code combinations can be created. LMT’s guiding principle is to use the shortest IETF BCP 47 code that provides clarity.
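To see where those combinations come from, it can help to look at how a BCP 47 code is assembled from subtags. The sketch below is illustrative only: it handles just the common language, script, and region pattern (for example hu, hu-HU, es-419, zh-Hant-TW) and ignores the extended-language, variant, extension, and private-use subtags that full BCP 47 allows. The names split_simple_tag and SimpleTag are made up for this example and are not part of LMT or of any SMPTE tool.

```python
import re
from typing import NamedTuple, Optional

# Simplified pattern: language[-Script][-REGION].
# Assumption: full BCP 47 also permits extended-language, variant,
# extension, and private-use subtags, which this sketch does not handle.
_SIMPLE_TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
)


class SimpleTag(NamedTuple):
    """Hypothetical container for the three most common subtags."""
    language: str
    script: Optional[str]
    region: Optional[str]


def split_simple_tag(tag: str) -> SimpleTag:
    """Split a simple BCP 47 tag into language, script, and region.

    Tags are case-insensitive, so the parts are returned in the
    conventional casing: language lowercase, script title case,
    region uppercase.
    """
    match = _SIMPLE_TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a simple language-script-region tag: {tag!r}")
    script = match.group("script")
    region = match.group("region")
    return SimpleTag(
        language=match.group("language").lower(),
        script=script.title() if script else None,
        region=region.upper() if region else None,
    )


if __name__ == "__main__":
    for example in ("hu", "hu-HU", "es-419", "zh-Hant-TW"):
        print(example, "->", split_simple_tag(example))
```

Every optional subtag multiplies the number of possible codes, which is why a table that settles on one short code per language removes so much guesswork.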

Only approved LMT codes will pass validation using the SMPTE validation tool. For example, the LMT-compliant code for Hungarian is hu. Some may use hu-HU, the IETF BCP 47 code for Hungarian as spoken in Hungary, but hu-HU is not part of LMT and therefore won’t pass the validation tool.
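The Hungarian example can be pictured as a simple membership check against the approved table. What follows is a minimal sketch, not the SMPTE validation tool, whose interface the article does not describe. The function name is_approved and the one-entry set EXAMPLE_APPROVED_CODES are assumptions made for illustration; only the code hu is taken from the text above, and a real check would load the full published table.

```python
from typing import Iterable

# Stand-in for the approved table. Only "hu" comes from the article;
# the real LMT holds more than 200 languages and would be loaded from
# the published table rather than hard-coded like this.
EXAMPLE_APPROVED_CODES = frozenset({"hu"})


def is_approved(code: str, approved: Iterable[str] = EXAMPLE_APPROVED_CODES) -> bool:
    """Return True only if the code appears in the approved set.

    BCP 47 tags are case-insensitive, so both sides are compared in
    lowercase. A valid BCP 47 tag that is not in the table, such as
    "hu-HU", is still rejected.
    """
    approved_lower = {entry.lower() for entry in approved}
    return code.strip().lower() in approved_lower


if __name__ == "__main__":
    for candidate in ("hu", "hu-HU"):
        verdict = "passes" if is_approved(candidate) else "rejected"
        print(f"{candidate}: {verdict}")
```

The point of the allowlist design is that validity under BCP 47 is not enough: a code passes only if it is the one form the table has agreed on.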

Adoption of LMT is increasing. Jamie Duemo, SVP of multi-platform distribution for ViacomCBS, said, “After the joining of Viacom and CBS, it was important for the newly-combined company to align as many processes, standards, and best practices across all of its brands. LMT’s agnostic approach to language designations, while using an authoritative standard (BCP-47) as its source, will greatly assist ViacomCBS with interoperability between new and existing processes, products and services.”

Other standards bodies are also adhering to LMT. “As luck would have it, EIDR was already following BCP-47, so we didn’t have much work to do on our end. Metadata without standardization is of little use, and language encoding has long been a chaotic mess of conflicting standards, no standards, and inconsistent practice,” said Richard Kroon, EIDR’s director of engineering. “I’ve been trying to solve this problem for 20 years and am delighted that LMT has finally stepped into the void to bring consistency to chaos.

“With LMT’s guidance, language metadata exchange can be consistent and reliable, which also means it can be automated, resulting in lower costs, higher process velocity, reduced error rates, and increased bandwidth. Genre encoding is still a nightmare, but at least now language is a dream.”

LMT committee members come from organizations across the media and entertainment industry. In addition to WarnerMedia companies (HBO, Turner, Warner Bros.), representatives from Disney, Lionsgate, APEX (Airline Passenger Experience Association), Gracenote, Fox, Sony, and others participate in the meetings, which are held every six to eight weeks.

Standards facilitate interoperability between content creators/providers and distributors, and LMT is a standard that meets that need. I never thought that a table created for HBO/WarnerMedia would turn out to be the language code solution for the media and entertainment industry.

But here we are!

For more information on the LMT, please visit mesaonline.org/language-metadata-table.

* By Yonah Levenson, Manager, Metadata Management, Taxonomy, WarnerMedia/HBO

___________________________________________________________

From the Winter 2020/2021 M&E Journal.