Hollywood and Metadata: A Blockbuster Combination

By Matt Turner, CTO of media and entertainment, MarkLogic

Big data gives companies all the information they need to better delight and serve customers. But the winners won’t be the companies that simply have the most data; they’ll be the ones that can make the most of that data and quickly disseminate it.

Metadata, the data about other data that makes it more searchable and thus more usable and valuable, is a critical piece of this puzzle, especially in the fast-changing, data-rich and consumer-facing entertainment industry. Want to quickly find all scenes with a particular actor paired with another actor from a certain TV show? How about kicking out to an advertiser the many times that a certain red couch shows up center screen?

With this kind of data being constantly created, metadata is becoming a Hollywood A-lister and playing a vital role in ensuring content makes it to the fans who love it, and to advertisers, too. Without metadata, there is no way for the entertainment industry to keep track of the millions of pieces of content and associated details that the industry produces almost nonstop.

The industry has long used data to inform decision making, such as using ratings to decide whether to cut a show. Yet the amount and nature of data available has vastly increased.

Companies now have data on how and where people consume content, when and where they share it, even what they say about it on social media.

In parallel, there have been huge shifts in how data is created as content is produced.

The entertainment industry, in the past decade, has shifted from analog film and tape to digital file and networked delivery, including direct to consumers.

With thousands of productions in development at any one time, each an isolated island or silo generating upwards of seven or eight terabytes of data a day, piecing together the metadata generated throughout each step of the process and making sure that data is useful is a huge task.

Metadata: Complex, But Searchable

Metadata is all the data around an asset that can be broadly categorized as:

• Descriptive data. It describes the content in more detail, such as actor names, characters and locations, and is now also being applied at the scene level to help find content.

• Technical data. This is information about the digital files and includes data which comes off the camera as well as the technical attributes of the actual digital content. The typical camera generates roughly a terabyte of content an hour and there could be hundreds of versions of any one item of content across each part of an organization.

• Transactional data. More backroom focused, it can include details pertaining to rights, funding or insurance.
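
As a rough illustration of these three categories, the sketch below models a single asset as a nested Python dictionary and searches its scene-level descriptive data for scenes in which two actors appear together. Every field name, title and value here is hypothetical and not taken from any studio system.

```python
from typing import Any

# Hypothetical asset record; all names and values are made up for illustration.
asset: dict[str, Any] = {
    "title": "Example Show, Episode 1",
    # Descriptive data, applied at scene level.
    "descriptive": {
        "scenes": [
            {"scene": 1, "actors": ["Actor A", "Actor B"], "location": "Diner"},
            {"scene": 2, "actors": ["Actor A"], "location": "Street"},
            {"scene": 3, "actors": ["Actor B", "Actor A", "Actor C"], "location": "Diner"},
        ],
    },
    # Technical data about the digital files.
    "technical": {"resolution": "3840x2160", "versions": 3},
    # Transactional data such as rights and insurance.
    "transactional": {"rights_territories": ["US", "CA"], "insured": True},
}


def scenes_with_all(record: dict[str, Any], *actors: str) -> list[int]:
    """Return the scene numbers in which every named actor appears."""
    wanted = set(actors)
    return [
        s["scene"]
        for s in record["descriptive"]["scenes"]
        if wanted <= set(s["actors"])
    ]


print(scenes_with_all(asset, "Actor A", "Actor B"))  # [1, 3]
```

A real system would index records like this across millions of assets; the point here is only that scene-level descriptive data is what makes the "actor paired with another actor" question answerable.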

How companies manage this metadata is changing. Traditional relational databases require content owners to funnel data into the confines of rows and columns from which it can then be searched and extracted. Companies have to model upfront how they think the data will be used, which determines how it’ll be stored and retrieved. This approach creates silos of data throughout an organization and requires them to use complex and costly processes to access the data. When the company needs to incorporate new types of data, the schemas and processes need to be redesigned and re-executed.

Next-generation NoSQL databases, built with flexible data models to store, manage, and search data, now enable companies to integrate hundreds or thousands of pieces of data from across the supply chain into a single system and make it available for any purpose. They also enable companies to describe and link data from every part of the supply chain. Semantics then allows them to use taxonomies and ontologies to model and manage complex relationships in data. This includes classifications, genres and relationships between titles and products, as well as how characters, films and series fit together alongside other data from production. All of this occurs, even as new types of data pour in, without the time and expense of having to rebuild the system.
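
One common way to express such semantic relationships is as subject-predicate-object triples. The following is a minimal in-memory sketch of that idea, not any vendor's API; the titles, predicates and the `match` helper are all hypothetical.

```python
from typing import Optional

Triple = tuple[str, str, str]

# Hypothetical facts expressed as (subject, predicate, object) triples.
triples: list[Triple] = [
    ("Film X", "hasGenre", "Action"),
    ("Film X", "partOfSeries", "Series Y"),
    ("Film Z", "partOfSeries", "Series Y"),
    ("Character Q", "appearsIn", "Film X"),
    ("Character Q", "appearsIn", "Film Z"),
]


def match(
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> list[Triple]:
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# A new kind of fact is just another triple; no schema has to be redesigned.
triples.append(("Film Z", "shotAt", "Location L"))

films_in_series = [s for s, _, _ in match(p="partOfSeries", o="Series Y")]
print(films_in_series)  # ['Film X', 'Film Z']
```

Because every fact has the same three-part shape, a new relationship type such as the hypothetical `shotAt` can be added without touching the existing data or queries.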

Warner Bros., for one, is now reaping the benefits of having a scalable and flexible system, which is both operational and transactional, to give it a complete view of its data and metadata. Taking “The Dark Knight” as a real-life example, Warner Bros. quickly and efficiently delivers metadata to all of its digital partners in more than 150 territories worldwide using Smart Content, the white knight of metadata management, which is a combination of NoSQL data and semantics.

NBC, to celebrate the 40th anniversary of its iconic show “Saturday Night Live,” created an SNL app. In addition to hiring a ‘metadata army’ to watch and ‘tag’ information from old SNL clips, NBC linked and enriched that metadata with semantic ontologies. By combining the different types of data from the supply chain, NBC was able to fully leverage its enormous library of content and create a personalized experience for fans. The app was highly successful, holding the number one position in the app store for several weeks, and delivering over 100 million videos to fans.

Exploring how to capture and re-use metadata in the production cycle was a key goal of another Hollywood project called “The Suitcase” from The Entertainment Technology Center at the University of Southern California.

This project used new technology approaches in production, including cloud storage and C4 IDs. One of the goals was to prove that a complete set of descriptive data could be generated from the assets and data captured during production and didn’t need to be re-created downstream.

Entertainment Is Serious Business

In today’s world, where entertainment companies compete with a myriad of content producers on all kinds of platforms, companies need to maximize their ability to reach audiences by creating content that engages and retains eyeballs and can be effectively monetized. Keeping everyone in the picture by connecting all the data and content across the digital supply chain has never been more important.

Traditional relational database technology, which keeps data in silos that are hard to manage and search, is being replaced in all kinds of industries by next-generation NoSQL databases. For entertainment, such a shift will improve the experience not only of the media company, but of its customers as well.