Smart Screen Exclusive

GrayMeta: Metadata is King

From the start, GrayMeta has had a singular focus: help media and entertainment companies get the most out of their metadata.

The company’s solutions allow companies to capture metadata from multiple petabytes of media, integrate seamlessly with existing archival systems, and search, extract, analyze and retrieve that metadata, potentially saving millions of man-hours. To date, the company’s software has been used to examine 1.5 billion pieces of metadata, with approximately 3,500 global media and entertainment use cases.

Aaron Edell, VP of operations and professional services for GrayMeta, sat down with the Media & Entertainment Services Alliance (MESA) to discuss the mistakes content owners make with metadata, the company’s favorite use cases, and what’s next for the metadata industry as a whole.

MESA: From the start, GrayMeta has looked to help the media and entertainment industry make the most of its content. What makes GrayMeta unique, and what makes it stand out among similar companies?

Edell: GrayMeta’s MetaFarm software service is the world’s first centralized, automated harvester of file-based metadata, transcending all file formats and standards.

The platform is built around hundreds of highly modularized metadata harvesters called “magnets,” which act like apps that are designed to understand any type of metadata. Examples include (but are not limited to) a magnet that extracts embedded media information from any video/audio file, a magnet that scrapes content from Microsoft Office documents and PDFs, a magnet that detects faces, and a magnet that does speech-to-text and language detection.

All of the data and metadata that is harvested by the magnets is then put into a data store that has a powerful yet simple API layer, enabling complex search, browse and data analysis capabilities. The icing on the cake is that magnets are simple to build to accommodate any new file format or set of metadata. So far, MetaFarm can extract metadata from thousands of different file formats and from data sources such as existing databases, emails, and log files.

Combining the metadata extracted from every file and source of data in your environment and making it available through a single API command is incredibly valuable. Now customers can, in minutes, search their entire environment for raw video files by the firmware that was in the camera that shot them and cross-correlate that with the weather for that time and location, see exactly how much money is being spent on duplicate files, compare call sheets with time spent actually shooting, find long-lost files that have been misplaced or renamed, search across scripts and production notes by character name or scene location, and much, much more.
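The magnet-and-data-store design Edell describes can be illustrated with a minimal sketch. It is not MetaFarm's actual API: the `magnet` decorator, the two example magnets, and the `harvest` and `search` functions are all hypothetical names chosen for illustration, and the "data store" is just an in-memory dictionary.

```python
import os
from typing import Callable, Dict, List

# Hypothetical: a "magnet" is any function that maps a file path to metadata.
Magnet = Callable[[str], Dict[str, object]]
MAGNETS: List[Magnet] = []


def magnet(fn: Magnet) -> Magnet:
    """Register a function as a magnet so harvest() will run it."""
    MAGNETS.append(fn)
    return fn


@magnet
def basic_file_info(path: str) -> Dict[str, object]:
    """Collect file-system level metadata for any file."""
    return {
        "name": os.path.basename(path),
        "extension": os.path.splitext(path)[1].lower(),
        "size_bytes": os.stat(path).st_size,
    }


@magnet
def text_stats(path: str) -> Dict[str, object]:
    """Count words in plain-text files; contribute nothing for other files."""
    if not path.lower().endswith(".txt"):
        return {}
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return {"word_count": len(handle.read().split())}


def harvest(root: str) -> Dict[str, Dict[str, object]]:
    """Run every registered magnet over every file under root."""
    store: Dict[str, Dict[str, object]] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            record: Dict[str, object] = {}
            for run_magnet in MAGNETS:
                record.update(run_magnet(path))
            store[path] = record
    return store


def search(store: Dict[str, Dict[str, object]], **criteria: object) -> List[str]:
    """Return paths whose harvested metadata matches every criterion exactly."""
    return sorted(
        path
        for path, record in store.items()
        if all(record.get(key) == value for key, value in criteria.items())
    )
```

In this sketch, supporting a new file format means writing one more decorated function, and a call such as `search(store, extension=".txt")` stands in for the single query layer described in the interview.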

MESA: In your opinion, why has metadata become so important so quickly for content owners, and how is GrayMeta supporting this need?

Edell: Metadata is king! In fact, I would argue that it is more valuable to have than the data itself. For example, consider a camera file that has been harvested by MetaFarm. The metadata tells you when and where it was made, what camera and lens were used, what all the settings on the camera were, what the lighting and color profile were, what is in the shot, who is in the shot, what language was spoken, what the weather was like that day, what part of the script it represents, how the file was transferred from the camera, what context the shot has with other shots, etc. If you lost the file but maintained the metadata, you’d have everything you needed to recreate the shot. Now consider having this data on the scale of the hundreds of millions of files you have and you can start to see the immense value of saving this data.

One of the more shocking truths is that most of this data is being lost today. As these files traverse a file system, or go through transcoding, editing, color correction, e-mail servers, file transfer servers, archives and data tapes, metadata is being lost on an alarming scale. This is value that MetaFarm saves customers from losing forever. As we enter the zettabyte era with the oncoming reality of 4K and even 8K, it will become ever more critical to extract and save all the data you generate, not just the 5% that is saved today.

MESA: What are some of the biggest mistakes content owners make when it comes to metadata today?

Edell: To think it’s not important or that there just isn’t a reason to keep it. The truth is that a lot of our customers come up with new use cases for harvesting this data every day that they could have never predicted. There are some obvious examples, such as a celebrity suddenly dying and a newsroom scrambling to find footage of that person for the 5 o’clock news, but then there are some that aren’t so obvious, such as a camera manufacturer telling you that the firmware you might have on some of your cameras causes problems when shooting in low light.

Without having harvested those camera files with MetaFarm, you’d have to spend a great deal of time trying to find files that were shot on that firmware in low light. Or a consumer device suddenly enables contextual searching and you have to scramble to make that data available through an API so that fans can find your content. Or a trend suddenly emerges on Twitter around a scene from a classic episode of television and you want to quickly find that scene and tweet it out to your followers. The use cases are endless, and it’s impossible to predict them, so the best thing to do is be prepared with quick, simple access to 100% of your data. And the best part is that the amount of storage required for your metadata is trivial, especially when compared to what it takes to store media files. So there’s no risk and very low cost to harvesting and storing this incredibly valuable metadata.

MESA: Where does the cloud come into play with GrayMeta’s offerings, and how do we assure clients that it’s secure?

Edell: Cloud has many wonderful applications, but there are still some barriers to entry for a lot of the media and entertainment industry because of potential security risks and because of cost. Media files are the largest files in the world, so transferring them to and from the cloud can be expensive and time-consuming. So we made the choice easy for our customers: MetaFarm runs on premises, in the cloud, or on both. For the customers who have petabytes of media files, MetaFarm can harvest in their data centers, safely behind the firewall, and move the extracted metadata to a data store anywhere else that makes sense. If the customer has made the transition to storing all their content in the cloud, then MetaFarm can run completely in the cloud as well. There is also a hybrid model where the harvesting occurs on site behind the firewall, but the metadata extracted is stored and made accessible in the cloud.

MESA: Can you share some of your favorite use-case examples, where media and entertainment clients made especially good use of GrayMeta’s offerings?

Edell: Some examples of use cases other than the ones I’ve mentioned before are:

• In 2010, the United States passed a law called the CALM Act, which bars the audio of TV commercials from being broadcast at louder sound volumes than the TV program material they accompany. In traditional media workflow environments, a company may have to spend hundreds of thousands of dollars in staff time and overhead hunting for content in its archive that doesn’t meet these requirements. Customers whose audio levels have already been harvested by MetaFarm and stored as searchable data can discover which assets violate the CALM Act within a matter of minutes.

• Ten years after the release of an animated feature, a marquee studio needed to rerelease the movie in 3D. In order to create a 3D look, every single shot had to be reopened and tweaked in the animation software. When going back to the archive of animation files, the studio came across references to files that were not where they were supposed to be. Every person who might have remembered where the files were had left the company. The CTO had to spend three months with full-time staff searching every file system the studio had. If all the content had been harvested by MetaFarm, the checksum values of every file would have been stored in the data store. Then the studio could have quickly located the missing content, programmatically, within a matter of hours.

• A major studio had a number of properties with various product placements. When some of these properties needed to be distributed to new territories, the product placement had to be changed to accommodate new agreements. When harvesting all of the content for the studio, a magnet was enabled that scanned every frame for readable text, and stored the text and the time-code location of every instance. Another magnet then analyzed the text to detect what language it was in. The studio was able to quickly determine which properties had which product placement, what language it was in, and where exactly it appeared in each file. The amount of time saved was tremendous for the marketing team.
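The checksum lookup in the animated-feature example can be sketched as follows. This is an illustration under stated assumptions, not a description of how MetaFarm stores or queries checksums: SHA-256 is an assumed choice of hash, and `sha256_of`, `build_checksum_index` and `locate` are hypothetical helper names. The idea is that a file's checksum depends only on its contents, so a file that has been moved or renamed can still be found by the checksum recorded for it earlier.

```python
import hashlib
import os
from typing import Dict, List


def sha256_of(path: str) -> str:
    """Hash a file's contents in chunks so large media files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_checksum_index(root: str) -> Dict[str, List[str]]:
    """Map each checksum to every path under root that has those contents."""
    index: Dict[str, List[str]] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            index.setdefault(sha256_of(path), []).append(path)
    return index


def locate(index: Dict[str, List[str]], expected_checksum: str) -> List[str]:
    """Return current locations of a file known only by its old checksum."""
    return sorted(index.get(expected_checksum, []))
```

Because the index groups paths by content, any checksum that maps to more than one path also identifies duplicate files.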

MESA: What’s next for the company? Are there new services GrayMeta would like to offer, or specific industries it would like to branch into?

Edell: The really interesting thing about MetaFarm is that it extracts metadata from all file formats, which we’ve discovered makes it immediately applicable to uses outside of media and entertainment.

We’ve had meetings with some folks in the security and surveillance industry who have a crushing need to correlate endless video and audio with other documents and data. But video is just the beginning. The oil and gas industry, as well as the medical and health industries, have expressed interest in MetaFarm because they also have enormous file-based environments and are in desperate need of novel ways of understanding and accessing the value of the data they own. Essentially anyone with file-based data can start extracting incredible value from their content with MetaFarm.

GrayMeta also has some exciting services and products on the road map that have emerged from some very interesting use cases that extracting metadata can solve. I’m looking forward to being able to talk about them openly soon!