The proliferation of online distribution options means that content owners are confronted with the problem of managing highly complex data sets as their content is viewed across the Internet. Here Ben Reid, founder of elasticiti, an analytics solution for media companies, explains how they can go about automating the complex tasks associated with data management.
As a content owner, these are heady days. You have no shortage of distribution outlets for your video content – Netflix, Amazon, Apple TV, Roku, Xbox, plus the cable providers – allowing you to achieve audience scale not seen since the days when all of America sat down to watch Mutual of Omaha’s Wild Kingdom on Sunday evenings.
Now that you can distribute your content anywhere, the big (and potentially unwelcome) question is: should you?
In a typical content distribution arrangement, you, the asset owner, require a contractual minimum along with some sort of revenue sharing agreement. But the moment you turn your assets over to the distributor, you lose all visibility into how they’re consumed. So how do you know if your partner is meeting its requirements? Or if consumption rates are aligned with your forecasts?
To answer these questions you need access to your data that’s held by your partners, but in too many cases, that’s easier said than done.
To begin, you’ll need to request data from dozens of sources, which is a hugely complex problem. Complicating matters further, your data will come back in many forms, none of which are likely to align with your internal systems and formats. Consequently, your research department will need to perform a lot of ugly blocking and tackling to get a comprehensive picture of what’s going on.
Digging deeper, you’ll need to monitor how each partnership performs. Let’s say you strike a partnership with Partner X and they agree to provide you with a specified amount of exposure to their customers. Are they meeting their minimums? Exceeding them? Is the traffic stable or highly dynamic? How do you learn the strengths and weaknesses of each partner, so that your team can jump on the phone and remediate issues if necessary? If there are problems, can you afford to discover them days, or even weeks, later?
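To make those questions concrete, here is a minimal sketch in Python of the kind of check a research team might automate. Everything in it is an assumption for illustration: the PartnerReport fields, the sample numbers and the idea of measuring stability as the standard deviation of daily views divided by their mean are hypothetical, not a description of any partner's actual reporting.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class PartnerReport:
    # All fields are hypothetical; real contracts and feeds will differ.
    partner: str
    contracted_minimum: int  # views promised for the period
    daily_views: list        # views reported for each day of the period


def shortfall(report):
    """Return how many views the partner is below its minimum (0 if met)."""
    delivered = sum(report.daily_views)
    return max(report.contracted_minimum - delivered, 0)


def variability(report):
    """Standard deviation of daily views relative to the mean.

    Values near 0 suggest stable traffic; larger values suggest
    highly dynamic traffic.
    """
    avg = mean(report.daily_views)
    return pstdev(report.daily_views) / avg if avg else 0.0


reports = [
    PartnerReport("Partner X", 100_000, [30_000, 28_000, 24_000]),
    PartnerReport("Partner Y", 50_000, [20_000, 20_000, 20_000]),
]

# Partners that need a phone call, with the size of the gap.
needs_attention = [(r.partner, shortfall(r)) for r in reports if shortfall(r) > 0]
print(needs_attention)
```

Run daily rather than at the end of the contract period, a check like this is one way to surface problems in hours instead of weeks.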
The datasets generated by each partner truly fall into the Big Data category, in that they’re hugely voluminous, come in myriad varieties, and arrive at a velocity that overwhelms the desktop computing capabilities of your researchers.
For example, you may be able to get your data via an API. This sounds hopeful, but all too often, poor performance means it can take up to half a day to retrieve your data. Or you can automate data retrieval from a reporting portal via web-scraping. But this approach is more challenging to set up, and the system will break every time there’s a design change to the reporting site. Or you can get your data via an email or on an FTP site, but we all know the issues there.
Here’s the bottom line: although distribution partners recognize the need to share data, there is no consensus on how to do it, and some partners are more supportive of your data needs than others.
Once you get the data, critical grooming needs to happen. For instance, you’ll need to fully understand every column of data, or risk flawed interpretations. That means a lot of communication and documentation between you and your partners is necessary so that you can have confidence in the data you receive. Next, you’ll need to decide on the right level of aggregation. While highly granular data is always good, when it comes to reporting, you’ll need to harmonize it across partners at a common level of detail.
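As an illustration of harmonizing at a common level of detail, the sketch below takes two hypothetical partner feeds, one reporting hourly plays and one reporting daily streams under different column names and date formats, and rolls both up to daily views per title. The feed layouts, field names and sample values are all assumptions made for the example; real feeds need the column-by-column documentation described above before a mapping like this can be trusted.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical feed from Partner A: hourly rows, ISO timestamps.
partner_a_rows = [
    {"asset_name": "Show 1", "ts": "2016-03-01T09:00:00", "plays": 120},
    {"asset_name": "Show 1", "ts": "2016-03-01T21:00:00", "plays": 80},
]

# Hypothetical feed from Partner B: daily rows, US-style dates.
partner_b_rows = [
    {"title": "Show 1", "day": "03/01/2016", "streams": 300},
]


def normalize_a(row):
    """Map a Partner A row to the common (title, date, views) shape."""
    day = datetime.fromisoformat(row["ts"]).date()
    return row["asset_name"], day, row["plays"]


def normalize_b(row):
    """Map a Partner B row to the common (title, date, views) shape."""
    day = datetime.strptime(row["day"], "%m/%d/%Y").date()
    return row["title"], day, row["streams"]


def daily_totals(feeds):
    """Aggregate every feed to views per partner, title and day."""
    totals = defaultdict(int)
    for partner, rows, normalize in feeds:
        for row in rows:
            title, day, views = normalize(row)
            totals[(partner, title, day.isoformat())] += views
    return dict(totals)


totals = daily_totals([
    ("Partner A", partner_a_rows, normalize_a),
    ("Partner B", partner_b_rows, normalize_b),
])
for key, views in sorted(totals.items()):
    print(key, views)
```

Keeping one small normalizing function per partner means a change in one partner's format touches only that function, while the granular rows can still be stored as received.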
Where does this data go? Ideally in your data lake. If not, you’ll most likely load it into your data warehouse and into your big data environment, which means finding a way to join the two, and that is about as much fun as it sounds. This is a key area where strong partnerships and communication between business users and internal IT leaders are a crucial foundation for success.
To benefit from the golden age of digital distribution, you, as a media company, need to embrace the data requirements of your business team. That means streamlining your partner data onboarding and rationalization processes, and automating much of what your internal distribution team does manually. Data heterogeneity will be the norm until such time as the industry agrees to standardize on nomenclature and formats. Since that won’t happen anytime soon, you’ll need to have business processes in place to handle ever-evolving, complex data feeds.
As video content owners enjoy the benefits of an expanding viewing universe, it’s vital to keep a firm hand on the partnership reins, as well as an open eye to the risks and opportunities those benefits present.