While many organizations have yet to effectively leverage data available to them, Netflix is a noteworthy exception.
Netflix is easily one of the most counter-intuitive companies out there. A prime example is its decision to block VPNs outright in 2016.
It did so even though, at the time, more than 30 million Netflix users lived in countries where the service was unavailable without a VPN or other location-masking service (and where Netflix is now recording most of its subscription gains).
The same year, Netflix raised its prices and refused to back down despite protests from users and the loss of hundreds of thousands of subscribers.
Yet, Netflix has only grown since.
Figure: Netflix’s subscriber growth since its controversial 2016 decision to ban VPNs and hike its prices.
So how is Netflix able to continue its rapid growth despite alienating a significant portion of its base? By leveraging big data to find out exactly what users want and giving it to them. Netflix is betting big on content and user experience, with the larger share of its budget going to content. In 2019, Netflix is committing $15 billion to content. For comparison, it is committing a meager $2.9 billion to marketing.
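To put the two budget figures side by side, here is a minimal arithmetic sketch. It uses only the two numbers quoted above; the variable names are illustrative, and the "share" is computed against the sum of these two line items only, not Netflix’s total budget.

```python
# Budget figures quoted in the article, in billions of US dollars.
content_budget = 15.0
marketing_budget = 2.9

# How many times larger the content budget is than the marketing budget.
ratio = content_budget / marketing_budget

# Content's share of the two line items combined (not of the total budget).
share = content_budget / (content_budget + marketing_budget)

print(f"Content spend is about {ratio:.1f}x marketing spend")
print(f"Content is about {share:.0%} of the two combined")
```

In other words, for every dollar committed to marketing, roughly five are committed to content.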
While it’s easy to focus on Netflix’s huge content budget, it is more instructive to look at the process used to come up with ideas for this content and at how large a role big data plays in it.
Netflix’s big data infrastructure
Netflix uses data processing software and traditional business intelligence tools such as Hadoop and Teradata, as well as its own open-source solutions such as Lipstick and Genie, to gather, store, and process massive amounts of information. These platforms influence its decisions on what content to create and promote to viewers.
Netflix doesn’t use a traditional data center-based Hadoop data warehouse. Instead, its setup is built to let it store its data and run multiple Hadoop clusters for different workloads, all accessing the same data. In the Hadoop ecosystem, it uses Hive for ad hoc queries and analytics, and Pig for ETL (extract, transform, load) and algorithms. It then created its own Genie project to help handle increasingly massive data volumes as it scales. All this points to one thing: Netflix is very particular about collecting a lot of data and being able to process it so that it understands exactly what its users want.
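The idea of different workloads reading the same underlying data can be illustrated with a toy sketch in plain Python. This is not Netflix’s code and does not use Hadoop, Hive, or Pig; the event records, field names, and function names are all hypothetical, chosen only to show a batch ETL-style aggregation and an ad hoc query running against one shared data set.

```python
from collections import defaultdict

# Hypothetical shared store of raw viewing events (invented sample data).
RAW_EVENTS = [
    {"user": "u1", "title": "Show A", "minutes": 42},
    {"user": "u2", "title": "Show A", "minutes": 10},
    {"user": "u1", "title": "Show B", "minutes": 55},
    {"user": "u3", "title": "Show A", "minutes": 48},
]


def etl_minutes_per_title(events):
    """ETL-style batch workload: total minutes watched per title."""
    totals = defaultdict(int)
    for event in events:
        totals[event["title"]] += event["minutes"]
    return dict(totals)


def adhoc_viewers_of(events, title):
    """Ad hoc query-style workload: distinct users who watched a title."""
    return sorted({e["user"] for e in events if e["title"] == title})


# Both workloads read the same events; neither keeps its own copy.
minutes_per_title = etl_minutes_per_title(RAW_EVENTS)
show_a_viewers = adhoc_viewers_of(RAW_EVENTS, "Show A")

print(minutes_per_title)
print(show_a_viewers)
```

The point of the sketch is the separation: the stored events stay in one place, while each workload can be run, scaled, or replaced independently of the other.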
The result has been nothing short of amazing: 90 percent of Netflix users have engaged with its original content.
Netflix’s big data approach to content is so successful that it renews 93 percent of its original series, compared with the TV industry as a whole, where just 35 percent of shows are renewed past their first season.