AWS Engineers for Building Enterprise Data Lake Solutions.

Universal Music Group is a global leader in the music production and publishing industry. Although the company has been revolutionizing the music industry for decades, some time ago it ran into concerns with its data storage and analysis techniques and engaged a partner to optimize its processes.

Major Challenges

With the existing systems, they were facing the following challenges:
  • Inability to scale with the data explosion from streaming partners (Spotify, Apple, YouTube, etc.).
  • Ineffective analyses on the existing systems, such as LYSD (last year same day) comparisons.
  • High licensing costs due to high data volumes.
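
As an illustration of the LYSD comparison mentioned above, here is a minimal sketch. It assumes that "last year same day" means the same weekday 52 weeks earlier (a common convention, but not one the case study defines), and the function names and stream counts are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def last_year_same_day(day: date) -> date:
    """Return the date 52 weeks earlier, which falls on the same weekday.

    Assumption: LYSD aligns on weekday rather than on calendar date.
    """
    return day - timedelta(weeks=52)


def lysd_change(daily_streams: Dict[date, int], day: date) -> Optional[float]:
    """Fractional change versus the LYSD baseline, or None if data is missing."""
    baseline = daily_streams.get(last_year_same_day(day))
    current = daily_streams.get(day)
    if not baseline or current is None:
        # No usable baseline (missing or zero) or no value for the day itself.
        return None
    return (current - baseline) / baseline


# Hypothetical stream counts, for illustration only.
streams = {date(2023, 3, 17): 1_000, date(2024, 3, 15): 1_250}
change = lysd_change(streams, date(2024, 3, 15))
```

At the volumes described here, the same comparison would more likely run as a warehouse query than in application code; the sketch only pins down the date arithmetic.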

Migration Highlights

  • Migration Strategy: Re-Platform
  • Source: MS SQL Server with 200+ TB of data
  • Target: Amazon Redshift

Best practices

  • Enriched the source data using Hive and loaded it into S3 as multi-part files.
  • Loaded the data from S3 into Redshift using the bulk-load COPY command.
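
The bulk-load step can be sketched as follows. This is a minimal illustration, not the project's actual code: the table name, bucket, prefix, IAM role ARN, and the CSV/GZIP file format are all assumptions. It relies on the fact that Redshift's COPY treats the FROM value as an S3 key prefix, so all part files sharing that prefix are loaded by a single statement.

```python
def build_copy_statement(table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Build a Redshift COPY statement that loads every file under an S3 prefix.

    The values are interpolated directly into SQL, so they should come from
    trusted configuration, never from user input.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"  # assumed file format; the source does not state it
        "GZIP;"            # assumed compression
    )


# Hypothetical names, for illustration only.
statement = build_copy_statement(
    table="streams_daily",
    s3_prefix="s3://example-bucket/enriched/streams/2024-03-15/part-",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
print(statement)
```

The resulting statement would be executed against the cluster with whichever SQL client the pipeline already uses.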

Fast facts

  • Migration Strategy: Re-Platform
  • Technologies: Java, Python, AWS S3, AWS Redshift, AWS EMR, Qubole
  • Team Size: 2 SMEs, 3 Architects, 5 Senior Developers, 5 Developers
  • Data Size: 200+ TB total, 2 TB/day, 250+ million records/day

Solution

  • A cloud-based, elastically scalable Data Lake architecture offers faster analytics and business agility in a cost-efficient manner.
  • AWS Elastic MapReduce (EMR) was used to scale out data processing across nodes and to store the processed data in AWS S3.
  • A data pipeline moved data from S3 to Redshift to support faster analytics downstream.
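
One design question in an S3-to-Redshift pipeline like the one above is how many part files to write per load. The sketch below assumes the commonly cited AWS guidance that the file count should be a multiple of the cluster's slice count and that individual files stay at or below roughly 1 GB. The slice count, size cap, key layout, and function names are hypothetical; the case study does not describe how its files were actually sized.

```python
from typing import List


def plan_part_count(total_bytes: int, slice_count: int,
                    max_part_bytes: int = 1_000_000_000) -> int:
    """Smallest multiple of slice_count that keeps each part <= max_part_bytes."""
    if total_bytes <= 0 or slice_count <= 0 or max_part_bytes <= 0:
        raise ValueError("all arguments must be positive")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    min_parts = -(-total_bytes // max_part_bytes)
    rounds = -(-min_parts // slice_count)
    return rounds * slice_count


def part_keys(prefix: str, count: int) -> List[str]:
    """Generate zero-padded object keys that share one prefix."""
    return [f"{prefix}part-{i:05d}.csv.gz" for i in range(count)]


# A 2 TB daily batch on a hypothetical 32-slice cluster.
parts = plan_part_count(2_000_000_000_000, slice_count=32)
keys = part_keys("enriched/streams/2024-03-15/", 3)
```

Because every key shares one prefix, a single load statement pointed at that prefix picks up all the parts.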

Business Benefits

  • 70% cost savings realized
  • Deep-dive analysis now possible, cost-effectively
  • 1/5th reduction in data processing time
  • 80% reduction in data cleansing (enhanced business agility)