Project Title:
UMG Consumer Analysis.
Environment:
Java, MapReduce, Hive, Hue, Spark, AWS EMR, S3, EC2,
Step Functions, Lambda, Spring Boot, Data Pipeline, Python, Redshift,
Qubole, BigQuery, Google Cloud Storage, Google Cloud Dataflow, Apache Beam,
Cloud Composer (Airflow), Cloud Data Transfer API, Cloud SQL, and Kubernetes.
Project Précis:
The UMG Enterprise Reporting Service group receives consumer data daily
from its digital sales partners, such as Spotify, iTunes, Amazon, and
Google, for business intelligence and analytics purposes. The volume of
this data is approximately 60 GB per day (compressed), or approximately
300 million rows, and it is growing at approximately 10% month over month.
This data is currently stored and processed within the UMG data center
using traditional tools and systems. Currently, we use the cloud-based AWS
platform (S3, EMR, and Redshift) to store the daily consumer analytics data
from digital sales partners, which is in turn consumed by users through the
MicroStrategy reporting tool.
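
To make the growth figure concrete, the sketch below compounds the stated
60 GB per day forward at 10% month over month. It assumes the rate stays
constant, which the summary does not claim, and the function name
project_daily_volume is hypothetical, used only for illustration.

```python
def project_daily_volume(current_gb: float, monthly_growth: float, months: int) -> float:
    """Compound a daily volume forward at a fixed month-over-month growth rate.

    Assumption: the growth rate is constant over the whole period.
    """
    return current_gb * (1 + monthly_growth) ** months


if __name__ == "__main__":
    # 60 GB/day and 10% month over month are the figures quoted in the summary.
    for months in (0, 6, 12):
        gb = project_daily_volume(60.0, 0.10, months)
        print(f"after {months:2d} months: {gb:.1f} GB/day")
```

Under that assumption the daily volume roughly triples within a year, to
about 188 GB per day (compressed).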