Business Impact

  • 5M Web URLs Were Scraped Daily for Content

  • Automated Insights Generation

Customer Key Facts

  • Location : North America
  • Industry : Media & Entertainment

Problem Context

The customer, a media consulting company, wanted to structure and analyze large unstructured datasets arising from brand and asset marketing, so that it could advise its partners on creative, media, messaging, and communication strategies for digital ad campaigns.

Key challenges:

  • Storing at least 14 TB of proprietary data coming from 11 different data sources
  • Proprietary data was unclean and had not been validated for over a year
  • Processing daily data of 2 billion records

Technologies Used

AWS Glue
AWS Lambda
Amazon S3
Amazon QuickSight
Amazon Athena
AWS Elastic Beanstalk

Developed an End-to-End Consumer Data Platform for Analysis


Quantiphi developed an end-to-end platform to ingest multiple data sources into the AWS Cloud and created a data lake on Amazon S3. The data sources were combined and transformed using AWS Glue to create an analysis-ready platform. Quantiphi also developed an automated solution that scrapes content from more than 5M web URLs daily to identify users' content preferences and generate insights on user behavior.
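
The case study does not describe how scraped pages were turned into content-preference signals. As a rough illustration only, the following Python sketch shows one way a single page's HTML could be reduced to visible text and scored against topics. The `TextExtractor` class, the `TOPIC_KEYWORDS` vocabulary, and the sample page are all hypothetical and are not taken from the actual solution; fetching pages and running at the scale of millions of URLs per day are outside the scope of this sketch.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical topic vocabulary; a real system would likely use a richer model.
TOPIC_KEYWORDS = {
    "sports": {"match", "league", "score"},
    "music": {"album", "concert", "playlist"},
}


def topic_counts(text):
    """Count how many words in the text belong to each topic's keyword set."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for topic, keywords in TOPIC_KEYWORDS.items():
        counts[topic] = sum(1 for word in words if word in keywords)
    return counts


# Made-up sample page used only to exercise the functions above.
sample_html = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<h1>League final</h1><p>The match ended with a record score.</p>"
    "<script>var album = 1;</script></body></html>"
)

page_text = extract_text(sample_html)
page_topics = topic_counts(page_text)
```

Aggregating such per-page counts across the pages a user visits is one simple way to approximate content preference; the word "album" inside the script tag is deliberately ignored because it is not visible content.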


  • Enables the creation of consumer pools
  • Helps generate key insights into consumer behavior in real time, such as application usage, purchase behavior, and media consumed
