Search results for: data platform

Counting at Scale: HyperLogLog to the Rescue

MediaMath processes many terabytes of data each day for the various reports available in T1. One metric we show is the number of unique impressions for each campaign: there is a big difference between showing an ad to 100 different people and showing the same ad to one person 100 times. While this is conceptually a simple problem, solving it at scale is not quite as straightforward. The canonical way of solving this problem would be, for any given campaign, to put the ID of each person who saw an ad for that campaign into a set and then check […]
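As a rough illustration of the canonical set-based approach described in this excerpt, here is a minimal Python sketch. The event shape (`(campaign_id, user_id)` pairs), the function name `count_unique_impressions`, and the sample data are assumptions made for illustration; the excerpt is cut off before it says what is checked, so reading off each set's size is also an assumption.

```python
from collections import defaultdict


def count_unique_impressions(events):
    """Exact unique-viewer count per campaign.

    `events` is assumed to be an iterable of (campaign_id, user_id) pairs;
    this shape is hypothetical, not taken from the original post.
    """
    seen = defaultdict(set)  # campaign_id -> set of user IDs seen so far
    for campaign_id, user_id in events:
        seen[campaign_id].add(user_id)  # duplicates are ignored by the set
    # Assumed final step: the size of each set is the unique count.
    return {campaign_id: len(users) for campaign_id, users in seen.items()}


# Hypothetical data: one person shown campaign c1 100 times,
# and 100 different people each shown campaign c2 once.
events = [("c1", "alice")] * 100 + [("c2", f"user{i}") for i in range(100)]
counts = count_unique_impressions(events)
```

Both campaigns have 100 impressions, but the sets distinguish one viewer from 100. Memory grows with the number of distinct IDs per campaign, which is what makes this exact approach hard to sustain at scale and motivates an approximate structure such as HyperLogLog.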

Data Liberation at MediaMath

// 04.15.2015 // Data

MediaMath was recently at Amazon Web Services re:Invent 2014, where we presented on our open data platform and data liberation project, both of which are enabled by a variety of tools, many of them from AWS. Below is a recording of our presentation: Data Liberation at MediaMath. Aggregating and processing terabytes of data per day is a challenge for any technology company. As marketers and brands become more sophisticated consumers of data, enabling granular levels of access to targeted subsets of data from outside your firewalls presents new challenges. In this presentation, VP of Engineering Edward Fagin and Senior Director of Data […]

Breaking the logjam

// 10.15.2014 // Infrastructure

At MediaMath, our infrastructure generates terabytes of business-critical messages every day, such as ad impression logs and tracking beacon events. A service we’ve developed within our TerminalOne technology platform, nicknamed the “MediaMath Firehose,” enables our internal analytics applications and bidding systems to generate meaningful insights and take action on all of the data from these messages in real time. This wasn’t always the case; traditionally, this data was made available in hourly or nightly batches. We needed a significant technical and cultural transformation to move from batching to streaming. When we first began architecting our data delivery systems in the […]