Search results for: AWS

VPCs, jump boxes, & NATs: Part 1 of our Hybrid Cloud Tips & Tricks Series

// 11.12.2014 // Data

Here at MediaMath, we are building out new services for our hundreds of terabytes of data at a lightning-fast pace. This isn’t your grandma’s software development shop. Using cloud services like Amazon’s EC2 allows us to scale up our infrastructure to match this pace. However, MediaMath doesn’t run a purely cloud-hosted environment. Instead, we run a mixed data center environment, with a number of high-performance components running in a variety of data centers throughout the world. That means that the new AWS-hosted pipelines and data stores we are building must integrate with our in-house data centers, which […]

Breaking the logjam

// 10.15.2014 // Infrastructure

At MediaMath, our infrastructure generates terabytes of business-critical messages every day, such as ad impression logs and tracking beacon events. A service we’ve developed within our TerminalOne technology platform, nicknamed the “MediaMath Firehose,” enables our internal analytics applications and bidding systems to generate meaningful insights and take action on all of the data from these messages in real time. This wasn’t always the case; traditionally, this data was made available in hourly or nightly batches. We needed a significant technical and cultural transformation to move from batching to streaming. When we first began architecting our data delivery systems in the […]

Learning how to learn: My summer on the Data Platform Team

// 08.20.2014 // Data

During the summer of 2014, I worked as an intern on the Data Platform team. One of the team’s main initiatives is to develop data workflows and reporting for other internal groups. My first project was to build a report in Scala. The report I built was for the Site Uniques Workflow, which is the data processing pipeline for all video advertising campaigns. Specifically, this report lets you group various campaign attributes together to obtain different metrics. For example, you can group by campaign ID, website, ad exchange ID, auction ID, etc. It pulls raw bid […]

Making your local Hadoop more like AWS Elastic MapReduce

// 05.21.2014 // Data

A version of this article originally appeared on Ian’s personal blog. At MediaMath, we’re big users of Elastic MapReduce (EMR). EMR’s incredible flexibility makes it a great fit for our data analytics team, which processes terabytes of data each day to provide insights to our clients, to better understand our own business, and to power the various product back-ends that make TerminalOne the “marketing operating system” that it is. An extremely important best practice for any analytics project is to ensure the local development and test environments match the production environment as closely as possible. This eliminates the […]
