MediaMath Developer Blog Authors

A Picture of Ian Hummel
IAN HUMMEL Director of Data Platform

Ian Hummel is the Director of Data Platform at MediaMath. He’s led a variety of product initiatives at MediaMath over the years and is currently focused on building a next-generation, large-scale analytics platform. Before MediaMath, he worked in a variety of tech fields, including enterprise search, video processing, identity federation, and mobile app development. He has a BA in Mathematics and Computer Science from Boston University and an MBA from INSEAD.

Articles by this author:

Extending Play’s validation to work with Big Data tools like DynamoDB, S3, and Spark

// 03.18.2015 // Data

In this two-part blog series, we are looking at how MediaMath uses Play’s API to perform data validation on big data pipelines. In part one, we covered data validation with Play’s combinator-based API. In part two, we’ll extend that data validation to work with Amazon Web Services (AWS) DynamoDB, AWS S3, and Spark. Extending validation to work with AWS DynamoDB: MediaMath uses a variety of technologies in our analytics stack, including AWS DynamoDB. DynamoDB is a distributed, fault-tolerant key-value store as a service that makes it easy to store and query massive datasets. We use it to power a few internal troubleshooting […]

Scaling data tools: How Play enables strongly typed big data pipelines

// 03.04.2015 // Data

The other day, I was talking with a colleague about data validation, and the Play web framework came up. Play has a nice API for validating HTML form and JSON submissions. This works great when you’re processing small amounts of data from the web tier of your application. But could that same tech benefit a Big Data team working on a backend powered by Hadoop or Spark? We decided to find out, and the results were encouraging. The secret sauce? Play’s combinator-based approach to data validation. Whether your data is big or small, garbage in is garbage out: MediaMath processes TBs […]

Making your local Hadoop more like AWS Elastic MapReduce

// 05.21.2014 // Data

A version of this article originally appeared on Ian’s personal blog. At MediaMath, we’re big users of Elastic MapReduce (EMR). EMR’s flexibility makes it a great fit for our data analytics team, which processes TBs of data each day to provide insights to our clients, to better understand our own business, and to power the various product back-ends that make Terminal 1 the “marketing operating system” that it is. An extremely important best practice for any analytics project is to ensure the local development and test environments match the production environment as much as possible. This eliminates the […]