QueryableState in Apache Flink – Part 1

QueryableState allows users to run real-time queries against the internal state of a stream without first storing the result in external storage. This opens up many interesting possibilities, since we no longer need to wait for the system to write to external storage (which has always been one of the main bottlenecks in these kinds of systems). It might even be possible to do without a database altogether and have user-facing applications query the stream directly, which would make the application faster and cheaper. This might not be applicable to all the use […]

A Tale of TwoTails – Mutual tail recursion in Scala

TwoTails is a compiler plugin written to add support to Scala for mutual tail recursion. While trampolines or trampolined-style recursion solve the direct need, they require explicit construction by a developer and add overhead in the form of additional data structures. Unfortunately, building a “native” solution directly into Scalac without using trampolines is not a straightforward task, even with basic tail recursion. In the latest version, a second compilation scheme has been introduced, solving an issue peculiar to the JVM that the first scheme was not able to address properly. I’ll discuss both the motivation behind this new scheme […]

Take Reports From Concept to Production with PySpark and Databricks

// 04.19.2017 // Data Science

This article was originally published on the Databricks blog on April 3rd, 2017. Introduction: What is MediaMath? MediaMath is a demand-side media buying and data management platform. This means that brands and ad agencies can use our software to programmatically buy advertisements as well as manage and use the data that they have collected from their users. We serve over a billion ads each day, and track over 4 billion events that occur on the sites of our customers on a busy day. This wealth of data makes it easy to imagine novel reports in response to nearly any situation. Turning […]

Video: Extreme-scale Data Science Using Spark

// 11.14.2016 // Data Science

At the Spark Summit in Brussels, MediaMath’s SVP of Data Science, Prasad Chalasani, gave an invited keynote talk, Extreme Scale Ad-Tech at MediaMath with Spark and Databricks. MediaMath’s demand-side platform responds to over 200 billion ad-opportunities daily, and leverages massive amounts of data to power smarter digital marketing. We use Spark heavily both in production and R&D to develop innovative, proprietary, and scalable solutions to multiple large-scale data problems, such as: training machine-learning models for predicting conversion probability given an ad-impression; measuring causal effectiveness of advertising using randomized tests; estimating audience reach for specified targeting criteria; finding deviceIDs belonging to the same user based on […]

Video: Building a Clustered Service in Go

// 10.26.2016 // Platform API

In a data-streaming web world, things happen fast. In less than the blink of an eye, MediaMath’s digital marketing systems host real-time auctions and serve ads across the world to the tune of 4.6 million queries per second. In this session at the GOTO Chicago Conference, MediaMath CTO Wilfried Schobeiri dove into MediaMath’s data stream processing architecture and how the company is building the next generation of real-time, high performance systems in Go. Using Go, MediaMath is able to scale its systems on a minimal resource footprint. Wil also explains why Go is a game-changer for building services, how to […]

Real-time Streaming Attribution Using Apache Flink

// 09.12.2016 // Data

In this blog post, I will share a proof of concept for real-time attribution using Apache Flink from streaming data sources of impressions and events, and how we handled some of the specific problems inherent in windowing and processing real-time data streams at scale. Our goal was to determine whether we could use Flink to stream impression and event data and compute attribution in real time, so that advertising strategies could be optimized immediately. In digital advertising, we refer to ads (whether they are served on social networks, mobile, video, or display) as impressions. Once the […]
