Building faster, scalable reporting with Hadoop-Impala

// 05.21.2014 // Infrastructure

As a leading DSP with billions of online ads running through our platform every day, one of our biggest problems is how to report attribution data (which ad led to which action, such as a sale or an online signup) to our clients frequently and reliably. The problem we are tackling, in numbers:

A) 30-day impression volume = 35 to 40 billion records
B) 1-hour event/click volume = 15 to 20 million records

We need to join B (events) with A (impressions) twice every hour (once for events and once for clicks), find the matching records, perform complex sequencing […]
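To make the shape of that hourly join concrete, here is a minimal in-memory sketch in Python. It is illustrative only and not the pipeline described in the post: the field names (`user_id`, `ad_id`, `event_id`, `ts`), the `attribute` helper, and the "latest impression before the event wins" rule are all assumptions made for the example. Only the 30-day window comes from the numbers above.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# 30-day impression window, taken from the volumes quoted above.
LOOKBACK = timedelta(days=30)


def attribute(events, impressions):
    """Match each event to one impression for the same user.

    Assumed rule (not stated in the post): pick the most recent
    impression that happened at or before the event and within LOOKBACK.
    """
    # Build side of the hash join: index impressions by the assumed key.
    by_user = defaultdict(list)
    for imp in impressions:
        by_user[imp["user_id"]].append(imp)

    matches = []
    # Probe side: look up each event's user and filter by time window.
    for ev in events:
        candidates = [
            imp
            for imp in by_user.get(ev["user_id"], [])
            if timedelta(0) <= ev["ts"] - imp["ts"] <= LOOKBACK
        ]
        if candidates:
            last = max(candidates, key=lambda imp: imp["ts"])
            matches.append((ev["event_id"], last["ad_id"]))
    return matches


# Hypothetical sample data, invented for the example.
sample_impressions = [
    {"user_id": "u1", "ad_id": "ad-A", "ts": datetime(2014, 5, 1, 12, 0)},
    {"user_id": "u1", "ad_id": "ad-B", "ts": datetime(2014, 5, 10, 9, 0)},
    {"user_id": "u2", "ad_id": "ad-C", "ts": datetime(2014, 3, 1, 9, 0)},
]
sample_events = [
    {"event_id": "e1", "user_id": "u1", "ts": datetime(2014, 5, 20, 8, 0)},
    {"event_id": "e2", "user_id": "u2", "ts": datetime(2014, 5, 20, 8, 0)},
    {"event_id": "e3", "user_id": "u3", "ts": datetime(2014, 5, 20, 8, 0)},
]

print(attribute(sample_events, sample_impressions))
```

Only `e1` is matched here: `e2`'s sole impression falls outside the 30-day window and `e3` has no impression at all. At the volumes quoted above the impression side would not fit in a single process's memory, so this sketch only illustrates the join logic, not how to run it at scale.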