All Stories

Kafka Consumer Memory Tuning

Yesterday, I had a process that was consuming a single Kafka topic. I was running it in our “staging” environment, and everything worked great. My heap space for the process...

Schedules & Scores API for Streaming Live Sports Stats

Recently, I’ve been looking for an API that exposes schedules and scores for “the big four” American leagues. Here’s what I was looking for:

Killing Subprocesses in Linux/Bash

Lately, I’ve been working with YARN at LinkedIn. This framework lets you execute Bash scripts on one or more machines, and it’s used primarily for Hadoop. When using YARN, you...

Sorting Reducer Input Values in Hadoop

I HIGHLY recommend that you read the email thread by Owen O’Malley, which describes this technique in brief. I should also note that this example uses the 0.18 Hadoop...