Resource Allocation in Mesos: Dominant Resource Fairness

Apache Mesos takes a distinctive approach to cluster resource management called two-level scheduling. Instead of storing information about available cluster resources in a centralized manner, it operates on resource offers, which slave nodes advertise to running frameworks via the Mesos master, keeping the overall system architecture concise and scalable. The master's allocation module is »

Apache Spark: core concepts, architecture and internals

This post covers core concepts of Apache Spark such as RDDs, the DAG, the execution workflow, how stages of tasks are formed, and the shuffle implementation. It also describes the architecture and main components of the Spark Driver. The github.com/datastrophic/spark-workshop project, created alongside this post, contains example Spark applications and a dockerized Hadoop environment to experiment with. »

Data processing platforms architectures with SMACK: Spark, Mesos, Akka, Cassandra and Kafka

This post is a follow-up to the talk given at the Big Data AW meetup in Stockholm and focuses on different use cases and design approaches for building scalable data processing platforms with the SMACK (Spark, Mesos, Akka, Cassandra, Kafka) stack. While the stack is really concise and consists of only a few components, it is possible to implement »

In the Wake of Scala Days 2015

Abstract: The Scala Days Amsterdam conference was full of interesting topics, so in this post I'll cover talks on the Scala platform, core concepts for making Scala code more idiomatic, monad transformers, consistency in distributed systems, distributed domain-driven design, and a little more. This post came out of the post-conference presentation to my team, so the »