Category Archives: Microsoft Azure

Zeppelin and Spark SQL on HDInsight

Interactive analysis has become a major part of data science, and two tools for it have become very popular: Jupyter and Zeppelin.
This article shows you how to provision a Spark cluster and run analyses on it with the help of Zeppelin.


DocumentDB as a data sink for Azure Stream Analytics

The Internet of Things (IoT) arrived some time ago, and an important part of it is analyzing sensor data in motion. That requires a streaming system. You could use Apache Storm or Apache Spark Streaming, but if you want to run the analysis as a service on Azure without the pain of setting up a cluster, Azure Stream Analytics is a good choice.

In this post I go through the basics of Azure Stream Analytics and of DocumentDB, which serves as the destination for our data once the stream has been processed. I also show how to create a simple Stream Analytics job that uses Blob storage as the input and DocumentDB as the output.


Continuous Delivery of Azure WebJobs via Git

Continuous delivery (or continuous deployment or continuous integration) is an important part of the modern software development lifecycle.
Azure WebJobs, on the other hand, is a nice little feature for running processes in the background.

This article briefly describes why continuous delivery is a good thing, what Azure WebJobs are, and how you can use the two together.


Running RethinkDB on Azure

Over a year ago a friend of mine sparked my interest in RethinkDB. At the time I played with it a little and thought it seemed to be a nice database. After that I focused on other topics, but recently I came back to RethinkDB. In this post I would like to explain how you can set up RethinkDB on Azure and play around with it.
