Interactive analysis has become a major part of data science. Two tools have become very popular: Jupyter and Zeppelin.
This article will show you how to provision a Spark cluster and run analysis on it with the help of Zeppelin.
The Internet of Things (IoT) arrived some time ago. An important part of it is analyzing sensor data in motion, which requires a streaming system. You could use Apache Storm or Apache Spark Streaming for that, but if you want to run it as a service on Azure without the pain of setting up a cluster, Azure Stream Analytics is a good choice.
In this post I go through the basics of Azure Stream Analytics and DocumentDB, which serves as the destination for our data once it has been streamed, and show how you can create a simple Stream Analytics job using Blob storage as the input and DocumentDB as the output.
Continuous delivery (or continuous deployment or continuous integration) is an important part of the modern software development lifecycle.
Azure WebJobs, on the other hand, is a nice little feature for running processes in the background.
This article briefly describes why continuous delivery is a good thing, what Azure WebJobs are, and how you can use both together.
It’s getting cold outside, November has already arrived, and there are just a few days left until Christmas. This means that it is time for another Links of the month.
This issue contains some interesting blog posts related to Azure and Apache Spark in general.
September is gone and it is time for another Links of the month. This time is special for me. In the middle of September my first course with Opsgility was released, covering the basics of Microsoft Azure DocumentDB. Nevertheless, there are more things to talk about.
Today I’m very proud to blog about something which started at the beginning of the year. At that time Michael Washam asked me if I wanted to create a course for Opsgility. We had some ideas and agreed to work on a course about Azure DocumentDB. At that time this service was still in preview. Today, a few months later, DocumentDB has reached GA and several features have been added. A lot of things changed during the creation of the course: there are no capacity units anymore, the API changed a little bit, and so on. Nevertheless, it is great to see what the DocumentDB team has achieved.
At the beginning of the week Opsgility announced that the course is finally available.
So, what is covered in the course? This course aims to give listeners the ability to build their own applications based on DocumentDB. It starts with the basics: understanding what NoSQL databases are and how document-oriented databases fit into them. After the introduction, a sample application is built which uses DocumentDB features such as stored procedures, UDFs and many more. You will learn how you can easily monitor and scale DocumentDB. At the end of the course, some scenarios are presented to give an idea of where DocumentDB could be a good choice.
It was a long journey and I would like to thank the guys at Opsgility, especially Tiffiney Groce and Michael Washam. Michael made it possible to build my own course, and Tiffiney helped me a lot during its creation by coordinating everything. Thank you!
So, have fun! If you have feedback about the course I would love to hear it.