I was expecting this session to show more of the integration of Azure Stream Analytics and Data Factory in practice. Unfortunately, the presentation covered those topics rather quickly. It was still interesting in a few respects. The presenters showed how a company with soda vending machines was able to predict machine maintenance and avoid sales loss. By capturing streaming data from sensors and performing real-time analytics on it, machine learning algorithms were able to successfully predict the cause of the failure.
This was probably my favorite session of the day. It showed a new piece of software that Microsoft has used internally to run several of Azure's most critical core services over the last five years. Azure Service Fabric is a development framework that provides a few interesting services for the developer:
– Automated live upgrade of microservices
– Automated scaling
– Health monitoring
– Easy state management in a distributed environment
It’s almost the dream platform for any service developer or system administrator. Many of the hard problems that come with highly resilient, distributed applications are solved by the framework. It is also well integrated with the Visual Studio environment for easy debugging. Possibly the most interesting thing about this is that it will allow you to seamlessly move your application from an on-premises setup to Azure, as it’s the exact same platform. This will be achieved through the newly announced Microsoft Azure Stack that I talked about in my previous post. To enable stateful distributed systems, Microsoft had to develop new transactional data structures that are highly available in the fabric. These allow an application to share its state across multiple nodes and remain resilient in the event of a failure. The application model also gives the client communicating with the service a way to gracefully handle failures and node maintenance.
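To make that last point concrete, here is a small Python sketch of the client-side pattern: on a transient failure the client re-resolves the service to another replica and retries with backoff, so node maintenance or failover stays invisible to the caller. This is my own illustration with hypothetical names, not the actual Service Fabric API.

```python
import time

class TransientServiceError(Exception):
    """Raised when the node currently hosting the service is unavailable."""

class FakeCluster:
    """Stand-in for a cluster naming service that resolves a service to a replica."""
    def __init__(self, replicas):
        self.replicas = replicas  # endpoint -> handler function
        self.down = set()         # endpoints taken down for maintenance
        self._next = 0

    def resolve(self, service_name):
        # The client's view can be stale: this may return a node that has
        # since gone down. The retry loop below is what handles that.
        endpoints = list(self.replicas)
        endpoint = endpoints[self._next % len(endpoints)]
        self._next += 1
        return endpoint

    def call(self, endpoint, request):
        if endpoint in self.down:
            raise TransientServiceError(endpoint)
        return self.replicas[endpoint](request)

def call_with_retry(cluster, service_name, request, attempts=5, backoff=0.01):
    """Re-resolve and retry on transient failures, as a fabric-aware client would."""
    for attempt in range(attempts):
        endpoint = cluster.resolve(service_name)
        try:
            return cluster.call(endpoint, request)
        except TransientServiceError:
            time.sleep(backoff * (2 ** attempt))  # exponential backoff before retrying
    raise RuntimeError("service unavailable after retries")
```

The interesting design point is that the retry and re-resolve logic lives in a reusable client wrapper rather than in every caller, which is essentially what the session described the framework providing for you.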
I was curious to see Microsoft’s take on tackling this problem. It turns out to be quite straightforward: they use new enhancements to SQL Server 2016 columnstore indexes to minimize IO as much as possible while serving analytical queries, since columnstore compression generally yields a size reduction on the order of 5x to 10x. One “trick” they also talked about was using a secondary read-only replica to serve some of the queries. The presenter went through some of the details of how columnstore indexes work and how they mitigate some of the common issues with running analytical queries on top of an operational database. I was expecting them to talk a bit about the resource governor, but that topic was not covered. In-memory tables were also covered quite a bit, even though they pertain to scenarios that are quite specific and not exactly mainstream. Good information nonetheless.
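As a back-of-the-envelope illustration of why column storage cuts IO for analytical queries (my own toy numbers, not figures from the session): an aggregate over a single column only needs that column's bytes, while a row-store scan drags every column along, and columnar segments compress well on top of that.

```python
# Toy comparison of bytes scanned by a row store vs a column store
# for an analytical query that aggregates a single column.

ROWS = 1_000_000
COLUMNS = 20          # columns per row
BYTES_PER_VALUE = 8   # assume fixed-width 8-byte values for simplicity

# Row store: a scan reads whole rows, so every column comes along.
row_store_bytes = ROWS * COLUMNS * BYTES_PER_VALUE

# Column store: the scan reads only the aggregated column, and the
# values compress well because each segment holds homogeneous data.
COMPRESSION_RATIO = 10  # in the 5x-10x range mentioned above
column_store_bytes = ROWS * BYTES_PER_VALUE // COMPRESSION_RATIO

print(f"row store scans    : {row_store_bytes:>12,} bytes")
print(f"column store scans : {column_store_bytes:>12,} bytes")
print(f"reduction          : {row_store_bytes / column_store_bytes:.0f}x")
```

The compounding of "read one column instead of twenty" with segment compression is what lets a single operational database also serve analytical queries without drowning in IO.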
This session covered Delve and its foundation, the Office Graph. The presenters went through some examples of how the graph is populated today from data coming from Office 365 and third-party applications such as Salesforce, something they call application signals. The goal for Microsoft is to capture as many signals as possible from their applications and perform analytics on the resulting graph to deliver new functionality in Office, for instance by presenting relevant content and better understanding the user’s current work context. For example, when a user first sends an email to a new colleague, the Office Graph will notice this and create a new “Working with” relationship to illustrate that those two people are collaborating.
They also covered how the data is modeled. The graph model is quite simple, though in my opinion some of the choices are odd. For instance, instead of assigning a relationship type like “Has Read” when a user has read a document, they create what they call action nodes. So to figure out all the documents read by a user, you first need to find the proper action node, then collect all the document nodes linked to it.
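To make the action-node indirection concrete, here is a small Python sketch of the two-hop lookup it implies. The modeling and the type names (`"Action:Read"`, `"Document"`) are my own illustration, not the Office Graph's actual schema.

```python
# Minimal in-memory graph with typed nodes, sketching the action-node
# modeling described above: user -> action node -> documents.

class Graph:
    def __init__(self):
        self.nodes = {}   # node id -> node type
        self.edges = []   # (source id, target id) pairs

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = node_type

    def add_edge(self, source, target):
        self.edges.append((source, target))

    def neighbors(self, node_id):
        return [t for s, t in self.edges if s == node_id]

def documents_read_by(graph, user_id):
    """Two-hop query: find the user's 'Read' action node,
    then collect the document nodes it links to."""
    docs = []
    for action in graph.neighbors(user_id):
        if graph.nodes.get(action) == "Action:Read":
            for target in graph.neighbors(action):
                if graph.nodes.get(target) == "Document":
                    docs.append(target)
    return docs

g = Graph()
g.add_node("alice", "User")
g.add_node("alice-read", "Action:Read")
g.add_node("spec.docx", "Document")
g.add_node("plan.xlsx", "Document")
g.add_edge("alice", "alice-read")
g.add_edge("alice-read", "spec.docx")
g.add_edge("alice-read", "plan.xlsx")
print(documents_read_by(g, "alice"))  # ['spec.docx', 'plan.xlsx']
```

With a direct “Has Read” relationship the same query would be a single hop from the user to the documents, which is why the extra indirection struck me as an odd choice.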
Overall, I think Microsoft will provide the proper extensibility points by allowing third parties and ISVs to load their own data into the Office Graph and define custom types. From what the presenter told me, there currently is a query language, but it looks rather limited when you compare it with Neo4j’s Cypher.