Microsoft has announced that Azure Data Lake Storage Gen2 and Azure Data Explorer are now generally available. The company has also rolled out a preview of Azure Data Factory Mapping Data Flow. In a blog post, Microsoft says all three updates are designed to advance cloud analytics on the Azure platform.
Azure Data Lake Storage (ADLS) Gen2 offers broader compatibility, especially with the Apache ecosystem. With Gen2, Microsoft has built ADLS on top of Azure Blob Storage. This allows the service to work with Apache tools, including integration with Apache Spark and Hadoop.
Microsoft has developed a hierarchical namespace (HNS) that reduces the number of compute operations required while boosting overall performance. The HNS supports atomic file and folder operations in Azure Data Lake. An atomic operation either completes in full or fails entirely, so it never leaves data in a partially updated state.
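To illustrate why a hierarchical namespace can cut the number of operations, here is a minimal toy sketch in Python. It is not how ADLS is implemented: the dictionaries, the function names `rename_flat` and `rename_hierarchical`, and the sample paths are all assumptions made for illustration. It contrasts renaming a "folder" in a flat key space, which touches every object under the prefix, with renaming a real directory node, which is a single all-or-nothing step.

```python
# Toy model only: plain dicts stand in for a storage service.
# All names and paths here are hypothetical, not an Azure API.


def rename_flat(store: dict, old_prefix: str, new_prefix: str) -> int:
    """Rename a 'folder' in a flat key space.

    Every object whose key starts with old_prefix is moved individually,
    so the cost grows with the number of objects, and a failure midway
    would leave some objects under the old name and some under the new.
    """
    ops = 0
    for key in [k for k in store if k.startswith(old_prefix)]:
        store[new_prefix + key[len(old_prefix):]] = store.pop(key)
        ops += 1
    return ops


def rename_hierarchical(parent: dict, old_name: str, new_name: str) -> int:
    """Rename a directory node inside its parent directory.

    All checks run before anything is changed, and the rename itself is
    one pointer move, so it either happens completely or not at all.
    """
    if old_name not in parent:
        raise KeyError(old_name)
    if new_name in parent:
        raise FileExistsError(new_name)
    parent[new_name] = parent.pop(old_name)
    return 1


if __name__ == "__main__":
    flat = {
        "raw/2019/a.csv": b"1",
        "raw/2019/b.csv": b"2",
        "raw/2020/c.csv": b"3",
    }
    print("flat operations:", rename_flat(flat, "raw/2019/", "raw/archive/"))

    tree = {"raw": {"2019": {"a.csv": b"1", "b.csv": b"2"}, "2020": {"c.csv": b"3"}}}
    print("hierarchical operations:", rename_hierarchical(tree["raw"], "2019", "archive"))
```

In this toy model the flat rename needs one operation per file, while the hierarchical rename needs one operation regardless of how many files the folder holds.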
ADLS Gen2 is also integrated with other Azure services, such as Data Factory, HDInsight, SQL Data Warehouse, and Microsoft Power BI.
Azure Data Explorer
Today, Microsoft also announced the general availability of Azure Data Explorer (ADX). The company describes ADX as a way of analyzing large streaming data sets in real time. While it may seem similar to Azure Analysis Services, Data Explorer allows data from multiple sources to be combined into a single analysis model.
In its blog post, Microsoft says Azure Data Explorer has the ability to query “1 billion records in under a second” without needing to modify the data. ADX is built from two services that work together: the engine and the data management (DM) service.
“While the DM service handles raw data ingestion and related tasks, as well as failure management, the engine is in charge of processing the incoming data and serving user queries. To achieve higher performance during operation, the engine combines auto scaling and data sharding.”
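As a rough illustration of the data sharding idea mentioned in the quote, the following Python sketch splits records across several shards, runs the same count on each shard in parallel, and merges the partial results. It is a toy model under stated assumptions, not a description of how the ADX engine works: the round-robin placement, the thread pool, and the names `shard_records` and `sharded_count` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy model only: in-memory lists stand in for data shards.
# Nothing here reflects the actual ADX engine or its APIs.


def shard_records(records, num_shards):
    """Distribute records across num_shards lists in round-robin order."""
    shards = [[] for _ in range(num_shards)]
    for index, record in enumerate(records):
        shards[index % num_shards].append(record)
    return shards


def count_matching(shard, predicate):
    """Count the records in a single shard that satisfy the predicate."""
    return sum(1 for record in shard if predicate(record))


def sharded_count(shards, predicate):
    """Run the count on every shard in parallel and add up the partial counts."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial_counts = pool.map(lambda shard: count_matching(shard, predicate), shards)
        return sum(partial_counts)


if __name__ == "__main__":
    # Hypothetical log records: every tenth one is an error.
    records = [{"id": i, "level": "error" if i % 10 == 0 else "info"} for i in range(1000)]
    shards = shard_records(records, num_shards=4)
    print(sharded_count(shards, lambda r: r["level"] == "error"))
```

Because each shard is scanned independently, adding shards (and workers to scan them) is one way a system could keep query time down as data grows, which is the general motivation for pairing sharding with auto scaling.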
Azure Data Factory Mapping Data Flow (Preview)
Finally, Azure Data Factory Mapping Data Flow is now available in preview. Data Factory allows hybrid data integration from multiple sources. In practice, this means users can move data from numerous on-premises servers to the cloud at scale.
According to Microsoft, Azure Data Factory Mapping Data Flow lets users:
“[…] visually design, build, and manage data transformation processes without learning Spark or having a deep understanding of their distributed infrastructure.”