Azure Synapse Analytics: The Service Formerly Known as SQL Data Warehouse
Azure Synapse Analytics (formerly SQL Data Warehouse), Microsoft’s latest data service offering, was announced earlier this month at Microsoft Ignite. Synapse is the next generation of Azure SQL Data Warehouse, blending big data analytics, data warehousing, and data integration into a single unified service that provides end-to-end analytics with limitless scale. It lets you query your data using either serverless on-demand compute or provisioned resources. Bringing these two concepts together, Synapse delivers the ability to ingest, explore, prep, train, manage, and serve data through a single pane of glass to support business intelligence, machine learning, and data science workloads in the cloud.
It is the first service of its kind to bring multiple technologies together into one unified experience to reduce time to market, increase development efficiencies, and knock down silos between teams. End-to-end development, from data ingestion to cleansing and all the way through to visualizations, can be completed in one user interface. No longer will you need to switch between multiple tools to build and support your data and analytics platform. Synapse Analytics helps bring members of multiple teams together in one tool to support collaboration across the enterprise data landscape.
Azure Synapse Studio
Microsoft has enriched the Azure SQL Data Warehouse experience by providing a sleek new look and feel. Azure Synapse Studio brings together the ability to ingest, explore, analyze, and visualize your data all through one user interface. Does it look familiar? Well, it should: it has the same look and feel as the Azure Data Factory and Azure Databricks UI. It is an end-to-end unified experience, not only for data engineers, but for data scientists as well. If your gut reaction is like mine, right now you’re saying, “Oh great, yet another tool I need to learn.” No need to worry: Microsoft built this UI with users in mind. At its core, it leverages existing capabilities of other data services you may already use within the Azure ecosystem, as opposed to introducing a completely new tool.
Ingest
For those familiar with Azure Data Factory, the ingestion process in Azure Synapse Studio provides a very similar experience. You can still build pipelines, take advantage of the Copy Data wizard, leverage data flows to perform your business transformations, and manually trigger or schedule the execution of pipelines.
Explore
The Explore functionality is like Azure Data Explorer on steroids. Microsoft has provided us with the ability to explore storage accounts, data lakes, and databases all within the same interface. For those who have used Azure Data Explorer or Azure Data Studio, browsing your resources will feel no different. Not only can you explore your data, but Synapse Studio also has some great built-in features that make it easy to discover data in storage accounts or data lakes. For example, just as you can right-click a table or view in SQL Server Management Studio and choose “Select Top 1000”, you now have the same capability in Azure Synapse Studio to query files that reside in storage accounts and data lakes.
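As a rough illustration of what that “Select Top” shortcut amounts to, the sketch below builds the kind of T-SQL that a serverless (on-demand) SQL endpoint can run directly against a Parquet file using OPENROWSET. The helper name, the storage URL, and the default row count are hypothetical, and the exact script Synapse Studio generates may well differ.

```python
def build_select_top_query(file_url: str, top: int = 100) -> str:
    """Return T-SQL that previews the first rows of a Parquet file in a data lake.

    Hypothetical helper for illustration only: it builds the query text and
    does not connect to any service.
    """
    if not isinstance(top, int) or top <= 0:
        raise ValueError("top must be a positive integer")
    # Double any single quotes so the URL stays a valid T-SQL string literal.
    safe_url = file_url.replace("'", "''")
    return (
        f"SELECT TOP {top} *\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{safe_url}',\n"
        "    FORMAT = 'PARQUET'\n"
        ") AS [result];"
    )


if __name__ == "__main__":
    # Placeholder storage account and path, not a real location.
    print(build_select_top_query(
        "https://examplelake.dfs.core.windows.net/raw/sales/2019.parquet"
    ))
```

The point of the sketch is that the file itself is the query target: nothing has to be loaded into a table before you can look at it.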
Analyze & Visualize
Microsoft brings together the ability to run both SQL and Spark, providing a single pane of glass to help bridge the gap between data engineers and data scientists. Using Azure Synapse Studio, you can analyze and transform data using both T-SQL and Spark notebooks. The Develop tab within Azure Synapse Studio allows you to develop and explore T-SQL scripts, Spark notebooks, data flows, Spark job definitions, and Power BI. Aside from the wow factor of having one place to do all these things, in my opinion, the coolest capability is how Power BI has been exposed and integrated into Synapse Studio: it’s now linked directly to the Power BI Service. Any datasets or reports that live in your Power BI workspaces are now browsable, can be edited directly in Synapse Studio, and can be republished to the Power BI Service. Additionally, you can create new datasets and reports and publish those to Power BI as well.
New Features
Amongst all the new capabilities of Azure Synapse Studio, improvements to the data warehousing portion of the service were buried in the credits at Ignite, but are well worth mentioning.
- Result-set caching
- Materialized views
- Ordered clustered columnstore indexes
- JSON support
- Dynamic data masking
- Integration with SQL Server Data Tools
- Read committed snapshot isolation
In Public Preview:
- Workload isolation
- Simple ingestion with COPY INTO
- Azure Data Share support
- Private Link support
In Private Preview:
- Streaming ingestion and analytics
- Built-in machine learning with native prediction and scoring capabilities
- Fast query over Parquet files (10x faster than PolyBase)
- Ability to update distribution columns
- FROM clause with joins
- Multi-column distribution support on tables
- Column-level encryption
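To make one of the preview items above more concrete, here is a minimal sketch of what “simple ingestion with COPY INTO” looks like as a statement: a single T-SQL command that loads files from storage into a warehouse table. The helper below only assembles the statement text. The table name, storage URL, and defaults are hypothetical, and authentication options are deliberately left out, so treat it as an outline rather than a ready-to-run load.

```python
import re

# Accepts plain "table" or "schema.table" names only (an assumption made to
# keep this sketch safe; real identifiers can be more varied).
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def build_copy_into(table: str, source_url: str,
                    field_terminator: str = ",", first_row: int = 2) -> str:
    """Return a COPY INTO statement that loads CSV files into `table`.

    Hypothetical helper for illustration; it does not execute anything.
    """
    if not _TABLE_NAME.fullmatch(table):
        raise ValueError(f"unsupported table name: {table!r}")
    if first_row < 1:
        raise ValueError("first_row must be 1 or greater")
    # Double single quotes so both values stay valid T-SQL string literals.
    safe_url = source_url.replace("'", "''")
    safe_terminator = field_terminator.replace("'", "''")
    return (
        f"COPY INTO {table}\n"
        f"FROM '{safe_url}'\n"
        "WITH (\n"
        "    FILE_TYPE = 'CSV',\n"
        f"    FIELDTERMINATOR = '{safe_terminator}',\n"
        f"    FIRSTROW = {first_row}\n"  # 2 skips a header row
        ");"
    )


if __name__ == "__main__":
    # Placeholder storage account and container, not a real location.
    print(build_copy_into(
        "dbo.Sales",
        "https://examplelake.blob.core.windows.net/landing/sales/*.csv",
    ))
```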
A Unified Experience
Azure Synapse Analytics leverages Azure Data Lake Storage as the building block for storing and ingesting your data into the data warehouse. Combining the existing capabilities of Azure SQL Data Warehouse with the ability to run both Spark and SQL in clustered and serverless form factors enables both data science and data engineering workloads. This helps bridge the gap between data scientists and data engineers. Traditionally, data engineers use several tools to wrangle and shape data into a format that can support data science applications, while data scientists use many tools unfamiliar to the data engineers. By supporting both data engineering and data science activities in one tool, Azure Synapse Analytics helps break down team silos and provides one unified experience for collaboration.
The integration, management, monitoring, and security capabilities are unparalleled in the market, providing a streamlined end-user experience. With deep integrations between Azure Data Lake Storage, Azure Machine Learning, and Power BI, Microsoft is able to significantly reduce project development time and time to market with this end-to-end analytics solution.
Azure has some of the most advanced security and privacy features in the marketplace today. Features such as threat detection, transparent data encryption, and always-on encryption are built into the underlying architecture of Azure Synapse. Synapse also provides fine-grained access control to help ensure data stays safe and private by leveraging column-level security and native row-level security, as well as dynamic data masking to automatically protect sensitive data in real time. Combine these features with a defense-in-depth security strategy, and Azure Synapse gives you complete control of security at all levels of the analytics platform.
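For readers who have not used dynamic data masking, the toy model below shows the idea in plain Python: the stored value is left untouched, and a masked version is returned at read time to anyone without permission to see the original. The function names, the sample row, and the mask shapes are assumptions chosen for illustration; they loosely imitate email-style and partial masks and are not Synapse's implementation.

```python
def mask_email(value: str) -> str:
    """Toy email-style mask: expose the first character, hide the rest."""
    if not value:
        return value
    return f"{value[0]}XXX@XXXX.com"


def mask_partial(value: str, prefix: int, padding: str, suffix: int) -> str:
    """Toy partial mask: expose `prefix` leading and `suffix` trailing characters."""
    head = value[:prefix]
    # value[-0:] would return the whole string, so handle suffix == 0 explicitly.
    tail = value[-suffix:] if suffix > 0 else ""
    return f"{head}{padding}{tail}"


def read_row(row: dict, can_unmask: bool) -> dict:
    """Return the row as a given reader would see it.

    The stored row is never modified; masking applies only to the copy returned.
    """
    result = dict(row)
    if not can_unmask:
        result["email"] = mask_email(row["email"])
        result["phone"] = mask_partial(row["phone"], 0, "XXX-XXX-", 4)
    return result


if __name__ == "__main__":
    # Made-up sample data.
    customer = {"name": "Dana", "email": "dana@contoso.com", "phone": "555-123-4567"}
    print(read_row(customer, can_unmask=False))  # what a restricted reader sees
    print(read_row(customer, can_unmask=True))   # what a privileged reader sees
```

In the real service the masking rule is declared on the column and enforced by the engine, so applications do not need code like this; the sketch only shows the observable effect.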