This week, on March 8th, 2023, during the SAP Data Unleashed event, SAP announced a new solution called SAP Datasphere. In practice, this means that SAP Data Warehouse Cloud (DWC) becomes SAP Datasphere.
What led SAP to announce this solution? Basically, they are trying
to address today’s challenges of data architecture: from data warehousing (structured
data) to data lakes (unstructured or any kind of data) and beyond to data fabric (an integrated layer, or fabric,
of data and connecting processes), reaching down to particular challenges like data federation,
cataloging, lineage, metadata, integration and semantic modeling of data.
How does SAP address these kinds of challenges? By mixing a
portfolio of its existing products, such as:
DWC – Data warehouse solution in the cloud that evolved from SAP BW
(BW/4). Customers can move their BW models to Datasphere/DWC via SAP BW Bridge
(BWB), so investments made in BW are safe.
SAP Analytics Cloud (SAC) – Datasphere is integrated with SAC and supports its analytics
and planning use cases.
SAP Data Intelligence Cloud (SAP Data Intelligence, formerly SAP Data Hub) –
Datasphere leverages its Data Catalog functionality and its engines for moving data.
In addition, solutions from SAP's partners, such as the following,
support the Business Data Fabric:
Databricks – provides a data lake platform called a lakehouse (data warehouse +
data lake), initially built around Apache Spark.
Confluent – capabilities for capturing data in motion, based on Apache Kafka.
Collibra – data governance and metadata management capabilities.
DataRobot – AI lifecycle management capabilities, a platform for augmented
intelligence (AutoML).
All these capabilities together form the Datasphere, although technically
speaking it is a combination of DWC and SAP Data Intelligence Cloud. Data Warehouse
Cloud is rebranded as the Datasphere, with SAP claiming the Datasphere to be the next
generation of DWC. Simply speaking, features for data integration, data
cataloging, and semantic modeling were added to DWC to enhance its data
discovery, modeling, and distribution capabilities, making it the Datasphere.
Data can be either replicated into the Datasphere or federated. SAP
emphasizes an approach in which data can sit anywhere and only its analytics runs in the Datasphere.
This is a crucial point, as running a data warehouse in the cloud may not scale
easily. Thus, keeping data in its source and not replicating it may sound like a
better option.
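To make the difference between the two approaches concrete, here is a minimal Python sketch. It does not use any SAP API; the class names (SourceSystem, ReplicatedTable, FederatedTable) and the sample rows are made up purely for illustration. It only shows the general idea: a replicated table holds its own copy of the data, while a federated table fetches the data from the source each time it is read.

```python
from dataclasses import dataclass, field


@dataclass
class SourceSystem:
    """Hypothetical stand-in for a remote system that owns the data."""
    rows: list = field(default_factory=list)

    def query(self) -> list:
        # Return copies so callers cannot mutate the source's rows.
        return [dict(r) for r in self.rows]


class ReplicatedTable:
    """Copies the data once; later reads use the local snapshot."""

    def __init__(self, source: SourceSystem) -> None:
        self._snapshot = source.query()

    def read(self) -> list:
        return self._snapshot


class FederatedTable:
    """Stores no data; every read is forwarded to the source."""

    def __init__(self, source: SourceSystem) -> None:
        self._source = source

    def read(self) -> list:
        return self._source.query()


def total_revenue(rows: list) -> int:
    return sum(r["revenue"] for r in rows)


source = SourceSystem([
    {"region": "EMEA", "revenue": 100},
    {"region": "APJ", "revenue": 50},
])
replicated = ReplicatedTable(source)
federated = FederatedTable(source)

# The source changes after the replica was loaded.
source.rows.append({"region": "AMER", "revenue": 70})

print(total_revenue(replicated.read()))  # 150 (stale until the next load)
print(total_revenue(federated.read()))   # 220 (always reflects the source)
```

The trade-off in this toy model mirrors the one above: the replica needs storage and reload jobs but answers from local data, while the federated table stays current at the cost of hitting the source on every read.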
A user of the Datasphere digs into the so-called Datasphere Catalog to find data of interest. Leveraging
its lineage capability, relationships between different data objects can be
explored. The Catalog supports data objects from other SAP Datasphere instances
and SAP Analytics Cloud. This should be expanded soon to other SAP apps (like
BW, ECC, S/4) plus non-SAP apps via its partners. While accessing data like
this, the user can enhance or mix data from one source with data from other sources
directly in Datasphere Spaces.
I assume here that Spaces are the next generation of BW workspaces.
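As a rough illustration of what exploring lineage means, the sketch below models lineage as a simple "derived from" graph and walks it to find everything a given object depends on. This is not how the Datasphere Catalog is implemented or accessed; the object names and the upstream function are hypothetical and serve only to show the concept.

```python
from collections import deque

# Hypothetical lineage: each object maps to the objects it is derived from.
LINEAGE = {
    "sales_dashboard": ["sales_view"],
    "sales_view": ["orders_table", "customers_table"],
    "orders_table": ["s4_orders"],
    "customers_table": [],
    "s4_orders": [],
}


def upstream(obj: str, lineage: dict) -> list:
    """Return all direct and indirect sources of obj, breadth-first."""
    seen = set()
    order = []
    queue = deque(lineage.get(obj, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.add(current)
        order.append(current)
        queue.extend(lineage.get(current, []))
    return order


print(upstream("sales_dashboard", LINEAGE))
# ['sales_view', 'orders_table', 'customers_table', 's4_orders']
```

A catalog with lineage lets a user answer the same kind of question ("where does this dashboard's data come from?") without knowing the underlying models in advance.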
My take
Nowadays, organizations are processing data outside their enterprise systems more and
more. Therefore, a solution that enables users to do analytics using
data from “anywhere” is very plausible. This is not new for SAP. A somewhat
similar picture was painted when Data Intelligence/Data Hub came on board. Seeing
the Datasphere as a successor to those initiatives (DI, BW, DWC), and given its AI-powered
capabilities, the idea of the business data fabric may perhaps
come true in the future, with a kind of “ChatGPT” style of analytics.
More information:
SAP Datasphere microsite
Online documentation
SAP Data Unleashed event
SAP Data Warehouse Cloud (DWC), SAP BW Bridge (BWB)