The traditional approach to solving challenges like the ones above would be to load the data into some central place (e.g. a data warehouse) using ETL tools. The Data Hub takes a new approach: it orchestrates the data via pipeline-driven integration, operations and governance.
Data pipelines in the Data Hub are flow-based applications consisting of reusable and configurable operators (e.g. ETL, data preparation, code execution, connectors, etc.).
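As an illustration, here is a minimal sketch of what a custom Python operator inside such a pipeline can look like. It assumes the Modeler's Python operator API (the injected api object with set_port_callback and send); the port names "input" and "output" are hypothetical examples, not taken from the original post.

    # Sketch of a custom Python operator for a Data Hub pipeline.
    # The "api" object is injected by the pipeline runtime; the port
    # names "input" and "output" are hypothetical examples.

    def on_input(data):
        # A toy data-preparation step: trim and upper-case each line.
        cleaned = "\n".join(line.strip().upper() for line in data.split("\n"))
        # Forward the prepared data to the next operator in the flow.
        api.send("output", cleaned)

    # Run the callback whenever a message arrives on the "input" port.
    api.set_port_callback("input", on_input)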
So-called workflows are also available in the Data Hub solution to orchestrate processes across the data landscape (e.g. executing data pipelines, triggering SAP BW process chains, SAP Data Services jobs, etc.).
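Such a workflow can also be started from outside the Modeler UI over HTTP. The snippet below only illustrates that pattern with the requests library; the endpoint path, payload and response field are hypothetical placeholders, not the actual Data Hub REST API.

    import requests

    DATA_HUB_URL = "https://datahub.example.com"  # placeholder host

    def start_workflow(name: str, user: str, password: str) -> str:
        # Hypothetical endpoint: the real path depends on the installation.
        response = requests.post(
            f"{DATA_HUB_URL}/api/v1/workflows/{name}/start",
            auth=(user, password),
            timeout=30,
        )
        response.raise_for_status()
        # Assume the service answers with a JSON body carrying a run id.
        return response.json()["runId"]

    run_id = start_workflow("nightly_load", "admin", "secret")
    print(f"Workflow started, run id: {run_id}")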
From a data governance point of view, the product has a metadata repository describing the information stored in the connected landscape. This supports discovery, profiling and search capabilities for the data. Basically, it organizes the data in the system landscape: it enables accessing and harmonizing information from a variety of sources by unifying their metadata in a catalog.
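To give a rough idea of what profiling produces, the toy snippet below derives simple per-column statistics with pandas; the Data Hub's built-in profiling is of course richer, this only shows the kind of metadata such a catalog can store.

    import pandas as pd

    # Toy stand-in for a profiling run: derive simple per-column
    # metadata (type, null count, distinct values) for the catalog.
    df = pd.DataFrame(
        {"customer": ["ACME", "Globex", None], "revenue": [100.0, 250.5, 80.0]}
    )

    profile = pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "nulls": df.isna().sum(),
            "distinct": df.nunique(),
        }
    )
    print(profile)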
In short, it can connect all types of data sources (e.g. enterprise systems, data lakes, etc.), organize and manage all the data assets coming from these sources, orchestrate and monitor data processes across the different systems, and finally integrate existing assets (e.g. Python scripts on a data lake, process chains in SAP BW/4HANA, etc.).
SAP Data Hub is part of the SAP HANA Data Management Suite (HDMS). The product is integrated with other SAP and non-SAP solutions, such as:
- SAP BW/4HANA (a BW/4HANA process chain can start a workflow task in the Data Hub, as BW/4HANA has the dedicated process type "Data Hub Workflow")
- Hadoop (the Data Hub can write data from an OpenHub destination to HDFS files; the connectivity works via HTTP)
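Writing to HDFS over HTTP typically goes through the WebHDFS REST interface. As a concrete illustration, the sketch below uses the open-source hdfs Python package (not part of the Data Hub itself); the NameNode address, user and target path are placeholders.

    from hdfs import InsecureClient  # open-source WebHDFS client (pip install hdfs)

    # Placeholder NameNode address; WebHDFS listens on its HTTP port.
    client = InsecureClient("http://namenode.example.com:9870", user="datahub")

    # Write a small CSV file to HDFS over HTTP, mirroring the kind of
    # target an OpenHub-fed extract would use.
    with client.write("/user/datahub/export/orders.csv", encoding="utf-8") as writer:
        writer.write("order_id,amount\n4711,99.90\n")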
More information:
- EIM-DH (SAP Data Hub): SAP support site component
- SAP Note 2466184 (SAP Data Hub: Central Release Note)