Sunday, December 30, 2018

SAP Data Hub (SAP DH)

In 2017 SAP introduced a new product for data integration called SAP Data Hub. Its main purpose is to help overcome issues such as: the constantly growing amount of data available today (big data), reduced accessibility of data due to the proliferation of cloud-based software, data governance risks (e.g. GDPR in the EU), data that is disconnected because it sits in silos, missing links between data, lack of data readiness, etc.

The traditional approach to solving some of these issues would be to load the data into a central place (e.g. a data warehouse) using ETL tools. The Data Hub takes a different approach: it orchestrates the data via pipeline-driven integration, operations and governance.

Data pipelines in the Data Hub are flow-based applications consisting of reusable and configurable operations (e.g. ETL, data preparation, code execution, connectors, etc.).
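To make the idea of a flow-based pipeline more concrete, here is a minimal sketch in plain Python. It is not the Data Hub API: the names Operator, filter_rows, rename_field and run_pipeline, as well as the sample records, are made up for illustration. It only shows the general pattern of chaining reusable steps, each driven by its own configuration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator

Record = dict


@dataclass
class Operator:
    """One reusable, configurable pipeline step (hypothetical, not an SAP class)."""
    name: str
    func: Callable[[Iterable[Record], dict], Iterator[Record]]
    config: dict

    def run(self, records: Iterable[Record]) -> Iterator[Record]:
        # The same function can be reused with a different config elsewhere.
        return self.func(records, self.config)


def filter_rows(records, config):
    """Keep only records whose configured field is >= the configured minimum."""
    field, minimum = config["field"], config["min"]
    for record in records:
        if record[field] >= minimum:
            yield record


def rename_field(records, config):
    """Rename one field, working on a copy so the input records stay untouched."""
    old, new = config["old"], config["new"]
    for record in records:
        record = dict(record)
        record[new] = record.pop(old)
        yield record


def run_pipeline(operators, source):
    """Feed the output of each operator into the next one, in order."""
    stream = source
    for op in operators:
        stream = op.run(stream)
    return list(stream)


# Assumed sample data and configuration, for illustration only.
pipeline = [
    Operator("filter", filter_rows, {"field": "amount", "min": 100}),
    Operator("rename", rename_field, {"old": "amount", "new": "amount_eur"}),
]
rows = [{"id": 1, "amount": 50}, {"id": 2, "amount": 150}]
result = run_pipeline(pipeline, rows)
```

Because each step only consumes and produces a stream of records, steps can be reordered or reused in other pipelines by changing the configuration rather than the code.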

The Data Hub also provides so-called workflows to orchestrate processes across the data landscape (e.g. executing data pipelines, triggering SAP BW process chains, SAP Data Services jobs, etc.).

From a data governance point of view, the product has a metadata repository describing the information stored in the connected landscape. This supports discovery, profiling and search capabilities for the data.

Basically, what it does is organize the data in the system landscape. It enables accessing and harmonizing information from a variety of sources by unifying the metadata in a catalog.
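The catalog idea can be illustrated with a small Python sketch. This is only a toy model under my own assumptions, not how the Data Hub implements its metadata repository: DatasetEntry, Catalog, profile and the sample systems and datasets are all hypothetical. It shows metadata from different systems registered in one place, searched by a term, and a very simple column profile.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Metadata about one dataset in a connected system (hypothetical structure)."""
    system: str
    name: str
    columns: list
    tags: set = field(default_factory=set)


class Catalog:
    """Toy metadata catalog: stores descriptions of the data, not the data itself."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def search(self, term):
        """Return entries whose name, column names or tags contain the term."""
        term = term.lower()
        return [
            e for e in self._entries
            if term in e.name.lower()
            or any(term in c.lower() for c in e.columns)
            or any(term in t.lower() for t in e.tags)
        ]


def profile(rows, column):
    """Very simple profiling: row count, missing values, distinct values."""
    values = [r.get(column) for r in rows]
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
    }


# Assumed sample landscape, for illustration only.
catalog = Catalog()
catalog.register(DatasetEntry("BW4", "SALES_ORDERS", ["order_id", "customer_id"], {"sales"}))
catalog.register(DatasetEntry("HDFS", "/logs/clicks.csv", ["customer_id", "url"]))

hits = [e.name for e in catalog.search("customer")]
stats = profile([{"customer_id": 1}, {"customer_id": 1}, {"customer_id": None}], "customer_id")
```

Searching for "customer" finds datasets in both systems because the lookup runs over the unified metadata, regardless of where the data physically lives.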

It can connect to all types of data sources (e.g. enterprise systems, data lakes, etc.). It can then organize and manage all data assets coming from these sources. It can also orchestrate and monitor data processes within the different systems. Finally, it can integrate existing assets (e.g. Python scripts on a data lake, process chains in SAP BW/4HANA, etc.).

SAP Data Hub is part of the SAP HANA Data Management Suite (HDMS) of products. The product is integrated with other SAP and non-SAP solutions like:

- BW/4HANA - a BW/4HANA process chain can start a workflow task in the Data Hub, as BW/4HANA has a specific process type “Data Hub Workflow”.

- Hadoop - the Data Hub can write to HDFS files via an Open Hub destination, and it has connectivity via HTTP.

Picture courtesy of SAP SE, from its marketing material.

More information:
EIM-DH (SAP Data Hub) - SAP support site component
2466184 - SAP Data Hub: Central Release Note
