OpenMetaData

OpenMetaData with Azure File Share support is available on Nuvolos.

Overview

OpenMetaData is an end-to-end metadata management platform. It helps you get value from your data assets in common use cases such as data discovery and governance, as well as in emerging use cases related to data quality, observability, and collaboration.

OpenMetaData on Nuvolos supports the ingestion of files stored on Azure File Shares, which allows you to track operations performed on those files.

Setting up OpenMetaData

Add a new OpenMetaData application to your working instance in Nuvolos:

OpenMetaData runs in a VSCode application on Nuvolos, along with a pre-installed Airflow application, which executes the ingestion workflows created by OpenMetaData.

Starting your application

Once you have added the OpenMetaData application to your Nuvolos instance, start your application.

After a couple of minutes, you should see an initialization screen:

Initialization can take a few minutes the first time a new application starts, because both the OpenMetaData and Airflow databases need to be set up in the background.

Once the application starts, you will see a VSCode interface:

The application uses VSCode so that you can also access the Airflow interface and create or refine DAGs as necessary. You can install additional packages via the built-in Terminal.

Opening OpenMetaData

To show OpenMetaData, open the Command Palette and issue the OpenMetaData: Show OpenMetadata command:

OpenMetaData opens in a new tab in VSCode:

Click on the Sign in with Auth0 button to log in to OpenMetaData. On the first start, a new user will be created for you.

If you are an administrator in your Nuvolos space, your OpenMetaData user will be an administrator within the OpenMetaData application. If you are not an administrator in the Nuvolos space, a non-privileged OpenMetaData user will be created.

OpenMetaData checks for administrators only on the first start of the application. If you were granted Nuvolos space administrator privileges after the application was first started, you will need to ask your co-admins to grant you admin roles in OpenMetaData.

Adding an Azure File Share storage

Click on Settings -> Storages -> Add New Service and select AZFS from the available storage services:

Give your storage service a name. You will also need the name of the Azure File Share to be used and a connection string (the credential) that provides read access to that file share:

Click Test Connection to test whether the credentials can be used to access the Azure File Share:
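If the connection test fails, a quick local check of the connection string can rule out copy-paste problems before you look at permissions. The sketch below is a minimal, hypothetical helper (it is not part of OpenMetaData or the Azure SDK). It assumes the usual Azure Storage connection string layout of semicolon-separated `Key=Value` pairs, with either an `AccountKey` or a `SharedAccessSignature` as the credential. It only checks the shape of the string and never contacts Azure.

```python
from __future__ import annotations

# Hypothetical helper names; assumes semicolon-separated Key=Value pairs.
CREDENTIAL_KEYS = ("AccountKey", "SharedAccessSignature")


def parse_connection_string(conn_str: str) -> dict[str, str]:
    """Split 'Key=Value;Key=Value' pairs into a dictionary."""
    parts: dict[str, str] = {}
    for index, segment in enumerate(conn_str.strip().split(";")):
        if not segment:
            continue  # tolerate a trailing or doubled semicolon
        # Split on the first '=' only: keys and SAS tokens contain '=' themselves.
        key, sep, value = segment.partition("=")
        if not sep or not key:
            # Do not echo the segment, it may contain a secret.
            raise ValueError(f"segment {index} is not in Key=Value form")
        parts[key] = value
    return parts


def check_connection_string(conn_str: str) -> list[str]:
    """Return a list of problems; an empty list means the shape looks plausible."""
    try:
        parts = parse_connection_string(conn_str)
    except ValueError as exc:
        return [str(exc)]

    problems = []
    if "AccountName" not in parts and "FileEndpoint" not in parts:
        problems.append("neither AccountName nor FileEndpoint is present")
    if not any(parts.get(key) for key in CREDENTIAL_KEYS):
        problems.append("no AccountKey or SharedAccessSignature is present")
    return problems
```

You could run this from the built-in Terminal with the string you intend to paste into the form. An empty result only means the string is well formed; Test Connection remains the real check that the credential grants read access.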

Adding an ingestion pipeline for Azure File Share

You can create an ingestion pipeline to create OpenMetaData containers and objects from folders and files in Azure File Share. The pipeline is an Airflow DAG, created and managed by OpenMetaData.
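To make the folder-to-container and file-to-object mapping concrete, here is a small conceptual sketch. It is not OpenMetaData's ingestion code: the `Container` class and `build_tree` function are hypothetical names, and the sketch assumes each folder becomes one container nested under its parent, with the file share itself as the top-level container.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class Container:
    """Illustrative stand-in for a container: one per folder."""
    name: str
    children: dict[str, Container] = field(default_factory=dict)
    objects: list[str] = field(default_factory=list)


def build_tree(share_name: str, file_paths: list[str]) -> Container:
    """Group file paths into nested containers, with files as objects."""
    root = Container(share_name)
    for raw in file_paths:
        # Treat paths as relative to the root of the file share.
        path = PurePosixPath(raw.lstrip("/"))
        node = root
        for folder in path.parts[:-1]:
            # Reuse the container if this folder was already seen.
            node = node.children.setdefault(folder, Container(folder))
        node.objects.append(path.name)
    return root


tree = build_tree(
    "myshare",
    ["reports/2023/q1.csv", "reports/2023/q2.csv", "readme.txt"],
)
```

In this example, `tree` holds `readme.txt` directly, and the two CSV files sit under the nested `reports` and `2023` containers, which is roughly the hierarchy you would expect to browse after ingestion.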

To create an ingestion pipeline, edit your new AZFS (Azure File Share) storage and click Add Metadata Ingestion on the Ingestions tab:

Naming the ingestion pipeline is optional. You must choose Storage Metadata Config AZFS as the value for Storage Metadata Config Service, and provide the connection string for the Azure File Share together with the name of the file share:

In the next step, you can specify the schedule for the ingestion pipeline:

Once the schedule is defined, click Add & Deploy to create the ingestion pipeline in Airflow:

Running the ingestion pipeline

Click Run to execute the ingestion pipeline on demand:

You can see the logs from Airflow by clicking on the Logs link.

Viewing the Airflow DAG

You can open Airflow with the Airflow: Show Airflow VSCode command:

Checking the newly ingested metadata

You can check the newly ingested metadata in the Explore -> Containers view:
