NUVOLOS
2.0.0

Database integration

The Scientific Data Warehouse (SDW) is the Data Platform in Nuvolos
Nuvolos is not just an online computer lab; it is also an online data platform. We offer strong data integration to support modern research and education use cases that rely on large amounts of data.
As a data platform, Nuvolos integrates strongly with both online (Nuvolos-based) and offline (non-Nuvolos-based) applications. Please review our access documentation to learn more. Nuvolos also lets you define data pipelines and ingest data from various sources. Please reach out to our support team for more information.
In this documentation, we will regularly refer to the data platform service as the Scientific Data Warehouse (SDW). The Scientific Data Warehouse is built on the Snowflake service; for details, refer to the Snowflake SQL documentation.

Start working with data

Nuvolos differentiates two types of data:
  1. Tabular data stored in some database management system
  2. Data stored in regular files
This page describes working with tabular data. For working with files, consult our guide to the file system.
This documentation distinguishes between tabular and file-based data.
Tabular data refers to data stored in the Scientific Data Warehouse (SDW), a SQL-compliant, cloud-based data warehouse.
File-based data is data stored on a regular file system. This guide focuses on database-stored datasets.
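Because tabular data lives in a SQL-compliant warehouse, working with it generally comes down to running SQL queries against tables. The sketch below illustrates that pattern only: it uses Python's built-in sqlite3 module as a local stand-in and does not connect to Nuvolos or Snowflake. The table name `risk_proxy`, its columns, and both helper functions are hypothetical and chosen purely for illustration.

```python
import sqlite3


def build_demo_connection():
    """Create an in-memory SQLite database with one small, made-up table.

    Assumption: SQLite stands in for the SDW here; the table and its
    columns are hypothetical and not part of any real dataset.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE risk_proxy (region TEXT, proxy REAL)")
    conn.executemany(
        "INSERT INTO risk_proxy VALUES (?, ?)",
        [("EU", 0.2), ("EU", 0.4), ("US", 0.5)],
    )
    conn.commit()
    return conn


def average_by_group(conn, table, group_col, value_col):
    """Return {group: average value} computed with a plain SQL aggregate.

    Identifiers are quoted because they cannot be bound as parameters;
    only pass trusted table and column names.
    """
    query = (
        f'SELECT "{group_col}", AVG("{value_col}") '
        f'FROM "{table}" GROUP BY "{group_col}" ORDER BY "{group_col}"'
    )
    return dict(conn.execute(query).fetchall())


if __name__ == "__main__":
    conn = build_demo_connection()
    print(average_by_group(conn, "risk_proxy", "region", "proxy"))
    conn.close()
```

The same query text would be the starting point against a real warehouse connection, although dialect details (identifier quoting, case sensitivity) may differ and should be checked against the Snowflake SQL documentation.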

Dataset spaces

Datasets are a special kind of space whose main purpose is to contain only tabular data, the documentation related to that data, and potentially other descriptor files. To obtain a full list of the datasets available to you in your current organisation, navigate to your dashboard and select the datasets menu.
Viewing the list of available datasets
Datasets consist of immutable snapshots. They are meant to be used as sources for distribution, not as places to work in directly.

Distribute data you need

Suppose that your project is called 'Demo research project' and you need two tables from the 'Correlation Risk Proxy' dataset.
Using the distribute feature, you can set your research project up with the required data:
Distributing tables to a personal space
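The relationship between an immutable dataset snapshot and a project that receives selected tables can be pictured with a small conceptual model. This is only a sketch of the idea, not how Nuvolos implements distribution: the `Snapshot`, `Workspace`, `make_snapshot`, and `distribute` names are hypothetical, as are the table names and rows. The dataset and project names are the ones used in the example above.

```python
from dataclasses import dataclass, field
from types import MappingProxyType


@dataclass(frozen=True)
class Snapshot:
    """Read-only view of a dataset's tables at one point in time."""
    dataset: str
    tables: MappingProxyType  # table name -> tuple of row tuples


@dataclass
class Workspace:
    """Editable working area of a project (hypothetical model)."""
    name: str
    tables: dict = field(default_factory=dict)


def make_snapshot(dataset, tables):
    """Freeze the given tables so they can no longer be modified."""
    frozen = {
        name: tuple(tuple(row) for row in rows)
        for name, rows in tables.items()
    }
    return Snapshot(dataset, MappingProxyType(frozen))


def distribute(snapshot, table_names, workspace):
    """Copy the selected tables from the snapshot into the workspace."""
    missing = [n for n in table_names if n not in snapshot.tables]
    if missing:
        raise KeyError(f"tables not in snapshot: {missing}")
    for name in table_names:
        # Independent, mutable copies: edits never reach the snapshot.
        workspace.tables[name] = [list(row) for row in snapshot.tables[name]]
    return workspace


if __name__ == "__main__":
    snapshot = make_snapshot(
        "Correlation Risk Proxy",
        {
            "table_a": [(1, 0.2)],  # hypothetical tables and rows
            "table_b": [(2, 0.4)],
            "table_c": [(3, 0.5)],
        },
    )
    project = distribute(
        snapshot, ["table_a", "table_b"], Workspace("Demo research project")
    )
    project.tables["table_a"][0][1] = 9.9  # change the working copy only
    print(sorted(project.tables))
    print(snapshot.tables["table_a"])
```

In this model the project ends up with only the two requested tables, and changes made there leave the source snapshot untouched, which is the property that makes datasets suitable as sources of distribution.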

Work with data in your workspace

Once the distribution has completed, you will receive an e-mail at the address you registered with. If the distribution was successful, the data should be available in the instance you distributed to.
You will now be able to work with data in your workspace without having to worry about backing up your data or causing inadvertent changes.
Please follow our detailed guides: