A complete database research workflow (Matlab & RStudio)
Nuvolos supports a query-analyze-insert cycle that helps researchers query, analyse and store data safely in a simple, integrated workflow. The workflow is demonstrated in two languages, but can be extended to any language supported by Nuvolos.
Matlab workflow
A standard skeleton scientific workflow in Matlab can be broken down into three main steps:
Query research-relevant data
Analyse, transform or otherwise manipulate data
Store results
Querying relevant data
For the mock example, we work with the Fama-French factor set that is available to our Demo user, focusing on the North America 5-factor table. The NORTH_AMERICA_5_FACTORS table has been distributed to the instance we are working in.
After opening the Matlab application, the following code returns the query result as a Matlab table object. The query merges a monthly stock series table (Apple monthly stock prices) with the Fama-French factor table, as follows.
The SQL query:
The code that executes the query expects the above string to be saved in query_string.
The simple analysis
The previous step resulted in dataset_factor containing a Matlab table object that holds the data. The fitlm function fits a linear regression on the table using an R-style formula. We then write the fitted values back to the table as a new column.
Storing results in the database
As a final step, we write back the results using the data upload command for Matlab:
As an end result, we can then find the table on the Nuvolos UI:
RStudio workflow
A standard skeleton scientific workflow in RStudio can be broken down into three main steps:
Query research-relevant data
Analyse, transform or otherwise manipulate data
Store results
It is possible to create more complex workflows, but they usually consist of three-step modules such as the one above.
Querying relevant data
For the mock example, we work with the Fama-French factor set that is available to our Demo user, focusing on the North America 5-factor table. The NORTH_AMERICA_5_FACTORS table has been distributed to the instance we are working in.
After opening the RStudio application, the following code returns the query result as an R data frame object. The query merges a monthly stock series table (Apple monthly stock prices) with the Fama-French factor table, as follows.
The SQL query:
The code that executes the query expects the above string to be saved in query_string.
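The original query and connector code are not reproduced here. As a rough local illustration of what the merge does, the following base R sketch joins two small data frames on a shared date key. The column names (DATE, RET, MKT_RF, SMB) and all values are invented for the example and are not taken from the actual Nuvolos tables.

```r
# Hypothetical stand-ins for the two tables; column names and values are invented.
stock <- data.frame(
  DATE = c("2020-01", "2020-02", "2020-03", "2020-04"),
  RET  = c(0.054, -0.117, -0.070, 0.155)
)
factors <- data.frame(
  DATE   = c("2020-01", "2020-02", "2020-03", "2020-05"),
  MKT_RF = c(-0.001, -0.081, -0.133, 0.056),
  SMB    = c(-0.031, 0.010, -0.051, 0.025)
)

# Inner join on DATE: the data-frame analogue of a SQL JOIN on the date column.
# Only months present in both tables survive the merge.
dataset_factor <- merge(stock, factors, by = "DATE")
print(dataset_factor)
```

In the real workflow the join runs inside the database, so only the merged result is transferred into the R session.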
The simple analysis
The previous step resulted in dataset_factor containing an R data.frame object that holds the data. The lm function fits a linear regression on the data frame. We then add the fitted values to the data frame as a new column.
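A minimal sketch of this step, assuming a data frame with an excess-return column and a few factor columns. The data are simulated, and the column names (EXCESS_RET, MKT_RF, SMB, HML, FITTED) are assumptions rather than the names used in the Nuvolos tables.

```r
set.seed(1)
n <- 60  # five years of simulated monthly observations

# Simulated factor data; in the real workflow this comes from the query step.
dataset_factor <- data.frame(
  MKT_RF = rnorm(n, 0.005, 0.04),
  SMB    = rnorm(n, 0, 0.02),
  HML    = rnorm(n, 0, 0.02)
)
# Simulated excess return with a known market beta of 1.2 plus noise.
dataset_factor$EXCESS_RET <- 0.002 + 1.2 * dataset_factor$MKT_RF -
  0.3 * dataset_factor$SMB + rnorm(n, 0, 0.01)

# Fit the factor regression and store the fitted values as a new column.
model <- lm(EXCESS_RET ~ MKT_RF + SMB + HML, data = dataset_factor)
dataset_factor$FITTED <- fitted(model)
print(coef(model))
```

Storing the fitted values next to the inputs keeps each row self-describing, which is convenient when the whole data frame is written back to the database.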
Storing results in the database
As a final step, we write back the results using the data upload command for R:
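The Nuvolos upload command itself is not reproduced here. As a generic illustration of the write-back step, the sketch below uses the DBI interface with an in-memory SQLite database as a stand-in connection. It assumes the DBI and RSQLite packages are installed; the table name APPLE_FACTOR_FIT and the data are hypothetical.

```r
library(DBI)

# Stand-in connection (assumption): an in-memory SQLite database.
# On Nuvolos the connection would come from the platform's own connector.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical results to store; values are invented.
results <- data.frame(
  DATE   = c("2020-01", "2020-02", "2020-03"),
  FITTED = c(0.012, -0.034, 0.007)
)

# Write the data frame as a table, replacing any existing table of that name.
dbWriteTable(con, "APPLE_FACTOR_FIT", results, overwrite = TRUE)

# Read it back to confirm the round trip.
stored <- dbReadTable(con, "APPLE_FACTOR_FIT")
print(stored)

dbDisconnect(con)
```

Reading the table back right after writing is a cheap check that the column names and row count survived the upload.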