JupyterLab
Tips for using JupyterLab on Nuvolos
Long-running notebooks
If you run notebooks in Jupyter, we recommend submitting them for computation with papermill and specifying an explicit log file when executing from the Jupyter terminal. This way you can disconnect from the Jupyter application while the notebook keeps running, and you can still monitor the progress of the run.
Note that your Jupyter notebook will only receive cell output updates as long as the notebook is kept open in the browser. If you reopen a notebook that is still calculating in the background, you won't receive cell output updates. This is standard Jupyter behavior, unrelated to Nuvolos, and it is one of the reasons why tools like papermill make sense for long-running notebooks.
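As a rough sketch of this workflow, the Python snippet below builds a papermill command and starts it as a detached process whose output goes to a log file. The function names and file names are hypothetical placeholders, and `--log-output` is assumed to be the papermill command-line flag that writes cell output to the log. The same command can also be typed directly in the Jupyter terminal.

```python
import subprocess


def papermill_command(input_nb, output_nb):
    """Build the papermill command line that executes input_nb into output_nb."""
    # --log-output asks papermill to write cell output to its log stream.
    return ["papermill", input_nb, output_nb, "--log-output"]


def submit_notebook(input_nb, output_nb, logfile):
    """Start papermill in the background and send its output to logfile."""
    log = open(logfile, "ab")
    try:
        # start_new_session detaches the run from the launching process, so it
        # is meant to keep going after that process exits.
        return subprocess.Popen(
            papermill_command(input_nb, output_nb),
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,
        )
    finally:
        # The child process keeps its own handle to the log file.
        log.close()
```

For example, after calling `submit_notebook("analysis.ipynb", "analysis_out.ipynb", "analysis.log")`, running `tail -f analysis.log` in the terminal lets you follow the run.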
Adding a new launcher
In some cases it is useful to have multiple conda environments inside a single JupyterLab application and to launch notebooks from the JupyterLab launcher with kernels that run in these environments. We recommend always installing the kernel specification of a new conda environment into the base conda environment (and not into the user or system prefix), so that the kernel and launcher keep working after the application is distributed. Our examples below follow this convention. If you don't want to share the application, you can also follow instructions from other sources, where the kernel specification is typically installed into the user home directory. The following steps can be run from a JupyterLab terminal; shortly afterwards a new launcher should appear.
Python
In this case we recommend creating a new conda environment and installing a launcher for it as follows:
conda create --name my_new_env
conda activate my_new_env
conda install ipykernel
ipython kernel install --prefix=/opt/conda --name my_new_env --display-name "My New Env"
R
In this case we recommend creating a new conda environment and installing a launcher for it as follows:
conda create --name my_new_env
conda activate my_new_env
conda install r-recommended r-irkernel
R -e 'IRkernel::installspec(prefix="/opt/conda")'
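To check that a kernel specification landed in the base conda environment, a small sketch like the following can help. It assumes the standard Jupyter layout, in which installing with the prefix /opt/conda places each kernel under /opt/conda/share/jupyter/kernels/<name>/kernel.json. The function name is hypothetical.

```python
import json
from pathlib import Path


def list_kernels(prefix="/opt/conda"):
    """Return {kernel_name: display_name} for kernelspecs installed under prefix."""
    kernels_dir = Path(prefix) / "share" / "jupyter" / "kernels"
    found = {}
    # Each kernelspec is a directory holding a kernel.json file.
    for spec_file in sorted(kernels_dir.glob("*/kernel.json")):
        spec = json.loads(spec_file.read_text())
        name = spec_file.parent.name
        found[name] = spec.get("display_name", name)
    return found
```

Calling `list_kernels()` in a notebook returns a dictionary that maps each kernel directory name to the display name shown on its launcher.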
Accessing a local webserver in the browser
Certain Python packages come as webserver-based extensions, and you need to open them in a browser window to interact with them. Due to how Nuvolos applications are encapsulated, you cannot just start a new server process in your Nuvolos app and access it from your local browser.
Since this is a fairly common problem in many infrastructures, there is already a solution for it called Jupyter Server Proxy. As an example, we'll now show how to enable Tensorboard in your JupyterLab application.
1. Make sure you're using a JupyterLab app with a version higher than 3.0.0.
2. Install the webserver application (in this example, Tensorboard) in JupyterLab.
3. Install Jupyter Server Proxy (it is probably already installed).
4. Create a new file /opt/conda/etc/jupyter/jupyter_server_config.py containing the Jupyter Server Proxy configuration for Tensorboard.
5. Restart your Nuvolos application.
6. You should see a new launcher for Tensorboard.
7. Run your tensorflow computation and make note of the directory of your model run. Let's assume it's called /tmp/my_fit_1.
8. Create a symbolic link for your model directory with ln -s /tmp/my_fit_1 /files/tensorboard_logdir. If you already have a link at /files/tensorboard_logdir, you'll need to remove it first with rm /files/tensorboard_logdir.
9. Click on the 'tensorboard' launcher to open Tensorboard pointing at your model's run directory.
To analyze a different run, repeat steps 7-9.
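Switching Tensorboard to a different run can also be scripted. The sketch below is a hypothetical Python helper that removes an existing link and points it at a new run directory. The default link path is the one used in this example.

```python
from pathlib import Path


def point_logdir_at(run_dir, link="/files/tensorboard_logdir"):
    """Make link a symbolic link to run_dir, replacing any previous link."""
    link_path = Path(link)
    # Only remove an existing symbolic link, never a real file or directory.
    if link_path.is_symlink():
        link_path.unlink()
    link_path.symlink_to(Path(run_dir), target_is_directory=True)
    return link_path
```

Calling `point_logdir_at("/tmp/my_fit_1")` corresponds to removing the old link and creating a new one by hand.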
In the example, the model run output files were put under /tmp deliberately. If your application emits lots of events, any filesystem slowness will negatively impact the performance of your training. For this reason, it's recommended to put these output files on the fastest storage medium available to your application: the local SSD drive under /tmp. Note that files under /tmp are not retained between Nuvolos application restarts. For this reason, make sure to use a tool like rsync to move files to persistent storage whenever it's needed. Alternatively, reach out to support to get a quote on additional persistent SSD storage and use that for storing your event files.
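As a sketch of saving results to persistent storage, the snippet below copies a run directory from /tmp into another directory. It uses Python's `shutil.copytree` rather than rsync, so every file is copied on each call. The function name and the destination path in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def save_run(run_dir, persistent_root):
    """Copy run_dir into persistent_root, keeping the run's directory name."""
    source = Path(run_dir)
    target = Path(persistent_root) / source.name
    # dirs_exist_ok (Python 3.8+) lets repeated calls refresh an earlier copy.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

For instance, `save_run("/tmp/my_fit_1", "/files/tensorboard_runs")` would create or refresh /files/tensorboard_runs/my_fit_1.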
If the server application takes a lot of time to start, you might need to increase the timeout value in the example, otherwise you'll need to refresh the page periodically until the server starts.
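For orientation, a configuration file for this setup might look like the sketch below. It assumes Jupyter Server Proxy's `c.ServerProxy.servers` setting with its `command`, `timeout` and `launcher_entry` keys. The 60-second timeout is an arbitrary example value, and your actual file may differ.

```python
# Sketch of /opt/conda/etc/jupyter/jupyter_server_config.py
# `c` is the configuration object Jupyter provides when it loads this file.
c.ServerProxy.servers = {
    "tensorboard": {
        # {port} is filled in by Jupyter Server Proxy with a free port.
        "command": [
            "tensorboard",
            "--logdir", "/files/tensorboard_logdir",
            "--port", "{port}",
        ],
        # Seconds to wait for the server to start (assumed example value).
        "timeout": 60,
        # Label shown on the JupyterLab launcher tile.
        "launcher_entry": {"title": "tensorboard"},
    }
}
```

Raising the `timeout` value gives a slow-starting server more time before the proxy gives up.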
Creating a Plotly Dash application from a notebook
Make sure you have the following packages installed (we suggest installing them via conda from the conda-forge channel):
plotly
dash
Once these are installed, install the JupyterDash extension (the jupyter-dash package).
After this you need to make sure that your dash application has the following logic in it:
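A minimal sketch of such logic, assuming the jupyter-dash package and its `JupyterDash` class. The function name and the layout are hypothetical placeholders, and the details may differ from what your application needs.

```python
def make_app():
    """Create a small Dash app configured for use behind the Jupyter proxy."""
    # Imported inside the function so the packages are only needed when it runs.
    from dash import html
    from jupyter_dash import JupyterDash

    # Ask the JupyterDash extension for the proxy settings of this session.
    JupyterDash.infer_jupyter_proxy_config()

    app = JupyterDash(__name__)
    app.layout = html.Div([html.H1("Hello from Nuvolos")])
    return app
```

In a notebook cell, `make_app().run_server(mode="jupyterlab")` would then open the application in a JupyterLab tab.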
Note that this procedure relies on the dash application being run in the context of a notebook.
Real-time kernel resource usage monitoring
JupyterLab has a great extension for monitoring resource usage in real time. You can install it with
After installing, restart your Nuvolos application. From then on, you'll see a new metering icon in the right sidebar.
Whenever a notebook tab is in focus, this extension displays CPU and RAM usage for the attached kernel, and also host-level CPU and RAM utilization.
This extension requires IPyKernel version 6.10.0 or above, so it might not work in older JupyterLab versions.
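To check whether a kernel meets this requirement, a sketch like the following compares the installed IPyKernel version with 6.10.0. The helper names are hypothetical, and the parsing only looks at the first three numeric components of the version string. Run it in a notebook attached to the kernel you want to check.

```python
import re
from importlib import metadata

MINIMUM = (6, 10, 0)


def parse_version(text):
    """Turn a version string such as '6.29.5' into a tuple of three integers."""
    numbers = re.findall(r"\d+", text.split("+")[0])
    # Pad short versions such as '6.10' so they compare as 6.10.0.
    numbers = (numbers + ["0", "0", "0"])[:3]
    return tuple(int(n) for n in numbers)


def ipykernel_is_supported():
    """Return True if ipykernel is installed at version 6.10.0 or above."""
    try:
        installed = metadata.version("ipykernel")
    except metadata.PackageNotFoundError:
        return False
    return parse_version(installed) >= MINIMUM
```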
Matplotlib plots with LaTeX
If you wish to use LaTeX to render labels and other text when using matplotlib, you can install a LaTeX environment by following the instructions in our documentation.
Install the packages required by matplotlib:
tlmgr install type1cm cm-super underscore dvipng
Run the notebook cell with usetex=True again.
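As an illustration, the sketch below checks that the LaTeX tools are on the PATH before enabling LaTeX rendering through matplotlib's `text.usetex` setting. The function names, the list of tools checked, the output file name and the plot contents are assumptions made for this example.

```python
import shutil


def missing_latex_tools(tools=("latex", "dvipng")):
    """Return the names of required command-line tools not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def plot_with_latex(path="latex_labels.png"):
    """Save a small plot whose labels are rendered by LaTeX."""
    import matplotlib.pyplot as plt

    missing = missing_latex_tools()
    if missing:
        raise RuntimeError("Missing LaTeX tools: " + ", ".join(missing))

    # text.usetex is only switched on inside this block.
    with plt.rc_context({"text.usetex": True}):
        fig, ax = plt.subplots()
        ax.plot([0, 1, 2], [0, 1, 4])
        ax.set_xlabel(r"$x$")
        ax.set_ylabel(r"$x^2$")
        fig.savefig(path)
    plt.close(fig)
    return path
```

If `missing_latex_tools()` returns an empty list, `plot_with_latex()` should be able to write the figure.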