Integrating JupyterLab¶
This section describes how JupyterLab is integrated into the virtual machine image at the ILL. There are many ways of installing JupyterLab; however, part of the configuration has to conform to VISA requirements so that the link to VISA Jupyter Proxy behaves as expected.
Installation using a python virtual environment¶
We use pip to install specific versions of JupyterLab and its labextensions. JupyterLab is installed in a Python virtual environment under /opt/visa/jupyter. The following script describes the process:
#!/bin/bash
echo "Installing JupyterLab..."
JUPYTER_DIR=/opt/visa/jupyter
mkdir -p "$JUPYTER_DIR"
cd "$JUPYTER_DIR" || exit
python3 -m venv jupyterlab
source jupyterlab/bin/activate
pip install --upgrade pip
pip install jupyterlab==2.2.9
pip install ipywidgets==7.5.1
# Enable interactive matplotlib (jupyter labextension install requires Node.js)
jupyter labextension install @jupyter-widgets/jupyterlab-manager
jupyter labextension install jupyter-matplotlib@0.7.4
jupyter nbextension enable --py widgetsnbextension
echo "Finished installing JupyterLab"
Starting the server at boot¶
There are certain requirements for JupyterLab to be correctly integrated into VISA:
The Jupyter server must start automatically during the boot process, so that JupyterLab is available through VISA without being started manually.
Jupyter must run as the user who owns the instance, so that the VISA user has access to their home directory and notebook files as they expect.
The base_url of the server must be /jupyter/{INSTANCE_ID}, because the VISA Jupyter Proxy forwards requests on this specific URL, which must match that of the Jupyter server.
The port of the Jupyter server must be identical to the one configured in VISA Jupyter Proxy, which forwards HTTP requests to the Jupyter server.
When creating the instance, VISA API Server uses cloud-init to pass metadata to the instance, including id (the ID of the instance) and owner (the username of the owner). The script below uses these values to start JupyterLab in the required way. Feel free to modify the script, but keep in mind that changing the base_url will break the link to VISA and changing the user will modify the behaviour of Jupyter.
#!/bin/bash
JUPYTER_ENV=/opt/visa/jupyter/jupyterlab
# Get owner and instance_id metadata
OWNER=$(cloud-init query ds.meta_data.meta.owner)
INSTANCE_ID=$(cloud-init query ds.meta_data.meta.id)
# The jupyter configuration file (you may want a different conf depending on dev or prod environments)
JUPYTER_CONF=/path/to/jupyter-conf.py
# Verify that we get the data from cloud-init
if [ -z "$OWNER" ]; then
echo "Failed to get OWNER from OpenStack instance metadata"
exit 1
else
echo "Got owner \"$OWNER\" from OpenStack instance metadata"
fi
if [ -z "$INSTANCE_ID" ]; then
echo "Failed to get INSTANCE_ID from OpenStack instance metadata"
exit 1
else
echo "Got instance ID $INSTANCE_ID from OpenStack instance metadata"
fi
# Define the base URL of Jupyter (as required by VISA Jupyter Proxy)
BASE_URL=/jupyter/$INSTANCE_ID
# Check that the owner/login exists
if ! id "$OWNER" &>/dev/null; then
echo "Failed to run JupyterLab: User $OWNER not found"
exit 1
fi
# Run as the instance owner
su - "$OWNER" <<EOF
echo "Running JupyterLab as $OWNER using conf file $JUPYTER_CONF with base URL $BASE_URL"
# Run JupyterLab
$JUPYTER_ENV/bin/jupyter lab --config $JUPYTER_CONF --NotebookApp.base_url=$BASE_URL
EOF
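The script above still has to be launched at boot. One possible way to do this is a systemd service; the following is a minimal sketch rather than the configuration used at the ILL. The unit name visa-jupyter.service, the start-script path and the ordering after cloud-final.service (so that cloud-init query can return the metadata) are all assumptions to adapt to your image.

```shell
#!/bin/bash
# Sketch: generate a systemd unit that runs the JupyterLab start script at boot.
# The unit name, file locations and start-script path are hypothetical.

# write_jupyter_unit UNIT_FILE START_SCRIPT
# Writes a unit file that launches START_SCRIPT after cloud-init has finished
# (assumption: cloud-final.service is the last cloud-init stage on the image).
write_jupyter_unit() {
    local unit_file="$1"
    local start_script="$2"

    cat > "$unit_file" <<EOF
[Unit]
Description=JupyterLab for VISA
After=network-online.target cloud-final.service
Wants=network-online.target

[Service]
Type=simple
ExecStart=$start_script
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
}

# Example (run as root during the image build; paths are hypothetical):
#   write_jupyter_unit /etc/systemd/system/visa-jupyter.service /opt/visa/jupyter/start-jupyter.sh
#   systemctl daemon-reload
#   systemctl enable visa-jupyter.service
```

Because the start script keeps JupyterLab in the foreground, Type=simple is sufficient here. The service itself runs as root so that the script can switch to the instance owner with su.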
The following is an example config file for Jupyter:
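# Listen on all interfaces so that the VISA Jupyter Proxy can reach the server
c.NotebookApp.ip = '0.0.0.0'
c.NotebookApp.open_browser = False
# Must match the port configured in VISA Jupyter Proxy
c.NotebookApp.port = 8888
# Trust the X-Forwarded-For / X-Real-Ip style headers sent by an upstream proxy
c.NotebookApp.trust_xheaders = True
c.NotebookApp.allow_origin = '*'
c.NotebookApp.disable_check_xsrf = False
# No token is required by the Jupyter server itself
c.NotebookApp.token = ''
c.Application.log_level = 'DEBUG'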
Note that the port here is set to 8888: this is the default value in VISA Jupyter Proxy.
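To check from inside the instance that the server listens on the expected port and base URL, it can help to query the Jupyter REST API root, which is served under the base URL. The sketch below only builds that URL; the helper name jupyter_api_url is hypothetical, and the host and port are assumed to match the example configuration above.

```shell
#!/bin/bash
# jupyter_api_url HOST PORT INSTANCE_ID
# Prints the URL of the Jupyter REST API root for a server whose base_url
# is /jupyter/INSTANCE_ID (hypothetical helper, not part of VISA).
jupyter_api_url() {
    local host="$1" port="$2" instance_id="$3"
    if [ -z "$instance_id" ]; then
        echo "instance ID is required" >&2
        return 1
    fi
    printf 'http://%s:%s/jupyter/%s/api\n' "$host" "$port" "$instance_id"
}

# Example (on the instance; requires curl and a running Jupyter server):
#   INSTANCE_ID=$(cloud-init query ds.meta_data.meta.id)
#   curl --silent --fail "$(jupyter_api_url localhost 8888 "$INSTANCE_ID")"
```

If the request fails while the server is running, the port or the base URL most likely does not match what VISA Jupyter Proxy expects.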
Integration of conda environments¶
Jupyter comes with a default Python environment. To be able to perform data analysis we replace this with a conda environment with pre-installed libraries. Conda is a package manager and environment manager that allows you to create and use environments for specific analysis purposes.
It is also possible for users to add their own conda environments to the VISA JupyterLab installation.
More information about Conda can be found in the Conda documentation.
Default data analysis environment¶
The following script can be added to the image build process to create a data analysis environment using conda:
#!/bin/bash
# Assume that conda is already installed
CONDA_INSTALL_DIR=/opt/conda
CONDA_ENVS_DIR=$CONDA_INSTALL_DIR/envs
CONDA_EXE=$CONDA_INSTALL_DIR/bin/conda
# The location where Jupyter is installed
JUPYTER_DIR=/opt/visa/jupyter
# Exported so that the conda commands started by this script see the setting
export CONDA_ALWAYS_YES="true"
# Proxies here if needed
#export HTTP_PROXY=http://my.proxy.host:1234
#export HTTPS_PROXY=http://my.proxy.host:1234
echo "Setting up conda environment for data analysis"
source "$CONDA_INSTALL_DIR/etc/profile.d/conda.sh"
# Get the yaml description of the environment
$CONDA_EXE env create -f /tmp/environment_data_analysis.yml --force
echo "Creating ipykernel for data analysis"
# Integrate the conda environment into Jupyter
$CONDA_ENVS_DIR/data-analysis/bin/python -m ipykernel install --prefix=$JUPYTER_DIR/jupyterlab --name 'python3' --display-name 'Data Analysis'
The final line replaces the default python3 kernel with the data analysis environment, which appears in JupyterLab as "Data Analysis".
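To confirm that the replacement worked, one option is to read the display name back from the kernel spec. With --prefix, ipykernel writes the spec to share/jupyter/kernels/NAME/kernel.json under the prefix. The helper name kernel_display_name below is hypothetical, and the sed expression assumes the one-key-per-line layout that ipykernel writes.

```shell
#!/bin/bash
# kernel_display_name PREFIX KERNEL_NAME
# Prints the display_name stored in
# PREFIX/share/jupyter/kernels/KERNEL_NAME/kernel.json (hypothetical helper).
kernel_display_name() {
    local spec="$1/share/jupyter/kernels/$2/kernel.json"
    if [ ! -f "$spec" ]; then
        echo "No kernel spec found at $spec" >&2
        return 1
    fi
    # Extract the quoted value that follows the "display_name" key
    sed -n 's/.*"display_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$spec"
}

# Example (after running the build script above):
#   kernel_display_name /opt/visa/jupyter/jupyterlab python3
```

If the build script succeeded, the example call should print the display name passed to ipykernel, here Data Analysis.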
The script expects the conda YAML file to be located at /tmp/environment_data_analysis.yml. The contents of the data analysis environment are as follows:
name: data-analysis
channels:
- anaconda
- conda-forge
- defaults
- rdkit
dependencies:
- cython=0.29.21
- pip=20.2.4
- python=3.6.9
- rdkit=2020.09.2
- refnx=0.1.19
- pip:
  - biopython==1.78
  - h5py==3.1.0
  - iminuit==1.5.4
  - ipykernel==5.3.4
  - ipympl==0.5.8
  - ipython==7.16.1
  - pyqt5==5.15.2
  - lmfit==1.0.1
  - matplotlib==3.1.3
  - numba==0.52.0
  - numexpr==2.7.1
  - numpy==1.19.4
  - pandas==1.1.4
  - PeakUtils==1.3.3
  - Pillow==8.0.1
  - PyYAML==5.3.1
  - qtpy==1.9.0
  - requests==2.25.0
  - scikit-learn==0.23.2
  - scipy==1.5.4
  - seaborn==0.11.0
  - statsmodels==0.12.1
  - sympy==1.7
  - ufit==1.4.4
  - jscatter==1.2.7.2
  - uncertainties==3.1.5
  - attrs==20.3.0
  - periodictable==1.5.3
User environments¶
Assuming a user has created their own Conda environment in VISA and wishes to integrate it into JupyterLab, there are a couple of commands that they need to run.
First, they need to install ipykernel from within their activated Conda environment, and then register the environment as a Jupyter kernel:
> conda activate my_conda_env
(my_conda_env) > conda install ipykernel
(my_conda_env) > python -m ipykernel install --user --name=my_conda_env
More details can be found in the VISA user documentation at the ILL.
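A kernel installed with --user ends up in the user's Jupyter data directory rather than in the JupyterLab installation. As a rough sketch, the helper below (hypothetical name user_kernel_dir) prints where the kernel spec is expected on Linux, assuming Jupyter's default lookup order: JUPYTER_DATA_DIR if set, otherwise XDG_DATA_HOME/jupyter, otherwise ~/.local/share/jupyter.

```shell
#!/bin/bash
# user_kernel_dir KERNEL_NAME
# Prints the directory in which a kernel installed with --user is expected
# on Linux (hypothetical helper; assumes Jupyter's default data directories).
user_kernel_dir() {
    local data_dir
    if [ -n "$JUPYTER_DATA_DIR" ]; then
        data_dir="$JUPYTER_DATA_DIR"
    else
        data_dir="${XDG_DATA_HOME:-$HOME/.local/share}/jupyter"
    fi
    printf '%s/kernels/%s\n' "$data_dir" "$1"
}

# Example: check that the kernel spec exists after registering the environment
#   test -f "$(user_kernel_dir my_conda_env)/kernel.json" && echo "kernel registered"
```

The same information is available from jupyter kernelspec list when a jupyter command is on the user's PATH.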