Introducing the new hh5 conda environment#
The hh5 python/conda environment has grown vastly in scope since it was first installed many years ago. What started as a place to put important python packages missing from NCI’s installations has since become an agile, community-driven, monolithic installation serving almost 600 researchers. Currently, there are 261 packages listed in the file used to construct and update the environment. Once all of the dependencies of these packages are factored in, there are 1,015 total python packages installed in the current
analysis3-unstable environment. As a result of the work put in by my predecessors at CLEX CMS, an update to the
analysis3-unstable environment, from a researcher requesting a package through email@example.com to integration and testing of that package, can be completed in under 1 hour.
That being said, there are issues that had to be addressed going forward.
The size of the environment;
At 1,015 packages, the current
analysis3-unstableenvironment contains 289,524 files, directories and symlinks. At approximately 10GB in size, this corresponds to an average file size of around 36kB, which is not suitable for the lustre file systems that the conda environments are installed on.
The lack of integration with NCI systems;
As conda is designed to create integrated environments on single-user systems, it installs packages like MPI and SSH, which are not configured correctly for NCI’s systems, causing issues such as MPI not working across multiple nodes, or ssh host-based authentication failing.
The new hh5 conda environment, herein referred to as
conda_concept is our attempt to address these issues while retaining the all the strengths of the current
conda/analysis3 environments. It is a combination of container and squashfs-based solution, though use of a container is only a byproduct of Singularity’s ability to manage
squashfs filesystems. It combines ideas from Singularity Registry HPC, Rioux et al. PEARC ‘20: Practice and Experience in Advanced Research Computing, July 2020, Pages 72–76 and the current
Same packages as current
Addresses issues with the installation and maintenance of current
Will be maintained in parallel with standard hh5 conda for at least 3 months.
analysis3-stable/unstablestructure and (roughly) quarterly update time
New module name -
conda_concept- to use on Gadi e.g.
$ module use /g/data/hh5/public/modules
$ module load conda_concept/analysis3-unstable
analysis3/23.01environments are available - they are exact duplicates of the corresponding existing
Can’t use direct path to interpreter, must use path to the ‘scripts’ directory instead if a full path is needed (e.g. in
#!/g/data/hh5/.../python3shbang lines in scripts)
conda activatedoes not work, must load the module.
Usage of the new
conda_concept/analysis3 environments is very similar to the current
conda/analysis3 environments. On the Gadi command line, or in PBS scripts, run
$ module use /g/data/hh5/public/modules
$ module load conda_concept/analysis3
The version naming scheme is not changing. The
analysis3-unstable environment will always be an alias for the most recent environment and will continue to be updated at roughly 3 monthly intervals.
analysis3 will remain the most recent “stable” environment.
$ module load conda_concept
will load the “stable” environment. Specific versions can also be loaded with e.g.
$ module load conda_concept/analysis3-23.01
analysis3 workflow comprising python scripts or jupyter notebooks should seamlessly carry over to the new
conda_concept environments. If your workflow does not, please contact firstname.lastname@example.org or leave an issue on the github page.
Key differences between
The key difference of the approach used by the
conda_concept environments is the use of singularity’s ability to manage squashfs file systems. A squashfs can be thought of a something like a tar file that can be accessed as if it were an entirely new file system. The obvious advantage of placing a conda environment in a squashfs is that it reduces the file count of the entire environment on
/g/data to one.
The drawback of using singularity to manage the squashfs is that containers have a number of restrictions placed on them for security reasons, (e.g. the
qcat commands cannot be run from inside a container). Because of this, once the module is loaded, the user is kept out of the containerised environment unless a command that exists inside the container is run. This is accomplished by the use of a ‘launcher’ script that runs
singularity and executes the command from within the container. For more details on this, see the CMS Wiki page. This leads to known issues with the environment.
The path to the python executable within a conda environment cannot be used as the shbang (e.g.
#!/g/data/hh5/public/apps/cms_conda/envs/analysis3/bin/python3on the first line of script) as the executable does not exist outside of the container. Instead, use the launcher script symlink:
#!/g/data/hh5/public/apps/cms_conda_scripts/analysis3.d/bin/python3, which launches the container and runs the script from inside of it.
conda envcommands do not work. This is because the
condacommand runs outside of Singularity and does not have any visibility into the environments.
conda activatedoes not work. Loading the module is equivalent to running
conda activatefor a given environment, however,
conda activateto load a different environment after loading the module will not work.
Very occasionally, a package may fail to import with
IOError. This is an issue with the underlying file system. To aid in diagnosis, please submit the PBS jobid and/or node you were working on and the time at which the import failed.
If you encounter any problem not listed here that is fixed when you revert to the standard
conda/analysis3-unstable environment, please contact email@example.com, or leave an issue on the github page.