Leonardo
Leo provisions Spark clusters through Google Dataproc and installs Jupyter notebooks and Hail on them. It can also proxy end-user connections to the Jupyter interface in order to provide authorization for particular users.
For more information and an overview, see the wiki.
We use JIRA instead of the issues page on GitHub. If you would like to see what we are working on, you can visit our active sprint or our backlog on JIRA. You will need to set up an account to access it, but it is open to the public.
Swagger API documentation: https://notebooks.firecloud.org/
Authorization provider
Leo provides two modes of authorization out of the box:
- By whitelist
- Through Sam, the Workbench IAM service
Users wanting to roll their own authorization mechanism can do so by subclassing LeoAuthProvider and setting up the Leo configuration file appropriately.
Service account provider
There are (up to) three service accounts used in the process of spinning up a notebook cluster:
- The Leo service account itself, used to make the call to Google Dataproc
- The service account passed to dataproc clusters create via the --service-account parameter, whose credentials will be used to set up the instance and localized into the GCE metadata server
- The service account that will be localized into the user environment and returned when any application asks for application default credentials.
Currently, Leo uses its own SA for #1, and the same per-user project-specific SA for #2 and #3, which it fetches from Sam. Users wanting to roll their own service account provisioning mechanism can do so by subclassing ServiceAccountProvider and setting up the Leo configuration file appropriately.
Building and running Leonardo
Clone the repo.
$ git clone https://github.com/DataBiosphere/leonardo.git
$ cd leonardo
Run Leonardo unit tests
Leonardo requires Java 8 due to a dependency on Java's DNS SPI functionality. This feature is removed in Java 9 and above.
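Because the build depends on the Java version, it can help to check which JDK is on the PATH before running sbt. The following is a minimal sketch and not part of the Leonardo repository: the helper name java_major is hypothetical, and the check assumes that java -version prints a quoted version string such as "1.8.0_292".

```shell
# Hypothetical helper (not part of Leonardo): print the major version
# for a Java version string, e.g. 1.8.0_292 -> 8, 11.0.2 -> 11.
java_major() {
  jm_v="$1"
  case "$jm_v" in
    1.*) jm_v="${jm_v#1.}"; echo "${jm_v%%.*}" ;;  # legacy 1.x scheme (Java 8 and earlier)
    *)   echo "${jm_v%%[.+-]*}" ;;                  # Java 9+ scheme
  esac
}

# Only run the check when a java binary is available.
if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; the version is the first quoted field.
  detected="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
  if [ "$(java_major "$detected")" != "8" ]; then
    echo "Warning: found Java $detected, but Leonardo requires Java 8" >&2
  fi
fi
```

The check only prints a warning, so it can be sourced from a local setup script without stopping the build.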
Ensure docker is running. Spin up MySQL locally:
$ ./docker/run-mysql.sh start leonardo
Note: if you see an error like
Warning: Using a password on the command line interface can be insecure.
ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql' (113)
Warning: Using a password on the command line interface can be insecure.
ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql' (113)
Warning: Using a password on the command line interface can be insecure.
ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql' (113)
Run docker system prune -a
Build Leonardo and run all unit tests.
export SBT_OPTS="-Xmx2G -Xms1G -Dmysql.host=localhost -Dmysql.port=3311 -Duser.timezone=UTC"
sbt clean compile "project http" test
You can also run a particular test suite, e.g.
sbt "testOnly *LeoAuthProviderHelperSpec"
or a particular test within a suite, e.g.
sbt 'testOnly org.broadinstitute.dsde.workbench.leonardo.runtimes.RuntimeCreationDiskSpec -- -z "create runtime and attach a persistent disk"'
where the quoted string after -z is a substring of the test name.
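The test-name filter contains spaces, so how the sbt argument is quoted in the shell matters. As a small illustration (count_args is a hypothetical helper, and Spec stands in for a real suite name), nesting double quotes splits the command into several arguments, while single quotes around the whole command keep it as one:

```shell
# Hypothetical helper: print how many arguments the shell passed.
count_args() { echo "$#"; }

# Inner double quotes close the outer ones, so the filter is split apart.
nested="$(count_args "testOnly Spec -- -z "create runtime"")"

# Single quotes keep the whole sbt command, inner quotes included, as one argument.
single="$(count_args 'testOnly Spec -- -z "create runtime"')"

echo "nested double quotes: $nested argument(s)"
echo "single-quoted:        $single argument(s)"
```

When sbt should receive the command as a single argument, wrapping it in single quotes and keeping double quotes for the -z filter is the form that survives shell parsing.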
Once you're done, tear down MySQL.
./docker/run-mysql.sh stop leonardo
If you see a java.sql.SQLNonTransientConnectionException: Too many connections error, run docker restart leonardo-mysql.
- Running tests against FIAB
Checking FIAB MySQL (find the password in /etc/leonardo.conf in the firecloud_leonardo-app_1 container):
docker exec -it firecloud_leonardo-mysql_1 bash
root@2f5efbd4f138:/# mysql -u leonardo -p
Run scalafmt
Learn more about scalafmt
sbt scalafmtAll
Building Leonardo docker image
To install git-secrets
brew install git-secrets
To ensure git hooks are run
cp -r hooks/ .git/hooks/
chmod 755 .git/hooks/apply-git-secrets.sh
To build jar, leonardo docker image, and leonardo-notebooks docker image
./docker/build.sh jar -d build
To build jar, leonardo docker image, and leonardo-notebooks docker image and push to repos broadinstitute/leonardo and broadinstitute/leonardo-notebooks tagged with git hash
./docker/build.sh jar -d push
To build the leonardo-notebooks docker image with a given tag
bash ./jupyter-docker/build.sh build <TAG NAME>
To push the leonardo-notebooks docker image you built to repo broadinstitute/leonardo-notebooks
bash ./jupyter-docker/build.sh push <TAG NAME>