Tutorial: Running Julia in Galileo

Written and developed by

Matthew Gasperetti
matthew@hypernetlabs.io
Alexander Berry
alexander@hypernetlabs.io

Getting started with Julia in Galileo

To get started with Galileo, log into your account using Firefox or Chrome, and download our Julia example folder from GitHub.

The downloaded folder contains a .jl file, a .csv file, and a Dockerfile. We’ll try running this folder in Galileo first, and then take a look at what’s happening behind the scenes.

Let’s have a look at our files

The julia_example.jl script conducts a simple linear regression using the supplied mtcars.csv dataset. It also demonstrates how to use a dataset loaded from a library.
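To make the idea of a simple linear regression concrete, here is a minimal sketch that fits a line using the closed-form least-squares formulas and only Julia’s standard library. It is not the code from julia_example.jl, which relies on the GLM package; the function name simple_ols and the toy data are invented for illustration.

```julia
using Statistics

# Ordinary least squares for y ≈ intercept + slope * x (hypothetical helper).
function simple_ols(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
    length(x) == length(y) || throw(ArgumentError("x and y must have the same length"))
    slope = cov(x, y) / var(x)              # slope = covariance / variance of x
    intercept = mean(y) - slope * mean(x)   # the fitted line passes through the means
    return (intercept = intercept, slope = slope)
end

# Toy data made up for this sketch: points lying exactly on y = 2x + 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = 2.0 .* x .+ 1.0

fit = simple_ols(x, y)
println("intercept = ", fit.intercept, ", slope = ", fit.slope)
```

With real data such as mtcars.csv, you would pass two numeric columns in place of the toy vectors.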

Next, the julia_example.jl script runs a Monte Carlo experiment that simulates 50,000 throws of two six-sided dice to estimate the probability that the sum of one throw is greater than or equal to seven. It then repeats the experiment with 10 million throws. Finally, it compares the means of the two samples and the amount of time it took to calculate them.
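The experiment can be sketched in a few lines of Julia. This is an illustration rather than the code in julia_example.jl: the function names estimate_prob and timed_estimate and the seed are our own choices, and we assume the second run uses 10 million throws. For reference, 21 of the 36 equally likely outcomes of two dice sum to seven or more, so the exact probability is 21/36.

```julia
using Random

# Estimate P(sum of two six-sided dice >= 7) from n simulated throws.
function estimate_prob(rng::AbstractRNG, n::Integer)
    hits = 0
    for _ in 1:n
        hits += (rand(rng, 1:6) + rand(rng, 1:6)) >= 7
    end
    return hits / n
end

# Run one experiment and report the estimate with its wall-clock time.
function timed_estimate(rng::AbstractRNG, n::Integer)
    start = time()
    p = estimate_prob(rng, n)
    return (estimate = p, seconds = time() - start)
end

rng = MersenneTwister(2019)   # arbitrary fixed seed so runs are repeatable
exact = 21 / 36               # exact probability, for comparison

small = timed_estimate(rng, 50_000)
large = timed_estimate(rng, 10_000_000)

println("exact probability:  ", exact)
println("50,000 throws:      ", small.estimate, " in ", small.seconds, " s")
println("10,000,000 throws:  ", large.estimate, " in ", large.seconds, " s")
```

The larger sample should land closer to the exact value, at the cost of a longer run time.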

Understanding the user interface

When you log into Galileo, the first thing you’ll see is your Dashboard:

View of the Galileo Dashboard

To run the julia_example.jl file, drag and drop the entire julia_example folder you downloaded from our GitHub to the station Galilei at the top of the Dashboard:

Drag and drop the julia_example folder to the Galilei station

After you drag and drop the julia_example folder to Galileo, you’ll be able to see the job running in the Your Recent Jobs panel. The job runs quickly in Galileo. Try running it locally and comparing.

When the example job completes, hit the Download button under Action to download the results.

The results folder will be downloaded as a .zip that contains an output.log file returning the results of the analysis and a folder called filesys where plots and other files that were created by the analysis are stored.

The downloaded .zip file contains a folder called filesys and a file called output.log

Let’s take a look at the output.log file first, which returns the results of the regression and Monte Carlo analysis we ran:

Summary of the results of the simple regression and Monte Carlo simulations

Running your own Julia files in Galileo — A closer look at how it works

A closer study of the files in our julia_example folder will help illustrate how to modify them so we can run other jobs. After that, we’ll have a look at the Galileo Docker Wizard, which helps automate the process.

How to code a Dockerfile to run Julia in Galileo

Let’s quickly review the example Dockerfile, which you can open with a text editor like Atom.

The first thing to notice is that the file is called Dockerfile with no extension. It cannot be called anything else — Dockerfile2, Dockerfile copy, or Dockerfile.txt won’t work.

Looking at the Dockerfile with our text editor, the first Docker command we see is:

FROM julia:1.1

This tells Docker to build on the official Julia 1.1 image, which sets up a Julia 1.1 environment. We want to leave it as is.

Let’s look at the next line of code we see in our Dockerfile:

RUN julia -e 'import Pkg; Pkg.add("CSV")'

This tells Docker to install the CSV package into the Julia environment. The julia_example includes a total of five packages that are all installed in similar fashion.

The final command is:

ENTRYPOINT ["julia","julia_example.jl"]

This tells Docker the name of our .jl file and what command to use in order to execute the script.

Here is the Dockerfile from the julia_example folder in its entirety with comments:

# The line below determines the build image to use
FROM julia:1.1
# The next block determines what dependencies to load
RUN julia -e 'import Pkg; Pkg.add("CSV")'
RUN julia -e 'import Pkg; Pkg.add("DataFrames")'
RUN julia -e 'import Pkg; Pkg.add("GLM")'
RUN julia -e 'import Pkg; Pkg.add("RDatasets")'
RUN julia -e 'import Pkg; Pkg.add("StatsBase")'
# This line determines where to copy project files from, and where to copy them to
COPY . .
# The entrypoint is the command used to start your project
ENTRYPOINT ["julia","julia_example.jl"]

Now, let’s have a look at our .jl file

First, import our dependencies with the “using” keyword:

using CSV, DataFrames, GLM, RDatasets, StatsBase

Another important thing to note is that we read in the dataset we are using, mtcars.csv, as if it were in our working directory, with the following command:

DataFrame(CSV.File("mtcars.csv"))

Notice that the path is relative, not absolute. There should not be a path to a directory anywhere in our .jl file. In this example, we also convert the dataset to a DataFrame object for easy manipulation.
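If you want to confirm how a relative path will be resolved before submitting a job, a small standard-library check like the one below can help. It is only a sketch: the variable names are ours, and it assumes you run it from the project folder.

```julia
# Relative paths are resolved against the current working directory.
data_file = "mtcars.csv"        # relative path, as used in the tutorial

@assert !isabspath(data_file)   # guard against a hard-coded absolute path

resolved = abspath(data_file)   # the location Julia will actually look in
println("Working directory:     ", pwd())
println("Data file resolves to: ", resolved)
println(isfile(data_file) ? "Found the data file." :
                            "Data file not found; run this from the project folder.")
```

Because the Dockerfile copies the whole folder with COPY . ., a bare file name like this is all the script needs, provided the script runs from the folder the files were copied into.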

It’s also possible to load datasets from a library. The following line loads the iris dataset from the RDatasets library:

iris = dataset("datasets", "iris")

Let’s discuss the datasets for a second

Both the mtcars.csv and iris datasets are simple. No surprises here. We just wanted to show you two ways to access data: 1) including mtcars.csv in the folder you drag and drop to Galileo, and 2) loading the iris dataset from a library directly in the .jl file.

Using the Docker Wizard to create your own project

If you drag and drop a folder to Galileo that does not contain a Dockerfile, you will see a Docker Wizard prompt:

The Docker Wizard helps automate creating a Dockerfile

To create a Dockerfile for a .jl file called my_project.jl that installs RDatasets, DataFrames, and GLM, enter the following settings into the Docker Wizard:

An example showing how to use Galileo’s Docker Wizard

Listing RDatasets, DataFrames, and GLM as dependencies adds the following commands to the Dockerfile, which install the packages when the Docker image is built:

RUN julia -e 'import Pkg; Pkg.add("RDatasets"); using RDatasets'
RUN julia -e 'import Pkg; Pkg.add("DataFrames"); using DataFrames'
RUN julia -e 'import Pkg; Pkg.add("GLM"); using GLM'
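To see the pattern behind these commands, here is a minimal Julia sketch that assembles the same kind of Dockerfile text from a script name and a list of packages. It is not how the Docker Wizard is implemented; the function name dockerfile_text and its julia_version keyword are hypothetical, and the layout simply mirrors the example Dockerfile shown earlier.

```julia
# Hypothetical helper: build Dockerfile text for a Julia script and its packages.
function dockerfile_text(script::AbstractString, packages::Vector{String};
                         julia_version::AbstractString = "1.1")
    lines = ["FROM julia:$(julia_version)"]
    for pkg in packages
        # One RUN line per package, in the same form as the commands above.
        push!(lines, "RUN julia -e 'import Pkg; Pkg.add(\"$(pkg)\"); using $(pkg)'")
    end
    push!(lines, "COPY . .")
    push!(lines, "ENTRYPOINT [\"julia\",\"$(script)\"]")
    return join(lines, "\n") * "\n"
end

text = dockerfile_text("my_project.jl", ["RDatasets", "DataFrames", "GLM"])
print(text)
```

To save the result, you could write it to a file named exactly Dockerfile inside your project folder, for example with write(joinpath("my_project", "Dockerfile"), text), where my_project is an assumed folder name.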

Once you complete your custom Dockerfile, make sure to add it to the project folder containing your my_project.jl script and your data (if applicable). The folder should then contain three things: the Dockerfile, your .jl file, and any data files your script reads.
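Before uploading, you can check the folder layout with a short script. The sketch below is our own illustration, not a Galileo tool: check_project is a hypothetical helper that only tests for a file named exactly Dockerfile and at least one .jl script, and it runs against a throwaway folder so it works anywhere.

```julia
# Hypothetical pre-flight check for a Galileo project folder.
function check_project(dir::AbstractString)
    names = readdir(dir)
    problems = String[]
    "Dockerfile" in names || push!(problems, "no file named exactly Dockerfile")
    any(n -> endswith(n, ".jl"), names) || push!(problems, "no .jl script")
    return problems
end

# Demonstrate on a temporary folder standing in for your project folder.
demo = mktempdir()
touch(joinpath(demo, "my_project.jl"))
println(check_project(demo))   # the Dockerfile is still missing here

touch(joinpath(demo, "Dockerfile"))
println(check_project(demo))   # an empty list means the layout looks right
```

To check a real project, call check_project with the path to your own folder instead of the temporary one.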

Now that your folder looks right, drag and drop it onto Galilei in your Dashboard at https://app.galileoapp.io.

We hope this tutorial was helpful. Please let us know if you have any questions or any problems using Galileo. Your feedback is extremely important to us. Contact us anytime at matthew@hypernetlabs.io or alexander@hypernetlabs.io.