Introduction to JupyterLab

In this module, we use Jupyter notebooks to write code, store results (as text or images), document both code and results, and run data mining applications.

Once all the required dependencies are installed, you can start the Jupyter web server from the command line and point your browser at the relevant URL. Your browser then becomes the client that allows you to interact with the server.

We will go into how notebooks can be used in a later lab.

For now, we will just check that the software has been installed successfully.

Get the data file

If you have not done so already, you need to download auto-mpg.csv.

Navigate to a suitable directory/folder

For this purpose, my advice is to change to the directory where you downloaded auto-mpg.csv, so that the notebook you create later sits in the same folder as the data file.

Activate conda, so that the dependencies are on your Python path

conda activate

Alternatively, if on MS Windows, just start Anaconda Prompt and work from there.
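
Once the environment is active, one quick way to confirm that the dependencies can be imported is a short Python check like the sketch below. The package names pandas and jupyterlab are assumptions based on what this lab uses, and check_packages is only an illustrative helper, not part of any library.

```python
import importlib.util
import sys


def check_packages(names):
    """Return a dict mapping each package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Assumed package names: adjust the list to match your own installation.
status = check_packages(["pandas", "jupyterlab"])

print("Python", sys.version.split()[0])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

If either package is reported as MISSING, the conda environment is probably not active in the terminal you are using.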

Start the JupyterLab server

jupyter-lab
....
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=f9b36df2b7e03e15bd46c41552a883828d002a1c480f63f0
     or http://127.0.0.1:8888/?token=f9b36df2b7e03e15bd46c41552a883828d002a1c480f63f0

Note that jupyter-lab will generally open a browser tab for you; if it does not, copy and paste one of the URLs printed in the terminal into your browser. You should then see the JupyterLab interface.

Create a new Python 3 notebook

The JupyterLab page provides a handy "button" for this. Pressing it creates a new notebook with the default name Untitled.ipynb. If you have an existing notebook, you can use the navigation panel on the left to find the file and open it by double-clicking on it.

At this stage you might as well use File > Save As to give the notebook a more memorable name.

Create your first script

We are going to read the auto-mpg.csv file into a pandas DataFrame and quickly view its contents.

Assuming the auto-mpg.csv file has been placed in the same folder as the notebook, in the first cell (which should be of type Code, not Markdown or Raw), type:

import pandas as pd
auto_mpg = pd.read_csv('auto-mpg.csv')
auto_mpg.head()

Run your first script

Click the "play" button in the toolbar directly above the code cell (or press Shift+Enter), and the notebook will use the Python 3 kernel to run your script.

The output cell below your code cell will contain the first 5 rows of the auto_mpg data frame.
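
If the cell fails with a FileNotFoundError instead, the notebook is probably not running in the folder that holds the data file. A minimal sketch for checking this from a new code cell, assuming the file name has not been changed since download:

```python
from pathlib import Path

# Relative paths are resolved against the notebook's working directory.
csv_path = Path("auto-mpg.csv")

print("Working directory:", Path.cwd())
if csv_path.exists():
    print("Found", csv_path.resolve())
else:
    print("Not found: move the file here, or pass its full path to pd.read_csv()")
```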

What now?

  1. Add a second cell, change its type from Code to Markdown and add some comments/documentation in markdown format.
  2. Investigate the options in the pandas read_csv() function.
  3. Try simple pandas expressions like auto_mpg['weight'].max() to see how to select particular columns and to apply certain functions to sets of values (car weights in pounds, in this instance).
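
As a starting point for items 2 and 3, here is a small sketch that runs without the data file. It reads a toy table from a string; the column names and values are made up for illustration and are not taken from auto-mpg.csv, so swap in 'auto-mpg.csv' and your own column names when you try it for real.

```python
import io

import pandas as pd

# Toy stand-in for a CSV file: the columns and values here are invented.
toy_csv = io.StringIO(
    "mpg,horsepower,weight\n"
    "18.0,130,3504\n"
    "15.0,?,3693\n"
    "26.0,95,2372\n"
)

# usecols limits which columns are read; na_values marks "?" as missing.
toy = pd.read_csv(
    toy_csv,
    usecols=["mpg", "horsepower", "weight"],
    na_values="?",
)

print(toy["weight"].max())             # largest value in a single column
print(toy[["mpg", "weight"]].mean())   # mean of each of several columns
print(toy["horsepower"].isna().sum())  # number of missing entries
```

Single brackets with one name select one column, while a list of names inside the brackets selects several columns at once.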

Enjoy!

Closedown

  1. Save your notebook when you are finished. You can then close that browser tab.
  2. To stop the Jupyter web server, return to the terminal where it is running, press Ctrl-C, then type y to confirm.