Overview

This week, you are asked to work through the first three tutorials from the pandas_workshop.

Please refer to the README for advice on how to set up the workshop. Note in particular the advice to create a conda/pip environment specifically for the Pandas Workshop, to avoid package dependency conflicts with the environment you created for your other work in the Data Mining 1 module.

The tutorials will help you to become more familiar with pandas features that you will need for your CA1 attempts.

The datasets can be found here: right-click, download, and unzip the archive so that the data/ folder is a subfolder of the folder containing the notebooks below.
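As a quick check of the folder layout described above, a small sketch like the following could be run from the folder holding the notebooks. The helper name has_data_subfolder is hypothetical (it is not part of the workshop), and the check only confirms that a data/ folder exists; it does not verify its contents.

```python
from pathlib import Path


def has_data_subfolder(notebook_dir="."):
    """Return True if a data/ folder sits directly inside notebook_dir.

    Hypothetical helper for illustration; not part of the workshop code.
    """
    return (Path(notebook_dir) / "data").is_dir()


# Run from the folder containing the notebooks (assumed to be the current one).
print("data/ found next to the notebooks:", has_data_subfolder())
```

If this prints False, the archive was probably unzipped somewhere else, or into an extra nested folder.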

Getting Started

The Getting Started notebook provides a gentle introduction to pandas, supplementing the pandas introduction you saw in previous weeks.

Sections include

Data Wrangling

The Data Wrangling notebook complements the workflow introduced in class, in which data is loaded, inspected, cleaned, and so on.
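The load, inspect, clean sequence mentioned above might look roughly like the sketch below. The data is invented for illustration (the column names city, date and temp_c are assumptions, not taken from the workshop datasets), and the cleaning steps shown are just two common choices rather than the notebook's own method.

```python
import io

import pandas as pd

# Hypothetical data, invented for illustration; not one of the workshop datasets.
raw_csv = io.StringIO(
    "city,date,temp_c\n"
    "Dublin,2024-01-01,7.5\n"
    "Cork,2024-01-01,\n"
    "Dublin,2024-01-01,7.5\n"
    "Galway,2024-01-02,6.0\n"
)

# Load: parse the date column while reading.
df = pd.read_csv(raw_csv, parse_dates=["date"])

# Inspect: dimensions, column types and missing values per column.
print(df.shape)
print(df.dtypes)
print(df.isna().sum())

# Clean: drop exact duplicate rows, then rows with no temperature reading.
clean = (
    df.drop_duplicates()
    .dropna(subset=["temp_c"])
    .reset_index(drop=True)
)
print(clean)
```

Whether to drop or fill missing values depends on the dataset and the task, so treat dropna here as one option among several.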

Sections include

Data Visualisation

The Data Visualisation notebook introduces some of the visualisation features that are built into pandas, and adds some more from seaborn.
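To give a feel for the difference between the two, here is a minimal sketch that draws one plot with the plotting built into pandas and one with seaborn. The data is randomly generated for illustration (the columns group, x and y are assumptions), and the sketch assumes matplotlib and seaborn are installed in your workshop environment.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data, generated for illustration; not one of the workshop datasets.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "group": np.repeat(["a", "b", "c"], 20),
        "x": rng.normal(size=60),
    }
)
df["y"] = 2 * df["x"] + rng.normal(scale=0.5, size=60)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Built into pandas: DataFrame.plot draws with matplotlib under the hood.
df.plot(kind="scatter", x="x", y="y", ax=axes[0], title="pandas scatter")

# From seaborn: a box plot of y for each group, drawn on the second axes.
sns.boxplot(data=df, x="group", y="y", ax=axes[1])
axes[1].set_title("seaborn boxplot")

# In a Jupyter notebook the figure is typically displayed at the end of the cell.
fig.tight_layout()
```

Both calls draw onto ordinary matplotlib axes, which is why they can share one figure.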

Sections include

Exercises

There are also some exercises built into both workbooks so that you can check your understanding.

For your convenience, you can add your code snippets to this workbook.

Data files for CA1