Data visualization with Python programming software


Key information

Next planned workshop: TBD
Registration deadline: TBD
Location: Entirely digital
Workshop leader: Radovan Bast

Registration instructions

Registration is now closed.

Content

In this introductory-level workshop, we will learn to produce publication-ready and reproducible graphs using the Python programming language.

This workshop is about reading data from a file, processing it, and plotting the result, all in a reproducible way. This is something that we all do or would like to do. The content is general and should be relevant for anyone working in science.

We will work in Jupyter notebooks and start with Python basics, to be able to read data from Excel sheets and comma-separated values (CSV) files. We will introduce the pandas library for “data wrangling” (reading, writing, sorting, and filtering of data).
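To give a flavor of what "data wrangling" means here, the following is a minimal sketch, not workshop material. The column names and values are made up for illustration, and the data is read from an in-memory string so that the example is self-contained; in practice it would come from a CSV or Excel file.

```python
import io

import pandas as pd

# Hypothetical measurements. With a real file this would be, for example,
# pd.read_csv("measurements.csv") or pd.read_excel("measurements.xlsx").
csv_text = """sample,temperature,yield
A,20,0.51
B,25,0.64
C,30,0.72
D,35,0.58
"""

# Reading: parse the CSV text into a DataFrame.
data = pd.read_csv(io.StringIO(csv_text))

# Filtering: keep only rows with a yield above 0.55.
high_yield = data[data["yield"] > 0.55]

# Sorting: order the remaining rows by yield, highest first.
ranked = high_yield.sort_values("yield", ascending=False)

# Writing: serialize the result as CSV text (pass a file path to write to disk).
csv_out = ranked.to_csv(index=False)
print(csv_out)
```

Each step returns a new DataFrame and leaves the original data untouched, which makes it easy to rerun the whole sequence from the raw file.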

We will learn how to process data and compute simple statistics, error bars, and regression approximations with Python and the help of its libraries.
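As an illustration of this step, here is a small sketch that computes group means, standard errors (a common choice for error bars), and a straight-line fit. The data set is invented, and using pandas `groupby` together with `numpy.polyfit` is only one of several possible approaches, not necessarily the one taught in the workshop.

```python
import numpy as np
import pandas as pd

# Hypothetical repeated measurements at three concentrations.
data = pd.DataFrame(
    {
        "concentration": [1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0],
        "signal": [2.1, 1.9, 2.0, 4.2, 3.8, 4.0, 6.1, 5.9, 6.0],
    }
)

# Simple statistics per group: the mean and the standard error of the mean.
summary = (
    data.groupby("concentration")["signal"]
    .agg(["mean", "sem"])
    .reset_index()
)

# Linear regression through the group means (polynomial of degree 1).
slope, intercept = np.polyfit(
    summary["concentration"].to_numpy(),
    summary["mean"].to_numpy(),
    deg=1,
)

print(summary)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The `sem` column can be passed directly as the error-bar size when plotting the means.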

Finally, we will learn how to produce reproducible plots using the libraries Matplotlib, Seaborn, and Altair (you can then choose your favorite). We will practice sharing these visualization pipelines using Binder via GitHub.
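A scripted plot is reproducible in the sense that rerunning the code regenerates the same figure. The sketch below uses Matplotlib only; the numbers, axis labels, and the fitted slope of 2.0 are placeholders chosen for illustration. The `Agg` backend line is there so that the script also runs without a display; inside a Jupyter notebook it can be omitted.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook
import matplotlib.pyplot as plt

# Hypothetical group means and standard errors.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
y_err = [0.06, 0.12, 0.06]

fig, ax = plt.subplots(figsize=(4, 3))

# Means with error bars, plus an assumed straight-line fit y = 2.0 * x.
ax.errorbar(x, y, yerr=y_err, fmt="o", capsize=3, label="mean with standard error")
ax.plot(x, [2.0 * xi for xi in x], label="linear fit")

ax.set_xlabel("concentration (arbitrary units)")
ax.set_ylabel("signal (arbitrary units)")
ax.legend()
fig.tight_layout()

# Render the figure as a PNG at print resolution. Passing a file name
# such as "figure.png" instead of the buffer would write it to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=300)
png_bytes = buffer.getvalue()
```

Keeping the figure size, labels, and resolution in the script, rather than adjusting them by hand afterwards, is what makes the plot easy to regenerate when the data changes.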

This is a three-day workshop. Most of the learning will take place over two half days on March 29 and 30, the Monday and Tuesday before Easter. After that, we will work on our own projects and meet again on April 7 (the first Wednesday after Easter) for a mentoring session in which we will together improve the data processing and visualization pipelines for our actual research projects.

Schedule

Day 1, Monday March 29

9:00 – 9:50
- Jupyter and Python basics
10:00 – 10:50
- Reading and writing data with pandas
11:00 – 12:00
- Generating our first plot
- Tidy data format

Day 2, Tuesday March 30

9:00 – 9:50
- Computing and visualizing statistics (error bars, regression)
10:00 – 10:50
- Customizing plots
11:00 – 12:00
- Sharing reproducible data science pipelines using Binder

Day 3, Wednesday April 7

9:00 – 12:00
- Mentoring on assignments and own projects.
  You will briefly present a visualization/plotting challenge from your own research work and we will together try to improve the workflow in a constructive way.

Admission

Approximately 20 students will be admitted to this workshop. You must be a BioCat member to join. Not a member? Register here.

Required materials

- No programming experience is needed; we will start from zero and learn the basics together
- Computer with network access
- As preparation, read and work through https://swcarpentry.github.io/python-novice-inflammation/ in self-study (about 2 hours)
- Anaconda installation (installation instructions will be provided)
- GitHub account (optional)
- Zoom video conferencing software
- One of your recent plotting tasks or challenges to bring along