Deep learning for particle physicists#

This book is an introduction to modern neural networks (deep learning), intended for particle physicists. Most particle physicists need to use machine learning for data analysis or detector studies, and the unique combination of mathematical and statistical knowledge that physicists have puts them in a position to understand the topic deeply. However, most introductions to deep learning can’t assume that their readers have this background, and advanced courses assume specialized knowledge that physics audiences may not have.

This book is “introductory” in that it emphasizes the foundations: what neural networks are, how they work, and why they work, and it provides practical steps for training neural networks of any topology. It does not venture into the (ever-changing) world of network topologies or the design of new machine learning algorithms for new problems.

The material in this book was first presented at CoDaS-HEP in 2024: jpivarski-talks/2024-07-24-codas-hep-ml. I am writing it in book format, rather than simply depositing my slide PDFs and Jupyter notebooks in https://hsf-training.org/, because the original format assumes that I’ll verbally fill in the gaps. This format is good for two purposes:

  • offline self-study by a student without a teacher, and

  • preparation of new course slides and notebooks by teachers (without having to read my mind).

The course materials include some inline problems, intended for active learning during a lecture, and a large project designed for students to work on for about 2 hours. (In practice, experienced students finished it in an hour and beginners could have used a little more time.)

Software for the course#

This course uses Scikit-Learn and PyTorch for examples and problem sets. TensorFlow is also a popular machine learning library, but its functionality mostly overlaps with PyTorch’s, and I didn’t want to hide the concepts behind incidental differences in software interfaces. (I did include Scikit-Learn because its interface is much simpler than PyTorch’s. When I want to emphasize issues that surround fitting in general, I’ll use Scikit-Learn, because the fit itself is just two lines of code; when I want to emphasize the details of the machine learning model, I’ll use PyTorch, which expands the fit into tens of lines of code and allows more control over that part.)
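To make the contrast concrete, here is a minimal sketch of what “the fit itself is just two lines” looks like in Scikit-Learn. The model and toy data below are illustrative choices of mine, not taken from the course materials:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# toy dataset: learn y = sin(3x) from 200 samples (illustrative, not from the course)
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])

# in Scikit-Learn, the fit itself really is just two lines:
model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X, y)

pred = model.predict(X)  # one prediction per input sample
```

In PyTorch, the same fit would require an explicit training loop (optimizer, loss, backward pass), which is exactly the extra control the later chapters use.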

I didn’t take the choice of PyTorch over TensorFlow lightly (since I’m a newcomer to both). I verified that PyTorch is about as popular as TensorFlow among CMS physicists using the plot below (derived using the methodology in this GitHub repo and this talk). Other choices, such as JAX, would leave a reader of this tutorial unprepared to collaborate on machine learning as it is currently practiced in particle physics.

Moreover, PyTorch seems to be more future-proof than TensorFlow. By examining the use of both outside of particle physics, we see that Google search volume is increasing for PyTorch at the expense of TensorFlow (“JAX” is a common word with meanings beyond machine learning, making it impossible to compare), and PyTorch is much more frequently used by machine learning competition winners in the past few years.

What to install#

Make sure that you have these packages installed (with conda, pip, uv, etc.):

  - numpy
  - matplotlib
  - pandas
  - iminuit
  - scikit-learn
  - h5py
  - awkward
  - pytorch-cpu  # `torch` in pip

If you’re using pip, see PyTorch’s documentation for instructions. The name pytorch-cpu is only for conda.

The exercises are all small enough that you won’t need a GPU, but if you want to use PyTorch with your GPU, you’ll have to install the GPU drivers (only) outside of conda and then run conda install 'cuda-version>=12' pytorch pytorch-cuda. If you’re using pip, the entire CUDA installation is outside of what pip manages.
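Once everything is installed, a quick sanity check like the following sketch confirms that PyTorch imports and reports whether a CUDA GPU is usable (the try/except guard is my own defensive choice, so the script also runs where PyTorch is absent):

```python
# Sanity check: is PyTorch importable, and does it see a CUDA GPU?
try:
    import torch
    has_torch = True
    has_cuda = torch.cuda.is_available()  # False on CPU-only installs
except ImportError:
    has_torch = False
    has_cuda = False

print("PyTorch installed:", has_torch)
print("CUDA available:", has_cuda)
```

For the exercises in this book, `CUDA available: False` is perfectly fine.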

Table of contents#