This tutorial explores machine learning using scikit-learn and PyTorch for applications in high-energy physics.
It is extended from a version developed by Luke Polson for the 2020 USATLAS Computing Bootcamp.
This lesson leads directly into the lesson “Machine Learning on GPU”, originally developed by Anna Scaife.
Prerequisites
- A Kaggle account. Create one as described on the Setup page
- Basic Python knowledge, e.g. through the Software Carpentry Programming with Python lesson
Introduction
Machine learning is everywhere in modern “big-data” science. As physicists and big-data scientists, we benefit from knowing a bit about machine learning.
The aim of this lesson is to:
- explore what it means to build a machine learning model
- expand on concepts in machine learning that are essential to anyone working in big-data science
The skills we’ll focus on:
- Understanding a bit about machine learning
- Preparing data for machine learning
- Training some machine learning models
- Comparing some machine learning models
The HSF Training Curriculum
This training module is part of the HSF Training Curriculum, a series of training modules that teaches HEP newcomers the software skills they need as they enter the field and, in parallel, instills best practices for writing software.
Videos are provided at the top of each page to help guide you. For the sections without coding (Introduction, Mathematical Foundations, Neural Networks), the videos essentially take you through the text, so choose whichever way you learn best: video or reading. For the remaining sections, the videos take you through the coding live.