Introduction: Python background
|
Be familiar with the syntax of Python dicts, NumPy arrays, slicing rules, and bitwise logic operators.
Large-scale computations in Python tend to be performed one array at a time, rather than one scalar operation at a time.
You, as a user, will likely be gluing together many packages in each data analysis.
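
As a minimal sketch of array-at-a-time computation, the example below combines boolean-mask slicing, bitwise logic operators, and a dict. The variable names and numerical values are made up for illustration.

```python
import numpy as np

# Hypothetical per-particle quantities (values invented for illustration).
pt = np.array([12.5, 48.0, 7.3, 31.9, 55.2])
eta = np.array([0.4, -2.7, 1.1, -0.3, 2.9])

# One array-at-a-time expression replaces an explicit Python loop.
# The parentheses are required: & and | bind more tightly than < and >.
mask = (pt > 20) & (np.abs(eta) < 2.5)

selected_pt = pt[mask]   # boolean-mask slicing keeps entries where mask is True
first_two = pt[:2]       # ordinary range slicing, as with Python lists

# A dict is a convenient way to carry named results between packages.
summary = {"n_selected": int(mask.sum()), "max_pt": float(selected_pt.max())}
```

The same selection written as a Python `for` loop would give the same result but would run one scalar operation at a time.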
|
Basic file I/O with Uproot
|
Uproot TDirectories and TTrees have a dict-like interface.
Uproot reading methods are primarily intended to get data into a more specialized library.
Uproot writing is more limited, but it can write histograms and TTrees.
|
TTree details
|
ROOT files have a structure that enables partial reading. This is essential for large datasets.
Be aware of how much data you’re reading and when.
The Python + Jupyter + Uproot interface provides a gradual path from interactive tinkering to scaled-up workflows.
|
Jagged, ragged, Awkward Arrays
|
NumPy (and almost all array libraries) is only for rectilinear collections of numbers: arrays, tables, and tensors.
Awkward Array extends NumPy’s slicing and array-manipulation to jagged arrays and more general data types (such as nested records).
These extensions are useful for physics.
There’s usually more than one way to get what you want.
|
Histogram manipulations and fitting
|
High-energy physicists approach histogramming in a different way from NumPy, Matplotlib, SciPy, etc.
Scikit-HEP tools make histogramming and fitting Pythonic.
|
Lorentz vectors, particle PDG IDs, jet-clustering, oh my!
|
Instead of each package building in its own Lorentz-vector methods, a single standalone package provides them.
The value of these small packages is amplified when they are used together.
|
Tools for scaling up
|
|