Tools for scaling up
Overview
Teaching: 10 min
Exercises: 0 min
Questions
How do I turn working code fragments into batch jobs?
Where can I look for more help?
Objectives
Learn where to go next.
Scaling up
The tools described in these lessons are intended to be used within scripts that are scaled up to large datasets.
You could use any of them in an ordinary GRID job (or any other batch processor).
However, the Coffea project (documentation) is building a distributed ecosystem that integrates Pythonic analysis with data-analysis farms. That is too large a subject to cover here, but check out the software and join the Coffea user meetings if you’re interested.
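A common pattern when turning a working code fragment into batch jobs is to split the list of input files into chunks, then run the same analysis script once per chunk. The sketch below illustrates only that bookkeeping step with the standard library; the file names and chunk size are made up, and in a real workflow each chunk would become one GRID (or other batch) job that opens its files with Uproot.

```python
# Illustrative sketch: split an input file list into fixed-size chunks,
# one chunk per batch job. File names here are hypothetical placeholders.

def chunk(files, size):
    """Yield successive chunks of at most `size` files."""
    for start in range(0, len(files), size):
        yield files[start:start + size]

# A made-up dataset of 10 input files.
input_files = [f"sample_{i}.root" for i in range(10)]

# 10 files in chunks of 4 -> 3 jobs, of sizes 4, 4, and 2.
jobs = list(chunk(input_files, 4))
for job_id, job_files in enumerate(jobs):
    print(f"job {job_id}: {len(job_files)} files")
```

Each job would then run the same script over its `job_files`, and the per-job outputs (e.g. histograms) would be merged afterwards; Coffea automates this chunking, execution, and merging for you.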
Scikit-HEP Resources
- scikit-hep.org
- Uproot GitHub, documentation
- Awkward Array GitHub, documentation
- boost-histogram GitHub, documentation
- hist GitHub, documentation
- Unified Histogram Interface GitHub, documentation
- mplhep GitHub, documentation
- iminuit GitHub, documentation
- zfit GitHub, documentation
- Vector GitHub, documentation
- Particle GitHub, documentation
- hepunits GitHub
- fastjet GitHub, documentation
- pyhf GitHub, documentation
- hepstats GitHub, documentation
- cabinetry GitHub, documentation
- histoprint GitHub
- decaylanguage GitHub, documentation
- GooFit GitHub, documentation
- pyhepmc GitHub
- pylhe GitHub
and finally
- cookie GitHub, documentation, a template for making your own…
Key Points
See Coffea for more about scaling up your analysis.
Pythonic high-energy physics is a broad and growing ecosystem.