Tools for scaling up
Overview
Teaching: 10 min
Exercises: 0 min
Questions
How do I turn working code fragments into batch jobs?
Where can I look for more help?
Objectives
Learn where to go next.
Scaling up
The tools described in these lessons are intended to be used within scripts that are scaled up to large datasets.
You could use any of them in an ordinary GRID job (or any other batch processor).
However, the Coffea project (documentation) is building a distributed ecosystem that integrates Pythonic analysis with data-analysis farms. That is too large a subject to cover here, but check out the software and join the Coffea user meetings if you’re interested.
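A common pattern when turning a working code fragment into batch jobs is to split the list of input files into chunks, then run the same analysis script once per chunk. The sketch below illustrates only that bookkeeping step with the standard library; the file names and chunk size are made up, and in a real workflow each chunk would become one GRID (or other batch) job that opens its files with Uproot.

```python
# Illustrative sketch: split an input file list into fixed-size chunks,
# one chunk per batch job. File names here are hypothetical placeholders.

def chunk(files, size):
    """Yield successive chunks of at most `size` files."""
    for start in range(0, len(files), size):
        yield files[start:start + size]

# A made-up dataset of 10 input files.
input_files = [f"sample_{i}.root" for i in range(10)]

# 10 files in chunks of 4 -> 3 jobs, of sizes 4, 4, and 2.
jobs = list(chunk(input_files, 4))
for job_id, job_files in enumerate(jobs):
    print(f"job {job_id}: {len(job_files)} files")
```

Each job would then run the same script over its `job_files`, and the per-job outputs (e.g. histograms) would be merged afterwards; Coffea automates this chunking, execution, and merging for you.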
Scikit-HEP Resources
- scikit-hep.org
- Uproot GitHub, documentation
- Awkward Array GitHub, documentation
- boost-histogram GitHub, documentation
- hist GitHub, documentation
- Unified Histogram Interface GitHub, documentation
- mplhep GitHub, documentation
- iminuit GitHub, documentation
- zfit GitHub, documentation
- Vector GitHub, documentation
- Particle GitHub, documentation
- hepunits GitHub
- fastjet GitHub, documentation
- pyhf GitHub, documentation
- hepstats GitHub, documentation
- cabinetry GitHub, documentation
- histoprint GitHub
- decaylanguage GitHub, documentation
- GooFit GitHub, documentation
- pyhepmc GitHub
- pylhe GitHub
and finally
- cookie GitHub, documentation, a template for making your own…
Key Points
See Coffea for more about scaling up your analysis.
Pythonic high-energy physics is a broad and growing ecosystem.