order.dataset
Classes to define datasets.
Class Dataset
- class Dataset(*args, **kwargs)
Bases: UniqueObject, CopyMixin, AuxDataMixin, TagMixin, DataSourceMixin, LabelMixin
Dataset definition providing two kinds of information:
- (systematic) shift-dependent information, and
- shift-independent information.
Shift-independent information is, e.g., whether or not the dataset contains real data, whereas shift-dependent information is, e.g., the number of events in the nominal or a shifted variation. The latter is contained in DatasetInfo objects that are stored in this class and mapped to strings. These info objects can be accessed via get_info() or via items (__getitem__). For convenience, some of the properties of the nominal DatasetInfo object are accessible on this class via forwarding.
Arguments
A dataset is always measured in (real data) or created for (MC) a dedicated campaign; therefore, it belongs to a Campaign object. In addition, physics processes can be linked to a dataset; therefore, it has Process objects.
When info does not contain a nominal DatasetInfo object (mapped to the key order.shift.Shift.NOMINAL, i.e., "nominal"), all kwargs are used to create one. Otherwise, it should be a dictionary matching the format of the info mapping. label and label_short are forwarded to the LabelMixin, is_data to the DataSourceMixin, tags to the TagMixin, aux to the AuxDataMixin, and name, id and context to the UniqueObject constructor.
Copy behavior
All attributes are copied except for references to linked processes. The campaign reference is kept. Also note the copy behavior of UniqueObject instances.
Example
    import order as od

    campaign = od.Campaign("2017B", 1, ...)

    d = od.Dataset("ttH_bb", 1,
        campaign=campaign,
        keys=["/ttHTobb_M125.../.../..."],
        n_files=123,
        n_events=456789,
    )

    d.info.keys()  # -> ["nominal"]
    d["nominal"].n_files  # -> 123
    d.n_files  # -> 123

    # similar to above, but set explicit info objects
    d = od.Dataset("ttH_bb", 1,
        campaign=campaign,
        info={
            "nominal": {
                "keys": ["/ttHTobb_M125.../.../..."],
                "n_files": 123,
                "n_events": 456789,
            },
            "scale_up": {
                "keys": ["/ttHTobb_M125_scaleUP.../.../..."],
                "n_files": 100,
                "n_events": 40000,
            },
        },
    )

    d.info.keys()  # -> ["nominal", "scale_up"]
    d["nominal"].n_files  # -> 123
    d.n_files  # -> 123
    d["scale_up"].n_files  # -> 100
Members
- campaign
- type: Campaign, None
The Campaign object this dataset belongs to. When set, this dataset is also added to the dataset index of the campaign object.
- info
- type: dictionary
Mapping of shift names to DatasetInfo instances.
- keys
- type: list
- read-only
The dataset keys of the nominal DatasetInfo object.
- n_files
- type: integer
- read-only
The number of files of the nominal DatasetInfo object.
- n_events
- type: integer
- read-only
The number of events of the nominal DatasetInfo object.
- processes
- type: UniqueObjectIndex
- read-only
The UniqueObjectIndex of child processes.
- set_info(shift_name, info)
Sets a DatasetInfo object info for a given shift_name. Returns the object.
- get_info(shift_name)
Returns the DatasetInfo object for a given shift_name.
- add_process(*args, **kwargs)
Adds a child process to the processes index and returns it. See UniqueObjectIndex.add() for more info.
- clear_processes(context=None)
Removes all child processes from the processes index for context. When context is None, the default_context of the processes index is used.
- extend_processes(objs, context=None)
Adds multiple child processes to the processes index for context and returns the added objects in a list. When context is None, the default_context of the processes index is used.
- get_leaf_processes(context=None)
Recursively collects and returns all processes below the processes index for context that have no child processes themselves. When context is None, the default_context of the processes index is used.
- get_process(obj, deep=True, default=no_default, context=None)
Returns a child process given by obj, which might be a name, id, or an instance, from the processes index for context. If deep is True, the lookup is recursive. When no process is found, default is returned if it was set; otherwise, an error is raised. When context is None, the default_context of the processes index is used.
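The lookup rules of get_process() described above (match by name, id, or instance; optional recursion; a sentinel default that tells "no default given" apart from an explicit None) can be sketched as follows. This is a simplified stand-in and not the implementation used by order: Node, find, _no_default and _missing are hypothetical names, contexts are left out, and the ValueError raised on a failed lookup is an assumption.

```python
# Simplified stand-in for illustration only; Node, find, _no_default and
# _missing are hypothetical names and not part of the order API.

_no_default = object()  # sentinel: lets callers pass default=None explicitly
_missing = object()     # internal marker used while recursing


class Node:
    def __init__(self, name, id, children=()):
        self.name = name
        self.id = id
        self.children = list(children)


def find(parent, obj, deep=True, default=_no_default):
    """Return the child of *parent* matching *obj* (name, id, or instance)."""
    # first check the direct children
    for child in parent.children:
        if obj is child or obj == child.name or obj == child.id:
            return child
    # then, optionally, descend into the children of each child
    if deep:
        for child in parent.children:
            found = find(child, obj, deep=True, default=_missing)
            if found is not _missing:
                return found
    # nothing found: raise unless the caller provided a default
    if default is _no_default:
        raise ValueError("no process found for {!r}".format(obj))
    return default


ttH_bb = Node("ttH_bb", 11)
ttH = Node("ttH", 10, [ttH_bb])
dataset = Node("dataset", 1, [ttH])

find(dataset, "ttH")                               # direct child, by name
find(dataset, 11)                                  # nested child, by id
find(dataset, "ttH_bb", deep=False, default=None)  # -> None, no recursion
```

Using a dedicated sentinel instead of None as the "not set" marker keeps None available as a legitimate default value.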
- has_process(obj, deep=True, context=None)
Checks if the processes index for context contains an obj, which might be a name, id, or an instance. If deep is True, the lookup is recursive. When context is None, the default_context of the processes index is used.
- property has_processes
Returns True when this dataset has child processes, False otherwise.
- property is_leaf_process
Returns True when this dataset has no child processes, False otherwise.
- remove_process(obj, context=None, silent=False)
Removes a child process given by obj, which might be a name, id, or an instance, from the processes index for context and returns the removed object. When context is None, the default_context of the processes index is used. Unless silent is True, an error is raised if the object was not found. See UniqueObjectIndex.remove() for more info.
- walk_processes(context=None, depth_first=False, include_self=False)
Walks through the processes index for context and, per iteration, yields a child process, its depth relative to this dataset, and its child processes in a list that can be modified to alter the walking. When context is None, the default_context of the processes index is used. When context is all, all indices are traversed. When depth_first is True, the iteration is depth-first instead of the default breadth-first. When include_self is True, this instance is also yielded with a depth of 0.
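The traversal behavior described for walk_processes() (breadth-first by default, optional depth-first order, and a yielded child list that the caller may modify to steer the walk) can be sketched with a small generator. This is a simplified stand-in and not the implementation used by order: Node and walk are hypothetical names, and contexts are left out.

```python
# Simplified stand-in for illustration only; Node and walk are hypothetical
# names and not part of the order API. Contexts are left out.


class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def walk(root, depth_first=False, include_self=False):
    """Yield (node, depth, children) tuples for the nodes below *root*."""
    if include_self:
        lookup = [(root, 0)]
    else:
        lookup = [(child, 1) for child in root.children]
    while lookup:
        node, depth = lookup.pop(0)
        children = list(node.children)
        # the caller may modify *children* in place before resuming the walk
        yield node, depth, children
        pending = [(child, depth + 1) for child in children]
        if depth_first:
            lookup = pending + lookup  # visit the new children first
        else:
            lookup.extend(pending)     # visit the new children last


a = Node("a", [Node("a1"), Node("a2")])
b = Node("b", [Node("b1")])
root = Node("root", [a, b])

print([n.name for n, _, _ in walk(root)])
# -> ['a', 'b', 'a1', 'a2', 'b1']
print([n.name for n, _, _ in walk(root, depth_first=True)])
# -> ['a', 'a1', 'a2', 'b', 'b1']

# prune the subtree below "a" by emptying the yielded child list
visited = []
for node, depth, children in walk(root):
    visited.append(node.name)
    if node.name == "a":
        del children[:]
print(visited)
# -> ['a', 'b', 'b1']
```

Because the generator reads the yielded list only after control returns to it, removing entries from that list skips the corresponding subtrees.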
Class DatasetInfo
- class DatasetInfo(keys=None, n_files=-1, n_events=-1, tags=None, aux=None)
Bases: CopyMixin, AuxDataMixin, TagMixin
Container class holding information on particular dataset variations. Instances of this class are typically used in Dataset objects to store shift-dependent information, such as the number of files or events for a particular shift (e.g. nominal, scale_up, etc.).
Arguments
keys denote the identifiers or origins of a dataset. n_files and n_events can be used for further bookkeeping. tags are forwarded to the TagMixin, and aux to the AuxDataMixin.
Copy behavior
All attributes are copied. Also note the copy behavior of UniqueObject instances.
Members
- keys
- type: list
The dataset keys, e.g. ["/ttHTobb_M125.../.../..."].
- n_files
- type: integer
The number of files.
- n_events
- type: integer
The number of events.
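How the info mapping of a Dataset, its item access, and the forwarding of nominal properties relate to DatasetInfo can be sketched with two small stand-in classes. This is an illustration under stated assumptions and not the code provided by order: Info and MiniDataset are hypothetical names, the mixins, campaign and processes are omitted, and accepting plain dictionaries in set_info is an assumption made for brevity.

```python
# Simplified stand-in for illustration only; Info and MiniDataset are
# hypothetical names and not the classes provided by order.


class Info:
    """Holds shift-dependent bookkeeping, with the defaults listed above."""

    def __init__(self, keys=None, n_files=-1, n_events=-1):
        self.keys = list(keys or [])
        self.n_files = n_files
        self.n_events = n_events


class MiniDataset:
    NOMINAL = "nominal"

    def __init__(self, name, info=None, **kwargs):
        self.name = name
        self.info = {}
        for shift_name, obj in (info or {}).items():
            self.set_info(shift_name, obj)
        # without an explicit nominal entry, the remaining kwargs define it
        if self.NOMINAL not in self.info:
            self.set_info(self.NOMINAL, kwargs)

    def set_info(self, shift_name, info):
        # assumption: plain dicts are converted into Info objects
        if isinstance(info, dict):
            info = Info(**info)
        self.info[shift_name] = info
        return info

    def get_info(self, shift_name):
        return self.info[shift_name]

    def __getitem__(self, shift_name):
        return self.get_info(shift_name)

    # read-only properties forwarded from the nominal info object
    @property
    def keys(self):
        return self.info[self.NOMINAL].keys

    @property
    def n_files(self):
        return self.info[self.NOMINAL].n_files

    @property
    def n_events(self):
        return self.info[self.NOMINAL].n_events


d = MiniDataset("ttH_bb", n_files=123, n_events=456789)
print(list(d.info), d["nominal"].n_files, d.n_files)
# -> ['nominal'] 123 123

d2 = MiniDataset("ttH_bb", info={
    "nominal": {"n_files": 123, "n_events": 456789},
    "scale_up": {"n_files": 100, "n_events": 40000},
})
print(d2.n_files, d2["scale_up"].n_files)
# -> 123 100
```

Keeping the forwarded attributes as read-only properties means the nominal info object stays the single place where these values are stored.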