Tracks

class evaldet.Tracks(ids, frame_nums, detections, classes=None, confs=None, zero_indexed=True)

A class representing objects’ tracks in a MOT setting.

It can read the following MOT file formats:

MOT format (as described here)
MOT ground truth format (as described here)
CVAT’s version of the MOT format (as described here)
CVAT for Video format (as described here)
UA-DETRAC XML format (you can download an example here)

Internally, all the attributes are saved as a single numpy array, sorted by frame number. This enables easy access, as well as easy conversion to and from formats that do not store detections by frame (but by track).

The frame numbers will be zero-indexed internally, so for the MOT files 1 will be subtracted from all frame numbers.
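As a rough illustration of the layout described above, the sketch below flattens track-major data into rows sorted by zero-indexed frame number. It uses plain Python tuples rather than the numpy array the class keeps internally, and the names `tracks_by_id` and `to_frame_major` are hypothetical, not part of evaldet.

```python
# Hypothetical track-major input: each track id maps to (frame, box) pairs,
# with 1-based frame numbers as in MOT files.
tracks_by_id = {
    7: [(1, (10.0, 20.0, 30.0, 40.0)), (2, (12.0, 21.0, 30.0, 40.0))],
    3: [(2, (50.0, 60.0, 15.0, 25.0))],
}


def to_frame_major(tracks):
    """Flatten per-track data into rows sorted by zero-indexed frame number."""
    rows = [
        (frame - 1, track_id, box)  # subtract 1 to make frames zero-indexed
        for track_id, items in tracks.items()
        for frame, box in items
    ]
    # Sort by frame first, then by id for a deterministic order within a frame.
    rows.sort(key=lambda row: (row[0], row[1]))
    return rows


rows = to_frame_major(tracks_by_id)
```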

Parameters

ids (Union[List[int], numpy.ndarray[Any, numpy.dtype[numpy.int32]]]) –
frame_nums (Union[List[int], numpy.ndarray[Any, numpy.dtype[numpy.int32]]]) –
detections (Union[List[numpy.ndarray[Any, numpy.dtype[numpy.float32]]], numpy.ndarray[Any, numpy.dtype[numpy.float32]]]) –
classes (Optional[Union[List[int], numpy.ndarray[Any, numpy.dtype[numpy.int32]]]]) –
confs (Optional[Union[List[float], numpy.ndarray[Any, numpy.dtype[numpy.float32]]]]) –
zero_indexed (bool) –

Return type

None

__contains__(idx)

Whether the frame idx is present in the collection.

Return type: bool
Parameters: idx (int) –

__getitem__(idx: int) → evaldet.tracks.FrameTracks

__getitem__(idx: slice) → Tracks

Return type: Union[FrameTracks, Tracks]

__len__()

Return type: int

property all_classes: Set[int]

Get a set of all classes in the collection.

Return type: Set[int]

property classes: numpy.ndarray[Any, numpy.dtype[numpy.int32]]

Return type: ndarray[Any, dtype[int32]]

property confs: numpy.ndarray[Any, numpy.dtype[numpy.float32]]

Return type: ndarray[Any, dtype[float32]]

property detections: numpy.ndarray[Any, numpy.dtype[numpy.float32]]

Return type: ndarray[Any, dtype[float32]]

filter(filter)

Filter the tracks using a boolean mask.

This method will filter all attributes according to the mask provided.

Parameters: filter (ndarray) – A boolean array with the same length as ids and the other attributes.
Return type: None
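The idea of masking all attributes together can be sketched with plain lists. This is only an illustration: the real method operates on numpy arrays (where the equivalent of each step is `array[mask]`) and, since it returns None, presumably updates the object in place. The sample values are made up.

```python
from itertools import compress

# Parallel attributes: one entry per detection.
ids = [1, 2, 1, 3]
frame_nums = [0, 0, 1, 1]
confs = [0.9, 0.4, 0.8, 0.3]

# Boolean mask with one entry per detection: keep confident detections only.
mask = [conf >= 0.5 for conf in confs]

# Apply the same mask to every attribute so that they stay aligned.
ids = list(compress(ids, mask))
frame_nums = list(compress(frame_nums, mask))
confs = list(compress(confs, mask))
```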

property frame_nums: numpy.ndarray[Any, numpy.dtype[numpy.int32]]

Return type: ndarray[Any, dtype[int32]]

property frames: Set[int]

Get the set of all frame numbers in the collection.

Return type: Set[int]

classmethod from_csv(csv_file, fieldnames, zero_indexed=True)

Get detections from a CSV file.

The CSV file should have a normal comma (,) as a separator, and should not include a header.

Parameters

csv_file (Union[str, Path]) – Path to the CSV file.
fieldnames (List[str]) – The names of the fields, in the order that they appear in the file. This will be passed to csv.DictReader. The following names will be used (others will be disregarded):
    - xmin
    - ymin
    - height
    - width
    - conf: the confidence of the item
    - class: the class label of the item
    - id: the id of the item
    - frame: the frame number
zero_indexed (bool) – Whether the frame numbers are zero-indexed. If not, they are assumed to be 1-indexed, and 1 will be subtracted from all frame numbers to make them zero-indexed.

Return type

~TracksType
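To show how a fieldnames list maps onto a header-less file, here is a minimal sketch using csv.DictReader directly. The sample data and the shape of the `rows` dictionaries are assumptions for illustration; they do not reflect how evaldet stores the result. The input is treated as 1-indexed, which corresponds to zero_indexed=False.

```python
import csv
import io

# Hypothetical file contents: comma-separated, no header row.
data = "1,5,10.0,20.0,30.0,40.0,0.9,2\n2,5,11.0,21.0,30.0,40.0,0.8,2\n"

# Column names, in the order in which they appear in the file.
fieldnames = ["frame", "id", "xmin", "ymin", "width", "height", "conf", "class"]

rows = []
for record in csv.DictReader(io.StringIO(data), fieldnames=fieldnames):
    rows.append(
        {
            "frame": int(record["frame"]) - 1,  # 1-indexed input -> zero-indexed
            "id": int(record["id"]),
            "box": tuple(
                float(record[key]) for key in ("xmin", "ymin", "width", "height")
            ),
            "conf": float(record["conf"]),
            "class": int(record["class"]),
        }
    )
```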

classmethod from_cvat_video(file_path, classes_list)

Creates a Tracks object from a detections file in the CVAT for Video XML format.

Here’s what this file might look like:

<annotations>
    <version>1.1</version>
    <meta>
        <!-- lots of non-relevant metadata -->
    </meta>
    <track id="0" label="Car" source="manual">
        <box frame="659" outside="0" occluded="0" keyframe="1" xtl="323.83" ytl="104.06" xbr="367.60" ybr="139.49" z_order="-1"> </box>
        <box frame="660" outside="0" occluded="0" keyframe="1" xtl="320.98" ytl="105.24" xbr="365.65" ybr="140.95" z_order="0"> </box>
    </track>
    <track id="1" label="Car" source="manual">
        <box frame="659" outside="0" occluded="0" keyframe="1" xtl="273.10" ytl="88.77" xbr="328.69" ybr="113.09" z_order="1"> </box>
        <box frame="660" outside="0" occluded="0" keyframe="1" xtl="273.10" ytl="88.88" xbr="328.80" ybr="113.40" z_order="0"> </box>
    </track>
    <track id="2" label="Car" source="manual">
        <box frame="659" outside="0" occluded="0" keyframe="1" xtl="375.24" ytl="80.43" xbr="401.65" ybr="102.67" z_order="0"> </box>
        <box frame="660" outside="0" occluded="0" keyframe="1" xtl="374.69" ytl="80.78" xbr="401.09" ybr="103.01" z_order="0"> </box>
    </track>
    <track id="3" label="Car" source="manual">
        <box frame="699" outside="0" occluded="0" keyframe="1" xtl="381.50" ytl="79.04" xbr="405.12" ybr="99.19" z_order="0"> </box>
        <box frame="700" outside="0" occluded="0" keyframe="1" xtl="380.94" ytl="79.60" xbr="404.56" ybr="99.75" z_order="0"> </box>
    </track>
</annotations>

All attributes of each detection will be ignored, except for label (in the track object), which will be used for the class values. As this attribute usually contains string values, you also need to provide classes_list - a list of all possible class values. The class attribute will then be replaced by the index of the label in this list.

Elements with “outside=1” will be ignored.

Parameters

file_path (Union[Path, str]) – Path where the detections file is located
classes_list (List[str]) – The list of all possible class values. The values from that attribute in the file will then be replaced by the index of that value in this list.

Return type

~TracksType
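The label-to-index replacement and the skipping of outside="1" boxes can be sketched with the standard library's ElementTree. This is an illustration of the rules stated above, not evaldet's implementation; the trimmed XML sample and the tuple layout of `detections` are assumptions, and the box corners are kept exactly as they appear in the file.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical sample in the CVAT for Video layout.
xml_text = """
<annotations>
    <track id="0" label="Car">
        <box frame="659" outside="0" xtl="323.83" ytl="104.06" xbr="367.60" ybr="139.49"/>
        <box frame="660" outside="1" xtl="320.98" ytl="105.24" xbr="365.65" ybr="140.95"/>
    </track>
    <track id="1" label="Bus">
        <box frame="659" outside="0" xtl="273.10" ytl="88.77" xbr="328.69" ybr="113.09"/>
    </track>
</annotations>
"""
classes_list = ["Car", "Bus"]

detections = []
for track in ET.fromstring(xml_text.strip()).iter("track"):
    # The string label is replaced by its index in classes_list.
    class_idx = classes_list.index(track.get("label"))
    for box in track.iter("box"):
        if box.get("outside") == "1":
            continue  # boxes marked as outside are ignored
        corners = tuple(float(box.get(key)) for key in ("xtl", "ytl", "xbr", "ybr"))
        detections.append(
            (int(box.get("frame")), int(track.get("id")), class_idx, corners)
        )
```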

classmethod from_mot(file_path)

Creates a Tracks object from a detections file in the MOT format.

The format should look like this:

<frame>, <id>, <xmin>, <ymin>, <width>, <height>, <conf>, <x>, <y>, <z>

Note that all values above are expected to be numeric - string values will cause an error. The values for x, y and z will be ignored.

The frame numbers will be zero-indexed internally, so 1 will be subtracted from all frame numbers.

Parameters: file_path (Union[Path, str]) – Path where the detections file is located. The file should be in the format described above, and should not have a header.
Return type: ~TracksType
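A single line in this format could be parsed as in the sketch below. The helper name `parse_mot_line` and the sample values are hypothetical; the sketch only mirrors the rules stated above (numeric fields, x, y and z ignored, 1 subtracted from the frame number).

```python
def parse_mot_line(line):
    """Parse one MOT line into (frame, id, box, conf), ignoring x, y and z."""
    # Every field must be numeric; float() raises ValueError otherwise.
    fields = [float(value) for value in line.split(",")]
    frame = int(fields[0]) - 1  # MOT frames are 1-indexed
    track_id = int(fields[1])
    xmin, ymin, width, height, conf = fields[2:7]
    return frame, track_id, (xmin, ymin, width, height), conf


parsed = parse_mot_line("1, 3, 794.27, 247.59, 71.245, 174.88, 0.95, -1, -1, -1")
```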

classmethod from_mot_cvat(file_path)

Creates a Tracks object from a detections file in CVAT’s MOT format.

The format should look like this:

<frame>, <id>, <xmin>, <ymin>, <width>, <height>, <not ignored>, <class>, <visibility>, <skipped>

Note that all values above are expected to be numeric - string values will cause an error. The last two elements (visibility and skipped) are optional. The values for not ignored, visibility and skipped will be ignored.

The frame numbers will be zero-indexed internally, so 1 will be subtracted from all frame numbers.

Parameters: file_path (Union[Path, str]) – Path where the detections file is located. The file should be in the format described above, and should not have a header.
Return type: Tracks
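Because the last two fields are optional, a line may have 8, 9 or 10 values. A minimal sketch of handling that, with a hypothetical helper `parse_cvat_mot_line` and made-up sample lines, might look as follows. The class value is read as-is; whether evaldet shifts it in any way is not stated here.

```python
def parse_cvat_mot_line(line):
    """Parse one CVAT MOT line into (frame, id, box, class)."""
    fields = [float(value) for value in line.split(",")]
    if not 8 <= len(fields) <= 10:
        raise ValueError(f"expected 8 to 10 fields, got {len(fields)}")
    frame = int(fields[0]) - 1  # frames are 1-indexed in the file
    track_id = int(fields[1])
    box = tuple(fields[2:6])  # xmin, ymin, width, height
    # fields[6] is <not ignored>; it and any trailing fields are skipped.
    class_label = int(fields[7])
    return frame, track_id, box, class_label


short = parse_cvat_mot_line("1,2,10,20,30,40,1,3")
full = parse_cvat_mot_line("1,2,10,20,30,40,1,3,1.0,0")
```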

classmethod from_mot_gt(file_path)

Creates a Tracks object from a detections file in the MOT ground truth format. This format has some more information compared to the normal MOT format.

The format should look like this:

<frame>, <id>, <xmin>, <ymin>, <width>, <height>, <conf>, <class>, <visibility>

Note that all values above are expected to be numeric - string values will cause an error. The value for visibility will be ignored.

The frame numbers will be zero-indexed internally, so 1 will be subtracted from all frame numbers.

Parameters: file_path (Union[Path, str]) – Path where the detections file is located. The file should be in the format described above, and should not have a header.
Return type: ~TracksType

classmethod from_parquet(file_path)

Read the tracks from a parquet file.

The file should have the following columns: <frame>, <id>, <xmin>, <ymin>, <width>, <height>, <conf>, <class>

Parameters: file_path (Union[Path, str]) – Path where the detections file is located
Return type: ~TracksType

classmethod from_ua_detrac(file_path, classes_attr_name=None, classes_list=None)

Creates a Tracks object from a detections file in the UA-DETRAC XML format.

Here’s what this file might look like:

<sequence name="MVI_20033">
    <sequence_attribute camera_state="unstable" sence_weather="sunny"/>
    <ignored_region>
        <box height="53.75" left="458.75" top="0.5" width="159.5"/>
    </ignored_region>
    <frame density="4" num="1">
        <target_list>
            <target id="1">
                <box height="71.46" left="256.88" top="201.1" width="67.06"/>
                <attribute color="Multi" orientation="315" speed="1.0394" trajectory_length="91" truncation_ratio="0" vehicle_type="Taxi"/>
            </target>
        </target_list>
    </frame>
    <frame density="2" num="2">
        <target_list>
            <target id="2">
                <box height="32.44999999999999" left="329.27" top="96.65" width="56.53000000000003"/>
                <attribute color="Multi" orientation="315" speed="1.0394" trajectory_length="3" truncation_ratio="0" vehicle_type="Car"/>
            </target>
            <target id="4">
                <box height="122.67000000000002" left="0.0" top="356.7" width="76.6"/>
                <attribute color="Multi" orientation="315" speed="1.0394" trajectory_length="1" truncation_ratio="0" vehicle_type="Car"/>
            </target>
        </target_list>
    </frame>
</sequence>

The ignored_region node will not be taken into account - if you want some detections to be ignored, you need to filter them prior to the creation of the file.

All attributes of each detection will be ignored, except for the one designated by classes_attr_name (for example, in the original UA-DETRAC this could be "vehicle_type"). This attribute then provides the values for the classes attribute. As it usually contains string values, you also need to provide classes_list - a list of all possible class values. The class attribute will then be replaced by the index of the label in this list.

Parameters

file_path (Union[Path, str]) – Path where the detections file is located
classes_attr_name (Optional[str]) – The name of the attribute to be used for the classes attribute. If provided, classes_list must be provided as well.
classes_list (Optional[List[str]]) – The list of all possible class values - must be provided if classes_attr_name is provided. The values from that attribute in the file will then be replaced by the index of that value in this list.

Return type

~TracksType
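The role of classes_attr_name and classes_list can be illustrated with a short ElementTree sketch over a trimmed sample. This is not evaldet's implementation: the tuple layout of `detections` is an assumption, and the frame numbers are kept exactly as they appear in the `num` attribute, with no index shift applied.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical sample in the UA-DETRAC layout.
xml_text = """<sequence name="MVI_20033">
    <frame density="1" num="1">
        <target_list>
            <target id="1">
                <box height="71.46" left="256.88" top="201.1" width="67.06"/>
                <attribute vehicle_type="Taxi"/>
            </target>
        </target_list>
    </frame>
    <frame density="1" num="2">
        <target_list>
            <target id="2">
                <box height="32.45" left="329.27" top="96.65" width="56.53"/>
                <attribute vehicle_type="Car"/>
            </target>
        </target_list>
    </frame>
</sequence>"""

classes_attr_name = "vehicle_type"
classes_list = ["Car", "Taxi"]

detections = []
for frame in ET.fromstring(xml_text).iter("frame"):
    frame_num = int(frame.get("num"))  # kept as in the file
    for target in frame.iter("target"):
        box = target.find("box")
        # Look up the designated attribute and replace it by its list index.
        label = target.find("attribute").get(classes_attr_name)
        coords = tuple(
            float(box.get(key)) for key in ("left", "top", "width", "height")
        )
        detections.append(
            (frame_num, int(target.get("id")), classes_list.index(label), coords)
        )
```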

property ids: numpy.ndarray[Any, numpy.dtype[numpy.int32]]

Return type: ndarray[Any, dtype[int32]]

property ids_count: Dict[int, int]

Get the number of frames that each id is present in.

Return type: Dict[int, int]
Returns: A dictionary where keys are the track ids, and values are the numbers of frames they appear in.
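Conceptually, this is a count of how often each id occurs among the detections, assuming an id appears at most once per frame. A minimal sketch with made-up ids:

```python
from collections import Counter

# One entry per detection; an id is assumed to appear at most once per frame.
ids = [1, 2, 1, 3, 1, 2]

# Map each track id to the number of frames it appears in.
ids_count = dict(Counter(ids))
```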

to_csv(dirname, labels)

Export detections to a simple CSV format. The format consists of two files: dets.csv, containing the detections, and labels.txt, which contains the names of the labels (corresponding to label indices in dets.csv). The rows in dets.csv have the following format:

<frame>, <id>, <x_min>, <y_min>, <width>, <height>, <class>, <conf>

Note that frame and class are both 0 indexed.

Parameters

dirname (Union[Path, str]) – The name of the directory to save to - will be created if it doesn’t already exist.
labels (Sequence[str]) – A list/tuple of label names. The length should be at least the maximum label index + 1 (the first label corresponds to the label at index 0).

Return type

None
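The two-file layout can be sketched with the standard library as follows. The helper `write_simple_csv` and the sample rows are hypothetical, and writing one label per line in labels.txt is an assumption about that file's layout. The check on `labels` reflects that every class index in dets.csv needs a matching entry in the label list.

```python
import csv
import tempfile
from pathlib import Path


def write_simple_csv(dirname, rows, labels):
    """Write rows to dets.csv and label names to labels.txt in dirname."""
    max_class = max(row[6] for row in rows)  # <class> is the 7th column
    if len(labels) < max_class + 1:
        raise ValueError("labels must cover every class index in the rows")
    out_dir = Path(dirname)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the directory if needed
    with open(out_dir / "dets.csv", "w", newline="") as f:
        csv.writer(f).writerows(rows)
    # Assumption: one label name per line, in index order.
    (out_dir / "labels.txt").write_text("\n".join(labels) + "\n")


# <frame>, <id>, <x_min>, <y_min>, <width>, <height>, <class>, <conf>
rows = [
    (0, 1, 10.0, 20.0, 30.0, 40.0, 0, 0.9),
    (1, 1, 11.0, 21.0, 30.0, 40.0, 1, 0.8),
]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "export"
    write_simple_csv(target, rows, ["Car", "Bus"])
    written = (target / "dets.csv").read_text().splitlines()
    label_lines = (target / "labels.txt").read_text().splitlines()
```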

to_cvat_video(filename, labels, image_size=(1, 1))

Export detections to CVAT for Video 1.1 format.

More information on the format can be found here.

Parameters

filename (Union[Path, str]) – The name of the file to save to - should have an .xml suffix.
labels (Sequence[str]) – A list/tuple of label names. The length should be at least the maximum label index + 1 (the first label corresponds to the label at index 0).
image_size (Tuple[int, int]) – The size of the image in the [w, h] format, in pixels.

Return type

None

to_parquet(file_path)

Export detections to parquet format.

Parameters

file_path (Union[pathlib.Path, str]) – Relative or absolute file name to save to.

Return type

None