DOTAMeanAP

class mmeval.metrics.DOTAMeanAP(iou_thrs: Union[float, List[float]] = 0.5, scale_ranges: Optional[List[Tuple]] = None, num_classes: Optional[int] = None, eval_mode: str = '11points', nproc: int = 4, drop_class_ap: bool = True, classwise: bool = False, **kwargs)

DOTA evaluation metric.

DOTA is a large-scale dataset for object detection in aerial images, introduced in https://arxiv.org/abs/1711.10398. This metric computes the DOTA mAP (mean Average Precision) for the given IoU thresholds and scale ranges.

Parameters

iou_thrs (float | List[float]) – IoU thresholds. Defaults to 0.5.
scale_ranges (List[tuple], optional) – Scale ranges for evaluating mAP. If not specified, all bounding boxes are included in the evaluation. Defaults to None.
num_classes (int, optional) – The number of classes. If None, it will be obtained from the ‘CLASSES’ field in self.dataset_meta. Defaults to None.
eval_mode (str) – Either ‘area’ or ‘11points’. ‘area’ computes the area under the precision-recall curve; ‘11points’ averages the precision at the 11 recall levels [0, 0.1, …, 1]. PASCAL VOC2007 uses ‘11points’ by default, while PASCAL VOC2012 uses ‘area’. Defaults to ‘11points’.
nproc (int) – Number of processes used for computing TP and FP. If nproc is less than or equal to 1, multiprocessing is not used. Defaults to 4.
drop_class_ap (bool) – Whether to drop classes without ground truth when calculating the average precision for each class. Defaults to True.
classwise (bool) – Whether to return the computed results of each class. Defaults to False.
**kwargs – Keyword parameters passed to BaseMetric.
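
The two eval_mode options can be illustrated with a minimal sketch of the standard PASCAL VOC definitions. This is not taken from the mmeval source: the function name average_precision and its arguments are hypothetical, and it assumes the recall and precision arrays for one class are already sorted by descending detection score.

```python
import numpy as np


def average_precision(recalls, precisions, mode="11points"):
    """Hypothetical helper: VOC-style AP for a single class.

    Assumes `recalls` and `precisions` are 1-D and ordered by
    descending detection score (recall is non-decreasing).
    """
    recalls = np.asarray(recalls, dtype=float)
    precisions = np.asarray(precisions, dtype=float)

    if mode == "11points":
        # Average the best precision reachable at each recall level
        # 0, 0.1, ..., 1; a level that is never reached contributes 0.
        ap = 0.0
        for thr in np.linspace(0.0, 1.0, 11):
            reached = recalls >= thr
            ap += precisions[reached].max() if reached.any() else 0.0
        return ap / 11.0

    if mode == "area":
        # Pad the curve, replace precision with its upper envelope
        # (non-increasing in recall), then sum the rectangles between
        # consecutive distinct recall values.
        mrec = np.concatenate(([0.0], recalls, [1.0]))
        mpre = np.concatenate(([0.0], precisions, [0.0]))
        mpre = np.maximum.accumulate(mpre[::-1])[::-1]
        idx = np.where(mrec[1:] != mrec[:-1])[0]
        return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))

    raise ValueError(f"Unknown mode: {mode!r}")
```

For a single detection that reaches recall 0.5 at precision 1.0, the two modes disagree: ‘11points’ counts six of the eleven recall levels, while ‘area’ integrates only up to recall 0.5.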

Examples

>>> import numpy as np
>>> from mmeval.metrics import DOTAMeanAP
>>> num_classes = 15
>>> dota_metric = DOTAMeanAP(num_classes=num_classes)
>>>
>>> def _gen_bboxes(num_bboxes, img_w=256, img_h=256):
...     # Randomly generate bounding boxes in 'xywha' format.
...     x = np.random.rand(num_bboxes, ) * img_w
...     y = np.random.rand(num_bboxes, ) * img_h
...     w = np.random.rand(num_bboxes, ) * (img_w - x)
...     h = np.random.rand(num_bboxes, ) * (img_h - y)
...     a = np.random.rand(num_bboxes, ) * np.pi / 2
...     return np.stack([x, y, w, h, a], axis=1)
>>> prediction = {
...     'bboxes': _gen_bboxes(10),
...     'scores': np.random.rand(10, ),
...     'labels': np.random.randint(0, num_classes, size=(10, ))
... }
>>> groundtruth = {
...     'bboxes': _gen_bboxes(10),
...     'labels': np.random.randint(0, num_classes, size=(10, )),
...     'bboxes_ignore': _gen_bboxes(5),
...     'labels_ignore': np.random.randint(0, num_classes, size=(5, ))
... }
>>> dota_metric(predictions=[prediction, ], groundtruths=[groundtruth, ])  
{'mAP@0.5': ..., 'mAP': ...}

add(predictions: Sequence[Dict], groundtruths: Sequence[Dict]) → None

Add the intermediate results to self._results.

Parameters

predictions (Sequence[Dict]) –
A sequence of dicts. Each dict represents the detection result for one image, with the following keys:
- bboxes (numpy.ndarray): Shape (N, 5) or shape (N, 8), the predicted bounding boxes of this image. The box format depends on predict_box_type. Details in Note.
- scores (numpy.ndarray): Shape (N, ), the predicted scores of bounding boxes.
- labels (numpy.ndarray): Shape (N, ), the predicted labels of bounding boxes.
groundtruths (Sequence[Dict]) –
A sequence of dicts. Each dict represents the ground truth for one image, with the following keys:
- bboxes (numpy.ndarray): Shape (M, 5) or shape (M, 8), the ground truth bounding boxes of this image. The box format depends on predict_box_type. Details in Note.
- labels (numpy.ndarray): Shape (M, ), the ground truth labels of bounding boxes.
- bboxes_ignore (numpy.ndarray): Shape (K, 5) or shape (K, 8), the ignored ground truth bounding boxes of this image. The box format depends on predict_box_type. Details in Note.
- labels_ignore (numpy.ndarray): Shape (K, ), the labels of the ignored ground truth bounding boxes.

Note

The box shape of predictions and groundtruths depends on predict_box_type. If predict_box_type is ‘rbox’, the box shape should be (N, 5), representing (x, y, w, h, angle); otherwise the box shape should be (N, 8), representing the four corner points (x1, y1, x2, y2, x3, y3, x4, y4).
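
To make the relation between the two box shapes concrete, the following sketch converts (N, 5) rotated boxes into (N, 8) corner points. It is an illustration only, not mmeval's implementation: the helper name rbox_to_qbox is hypothetical, and it assumes that (x, y) is the box center and that the angle is given in radians, neither of which is stated above.

```python
import numpy as np


def rbox_to_qbox(rboxes):
    """Hypothetical helper: (N, 5) rotated boxes -> (N, 8) corner points.

    Assumptions (not confirmed by the documentation above):
    - (x, y) is the box center,
    - angle is in radians and rotates the width axis away from +x.
    """
    rboxes = np.asarray(rboxes, dtype=float).reshape(-1, 5)
    cx, cy, w, h, angle = rboxes.T
    cos, sin = np.cos(angle), np.sin(angle)

    # Half-extent vectors along the rotated width and height axes.
    wx, wy = 0.5 * w * cos, 0.5 * w * sin
    hx, hy = -0.5 * h * sin, 0.5 * h * cos

    # Walk around the rectangle: (x1, y1), (x2, y2), (x3, y3), (x4, y4).
    return np.stack([
        cx - wx - hx, cy - wy - hy,
        cx + wx - hx, cy + wy - hy,
        cx + wx + hx, cy + wy + hy,
        cx - wx + hx, cy - wy + hy,
    ], axis=1)
```

Under these assumptions, an unrotated box (10, 10, 4, 2, 0) maps to the corners (8, 9), (12, 9), (12, 11), (8, 11).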