

class mmeval.metrics.VOCMeanAP(iou_thrs: Union[float, List[float]] = 0.5, scale_ranges: Optional[List[Tuple]] = None, num_classes: Optional[int] = None, eval_mode: str = 'area', use_legacy_coordinate: bool = False, nproc: int = 4, drop_class_ap: bool = True, classwise: bool = False, **kwargs)

Pascal VOC evaluation metric.

This metric computes the VOC mAP (mean Average Precision) with the given IoU thresholds and scale ranges.

  • iou_thrs (float | List[float]) – IoU thresholds. Defaults to 0.5.

  • scale_ranges (List[tuple], optional) – Scale ranges for evaluating mAP. If not specified, all bounding boxes are included in the evaluation. Defaults to None.

  • num_classes (int, optional) – The number of classes. If None, it will be obtained from the ‘classes’ field in self.dataset_meta. Defaults to None.

  • eval_mode (str) – ‘area’ or ‘11points’. ‘area’ computes the area under the precision-recall curve; ‘11points’ averages the precision at recalls [0, 0.1, …, 1]. PASCAL VOC2007 uses ‘11points’ by default, while PASCAL VOC2012 uses ‘area’. Defaults to ‘area’.

  • use_legacy_coordinate (bool) – Whether to use the coordinate system of mmdet v1.x, in which width and height are calculated as ‘x2 - x1 + 1’ and ‘y2 - y1 + 1’ respectively. Defaults to False.

  • nproc (int) – Processes used for computing TP and FP. If nproc is less than or equal to 1, multiprocessing will not be used. Defaults to 4.

  • drop_class_ap (bool) – Whether to drop classes without ground truth when calculating the average precision for each class. Defaults to True.

  • classwise (bool) – Whether to return the computed results of each class. Defaults to False.

  • **kwargs – Keyword parameters passed to BaseMetric.
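The two eval_mode options can be illustrated with a small NumPy sketch. It assumes the per-class recall and precision arrays are already computed and ordered by descending score; the function name average_precision is hypothetical and this is not mmeval's own implementation.

```python
import numpy as np


def average_precision(recalls, precisions, eval_mode="area"):
    """Illustrative AP for one class; not mmeval's actual implementation."""
    recalls = np.asarray(recalls, dtype=float)
    precisions = np.asarray(precisions, dtype=float)
    if eval_mode == "area":
        # Pad the curve, then make precision monotonically non-increasing.
        mrec = np.concatenate(([0.0], recalls, [1.0]))
        mpre = np.concatenate(([0.0], precisions, [0.0]))
        mpre = np.maximum.accumulate(mpre[::-1])[::-1]
        # Sum the rectangle areas at the points where recall changes.
        idx = np.where(mrec[1:] != mrec[:-1])[0]
        return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))
    if eval_mode == "11points":
        # Average the best precision reachable at recall >= t,
        # for t in [0, 0.1, ..., 1].
        total = 0.0
        for t in np.arange(11) / 10.0:
            mask = recalls >= t
            total += precisions[mask].max() if mask.any() else 0.0
        return float(total / 11)
    raise ValueError(f"unknown eval_mode: {eval_mode!r}")


# Three detections ranked by score: TP, FP, TP, with two ground truths.
recalls = [0.5, 0.5, 1.0]
precisions = [1.0, 0.5, 2 / 3]
print(average_precision(recalls, precisions, "area"))
print(average_precision(recalls, precisions, "11points"))
```

The two modes generally give slightly different values on the same curve, which is why results reported under VOC2007 and VOC2012 conventions are not directly comparable.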


>>> import numpy as np
>>> from mmeval import VOCMeanAP
>>> num_classes = 4
>>> voc_map = VOCMeanAP(num_classes=4)
>>> def _gen_bboxes(num_bboxes, img_w=256, img_h=256):
...     # Randomly generate bounding boxes in 'xyxy' format.
...     x = np.random.rand(num_bboxes, ) * img_w
...     y = np.random.rand(num_bboxes, ) * img_h
...     w = np.random.rand(num_bboxes, ) * (img_w - x)
...     h = np.random.rand(num_bboxes, ) * (img_h - y)
...     return np.stack([x, y, x + w, y + h], axis=1)
>>> prediction = {
...     'bboxes': _gen_bboxes(10),
...     'scores': np.random.rand(10, ),
...     'labels': np.random.randint(0, num_classes, size=(10, ))
... }
>>> groundtruth = {
...     'bboxes': _gen_bboxes(10),
...     'labels': np.random.randint(0, num_classes, size=(10, )),
...     'bboxes_ignore': _gen_bboxes(5),
...     'labels_ignore': np.random.randint(0, num_classes, size=(5, ))
... }
>>> voc_map(predictions=[prediction, ], groundtruths=[groundtruth, ])  
{'AP50': ..., 'mAP': ...}

add(predictions: Sequence[Dict], groundtruths: Sequence[Dict]) → None

Add the intermediate results to self._results.

  • predictions (Sequence[dict]) –

    A sequence of dicts. Each dict represents the detection result for one image, with the following keys:

    • bboxes (numpy.ndarray): Shape (N, 4), the predicted bounding boxes of this image, in ‘xyxy’ format.

    • scores (numpy.ndarray): Shape (N, 1), the predicted scores of bounding boxes.

    • labels (numpy.ndarray): Shape (N, 1), the predicted labels of bounding boxes.

  • groundtruths (Sequence[dict]) –

    A sequence of dicts. Each dict represents the ground truth for one image, with the following keys:

    • bboxes (numpy.ndarray): Shape (M, 4), the ground truth bounding boxes of this image, in ‘xyxy’ format.

    • labels (numpy.ndarray): Shape (M, 1), the ground truth labels of bounding boxes.

    • bboxes_ignore (numpy.ndarray): Shape (K, 4), the ignored ground truth bounding boxes of this image, in ‘xyxy’ format.

    • labels_ignore (numpy.ndarray): Shape (K, 1), the ground truth ignored labels of bounding boxes.
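Before calling add, it can help to check that each image's dicts carry the documented keys and consistent sizes. The helper below is a hypothetical sanity check, not part of mmeval; it only requires one score and one label per box, so it accepts either 1-D or column-shaped arrays.

```python
import numpy as np


def check_image_inputs(prediction, groundtruth):
    """Hypothetical sanity check for one image; not part of mmeval."""

    def _check(boxes, per_box_arrays, name):
        boxes = np.asarray(boxes)
        if boxes.ndim != 2 or boxes.shape[1] != 4:
            raise ValueError(f"{name}: expected shape (N, 4), got {boxes.shape}")
        # 'xyxy' means (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1.
        if np.any(boxes[:, 2] < boxes[:, 0]) or np.any(boxes[:, 3] < boxes[:, 1]):
            raise ValueError(f"{name}: boxes are not in xyxy order")
        for arr in per_box_arrays:
            if np.asarray(arr).size != len(boxes):
                raise ValueError(f"{name}: need one entry per box")
        return len(boxes)

    n = _check(prediction["bboxes"],
               [prediction["scores"], prediction["labels"]], "prediction")
    m = _check(groundtruth["bboxes"], [groundtruth["labels"]], "groundtruth")
    k = _check(groundtruth["bboxes_ignore"],
               [groundtruth["labels_ignore"]], "groundtruth ignore")
    return n, m, k
```

The returned tuple holds the box counts N, M and K used in the shape descriptions above.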

calculate_class_tpfp(predictions: List[dict], groundtruths: List[dict], class_index: int, pool: Optional[multiprocessing.pool.Pool]) → Tuple

Calculate the tp and fp of the given class index.

  • predictions (List[dict]) – A list of dict. Each dict is the detection result of an image. Same as VOCMeanAP.add.

  • groundtruths (List[dict]) – A list of dict. Each dict is the ground truth of an image. Same as VOCMeanAP.add.

  • class_index (int) – The class index.

  • pool (Optional[Pool]) – An instance of multiprocessing.Pool. If None, multiprocessing is not used.


Returns

  • tp (numpy.ndarray): Shape (num_ious, num_scales, num_pred), the true positive flag of each predicted bbox for this class.

  • fp (numpy.ndarray): Shape (num_ious, num_scales, num_pred), the false positive flag of each predicted bbox for this class.

  • num_gts (numpy.ndarray): Shape (num_ious, num_scales), the number of ground truths.

Return type

tuple (tp, fp, num_gts)
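To make the tp and fp flags concrete, here is a simplified sketch of greedy matching for one class in one image. It assumes a single IoU threshold, no scale ranges and no ignored boxes, so it returns 1-D flags rather than the (num_ious, num_scales, num_pred) arrays described above. The names box_iou and tpfp_single_image are hypothetical; the extra term mirrors the use_legacy_coordinate convention of adding 1 to widths and heights.

```python
import numpy as np


def box_iou(box, boxes, use_legacy_coordinate=False):
    """IoU between one xyxy box and an (M, 4) array of xyxy boxes."""
    extra = 1.0 if use_legacy_coordinate else 0.0
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1 + extra, 0, None) * np.clip(y2 - y1 + extra, 0, None)
    area = (box[2] - box[0] + extra) * (box[3] - box[1] + extra)
    areas = (boxes[:, 2] - boxes[:, 0] + extra) * (boxes[:, 3] - boxes[:, 1] + extra)
    return inter / (area + areas - inter)


def tpfp_single_image(preds, gts, iou_thr=0.5, use_legacy_coordinate=False):
    """Greedy matching sketch.

    preds: (N, 5) rows of x1, y1, x2, y2, score. gts: (M, 4) xyxy boxes.
    """
    tp = np.zeros(len(preds), dtype=bool)
    fp = np.zeros(len(preds), dtype=bool)
    matched = np.zeros(len(gts), dtype=bool)
    # Visit predictions from highest to lowest score.
    for i in np.argsort(-preds[:, 4]):
        if len(gts) == 0:
            fp[i] = True
            continue
        ious = box_iou(preds[i, :4], gts, use_legacy_coordinate)
        j = int(np.argmax(ious))
        # A ground truth can be claimed by at most one prediction.
        if ious[j] >= iou_thr and not matched[j]:
            matched[j] = True
            tp[i] = True
        else:
            fp[i] = True
    return tp, fp
```

A second detection that overlaps an already matched ground truth is counted as a false positive, which is why duplicate detections lower precision.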

compute_metric(results: list) → dict

Compute the VOCMeanAP metric.


  • results (List[tuple]) – A list of tuples. Each tuple holds the prediction and ground truth of one image. This list has already been synced across all ranks.

Returns

The computed metric, with the following keys:

  • mAP, the AP averaged across all IoU thresholds and all classes.

  • AP{IoU}, the mAP at the specified IoU threshold.

  • mAP@{scale_range}, the mAP for the specified scale range.

  • classwise, the evaluation results of each class. Returned only if self.classwise is True.
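The relationship between the per-threshold keys and mAP can be sketched as follows. This is an illustration only: the function summarize and the ap_table layout are hypothetical, scale ranges are left out, and treating drop_class_ap as "exclude classes with zero ground truths from the mean" is an assumed reading of the parameter description. The AP50-style key follows the example output shown earlier.

```python
import numpy as np


def summarize(ap_table, iou_thrs, class_num_gts, drop_class_ap=True):
    """ap_table[i][c] is the AP of class c at iou_thrs[i] (hypothetical layout)."""
    ap_table = np.asarray(ap_table, dtype=float)
    keep = np.ones(ap_table.shape[1], dtype=bool)
    if drop_class_ap:
        # Assumed reading: ignore classes that have no ground truth at all.
        keep = np.asarray(class_num_gts) > 0
    if not keep.any():
        raise ValueError("no class left to average over")
    results = {}
    for thr, row in zip(iou_thrs, ap_table):
        # 0.5 -> 'AP50', 0.75 -> 'AP75'
        results[f"AP{int(round(thr * 100))}"] = float(row[keep].mean())
    # mAP averages the per-threshold values computed above.
    results["mAP"] = float(np.mean(list(results.values())))
    return results


print(summarize([[1.0, 0.5, 0.0], [0.5, 0.5, 0.0]], [0.5, 0.75], [3, 2, 0]))
```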

Return type

dict

get_class_gts(groundtruths: List[dict], class_index: int) → Tuple

Get the ground truth information of a certain class index.

  • groundtruths (list[dict]) – Same as VOCMeanAP.add.

  • class_index (int) – Index of a specific class.


Returns

  • class_gts (List[numpy.ndarray]): The gt bboxes of this class.

  • class_ignore_gts (List[numpy.ndarray]): The ignored gt bboxes of this class. This is necessary when counting tp and fp.

Return type

tuple (class_gts, class_ignore_gts)
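The per-class split of ground truths amounts to boolean masking on the label arrays. A minimal sketch, assuming one list entry per image and the dict keys documented under VOCMeanAP.add; class_gts_sketch is a hypothetical name and may differ from the library's own code.

```python
import numpy as np


def class_gts_sketch(groundtruths, class_index):
    """Split each image's ground truths into kept and ignored boxes of one class."""
    class_gts, class_ignore_gts = [], []
    for gt in groundtruths:
        # np.ravel lets the mask work for labels shaped (M,) or (M, 1).
        keep = np.ravel(gt["labels"]) == class_index
        keep_ignore = np.ravel(gt["labels_ignore"]) == class_index
        class_gts.append(gt["bboxes"][keep])
        class_ignore_gts.append(gt["bboxes_ignore"][keep_ignore])
    return class_gts, class_ignore_gts
```

Images without any box of the requested class contribute an empty (0, 4) array, so both lists stay aligned with the image order.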

get_class_predictions(predictions: List[dict], class_index: int) → List

Get the prediction results of a certain class index.

  • predictions (list[dict]) – Same as VOCMeanAP.add.

  • class_index (int) – Index of a specific class.


Returns

A list of predicted bboxes of this class. The score of each predicted bbox is appended after its coordinates.
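The returned layout, box coordinates followed by the score, can be pictured with a short sketch. It assumes one list entry per image and the dict keys documented under VOCMeanAP.add; class_predictions_sketch is a hypothetical name, not the library's code.

```python
import numpy as np


def class_predictions_sketch(predictions, class_index):
    """Return one (n, 5) array per image: x1, y1, x2, y2, score."""
    class_preds = []
    for pred in predictions:
        # np.ravel lets the mask work for labels shaped (N,) or (N, 1).
        keep = np.ravel(pred["labels"]) == class_index
        boxes = pred["bboxes"][keep]
        scores = np.ravel(pred["scores"])[keep]
        # Append the score as a fifth column after the coordinates.
        class_preds.append(np.concatenate([boxes, scores[:, None]], axis=1))
    return class_preds
```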

Return type

List

property num_classes: int

Returns the number of classes.

The number of classes should be set during initialization, otherwise it will be obtained from the ‘classes’ field in self.dataset_meta.


Returns

The number of classes.

Return type

int

Raises

RuntimeError – If num_classes is not set.
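The lookup order described for this property can be sketched with a small stand-in class. NumClassesSketch is hypothetical and exists only to illustrate the fallback from the constructor argument to dataset_meta and then to the RuntimeError; it is not the VOCMeanAP implementation.

```python
class NumClassesSketch:
    """Hypothetical stand-in for the lookup order described above."""

    def __init__(self, num_classes=None, dataset_meta=None):
        self._num_classes = num_classes
        self.dataset_meta = dataset_meta

    @property
    def num_classes(self):
        # 1. A value given at initialization wins.
        if self._num_classes is not None:
            return self._num_classes
        # 2. Otherwise fall back to the 'classes' field of dataset_meta.
        if self.dataset_meta and "classes" in self.dataset_meta:
            return len(self.dataset_meta["classes"])
        # 3. Nothing to go on.
        raise RuntimeError("num_classes is not set")


print(NumClassesSketch(dataset_meta={"classes": ("cat", "dog")}).num_classes)
```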
