virtex.utils.metrics
This module is a collection of metrics commonly used during pretraining and downstream evaluation. Two main classes here are:
- TopkAccuracy: used for ImageNet linear classification evaluation.
- CocoCaptionsEvaluator: used for caption evaluation (CIDEr and SPICE).
Parts of this module (tokenize(), cider() and spice()) are adapted from coco-captions evaluation code.
- class virtex.utils.metrics.TopkAccuracy(k: int = 1)[source]
Bases: object
Top-K classification accuracy. This class accumulates per-batch accuracy, which can be retrieved at the end of evaluation. Targets and predictions are assumed to be integers (long tensors).
If used in DistributedDataParallel, results need to be aggregated across GPU processes outside this class.
- Parameters
k – k for computing Top-K accuracy.
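A minimal usage sketch for an evaluation loop. It assumes the instance accumulates per-batch results by being called with prediction logits and integer targets, and that the running accuracy is read out through a get_metric(reset=...) method; these method names and the dummy tensors below are illustrative assumptions, so check the class source if your version differs.

```python
import torch

from virtex.utils.metrics import TopkAccuracy

# Track Top-5 accuracy across an evaluation run.
top5 = TopkAccuracy(k=5)

# Dummy stand-ins for real model outputs: logits of shape (batch, classes)
# and integer (long) targets, as expected by the class description above.
for _ in range(10):
    logits = torch.randn(32, 1000)
    labels = torch.randint(0, 1000, (32,))
    # Accumulate per-batch accuracy (assumed __call__ interface).
    top5(logits, labels)

# Read out the accumulated accuracy; reset=True clears counters for the next run
# (assumed method name). In DistributedDataParallel, aggregate across processes
# before reporting, as noted above.
accuracy = top5.get_metric(reset=True)
print(f"Top-5 accuracy: {accuracy}")
```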
- class virtex.utils.metrics.CocoCaptionsEvaluator(gt_annotations_path: str)[source]
Bases: object
A helper class to evaluate caption predictions in COCO format. This uses cider() and spice(), which exactly follow the original COCO Captions evaluation protocol.
- Parameters
gt_annotations_path – Path to ground truth annotations in COCO format (typically this would be the COCO Captions val2017 split).
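A minimal usage sketch. It assumes the evaluator exposes an evaluate() method that accepts predictions as a list of dicts in the standard COCO results format ({"image_id": int, "caption": str}) and returns CIDEr and SPICE scores; the method name, annotation path, image ids, and captions below are assumptions for illustration only.

```python
from virtex.utils.metrics import CocoCaptionsEvaluator

# Ground truth captions in COCO format (illustrative path).
evaluator = CocoCaptionsEvaluator("datasets/coco/annotations/captions_val2017.json")

# Predictions in COCO results format: one caption per image id.
predictions = [
    {"image_id": 139, "caption": "a person riding a horse on a beach"},
    {"image_id": 285, "caption": "a large brown bear sitting in the grass"},
]

# Assumed entry point: computes CIDEr and SPICE against the ground truth annotations.
metrics = evaluator.evaluate(predictions)
print(metrics)  # e.g. {"CIDEr": ..., "SPICE": ...}
```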
- virtex.utils.metrics.tokenize(image_id_to_captions: Dict[int, List[str]]) → Dict[int, List[str]][source]
Given a mapping of image id to a list of corresponding captions, tokenize captions in place according to the Penn Treebank Tokenizer. This method assumes the presence of the Stanford CoreNLP JAR file in the directory of this module.
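A small sketch of the expected input and output shapes for tokenize(): reference captions keyed by image id go in, and Penn Treebank-tokenized caption strings keyed by the same ids come out. The image ids, captions, and the commented output are illustrative assumptions; the call requires the Stanford CoreNLP JAR to be present alongside this module.

```python
from virtex.utils.metrics import tokenize

# Mapping of image id -> list of reference captions (illustrative values).
image_id_to_captions = {
    139: ["A person riding a horse on the beach.", "Someone rides a horse near the ocean."],
    285: ["A large brown bear sitting in the grass."],
}

# Tokenizes captions with the Penn Treebank tokenizer via the Stanford CoreNLP JAR.
tokenized = tokenize(image_id_to_captions)
# tokenized[139] might look like:
# ["a person riding a horse on the beach", "someone rides a horse near the ocean"]
```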