Evaluation

We will use a single metric per task to rank submitted results. The evaluation code that we will use in the challenge is available on our GitHub page.

 Task 1: Nuclei instance segmentation and classification

We will use multi-class panoptic quality (`PQ`) to measure the performance of nuclear instance segmentation and classification. For each type `t`, the `PQ` is defined as:

$$
PQ_t = \frac{\sum_{(x, y) \in TP} IoU(x, y)}{|TP| + \frac{1}{2}|FP| + \frac{1}{2}|FN|}
$$

where `x` denotes a ground truth (GT) instance, `y` denotes a predicted instance, and IoU denotes intersection over union. Requiring IoU(`x`,`y`)>0.5 matches `x` and `y` uniquely. This unique matching therefore splits all available instances of type `t` within the dataset into matched pairs (TP), unmatched GT instances (FN) and unmatched predicted instances (FP). We then define the multi-class `PQ` (`mPQ`) as the task ranking metric, which averages the `PQ` over all `T` classes:

$$
mPQ = \frac{1}{T} \sum_{t=1}^{T} PQ_t
$$

Note that for `mPQ` we calculate the statistics over all images, which avoids problems when a particular class is not present in a patch. This differs from the `mPQ` calculation used in previous publications, such as PanNuke, MoNuSAC and the original Lizard paper, where the `PQ` is calculated for each image and for each class before the average is taken. Hence, for the purpose of this challenge, we refer to the metric as `mPQ`+.
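
The pooling described above can be illustrated with a minimal, dependency-free sketch. It is not the challenge's evaluation code: the function names (`instance_pixels`, `match_counts`, `mpq_plus`), the input format (a 2-D label map per class where 0 is background and instances do not overlap) and the choice to score a class with no instances at all as 0 are assumptions made for illustration.

```python
from collections import defaultdict


def instance_pixels(label_map):
    """Map each non-zero instance id in a 2-D label map to its set of pixels."""
    pixels = defaultdict(set)
    for r, row in enumerate(label_map):
        for c, inst_id in enumerate(row):
            if inst_id != 0:  # assumption: 0 marks background
                pixels[inst_id].add((r, c))
    return pixels


def match_counts(true_map, pred_map):
    """Return (tp, fp, fn, iou_sum) for one image and one class."""
    true_inst = instance_pixels(true_map)
    pred_inst = instance_pixels(pred_map)
    tp, iou_sum = 0, 0.0
    for x in true_inst.values():
        for y in pred_inst.values():
            iou = len(x & y) / len(x | y)
            if iou > 0.5:
                # With non-overlapping instances, IoU > 0.5 can hold for
                # at most one prediction, so the first hit is the match.
                tp += 1
                iou_sum += iou
                break
    return tp, len(pred_inst) - tp, len(true_inst) - tp, iou_sum


def mpq_plus(per_image_counts, num_classes):
    """Pool counts over all images, then compute PQ per class and average.

    per_image_counts: one dict per image, {class_index: (tp, fp, fn, iou_sum)}.
    """
    totals = [[0, 0, 0, 0.0] for _ in range(num_classes)]
    for counts in per_image_counts:
        for t, values in counts.items():
            for k, value in enumerate(values):
                totals[t][k] += value
    pq = []
    for tp, fp, fn, iou_sum in totals:
        denom = tp + 0.5 * fp + 0.5 * fn
        # Assumption: a class with no GT and no predictions scores 0.
        pq.append(iou_sum / denom if denom > 0 else 0.0)
    return sum(pq) / num_classes, pq
```

Because the counts are summed before any division, an image that lacks a given class simply contributes nothing to that class's totals instead of producing an undefined per-image score.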

In the GitHub repository, we also provide code to calculate binary panoptic quality, but it is not used to determine the leaderboard ranking. For binary panoptic quality, `PQ` is calculated per image and the image-level results are averaged.
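
For contrast with the pooled calculation, per-image averaging might look like the sketch below. The helper names (`pq_from_counts`, `mean_image_pq`) are hypothetical, the input is assumed to be one `(tp, fp, fn, iou_sum)` tuple per image with all nuclei treated as a single class, and skipping images that contain no GT and no predicted instances is an assumption rather than documented behaviour of the repository code.

```python
def pq_from_counts(tp, fp, fn, iou_sum):
    """PQ for one image, or None when the image has no instances at all."""
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom > 0 else None


def mean_image_pq(per_image_counts):
    """Average per-image PQ over the images where it is defined."""
    scores = [pq_from_counts(*counts) for counts in per_image_counts]
    scores = [s for s in scores if s is not None]  # assumption: skip empty images
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical counts for three images; the third one is empty.
example_counts = [(1, 1, 1, 0.8), (3, 0, 0, 2.7), (0, 0, 0, 0.0)]
binary_pq = mean_image_pq(example_counts)
```

Here every scored image carries equal weight regardless of how many nuclei it contains, which is the main practical difference from pooling counts over the whole dataset.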

 Task 2: Nuclear composition regression

For the second task, we will use the multi-class coefficient of determination to measure how well the predicted counts agree with the true counts. The statistic is calculated for each class independently and the results are then averaged. In particular, for each nuclear category `t` the coefficient of determination is defined as follows:

$$
R^2_t = 1 - \frac{RSS_t}{TSS_t}
$$

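Here, `RSS` stands for the sum of squares of residuals (the squared differences between true and predicted counts) and `TSS` stands for the total sum of squares (the squared differences between the true counts and their mean), both computed for category `t`.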
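
As a rough illustration of the calculation, the sketch below computes the statistic per class and averages the results. The function names and the input layout (a dict mapping each class label to a list of per-image counts) are assumptions for this example, and it presumes the true counts of every class vary across images so that `TSS` is non-zero.

```python
def r2_for_class(true_counts, pred_counts):
    """Coefficient of determination, 1 - RSS / TSS, for one nuclear category."""
    mean_true = sum(true_counts) / len(true_counts)
    rss = sum((y - p) ** 2 for y, p in zip(true_counts, pred_counts))
    tss = sum((y - mean_true) ** 2 for y in true_counts)
    return 1.0 - rss / tss  # assumption: tss > 0 for every class


def multi_class_r2(true_by_class, pred_by_class):
    """Average the per-class statistic over all classes."""
    scores = [
        r2_for_class(true_by_class[t], pred_by_class[t]) for t in true_by_class
    ]
    return sum(scores) / len(scores)


# Hypothetical counts for two illustrative classes over three images.
true_by_class = {"class_a": [1, 2, 3], "class_b": [1, 2, 3]}
pred_by_class = {"class_a": [1, 2, 3], "class_b": [2, 2, 2]}
score = multi_class_r2(true_by_class, pred_by_class)
```

Note that the statistic can be negative when the predictions fit worse than simply predicting the mean true count for every image.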