"Caudate Segmentation Evaluation (CAUSE07)"

We submitted results from our multi-structure, multi-atlas variant of FS+LDDMM (currently under review) to CAUSE07, the ongoing caudate segmentation evaluation. Researchers from all over the world can download the test data, run their caudate segmentation, and submit the results to the website, where manual segmentations (not available to the entrants) are used to calculate a performance score.

Our scores

Our method ranked #2 overall, with an overall score of 75.99. The score is an average of five sub-scores, one each for overlap error, volume difference, average surface distance, RMS surface distance, and max surface distance. For more details, please see below and the CAUSE07 results webpage.
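As a rough check of the averaging described above, the sketch below takes the five Test Dataset 2 sub-scores reported further down this page and computes their unweighted mean. The dictionary keys and the `final_score` helper are illustrative names, and the unweighted mean is an assumption based on the description here, not the official CAUSE07 scoring code.

```python
# Sub-scores for Test Dataset 2 as reported on this page (0-100 scale).
# Key names are illustrative only.
dataset2_scores = {
    "overlap_error": 82.42,
    "volume_difference": 84.18,
    "average_surface_distance": 78.86,
    "rms_surface_distance": 83.42,
    "max_surface_distance": 85.39,
}


def final_score(sub_scores):
    """Unweighted mean of the five sub-scores (assumed aggregation)."""
    return sum(sub_scores.values()) / len(sub_scores)


dataset2_final = round(final_score(dataset2_scores), 2)
print(dataset2_final)  # 82.85, matching the reported Final Score for Test Dataset 2
```

The same calculation on the rounded averages for Test Datasets 1 and 3 gives 72.17 and 79.90 against reported final scores of 72.19 and 79.78, so the official figures are probably aggregated per subject before averaging.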

Volume Correlation

The results also report the Pearson correlation of the volumes, which is not included in the scoring system. Our method currently has the highest correlation of all methods, with an average of 0.81. Notably, the method currently ranked #1 has an average correlation of 0.17, which means its segmentations could not be used reliably in a volumetric study.
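To make the correlation figure concrete, here is a minimal sketch of a Pearson correlation between manual and automated volumes. The volume lists are hypothetical values invented for illustration (they are not CAUSE07 data), and the comparison of automated against manual volumes is an assumption about how the evaluation computes this statistic.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical caudate volumes in mm^3, NOT taken from the evaluation.
manual = [3500.0, 3800.0, 4100.0, 4400.0, 4700.0]
tracking = [3450.0, 3820.0, 4050.0, 4480.0, 4650.0]   # follows the manual volumes
unrelated = [4100.0, 3600.0, 4500.0, 3900.0, 4200.0]  # plausible sizes, wrong ordering

r_tracking = pearson(manual, tracking)
r_unrelated = pearson(manual, unrelated)
print(f"tracking: {r_tracking:.2f}, unrelated: {r_unrelated:.2f}")
```

In this toy example both automated methods produce volumes in a realistic range, but only the first preserves which subjects have larger or smaller caudates, which is the property a volumetric study depends on.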

Results of the top 10 submissions, ranked by average Pearson correlation (as of May 24, 2009)

Team                 Submitted          Score  Position (by score)  Correlation
MIAL-SFU (our team)  March 12, 2009     75.99  2                    0.81
ISICAD               November 15, 2007  73.18  7                    0.76
Segmentation Team    March 10, 2009     75.19  3                    0.70
ISICAD               January 2, 2008    74.26  4                    0.67
Uath-UNC             October 23, 2007   71.17  9                    0.65
I3A(UZ)              December 11, 2007  74.08  5                    0.57
ISICAD               October 28, 2007   70.84  10                   0.48
LONI_2               February 16, 2008  73.38  6                    0.46
ISBE Manchester      November 13, 2007  71.45  8                    0.41
ISICAD               October 7, 2008    79.16  1                    0.17

Our results

Please see below for a detailed look at our automated segmentations (prior to binarization for submission).

Test Dataset 1

Healthy controls and subjects in a Schizotypal Personality Disorder (SPD) study, provided by the Psychiatry Neuroimaging Laboratory at the Brigham and Women's Hospital, Boston.

The manual delineations for this dataset included long, narrow caudate tails, which our method missed for some subjects. This explains the low score for max surface distance.
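The effect of a missed tail can be illustrated with a toy example: a thin tail adds few voxels, so the overlap changes little, but its far end lies a long way from the segmented body, so the maximum distance is large. The sketch below uses hypothetical 2D shapes in voxel units, assumes the common Jaccard-based definition of volumetric overlap error, and measures distances between all voxels, not extracted surfaces. It computes raw metrics only and does not attempt the conversion to 0-100 scores used by the evaluation.

```python
import math


def overlap_error(seg, ref):
    """Volumetric overlap error in percent: 100 * (1 - |A and B| / |A or B|)."""
    return 100.0 * (1.0 - len(seg & ref) / len(seg | ref))


def max_distance(seg, ref):
    """Symmetric maximum nearest-neighbour distance between two voxel sets."""
    def directed(a, b):
        # Largest distance from any voxel in a to its closest voxel in b.
        return max(min(math.dist(p, q) for q in b) for p in a)
    return max(directed(seg, ref), directed(ref, seg))


# Hypothetical toy shapes: a 10x10 body plus a 1-voxel-wide, 10-voxel-long tail.
body = {(x, y) for x in range(10) for y in range(10)}
tail = {(x, 4) for x in range(10, 20)}

reference = body | tail   # "manual" delineation including the tail
segmentation = body       # automated result that misses the tail

print(round(overlap_error(segmentation, reference), 2))  # 9.09
print(max_distance(segmentation, reference))             # 10.0
```

Missing the tail costs about 9% in overlap here, yet the maximum distance equals the full tail length, which is why a distance-based maximum penalizes this kind of error much more heavily than volume-based measures.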

Dataset Average Scores
Overlap Error Score: 85.37
Volume Difference Score: 80.44
Average Surface Distance Score: 79.28
RMS Surface Distance Score: 68.61
Max Surface Distance Score: 47.15
Final Score: 72.19
Subject Final Score (one per subject; the accompanying Left Caudate and Right Caudate images are not reproduced here):
66.21
63.65
76.55
65.61
66.65
63.89
67.08
86.51
88.81
70.58
85.13
62.16
63.56
84.26

Test Dataset 2

Five pediatric subjects (age 2-4 years) provided by the UNC autism research group, University of North Carolina at Chapel Hill.

Dataset Average Scores
Overlap Error Score: 82.42
Volume Difference Score: 84.18
Average Surface Distance Score: 78.86
RMS Surface Distance Score: 83.42
Max Surface Distance Score: 85.39
Final Score: 82.85
Subject Final Score (one per subject; the accompanying Left Caudate and Right Caudate images are not reproduced here):
77.39
85.00
87.15
80.20
84.53

Test Dataset 3

Five elderly subjects (age 60-75 years) provided by the UNC Parkinson research group, University of North Carolina at Chapel Hill.

Dataset Average Scores
Overlap Error Score: 81.60
Volume Difference Score: 72.96
Average Surface Distance Score: 78.94
RMS Surface Distance Score: 83.11
Max Surface Distance Score: 82.89
Final Score: 79.78
Subject Final Score (one per subject; the accompanying Left Caudate and Right Caudate images are not reproduced here):
77.89
81.68
82.01
78.86
78.47
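The per-subject final scores listed on this page can be pooled to check them against the dataset and overall scores. The sketch below assumes that each dataset's final score is the plain mean of its subject scores and that the overall score is the plain mean over all 24 test subjects. This is consistent with the numbers reported here but is not confirmed as the official aggregation. Variable names are illustrative.

```python
# Per-subject final scores as listed on this page.
dataset1 = [66.21, 63.65, 76.55, 65.61, 66.65, 63.89, 67.08,
            86.51, 88.81, 70.58, 85.13, 62.16, 63.56, 84.26]
dataset2 = [77.39, 85.00, 87.15, 80.20, 84.53]
dataset3 = [77.89, 81.68, 82.01, 78.86, 78.47]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


dataset_means = [round(mean(d), 2) for d in (dataset1, dataset2, dataset3)]
print(dataset_means)  # [72.19, 82.85, 79.78]

all_subjects = dataset1 + dataset2 + dataset3
overall = round(mean(all_subjects), 2)
print(len(all_subjects), overall)  # 24 75.99
```

Because Test Dataset 1 contributes 14 of the 24 subjects, its lower scores pull the overall figure below the simple average of the three dataset scores.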