Date: Apr 06, 2020

Extracting distinguishing features for sex classification on fundus images from a neural network

Artificial intelligence, and neural networks in particular, have long been criticized for their 'black-box' nature. This was exemplified by the surprising result that a neural network can classify a person's sex from a picture of the retina. We showed that we were able to extract some of the distinguishing features by thoroughly analysing the neural network.

Whilst well researched and in ever more widespread use, convolutional neural networks (CNNs) have always suffered from a result-oriented structure: even when they perform very well on a task, only the final decision made by the CNN is easily accessible, not how that decision was reached. One example was a high-impact paper whose network could classify the sex of a person from a colour fundus photograph (CFP), a picture of the retina, with very high accuracy. However, no satisfactory explanation of how the CNN actually distinguished between the two classes could be given.

In our work, we showed that, using a combination of expert domain knowledge and algorithmic analysis of the CNN, we could extract some of the distinguishing features. We used occlusion sensitivity maps to find areas of high importance for the CNN's classification. Consulting with experts, we were able to narrow down possible candidates to the fovea and the optic disk. Comparing images, we found distinct vasculature patterns for males and females in the optic disk area; in particular, the angle between the superior and inferior nasal veins was narrower in female individuals.
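To illustrate the idea behind occlusion sensitivity maps, the sketch below slides a grey patch across an input image and records how much the predicted probability of the target class drops when each region is hidden; large drops mark regions the network relies on. This is a minimal illustration in PyTorch, not the exact code or model from our study; `model`, the patch size, and the stride are placeholder assumptions.

```python
import torch

def occlusion_sensitivity(model, image, target_class, patch=32, stride=16):
    """Slide a grey patch over `image` (C, H, W) and record how much the
    probability of `target_class` drops when each region is hidden."""
    model.eval()
    _, h, w = image.shape
    with torch.no_grad():
        base = torch.softmax(model(image.unsqueeze(0)), dim=1)[0, target_class]
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    heatmap = torch.zeros(rows, cols)
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.clone()
            occluded[:, y:y + patch, x:x + patch] = 0.5  # grey square hides this region
            with torch.no_grad():
                p = torch.softmax(model(occluded.unsqueeze(0)), dim=1)[0, target_class]
            heatmap[i, j] = base - p  # large drop = region important for the decision
    return heatmap
```

Regions with the largest probability drop can then be overlaid on the fundus image as a heatmap and inspected together with domain experts.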

Using this information, we set up the webpage eye2sex.com to test how applicable this rule was for experts and laypersons. We showed that people using this rule performed significantly better than chance (p=0.003). However, top performers achieved only around 70% accuracy, whilst the CNN achieved 82.9%. This is likely because the network uses additional information that is hard to access by hand, such as small colour differences caused by differing thicknesses of the retinal layers.
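As an illustration of how performance against chance can be assessed, the snippet below runs a one-sided binomial test of correct guesses against the 50% baseline. This is a generic sketch, not necessarily the test used in our analysis, and the counts are placeholders rather than study data.

```python
from scipy.stats import binomtest

# Placeholder counts, not data from the study.
n_correct = 130  # number of correct sex classifications by human raters
n_total = 200    # total number of rated images

# One-sided test: is the observed accuracy greater than chance (50%)?
result = binomtest(n_correct, n_total, p=0.5, alternative="greater")
print(f"accuracy = {n_correct / n_total:.2%}, p-value = {result.pvalue:.4f}")
```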
