GestaltMatch: breaking the limits of rare disease matching using facial phenotypic descriptors


Recent advances in next-generation phenotyping (NGP) for syndromology, such as DeepGestalt, have learned phenotype representations of multiple disorders by training on thousands of patient photos. However, many Mendelian syndromes are still not represented by existing NGP tools because only a handful of patients have been diagnosed. Moreover, the current architecture for syndrome classification, e.g., in DeepGestalt, is trained “end-to-end”: photos of molecularly confirmed cases are presented to the network, and the activity of the output-layer node corresponding to that syndrome is maximized during training. This approach is not applicable to any syndrome that was not part of the training set, and it cannot explain similarities among patients. Therefore, we propose “GestaltMatch” as an extension of DeepGestalt that uses similarities in facial gestalt among patients to identify syndromic patients and thereby extend the coverage of NGP tools.


We compiled a dataset of 21,400 patients with 1,451 different rare disorders. For each individual, a frontal photo and the molecularly confirmed diagnosis were available. We considered the deep convolutional neural network (DCNN) in DeepGestalt as a composition of a feature encoder and a classifier. The last fully connected layer of the feature encoder was taken as the Facial Phenotypic Descriptor (FPD). We trained the DCNN on the patients’ frontal photos to optimize the FPD and to define a Clinical Face Phenotype Space (CFPS). The similarity between any two patients was quantified by the cosine distance between their FPDs in the CFPS.
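The distance computation described above can be illustrated with a minimal sketch. It assumes each FPD is available as a plain numeric vector; the function name `cosine_distance_matrix`, the three-dimensional toy vectors, and the variable names are hypothetical and chosen only for illustration (real descriptors would have many more dimensions and come from the trained encoder).

```python
import numpy as np

def cosine_distance_matrix(queries, gallery):
    """Pairwise cosine distance between rows of `queries` and rows of `gallery`."""
    # Normalize each descriptor to unit length so the dot product equals cosine similarity.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    # Cosine distance = 1 - cosine similarity.
    return 1.0 - q @ g.T

# Toy descriptors (hypothetical values): three gallery patients, one query patient.
fpd_gallery = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
])
fpd_query = np.array([[1.0, 0.02, 0.0]])

dist = cosine_distance_matrix(fpd_query, fpd_gallery)
# Gallery indices ordered from most to least similar to the query.
nearest = np.argsort(dist[0])
```

Here `nearest` orders the gallery patients by proximity to the query in the descriptor space; smaller distances mean more similar descriptors.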


Patients with similar syndromic phenotypes were located in close proximity in the CFPS. Ranking syndromes by distance in the CFPS, we first showed that GestaltMatch generalizes syndromic features better than a face recognition model that was trained only on healthy individuals. Moreover, we achieved 87% top-10 accuracy in identifying rare Mendelian diseases that were excluded from the training set. We further showed that the distinguishability of a syndromic disorder does not correlate with its prevalence.
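As a rough illustration of how syndromes might be ranked by distance and how a top-k accuracy could be computed, consider the sketch below. It assumes that each syndrome is scored by the distance to its closest gallery patient, which is one plausible ranking rule and not necessarily the exact procedure used here. The helper names `rank_syndromes` and `top_k_accuracy`, the labels "A" and "B", and all distance values are hypothetical.

```python
import numpy as np

def rank_syndromes(dist_row, gallery_labels):
    """Order syndromes by the distance of their closest gallery patient (assumed rule)."""
    best = {}
    for d, label in zip(dist_row, gallery_labels):
        if label not in best or d < best[label]:
            best[label] = d
    return sorted(best, key=best.get)

def top_k_accuracy(dist, gallery_labels, query_labels, k):
    """Fraction of queries whose true syndrome appears among the k closest syndromes."""
    hits = 0
    for row, truth in zip(dist, query_labels):
        if truth in rank_syndromes(row, gallery_labels)[:k]:
            hits += 1
    return hits / len(query_labels)

# Toy distances (hypothetical): rows are query patients, columns are gallery patients.
dist = np.array([
    [0.10, 0.40, 0.30],
    [0.50, 0.20, 0.60],
])
gallery_labels = ["A", "B", "A"]
query_labels = ["A", "A"]

top1 = top_k_accuracy(dist, gallery_labels, query_labels, k=1)
top2 = top_k_accuracy(dist, gallery_labels, query_labels, k=2)
```

In this toy example the first query is matched at rank 1 and the second only at rank 2, so `top1` is 0.5 and `top2` is 1.0. Because ranking depends only on distances to labeled gallery patients, a syndrome absent from encoder training can still be matched as long as at least one labeled patient with it is in the gallery.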


GestaltMatch enables matching novel phenotypes and thus complements related molecular approaches.

