
Disease-specific variant pathogenicity prediction significantly improves variant interpretation in inherited cardiac conditions

Genetics in Medicine volume 23, pages 69–79 (2021)

Accurate discrimination of benign and pathogenic rare variation remains a priority for clinical genome interpretation. State-of-the-art machine learning variant prioritization tools are imprecise and ignore important parameters defining gene–disease relationships, e.g., distinct consequences of gain-of-function versus loss-of-function variants. We hypothesized that incorporating disease-specific information would improve tool performance.

We developed a disease-specific variant classifier, CardioBoost, that estimates the probability of pathogenicity for rare missense variants in inherited cardiomyopathies and arrhythmias. We assessed CardioBoost’s ability to discriminate known pathogenic from benign variants, prioritize disease-associated variants, and stratify patient outcomes.

Results

CardioBoost has high global discrimination accuracy (precision-recall area under the curve 0.91 for cardiomyopathies; 0.96 for arrhythmias), outperforming existing tools (4–24% improvement). CardioBoost obtains excellent accuracy (cardiomyopathies 90.2%; arrhythmias 91.9%) for variants classified with >90% confidence, and increases the proportion of variants classified with high confidence more than twofold compared with existing tools. Variants classified as disease-causing are associated with both disease status and clinical severity, including a 21% increased risk (95% confidence interval 11–29%) of severe adverse outcomes by age 60 in patients with hypertrophic cardiomyopathy.

Conclusions

A disease-specific variant classifier outperforms state-of-the-art genome-wide tools for rare missense variants in inherited cardiac conditions, highlighting broad opportunities for improved pathogenicity prediction through disease specificity.

The accurate prediction of the effect of a previously unseen genetic variant on disease risk is an unmet need in clinical genetics. According to guidelines developed by the American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) [1], computational prediction of variant pathogenicity is integrated as one line of supporting evidence to assess the clinical significance of genetic variation. Several tools have been developed to predict the effects of rare variants given multiple functional annotations to derive scores describing the likelihood of pathogenicity.

While existing genome-wide tools learn from large-scale data over the entire genome, they might also compromise the prediction accuracy for specific sets of genes and diseases [7] in the following ways. First, variation in a single gene can cause distinct phenotypes via different allelic mechanisms. Genome-wide tools that classify variants as deleterious or not, without reference to a specific disease or mechanism, may not perform as well as those that separate gene–disease relations since, for example, they do not distinguish between gain- and loss-of-function variants.
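The abstract reports two related quantities: accuracy for variants classified with >90% confidence, and the proportion of variants that reach that confidence. The sketch below shows one way such a summary could be computed from predicted probabilities of pathogenicity. It is an illustration only and not the authors' code: the function name `high_confidence_summary`, the symmetric cut-offs (pathogenic at or above the threshold, benign at or below one minus the threshold), and the choice to measure accuracy only among classified variants are all assumptions, and the example data are invented.

```python
from typing import List, Optional, Tuple


def high_confidence_summary(
    probs: List[float],
    labels: List[bool],
    threshold: float = 0.9,
) -> Tuple[float, Optional[float]]:
    """Summarise high-confidence calls (hypothetical helper, not from the paper).

    probs  -- predicted probability that each variant is pathogenic
    labels -- True if the variant is known pathogenic, False if known benign
    Returns (proportion of variants classified, accuracy among those classified).
    Accuracy is None when no variant reaches the confidence threshold.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one variant is required")

    classified = 0
    correct = 0
    for p, is_pathogenic in zip(probs, labels):
        if p >= threshold:
            call = True            # assumed rule: confident pathogenic call
        elif p <= 1.0 - threshold:
            call = False           # assumed rule: confident benign call
        else:
            continue               # indeterminate: left unclassified
        classified += 1
        if call == is_pathogenic:
            correct += 1

    proportion = classified / len(probs)
    accuracy = correct / classified if classified else None
    return proportion, accuracy


# Invented example: six variants, one of which stays indeterminate.
example_probs = [0.97, 0.92, 0.55, 0.05, 0.03, 0.95]
example_labels = [True, True, True, False, False, False]
proportion, accuracy = high_confidence_summary(example_probs, example_labels)
print(f"classified: {proportion:.2f}, accuracy among classified: {accuracy:.2f}")
```

Reporting both numbers together matters because a tool can achieve high accuracy among confident calls simply by leaving most variants indeterminate; the proportion classified exposes that trade-off.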
