Thu, 11/05/2020 - 11:12 - bioquicknews

On Day 3 (Thursday, October 29) of the American Society of Human Genetics (ASHG) 2020 Virtual Annual Meeting (https://www.ashg.org/meetings/2020meeting/), one of the most interesting presentations was on the subject of developmental stuttering. Douglas M. Shaw (a graduate student in the Vanderbilt Genetics Institute, Vanderbilt University) gave a talk entitled “Applying a Phenome Risk Classifying Model to Identify Undiagnosed Developmental Stuttering Cases in a Biobank for Genome Wide Association Analysis.” In the abstract to his talk, Shaw described developmental stuttering as a speech disorder characterized by a disturbance in fluency and speech pattern, with an adult prevalence of 1-3% in the US. Despite twin-based studies showing ~50% heritability, the genetic etiology of stuttering is still largely unknown. No population-based genome-wide association analysis (GWAS) has yielded variants that reach genome-wide significance, Shaw and colleagues wrote.

Shaw noted that within Vanderbilt’s Electronic Health Record (EHR)-linked biorepository (BioVU), only 142 of 92,762 genotyped samples carry a diagnostic ICD9/10 code for stuttering (ICD9-307.0, ICD10-F98.5, ICD9-315.35, ICD10-F80.81, ICD10-R47.82), suggesting that a large portion of people who stutter are not well captured within the EHR. To address this case acquisition issue and provide a sample set large enough to power a GWAS, Shaw and colleagues developed a phenome-risk classification machine learning algorithm to identify patients who are at high risk for developmental stuttering.
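The size of that shortfall can be checked from the figures quoted in the abstract. The short Python sketch below uses only those numbers. The variable names are illustrative, and applying the 1-3% US adult prevalence directly to the BioVU cohort is a simplifying assumption made here for comparison, not something stated in the talk.

```python
# Figures quoted from the abstract; variable names are illustrative.
coded_cases = 142            # BioVU samples with a stuttering ICD9/10 code
genotyped_samples = 92_762   # genotyped samples in BioVU

# Share of genotyped samples that carry a stuttering diagnosis code.
observed_prevalence = coded_cases / genotyped_samples

# Assumption: the 1-3% US adult prevalence applies to this cohort as-is.
expected_low = 0.01 * genotyped_samples
expected_high = 0.03 * genotyped_samples

print(f"Coded prevalence: {observed_prevalence:.2%}")
print(f"Expected cases at 1-3%: {expected_low:.0f} to {expected_high:.0f}")
```

Under that assumption, the coded prevalence comes out at roughly 0.15%, well below the 1-3% range, which is consistent with Shaw's point that most people who stutter are not flagged by diagnosis codes.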

Their model is a Gini Index-based decision tree classifier that uses phecodes found to be enriched in cases of stuttering within Vanderbilt’s ungenotyped EHR (n~2.7M) as prediction features and developmental stuttering status as the outcome variable. The model was trained and tested with a set of manually reviewed developmental stuttering cases as well as sex-, age-, race-, and ethnicity-matched controls, and resulted in an 83% positive prediction rate. Shaw and colleagues applied this model in BioVU and were able to identify 9,221 genotyped cases within the EHR, resulting in a developmental stuttering prevalence of 10%.

A total of 5,977 European cases were selected for a preliminary GWAS of model-identified developmental stuttering cases in BioVU. Shaw said the team compared its GWAS results with the GWAS summary statistics of a clinically diagnosed developmental stuttering sample set and found a significant concordance in the direction of effect for all tested variants genome-wide. Expanding the GWAS to include all 6,339 high-risk European samples resulted in a genome-wide significant locus on chromosome 2 (lead SNP=rs12613255, B=0.323; P value=1.31×10⁻⁸), 98 kb 5’ of the FAM94A gene. Association analysis in samples with African ancestry (N=1,853) resulted in a near genome-wide significant hit at rs7837758 (B=0.518; P value=5.07×10⁻⁸), an intronic variant found within the ZMAT4 gene on chromosome 8.

Shaw and colleagues concluded by saying that this method of case identification has facilitated the identification of the first significantly associated variant from a population-based analysis of developmental stuttering risk and provides a framework to improve power for analyses of phenotypes under-reported in electronic health records. Interestingly, Shaw noted that by adulthood almost all developmental stutterers (99%) no longer stutter. Photo is of basketball star Bill Walton (at right), who stuttered but worked at it and ultimately got into broadcasting.
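For readers unfamiliar with the Gini Index criterion mentioned in the model description, the sketch below shows how a decision tree picks a split by minimizing weighted Gini impurity. It is a minimal illustration in plain Python, not the authors' implementation: the function names, the two binary features, and the four-row toy data set are all hypothetical stand-ins for phecode indicators and case/control labels.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_gini(rows, labels, feature):
    """Weighted Gini impurity after splitting on one binary feature."""
    present = [y for x, y in zip(rows, labels) if x[feature]]
    absent = [y for x, y in zip(rows, labels) if not x[feature]]
    n = len(labels)
    return (len(present) / n) * gini_impurity(present) + \
           (len(absent) / n) * gini_impurity(absent)

def best_split(rows, labels):
    """Return the feature whose split gives the lowest weighted impurity."""
    return min(rows[0].keys(), key=lambda f: split_gini(rows, labels, f))

# Toy data: hypothetical binary features, NOT real phecodes or patients.
rows = [
    {"feature_a": 1, "feature_b": 0},
    {"feature_a": 1, "feature_b": 1},
    {"feature_a": 0, "feature_b": 1},
    {"feature_a": 0, "feature_b": 0},
]
labels = [1, 1, 0, 0]  # 1 = case, 0 = control

root_impurity = gini_impurity(labels)
chosen = best_split(rows, labels)
print(f"Root impurity: {root_impurity}, first split on: {chosen}")
```

In this toy set, `feature_a` separates cases from controls perfectly, so it is chosen over `feature_b`, which leaves both branches evenly mixed. A full classifier repeats this choice recursively on each branch. In practice a library implementation would normally be used rather than hand-written code.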

[ASHG abstract] [ASHG 2020 Virtual Annual Meeting]