A proposed new metric to quantify genetic similarity between an individual and the training population
![Illustration of population-level versus individual-level PGS accuracy. a, Discrete labeling of GIA with PCA-based clustering. Each dot represents an individual. The circles represent arbitrary boundaries imposed on the genetic ancestry continuum to divide individuals into different GIA clusters. The color represents the GIA cluster label. The gray dots are individuals who are left unclassified. b, Schematic illustrating the variation of population-level PGS accuracy across clusters. The box plot represents the PGS accuracy (for example, R²) measured at the population level. The question mark emphasizes that the PGS accuracy for unclassified individuals is unknown owing to the lack of a reference group. Gray dashed lines emphasize the categorical nature of GIA clustering. c, Continuous labeling of everyone’s unique position on the genetic ancestry continuum with a PCA-based GD. The GD is defined as the Euclidean distance of an individual’s genotype from the center of the training data when projected on the PC space of training genotype data. Everyone has their own unique GD, d_i, and individual PGS accuracy, r²_i. d, Individual-level PGS accuracy decays along the genetic ancestry continuum. Each dot represents an individual and its color represents the assigned GIA label. Individuals labeled with the same ancestry spread out on the genetic ancestry continuum, and there are no clear boundaries between GIA clusters. This figure is illustrative and does not involve any real or simulated data. Credit: Nature (2023). DOI: 10.1038/s41586-023-06079-4](https://scx1.b-cdn.net/csz/news/800a/2023/a-proposed-new-metric.jpg)
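A team of bioinformaticians affiliated with multiple institutions in the U.S. and with Aarhus University in Denmark has proposed a new metric to quantify the genetic similarity between an individual and the population used for training. Their study is published in the journal Nature. The editors at Nature have also published a Research Briefing in the same issue outlining the team's work.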
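Polygenic scores (PGSs) are tools for estimating the probability of a given trait or disease based on genetic background. A PGS is typically calculated by adding up the effects of many common genetic variants associated with the trait of interest. But the accuracy of the resulting score depends on how well the genetic variants used to construct it capture the genetic diversity of the population to which it is applied.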
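The additive calculation described above can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: the function name `polygenic_score`, the dosage coding (0, 1 or 2 copies of the effect allele) and the toy effect sizes are all assumptions made for the example.

```python
def polygenic_score(dosages, effect_sizes):
    """Return the sum of per-variant effect sizes weighted by allele dosage.

    dosages      -- copies of the effect allele per variant (assumed 0, 1 or 2)
    effect_sizes -- per-variant weights, e.g. estimated in a training cohort
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical toy values for three variants; not real data.
effects = [0.5, -0.25, 1.0]
person = [2, 1, 0]
score = polygenic_score(person, effects)
print(score)
```

Because the weights are estimated in a training cohort, the same sum can be less informative for someone whose variants or allele frequencies differ from that cohort, which is the problem the proposed metric addresses.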
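This generally means that if the population used to train a PGS differs genetically from the population in which it is applied, the PGS may not perform well. To make such scores more useful, the researchers have proposed a new metric called genetic distance (GD), which is intended to quantify the genetic dissimilarity between an individual and the training population based on genome-wide allele frequencies.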
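The figure caption defines GD as the Euclidean distance of an individual's genotype from the center of the training data in the principal-component (PC) space of the training genotypes. A minimal sketch of that last step is shown here, assuming the PC coordinates have already been computed elsewhere; the helper names `training_center` and `genetic_distance` and the toy coordinates are hypothetical, and no rescaling of the distance is applied.

```python
import math


def training_center(train_pcs):
    """Return the per-component mean of the training individuals' PC coordinates."""
    n = len(train_pcs)
    k = len(train_pcs[0])
    return [sum(row[j] for row in train_pcs) / n for j in range(k)]


def genetic_distance(individual_pcs, center):
    """Euclidean distance from one individual's PC coordinates to the training center."""
    return math.dist(individual_pcs, center)


# Hypothetical PC coordinates (two components) for four training individuals.
train = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
center = training_center(train)

# A hypothetical target individual, projected onto the same PC space.
gd = genetic_distance([4.0, 5.0], center)
print(center, gd)
```

Computing one distance per person, instead of assigning each person to an ancestry cluster, is what lets accuracy be examined along a continuum.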
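According to the report, the new metric would range from 0 (representing identical traits) to 1 (representing completely different traits), and it would also take into account both ancient and recent evolutionary events that have affected a given population. In support of the new metric, the research team showed that GD is inversely correlated with PGS accuracy for a number of diseases and traits across populations, even populations that are traditionally considered homogeneous. The group also demonstrated that GD could be used to identify individuals who might benefit from PGSs trained on specific populations or, conversely, those better served by PGSs trained on more diverse populations and built from different variants.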
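The team suggests that their metric could provide a continuous measure of PGS accuracy, and notes that it also highlights the importance of considering genetic diversity when developing PGSs.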
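More information: Yi Ding et al, Polygenic scoring accuracy varies across the genetic ancestry continuum, Nature (2023). DOI: 10.1038/s41586-023-06079-4
A continuous measure for understanding the accuracy of gene-based predictions, Nature (2023). DOI: 10.1038/d41586-023-01492-1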
© 2023 Science X Network