[Jiaxue Gene Testing (佳学基因检测)] How is GWAS applied in genetic testing and gene decoding?
Genome-wide association studies
Genome-wide association (GWA) studies scan the entire genome of a species for associations between up to millions of SNPs and a given trait of interest. Notably, the trait of interest can be virtually any sort of phenotype ascribed to the population, be it qualitative (e.g. disease status) or quantitative (e.g. height). Essentially, given p SNPs and n samples or individuals, a GWA analysis will fit p independent univariate linear models, each based on n samples, using the genotype of each SNP as the predictor of the trait of interest. The significance of association (P-value) in each of the p tests is determined from the coefficient estimate $\beta$ of the corresponding SNP (technically speaking, the significance of association is $P(\beta \mid H_0: \beta = 0)$). Note that because these tests are independent and quite numerous, there is a great computational advantage in setting up a parallelized GWA analysis (as we will do shortly). Because so many tests are run, it is necessary to adjust the resulting P-values for multiple hypothesis testing, using methods such as the Bonferroni correction or the Benjamini-Hochberg procedure, which controls the false discovery rate (FDR). GWA studies are now commonplace in the genetics of many different species.
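To make the idea of "p independent univariate linear models" more concrete, here is a minimal sketch in plain Python. It is not the code used in this tutorial (the scripts referred to later are written in R). The function names, the simulated genotypes, the sample sizes and the causal SNP index are all hypothetical. The P-value uses a normal approximation to the t distribution, which is an assumption that is only reasonable when n is large.

```python
import math
import random


def snp_association(genotypes, trait):
    """Fit trait = alpha + beta * genotype by least squares for one SNP.

    Returns (beta_hat, p_value). The two-sided P-value relies on a normal
    approximation to the t distribution (an assumption; fine for large n).
    """
    n = len(trait)
    mean_g = sum(genotypes) / n
    mean_y = sum(trait) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:  # monomorphic SNP: no genotype variation, nothing to test
        return 0.0, 1.0
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, trait))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_g
    rss = sum((y - alpha - beta * g) ** 2 for g, y in zip(genotypes, trait))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of beta_hat
    if se == 0:  # perfect fit: residuals are exactly zero
        return beta, (0.0 if beta != 0 else 1.0)
    z = beta / se
    # 2 * (1 - Phi(|z|)) written with erfc to stay accurate in the far tail
    return beta, math.erfc(abs(z) / math.sqrt(2))


def gwa_scan(snps, trait):
    """Run one independent test per SNP; snps is a list of p genotype vectors."""
    return [snp_association(snp, trait)[1] for snp in snps]


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each P-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values (controls the FDR)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    random.seed(1)
    n, p, causal = 200, 50, 7  # hypothetical sizes and causal SNP index
    # Genotypes coded 0/1/2 (minor allele count), minor allele frequency 0.3
    snps = [[sum(random.random() < 0.3 for _ in range(2)) for _ in range(n)]
            for _ in range(p)]
    trait = [1.0 * snps[causal][i] + random.gauss(0, 1) for i in range(n)]
    p_values = gwa_scan(snps, trait)
    hits = [j for j, q in enumerate(bonferroni(p_values)) if q < 0.05]
    print("SNPs significant after Bonferroni:", hits)
    print("Smallest BH-adjusted P-value:", min(benjamini_hochberg(p_values)))
```

Each call to `snp_association` touches only one SNP, so the loop in `gwa_scan` could in principle be split across workers, for instance with `multiprocessing.Pool.map`. That independence is what makes a parallelized analysis attractive.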
Association mapping vs. linkage mapping
Too often, people cannot tell the difference between association mapping and linkage mapping, the latter also known as quantitative trait loci (QTL) mapping. Albeit conceptually similar, they are actually opposite in their workings. One of the key differences between the two is that association mapping relies on high-density SNP genotyping of unrelated individuals, whereas linkage mapping relies on the segregation of substantially fewer markers in controlled breeding experiments; unsurprisingly, QTL mapping is seldom conducted in humans. Importantly, association mapping gives you point associations in the genome, whereas linkage mapping gives you QTL, i.e. chromosomal regions.
The present tutorial covers fundamental aspects to consider when conducting a GWA analysis, from the pre-processing of genotype and phenotype data to the interpretation of results. We will use a mixed population of 316 Chinese, Indian and Malay individuals that was recently characterized using high-throughput SNP-chip sequencing, transcriptomics and lipidomics (Saw et al., 2017). More specifically, we will search for associations between the >2.5 million SNP markers and cholesterol levels. Finally, we will explore the vicinity of candidate SNPs using the UCSC Genome Browser in order to gain functional insights. The methodology shown here is largely based on the tutorial outlined in Reed et al., 2015. The R scripts and some of the data can be found in my repository, but you will still need to download the omics data from here. Please follow the instructions in the repository.