[Jiaxue Gene Testing (佳学基因检测)] Identifying Individual-Cancer-Related Genes by Rebalancing the Training Samples
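While reviewing improvements in tumor genetic testing and mutation analysis, we noted that IEEE Trans Nanobioscience (2016 Jun;15(4):309-315) published a research article relevant to genetic testing for targeted cancer drug therapy, titled "Identifying Individual-Cancer-Related Genes by Rebalancing the Training Samples". The study was carried out by Bolin Chen, Xuequn Shang, Min Li, Jianxin Wang, Fang-Xiang Wu, and colleagues. The work supports progress toward precision cancer treatment and personalized medication, and underscores the importance of detecting and analyzing genetic information.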
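Keywords for this study on improving the accuracy of cancer gene testing:
rebalancing, training samples, identification, individual cancers, cancer-related genes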
How tumor-related genes are identified, and what this means for clinical genetic testing
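Identifying genes related to an individual cancer is typically an imbalanced classification problem. Known cancer-related genes are far fewer than the genes whose status is unknown, which makes it hard to derive new predictions from such imbalanced training samples. Conventional machine learning methods either detect only genes related to all cancers or bring in clinical knowledge to sidestep the problem. This study introduces a training-sample rebalancing strategy that combines a two-step logistic regression with random resampling. The two-step logistic regression selects a set of genes related to all cancers; random resampling then further classifies the genes associated with individual cancers. The imbalance is circumvented by first randomly adding positive instances related to other cancers and then, in the following step, excluding unrelated predictions according to overall performance. Numerical experiments show that the proposed resampling method can identify cancer-related genes even when few genes are known to be related to the cancer in question. Under leave-one-out cross-validation, the final predictions for all individual cancers reach AUC values of around 0.93, which the authors describe as very promising compared with existing methods.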
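The rebalancing idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy gene identifiers and features, the number of rounds, and the plain gradient-descent logistic regression are all assumptions made for illustration. It covers only the random resampling stage (borrowing positives known for other cancers, then averaging scores over rounds) and omits both the two-step gene selection and the performance-based exclusion of unrelated predictions.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    # The last entry of w is the bias; zip stops at the feature count.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])


def fit_logistic(X, y, lr=0.1, epochs=200):
    # Batch gradient descent on the logistic loss (illustrative stand-in).
    n = len(X[0])
    w = [0.0] * (n + 1)
    for _ in range(epochs):
        grad = [0.0] * (n + 1)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def resampled_scores(features, target_pos, other_pos, unlabeled,
                     rounds=20, n_extra=5, seed=0):
    # Hypothetical helper: average score of each unlabeled gene over rounds.
    rng = random.Random(seed)
    totals = {g: 0.0 for g in unlabeled}
    counts = {g: 0 for g in unlabeled}
    for _ in range(rounds):
        # Rebalance: borrow positives known for other cancers.
        extra = rng.sample(other_pos, min(n_extra, len(other_pos)))
        pos = list(target_pos) + extra
        # Draw an equally sized set of unlabeled genes as provisional negatives.
        neg = rng.sample(unlabeled, min(len(pos), len(unlabeled)))
        w = fit_logistic([features[g] for g in pos + neg],
                         [1] * len(pos) + [0] * len(neg))
        used_as_negative = set(neg)
        for g in unlabeled:
            # Score only genes that were not used for training this round.
            if g not in used_as_negative:
                totals[g] += predict(w, features[g])
                counts[g] += 1
    return {g: totals[g] / counts[g] for g in unlabeled if counts[g]}


# Toy data (entirely synthetic): three known genes for the target cancer,
# ten known for other cancers, thirty background genes, one planted candidate.
features = {}
target_pos = ["t0", "t1", "t2"]
other_pos = ["o%d" % i for i in range(10)]
background = ["u%d" % i for i in range(30)]
for i, g in enumerate(target_pos):
    features[g] = [2.0 + 0.1 * i, 2.0 - 0.1 * i]
for i, g in enumerate(other_pos):
    features[g] = [1.5 + 0.05 * i, 2.0]
for i, g in enumerate(background):
    features[g] = [-2.0 - 0.02 * i, -2.0 + 0.02 * i]
features["hidden"] = [2.0, 1.9]
unlabeled = background + ["hidden"]

scores = resampled_scores(features, target_pos, other_pos, unlabeled)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0], round(scores[ranked[0]], 3))
```

Averaging over many random draws keeps any single borrowed positive or provisional negative from dominating the ranking, which is the intuition behind resampling when only a handful of genes are known for the cancer of interest.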
Description from the international database on tumorigenesis, recurrence, and metastasis:
The identification of individual-cancer-related genes typically is an imbalanced classification issue. The number of known cancer-related genes is far less than the number of all unknown genes, which makes it very hard to detect novel predictions from such imbalanced training samples. A regular machine learning method can either only detect genes related to all cancers or add clinical knowledge to circumvent this issue. In this study, we introduce a training sample rebalancing strategy to overcome this issue by using a two-step logistic regression and a random resampling method. The two-step logistic regression is to select a set of genes that related to all cancers. While the random resampling method is performed to further classify those genes associated with individual cancers. The issue of imbalanced classification is circumvented by randomly adding positive instances related to other cancers at first, and then excluding those unrelated predictions according to the overall performance at the following step. Numerical experiments show that the proposed resampling method is able to identify cancer-related genes even when the number of known genes related to it is small. The final predictions for all individual cancers achieve AUC values around 0.93 by using the leave-one-out cross validation method, which is very promising, compared with existing methods.
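The evaluation reported in the abstract, AUC under leave-one-out cross-validation, can be sketched as follows. This is an illustrative Python example, not the study's code: the nearest-centroid scorer is a deliberately simple stand-in for the paper's classifier, and the function names and toy data are assumptions.

```python
import math


def auc(scores, labels):
    # Probability that a random positive outscores a random negative
    # (ties count as one half).
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid_score(train_X, train_y, x):
    # Stand-in scorer: higher when x is closer to the positive centroid.
    def centroid(label):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    return dist(x, centroid(0)) - dist(x, centroid(1))


def leave_one_out_scores(X, y, score_fn):
    # Hold out each sample in turn, train on the rest, score the held-out one.
    out = []
    for i in range(len(X)):
        out.append(score_fn(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i]))
    return out


# Toy, well-separated data (synthetic): three positives, four negatives.
X = [[2.0, 2.1], [1.8, 2.2], [2.2, 1.9],
     [-2.0, -1.9], [-1.8, -2.1], [-2.2, -2.0], [-1.9, -2.2]]
y = [1, 1, 1, 0, 0, 0, 0]

loo = leave_one_out_scores(X, y, centroid_score)
print(auc(loo, y))
```

Leave-one-out suits settings with very few known positives, because every known gene is used for training in all folds except the one in which it is held out.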
(Editor in charge: Jiaxue Gene (佳学基因))