KISTI Institutional Repository: Evaluating the effect of database inflation in proteogenomic search on sensitive and reliable peptide identification


This item is licensed under the Korea Open Government License

Title: Evaluating the effect of database inflation in proteogenomic search on sensitive and reliable peptide identification

Author(s): 이홍란; 황규백; 김현우; 백은옥; 이상원; 조윤성

Publication Year: 2016-12-22

Abstract: Background: Proteogenomics is a promising approach for various tasks ranging from gene annotation to cancer research. Databases for proteogenomic searches are often constructed by adding peptide sequences inferred from genomic or transcriptomic evidence to reference protein sequences. Such database inflation has the potential to identify novel peptides. However, it also raises concerns about sensitive and reliable peptide identification. Spurious peptides included in target databases may result in an underestimated false discovery rate (FDR). On the other hand, inflation of decoy databases could decrease the sensitivity of peptide identification due to the increased number of high-scoring random hits. Although several studies have addressed these issues, widely applicable guidelines for sensitive and reliable proteogenomic search have hardly been available. Results: To systematically evaluate the effect of database inflation in proteogenomic searches, we constructed a variety of real and simulated proteogenomic databases for yeast and human tandem mass spectrometry (MS/MS) data, respectively. Against these databases, we tested two popular database search tools with various approaches to search result validation: the target-decoy search strategy (with and without a refined scoring metric) and a mixture model-based method. The effect of separate filtering of known and novel peptides was also examined. The results from real and simulated proteogenomic searches confirmed that separate filtering increases the sensitivity and reliability of proteogenomic search. However, no single method consistently identified the largest (or the smallest) number of novel peptides from real proteogenomic searches. Conclusions: We propose using a set of search result validation methods with separate filtering for sensitive and reliable identification of peptides in proteogenomic search.
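
The separate filtering mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `PSM` record, the function names, and the simple decoys/targets FDR estimate are all assumptions made for illustration, and scores are assumed to be distinct. The idea is to derive one score cutoff for known peptides and another for novel peptides, instead of a single cutoff over the pooled results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PSM:
    """Hypothetical peptide-spectrum match record (not from the paper)."""
    score: float     # higher is better
    is_decoy: bool   # True if the match is to a decoy sequence
    is_novel: bool   # True if the sequence comes from the added (non-reference) part


def score_threshold(psms: List[PSM], max_fdr: float) -> Optional[float]:
    """Lowest score cutoff whose estimated FDR (decoys / targets) is <= max_fdr."""
    targets = decoys = 0
    cutoff = None
    for psm in sorted(psms, key=lambda p: p.score, reverse=True):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = psm.score
    return cutoff


def accept(psms: List[PSM], max_fdr: float) -> List[PSM]:
    """Target PSMs passing a single cutoff computed over all given PSMs."""
    cutoff = score_threshold(psms, max_fdr)
    if cutoff is None:
        return []
    return [p for p in psms if not p.is_decoy and p.score >= cutoff]


def accept_separately(psms: List[PSM], max_fdr: float) -> List[PSM]:
    """Filter known and novel PSMs independently, each with its own cutoff."""
    known = [p for p in psms if not p.is_novel]
    novel = [p for p in psms if p.is_novel]
    return accept(known, max_fdr) + accept(novel, max_fdr)
```

With pooled filtering, the many high-scoring known peptides can lower the shared cutoff enough to admit weaker novel matches. Filtering the two classes separately makes the novel class meet the FDR limit on its own.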

Keywords: False discovery rate; Proteogenomic search; Separate false discovery rate analysis; Simulation; Target-decoy approach; Model-based approach

Journal Title: BMC Genomics

Citation Volume: 17

ISSN: 1471-2164

Files in This Item: There are no files associated with this item.

Appears in Collections: 7. KISTI Research Outputs > Journal Articles

URI: https://repository.kisti.re.kr/handle/10580/14493
http://www.ndsl.kr/ndsl/search/detail/article/articleSearchResultDetail.do?cn=NART78513511
