Article Information

  • Title: Linked Open Data Construction of Purpose Oriented Interactomics by Integration of Life Sciences LOD
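  • Authors: Yusuke Komiyama ; Masaki Banno ; Masayuki Yarimizu
  • Journal: 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
  • Print ISSN: 1346-0714
  • Online ISSN: 1346-8030
  • Publication year: 2014
  • Volume: 29
  • Issue: 4
  • Pages: 356-363
  • DOI: 10.1527/tjsai.29.356
  • Publisher: The Japanese Society for Artificial Intelligence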
  • Abstract: Researchers in agriculture, life science, and drug design need to acquire information that combines two or more life science databases for problem solving. Semantic Web technologies are already necessary for data integration among those databases. This study introduces a technique for using RDF (Resource Description Framework) and OWL (Web Ontology Language) as a data set for developing a machine learning predictor of interactomics. We also outline a method for implementing the interactomics LOD (Linked Open Data) in a graph database so that it can be queried with SPARQL (SPARQL Protocol and RDF Query Language). Since 2013, the interactomics LOD has included protein-protein interaction pairs of tyrosine kinases, amino acid residue pairs of sugar (carbohydrate) binding proteins, and cross-reference information on protein chains among entries of major bioscience databases. Finally, we designed three RDF schema models and made them accessible through AllegroGraph 4.11 and Virtuoso 7. These databases contain a total of 1,824,859,745 triples. The data could be combined with public LOD of the life science domain, totaling 28,529,064,366 triples, and searched together. This research shows that handling large-scale LOD on a comparatively small budget is realistic. Using LOD reduced not only expenses but also development time. In particular, RDF-SIFTS (Structure Integration with Function, Taxonomy and Sequence), an aggregate of 10 small LOD data sets, was constructed during the short period of BioHackathon 2013, in one week. LOD thus makes it possible to quickly obtain the data sets required for machine learning of interactomics. We set up the interactomics LOD as a database for application development. SPARQL endpoints of these databases are published on the portal site UTProt (The University of Tokyo Protein, http://utprot.net).
  • Keywords: bioscience database ; computational biosemantics ; interactomics ; PPI ; protein ligand binding site