Introduction

Numerous experimental and computational studies have uncovered a growing number of diverse RNA-RNA interactions (RRIs); however, few text mining systems exist for extracting diverse RRI information from the biomedical literature. To simplify this process, we developed RIscoper (RNA Interactome Scoper), a software platform that extracts RRIs from the literature based on an N-gram model. Importantly, RIscoper integrates a reliable, manually curated RRI corpus of more than 13,300 sentences containing RRI information. By combining natural language processing techniques with this corpus, RIscoper achieves high performance (90.32% precision and 94.10% recall).


Highlights

  • RIscoper establishes a comprehensive and reliable RRI corpus of more than 13,300 manually curated sentences containing RRI information, drawn from more than 5,000 biomedical publications. These positive sentences cover interactions involving multiple RNA types, including mRNA, lncRNA, miRNA, sRNA, circRNA, snoRNA, snRNA, scaRNA and scRNA. The corpus provides a valuable resource for ongoing text mining studies of RNA interactions and can serve as a benchmark dataset for future machine learning work.
  • RIscoper is based on an N-gram statistical language model, which has the following advantages: (i) lower computational complexity; (ii) less manual intervention on the sentences; (iii) easy porting and extension.
  • RIscoper is the first tool for full-scale RNA interactome scanning. It supports extraction from user-provided abstracts or full-text papers (for example, PDF and TXT files) and communicates with PubMed over the network for online extraction by PMID or keyword. Written in Java, RIscoper is a fast and simple tool for database curators, experimental biologists and bioinformaticians.
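
To give a feel for how an N-gram statistical language model can flag sentences that describe RNA interactions, here is a minimal Java sketch. It is not RIscoper's actual implementation: the class and method names (NGramSketch, BigramModel, looksLikeInteraction), the choice of bigrams with add-one smoothing, the two-model comparison, and the toy training sentences are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class NGramSketch {

    /** Hypothetical bigram (N = 2) language model with add-one smoothing. */
    static class BigramModel {
        private final Map<String, Integer> bigramCounts = new HashMap<>();
        private final Map<String, Integer> contextCounts = new HashMap<>();
        private final Set<String> vocabulary = new HashSet<>();

        /** Lowercases, splits on non-alphanumeric characters (hyphens kept), adds boundary markers. */
        static List<String> tokenize(String sentence) {
            List<String> tokens = new ArrayList<>();
            tokens.add("<s>");
            for (String t : sentence.toLowerCase().split("[^a-z0-9-]+")) {
                if (!t.isEmpty()) {
                    tokens.add(t);
                }
            }
            tokens.add("</s>");
            return tokens;
        }

        /** Counts every adjacent token pair of one training sentence. */
        void train(String sentence) {
            List<String> tokens = tokenize(sentence);
            vocabulary.addAll(tokens);
            for (int i = 1; i < tokens.size(); i++) {
                String prev = tokens.get(i - 1);
                bigramCounts.merge(prev + " " + tokens.get(i), 1, Integer::sum);
                contextCounts.merge(prev, 1, Integer::sum);
            }
        }

        /** Smoothed log-probability of a sentence under this model. */
        double logProbability(String sentence) {
            List<String> tokens = tokenize(sentence);
            int v = vocabulary.size() + 1; // one extra slot for unseen tokens
            double logP = 0.0;
            for (int i = 1; i < tokens.size(); i++) {
                String prev = tokens.get(i - 1);
                int pair = bigramCounts.getOrDefault(prev + " " + tokens.get(i), 0);
                int context = contextCounts.getOrDefault(prev, 0);
                logP += Math.log((pair + 1.0) / (context + v));
            }
            return logP;
        }
    }

    /** Assumed decision rule: pick whichever model finds the sentence more likely. */
    static boolean looksLikeInteraction(BigramModel positive, BigramModel negative, String sentence) {
        return positive.logProbability(sentence) > negative.logProbability(sentence);
    }

    public static void main(String[] args) {
        // Toy sentences invented for this sketch; they are not taken from the RIscoper corpus.
        BigramModel positive = new BigramModel();
        positive.train("miR-21 directly targets PTEN mRNA.");
        positive.train("lncRNA HOTAIR binds to miR-34a.");
        positive.train("miR-155 targets SOCS1 mRNA.");

        BigramModel negative = new BigramModel();
        negative.train("Samples were collected from patients.");
        negative.train("RNA was extracted from tissue samples.");
        negative.train("Cells were cultured for two days.");

        String candidate = "miR-21 targets SOCS1 mRNA.";
        System.out.println(candidate + " -> " + looksLikeInteraction(positive, negative, candidate));
    }
}
```

The sketch only counts adjacent token pairs and compares two smoothed probabilities, so it needs no syntactic parsing or hand-written rules, which is one way to read the low-complexity and low-intervention advantages listed above.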


News


  • RIscoper completed
    June 2018

  • RIscoper construction
    September 2017

  • Data collection
    April 2017




Contact


Wang Dong



wangdong@ems.hrbmu.edu.cn