Physicians envision a future in which genomic data from patients is heavily used to manage care, but experts have questioned the accuracy and reliability of these analyses. Now, a study by 150 researchers in 12 countries finds real strength and agreement across RNA genomic sequencing techniques and laboratories, as well as ways to reduce what little variability exists and set a new high standard.
The results of the study were published in Nature Biotechnology in three separate research articles.
These results should provide assurance to patients, clinicians and the research community that genomic sequencing is accurate, says E. Aubrey Thompson, Ph.D., a professor of cancer biology at Mayo Clinic in Florida, one of three institutions that led the study. Dr. Thompson is a study co-author and member of the project leadership.
“It seems very likely that decisions about patient care are going to be influenced by genomic data, derived from sequencing both RNA and DNA from patient samples, and we now know the extent to which these sequence-based analyses can be relied upon within a given laboratory or from laboratory to laboratory,” he says.
“That means that results of a patient’s sample, from which clinical management decisions will likely be made, will be accurate worldwide,” says Dr. Thompson.
RNA sequencing is being used with increasing frequency to characterize a growing array of conditions — everything from prenatal birth defects to disorders of the elderly.
The other institutions involved in the study are the Beijing Genomics Institute and Weill Cornell Medical School. All three institutions have extensive experience in sequencing RNA and have helped develop novel analytical tools for interpreting the data.
The U.S. Food and Drug Administration (FDA) funded the research, given its need to understand the accuracy of such data submitted in applications for approval of new drugs, clinical applications and genomic diagnostic procedures, Dr. Thompson says.
The purpose of this project, known as Sequencing Quality Control (SEQC), was to rigorously define both the scope and the sources of variation in RNA sequencing data.
Laboratory groups at the three lead institutions sequenced the same two RNA samples multiple times.
More than 1 billion nucleotides of sequencing data were generated by each site. The data were then analyzed under the direction of the FDA with the assistance of a large group of academic and industrial statisticians. The researchers also examined the current technologies and major biochemical methods of 30 RNA-sequencing labs and hundreds of researchers. In addition, they found that RNA can be accurately extracted and analyzed from severely degraded genetic samples, such as tissue samples that have been stored for many years.
“It was determined that there is very strong agreement between the sequence data generated by experienced sequencing laboratories,” Dr. Thompson says. “The studies now establish the best practice for all laboratories to use, so that results are reliable and reproducible across laboratories.”
- Anton Kratz, Piero Carninci. The devil in the details of RNA-seq. Nature Biotechnology, 2014; 32 (9): 882. DOI: 10.1038/nbt.3015
- Leming Shi et al. A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nature Biotechnology, 2014; 32 (9): 903. DOI: 10.1038/nbt.2957
- Christopher E Mason et al. Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study. Nature Biotechnology, 2014; 32 (9): 915. DOI: 10.1038/nbt.2972
- Weida Tong et al. The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance. Nature Biotechnology, 2014; 32 (9): 926. DOI: 10.1038/nbt.3001