
The 2022 Nucleic Acids Research database issue and the online molecular biology database collection

Affiliations.

  • 1 Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK.
  • 2 Institut Curie, 25 rue d'Ulm, 75005 Paris, France.
  • PMID: 34986604
  • PMCID: PMC8728296
  • DOI: 10.1093/nar/gkab1195

The 2022 Nucleic Acids Research Database Issue contains 185 papers, including 87 papers reporting on new databases and 85 updates from resources previously published in the Issue. Thirteen additional manuscripts provide updates on databases most recently published elsewhere. Seven new databases focus specifically on COVID-19 and SARS-CoV-2, including SCoV2-MD, the first of the Issue's Breakthrough Articles. Major nucleic acid databases reporting updates include MODOMICS, JASPAR and miRTarBase. The AlphaFold Protein Structure Database, described in the second Breakthrough Article, is the stand-out in the protein section, where the Human Proteoform Atlas and GproteinDb are other notable new arrivals. Updates from DisProt, FuzDB and ELM comprehensively cover disordered proteins. Under the metabolism and signalling section Reactome, ConsensusPathDB, HMDB and CAZy are major returning resources. In microbial and viral genomes taxonomy and systematics are well covered by LPSN, TYGS and GTDB. Genomics resources include Ensembl, Ensembl Genomes and UCSC Genome Browser. Major returning pharmacology resource names include the IUPHAR/BPS guide and the Therapeutic Target Database. New plant databases include PlantGSAD for gene lists and qPTMplants for post-translational modifications. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). Our latest update to the NAR online Molecular Biology Database Collection brings the total number of entries to 1645. Following last year's major cleanup, we have updated 317 entries, listing 89 new resources and trimming 80 discontinued URLs. The current release is available at http://www.oxfordjournals.org/nar/database/c/.

© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.


Similar articles

  • The 27th annual Nucleic Acids Research database issue and molecular biology database collection. Rigden DJ, Fernández XM. Nucleic Acids Res. 2020 Jan 8;48(D1):D1-D8. doi: 10.1093/nar/gkz1161. PMID: 31906604 Free PMC article. Review.
  • The 2024 Nucleic Acids Research database issue and the online molecular biology database collection. Rigden DJ, Fernández XM. Nucleic Acids Res. 2024 Jan 5;52(D1):D1-D9. doi: 10.1093/nar/gkad1173. PMID: 38035367 Free PMC article.
  • The 2023 Nucleic Acids Research Database Issue and the online molecular biology database collection. Rigden DJ, Fernández XM. Nucleic Acids Res. 2023 Jan 6;51(D1):D1-D8. doi: 10.1093/nar/gkac1186. PMID: 36624667 Free PMC article.
  • The 2018 Nucleic Acids Research database issue and the online molecular biology database collection. Rigden DJ, Fernández XM. Nucleic Acids Res. 2018 Jan 4;46(D1):D1-D7. doi: 10.1093/nar/gkx1235. PMID: 29316735 Free PMC article.
  • The importance of biological databases in biological discovery. Baxevanis AD. Curr Protoc Bioinformatics. 2011 Jun;Chapter 1:1.1.1-1.1.6. doi: 10.1002/0471250953.bi0101s34. PMID: 21633941 Review.
  • Beyond blast: enabling microbiologists to better extract literature, taxonomic distributions and gene neighbourhood information for protein families. Reed CJ, Denise R, Hourihan J, Babor J, Jaroch M, Martinelli M, Hutinet G, de Crécy-Lagard V. Microb Genom. 2024 Feb;10(2):001183. doi: 10.1099/mgen.0.001183. PMID: 38323604 Free PMC article.
  • Statistical integration of multi-omics and drug screening data from cell lines. El Bouhaddani S, Höllerhage M, Uh HW, Moebius C, Bickle M, Höglinger G, Houwing-Duistermaat J. PLoS Comput Biol. 2024 Jan 31;20(1):e1011809. doi: 10.1371/journal.pcbi.1011809. eCollection 2024 Jan. PMID: 38295113 Free PMC article.
  • Ten simple rules for managing laboratory information. Berezin CT, Aguilera LU, Billerbeck S, Bourne PE, Densmore D, Freemont P, Gorochowski TE, Hernandez SI, Hillson NJ, King CR, Köpke M, Ma S, Miller KM, Moon TS, Moore JH, Munsky B, Myers CJ, Nicholas DA, Peccoud SJ, Zhou W, Peccoud J. PLoS Comput Biol. 2023 Dec 7;19(12):e1011652. doi: 10.1371/journal.pcbi.1011652. eCollection 2023 Dec. PMID: 38060459 Free PMC article.
  • The consequences of data dispersion in genomics: a comparative analysis of data sources for precision medicine. Costa M, García S A, Pastor O. BMC Med Inform Decis Mak. 2023 Nov 9;23(Suppl 3):256. doi: 10.1186/s12911-023-02342-w. PMID: 37946154 Free PMC article.
  • Immune Activation and Inflammatory Response Mediated by the NOD/Toll-like Receptor Signaling Pathway-The Potential Mechanism of Bullfrog (Lithobates catesbeiana) Meningitis Caused by Elizabethkingia miricola. Li F, Chen B, Xu M, Feng Y, Deng Y, Huang X, Geng Y, Ouyang P, Chen D. Int J Mol Sci. 2023 Sep 26;24(19):14554. doi: 10.3390/ijms241914554. PMID: 37833994 Free PMC article.


Grants and funding.

  • Oxford University Press


Nucleic Acids Research is a peer-reviewed scientific journal published by Oxford University Press. It covers research on nucleic acids, such as DNA and RNA, and related work. Some of its content is available under an open access license. According to the Journal Citation Reports, the journal's 2010 impact factor is 7.836. The journal publishes two yearly special issues: one dedicated to biological databases, published in January since 1993, and the other on biological web servers, published in July since 2003.

Some content from Wikipedia, licensed under CC BY-SA

Nucleic Acids Research


Researchers discover gene scissors that switch off with a built-in timer

CRISPR gene scissors, as new tools of molecular biology, have their origin in an ancient bacterial immune system. But once a virus attack has been successfully overcome, the cell has to recover.

Biotechnology

Aug 22, 2024


Cellular DNA damage response pathways might be useful against some disease-causing viruses

New research reveals that triggering a cell's DNA damage response could be a promising avenue for developing novel treatments against several rare but devastating viruses for which no antiviral treatments exist, possibly ...

Cell & Microbiology

Aug 21, 2024


Newly discovered protein stops DNA damage

Researchers from Western University have discovered a protein that has the never-before-seen ability to stop DNA damage in its tracks. The finding could provide the foundation for developing everything from vaccines against ...

Molecular & Computational biology

Aug 15, 2024


Light-responsive gene regulation at the mRNA level

Researchers at the University of Bayreuth have established a new optogenetic approach that can control the bacterial production of proteins at the mRNA level using blue light. The new system gates the activation of the genetic ...

Aug 14, 2024


Smart guide RNAs: Researchers use logic gate-based decision-making to construct circuits that control genes

Researchers have transformed guide RNAs, which direct enzymes, into a smart RNA capable of controlling networks in response to various signals. A research team consisting of Professor Jongmin Kim and Ph.D. candidates Hansol ...

Jul 23, 2024


Research shows how RNA 'junk' controls our genes

Researchers at Arizona State University have made a significant advance in understanding how genes are controlled in living organisms. The new study, published in the journal Nucleic Acids Research, focuses on critical snippets ...

Jul 2, 2024


Scientists develop a new generation of DNA tests for a wide range of applications

A research group led by Dr. Edward Curtis has developed two new types of catalytic DNA molecules (deoxyribozymes) that can reveal the presence of target molecules through fluorescence or color. Several types of sensors were ...

Jun 26, 2024


Not wrapping but folding: Bacteria also organize their DNA, but they do it a bit differently

Some bacteria, it turns out, have proteins much like ours that organize the DNA in their cells. They just do it a bit differently. This is revealed by new research from biochemists at the Leiden Institute of Chemistry and ...

Jun 13, 2024


New simplified DNA model for advanced computational simulations

DNA is the molecule that contains all the genetic information necessary for the development and functioning of living organisms. It is organized in a structure called "chromatin," which is found inside the nucleus of cells. ...


Researchers create an innovative tool for the reliable and efficient study of gene function

A team of scientists at the Centro Nacional de Investigaciones Cardiovasculares (CNIC) led by Rui Benedito has generated a novel genetic tool, called iSuRe-HadCre, that enables the induction of precise genetic alterations ...

Jun 11, 2024


Protein-nucleic acid hybrid nanostructures for molecular diagnostic applications

  • Review Article
  • Published: 24 August 2024


  • Noah R. Sundah 1 , 2 ,
  • Yuxuan Seah 2 ,
  • Auginia Natalia 1 , 2 ,
  • Xiaoyan Chen 1 , 2 ,
  • Panida Cen 1 , 2 , 3 ,
  • Yu Liu 1 , 2 &
  • Huilin Shao 1 , 2 , 4 , 5  

Molecular diagnostic technologies empower new clinical opportunities in precision medicine. However, existing approaches face limitations with respect to performance, operation and cost. Biological molecules, including proteins and nucleic acids, are being increasingly adopted as tools in the development of new molecular diagnostic technologies. In particular, by leveraging their complementary properties (the functional diversity of proteins and the precise programmability of nucleic acids), a wide range of protein-nucleic acid hybrid nanostructures have been developed. These hybrid structures take diverse forms, ranging from one-dimensional to three-dimensional hybrids and from static assemblies to dynamic machines, and possess myriad functions to recognize target biomarkers, encode vast information and execute catalytic activities. Motivated by recent advances in this area of molecular nanotechnology, we review the state-of-the-art design and application of various types of protein-nucleic acid hybrid nanostructures for molecular diagnostics, and present an outlook on the challenges and opportunities for emerging pre-clinical and clinical applications, highlighting the promise for earlier detection, more refined diagnosis and highly tailored treatment decisions that ultimately lead to improved patient outcomes.



Sun, W. J.; Gu, Z. Engineering DNA scaffolds for delivery of anticancer therapeutics. Biomater. Sci. 2015 , 3 , 1018–1024.

Sun, W. J.; Wang, J. Q.; Hu, Q. Y.; Zhou, X. W.; Khademhosseini, A.; Gu, Z. CRISPR-Cas12a delivery by DNA-mediated bioresponsive editing for cholesterol regulation. Sci. Adv. 2020 , 6 , eaba2983.

Fan, K. L.; Xi, J. Q.; Fan, L.; Wang, P. X.; Zhu, C. H.; Tang, Y.; Xu, X. D.; Liang, M. M.; Jiang, B.; Yan, X. Y. et al. In vivo guiding nitrogen-doped carbon nanozyme for tumor catalytic therapy. Nat. Commun. 2018 , 9 , 1440

Somasundar, A.; Ghosh, S.; Mohajerani, F.; Massenburg, L. N.; Yang, T. L.; Cremer, P. S.; Velegol, D.; Sen, A. Positive and negative chemotaxis of enzyme-coated liposome motors. Nat. Nanotechnol. 2019 , 14 , 1129–1134.

Liu, X. G.; Zhang, F.; Jing, X. X.; Pan, M. C.; Liu, P.; Li, W.; Zhu, B. W.; Li, J.; Chen, H.; Wang, L. H. et al. Complex silica composite nanomaterials templated with DNA origami. Nature 2018 , 559 , 593–598.

Borrebaeck, C. A. K. Precision diagnostics: Moving towards protein biomarker signatures of clinical utility in cancer. Nat. Rev. Cancer 2017 , 17 , 199–204.

Duncombe, T. A.; Tentori, A. M.; Herr, A. E. Microfluidics: Reframing biological enquiry. Nat. Rev. Mol. Cell Biol. 2015 , 16 , 554–567.

Prakadan, S. M.; Shalek, A. K.; Weitz, D. A. Scaling by shrinking: Empowering single-cell ‘omics’ with microfluidic devices. Nat. Rev. Genet. 2017 , 18 , 345–361.

Camacho, D. M.; Collins, K. M.; Powers, R. K.; Costello, J. C.; Collins, J. J. Next-generation machine learning for biological networks. Cell 2018 , 173 , 1581–1592.

Topol, E. J. High-performance medicine: The convergence of human and artificial intelligence. Nat. Med. 2019 , 25 , 44–56.

Liu, Y.; Sundah, N. R.; Ho, N. R. Y.; Shen, W. X.; Xu, Y.; Natalia, A.; Yu, Z. L.; Seet, J. E.; Chan, C. W.; Loh, T. P. et al. Bidirectional linkage of DNA barcodes for the multiplexed mapping of higher-order protein interactions in cells. Nat. Biomed. Eng. , 2024 , 8 , 909–923

Kazane, S. A.; Sok, D.; Cho, E. H.; Uson, M. L.; Kuhn, P.; Schultz, P. G.; Smider, V. V. Site-specific DNA-antibody conjugates for specific and sensitive immuno-PCR. Proc. Natl. Acad. Sci. USA 2012 , 109 , 3731–3736.

Rosen, C. B.; Kodal, A. L. B.; Nielsen, J. S.; Schaffert, D. H.; Scavenius, C.; Okholm, A. H.; Voigt, N. V.; Enghild, J. J.; Kjems, J.; Tørring, T. et al. Template-directed covalent conjugation of DNA to native antibodies, transferrin and other metal-binding proteins. Nat. Chem. 2014 , 6 , 804–809.

Gu, L. C.; Li, C.; Aach, J.; Hill, D. E.; Vidal, M.; Church, G. M. Multiplex single-molecule interaction profiling of DNA-barcoded proteins. Nature 2014 , 515 , 554–557.

Agasti, S. S.; Liong, M.; Peterson, V. M.; Lee, H.; Weissleder, R. Photocleavable DNA barcode-antibody conjugates allow sensitive and multiplexed protein analysis in single cells. J. Am. Chem. Soc. 2012 , 134 , 18499–18502.

Kwak, M.; Herrmann, A. Nucleic acid/organic polymer hybrid materials: Synthesis, superstructures, and applications. Angew. Chem., Int. Ed. 2010 , 49 , 8574–8587.

Ilinskaya, A. N.; Dobrovolskaia, M. A. Understanding the immunogenicity and antigenicity of nanomaterials: Past, present and future. Toxicol. Appl. Pharmacol. 2016 , 299 , 70–77.

Schüller, V. J.; Heidegger, S.; Sandholzer, N.; Nickels, P. C.; Suhartha, N. A.; Endres, S.; Bourquin, C.; Liedl, T. Cellular immunostimulation by CpG-sequence-coated DNA origami structures. ACS Nano 2011 , 5 , 9696–9702.

Teutsch, S. M.; Bradley, L. A.; Palomaki, G. E.; Haddow, J. E.; Piper, M.; Calonge, N.; Dotson, W. D.; Douglas, M. P.; Berg, A. O.; EGAPP Working Group. The evaluation of genomic applications in practice and prevention (EGAPP) initiative: Methods of the EGAPP working group. Genet. Med. 2009 , 11 , 3–14.

Ahmed, L.; Constantinidou, A.; Chatzittofis, A. Patients’ perspectives related to ethical issues and risks in precision medicine: A systematic review. Front. Med. 2023 , 10 , 1215663.

Download references

Acknowledgements

This work was supported in part by funding from National University of Singapore (NUS), NUS Research Scholarship, Ministry of Education, Institute for Health Innovation & Technology, Ministry of Education, National Research Foundation, and National Medical Research Council.

Author information

Authors and affiliations.

Institute for Health Innovation and Technology, National University of Singapore, Singapore, 117599, Singapore

Noah R. Sundah, Auginia Natalia, Xiaoyan Chen, Panida Cen, Yu Liu & Huilin Shao

Department of Biomedical Engineering, College of Design and Engineering, National University of Singapore, Singapore, 117583, Singapore

Noah R. Sundah, Yuxuan Seah, Auginia Natalia, Xiaoyan Chen, Panida Cen, Yu Liu & Huilin Shao

Integrative Sciences and Engineering Programme, NUS Graduate School, National University of Singapore, Singapore, 119077, Singapore

Agency for Science, Technology and Research, Institute of Molecular and Cell Biology, Singapore, 138673, Singapore

Huilin Shao

Department of Surgery, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117597, Singapore

Corresponding author

Correspondence to Huilin Shao.

About this article

Sundah, N.R., Seah, Y., Natalia, A. et al. Protein-nucleic acid hybrid nanostructures for molecular diagnostic applications. Nano Res. (2024). https://doi.org/10.1007/s12274-024-6925-6

Received: 30 April 2024

Revised: 30 July 2024

Accepted: 31 July 2024

Published: 24 August 2024

DOI: https://doi.org/10.1007/s12274-024-6925-6

  • molecular nanotechnology
  • DNA-protein hybrid
  • molecular diagnostics
  • precision medicine

Collection 

Nucleic acid chemistry

Since the discovery of the double-helical structure of DNA and postulation of the central dogma of molecular biology, stating that the flow of genetic information goes from DNA to RNA to protein, the field of nucleic acid chemistry has expanded dramatically.

Nucleic acids have become important diagnostic markers for many diseases, enabled by breakthroughs in synthesis and sequencing technologies.

Nucleic acids have also become medical modalities. We have recently witnessed the approval of antisense oligonucleotides to treat genetic diseases and the rapid development of mRNA vaccines against COVID-19 during the pandemic in 2020.

These developments are enabled by nucleic acid chemistry, the ability to synthesize oligonucleotides and install modifications at will. The field benefits from a tight interconnection between purely synthetic chemistry and molecular biology, or a combination of both as in chemo-enzymatic methods.

More and more functions of nucleic acids are discovered in vitro and in cells, such as ribozymes selected to catalyze methylation reactions and ribozymes cutting DNA in genomes.

DNA was shown to contain not only the four canonical nucleobases A, C, G, T but also methylated versions of C and their oxidized forms. The repertoire of RNA modifications is still expanding, with >170 currently annotated ones. The analysis, quantification, and mapping of these modifications on a transcriptome-wide scale is a prerequisite to understand their function and relevance in health and disease. Their exact function and dynamic aspects are only starting to be understood.

The already diverse natural functions of nucleic acids, behaving as aptamers, riboswitches, ribozymes, and DNAzymes can be further expanded by various natural and non-natural functionalities, such as tags, probes, markers, or drug molecules that can be installed in DNA or RNA. Such functionalized nucleic acids can be exploited for broad applications in gene editing, synthetic biology, biosensing, and drug discovery.

This Collection aims to offer insights and inspiration in the field of nucleic acid chemistry, including but not limited to:

  • Synthesis, modifications, functionalizations and bioconjugations of nucleic acids
  • Detection, biochemical profiling and structural characterization of nucleic acids
  • Application of nucleic acids in chemical biology, medicine, diagnostics and more.

We welcome both fundamental and applied studies, as well as both experimental and theoretical research.

The Collection primarily welcomes original research papers, and we encourage submissions from all authors, not only by invitation.

Andrea Rentmeister, PhD

Department of Chemistry, Ludwig-Maximilians-Universität München, Germany

Michal Hocek, PhD, DSc

Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Czech Republic

Synthesis and modification of nucleic acids

Generation of DNA oligomers with similar chemical kinetics via in-silico optimization

Networks of interacting DNA oligomers have various applications in molecular biology, chemistry and materials science; however, kinetic dispersions during DNA hybridization can be problematic for some applications. Here, the authors reveal that limiting unnecessary duplexes using in-silico optimization can reduce in-vitro kinetic dispersions by as much as 96%.

  • Michael Tobiason
  • Bernard Yurke
  • William L. Hughes

The selection of a hydrophobic 7-phenylbutyl-7-deazaadenine-modified DNA aptamer with high binding affinity for the Heat Shock Protein 70

DNA aptamers can be selected against a wide range of therapeutic targets; however, the success rate of selective binding remains low due to the highly hydrophilic nature of the DNA backbone. Here, the authors design a hydrophobic 7-phenylbutyl-7-deazaadenine-modified DNA aptamer showing high binding affinity for the heat shock protein 70.

  • Catherine Mulholland
  • Ivana Jestřábová
  • Michal Hocek

Evaluation of 3′-phosphate as a transient protecting group for controlled enzymatic synthesis of DNA and XNA oligonucleotides

Controlled enzymatic DNA synthesis represents an alternative synthetic methodology that circumvents the limitations of traditional solid-phase synthesis. Here, the authors explore the use of 3′-phosphate as a transient protecting group for the controlled enzymatic synthesis of DNA and XNA oligonucleotides.

  • Marie Flamme
  • Steven Hanlon
  • Marcel Hollenstein

Structure and function of nucleic acids

Structure of a 10-23 deoxyribozyme exhibiting a homodimer conformation

RNA-cleaving DNAzymes exhibit potential as biosensors and in vivo knockdown agents; however, the structures of DNAzymes remain underexplored. Here, the authors report the 2.7 Å X-ray crystal structure of a 10-23 DNAzyme–substrate complex in a homodimer conformation.

  • Evan R. Cramer
  • Sarah A. Starcovic
  • Aaron R. Robart

i-Motif folding intermediates with zero-nucleotide loops are trapped by 2′-fluoroarabinocytidine via F···H and O···H hydrogen bonds

The oligonucleotide d(TC5) forms a well-characterized tetrameric i-motif in solution; however, the isolation of dimeric and trimeric intermediates remains challenging. Here, the authors report that 2′-deoxy-2′-fluoroarabinocytidine substitutions can prompt TC5 to form dimeric i-motif folding intermediates through fluorine and oxygen hydrogen bonds.

  • Roberto El-Khoury
  • Veronica Macaluso
  • Masad J. Damha

Intrastrand backbone-nucleobase interactions stabilize unwound right-handed helical structures of heteroduplexes of L-aTNA/RNA and SNA/RNA

Serinol nucleic acid and L-threoninol nucleic acid can bind to RNA and DNA, endowing them with potential as nucleic acid-based drugs. Here, the authors prepare single crystals of L-aTNA/RNA and SNA/RNA heteroduplexes to further our structural understanding of how synthetic nucleic acids hybridize with natural nucleic acids.

  • Yukiko Kamiya
  • Tadashi Satoh
  • Hiroyuki Asanuma

Applications of nucleic acids in chemical biology and medicinal chemistry

DNA-encoded chemical libraries yield non-covalent and non-peptidic SARS-CoV-2 main protease inhibitors

Conventional structure-based design of Mpro inhibitors of SARS-CoV-2 often starts from the structural information of Mpro and its binders; however, the continual rise of resistant strains requires innovative routes to discover new inhibitors. Here, the authors develop a DNA-encoded chemical library screening to produce non-covalent, non-peptidic small molecule inhibitors for SARS-CoV-2 Mpro independently of preliminary knowledge regarding suitable starting points.

  • Ravikumar Jimmidi
  • Srinivas Chamakuri
  • Damian W. Young

Accessible light-controlled knockdown of cell-free protein synthesis using phosphorothioate-caged antisense oligonucleotides

Light-activated antisense oligonucleotides have been developed to induce gene knockdown in living cells; however, their synthesis remains challenging and their application in cell-free systems is underexplored. Here, the authors report a one-step method for selectively attaching photocages onto phosphorothioate linkages of antisense oligonucleotides that can knock down cell-free protein synthesis using light.

  • Denis Hartmann
  • Michael J. Booth

Advanced preparation of fragment libraries enabled by oligonucleotide-modified 2′,3′-dideoxynucleotides

Next-generation genome sequencing technologies have revolutionized the life sciences; however, all sequencing platforms require nucleic acid pre-processing to generate suitable libraries for sequencing. Here, oligonucleotide-tethered 2′,3′-dideoxynucleotide terminators bearing universal priming sites are synthesised and incorporated by DNA polymerases, allowing integration of the fragmentation step into the library preparation workflow while also enabling the obtained fragments to be readily labeled by platform-specific adapters.

  • Justina Medžiūnė
  • Žana Kapustina
  • Arvydas Lubys

  • Open access
  • Published: 18 August 2024

Development and validation of a rapid five-minute nucleic acid extraction method for respiratory viruses

Yu Wang 1, Yuanyuan Huang 1, Yuqing Peng 1, Qinglin Cao 1, Wenkuan Liu 2, Zhichao Zhou 2, Guangxin Xu 1, Lei Li 3,4 & Rong Zhou 1,2

Virology Journal volume 21, Article number: 189 (2024)

Background

The rapid transmission and high pathogenicity of respiratory viruses significantly impact the health of both children and adults. Extracting and detecting their nucleic acid is crucial for disease prevention and treatment strategies. However, current extraction methods are laborious and time-consuming and show significant variations in nucleic acid content and purity among different kits, affecting detection sensitivity and efficiency. Our aim is to develop a novel method that reduces extraction time, simplifies operational steps, and ensures high-quality acquisition of respiratory viral nucleic acid.

Methods

We extracted respiratory syncytial virus (RSV) nucleic acid using reagents with different components and analyzed cycle threshold (Ct) values via quantitative real-time polymerase chain reaction (qRT-PCR) to optimize and validate the novel lysis and washing solution. The performance of this method was compared against magnetic bead, spin column, and precipitation methods for extracting nucleic acid from various respiratory viruses. The clinical utility of this method was confirmed by comparing it to the standard magnetic bead method for extracting clinical specimens of influenza A virus (IAV).

Results

The washing solution, composed of equal parts glycerin and ethanol (50% each), achieved efficacy comparable to that of conventional methods in a single abbreviated cycle. When combined with our A Plus lysis solution, the novel five-minute nucleic acid extraction (FME) method for respiratory viruses yielded higher RNA concentrations and purity than traditional methods. When used with a universal automatic nucleic acid extractor, FME showed efficiency similar to that of various conventional methods across diverse concentrations of respiratory viruses. In respiratory specimens from 525 patients suspected of IAV infection, the FME method showed a detection rate equivalent to that of the standard magnetic bead method, with a total coincidence rate of 95.43% and a kappa statistic of 0.901 (P < 0.001).

Conclusions

The FME developed in this study enables the rapid and efficient extraction of nucleic acid from respiratory samples, laying a crucial foundation for the implementation of expedited molecular diagnosis.

Introduction

Acute respiratory virus infection is a prevalent human disease, particularly affecting children, the elderly, and immunocompromised individuals, with a high incidence during certain seasons [1, 2]. Common respiratory viruses include respiratory syncytial virus (RSV), adenovirus (ADV), influenza virus, and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [3]. Among them, influenza A virus (IAV) and SARS-CoV-2 have caused global pandemics, posing a serious threat to human health and public safety [4, 5]. Research and development of RSV and SARS-CoV-2 vaccines is ongoing worldwide; however, achieving herd immunity will still be a prolonged process due to challenges such as virus mutations and low vaccination rates in low- and middle-income countries [6, 7].

The nucleic acid amplification test (NAAT), which detects specific DNA, is a powerful tool widely employed in various fields, including disease diagnosis, gene functional analysis, and mutation screening. In diagnostic applications, NAAT-based analysis offers several advantages over traditional enzyme- or antibody-based methods, including higher sensitivity, faster sample-to-result processing time, and greater flexibility in target selection, which enables rapid adaptation to emerging challenges [8]. However, the major bottleneck preventing the rapid and large-scale implementation of molecular diagnostics is the separation and purification of nucleic acid from samples, a complex process that traditionally requires skilled technicians and involves numerous manual pipetting steps [9].

Isolating high-quality nucleic acid while preserving its purity and integrity is an essential prerequisite for downstream molecular applications, including quantitative real-time polymerase chain reaction (qRT-PCR), isothermal amplification technologies (IATs), next-generation sequencing (NGS), and microarrays [10, 11, 12, 13]. Nucleic acid isolation methods fall into three main groups: liquid-phase extraction with phenol and guanidine thiocyanate (GTC), solid-phase adsorption on silica membrane spin columns, and capture on superparamagnetic beads. All three follow a common sequence: first, cell lysis and inhibition of ribonuclease (RNase) activity, with subsequent separation of nucleic acids from the lysed mixture; then, washing of the nucleic acids; and finally, recovery of the purified nucleic acids [14]. In the liquid-phase method, nucleic acids extracted with phenol-based TRIzol reagent are separated and purified by precipitation. This method is widely used for RNA extraction from tissues and cells; however, TRIzol extraction takes more than 70 min, and the recovered RNA is often contaminated with residual organic matter [15]. The spin column method is based on the binding of nucleic acids to a solid-phase silica carrier under a high salt concentration, followed by a series of washing and centrifugation steps to eliminate contaminants, and finally elution of the nucleic acids from the silica using a low-salt solution [16]. It involves numerous operational steps, takes approximately 40–60 min, and requires multiple centrifugations, which increases the risk of nucleic acid breakage and degradation. The magnetic bead method employs paramagnetic beads coated with various functionalized surface chemistries to capture and purify nucleic acids; during the washing and elution steps, a magnet holds the beads while the liquid is removed [17, 18]. The magnetic bead method requires only 25–30 min, but the recovery rate of nucleic acid in the eluent is relatively low, and residual magnetic beads can inhibit subsequent PCR [19]. Extraction can be performed manually or automatically, and many DNA/RNA extraction kits are designed for manual operation. However, with the increasing demand for high-throughput analysis, kits designed specifically for automated operation are becoming more popular.

Given the instability of RNA and the ubiquitous presence of RNases, it is crucial to select an RNA extraction kit that meets quality, purity, and integrity requirements while minimizing the time required. In this study, we developed a rapid respiratory virus nucleic acid extraction method that can be completed within 5 min. The method not only ensures the quality of the extracted nucleic acid but also reduces processing time, thereby facilitating prompt and efficient downstream applications based on nucleic acid analysis.

Materials and methods

Cells, viruses, and clinical specimens.

AD293 cells (from American Type Culture Collection) were cultured in Dulbecco’s modified Eagle medium (DMEM; Gibco, Grand Island, NY, USA) supplemented with 10% fetal bovine serum (FBS; Gibco, Grand Island, NY, USA), 1% penicillin, and 1% streptomycin (Gibco, Grand Island, NY, USA) at 37°C in an incubator with a 5% CO2 atmosphere. RSV (RSV-A2), ADV (ADV4), IAV (H1N1), herpes simplex virus 1 (HSV-1), and human coronavirus 229E (HCoV-229E) were obtained from the State Key Laboratory of Respiratory Disease. A pseudovirus of SARS-CoV-2 was purchased from Sansure Biotech Inc. (Changsha, China). Frozen clinical specimens from 525 suspected IAV-infected patients were obtained from the First Affiliated Hospital of Guangzhou Medical University. This project was approved by the biosafety committee, and all experiments were performed in accordance with biosafety regulations.

DNA and RNA extraction

We have developed a five-minute nucleic acid extraction (FME) reagent (China Food and Drug Administration (CFDA) Certification Class I, No. 20230661). The reagents comprised a lysis solution containing GTC, sodium citrate tribasic dihydrate (sodium citrate), sodium lauroyl sarcosine (sarkosyl), dithiothreitol (DTT), polyethylene glycol 6000 (PEG 6000), and isopropyl alcohol (IPA); a washing solution consisting of a mixture of glycerin and ethanol (EtOH) in equal proportions; an elution solution composed of Tris–HCl (pH 8.0) and ethylene diamine tetraacetic acid (EDTA); and magnetic beads purchased from BayBio Bio-tech Co., Ltd (Guangzhou, China). A total of 40 μL of magnetic beads and 500 μL of lysis solution were added to a 1.5 mL centrifuge tube, followed by the addition of 200-μL samples that were mixed by vortexing for 1 min, and subsequently all the supernatant was removed using a magnetic separator. The magnetic beads were washed with 300 μL of washing solution, vortexed for 1 min, and subjected to magnetic separation again. The supernatant was discarded. Then, 100 μL of elution solution was added, the mixture was incubated at 56°C for 1 min and magnetically separated, and all the supernatant containing the extracted nucleic acid was transferred. Alternatively, an automated nucleic acid extractor can be used by adding a 200-μL sample to a preloaded well plate containing reagents and placing it into an E-Five nucleic acid extractor (HuYanSuo Medical Technology Co., Ltd., Guangzhou, China). The machine was operated according to the manufacturer’s instructions, with the entire process taking approximately 5 min. Upon completion of the extraction, 100 μL of eluent was stored at −80°C for further use.
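The manual protocol above can be summarized as a small data structure. The sketch below is illustrative only: the `Step` class and the helper names are hypothetical, and only the volumes and the vortex or incubation times come from the text. The time spent on magnetic separation and pipetting is not stated in the text, so it is left out rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    reagent_ul: float  # volume of solution added at this step (µL)
    timed_min: float   # vortex or incubation time stated in the text (min)


# Values taken from the manual protocol described above.
FME_MANUAL = [
    Step("lysis and binding", 500.0, 1.0),  # lysis solution; 40 µL beads + 200 µL sample
    Step("wash", 300.0, 1.0),               # glycerin/EtOH washing solution
    Step("elution at 56 °C", 100.0, 1.0),   # Tris-HCl/EDTA elution solution
]


def timed_minutes(steps):
    """Sum of the explicitly timed steps (separation/pipetting time not included)."""
    return sum(step.timed_min for step in steps)


def volume_ratio(sample_ul, eluate_ul):
    """Sample-to-eluate volume ratio, the upper bound on concentration gain."""
    return sample_ul / eluate_ul


print(timed_minutes(FME_MANUAL))   # timed steps only
print(volume_ratio(200.0, 100.0))  # 200 µL sample eluted in 100 µL
```

The three timed steps account for 3 min, and eluting a 200 μL sample in 100 μL can concentrate the nucleic acid at most twofold, assuming complete recovery.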

To compare the efficiency of extraction, the same sample was extracted according to the manufacturer’s instructions for each commercial reagent kit while maintaining consistent loading and elution volumes. The extraction kits utilized in this study included the following: LemnisCare (LC) Viral DNA/RNA Extraction kit (LemnisCare Medical Technology Co., Ltd., Shenzhen, China), referred to as LC; Magen Total DNA/RNA kit (Magen Biotechnology Co., Ltd., Guangzhou, China), referred to as Magen; Baypure Viral DNA/RNA Extraction kit (BayBio Bio-tech Co., Ltd., Guangzhou, China), referred to as Baypure; HR Fast-Virus DNA/RNA kit (Huirui Biotechnology Co., Ltd., Zhuhai, China), referred to as HR; Viral Nucleic Acid Isolation kit (Jiangsu BioPerfectus Technologies Co., Ltd., Taizhou, China), referred to as BioPerfectus; TIANamp Virus DNA/RNA kit (Tiangen Biotech Co., Ltd., Beijing, China), referred to as TIANamp; ABT Nucleic Acid Extraction kit (Applied Biological Technologies Co., Ltd., Beijing, China), referred to as ABT; GIRM Nucleic Acid (DNA/RNA) Extraction kit (HuYanSuo Medical Technology Co., Ltd., Guangzhou, China), referred to as GIRM; and TRIzol reagent (Thermo Fisher Scientific, Waltham, MA, USA), referred to as TRIzol.

Nucleic acid concentration and integrity

The concentration and purity of RNA from each kit were assessed by analyzing 1 μL of extracted nucleic acid using the NanoDrop One (Thermo Fisher Scientific, Waltham, MA, USA). A known concentration of plasmids was added to the FME to compare the amount of DNA recovered from the elution solutions while concurrently assessing the integrity of the DNA fragments through 1.5% agarose gel electrophoresis.
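Spectrophotometers such as the NanoDrop report concentration and purity directly, but the underlying arithmetic is simple. The following is a minimal sketch, assuming the conventional conversion factor of 40 ng/μL per A260 unit for RNA at a 1 cm path length and the usual A260/A280 and A260/A230 purity ratios. The function names and the absorbance readings are hypothetical and are not data from this study.

```python
# Conventional conversion factor for RNA (assumption, 1 cm path length).
RNA_NG_PER_UL_PER_A260 = 40.0


def rna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """Estimate RNA concentration from absorbance at 260 nm."""
    return a260 * RNA_NG_PER_UL_PER_A260 * dilution_factor


def purity_ratios(a230, a260, a280):
    """Return (A260/A280, A260/A230), the two usual purity indicators."""
    return a260 / a280, a260 / a230


# Hypothetical readings, for illustration only.
conc = rna_concentration_ng_per_ul(0.5)
r280, r230 = purity_ratios(a230=0.25, a260=0.5, a280=0.25)
print(conc, r280, r230)
```

An A260/A280 ratio near 2.0 is commonly taken to indicate RNA with little protein contamination; lower values suggest residual protein or phenol.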

Quantitative real-time PCR

The qRT-PCR experiment was conducted using a SLAN-96P real-time PCR detection system (Hongshi Medical Technology Co., Ltd., Shanghai, China). RSV, ADV, IAV, and the SARS-CoV-2 pseudovirus were analyzed with a respiratory syncytial virus Nucleic Acid Diagnostic kit (HuYanSuo Medical Technology Co., Ltd., Guangzhou, China), a human adenovirus Nucleic Acid Diagnostic kit (HuYanSuo Medical Technology Co., Ltd., Guangzhou, China), a SARS-CoV-2 and influenza A/B Virus Nucleic Acid Diagnostic kit (Sansure Biotech Inc., Changsha, China), and a 2019-nCoV Nucleic Acid Diagnostic kit (Sansure Biotech Inc., Changsha, China), respectively. The reaction systems and procedures for the four commercial kits followed the manufacturers’ instructions. HSV-1 and HCoV-229E were detected using SYBR Green dye (Tsingke Biotech Co., Ltd., Beijing, China). The nucleic acid of HCoV-229E required an additional reverse transcription step of 25°C for 5 min, activation at 50°C for 15 min, and 85°C for 2 min (Vazyme Biotech Co., Ltd., Nanjing, China). The reaction system consisted of 10 μL of ArtiCan ATM SYBR qPCR Mix (Tsingke Biotech Co., Ltd., Beijing, China), 1.5 μL of forward and reverse primers (Additional file 1: Table S1), and 7 μL of DNA template. The PCR cycling program was as follows: 95°C for 1 min, then 40 cycles of 95°C for 10 s and 60°C for 20 s, followed by melting curve analysis.
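As a rough check on how long the SYBR Green program above runs, the hold times can be summed. This is a sketch with a hypothetical helper; it counts only the programmed holds and ignores temperature ramping and the melting curve, whose durations are instrument-dependent and not given in the text.

```python
def hold_time_seconds(initial_holds_s, cycle_steps_s, n_cycles):
    """Total programmed hold time: initial holds plus n_cycles of the cycle steps."""
    return sum(initial_holds_s) + n_cycles * sum(cycle_steps_s)


# 95°C for 1 min; 40 cycles of 95°C for 10 s and 60°C for 20 s.
sybr_seconds = hold_time_seconds([60], [10, 20], 40)
print(sybr_seconds, "s =", sybr_seconds / 60, "min")
```

The holds alone add up to 1260 s (21 min), so amplification, not extraction, dominates the sample-to-result time once extraction is reduced to about 5 min.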

Clinical performance of the FME

Frozen respiratory tract specimens (pharyngeal swabs) were collected from suspected IAV-infected patients, and the FME and the standard magnetic bead method (BioPerfectus) were used for nucleic acid extraction. Both extraction methods used a sample volume of 200 μL and an elution volume of 100 μL. The SARS-CoV-2 and Influenza A/B Virus Nucleic Acid Diagnostic kit (Sansure Biotech Inc., Changsha, China) was used for qRT-PCR detection of IAV in the clinical specimens. Briefly, 20 μL of nucleic acid was added to a mixture containing 26 μL of a PCR buffer mixture (dNTPs, MgCl2, primer, and probe) and 4 μL of an enzyme mixture (reverse transcriptase and Taq polymerase). The PCR cycling program was as follows: 50°C for 4 min and 95°C for 30 s, followed by 45 cycles of 95°C for 2 s and 60°C for 20 s. The SLAN-96P system (Hongshi Medical Technology Co., Ltd., Shanghai, China) was used for qRT-PCR. A VIC channel cycle threshold (Ct) value of ≤ 40 and a simultaneous Cy5 channel Ct value of ≤ 40 indicated positivity for IAV. If at least one of the two extraction methods yielded a positive result for IAV, the sample was classified as positive for IAV; otherwise, it was considered negative.
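The positivity rule and the combined classification described above can be written as two small functions. This is a minimal sketch: the function names are hypothetical, and representing a well with no amplification as `None` is an assumption of this example, not something specified by the kit or the text.

```python
from typing import Optional

CT_CUTOFF = 40.0  # Ct threshold stated in the text for both channels


def is_iav_positive(vic_ct: Optional[float], cy5_ct: Optional[float]) -> bool:
    """Positive only if both the VIC and Cy5 channels have Ct <= 40."""
    def detected(ct: Optional[float]) -> bool:
        # None stands for "no amplification" (assumption of this sketch).
        return ct is not None and ct <= CT_CUTOFF

    return detected(vic_ct) and detected(cy5_ct)


def combined_call(fme_positive: bool, bead_positive: bool) -> bool:
    """Sample is classified positive if at least one extraction method is positive."""
    return fme_positive or bead_positive


# Hypothetical Ct values, for illustration only.
fme = is_iav_positive(vic_ct=31.2, cy5_ct=29.8)
bead = is_iav_positive(vic_ct=None, cy5_ct=30.1)
print(fme, bead, combined_call(fme, bead))
```

Because the combined call is a logical OR, a sample detected by only one method still counts as positive, which is what makes per-method detection rates comparable against a common reference.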

Quantitative analysis of proteins

Modified Bradford reagent (Sangon Biotech Co., Ltd., Shanghai, China) was used to quantify the protein content in the washing solution after a magnetic bead wash. In brief, the reagent was used to measure the OD595 of bovine serum albumin (BSA) with a range of known concentrations. A standard curve correlating the absorbance with protein concentration at 595 nm was constructed. Subsequently, the OD595 of the test sample was measured in the same manner to determine its protein concentration by referencing the standard curve.
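The standard-curve procedure described above amounts to a linear fit followed by an inverse lookup. The sketch below uses a plain least-squares line; the BSA concentrations and OD595 readings are hypothetical values chosen for illustration, and a linear response is an assumption that holds only over a limited concentration range for Bradford assays.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_od(od595, slope, intercept):
    """Invert the standard curve to estimate protein concentration."""
    return (od595 - intercept) / slope


# Hypothetical BSA standards (mg/mL) and their OD595 readings.
bsa_mg_per_ml = [0.0, 0.25, 0.5, 1.0]
od595 = [0.05, 0.175, 0.30, 0.55]

slope, intercept = fit_line(bsa_mg_per_ml, od595)
print(concentration_from_od(0.40, slope, intercept))
```

Samples whose OD595 falls outside the range of the standards should be diluted and re-measured rather than extrapolated.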

Statistical analysis

SPSS Statistics 20 software (IBM, Armonk, NY, USA) was used to calculate the percentage of positive and negative agreements between the FME and standard magnetic bead methods, and Cohen’s kappa statistic was calculated [ 20 ]. Statistical graphs were generated using GraphPad Prism 8 (GraphPad Software Inc., San Diego, CA, USA). Statistical analysis was performed with Student’s t tests, and P < 0.05 (two tailed) was considered statistically significant. The data presented are representative of at least three independent experiments.

The combination of glycerin and EtOH can efficiently purify nucleic acid

Nucleic acid extraction by the magnetic bead method involves three steps: lysis, washing, and elution (Fig. 1 A). To develop a rapid and efficient method for nucleic acid extraction, we have explored a novel composition of lysate and washing solution with magnetic beads. First, the LC kit was selected as the control group due to its efficiency being comparable to that of other commercial kits for RSV and ADV extraction (Additional file 2 : Figure S1), and a novel washing solution, glycerin, was substituted for the one in the LC kit to compare its washing efficacy to that of this LC washing buffer. Glycerin, a commonly employed protein stabilizer and storage buffer component, was identified as a viable washing solution for the purification of nucleic acid on magnetic beads in our study. As the glycerin concentration increased gradually from 20% to 80%, the extraction efficiency of RSV gradually improved, with no significant difference between 80% glycerin and the LC washing buffer (Fig. 1 B). Considering the high viscosity of concentrated glycerin, we chose 50% glycerin and gradually increased the EtOH concentration on this basis. The results showed that compared with 50% glycerin, the higher the EtOH content, the better the extraction efficiency of RSV. When the proportion of EtOH reached 30%, its washing effect surpassed that of the LC washing buffer, reaching optimal performance when the EtOH concentration increased to 50% (Fig. 1 C). Subsequently, we gradually increased the glycerin content while maintaining 50% EtOH and observed a corresponding gradual enhancement in extraction efficiency as the glycerin increased from 10% to 50% (Fig. 1 D). This showed that the combination of 50% glycerin and 50% EtOH allowed glycerin and EtOH to effectively perform their respective washing functions and that the solution with this ratio possessed an optimal viscosity for facilitating the subsequent steps. 
Furthermore, it was surprising to observe that using 50% glycerin and 50% EtOH for one wash yielded better results than washing two or three times (Fig. 1 E). This indicated that a single wash with 50% glycerin and 50% EtOH effectively eliminated the majority of impurities, while increasing the number of washes may result in the loss of nucleic acids. To further explore the efficacy of glycerin in removing impurities, we quantitatively measured the protein content of the novel washing solution after the washing process. The presence of 50% glycerin was found to exhibit a certain protein elution effect; however, as the proportion of EtOH gradually increased and reached 30%, the efficiency of the protein elution was significantly enhanced (Fig. 1 F). The proportion of glycerin was gradually increased while maintaining 50% EtOH, but it was found that the washed protein content remained relatively stable (Fig. 1 G), suggesting that while glycerin removed impurities, the primary ability to wash proteins was due to EtOH. Simultaneously, we also found that washing only once with 50% glycerin and 50% EtOH effectively eliminated the majority of protein impurities, while subsequent rounds of washing yielded less protein removal.

figure 1

Glycerin and EtOH combination for nucleic acid purification. A Schematic diagram of the nucleic acid extraction magnetic bead method. B Using the LC reagent as a control group, varying glycerin concentrations (80%, 70%, 60%, 50%, 40%, 30%, and 20%) and diethyl pyrocarbonate (DEPC) H 2 O were used to replace the LC kit washing buffer for low-concentration RSV nucleic acid extraction. One washing was performed, and qRT-PCR Ct values were used to assess the washing solution effect. C Using the LC reagent as a control group, different glycerin–EtOH combinations (50% glycerin and 50% EtOH, 50% glycerin and 40% EtOH, 50% glycerin and 30% EtOH, 50% glycerin and 20% EtOH, 50% glycerin and 10% EtOH, and 50% glycerin) were used to replace the LC kit washing buffer for high-concentration RSV nucleic acid extraction. One washing was performed, and qRT-PCR Ct values were used to assess the washing solution effect. D EtOH-glycerin solutions (50% EtOH and 50% glycerin, 50% EtOH and 40% glycerin, 50% EtOH and 30% glycerin, 50% EtOH and 20% glycerin, 50% EtOH and 10% glycerin, and 50% EtOH) were used to replace the LC kit washing buffer for high-concentration RSV nucleic acid extraction. One washing was performed, and qRT-PCR Ct values were used to assess the washing solution effect. E High-concentration RSV nucleic acid extraction using 50% glycerin and 50% EtOH washing solution, followed by one, two, or three rounds of purification in the washing step. Efficiency was evaluated based on qRT-PCR Ct values. F – H Wash solutions obtained from the wash steps in C–E were analyzed for the protein concentration. Data are presented as means ± SD for three independent biological replicates. Statistical significance was calculated using t tests; ns P > 0.05, * P < 0.05, ** P < 0.01, *** P < 0.001

The effect of A Plus lysis solution

In addition to the washing solution, we developed a novel lysis solution termed A Plus (AP). When AP or other commercial lysis buffers were combined with 50% glycerin and 50% EtOH to extract high-concentration RSV, the lysis effect of AP was comparable or slightly superior to that of the other brands of lysis buffer (Fig. 2 A), indicating that AP is at least as effective as the lysis buffers in other commercially available kits. To investigate the key components involved in AP lysis, we used AP that lacked individual components to re-extract RSV. Compared with intact AP, omitting any single component decreased the extraction efficiency, and omitting GTC or IPA had the most pronounced impact (Fig. 2 B). This suggested that GTC and IPA play a pivotal role in the lysis effect of the AP solution. If a lysis solution is not thoroughly washed away during extraction, it may significantly affect subsequent experiments, such as PCR amplification. Therefore, we also investigated whether adding individual components of AP to the elution solution affected qRT-PCR performance. We found that qRT-PCR amplification was completely inhibited after the direct addition of AP lysis solution or GTC, and sarkosyl also had a significant inhibitory effect ( P < 0.01). Although sodium citrate did not cause obvious inhibition, it decreased the normalized reporter (Rn) fluorescence value of qRT-PCR (Additional file 3 : Figure S2A and S2B). 
Further research showed that the inhibitory effect of sodium citrate, sarkosyl, and GTC on qRT-PCR disappeared when their concentration in the elution solution was reduced to 1.25, 1, and 0.1 mM, respectively (Additional file 3 : Figure S2C–S2H). This implied that while sodium citrate, sarkosyl, and GTC are the primary components in the lysis process, they must be removed during the washing stage to ensure that their concentration in the elution solution remains below inhibitory levels.

figure 2

The AP solution demonstrates high efficacy in sample lysis. A During the extraction of high-concentration RSV nucleic acid, the lysis solution was prepared using either AP, phosphate-buffered saline (PBS), or a commercially available nucleic acid extraction kit lysis buffer. The washing solution consisted of a mixture of 50% glycerin and 50% EtOH, and the efficiency of each lysis solution was determined by analyzing the qRT-PCR Ct values. B AP lysis solution was prepared lacking specific components, and these lysis solutions or PBS were used to extract RSV viral nucleic acid at high concentrations. Subsequently, a wash step was performed using 50% glycerin and 50% EtOH, followed by the determination of qRT-PCR Ct values to assess the efficiency of each lysis solution. Data are presented as means ± SD for three independent biological replicates. Statistical significance was calculated using t tests; ns P > 0.05, * P < 0.05, ** P < 0.01, *** P < 0.001

Extraction effect of the FME

We used AP lysis solution, washing solution (50% glycerin and 50% EtOH), magnetic beads, and an elution solution separately and sequentially to form the FME. The concentration and purity of RNA are two important criteria for any RNA extraction process. We found that the concentration of RNA obtained after applying the FME method to AD293 cells was higher than that of other traditional methods, and the purity (260/280 ratio) remained between 1.9 and 2.0 (Table 1 ). Recovery and integrity are metrics used to evaluate the effectiveness of nucleic acid extraction systems. To evaluate the extraction efficiency of the FME on DNA, we added known amounts of DNA to AP and compared it with the amount of DNA in the elution after extraction. The results showed that there was a strong linear correlation between the amounts of DNA added to the FME and the amounts of DNA recovered from the system (Fig. 3 A) and that complete nucleic acid fragments were obtained through extraction using the FME (Fig. 3 B). Subsequently, we also conducted a comparative analysis of the RSV extraction efficiency using the FME and several commercially available kits, including magnetic bead (Magen and LC), spin column (ABT and TIANamp), and precipitation methods (GIRM and TRIzol). The results showed that both the FME and GIRM precipitation methods yielded the highest RSV nucleic acid concentrations, with FME demonstrating a shorter processing time, and the FME was superior to the magnetic bead method and spin column method reagents used as a comparison (Fig. 3 C). This suggested that the FME could obtain a high RNA concentration from viral samples in a short period of time.

figure 3

Performance analysis of the FME method. A There was a strong linear correlation (Pearson’s R, R² = 0.9969) between the amount of DNA added to the FME and the amount of DNA recovered from the FME (μg). B DNA integrity was analyzed by agarose gel electrophoresis. Lane 1: DNA marker, Lane 2: DNA before being added to the FME, Lane 3: DNA after being added to the FME. C The FME and six different nucleic acid extraction kits were used to extract high-concentration RSV, and the Ct values obtained through qRT-PCR served as the criteria for evaluating the efficiency of each extraction kit. D–F PBS was used for fivefold gradient dilutions of RSV, and each gradient was subjected to nucleic acid extraction using either manual or automatic methods. The Ct values were used to assess the consistency between automatic and manual operations. D: amplification curve, E: histogram, F: linear regression curve (manual extraction, R² = 0.9921; automatic extraction, R² = 0.9922). Data are presented as means ± SD for three independent biological replicates. Statistical significance was calculated using t tests; ns P > 0.05, * P < 0.05, ** P < 0.01, *** P < 0.001

Automation and performance evaluation of the FME

Automation and high throughput are two of the biggest advantages of the magnetic bead method [ 21 ]. Considering the differences in RNA yield between manual and automatic extraction, we paired the FME with an automatic nucleic acid extraction instrument and implemented specific procedures to achieve a 5-minute extraction (Additional file 1 : Table S2) and then compared the differences between the manual and automatic FME. As a result, it was observed that there was no significant difference in the efficacy of the manual or automatic FME when using fivefold diluted RSV samples sequentially (Fig. 3 D and E), and the regression curves obtained from both approaches exhibited a high degree of overlap (Fig. 3 F). This showed that after using a specific program, the FME could be adapted to currently available instruments and consumables, achieving high-quality automatic nucleic acid extraction within 5 min.

To confirm the ability of automatic FME to extract actual samples, we selected several respiratory viruses commonly used in laboratories and compared the FME method with a representative magnetic bead method (LC), spin column method (TIANamp), and precipitation method (GIRM). The results showed that both the FME and the precipitation method exhibited the most effective extraction of samples with high, medium, and low concentrations of RSV, and the FME was superior to the spin column method and magnetic bead method (Fig. 4 A). When extracting ADV, the traditional precipitation method was confirmed to be the most effective, followed by the FME and spin column, although the FME still outperformed the magnetic bead method at all concentrations (Fig. 4 B). For IAV extraction, except for a slightly better performance of the precipitation method at the highest concentration compared with the FME, both the FME and precipitation method displayed optimal effects at all concentrations, surpassing other techniques (Fig. 4 C). Regarding HSV-1 extraction, no significant difference was observed among the FME, precipitation, and spin column methods; however, all three were marginally more effective than the magnetic bead method (Fig. 4 D). For high-concentration pseudovirus in a SARS-CoV-2 extraction, the precipitation method was the best, followed by the FME and spin column methods, while the magnetic bead method was the least effective. At a medium concentration, there was no significant difference among the FME, precipitation, and spin column methods, and all were superior to the magnetic bead method. At a low concentration, there was no significant difference among the four methods (Fig. 4 E). In terms of high, medium, and low concentrations of HCoV-229E sample extractions, the FME exhibited the highest extraction effect, followed by the precipitation, magnetic bead, and spin column methods (Fig. 4 F). 
This showed that FME achieved a superior or second-best outcome compared with commonly used nucleic acid extraction kits available on the market, regardless of whether the process involved extracting DNA or RNA viruses, or enveloped or non-enveloped viruses.

figure 4

Effects of the FME on the extraction of multiple respiratory viruses. The efficiency of each method was evaluated by extracting high, medium, and low concentrations of respiratory viruses using the FME, magnetic bead (LC), spin column (TIANamp), and precipitation (GIRM) methods, followed by obtaining Ct values through qRT-PCR. A Effect of RSV nucleic acid extraction. B Effect of ADV nucleic acid extraction. C Effect of IAV nucleic acid extraction. D Effect of HSV-1 nucleic acid extraction. E Effect of SARS-CoV-2 pseudovirus nucleic acid extraction (N gene). F Effect of HCoV-229E nucleic acid extraction. Data are presented as means ± SD for three independent biological replicates. Statistical significance was calculated using t tests; ns P > 0.05, * P < 0.05, ** P < 0.01, *** P < 0.001

Frozen clinical specimens from 525 suspected IAV-infected patients were simultaneously extracted using both the standard magnetic bead method and the FME. The Ct values obtained by the two methods were compared to evaluate the clinical performance of the FME. With the standard magnetic bead method as the reference, the detection performance of the FME for IAV was as follows: sensitivity, 97.00% (323/333); specificity, 92.71% (178/192); positive predictive value, 95.85% (323/337); negative predictive value, 94.68% (178/188); and overall agreement, 95.43% ((323 + 178)/525) (Table 2 ). The consistency between the two methods was good, with a Kappa statistic of 0.901 ( P < 0.001). In addition, we analyzed the Ct value distribution of the clinical specimens. The average Ct values of IAV extracted by the standard magnetic bead method and FME were 31.49 and 31.16, respectively. There was no significant difference in the Ct value distribution between the two extraction methods. However, the Ct values obtained through the FME were broader in the upper and lower quartiles than those obtained by the magnetic bead method (Fig. 5 A). Further analysis was conducted on specimens with Ct values > 35 to evaluate the limit of detection (LoD) of the FME assay in actual clinical samples. There was no significant difference in the Ct distribution between the FME and the standard magnetic bead method in low-concentration specimens (Fig. 5 B).
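The agreement figures above follow directly from the 2 × 2 table implied by the reported fractions (323 concordant positives, 178 concordant negatives, 14 FME-only positives, and 10 magnetic-bead-only positives). The sketch below recomputes them, treating the standard magnetic bead method as the reference; the function name and the dictionary layout are illustrative choices, not part of the original analysis, which was run in SPSS.

```python
def agreement_stats(tp, fp, fn, tn):
    """Diagnostic agreement metrics and Cohen's kappa from a 2 x 2 table.

    tp: positive by both methods; tn: negative by both methods;
    fp: positive by the test method only; fn: positive by the reference only.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of each method.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "overall_agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Counts derived from the fractions reported in the text.
stats = agreement_stats(tp=323, fp=14, fn=10, tn=178)

if __name__ == "__main__":
    for name, value in stats.items():
        print(f"{name}: {value:.4f}")
```

Run on these counts, the sketch reproduces the reported percentages and a kappa of about 0.901.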

figure 5

Distribution of Ct values for clinical specimens using the FME and standard magnetic bead method. A Distributions of Ct values obtained from IAV clinical specimens using the FME and standard magnetic bead extraction methods. B Distribution of Ct values for low-concentration specimens (Ct values > 35 for results of both tests). Lines within boxes represent medians. Upper and lower boundaries of boxes represent upper and lower quartiles, respectively. Bars represent minimum and maximum values. Dashed lines indicate the detection limit. ND, not detected. Statistical significance was calculated using t tests; ns P > 0.05

We noticed that a total of 24 specimens had inconsistent results after being extracted using the FME and standard magnetic bead method. Among them, 14 cases were positive for FME detection and negative for standard magnetic bead method detection; 10 cases were negative for FME detection (including one case with a Ct value > 40, which was judged to be negative due to the interpretation criteria) and positive for standard magnetic bead method detection (Additional file 1 : Table S3). All these inconsistent results had Ct values > 35, and 21 had Ct values > 37. Clearly, these specimens were borderline positive results. In addition to the difference in extraction efficiency, the detection performance of the qRT-PCR instrument on the LoD also affects the detection rate of the results. Overall, the effect of the FME on clinical specimen extraction was comparable to that of the standard magnetic bead method.

In recent years, the field of pathogen detection technology has witnessed an unprecedented pace of development. Compared with lateral flow immunochromatographic assays (LFIAs), enzyme-linked immunosorbent assays (ELISAs), plaque reduction neutralization tests (PRNTs), and other technologies, NAAT-based detection methods exhibit superior sensitivity, specificity, and speed of detection [ 22 , 23 , 24 ]. Among all NAATs, qRT-PCR is the most widely used and has emerged as the preferred method for detecting human pathogens [ 25 , 26 ]. However, a major choke point of qRT-PCR diagnosis is that it relies on the extraction and purification of nucleic acids from samples, which is crucial for achieving optimal sensitivity, but also a relatively time-consuming and laborious process [ 27 , 28 ]. It has been reported that extraction-free methods can serve as an alternative procedure, alleviating the supply bottleneck of extraction reagents [ 29 ]. However, these strategies are incapable of eliminating PCR inhibitors in the specimen, leading to a significant reduction in sensitivity when processing samples with low viral loads [ 30 , 31 ]. Therefore, there is an urgent need for novel technology capable of rapidly extracting high-quality nucleic acid.

Glycerin functions as a humectant, solvent, plasticizer, adhesive, and binding agent. Despite its extensive utilization in the fields of food, cosmetics, medicine, and chemical synthesis, glycerin has not yet been employed as the primary raw material in nucleic acid purification reagents [ 32 ]. In this study, we demonstrated that glycerin, which is miscible with water in any proportion, can purify nucleic acids, and the higher the concentration of glycerin, the better the purification effect (Fig. 1 B). Furthermore, after adding a certain proportion of EtOH to 50% glycerin, the effect of washing with 50% glycerin and 50% EtOH for a short duration was superior to that of thoroughly washing by the conventional magnetic bead method three times (Fig. 1 C). However, we found that the elution of miscellaneous proteins using 50% glycerin and 50% EtOH primarily relied on the presence of EtOH (Fig. 1 F and G); the specific role of glycerin in nucleic acid purification remains to be investigated.

To release nucleic acids from the sample, physical, chemical, or enzymatic methods are usually used to lyse samples [ 9 ]. In clinical laboratories, lysis commonly involves mixing the sample with a chemical lysis solution, heating it to a high temperature, and supplementing it with proteinase K [ 21 , 33 ]. In this study, we developed a chemical-based lysis solution, AP, which exhibited a lysis effect comparable or superior to that of most commercially available lysis reagents when combined with 50% glycerin and 50% EtOH for nucleic acid extraction (Fig. 2 A). The FME, composed of AP, 50% glycerin and 50% EtOH, exhibited superior efficacy in RSV extraction compared with conventional methods (Fig. 3 C). Moreover, the RNA extracted from AD293 cells showed above-average concentration and purity (Table 1 ). It is worth pointing out that the commercially available kits were modified to keep sample loading and elution volumes consistent with the FME; thus, the above results may not reflect the optimal efficiency of these kits. Nevertheless, these findings highlight the advantages of the FME in terms of speed and performance. We also examined how residual AP lysis solution in the eluate inhibits PCR once the concentration of an AP component reaches a certain threshold. When GTC, sodium citrate, or sarkosyl reached concentrations that inhibited PCR amplification, the Ct values increased exponentially (Additional file 3 : Figure S2). Subsequently, when the FME eluate was serially diluted 10-fold, the Ct values increased by approximately 3.4 per dilution step (Additional file 4 : Figure S3), indicating the absence of inhibitory residues in the eluate. Although concentration and purity are important parameters for evaluating the quality of nucleic acid extraction, the ultimate indicator lies in its functionality in downstream applications [ 34 ].
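The dilution check above rests on a standard qPCR relation: if each cycle multiplies the template by (1 + E), a D-fold dilution shifts the Ct by log(D) / log(1 + E), which is about 3.32 cycles per 10-fold step at perfect doubling. The sketch below uses that textbook relation to put the observed shift of roughly 3.4 in context; it is not the authors' analysis, and the function names are illustrative.

```python
import math


def expected_delta_ct(dilution_factor, efficiency=1.0):
    """Ct increase expected per dilution step.

    efficiency is the per-cycle amplification efficiency E
    (1.0 means the template doubles every cycle).
    """
    return math.log(dilution_factor) / math.log(1.0 + efficiency)


def efficiency_from_delta_ct(delta_ct, dilution_factor=10.0):
    """Per-cycle efficiency implied by an observed Ct shift per dilution step."""
    return dilution_factor ** (1.0 / delta_ct) - 1.0


if __name__ == "__main__":
    ideal = expected_delta_ct(10)              # ideal shift for a 10-fold dilution
    implied = efficiency_from_delta_ct(3.4)    # efficiency implied by a 3.4-cycle shift
    print(f"ideal shift per 10-fold dilution: {ideal:.2f} cycles")
    print(f"efficiency implied by a 3.4-cycle shift: {implied:.1%}")
```

A shift close to the ideal value suggests that diluting the eluate does not relieve any inhibition; a residual inhibitor would typically make the undiluted sample amplify later than expected, so the first dilution step would show a smaller Ct shift.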

Compared with other nucleic acid extraction techniques, the magnetic bead method offers several advantages, including unrestricted sample volumes, easy removal from sample suspensions, and suitability for large-scale automated operations. Moreover, various types of magnetic particles can be utilized for the separation and purification of DNA, RNA, and plasmid DNA [ 35 , 36 ]. We equipped the FME with a small automatic nucleic acid extractor to automate the FME in the same manner as the manual operation in terms of extraction time, quality, and stability (Fig. 3 F). We observed that the FME demonstrated superior or comparable efficacy in extracting respiratory viruses compared with magnetic bead, spin column, and precipitation methods (Fig. 4 ). We hypothesized that the disparity in extraction efficiency among different viruses was associated with the presence of a specific secondary structure and envelope. It should be noted that, in terms of time requirements, the magnetic bead method, when automated, required approximately 25–30 min, while both the spin column method and precipitation method, being manual processes, necessitated close to 1 h. Furthermore, both the spin column and precipitation methods require manual operations, which places high demands on the skills and proficiency of the operators. By contrast, the automatic FME not only extracted high-quality nucleic acid but also required less time with no technical requirements for personnel. In addition, our research showed that the FME had excellent clinical performance when analyzing specimens from individuals suspected to have IAV infection. The distribution of Ct values detected by FME in 525 clinical specimens was similar to that observed using the standard magnetic bead method (Fig. 5 A), and there was no significant difference between the two methods in the analysis of low-concentration specimens close to the LoD (Fig. 5 B), although FME identified more positive cases (Additional file 1 : Table S3). 
In spite of this, it is essential to observe that the aforementioned clinical specimens refer to pharyngeal swabs, and other samples such as sputum, stool, and blood cannot be directly processed using the FME method. In the case of sputum and stool, pre-treatment is required prior to extraction, which typically takes a minimum of 30 min [ 37 , 38 ]. Additionally, whole blood and plasma samples are highly viscous, and therefore require additional time for lysis, washing and magnetic bead magnetization. Consequently, when extracting viruses from complex samples, FME is not able to produce high-quality nucleic acid within 5 min. Collectively, these findings highlight the promising clinical applicability of FME for analyzing non-complex samples.

Recently, numerous innovative techniques for nucleic acid extraction have been reported, including the use of cellulose-based membranes to extract nucleic acid without the requirement of a separate elution step, enabling direct amplification from the membrane [ 39 ]. A polytetrafluoroethylene (PTFE)-based nucleic acid extraction system achieved a fully enclosed sample extraction process that integrated seamlessly with droplet digital PCR (ddPCR) [ 40 ]. Using an integrated microfluidic system, the process of sample extraction, enrichment, and detection was completed within a small chip [ 41 ]. Although these novel methodologies are rapid and compact, their applicability remains confined to specific scenarios. To achieve widespread adoption in scientific research and clinical testing, challenges pertaining to automation, cost-effectiveness, and the processing of complex samples still need to be addressed. The results of qRT-PCR detection depend on a number of factors, including the specimen type, the timing of collection, the quality and quantity of viral RNA, the primer and probe design for viral RNA targets, the reagents and instruments used for detection, and the signals and cut-off values employed for result interpretation [ 42 , 43 ]. Among these, obtaining high-quality nucleic acid is fundamental for accurate detection. However, when switching from one type of RNA extraction kit to another, particularly one based on different chemical components, the reliability of both the quantity and quality of the extracted nucleic acid becomes uncertain [ 44 , 45 ]. Stably detecting viruses with low loads and high Ct values while avoiding false negatives remains a challenging problem.

In summary, this study demonstrated that glycerin serves as an effective washing solution for nucleic acid purification and presented a novel glycerin-based nucleic acid extraction technology with a processing time of only 5 min. The developed technology is rapid, cost-effective, and automatable, and it yields high-quality nucleic acids. Consequently, this study contributes to reducing sample pre-processing time in NAATs and lays a foundation for rapid molecular diagnosis.

Availability of data and materials

All data generated or analyzed during this study are included in this published article and its supplementary information files.

Abbreviations

RSV: Respiratory syncytial virus
SARS-CoV-2: Severe acute respiratory syndrome coronavirus 2
IAV: Influenza A virus
NAAT: Nucleic acid amplification test
qRT-PCR: Quantitative real-time polymerase chain reaction
Isothermal amplification technologies
NGS: Next-generation sequencing
GTC: Guanidine thiocyanate
RNase P: Ribonuclease P
DMEM: Dulbecco’s modified Eagle medium
FBS: Fetal bovine serum
HSV-1: Herpes simplex virus 1
HCoV-229E: Human coronavirus 229E
FME: Five-minute nucleic acid extraction
CFDA: China Food and Drug Administration
Sodium citrate tribasic dihydrate
Sarkosyl: Sodium lauroyl-sarcosine
DTT: Dithiothreitol
PEG 6000: Polyethylene glycol 6000
IPA: Isopropyl alcohol
EDTA: Ethylene diamine tetraacetic acid
Ct: Cycle threshold
BSA: Bovine serum albumin
AP: A Plus lysis solution
Rn: Normalized reporter
LoD: Limit of detection
LFIAs: Lateral flow immunochromatographic assays
ELISAs: Enzyme-linked immunosorbent assays
PRNTs: Plaque reduction neutralization tests
PTFE: Polytetrafluoroethylene
ddPCR: Droplet digital PCR
DEPC: Diethyl pyrocarbonate
PBS: Phosphate-buffered saline

Olofsson S, Brittain-Long R, Andersson LM, Westin J, Lindh M. PCR for detection of respiratory viruses: seasonal variations of virus infections. Expert Rev Anti Infect Ther. 2011;9(8):615–26.

Kikkert M. Innate immune evasion by human respiratory RNA viruses. J Innate Immun. 2020;12(1):4–20.

Mahony JB. Nucleic acid amplification-based diagnosis of respiratory virus infections. Expert Rev Anti Infect Ther. 2010;8(11):1273–92.

Taubenberger JK, Morens DM. 1918 Influenza: the mother of all pandemics. Emerg Infect Dis. 2006;12(1):15–22.

Zhou P, Yang X-L, Wang X-G, Hu B, Zhang L, Zhang W, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579(7798):270–3.

Ma S, Lavelle TA, Ollendorf DA, Lin P-J. Herd immunity effects in cost-effectiveness analyses among low- and middle-income countries. Appl Health Econ Health Policy. 2022;20(3):395–404.

Lucero-Prisno DE, Shomuyiwa DO, Vicente CR, Méndez MJG, Qaderi S, Lopez JC, et al. Achieving herd immunity in South America. Global Health Res Policy. 2023;8(1):2.

Tali SHS, LeBlanc JJ, Sadiq Z, Oyewunmi OD, Camargo C, Nikpour B, et al. Tools and Techniques for Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2)/COVID-19 Detection. Clin Microbiol Rev. 2021;34(3):e00228-20.

Tan SC, Yiap BC. DNA, RNA, and protein extraction: the past and the present. J Biomed Biotechnol. 2009;2009:574398.

LeBlanc JJ, Gubbay JB, Li Y, Needle R, Arneson SR, Marcino D, et al. Real-time PCR-based SARS-CoV-2 detection in Canadian laboratories. J Clin Virol. 2020;128:104433.

Carter DJ, Cary RB. Lateral flow microarrays: a novel platform for rapid nucleic acid detection based on miniaturized lateral flow chromatography. Nucleic Acids Res. 2007;35(10):e74.

James AS, Alawneh JI. COVID-19 Infection Diagnosis: Potential Impact of Isothermal Amplification Technology to Reduce Community Transmission of SARS-CoV-2. Diagnostics (Basel). 2020;10(6):399.

Hess JF, Kohl TA, Kotrová M, Rönsch K, Paprotka T, Mohr V, et al. Library preparation for next generation sequencing: a review of automation strategies. Biotechnol Adv. 2020;41:107537.

Ali N, Rampazzo RdCP, Costa ADT, Krieger MA. Current nucleic acid extraction methods and their implications to point-of-care diagnostics. BioMed Res Int. 2017;2017:9306564.

Alabi T, Patel SB, Bhatia S, Wolfson JA, Singh P. Isolation of DNA-free RNA from human bone marrow mononuclear cells: comparison of laboratory methods. Biotechniques. 2020;68(3):159–62.

Boom R, Sol CJ, Salimans MM, Jansen CL, Dillen PMW-V, Noordaa JVD. Rapid and simple method for purification of nucleic acids. J Clin Microbiol. 1990;28(3):495–503.

Hawkins TL, O’Connor-Morin T, Roy A, Santillan C. DNA purification and isolation using a solid-phase. Nucleic Acids Res. 1994;22(21):4543–4.

Yoza B, Matsumoto M, Matsunaga T. DNA extraction using modified bacterial magnetic particles in the presence of amino silane compound. J Biotechnol. 2002;94(3):217–24.

Li P, Li M, Yue D, Chen H. Solid-phase extraction methods for nucleic acid separation. A review. J Sep Sci. 2021;45(1):172–84.

Article   PubMed   Google Scholar  

Viera AJ, Garrett JM. Understanding interobserver agreement: the kappa statistic. Fam Med. 2005;37(5):360–3.

PubMed   Google Scholar  

Sabat J, Subhadra S, Rath S, Ho LM, Kanungo S, Panda S, et al. Yielding quality viral RNA by using two different chemistries: a comparative performance study. Biotechniques. 2021;71(4):510–5.

Okba NMA, Müller MA, Li W, Wang C, GeurtsvanKessel CH, Corman VM, et al. Severe acute respiratory syndrome coronavirus 2-specific antibody responses in coronavirus disease patients. Emerg Infect Dis. 2020;26(7):1478–88.

Li Z, Yi Y, Luo X, Xiong N, Liu Y, Li S, et al. Development and clinical application of a rapid IgM-IgG combined antibody test for SARS-CoV-2 infection diagnosis. J Med Virol. 2020;92(9):1518–24.

Manenti A, Maggetti M, Casa E, Martinuzzi D, Torelli A, Trombetta CM, et al. Evaluation of SARS-CoV-2 neutralizing antibodies using a CPE-based colorimetric live virus micro-neutralization assay in human serum samples. J Med Virol. 2020;92(10):2096–104.

D’Cruz RJ, Currier AW, Sampson VB. Laboratory testing methods for novel Severe Acute Respiratory Syndrome-Coronavirus-2 (SARS-CoV-2). Front Cell Dev Biol. 2020;8:468.

Carter LJ, Garner LV, Smoot JW, Li Y, Zhou Q, Saveson CJ, et al. Assay techniques and test development for COVID-19 diagnosis. ACS Cent Sci. 2020;6(5):591–605.

Rahman MM, Elaissari A. Nucleic acid sample preparation for in vitro molecular diagnosis_from conventional. Drug Discov Today. 2012;17(21–22):1199–207.

Thatcher SA. DNA/RNA preparation for molecular detection. Clin Chem. 2015;61(1):89–99.

Smyrlaki I, Ekman M, Lentini A, Rufino de Sousa N, Papanicolaou N, Vondracek M, et al. Massive and rapid COVID-19 testing is feasible by extraction-free SARS-CoV-2 RT-PCR. Nat Commun. 2020;11(1):4812.

Bruce EA, Huang ML, Perchetti GA, Tighe S, Laaguiby P, Hoffman JJ, et al. Direct RT-qPCR detection of SARS-CoV-2 RNA from patient nasopharyngeal swabs without an RNA extraction step. PLoS Biol. 2020;18(10):e3000896.

Lübke N, Senff T, Scherger S, Hauka S, Andrée M, Adams O, et al. Extraction-free SARS-CoV-2 detection by rapid RT-qPCR universal for all primary respiratory materials. J Clin Virol. 2020;130:104579.

Becker LC, Bergfeld WF, Belsito DV, Hill RA, Klaassen CD, Liebler DC, et al. Safety Assessment of Glycerin as Used in Cosmetics. Int J Toxicol. 2019;38(3_suppl):6S-22S.

Beall SG, Cantera J, Diaz MH, Winchell JM, Lillis L, White H, et al. Performance and workflow assessment of six nucleic acid extraction technologies for use in resource limited settings. PLoS One. 2019;14(4):e0215753.

Jeffries MKS, Kiss AJ, Smith AW, Oris JT. A comparison of commercially-available automated and manual extraction kits for the isolation of total RNA from small tissue samples. BMC Biotechnol. 2014;14:94.

Chiang C-L, Sung C-S, Wu T-F, Chen C-Y, Hsu C-Y. Application of superparamagnetic nanoparticles in purification of plasmid DNA from bacterial cells. J Chromatogr B Analyt Technol Biomed Life Sci. 2005;822(1–2):54–60.

Berensmeier S. Magnetic particles for the separation and purification of nucleic acids. Appl Microbiol Biotechnol. 2006;73(3):495–504.

Peng J, Lu Y, Song J, Vallance BA, Jacobson K, Yu HB, et al. Direct Clinical Evidence Recommending the Use of Proteinase K or Dithiothreitol to Pretreat Sputum for Detection of SARS-CoV-2. Front Med. 2020;7.

Harrington C, Sun H, Jeffries-Miles S, Gerloff N, Mandelbaum M, Pang H, et al. Culture-Independent Detection of Poliovirus in Stool Samples by Direct RNA Extraction. Microbiol Spectr. 2021;9(3):e0066821.

Zou Y, Mason MG, Wang Y, Wee E, Turni C, Blackall PJ, et al. Nucleic acid purification from plants, animals and microbes in under 30 seconds. PLoS Biol. 2017;15(11):e2003916.

Yin J, Hu J, Sun J, Wang B, Mu Y. A fast nucleic acid extraction system for point-of-care and integration of digital PCR. Analyst. 2019;144(23):7032–40.

Ohlsson P, Evander M, Petersson K, Mellhammar L, Lehmusvuori A, Karhunen U, et al. Integrated acoustic separation, enrichment, and microchip polymerase chain reaction detection of bacteria from blood for rapid sepsis diagnostics. Anal Chem. 2016;88(19):9403–11.

Rahbari R, Moradi N, Abdi M. rRT-PCR for SARS-CoV-2: Analytical considerations. Clin Chim Acta. 2021;516:1–7.

Udugama B, Kadhiresan P, Kozlowski HN, Malekjahani A, Osborne M, Li VYC, et al. Diagnosing COVID-19: the disease and tools for detection. ACS Nano. 2020;14(4):3822–35.

Ambrosi C, Prezioso C, Checconi P, Scribano D, Sarshar M, Capannari M, et al. SARS-CoV-2: Comparative analysis of different RNA extraction methods. J Virol Methods. 2021;287:114008.

Vallejo L, Martínez-Rodríguez M, Nieto-Bazán MJ, Delgado-Iribarren A, Culebras E. Comparative study of different SARS-CoV-2 diagnostic techniques. J Virol Methods. 2021;298:114281.

Download references

Acknowledgments

We would like to thank all the staff members of the molecular diagnostic PCR team for their support and valuable discussion.

This work was supported by the Research and Development Project in the Key Areas of Guangdong Province (2022B1111020003) and the Emergency Key Program of Guangzhou Laboratory (EKPG21-13). The funders had no input into the study design, data collection, or interpretation, or the decision to submit the work for publication.

Author information

Authors and affiliations.

Guangzhou National Laboratory, Guangzhou, China

Yu Wang, Yuanyuan Huang, Yuqing Peng, Qinglin Cao, Guangxin Xu & Rong Zhou

State Key Laboratory of Respiratory Disease, National Clinical Research Center for Respiratory Disease, Guangzhou Institute of Respiratory Health, the First Affiliated Hospital of Guangzhou Medical University, Guangzhou Medical University, Guangzhou, China

Wenkuan Liu, Zhichao Zhou & Rong Zhou

School of Pharmacy, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China

GIRM Biosafety (Guangzhou) Co., Ltd, Guangzhou, China


Contributions

YW: conceptualization, methodology, investigation, data curation, formal analysis, writing–original draft. YYH: investigation, data curation, formal analysis, writing–original draft. YQP: investigation, data curation, validation. QLC: investigation, data curation. WKL: methodology, investigation, resources, validation. ZCZ: methodology, resources. GXX: investigation, data curation. LL: methodology, funding acquisition, resources, supervision, writing–reviewing and editing. RZ: conceptualization, funding acquisition, project administration, writing–reviewing and editing. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Lei Li or Rong Zhou.

Ethics declarations

Ethics approval and consent to participate.

This study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The research protocol was reviewed and approved by the Ethics Committee of the First Affiliated Hospital of Guangzhou Medical University (No. 2020-77). Written informed consent was obtained from the guardians of all enrolled patients.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

12985_2024_2381_MOESM2_ESM.tif

Additional file 2: Figure S1. The LC exhibits an extraction efficiency similar to the magnetic bead method. A and B RSV and ADV nucleic acid were extracted using LC, Baypure, and HR magnetic bead extraction kits, and qRT-PCR Ct values were used to assess the efficiency of each method for extracting nucleic acid. A: Effect of RSV nucleic acid extraction. B: Effect of ADV nucleic acid extraction. Data are presented as means ± SD for three independent biological replicates. Statistical significance was calculated using t tests; ns P > 0.05, * P < 0.05, ** P < 0.01.

12985_2024_2381_MOESM3_ESM.tif

Additional file 3: Figure S2. Effect of residual AP lysis solution on PCR amplification. RSV nucleic acid was extracted at a high concentration by the FME. A and B In the RSV reaction system, 4 μL of RSV nucleic acid was added, along with an additional 1 μL each of DEPC H₂O, 25 mM sodium citrate, 20 mM sarkosyl, 2.5% PEG 6000, 1 M DTT, and 4 M GTC, IPA, or AP lysis solution. qRT-PCR Ct values were used to assess the effect of the residual AP lysis solution components on amplification. C and D 4 μL of RSV nucleic acid was added, followed by an additional 1 μL of sodium citrate diluted in a twofold gradient. Ct values were used to assess the effect of residual sodium citrate on PCR amplification. E and F 4 μL of RSV nucleic acid was added, followed by an additional 1 μL of sarkosyl diluted in a twofold gradient. Ct values were used to assess the effect of residual sarkosyl on PCR amplification. G and H 4 μL of RSV nucleic acid was added, followed by an additional 1 μL of GTC in a twofold gradient. Ct values were used to assess the effect of GTC on PCR amplification. Solid lines indicate the median, and dashed lines indicate the detection limit. Data are presented as means ± SD for three or four independent biological replicates.

12985_2024_2381_MOESM4_ESM.tif

Additional file 4: Figure S3. The Ct values of the eluted nucleic acid were determined following a 10-fold gradient dilution. High concentrations of RSV were extracted by the FME, and the nucleic acid in the elution was sequentially diluted using a 10-fold gradient with DEPC H₂O. The Ct values were determined by qRT-PCR for each dilution gradient, with each measurement repeated three times. A Amplification curve. B Linear regression curve, R² = 0.9993.

Additional file 5. 

Additional file 6.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Wang, Y., Huang, Y., Peng, Y. et al. Development and validation of a rapid five-minute nucleic acid extraction method for respiratory viruses. Virol J 21, 189 (2024). https://doi.org/10.1186/s12985-024-02381-3

Received: 10 January 2024

Accepted: 02 May 2024

Published: 18 August 2024

DOI: https://doi.org/10.1186/s12985-024-02381-3


  • Respiratory viruses
  • Glycerin and ethanol

Virology Journal

ISSN: 1743-422X


August 20, 2024


iSN04: A novel nucleic acid drug for the treatment of vascular diseases

by Shinshu University


Atherosclerosis, a major cause of mortality worldwide, involves an overgrowth of vascular smooth muscle cells in the blood vessels, constraining blood flow and potentially causing cardiovascular diseases.

Against this backdrop, researchers from Shinshu University have developed a DNA aptamer called iSN04 that targets and counteracts the protein nucleolin in smooth muscle cells. This anti-nucleolin aptamer helps maintain smooth muscle cells in a differentiated state, offering new treatment potential for atherosclerosis and other vascular diseases.

Their findings were published on 15 June 2024 in Volume 14 of the journal Biomolecules. Ms. Mana Miyoshi, affiliated with the Department of Agriculture, Graduate School of Science and Technology, Shinshu University, contributed to the study as the first author.

Heart diseases and strokes are on the rise worldwide, with atherosclerosis being a leading contributor. Atherosclerosis involves the buildup of plaques—composed of fat, cholesterol, calcium, and other substances found in the blood—inside the arteries.

Over time, this buildup can lead to the hardening and narrowing of the arteries, restricting blood flow. Notably, this condition involves vascular smooth muscle cells (VSMCs) in the arterial walls.

VSMCs can switch between a contractile state (ideal for blood vessel function) and a proliferative state (capable of contributing to plaque formation). During atherosclerosis, the switch from a contractile to a proliferative state can lead to plaque instability and rupture, underscoring the importance of a corresponding therapeutic strategy.

Traditional treatments for atherosclerosis typically focus on lowering cholesterol levels and managing risk factors like high blood pressure. However, a novel approach involves directly targeting VSMCs to stabilize plaques and prevent their rupture.

This strategy has led researchers from Shinshu University, Japan, to develop a novel nucleic acid drug called iSN04. Associate Professor Tomohide Takaya, from the Faculty of Agriculture, Shinshu University, led the study.

iSN04 belongs to a group of nucleic acid drugs called DNA aptamers: short, single-stranded DNA molecules capable of selectively binding to a specific target of interest. iSN04 interacts with a protein called nucleolin in VSMCs.

Nucleolin plays a role in the de-differentiation (loss of specialized function) and proliferation of VSMCs, contributing to plaque formation and instability. By targeting nucleolin with iSN04, the researchers aimed to keep VSMCs in their contractile state, thereby reducing plaque formation and promoting plaque stability.

The same research group laid the groundwork for the current study through several earlier studies.

Dr. Takaya says, "Our anti-nucleolin DNA aptamer, iSN04, was originally identified as an inducer of skeletal muscle differentiation in 2021. Subsequently, we found that iSN04 also promotes cardiac muscle differentiation in 2023. Therefore, we hypothesized that iSN04 could promote smooth muscle differentiation."

He adds, "This study showed that iSN04 indeed induces vascular smooth muscle cell differentiation, resulting in the inhibition of angiogenesis."

The study showed that iSN04 can help maintain VSMCs in their contractile, differentiated state. Notably, the researchers found that iSN04 can effectively enter VSMCs without any carriers. Once inside, iSN04 reduced VSMC proliferation and increased the levels of a protein marker called α-smooth muscle actin, helping VSMCs achieve a contractile state.

Another concern the researchers sought to address was angiogenesis within plaques. Angiogenesis refers to new blood vessel formation within plaques that can lead to their instability and rupture. The study demonstrated that iSN04 could inhibit angiogenesis in an experimental model using mouse aortic rings, indicating its potential to stabilize plaques.

Ms. Miyoshi said, "Given our anti-nucleolin DNA aptamer inhibits angiogenesis by inducing VSMC differentiation, it can find applications as a nucleic acid drug for pathological angiogenesis involved in atherosclerosis, cancer, and retinopathy."

The breakthrough development of iSN04 marks a promising new chapter in the fight against atherosclerosis, and could revolutionize its treatment and improve patient outcomes globally.



1984 Dec 21

1984

1983 Jan 11

1983 Jan 25

1983 Feb 11

1983 Feb 25

1983 Mar 11

1983 Mar 25

1983 Apr 11

1983 Apr 25

1983 May 11

1983 May 25

1983 Jun 11

1983 Jun 25

1983 Jul 11

1983 Jul 25

1983 Aug 11

1983 Aug 25

1983 Sep 10

1983 Sep 24

1983 Oct 11

1983 Oct 25

1983 Nov 11

1983 Nov 25

1983 Dec 10

1983 Dec 20

1982 Jan 11

1982 Jan 22

1982 Feb 11

1982 Feb 25

1982 Mar 11

1982 Mar 25

1982 Apr 10

1982 Apr 24

1982 May 11

1982 May 25

1982 Jun 11

1982 Jun 25

1982 Jul 10

1982 Jul 24

1982 Aug 11

1982 Aug 25

1982 Sep 11

1982 Sep 25

1982 Oct 11

1982 Oct 25

1982 Nov 11

1982 Nov 25

1982 Dec 11

1982 Dec 20

1981 Jan 10

1981 Jan 24

1981 Feb 11

1981 Feb 25

1981 Mar 11

1981 Mar 25

1981 Apr 10

1981 Apr 24

1981 May 11

1981 May 25

1981 Jun 11

1981 Jun 25

1981 Jul 10

1981 Jul 24

1981 Aug 11

1981 Aug 25

1981 Sep 11

1981 Sep 25

1981 Oct 10

1981 Oct 24

1981 Nov 11

1981 Nov 25

1981 Dec 11

1981 Dec 21

1980 Jan 11

1980 Jan 25

1980 Feb 11

1980 Feb 25

1980 Mar 11

1980 Mar 25

1980 Apr 11

1980 Apr 25

1980 May 10

1980 May 24

1980 Jun 11

1980 Jun 25

1980 Jul 11

1980 Jul 25

1980 Aug 11

1980 Aug 25

1980 Sep 11

1980 Sep 25

1980 Oct 10

1980 Oct 24

1980 Nov 11

1980 Nov 25

1980 Dec 11

1980 Dec 20

1979 Jan

1979

1979

1979 Feb

1979 Mar

1979 Apr

1979 Jun 11

1979 Jun 25

1979 Jul 11

1979 Jul 25

1979 Aug 10

1979 Aug 24

1979 Sep 11

1979 Sep 25

1979 Oct 10

1979 Oct 25

1979 Nov 10

1979 Nov 24

1979 Dec 11

1979 Dec 20

1978 Jan

1978 Jan 1

1978 Feb

1978 Mar

1978 Apr

1978 May

1978 Jun

1978 Jul

1978 Jul 1

1978 Aug

1978 Sep

1978 Oct

1978 Nov

1978 Dec

1977 Jan

1977 Feb

1977 Mar

1977 Apr

1977 May

1977 Jun

1977 Jul

1977 Aug

1977 Sep

1977 Oct

1977 Nov

1977 Dec

1976 Jan

1976 Feb

1976 Mar

1976 Apr

1976 May

1976 Jun

1976 Jul

1976 Aug

1976 Sep

1976 Oct

1976 Nov

1976 Dec

1975 Jan

1975 Feb

1975 Mar

1975 Apr

1975 May

1975 Jun

1975 Jul

1975 Aug

1975 Sep

1975 Oct

1975 Nov

1975 Dec

1974 Jan

1974 Feb

1974 Mar

1974 Apr

1974 May

1974 Jun

1974 Jul

1974 Aug

1974 Sep

1974 Oct

1974 Nov

1974 Dec
Cellular DNA damage response pathways might be useful against some disease-causing viruses

research news

The UB research is applicable to two types of small DNA viruses: papillomaviruses, such as HPV, and polyomaviruses, such as the one shown here in a 3D print. Polyomaviruses infect most people without causing serious disease but they can lead to serious diseases and some cancers in immunologically weakened individuals. Photo:  NIAID

By ELLEN GOLDBAUM

Published August 22, 2024

New research reveals that triggering a cell’s DNA damage response could be a promising avenue for developing novel treatments against several rare but devastating viruses for which no antiviral treatments exist, possibly including human papillomavirus (HPV), which can cause cancer.

Published online on Aug. 10 in Nucleic Acids Research, the paper focuses on the DNA damage response pathway and demonstrates how this pathway can reduce the function of a viral enzyme, a helicase, thereby suppressing viral replication.

“This research is significant both for understanding how cells respond to DNA damage, to prevent them from becoming cancerous in the first place, how targeting this pathway can be used in new cancer treatments, and because it now opens up possibilities for new approaches to treating some rare but devastating viral infections,” says Thomas Melendy, senior author on the paper and associate professor of microbiology and immunology in the Jacobs School of Medicine and Biomedical Sciences at UB. 

How replication slows in response to DNA damage

The research focuses on a process called the DNA damage response, part of which has evolved to stop or slow DNA synthesis whenever cellular DNA damage occurs. “These pathways are important for preventing exacerbation of DNA damage that can lead to either cell death or cancer,” explains Rama Dey-Rao, research assistant professor of microbiology and immunology in the Jacobs School and joint first author on the paper with Caleb Hominski, a previous student in the lab.

When these pathways are activated, DNA replication is suppressed at sites in the genome called origins; at the same time, the progression of DNA replication forks also slows down. Replication forks, so-called because their structure resembles a fork, are where large groups of proteins coordinate genome replication through the unwinding and synthesis of DNA.

Melendy says that while quite a bit is known about how DNA damage response causes cells to stop DNA replication origins from “firing,” it’s been much harder to figure out how the progression of replication forks slows down in response to DNA damage.

“Researchers have been very interested in how that slowing occurs because it’s so dramatic,” says Melendy. “DNA damage response pathways cause replication forks to slow down progression by about ten-fold. This ten-fold slowdown means that synthesis of the cell’s genome, which usually takes about 12 hours, would take nearly five days, greatly increasing the time cells have to repair DNA damage.”

Viral connection

For years, Melendy and his colleagues have been studying two types of small DNA viruses: papillomaviruses, such as HPV, and polyomaviruses, which infect most people without causing serious disease but can lead to serious diseases and some cancers in immunologically weakened individuals. A rare cancer caused by a polyomavirus led to the death of musician Jimmy Buffett in 2023.

“We previously showed that in response to DNA damage, HPV does not stop or slow its DNA replication, while polyomaviruses do stop or slow their DNA replication,” says Melendy, “so by comparing and contrasting these two virus types we can gain insights into how polyomavirus DNA replication is slowed in response to DNA damage, which in turn provides us insights into how human cells slow replication forks.”

In the current research, they demonstrate that a phosphorylation site — where a phosphate is added to a molecule — on the major polyomavirus DNA replication and transcription protein is highly conserved in polyomaviruses across many animal species.

“The conservation of this phosphorylation/modification across polyomaviruses that have evolved to infect many different species of mammals suggested it was likely important,” says Melendy.

To study the effects of this phosphorylation, the UB researchers mutated the specific amino acid residue on the viral protein where it occurs so that the residue mimics the presence of a phosphate group.

When they expressed this mutant viral protein in human cells using a system to evaluate polyomavirus DNA replication, they found the virus’s genome replication was decreased by 10-fold. However, viral transcription was unaffected, indicating that phosphorylation on that amino acid residue has a highly specific effect on viral DNA replication but does not affect other functions of that protein.

Role for DNA helicase

In comparing the wild-type and mutant proteins, they found that the only function compromised in the mutant was its ability to act as a DNA helicase, unwinding DNA strands to facilitate entry of DNA synthesis enzymes.

“This is the first demonstration that it might be possible to use phosphorylation as a ‘switch’ on a DNA helicase to dial down replication speed,” explains Melendy.

Evidence suggests a similar phosphorylation can occur in human DNA helicases as well.

“For many cancers, if we selectively inhibit the DNA damage checkpoints they still retain, and simultaneously treat with lower than normal amounts of DNA-damaging chemotherapeutics, then we might be able to selectively damage cancer cells while leaving non-cancerous cells intact, greatly enhancing cancer cell killing while simultaneously reducing toxic side effects.”

This is an ongoing area of study by the UB researchers with their collaborators at Roswell Park Comprehensive Cancer Center.

Based on the current study, these DNA damage checkpoints may now be relevant to treating viral infections of the small DNA viruses under investigation at UB.

“Because they rely almost exclusively on host cell enzymes to synthesize their viral genomes, these small DNA viruses have been very resistant to anti-viral therapeutics,” says Melendy. “We currently have no antiviral treatments for HPV or polyomaviruses. By triggering the DNA damage response in a patient, this could dramatically slow viral DNA replication, suppressing the infection, providing us with a novel avenue for possible antiviral treatments of these as-of-yet untreatable viral infections.”

Shichen Shen and Jun Qu, both of the School of Pharmacy and Pharmaceutical Sciences, are co-authors. The work was funded by the National Institutes of Health and the NIH Training Grant in Microbial Pathogenesis.

Methods for constructing and evaluating consensus genomic interval sets

Julia Rymuza, Yuchen Sun, Guangtao Zheng, Nathan J LeRoy, Maria Murach, Neil Phan, Aidong Zhang, Nathan C Sheffield, Methods for constructing and evaluating consensus genomic interval sets, Nucleic Acids Research, 2024, gkae685, https://doi.org/10.1093/nar/gkae685

The amount of genomic region data continues to increase. Integrating across diverse genomic region sets requires consensus regions, which enable comparing regions across experiments, but also by necessity lose precision in region definitions. We require methods to assess this loss of precision and build optimal consensus region sets. Here, we introduce the concept of flexible intervals and propose three novel methods for building consensus region sets, or universes: a coverage cutoff method, a likelihood method, and a Hidden Markov Model. We then propose three novel measures for evaluating how well a proposed universe fits a collection of region sets: a base-level overlap score, a region boundary distance score, and a likelihood score. We apply our methods and evaluation approaches to several collections of region sets and show how these methods can be used to evaluate fit of universes and build optimal universes. We describe scenarios where the common approach of merging regions to create consensus leads to undesirable outcomes and provide principled alternatives that provide interoperability of interval data while minimizing loss of resolution.

Advancements in high-throughput sequencing technologies have resulted in a vast amount of diverse epigenomic data that has given us tremendous insight into genome function. Epigenomic data are often summarized into genomic region sets stored in BED files. Through the work of hundreds of individual labs and projects such as ENCODE (1), the NCBI Gene Expression Omnibus (2) now contains almost 100 000 BED files (3). The volume of data has made integration challenging.

For an analysis that spans several genomic interval sets, one of the first steps is to define a consensus region set, or region universe, upon which the diverse sets can be interpreted (4–10). Such universe region sets have many common practical use cases. For example, they define genomic intervals for differential peak analysis (11); they form the regions of interest in a count matrix in single-cell epigenome analysis (11, 12); they are used as a background for statistical region enrichment analysis (13–16); and they are a region vocabulary in vector representation approaches (17–20).

For many tools, the choice of universe is critical. It defines the features to which data will be projected. Currently, there are different ways of choosing a universe for analysis. Simple approaches include tiling the genome into fixed-size bins (12, 15), or using intersection or union operations on a collection of region sets (11). Some methods have been developed to create better-fitting universes for specific downstream use cases (21, 22). An alternative is to use a predefined universe from an external source; for example, the ENCODE consortium curated a registry of candidate cis-regulatory elements accessible through the SCREEN webserver (1), and the Ensembl Regulatory Build is a central, reusable source of regulatory region definitions (23). The choice of universe matters because universes can be a poor fit to data, and if a universe does not fit the data well, it can lead to incomplete or incorrect results (5, 14, 15). However, despite the importance of selecting a universe, it is often done ad hoc, and there are few approaches to assess the fit of a universe to a collection of region sets.

Here, we address these limitations by introducing novel concepts for constructing and evaluating region universes. First, we introduce the idea of flexible genomic intervals, which represent region boundaries by intervals instead of points, allowing us to summarize many fixed regions into fewer flexible regions without loss of information. Next, we propose three methods for constructing flexible region universes: a coverage cutoff universe, a maximum likelihood universe, and a Hidden Markov Model (HMM). Finally, we propose three methods to evaluate the fit of a universe to a collection of region sets: (i) the base-level F10-score; (ii) a Region Boundary Distance score (RBD); and (iii) a likelihood model score that assesses the likelihood that the proposed universe was drawn from the given distribution of region sets.

To assess our universes and evaluation methods, we compared our methods against alternatives and predefined universes. We show that flexible universes can capture information from complex data collections into one well-defined universe. Moreover, we show how our assessment metrics provide complementary measures of assessing universe fit and we prove the relevance of these measures. We show that the union universe has many downsides and propose the HMM universe as a generally useful approach for defining well-fit universes. To demonstrate how these universes could affect downstream analysis, we conclude with an application of region set enrichment analysis, where we show how the results are affected by the choice of universe. Overall, our results demonstrate the importance of considering the region universe and provide promising new tools to construct better-fitting universes for a variety of use cases.

To integrate genomic interval data, we first require a consensus set of intervals, or a universe. We may select a predefined universe from an external source or define one from a collection of input region sets using a consensus algorithm (Figure 1A). With a universe in hand, we can then ‘tokenize’ the original regions (Figure 1B). Tokenization redefines them into universe regions, normalizing differences in region boundaries to transform similar regions into a single representation. The most basic tokenization method is simple interval overlap. This approach works well if a universe approximates the original data well; otherwise, it may result in loss of precision. A universe may not be a good fit to a collection for a variety of reasons (Figure 1C); for example, (i) a region can be shifted; (ii) two neighboring regions may be merged, making them indistinguishable in downstream analysis (5); (iii) a universe may omit important intervals, leading to loss of information (5); or (iv) a universe may contain extraneous regions that do not reflect genome coverage, adding noise and compute time (14). If a universe is a poor fit, it can affect downstream analysis negatively; for example, a differential accessibility analysis would not test a locus that had been dropped from the universe. It could also miss a significant locus if it were merged with an abutting locus that lacked differential signal. Similarly, a motif analysis based on universe regions that had been shifted could miss enriched sequence features that were present in the part of the region left out of the universe.
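Tokenization by simple interval overlap can be illustrated with a minimal sketch. The function name `tokenize`, the half-open single-chromosome coordinates and the requirement that the universe be sorted and non-overlapping are assumptions made for illustration; they are not taken from the paper's software.

```python
from bisect import bisect_right


def tokenize(regions, universe):
    """Map each query region to the universe regions it overlaps.

    Assumptions of this sketch: intervals are half-open (start, end) tuples on
    one chromosome, and `universe` is sorted and non-overlapping.
    """
    universe_starts = [u_start for u_start, _ in universe]
    tokens = []
    for q_start, q_end in regions:
        # Last universe region starting at or before the query start.
        i = max(bisect_right(universe_starts, q_start) - 1, 0)
        hits = []
        while i < len(universe) and universe[i][0] < q_end:
            if universe[i][1] > q_start:  # the two intervals share a base
                hits.append(universe[i])
            i += 1
        tokens.append(hits)
    return tokens


universe = [(100, 200), (300, 400), (500, 650)]
regions = [(90, 180), (210, 290), (390, 520)]
tokens = tokenize(regions, universe)
```

In this toy input the second query overlaps no universe region and is dropped, while the third maps to two universe regions, which mirrors the loss of information and ambiguity that a poorly fitting universe can introduce.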

Overview of the concept of a universe. (A) A consensus algorithm takes a region set collection $\mathbb {R} = [\mathcal {R}_1, \mathcal {R}_2, ...]$ and builds a consensus representation called universe $\mathcal {U}$. (B) We ‘tokenize’ raw regions into universe $\mathcal {U}$ by redefining them as universe regions, creating a more uniform collection. (C) Universes may poorly represent region sets by shifting, merging, dropping or adding extraneous regions.

To address these issues, we developed three new approaches for constructing a universe that is a good fit to the original data: first the ‘coverage cutoff universe’; second, the ‘maximum likelihood universe’; and finally, an ‘HMM universe’. To assess them, we also developed three universe fit metrics. Finally, we applied these to a variety of real datasets.

Methods for building optimal universes

The coverage cutoff universe

The simplest example of a universe built from the data is a ‘union universe’, in which a collection of region sets is merged. This is often done for differential analysis of ATAC-Seq data (11). The union universe by definition covers all bases from the original collection; however, it can also lead to very large regions, particularly if the number of input region sets is large. Another simple alternative is using an intersection operation, which would include only bases covered in every region set in the collection, but this has the opposite problem: it leads to very sparse universes.

We reasoned that a hybrid approach may achieve a better result. First, we conceptualize a collection of region sets as a coverage signal track across all input region sets. Then, similar to a peak calling approach, we choose a cutoff x such that the universe includes only positions with coverage greater than or equal to x (Figure 2A). Setting the cutoff to one corresponds to a union universe, and setting the cutoff equal to the number of input region sets corresponds to an intersection universe. Setting a cutoff in between the two balances these extremes and provides a tunable parameter that may be adjusted depending on the needs of downstream tasks. We call the resulting universe a coverage cutoff (CC) universe. A principled approach to selecting the cutoff is to use a simple likelihood model that calculates the probability of appearing in a collection. With this model, we can calculate an optimal cutoff according to Eq. (1) (see Supplementary methods for details).
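The relationship between the union, intersection and coverage cutoff universes can be sketched on a toy genome. The function names and the tiny 16-base example are hypothetical, and the cutoff is passed in by hand rather than derived from the likelihood model; this is an illustration, not the authors' implementation.

```python
def coverage_track(region_sets, genome_size):
    """Count, for every base, how many region sets cover it."""
    coverage = [0] * genome_size
    for region_set in region_sets:
        covered = [False] * genome_size
        for start, end in region_set:  # half-open intervals
            for pos in range(start, end):
                covered[pos] = True
        for pos, is_covered in enumerate(covered):
            coverage[pos] += is_covered
    return coverage


def cutoff_universe(coverage, cutoff):
    """Return maximal runs of positions with coverage >= cutoff."""
    universe, run_start = [], None
    for pos, depth in enumerate(coverage):
        if depth >= cutoff and run_start is None:
            run_start = pos
        elif depth < cutoff and run_start is not None:
            universe.append((run_start, pos))
            run_start = None
    if run_start is not None:
        universe.append((run_start, len(coverage)))
    return universe


region_sets = [[(2, 8)], [(4, 10)], [(5, 7), (12, 15)]]
coverage = coverage_track(region_sets, genome_size=16)
union = cutoff_universe(coverage, cutoff=1)
middle = cutoff_universe(coverage, cutoff=2)
intersection = cutoff_universe(coverage, cutoff=len(region_sets))
```

With a cutoff of one the result equals the union, and with a cutoff equal to the number of region sets it equals the intersection; the intermediate cutoff drops the region seen in only one set while keeping a wider consensus than the intersection.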

Different approaches to building universes. (A) Coverage-based universes are derived from the genome coverage of a collection of region sets. Examples include intersection $\mathcal {U}_{int}$, coverage cutoff $\mathcal {U}_{CC}$, and union universe $\mathcal {U}_{union}$. (B) A flexible region in contrast to fixed region can represent boundaries of many variable regions. (C) The flexible coverage cutoff (CCF) universe is based on coverage of the genome by a collection. It uses two cutoff values: the lower defines flexible boundaries and the upper defines the region core. (D) A collection of genomic region sets is aggregated, and region starts, core (overlap), and ends are counted, creating signal tracks. (E) Maximum likelihood universe is derived from three signal tracks. Using a likelihood model, we build a scoring matrix that assesses the probability of each position being a given part of a flexible region. Next, we find the most likely path, which represents the maximum likelihood universe. (F) The HMM universe treats signal tracks representing genome coverage by different parts of a region as emissions of hidden states that correspond to different parts of flexible regions.

Here, $S_c$ is the sum of genome coverage by the collection and $g$ is the size of the genome.

We realized that, in a sense, the CC universe is a point estimate of a more complex distribution of possible universes. We reasoned that we may gain some insight by modeling the boundaries of the consensus regions as intervals, rather than points. To do this, we developed a new concept of a genomic region we call a flexible region. In contrast to a fixed interval, which is defined by two fixed points ($start$ and $end$), a flexible interval is defined by four ($start_{start}$, $start_{end}$, $end_{start}$ and $end_{end}$) (Figure 2B). A flexible interval can summarize many region variations in one well-defined interval.

A simple approach to constructing a flexible region universe is to define two cutoff values instead of one: the looser represents the cutoff for boundaries, and the stricter for the region core (Figure 2C). This way, positions with coverage between those two cutoffs will be assigned to flexible region boundaries and positions with coverage higher than the second cutoff will be assigned to the core of the region. Using this idea, we built a confidence interval around the optimal cutoff value, which extends the CC universe into the coverage cutoff flexible (CCF) universe.
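The two-cutoff idea can be sketched as follows. Each flexible region is returned as four coordinates (start of the start boundary, end of the start boundary, start of the end boundary, end of the end boundary). The names `runs_at_least` and `flexible_universe` are hypothetical, both cutoffs are supplied by hand rather than from a confidence interval, and two simplifications are assumptions of this sketch: loose runs without any core are skipped, and a loose run containing several cores is reported as one flexible region.

```python
def runs_at_least(coverage, cutoff):
    """Return maximal half-open runs of positions with coverage >= cutoff."""
    runs, run_start = [], None
    for pos, depth in enumerate(coverage):
        if depth >= cutoff and run_start is None:
            run_start = pos
        elif depth < cutoff and run_start is not None:
            runs.append((run_start, pos))
            run_start = None
    if run_start is not None:
        runs.append((run_start, len(coverage)))
    return runs


def flexible_universe(coverage, lower, upper):
    """Build flexible regions from a loose (lower) and a strict (upper) cutoff."""
    cores = runs_at_least(coverage, upper)
    flexible = []
    for outer_start, outer_end in runs_at_least(coverage, lower):
        inside = [(s, e) for s, e in cores if outer_start <= s and e <= outer_end]
        if not inside:
            continue  # assumption: a region without a core is not reported
        core_start, core_end = inside[0][0], inside[-1][1]
        flexible.append((outer_start, core_start, core_end, outer_end))
    return flexible


coverage = [0, 0, 1, 1, 2, 3, 3, 2, 1, 1, 0, 0, 1, 1, 1, 0]
flexible = flexible_universe(coverage, lower=1, upper=3)
```

Here the single flexible region has a core at positions 5 to 7 and boundaries that may fall anywhere in 2 to 5 on the left and 7 to 10 on the right.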

Maximum likelihood universe

While flexible intervals more naturally represent collections of overlapping region sets than fixed intervals, we reasoned that they still suffer from the possibility of merging neighboring regions when collections are large. To address this issue, we need information about not just the coverage of regions in a collection, but also the region start and end positions. To compute this information, we developed a fast plane-sweep algorithm (see Supplementary methods for details). With this tool we can quickly and efficiently calculate three tracks representing aggregate start, end, and coverage values of a region set collection at base-pair resolution (Figure 2D). We reasoned that a model that could incorporate all of these signals may improve universe resolution.

We can conceptualize a flexible universe as a path through the genome that assigns either start, core, or end state to each position. Using a universe scoring model and optimization algorithm, we can build the best path through the genome (Figure 2E). As a scoring model, we next developed a complex likelihood model, an extension of the simple likelihood model introduced earlier for the CC universe, which considers not just the coverage (core), but also the region start and end signal tracks. This model describes for each position the probability of it being a region start, core, or end. We thus build the maximum likelihood universe (LH) in three steps: (i) compute the three signal tracks; (ii) use a likelihood model to build a scoring matrix; (iii) find the maximum likelihood path through the genome (see Supplementary methods for details).
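The three signal tracks can be computed in a single left-to-right pass over start and end events. The sketch below is one simple way to do this and is not the authors' plane-sweep algorithm, which is described in their Supplementary methods; the function name `signal_tracks` and the assumption that regions within one set do not overlap are illustrative.

```python
def signal_tracks(region_sets, genome_size):
    """Return per-base counts of region starts, coverage (core) and ends.

    Assumptions of this sketch: half-open (start, end) intervals on one
    chromosome, and no overlapping regions within a single region set.
    """
    starts = [0] * genome_size
    ends = [0] * genome_size
    delta = [0] * (genome_size + 1)  # +1 where a region opens, -1 where it closes
    for region_set in region_sets:
        for start, end in region_set:
            starts[start] += 1
            ends[end - 1] += 1  # last base covered by the region
            delta[start] += 1
            delta[end] -= 1
    core, depth = [], 0
    for pos in range(genome_size):  # one sweep accumulates the coverage
        depth += delta[pos]
        core.append(depth)
    return starts, core, ends


region_sets = [[(2, 8)], [(4, 10)], [(5, 7), (12, 15)]]
starts, core, ends = signal_tracks(region_sets, genome_size=16)
```

Because only region boundaries are recorded before the sweep, the cost grows with the number of regions plus the genome length rather than with the total number of covered bases.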
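Step (iii) can be illustrated with a small dynamic program over a scoring matrix. Everything specific here is an assumption made for the sketch: the extra background state, the rule that a path may only stay in its state or advance through the cycle background, start, core, end, background, and the toy log-scores. The paper's actual likelihood model and optimization are in its Supplementary methods.

```python
BACKGROUND, START, CORE, END = 0, 1, 2, 3  # hypothetical state encoding


def best_path(scores):
    """Return the highest-scoring state path for scores[pos][state] (log-scores).

    Assumption: the path begins in the background state, and at each step it
    either stays in its state or moves to the next state in the cycle.
    """
    n_states = 4
    best = [scores[0][BACKGROUND]] + [float("-inf")] * (n_states - 1)
    back = []
    for row in scores[1:]:
        new_best, pointers = [], []
        for state in range(n_states):
            previous = (state - 1) % n_states
            source = state if best[state] >= best[previous] else previous
            new_best.append(best[source] + row[state])
            pointers.append(source)
        best = new_best
        back.append(pointers)
    state = max(range(n_states), key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):  # trace the chosen sources backwards
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy scoring matrix: 0 for the favoured state at each position, -5 otherwise.
favoured = [BACKGROUND, START, CORE, CORE, END, BACKGROUND]
scores = [[0 if s == f else -5 for s in range(4)] for f in favoured]
path = best_path(scores)
```

Consecutive positions labelled start, core and end can then be read off the path as one flexible region.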

Hidden Markov model universe

The maximum likelihood universe provides a simple and principled model for optimal flexible universes. However, a disadvantage is that it provides no tunability, since the likelihood scores are determined purely from the data. We reasoned that this may make the results depend strongly on the input collection. To address this, we sought a more tunable model using a Hidden Markov Model (HMM).

An HMM models a hidden process using (i) a matrix of transition probabilities between hidden states and (ii) emission probabilities of observations from hidden states. In our model, there are three observed sequences: the number of starts, overlaps, and ends at a given position. The hidden variable corresponds to the different parts of the flexible segment (Figure 2F and Supplementary Figure S1). We can tune the transition probabilities, which can be chosen in a way that will prevent unnecessary segmentation, and the emission matrix, which describes the relationship between observations and hidden states (see Supplementary methods for details).

Methods for evaluating universe fit

Having developed several new approaches to construct universes, we next sought to evaluate these universes and compare them to other common approaches. Because the choice of universe can dramatically affect downstream analyses, it is important to choose a universe deliberately. However, there are no well-established methods for assessing universe fit to data. Furthermore, different analyses may be better served by different types of universe, indicating that there really is no generally optimal universe, but the idea of what makes a ‘good’ universe depends on the downstream analysis. For example, a differential accessibility analysis should prioritize sensitivity over specificity; in this case, it is not a major problem if the universe includes many regions present in only a few samples, since the cost of extra comparisons is lower. On the other hand, the cost of excluding a region that could be a significant differential locus would be high. In contrast, a word-based deep learning task that trains a model with input dimensions equal to the size of the universe may elect a more specific universe, at the cost of discarding some regions that are present in a few of the samples, because otherwise the training could be intractable. Thus, the question of universe optimality depends on the use case, and therefore, we require methods of evaluating universe fit that can be tuned to a research question. This problem is similar to the comparison of two generic interval sets, for which several methods have been developed (24), with two key differences: first, we want to compare a universe region set not only to one other region set, but to a collection of them; and second, the question is not symmetric: it is generally more important that a universe not miss information (regions), even at the cost of some extra regions, and the desired level of asymmetry can vary.
Therefore, we developed three methods for assessing universe fit to data: (i) a base-level overlap score; (ii) a region boundary distance score and (iii) the universe likelihood.

Base-level overlap score

Our first metric is based on base coverage. We consider the universe as a prediction of whether a given genomic position is present in a given region set from the collection. Treating each region set from the collection as a query, we can then conceptualize matches and mismatches as true positives (covered in both universe and query), false positives (covered in universe, but not in query), or false negatives (covered in query, but not in universe) (Figure 3A). This allows us to calculate common classification evaluation measures such as precision and recall. Precision is the fraction of universe positions that are also covered by the query, so a low precision indicates the presence of unimportant positions in the universe; recall is the fraction of query positions that are covered by the universe. To combine precision and recall, we use the F10-score, a weighted version of the traditional F-score that treats recall as 10 times more important than precision. This asymmetry captures our goal to prioritize sensitivity over specificity: by prioritizing recall, we indicate that it’s better to have a few extra, noisy regions than to exclude something important. The F10-score results in values between 0 and 1, with a perfect fit approaching 1. An alternative approach to the base-level overlap score would be Jaccard similarity; however, it does not account for the asymmetry of the comparison.
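A minimal sketch of the base-level score, assuming the F10-score is the standard F-beta score with beta = 10, that is, (1 + beta^2) * P * R / (beta^2 * P + R). The function names are hypothetical, and positions are enumerated explicitly, which is only practical for toy inputs. Averaging over region sets follows the description of the collection-level score in the Figure 3 caption.

```python
def covered_positions(regions):
    """Expand half-open (start, end) intervals into a set of covered bases."""
    return {pos for start, end in regions for pos in range(start, end)}


def f_beta(universe, query, beta=10.0):
    """Base-level F-beta score of a universe against one query region set."""
    u, q = covered_positions(universe), covered_positions(query)
    tp = len(u & q)  # covered in both universe and query
    fp = len(u - q)  # covered in universe only
    fn = len(q - u)  # covered in query only
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)


def collection_score(universe, region_sets, beta=10.0):
    """Average the per-region-set scores over the whole collection."""
    return sum(f_beta(universe, rs, beta) for rs in region_sets) / len(region_sets)


universe = [(0, 10)]
too_wide = f_beta(universe, [(0, 5)])     # universe has extra bases: P = 0.5, R = 1
too_narrow = f_beta(universe, [(0, 20)])  # universe misses bases: P = 1, R = 0.5
```

The asymmetry is visible in the two toy cases: a universe twice as wide as the query still scores about 0.99, whereas a universe that covers only half of the query scores about 0.50.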

Figure 3. Different approaches to assess how well the universe represents the data. (A) The base-level overlap measure considers the universe as a prediction of a region set and, based on that, calculates the number of false positives (FP), true positives (TP), and false negatives (FN), from which it derives recall (R) and precision (P), which are combined into the F10-score. (B) The region boundary distance (RBD) score assesses how well a universe represents start and end positions by calculating the distance from region set to universe and from universe to region set; those two metrics are combined into a region boundary score by calculating their reciprocal, weighted harmonic mean. (C) Likelihood assessment uses a likelihood model based on signal tracks representing genome coverage by different parts of a region to calculate universe likelihood as a combination of the likelihoods of all three signal tracks. (D) A complete analysis example comparing a collection of region sets against 4 proposed universes: $\mathcal {U}_1$, a precise universe; $\mathcal {U}_2$, a sensitive universe; $\mathcal {U}_3$, a fragmented universe; and $\mathcal {U}_4$, a well-fit universe. For each universe, all 3 metrics are calculated. The F10-score and RBD score assess individual region sets. The final score for a collection is their average. In contrast, the likelihood is calculated directly for the whole collection.

Region boundary distance score

One disadvantage of the base overlap score is that it is unaware of region boundaries. A universe region that covers two abutting query regions would get a perfect score. This can be highly problematic in downstream applications; for example, in a differential analysis, lumping two distinct loci together could dilute differential signal. To address this, we sought a measure that would consider region starts and ends (Figure 3B). We calculate the distance between each boundary of each region in the query and the closest corresponding boundary in the universe. Universes with boundaries that are near the query boundaries would have shorter distances, indicating better universe fit. However, highly fragmented universes with many unnecessary boundaries would also have very small distances from query to universe. To account for this, we also calculate the distance in the reverse direction: from boundaries in the universe to the nearest boundaries in the query. Finally, we combine those two metrics into a region boundary distance score (RBD) by taking their reciprocal, weighted harmonic mean. With this score we describe the universe’s ability to conserve information about starts and ends, with a score of one representing a perfect representation of boundary locations. However, we do not incorporate any information about collection coverage in this score.

For fixed universes, the start and end points are well-defined; for flexible universes, however, they are intervals. Therefore, for flexible universes, we modify the RBD score to set the distance to zero if a query region boundary falls inside the universe’s boundary interval.
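The two distance directions behind the RBD score can be illustrated with a short sketch. Several choices here are assumptions rather than the published method: starts are matched to starts and ends to ends, distances are summarized by their median per region set, and all function names and coordinates are hypothetical. The final weighted combination into a single RBD score is not reproduced because its exact form is given in the Methods, not in this section.

```python
from bisect import bisect_left
from statistics import median


def nearest_distance(point, sorted_points):
    """Distance from point to the closest value in a sorted, non-empty list."""
    i = bisect_left(sorted_points, point)
    candidates = sorted_points[max(i - 1, 0):i + 1]
    return min(abs(point - c) for c in candidates)


def median_boundary_distance(source, target):
    """Median distance from each start (end) in source to the nearest start (end) in target."""
    target_starts = sorted(s for s, _ in target)
    target_ends = sorted(e for _, e in target)
    distances = [nearest_distance(s, target_starts) for s, _ in source]
    distances += [nearest_distance(e, target_ends) for _, e in source]
    return median(distances)


def flexible_start_distance(query_start, flexible_starts):
    """Distance to flexible start intervals (lo, hi); zero if the query start falls inside one."""
    return min(
        0 if lo <= query_start <= hi else min(abs(query_start - lo), abs(query_start - hi))
        for lo, hi in flexible_starts
    )


# Toy coordinates (hypothetical, for illustration only).
query = [(100, 200), (500, 620)]
merged = [(90, 640)]  # one universe region spanning both query regions
fragmented = [(100, 150), (150, 200), (300, 400), (500, 620)]

q_to_merged = median_boundary_distance(query, merged)
q_to_fragmented = median_boundary_distance(query, fragmented)
fragmented_to_q = median_boundary_distance(fragmented, query)
```

In this toy case the merged universe is far from the query boundaries, while the fragmented universe matches every query boundary yet is penalized in the universe-to-query direction, which is why both directions are needed.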

Universe likelihood

Finally, we sought a metric that incorporates both information about region boundaries as well as genome coverage. We propose here a universe likelihood score (Figure  3C ). We first calculate three signal tracks representing genome coverage by start, core, and end of the regions in the collection. Then for each signal track we make a separate model, which results in three separate models for different parts of a region. Each of these models describes the probability of a given position being a given part of a region, depending on the signal strength (see Supplementary methods for details). That results in a complex, probabilistic description of a region set collection. Next, we use this model to calculate the likelihood of the universe, which we can compare between universes. We make two versions of likelihood calculations, one suited for fixed universes and one for flexible universes. We use log likelihood, so our values range from minus infinity (low) to zero (high). Finally, to increase interpretability, we normalize the scores by subtracting the likelihood of an empty universe (one that contains no regions). Thus, a positive final score reflects a given universe that is more likely than having no regions at all, while a negative score means the universe is less likely than the empty universe.

To accommodate flexible universes, we adapted the likelihood score by calculating the boundary likelihood as if the whole flexible interval could contain a boundary position, rather than a single fixed point (see Supplementary methods for details).
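As a rough illustration of the inputs to the likelihood model, the sketch below builds three per-base signal tracks from a toy collection and applies the empty-universe normalization described above. It assumes half-open intervals and simple counts (the published tracks may be defined or smoothed differently), and it does not implement the probabilistic models themselves, which are described in the Supplementary methods. All names are hypothetical.

```python
def signal_tracks(collection, chrom_length):
    """Count, per base, how many regions start at, cover, or end at that position."""
    start = [0] * chrom_length
    core = [0] * chrom_length
    end = [0] * chrom_length
    for region_set in collection:
        for s, e in region_set:  # half-open interval [s, e)
            start[s] += 1
            end[e - 1] += 1
            for pos in range(s, e):
                core[pos] += 1
    return start, core, end


def normalized_score(loglik_universe, loglik_empty):
    """Subtract the empty-universe log likelihood; > 0 means better than no regions."""
    return loglik_universe - loglik_empty


# Toy collection of three region sets on a 12-base sequence (hypothetical values).
collection = [
    [(2, 6)],
    [(3, 7)],
    [(2, 5), (8, 10)],
]
start, core, end = signal_tracks(collection, chrom_length=12)
```

Positions where many region sets agree produce high start, core, or end signal, which the models then translate into probabilities for each part of a region.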

Assessing region sets collections

Having developed three assessment methods, we can use them to compare competing universes to assess which universe is the best fit for a collection of region sets. We do this by computing the scores for each universe and comparing among universes (Figure 3D). The scores assess different aspects of the universe fit: the F10-score promotes sensitive universes over specific ones, the RBD score penalizes sparse universes, and the likelihood provides a more complex assessment of the universe. Although the likelihood score incorporates information about boundaries as well as how well the universe covers the collection, it can penalize sensitive universes because it is not intentionally biased toward asymmetry the way the previous scores are. Thus, by computing all these scores, we reason that we get a complete picture that can guide decisions for selecting a universe for a collection of region sets.

Evaluation on real data

Next, we developed an evaluation strategy to test our universe building and assessment methods on real data. We assembled five diverse collections of region sets representing different biological problems (Figure 4A): (i) CTCF ChIP small, a small random collection of CTCF region sets (n = 40) from the ENCODE database (1); (ii) CTCF ChIP large, CTCF ChIP-seq datasets (n = 877); (iii) TF ChIP, ChIP-seq experiments for diverse transcription factors (TFs) (n = 8503); (iv) B-LCL ATAC, a small set of ATAC-seq files from B-Lymphoblastoid cell lines (B-LCL; n = 400) from ChIP-Atlas (25); and (v) Random ATAC, random ATAC-seq results (n = 5000). These datasets vary in data type, collection size, and level of heterogeneity of input regions across region sets.

Figure 4. Overview of evaluation approach. (A) Five collections representing different biological problems used for assessment. (B) For each collection, we compared it to five data-driven universes and three predefined universes. The data-driven universes are tailored to the input collection, but the predefined universes do not vary by collection. (C) We assessed the fit of each universe to each collection using our three assessment methods.

We also assembled universes to assess. First, we obtained three universes that do not depend on the analyzed data, which we call predefined universes: (i) the tiles universe, which bins the genome into non-overlapping 1000 bp tiles; (ii) the SCREEN universe, which consists of predefined cis-regulatory elements from ENCODE (1); and (iii) the Regulatory Build (RB) universe, consisting of predefined regulatory elements from Ensembl (23). In addition to these three external universes, we also built 5 data-driven universes that are specific to each region set collection. These include: (i) the union universe; (ii) the CC universe; (iii) the CCF universe; (iv) the LH universe; and (v) the Hidden Markov Model (HMM) universe. This led to 28 universes (3 predefined plus 5 data-driven universes for each of the 5 collections) and 40 pairwise universe-to-collection comparisons (8 universes for each of the 5 collections) (Figure 4B). For each comparison, we computed our three assessment methods (Figure 4C). This gives us a comprehensive evaluation of both externally sourced and data-driven universes, tested on diverse query region set collections.

Data overview

To explore the differences in our five region set collections, we first computed general coverage statistics (Supplementary Table S1). The smallest collection, B-LCL ATAC, contains ≈700 000 regions and covers 0.2% of the genome, whereas the largest, TF ChIP, contains ≈1.5 billion regions and covers 91% of the genome. We also observed that the ATAC-seq collections have smaller regions on average than the ChIP-seq collections.

Universe overview

The universes also have very different characteristics, with some requiring additional filtering based on region likelihood and size (see Supplementary methods for details, Supplementary Figures S2 and S3 ). For example, for the CTCF ChIP large collection, the eight universes have different levels of precision and fragmentation (Figure  5A ). The universes differed in average region size, number of regions, and percent of genome covered ( Supplementary Figure S4 , Supplementary Table S2 ). Having assembled the universes, we next computed our three assessment methods.

Figure 5. Universes overview and results of base-level overlap score. (A) Example of universes assessed for the CTCF ChIP large collection, including the 3 constant external universes and 5 data-driven universes built from the input collection. (B) Different universes represent genome coverage by the collection to a different extent; example from the Random ATAC collection. Collection $\mathbb {R}$ consists of many different files, which are represented by the core signal track. Regions in $\mathcal {R}_1, \mathcal {R}_2, \mathcal {R}_3$ are best represented by the CC, CCF and LH universes in terms of overlap. (C) Precision and recall distribution for each collection and universe. (D) Average F10-score for each collection and universe.

Assessment 1: base-level overlap F10-score

Region sets in a collection can differ widely; our first assessment method assesses fit by quantifying the degree of overlap between each region set and the universe. In example data from the Random ATAC collection, we observe that some universes cover many bases present in only a few of the collection’s region sets, while other universes are more stringent (Figure 5B). To assess this globally, we first computed precision and recall for each comparison (Figure 5C). We observed that the tiles universe and the union universe both have perfect recall for all tested collections, consistent with how these universes are constructed: the tiles universe covers the entire (mappable) genome, and the union universe by definition covers every base contained in the collection. In contrast, recall is lower for the more stringent data-driven universes: the CC, CCF and LH universes exclude positions with low coverage, especially for large collections, whereas the HMM universe has higher recall in general, indicating that it contains most positions covered by the collection. Finally, the lowest recall scores are assigned to the external universes, SCREEN and RB, which is consistent with these universes being built from other data sources. This highlights an advantage of building bespoke universes tailored to a collection: recall is superior. On the precision side, the worst performer overall is the tiles universe, consistent with the universe containing many tiles that do not reflect coverage in the collection. In contrast, SCREEN and RB had generally higher precision, especially for ChIP-Seq collections.

In general, data-driven universes tend to represent collections well, with good precision and recall. This is most apparent for the B-LCL ATAC collection, for which data-driven universes are much better than predefined ones. For likelihood universes (CC, CCF, and LH universes) built from large ChIP-Seq collections (CTCF ChIP large and TF ChIP), we observe the worst recall among data-driven universes. This is a consequence of Eq. (1), from which we observed that, for ChIP-Seq collections, the optimal cutoff value is higher. On the other hand, both the HMM universe and the union universe have high recall and low precision. In general, we see that universes with high recall have lower precision, reflecting the delicate balance between including complete information in universes and avoiding too much noise.

To balance precision and recall, we next calculated the F10-score, which assigns more weight to recall than precision (Figure 5D). For predefined universes, we see that the tiles universe scores well for ChIP-seq collections, which cover more of the genome, but poorly for ATAC-seq collections, which have lower coverage. Both RB and SCREEN are generally outperformed by data-driven universes, especially for the B-LCL ATAC collection, for which they contain too much noise. For ChIP-Seq collections, the union universe is generally the best by this metric, consistent with the weighting we chose, which treats recall as 10 times more important than precision. Interestingly, for the TF ChIP collection, the HMM universe outperforms likelihood universes; on the other hand, for ATAC-Seq collections, the CCF universe outperforms both union and HMM universes. Overall, we conclude that computing precision, recall, and the F10-score provides useful insight into universe fit. These measures provide a way to quantify the advantages of a data-driven universe and assess how much information is lost by an external universe.

Assessment 2: region boundary distance score

Next, we sought to address the major weakness we see in the base-level overlap score: that it does not consider region boundaries. Assessing boundaries is important because of how they affect downstream analysis. If two regulatory elements with distinct behavior are merged into a single region in a universe, then all downstream analyses will essentially evaluate the average of the two signals. But different universes have different sensitivities to boundary points (Figure 6A); for example, anecdotally, the union universe is not sensitive at all: it contains few boundaries, especially for larger collections. It will clearly merge together many neighboring regions, even if they have distinct patterns across input sets. The CC and CCF universes are more sensitive but still miss many boundaries for bigger collections; the LH universe is very sensitive to boundaries, but has other weaknesses (it tends to exclude positions with low coverage); all of these problems are addressed by the HMM universe, which is very sensitive and is also able to represent regions with low coverage. To assess boundaries globally, we turned to the region boundary distance score. First, for each region set, for each region, for each boundary, we calculated the distance to the nearest corresponding universe boundary (see Methods; Figure 6B). For all collections, the distance from the collection to the RB universe was very high. Similarly, the union universe performs very poorly in this metric, particularly for larger collections, consistent with the intuition that the union regions lose boundary precision as the number of regions increases. We also computed the reverse: distances from universe to query (Figure 6C). Interestingly, for ATAC-Seq collections, we observe a small distance from query to universe but a high distance from universe to query. This indicates that all universes have many boundaries that are not present in the raw data, but at the same time boundaries present in the queries are well-reflected by universes.

Figure 6. Results of region boundary distance score. (A) Different universes represent region boundaries to a different extent. Three signal tracks provide a summarized description of the whole collection. The LH and HMM universes are the most sensitive to region boundaries. (B) Distribution of median distances from query to universe. (C) Distribution of median distances from universe to query. (D) Average RBD score for each collection and universe comparison.

To summarize both distance directions, we calculated their reciprocal, weighted harmonic mean, the Region Boundary Distance score (RBD) (Figure 6D). The average RBD score shows that the RB universe is a very poor fit to all collections, likely because it contains few, large regions. On the other hand, the tiles universe has similar scores for all collections, which is good for big ChIP-Seq collections compared to other universes, but bad for ATAC-Seq collections. Interestingly, the SCREEN universe seems to be a good fit for all collections, and the best fit for ATAC-Seq collections, even outperforming the data-driven ones for this metric. Among data-driven universes, the HMM universe performs the best, with LH in second place, for all collections except B-LCL ATAC, for which all data-driven universes have similar scores. Coverage-based universes (CC, CCF) perform well for ATAC-Seq collections, but not for large ChIP-Seq collections (CTCF ChIP large and TF ChIP) compared to other data-driven universes. As expected, the RBD score reflects the poor performance of the union universe for large ChIP-Seq collections; the merging leads to poor reflection of interval boundaries.

Assessment 3: likelihood calculation

Finally, we calculated the likelihood score for each comparison (Figure 7A). In general, predefined universes are a worse fit to the collections than an empty universe, with the exception of SCREEN for the CTCF ChIP large and TF ChIP collections. Among data-driven universes, the union universe performs very poorly, achieving negative scores for all collections. As expected, the CC, CCF, and LH universes outperform the empty universe for all collections, as these were designed to optimize likelihood in some way. The HMM universe performs well overall; however, it is worse than the empty universe for the CTCF ChIP large and TF ChIP collections. A more detailed look reveals that the low HMM likelihood scores in these scenarios are driven by region coverage tracks, not boundaries, suggesting that for these collections, our current HMM parameterization may yield a universe with too much noise (see Materials and Methods; Supplementary Figure S5).

Figure 7. Results of universe likelihood and comparison between fixed and flexible scores. (A) Likelihood of each universe given collection. (B) Change of RBD score when we account for flexibility. (C) Change of likelihood when we account for flexibility.

Comparing flexible to fixed universes

So far, our assessments have not taken into account that some universes can be flexible. We believe that flexible universes provide several advantages, and sought to assess them. Flexible intervals don’t quite fit into the standard 3-column BED format; however, they can be stored using optional fields of an extended BED format. In this approach, the sequence name, start, and end of the flexible region are represented by the first three columns, and the thickStart and thickEnd columns hold the end of the flexible start and the start of the flexible end, respectively. To assess flexible universes, we applied our flexible-aware version of the RBD score. We observed that the RBD score improves for all flexible universes (Figure 7B). The change is smaller for the CCF universe; for the LH and HMM universes, however, the new score is close to one, with the HMM universe performing slightly better.
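The storage scheme described above can be sketched as a small round-trip between a flexible region and an extended BED line. The class and method names are hypothetical, and the placeholder values written to the name, score, and strand columns (`.`, `0`, `.`) are assumptions; only the use of the first three columns plus thickStart and thickEnd comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlexibleRegion:
    chrom: str
    start_lo: int  # earliest possible start (BED column 2, chromStart)
    start_hi: int  # end of the flexible start (BED column 7, thickStart)
    end_lo: int    # start of the flexible end (BED column 8, thickEnd)
    end_hi: int    # latest possible end (BED column 3, chromEnd)

    def to_bed_line(self, name=".", score=0, strand="."):
        """Serialize as a tab-separated 8-column BED line."""
        fields = [self.chrom, self.start_lo, self.end_hi, name, score, strand,
                  self.start_hi, self.end_lo]
        return "\t".join(str(f) for f in fields)

    @classmethod
    def from_bed_line(cls, line):
        """Parse an 8-column BED line back into a flexible region."""
        f = line.rstrip("\n").split("\t")
        return cls(f[0], int(f[1]), int(f[6]), int(f[7]), int(f[2]))


# Hypothetical region: start anywhere in 1000-1040, end anywhere in 1480-1530.
region = FlexibleRegion("chr1", 1000, 1040, 1480, 1530)
line = region.to_bed_line()
restored = FlexibleRegion.from_bed_line(line)
```

Because the first three columns still describe the outer extent of the region, tools that read only those columns would see the widest possible interval.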

We computed a version of the likelihood score that considers universe flexibility. This score shows a significant improvement; for all flexible universes, all collection scores change by an order of magnitude (Figure  7C ). Likelihood values that consider flexibility are similar for all universes; however, the HMM universe performs slightly worse for large ChIP-Seq collections ( CTCF ChIP large , TF ChIP ). This reflects that, unlike the CCF and LH universes, the HMM universe does not explicitly optimize likelihood. A more detailed look into likelihood showed that although the HMM universe has the best likelihood of cores of the regions, it performs less well for boundary positions (see Methods; Supplementary Figure S6 ).

Assessing across metrics

Since each metric assesses different aspects of universe fit, considering them independently limits the scope of assessment. For a holistic view, we summarized the scores of each metric into normalized heatmaps, allowing comparison within and across metrics (Figure 8). We observed several informative cross-score patterns. First, the F10-score is the most consistent metric across universes, indicating that all these universes cover the collections to a similar extent (Figure 8A). In contrast, the RBD score indicates more variation in how well universes represent region boundaries (Figure 8B). This is consistent with our intuition that matching boundaries is a more difficult task, since it requires ensuring large regions are split well. The disparity between the F10-score and the RBD score demonstrates their combined utility. For example, the union universe has perhaps the overall best F10-scores across universes, but has a low RBD score. Conversely, the SCREEN universe has the best RBD score for the B-LCL ATAC collection, but is significantly worse than any data-driven universe for F10-score. In likelihood scores, the CC, CCF and LH universes outperformed other universes to a similar extent, while tiles, RB and union were universally poor fits for all collections (Figure 8C). The HMM universe has a much better RBD score than CC, but a worse likelihood, reflecting that likelihood considers more than boundary positions. Additionally, the likelihood is stricter in boundary assessments: while for the RBD score we take a median of actual distances, for likelihood we use the probability of a given position being a boundary. Comparison of the flexible and fixed versions of the RBD score and likelihood highlights the value of flexible regions (Figure 8D, E); the RBD score for LH and HMM improved significantly after accounting for flexibility.
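To make scores comparable within a heatmap row, some per-row rescaling is needed. The sketch below uses min-max scaling, which is an assumption: the text says only that the heatmaps are row normalized, not which normalization was applied. The function name and the toy values are hypothetical and are not results from the study.

```python
def row_min_max(rows):
    """Rescale each row to the range [0, 1]; constant rows map to all zeros."""
    normalized = []
    for row in rows:
        lo, hi = min(row), max(row)
        span = hi - lo
        normalized.append([0.0 if span == 0 else (v - lo) / span for v in row])
    return normalized


# Toy score matrix (illustrative values only): one row per collection,
# one column per universe.
toy_scores = [
    [0.2, 0.5, 0.8],
    [10.0, 30.0, 20.0],
]
scaled = row_min_max(toy_scores)
```

After rescaling, the best universe in each row has value 1 and the worst has value 0, regardless of the original scale of the metric.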

Figure 8. Results of universe comparison using different scores. (A) Row normalized F10-score of each universe given collection. (B) Row normalized RBD score of each universe given collection. (C) Row normalized likelihood of each universe given collection. (D) Row normalized flexible version of RBD score of each universe given collection. (E) Row normalized flexible version of likelihood of each universe given collection.

Application on downstream analysis

To demonstrate how the choice of universe affects downstream analyses and how our universe building and evaluation methods can be applied, we performed a region set enrichment analysis using LOLA, a tool for statistical region enrichment analysis (15). The goal is to take some demo region sets, use them to search a database of region sets for similar region sets, and explore how the choice of universe affects the results. We constructed two different experiments. For our first experiment, we used the Random ATAC collection as a database. We used three predefined universes (tiles, RB and SCREEN) as well as data-driven universes built from the Random ATAC and B-LCL ATAC collections. To see how universes based on rare cell types perform, we also added data-driven universes built from fifteen Glia ATAC-Seq files. We queried the database with fifteen files that were not present in the database and were not used for universe construction, representing three different cell types: five files from A549, five from B-LCL, and five from Glia samples. For our second experiment, we used 623 ChIP-Seq files representing different TFs to build a database and data-driven universes. We queried the database with thirty files representing different TFs: 10 files from EZH2, 10 files from POLR2A, and 10 files from YY1. For both experiments, we assessed performance of the region enrichment analysis with R-precision (rPPV), which measures the precision based on the top R results, where R is equal to the number of correct results in the whole database.
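R-precision as defined above can be computed in a few lines. This is a minimal sketch: the function name and the example identifiers are hypothetical, and how a "correct result" is determined in the enrichment analysis (for example, matching cell type or TF) is left to the caller.

```python
def r_precision(ranked_ids, relevant_ids):
    """Precision within the top R ranked results, where R = number of relevant items."""
    relevant = set(relevant_ids)
    r = len(relevant)
    if r == 0:
        return 0.0
    top_r = ranked_ids[:r]
    return sum(1 for item in top_r if item in relevant) / r


# Hypothetical ranking returned for a Glia query; four database files are relevant.
ranked = ["glia_1", "a549_3", "glia_2", "glia_4", "blcl_7", "glia_3"]
relevant = {"glia_1", "glia_2", "glia_3", "glia_4"}
score = r_precision(ranked, relevant)
```

Because R adapts to the number of correct results in the database, the score reaches 1 only when all correct results are ranked ahead of every incorrect one.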

Our results demonstrate a clear impact of the universe on analysis performance (Figure 9A). In the first experiment, the data-driven universes were the best performers for the specific questions; Glia ATAC queries (gray dots) performed best under the tailored Glia data-driven universes, followed by the Random ATAC data-driven universes, and performed poorly with any predefined universes or with the B-LCL data-driven universes. The B-LCL queries (yellow dots) also performed best with the tailored B-LCL-driven universes, and performed reasonably well with predefined or Random ATAC data-driven universes, but poorly with the Glia data-driven universe. Finally, A549 samples performed equally well with predefined or Random ATAC data-driven universes, and poorly with the universes built on the other data types. This shows that using universes based on a specific cell type increases the performance of downstream analysis for that cell type. Our second experiment, based on TF data, shows a clear difference between the union universe and other more complex data-driven universes. We also see an imbalance among the predefined universes, with the SCREEN universe outperforming tiles and RB in general. In conclusion, for ATAC-Seq data, rPPV is higher for data-driven universes than for predefined ones; for TF data, rPPV is higher for more complex data-driven universes (CC, CCF, LH, HMM) than for union universes. Overall, these results demonstrate that choosing the right universe is a complex task. In our experiments, the more complex data-driven universes (CC, CCF, LH and HMM) are always either comparable or superior to simpler data-driven or predefined universes, although the exact performance depends on both the initial data and the downstream task. Most importantly, this analysis suggests that the assessment methods we present can be helpful in choosing the right universe.

Figure 9. Results of downstream enrichment analysis depend on universe. (A) R-precision (rPPV) of query files depending on universe. (B) Median for different universes, depending on the collection used for their construction.

Many integrative epigenome analyses require the data to be defined on a set of consensus regions, or universe. This universe is critical for analysis because it determines the precision of regions assessed. Our experiments highlight how different universes can have different levels of fit to collections and may therefore be useful for different tasks. Despite the importance of this choice, few approaches have been developed to aid analysts in building appropriate universes or assessing the fit of an existing universe. In this study, we have addressed these issues by presenting several novel methods to build universes from collections of region sets, as well as new ways to assess the fit of a universe to a collection of region sets. We also introduced the concept of flexible segments, and proposed several methods for constructing universes that can use either traditional fixed boundaries or flexible interval boundaries.

In general, data-driven universes outperformed predefined ones. However, the data-driven universes also have a weakness: by definition, they change with the underlying collection, and therefore cannot support an integrative analysis that spans collections. If results need to be compared across collections, then a shared universe is required. There are a few options for analysis in this case, all of which are facilitated by our work. First, a custom data-driven universe that spans all included collections can be built. Since we have described several ways to create well-performing data-driven universes, it would be straightforward to design a bespoke universe for a given comparative analysis. However, this may not always be possible or convenient, and at some point, an external universe may be preferred. In this case, a trade-off is required: fit of the universe must be sacrificed to increase interoperability with other collections. Our assessment methods now provide a principled way to assess this trade-off and inform research decisions.

Among the predefined universes, SCREEN performed well for the collections we tested. However, there are almost certainly other collections or use cases for which the tiles, RB, or other external universes would be a better choice. Along these lines, we propose that our methods for building universes can be used in the future to create predefined universes from large collections, thereby creating even better global universes that can be re-used for integrative analysis. In the future, a centralized repository of universes, built using different methods and for different target use cases, could be a useful resource; a given collection of region sets could be represented in different universes based on the balance of fit and need for integration.

Based on our results, we propose that the union universe, though widely used, does not represent ChIP-Seq data well, particularly for large collections. Instead, we propose the HMM universe as a good all-around option that solves many of the issues with simpler universes. It has the highest sensitivity to boundary positions and good recall. It also provides adjustable parameters; by setting emission and transition probabilities, users may adjust model sensitivity and keep it consistent across collections. Still, the final choice of universe should consider the needs of downstream analysis. Our results show that assessing universe fit is a complex question, with many features to optimize. Even with our assessment metrics, it is difficult to claim an optimal universe for a given collection; instead, the answer depends on the downstream analysis priorities. For example, in general, the CCF and LH universes represent properties of a whole collection well, but they exclude infrequent regions; thus, they may be useful for NLP analysis of the genome but could lead to losing information about rare cell types in single-cell analysis. In contrast, the union universe by definition covers all bases found in the region set collection, and therefore has a high F10-score; however, it also merges regions, which is reflected in poor RBD and likelihood scores, indicating that it would not be a good fit for an application that requires high region resolution. Therefore, multiple perspectives must be considered for a holistic assessment of universe fit.
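The behaviour of the union universe described above can be illustrated with a minimal sketch. It assumes half-open intervals on a single chromosome and a base-pair-level definition of precision and recall; the function names (`union_universe`, `f_beta`, and the helpers) are hypothetical and are not taken from the geniml package, and the exact scoring used in this study may differ in detail.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome (assumption)


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def union_universe(collection: List[List[Interval]]) -> List[Interval]:
    """Union universe: every base covered by any region set, with overlaps merged."""
    return merge_intervals([iv for region_set in collection for iv in region_set])


def covered_bases(intervals: List[Interval]) -> int:
    """Number of distinct bases covered by a list of intervals."""
    return sum(end - start for start, end in merge_intervals(intervals))


def overlap_bases(a: List[Interval], b: List[Interval]) -> int:
    """Number of bases shared by two interval lists (two-pointer sweep)."""
    a, b = merge_intervals(a), merge_intervals(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def f_beta(universe: List[Interval], region_set: List[Interval], beta: float = 10.0) -> float:
    """Base-level F-beta of a universe against one region set.

    beta = 10 weights recall far more heavily than precision.
    """
    shared = overlap_bases(universe, region_set)
    if shared == 0:
        return 0.0
    precision = shared / covered_bases(universe)
    recall = shared / covered_bases(region_set)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy collection of three region sets (made-up coordinates).
collection = [
    [(100, 200), (500, 600)],
    [(150, 250), (580, 700)],
    [(240, 300)],
]
universe = union_universe(collection)
f10_first = f_beta(universe, collection[0], beta=10.0)
f1_first = f_beta(universe, collection[0], beta=1.0)
```

In this toy collection, five input regions collapse into two universe regions, so boundary information is lost, yet recall against every region set remains perfect. Because beta = 10 rewards recall so strongly, the F10-score stays close to 1 even though only half of the universe's bases are covered by the first region set, which is why a recall-weighted score alone cannot reveal the loss of resolution.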

One advantage of our new universe construction methods is that they naturally create flexible universes. Flexible universes are a new way of summarizing large collections with less loss of information. We showed that the proposed approaches to building flexible universes improve results over inflexible universes. We see flexible regions as a powerful new concept that can change the current way of thinking about universes. Furthermore, we expect that using them for differential peak analysis, statistical region enrichment analysis and NLP approaches has the potential to improve results. Flexible regions will become more useful as we and others develop the necessary tooling to work with them; for example, we will require tokenization methods that can project a traditional region set into a flexible universe quickly and accurately, while taking the universe flexibility into account.
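To make the tokenization problem concrete, the following is a rough sketch of one possible projection rule, not a method from this study. It assumes that a flexible region is described by an interval of admissible start positions and an interval of admissible end positions; the `FlexRegion` class, the `boundary_distance` cost and the `tokenize` function are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class FlexRegion:
    """Hypothetical flexible region: each boundary is an interval, not a point."""
    start_lo: int
    start_hi: int
    end_lo: int
    end_hi: int


def boundary_distance(pos: int, lo: int, hi: int) -> int:
    """Distance from a position to a boundary interval (0 if inside it)."""
    if pos < lo:
        return lo - pos
    if pos > hi:
        return pos - hi
    return 0


def tokenize(region: Tuple[int, int], universe: List[FlexRegion]) -> Optional[int]:
    """Return the index of the best-matching flexible region, or None.

    A query boundary that falls inside the flexible boundary interval costs
    nothing; otherwise the cost is the distance to that interval. Only
    flexible regions whose maximal extent overlaps the query are considered.
    """
    start, end = region
    best_idx: Optional[int] = None
    best_cost: Optional[int] = None
    for idx, flex in enumerate(universe):
        if end <= flex.start_lo or start >= flex.end_hi:
            continue  # no overlap with the widest form of this region
        cost = (boundary_distance(start, flex.start_lo, flex.start_hi)
                + boundary_distance(end, flex.end_lo, flex.end_hi))
        if best_cost is None or cost < best_cost:
            best_idx, best_cost = idx, cost
    return best_idx


# Toy flexible universe and query region set (made-up coordinates).
flex_universe = [
    FlexRegion(start_lo=90, start_hi=110, end_lo=190, end_hi=210),
    FlexRegion(start_lo=480, start_hi=520, end_lo=590, end_hi=620),
]
query = [(100, 200), (505, 640), (300, 400)]
tokens = [tokenize(r, flex_universe) for r in query]
```

This sketch scans the universe linearly for clarity. A practical tokenizer would need an interval index to be fast on large universes, and the cost function here is only one of many ways in which boundary flexibility could be taken into account.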

In conclusion, this research provides new concepts, methods, and insight that will help researchers to determine the best analysis path for many types of genomic region analysis.

Software is available at https://github.com/databio/geniml.

Supplementary Data are available at NAR Online.

National Human Genome Research Institute [R01-HG012558]; National Institute of General Medical Sciences [R35-GM128636]. Funding for open access charge: NIH.

Conflict of interest statement. N.C.S. is a consultant for InVitro Cell Research, LLC.

