PLOS Subject Areas: Qualitative studies; Research design

Showing 1–13 of 1,219 articles

“We choose”: Adolescent girls and young women’s choice for an HIV prevention product in a cross-over randomized clinical trial conducted in South Africa, Uganda, and Zimbabwe

Millicent Atujuna, Kristin Williams,  [ ... ], Ariane van der Straten


Impact of bariatric surgery on premenopausal women’s womanliness: A qualitative systematic review and meta-synthesis

Rebecca Paul, Ellen Andersson,  [ ... ], Carina Berterö


“This is you teaching you:” Exploring providers’ perspectives on experiential learning and enhancing patient safety and outcomes in ketamine-assisted therapy

Elena Argento, Tashia Petker,  [ ... ], Zach Walsh


Perception and coping mechanisms of patients with diabetes mellitus during the COVID-19 pandemic in Ibadan, Nigeria

Olajumoke Ololade Tunji-Adepoju, Obasanjo Afolabi Bolarinwa, Richard Gyan Aboagye, Williams O. Balogun


Treating the disease and meeting the person with the illness – patient perspectives of needs during infective endocarditis, a qualitative study

Helena Lindberg, Johan Vaktnäs, Magnus Rasmussen, Ingrid Larsson


Dietary practice and associated factors among elderly people in Northwest Ethiopia, 2022: Community based mixed design

Mulat Tirfie Bayih, Adane Ambaye Kassa, Yeshalem Mulugeta Demilew

A qualitative study of stressors faced by older stroke patients in a convalescent rehabilitation hospital

Yuta Asada, Kaori Nishio, Kohei Iitsuka, Jun Yaeda


From many voices, one question: Community co-design of a population-based qualitative cancer research study

Susannah K. Ayre, Elizabeth A. Johnston,  [ ... ], Belinda C. Goodwin


Using residents and experts to evaluate the validity of areal wombling for detecting social boundaries: A small-scale feasibility study

Meng Le Zhang, Aneta Piekut,  [ ... ], Gwilym Pryce


Delphi studies in social and health sciences—Recommendations for an interdisciplinary standardized reporting (DELPHISTAR). Results of a Delphi study

Marlen Niederberger, Julia Schifano,  [ ... ], the DEWISS network


A qualitative exploration of migraine in students attending Irish Universities

Orla Flynn, Catherine Blake, Brona M. Fullen


A protocol for a cluster randomized controlled trial to assess the impact of Balika Bodhu: A combined empowerment and social norm based sexual and reproductive health and rights intervention for married adolescent girls in rural Bangladesh

Mahfuz Al Mamun, Sultan Mahmud,  [ ... ], Ruchira Tabassum Naved


Digital transformation in college libraries: The effect of digital reading on reader service satisfaction

Yixin Lu, Shengguang Lin


Qualitative Research Journal


Before you start

For queries relating to the status of your paper pre-decision, please contact the Editor or Journal Editorial Office. For queries post-acceptance, please contact the Supplier Project Manager. These details can be found in the Editorial Team section.

Author responsibilities

Our goal is to provide you with a professional and courteous experience at each stage of the review and publication process. There are also some responsibilities that sit with you as the author. Our expectation is that you will:

  • Respond swiftly to any queries during the publication process.
  • Be accountable for all aspects of your work. This includes investigating and resolving any questions about accuracy or research integrity.
  • Treat communications between you and the journal editor as confidential until an editorial decision has been made.
  • Include anyone who has made a substantial and meaningful contribution to the submission (anyone else involved in the paper should be listed in the acknowledgements).
  • Exclude anyone who hasn’t contributed to the paper, or who has chosen not to be associated with the research.
  • In accordance with COPE’s position statement on AI tools, Large Language Models cannot be credited with authorship as they are incapable of conceptualising a research design without human direction and cannot be accountable for the integrity, originality, and validity of the published work. The author(s) must describe the content created or modified as well as appropriately cite the name and version of the AI tool used; any additional works drawn on by the AI tool should also be appropriately cited and referenced. Standard tools that are used to improve spelling and grammar are not included within the parameters of this guidance. The Editor and Publisher reserve the right to determine whether the use of an AI tool is permissible.
  • If your article involves human participants, you must ensure you have considered whether or not you require ethical approval for your research, and include this information as part of your submission. Find out more about informed consent.

Generative AI usage key principles

  • Copywriting any part of an article using a generative AI tool/LLM would not be permissible, including the generation of the abstract or the literature review, since, per Emerald’s authorship criteria, the author(s) must be responsible for the work and accountable for its accuracy, integrity, and validity.
  • The generation or reporting of results using a generative AI tool/LLM is not permissible, since, per Emerald’s authorship criteria, the author(s) must be responsible for the creation and interpretation of their work and accountable for its accuracy, integrity, and validity.
  • The in-text reporting of statistics using a generative AI tool/LLM is not permissible due to concerns over the authenticity, integrity, and validity of the data produced, although the use of such a tool to aid in the analysis of the work would be permissible.
  • Copy-editing an article using a generative AI tool/LLM in order to improve its language and readability would be permissible as this mirrors standard tools already employed to improve spelling and grammar, and uses existing author-created material, rather than generating wholly new content, while the author(s) remains responsible for the original work.
  • The submission and publication of images created by AI tools or large-scale generative models is not permitted.

Research and publishing ethics

Our editors and employees work hard to ensure the content we publish is ethically sound. To help us achieve that goal, we closely follow the advice laid out in the guidelines and flowcharts on the COPE (Committee on Publication Ethics) website.

We have also developed our research and publishing ethics guidelines. If you haven’t already read these, we urge you to do so – they will help you avoid the most common publishing ethics issues.

A few key points:

  • Any manuscript you submit to this journal should be original. That means it should not have been published before in its current, or similar, form. Exceptions to this rule are outlined in our pre-print and conference paper policies. If any substantial element of your paper has been previously published, you need to declare this to the journal editor upon submission. Please note, the journal editor may use Crossref Similarity Check to check on the originality of submissions received. This service compares submissions against a database of 49 million works from 800 scholarly publishers.
  • Your work should not have been submitted elsewhere and should not be under consideration by any other publication.
  • If you have a conflict of interest, you must declare it upon submission; this allows the editor to decide how they would like to proceed. Read about conflict of interest in our research and publishing ethics guidelines.
  • By submitting your work to Emerald, you are guaranteeing that the work is not in infringement of any existing copyright.
  • If you have written about a company/individual/organisation in detail using information that is not publicly available, have spent time within that company/organisation, or the work features named/interviewed employees, you will need to clear permission by using the consent to publish form; please also see our permissions guidance for full details. If you have to clear permission with the company/individual/organisation, consent must be given either by the named individual in question or their representative, a board member of the company/organisation, or an HR department representative of the company/organisation.
  • You have an ethical obligation and responsibility to conduct your research in adherence to national and international research ethics guidelines, as well as the ethical principles outlined by your discipline and any relevant authorities, and to be transparent about your research methods in such a way that all involved in the publication process may fairly and appropriately evaluate your work. For all research involving human participants, you must ensure that you have obtained informed consent, meaning that you must inform all participants in your work (or their legal representative) as to why the research is being conducted, whether their anonymity is protected, how their data will be stored and used, and whether there are any associated risks from participation in the study; the submitted work must confirm that informed consent was obtained and detail how this was addressed in accordance with our policy on informed consent.
  • Where appropriate, you must provide an ethical statement within the submitted work confirming that your research received institutional and national (or international) ethical approval, and that it complies with all relevant guidelines and regulations for studies involving humans, whether that be data, individuals, or samples. Specifically, the statement should contain the name and location of the institutional ethics reviewing committee or review board, the approval number, the date of approval, and the details of the national or international guidelines that were followed, as well as any other relevant information. You should also include details of how the work adheres to relevant consent guidelines along with confirming that informed consent was secured for all participants. The details of these statements should ensure that author and participant anonymity is not compromised. Any work submitted without a suitable ethical statement and details of informed consent for all participants, where required, will be returned to the authors and will not be considered further until appropriate and clear documentation is provided. Emerald reserves the right to reject work without sufficient evidence of informed consent from human participants and ethical approval where required.

Third party copyright permissions

Prior to article submission, you need to ensure you’ve applied for, and received, written permission to use any material in your manuscript that has been created by a third party. Please note, we are unable to publish any article that still has permissions pending. The rights we require are:

  • Non-exclusive rights to reproduce the material in the article or book chapter.
  • Print and electronic rights.
  • Worldwide English-language rights.
  • To use the material for the life of the work, i.e. with no time restrictions on its re-use (such as a one-year licence).

We are a member of the International Association of Scientific, Technical, and Medical Publishers (STM) and participate in the STM permissions guidelines, a reciprocal free exchange of material with other STM publishers. In some cases, this may mean that you don’t need permission to re-use content. If so, please highlight this at the submission stage.

Please take a few moments to read our guide to publishing permissions to ensure you have met all the requirements, so that we can process your submission without delay.

Open access submissions and information

All our journals currently offer two open access (OA) publishing paths: gold open access and green open access.

If you would like to, or are required to, make the branded publisher PDF (also known as the version of record) freely available immediately upon publication, you can select the gold open access route once your paper is accepted. 

If you’ve chosen to publish gold open access, this is the point at which you will be asked to pay the APC (article processing charge). This varies per journal and can be found on our APC price list or on the editorial system at the point of submission. Your article will be published with a Creative Commons CC BY 4.0 user licence, which outlines how readers can reuse your work.

Alternatively, if you would like to, or are required to, publish open access but your funding doesn’t cover the cost of the APC, you can choose the green open access, or self-archiving, route. As soon as your article is published, you can make the author accepted manuscript (the version accepted for publication) openly available, free from payment and embargo periods.

You can find out more about our open access routes, our APCs and waivers, and read our FAQs on our open research page.

Transparency and Openness Promotion (TOP) Guidelines

We are a signatory of the Transparency and Openness Promotion (TOP) Guidelines, a framework that supports the reproducibility of research through the adoption of transparent research practices. That means we encourage you to:

  • Cite and fully reference all data, program code, and other methods in your article.
  • Include persistent identifiers, such as a Digital Object Identifier (DOI), in references for datasets and program codes. Persistent identifiers ensure future access to unique published digital objects, such as a piece of text or datasets. Persistent identifiers are assigned to datasets by digital archives, such as institutional repositories and partners in the Data Preservation Alliance for the Social Sciences (Data-PASS).
  • Follow appropriate international and national procedures with respect to data protection, rights to privacy and other ethical considerations, whenever you cite data. For further guidance please refer to our research and publishing ethics guidelines. For an example of how to cite datasets, please refer to the references section below.

Prepare your submission

Manuscript support services

We are pleased to partner with Editage, a platform that connects you with relevant experts in language support, translation, editing, visuals, consulting, and more. After you’ve agreed a fee, they will work with you to enhance your manuscript and get it submission-ready.

This is an optional service for authors who feel they need a little extra support. It does not guarantee your work will be accepted for review or publication.

Visit Editage

Manuscript requirements

Before you submit your manuscript, it’s important you read and follow the guidelines below. You will also find some useful tips in our structure your journal submission how-to guide.

Article files should be provided in Microsoft Word format.

While you are welcome to submit a PDF of the document alongside the Word file, PDFs alone are not acceptable. LaTeX files can also be used but only if an accompanying PDF document is provided. Acceptable figure file types are listed further below.

Articles should be between 3000 and 7000 words in length. This includes all text: for example, the structured abstract, references, all text in tables, and figures and appendices.

Please allow 280 words for each figure or table.
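As a quick illustration of how this length budget works, the sketch below (a hypothetical helper, not an Emerald tool) adds the 280-word allowance for each figure and table to the text word count before checking the total against the 3000–7000 word window:

```python
# Hypothetical helper illustrating the journal's length policy:
# total length = text word count + 280 words per figure or table.

def effective_length(word_count: int, figures: int, tables: int) -> int:
    """Length counted against the limit, per the guidelines above."""
    return word_count + 280 * (figures + tables)

def within_limits(word_count: int, figures: int, tables: int) -> bool:
    """True if the effective length falls in the 3000-7000 word window."""
    return 3000 <= effective_length(word_count, figures, tables) <= 7000

# A 6200-word manuscript with two figures and one table counts as
# 6200 + 3 * 280 = 7040 words, just over the 7000-word ceiling.
print(effective_length(6200, figures=2, tables=1))  # 7040
print(within_limits(6200, figures=2, tables=1))     # False
```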

A concisely worded title should be provided.

The names of all contributing authors should be added to the ScholarOne submission; please list them in the order in which you’d like them to be published. Each contributing author will need their own ScholarOne author account, from which we will extract the following details:

  • Author email address (institutional preferred).
  • Author name. We will reproduce it exactly, so any middle names and/or initials they want featured must be included.
  • Author affiliation. This should be where they were based when the research for the paper was conducted.

In multi-authored papers, it’s important that ALL authors that have made a significant contribution to the paper are listed. Those who have provided support but have not contributed to the research should be featured in an acknowledgements section. You should never include people who have not contributed to the paper or who don’t want to be associated with the research. Read about our criteria for authorship.

If you want to include acknowledgements or author biographies, save them in a separate Microsoft Word document and upload the file with your submission. Where they are included, a brief professional biography of not more than 100 words should be supplied for each named author.

Your article must reference all sources of external research funding in the acknowledgements section. You should describe the role of the funder or financial sponsor in the entire research process, from study design to submission.

All submissions must include a structured abstract, following the format outlined below.

These four sub-headings and their accompanying explanations must always be included: Purpose; Design/methodology/approach; Findings; Originality/value.

The following three sub-headings are optional and can be included, if applicable: Research limitations/implications; Practical implications; Social implications.


You can find some useful tips in our how-to guide.

The maximum length of your abstract should be 250 words in total, including keywords and article classification (see the sections below).

Your submission should include up to 12 appropriate and short keywords that capture the principal topics of the paper. Our how-to guide contains some practical guidance on choosing search-engine friendly keywords.

Please note, while we will always try to use the keywords you’ve suggested, the in-house editorial team may replace some of them with matching terms to ensure consistency across publications and improve your article’s visibility.

During the submission process, you will be asked to select a type for your paper; the options are listed below. If you don’t see an exact match, please choose the best fit:

You will also be asked to select a category for your paper. The options for this are listed below. If you don’t see an exact match, please choose the best fit:

  • Research paper. Reports on any type of research undertaken by the author(s).
  • Viewpoint. Covers any paper where content is dependent on the author's opinion and interpretation. This includes journalistic and magazine-style pieces.
  • Technical paper. Describes and evaluates technical products, processes or services.
  • Conceptual paper. Focuses on developing hypotheses and is usually discursive. Covers philosophical discussions and comparative studies of other authors’ work and thinking.
  • Case study. Describes actual interventions or experiences within organizations. It can be subjective and doesn’t generally report on research. Also covers a description of a legal case or a hypothetical case study used as a teaching exercise.
  • Literature review. This category should only be used if the main purpose of the paper is to annotate and/or critique the literature in a particular field. It could be a selective bibliography providing advice on information sources, or the paper may aim to cover the main contributors to the development of a topic and explore their different views.
  • General review. Provides an overview or historical examination of some concept, technique or phenomenon. Papers are likely to be more descriptive or instructional (‘how to’ papers) than discursive.

Headings must be concise, with a clear indication of the required hierarchy. 

The preferred format is for first level headings to be in bold, and subsequent sub-headings to be in medium italics.

Notes or endnotes should only be used if absolutely necessary. They should be identified in the text by consecutive numbers enclosed in square brackets. These numbers should then be listed, and explained, at the end of the article.

All figures (charts, diagrams, line drawings, webpages/screenshots, and photographic images) should be submitted electronically. Both colour and black and white files are accepted.


Tables should be typed and submitted in a file separate from the main body of the article. The position of each table should be clearly labelled in the main body of the article, with corresponding labels clearly shown in the table file. Tables should be numbered consecutively in Roman numerals (e.g. I, II, etc.).

Give each table a brief title. Ensure that any superscripts or asterisks are shown next to the relevant items and have explanations displayed as footnotes to the table, figure or plate.

Where tables, figures, appendices, and other additional content are supplementary to the article but not critical to the reader’s understanding of it, you can choose to host these supplementary files alongside your article on Insight, Emerald’s content-hosting platform (this is Emerald's recommended option as we are able to ensure the data remain accessible), or on an alternative trusted online repository. All supplementary material must be submitted prior to acceptance.

Emerald recommends that authors use a suitable and trusted online repository when hosting supplementary material outside of Insight.

If you choose to host your supplementary files on Insight, you must submit these as separate files alongside your article. Files should be clearly labelled in such a way that makes it clear they are supplementary; Emerald recommends that the file name is descriptive and that it follows the format ‘Supplementary_material_appendix_1’ or ‘Supplementary tables’. All supplementary material must be mentioned at the appropriate moment in the main text of the article; there is no need to include the content of the file, only the file name. A link to the supplementary material will be added to the article during production, and the material will be made available alongside the main text of the article at the point of EarlyCite publication.

Please note that Emerald will not make any changes to the material; it will not be copy-edited or typeset, and authors will not receive proofs of this content. Emerald therefore strongly recommends that you style all supplementary material ahead of acceptance of the article.


If you choose to host your supplementary material on an alternative trusted online repository, you should ensure that the supplementary material is hosted on the repository ahead of submission, and then include a link only to the repository within the article. It is the responsibility of the submitting author to ensure that the material is free to access and that it remains permanently available. Where an alternative trusted online repository is used, the files hosted should always be presented as read-only; please be aware that such usage risks compromising your anonymity during the review process if the repository contains any information that may enable the reviewer to identify you; as such, we recommend that all links to alternative repositories are reviewed carefully prior to submission.

Please note that extensive supplementary material may be subject to peer review; this is at the discretion of the journal Editor and dependent on the content of the material (for example, whether including it would support the reviewer making a decision on the article during the peer review process).

All references in your manuscript must be formatted using one of the recognised Harvard styles. You are welcome to use the Harvard style Emerald has adopted – we’ve provided a detailed guide below. Want to use a different Harvard style? That’s fine; our typesetters will make any necessary changes to your manuscript if it is accepted. Please ensure you check all your citations for completeness, accuracy and consistency.

References to other publications in your text should be written as follows:

Single author: (Adams, 2006). Two authors: (Adams and Brown, 2006). Three or more authors: (Adams et al., 2006). Please note, ‘et al.’ should always be written in italics.

A few other style points. These apply to both the main body of text and your final list of references.

At the end of your paper, please supply a reference list in alphabetical order using the style guidelines below. Where a DOI is available, this should be included at the end of the reference.

For books: Surname, initials (year), title of book, publisher, place of publication.

e.g. Harrow, R. (2005), No Place to Hide, Simon & Schuster, New York, NY.

Surname, initials (year), "chapter title", editor's surname, initials (Ed.), , publisher, place of publication, page numbers.

e.g. Calabrese, F.A. (2005), "The early pathways: theory to practice – a continuum", Stankosky, M. (Ed.),  , Elsevier, New York, NY, pp.15-20.

Surname, initials (year), "title of article",  , volume issue, page numbers.

e.g. Capizzi, M.T. and Ferguson, R. (2005), "Loyalty trends for the twenty-first century",  , Vol. 22 No. 2, pp.72-80.

For published conference proceedings: Surname, initials (year of publication), "title of paper", in editor’s surname, initials (Ed.), title of published proceedings, publisher, place of publication, page numbers.

e.g. Wilde, S. and Cox, C. (2008), “Principal factors contributing to the competitiveness of tourism destinations at varying stages of development”, in Richardson, S., Fredline, L., Patiar, A. and Ternel, M. (Eds), [title of proceedings], Griffith University, Gold Coast, Qld, pp. 115-118.

Surname, initials (year), "title of paper", paper presented at [name of conference], [date of conference], [place of conference], available at: URL if freely available on the internet (accessed date).

e.g. Aumueller, D. (2005), "Semantic authoring and retrieval within a wiki", paper presented at the European Semantic Web Conference (ESWC), 29 May-1 June, Heraklion, Crete, available at: http://dbs.uni-leipzig.de/file/aumueller05wiksar.pdf (accessed 20 February 2007).

Surname, initials (year), "title of article", working paper [number if available], institution or organization, place of organization, date.

e.g. Moizer, P. (2003), "How published academic research can inform policy decisions: the case of mandatory rotation of audit appointments", working paper, Leeds University Business School, University of Leeds, Leeds, 28 March.

 (year), "title of entry", volume, edition, title of encyclopaedia, publisher, place of publication, page numbers.

e.g.   (1926), "Psychology of culture contact", Vol. 1, 13th ed., Encyclopaedia Britannica, London and New York, NY, pp.765-771.

(for authored entries, please refer to book chapter guidelines above)

Surname, initials (year), "article title",  , date, page numbers.

e.g. Smith, A. (2008), "Money for old rope",  , 21 January, pp.1, 3-4.

 (year), "article title", date, page numbers.

e.g.   (2008), "Small change", 2 February, p.7.

Surname, initials (year), "title of document", unpublished manuscript, collection name, inventory record, name of archive, location of archive.

e.g. Litman, S. (1902), "Mechanism & Technique of Commerce", unpublished manuscript, Simon Litman Papers, Record series 9/5/29 Box 3, University of Illinois Archives, Urbana-Champaign, IL.

For electronic sources: if available online, the full URL should be supplied at the end of the reference, as well as the date that the resource was accessed.

Surname, initials (year), “title of electronic source”, available at: persistent URL (accessed date month year).

e.g. Weida, S. and Stolley, K. (2013), “Developing strong thesis statements”, available at: https://owl.english.purdue.edu/owl/resource/588/1/ (accessed 20 June 2018)

Standalone URLs, i.e. those without an author or date, should be included either inside parentheses within the main text, or preferably set as a note (Roman numeral within square brackets within text followed by the full URL address at the end of the paper).

For data: Surname, initials (year), title of dataset, name of data repository, available at: persistent URL (accessed date month year).

e.g. Campbell, A. and Kahn, R.L. (2015), American National Election Study, 1948, ICPSR07218-v4, Inter-university Consortium for Political and Social Research (distributor), Ann Arbor, MI, available at: https://doi.org/10.3886/ICPSR07218.v4 (accessed 20 June 2018).

Submit your manuscript

There are a number of key steps you should follow to ensure a smooth and trouble-free submission.

Double check your manuscript

Before submitting your work, it is your responsibility to check that the manuscript is complete, grammatically correct, and without spelling or typographical errors. A few other important points:

  • Give the journal aims and scope a final read. Is your manuscript definitely a good fit? If it isn’t, the editor may decline it without peer review.
  • Does your manuscript comply with our research and publishing ethics guidelines?
  • Have you cleared any necessary publishing permissions?
  • Have you followed all the formatting requirements laid out in these author guidelines?
  • If you need to refer to your own work, use wording such as ‘previous research has demonstrated’ not ‘our previous research has demonstrated’.
  • If you need to refer to your own, currently unpublished work, don’t include this work in the reference list.
  • Any acknowledgments or author biographies should be uploaded as separate files.
  • Carry out a final check to ensure that no author names appear anywhere in the manuscript. This includes in figures or captions.

You will find a helpful submission checklist on the website Think.Check.Submit.

The submission process

All manuscripts should be submitted through our editorial system by the corresponding author.

The only way to submit to the journal is through the journal’s ScholarOne site as accessed via the Emerald website, and not by email or through any third-party agent/company, journal representative, or website. Submissions should be done directly by the author(s) through the ScholarOne site and not via a third-party proxy on their behalf.

A separate author account is required for each journal you submit to. If this is your first time submitting to this journal, please choose the Create an account or Register now option in the editorial system. If you already have an Emerald login, you are welcome to reuse the existing username and password here.

Please note, the next time you log into the system, you will be asked for your username. This will be the email address you entered when you set up your account.

Don't forget to add your ORCID iD during the submission process. It will be embedded in your published article, along with a link to the ORCID registry allowing others to easily match you with your work.

Don’t have one yet? It only takes a few moments to register for a free ORCID identifier.

Visit the ScholarOne support centre for further help and guidance.

What you can expect next

You will receive an automated email from the journal editor, confirming your successful submission. It will provide you with a manuscript number, which will be used in all future correspondence about your submission. If you have any reason to suspect the confirmation email you receive might be fraudulent, please contact the journal editor in the first instance.

Post submission

Review and decision process

Each submission is checked by the editor. At this stage, they may choose to decline or unsubmit your manuscript if it doesn’t fit the journal aims and scope, or they feel the language/manuscript quality is too low.

If they think it might be suitable for the publication, they will send it to at least two independent referees for double anonymous peer review. Once these reviewers have provided their feedback, the editor may decide to accept your manuscript, request minor or major revisions, or decline your work.

While all journals work to different timescales, the goal is that the editor will inform you of their first decision within 60 days.

During this period, we will send you automated updates on the progress of your manuscript via our submission system, or you can log in to check on the current status of your paper. Each time we contact you, we will quote the manuscript number you were given at the point of submission. If you receive an email that does not match these criteria, it could be fraudulent and we recommend you contact the journal editor in the first instance.

Manuscript transfer service

Emerald’s manuscript transfer service takes the pain out of the submission process if your manuscript doesn’t fit your initial journal choice. Our team of expert Editors from participating journals work together to identify alternative journals that better align with your research, helping your work find the right publication home.

If a journal is participating in the manuscript transfer program, the Editor has the option to recommend your paper for transfer. If a transfer decision is made by the Editor, you will receive an email with the details of the recommended journal and the option to accept or reject the transfer. It’s always down to you as the author to decide if you’d like to accept. If you do accept, your paper and any reviewer reports will automatically be transferred to the recommended journal. You will then confirm resubmission in the new journal’s ScholarOne system.

Our Manuscript Transfer Service page has more information on the process.

If your submission is accepted

Open access

Once your paper is accepted, you will have the opportunity to indicate whether you would like to publish your paper via the gold open access route.

If you’ve chosen to publish gold open access, this is the point at which you will be asked to pay the APC (article processing charge). This varies per journal and can be found on our APC price list or on the editorial system at the point of submission. Your article will be published with a Creative Commons CC BY 4.0 user licence, which outlines how readers can reuse your work.

For UK journal article authors: if you wish to submit your work accepted by Emerald to REF 2021, you must make a ‘closed deposit’ of your accepted manuscript to your respective institutional repository upon acceptance of your article. Articles accepted for publication after 1st April 2018 should be deposited as soon as possible, but no later than three months after the acceptance date. For further information and guidance, please refer to the REF 2021 website.

All accepted authors are sent an email with a link to a licence form. This should be checked for accuracy, for example whether contact and affiliation details are up to date and your name is spelled correctly, and then returned to us electronically. If there is a reason why you can’t assign copyright to us, you should discuss this with your journal content editor. You will find their contact details in the editorial team section above.

Proofing and typesetting

Once we have received your completed licence form, the article will pass directly into the production process. We will carry out editorial checks, copyediting, and typesetting and then return proofs to you (if you are the corresponding author) for your review. This is your opportunity to correct any typographical errors, grammatical errors or incorrect author details. We can’t accept requests to rewrite texts at this stage.

When the page proofs are finalised, the fully typeset and proofed version of record is published online. This is referred to as the EarlyCite version. While an EarlyCite article has yet to be assigned to a volume or issue, it does have a digital object identifier (DOI) and is fully citable. It will be compiled into an issue according to the journal’s issue schedule, with papers being added by chronological date of publication.

How to share your paper

Visit our author rights page to find out how you can reuse and share your work.

To find tips on increasing the visibility of your published paper, read about how to promote your work.

Correcting inaccuracies in your published paper

Sometimes errors are made during the research, writing and publishing processes. When these issues arise, we have the option of withdrawing the paper or introducing a correction notice. Find out more about our article withdrawal and correction policies.

Need to make a change to the author list? See our frequently asked questions (FAQs) below.

Frequently asked questions

Will I be asked to pay to publish?

The only time we will ever ask you for money to publish in an Emerald journal is if you have chosen to publish via the gold open access route. You will be asked to pay an APC (article-processing charge) once your paper has been accepted (unless it is a sponsored open access journal), and never at submission.

At no other time will you be asked to contribute financially towards your article’s publication, processing, or review. If you haven’t chosen gold open access and you receive an email that appears to be from Emerald, the journal, or a third party, asking you for payment to publish, please contact our support team.

How do I become a reviewer or editorial team member for this journal?

Please contact the editor for the journal, with a copy of your CV. You will find their contact details on the editorial team tab on this page.

Which volume and issue will my paper appear in?

Typically, papers are added to an issue according to their date of publication. If you would like to know in advance which issue your paper will appear in, please contact the content editor of the journal. You will find their contact details on the editorial team tab on this page. Once your paper has been published in an issue, you will be notified by email.

Who do I contact if I have a query about my submission, or if I suspect an email is fraudulent?

Please email the journal editor – you will find their contact details on the editorial team tab on this page. If you ever suspect an email you’ve received from Emerald might not be genuine, you are welcome to verify it with the content editor for the journal, whose contact details can be found on the editorial team tab on this page.

How do I know if my paper is suitable for the journal?

If you’ve read the aims and scope on the journal landing page and are still unsure whether your paper is suitable for the journal, please email the editor and include your paper's title and structured abstract. They will be able to advise on your manuscript’s suitability. You will find their contact details on the Editorial team tab on this page.

Can I change the list of authors after submission?

Authorship and the order in which the authors are listed on the paper should be agreed prior to submission. We have a right first time policy on this and no changes can be made to the list once submitted. If you have made an error in the submission process, please email the Journal Editorial Office who will look into your request – you will find their contact details on the editorial team tab on this page.

Editor-in-Chief

  • Dr Mark Vicars, Victoria University and Honorary Adjunct Professor at Mahidol University - Australia and Thailand, [email protected]
  • Dr Jeanne Marie Iorio, The University of Melbourne, [email protected]

Commissioning Editor

  • Danielle Crow, Emerald Publishing - UK, [email protected]

Journal Editorial Office (For queries related to pre-acceptance)

  • Prashant Bangera, Emerald Publishing, [email protected]

Supplier Project Manager (For queries related to post-acceptance)

  • Sivakeerthika Saravanan, Emerald Publishing, [email protected]

Editorial Advisory Board

  • Professor Victoria Carrington, University of Tasmania - Australia
  • Dr Antonia Darder, Loyola Marymount University - USA
  • Professor Norman Denzin, University of Illinois - USA
  • Dr Yvonne Downs, Independent Scholar - UK
  • Dr Ken Gale, Glasgow University - UK
  • Professor William Gaudelli, Lehigh University - USA
  • Professor Dan Goodley, Sheffield University - UK
  • Professor Ivor Goodson, University of Brighton - UK
  • Professor Gabriele Griffin, Centre for Gender Research, Uppsala University - Sweden
  • Dr Aaron Koh, The Chinese University of Hong Kong - Hong Kong
  • Dr Rebecca Lawthom, Sheffield University - UK
  • Dr Ligia (Licho) Lopez Lopez, The University of Melbourne - Australia
  • Professor Kate Pahl, Manchester Metropolitan University - UK
  • Professor Will Parnell, Portland State University - USA
  • Professor Ronald Pelias, Southern Illinois University - USA
  • Professor Laurel Richardson, Ohio State University - USA
  • Dr Reshmi Roy, Federation University - Australia
  • Professor Pat Sikes, University of Sheffield - UK
  • Professor Andrew Sparkes, Leeds Beckett University - UK
  • Professor Elizabeth St. Pierre, University of Georgia - USA
  • Prof Shirley R. Steinberg, University of Calgary, Canada and University of the West of Scotland - UK
  • Dr Allison Sterling Henward, Penn State College of Education - USA
  • Professor Maria Tamboukou, University of East London - UK

Citation metrics

CiteScore 2023

Further information

CiteScore is a simple way of measuring the citation impact of sources, such as journals.

The CiteScore is calculated as the number of citations received over four years by documents (articles, reviews, conference papers, book chapters, and data papers) published in a journal in those four years, divided by the number of the same document types indexed in Scopus and published in those same four years.
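Expressed as a formula (restating the Scopus definition above, with the 2023 score as an example):

```latex
\mathrm{CiteScore}_{2023}
  = \frac{\text{citations received in 2020--2023 by documents published in 2020--2023}}
         {\text{documents published in 2020--2023}}
```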

For more information and methodology, visit the Scopus definition.

CiteScore Tracker 2024

(updated monthly)

CiteScore Tracker is calculated in the same way as CiteScore, but for the current year rather than previous, complete years.

The CiteScore Tracker calculation is updated every month, as a current indication of a title's performance.

2023 Impact Factor

The Journal Impact Factor is published each year by Clarivate Analytics. It measures the average number of times that papers published in a journal during the preceding two years are cited in the current JCR year.
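In formula form (restating the Clarivate definition, with 2023 as an example):

```latex
\mathrm{JIF}_{2023}
  = \frac{\text{citations in 2023 to items published in 2021--2022}}
         {\text{citable items published in 2021--2022}}
```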

For more information and methodology, see Clarivate Analytics.

5-year Impact Factor (2023)

A base of five years may be more appropriate for journals in certain fields, because the body of citations may not be large enough to make reasonable comparisons, or because it may take longer than two years to publish and distribute work, leading to a longer period before others cite it.

Actual value is intentionally only displayed for the most recent year. Earlier values are available in the Journal Citation Reports from Clarivate Analytics.

Publication timeline

Time to first decision

Time to first decision, expressed in days, measures the time from submission to the journal’s “first decision”, which occurs when the editorial team reviews the peer reviewers’ comments and recommendations and, based on this feedback, decides whether to accept, reject, or request revisions for the manuscript.

Data is taken from submissions between 1st June 2023 and 31st May 2024

Acceptance to publication

Acceptance to publication, expressed in days, is the average time between the editorial team’s decision to accept the manuscript and its date of publication in the journal.

Data is taken from the previous 12 months (Last updated July 2024)

Acceptance rate

The acceptance rate is the number of manuscripts a journal accepts for publication, expressed as a percentage of the total number of manuscripts submitted.
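That is:

```latex
\text{Acceptance rate} = \frac{\text{manuscripts accepted}}{\text{manuscripts submitted}} \times 100\%
```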

Data is taken from submissions between 1st June 2023 and 31st May 2024.

This figure is the total number of downloads over the last 12 months for all articles published via EarlyCite.

(Last updated: July 2024)

This journal is abstracted and indexed by

  • American Sociological Association Publishing Options database
  • BFI (Denmark)
  • British Library
  • The Publication Forum (Finland)

Reviewer information

Peer review process

This journal engages in a double-anonymous peer review process, which strives to match the expertise of a reviewer with the submitted manuscript. Reviews are completed with evidence of thoughtful engagement with the manuscript, provide constructive feedback, and add value to the overall knowledge and information presented in the manuscript.

The mission of the peer review process is to achieve excellence and rigour in scholarly publications and research.

Our vision is to give voice to professionals in the subject area who contribute unique and diverse scholarly perspectives to the field.

The journal values diverse perspectives from the field and reviewers who provide critical, constructive, and respectful feedback to authors. Reviewers come from a variety of organizations, careers, and backgrounds from around the world.

All invitations to review, abstracts, manuscripts, and reviews should be kept confidential. Reviewers must not share their review or information about the review process with anyone without the agreement of the editors and authors involved, even after publication. This also applies to other reviewers’ “comments to author” which are shared with you on decision.


Resources to guide you through the review process

Discover practical tips and guidance on all aspects of peer review in our reviewers' section. See how being a reviewer could benefit your career, and discover what's involved in shaping a review.

More reviewer information

Calls for papers

Decentring the human in qualitative research: exploring diverse approaches by creating online communities.

Introduction: This special issue emerged from the Australian Association for Research in Education Qualitative Research Methodologies Special Interest Group Seminar Series on Decentring the Human in Qualitative Research...

Thank you to the 2023 Reviewers of Qualitative Research Journal

The publishing and editorial teams would like to thank the following, for their invaluable service as 2023 reviewers for this journal. We are very grateful for the contributions made. With their help, the journal has been able to publish such high...

Thank you to the 2022 Reviewers of Qualitative Research Journal

The publishing and editorial teams would like to thank the following, for their invaluable service as 2022 reviewers for this journal. We are very grateful for the contributions made. With their help, the journal has been able to publish such high...

Thank you to the 2021 Reviewers of Qualitative Research Journal

The publishing and editorial teams would like to thank the following, for their invaluable service as 2021 reviewers for this journal. We are very grateful for the contributions made. With their help, the journal has ...

Literati awards


Qualitative Research Journal - Literati Award Winners 2023

We are pleased to announce our 2023 Literati Award winners. Outstanding Papers Extended Qualitative Content Analysis: ...


Qualitative Research Journal - Literati Award Winners 2021

We are pleased to announce our 2021 Literati Award winners. Outstanding Paper Collaborative autoethnography:...

Qualitative Research Journal is an international journal dedicated to communicating the theory and practice of qualitative research in the human sciences. Interdisciplinary and eclectic, QRJ covers all methodologies that can be described as qualitative.


Aims and scope

Qualitative Research Journal (QRJ) deals comprehensively with the collection, analysis and presentation of qualitative data in the human sciences as well as theoretical and conceptual inquiry and provides an international forum for researchers and practitioners to advance knowledge and promote good qualitative research practices.

Latest articles

These are the latest articles published in this journal (Last updated: July 2024)

So, you think you're a leader? Qualitative study to understand patterns of presentation and symmetry among dimensions of leader identity

“Oh my phone, I can't live without you”: a phenomenological study of nomophobia among college students

The opportunity of struggle: a case study on developing a Māori-centric nursing course

Top downloaded articles

These are the most downloaded articles over the last 12 months for this journal (Last updated: July 2024)

Factors that enhance and limit youth empowerment, according to social educators

Visual tools for supporting interviews in qualitative research: new approaches

Women leaders' lived experiences of bravery in leadership

Top cited articles

These are the top cited articles for this journal, from the last 12 months according to Crossref (Last updated: July 2024)

Culturally Responsive and Communicative Teaching for Multicultural Integration: Qualitative Analysis from Public Secondary School

Creating spaces of wellbeing in academia to mitigate academic burnout: a collaborative autoethnography

Children's voices through play-based practice: listening, intensities and critique

Related journals

This journal is part of our Education collection. Explore our Education subject area to find out more.  

See all related journals

Social Studies Research and Practice

Social Studies Research and Practice (SSRP) is a quality peer-reviewed, electronic journal. Research and practice...


On the Horizon: The International Journal of Learning Futures

On the Horizon: The International Journal of Learning Futures (OTH) is a strategic planning resource for decision makers...


International Journal for Lesson and Learning Studies

The first journal of its kind, the International Journal for Lesson and Learning Studies publishes lesson and learning...


This title is aligned with our quality education for all goal

We believe in quality education for everyone, everywhere and by highlighting the issue and working with experts in the field, we can start to find ways we can all be part of the solution.

SDG 4 Quality education

Criteria for Good Qualitative Research: A Comprehensive Review

  • Regular Article
  • Open access
  • Published: 18 September 2021
  • Volume 31, pages 679–689 (2022)


  • Drishti Yadav, ORCID: orcid.org/0000-0002-2974-0323


Abstract

This review aims to synthesize a published set of evaluative criteria for good qualitative research. The aim is to shed light on existing standards for assessing the rigor of qualitative research encompassing a range of epistemological and ontological standpoints. Using a systematic search strategy, published journal articles that deliberate criteria for rigorous research were identified. Then, references of relevant articles were surveyed to find noteworthy, distinct, and well-defined pointers to good qualitative research. This review presents an investigative assessment of the pivotal features in qualitative research that can permit the readers to pass judgment on its quality and to commend it as good research when objectively and adequately utilized. Overall, this review underlines the crux of qualitative research and accentuates the necessity to evaluate such research by the very tenets of its being. It also offers some prospects and recommendations to improve the quality of qualitative research. Based on the findings of this review, it is concluded that quality criteria are the aftereffect of socio-institutional procedures and existing paradigmatic conducts. Owing to the paradigmatic diversity of qualitative research, a single and specific set of quality criteria is neither feasible nor anticipated. Since qualitative research is not a cohesive discipline, researchers need to educate and familiarize themselves with applicable norms and decisive factors to evaluate qualitative research from within its theoretical and methodological framework of origin.


Introduction

“… It is important to regularly dialogue about what makes for good qualitative research” (Tracy, 2010, p. 837)

What represents good qualitative research is highly debatable. Qualitative research encompasses numerous methods grounded in diverse philosophical perspectives. Bryman et al. (2008, p. 262) suggest that “It is widely assumed that whereas quality criteria for quantitative research are well‐known and widely agreed, this is not the case for qualitative research.” Hence, the question “how to evaluate the quality of qualitative research” has been continuously debated. These debates on the assessment of qualitative research have taken place across many areas of science and technology. Examples include various areas of psychology: general psychology (Madill et al., 2000); counseling psychology (Morrow, 2005); and clinical psychology (Barker & Pistrang, 2005); and other disciplines of social sciences: social policy (Bryman et al., 2008); health research (Sparkes, 2001); business and management research (Johnson et al., 2006); information systems (Klein & Myers, 1999); and environmental studies (Reid & Gough, 2000). In the literature, these debates are enthused by the impression that the blanket application of criteria for good qualitative research developed around the positivist paradigm is improper. Such debates are based on the wide range of philosophical backgrounds within which qualitative research is conducted (e.g., Sandberg, 2000; Schwandt, 1996). The existence of methodological diversity led to the formulation of different sets of criteria applicable to qualitative research.

Among qualitative researchers, the dilemma of governing the measures to assess the quality of research is not a new phenomenon, especially when the virtuous triad of objectivity, reliability, and validity (Spencer et al., 2004) is not adequate. Occasionally, the criteria of quantitative research are used to evaluate qualitative research (Cohen & Crabtree, 2008; Lather, 2004). Indeed, Howe (2004) claims that the prevailing paradigm in educational research is scientifically based experimental research. Hypotheses and conjectures about the preeminence of quantitative research can weaken the worth and usefulness of qualitative research by neglecting the importance of matching the evaluation criteria to the research paradigm, the epistemological stance of the researcher, and the choice of methodology. Researchers have been cautioned about this in “paradigmatic controversies, contradictions, and emerging confluences” (Lincoln & Guba, 2000).

In general, qualitative research tends to come from a very different paradigmatic stance and intrinsically demands distinctive, out-of-the-ordinary criteria for evaluating good research and the varieties of research contributions that can be made. This review attempts to present a series of evaluative criteria for qualitative researchers, arguing that their choice of criteria needs to be compatible with the unique nature of the research in question (its methodology, aims, and assumptions). This review aims to assist researchers in identifying some of the indispensable features or markers of high-quality qualitative research. In a nutshell, the purpose of this systematic literature review is to analyze the existing knowledge on high-quality qualitative research and to verify the existence of research studies dealing with the critical assessment of qualitative research based on the concept of diverse paradigmatic stances. Contrary to the existing reviews, this review also suggests some critical directions to follow to improve the quality of qualitative research in different epistemological and ontological perspectives. This review is also intended to provide guidelines for the acceleration of future developments and dialogues among qualitative researchers in the context of assessing qualitative research.

The rest of this review article is structured as follows: the Methods section describes the method followed for performing this review. The section Criteria for Evaluating Qualitative Studies provides a comprehensive description of the criteria for evaluating qualitative studies, followed by a summary of strategies to improve the quality of qualitative research in Improving Quality: Strategies. The section How to Assess the Quality of the Research Findings? details how to assess the quality of research findings, after which some quality checklists (as tools to evaluate quality) are discussed in Quality Checklists: Tools for Assessing the Quality. The review ends with concluding remarks, future directions, and some prospects for enhancing the quality and usefulness of qualitative research in the social and techno-scientific research community, presented in Conclusions, Future Directions and Outlook.

For this review, a comprehensive literature search was performed across several databases using generic search terms such as qualitative research and criteria. The following databases were chosen for the literature search based on the high number of results: IEEE Xplore, ScienceDirect, PubMed, Google Scholar, and Web of Science. The following keywords (and their combinations using the Boolean connectives OR/AND) were adopted for the literature search: qualitative research, criteria, quality, assessment, and validity. Synonyms for these keywords were collected and arranged in a logical structure (see Table 1). All publications in journals and conference proceedings from 1950 to 2021 were considered for the search, and further articles extracted from the references of the papers identified in the electronic search were also included. Because a large number of publications on qualitative research were retrieved during the initial screening, an inclusion criterion was applied to the search string to focus on works whose main concern is criteria for good qualitative research.
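As a minimal sketch of the search-string construction described above, the following Python snippet joins synonyms with OR within a keyword group and joins groups with AND. The synonym lists are hypothetical stand-ins for the review's Table 1, not the actual table.

```python
# Hypothetical sketch: composing a Boolean search string from keyword groups.
# The synonym lists below are invented placeholders for the review's Table 1.

keyword_groups = {
    "qualitative research": ["qualitative research", "qualitative study"],
    "criteria": ["criteria", "criterion", "standard"],
    "quality": ["quality", "rigor", "trustworthiness", "validity"],
}

def build_query(groups):
    """OR together the synonyms in each group, then AND the groups."""
    clauses = []
    for terms in groups.values():
        clauses.append("(" + " OR ".join(f'"{t}"' for t in terms) + ")")
    return " AND ".join(clauses)

print(build_query(keyword_groups))
# ("qualitative research" OR "qualitative study") AND ("criteria" OR ...) AND (...)
```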

From the selected databases, the search retrieved a total of 765 publications, from which duplicate records were removed. The remaining 426 publications were then screened for relevance on the basis of title and abstract, using the inclusion and exclusion criteria in Table 2: publications focusing on evaluation criteria for good qualitative research were included, whereas works that dealt only with theoretical concepts of qualitative research were excluded. Based on this screening and eligibility assessment, 45 research articles were identified that offered explicit criteria for evaluating the quality of qualitative research and were found to be relevant to this review.
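The screening arithmetic above (records retrieved, duplicates removed, titles/abstracts screened, articles included) can be mirrored in a simple pipeline. The sketch below uses invented records and a title-based deduplication rule purely for illustration; it is not the review's actual dataset or procedure.

```python
# Illustrative screening pipeline with fabricated records (not the review's data).
records = [
    {"id": 1, "title": "Quality criteria for qualitative research", "offers_criteria": True},
    {"id": 2, "title": "Quality criteria for qualitative research", "offers_criteria": True},  # duplicate
    {"id": 3, "title": "A theory of qualitative inquiry", "offers_criteria": False},
]

# Step 1: remove duplicates (keyed on title here; real reviews also match DOI/author/year).
unique = list({r["title"]: r for r in records}.values())

# Step 2: title/abstract screening against the inclusion criterion.
included = [r for r in unique if r["offers_criteria"]]

print(f"retrieved={len(records)}, after deduplication={len(unique)}, included={len(included)}")
```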

Figure 1 illustrates the complete review process in the form of a PRISMA flow diagram. PRISMA, i.e., "Preferred Reporting Items for Systematic Reviews and Meta-Analyses," is employed in systematic reviews to improve the quality of reporting.

Figure 1. PRISMA flow diagram illustrating the search and inclusion process. N represents the number of records.

Criteria for Evaluating Qualitative Studies

Fundamental Criteria: General Research Quality

Various researchers have put forward criteria for evaluating qualitative research, summarized in Table 3. The criteria outlined in Table 4 complement these by presenting various approaches for evaluating and assessing the quality of qualitative work; the entries in Table 4 are based on Tracy's "Eight big-tent criteria for excellent qualitative research" (Tracy, 2010). Tracy argues that high-quality qualitative work should be judged by criteria focusing on the worthiness, relevance, timeliness, significance, morality, and practicality of the research topic, and on the ethical stance of the research itself. Researchers have also suggested a series of questions as guiding principles for assessing the quality of a qualitative study (Mays & Pope, 2020). Nassaji (2020) argues that good qualitative research should be robust, well informed, and thoroughly documented.

Qualitative Research: Interpretive Paradigms

All qualitative researchers follow highly abstract principles that bring together beliefs about ontology, epistemology, and methodology. These beliefs govern how the researcher perceives and acts. The net encompassing the researcher's epistemological, ontological, and methodological premises is referred to as a paradigm, or an interpretive structure, a "basic set of beliefs that guides action" (Guba, 1990). Four major interpretive paradigms structure qualitative research: positivist and postpositivist; constructivist-interpretive; critical (Marxist, emancipatory); and feminist-poststructural. The complexity of these four abstract paradigms increases at the level of concrete, specific interpretive communities. Table 5 presents these paradigms and their assumptions, including their criteria for evaluating research and the typical form that an interpretive or theoretical statement assumes in each paradigm. Moreover, quantitative conceptualizations of reliability and validity have been shown to be incompatible with the evaluation of qualitative research (Horsburgh, 2003). In addition, a series of questions has been put forward in the literature to assist a reviewer (who is proficient in qualitative methods) in the meticulous assessment and endorsement of qualitative research (Morse, 2003). Hammersley (2007) suggests that guiding principles for qualitative research are advantageous but that methodological pluralism should not be simply assumed to hold for all qualitative approaches, and Seale (1999) points out the significance of methodological awareness in research studies.

Table 5 reflects that criteria for assessing the quality of qualitative research are the product of socio-institutional practices and existing paradigmatic standpoints. Owing to the paradigmatic diversity of qualitative research, a single set of quality criteria is neither possible nor desirable. Hence, researchers must be reflexive about the criteria they use in the various roles they play within their research community.

Improving Quality: Strategies

Another critical question is "How can qualitative researchers ensure that the abovementioned quality criteria are met?" Lincoln and Guba (1986) delineated several strategies to strengthen each criterion of trustworthiness, and other researchers (Merriam & Tisdell, 2016; Shenton, 2004) have presented similar strategies. A brief description of these strategies is given in Table 6.

It is worth mentioning that generalizability is also an integral part of qualitative research (Hays & McKibben, 2021). In general, the guiding principle of generalizability concerns how knowledge induced and comprehended in one context can be synthesized into interpretive components that transfer to other contexts. Table 7 summarizes the main metasynthesis steps required to establish generalizability in qualitative research.

Figure 2 reflects the crucial components of a conceptual framework and their contribution to decisions regarding research design, implementation, and the application of results to future thinking, study, and practice (Johnson et al., 2020). The synergy and interrelationship of these components signify their role at the different stages of a qualitative research study.

Figure 2. Essential elements of a conceptual framework.

In a nutshell, to assess the rationale of a study, its conceptual framework, and its research question(s), quality criteria must take account of the following: a lucid context for the problem statement in the introduction; well-articulated research problems and questions; a precise conceptual framework; a distinct research purpose; and clear presentation and investigation of the paradigms. These criteria would enhance the quality of qualitative research.

How to Assess the Quality of the Research Findings?

The inclusion of quotes or similar research data enhances confirmability in the write-up of the findings. The use of expressions such as "80% of all respondents agreed that" or "only one of the interviewees mentioned that" may also quantify qualitative findings (Stenfors et al., 2020); on the other hand, a persuasive argument has also been made for why such quantification may not strengthen the research (Monrouxe & Rees, 2020). Further, the Discussion and Conclusion sections of an article also serve as robust markers of high-quality qualitative research, as elucidated in Table 8.
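As a minimal sketch of the kind of quantification quoted above (the participants and codes below are invented, not data from any cited study), the share of respondents tagged with a given code can be computed directly:

```python
# Hypothetical coded interview data: participant -> set of codes assigned.
coded_responses = {
    "P1": {"agrees_with_policy", "mentions_cost"},
    "P2": {"agrees_with_policy"},
    "P3": {"agrees_with_policy", "mentions_access"},
    "P4": {"mentions_cost"},
    "P5": {"agrees_with_policy"},
}

code = "agrees_with_policy"
n_with_code = sum(code in codes for codes in coded_responses.values())
share = n_with_code / len(coded_responses)
print(f"{share:.0%} of all respondents agreed ({n_with_code} of {len(coded_responses)})")
# -> 80% of all respondents agreed (4 of 5)
```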

Quality Checklists: Tools for Assessing the Quality

Numerous checklists are available to speed up the assessment of the quality of qualitative research. However, if used uncritically, without regard for the research context, these checklists may be counterproductive. I suggest that such lists and guiding principles may assist in pinpointing the markers of high-quality qualitative research; however, given the enormous variation in authors' theoretical and philosophical contexts, I would emphasize that heavy reliance on such checklists may say little about whether the findings can be applied in your setting. A combination of such checklists might be appropriate for novice researchers. Some of these checklists are listed below:

The most commonly used framework is the Consolidated Criteria for Reporting Qualitative Research (COREQ) (Tong et al., 2007), which some journals require authors to follow during article submission.

Standards for Reporting Qualitative Research (SRQR) is another checklist that has been created particularly for medical education (O’Brien et al., 2014 ).

Also, Tracy ( 2010 ) and Critical Appraisal Skills Programme (CASP, 2021 ) offer criteria for qualitative research relevant across methods and approaches.

Further, researchers have also outlined different criteria as hallmarks of high-quality qualitative research. For instance, the “Road Trip Checklist” (Epp & Otnes, 2021 ) provides a quick reference to specific questions to address different elements of high-quality qualitative research.

Conclusions, Future Directions, and Outlook

This work presents a broad review of the criteria for good qualitative research, together with an exploratory analysis of the essential elements of qualitative research that, when objectively and adequately applied, enable readers of qualitative work to judge it as good research. Some of the essential markers that indicate high-quality qualitative research have been highlighted; I scope them narrowly to achieving rigor in qualitative research and note that they do not completely cover the broader considerations necessary for high-quality research.

This review points out that a universal, one-size-fits-all guideline for evaluating the quality of qualitative research does not exist; in other words, there is no single set of guidelines shared by all qualitative researchers. In unison, this review reinforces that each qualitative approach should be treated on its own terms, on account of its distinctive features and its epistemological and disciplinary position. Because the worth of qualitative research is sensitive to the specific context and the paradigmatic stance taken, researchers must themselves analyze which approaches can, and must, be tailored to suit the distinct characteristics of the phenomenon under investigation.

Although this article does not claim to put forward a magic bullet or a one-stop solution for dealing with dilemmas about how, why, or whether to evaluate the "goodness" of qualitative research, it offers a platform to assist researchers in improving their qualitative studies: an assembly of concerns to reflect on, a series of questions to ask, and multiple sets of criteria to look at when attempting to determine the quality of qualitative research. Overall, this review underlines the crux of qualitative research and accentuates the need to evaluate such research by the very tenets of its being. By bringing together the vital arguments and delineating the requirements that good qualitative research should satisfy, it strives to equip researchers and reviewers alike to make well-versed judgments about the worth and significance of the qualitative research under scrutiny. In a nutshell, a comprehensive portrayal of the research process (from the context of the research, its objectives, research questions, design, and theoretical foundations, to the approaches for collecting data, analyzing results, and deriving inferences) consistently improves the quality of a qualitative study.

Prospects: A Road Ahead for Qualitative Research

Irrefutably, qualitative research is a vivacious and evolving discipline in which different epistemological and disciplinary positions have their own characteristics and importance, and, not surprisingly, owing to its evolving and varied features, no consensus has been reached to date. Researchers have raised various concerns and proposed several recommendations for editors and reviewers on conducting reviews of critical qualitative research (Levitt et al., 2021; McGinley et al., 2021). The following are some prospects and recommendations put forward towards the maturation of qualitative research and its quality evaluation:

In general, most manuscript and grant reviewers are not qualitative experts and are therefore likely to prefer a broad set of criteria. However, researchers and reviewers need to keep in mind that it is inappropriate to apply the same approaches and standards to all qualitative research. Future work therefore needs to focus on educating researchers and reviewers about the criteria for evaluating qualitative research from within the appropriate theoretical and methodological context.

There is an urgent need to revisit and augment the critical assessment of some well-known and widely accepted tools (including checklists such as COREQ and SRQR) to interrogate their applicability to different research designs, along with their epistemological ramifications.

Efforts should be made towards creating more space for creativity, experimentation, and a dialogue between the diverse traditions of qualitative research. This would potentially help to avoid the enforcement of one's own set of quality criteria on the work carried out by others.

Moreover, journal reviewers need to be aware of various methodological practices and philosophical debates.

It is pivotal to highlight the expressions and considerations of qualitative researchers and to bring them into a more open and transparent dialogue about assessing qualitative research in techno-scientific, academic, sociocultural, and political arenas.

Frequent debates on the use of evaluative criteria are required to address some as-yet-unresolved issues (including the applicability of a single set of criteria in multi-disciplinary contexts). Such debates would not only benefit qualitative researchers themselves but, more importantly, would augment the well-being and vivacity of the entire discipline.

To conclude, I speculate that these criteria, and my perspective, may transfer to other methods, approaches, and contexts. I hope that they spark dialogue and debate about criteria for excellent qualitative research and the underpinnings of the discipline more broadly, and thereby help improve the quality of qualitative studies. Further, I anticipate that this review will help researchers reflect on the quality of their own research, substantiate their research designs, and assist reviewers in reviewing qualitative research for journals. On a final note, I point to the need for a framework (encompassing the prerequisites of a qualitative study) formulated through the cohesive efforts of qualitative researchers from different disciplines and theoretic-paradigmatic origins. I believe that tailoring such a framework of guiding principles would pave the way for qualitative researchers to consolidate the status of qualitative research in the wide-ranging open-science debate. Dialogue on this issue across different approaches is crucial for the future prospects of socio-techno-educational research.

Amin, M. E. K., Nørgaard, L. S., Cavaco, A. M., Witry, M. J., Hillman, L., Cernasev, A., & Desselle, S. P. (2020). Establishing trustworthiness and authenticity in qualitative pharmacy research. Research in Social and Administrative Pharmacy, 16 (10), 1472–1482.


Barker, C., & Pistrang, N. (2005). Quality criteria under methodological pluralism: Implications for conducting and evaluating research. American Journal of Community Psychology, 35 (3–4), 201–212.

Bryman, A., Becker, S., & Sempik, J. (2008). Quality criteria for quantitative, qualitative and mixed methods research: A view from social policy. International Journal of Social Research Methodology, 11 (4), 261–276.

Caelli, K., Ray, L., & Mill, J. (2003). ‘Clear as mud’: Toward greater clarity in generic qualitative research. International Journal of Qualitative Methods, 2 (2), 1–13.

CASP (2021). CASP checklists. Retrieved May 2021 from https://casp-uk.net/casp-tools-checklists/

Cohen, D. J., & Crabtree, B. F. (2008). Evaluative criteria for qualitative research in health care: Controversies and recommendations. The Annals of Family Medicine, 6 (4), 331–339.

Denzin, N. K., & Lincoln, Y. S. (2005). Introduction: The discipline and practice of qualitative research. In N. K. Denzin & Y. S. Lincoln (Eds.), The sage handbook of qualitative research (pp. 1–32). Sage Publications Ltd.


Elliott, R., Fischer, C. T., & Rennie, D. L. (1999). Evolving guidelines for publication of qualitative research studies in psychology and related fields. British Journal of Clinical Psychology, 38 (3), 215–229.

Epp, A. M., & Otnes, C. C. (2021). High-quality qualitative research: Getting into gear. Journal of Service Research . https://doi.org/10.1177/1094670520961445

Guba, E. G. (1990). The paradigm dialog. Sage Publications.

Hammersley, M. (2007). The issue of quality in qualitative research. International Journal of Research and Method in Education, 30 (3), 287–305.

Haven, T. L., Errington, T. M., Gleditsch, K. S., van Grootel, L., Jacobs, A. M., Kern, F. G., & Mokkink, L. B. (2020). Preregistering qualitative research: A Delphi study. International Journal of Qualitative Methods, 19 , 1609406920976417.

Hays, D. G., & McKibben, W. B. (2021). Promoting rigorous research: Generalizability and qualitative research. Journal of Counseling and Development, 99 (2), 178–188.

Horsburgh, D. (2003). Evaluation of qualitative research. Journal of Clinical Nursing, 12 (2), 307–312.

Howe, K. R. (2004). A critique of experimentalism. Qualitative Inquiry, 10 (1), 42–46.

Johnson, J. L., Adkins, D., & Chauvin, S. (2020). A review of the quality indicators of rigor in qualitative research. American Journal of Pharmaceutical Education, 84 (1), 7120.

Johnson, P., Buehring, A., Cassell, C., & Symon, G. (2006). Evaluating qualitative management research: Towards a contingent criteriology. International Journal of Management Reviews, 8 (3), 131–156.

Klein, H. K., & Myers, M. D. (1999). A set of principles for conducting and evaluating interpretive field studies in information systems. MIS Quarterly, 23 (1), 67–93.

Lather, P. (2004). This is your father’s paradigm: Government intrusion and the case of qualitative research in education. Qualitative Inquiry, 10 (1), 15–34.

Levitt, H. M., Morrill, Z., Collins, K. M., & Rizo, J. L. (2021). The methodological integrity of critical qualitative research: Principles to support design and research review. Journal of Counseling Psychology, 68 (3), 357.

Lincoln, Y. S., & Guba, E. G. (1986). But is it rigorous? Trustworthiness and authenticity in naturalistic evaluation. New Directions for Program Evaluation, 1986 (30), 73–84.

Lincoln, Y. S., & Guba, E. G. (2000). Paradigmatic controversies, contradictions and emerging confluences. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (2nd ed., pp. 163–188). Sage Publications.

Madill, A., Jordan, A., & Shirley, C. (2000). Objectivity and reliability in qualitative analysis: Realist, contextualist and radical constructionist epistemologies. British Journal of Psychology, 91 (1), 1–20.

Mays, N., & Pope, C. (2020). Quality in qualitative research. Qualitative Research in Health Care . https://doi.org/10.1002/9781119410867.ch15

McGinley, S., Wei, W., Zhang, L., & Zheng, Y. (2021). The state of qualitative research in hospitality: A 5-year review 2014 to 2019. Cornell Hospitality Quarterly, 62 (1), 8–20.

Merriam, S., & Tisdell, E. (2016). Qualitative research: A guide to design and implementation (4th ed.). Jossey-Bass.

Meyer, M., & Dykes, J. (2019). Criteria for rigor in visualization design study. IEEE Transactions on Visualization and Computer Graphics, 26 (1), 87–97.

Monrouxe, L. V., & Rees, C. E. (2020). When I say… quantification in qualitative research. Medical Education, 54 (3), 186–187.

Morrow, S. L. (2005). Quality and trustworthiness in qualitative research in counseling psychology. Journal of Counseling Psychology, 52 (2), 250.

Morse, J. M. (2003). A review committee’s guide for evaluating qualitative proposals. Qualitative Health Research, 13 (6), 833–851.

Nassaji, H. (2020). Good qualitative research. Language Teaching Research, 24 (4), 427–431.

O’Brien, B. C., Harris, I. B., Beckman, T. J., Reed, D. A., & Cook, D. A. (2014). Standards for reporting qualitative research: A synthesis of recommendations. Academic Medicine, 89 (9), 1245–1251.

O’Connor, C., & Joffe, H. (2020). Intercoder reliability in qualitative research: Debates and practical guidelines. International Journal of Qualitative Methods, 19 , 1609406919899220.

Reid, A., & Gough, S. (2000). Guidelines for reporting and evaluating qualitative research: What are the alternatives? Environmental Education Research, 6 (1), 59–91.

Rocco, T. S. (2010). Criteria for evaluating qualitative studies. Human Resource Development International . https://doi.org/10.1080/13678868.2010.501959

Sandberg, J. (2000). Understanding human competence at work: An interpretative approach. Academy of Management Journal, 43 (1), 9–25.

Schwandt, T. A. (1996). Farewell to criteriology. Qualitative Inquiry, 2 (1), 58–72.

Seale, C. (1999). Quality in qualitative research. Qualitative Inquiry, 5 (4), 465–478.

Shenton, A. K. (2004). Strategies for ensuring trustworthiness in qualitative research projects. Education for Information, 22 (2), 63–75.

Sparkes, A. C. (2001). Myth 94: Qualitative health researchers will agree about validity. Qualitative Health Research, 11 (4), 538–552.

Spencer, L., Ritchie, J., Lewis, J., & Dillon, L. (2004). Quality in qualitative evaluation: A framework for assessing research evidence.

Stenfors, T., Kajamaa, A., & Bennett, D. (2020). How to assess the quality of qualitative research. The Clinical Teacher, 17 (6), 596–599.

Taylor, E. W., Beck, J., & Ainsworth, E. (2001). Publishing qualitative adult education research: A peer review perspective. Studies in the Education of Adults, 33 (2), 163–179.

Tong, A., Sainsbury, P., & Craig, J. (2007). Consolidated criteria for reporting qualitative research (COREQ): A 32-item checklist for interviews and focus groups. International Journal for Quality in Health Care, 19 (6), 349–357.

Tracy, S. J. (2010). Qualitative quality: Eight “big-tent” criteria for excellent qualitative research. Qualitative Inquiry, 16 (10), 837–851.


Open access funding provided by TU Wien (TUW).

Author information

Authors and Affiliations

Faculty of Informatics, Technische Universität Wien, 1040, Vienna, Austria

Drishti Yadav


Corresponding author

Correspondence to Drishti Yadav .

Ethics declarations

Conflict of interest.

The author declares no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Yadav, D. Criteria for Good Qualitative Research: A Comprehensive Review. Asia-Pacific Edu Res 31 , 679–689 (2022). https://doi.org/10.1007/s40299-021-00619-0


Accepted: 28 August 2021

Published: 18 September 2021

Issue Date: December 2022

DOI: https://doi.org/10.1007/s40299-021-00619-0


  • Qualitative research
  • Evaluative criteria


What Is Qualitative Research? | Methods & Examples

Published on June 19, 2020 by Pritha Bhandari . Revised on September 5, 2024.

Qualitative research involves collecting and analyzing non-numerical data (e.g., text, video, or audio) to understand concepts, opinions, or experiences. It can be used to gather in-depth insights into a problem or generate new ideas for research.

Qualitative research is the opposite of quantitative research , which involves collecting and analyzing numerical data for statistical analysis.

Qualitative research is commonly used in the humanities and social sciences, in subjects such as anthropology, sociology, education, health sciences, history, etc.

Examples of qualitative research questions:

  • How does social media shape body image in teenagers?
  • How do children and adults interpret healthy eating in the UK?
  • What factors influence employee retention in a large organization?
  • How is anxiety experienced around the world?
  • How can teachers integrate social issues into science curriculums?

Table of contents

  • Approaches to qualitative research
  • Qualitative research methods
  • Qualitative data analysis
  • Advantages of qualitative research
  • Disadvantages of qualitative research
  • Other interesting articles
  • Frequently asked questions about qualitative research

Qualitative research is used to understand how people experience the world. While there are many approaches to qualitative research, they tend to be flexible and focus on retaining rich meaning when interpreting data.

Common approaches include grounded theory, ethnography , action research , phenomenological research, and narrative research. They share some similarities, but emphasize different aims and perspectives.

Qualitative research approaches

  • Grounded theory: Researchers collect rich data on a topic of interest and develop theories.
  • Ethnography: Researchers immerse themselves in groups or organizations to understand their cultures.
  • Action research: Researchers and participants collaboratively link theory to practice to drive social change.
  • Phenomenological research: Researchers investigate a phenomenon or event by describing and interpreting participants' lived experiences.
  • Narrative research: Researchers examine how stories are told to understand how participants perceive and make sense of their experiences.

Note that qualitative research is at risk for certain research biases including the Hawthorne effect , observer bias , recall bias , and social desirability bias . While not always totally avoidable, awareness of potential biases as you collect and analyze your data can prevent them from impacting your work too much.


Each of these research approaches involves using one or more data collection methods. These are some of the most common qualitative methods:

  • Observations: recording what you have seen, heard, or encountered in detailed field notes.
  • Interviews:  personally asking people questions in one-on-one conversations.
  • Focus groups: asking questions and generating discussion among a group of people.
  • Surveys: distributing questionnaires with open-ended questions.
  • Secondary research: collecting existing data in the form of texts, images, audio or video recordings, etc.
For example, in a study of the culture of a large company, you might combine several of these methods:

  • You take field notes with observations and reflect on your own experiences of the company culture.
  • You distribute open-ended surveys to employees across all the company’s offices by email to find out if the culture varies across locations.
  • You conduct in-depth interviews with employees in your office to learn about their experiences and perspectives in greater detail.

Qualitative researchers often consider themselves “instruments” in research because all observations, interpretations and analyses are filtered through their own personal lens.

For this reason, when writing up your methodology for qualitative research, it’s important to reflect on your approach and to thoroughly explain the choices you made in collecting and analyzing the data.

Qualitative data can take the form of texts, photos, videos and audio. For example, you might be working with interview transcripts, survey responses, fieldnotes, or recordings from natural settings.

Most types of qualitative data analysis share the same five steps (a minimal coding sketch follows the list):

  • Prepare and organize your data. This may mean transcribing interviews or typing up fieldnotes.
  • Review and explore your data. Examine the data for patterns or repeated ideas that emerge.
  • Develop a data coding system. Based on your initial ideas, establish a set of codes that you can apply to categorize your data.
  • Assign codes to the data. For example, in qualitative survey analysis, this may mean going through each participant’s responses and tagging them with codes in a spreadsheet. As you go through your data, you can create new codes to add to your system if necessary.
  • Identify recurring themes. Link codes together into cohesive, overarching themes.
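To make steps 3–5 concrete, here is a hedged, minimal sketch with invented responses, codes, and themes. A real codebook is analytic rather than lexical, so the keyword matching below is only a stand-in for researcher judgment.

```python
# Hypothetical survey responses: participant -> free-text answer.
responses = {
    "P1": "The app reminded me to take my medication every morning.",
    "P2": "I stopped using it because the interface was confusing.",
    "P3": "Reminders helped, but setting them up was confusing.",
}

# Step 3: develop a coding system (here, codes triggered by keyword cues).
codebook = {"reminders": ["remind"], "usability_problems": ["confus"]}

# Step 4: assign codes to the data.
coded = {pid: [code for code, cues in codebook.items()
               if any(cue in text.lower() for cue in cues)]
         for pid, text in responses.items()}

# Step 5: link codes into cohesive, overarching themes.
themes = {"perceived_usefulness": ["reminders"], "barriers_to_use": ["usability_problems"]}

print(coded)
# {'P1': ['reminders'], 'P2': ['usability_problems'], 'P3': ['reminders', 'usability_problems']}
```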

There are several specific approaches to analyzing qualitative data. Although these methods share similar processes, they emphasize different concepts (a small content-analysis sketch follows the table).

Qualitative data analysis

  • Content analysis: to describe and categorize common words, phrases, and ideas in qualitative data. Example: a market researcher could perform content analysis to find out what kind of language is used in descriptions of therapeutic apps.
  • Thematic analysis: to identify and interpret patterns and themes in qualitative data. Example: a psychologist could apply thematic analysis to travel blogs to explore how tourism shapes self-identity.
  • Textual analysis: to examine the content, structure, and design of texts. Example: a media researcher could use textual analysis to understand how news coverage of celebrities has changed in the past decade.
  • Discourse analysis: to study communication and how language is used to achieve effects in specific contexts. Example: a political scientist could use discourse analysis to study how politicians generate trust in election campaigns.
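As a rough illustration of the mechanical side of content analysis (the app descriptions below are invented; a real study would also build an analytic coding frame), common words can be counted like this:

```python
from collections import Counter
import re

# Invented app descriptions standing in for real study data.
descriptions = [
    "A calming app for daily mindfulness and stress relief.",
    "Guided meditations to reduce stress and improve sleep.",
    "Daily mood tracking with calming exercises.",
]

stopwords = {"a", "for", "and", "to", "with"}
words = [w for text in descriptions
         for w in re.findall(r"[a-z]+", text.lower()) if w not in stopwords]
print(Counter(words).most_common(5))
# e.g. [('calming', 2), ('daily', 2), ('stress', 2), ...]
```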

Qualitative research often tries to preserve the voice and perspective of participants and can be adjusted as new research questions arise. Qualitative research is good for:

  • Flexibility

The data collection and analysis process can be adapted as new ideas or patterns emerge. They are not rigidly decided beforehand.

  • Natural settings

Data collection occurs in real-world contexts or in naturalistic ways.

  • Meaningful insights

Detailed descriptions of people’s experiences, feelings and perceptions can be used in designing, testing or improving systems or products.

  • Generation of new ideas

Open-ended responses mean that researchers can uncover novel problems or opportunities that they wouldn’t have thought of otherwise.


Researchers must consider practical and theoretical limitations in analyzing and interpreting their data. Qualitative research suffers from:

  • Unreliability

The real-world setting often makes qualitative research unreliable because of uncontrolled factors that affect the data.

  • Subjectivity

Due to the researcher’s primary role in analyzing and interpreting data, qualitative research cannot be replicated . The researcher decides what is important and what is irrelevant in data analysis, so interpretations of the same data can vary greatly.

  • Limited generalizability

Small samples are often used to gather detailed data about specific contexts. Despite rigorous analysis procedures, it is difficult to draw generalizable conclusions because the data may be biased and unrepresentative of the wider population .

  • Labor-intensive

Although software can be used to manage and record large amounts of text, data analysis often has to be checked or performed manually.

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Chi square goodness of fit test
  • Degrees of freedom
  • Null hypothesis
  • Discourse analysis
  • Control groups
  • Mixed methods research
  • Non-probability sampling
  • Quantitative research
  • Inclusion and exclusion criteria

Research bias

  • Rosenthal effect
  • Implicit bias
  • Cognitive bias
  • Selection bias
  • Negativity bias
  • Status quo bias

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

There are five common approaches to qualitative research:

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organization to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .

Cite this Scribbr article


Bhandari, P. (2024, September 05). What Is Qualitative Research? | Methods & Examples. Scribbr. Retrieved September 11, 2024, from https://www.scribbr.com/methodology/qualitative-research/


Can J Hosp Pharm, v.68(3), May-Jun 2015

Qualitative Research: Data Collection, Analysis, and Management

INTRODUCTION

In an earlier paper, 1 we presented an introduction to using qualitative research methods in pharmacy practice. In this article, we review some principles of the collection, analysis, and management of qualitative data to help pharmacists interested in doing research in their practice to continue their learning in this area. Qualitative research can help researchers to access the thoughts and feelings of research participants, which can enable development of an understanding of the meaning that people ascribe to their experiences. Whereas quantitative research methods can be used to determine how many people undertake particular behaviours, qualitative methods can help researchers to understand how and why such behaviours take place. Within the context of pharmacy practice research, qualitative approaches have been used to examine a diverse array of topics, including the perceptions of key stakeholders regarding prescribing by pharmacists and the postgraduation employment experiences of young pharmacists (see “Further Reading” section at the end of this article).

In the previous paper, 1 we outlined 3 commonly used methodologies: ethnography 2 , grounded theory 3 , and phenomenology. 4 Briefly, ethnography involves researchers using direct observation to study participants in their “real life” environment, sometimes over extended periods. Grounded theory and its later modified versions (e.g., Strauss and Corbin 5 ) use face-to-face interviews and interactions such as focus groups to explore a particular research phenomenon and may help in clarifying a less-well-understood problem, situation, or context. Phenomenology shares some features with grounded theory (such as an exploration of participants’ behaviour) and uses similar techniques to collect data, but it focuses on understanding how human beings experience their world. It gives researchers the opportunity to put themselves in another person’s shoes and to understand the subjective experiences of participants. 6 Some researchers use qualitative methodologies but adopt a different standpoint, and an example of this appears in the work of Thurston and others, 7 discussed later in this paper.

Qualitative work requires reflection on the part of researchers, both before and during the research process, as a way of providing context and understanding for readers. When being reflexive, researchers should not try to simply ignore or avoid their own biases (as this would likely be impossible); instead, reflexivity requires researchers to reflect upon and clearly articulate their position and subjectivities (world view, perspectives, biases), so that readers can better understand the filters through which questions were asked, data were gathered and analyzed, and findings were reported. From this perspective, bias and subjectivity are not inherently negative but they are unavoidable; as a result, it is best that they be articulated up-front in a manner that is clear and coherent for readers.

THE PARTICIPANT’S VIEWPOINT

What qualitative study seeks to convey is why people have thoughts and feelings that might affect the way they behave. Such study may occur in any number of contexts, but here, we focus on pharmacy practice and the way people behave with regard to medicines use (e.g., to understand patients’ reasons for nonadherence with medication therapy or to explore physicians’ resistance to pharmacists’ clinical suggestions). As we suggested in our earlier article, 1 an important point about qualitative research is that there is no attempt to generalize the findings to a wider population. Qualitative research is used to gain insights into people’s feelings and thoughts, which may provide the basis for a future stand-alone qualitative study or may help researchers to map out survey instruments for use in a quantitative study. It is also possible to use different types of research in the same study, an approach known as “mixed methods” research, and further reading on this topic may be found at the end of this paper.

The role of the researcher in qualitative research is to attempt to access the thoughts and feelings of study participants. This is not an easy task, as it involves asking people to talk about things that may be very personal to them. Sometimes the experiences being explored are fresh in the participant’s mind, whereas on other occasions reliving past experiences may be difficult. However the data are being collected, a primary responsibility of the researcher is to safeguard participants and their data. Mechanisms for such safeguarding must be clearly articulated to participants and must be approved by a relevant research ethics review board before the research begins. Researchers and practitioners new to qualitative research should seek advice from an experienced qualitative researcher before embarking on their project.

DATA COLLECTION

Whatever philosophical standpoint the researcher is taking and whatever the data collection method (e.g., focus group, one-to-one interviews), the process will involve the generation of large amounts of data. In addition to the variety of study methodologies available, there are also different ways of making a record of what is said and done during an interview or focus group, such as taking handwritten notes or video-recording. If the researcher is audio- or video-recording the data collection, then the recordings must be transcribed verbatim before data analysis can begin. As a rough guide, it can take an experienced researcher/transcriber 8 hours to transcribe one 45-minute audio-recorded interview, a process that will generate 20–30 pages of written dialogue.

Many researchers will also maintain a folder of “field notes” to complement audio-taped interviews. Field notes allow the researcher to maintain and comment upon impressions, environmental contexts, behaviours, and nonverbal cues that may not be adequately captured through the audio-recording; they are typically handwritten in a small notebook at the same time the interview takes place. Field notes can provide important context to the interpretation of audio-taped data and can help remind the researcher of situational factors that may be important during data analysis. Such notes need not be formal, but they should be maintained and secured in a similar manner to audio tapes and transcripts, as they contain sensitive information and are relevant to the research. For more information about collecting qualitative data, please see the “Further Reading” section at the end of this paper.

DATA ANALYSIS AND MANAGEMENT

If, as suggested earlier, doing qualitative research is about putting oneself in another person’s shoes and seeing the world from that person’s perspective, the most important part of data analysis and management is to be true to the participants. It is their voices that the researcher is trying to hear, so that they can be interpreted and reported on for others to read and learn from. To illustrate this point, consider the anonymized transcript excerpt presented in Appendix 1 , which is taken from a research interview conducted by one of the authors (J.S.). We refer to this excerpt throughout the remainder of this paper to illustrate how data can be managed, analyzed, and presented.

Interpretation of Data

Interpretation of the data will depend on the theoretical standpoint taken by researchers. For example, the title of the research report by Thurston and others, 7 “Discordant indigenous and provider frames explain challenges in improving access to arthritis care: a qualitative study using constructivist grounded theory,” indicates at least 2 theoretical standpoints. The first is the culture of the indigenous population of Canada and the place of this population in society, and the second is the social constructivist theory used in the constructivist grounded theory method. With regard to the first standpoint, it can be surmised that, to have decided to conduct the research, the researchers must have felt that there was anecdotal evidence of differences in access to arthritis care for patients from indigenous and non-indigenous backgrounds. With regard to the second standpoint, it can be surmised that the researchers used social constructivist theory because it assumes that behaviour is socially constructed; in other words, people do things because of the expectations of those in their personal world or in the wider society in which they live. (Please see the “Further Reading” section for resources providing more information about social constructivist theory and reflexivity.) Thus, these 2 standpoints (and there may have been others relevant to the research of Thurston and others 7 ) will have affected the way in which these researchers interpreted the experiences of the indigenous population participants and those providing their care. Another standpoint is feminist standpoint theory which, among other things, focuses on marginalized groups in society. Such theories are helpful to researchers, as they enable us to think about things from a different perspective. Being aware of the standpoints you are taking in your own research is one of the foundations of qualitative work. Without such awareness, it is easy to slip into interpreting other people’s narratives from your own viewpoint, rather than that of the participants.

To analyze the example in Appendix 1 , we will adopt a phenomenological approach because we want to understand how the participant experienced the illness and we want to try to see the experience from that person’s perspective. It is important for the researcher to reflect upon and articulate his or her starting point for such analysis; in this case, the coder could reflect upon her own experience as a female of a majority ethnocultural group who has lived within middle class and upper middle class settings. This personal history therefore forms the filter through which the data will be examined. This filter does not diminish the quality or significance of the analysis, since every researcher has his or her own filters; however, by explicitly stating and acknowledging what these filters are, the researcher makes it easier for readers to contextualize the work.

Transcribing and Checking

For the purposes of this paper it is assumed that interviews or focus groups have been audio-recorded. As mentioned above, transcribing is an arduous process, even for the most experienced transcribers, but it must be done to convert the spoken word to the written word to facilitate analysis. For anyone new to conducting qualitative research, it is beneficial to transcribe at least one interview and one focus group. It is only by doing this that researchers realize how difficult the task is, and this realization affects their expectations when asking others to transcribe. If the research project has sufficient funding, then a professional transcriber can be hired to do the work. If this is the case, then it is a good idea to sit down with the transcriber, if possible, and talk through the research and what the participants were talking about. This background knowledge for the transcriber is especially important in research in which people are using jargon or medical terms (as in pharmacy practice). Involving your transcriber in this way makes the work both easier and more rewarding, as he or she will feel part of the team. Transcription editing software is also available, but it is expensive. For example, ELAN (more formally known as EUDICO Linguistic Annotator, developed at the Max Planck Institute for Psycholinguistics) 8 is a tool that can help keep data organized by linking media and data files (particularly valuable if, for example, video-taping of interviews is complemented by transcriptions). It can also be helpful in searching complex data sets. Products such as ELAN do not actually transcribe interviews or complete analyses automatically, and they do require some time and effort to learn; nonetheless, for some research applications, it may be valuable to consider such software tools.

All audio recordings should be transcribed verbatim, regardless of how intelligible the transcript may be when it is read back. Lines of text should be numbered. Once the transcription is complete, the researcher should read it while listening to the recording and do the following: correct any spelling or other errors; anonymize the transcript so that the participant cannot be identified from anything that is said (e.g., names, places, significant events); insert notations for pauses, laughter, looks of discomfort; insert any punctuation, such as commas and full stops (periods) (see Appendix 1 for examples of inserted punctuation), and include any other contextual information that might have affected the participant (e.g., temperature or comfort of the room).
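As a minimal sketch of two of these checking steps (the transcript, names, and placeholders below are fabricated, for illustration only), line numbering and anonymization can be partly automated:

```python
# Fabricated transcript excerpt and name map, for illustration only.
transcript = """I saw Dr Patel at Riverside Clinic last May.
She asked about my tablets.
Nobody had asked me that before."""

replacements = {"Dr Patel": "[doctor]", "Riverside Clinic": "[clinic]"}

def anonymize(text, mapping):
    """Replace identifying strings with neutral placeholders."""
    for name, placeholder in mapping.items():
        text = text.replace(name, placeholder)
    return text

# Number each line of the anonymized transcript, as recommended above.
for number, line in enumerate(anonymize(transcript, replacements).splitlines(), start=1):
    print(f"{number:>3}  {line}")
```

Automated replacement only catches names the researcher has already listed, so a manual read-through against the recording remains essential.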

Dealing with the transcription of a focus group is slightly more difficult, as multiple voices are involved. One way of transcribing such data is to “tag” each voice (e.g., Voice A, Voice B). In addition, the focus group will usually have 2 facilitators, whose respective roles will help in making sense of the data. While one facilitator guides participants through the topic, the other can make notes about context and group dynamics. More information about group dynamics and focus groups can be found in resources listed in the “Further Reading” section.

Reading between the Lines

During the process outlined above, the researcher can begin to get a feel for the participant’s experience of the phenomenon in question and can start to think about things that could be pursued in subsequent interviews or focus groups (if appropriate). In this way, one participant’s narrative informs the next, and the researcher can continue to interview until nothing new is being heard or, as the textbooks say, “saturation is reached”. While continuing with the processes of coding and theming (described in the next 2 sections), it is important to consider not just what the person is saying but also what they are not saying. For example, is a lengthy pause an indication that the participant is finding the subject difficult, or is the person simply deciding what to say? The aim of the whole process from data collection to presentation is to tell the participants’ stories using exemplars from their own narratives, thus grounding the research findings in the participants’ lived experiences.

Smith 9 suggested a qualitative research method known as interpretative phenomenological analysis, which has 2 basic tenets: first, that it is rooted in phenomenology, attempting to understand the meaning that individuals ascribe to their lived experiences, and second, that the researcher must attempt to interpret this meaning in the context of the research. That the researcher has some knowledge and expertise in the subject of the research means that he or she can have considerable scope in interpreting the participant’s experiences. Larkin and others 10 discussed the importance of not just providing a description of what participants say. Rather, interpretative phenomenological analysis is about getting underneath what a person is saying to try to truly understand the world from his or her perspective.

Once all of the research interviews have been transcribed and checked, it is time to begin coding. Field notes compiled during an interview can be a useful complementary source of information to facilitate this process, as the gap in time between an interview, transcribing, and coding can result in memory bias regarding nonverbal or environmental context issues that may affect interpretation of data.

Coding refers to the identification of topics, issues, similarities, and differences that are revealed through the participants’ narratives and interpreted by the researcher. This process enables the researcher to begin to understand the world from each participant’s perspective. Coding can be done by hand on a hard copy of the transcript, by making notes in the margin or by highlighting and naming sections of text. More commonly, researchers use qualitative research software (e.g., NVivo, QSR International Pty Ltd; www.qsrinternational.com/products_nvivo.aspx ) to help manage their transcriptions. It is advised that researchers undertake a formal course in the use of such software or seek supervision from a researcher experienced in these tools.

Returning to Appendix 1 and reading from lines 8–11, a code for this section might be “diagnosis of mental health condition”, but this would just be a description of what the participant is talking about at that point. If we read a little more deeply, we can ask ourselves how the participant might have come to feel that the doctor assumed he or she was aware of the diagnosis or indeed that they had only just been told the diagnosis. There are a number of pauses in the narrative that might suggest the participant is finding it difficult to recall that experience. Later in the text, the participant says “nobody asked me any questions about my life” (line 19). This could be coded simply as “health care professionals’ consultation skills”, but that would not reflect how the participant must have felt never to be asked anything about his or her personal life, about the participant as a human being. At the end of this excerpt, the participant just trails off, recalling that no-one showed any interest, which makes for very moving reading. For practitioners in pharmacy, it might also be pertinent to explore the participant’s experience of akathisia and why this was left untreated for 20 years.

One of the questions that arises about qualitative research relates to the reliability of the interpretation and representation of the participants’ narratives. There are no statistical tests that can be used to check reliability and validity as there are in quantitative research. However, work by Lincoln and Guba 11 suggests that there are other ways to “establish confidence in the ‘truth’ of the findings” (p. 218). They call this confidence “trustworthiness” and suggest that there are 4 criteria of trustworthiness: credibility (confidence in the “truth” of the findings), transferability (showing that the findings have applicability in other contexts), dependability (showing that the findings are consistent and could be repeated), and confirmability (the extent to which the findings of a study are shaped by the respondents and not researcher bias, motivation, or interest).

One way of establishing the “credibility” of the coding is to ask another researcher to code the same transcript and then to discuss any similarities and differences in the 2 resulting sets of codes. This simple act can result in revisions to the codes and can help to clarify and confirm the research findings.
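A hedged sketch of that comparison step might look like the following; the segments, codes, and the simple Jaccard summary are illustrative choices, not a prescribed procedure.

```python
# Two researchers' codes for the same transcript segments (invented data).
coder_a = {"seg1": {"diagnosis"}, "seg2": {"consultation_skills", "not_listened_to"}}
coder_b = {"seg1": {"diagnosis"}, "seg2": {"not_listened_to"}}

def jaccard(x, y):
    """Overlap of two code sets: 1.0 = identical, 0.0 = disjoint."""
    return len(x & y) / len(x | y) if x | y else 1.0

for seg in coder_a:
    a, b = coder_a[seg], coder_b[seg]
    print(seg, f"agreement={jaccard(a, b):.2f}", "to discuss:", a ^ b)
# seg1 agreement=1.00 to discuss: set()
# seg2 agreement=0.50 to discuss: {'consultation_skills'}
```

The point of such a comparison is not the statistic itself but the discussion it prompts: disagreements (the symmetric difference above) flag codes to revisit and can lead to revisions of the coding frame, as described above.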

Theming refers to the drawing together of codes from one or more transcripts to present the findings of qualitative research in a coherent and meaningful way. For example, there may be examples across participants’ narratives of the way in which they were treated in hospital, such as “not being listened to” or “lack of interest in personal experiences” (see Appendix 1 ). These may be drawn together as a theme running through the narratives that could be named “the patient’s experience of hospital care”. The importance of going through this process is that at its conclusion, it will be possible to present the data from the interviews using quotations from the individual transcripts to illustrate the source of the researchers’ interpretations. Thus, when the findings are organized for presentation, each theme can become the heading of a section in the report or presentation. Underneath each theme will be the codes, examples from the transcripts, and the researcher’s own interpretation of what the themes mean. Implications for real life (e.g., the treatment of people with chronic mental health problems) should also be given.

DATA SYNTHESIS

In this final section of this paper, we describe some ways of drawing together or “synthesizing” research findings to represent, as faithfully as possible, the meaning that participants ascribe to their life experiences. This synthesis is the aim of the final stage of qualitative research. For most readers, the synthesis of data presented by the researcher is of crucial significance—this is usually where “the story” of the participants can be distilled, summarized, and told in a manner that is both respectful to those participants and meaningful to readers. There are a number of ways in which researchers can synthesize and present their findings, but any conclusions drawn by the researchers must be supported by direct quotations from the participants. In this way, it is made clear to the reader that the themes under discussion have emerged from the participants’ interviews and not the mind of the researcher. The work of Latif and others 12 gives an example of how qualitative research findings might be presented.

Planning and Writing the Report

As has been suggested above, if researchers code and theme their material appropriately, they will naturally find the headings for sections of their report. Qualitative researchers tend to report “findings” rather than “results”, as the latter term typically implies that the data have come from a quantitative source. The final presentation of the research will usually be in the form of a report or a paper and so should follow accepted academic guidelines. In particular, the article should begin with an introduction, including a literature review and rationale for the research. There should be a section on the chosen methodology and a brief discussion about why qualitative methodology was most appropriate for the study question and why one particular methodology (e.g., interpretative phenomenological analysis rather than grounded theory) was selected to guide the research. The method itself should then be described, including ethics approval, choice of participants, mode of recruitment, and method of data collection (e.g., semistructured interviews or focus groups), followed by the research findings, which will be the main body of the report or paper. The findings should be written as if a story is being told; as such, it is not necessary to have a lengthy discussion section at the end. This is because much of the discussion will take place around the participants’ quotes, such that all that is needed to close the report or paper is a summary, limitations of the research, and the implications that the research has for practice. As stated earlier, it is not the intention of qualitative research to allow the findings to be generalized, and therefore this is not, in itself, a limitation.

Planning out the way that findings are to be presented is helpful. It is useful to insert the headings of the sections (the themes) and then make a note of the codes that exemplify the thoughts and feelings of your participants. It is generally advisable to put in the quotations that you want to use for each theme, using each quotation only once. After all this is done, the telling of the story can begin as you give your voice to the experiences of the participants, writing around their quotations. Do not be afraid to draw assumptions from the participants’ narratives, as this is necessary to give an in-depth account of the phenomena in question. Discuss these assumptions, drawing on your participants’ words to support you as you move from one code to another and from one theme to the next. Finally, as appropriate, it is possible to include examples from literature or policy documents that add support for your findings. As an exercise, you may wish to code and theme the sample excerpt in Appendix 1 and tell the participant’s story in your own way. Further reading about “doing” qualitative research can be found at the end of this paper.
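One part of this planning lends itself to an automated check: the suggestion to use each quotation only once. A minimal sketch, assuming a hypothetical report plan keyed by theme (the second theme name below is invented for illustration):

```python
from collections import Counter

# Hypothetical plan: themes mapped to the quotations chosen for each section.
report_plan = {
    "The patient's experience of hospital care": [
        "nobody asked me any questions about my life",
    ],
    "Living with untreated side effects": [
        "I suffered that akathesia ... every minute of every day for about 20 years",
    ],
}

counts = Counter(q for quotations in report_plan.values() for q in quotations)
repeated = [q for q, n in counts.items() if n > 1]
print("Quotations used more than once:", repeated or "none")
```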

CONCLUSIONS

Qualitative research can help researchers to access the thoughts and feelings of research participants, which can enable development of an understanding of the meaning that people ascribe to their experiences. It can be used in pharmacy practice research to explore how patients feel about their health and their treatment. Qualitative research has been used by pharmacists to explore a variety of questions and problems (see the “Further Reading” section for examples). An understanding of these issues can help pharmacists and other health care professionals to tailor health care to match the individual needs of patients and to develop a concordant relationship. Doing qualitative research is not easy and may require a complete rethink of how research is conducted, particularly for researchers who are more familiar with quantitative approaches. There are many ways of conducting qualitative research, and this paper has covered some of the practical issues regarding data collection, analysis, and management. Further reading around the subject will be essential to truly understand this method of accessing people’s thoughts and feelings to enable researchers to tell participants’ stories.

Appendix 1. Excerpt from a sample transcript

The participant (age late 50s) had suffered from a chronic mental health condition for 30 years. The participant had become a “revolving door patient,” someone who is frequently in and out of hospital. As the participant talked about past experiences, the researcher asked:

1  What was treatment like 30 years ago?
2  Umm—well it was pretty much they could do what they wanted with you because I was put into the er, the er kind of system er, I was just on
3  endless section threes.
4  Really…
5  But what I didn’t realize until later was that if you haven’t actually posed a threat to someone or yourself they can’t really do that but I didn’t know
6  that. So wh-when I first went into hospital they put me on the forensic ward ’cause they said, “We don’t think you’ll stay here we think you’ll just
7  run-run away.” So they put me then onto the acute admissions ward and – er – I can remember one of the first things I recall when I got onto that
8  ward was sitting down with a er a Dr XXX. He had a book this thick [gestures] and on each page it was like three questions and he went through
9  all these questions and I answered all these questions. So we’re there for I don’t maybe two hours doing all that and he asked me he said “well
10 when did somebody tell you then that you have schizophrenia” I said “well nobody’s told me that” so he seemed very surprised but nobody had
11 actually [pause] whe-when I first went up there under police escort erm the senior kind of consultants people I’d been to where I was staying and
12 ermm so er [pause] I . . . the, I can remember the very first night that I was there and given this injection in this muscle here [gestures] and just
13 having dreadful side effects the next day I woke up [pause]
14 . . . and I suffered that akathesia I swear to you, every minute of every day for about 20 years.
15 Oh how awful.
16 And that side of it just makes life impossible so the care on the wards [pause] umm I don’t know it’s kind of, it’s kind of hard to put into words
17 [pause]. Because I’m not saying they were sort of like not friendly or interested but then nobody ever seemed to want to talk about your life [pause]
18 nobody asked me any questions about my life. The only questions that came into was they asked me if I’d be a volunteer for these student exams
19 and things and I said “yeah” so all the questions were like “oh what jobs have you done,” er about your relationships and things and er but
20 nobody actually sat down and had a talk and showed some interest in you as a person you were just there basically [pause] um labelled and you
21 know there was there was [pause] but umm [pause] yeah . . .

This article is the 10th in the CJHP Research Primer Series, an initiative of the CJHP Editorial Board and the CSHP Research Committee. The planned 2-year series is intended to appeal to relatively inexperienced researchers, with the goal of building research capacity among practising pharmacists. The articles, presenting simple but rigorous guidance to encourage and support novice researchers, are being solicited from authors with appropriate expertise.

Previous articles in this series:

Bond CM. The research jigsaw: how to get started. Can J Hosp Pharm. 2014;67(1):28–30.

Tully MP. Research: articulating questions, generating hypotheses, and choosing study designs. Can J Hosp Pharm. 2014;67(1):31–4.

Loewen P. Ethical issues in pharmacy practice research: an introductory guide. Can J Hosp Pharm. 2014;67(2):133–7.

Tsuyuki RT. Designing pharmacy practice research trials. Can J Hosp Pharm. 2014;67(3):226–9.

Bresee LC. An introduction to developing surveys for pharmacy practice research. Can J Hosp Pharm. 2014;67(4):286–91.

Gamble JM. An introduction to the fundamentals of cohort and case–control studies. Can J Hosp Pharm. 2014;67(5):366–72.

Austin Z, Sutton J. Qualitative research: getting started. Can J Hosp Pharm. 2014;67(6):436–40.

Houle S. An introduction to the fundamentals of randomized controlled trials in pharmacy research. Can J Hosp Pharm. 2015;68(1):28–32.

Charrois TL. Systematic reviews: What do you need to know to get started? Can J Hosp Pharm. 2015;68(2):144–8.

Competing interests: None declared.

Further Reading

Examples of qualitative research in pharmacy practice.

  • Farrell B, Pottie K, Woodend K, Yao V, Dolovich L, Kennie N, et al. Shifts in expectations: evaluating physicians’ perceptions as pharmacists integrated into family practice. J Interprof Care. 2010;24(1):80–9.
  • Gregory P, Austin Z. Postgraduation employment experiences of new pharmacists in Ontario in 2012–2013. Can Pharm J. 2014;147(5):290–9.
  • Marks PZ, Jennings B, Farrell B, Kennie-Kaulbach N, Jorgenson D, Pearson-Sharpe J, et al. “I gained a skill and a change in attitude”: a case study describing how an online continuing professional education course for pharmacists supported achievement of its transfer to practice outcomes. Can J Univ Contin Educ. 2014;40(2):1–18.
  • Nair KM, Dolovich L, Brazil K, Raina P. It’s all about relationships: a qualitative study of health researchers’ perspectives on interdisciplinary research. BMC Health Serv Res. 2008;8:110.
  • Pojskic N, MacKeigan L, Boon H, Austin Z. Initial perceptions of key stakeholders in Ontario regarding independent prescriptive authority for pharmacists. Res Soc Adm Pharm. 2014;10(2):341–54.

Qualitative Research in General

  • Breakwell GM, Hammond S, Fife-Schaw C. Research methods in psychology. Thousand Oaks (CA): Sage Publications; 1995.
  • Given LM. 100 questions (and answers) about qualitative research. Thousand Oaks (CA): Sage Publications; 2015.
  • Miles MB, Huberman AM. Qualitative data analysis. Thousand Oaks (CA): Sage Publications; 2009.
  • Patton M. Qualitative research and evaluation methods. Thousand Oaks (CA): Sage Publications; 2002.
  • Willig C. Introducing qualitative research in psychology. Buckingham (UK): Open University Press; 2001.

Group Dynamics in Focus Groups

  • Farnsworth J, Boon B. Analysing group dynamics within the focus group. Qual Res. 2010;10(5):605–24.

Social Constructivism

  • Social constructivism. Berkeley (CA): University of California, Berkeley, Berkeley Graduate Division, Graduate Student Instructor Teaching & Resource Center; [cited 2015 June 4]. Available from: http://gsi.berkeley.edu/gsi-guide-contents/learning-theory-research/social-constructivism/

Mixed Methods

  • Creswell J. Research design: qualitative, quantitative, and mixed methods approaches. Thousand Oaks (CA): Sage Publications; 2009.

Collecting Qualitative Data

  • Arksey H, Knight P. Interviewing for social scientists: an introductory resource with examples. Thousand Oaks (CA): Sage Publications; 1999.
  • Guest G, Namey EE, Mitchell ML. Collecting qualitative data: a field manual for applied research. Thousand Oaks (CA): Sage Publications; 2013.

Constructivist Grounded Theory

  • Charmaz K. Grounded theory: objectivist and constructivist methods. In: Denzin N, Lincoln Y, editors. Handbook of qualitative research. 2nd ed. Thousand Oaks (CA): Sage Publications; 2000. pp. 509–35.

Systematic review | Open access | Published: 11 September 2024

Evaluation of research co-design in health: a systematic overview of reviews and development of a framework

Sanne Peters, Lisa Guccione, Jill Francis, Stephanie Best, Emma Tavender, Janet Curran, Katie Davies, Stephanie Rowe, Victoria J. Palmer & Marlena Klaic

Implementation Science volume 19, Article number: 63 (2024)

Abstract

Background

Co-design with consumers and healthcare professionals is widely used in applied health research. While this approach appears to be ethically the right thing to do, a rigorous evaluation of its process and impact is frequently missing. Evaluation of research co-design is important to identify areas of improvement in the methods and processes, as well as to determine whether research co-design leads to better outcomes. We aimed to build on current literature to develop a framework to assist researchers with the evaluation of co-design processes and impacts.

Methods

A multifaceted, iterative approach, including three steps, was undertaken to develop a Co-design Evaluation Framework: 1) a systematic overview of reviews; 2) stakeholder panel meetings to discuss and debate findings from the overview of reviews; and 3) a consensus meeting with the stakeholder panel. The systematic overview of reviews included relevant papers published between 2000 and 2022. OVID (Medline, Embase, PsycINFO), EBSCOhost (Cinahl) and the Cochrane Database of Systematic Reviews were searched for papers that reported co-design evaluation or outcomes in health research. Extracted data were inductively analysed and evaluation themes were identified. Review findings were presented to a stakeholder panel, including consumers, healthcare professionals and researchers, to interpret and critique. A consensus meeting, using a nominal group technique, was held to agree upon the Co-design Evaluation Framework.

Results

A total of 51 reviews were included in the systematic overview of reviews. Fifteen evaluation themes were identified and grouped into the following seven clusters: people (within co-design group), group processes, research processes, co-design context, people (outside co-design group), system and sustainment. Where evaluation methods were mentioned, they mainly included qualitative data, informal consumer feedback and researchers’ reflections. The Co-Design Evaluation Framework uses a tree metaphor to represent the processes and people in the co-design group (below ground), underpinning system- and people-level outcomes beyond the co-design group (above ground). To evaluate research co-design, researchers may wish to consider any or all components in the tree.

Conclusions

The Co-Design Evaluation Framework has been collaboratively developed with various stakeholders to be used prospectively (planning for evaluation), concurrently (making adjustments during the co-design process) and retrospectively (reviewing past co-design efforts to inform future activities).


Contributions to the literature

While stakeholder engagement in research seems ethically the right thing to do, a rigorous evaluation of its process and outcomes is frequently missing.

Fifteen evaluation themes were identified in the literature, of which research process, cognitive factors and emotional factors were the most frequently reported.

The Co-design Evaluation Framework can assist researchers with research co-design evaluation and provide guidance regarding what and when to evaluate.

The framework can be used prospectively, concurrently, and retrospectively to make improvements to existing and future research co-design projects.

Introduction

Substantial resources are wasted on health research that does not lead to meaningful benefits for end-users, such as healthcare professionals and consumers [1, 2, 3]. One contributor to this waste is that research often focusses on questions and outcomes that are of limited importance to end-users [4, 5]. Engagement of relevant people in research co-design has increased in response to this issue. There is a lack of consensus in the literature on the definition and processes involved in undertaking a co-design approach. For the purposes of this review, we define research co-design as meaningful end-user engagement that occurs across any stage of the research process, from the research planning phase to dissemination of research findings [6]. Meaningful end-user engagement refers to an explicit and measurable responsibility, such as contributing to writing a study proposal [6]. The variety of research co-design methods can be seen as a continuum, ranging from limited involvement, such as consulting with end-users, to much higher-effort approaches in which end-users and researchers aim for equal decision-making power and responsibility across the entire research process [6]. Irrespective of the intensity of involvement, it is generally recommended that a co-design approach be based on several important principles, such as equity, inclusion and shared ownership [7].

Over time, increasing attention has been given to research co-design [6, 8]. Funding bodies encourage its use, and it is recommended in the updated UK MRC framework on developing and evaluating complex interventions [9]. End-user engagement has an Equator reporting checklist [10], and related work has been reported by key organisations, such as the James Lind Alliance in the UK ( www.jla.nihr.ac.uk ), the Patient-Centered Outcomes Research Institute in the US ( www.pcori.org ) and the Canadian Institutes of Health Research ( https://cihrirsc.gc.ca/e/41592.html ). In addition, peer-reviewed publications involving co-design have risen from 173 per year in 2000 to 2617 in 2022 (PubMed), suggesting a growing importance in research activities.

Engaging end-users in the health research process is arguably the right thing to do, but the processes and outcomes of co-design have rarely been evaluated in a rigorous way [6]. Existing anecdotal evidence suggests that research co-design can benefit researchers and end-users and can lead to more robust research processes [11, 12, 13, 14, 15, 16, 17, 18, 19]. Both researchers and end-users have reported positive experiences of engaging in the co-design process. Potential benefits include a better understanding of community needs; more applicable research questions, designs and materials; and improved trust between researchers and end-users. Several reviews on conducting research co-design have concluded that co-design can be feasible, though it is predominantly used in the early phases of research, for example formulating research questions and developing a study protocol [6, 11, 12, 13, 14, 15, 16, 17, 18, 19]. However, these reviews highlighted that engagement of end-users in the research process required extra time and funding and risked becoming tokenistic [6, 11, 12, 13, 14, 15, 16, 17, 18, 19].

The use of resources in co-design studies, as well as the impacts of co-design, might need to be justified to funders. A rigorous evaluation of research co-design processes and outcomes is needed to identify areas of potential improvement and to determine the impact of research co-design. Several overviews of reviews on research co-design have been published, but with no or limited focus on evaluation [20, 21, 22, 23]. Moreover, the current literature provides little guidance on how and what to evaluate, and which outcomes are key.

This study thus had two aims:

To conduct a systematic overview of reviews to identify evaluation methods and process and outcome variables reported in the published health research co-design literature.

To develop a framework to assist researchers with the evaluation of co-design processes and impacts.

Methods

This project used a multifaceted, iterative approach to develop a Co-design Evaluation Framework. It consisted of the following steps: 1) a systematic overview of reviews; 2) stakeholder panel meetings to discuss and debate findings from the overview of reviews; and 3) a consensus meeting with the stakeholder panel. The reporting checklist for overviews of reviews was applied (Additional file 1) [24].

Step 1: A systematic overview of reviews

We conducted a systematic overview of reviews [25], reviewing literature reviews rather than primary studies, to investigate the following question: What is known in the published literature about the evaluation of research co-design in health research? The protocol of our systematic overview of reviews was registered in the PROSPERO database (CRD42022355338).

Sub-questions:

What has been co-designed and what were the objectives of the co-design process?

Who was involved and what was the level of involvement?

What methods were used to evaluate the co-design processes and outcomes?

What was evaluated (outcome and process measures) and at what timepoint (for example, concurrently with or after the co-design process)?

Was a co-design evaluation framework used to guide evaluation?

Search strategy

We searched OVID (Medline, Embase, PsycINFO), EBSCOhost (Cinahl) and the Cochrane Database of Systematic Reviews on 11 October 2022 for literature reviews that reported co-design evaluation or outcomes in health research. The search strategy was based on previous reviews on co-design [6, 14, 26] and refined with the assistance of a research librarian and the research team (search terms in Additional file 2). Papers published from January 2000 to September 2022 were identified and retrieved by one author (SP).
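To make the structure of such a strategy concrete, the sketch below assembles a boolean query from grouped synonym terms with date limits. The terms shown are assumptions for illustration only; the authors’ actual strategy is in Additional file 2 of the paper.

```python
# Illustrative assembly of a boolean search string; the terms are assumed,
# not the authors' actual strategy (see Additional file 2).

codesign_terms = ["co-design*", "codesign*", "co-production", "coproduction",
                  "patient engagement", "public involvement"]
evaluation_terms = ["evaluat*", "outcome*", "impact*"]

query = ("(" + " OR ".join(codesign_terms) + ") AND ("
         + " OR ".join(evaluation_terms) + ")")
limits = {
    "dates": "2000/01 to 2022/09",
    "databases": ["Medline", "Embase", "PsycINFO", "Cinahl",
                  "Cochrane Database of Systematic Reviews"],
}

print(query)
print(limits)
```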

Study selection

Database records were imported into EndNote X9 (The EndNote Team, Philadelphia, 2013) and duplicates removed. We managed the study selection process in the software program Covidence (Veritas Health Innovation, Melbourne, Australia). Two independent reviewers (SP, MK or LG) screened the titles and abstracts of all studies against the eligibility criteria (Table 1). Discrepancies were resolved through discussion or with a third reviewer (SP, MK or LG, depending on which two reviewers disagreed). If there was insufficient information in the abstract to decide about eligibility, the paper was retained for the full-text screening phase. Full-text versions of studies not excluded at the title and abstract screening phase were retrieved and independently screened by two reviewers (SP, MK or LG) against the eligibility criteria. Disagreements were resolved through discussion, or with a third reviewer, and recorded in Covidence.
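The dual-screening logic described here reduces to a simple reconciliation rule: matching decisions stand, and conflicts go to discussion or a third reviewer. A minimal sketch with hypothetical record IDs and decisions:

```python
# Reconciling two independent screening decisions per record.
# Record IDs and decisions are hypothetical.

decisions = {
    "rec_001": ("include", "include"),
    "rec_002": ("include", "exclude"),
    "rec_003": ("exclude", "exclude"),
}

for rec, (reviewer_1, reviewer_2) in decisions.items():
    if reviewer_1 == reviewer_2:
        print(f"{rec}: {reviewer_1} (agreed)")
    else:
        # In practice a tool such as Covidence flags these conflicts.
        print(f"{rec}: conflict -> resolve by discussion or third reviewer")
```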

Data extraction of included papers was conducted by one of three reviewers (SP, MK or LG). A second reviewer (LG or SP) checked a random sample of 20% of all extracted data. Disagreements were resolved through regular discussion. Data were extracted using an Excel spreadsheet developed by the research team and included review characteristics (such as references, type of review, number of included studies, review aim), details about the co-design process (such as who was involved in the co-design, which topics the co-design focused on, what research phase(s) the co-design covered, in which research phase the co-design took place and what the end-users’ level of involvement was) and details about the co-design evaluation (what outcomes were reported, methods of data collection, who the participants of the evaluation were, the timepoint of evaluation, whether an evaluation framework was used or developed and conclusions about co-design evaluation).
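Drawing the 20% verification sample can be scripted so that it is reproducible. The sketch below assumes hypothetical row contents; the fixed seed means the same rows can be re-drawn later if the check needs to be audited.

```python
import random

# Hypothetical extraction rows, one per included review (51 in this overview).
extracted_rows = [{"review_id": i} for i in range(1, 52)]

rng = random.Random(2022)          # fixed seed so the sample is reproducible
sample_size = round(0.2 * len(extracted_rows))
verification_sample = rng.sample(extracted_rows, sample_size)

print(f"{sample_size} of {len(extracted_rows)} rows selected for checking:")
print(sorted(row["review_id"] for row in verification_sample))
```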

Types of end-users’ involvement were categorised into four groups based on the categories proposed by Hughes et al. (2018) [27]: 1. Targeted consultation; 2. Embedded consultation; 3. Collaboration and co-production; and 4. User-led research (see Table 2).
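Because these four categories form an ordered scale of increasing end-user involvement, they can be encoded directly for data extraction. A minimal sketch; the example assignment is hypothetical:

```python
from enum import IntEnum

# The four categories of involvement, after Hughes et al. (2018).
class Involvement(IntEnum):
    TARGETED_CONSULTATION = 1
    EMBEDDED_CONSULTATION = 2
    COLLABORATION_AND_CO_PRODUCTION = 3
    USER_LED_RESEARCH = 4

# Hypothetical example: recording the involvement type for one included review.
record = {"review_id": 7, "involvement": Involvement.EMBEDDED_CONSULTATION}
print(record["involvement"].name, "=", int(record["involvement"]))
```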

Data extraction and analysis took place in three iterative phases (Fig.  1 ), with each phase containing one third of the included studies. Each phase of data extraction and analysis was followed by stakeholder panel meetings (see step 2 below). This stepwise approach enabled a form of triangulation wherein themes that emerged through each phase were discussed with the stakeholder panel and incorporated both retrospectively (re-coding data in the prior phase) and prospectively (coding new data in the next phase).

Figure 1. Iterative phases in the process of the Co-design Evaluation Framework development.

All reported outcomes of research co-design in the first phase (one third of all data) were inductively coded into themes, according to the principles of thematic analysis [28]. Two researchers (SP and MK) double-coded 10% of all data and reached consensus through discussion. Given that agreement was high, one researcher (SP) continued the coding, with frequent discussions and reviews within the research team. In phase 2 (also one third of all data), deductive coding was based on the themes identified in the first round. Data from the first phase were re-coded if new codes emerged during the stakeholder panel meeting. The same process took place for the third phase.

Step 2: Stakeholder panel meetings to discuss and debate findings from the overview of reviews

Results from step 1 were presented to the stakeholder panel to interpret and critique the review findings. The panel consisted of ten people, including a mix of consumers, healthcare professionals and researchers. Stakeholders were selected for their experience or expertise in research co-design. The number of meetings was not pre-determined; rather, it was informed by the outcomes from step 1. The number of stakeholders in each meeting ranged from six to ten.

A core group from the broader stakeholder panel (SP, MK, LG, JF) with a breadth of research experience and methodological expertise discussed the themes arising from both steps 1 and 2 and considered various ways of presenting them. Multiple design options were considered and preliminary frameworks were developed. Following discussion with the stakeholder panel, it was agreed that the evaluation themes could be grouped into several clusters to make the framework more comprehensible. The grouping of evaluation themes into clusters was informed by reported proposed associations between evaluation themes in the literature as well as the stakeholder panel’s co-design experience and expertise. Evaluation themes as well as clusters were agreed upon during the stakeholder panel meetings.

Step 3: Consensus meeting with stakeholder panel

The consensus meeting included the same stakeholder panel as in step 2. The meeting was informed by a modified Nominal Group Technique (NGT). The NGT is a structured process for obtaining information and reaching consensus with a target group who have some association or experience with the topic [29]. Various adaptations of the NGT have been used, and additional pre-meeting information has been suggested to give participants more time to consider their contribution to the topic [30]. The modified NGT utilised in this study contained the following: (i) identification of group members to include experts with deep and diverse experiences; they were purposively identified at the start of this study for their expertise or experience in research co-design and included a patient consumer, a clinician, three clinician researchers and six researchers with backgrounds in behavioural sciences, psychology, education, applied ethics and participatory design, and all authors on this paper were invited by e-mail to attend an online meeting; (ii) provision of information prior to the group meeting, including findings of the overview of reviews, a draft framework and the objectives of the meeting; five authors with extensive research co-design experience were asked to prepare a case example of one of their co-design projects for sharing at the group meeting, the intention being to discuss the fit between a real-world example and the proposed framework; (iii) a hybrid meeting facilitated by two researchers (SP and JF) with experience in facilitating consensus meetings; following presentation of the meeting materials, including the preliminary framework, group members were invited to silently consider the preliminary framework and generate ideas and critiques; (iv) participants sharing their ideas and critiques; (v) a clarification process in which group members shared their co-design example projects and discussed the fit with components of the initial framework; and (vi) silent voting and/or agreement on the framework via a personal email to one of the researchers (SP).

Results

Step 1: Systematic overview of reviews

The database searches identified a total of 8912 papers. After removal of 3016 duplicates and screening of 5896 titles and abstracts, 148 full texts were sought for retrieval. Sixteen could not be retrieved because they were not available in English (n = 2) or the full text was not available (n = 14). Of the remaining 132 papers assessed for eligibility, 81 were excluded. The final number of papers included in this overview of reviews was 51 (see Fig. 2).

Figure 2. PRISMA flow chart (based on [31]) of the overview of reviews.
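The flow counts reported above are internally consistent, which a few lines of arithmetic confirm; all numbers below are taken directly from the text.

```python
identified = 8912
duplicates = 3016
screened = identified - duplicates      # titles and abstracts screened
assert screened == 5896

sought = 148
not_retrieved = 2 + 14                  # not in English + full text unavailable
assessed = sought - not_retrieved       # full texts assessed for eligibility
assert assessed == 132

excluded = 81
included = assessed - excluded          # reviews included in the overview
assert included == 51
print(screened, assessed, included)
```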

Characteristics of the included studies

Of the 51 included reviews [11, 12, 14, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79], 17 were systematic reviews, 12 were scoping reviews, 14 did not report the type or method of review, three were narrative reviews, two were qualitative evidence syntheses, two were structured literature searches and one was a realist review. The number of studies included in the reviews ranged from 7 to 260. Nineteen reviews focused on co-design with specific populations (for example, older people, people with intellectual disabilities and people living with dementia) and 32 reviews included co-design with a range of end-users. The co-design focused in most cases on a mix of topics (n = 31). Some reviews were specifically about one clinical topic, for example critical care or dementia. In ten cases, the clinical topics were not reported. Co-design took place during multiple research phases: 36 reviews covered co-design in agenda/priority setting, 36 in study design, 30 in data collection, 25 in data analysis and 27 in dissemination. With regard to the research translation continuum, most of the co-design was reported in practice and community-based research (n = 32); three reviews were conducted in basic research and 11 in human research. The types of end-users’ involvement in co-design ranged from targeted consultation (n = 14) and embedded consultation (n = 20) to collaboration and co-production (n = 14) and end-user-led research (n = 6), with some papers covering multiple types of involvement. Seventeen papers did not report the type of involvement. The reported co-design involved a variety of time commitments, from a single 60-minute meeting to multiple meetings over multiple years. Twenty-seven reviews did not report details about the end-users’ types of involvement.

Identified evaluation themes

Fifteen evaluation themes were identified and arranged into two higher-level groups: 1. within the co-design team and 2. broader than the co-design team (Table 3). Themes in the first group (within the co-design team) included: structure and composition of the co-design group, contextual enablers/barriers, interrelationships between group members, decision-making process, emotional factors, cognitive factors, value proposition, level/quality of engagement, research process, health outcomes for the co-design group and sustainment of the co-design team or activities. Themes in the second group (broader than the co-design team) included: healthcare professional-level outcomes, healthcare system-level outcomes, organisational-level outcomes and patient and community outcomes.

The research process was the most frequently reported evaluation theme in the reviews ( n  = 44, 86% of reviews), followed by cognitive factors ( n  = 35, 69%) and emotional factors ( n  = 34, 67%) (Table  4 ). Due to variability in reporting practices, it was not possible to specify the number of primary studies that reported specific evaluation themes. Evaluation methods for the themes were not reported in the majority of reviews ( n  = 43, 84%). If evaluation methods were mentioned, they were mainly based on qualitative data, including interviews, focus groups, field notes, document reviews and observations (see overview with references in Additional file 3). Survey data was mentioned in three reviews. Many reviews reported informal evaluation based on participant experiences (e.g. informal feedback), reflection meetings, narrative reflections and authors’ hypotheses (Additional file 3). The timing of the evaluation was only mentioned in two papers: 1. Before and after the co-design activities and 2. Post co-design activities. One paper suggested that continuous evaluation might be helpful to improve the co-design process (Additional file 3).
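The reported frequencies can be re-expressed as proportions of the 51 included reviews; the counts below are taken from the text, and the rounded percentages match those reported.

```python
n_reviews = 51
theme_counts = {
    "research process": 44,                 # reported as 86%
    "cognitive factors": 35,                # reported as 69%
    "emotional factors": 34,                # reported as 67%
    "evaluation methods not reported": 43,  # reported as 84%
}

for theme, n in theme_counts.items():
    print(f"{theme}: n = {n} ({n / n_reviews:.0%} of reviews)")
```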

The systematic overview of reviews found that some authors proposed positive associations between evaluation themes (Table 5). The most frequently proposed association was between level/quality of engagement and emotional factors (n = 5, 10%). However, these proposed associations did not appear to be supported by empirical evidence, and evaluation methods were not reported.

All evaluation themes were grouped into the following clusters (Table  6 ): People (within co-design group), group processes, research processes, co-design context, people (outside co-design group), system and sustainment.

Only one paper reported the evaluation in connection with the research phases (agenda/priority setting, study design, data collection, data analysis and dissemination). This paper reported the following outcomes for each research phase [58]:

Agenda/priority setting: Research process; Level/quality of engagement; Cognitive factors; Attributes of the co-design group; Interrelationships between group members; Sustainment of the co-design team or activities; Patient and community outcomes.

Study design: Attributes of the co-design group; Interrelationships between group members; Level/quality of engagement; Cognitive factors; Emotional factors; Research process.

The various research phases in which consumers could be involved, as well as the clusters of evaluation themes, informed the design of the co-design evaluation framework.

Two main options were voted on and discussed within the stakeholder panel; both can be found in Additional file 4. Draft 2 was the preferred option, as it was perceived as more dynamic than draft 1, representing a clearer interplay between the two contexts. The stakeholder panel suggested a few edits to the draft, such as the inclusion of bi-directional arrows in the tree trunk and a vertical arrow from underground to above ground with the label ‘impact’.

The final version of the Co-design Evaluation framework is presented in Fig.  3 .

Figure 3. Research Co-design Evaluation Framework.

Figure 3 presents co-design evaluation as the below-ground and above-ground structures of a tree. The tree metaphor presents the processes and people in the co-design group (below ground) as the basis for system- and people-level outcomes beyond the co-design group (above ground). To evaluate research co-design, researchers may wish to consider any or all components in this figure. These evaluation components relate to the methods, processes and outcomes of consumer involvement in research.

The context within the co-design group (the roots of the tree) consists of the people, group processes and research processes, with various evaluation themes (dot points) related to them, as well as contextual barriers and enablers: situational aspects that might enable or hinder consumer engagement. The context outside the co-design group, i.e., the wider community (the branches and leaves of the tree), comprises people who were not involved in the research co-design process, system-level outcomes and sustainment-related outcomes. These above-ground groups are potential beneficiaries or targets of the co-design activities.

The arrows in the middle of the trunk represent the potential mutual influence of the two contexts, suggesting that an iterative approach to evaluation might be beneficial. For example, when deciding the composition of the co-design group, it may be important to have appropriate representation of the people most affected by the problem, issue or topic at hand. Or, if a co-designed healthcare intervention does not achieve the desired outcomes in the wider context, the co-design group might consider potential ways to improve the intervention or how it was delivered. Evaluation of a research co-design process might start with the foundations (the roots of the tree) and progress to above ground (the tree grows and might develop fruit). Yet, depending on the aim of the evaluation, a focus on one of the two contexts, either below or above ground, might be appropriate.

Which, and how many, components are appropriate to evaluate depends on the nature of the co-design approach and the key questions of the evaluation. For example, if a co-design approach is used in the very early stages of a research program, perhaps to identify priorities or to articulate a research question, then the below-ground components are key, whereas a randomised study comparing the effects of a co-designed intervention with those of a researcher-designed intervention might consider only the above-ground components.

The white boxes on the right-hand side of Fig.  3 indicate the research phases, from agenda/priority setting to dissemination, in which consumers can and should be involved. This co-design evaluation framework may be applied at any phase of the research process or applied iteratively with a view to improving future co-design activities.
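For readers who want to operationalise the framework, its below-ground/above-ground structure can be expressed as a nested mapping. The sketch below uses the seven clusters reported in the results; the assignment of individual themes to clusters is a plausible reading of the text rather than a reproduction of Table 6.

```python
# Below ground: the co-design group. Above ground: the wider context.
# Cluster names come from the results; theme placement is illustrative.

framework = {
    "below_ground": {
        "people (within co-design group)": [
            "structure and composition of the co-design group",
            "emotional factors",
            "cognitive factors",
        ],
        "group processes": [
            "interrelationships between group members",
            "decision-making process",
            "level/quality of engagement",
        ],
        "research processes": ["research process"],
        "co-design context": ["contextual enablers/barriers"],
    },
    "above_ground": {
        "people (outside co-design group)": [
            "healthcare professional-level outcomes",
            "patient and community outcomes",
        ],
        "system": [
            "healthcare system-level outcomes",
            "organisational-level outcomes",
        ],
        "sustainment": ["sustainment of the co-design team or activities"],
    },
}

for layer, clusters in framework.items():
    print(layer)
    for cluster, themes in clusters.items():
        print(f"  {cluster}: {', '.join(themes)}")
```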

Discussion

This systematic overview of reviews aimed to build on the current literature and develop a framework to assist researchers with the evaluation of research co-design. The 51 included reviews reported on 15 evaluation themes, which were grouped into the following clusters: people (within co-design group), group processes, research processes, co-design context, people (outside co-design group), system and sustainment. Most reviews did not report measurement methods for the evaluation themes. Where methods were mentioned, they mostly included qualitative data, informal consumer feedback and researchers’ reflections. This finding strengthens our argument that a framework may be helpful in supporting methodologically robust studies to assess co-design processes and impacts. The Co-Design Evaluation Framework adopts a tree metaphor: it presents the processes and people in the co-design group (below ground) as underpinning the system- and people-level outcomes beyond the co-design group (above ground). To evaluate stakeholder involvement in research, researchers may wish to consider any or all components in the tree. Which, and how many, components are appropriate to evaluate depends on the nature of the co-design approach and the key questions that stakeholders aim to address. Nonetheless, it will be important that evaluations delineate which parts of the research project have incorporated a co-design approach.

The Equator reporting checklist for research co-design, GRIPP2, provides researchers with a series of concepts that should be considered and reported on when incorporating patient and public involvement in research [10]. These concepts include, but are not limited to, methods of involving patients and the public in research and the intensity of engagement. The Co-Design Evaluation Framework is not intended as a replacement for the GRIPP2; rather, it can be used prospectively to inform development of the co-design project or retrospectively to inform completion of the GRIPP2. Table 7 provides hypothetical examples of research questions that co-design evaluation projects might address. The framework could be used at multiple points within co-design projects, including prospectively (planning for evaluation before the co-design process has started), concurrently (incorporating improvements during the co-design process) and retrospectively (reviewing past co-design efforts to inform future projects).

Our systematic overview of reviews identified multiple evaluation themes. Some of these overlapped with reported values associated with public involvement in research [80], community engagement measures [15] and reported impacts of patient and public involvement in research, as described by others [16, 81, 82]. The added value of our overview is that we went beyond a list of items: we examined evaluation themes, potential associations between them and clusters of themes, and ultimately developed a framework to assist others with research co-design evaluation.

Some reviews in our overview of reviews proposed potential associations between evaluation themes. Yet, these proposed associations were not empirically tested. One of the included studies [ 58 ] proposed conditions and mechanisms involved in co-design processes and outcomes related to diabetes research. Although it is a promising starting point, this should be further explored. A realist evaluation including other research topics and other approaches, such as the use of logic models, which was also recognised in the updated MRC framework [ 9 ], might help to build on explorations of included mechanisms of action [ 83 ] and give insight into how core ingredients contribute to certain co-design processes and outcomes. As recognised by others [ 6 , 84 ], the reporting practice of research co-design in the literature could be improved as details about context, mechanisms and expected outcomes are frequently missing. This will also help us to gain a better understanding of what works for whom, why, how and in which circumstances.

The lack of a consistent definition of co-design makes it challenging to identify and synthesise the literature, as recognised by others [6]. Given the many different terms used in the literature, there is a risk that we missed some relevant papers in our overview of reviews. Nevertheless, we tried to capture as many synonyms of co-design as possible in our search terms. The absence of quality assessment of the included studies can be seen as a limitation. However, our overview of reviews did not aim to assess the existing literature on the co-design process; rather, it focused on what to evaluate, how and when. We did note whether the reported evaluation themes were based on empirical evidence or authors’ opinions. Primary studies reported in the included reviews were not individually reviewed, as this was outside the scope of this paper. A strength of our methods was the cyclical process undertaken between steps 1 and 2: analysis of the data extracted from the overview was refined over three phases following rigorous discussions with a diverse and experienced stakeholder panel. It was a further strength of our project that a mix of stakeholders was involved, including consumers, healthcare professionals and researchers.

Stakeholders are frequently engaged in research, but if research co-design processes and outcomes are not evaluated, there will be limited learning from past experiences. Evaluation is essential to make refinements during existing projects and to improve future co-design activities. It is also critical for ensuring that commitments to the underpinning values of co-design are embedded within activities.

A systematic review of all primary studies within the reviews included in this overview would allow greater depth on the practicalities of how to evaluate certain themes. It would lead to a better understanding of existing measures and methods and of which evaluation areas need further development. Future research should also examine whether co-design leads to better outcomes than no co-design (solely researcher-driven research); to our knowledge, this has not yet been explored. Moreover, future research could provide better insight into the mechanisms of change within co-design and explore potential associations between evaluation themes, for example those proposed in the included reviews between level/quality of engagement and emotional factors.

Conclusions

We followed a systematic, iterative approach to develop a Co-Design Evaluation Framework that can be applied at various phases of the research co-design process. Testing the utility of the framework is an important next step. We propose that the framework could be used at multiple points within co-design projects, including prospectively (planning for evaluation before the co-design process has started), concurrently (incorporating improvements during the co-design process) and retrospectively (reviewing past co-design efforts to inform future projects).

Availability of data and materials

All data generated during this study are included either within the text or as a supplementary file.

Abbreviations

MRC: Medical Research Council

GRIPP2: Guidance for Reporting Involvement of Patients and the Public

References

1. Chalmers I, Glasziou P. Systematic reviews and research waste. Lancet. 2016;387(10014):122–3.
2. Chalmers I, Glasziou P. Avoidable waste in the production and reporting of research evidence. Lancet. 2009;374(9683):86–9.
3. Glasziou P, Altman DG, Bossuyt P, Boutron I, Clarke M, Julious S, et al. Reducing waste from incomplete or unusable reports of biomedical research. Lancet. 2014;383(9913):267–76.
4. Ioannidis JP. Why most clinical research is not useful. PLoS Med. 2016;13(6):e1002049.
5. Oliver S. Patient involvement in setting research agendas. Eur J Gastroenterol Hepatol. 2006;18(9):935–8.
6. Slattery P, Saeri AK, Bragge P. Research co-design in health: a rapid overview of reviews. Health Res Policy Syst. 2020;18(1):17.
7. Ní Shé É, Harrison R. Mitigating unintended consequences of co-design in health care. Health Expect. 2021;24(5):1551–6.
8. Peters S, Sukumar K, Blanchard S, Ramasamy A, Malinowski J, Ginex P, et al. Trends in guideline implementation: an updated scoping review. Implement Sci. 2022;17:50.
9. Skivington K, Matthews L, Simpson SA, Craig P, Baird J, Blazeby JM, et al. A new framework for developing and evaluating complex interventions: update of Medical Research Council guidance. BMJ. 2021;374:n2061.
10. Staniszewska S, Brett J, Simera I, Seers K, Mockford C, Goodlad S, et al. GRIPP2 reporting checklists: tools to improve reporting of patient and public involvement in research. BMJ. 2017;358:j3453.
11. Domecq JP, Prutsky G, Elraiyah T, Wang Z, Nabhan M, Shippee N, et al. Patient engagement in research: a systematic review. BMC Health Serv Res. 2014;14:89.
12. Manafo E, Petermann L, Mason-Lai P, Vandall-Walker V. Patient engagement in Canada: a scoping review of the “how” and “what” of patient engagement in health research. Health Res Policy Syst. 2018;16(1):5.
13. Fergusson D, Monfaredi Z, Pussegoda K, Garritty C, Lyddiatt A, Shea B, et al. The prevalence of patient engagement in published trials: a systematic review. Res Involv Engagem. 2018;4:17.
14. Vat LE, Finlay T, Jan Schuitmaker-Warnaar T, Fahy N, Robinson P, Boudes M, et al. Evaluating the “return on patient engagement initiatives” in medicines research and development: a literature review. Health Expect. 2020;23(1):5–18.
15. Luger TM, Hamilton AB, True G. Measuring community-engaged research contexts, processes, and outcomes: a mapping review. Milbank Q. 2020;98(2):493–553.
16. Modigh A, Sampaio F, Moberg L, Fredriksson M. The impact of patient and public involvement in health research versus healthcare: a scoping review of reviews. Health Policy. 2021;125(9):1208–21.
17. Clavel N, Paquette J, Dumez V, Del Grande C, Ghadiri DPS, Pomey MP, et al. Patient engagement in care: a scoping review of recently validated tools assessing patients’ and healthcare professionals’ preferences and experience. Health Expect. 2021;24(6):1924–35.
18. Newman B, Joseph K, Chauhan A, Seale H, Li J, Manias E, et al. Do patient engagement interventions work for all patients? A systematic review and realist synthesis of interventions to enhance patient safety. Health Expect. 2021;24:1905.
19. Lowe D, Ryan R, Schonfeld L, Merner B, Walsh L, Graham-Wisener L, et al. Effects of consumers and health providers working in partnership on health services planning, delivery and evaluation. Cochrane Database Syst Rev. 2021;9:CD013373.
20. Price A, Albarqouni L, Kirkpatrick J, Clarke M, Liew SM, Roberts N, et al. Patient and public involvement in the design of clinical trials: an overview of systematic reviews. J Eval Clin Pract. 2018;24(1):240–53.
21. Sarrami-Foroushani P, Travaglia J, Debono D, Braithwaite J. Implementing strategies in consumer and community engagement in health care: results of a large-scale, scoping meta-review. BMC Health Serv Res. 2014;14:402.
22. Abrams R, Park S, Wong G, Rastogi J, Boylan A-M, Tierney S, et al. Lost in reviews: looking for the involvement of stakeholders, patients, public and other non-researcher contributors in realist reviews. Res Synth Methods. 2021;12(2):239–47.
23. Zych MM, Berta WB, Gagliardi AR. Conceptualising the initiation of researcher and research user partnerships: a meta-narrative review. Health Res Policy Syst. 2020;18(1):24.
24. Gates M, Gates A, Pieper D, Fernandes RM, Tricco AC, Moher D, et al. Reporting guideline for overviews of reviews of healthcare interventions: development of the PRIOR statement. BMJ. 2022;378:e070849.
25. Pollock A, Campbell P, Brunton G, Hunt H, Estcourt L. Selecting and implementing overview methods: implications from five exemplar overviews. Syst Rev. 2017;6(1):145.
26. Greenhalgh T, Hinton L, Finlay T, Macfarlane A, Fahy N, Clyde B, et al. Frameworks for supporting patient and public involvement in research: systematic review and co-design pilot. Health Expect. 2019;22(4):785–801.
27. Hughes M, Duffy C. Public involvement in health and social sciences research: a concept analysis. Health Expect. 2018;21(6):1183–90.
28. Braun V, Clarke V. Using thematic analysis in psychology. Qual Res Psychol. 2006;3(2):77–101.
29. Waggoner J, Carline JD, Durning SJ. Is there a consensus on consensus methodology? Descriptions and recommendations for future consensus research. Acad Med. 2016;91(5):663–8.
30. Harvey N, Holmes CA. Nominal group technique: an effective method for obtaining group consensus. Int J Nurs Pract. 2012;18(2):188–94.
31. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.
32. Baldwin JN, Napier S, Neville S, Clair VAWS. Impacts of older people’s patient and public involvement in health and social care research: a systematic review. Age Ageing. 2018;47(6):801–9.
33. Bench S, Eassom E, Poursanidou K. The nature and extent of service user involvement in critical care research and quality improvement: a scoping review of the literature. Int J Consum Stud. 2018;42(2):217–31.
34. Bethell J, Commisso E, Rostad HM, Puts M, Babineau J, Grinbergs-Saull A, et al. Patient engagement in research related to dementia: a scoping review. Dementia. 2018;17(8):944–75.
35. Brett J, Staniszewska S, Mockford C, Herron-Marx S, Hughes J, Tysall C, et al. A systematic review of the impact of patient and public involvement on service users, researchers and communities. Patient. 2014;7(4):387–95.
36. Di Lorito C, Birt L, Poland F, Csipke E, Gove D, Diaz-Ponce A, et al. A synthesis of the evidence on peer research with potentially vulnerable adults: how this relates to dementia. Int J Geriatr Psychiatry. 2017;32(1):58–67.
37. Di Lorito C, Bosco A, Birt L, Hassiotis A. Co-research with adults with intellectual disability: a systematic review. J Appl Res Intellect Disabil. 2018;31(5):669–86.
38. Fox G, Fergusson DA, Daham Z, Youssef M, Foster M, Poole E, et al. Patient engagement in preclinical laboratory research: a scoping review. EBioMedicine. 2021;70:103484.
39. Frankena TK, Naaldenberg J, Cardol M, Linehan C, van Schrojenstein Lantman-de Valk H. Active involvement of people with intellectual disabilities in health research - a structured literature review. Res Dev Disabil. 2015;45–46:271–83.
40. Fudge N, Wolfe CD, McKevitt C. Involving older people in health research. Age Ageing. 2007;36(5):492–500.
41. George AS, Mehra V, Scott K, Sriram V. Community participation in health systems research: a systematic review assessing the state of research, the nature of interventions involved and the features of engagement with communities. PLoS One. 2015;10(10):e0141091.
42. Legare F, Boivin A, van der Weijden T, Pakenham C, Burgers J, Legare J, et al. Patient and public involvement in clinical practice guidelines: a knowledge synthesis of existing programs. Med Decis Making. 2011;31(6):E45–74.
43. McCarron TL, Clement F, Rasiah J, Moran C, Moffat K, Gonzalez A, et al. Patients as partners in health research: a scoping review. Health Expect. 2021;24:1378.
44. Miller J, Knott V, Wilson C, Roder D. A review of community engagement in cancer control studies among indigenous people of Australia, New Zealand, Canada and the USA. Eur J Cancer Care. 2012;21(3):283–95.
45. Shen S, Doyle-Thomas KAR, Beesley L, Karmali A, Williams L, Tanel N, et al. How and why should we engage parents as co-researchers in health research? A scoping review of current practices. Health Expect. 2017;20(4):543–54.
46. Velvin G, Hartman T, Bathen T. Patient involvement in rare diseases research: a scoping review of the literature and mixed method evaluation of Norwegian researchers’ experiences and perceptions. Orphanet J Rare Dis. 2022;17(1):212.
47. Wiles LK, Kay D, Luker JA, Worley A, Austin J, Ball A, et al. Consumer engagement in health care policy, research and services: a systematic review and meta-analysis of methods and effects. PLoS One. 2022;17:e0261808.
48. Cook N, Siddiqi N, Twiddy M, Kenyon R. Patient and public involvement in health research in low and middle-income countries: a systematic review. BMJ Open. 2019;9(5):e026514.
49. Chambers E, Gardiner C, Thompson J, Seymour J. Patient and carer involvement in palliative care research: an integrative qualitative evidence synthesis review. Palliat Med. 2019;33(8):969–84.
50. Brett J, Staniszewska S, Mockford C, Herron-Marx S, Hughes J, Tysall C, et al. Mapping the impact of patient and public involvement on health and social care research: a systematic review. Health Expect. 2014;17(5):637–50.
51. Boaz A, Hanney S, Jones T, Soper B. Does the engagement of clinicians and organisations in research improve healthcare performance: a three-stage review. BMJ Open. 2015;5(12):e009415.
52. Boivin A, L’Esperance A, Gauvin FP, Dumez V, Macaulay AC, Lehoux P, et al. Patient and public engagement in research and health system decision making: a systematic review of evaluation tools. Health Expect. 2018;21(6):1075–84.
53. Anderst A, Conroy K, Fairbrother G, Hallam L, McPhail A, Taylor V. Engaging consumers in health research: a narrative review. Aust Health Rev. 2020;44(5):806–13.
54. Arnstein L, Wadsworth AC, Yamamoto BA, Stephens R, Sehmi K, Jones R, et al. Patient involvement in preparing health research peer-reviewed publications or results summaries: a systematic review and evidence-based recommendations. Res Involv Engagem. 2020;6:34.
55. Becerril-Montekio V, Garcia-Bello LA, Torres-Pereda P, Alcalde-Rabanal J, Reveiz L, Langlois EV. Collaboration between health system decision makers and professional researchers to coproduce knowledge, a scoping review. Int J Health Plann Manage. 2022;28:28.
56. Bird M, Ouellette C, Whitmore C, Li L, Nair K, McGillion MH, et al. Preparing for patient partnership: a scoping review of patient partner engagement and evaluation in research. Health Expect. 2020;23:523.
57. Dawson S, Campbell SM, Giles SJ, Morris RL, Cheraghi-Sohi S. Black and minority ethnic group involvement in health and social care research: a systematic review. Health Expect. 2018;21(1):3–22.
58. Harris J, Haltbakk J, Dunning T, Austrheim G, Kirkevold M, Johnson M, et al. How patient and community involvement in diabetes research influences health outcomes: a realist review. Health Expect. 2019;22(5):907–20.
59. Hubbard G, Kidd L, Donaghy E. Involving people affected by cancer in research: a review of literature. Eur J Cancer Care. 2008;17(3):233–44.
60. Hubbard G, Kidd L, Donaghy E, McDonald C, Kearney N. A review of literature about involving people affected by cancer in research, policy and planning and practice. Patient Educ Couns. 2007;65(1):21–33.
61. Jones EL, Williams-Yesson BA, Hackett RC, Staniszewska SH, Evans D, Francis NK. Quality of reporting on patient and public involvement within surgical research: a systematic review. Ann Surg. 2015;261(2):243–50.
62. Drahota A, Meza RD, Brikho B, Naaf M, Estabillo JA, Gomez ED, et al. Community-academic partnerships: a systematic review of the state of the literature and recommendations for future research. Milbank Q. 2016;94(1):163–214.
63. Forsythe LP, Szydlowski V, Murad MH, Ip S, Wang Z, Elraiyah TA, et al. A systematic review of approaches for engaging patients for research on rare diseases. J Gen Intern Med. 2014;29(Suppl 3):788–800.
64. Lander J, Hainz T, Hirschberg I, Strech D. Current practice of public involvement activities in biomedical research and innovation: a systematic qualitative review. PLoS One. 2014;9(12):e113274.
65. Lee DJ, Avulova S, Conwill R, Barocas DA. Patient engagement in the design and execution of urologic oncology research. Urol Oncol. 2017;35(9):552–8.
66. Malterud K, Elvbakken KT. Patients participating as co-researchers in health research: a systematic review of outcomes and experiences. Scand J Public Health. 2020;48(6):617–28.
67. Miah J, Dawes P, Edwards S, Leroi I, Starling B, Parsons S. Patient and public involvement in dementia research in the European Union: a scoping review. BMC Geriatr. 2019;19(1):220.
68. Nilsen ES, Myrhaug HT, Johansen M, Oliver S, Oxman AD. Methods of consumer involvement in developing healthcare policy and research, clinical practice guidelines and patient information material. Cochrane Database Syst Rev. 2006;2006(3):CD004563.
69. Oliver S, Clarke-Jones L, Rees R, Milne R, Buchanan P, Gabbay J, et al. Involving consumers in research and development agenda setting for the NHS: developing an evidence-based approach. Health Technol Assess. 2004;8(15):1–148, III–IV.

Orlowski SK, Lawn S, Venning A, Winsall M, Jones GM, Wyld K, et al. Participatory Research as One Piece of the Puzzle: A Systematic Review of Consumer Involvement in Design of Technology-Based Youth Mental Health and Well-Being Interventions. JMIR Hum Factors. 2015;2(2):e12.

Pii KH, Schou LH, Piil K, Jarden M. Current trends in patient and public involvement in cancer research: A systematic review. Health Expectations: An International Journal of Public Participation in Health Care & Health Policy. 2019;22(1):3–20.

Sandoval JA, Lucero J, Oetzel J, Avila M, Belone L, Mau M, et al. Process and outcome constructs for evaluating community-based participatory research projects: a matrix of existing measures. Health Educ Res. 2012;27(4):680–90.

Sangill C, Buus N, Hybholt L, Berring LL. Service user’s actual involvement in mental health research practices: A scoping review. Int J Ment Health Nurs. 2019;28(4):798–815.

Schelven F, Boeije H, Marien V, Rademakers J. Patient and public involvement of young people with a chronic condition in projects in health and social care: A scoping review. Health Expect. 2020;23:789 No Pagination Specified.

Schilling I, Gerhardus A. Methods for Involving Older People in Health Research-A Review of the Literature. International Journal of Environmental Research & Public Health [Electronic Resource]. 2017;14(12):29.

Shippee ND, Domecq Garces JP, Prutsky Lopez GJ, Wang Z, Elraiyah TA, Nabhan M, et al. Patient and service user engagement in research: a systematic review and synthesized framework. Health Expect. 2015;18(5):1151–66.

Vaughn LM, Jacquez F, Lindquist-Grantz R, Parsons A, Melink K. Immigrants as research partners: A review of immigrants in community-based participatory research (CBPR). J Immigr Minor Health. 2017;19(6):1457–68.

Walmsley J, Strnadova I, Johnson K. The added value of inclusive research. J Appl Res Intellect Disabil. 2018;31(5):751–9.

Weschke S, Franzen DL, Sierawska AK, Bonde LS, Strech D. Schorr SG. Reporting of patient involvement: A mixed-methods analysis of current practice in health research publications. medRxiv; 2022. p. 21.

Gradinger F, Britten N, Wyatt K, Froggatt K, Gibson A, Jacoby A, et al. Values associated with public involvement in health and social care research: A narrative review. Health Expectations: An International Journal of Public Participation in Health Care & Health Policy. 2015;18(5):661–75.

Hoekstra F, Mrklas KJ, Khan M, McKay RC, Vis-Dunbar M, Sibley KM, et al. A review of reviews on principles, strategies, outcomes and impacts of research partnerships approaches: A first step in synthesising the research partnership literature. Health Res Pol Syst. 2020;18(1):51 no pagination.

Stallings SC, Boyer AP, Joosten YA, Novak LL, Richmond A, Vaughn YC, et al. A taxonomy of impacts on clinical and translational research from community stakeholder engagement. Health Expect. 2019;22(4):731–42.

Grindell C, Coates E, Croot L, O’Cathain A. The use of co-production, co-design and co-creation to mobilise knowledge in the management of health conditions: a systematic review. BMC Health Serv Res. 2022;22(1):877.

Staley K. “Is it worth doing?” Measuring the impact of patient and public involvement in research. Res Involv Engagem. 2015;1:6.

Download references

Acknowledgements

The authors would like to thank the graphic designers, Jenni Quinn and Kevin Calthorpe, for their work on Fig. 3 .


Author information

Authors and Affiliations

School of Health Sciences, The University of Melbourne, Melbourne, Australia

Sanne Peters, Jill Francis, Stephanie Best, Stephanie Rowe & Marlena Klaic

Department of Health Services Research, Peter MacCallum Cancer Centre, Melbourne, Australia

Lisa Guccione, Jill Francis & Stephanie Best

Sir Peter MacCallum Department of Oncology, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Melbourne, Australia

Centre for Implementation Research, Ottawa Hospital Research Institute, Ottawa, Canada

Jill Francis

Victorian Comprehensive Cancer Centre, Melbourne, VIC, Australia

Stephanie Best

Emergency Research, Murdoch Children’s Research Institute, Melbourne, Australia

Emma Tavender

Department of Critical Care, The University of Melbourne, Melbourne, Australia

School of Nursing, Faculty of Health, Ottawa, Canada

Janet Curran & Stephanie Rowe

Emergency Medicine, Faculty of Medicine, Ottawa, Canada

Janet Curran

Neurological Rehabilitation Group Mount Waverley, Mount Waverley, Australia

Katie Davies

The ALIVE National Centre for Mental Health Research Translation, The University of Melbourne, Melbourne, Australia

Victoria J. Palmer


Contributions

SP coordinated the authorship team, completed the systematic literature searches, synthesis of data, framework design and substantial writing. MK and LG were the second reviewers for the systematic overview of reviews. MK, LG and JF assisted with framework design.  SP, LG, JF, SB, ET, JC, KD, SR, VP and MK participated in the stakeholder meetings and the consensus process. All authors commented on drafts and approved the final submitted version of the manuscript.

Corresponding author

Correspondence to Sanne Peters.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

  • Supplementary Material 1
  • Supplementary Material 2
  • Supplementary Material 3
  • Supplementary Material 4

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Peters, S., Guccione, L., Francis, J. et al. Evaluation of research co-design in health: a systematic overview of reviews and development of a framework. Implementation Science 19, 63 (2024). https://doi.org/10.1186/s13012-024-01394-4

Download citation

Received : 01 April 2024

Accepted : 31 August 2024

Published : 11 September 2024

DOI : https://doi.org/10.1186/s13012-024-01394-4

Keywords

  • Research co-design
  • Stakeholder involvement
  • End-user engagement
  • Consumer participation
  • Outcome measures


  • Open access
  • Published: 09 September 2024

Accelerating histopathology workflows with generative AI-based virtually multiplexed tumour profiling

Pushpak Pati, Sofia Karkampouna, Francesco Bonollo, Eva Compérat, Martina Radić, Martin Spahn, Adriano Martinelli, Martin Wartenberg, Marianna Kruithof-de Julio & Marianna Rapsomaniki

Nature Machine Intelligence (2024)

  • Cancer imaging
  • Machine learning

A preprint version of the article is available at bioRxiv.

Understanding the spatial heterogeneity of tumours and its links to disease initiation and progression is a cornerstone of cancer biology. Presently, histopathology workflows heavily rely on hematoxylin and eosin and serial immunohistochemistry staining, a cumbersome, tissue-exhaustive process that results in non-aligned tissue images. We propose the VirtualMultiplexer, a generative artificial intelligence toolkit that effectively synthesizes multiplexed immunohistochemistry images for several antibody markers (namely AR, NKX3.1, CD44, CD146, p53 and ERG) from only an input hematoxylin and eosin image. The VirtualMultiplexer captures biologically relevant staining patterns across tissue scales without requiring consecutive tissue sections, image registration or extensive expert annotations. Thorough qualitative and quantitative assessment indicates that the VirtualMultiplexer achieves rapid, robust and precise generation of virtually multiplexed imaging datasets of high staining quality that are indistinguishable from the real ones. The VirtualMultiplexer is successfully transferred across tissue scales and patient cohorts with no need for model fine-tuning. Crucially, the virtually multiplexed images enabled training a graph transformer that simultaneously learns from the joint spatial distribution of several proteins to predict clinically relevant endpoints. We observe that this multiplexed learning scheme was able to greatly improve clinical prediction, as corroborated across several downstream tasks, independent patient cohorts and cancer types. Our results showcase the clinical relevance of artificial intelligence-assisted multiplexed tumour imaging, accelerating histopathology workflows and cancer biology.


Tissues are spatially organized ecosystems, where cells of diverse phenotypes, morphologies and molecular profiles coexist with non-cellular compounds and interact to maintain homeostasis 1 . Several tissue staining technologies are used to interrogate this intricate tissue architecture. Among these, hematoxylin and eosin (H&E) is the undisputed workhorse, routinely used to assess aberrations in tissue morphology linked to disease in histopathology workflows 2 . A notable example is cancer, where H&E staining can reveal abnormal cell proliferation, lymphovascular invasion and immune cell infiltration, among others. Complementary to the morphological information available via H&E, immunohistochemistry (IHC) 3 can detect and quantify the distribution and localization of specific markers within cell compartments and within their proper histological context, crucial for tumour subtyping, prognosis and personalized treatment selection. As tissue restaining in conventional IHC is limited, repeated serial sections stained with different antibodies are required for in-depth tumour profiling, a time-consuming and tissue-exhaustive process, prohibitive in cases of limited tissue availability. Additionally, serial IHC staining yields unaligned, non-multiplexed images occasionally of suboptimal quality due to artefacts, and tissue unavailability may lead to missing stainings (Fig. 1a ). Recently, multiplexed imaging technologies 4 , 5 , 6 have enabled the simultaneous quantification of dozens of markers on the same tissue, revolutionizing spatial biology 7 . Still, their high cost, cumbersome experimental process, tissue-destructive nature and need for specialized equipment severely limit clinical adoption.

Figure 1

a , In a typical histopathology workflow, serial tissue sections from a tumour resection are stained with H&E and IHC to highlight tissue morphology and molecular expression of several markers of interest. This time-consuming and tissue-exhaustive process yields unpaired tissue slides that bear the technical risk of suboptimal quality in terms of missing stainings, tissue artefacts and unaligned tissues. b , To mitigate these issues, the VirtualMultiplexer uses generative AI to rapidly render, from a real input H&E image, consistent, reliable and pixel-wise aligned IHC stainings. c , As the generated images are now virtually multiplexed, they are further exploited to train early fusion graph transformers able to predict several clinically relevant endpoints. d , The VirtualMultiplexer was successfully transferred across image scales and patient cohorts and showed potential in being transferred to other tissue types, accelerating clinical applications and discovery.

Virtual staining—that is, artificially staining tissue images using generative artificial intelligence (AI)—has emerged as a promising cost-effective, accessible and rapid alternative that addresses the above limitations 8 , 9 . A virtual staining model is trained on two sets of images—a source and a target set—and learns the source-to-target appearance mapping 10 , 11 so as to simulate the target staining on the source, ultimately producing at inference time a virtual target image. Initial virtual staining models were based on different flavours of generative adversarial networks (GANs) operating under a paired setting: that is, they depended on precisely aligned source and target images, which allowed them to directly optimize a pixel-wise loss between the virtual and real images 12 . Successful examples of paired models include translating label-free microscopy to H&E and specific stainings 13 , 14 , 15 , 16 , H&E to special stains 17 , 18 , H&E to IHC 19 , 20 and IHC to multiplex immunofluorescence 21 . However, as tissue restaining is not routinely done, paired models depend on aligning tissue slices via image registration, a time-consuming and error-prone process, often infeasible in practice because of substantial discrepancies even between consecutive slices. Additionally, as tissue architecture largely alters after the first slices, retrospective addition of new markers is impossible. To circumvent these limitations, unpaired stain-to-stain (S2S) translation models have recently emerged, with early applications in translating from H&E to IHC 22 , 23 , 24 , 25 , 26 and special staining 27 , 28 and from cryosections to formalin-fixed paraffin-embedded (FFPE) sections 29 . The vast majority of unpaired models are inspired by CycleGAN 30 ; they depend on an adversarial loss to preserve the source content and a cycle consistency loss to preserve the target style. Some employ additional constraints: for example, domain-invariant content and domain-variant style 22 , perceptual embeddings 24 or structural similarity 25 .

An important limitation of CycleGAN-based models is that cycle consistency assumes a bijective mapping between the source and target domains 30 , which does not hold for many S2S translation tasks. As a result, a persistent problem is staining unreliability, observed as incorrect mappings across domains: for example, positive signals from the source domain are mapped to negative signals from the target domain. To account for staining unreliability, recent works guide the translations via expert annotations: ref. 26 translates H&E to cytokeratin-stained IHC using expert annotations of positive and negative metastatic regions on the H&E images, and ref. 25 translated H&E to Ki67-stained IHC by leveraging cancer and normal region annotations in both H&E and IHC images. Although these approaches show promising results for these specific translation tasks, acquiring such annotations is impractical when translating to several IHC markers and infeasible even for experienced pathologists for specialized tasks (for example, identifying p53 + cells in H&E images). To circumvent the annotation challenge, ref. 31 recently introduced a semisupervised approach, which, however, again depends on image registration. Consequently, there is a great need for unpaired S2S translation models that preserve staining consistency without needing consecutive tissue sections, image registration or extensive annotations on the source domain.

Regardless of the underlying modelling assumptions, another important limitation of S2S translation methods concerns evaluation. As ground-truth and virtually generated images are not pixel-wise aligned, S2S translation quality is typically quantified at a high feature level using inception-based scores 32 . However, these scores do not guarantee accurate preservation of complex and biologically meaningful patterns 9 . To alleviate these concerns, some studies employ qualitative assessment through pathological examination of the virtual images 22 , 24 . Still, a persistent concern is the presence of hallucinations in virtual images 33 that might otherwise appear realistic even to experienced pathologists. Ultimately, to ensure that virtual images not only visually appear realistic but also are useful from a clinical standpoint, using them as input to downstream models that predict clinical endpoints could provide an unbiased, convincing validation 9 .

Here, we propose the VirtualMultiplexer, a generative toolkit that translates H&E images to matching IHC images for a variety of markers (one IHC marker at a time) (Fig. 1a,b ). The VirtualMultiplexer is inspired by contrastive unpaired translation (CUT) 34 , an appealing alternative to CycleGAN that achieves content preservation by maximizing the mutual information between target and source domains. Our toolkit does not necessitate pixel-wise aligned H&E and IHC images and, in contrast to existing approaches, requires minimal expert annotations only on the IHC domain. To ensure biological consistency, the VirtualMultiplexer introduces an architecture with multiscale constraints at the single-cell, cell-neighbourhood and whole-image level that closely mimics human expert evaluation. We trained the VirtualMultiplexer on a prostate cancer tissue microarray (TMA) containing unpaired H&E and IHC images for six clinically relevant nuclear, cytoplasmic and membrane-targeted markers. We evaluated the generated images using quantitative fidelity metrics, expert pathological assessment and visual Turing tests and assessed their clinical relevance by predicting clinical endpoints (Fig. 1c ). We successfully transferred the model across tissue image scales and out-of-distribution patient cohorts and demonstrated its potential to transfer across tissue types (Fig. 1d ). Our results suggest that the VirtualMultiplexer generates realistic, indistinguishable from real, multiplexed IHC images of high quality, outperforming existing methods. Using the virtually multiplexed datasets improves the prediction of clinical endpoints not only in the training cohort but also in two independent prostate cancer patient cohorts and a pancreatic ductal adenocarcinoma (PDAC) cohort, with important implications in histopathology.

VirtualMultiplexer is a virtually multiplexed staining toolkit

The VirtualMultiplexer is a generative toolkit for unpaired S2S translation, trained on unpaired real H&E (source) and IHC (target) images (Fig. 2 ; detailed description in Methods ). During training, each image is split into patches that are fed into a generator network G that conditions on input H&E and IHC and learns to transfer the staining pattern, as captured by IHC, to the tissue morphology, as captured by H&E. The generated IHC patches are stitched together to create a virtual IHC image (Fig. 2a ). We train an independent one-to-one VirtualMultiplexer model for each IHC marker at a time. To ensure staining reliability, we propose a multiscale approach, designed to accurately learn staining specificity at a single-cell level and content and style preservation at a cell-neighbourhood and whole-image level, which involves jointly optimizing three distinct loss functions (Fig. 2b ). The neighbourhood loss (1) ensures that generated IHC patches are indistinguishable from real IHC patches and consists of an adversarial and a multilayer contrastive loss (Fig. 2b ), adopted from CUT 34 . The adversarial loss \({{\mathcal{L}}}_{{\rm{adv}}}\) (1a) is a standard GAN loss 35 , where real and virtual IHC patches are used as input to patch discriminator D , which attempts to classify them as either real or virtual, eliminating style differences. The multilayer contrastive loss (1b) is based on a patch-level noise contrastive estimation (NCE) loss 34 \({{\mathcal{L}}}_{{\rm{contrastive}}}\) that ensures that the content of corresponding real H&E and virtual IHC patches is preserved across multiple layers of G enc : that is, the encoder of the generator G . The VirtualMultiplexer introduces two losses: a global consistency loss and a local consistency loss (Fig. 2b ). The global consistency loss (2) uses a feature extractor F and enforces content consistency between real H&E and virtual IHC images ( \({{\mathcal{L}}}_{{\rm{content}}}\) ) and style consistency between real and virtual IHC images ( \({{\mathcal{L}}}_{{\rm{style}}}\) ) across multiple layers of F . The local consistency loss (3) enables the model to capture a realistic appearance and staining pattern at the cellular level while alleviating the multi-subdomain mappings. This is achieved by leveraging prior knowledge on staining status via expert annotations and training two separate networks: a cell discriminator D cell that eliminates differences in the style of real and virtual cells ( \({{\mathcal{L}}}_{{\rm{cellDisc}}}\) ) and a cell classifier F cell that predicts the staining status and thus enforces staining consistency at a cell level ( \({{\mathcal{L}}}_{{\rm{cellClass}}}\) ).

Figure 2

a , The VirtualMultiplexer consists of a generator G that takes as input real unpaired H&E and IHC images and is trained to perform S2S translation by mapping the staining distribution of IHC onto H&E while preserving tissue morphology, ultimately generating virtually multiplexed synthetic IHC images only from input H&E images. b , During training, the VirtualMultiplexer optimizes several losses that enforce consistent S2S translation at multiple scales, including (1) a neighbourhood consistency loss that ensures indistinguishable translations at a neighbourhood (patch) level, (2) a global consistency loss that ensures that the model accurately captures content and style constraints at a global tile level and (3) a local consistency loss that encodes biological priors on cell type classification and discriminator constraints at a cellular level.

Performance assessment of the VirtualMultiplexer

We trained the VirtualMultiplexer on a prostate cancer cohort from the European Multicenter Prostate Cancer Clinical and Translational Research Group (EMPaCT) TMA 36 , 37 , 38 ( Methods ). The cohort contained unpaired H&E and IHC images from 210 patients with four cores per patient for six clinically relevant markers: androgen receptor (AR), NK3 Homeobox 1 (NKX3.1), CD44, CD146, p53 and ERG. The VirtualMultiplexer generated virtual IHC images that preserved the tissue morphology of the real H&E image and the staining pattern of the real IHC image (Fig. 3a–c ; additional examples in Extended Data Fig. 1 ). We benchmarked the VirtualMultiplexer with four state-of-the-art unpaired S2S translation methods: CycleGAN 30 , CUT 34 , CUT with kernel instance normalization (KIN) 39 and AI-FFPE 29 using the Fréchet inception distance (FID), an established metric used to assess the quality of AI-generated images 40 ( Methods ). The VirtualMultiplexer resulted in the lowest FID score across all markers (Fig. 3d ), with an average value of 29.2 (±3), consistently lower than CycleGAN (49 ± 6), CUT (35.8 ± 4.5), CUT with KIN (37.8 ± 2.3) and AI-FFPE (35.9 ± 2.6). We also used the contrast-structure similarity score, a variant of the structural similarity score that computes contrast and structure preservation 25 , where again the VirtualMultiplexer surpassed all other models in performance (Supplementary Table 1 ). These results indicated that virtual images generated by the VirtualMultiplexer were closer to the real ones in terms of distribution than any of the competing methods.
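For readers reproducing this kind of benchmark, FID can be computed with off-the-shelf tooling. The snippet below is an illustration only, using the torchmetrics implementation (the paper does not state which implementation was used) with random tensors standing in for real and virtual IHC patches:

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

# Stand-ins for real and virtual IHC patches: uint8 images of shape (N, 3, H, W)
real = torch.randint(0, 256, (64, 3, 299, 299), dtype=torch.uint8)
virtual = torch.randint(0, 256, (64, 3, 299, 299), dtype=torch.uint8)

# FID compares InceptionV3 feature distributions of the two image sets
fid = FrechetInceptionDistance(feature=2048)
fid.update(real, real=True)
fid.update(virtual, real=False)
print(float(fid.compute()))  # lower = virtual distribution closer to the real one
```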

Figure 3

a , Example H&E core from the EMPaCT TMA. b , Real, unpaired IHC-stained cores for different antibody markers corresponding to the H&E core in a . c , Virtually stained IHC cores, now paired with the H&E core in a . d , Comparison of the VirtualMultiplexer with state-of-the-art S2S models. Barplots and error bars indicate the mean and standard deviation of the FID score from three independent runs of each model. Number of test samples used varies per marker and is reported in each subplot. e , Results of the visual Turing test, where circles indicate the guess of each of the n  = 4 experts, and barplots and error bars indicate the corresponding mean and standard deviation. f , Assessment of staining quality of the virtual and real stainings, performed on 50 real and 50 virtual images. RR, real as real; RV, real as virtual; VR, virtual as real; VV, virtual as virtual.

To further quantify the indistinguishability of real and virtual images, we conducted a visual Turing test: three experts in prostate histopathology and one board-certified pathologist were shown 100 randomly selected patches per marker, with 50 of them originating from real and 50 from virtual IHC images, and were asked to classify each patch as virtual or real. Our model was able to trick the experts, as they achieved a close-to-random average sensitivity of 52.1% and specificity of 54.1% across all markers (Fig. 3e ). Last, we performed a staining quality assessment: we gave the pathologist 50 real and 50 virtual images per marker, revealing which were real and virtual; the pathologist performed a qualitative assessment of the staining, as judged by overall expression levels, background, staining pattern, cell type specificity and subcellular localization (Fig. 3f ; detailed annotations in Supplementary Data 1 ). Across all markers, on average 70.7% of the virtual images reached an acceptable staining quality, as opposed to 78.3% of the real images. The results varied depending on the marker, with virtual NKX3.1 and CD146 images achieving the highest quality of 96%, surpassing even real images. Conversely, virtual AR images had the lowest score of 46%, with an additional 10% exhibiting accurate staining but high background, and the remaining 42% rejected mostly due to heterogeneous staining or falsely unstained cells. Background presented a challenge with CD44 and p53; the latter appeared to be further affected by border artefacts—that is, the presence of abnormally highly stained cells only in the core border—also occasionally present in real images. ERG achieved a higher staining quality in virtual than in real images, which both often faced background issues. We concluded that for most markers, the staining quality scores and the number of cores with staining artefacts were comparable in virtual versus real images.

Following these observations, we carefully examined whether virtual images capture accurate staining patterns. Overall, for all markers, we observed similar patterns, correct cell types and subcellular distributions (Extended Data Fig. 2 a). Certain discrepancies were also found, such as systematic lack of recognition of CD146 + vascular structures (Extended Data Fig. 2 b). Nonetheless, the more pathologically relevant patterns, crucial for diagnostic applicability, were correctly reconstructed. We also compared the staining intensity of positive and negative cells and observed high concordance between class-wise intensity distributions and separability for both real and virtual images, confirming that the virtual images faithfully capture the staining intensity for both cell classes (Extended Data Fig. 3 ). Finally, we performed an ablation study demonstrating the effects of different components of the VirtualMultiplexer loss (Extended Data Fig. 4 ). The mere imposition of the neighbourhood consistency (the primary objective in competing methods) leads to obvious staining unreliability: for example, swapping of staining patterns between positive and negative cells. Our global consistency clearly mitigates this, and our local consistency further optimizes the virtual staining at the cell level.

Transferring from TMAs to WSIs

To assess how well the model can be transferred across imaging scales, we fed the TMA-trained VirtualMultiplexer with five out-of-distribution H&E-stained prostate whole-slide images (WSIs) and generated virtual IHC images for NKX3.1, AR and CD146. We then stained for the same markers by IHC on the direct serial sections, thus generating ground-truth and directly comparable WSIs to visually validate the model predictions ( Methods ). For NKX3.1 (Fig. 4 ), the virtual images largely captured the staining appearance of the real ones, both in terms of specific glandular luminal cell identification (positive signal) (examples 1 and 2 in Fig. 4 and Extended Data Fig. 5 ) and accurate non-annotation of stromal or vascular structures (absence of signal) (example 3 in Fig. 4 and Extended Data Fig. 5 ). In a minority of cases, virtual images did not highlight the rarer NKX3.1 + cells that are not part of the epithelial gland but rather lie in the periglandular stroma (example 4 in Fig. 4 and Extended Data Fig. 5 ). For CD146 and AR, we observed intensity discrepancies between virtual and real images, more striking for CD146 where the overall signal intensity and background are higher in virtual versus real images (Fig. 4 and Extended Data Fig. 5 ). These discrepancies can be attributed to the fact that the training set TMA images have a different staining distribution than the WSIs. Although this might lead to false interpretation of marker expression levels at a first inspection, when evaluating at higher magnification, the staining pattern in the matching real and virtual regions was effectively correct: for example, no glandular signal (example 5 in Fig. 4 ), appropriate stromal localization of CD146 (examples 6 and 7 in Fig. 4 ) and nuclear localization of AR in luminal epithelial cells (example 5 in Extended Data Fig. 5 ). Lack of detection of vascular structures for CD146 was evident in both TMA cores and WSI (example 8 in Fig. 4 ).

Figure 4

Example of H&E (left), virtual IHC (middle) and real IHC (right) staining for NKX3.1 (top) and CD146 (bottom) of prostate cancer tissue WSIs. Blue-framed zoomed-in regions display accurate staining pattern. Red-framed zoomed-in regions display examples of virtual staining mispredictions.

The VirtualMultiplexer improves clinical predictions

We then assessed the utility of the generated stainings in augmenting the performance of AI models when predicting clinically relevant endpoints. Specifically, we encoded the real H&E, real IHC or virtual IHC images as tissue-graph representations and employed a graph transformer (GT) 41 to map the representations to downstream class labels (Fig. 5a,b and Methods ). We trained the GT model under three settings (Fig. 5c ): (1) a unimodal setting, where independent GT models were trained for each H&E and IHC marker; (2) a multimodal late fusion setting, where the outputs of independent GT models were fused at the last embedding stage, and (3) a multimodal early fusion setting, where the patch features were combined early in the tissue graph and fed into the GT model. Whereas the unimodal setting resulted in a separate prediction per marker, both multimodal settings combined the patch features, resulting in a single prediction. In contrast to the late fusion multimodal setting, in the early fusion case only one model that learns from the joint spatial distribution across all markers was trained, mimicking a multiplexed imaging scenario. With the exception of the early fusion setting that is only feasible for virtual images, we tested all three settings with both real and virtual images as input, resulting in a total of five different combinations (Fig. 5d , legend).
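To make the difference between the settings concrete, here is a schematic PyTorch sketch. The Head module is a deliberately simplified stand-in (mean-pool plus MLP) for the graph transformer, and all names and dimensions are hypothetical, not the authors' implementation:

```python
import torch
import torch.nn as nn

N_PATCHES, D, N_CLASSES = 500, 1024, 2  # illustrative sizes

class Head(nn.Module):
    """Simplified stand-in for one graph transformer branch: mean-pool + MLP."""
    def __init__(self, in_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, N_CLASSES))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:  # feats: (n_patches, in_dim)
        return self.mlp(feats.mean(dim=0))                   # (n_classes,)

# Per-marker patch features for one tissue image, e.g. from a frozen ResNet-50
feats = {m: torch.randn(N_PATCHES, D) for m in ["HE", "AR", "NKX3.1", "CD44", "CD146", "p53"]}

# (1) Unimodal: one independent model per marker, e.g. Head(D)(feats["HE"])
# (2) Late fusion: independent per-marker models, outputs combined at the end
late_logits = torch.stack([Head(D)(f) for f in feats.values()]).mean(dim=0)

# (3) Early fusion: patch features concatenated first, then a single joint model
early_logits = Head(D * len(feats))(torch.cat(list(feats.values()), dim=1))
```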

Figure 5

a , Patch extraction and computation of patch features with a frozen ResNet-50 model (blue trapezoid). b , Overview of the GT model, implemented by first constructing a patch-level graph representation, followed by a transformer that processes the graph representation to predict clinically relevant endpoints. c , Training of GT models (green trapezoid) under three different settings, depending on the integration strategy. d , Prediction results of overall survival status (left: 0, alive/censored; 1, prostate cancer-related death) and disease progression (right: 0, no recurrence; 1, recurrence). Barplot colours indicate one of the five combinations of training setting and input data used (see legend). For each combination, barplot heights and error bars indicate the mean and standard deviation of the weighted F 1 score, as computed in the held-out test set from three independent runs with different initializations. The exact number of training samples used in each case is given on top of the barplots. (a) For all multimodal models, the reported number refers to the union across all markers. MM-R-L, multimodal–real–late fusion; MM-V-E, multimodal–virtual–early fusion; MM-V-L, multimodal–virtual–late fusion.

We applied these settings to the EMPaCT dataset to predict patient overall survival status and disease progression (Fig. 5d and Methods ). As small discrepancies in the number of real IHC images available were present due to missing stainings, we matched the number of virtual IHC images to the number of available real IHC images to ensure a fair comparison between real and virtual unimodal models (dark and light blue barplots in Fig. 5d , respectively). As H&E images were always available, the unimodal model trained on H&E had a slight advantage over all other models in terms of number of samples used. To compare all multimodal models, we again matched the number of virtual images to the available real data, and thus the last three bars in Fig. 5d are also directly comparable. We observed that the unimodal–virtual settings are on par with the unimodal–real for both tasks, with variations in prediction performance depending on the marker. When predicting overall survival status, two interesting exceptions concern CD146 and p53: for CD146, the unimodal–virtual setting outperformed the unimodal–real, in accordance with the previous observation that virtual CD146 images achieved a higher-quality score than real ones (Fig. 3f ). The opposite is true for p53: virtual p53 images were of lower quality than real p53 images, and the corresponding unimodal–virtual setting achieved a lower performance than the unimodal–real one. However, these observations were not replicated for disease progression prediction, which appeared to be an overall harder task. In both tasks, all multimodal settings outperformed the unimodal ones, including the H&E, indicating the utility of combining information from complementary markers. Furthermore, the multimodal early fusion model trained with virtual images achieved the best weighted F 1 score of 82.9% and 74.8% for overall survival status and disease progression, respectively. We also performed a marker-level interpretability analysis, pointing to markers of high importance in line with the unimodal high and low weighted F 1 scores (Extended Data Fig. 6 ). Overall, our results establish the potential of virtual multiplexed images in augmenting the efficacy of AI models in the prediction of clinical endpoints.

Transferring across patient cohorts and cancer types

We then assessed the model’s ability to generalize to out-of-distribution data using two independent prostate cancer cohorts, SICAP 42 and prostate cancer grade assessment (PANDA) 43 , containing H&E-stained needle biopsies with associated Gleason scores ( Methods ). We used the pretrained VirtualMultiplexer to generate IHC images for four markers relevant towards Gleason score prediction: NKX3.1, CD146, AR and ERG (Fig. 6a ; additional examples in Extended Data Fig. 7 ). We observed that the virtual staining patterns of the IHC markers were overall correct and specific in terms of cell type and subcellular localization, with the only exception being the occasional aspecific AR signal in the extracellular matrix areas. Other inconsistencies include weak staining of interstitial tissue for CD146 and heterogeneous gland staining for ERG. We also observed some recurring issues as in the EMPaCT TMA (Fig. 3 ): background (for example, occasional stromal background in NKX3.1 and ERG) and border and tiling artefacts (for example, CD146). Subsequently, we trained GT models under the previous settings to predict Gleason grade (Fig. 6b,c , respectively). We observed that the predictive performance of the unimodal–virtual settings was close to or superior to the model using standalone H&E images for both datasets. Further improvement was attained by the multimodal–virtual settings, with the early fusion model achieving the highest weighted F 1 score (SICAP, 61.4%; PANDA, 72.3%), which not only outperformed the H&E unimodal counterparts, but also WholeSIGHT 44 , the previous top performing model on these datasets that achieved a weighted F 1 score of 58.6% and 67.9% on SICAP and PANDA, respectively. Finally, as for both SICAP and PANDA, ground-truth region-level annotations of Gleason scores exist, we performed a region-based interpretability analysis and observed that the salient tissue regions contributing to model predictions coincided with the ground-truth annotations (Extended Data Fig. 8 ).

Figure 6

a , Top, real H&E needle biopsy of the SICAP dataset. Bottom, matching virtual IHC stainings across four IHC markers, as generated from the EMPaCT-trained VirtualMultiplexer. b , Prediction results of Gleason grading for the SICAP test set in terms of weighted F 1 score and confusion matrix. Note that the setting unimodal–real (dark blue barplot) only includes training the model on H&E, as no real IHC data are available here. c , Same as in b , but for the PANDA dataset. d , Virtual IHC staining of a PDAC TMA dataset with corresponding prediction of TNM staging. In b – d , barplots and error bars are as in Fig. 3 and confusion matrices correspond to the multimodal–virtual early fusion model.

Finally, we evaluated the generalization ability of the VirtualMultiplexer on other cancer types. We applied the EMPaCT-pretrained VirtualMultiplexer to a PDAC TMA and generated virtual IHC stainings for CD44, CD146 and p53 (Fig. 6d ), three markers with expected expression in pancreatic tissue. The generated images appeared overall realistic, with no means of discriminating whether they were virtually or actually stained. We observed that the CD44 and CD146 staining pattern in the virtual images was allocated, as expected, to the extracellular matrix of presented tissue spots, without major staining in the epithelial tissue part. For p53, we again observed overall proper staining allocation to the nuclei of epithelial cells with expected distribution, with no major staining of other compartments. To quantify the utility of the virtual stainings for downstream applications, we followed the same process as before to predict PDAC tumour, node and metastasis (TNM) stage; models trained with virtually multiplexed data again showed increased performance, confirming the performance advantage that virtual multiplexing offers to prediction models. We also applied the pretrained VirtualMultiplexer to generate virtual IHC images for CD44 and CD146 from colorectal 45 and breast cancer 46 H&E-stained WSIs from The Cancer Genome Atlas (TCGA) at www.cancer.gov/tcga . Although the lack of normal tissue limited our ability to evaluate the staining quality in the generated images, we again observed an overall realistic virtual staining (Extended Data Fig. 9 ).

Lastly, we performed a runtime estimation of our framework (Extended Data Fig. 10 ) and concluded that it leads to substantial time gains when compared to a typical IHC staining, greatly accelerating histopathology workflows.

We proposed the VirtualMultiplexer, a generative model that translates H&E to several IHC markers using a multiscale architecture with biological priors that ensures biological consistency on a cellular, neighbourhood and global whole-image scale without requiring image registration or extensive annotations. The VirtualMultiplexer consistently outperformed state-of-the-art methods in image fidelity scores. Detailed evaluation suggested that the virtual IHC images were indistinguishable from real ones to the expert eye, with a staining quality on par with or even exceeding that of real images and occasional staining artefacts largely comparable for three of the six markers. A thorough ablation study demonstrated that our multiscale loss mitigates staining unreliability, as opposed to competing methods that solely use adversarial and contrastive objectives. We also found that the model generalized well to unseen datasets of different image scales without any fine-tuning.

Although our results demonstrate a clear potential, several limitations remain, to be addressed in future extensions. First, we occasionally observed elevated background, especially for markers with faint staining. More pronounced background was present when transferring to prostate cancer WSIs, which was expected considering that this dataset was generated in different institutions using different staining protocols. Second, the patch-wise processing occasionally induced tiling artefacts more pronounced at the core border, a well-known limitation of S2S translation approaches 24 , 39 , 47 , 48 . One possible underlying cause is that as the model has only seen tissue-full patches during training, when it receives as input a patch with little tissue, the losses ‘force’ it to stain with higher intensity to match the distribution of a full patch. Previous attempts to address the tiling artefact 24 , 39 have been suggested to cause less efficient translations 49 . As in our case the tiling artefact is limited to edge cases, a straightforward solution is discarding a narrow border surrounding the tissue, as empirically done in actual IHC when border artefacts are present. Alternatively, more sophisticated extensions, such as the bidirectional feature-fusion GAN proposed by ref. 48 could be exploited. Third, discrepancies in staining specificity were occasionally observed (for example, failing to stain CD146 + vascular structures and glandular NKX3.1 + cells invading periglandular stroma), as these patterns were rarely observed in the training images and can be mitigated by ensuring the inclusion of adequate representative examples in the training set.

Importantly, despite their limitations, the generated images enabled the training of early fusion GT models, which consistently improved the prediction of clinical endpoints not only in the training dataset across two prediction tasks but also in both out-of-distribution prostate cancer cohorts and the PDAC TMA cohort. In our experiments, we ensured that the multimodal early fusion models did not have a numerical advantage over models trained with real data and also had a much smaller parameter space in comparison to late fusion ones, suggesting that improved performance is not a mere outcome of higher sample size or model complexity. A potential explanation of the observed improvement is that virtual images are not affected by artefacts occasionally found in real images, corroborated by the fact that for markers where virtual images were of higher quality than real, the corresponding unimodal–virtual models outperformed the unimodal–real ones and vice versa. Another explanation could be that as multimodal early fusion models could learn from the joint spatial distribution of several markers on the same tissue, they managed to pick up single-cell multimodal spatial relationships, mimicking data generated by advanced multiplexed technologies. This is further supported by the fact that in the early fusion case, a single GT model proved to have more learning capacity than the integration of several equivalently potent ones. However, the superior performance of models trained with virtual data could be unrelated to a potential higher quality of the generated images and could be a direct outcome of the fact that the VirtualMultiplexer potentially picks up the most consistent patterns and eliminates a lot of the noise and artefacts in the data, making the prediction task easier. This is further supported by other works that have reported competitive performance using models trained on other spatial features extracted from the tissue images 50 , 51 .

In conclusion, the current work establishes the potential of virtual multiplexed staining, with important implications towards AI-assisted histopathology. For example, the VirtualMultiplexer could be directly used for data inpainting—that is, filling missing regions in an image—or for sample imputation—that is, generating missing samples from scratch. As IHC marker panels are not standardized across labs, filling in the gaps via virtual multiplexing could harmonize datasets within or across research labs, particularly important in cases of limited sample availability 52 , 53 . This could lead to the generation of harmonized and comprehensive patient cohorts, further used for clinically relevant predictions. An equally important application of our work concerns prehistopathological experimental design: generating a large collection of IHC stains in silico and training AI models could support marker selection for actual experimentation, reducing costs and preserving precious tissue. To reach its full potential, future work will be needed to validate the VirtualMultiplexer in real-world settings. From a technical standpoint, virtually multiplexed stainings can augment existing datasets and enable the development of foundational models for IHC, paving the way for multimodal tissue characterization. Interestingly, virtual multiplexed staining can be exploited as biologically conditioned data augmentations to boost the development and predictive performance of foundational models in histopathology. Our preliminary results on PDAC and TCGA images indicate that our model has the potential to generalize to tissues of different origins. However, more thorough evaluations are needed to solidify these encouraging early results. Finally, as our method is stain-agnostic, straightforward adaptations for S2S translation across multiplexed imaging technologies could substantially reduce costs via antibody panel optimization. Our vision is that future extensions of our work could lead to an ever-growing and readily available dictionary of virtual stainers for IHC and beyond, surpassing in multiplexing even the most cutting-edge technologies and accelerating spatial biology.

VirtualMultiplexer architecture

The VirtualMultiplexer is a generative AI toolkit that performs unpaired H&E-to-IHC translation. An overview of the model's architecture is shown in Fig. 2a . The VirtualMultiplexer is trained using two sets of images: source H&E images, denoted as \({X}_{{\rm{img}}}=\{x\in {\mathcal{X}}\}\) , and target IHC images, denoted as \({Y}_{{\rm{img}}}=\{\;y\in {\mathcal{Y}}\}\) . X img and Y img originate from different sections of the same TMA core and thus belong to the same patient, but they are pixel-wise unaligned and hence unpaired. We train an independent one-to-one VirtualMultiplexer model for each IHC marker at a time. To train the VirtualMultiplexer, we use patches X p  = { x p   ∈   X img } and Y p  = {  y p   ∈   Y img } extracted from a pair of images X img and Y img , respectively. The backbone of the VirtualMultiplexer is a GAN-based generator G , specifically a CUT 34 model that consists of two sequential components: an encoder G enc and a decoder G dec . Upon training, the generator takes as input a patch x p and generates a virtual patch \({y}_{\rm{p}}^{{\prime} }\) : that is, \({y}_{\rm{p}}^{{\prime} }=G({x}_{\rm{p}})={G}_{{\rm{dec}}}({G}_{{\rm{enc}}}({x}_{\rm{p}}))\) . The virtually generated patches are stitched together to produce a final virtual image \({Y}_{{\rm{img}}}^{{\prime} }=\{\;{y}^{{\prime} }\in {{\mathcal{Y}}}^{{\prime} }\}\) . The VirtualMultiplexer is trained under the supervision of three levels of consistency objectives: local, neighbourhood and global consistency (Fig. 2b ). The neighbourhood consistency enforces effective staining translation at a patch level, where a patch captures the neighbourhood of a cell. We introduce additional global and local consistency objectives, operating at an image level and cell level, respectively, to further constrain the unpaired S2S translation and alleviate the stain-specific inconsistencies.
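To make the inference path concrete, the following minimal PyTorch sketch mirrors the split–translate–stitch procedure described above. The generator interface, patch size and non-overlapping tiling are illustrative assumptions, not the authors' released implementation:

```python
import torch

PATCH = 256  # illustrative patch size; the actual training patch size may differ

@torch.no_grad()
def virtual_stain(hne: torch.Tensor, G: torch.nn.Module) -> torch.Tensor:
    """Translate an H&E image (3, H, W) into a virtual IHC image by running
    the generator patch-by-patch and stitching the outputs back together."""
    _, H, W = hne.shape
    ihc = torch.zeros_like(hne)
    for top in range(0, H - PATCH + 1, PATCH):        # non-overlapping grid;
        for left in range(0, W - PATCH + 1, PATCH):   # border remainders are skipped
            x_p = hne[:, top:top + PATCH, left:left + PATCH].unsqueeze(0)
            y_p = G(x_p)  # y'_p = G_dec(G_enc(x_p))
            ihc[:, top:top + PATCH, left:left + PATCH] = y_p.squeeze(0)
    return ihc
```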

Neighbourhood consistency

The neighbourhood objective is a combination of an adversarial loss and a patch-wise multilayer contrastive loss, implemented as previously described in CUT 34 (Fig. 2b , panel 1). Briefly, the adversarial loss dictates the model to learn to eliminate style differences between real and virtual patches, and the multilayer contrastive loss guarantees the content preservation at a patch level 54 . The adversarial loss is a standard GAN min–max loss 35 , where the discriminator D takes as input real IHC patches Y p and IHC patches \({Y}_{\rm{p}}^{{\prime} }\) virtually generated by generator G and attempts to classify them as either real or virtual (Fig. 2b , panel 1a). It is calculated as follows:
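The display equation did not survive extraction; in the standard GAN form the text describes, it reads:

\[
\mathcal{L}_{\mathrm{adv}}(G, D, X_{\mathrm{p}}, Y_{\mathrm{p}}) = \mathbb{E}_{y_{\mathrm{p}} \sim Y_{\mathrm{p}}}\!\left[\log D(y_{\mathrm{p}})\right] + \mathbb{E}_{x_{\mathrm{p}} \sim X_{\mathrm{p}}}\!\left[\log\left(1 - D(G(x_{\mathrm{p}}))\right)\right] \tag{1}
\]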

The patch-wise multilayer contrastive loss follows a NCE concept as presented in refs. 54 , 55 and reused in refs. 29 , 34 . Specifically, it aims to maximize the resemblance between input H&E patch x p   ∈   X p and corresponding virtually synthesized IHC patch \({y}_{\rm{p}}^{{\prime} }\in {Y}_{\rm{p}}^{{\prime} }\) (Fig. 2b , panel 1b). We first extract a query subpatch \({y}_{\rm{sp}}^{{\prime} }\) of size 64 × 64 from the target IHC domain patch \({y}_{\rm{p}}^{{\prime} }\) (purple square in Fig. 2b , panel 1b) and match it to the corresponding subpatch x s p : that is, a subpatch at the same spatial location as \({y}_{\rm{sp}}^{{\prime} }\) but from the H&E source domain patch x p (black square in Fig. 2b , panel 1b). Because both subpatches originate from the exact same tissue neighbourhood, we expect that x s p and \({y}_{\rm{sp}}^{{\prime} }\) form a positive pair. We also sample N subpatches \(\{{x}_{\rm{sp}}^{-}\}\) at different spatial locations from x p (red squares in Fig. 2b , panel 1b) and expect that they form dissimilar, negative pairs with x s p . In a standard contrastive learning scheme, we would map \({y}_{\rm{sp}}^{{\prime} }\) , x s p and \(\{{x}_{\rm{sp}}^{-}\}\) to a d -dimensional embedding space \({{\mathbb{R}}}^{d}\) via G enc and project them to a unit sphere, resulting in v , v + and \(\{{v}^{-}\}\in {{\mathbb{R}}}^{d}\) , respectively, and then estimate the probability of a positive pair ( v , v + ) selected over negative pairs \((v,{v}_{n}^{-}),\forall n\in N\) as a cross-entropy loss with a temperature scaling parameter τ :
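The missing display equation, in the standard InfoNCE form the text describes, would be:

\[
\ell(v, v^{+}, v^{-}) = -\log\frac{\exp(v \cdot v^{+}/\tau)}{\exp(v \cdot v^{+}/\tau) + \sum_{n=1}^{N}\exp(v \cdot v_{n}^{-}/\tau)} \tag{2}
\]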

Here, we use a variation of the loss in equation ( 2 ), specifically a patch-wise multilayer contrastive loss that extends \({\mathcal{L}}(v,{v}^{+},{v}^{-})\) by computing it for feature maps extracted from L layers of G enc 29 , 34 . This is achieved by passing the L feature maps of x p and \({y}_{\rm{p}}^{{\prime} }\) through a two-layer multilayer perceptron (MLP) H l , resulting in a stack of features \({\{{z}_{l}\}}_{L}={\{{H}_{l}({G}_{{\rm{enc}}}^{l}({x}_{\rm{p}}))\}}_{L}\) and \({\{{z}_{l}^{{\prime} }\}}_{L}={\{{H}_{l}({G}_{{\rm{enc}}}^{l}(\;{y}_{\rm{p}}^{{\prime} }))\}}_{L}\) = \({\{{H}_{l}({G}_{{\rm{enc}}}^{l}(G({x}_{\rm{p}})))\}}_{L}\) , ∀   l   ∈  {1, 2,  ⋯  ,  L }, respectively. We also iterate over each spatial location s   ∈  {1,  ⋯  ,  S l }, and we leverage all S l \ s patches as negatives, ultimately resulting in \({z}_{l,s}^{{\prime} }\) , z l , s and \({z}_{l,{S}_{l}\backslash s}\) for the query, positive and negative subpatches, respectively (purple, black and red boxes in Fig. 2b , panel 1b). The final patch-wise multilayer contrastive loss is computed as
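Reconstructed from the definitions above (the original display equation is missing), the loss plausibly reads:

\[
\mathcal{L}_{\mathrm{contrastive}}(G, H, X_{\mathrm{p}}) = \mathbb{E}_{x_{\mathrm{p}} \sim X_{\mathrm{p}}} \sum_{l=1}^{L} \sum_{s=1}^{S_{l}} \ell\!\left(z^{\prime}_{l,s},\, z_{l,s},\, z_{l,S_{l}\backslash s}\right) \tag{3}
\]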

We also employ the contrastive loss \({{\mathcal{L}}}_{{\rm{contrastive}}}(G,H,{Y}_{\rm{p}})\) on patches y p   ∈   Y p , a domain-specific version of the identity loss 56 , 57 that prevents the generator G from making unnecessary changes, as proposed in ref. 34 . Finally, the overall neighbourhood consistency objective is computed as a weighted sum of the adversarial loss (equation ( 1 )) and the multilayer contrastive losses (equation ( 3 )) with regularization hyperparameter λ NCE :

$$\mathcal{L}_{\rm neighbourhood} = \mathcal{L}_{\rm adv} + \lambda_{\rm NCE}\left(\mathcal{L}_{\rm contrastive}(G,H,X_{\rm p}) + \mathcal{L}_{\rm contrastive}(G,H,Y_{\rm p})\right)\quad(4)$$

Global consistency

Inspired by seminal work in neural style transfer 58 , this objective consists of two loss functions: a content loss \({{\mathcal{L}}}_{{\rm{content}}}\) and a style loss \({{\mathcal{L}}}_{{\rm{style}}}\) that together enforce biological consistency in terms of both tissue composition and staining pattern at the image (tile) level (Fig. 2b , panel 2). Because the generated IHC images should be virtually paired to their corresponding input H&E image in terms of tissue composition, the content loss penalizes loss of content between H&E and IHC images at a tile level. First, real patches X p and synthesized patches \({Y}_{\rm{p}}^{{\prime} }\) are stitched to create images X img and \({Y}_{{\rm{img}}}^{{\prime} }\) , respectively, and corresponding tiles of size 1,024 × 1,024 are extracted (boxes in Fig. 2b , panel 2), denoted as X t  = { x t   ∈   X img } and \({Y}_{\rm t}^{\prime}=\{y_{\rm t}^{\prime}\in {Y}_{\rm img}^{\prime}\}\) , respectively. Then the tiles are encoded by a pretrained feature extractor F , specifically a VGG16 (ref. 59 ) pretrained on ImageNet 60 . The tile-level content loss at layer l of F is calculated as

$$\mathcal{L}_{\rm content}^{\,l}\left(x_{\rm t},y_{\rm t}^{\prime}\right) = \frac{1}{hwc}\left\Vert F^{l}(x_{\rm t}) - F^{l}\left(y_{\rm t}^{\prime}\right)\right\Vert_{2}^{2}\quad(5)$$

where h , w and c are the height, width and channel dimensions of the feature map at the l th layer, respectively.

The style loss utilizes the synthesized image \({Y}_{{\rm{img}}}^{{\prime} }\) and the available real image Y img to match the style, or overall staining distribution, between real and virtual IHC images. Because \({Y}_{{\rm{img}}}^{{\prime} }\) and Y img do not have pixel-wise correspondence, large tiles \({Y}_{\rm t}^{\prime}=\{y_{\rm t}^{\prime}\in {Y}_{\rm img}^{\prime}\}\) and Y t  = {  y t   ∈   Y img } are extracted at random, such that each tile incorporates a sufficient staining distribution. Next, \({Y}_{\rm t}^{\prime}\) and Y t are processed by F to produce feature maps across multiple layers. The style loss at layer l is computed as

$$\mathcal{L}_{\rm style}^{\,l}\left(y_{\rm t},y_{\rm t}^{\prime}\right) = \frac{\left\Vert \mathcal{G}\left(F^{l}(y_{\rm t})\right) - \mathcal{G}\left(F^{l}(y_{\rm t}^{\prime})\right)\right\Vert_{2}^{2}}{\left\Vert \mathcal{G}\left(F^{l}(y_{\rm t})\right)\right\Vert_{2}^{2}}\quad(6)$$

where \({\mathcal{G}}\) is the Gram matrix that measures the correlation between all the styles in a feature map. The denominator is a normalization term that compensates for the under- or overstylization of the tiles in a batch 61 . The overall global consistency loss is computed as

$$\mathcal{L}_{\rm global} = \lambda_{\rm content}\sum_{l\in L_{\rm content}}\mathcal{L}_{\rm content}^{\,l} + \lambda_{\rm style}\sum_{l\in L_{\rm style}}\mathcal{L}_{\rm style}^{\,l}\quad(7)$$

where L content and L style are the lists of the content and style layers of F , respectively, used to extract the feature matrices, and λ content and λ style are regularization hyperparameters for the respective loss terms.
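For concreteness, the following is an illustrative PyTorch sketch of the content and style terms, assuming a frozen ImageNet-pretrained VGG16 from torchvision as the feature extractor F. The layer indices (relu1_2 = 3, relu2_2 = 8, relu3_3 = 15, relu4_3 = 22) and the normalized Gram-difference form follow the description above and are assumptions, not the authors’ released code.

```python
import torch
import torchvision.models as models

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def feats(x, layers):
    out, cur = {}, x
    for i, layer in enumerate(vgg):
        cur = layer(cur)
        if i in layers:
            out[i] = cur
    return out

def gram(f):
    # (B, C, H, W) -> (B, C, C) Gram matrix of channel-wise correlations
    b, c, h, w = f.shape
    f = f.flatten(2)
    return (f @ f.transpose(1, 2)) / (c * h * w)

def content_loss(x_t, y_t_virt, layers=(8,)):          # relu2_2
    fx, fy = feats(x_t, set(layers)), feats(y_t_virt, set(layers))
    return sum(torch.mean((fx[l] - fy[l]) ** 2) for l in layers)

def style_loss(y_t, y_t_virt, layers=(3, 8, 15, 22)):  # relu1_2..relu4_3
    fr, fv = feats(y_t, set(layers)), feats(y_t_virt, set(layers))
    loss = 0.0
    for l in layers:
        g_r, g_v = gram(fr[l]), gram(fv[l])
        loss = loss + ((g_r - g_v) ** 2).sum() / ((g_r ** 2).sum() + 1e-8)
    return loss
```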

Local consistency

The local consistency objective aims to enforce biological consistency at the local cell level and consists of two loss terms: a cell discriminator loss ( \({{\mathcal{L}}}_{{\rm{cellDisc}}}\) ) and a cell classification loss ( \({{\mathcal{L}}}_{{\rm{cellClass}}}\) ) (Fig. 2b , panel 3). The cell discriminator loss is inspired by ref. 26 and uses a cell discriminator D cell to identify whether a cell is real or virtual, in the same way that the patch discriminator of equation ( 1 ) attempts to classify patches as real or virtual. \({{\mathcal{L}}}_{{\rm{cellDisc}}}\) takes as input a real ( Y p ) and a virtual ( \({Y}_{\rm{p}}^{{\prime} }\) ) target patch and their corresponding cell masks ( \({M}_{{Y}_{\rm{p}}}\) and \({M}_{{Y}_{\rm{p}}^{{\prime} }}\) , respectively), which include bounding-box demarcations around the cells (Fig. 2b , panel 3). D cell comprises a feature extractor followed by a RoIAlign layer 62 and a final discriminator. The goal of D cell is to output \({D}_{{\rm{cell}}}({Y}_{\rm{p}},{M}_{{Y}_{\rm{p}}})\to {1}\) and \({D}_{{\rm{cell}}}({Y}_{\rm{p}}^{{\prime} },{M}_{{Y}_{\rm{p}}^{\prime}})\to {0}\) , where 1 and 0 indicate real and virtual cells (indicated in black and purple, respectively, in Fig. 2b , panel 3). The cell discriminator loss is defined as

$$\mathcal{L}_{\rm cellDisc} = \mathbb{E}_{y_{\rm p}\sim Y_{\rm p}}\left[\log D_{\rm cell}\left(y_{\rm p},M_{y_{\rm p}}\right)\right] + \mathbb{E}_{x_{\rm p}\sim X_{\rm p}}\left[\log\left(1 - D_{\rm cell}\left(G(x_{\rm p}),M_{y_{\rm p}^{\prime}}\right)\right)\right]\quad(8)$$
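The following is a hedged sketch of such a cell discriminator: a small CNN feature extractor, RoIAlign pooling of each cell’s bounding box and a linear head that scores each cell as real (1) or virtual (0). Layer widths and the stride-4 spatial scale are illustrative assumptions rather than the authors’ exact architecture.

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_align

class CellDiscriminator(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        self.head = nn.Linear(ch * 7 * 7, 1)

    def forward(self, patches, boxes):
        # patches: (B, 3, H, W); boxes: list of (n_cells, 4) in (x1, y1, x2, y2)
        fmap = self.backbone(patches)
        rois = roi_align(fmap, boxes, output_size=7, spatial_scale=0.25)
        return self.head(rois.flatten(1))   # one real/virtual logit per cell
```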

Although D cell aims to enforce the generation of realistic-looking cells, it is agnostic to their marker expression, as it does not explicitly capture which cells have a positive or a negative staining status. To account for this, we introduce an additional loss via a classifier F cell that is trained to explicitly predict the cell staining status. This is achieved with the help of cell labels \({C}_{{Y}_{\rm{p}}^{{\prime} }}\) and \({C}_{{Y}_{\rm{p}}}\) : that is, binary variables depicting the positive or negative staining status of a cell (indicated as 1: yellow and 0: blue boxes in Fig. 2b , panel 3). The computation of cell masks and labels is described in detail in the section ‘Cell masking and labelling of IHC images’. The cell-level classification loss is computed as a cross-entropy loss:

$$\mathcal{L}_{\rm cellClass} = -\frac{1}{\left|C_{y_{\rm p}}\right|}\sum_{i=1}^{\left|C_{y_{\rm p}}\right|}\sum_{c\in\{0,1\}}{\mathbb{1}}_{\left(C_{y_{\rm p},i}=c\right)}\log p_{i}(c) - \frac{1}{\left|C_{y_{\rm p}^{\prime}}\right|}\sum_{j=1}^{\left|C_{y_{\rm p}^{\prime}}\right|}\sum_{c\in\{0,1\}}{\mathbb{1}}_{\left(C_{y_{\rm p}^{\prime},j}=c\right)}\log p_{j}(c)\quad(9)$$

where \(| {C}_{{y}_{\rm{p}}}|\) and \(| {C}_{{y}_{\rm{p}}^{{\prime} }}|\) are the numbers of cells in y p and \({y}_{\rm{p}}^{{\prime} }\) , respectively, \({{\mathbb{1}}}_{(.)}\) is the indicator function and \(p_{i}(c)\) is the probability that F cell assigns to staining status c for cell i .

The overall local consistency loss is computed as

$$\mathcal{L}_{\rm local} = \lambda_{\rm cellDisc}\,\mathcal{L}_{\rm cellDisc} + \lambda_{\rm cellClass}\,\mathcal{L}_{\rm cellClass}\quad(10)$$

where λ cellDisc and λ cellClass are the regularization hyperparameters for the cell discriminator and classification loss terms, respectively. Importantly, the local consistency loss can be easily generalized to any other cellular or tissue component (for example, nuclei, glands) that might be relevant to other S2S translation problems, provided that corresponding masks and labels are available.

The complete objective function for optimizing the VirtualMultiplexer is given as

$$\mathcal{L}_{\rm total} = \mathcal{L}_{\rm neighbourhood} + \mathcal{L}_{\rm global} + \mathcal{L}_{\rm local}\quad(11)$$

Cell masking and labelling of IHC images

As already discussed, the local consistency loss of equation ( 10 ) needs as input cell masks \({M}_{{X}_{\rm{p}}},{M}_{{Y}_{\rm{p}}}\) and cell labels \({C}_{{X}_{\rm{p}}},{C}_{{Y}_{\rm{p}}}\) . However, acquiring these inputs manually for all patches across all antibodies is practically prohibitive, even for relatively small datasets. Automatic nuclei segmentation/detection using pretrained models (for example, HoVerNet 63 ) is a standard task for H&E images, but no such model exists for IHC images. To circumvent this challenge, we exploit an attractive property of the VirtualMultiplexer: its ability to synthesize virtual images that are pixel-wise aligned with their input, in either translation direction between the source and target domains. Specifically, we train a separate instance of the VirtualMultiplexer that performs IHC → H&E translation. The VirtualMultiplexer IHC→H&E is trained using the neighbourhood and global consistency objectives, as previously described. Once trained, it is used to synthesize a virtual H&E image \({X}_{{\rm{img}}}^{{\prime} }\) from a real IHC image Y img . At this point, we can leverage HoVerNet 63 to detect cell nuclei on real and virtual H&E images ( X img and \({X}_{{\rm{img}}}^{{\prime} }\) ) and simply transfer the corresponding cell masks ( \({M}_{{X}_{{\rm{img}}}}\) and \({M}_{{X}_{{\rm{img}}}^{{\prime} }}\) ) to their pixel-wise aligned IHC counterparts ( \({Y}_{{\rm{img}}}^{{\prime} }\) and Y img , respectively) to acquire \({M}_{{Y}_{{\rm{img}}}^{{\prime} }}\) and \({M}_{{Y}_{{\rm{img}}}}\) . This ‘trick’ eliminates the need to train individual cell detection models for each IHC antibody and fully automates the cell masking process in the IHC domain.

To acquire cell labels \({C}_{{Y}_{{\rm{img}}}^{{\prime} }}\) and \({C}_{{Y}_{{\rm{img}}}}\) , we use only region annotations in Y img , where experts partially annotated areas as positive or negative stainings in a few representative images. Because IHC stainings are specialized in delineating positive or negative staining status, the annotation was easy and fast, requiring approximately 2–3 minutes per image and antibody marker. We also train cell classifiers for the source and target domains: that is, \({D}_{{\rm{cell}}}^{{\rm{source}}}\) and \({D}_{{\rm{cell}}}^{{\rm{target}}}\) , respectively. Provided with the annotations, \({D}_{{\rm{cell}}}^{{\rm{target}}}\) is trained as a CNN patch classifier, and its predictions on Y img , combined with \({M}_{{Y}_{\rm{p}}}\) , result in \({C}_{{Y}_{p}}\) . The same region predictions on Y img are transferred onto \({X}_{{\rm{img}}}^{{\prime} }\) ; afterwards, \({X}_{{\rm{img}}}^{{\prime} }\) and the transferred annotations are used to train \({D}_{{\rm{cell}}}^{{\rm{source}}}\) as a CNN patch classifier, whose predictions on X img , combined with \({M}_{{X}_{\rm{p}}}\) , result in \({C}_{{X}_{p}}\) .
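A short sketch of this mask-transfer ‘trick’ follows. Because the IHC → H&E model output is pixel-wise aligned with its IHC input, nuclei boxes detected on the virtual H&E transfer directly to the real IHC image; `g_ihc_to_he` and `detect_nuclei` (standing in for HoVerNet or any H&E nuclei detector) are assumed callables, not real APIs.

```python
def ihc_cell_masks(y_img, g_ihc_to_he, detect_nuclei):
    x_virtual = g_ihc_to_he(y_img)     # virtual H&E, pixel-wise aligned with y_img
    boxes = detect_nuclei(x_virtual)   # (n_cells, 4) boxes in the H&E domain
    return boxes                       # valid for y_img by pixel-wise alignment
```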

Implementation and training details

The architectural choices of the VirtualMultiplexer were set as follows: G is a ResNet 64 with nine residual blocks, D is a PatchGAN discriminator 12 , D cell includes four stride-2 feature convolutions followed by a RoIAlign layer and a discrimination layer, and F cell includes four stride-2 feature convolutions and a two-layer MLP. We use Xavier weight initialization 65 , instance normalization 66 and a batch size of one image. We use the least-squares GAN loss 67 for \({{\mathcal{L}}}_{{\rm{adv}}}\) . The model hyperparameters for the loss terms of the VirtualMultiplexer are set as follows: λ NCE is 1 with temperature τ equal to 0.08, λ content   ∈  {0.01, 0.1}, λ style   ∈  {5, 10}, λ cellDisc   ∈  {0.5, 1} and λ cellClass   ∈  {0.1, 0.5}. The VirtualMultiplexer is optimized for 125 epochs using the Adam optimizer 68 with momentum parameters β 1  = 0.5 and β 2  = 0.999. Different learning rates (lr) are employed for the different consistency objectives: for neighbourhood consistency, lr G and lr D are set to 0.0002; for global consistency, lr G is chosen from {0.0001, 0.0002}; and for local consistency, \({\text{lr}}_{{D}_{{\rm{cell}}}}\) and \({\text{lr}}_{{F}_{{\rm{cell}}}}\) are chosen from {0.00001, 0.0001, 0.0002}. Among the remaining hyperparameters, the number of tiles extracted per image to compute \({{\mathcal{L}}}_{{\rm{content}}}\) and \({{\mathcal{L}}}_{{\rm{style}}}\) is set to eight; the content layer in F is relu2_2; the style layers are relu1_2, relu2_2, relu3_3 and relu4_3; and the number of cells per patch to compute \({{\mathcal{L}}}_{{\rm{cellDisc}}}\) is set to eight.
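For concreteness, a minimal sketch of the optimizer configuration implied by these settings (Adam with β1 = 0.5, β2 = 0.999 and per-objective learning rates); the module names passed in are hypothetical stand-ins for the components described above, and one value is picked from each stated grid.

```python
import torch

def build_optimizers(generator, discriminator, cell_disc, cell_clf):
    def adam(module, lr):
        return torch.optim.Adam(module.parameters(), lr=lr, betas=(0.5, 0.999))
    return {
        "G": adam(generator, 2e-4),        # lr_G for neighbourhood consistency
        "D": adam(discriminator, 2e-4),    # lr_D
        "D_cell": adam(cell_disc, 1e-4),   # local consistency, one value from the grid
        "F_cell": adam(cell_clf, 1e-4),
    }
```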

GT architecture

The GT architecture, proposed in ref. 41 , fuses a graph neural network and a vision transformer (ViT) to process histopathology images. The graph neural network operates on a graph-structured representation of a histopathology image, where the nodes and edges of the graph denote patches and interpatch spatial connectivity, respectively, and the nodes encode patch features extracted from a pretrained ResNet-50 network 64 . The graph representation undergoes graph convolutions to contextualize the node features within the local tissue neighbourhood. Specifically, the GT employs a graph convolution layer 69 to learn contextualized node embeddings by propagating and aggregating neighbourhood node information. Subsequently, a ViT layer operates on the contextualized node features, leverages self-attention to weigh the importance of the nodes and aggregates the node information to render an image-level feature representation. Finally, an MLP maps the image-level features to a downstream image label. Note that histopathology images can have different spatial dimensions; therefore, their graph representations can have a varying number of nodes. Also, the number of nodes can be very high when operating on gigapixel-sized WSIs. These two factors can hinder the integration of the graph convolution layer with the ViT layer. To address these challenges, the GT introduces a mincut pooling layer 70 , which reduces the number of nodes to a fixed number of tokens while preserving the local neighbourhood information of the nodes.

The architecture of the GT follows the official implementation on GitHub ( https://github.com/vkola-lab/tmi2022 ). Each input image was cropped to create a bag of 256 × 256 non-overlapping patches at ×10 magnification, and patches with a non-tissue area greater than 10% were discarded as background. The patches were encoded using a ResNet-50 64 model pretrained on the ImageNet dataset 60 . A graph representation was constructed over the patches using an eight-node connectivity pattern. The GT network consisted of one graph convolutional layer, and the ViT layer configuration was set as follows: number of ViT blocks = 3, MLP size = 128, patch embedding dimension = 32 and number of attention heads = 8. The model hyperparameters were set as follows: number of clusters in mincut pooling = {50, 100}, Adam optimizer with an initial learning rate of {0.0001, 0.00001}, a cosine annealing scheduler and a mini-batch size of eight. The GT models were trained for 400 epochs with early stopping.
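The following is a hedged sketch of the GT input construction: nodes are patch embeddings and edges follow the eight-neighbour grid connectivity described above. PyTorch Geometric, which the pipeline already uses (see ‘Computational hardware and software’), is assumed; this helper is an illustration, not the official tmi2022 code.

```python
import torch
from torch_geometric.data import Data

def build_patch_graph(features, grid_rc):
    """features: (N, d) patch embeddings; grid_rc: (N, 2) integer (row, col)
    positions of each patch on the patch grid."""
    index = {tuple(rc): i for i, rc in enumerate(grid_rc.tolist())}
    edges = []
    for (r, c), i in index.items():
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                j = index.get((r + dr, c + dc))
                if j is not None and j != i:   # connect the 8 spatial neighbours
                    edges.append((i, j))
    edge_index = torch.tensor(edges, dtype=torch.long).t().contiguous()
    return Data(x=features, edge_index=edge_index)
```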

The VirtualMultiplexer was trained using the EMPaCT TMA dataset; an independent subset of EMPaCT was used for internal testing. The VirtualMultiplexer was further evaluated in a zero-shot fashion—that is, without any retraining or fine-tuning—on three external prostate cancer datasets (prostate cancer WSIs, SICAP 42 and PANDA 43 needle biopsies), on an independent PDAC dataset (PDAC TMAs) and on TCGA data from breast and colorectal cancer. In all cases, independent GTs are trained and tested for individual datasets by using both real and virtually stained samples to address various downstream classification tasks. Details on all datasets used follow.

EMPaCT TMA dataset

The dataset contains TMAs from 210 primary prostate tissues as part of EMPaCT and the Institute of Tissue Pathology in Bern. The study followed the guidelines of the World Medical Association Declaration of Helsinki 1964, updated in October 2013, and was conducted after approval by the Ethics Committees of Bern (CEC ID2015-00128). For each patient, four cores were selected, with two of them representing a low Gleason pattern and the other two a high Gleason pattern. Consecutive slices from each core were stained with H&E and IHC using multiple antibodies against nuclear markers NKX3.1 and AR, tumour markers p53 and ERG, and membrane markers CD44 and CD146. TMA FFPE sections of 4 μm were deparaffinized and used for heat-mediated antigen retrieval (citrate buffer, pH 6, Vector Labs; or Tris-HCl, pH 9). Sections were blocked for 10 min in 3% H 2 O 2 , followed by 30 min room temperature incubation in 1% bovine serum albumin in phosphate-buffered saline–0.1% Tween 20. The following antibodies were used: anti-AR (Dako Agilent, catalogue no. M3562, AR441, 1:100 dilution), anti-NKX3.1 (Athena Enzyme Systems, catalogue no. 314, lot 18025, 1:200), anti-p53 (Dako Agilent, catalogue no. M7001, DO-7, 1:800), anti-CD44 (Abcam, catalogue no. ab16728, 156-3C11, 1:2000), anti-ERG (Abcam, catalogue no. ab133264, EPR3864(2), 1:500) and anti-CD146 (Abcam, catalogue no. ab75769, EPR3208, 1:500). Images were acquired using a 3D Histech Panoramic Flash II 250 scanner at ×20 magnification (resolution 0.24 μm per pixel). The cores were annotated at patient level by expert uro-pathologists with binary labels for overall survival status (0, alive/censored; 1, prostate-cancer-related death) and disease progression status (0, no recurrence; 1, recurrence). Clinical follow-up was recorded on a per-patient basis, with a maximum follow-up time of 12 years. For both the survival and disease progression clinical endpoints, the available data were imbalanced in terms of class distributions. Access is possible upon request to the corresponding authors. The distribution of cores per clinical endpoint for the EMPaCT dataset is summarized in Supplementary Table 2 .

Prostate cancer WSIs

Primary stage prostate cancer FFPE tissue sections (4 μm) were deparaffinized and used for heat-mediated antigen retrieval (citrate buffer, pH 6, Vector Labs). Sections were blocked for 10 min in 3% H 2 O 2 , followed by 30 min room temperature incubation in 1% bovine serum albumin in phosphate-buffered saline–0.1% Tween 20. The following primary antibodies were used: anti-CD146 (Abcam, catalogue no. ab75769, EPR3208, 1:500), anti-AR (Abcam, catalogue no. ab133273, EPR1535, 1:100) and anti-NKX3.1 (Cell Signaling, catalogue no. 83700T, D2Y1A, 1:200). Secondary anti-rabbit antibody Envision horseradish peroxidase (DAKO, Agilent Technologies, catalogue no. K400311-2, undiluted) was incubated for 30 min, and signal detection was done using 3-amino-9-ethylcarbazole substrate (DAKO, Agilent Technologies). Sections were counterstained with hematoxylin and mounted with Aquatex. Images were acquired using a 3D Histech Panoramic Flash II 250 scanner at ×20 magnification (resolution 0.24 μm per pixel).

SICAP dataset

The dataset contains 155 H&E-stained WSIs from needle biopsies taken from 95 patients, split into 18,783 patches of size 512 × 512 (ref. 42 ). The WSIs were reconstructed by stitching the patches. The WSIs were scanned at ×40 magnification with a Ventana iScan Coreo scanner and downsampled to ×10 magnification. The WSIs were annotated with Gleason grades by expert uro-pathologists at the Hospital Clínico of Valencia, Spain.

PANDA dataset

The dataset includes 5,759 H&E-stained needle biopsies from 1,243 patients at the Radboud University Medical Center, Netherlands 71 and 5,662 H&E-stained needle biopsies from 1,222 patients at various hospitals in Stockholm, Sweden 72 . The slides from Radboud were scanned with a 3D Histech Panoramic Flash II 250 scanner at ×20 magnification (resolution 0.24 μm per pixel) and were downsampled to ×10. The slides from Sweden were scanned with a Hamamatsu C9600-12 and an Aperio Scan Scope AT2 scanner at ×10 magnification, with pixel resolutions of 0.45202 μm and 0.5032 μm, respectively. The Gleason grades of the biopsies were annotated by expert uro-pathologists and were released as part of the PANDA challenge 43 . We removed the noisy and ambiguously labelled biopsies from the dataset, resulting in 4,564 and 4,988 biopsies from the Radboud and the Swedish cohorts, respectively (9,552 biopsies in total). The distribution of WSIs across Gleason grades for both SICAP and PANDA datasets is shown in Supplementary Table 3 .

PDAC TMAs

The PDAC TMA contained cancer tissue of 117 (50 female, 67 male) PDAC cases resected in a curative setting at the Department of Visceral Surgery of Inselspital Bern and diagnosed at the Institute of Tissue Medicine and Pathology (ITMP) of the University of Bern between the years 2014 and 2020. The study followed the guidelines of the World Medical Association Declaration of Helsinki 1964, updated in October 2013, and was conducted after approval by the Ethics Committees of Bern (CEC ID2020-00498). All participants provided written general consent. The TMA contained three spots from each case (tumour front, tumour centre, tumour stroma), for a total of 351 tissue spots. Thirteen of these 117 cases were treated by neoadjuvant chemotherapy followed by surgical resection and adjuvant therapy, whereas the majority of the cases (104) were resected curatively and received adjuvant therapy. All cases were comprehensively characterized clinico-pathologically, including TNM stage, as part of a master’s thesis by Jessica Lisa Rohrbach at ITMP, supervised by Martin Wartenberg. All cases were Union for International Cancer Control (UICC) tumour stage I, stage II or stage III cases on pathologic examination, according to the UICC TNM Classification of Malignant Tumours , 8th edition 73 ; the TMA did not include UICC tumour stage IV cases. In all of our analysis, including the TNM prediction (Fig. 6d ), we excluded the 13 neoadjuvant cases and considered only the 104 cases that received adjuvant therapy. The distribution of cores across the three TNM stages is reported in Supplementary Table 4 .

TCGA WSIs

The dataset includes example H&E WSIs from breast cancer (BRCA) and colorectal cancer (CRC) from The Cancer Genome Atlas (TCGA), available at the GDC data portal ( https://portal.gdc.cancer.gov ) as diagnostic slides under project IDs TCGA-BRCA and TCGA-CRC, respectively.

Data preprocessing

For all datasets used, we followed the same tissue region detection and patch extraction preprocessing procedure. Specifically, the tissue region was segmented using the preprocessing tools of the HistoCartography library 74 . A binary tissue mask denoting the tissue and non-tissue regions was computed for each downsampled input image by iteratively applying Gaussian smoothing and Otsu thresholding until the mean of non-tissue pixels fell below a threshold. The estimated contours of the detected tissue and the tissue cavities were then filtered by area to generate the final segmentation mask. Subsequently, non-overlapping patches of size 256 × 256 were extracted at ×10 magnification using the segmentation contours. The extracted H&E and IHC patches of the EMPaCT dataset were used for training and internal validation of the VirtualMultiplexer. For the unseen datasets (prostate cancer WSIs, SICAP, PANDA, PDAC, TCGA), the images were first stain-normalized to mitigate the staining appearance variability with respect to the EMPaCT TMAs, and then H&E patches were extracted. Specifically, for the SICAP, PANDA and PDAC datasets, we used the Vahadane stain normalization method 75 from the HistoCartography library 74 on the entire images. We masked out the blank regions by applying a threshold in the Lab colour space and computed the stain-density maps using only the tissue regions. Afterwards, the target stain-density maps were combined with the reference colour appearance matrix to produce normalized images, as proposed by the Vahadane method. Supplementary Fig. 1 presents a sample unnormalized WSI from the PANDA dataset and the corresponding stain-normalized WSI based on the reference EMPaCT TMA. For the prostate cancer and TCGA WSIs, we followed the same procedure but with stain-density maps extracted at a lower magnification (×2.5) for computational efficiency. Note that the VirtualMultiplexer is agnostic to the stain normalization method and can be trained using H&E images normalized by other advanced stain normalization algorithms: for example, deep learning-based methods 76 .
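An illustrative sketch of this preprocessing route follows (not the HistoCartography implementation): a tissue mask via Gaussian smoothing plus Otsu thresholding, followed by extraction of non-overlapping 256 × 256 patch coordinates with sufficient tissue. The 50% tissue threshold is an assumption.

```python
from skimage.color import rgb2gray
from skimage.filters import gaussian, threshold_otsu

def tissue_patches(img, patch=256, min_tissue=0.5):
    gray = gaussian(rgb2gray(img), sigma=2)
    mask = gray < threshold_otsu(gray)   # tissue stains darker than background
    coords = []
    for r in range(0, img.shape[0] - patch + 1, patch):
        for c in range(0, img.shape[1] - patch + 1, patch):
            if mask[r:r + patch, c:c + patch].mean() >= min_tissue:
                coords.append((r, c))
    return coords
```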

Method evaluation

Patch-level evaluation

We use the FID score 40 to compare the distribution of the virtual IHC patches with that of the real IHC patches, as shown in Fig. 3 . The computation begins by projecting the virtual and real IHC patches to an embedding space using an InceptionV3 (ref. 77 ) model pretrained on ImageNet 60 . The extracted embeddings are used to estimate multivariate normal distributions \({\mathcal{N}}(\mu_{\rm r},\Sigma_{\rm r})\) for real data and \({\mathcal{N}}(\mu_{\rm v},\Sigma_{\rm v})\) for virtual data. Finally, the FID score is computed as

$$\text{FID} = \left\Vert \mu_{\rm r} - \mu_{\rm v}\right\Vert_{2}^{2} + \text{Tr}\left(\Sigma_{\rm r} + \Sigma_{\rm v} - 2\left(\Sigma_{\rm r}\Sigma_{\rm v}\right)^{1/2}\right)\quad(12)$$

where μ r and μ v are the feature-wise means of the real and virtual patches, Σ r and Σ v are the covariance matrices of the real and virtual embeddings, and Tr is the trace function. A lower FID score indicates a lower disparity between the two distributions and thereby a higher staining efficacy of the VirtualMultiplexer. To ensure reproducibility, we ran each model three times with three independent initializations and computed the mean and standard deviation for each model (barplot height and error bar in Fig. 3 ). We used a 70%:30% ratio to split the data into train and test sets, respectively. Because a different number of IHC stainings was available per marker in the EMPaCT data, the exact number of cores used per marker is given in Supplementary Table 5.
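A minimal computation of equation (12) from two sets of InceptionV3 embeddings is sketched below; scipy’s `sqrtm` provides the matrix square root, and embedding extraction is assumed to happen elsewhere.

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(emb_real, emb_virt):
    mu_r, mu_v = emb_real.mean(axis=0), emb_virt.mean(axis=0)
    sig_r = np.cov(emb_real, rowvar=False)
    sig_v = np.cov(emb_virt, rowvar=False)
    covmean = sqrtm(sig_r @ sig_v)
    if np.iscomplexobj(covmean):       # numerical noise can introduce tiny
        covmean = covmean.real         # imaginary components; discard them
    return float(np.sum((mu_r - mu_v) ** 2)
                 + np.trace(sig_r + sig_v - 2.0 * covmean))
```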

Image-level evaluation

We used a number of downstream classification tasks to assess the discriminative ability of the virtually stained IHC images on the EMPaCT, SICAP, PANDA and PDAC datasets. We further used these tasks to demonstrate the utility of virtually multiplexed staining in comparison to standalone real H&E, real IHC and virtual IHC staining. Specifically, given the aforementioned images, we constructed graph representations as described in the section ‘GT architecture’. Subsequently, GTs 41 were trained under unimodal and multimodal settings using both real and virtually stained images and evaluated on a held-out independent test dataset. The final classification scores are reported using a weighted F 1 metric, where a higher score indicates better classification performance and thereby higher discriminative power of the utilized images. As before, we ran each model three times with three independent initializations and computed the mean and standard deviation for each model (barplot heights and error bars in Figs. 5 and 6 ). In all cases, we used a 60%:20%:20% ratio to split the data into train, validation and test sets, respectively. The exact numbers of train, validation and test samples used per task, marker and training setting in the EMPaCT dataset are given in Supplementary Table 6 .
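The reported metric is the weighted F1 score; a one-line sklearn equivalent is shown below, with toy labels standing in for GT test-set predictions.

```python
from sklearn.metrics import f1_score

y_true = [0, 1, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1]
print(f1_score(y_true, y_pred, average="weighted"))  # support-weighted mean of per-class F1
```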

For the SICAP, PANDA and PDAC datasets, the exact numbers of samples used in the train, validation and test splits coincide for all unimodal and multimodal models of Fig. 6 and are reported in Supplementary Table 7 .

Computational hardware and software

The image datasets were preprocessed on POWER9 central processing units and one NVIDIA Tesla A100 graphics processing unit using the HistoCartography library 74 . The deep learning models were trained on NVIDIA Tesla P100 graphics processing units using PyTorch (v.1.13.1) (ref. 78 ) and PyTorch Geometric (v.2.3.0) (ref. 79 ). The entire pipeline was implemented in Python (v.3.9.1).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The main dataset used to support this study (EMPaCT) has been deposited in Zenodo, together with the prostate cancer WSIs 80 . The SICAP dataset is available at Mendeley data 81 . The PANDA dataset is available at the Kaggle website ( https://www.kaggle.com/c/prostate-cancer-grade-assessment/data ). The TCGA WSIs from breast and colorectal tissue are available as diagnostic slides under project IDs TCGA-BRCA and TCGA-CRC, respectively, at the GDC data portal ( https://portal.gdc.cancer.gov ). The PDAC dataset is available for academic research purposes upon request via e-mail to M.W. ([email protected]) or the Translational Research Unit Platform of ITMP of the University of Bern ([email protected]). All clinical data associated with the EMPaCT and PDAC patient cohorts cannot be shared owing to patient-confidentiality obligations.

Code availability

All source code of the VirtualMultiplexer is available under an open-source license at https://github.com/AI4SCR/VirtualMultiplexer and via Zenodo at https://doi.org/10.5281/zenodo.11941982 (ref. 82 ).

References

Kashyap, A. et al. Quantification of tumor heterogeneity: from data acquisition to metric generation. Trends Biotechnol. 40, 647–676 (2022).


Chan, J. K. The wonderful colors of the hematoxylin–eosin stain in diagnostic surgical pathology. Int. J. Surgical Pathol. 22 , 12–32 (2014).

De Matos, L. L., Trufelli, D. C., De Matos, M. G. L. & da Silva Pinhal, M. A. Immunohistochemistry as an important tool in biomarkers detection and clinical practice. Biomark. Insights 5 , BMI–S2185 (2010).

Giesen, C. et al. Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry. Nat. Methods 11 , 417–422 (2014).

Goltsev, Y. et al. Deep profiling of mouse splenic architecture with codex multiplexed imaging. Cell 174 , 968–981 (2018).

Angelo, M. et al. Multiplexed ion beam imaging of human breast tumors. Nat. Med. 20 , 436–442 (2014).

Lewis, S. M. et al. Spatial omics and multiplexed imaging to explore cancer biology. Nat. Methods 18 , 997–1012 (2021).

Pillar, N. & Ozcan, A. Virtual tissue staining in pathology using machine learning. Expert Rev. Mol. Diagnostics 22 , 987–989 (2022).

Bai, B. et al. Deep learning-enabled virtual histological staining of biological samples. Light.: Sci. Appl. 12 , 57 (2023).

Tschuchnig, M. E., Oostingh, G. J. & Gadermayr, M. Generative adversarial networks in digital pathology: a survey on trends and future potential. Patterns 1 , 100089 (2020).

Jose, L., Liu, S., Russo, C., Nadort, A. & Di Ieva, A. Generative adversarial networks in digital pathology and histopathological image processing: a review. J. Pathol. Inform. 12 , 43 (2021).

Isola, P., Zhu, J.-Y., Zhou, T. & Efros, A. A. Image-to-image translation with conditional adversarial networks. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 5967–5976 (IEEE, 2017).

Li, J. et al. Biopsy-free in vivo virtual histology of skin using deep learning. Light Sci. Appl. 10 , 233 (2021).

Rivenson, Y. et al. PhaseStain: the digital staining of label-free quantitative phase microscopy images using deep learning. Light Sci. Appl. 8 , 23 (2019).

Rivenson, Y. et al. Virtual histological staining of unlabelled tissue-autofluorescence images via deep learning. Nat. Biomed. Eng. 3 , 466–477 (2019).

Rana, A. et al. Use of deep learning to develop and analyze computational hematoxylin and eosin staining of prostate core biopsy images for tumor diagnosis. JAMA Netw. Open 3 , e205111 (2020).

de Haan, K. et al. Deep learning-based transformation of H&E stained tissues into special stains. Nat. Commun. 12 , 4884 (2021).

Zhang, Y. et al. Digital synthesis of histological stains using micro-structured and multiplexed virtual staining of label-free tissue. Light Sci. Appl. 9 , 78 (2020).

Liu, S. et al. BCI: breast cancer immunohistochemical image generation through pyramid pix2pix. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 1815–1824 (IEEE, 2022).

Xie, W. Prostate cancer risk stratification via non-destructive 3D pathology with deep learning-assisted gland analysis. Cancer Res. 82 , 334 (2022).

Ghahremani, P. et al. Deep learning-inferred multiplex immunofluorescence for immunohistochemical image quantification. Nat. Mach. Intell. 4 , 401–412 (2022).

Zhang, R. et al. MVFStain: multiple virtual functional stain histopathology images generation based on specific domain mapping. Med. Image Anal. 80 , 102520 (2022).

Mercan, C. et al. Virtual staining for mitosis detection in breast histopathology. In Proc. 17th International Symposium on Biomedical Imaging (ISBI) 1770–1774 (IEEE, 2020).

Lahiani, A., Klaman, I., Navab, N., Albarqouni, S. & Klaiman, E. Seamless virtual whole slide image synthesis and validation using perceptual embedding consistency. IEEE J. Biomed. Health Inform. 25 , 403–411 (2020).

Liu, S. et al. Unpaired stain transfer using pathology-consistent constrained generative adversarial networks. IEEE Trans. Med. Imaging 40 , 1977–1989 (2021).

Boyd, J. et al. Region-guided CycleGANs for stain transfer in whole slide images. In Proc. Medical Image Computing and Computer Assisted Intervention (MICCAI) 356–365 (Springer, 2022).

Lin, Y. et al. Unpaired multi-domain stain transfer for kidney histopathological images. In Proc. AAAI Conference on Artificial Intelligence. 1630–1637 (AAAI, 2022).

Bouteldja, N., Klinkhammer, B. M., Schlaich, T., Boor, P. & Merhof, D. Improving unsupervised stain-to-stain translation using self-supervision and meta-learning. J. Pathol. Inform. 13 , 100107 (2022).

Ozyoruk, K. B. et al. A deep-learning model for transforming the style of tissue images from cryosectioned to formalin-fixed and paraffin-embedded. Nat. Biomed. Eng. 6 , 1407–1419 (2022).

Zhu, J.-Y., Park, T., Isola, P. & Efros, A. A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proc. IEEE International Conference on Computer Vision (ICCV) 2242–2251 (IEEE, 2017).

Zeng, B. et al. Semi-supervised PR virtual staining for breast histopathological images. In Proc. Medical Image Computing and Computer Assisted Intervention (MICCAI). 232–241 (Springer, 2022).

Borji, A. Pros and cons of GAN evaluation measures: new developments. Comput. Vis. Image Underst. 215, 103329 (2022).

Cohen, J. P., Luck, M. & Honari, S. Distribution matching losses can hallucinate features in medical image translation. In Proc. Medical Image Computing and Computer Assisted Intervention (MICCAI) 529–536 (Springer, 2018).

Park, T., Efros, A. A., Zhang, R. & Zhu, J.-Y. Contrastive learning for unpaired image-to-image translation. In Proc. European Conference on Computer Vision (ECCV) 319–345 (Springer, 2020).

Goodfellow, I. J. et al. Generative adversarial nets. In Proc. 27th International Conference on Neural Information Processing Systems 2672–2680 (2014).

Briganti, A. et al. Identifying the best candidate for radical prostatectomy among patients with high-risk prostate cancer. Eur. Urol. 61 , 584–592 (2012).

Kneitz, B. et al. Survival in patients with high-risk prostate cancer is predicted by mir-221, which regulates proliferation, apoptosis, and invasion of prostate cancer cells by inhibiting IRF2 and SOCS3. Cancer Res. 74 , 2591–2603 (2014).

Tosco, L. et al. The EMPaCT classifier: a validated tool to predict postoperative prostate cancer-related death using competing-risk analysis. Eur. Urol. Focus 4 , 369–375 (2018).

Ho, M.-Y., Wu, M.-S. & Wu, C.-M. Ultra-high-resolution unpaired stain transformation via kernelized instance normalization. In Proc. European Conference on Computer Vision (ECCV) 490–505 (Springer, 2022).

Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B. & Hochreiter, S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Proc. 31st International Conference on Neural Information Processing Systems. 6629–6640 (ACM, 2017).

Zheng, Y. et al. A graph-transformer for whole slide image classification. IEEE Trans. Med. Imaging 41 , 3003–3015 (2022).

Silva-Rodríguez, J., Colomer, Adrián, Sales, María, Molina, R. & Naranjo, V. Going deeper through the Gleason scoring scale: an automatic end-to-end system for histology prostate grading and cribriform pattern detection. Comput. Methods Programs Biomed. 195 , 105637 (2020).

Bulten, W. et al. Artificial intelligence for diagnosis and Gleason grading of prostate cancer: the PANDA challenge. Nat. Med. 28 , 154–163 (2022).

Pati, P. et al. Weakly supervised joint whole-slide segmentation and classification in prostate cancer. Med. Image Anal. 89 , 102915 (2023).

The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330–337 (2012).

The Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature 490 , 61–70 (2012).

de Bel, T., Hermsen, M., Kers, J., van der Laak, J. & Litjens, G. Stain-transforming cycle-consistent generative adversarial networks for improved segmentation of renal histopathology. In Proc. 2nd International Conference on Medical Imaging with Deep Learning. 151–163 (PMLR, 2019).

Sun, K. et al. Bi-directional feature fusion generative adversarial network for ultra-high resolution pathological image virtual re-staining. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 3904–3913 (IEEE, 2023).

Siller, M. et al. On the acceptance of ‘fake’ histopathology: a study on frozen sections optimized with deep learning. J. Pathol. Inform. 13 , 100168 (2022).

Liang, J. et al. Deep learning supported discovery of biomarkers for clinical prognosis of liver cancer. Nat. Mach. Intell. 5 , 408–420 (2023).

Wang, S. et al. Deep learning of cell spatial organizations identifies clinically relevant insights in tissue images. Nat. Commun. 14 , 7872 (2023).

Nan, Y. et al. Data harmonisation for information fusion in digital healthcare: a state-of-the-art systematic review, meta-analysis and future research directions. Inf. Fusion 82 , 99–122 (2022).

Vert, J. P. How will generative AI disrupt data science in drug discovery? Nat. Biotechnol. 41 , 750–751 (2023).

Gutmann, M. & Hyvärinen, A. Noise-contrastive estimation: a new estimation principle for unnormalized statistical models. In Proc. 13th International Conference on Artificial Intelligence and Statistics 297–304 (PMLR, 2010).

van den Oord, A., Li, Y. & Vinyals, O. Representation learning with contrastive predictive coding. Preprint at https://arXiv.org/abs/1807.03748 (2018).

Taigman, Y., Polyak, A. & Wolf, L. Unsupervised cross-domain image generation. In Proc. 4th International Conference on Learning Representations (ICLR) 1441–1455 (ICLR, 2017).

Zhang, L., Zhang, L., Mou, X. & Zhang, D. FSIM: a feature similarity index for image quality assessment. IEEE Trans. Image Process. 20 , 2378–2386 (2011).


Gatys, L. A., Ecker, A. S. & Bethge, M. Image style transfer using convolutional neural networks. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2414–2423 (IEEE, 2016).

Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. Preprint at https://arXiv.org/abs/1409.1556 (2014).

Deng, J. et al. ImageNet: a large-scale hierarchical image database. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 248–255 (IEEE, 2009).

Cheng, J., Jaiswal, A., Wu, Y., Natarajan, P. & Natarajan, P. Style-aware normalized loss for improving arbitrary style transfer. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 134–143 (IEEE, 2021).

He, K., Gkioxari, G., Dollár, P. & Girshick, R. Mask R-CNN. In Proc. IEEE International Conference on Computer Vision (ICCV) 2980–2988 (IEEE, 2017).

Graham, S. et al. Hover-Net: simultaneous segmentation and classification of nuclei in multi-tissue histology images. Med. Image Anal. 58 , 101563 (2019).

He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 (IEEE, 2016).

Glorot, X. & Bengio, Y. Understanding the difficulty of training deep feed-forward neural networks. In Proc. 13th International Conference on Artificial Intelligence and Statistics 249–256 (PMLR, 2010).

Ulyanov, D., Vedaldi, A. & Lempitsky, V. Instance normalization: the missing ingredient for fast stylization. Preprint at https://arXiv.org/abs/1607.08022 (2016).

Mao, X. et al. Least squares generative adversarial networks. In Proc. IEEE International Conference on Computer Vision (ICCV) 2813–2821 (IEEE, 2017).

Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arXiv.org/abs/1412.6980 (2014).

Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. Preprint at https://arXiv.org/abs/1609.02907 (2017).

Bianchi, F. M., Grattarola, D. & Alippi, C. Spectral clustering with graph neural networks for graph pooling. In Proc. 37th International Conference on Machine Learning (ICML) 874–883 (PMLR, 2020).

Bulten, W. et al. Automated deep-learning system for Gleason grading of prostate cancer using biopsies: a diagnostic study. Lancet Oncol. 21 , 233–241 (2020).

Ström, P. et al. Pathologist-level grading of prostate biopsies with artificial intelligence. Preprint at https://arXiv.org/abs/1907.01368 (2019).

Brierley, J. D., Gospodarowicz, M. K. & Wittekind, C. TNM Classification of Malignant Tumours (Wiley, 2017).

Jaume, G., Pati, P., Anklin, V., Foncubierta, A. & Gabrani, M. Histocartography: a toolkit for graph analytics in digital pathology. In Proc. MICCAI Workshop on Computational Pathology 117–128 (PMLR, 2021).

Vahadane, A. et al. Structure-preserving color normalization and sparse stain separation for histological images. IEEE Trans. Med. Imaging 35 , 1962–1971 (2016).

Voon, W. et al. Evaluating the effectiveness of stain normalization techniques in automated grading of invasive ductal carcinoma histopathological images. Sci. Rep. 13 , 20518 (2023).

Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J. & Wojna, Z. Rethinking the inception architecture for computer vision. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2818–2826 (IEEE, 2016).

Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Proc. Advances in Neural Information Processing Systems (NeurIPS) 8024–8035 (ACM, 2019).

Fey, M. & Lenssen, J. E. Fast graph representation learning with Pytorch Geometric. Preprint at https://arXiv.org/abs/1903.02428 (2019).

Karkampouna, S. & Kruithof-de Julio, M. Dataset EMPaCT TMA. Zenodo https://doi.org/10.5281/zenodo.10066853 (2023).

Silva-Rodríguez, J. SICAPv2-prostate whole slide images with gleason grades annotations. Mendeley Data https://doi.org/10.17632/9xxm58dvs3.1 (2020).

Pati, P. VirtualMultiplexer code. Zenodo https://doi.org/10.5281/zenodo.11941982 (2024).

Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Proc. 31st International Conference on Neural Information Processing Systems. 4768–4777 (2017).

Tang, F. et al. Chromatin profiles classify castration-resistant prostate cancers suggesting therapeutic targets. Science 376 , eabe1505 (2022).

Blank, A., Dawson, H., Hammer, C., Perren, A. & Lugli, A. Lean management in the pathology laboratory. Der Pathol. 38 , 540–544 (2017).


Acknowledgements

We would like to thank G. Jaume, J. Born and M. Graziani for constructive comments, discussions and suggestions. The results published here are in part based upon data generated by the TCGA Research Network at https://www.cancer.gov/tcga . This work was supported by the Swiss National Science Foundation Sinergia grant no. 202297 to M.R. and M.K.-d.J. The PDAC TMA construction took place at the Translational Research Unit Platform of the ITMP of the University of Bern ( https://www.ngtma.com/ ) in the setting of a grant by the Foundation for Clinical-Experimental Cancer Research Bern to M.W. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Authors and affiliations.

IBM Research Europe, Rüschlikon, Switzerland

Pushpak Pati, Adriano Martinelli & Marianna Rapsomaniki

Urology Research Laboratory, Department for BioMedical Research, University of Bern, Bern, Switzerland

Sofia Karkampouna, Francesco Bonollo, Martina Radić & Marianna Kruithof-de Julio

Department of Urology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland

Sofia Karkampouna & Marianna Kruithof-de Julio

Department of Pathology, Medical University of Vienna, Vienna, Austria

Eva Compérat

Department of Urology, Lindenhofspital Bern, Bern, Switzerland

Martin Spahn

Department of Urology, University Duisburg-Essen, Essen, Germany

ETH Zürich, Zürich, Switzerland

Adriano Martinelli

Biomedical Data Science Center, Lausanne University Hospital, Lausanne, Switzerland

Adriano Martinelli & Marianna Rapsomaniki

Institute of Tissue Medicine and Pathology, University of Bern, Bern, Switzerland

Martin Wartenberg

Translational Organoid Resource, Department for BioMedical Research, University of Bern, Bern, Switzerland

Marianna Kruithof-de Julio

Faculty of Biology and Medicine, University of Lausanne, Lausanne, Switzerland

Marianna Rapsomaniki


Contributions

P.P. conceived and implemented the model. P.P., A.M. and M. Rapsomaniki designed and performed computational analyses. S.K., F.B. and M. Radić performed experiments. S.K., F.B., E.C., M.W. and M.K.-d.J. performed all qualitative assessments. P.P., S.K., F.B., A.M. and M.R. compiled the figures. M.S., M.W. and M.K.-d.J. contributed materials for the experiments. P.P., S.K., F.B. and M. Rapsomaniki wrote the paper with inputs from all authors. M.K.-d.J. and M. Rapsomaniki were responsible for the overall planning and supervision of the project.

Corresponding authors

Correspondence to Marianna Kruithof-de Julio or Marianna Rapsomaniki .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature Machine Intelligence thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Qualitative evaluation of the VirtualMultiplexer for two TMA cores in the EMPaCT dataset.

Additional examples to the ones presented in Fig. 3 . Columns one and three present two H&E stained TMA cores and corresponding virtually stained images for six IHC markers. Columns two and four present reference IHC images for the same core.

Extended Data Fig. 2 Visual quality assessment of virtually stained IHC images of the EMPaCT prostate cancer TMA.

(A) Example virtual TMA cores across all six markers (left column) and selected zoomed-in regions (middle column) that highlight accurate staining patterns. Real reference IHC images for each marker are given in the right column. We observed that AR+ and NKX3.1+ cells exhibited the correct distribution in the luminal epithelial compartment of the prostatic glands and nuclear localization. Furthermore, a few NKX3.1+ cells in stromal regions (possibly stroma-invading tumor cells) were correctly predicted. Similarities in specific, matched areas between virtual and real IHC images were mainly assessed for staining pattern and overall intensity levels: we observed that the expression of markers indicative of a tumor-specific molecular profile, such as loss of TP53 and ERG overexpression, did not largely deviate between virtual and real images at a TMA core level, which would be crucial for diagnostic applicability. (B) Same as (A) but highlighting regions with inaccurate or inconclusive staining. We observed non-specific signal in extracellular matrix/stroma regions (NKX3.1, p53, ERG), occasional false nuclear expression (CD44), and a systematic lack of recognition of CD146+ vascular structures.

Extended Data Fig. 3 Intensity distribution of positive and negative cells for real and virtual IHC images.

Cell segmentation and classification is performed using DeepLIIF 21 . The intensity of a cell is measured as the average of pixel values in the perceptual lightness (L) channel of the Lab colour space. The Wasserstein distance between the positive and negative cell distributions is computed to quantify the cell-class separability.
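An illustrative sketch of this separability measure follows: the 1D Wasserstein distance between the mean L-channel intensities of positive and negative cells. The per-cell boolean masks are assumed to come from the DeepLIIF segmentation.

```python
from scipy.stats import wasserstein_distance
from skimage.color import rgb2lab

def cell_class_separability(img_rgb, pos_masks, neg_masks):
    L = rgb2lab(img_rgb)[..., 0]            # perceptual lightness channel
    pos = [L[m].mean() for m in pos_masks]  # mean intensity per positive cell
    neg = [L[m].mean() for m in neg_masks]  # mean intensity per negative cell
    return wasserstein_distance(pos, neg)
```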

Extended Data Fig. 4 Ablation study.

Qualitative evaluation of the impact of multi-scale consistency objectives on the virtual staining quality of the VirtualMultiplexer across six IHC markers, presented in each row. (A) Sample H&E cores from the EMPaCT dataset. Corresponding virtually stained IHC cores for training the VirtualMultiplexer with neighbourhood consistency ( B ), neighbourhood and global consistencies ( C ), and neighbourhood, global and local consistencies ( D ). The bounding boxes highlight zoomed-in regions in the IHC cores. (E) Reference real IHC cores corresponding to the cores in (A) .

Extended Data Fig. 5 Transfer learning from TMAs to WSIs of prostate cancer tissue.

Additional examples to the ones presented in Fig. 4 . Example of H&E (left image), virtual IHC (middle image), and real IHC (right image) staining for NKX3.1 (top) and AR (bottom) of prostate cancer tissue WSIs. Blue-framed zoomed-in regions display accurate staining pattern. Red-framed zoomed-in regions display examples of virtual staining mispredictions.

Extended Data Fig. 6 Marker-level interpretation of Graph-Transformer-based survival prediction classification.

For the modality-level interpretation, we performed Shapley Additive Explanations (SHAP) 83 analysis for the overall survival prediction task on EMPaCT (see the relevant computation for reference). We systematically dropped the modalities during inference and measured the change in classification weighted F1 scores, in line with the SHAP algorithm, to compute modality-level importance. Here, the barplots and errorbars indicate the mean and the standard deviation, respectively, of the estimated Shapley values across all 134 test images for n = 3 Graph-Transformer classifiers. In the absence of ground truth marker importance, we used biological knowledge for qualitative analysis. NKX3.1 and AR were identified as crucial, which is sensible as they both express specific patterns in luminal epithelial cells in the prostate and aid in distinguishing normal tissue from carcinoma. The high importance of CD44 could be linked to its heterogeneous pattern and pleiotropic effects found in the tumor microenvironment 84 . Conversely, CD146’s relevance lies in highlighting vascular or fibroblast changes, rendering it less diagnostically informative. Notably, the high importance of CD44 and NKX3.1, and the low importance of CD146 and ERG, are in line with the unimodal high and low weighted F1 scores in Fig. 6 , respectively.

Extended Data Fig. 7 Transfer learning from TMAs to needle biopsies of prostate cancer tissue.

Additional examples to the qualitative samples presented in Fig. 6 . (A) and (B) present H&E biopsies from SICAP and PANDA datasets, respectively, and corresponding virtually stained IHC biopsies for six markers.

Extended Data Fig. 8 Region-level interpretation of Graph-Transformer-based Gleason grade classification.

Results for sample WSIs from the SICAP 42 (top) and PANDA 71 (bottom) datasets for interpreting the Gleason grading outcome of our Graph-Transformer, with accompanying ground truth annotations of Gleason scores. The model was trained using virtual images under the early fusion setting. We used the GraphCAM method from ref. 41 to produce attention maps corresponding to salient tissue regions contributing to model predictions. We observe substantial overlap between the identified salient regions and the ground-truth Gleason pattern annotations for both primary and secondary class predictions in both datasets.

Extended Data Fig. 9 Transfer learning from TMAs to WSIs of different tissue types from TCGA cohort.

(A) H&E WSIs and (B) corresponding virtually stained IHC WSIs from colorectal carcinoma (top two rows) and breast invasive carcinoma (bottom two rows). For both tissue types, the virtual stainings are produced for the relevant CD44 and CD146 IHC markers.

Extended Data Fig. 10 The VirtualMultiplexer can greatly accelerate histopathology workflows.

We performed a runtime estimation of all components of the VirtualMultiplexer framework across imaging datasets of different scales: an in-domain TMA from the EMPaCT dataset (A) , an out-of-domain TMA from the PDAC dataset (B) , an out-of-domain needle biopsy from the SICAP dataset (C) , and an out-of-domain WSI from the in-house dataset (D) . We calculated that applying the trained VirtualMultiplexer to a single EMPaCT TMA core (6,000 × 6,000 pixels at ×20 magnification, 0.24 μm per pixel) for one marker resulted in a total runtime of 2.81 seconds, and the same process for an out-of-distribution TMA core resulted in a runtime of 10.88 seconds, with the increase attributed to stain normalization. However, the stain normalization step is crucial as it alleviates the appearance disparity between the training and the out-of-distribution samples (Supplementary Fig. 1 ) and allows for a faithful application of the VirtualMultiplexer to unseen datasets. The above result implies that virtual staining of a hypothetical TMA slide containing 250 out-of-distribution TMA cores for 6 markers would be feasible in ≈ 65.8 minutes (preprocessing: ≈ 9.9 seconds per core; virtual staining and post-processing: ≈ 0.98 seconds per core and marker). Conversely, performing the IHC staining for the same hypothetical TMA for 6 IHC markers could take approximately 1 day in a cutting-edge pathology laboratory using the latest protocols 85 . When performed in a biology lab that does not specialize in pathology, however, IHC staining could take up to 5 days per marker (sectioning: 1 day, staining: 2 days, slide drying: 1 day, imaging: 1 day), leading to a minimum of 5 days if done simultaneously for all 6 markers, and more than 10 days if performed mostly sequentially. Importantly, as our method scales linearly with the size of the tissue (TMA to WSI) and with the number of markers, similar time gains would be feasible for virtually staining needle biopsies and WSIs.

Supplementary information

Supplementary information.

Supplementary Fig. 1 and Tables 1–7.

Reporting Summary

Supplementary data 1.

Assessment of the staining quality of virtual and real images for all six IHC markers, classified as acceptable (1) and unacceptable (0). Images with unacceptable staining quality were further categorized by the presence of elevated background or border artefacts.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Cite this article.

Pati, P., Karkampouna, S., Bonollo, F. et al. Accelerating histopathology workflows with generative AI-based virtually multiplexed tumour profiling. Nat Mach Intell (2024). https://doi.org/10.1038/s42256-024-00889-5


Received : 04 December 2023

Accepted : 29 July 2024

Published : 09 September 2024

DOI : https://doi.org/10.1038/s42256-024-00889-5

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

Quick links

  • Explore articles by subject
  • Guide to authors
  • Editorial policies

Sign up for the Nature Briefing: Cancer newsletter — what matters in cancer research, free to your inbox weekly.

published qualitative research paper

IMAGES

  1. (PDF) Qualitative Research Paper

    published qualitative research paper

  2. 4 Useful Steps on How to Write a Qualitative Research Paper

    published qualitative research paper

  3. Example Research Paper Qualitative

    published qualitative research paper

  4. Qualitative Research Paper Introduction / 1 : Qualitative research

    published qualitative research paper

  5. Writing a Qualitative Research Paper.pdf

    published qualitative research paper

  6. (PDF) If You Could Just Provide Me with A Sample: Examining Sampling in

    published qualitative research paper

VIDEO

  1. QUANTITATIVE RESEARCH AND QUALITATIVE RESEARCH (PART-I)

  2. Qualitative Research Reporting Standards: How are qualitative articles different from quantitative?

  3. How to Publish Qualitative paper with Nvivo by Dr Jaspreet kaur

  4. RESEARCH

  5. Qualitative Research Paper 3

  6. What is qualitative research?

COMMENTS

  1. PDF Students' Perceptions towards the Quality of Online Education: A

    The findings of this research revealed that flexibility, cost-effectiveness, electronic research availability, ease of connection to the Internet, and well-designed class interface were students' positive experiences. The students' negative experiences were caused by delayed feedback from instructors, unavailable technical support from ...

  2. Qualitative Research: Sage Journals

    Qualitative Research is a peer-reviewed international journal that has been leading debates about qualitative methods for over 20 years. The journal provides a forum for the discussion and development of qualitative methods across disciplines, publishing high quality articles that contribute to the ways in which we think about and practice the craft of qualitative research.

  3. Qualitative Research

    Patience Mukwambo. Restricted access Book review First published September 20, 2023 pp. 1095-1097. xml Get Access. Table of contents for Qualitative Research, 24, 4, Aug 01, 2024.

  4. Planning Qualitative Research: Design and Decision Making for New

    Given the nuance and complexity of qualitative research, this paper provides an accessible starting point from which novice researchers can begin their journey of learning about, designing, and conducting qualitative research. ... This article was published in the International Journal of Qualitative Methods.

  5. Research Journals

    The Qualitative Report Guide to Qualitative Research Journals is a unique resource for researchers, scholars, and students to explore the world of professional, scholarly, and academic journals publishing qualitative research. The number and variety of journals focusing primarily on qualitative approaches to research have steadily grown over ...

  6. Qualitative studies

    Using residents and experts to evaluate the validity of areal wombling for detecting social boundaries: A small-scale feasibility study. Meng Le Zhang, Aneta Piekut, [ ... ], Gwilym Pryce. Delphi studies in social and health sciences—Recommendations for an interdisciplinary standardized reporting (DELPHISTAR). Results of a Delphi study.

  7. Qualitative Psychology

    Qualitative Psychology publishes studies that represent a wide variety of methodological approaches including narrative, discourse analysis, life history, phenomenology, ethnography, action research, and case study. The journal is further concerned with discussions of teaching qualitative research and training of qualitative researchers.

  8. The Oxford Handbook of Qualitative Research

    Abstract. The Oxford Handbook of Qualitative Research, second edition, presents a comprehensive retrospective and prospective review of the field of qualitative research. Original, accessible chapters written by interdisciplinary leaders in the field make this a critical reference work. Filled with robust examples from real-world research ...

  9. SAGE Publications Inc

    Qualitative Research publishes papers with a clear methodological focus. We invite scholarship that has multi-disciplinary appeal, that debates and enlivens qualitative methods, and that pushes at the boundaries of established ways of doing qualitative research. We are interested in papers that are attentive to a wide audience, that are alive ...

  10. Qualitative Research Journal

    Qualitative Research Journal is an international journal dedicated to communicating the theory and practice of qualitative research in the human sciences. Interdisciplinary and eclectic, QRJ covers all methodologies that can be described as qualitative. ...

  11. Criteria for Good Qualitative Research: A Comprehensive Review

    This review aims to synthesize a published set of evaluative criteria for good qualitative research. The aim is to shed light on existing standards for assessing the rigor of qualitative research encompassing a range of epistemological and ontological standpoints. Using a systematic search strategy, published journal articles that deliberate criteria for rigorous research were identified. Then ...

  12. Qualitative Research Methods: A Practice-Oriented Introduction

    The book aims at achieving effects in three domains: (a) the personal, (b) the scholarly, and (c) the practical. The personal goal is to demystify qualitative methods, give readers a feel for ...

  13. What Is Qualitative Research?

    Published on June 19, 2020 by Pritha Bhandari. Revised on September 5, 2024. Qualitative research involves collecting and analyzing non-numerical data (e.g., text, video, or audio) to understand concepts, opinions, or experiences. It can be used to gather in-depth insights into a problem or generate new ideas for research.

  14. International Journal of Qualitative Methods: Sage Journals

    The International Journal of Qualitative Methods is the peer-reviewed interdisciplinary open access journal of the International Institute for Qualitative Methodology (IIQM) at the University of Alberta, Canada. The journal, established in 2002, is an eclectic international forum for insights, innovations and advances in methods and study designs using qualitative or mixed methods research.

  15. Qualitative Research Resources: Publishing Qualitative Research

    How to search for and evaluate qualitative research, integrate qualitative research into systematic reviews, report/publish qualitative research. ... Richardson, J., & Liddle, J. (2017). Where does good quality qualitative health care research get published? Primary Health Care Research & Development, 18(5), 515-521. doi:10.1017 ...

  16. Qualitative Research Part 3: Publication

    Qualitative Research Part 3: Publication. The first two papers in this series on qualitative research for mental health nursing explored the basics of qualitative research—methodologies and methods. This paper will explore how your research can be transformed into a publication. There is an art in reducing that work into a succinct research ...

  17. Qualitative Research: Data Collection, Analysis, and Management

    Doing qualitative research is not easy and may require a complete rethink of how research is conducted, particularly for researchers who are more familiar with quantitative approaches. There are many ways of conducting qualitative research, and this paper has covered some of the practical issues regarding data collection, analysis, and management.

  18. Those Who Were Born Poor: A Qualitative Study of Philippine Poverty

    Abstract. This qualitative study investigated the psychological experience of poverty among 2 groups of Filipinos who were interviewed about the effects of being raised poor, 12 who became rich ...

  19. PDF Reporting Standards for Qualitative Research in Psychology: What Are

    Chapters 4 through 7 consider the typical sections of a qualitative research paper—the introductory sections, Method, Results, and Discussion. These chapters emphasize aspects of reporting that are unique to qualitative research. They describe the general elements that should be reported in qualitative papers and can assist authors in developing ...

  20. What Is Qualitative Research? An Overview and Guidelines

    Abstract. This guide explains the focus, rigor, and relevance of qualitative research, highlighting its role in dissecting complex social phenomena and providing in-depth, human-centered insights. The guide also examines the rationale for employing qualitative methods, underscoring their critical importance. An exploration of the methodology ...

  21. A Qualitative Study of the Impact of Experiences of Students With

    … coach was the focus of this study. The purpose of this qualitative research was to gauge the extent of the pressures, the social and emotional impact, and the advantages and/or disadvantages individuals felt when they were students with a parent in a position of authority at their school. The findings from the research study substantiated the …

  22. PDF Sample of the Qualitative Research Paper

    … population sample, so your study is limited by the number of participants, or that you used a convenience sample. Summary: Then the author would wrap up the chapter with a summarization of the chapter and a transition to the next chapter as described above. Notice that this section started with a ...

  23. Evaluation of research co-design in health: a systematic overview of

    Co-design with consumers and healthcare professionals is widely used in applied health research. While this approach appears to be ethically the right thing to do, a rigorous evaluation of its process and impact is frequently missing. Evaluation of research co-design is important to identify areas of improvement in the methods and processes, as well as to determine whether research co-design ...

  24. Qualitative Research From Grounded Theory to Build a Scientific

    A recent book indexed in Scopus brings together a wide range of international studies in science education, focusing on the interaction of science teaching and learning, hence its great importance. ... This paper addresses three main questions: (1) What is personal epistemology research and how is it conceptualized? ...

  25. Accelerating histopathology workflows with generative AI-based

    VirtualMultiplexer is a generative AI tool that produces realistic multiplexed immunohistochemistry images from tissue biopsies. The generated images could be used to improve clinical predictions ...

  26. Submission Guidelines: Qualitative Research: Sage Journals

    Articles must have a clear methodological focus, and not simply present findings from qualitative studies. They should be between 7,500 and 8,500 words, excluding references. Any articles that fall below or above that range will be returned. Notes is a new format for short, engaging and imaginative submissions.

  27. Learning to Do Qualitative Data Analysis: A Starting Point

    Yonjoo Cho is an associate professor of Instructional Systems Technology focusing on human resource development (HRD) at Indiana University. Her research interests include action learning in organizations, international HRD, and women in leadership. She serves as an associate editor of Human Resource Development Review and served as a board member of the Academy of Human Resource Development ...