
Writing the Data Analysis Chapter(s): Results and Evidence

Posted by Rene Tetzner | Oct 19, 2023 | PhD Success

4.4 Writing the Data Analysis Chapter(s): Results and Evidence

Unlike the introduction, literature review and methodology chapter(s), your results chapter(s) will need to be written for the first time as you draft your thesis even if you submitted a proposal, though this part of your thesis will certainly build upon the preceding chapters. You should have carefully recorded and collected the data (test results, participant responses, computer print outs, observations, transcriptions, notes of various kinds etc.) from your research as you conducted it, so now is the time to review, organise and analyse the data. If your study is quantitative in nature, make sure that you know what all the numbers mean and that you consider them in direct relation to the topic, problem or phenomenon you are investigating, and especially in relation to your research questions and hypotheses. You may find that you require the services of a statistician to help make sense of the data, in which case, obtaining that help sooner rather than later is advisable, because you need to understand your results thoroughly before you can write about them. If, on the other hand, your study is qualitative, you will need to read through the data you have collected several times to become familiar with them both as a whole and in detail so that you can establish important themes, patterns and categories. Remember that ‘qualitative analysis is a creative process and requires thoughtful judgments about what is significant and meaningful in the data’ (Roberts, 2010, p.174; see also Miles & Huberman, 1994) – judgements that often need to be made before the findings can be effectively analysed and presented. If you are combining methodologies in your research, you will also need to consider relationships between the results obtained from the different methods, integrating all the data you have obtained and discovering how the results of one approach support or correlate with the results of another. Ideally, you will have taken careful notes recording your initial thoughts and analyses about the sources you consulted and the results and evidence provided by particular methods and instruments as you put them into practice (as suggested in Sections 2.1.2 and 2.1.4), as these will prove helpful while you consider how best to present your results in your thesis.

Although the ways in which to present and organise the results of doctoral research differ markedly depending on the nature of the study and its findings, as well as on author and committee preferences and university and department guidelines, there are several basic principles that apply to virtually all theses. First and foremost is the need to present the results of your research both clearly and concisely, and in as objective and factual a manner as possible. There will be time and space to elaborate and interpret your results and speculate on their significance and implications in the final discussion chapter(s) of your thesis, but, generally speaking, such reflection on the meaning of the results should be entirely separate from the factual report of your research findings. There are exceptions, of course, and some candidates, supervisors and departments may prefer the factual presentation and interpretive discussion of results to be blended, just as some thesis topics may demand such treatment, but this is rare and best avoided unless there are persuasive reasons to avoid separating the facts from your thoughts about them. If you do find that you need to blend facts and interpretation in reporting your results, make sure that your language leaves no doubt about the line between the two: words such as ‘seems,’ ‘appears,’ ‘may,’ ‘might,’ ‘probably’ and the like will effectively distinguish analytical speculation from more factual reporting (see also Section 4.5).

You need not dedicate much space in this part of the thesis to the methods you used to arrive at your results because these have already been described in your methodology chapter(s), but they can certainly be revisited briefly to clarify or lend structure to your report. Results are most often presented in a straightforward narrative form which is often supplemented by tables and perhaps by figures such as graphs, charts and maps. An effective approach is to decide immediately which information would be best included in tables and figures, and then to prepare those tables and figures before you begin writing the text for the chapter (see Section 4.4.1 on designing effective tables and figures). Arranging your data into the visually immediate formats provided by tables and figures can, for one, produce interesting surprises by enabling you to see trends and details that you may not have noticed previously, and writing the report of your results will prove easier when you have the tables and figures to work with just as your readers ultimately will. In addition, while the text of the results chapter(s) should certainly highlight the most notable data included in tables and figures, it is essential not to repeat information unnecessarily, so writing with the tables and figures already constructed will help you keep repetition to a minimum. Finally, writing about the tables and figures you create will help you test their clarity and effectiveness for your readers, and you can make any necessary adjustments to the tables and figures as you work. Be sure to refer to each table and figure by number in your text and to make it absolutely clear what you want your readers to see or understand in the table or figure (e.g., ‘see Table 1 for the scores’ and ‘Figure 2 shows this relationship’).

Beyond combining textual narration with the data presented in tables and figures, you will need to organise your report of the results in a manner best suited to the material. You may choose to arrange the presentation of your results chronologically or in a hierarchical order that represents their importance; you might subdivide your results into sections (or separate chapters if there is a great deal of information to accommodate) focussing on the findings of different kinds of methodology (quantitative versus qualitative, for instance) or of different tests, trials, surveys, reviews, case studies and so on; or you may want to create sections (or chapters) focussing on specific themes, patterns or categories or on your research questions and/or hypotheses. The last approach allows you to cluster results that relate to a particular question or hypothesis into a single section and can be particularly useful because it provides cohesion for the thesis as a whole and forces you to focus closely on the issues central to the topic, problem or phenomenon you are investigating. You will, for instance, be able to refer back to the questions and hypotheses presented in your introduction (see Section 3.1), to answer the questions and confirm or dismiss the hypotheses and to anticipate in relation to those questions and hypotheses the discussion and interpretation of your findings that will appear in the next part of the thesis (see Section 4.5). Less effective is an approach that organises the presentation of results according to the items of a survey or questionnaire, because these lend the structure of the instrument used to the results instead of connecting those results directly to the aims, themes and argument of your thesis, but such an organisation can certainly be an important early step in your analysis of the findings and might even be valid for the final thesis if, for instance, your work focuses on developing the instrument involved.

The results generated by doctoral research are unique, and this book cannot hope to outline all the possible approaches for presenting the data and analyses that constitute research results, but it is essential that you devote considerable thought and special care to the way in which you structure the report of your results (Section 6.1 on headings may prove helpful). Whatever structure you choose should accurately reflect the nature of your results and highlight their most important and interesting trends, and it should also effectively allow you (in the next part of the thesis) to discuss and speculate upon your findings in ways that will test the premises of your study, work well in the overall argument of your thesis and lead to significant implications for your research. Regardless of how you organise the main body of your results chapter(s), however, you should include a final paragraph (or more than one paragraph if necessary) that briefly summarises and explains the key results and also guides the reader on to the discussion and interpretation of those results in the following chapter(s).

Why PhD Success?

To Graduate Successfully

This article is part of a book called "PhD Success", which focuses on the process of writing a PhD thesis. Its aim is to provide sound practices and principles for reporting and formatting in text the methods, results and discussion of even the most innovative and unique research in ways that are clear, correct, professional and persuasive.

The assumption of the book is that the doctoral candidate reading it is both eager to write and more than capable of doing so, but nonetheless requires information and guidance on exactly what he or she should be writing and how best to approach the task. The basic components of a doctoral thesis are outlined and described, as are the elements of complete and accurate scholarly references, and detailed descriptions of writing practices are clarified through the use of numerous examples.

PhD Success provides guidance for students familiar with English and the procedures of English universities, but it also acknowledges that many theses in the English language are now written by candidates whose first language is not English, so it carefully explains the scholarly styles, conventions and standards expected of a successful doctoral thesis in the English language.

Individual chapters of this book address reflective and critical writing early in the thesis process; working successfully with thesis supervisors and benefiting from commentary and criticism; drafting and revising effective thesis chapters and developing an academic or scientific argument; writing and formatting a thesis in clear and correct scholarly English; citing, quoting and documenting sources thoroughly and accurately; and preparing for and excelling in thesis meetings and examinations. 

Completing a doctoral thesis successfully requires long and penetrating thought, intellectual rigour and creativity, original research and sound methods (whether established or innovative), precision in recording detail and a wide-ranging thoroughness, as much perseverance and mental toughness as insight and brilliance, and, no matter how many helpful writing guides are consulted, a great deal of hard work over a significant period of time. Writing a thesis can be an enjoyable as well as a challenging experience, however, and even if it is not always so, the personal and professional rewards of achieving such an enormous goal are considerable, as all doctoral candidates no doubt realise, and will last a great deal longer than any problems that may be encountered during the process.


About the Author: Rene Tetzner

Rene Tetzner's blog posts are dedicated to academic writing. Although the focus is on How To Write a Doctoral Thesis, many other important aspects of research-based writing, editing and publishing are addressed in helpful detail.


Data analysis techniques

In STAGE NINE: Data analysis, we discuss the data you will have collected during STAGE EIGHT: Data collection. However, before you collect your data, having followed the research strategy you set out in this STAGE SIX, it is useful to think about the data analysis techniques you may apply to your data when they are collected.

The statistical tests that are appropriate for your dissertation will depend on (a) the research questions/hypotheses you have set, (b) the research design you are using, and (c) the nature of your data. You should already be clear about your research questions/hypotheses from STAGE THREE: Setting research questions and/or hypotheses, as well as knowing the goal of your research design from STEP TWO: Research design in this STAGE SIX: Setting your research strategy. These two pieces of information - your research questions/hypotheses and research design - will let you know, in principle, the statistical tests that may be appropriate to run on your data in order to answer your research questions.

We highlight the words in principle and may because the most appropriate statistical test to run on your data depends not only on your research questions/hypotheses and research design, but also on the nature of your data. As you should have identified in STEP THREE: Research methods, and in the article, Types of variables, in the Fundamentals part of Lærd Dissertation, (a) not all data are the same, and (b) not all variables are measured in the same way (i.e., variables can be dichotomous, ordinal or continuous). In addition, not all data are normal, nor do the groups being compared necessarily have equal variances, terms we explain in the Data Analysis section in the Fundamentals part of Lærd Dissertation. As a result, you might think that running a particular statistical test is correct at this point of setting your research strategy (e.g., a statistical test called a dependent t-test), based on the research questions/hypotheses you have set, but when you collect your data (i.e., during STAGE EIGHT: Data collection), the data may fail certain assumptions that are important to such a statistical test (i.e., normality and homogeneity of variance). As a result, you may have to run another statistical test (e.g., a Wilcoxon signed-rank test instead of a dependent t-test).
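To make this concrete, the sketch below (Python with SciPy, using hypothetical pre/post scores; not part of the original guidance) shows how a planned dependent t-test might be swapped for a Wilcoxon signed-rank test when the paired differences fail a normality check.

```python
# A minimal sketch (hypothetical data) of the fallback described above: plan a
# dependent t-test, but switch to a Wilcoxon signed-rank test if the paired
# differences fail a normality check.
import numpy as np
from scipy import stats

pre = np.array([65, 70, 68, 72, 66, 74, 69, 71, 67, 73])   # hypothetical pretest scores
post = np.array([70, 74, 69, 78, 71, 80, 72, 75, 70, 77])  # hypothetical posttest scores

differences = post - pre
_, p_normal = stats.shapiro(differences)  # Shapiro-Wilk test of normality

if p_normal > 0.05:
    stat, p_value = stats.ttest_rel(post, pre)   # dependent (paired) t-test
    test_used = "dependent t-test"
else:
    stat, p_value = stats.wilcoxon(post, pre)    # non-parametric alternative
    test_used = "Wilcoxon signed-rank test"

print(f"{test_used}: statistic = {stat:.2f}, p = {p_value:.4f}")
```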

At this stage in the dissertation process, it is important, or at the very least useful, to think about the data analysis techniques you may apply to your data when they are collected. We suggest that you do this for two reasons:

REASON A: Supervisors sometimes expect you to know what statistical analysis you will perform at this stage of the dissertation process

This is not always the case, but if you have had to write a Dissertation Proposal or Ethics Proposal, there is sometimes an expectation that you explain the type of data analysis that you plan to carry out. An understanding of the data analysis that you will carry out on your data can also be an expected component of the Research Strategy chapter of your dissertation write-up (i.e., usually Chapter Three: Research Strategy). Therefore, it is a good time to think about the data analysis process if you plan to start writing up this chapter at this stage.

REASON B: It takes time to get your head around data analysis

When you come to analyse your data in STAGE NINE: Data analysis, you will need to think about (a) selecting the correct statistical tests to perform on your data, (b) running these tests on your data using a statistics package such as SPSS, and (c) learning how to interpret the output from such statistical tests so that you can answer your research questions or hypotheses. Whilst we show you how to do this for a wide range of scenarios in the Data Analysis section in the Fundamentals part of Lærd Dissertation, it can be a time-consuming process. Unless you took an advanced statistics module/option as part of your degree (i.e., not just an introductory course in statistics, which is often taught in undergraduate and master's degrees), it can take time to get your head around data analysis. Starting this process at this stage (i.e., STAGE SIX: Research strategy), rather than waiting until you finish collecting your data (i.e., STAGE EIGHT: Data collection), is a sensible approach.

Final thoughts...

Setting the research strategy for your dissertation required you to describe, explain and justify the research paradigm, quantitative research design, research method(s), sampling strategy, and approach towards research ethics and data analysis that you plan to follow, as well as determine how you will ensure the research quality of your findings so that you can effectively answer your research questions/hypotheses. However, from a practical perspective, just remember that the main goal of STAGE SIX: Research strategy is to have a clear research strategy that you can implement (i.e., operationalize). After all, if you are unable to clearly follow your plan and carry out your research in the field, you will struggle to answer your research questions/hypotheses. Once you are sure that you have a clear plan, it is a good idea to take a step back, speak with your supervisor, and assess where you are before moving on to collect data. Therefore, when you are ready, proceed to STAGE SEVEN: Assessment point.


The Beginner's Guide to Statistical Analysis | 5 Steps & Examples

Statistical analysis means investigating trends, patterns, and relationships using quantitative data . It is an important research tool used by scientists, governments, businesses, and other organizations.

To draw valid conclusions, statistical analysis requires careful planning from the very start of the research process . You need to specify your hypotheses and make decisions about your research design, sample size, and sampling procedure.

After collecting data from your sample, you can organize and summarize the data using descriptive statistics . Then, you can use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalize your findings.

This article is a practical introduction to statistical analysis for students and researchers. We’ll walk you through the steps using two research examples. The first investigates a potential cause-and-effect relationship, while the second investigates a potential correlation between variables.

Table of contents

  • Step 1: Write your hypotheses and plan your research design
  • Step 2: Collect data from a sample
  • Step 3: Summarize your data with descriptive statistics
  • Step 4: Test hypotheses or make estimates with inferential statistics
  • Step 5: Interpret your results

To collect valid data for statistical analysis, you first need to specify your hypotheses and plan out your research design.

Writing statistical hypotheses

The goal of research is often to investigate a relationship between variables within a population . You start with a prediction, and use statistical analysis to test that prediction.

A statistical hypothesis is a formal way of writing a prediction about a population. Every research prediction is rephrased into null and alternative hypotheses that can be tested using sample data.

While the null hypothesis always predicts no effect or no relationship between variables, the alternative hypothesis states your research prediction of an effect or relationship.

  • Null hypothesis: A 5-minute meditation exercise will have no effect on math test scores in teenagers.
  • Alternative hypothesis: A 5-minute meditation exercise will improve math test scores in teenagers.
  • Null hypothesis: Parental income and GPA have no relationship with each other in college students.
  • Alternative hypothesis: Parental income and GPA are positively correlated in college students.

Planning your research design

A research design is your overall strategy for data collection and analysis. It determines the statistical tests you can use to test your hypothesis later on.

First, decide whether your research will use a descriptive, correlational, or experimental design. Experiments directly influence variables, whereas descriptive and correlational studies only measure variables.

  • In an experimental design , you can assess a cause-and-effect relationship (e.g., the effect of meditation on test scores) using statistical tests of comparison or regression.
  • In a correlational design , you can explore relationships between variables (e.g., parental income and GPA) without any assumption of causality using correlation coefficients and significance tests.
  • In a descriptive design , you can study the characteristics of a population or phenomenon (e.g., the prevalence of anxiety in U.S. college students) using statistical tests to draw inferences from sample data.

Your research design also concerns whether you’ll compare participants at the group level or individual level, or both.

  • In a between-subjects design , you compare the group-level outcomes of participants who have been exposed to different treatments (e.g., those who performed a meditation exercise vs those who didn’t).
  • In a within-subjects design , you compare repeated measures from participants who have participated in all treatments of a study (e.g., scores from before and after performing a meditation exercise).
  • In a mixed (factorial) design , one variable is altered between subjects and another is altered within subjects (e.g., pretest and posttest scores from participants who either did or didn’t do a meditation exercise).
Example: Experimental research design
First, you’ll take baseline test scores from participants. Then, your participants will undergo a 5-minute meditation exercise. Finally, you’ll record participants’ scores from a second math test. In this experiment, the independent variable is the 5-minute meditation exercise, and the dependent variable is the math test score from before and after the intervention.

Example: Correlational research design
In a correlational study, you test whether there is a relationship between parental income and GPA in graduating college students. To collect your data, you will ask participants to fill in a survey and self-report their parents’ incomes and their own GPA.

Measuring variables

When planning a research design, you should operationalize your variables and decide exactly how you will measure them.

For statistical analysis, it’s important to consider the level of measurement of your variables, which tells you what kind of data they contain:

  • Categorical data represents groupings. These may be nominal (e.g., gender) or ordinal (e.g. level of language ability).
  • Quantitative data represents amounts. These may be on an interval scale (e.g. test score) or a ratio scale (e.g. age).

Many variables can be measured at different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (e.g., level of agreement from 1–5), it doesn’t automatically mean that it’s quantitative instead of categorical.

Identifying the measurement level is important for choosing appropriate statistics and hypothesis tests. For example, you can calculate a mean score with quantitative data, but not with categorical data.
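As a brief illustration (a sketch with hypothetical data, using Python's pandas library; the column names are invented for this example), the snippet below shows how the level of measurement constrains which summary statistic is appropriate.

```python
# A minimal sketch (hypothetical data) of matching statistics to measurement level:
# a mean is meaningful for quantitative data, while categorical data are better
# summarized with counts or a mode, and ordinal codes with a median.
import pandas as pd

df = pd.DataFrame({
    "age": [19, 21, 20, 23, 22, 20],           # quantitative (ratio)
    "gender": ["F", "M", "F", "F", "M", "F"],   # categorical (nominal)
    "agreement": [1, 4, 3, 5, 2, 4],            # numeric codes, but ordinal
})

print(df["age"].mean())              # a mean makes sense for quantitative data
print(df["gender"].value_counts())   # counts (or the mode) suit nominal data
print(df["agreement"].median())      # a median is safer than a mean for ordinal codes
```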

In a research study, along with measures of your variables of interest, you’ll often collect data on relevant participant characteristics.

Variable Type of data
Age Quantitative (ratio)
Gender Categorical (nominal)
Race or ethnicity Categorical (nominal)
Baseline test scores Quantitative (interval)
Final test scores Quantitative (interval)
Parental income Quantitative (ratio)
GPA Quantitative (interval)


Population vs sample

In most cases, it’s too difficult or expensive to collect data from every member of the population you’re interested in studying. Instead, you’ll collect data from a sample.

Statistical analysis allows you to apply your findings beyond your own sample as long as you use appropriate sampling procedures . You should aim for a sample that is representative of the population.

Sampling for statistical analysis

There are two main approaches to selecting a sample.

  • Probability sampling: every member of the population has a chance of being selected for the study through random selection.
  • Non-probability sampling: some members of the population are more likely than others to be selected for the study because of criteria such as convenience or voluntary self-selection.

In theory, for highly generalizable findings, you should use a probability sampling method. Random selection reduces several types of research bias , like sampling bias , and ensures that data from your sample is actually typical of the population. Parametric tests can be used to make strong statistical inferences when data are collected using probability sampling.

But in practice, it’s rarely possible to gather the ideal sample. While non-probability samples are more likely to be at risk of biases like self-selection bias, they are much easier to recruit and collect data from. Non-parametric tests are more appropriate for non-probability samples, but they result in weaker inferences about the population.
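As an illustration only (hypothetical sampling frame, Python with NumPy; not part of the original guide), the sketch below contrasts drawing a simple random sample with taking a convenience sample.

```python
# A minimal sketch (hypothetical data): probability sampling draws at random from
# a sampling frame, whereas a convenience (non-probability) sample simply takes
# whoever is easiest to reach.
import numpy as np

rng = np.random.default_rng(seed=42)
population = np.arange(1, 10_001)   # hypothetical sampling frame of 10,000 student IDs

probability_sample = rng.choice(population, size=200, replace=False)  # simple random sample
convenience_sample = population[:200]   # e.g., the first 200 volunteers who sign up

print(probability_sample[:5])
print(convenience_sample[:5])
```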

If you want to use parametric tests for non-probability samples, you have to make the case that:

  • your sample is representative of the population you’re generalizing your findings to.
  • your sample lacks systematic bias.

Keep in mind that external validity means that you can only generalize your conclusions to others who share the characteristics of your sample. For instance, results from Western, Educated, Industrialized, Rich and Democratic samples (e.g., college students in the US) aren’t automatically applicable to all non-WEIRD populations.

If you apply parametric tests to data from non-probability samples, be sure to elaborate on the limitations of how far your results can be generalized in your discussion section .

Create an appropriate sampling procedure

Based on the resources available for your research, decide on how you’ll recruit participants.

  • Will you have resources to advertise your study widely, including outside of your university setting?
  • Will you have the means to recruit a diverse sample that represents a broad population?
  • Do you have time to contact and follow up with members of hard-to-reach groups?

Example: Sampling (experimental study)
Your participants are self-selected by their schools. Although you’re using a non-probability sample, you aim for a diverse and representative sample.

Example: Sampling (correlational study)
Your main population of interest is male college students in the US. Using social media advertising, you recruit senior-year male college students from a smaller subpopulation: seven universities in the Boston area.

Calculate sufficient sample size

Before recruiting participants, decide on your sample size either by looking at other studies in your field or using statistics. A sample that’s too small may be unrepresentative of the population, while a sample that’s too large will be more costly than necessary.

There are many sample size calculators online. Different formulas are used depending on whether you have subgroups or how rigorous your study should be (e.g., in clinical research). As a rule of thumb, a minimum of 30 units or more per subgroup is necessary.

To use these calculators, you have to understand and input these key components:

  • Significance level (alpha): the risk of rejecting a true null hypothesis that you are willing to take, usually set at 5%.
  • Statistical power : the probability of your study detecting an effect of a certain size if there is one, usually 80% or higher.
  • Expected effect size : a standardized indication of how large the expected result of your study will be, usually based on other similar studies.
  • Population standard deviation: an estimate of the population parameter based on a previous study or a pilot study of your own.
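The sketch below (Python with the statsmodels library; all input values are assumptions you would normally justify from previous studies) shows how these components feed into an a priori sample-size calculation for a two-group comparison.

```python
# A minimal sketch (assumed inputs) of an a priori sample-size calculation
# for an independent-samples comparison using statsmodels.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,   # expected standardized effect size (Cohen's d), assumed
    alpha=0.05,        # significance level (risk of a Type I error)
    power=0.80,        # desired statistical power
)
print(f"Required sample size per group: {n_per_group:.0f}")
```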

Once you’ve collected all of your data, you can inspect them and calculate descriptive statistics that summarize them.

Inspect your data

There are various ways to inspect your data, including the following:

  • Organizing data from each variable in frequency distribution tables .
  • Displaying data from a key variable in a bar chart to view the distribution of responses.
  • Visualizing the relationship between two variables using a scatter plot .

By visualizing your data in tables and graphs, you can assess whether your data follow a skewed or normal distribution and whether there are any outliers or missing data.

A normal distribution means that your data are symmetrically distributed around a center where most values lie, with the values tapering off at the tail ends.

[Figure: Mean, median, mode, and standard deviation in a normal distribution]

In contrast, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to keep in mind because only some descriptive statistics should be used with skewed distributions.

Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.
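As a small illustration of these inspection steps (a sketch with hypothetical scores, using pandas and Matplotlib; not taken from the article's own examples):

```python
# A minimal sketch (hypothetical data): a frequency table, a quick histogram,
# and a simple rule-of-thumb outlier check.
import pandas as pd
import matplotlib.pyplot as plt

scores = pd.Series([62, 68, 70, 71, 73, 75, 75, 78, 80, 99])  # hypothetical test scores

print(scores.value_counts().sort_index())   # frequency distribution table
scores.plot(kind="hist", bins=5)            # visualize the distribution
plt.show()

# flag potential outliers: values more than 1.5 * IQR beyond the quartiles
q1, q3 = scores.quantile([0.25, 0.75])
iqr = q3 - q1
outliers = scores[(scores < q1 - 1.5 * iqr) | (scores > q3 + 1.5 * iqr)]
print("Potential outliers:", outliers.tolist())
```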

Calculate measures of central tendency

Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported:

  • Mode : the most popular response or value in the data set.
  • Median : the value in the exact middle of the data set when ordered from low to high.
  • Mean : the sum of all values divided by the number of values.

However, depending on the shape of the distribution and level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using the mode or proportions, while a variable like reaction time may not have a mode at all.
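For a quick sense of how these are computed (a sketch with hypothetical values, using Python's standard library):

```python
# A minimal sketch (hypothetical data) computing the three measures of central tendency.
import statistics

values = [2, 3, 3, 4, 5, 5, 5, 6, 7]

print("Mean:", statistics.mean(values))      # sum of all values / number of values
print("Median:", statistics.median(values))  # middle value when ordered from low to high
print("Mode:", statistics.mode(values))      # most frequent value in the data set
```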

Calculate measures of variability

Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported:

  • Range : the highest value minus the lowest value of the data set.
  • Interquartile range : the range of the middle half of the data set.
  • Standard deviation : the average distance between each value in your data set and the mean.
  • Variance : the square of the standard deviation.

Once again, the shape of the distribution and level of measurement should guide your choice of variability statistics. The interquartile range is the best measure for skewed distributions, while standard deviation and variance provide the best information for normal distributions.
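The same hypothetical approach works for the four measures of variability (a sketch using NumPy; the values are invented):

```python
# A minimal sketch (hypothetical data) computing range, IQR, standard deviation and variance.
import numpy as np

values = np.array([62, 68, 70, 71, 73, 75, 78, 80])

data_range = values.max() - values.min()   # range: highest minus lowest value
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1                              # interquartile range: middle half of the data
std_dev = values.std(ddof=1)               # sample standard deviation
variance = values.var(ddof=1)              # sample variance (square of the SD)

print(data_range, iqr, round(std_dev, 2), round(variance, 2))
```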

Using your table, you should check whether the units of the descriptive statistics are comparable for pretest and posttest scores. For example, are the variance levels similar across the groups? Are there any extreme values? If there are, you may need to identify and remove extreme outliers in your data set or transform your data before performing a statistical test.

                     Pretest scores   Posttest scores
Mean                 68.44            75.25
Standard deviation   9.43             9.88
Variance             88.96            97.96
Range                36.25            45.12
Sample size (N)      30

From this table, we can see that the mean score increased after the meditation exercise, and the variances of the two scores are comparable. Next, we can perform a statistical test to find out if this improvement in test scores is statistically significant in the population.

Example: Descriptive statistics (correlational study)
After collecting data from 653 students, you tabulate descriptive statistics for annual parental income and GPA.

It’s important to check whether you have a broad range of data points. If you don’t, your data may be skewed towards some groups more than others (e.g., high academic achievers), and only limited inferences can be made about a relationship.

                     Parental income (USD)   GPA
Mean                 62,100                   3.12
Standard deviation   15,000                   0.45
Variance             225,000,000              0.16
Range                8,000–378,000            2.64–4.00
Sample size (N)      653

A number that describes a sample is called a statistic , while a number describing a population is called a parameter . Using inferential statistics , you can make conclusions about population parameters based on sample statistics.

Researchers often use two main methods (simultaneously) to make inferences in statistics.

  • Estimation: calculating population parameters based on sample statistics.
  • Hypothesis testing: a formal process for testing research predictions about the population using samples.

You can make two types of estimates of population parameters from sample statistics:

  • A point estimate : a value that represents your best guess of the exact parameter.
  • An interval estimate : a range of values that represent your best guess of where the parameter lies.

If your aim is to infer and report population characteristics from sample data, it’s best to use both point and interval estimates in your paper.

You can consider a sample statistic a point estimate for the population parameter when you have a representative sample (e.g., in a wide public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters).

There’s always error involved in estimation, so you should also provide a confidence interval as an interval estimate to show the variability around a point estimate.

A confidence interval uses the standard error and the z score from the standard normal distribution to convey where you’d generally expect to find the population parameter most of the time.
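The sketch below (hypothetical scores, Python with NumPy and SciPy) builds a 95% confidence interval from the standard error and the z score as described above; note that for a small sample like this, a t critical value would usually be preferred in practice.

```python
# A minimal sketch (hypothetical data): point estimate plus a z-based 95% confidence interval.
import numpy as np
from scipy import stats

sample = np.array([70, 74, 69, 78, 71, 80, 72, 75, 70, 77])  # hypothetical scores

mean = sample.mean()                                          # point estimate
standard_error = sample.std(ddof=1) / np.sqrt(len(sample))
z = stats.norm.ppf(0.975)                                     # z score for a 95% confidence level

lower, upper = mean - z * standard_error, mean + z * standard_error
print(f"Point estimate: {mean:.2f}, 95% CI: [{lower:.2f}, {upper:.2f}]")
```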

Hypothesis testing

Using data from a sample, you can test hypotheses about relationships between variables in the population. Hypothesis testing starts with the assumption that the null hypothesis is true in the population, and you use statistical tests to assess whether the null hypothesis can be rejected or not.

Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true. These tests give two main outputs:

  • A test statistic tells you how much your data differs from the null hypothesis of the test.
  • A p value tells you the likelihood of obtaining your results if the null hypothesis is actually true in the population.

Statistical tests come in three main varieties:

  • Comparison tests assess group differences in outcomes.
  • Regression tests assess cause-and-effect relationships between variables.
  • Correlation tests assess relationships between variables without assuming causation.

Your choice of statistical test depends on your research questions, research design, sampling method, and data characteristics.

Parametric tests

Parametric tests make powerful inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.

A regression models the extent to which changes in a predictor variable result in changes in outcome variable(s).

  • A simple linear regression includes one predictor variable and one outcome variable.
  • A multiple linear regression includes two or more predictor variables and one outcome variable.
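As a brief sketch (hypothetical data, Python with SciPy; the variable names are invented), a simple linear regression with one predictor and one outcome can be run as follows.

```python
# A minimal sketch (hypothetical data) of a simple linear regression.
import numpy as np
from scipy import stats

hours_studied = np.array([1, 2, 3, 4, 5, 6, 7, 8])        # hypothetical predictor
exam_score = np.array([52, 55, 61, 64, 70, 72, 78, 83])   # hypothetical outcome

result = stats.linregress(hours_studied, exam_score)
print(f"slope = {result.slope:.2f}, intercept = {result.intercept:.2f}, "
      f"r = {result.rvalue:.2f}, p = {result.pvalue:.4f}")
```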

Comparison tests usually compare the means of groups. These may be the means of different groups within a sample (e.g., a treatment and control group), the means of one sample group taken at different times (e.g., pretest and posttest scores), or a sample mean and a population mean.

  • A t test is for exactly 1 or 2 groups when the sample is small (30 or less).
  • A z test is for exactly 1 or 2 groups when the sample is large.
  • An ANOVA is for 3 or more groups.

The z and t tests have subtypes based on the number and types of samples and the hypotheses:

  • If you have only one sample that you want to compare to a population mean, use a one-sample test .
  • If you have paired measurements (within-subjects design), use a dependent (paired) samples test .
  • If you have completely separate measurements from two unmatched groups (between-subjects design), use an independent (unpaired) samples test .
  • If you expect a difference between groups in a specific direction, use a one-tailed test .
  • If you don’t have any expectations for the direction of a difference between groups, use a two-tailed test .
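The sketch below (hypothetical data, Python with SciPy 1.6 or later for the alternative argument; not part of the original guide) shows how these subtypes map onto function calls.

```python
# A minimal sketch (hypothetical data) of the t-test variants listed above.
import numpy as np
from scipy import stats

pre = np.array([65, 70, 68, 72, 66, 74, 69, 71])       # hypothetical pretest scores
post = np.array([70, 74, 69, 78, 71, 80, 72, 75])      # hypothetical posttest scores
control = np.array([66, 69, 67, 70, 68, 71, 65, 72])   # hypothetical unmatched group

# one-sample test: compare one group's mean to a population mean (assumed to be 70)
print(stats.ttest_1samp(post, popmean=70))

# dependent (paired) samples, one-tailed: within-subjects pre/post comparison
print(stats.ttest_rel(post, pre, alternative="greater"))

# independent (unpaired) samples, two-tailed: between-subjects comparison
print(stats.ttest_ind(post, control))
```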

The only parametric correlation test is Pearson’s r . The correlation coefficient ( r ) tells you the strength of a linear relationship between two quantitative variables.

However, to test whether the correlation in the sample is strong enough to be important in the population, you also need to perform a significance test of the correlation coefficient, usually a t test, to obtain a p value. This test uses your sample size to calculate how much the correlation coefficient differs from zero in the population.
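A short sketch (hypothetical data, Python with SciPy) of Pearson's r together with its p value, which scipy.stats.pearsonr returns in a single call:

```python
# A minimal sketch (hypothetical data) of a Pearson correlation and its significance test.
import numpy as np
from scipy import stats

parental_income = np.array([30, 45, 52, 60, 75, 88, 95, 110])  # hypothetical, in $1,000s
gpa = np.array([2.8, 3.0, 3.1, 3.2, 3.4, 3.5, 3.6, 3.8])       # hypothetical GPAs

r, p_value = stats.pearsonr(parental_income, gpa)
print(f"r = {r:.2f}, p = {p_value:.4f}")
```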

You use a dependent-samples, one-tailed t test to assess whether the meditation exercise significantly improved math test scores. The test gives you:

  • a t value (test statistic) of 3.00
  • a p value of 0.0028

Although Pearson’s r is a test statistic, it doesn’t tell you anything about how significant the correlation is in the population. You also need to test whether this sample correlation coefficient is large enough to demonstrate a correlation in the population.

A t test can also determine how significantly a correlation coefficient differs from zero based on sample size. Since you expect a positive correlation between parental income and GPA, you use a one-sample, one-tailed t test. The t test gives you:

  • a t value of 3.08
  • a p value of 0.001

The final step of statistical analysis is interpreting your results.

Statistical significance

In hypothesis testing, statistical significance is the main criterion for forming conclusions. You compare your p value to a set significance level (usually 0.05) to decide whether your results are statistically significant or non-significant.

Statistically significant results are considered unlikely to have arisen solely due to chance. There is only a very low chance of such a result occurring if the null hypothesis is true in the population.

This means that you believe the meditation intervention, rather than random factors, directly caused the increase in test scores.

Example: Interpret your results (correlational study)
You compare your p value of 0.001 to your significance threshold of 0.05. With a p value under this threshold, you can reject the null hypothesis. This indicates a statistically significant correlation between parental income and GPA in male college students.

Note that correlation doesn’t always mean causation, because there are often many underlying factors contributing to a complex variable like GPA. Even if one variable is related to another, this may be because of a third variable influencing both of them, or indirect links between the two variables.

Effect size

A statistically significant result doesn’t necessarily mean that there are important real life applications or clinical outcomes for a finding.

In contrast, the effect size indicates the practical significance of your results. It’s important to report effect sizes along with your inferential statistics for a complete picture of your results. You should also report interval estimates of effect sizes if you’re writing an APA style paper .

With a Cohen’s d of 0.72, there’s medium to high practical significance to your finding that the meditation exercise improved test scores.

Example: Effect size (correlational study)
To determine the effect size of the correlation coefficient, you compare your Pearson’s r value to Cohen’s effect size criteria.
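For reference, Cohen's d for two groups can be computed by hand (a sketch with hypothetical data, using the pooled standard deviation, which is one common formula among several):

```python
# A minimal sketch (hypothetical data) of Cohen's d with a pooled standard deviation.
import numpy as np

group_a = np.array([70, 74, 69, 78, 71, 80, 72, 75])  # hypothetical treatment scores
group_b = np.array([66, 69, 67, 70, 68, 71, 65, 72])  # hypothetical control scores

n1, n2 = len(group_a), len(group_b)
pooled_sd = np.sqrt(((n1 - 1) * group_a.var(ddof=1) +
                     (n2 - 1) * group_b.var(ddof=1)) / (n1 + n2 - 2))
cohens_d = (group_a.mean() - group_b.mean()) / pooled_sd
print(f"Cohen's d = {cohens_d:.2f}")
```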

Decision errors

Type I and Type II errors are mistakes made in research conclusions. A Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s false.

You can aim to minimize the risk of these errors by selecting an optimal significance level and ensuring high power . However, there’s a trade-off between the two errors, so a fine balance is necessary.

Frequentist versus Bayesian statistics

Traditionally, frequentist statistics emphasizes null hypothesis significance testing and always starts with the assumption of a true null hypothesis.

However, Bayesian statistics has grown in popularity as an alternative approach in the last few decades. In this approach, you use previous research to continually update your hypotheses based on your expectations and observations.

Bayes factor compares the relative strength of evidence for the null versus the alternative hypothesis rather than making a conclusion about rejecting the null hypothesis or not.



A Step-by-Step Guide to Dissertation Data Analysis


A data analysis dissertation is a complex and challenging project requiring significant time, effort, and expertise. Fortunately, it is possible to successfully complete a data analysis dissertation with careful planning and execution.

As a student, you must know how important it is to have a strong and well-written dissertation, especially regarding data analysis. Proper data analysis is crucial to the success of your research and can often make or break your dissertation.

To get a better understanding, you may review the data analysis dissertation examples listed below:

  • Impact of Leadership Style on the Job Satisfaction of Nurses
  • Effect of Brand Love on Consumer Buying Behaviour in Dietary Supplement Sector
  • An Insight Into Alternative Dispute Resolution
  • An Investigation of Cyberbullying and its Impact on Adolescent Mental Health in UK


Types of Data Analysis for a Dissertation

The various types of data analysis in a dissertation are as follows:

1.   Qualitative Data Analysis

Qualitative data analysis is a type of data analysis that involves analyzing data that cannot be measured numerically. This data type includes interviews, focus groups, and open-ended surveys. Qualitative data analysis can be used to identify patterns and themes in the data.

2.   Quantitative Data Analysis

Quantitative data analysis is a type of data analysis that involves analyzing data that can be measured numerically. This data type includes test scores, income levels, and crime rates. Quantitative data analysis can be used to test hypotheses and to look for relationships between variables.

3.   Descriptive Data Analysis

Descriptive data analysis is a type of data analysis that involves describing the characteristics of a dataset. This type of data analysis summarizes the main features of a dataset.

4.   Inferential Data Analysis

Inferential data analysis is a type of data analysis that involves making predictions based on a dataset. This type of data analysis can be used to test hypotheses and make predictions about future events.

5.   Exploratory Data Analysis

Exploratory data analysis is a type of data analysis that involves exploring a data set to understand it better. This type of data analysis can identify patterns and relationships in the data.
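As a brief illustration of the exploratory style of analysis (a sketch with hypothetical data, using Python's pandas library; not tied to any particular dissertation):

```python
# A minimal sketch (hypothetical data) of a quick exploratory pass over a dataset:
# summary statistics, group counts, group comparisons, and a correlation check.
import pandas as pd

df = pd.DataFrame({
    "group": ["control", "treatment", "control", "treatment", "control", "treatment"],
    "score": [64, 72, 61, 75, 66, 78],
    "age":   [21, 23, 22, 24, 20, 25],
})

print(df.describe())                         # descriptive overview of numeric columns
print(df["group"].value_counts())            # how many cases fall in each group
print(df.groupby("group")["score"].mean())   # compare group means
print(df[["score", "age"]].corr())           # explore relationships between variables
```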

How Long Does It Take to Plan and Complete a Data Analysis Dissertation?

When planning your dissertation data analysis, it is important to consider the structure of your methodology and how long each stage will take. For example, if you are using a qualitative research method, your data analysis will involve coding and categorizing your data.

This can be time-consuming, so allowing enough time in your schedule is important. Once you have coded and categorized your data, you will need to write up your findings. Again, this can take some time, so factor this into your schedule.

Finally, you will need to proofread and edit your dissertation before submitting it. All told, a data analysis dissertation can take anywhere from several weeks to several months to complete, depending on the project’s complexity. Therefore, it is important to start planning early and to allow enough time in your schedule to complete the task.

Essential Strategies for Data Analysis Dissertation

A.   Planning

The first step in any dissertation is planning. You must decide what you want to write about and how you want to structure your argument. This planning will involve deciding what data you want to analyze and what methods you will use for a data analysis dissertation.

B.   Prototyping

Once you have a plan for your dissertation, it’s time to start writing. However, creating a prototype is important before diving head-first into writing your dissertation. A prototype is a rough draft of your argument that allows you to get feedback from your advisor and committee members. This feedback will help you fine-tune your argument before you start writing the final version of your dissertation.

C.   Executing

After you have created a plan and prototype for your data analysis dissertation, it’s time to start writing the final version. This process will involve collecting and analyzing data and writing up your results. You will also need to create a conclusion section that ties everything together.

D.   Presenting

The final step in acing your data analysis dissertation is presenting it to your committee. This presentation should be well-organized and professionally presented. During the presentation, you’ll also need to be ready to respond to questions concerning your dissertation.

Data Analysis Tools

Numerous tools can be employed to assess the data and deduce pertinent findings for the discussion section. The tools commonly used to analyze data and reach a scientific conclusion are as follows:

a.     Excel

Excel is a spreadsheet program part of the Microsoft Office productivity software suite. Excel is a powerful tool that can be used for various data analysis tasks, such as creating charts and graphs, performing mathematical calculations, and sorting and filtering data.

b.     Google Sheets

Google Sheets is a free online spreadsheet application that is part of the Google Drive suite of productivity software. Google Sheets is similar to Excel in terms of functionality, but it also has some unique features, such as the ability to collaborate with other users in real-time.

c.     SPSS

SPSS is a statistical analysis software program commonly used in the social sciences. SPSS can be used for various data analysis tasks, such as hypothesis testing, factor analysis, and regression analysis.

d.     STATA

STATA is a statistical analysis software program commonly used in the sciences and economics. STATA can be used for data management, statistical modelling, descriptive statistics analysis, and data visualization tasks.

e.     SAS

SAS is a commercial statistical analysis software program used by businesses and organizations worldwide. SAS can be used for predictive modelling, market research, and fraud detection.

f.     R

R is a free, open-source statistical programming language popular among statisticians and data scientists. R can be used for tasks such as data wrangling, machine learning, and creating complex visualizations.

g.     Python

Python is a versatile programming language that can be used for a wide variety of applications, including web development, scientific computing, and artificial intelligence. Python also has a number of modules and libraries that can be used for data analysis tasks, such as numerical computing, statistical modelling, and data visualization.
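To give a flavour of those libraries (a sketch with hypothetical data; NumPy for numerical computing, SciPy for statistical testing, and Matplotlib for visualization):

```python
# A minimal sketch (hypothetical data) using common Python data analysis libraries.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

treatment = np.array([72, 75, 78, 74, 80, 77])   # hypothetical scores
control = np.array([68, 70, 69, 71, 67, 72])

t_stat, p_value = stats.ttest_ind(treatment, control)   # simple statistical test
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

plt.boxplot([treatment, control])                        # quick visualization
plt.xticks([1, 2], ["Treatment", "Control"])
plt.ylabel("Score")
plt.show()
```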


Tips to Compose a Successful Data Analysis Dissertation

a.   Choose a Topic You’re Passionate About

The first step to writing a successful data analysis dissertation is to choose a topic you’re passionate about. Not only will this make the research and writing process more enjoyable, but it will also ensure that you produce a high-quality paper.

Choose a topic that is particular enough to be covered in your paper’s scope but not so specific that it will be challenging to obtain enough evidence to substantiate your arguments.

b.   Do Your Research

Data analysis in research is an important part of academic writing. Once you’ve selected a topic, it’s time to begin your research. Be sure to consult with your advisor or supervisor frequently during this stage to ensure that you are on the right track. In addition to secondary sources such as books, journal articles, and reports, you should also consider conducting primary research through surveys or interviews. This will give you first-hand insights into your topic that can be invaluable when writing your paper.

c.   Develop a Strong Thesis Statement

After you’ve done your research, it’s time to start developing your thesis statement. It is arguably the most crucial part of your entire paper, so take care to craft a clear and concise statement that encapsulates the main argument of your paper.

Remember that your thesis statement should be arguable—that is, it should be capable of being disputed by someone who disagrees with your point of view. If your thesis statement is not arguable, it will be difficult to write a convincing paper.

d.   Write a Detailed Outline

Once you have developed a strong thesis statement, the next step is to write a detailed outline of your paper. This will offer you a direction to write in and guarantee that your paper makes sense from beginning to end.

Your outline should include an introduction, in which you state your thesis statement; several body paragraphs, each devoted to a different aspect of your argument; and a conclusion, in which you restate your thesis and summarize the main points of your paper.

e.   Write Your First Draft

With your outline in hand, it’s finally time to start writing your first draft. At this stage, don’t worry about perfecting your grammar or making sure every sentence is exactly right—focus on getting all of your ideas down on paper (or onto the screen). Once you have completed your first draft, you can revise it for style and clarity.

And there you have it! Following these simple tips can increase your chances of success when writing your data analysis dissertation. Just remember to start early, give yourself plenty of time to research and revise, and consult with your supervisor frequently throughout the process.


Studying the above examples gives you valuable insight into the structure and content that should be included in your own data analysis dissertation. You can also learn how to effectively analyze and present your data and make a lasting impact on your readers.

In addition to being a useful resource for completing your dissertation, these examples can also serve as a valuable reference for future academic writing projects. By following these examples and understanding their principles, you can improve your data analysis skills and increase your chances of success in your academic career.

You may also contact Premier Dissertations to develop your data analysis dissertation.

For further assistance, some other resources in the dissertation writing section are shared below:

How Do You Select the Right Data Analysis

How to Write Data Analysis For A Dissertation?

How to Develop a Conceptual Framework in Dissertation?

What is a Hypothesis in a Dissertation?





Data Interpretation – Process, Methods and Questions

Data Interpretation

Definition :

Data interpretation refers to the process of making sense of data by analyzing and drawing conclusions from it. It involves examining data in order to identify patterns, relationships, and trends that can help explain the underlying phenomena being studied. Data interpretation can be used to make informed decisions and solve problems across a wide range of fields, including business, science, and social sciences.

Data Interpretation Process

Here are the steps involved in the data interpretation process:

  • Define the research question : The first step in data interpretation is to clearly define the research question. This will help you to focus your analysis and ensure that you are interpreting the data in a way that is relevant to your research objectives.
  • Collect the data: The next step is to collect the data. This can be done through a variety of methods such as surveys, interviews, observation, or secondary data sources.
  • Clean and organize the data : Once the data has been collected, it is important to clean and organize it. This involves checking for errors, inconsistencies, and missing data. Data cleaning can be a time-consuming process, but it is essential to ensure that the data is accurate and reliable.
  • Analyze the data: The next step is to analyze the data. This can involve using statistical software or other tools to calculate summary statistics, create graphs and charts, and identify patterns in the data.
  • Interpret the results: Once the data has been analyzed, it is important to interpret the results. This involves looking for patterns, trends, and relationships in the data. It also involves drawing conclusions based on the results of the analysis.
  • Communicate the findings : The final step is to communicate the findings. This can involve creating reports, presentations, or visualizations that summarize the key findings of the analysis. It is important to communicate the findings in a way that is clear and concise, and that is tailored to the audience’s needs.

Types of Data Interpretation

There are various types of data interpretation techniques used for analyzing and making sense of data. Here are some of the most common types:

Descriptive Interpretation

This type of interpretation involves summarizing and describing the key features of the data. This can involve calculating measures of central tendency (such as mean, median, and mode), measures of dispersion (such as range, variance, and standard deviation), and creating visualizations such as histograms, box plots, and scatterplots.
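
As an illustration only (the numbers are invented), a descriptive summary of a small sample might be computed in Python as follows:

```python
import numpy as np
import matplotlib.pyplot as plt

data = np.array([23, 25, 27, 22, 30, 31, 26, 24, 29, 28, 35, 21])

# Measures of central tendency and dispersion
print("mean:", data.mean())
print("median:", np.median(data))
print("range:", data.max() - data.min())
print("sample variance:", data.var(ddof=1))
print("sample standard deviation:", data.std(ddof=1))

# A histogram to visualise the distribution
plt.hist(data, bins=5)
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.savefig("histogram.png")
```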

Inferential Interpretation

This type of interpretation involves making inferences about a larger population based on a sample of the data. This can involve hypothesis testing, where you test a hypothesis about a population parameter using sample data, or confidence interval estimation, where you estimate a range of values for a population parameter based on sample data.
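
For example, a two-sample t-test and a 95% confidence interval could be run in Python with SciPy; the sketch below uses made-up measurements purely to illustrate the idea.

```python
import numpy as np
from scipy import stats

sample_a = np.array([5.1, 4.9, 5.4, 5.0, 5.2, 4.8])
sample_b = np.array([5.6, 5.8, 5.5, 5.9, 5.7, 5.4])

# Hypothesis test: do the two samples come from populations with equal means?
t_stat, p_value = stats.ttest_ind(sample_a, sample_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# 95% confidence interval for the population mean estimated from sample_a
ci = stats.t.interval(0.95, df=len(sample_a) - 1,
                      loc=sample_a.mean(), scale=stats.sem(sample_a))
print("95% CI for the mean of sample_a:", ci)
```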

Predictive Interpretation

This type of interpretation involves using data to make predictions about future outcomes. This can involve building predictive models using statistical techniques such as regression analysis, time-series analysis, or machine learning algorithms.
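
A minimal regression-based sketch (with invented advertising and revenue figures) shows the general pattern:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical advertising spend (x) and sales revenue (y), in thousands
x = np.array([[10], [15], [20], [25], [30]])
y = np.array([110, 135, 160, 190, 205])

model = LinearRegression().fit(x, y)
print("slope:", model.coef_[0], "intercept:", model.intercept_)

# Use the fitted model to predict revenue at a new spend level
print("predicted revenue at spend = 35:", model.predict([[35]])[0])
```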

Exploratory Interpretation

This type of interpretation involves exploring the data to identify patterns and relationships that were not previously known. This can involve data mining techniques such as clustering analysis, principal component analysis, or association rule mining.
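
The sketch below, run on synthetic data, shows how principal component analysis and k-means clustering might be combined for this kind of exploration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two loose groups of observations in four dimensions
data = np.vstack([rng.normal(0, 1, size=(20, 4)),
                  rng.normal(5, 1, size=(20, 4))])

# Reduce to two principal components for inspection
components = PCA(n_components=2).fit_transform(data)

# Group the observations into two clusters
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(data)
print(components[:3])
print(labels)
```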

Causal Interpretation

This type of interpretation involves identifying causal relationships between variables in the data. This can involve experimental designs, such as randomized controlled trials, or observational studies, such as regression analysis or propensity score matching.

Data Interpretation Methods

There are various methods for data interpretation that can be used to analyze and make sense of data. Here are some of the most common methods:

Statistical Analysis

This method involves using statistical techniques to analyze the data. Statistical analysis can involve descriptive statistics (such as measures of central tendency and dispersion), inferential statistics (such as hypothesis testing and confidence interval estimation), and predictive modeling (such as regression analysis and time-series analysis).

Data Visualization

This method involves using visual representations of the data to identify patterns and trends. Data visualization can involve creating charts, graphs, and other visualizations, such as heat maps or scatterplots.
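
For instance, a simple scatterplot of two (invented) variables can be produced with Matplotlib:

```python
import matplotlib.pyplot as plt

hours_studied = [1, 2, 3, 4, 5, 6, 7, 8]
exam_score = [52, 55, 61, 64, 70, 74, 79, 83]

plt.scatter(hours_studied, exam_score)
plt.xlabel("Hours studied")
plt.ylabel("Exam score")
plt.title("Study time versus exam performance")
plt.savefig("scatterplot.png")
```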

Text Analysis

This method involves analyzing text data, such as survey responses or social media posts, to identify patterns and themes. Text analysis can involve techniques such as sentiment analysis, topic modeling, and natural language processing.
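
A very simple first step, sketched below on made-up survey responses, is a word-frequency count, which often precedes topic modelling or sentiment analysis.

```python
import re
from collections import Counter

responses = [
    "The course was helpful and well organised",
    "Helpful staff but the workload was heavy",
    "Workload too heavy, though the content was helpful",
]

# Tokenise each response into lowercase words and count them
words = []
for response in responses:
    words.extend(re.findall(r"[a-z]+", response.lower()))

print(Counter(words).most_common(5))
```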

Machine Learning

This method involves using algorithms to identify patterns in the data and make predictions or classifications. Machine learning can involve techniques such as decision trees, neural networks, and random forests.
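
As an illustrative example, a random forest classifier can be trained and evaluated on scikit-learn’s bundled iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```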

Qualitative Analysis

This method involves analyzing non-numeric data, such as interviews or focus group discussions, to identify themes and patterns. Qualitative analysis can involve techniques such as content analysis, grounded theory, and narrative analysis.

Geospatial Analysis

This method involves analyzing spatial data, such as maps or GPS coordinates, to identify patterns and relationships. Geospatial analysis can involve techniques such as spatial autocorrelation, hot spot analysis, and clustering.

Applications of Data Interpretation

Data interpretation has a wide range of applications across different fields, including business, healthcare, education, social sciences, and more. Here are some examples of how data interpretation is used in different applications:

  • Business : Data interpretation is widely used in business to inform decision-making, identify market trends, and optimize operations. For example, businesses may analyze sales data to identify the most popular products or customer demographics, or use predictive modeling to forecast demand and adjust pricing accordingly.
  • Healthcare : Data interpretation is critical in healthcare for identifying disease patterns, evaluating treatment effectiveness, and improving patient outcomes. For example, healthcare providers may use electronic health records to analyze patient data and identify risk factors for certain diseases or conditions.
  • Education : Data interpretation is used in education to assess student performance, identify areas for improvement, and evaluate the effectiveness of instructional methods. For example, schools may analyze test scores to identify students who are struggling and provide targeted interventions to improve their performance.
  • Social sciences : Data interpretation is used in social sciences to understand human behavior, attitudes, and perceptions. For example, researchers may analyze survey data to identify patterns in public opinion or use qualitative analysis to understand the experiences of marginalized communities.
  • Sports : Data interpretation is increasingly used in sports to inform strategy and improve performance. For example, coaches may analyze performance data to identify areas for improvement or use predictive modeling to assess the likelihood of injuries or other risks.

When to use Data Interpretation

Data interpretation is used to make sense of complex data and to draw conclusions from it. It is particularly useful when working with large datasets or when trying to identify patterns or trends in the data. Data interpretation can be used in a variety of settings, including scientific research, business analysis, and public policy.

In scientific research, data interpretation is often used to draw conclusions from experiments or studies. Researchers use statistical analysis and data visualization techniques to interpret their data and to identify patterns or relationships between variables. This can help them to understand the underlying mechanisms of their research and to develop new hypotheses.

In business analysis, data interpretation is used to analyze market trends and consumer behavior. Companies can use data interpretation to identify patterns in customer buying habits, to understand market trends, and to develop marketing strategies that target specific customer segments.

In public policy, data interpretation is used to inform decision-making and to evaluate the effectiveness of policies and programs. Governments and other organizations use data interpretation to track the impact of policies and programs over time, to identify areas where improvements are needed, and to develop evidence-based policy recommendations.

In general, data interpretation is useful whenever large amounts of data need to be analyzed and understood in order to make informed decisions.

Data Interpretation Examples

Here are some real-time examples of data interpretation:

  • Social media analytics : Social media platforms generate vast amounts of data every second, and businesses can use this data to analyze customer behavior, track sentiment, and identify trends. Data interpretation in social media analytics involves analyzing data in real-time to identify patterns and trends that can help businesses make informed decisions about marketing strategies and customer engagement.
  • Healthcare analytics: Healthcare organizations use data interpretation to analyze patient data, track outcomes, and identify areas where improvements are needed. Real-time data interpretation can help healthcare providers make quick decisions about patient care, such as identifying patients who are at risk of developing complications or adverse events.
  • Financial analysis: Real-time data interpretation is essential for financial analysis, where traders and analysts need to make quick decisions based on changing market conditions. Financial analysts use data interpretation to track market trends, identify opportunities for investment, and develop trading strategies.
  • Environmental monitoring : Real-time data interpretation is important for environmental monitoring, where data is collected from various sources such as satellites, sensors, and weather stations. Data interpretation helps to identify patterns and trends that can help predict natural disasters, track changes in the environment, and inform decision-making about environmental policies.
  • Traffic management: Real-time data interpretation is used for traffic management, where traffic sensors collect data on traffic flow, congestion, and accidents. Data interpretation helps to identify areas where traffic congestion is high, and helps traffic management authorities make decisions about road maintenance, traffic signal timing, and other strategies to improve traffic flow.

Data Interpretation Questions

Sample data interpretation questions:

  • Medical : What is the correlation between a patient’s age and their risk of developing a certain disease?
  • Environmental Science: What is the trend in the concentration of a certain pollutant in a particular body of water over the past 10 years?
  • Finance : What is the correlation between a company’s stock price and its quarterly revenue?
  • Education : What is the trend in graduation rates for a particular high school over the past 5 years?
  • Marketing : What is the correlation between a company’s advertising budget and its sales revenue?
  • Sports : What is the trend in the number of home runs hit by a particular baseball player over the past 3 seasons?
  • Social Science: What is the correlation between a person’s level of education and their income level?

In order to answer these questions, you would need to analyze and interpret the data using statistical methods, graphs, and other visualization tools.
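
For example, the marketing question above could be answered by computing a correlation coefficient; the figures below are invented for illustration.

```python
import numpy as np
from scipy import stats

budget = np.array([10, 12, 15, 18, 20, 25, 30])         # advertising budget
revenue = np.array([95, 101, 112, 120, 128, 140, 160])  # sales revenue

r, p_value = stats.pearsonr(budget, revenue)
print(f"Pearson r = {r:.2f}, p = {p_value:.4f}")
```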

Purpose of Data Interpretation

The purpose of data interpretation is to make sense of complex data by analyzing and drawing insights from it. The process of data interpretation involves identifying patterns and trends, making comparisons, and drawing conclusions based on the data. The ultimate goal of data interpretation is to use the insights gained from the analysis to inform decision-making.

Data interpretation is important because it allows individuals and organizations to:

  • Understand complex data : Data interpretation helps individuals and organizations to make sense of complex data sets that would otherwise be difficult to understand.
  • Identify patterns and trends : Data interpretation helps to identify patterns and trends in data, which can reveal important insights about the underlying processes and relationships.
  • Make informed decisions: Data interpretation provides individuals and organizations with the information they need to make informed decisions based on the insights gained from the data analysis.
  • Evaluate performance : Data interpretation helps individuals and organizations to evaluate their performance over time and to identify areas where improvements can be made.
  • Communicate findings: Data interpretation allows individuals and organizations to communicate their findings to others in a clear and concise manner, which is essential for informing stakeholders and making changes based on the insights gained from the analysis.

Characteristics of Data Interpretation

Here are some characteristics of data interpretation:

  • Contextual : Data interpretation is always contextual, meaning that the interpretation of data is dependent on the context in which it is analyzed. The same data may have different meanings depending on the context in which it is analyzed.
  • Iterative : Data interpretation is an iterative process, meaning that it often involves multiple rounds of analysis and refinement as more data becomes available or as new insights are gained from the analysis.
  • Subjective : Data interpretation is often subjective, as it involves the interpretation of data by individuals who may have different perspectives and biases. It is important to acknowledge and address these biases when interpreting data.
  • Analytical : Data interpretation involves the use of analytical tools and techniques to analyze and draw insights from data. These may include statistical analysis, data visualization, and other data analysis methods.
  • Evidence-based : Data interpretation is evidence-based, meaning that it is based on the data and the insights gained from the analysis. It is important to ensure that the data used in the analysis is accurate, relevant, and reliable.
  • Actionable : Data interpretation is actionable, meaning that it provides insights that can be used to inform decision-making and to drive action. The ultimate goal of data interpretation is to use the insights gained from the analysis to improve performance or to achieve specific goals.

Advantages of Data Interpretation

Data interpretation has several advantages, including:

  • Improved decision-making: Data interpretation provides insights that can be used to inform decision-making. By analyzing data and drawing insights from it, individuals and organizations can make informed decisions based on evidence rather than intuition.
  • Identification of patterns and trends: Data interpretation helps to identify patterns and trends in data, which can reveal important insights about the underlying processes and relationships. This information can be used to improve performance or to achieve specific goals.
  • Evaluation of performance: Data interpretation helps individuals and organizations to evaluate their performance over time and to identify areas where improvements can be made. By analyzing data, organizations can identify strengths and weaknesses and make changes to improve their performance.
  • Communication of findings: Data interpretation allows individuals and organizations to communicate their findings to others in a clear and concise manner, which is essential for informing stakeholders and making changes based on the insights gained from the analysis.
  • Better resource allocation: Data interpretation can help organizations allocate resources more efficiently by identifying areas where resources are needed most. By analyzing data, organizations can identify areas where resources are being underutilized or where additional resources are needed to improve performance.
  • Improved competitiveness : Data interpretation can give organizations a competitive advantage by providing insights that help to improve performance, reduce costs, or identify new opportunities for growth.

Limitations of Data Interpretation

Data interpretation has some limitations, including:

  • Limited by the quality of data: The quality of data used in data interpretation can greatly impact the accuracy of the insights gained from the analysis. Poor quality data can lead to incorrect conclusions and decisions.
  • Subjectivity: Data interpretation can be subjective, as it involves the interpretation of data by individuals who may have different perspectives and biases. This can lead to different interpretations of the same data.
  • Limited by analytical tools: The analytical tools and techniques used in data interpretation can also limit the accuracy of the insights gained from the analysis. Different analytical tools may yield different results, and some tools may not be suitable for certain types of data.
  • Time-consuming: Data interpretation can be a time-consuming process, particularly for large and complex data sets. This can make it difficult to quickly make decisions based on the insights gained from the analysis.
  • Incomplete data: Data interpretation can be limited by incomplete data sets, which may not provide a complete picture of the situation being analyzed. Incomplete data can lead to incorrect conclusions and decisions.
  • Limited by context: Data interpretation is always contextual, meaning that the interpretation of data is dependent on the context in which it is analyzed. The same data may have different meanings depending on the context in which it is analyzed.

Difference between Data Interpretation and Data Analysis

Data interpretation and data analysis are two different but closely related processes in data-driven decision-making.

Data analysis refers to the process of inspecting and examining data using statistical and computational methods to derive insights and conclusions from it. It involves cleaning, transforming, and modeling the data to uncover patterns, relationships, and trends that can help in understanding the underlying phenomena.

Data interpretation, on the other hand, refers to the process of making sense of the findings from the data analysis by contextualizing them within the larger problem domain. It involves identifying the key takeaways from the data analysis, assessing their relevance and significance to the problem at hand, and communicating the insights in a clear and actionable manner.

In short, data analysis is about uncovering insights from the data, while data interpretation is about making sense of those insights and translating them into actionable recommendations.



Chapter 4 Presentation, Analysis, and Interpretation of Data


The analysis and interpretation of data about wearing high heels among female students of Ligao Community College. To complete this study properly, the collected data must be analysed in order to answer the research questions; the data are interpreted in descriptive form. This chapter comprises the analysis, presentation and interpretation of the findings of the study, carried out in two phases: the first, based on the questionnaire results, deals with the qualitative analysis of the data, while the second is based on quantitative analysis. The unit of analysis is the major entity that the researcher analyses in the study, that is, the ‘what’ or ‘who’ being studied. Data were collected from the female students as follows: 100 questionnaires were distributed, of which only 80 were retrieved; some students did not answer every item, a few returned questionnaires with a great deal of missing data, and the remaining students completed the questionnaire in full. The researchers used tables to organise the data and interpret them according to the students’ responses. Percentages were calculated with the formula P = (f / N) × 100, where f is the frequency of a response and N is the total number of respondents.



  • Open access
  • Published: 10 July 2024

An end-to-end approach for single-cell infrared absorption spectroscopy of bacterial inclusion bodies: from AFM-IR measurement to data interpretation of large sample sets

Wouter Duverger, Grigoria Tsaka, Ladan Khodaparast, Laleh Khodaparast, Nikolaos Louros, Frederic Rousseau & Joost Schymkowitz

Journal of Nanobiotechnology, volume 22, Article number: 406 (2024)


Inclusion bodies (IBs) are well-known subcellular structures in bacteria where protein aggregates are collected. Various methods have probed their structure, but single-cell spectroscopy remains challenging. Atomic Force Microscopy-based Infrared Spectroscopy (AFM-IR) is a novel technology with high potential for the characterisation of biomaterials such as IBs.

We present a detailed investigation using AFM-IR, revealing the substructure of IBs and their variation at the single-cell level, including a rigorous optimisation of data collection parameters and addressing issues such as laser power, pulse frequency, and sample drift. An analysis pipeline was developed tailored to AFM-IR image data, allowing high-throughput, label-free imaging of more than 3500 IBs in 12,000 bacterial cells. We examined IBs generated in Escherichia coli under different stress conditions. Dimensionality reduction analysis of the resulting spectra suggested distinct clustering of stress conditions, aligning with the nature and severity of the applied stresses. Correlation analyses revealed intricate relationships between the physical and morphological properties of IBs.

Conclusions

Our study highlights the power and limitations of AFM-IR, revealing structural heterogeneity within and between IBs. We show that it is possible to perform quantitative analyses of AFM-IR maps over a large collection of different samples and determine how to control for various technical artefacts.

Inclusion bodies (IBs) are insoluble nonmembranous organelles in bacterial cells that store misfolded and aggregated proteins, first observed by Prouty et al. in recombinant bacteria [ 1 , 2 ]. These structures have attracted significant research attention due to their strong regulation by the host proteostasis machinery and their association with cellular senescence [ 3 , 4 , 5 ]. The challenge of resolubilising IBs often arises during the scale-up of protein production processes; however, in certain instances, proteins within IBs may retain some degree of their native structure and catalytic activity, negating the necessity for resolubilisation [ 6 ]. Moreover, IBs are being explored as putative drug delivery systems owing to their cell permeability and controlled drug release kinetics [ 7 , 8 ]. Additionally, cellular stressors such as starvation, senescence, and exposure to antibiotics can induce IB formation [ 9 ].

A wide range of techniques is employed to study the structural properties of IBs. Various techniques, including X-ray diffraction (XRD), Fourier transform infrared (FTIR) and Raman spectroscopy, nuclear magnetic resonance spectroscopy (NMR) and dye binding assays using Congo red, thioflavin T, thioflavin S, and pFTAA, have revealed that IBs possess amyloid-like characteristics [ 9 , 10 , 11 ]. Amyloid-like fibrils have been observed by transmission electron microscopy (TEM) and atomic force microscopy (AFM) upon digestion by proteinase K or trypsin [ 12 ]. The interactions of IBs with the proteostasis system and their dynamic behaviour have predominantly been studied using biochemical assays, as well as brightfield and fluorescence microscopy [ 4 , 13 , 14 , 15 ].

Micro-FTIR (µFTIR) is one of the few methods that offers label-free direct imaging of the secondary structure of proteins in IBs that does not depend on their extraction [16]. Proteins mainly absorb IR light in two regions of the IR spectrum: the amide I band (1600–1700 cm⁻¹) and the amide II band (1500–1600 cm⁻¹). The former is sensitive to the secondary conformation of a protein: β-sheets absorb between 1620 and 1640 cm⁻¹ and between 1674 and 1700 cm⁻¹, depending on their nature, while α-helices and disordered regions absorb around 1654 cm⁻¹ and β-turns around 1672 cm⁻¹ [17]. However, the resolution of µFTIR cannot exceed 2.5 μm, the Abbe limit at these wavelengths [18].

There has been an exceptional boom in infrared imaging methods for achieving higher resolution, such as optical photothermal infrared microscopy (OPTIR) and AFM-based methods such as atomic force microscopy-based infrared spectroscopy (AFM-IR) [ 19 , 20 ], tip-enhanced Raman scattering (TERS) and scanning near-field optical microscopy (SNOM) [ 21 ]. Each of these methods has its merits and limitations; see Dazzi and Prater (2016) for a comparison of AFM-IR, TERS, and SNOM [ 18 ]. In this work, we attempted to develop a protocol for the study of bacterial IBs using AFM-IR.

AFM-IR relies on the thermal expansion of molecules upon illumination with IR light of a wavenumber matching internal vibrations in those molecules and is therefore also known as photothermal infrared microscopy (PTIR) [22]. While the illumination laser remains diffraction-limited, a sharp AFM probe is used for the infrared absorption readout, resulting in a lateral resolution as low as 10 nm [23]. The amplitude of photothermal expansion is often considered proportional to the FTIR spectrum but is also influenced by factors including the probe shape, incident laser power, and quality of mechanical contact [24, 25], resulting in slight band shifts relative to traditional FTIR with regard to protein conformational analysis [24, 26, 27]. Since the invention of AFM-IR, many improvements have been made to this method, such as resonance-enhanced [28], tapping [29], surface-sensitive [30], and null-deflection AFM-IR [31]. Another line of research attempts to perform AFM-IR in water [32, 33].

Previous studies on bacteria have utilised AFM-IR to study DNA [ 34 , 35 , 36 , 37 ], biopolymer-producing species [ 38 , 39 ], antibiotic resistance [ 40 ] or bacterial functional amyloids [ 41 ], primarily after depositing dried bacteria on a substrate [ 42 ]. AFM-IR has been shown to be capable of measuring changes in the cell wall composition that confer antibiotic resistance [ 40 ], visualising individual viruses injecting their genome into a cell [ 35 ], and studying bacterial functional amyloids [ 41 ]. Building on this line of research, in this work, we attempt to study the structural and temporal differences between IBs formed under different stress conditions by applying AFM-IR to explore variations in protein secondary structure in situ within bacterial cells. We provide a detailed protocol optimisation and the development of an end-to-end data analysis pipeline to support large-scale quantitative measurements of parameters in a single-cell and single-particle (IB) fashion. We show that the unprecedented sample size produced in our study overcomes the technical and biological variability of such challenging samples and conclude that AFM-IR is sensitive enough to detect IB formation in bacterial cells and to distinguish IBs arising from different stress conditions.

Bacterial growth conditions

10 µL of bacteria ( E. coli strains BW25113, BL21, or BL21 with pET15b-TEV-p53 plasmid [ 43 ]) from a 15% glycerol stock stored at − 80 °C were suspended in MHB medium (Fisher, 11,703,503), supplemented with ampicillin (100 µg/mL) if required. The cells were cultured overnight at 37 °C with shaking at 215 rpm.

Stress application

For Figs.  3 and 4 , BL21 (or BL21 pET15b) cultures were split over different tubes and washed with saline. They were then spiked with hydrogen peroxide (4 mM final concentration), nickel dichloride (100 µM final concentration), cobalt dichloride (100 µM final concentration), P2 (25 µg/mL), P33 (16 µg/mL), or, for p53 overexpression, IPTG (to 1 mM) and incubated for 1 h or 10 min in the case of P33.

For the experiment shown in Fig.  5 , E. coli BW25113 cultures were split into different tubes. The tube corresponding to the longest recovery condition was incubated at 49 °C for 1 h, after which it was moved to 37 °C for one or two hours. The other tubes were moved between the two temperatures such that they spent the correct amount of time at 37 °C after the one-hour heat shock.

AFM-IR sample preparation

All samples were spun down (2 min at 4300 × g), and the supernatant was replaced with 1.5 mL of saline solution (twice) before fixation in 0.5 mL of glutaraldehyde (2.5 vol% in 0.1 M Na-cacodylate buffer) and incubation for one hour at room temperature. Then, we performed three washes with cacodylate buffer (spinning down for 2 min at 12,100 × g) before secondary fixation in osmium tetroxide (1 vol% in cacodylate buffer) for 2 h. The samples were washed twice in cacodylate buffer and successively transferred to an ethanol series (30, 50, 70, 90, 100, 100, 100%), rotating at 4 °C for 10 min after each step. Then, they were resuspended twice in propylene oxide (Sigma, 82320) and rotated at 4 °C for 15 min.

The epoxy embedding was performed in three stages, first by resuspending in a 1:1 epoxy and propylene oxide mixture, supplementing 27 µL BDMA per 1 mL epoxy (Agar Scientific, AGR1031, hard formulation), and incubating for 1 h at 4 °C while rotating. Second, we resuspended the samples in a 2:1 mixture and left them to dry overnight. Finally, we transferred the samples to 100% epoxy resin, dried them at low vacuum for 4 h and cured them at 60 °C for 2 days. We sectioned the resin blocks to a thickness of 95 nm (Leica Ultracut UCT) and transferred the sections onto silicon wafers (Ted Pella, 16008), which were then glued to a sample disc (Bruker, SD-102 or Electron Microscopy Sciences 75010) using Reprorubber Thinpour (Reprorubber, 16116).

AFM-IR data acquisition

All samples were measured at least once in resonance-enhanced mode with a pulse rate of around 900 kHz. They were imaged under illumination at 1625 and 1650 cm⁻¹ at a minimum, and spectra were collected from at least five IB, cytoplasm and epoxy locations (as estimated by visual inspection of an IR Amplitude map at 1625 cm⁻¹) whenever possible. Collecting epoxy spectra during every measurement session allowed us to check and correct for tip contamination.

A gold-coated cantilever (Bruker, PR-EX-nIR2-10, k  = 0.2 N/m, f 0  = 13 kHz, r  = 20–35 nm) was mounted in a nanoIR3 (Bruker) equipped with a MIRcat-QT laser (DRS Daylight Solutions), maximizing the laser sum, and adjusting the vertical and lateral deflection to approximately − 0.3 V and 0 V respectively. With the laser power set to 1.37%, a pulse rate of approximately 880 kHz and a pulse length of 160 ns, the IR beam was aligned in the x and y directions for each of the QCL chips (at 1730, 1260, 1088, and 914 cm -1 ), while its z position was optimised at 1730 cm -1 . The atmospheric humidity is controlled by purging the system with dry air. Care is taken to let the relative humidity stably drop below 1% before measurements are made.

A phase offset was chosen to maximise the IR Amplitude, and the phase-locked loop (PLL) gains were set to I = 0.1 and P = 1. After collecting a laser emission spectrum (also called a power spectrum or background spectrum), one IR spectrum was collected on epoxy to check that all parameters were set correctly. AFM-IR datasets were acquired at 1650 cm⁻¹ and 1625 cm⁻¹ with the following settings: field of view, 10 × 10 or 20 × 20 μm; resolution, 512 × 512 px (hence a pixel size of approximately 20–40 nm); scan rate, 0.1 Hz; AFM I gain, 2; P gain, 1; PLL I gain, 6; P gain, 60. For spectral measurements, we realigned the IR focus (x and y only), collected a new power spectrum, and collected spectra with the following settings: PLL I gain, 0.1; P gain, 1; spectral resolution, 2 cm⁻¹; co-averages, 5; spectral range, 800–1800 cm⁻¹. To change samples, we moved the sample to its lowest position and exchanged it using a pair of tweezers, taking care not to touch the AFM head.

Data processing

The data were processed using python 3.10.13, numpy 1.26.3, pandas 2.1.4, SciPy 1.11.4, scikit-learn 1.3.2, scikit-image 0.19.3, statsmodels 0.14.1, umap-learn 0.5.5, xarray 2023.7.0, opencv-python-headless 4.9.0, and cellpose 2.2.3. Additionally, we adapted code developed by Dos Santos et al. [ 44 ].

IR absorption spectra, as reported, were normalised with respect to the laser emission spectrum at the time of measurement and then further processed by dividing them by the average epoxy spectrum from the same sample measurement session, followed by min–max normalisation to the range 0–1 between 1600 and 1800 cm⁻¹. Every spectrum was reduced to a single PLL Frequency value by computing the average PLL Frequency between 1600 and 1650 cm⁻¹ and subtracting the average epoxy PLL Frequency from the same sample and measurement session.
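
A minimal NumPy sketch of this normalisation is given below; it is not the authors' code, and the array layout and variable names are assumptions made for illustration.

```python
import numpy as np

def normalise_spectrum(wavenumbers, spectrum, epoxy_mean_spectrum):
    """Divide by the session's mean epoxy spectrum, then min-max scale
    to 0-1 using the 1600-1800 cm-1 window."""
    corrected = spectrum / epoxy_mean_spectrum
    window = (wavenumbers >= 1600) & (wavenumbers <= 1800)
    lo, hi = corrected[window].min(), corrected[window].max()
    return (corrected - lo) / (hi - lo)

def pll_frequency_value(wavenumbers, pll_trace, epoxy_pll_mean):
    """Average PLL Frequency between 1600 and 1650 cm-1, minus the mean
    epoxy PLL Frequency from the same sample and session."""
    window = (wavenumbers >= 1600) & (wavenumbers <= 1650)
    return pll_trace[window].mean() - epoxy_pll_mean
```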

The AFM-IR datasets were processed as follows. First, the 1625 cm -1 IR Amplitude map was segmented into background, bacterium, and IB pixels. Cells are defined using a Cellpose model finetuned to our data and eroded to discard membrane pixels [ 45 ]. Cells intersecting the image border were discarded for analysis. Then, the intensity distribution of pixels inside a cell (IR Amplitude map at 1625 cm -1 ) was thresholded using the triangle algorithm, a binary opening was applied to discard noise pixels to obtain the IB map, and IB pixels outside of the cell mask were discarded [ 46 ]. Second, the IR Amplitude map at 1625 cm -1 was registered onto the 1650 cm -1 map to correct for sample drift. In the case of a constant drift, a simple translation would suffice, but nonconstant drift can introduce apparent image shearing. Therefore, registration is implemented in two steps, initially maximising the cross-correlation between the two matching height maps while allowing only rigid transformations and then allowing affine transformations. Finally, the PLL Frequency maps are processed to correct for PLL Frequency drift and cantilever variations by calculating the average PLL Frequency of epoxy pixels line-by-line, applying a rolling mean, and subtracting this profile from the whole map. Because of the higher IR amplitudes at 1650 cm -1 , the PLL map corresponding to this wavelength was used in downstream analyses.
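
The fragment below sketches the IB-segmentation step with scikit-image; the Cellpose cell mask is assumed to be available already, and the function and variable names are placeholders rather than the authors' actual pipeline.

```python
from skimage.filters import threshold_triangle
from skimage.morphology import binary_erosion, binary_opening, disk

def segment_ibs(ir_1625, cell_mask):
    """Threshold the 1625 cm-1 IR Amplitude map inside cells to obtain an IB map."""
    # Erode the cell mask to discard membrane pixels
    core = binary_erosion(cell_mask, disk(2))
    # Triangle threshold on the intensity distribution of in-cell pixels
    threshold = threshold_triangle(ir_1625[core])
    ib_map = (ir_1625 > threshold) & core
    # Binary opening discards isolated noise pixels
    return binary_opening(ib_map, disk(2))
```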

For the statistical analysis of multiple groups, Shapiro and Bartlett tests were performed to choose between ANOVA or Kruskal‒Wallis tests, after which suitable post hoc tests were performed. For multiple comparisons, p values were Bonferroni-corrected.
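
The decision logic can be sketched with SciPy roughly as follows (illustrative only; the real analysis also includes suitable post hoc tests).

```python
from scipy import stats

def compare_groups(*groups, alpha=0.05):
    """Choose ANOVA or Kruskal-Wallis based on normality and equal variances."""
    normal = all(stats.shapiro(g).pvalue > alpha for g in groups)
    equal_variance = stats.bartlett(*groups).pvalue > alpha
    if normal and equal_variance:
        return "ANOVA", stats.f_oneway(*groups).pvalue
    return "Kruskal-Wallis", stats.kruskal(*groups).pvalue

# For multiple comparisons, Bonferroni correction scales each p value
# by the number of tests performed: p_adj = p * n_tests.
```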

Optimisation of data collection parameters

We started by optimising the data collection procedure, focusing on experimental parameters for AFM and those specific to AFM-IR such as the excitation laser power and pulse rate. Considering that optimal settings for a field of view measuring 10–20 μm wide necessitate slow scanning speeds [ 47 ], we quantified sample drift. Additionally, methods for plotting raw data from AFM-IR datasets were implemented for quality control purposes.

Bacterial cells were embedded in epoxy resin after fixation, and AFM-IR was conducted on 95 nm thick sections of the produced resin blocks (Fig.  1 A). This embedding approach provides superior sample shelf life and surface smoothness, facilitating imaging [ 48 ]. At the point of loading the cantilever, we adjust the deflection mirror to ensure free-air deflection is about − 0.3 V, matching the default engagement force to achieve a deflection setpoint close to 0 V. The scan rate and AFM feedback gains were optimised to maintain a deflection within 0.01 V of the setpoint during a measurement, except around very sharp features such as knife marks.

To determine the common optimal laser power for all measurements, we collected spectra at various power levels at sample locations devoid of cells, as reflected by the absence of the protein-originating amide I band (Fig.  1 B). The IR Amplitude signal at a laser power of 1.37% was larger than at both lower (0.69%) and higher (2.87%) power, indicating optimal field enhancement due to surface plasmon resonance [ 49 ]. At even higher power (5.73%), significant noise appeared around the absorption peak. Furthermore, it is recommended to avoid using excessively high laser powers, as this can potentially damage the sample. We conclude that 1.37% is the optimal power level for these samples on our system.

In resonance-enhanced AFM-IR, the repetition frequency of the IR laser needs to match the contact frequency of the sample-cantilever system [ 25 ], which varies from cantilever to cantilever but also depends on where the deflection laser hits the cantilever (Fig.  1 C). The optimal frequency was tracked using a phase-locked loop (PLL), as it is subject to drift and contingent upon the nanomechanical properties of the sample and cantilever. Therefore, investigating the PLL Frequency maps reveals nanomechanical differences in the sample, although it cannot offer a direct quantification of the Young’s modulus [ 50 , 51 ]. The gains of the PLL were determined through scanning experiments of epoxy-embedded bacteria. I = 6 and  P  = 60 provided the best separation between epoxy and bacteria (Fig.  1 D). When acquiring a collection of spectra at various locations throughout the sample, we opt for low PLL gains ( I  = 0.1, P  = 1) to reduce noise while allowing the PLL Frequency to adapt to slow changes in the optimal pulse rate.

Given the slow scanning speeds employed, sample drift may cause issues if left uncorrected. Temperature variations in the laboratory environment were found to exert a pronounced influence on sample drift relative to the AFM probe. Drift correction strategies were employed based on the observed drift patterns. This was done by collecting a series of height maps of the same sample, in this case, 2 × 2 μm height maps of amyloid protein on a gold substrate over a period of over 12 h (Fig.  1 E; underlying data are presented in Supplementary Information, Note S1 ). Temperature-dependent drift is apparent both in the sample plane (x and y) and its vertical position (z). Based on our data, drift speeds on the order of 5–10 nm/min should be expected, even at relatively constant temperatures, and drift correction may be necessary [ 52 ].

Vertical sample drift was automatically compensated for by the AFM height tracking feedback. However, drift in the cantilever’s free air deflection requires additional consideration to ensure consistent force application during acquisition. Similar measurements over 2 days at near-constant temperatures (within 28 ± 0.2 °C) and an average sample drift of only 0.4 nm/min revealed differences in the free air deflection when automated deflection setpoint adjustment between acquired height maps was allowed (Fig.  1 F). Given an engagement force of 0.3 V, these differences were large. As such, they required counteracting by resetting the deflection setpoint between map acquisitions; otherwise, this would result in strong variations in the force applied on the sample and cantilever and therefore the optimal pulse rate.

We assessed the accuracy of the humidity sensor in our system because atmospheric water vapour profoundly impacts IR spectra in the mid-infrared region due to its sharp absorption lines. While regular collection of the laser emission spectrum before each measurement partially compensates for this effect, periodic verification of the relative atmospheric humidity throughout an experiment is advisable, ideally maintaining levels below 1%. Notably, the placement of the humidity sensor in a nanoIR3 system near the supply of dry air may yield humidity readings that appear overly optimistic compared to readings obtained from a sensor positioned adjacent to the sample location (Fig.  1 G). Thus, it is imperative to allow humidity levels to fully equilibrate before collecting IR measurements.

Finally, we acquired AFM-IR datasets at 1770 cm⁻¹, a wavenumber at which no IR absorption is expected for epoxy or cells, to ensure that there is no IR Amplitude signal due to confounding mechanical effects. For these data, see Supplementary Information, Note S1.

Despite the implementation of these optimisations, the stability of the system may not always be sufficient to guarantee high-quality measurements. To ensure the integrity of our data, we acquired Height and Deflection maps in one scanning direction and IR Amplitude, IR Phase, and PLL Frequency maps in both scanning directions (trace and retrace) without applying any data processing. This approach enables the assessment of data quality both during and after measurement (Fig.  1 H). Through this method, we can evaluate trace-retrace errors and assess the magnitude of deflection and IR phase signals, minimising deviations from zero. For all AFM-IR datasets and spectra published in this work, the raw data can be found in Supplementary Information, Note S2 .

Throughout the rest of the paper, we will be using “AFM-IR dataset” for a set of images or maps with different types of data (Height, Deflection, IR Amplitude, PLL Frequency and IR Phase) collected simultaneously, and “IR Amplitude spectra” or simply “spectra” for IR absorbance spectra collected with the AFM-IR instrument.

Figure 1.

Protocol optimisation. (A) Schematic representation of the experimental protocol, created with Biorender.com. (B) IR Amplitude spectra of epoxy resin at various laser power settings. (C) Dependence of the IR Amplitude on the laser pulse rate varies from probe to probe and is influenced by the location at which the deflection laser hits the cantilever. (D) The distribution of values in a PLL Frequency map acquired under different feedback gain settings. The inset shows how the two different distributions in the sample (cells and epoxy) are most clearly separated at the 6/60 setting. (E) Measured drift speeds in the x, y, and z directions of the sample relative to the probe during an overnight measurement (top), correlated to the laboratory temperature (bottom). (F) Drift in the free-air deflection over the same period as (E). (G) Discrepancy between the reported and actual atmospheric humidity after opening the dry air purging valve at t = 0. (H) Output of the quality control pipeline for AFM-IR datasets showing maps and two data profiles in the trace (blue) and retrace (orange) scanning directions along the lines shown in the image

Data analysis pipeline and signal reproducibility

We established a pipeline for the automated analysis of AFM-IR datasets and spectra collected with the predefined parameters (see the Methods section for details). To evaluate the performance of our measurement and analysis protocols, we prepared five identical samples of bacteria with spontaneous inclusion body (IB) formation and conducted multiple imaging sessions for each sample (n = 3–4), utilising the same cantilever whenever possible (see Supplementary Information, Note S3 for additional sample and cantilever details). This approach enabled us to assess both technical and biological variability.

In each individual measurement, we collected two AFM-IR datasets, one with illumination at 1625 cm⁻¹ (representing β-sheets [ 26 ]) and one at 1650 cm⁻¹ (representing α-helices and unordered loops [ 26 ]), along with five IR spectra corresponding to inclusion bodies (IB), cytoplasm (CP) and epoxy (background; BG). Representative spectra and their locations are shown in Fig. 2A–C. For all spectra in this study, location data are provided in Figure S2. To quantify the relative β-sheet content in each spectrum, we integrated the area from 1615 to 1635 cm⁻¹ (Fig. 2D). Our analysis revealed an enrichment of β-sheets in IBs compared to the cytoplasm, with a relative magnitude of 1.4 (95% CI: 1.36–1.52, two-sample t test: p_adj = 10⁻⁶). Notably, the technical variability observed did not yield statistically significant differences between repeat measurements (ANOVA on all data points for each sample: p_adj > 0.3). Moreover, no significant biological variability was observed in this assessment (ANOVA on averages of each replicate: p > 0.76).
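
A minimal sketch of this quantification is given below, assuming a processed IR Amplitude spectrum sampled on a wavenumber axis; the function and variable names are illustrative and not taken from our published code.

    import numpy as np
    from scipy.integrate import trapezoid

    def beta_sheet_content(wavenumbers: np.ndarray, amplitude: np.ndarray,
                           band: tuple[float, float] = (1615.0, 1635.0)) -> float:
        """Integrate the IR Amplitude over the beta-sheet band (trapezoidal rule)."""
        mask = (wavenumbers >= band[0]) & (wavenumbers <= band[1])
        return float(trapezoid(amplitude[mask], wavenumbers[mask]))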

On the other hand, the PLL Frequency analysis (Fig. 2E) revealed significant technical variability (ANOVA on all data within each repeat: 9 > p_adj > 2 × 10⁻⁵), even after exclusion of an outlier measurement series (repeat 2, hollow markers). This technical variability masked any between-sample differences in the PLL Frequency of IBs, if any exist (ANOVA on averages of each replicate: p_adj = 2.2).
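
The technical-variability tests reported throughout this section follow the same pattern: a one-way ANOVA across the repeat measurements of a sample, with the p-value adjusted for the number of samples tested. The sketch below illustrates this pattern with SciPy; a Bonferroni-style adjustment is assumed for illustration, and the adjusted values are left uncapped, which is why values above 1 appear in the text.

    from scipy.stats import f_oneway

    def technical_variability_pvalue(repeats, n_tests: int) -> float:
        """One-way ANOVA across repeat measurements of one sample,
        with a Bonferroni-style adjustment over n_tests samples."""
        _, p = f_oneway(*repeats)
        return p * n_tests  # uncapped, so adjusted values may exceed 1

    # Example: three repeat measurements of one sample, five samples tested in total
    p_adj = technical_variability_pvalue(
        [[1.38, 1.41, 1.45], [1.36, 1.43, 1.40], [1.44, 1.39, 1.42]], n_tests=5)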

AFM-IR datasets provide a greater variety and depth of information than spectra do. They were first processed following the protocol detailed in the Methods section. Briefly, pixels were classified as cell or background using a fine-tuned Cellpose model [ 45 ]. An IB map was then generated by a binary threshold of the 1625 cm⁻¹ IR Amplitude map, where the threshold was defined by the Triangle algorithm applied to the intensity histogram of the cell pixels, followed by a binary opening to discard noise pixels [ 46 ]. As a result, the smallest IBs detected have a radius of 2 pixels, corresponding to 40 nm or twice the nominal radius of the probe. Note that further experiments in this paper use a larger field of view with twice the pixel size; since they are processed in the same manner, the smallest detectable IBs there have a radius of 80 nm. This choice increases throughput at the expense of resolution.
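
The sketch below illustrates the IB segmentation step using scikit-image, assuming the 1625 cm⁻¹ IR Amplitude map and a boolean cell mask (for example from the fine-tuned Cellpose model) are already available as NumPy arrays; it is a simplified reading of the steps described above, not a verbatim excerpt of our pipeline.

    import numpy as np
    from skimage.filters import threshold_triangle
    from skimage.morphology import binary_opening, disk

    def segment_inclusion_bodies(amp_1625: np.ndarray, cell_mask: np.ndarray) -> np.ndarray:
        """Boolean IB mask from a 1625 cm-1 IR Amplitude map and a cell mask."""
        # Threshold defined on the intensity histogram of the cell pixels only
        threshold = threshold_triangle(amp_1625[cell_mask])
        ib_mask = (amp_1625 > threshold) & cell_mask
        # Binary opening discards isolated noise pixels, so the smallest
        # surviving IBs have a radius of roughly two pixels
        return binary_opening(ib_mask, disk(2))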

An example dataset is shown in Fig. 2F–J; for illustration purposes, this is a 20 × 20 μm dataset. The datasets underlying the analysis in this section can be consulted in Supplementary Information, Note S2. First, we observed polar enrichment of IBs (Fig. 2K); however, there were more IBs in the middle of the cell than expected from the literature [ 3 ]. This may be a result of the random three-dimensional orientation of cells with respect to the sectioning plane, but it is also possible that AFM-IR is sensitive to small protein aggregates that were not previously picked up by fluorescence microscopy approaches. Note that the relative age of the cell poles is not accessible in this experiment and that, therefore, the sign of the polar location has no meaning; the positive pole is simply the one located on the right-hand side of the map.

Second, this dataset provides a measurement of the number of inclusion bodies per cell for each sample, as shown in Fig. 2L. Within this dataset, there was no significant technical variability (ANOVA on all data within each repeat: p_adj > 0.5), but significant biological variability was observed (ANOVA on averages of each replicate: p_adj = 0.0002).

Third, this dataset contains a distribution of IB sizes (Fig. 2M), with an average radius of 85 nm, corresponding to eight pixels or four times the nominal radius of the AFM tip. There was no evidence of significant technical variability (ANOVA on all data within each repeat: p_adj > 10) or biological variability between the samples (ANOVA on averages of each replicate: p_adj = 0.5).

Fourth, the segmentation maps can be correlated with the IR Amplitude ratio and PLL Frequency maps to assess the physical and structural properties of IBs in an unbiased manner. Because of the inhomogeneous intensities of the IR Amplitude maps discussed above, it is important to compare the relative β-sheet enrichment of an IB (the mean of the 1625/1650 cm⁻¹ ratio map within the IB region) to that of the cytoplasm surrounding it (Fig. 2N). In this case, there was significant technical variability only within sample 3 (ANOVA on all data within sample 3: p_adj = 0.001; for the other samples: p_adj > 9), but no biological variability between samples (ANOVA on averages of each replicate: p_adj = 0.06). We have not found the cause of this outlier measurement and can only recommend performing enough measurements so that such cases can be averaged out or discarded.
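
A simplified sketch of this comparison is shown below, assuming a 1625/1650 cm⁻¹ ratio map, a boolean IB mask and an integer-labelled cell map on the same pixel grid; treating all non-IB pixels of a cell as its surrounding cytoplasm is our simplification for illustration.

    import numpy as np

    def ib_enrichment_per_cell(ratio_map: np.ndarray, ib_mask: np.ndarray,
                               cell_labels: np.ndarray) -> list[float]:
        """Mean IB ratio divided by mean cytoplasm ratio, for each labelled cell."""
        enrichments = []
        for cell_id in np.unique(cell_labels[cell_labels > 0]):
            cell = cell_labels == cell_id
            ib, cytoplasm = cell & ib_mask, cell & ~ib_mask
            if ib.any() and cytoplasm.any():
                enrichments.append(float(ratio_map[ib].mean() / ratio_map[cytoplasm].mean()))
        return enrichments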

The relative β-sheet enrichment of inclusion bodies in this dataset was 1.11 (95% CI: 1.06–1.15, two-sample t test: p_adj = 0.0009). This enrichment is lower than that measured in the spectral analysis, possibly because of the choice of wavenumbers used for imaging.

Figure 2O shows the PLL Frequency difference between IBs and the surrounding cytoplasm. As in the spectral analysis, measurement 2 is an outlier. Excluding it, there was no statistical evidence for technical variability (ANOVA on all data within each repeat: p_adj > 0.1) or biological variability (ANOVA on averages of each measurement: p_adj > 1.2). While the PLL Frequency of IBs can also be evaluated independently of the cytoplasm, this approach introduces extensive technical and biological variability (Supplementary Information, Note S4).

In summary, we developed a robust imaging pipeline that provides data inaccessible to spectral analysis and is free of the user bias introduced by cherry-picking spectrum locations. However, image analysis is limited by the discrete number of acquired wavenumbers and is more sensitive to technical artefacts, as shown in the ratio map in Fig. 2I.

Figure 2. Data processing and analysis. (A) IR Amplitude map of a thin section of bacteria embedded in epoxy and localisation of the example spectra shown in (B). (B) IR Amplitude spectra after normalisation with respect to the laser power spectrum and (C) after further processing. The wavenumber range used for quantification of β-sheets (1615–1635 cm⁻¹) is indicated. (D) Relative β-sheet content from IB (blue) and cytoplasm (orange) IR Amplitude spectra over five independent but biologically similar samples. Each column represents an independent measurement. Horizontal annotations indicate whether the data they span contain groups with significantly different means; the vertical annotation highlights the significant difference between IB and cytoplasm β-sheet levels. (E) Quantification of the average PLL Frequency of these spectra, relative to the mean PLL Frequency of the epoxy spectra in that measurement session. Measurement 2, an outlier, is indicated by hollow markers. (F) Example of a processed AFM-IR dataset, including an IR Amplitude map at 1625 cm⁻¹, (G) an IR Amplitude map at 1650 cm⁻¹, (H) a PLL Frequency map, (I) a ratio map of the IR Amplitudes, and (J) segmentation into cells and inclusion bodies based on the 1625 cm⁻¹ IR Amplitude map. The white arrow highlights a cell with three segmented inclusion bodies. (K) Distribution of IBs along the cell major axis. (L) Plotted as in (D): the average number of IBs per cell, (M) their area, (N) the enrichment of their β-sheet ratio (average 1625/1650 ratio) relative to the cytoplasm of the same cell, and (O) their average PLL Frequency relative to the cytoplasm. Error bars represent 95% confidence intervals of the mean by bootstrap.

The nature of a stressor is reflected in the structure of resulting inclusion bodies

Having developed a robust imaging pipeline and evaluated its sensitivity to technical and biological variability, we attempted to distinguish IBs formed under various stress conditions by AFM-IR. A panel was selected to include physical stress (heat shock), chemical stress (the heavy metals NiCl₂ and CoCl₂, and oxidation by hydrogen peroxide) and proteotoxic stress (overexpression of the aggregation-prone p53 DNA-binding domain [ 43 ] or exposure to the peptides P2 and P33 [ 9 ]). Peptins such as P2 and P33 are short hydrophobic peptides that nucleate the aggregation of endogenous proteins through homology with aggregation-prone regions.

To increase the experimental throughput, only IR absorption spectra were collected for these samples, as shown in Fig. 3A. These experiments were performed in E. coli BL21 to accommodate the overexpression stress, but this strain also exhibited spontaneous IB formation in the buffer control. IB and cytoplasm spectra were distinct from each other under all conditions, partly due to the increased β-sheet concentration in IBs, which was visible in the second derivative spectra (Fig. 3B). Figure 3C shows a quantification of the β-sheet content, the cytoplasmic levels of which were correlated with those in IBs (Pearson r = 0.84, 95% CI: 0.34–0.97, p = 0.009; Fig. 3D). Principal component analysis (PCA) indicated that the first principal component was highly sensitive to the β-sheet content (Fig. 3E). Both PCA and uniform manifold approximation and projection (UMAP) [ 53 ] could distinguish between the IB and cytoplasm spectra (Fig. 3F–G). Furthermore, IBs from the heat shock and proteotoxic stress conditions formed a cluster, and the chemical stresses were intermediate between this cluster and the cytoplasm spectra. In this sense, the AFM-IR spectra seem to reflect the severity and type of the applied stress.
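
The dimensionality reduction can be reproduced along the following lines, assuming the processed spectra are stacked into an (n_spectra × n_wavenumbers) matrix; the scikit-learn and umap-learn defaults shown here are illustrative and not necessarily the settings used for Fig. 3.

    import numpy as np
    from sklearn.decomposition import PCA
    import umap  # umap-learn package

    def embed_spectra(spectra: np.ndarray):
        """Return the first PCA components, the PCA scores and a 2D UMAP embedding."""
        pca = PCA(n_components=3)
        scores = pca.fit_transform(spectra)                           # PCA score plot
        embedding = umap.UMAP(random_state=0).fit_transform(spectra)  # UMAP map
        return pca.components_, scores, embedding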

Figure 3. The nature of a stress affects the resulting IBs. (A) IR Amplitude spectra of IBs and cytoplasm collected from thin sections of epoxy-embedded bacteria after application of various stress conditions. (B) Second derivative spectra (averaged for each sample) display an increase in β-sheet content. The average over the whole dataset is represented by the mean and a shaded CI (mean ± 1.96 × SEM, where SEM is the standard error of the mean). (C) Relative β-sheet content of IBs and cytoplasm in each spectrum; mean and 95% CI (bootstrap). (D) Same as (C), but highlighting the correlation between cytoplasmic and IB β-sheet levels; error bars represent mean ± 1.96 × SEM. (E) The first three principal components found in these data. (F) Score plot mapping all spectra to PCA space; the colours of the data points match panel (D). (G) UMAP representation of the spectral data.

Because these results were based on a single sample per condition, they needed to be validated. We therefore compared H₂O₂ stress to heat shock with a larger number of samples (n = 3) and full imaging following the protocol developed in this paper. Heat shock was shown to induce a much greater IB load (Fig. 4A, B). There were some inclusions visible in the hydrogen peroxide sample in Fig. 4A, but they were not recognised by the image segmentation pipeline, presumably due to their lower β-sheet enrichment and smaller size.

These smaller IBs could still be studied by collecting IR absorption spectra at locations that visually showed a strong IR Amplitude signal at 1625 cm⁻¹ (Fig. 4C–D). Spectral analysis confirmed that heat-shock IBs had the highest β-sheet content among all spectra quantified in Fig. 4E (Dunnett’s test: p < 0.033). Additionally, the second derivative spectra implied the presence of two new bands in the peroxide-stressed spectra, at 1678 cm⁻¹ (antiparallel β-sheets) and 1616 cm⁻¹ (intermolecular β-sheets), although the latter was nearly invisible in the original spectra. The 1678 cm⁻¹ band sets the peroxide cytoplasm spectra apart from all others (Fig. 4F): Dunnett’s test comparing all spectra to the control cytoplasm revealed no significant differences except for the peroxide cytoplasm spectrum (p_adj = 0.01). We concluded that AFM-IR, at least in spectral mode, is sensitive enough to distinguish between different stresses based on the secondary structure of cytoplasmic and aggregated proteins in stressed cells.
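
Second-derivative spectra such as those in Figs. 3B and 4D are commonly obtained with a Savitzky–Golay filter; the sketch below shows a generic implementation, and the window length and polynomial order are illustrative values rather than those used in this study.

    import numpy as np
    from scipy.signal import savgol_filter

    def second_derivative(amplitude: np.ndarray, wavenumber_step: float,
                          window: int = 9, polyorder: int = 3) -> np.ndarray:
        """Second derivative of a spectrum with respect to wavenumber."""
        return savgol_filter(amplitude, window_length=window, polyorder=polyorder,
                             deriv=2, delta=wavenumber_step)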

Figure 4. Validation of hydrogen peroxide stress. (A) Representative IR Amplitude maps of thin sections of bacteria embedded in epoxy resin after control, hydrogen peroxide and heat shock treatment. (B) Number of IBs per cell. Compared to the other conditions, heat shock causes much more IB formation (comparisons report p values from Tukey’s test). (C) Average IR Amplitude spectra of IBs and cytoplasm under the three conditions reveal differences in structural composition. (D) Averaged second derivative spectra from peroxide-treated bacteria are characterised by peaks at 1678 and 1616 cm⁻¹. (E) Quantification of β-sheet levels (IR Amplitude intensity around 1628 cm⁻¹). (F) Quantification of the IR Amplitude intensity around 1678 cm⁻¹. All spectra are plotted for each condition and replicate, and 95% CIs (bootstrap) are shown.

Recovery from heat shock

To extend this analysis, heat-shock IBs were characterised in a time-resolved manner after the cells were returned to 37 °C (samples were collected before heat shock and immediately, 30 min, 1 h and 2 h after heat shock; Fig. 5A–C).

A quantification of the β-sheet signal from these spectra (Fig. 5D) showed that the IB spectra at all timepoints were significantly enriched in β-sheets compared to the IB spectra before heat shock (ANOVA followed by Tukey’s test: p_adj < 0.0003), but there was no evidence of significant changes in β-sheet content during the recovery period (Tukey’s test: p > 0.6). The cytoplasmic β-sheet content was stable over time (ANOVA: p = 0.4). Given the number of spectra in this experiment, it was possible to perform an accurate analysis of the second derivative spectra, which revealed the formation of both intramolecular and intermolecular β-sheets (Supplementary Information, Note S5). The PLL Frequency of the IB spectra did not change between timepoints (ANOVA: p = 0.7), nor did that of the cytoplasm spectra (ANOVA: p = 0.7; Fig. 5E). In general, however, IBs had a higher PLL Frequency than the cytoplasm of the same cell, reflecting their increased stiffness (Wilcoxon signed-rank test: p_adj = 2 × 10⁻⁵).

The image analysis data, specifically the IB area (Fig. 5F) and number (Fig. 5G), showed similar trends: an increase during heat shock followed by a steady state over the two hours afterwards. While the evolution of IB β-sheet enrichment over time was not statistically significant (ANOVA: p_adj = 0.1), its trend recapitulated the spectral quantification, and the enrichment remained significantly greater than 1 overall (95% CI: 1.13–1.18, two-sample t test: p_adj = 10⁻¹⁸; Fig. 5H). Similarly, the difference in PLL Frequency between IBs and the cytoplasm (Fig. 5I) did not vary over time (ANOVA: p = 0.8) but was positive (95% CI: 0.15–0.52, one-sample t test: p_adj = 0.0003).
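
The 95% confidence intervals shown as error bars and shaded regions in Figs. 2, 4 and 5 are described as bootstrap intervals of the mean; a minimal sketch of such a computation follows, with the number of resamples and the percentile method being our own illustrative choices.

    import numpy as np

    def bootstrap_ci_mean(values, n_boot: int = 9999, alpha: float = 0.05, seed: int = 0):
        """Percentile bootstrap confidence interval of the mean."""
        rng = np.random.default_rng(seed)
        values = np.asarray(values, dtype=float)
        means = [rng.choice(values, size=values.size, replace=True).mean()
                 for _ in range(n_boot)]
        return np.quantile(means, [alpha / 2, 1 - alpha / 2])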

In short, AFM-IR was unable to resolve any differences in IB composition over the first two hours after heat shock. This could mean that disassembly takes longer than two hours under the conditions used in this paper [ 15 ], or it could reflect a limitation of the instrument’s sensitivity. These data were validated by several orthogonal methods: the IBs were stained with the amyloid marker pFTAA and imaged using structured illumination microscopy to verify the amyloid nature of the β-sheets; one sample was imaged by transmission electron microscopy (TEM) and scanning electron microscopy (SEM) to assess electron density variations and to measure surface wear caused by the AFM measurement; and IBs were purified and imaged by AFM-IR (Supplementary Information, Note S6).

Figure 5. Recovery of IBs after heat shock. (A) Representative IR Amplitude maps (top) and spectra (bottom) of thin sections of bacteria embedded in epoxy resin after control treatment, (B) immediately after heat shock and (C) after two hours of recovery following heat shock. (D) β-sheet content of inclusion bodies (integral of normalised spectra between 1615 and 1635 cm⁻¹), averaged per sample. Biological replicates are connected by thin lines; bold lines represent averages. (E) Average PLL Frequency of IB and cytoplasm IR Amplitude spectra, relative to epoxy. (F) The IB area increases during heat shock but remains constant afterwards, as do (G) the number of IBs per cell and (H) their β-sheet enrichment. (I) IBs are more rigid than the cytoplasm at all timepoints tested, but there were no differences between timepoints. Shaded regions represent 95% CIs by bootstrap.

Using the full capabilities of AFM-IR

The protocol presented in this paper sacrifices resolution in favour of faster acquisition times and larger fields of view, yet the resulting data still offer evidence that IBs are not sharply defined objects but have diffuse boundaries spanning approximately 120 nm (Fig. 6A). This figure shows the average β-sheet enrichment and PLL Frequency difference of all pixels in the heat-shock recovery dataset as a function of their distance to the closest IB border, with negative values indicating pixels outside an IB. To substantiate this conclusion, Fig. 6B shows an example of the capabilities of the instrument at a sampling rate of approximately 1 pixel per 3 nm. This IR Amplitude map clearly shows a heterogeneous IB with diffuse edges.
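
The edge profiles in Fig. 6A can be reproduced in outline as follows, assuming a boolean IB mask and a value map (β-sheet enrichment or PLL Frequency) on the same pixel grid; the signed-distance construction and the bin width are our own illustrative choices.

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def edge_profile(ib_mask: np.ndarray, values: np.ndarray,
                     pixel_nm: float = 40.0, bin_nm: float = 20.0):
        """Average value as a function of the signed distance to the nearest IB
        border (positive inside an IB, negative outside)."""
        signed_nm = (distance_transform_edt(ib_mask)
                     - distance_transform_edt(~ib_mask)) * pixel_nm
        bins = np.arange(signed_nm.min(), signed_nm.max() + bin_nm, bin_nm)
        idx = np.digitize(signed_nm.ravel(), bins)
        centres = np.array([bins[i - 1] for i in np.unique(idx)])
        profile = np.array([values.ravel()[idx == i].mean() for i in np.unique(idx)])
        return centres, profile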

In addition to the β-sheet content and PLL Frequency of each IB, a large set of other properties was measured, including localisation, size, shape and section thickness. Some of these were found to be intimately connected with each other (Fig. 6C; see Supplementary Information, Note S7 for descriptions of each property). For this figure, Pearson correlations were calculated between all pairs of properties across the set of IBs in each of the AFM-IR datasets underlying Fig. 5. Bootstrap resampling (n = 9999) of the resulting set of correlations was then used to test which correlations differ significantly from zero.
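
Under our reading of this procedure, the analysis behind Fig. 6C can be sketched as follows: one Pearson correlation per property pair per dataset, followed by a bootstrap test of whether the mean correlation across datasets differs from zero. The data layout (a list of per-dataset pandas DataFrames) and the two-sided percentile p-value are assumptions made for illustration.

    import numpy as np
    import pandas as pd

    def correlation_significance(datasets: list, prop_a: str, prop_b: str,
                                 n_boot: int = 9999, seed: int = 0):
        """Mean Pearson r across datasets and a two-sided bootstrap p-value versus zero."""
        rs = np.array([d[prop_a].corr(d[prop_b]) for d in datasets])
        rng = np.random.default_rng(seed)
        boot = np.array([rng.choice(rs, size=rs.size, replace=True).mean()
                         for _ in range(n_boot)])
        p_value = 2 * min((boot <= 0).mean(), (boot >= 0).mean())
        return float(rs.mean()), float(p_value)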

As expected, neither the cell orientation nor the polar projection of an IB is correlated with any other variable in this dataset. However, an IB’s proximity to a cell pole is part of a cluster of correlated variables that is likely driven by apparent cell size, which in turn depends strongly on the orientation of the cell with respect to the sectioning plane.

Somewhat unexpectedly, the relative β-sheet enrichment of an IB was largely uncorrelated with variables related to the PLL Frequency and therefore to stiffness. For reasons outlined earlier in this paper, we consider the difference in PLL Frequency a more robust readout than the mean IB PLL Frequency itself. The fact that the former does not correlate with β-sheet concentration (beta_ratio_ib) may mean that stiffness is driven by protein density rather than by secondary structure, or it may reflect a lack of sensitivity to the small differences in β-sheet concentration and PLL Frequency within the set of measured IBs, even though IBs as a whole have a higher PLL Frequency than the cytoplasm. Furthermore, the correlation between PLL Frequency and local section thickness may additionally confound these observations. Finally, β-sheet enrichment was correlated with IB area and with a cluster of definitionally related variables, such as the IR Amplitude at 1625 and 1650 cm⁻¹. Even if our conclusions from this correlation analysis are limited, the analysis itself demonstrates the potential of image-based AFM-IR experiments.

Figure 6. Highlighting the capabilities of image-based analysis of AFM-IR data. (A) Average β-sheet enrichment and PLL Frequency (relative to epoxy) of a pixel in an AFM-IR dataset as a function of its distance to the nearest IB edge (mean and 95% CI). (B) IR Amplitude map of an IB in a thin section of epoxy-embedded bacteria after heat shock treatment. (C) Correlation plot of IB properties. Dots highlight statistically significant correlations (Bonferroni-corrected p < 0.05).

Discussion

This paper describes the development of a protocol for performing high-throughput single-cell AFM-IR spectroscopy on bacterial IBs. In total, we analysed AFM-IR datasets at two wavenumbers covering 12,030 cells and containing 3539 IBs, as well as 1343 spectra. Datasets of this size require saving all data in their rawest possible form, not only to evaluate data quality but also to perform end-to-end automated data analysis, as developed in this paper. This means that our primary data are easily auditable and that our analysis is fully reproducible.

The scale of this dataset made it possible, for the first time, to rigorously assess the data variability introduced by repeated measurements and by biological variation. For most data outputs, differences between repeated measurements were not significant; the exception was the PLL Frequency, which proved highly sensitive to technical variability. Considerable biological variability between samples was also observed, which must be taken into account in quantitative measurements.

Improving the stability of the PLL feedback system will be critical for robust assessments of nanomechanical heterogeneities in a correlative fashion with the chemical and structural information derived from AFM-IR. For the moment, this need may be better served by AFM modes specifically developed for mechanical characterisation rather than by using the PLL Frequency as a primary readout [ 54 ]. Even on systems with both AFM-IR and dedicated nanomechanical mapping modes, improved PLL stability will benefit the quality of the IR Amplitude signal. Users of a nanoIR3 system should attempt to minimise exogenous factors such as environmental noise, temperature fluctuations and power supply instability, and should make sure the system is fully equilibrated before initiating key measurements.

It was established that AFM-IR can detect differences between various stresses, both in spectral and in imaging mode, but cannot discern any evolution in IB properties over a two-hour recovery period after heat shock, revealing both the possibilities and the limitations of the method’s sensitivity. However, given the severity of the stress, the time allowed for recovery from heat shock was quite short.

Currently, the main limitations of AFM-IR lie in the long measurement times for IR absorption images and in the technical artefacts that can cause misinterpretation of the data. Acquiring one high-quality dataset can easily take three to four hours. PLL tracking of the IR pulse frequency is both a strength and a limitation of this study: it offers mechanical information about the sample, but the PLL feedback can be unstable and lose tracking, making the PLL Frequency the least reproducible output modality. While sections of epoxy-embedded samples provide smooth surfaces and faster imaging, the epoxy masks some regions of the IR spectrum, precluding measurement of the IR response of lipids and nucleic acids. Fixing the samples prevents live time-lapse imaging, but this is already precluded by the long scanning times. Additionally, entire cells are unlikely to be captured in a single field of view because of the random orientation of the bacteria with respect to the sectioning plane. It would be interesting to perform image-based analyses on bacteria spotted directly onto a substrate to circumvent the problems caused by epoxy embedding, although we anticipate additional imaging difficulties caused by the increased surface topography [ 42 ]. Efforts are underway to enable AFM-IR imaging in a liquid environment, which would open the door to live-cell imaging [ 55 , 56 ].

AFM-IR has already been applied in medical contexts, for example to study drug uptake and formulation, protein aggregation in situ and in vitro, parasitic infections, and more [ 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65 ]. We expect that improving technology and increasing ease-of-use of AFM-IR will enable even more biological applications of this method.

We studied IB formation and recovery under heat shock and other stresses by rigorously optimising the data collection protocols and developing an imaging pipeline to process large datasets. This study shows the potential of AFM-IR for single-cell spectroscopy of large numbers of cells and IBs, details a method that could be applied to many questions in microbiology, and improves upon existing data analysis workflows using fully open-source software. Furthermore, the code published alongside this work should facilitate future analyses of large AFM-IR datasets and improve the transparency and reproducibility of data reported in this field.

Data availability

The data and code underlying this study are openly available on GitHub at https://github.com/wduverger/ib_spectroscopy and deposited in figshare at https://doi.org/10.6084/m9.figshare.25398622.v2 .

Prouty WF, Karnovsky MJ, Goldberg AL. Degradation of abnormal proteins in Escherichia coli. Formation of protein inclusions in cells exposed to amino acid analogs. J Biol Chem. 1975;250(3):1112–22.

Schramm FD, Schroeder K, Jonas K. Protein aggregation in bacteria. FEMS Microbiol Rev. 2020;44(1):54–72.

Lindner AB, Madden R, Demarez A, Stewart EJ, Taddei F. Asymmetric segregation of protein aggregates is associated with cellular aging and rejuvenation. Proc Natl Acad Sci U S A. 2008;105(8):3076–81.

Carrio MM, Villaverde A. Role of molecular chaperones in inclusion body formation. FEBS Lett. 2003;537(1–3):215–21.

Stewart EJ, Madden R, Paul G, Taddei F. Aging and death in an organism that reproduces by morphologically symmetric division. PLoS Biol. 2005;3(2):e45.

Peternel S, Grdadolnik J, Gaberc-Porekar V, Komel R. Engineering inclusion bodies for non denaturing extraction of functional proteins. Microb Cell Fact. 2008;7:34.

Unzueta U, Cespedes MV, Sala R, Alamo P, Sanchez-Chardi A, Pesarrodona M, et al. Release of targeted protein nanoparticles from functional bacterial amyloids: a death star-like approach. J Control Release. 2018;279:29–39.

Villaverde A, Garcia-Fruitos E, Rinas U, Seras-Franzoso J, Kosoy A, Corchero JL, et al. Packaging protein drugs as bacterial inclusion bodies for therapeutic applications. Microb Cell Fact. 2012;11:76.

Khodaparast L, Khodaparast L, Gallardo R, Louros NN, Michiels E, Ramakrishnan R, et al. Aggregating sequences that occur in many proteins constitute weak spots of bacterial proteostasis. Nat Commun. 2018;9(1):866.

Wang L, Maji SK, Sawaya MR, Eisenberg D, Riek R. Bacterial inclusion bodies contain amyloid-like structure. PLoS Biol. 2008;6(8):e195.

Pouplana S, Espargaro A, Galdeano C, Viayna E, Sola I, Ventura S, et al. Thioflavin-S staining of bacterial inclusion bodies for the fast, simple, and inexpensive screening of amyloid aggregation inhibitors. Curr Med Chem. 2014;21(9):1152–9.

Garcia-Fruitos E, Gonzalez-Montalban N, Morell M, Vera A, Ferraz RM, Aris A, et al. Aggregation as bacterial inclusion bodies does not imply inactivation of enzymes and fluorescent proteins. Microb Cell Fact. 2005;4:27.

Carrio MM, Villaverde A. Localization of chaperones DnaK and GroEL in bacterial inclusion bodies. J Bacteriol. 2005;187(10):3599–601.

Morell M, Bravo R, Espargaro A, Sisquella X, Aviles FX, Fernandez-Busquets X, et al. Inclusion bodies: specificity in their aggregation process and amyloid-like structure. Biochim Biophys Acta. 2008;1783(10):1815–25.

Govers SK, Mortier J, Adam A, Aertsen A. Protein aggregates encode epigenetic memory of stressful encounters in individual Escherichia coli cells. PLoS Biol. 2018;16(8):e2003853.

Ami D, Natalello A, Taylor G, Tonon G, Maria Doglia S. Structural analysis of protein inclusion bodies by Fourier transform infrared microspectroscopy. Biochim Biophys Acta. 2006;1764(4):793–9.

Goormaghtigh E, Cabiaux V, Ruysschaert J-M. Determination of Soluble and membrane protein structure by Fourier Transform Infrared Spectroscopy. In: Hilderson HJ, Ralston GB, editors. Physicochemical methods in the study of Biomembranes. Boston, MA: Springer US; 1994. pp. 405–50.

Dazzi A, Prater CB. AFM-IR: technology and applications in Nanoscale Infrared Spectroscopy and Chemical Imaging. Chem Rev. 2017;117(7):5146–73.

Zhang D, Li C, Zhang C, Slipchenko MN, Eakins G, Cheng JX. Depth-resolved mid-infrared photothermal imaging of living cells and organisms with submicrometer spatial resolution. Sci Adv. 2016;2(9):e1600521.

Dos Santos ACVD, Hondl N, Ramos-Garcia V, Kuligowski J, Lendl B, Ramer G. AFM-IR for Nanoscale Chemical characterization in Life sciences: recent developments and future directions. ACS Meas Sci Au. 2023;3(5):301–14.

Xitian H, Li Z, Xu W, Yan P. Review on near-field detection technology in the biomedical field. Adv Photonics Nexus. 2023;2(4):044002.

Dazzi A, Prazeres R, Glotin F, Ortega JM. Local infrared microspectroscopy with subwavelength spatial resolution with an atomic force microscope tip used as a photothermal sensor. Opt Lett. 2005;30(18):2388–90.

Schwartz JJ, Pavlidis G, Centrone A. Understanding Cantilever transduction efficiency and spatial resolution in Nanoscale Infrared Microscopy. Anal Chem. 2022;94(38):13126–35.

Ramer G, Aksyuk VA, Centrone A. Quantitative Chemical Analysis at the Nanoscale using the Photothermal Induced Resonance technique. Anal Chem. 2017;89(24):13524–31.

Quaroni L. Understanding and Controlling spatial resolution, sensitivity, and Surface Selectivity in Resonant-Mode Photothermal-Induced Resonance Spectroscopy. Anal Chem. 2020;92(5):3544–54.

Waeytens J, De Meutter J, Goormaghtigh E, Dazzi A, Raussens V. Determination of secondary structure of proteins by Nanoinfrared Spectroscopy. Anal Chem. 2023;95(2):621–7.

Waeytens J, Mathurin J, Deniset-Besseau A, Arluison V, Bousset L, Rezaei H, et al. Probing amyloid fibril secondary structures by infrared nanospectroscopy: experimental and theoretical considerations. Analyst. 2021;146(1):132–45.

Lu F, Belkin MA. Infrared absorption nano-spectroscopy using sample photoexpansion induced by tunable quantum cascade lasers. Opt Express. 2011;19(21):19942–7.

Wieland K, Ramer G, Weiss VU, Allmaier G, Lendl B, Centrone A. Nanoscale chemical imaging of individual chemotherapeutic cytarabine-loaded liposomal nanocarriers. Nano Res. 2019;12(1):197–203.

Wang L, Wang H, Wagner M, Yan Y, Jakob DS, Xu XG. Nanoscale simultaneous chemical and mechanical imaging via peak force infrared microscopy. Sci Adv. 2017;3(6):e1700255.

Kenkel S, Mittal S, Bhargava R. Closed-loop atomic force microscopy-infrared spectroscopic imaging for nanoscale molecular characterization. Nat Commun. 2020;11(1):3225.

Mathurin J, Deniset-Besseau A, Dazzi A. Advanced Infrared Nanospectroscopy using Photothermal Induced Resonance technique, AFMIR: New Approach using Tapping Mode. Acta Phys Pol A. 2020;137(1):29–32.

Yilmaz U, Sam S, Lendl B, Ramer G. Bottom-illuminated Photothermal Nanoscale Chemical Imaging with a Flat Silicon ATR in Air and Liquid. Anal Chem. 2024;96(11):4410–8.

Dazzi A, Prazeres R, Glotin F, Ortega JM. Subwavelength infrared spectromicroscopy using an AFM as a local absorption sensor. Infrared Phys Technol. 2006;49(1–2):113–21.

Dazzi A, Prazeres R, Glotin F, Ortega JM, Al-Sawaftah M, de Frutos M. Chemical mapping of the distribution of viruses into infected bacteria with a photothermal method. Ultramicroscopy. 2008;108(7):635–41.

Mayet C, Dazzi A, Prazeres R, Ortega JM, Jaillard D. In situ identification and imaging of bacterial polymer nanogranules by infrared nanospectroscopy. Analyst. 2010;135(10):2540–5.

Baldassarre L, Giliberti V, Rosa A, Ortolani M, Bonamore A, Baiocco P, et al. Mapping the amide I absorption in single bacteria and mammalian cells with resonant infrared nanospectroscopy. Nanotechnology. 2016;27(7):075101.

Deniset-Besseau A, Prater CB, Virolle MJ, Dazzi A. Monitoring TriAcylGlycerols Accumulation by Atomic Force Microscopy Based Infrared Spectroscopy in Streptomyces Species for Biodiesel Applications. J Phys Chem Lett. 2014;5(4):654–8.

Rebois R, Onidas D, Marcott C, Noda I, Dazzi A. Chloroform induces outstanding crystallization of poly(hydroxybutyrate) (PHB) vesicles within bacteria. Anal Bioanal Chem. 2017;409(9):2353–61.

Kochan K, Nethercott C, Perez Guaita D, Jiang JH, Peleg AY, Wood BR, et al. Detection of Antimicrobial Resistance-related changes in biochemical composition of Staphylococcus aureus by means of Atomic Force Microscopy-Infrared Spectroscopy. Anal Chem. 2019;91(24):15397–403.

Otzen DE, Dueholm MS, Najarzadeh Z, Knowles TPJ, Ruggeri FS. In situ sub-cellular identification of functional amyloids in Bacteria and Archaea by Infrared Nanospectroscopy. Small Methods. 2021;5(6):e2001002.

Kochan K, Peleg AY, Heraud P, Wood BR. Atomic Force Microscopy Combined with Infrared Spectroscopy as a Tool to Probe single bacterium Chemistry. J Vis Exp. 2020(163).

Langenberg T, Gallardo R, van der Kant R, Louros N, Michiels E, Duran-Romaña R, et al. Thermodynamic and evolutionary coupling between the native and amyloid state of globular proteins. Cell Rep. 2020.

Dos Santos ACVD, Heydenreich R, Derntl C, Mach-Aigner AR, Mach RL, Ramer G, et al. Nanoscale Infrared Spectroscopy and Chemometrics Enable detection of intracellular protein distribution. Anal Chem. 2020;92(24):15719–25.

Pachitariu M, Stringer C. Cellpose 2.0: how to train your own model. Nat Methods. 2022;19(12):1634–41.

Zack GW, Rogers WE, Latt SA. Automatic measurement of sister chromatid exchange frequency. J Histochem Cytochem. 1977;25(7):741–53.

Raussens V, Waeytens J. Characterization of bacterial amyloids by Nano-infrared spectroscopy. In: Arluison V, Wien F, Marcoleta A, editors. Bacterial amyloids: methods and protocols. New York, NY: Springer US; 2022. pp. 117–29.

Kenkel S, Gryka M, Chen L, Confer MP, Rao A, Robinson S, et al. Chemical imaging of cellular ultrastructure by null-deflection infrared spectroscopic measurements. Proc Natl Acad Sci U S A. 2022;119(47):e2210516119.

Lu F, Jin MZ, Belkin MA. Tip-enhanced infrared nanospectroscopy via molecular expansion force detection. Nat Photonics. 2014;8(4):307–12.

Schwartz JJ, Jakob DS, Centrone A. A guide to nanoscale IR spectroscopy: resonance enhanced transduction in contact and tapping mode AFM-IR. Chem Soc Rev. 2022;51(13):5248–67.

Shen Y, Chen A, Wang W, Shen Y, Ruggeri FS, Aime S, et al. The liquid-to-solid transition of FUS is promoted by the condensate surface. Proc Natl Acad Sci U S A. 2023;120(33):e2301366120.

Ramer G, dos Santos ACV, Zhang Y, Yilmaz U, Lendl B, editors. Image processing as basis for chemometrics in photothermal atomic force microscopy infrared imaging. Advanced Chemical Microscopy for Life Science and Translational Medicine 2023; 2023: SPIE.

McInnes L, Healy J, Saul N, Großberger L. UMAP: Uniform Manifold Approximation and Projection. J Open Source Softw. 2018;3(29):861.

Simone D, Andrzej S, Angela S, Marco R, Daniele P. Atomic force microscopy as a tool for mechanical characterization at the nanometer scale. Nanomaterials Energy. 2023;12(2):71–80.

Mayet C, Dazzi A, Prazeres R, Allot F, Glotin F, Ortega JM. Sub-100 nm IR spectromicroscopy of living cells. Opt Lett. 2008;33(14):1611–3.

Ramer G, Ruggeri FS, Levin A, Knowles TPJ, Centrone A. Determination of Polypeptide Conformation with Nanoscale Resolution in Water. ACS Nano. 2018;12(7):6612–9.

Kennedy E, Al-Majmaie R, Al-Rubeai M, Zerulla D, Rice JH. Nanoscale infrared absorption imaging permits non-destructive intracellular photosensitizer localization for subcellular uptake analysis. RSC Adv. 2013;3(33):13789–95.

Paluszkiewicz C, Piergies N, Chaniecki P, Rękas M, Miszczyk J, Kwiatek WM. Differentiation of protein secondary structure in clear and opaque human lenses: AFM – IR studies. J Pharm Biomed Anal. 2017;139:125–32.

Qamar S, Wang G, Randle SJ, Ruggeri FS, Varela JA, Lin JQ, et al. FUS phase separation is modulated by a molecular chaperone and methylation of Arginine Cation-Pi interactions. Cell. 2018;173(3):720–34. e15.

Zhaliazka K, Kurouski D. Nanoscale characterization of parallel and antiparallel beta-sheet amyloid Beta 1–42 aggregates. ACS Chem Neurosci. 2022;13(19):2813–20.

Ruggeri FS, Mannini B, Schmid R, Vendruscolo M, Knowles TPJ. Single molecule secondary structure determination of proteins through infrared absorption nanospectroscopy. Nat Commun. 2020.

Pancani E, Mathurin J, Bilent S, Bernet-Camard M-F, Dazzi A, Deniset-Besseau A, et al. High-resolution label-free detection of Biocompatible Polymeric nanoparticles in cells. Part Part Syst Charact. 2018;35(3):1700457.

Perez-Guaita D, Kochan K, Batty M, Doerig C, Garcia-Bustos J, Espinoza S, et al. Multispectral Atomic Force Microscopy-Infrared Nano-Imaging of Malaria Infected Red Blood cells. Anal Chem. 2018;90(5):3140–8.

Rizevsky S, Kurouski D. Nanoscale Structural Organization of Insulin Fibril Polymorphs revealed by Atomic Force Microscopy-Infrared spectroscopy (AFM-IR). ChemBioChem. 2020;21(4):481–5.

Roman M, Wrobel TP, Paluszkiewicz C, Kwiatek WM. Comparison between high definition FT-IR, Raman and AFM-IR for subcellular chemical imaging of cholesteryl esters in prostate cancer cells. J Biophotonics. 2020;13(5):e201960094.

Acknowledgements

We thank the VIB BioImaging Core at KU Leuven and the Electron Microscopy Core of VIB-KU Leuven for training, technical support, and access to their instrument parks.

The Switch Laboratory was supported by the Flanders Institute for Biotechnology (VIB), the University of Leuven (KU Leuven) and the Fund for Scientific Research Flanders (FWO) through project grant I011220N to F.R. and PhD fellowship 1128822 N to W.D.

Author information

Authors and affiliations.

Switch Laboratory, VIB-KU Leuven Center for Brain and Disease Research, Herestraat 49, Leuven, 3000, Belgium

Wouter Duverger, Grigoria Tsaka, Ladan Khodaparast, Laleh Khodaparast, Frederic Rousseau & Joost Schymkowitz

Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, Leuven, 3000, Belgium

Laboratory for Neuropathology, Department of Imaging and Pathology, KU Leuven, Herestraat 49, Leuven, 3000, Belgium

Grigoria Tsaka

Leuven Brain Institute, KU Leuven, Herestraat 49, Leuven, 3000, Belgium

Center for Alzheimer’s and Neurodegenerative Diseases, Peter O’Donnell Jr. Brain Institute, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA

Nikolaos Louros

Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA

Contributions

F.R. and J.S. designed the study and acquired funding. W.D., N.L., F.R. and J.S. designed the experiments. W.D. and G.T. performed the AFM-IR experiments. W.D., Lad. K. and Lal. K. performed the microbiological work. W.D. performed the data analysis. W.D., N.L., J.S. and F.R. wrote the manuscript. All authors provided input, proofread the manuscript, and have given approval to the final version of the manuscript.

Corresponding authors

Correspondence to Frederic Rousseau or Joost Schymkowitz .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Duverger, W., Tsaka, G., Khodaparast, L. et al. An end-to-end approach for single-cell infrared absorption spectroscopy of bacterial inclusion bodies: from AFM-IR measurement to data interpretation of large sample sets. J Nanobiotechnol 22, 406 (2024). https://doi.org/10.1186/s12951-024-02674-3

Download citation

Received : 13 March 2024

Accepted : 25 June 2024

Published : 10 July 2024

DOI : https://doi.org/10.1186/s12951-024-02674-3


Keywords

  • Image analysis
  • Protein aggregation
  • Escherichia coli
  • Infrared spectroscopy
