Research Method

Inferential Statistics – Types, Methods and Examples

Inferential Statistics

Inferential statistics is a branch of statistics that involves making predictions or inferences about a population based on a sample of data taken from that population. It is used to analyze the probabilities, assumptions, and outcomes of a hypothesis.

The basic steps of inferential statistics typically involve the following:

  • Define a Hypothesis: This is often a statement about a parameter of a population, such as the population mean or population proportion.
  • Select a Sample: In order to test the hypothesis, you’ll select a sample from the population. This should be done randomly and should be representative of the larger population in order to avoid bias.
  • Collect Data: Once you have your sample, you’ll need to collect data. This data will be used to calculate statistics that will help you test your hypothesis.
  • Perform Analysis: The collected data is then analyzed using statistical tests such as the t-test, chi-square test, or ANOVA, to name a few. These tests help to determine how likely it is that results at least as extreme as yours would arise by chance alone.
  • Interpret Results: The analysis can provide a probability, called a p-value, which is the probability of observing results at least as extreme as yours if the null hypothesis (the statement that there is no effect or relationship) were true. If this probability is below a chosen significance level (commonly 0.05), you may reject the null hypothesis in favor of the alternative hypothesis (the statement that there is an effect or relationship).

Inferential Statistics Types

Inferential statistics can be broadly categorized into two types: parametric and nonparametric. The selection of type depends on the nature of the data and the purpose of the analysis.

Parametric Inferential Statistics

These are statistical methods that assume the data come from a particular type of probability distribution and make inferences about the parameters of that distribution. Common parametric methods include:

  • T-tests: Used when comparing the means of two groups to see if they’re significantly different.
  • Analysis of Variance (ANOVA): Used to compare the means of more than two groups.
  • Regression Analysis: Used to predict the value of one variable (dependent) based on the value of another variable (independent).
  • Chi-square test for independence: Used to test if there is a significant association between two categorical variables. Strictly speaking, this test is usually classed as nonparametric, because it does not assume that the data follow a particular distribution.
  • Pearson’s correlation: Used to test if there is a significant linear relationship between two continuous variables.

Nonparametric Inferential Statistics

These are methods used when the data does not meet the requirements necessary to use parametric statistics, such as when data is not normally distributed. Common nonparametric methods include:

  • Mann-Whitney U Test: Non-parametric equivalent to the independent samples t-test.
  • Wilcoxon Signed-Rank Test: Non-parametric equivalent to the paired samples t-test.
  • Kruskal-Wallis Test: Non-parametric equivalent to the one-way ANOVA.
  • Spearman’s rank correlation: Non-parametric equivalent to the Pearson correlation.
  • Chi-square test for goodness of fit: Used to test if the observed frequencies for a categorical variable match the expected frequencies.

Inferential Statistics Formulas

Inferential statistics use various formulas and statistical tests to draw conclusions or make predictions about a population based on a sample from that population. Here are a few key formulas commonly used:

Confidence Interval for a Mean:

When you have a sample and want to make an inference about the population mean (µ), you might use a confidence interval.

The formula for a confidence interval around a mean is:

[Sample Mean] ± [Z-score or T-score] * (Standard Deviation / sqrt[n]) where:

  • Sample Mean is the mean of your sample data
  • Z-score or T-score is the value from the Z or T distribution corresponding to the desired confidence level (Z is used when the population standard deviation is known or the sample size is large, otherwise T is used)
  • Standard Deviation is the standard deviation of the sample
  • sqrt[n] is the square root of the sample size
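As a rough illustration of the formula above, here is a minimal Python sketch using only the standard library. The function name `mean_confidence_interval` and the sample values are hypothetical. It uses a Z-score because the standard library has no t distribution; for a sample this small, a T-score would normally be preferred and would give a somewhat wider interval.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Z-based confidence interval around a sample mean (illustrative sketch)."""
    n = len(sample)
    sample_mean = mean(sample)
    sd = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    # Two-sided critical value, roughly 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical measurements, for illustration only.
weights = [70.1, 68.4, 72.3, 69.8, 71.5, 70.9, 67.6, 73.2, 69.1, 70.6]
low, high = mean_confidence_interval(weights)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```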

Hypothesis Testing:

Hypothesis testing often involves calculating a test statistic, which is then compared to a critical value to decide whether to reject the null hypothesis.

A common test statistic for a test about a mean is the Z-score:

Z = (Sample Mean - Hypothesized Population Mean) / (Standard Deviation / sqrt[n])

where all variables are as defined above.
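The Z formula can be sketched in a few lines of Python. The function name `z_test` is hypothetical, and the input numbers are assumptions chosen for illustration (a sample mean of 8.5 from 200 observations, tested against a hypothesized mean of 8.0 with a standard deviation of 2.0). The two-sided p-value comes from the standard normal distribution.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, hypothesized_mean, sd, n):
    """Return the Z statistic and a two-sided p-value (illustrative sketch)."""
    z = (sample_mean - hypothesized_mean) / (sd / math.sqrt(n))
    # Probability of a value at least this far from zero under the null hypothesis.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Assumed inputs, for illustration only.
z, p = z_test(sample_mean=8.5, hypothesized_mean=8.0, sd=2.0, n=200)
print(f"Z = {z:.2f}, p = {p:.4f}")  # Z is about 3.54, so p is well below 0.05
```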

Chi-Square Test:

The Chi-Square Test is used when dealing with categorical data.

The formula is:

χ² = Σ [ (Observed-Expected)² / Expected ]

  • Observed is the actual observed frequency
  • Expected is the frequency we would expect if the null hypothesis were true
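To make the sum concrete, the sketch below applies the formula to a hypothetical goodness-of-fit case: 120 rolls of a die, where a fair die would be expected to show each face 20 times. The function name and the counts are made up for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (Observed - Expected)^2 / Expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts from 120 rolls of a die; a fair die expects 20 per face.
observed = [18, 22, 16, 25, 19, 20]
expected = [20] * 6
chi2 = chi_square_statistic(observed, expected)
print(chi2)  # 2.5
```

With six categories there are 5 degrees of freedom, and the 5% critical value is about 11.07, so a statistic of 2.5 gives no reason to reject the hypothesis that the die is fair.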

Independent Samples T-Test:

The t-test is used to compare the means of two groups. The formula for the independent samples t-test is:

t = (mean1 - mean2) / sqrt [ (sd1²/n1) + (sd2²/n2) ] where:

  • mean1 and mean2 are the sample means
  • sd1 and sd2 are the sample standard deviations
  • n1 and n2 are the sample sizes
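A short Python sketch of this t formula follows. The function name and the two sets of exam scores are hypothetical. It computes only the test statistic; turning it into a p-value requires the t distribution, which is not in the Python standard library.

```python
import math
from statistics import mean, stdev

def independent_t_statistic(group1, group2):
    """t = (mean1 - mean2) / sqrt(sd1^2/n1 + sd2^2/n2), as in the formula above."""
    n1, n2 = len(group1), len(group2)
    standard_error = math.sqrt(stdev(group1) ** 2 / n1 + stdev(group2) ** 2 / n2)
    return (mean(group1) - mean(group2)) / standard_error

# Hypothetical exam scores for two classes, for illustration only.
class_a = [78, 85, 92, 88, 75, 81]
class_b = [72, 79, 84, 70, 77, 74]
t = independent_t_statistic(class_a, class_b)
print(f"t = {t:.2f}")  # about 2.15
```

If SciPy is available, `scipy.stats.ttest_ind(class_a, class_b, equal_var=False)` should give the same statistic together with a p-value.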

Inferential Statistics Examples

Inferential statistics are used when making predictions or inferences about a population from a sample of data. Here are a few real-world examples:

  • Medical Research: Suppose a pharmaceutical company is developing a new drug and they’re currently in the testing phase. They gather a sample of 1,000 volunteers to participate in a clinical trial. They find that 700 out of these 1,000 volunteers reported a significant reduction in their symptoms after taking the drug. Using inferential statistics, they can infer that the drug would likely be effective for the larger population.
  • Customer Satisfaction: Suppose a restaurant wants to know if its customers are satisfied with their food. They could survey a sample of their customers and ask them to rate their satisfaction on a scale of 1 to 10. If the average rating was 8.5 from a sample of 200 customers, they could use inferential statistics to infer that the overall customer population is likely satisfied with the food.
  • Political Polling: A polling company wants to predict who will win an upcoming presidential election. They poll a sample of 10,000 eligible voters and find that 55% prefer Candidate A, while 45% prefer Candidate B. Using inferential statistics, they infer that Candidate A has a higher likelihood of winning the election.
  • E-commerce Trends: An e-commerce company wants to improve its recommendation engine. They analyze a sample of customers’ purchase history and notice a trend that customers who buy kitchen appliances also frequently buy cookbooks. They use inferential statistics to infer that recommending cookbooks to customers who buy kitchen appliances would likely increase sales.
  • Public Health: A health department wants to assess the impact of a health awareness campaign on smoking rates. They survey a sample of residents before and after the campaign. If they find a significant reduction in smoking rates among the surveyed group, they can use inferential statistics to infer that the campaign likely had an impact on the larger population’s smoking habits.

Applications of Inferential Statistics

Inferential statistics are extensively used in various fields and industries to make decisions or predictions based on data. Here are some applications of inferential statistics:

  • Healthcare: Inferential statistics are used in clinical trials to analyze the effect of a treatment or a drug on a sample population and then infer the likely effect on the general population. This helps in the development and approval of new treatments and drugs.
  • Business: Companies use inferential statistics to understand customer behavior and preferences, market trends, and to make strategic decisions. For example, a business might sample customer satisfaction levels to infer the overall satisfaction of their customer base.
  • Finance: Banks and financial institutions use inferential statistics to evaluate the risk associated with loans and investments. For example, inferential statistics can help in determining the risk of default by a borrower based on the analysis of a sample of previous borrowers with similar credit characteristics.
  • Quality Control: In manufacturing, inferential statistics can be used to maintain quality standards. By analyzing a sample of the products, companies can infer the quality of all products and decide whether the manufacturing process needs adjustments.
  • Social Sciences: In fields like psychology, sociology, and education, researchers use inferential statistics to draw conclusions about populations based on studies conducted on samples. For instance, a psychologist might use a survey of a sample of people to infer the prevalence of a particular psychological trait or disorder in a larger population.
  • Environment Studies: Inferential statistics are also used to study and predict environmental changes and their impact. For instance, researchers might measure pollution levels in a sample of locations to infer overall pollution levels in a wider area.
  • Government Policies: Governments use inferential statistics in policy-making. By analyzing sample data, they can infer the potential impacts of policies on the broader population and thus make informed decisions.

Purpose of Inferential Statistics

The purposes of inferential statistics include:

  • Estimation of Population Parameters: Inferential statistics allows for the estimation of population parameters. This means that it can provide estimates about population characteristics based on sample data. For example, you might want to estimate the average weight of all men in a country by sampling a smaller group of men.
  • Hypothesis Testing: Inferential statistics provides a framework for testing hypotheses. This involves making an assumption (the null hypothesis) and then testing this assumption to see if it should be rejected or not. This process enables researchers to draw conclusions about population parameters based on their sample data.
  • Prediction: Inferential statistics can be used to make predictions about future outcomes. For instance, a researcher might use inferential statistics to predict the outcomes of an election or forecast sales for a company based on past data.
  • Relationships Between Variables: Inferential statistics can also be used to identify relationships between variables, such as correlation or regression analysis. This can provide insights into how different factors are related to each other.
  • Generalization: Inferential statistics allows researchers to generalize their findings from the sample to the larger population. It helps in making broad conclusions, given that the sample is representative of the population.
  • Variability and Uncertainty: Inferential statistics also deal with the idea of uncertainty and variability in estimates and predictions. Through concepts like confidence intervals and margins of error, it provides a measure of how confident we can be in our estimations and predictions.
  • Error Estimation: It provides measures of possible errors (known as margins of error), which allow us to know how much our sample results may differ from the population parameters.

Limitations of Inferential Statistics

Inferential statistics, despite its many benefits, does have some limitations. Here are some of them:

  • Sampling Error: Inferential statistics are often based on the concept of sampling, where a subset of the population is used to infer about the population. There’s always a chance that the sample might not perfectly represent the population, leading to sampling errors.
  • Misleading Conclusions: If assumptions for statistical tests are not met, it could lead to misleading results. This includes assumptions about the distribution of data, homogeneity of variances, independence, etc.
  • False Positives and Negatives: There’s always a chance of a Type I error (rejecting a true null hypothesis, or a false positive) or a Type II error (not rejecting a false null hypothesis, or a false negative).
  • Dependence on Quality of Data: The accuracy and validity of inferential statistics depend heavily on the quality of data collected. If data are biased, inaccurate, or collected using flawed methods, the results won’t be reliable.
  • Limited Predictive Power: While inferential statistics can provide estimates and predictions, these are based on the current data and may not fully account for future changes or variables not included in the model.
  • Complexity: Some inferential statistical methods can be quite complex and require a solid understanding of statistical principles to implement and interpret correctly.
  • Influenced by Outliers: Inferential statistics can be heavily influenced by outliers. If these extreme values aren’t handled properly, they can lead to misleading results.
  • Over-reliance on P-values: There’s a tendency in some fields to overly rely on p-values to determine significance, even though p-values have several limitations and are often misunderstood.
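The Type I error mentioned in the list above can be illustrated with a small simulation. This is only a sketch under stated assumptions: every sample is drawn from a normal population whose mean really is 0 (so the null hypothesis is true) and whose standard deviation is known to be 1. The function name, sample size, number of trials and seed are all hypothetical.

```python
import math
import random
from statistics import NormalDist, mean

def false_positive_rate(trials=2000, n=30, alpha=0.05, seed=0):
    """Share of Z-tests that reject a null hypothesis that is actually true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        # Population mean is 0 and standard deviation is 1, so the null is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / math.sqrt(n))
        if abs(z) > critical:
            rejections += 1
    return rejections / trials

rate = false_positive_rate()
print(f"Share of false positives: {rate:.3f}")
```

The share should land near the chosen significance level of 0.05, which is the sense in which a 0.05 threshold accepts a roughly 5% risk of a false positive when there is no real effect.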

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

What Is Inferential Statistics?

Inferential statistics help us draw conclusions about how a hypothesis will play out or estimate a parameter of a larger population. We often use this process to compare two groups of subjects to make greater generalizations about a larger overall population.

Inferential Statistics vs. Descriptive Statistics

What Are Inferential Statistics Used For?

Inferential statistics are generally used in two ways: to estimate parameters of a population and to test hypotheses about that population.

Inferential statistics are among the most useful tools for making educated predictions about how patterns in a sample will hold up in a larger population of subjects. These statistics help set a benchmark for hypothesis testing, as well as a general idea of where specific parameters, such as the mean, will land in the larger population.

This process can also be used to compute a z-score (where a given observation lands on a bell curve relative to the mean) and set data up for further testing.

What’s the Difference Between Descriptive and Inferential Statistics?

Descriptive statistics are meant to illustrate data exactly as it is presented, meaning no predictions or generalizations should be used in the presentation of this data. More detailed descriptive statistics will present factors like the mean of a sample, the standard deviation of a sample or the shape of the sample’s distribution.

Inferential statistics, on the other hand, rely on the use of generalizations based on data acquired from subjects. These statistics use the same sample of data as descriptive statistics, but exist to draw conclusions about how a larger group of subjects will perform based on the performance of the existing subjects, with measures of uncertainty (such as standard errors) to account for variation between the sample and the larger group.

Inferential statistics essentially do one of two things: estimate a population’s parameter, such as the mean or average, or set a hypothesis for further analysis.

What Is an Example of Inferential Statistics?

Any situation where data is extracted from a group of subjects and then used to make inferences about a larger group is an example of inferential statistics at work.

Though data sets may have a tendency to become large and have many variables, inferential statistics do not have to be complicated equations. For example, if you poll 100 people on whether or not they enjoy coffee, and 85 of those 100 people answer yes, while 15 answer no, the data will show that 85 percent of the sample enjoy coffee. Using that data, you might then infer that roughly 85 percent of the general population enjoy coffee, while roughly 15 percent do not. That inference carries a margin of error that depends on the sample size, and it assumes the 100 people were sampled at random.
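To give a sense of how much uncertainty sits around the 85 percent figure, here is a minimal Python sketch of a 95% confidence interval for a proportion. It assumes a simple random sample and uses the normal approximation (sometimes called the Wald interval); the function name is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion (sketch)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 85 of 100 people polled say they enjoy coffee.
low, high = proportion_confidence_interval(85, 100)
print(f"95% CI: {low:.2f} to {high:.2f}")  # 0.78 to 0.92
```

Under these assumptions, the poll supports saying that somewhere between about 78 and 92 percent of the population enjoy coffee, rather than exactly 85 percent.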

Grad Coach

Quant Analysis 101: Inferential Statistics

Everything You Need To Get Started (With Examples)

By: Derek Jansen (MBA) | Reviewers: Kerryn Warren (PhD) | October 2023

If you’re new to quantitative data analysis, one of the many terms you’re likely to hear being thrown around is inferential statistics. In this post, we’ll provide an introduction to inferential stats, using straightforward language and loads of examples.

What are inferential statistics?

At the simplest level, inferential statistics allow you to test whether the patterns you observe in a sample are likely to be present in the population – or whether they’re just a product of chance.

In stats-speak, this “Is it real or just by chance?” assessment is known as statistical significance. We won’t go down that rabbit hole in this post, but this ability to assess statistical significance means that inferential statistics can be used to test hypotheses and in some cases, they can even be used to make predictions.

That probably sounds rather conceptual – let’s look at a practical example.

Let’s say you surveyed 100 people (this would be your sample) in a specific city about their favourite type of food. Reviewing the data, you found that 70 people selected pizza (i.e., 70% of the sample). You could then use inferential statistics to test whether that number is just due to chance, or whether it is likely representative of preferences across the entire city (this would be your population).

PS – you’d use a chi-square test for this example, but we’ll get to that a little later.

Inferential vs Descriptive

At this point, you might be wondering how inferential statistics differ from descriptive statistics. At the simplest level, descriptive statistics summarise and organise the data you already have (your sample), making it easier to understand.

Inferential statistics, on the other hand, allow you to use your sample data to assess whether the patterns contained within it are likely to be present in the broader population, and potentially, to make predictions about that population.

It’s example time again…

Let’s imagine you’re undertaking a study that explores shoe brand preferences among men and women. If you just wanted to identify the proportions of those who prefer different brands, you’d only require descriptive statistics.

However, if you wanted to assess whether those proportions differ between genders in the broader population (and that the difference is not just down to chance), you’d need to utilise inferential statistics.

In short, descriptive statistics describe your sample, while inferential statistics help you understand whether the patterns in your sample are likely to be reflected in the population.

Let’s look at some inferential tests

Now that we’ve defined inferential statistics and explained how it differs from descriptive statistics, let’s take a look at some of the most common tests within the inferential realm. It’s worth highlighting upfront that there are many different types of inferential tests and this is most certainly not a comprehensive list – just an introductory list to get you started.

T-tests

A t-test is a way to compare the means (averages) of two groups to see if they are meaningfully different, or if the difference is just by chance. In other words, to assess whether the difference is statistically significant. This is important because comparing two means side-by-side can be very misleading if one has a high variance and the other doesn’t (if this sounds like gibberish, check out our descriptive statistics post here).

As an example, you might use a t-test to see if there’s a statistically significant difference between the exam scores of two mathematics classes taught by different teachers. This might then lead you to infer that one teacher’s teaching method is more effective than the other.

It’s worth noting that there are a few different types of t-tests. In this example, we’re referring to the independent t-test, which compares the means of two groups, as opposed to the mean of one group at different times (i.e., a paired t-test). Each of these tests has its own set of assumptions and requirements, as do all of the tests we’ll discuss here – but we’ll save assumptions for another post!

ANOVA

While a t-test compares the means of just two groups, an ANOVA (which stands for Analysis of Variance) can compare the means of more than two groups at once. Again, this helps you assess whether the differences in the means are statistically significant or simply a product of chance.

For example, if you want to know whether students’ test scores vary based on the type of school they attend – public, private, or homeschool – you could use ANOVA to compare the average standardised test scores of the three groups.

Similarly, you could use ANOVA to compare the average sales of a product across multiple stores. Based on this data, you could make an inference as to whether location is related to (affects) sales.

In these examples, we’re specifically referring to what’s called a one-way ANOVA, but as always, there are multiple types of ANOVAs for different applications. So, be sure to do your research before opting for any specific test.
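To show what a one-way ANOVA actually computes, here’s a small Python sketch of the F statistic for the school-type example. The scores are made up, and the function name `one_way_anova_f` is hypothetical. The F statistic is the ratio of the variation between group means to the variation within groups.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA (sketch)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical standardised test scores, for illustration only.
public = [70, 75, 80]
private = [80, 85, 90]
homeschool = [75, 80, 85]
f_stat, df_b, df_w = one_way_anova_f([public, private, homeschool])
print(f_stat, df_b, df_w)  # 3.0 2 6
```

With 2 and 6 degrees of freedom, the 5% critical value of F is about 5.14, so an F of 3.0 from these tiny made-up groups would not count as statistically significant.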

Chi-square

While t-tests and ANOVAs test for differences in the means across groups, the Chi-square test is used to see if there’s a difference in the proportions of various categories. In stats speak, the Chi-square test assesses whether there’s a statistically significant relationship between two categorical variables (i.e., nominal or ordinal data). If you’re not familiar with these terms, check out our explainer video here.

As an example, you could use a Chi-square test to check if there’s a link between gender (e.g., male and female) and preference for a certain category of car (e.g., sedans or SUVs). Similarly, you could use this type of test to see if there’s a relationship between the type of breakfast people eat (cereal, toast, or nothing) and their university major (business, math or engineering).
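Here’s a minimal Python sketch of the gender and car-preference example, using made-up counts. The function name is hypothetical. Expected counts are derived from the row and column totals, no continuity correction is applied, and the p-value shortcut shown only holds for 1 degree of freedom (a 2×2 table).

```python
import math

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical survey counts: rows are male/female, columns are sedan/SUV.
table = [[30, 20],
         [20, 30]]
chi2, df = chi_square_independence(table)
# For 1 degree of freedom only, the p-value equals erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
print(chi2, df, round(p_value, 4))  # 4.0 1 0.0455
```

In this made-up case the p-value falls just under 0.05, which would suggest a relationship between gender and car preference in the population.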

Correlation

Correlation analysis looks at the relationship between two numerical variables (like height and weight) to assess whether they “move together” in some way. In stats-speak, correlation assesses whether a statistically significant relationship exists between two variables that are interval or ratio in nature.

For example, you might find a correlation between hours spent studying and exam scores. This would suggest that generally, the more hours people spend studying, the higher their scores are likely to be.

Similarly, a correlation analysis may reveal a negative relationship between time spent watching TV and physical fitness (represented by VO2 max levels), where the more time spent in front of the television, the lower the physical fitness level.

When running a correlation analysis, you’ll be presented with a correlation coefficient (also known as an r-value), which is a number between -1 and 1. A value close to 1 means that the two variables move in the same direction, while a number close to -1 means that they move in opposite directions. A correlation value of zero means there’s no linear relationship between the two variables.

What’s important to highlight here is that while correlation analysis can help you understand how two variables are related, it doesn’t prove that one causes the other. As the adage goes, correlation is not causation.
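If you’re curious how the r-value is calculated, here’s a short Python sketch of the Pearson correlation coefficient, using made-up study hours and exam scores. The function name `pearson_r` is hypothetical.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists (sketch)."""
    mean_x, mean_y = mean(x), mean(y)
    # How much the two variables vary together around their means.
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return co_variation / (spread_x * spread_y)

# Hypothetical data: hours spent studying and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 78]
r = pearson_r(hours, scores)
print(f"r = {r:.3f}")  # about 0.998, a strong positive relationship
```

Flipping the sign of one variable flips the sign of r, which is the “opposite directions” case described above.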

Regression

While correlation allows you to see whether there’s a relationship between two numerical variables, regression takes it a step further by allowing you to make predictions about the value of one variable (called the dependent variable) based on the value of one or more other variables (called the independent variables).

For example, you could use regression analysis to predict house prices based on the number of bedrooms, location, and age of the house. The analysis would give you an equation that lets you plug in these factors to estimate a house’s price. Similarly, you could potentially use regression analysis to predict a person’s weight based on their height, age, and daily calorie intake.

It’s worth noting that in these examples, we’ve been talking about multiple regression, as there are multiple independent variables. While this is a popular form of regression, there are many others, including simple linear, logistic and multivariate. As always, be sure to do your research before selecting a specific statistical test.

As with correlation, keep in mind that regression analysis alone doesn’t prove causation. While it can show that variables are related and help you make predictions, it can’t prove that one variable causes another to change. Other factors that you haven’t included in your model could be influencing the results. To establish causation, you’d typically need a very specific research design that allows you to control all (or at least most) variables.
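To make the idea of a regression equation a little more concrete, here’s a minimal Python sketch of the simplest case: simple linear regression, with one independent variable fitted by least squares. The data (bedrooms versus price in thousands) are made up, and the function name is hypothetical. A multiple regression like the house price example would need more machinery than this.

```python
from statistics import mean

def fit_simple_linear_regression(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x (sketch)."""
    mean_x, mean_y = mean(x), mean(y)
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: number of bedrooms and house price (in thousands).
bedrooms = [1, 2, 3, 4, 5]
prices = [155, 195, 260, 290, 350]
slope, intercept = fit_simple_linear_regression(bedrooms, prices)
print(slope, intercept)  # 48.5 104.5

# Plug a value into the fitted equation to get a prediction.
predicted_price = intercept + slope * 3
print(predicted_price)  # 250.0
```

In this toy data set, each extra bedroom is associated with a price about 48.5 thousand higher, which is a description of the pattern rather than proof of cause.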

Let’s Recap

We’ve covered quite a bit of ground. Here’s a quick recap of the key takeaways:

  • Inferential stats allow you to assess whether patterns in your sample are likely to be present in your population.
  • Some common inferential statistical tests include t-tests, ANOVA, chi-square, correlation and regression.
  • Inferential statistics alone do not prove causation. To identify and measure causal relationships, you need a very specific research design.

15 Quantitative analysis: Inferential statistics

Inferential statistics are the statistical procedures that are used to reach conclusions about associations between variables. They differ from descriptive statistics in that they are explicitly designed to test hypotheses. Numerous statistical procedures fall into this category—most of which are supported by modern statistical software such as SPSS and SAS. This chapter provides a short primer on only the most basic and frequent procedures. Readers are advised to consult a formal text on statistics or take a course on statistics for more advanced procedures.

Basic concepts

British philosopher Karl Popper said that theories can never be proven, only disproven. As an example, how can we prove that the sun will rise tomorrow? Popper said that just because the sun has risen every single day that we can remember does not necessarily mean that it will rise tomorrow, because inductively derived theories are only conjectures that may or may not be predictive of future phenomena. Instead, he suggested that we may assume a theory that the sun will rise every day without necessarily proving it, and if the sun does not rise on a certain day, the theory is falsified and rejected. Likewise, we can only reject hypotheses based on contrary evidence, but can never truly accept them because the presence of evidence does not mean that we will not observe contrary evidence later. Because we cannot truly accept a hypothesis of interest (alternative hypothesis), we formulate a null hypothesis as the opposite of the alternative hypothesis, and then use empirical evidence to reject the null hypothesis to demonstrate indirect, probabilistic support for our alternative hypothesis.

A second problem with testing hypothesised relationships in social science research is that the dependent variable may be influenced by an infinite number of extraneous variables and it is not plausible to measure and control for all of these extraneous effects. Hence, even if two variables may seem to be related in an observed sample, they may not be truly related in the population, and therefore inferential statistics are never certain or deterministic, but always probabilistic.

General linear model

Most inferential statistical procedures in social science research are derived from a general family of statistical models called the general linear model (GLM). A model is an estimated mathematical equation that can be used to represent a set of data, and linear refers to a straight line. Hence, a GLM is a system of equations that can be used to represent linear patterns of relationships in observed data.

Two-variable linear model

Two-group comparison

The t-statistic for comparing the two group means can be written as:

\[ t = \frac{\overline{X}_{1}-\overline{X}_{2}}{s_{\overline{X}_{1}-\overline{X}_{2}}}\,, \]

where the numerator is the difference in sample means between the treatment group (Group 1) and the control group (Group 2) and the denominator is the standard error of the difference between the two groups, which in turn, can be estimated as:

\[ s_{\overline{X}_{1}-\overline{X}_{2}} = \sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}} }\,.\]

Factorial designs

Other quantitative analysis

There are many other useful inferential statistical techniques—based on variations in the GLM—that are briefly mentioned here. Interested readers are referred to advanced textbooks or statistics courses for more information on these techniques:

Factor analysis is a data reduction technique that is used to statistically aggregate a large number of observed measures (items) into a smaller set of unobserved (latent) variables called factors based on their underlying bivariate correlation patterns. This technique is widely used for assessment of convergent and discriminant validity in multi-item measurement scales in social science research.

Discriminant analysis is a classificatory technique that aims to place a given observation in one of several nominal categories based on a linear combination of predictor variables. The technique is similar to multiple regression, except that the dependent variable is nominal. It is popular in marketing applications, such as for classifying customers or products into categories based on salient attributes as identified from large-scale surveys.

Logistic regression (or logit model) is a GLM in which the outcome variable is binary (0 or 1) and is presumed to follow a logistic distribution, and the goal of the regression analysis is to predict the probability of the successful outcome by fitting data into a logistic curve. An example is predicting the probability of heart attack within a specific period, based on predictors such as age, body mass index, exercise regimen, and so forth. Logistic regression is extremely popular in the medical sciences. Effect size estimation is based on an ‘odds ratio’, representing the odds of an event occurring in one group versus the other.

Probit regression (or probit model) is a GLM in which the outcome variable can vary between 0 and 1 (or can assume the discrete values 0 and 1) and the error term of the underlying model is presumed to follow a standard normal distribution. The goal of the regression is to predict the probability of each outcome. This is a popular technique for predictive analysis in actuarial science, financial services, insurance, and other industries, for applications such as credit scoring based on a person’s credit rating, salary, debt, and other information from their loan application. Probit and logit regression tend to produce similar predicted probabilities in comparable applications (binary outcomes); however, the logit model is easier to compute and interpret.

Path analysis is a multivariate GLM technique for analysing directional relationships among a set of variables. It allows for examination of complex nomological models where the dependent variable in one equation is the independent variable in another equation, and is widely used in contemporary social science research.

Time series analysis is a technique for analysing time series data, or variables that continually change with time. Examples of applications include forecasting stock market fluctuations and urban crime rates. This technique is popular in econometrics, mathematical finance, and signal processing. Special techniques are used to correct for autocorrelation, or correlation within values of the same variable across time.

Social Science Research: Principles, Methods and Practices (Revised edition) Copyright © 2019 by Anol Bhattacherjee is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Purdue Online Writing Lab Purdue OWL® College of Liberal Arts

Basic Inferential Statistics: Theory and Application


This page is brought to you by the OWL at Purdue University. When printing this page, you must include the entire legal notice.

Copyright ©1995-2018 by The Writing Lab & The OWL at Purdue and Purdue University. All rights reserved. This material may not be published, reproduced, broadcast, rewritten, or redistributed without permission. Use of this site constitutes acceptance of our terms and conditions of fair use.

The heart of statistics is inferential statistics. Descriptive statistics are typically straightforward and easy to interpret. Unlike descriptive statistics, inferential statistics are often complex and may have several different interpretations.

The goal of inferential statistics is to discover some property or general pattern about a large group by studying a smaller group of people in the hopes that the results will generalize to the larger group. For example, we may ask residents of New York City their opinion about their mayor. We would probably poll a few thousand individuals in New York City in an attempt to find out how the city as a whole views their mayor. The following section examines how this is done.

A population is the entire group of people you would like to know something about. In our previous example of New York City, the population is all of the people living in New York City. It should not include people from England, visitors in New York, or even people who know a lot about New York City.

A sample is a subset of the population. Just like you may sample different types of ice cream at the grocery store, a sample of a population should be just a smaller version of the population.

It is extremely important to understand how the sample being studied was drawn from the population. The sample should be as representative of the population as possible. There are several valid ways of creating a sample from a population, but inferential statistics works best when the sample is drawn at random from the population. Given a large enough sample, drawing at random ensures a fair and representative sample of a population.

Comparing two or more groups

Much of statistics, especially in medicine and psychology, is used to compare two or more groups and attempts to figure out if the two groups are different from one another.

Example: Drug X

Let us say that a drug company has developed a pill, which they think shortens the recovery time from the common cold. How would they actually find out if the pill works or not? What they might do is get two groups of people from the same population (say, people from a small town in Indiana who had just caught a cold) and administer the pill to one group, and give the other group a placebo. They could then measure how many days each group took to recover (typically, one would calculate the mean of each group). Let's say that the mean recovery time for the group with the new drug was 5.4 days, and the mean recovery time for the group with the placebo was 5.8 days.

The question becomes, is this difference due to random chance, or does taking the pill actually help you recover from the cold faster? The means of the two groups alone do not help us determine the answer to this question. We need additional information.

Sample Size

If our example study only consisted of two people (one from the drug group and one from the placebo group) there would be so few participants that we would not have much confidence that there is a difference between the two groups. That is to say, there is a high probability that chance explains our results (any number of explanations might account for this, for example, one person might be younger, and thus have a better immune system). However, if our sample consisted of 1,000 people in each group, then the results become much more robust (while it might be easy to say that one person is younger than another, it is hard to say that 1,000 random people are younger than another 1,000 random people). If the sample is drawn at random from the population, then these 'random' variations in participants should be approximately equal in the two groups, given that the two groups are large. This is why inferential statistics works best when there are lots of people involved.
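One way to see why larger groups give more robust results is to look at how the standard error of the difference between two group means shrinks as the groups grow. The sketch below assumes, purely for illustration, that recovery times in both groups have a standard deviation of 2 days; that figure and the function name are hypothetical.

```python
import math

def se_of_mean_difference(sd, n_per_group):
    """Standard error of the difference between two group means,
    assuming both groups share the same standard deviation and size."""
    return math.sqrt(sd ** 2 / n_per_group + sd ** 2 / n_per_group)

ASSUMED_SD = 2.0  # hypothetical spread of recovery times, in days

for n in (10, 100, 1000):
    print(n, round(se_of_mean_difference(ASSUMED_SD, n), 3))
```

Under these assumptions the standard error falls by a factor of ten when each group grows from 10 to 1,000 people, so a 0.4-day gap that is lost in the noise with small groups becomes large relative to the uncertainty with big ones.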

Be wary of statistics that have small sample sizes, unless they are in a peer-reviewed journal. Professional statisticians can interpret results correctly from small sample sizes, and often do, but not everyone is a professional, and novice statisticians often incorrectly interpret results. Also, if your author has an agenda, they may knowingly misinterpret results. If your author does not give a sample size, then he or she is probably not a professional, and you should be wary of the results. Sample sizes are required information in almost all peer-reviewed journals, and therefore, should be included in anything you write as well.

Variability

Even if we have a large enough sample size, we still need more information to reach a conclusion. What we need is some measure of variability. We know that the typical person takes about 5-6 days to recover from a cold, but does everyone recover around 5-6 days, or do some people recover in 1 day, and others recover in 10 days? Understanding the spread of the data will tell us how confident we can be that the pill is effective. If everyone in the placebo group takes exactly 5.8 days to recover and everyone in the drug group takes exactly 5.4 days, then it is clear that the pill has a positive effect, but if people have a wide variability in their length of recovery (and they probably do) then the picture becomes a little fuzzy. Only when the mean, sample size, and variability have been calculated can a proper conclusion be made. In our case, if the sample size is large, and the variability is small, then we would receive a small p-value (probability-value). A small p-value indicates that the observed difference would be unlikely if chance alone were at work, and this term is prominent enough to warrant further discussion.

In classic inferential statistics, we make two hypotheses before we start our study, the null hypothesis, and the alternative hypothesis.

Null Hypothesis: States that the two groups we are studying are the same.

Alternative Hypothesis: States that the two groups we are studying are different.

The goal in classic inferential statistics is to find evidence against the null hypothesis. The logic says that if the two groups aren't the same, then they must be different. A low p-value means that a difference at least as large as the one observed would be unlikely if the null hypothesis were true (thus providing evidence for the alternative hypothesis).
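To make the idea of a p-value concrete, here is a minimal sketch using an exact permutation test, a technique swapped in here because it needs no distribution tables; it is not the method the text itself describes. The recovery times are invented. If the null hypothesis holds, the group labels are arbitrary, so the code relabels the people in every possible way and reports the share of relabelings whose gap in mean recovery time is at least as large as the observed one.

```python
from itertools import combinations

def permutation_p_value(treated, control):
    """One-sided exact permutation p-value for 'treated recovers faster'."""
    pooled = treated + control
    n_treated = len(treated)
    n_control = len(control)
    total = sum(pooled)

    def mean_gap(treated_sum):
        # Control mean minus treated mean for one way of splitting the pool.
        return (total - treated_sum) / n_control - treated_sum / n_treated

    observed = mean_gap(sum(treated))
    gaps = [
        mean_gap(sum(pooled[i] for i in idx))
        for idx in combinations(range(len(pooled)), n_treated)
    ]
    # Count splits at least as extreme as the observed one (small float tolerance).
    extreme = sum(1 for gap in gaps if gap >= observed - 1e-12)
    return extreme / len(gaps)

# Invented recovery times in days.
drug = [4, 5, 5, 6]
placebo = [6, 7, 7, 8]

p = permutation_p_value(drug, placebo)
print(round(p, 4))
```

With these made-up numbers only 2 of the 70 possible splits are as extreme as the observed one, so the p-value is about 0.029.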

Remember: It's good to have low p-values.


Inferential Statistics: Definition, Types + Examples

Inferential statistics use analytical procedures to draw conclusions about survey data from sample data. Let's learn about it.

If you are a student in a statistics class or a professional researcher, you need to know how to use inferential statistics to analyze data and make smart decisions. In this age of “big data,” when we have access to a lot of information, the capacity to draw correct population conclusions from samples is crucial.

Inferential statistics enable you to draw inferences and make predictions based on your data, whereas descriptive statistics summarize the properties of a data collection. It is an area of mathematics that enables us to identify trends and patterns in a large number of numerical data.

In this post, we will discuss inferential statistics, including what they are, how they work, and some examples.

Definition of Inferential Statistics

Inferential statistics uses statistical techniques to extrapolate information from a smaller sample to make predictions and draw conclusions about a larger population.

It uses probability theory and statistical models to estimate population parameters and test population hypotheses based on sample data. The main goal of inferential statistics is to provide information about the whole population using sample data to make the conclusions drawn as accurate and reliable as possible.

There are two primary uses for inferential statistics:

  • Providing population estimations.
  • Testing theories to make conclusions about populations.

Researchers can generalize to a population by using inferential statistics and a representative sample. Reaching conclusions requires logical reasoning. The procedure for arriving at the results is as follows:

  • A sample is chosen from the population that is to be investigated. The sample must reflect the nature and characteristics of the population.
  • Inferential statistical techniques are used to analyze the sample’s behavior. These include the models used for regression analysis and hypothesis testing.
  • Conclusions are drawn from the sample chosen in the first step and are used to make inferences or predictions about the entire population.

Types of Inferential Statistics

Inferential statistics are divided into two categories:

  • Hypothesis testing.
  • Regression analysis.

Researchers frequently employ these methods to generalize results to larger populations based on small samples. Let’s look at some of the methods available in inferential statistics.

01. Hypothesis testing

Hypothesis testing is a form of inferential statistics in which a claim about the population is tested against sample data so that conclusions can be generalized to the population. It requires stating a null hypothesis and an alternative hypothesis and then performing a statistical test of significance.

A hypothesis test can be left-tailed, right-tailed, or two-tailed. The test statistic’s value, the critical value, and the confidence intervals are used to reach a conclusion. Below are a few significant hypothesis tests that are employed in inferential statistics.

The z test is applied when the data are normally distributed, the sample size is at least 30, and the population variance is known. It determines whether the sample mean is consistent with a hypothesized population mean. The following setup can be used for a right-tailed test:

Null Hypothesis: \(H_0: \mu = \mu_0\)

Alternate hypothesis: \(H_1: \mu > \mu_0\)

Test Statistic:

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

\(\bar{x}\) = sample mean

\(\mu\) = population mean (the hypothesized value \(\mu_0\) under the null hypothesis)

\(\sigma\) = standard deviation of the population

\(n\) = sample size

Decision Criteria: If the z statistic > z critical value, reject the null hypothesis.
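The right-tailed z test can be sketched in a few lines of Python using only the standard library. The sample figures (mean 103, hypothesized mean 100, population standard deviation 15, n = 100) and the 0.05 significance level are assumptions made for illustration, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def right_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z statistic, z critical value, reject-null decision)."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Critical value: the point with area alpha to its right under N(0, 1).
    z_critical = NormalDist().inv_cdf(1 - alpha)
    return z, z_critical, z > z_critical

z, z_critical, reject = right_tailed_z_test(
    sample_mean=103.0, mu0=100.0, sigma=15.0, n=100
)
print(round(z, 2), round(z_critical, 3), reject)
```

Here the statistic is 2.0, which exceeds the critical value of roughly 1.645, so the null hypothesis would be rejected at the assumed 0.05 level.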

A t test is used when the sample size is less than 30 and the population variance is unknown; under the null hypothesis, the test statistic follows a Student's t distribution. The sample mean is compared with the hypothesized population mean. The hypothesis test is set up as follows:

Null Hypothesis: \(H_0: \mu = \mu_0\)

Alternate Hypothesis: \(H_1: \mu > \mu_0\)

Test Statistic:

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]

The representations x̄, μ, and n are the same as stated for the z-test. The letter “s” represents the standard deviation of the sample.

Decision Criteria: If the t statistic > t critical value, reject the null hypothesis.
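A small sketch of the one-sample t statistic follows. The five scores and the hypothesized mean of 50 are invented, and because Python's standard library has no t distribution, the critical value (2.132 for a one-tailed test at the 0.05 level with 4 degrees of freedom) is taken from a standard t table rather than computed.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t = (x_bar - mu0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # uses the n - 1 denominator
    return (x_bar - mu0) / (s / math.sqrt(n))

scores = [52, 55, 49, 58, 56]  # invented sample, n = 5
T_CRITICAL = 2.132             # table value: one-tailed, alpha = 0.05, df = 4

t = one_sample_t(scores, mu0=50)
reject_null = t > T_CRITICAL
print(round(t, 2), reject_null)
```

For a different sample size or significance level the critical value would need to be looked up again, since it depends on the degrees of freedom (n − 1).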

An f test is used to check whether the variances of two samples or populations differ. The right-tailed f test can be set up as follows:

Null Hypothesis: \(H_0: \sigma_1^2 = \sigma_2^2\)

Alternate Hypothesis: \(H_1: \sigma_1^2 > \sigma_2^2\)

Test Statistic:

\[ F = \frac{s_1^2}{s_2^2} \]

where \(s_1^2\) is the variance of the sample from the first population, and \(s_2^2\) is the variance of the sample from the second population. These sample variances serve as estimates of the population variances \(\sigma_1^2\) and \(\sigma_2^2\).

Decision Criteria: Reject the null hypothesis if the f test statistic > critical value.

A confidence interval aids the estimation of a population’s parameters. For instance, a 95% confidence level means that if the study were repeated 100 times with fresh samples under identical conditions, about 95 of the resulting intervals would be expected to contain the true population parameter. A confidence interval can also be used to determine the critical value in hypothesis testing.
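What a 95% confidence level means can be checked by simulation. The sketch below repeatedly draws samples from a normal population whose true mean is known to the simulation, builds a 95% interval from each sample, and counts how often the interval contains the true mean. All settings (true mean 50, known standard deviation 10, sample size 40, 2,000 repetitions, fixed seed) are assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist

def coverage(true_mean=50.0, sigma=10.0, n=40, trials=2000, level=0.95, seed=1):
    """Fraction of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    half_width = z * sigma / math.sqrt(n)       # sigma is treated as known
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials

share = coverage()
print(share)
```

The printed share should land close to 0.95: the confidence level describes how often the interval-building procedure captures the true value over many repetitions, not the probability attached to any single interval.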

In addition to these tests, inferential statistics also use the ANOVA, Wilcoxon signed-rank, Mann-Whitney U, and Kruskal-Wallis H tests.


02. Regression analysis

Regression analysis is done to calculate how one variable will change in relation to another. Numerous regression models can be used, including simple linear, multiple linear, nominal, logistic, and ordinal regression.

In inferential statistics, linear regression is the most often employed type of regression. The dependent variable’s response to a unit change in the independent variable is examined through linear regression. These are a few crucial equations for regression analysis using inferential statistics:

Regression Coefficients:

The straight line equation is given as y = α + βx, where α and β are regression coefficients.

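\[ \beta = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \]

\[ \beta = r_{xy}\,\frac{\sigma_y}{\sigma_x} \]

\[ \alpha = \bar{y} - \beta \bar{x} \]

Here, \(\bar{x}\) is the mean, and \(\sigma_x\) is the standard deviation of the first data set. Similarly, \(\bar{y}\) is the mean, and \(\sigma_y\) is the standard deviation of the second data set. \(r_{xy}\) is the correlation coefficient between the two data sets.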
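The regression coefficient formulas can be checked numerically with a short sketch. The five data points are invented, and `fit_line` is a hypothetical helper name. The code computes β from the sums of products, recomputes it from the correlation and the two standard deviations, and then derives α.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept (alpha) and slope (beta) for y = alpha + beta * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)

    beta = sxy / sxx
    alpha = y_bar - beta * x_bar

    # Same slope via the correlation: beta = r_xy * sigma_y / sigma_x.
    r_xy = sxy / math.sqrt(sxx * syy)
    beta_from_r = r_xy * math.sqrt(syy / n) / math.sqrt(sxx / n)
    return alpha, beta, beta_from_r

xs = [1, 2, 3, 4, 5]   # invented data
ys = [2, 4, 5, 4, 5]

alpha, beta, beta_from_r = fit_line(xs, ys)
print(round(alpha, 3), round(beta, 3), round(beta_from_r, 3))
```

For this toy data both routes give a slope of 0.6 and the intercept is 2.2, so the fitted line is y = 2.2 + 0.6x.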

Example of inferential statistics

For this example, suppose you based your research on the test results for a particular class, as described in the descriptive statistics section. You now want to do an inferential statistics study for that same test.

Assume it is a standardized statewide exam. Using the same test, but this time with the intention of drawing inferences about a population, shows how this changes the way the study is performed and the results that are reported.

For descriptive statistics, you choose the class you wish to describe and then enter all the test results for that class. Nice and simple. For inferential statistics, you must first define the population and then select a random sample from it.

To ensure a representative sample, you must develop a random sampling strategy. This procedure may take time. Let’s use fifth-graders attending public schools in the U.S. state of California as the population definition.

For this example, assume that you obtained a list of names for the entire population, then randomly selected 100 students from that list and obtained their test results. Be aware that these students will not be from a single class but rather from a variety of classes in various schools throughout the state.

Inferential statistics results

The mean, standard deviation, and proportion for your random sample can all be used as point estimates of the corresponding population values. There is no way to know for certain, but it is unlikely that any of these point estimates is exact. These figures have a margin of error because measuring every subject in this population is impossible.

The results therefore include confidence intervals for the mean, standard deviation, and percentage of satisfactory scores (>=70).

Taking the uncertainty around these estimates into account, the 95% confidence interval for the population mean is 77.4 to 80.9. The population standard deviation, a measure of dispersion, most likely lies between 7.7 and 10.1. Moreover, the population’s proportion of satisfactory scores is estimated to be between 77% and 92%.
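Intervals of this kind can be computed from summary figures, as sketched below. The underlying data file is not reproduced here, so the inputs (a sample mean of 72, a sample standard deviation of 10, and 60 satisfactory scores out of 100) are invented and are not meant to reproduce the intervals quoted above. Both functions use a normal approximation, which is an assumption; other methods give slightly different bounds.

```python
import math
from statistics import NormalDist

def mean_ci(sample_mean, sample_sd, n, level=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

def proportion_ci(successes, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Invented summary figures for a random sample of 100 students.
mean_low, mean_high = mean_ci(sample_mean=72.0, sample_sd=10.0, n=100)
prop_low, prop_high = proportion_ci(successes=60, n=100)

print(round(mean_low, 2), round(mean_high, 2))
print(round(prop_low, 3), round(prop_high, 3))
```

With these inputs the mean interval is roughly 70.04 to 73.96 and the proportion interval roughly 0.504 to 0.696.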

Differences between Descriptive and Inferential Statistics

Both descriptive and inferential statistics are types of statistical analysis used to describe and analyze data. Here are the main differences between them:

Descriptive statistics use measures like mean, median, mode, standard deviation, variance, and range to summarize and describe a data set’s characteristics. They don’t make conclusions or predictions about a population based on the data.

Inferential statistics , on the other hand, use a sample of data to draw conclusions about the population from which the data came. They use probability theory and statistical models to determine certain outcomes’ likelihood and test hypotheses about the population.

Descriptive statistics are usually used to summarize the data and explain the most important parts of the dataset clearly and concisely. They describe a variable’s distribution, find trends and patterns, and examine the relationship between variables.

Inferential statistics are usually used to test hypotheses and draw conclusions about a population from a sample. They are used to make predictions, estimate parameters, and test the significance of differences between groups.

Descriptive statistics can be used on any type of data, including numerical data (like age, weight, and height) and categorical data (e.g. gender, race, occupation).

Inferential statistics use random samples from a population and make assumptions about how the data are distributed and how big the sample is.

Descriptive statistics give an overview of the data and are usually shown in tables, graphs, or summary statistics.

Inferential statistics give estimates and probabilities about a population and are usually reported as hypothesis tests, confidence intervals, and effect sizes.

While inferential statistics are used to make inferences about the population based on sample data, descriptive statistics are used to summarize and characterize the data.

The Importance of Inferential Statistics: Some Remarks

  • Inferential statistics uses analytical tools to determine what a sample’s data says about the whole population.
  • Inferential statistics include things like testing a hypothesis and looking at how things change over time.
  • Inferential statistics use sampling methods to find samples that are representative of the whole population.
  • Inferential statistics uses tools like the Z test, the t-test, and linear regression to determine what is happening.

Inferential statistics is a powerful way to draw conclusions about whole groups of people based on data from a small sample. Inferential statistics uses probability sampling theory and statistical models to help researchers determine certain outcomes’ likelihood and test their ideas about the population. In statistical analysis, distinguishing between categorical data and numerical data is essential, as categorical data involves distinct categories or labels, while numerical data consists of measurable quantities.

Inferential statistics is an important part of data analysis and research because it lets us make predictions and draw conclusions about whole populations based on data from a small sample. It is a complicated and advanced field that requires careful thought about assumptions and data quality, but it can provide answers to important research questions.



Research Methods Knowledge Base


Inferential Statistics

With inferential statistics, you are trying to reach conclusions that extend beyond the immediate data alone. For instance, we use inferential statistics to try to infer from the sample data what the population might think. Or, we use inferential statistics to make judgments of the probability that an observed difference between groups is a dependable one or one that might have happened by chance in this study. Thus, we use inferential statistics to make inferences from our data to more general conditions; we use descriptive statistics simply to describe what’s going on in our data.

Here, I concentrate on inferential statistics that are useful in experimental and quasi-experimental research design or in program outcome evaluation. Perhaps one of the simplest inferential tests is used when you want to compare the average performance of two groups on a single measure to see if there is a difference. You might want to know whether eighth-grade boys and girls differ in math test scores or whether a program group differs on the outcome measure from a control group. Whenever you wish to compare the average performance between two groups you should consider the t-test for differences between groups.

Most of the major inferential statistics come from a general family of statistical models known as the General Linear Model . This includes the t-test, Analysis of Variance (ANOVA), Analysis of Covariance (ANCOVA), regression analysis, and many of the multivariate methods like factor analysis, multidimensional scaling, cluster analysis, discriminant function analysis, and so on. Given the importance of the General Linear Model, it’s a good idea for any serious social researcher to become familiar with its workings. The discussion of the General Linear Model here is very elementary and only considers the simplest straight-line model. However, it will get you familiar with the idea of the linear model and help prepare you for the more complex analyses described below.

One of the keys to understanding how groups are compared is embodied in the notion of the “dummy” variable. The name doesn’t suggest that we are using variables that aren’t very smart or, even worse, that the analyst who uses them is a “dummy”! Perhaps these variables would be better described as “proxy” variables. Essentially a dummy variable is one that uses discrete numbers, usually 0 and 1, to represent different groups in your study. Dummy variables are a simple idea that enables some pretty complicated things to happen. For instance, by including a simple dummy variable in a model, I can model two separate lines (one for each treatment group) with a single equation. To see how this works, check out the discussion on dummy variables.
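A minimal sketch of the dummy-variable idea: if the outcome is regressed on a single 0/1 variable, the fitted intercept equals the mean of the group coded 0 and the slope equals the difference between the two group means. This is the simplest case, with no covariate, rather than the two-lines model mentioned above. The outcome scores and the helper name `fit_line` are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Invented outcome scores for a control group and a program group.
control = [10.0, 12.0, 14.0]
program = [15.0, 17.0, 19.0]

# Dummy variable: 0 marks the control group, 1 marks the program group.
z = [0] * len(control) + [1] * len(program)
y = control + program

b0, b1 = fit_line(z, y)
print(b0, b1)
```

Here the intercept is the control mean (12) and the slope is the program effect (17 − 12 = 5), so a single equation describes both groups.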

One of the most important analyses in program outcome evaluations involves comparing the program and non-program group on the outcome variable or variables. How we do this depends on the research design we use. Research designs are divided into two major types: experimental and quasi-experimental. Because the analyses differ for each, they are presented separately.

Experimental Analysis

The simple two-group posttest-only randomized experiment is usually analyzed with the simple t-test or one-way ANOVA . The factorial experimental designs are usually analyzed with the Analysis of Variance (ANOVA) Model . Randomized Block Designs use a special form of ANOVA blocking model that uses dummy-coded variables to represent the blocks. The Analysis of Covariance Experimental Design uses, not surprisingly, the Analysis of Covariance statistical model .

Quasi-Experimental Analysis

The quasi-experimental designs differ from the experimental ones in that they don’t use random assignment to assign units (e.g. people) to program groups. The lack of random assignment in these designs tends to complicate their analysis considerably. For example, to analyze the Nonequivalent Groups Design (NEGD) we have to adjust the pretest scores for measurement error in what is often called a Reliability-Corrected Analysis of Covariance model . In the Regression-Discontinuity Design , we need to be especially concerned about curvilinearity and model misspecification. Consequently, we tend to use a conservative analysis approach that is based on polynomial regression that starts by overfitting the likely true function and then reducing the model based on the results. The Regression Point Displacement Design has only a single treated unit. Nevertheless, the analysis of the RPD design is based directly on the traditional ANCOVA model.

When you’ve investigated these various analytic models, you’ll see that they all come from the same family – the General Linear Model . An understanding of that model will go a long way to introducing you to the intricacies of data analysis in applied and social research contexts.


Inferential Statistics Explained: From Basics to Advanced!

Understanding statistics is vital for a data science career. But what exactly are statistics? Beyond mere numbers, statistics is a nuanced field encompassing collecting, analyzing, interpreting, and presenting numerical data. It's invaluable for drawing broad conclusions about large populations where detailed measurements aren't feasible.

Statistics branches into descriptive and inferential categories. Here, we delve into inferential statistics. This article explores its definition, types, differences from descriptive statistics, and more, offering insights into this intricate and essential aspect of data science.

Inferential statistics involves drawing conclusions or making inferences about a population based on data collected from a sample of that population. Here's how it works:

  • Sampling: You start by collecting data from a subset of the population you're interested in studying. This subset is called a sample.
  • Analysis: After collecting the data, you analyze it using various statistical techniques. This might include calculating measures like means, standard deviations, correlations, or regression coefficients.
  • Inference: Once you've analyzed the sample data, you make inferences or generalizations about the population from which the sample was drawn. These inferences are based on the assumption that the sample is representative of the population.
  • Inferential statistics includes hypothesis testing, confidence intervals, and regression analysis, among other techniques. These methods help researchers determine whether their findings are statistically significant and whether they can generalize their results to the larger population.

Inferential statistics comprises several techniques for drawing conclusions. Here are some common types:

1. Hypothesis Testing

  • Hypothesis testing is a fundamental technique in inferential statistics. It involves testing a hypothesis about a population parameter, such as a mean or proportion, using sample data. The process typically involves setting up null and alternative hypotheses and conducting a statistical test to determine whether there is enough evidence to reject the null hypothesis in favor of the alternative hypothesis.
  • Example: A researcher might hypothesize that the average income of people in a certain city is greater than $50,000 per year. They would collect a sample of incomes, conduct a hypothesis test, and determine whether the data provide enough evidence to support or reject this hypothesis.
  • The Z-test is a statistical test to determine whether the means of two populations differ when the population variance is known, and the sample size is large (typically n > 30). It's based on the standard normal distribution (Z-distribution).
  • The Z-test statistic follows a standard normal distribution under the null hypothesis.
  • Example: A researcher wants to determine if the mean height of a population is significantly different from 65 inches. They collect a large sample of heights with a known population standard deviation and use the Z-test to compare the sample mean to the population mean.
  • The T-test is used when the population standard deviation is unknown or the sample size is small (typically n < 30). It's based on the Student's t-distribution, which has heavier tails than the standard normal distribution.
  • Common types of t-tests include the one-sample t-test (for comparing a sample mean to a hypothesized value), the independent samples t-test (for comparing means of two independent groups), and the paired samples t-test (for comparing means of two related groups).
  • The formula for the t-test statistic is similar to the Z-test, but it uses the sample standard deviation instead of the population standard deviation.
  • Example: A researcher wants to determine if there is a significant difference in exam scores between two groups of students. They collect exam scores from each group and use the t-test to compare the means.
  • The F-test is used to compare the variances of two or more populations. It's commonly used in the analysis of variance (ANOVA) to test for differences among the means of multiple groups.
  • The F-test statistic follows an F-distribution, which is positively skewed and takes on only non-negative values.
  • In ANOVA, the F-test compares the variance between groups to the variance within groups. If the ratio of these variances is sufficiently large, it suggests that the groups' means are different.
  • Example: A researcher wants to determine if there are differences in the effectiveness of three teaching methods on student performance. They collect performance data from students taught using each method and use ANOVA, which utilizes the F-test, to compare the variances between and within the groups.
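
The Z-test and t-test statistics described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not a definitive implementation: the height values, the hypothesized mean of 65 inches, and the population standard deviation of 3.0 are made-up assumptions, and the function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided one-sample Z-test with a known population standard deviation."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / math.sqrt(n))
    # Under the null hypothesis, z follows a standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def one_sample_t_statistic(sample, mu0):
    """Same form as the Z statistic, but using the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))


# Hypothetical heights in inches (made-up illustration data, n = 32).
heights = [66.1, 64.8, 67.2, 65.5, 66.9, 63.7, 68.0, 65.2, 66.4, 64.9,
           67.5, 66.0, 65.8, 64.3, 67.1, 66.6, 65.0, 66.8, 64.6, 67.9,
           65.7, 66.2, 64.1, 67.4, 65.9, 66.5, 65.3, 66.7, 64.7, 67.0,
           65.6, 66.3]

z, p = one_sample_z_test(heights, mu0=65.0, sigma=3.0)
t = one_sample_t_statistic(heights, mu0=65.0)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
print(f"t = {t:.3f} (compare against a t-distribution with n - 1 degrees of freedom)")
```

The standard library has no t-distribution, so the sketch stops at the t statistic; a statistics package would be needed to turn it into a p-value.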

2. Confidence Intervals

  • Confidence intervals provide a range of values within which a population parameter is likely to lie, together with a level of confidence associated with that range. They are often used to estimate the true value of a population parameter based on sample data. The width of the confidence interval depends on the sample size, the variability of the data, and the desired level of confidence.
  • Example: A pollster might use a confidence interval to estimate the proportion of voters who support a particular candidate. The confidence interval would give a range of values within which the true proportion of supporters is likely to lie, along with a confidence level such as 95%.
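
The pollster example can be sketched as follows. This assumes the normal-approximation (Wald) interval, which is only one of several ways to build an interval for a proportion; the poll counts and the function name are hypothetical.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    # Critical value, e.g. about 1.96 for a 95% confidence level.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical poll: 540 of 1,000 respondents support the candidate.
low, high = proportion_confidence_interval(540, 1000)
print(f"95% CI for support: {low:.3f} to {high:.3f}")
```

Raising the confidence level or shrinking the sample widens the interval, which matches the dependence on sample size and confidence level described above.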

3. Regression Analysis

  • Regression analysis examines the relationship between one or more independent variables and a dependent variable. It can be used to predict the value of the dependent variable based on the values of the independent variables. Regression analysis also allows for testing hypotheses about the strength and direction of the relationships between variables.
  • Example: A researcher might use regression analysis to examine the relationship between hours of study and exam scores. They could then use the regression model to predict exam scores based on the hours studied.
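
As a rough sketch of the study-hours example, the following fits a straight line by ordinary least squares and uses it for prediction. The data values and the function name are made up for illustration, and a real analysis would also test the slope for significance.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: hours studied and exam scores for eight students.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 64, 70, 74, 79, 83]

intercept, slope = fit_line(hours, scores)
predicted = intercept + slope * 5.5  # predicted score after 5.5 hours of study
print(f"score = {intercept:.2f} + {slope:.2f} * hours")
print(f"predicted score for 5.5 hours: {predicted:.1f}")
```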

4. Analysis of Variance (ANOVA)

  • ANOVA is a statistical technique that compares means across two or more groups. It tests whether there are statistically significant differences between the groups' means. ANOVA calculates both within-group variance (variation within each group) and between-group variance (variation between the group means) to determine whether any observed differences are likely due to chance or represent true differences between groups.
  • Example: A researcher might use ANOVA to compare the effectiveness of three different teaching methods on student performance. They would collect data on student performance in each group and use ANOVA to determine whether there are significant differences in performance between the groups.
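
The between-group and within-group comparison can be sketched as an F statistic computed by hand. The scores below are invented for three hypothetical teaching methods, and the function name is an assumption; the sketch stops at the F statistic because the standard library has no F-distribution for the p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical exam scores under three teaching methods.
method_a = [72, 75, 78, 71, 74]
method_b = [80, 83, 79, 85, 82]
method_c = [68, 70, 73, 69, 71]

f_stat, df_b, df_w = one_way_anova_f([method_a, method_b, method_c])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the group means vary more than the spread within groups would suggest; the statistic is then compared against an F-distribution with the reported degrees of freedom.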

5. Chi-Square Tests

  • Chi-square tests are used to determine whether there is a significant association between two categorical variables. They compare the observed frequency distribution of the data to the expected frequency distribution under the null hypothesis of independence between the variables.
  • Example: A researcher might use a chi-square test to examine whether there is a significant relationship between gender and voting preference. They would collect data on the gender and voting preferences of a sample of voters and use a chi-square test to determine whether gender and voting preference are independent.
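
A minimal sketch of the observed-versus-expected comparison follows. The 2 x 2 counts are hypothetical and the function names are assumptions. The p-value helper applies only when there is 1 degree of freedom, where the chi-square tail probability can be written with the complementary error function.

```python
import math


def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def p_value_df1(chi2):
    """Upper-tail probability for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts: rows are two groups of voters, columns are two candidates.
observed = [[45, 55],
            [60, 40]]

chi2, df = chi_square_independence(observed)
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value_df1(chi2):.4f}")
```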


Analysts use inferential statistics in decision-making in various ways across different fields, such as business, economics, healthcare, social sciences, and more. Here's how:

  • Drawing Conclusions from Sample Data: Analysts often have access to only a subset of data (a sample) rather than the entire population. Inferential statistics allow them to draw conclusions about the population based on this sample data. For example, a marketing analyst might conduct surveys on a sample of customers to infer the preferences or behaviors of the entire customer base.
  • Hypothesis Testing for Decision-Making: Hypothesis testing helps analysts make decisions by providing a structured framework for evaluating hypotheses or claims about populations. For instance, a business analyst might use hypothesis testing to determine whether implementing a new marketing strategy significantly impacts sales.
  • Risk Assessment and Management: Inferential statistics help assess and manage risks by quantifying uncertainty. Analysts can use techniques such as confidence intervals to estimate the range of possible outcomes and make decisions accordingly. In finance, for example, analysts might use inferential statistics to assess the risk associated with investment portfolios.
  • Predictive Modeling and Forecasting: Analysts often use inferential statistics to build predictive models and forecast future events or outcomes. Regression analysis, for instance, is commonly used to predict sales figures based on historical data, allowing businesses to make informed decisions about inventory management and resource allocation.
  • Experimental Design and Optimization: Inferential statistics are crucial in experimental design and optimization processes. By conducting controlled experiments and analyzing data using techniques like analysis of variance (ANOVA), analysts can identify factors that significantly impact outcomes and optimize processes or products accordingly.
  • Policy Evaluation and Decision Support: In fields such as public policy and healthcare, inferential statistics are used to evaluate the effectiveness of interventions or policies. Analysts can assess whether a policy has achieved its intended goals and provide evidence-based recommendations for decision-makers by comparing outcomes between treatment and control groups.
  • Quality Control and Process Improvement: Inferential statistics are used for quality control and process improvement in manufacturing and operations management. Control charts and hypothesis testing help analysts identify deviations from expected performance and make data-driven decisions to enhance product quality and efficiency.

1. Market Research

A company wants to estimate its customers' average satisfaction levels. It surveys a random sample of customers and calculates the mean satisfaction score from the sample data. Using inferential statistics, the company can then estimate the average satisfaction level of all its customers, along with a measure of uncertainty (confidence interval).

2. Medical Research

A pharmaceutical company is testing a new drug to lower blood pressure. They conduct a randomized controlled trial where patients are randomly assigned to either the treatment group or the control group. The company can infer whether the new drug effectively lowers blood pressure by comparing the mean blood pressure levels between the two groups and conducting hypothesis testing.

3. Economics

An economist wants to estimate the unemployment rate for a country. They collect a sample of household survey data and calculate the unemployment rate for the sample. Using inferential statistics, the economist can then estimate the unemployment rate for the entire population, along with a measure of uncertainty (margin of error).

4. Quality Control

A manufacturing company produces light bulbs and wants to ensure that the average lifespan of its bulbs meets a certain standard. The company takes a random sample of bulbs from each production batch and tests their lifespans. By conducting hypothesis testing on the sample data, the company can infer whether the average lifespan of all bulbs in that batch meets the standard.

5. Education

A school district is considering implementing a new teaching method to improve student performance in mathematics. They randomly select several schools to participate in a pilot program where the new teaching method is introduced. By comparing the mean math scores of students in the pilot schools to those in non-pilot schools and conducting hypothesis testing, the district can infer whether the new teaching method significantly impacts student performance.

6. Environmental Science

Researchers want to assess the effectiveness of a conservation program to protect a certain species of endangered birds. They collect data on bird populations in areas where the program has been implemented and where it has yet to be. By comparing the mean population sizes between the two groups of areas and conducting hypothesis testing, the researchers can infer whether the conservation program has significantly impacted bird populations.


Inferential statistics goes beyond merely describing data by drawing meaningful conclusions about entire populations from sample data. For instance, if we surveyed 100 people about their cola preferences and found that 60 preferred Cola A, inferential statistics allow us to extend those findings to the broader soda-drinking population.

In contrast, descriptive statistics simply summarize the data at hand. For example, in a specific survey conducted in a particular location, we might learn that 60% of respondents favored Cola A, and that's the extent of the information provided.

Inferential statistics is more complex than descriptive statistics. While descriptive statistics offer a snapshot of the data at hand, inferential statistics use that data to generalize to the wider population and, with a suitable model, to predict outcomes. Achieving this requires a diverse toolkit, including techniques such as hypothesis testing, confidence intervals, and regression analysis, often supported by numerical analysis and graphical representation.

Descriptive statistics offer a straightforward summary of existing data, while inferential statistics harness that data to forecast potential trends or outcomes.

Here's a table outlining the main differences between inferential statistics and descriptive statistics:

Inferential statistics plays a crucial role in a data science career for several reasons:

  • Making Informed Decisions: Data scientists often work with incomplete or sample data. Inferential statistics enables them to make accurate inferences about entire populations based on this sample data, allowing organizations to make informed decisions.
  • Hypothesis Testing: Data scientists frequently need to test hypotheses and make statistical inferences about relationships or patterns in data. Inferential statistics provides the tools and techniques to test these hypotheses and draw meaningful conclusions rigorously.
  • Predictive Modeling: Predictive modeling is a fundamental aspect of data science, where models are trained on historical data to predict future events or outcomes. Inferential statistics, including regression analysis and hypothesis testing, underpin many predictive modeling techniques and help ensure their reliability and validity.
  • Experimental Design: In many data science projects, especially in fields like healthcare and marketing, experimental design is critical for conducting controlled experiments and evaluating the effectiveness of interventions or treatments. Inferential statistics guides the design of experiments, sample size determination, and analysis of experimental data.
  • Understanding Uncertainty: Data scientists must grapple with uncertainty inherent in data and model predictions. Inferential statistics provides measures of uncertainty, such as confidence intervals and p-values, which quantify the reliability of estimates and help stakeholders understand the level of uncertainty associated with data-driven decisions.
  • Statistical Inference in Machine Learning: Machine learning algorithms often involve statistical inference, especially in parameter estimation, model selection, and hypothesis testing. Data scientists use inferential statistics to evaluate and interpret the results of machine learning models and assess their performance.
  • Quality Control and Assurance: Data science is used for quality control and assurance in industries like manufacturing and healthcare. Inferential statistics helps identify anomalies, detect patterns, and make decisions to improve processes and product quality.
  • Risk Assessment and Management: Data scientists use inferential statistics for risk assessment and management in fields like finance and insurance. Techniques such as Monte Carlo simulation, which relies on inferential statistics, are employed to model and quantify risk in complex systems.


1. How do you identify inferential statistics?

Inferential statistics involves drawing conclusions or making predictions about a population based on sample data, utilizing techniques like hypothesis testing and regression analysis.

2. Can inferential statistics help predict future trends?

Yes, inferential statistics can help predict future trends by analyzing historical data patterns and extrapolating them to predict future outcomes.

3. What are some common tools used in inferential statistics?

Common tools in inferential statistics include hypothesis testing, regression analysis, analysis of variance (ANOVA), confidence intervals, and chi-square tests.

4. Is inferential statistics hard to learn for beginners?

While inferential statistics can be challenging for beginners due to its complexity and reliance on statistical concepts, with patience, practice, and guidance, beginners can grasp the fundamentals and build proficiency over time.

5. What are confidence intervals in inferential statistics?

Confidence intervals in inferential statistics are ranges of values constructed around a sample statistic, such as a mean or proportion, which provide an estimate of the range within which the true population parameter is likely to fall, along with a specified level of confidence.


Inferential Statistics | An Easy Introduction & Examples

Published on 18 January 2023 by Pritha Bhandari.

While descriptive statistics summarise the characteristics of a data set, inferential statistics help you come to conclusions and make predictions based on your data.

When you have collected data from a sample, you can use inferential statistics to understand the larger population from which the sample is taken.

Inferential statistics have two main uses:

  • making estimates about populations (for example, the mean SAT score of all 11th graders in the US).
  • testing hypotheses to draw conclusions about populations (for example, the relationship between SAT scores and family income).


Descriptive statistics allow you to describe a data set, while inferential statistics allow you to make inferences based on a data set.

Descriptive statistics

Using descriptive statistics, you can report characteristics of your data:

  • The distribution concerns the frequency of each value.
  • The central tendency concerns the averages of the values.
  • The variability concerns how spread out the values are.

In descriptive statistics, there is no uncertainty – the statistics precisely describe the data that you collected. If you collect data from an entire population, you can directly compare these descriptive statistics to those from other populations.

Inferential statistics

Most of the time, you can only acquire data from samples, because it is too difficult or expensive to collect data from the whole population that you’re interested in.

While descriptive statistics can only summarise a sample’s characteristics, inferential statistics use your sample to make reasonable guesses about the larger population.

With inferential statistics, it’s important to use random and unbiased sampling methods. If your sample isn’t representative of your population, then you can’t make valid statistical inferences or generalise.

Sampling error in inferential statistics

Since the size of a sample is always smaller than the size of the population, some of the population isn’t captured by sample data. This creates sampling error, which is the difference between the true population values (called parameters) and the measured sample values (called statistics).

Sampling error arises any time you use a sample, even if your sample is random and unbiased. For this reason, there is always some uncertainty in inferential statistics. However, using probability sampling methods reduces this uncertainty.

The characteristics of samples and populations are described by numbers called statistics and parameters:

  • A statistic is a measure that describes the sample (e.g., sample mean).
  • A parameter is a measure that describes the whole population (e.g., population mean).

Sampling error is the difference between a parameter and a corresponding statistic. Since in most cases you don’t know the real population parameter, you can use inferential statistics to estimate these parameters in a way that takes sampling error into account.
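
To make the statistic versus parameter distinction concrete, here is a small simulation sketch. The population is synthetic (normally distributed values generated with a fixed seed), so every number in it is an assumption for illustration only; in real studies the parameter is unknown.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical population of 10,000 values (synthetic data).
population = [random.gauss(50, 10) for _ in range(10_000)]
parameter = mean(population)   # population mean, normally unknown

# Simple random sample of 100 members of the population.
sample = random.sample(population, 100)
statistic = mean(sample)       # sample mean, what you can actually measure

sampling_error = statistic - parameter
print(f"parameter = {parameter:.2f}, statistic = {statistic:.2f}")
print(f"sampling error = {sampling_error:.2f}")
```

Drawing a different random sample gives a different statistic and therefore a different sampling error, which is the uncertainty that interval estimates try to quantify.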

There are two important types of estimates you can make about the population: point estimates and interval estimates.

  • A point estimate is a single value estimate of a parameter. For instance, a sample mean is a point estimate of a population mean.
  • An interval estimate gives you a range of values where the parameter is expected to lie. A confidence interval is the most common type of interval estimate.

Both types of estimates are important for gathering a clear idea of where a parameter is likely to lie.

Confidence intervals

A confidence interval uses the variability around a statistic to come up with an interval estimate for a parameter. Confidence intervals are useful for estimating parameters because they take sampling error into account.

While a point estimate gives you a single best-guess value for the parameter you are interested in, a confidence interval tells you the uncertainty of the point estimate. They are best used in combination with each other.

Each confidence interval is associated with a confidence level. A confidence level tells you the percentage of intervals that would contain the true population parameter if you repeated the study many times with new samples.

A 95% confidence interval means that if you repeat your study with a new sample in exactly the same way 100 times, you can expect about 95 of the resulting intervals to contain the true population parameter.

You cannot say for sure that any single interval contains the actual population parameter. That’s because you can’t know the true value of the population parameter without collecting data from the full population.

However, with random sampling and a suitable sample size, you can reasonably expect your confidence interval to contain the parameter a certain percentage of the time.
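
The repeated-sampling idea behind a confidence level can be checked with a simulation. The sketch below assumes a synthetic normal population with a known mean of 20 and standard deviation of 4 (both made up), builds a 95% normal-approximation interval from each of many samples, and counts how often the interval contains the true mean.

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(1)  # fixed seed so the illustration is repeatable

TRUE_MEAN, TRUE_SD = 20.0, 4.0            # hypothetical population values
Z_CRIT = NormalDist().inv_cdf(0.975)      # about 1.96 for a 95% level


def interval_contains_true_mean(n=50):
    """Draw one sample, build a 95% interval, and report whether it covers TRUE_MEAN."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
    centre = mean(sample)
    margin = Z_CRIT * stdev(sample) / math.sqrt(n)
    return centre - margin <= TRUE_MEAN <= centre + margin


trials = 2000
coverage = sum(interval_contains_true_mean() for _ in range(trials)) / trials
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The share should land close to 0.95. It is the intervals that vary from sample to sample, while the population mean stays fixed.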

Your point estimate of the population mean paid vacation days is the sample mean of 19 paid vacation days.

Hypothesis testing is a formal process of statistical analysis using inferential statistics. The goal of hypothesis testing is to compare populations or assess relationships between variables using samples.

Hypotheses, or predictions, are tested using statistical tests. Statistical tests also estimate sampling errors so that valid inferences can be made.

Statistical tests can be parametric or non-parametric. Parametric tests are considered more statistically powerful because they are more likely to detect an effect if one exists.

Parametric tests make assumptions that include the following:

  • the population that the sample comes from follows a normal distribution of scores
  • the sample size is large enough to represent the population
  • the variances, a measure of variability, of each group being compared are similar

When your data violates any of these assumptions, non-parametric tests are more suitable. Non-parametric tests are called ‘distribution-free tests’ because they don’t assume that the population data follows a particular distribution, such as the normal distribution.

Statistical tests come in three forms: tests of comparison, correlation or regression.

Comparison tests

Comparison tests assess whether there are differences in means, medians or rankings of scores of two or more groups.

To decide which test suits your aim, consider whether your data meets the conditions necessary for parametric tests, the number of samples, and the levels of measurement of your variables.

Means can only be found for interval or ratio data, while medians and rankings are more appropriate measures for ordinal data.

Correlation tests

Correlation tests determine the extent to which two variables are associated.

Although Pearson’s r is the most statistically powerful test, Spearman’s r is appropriate for interval and ratio variables when the data doesn’t follow a normal distribution.

Of the tests discussed here, the chi-square test of independence is the one that can be used with nominal variables.

Regression tests

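Regression tests assess whether changes in predictor variables are associated with changes in an outcome variable. On its own, a regression does not show that the predictors cause the outcome; that depends on the study design. You can decide which regression test to use based on the number and types of variables you have as predictors and outcomes.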

Most of the commonly used regression tests are parametric. If your data is not normally distributed, you can perform data transformations.

Data transformations help you make your data normally distributed using mathematical operations, like taking the square root of each value.
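
As a small illustration of the square-root transformation, the sketch below compares the skewness of made-up right-skewed data before and after transforming it. The data values and the `skewness` helper are assumptions for this example; the helper uses the simple moment-based formula.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Moment-based skewness: 0 for symmetric data, positive for a long right tail."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])


# Hypothetical right-skewed measurements (made-up data).
raw = [1, 1, 2, 2, 3, 4, 5, 8, 13, 40]
transformed = [math.sqrt(v) for v in raw]

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after square root: {skewness(transformed):.2f}")
```

The transformation reduces the skew but does not guarantee normality, so it is worth re-checking the distribution before running a parametric test.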

Descriptive statistics summarise the characteristics of a data set. Inferential statistics allow you to test a hypothesis or assess whether your data is generalisable to the broader population.

A statistic refers to measures about the sample, while a parameter refers to measures about the population.

A sampling error is the difference between a population parameter and a sample statistic.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.


Bhandari, P. (2023, January 18). Inferential Statistics | An Easy Introduction & Examples. Scribbr. Retrieved 21 May 2024, from https://www.scribbr.co.uk/stats/inferential-statistics-meaning/


  • Indian J Anaesth
  • v.60(9); 2016 Sep

Basic statistical tools in research and data analysis

Zulfiqar Ali

Department of Anaesthesiology, Division of Neuroanaesthesiology, Sheri Kashmir Institute of Medical Sciences, Soura, Srinagar, Jammu and Kashmir, India

S Bala Bhaskar

1 Department of Anaesthesiology and Critical Care, Vijayanagar Institute of Medical Sciences, Bellary, Karnataka, India

Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. Statistical analysis gives meaning to otherwise meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.

INTRODUCTION

Statistics is a branch of science that deals with the collection, organisation, analysis of data and drawing of inferences from the samples to the whole population.[ 1 ] This requires a proper design of the study, an appropriate selection of the study sample and choice of a suitable statistical test. An adequate knowledge of statistics is necessary for proper designing of an epidemiological study or a clinical trial. Improper statistical methods may result in erroneous conclusions which may lead to unethical practice.[ 2 ]

A variable is a characteristic that varies from one individual member of a population to another.[ 3 ] Variables such as height and weight are measured by some type of scale, convey quantitative information and are called quantitative variables. Sex and eye colour give qualitative information and are called qualitative variables[ 3 ] [ Figure 1 ].

Figure 1: Classification of variables

Quantitative variables

Quantitative or numerical data are subdivided into discrete and continuous measurements. Discrete numerical data are recorded as a whole number such as 0, 1, 2, 3,… (integer), whereas continuous data can assume any value. Observations that can be counted constitute the discrete data and observations that can be measured constitute the continuous data. Examples of discrete data are number of episodes of respiratory arrests or the number of re-intubations in an intensive care unit. Similarly, examples of continuous data are the serial serum glucose levels, partial pressure of oxygen in arterial blood and the oesophageal temperature.

A hierarchical scale of increasing precision can be used for observing and recording the data which is based on categorical, ordinal, interval and ratio scales [ Figure 1 ].

Categorical or nominal variables are unordered. The data are merely classified into categories and cannot be arranged in any particular order. If only two categories exist (as in gender: male and female), the data are called dichotomous (or binary). The various causes of re-intubation in an intensive care unit, such as upper airway obstruction, impaired clearance of secretions, hypoxemia, hypercapnia, pulmonary oedema and neurological impairment, are examples of categorical variables.

Ordinal variables have a clear ordering between the variables. However, the ordered data may not have equal intervals. Examples are the American Society of Anesthesiologists status or Richmond agitation-sedation scale.

Interval variables are similar to an ordinal variable, except that the intervals between the values of the interval variable are equally spaced. A good example of an interval scale is the Fahrenheit degree scale used to measure temperature. With the Fahrenheit scale, the difference between 70° and 75° is equal to the difference between 80° and 85°: The units of measurement are equal throughout the full range of the scale.

Ratio scales are similar to interval scales, in that equal differences between scale values have equal quantitative meaning. However, ratio scales also have a true zero point, which gives them an additional property. For example, the system of centimetres is an example of a ratio scale. There is a true zero point and the value of 0 cm means a complete absence of length. The thyromental distance of 6 cm in an adult may be twice that of a child in whom it may be 3 cm.

STATISTICS: DESCRIPTIVE AND INFERENTIAL STATISTICS

Descriptive statistics[ 4 ] try to describe the relationship between variables in a sample or population. Descriptive statistics provide a summary of data in the form of mean, median and mode. Inferential statistics[ 4 ] use a random sample of data taken from a population to describe and make inferences about the whole population. They are valuable when it is not possible to examine each member of an entire population. Examples of descriptive and inferential statistics are illustrated in Table 1 .

Example of descriptive and inferential statistics


Descriptive statistics

The extent to which the observations cluster around a central location is described by the central tendency and the spread towards the extremes is described by the degree of dispersion.

Measures of central tendency

The measures of central tendency are mean, median and mode.[ 6 ] Mean (or the arithmetic average) is the sum of all the scores divided by the number of scores. The mean may be influenced profoundly by extreme values. For example, the average stay of organophosphorus poisoning patients in ICU may be influenced by a single patient who stays in ICU for around 5 months because of septicaemia. Such extreme values are called outliers. The formula for the mean is

\(\overline{x} = \frac{\sum x}{n}\)

where x = each observation and n = number of observations. Median[ 6 ] is defined as the middle of a distribution in ranked data (with half of the observations in the sample above and half below the median value), while mode is the most frequently occurring value in a distribution.

Range defines the spread, or variability, of a sample.[ 7 ] It is described by the minimum and maximum values of the variable. If we rank the data and then group the observations into percentiles, we get better information on the pattern of spread. In percentiles, we divide the ranked observations into 100 equal parts. We can then describe the 25th, 50th, 75th or any other percentile. The median is the 50th percentile. The interquartile range covers the middle 50% of the observations about the median (25th-75th percentile).

Variance[ 7 ] is a measure of how spread out the distribution is. It gives an indication of how closely the individual observations cluster about the mean value. The variance of a population is defined by the following formula:

\(\sigma^{2} = \frac{\sum_{i=1}^{N}\left( X_{i}-\overline{X} \right)^{2}}{N}\)

where \(\sigma^{2}\) is the population variance, \(\overline{X}\) is the population mean, \(X_{i}\) is the i-th element from the population and N is the number of elements in the population. The variance of a sample is defined by a slightly different formula:

\(s^{2} = \frac{\sum_{i=1}^{n}\left( x_{i}-\overline{x} \right)^{2}}{n-1}\)

where \(s^{2}\) is the sample variance, \(\overline{x}\) is the sample mean, \(x_{i}\) is the i-th element from the sample and n is the number of elements in the sample. The formula for the variance of a population has the value ‘N’ as the denominator, whereas the formula for the variance of a sample has ‘n − 1’. The expression ‘n − 1’ is known as the degrees of freedom and is one less than the number of observations. Each observation is free to vary, except the last one, which must take a defined value once the mean is fixed. The variance is measured in squared units. To make the interpretation of the data simple and to retain the basic unit of observation, the square root of variance is used. The square root of the variance is the standard deviation (SD).[ 8 ] The SD of a population is defined by the following formula:

\(\sigma = \sqrt{\frac{\sum_{i=1}^{N}\left( X_{i}-\overline{X} \right)^{2}}{N}}\)

where \(\sigma\) is the population SD, \(\overline{X}\) is the population mean, \(X_{i}\) is the i-th element from the population and N is the number of elements in the population. The SD of a sample is defined by a slightly different formula:

\(s = \sqrt{\frac{\sum_{i=1}^{n}\left( x_{i}-\overline{x} \right)^{2}}{n-1}}\)

where s is the sample SD, \(\overline{x}\) is the sample mean, \(x_{i}\) is the i-th element from the sample and n is the number of elements in the sample. An example for calculation of variance and SD is illustrated in Table 2 .
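The measures described above can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the eight observations are hypothetical and are not the values from Table 2.

```python
import statistics

# Hypothetical sample of eight observations (not taken from Table 2).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)            # sum of observations divided by n
median = statistics.median(data)        # middle of the ranked data
mode = statistics.mode(data)            # most frequently occurring value

pop_var = statistics.pvariance(data)    # population formula, denominator N
sample_var = statistics.variance(data)  # sample formula, denominator n - 1
pop_sd = statistics.pstdev(data)        # square root of the population variance
sample_sd = statistics.stdev(data)      # square root of the sample variance

print("mean, median, mode:", mean, median, mode)
print("variance (population, sample):", pop_var, sample_var)
print("SD (population, sample):", pop_sd, sample_sd)
```

The sample variance comes out larger than the population variance for the same numbers because its denominator is n − 1 rather than N.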

Example of mean, variance, standard deviation


Normal distribution or Gaussian distribution

Most biological variables cluster around a central value, with symmetrical positive and negative deviations about this point.[ 1 ] The standard normal distribution curve is symmetrical and bell-shaped. In a normal distribution curve, about 68% of the scores are within 1 SD of the mean, around 95% are within 2 SDs of the mean and about 99.7% are within 3 SDs of the mean [ Figure 2 ].


Normal distribution curve
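The proportions quoted for 1, 2 and 3 SDs can be reproduced from the cumulative distribution function of the standard normal distribution. A short sketch using `statistics.NormalDist` from the Python standard library (the helper name `share_within` is only illustrative):

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
```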

Skewed distribution

It is a distribution with an asymmetry of the variables about its mean. In a negatively skewed distribution [ Figure 3 ], the mass of the distribution is concentrated on the right of the figure, leading to a longer left tail. In a positively skewed distribution [ Figure 3 ], the mass of the distribution is concentrated on the left of the figure, leading to a longer right tail.


Curves showing negatively skewed and positively skewed distribution
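The direction of skew can also be read from a number. The sketch below computes the moment coefficient of skewness (third central moment divided by the 1.5th power of the second central moment), which is one common definition and an assumption here, since the text does not name a formula. The data sets are hypothetical.

```python
import statistics

def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

right_skewed = [1, 2, 2, 3, 3, 3, 4, 15]  # one large value stretches the right tail
left_skewed = [-v for v in right_skewed]  # mirror image, so the long tail is on the left
symmetric = [1, 2, 3, 4, 5]

for name, values in [("right", right_skewed), ("left", left_skewed), ("symmetric", symmetric)]:
    print(name, round(skewness(values), 3),
          "mean:", statistics.mean(values), "median:", statistics.median(values))
```

In the positively skewed sample the mean is pulled above the median by the long right tail; in the negatively skewed sample the opposite happens.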

Inferential statistics

In inferential statistics, data from a sample are analysed to make inferences about the larger population. The purpose is to answer or test the hypotheses. A hypothesis (plural hypotheses) is a proposed explanation for a phenomenon. Hypothesis tests are thus procedures for making rational decisions about the reality of observed effects.

Probability is the measure of the likelihood that an event will occur. Probability is quantified as a number between 0 and 1 (where 0 indicates impossibility and 1 indicates certainty).

In inferential statistics, the term ‘null hypothesis’ (H0, ‘H-naught’, ‘H-null’) denotes that there is no relationship (difference) between the population variables in question.[ 9 ]

Alternative hypothesis (H1 or Ha) denotes that a relationship (difference) between the variables is expected to exist.[ 9 ]

The P value (or the calculated probability) is the probability of obtaining a result at least as extreme as the one observed, by chance alone, if the null hypothesis is true. The P value is a number between 0 and 1 and is interpreted by researchers in deciding whether to reject or retain the null hypothesis [ Table 3 ].

P values with interpretation


If the P value is less than the arbitrarily chosen value (known as α or the significance level), the null hypothesis (H0) is rejected [ Table 4 ]. However, if the null hypothesis (H0) is incorrectly rejected, this is known as a Type I error.[ 11 ] Further details regarding alpha error, beta error and sample size calculation and factors influencing them are dealt with in another section of this issue by Das S et al .[ 12 ]

Illustration for null hypothesis


PARAMETRIC AND NON-PARAMETRIC TESTS

Numerical data (quantitative variables) that are normally distributed are analysed with parametric tests.[ 13 ]

Two most basic prerequisites for parametric statistical analysis are:

  • The assumption of normality which specifies that the means of the sample group are normally distributed
  • The assumption of equal variance which specifies that the variances of the samples and of their corresponding population are equal.

However, if the distribution of the sample is skewed towards one side or the distribution is unknown due to the small sample size, non-parametric[ 14 ] statistical techniques are used. Non-parametric tests are used to analyse ordinal and categorical data.

Parametric tests

The parametric tests assume that the data are on a quantitative (numerical) scale, with a normal distribution of the underlying population. The samples have the same variance (homogeneity of variances). The samples are randomly drawn from the population, and the observations within a group are independent of each other. The commonly used parametric tests are the Student's t -test, analysis of variance (ANOVA) and repeated measures ANOVA.

Student's t -test

Student's t -test is used to test the null hypothesis that there is no difference between the means of the two groups. It is used in three circumstances:

  • To test if a sample mean (as an estimate of a population mean) differs significantly from a given population mean (the one-sample t-test).

The formula for one-sample t-test is:

\(t = \frac{\overline{X}-u}{SE}\)

where \(\overline{X}\) = sample mean, u = population mean and SE = standard error of mean

  • To test if the population means estimated by two independent samples differ significantly (the unpaired t-test).

The formula for unpaired t-test is:

\(t = \frac{\overline{X}_{1}-\overline{X}_{2}}{SE}\)

where \(\overline{X}_{1}-\overline{X}_{2}\) is the difference between the means of the two groups and SE denotes the standard error of the difference.

  • To test if the population means estimated by two dependent samples differ significantly (the paired t-test). A usual setting for paired t-test is when measurements are made on the same subjects before and after a treatment.

The formula for paired t-test is:

\(t = \frac{\overline{d}}{SE}\)

where \(\overline{d}\) is the mean difference and SE denotes the standard error of this difference.
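The t statistics for the one-sample, unpaired and paired settings can be sketched as follows. The text does not spell out how SE is obtained, so the standard error formulas here are assumptions: s/√n for the one-sample and paired cases, and a pooled-variance standard error (which presumes equal variances) for the unpaired case. The function names and data are hypothetical.

```python
import math
import statistics

def one_sample_t(sample, mu):
    """t = (sample mean - population mean) / SE, with SE = s / sqrt(n)."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - mu) / se

def unpaired_t(a, b):
    """Two independent samples, using a pooled variance (assumes equal variances)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se

def paired_t(before, after):
    """Paired samples: a one-sample t-test on the differences against 0."""
    diffs = [y - x for x, y in zip(before, after)]
    return one_sample_t(diffs, 0)

print(one_sample_t([2, 4, 4, 4, 5, 5, 7, 9], 4))
print(unpaired_t([1, 2, 3], [4, 5, 6]))
print(paired_t([10, 12, 9, 11], [12, 14, 10, 14]))
```

Each statistic would then be compared with the critical value of the t distribution for the appropriate degrees of freedom, which this sketch does not compute.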

The group variances can be compared using the F-test. The F statistic is the ratio of variances (var 1/var 2). If F differs significantly from 1.0, then it is concluded that the group variances differ significantly.

Analysis of variance

The Student's t -test cannot be used for comparison of three or more groups. The purpose of ANOVA is to test if there is any significant difference between the means of two or more groups.

In ANOVA, we study two variances – (a) between-group variability and (b) within-group variability. The within-group variability (error variance) is the variation that cannot be accounted for in the study design. It is based on random differences present in our samples.

However, the between-group variability (or effect variance) is the result of our treatment. These two estimates of variances are compared using the F-test.

A simplified formula for the F statistic is:

\(F = \frac{MS_{b}}{MS_{w}}\)

where \(MS_{b}\) is the mean squares between the groups and \(MS_{w}\) is the mean squares within groups.
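A minimal sketch of the F statistic for a one-way ANOVA is given below. It assumes the usual definitions, which the text does not state explicitly: the between-group sum of squares is divided by (number of groups − 1) and the within-group sum of squares by (total observations − number of groups). The three groups are hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F statistic: mean squares between / mean squares within."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group variability: how far each group mean lies from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group variability: spread of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (n_total - len(groups))
    return ms_between / ms_within

f_value = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_value)
```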

Repeated measures analysis of variance

As with ANOVA, repeated measures ANOVA analyses the equality of means of three or more groups. However, a repeated measures ANOVA is used when all members of a sample are measured under different conditions or at different points in time.

As the variables are measured from a sample at different points of time, the measurement of the dependent variable is repeated. Using a standard ANOVA in this case is not appropriate because it fails to model the correlation between the repeated measures: The data violate the ANOVA assumption of independence. Hence, in the measurement of repeated dependent variables, repeated measures ANOVA should be used.

Non-parametric tests

When the assumptions of normality are not met and the sample means are not normally distributed, parametric tests can lead to erroneous results. Non-parametric tests (distribution-free tests) are used in such situations as they do not require the normality assumption.[ 15 ] Non-parametric tests may fail to detect a significant difference when compared with a parametric test. That is, they usually have less power.

As is done for the parametric tests, the test statistic is compared with known values for the sampling distribution of that statistic and the null hypothesis is accepted or rejected. The types of non-parametric analysis techniques and the corresponding parametric analysis techniques are delineated in Table 5 .

Analogue of parametric and non-parametric tests


Median test for one sample: The sign test and Wilcoxon's signed rank test

The sign test and Wilcoxon's signed rank test are used for median tests of one sample. These tests examine whether one instance of sample data is greater or smaller than the median reference value.

This test examines the hypothesis about the median θ0 of a population. It tests the null hypothesis H0: θ = θ0. When the observed value (Xi) is greater than the reference value (θ0), it is marked with a + sign. If the observed value is smaller than the reference value, it is marked with a − sign. If the observed value is equal to the reference value (θ0), it is eliminated from the sample.

If the null hypothesis is true, the numbers of + signs and − signs are expected to be approximately equal.

The sign test ignores the actual values of the data and only uses + or − signs. Therefore, it is useful when it is difficult to measure the values.
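The procedure can be sketched in a few lines. Under the null hypothesis each remaining observation is a + or a − with probability 1/2, so the count of the rarer sign follows a binomial distribution; the two-sided P value below is computed exactly from that distribution. The function name and the data are hypothetical.

```python
from math import comb

def sign_test(values, theta0):
    """Return (+ count, - count, exact two-sided P value) for H0: median = theta0."""
    plus = sum(1 for v in values if v > theta0)
    minus = sum(1 for v in values if v < theta0)
    n = plus + minus                      # observations equal to theta0 are dropped
    k = min(plus, minus)
    # Probability of k or fewer of the rarer sign when each sign has probability 1/2.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return plus, minus, min(1.0, 2 * tail)

plus, minus, p_value = sign_test([7, 9, 12, 15, 8, 11, 14, 10, 13, 16], theta0=10)
print(plus, minus, p_value)
```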

Wilcoxon's signed rank test

There is a major limitation of sign test as we lose the quantitative information of the given data and merely use the + or – signs. Wilcoxon's signed rank test not only examines the observed values in comparison with θ0 but also takes into consideration the relative sizes, adding more statistical power to the test. As in the sign test, if there is an observed value that is equal to the reference value θ0, this observed value is eliminated from the sample.
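As a sketch of how the relative sizes enter the calculation, the code below ranks the absolute differences from θ0 (tied values share the mean of their ranks, a common convention assumed here) and sums the ranks separately for positive and negative differences. The helper names and the data are hypothetical, and the comparison of the rank sums with tabulated critical values is not shown.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1       # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def signed_rank_sums(values, theta0):
    """Return (W+, W-): rank sums of positive and negative differences from theta0."""
    diffs = [v - theta0 for v in values if v != theta0]  # drop values equal to theta0
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus

w_plus, w_minus = signed_rank_sums([12, 15, 8, 11, 14, 10, 17], theta0=10)
print(w_plus, w_minus)
```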

Wilcoxon's rank sum test ranks all data points in order, calculates the rank sum of each sample and compares the difference in the rank sums.

Mann-Whitney test

It is used to test the null hypothesis that two samples have the same median or, alternatively, whether observations in one sample tend to be larger than observations in the other.

Mann–Whitney test compares all data (xi) belonging to the X group and all data (yi) belonging to the Y group and calculates the probability of xi being greater than yi: P (xi > yi). The null hypothesis states that P (xi > yi) = P (xi < yi) =1/2 while the alternative hypothesis states that P (xi > yi) ≠1/2.
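The pairwise comparison described above can be written directly. The sketch counts the pairs in which xi exceeds yi (the U statistic for group X) and divides by the number of pairs to estimate P (xi > yi). Counting ties as one half is a common convention and an assumption here; the data are hypothetical.

```python
def mann_whitney_u(x, y):
    """U for group X: number of (xi, yi) pairs with xi > yi, ties counted as 1/2."""
    u = 0.0
    for xi in x:
        for yi in y:
            if xi > yi:
                u += 1.0
            elif xi == yi:
                u += 0.5
    return u

x_group = [3, 5, 8]
y_group = [1, 4, 6, 7]
u_x = mann_whitney_u(x_group, y_group)
prob_x_greater = u_x / (len(x_group) * len(y_group))  # estimate of P(xi > yi)
print(u_x, prob_x_greater)
```

An estimate close to 1/2 is what the null hypothesis predicts; the P value for the observed U would come from its sampling distribution, which is not computed here.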

Kolmogorov-Smirnov test

The two-sample Kolmogorov-Smirnov (KS) test was designed as a generic method to test whether two random samples are drawn from the same distribution. The null hypothesis of the KS test is that both distributions are identical. The statistic of the KS test is a distance between the two empirical distributions, computed as the maximum absolute difference between their cumulative curves.
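The KS statistic itself is simple to compute: evaluate both empirical cumulative distribution functions at every observed value and take the largest absolute gap. A minimal sketch with hypothetical samples (the P value is not computed):

```python
def ks_statistic(a, b):
    """Maximum absolute difference between the empirical CDFs of samples a and b."""
    def ecdf(sample, t):
        # Proportion of the sample that is less than or equal to t.
        return sum(1 for v in sample if v <= t) / len(sample)

    # The gap can only change at observed values, so checking those is sufficient.
    return max(abs(ecdf(a, t) - ecdf(b, t)) for t in set(a) | set(b))

d_shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
d_same = ks_statistic([1, 2, 3, 4], [4, 3, 2, 1])
print(d_shifted, d_same)
```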

Kruskal-Wallis test

The Kruskal–Wallis test is a non-parametric test to analyse the variance.[ 14 ] It analyses if there is any difference in the median values of three or more independent samples. The data values are ranked in an increasing order, and the rank sums calculated followed by calculation of the test statistic.
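A sketch of the test statistic follows, using the standard form H = 12/(N(N+1)) · Σ(R_j²/n_j) − 3(N+1), where R_j is the rank sum and n_j the size of group j. This formula is not given in the text and is stated here as an assumption; for brevity the sketch also assumes there are no tied values, so no tie correction is applied. The data are hypothetical.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for independent groups (no tied values allowed)."""
    pooled = sorted(v for g in groups for v in g)
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch assumes no tied values")
    rank = {v: i + 1 for i, v in enumerate(pooled)}   # rank in the pooled sample
    n_total = len(pooled)
    # Sum over groups of (rank sum squared / group size).
    term = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * term - 3 * (n_total + 1)

h_value = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h_value)
```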

Jonckheere test

In contrast to the Kruskal–Wallis test, the Jonckheere test assumes an a priori ordering of the groups, which gives it more statistical power than the Kruskal–Wallis test.[ 14 ]

Friedman test

The Friedman test is a non-parametric test for testing the difference between several related samples. The Friedman test is an alternative for repeated measures ANOVAs which is used when the same parameter has been measured under different conditions on the same subjects.[ 13 ]
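As an illustration, the Friedman statistic can be sketched by ranking the conditions within each subject and comparing the rank sums of the conditions. The formula used, Q = 12/(nk(k+1)) · ΣR_j² − 3n(k+1) for n subjects and k conditions, is the standard one but is not given in the text, and the sketch assumes no ties within a subject. The data are hypothetical.

```python
def friedman_q(rows):
    """Friedman statistic; rows[i][j] is subject i measured under condition j."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0] * k
    for row in rows:
        # Rank the k conditions within this subject (assumes no ties in the row).
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)

# Three subjects, each measured under three conditions.
q_value = friedman_q([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(q_value)
```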

Tests to analyse the categorical data

Chi-square test, Fisher's exact test and McNemar's test are used to analyse the categorical or nominal variables. The Chi-square test compares the frequencies and tests whether the observed data differ significantly from the data expected if there were no differences between groups (i.e., the null hypothesis). It is calculated as the sum, over all cells, of the squared difference between the observed ( O ) and the expected ( E ) data (the deviation, d ) divided by the expected data:

\(\chi^{2} = \sum \frac{\left( O-E \right)^{2}}{E}\)

A Yates correction factor is used when the sample size is small.

Fisher's exact test is used to determine if there are non-random associations between two categorical variables. It does not assume random sampling, and instead of referring a calculated statistic to a sampling distribution, it calculates an exact probability.

McNemar's test is used for paired nominal data. It is applied to a 2 × 2 table with paired-dependent samples. It is used to determine whether the row and column frequencies are equal (that is, whether there is ‘marginal homogeneity’). The null hypothesis is that the paired proportions are equal.

The Mantel-Haenszel Chi-square test is a multivariate test as it analyses multiple grouping variables. It stratifies according to the nominated confounding variables and identifies any that affects the primary outcome variable. If the outcome variable is dichotomous, then logistic regression is used.
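A minimal sketch of the Chi-square statistic for a contingency table is shown below. Expected counts are derived from the row and column totals, which is the usual approach for a test of independence; the optional Yates correction subtracts 0.5 from each absolute deviation before squaring. The function name and the 2 × 2 counts are hypothetical.

```python
def chi_square(table, yates=False):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            deviation = abs(observed - expected)
            if yates:
                deviation = max(0.0, deviation - 0.5)
            stat += deviation ** 2 / expected
    return stat

counts = [[10, 20], [20, 10]]
print(chi_square(counts), chi_square(counts, yates=True))
```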

SOFTWARES AVAILABLE FOR STATISTICS, SAMPLE SIZE CALCULATION AND POWER ANALYSIS

Numerous statistical software systems are currently available. The commonly used software systems are Statistical Package for the Social Sciences (SPSS – manufactured by IBM corporation), Statistical Analysis System (SAS – developed by SAS Institute, North Carolina, United States of America), R (designed by Ross Ihaka and Robert Gentleman from the R core team), Minitab (developed by Minitab Inc), Stata (developed by StataCorp) and MS Excel (developed by Microsoft).

There are a number of web resources which are related to statistical power analyses. A few are:

  • StatPages.net – provides links to a number of online power calculators
  • G-Power – provides a downloadable power analysis program that runs under DOS
  • Power analysis for ANOVA designs – an interactive site that calculates power or sample size needed to attain a given power for one effect in a factorial ANOVA design
  • SPSS makes a program called SamplePower. It gives an output of a complete report on the computer screen which can be cut and pasted into another document.

It is important that a researcher knows the concepts of the basic statistical methods used for conduct of a research study. This will help to conduct an appropriately well-designed study leading to valid and reliable results. Inappropriate use of statistical techniques may lead to faulty conclusions, inducing errors and undermining the significance of the article. Bad statistics may lead to bad research, and bad research may lead to unethical practice. Hence, an adequate knowledge of statistics and the appropriate use of statistical tests are important. An appropriate knowledge about the basic statistical methods will go a long way in improving the research designs and producing quality medical research which can be utilised for formulating the evidence-based guidelines.

Financial support and sponsorship

Conflicts of interest.

There are no conflicts of interest.

Inferential Statistics

Inferential statistics is a branch of statistics that uses various analytical tools to draw inferences about the population data from sample data. Apart from inferential statistics, descriptive statistics forms another branch of statistics. Inferential statistics helps to draw conclusions about the population while descriptive statistics summarizes the features of the data set.

There are two main types of inferential statistics - hypothesis testing and regression analysis. The samples chosen in inferential statistics need to be representative of the entire population. In this article, we will learn more about inferential statistics, its types, examples, and see the important formulas.

What is Inferential Statistics?

Inferential statistics helps to develop a good understanding of the population data by analyzing the samples obtained from it. It helps in making generalizations about the population by using various analytical tests and tools. In order to pick out random samples that will represent the population accurately many sampling techniques are used. Some of the important methods are simple random sampling, stratified sampling, cluster sampling, and systematic sampling techniques.

Inferential Statistics Definition

Inferential statistics can be defined as a field of statistics that uses analytical tools for drawing conclusions about a population by examining random samples. The goal of inferential statistics is to make generalizations about a population. In inferential statistics, a statistic is taken from the sample data (e.g., the sample mean) and is used to make inferences about the population parameter (e.g., the population mean).

Types of Inferential Statistics

Inferential statistics can be classified into hypothesis testing and regression analysis. Hypothesis testing also includes the use of confidence intervals to test the parameters of a population. Given below are the different types of inferential statistics.

Hypothesis Testing

Hypothesis testing is a type of inferential statistics that is used to test assumptions and draw conclusions about the population from the available sample data. It involves setting up a null hypothesis and an alternative hypothesis followed by conducting a statistical test of significance. A conclusion is drawn based on the value of the test statistic, the critical value, and the confidence intervals. A hypothesis test can be left-tailed, right-tailed, or two-tailed. Given below are certain important hypothesis tests that are used in inferential statistics.

Z Test: A z test is used on data that follows a normal distribution and has a sample size greater than or equal to 30. It is used to test if the means of the sample and population are equal when the population variance is known. The right tailed hypothesis can be set up as follows:

Null Hypothesis: \(H_{0}\) : \(\mu = \mu_{0}\)

Alternate Hypothesis: \(H_{1}\) : \(\mu > \mu_{0}\)

Test Statistic: z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\). \(\overline{x}\) is the sample mean, \(\mu\) is the population mean, \(\sigma\) is the population standard deviation and n is the sample size.

Decision Criteria: If the z statistic > z critical value then reject the null hypothesis.
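The right-tailed z test above can be sketched with the standard library. The critical value is taken from the inverse CDF of the standard normal distribution, and a P value is reported as well for reference. The function name and the input numbers are hypothetical.

```python
import math
from statistics import NormalDist

def right_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z, critical value, P value, reject H0?) for H1: mu > mu0."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    critical = NormalDist().inv_cdf(1 - alpha)   # z critical value for the right tail
    p_value = 1 - NormalDist().cdf(z)            # area to the right of z
    return z, critical, p_value, z > critical

# Hypothetical inputs: sample mean 52, hypothesised mean 50, sigma 6, n 36.
z, critical, p_value, reject = right_tailed_z_test(52, 50, 6, 36)
print(z, round(critical, 3), round(p_value, 4), reject)
```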

T Test: A t test is used when the data are approximately normally distributed, so that the test statistic follows a Student's t distribution, and the sample size is less than 30. It is used to compare the sample and population mean when the population variance is unknown. The test statistic and decision criteria for a right-tailed test are given as follows:

Test Statistic: t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\), where s is the sample standard deviation.

Decision Criteria: If the t statistic > t critical value then reject the null hypothesis.

F Test: An f test is used to check if there is a difference between the variances of two samples or populations. The right tailed f hypothesis test can be set up as follows:

Null Hypothesis: \(H_{0}\) : \(\sigma_{1}^{2} = \sigma_{2}^{2}\)

Alternate Hypothesis: \(H_{1}\) : \(\sigma_{1}^{2} > \sigma_{2}^{2}\)

Test Statistic: f = \(\frac{s_{1}^{2}}{s_{2}^{2}}\), where \(s_{1}^{2}\) is the variance of the first sample and \(s_{2}^{2}\) is the variance of the second sample.

Decision Criteria: If the f test statistic > f test critical value then reject the null hypothesis.

Confidence Interval: A confidence interval helps in estimating the parameters of a population. For example, a 95% confidence interval indicates that if the sampling is repeated 100 times under the same conditions, then about 95 of the resulting intervals can be expected to contain the true population parameter. Furthermore, the confidence level determines the critical value used in hypothesis testing.
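As an illustration, a confidence interval for a population mean with a known population standard deviation can be sketched as sample mean ± z* · σ/√n, where z* is the critical value for the chosen confidence level. This particular form is an assumption, since the text does not give a formula, and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sigma, n, level=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% level
    margin = z_star * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = mean_confidence_interval(52, 6, 36)
print(round(low, 2), round(high, 2))
```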

Apart from these tests, other tests used in inferential statistics are the ANOVA test, Wilcoxon signed-rank test, Mann-Whitney U test, Kruskal-Wallis H test, etc.

Regression Analysis

Regression analysis is used to quantify how one variable will change with respect to another variable. There are many types of regressions available such as simple linear, multiple linear, nominal, logistic, and ordinal regression. The most commonly used regression in inferential statistics is linear regression. Linear regression checks the effect of a unit change of the independent variable on the dependent variable. Some important formulas used in inferential statistics for regression analysis are as follows:

Regression Coefficients :

The straight line equation is given as y = \(\alpha\) + \(\beta x\), where \(\alpha\) and \(\beta\) are regression coefficients.

\(\beta = \frac{\sum_{1}^{n}\left ( x_{i}-\overline{x} \right )\left ( y_{i}-\overline{y} \right )}{\sum_{1}^{n}\left ( x_{i}-\overline{x} \right )^{2}}\)

\(\beta = r_{xy}\frac{\sigma_{y}}{\sigma_{x}}\)

\(\alpha = \overline{y}-\beta \overline{x}\)

Here, \(\overline{x}\) is the mean, and \(\sigma_{x}\) is the standard deviation of the first data set. Similarly, \(\overline{y}\) is the mean, and \(\sigma_{y}\) is the standard deviation of the second data set. \(r_{xy}\) is the correlation coefficient between the two data sets.
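The regression formulas above can be checked against each other numerically. The sketch below computes \(\beta\) from the sums of products, \(\alpha\) from the means, and then confirms that \(r_{xy}\frac{\sigma_{y}}{\sigma_{x}}\) gives the same slope. The function name and the five data points are hypothetical.

```python
import math

def linear_fit(x, y):
    """Return (alpha, beta, r_xy, beta computed from r_xy) for y = alpha + beta * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)

    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    r_xy = s_xy / math.sqrt(s_xx * s_yy)
    # sigma_y / sigma_x equals sqrt(s_yy / s_xx) because the 1/n factors cancel.
    beta_from_r = r_xy * math.sqrt(s_yy / s_xx)
    return alpha, beta, r_xy, beta_from_r

alpha, beta, r_xy, beta_from_r = linear_fit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(alpha, beta, r_xy, beta_from_r)
```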

Inferential Statistics Examples

Inferential statistics is very useful and cost-effective as it can make inferences about the population without collecting the complete data. Some inferential statistics examples are given below:

  • Suppose the mean marks of 100 students in a particular country are known. Using this sample information the mean marks of students in the country can be approximated using inferential statistics.
  • Suppose a coach wants to find out how many cartwheels sophomores at his college can do, on average, without stopping. A sample of a few students will be asked to perform cartwheels and the average will be calculated. Inferential statistics will use this data to draw a conclusion about how many cartwheels sophomores can perform on average.

Inferential Statistics vs Descriptive Statistics

Descriptive and inferential statistics are used to describe data and make generalizations about the population from samples. The table given below lists the differences between inferential statistics and descriptive statistics.

Important Notes on Inferential Statistics

  • Inferential statistics makes use of analytical tools to draw statistical conclusions regarding the population data from a sample.
  • Hypothesis testing and regression analysis are the types of inferential statistics.
  • Sampling techniques are used in inferential statistics to determine representative samples of the entire population.
  • Z test, t-test, linear regression are the analytical tools used in inferential statistics.

Examples on Inferential Statistics

Example 1: After a new sales training is given to employees the average sale goes up to $150 (a sample of 25 employees was examined) with a standard deviation of $12. Before the training, the average sale was $100. Check if the training helped at \(\alpha\) = 0.05.

Solution: The t test in inferential statistics is used to solve this problem.

\(\overline{x}\) = 150, \(\mu\) = 100, s = 12, n = 25

\(H_{0}\) : \(\mu = 100\)

\(H_{1}\) : \(\mu > 100\)

t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\) = \(\frac{150-100}{\frac{12}{\sqrt{25}}}\) = 20.83

The degrees of freedom is given by 25 - 1 = 24

Using the t table at \(\alpha\) = 0.05, the critical value is T(0.05, 24) = 1.71

As 20.83 > 1.71 thus, the null hypothesis is rejected and it is concluded that the training helped in increasing the average sales.

Answer: Reject Null Hypothesis.

Example 2: A test was conducted with the variance = 108 and n = 8. Certain changes were made in the test and it was again conducted with variance = 72 and n = 6. At a 0.05 significance level was there any improvement in the test results?

Solution: The f test in inferential statistics will be used

\(H_{0}\) : \(s_{1}^{2} = s_{2}^{2}\)

\(H_{1}\) : \(s_{1}^{2} > s_{2}^{2}\)

\(n_{1}\) = 8, \(n_{2}\) = 6

\(df_{1}\) = 8 - 1 = 7

\(df_{2}\) = 6 - 1 = 5

\(s_{1}^{2}\) = 108, \(s_{2}^{2}\) = 72

The f test formula is given as follows:

F = \(\frac{s_{1}^{2}}{s_{2}^{2}}\) = 108 / 72 = 1.5

Now from the F table the critical value F(0.05, 7, 5) = 4.88

As 1.5 < 4.88, we fail to reject the null hypothesis and conclude that there is not enough evidence to suggest that the test results improved.

Answer: Fail to reject the null hypothesis.

Example 3: After a new sales training is given to employees the average sale goes up to $150 (a sample of 49 employees was examined). Before the training, the average sale was $100 with a standard deviation of $12. Check if the training helped at \(\alpha\) = 0.05.

Solution: This is similar to example 1. However, as the sample size is 49 and the population standard deviation is known, thus, the z test in inferential statistics is used.

\(\overline{x}\) = 150, \(\mu\) = 100, \(\sigma\) = 12, n = 49

z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\) = \(\frac{150-100}{\frac{12}{\sqrt{49}}}\) = 29.17 ≈ 29.2

From the z table at \(\alpha\) = 0.05, the critical value is 1.645.

As 29.2 > 1.645 thus, the null hypothesis is rejected and it is concluded that the training was useful in increasing the average sales.

Answer: Reject the null hypothesis.
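The arithmetic in the three examples can be re-checked with a few lines of code. The sketch below recomputes each test statistic from the numbers given in the problem statements. The t and F critical values are simply copied from the tables quoted in the examples, while the z critical value is computed from the standard normal distribution.

```python
import math
from statistics import NormalDist

# Example 1: one-sample t test (x-bar = 150, mu = 100, s = 12, n = 25).
t_stat = (150 - 100) / (12 / math.sqrt(25))
T_CRITICAL = 1.71    # T(0.05, 24), as quoted from the t table in Example 1

# Example 2: F test (variances 108 and 72).
f_stat = 108 / 72
F_CRITICAL = 4.88    # F(0.05, 7, 5), as quoted from the F table in Example 2

# Example 3: z test (x-bar = 150, mu = 100, sigma = 12, n = 49).
z_stat = (150 - 100) / (12 / math.sqrt(49))
z_critical = NormalDist().inv_cdf(0.95)

print("Example 1:", round(t_stat, 2), "reject H0" if t_stat > T_CRITICAL else "fail to reject H0")
print("Example 2:", round(f_stat, 2), "reject H0" if f_stat > F_CRITICAL else "fail to reject H0")
print("Example 3:", round(z_stat, 2), "reject H0" if z_stat > z_critical else "fail to reject H0")
```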

FAQs on Inferential Statistics

What is the Meaning of Inferential Statistics?

Inferential statistics is a field of statistics that uses several analytical tools to draw inferences and make generalizations about population data from sample data.

What are the Types of Inferential Statistics?

There are two main types of inferential statistics that use different methods to draw conclusions about the population data. These are regression analysis and hypothesis testing.

What are the Different Sampling Methods Used in Inferential Statistics?

It is necessary to choose the correct sample from the population so as to represent it accurately. Some important sampling strategies used in inferential statistics are simple random sampling, stratified sampling, cluster sampling, and systematic sampling.

What are the Different Types of Hypothesis Tests In Inferential Statistics?

The most frequently used hypothesis tests in inferential statistics are parametric tests such as the z test, f test, ANOVA test and t test, as well as certain non-parametric tests such as the Wilcoxon signed-rank test.

What is Inferential Statistics Used For?

Inferential statistics is used for comparing the parameters of two or more samples and for making generalizations about the larger population based on these samples.

Is Z Score a Part of Inferential Statistics?

Yes, z score is a fundamental part of inferential statistics as it expresses how many standard deviations a value lies from the mean, which helps to judge how unusual a sample result is for its population. Furthermore, it is also used in the z test.

What is the Difference Between Descriptive and Inferential Statistics?

Descriptive statistics is used to describe the features of some known dataset whereas inferential statistics analyzes a sample in order to draw conclusions regarding the population.

Descriptive and Inferential Statistics

When analysing data, such as the marks achieved by 100 students for a piece of coursework, it is possible to use both descriptive and inferential statistics in your analysis of their marks. Typically, in most research conducted on groups of people, you will use both descriptive and inferential statistics to analyse your results and draw conclusions. So what are descriptive and inferential statistics? And what are their differences?

Descriptive Statistics

Descriptive statistics is the term given to the analysis of data that helps describe, show or summarize data in a meaningful way such that, for example, patterns might emerge from the data. Descriptive statistics do not, however, allow us to make conclusions beyond the data we have analysed or reach conclusions regarding any hypotheses we might have made. They are simply a way to describe our data.

Descriptive statistics are very important because if we simply presented our raw data it would be hard to visualize what the data was showing, especially if there was a lot of it. Descriptive statistics therefore enable us to present the data in a more meaningful way, which allows simpler interpretation of the data. For example, if we had the results of 100 pieces of students' coursework, we may be interested in the overall performance of those students. We would also be interested in the distribution or spread of the marks. Descriptive statistics allow us to do this. How to properly describe data through statistics and graphs is an important topic and is discussed in other Laerd Statistics guides. Typically, there are two general types of statistic that are used to describe data:

  • Measures of central tendency: these are ways of describing the central position of a frequency distribution for a group of data. In this case, the frequency distribution is simply the distribution and pattern of marks scored by the 100 students from the lowest to the highest. We can describe this central position using a number of statistics, including the mode, median, and mean. You can learn more in our guide: Measures of Central Tendency .
  • Measures of spread: these are ways of summarizing a group of data by describing how spread out the scores are. For example, the mean score of our 100 students may be 65 out of 100. However, not all students will have scored 65 marks. Rather, their scores will be spread out. Some will be lower and others higher. Measures of spread help us to summarize how spread out these scores are. To describe this spread, a number of statistics are available to us, including the range, quartiles, absolute deviation, variance and standard deviation .
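
As a small illustration of both kinds of measure, the sketch below uses Python's standard `statistics` module. The ten marks are made up for brevity (the running example has 100 students).

```python
import statistics

marks = [52, 58, 61, 65, 65, 67, 70, 72, 75, 85]  # hypothetical marks out of 100

# Measures of central tendency
mean = statistics.mean(marks)
median = statistics.median(marks)
mode = statistics.mode(marks)

# Measures of spread
value_range = max(marks) - min(marks)
q1, q2, q3 = statistics.quantiles(marks, n=4)   # quartile cut points
variance = statistics.variance(marks)           # sample variance (divides by n - 1)
std_dev = statistics.stdev(marks)               # sample standard deviation

print(mean, median, mode, value_range, round(std_dev, 2))
```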

When we use descriptive statistics it is useful to summarize our group of data using a combination of tabulated description (i.e., tables), graphical description (i.e., graphs and charts) and statistical commentary (i.e., a discussion of the results).

Inferential Statistics

We have seen that descriptive statistics provide information about our immediate group of data. For example, we could calculate the mean and standard deviation of the exam marks for the 100 students and this could provide valuable information about this group of 100 students. Any group of data like this, which includes all the data you are interested in, is called a population . A population can be small or large, as long as it includes all the data you are interested in. For example, if you were only interested in the exam marks of 100 students, the 100 students would represent your population. Descriptive statistics are applied to populations, and the properties of populations, like the mean or standard deviation, are called parameters as they represent the whole population (i.e., everybody you are interested in).

Often, however, you do not have access to the whole population you are interested in investigating, but only a limited amount of data instead. For example, you might be interested in the exam marks of all students in the UK. It is not feasible to measure all exam marks of all students in the whole of the UK so you have to measure a smaller sample of students (e.g., 100 students), which is used to represent the larger population of all UK students. Properties of samples, such as the mean or standard deviation, are not called parameters, but statistics . Inferential statistics are techniques that allow us to use these samples to make generalizations about the populations from which the samples were drawn. It is, therefore, important that the sample accurately represents the population. The process of achieving this is called sampling (sampling strategies are discussed in detail in the section, Sampling Strategy , on our sister site). Inferential statistics arise out of the fact that sampling naturally incurs sampling error and thus a sample is not expected to perfectly represent the population. The methods of inferential statistics are (1) the estimation of parameter(s) and (2) testing of statistical hypotheses .
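
The difference between a parameter and a statistic, and the sampling error between them, can be seen in a short simulation. The population below is simulated (normally distributed marks with mean 65 and standard deviation 10 are assumptions made only for illustration).

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simulated population of exam marks (assumed distribution, for illustration only)
population = [rng.gauss(65, 10) for _ in range(100_000)]
mu = statistics.mean(population)        # parameter: describes the whole population

sample = rng.sample(population, 100)    # a random sample of 100 students
x_bar = statistics.mean(sample)         # statistic: describes the sample

sampling_error = x_bar - mu             # differs from zero for almost every sample
print(round(mu, 2), round(x_bar, 2), round(sampling_error, 2))
```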


Chapter 15 Quantitative Analysis Inferential Statistics

Inferential statistics are the statistical procedures that are used to reach conclusions about associations between variables. They differ from descriptive statistics in that they are explicitly designed to test hypotheses. Numerous statistical procedures fall in this category, most of which are supported by modern statistical software such as SPSS and SAS. This chapter provides a short primer on only the most basic and frequent procedures; readers are advised to consult a formal text on statistics or take a course on statistics for more advanced procedures.

Basic Concepts

British philosopher Karl Popper said that theories can never be proven, only disproven. As an example, how can we prove that the sun will rise tomorrow? Popper said that just because the sun has risen every single day that we can remember does not necessarily mean that it will rise tomorrow, because inductively derived theories are only conjectures that may or may not be predictive of future phenomena. Instead, he suggested that we may assume a theory that the sun will rise every day without necessarily proving it, and if the sun does not rise on a certain day, the theory is falsified and rejected. Likewise, we can only reject hypotheses based on contrary evidence but can never truly accept them because presence of evidence does not mean that we may not observe contrary evidence later. Because we cannot truly accept a hypothesis of interest (alternative hypothesis), we formulate a null hypothesis as the opposite of the alternative hypothesis, and then use empirical evidence to reject the null hypothesis to demonstrate indirect, probabilistic support for our alternative hypothesis.

A second problem with testing hypothesized relationships in social science research is that the dependent variable may be influenced by an infinite number of extraneous variables and it is not plausible to measure and control for all of these extraneous effects. Hence, even if two variables may seem to be related in an observed sample, they may not be truly related in the population, and therefore inferential statistics are never certain or deterministic, but always probabilistic.

How do we know whether a relationship between two variables in an observed sample is significant, and not a matter of chance? Sir Ronald A. Fisher, one of the most prominent statisticians in history, established the basic guidelines for significance testing. He said that a statistical result may be considered significant if it can be shown that the probability of obtaining it by chance alone is 5% or less. In inferential statistics, this probability is called the p-value , 5% is called the significance level (α), and the desired relationship between the p-value and α is denoted as: p≤0.05. The significance level is the maximum level of risk that we are willing to accept as the price of our inference from the sample to the population. If the p-value is less than or equal to 0.05, a result at least as extreme as the one observed would occur no more than 5% of the time if the null hypothesis were true, so we reject the null hypothesis while accepting up to a 5% risk of rejecting a null hypothesis that is actually true (a Type I error). If p>0.05, we do not have enough evidence to reject the null hypothesis or to support the alternative hypothesis.
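
To make the comparison of p and α concrete, here is a minimal sketch of a two-tailed z-test using `statistics.NormalDist` from the standard library. It assumes the population standard deviation is known, which is rarely the case in practice, and the numbers (sample mean 67.5, hypothesized mean 65, σ = 10, n = 100) are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_p_value(sample_mean, mu0, sigma, n):
    """Two-tailed z-test: returns the z statistic and its p-value."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Probability of a result at least this extreme if the null hypothesis is true
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

alpha = 0.05  # significance level
z, p = z_test_p_value(sample_mean=67.5, mu0=65, sigma=10, n=100)
reject_null = p <= alpha

print(round(z, 2), round(p, 4), reject_null)
```

Here the p-value is roughly 0.012, so the null hypothesis would be rejected at the 5% level.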

We must also understand three related statistical concepts: sampling distribution, standard error, and confidence interval. A sampling distribution is the theoretical distribution of a statistic (such as the sample mean) computed from an infinite number of samples of the same size drawn from the population of interest in your study. However, because a sample is never identical to the population, every sample estimate has some inherent level of error. The standard deviation of the sampling distribution, which quantifies this error, is called the standard error . If this standard error is small, then statistical estimates derived from the sample (such as sample mean) are reasonably good estimates of the population. The precision of our sample estimates is defined in terms of a confidence interval (CI). A 95% CI is defined as a range of approximately plus or minus two standard errors (more precisely, 1.96) around the sample estimate. Hence, when we report a 95% CI, what we mean is that if we repeated the sampling many times, about 95% of the intervals constructed this way would contain the population parameter. Jointly, the p-value and the CI give us a good idea of the probability of our result and how close it is to the corresponding population parameter.
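
A standard error and an approximate confidence interval for a mean can be computed as follows. This is a sketch that uses the normal approximation (a multiplier of about 1.96 for 95%); with small samples a t-based multiplier would normally be used instead. The data and the function name are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(data, level=0.95):
    """Normal-approximation CI for the mean: estimate +/- z * standard error."""
    n = len(data)
    mean = statistics.mean(data)
    se = statistics.stdev(data) / math.sqrt(n)    # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)     # about 1.96 for a 95% CI
    return mean - z * se, mean + z * se, se

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
lo, hi, se = mean_confidence_interval(scores)
print(round(lo, 2), round(hi, 2), round(se, 3))
```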

General Linear Model

Most inferential statistical procedures in social science research are derived from a general family of statistical models called the general linear model (GLM). A model is an estimated mathematical equation that can be used to represent a set of data, and linear refers to a straight line. Hence, a GLM is a system of equations that can be used to represent linear patterns of relationships in observed data.

Figure 15.1. Two-variable linear model.

The simplest type of GLM is a two-variable linear model that examines the relationship between one independent variable (the cause or predictor) and one dependent variable (the effect or outcome). Let us assume that these two variables are age and self-esteem respectively. The bivariate scatterplot for this relationship is shown in Figure 15.1, with age (predictor) along the horizontal or x-axis and self-esteem (outcome) along the vertical or y-axis. From the scatterplot, it appears that individual observations representing combinations of age and self-esteem generally seem to be scattered around an imaginary upward sloping straight line. We can estimate parameters of this line, such as its slope and intercept from the GLM. From high-school algebra, recall that straight lines can be represented using the mathematical equation y = mx + c, where m is the slope of the straight line (how much does y change for unit change in x) and c is the intercept term (what is the value of y when x is zero). In GLM, this equation is represented formally as:

y = β₀ + β₁x + ε

where β₀ is the intercept term, β₁ is the slope, and ε is the error term . ε represents the deviation of actual observations from their estimated values, since most observations are close to the line but do not fall exactly on the line (i.e., the GLM is not perfect). Note that a linear model can have more than one predictor. To visualize a linear model with two predictors, imagine a three-dimensional cube, with the outcome (y) along the vertical axis, and the two predictors (say, x₁ and x₂) along the two horizontal axes along the base of the cube. A line that describes the relationship between two or more variables is called a regression line, β₀ and β₁ (and other beta values) are called regression coefficients , and the process of estimating regression coefficients is called regression analysis . The GLM for regression analysis with n predictor variables is:

y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + … + βₙxₙ + ε

In the above equation, predictor variables xᵢ may represent independent variables or covariates (control variables). Covariates are variables that are not of theoretical interest but may have some impact on the dependent variable y and should be controlled, so that the residual effects of the independent variables of interest are detected more precisely. Covariates capture systematic errors in a regression equation while the error term (ε) captures random errors. Though most variables in the GLM tend to be interval or ratio-scaled, this does not have to be the case. Some predictor variables may even be nominal variables (e.g., gender: male or female), which are coded as dummy variables . These are variables that can assume one of only two possible values: 0 or 1 (in the gender example, “male” may be designated as 0 and “female” as 1 or vice versa). A nominal variable with n levels is represented using n–1 dummy variables. For instance, industry sector, consisting of the agriculture, manufacturing, and service sectors, may be represented using a combination of two dummy variables (x₁, x₂), with (0, 0) for agriculture, (0, 1) for manufacturing, and (1, 0) for service. It does not matter which level of a nominal variable is coded as 0 and which level as 1, because 0 and 1 values are treated as two distinct groups (such as treatment and control groups in an experimental design), rather than as numeric quantities, and the statistical parameters of each group are estimated separately.
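
Dummy coding of a nominal variable can be sketched as below. The helper `dummy_code` is hypothetical, and both the choice of reference level and the order of the dummy columns are arbitrary, so the column order here need not match the (x₁, x₂) ordering used in the text.

```python
def dummy_code(value, levels, reference):
    """Return n-1 indicator values for a nominal value; the reference level is all zeros."""
    columns = [lvl for lvl in levels if lvl != reference]
    if value != reference and value not in columns:
        raise ValueError(f"unknown level: {value!r}")
    return tuple(1 if value == lvl else 0 for lvl in columns)

sectors = ["agriculture", "manufacturing", "service"]

# Three levels need only two dummy variables
codes = {s: dummy_code(s, sectors, reference="agriculture") for s in sectors}
for sector, code in codes.items():
    print(sector, code)
```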

The GLM is a very powerful statistical tool because it is not one single statistical method, but rather a family of methods that can be used to conduct sophisticated analysis with different types and quantities of predictor and outcome variables. If we have a dummy predictor variable, and we are comparing the effects of the two levels (0 and 1) of this dummy variable on the outcome variable, we are doing an analysis of variance (ANOVA). If we are doing ANOVA while controlling for the effects of one or more covariates, we have an analysis of covariance (ANCOVA). We can also have multiple outcome variables (e.g., y₁, y₂, …, yₙ), which are represented using a “system of equations” consisting of a different equation for each outcome variable (each with its own unique set of regression coefficients). If multiple outcome variables are modeled as being predicted by the same set of predictor variables, the resulting analysis is called multivariate regression . If we are doing ANOVA or ANCOVA analysis with multiple outcome variables, the resulting analysis is a multivariate ANOVA (MANOVA) or multivariate ANCOVA (MANCOVA) respectively. If we model the outcome in one regression equation as a predictor in another equation in an interrelated system of regression equations, then we have a very sophisticated type of analysis called structural equation modeling . The most important problem in GLM is model specification , i.e., how to specify a regression equation (or a system of equations) to best represent the phenomenon of interest. Model specification should be based on theoretical considerations about the phenomenon being studied, rather than what fits the observed data best. The role of data is in validating the model, and not in its specification.

Two-Group Comparison

One of the simplest inferential analyses is comparing the post-test outcomes of treatment and control group subjects in a randomized post-test only control group design, such as whether students enrolled in a special program in mathematics perform better than those in a traditional math curriculum. In this case, the predictor variable is a dummy variable (1=treatment group, 0=control group), and the outcome variable, performance, is ratio scaled (e.g., score of a math test following the special program). The analytic technique for this simple design is a one-way ANOVA (one-way because it involves only one predictor variable), and the statistical test used is called a Student’s t-test (or t-test, in short).
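
One way to see why a two-group comparison fits inside the linear model: regressing the outcome on a 0/1 dummy gives an intercept equal to the control-group mean and a slope equal to the difference between the group means. The sketch below fits the line by ordinary least squares; the test scores are hypothetical.

```python
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

control = [80, 82, 78, 85, 75]     # hypothetical scores, dummy = 0
treatment = [85, 90, 88, 92, 95]   # hypothetical scores, dummy = 1

x = [0] * len(control) + [1] * len(treatment)
y = control + treatment
b0, b1 = fit_line(x, y)

# b0 matches the control mean; b1 matches the difference in group means
print(b0, statistics.mean(control))
print(b1, statistics.mean(treatment) - statistics.mean(control))
```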

The t-test was introduced in 1908 by William Sealy Gosset, a chemist working for the Guinness Brewery in Dublin, Ireland to monitor the quality of stout, a dark beer popular with 19th-century porters in London. Because his employer did not want to reveal the fact that it was using statistics for quality control, Gosset published the test in Biometrika using his pen name “Student”, and the test involved calculating the value of t, which was a letter used frequently by Fisher to denote the difference between two groups. Hence, the name Student’s t-test, although Student’s identity was known to fellow statisticians.

The t-test examines whether the means of two groups are statistically different from each other (non-directional or two-tailed test), or whether one group has a statistically larger (or smaller) mean than the other (directional or one-tailed test). In our example, if we wish to examine whether students in the special math curriculum perform better than those in traditional curriculum, we have a one-tailed test. This hypothesis can be stated as:

H₀: μ₁ ≤ μ₂ (null hypothesis)

H₁: μ₁ > μ₂ (alternative hypothesis)

where μ₁ represents the mean population performance of students exposed to the special curriculum (treatment group) and μ₂ is the mean population performance of students with traditional curriculum (control group). Note that the null hypothesis is always the one with the “equal” sign, and the goal of all statistical significance tests is to reject the null hypothesis.
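
The t statistic for this kind of comparison can be computed by hand. The sketch below uses the pooled-variance (equal variances assumed) form of the two-sample t-test; the scores and the function name `pooled_t` are hypothetical.

```python
import math
import statistics

def pooled_t(group1, group2):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # standard error of the difference
    t = (statistics.mean(group1) - statistics.mean(group2)) / se
    return t, df

treatment = [85, 90, 88, 92, 95]   # hypothetical scores, special curriculum
control = [80, 82, 78, 85, 75]     # hypothetical scores, traditional curriculum

t, df = pooled_t(treatment, control)
print(round(t, 3), df)
```

For a one-tailed test at the 5% level, t would then be compared with the critical value for the given degrees of freedom from a t table (about 1.86 for 8 degrees of freedom), or converted to a p-value with statistical software.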

Come to the Right Conclusion with Inferential Analysis

March 23, 2020

by Mara Calvello

We’re all guilty of jumping to conclusions from time to time.

Whether it's convincing yourself that no one is going to buy a ticket for the conference you’ve worked so hard to plan or that arriving at the airport two hours in advance simply isn’t enough time, we’ve all done it.

Outside of our daily lives, it’s easy to jump to inaccurate conclusions at work, no matter the industry. When we do this, we’re essentially generalizing, but what if you could make these generalizations more accurately? It’s possible when you run inferential analysis tests.

What is inferential analysis?

Inferential analysis is used to draw, and measure the reliability of, conclusions about a population that are based on information gathered from a sample of that population. Since inferential analysis doesn’t sample everyone in a population, the results will always contain some level of uncertainty.

When diving into statistical analysis , oftentimes the size of the population we’re looking to analyze is too large, making it impossible to study everyone. In these cases, data is collected using random samples of individuals within a specific population. Then, inferential analysis is used on the data to come to conclusions about the overall population.

Because it’s often impossible to measure an entire population of people, inferential analysis relies on gathering data from a sample of individuals within the population. Essentially, inferential analysis is used to try to infer from a sample of data what the population might think or show.

There are two main ways of going about this:

  • Estimating parameters: Taking a statistic from a data sample (like the sample mean) and using it to conclude something about the population (the population mean).
  • Hypothesis tests: The use of data samples to answer specific research questions.

In estimating parameters, a statistic computed from the sample is used as an estimate of a value that describes the entire population, and it is reported together with a confidence interval that expresses the uncertainty of the estimate.

In hypothesis testing, sample data is used to determine whether the evidence is strong enough to reject an assumption about the population.

The two main types of statistical analysis that people use most often are descriptive analysis and inferential analysis. Because of this, it’s not uncommon for the two to be confused for each other, even though they provide data analysts with different insights into the data that is collected.

While neither can show the whole picture on its own, when used together they provide a powerful tool for data visualization and predictive analytics , since they rely on the same set of data.

Descriptive statistical analysis gives information that describes the data in some way. This is sometimes done with charts and graphs made with data visualization software to explain what the data presents. This method of statistical analysis isn’t used to draw conclusions, only to summarize the information.

Inferential statistical analysis is the method that will be used to draw the conclusions. It allows users to infer or conclude trends about a larger population based on the samples that are analyzed. Basically, it takes data from a sample and then makes conclusions about a larger population or group.

This type of statistical analysis is often used to study the relationship between variables within a sample, allowing for conclusions and generalizations that accurately represent the population. And unlike descriptive analysis, businesses can test a hypothesis and come up with various conclusions from this data.

Descriptive analysis vs inferential analysis

Let’s think of it this way. You’re at a baseball game and ask a sample of 100 fans if they like hotdogs. You could make a bar graph of yes or no answers, which would be descriptive analysis. Or you could use your research to conclude that 93% of the population (all baseball fans in all the baseball stadiums) like hotdogs, which would be inferential analysis.
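
The inferential step in the hotdog example can be made explicit by attaching a confidence interval to the sample proportion. The sketch below assumes that 93 of the 100 fans answered yes, that they form a random sample of all baseball fans, and that the normal approximation is adequate; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, level=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)       # standard error of the proportion
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return p_hat, p_hat - z * se, p_hat + z * se

p_hat, lo, hi = proportion_ci(93, 100)   # assumed: 93 of 100 fans said yes
print(p_hat, round(lo, 3), round(hi, 3))
```

Under these assumptions, the estimate for all fans would be reported as roughly 88% to 98% rather than as a single figure.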

Types of inferential analysis tests

There are many types of inferential analysis tests in the statistics field. Which one you choose to use will depend on your sample size, the hypothesis you’re trying to test, and the size of the population being tested.

Linear regression analysis is used to understand the relationship between two variables (X and Y) in a data set, so that the value of one can be estimated from the other when making projections about future events and goals.

The main objective of regression analysis is to estimate the values of a random (dependent) variable Y based on the values of a known (or fixed) variable X. This is typically represented by a scatter plot, like the one below.

Linear regression analysis

One key advantage of using regression within your analysis is that it provides a detailed look at data and includes an equation that can be used for predictive analytics and optimizing data in the future.

The formula for regression analysis is:

Y = a + bX

a → refers to the y-intercept, the value of Y when X = 0

b → refers to the slope, or rise over run
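
The intercept and slope can be estimated from data by ordinary least squares. The following is a minimal sketch with made-up observations; `fit_line` is a hypothetical helper, not part of any library.

```python
def fit_line(x, y):
    """Least-squares estimates of the intercept a and slope b in Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope: change in Y per unit change in X
    a = mean_y - b * mean_x    # intercept: value of Y when X = 0
    return a, b

x = [1, 2, 3, 4, 5]                 # hypothetical predictor values
y = [2.1, 3.9, 6.2, 7.8, 10.1]      # hypothetical outcomes

a, b = fit_line(x, y)
predicted = a + b * 6               # projection for a new value, X = 6
print(round(a, 2), round(b, 2), round(predicted, 2))
```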

Another inferential analysis test is correlation analysis, which is used to understand the extent to which two variables are dependent on one another. This analysis essentially tests the strength of the relationship between two variables, and if their correlation is strong or weak.

The correlation between two variables can also be negative or positive, depending on whether they move in opposite directions or in the same direction. Variables are considered “uncorrelated” when a change in one is not associated with a change in the other.

An example of a positive correlation would be price and demand: an increase in demand tends to be accompanied by an increase in price, because more consumers want something and are willing to pay more for it.

Overall, the objective of correlation analysis is to find the numerical value (the correlation coefficient, which ranges from -1 to +1) that shows the relationship between the two variables and how they move together. Like regression, this is typically done by utilizing data visualization software to create a graph.
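
The most common such value is Pearson's correlation coefficient r. A minimal sketch of its computation follows; the demand and price figures are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

demand = [100, 120, 140, 160, 180]       # hypothetical units demanded
price = [2.0, 2.3, 2.9, 3.1, 3.8]        # hypothetical prices

r = pearson_r(demand, price)
print(round(r, 3))   # close to +1: a strong positive correlation
```

A value near +1 or -1 indicates a strong relationship, and a value near 0 indicates a weak one. The coefficient measures association only and does not by itself show that one variable causes the other.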

Correlation analysis

The analysis of variance (ANOVA) statistical method is used to test and analyze the differences between two or more means from a data set. This is done by examining the amount of variation between the samples.

In simplest terms, ANOVA provides a statistical test of whether two or more population means are equal, in addition to generalizing the t-test between two means.

A t-test is used to show how significant the difference between two groups is. Essentially, it helps determine whether a difference (measured in means/averages) could have happened by chance.

This method will allow for the testing of groups to see if there’s a difference between them. For example, you may test students at two different high schools who take the same exam to see if one high school tests higher than the other.

ANOVA can also be broken down into two types:

  • One-way: Only one independent variable, which can have two or more levels. An example would be a brand of peanut butter.
  • Two-way: Two independent variables that can have multiple levels. An example would be a brand of peanut butter and the calories.

A level is simply one of the different groups within the variable. So, using the same example as above, the levels of brands of peanut butter might be Jif, Skippy, or Peter Pan. The levels for calories could be low-calorie, regular, or high-calorie.
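
For a one-way design, the ANOVA test statistic F is the ratio of the variation between group means to the variation within groups. The sketch below computes it directly; the taste ratings for the three brands are invented for illustration.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical taste ratings (1-10) for three brands
ratings = [[6, 7, 8], [7, 8, 9], [8, 9, 10]]
f, df_b, df_w = one_way_anova(ratings)
print(round(f, 2), df_b, df_w)
```

The resulting F is then compared with the F distribution for the two degrees of freedom to obtain a p-value.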

Analysis of covariance (ANCOVA) is a blend of analysis of variance (ANOVA) and regression. ANCOVA tests whether group means differ on an outcome after statistically adjusting for one or more continuous covariates, so the effect of each factor can be examined without being confounded by those other variables.

It is often used:

  • For an extension of multiple regression as a way to compare multiple regression lines
  • To control covariates (other variables) that aren’t the main focus of your study
  • For an extension of the analysis of variance
  • To study combinations of other variables of interest
  • To control for factors that cannot be randomized but that can be measured

ANCOVA can also be used in pretest-posttest designs, where regression to the mean may affect the posttest measurement.

As an example, let’s say your business creates a new pharmaceutical for the public that lowers blood pressure. You may conduct a study that monitors four treatment groups and one control group.

If you use ANOVA, you’ll be able to tell if the treatment does, in fact, lower blood pressure. When you incorporate ANCOVA, you can control other factors that might influence the outcome, like family life, occupation, or other prescription drug use.

A confidence interval is a tool used in inferential analysis to estimate a parameter, usually the mean, of an entire population. Essentially, it expresses how much uncertainty there is in any particular statistic, and it is typically built from a margin of error around that statistic.

The confidence level attached to the interval is expressed as a percentage that reflects how sure you are that the results of the survey or poll are what you’d expect if it were possible to survey the entire population.

For instance, if the results of a poll or survey have a 98% confidence interval, then this defines the range of values that you can be 98% certain contains the population mean. To come to this conclusion, three pieces of information are needed:

  • Confidence level : Describes the uncertainty associated with a sampling method
  • Statistic: Data collected from the survey or poll
  • Margin of error : How many percentage points your results may differ from the real population value
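
The three pieces fit together as statistic ± margin of error, where the margin of error grows with the confidence level. Below is a sketch for a poll proportion, assuming a simple random sample and the normal approximation; the poll numbers (50% of 1,000 respondents) are hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence_level):
    """Margin of error for a sample proportion at the given confidence level."""
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)   # critical value
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

moe_95 = margin_of_error(0.5, 1000, 0.95)
moe_98 = margin_of_error(0.5, 1000, 0.98)

# A higher confidence level gives a wider interval for the same data
print(round(moe_95, 3), round(moe_98, 3))
```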

A chi-square test, otherwise known as a χ² test, is used to identify differences between groups when all of the variables are nominal (that is, variables whose values are categories rather than numbers), like gender, marital status, political affiliation, and so on.

These tests are typically used with specific contingency tables that group observations based on common characteristics.

Questions that the chi-square test could answer might be:

  • Are education level and marital status related for all people in the United States?
  • Is there a relationship between voter intent and political party membership?
  • Does gender affect which holiday people favor?

Usually, the data for these tests is collected through simple random sampling, so that the sample can support an accurate conclusion about the population. If we use the first question listed above, the observations would be organized into a contingency table with one row for each education level and one column for each marital status, where each cell holds the number of people in that combination.

These contingency tables are used as a starting point to organize the data collected through simple random sampling.
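
Given a contingency table of counts, the chi-square statistic compares each observed count with the count expected if the two variables were unrelated. The sketch below uses a made-up 2×2 table (rows: two education levels, columns: married or not married); the counts are purely illustrative.

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

observed = [[10, 20],   # hypothetical counts: education level A
            [20, 10]]   # hypothetical counts: education level B
stat, df = chi_square(observed)
print(round(stat, 3), df)
```

With 1 degree of freedom, a statistic above roughly 3.84 corresponds to p < 0.05, so this hypothetical table would suggest that the two variables are related.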

There are many advantages to using inferential analysis, mainly that it provides a surplus of detailed information – much more than you’d have after running a descriptive analysis test.

This information provides researchers and analysts with comprehensive insights into relationships between two variables. It can also shed light on cause and effect and support predictions regarding trends and patterns throughout industries.

Plus, since it is so widely used in the business world as well as academia, it’s a universally accepted method of statistical analysis.

When it comes to inferential statistics, there are two main limitations.

The first limitation comes from the fact that since the data being analyzed is from a population that hasn’t been fully measured, data analysts can’t ever be 100% sure that the statistics being calculated are correct. Since inferential analysis is based on the process of using values measured in a sample to conclude the values that would be measured from the total population, there will always be some level of uncertainty regarding the results.

The second limitation is that some inferential tests require the analyst or researcher to make an educated guess based on theories to run the tests. Similar to the first limitation, there will be uncertainty surrounding these guesses, which will also mean some repercussions on the reliability of the results of some statistical tests.

Don’t jump to conclusions

Before you jump to a potentially inaccurate conclusion regarding data, make sure to take advantage of the information that awaits within an inferential analysis test.

No matter the type of conclusion you’re looking to come to, or the hypothesis you start with, you may be surprised by the results an inferential analysis test can bring.

Mara Calvello is a Content Marketing Manager at G2. She graduated with a Bachelor of Arts from Elmhurst College (now Elmhurst University). Mara's expertise lies within writing for HR, Design, SaaS Management, Social Media, and Technology categories. In her spare time, Mara is either at the gym, exploring the great outdoors with her rescue dog Zeke, enjoying Italian food, or right in the middle of a Harry Potter binge.

Statology

Statistics Made Easy

Descriptive vs. Inferential Statistics: What’s the Difference?

There are two main branches in the field of statistics:

  • Descriptive Statistics
  • Inferential Statistics

This tutorial explains the difference between the two branches and why each one is useful in certain situations.

Descriptive Statistics

In a nutshell, descriptive statistics aims to describe a chunk of raw data using summary statistics, graphs, and tables.

Descriptive statistics are useful because they allow you to understand a group of data much more quickly and easily compared to just staring at rows and rows of raw data values.

For example, suppose we have a set of raw data that shows the test scores of 1,000 students at a particular school. We might be interested in the average test score along with the distribution of test scores.

Using descriptive statistics, we could find the average score and create a graph that helps us visualize the distribution of scores.

This allows us to understand the test scores of the students much more easily compared to just staring at the raw data.

Common Forms of Descriptive Statistics

There are three common forms of descriptive statistics:

1. Summary statistics. These are statistics that summarize the data using a single number. There are two popular types of summary statistics:

  • Measures of central tendency: these numbers describe where the center of a dataset is located. Examples include the mean and the median.
  • Measures of dispersion: these numbers describe how spread out the values are in the dataset. Examples include the range, interquartile range, standard deviation, and variance.

2. Graphs. Graphs help us visualize data. Common types of graphs used to visualize data include boxplots, histograms, stem-and-leaf plots, and scatterplots.

3. Tables. Tables can help us understand how data is distributed. One common type of table is a frequency table, which tells us how many data values fall within certain ranges.

Example of Using Descriptive Statistics

The following example illustrates how we might use descriptive statistics in the real world.

Suppose 1,000 students at a certain school all take the same test. We are interested in understanding the distribution of test scores, so we use the following descriptive statistics:

1. Summary Statistics

Mean: 82.13. This tells us that the average test score among all 1,000 students is 82.13.

Median: 84. This tells us that half of all students scored higher than 84 and half scored lower than 84.

Max: 100. Min: 45. This tells us the maximum score that any student obtained was 100 and the minimum score was 45. The range, which is the difference between the max and the min, is 55.
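As a rough sketch, summary statistics like these could be computed with Python's standard `statistics` module. The seven scores below are made-up illustration values, not the 1,000 scores from the example.

```python
from statistics import mean, median

# Hypothetical test scores (illustration only, not the article's data).
scores = [45, 70, 82, 84, 86, 90, 100]

avg = mean(scores)                       # measure of central tendency
mid = median(scores)                     # middle value of the sorted scores
score_range = max(scores) - min(scores)  # measure of dispersion

print(f"Mean: {avg:.2f}, Median: {mid}, Range: {score_range}")
```

The same three calls work on a list of any length, so the full set of 1,000 scores could be summarized the same way.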

2. Graphs

To visualize the distribution of test scores, we can create a histogram, a type of chart that uses rectangular bars to represent frequencies.

[Figure: histogram of the test scores]

Based on this histogram, we can see that the distribution of test scores is roughly bell-shaped. Most of the students scored between 70 and 90, while very few scored above 95 and fewer still scored below 50.

3. Tables

Another easy way to gain an understanding of the distribution of scores is to create a frequency table. For example, the following frequency table shows what percentage of students scored between various ranges:

[Table: frequency table of test scores by range]

We can see that just 4% of the total students scored above a 95. We can also see that 12% + 9% + 4% = 25% of all students scored an 85 or higher.

A frequency table is particularly helpful if we want to know what percentage of the data values fall above or below a certain value. For example, suppose the school considers an “acceptable” test score to be any score above a 75.

By looking at the frequency table, we can easily see that 20% + 22% + 12% + 9% + 4% = 67% of the students received an acceptable test score.
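A frequency table of this kind can be sketched in a few lines of Python. The helper names (`frequency_table`, `percent_above`), the bin width of 10, and the twelve scores are all assumptions made for illustration. They are not the bins or data behind the table above.

```python
from collections import Counter

def frequency_table(scores, bin_width=10):
    """Return {(low, high): percent of scores in the bin [low, high)}."""
    counts = Counter((s // bin_width) * bin_width for s in scores)
    n = len(scores)
    return {(low, low + bin_width): 100 * counts[low] / n
            for low in sorted(counts)}

def percent_above(scores, threshold):
    """Percentage of scores strictly above the threshold."""
    return 100 * sum(s > threshold for s in scores) / len(scores)

# Hypothetical scores (illustration only).
scores = [45, 58, 62, 71, 74, 76, 79, 83, 85, 88, 91, 97]

table = frequency_table(scores)
for (low, high), pct in table.items():
    print(f"{low}-{high}: {pct:.1f}%")

acceptable = percent_above(scores, 75)
print(f"Above 75: {acceptable:.1f}%")
```

Adding up bin percentages by hand, as in the text, and calling `percent_above` directly should agree whenever the threshold falls on a bin boundary.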

Inferential Statistics

In a nutshell, inferential statistics uses a small sample of data to draw inferences about the larger population that the sample came from.

For example, we might be interested in understanding the political preferences of millions of people in a country.

However, it would take too long and be too expensive to actually survey every individual in the country. Thus, we would instead take a smaller survey of, say, 1,000 Americans, and use the results of the survey to draw inferences about the population as a whole.

This is the whole premise behind inferential statistics – we want to answer some question about a population, so we obtain data for a small sample of that population and use the data from the sample to draw inferences about the population.

The Importance of a Representative Sample

In order to be confident in our ability to use a sample to draw inferences about a population, we need to make sure that we have a representative sample, that is, a sample in which the characteristics of the individuals in the sample closely match the characteristics of the overall population.

Ideally, we want our sample to be like a “mini version” of our population. So, if we want to draw inferences on a population of students composed of 50% girls and 50% boys, our sample would not be representative if it included 90% boys and only 10% girls.


If our sample is not similar to the overall population, then we cannot generalize the findings from the sample to the overall population with any confidence.

How to Obtain a Representative Sample

To maximize the chances that you obtain a representative sample, you need to focus on two things:

1. Make sure you use a random sampling method.

There are several different random sampling methods that you can use that are likely to produce a representative sample, including:

  • A simple random sample
  • A systematic random sample
  • A cluster random sample
  • A stratified random sample

Random sampling methods tend to produce representative samples because every member of the population has a known chance of being included in the sample (an equal chance, in the case of a simple random sample), so no subgroup is systematically favored.
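To make three of these methods concrete, here is a minimal sketch using Python's `random` module. The function names and the population of 500 girls and 500 boys are hypothetical, and cluster sampling is left out for brevity.

```python
import random

def simple_random_sample(population, n, rng):
    """Every subset of size n is equally likely."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Random start, then every step-th member (assumes n <= len(population))."""
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]

def stratified_sample(population, n, key, rng):
    """Sample each stratum in proportion to its share of the population.
    Rounding may make the total differ slightly from n for uneven strata."""
    strata = {}
    for member in population:
        strata.setdefault(key(member), []).append(member)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 500 girls and 500 boys.
population = [(i, "girl" if i < 500 else "boy") for i in range(1000)]
rng = random.Random(42)  # fixed seed so the sketch is reproducible

srs = simple_random_sample(population, 100, rng)
systematic = systematic_sample(population, 100, rng)
stratified = stratified_sample(population, 100, key=lambda m: m[1], rng=rng)

girls = sum(1 for _, group in stratified if group == "girl")
print(f"Stratified sample: {girls} girls, {len(stratified) - girls} boys")
```

The stratified version guarantees the 50/50 split by construction, while the simple random sample only matches it on average.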

2. Make sure your sample size is large enough.

Along with using an appropriate sampling method, it’s important to ensure that the sample is large enough so that you have enough data to generalize to the larger population.

To determine how large your sample should be, you have to consider the population size you’re studying, the confidence level you’d like to use, and the margin of error you consider to be acceptable.

Fortunately, you can use online calculators to plug in these values and see how large your sample needs to be.
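For readers who want to see the arithmetic, the sketch below uses one common approach for estimating a proportion: Cochran's formula with a finite population correction. This is an assumption about what such calculators do, and a given calculator may use a different formula. The function name and the default `proportion=0.5` (the most conservative choice) are illustrative.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(population_size, confidence=0.95,
                         margin_of_error=0.05, proportion=0.5):
    """Sample size for estimating a proportion (Cochran's formula,
    adjusted with a finite population correction)."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Sample size for an effectively infinite population.
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    # Shrink it for a finite population.
    n = n0 / (1 + (n0 - 1) / population_size)
    return ceil(n)

print(required_sample_size(10_000))
print(required_sample_size(10_000, margin_of_error=0.03))
```

Tightening the margin of error or raising the confidence level increases the required sample size, while the population size matters less once it is large.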

Common Forms of Inferential Statistics

There are three common forms of inferential statistics:

1. Hypothesis Tests.

Often we’re interested in answering questions about a population such as:

  • Is the percentage of people in Ohio in support of candidate A higher than 50%?
  • Is the mean height of a certain plant equal to 14 inches?
  • Is there a difference between the mean height of students at School A compared to School B?

To answer these questions we can perform a hypothesis test, which allows us to use data from a sample to draw conclusions about populations.
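As an illustration of the first question, the sketch below runs a one-sided z-test for a proportion using only the standard library. The survey result (540 of 1,000 respondents supporting candidate A) is invented, and the function name is hypothetical. The normal approximation used here assumes a reasonably large random sample.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_proportion_z_test(successes, n, p0):
    """Test H0: p = p0 against H1: p > p0 with a normal approximation."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # computed under H0
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)         # upper-tail probability
    return z, p_value

# Hypothetical survey: 540 of 1,000 respondents support candidate A.
z, p_value = one_sided_proportion_z_test(540, 1000, 0.5)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")

if p_value < 0.05:
    print("Reject H0: support appears to be above 50%.")
else:
    print("Fail to reject H0.")
```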

2. Confidence Intervals.

Sometimes we’re interested in estimating some value for a population. For example, we might be interested in the mean height of a certain plant species in Australia.

Instead of going around and measuring every single plant in the country, we might collect a small sample of plants and measure each one. Then, we can use the mean height of the plants in the sample to estimate the mean height for the population.

However, our sample is unlikely to provide a perfect estimate for the population. Fortunately, we can account for this uncertainty by creating a confidence interval, which provides a range of values that we’re confident the true population parameter falls in.

For example, we might produce a 95% confidence interval of [13.2, 14.8], which says we’re 95% confident that the true mean height of this plant species is between 13.2 inches and 14.8 inches.
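A confidence interval for a mean can be sketched as follows. The plant heights are made up, the function name is hypothetical, and the interval uses a normal (z) critical value for simplicity. With a sample this small, a t-distribution critical value would usually be preferred and would give a slightly wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean
    (normal approximation; an assumption made for simplicity)."""
    n = len(sample)
    sample_mean = mean(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev is the sample std. dev.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical plant heights in inches (illustration only).
heights = [13.1, 14.2, 13.8, 14.9, 13.5, 14.4,
           14.0, 13.7, 14.6, 13.9, 14.1, 13.6]

lower, upper = mean_confidence_interval(heights)
print(f"95% CI for the mean height: [{lower:.2f}, {upper:.2f}]")
```

Asking for more confidence, for example `confidence=0.99`, widens the interval, which is the trade-off between certainty and precision.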

3. Regression.

Sometimes we’re interested in understanding the relationship between two variables in a population.

For example, suppose we want to know if hours spent studying per week is related to test scores. To answer this question, we could perform a technique known as regression analysis.

So, we may observe the number of hours studied along with the test scores for 100 students and perform a regression analysis to see if there is a significant relationship between the two variables.

If the p-value for the regression coefficient falls below our chosen significance level (commonly 0.05), then we can conclude that there is a statistically significant relationship between these two variables in the overall population of students.
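The sketch below fits a simple least-squares line to invented data for eight students. To stay within the standard library, the p-value is estimated with a permutation test rather than the t-test that statistical packages typically report for a regression slope. That substitution, the function names, and the data are all assumptions made for illustration.

```python
import random
from statistics import mean

def least_squares(x, y):
    """Return (slope, intercept) of the line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided p-value for the slope, estimated by shuffling y
    to simulate 'no relationship' between x and y."""
    rng = random.Random(seed)
    observed = abs(least_squares(x, y)[0])
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(least_squares(x, shuffled)[0]) >= observed - 1e-12:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical data: hours studied per week and test scores.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [62, 65, 70, 71, 78, 80, 85, 88]

slope, intercept = least_squares(hours, scores)
p_value = permutation_p_value(hours, scores)
print(f"score = {intercept:.2f} + {slope:.2f} * hours, p-value = {p_value:.4f}")
```

A small p-value here means that shuffled data almost never produce a slope as steep as the observed one, which is the evidence of a relationship that the text describes.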

The Difference Between Descriptive and Inferential Statistics

In summary, the difference between descriptive and inferential statistics can be described as follows:

Descriptive statistics use summary statistics, graphs, and tables to describe a data set.

This is useful for helping us gain a quick and easy understanding of a data set without poring over all of the individual data values.

Inferential statistics use samples to draw inferences about larger populations.

Depending on the question you want to answer about a population, you may decide to use one or more of the following methods: hypothesis tests, confidence intervals, and regression analysis.

If you do choose to use one of these methods, keep in mind that your sample needs to be representative of your population, or the conclusions you draw will be unreliable.


Hey there. My name is Zach Bobbitt. I have a Masters of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.  My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.



Design and Analysis for Quantitative Research in Music Education


5 Inferential Analysis

  • Published: March 2018

Researchers often employ statistical techniques to test hypotheses and to express the relative certainty they have when making a claim about how statistics derived from their sample data might be representative of population parameters. This chapter illustrates the logic underlying inferential statistical tests. Inferential analysis involves a set of tools that music education researchers can use when posing scientific questions and seeking to refute their hypotheses. The chapter describes techniques that can be used for testing hypotheses and estimating population parameters on the basis of sample data. In doing so, the chapter emphasizes basic approaches to null hypothesis significance testing, interpreting effect sizes, and building confidence intervals. The chapter also provides a brief critique of null hypothesis significance testing as a tradition.

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide.

  • Copyright © 2024 Oxford University Press

COMMENTS

  1. Inferential Statistics

    Example: Inferential statistics. You randomly select a sample of 11th graders in your state and collect data on their SAT scores and other characteristics. You can use inferential statistics to make estimates and test hypotheses about the whole population of 11th graders in the state based on your sample data.

  2. Inferential Statistics

    Inferential statistics is a branch of statistics that involves making predictions or inferences about a population based on a sample of data taken from that population. It is used to analyze the probabilities, assumptions, and outcomes of a hypothesis. The basic steps of inferential statistics typically involve the following:

  3. What Is Inferential Statistics? (Definition, Uses, Example)

    What Is Inferential Statistics? Inferential statistics is the practice of using sampled data to draw conclusions or make predictions about a larger population. Inferential statistics help us draw conclusions about how a hypothesis will play out or determine a general parameter of that population.

  4. Quant Analysis 101: Inferential Statistics

    Inferential vs Descriptive. At this point, you might be wondering how inferential statistics differ from descriptive statistics. At the simplest level, descriptive statistics summarise and organise the data you already have (your sample), making it easier to understand. Inferential statistics, on the other hand, allow you to use your sample data to assess whether the patterns contained within it are ...

  5. Quantitative analysis: Inferential statistics

    Most inferential statistical procedures in social science research are derived from a general family of statistical models called the general linear model (GLM). A model is an estimated mathematical equation that can be used to represent a set of data, and linear refers to a straight line. Hence, a GLM is a system of equations that can be used ...

  6. Basic Inferential Statistics

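    The goal in classic inferential statistics is to test whether the data are inconsistent with the null hypothesis. The logic says that if the evidence contradicts the claim that the two groups are the same, then they are taken to be different. A low p-value indicates that results at least as extreme as those observed would be unlikely if the null hypothesis were true (thus providing evidence against the null hypothesis and in favor of the alternative hypothesis).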

  7. Inferential Statistics: Definition, Uses

    Inferential statistics use statistical models to help you compare your sample data to other samples or to previous research. Most research uses a family of statistical models called the Generalized Linear Model, which includes Student's t-tests, ANOVA (Analysis of Variance), regression analysis and various other models that result in straight-line ...

  8. Inferential Statistics: Definition, Types + Examples

    Inferential statistics is an important part of data analysis and research because it lets us make predictions and draw conclusions about whole populations based on data from a small sample. It is a complicated and advanced field that requires careful thought about assumptions and data quality, but it can give important research ...

  9. Inferential Statistics

    Bivariate analysis is analyzing two variables together. An example of a univariate analysis would be simply looking at the death rate (mortality) in different countries. ... Practice and Research, 2019. Inferential Statistics. The other method for analyzing data is through inferential statistics. Used to make interpretations about a set of data ...

  10. Basics of statistics for primary care research

    Inferential statistics are another broad category of techniques that go beyond describing a data set. Inferential statistics can help researchers draw conclusions from a sample to a population. We can use inferential statistics to examine differences among groups and the relationships among variables.

  11. Inferential Statistics

    Most of the major inferential statistics come from a general family of statistical models known as the General Linear Model. This includes the t-test, Analysis of Variance (ANOVA), Analysis of Covariance (ANCOVA), regression analysis, and many of the multivariate methods like factor analysis, multidimensional scaling, cluster analysis ...

  12. Inferential Statistics Explained

    Inferential statistics, including regression analysis and hypothesis testing, underpin many predictive modeling techniques and help ensure their reliability and validity. Experimental Design: In many data science projects, especially in fields like healthcare and marketing, experimental design is critical for conducting controlled experiments ...

  13. Introduction to Research Statistical Analysis: An Overview of the

    Introduction. Statistical analysis is necessary for any research project seeking to make quantitative conclusions. The following is a primer for research-based statistical analysis. It is intended to be a high-level overview of appropriate statistical testing, while not diving too deep into any specific methodology.

  15. Basic statistical tools in research and data analysis

    Inferential statistics. In inferential statistics, data from a sample are analysed to make inferences about the larger population. The purpose is to answer or test the hypotheses. A hypothesis (plural hypotheses) is a proposed explanation for a phenomenon.

  16. Inferential Statistics

    Some important formulas used in inferential statistics for regression analysis are as follows: Regression Coefficients: The straight line equation is given as $y = \alpha + \beta x$, where $\alpha$ and $\beta$ are regression coefficients. $\beta = \frac{\sum_{1}^{n}(x-\bar{x})(y-\bar{y})}{\sum_{1}^{n}(x-\bar{x})^{2}}$

  17. Descriptive and Inferential Statistics

    When analysing data, such as the marks achieved by 100 students for a piece of coursework, it is possible to use both descriptive and inferential statistics in your analysis of their marks. Typically, in most research conducted on groups of people, you will use both descriptive and inferential statistics to analyse your results and draw ...

  18. Chapter 15 Quantitative Analysis Inferential Statistics

    Inferential statistics are the statistical procedures that are used to reach conclusions about associations between variables. They differ from descriptive statistics in that they are explicitly designed to test hypotheses. ... and is widely used in contemporary social science research. Time series analysis is a technique for analyzing time ...

  19. PDF Basic Principles of Statistical Inference

    Three Modes of Statistical Inference. Descriptive Inference: summarizing and exploring data; inferring "ideal points" from rollcall votes; inferring "topics" from texts and speeches; inferring "social networks" from surveys. Predictive Inference: forecasting out-of-sample data points; inferring future state failures from past failures ...

  20. Come to the Right Conclusion with Inferential Analysis

    Inferential statistical analysis is the method that will be used to draw the conclusions. It allows users to infer or conclude trends about a larger population based on the samples that are analyzed. Basically, it takes data from a sample and then makes conclusions about a larger population or group.

  21. Descriptive vs. Inferential Statistics: What's the Difference?

    Descriptive statistics use summary statistics, graphs, and tables to describe a data set. This is useful for helping us gain a quick and easy understanding of a data set without poring over all of the individual data values. Inferential statistics use samples to draw inferences about larger populations.

  22. PDF What is Inferential Statistics?

    Inferential statistics use statistical models to help you compare your sample data to other samples or to previous research. Most research uses a family of statistical models called the Generalized Linear Model, which includes Student's t-tests, ANOVA (Analysis of Variance), regression analysis and various other models that result in straight-line ("linear ...

  23. Inferential Analysis

    Inferential analysis involves a set of tools that music education researchers can use when posing scientific questions and seeking to refute their hypotheses. The chapter describes techniques that can be used for testing hypotheses and estimating population parameters on the basis of sample data.

  24. DATA ANALYSES.pdf

    In all quantitative research questions or hypotheses, we study individuals sampled from a population. However, in descriptive questions, we study a single variable at a time; in inferential analysis, we analyze multiple variables at the same time. Also, from comparing groups or relating variables, ...

  25. How to perform statistical data analysis in MS Excel

    Enable the Analysis ToolPak from the "Add-ins" menu to use the "Data Analysis" tool. Select the type of statistical analysis you want to perform (e.g., Descriptive Statistics, Regression). Input the ...

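
The regression entry in the list above describes a straight-line model, y = α + βx, with the slope β computed from sums of deviations around the means. A minimal sketch of that calculation follows. The function name `fit_line` and the data are hypothetical, the slope's denominator is assumed to be the usual least-squares sum of squared x deviations, and the intercept formula α = ȳ − βx̄ is the standard least-squares result rather than something stated in the snippet.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares intercept (alpha) and slope (beta) for y = alpha + beta * x."""
    x_bar, y_bar = mean(xs), mean(ys)

    # Numerator: sum of cross-products of deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator (assumed): sum of squared deviations of x.
    sxx = sum((x - x_bar) ** 2 for x in xs)

    beta = sxy / sxx
    alpha = y_bar - beta * x_bar  # standard least-squares intercept
    return alpha, beta


# Hypothetical data lying exactly on the line y = 1 + 2x.
alpha, beta = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(alpha, beta)
```

With real sample data the points will not fall exactly on a line, and the fitted coefficients are estimates of the population values, which is where hypothesis tests and confidence intervals for β come in.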