
Statistics By Jim

Making statistics intuitive

Nonparametric Tests vs. Parametric Tests

By Jim Frost

Nonparametric tests don’t require that your data follow the normal distribution. They’re also known as distribution-free tests and can provide benefits in certain situations. Typically, people who perform statistical hypothesis tests are more comfortable with parametric tests than nonparametric tests.

You’ve probably heard it’s best to use nonparametric tests if your data are not normally distributed—or something along these lines. That seems like an easy way to choose, but there’s more to the decision than that.

In this post, I’ll compare the advantages and disadvantages to help you decide between using the following types of statistical hypothesis tests:

  • Parametric analyses to assess group means
  • Nonparametric analyses to assess group medians (sometimes)

In particular, I’d like you to focus on one key reason to perform a nonparametric test that doesn’t get the attention it deserves! If you need a primer on the basics, read my hypothesis testing overview.

Related Pairs of Parametric and Nonparametric Tests

Nonparametric tests are a shadow world of parametric tests. In the table below, I show linked pairs of statistical hypothesis tests.

Parametric test (means): Nonparametric counterpart (medians)

  • 1-sample t-test: 1-sample Sign test, 1-sample Wilcoxon test
  • 2-sample t-test: Mann-Whitney test
  • One-Way ANOVA: Kruskal-Wallis test, Mood’s median test
  • Factorial DOE with a factor and a blocking variable: Friedman test

Additionally, Spearman’s correlation is a nonparametric alternative to Pearson’s correlation. Use Spearman’s correlation for nonlinear, monotonic relationships and for ordinal data. For more information, read my post Spearman’s Correlation Explained!
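As a quick illustration (with hypothetical simulated data), Spearman’s correlation captures a monotonic but strongly nonlinear relationship that Pearson’s correlation understates:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.uniform(0, 5, 200)
y = x**3 + rng.normal(0, 0.1, 200)  # monotonic but strongly nonlinear

# Pearson measures linear association; Spearman measures monotonic association
pearson_r, _ = stats.pearsonr(x, y)
spearman_r, _ = stats.spearmanr(x, y)
print(f"Pearson r:  {pearson_r:.3f}")
print(f"Spearman r: {spearman_r:.3f}")
```

Because the relationship is almost perfectly monotonic, Spearman’s coefficient is near 1, while Pearson’s is noticeably lower.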

For this topic, it’s crucial you understand the concept of robust statistical analyses. Learn more in my post, What are Robust Statistics?

Advantages of Parametric Tests

Advantage 1: Parametric tests can provide trustworthy results with distributions that are skewed and nonnormal

Many people aren’t aware of this fact, but parametric analyses can produce reliable results even when your continuous data are nonnormally distributed. You just have to be sure that your sample size meets the requirements for each analysis in the table below. Simulation studies have identified these requirements. Read here for more information about these studies .

  • 1-sample t-test: Greater than 20 observations
  • 2-sample t-test: Each group should have more than 15 observations
  • One-Way ANOVA: For 2-9 groups, each group should have more than 15 observations; for 10-12 groups, each group should have more than 20 observations

You can use these parametric tests with nonnormally distributed data thanks to the central limit theorem. For more information about it, read my post: Central Limit Theorem Explained.
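The central limit theorem is easy to see in a quick simulation (hypothetical data): means of samples drawn from a strongly skewed population are far less skewed than the population itself.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
population = rng.exponential(scale=1.0, size=100_000)  # strongly right-skewed

# Sampling distribution of the mean for samples of n = 30
sample_means = rng.choice(population, size=(10_000, 30)).mean(axis=1)

print(f"population skewness:  {stats.skew(population):.2f}")
print(f"sample-mean skewness: {stats.skew(sample_means):.2f}")
```

The skewness of the sample means is a fraction of the population’s skewness, which is why the t-based tests remain trustworthy at these sample sizes.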

Related posts: The Normal Distribution and How to Identify the Distribution of Your Data

Advantage 2: Parametric tests can provide trustworthy results when the groups have different amounts of variability

It’s true that nonparametric tests don’t require data that are normally distributed. However, nonparametric tests have the disadvantage of an additional requirement that can be very hard to satisfy. The groups in a nonparametric analysis typically must all have the same variability (dispersion). Nonparametric analyses might not provide accurate results when variability differs between groups.

Conversely, parametric analyses, like the 2-sample t-test or one-way ANOVA, allow you to analyze groups with unequal variances. In most statistical software, it’s as easy as checking the correct box! You don’t have to worry about groups having different amounts of variability when you use a parametric analysis.
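In Python’s scipy, for instance, the “checkbox” is the `equal_var` argument: setting it to `False` runs Welch’s t-test, which does not assume equal variances (the data below are hypothetical).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
group_a = rng.normal(10, 1, 40)   # small spread
group_b = rng.normal(15, 5, 40)   # much larger spread

# equal_var=False requests Welch's t-test (no equal-variance assumption)
t, p = stats.ttest_ind(group_a, group_b, equal_var=False)
print(f"Welch's t = {t:.2f}, p = {p:.4g}")
```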

Related post: Measures of Variability

Advantage 3: Parametric tests have greater statistical power

In most cases, parametric tests have more power. If an effect actually exists, a parametric analysis is more likely to detect it.
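A small simulation (hypothetical parameters) illustrates the power gap on normally distributed data: across many repeated experiments with a real shift between groups, the t-test rejects the null slightly more often than the Mann-Whitney test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, n, shift = 2000, 15, 1.0
t_hits = mw_hits = 0
for _ in range(n_sims):
    a = rng.normal(0, 1, n)
    b = rng.normal(shift, 1, n)  # a true effect exists
    t_hits += stats.ttest_ind(a, b).pvalue < 0.05
    mw_hits += stats.mannwhitneyu(a, b, alternative="two-sided").pvalue < 0.05

print(f"t-test power:       {t_hits / n_sims:.2f}")
print(f"Mann-Whitney power: {mw_hits / n_sims:.2f}")
```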

Related post: Statistical Power and Sample Size

Advantages of Nonparametric Tests

Advantage 1: Nonparametric tests assess the median, which can be better for some study areas

Now we’re coming to my preferred reason for when to use a nonparametric test. The one that practitioners don’t discuss frequently enough!

For some datasets, nonparametric analyses provide an advantage because they assess the median rather than the mean. The mean is not always the better measure of central tendency for a sample. Even though you can perform a valid parametric analysis on skewed data, that doesn’t necessarily equate to being the better method. Let me explain using the distribution of salaries.

Salaries tend to follow a right-skewed distribution. The majority of wages cluster around the median, which is the point where half are above and half are below. However, a long tail stretches into the higher salary ranges and pulls the mean away from the central median value. The two distributions below are typical of salaries.

Two right skewed distributions that have equal medians but different means.

In these distributions, if several very high-income individuals join the sample, the mean increases by a significant amount despite the fact that incomes for most people don’t change. They still cluster around the median.

In this situation, parametric and nonparametric tests can give you different results, and both can be correct! For the two distributions, if you draw a large random sample from each population, the difference between the means is statistically significant. Despite this, the difference between the medians is not statistically significant. Here’s how this works.

For skewed distributions , changes in the tail affect the mean substantially. Parametric tests can detect this mean change. Conversely, the median is relatively unaffected, and a nonparametric analysis can legitimately indicate that the median has not changed significantly.
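Here’s a hypothetical simulation of the salary scenario: two lognormal “populations” with the same median, where one has a heavier right tail that inflates its mean. The t-test flags the mean difference, while the Mann-Whitney test typically does not flag a difference because the medians match.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 2000
# Both groups have median exp(0) = 1, but B's heavier tail raises its mean
salaries_a = rng.lognormal(mean=0.0, sigma=0.5, size=n)
salaries_b = rng.lognormal(mean=0.0, sigma=1.0, size=n)

print(f"medians: {np.median(salaries_a):.2f} vs {np.median(salaries_b):.2f}")
print(f"means:   {salaries_a.mean():.2f} vs {salaries_b.mean():.2f}")

t_p = stats.ttest_ind(salaries_a, salaries_b, equal_var=False).pvalue
mw_p = stats.mannwhitneyu(salaries_a, salaries_b, alternative="two-sided").pvalue
print(f"t-test p (means): {t_p:.4g}")   # significant mean difference
print(f"Mann-Whitney p:   {mw_p:.4g}")  # typically not significant
```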

You need to decide whether the mean or median is best for your study and which type of difference is more important to detect.

Related posts: Determining which Measure of Central Tendency is Best for Your Data and Median: Definition and Uses

Advantage 2: Nonparametric tests are valid when your sample size is small and your data are potentially nonnormal

Use a nonparametric test when your sample size isn’t large enough to satisfy the requirements in the table above and you’re not sure that your data follow the normal distribution. With small sample sizes, be aware that normality tests can have insufficient power to produce useful results.
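For instance, in a simulation (hypothetical setup), the Shapiro-Wilk normality test frequently fails to flag samples of n = 10 drawn from a clearly skewed exponential distribution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n_sims, n = 2000, 10
# Count how often the test detects nonnormality in small skewed samples
rejections = sum(
    stats.shapiro(rng.exponential(size=n)).pvalue < 0.05
    for _ in range(n_sims)
)
print(f"Shapiro-Wilk flagged nonnormality in {rejections / n_sims:.0%} of runs")
```

Even with data from a genuinely nonnormal population, a substantial fraction of these small samples “pass” the normality test.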

This situation is difficult. Nonparametric analyses tend to have lower power at the outset, and a small sample size only exacerbates that problem.

Advantage 3: Nonparametric tests can analyze ordinal data, ranked data, and outliers

Parametric tests can analyze only continuous data, and their findings can be unduly affected by outliers. Conversely, nonparametric tests can also analyze ordinal and ranked data, and they are not tripped up by outliers. Learn more about Ordinal Data: Definition, Examples & Analysis.

Sometimes you can legitimately remove outliers from your dataset if they represent unusual conditions. However, sometimes outliers are a genuine part of the distribution for a study area, and you should not remove them.

You should verify the assumptions for nonparametric analyses because the various tests can analyze different types of data and have differing abilities to handle outliers.

If you’re using a Likert scale and you want to compare two groups, read my post about which analysis you should use to analyze Likert data.
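As a sketch (with made-up Likert responses), the Mann-Whitney test compares two groups of ordinal ratings directly, with no need to pretend the 1-5 codes are continuous measurements:

```python
import numpy as np
from scipy import stats

# Hypothetical 5-point Likert responses for two groups of 30 (assumed data)
group_a = np.array([1, 2, 2, 3, 3, 3, 3, 4, 4, 5] * 3)
group_b = np.array([2, 3, 3, 4, 4, 4, 4, 5, 5, 5] * 3)

# Mann-Whitney works on ranks, so ordinal codes are fine as-is
u, p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u}, p = {p:.4f}")
```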

Related posts: Data Types and How to Use Them and 5 Ways to Find Outliers in Your Data

Advantages and Disadvantages of Parametric and Nonparametric Tests

Many people believe that choosing between parametric and nonparametric tests depends on whether your data follow the normal distribution. If you have a small dataset, the distribution can be a deciding factor. However, in many cases, this issue is not critical because of the following:

  • Parametric analyses can analyze nonnormal distributions for many datasets.
  • Nonparametric analyses have other firm assumptions that can be harder to meet.

The answer is often contingent upon whether the mean or median is a better measure of central tendency for the distribution of your data.

  • If the mean is a better measure and you have a sufficiently large sample size, a parametric test usually is the better, more powerful choice.
  • If the median is a better measure, consider a nonparametric test regardless of your sample size.

Lastly, if your sample size is tiny, you might be forced to use a nonparametric test. It would make me ecstatic if you collect a larger sample for your next study! As the table shows, the sample size requirements aren’t too large. If you have a small sample and need to use a less powerful nonparametric analysis, it doubly lowers the chance of detecting an effect.

If you’re learning about hypothesis testing and like the approach I use in my blog, check out my Hypothesis Testing book! You can find it at Amazon and other retailers.




Reader Interactions


June 10, 2024 at 5:24 pm

Thanks so much for this website. I have data for 8 subjects, with 4 groups of data. I would consider this a one-way repeated measures ANOVA. However, the data are non-normal, and in some cases the variances are unequal. From my understanding, I think I should use a Friedman’s test (non-parametric version of repeated-measures ANOVA). I came to this conclusion because Welch’s ANOVA can be used for unequal variances but requires normal distribution (at least for small samples sizes). Kruskal-Wallis did not seem to be appropriate, as it assumes similar distributions. Do you agree, or do you have a different suggestion? If I use Friedman’s test, do you have any thoughts about post-hoc tests, if necessary?

Thank you, Anita


June 12, 2024 at 9:01 pm

You’re very welcome for the website! It makes my day hearing that it was helpful!

Yes, your scenario does sound like it requires Friedman’s Test. I agree with you on that.

As for post-hoc tests, consider using the Nemenyi test after Friedman’s test.
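A minimal sketch of Friedman’s test in Python (the data here are hypothetical; scipy does not include the Nemenyi post-hoc, which is available in third-party packages such as scikit-posthocs):

```python
import numpy as np
from scipy import stats

# Hypothetical: 8 subjects each measured under 4 conditions (assumed data)
rng = np.random.default_rng(11)
subject_effect = rng.normal(0, 2, size=(8, 1))          # between-subject variation
data = subject_effect + rng.normal(0, 1, size=(8, 4)) + np.array([0, 0, 1, 3])

# Friedman ranks within each subject, so the subject effect cancels out
stat, p = stats.friedmanchisquare(*data.T)
print(f"Friedman chi-square = {stat:.2f}, p = {p:.4f}")
```

If the result is significant, a post-hoc test such as Nemenyi’s identifies which conditions differ.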

Best wishes on your analysis! 🙂 Jim


May 18, 2024 at 3:18 am

Thank you for your insightful response. I appreciate your detailed explanations and guidance. I followed your guidance and found that the groups do not have the same shaped distribution, making it hard to justify using a nonparametric test. I used arithmetic mean return data, but later I learned that it’s better to use log returns. After using log returns, I did not get contradictory results: both parametric and nonparametric tests produce similar results. My sample size is large; I selected 10 years of daily closing prices for 360 companies.

May 15, 2024 at 3:09 am

How should I interpret contradictory results from parametric and non-parametric tests when my data is not normally distributed?

I’m conducting a profitability analysis using ROA and ROE portfolios. I have applied both parametric tests (t-tests and ANOVA) and non-parametric tests (Kruskal-Wallis H-test) to compare the returns across different portfolios. Here are my findings:

  • Parametric tests (t-tests and ANOVA): The results indicate no statistically significant difference in mean returns across the portfolios.
  • Non-parametric test (Kruskal-Wallis H-test): The results show a significant difference in the distribution of returns across the portfolios (p-values < 0.05).

Given that my data does not meet the normality assumption required for parametric tests, I am inclined to rely on the non-parametric test results. However, I’m concerned about presenting seemingly contradictory results in my thesis.

My questions:

  1. Which test results should I prioritize given the non-normal distribution of my data?
  2. How can I justify the choice of non-parametric tests over parametric tests in my analysis?
  3. Are there any authoritative references or guidelines that discuss the appropriateness of using non-parametric tests in such situations?

Any advice or references would be greatly appreciated!

May 15, 2024 at 2:59 pm

There are several complicating issues relating to your case. I discuss them in this article, so I’d encourage you to reread the points I indicate.

What’s your sample size? If you have a sufficiently large sample, normality is not an issue, particularly when your data are not extremely skewed. Read Advantage 1 in the Advantages of Parametric Tests section about how parametric tests can handle nonnormal data. If your sample size is lower than the guidelines I list, you’d probably want to use a nonparametric test.

Have you graphed the data to see what they look like for each group? Do they have the same shaped distribution? A nonparametric test will be valid for your data, but it only tests the median under the specific condition that all groups in your data have the same shaped distribution (which can be nonnormal). That’s a hard and fast assumption that you can’t get around with a large sample size. Read Advantage 2 in the Advantages of Parametric Tests section for more about this issue. Also, my Kruskal-Wallis article discusses it in a bit more detail and explains what you can conclude if the groups don’t have the same shaped distribution.

It’s interesting that you’ve received different results. I would lean towards the nonparametric results, but with the caveat that to interpret them correctly you need to look at the distribution shapes for your groups and see if they’re the same or different. The results are valid either way, but you need to know how to interpret the significant results. Nonparametric tests are designed so they don’t require a specific distribution, so you won’t have any problem justifying their usage.


February 8, 2024 at 6:54 am

I am trying to compare two groups with 19 observations each. Some of the dependent variables are normally distributed, some are not. I am thinking of using a t-test for those that are normally distributed and the Mann-Whitney U test for those that are not. What do you think of mixing statistical analyses within the same study but for different hypotheses? I was also trying to transform the data, but I have z-scores (so negative values), so logarithmic and square root transformations won’t work. What do you think? Any help is highly appreciated.

Greets, Lisa

February 8, 2024 at 7:28 pm

Your approach seems generally reasonable. Mixing methods isn’t a problem. Just be sure to explain why. Ideally, you explain and describe the analysis phase process before you analyze any data so you can avoid cherry picking.

A couple of caveats.

You can use nonparametric tests with nonnormal data. However, these tests evaluate the median only if the various groups follow the same distribution (which can be nonnormal). I explain the reasoning for that limitation in more detail in my posts specifically about the Mann-Whitney U and Kruskal-Wallis tests. If the distributions are not the same, you can’t draw conclusions about the median specifically, but you can say that one distribution tends to have higher values.

If you have the raw data, those might not have negative values, allowing you to use those transformations. Alternatively, try the Box-Cox transformation, which can handle negative values.

I hope that helps! Best of luck with your project!


October 25, 2023 at 9:49 am

Many thanks for your article(s), but I would slightly disagree with you regarding the median test in non-parametric analyses. For example, the MW-U test does not, without further assumptions, test for differences between the medians of the two groups! As you can see in the example here ( https://stats.oarc.ucla.edu/other/mult-pkg/faq/general/faq-why-is-the-mann-whitney-significant-when-the-medians-are-equal/ ), the medians may truly be 0 and 0, but the U test indicates significance. But what is significant then? It cannot be the difference of the medians. So, it rather tests the stochastic superiority of one group over the other. Under the very strict additional assumptions that both distributions have the same form and the same spread and differ only by a location shift, we could conclude that the medians are different (but then this is also true for the means). Since non-parametric tests do not test the same hypotheses as roughly comparable parametric tests, I wonder how good it is to see them as alternatives. They do not answer the same questions.

All the best from Germany, Rainer

October 25, 2023 at 3:23 pm

We actually agree on that completely. Notice in this post I specifically indicate that nonparametric tests assess the medians only sometimes–in very specific cases. I also highlight the strict assumption about groups having the same distribution shapes as a negative for nonparametric tests. Parametric tests like the t-test and ANOVA have versions (Welch’s) that can use group distributions with different shapes/spreads. That’s one reason for preferring parametric tests over nonparametric. I call all that out in this post (so probably read a bit more carefully).

Click the links for the specific analyses and I discuss the issues you mention in greater detail. I don’t go into depth about stochastic superiority and related issues in this post because it’s more of an overview comparing nonparametric to parametric. But I cover all that in the more specific posts about each analysis.

I appreciate the comment. You’re right on. But please read my posts a bit more carefully to realize that I already address and agree with those points. Thanks!


October 14, 2023 at 7:16 pm

Thank you very much for your effort, it is an invaluable website.

I have a question: in the table, you provided the sample sizes needed for normality not to be a concern with one-way ANOVA: for 2-9 groups, each group should have more than 15 observations; for 10-12 groups, each group should have more than 20 observations.

However, I am conducting a mixed between-within subjects ANOVA with three groups and 12 repeated measures. How many students should be included in each group so that I can consider my data normally distributed, according to the central limit theorem?

I would appreciate a reply

October 17, 2023 at 4:24 pm

Unfortunately, I’m not really sure. The numbers I use that you quote are based on simulation studies for those specific conditions. I’m not aware of studies that have looked at more complex models. And your model is more complex in that it is a mixed model and has a number of repeated measures. Another consideration is having a sufficient number of subjects to avoid overfitting your model. If each repeated measure is a new condition that you’ll be estimating, you’d need say at least 150 subjects for that part of it alone (i.e., avoiding overfitting). I’d also guess that would be sufficient for handling the normality issue as well. However, I’d take that as an absolute minimum number and shoot for more if possible.

But take that with a grain of salt because I’m not going by any published studies that have simulated those conditions.

Given the complexity of your study, you might want to consult with a statistician at your institution who can devote the time necessary that your study deserves.

Best of luck with your study! I’d be very interested in hearing from you about what you ultimately go with. 🙂


July 6, 2023 at 5:31 pm

Hi Jim! Thank you for creating this great website. I have never found it so easy to understand such a complex topic as statistics. My query is: could you please write more about how to interpret the Friedman test and all the related important terminology (whether it is the right test, when to use it, etc.), just like you have discussed every other topic? I shall be grateful to you. Honestly, even my uni stats team was not able to explain it to me as easily as the way you have taught the concepts of statistics.

July 7, 2023 at 3:02 am

Thanks so much for writing. I’m so glad to hear that my website has been helpful!

I will definitely write about the Friedman test. I put it on my list!

Be sure to join my email list if you haven’t already so you’ll know when it’s published. It might be several weeks.


January 8, 2023 at 7:00 pm

Dear Jim, Many thanks for your great work!


October 12, 2022 at 10:43 am

Hello, my research has a pretest, a posttest, and a delayed posttest. I have 2 groups (control and treatment) of 10 participants each. Based on your page, you mentioned that it’s possible to carry out parametric tests if there are more than 20 participants: 1. Does that mean that if I have 20 participants, that is not enough to carry out a parametric test, given that my data is not normally distributed? Can I carry out a paired samples t-test, or should I just use the Wilcoxon Signed Rank Test? 2. How do I go about testing the delayed posttest? The reason I am doing a delayed posttest is to ensure the reliability of the posttest results.


September 26, 2022 at 3:47 am

Thanks a lot for your valuable website and information. If I used 10 animals as a sample size and I have a high partial eta squared, can I apply parametric ANOVA? Thanks again.

September 26, 2022 at 8:31 pm

If you have a sample size of 10 per group and you are sure they follow a normal distribution, you can use parametric ANOVA. Your sample size is small, which means you must satisfy the normality assumption.


July 16, 2022 at 7:06 pm

Thanks for your article. I am confronted with a similar situation where I have 4 conditions (20 subjects per condition, one of which is a control group). I see that this meets the 15 subjects requirement for 2-9 groups but what I want to know is, when would you consider the data to be extremely skewed and unfit for parametric analysis?

Any thresholds to determine “extreme skew”?

July 20, 2022 at 1:03 am

That’s kind of a trick question because there is no clear-cut dividing line. In most cases, you’ll be fine given the number of subjects per group. If you really want to check, you can do a resampling method to see what kind of distribution it produces. Does the resulting distribution look fairly normal? To see what I’m talking about, read my post about the central limit theorem . I show examples of sampling distributions that do and do not converge on the normal distribution for different distributions and sample sizes. You can try it with your data to see what it looks like. There’s no statistical test but if your sampling distribution looks fairly normal, you’re safe.
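One way to do the resampling check described above (a sketch with hypothetical data): bootstrap the group mean many times and inspect the skewness of the resulting distribution of means.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Stand-in for a skewed sample of 20 observations from one group (assumed data)
sample = rng.exponential(scale=2.0, size=20)

# Resample the mean 5000 times with replacement (a simple bootstrap)
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(5000)
])
print(f"skewness of original sample:  {stats.skew(sample):.2f}")
print(f"skewness of bootstrap means:  {stats.skew(boot_means):.2f}")
```

If the bootstrap distribution of the mean looks fairly symmetric and bell-shaped, that’s informal evidence the parametric test is safe for your data.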


June 13, 2022 at 3:46 pm

Hello Jim, thank you for this article. I have a problem, the images don’t load.

June 13, 2022 at 4:04 pm

Hi Filip, I just checked and I’m seeing the image in this post with no problem (there’s only one image).


March 18, 2022 at 8:34 pm

I recently discovered your site and it is extremely helpful. Thanks! I have been struggling to figure out how to report data. Say I am analyzing the response to a medication in 3 groups of patients, looking at response vs. blood concentrations of the drug. I am trying to come up with a reference range that says: these patients will respond when in the following blood concentrations (e.g., 5-25 mcg/mL). My total n = ~900 patients. One group has ~80 patients (responders), one has ~600 (partial responders), and one has ~150 (non-responders). The data is not normally distributed based on several normality tests. In order to establish the reference range, I need to capture the central 95% of patient blood concentrations in the responders group (ideally just those that fully respond, but also those that fully respond + partially respond). If mean +/- 2 SD is used, I end up at a negative blood concentration, which obviously isn’t possible. However, if I use the median, the box-and-whisker plot seems to capture a good range and indicates outliers. Is this latter way the correct way to go?

I hope this makes sense


March 14, 2022 at 11:17 am

Thank you so much for your reply.

March 5, 2022 at 10:09 am

Hello Jim, Thank you so much for writing wonderful articles. Your articles helped me a lot to understand statistics, and they are making my data analysis easier. I would be very grateful for some suggestions regarding my sample sizes in ANOVA and parametric analysis. I have 4 groups of sizes 50, 16, 54, and 70, respectively. I checked their distributions; they don’t follow normal distributions. I did ordinary one-way ANOVA or Welch’s ANOVA depending on the difference in their SD values. Among these 4 groups, the first is the control group, the 2nd is an experimental group comprising two types of patients, and the remaining two groups are the two patient types that make up the 2nd group. Am I doing the right form of analysis?

March 6, 2022 at 3:04 pm

Hi Sanjeda,

If you look at the table in this post, you’ll see that when you use one-way ANOVA and have 2-9 groups, you typically don’t need to worry about normality when each group has at least 15 observations. You have four groups, and all of them have at least 15 observations, although one group, at 16, is very close to that minimum. I think you’re safe using one-way ANOVA unless your data are extremely skewed.

However, because you have unequal sample sizes across your groups, the equal variances assumption is particularly relevant. If your variances are not equal, definitely use Welch’s ANOVA . If they’re roughly equal, the regular one-way ANOVA should be fine.
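For reference, Welch’s ANOVA can be computed directly from the group summaries. This `welch_anova` helper is a hypothetical sketch of the standard formula (packages such as pingouin provide tested implementations):

```python
import numpy as np
from scipy import stats

def welch_anova(*groups):
    """Welch's one-way ANOVA for groups with unequal variances (sketch)."""
    k = len(groups)
    n = np.array([len(g) for g in groups])
    means = np.array([np.mean(g) for g in groups])
    w = n / np.array([np.var(g, ddof=1) for g in groups])  # weights n_i / s_i^2
    grand = np.sum(w * means) / np.sum(w)                   # weighted grand mean
    numer = np.sum(w * (means - grand) ** 2) / (k - 1)
    tmp = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    denom = 1 + 2 * (k - 2) / (k**2 - 1) * tmp
    f = numer / denom
    df2 = (k**2 - 1) / (3 * tmp)                            # Welch's adjusted df
    return f, stats.f.sf(f, k - 1, df2)

# Tiny made-up example with one clearly different, higher-variance group
a = [1, 2, 3, 4, 5]
b = [2, 3, 4, 5, 6]
c = [10, 12, 14, 16, 18]
f, p = welch_anova(a, b, c)
print(f"Welch's F = {f:.2f}, p = {p:.4f}")
```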

If you have significant results, you should perform a post hoc analysis to see which groups are different. Because you have a control and treatment group, I recommend Dunnett’s method. Click the link to learn about them and I include an example that uses Dunnett’s.

I don’t know what you mean when you say that some groups are made up of two types of patients. In a one-way ANOVA, all subjects should be a random sample from the same population, and the primary difference between groups should be the grouping variable in your ANOVA, which is the experimental group in your case.


December 19, 2021 at 2:20 pm

Hi Jim, I found the solution. I’m going to do an ordinal logistics regression analysis! I just wanted to let you know so you have more time to answer other questions. Thank you!

December 21, 2021 at 12:58 am

Sorry for the delay in replying but I’m glad you found your answer. One thing I wasn’t sure about from your original question was about your IVs and DV. Ordinal logistic regression is a good choice when your DV is ordinal, like Likert scale data. However, I’m not sure what variables you’ll use as IVs? If they’re also ordinal, then you’ll need to enter them either as continuous or categorical. Ordinal has characteristics of both, but you’ll have to choose one or the other for each IV. Although, you don’t have to make the same decision for all IVs. The correct choice depends on the nature and amount of your data along with the goals of your study.

December 17, 2021 at 8:38 pm

Below you can find the survey question that tried to measure the impact of cognitive biases induced by marketing messages on consumer decision making to purchase in e-commerce.

“When you shop online, which one of these sales aspects impacts your decision-making to purchase?”

  • Stock availability: (1) Not at all (2) Rarely impacted (3) Sometimes impacted (4) Usually impacted (5) Highly
  • Reviews of people: (1) Not at all (2) Rarely impacted (3) Sometimes impacted (4) Usually impacted (5) Highly
  • Countdown timer: (1) Not at all (2) Rarely impacted (3) Sometimes impacted (4) Usually impacted (5) Highly
  • Nr. of likes: (1) Not at all (2) Rarely impacted (3) Sometimes impacted (4) Usually impacted (5) Highly

(Likert scale)

Based on the existing literature and other online sources, the following marketing messages are used to induce cognitive biases in consumers in e-commerce. Each marketing message manifests a cognitive bias:

  • Stock availability -> Scarcity bias
  • Reviews -> Bandwagon effect
  • Countdown timer -> Loss aversion
  • Nr. of likes on the product -> Bandwagon effect

390 people responded to the survey, and my hypotheses are as follows:

  • H0: Stock availability has no impact on consumer decision making to purchase. H1: Stock availability impacts consumer decision making to purchase.
  • H0: Reviews of people have no impact on consumer decision making to purchase. H1: Reviews of people impact consumer decision making to purchase.
  • H0: Countdown timer has no impact on consumer decision making to purchase. H1: Countdown timer impacts consumer decision making to purchase.
  • H0: Nr. of likes on a product has no impact on consumer decision making to purchase. H1: Nr. of likes on a product impacts consumer decision making to purchase.

Given this information: 1. What kind of hypothesis test should I use?

PS: Sorry for my long comment; I tried to be as clear as possible.

Thank you in advance!


December 5, 2021 at 6:59 am

Hi Jim, you mentioned the sample size requirements above for nonnormal data. How did you arrive at those requirements: on a practical basis, or from some reference?

December 5, 2021 at 10:51 pm

Those sample size requirements come from a simulation study conducted by some smart people I used to work with. You’ll find a link to it in the Advantage #1 section under Advantages of Parametric Tests. Click the link to read the study.


July 24, 2021 at 12:04 pm

Thank you so much for your prompt reply. Indeed your answers have reassured me! In short, one should not outright reject the application of parametric approaches when the data are non-normally distributed. They are still accurate and valid under that condition. The keyword here is the “robustness” of the parametric approach even when it is used to analyse highly skewed data. Robustness here of course relies on several factors such as the sample size, the confidence level set, or the p-value. Am I right?

By the way, I feel reluctant to use Spearman rank correlation although my data (both variables continuous) are not normally distributed. Many articles and experts say we should use Spearman in this case, but I feel unsure because Spearman, by its name and the intention of the analysis, should be used on ranked/ordinal data, like Likert scale data. However, as I mentioned, many scholars, not just several, recommend applying Spearman to continuous or interval data (not ranked). I am confused, as I have read some articles (from old to new) suggesting that Spearman is strictly meant for analysing ranked data. Therefore my question is: should we really be concerned about the data type when deciding whether to use Spearman correlation?

Thanks in advance

July 25, 2021 at 12:40 am

Hi Gabriel,

Robustness indicates that the test performs satisfactorily even with non-normal data. Specifically, all hypothesis tests have a Type I error rate. That’s basically a false positive. Imagine the null hypothesis is true. You perform a hypothesis test, get a low p-value, reject the null hypothesis, and conclude that the effect/relationship exists in the population. In our thought experiment, we know that the test result is incorrect but in the real world you never know that for sure.

But, we know how often Type I errors occur. When a test performs correctly, the Type I error rate equals the significance level you use (e.g., 0.05). When a test is robust to departures from normality, the Type I error rate equals the significance level even with non-normal data. When a test is NOT robust, then non-normal data will cause the Type I error rate to NOT equal the significance level. The simulation studies have found that when you satisfy the sample size guidelines, the listed tests are robust to departures from normality.
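This is straightforward to check by simulation (hypothetical setup): draw both groups from the same skewed population, so the null hypothesis is true, and count how often the t-test rejects at a significance level of 0.05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n_sims, alpha = 5000, 0.05
# The null is true: both groups come from the same skewed distribution
false_pos = sum(
    stats.ttest_ind(rng.exponential(size=20), rng.exponential(size=20)).pvalue < alpha
    for _ in range(n_sims)
)
print(f"observed Type I error rate: {false_pos / n_sims:.3f}")
```

For a robust test, the observed rate lands close to the nominal 0.05 despite the nonnormal data.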

If Spearman’s correlation is appropriate for your data and Pearson’s is not, you really need to use Spearman’s. It’s NOT just for rank and ordinal data. It’s also for nonlinear relationships that are monotonic. Read my post about Spearman’s correlation to understand what that means and the types of relationships for which you should use it. You’ll need to graph your data to make that determination. There are definitely cases where you have continuous data and Spearman’s will be the appropriate type of correlation to use. That may or may not be the case for your data, but you need to make that determination. Again, read my post about Spearman’s.

July 22, 2021 at 2:21 pm

Hi Jim, Thank you so much for the clarification on the use of parametric approaches for non-normally distributed data, provided that other requirements, like a reasonably large sample size, are met. I noticed that you have provided some rules of thumb for sample size for the t-test and ANOVA under non-normal distributions, but may I know what about Pearson's correlation? What would be an adequate sample size under a non-normal distribution?

If by the above rule of thumb the parametric approach is valid (e.g., the sample size is 150 or 200), do we still need to perform a normality test (skewness and kurtosis)? Or can we assume that it should be fine? And if the latter contradicts the former, does the latter prevail?

FYI, I am not a statistician; however, I came across an article by Professor Geoff Norman debunking various myths about statistics, such as the claim by many so-called experts that once data are non-normal, categorical/ordinal, or the sample size is too small, you have no choice but to use nonparametric approaches.

Thank you very much. Look forward to hearing from you.

July 23, 2021 at 10:50 pm

Thanks for the great question! Yes, you’re quite right, there are similar guidelines for Pearson’s correlation.

In general, the sample size for correlation should be greater than 25. There’s no formal rule for this number, but you need a certain number of observations to identify patterns such as correlation.

In terms of normality, it's not necessarily an issue for the correlation coefficient itself, but it is for the p-value. However, in some cases, the nature of the relationship will require you to use a different type of correlation, such as Spearman's correlation. Fortunately, Pearson's and Spearman's correlation are robust to non-normal data when you have more than 25 paired observations. One caveat: the confidence intervals for Pearson's correlation coefficient remain sensitive to non-normality regardless of the sample size. The p-values for Spearman's correlation are robust to non-normal data because it's a nonparametric method that uses ranks.

Your sample sizes of 150 or 200 are so much larger than the guideline value that you don't need to worry about normality.

As for the article by Professor Norman, which I have not read, it’s inaccurate to say that you can’t use parametric methods with non-normal data. Thanks to the central limit theorem, you can use parametric methods with non-normal data when your sample size is large enough. The sample size guidelines I present are based on simulation studies that compare simulated test results to known correct results for various distributions and sample sizes. These studies find that when you satisfy these sample size guidelines, the tests work correctly even with non-normal data. However, if you have non-normal data and a small sample size, then you might need to use a nonparametric test, which I discuss in this article.

I hope that helps!


July 6, 2021 at 9:24 am

Thanks so much!!


March 7, 2021 at 11:37 am

Thanks for the perfect explanation, sir. I have a question regarding my data analysis. I conducted a study and want to compare the present situation with the previous one. All participants (male and female) have experienced both the pre and post situations. The comparison uses the options (Not Available), (Worst Condition), (Average Condition), and (Better Condition); for example, comparing the "Drinking water facility" on this scale. Any suggestion on how to analyze this, or what kind of statistical test can be used for this kind of data? I will strongly appreciate your valuable inputs. Thanks, Zeb


March 5, 2021 at 11:25 am

Dear Jim, Let me add a few notes from my 10 years of practice in clinical research biostatistics. 1) In simulations, I only rarely obtained the assumed Type I error and assumed power when using parametric tests on highly skewed (like log-normal) and multi-modal data with different dispersion across groups at such small data sizes. Not rarely, I work with data so specific that even N = 300 doesn't give reliable results. This is unacceptable in the industry I work in. It is interesting, however, to see how the outcomes vary depending on our experience.

This was especially visible with ANCOVA on change-from-baseline adjusted for the baseline (the guideline-recommended standard of analysis in RCTs) in more complex designs with multiple repetitions over time (fit either via GLS or a mixed model). But then I either switch to (weighted) GEE estimation or choose quantile regression with random effects and run a set of LR tests over it to get an assumption-free ANOVA over the underlying model.

Moreover, it makes entirely no statistical sense to compare means in skewed data. These are wrong measures of the central tendency most of the time. Why? Because the arithmetic mean is by definition an additive measure, which has nothing to do with multiplicative processes or processes that can be described with the log-normal distribution. The two are incompatible. For exactly this reason it makes no sense to bootstrap the difference in means or to run a permutation test over the means – because still, however technically possible, it makes no statistical sense to use means to describe such data.

Sure, one could log-transform the data, but transforming isn’t the best option here, because it changes too many things: the hypothesis, biases the back-transformed CIs (Jensen’s inequality), affects the underlying model errors, affects the mean-variance relationship and more. Instead, we need to use the generalized model, which properly deals with the conditional expected value (rather than raw data) linked to the predictor, or by employing quantile regression followed by the LR tests to get the main and interaction effects.

2) Almost none (except maybe 3-5) of the nonparametric tests (out of about 320 I know) formally requires equal variance (or, more generally, dispersion). And they don't assess the medians in general (sadly, this is repeated even in many textbooks, though luckily not all, and awareness is growing quickly). That interpretation holds IF and ONLY IF the distributions are equal (IID): same shape, same dispersion, AND both symmetric. Otherwise, it practically never happens. Mann-Whitney and Kruskal-Wallis are about stochastic equivalence, assessed via the pseudomedian. They fail entirely as tests of equality of medians just by the definition of the pseudomedian and its properties. Lots of literature is available on this, and simulations confirm it. It's very easy to have numerically equal medians and still have the test report significant results due to differences in shape or dispersion. And that's OK, because these were designed as tests of stochastic equivalence, not median tests. Sure, if we restrict ourselves to equal dispersions AND symmetric distributions (required by the definition of the pseudomedian), then we can treat them as asymptotic tests of medians, but then this is a perfect situation for the CLT and, actually, the standard t-test (the median approaches the arithmetic mean here).

3) By the way, there are also modern tests, like the ATS (ANOVA-Type Statistics), WTS (Wald-Type Statistics), permuted WTS and ART ANOVA (Aligned-Rank Transform), which are much more flexible (handle up to 3-5, depending on implementation, main effects + interaction + repeated observations) and powerful. They use so-called relative effects.


February 7, 2021 at 9:17 pm

Sir, while comparing parametric and nonparametric methods, we miss the two real questions: 1) What if we use nonparametric tests under parametric conditions? 2) What if we use parametric tests under nonparametric conditions? Please detail the error in the outcome as the real-life deterrent. Thanks

February 7, 2021 at 10:39 pm

Hi, I touch on those issues in this post. Specifically:

1) Typically, nonparametric tests have less power than their parametric counterparts. For power reasons, you'll want to use a parametric test when it's valid. Using a nonparametric test in these conditions increases the Type II error rate (false negatives).

2) If you use a parametric test when a nonparametric test is appropriate, you’ll obtain inaccurate results. The Type I error rate won’t necessarily equal the significance level you define for the test. I’m not sure if there is a consistent direction of change in that error rate. I suspect that the Type I error rate can be higher or lower than the significance level depending on the nature of the violation.

I hope that helps.


December 17, 2020 at 2:19 pm

Great article, thank you! But may I ask when it's better to choose the mean or the median as the best measure of central tendency for my data? Is there any guide?

December 17, 2020 at 10:43 pm

Thanks for writing! In my post about measures of central tendencies , I write about which measure is best for different situations, including choosing between the mean and median. I’d recommend reading it. In a nutshell, the mean is better when your data are symmetric, or at least not extremely skewed, while the median is better when your data are fairly skewed. In my other post, I show why that’s the case.


September 15, 2020 at 1:16 pm

Hi Jim. Very informative article. I would like to know one more thing: can we use parametric tests to analyze ordinal data? If so, in what circumstances? Please advise.

September 17, 2020 at 2:47 am

That question has been behind many debates in statistics! In some cases, yes! In this post, I have a link near the end to an article I wrote about analyzing Likert scale data. The Likert scale is an ordinal scale. And for those data, you can use the parametric 2-sample t-test. That's based on a thorough simulation study. However, I would not say that means you can always use parametric tests for all scenarios where you have ordinal data. There are probably requirements for sample sizes and the number of ordinal levels. At any rate, read that post about analyzing Likert data to get an idea of some of the issues and how it works out for 2-sample t-tests.


September 1, 2020 at 10:32 pm

Hi Jim, Thanks for the very informative article. It's great to see all the hypothesis tests in one place, and I appreciate the detail and depth of the explanation. One thing I've been stuck on is making the best choice between parametric and nonparametric tests when there are many varying features; under their influence, the distributions become highly uneven, making them hard to compare and harder to draw inferences from. But this is the actual case in practical applications when you want to do A/B testing. Real-life A/B testing involves dealing with distributions that vary largely due to a high number of features (columns or variables). For A/B testing with varying distributions in the two experiments and multiple features involved, would you recommend parametric or nonparametric statistical hypothesis tests? (I have tried parametric tests, but it was getting hard to reach statistical significance because there are multiple features involved. If I remove/ignore most of the variables, I may end up getting statistical significance, but that may not be the intended purpose of A/B testing.) Can you throw some light on this, please?


July 8, 2020 at 3:38 am

Hi Jim! A researcher found that the majority of people who died during the pandemic had bought a new phone during the last year. What type of research is this? If his assumption is correct, which statistical test would be appropriate to analyze the data? Please answer this question in detail. I will be really thankful to you.

July 9, 2020 at 4:32 pm

Hi MahNoor, apparently this is a question from a test because someone else recently asked the identical question. I'm not going to do your test for you. However, I will point you toward a 2-sample proportions test, which will allow you to determine whether there is a difference in the proportion of fatalities between those who bought a new phone and those who didn't.
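For reference, the mechanics of a two-sample proportions z-test with a pooled estimate look like this. The counts below are entirely invented for illustration and are not from the question.

```python
import math
from statistics import NormalDist

# Hypothetical counts: fatalities among phone buyers vs. non-buyers.
x1, n1 = 45, 300   # fatalities / total among those who bought a new phone
x2, n2 = 30, 320   # fatalities / total among those who did not
p1, p2 = x1 / n1, x2 / n2

# Pooled proportion under the null hypothesis of equal proportions.
pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
print(round(z, 3), round(p_value, 4))
```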


June 5, 2020 at 11:41 am

Amazing thanks!

June 4, 2020 at 8:39 am

Thanks so much for explaining this all! I want to compare the ages of two groups (one has only 17 people and one has 51 people). Because the first group is <20 people, do I need a Mann-Whitney U test, or can I just use a t-test here?

Many thanks! Ben

June 4, 2020 at 1:38 pm

Do you have any theoretical reasons or empirical data that suggests the population for the smaller group follows a nonnormal distribution? If you can reasonably assume that it follows a normal distribution, you can probably use a t-test. However, if you have any doubts about that, best to go with Mann-Whitney.
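To make the comparison concrete, here's a sketch running both tests on the same data, assuming scipy is available. The group sizes match the question (17 and 51), but the ages themselves are simulated, not real data.

```python
import random
from scipy import stats

# Simulated ages for two groups of n = 17 and n = 51, purely illustrative.
random.seed(7)
group1 = [random.gauss(35, 8) for _ in range(17)]
group2 = [random.gauss(40, 8) for _ in range(51)]

# Welch's t-test (does not assume equal variances) vs. Mann-Whitney U.
t_res = stats.ttest_ind(group1, group2, equal_var=False)
u_res = stats.mannwhitneyu(group1, group2, alternative="two-sided")
print(round(t_res.pvalue, 4), round(u_res.pvalue, 4))
```

With reasonably symmetric data, the two p-values usually lead to the same conclusion; they diverge more when the smaller group's distribution is far from normal.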


May 23, 2020 at 3:04 pm

Hi Jim, thank you for the wonderful article. Can you tell me the special features of factorial designs? It would be very helpful.


May 18, 2020 at 2:04 pm

Thanks heaps for this excellent overview. However, I am a bit confused by 'The groups in a nonparametric analysis typically must all have the same variability (dispersion).' As far as I can remember, ANOVA, as a parametric test, assumes equal variances of the samples that will be tested. Do you think I should stick with ANOVA if the samples are normally distributed but have unequal variance?

May 18, 2020 at 2:14 pm

If you have unequal variances, you can use Welch’s ANOVA . Click the link to read my post about it!


April 15, 2020 at 9:16 am

Thanks a bunch Jim !

April 14, 2020 at 3:23 pm

Thanks for this article! I would like to kindly seek your advice.

I'm currently looking to filter out variables that are highly correlated so that I can remove one or the other before an analysis. I was thinking of using the nonparametric Spearman's rank correlation. Would that be correct? The data are in equal groups, each with >20 observations, continuous data.

April 14, 2020 at 10:34 pm

You can use that or even just the regular Pearson's correlation. If you're performing regression analysis and are worried about multicollinearity, you can fit the model with the variables and then check the VIFs.
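If it helps, a VIF can be computed from first principles: regress each predictor on the other predictors and take VIF = 1 / (1 - R^2). This sketch assumes numpy is available; the predictors are simulated, with x1 and x2 deliberately correlated.

```python
import random
import numpy as np

# Simulated predictors: x2 is built from x1 (collinear), x3 is independent.
random.seed(2)
n = 100
x1 = np.array([random.gauss(0, 1) for _ in range(n)])
x2 = x1 * 0.9 + np.array([random.gauss(0, 0.5) for _ in range(n)])
x3 = np.array([random.gauss(0, 1) for _ in range(n)])
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    # Regress predictor j on the remaining predictors (with intercept),
    # then convert the R-squared into a variance inflation factor.
    y = X[:, j]
    A = np.column_stack([np.ones(len(y)), np.delete(X, j, axis=1)])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1 / (1 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]
print([round(v, 2) for v in vifs])
```

The collinear pair produces inflated VIFs, while the independent predictor stays near 1; a common rule of thumb flags VIFs above 5 or 10.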


April 9, 2020 at 12:29 pm

Hi Jim. Thank you for your article; it was very helpful. I was wondering if you could help me. I'm currently doing my thesis and am carrying out a few statistical tests. One is an independent samples t-test with 1 categorical independent variable (PP group 1, N = 57; PP group 2, N = 45) and one continuous dependent variable. However, my data have violated the assumptions of normality and homogeneity of variance and have a few outliers. In this case, would I bootstrap my t-test or use the alternative nonparametric test (Mann-Whitney U)? How would I make this decision? What would the criteria be for using bootstrapping over the alternative nonparametric test?

Thanks in advance for any insight you can offer! 🙂

April 10, 2020 at 7:46 pm

Hi Heather,

In your case, I would strongly consider using the t-test. In fact there are specific reasons for not using a nonparametric test in your case.

Specifically, you have a large enough sample size in each group so that the central limit theorem kicks in (see the table in the post for sample size requirements). Even though the data in your groups are non-normal, the sampling distributions should follow a normal distribution, which gives you valid results. Additionally, t-tests can handle unequal variances. Just be sure that your statistical software uses the version of the t-test that does NOT assume equal variances.

While nonparametric tests don’t assume that your data follow a particular distribution, they do assume that the spread of the data in each group is the same. Because your data have different variances, it violates that assumption for nonparametric tests.

I’d use the t-test! You could also use bootstrapping, but a t-test should work fine.


February 11, 2020 at 2:43 am

Hi Jim, very good post (along many others in your blog). Could you please provide any formal reference for the table of minimum sampling size? Thanks a lot! Ben


February 1, 2020 at 5:30 am

Thanks a lot for the valuable information, but may I ask how much you mean by a tiny size of data? Is it less than 30?



January 28, 2020 at 10:32 pm

Hello Jim, when did you publish this article? I would like to cite it for my school work

January 28, 2020 at 10:54 pm

Hi Mukhles,

I’m glad this article was helpful for you! When you cite web pages, you actually use the date you accessed the article. See this link at Purdue University for Electronic Citations . Look in the section for “A Page on a Website.” Thanks!


January 14, 2020 at 2:00 am

Hi Jim, would you please answer one of my doubts? I'm badly stuck.

January 14, 2020 at 11:00 am

Please find the blog post that is closest to the topic of your question. There is a search box in the right hand column part way down that can help you. Ask your question in the comments of the appropriate post and I’ll answer it!


December 20, 2019 at 11:22 am

Just wanted to add that the book Nonparametric Statistical Inference, fifth edition, by Gibbons and Chakraborti (2010; CRC Press) has discussions about the power of some nonparametric tests, including Minitab macro code to simulate power. The updated edition (work in progress) will discuss R code. Hope this helps.


December 10, 2019 at 4:49 pm

Hi Jim! Great article, it really helped me with my study. The only problem now is that I need scientific papers for the statements made in your text, to refer to them in my study. Specifically, I was wondering if you could provide me with the paper you used to draw this conclusion: "parametric tests have more power. If an effect actually exists, a parametric analysis is more likely to detect it"

Thanks a lot!

December 10, 2019 at 5:19 pm

Thanks for your kind words. I’m glad it was helpful!

It’s generally recognized that nonparametric tests have somewhat lower power compared to a similar parametric test. In other words, to have the same power as a similar parametric test, you’d need a somewhat larger sample size for the nonparametric test. That’s the tendency.

However, calculating the power for a nonparametric test and understanding the difference in power between specific parametric and nonparametric tests is difficult. The problem arises because the specific difference in power depends on the precise distribution of your data. That makes it impossible to state a constant power difference by test. In other words, the power difference doesn't just depend on the tests themselves but also on the properties of your data.

For more information about these considerations, see the following texts:

Walsh, J.E. (1962). Handbook of Nonparametric Statistics. New York: D. Van Nostrand.

Conover, W.J. (1980). Practical Nonparametric Statistics. New York: Wiley & Sons.


November 3, 2019 at 2:05 am

Jim, do you have anything which describes how to estimate the power of a nonparametric test?

November 4, 2019 at 9:32 am

Calculating power for nonparametric tests can be a bit complicated. For one thing, while nonparametric tests don’t require particular distributions, you need to know the distribution to be able to calculate statistical power for these tests. I don’t think many statistical packages have built in analyses for this type of power analysis. I’ve also heard of people using bootstrap methods or Monte Carlo simulations to come up with an answer. For these methods, you’ll still need either representative data or knowledge about the distribution.

Apparently, the pwr.boot function in R uses the bootstrap method to calculate power for nonparametric tests. Unfortunately, I have not used it myself, but it could be something to try. The problem is that you should not use data from a hypothesis test to calculate the power for that same hypothesis test. If the test was statistically significant, the power will be high. If the test was not significant, the power is low. You don't know the real power. So, I'm not sure about the rationale for using this command, but it is one approach.
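As one illustration of the simulation approach, here's a Monte Carlo power sketch for the Mann-Whitney test, assuming scipy is available. The lognormal distribution and the location shift are assumptions you would replace with knowledge of your own subject area; they are not from the comment above.

```python
import random
from scipy.stats import mannwhitneyu

# Monte Carlo power estimate: posit a distribution (lognormal) and an
# effect (a location shift), simulate many experiments, and count how
# often the Mann-Whitney test rejects at the chosen significance level.
random.seed(42)

def power(n, shift, sims=1000, alpha=0.05):
    hits = 0
    for _ in range(sims):
        a = [random.lognormvariate(0, 1) for _ in range(n)]
        b = [random.lognormvariate(0, 1) + shift for _ in range(n)]
        p = mannwhitneyu(a, b, alternative="two-sided").pvalue
        if p < alpha:
            hits += 1
    return hits / sims

est = power(30, 1.0)
print(est)  # estimated power for n = 30 per group under this assumed shift
```

Rerunning with different n values turns this into a crude sample-size calculator for the assumed distribution.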


September 23, 2019 at 11:20 pm

Hi. I wanted to leave a comment . . .

September 23, 2019 at 11:55 pm

Thanks for the heads-up. I tried sending you an email but it bounced.


April 5, 2019 at 7:34 am

Hi Jim, Thank you for this nice explanation. I must consult with you regarding the situation I have with my data. I have 10 data sets (10 different metals), each consisting of 20 values (5 values in each of 4 seasons). These are measurements of metal concentrations in fish liver, and I want to assess whether there are seasonal variations. I tested the normality of the distributions and got a normal distribution for 7 metals and a non-normal distribution for 3. I tested the homogeneity of variance (Levene's test) and found that 6 of the metals have homogeneous variance, while the other 4 (3 of which have non-normal distributions) do not. Finally, my question is: should I use a parametric test (one-way ANOVA) for all 10 data sets, since the majority have normal distributions and homogeneous variance? Should I use a nonparametric test (Kruskal-Wallis H) since my data sets are not large (20 values)? Or should I test normally distributed data with parametric tests and non-normally distributed data with nonparametric tests? Thank you in advance, Kind regards, Jovana


April 3, 2019 at 3:16 am

Hi again Jim, This time my query is about missing data when the sample size is low. How do we deal with missing dependent variables in a continuous data set observed at different time intervals? Is multiple imputation a good option when data are missing at some time points and some values were not detected due to method limitations? Some suggest replacing undetected data with the lowest possible value, such as 1/2 of the limit of detection, instead of using zero. Can undetected data be treated as missing data?

I have looked up some multiple imputation methods in SPSS but not sure how much acceptable it is and how to report if acceptable.

Please enlighten with your expertise. Thank you in advance!

April 5, 2019 at 4:44 pm

Generally speaking, the less data you have the more difficult it is to estimate missing data. The missing values also play a larger role because they’re part of a smaller set. I don’t have personal experience using SPSS’ missing data imputation. I’ve read about it and it sounds good, but I’m not sure about limitations.

I’m not really sure about the detection limits issue. For one thing, I’d imagine that it depends on whether the lowest detectable value is still large enough to be important to your study. In other words, if it is so low that you’re not missing anything important, it might not be a problem. Perhaps the lowest detectable value is so low that in practical terms it’s not different from zero. But, that might not be the case. Additionally, I’d imagine it also depends on how much of your data fall in that region. If you’re obtaining lots of missing values or zeroes because much of the observations fall within that range, it becomes more problematic. Consequently, how to address it become very context sensitive and I wouldn’t be able to give you a good answer. I’d consult with subject-area specialists and see how similar studies have handled it. Sorry I couldn’t give you a more specific answer.

March 27, 2019 at 4:20 pm

Great! Thanks Jim. This is really helpful. Cheers!


March 26, 2019 at 6:03 pm

Thank you so much for this article! I wasn’t planning on using statistics in my research, but my research took a turn and my committee wanted to see testable hypotheses…for paleontology! Ugh. But, this article and your website is incredibly useful in dusting off the stats in my brain!

March 27, 2019 at 10:32 am

Your kind words mean so much to me. Thank you, Brittney!

March 23, 2019 at 12:38 am

Hi Jim, Thank you for making statistics a lot easier to understand. I now understand that parametric tests can be performed on a non-normal data if the sample size is big enough as indicated.

I have a few confusions regarding when and when not to perform a log transformation of skewed data. When do the data have to be log transformed to perform statistical analysis? Can parametric tests be done on log-transformed data, and how do we report the results after log transformation?

Do you have a blog post regarding this? Please provide your expert insights on these when possible.

March 24, 2019 at 10:49 pm

Yes, you can log transform data and use parametric analyses, although it does change a key aspect of the test. You can present the results as saying that the difference between the log-transformed means is statistically significant. Then, back-transform those values to the natural units and present those as well. Also, note that using log-transformed data changes the nature of the test so that it compares geometric means rather than the usual arithmetic means. Be sure that is acceptable. Also, check that the transformed data follow the normal distribution.

However, you generally don’t need to do this if you have a large enough sample size per group–as I point out in this post. Consider using transformations only when the data are severely skewed and/or you have a smaller sample size. Unfortunately, I don’t have a blog post on this process. However, unless you have a strong need to transform your data, I would not use that approach.
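A sketch of that workflow, assuming scipy is available; the two skewed samples are simulated for illustration. The test runs on the logs, and exponentiating the mean of the logs gives each group's geometric mean for reporting.

```python
import math
import random
from scipy import stats

# Two simulated right-skewed (lognormal) samples.
random.seed(3)
a = [random.lognormvariate(0.0, 0.5) for _ in range(25)]
b = [random.lognormvariate(0.4, 0.5) for _ in range(25)]

# Analyze on the log scale: the t-test now compares means of the logs.
log_a = [math.log(v) for v in a]
log_b = [math.log(v) for v in b]
res = stats.ttest_ind(log_a, log_b)

# Back-transforming the mean of the logs yields the GEOMETRIC mean,
# which is what this analysis actually compares.
geo_a = math.exp(sum(log_a) / len(log_a))
geo_b = math.exp(sum(log_b) / len(log_b))
print(round(res.pvalue, 4), round(geo_a, 2), round(geo_b, 2))
```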

I hope this helps!


February 28, 2019 at 5:14 pm

Very helpful article. Nice explanation


January 20, 2019 at 11:34 pm

Jim, your site in general and this page in particular have helped me, as a novice, understand statistics so much better. Regarding the Wilcoxon test: although your explanation was super helpful for understanding the basics, I'm still unsure how to relate it to my study. It has been loosely suggested by a peer that I use the Wilcoxon test, but I'm not sure how to confirm this. I have 13 participants. They each watched Video 1 and answered 16 corresponding questions (8 for construct A and 8 for construct B). They then watched Video 2 and answered the same 16 questions (8 for construct A and 8 for construct B). The questions used 3-, 5-, and 7-point Likert scales. I want to find the differences in ratings between Videos 1 and 2 for construct A, the differences in ratings between Videos 1 and 2 for construct B, and the highest-rated video overall (combining both constructs). Any advice? Thanks


November 21, 2018 at 10:23 am

It is really helpful article. I learned a lot. Thanks for posting.

November 21, 2018 at 10:57 am

You’re very welcome. I’m glad it was helpful!

November 21, 2018 at 9:32 am

Thanks Jim. Which post-hoc test would you suggest in this case. I really appreciate it. Thanks.

November 21, 2018 at 10:11 am

The post-hoc test I’m most familiar with is the Games-Howell test, which is similar to Tukey’s test. I’m sure there are others, but I’m not familiar with them. For more information and an example of Welch’s with this post-hoc test, read my post on Welch’s ANOVA .

November 21, 2018 at 4:48 am

I am dealing with 6 groups in a data set with different sample sizes. The minimum sample size of one group is 56 and the maximum is 350, and the other groups' sample sizes fall in between. My data are not normal, and through Levene's test I found that the variances are not equal. I think a comparison of means is more meaningful than medians here. Could you please guide me in selecting between Welch's ANOVA and the Kruskal-Wallis test?

November 21, 2018 at 9:27 am

Given your large sample sizes, unequal variances, and the fact that you want to compare means, I'd use Welch's ANOVA.
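Welch's ANOVA isn't built into scipy, but the 1951 formula is short enough to sketch directly. The three simulated groups below loosely mirror the unequal sizes and spreads described in the question; they are invented data, and the implementation is my own from the published formula, not from the post.

```python
import random
from statistics import mean, stdev
from scipy.stats import f as f_dist

def welch_anova(groups):
    # Welch (1951): weight each group by n / s^2, compare weighted means,
    # and use an adjusted F distribution with fractional denominator df.
    k = len(groups)
    n = [len(g) for g in groups]
    m = [mean(g) for g in groups]
    v = [stdev(g) ** 2 for g in groups]
    w = [ni / vi for ni, vi in zip(n, v)]        # precision weights
    W = sum(w)
    grand = sum(wi * mi for wi, mi in zip(w, m)) / W
    A = sum(wi * (mi - grand) ** 2 for wi, mi in zip(w, m)) / (k - 1)
    C = sum((1 - wi / W) ** 2 / (ni - 1) for wi, ni in zip(w, n))
    F = A / (1 + 2 * (k - 2) / (k ** 2 - 1) * C)
    df1, df2 = k - 1, (k ** 2 - 1) / (3 * C)
    return F, f_dist.sf(F, df1, df2)

# Simulated groups with unequal n and unequal spread, illustrative only.
random.seed(5)
g1 = [random.gauss(10, 2) for _ in range(56)]
g2 = [random.gauss(11, 5) for _ in range(120)]
g3 = [random.gauss(12, 8) for _ in range(350)]
F, p = welch_anova([g1, g2, g3])
print(round(F, 2), round(p, 4))
```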

Best of luck with your analysis!


August 11, 2018 at 10:54 am

Hi from Turkey. I have followed your posts for 6 months. Every article is better than the last. Thank you for making me love statistics.

August 11, 2018 at 4:14 pm

Hi Ferhat, thank you so much! That means a lot to me!


August 6, 2018 at 8:01 am

This is really an insightful article. I have a question, though, regarding my study. Can I still use a parametric test even if the distribution is not normal and the variances aren't homogeneous? I checked those assumptions via the Shapiro-Wilk test and Levene's F-test, and the results suggested that both assumptions were violated. Other online articles mentioned that if this is the case, I should use a nonparametric test, but I also read somewhere that one-way ANOVA would do. By the way, I have 3 groups with an equal number of observations, i.e., 21 in each group.

Thanks for your time.

August 6, 2018 at 10:58 am

If your sample size per group meets the requirements that I present in Advantage #1 for parametric tests, then nonnormal data are not a problem. These tests are robust to departures from normality as long as you have a sufficient number of observations per group.

As for unequal variances, you often have stricter requirements when you use nonparametric tests. This fact isn't discussed much, but nonparametric tests typically require the same spread across groups. For t-tests and ANOVA, you have options that allow you to use them when variances are not equal. For example, for ANOVA you can use Welch's ANOVA. For details on that method, read my post about Welch's ANOVA.

Based on your sample size per group, you should be able to use ANOVA regardless of whether the data are normally distributed. If you suspect that the variances are not equal, you can use Welch’s ANOVA.

I hope this helps.


August 8, 2018 at 4:01 am

Thanks a lot for your prompt response, Jim. Really appreciate it. I’ll check on Welch’s ANOVA, then. Again, many thanks!


August 1, 2018 at 4:32 am

My data set of 350 doesn't follow a normal distribution. Which one should I use, the median or the mean? How should it be reported? Should I report the mean, SD, CV, etc.?

August 1, 2018 at 3:18 pm

The answer to this question depends on which measure best represents the middle of your distribution and what is important to the subject area. In general, the more skewed your distribution, the more you should consider using the median. Graph your data to help answer this question. Also, I’ve written a post about the different measures of central tendency that you should read!
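A tiny standard-library illustration of that advice, with simulated right-skewed data of the same size as the question's data set: the mean gets pulled into the long tail, above the median.

```python
import random
from statistics import mean, median

# Simulated right-skewed (lognormal) sample of n = 350, illustrative only.
random.seed(9)
sample = [random.lognormvariate(0, 1) for _ in range(350)]

# For right-skewed data, the mean exceeds the median, so the median is
# often the better summary of a "typical" value.
print(round(mean(sample), 2), round(median(sample), 2))
```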


July 8, 2018 at 3:43 am

Thanks, respected sir. I got your point. You are great.

July 8, 2018 at 10:39 pm

You’re welcome. I’m glad I could help!

July 8, 2018 at 3:23 am

There is no significant difference in pre-intervention scores between groups (p value > 0.05), but when we look at the mean scores of the groups, there are minor differences among them. In this case, can I use ANCOVA?

July 8, 2018 at 3:38 am

ANCOVA allows you include a covariate (a continuous variable that might be correlated with the dependent variable) in the analysis along with your categorical variables (factors). Telling me about the means of the groups is not applicable to whether you should use ANCOVA specifically. Do you have a continuous independent variable to include in the analysis?

I'm not sure why you're analyzing the pre-intervention scores. It is entirely normal to see differences between group means when the p-value is greater than 0.05, but that issue does not determine whether you should use ANCOVA.

If you have only the 5 groups and there are no other variables in your analysis, no you can’t use ANCOVA because you don’t have a covariate. Seems like you should use one-way ANOVA . You can subtract the pretest scores from the post-test scores so you’re analyzing the differences by group. This process will tell you how the changes in the experimental groups compare to the change in the control group.
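The subtract-then-ANOVA suggestion can be sketched like this, assuming scipy is available. The group sizes match the 50-participant, 5-group design from the question, but the scores and effect sizes are invented for illustration.

```python
import random
from scipy.stats import f_oneway

# Simulate post - pre change scores for 5 groups of n = 10
# (4 experimental groups + 1 control); effect sizes are made up.
random.seed(11)

def change_scores(effect, n=10):
    pre = [random.gauss(50, 10) for _ in range(n)]
    post = [p + effect + random.gauss(0, 5) for p in pre]
    return [po - pr for po, pr in zip(post, pre)]

changes = [change_scores(e) for e in (8, 6, 5, 4, 0)]  # last group = control

# One-way ANOVA on the change scores compares the groups' improvements.
F, p = f_oneway(*changes)
print(round(F, 2), round(p, 4))
```

A significant result here would say the groups did not all change by the same amount; post-hoc comparisons would then show which interventions differ from the control.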

July 6, 2018 at 1:48 pm

Respected Sir, please answer my last two questions too.

July 6, 2018 at 1:59 pm

I will, Muhammad. Please keep in mind that the website is something I do in my spare time. I try to answer all questions but sometimes it will take a day or two depending on what else I have going on.

July 6, 2018 at 1:41 pm

Thanks Great Sir

July 6, 2018 at 1:25 pm

Dear Jim Frost, thanks for your kind reply. Please also guide me on two more questions: 1. No significant difference was found among the covariates with p > 0.05 before the intervention, but there is a minor difference in their mean scores. In this case, can I use ANCOVA with covariates that are not significant (p > 0.05)? Is it okay that using ANCOVA will remove the initial differences found in the mean scores of the covariate even though no significant difference (p > 0.05) was found before the intervention? 2. In my experimental study the sample size is 50. There are 5 groups (4 experimental and 1 control group). I am using a randomized pretest-posttest control group design, but some people say this research design is not appropriate. Please guide me: is this design okay or not? If not, please suggest an appropriate design. I am giving different interventions to the 4 experimental groups and no intervention to the control group. Please reply immediately.

July 8, 2018 at 3:08 am

Hi Muhammad,

I’m a bit confused by your first question. Covariates are continuous variables, so there are no group means to compare in that sense. Covariates don’t assess the differences between the means of the levels of a categorical variable. Instead, you use the p-value to determine whether there is a significant relationship between the covariate and the dependent variable, in the same manner as for linear regression. Usually, if it is not significant, you don’t include it in the model. However, if theory strongly suggests that it should be in the model, it is OK to include it even when the p-value is greater than 0.05.

I don’t see why a pretest-posttest would not be OK. But, I don’t have much information to go by. Why did they say it was not appropriate?

July 6, 2018 at 1:07 pm

Actually, I have only 10 subjects in each group, which is not greater than 15. That's why I asked.

July 6, 2018 at 1:31 pm

That size limit is only important when your data don’t follow a normal distribution. You said that your data do follow the normal distribution. So, it shouldn’t be a problem!

July 3, 2018 at 9:51 pm

I have 5 groups in an experimental study (4 experimental and 1 control). The sample size is 50, with 10 subjects in each group. All groups have a normal distribution. Can I use a parametric test? Please reply immediately.

July 5, 2018 at 3:14 pm

Hi Muhammad, given what you state, I see no reason why you couldn’t use a parametric test.


June 17, 2018 at 1:57 pm

Hi Jim, thanks for the overview! Do you happen to have a source/reference I can refer to when using the claims you make as argumentation in my paper?

June 18, 2018 at 11:35 am

Hi Sam, I include a link in this post to a white paper about the sample size claims. You’ll find your answers there!


May 28, 2018 at 7:44 am

Wonderful article…love all your articles…😃

May 30, 2018 at 10:52 am

Thank you, Mohammad! That means a lot to me!


April 24, 2018 at 5:48 am

I have benefited from your information. May God bless You.

April 24, 2018 at 9:44 am

Thank you, David! It makes me happy to hear that this has been helpful for you!


February 12, 2018 at 2:33 am

Very nice explanation Of central tendencies

February 12, 2018 at 9:51 am

Thank you, Anitha!


April 25, 2017 at 4:23 am

How can I cite this article?

April 26, 2017 at 2:08 am

Hi, there are several standard formats for electronic sources, such as MLA , APA , and Chicago style . You’ll need to check with your institution to determine which one you should use.


April 24, 2017 at 1:03 am

Great article. This is one of those statistical tests that took a while to understand. But you explained it very nicely!

April 24, 2017 at 1:12 am

Thank you so much Lucas!

March 16, 2021 at 12:34 am

Thanks, Lucas!



Non-Parametric Test


Non-parametric tests are statistical tests that do not require assumptions about the underlying population. They do not rely on the data belonging to any particular parametric family of probability distributions. Non-parametric methods are also called distribution-free tests since they make no assumptions about the population distribution. In this article, we will discuss what a non-parametric test is, the different methods, merits, demerits, and examples of non-parametric testing methods.

Table of Contents:

  • Non-parametric T-Test
  • Non-parametric Paired T-Test
  • Mann Whitney U Test
  • Wilcoxon Signed-Rank Test
  • Kruskal Wallis Test
  • Advantages and Disadvantages
  • Applications

What is a Non-parametric Test?

Non-parametric tests are the mathematical methods used in statistical hypothesis testing, which do not make assumptions about the frequency distribution of variables that are to be evaluated. The non-parametric experiment is used when there are skewed data, and it comprises techniques that do not depend on data pertaining to any particular distribution.

The word non-parametric does not mean that these models do not have any parameters. The fact is, the characteristics and number of parameters are pretty flexible and not predefined. Therefore, these models are called distribution-free models.

Non-Parametric T-Test

Whenever a few assumptions about the given population are uncertain, we use non-parametric tests, which serve as counterparts to the parametric tests. When data are not normally distributed, or when they are on an ordinal level of measurement, we have to use non-parametric tests for analysis. The basic rule is to use a parametric t-test for normally distributed data and a non-parametric test for skewed data.

Non-Parametric Paired T-Test

The paired sample t-test is used to compare two mean scores that come from the same group. It applies when one variable is measured at two levels and those levels are repeated measures on the same subjects; when its assumptions are not met, the non-parametric alternative is the Wilcoxon signed-rank test.

Non-parametric Test Methods

The four different techniques of non-parametric tests, namely the Mann Whitney U test, the sign test, the Wilcoxon signed-rank test, and the Kruskal Wallis test, are discussed here in detail. Non-parametric tests are largely based on ranks, which are assigned to the ordered data. The four different types of non-parametric test are summarized below with their uses, null hypothesis, test statistic, and decision rule.

The Kruskal Wallis test is used to compare a continuous outcome across more than two independent samples.

Null hypothesis, H 0 : The k population medians are equal.

Test statistic:

If N is the total sample size, k is the number of comparison groups, R j is the sum of the ranks in the jth group and n j is the sample size in the jth group, then the test statistic, H is given by:

\(\begin{array}{l}H = \left ( \frac{12}{N(N+1)}\sum_{j=1}^{k} \frac{R_{j}^{2}}{n_{j}}\right )-3(N+1)\end{array} \)

Decision Rule: Reject the null hypothesis H 0 if H ≥ critical value
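As a sketch, the H statistic above can be computed directly from the pooled ranks and cross-checked against `scipy.stats.kruskal`. The three groups below are hypothetical values, chosen to be tie-free so the manual formula and scipy agree exactly (scipy applies a tie correction when ties are present):

```python
import numpy as np
from scipy import stats

# Hypothetical, tie-free data for three independent groups.
group1 = [27, 2, 4, 18, 7, 9]
group2 = [20, 8, 14, 36, 15, 3]
group3 = [34, 31, 29, 23, 30, 21]

data = np.concatenate([group1, group2, group3])
ranks = stats.rankdata(data)                  # ranks over the pooled sample
N = len(data)
sizes = [len(group1), len(group2), len(group3)]
R = np.split(ranks, np.cumsum(sizes)[:-1])    # per-group rank arrays

# H = (12 / (N(N+1))) * sum_j(R_j^2 / n_j) - 3(N+1)
H_manual = (12 / (N * (N + 1))
            * sum(r.sum() ** 2 / n for r, n in zip(R, sizes))
            - 3 * (N + 1))

H_scipy, p = stats.kruskal(group1, group2, group3)
print(H_manual, H_scipy, p)   # identical H values since there are no ties
```

In practice you would compare H against a chi-square critical value with k − 1 degrees of freedom, which is what scipy's p-value does for you.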

The sign test is used to compare a continuous outcome in paired samples or two matched samples.

Null hypothesis, H 0 : The median difference is zero.

Test statistic: The test statistic of the sign test is the smaller of the number of positive or negative signs.

Decision Rule: Reject the null hypothesis if the smaller of the numbers of positive or negative signs is less than or equal to the critical value from the table.
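SciPy has no dedicated sign-test function, but the procedure above reduces to an exact binomial test on the sign counts. This sketch uses hypothetical paired values:

```python
from scipy import stats

# Hypothetical paired measurements (e.g. before/after a treatment).
before = [140, 135, 122, 150, 126, 138, 145, 160]
after  = [135, 136, 120, 148, 129, 130, 140, 158]

diffs = [a - b for b, a in zip(before, after)]
nonzero = [d for d in diffs if d != 0]        # zero differences are discarded
n_pos = sum(d > 0 for d in nonzero)
n_neg = sum(d < 0 for d in nonzero)

# Test statistic: the smaller of the positive and negative sign counts.
statistic = min(n_pos, n_neg)

# Equivalently, an exact two-sided binomial test with p = 0.5 under H0.
p_value = stats.binomtest(n_pos, n=len(nonzero), p=0.5).pvalue
print(statistic, round(p_value, 4))
```

Comparing the sign count against a table's critical value and comparing the binomial p-value against alpha lead to the same decision.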

Mann Whitney U test is used to compare the continuous outcomes in the two independent samples. 

Null hypothesis, H 0 : The two populations are equal.

If R 1 and R 2 are the sum of the ranks in group 1 and group 2 respectively, then the test statistic “U” is the smaller of:

\(\begin{array}{l}U_{1}= n_{1}n_{2}+\frac{n_{1}(n_{1}+1)}{2}-R_{1}\end{array} \)

\(\begin{array}{l}U_{2}= n_{1}n_{2}+\frac{n_{2}(n_{2}+1)}{2}-R_{2}\end{array} \)

Decision Rule: Reject the null hypothesis if the test statistic U is less than or equal to the critical value from the table.
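Using the formulas above, U can be computed from the rank sums and cross-checked against `scipy.stats.mannwhitneyu` (scipy reports the U corresponding to the first sample; since U1 + U2 = n1·n2, the smaller of the two follows directly). The two samples are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical data for two independent samples.
sample1 = [12, 15, 9, 21, 30, 7]
sample2 = [25, 28, 40, 33, 18, 45]
n1, n2 = len(sample1), len(sample2)

# Rank the pooled observations, then take each group's rank sum.
ranks = stats.rankdata(np.concatenate([sample1, sample2]))
R1, R2 = ranks[:n1].sum(), ranks[n1:].sum()

U1 = n1 * n2 + n1 * (n1 + 1) / 2 - R1
U2 = n1 * n2 + n2 * (n2 + 1) / 2 - R2
U = min(U1, U2)               # compare against the critical value

U_scipy, p = stats.mannwhitneyu(sample1, sample2, alternative="two-sided")
print(U, min(U_scipy, n1 * n2 - U_scipy), p)
```

Both routes give the same smaller-U value; scipy additionally returns an exact p-value for small, tie-free samples.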

Wilcoxon signed-rank test is used to compare the continuous outcome in the two matched samples or the paired samples.

Null hypothesis, H 0 : The median difference is zero.

Test statistic: The test statistic W is defined as the smaller of W+ and W-,

where W+ and W- are the sums of the positive and the negative ranks of the difference scores.

Decision Rule: Reject the null hypothesis if the test statistic W is less than or equal to the critical value from the table.
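`scipy.stats.wilcoxon` implements this test directly. The paired scores below are hypothetical and chosen so there are no zero differences and no tied absolute differences, which lets scipy use the exact null distribution:

```python
from scipy import stats

# Hypothetical paired scores (no zero differences, no tied |differences|).
before = [125, 115, 130, 140, 142, 115, 140, 125, 141, 135]
after  = [110, 122, 125, 120, 140, 124, 123, 137, 135, 145]

# For the two-sided test, the reported statistic is based on the
# signed ranks of the pairwise differences (before - after).
W, p = stats.wilcoxon(before, after)
print(W, p)
```

As with the other rank tests, a small W (relative to the table's critical value) or a p-value below alpha leads to rejecting the null hypothesis of zero median difference.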

Advantages and Disadvantages of Non-Parametric Test

The advantages of the non-parametric test are:

  • Easily understandable
  • Short calculations
  • Assumption of distribution is not required
  • Applicable to all types of data

The disadvantages of the non-parametric test are:

  • Less efficient as compared to parametric test
  • The results may or may not provide an accurate answer because they are distribution free

Applications of Non-Parametric Test

The conditions when non-parametric tests are used are listed below:

  • When the assumptions of parametric tests are not satisfied.
  • When the hypothesis being tested does not involve any distributional assumptions.
  • For quick data analysis.
  • When unscaled data is available.

Frequently Asked Questions on Non-Parametric Test

What is meant by a non-parametric test?

A non-parametric test is a method of statistical analysis that does not require the data being analyzed to follow any particular distribution. Hence, the non-parametric test is called a distribution-free test.

What is the advantage of a non-parametric test?

The advantage of nonparametric tests over parametric tests is that they do not rely on assumptions about the distribution of the data.

Is Chi-square a non-parametric test?

Yes, the Chi-square test is a non-parametric test in statistics, and it is called a distribution-free test.

Mention the different types of non-parametric tests.

The different types of non-parametric tests are: the Kruskal Wallis test, the sign test, the Mann Whitney U test, and the Wilcoxon signed-rank test.

When to use the parametric and non-parametric test?

If the mean of the data more accurately represents the centre of the distribution, and the sample size is large enough, we can use a parametric test. Whereas, if the median of the data more accurately represents the centre of the distribution, we can use a non-parametric test.




Non-Parametric Statistics: Types, Tests, and Examples

  • Pragya Soni
  • May 12, 2022


Statistics, an essential element of data management and predictive analysis , is classified into two types: parametric and non-parametric.

Parametric tests are based on assumptions about the population or data sources, while non-parametric tests make no such assumptions and rely more directly on the observed data. Here is a detailed blog about non-parametric statistics.

What is the Meaning of Non-Parametric Statistics?

Unlike parametric statistics, non-parametric statistics is a branch of statistics that is not based on parametrized families of probability distributions. Non-parametric methods are either distribution-free or rely on a specified distribution whose parameters are left unspecified.

Non-parametric statistics are defined by non-parametric tests; these are the procedures that do not require assumptions about the sampled population. For this reason, non-parametric tests are also known as distribution-free tests, as they don’t rely on the data belonging to any particular parametric family of probability distributions.

In other terms, non-parametric statistics is a statistical method in which the data are not required to fit a normal distribution. Usually, non-parametric statistics use ordinal data that relies on ranking or order rather than on the numbers themselves. The field covers statistical tests, inference, statistical models, and descriptive statistics.

Non-parametric statistics is thus defined as a statistical method where the data don’t come from a prescribed model determined by a small number of parameters. Unlike the normal distribution model, factorial design, and regression modeling, non-parametric statistics takes a different approach.

Unlike parametric models, non-parametric methods are quite easy to use, but they don’t offer the same precision as parametric models. Therefore, non-parametric statistics is generally preferred for studies where small changes in the raw inputs have little or no effect on the output: even if the numerical values change, the results are likely to stay the same as long as the ranks are preserved.

Also Read | What is Regression Testing?

How does Non-Parametric Statistics Work?

Parametric statistics consists of the parameters like mean,  standard deviation , variance, etc. Thus, it uses the observed data to estimate the parameters of the distribution. Data are often assumed to come from a normal distribution with unknown parameters.

Non-parametric statistics, by contrast, don’t assume that the data come from the same or a normal distribution. Instead, they treat the data as measured on a scale, such as ranks, that doesn’t depend on a particular distribution; the actual data-generating process may be quite far from a normally distributed one.

Types of Non-Parametric Statistics

Non-parametric statistics are further classified into two major categories. Here is the brief introduction to both of them:

1. Descriptive Statistics

Descriptive statistics is a type of non-parametric statistics. It represents the entire population or a sample of a population and breaks down the measures of central tendency and variability.

2. Statistical Inference

Statistical inference is defined as the process through which inferences about a population are made from statistics calculated on a sample drawn from that population.

Some Examples of Non-Parametric Tests

In recent research years, non-parametric methods have gained appreciation due to their ease of use. Non-parametric statistics is also applicable to a huge variety of data, regardless of its mean, sample size, or other variation. As non-parametric statistics use fewer assumptions, they have a wider scope than parametric statistics.

Here are some common  examples of non-parametric statistics :

Consider the case of a financial analyst who wants to estimate the value of risk of an investment. Now, rather than making the assumption that earnings follow a normal distribution, the analyst uses a histogram to estimate the distribution by applying non-parametric statistics.

Consider another case of a researcher who wants to find the relation between the sleep cycle and health in human beings. Using parametric statistics here would make the process quite complicated.

So, instead of using a method that assumes a normal distribution for illness frequency, the researcher will opt for a non-parametric method such as quantile regression analysis.

Similarly, consider the case of another health researcher who wants to estimate the number of babies born underweight in India; they will also employ non-parametric methods for data testing.

A marketer who is interested in knowing the market growth or success of a company will likely employ a non-parametric approach.

Any researcher testing the market to check consumer preferences for a product will also employ a non-parametric test, since ordinal response categories such as strongly agree, agree, slightly agree, and disagree make parametric methods hard to apply.

Any other science or social science research that includes categorical variables such as gender, marital status, employment, or educational qualification also relies on non-parametric statistics. It plays an important role when the source data lack a clear numerical interpretation.

Also Read | Applications of Statistical Techniques

What are Non-Parametric Tests?


Types of Non-Parametric Tests

Here is the list of non-parametric tests that are commonly used in statistical testing:

Wilcoxon Rank Sum Test

The Wilcoxon test is also known as the rank-sum test or the signed-rank test, two related but distinct procedures. The signed-rank form is a non-parametric test that works on two paired groups: its main focus is the comparison between the paired measurements, and it calculates the difference within each pair and analyses those differences.

The Wilcoxon signed-rank test is classified as a statistical hypothesis test and is used to compare two related samples, matched samples, or repeated measurements on a single sample to assess whether their population mean ranks differ.

Mann- Whitney U Test

The Mann-Whitney U test is also known as the Mann-Whitney-Wilcoxon test, the Wilcoxon rank-sum test, and the Wilcoxon-Mann-Whitney test. It is a non-parametric test whose null hypothesis is that a randomly selected value from one sample is equally likely to be greater or less than a randomly selected value from the other sample.

The Mann-Whitney test is usually used to compare the characteristics of two independent groups when the dependent variable is either ordinal or continuous but not normally distributed. For a Mann-Whitney test, four requirements must be met: the first three relate to the study design and the fourth reflects the nature of the data.

Kruskal Wallis Test

Sometimes referred to as a one-way ANOVA on ranks, the Kruskal Wallis H test is a non-parametric test used to determine whether there are statistically significant differences between two or more groups of an independent variable. ANOVA stands for analysis of variance.

The test is named after the statisticians who developed it, William Kruskal and W. Allen Wallis. Its major purpose is to check whether the samples are drawn from the same population or not.

Friedman Test

The Friedman test is similar to the Kruskal Wallis test and is an alternative to the ANOVA test. The main difference is that the Friedman test works on a repeated-measures basis: it is used to detect differences between groups when the dependent variable is measured on an ordinal scale.

The test was developed by Milton Friedman and hence is named after him. It is applicable to complete block designs and is thus also a special case of the Durbin test.

Distribution Free Tests

Distribution-free tests are mathematical procedures widely used for testing statistical hypotheses. They make no assumptions about the probability distributions of the variables. An important list of distribution-free tests is as follows:

  • Anderson-Darling test: checks whether a sample is drawn from a given distribution.
  • Statistical bootstrap methods: resampling methods used to estimate the accuracy and sampling distribution of a statistic.
  • Cochran’s Q: tests whether treatments have identical effects in block designs with 0/1 outcomes.
  • Cohen’s kappa: measures inter-rater agreement for categorical items.
  • Kaplan-Meier estimator: estimates the survival function from lifetime data, allowing for censoring.
  • Two-way analysis Friedman test: also known as a ranking test, used for randomized block designs.
  • Kendall’s tau: measures the statistical dependence between two variables.
  • Kolmogorov-Smirnov test: tests whether a sample is drawn from a given distribution, or whether two samples are drawn from the same distribution.
  • Kendall’s W: measures the agreement (concordance) among multiple raters.
  • Kuiper’s test: checks whether a sample is drawn from a given distribution in a way that is sensitive to cyclic variations.
  • Log-rank test: compares the survival distributions of two censored samples.
  • McNemar’s test: tests, for paired nominal data in a 2×2 contingency table, whether the row and column marginal frequencies are equal.
  • Median tests: check whether two samples are drawn from populations with equal medians.
  • Pitman’s permutation test: yields exact p-values by examining all possible rearrangements of the labels.
  • Rank products: detect differentially expressed genes in replicated microarray experiments.
  • Siegel-Tukey test: tests for differences in scale between two groups.
  • Sign test: tests whether matched-pair samples are drawn from distributions with equal medians.
  • Spearman’s rank correlation: measures statistical dependence between two variables using a monotonic function.
  • Squared ranks test: tests the equality of variances between two or more variables.
  • Wald-Wolfowitz runs test: checks whether the elements of a sequence are mutually independent/random.
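As one concrete example from the list above, a one-sample Kolmogorov-Smirnov test takes only a few lines with `scipy.stats.kstest`. The data here are simulated, so this is a sketch rather than a real study:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample = rng.normal(loc=0.0, scale=1.0, size=200)   # simulated sample

# H0: the sample was drawn from a standard normal distribution.
# D is the maximum distance between the empirical and theoretical CDFs.
D, p = stats.kstest(sample, "norm")
print(f"D = {D:.3f}, p = {p:.3f}")
```

A large D (small p) would indicate that the sample's empirical distribution deviates from the hypothesized one.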

Also Read | Factor Analysis

Advantages and Disadvantages of Non-Parametric Tests

The benefits of non-parametric tests are as follows:

It is easy to understand and apply.

It consists of short calculations.

The assumption of the population is not required.

Non-parametric tests are applicable to all kinds of data

The limitations of non-parametric tests are:

It is less efficient than parametric tests.

Sometimes the results of non-parametric tests are insufficient to provide an accurate answer.

Applications of Non-Parametric Tests

Non-parametric tests are quite helpful in the following cases:

Where parametric tests do not give sufficient results.

When the hypothesis being tested makes no distributional assumptions.

For quicker analysis of the sample.

When the data are unscaled.

The current scenario of research is based on fluctuating inputs, thus, non-parametric statistics and tests become essential for in-depth research and data analysis .



Choosing the Right Statistical Test | Types & Examples

Published on January 28, 2020 by Rebecca Bevans . Revised on June 22, 2023.

Statistical tests are used in hypothesis testing . They can be used to:

  • determine whether a predictor variable has a statistically significant relationship with an outcome variable.
  • estimate the difference between two or more groups.

Statistical tests assume a null hypothesis of no relationship or no difference between groups. Then they determine whether the observed data fall outside of the range of values predicted by the null hypothesis.

If you already know what types of variables you’re dealing with, you can use the flowchart to choose the right statistical test for your data.

Statistical tests flowchart

Table of contents

  • What does a statistical test do?
  • When to perform a statistical test
  • Choosing a parametric test: regression, comparison, or correlation
  • Choosing a nonparametric test
  • Flowchart: choosing a statistical test
  • Other interesting articles
  • Frequently asked questions about statistical tests

Statistical tests work by calculating a test statistic – a number that describes how much the relationship between variables in your test differs from the null hypothesis of no relationship.

It then calculates a p value (probability value). The p -value estimates how likely it is that you would see the difference described by the test statistic if the null hypothesis of no relationship were true.

If the value of the test statistic is more extreme than the statistic calculated from the null hypothesis, then you can infer a statistically significant relationship between the predictor and outcome variables.

If the value of the test statistic is less extreme than the one calculated from the null hypothesis, then you can infer no statistically significant relationship between the predictor and outcome variables.


You can perform statistical tests on data that have been collected in a statistically valid manner – either through an experiment , or through observations made using probability sampling methods .

For a statistical test to be valid , your sample size needs to be large enough to approximate the true distribution of the population being studied.

To determine which statistical test to use, you need to know:

  • whether your data meets certain assumptions.
  • the types of variables that you’re dealing with.

Statistical assumptions

Statistical tests make some common assumptions about the data they are testing:

  • Independence of observations (a.k.a. no autocorrelation): The observations/variables you include in your test are not related (for example, multiple measurements of a single test subject are not independent, while measurements of multiple different test subjects are independent).
  • Homogeneity of variance : the variance within each group being compared is similar among all groups. If one group has much more variation than others, it will limit the test’s effectiveness.
  • Normality of data : the data follows a normal distribution (a.k.a. a bell curve). This assumption applies only to quantitative data .

If your data do not meet the assumptions of normality or homogeneity of variance, you may be able to perform a nonparametric statistical test , which allows you to make comparisons without any assumptions about the data distribution.

If your data do not meet the assumption of independence of observations, you may be able to use a test that accounts for structure in your data (repeated-measures tests or tests that include blocking variables).

Types of variables

The types of variables you have usually determine what type of statistical test you can use.

Quantitative variables represent amounts of things (e.g. the number of trees in a forest). Types of quantitative variables include:

  • Continuous (aka ratio variables): represent measures and can usually be divided into units smaller than one (e.g. 0.75 grams).
  • Discrete (aka integer variables): represent counts and usually can’t be divided into units smaller than one (e.g. 1 tree).

Categorical variables represent groupings of things (e.g. the different tree species in a forest). Types of categorical variables include:

  • Ordinal : represent data with an order (e.g. rankings).
  • Nominal : represent group names (e.g. brands or species names).
  • Binary : represent data with a yes/no or 1/0 outcome (e.g. win or lose).

Choose the test that fits the types of predictor and outcome variables you have collected (if you are doing an experiment , these are the independent and dependent variables ). Consult the tables below to see which test best matches your variables.

Parametric tests usually have stricter requirements than nonparametric tests, and are able to make stronger inferences from the data. They can only be conducted with data that adheres to the common assumptions of statistical tests.

The most common types of parametric test include regression tests, comparison tests, and correlation tests.

Regression tests

Regression tests look for cause-and-effect relationships . They can be used to estimate the effect of one or more continuous variables on another variable.

  • Simple linear regression (predictor: continuous; outcome: continuous): What is the effect of income on longevity?
  • Multiple linear regression (predictors: two or more continuous; outcome: continuous): What is the effect of income and minutes of exercise per day on longevity?
  • Logistic regression (predictor: continuous; outcome: binary): What is the effect of drug dosage on the survival of a test subject?

Comparison tests

Comparison tests look for differences among group means . They can be used to test the effect of a categorical variable on the mean value of some other characteristic.

T-tests are used when comparing the means of precisely two groups (e.g., the average heights of men and women). ANOVA and MANOVA tests are used when comparing the means of more than two groups (e.g., the average heights of children, teenagers, and adults).

  • Paired t-test (predictor: categorical; outcome: quantitative, from the same group): What is the effect of two different test prep programs on the average exam scores for students from the same class?
  • Independent t-test (predictor: categorical; outcome: quantitative, from different groups): What is the difference in average exam scores for students from two different schools?
  • ANOVA (predictor: categorical; outcome: quantitative): What is the difference in average pain levels among post-surgical patients given three different painkillers?
  • MANOVA (predictor: categorical; outcomes: two or more quantitative): What is the effect of flower species on petal length, petal width, and stem length?

Correlation tests

Correlation tests check whether variables are related without hypothesizing a cause-and-effect relationship.

These can be used to test whether two variables you want to use in (for example) a multiple regression test are autocorrelated.

  • Pearson’s r (two continuous variables): How are latitude and temperature related?

Non-parametric tests don’t make as many assumptions about the data, and are useful when one or more of the common statistical assumptions are violated. However, the inferences they make aren’t as strong as with parametric tests.

Each nonparametric test below substitutes for a parametric counterpart:

  • Spearman’s r: use in place of Pearson’s r
  • Sign test: use in place of the one-sample t-test
  • Kruskal–Wallis H: use in place of ANOVA
  • ANOSIM: use in place of MANOVA
  • Wilcoxon rank-sum test: use in place of the independent t-test
  • Wilcoxon signed-rank test: use in place of the paired t-test
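To illustrate the Spearman-for-Pearson substitution, the rank-based Spearman correlation fully captures a monotonic but nonlinear relationship that Pearson's correlation understates. The data below are a contrived sketch:

```python
import numpy as np
from scipy import stats

x = np.arange(1, 21)
y = x.astype(float) ** 3        # perfectly monotonic, strongly nonlinear

rho, _ = stats.spearmanr(x, y)  # works on ranks: rho is 1 here
r, _ = stats.pearsonr(x, y)     # measures linear association only: r < 1
print(rho, r)
```

Because Spearman's correlation only uses the ordering of the values, any monotonic transformation of y leaves rho unchanged, while Pearson's r varies with the shape of the relationship.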


This flowchart helps you choose among parametric tests. For nonparametric alternatives, check the table above.

Choosing the right statistical test


Statistical tests commonly assume that:

  • the data are normally distributed
  • the groups that are being compared have similar variance
  • the data are independent

If your data do not meet these assumptions, you might still be able to use a nonparametric statistical test. These tests have fewer requirements but also make weaker inferences.

A test statistic is a number calculated by a  statistical test . It describes how far your observed data is from the  null hypothesis  of no relationship between  variables or no difference among sample groups.

The test statistic tells you how different two or more groups are from the overall population mean , or how different a linear slope is from the slope predicted by a null hypothesis . Different test statistics are used in different statistical tests.

Statistical significance is a term used by researchers to state that it is unlikely their observations could have occurred under the null hypothesis of a statistical test . Significance is usually denoted by a p -value , or probability value.

Statistical significance is arbitrary – it depends on the threshold, or alpha value, chosen by the researcher. The most common threshold is p < 0.05, which means that results as extreme as the observed data would occur less than 5% of the time if the null hypothesis were true.

When the p -value falls below the chosen alpha value, then we say the result of the test is statistically significant.
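The decision rule is mechanical once the test has been run. A sketch using SciPy's one-sample t-test (the sample data, hypothesized mean, and threshold below are all illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(5.5, 1.0, 30)   # hypothetical measurements

alpha = 0.05                        # chosen before looking at the data
t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)

# Significant only if the p-value falls below the chosen alpha.
print("statistically significant" if p_value < alpha else "not significant")
```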

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .

Discrete and continuous variables are two types of quantitative variables :

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).


Bevans, R. (2023, June 22). Choosing the Right Statistical Test | Types & Examples. Scribbr. Retrieved August 26, 2024, from https://www.scribbr.com/statistics/statistical-tests/


Parametric vs. Non-Parametric Tests and When to Use Them

A parametric test makes assumptions while a non-parametric test does not assume anything.

Adrienne Kline

The fundamentals of data science include computer science, statistics and math. It’s very easy to get caught up in the latest and greatest, most powerful algorithms —  convolutional neural nets, reinforcement learning, etc.

As an ML/health researcher and algorithm developer, I often employ these techniques. However, something I have seen rife in the data science community, after having trained ~10 years as an electrical engineer, is that if all you have is a hammer, everything looks like a nail. Suffice it to say that while many of these exciting algorithms have immense applicability, the statistical underpinnings of the data science community are too often overlooked.

What is the Difference Between Parametric and Non-Parametric Tests?

A parametric test makes assumptions about a population’s parameters, and a non-parametric test does not assume anything about the underlying distribution.

I’ve been lucky enough to have had both undergraduate and graduate courses dedicated solely to statistics , in addition to growing up with a statistician for a mother. So this article will share some basic statistical tests and when/where to use them.

A parametric test makes assumptions about a population’s parameters:

  • Normality  : Data in each group should be normally distributed.
  • Independence  : Data in each group should be sampled randomly and independently.
  • No outliers  : No extreme outliers in the data.
  • Equal Variance  : Data in each group should have approximately equal variance.

If possible, we should use a parametric test. However, a non-parametric test (sometimes referred to as a distribution-free test) does not assume anything about the underlying distribution (for example, that the data come from a normal distribution).

We can assess normality visually using a Q-Q (quantile-quantile) plot. In these plots, the observed data are plotted against the expected quantiles of a normal distribution. If the data are normal, the points will fall approximately along a straight line. A demo in Python, where a random normal sample is created, is shown below.
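A minimal reconstruction of such a demo, assuming NumPy and SciPy (passing a Matplotlib axes via `plot=` would draw the actual figure):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=0, scale=1, size=500)   # simulated normal sample

# probplot computes theoretical normal quantiles for the ordered data and
# fits a least-squares line; plot=some_axes would render the Q-Q plot.
(theory_q, sample_q), (slope, intercept, r) = stats.probplot(data, dist="norm")

# For truly normal data the points hug a straight line, so r is close to 1.
print(f"Q-Q line correlation: {r:.4f}")
```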


Tests to Check for Normality

  • Shapiro-Wilk
  • Kolmogorov-Smirnov

The null hypothesis of both of these tests is that the sample was drawn from a normal (or Gaussian) distribution. Therefore, if the p-value falls below the chosen significance level, the assumption of normality is rejected in favor of the alternative hypothesis that the data are non-normal.
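Both checks are available in `scipy.stats`; the sketch below applies them to a simulated, clearly non-normal sample (note that standardizing with the sample's own mean and standard deviation makes this Kolmogorov-Smirnov variant approximate):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
skewed = rng.exponential(scale=2.0, size=200)   # clearly non-normal sample

# Shapiro-Wilk: null hypothesis is that the sample comes from a normal distribution.
w_stat, w_p = stats.shapiro(skewed)

# Kolmogorov-Smirnov against a standard normal, after standardizing the sample.
z = (skewed - skewed.mean()) / skewed.std(ddof=1)
ks_stat, ks_p = stats.kstest(z, "norm")

print(f"Shapiro p = {w_p:.2e}, KS p = {ks_p:.2e}")
```

With a strongly skewed sample of this size, both tests reject normality.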

Selecting the Right Test

You can refer to this table when dealing with interval level data for parametric and non-parametric tests.


Advantages and Disadvantages

Non-parametric tests have several advantages, including:

  • More statistical power when assumptions of parametric tests are violated.
  • Assumption of normality does not apply.
  • Small sample sizes are okay.
  • They can be used for all data types, including ordinal, nominal and interval (continuous).
  • Can be used with data that has outliers.

Disadvantages of non-parametric tests:

  • Less powerful than parametric tests if assumptions haven’t been violated



Difference Between Parametric and Non-Parametric Tests


Introduction

Parametric and non-parametric tests are two fundamental concepts in statistical analysis . Understanding the difference between these two types of tests is crucial for researchers and analysts in selecting the appropriate statistical method for their data analysis. Parametric tests assume specific distributional properties of the data, while non-parametric tests make minimal assumptions about the underlying population distribution. This article aims to elucidate the differences between parametric and non-parametric tests. It starts by discussing parametric and non-parametric tests and their assumptions, then proceeds to highlight the key differences between these tests. Thus, the article provides a comprehensive understanding of parametric and non-parametric tests, their respective applications, and their implications in statistical inference.

Parametric Test: Definition and Assumptions


In the field of statistics, a parametric test is a hypothesis test that aims to make inferences about the mean of the original population. This type of test is conducted based on parameters and assumes knowledge of the population distribution. The commonly used t-test utilizes Student's t-statistic, which is applicable when specific conditions are met.

Parametric tests rely on the assumption of a normal distribution for the underlying variables. The mean value is considered, either known or estimated, while identifying the sample from the population. These tests are most effective when the variables of interest are measured on an interval scale.

Non-parametric Test: Definition and Characteristics

Contrary to parametric tests, non-parametric tests do not require any assumptions about the population distribution or distinct parameters. These tests are also hypothesis-based but do not rely on underlying assumptions. Instead, they focus on differences in medians and are often referred to as distribution-free tests.

Non-parametric tests are particularly suitable for variables determined on a nominal or ordinal level. They are commonly employed when dealing with non-metric independent variables. The central tendency is measured using the median value, and these tests offer flexibility by not relying on any specific probability distribution.

Key Differences Between Parametric and Non-parametric Tests

The key differences between parametric and non-parametric tests can be summarized as follows:

  • Assumptions : Parametric tests require assumptions about the data distribution, while non-parametric tests do not have such assumptions.
  • Central Tendency Value : Parametric tests use the mean value to measure central tendency, whereas non-parametric tests use the median value.
  • Correlation : Parametric tests employ Pearson correlation, while non-parametric tests use Spearman correlation.
  • Probabilistic Distribution : Parametric tests assume a normal distribution, while non-parametric tests can be applied to arbitrary distributions.
  • Population Knowledge : Parametric tests require knowledge about the population, whereas non-parametric tests do not have this requirement.
  • Applicability : Parametric tests are suitable for interval or ratio data, while non-parametric tests can also be used for nominal and ordinal data.
  • Examples : Parametric tests include z-test and t-test, while non-parametric tests include Kruskal-Wallis and Mann-Whitney.

Examples of parametric tests and Their Non-Parametric Test Equivalents

Parametric test | Non-parametric equivalent
Paired t-test (dependent t-test) | Wilcoxon signed-rank test
Pearson's correlation | Spearman's correlation
Independent-samples t-test | Mann-Whitney U test
One-way ANOVA | Kruskal-Wallis test
Two-way ANOVA | Friedman test
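For example, the paired pair from the table can be sketched with SciPy on simulated before/after measurements (the data are illustrative only):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
before = rng.normal(100, 15, 25)           # simulated pre-treatment scores
after = before + rng.normal(2, 5, 25)      # same subjects after treatment

t_stat, t_p = stats.ttest_rel(before, after)   # parametric: paired t-test
w_stat, w_p = stats.wilcoxon(before, after)    # nonparametric equivalent

print(f"paired t p = {t_p:.3f}, Wilcoxon signed-rank p = {w_p:.3f}")
```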

Final Take Away

  • The key difference between parametric and non-parametric tests lies in their fundamental principles and statistical approaches. While parametric tests rely on statistical distributions within the data, nonparametric tests do not depend on any specific distribution.

What is the advantage of using nonparametric tests?

Non-parametric tests do not depend on any specific distribution, making them robust and applicable in a broader range of situations.

What is the advantage of using parametric tests?

Parametric tests make fuller use of the information in the data; when their assumptions are met, they are more powerful and more likely to give accurate results.

What central tendency values are considered for parametric and non-parametric tests?

Parametric tests consider the mean value, while nonparametric tests take the median value into consideration.

What are examples of parametric tests?

Examples of parametric tests include the t-test and z-test in statistics.

What are examples of nonparametric tests?

Examples of nonparametric tests are the Kruskal-Wallis test and the Mann-Whitney test.


Non Parametric Data and Tests (Distribution Free Tests)

What is a non parametric test?

A non parametric test (sometimes called a distribution free test) does not assume anything about the underlying distribution (for example, that the data come from a normal distribution). That's in contrast to a parametric test, which makes assumptions about a population's parameters (for example, the mean or standard deviation). When the word "non parametric" is used in stats, it doesn't quite mean that you know nothing about the population. It usually means that you know the population data do not have a normal distribution.

For example, one assumption for the one way ANOVA is that the data comes from a normal distribution. If your data isn’t normally distributed , you can’t run an ANOVA, but you can run the nonparametric alternative—the Kruskal-Wallis test .

If at all possible, you should use parametric tests, as they tend to be more accurate. Parametric tests have greater statistical power, which means they are more likely to detect a true effect. Use nonparametric tests only if you have to (i.e. you know that assumptions like normality are being violated). That said, nonparametric tests can perform well with non-normal continuous data if you have a sufficiently large sample size (generally 15–20 items in each group).

When to use it

Non parametric tests are used when your data isn't normal. Therefore the key is to figure out if you have normally distributed data. For example, you could look at the distribution of your data. If your data is approximately normal, then you can use parametric statistical tests. Q. If you don't have a graph, how do you figure out if your data is normally distributed? A. Check the skewness and kurtosis of the distribution using software like Excel (See: Skewness in Excel 2013 and Kurtosis in Excel 2013). A normal distribution has no skew: it's centered and symmetrical in shape, with a skewness of 0. Kurtosis refers to how much of the data is in the tails versus the center; a normal distribution has a kurtosis of about 3 (or an excess kurtosis of 0).
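The same check can be done numerically with SciPy instead of Excel; note that `scipy.stats.kurtosis` reports excess kurtosis by default, which is about 0 for a normal distribution (the samples below are simulated):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
normal_data = rng.normal(size=1000)
skewed_data = rng.exponential(size=1000)

# A normal distribution has skewness 0 and excess kurtosis 0.
print(stats.skew(normal_data), stats.kurtosis(normal_data))
# An exponential distribution is strongly right-skewed (population skewness = 2).
print(stats.skew(skewed_data), stats.kurtosis(skewed_data))
```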


If your distribution is not normal (in other words, the skewness and kurtosis deviate substantially from the values expected under normality), you should use a non parametric test like the chi-square test. Otherwise you run the risk that your results will be meaningless.

Does your data allow for a parametric test, or do you have to use a non parametric test like chi-square? The rule of thumb is:

  • For nominal scales or ordinal scales , use non parametric statistics.
  • For interval scales or ratio scales use parametric statistics.


Other reasons to run nonparametric tests:

  • One or more assumptions of a parametric test have been violated.
  • Your sample size is too small to run a parametric test.
  • Your data has outliers that cannot be removed.
  • You want to test for the median rather than the mean (you might want to do this if you have a very skewed distribution ).

Types of Nonparametric Tests

When the word "parametric" is used in stats, it usually means tests like ANOVA or a t test. Those tests both assume that the population data has a normal distribution. Non parametric tests do not assume that the data is normally distributed. The only non parametric test you are likely to come across in elementary stats is the chi-square test. However, there are several others. For example: the Kruskal-Wallis test is the non parametric alternative to the one-way ANOVA and the Mann-Whitney is the non parametric alternative to the two sample t test.

The main nonparametric tests are:

  • 1-sample sign test . Use this test to estimate the median of a population and compare it to a reference value or target value.
  • 1-sample Wilcoxon signed rank test . With this test, you also estimate the population median and compare it to a reference/target value. However, the test assumes your data comes from a symmetric distribution (like the Cauchy distribution or uniform distribution ).
  • Friedman test . This test is used to test for differences between groups with ordinal dependent variables. It can also be used for continuous data if the one-way ANOVA with repeated measures is inappropriate (i.e. some assumption has been violated).
  • Goodman and Kruskal's Gamma : a test of association for ranked variables.
  • Kruskal-Wallis test . Use this test instead of a one-way ANOVA to find out if two or more medians are different. Ranks of the data points are used for the calculations, rather than the data points themselves.
  • The Mann-Kendall Trend Test looks for trends in time-series data.
  • Mann-Whitney test . Use this test to compare differences between two independent groups when dependent variables are either ordinal or continuous.
  • Mood’s Median test . Use this test instead of the sign test when you have two independent samples.
  • Spearman Rank Correlation. Use when you want to find a correlation between two sets of data.
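Two of the tests above can be sketched with SciPy on simulated data (group sizes, scales, and the x–y relationship are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
g1 = rng.exponential(1.0, 30)
g2 = rng.exponential(1.5, 30)
g3 = rng.exponential(2.0, 30)

# Kruskal-Wallis: rank-based alternative to one-way ANOVA.
h_stat, h_p = stats.kruskal(g1, g2, g3)

# Spearman rank correlation for a monotonic but nonlinear relationship.
x = rng.uniform(0, 10, 50)
y = x**2 + rng.normal(0, 5, 50)
rho, rho_p = stats.spearmanr(x, y)

print(f"Kruskal-Wallis p = {h_p:.3f}, Spearman rho = {rho:.2f}")
```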

The following table lists the nonparametric tests and their parametric alternatives.

Nonparametric test | Parametric Alternative
1-sample sign test | One-sample t-test
1-sample Wilcoxon signed-rank test | One-sample t-test

Advantages and Disadvantages

Compared to parametric tests , nonparametric tests have several advantages, including:

  • More statistical power when assumptions for the parametric tests have been violated. When assumptions haven’t been violated, they can be almost as powerful.
  • Fewer assumptions (i.e. the assumption of normality doesn’t apply).
  • Small sample sizes are acceptable.
  • They can be used for all data types, including nominal variables , interval variables , or data that has outliers or that has been measured imprecisely.

However, they do have their disadvantages. The most notable ones are:

  • Less powerful than parametric tests if assumptions haven’t been violated.
  • More labor-intensive to calculate by hand (for computer calculations, this isn’t an issue).
  • Critical value tables for many tests aren’t included in many computer software packages. This is compared to tables for parametric tests (like the z-table or t-table ) which usually are included.


Choosing Between Parametric and Non-parametric Tests in Statistical Data Analysis

  • First Online: 08 July 2024


Kingsley Okoye and Samira Hosseini

In this chapter, the authors describe what parametric and non-parametric tests are in statistical data analysis and the best scenarios for the use of each test. It provides a guide for readers on how to choose the test most suitable for their specific research, including a description of the differences, advantages, and disadvantages of the two types of tests. Before using the different statistical methods in R (see Chap. 6), users need to understand the differences and the conditions under which the various tests or methods are applied. The term "parametric" refers to parameters of the resultant datasets (distribution), where the supporting methods assume that the sample (mean, standard deviation, etc.) is normally distributed. The "non-parametric" tests (usually summarized by the median) are referred to as "distribution-free" tests, given that the supporting methods assume the analyzed datasets follow a certain but not specified distribution. Thus, the choice of statistical procedure (parametric versus non-parametric) depends on the type of the available dataset (nominal, ordinal, continuous, discrete) and/or the number of independent versus dependent groups or categories of the variables, which are described in Chapter 5.




© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Okoye, K., Hosseini, S. (2024). Choosing Between Parametric and Non-parametric Tests in Statistical Data Analysis. In: R Programming. Springer, Singapore. https://doi.org/10.1007/978-981-97-3385-9_4



Nonparametric Tests


When to Use a Nonparametric Test

Nonparametric tests are sometimes called distribution-free tests because they are based on fewer assumptions (e.g., they do not assume that the outcome is approximately normally distributed). Parametric tests involve specific probability distributions (e.g., the normal distribution) and the tests involve estimation of the key parameters of that distribution (e.g., the mean or difference in means) from the sample data. The cost of fewer assumptions is that nonparametric tests are generally less powerful than their parametric counterparts (i.e., when the alternative is true, they may be less likely to reject H 0 ).

It can sometimes be difficult to assess whether a continuous outcome follows a normal distribution and, thus, whether a parametric or nonparametric test is appropriate. There are several statistical tests that can be used to assess whether data are likely from a normal distribution. The most popular are the Kolmogorov-Smirnov test, the Anderson-Darling test, and the Shapiro-Wilk test 1 . Each test is essentially a goodness of fit test and compares observed data to quantiles of the normal (or other specified) distribution. The null hypothesis for each test is H 0 : Data follow a normal distribution versus H 1 : Data do not follow a normal distribution. If the test is statistically significant (e.g., p<0.05), then data do not follow a normal distribution, and a nonparametric test is warranted. It should be noted that these tests for normality can be subject to low power. Specifically, the tests may fail to reject H 0 : Data follow a normal distribution when in fact the data do not follow a normal distribution. Low power is a major issue when the sample size is small - which unfortunately is often when we wish to employ these tests. The most practical approach to assessing normality involves investigating the distributional form of the outcome in the sample using a histogram and to augment that with data from other studies, if available, that may indicate the likely distribution of the outcome in the population.
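For instance, the Anderson-Darling test in SciPy returns one statistic plus critical values at several significance levels, rather than a single p-value (the sample below is simulated):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
sample = rng.normal(50, 5, 100)   # simulated continuous outcome

result = stats.anderson(sample, dist="norm")

# The statistic is compared against critical values at the
# 15%, 10%, 5%, 2.5%, and 1% significance levels.
for crit, sig in zip(result.critical_values, result.significance_level):
    decision = "reject" if result.statistic > crit else "fail to reject"
    print(f"{sig}%: {decision} normality")
```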

There are some situations when it is clear that the outcome does not follow a normal distribution. These include situations:

  • when the outcome is an ordinal variable or a rank,
  • when there are definite outliers or
  • when the outcome has clear limits of detection.

Using an Ordinal Scale

Consider a clinical trial where study participants are asked to rate their symptom severity following 6 weeks on the assigned treatment. Symptom severity might be measured on a 5 point ordinal scale with response options: Symptoms got much worse, slightly worse, no change, slightly improved, or much improved. Suppose there are a total of n=20 participants in the trial, randomized to an experimental treatment or placebo, and the outcome data are distributed as shown in the figure below.

Distribution of Symptom Severity in Total Sample

[Figure: histogram of the number of participants in each symptom severity category; the distribution is skewed, with most participants in the Slightly Improved or Much Improved categories.]

The distribution of the outcome (symptom severity) does not appear to be normal as more participants report improvement in symptoms as opposed to worsening of symptoms.

When the Outcome is a Rank

When There are Outliers

In some studies, the outcome is continuous but subject to outliers or extreme values. For example, days in the hospital following a particular surgical procedure is an outcome that is often subject to outliers. Suppose in an observational study investigators wish to assess whether there is a difference in the days patients spend in the hospital following liver transplant in for-profit versus nonprofit hospitals. Suppose we measure days in the hospital following transplant in n=100 participants, 50 from for-profit and 50 from non-profit hospitals. The number of days in the hospital are summarized by the box-whisker plot below.

  Distribution of Days in the Hospital Following Transplant

[Figure: box-and-whisker plot of days in the hospital following transplant; the distribution is skewed, with most patients having shorter stays and a smaller number having long stays.]

Note that 75% of the participants stay at most 16 days in the hospital following transplant, while at least one stays 35 days, which would be considered an outlier. Recall from page 8 in the module on Summarizing Data that we used Q1-1.5(Q3-Q1) as a lower limit and Q3+1.5(Q3-Q1) as an upper limit to detect outliers. In the box-whisker plot above, Q1=12 and Q3=16; thus outliers are values below 12-1.5(16-12) = 6 or above 16+1.5(16-12) = 22.
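The fence arithmetic generalizes; a small helper (illustrative, not from the source) reproduces the numbers above:

```python
def iqr_fences(q1, q3, k=1.5):
    """Tukey fences: values below q1 - k*IQR or above q3 + k*IQR are outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles from the hospital-stay example above: Q1 = 12, Q3 = 16.
low, high = iqr_fences(12, 16)
print(low, high)  # 6.0 22.0
```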

Limits of Detection 

In some studies, the outcome is a continuous variable that is measured with some imprecision (e.g., with clear limits of detection). For example, some instruments or assays cannot measure presence of specific quantities above or below certain limits. HIV viral load is a measure of the amount of virus in the body and is measured as the amount of virus per a certain volume of blood. It can range from "not detected" or "below the limit of detection" to hundreds of millions of copies. Thus, in a sample some participants may have measures like 1,254,000 or 874,050 copies and others are measured as "not detected." If a substantial number of participants have undetectable levels, the distribution of viral load is not normally distributed.

Hypothesis Testing with Nonparametric Tests

In nonparametric tests, the hypotheses are not about population parameters (e.g., μ=50 or μ1=μ2). Instead, the null hypothesis is more general. For example, when comparing two independent groups in terms of a continuous outcome, the null hypothesis in a parametric test is H0: μ1=μ2. In a nonparametric test the null hypothesis is that the two populations are equal, often interpreted as the two populations being equal in terms of their central tendency.

 

Advantages of Nonparametric Tests

Nonparametric tests have some distinct advantages. With outcomes such as those described above, nonparametric tests may be the only way to analyze these data. Outcomes that are ordinal, ranked, subject to outliers or measured imprecisely are difficult to analyze with parametric methods without making major assumptions about their distributions as well as decisions about coding some values (e.g., "not detected"). As described here, nonparametric tests can also be relatively simple to conduct.


Content ©2017. All Rights Reserved. Date last modified: May 4, 2017. Wayne W. LaMorte, MD, PhD, MPH


Difference Between Parametric and Nonparametric Test


A parametric test is one in which specific assumptions are made about the population parameter. On the other hand, the nonparametric test is one where the researcher has no knowledge of the population parameter. Read this article in full to learn the significant differences between parametric and nonparametric tests.


Basis for Comparison | Parametric Test | Nonparametric Test
Meaning | A statistical test in which specific assumptions are made about the population parameter. | A statistical test used in the case of non-metric independent variables.
Basis of test statistic | Distribution | Arbitrary
Measurement level | Interval or ratio | Nominal or ordinal
Measure of central tendency | Mean | Median
Information about population | Completely known | Unavailable
Applicability | Variables | Variables and attributes
Correlation test | Pearson | Spearman

Definition of Parametric Test

The parametric test is a hypothesis test which provides generalisations for making statements about the mean of the parent population. A t-test, based on Student's t-statistic, is often used in this regard.

The t-statistic rests on the underlying assumption that the variable is normally distributed and the mean is known or assumed to be known. The population variance is estimated from the sample. It is also assumed that the variables of interest in the population are measured on an interval scale.
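As a minimal sketch of such a test, a one-sample t-test in SciPy looks like this; the sample values are invented, and the population mean of 50 is just an illustrative null value:

```python
from scipy.stats import ttest_1samp

# Hypothetical sample, assumed drawn from a normal population.
sample = [48.2, 51.5, 49.9, 50.7, 52.1, 49.4, 50.3, 51.0]

# Parametric H0: the population mean equals 50.
t_stat, p_value = ttest_1samp(sample, popmean=50)
print(t_stat, p_value)
```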

Definition of Nonparametric Test

The nonparametric test is defined as a hypothesis test which is not based on underlying assumptions, i.e. it does not require the population's distribution to be characterised by specific parameters.

The test is mainly based on differences in medians, and hence it is alternately known as the distribution-free test. The test assumes that the variables are measured on a nominal or ordinal level. It is used when the independent variables are non-metric.

Key Differences Between Parametric and Nonparametric Tests

The fundamental differences between parametric and nonparametric test are discussed in the following points:

  • A statistical test in which specific assumptions are made about the population parameter is known as a parametric test. A statistical test used in the case of non-metric independent variables is called a nonparametric test.
  • In the parametric test, the test statistic is based on a distribution. In the nonparametric test, by contrast, the test statistic is arbitrary.
  • In the parametric test, it is assumed that the variables of interest are measured on an interval or ratio level, whereas in the nonparametric test the variables of interest are measured on a nominal or ordinal scale.
  • In general, the measure of central tendency in the parametric test is the mean, while in the nonparametric test it is the median.
  • In the parametric test, there is complete information about the population. Conversely, in the nonparametric test, there is no information about the population.
  • The parametric test applies to variables only, whereas the nonparametric test applies to both variables and attributes.
  • For measuring the degree of association between two quantitative variables, Pearson's coefficient of correlation is used in the parametric test, while Spearman's rank correlation is used in the nonparametric test.
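The last point is easy to see on a nonlinear but perfectly monotonic relationship, where Spearman's rank correlation is exactly 1 while Pearson's is not (the values below are invented for illustration):

```python
from scipy.stats import pearsonr, spearmanr

# A nonlinear but perfectly monotonic relationship: y = x**2.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

r, _ = pearsonr(x, y)     # below 1: a straight line fits imperfectly
rho, _ = spearmanr(x, y)  # exactly 1: the ranks agree perfectly
print(r, rho)
```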


Parametric Test | Nonparametric Test
Independent-samples t-test | Mann-Whitney U test
Paired-samples t-test | Wilcoxon signed-rank test
One-way analysis of variance (ANOVA) | Kruskal-Wallis test
One-way repeated-measures ANOVA | Friedman's ANOVA
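Two of the pairings above can be run side by side in SciPy on simulated data; the group means and sample sizes below are arbitrary choices for illustration:

```python
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu, f_oneway, kruskal

rng = np.random.default_rng(0)
g1 = rng.normal(10, 2, 30)
g2 = rng.normal(11, 2, 30)
g3 = rng.normal(12, 2, 30)

# Independent-samples t-test and its nonparametric counterpart.
p_t = ttest_ind(g1, g2).pvalue
p_mw = mannwhitneyu(g1, g2).pvalue

# One-way ANOVA and its nonparametric counterpart.
p_anova = f_oneway(g1, g2, g3).pvalue
p_kw = kruskal(g1, g2, g3).pvalue

print(p_t, p_mw, p_anova, p_kw)
```

On near-normal data like this the parametric and nonparametric p-values tend to agree closely; they diverge when the distributional assumptions are violated.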

Choosing between a parametric and a nonparametric test is not easy for a researcher conducting statistical analysis. If the information about the population is completely known, by way of parameters, the appropriate test is a parametric test; if there is no such knowledge of the population and the hypothesis must still be tested, the test conducted is a nonparametric test.


