Calculate p-values from z-scores, t-scores, chi-square, and F-statistics instantly. Determine statistical significance for your hypothesis test. Free online p-value calculator with step-by-step guide and real-world examples.
Please provide any one value below to compute p-value from z-score or vice versa for a normal distribution.
Let's start simple. Imagine you flip a coin 10 times and get heads 9 times. That seems weird, right? A fair coin should give you heads about 5 times out of 10. So you start wondering: is this coin rigged?
The p-value is the answer to this question: "If the coin were perfectly fair, what are the chances I'd see a result this extreme (or more extreme) just by random luck?"
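In that coin example, the p-value is pretty small. The chance of getting 9 or 10 heads from a fair coin is about 0.011, or roughly 1%. If you also count the equally lopsided results in the other direction (9 or 10 tails), it's about 0.021, or roughly 2%. Either way, you'd have good reason to suspect the coin is rigged.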
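If you want to check the coin numbers yourself, here's a minimal sketch using the binomial distribution. The function name `binomial_tail` is made up for this illustration; it isn't part of the calculator.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float = 0.5) -> float:
    """Probability of k or more successes in n trials (X ~ Binomial(n, p))."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# One direction only: 9 or 10 heads out of 10 flips of a fair coin.
one_sided = binomial_tail(10, 9)

# Both directions: also count 9 or 10 tails. Doubling is valid here
# because a fair coin's distribution is symmetric.
two_sided = 2 * one_sided

print(f"one-sided p = {one_sided:.4f}")  # 0.0107
print(f"two-sided p = {two_sided:.4f}")  # 0.0215
```

The one-sided number is exactly 11/1024: one way to get 10 heads plus ten ways to get 9 heads, out of 1,024 equally likely sequences.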
That's the core idea. The p-value tells you how surprising your data is, assuming nothing special is going on (that's the null hypothesis). If it's really surprising (low p-value), you start thinking maybe something IS going on.
Using the calculator is straightforward. Here's the step-by-step:
That's it. No sign-ups, no hidden fees. Just a clean, fast calculator.
This is where a lot of people get confused. Let's clear it up.
A one-tailed test asks: "Is my result significantly higher (or lower) than expected?" You're only looking in one direction.
A two-tailed test asks: "Is my result significantly different from expected, in either direction?" You're looking at both extremes.
Here's an example. Say you're testing a new drug. If you only care whether it makes people better (not worse), you'd use a one-tailed test. If you care whether it changes things in either direction (better or worse), you'd use a two-tailed test.
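For symmetric distributions like the normal (z) and t distributions, the two-tailed p-value is twice the one-tailed p-value, as long as the one-tailed test points in the direction your data actually went. So if your one-tailed p-value is 0.03, your two-tailed p-value is 0.06. That's why choosing the right test matters: it can change your conclusion.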
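Here's a small sketch of how the tail choice changes the number for a z-score. The helper `z_to_p` is a hypothetical name, and it assumes the one-tailed test points in the same direction as the observed z-score.

```python
from statistics import NormalDist


def z_to_p(z: float, tails: int = 2) -> float:
    """P-value for a z-score under the standard normal distribution.

    Assumes a one-tailed test is in the direction of the observed z.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    upper = 1.0 - NormalDist().cdf(abs(z))  # area beyond |z| in one tail
    return upper if tails == 1 else 2.0 * upper


z = 1.88
print(f"one-tailed: {z_to_p(z, tails=1):.3f}")
print(f"two-tailed: {z_to_p(z, tails=2):.3f}")
```

With z = 1.88 the one-tailed result is about 0.030 and the two-tailed result is about 0.060, so the same data lands on opposite sides of a 0.05 cutoff depending on which test you chose.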
Here's the part that trips everyone up. A p-value does NOT tell you the probability that your hypothesis is true. Let me say that again because it's that important.
A p-value of 0.01 does NOT mean there's a 99% chance your hypothesis is correct.
So what does it mean? It means that if the null hypothesis were true, there's only a 1% chance you'd see data this extreme. That's a subtle but crucial difference.
Think of it this way. A p-value is about the data, not about the hypothesis. It's asking: "How weird is this data, assuming nothing is going on?" If the data is really weird (low p-value), you start to doubt that "nothing is going on" assumption.
You've probably seen "p < 0.05" everywhere. It's like the VIP section of statistics. But here's the truth: 0.05 is completely arbitrary.
A guy named Ronald Fisher (more on him later) suggested 0.05 as a convenient cutoff in the 1920s. It caught on, and now it's treated like a law of nature. But it's not.
Here's what different p-values typically mean:
But here's the thing. P-values of 0.049 and 0.051 are almost identical. Yet one is "significant" and the other isn't. That's why smart researchers don't just rely on the cutoff. They look at the actual p-value and the context.
Let's talk about the errors I see all the time on Reddit and in real life.
You get a p-value of 0.60. That's high. So you conclude: "The null hypothesis is true." Wrong. A high p-value just means you don't have enough evidence to reject the null. It doesn't mean the null is correct. Maybe your sample size was too small. Maybe your measurement was noisy. You can't prove the null is true — you can only fail to find evidence against it.
P-hacking is when someone runs a bunch of tests, then only reports the ones with "p < 0.05". It's dishonest, but it happens a lot. If you run 20 independent tests where nothing is really going on, there's about a 64% chance that at least one comes out "significant" at the 0.05 level just by chance. That's why replication matters.
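To see how fast false positives pile up, here's a quick sketch. It assumes every test is independent and that the null hypothesis is true in all of them, which is a simplification. The function name is made up for the example.

```python
def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent tests has p < alpha
    when every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** num_tests


for k in (1, 5, 20):
    print(f"{k:>2} tests -> {chance_of_false_positive(k):.1%}")
```

For 20 tests at the 0.05 level this comes out to roughly 64%, so a lone "significant" result out of 20 attempts isn't much evidence of anything.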
A tiny effect can have a low p-value if your sample is huge. Like, a drug that makes people 0.1% better might have p = 0.001 with 10,000 patients. Is that meaningful? Probably not. Always look at the actual size of the effect, not just the p-value.
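Here's a rough illustration of that point with a one-sample z-test. The effect (0.05 standard deviations) and the sample sizes are invented for the example, and `two_tailed_p` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p(effect: float, sd: float, n: int) -> float:
    """Two-tailed p-value for a one-sample z-test with a known standard deviation."""
    z = effect / (sd / sqrt(n))  # the standard error shrinks as n grows
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


effect, sd = 0.05, 1.0  # hypothetical: a tiny shift of 0.05 standard deviations
for n in (100, 1_000, 10_000):
    print(f"n = {n:>6,}  p = {two_tailed_p(effect, sd, n):.2g}")
```

The effect never changes, but the p-value drops from well above 0.05 at n = 100 to far below 0.001 at n = 10,000. The p-value is tracking how much data you have, not how much the effect matters.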
Who actually uses p-values? Pretty much everyone who works with data.
Doctors use p-values to test if new treatments work. If a drug trial shows p = 0.03, that's evidence the drug is effective. But they also look at side effects, cost, and how big the improvement actually is.
Marketers use p-values to test if their campaigns work. Did that new ad actually increase sales? Run an A/B test, calculate the p-value, and find out.
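As a sketch of what that A/B test might look like, here's a two-proportion z-test with a pooled standard error. This is one common approach, not the only one, and the conversion counts below are made up.

```python
from math import sqrt
from statistics import NormalDist


def ab_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Two-proportion z-test. Returns (z, two-tailed p-value)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Under the null hypothesis both ads share one conversion rate,
    # so pool the data to estimate it.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical data: old ad converts 200 of 5,000 visitors, new ad 250 of 5,000.
z, p = ab_test(200, 5000, 250, 5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The normal approximation behind this test works best when each group has plenty of conversions and non-conversions. With very small counts, an exact test is usually a safer choice.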
Factories use p-values to check if their products meet standards. If a test comparing a batch's defect rate against the target gives a p-value below 0.05, that's a sign something may be wrong with the production line.
The p-value was popularized by Ronald Fisher in the 1920s, although Karl Pearson had already used the idea in his chi-square test in 1900. Fisher was a British statistician who also loved genetics and smoking pipes. He framed it as part of a "test of significance" and suggested 0.05 as a convenient cutoff.
Here's a fun fact: Fisher didn't actually like the term "p-value." He preferred to call it the "level of significance." But the "p" stuck, and now it's everywhere.
Another fun fact: Fisher was famously argumentative. He got into huge debates with other statisticians about how to interpret p-values. Some of those debates are still going on today.
You don't need to know the formula to use the calculator. But if you're curious, here's the basic idea.
The p-value comes from the probability distribution of your test statistic. For a z-test, it's the standard normal distribution (the bell curve). Your test statistic is a point on that curve. The p-value is the area under the curve beyond that point.
So if your z-score is 2.0, the one-tailed p-value is the area under the bell curve from 2.0 to infinity. That area is about 0.0228. For a two-tailed test you count both ends of the curve, which gives about 0.0455.
Our calculator does this calculation instantly. It uses numerical methods to find the area to high precision, so you don't have to look up tables or do integrals.
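If you're curious what "area under the curve" looks like in code, here's a sketch that gets the tail area two ways: from the complementary error function, and by adding up the area directly with Simpson's rule. This is only an illustration and not necessarily how the calculator works internally. The function names and the choice to stop integrating 10 units past z are assumptions.

```python
from math import erfc, exp, pi, sqrt


def normal_pdf(x: float) -> float:
    """Height of the standard normal bell curve at x."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def upper_tail_closed_form(z: float) -> float:
    """Area beyond z, using P(Z > z) = erfc(z / sqrt(2)) / 2."""
    return 0.5 * erfc(z / sqrt(2.0))


def upper_tail_simpson(z: float, width: float = 10.0, steps: int = 2000) -> float:
    """Area beyond z via composite Simpson's rule on [z, z + width].

    steps must be even; the area past z + width is treated as negligible.
    """
    h = width / steps
    total = normal_pdf(z) + normal_pdf(z + width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * normal_pdf(z + i * h)
    return total * h / 3.0


z = 2.0
print(f"closed form: {upper_tail_closed_form(z):.4f}")  # 0.0228
print(f"Simpson:     {upper_tail_simpson(z):.4f}")      # 0.0228
print(f"two-tailed:  {2 * upper_tail_closed_form(z):.4f}")  # 0.0455
```

Both methods agree to many decimal places for z = 2.0, which is a handy sanity check if you ever write your own version.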
The p-value is a powerful tool, but it's not magic. It's just a number that tells you how surprising your data is. Use it wisely, understand its limits, and you'll be way ahead of most people.
And if you ever get confused, just come back to our calculator. Plug in your numbers, get your p-value, and keep moving forward.
A p-value of 0.05 means there's a 5% chance you'd see results this extreme (or more extreme) if the null hypothesis were true. It's the most common cutoff for statistical significance. But remember, it's just a guideline, not a magic number.
You can use our calculator for this. Just enter your z-score, select "z-score" as your test statistic, choose one-tailed or two-tailed, and click calculate. The calculator does the rest. If you want to do it by hand, you'd look up the z-score in a standard normal table and find the corresponding probability.
A lower p-value is not necessarily better. It means stronger evidence against the null hypothesis, but it doesn't tell you about the size or importance of the effect. A tiny, meaningless effect can have a very low p-value if your sample is huge. Always look at the effect size too.
Alpha is the threshold you set before running your test. It's your risk tolerance for false positives. Common alphas are 0.05, 0.01, and 0.10. The p-value is what you calculate from your data. If p is at or below alpha, you reject the null hypothesis. Otherwise, you fail to reject it.
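The decision rule fits in a few lines. In this sketch the name `decide` is hypothetical, and counting p exactly equal to alpha as a rejection is a common convention that's assumed here.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value with a significance level chosen in advance."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    # Assumption: p == alpha counts as a rejection.
    return "reject the null" if p_value <= alpha else "fail to reject the null"


for p in (0.001, 0.049, 0.051, 0.60):
    print(f"p = {p} -> {decide(p)}")
```

Notice that 0.049 and 0.051 get opposite labels even though they're nearly the same number, so it's worth reporting the actual p-value alongside the decision.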
A p-value can never be greater than 1. It's a probability, so it must be between 0 and 1. If you get a p-value greater than 1, something went wrong in your calculation. Double-check your inputs and try again.
A p-value of 0.5 means there's a 50% chance you'd see results this extreme if the null hypothesis were true. That's very high. It means your data is totally consistent with the null hypothesis. You have no evidence to reject it.