Parametric or Non-Parametric: Which Difference Test Should You Use?

Many researchers use an experimental design to find out which method gives significantly better results for the outcome they care about. People who work in crop cultivation and breeding are certainly familiar with experimental designs such as the completely randomized design, the randomized block design, and others, each with its own terms and conditions.

But this time I am not discussing experimental design. My own background is socioeconomics, where the research site is a very large laboratory called the community. The function and purpose, though, are almost identical to those of an experimental design: finding out whether one group of sample data is larger than another. The main difference is that a breeder can control the environment of the experiment, making conditions homogeneous or ideal, whereas in the community we generally cannot.

What comes to mind when you hear the term T test? The first thing that comes to my mind is the statistic used to check whether a coefficient or the constant in a regression model is significant, that is, whether its p-value is below 0.05. If it is significant, the coefficient has an effect on Y, the dependent variable.

Actually, the T test is not only used in regression models. At its statistical core, the T test is used to determine whether one group of sample data differs significantly from another group (two-sample T). For example, take rice growth data under different planting patterns: one uses jarwo, one uses tegel, and one uses direct seed spreading. Which group performs significantly better? The T test can help with this problem by comparing the groups two at a time. The T test can also be used to find out whether the mean of a set of sample data is lower or higher than a value we specify (one-sample T).

As I understand it for now, difference tests fall into two groups: difference tests for parametric data and difference tests for non-parametric data. Parametric data must meet several conditions: (1) the data are normally distributed; (2) the data are ratio-type; and (3) there are at least 30 observations (a common rule of thumb).

Ratio-type data can be compared as ratios and have an absolute zero. Take production data: a production of 200 kg can be said to be twice a production of 100 kg, right? Production data can also take the value 0, meaning no production. Is there data that has no absolute zero and cannot be compared this way? There is. It is called interval data, one level below ratio data. An example is temperature in degrees Celsius. We can't say 50 °C is twice as hot as 25 °C, right? The Celsius scale has no absolute zero: 0 °C is still a temperature; it does not mean there is no temperature.
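The ratio argument can be checked with a little arithmetic. The sketch below is only an illustration (the function name is made up for this example): it converts both temperatures to kelvin, a scale that does have an absolute zero, and shows that the apparent "twice as hot" disappears.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to kelvin, whose zero is absolute zero."""
    return celsius + 273.15

# Production in kg is ratio data: the ratio itself is meaningful.
production_ratio = 200 / 100  # 2.0, "twice the production"

# Celsius is interval data: 50 / 25 depends on where the zero was placed.
celsius_ratio = 50 / 25  # 2.0, but this does not mean "twice as hot"
kelvin_ratio = celsius_to_kelvin(50) / celsius_to_kelvin(25)

print(production_ratio, celsius_ratio, round(kelvin_ratio, 3))
```

Differences, on the other hand, are meaningful on both scales: 50 °C is 25 degrees warmer than 25 °C whichever zero is used.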

If you understand the type of data you have, you already know which difference test to use. Let’s discuss them one by one. I use Minitab 17 because it is very simple.

Parametric

The difference test used in parametric analysis is the T test. There are two types: the one-sample T test and the two-sample T test. I have discussed the one-sample T test in the article: One-Sample T Test: What is stronger than the average?

For the two-sample T test, I have the following data that you can download to practice with me:

Difference test: parametric T test data

Click Stat – Basic Statistics – 2-Sample t


In the window that appears, select "Each sample is in its own column", which matches how the data has been prepared. Enter variable A in Sample 1 and variable B in Sample 2, either by typing the name or by clicking the variable in the list on the left.


Then click Options. In this window, specify the confidence level. If we want alpha = 5%, we fill in a confidence level of 95.

The difference in Minitab is defined as A minus B, that is, the mean of A minus the mean of B. If group A has the same mean as group B, the mean of A minus the mean of B equals zero, which is the null hypothesis (the hypothesized difference). The hypothesized difference is 0 by default, and you can change it to suit your research. For example, if you enter 10, the null hypothesis becomes mean A – mean B = 10, and the test checks whether the observed difference departs from 10 rather than from 0. In this exercise, we fill in 0.

The alternative hypothesis in the next field is the H1 of your research. You can choose difference > hypothesized difference, difference ≠ hypothesized difference, or difference < hypothesized difference.

difference > hypothesized difference means mean A – mean B > 0, so mean A > mean B (one-sided).

difference ≠ hypothesized difference means mean A ≠ mean B (two-sided), without specifying whether mean A is larger or smaller than mean B.

difference < hypothesized difference means mean A – mean B < 0, so mean A < mean B (one-sided).

In this exercise, we will try them all. First, we choose difference ≠ hypothesized difference.


Click OK twice, and the result appears:

Two-Sample T-Test and CI: A, B

 Two-sample T for A vs B
    N   Mean  StDev  SE Mean
A  40  34.23   8.33      1.3
B  40  48.73   6.03     0.95
Difference = μ (A) – μ (B)
Estimate for difference:  -14.50
95% CI for difference:  (-17.74, -11.26)
T-Test of difference = 0 (vs ≠): T-Value = -8.92  P-Value = 0.000  DF = 71

The p-value is reported as 0.000, which is below 0.05, so we reject H0: mean A differs significantly from mean B.
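To see where the T-Value and DF come from, here is a minimal sketch that recomputes them from the summary statistics printed above. It assumes the unequal-variance (Welch) form of the two-sample T test, which is consistent with DF = 71 in the output. The function name and its parameters are my own for illustration, not Minitab's.

```python
import math

def welch_t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b,
                         hypothesized_diff=0.0):
    """Return (t, df) for a two-sample t test without assuming equal variances."""
    var_a = sd_a ** 2 / n_a  # squared standard error of mean A
    var_b = sd_b ** 2 / n_b  # squared standard error of mean B
    se_diff = math.sqrt(var_a + var_b)
    t = ((mean_a - mean_b) - hypothesized_diff) / se_diff
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, df

# Summary statistics copied from the output above (already rounded there)
t_value, df = welch_t_from_summary(34.23, 8.33, 40, 48.73, 6.03, 40)
print(round(t_value, 2), math.floor(df))
```

Changing hypothesized_diff from 0 to another value shifts the numerator only, which is what entering a different hypothesized difference in the Options window does.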

We repeat the steps above, this time selecting difference > hypothesized difference as H1. The result is:

Two-Sample T-Test and CI: A, B

 Two-sample T for A vs B
    N   Mean  StDev  SE Mean
A  40  34.23   8.33      1.3
B  40  48.73   6.03     0.95
Difference = μ (A) – μ (B)
Estimate for difference:  -14.50
95% lower bound for difference:  -17.21
T-Test of difference = 0 (vs >): T-Value = -8.92  P-Value = 1.000  DF = 71

The p-value is 1.000, well above 0.05, so we fail to reject H0: there is no evidence that mean A is greater than mean B.

Now we choose difference < hypothesized difference. Before clicking OK, click Graphs and check Individual value plot and Boxplot. These plots will help you see how different your samples are. Click OK, and the result is:

Two-Sample T-Test and CI: A, B

 Two-sample T for A vs B
    N   Mean  StDev  SE Mean
A  40  34.23   8.33      1.3
B  40  48.73   6.03     0.95
Difference = μ (A) – μ (B)
Estimate for difference:  -14.50
95% upper bound for difference:  -11.79
T-Test of difference = 0 (vs <): T-Value = -8.92  P-Value = 0.000  DF = 71

The p-value is below 0.05, so H0 is rejected: mean A is significantly smaller than mean B.
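The three runs share the same T-Value and DF; only the tail of the t distribution used for the p-value changes. The sketch below shows that relationship using only the Python standard library. The helper names are made up for this example, and the t distribution's CDF is approximated by numerical integration (Simpson's rule), so treat the results as approximate rather than as what Minitab computes internally.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) with Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 0.5
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability between 0 and |t|
    cdf = 0.5 + area if t > 0 else 0.5 - area
    return min(1.0, max(0.0, cdf))  # guard against tiny numerical overshoot

def p_values(t, df):
    """P-values for the three alternative hypotheses offered in the dialog."""
    lower = t_cdf(t, df)
    return {
        "less": lower,                                # H1: mean A < mean B
        "greater": 1.0 - lower,                       # H1: mean A > mean B
        "not_equal": 2.0 * min(lower, 1.0 - lower),   # H1: mean A ≠ mean B
    }

p = p_values(-8.92, 71)
print({name: round(value, 3) for name, value in p.items()})
```

Because the T-Value is strongly negative, almost all of the probability lies above it, which is why the "greater than" run reports a p-value of 1.000 while the other two report 0.000.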

The conclusion from this data is that the mean of group A differs significantly from the mean of group B. More specifically, the mean of group A is significantly smaller than the mean of group B.

The boxplot image can be seen as follows:

[Figures: individual value plot and boxplot of A and B]

Non-parametric

If your data fails one or more of the three criteria above, you should use a non-parametric analysis. For two samples there is the Mann-Whitney test, while for two or more samples you can use the Kruskal-Wallis test.

For Mann-Whitney, the menu is Stat – Nonparametrics – Mann-Whitney.


The entries in the Mann-Whitney dialog are not much different from those of the T test. You can choose less than (meaning A < B), not equal, or greater than. Only the wording is different.


Mann-Whitney Test and CI: A, B

     N  Median
A  40  33.500
B  40  47.500
Point estimate for η1 – η2 is -15.000
95.1 Percent CI for η1 – η2 is (-18.001,-11.000)
W = 959.0
Test of η1 = η2 vs η1 < η2 is significant at 0.0000
The test is significant at 0.0000 (adjusted for ties)
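As far as I understand, the W in this output is the sum of the ranks of the first sample (A) after both samples are ranked together, with tied values sharing their average rank. The sketch below illustrates that idea on a tiny made-up data set, not on the 40 observations used above. The function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # extend the run while the next value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def rank_sum_w(sample_a, sample_b):
    """Sum of the ranks of sample_a within the combined sample."""
    ranks = average_ranks(list(sample_a) + list(sample_b))
    return sum(ranks[:len(sample_a)])

# Hypothetical data, chosen only to keep the arithmetic easy to follow
group_a = [31, 35, 29, 40]
group_b = [45, 38, 50, 47]
print(rank_sum_w(group_a, group_b))
```

A small W means that sample A mostly occupies the low ranks, which matches the direction of the "less than" alternative chosen above.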

As for Kruskal-Wallis, the data layout is slightly different from Mann-Whitney. It is similar to the layout used in experimental design: only two columns, one holding the treatment (sample) labels and the other the values. You can download Kruskal-Wallis practice materials here.

[Image: Kruskal-Wallis practice data]

In the picture above, the treatment column covers 3 samples, with production data as the values.

Then click Stat – Nonparametrics – Kruskal-Wallis


Enter production (produksi) in the Response field and treatment (perlakuan) in the Factor field.


Kruskal-Wallis Test: produksi versus perlakuan

 Kruskal-Wallis Test on produksi
perlakuan   N  Median  Ave Rank      Z
1          10   47.00      14.3  -0.53
2          10   44.50      11.7  -1.67
3          10   56.50      20.5   2.20
Overall    30              15.5
H = 5.27  DF = 2  P = 0.072
H = 5.31  DF = 2  P = 0.070  (adjusted for ties)

We look at the H value and go straight to the p-value, which is 0.072. At the 95% confidence level (alpha = 0.05), the p-value is greater than 0.05, so we conclude that the three samples do not differ significantly in production.

However, if we use alpha = 0.1 (10%), the p-value is below alpha. The conclusion is then the opposite of the paragraph above: at least one of the three samples differs in production from the others. Which sample differs must then be determined with Mann-Whitney tests, as explained earlier, comparing the three samples pair by pair.
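The H value and its p-value can be recomputed from the average ranks in the output above. This is a minimal sketch with function names of my own choosing; it reproduces the unadjusted line (H = 5.27, P = 0.072), not the one adjusted for ties. The p-value shortcut works only for DF = 2, that is, for three groups, where the chi-square upper tail has the closed form exp(-H/2).

```python
import math

def kruskal_wallis_h(group_sizes, avg_ranks):
    """Unadjusted Kruskal-Wallis H from each group's size and average rank."""
    n_total = sum(group_sizes)
    overall_avg_rank = (n_total + 1) / 2
    spread = sum(
        n * (rank - overall_avg_rank) ** 2
        for n, rank in zip(group_sizes, avg_ranks)
    )
    return 12 / (n_total * (n_total + 1)) * spread

def chi2_upper_tail_df2(x):
    """P(chi-square > x) for exactly 2 degrees of freedom."""
    return math.exp(-x / 2)

# N and Ave Rank copied from the Kruskal-Wallis output above
h_value = kruskal_wallis_h([10, 10, 10], [14.3, 11.7, 20.5])
p_value = chi2_upper_tail_df2(h_value)
print(round(h_value, 2), round(p_value, 3))

# The same p-value leads to opposite decisions at the two alpha levels
for alpha in (0.05, 0.10):
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    print(alpha, decision)
```

With three samples, the follow-up comparison means three Mann-Whitney tests: 1 vs 2, 1 vs 3, and 2 vs 3.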

Thank you.

