Statistical-Significance

開發統計測試以區分兩種產品

  • March 22, 2015

我有一個客戶調查的數據集,我想部署一個統計測試來查看產品 1 和產品 2 之間是否存在顯著性差異。

這是客戶評論的數據集。

比率從非常差、差、好的、好到非常好。

customer    product1    product2
1           very good   very bad
2           good        bad
3           okay        bad
4           very good   okay
5           bad         very good
6           okay        good
7           bad         okay
8           very good   very bad
9           good        good
10          good        very good
11          okay        okay
12          very good   good
13          good        good
14          very good   okay
15          very good   okay

我應該使用什麼方法來查看這兩種產品之間是否有任何區別?

對於不同評委的排名,可以使用弗里德曼測試。 http://en.wikipedia.org/wiki/Friedman_test

您可以將評級從非常差到非常好轉換為 -2、-1、0、1 和 2 的數字。然後將數據放入長格式並應用friedman.test,並將客戶作為阻止因子:

> mm
  customer variable value
1         1 product1     2
2         2 product1     1
3         3 product1     0
4         4 product1     2
5         5 product1    -1
6         6 product1     0
7         7 product1    -1
8         8 product1     2
9         9 product1     1
10       10 product1     1
11       11 product1     0
12       12 product1     2
13       13 product1     1
14       14 product1     2
15       15 product1     2
16        1 product2    -2
17        2 product2    -1
18        3 product2    -1
19        4 product2     0
20        5 product2     2
21        6 product2     1
22        7 product2     0
23        8 product2    -2
24        9 product2     1
25       10 product2     2
26       11 product2     0
27       12 product2     1
28       13 product2     1
29       14 product2     0
30       15 product2     0
> 
> friedman.test(value~variable|customer, data=mm)

       Friedman rank sum test

data:  value and variable and customer
Friedman chi-squared = 1.3333, df = 1, p-value = 0.2482

2個產品之間的差異排名不顯著。

編輯:

以下是回歸的輸出:

> summary(lm(value~variable+factor(customer), data=mm))

Call:
lm(formula = value ~ variable + factor(customer), data = mm)

Residuals:
  Min     1Q Median     3Q    Max 
 -1.9   -0.6    0.0    0.6    1.9 

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
(Intercept)         4.000e-01  9.990e-01   0.400    0.695
variableproduct2   -8.000e-01  4.995e-01  -1.602    0.132
factor(customer)2   6.248e-16  1.368e+00   0.000    1.000
factor(customer)3  -5.000e-01  1.368e+00  -0.365    0.720
factor(customer)4   1.000e+00  1.368e+00   0.731    0.477
factor(customer)5   5.000e-01  1.368e+00   0.365    0.720
factor(customer)6   5.000e-01  1.368e+00   0.365    0.720
factor(customer)7  -5.000e-01  1.368e+00  -0.365    0.720
factor(customer)8   9.645e-16  1.368e+00   0.000    1.000
factor(customer)9   1.000e+00  1.368e+00   0.731    0.477
factor(customer)10  1.500e+00  1.368e+00   1.096    0.291
factor(customer)11  7.581e-16  1.368e+00   0.000    1.000
factor(customer)12  1.500e+00  1.368e+00   1.096    0.291
factor(customer)13  1.000e+00  1.368e+00   0.731    0.477
factor(customer)14  1.000e+00  1.368e+00   0.731    0.477
factor(customer)15  1.000e+00  1.368e+00   0.731    0.477

Residual standard error: 1.368 on 14 degrees of freedom
Multiple R-squared:  0.3972,    Adjusted R-squared:  -0.2486 
F-statistic: 0.6151 on 15 and 14 DF,  p-value: 0.8194

在此處輸入圖像描述

引用自:https://stats.stackexchange.com/questions/142903

comments powered by Disqus