Investigation of measurement precision of latent variables in education

Maslak Anatoly A.; Osipov Sergey A.; Goncharova Tatyana N.

Образование и наука (Education and Science). 2014. No. 7 (116)

QUALIMETRIC APPROACH IN EDUCATION

UDC 303.094.7

Maslak Anatoly A.

Doctor of Technical Sciences, Professor, Head of the Laboratory for Objective Measurement, Affiliate of Kuban State University, Slavyansk-on-Kuban. E-mail: [email protected]

Osipov Sergey A.

Candidate of Technical Sciences, Associate Professor, Department of Mathematics and Informatics, Affiliate of Kuban State University, Slavyansk-on-Kuban. E-mail: [email protected]

Goncharova Tatyana N.

Postgraduate, Department of Mathematics and Informatics, Affiliate of Kuban State University, Slavyansk-on-Kuban. E-mail: [email protected]

INVESTIGATION OF MEASUREMENT PRECISION OF LATENT VARIABLES IN EDUCATION¹

Abstract. The objective of the study is to investigate how the measurement precision of latent variables depends on the number of dichotomous test items and the range of their variation.

Methods: The investigation is based on simulation experiments.

Results: The authors make recommendations for selecting the number of dichotomous test items and the range of their variation depending on the required measurement precision of latent variables.

Scientific novelty: The research establishes a statistical relationship between the measurement precision of a latent variable and the number of test items and the range of their variation.

Importance for practice: The research results can be used when developing questionnaires and tests for measuring latent variables.

Keywords: latent variable, Rasch model, measurement precision, dichotomous items, simulation experiment.

¹ The article is published in the authors' own wording.


Introduction

In education and other social systems, the majority of variables, for example students' proficiency, are latent, i.e. they cannot be measured directly in the way that weight or length can. In the middle of the last century, the development of the theory of latent variables made it possible to measure latent variables on a linear scale. After the work of Georg Rasch [1], a large number of research papers applying and discussing the Rasch model have been published. This allowed research in education and other social systems to move to an essentially more advanced level [2-4]. But some issues remain open. One of them is the number of test items needed to obtain the required measurement precision of a latent variable [5, 6]. Another is the influence of the range of item variation on the measurement precision of a latent variable.

Purpose of the work

Tests and questionnaires play an important role in individual decision-making in areas such as educational testing, personnel selection, and many others. The research aims to determine how the measurement precision of a latent variable depends on the number of dichotomous items and the range of their variation. The need for this research results from the fact that the cost of measurement depends substantially on the number of test items. It is therefore important to choose the minimum number of test items that provides the required precision of the latent variable measurement.

Methods

The authors use the paradigm of measuring latent variables developed by the Danish mathematician G. Rasch. In this paradigm the estimate of a latent variable, for example students' proficiency, does not depend on the difficulty of the set of test items [7, 8]. In addition, students' proficiency and item difficulty are measured on the same linear interval scale, in logits. By means of linear operations, the scale of a latent variable can be transformed into any other scale. For example, the Federal Centre of Testing of the Ministry of Education and Science of the Russian Federation transforms the logits of the Unified State Exam into a 100-mark scale.
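The linear rescaling mentioned above can be illustrated with a minimal sketch. The function name `rescale` and the mapping of the interval [-4, +4] logits onto [0, 100] are assumptions made purely for illustration; they are not the transformation actually used by the Federal Centre of Testing.

```python
def rescale(logit, low=-4.0, high=4.0, new_low=0.0, new_high=100.0):
    """Linearly map a logit value from [low, high] onto [new_low, new_high].

    The default end points are hypothetical and chosen only for illustration.
    """
    slope = (new_high - new_low) / (high - low)
    return new_low + slope * (logit - low)


for logit in (-4.0, -1.0, 0.0, 2.5, 4.0):
    print(f"{logit:+.1f} logits -> {rescale(logit):.2f} marks")
```

Because the transformation is linear, differences between students keep the same proportions on the new scale as on the logit scale.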

It is convenient to characterize measurement precision by the standard error. There is therefore a need to establish the quantitative dependence of the standard error of measurement of a latent variable on the number of test items and the range of their variation.

The research was based on a simulation experiment. This method is used because the measurement model (the Rasch model) is probabilistic and nonlinear, and analytical research methods are ineffective in such situations [9, 10].

The data matrix was generated according to the following scheme. Students' proficiency varied from -4.0 to +4.0 logits. This range covers the majority of practical Rasch model applications. For convenience in analysing measurement precision, 17 values of the latent variable (17 levels of proficiency) were used with a step of 0.5 logits: the first level equals -4.0, the second level equals -3.5, ..., the seventeenth equals +4.0 logits. Each of the 17 levels was used three times, so the generated matrix has 51 rows.

The difficulty of the test items varies over the intervals [-2; +2] and [-4; +4] logits. Ten sets of test items were used: the first set consists of 10 items, the second of 20, ..., the tenth of 100 items. In each set the items were evenly distributed within the above-mentioned intervals.
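The generation scheme described above can be sketched as follows. This is only an illustration under stated assumptions: "evenly distributed" is read as equally spaced difficulties that include both end points of the interval, and the names `abilities`, `item_difficulties` and `item_sets` are hypothetical.

```python
import numpy as np

# 17 proficiency levels from -4.0 to +4.0 logits in steps of 0.5
levels = np.linspace(-4.0, 4.0, 17)
# each level is used three times -> 51 simulated students
abilities = np.repeat(levels, 3)


def item_difficulties(n_items, low, high):
    """Equally spaced difficulties on [low, high]; including both end points is an assumption."""
    return np.linspace(low, high, n_items)


# ten item sets (10, 20, ..., 100 items) for each of the two difficulty ranges
item_sets = {
    (low, high, m): item_difficulties(m, low, high)
    for (low, high) in [(-2.0, 2.0), (-4.0, 4.0)]
    for m in range(10, 101, 10)
}

print(len(abilities), "students,", len(item_sets), "item sets")
```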

In terms of the design of experiments, a four-way randomized block plan with replication was used, with three treatment factors A, B, C and a block factor D [11]:

• Factor A is the range of test items variation; a = 2 levels: (-2.0, 2.0 logits), (-4.0, 4.0 logits).

• Factor B is the student location; b = 17: (-4.0, -3.5, -3.0, ..., +4.0 logits).

• Factor C is the number of items in the set; c = 10: (10, 20, 30, ..., 100).

• Block-factor D varied on three levels; d = 3.

The response variable Y is the standard error of measurement of students' proficiency (latent variable).

The data of the simulation experiment were generated in accordance with the Rasch model for dichotomous items:

$$p_{ij} = \frac{e^{\beta_i - \delta_j}}{1 + e^{\beta_i - \delta_j}}, \qquad (1)$$

where $p_{ij}$ is the probability of a correct answer of the i-th student to the j-th item, $\beta_i$ is the proficiency level of the i-th student (logits), and $\delta_j$ is the difficulty of the j-th test item (logits).

Then, based on the calculated probabilities (1), the entries of the dichotomous data matrix are generated:

$$x_{ij} = \mathrm{Int}(p_{ij} - \mathrm{Rnd} + 1), \qquad (2)$$

where Int(Y) is the integer part of the number Y and Rnd is a random number uniformly distributed on the interval (0; 1).
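Formulas (1) and (2) can be sketched in a few lines. Since Rnd lies between 0 and 1, the integer part of p - Rnd + 1 equals 1 exactly when Rnd does not exceed p, so each entry is 1 with probability p. The function names, the random seed and the use of NumPy are assumptions for illustration; the authors' own generator is not described in the text.

```python
import numpy as np


def rasch_probability(beta, delta):
    """Formula (1): probability of a correct answer, students in rows, items in columns."""
    z = np.asarray(beta, dtype=float)[:, None] - np.asarray(delta, dtype=float)[None, :]
    return 1.0 / (1.0 + np.exp(-z))


def simulate_responses(beta, delta, rng):
    """Formula (2): x = Int(p - Rnd + 1), with Rnd uniform on [0, 1)."""
    p = rasch_probability(beta, delta)
    rnd = rng.random(p.shape)
    return np.floor(p - rnd + 1.0).astype(int)


rng = np.random.default_rng(0)  # seed chosen arbitrarily, for reproducibility only
beta = np.repeat(np.linspace(-4.0, 4.0, 17), 3)  # 51 students
delta = np.linspace(-4.0, 4.0, 30)               # 30 items on [-4, +4]
x = simulate_responses(beta, delta, rng)
print(x.shape)
```

Rows of `x` are ordered from the lowest to the highest proficiency here, whereas Table 1 lists students from the highest to the lowest.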

As an example, Table 1 presents the generated data matrix for 30 items whose difficulty varies from -4.0 to +4.0 logits.

Table 1

Data of simulation experiment with 30 items

Student   Student proficiency (logits)   Items (30)

1 4.0 1111111 1111111111111111111101

2 4.0 11111111 1111111111111111111001

3 4.0 11111111 1111111111111111111011

4 3.5 11111111 1111111111111101110010

5 3.5 11111111 1111111111111101111000

6 3.5 11111111 1111111111110111111110

7 3.0 11111111 1111111111111111100010

8 3.0 11111111 1111111111111111011010

9 3.0 11111111 1111111011110110111100

10 2.5 11111111 1110111111111111010000

11 2.5 11111111 1111111111101011100011

12 2.5 11111111 1111111111100111100000

13 2.0 11111111 1111111101100101000010

14 2.0 11111111 1111110111100111001100

15 2.0 11111111 1111111111111110010000

16 1.5 11111111 1111111101111111001000

17 1.5 11111111 1111011111101010000000

18 1.5 11111111 1111111111001100000000

19 1.0 11111111 1111011101010110010000

20 1.0 11111111 1111110101100100000000

21 1.0 11111111 0101100011001000100000

22 0.5 11111111 1111110100001100000000

23 0.5 11111111 1001101111001000100000

24 0.5 11111111 1111111111111000000000

25 0.0 111111100111101101000000000000

26 0.0 1111111 0111100100000000000000

27 0.0 11111111 1010011001000100000000

28 -0.5 11111110 1111111100010000000000

29 -0.5 11111111 0010111000000000100000

30 -0.5 11111111 1110110100100000000000

31 -1.0 1111110 1110011101000000000000

32 -1.0 0111111 1111000010001001000000

33 -1.0 1111110 1110000100000000000000

34 -1.5 0111101 1111100000001000000000

35 -1.5 11101010 1001000000000001000000

36 -1.5 11111111 1101100000000010000000

37 -2.0 11111111 0100000000000001000000

38 -2.0 111111100100100001000000000000

39 -2.0 111110100000000100100000100000


40 -2.5 010011010010000000000000000000

41 -2.5 111000000100011000100000000000

42 -2.5 100101000000000000000000000000

43 -3.0 110110100000000100000000000000

44 -3.0 011101000000000100000000000000

45 -3.0 001000010000000010000000000000

46 -3.5 111010100000000000000000000000

47 -3.5 010000000000000000010000000000

48 -3.5 111000000000000000000000000000

49 -4.0 101000010000000000001000000000

50 -4.0 111100000010000000000000000000

51 -4.0 111010000000000000000000000000

Based on the generated data matrix, estimates of students' proficiency were obtained. For this purpose the dialogue system «MLV» (Measurement of Latent Variables), developed by the authors in the Laboratory for Objective Measurement of Kuban State University, was used.

The precision of measurement of students' proficiency is characterized by the standard error of measurement. For the i-th student the standard error is

$$SE_i = \frac{1}{\sqrt{\sum_{j=1}^{m} p_{ij}\,(1 - p_{ij})}}, \qquad (3)$$

where $p_{ij}$ is the probability of a correct answer of the i-th student to the j-th test item and m is the number of test items. Unlike classical test theory, where the measurement error is the same for all students, in the theory of latent variables these errors differ. For example, if the i-th student has answered all items correctly, the standard error tends to infinity. If the student has answered all items incorrectly, the standard error also tends to infinity. The smallest error is observed for students who answer approximately half of the test items correctly. It follows from formula (3) that the standard error takes its maximum values at the edges of the scale.

The measurement precision of a latent variable obtained from the simulation experiment is described by the following model:

$$y_{ijkl(m)} = \mu + \alpha_i + \beta_j + \gamma_k + \tau_l + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl(m)}, \qquad (4)$$

where $y_{ijkl(m)}$ is the response variable, i.e. the standard error of measurement of the latent variable; $\mu$ is the overall mean; $\alpha_1, \alpha_2$ are the main effects for the levels of factor A; $\beta_1, \beta_2, ..., \beta_{17}$ are the main effects for the levels of factor B; $\gamma_1, \gamma_2, ..., \gamma_{10}$ are the main effects for the levels of factor C; $\tau_1, \tau_2, \tau_3$ are the main effects for the levels of factor D; $(\alpha\beta)_{ij}$ are the interactions for the combinations of factors A and B; $(\alpha\gamma)_{ik}$ are the interactions for the combinations of factors A and C; $(\alpha\beta\gamma)_{ijk}$ are the interactions for the combinations of factors A, B and C; $\varepsilon_{ijkl(m)}$ are the errors, which satisfy the conditions of zero mean, equal variances, normality, and independence.
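The behaviour of the standard error in formula (3) along the scale can be illustrated with a short sketch. It is a simplification: it evaluates the formula at the true proficiency values rather than at estimates produced by the MLV system, and the function name `standard_error` and the 50 equally spaced items are assumptions.

```python
import numpy as np


def standard_error(beta, delta):
    """Formula (3): SE = 1 / sqrt(sum_j p_j * (1 - p_j)) for a single student."""
    p = 1.0 / (1.0 + np.exp(-(beta - np.asarray(delta, dtype=float))))
    return 1.0 / np.sqrt(np.sum(p * (1.0 - p)))


delta = np.linspace(-4.0, 4.0, 50)  # 50 items, end points included (assumption)
for beta in (-4.0, -2.0, 0.0, 2.0, 4.0):
    print(f"beta = {beta:+.1f}  SE = {standard_error(beta, delta):.3f}")
```

Each item contributes p(1 - p) to the sum under the square root, which is largest when p = 0.5. Students near the edges of the scale meet few items of matching difficulty, so the sum is smaller and the standard error larger.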

Results

As an illustration, Figure 1 shows the measurement precision of a latent variable based on 50 items.

[Figure 1: plot of SE (logit) against Location (logit)]

Figure 1. Precision of measurement of a latent variable based on 50 items with (-4.0, +4.0) range of items variation

The statistical analysis of measurement precision of a latent variable

The results of the analysis of variance (ANOVA) of the standard error of measurement are presented in Table 2.

Table 2

ANOVA of standard error of measurement

Source of variation   Sum of squares   Degrees of freedom   Mean square   F          Sig.
Factor A                   .078              1                 .078        18.473    <.001
Factor B                 24.393             16                1.525       362.435    <.001
Factor C                111.507              9               12.390      2945.430    <.001
Block factor D             .039              2                 .019         4.586     .010
Interaction AB            4.395             16                 .275        65.297    <.001
Interaction AC            3.423              9                 .380        90.417    <.001
Interaction BC            1.373            144                 .010         2.267    <.001
Interaction ABC            .991            144                 .007         1.636    <.001
Error                    11.433           2718                 .004
Total                   157.631           3059
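The degrees of freedom in Table 2 can be checked against the design. The sketch below assumes that the three copies of each proficiency level in a data matrix act as replicates within each block, which gives 2 × 17 × 10 × 3 × 3 = 3060 observations; this reading is inferred from the total of 3059 degrees of freedom and is not stated explicitly in the text.

```python
levels = {"A": 2, "B": 17, "C": 10, "D": 3}
replicates = 3  # assumption: each proficiency level appears three times per matrix

n_obs = levels["A"] * levels["B"] * levels["C"] * levels["D"] * replicates

# degrees of freedom of a main effect = number of levels - 1
a, b, c, d = (levels[k] - 1 for k in "ABCD")
df = {
    "A": a, "B": b, "C": c, "D": d,
    "AB": a * b, "AC": a * c, "BC": b * c, "ABC": a * b * c,
}
df["Total"] = n_obs - 1
# the error term receives whatever is left after all listed sources
df["Error"] = df["Total"] - sum(v for k, v in df.items() if k != "Total")

for source, value in df.items():
    print(f"{source:5s} {value}")
```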

All sources of variation are statistically significant. To some extent this is due to the large volume of experimental data. The mean values of the standard error of measurement for each item set are presented in Table 3.

Table 3

Mean standard error of measurement by item set

Number of items   Mean    Volume   Standard error   95% confidence interval
                                                    Lower bound   Upper bound
 10               .993    306      .004             .986          1.000
 20               .708    306      .004             .701          .715
 30               .604    306      .004             .597          .612
 40               .539    306      .004             .532          .547
 50               .477    306      .004             .470          .484
 60               .443    306      .004             .436          .450
 70               .407    306      .004             .399          .414
 80               .375    306      .004             .368          .382
 90               .362    306      .004             .355          .369
100               .343    306      .004             .336          .350

An important aspect of the investigation is how measurement precision depends on the location of persons on the scale (Figure 2).

Figure 2. Standard error of measurement of a latent variable depending on students' location on the scale and the number of test items

Figure 3 shows the influence of the range of item variation on the measurement precision of a latent variable.

Figure 3. Standard error of measurement of a latent variable depending on students' location on the scale and the range of item variation

On average, measurement precision is slightly higher for the narrow range of item variation than for the wider range (Table 4).

Table 4

Mean standard error of persons depending on range of items variation

Range            Mean    Volume   Standard error   95% confidence interval
                                                   Lower bound   Upper bound
[-2.0, +2.0]     .520    1530     .002             .517          .523
[-4.0, +4.0]     .530    1530     .002             .527          .533

Discussion

Students' ability and item difficulty varied in the simulation experiment over a wide range, from -4.0 to +4.0 logits. This range covers the majority of practical testing situations.

The research shows that 50 dichotomous items are enough to achieve a standard error of measurement of about 0.5 logits (Table 3). A further increase in the number of items improves measurement precision only slightly: even 100 dichotomous items do not bring the standard error below 0.3 logits (Figure 2).
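The diminishing return from additional items can be related to formula (3). If the average information per item stays roughly constant as items are added over the same range, the standard error falls approximately as 1/√m, so doubling the test length reduces the error by a factor of only about 0.71. The sketch below compares the means from Table 3 with this simple scaling; the 10-item set as reference point and the function name `scaled_se` are assumptions for illustration.

```python
import math

# Mean standard errors from Table 3 (number of items -> mean SE in logits)
table3 = {10: 0.993, 20: 0.708, 30: 0.604, 40: 0.539, 50: 0.477,
          60: 0.443, 70: 0.407, 80: 0.375, 90: 0.362, 100: 0.343}


def scaled_se(m, m_ref=10):
    """SE predicted from the m_ref-item set under pure 1/sqrt(m) scaling."""
    return table3[m_ref] * math.sqrt(m_ref / m)


for m, observed in table3.items():
    print(f"{m:3d} items: observed {observed:.3f}, 1/sqrt(m) scaling {scaled_se(m):.3f}")
```

In Table 3, going from 50 to 100 items lowers the mean standard error from .477 to .343, a ratio of about 0.72, which is close to 1/√2.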

The range of variation of the test items significantly influences the measurement precision of a latent variable. In addition, measurement precision is higher in the middle of the scale than at its edges (Figure 3).

The results were obtained for the case in which the latent variable varies from -4.0 to +4.0 logits. Conclusions about other intervals of variation of a latent variable require additional investigation.

Another possible way to increase measurement precision is to replace dichotomous items with polytomous ones. In that case partially correct answers can be taken into account.

Conclusion

1. To achieve a standard error of measurement of 0.5 logits, 50 dichotomous items are enough. It should be noted that students' proficiency and test item difficulty vary over the same interval, from -4.0 to +4.0 logits.

2. Measurement precision increases only slightly when the number of items exceeds 50. Even 100 items do not bring the standard error of measurement below 0.3 logits.

3. The measurement precision of students' proficiency (latent variable) is higher in the middle of the scale and lower on its edges.

Acknowledgment

This research was supported by the grant from the Russian Foundation for Basic Research 05-06-80110 «Development of the technique of measurement on an interval scale of latent variables in social and economic systems» (2005-2007), and the grant from the Russian Foundation for Humanities 08-06-00694а «Development of the technique of quality analysis of questionnaires used for measurement of latent variables» (2008-2010).

The article was recommended for publication by Dr. Sci. (Pedagogy), Prof. N. E. Erganova

References

1. Rasch G., 1980. Probabilistic models for some intelligence and attainment tests (Expanded edition, with foreword and afterword by Benjamin D. Wright). Chicago: University of Chicago Press. Р. 199.


2. Maslak A. A. Measurement of latent variables in social systems. Slavyansk-on-Kuban. Publishing center of KubSU. 2012. P. 432. (In Russian)

3. Maslak A., Karabatsos G., Anisimova T., Osipov S. Measuring and Comparing Higher Education Quality between Countries Worldwide. Journal of Applied Measurement. 2005. V. 6. № 4. P. 432-442.

4. Crocker L., Algina J. Introduction to Classical and Modern Test Theory. Mason, Ohio. Cengage Learning. 2008. P. 527.

5. Kruyen P. M. Using Short Tests and Questionnaires for Making Decisions about Individuals: When is Short too Short? Ridderkerk. 2012. 161 p.

6. Kruyen P. M., Emons, W. H. M. and Sijtsma K. Test Length and decision quality in personnel selection: When is short too short? International Journal of Testing. 2012. № 12. P. 321-344.

7. Letova L. V., Maslak A. A., Osipov S. A. Family of Rasch models for objective measurement of latent variables. Informatization of Science and Education. 2013. № 4 (20). P. 131-141.

8. Humphry S. M., Andrich D. Understanding the unit in the Rasch Model. Journal of Applied Measurement. 2008. № 9 (3). P. 249-264.


9. Wilson M. Constructing Measures: An Item Response modeling approach. Mahwah. Lawrence Erlbaum Associates Publ. 2005. P. 228.

10. Wolfe E. W., Smith V. Instrument Development Tools and Activities for Measure Validation Using Rasch Models; Part I - Instrument Development Tools. Journal of Applied Measurement. 2007. № 8 (1). P. 249-264.

11. Maslak A. A. Fundamentals of Design of Experiment in Management. Slavyansk-on-Kuban. Publishing center of KubSU. 2013. № 116.
