1.
Br J Math Stat Psychol ; 75(3): 753-778, 2022 Nov.
Article in English | MEDLINE | ID: mdl-35661350

ABSTRACT

The recently proposed Urnings algorithm (Bolsinova et al., 2022, J. R. Stat. Soc. Ser. C Appl. Statistics, 71, 91) tracks the development of learners' abilities and item difficulties in adaptive learning systems. It is a simple and scalable algorithm suited to large-scale applications in which large streams of data enter the system and on-the-fly updating is needed. Compared to alternatives like the Elo rating system and its extensions, the Urnings rating system allows the uncertainty of the ratings to be evaluated and accounts for adaptive item selection which, if not corrected for, may distort the ratings. In this paper we extend the Urnings algorithm to allow for both between-item and within-item multidimensionality. This allows for tracking the development of interrelated abilities both at the individual and the population level. We present formal derivations of the multidimensional Urnings algorithm, illustrate its properties in simulations, and present an application to data from an adaptive learning system for primary school mathematics called Math Garden.


Subject(s)
Learning , Humans , Mathematics
2.
Appl Psychol Meas ; 46(3): 219-235, 2022 May.
Article in English | MEDLINE | ID: mdl-35528271

ABSTRACT

Adaptive learning and assessment systems support learners in acquiring knowledge and skills in a particular domain. The learners' progress is monitored as they solve items that match their level and target specific learning goals. Scaffolding and providing learners with hints are powerful tools for supporting the learning process. One way of introducing hints is to leave hint use to the student's choice. When the learner is certain of their response, they answer without hints, but if the learner is not certain or does not know how to approach the item they can request a hint. We develop measurement models for applications where such on-demand hints are available. Such models take into account that hint use may be informative of ability, but at the same time may be influenced by other individual characteristics. Two modeling strategies are considered: (1) The measurement model is based on a scoring rule for ability which includes both response accuracy and hint use. (2) The choice to use hints and response accuracy conditional on this choice are modeled jointly using Item Response Tree models. The properties of different models and their implications are discussed. An application to data from Duolingo, an adaptive language learning system, is presented. Here, the best model is the scoring-rule-based model with full credit for correct responses without hints, partial credit for correct responses with hints, and no credit for all incorrect responses. The second dimension in the model accounts for the individual differences in the tendency to use hints.
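
As a rough illustration of the scoring rule singled out in this abstract, the sketch below maps each response to a credit value. The function name `hint_score` and the numeric credits (2, 1, 0) are assumptions made for illustration; the abstract states only the ordering (full credit, partial credit, no credit), not the values.

```python
def hint_score(correct: bool, used_hint: bool) -> int:
    """Illustrative scoring rule; the credit values are assumed, not taken from the paper."""
    if not correct:
        return 0  # no credit for any incorrect response, with or without a hint
    # Correct response: partial credit if a hint was used, full credit otherwise.
    return 1 if used_hint else 2


# Each tuple is (correct, used_hint) for one hypothetical response.
responses = [(True, False), (True, True), (False, True), (False, False)]
scores = [hint_score(correct, used_hint) for correct, used_hint in responses]
```

A rule of this kind collapses accuracy and hint use into one ordered score per response, which is what lets a single measurement model use both sources of information.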

3.
Psychometrika ; 86(4): 938-972, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34258714

ABSTRACT

The emergence of computer-based assessments has made response times, in addition to response accuracies, available as a source of information about test takers' latent abilities. The development of substantively meaningful accounts of the cognitive process underlying item responses is critical to establishing the validity of psychometric tests. However, existing substantive theories such as the diffusion model have been slow to gain traction due to their unwieldy functional form and regular violations of model assumptions in psychometric contexts. In the present work, we develop an attention-based diffusion model based on process assumptions that are appropriate for psychometric applications. This model is straightforward to analyse using Gibbs sampling and can be readily extended. We demonstrate our model's good computational and statistical properties in a comparison with two well-established psychometric models.


Subject(s)
Psychometrics , Reaction Time , Reproducibility of Results
4.
Sci Rep ; 10(1): 16226, 2020 Oct 01.
Article in English | MEDLINE | ID: mdl-33004877

ABSTRACT

People's choices are often found to be inconsistent with the assumptions of rational choice theory. Over time, several probabilistic models have been proposed that account for such deviations from rationality. However, these models have become increasingly complex and are often limited to particular choice phenomena. Here we introduce a network approach that explains a broad set of choice phenomena. We demonstrate that this approach can be used to compare different choice theories and integrates several choice mechanisms from established models. A basic setup implements bounded rationality, loss aversion, and inhibition in a natural fashion, which allows us to predict the occurrence of well-known choice phenomena, such as the endowment effect and the similarity, attraction, compromise, and phantom context effects. Our results show that this network approach provides a simple representation of complex choice behaviour, and can be used to gain a better understanding of how the many choice phenomena and key theoretical principles from different types of decision-making are connected.

5.
J Intell ; 8(2), 2020 May 03.
Article in English | MEDLINE | ID: mdl-32375211

ABSTRACT

Geary (2019) puts forward an appealing argument for considering mitochondrial functioning as a candidate for a formative g; it is also an ambitious argument [...].

6.
J Intell ; 8(1), 2020 Mar 03.
Article in English | MEDLINE | ID: mdl-32138312

ABSTRACT

One of the highest ambitions in educational technology is the move towards personalized learning. To this end, computerized adaptive learning (CAL) systems are developed. A popular method for tracking the development of student ability and item difficulty in CAL systems is the Elo Rating System (ERS). The ERS allows for dynamic model parameters by updating key parameters after every response. However, the ERS has two drawbacks: it does not provide standard errors, and it results in rating variance inflation. We identify three statistical issues responsible for both of these drawbacks. To solve these issues we introduce a new tracking system based on urns, where every person and item is represented by an urn filled with a combination of green and red marbles. Urns are updated by an exchange of marbles after each response, such that the proportions of green marbles represent estimates of person ability or item difficulty. A main advantage of this approach is that the standard errors are known; hence the method allows for statistical inference, such as testing for learning effects. We highlight features of the Urnings algorithm and compare it to the popular ERS in a simulation study and in an empirical data example from a large-scale CAL application.
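
For orientation, here is a minimal sketch of an Elo-style update of the kind commonly used in adaptive learning settings: the expected score is a logistic function of ability minus difficulty, and both ratings move by a step proportional to the prediction error. The step size `k` and the function names are assumptions. This illustrates the ERS baseline only, not the Urnings update.

```python
import math


def expected_score(theta: float, beta: float) -> float:
    """Probability of a correct response given ability theta and difficulty beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def elo_update(theta: float, beta: float, x: int, k: float = 0.3):
    """One Elo-style update after observing response x (1 = correct, 0 = incorrect).

    The step size k is an assumed tuning constant.
    """
    error = x - expected_score(theta, beta)
    # Ability moves with the prediction error, difficulty moves against it.
    return theta + k * error, beta - k * error


# A correct response from an evenly matched person and item.
theta, beta = elo_update(0.0, 0.0, x=1)
```

The update returns point estimates only; nothing in it quantifies their uncertainty, which is the first drawback the abstract attributes to the ERS.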

7.
Br J Math Stat Psychol ; 73(1): 72-87, 2020 Feb.
Article in English | MEDLINE | ID: mdl-30883704

ABSTRACT

We introduce a general response model that allows for several simple restrictions, resulting in other models such as the extended Rasch model. For the extended Rasch model, a dynamic Bayesian estimation procedure is provided, which is able to deal with data sets that change over time, and possibly include many missing values. To ensure comparability over time, a data augmentation method is used, which provides an augmented person-by-item data matrix and reproduces the sufficient statistics of the complete data matrix. Hence, longitudinal comparisons can be easily made based on simple summaries, such as proportion correct, sum score, etc. As an illustration of the method, an example is provided using data from a computer-adaptive practice mathematical environment.


Subject(s)
Educational Models , Psychological Models , Statistical Models , Bayes Theorem , Computer Simulation , Humans , Learning , Mathematical Computing , Psychometrics
8.
Front Psychol ; 11: 500039, 2020.
Article in English | MEDLINE | ID: mdl-33391063

ABSTRACT

An extension to a rating system for tracking the evolution of parameters over time using continuous variables is introduced. The proposed rating system assumes a distribution for the continuous responses, which is agnostic to the origin of the continuous scores and thus can be used for applications as varied as continuous scores obtained from language testing and scores derived from accuracy and response time in elementary arithmetic learning systems. Large-scale, high-stakes, online, anywhere-anytime learning and testing inherently come with a number of unique problems that require new psychometric solutions. These include (1) the cold start problem, (2) the problem of change, and (3) the problem of personalization and adaptation. We outline how our proposed method addresses each of these problems. Three simulations are carried out to demonstrate the utility of the proposed rating system.

9.
Perspect Psychol Sci ; 14(6): 1034-1061, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31647746

ABSTRACT

The positive manifold of intelligence has fascinated generations of scholars in human ability. In the past century, various formal explanations have been proposed, including the dominant g factor, the revived sampling theory, and the recent multiplier effect model and mutualism model. In this article, we propose a novel idiographic explanation. We formally conceptualize intelligence as evolving networks in which new facts and procedures are wired together during development. The static model, an extension of the Fortuin-Kasteleyn model, provides a parsimonious explanation of the positive manifold and intelligence's hierarchical factor structure. We show how it can explain the Matthew effect across developmental stages. Finally, we introduce a method for studying growth dynamics. Our truly idiographic approach offers a new view on a century-old construct and ultimately allows the fields of human ability and human learning to coalesce.


Subject(s)
Child Development , Individuality , Intelligence , Theoretical Models , Child , Humans
10.
PLoS One ; 12(1): e0169787, 2017.
Article in English | MEDLINE | ID: mdl-28076429

ABSTRACT

The Single Variable Exchange algorithm is based on a simple idea: any model that can be simulated can be estimated by producing draws from the posterior distribution. We build on this simple idea by framing the Exchange algorithm as a mixture of Metropolis transition kernels and propose strategies that automatically select the more efficient transition kernels. In this manner we achieve significant improvements in convergence rate and autocorrelation of the Markov chain without relying on more than being able to simulate from the model. We focus on statistical models in the exponential family and use two simple models from educational measurement to illustrate the contribution.
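
The basic idea can be sketched with a toy exponential-family model. The code below is a minimal exchange-type Metropolis sampler under stated assumptions: a Bernoulli model in its natural (log-odds) parameterization, a standard normal prior that also serves as the proposal distribution, and hypothetical function names. It does not implement the kernel-selection strategies proposed in the paper.

```python
import math
import random


def simulate_stat(theta, n, rng):
    """Simulate n Bernoulli trials with log-odds theta; return the number of successes."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return sum(rng.random() < p for _ in range(n))


def exchange_sampler(t_obs, n, n_iter, rng):
    """Draw from the posterior of theta using only the ability to simulate from the model."""
    theta = 0.0
    draws = []
    for _ in range(n_iter):
        theta_new = rng.gauss(0.0, 1.0)           # proposal drawn from the N(0, 1) prior
        t_sim = simulate_stat(theta_new, n, rng)  # auxiliary data simulated at the proposal
        # Prior and proposal terms cancel, as do the normalizing constants,
        # leaving only the sufficient statistics in the acceptance ratio.
        log_accept = (theta_new - theta) * (t_obs - t_sim)
        if math.log(1.0 - rng.random()) < log_accept:
            theta = theta_new
        draws.append(theta)
    return draws


rng = random.Random(1)
draws = exchange_sampler(t_obs=35, n=50, n_iter=5000, rng=rng)
kept = draws[500:]                                # discard an assumed burn-in period
posterior_mean = sum(kept) / len(kept)
```

The acceptance step never evaluates the likelihood's normalizing constant; it compares the observed sufficient statistic with one simulated at the proposed value, which is why simulation alone is enough.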


Subject(s)
Algorithms , Computer Simulation/statistics & numerical data , Theoretical Models
11.
Psychometrika ; 82(1): 210-232, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27844271

ABSTRACT

This paper discusses the issue of differential item functioning (DIF) in international surveys. DIF is likely to occur in international surveys. What is needed is a statistical approach that takes DIF into account, while at the same time allowing for meaningful comparisons between countries. Some existing approaches are discussed and an alternative is provided. The core of this alternative approach is to define the construct as a large set of items, and to report in terms of summary statistics. Since the data are incomplete, measurement models are used to complete the incomplete data. For that purpose, different models can be used across countries. The method is illustrated with PISA's reading literacy data. The results indicate that this approach fits the data better than the current PISA methodology; however, the league tables are nearly identical. The implications for monitoring changes over time are discussed.


Subject(s)
Educational Measurement , Internationality , Literacy , Statistical Models , Surveys and Questionnaires , Canada , Humans , Mexico , Psychometrics , Reading
12.
Sci Rep ; 6: 34175, 2016 Oct 04.
Article in English | MEDLINE | ID: mdl-27698356

ABSTRACT

Statistical models that analyse (pairwise) relations between variables encompass assumptions about the underlying mechanism that generated the associations in the observed data. In the present paper we demonstrate that three Ising model representations exist that, although each proposes a distinct theoretical explanation for the observed associations, are mathematically equivalent. This equivalence allows the researcher to interpret the results of one model in three different ways. We illustrate the ramifications of this by discussing concepts that are conceived as problematic in their traditional explanation, yet when interpreted in the context of another explanation make immediate sense.

13.
PLoS One ; 11(5): e0155149, 2016.
Article in English | MEDLINE | ID: mdl-27167518

ABSTRACT

We investigate the relation between speed and accuracy within problem solving in its simplest non-trivial form. We consider tests with only two items and code the item responses in two binary variables: one indicating the response accuracy, and one indicating the response speed. Despite being a very basic setup, it enables us to study item pairs stemming from a broad range of domains such as basic arithmetic, first language learning, intelligence-related problems, and chess, with large numbers of observations for every pair of problems under consideration. We carry out a survey over a large number of such item pairs and compare three types of psychometric accuracy-response time models present in the literature: two 'one-process' models, the first of which models accuracy and response time as conditionally independent and the second of which models accuracy and response time as conditionally dependent, and a 'two-process' model which models accuracy contingent on response time. We find that the data clearly violate the restrictions imposed by both one-process models and require additional complexity, which is parsimoniously provided by the two-process model. We supplement our survey with an analysis of the erroneous responses for an example item pair and demonstrate that there are very significant differences between the types of errors in fast and slow responses.


Subject(s)
Data Accuracy , Statistical Models , Problem Solving/physiology , Reaction Time/physiology , Recreational Games/psychology , Humans , Intelligence , Language , Learning/physiology , Psychometrics
14.
Psychometrika ; 81(2): 274-89, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27052959

ABSTRACT

In this paper, we show that the marginal distribution of plausible values is a consistent estimator of the true latent variable distribution, and, furthermore, that convergence is monotone in an embedding in which the number of items tends to infinity. We use this result to clarify some of the misconceptions that exist about plausible values, and also show how they can be used in the analyses of educational surveys.


Subject(s)
Psychometrics , Statistics as Topic , Bayes Theorem , Educational Measurement , Humans , Theoretical Models , Surveys and Questionnaires
15.
Psychometrika ; 81(1): 39-59, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26507635

ABSTRACT

When a simple sum or number-correct score is used to evaluate the ability of individual testees, then, from an accountability perspective, the inferences based on the sum score should be the same as the inferences based on the complete response pattern. This requirement is fulfilled if the sum score is a sufficient statistic for the parameter of a unidimensional model. However, the models for which this holds true are known to be restrictive. It is shown that the less restrictive nonparametric models could result in an ordering of persons that is different from an ordering based on the sum score. To arrive at a fair evaluation of ability with a simple number-correct score, ordinal sufficiency is defined as a minimum condition for scoring. The monotone homogeneity model, together with the property of ordinal sufficiency of the sum score, is introduced as the nonparametric Rasch model. A basic outline for testable hypotheses about ordinal sufficiency, as well as illustrations with real data, is provided.


Subject(s)
Statistical Models , Psychometrics , Research Design , Nonparametric Statistics
16.
Br J Math Stat Psychol ; 69(1): 62-79, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26059168

ABSTRACT

An important distinction between different models for response time and accuracy is whether conditional independence (CI) between response time and accuracy is assumed. In the present study, a test for CI given an exponential family model for accuracy (for example, the Rasch model or the one-parameter logistic model) is proposed and evaluated in a simulation study. The procedure is based on non-parametric Kolmogorov-Smirnov tests. As an illustrative example, the CI test was applied to data from an arithmetic test for secondary education.
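
One plausible reading of this procedure, stated here as an assumption rather than as the paper's exact method: in an exponential family model such as the Rasch model, the sum score is a sufficient statistic for ability, so among test takers with the same sum score the response times on an item should follow the same distribution for correct and incorrect responses if CI holds. The sketch below computes the two-sample Kolmogorov-Smirnov statistic for one such group. The function name and the toy data are hypothetical, and the step of combining evidence across score groups and items is not shown.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    d = 0.0
    # The gap can only change at observed values, so checking those is sufficient.
    for v in sorted(set(sample_a) | set(sample_b)):
        cdf_a = sum(x <= v for x in sample_a) / len(sample_a)
        cdf_b = sum(x <= v for x in sample_b) / len(sample_b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Toy response times in seconds on one item, for test takers sharing a sum score.
rt_correct = [2.1, 2.4, 2.8, 3.0, 3.3]
rt_incorrect = [3.1, 3.6, 4.0, 4.4]
d = ks_statistic(rt_correct, rt_incorrect)
```

A large value of `d` within a score group would count as evidence against CI for that item; turning it into a p-value requires the KS reference distribution, which is omitted here.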

17.
Psychometrika ; 80(4): 859-79, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26493183

ABSTRACT

In their seminal work on characterizing the manifest probabilities of latent trait models, Cressie and Holland give a theoretically important characterization of the marginal Rasch model. Because their representation of the marginal Rasch model does not involve any latent trait, nor any specific distribution of a latent trait, it opens up the possibility of constructing a Markov chain Monte Carlo method for Bayesian inference for the marginal Rasch model that does not rely on data augmentation. Such an approach would be highly efficient as its computational cost does not depend on the number of respondents, which makes it suitable for large-scale educational measurement. In this paper, such an approach is developed and its operating characteristics are illustrated with simulated data.


Subject(s)
Bayes Theorem , Likelihood Functions , Psychometrics/statistics & numerical data , Algorithms , Markov Chains
18.
Sci Rep ; 5: 9050, 2015 Mar 12.
Article in English | MEDLINE | ID: mdl-25761415

ABSTRACT

Estimating the structure of Ising networks is a notoriously difficult problem. We demonstrate that, using a latent variable representation of the Ising network, we can employ a full-data-information approach to uncover the network structure, thereby ignoring only the information encoded in the prior distribution (of the latent variables). The full-data-information approach avoids having to compute the partition function and is thus computationally feasible, even for networks with many nodes. We illustrate the full-data-information approach with the estimation of dense networks.


Subject(s)
Algorithms , Theoretical Models
20.
Front Psychol ; 6: 1956, 2015.
Article in English | MEDLINE | ID: mdl-26779074

ABSTRACT

In this paper, test equating is considered as a missing data problem. The unobserved responses of the reference population to the new test must be imputed to specify a new cutscore. The cutscore is set so that the proportion of students from the reference population who would have failed the new exam is approximately the same as the proportion who failed the reference exam. We investigate whether item response theory (IRT) makes it possible to identify the distribution of these missing responses and the distribution of test scores from the observed data without parametric assumptions for the ability distribution. We show that while the score distribution is not fully identifiable, the uncertainty about the score distribution on the new test due to non-identifiability is very small. Moreover, ignoring the non-identifiability issue and assuming a normal distribution for ability may lead to bias in test equating, which we illustrate in simulated and empirical data examples.
