ABSTRACT
PURPOSE: The purpose of this study was to evaluate whether a short-form computerized adaptive testing (CAT) version of the Philadelphia Naming Test (PNT) provides error profiles and model-based estimates of semantic and phonological processing that agree with the full test. METHOD: Twenty-four persons with aphasia took the PNT-CAT and the full version of the PNT (hereinafter referred to as the "full PNT") at least 2 weeks apart. The PNT-CAT proceeded in two stages: (a) the PNT-CAT30, in which 30 items were selected to match the evolving ability estimate with the goal of producing a 50% error rate, and (b) the PNT-CAT60, in which an additional 30 items were selected to produce a 75% error rate. Agreement was evaluated in terms of the root-mean-square deviation of the response-type proportions and, for individual response types, in terms of agreement coefficients and bias. We also evaluated agreement and bias for estimates of semantic and phonological processing derived from the semantic-phonological interactive two-step model (SP model) of word production. RESULTS: The results suggested that agreement was poorest for semantic, formal, mixed, and unrelated errors, all of which were underestimated by the short forms. Better agreement was observed for correct and nonword responses. SP model weights estimated by the short forms demonstrated no substantial bias but generally inadequate agreement with the full PNT, which itself showed acceptable test-retest reliability for SP model weights and all response types except for formal errors. DISCUSSION: Results suggest that the PNT-CAT30 and the PNT-CAT60 are generally inadequate for generating naming error profiles or model-derived estimates of semantic and phonological processing ability. 
Post hoc analyses suggested that increasing the number of stimuli available in the CAT item bank may improve the utility of adaptive short forms for generating error profiles, but the underlying theory also suggests that there are limitations to this approach based on a unidimensional measurement model. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.22320814.
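The agreement analysis above compares response-type proportions via root-mean-square deviation. As an illustration only (the proportions below are fabricated, not taken from the study), a minimal sketch of RMSD between a full-PNT error profile and a short-form profile might look like:

```python
import math

def rmsd(profile_a, profile_b):
    """Root-mean-square deviation between two response-type
    proportion profiles (dicts mapping response type -> proportion)."""
    types = sorted(set(profile_a) | set(profile_b))
    sq = [(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) ** 2 for t in types]
    return math.sqrt(sum(sq) / len(sq))

# Fabricated profiles for illustration: full PNT vs. a 30-item short form
full_pnt = {"correct": 0.60, "semantic": 0.10, "formal": 0.08,
            "mixed": 0.05, "unrelated": 0.07, "nonword": 0.10}
short_form = {"correct": 0.63, "semantic": 0.07, "formal": 0.05,
              "mixed": 0.03, "unrelated": 0.05, "nonword": 0.17}
print(round(rmsd(full_pnt, short_form), 4))  # -> 0.0374
```

A lower RMSD indicates a short-form profile that more closely reproduces the full-test profile; per-response-type agreement and bias (as examined in the study) require separate coefficients.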
Subjects
Aphasia, Humans, Aphasia/diagnosis, Linguistics, Reproducibility of Results, Semantics
ABSTRACT
PURPOSE: Small-N studies are the dominant study design supporting evidence-based interventions in communication science and disorders, including treatments for aphasia and related disorders. However, there is little guidance for conducting reproducible analyses or selecting appropriate effect sizes in small-N studies, which has implications for scientific review, rigor, and replication. This tutorial aims to (a) demonstrate how to conduct reproducible analyses using effect sizes common to research in aphasia and related disorders and (b) provide a conceptual discussion to improve the reader's understanding of these effect sizes. METHOD: We provide a tutorial on reproducible analyses of small-N designs in the statistical programming language R using published data from Wambaugh et al. (2017). In addition, we discuss the strengths, weaknesses, reporting requirements, and impact of experimental design decisions on effect sizes common to this body of research. RESULTS: Reproducible code demonstrates implementation and comparison of within-case standardized mean difference, proportion of maximal gain, tau-U, and frequentist and Bayesian mixed-effects models. Data, code, and an interactive web application are available as a resource for researchers, clinicians, and students. CONCLUSIONS: Pursuing reproducible research is key to promoting transparency in small-N treatment research. Researchers and clinicians must understand the properties of common effect size measures to make informed decisions in order to select ideal effect size measures and act as informed consumers of small-N studies. Together, a commitment to reproducibility and a keen understanding of effect sizes can improve the scientific rigor and synthesis of the evidence supporting clinical services in aphasiology and in communication sciences and disorders more broadly. Supplemental Material and Open Science Form: https://doi.org/10.23641/asha.21699476.
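The tutorial above implements its effect sizes in R; as a language-neutral sketch of what three of the named measures compute, the following stdlib Python version (with fabricated probe data, not the Wambaugh et al. dataset) shows the within-case standardized mean difference, proportion of maximal gain, and the basic A-versus-B nonoverlap component underlying tau-U:

```python
from statistics import mean, stdev

def smd(baseline, treatment):
    """Within-case standardized mean difference:
    (treatment mean - baseline mean) / baseline SD."""
    return (mean(treatment) - mean(baseline)) / stdev(baseline)

def pmg(baseline, treatment, max_score):
    """Proportion of maximal gain: observed gain divided by the
    gain that was possible given the baseline level."""
    return (mean(treatment) - mean(baseline)) / (max_score - mean(baseline))

def tau_ab(baseline, treatment):
    """Basic A-vs-B nonoverlap term of tau-U: (#improving pairs
    - #deteriorating pairs) / (all baseline x treatment pairs)."""
    pos = sum(t > b for b in baseline for t in treatment)
    neg = sum(t < b for b in baseline for t in treatment)
    return (pos - neg) / (len(baseline) * len(treatment))

baseline = [2, 3, 2, 3]       # probes correct out of 20 (fabricated)
treatment = [8, 10, 12, 14]
print(smd(baseline, treatment), pmg(baseline, treatment, 20),
      tau_ab(baseline, treatment))
```

Each measure rewards different aspects of the data (baseline variability, ceiling proximity, nonoverlap), which is why the tutorial stresses understanding their properties before choosing one.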
Subjects
Aphasia, Humans, Reproducibility of Results, Bayes Theorem, Aphasia/therapy, Communication, Students
ABSTRACT
Purpose This meta-analysis synthesizes published studies using "treatment of underlying forms" (TUF) for sentence-level deficits in people with aphasia (PWA). The study aims were to examine group-level evidence for TUF efficacy, to characterize the effects of treatment-related variables (sentence structural family and complexity; treatment dose) in relation to the Complexity Account of Treatment Efficacy (CATE) hypothesis, and to examine the effects of person-level variables (aphasia severity, sentence comprehension impairment, and time postonset of aphasia) on TUF response. Method Data from 13 single-subject, multiple-baseline TUF studies, including 46 PWA, were analyzed. Bayesian generalized linear mixed-effects interrupted time series models were used to assess the effect of treatment-related variables on probe accuracy during baseline and treatment. The moderating influence of person-level variables on TUF response was also investigated. Results The results provide group-level evidence for TUF efficacy demonstrating increased probe accuracy during treatment compared with baseline phases. Greater amounts of TUF were associated with larger increases in accuracy, with greater gains for treated than untreated sentences. The findings revealed generalization effects for sentences that were of the same family but less complex than treated sentences. Aphasia severity may moderate TUF response, with people with milder aphasia demonstrating greater gains compared with people with more severe aphasia. Sentence comprehension performance did not moderate TUF response. Greater time postonset of aphasia was associated with smaller improvements for treated sentences but not for untreated sentences. Conclusions Our results provide generalizable group-level evidence of TUF efficacy. Treatment and generalization responses were consistent with the CATE hypothesis. 
Model results also identified person-level moderators of TUF (aphasia severity, time postonset of aphasia) and preliminary estimates of the effects of varying amounts of TUF for treated and untreated sentences. Taken together, these findings add to the TUF evidence and may guide future TUF treatment-candidate selection. Supplemental Material https://doi.org/10.23641/asha.16828630.
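The interrupted time series models described above contrast probe accuracy across baseline and treatment phases. A minimal sketch of the phase/slope predictor coding commonly used for such models (the variable names here are illustrative, not the authors' specification) might be:

```python
def its_predictors(n_baseline, n_treatment):
    """Build session-level predictors for an interrupted time series
    model of probe accuracy: phase (0 = baseline, 1 = treatment) captures
    the level change; time_in_tx captures the treatment-phase slope."""
    rows = []
    for t in range(n_baseline + n_treatment):
        phase = int(t >= n_baseline)
        time_in_tx = max(0, t - n_baseline + 1)  # 0 during baseline
        rows.append({"session": t + 1, "phase": phase,
                     "time_in_tx": time_in_tx})
    return rows

design = its_predictors(n_baseline=3, n_treatment=4)
for row in design:
    print(row)
```

In a (generalized) mixed-effects version, coefficients on `phase` and `time_in_tx` would vary by participant and item, which is how person-level moderators such as aphasia severity can be examined.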
Subjects
Aphasia, Aphasia/therapy, Bayes Theorem, Comprehension, Humans, Language, Language Tests
ABSTRACT
Purpose The purpose of this study was to verify the equivalence of 2 alternate test forms with nonoverlapping content generated by an item response theory (IRT)-based computer-adaptive test (CAT). The Philadelphia Naming Test (PNT; Roach, Schwartz, Martin, Grewal, & Brecher, 1996) was utilized as an item bank in a prospective, independent sample of persons with aphasia. Method Two alternate CAT short forms of the PNT were administered to a sample of 25 persons with aphasia who were at least 6 months postonset and received no treatment for 2 weeks before or during the study. The 1st session included administration of a 30-item PNT-CAT, and the 2nd session, conducted approximately 2 weeks later, included a variable-length PNT-CAT that excluded items administered in the 1st session and terminated when the modeled precision of the ability estimate was equal to or greater than the value obtained in the 1st session. The ability estimates were analyzed in a Bayesian framework. Results The 2 test versions correlated highly (r = .89) and obtained means and standard deviations that were not credibly different from one another. The correlation and error variance between the 2 test versions were well predicted by the IRT measurement model. Discussion The results suggest that IRT-based CAT alternate forms may be productively used in the assessment of anomia. IRT methods offer advantages for the efficient and sensitive measurement of change over time. Future work should consider the potential impact of differential item functioning due to person factors and intervention-specific effects, as well as expanding the item bank to maximize the clinical utility of the test. Supplemental Material https://doi.org/10.23641/asha.11368040.
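The adaptive item selection these CAT studies rely on can be illustrated with a toy sketch. Under a Rasch (1PL) model, the most informative item for a given ability estimate is the one whose predicted success probability is closest to .5; targeting a lower success probability (as the PNT-CAT60 does for a 75% error rate) simply shifts the target. The item bank and ability value below are fabricated for illustration:

```python
import math

def p_correct(theta, difficulty):
    """Rasch (1PL) probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def select_item(theta, bank, target_p=0.5):
    """Pick the item whose predicted success probability is closest
    to the target (0.5 maximizes Fisher information under the 1PL)."""
    return min(bank, key=lambda b: abs(p_correct(theta, b) - target_p))

theta = -0.3                          # current ability estimate (logits)
bank = [-2.0, -1.0, -0.3, 0.5, 1.5]   # item difficulties (fabricated)
print(select_item(theta, bank))          # -> -0.3 (P(correct) = .5)
print(select_item(theta, bank, 0.25))    # -> 0.5 (harder item, ~75% errors)
```

In an operational CAT, `theta` is re-estimated after each response and administered items are removed from the bank, which is how nonoverlapping alternate forms arise.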