Search | BVS (Virtual Health Library) Regional Portal

1.

Predicting students' academic progress and related attributes in first-year medical students: an analysis with artificial neural networks and Naïve Bayes.

Monteverde-Suárez, Diego; González-Flores, Patricia; Santos-Solórzano, Roberto; García-Minjares, Manuel; Zavala-Sierra, Irma; de la Luz, Verónica Luna; Sánchez-Mendiola, Melchor.

BMC Med Educ ; 24(1): 74, 2024 Jan 19.

Article in English | MEDLINE | ID: mdl-38243257

ABSTRACT

BACKGROUND: Dropout and poor academic performance are persistent problems in medical schools in emerging economies. Identifying at-risk students early and knowing the factors that contribute to their success would be useful for designing educational interventions. Educational Data Mining (EDM) methods can identify students at risk of poor academic progress and dropping out. The main goal of this study was to use machine learning models, Artificial Neural Networks (ANN) and Naïve Bayes (NB), to identify first-year medical students who succeed academically, using sociodemographic data and academic history.

METHODS: Data from seven cohorts (2011 to 2017) of medical students admitted to the National Autonomous University of Mexico (UNAM) Faculty of Medicine in Mexico City were analysed. Data from 7,976 students (2011 to 2017 cohorts) of the program were included. Information from admission diagnostic exam results, academic history, sociodemographic characteristics and family environment was used. The main dataset included 48 variables. The study followed the general knowledge discovery process: pre-processing, data analysis, and validation. Artificial Neural Networks (ANN) and Naïve Bayes (NB) models were used for data mining analysis.

RESULTS: ANN models had slightly better performance in accuracy, sensitivity, and specificity. Both models had better sensitivity when classifying regular students and better specificity when classifying irregular students. Of the 25 variables with highest predictive value in the Naïve Bayes model, percentage of correct answers in the diagnostic exam was the best variable.

CONCLUSIONS: Both ANN and Naïve Bayes methods can be useful for predicting medical students' academic achievement in an undergraduate program, based on information about their prior knowledge and socio-demographic factors. Although ANN offered slightly superior results, Naïve Bayes made it possible to obtain an in-depth analysis of how the different variables influenced the model. The use of educational data mining techniques and machine learning classification techniques has potential in medical education.
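The three metrics reported in the abstract above can be illustrated with a minimal sketch. The labels below are made up for illustration and are not data from the study; the function name `classification_metrics` and the choice of "regular" as the positive class are assumptions.

```python
def classification_metrics(y_true, y_pred, positive="regular"):
    """Return accuracy, sensitivity and specificity for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of positives correctly found
        "specificity": tn / (tn + fp),  # share of negatives correctly found
    }


# Hypothetical labels, for illustration only (not data from the study).
y_true = ["regular", "regular", "regular", "irregular", "irregular", "regular"]
y_pred = ["regular", "regular", "irregular", "irregular", "regular", "regular"]

metrics = classification_metrics(y_true, y_pred)
swapped = classification_metrics(y_true, y_pred, positive="irregular")
```

Swapping the positive class exchanges sensitivity and specificity, which is one way to read the statement that sensitivity was better for regular students and specificity was better for irregular students.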

Subjects

Medical Students , Humans , Bayes Theorem , Educational Status , Achievement , Computer Neural Networks

2.

Using Machine Learning in Veterinary Medical Education: An Introduction for Veterinary Medicine Educators.

Hooper, Sarah E; Hecker, Kent G; Artemiou, Elpida.

Vet Sci ; 10(9), 2023 Aug 23.

Article in English | MEDLINE | ID: mdl-37756059

ABSTRACT

Machine learning (ML) offers potential opportunities to enhance learning, teaching, and assessment within veterinary medical education, including but not limited to assisting with admissions processes and student progress evaluations. The purpose of this primer is to assist veterinary educators in appraising and potentially adopting these rapidly emerging advances in data science and technology. In the first section, we introduce ML concepts and highlight similarities and differences between ML and classical statistics. In the second section, we provide a step-by-step worked example using simulated veterinary student data to answer a hypothesis-driven question. Python syntax with explanations is provided within the text to create a random forest ML prediction model, a model composed of decision trees, with each decision tree being composed of nodes and leaves. Within each step of the model creation, specific considerations, such as how to manage incomplete student records, are highlighted when applying ML algorithms within the veterinary education field. The results from the simulated data demonstrate how decisions by the veterinary educator during ML model creation may impact the most important features contributing to the model. These results highlight the need for the veterinary educator to be fully transparent during the creation of ML models. Future research is needed to establish guidelines for handling data not missing at random in medical education, and preferred methods for model evaluation.

3.

Interval regression model adequacy checking and its application to estimate school dropout in Brazilian municipality educational scenario.

do Nascimento, Rafaella L S; Fagundes, Roberta A de A; de Souza, Renata M C R; Cysneiros, Francisco José A.

Pattern Anal Appl ; 26(1): 39-59, 2023.

Article in English | MEDLINE | ID: mdl-35873880

ABSTRACT

Interval-valued data have been commonly encountered in practice, and Symbolic Data Analysis provides a solution to the statistical treatment of these data. Regression analysis for interval-valued symbolic data is a topic that has been widely investigated in the literature of symbolic data analysis, and several models from different paradigms have been proposed. There are basic regression assumptions, and it is essential to validate them. This paper introduces an approach to check interval regression model adequacy based on residual analysis. Concepts of ordinary and standardized interval residual are presented, and graphical analysis of these residuals is also proposed. To show the usefulness of the proposed approach, an application for estimating school dropout in the scenario of Brazilian municipalities is performed. We observed some outliers from the interval residuals analysis, and interval robust regression models are more suitable for estimating school dropout.

4.

Improving the portability of predicting students' performance models by using ontologies.

López-Zambrano, Javier; Lara, Juan A; Romero, Cristóbal.

J Comput High Educ ; 34(1): 1-19, 2022.

Article in English | MEDLINE | ID: mdl-33776379

ABSTRACT

One of the main current challenges in Educational Data Mining and Learning Analytics is the portability or transferability of predictive models obtained for a particular course so that they can be applied to other different courses. To handle this challenge, one of the foremost problems is the models' excessive dependence on the low-level attributes used to train them, which reduces the models' portability. To solve this issue, the use of high-level attributes with more semantic meaning, such as ontologies, may be very useful. Along this line, we propose the utilization of an ontology that uses a taxonomy of actions that summarises students' interactions with the Moodle learning management system. We compare the results of this proposed approach against our previous results when we used low-level raw attributes obtained directly from Moodle logs. The results indicate that the use of the proposed ontology improves the portability of the models in terms of predictive accuracy. The main contribution of this paper is to show that the ontological models obtained in one source course can be applied to other different target courses with similar usage levels without losing prediction accuracy.

5.

Correlation analysis using teaching and learning analytics.

Prestes, P A N; Silva, T E V; Barroso, G C.

Heliyon ; 7(11): e08435, 2021 Nov.

Article in English | MEDLINE | ID: mdl-34877427

ABSTRACT

Data analytics techniques have been gaining ground in the scientific community, with applications in various areas of knowledge, including education. This paper aims to analyse data taken from a questionnaire of the Organisation for Economic Co-operation and Development (OECD) given to teachers and school managers. In this questionnaire, school environment issues are assessed, specifically: school environment, professional development, school leadership, and efficient management. As a methodology, Teaching and Learning Analytics (TLA) was used, particularly correlation analysis, which enables the extraction of useful information from raw data, relating issues that interfere with the teaching and learning relationship, besides specific analysis of student learning. The results obtained about the school environment are not linear. They show neither a moderate nor a strong linear correlation, making it impossible to validate and integrate answers related to the statements of the themes and sub-themes chosen for this analysis. In this sense, the research found dichotomous observations that mirrored many controversies and insecurities, enabling considerations about possible school scenarios and their effective practices.
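As a minimal sketch of the kind of correlation analysis the abstract above describes, the Pearson coefficient between two questionnaire items can be computed and labelled by strength. The responses below are hypothetical, and the 0.4 and 0.7 cut-offs are a common rule of thumb assumed here, not thresholds taken from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def strength(r):
    """Label |r| with assumed rule-of-thumb cut-offs (not from the paper)."""
    magnitude = abs(r)
    if magnitude >= 0.7:
        return "strong"
    if magnitude >= 0.4:
        return "moderate"
    return "weak"


# Hypothetical 1-5 Likert responses to two questionnaire items.
item_a = [1, 2, 3, 4, 5, 3, 2, 4]
item_b = [3, 3, 2, 4, 3, 5, 2, 3]

r = pearson_r(item_a, item_b)
label = strength(r)
```

With these made-up responses the coefficient falls in the "weak" band, the situation in which pairs of answers cannot be treated as validating one another.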

6.

Unveiling educational patterns at a regional level in Colombia: data from elementary and public high school institutions.

Hernández-Leal, Emilcy; Duque-Méndez, Néstor Darío; Cechinel, Cristian.

Heliyon ; 7(9): e08017, 2021 Sep.

Article in English | MEDLINE | ID: mdl-34632136

ABSTRACT

Even though the field of Learning Analytics (LA) has experienced significant growth in the last few years, the vast majority of the works found in the literature focus on experimenting with techniques and methods over datasets restricted to a given discipline, course, or institution, and there are still few works handling region- and country-wide datasets. This may be because implementing LA at national or regional scope, using data from governments and institutions, poses many challenges that may threaten the success of such initiatives, including the very availability of data. The present article describes the experience of LA in Latin America using governmental data from Elementary and Middle Schools of the State of Norte de Santander, Colombia. The study focuses on students' performance. Data from 2013 to 2018 were collected, containing information related to 1) students' enrollment in school disciplines provided by the Regional Education Secretary, 2) students' qualifications provided by educational institutions, and 3) students' qualifications provided by the national agency for education evaluation. The methodology includes cleaning and integrating the data, followed by a descriptive and visualization analysis and the use of some educational data mining techniques (decision trees and clustering) for modeling and extracting educational patterns. A total of eight patterns of interest are extracted. In addition to the decision trees, a feature ranking analysis was performed using xgboost, and, to facilitate the visual representation of the clusters, t-SNE and self-organized maps (SOM) were applied as result projection techniques. Finally, this paper compares the main challenges mentioned in the literature with the Colombian experience and proposes an up-to-date list of challenges and solutions that can be used as a baseline for future works in this area, aligned with the Latin American context and reality.

7.

A Learning Analytics Framework to Analyze Corporal Postures in Students Presentations.

Vieira, Felipe; Cechinel, Cristian; Ramos, Vinicius; Riquelme, Fabián; Noel, Rene; Villarroel, Rodolfo; Cornide-Reyes, Hector; Munoz, Roberto.

Sensors (Basel) ; 21(4), 2021 Feb 22.

Article in English | MEDLINE | ID: mdl-33671797

ABSTRACT

Communicating in social and public environments is considered a professional skill that can strongly influence career development. Therefore, it is important to properly train and evaluate students in these abilities so that they can better interact in their professional relationships, during the resolution of problems, negotiations and conflict management. This is a complex problem as it involves corporal analysis and the assessment of aspects that until recently were almost impossible to measure quantitatively. Nowadays, a number of new technologies and sensors have been developed for the capture of different kinds of contextual and personal information, but these technologies have not yet been fully integrated into learning settings. In this context, this paper presents a framework to facilitate the analysis and detection of patterns of students in oral presentations. Four steps are proposed for the given framework: Data Collection, Statistical Analysis, Clustering, and Sequential Pattern Mining. The Data Collection step is responsible for the collection of students' interactions during presentations and the arrangement of data for further analysis. Statistical Analysis provides a general understanding of the data collected by showing the differences and similarities of the presentations along the semester. The Clustering stage segments students into groups according to well-defined attributes, helping to observe different corporal patterns of the students. Finally, the Sequential Pattern Mining step complements the previous stages, allowing the identification of sequential patterns of postures in the different groups. The framework was tested in a case study with data collected from 222 freshman students of a Computer Engineering (CE) course at three different times during two different years. The analysis made it possible to segment the presenters into three distinct groups according to their corporal postures. The statistical analysis helped to assess how the postures of the students evolved throughout each year. The sequential pattern mining provided a complementary perspective for data evaluation and helped to observe the most frequent postural sequences of the students. Results show the framework could be used as guidance to provide students with automated feedback throughout their presentations and can serve as background information for future comparisons of students' presentations from different undergraduate courses.

Subjects

Data Analysis , Learning , Posture , Students , Communication , Humans

8.

Dataset of academic performance evolution for engineering students.

Delahoz-Dominguez, Enrique; Zuluaga, Rohemi; Fontalvo-Herrera, Tomas.

Data Brief ; 30: 105537, 2020 Jun.

Article in English | MEDLINE | ID: mdl-32346571

ABSTRACT

This data article presents data on the results of engineering students in national assessments for secondary and university education. The data contain academic, social, and economic information for 12,411 students. The data were obtained by orderly crossing the databases of the Colombian Institute for the Evaluation of Education (ICFES). The structure of the data allows us to observe the influence of social variables and the evolution of students' learning skills. The data can also serve as input for analyses of academic efficiency, student recommendation systems and educational data mining. The data are presented in comma separated value format. Data can be easily accessed through the Mendeley Data Repository (https://data.mendeley.com/datasets/83tcx8psxv/1).

9.

Using Depth Cameras to Detect Patterns in Oral Presentations: A Case Study Comparing Two Generations of Computer Engineering Students.

Roque, Felipe; Cechinel, Cristian; Weber, Tiago O; Lemos, Robson; Villarroel, Rodolfo; Miranda, Diego; Munoz, Roberto.

Sensors (Basel) ; 19(16), 2019 Aug 09.

Article in English | MEDLINE | ID: mdl-31405011

ABSTRACT

Speaking and presenting in public are critical skills for academic and professional development. These skills are demanded across society, and their development and evaluation are a challenge faced by higher education institutions. It is difficult to evaluate these skills objectively, to generate valuable information for professors, and to give appropriate feedback to students. In this paper, in order to understand and detect patterns in oral student presentations, we collected data from 222 first-year Computer Engineering (CE) students at three different times, over two different years (2017 and 2018). For each presentation, using a developed system and Microsoft Kinect, we detected 12 features related to corporal postures and oral speaking. These features were used as input for the clustering and statistical analysis that allowed for identifying three different clusters in the presentations of both years, with stronger patterns in the presentations of the year 2017. A Wilcoxon rank-sum test allowed us to evaluate the evolution of the presentation attributes over each year and pointed out a convergence in terms of the reduction of the number of features statistically different between presentations given at the same course time. The results can further help to give students automatic feedback in terms of their postures and speech throughout the presentations and may serve as baseline information for future comparisons with presentations from students coming from different undergraduate courses.
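The Wilcoxon rank-sum test mentioned in the abstract above compares one feature between two groups of presentations. The following is a minimal sketch, not the authors' code, that uses only the standard library with a normal approximation and no tie correction in the variance; the function name `rank_sum_test` and the feature values are hypothetical. In practice a statistics library routine such as `scipy.stats.ranksums` would normally be used instead.

```python
import math


def rank_sum_test(sample_a, sample_b):
    """Wilcoxon rank-sum test via the normal approximation.

    Returns (W, z, p), where W is the rank sum of sample_a and p is
    two-sided. Ties get average ranks; the variance is not tie-corrected.
    """
    pooled = sorted(
        (value, group)
        for group, sample in enumerate((sample_a, sample_b))
        for value in sample
    )
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n1, n2 = len(sample_a), len(sample_b)
    w = sum(rank for rank, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return w, z, p


# Hypothetical values of one posture feature in two presentation rounds.
first_round = [1, 2, 3, 4, 5]
last_round = [6, 7, 8, 9, 10]
w, z, p = rank_sum_test(first_round, last_round)
```

Running the test once per feature and counting how many features come out with a small p-value gives the kind of "number of features statistically different" summary the abstract refers to. With samples this small the normal approximation is rough, so the sketch is meant only to show the mechanics.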
