Results 1 - 7 of 7
1.
J Am Med Inform Assoc ; 31(8): 1648-1656, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-38916911

ABSTRACT

OBJECTIVE: Author name incompleteness, in which only the first initial is available instead of the full first name, is a long-standing problem in MEDLINE and has a negative impact on biomedical literature systems. The purpose of this study is to create an Enhanced Author Names (EAN) dataset for MEDLINE that maximizes the number of complete author names. MATERIALS AND METHODS: The EAN dataset is built through large-scale name comparison and restoration, using author names collected from multiple literature databases such as MEDLINE, Microsoft Academic Graph, and Semantic Scholar. We assess the impact of EAN on biomedical literature systems by conducting comparative and statistical analyses between EAN and MEDLINE's author names dataset (MAN) on 2 important tasks, author name search and author name disambiguation. RESULTS: Evaluation results show that EAN increases the number of full author names in MEDLINE from 69.73 million to 110.9 million. EAN not only restores a substantial number of abbreviated names from before 2002, when the NLM changed its author name indexing policy, but also improves the availability of full author names in articles published afterward. The evaluation of the author name search and author name disambiguation tasks reveals that EAN significantly enhances both tasks compared to MAN. CONCLUSION: The extensive coverage of full names in EAN suggests that the name incompleteness issue can be largely mitigated. This has significant implications for the development of an improved biomedical literature system. EAN is available at https://zenodo.org/record/10251358, and an updated version is available at https://zenodo.org/records/10663234.


Subject(s)
Authorship, MEDLINE, Periodicals as Topic, Names
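
The name comparison and restoration step described in this abstract can be illustrated with a minimal sketch. It is not the EAN pipeline: the functions `normalize`, `is_initial_of`, and `restore_first_name`, and the `(last_name, first_name)` candidate format, are hypothetical names chosen for illustration. The sketch assumes the candidates come from records of the same article in other databases, and it restores a full first name only when all compatible candidates agree.

```python
from typing import Dict, Iterable, Optional, Tuple


def normalize(name: str) -> str:
    """Lowercase a name part and drop periods and surrounding whitespace."""
    return name.replace(".", "").strip().lower()


def is_initial_of(abbreviated: str, full: str) -> bool:
    """True when `abbreviated` is a single initial matching the first letter of `full`."""
    a, f = normalize(abbreviated), normalize(full)
    return len(a) == 1 and len(f) > 1 and f.startswith(a)


def restore_first_name(
    last_name: str,
    first_name: str,
    candidates: Iterable[Tuple[str, str]],
) -> Optional[str]:
    """Return a full first name for an initial-only author, or None if unresolved.

    `candidates` holds (last_name, first_name) pairs for the same article
    taken from other databases (a hypothetical input format).
    """
    matches: Dict[str, str] = {}
    for cand_last, cand_first in candidates:
        same_last = normalize(cand_last) == normalize(last_name)
        if same_last and is_initial_of(first_name, cand_first):
            # Keep the first spelling seen for each distinct full name.
            matches.setdefault(normalize(cand_first), cand_first.strip())
    # Restore only when every compatible source agrees on one full name.
    if len(matches) == 1:
        return next(iter(matches.values()))
    return None
```

Refusing to restore when sources disagree trades coverage for precision; a real system would likely weigh additional evidence before choosing between conflicting candidates.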
2.
Account Res ; : 1-24, 2024 May 05.
Article in English | MEDLINE | ID: mdl-38704656

ABSTRACT

The perennial problem of author name ambiguity has attracted increasing attention in the academic community. Drawing on the literature, this article first highlights the pervasiveness of the problem and discusses its adverse consequences. It then analyzes the behavioral causes of the problem in the Chinese context and attributes them to personal, cultural, and institutional factors. Informed by this analysis and recognizing ORCID as a promising solution, we propose an ORCID-based "Prevention plus Cure" campaign against author name ambiguity. The prevention objective relies on researchers' consistent use of ORCID, while the cure objective involves retrospectively integrating ORCIDs into backfile publications. We also outline the responsibilities of various stakeholders to ensure the success of the campaign. Furthermore, we argue that universal adoption of ORCID can help curb authorship-related misconduct, discern predatory journals and publishers, and track researchers' undesirable records of academic publishing. We then analyze the current status of ORCID adoption in China, identify potential challenges, propose tentative solutions to address them, and highlight ORCID as a tool that can be utilized to empower China's combat against research misconduct. In conclusion, we emphasize the importance of conducting empirical research to inform more effective promotion of ORCID adoption in China.

3.
PeerJ Comput Sci ; 9: e1536, 2023.
Article in English | MEDLINE | ID: mdl-37810360

ABSTRACT

Scholarly knowledge graphs (SKG) are knowledge graphs representing research-related information, powering discovery and statistics about research impact and trends. Author name disambiguation (AND) is required to produce high-quality SKGs, as a disambiguated set of authors is fundamental to ensure a coherent view of researchers' activity. Various issues, such as homonymy, scarcity of contextual information, and cardinality of the SKG, make simple name string matching insufficient or computationally complex. Many AND deep learning methods have been developed, and interesting surveys exist in the literature, comparing the approaches in terms of techniques, complexity, performance, etc. However, none of them specifically addresses AND methods in the context of SKGs, where the entity-relationship structure can be exploited. In this paper, we discuss recent graph-based methods for AND, define a framework through which such methods can be compared, and catalog the most popular datasets and benchmarks used to test such methods. Finally, we outline possible directions for future work on this topic.
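
To make the idea of exploiting entity-relationship structure concrete, here is a deliberately simple sketch. It is not one of the methods surveyed in the paper: it assumes that all mentions already share the same ambiguous name, and it merges mentions that share at least one coauthor using a union-find structure. The function name `cluster_mentions` and the input format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def cluster_mentions(mentions: Dict[str, Set[str]]) -> List[Set[str]]:
    """Group author mentions that share at least one coauthor.

    `mentions` maps a mention id (for example, a paper id) to the set of
    coauthor names on that paper. All mentions are assumed to carry the
    same ambiguous author name.
    """
    parent = {m: m for m in mentions}

    def find(x: str) -> str:
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Index mentions by coauthor so each shared coauthor links its mentions.
    by_coauthor: Dict[str, List[str]] = defaultdict(list)
    for mention, coauthors in mentions.items():
        for coauthor in coauthors:
            by_coauthor[coauthor].append(mention)
    for linked in by_coauthor.values():
        for other in linked[1:]:
            union(linked[0], other)

    # Collect mentions under their root to form the final clusters.
    clusters: Dict[str, Set[str]] = defaultdict(set)
    for mention in mentions:
        clusters[find(mention)].add(mention)
    return list(clusters.values())
```

A shared coauthor is only one possible graph signal; the homonymy and sparse-context problems named in the abstract are exactly the cases where this heuristic would over-merge or under-merge.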

4.
J Am Med Inform Assoc ; 28(9): 1919-1927, 2021 08 13.
Article in English | MEDLINE | ID: mdl-34180522

ABSTRACT

OBJECTIVE: PubMed has suffered from the author ambiguity problem for many years. Existing studies on author name disambiguation (AND) for PubMed only used internal metadata for development. However, some of this metadata is incomplete (eg, a large number of names are only abbreviated and their full names are not available) or less discriminative. To this end, we present a new disambiguation method, namely AggAND, by aggregating information from external databases. MATERIALS AND METHODS: We address this issue by exploring Microsoft Academic Graph, Semantic Scholar, and PubMed Knowledge Graph to enhance the built-in name metadata, and extend the internal metadata with some external and more discriminative metadata. RESULTS: Experimental results on enhanced name metadata demonstrate comparable performance to 3 author identifier systems and superiority over the original name metadata. More importantly, our method, AggAND, incorporating both enhanced name and extended metadata, yields F1 scores of 95.80% and 93.71% on 2 datasets and outperforms the state-of-the-art method by a large margin (3.61% and 6.55%, respectively). CONCLUSIONS: The feasibility and good performance of our methods not only help better understand the importance of external databases for disambiguation, but also point to a promising direction for future AND studies in which information aggregated from multiple bibliographic databases can be effective in improving disambiguation performance. The methodology shown here can be generalized to broader bibliographic databases beyond PubMed. Our code and data are available online (https://github.com/carmanzhang/PubMed-AND-method).


Subject(s)
Metadata, Semantics, Bibliographic Databases, Factual Databases, PubMed
5.
J Am Med Inform Assoc ; 26(10): 1037-1045, 2019 10 01.
Article in English | MEDLINE | ID: mdl-30958542

ABSTRACT

OBJECTIVE: Author-centric analyses of fast-growing biomedical reference databases are challenging due to author ambiguity. This problem has been mainly addressed through author disambiguation using supervised machine-learning algorithms. Such algorithms, however, require adequately designed gold standards that reflect the reference database properly. In this study we used MEDLINE to build the first unbiased gold standard in a reference database and improve over the existing state of the art in author disambiguation. MATERIALS AND METHODS: Following a new corpus design method, publication pairs randomly picked from MEDLINE were evaluated by both crowdsourcing and expert curators. Because the latter showed higher accuracy than crowdsourcing, expert curators were tasked to create a full corpus. The corpus was then used to explore new features that could improve state-of-the-art author disambiguation algorithms that would not have been discoverable with previously existing gold standards. RESULTS: We created a gold standard based on 1900 publication pairs that shows close similarity to MEDLINE in terms of chronological distribution and information completeness. A machine-learning algorithm that includes new features related to the ethnic origin of authors showed significant improvements over the current state of the art and demonstrates the necessity of realistic gold standards to further develop effective author disambiguation algorithms. DISCUSSION AND CONCLUSION: An unbiased gold standard can give a more accurate picture of the status of author disambiguation research and help in the discovery of new features for machine learning. The principles and methods shown here can be applied to other reference databases beyond MEDLINE. The gold standard and code used for this study are available at the following repository: https://github.com/amorgani/AND/.


Subject(s)
Authorship, Data Mining/methods, MEDLINE, Machine Learning, Reference Standards, Algorithms, Crowdsourcing, Bibliographic Databases/standards, MEDLINE/standards
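
A gold standard made of labeled publication pairs, as in this record, lends itself to pair-level scoring. The sketch below shows one common way to do that and is not taken from the study's repository: the function `pairwise_scores` and the dictionary format are assumptions, with each pair mapped to `True` when both publications are judged to be by the same author. Pairs missing from the predictions are treated as predicted "different".

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]


def pairwise_scores(
    gold: Dict[Pair, bool],
    predicted: Dict[Pair, bool],
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for same-author decisions on gold pairs."""
    tp = fp = fn = 0
    for pair, same_author in gold.items():
        guess = predicted.get(pair, False)  # missing pair counts as "different"
        if guess and same_author:
            tp += 1
        elif guess and not same_author:
            fp += 1
        elif same_author:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Because the scores depend on which pairs are sampled, a gold standard that mirrors the database's chronological distribution and completeness, as the abstract argues, gives a more representative estimate.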
6.
Scientometrics ; 111(3): 1467-1500, 2017.
Article in English | MEDLINE | ID: mdl-28596627

ABSTRACT

Data sets of publication metadata with manually disambiguated author names play an important role in current author name disambiguation (AND) research. We review the most important data sets used so far, and compare their respective advantages and shortcomings. From the results of this review, we derive a set of general requirements for future AND data sets. These include both trivial requirements, like absence of errors and preservation of author order, and more substantial ones, like full disambiguation and adequate representation of publications with a small number of authors and highly variable author names. On the basis of these requirements, we create and make publicly available a new AND data set, SCAD-zbMATH. Both the quantitative analysis of this data set and the results of our initial AND experiments with a naive baseline algorithm show the SCAD-zbMATH data set to be considerably different from existing ones. We consider it a useful new resource that will challenge the state of the art in AND and benefit the AND research community.

7.
Med Ref Serv Q ; 34(2): 190-201, 2015.
Article in English | MEDLINE | ID: mdl-25927511

ABSTRACT

Scopus and Web of Science are the two major citation databases that collect and disseminate bibliometric statistics about research articles, journals, institutions, and individual authors. Liaison librarians are now regularly called upon to utilize these databases to assist faculty in finding citation activity on their published works for tenure and promotion, grant applications, and more. But questions about the accuracy, scope, and coverage of these tools deserve closer scrutiny. Discrepancies in citation capture led to a systematic study on how Scopus and Web of Science compared in a real-life situation encountered by liaisons: comparing three different disciplines at a medical school and nursing program. How many articles would each database retrieve for each faculty member using the author-searching tools provided? How many cited references for each faculty member would each tool generate? Results demonstrated troubling differences in publication and citation activity capture between Scopus and Web of Science. Implications for librarians are discussed.


Subject(s)
Bibliometrics, Bibliographic Databases, Information Storage and Retrieval/methods, Medical Libraries, Gynecology, Nursing, Obstetrics, Periodicals as Topic, Pharmacy