Búsqueda | Portal Regional de la BVS

1.

What Can Interactive Visualization do for Participatory Budgeting in Chicago?

Kale, Alex; Liu, Danni; Ayala, Maria Gabriela; Schwab, Harper; McNutt, Andrew.

IEEE Trans Vis Comput Graph ; PP2024 Sep 09.

Artículo en Inglés | MEDLINE | ID: mdl-39250386

RESUMEN

Participatory budgeting (PB) is a democratic approach to allocating municipal spending that has been adopted in many places in recent years, including in Chicago. Current PB voting resembles a ballot where residents are asked which municipal projects, such as school improvements and road repairs, to fund with a limited budget. In this work, we ask how interactive visualization can benefit PB by conducting a design probe-based interview study (N = 13) with policy workers and academics with expertise in PB, urban planning, and civic HCI. Our probe explores how graphical elicitation of voter preferences and a dashboard of voting statistics can be incorporated into a realistic PB tool. Through qualitative analysis, we find that visualization creates opportunities for city government to set expectations about budget constraints while also granting their constituents greater freedom to articulate a wider range of preferences. However, using visualization to provide transparency about PB requires efforts to mitigate potential access barriers and mistrust. We call for more visualization professionals to help build civic capacity by working in and studying political systems.

2.

Mixing Linters with GUIs: A Color Palette Design Probe.

McNutt, Andrew; Stone, Maureen C; Heer, Jeffrey.

IEEE Trans Vis Comput Graph ; PP2024 Sep 11.

Artículo en Inglés | MEDLINE | ID: mdl-39259629

RESUMEN

Visualization linters are end-user facing evaluators that automatically identify potential chart issues. These spell-checker like systems offer a blend of interpretability and customization that is not found in other forms of automated assistance. However, existing linters do not model context and have primarily targeted users who do not need assistance, resulting in obvious-even annoying-advice. We investigate these issues within the domain of color palette design, which serves as a microcosm of visualization design concerns. We contribute a GUI-based color palette linter as a design probe that covers perception, accessibility, context, and other design criteria, and use it to explore visual explanations, integrated fixes, and user defined linting rules. Through a formative interview study and theory-driven analysis, we find that linters can be meaningfully integrated into graphical contexts thereby addressing many of their core issues. We discuss implications for integrating linters into visualization tools, developing improved assertion languages, and supporting end-user tunable advice-all laying the groundwork for more effective visualization linters in any context.

3.

Metrics-Based Evaluation and Comparison of Visualization Notations.

Kruchten, Nicolas; McNutt, Andrew M; McGuffin, Michael J.

IEEE Trans Vis Comput Graph ; 30(1): 425-435, 2024 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-37874719

RESUMEN

A visualization notation is a recurring pattern of symbols used to author specifications of visualizations, from data transformation to visual mapping. Programmatic notations use symbols defined by grammars or domain-specific languages (e.g. ggplot2, dplyr, Vega-Lite) or libraries (e.g. Matplotlib, Pandas). Designers and prospective users of grammars and libraries often evaluate visualization notations by inspecting galleries of examples. While such collections demonstrate usage and expressiveness, their construction and evaluation are usually ad hoc, making comparisons of different notations difficult. More rarely, experts analyze notations via usability heuristics, such as the Cognitive Dimensions of Notations framework. These analyses, akin to structured close readings of text, can reveal design deficiencies, but place a burden on the expert to simultaneously consider many facets of often complex systems. To alleviate these issues, we introduce a metrics-based approach to usability evaluation and comparison of notations in which metrics are computed for a gallery of examples across a suite of notations. While applicable to any visualization domain, we explore the utility of our approach via a case study considering statistical graphics that explores 40 visualizations across 9 widely used notations. We facilitate the computation of appropriate metrics and analysis via a new tool called NotaScope. We gathered feedback via interviews with authors or maintainers of prominent charting libraries ( n=6). We find that this approach is a promising way to formalize, externalize, and extend evaluations and comparisons of visualization notations.

4.

Open-ComBind: harnessing unlabeled data for improved binding pose prediction.

McNutt, Andrew T; Koes, David Ryan.

J Comput Aided Mol Des ; 38(1): 3, 2023 Dec 08.

Artículo en Inglés | MEDLINE | ID: mdl-38062207

RESUMEN

Determination of the bound pose of a ligand is a critical first step in many in silico drug discovery tasks. Molecular docking is the main tool for the prediction of non-covalent binding of a protein and ligand system. Molecular docking pipelines often only utilize the information of one ligand binding to the protein despite the commonly held hypothesis that different ligands share binding interactions when bound to the same receptor. Here we describe Open-ComBind, an easy-to-use, open-source version of the ComBind molecular docking pipeline that leverages information from multiple ligands without known bound structures to enhance pose selection. We first create distributions of feature similarities between ligand pose pairs, comparing near-native poses with all sampled docked poses. These distributions capture the likelihood of observing similar features, such as hydrogen bonds or hydrophobic contacts, in different pose configurations. These similarity distributions are then combined with a per-ligand docking score to enhance overall pose selection by 5% and 4.5% for high-affinity and congeneric series helper ligands, respectively. Open-ComBind reduces the average RMSD of ligands in our benchmark dataset by 9.0%. We provide Open-ComBind as an easy-to-use command line and Python API to increase pose prediction performance at www.github.com/drewnutt/open_combind .

Asunto(s)

Diseño de Fármacos , Proteínas , Simulación del Acoplamiento Molecular , Unión Proteica , Ligandos , Proteínas/química , Sitios de Unión

5.

Conformer Generation for Structure-Based Drug Design: How Many and How Good?

McNutt, Andrew T; Bisiriyu, Fatimah; Song, Sophia; Vyas, Ananya; Hutchison, Geoffrey R; Koes, David Ryan.

J Chem Inf Model ; 63(21): 6598-6607, 2023 11 13.

Artículo en Inglés | MEDLINE | ID: mdl-37903507

RESUMEN

Conformer generation, the assignment of realistic 3D coordinates to a small molecule, is fundamental to structure-based drug design. Conformational ensembles are required for rigid-body matching algorithms, such as shape-based or pharmacophore approaches, and even methods that treat the ligand flexibly, such as docking, are dependent on the quality of the provided conformations due to not sampling all degrees of freedom (e.g., only sampling torsions). Here, we empirically elucidate some general principles about the size, diversity, and quality of the conformational ensembles needed to get the best performance in common structure-based drug discovery tasks. In many cases, our findings may parallel "common knowledge" well-known to practitioners of the field. Nonetheless, we feel that it is valuable to quantify these conformational effects while reproducing and expanding upon previous studies. Specifically, we investigate the performance of a state-of-the-art generative deep learning approach versus a more classical geometry-based approach, the effect of energy minimization as a postprocessing step, the effect of ensemble size (maximum number of conformers), and construction (filtering by root-mean-square deviation for diversity) and how these choices influence the ability to recapitulate bioactive conformations and perform pharmacophore screening and molecular docking.

Asunto(s)

Algoritmos , Diseño de Fármacos , Modelos Moleculares , Simulación del Acoplamiento Molecular , Conformación Molecular , Ligandos

6.

Integrated Cytometry With Machine Learning Applied to High-Content Imaging of Human Kidney Tissue for In Situ Cell Classification and Neighborhood Analysis.

Winfree, Seth; McNutt, Andrew T; Khochare, Suraj; Borgard, Tyler J; Barwinska, Daria; Sabo, Angela R; Ferkowicz, Michael J; Williams, James C; Lingeman, James E; Gulbronson, Connor J; Kelly, Katherine J; Sutton, Timothy A; Dagher, Pierre C; Eadon, Michael T; Dunn, Kenneth W; El-Achkar, Tarek M.

Lab Invest ; 103(6): 100104, 2023 06.

Artículo en Inglés | MEDLINE | ID: mdl-36867975

RESUMEN

The human kidney is a complex organ with various cell types that are intricately organized to perform key physiological functions and maintain homeostasis. New imaging modalities, such as mesoscale and highly multiplexed fluorescence microscopy, are increasingly being applied to human kidney tissue to create single-cell resolution data sets that are both spatially large and multidimensional. These single-cell resolution high-content imaging data sets have great potential to uncover the complex spatial organization and cellular makeup of the human kidney. Tissue cytometry is a novel approach used for the quantitative analysis of imaging data; however, the scale and complexity of such data sets pose unique challenges for processing and analysis. We have developed the Volumetric Tissue Exploration and Analysis (VTEA) software, a unique tool that integrates image processing, segmentation, and interactive cytometry analysis into a single framework on desktop computers. Supported by an extensible and open-source framework, VTEA's integrated pipeline now includes enhanced analytical tools, such as machine learning, data visualization, and neighborhood analyses, for hyperdimensional large-scale imaging data sets. These novel capabilities enable the analysis of mesoscale 2- and 3-dimensional multiplexed human kidney imaging data sets (such as co-detection by indexing and 3-dimensional confocal multiplexed fluorescence imaging). We demonstrate the utility of this approach in identifying cell subtypes in the kidney on the basis of labels, spatial association, and their microenvironment or neighborhood membership. VTEA provides an integrated and intuitive approach to decipher the cellular and spatial complexity of the human kidney and complements other transcriptomics and epigenetic efforts to define the landscape of kidney cell types.

Asunto(s)

Imagenología Tridimensional , Riñón , Humanos , Riñón/diagnóstico por imagen , Imagenología Tridimensional/métodos , Procesamiento de Imagen Asistido por Computador/métodos , Programas Informáticos , Aprendizaje Automático

7.

No Grammar to Rule Them All: A Survey of JSON-style DSLs for Visualization.

McNutt, Andrew M.

IEEE Trans Vis Comput Graph ; 29(1): 160-170, 2023 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-36166549

RESUMEN

There has been substantial growth in the use of JSON-based grammars, as well as other standard data serialization languages, to create visualizations. Each of these grammars serves a purpose: some focus on particular computational tasks (such as animation), some are concerned with certain chart types (such as maps), and some target specific data domains (such as ML). Despite the prominence of this interface form, there has been little detailed analysis of the characteristics of these languages. In this study, we survey and analyze the design and implementation of 57 JSON-style DSLs for visualization. We analyze these languages supported by a collected corpus of examples for each DSL (consisting of 4395 instances) across a variety of axes organized into concerns related to domain, conceptual model, language relationships, affordances, and general practicalities. We identify tensions throughout these areas, such as between formal and colloquial specifications, among types of users, and within the composition of languages. Through this work, we seek to support language implementers by elucidating the choices, opportunities, and tradeoffs in visualization DSL design.

8.

Goethe and Candolle: National forms of scientific writing?

Kim, Agatha Seo-Hyun; McNutt, Andrew.

Theory Biosci ; 141(3): 321-338, 2022 Sep.

Artículo en Inglés | MEDLINE | ID: mdl-35953686

RESUMEN

What role does nationality-or the image of a nation-play in how one thinks and receives scientific ideas? This paper investigates the commonly held ideas about "German science" and "French science" in early nineteenth-century France. During the politically turbulent time, the seemingly independent scientific community found itself in a difficult position: first, between the cosmopolitan ideals of scientific community and the invasive political reality, and second, between the popularized image of national differences and the actual comparisons of international scientific ideas. The tension between multiple sets of fictions and realities underscores the fragility of the concept of nationality as a scientific measure. A case study comparing morphological ideas, receptions in France, and the actual scientific texts of J. W. von Goethe and A. P. de Candolle further illustrates this fragility. Goethe and Candolle make an ideal comparative case because they were received in very different lights despite their similar concept of the plant type. Our sentence-classification and visualization methods are applied to their scientific texts, to compare the actual compositions and forms of the texts that purportedly represented German and French sciences. This paper concludes that there was a gap between what French readers assumed they read and what they really read, when it came to foreign scientific texts. The differences between Goethe's and Candolle's texts transcended the perceived national differences between German Romanticism and French Classicism.

Asunto(s)

Plantas , Escritura , Alemania , Historia del Siglo XIX , Internacionalidad , Lenguaje

9.

Improving ΔΔG Predictions with a Multitask Convolutional Siamese Network.

McNutt, Andrew T; Koes, David Ryan.

J Chem Inf Model ; 62(8): 1819-1829, 2022 04 25.

Artículo en Inglés | MEDLINE | ID: mdl-35380443

RESUMEN

The lead optimization phase of drug discovery refines an initial hit molecule for desired properties, especially potency. Synthesis and experimental testing of the small perturbations during this refinement can be quite costly and time-consuming. Relative binding free energy (RBFE, also referred to as ΔΔG) methods allow the estimation of binding free energy changes after small changes to a ligand scaffold. Here, we propose and evaluate a Siamese convolutional neural network (CNN) for the prediction of RBFE between two bound ligands. We show that our multitask loss is able to improve on a previous state-of-the-art Siamese network for RBFE prediction via increased regularization of the latent space. The Siamese network architecture is well suited to the prediction of RBFE in comparison to a standard CNN trained on the same data (Pearson's R of 0.553 and 0.5, respectively). When evaluated on a left-out protein family, our Siamese CNN shows variability in its RBFE predictive performance depending on the protein family being evaluated (Pearson's R ranging from -0.44 to 0.97). RBFE prediction performance can be improved during generalization by injecting only a few examples (few-shot learning) from the evaluation data set during model training.

Asunto(s)

Redes Neurales de la Computación , Proteínas , Descubrimiento de Drogas , Entropía , Ligandos , Proteínas/química

10.

GNINA 1.0: molecular docking with deep learning.

McNutt, Andrew T; Francoeur, Paul; Aggarwal, Rishal; Masuda, Tomohide; Meli, Rocco; Ragoza, Matthew; Sunseri, Jocelyn; Koes, David Ryan.

J Cheminform ; 13(1): 43, 2021 Jun 09.

Artículo en Inglés | MEDLINE | ID: mdl-34108002

RESUMEN

Molecular docking computationally predicts the conformation of a small molecule when binding to a receptor. Scoring functions are a vital piece of any molecular docking pipeline as they determine the fitness of sampled poses. Here we describe and evaluate the 1.0 release of the Gnina docking software, which utilizes an ensemble of convolutional neural networks (CNNs) as a scoring function. We also explore an array of parameter values for Gnina 1.0 to optimize docking performance and computational cost. Docking performance, as evaluated by the percentage of targets where the top pose is better than 2Å root mean square deviation (Top1), is compared to AutoDock Vina scoring when utilizing explicitly defined binding pockets or whole protein docking. GNINA, utilizing a CNN scoring function to rescore the output poses, outperforms AutoDock Vina scoring on redocking and cross-docking tasks when the binding pocket is defined (Top1 increases from 58% to 73% and from 27% to 37%, respectively) and when the whole protein defines the binding pocket (Top1 increases from 31% to 38% and from 12% to 16%, respectively). The derived ensemble of CNNs generalizes to unseen proteins and ligands and produces scores that correlate well with the root mean square deviation to the known binding pose. We provide the 1.0 version of GNINA under an open source license for use as a molecular docking tool at https://github.com/gnina/gnina .

11.

In Situ Classification of Cell Types in Human Kidney Tissue Using 3D Nuclear Staining.

Woloshuk, Andre; Khochare, Suraj; Almulhim, Aljohara F; McNutt, Andrew T; Dean, Dawson; Barwinska, Daria; Ferkowicz, Michael J; Eadon, Michael T; Kelly, Katherine J; Dunn, Kenneth W; Hasan, Mohammad A; El-Achkar, Tarek M; Winfree, Seth.

Cytometry A ; 99(7): 707-721, 2021 07.

Artículo en Inglés | MEDLINE | ID: mdl-33252180

RESUMEN

To understand the physiology and pathology of disease, capturing the heterogeneity of cell types within their tissue environment is fundamental. In such an endeavor, the human kidney presents a formidable challenge because its complex organizational structure is tightly linked to key physiological functions. Advances in imaging-based cell classification may be limited by the need to incorporate specific markers that can link classification to function. Multiplex imaging can mitigate these limitations, but requires cumulative incorporation of markers, which may lead to tissue exhaustion. Furthermore, the application of such strategies in large scale 3-dimensional (3D) imaging is challenging. Here, we propose that 3D nuclear signatures from a DNA stain, DAPI, which could be incorporated in most experimental imaging, can be used for classifying cells in intact human kidney tissue. We developed an unsupervised approach that uses 3D tissue cytometry to generate a large training dataset of nuclei images (NephNuc), where each nucleus is associated with a cell type label. We then devised various supervised machine learning approaches for kidney cell classification and demonstrated that a deep learning approach outperforms classical machine learning or shape-based classifiers. Specifically, a custom 3D convolutional neural network (NephNet3D) trained on nuclei image volumes achieved a balanced accuracy of 80.26%. Importantly, integrating NephNet3D classification with tissue cytometry allowed in situ visualization of cell type classifications in kidney tissue. In conclusion, we present a tissue cytometry and deep learning approach for in situ classification of cell types in human kidney tissue using only a DNA stain. This methodology is generalizable to other tissues and has potential advantages on tissue economy and non-exhaustive classification of different cell types.

Asunto(s)

Aprendizaje Automático , Redes Neurales de la Computación , Humanos , Riñón , Coloración y Etiquetado , Aprendizaje Automático Supervisado

12.

Data Mining and Computational Modeling of High-Throughput Screening Datasets.

Ekins, Sean; Clark, Alex M; Dole, Krishna; Gregory, Kellan; Mcnutt, Andrew M; Spektor, Anna Coulon; Weatherall, Charlie; Litterman, Nadia K; Bunin, Barry A.

Methods Mol Biol ; 1755: 197-221, 2018.

Artículo en Inglés | MEDLINE | ID: mdl-29671272

RESUMEN

We are now seeing the benefit of investments made over the last decade in high-throughput screening (HTS) that is resulting in large structure activity datasets entering public and open databases such as ChEMBL and PubChem. The growth of academic HTS screening centers and the increasing move to academia for early stage drug discovery suggests a great need for the informatics tools and methods to mine such data and learn from it. Collaborative Drug Discovery, Inc. (CDD) has developed a number of tools for storing, mining, securely and selectively sharing, as well as learning from such HTS data. We present a new web based data mining and visualization module directly within the CDD Vault platform for high-throughput drug discovery data that makes use of a novel technology stack following modern reactive design principles. We also describe CDD Models within the CDD Vault platform that enables researchers to share models, share predictions from models, and create models from distributed, heterogeneous data. Our system is built on top of the Collaborative Drug Discovery Vault Activity and Registration data repository ecosystem which allows users to manipulate and visualize thousands of molecules in real time. This can be performed in any browser on any platform. In this chapter we present examples of its use with public datasets in CDD Vault. Such approaches can complement other cheminformatics tools, whether open source or commercial, in providing approaches for data mining and modeling of HTS data.

Asunto(s)

Biología Computacional/métodos , Minería de Datos/métodos , Bases de Datos Farmacéuticas , Conjuntos de Datos como Asunto , Descubrimiento de Drogas/métodos , Programas Informáticos

13.

Open Source Bayesian Models. 1. Application to ADME/Tox and Drug Discovery Datasets.

Clark, Alex M; Dole, Krishna; Coulon-Spektor, Anna; McNutt, Andrew; Grass, George; Freundlich, Joel S; Reynolds, Robert C; Ekins, Sean.

J Chem Inf Model ; 55(6): 1231-45, 2015 Jun 22.

Artículo en Inglés | MEDLINE | ID: mdl-25994950

RESUMEN

On the order of hundreds of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) models have been described in the literature in the past decade which are more often than not inaccessible to anyone but their authors. Public accessibility is also an issue with computational models for bioactivity, and the ability to share such models still remains a major challenge limiting drug discovery. We describe the creation of a reference implementation of a Bayesian model-building software module, which we have released as an open source component that is now included in the Chemistry Development Kit (CDK) project, as well as implemented in the CDD Vault and in several mobile apps. We use this implementation to build an array of Bayesian models for ADME/Tox, in vitro and in vivo bioactivity, and other physicochemical properties. We show that these models possess cross-validation receiver operator curve values comparable to those generated previously in prior publications using alternative tools. We have now described how the implementation of Bayesian models with FCFP6 descriptors generated in the CDD Vault enables the rapid production of robust machine learning models from public data or the user's own datasets. The current study sets the stage for generating models in proprietary software (such as CDD) and exporting these models in a format that could be run in open source software using CDK components. This work also demonstrates that we can enable biocomputation across distributed private or public datasets to enhance drug discovery.

Asunto(s)

Absorción Fisicoquímica , Bases de Datos Farmacéuticas , Descubrimiento de Drogas/métodos , Efectos Colaterales y Reacciones Adversas Relacionados con Medicamentos , Preparaciones Farmacéuticas/química , Preparaciones Farmacéuticas/metabolismo , Programas Informáticos , Animales , Teorema de Bayes , Simulación por Computador , Humanos , Ratones

RESUMEN

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA