Results 1 - 20 of 39
1.
J Appl Stat ; 51(9): 1729-1755, 2024.
Article in English | MEDLINE | ID: mdl-38933136

ABSTRACT

We introduce the bivariate unit-log-symmetric model based on the bivariate log-symmetric distribution (BLS) defined in Vila et al. [25] as a flexible family of bivariate distributions over the unit square. We then study its mathematical properties, such as stochastic representations, quantiles, conditional distributions, independence of the marginal distributions and marginal moments. The maximum likelihood estimation method is discussed and examined through Monte Carlo simulation. Finally, the proposed model is used to analyze some soccer data sets.

2.
J Appl Stat ; 51(4): 701-720, 2024.
Article in English | MEDLINE | ID: mdl-38476620

ABSTRACT

The list of occurrences linked to significant climate change has grown in recent decades. These changes can be influenced by a set of covariates, such as temperature, location and period of the year. Analyzing the relation among elements and factors that influence the behavior of such events is extremely important for decision-making in order to minimize damages and losses. Exceedance analysis uses the tail of the distribution based on Extreme Value Theory (EVT). Extensions for these models have been proposed in the literature, such as regression models for the tail parameters and a parametric or semi-parametric distribution for the part that comes before the tail (known as the bulk distribution). This work presents a new extension to the exceedance model, in which the parameters for the bulk distribution capture the effect of covariates such as location and seasonality. We considered a Bayesian approach in the inference procedure. The estimation was done using Markov chain Monte Carlo (MCMC) methods. Application results for modeling maximum and minimum temperature data showed an efficient estimation of extreme quantiles and a predictive advantage compared to models previously used in the literature.
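
Illustrative sketch (not taken from the paper): the abstract refers to extreme quantiles estimated from the tail above a threshold. Assuming the common EVT setup in which exceedances over a threshold `u` follow a generalized Pareto distribution with scale `sigma` and shape `xi`, and `zeta_u` is the probability of exceeding `u`, the tail survival function and its inverse could look as follows. All function and parameter names are hypothetical, and the covariate-dependent bulk distribution proposed in the paper is not modeled here.

```python
import math


def tail_survival(x, u, sigma, xi, zeta_u):
    """P(X > x) for x >= u under a generalized Pareto tail above threshold u."""
    z = (x - u) / sigma
    if abs(xi) < 1e-12:  # exponential-tail limit as xi -> 0
        return zeta_u * math.exp(-z)
    return zeta_u * max(0.0, 1.0 + xi * z) ** (-1.0 / xi)


def tail_quantile(p, u, sigma, xi, zeta_u):
    """Quantile x_p with P(X <= x_p) = p, valid only in the tail (p > 1 - zeta_u)."""
    if not (1.0 - zeta_u < p < 1.0):
        raise ValueError("p must satisfy 1 - zeta_u < p < 1")
    ratio = zeta_u / (1.0 - p)
    if abs(xi) < 1e-12:
        return u + sigma * math.log(ratio)
    return u + (sigma / xi) * (ratio ** xi - 1.0)
```

For example, `tail_quantile(0.99, u=30.0, sigma=2.0, xi=0.1, zeta_u=0.05)` returns the level exceeded with probability 0.01 under these assumed (made-up) parameter values; in a Bayesian analysis the function would be evaluated at each posterior draw of the parameters.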

3.
Heliyon ; 10(2): e24047, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38293372

ABSTRACT

This work proposes a new methodology to identify and validate deep learning models for artificial oil lift systems that use submersible electric pumps. The proposed methodology allows for obtaining the models and evaluating the prediction's uncertainty jointly and systematically. The methodology employs a nonlinear model to generate training and validation data and the Markov Chain Monte Carlo algorithm to assess the neural network's epistemic uncertainty. The nonlinear model was used to overcome the limitation imposed by the need for big datasets when training deep learning models. However, the developed models are validated against experimental data after training and validation with synthetic data. The validation is also performed through the models' uncertainty assessment and experimental data. From the implementation point of view, the method was coded in Python, with the TensorFlow and Keras libraries used to build the neural networks and find the hyperparameters. The results show that the proposed methodology obtained models representing both the nonlinear model's dynamic behavior and the experimental data. It provides a most probable value close to the experimental data, and the uncertainty of the generated deep learning models has the same order of magnitude as that of the nonlinear model. This uncertainty assessment shows that the built models were adequately validated. The proposed deep learning models can be applied in several applications requiring a reliable and computationally lighter model. Hence, the obtained AI dynamic models can be employed for digital twin construction, control, and optimization.

4.
J Appl Stat ; 50(10): 2194-2208, 2023.
Article in English | MEDLINE | ID: mdl-37434632

ABSTRACT

In this paper, we propose a hierarchical Bayesian approach for modeling the evolution of the 7-day moving average for the number of deaths due to COVID-19 in a country, state or city. The proposed approach is based on a Gaussian process regression model. The main advantage of this model is that it assumes that a nonlinear function f used for modeling the observed data is an unknown random parameter, in contrast to the usual approaches that set up f as a known mathematical function. This assumption allows the development of a Bayesian approach with a Gaussian process prior over f. In order to estimate the parameters of interest, we develop an MCMC algorithm based on the Metropolis-within-Gibbs sampling algorithm. We also present a procedure for making predictions. The proposed method is illustrated in a case study in which we model the 7-day moving average for the number of deaths recorded in the state of São Paulo, Brazil. The results obtained show that the proposed method is very effective in modeling and predicting the values of the 7-day moving average.

5.
Sci. agric ; 80: e20220056, 2023. tab, illus
Article in English | VETINDEX | ID: biblio-1410169

ABSTRACT

Among the multi-trait models selected to study several traits and environments jointly, the Bayesian framework has been a preferred tool when constructing a more complex and biologically realistic model. In most cases, non-informative prior distributions are adopted in studies using the Bayesian approach. However, the Bayesian approach presents more accurate estimates when informative prior distributions are used. The present study was developed to evaluate the efficiency and applicability of multi-trait multi-environment (MTME) models within a Bayesian framework utilizing a strategy for eliciting informative prior distributions using previous data on rice. The study involved data pertaining to rice (Oryza sativa L.) genotypes in three environments and five crop seasons (2010/2011 until 2014/2015) for the following traits: grain yield (GY), flowering in days (FLOR) and plant height (PH). Variance components and genetic and non-genetic parameters were estimated using the Bayesian method. In general, the informative prior distribution in Bayesian MTME models provided higher estimates of individual narrow-sense heritability and variance components, as well as shorter lengths of the highest probability density interval (HPD), compared to their respective non-informative prior distribution analyses. More informative prior distributions make it possible to detect genetic correlations between traits, which cannot be achieved with non-informative prior distributions. Therefore, the mechanism presented here for updating knowledge to elicit an informative prior distribution can be efficiently applied in rice breeding programs.


Subjects
Oryza/growth & development , Food, Genetically Modified/statistics & numerical data
6.
J Appl Stat ; 49(13): 3436-3450, 2022.
Article in English | MEDLINE | ID: mdl-36213780

ABSTRACT

According to the Atlas of Human Development in Brazil, the income dimension of the Municipal Human Development Index (MHDI-I) is an indicator of the ability of a municipality's population to ensure a minimum standard of living that covers basic needs, such as water, food and shelter. In public policy, one of the research objectives is to identify social and economic variables that are associated with this index. Due to income inequality, evaluating these associations at quantiles, rather than at the mean, could be of greater interest. Thus, in this paper, we develop a Bayesian variable selection method for quantile regression models with hierarchical random effects. In particular, we assume a likelihood function based on the Generalized Asymmetric Laplace distribution, and a spike-and-slab prior is used to perform variable selection. The Generalized Asymmetric Laplace distribution is a more general alternative to the Asymmetric Laplace distribution, which is commonly used in quantile regression under the Bayesian paradigm. The performance of the proposed method is evaluated via a comprehensive simulation study, and it is applied to the MHDI-I of municipalities located in the state of Rio de Janeiro.
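
Illustrative sketch (not taken from the paper): quantile regression rests on the check (pinball) loss, and the ordinary Asymmetric Laplace likelihood mentioned in the abstract is tied to that loss. The toy code below only shows that minimizing the check loss over a constant recovers a sample quantile. The function names are hypothetical, and the Generalized Asymmetric Laplace likelihood, the random effects and the spike-and-slab prior are not implemented.

```python
def check_loss(u, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def best_constant(ys, tau):
    """Observed value that minimizes the total check loss at quantile level tau."""
    return min(ys, key=lambda c: sum(check_loss(y - c, tau) for y in ys))
```

With the made-up sample `[1, 2, 3, 4, 100]`, the minimizer at `tau=0.5` is the median, which the outlier does not move, while higher values of `tau` shift the minimizer toward the upper tail. This is the sense in which quantiles can describe an unequal income distribution better than the mean.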

7.
Entropy (Basel) ; 24(7)2022 Jun 29.
Article in English | MEDLINE | ID: mdl-35885116

ABSTRACT

Crime is a negative phenomenon that affects the daily life of the population and its development. When modeling crime data, assumptions on either the spatial or the temporal relationship between observations are necessary if any statistical analysis is to be performed. In this paper, we structure space-time dependency for count data by considering a stochastic difference equation for the intensity of the space-time process rather than placing structure on a latent space-time process, as Cox processes would do. We introduce a class of spatially correlated self-exciting spatio-temporal models for count data that capture both dependence due to self-excitation and dependence in an underlying spatial process. We follow the principles in Clark and Dixon (2021) but consider a generalized additive structure on spatio-temporally varying covariates. A Bayesian framework is proposed for inference of model parameters. We analyze three distinct crime datasets in the city of Riobamba (Ecuador). Our model fits the data well and provides better predictions than other alternatives.

8.
One Health ; 14: 100359, 2022 Jun.
Article in English | MEDLINE | ID: mdl-34977321

ABSTRACT

Echinococcus granulosus sensu lato is a globally prevalent zoonotic parasitic cestode leading to cystic echinococcosis (CE) in both humans and sheep with both medical and financial impacts, whose reduction requires the application of a One Health approach to its control. Regarding the animal health component of this approach, lack of accurate and practical diagnostics in livestock impedes the assessment of disease burden and the implementation and evaluation of control strategies. We use a Bayesian Latent Class Analysis (LCA) model to estimate ovine CE prevalence in sheep samples from the Río Negro province of Argentina, accounting for uncertainty in the diagnostics. We use model outputs to evaluate the performance of a novel recombinant B8/2 antigen B subunit (rEgAgB8/2) indirect enzyme-linked immunosorbent assay (ELISA) for detecting E. granulosus in sheep. Necropsy (as a partial gold standard), western blot (WB) and ELISA diagnostic data were collected from 79 sheep within two Río Negro slaughterhouses, and used to estimate individual infection status (assigned as a latent variable within the model). Using the model outputs, the performance of the novel ELISA at the individual and flock levels was evaluated, respectively, by using a receiver operating characteristic (ROC) curve and by simulating a range of sample sizes and prevalence levels within hypothetical flocks. The estimated (mean) prevalence of ovine CE was 27.5% (95% Bayesian credible interval (95% BCI): 13.8%-58.9%) within the sample population. At the individual level, the ELISA had a mean sensitivity and specificity of 55% (95% BCI: 46%-68%) and 68% (95% BCI: 63%-92%), respectively, at an optimal optical density (OD) threshold of 0.378. At the flock level, the ELISA had an 80% probability of correctly classifying infection at an optimal cut-off threshold of 0.496.
These results suggest that the novel ELISA could play a useful role as a flock-level diagnostic for CE surveillance in the region, supplementing surveillance activities in the human population and thus strengthening a One Health approach. Importantly, selection of ELISA cut-off threshold values must be tailored according to the epidemiological situation.

9.
Entropy (Basel) ; 25(1)2022 Dec 28.
Article in English | MEDLINE | ID: mdl-36673197

ABSTRACT

Mixture cure rate models have been developed to analyze failure time data where a proportion never fails. For such data, standard survival models are usually not appropriate because they do not account for the possibility of non-failure. In this context, mixture cure rate models assume that the studied population is a mixture of susceptible subjects who may experience the event of interest and non-susceptible subjects that will never experience it. More specifically, mixture cure rate models are a class of survival time models in which the probability of an eventual failure is less than one and both the probability of eventual failure and the timing of failure depend (separately) on certain individual characteristics. In this paper, we propose a Bayesian approach to estimate parametric mixture cure rate models with covariates. The probability of eventual failure is estimated using a binary regression model, and the timing of failure is determined using a Weibull distribution. Inference for these models is attained using Markov Chain Monte Carlo methods under the proposed Bayesian framework. Finally, we illustrate the method using data on the return-to-prison time for a sample of prison releases of men convicted of sexual crimes against women in England and Wales and we use mixture cure rate models to investigate the risk factors for long-term and short-term survival of recidivism.
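
Illustrative sketch (not taken from the paper): the population survival function of a mixture cure rate model combines a never-failing fraction with a Weibull survival curve for susceptible subjects. The code assumes a logistic link for the binary regression on the probability of eventual failure, which is one common choice that the abstract does not specify. All function and parameter names are hypothetical.

```python
import math


def failure_probability(x, beta):
    """Probability of eventual failure from a logistic regression (assumed link)."""
    eta = sum(xj * bj for xj, bj in zip(x, beta))
    return 1.0 / (1.0 + math.exp(-eta))


def weibull_survival(t, shape, scale):
    """Survival function of the Weibull timing model for susceptible subjects."""
    return math.exp(-((t / scale) ** shape))


def population_survival(t, x, beta, shape, scale):
    """S_pop(t) = (1 - p) + p * S(t), where p is the eventual-failure probability."""
    p = failure_probability(x, beta)
    return (1.0 - p) + p * weibull_survival(t, shape, scale)
```

As `t` grows, `population_survival` levels off at `1 - p` instead of dropping to zero. That plateau is the non-susceptible fraction that standard survival models cannot represent.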

10.
J Clin Monit Comput ; 36(3): 687-702, 2022 06.
Article in English | MEDLINE | ID: mdl-33907937

ABSTRACT

This work proposes a model for operating room times based on a Bayesian hierarchical structure. Specifically, a Bayesian generalized linear mixed model with an additional hierarchical level on the random effects is employed. This configuration allows the estimation of operating room times (ORT) with few or no historical observations, without requiring a prior surgeon's estimate. In addition to the widely used lognormal distribution, the gamma distribution is also studied as a model for the operating room times. For the scale parameters related to the random effects (surgeon and surgical group), which are important quantities in this type of modeling, different kinds of prior distributions such as Half-Cauchy, Sbeta2, and uniform are studied. A Bayesian version of the classical ANOVA is implemented to identify relevant predictors for the operating room times. We find that lognormal models outperform the gamma models in estimating upper prediction bounds (UB). In particular, the best ORT predictions for cases with few or no historical data (i.e., between 0 and 3 historical cases) are obtained with the [Formula: see text], SBeta2 model, with a deviation of less than 1% with respect to the nominal coverage of the upper bound predictions UB80% and UB90% and an average absolute percentage error of 38.5% in the point estimate.


Subjects
Operating Rooms , Bayes Theorem , Humans , Time Factors
11.
Stat Methods Med Res ; 30(7): 1708-1724, 2021 07.
Article in English | MEDLINE | ID: mdl-34074165

ABSTRACT

There is a well-established tradition within the statistics literature that explores different techniques for reducing the dimensionality of large feature spaces. The problem is central to machine learning and it has been largely explored under the unsupervised learning paradigm. We introduce a supervised clustering methodology that capitalizes on a Metropolis-Hastings algorithm to optimize the partition structure of a large categorical feature space tailored towards minimizing the test error of a learning algorithm. This is a general methodology that can be applied to any supervised learning problem with a large categorical feature space. We show the benefits of the algorithm by applying this methodology to the problem of risk adjustment in competitive health insurance markets. We use a large claims data set that records ICD-10 codes, a large categorical feature space. We aim at improving risk adjustment by clustering diagnostic codes into risk groups suitable for health expenditure prediction. We test the performance of our methodology against common alternatives using panel data from a representative sample of twenty-three million citizens in the Colombian healthcare system. Our methodology outperforms common alternatives, and the results suggest that it has the potential to improve risk adjustment.


Subjects
Algorithms , Machine Learning , Cluster Analysis
12.
Environ Monit Assess ; 193(6): 345, 2021 May 20.
Article in English | MEDLINE | ID: mdl-34013430

ABSTRACT

This paper presents a methodology to assess the influence of the correlation-covariance structure of measurement errors in online monitoring on the propagation of uncertainties, applied to wet-weather environmental indicators in sustainable urban drainage systems (SUDSs). The effect of auto-correlated and heteroskedastic errors in measured time-series on the estimated probability density function (PDF) of different environmental indicators is analyzed for a wide variety of possible error structures in the data. For this purpose, multiple correlation-covariance structures are randomly generated by exploring the parametric space of a linear exponent autoregressive (LEAR) model, employing a Bayesian-based Markov Chain Monte Carlo sampling technique. Significant-difference tests are proposed to identify the parameters of the correlation-covariance error model that are most correlated with statistics of the environmental indicator PDFs. The method is applied to total suspended solids (TSS) and chemical oxygen demand (COD) time-series recorded during 13 rainfall events at the inlet and outlet of a SUDS train (stormwater settling tank-horizontal constructed wetland). In this case, results showed that the total error in the estimation of the analyzed environmental indicators is mostly explained by standard uncertainties (flattening of the PDFs) rather than bias contributions (displacement of the PDFs). The correlation-covariance model parameters related to the temporal delimitation of hydrographs/pollutographs and the intensity of the autocorrelation were shown to have the strongest influence on the propagation of measurement errors (flattening/displacement of the PDFs).


Subjects
Rain , Water Movements , Bayes Theorem , Environmental Indicators , Environmental Monitoring
13.
J Appl Stat ; 48(16): 3048-3059, 2021.
Article in English | MEDLINE | ID: mdl-35707258

ABSTRACT

Extreme Value Theory (EVT) aims to study the tails of probability distributions in order to measure and quantify extreme events of maximum and minimum. In river flow data, an extreme level of a river may be related to the level of a neighboring river that flows into it. In this type of data, it is very common for flooding at a location to have been caused by a very large flow from a tributary river that is tens or hundreds of kilometers from this location. In this sense, an interesting approach is to consider a conditional model for the estimation of a multivariate model. Inspired by this idea, we propose a Bayesian model to describe the dependence of exceedances between rivers, where we consider a conditionally independent structure. In this model, the dependence between rivers is captured by modeling the marginal excess of one river as a consequence of linear functions of the other rivers. The results showed a strong and positive connection between the excesses in one river and the excesses of the other rivers.

14.
Biology (Basel) ; 9(8)2020 Aug 12.
Article in English | MEDLINE | ID: mdl-32806613

ABSTRACT

A SIRU-type epidemic model is employed to predict the evolution of the COVID-19 epidemic in Brazil and to analyze, through simulation, the influence of public health measures on the control of this infectious disease. The proposed model allows for a time-variable functional form of both the transmission rate and the fraction of asymptomatic infectious individuals that become reported symptomatic individuals, to reflect public health interventions aimed at epidemic control. An exponential analytical behavior for the evolution of the accumulated reported cases is assumed at the onset of the epidemic, for explicitly estimating initial conditions, while a Bayesian inference approach is adopted for the estimation of parameters by employing the direct problem model with the data from the first phase of the epidemic evolution, represented by the time series for the reported cases of infected individuals. The evolution of the COVID-19 epidemic in China is considered for validation purposes, by taking the first part of the dataset of accumulated reported infectious individuals to estimate the related parameters, and retaining the rest of the evolution data for direct comparison with the predicted results. Then, the available data on reported cases in Brazil from 15 February until 29 March are used for estimating parameters and then predicting the first phase of the epidemic evolution from these initial conditions. The data for the reported cases in Brazil from 30 March until 23 April are reserved for validation of the model. Then, public health interventions are simulated, aimed at evaluating the effects on the spread of the disease, by acting on both the transmission rate and the fraction of the total number of the symptomatic infectious individuals, considering time-variable exponential behaviors for these two parameters. This first constructed model provides fairly accurate predictions up to day 65, with a relative deviation below 5%, after which the data start detaching from the theoretical curve.
From the public health intervention measures simulated through five different scenarios, it was observed that a combination of careful control of the social distancing relaxation and improved sanitary habits, together with more intensive testing for isolation of symptomatic cases, is essential to achieve the overall control of the disease and avoid a second, stricter social distancing intervention. Finally, the full dataset available by the completion of the present work is employed in redefining the model to yield updated estimates of the epidemic evolution.

15.
Heliyon ; 6(6): e03961, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32551374

ABSTRACT

In time-to-event studies, a fraction of individuals who are not expected to experience the event of interest is commonly present; these individuals, who are immune to the event or cured of the disease during the study, are known as long-term survivors. In addition, in many studies two lifetimes associated with the same individual are observed, and in some cases there exists a dependence structure between them. In these situations, the usual existing lifetime distributions are not appropriate to model data sets with long-term survivors and dependent bivariate lifetimes. In this study, we propose a bivariate model based on a standard Weibull distribution with a dependence structure based on fifteen different copula functions. We assumed the Weibull distribution due to its wide use in survival data analysis and its greater flexibility and simplicity, but the presented methods can be adapted to other continuous survival distributions. Three examples considering real data sets are introduced to illustrate the proposed methodology. A Bayesian approach is adopted for inference on the parameters of the model, where the posterior summaries of interest are obtained using Markov Chain Monte Carlo simulation methods and the OpenBUGS software. For the data analysis considering different real data sets, fifteen different copula models were assumed, from which it was possible to find models with a satisfactory fit for the bivariate lifetimes in the presence of long-term survivors.

16.
Biometrics ; 76(4): 1297-1309, 2020 12.
Article in English | MEDLINE | ID: mdl-31994171

ABSTRACT

Semi-competing risks data include the time to a nonterminating event and the time to a terminating event, while competing risks data include the time to more than one terminating event. Our work is motivated by a prostate cancer study, which has one nonterminating event and two terminating events with both semi-competing risks and competing risks present as well as two censoring times. In this paper, we propose a new multi-risks survival (MRS) model for this type of data. In addition, the proposed MRS model can accommodate noninformative right-censoring times for nonterminating and terminating events. Properties of the proposed MRS model are examined in detail. Theoretical and empirical results show that the estimates of the cumulative incidence function for a nonterminating event may be biased if the information on a terminating event is ignored. A Markov chain Monte Carlo sampling algorithm is also developed. Our methodology is further assessed using simulations and also an analysis of the real data from a prostate cancer study. As a result, a prostate-specific antigen velocity greater than 2.0 ng/mL per year and higher biopsy Gleason scores are positively associated with a shorter time to death due to prostate cancer.


Subjects
Algorithms , Bayes Theorem , Humans , Incidence , Male , Markov Chains , Survival Analysis
17.
Stat Methods Med Res ; 29(5): 1386-1402, 2020 05.
Article in English | MEDLINE | ID: mdl-31296119

ABSTRACT

We propose a Bayesian analysis of pseudo-compositional data in the presence of a latent factor, assuming a spatial structure. This development was motivated by a dataset containing information on the number of newborns of primiparous mothers living in each of the microregions of the state of São Paulo, Brazil, in 2015, stratified by the age of the mothers (15-18, 19-29 and 30 years or more). Considering that data on newborns are not stochastically distributed among the three age groups, but are explained in relation to the structure of the female population, we adopted the expression "pseudo-compositional data" to refer to this data structure. The hypothesis of interest establishes that the age at first pregnancy is associated with the economic conditions of the geographic area where the mother lives. The incidence of poverty was included as an independent variable. Additive log-ratio (alr) and isometric log-ratio (ilr) transformations were considered, as is usually done in the analysis of compositional data. The model included a random effect related to the spatial effect, assumed to have a conditional autoregressive structure. A Bayesian Markov Chain Monte Carlo (MCMC) simulation procedure was used to get the posterior summaries of interest. The model based on the ilr transformation fitted the data well, showing that in the microregions with the highest incidence of poverty, there are higher proportions of women who have their first child in adolescence, while in the microregions with the lowest incidence of poverty, there are higher proportions of women who have their first child after the age of 30 years. From these results it is possible to conclude that this Bayesian approach was very useful in the estimation of the parameters of the proposed model. The proposed method should have a broad application to other problems involving pseudo-compositional data.


Subjects
Mothers , Poverty , Child , Pregnancy , Adolescent , Humans , Infant, Newborn , Female , Adult , Bayes Theorem , Brazil/epidemiology , Computer Simulation , Monte Carlo Method , Markov Chains
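
Illustrative sketch (not taken from the paper): the additive log-ratio (alr) and isometric log-ratio (ilr) transformations named in the abstract of item 17 can be written in a few lines. The alr below uses the last part as the reference, and the ilr uses one standard orthonormal basis (often called pivot coordinates). Both choices, and all function names, are assumptions made for illustration. The abstract does not say which reference part or basis the authors used.

```python
import math


def alr(x):
    """Additive log-ratio: log of each part relative to the last part."""
    return [math.log(xi / x[-1]) for xi in x[:-1]]


def clr(x):
    """Centered log-ratio: log parts minus their mean (used here as a check)."""
    logs = [math.log(xi) for xi in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def ilr(x):
    """Isometric log-ratio in pivot coordinates (one possible orthonormal basis)."""
    logs = [math.log(xi) for xi in x]
    coords = []
    for i in range(1, len(x)):
        mean_first = sum(logs[:i]) / i  # log geometric mean of the first i parts
        coords.append(math.sqrt(i / (i + 1)) * (mean_first - logs[i]))
    return coords
```

For a three-part composition such as made-up age-group shares `[0.2, 0.3, 0.5]`, both transformations return two unconstrained real coordinates that a regression model with spatial random effects can take as responses. The ilr coordinates also preserve distances, in that their Euclidean norm equals that of the clr vector.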
18.
Entropy (Basel) ; 20(9)2018 Aug 27.
Article in English | MEDLINE | ID: mdl-33265731

ABSTRACT

In this paper, we study the performance of Bayesian computational methods to estimate the parameters of a bivariate survival model based on the Ali-Mikhail-Haq copula with marginal distributions given by Weibull distributions. The estimation procedure was based on Markov chain Monte Carlo (MCMC) algorithms. We present three versions of the Metropolis-Hastings algorithm: Independent Metropolis-Hastings (IMH), Random Walk Metropolis (RWM) and Metropolis-Hastings with a natural-candidate generating density (MH). Since the creation of a good candidate generating density in IMH and RWM may be difficult, we also describe how to update a parameter of interest using the slice sampling (SS) method. A simulation study was carried out to compare the performances of the IMH, RWM and SS. A comparison was made using the sample root mean square error as an indicator of performance. Results obtained from the simulations show that the SS algorithm is an effective alternative to the IMH and RWM methods when simulating values from the posterior distribution, especially for small sample sizes. We also applied these methods to a real data set.
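
Illustrative sketch (not taken from the paper): the two samplers below show, for a one-dimensional target, how a Random Walk Metropolis update differs from a slice sampling update. The slice sampler follows the usual stepping-out and shrinkage scheme and needs only an initial interval width rather than a tuned candidate density. The standard normal target, the tuning values and all function names are assumptions for illustration. The copula-based posterior of the paper is not implemented.

```python
import math
import random


def std_normal_logpdf(x):
    """Toy target: unnormalized log density of a standard normal."""
    return -0.5 * x * x


def random_walk_metropolis(log_target, x0, n, step=1.0, seed=0):
    """Random Walk Metropolis with Gaussian proposals of standard deviation step."""
    rng = random.Random(seed)
    x, lp = x0, log_target(x0)
    draws = []
    for _ in range(n):
        cand = x + rng.gauss(0.0, step)
        lp_cand = log_target(cand)
        # Accept with probability min(1, target(cand) / target(x)).
        if math.log(1.0 - rng.random()) < lp_cand - lp:
            x, lp = cand, lp_cand
        draws.append(x)
    return draws


def slice_sampler(log_target, x0, n, width=1.0, seed=0):
    """Univariate slice sampler with stepping-out and shrinkage."""
    rng = random.Random(seed)
    x = x0
    draws = []
    for _ in range(n):
        # Draw the slice level uniformly under the density at the current point.
        log_y = log_target(x) + math.log(1.0 - rng.random())
        left = x - width * rng.random()
        right = left + width
        while log_target(left) > log_y:  # step out to the left
            left -= width
        while log_target(right) > log_y:  # step out to the right
            right += width
        while True:  # sample from the interval, shrinking on rejection
            cand = rng.uniform(left, right)
            if log_target(cand) > log_y:
                x = cand
                break
            if cand < x:
                left = cand
            else:
                right = cand
        draws.append(x)
    return draws
```

Calling either function with `std_normal_logpdf`, a starting value and a number of iterations returns a list of correlated draws whose sample mean and variance can be compared, in the spirit of the simulation study described in the abstract.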

19.
Article in English | MEDLINE | ID: mdl-28684720

ABSTRACT

We implemented a spatial model for analysing PM10 maxima across the Mexico City metropolitan area during the period 1995-2016. We assumed that these maxima follow a non-identical generalized extreme value (GEV) distribution and modeled the trend by introducing multivariate smoothing spline functions into the probability GEV distribution. A flexible, three-stage hierarchical Bayesian approach was developed to analyse the distribution of the PM10 maxima in space and time. We evaluated the statistical model's performance by using a simulation study. The results showed strong evidence of a positive correlation between the PM10 maxima and the longitude and latitude. The relationship between time and the PM10 maxima was negative, indicating a decreasing trend over time. Finally, a high risk of PM10 maxima presenting levels above 1000 µg/m³ (return period: 25 yr) was observed in the northwestern region of the study area.


Subjects
Air Pollutants/analysis , Environmental Monitoring/statistics & numerical data , Models, Statistical , Particulate Matter/analysis , Air Pollution/analysis , Bayes Theorem , Cities , Mexico , Spatial Analysis
20.
J Multivar Anal ; 143: 94-106, 2016 Jan.
Article in English | MEDLINE | ID: mdl-27274601

ABSTRACT

Joint models for a wide class of response variables and longitudinal measurements consist of a mixed-effects model to fit longitudinal trajectories whose random effects enter as covariates in a generalized linear model for the primary response. They provide a useful way to assess association between these two kinds of data, which in clinical studies are often collected jointly on a series of individuals, and may help in understanding, for instance, the mechanisms of recovery from a certain disease or the efficacy of a given therapy. When a nonlinear mixed-effects model is used to fit the longitudinal trajectories, the existing estimation strategies based on likelihood approximations have been shown to exhibit some computational efficiency problems (De la Cruz et al., 2011). In this article we consider a Bayesian estimation procedure for the joint model with a nonlinear mixed-effects model for the longitudinal data and a generalized linear model for the primary response. The proposed prior structure allows for the implementation of an MCMC sampler. Moreover, we consider that the errors in the longitudinal model may be correlated. We apply our method to the analysis of hormone levels measured at the early stages of pregnancy that can be used to predict normal versus abnormal pregnancy outcomes. We also conduct a simulation study to assess the importance of modelling correlated errors and quantify the consequences of model misspecification.
