ABSTRACT
Predicting the values of a financial time series is mainly a function of its price history, which depends on several internal and external factors. With this history, it is possible to build an ε-machine for predicting the financial time series. This work proposes accounting for the influence of one financial series on another through transfer entropy when the values of the other series are known. A method is proposed that uses transfer entropy to break the ties that occur when calculating the prediction with the ε-machine. The analysis is carried out using data from six financial series: two American, the S&P 500 and the Nasdaq; two Asian, the Hang Seng and the Nikkei 225; and two European, the CAC 40 and the DAX. This work shows that it is possible to influence the prediction of the closing value of a series if the value of the influencing series is known. It also shows that the series that transfer the most information through transfer entropy are the American S&P 500 and Nasdaq, followed by the European DAX and CAC 40, and finally the Asian Nikkei 225 and Hang Seng.
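As a rough illustration of the quantity this abstract relies on, the sketch below estimates transfer entropy between two symbol sequences with a plug-in (frequency-count) estimator. It makes several assumptions that the abstract does not state: closing prices are discretized into up/down symbols, a history length of one step is used, and the names `to_symbols` and `transfer_entropy` are hypothetical. It does not reproduce the paper's ε-machine or its tie-breaking method.

```python
from collections import Counter
from math import log2


def to_symbols(closes):
    """Assumed discretization: 1 if the close rose versus the previous day, else 0."""
    return [1 if closes[i + 1] > closes[i] else 0 for i in range(len(closes) - 1)]


def transfer_entropy(source, target):
    """Plug-in estimate, in bits, of transfer entropy source -> target (history length 1)."""
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    n = len(target) - 1
    triples = Counter()      # (target[t+1], target[t], source[t])
    pairs_ts = Counter()     # (target[t], source[t])
    pairs_tt = Counter()     # (target[t+1], target[t])
    singles = Counter()      # target[t]
    for t in range(n):
        x_next, x_now, y_now = target[t + 1], target[t], source[t]
        triples[(x_next, x_now, y_now)] += 1
        pairs_ts[(x_now, y_now)] += 1
        pairs_tt[(x_next, x_now)] += 1
        singles[x_now] += 1
    te = 0.0
    for (x_next, x_now, y_now), count in triples.items():
        p_joint = count / n
        # p(x_next | x_now, y_now): prediction that also knows the source
        p_with_source = count / pairs_ts[(x_now, y_now)]
        # p(x_next | x_now): prediction from the target's own past only
        p_without_source = pairs_tt[(x_next, x_now)] / singles[x_now]
        te += p_joint * log2(p_with_source / p_without_source)
    return te
```

Calling `transfer_entropy(a, b)` and `transfer_entropy(b, a)` on two symbolized series gives the two directed values, which is what makes it possible to rank series by how much information they transfer. On short series the plug-in estimate is biased upward, so real use would need longer histories or a bias correction.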
ABSTRACT
In most big cities, public transport vehicles are enclosed and crowded spaces. They are therefore considered one of the most important triggers of COVID-19 spread. Most existing research on the mobility of people and COVID-19 spread focuses on investigating highly frequented paths by analyzing data collected from mobile devices, mainly geo-positioning records. In contrast, this paper tackles the problem by studying mass mobility. The relations between daily mobility on public transport (subway or metro) in three big cities and mortality due to COVID-19 are investigated. The data come from official sources, such as the web pages of the cities' local governments. To provide a systematic framework, we applied the IBM Foundational Methodology for Data Science to the epidemiological domain of this paper. Our analysis uses moving averages with a seven-day window to avoid bias due to weekly tendencies. Among the main findings of this work are: a) New York City and Madrid show similar distributions of the studied variables, which resemble a Gaussian bell curve, in contrast to Mexico City, and b) non-pharmaceutical interventions do not bring immediate results, and reductions in the number of deaths due to COVID-19 are observed only after a certain number of days. This paper yields partial evidence for assessing the effectiveness of public policies in mitigating the COVID-19 pandemic.
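The seven-day smoothing mentioned in the abstract can be sketched as follows. This is a minimal illustration assuming a trailing (not centered) window and a plain list of daily counts; the abstract does not say which alignment the authors used, and the function name `moving_average` is hypothetical.

```python
def moving_average(values, window=7):
    """Trailing moving average; returns one value per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just left the window
            running -= values[i - window]
        if i >= window - 1:
            averages.append(running / window)
    return averages
```

Because every window covers exactly one of each weekday, a purely weekly pattern (for example, lower ridership on weekends) averages out to a constant, which is the bias the seven-day window is meant to remove.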
Subjects
COVID-19/mortality , Transportation , Adult , COVID-19/epidemiology , Cities/epidemiology , Cities/statistics & numerical data , Data Science/methods , Epidemiological Models , Humans , Mexico/epidemiology , New York City/epidemiology , Spain/epidemiology , Transportation/methods , Transportation/statistics & numerical data
ABSTRACT
In this paper we propose a criterion to balance the processing time and the solution quality of k-means clustering algorithms when applied to instances where the number n of objects is large. Most of the known strategies aimed at improving the performance of k-means algorithms are related to the initialization or classification steps. In contrast, our criterion applies to the convergence step: the process stops whenever the number of objects that change their assigned cluster in an iteration is lower than a given threshold. Through computer experimentation with synthetic and real instances, we found that a threshold close to 0.03n involves a decrease in computing time of about a factor 4/100, yielding solutions whose quality is reduced by less than two percent. These findings suggest the usefulness of our criterion in Big Data realms.
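The stopping rule described above can be sketched on top of a standard Lloyd-style k-means loop. This is a minimal illustration, not the authors' implementation: the function name `kmeans_early_stop`, its parameters, the random initialization, and the choice to update centroids once more before stopping are all assumptions made for the example.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_early_stop(points, k, threshold_fraction=0.03, max_iter=100,
                      init=None, seed=0):
    """k-means that stops once fewer than threshold_fraction * n points change cluster.

    points: list of equal-length tuples of floats.
    init:   optional list of k starting centroids (assumed helper parameter).
    Returns (labels, centroids, iterations).
    """
    n = len(points)
    if init is not None:
        centroids = list(init)
    else:
        centroids = random.Random(seed).sample(points, k)
    labels = [-1] * n
    min_changes = threshold_fraction * n
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Classification step: assign each point to its nearest centroid
        changes = 0
        for i, p in enumerate(points):
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            if nearest != labels[i]:
                labels[i] = nearest
                changes += 1
        # Update step: recompute each non-empty cluster's centroid
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        # Convergence step: stop when too few points changed cluster
        if changes == 0 or changes < min_changes:
            break
    return labels, centroids, iterations
```

Setting `threshold_fraction=0` recovers the usual "no point changed" stopping condition, so the parameter controls how much of the final, slowly converging tail of iterations is skipped.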