texto de discussão nº 43 - IBRE

TEXTO DE DISCUSSÃO Nº 43
APPLYING THE BOOTSTRAP
TECHNIQUES IN DETECTING
TURNING POINTS: A STUDY OF
CONSUMER SENTIMENT SURVEY
Pedro Guilherme Costa Ferreira
José Lisboa Gondin Junior
Viviane Seda Bittencourt
2014
Abstract: The purpose of this study is to improve the ability of the Consumer
Confidence Index (CCI) to detect turning points by narrowing its statistical
confidence interval, applying the Bootstrap Technique to the Consumer
Survey of the Getúlio Vargas Foundation (IBRE/FGV). Confidence Indicators
are estimates that reflect not only macroeconomic conditions but also
psychological factors, which cannot be captured by traditional
economic indicators. The results indicate that the ability to detect turning
points improves considerably: the share of monthly changes in the CCI that
are statistically significant rises from 41% to 68% when the Theoretical
Confidence Interval is replaced by the Bootstrap Confidence Interval. This
result, besides making the survey more dynamic, allows the indicator to
detect Turning Points more quickly. An example of the effectiveness of the
new methodology is July 2009, when the new methodology indicates a
significant monthly change in the CCI, whereas the current methodology
indicates a significant monthly change only a few months later.
Keywords: Consumer Sentiment; Leading Indicators; Survey Methodology;
Confidence Interval; Bootstrap; Brazil
JEL Classification: C42, C82, D84, E27, E32
1. Introduction
According to (Issler, Notini, & Rodrigues, 2009), every society has an interest
in knowing its Business Cycle, that is, which state the economy is in
(expansion or recession). However, both the Business Cycle and
Economic Sentiment are unobserved variables, and there is no consensus on
how to estimate these latent variables. The impossibility of estimating the
Business Cycle and Economic Sentiment directly has led to the construction of
proxies, which can be used in real time and/or for forecasting.
Consumer Surveys have been conducted in at least forty-five countries
(Curtin, 2007). Monitoring consumer sentiment aims to produce
information on consumers' decisions about spending and future saving. These,
in turn, are useful indicators for anticipating the short-term tendency of the
economy. The Consumer Survey, moreover, aims to generate information that
reflects current macroeconomic conditions and to extract psychological
information not captured by traditional economic indicators, thus
contributing to the improvement of economic forecasting models.
In market economies, consumer spending represents two thirds of the
whole economy. Consequently, small changes in the composition of
household spending may have a large economic impact (Curtin, 2007).
Moreover, consumers' ability to predict cyclical economic changes is, in
general, likely to coincide with that of other economic variables during
periods of stable economic growth, while the importance of gut feelings
increases near turning points or as a result of non-economic shocks
(Parigi & Golinelli, 2004).
In order to synthesize the results of each survey, indicators known as
Confidence Indicators have been developed. These indicators are
endogenous variables that reflect economic activity and are capable of
quantifying psychological factors not captured by other economic
variables. Using these variables in economic and statistical
models can improve the analysis of economic phenomena by taking into account
unobserved variables such as optimism, which makes it possible to improve
short-term forecasts and the detection of possible turning points.
According to (Curtin, 2007), the small contribution of these variables has
disappointed many researchers. However, the same is true of most
other economic variables.
The aim of this article is to improve the sensitivity of the indicator. This
contribution will allow researchers to detect turning points faster than
with the traditional approach. To do this, the Bootstrap technique is applied
to the Consumer Confidence Index produced by FGV's Consumer Survey (SACE,
2013).
The proposed methodology uses the technique of (Efron, 1979) to evaluate the
estimator's variance using the data of a single sample.
The results show that the indicator's sensitivity increases considerably, from
41% with the theoretical interval to 68% with the bootstrap interval. This
result, besides making the survey more dynamic, allows the indicator to
capture turning points faster. An example of the effectiveness of the new
methodology is July 2009, when the new methodology indicates a
significant monthly change in the CCI, whereas the current methodology
indicates a significant monthly change only a few months later.
Beyond this introduction, the article is organized as follows. Section 2
presents a brief discussion of the Bootstrap technique, the Consumer
Confidence Index, its indicators and their relationship with macroeconomic
variables, and the proposed model. Section 3 presents the empirical results,
and section 4 presents the final conclusions.
2. The Bootstrap technique, the Consumer Survey and the Proposed
Model
2.1. The Bootstrap Technique
The Bootstrap Technique, introduced by (Efron, 1979), is a
nonparametric, computer-intensive statistical method. It enables the analyst
to evaluate the variability of estimators based on the data of a single
sample.
This technique is suited to problems in which conventional statistical
techniques are difficult to apply. In most cases it offers advantages
with both large and small samples, since it provides results close to those
obtained by asymptotic methods in large samples and better results than
those methods in small samples.
In practical terms, the technique consists of drawing randomly, with
replacement, from the original sample to generate a sample of the same size
and stratum structure, called a Bootstrap Sample. A suitable number of
bootstrap samples is drawn in order to obtain a Bootstrap Distribution of
the statistic under study. The set of values obtained by
bootstrapping is thus an estimate of the true sampling distribution of the
statistic. As shown in (Efron, 1992), the Bootstrap Distribution converges to
the real sampling distribution when the number of bootstrap samples tends to
infinity.
Let $X = (x_1, x_2, \ldots, x_{n-1}, x_n)$ be the original sample and the
integer $n$ the length of $X$. Assume that $X$ is generated by an unknown
probabilistic model described by its cumulative distribution function $F$, and
let $\hat{\theta} = S(X)$ be a statistic.
Let $X_i^*$, $i = 1, 2, \ldots, B$, be the $i$-th Bootstrap Sample of length
$n$ obtained from the sample $X$. For each Bootstrap Sample $X_i^*$ there is a
corresponding statistic $\hat{\theta}_i^*$, i.e., $\hat{\theta}_i^* = S(X_i^*)$.
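The mean, variance and standard error of the Bootstrapped estimator of
$\theta$ may be defined by

$$\bar{\theta}^* = \frac{1}{B} \sum_{i=1}^{B} \hat{\theta}_i^* \tag{2.1}$$

$$\mathrm{Var}(\hat{\theta}_i^*) = \frac{\sum_{i=1}^{B} \left(\hat{\theta}_i^* - \bar{\theta}^*\right)^2}{B - 1} \tag{2.2}$$

$$SE_{boot} = \sqrt{\mathrm{Var}(\hat{\theta}_i^*)} \tag{2.3}$$

respectively.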
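The quantities in (2.1) to (2.3) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `bootstrap_summary`, the seed, and the six example answers are assumptions chosen for illustration, and the statistic used here is simply the sample mean.

```python
import random
import statistics


def bootstrap_summary(sample, statistic, n_boot=1000, seed=0):
    """Return (mean, variance, standard error) of the bootstrap replications."""
    rng = random.Random(seed)
    n = len(sample)
    # Each replication draws n observations with replacement and
    # recomputes the statistic on the resampled data.
    reps = [statistic(rng.choices(sample, k=n)) for _ in range(n_boot)]
    mean_star = statistics.fmean(reps)    # equation (2.1)
    var_star = statistics.variance(reps)  # equation (2.2), divisor B - 1
    se_boot = var_star ** 0.5             # equation (2.3)
    return mean_star, var_star, se_boot


# Hypothetical answers on the 1-5 scale, used only as toy data.
answers = [2, 4, 3, 5, 3, 1]
mean_star, var_star, se_boot = bootstrap_summary(answers, statistics.fmean)
```

Fixing the seed makes the replications reproducible; a different seed gives slightly different values, which is the Monte Carlo part of the uncertainty.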
In (Efron, 1992), it is shown that

$$\mathrm{Var}(SE_{boot}) \approx \frac{C_1}{n^2} + \frac{C_2}{nB} \tag{2.4}$$

where $C_1$ and $C_2$ are constants which depend on the distribution $F$, but
not on $n$ or $B$.
Hence, the uncertainty associated with the Bootstrap estimator ultimately
depends on the size of the original sample. In other words, there is no
guarantee that the Bootstrapped Estimator converges to the true estimator
when the number of Bootstrap Samples tends to infinity. However, a
good estimate of the confidence interval is obtained.
To compute the Bootstrap confidence interval, the Percentile Method was
used. That is, suppose one settles for 1000 bootstrapped replications of
$\hat{\theta}$, denoted by $\hat{\theta}_1^*, \hat{\theta}_2^*, \ldots,
\hat{\theta}_{1000}^*$. After ranking from bottom to top, let us denote these
bootstrap values as $\hat{\theta}_{(1)}^*, \hat{\theta}_{(2)}^*, \ldots,
\hat{\theta}_{(1000)}^*$. Then the bootstrapped percentile
confidence interval at the 95% level of confidence would be
$[\hat{\theta}_{(25)}^*, \hat{\theta}_{(975)}^*]$. Turning to
the theoretical aspects of this method, it should be pointed out that the
method requires the symmetry of the sampling distribution of $\hat{\theta}$
around $\theta$
(Singh & Xie, 2008); (Babu & Singh, 1983); (Beran, 1990).
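As an illustration of the Percentile Method, the following Python sketch picks the order statistics that bound the interval. The function name `percentile_interval` and the rounding rule used to turn a confidence level into ranks are assumptions made for this example; with 1000 replications and a 95% level the rule selects ranks 25 and 975, as in the text.

```python
def percentile_interval(replications, level=0.95):
    """Percentile bootstrap interval from the ordered replications."""
    ordered = sorted(replications)
    b = len(ordered)
    alpha = 1.0 - level
    # 1-based ranks of the lower and upper order statistics.
    lower_rank = max(1, round(b * alpha / 2))
    upper_rank = min(b, round(b * (1 - alpha / 2)))
    return ordered[lower_rank - 1], ordered[upper_rank - 1]


# Toy replications 1, 2, ..., 1000: each value equals its own rank,
# so the returned bounds show directly which ranks were selected.
low, high = percentile_interval(range(1, 1001))
```

In practice the replications would be the bootstrapped values of the statistic rather than consecutive integers.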
2.2. Brief explanation of the Consumer Survey methodology
The Consumer Survey is a monthly survey that aims to generate
indicators on topics such as the general economic situation. The questions
may be classified into: (i) assessments of the situation at the time of the
survey, and (ii) forecasts for the next six months. Each question offers
response options on a comparative scale, for instance:
5 – much better; 4 – better; 3 – the same; 2 – worse; 1 –
much worse (SACE, 2013).
The Survey is conducted in the seven Brazilian state capitals with the
largest GDP (Belo Horizonte - BH, Brasília - BS, Porto Alegre - PO, Recife - Re, Salvador - Sa, Rio de Janeiro - RJ, São Paulo - SP). The sample is
stratified by income level (I1 – less than US$ 897.00; I2 – between US$
897.01 and US$ 2,051.00; I3 – between US$ 2,051.01 and US$ 4,103.00; I4
– more than US$ 4,103.00¹) and by region of interest (capitals), with each
stratum weighted by its share of household consumption.
In order to calculate the Consumer Confidence Index (CCI), the statistic of
interest in this article, a few remarks are necessary.
The CCI is the arithmetic mean of five indicators (defined below)
calculated from five questions of the Consumer Survey: Local
Economic Situation at the moment – LESM; Households Financial Situation at
the moment – HFSM; Local Economic Situation in the next six months –
LESF; Households Financial Situation in the next six months – HFSF; and
Intention to Purchase Durable Goods in the next six months – IPDGF. The
first two questions are assessments of the present economic moment and the
last three are assessments of consumers' expectations for the
economy in the future (Diagram 2.1) (SACE, 2013).
¹ R$/US$ = 2.34 (12/19/2013)
Diagram 2.1 – Questions that make up the Consumer Confidence
Index (CCI)
Source: Authors
The sample has twenty-eight strata. In each of the seven Brazilian state
capitals (São Paulo, Belo Horizonte, Brasília, Rio de Janeiro, Salvador, Porto
Alegre and Recife), four monthly household income levels are considered (I1 –
less than US$ 897.00; I2 – between US$ 897.01 and US$ 2,051.00; I3 – between US$ 2,051.01 and US$ 4,103.00; I4 – more than US$ 4,103.00¹), as
illustrated in Diagram 2.2.
Diagram 2.2 – Organization chart (part of the sample design for one Indicator)
Source: Authors
The strata are weighted by their share of Brazilian household
consumption. Thus, for each question one Indicator is calculated by adding
100 to the difference between the aggregate shares of favorable and
unfavorable answers, i.e.:
Indicator = 100 + favorable – unfavorable
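The indicator formula and the CCI as a mean of five indicators can be sketched in Python as follows. This is only an illustration under stated assumptions: answers 4 and 5 are treated as favorable and answers 1 and 2 as unfavorable, shares are expressed in percent, and the optional `weights` argument stands in for the stratum weights. The function names are hypothetical.

```python
def indicator(answers, weights=None):
    """100 + weighted % favorable - weighted % unfavorable for one question."""
    if weights is None:
        weights = [1.0] * len(answers)
    total = sum(weights)
    # Assumption: 4 and 5 are favorable answers, 1 and 2 are unfavorable.
    favorable = sum(w for a, w in zip(answers, weights) if a >= 4)
    unfavorable = sum(w for a, w in zip(answers, weights) if a <= 2)
    return 100.0 + 100.0 * (favorable - unfavorable) / total


def cci(indicators):
    """Arithmetic mean of the question indicators."""
    return sum(indicators) / len(indicators)


# Toy data: two favorable and two unfavorable answers cancel out.
value = indicator([2, 4, 3, 5, 3, 1])
```

Under these assumptions an indicator ranges from 0 (all answers unfavorable) to 200 (all favorable), with 100 as the neutral point.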
The Consumer Survey aims to estimate, with a low sampling error and high
probabilistic reliability, the proportions of responses to multiple-choice
questions (SACE, 2013). In this situation, the sample size is determined so as
to estimate the parameters of a random variable that has a Multinomial
distribution, where the sample size solves equation (2.5).

$$P\left(\left|P_i - \hat{P}_i\right| \le \text{error}\right) = 1 - \alpha, \quad i = 1, 2, \ldots, k. \tag{2.5}$$
where:
$P_i$: proportion being estimated;
$\hat{P}_i$: estimator of the proportion $P_i$ ($\hat{P}_i = n_i / n$, where
$n_i$ is the number of responses to the alternative $i$ and $n$ is the sample
size);
error: maximum error of the estimate resulting from the use of a sample
(referred to as sampling error and usually set to 0.02 or 2%);
$1 - \alpha$: level of probabilistic reliability of the sample (usually 95%).
The Consumer Survey, conducted monthly by IBRE/FGV, currently has a
sample size of two thousand Brazilian consumers. Following international
standards, for that sample size and a confidence level of 95% the absolute
sampling error is 2.19%. See Table 2 in (SACE, 2013).
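The 2.19% figure can be reproduced with a short Python sketch. It assumes the usual normal approximation for a single proportion at maximum variance (p = 0.5); whether SACE (2013) computes the error exactly this way is an assumption, and the function name `max_sampling_error` is hypothetical.

```python
from statistics import NormalDist


def max_sampling_error(n, confidence=0.95, p=0.5):
    """Half-width of the normal-approximation interval for a proportion."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # p = 0.5 maximizes p * (1 - p), giving the worst-case variance.
    return z * (p * (1 - p) / n) ** 0.5


error = max_sampling_error(2000)  # about 0.0219, i.e. 2.19%
```

Because p = 0.5 is the worst case, this error is an upper bound that is the same for every month, regardless of how the answers are actually distributed.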
2.3. The Proposed Method
The main goal of this article is to introduce a method that goes beyond
the maximum-variance assumption, under which the maximum absolute error of
the variable of interest is fixed at 2.19% for all months. In this article,
the sampling error is instead estimated for each month of the whole time
series of the CCI. The Proposed Method consists of a Bootstrap resampling of
each stratum of the monthly sample (e.g. I1 in the LESF question from Rio de
Janeiro) in order to create Bootstrap Samples. The sampling design described
in (SACE, 2013) and the careful data collection process conducted by IBRE/FGV
in the seven Brazilian state capitals guarantee that this bootstrap sample is
a good approximation of a true random sample.
Given a monthly sample, the proposed algorithm resamples within each
stratum, preserving the characteristics of the sample design. It can be
presented in three steps:
(i) For each question (e.g. LESF), select the interviewees in each
state capital (e.g. Rio de Janeiro) and income level (e.g. I3) (section
2.2); twenty-eight Answer Sets (column Answer in Table 2.1) are thus
obtained for each question.
(ii) Generate 2,500 resamples drawn uniformly at
random, with replacement, from each Answer Set. 28 ordered
lists of resamples are thus obtained.
(iii) Join the 28 strata across the lists, following that order.
2,500 bootstrap samples are thus generated for each question.
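The three steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `stratified_bootstrap`, the dictionary layout keyed by (capital, income level), and the answers of the second stratum are assumptions made for the example. It also assumes that each resample has the same size as its stratum.

```python
import random


def stratified_bootstrap(strata, n_boot=2500, seed=0):
    """Resample answers within each stratum, keeping stratum sizes fixed.

    strata maps (capital, income_level) to the list of answers to one
    question. Returns n_boot bootstrap samples with the same layout.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_boot):
        # Steps (i)-(ii): draw with replacement inside every stratum.
        # Step (iii): keep the resampled strata together as one sample.
        samples.append({key: rng.choices(answers, k=len(answers))
                        for key, answers in strata.items()})
    return samples


# Two toy strata; the second one is made up for illustration.
strata = {("RJ", "I3"): [2, 4, 3, 5, 3, 1], ("SP", "I1"): [3, 3, 4, 2]}
boot = stratified_bootstrap(strata, n_boot=100)
```

Looping over bootstrap samples first and strata second gives the same result as building one ordered list of resamples per stratum and then joining the lists position by position.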
Table 2.1 – Example of the Answer matrix
Interviewee   Question   State Capital   Income level   Answer
1             LESF       RJ              I3             2
2             LESF       RJ              I3             4
3             LESF       RJ              I3             3
4             LESF       RJ              I3             5
5             LESF       RJ              I3             3
6             LESF       RJ              I3             1
Source: Authors
Following those steps, 2,500 bootstrap samples are generated from a single
monthly sample. A CCI* is then calculated for each bootstrap sample as
described above. The histogram of CCI* is an empirical distribution
from which a 95% confidence interval is taken (Figure 2.1).
Figure 2.1 – Empirical distribution of CCI* for 2,500 bootstrap samples
from the monthly sample of September 2013.
Source: Authors
3. Results
As explained in section 2.3, the proposed methodology bootstraps the
answers within the answer blocks (capital and income level),
aggregates the statistics calculated in each block, and estimates a
distribution for the CCI.
Although this methodology is in line with best statistical practice
for bootstrap resampling, the authors were concerned to test
the robustness of the method before analyzing the results
themselves. To that end, the proposed method was run several times
with different numbers of resamples (2,000, 6,000 and 10,000), and the CCI and
its confidence interval were calculated for one monthly sample. As
Table 3.1 shows, in all cases the mean, the standard
error and the confidence interval converge to the same result to one
decimal place. The same analysis was carried out for several other monthly
samples, with satisfactory results. Thus, the algorithm
is robust with 2,500 resamples.
Table 3.1 – Results of resampling the microdata, September 2013
Resamples   Mean       95% confidence interval   Standard error
2,000       113.1764   [111.7539, 114.5117]      0.6979
6,000       113.2074   [111.8521, 114.5343]      0.6842
10,000      113.1952   [111.8663, 114.5619]      0.6869
Source: Authors
Regarding the overall results for the historical series from September 2005 to
September 2013, the maximum standard error was 0.9, with a mean
of 0.65, and the maximum confidence interval was 1.68 percentage points,
with a mean of 1.14 percentage points.
Another result that shows the robustness and suitability of the method for
the problem at hand is how the width of the confidence interval of the
statistic of interest varies across the survey months. As can be seen, the
confidence interval of the parameter is wider in 2008/2009 (Chart 3.1), years
of crisis, and narrower in 2011/2012 (Chart 3.2), a period of relative
calm.
Finally, the skewness of the bootstrap sample of CCIs was analyzed and the
symmetry of the distribution of the CCIs was confirmed, a result that
validates the use of the percentile method to compute the bootstrap
confidence interval, as highlighted by (Hall, 1988).
Chart 3.1 – Boxplot – bootstrapped CCI values in the period around
the 2008 economic crisis
Source: Authors
Chart 3.2 – Boxplot – bootstrapped CCI values in a period of
relative economic stability
Source: Authors
Turning to the results, as can be seen in Chart 3.3, when the
confidence interval described in section 2.2 is used, the sensitivity
of the survey is low and the survey is often late in confirming
turning points in the economy. That is, in some cases, as highlighted in the
chart, there is a change in consumer expectations but, in terms of
statistical significance, that change can only be confirmed after some
time.
In Chart 3.3, for example, at points 1 and 2 the change of direction is
already significant in the month in which it occurs: the
trends change in Jan-06 and Jun-08 and the CCI signals this in the same month.
On the other hand, there are long periods with no significant changes in the
index, notably Feb-09 to Oct-09, and months in which the turning point is not
detected in the month in which it occurs, notably points 3, 4 and 5, where
the trend changes in Jul-06, Nov-06 and Aug-07 and the CCI signals this only
in Oct-06, Apr-07 and Dec-07, respectively.
When the proposed method is used (historical series, Chart 3.4), the
sensitivity of the indicator, marked by the vertical orange line,
increases considerably, from 41% with the theoretical
interval to 68% with the bootstrap interval. This result, besides making the
survey more dynamic, allows the indicator to capture turning points more
quickly. For example, for the cases highlighted above, the change in Jul-06
is signaled in Sep-06, that of Nov-06 is signaled in Jan-07, and the change
of Aug-07 is signaled in the same month.
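The idea of sensitivity as a share of months with a statistically significant change can be sketched in Python. The paper does not spell out the decision rule, so this sketch assumes one: a month is flagged when its index lies outside the previous month's value plus or minus that month's interval half-width. The function names and all numbers below are hypothetical.

```python
def significant_changes(index, half_widths):
    """Flag month t when index[t] is outside index[t-1] +/- half_widths[t-1]."""
    flags = []
    for t in range(1, len(index)):
        flags.append(abs(index[t] - index[t - 1]) > half_widths[t - 1])
    return flags


def share_flagged(flags):
    """Share of month-to-month changes flagged as significant."""
    return sum(flags) / len(flags)


series = [110.0, 111.5, 111.0, 108.0, 109.0]  # hypothetical index values
wide = [2.0] * len(series)                    # hypothetical fixed half-width
narrow = [1.0] * len(series)                  # hypothetical narrower half-width

share_wide = share_flagged(significant_changes(series, wide))
share_narrow = share_flagged(significant_changes(series, narrow))
```

With these toy numbers the narrower interval flags more of the same monthly changes, which is the mechanism behind the increase in sensitivity reported above.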
Chart 3.3 – CCI historical series with confidence interval assuming
maximum variance – Consumer Survey – Brazil
Source: Authors
(*) black dotted lines indicate the theoretical confidence interval;
(*) orange bars indicate the points where the change in the indicator is
statistically significant;
Chart 3.4 – CCI historical series with Bootstrap confidence interval –
Consumer Survey – Brazil
Source: Authors
(*) black dotted lines indicate the bootstrap confidence interval;
(*) orange bars indicate the points where the change in the indicator is
statistically significant;
Another interesting point to assess is Jul/2009. As can be seen in
Chart 3.3, with the current methodology there is no clear sign
that consumers are feeling the slowdown of the economy.
In Chart 3.4, however, there is a statistically significant sign of a
fall in consumer confidence, anticipating a sequence of declines in the CCI.
4. Final remarks
As shown in this article, the proposed method achieved its
objective of improving the statistical sensitivity of the indicator, making
consumers' perceptions at each point in time clearer. Evidence of this is the
result for Sep/2008, when the bootstrap methodology signaled, with 95%
confidence, that consumers were more pessimistic about the
state of the economy.
As a secondary but also important result, in the months around the crisis
(Sep-08 to Feb-09) the coefficient of variation of the
bootstrap samples is 130 percentage points higher than in calm periods
(Sep-11 to Feb-12). This result may be understood as a strong leading
indicator of crisis periods and needs further study.
Finally, the proposed methodology proved useful for
monitoring business cycles and can be used by other
surveys that aim to increase sensitivity in detecting turning
points.
5. References
Babu, G. J., & Singh, K. (1983). Inference on means using the bootstrap. Ann. Stat., 11.
Beran, R. (1990). Refining bootstrap simultaneous confidence sets. Jour. Amer. Stat. Assoc.,
pp. 417-428.
Curtin, R. (2007). Consumer Sentiment Surveys: Worldwide Review and Assessment.
Journal of Business Cycle Measurement and Analysis.
Efron, B. (1979). Bootstrap methods: another look at the jackknife. Ann. Stat., 7, pp. 1-26.
Efron, B. (1992). Jackknife-after-bootstrap standard errors and influence functions (with
discussion). J. R. Stat. Soc. B, 54, pp. 463-479.
Hall, P. (1988). Theoretical comparison of bootstrap confidence intervals. Ann. Stat., 16, pp.
927-953.
Issler, J. V., Notini, H. H., & Rodrigues, C. F. (2009, June). Um Indicador Coincidente e
Antecedente da Atividade Econômica Brasileira. Ensaios Econômicos.
Parigi, G., & Golinelli, R. (2004). Consumer Sentiment and Economic Activity: A Cross
Country Comparison. Journal of Business Cycle Measurement and Analysis, pp.
147-70.
SACE. (2013). Consumer Survey Methodology - Superintendence of Economic Cycles
(SACE). Retrieved December 01, 2013, from Brazilian Institute of Economics (IBRE
| FGV): http://portalibre.fgv.br
Singh, K., & Xie, M. (2008). Bootstrap: A Statistical Method. Unpublished Working Paper.
Rutgers University.
<http://www.stat.rutgers.edu/home/mxie/RCPapers/bootstrap.pdf>.
Rio de Janeiro
www.fgv.br/ibre
Rua Barão de Itambi, 60
22231-000 - Rio de Janeiro – RJ
São Paulo
Av. Paulista, 548 - 6º andar
01310-000 - São Paulo – SP