Workshop on
Quarterly National Accounts

Paris-Bercy 05/12/1994 - 06/12/1994

EUROPEAN COMMISSION

THEME 2
Economy and finance

2002 EDITION

Luxembourg: Office for Official Publications of the European Communities, 2002
ISBN 92-894-3418-X
Cat. No. KS-AO-02-001-EN-N
© European Communities, 2002

A great deal of additional information on the European Union is available on the Internet.
It can be accessed through the Europa server (http://europa.eu.int).

Printed in Belgium
PRINTED ON WHITE CHLORINE-FREE PAPER
EUROSTAT - INSEE
Paris - Bercy
05/12/1994 - 06/12/1994
Workshop on
Quarterly National Accounts
Atelier sur les
Comptes Nationaux Trimestriels
Edited by
Roberto BARCELLAN and Gian Luigi MAZZI
Desktop publishing
Elisabeth ALMEIDA and Nelly DA SILVA
Special thanks
Stéphane GREGOIR for the organisation of the Workshop
Manuscript completed in April 2002
For further information, please contact
Roberto Barcellan - Gian Luigi Mazzi
Eurostat
5, rue Alphonse Weicker
L-2721 Luxembourg
Tel. (352) 4301 35802
Tel. (352) 4301 34351
Fax (352) 4301 33879
[email protected]
[email protected]
FOREWORD
Since the early 1990s, the importance of short-term information has increased considerably, in particular in the
field of quarterly national accounts.

For the monitoring of the economy and for the achievement of the objectives of the European Monetary
Union, high-quality short-term statistical instruments are needed in order to provide the Community
Institutions, Governments and Central Banks, as well as economic and social operators, with a set of
comparable and reliable data.

Aware of the importance of short-term information on national accounts, INSEE and Eurostat
organised a joint workshop in Paris-Bercy on 5 and 6 December 1994. Many specialists in the field of
national accounts, and in particular of quarterly accounts, participated in this workshop, coming not only from the
National Statistical Institutes but also from Central Banks and Universities.

The workshop was mainly dedicated to the different approaches for compiling quarterly accounts, based
on econometric models, statistical surveys or indirect methods. The issues of revisions and validations of
quarterly national accounts and of seasonal adjustment were also discussed, and one session was dedicated to
the users' point of view. Most of the papers of the workshop are presented in this publication.

The richness and quality of the contributions to the Paris-Bercy workshop constituted an excellent
basis for the Eurostat Handbook on Quarterly National Accounts, drafted in the framework of the European
System of Accounts (ESA95). This handbook provides Member States, as well as Candidate
Countries and other countries, with a valuable and harmonised approach and a set of recommendations to be
followed for the compilation of quarterly accounts.

Eurostat would like to thank INSEE for hosting this important workshop, and all the participants, especially
the authors of the papers, for their excellent contributions.
Marco DE MARCH
Head of Unit
Eurostat - Unit B2
Economic Accounts and International Markets:
Production and Analyses
SUMMARY
Foreword ....... i
Summary ....... iii
Agenda ....... v
List of authors ....... ix
Table of contents ....... xi

Section 0. Introduction ....... 1
Section 1. Revisions and validation of quarterly national accounts ....... 7
Section 2. New developments in the econometrics-based approaches ....... 63
Section 3. Survey-based approach ....... 167
Section 4. Seasonal adjustment ....... 201
Section 5. Indirect methods ....... 221
Workshop on
Quarterly National Accounts
Atelier sur les
Comptes Nationaux Trimestriels
Program
INSEE-EUROSTAT WORKSHOP
5-6 December 1994
Monday 5 December 1994
8.30-9.00 Registration
9.00-9.15 Welcome address by Y. Franchet (Director General of EUROSTAT) and P. Mazodier
(Directeur de la Direction des Etudes et des Synthèses Economiques – INSEE)
9.15-11.15
Revisions and validations of quarterly national accounts
Chairman: Heinrich Lützel (Statistisches Bundesamt)
• “Revisions to Italian Quarterly National Accounts Aggregates: Some Empirical
Results” – T. Di Fonzo (University of Brescia), S. Pisani, G. Savio (ISTAT)
• “Relations between Quarterly and Annual Accounts at INSEE” – V. Madelin
(INSEE)
• “The Use of External Data Sources to Validate National Accounts Estimates” –
K. Vernon (CSO)
• “Cyclical Patterns of the Spanish Quarterly National Accounts Series: A VAR
Analysis” – A.M. Abad Garcia, E. Martin Quilis (INE)
11.30-13.00
Users' point of view
Chairman: Marco De March (EUROSTAT)
F. Carlucci (University of Rome)
A. Gubian (DARES)
P. Cuneo (BIPE)
M. Dramais (CEE-DGII)
14.30-16.45
New developments in the econometric-based approach
Chairman: Enrico Giovannini (ISTAT)
Invited paper
• “Temporal Disaggregation of a System of Time Series when the Aggregate is
Known: Optimal vs. Adjustment Methods” – Di Fonzo (University of Brescia)
and “ECOTRIM: A Program for Temporal Disaggregation of Time Series” – R.
Barcellan (University of Padova)
• “Estimating Missing Observations in ARIMA Models with the Kalman Filter” –
V. Gomez (INE)
Contributed paper
• “Time Disaggregation of Economic Time Series: An Econometric Approach” –
C. Lupi (ISTAT), G. Parigi (Bank of Italy)
• “Note on Temporal Disaggregation with Simple Dynamic Models” – S. Gregoir
(INSEE)
17.00-18.30
Round table on the new SNA and the European coordination
Chairmen: S. Gregoir (INSEE) and G. L. Mazzi (EUROSTAT)
Tuesday 6 December 1994
8.30-10.15
Survey-based approach I
Chairman: Jean-Etienne Chapron (INSEE)
• “The United Kingdom Approach to Quarterly National Accounts” – D. Caplan,
S. Lambert (CSO)
• “Quarterly National Accounts of the Federal Statistical Office of Germany” – H.
Lützel (Statistisches Bundesamt)
• “Special Aspects of Quarterly GDP Accounts in East Germany in the Initial
Years after German Unification” – R. Hain (Statistisches Bundesamt)
10.30-12.30
Seasonal adjustment
Chairman: Francesco Carlucci (University of Rome)
• “Comparison of Different Seasonal Adjustment Methods using Quarterly Data of
National Accounts” – B. Fisher
• “Two Programs for Time Series Analysis, with an application to seasonal
adjustment” – A. Maravall (European University Institute)
• “Trend-Cycle versus Seasonal Adjustment in Quarterly National Accounts” – A.
Cristobal Cristobal, E.M. Quilis (INE)
• “Forecasting Changing Seasonal Components Using Periodic Correlation” –
P.H. Franses, M. Ooms (Erasmus University)
13.45-15.45
Survey-based approach II
Chairman: David Caplan (CSO)
• “Compilation of Quarterly National Accounts in Denmark” – T.M. Pedersen
(Statistics Denmark)
• “An Outline of the Swedish Quarterly National Accounts” – H. Runnquist, E.
Karlsson (Statistics Sweden)
• “Data Flows in Quarterly National Accounting” – N. Van Stokrom (CBS)
• “Source and Use of Funds and Investments Statements in Quarterly Financial
Accounts” – T. Grunspan (Bank of France)
16.00-18.00
Indirect methods
Chairman: Tommaso Di Fonzo (University of Brescia)
Invited paper
• “GDP-Monthly Indicator” – E. Lääkäri (Statistics Finland)
• “Indicators of Monthly National Accounts” – E. Salazar, R. Smith, S. Wright, M.
Weale (University of Cambridge, UK)
• “Quarterly Flash Estimates: A Proposal for Italy” – G. Bruno, E. Giovannini, C.
Lupi (ISTAT)
Contributed paper
• “Using Data of Qualitative Business Survey for the Estimation of Quarterly
National Accounts” – C. Coimbra (Lisboa)
• “Estimating Quarterly National Accounts for Southern Italy” – M. Carlucci, R.
Zelli (University of Rome, “La Sapienza”)
LIST OF AUTHORS
ABAD GARCIA Ana M.
BARCELLAN Roberto
BRUNO Giancarlo
CARLUCCI Margherita
COIMBRA Carlos
CRISTÓBAL CRISTÓBAL Alfredo
CUBADDA Gianluca
DI FONZO Tommaso
GIOVANNINI Enrico
GOMEZ Victor
GREGOIR Stéphane
HAIN Ralf
JANSSEN Ronald
KARLSSON Eddie
LÄÄKÄRI Erkki
LUPI Claudio
LÜTZEL Heinrich
MADELIN Virginie
MAZZI Gian Luigi
OOMENS Peter
PARIGI Giuseppe
PISANI Stefano
QUILIS Enrique Martín
RUNNQUIST Helena
SAVIO Giovanni
VAN STOKROM Nico
ZELLI Roberto
TABLE OF CONTENTS
Foreword ....... i
Summary ....... iii
Agenda ....... v
List of authors ....... ix
Table of contents ....... xi
Section 0
Introduction

Gian Luigi MAZZI
Perspectives of Quarterly National Accounts in the new European System of Accounts ....... 1

Section 1
Revisions and Validations of Quarterly National Accounts

Tommaso DI FONZO, Stefano PISANI, Giovanni SAVIO
Revisions to Italian Quarterly National Aggregates: Some empirical results ....... 7

Virginie MADELIN
La relation entre les comptes annuels et les comptes trimestriels à l’INSEE ....... 33

Ana M. ABAD GARCIA, Enrique Martín QUILIS
Cyclical patterns of the Spanish Quarterly National Accounts: a VAR analysis ....... 45
Section 2
New Developments in the Econometrics-based Approach

Tommaso DI FONZO
Temporal disaggregation of a system of time series when the aggregate is known: Optimal vs. adjustment methods ....... 63

Roberto BARCELLAN
Ecotrim: a program for temporal disaggregation of time series ....... 79

Victor GOMEZ
Estimating observations in ARIMA models with the Kalman filter ....... 97

Claudio LUPI, Giuseppe PARIGI
Temporal disaggregation of economic time series: some econometric issues ....... 111

Stéphane GREGOIR
Proposition pour une désagrégation temporelle basée sur des modèles dynamiques simples ....... 141

Section 3
Survey-based Approach I

Heinrich LÜTZEL
Quarterly National Accounts of the Federal Statistical Office of Germany ....... 167

Ralf HAIN
Special Aspects of the Quarterly GDP Accounts in East Germany in the Initial Years after German Unification ....... 175
Survey-based Approach II

Helena RUNNQUIST, Eddie KARLSSON
An outline of the Swedish Quarterly National Accounts ....... 179

Ronald JANSSEN, Peter OOMENS, Nico VAN STOKROM
Data flows in Dutch Quarterly National Accounts ....... 191

Section 4
Seasonal Adjustment

Alfredo CRISTÓBAL CRISTÓBAL, Enrique Martín QUILIS
Cycle tendance contre correction de variations saisonnières dans les Comptes nationaux trimestriels ....... 201

Section 5
Indirect Methods

Erkki LÄÄKÄRI
The monthly GDP indicator ....... 221

Giancarlo BRUNO, Gianluca CUBADDA, Enrico GIOVANNINI, Claudio LUPI
The flash estimate of the Italian real Gross Domestic Product ....... 225

Carlos COIMBRA
Using data of qualitative business surveys for the estimation of Quarterly National Accounts ....... 237

Margherita CARLUCCI, Roberto ZELLI
Estimating Quarterly Regional Accounts for southern Italy ....... 241
SECTION 0 - INTRODUCTION
Perspectives of Quarterly National Accounts in the new European System of
Accounts
Gian Luigi MAZZI
EUROSTAT
This paper contains a synthesis of the main issues discussed during the workshop on
Quarterly National Accounts held in Paris on 5-6 December 1994 and jointly organised
by INSEE and EUROSTAT. It is intended to give an overview of the main topics
discussed in the papers, as well as during the round-table contributions on the future of
European Union Quarterly National Accounts. Moreover, the perspectives and the
possible improvements of Quarterly National Accounts in the framework of the new
European System of Accounts are also briefly discussed.
1 Introduction
National Accounts are a continuously evolving system of economic identities aiming to supply a synthetic view of economic reality. In the course of this evolution, extensions to social, regional, environmental and quarterly accounts have been introduced in the System of National Accounts.

Even if Quarterly National Accounts can be considered a quite long-standing topic, only with the introduction of the new European System of Accounts has their integration in the System of National Accounts been explicitly stated as a crucial point. In the European Union Member States, QNA have been developed following different principles, mainly due to dissimilarities in the disaggregation level, coverage and compilation methods (see Lützel, 2002). In some cases QNA constitute an incomplete system with clear problems of integration with annual figures. The only Member State having a completely integrated system of QNA, at present, is the United Kingdom.

From a macroeconomic point of view, Quarterly National Accounts (QNA) represent a major improvement. The possibility of compiling high-frequency National Accounts figures - i.e. quarterly and, in the future, monthly (see Lääkäri, 2002) - is of special relevance when quarterly data are fully integrated in the System of National Accounts, in order to obtain a better coverage and a complete description of the economic behaviour of a nation.

Nevertheless, a significant difference can be observed between such extensions. Social, regional and quarterly accounts are not only methodological but also practical issues in most Member States, so that data are already available. Environmental accounts still represent essentially a methodological issue, and only a limited number of quantitative applications have been experimented with so far.

From the middle of the '80s, Eurostat started active methodological and empirical work in the field of QNA, in order to obtain from all Member States integrated and reliable quarterly aggregates and to produce high-quality European aggregates. Those data are intended to be really useful for analysts, decision and policy makers, as well as for monetary authorities.
Eurostat and National Statistical Institutes are focusing more and more on QNA, mainly in response to the increasing role of quarterly data for short-term economic analysis and for the evaluation of economic policy measures.

This workshop has been jointly organised by INSEE and EUROSTAT and intends to provide useful and practical contributions on some key topics, such as:

• the Quarterly National Accounts compilation scheme in Member States;
• the methodology of calculation of QNA;
• the treatment of seasonality;
• flash estimate techniques.

2 The New European System of Accounts

The relevance of National Accounts was recognised early by the United Nations statistical service, which already in 1968 prepared a manual for the compilation of National Accounts. Eurostat implemented and adapted those principles to the structure of European economies and in 1971 published the European System of Accounts (ESA), later revised in 1979.

In neither the UN nor the EU manuals was any attention paid to QNA. Nevertheless, in 1972 Eurostat decided to prepare a specific document on Quarterly Accounts, defining a simplified scheme for the compilation of high-frequency National Accounts aggregates. Unfortunately, the document appeared to be too ambitious for the time, and it had no practical application.

At the end of the '80s, the preparatory work for the new System of National Accounts started under the supervision of the United Nations. The need for an explicit treatment of Quarterly Accounts was addressed, but in the final version of the manual, SNA 93, there was no specific chapter on the compilation of QNA. From the Eurostat point of view, this represented an important drawback. Therefore, during the preparation of the European application of the SNA, it has been clearly stated that Quarterly National Accounts had to be treated in a specific chapter.

This chapter included in the new ESA, although brief and not exhaustive, provided a description of the main specifications of QNA: compilation rules, estimation techniques, treatment of seasonality, consistency between quarterly and annual accounts, and revision policy. In particular, it represented the first attempt at the improvement and harmonisation of QNA. At the same time, Eurostat launched a series of projects devoted to the compilation of a handbook on Quarterly National Accounts, to the production of special software for the temporal disaggregation of economic series and to the implementation of a system of flash estimates of the most relevant quarterly aggregates.

3 Handbook on quarterly national accounts

During the discussion of the ESA chapter on Quarterly National Accounts, and on the occasion of the working group on National Accounts, many Member States explicitly requested Eurostat to provide a theoretical and practical manual for the compilation of Quarterly National Accounts.

The handbook on quarterly national accounts that Eurostat is setting up will deal, as requested, with typical conceptual problems of QNA and will provide some guidelines on their treatment.

As a preliminary step, Eurostat set up a questionnaire in order to analyse the current situation of QNA in EU Member States and in the main EU economic partners. The main outcome of this questionnaire should be an up-to-date overview of the current situation of QNA production. From this standpoint, and together with the consideration of the accounting rules stated in the new ESA, the handbook intends to provide:

• guidelines for the production of QNA in those Member States not yet compiling them;
• guidelines for the harmonisation of QNA compilation procedures for those Member States already producing them.
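One conceptual problem the handbook is meant to address is consistency between quarterly and annual accounts. As a purely illustrative sketch (not taken from the handbook, with invented figures), pro-rata distribution benchmarks a quarterly indicator to annual totals so that the temporal aggregation constraint holds exactly:

```python
# Hypothetical figures for illustration only: two annual benchmark totals
# and a related quarterly indicator series.
annual_totals = [420.0, 450.0]
indicator = [95.0, 100.0, 102.0, 103.0,   # year 1 quarters
             104.0, 108.0, 111.0, 117.0]  # year 2 quarters

def prorate(annual, quarterly_indicator):
    """Distribute each annual total over its four quarters in proportion
    to the indicator, so the quarters sum exactly to the benchmark."""
    result = []
    for year, total in enumerate(annual):
        quarters = quarterly_indicator[4 * year:4 * year + 4]
        scale = total / sum(quarters)
        result.extend(q * scale for q in quarters)
    return result

qna = prorate(annual_totals, indicator)
```

Pro-rata distribution is the simplest benchmarking rule; it can introduce a step between the last quarter of one year and the first quarter of the next, which is why movement-preserving methods (Denton-type adjustments, for instance) are usually preferred in practice.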
In this sense, the following aspects are going to be the core of the QNA handbook:

• description of the current procedures for QNA compilation in the Member States and main economic partners;
• integration of QNA in the context of the new European System of Accounts;
• reading of accounting rules in order to consider the shorter reference time horizon;
• identification of basic statistics and information sources available at quarterly frequency;
• theoretical background supporting existing methods used for compiling QNA;
• seasonal and calendar movements and their adjustment;
• consistency between quarterly and annual accounts.

Some well-recognised experts in QNA will develop, together with Member States and Eurostat, the main theoretical aspects mentioned above, in order to provide a complete description and to supply assessments and suggestions for their treatment.

The final result stemming from the handbook will lead to the identification of a complete system set up by Eurostat in order to obtain integrated and harmonised QNA figures from all the Member States.

4 ECOTRIM: a software for temporal disaggregation of time series

Economic time series are often observed only at low frequency, while economic analysis more and more often requires high-frequency disaggregated data. Economic analysts as well as National Statistical Institutes are therefore confronted with the problem of temporal disaggregation for estimating such series.

ECOTRIM is a software package written in GAUSS which supplies a set of mathematical and statistical techniques for carrying out temporal disaggregation. A detailed presentation of this programme is given in Barcellan and Di Fonzo (2002).

The approach presented in ECOTRIM corresponds to the so-called “indirect approach” to the estimation of Quarterly National Accounts, which is already in use in some Member States, such as Italy, France and Spain (Madelin, Abad Garcia and Quilis, 2002). The high-frequency series are estimated using available information corresponding to low-frequency series and, sometimes, to some related series. Despite their wide use, such techniques are subject to some criticism from a purely econometric point of view, as discussed in Lupi and Parigi (2002).

ECOTRIM has been developed on behalf of the European Commission - Eurostat - Directorate B “Economic Statistics and Economic and Monetary Convergence”.

One of the main advantages of ECOTRIM is that it offers the possibility to choose among a large set of techniques, in order to select the most appropriate one according to the available information. In particular, the program supplies a range of techniques concerning:

• temporal disaggregation of univariate time series, with or without related series, fulfilling temporal aggregation constraints (the methods that ECOTRIM offers follow the mathematical approach, the ARIMA model-based approach and the optimal approach, in the least-squares sense);
• temporal disaggregation of multivariate time series fulfilling both temporal and contemporaneous aggregation constraints (in this case too, ECOTRIM proposes both adjustment and optimal techniques, in the least-squares sense);
• forecasting of current-year observations, with or without available information from related series.

A data management section, a set of diagnostics and a graphic/display section allow users to carry out a complete analysis of the problem treated. ECOTRIM can be used both in interactive mode and in batch mode (when many series have to be submitted to the disaggregation process).

ECOTRIM is going to be further developed by EUROSTAT, in order to respond to the different needs of operators treating sub-annual accounts. It is currently under development in order to incorporate some new interesting features in treating temporal disaggregation. With reference to recent developments in the dynamic model approach literature, as well as in response to suggestions coming from National Statistical Institutes, ECOTRIM will be enhanced with:

• dynamic model approaches (Gregoir, 2002);
• non-linear constraints;
• development of Kalman filter techniques (Gomez, 2002);
• specific accounting treatments;
• more detailed diagnostics sections;
• co-integration analysis aspects (Gregoir, 2002).
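To make the regression-based family of techniques concrete, here is a minimal sketch, not the ECOTRIM implementation itself, of disaggregation in the Chow-Lin spirit with white-noise residuals: regress the annual series on the annualised related indicator by OLS, predict at quarterly frequency, then spread each annual residual evenly over its quarters so the temporal aggregation constraint is met exactly. All figures are invented.

```python
def disaggregate(annual, indicator):
    """Regression-based temporal disaggregation (white-noise residuals).

    annual    : list of n annual totals of the target series
    indicator : list of 4*n quarterly values of a related series
    returns   : list of 4*n quarterly estimates summing to the annual totals
    """
    n = len(annual)
    # Annualise the indicator so it can be regressed on the annual totals.
    x_annual = [sum(indicator[4 * i:4 * i + 4]) for i in range(n)]
    mx, my = sum(x_annual) / n, sum(annual) / n
    beta = sum((x - mx) * (y - my) for x, y in zip(x_annual, annual)) / \
           sum((x - mx) ** 2 for x in x_annual)
    alpha = my - beta * mx
    # The intercept is an annual level, so a quarter of it enters each quarter.
    pred = [alpha / 4 + beta * q for q in indicator]
    # Distribute each annual residual evenly across its four quarters.
    out = []
    for i, y in enumerate(annual):
        resid = y - sum(pred[4 * i:4 * i + 4])
        out.extend(p + resid / 4 for p in pred[4 * i:4 * i + 4])
    return out

# Invented example: three annual totals and a quarterly related series.
indicator = [10.0, 11.0, 12.0, 13.0,
             13.0, 14.0, 15.0, 16.0,
             16.0, 17.0, 18.0, 19.0]
annual = [200.0, 250.0, 300.0]
quarterly = disaggregate(annual, indicator)
```

The even spreading of the annual residual corresponds to the "adjustment" flavour mentioned above; the optimal least-squares variants instead distribute the residual according to an assumed covariance structure.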
5 Flash estimates of quarterly national accounts

Flash estimates are an essential tool for producing timely data for an integrated System of National Accounts. Such estimates are currently produced by some Member States, such as Italy (see Bruno, Giovannini and others) and the Netherlands (see Janssen).

Eurostat contributes to the development of those techniques with a specific project for the production of flash estimates of the main QNA aggregates at the European level, with a delay of between 30 and 45 days after the end of the reference period. This delay has been chosen because it seems to be “reasonable” for flash estimates, as suggested by the experience of certain countries (UK, USA). Reducing this delay normally implies a considerable reduction in the available basic information.

It should be clear that flash estimates are not simply an early release of quarterly accounts, but a rather different “product”. In fact, wide use is made of qualitative data, combining qualitative information (e.g. Business and Consumer Surveys) with time-series techniques. In addition, the level of disaggregation is normally different from the one used to compile QNA.

The scope of this project is to ensure the construction of a coherent system of data for short-term economic analysis. Even if certain indicators (production indexes, prices, foreign trade, business surveys, etc.) are promptly available, they do not fulfil the requirements of homogeneity and consistency typical of an integrated System of National Accounts.

On the other hand, the first estimates of quarterly accounts are often released with a considerable delay, making them partly unsuitable for users’ needs. In particular, business cycle analysis and the evaluation of economic and monetary policy measures require shorter delays.

Given these premises, an efficient way of estimating those aggregates should take into account the sub-annual information coming from different sources and should find an efficient way to incorporate it into the accounting framework. In this way the need for prompt information can be reconciled with the need for a consistent information system.

For the accomplishment of this objective, time-series analysis techniques (ARIMA models, transfer functions, VAR) are applied to the quarterly aggregates and to certain sub-annual statistics used as indicators. This methodology has been preferred to structural econometric modelling, in order to overcome the identification and estimation problems connected with structural models.

Using this approach, considerations coming from economic theory are generally disregarded, while attention is focused on the statistical relation between quarterly accounts variables and a set of basic statistics (Giovannini, Lupi, Bruno, Cubadda, 2002). Moreover, this procedure is consistent with the actual production process of QNA.

The first step of this procedure is the selection, among available data, of those indicators showing the following features:

• a stable relationship with the corresponding quarterly aggregate to be estimated;
• good forecasting performance;
• early availability.

After this first step, the relationship between the indicator and the aggregate to be estimated can be identified and evaluated in order to obtain forecasting models. These two steps are interrelated, and can lead to a revision of the model.

Some other issues may be involved in the work. First of all, the treatment of seasonality: using raw data and seasonally adjusting the results, or using seasonally adjusted data directly, does not necessarily give the same results. This is a rather important point, given the importance of seasonally adjusted figures. Second, the extent of the revisions is likely to influence the estimation, in so far as it relies on the past history of the aggregates.
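The two-step procedure described above can be sketched as a toy bridge-equation exercise (not any country's actual flash system): rank candidate indicators by the in-sample fit of a simple linear relation to past growth of the aggregate, then use the best one for the latest quarter, for which only the indicator is available. All series and names are invented.

```python
def ols(x, y):
    """Ordinary least squares for a single regressor with intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    return my - b * mx, b

def flash_estimate(aggregate, candidates):
    """Step 1: pick the indicator whose bridge equation fits history best
    (mean absolute error); step 2: apply it to the latest observation."""
    best = None
    n = len(aggregate)
    for name, series in candidates.items():
        a, b = ols(series[:n], aggregate)
        mae = sum(abs(a + b * x - y)
                  for x, y in zip(series[:n], aggregate)) / n
        if best is None or mae < best[0]:
            best = (mae, name, a, b, series)
    _, name, a, b, series = best
    return name, a + b * series[n]  # flash for the not-yet-published quarter

# Invented example: past growth of an aggregate and two candidate
# indicators, each observed for one extra (latest) quarter.
growth = [1.0, 2.0, 3.0]
candidates = {"production_index": [10.0, 20.0, 30.0, 40.0],
              "survey_balance": [5.0, 9.0, 2.0, 7.0]}
chosen, flash = flash_estimate(growth, candidates)
```

In a real system the bridge equations would be time-series models (ARIMA, transfer functions, VAR, as the text notes) rather than a static regression, and forecasting performance would be judged out of sample; the two steps would then be iterated, as described above.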
6 Conclusions

The new system of European National Accounts will constitute a major improvement for all National Accounts and a real enhancement for Quarterly National Accounts.

Eurostat and Member States are requested to take an active role in this process by improving both the methodological background and the organisational framework for Quarterly Accounts. The preparation of a handbook of Quarterly Accounts will provide guidelines to ensure the production of reliable, harmonised and comparable Quarterly National Accounts for the EU countries and for European aggregates. Furthermore, the enlargement process of the European Union will bring this challenge also to the Accession countries.

Therefore, the compilation of Quarterly National Accounts and the reduction of the delay for data availability require important methodological developments. In this sense, this workshop has represented an important forum bringing together National Statistical Offices, academics and users. The contributions presented during this workshop gave high-quality input for further developments.
References

Abad Garcia A.M. and Quilis E.M., “Cyclical patterns of the Spanish quarterly national accounts: a VAR analysis”, in Proceedings of the Workshop on Quarterly National Accounts, edited by Barcellan R. and Mazzi G.L., Luxembourg, 2002.

Barcellan R., “Ecotrim: a program for temporal disaggregation of time series”, in Proceedings of the Workshop on Quarterly National Accounts, edited by Barcellan R. and Mazzi G.L., Luxembourg, 2002.

Bruno G., Cubadda G., Giovannini E. and Lupi C., “The flash estimate of the Italian real Gross Domestic Product”, in Proceedings of the Workshop on Quarterly National Accounts, edited by Barcellan R. and Mazzi G.L., Luxembourg, 2002.

Di Fonzo T., “Temporal disaggregation of a system of time series when the aggregate is known: optimal vs. adjustment methods”, in Proceedings of the Workshop on Quarterly National Accounts, edited by Barcellan R. and Mazzi G.L., Luxembourg, 2002.

Gomez V., “Estimating observations in ARIMA models with the Kalman filter”, in Proceedings of the Workshop on Quarterly National Accounts, edited by Barcellan R. and Mazzi G.L., Luxembourg, 2002.

Gregoir S., “Propositions pour une désagrégation temporelle basée sur des modèles dynamiques simples”, in Proceedings of the Workshop on Quarterly National Accounts, edited by Barcellan R. and Mazzi G.L., Luxembourg, 2002.

Lääkäri E., “The monthly GDP indicator”, in Proceedings of the Workshop on Quarterly National Accounts, edited by Barcellan R. and Mazzi G.L., Luxembourg, 2002.

Lupi C. and Parigi G., “Temporal disaggregation of economic time series: some econometric issues”, in Proceedings of the Workshop on Quarterly National Accounts, edited by Barcellan R. and Mazzi G.L., Luxembourg, 2002.

Lützel H., “Quarterly National Accounts of the Federal Statistical Office of Germany”, in Proceedings of the Workshop on Quarterly National Accounts, edited by Barcellan R. and Mazzi G.L., Luxembourg, 2002.

Madelin V., “La relation entre les comptes annuels et les comptes trimestriels à l’INSEE”, in Proceedings of the Workshop on Quarterly National Accounts, edited by Barcellan R. and Mazzi G.L., Luxembourg, 2002.
SECTION 1 REVISIONS AND VALIDATIONS OF QUARTERLY NATIONAL
ACCOUNTS
Revisions to Italian Quarterly National Accounts Aggregates:
Some empirical results
Tommaso DI FONZO, Dipartimento di Scienze Statistiche, Università di Padova
Stefano PISANI, Dipartimento Contabilità Nazionale e Analisi Economica, Istat
Giovanni SAVIO, Dipartimento Contabilità Nazionale e Analisi Economica, Istat
1 Introduction
1
In common with most statistical agencies in other countries, in Italy Istat periodically revises quarterly national accounts estimates: almost all published variables are first released on a preliminary basis to satisfy the users' need for timely information, and are then revised at a later stage to incorporate information that was not available at the time of the preliminary release.

This gives rise to a revision process characterized by a preliminary (or provisional) estimate and by a subsequent series of current revisions, routinely produced, that follow closely upon one another. From time to time, additional, more unusual revisions also occur, which take place at infrequent and irregular intervals. These 'extraordinary' revisions can include changes in concepts and definitions of the aggregates and/or information coming from decennial censuses and improved estimation procedures.

Revisions to economic time series have been a continuing interest since the early studies by Zellner (1958) and Morgenstern (1963). In fact, the analysis of revisions provides a basis both for assessing the accuracy of provisional in relation to final estimates and for improving the methods of estimation used to compile provisional estimates. It cannot, of course, tell us anything about the reliability of the final estimates: a particular figure might never be revised and yet still be very unreliable. Assessment of the reliability of the final estimates must rely upon direct evidence about the data sources and the estimation technique on which they are based (Hibbert, 1981).

A history of systematic bias in the provisional estimates might lead either to a direct adjustment to the series in question so as to offset the presumed continuing bias (Rao, Srinath and Quenneville, 1989), or to more fundamental changes such as the collection of additional data or advances in the estimation technique.

While the features, nature and size of the revision process in the Italian annual national accounts estimates can be found in Trivellato (1986, 1987), in this paper we examine quarterly rather than annual data of GDP and its selected components.

A comprehensive analysis of the revisions would encompass the full set of published data, in both current and constant prices, including both raw and seasonally adjusted data. Growth rates as well as levels would be analyzed, and complete documentation about the revisions themselves and about the reasons why they were necessary would be collected and investigated. However, such an analysis would require considerable time and resources.

1 The views expressed in this paper are those of the authors and not necessarily of any institution or organization. We would like to thank Carolina Testa for her help in the collection and organization of the data set as well as for a first, preliminary empirical analysis.
The scope of the present paper is therefore more limited: its main objectives are a first insight into the revision process which characterizes the Italian quarterly national accounts, an examination of its distinctive features and, finally, a discussion of some issues concerning the behaviour of the preliminary estimates.

The magnitude of the revisions produced by Istat since the second half of the eighties is discussed, presenting an overview of how the initial estimates differ from their final figures in both levels and growth rates. Then, following some recent contributions (Patterson and Heravi, 1991, Pisani and Savio, 1994), we perform an econometric analysis of the revision process. Such an analysis is organized in three steps.

Firstly, we compare the short-medium term components of different versions of data. Secondly, we apply the concepts of integration and cointegration to verify the stationarity of the revision process. Lastly, in order to examine whether preliminary estimates of the main quarterly aggregates are efficient forecasts of the final figure, the hypotheses of unbiasedness, weak efficiency and orthogonality are tested.

The paper is organized as follows. In section 2 we give an outline of the Italian revision process and show the links between the various annual and quarterly estimates. Moreover, we discuss the information content of different versions of an aggregate, in order to establish homogeneous comparisons. Section 3 is devoted to a descriptive analysis of the accuracy of provisional estimates at current prices, both in levels and in growth rates. In section 4 the revision process is analyzed from a different, though complementary, point of view: the above mentioned econometric approach is worked out, to better understand the salient temporal features of the revision process, particularly with respect to the effects of the extraordinary revision published in 1987. Brief conclusions and indications for future research work on this subject are given in section 5.

2 Outline of the Italian revision process

The current publication plan of the Italian quarterly national accounts may be characterized in the following way (Giovannini, 1993, p. 18). The first estimate for any given quarter is released within about 110 days of the quarter's end. This provisional estimate is subsequently revised every 110 days, both in the same year and during the following two years. Finally, the estimates are revised in the second half of April of the following three years.

We denote by 'version t.q' the quarterly time series published 110 days after the end of quarter q of year t. Thus, for example (see table 1), the first preliminary estimate of GDP at current prices for the first quarter of 1992 belongs to the 1992.1 version, the first revised estimate for the same quarter belongs to the 1992.2 version, and so on. So far the above mentioned preliminary estimate has been revised 9 times, the last one in the second half of October 1994. Four further estimates of the 1992.1 figure are planned, namely along with the 1994.3, 1994.4, 1995.4 and, finally, 1996.4 versions. Normally these revisions complete the process, and no further changes take place.

The revision process is then an extensive one: after the first, preliminary estimate for each quarterly aggregate, from ten to thirteen further estimates continue to be published over a period varying between five and six years. More precisely, the release timing provides for fourteen vintages of the first quarter estimate, thirteen of the second quarter, twelve of the third quarter and, finally, eleven vintages of the fourth quarter data.

As the quarterly national accounts estimation procedure is of indirect type, in the sense that quarterly series are obtained by temporally disaggregating annual data (Istat, 1985, Di Fonzo, 1987), the quarterly revision process is strictly related to the annual one. The complete publication plan for a quarterly national accounts aggregate at year t is shown in table 2, from which we can get a first impression of the relationships that exist between the various quarterly estimates with respect to the relevant annual figure.

Three types of current annual estimates and revisions for year t can be recognized:

P_t: first preliminary annual estimate, released in year t+1;
Tab. 1: Preliminary and revised estimates of Italian GDP (1992.1-1994.2, billions of current lire, seasonally adjusted data). Rows are publication versions, columns are reference quarters.

Version   1992.1  1992.2  1992.3  1992.4  1993.1  1993.2  1993.3  1993.4  1994.1  1994.2
1992.1    371674
1992.2    371158  374278
1992.3    371374  375319  377308
1992.4    373264  377065  378234  378627
1993.1    373123  376049  377223  380795  383414
1993.2    373602  377238  377158  379192  382336  388362
1993.3    372778  377120  377533  379759  382140  388674  389840
1993.4    372493  373606  376198  379326  382434  389035  392026  396619
1994.1    373089  376257  375786  379190  382829  389020  391125  397140  400035
1994.2    372986  375976  375747  379614  383047  388899  391039  397130  401973  400874
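The successive columns of each row of Tab. 1 form a "revision triangle" whose differences are the revisions themselves. A minimal sketch (variable names are ours), using only the estimates of reference quarter 1992.1 from the table above, values in billions of current lire:

```python
# Successive estimates of GDP for reference quarter 1992.1, one per
# published version (first column of Tab. 1), seasonally adjusted.
vintages_1992_1 = {
    "1992.1": 371674, "1992.2": 371158, "1992.3": 371374,
    "1992.4": 373264, "1993.1": 373123, "1993.2": 373602,
    "1993.3": 372778, "1993.4": 372493, "1994.1": 373089,
    "1994.2": 372986,
}

versions = list(vintages_1992_1)
# Revision between each pair of consecutive versions, and total revision.
steps = [vintages_1992_1[b] - vintages_1992_1[a]
         for a, b in zip(versions, versions[1:])]
total = vintages_1992_1["1994.2"] - vintages_1992_1["1992.1"]

# The nine step revisions telescope to the cumulative one: 1312.
assert total == sum(steps)
```

The same construction, applied to every reference quarter, yields the full triangle analyzed in the descriptive part of the paper.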
Tab. 2: Current publication plan of the Italian quarterly national accounts aggregates for year t

Version   Quarter 1    Quarter 2    Quarter 3    Quarter 4    Annual data
t.1       N^1_{t,1}    --           --           --           n.a.
t.2       N^2_{t,1}    N^1_{t,2}    --           --           n.a.
t.3       N^3_{t,1}    N^2_{t,2}    N^1_{t,3}    --           n.a.
t.4       P^1_{t,1}    P^1_{t,2}    P^1_{t,3}    P^1_{t,4}    P_t
t+1.1     P^2_{t,1}    P^2_{t,2}    P^2_{t,3}    P^2_{t,4}    P_t
t+1.2     P^3_{t,1}    P^3_{t,2}    P^3_{t,3}    P^3_{t,4}    P_t
t+1.3     P^4_{t,1}    P^4_{t,2}    P^4_{t,3}    P^4_{t,4}    P_t
t+1.4     R^1_{t,1}    R^1_{t,2}    R^1_{t,3}    R^1_{t,4}    R_t
t+2.1     R^2_{t,1}    R^2_{t,2}    R^2_{t,3}    R^2_{t,4}    R_t
t+2.2     R^3_{t,1}    R^3_{t,2}    R^3_{t,3}    R^3_{t,4}    R_t
t+2.3     R^4_{t,1}    R^4_{t,2}    R^4_{t,3}    R^4_{t,4}    R_t
t+2.4     D^1_{t,1}    D^1_{t,2}    D^1_{t,3}    D^1_{t,4}    D_t
t+3.1     --           --           --           --           n.a.
t+3.2     --           --           --           --           n.a.
t+3.3     --           --           --           --           n.a.
t+3.4     D^2_{t,1}    D^2_{t,2}    D^2_{t,3}    D^2_{t,4}    D_t
t+4.1     --           --           --           --           n.a.
t+4.2     --           --           --           --           n.a.
t+4.3     --           --           --           --           n.a.
t+4.4     D^3_{t,1}    D^3_{t,2}    D^3_{t,3}    D^3_{t,4}    D_t
R_t: first revised annual estimate, released in year t+2;

D_t: second revised annual estimate, released in year t+3.

From table 2 four groups of current quarterly estimates clearly emerge:

N^v_{t,j}: within year, not constrained: v = 1,2,3 for j = 1; v = 1,2 for j = 2; v = 1 for j = 3;
P^v_{t,j}: constrained to P_t, v = 1,2,3,4, j = 1,2,3,4;
R^v_{t,j}: constrained to R_t, v = 1,2,3,4, j = 1,2,3,4;
D^v_{t,j}: constrained to D_t, v = 1,2,3, j = 1,2,3,4;

where v and j denote the within year vintage and the quarter, respectively. The following accounting relationships hold for quarterly and annual estimates of flow variables (we consider a release time spanning from t=2 to t=n+1):

Σ_{j=1}^{4} P^v_{t,j} = P_t,   v = 1,...,4,   t = 1,...,n
Σ_{j=1}^{4} R^v_{t,j} = R_t,   v = 1,...,4,   t = 1,...,n-1
Σ_{j=1}^{4} D^v_{t,j} = D_t,   v = 1,2,3,     t = 1,...,n-2

According to this current revision scheme, the final part of each version will be revised several times. As far as the last published series (1994.2 version) is concerned, a complete overview of the future planned revisions is given in table 3.

This picture is an idealized one, however, since it abstracts from the additional, more unusual revisions which also occur. In the Appendix we present the actual publication schedule for the Italian quarterly national accounts series from 1985 to the present time 2. It is clear that the current revision process has been upset by several additional revisions, whose characteristics are summarized in table 4.

We call these extraordinary revisions historical or occasional 3. They take place at infrequent and irregular intervals and affect the historical estimates for more than five years. The added information for historical revisions most usually comes from decennial censuses, other kinds of enlargements of the information basis and improved estimation procedures. The occasion of an historical revision is also used to implement conceptual and definitional changes to the national accounts framework and/or changes in the base year for constant prices evaluation. Historical revisions usually are major events which have wide-ranging effects through the detailed components of the accounts, and through time.
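The benchmarking constraints above are mechanical and easy to verify on any published vintage. A small illustrative sketch (the quarterly figures below are made up, not from the paper):

```python
# Illustrative check of the accounting constraint: every benchmarked
# vintage v of the quarterly estimates must sum to the corresponding
# annual estimate (shown here for P_t; R_t and D_t work the same way).
P_annual = 100.0
# Two hypothetical quarterly vintages, both benchmarked to P_annual.
P_quarterly = {
    1: [24.0, 25.0, 25.5, 25.5],
    2: [23.8, 25.1, 25.4, 25.7],
}

for v, quarters in P_quarterly.items():
    assert abs(sum(quarters) - P_annual) < 1e-9, f"vintage {v} breaks the constraint"
```

Within-year vintages N^v_{t,j} would fail such a check by construction, since no annual benchmark exists at their release time.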
Tab. 3: Future planned revisions for the last published quarterly national accounts (1994.2 version)

Period       Actual type of estimate   Number of revisions   Year of final release
90.1-90.4    D^2                       1                     1995
91.1-91.4    D^1                       2                     1996
92.1-92.4    R^3                       4                     1997
93.1-93.4    P^3                       8                     1998
94.1         N^2                       12                    1999
94.2         N^1                       12                    1999
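Under the plan of table 2, the stage reached by any estimate, and the number of revisions still to come, follows mechanically from the distance between the reference quarter and the current version. A sketch of that bookkeeping (the function names, the linear version index and the N/P/R/D coding are ours, not the paper's):

```python
def stage(t, j, T, q):
    """Stage of the estimate of quarter (t, j) as published in version (T, q),
    following table 2: N vintages within year t (quarters 1-3 only), P^1..P^4
    from version t.4, R^1..R^4 from t+1.4, and D^1, D^2, D^3 in the .4
    versions of years t+2, t+3, t+4. Assumes the estimate already exists."""
    m = 4 * T + q                      # linear index of the current version
    if m < 4 * t + 4:                  # within-year extrapolations
        return ("N", m - (4 * t + j) + 1)
    if m < 4 * t + 8:                  # constrained to the annual P_t
        return ("P", m - (4 * t + 4) + 1)
    if m < 4 * t + 12:                 # constrained to R_t
        return ("R", m - (4 * t + 8) + 1)
    return ("D", min(T - t - 2 + (1 if q == 4 else 0), 3))

def revisions_left(t, j, T, q):
    """Number of further estimates planned for quarter (t, j)."""
    kind, v = stage(t, j, T, q)
    return {"N": (4 - j) - v + 11,     # remaining N's, then 4 P + 4 R + 3 D
            "P": (4 - v) + 7,
            "R": (4 - v) + 3,
            "D": 3 - v}[kind]

# Reproduces table 3 for the 1994.2 version:
assert stage(92, 1, 94, 2) == ("R", 3) and revisions_left(92, 1, 94, 2) == 4
assert stage(91, 1, 94, 2) == ("D", 1) and revisions_left(91, 1, 94, 2) == 2
assert revisions_left(94, 2, 94, 2) == 12
```

The assertions at the bottom recover the table 3 rows for 1992, 1991 and 1994.2, so the sketch is consistent with the published plan.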
2 Quarterly national accounts publication actually started in 1976, but was discontinued in 1982. In 1985 Istat adopted a new estimation procedure (Istat, 1985), still valid nowadays. The analysis will be enlarged to the early estimates in a forthcoming paper.

3 A different characterization of the occasional revisions, unsuitable for the Italian case, is due to Smith (1977).
Among the revisions reported in table 4, the one published in 1987 (1986.4 version) distinguishes itself for its peculiarity and, as we shall see, for its considerable effects on the main aggregates. As Smith (1977) pointed out, it is not possible to make general statements about the characteristics of historical revisions, since each one is rather unique. However, the 1987 revision is the only one that has been concerned with changes in the conceptual framework of the Italian quarterly national accounts (Istat, 1993).

It should be noted that the within year revisions are slightly different from the subsequent ones. Estimates N^v_{t,j} are extrapolations based on related indicators which are available on a quarterly basis: besides the smaller amount of information on which they are based, unlike the other estimates they do not satisfy any accounting constraint, because the annual benchmark is not known at the release time. Therefore the within year revisions depend only on the revisions to these related indicators, while for the other estimates the primary source of revisions is new information on the annual benchmark.

Another important point about the revision process is that the presence of occasional revisions superimposed upon the current revision process results in mixed revisions (jointly 'current' and 'extraordinary'), whose nature is not homogeneous with the current revision sequence.

All these facts have to be kept in mind when comparing the various estimates, to properly appreciate the information content offered by them.

3 The accuracy of provisional estimates

The descriptive analyses of the current revision process are carried out on the following current prices, seasonally adjusted aggregates:

GDP: Gross Domestic Product;
CFI: Final Domestic Consumption;
INV: Gross Fixed Capital Formation;
IMP: Imports of Goods and Services;
EXP: Exports of Goods and Services.

The data set includes all published versions from 1984.4 to 1994.2 (see the Appendix).

In order to obtain a meaningful picture of the quarterly revision process, it is first necessary to establish which comparisons it is appropriate to carry out. For each of the five series examined we consider five sets of

Fig. 1: Percentage relative revisions induced by the 1987 historical revision: GDP, CFI and INV [chart not reproduced; the three series are plotted over 1980-1986, with relative revisions ranging roughly from 10% to 42%]
Tab. 4: Historical revisions of annual and quarterly national accounts (reference periods given as annual data; quarterly data)

Version   Type of revision                                               Reference period
1984.4    New estimation procedure of quarterly national accounts        --; 1970.1-1984.4
          (Istat, 1985)
1986.1    Completion of 1985 revision (Istat, 1987)                      --; 1970.1-1986.1
1986.4    Use of census data. New estimates of employment. Estimates     1980-1986; 1980.1-1986.4
          of hidden economy. Change of the base year. New procedure
          of constant price evaluation. Input-Output table for 1982
          (Istat, 1990, 1993)
1987.4    Small firms survey 1985-1986                                   1983-1987; 1983.1-1987.4
1988.4    Retrospective estimation starting from 1970 (Istat, 1992)      1970-1981, 1983-1988; 1970.1-1988.4
1990.4    Small firms survey for various years (Istat, 1991).            1970-1990; 1980.1-1991.3 (*)
          Input-Output tables for 1985 and 1988. Change of the
          base year
1992.2    Revision of the estimation procedure of quarterly national     --; 1970.1-1992.2
          accounts. Completion of 1991 revision
estimates, N^1_{t,j}, P^1_{t,j}, R^1_{t,j}, D^1_{t,j} and F_{t,j}, respectively, where by F_{t,j} we denote the series of "final" estimates, for which the current revision process can be regarded as finished 4.

The comparison between N^1_{t,j} and P^1_{t,j} tells us how much of the revision is carried out in the first stage, while the revision carried out in the second and third stage can be appreciated by comparing P^1_{t,j} with R^1_{t,j} and R^1_{t,j} with D^1_{t,j}, respectively 5. The comparison between N^1_{t,j} and F_{t,j} permits evaluating the overall effect of the revision process.

We provide summary statistics for different comparisons between preliminary and revised series. In accordance with Biggeri and Trivellato (1991), relative errors are analyzed to evaluate the accuracy of preliminary estimates of the levels, and errors to evaluate the accuracy of preliminary growth rates.

Given the particularly strong effect on GDP, CFI and INV of the revision published in 1987 (see fig. 1), it seems appropriate to keep complete and 'pure' comparisons distinct, where in the latter case we omit the 'mixed' comparisons related to the above mentioned historical revision (see table 5 for an overview of the data availability and of the nature of the comparisons).

A graphical support to this choice is given by fig. 2, in which GDP percentage relative errors (see below) of N^1_{t,j} as compared with P^1_{t,j} are presented. The 1987 historical revision results in abnormally large relative errors in the first three quarters of 1986 6. As far as IMP and EXP are concerned, the effects of the 1987 historical revision are less marked (see fig. 3 and fig. 4) 7.

4 In other words, the series F_{t,j} corresponds to the 1994.2 version up to the 1989.4 figure.

5 We consider only current revisions related to different annual benchmarks. However, an analysis of the 'intermediate' revisions would be very useful to appreciate the effects of the revisions in the related series used in the estimation procedure.

6 Similar results were found for CFI and INV too.

7 However, the descriptive analysis has been carried out on the same set of observations for all aggregates.
Fig. 2: Percentage relative errors of estimates N^1_{t,j} as compared with P^1_{t,j}: GDP [chart not reproduced; the 1986.1-1986.3 observations stand out with relative errors below -14%, against a few percentage points at most for the other quarters]

Fig. 3: Percentage relative revisions induced by the 1987 historical revision: IMP and EXP [chart not reproduced; the revisions are plotted over 1980-1986 and stay within roughly ±4%]
Tab. 5: Data availability for the analysis of the current revision process of the Italian quarterly national accounts [table not reproduced: for each quarter from 1983 to 1992 it records which of the current estimates N^1, P^1, R^1, D^1 and F are available, and whether each of the comparisons N^1-P^1, P^1-R^1, R^1-D^1 and N^1-F is 'pure' or 'mixed']
Fig. 4: Percentage relative errors of estimates N^1_{t,j} as compared with P^1_{t,j}: IMP and EXP [chart not reproduced; the relative errors range roughly between -6% and +6%, with the largest values in the quarters of 1986]
More precisely, let v_t and e_t be the error and the relative error, respectively, of a provisional estimate p_t with respect to a revised estimate r_t:

v_t = p_t - r_t,    e_t = (p_t - r_t) / r_t.

In order to analyze the preliminary estimates of levels we use the following indices:

1. mean relative error, ē;
2. mean absolute relative error, e';
3. relative error standard deviation, s_e;
4. square root of the mean quadratic relative error, d_e;
5. bias component of the mean quadratic relative error, U_e^b.

In turn, to evaluate the accuracy of preliminary growth rates we consider the following indices:

1. mean error, v̄;
2. mean absolute error, v';
3. error standard deviation, s_v;
4. square root of the mean quadratic error, d_v;
5. bias component of the mean quadratic error, U_v^M.

In other words, we calculate:

Errors:
v̄ = (1/n) Σ_{t=1}^{n} v_t
v' = (1/n) Σ_{t=1}^{n} |v_t|
s_v = [(1/n) Σ_{t=1}^{n} (v_t - v̄)²]^{1/2}
d_v = [(1/n) Σ_{t=1}^{n} v_t²]^{1/2}
U_v^M = (p̄ - r̄)² / d_v²

Relative errors:
ē = (1/n) Σ_{t=1}^{n} e_t
e' = (1/n) Σ_{t=1}^{n} |e_t|
s_e = [(1/n) Σ_{t=1}^{n} (e_t - ē)²]^{1/2}
d_e = [(1/n) Σ_{t=1}^{n} e_t²]^{1/2}
U_e^b = ē² / d_e²
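These indices are straightforward to compute; a sketch (the function name is ours), fed with the first and the latest (1994.2-version) estimates of the four quarters of 1992 from Tab. 1:

```python
import math

def accuracy_indices(p, r):
    """Accuracy indices of provisional estimates p against revised estimates r:
    mean relative error, mean absolute relative error, standard deviation,
    root mean square, and the bias share of the mean square, as defined above."""
    n = len(p)
    e = [(pt - rt) / rt for pt, rt in zip(p, r)]   # relative errors e_t
    e_bar = sum(e) / n
    e_abs = sum(abs(x) for x in e) / n
    s_e = math.sqrt(sum((x - e_bar) ** 2 for x in e) / n)
    d_e = math.sqrt(sum(x ** 2 for x in e) / n)
    U_b = e_bar ** 2 / d_e ** 2                    # bias component
    return e_bar, e_abs, s_e, d_e, U_b

# First and latest (1994.2 version) estimates of 1992.1-1992.4, from Tab. 1.
p = [371674, 374278, 377308, 378627]
r = [372986, 375976, 375747, 379614]
e_bar, e_abs, s_e, d_e, U_b = accuracy_indices(p, r)

# The indices are tied together by the decomposition d_e^2 = s_e^2 + e_bar^2.
assert abs(d_e ** 2 - (s_e ** 2 + e_bar ** 2)) < 1e-12
```

The decomposition checked in the last line is why ē and d_e jointly reveal how much of the mean square error is pure bias, which is exactly what U_e^b measures.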
Clearly, v̄ and ē are not measures of the accuracy of provisional estimates. However, together with v' and e', respectively, they permit evaluating if, and eventually how much, the errors are always (or almost always) in the same direction.

It has to be noted that both quarterly and annual growth rates have been calculated on the basis of the same version of data. This means that we compare levels at different stages of revision. For example, the current first quarter growth rates of an aggregate at year t are given by

n^1_{t,1} = (N^1_{t,1} - P^1_{t-1,4}) / P^1_{t-1,4};
p^1_{t,1} = (P^1_{t,1} - R^1_{t-1,4}) / R^1_{t-1,4};
r^1_{t,1} = (R^1_{t,1} - D^1_{t-1,4}) / D^1_{t-1,4};
d^1_{t,1} = (D^1_{t,1} - D^2_{t-1,4}) / D^2_{t-1,4}.

In the first three cases non-homogeneous figures are used, in the sense that they do not belong to the same annual benchmark revision stage 8. This problem is less noticeable for d^1_{t,1}, because both D^1_{t,j} and D^2_{t,j} satisfy the same accounting constraint with D_t.

The annual growth rates are always calculated on the basis of estimates pertaining to different stages of the annual revision process. Indeed,

na^1_{t,j} = (N^1_{t,j} - P^1_{t-1,j}) / P^1_{t-1,j},   j = 1,2,3;
pa^1_{t,j} = (P^1_{t,j} - R^1_{t-1,j}) / R^1_{t-1,j},   j = 1,...,4;
ra^1_{t,j} = (R^1_{t,j} - D^1_{t-1,j}) / D^1_{t-1,j},   j = 1,2,3,4.

As Patterson and Heravi (1992) pointed out, in order to isolate the factors influencing the revisions the growth rates should be calculated on the same vintage of data. Each numerator of the above fractions can actually be viewed as the sum of a 'pure difference' effect and of a 'revision' effect. More precisely, denoting by y^v_{t,j} the v-th vintage of an aggregate at quarter j of year t and by y^{v+4}_{t,j} the subsequent (v+4)-th vintage of the same value, it is

y^v_{t,j} - y^{v+4}_{t-1,j} = (y^v_{t,j} - y^v_{t-1,j}) - (y^{v+4}_{t-1,j} - y^v_{t-1,j}),

where y^v_{t,j} - y^v_{t-1,j} measures the pure difference effect and y^{v+4}_{t-1,j} - y^v_{t-1,j} the revision effect. If, as often happens, on average y^v_{t-1,j} < y^{v+4}_{t-1,j}, the annual growth rates tend to be underestimated by the early estimates.

From table 6 it emerges that, as far as the aggregates in question are concerned and given the small number of quarters taken into account, on average the preliminary N^1_{t,j} vintage underestimates the final figure (by about 2% for INV, 1.5% for GDP, 1.4% for both CFI and IMP, and by a smaller amount for EXP). This tendency is very systematic for GDP, CFI and INV, as the comparison between ē and e' clearly shows, while the corrections to IMP and EXP are not always in the same direction.

It should be noted that the weight of the first revision (from N^1_{t,j} to P^1_{t,j}) (i) is relatively more marked (N^1_{t,j} underestimates P^1_{t,j} by about 1% for GDP) and (ii) is generally characterized by a considerable systematicity (for GDP, U_e^b is equal to .6352).

The systematicity in underestimating F_{t,j} for GDP, CFI and INV leads, on average, to an underestimation of quarterly growth rates (about 0.9% for INV and 0.7% for GDP, see table 7). Even in this case (i) the effect of the first revision stage is larger (from .24 to .61 percentage points on average) as compared with the following two, and (ii) the intermediate revisions and the overall one of IMP and EXP have different characteristics from those of GDP, CFI and INV.

A more marked, and systematic, underestimation characterizes the early estimates of annual growth rates (see table 8) for all aggregates, perhaps with the only exception of EXP, for which the overall revision process does not show a marked upward tendency.

8 However, the relative scarcity of data due to the irregular sequence of publication and to the intervention of the historical revision published in 1987, as well as the peculiarity and the relatively brief history of the Italian quarterly national accounts, induced us to operate on the same version rather than on the same vintage.
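The decomposition of a same-version difference into a pure-difference effect and a revision effect is an identity, easily checked numerically (the figures below are illustrative, not from the paper):

```python
# Same-version annual difference for quarter j of year t:
#   y[v][t] - y[v+4][t-1] = (y[v][t] - y[v][t-1]) - (y[v+4][t-1] - y[v][t-1])
# The first bracket is the 'pure difference' effect, the second the
# 'revision' effect.  Hypothetical values:
y_v_t    = 105.0   # v-th vintage of quarter (t, j)
y_v_tm1  = 100.0   # v-th vintage of quarter (t-1, j)
y_v4_tm1 = 101.0   # (v+4)-th vintage of quarter (t-1, j), i.e. the value
                   # for (t-1, j) published in the same version as y_v_t

same_version_diff = y_v_t - y_v4_tm1        # 4.0
pure_difference   = y_v_t - y_v_tm1         # 5.0
revision_effect   = y_v4_tm1 - y_v_tm1      # 1.0 (upward revision)

assert same_version_diff == pure_difference - revision_effect
```

With an upward revision (y_v4_tm1 > y_v_tm1), the same-version difference understates the pure difference, which is the mechanism behind the underestimation of growth rates noted above.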
On the whole, the empirical evidence shows that the current preliminary estimates have a tendency to underestimation, more marked for GDP, CFI and INV, less so for IMP and EXP. Moreover, the revision process is somewhat 'convergent', in the sense that it has a sufficiently stable ordering, so that N^1 ≤ P^1 ≤ R^1 ≤ F.

As one could expect, these results are in agreement with those found for the Italian annual revision process 9 (Trivellato, 1986, Biggeri and Trivellato, 1991). They also confirm a phenomenon common to many countries (Glejser and Schavey, 1974, Smith, 1977, Young, 1993).

4 An econometric approach to the analysis of the revision process

As has been shown in the preceding analyses, a noticeable process of revision of the Italian quarterly national accounts took place during the 1980s, mainly caused by the need to quantify appropriately the structural changes experienced over the last two decades.

To give an adequate assessment of the innovations induced by successive revisions, within the Department of National Accounts and Economic Analysis of Istat the need was felt to define techniques which would permit a complete appraisal of the changes occurring to the various quarterly and annual aggregates, thus rendering their economic interpretation possible. For this purpose it was decided to use econometric techniques, which are considered most appropriate for a quantitative assessment of the phenomena under study. In particular, in Pisani and Savio (1994) the following approach was proposed.

Firstly, the main statistical features of the cyclical component of the various preliminary versions of a series and that of the final one are compared. Secondly, integration and cointegration properties of different versions are considered. Lastly, viewing preliminary versions as forecasts of the final version, the issue of rationality is considered with regard to the aspects of unbiasedness and (weak) efficiency.

It should be noted that the analyses carried out in this order provide different, but exhaustive, indications about the revision process. For example, cyclical components could show different patterns, yet the series should be cointegrated if the associated revision is stationary. Furthermore, in this case the early version could well be an unbiased though inefficient forecast of the final version.

This approach seems to be very useful in terms of its efficacy in the analysis of the revisions involving considerable changes in the series, in particular of those defined here as historical or occasional. The analysis will then focus mainly on the constant prices 1986.3, 1988.4 and 1990.3 versions, respectively. These versions are compared with the final one 10, conventionally defined as 1994.2. All the analyses of this section are carried out using logarithmic transformations of the rebased (at 1985 prices) variables 11.

4.1 The analysis of cyclical components

The first step of the analysis consists in comparing the transitory (or cyclical) component of each preliminary version with that of the final version. In Pisani and Savio (1994) such an analysis was carried out using, as a method of decomposition of the series, the estimation of adequate structural models (see e.g. Harvey, 1989).

9 Though not yet concluded, the analyses on quarterly constant prices aggregates generally confirm these results.

10 When comparing different versions, the maximum available time span is used.

11 The rebasing has been done for each component of the analyzed variables as they are currently published by Istat. Methodological as well as practical aspects related to this issue (Rushbrook, 1978) will be deeply analyzed in future work.
Tab. 6: Indices of accuracy of provisional estimates of levels

Comparison   n     ē        e'      s_e     d_e     U_e^b
GDP
N1-P1        16   -.0095   .0099   .0072   .0119   .6352
P1-R1        24   -.0019   .0045   .0053   .0057   .1180
R1-D1        24   -.0015   .0053   .0073   .0075   .0395
N1-F          7   -.0147   .0147   .0036   .0151   .9417
CFI
N1-P1        16   -.0053   .0072   .0079   .0095   .3088
P1-R1        24   -.0017   .0047   .0055   .0057   .0859
R1-D1        24   -.0018   .0034   .0045   .0048   .1365
N1-F          7   -.0136   .0136   .0044   .0143   .9057
INV
N1-P1        16   -.0043   .0215   .0240   .0243   .0314
P1-R1        24   -.0047   .0093   .0106   .0116   .1654
R1-D1        24    .0029   .0074   .0103   .0107   .0738
N1-F          7   -.0215   .0215   .0081   .0230   .8763
IMP
N1-P1        16   -.0137   .0192   .0197   .0240   .3266
P1-R1        24   -.0013   .0083   .0106   .0107   .0155
R1-D1        24    .0002   .0080   .0128   .0128   .0002
N1-F          7   -.0137   .0209   .0178   .0225   .3728
EXP
N1-P1        16   -.0110   .0168   .0150   .0186   .3497
P1-R1        24    .0013   .0104   .0140   .0141   .0080
R1-D1        24    .0011   .0087   .0123   .0124   .0079
N1-F          7   -.0040   .0224   .0271   .0274   .0218
Tab. 7: Indices of accuracy of provisional estimates of quarterly growth rates

Comparison   n     v̄        v'      s_v     d_v     U_v^M
GDP
N1-P1        16   -.0049   .0059   .0057   .0076   .4255
P1-R1        24   -.0015   .0041   .0049   .0052   .0894
R1-D1        24    .0011   .0044   .0061   .0062   .0297
N1-F          7   -.0069   .0069   .0043   .0082   .7196
CFI
N1-P1        16   -.0024   .0030   .0033   .0041   .3416
P1-R1        24   -.0005   .0031   .0038   .0039   .0194
R1-D1        24    .0002   .0018   .0023   .0023   .0088
N1-F          7   -.0026   .0036   .0038   .0046   .3111
INV
N1-P1        16   -.0036   .0096   .0109   .0115   .0967
P1-R1        24   -.0015   .0074   .0086   .0088   .0302
R1-D1        24    .0002   .0054   .0073   .0073   .0004
N1-F          7   -.0094   .0102   .0058   .0110   .7223
IMP
N1-P1        16   -.0061   .0109   .0126   .0140   .1917
P1-R1        24    .0002   .0103   .0144   .0144   .0003
R1-D1        24   -.0018   .0106   .0182   .0183   .0101
N1-F          7    .0032   .0187   .0216   .0218   .0218
EXP
N1-P1        16   -.0056   .0103   .0120   .0132   .1771
P1-R1        24    .0005   .0137   .0195   .0195   .0007
R1-D1        24   -.0021   .0122   .0191   .0193   .0120
N1-F          7    .0062   .0209   .0242   .0250   .0606
Tab. 8: Indices of accuracy of provisional estimates of annual growth rates

Comparison   n     v̄        v'      s_v     d_v     U_v^M
GDP
N1-P1        16   -.0073   .0079   .0055   .0092   .6378
P1-R1        24   -.0005   .0060   .0078   .0078   .0046
R1-D1        24   -.0004   .0051   .0068   .0068   .0030
N1-F          7   -.0057   .0065   .0076   .0095   .3651
CFI
N1-P1        16   -.0037   .0065   .0064   .0074   .2527
P1-R1        24    .0001   .0043   .0055   .0055   .0001
R1-D1        24   -.0006   .0030   .0035   .0036   .0292
N1-F          7   -.0051   .0070   .0075   .0090   .3192
INV
N1-P1        16   -.0013   .0189   .0217   .0218   .0033
P1-R1        24   -.0088   .0145   .0157   .0180   .2377
R1-D1        24    .0018   .0076   .0111   .0113   .0247
N1-F          7   -.0247   .0265   .0180   .0305   .6535
IMP
N1-P1        16   -.0167   .0201   .0175   .0242   .4757
P1-R1        24   -.0009   .0119   .0167   .0167   .0031
R1-D1        24    .0006   .0101   .0161   .0161   .0014
N1-F          7   -.0236   .0239   .0182   .0298   .6257
EXP
N1-P1        16   -.0120   .0204   .0187   .0223   .2925
P1-R1        24    .0003   .0096   .0120   .0120   .0006
R1-D1        24    .0013   .0122   .0188   .0189   .0046
N1-F          7   -.0098   .0250   .0280   .0297   .1082
Here we use a different method: the application of the Hodrick and Prescott (1980) filter to determine the trend component of a series 12. The main characteristic of this filter is that it emphasizes the medium- and high-frequency movements in the data, those usually associated with the business cycle. It should be noted that the filter is simple to use and highly operational, that the definition of the trend it adopts is quite intuitive, that it is able to render stationary series that are integrated up to the fourth order and, finally, that the estimated trend component is endowed with the same features of integration and cointegration as the original series (Lupi, 1992). Being statistical in nature, it represents a particular way of viewing the data which can be very productive.

Figure 5 reports the cyclical component of these aggregates for the various versions considered in the analysis. In the seventies the cyclical components of the various versions are very close, while in the eighties they are more irregular and, consequently, show more marked differences. Nonetheless, the graphs show that the main turning points, as well as the phase and the period of the cycles, are substantially similar, independently of the aggregate and the version considered. This result parallels those obtained in Pisani and Savio (1994) for GDP.

The contemporaneous cross-correlations between the early and final versions (see table 9) give some quantitative insight into the degree of correspondence between the cyclical components. The correlations are very high, falling below 0.9 in just three cases, corresponding to the first version considered for INV, IMP and EXP.

4.2 Unit root analysis and cointegration tests

It was recently stressed that the comparison between provisional and final versions of a given series may conveniently be carried out by reference to the concept of cointegration (see Patterson and Heravi, 1991; Pisani and Savio, 1994). In fact, an important aspect to be verified during the analysis of the revision process is the stationarity of the revisions resulting from different versions and the existence of a cointegration relationship between the various provisional versions and the final one.

Such an analysis is particularly relevant both for the activity of official statistical agencies and for the users of the data. If we accept the hypothesis that the residuals from a cointegrating regression between, say, a preliminary version of a series and the final one are I(1) rather than I(0), then an I(1) variable, or combination of variables, must have been omitted 13. Thus, an important aspect to be considered is whether preliminary versions of the data are cointegrated with the final realization. Testing for cointegration requires a preliminary analysis of the degree of integration of the series examined. To this end, the conventional test proposed by Dickey and Fuller (Fuller, 1976; Dickey and Fuller, 1979, 1981) is used, the results of which are given in table 10.
Given the strong upwards trend in most of the data series, the functional form used in the test is that with drift and deterministic trend 14. The number of differences used in the tests was chosen as the smallest for which the regression passes an LM test for autocorrelation of the residuals up to lag 8.

12 Briefly, the filter proposed by Hodrick and Prescott permits the extraction from the series $y_t$ of a trend component $\tau_t$ defined as the solution of the following minimum problem:
$$\min_{\{\tau_t\}} \sum_{t=1}^{n} (y_t - \tau_t)^2 + \mu \sum_{t=2}^{n-1} \left[ (\tau_{t+1} - \tau_t) - (\tau_t - \tau_{t-1}) \right]^2, \qquad \mu > 0.$$
Fluctuations are defined as the deviations from the trend component, $(y_t - \tau_t)$.

13 On the other hand, if the hypothesis of cointegration were accepted, the associated revision (derived from the joint restriction on the corresponding cointegrating regression that the constant is zero and the cointegrating coefficient is unity) could be non-stationary. We shall return to this aspect later, when we discuss the rationality tests.
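The penalised least-squares problem in footnote 12 has a closed-form solution: stacking the second differences into a matrix K, the first-order conditions give (I + μK′K)τ = y. A minimal numpy sketch, not the authors' code; the smoothing value μ = 1600, conventional for quarterly data, is our assumption since the paper does not state its choice:

```python
import numpy as np

def hp_filter(y, mu=1600.0):
    """Hodrick-Prescott decomposition via the first-order conditions
    (I + mu * K'K) tau = y, where K is the (n-2) x n second-difference operator."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    K = np.zeros((n - 2, n))
    for t in range(n - 2):
        K[t, t], K[t, t + 1], K[t, t + 2] = 1.0, -2.0, 1.0
    trend = np.linalg.solve(np.eye(n) + mu * K.T @ K, y)
    cycle = y - trend            # fluctuations: deviations from the trend
    return trend, cycle

# A purely linear series has zero second differences, so the filter returns it unchanged.
t = np.arange(40, dtype=float)
trend, cycle = hp_filter(t)
```

Building the dense K matrix is the simplest correct route; for long series a sparse or banded solver would be preferable, but the algebra is identical.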
Tab. 9: Estimated contemporaneous cross-correlations between the cyclical component of the final version and those of different preliminary versions

Version | GDP   | CFI   | INV   | IMP   | EXP
1986.3  | 0.944 | 0.909 | 0.898 | 0.851 | 0.807
1988.4  | 0.989 | 0.968 | 0.962 | 0.973 | 0.968
1990.3  | 0.974 | 0.957 | 0.937 | 0.971 | 0.964
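The entries of table 9 are ordinary contemporaneous correlation coefficients between two cyclical series. A small sketch with synthetic data; the series and the resulting number are illustrative, not the paper's:

```python
import numpy as np

# Illustrative only: two noisy measurements of the same underlying cycle.
rng = np.random.default_rng(1)
common = np.sin(np.linspace(0.0, 8.0 * np.pi, 100))        # shared cyclical movement
final_cycle = common + 0.05 * rng.standard_normal(100)     # "final version" cycle
early_cycle = common + 0.05 * rng.standard_normal(100)     # "preliminary version" cycle

# Contemporaneous cross-correlation, as reported in table 9
rho = np.corrcoef(early_cycle, final_cycle)[0, 1]
```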
The null hypothesis of integration of order 1 is strongly accepted for all the series considered, excluding the first and the final version of imports, for which the test statistic is only marginally below the 5% critical value. In these two cases, however, as in the others, the null hypothesis is rejected by an analogous procedure applied to the first differences of the series.

Now we can consider the tests of cointegration between the final version and each preliminary estimate. Table 11 reports the results obtained using Johansen's maximum likelihood procedure (see Johansen, 1988, 1989) for the choice of the cointegration rank between couples of variables. The order of the VAR was chosen on the basis of the results provided by a number of criteria: Akaike, Hannan and Quinn, and Schwarz 15. Following recent results obtained by Reimers (1993), in case of conflict the result given by Hannan and Quinn's criterion was preferred 16. In only five cases was the resulting VAR order modified, increasing it until the hypotheses of whiteness and normality of the residuals were satisfied. The tests considered were those of Ljung and Box (1978) for serial correlation, and Jarque and Bera (1980) and Mardia (1980) for the normality hypothesis.
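The lag-selection step just described, fitting a VAR(p) for increasing p and comparing the Akaike, Hannan-Quinn and Schwarz criteria on a common sample, can be sketched as follows. This is a numpy sketch using the textbook forms of the criteria (as in Lütkepohl, 1991), not the authors' code:

```python
import numpy as np

def var_lag_order(Y, pmax):
    """Akaike, Hannan-Quinn and Schwarz criteria for VAR(p), p = 1..pmax,
    evaluated on a common sample so the values are comparable."""
    Y = np.asarray(Y, dtype=float)
    n, k = Y.shape
    T = n - pmax                      # common effective sample size
    crit = {}
    for p in range(1, pmax + 1):
        # Regressors: constant plus p lags of all k variables
        Z = np.column_stack([np.ones(T)] + [Y[pmax - j:n - j] for j in range(1, p + 1)])
        B, *_ = np.linalg.lstsq(Z, Y[pmax:], rcond=None)
        U = Y[pmax:] - Z @ B          # residuals
        logdet = np.log(np.linalg.det(U.T @ U / T))
        m = p * k * k                 # number of lag coefficients
        crit[p] = (logdet + 2.0 * m / T,                      # Akaike
                   logdet + 2.0 * np.log(np.log(T)) * m / T,  # Hannan-Quinn
                   logdet + np.log(T) * m / T)                # Schwarz
    return crit

# Illustrative check on a simulated bivariate VAR(1)
rng = np.random.default_rng(2)
A = np.array([[0.5, 0.1], [0.0, 0.4]])
Y = np.zeros((302, 2))
for s in range(1, 302):
    Y[s] = A @ Y[s - 1] + rng.standard_normal(2)
crit = var_lag_order(Y[2:], 4)
sc_order = min(crit, key=lambda p: crit[p][2])   # order chosen by Schwarz
```

Evaluating every candidate order on the same truncated sample is what makes the criterion values comparable across p.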
Tab. 10: Dickey-Fuller unit root tests for log-levels and first differences of different versions*

Series | 1994.2        | 1990.3        | 1988.4        | 1986.3
GDP    | -2.5054 (1)   | -3.2447 (1)   | -3.1015 (1)   | -2.5279 (1)
∆ GDP  | -5.5959 (0)   | -4.7364 (0)   | -4.5282 (0)   | -5.4797 (0)
CFI    | -1.0140 (2)   | -1.5690 (0)   | -2.0729 (2)   | -2.4538 (1)
∆ CFI  | -4.9364 (1)   | -5.0858 (2)   | -4.2561 (1)   | -4.6179 (0)
INV    | -2.5354 (1)   | -2.3180 (1)   | -2.2314 (3)   | -2.9060 (1)
∆ INV  | -5.8423 (0)   | -4.9659 (0)   | -4.6528 (2)   | -5.2346 (0)
IMP    | -3.5218 (1)   | -1.9009 (0)   | -2.2500 (0)   | -3.6149 (1)
∆ IMP  | -7.6797 (0)   | -6.8469 (0)   | -6.6661 (0)   | -5.5993 (3)
EXP    | -3.0644 (0)   | -2.8386 (0)   | -2.7403 (0)   | -2.3008 (0)
∆ EXP  | -10.8241 (0)  | -10.1938 (0)  | -10.0698 (0)  | -8.5313 (0)
14 The results obtained in the specifications without trend are qualitatively analogous to those shown in the paper and are available from the authors on request.

15 See Lütkepohl (1991) for details.

16 In fact, Reimers shows by Monte Carlo simulations that Hannan and Quinn's test performs better in terms of correct percentages of selection than the others mentioned in the text.
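The Dickey-Fuller regression with drift and deterministic trend behind table 10 can be sketched as follows. This is a numpy sketch, not the authors' code; the reported statistic is the t-ratio on the lagged level, to be compared with the Dickey-Fuller critical values (roughly -3.45 at 5% in the trend case):

```python
import numpy as np

def df_test(y, lags=0):
    """Augmented Dickey-Fuller regression with drift and deterministic trend:
    dy_t = c + d*t + rho*y_{t-1} + sum_j phi_j*dy_{t-j} + e_t.
    Returns the t-ratio on rho (the coefficient of the lagged level)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    T = len(dy) - lags
    cols = [np.ones(T), np.arange(T, dtype=float), y[lags:-1]]
    for j in range(1, lags + 1):
        cols.append(dy[lags - j:-j])          # lagged differences
    X = np.column_stack(cols)
    z = dy[lags:]
    b, *_ = np.linalg.lstsq(X, z, rcond=None)
    u = z - X @ b
    s2 = u @ u / (T - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X).diagonal())
    return b[2] / se[2]

# Illustrative: a trend-stationary series should reject I(1) decisively,
# while a driftless random walk typically should not.
rng = np.random.default_rng(0)
ts = 0.01 * np.arange(200) + 0.1 * rng.standard_normal(200)
rw = np.cumsum(0.1 * rng.standard_normal(200))
stat_ts, stat_rw = df_test(ts, lags=1), df_test(rw, lags=1)
```

The non-standard distribution of this t-ratio is why the tabulated Dickey-Fuller (or MacKinnon, 1991) critical values are used rather than the usual Student ones.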
Fig. 5: Cyclical components of some versions derived from the Hodrick-Prescott filter

[Five panels (GDP, CFI, INV, IMP and EXP), plotted over 1970-1994, comparing the cyclical components of the versions 1986.3, 1988.4, 1990.3 and 1994.2.]
According to the preferred functional form, a linear trend is included in both the I(0) and the I(1) components (see Johansen, 1992a, 1992b), for reasons analogous to those which make preferable, in a univariate framework, the model with trend as opposed to that with only the drift when the series is characterized by a clear tendency. In these cases, the tests performed in the model without trend would lead one to accept the hypothesis of I(1) even if the variable were in fact trend-stationary.

The results summarized in table 11 show that, while for IMP and EXP the hypothesis of cointegration is strongly accepted for all the preliminary versions, for the other couples of variables it is not possible to define, at a significance level of 5%, a common (stochastic) trend.

In particular, for INV the hypothesis of cointegration is accepted for all the versions considered, excluding that of 1986.3, for which the results of the tests based on the maximum eigenvalue (column 5) and on the trace (column 6) are contradictory. Such an ambiguity might originate in the low power of the test when the cointegrating relation is close to non-stationarity (see Johansen and Juselius, 1992).

The hypothesis of cointegration is accepted for all versions of CFI except the last one (1986.3), while for GDP we find i = 1 only for 1990.3 17. Combining these results with those highlighted in the previous paragraph, we may affirm that the revision process might have led to substantial changes in the long-term components of most series, while leaving their cyclical components virtually unaffected. This may be due to the fact that either the revision has no cycle, or its cycle has the same characteristics as that present in the early versions.

Such results indicate that occasional revisions may involve significant differences in the long-term components of data series. It should be stressed that the results obtained from 'non-updated' macroeconometric models vis-à-vis changes in national accounts - such as those related to rebasing, methodologies adopted or reference sources - might therefore be decidedly misleading.

4.3 Efficiency and unbiasedness of the revision process

Harvey et al. (1983) and Mankiw et al. (1984), amongst others, have noted that preliminary versions can be considered either as forecasts of the final version, or as preliminary measurements of the final version subject to a measurement error. In the former sense, the one considered in this work, a number of tests widely used, for example, to verify the hypothesis of rational expectations (unbiasedness, serial correlation, efficiency and orthogonality) may serve as an aid to a probabilistic interpretation of the differences between the preliminary versions of the data and the final one. In this sense, a common starting point is given by the following equation:

$$y_t^m = \alpha + \beta y_t^v + u_t, \qquad (1)$$

where $y_t^m$ and $y_t^v$ denote the final and the preliminary version of $y$ respectively. Equation (1) is traditionally interpreted as the basis for the unbiasedness test, which verifies the hypothesis $H_0: (\alpha, \beta) = (0, 1)$.

However, Wallis (1989) has pointed out that the test is in fact stricter: the maintained hypothesis should be identified with an efficient, though not necessarily fully efficient, predictor. In other words, the test actually verifies the hypothesis of efficiency (in a weak sense) of expectations. In fact, efficiency requires that the revision $(y_t^m - y_t^v)$ is unforeseeable from information at time $t$. Since part of this information includes (beyond the constant) the provisional version $y_t^v$, it follows that a finding of $\alpha \neq 0$ and/or $\beta \neq 1$ in (1) suggests an inefficient use of information.

Furthermore, unbiasedness requires that the mean revision is zero (Holden and Peel, 1990), which could occur even if the preliminary version were an inefficient predictor.

17 The differences in terms of cointegration properties might originate in the sharp re-evaluation of the service sector due to the revision process, in particular for those activities included in the branch of business services. This may have determined more marked changes in the intermediate consumption component vis-à-vis final consumption.
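The unbiasedness/weak-efficiency test on equation (1) is a Wald test of the joint restriction (α, β) = (0, 1). A minimal sketch with a plain OLS covariance matrix; the paper adjusts the covariance à la Hansen-Hodrick for MA(h−1) residuals, a refinement omitted here, and the data below are synthetic:

```python
import numpy as np

def unbiasedness_wald(final, prelim):
    """OLS of the final version on a preliminary one (equation (1)),
    plus the Wald chi2(2) statistic for H0: (alpha, beta) = (0, 1)."""
    final = np.asarray(final, dtype=float)
    prelim = np.asarray(prelim, dtype=float)
    X = np.column_stack([np.ones_like(prelim), prelim])
    b, *_ = np.linalg.lstsq(X, final, rcond=None)
    u = final - X @ b
    s2 = u @ u / (len(final) - 2)
    V = s2 * np.linalg.inv(X.T @ X)          # plain OLS covariance (no MA correction)
    r = b - np.array([0.0, 1.0])             # deviation from (0, 1)
    return b, float(r @ np.linalg.solve(V, r))

# Illustrative: an unbiased preliminary version versus one with a constant bias.
rng = np.random.default_rng(3)
prelim = 5.0 + rng.standard_normal(300)
_, w_good = unbiasedness_wald(prelim + 0.01 * rng.standard_normal(300), prelim)
_, w_bad = unbiasedness_wald(prelim + 0.5 + 0.01 * rng.standard_normal(300), prelim)
```

Under the null the statistic is compared with χ²(2) critical values, exactly as in the χ²(2) column of table 12.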
Tab. 11: Johansen maximum likelihood cointegration analysis: tests of the cointegration rank between the final and different preliminary versions*

Series | Version | N. of lags in the VAR | λmax (i=1) | λmax (i=2) | Trace (i=1) | Trace (i=2) | N. of coint. vectors
GDP | 1990.3 | 2 | 19.02 | 9.49 | 28.50 | 9.49 | 1
GDP | 1988.4 | 2 | 16.27 | 9.29 | 25.56 | 9.29 | 0
GDP | 1986.3 | 2 | 10.29 | 3.92 | 14.21 | 3.92 | 0
CFI | 1990.3 | 5 | 28.04 | 6.92 | 34.95 | 6.92 | 1
CFI | 1988.4 | 2 | 22.55 | 5.90 | 28.04 | 5.90 | 1
CFI | 1986.3 | 2 | 12.44 | 6.60 | 19.04 | 6.60 | 0
INV | 1990.3 | 2 | 40.31 | 5.97 | 46.28 | 5.97 | 1
INV | 1988.4 | 2 | 30.06 | 6.99 | 30.06 | 6.99 | 1
INV | 1986.3 | 2 | 14.02 | 9.59 | 23.61 | 9.59 | 0-1
IMP | 1990.3 | 2 | 26.90 | 4.11 | 31 | 4.11 | 1
IMP | 1988.4 | 2 | 24.75 | 4.81 | 29.56 | 4.81 | 1
IMP | 1986.3 | 5 | 26.74 | 8.79 | 35.53 | 8.79 | 1
EXP | 1990.3 | 2 | 19.77 | 8.20 | 26.01 | 8.20 | 1
EXP | 1988.4 | 1 | 45.83 | 7.71 | 53.55 | 7.71 | 1
EXP | 1986.3 | 1 | 44.44 | 5.92 | 50.37 | 5.92 | 1

The λmax statistic is $-n \ln(1 - \hat{\lambda}_i)$ and the trace statistic is $-n \sum_{j \geq i} \ln(1 - \hat{\lambda}_j)$.

* 5% critical values are 18.96 and 12.25 for the λmax test, respectively for i = 1 and i = 2. For the trace test the 5% critical values are 25.32 and 12.25.
Weak efficiency is a sufficient, but not a necessary, condition for unbiasedness 18. As a direct test of bias we can consider estimating a regression of the revision on a constant:

$$y_t^m - y_t^v = \alpha_1 + \varepsilon_t \qquad (2)$$

If the constant is statistically different from zero, we reject the hypothesis of unbiasedness. Note that weak efficiency and unbiasedness are particular cases of the orthogonality test: the orthogonality condition requires that revisions be unpredictable using the information available at the time of the forecast.

Two points must be stressed:

• Brown and Maital (1981) have shown that the hypotheses of unbiasedness and weak efficiency might well be valid even if the residuals in (1) and (2) are serially correlated. In fact, as the forecast is made h steps ahead, the residuals may be characterized by an MA(h-1) process. Following Hansen and Hodrick (1980), they propose to adjust the covariance matrix given by the OLS regression, obtaining correct standard errors for the coefficients. This is the estimation method used here.

• There is a difference of approach between these tests and those for cointegration. The former assume stationary variables and moving-average disturbances, while the latter assume non-stationary variables and disturbances which are NID(0, σ²). As Patterson and Heravi (1991) note, this difference does not appear to have been overcome yet in the literature.

The results of running the regressions given by (1) and (2) are reported in table 12. The test statistic for the null hypothesis is shown as a χ²(2) and a χ²(1) for the weak efficiency and unbiasedness tests respectively.

The null hypothesis of weak efficiency is surprisingly accepted only for INV and the 1988.4 version, while unbiasedness is found for CFI and INV for all versions excluding 1986.3. On the contrary, EXP is always, though marginally, unbiased. The values given by α₁ provide measures of the mean bias.

Compared with the version 1986.3, the aggregate which underwent the sharpest growth was INV (+30.0%), followed by GDP (+12.1%) and the other variables, whose quotas - in real terms - grew by between 5% and 6%. In most of these cases, such re-evaluations caused changes in the level of the series, while not substantially altering their cyclical and long-term components.

5 Conclusions

This paper has documented and discussed the revisions to the current prices, seasonally adjusted expenditure components of Italian gross domestic product published over the range 1985-1994 and, for the same aggregates, to various versions of constant prices, seasonally adjusted data.

A number of conclusions have been drawn, concerning the relative sizes of the absolute revisions to different components, the biases in the preliminary estimates and the evolutionary patterns followed by the revision process. We observe that:

• virtually all important components of GDP experienced significant revisions;

• GDP, CFI and INV have marked downward biases in their preliminary estimates, while IMP and EXP are touched relatively less by the revision process;

• these results are generally confirmed by an econometric comparison of a number of constant prices versions with the last published figures;

• the short-medium term dynamics of three representative constant prices versions remained substantially unaffected by the revision process;

• the permanent component has been strongly affected by the 1987 historical revision; the weak efficiency and unbiasedness tests suggest that after that date the revision process behaves in a more regular way.

Obviously, the present paper is only a beginning effort in the analysis of quarterly national accounts revisions, and many other important issues remain to be investigated:

18 There is no inconsistency, however, between a finding of rejection of weak efficiency and an acceptance of the hypothesis of unbiasedness.
• are the seasonally adjusted estimates revised more or less, on average, than the unadjusted estimates, and are the biases stronger or weaker?

• is there any monotonous pattern in the sequence of successive revisions that could be used to reduce the average size of the corrections?

• what about the other national accounts aggregates?

• what does the revision process tell us about the behaviour of growth rates? Have the revisions to the levels left the growth rates substantially unaffected?

Another related issue involves the use of preliminary estimates in econometric models. Such a problem has received much attention in the literature for its obvious implications on the reliability of the conclusions that an applied economic researcher can draw (see, among others, Grimm and Hirsch, 1983, Howrey, 1984, Trivellato and Rettore, 1986, Nijman, 1989).
Tab. 12: Efficiency and unbiasedness tests*

Series | Version | α̂ | β̂ | χ²(2) | α̂₁ | χ²(1)
GDP | 1990.3 | 0.2309 (0.0322) | 0.9813 (0.0027) | 102.0085 [0.00] | 0.0047 (0.0021) | 5.0379 [0.02]
GDP | 1988.4 | 0.2205 (0.0221) | 0.9822 (0.0018) | 133.8009 [0.00] | 0.0053 (0.0021) | 6.2547 [0.01]
GDP | 1986.3 | -3.5386 (0.2686) | 1.3073 (0.026) | 1992.6460 [0.00] | 0.1219 (0.0212) | 33.1358 [0.00]
CFI | 1990.3 | 0.1435 (0.0461) | 0.9878 (0.0039) | 10.9096 [0.00] | -0.0006 (0.0014) | 0.1641 [0.69]
CFI | 1988.4 | 0.0808 (0.0280) | 0.9932 (0.0023) | 8.7086 [0.01] | 0.0004 (0.0009) | 0.1380 [0.71]
CFI | 1986.3 | -4.2053 (0.0924) | 1.3641 (0.0080) | 2231.6148 [0.00] | 0.0605 (0.0256) | 5.5802 [0.02]
INV | 1990.3 | -0.1550 (0.0400) | 1.0147 (0.0038) | 19.5871 [0.00] | 0.0010 (0.0009) | 1.2729 [0.26]
INV | 1988.4 | -0.0907 (0.0812) | 1.0086 (0.0013) | 2.3498 [0.31] | 0.0005 (0.0005) | 0.9906 [0.32]
INV | 1986.3 | 0.9025 (0.8746) | 0.9417 (0.0844) | 983.5052 [0.00] | 0.3035 (0.0100) | 880.6516 [0.00]
IMP | 1990.3 | -0.1750 (0.1938) | 1.0140 (0.0184) | 22.6953 [0.00] | -0.02719 (0.0070) | 15.2612 [0.00]
IMP | 1988.4 | -0.4248 (0.1360) | 1.0378 (0.0013) | 73.2108 [0.00] | -0.0259 (0.0075) | 12.0318 [0.00]
IMP | 1986.3 | 2.5406 (0.0823) | 0.7611 (0.0079) | 2042.6419 [0.00] | 0.0487 (0.0313) | 2.4225 [0.12]
EXP | 1990.3 | 0.1868 (0.0457) | 0.9827 (0.0044) | 45.9330 [0.00] | 0.0058 (0.0034) | 2.8680 [0.09]
EXP | 1988.4 | 0.2351 (0.0350) | 0.9779 (0.0034) | 90.4598 [0.00] | 0.0057 (0.0040) | 2.0602 [0.15]
EXP | 1986.3 | 1.8608 (0.0571) | 0.8250 (0.0056) | 3101.9622 [0.00] | 0.0575 (0.0287) | 4.0067 [0.005]

* Columns 3 to 5 refer to equation (1), columns 6 and 7 to equation (2). Standard errors are reported in parentheses below the coefficients; marginal significance levels of the χ² tests are given in square brackets. The χ²(2) test verifies the hypothesis H₀: (α, β) = (0, 1) in (1); the χ²(1) test verifies the hypothesis H₀: α₁ = 0 in (2).
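The mean-revision test of equation (2), with the MA(h−1)-robust covariance adjustment of Hansen and Hodrick (1980) discussed in section 4.3, can be sketched as follows. This is a numpy sketch with synthetic data, not the authors' code; h denotes the forecast horizon:

```python
import numpy as np

def mean_revision_test(revision, h):
    """chi2(1) test of a zero mean revision (equation (2)), with the long-run
    variance truncated at lag h-1 as in Hansen and Hodrick (1980), so the
    statistic is robust to the MA(h-1) structure of h-step-ahead forecast errors."""
    r = np.asarray(revision, dtype=float)
    n = len(r)
    alpha1 = r.mean()
    u = r - alpha1
    gammas = [u @ u / n] + [u[j:] @ u[:-j] / n for j in range(1, h)]
    lrv = gammas[0] + 2.0 * sum(gammas[1:])   # truncated long-run variance
    lrv = max(lrv, 1e-12)     # the truncated estimator can turn negative; guard (our assumption)
    chi2_1 = n * alpha1 ** 2 / lrv
    return alpha1, chi2_1

# Illustrative: a zero-mean revision series versus one with a small systematic bias.
rng = np.random.default_rng(4)
_, c_zero = mean_revision_test(0.5 * rng.standard_normal(240), h=2)
_, c_bias = mean_revision_test(0.02 + 0.01 * rng.standard_normal(240), h=2)
```

The statistic plays the role of the χ²(1) column in table 12; a Newey-West weighting would avoid the negative-variance problem the guard works around.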
References

Biggeri L. and U. Trivellato (1991), "The evaluation of errors in national accounts data: provisional and revised estimates", Survey Methodology, 17, 1, pp. 99-119.

Brown B.W. and S. Maital (1981), "What do economists know? An empirical study of experts' expectations", Econometrica, 49, 2, pp. 491-504.

Dickey D.A. and W.A. Fuller (1979), "Distribution of the estimators for autoregressive time series with a unit root", Journal of the American Statistical Association, 74, pp. 427-431.

Dickey D.A. and W.A. Fuller (1981), "Likelihood ratio statistics for autoregressive time series with a unit root", Econometrica, 49, pp. 1057-1072.

Di Fonzo T. (1987), La stima indiretta di serie economiche trimestrali, Padova, Cleup.

Fuller W.A. (1976), Introduction to statistical time series, New York, Wiley.

Giovannini E. (1993), "L'informazione statistica per le decisioni: alcune linee generali e un esempio illustrativo", Quaderni di Ricerca. Interventi e Relazioni, Istat, Sistema Statistico Nazionale.

Glejser H. and P. Schavey (1974), "An analysis of revisions on national accounts data for 40 countries", The Review of Income and Wealth, 20, pp. 317-332.

Grimm B.T. and A.A. Hirsch (1983), "The impact of the 1976 NIPA benchmark revision on the structure and predictive accuracy of the BEA quarterly econometric model", in M.F. Foss (ed.), The U.S. National Income and Product Accounts. Selected topics, National Bureau of Economic Research, Studies in Income and Wealth, 47, pp. 333-382.

Hansen L.P. and R.J. Hodrick (1980), "Forward exchange rates as optimal predictors of future spot rates: an econometric analysis", Journal of Political Economy, 88, pp. 829-853.

Harvey A.C. (1989), Forecasting, structural time series and the Kalman filter, Cambridge, Cambridge University Press.

Harvey A.C., C.R. McKenzie, D.P.C. Blake and M.J. Desai (1983), "Irregular data revisions", in A. Zellner (ed.), Applied time series analysis of economic data, Washington D.C., US Bureau of the Census, pp. 329-347.

Hibbert J. (1981), "Revisions to quarterly estimates of gross domestic product", Economic Trends, 331, pp. 82-87.

Hodrick R. and E. Prescott (1980), "Post-war U.S. business cycles: an empirical investigation", Carnegie Mellon University, mimeo.

Holden K. and D.A. Peel (1990), "On testing for unbiasedness and efficiency of forecasts", The Manchester School, 58, pp. 120-127.

Howrey E.P. (1984), "Data revision, reconstruction, and prediction: an application to inventory investment", The Review of Economics and Statistics, 66, 3, pp. 386-393.

Istat (1985), "I conti economici trimestrali. Anni 1970-1984", Supplemento al Bollettino Mensile di Statistica, 12.

Istat (1987), "Miglioramenti apportati ai conti economici trimestrali. Serie con base dei prezzi 1970", Collana d'Informazione, 4.

Istat (1990), "Nuova contabilità nazionale", Annali di Statistica, vol. 9.

Istat (1991), "Conti economici delle imprese con meno di 10 addetti. Anni 1986 e 1988", Collana d'Informazione, 45.

Istat (1992), "I conti economici trimestrali con base 1980", Note e Relazioni, 1.

Istat (1993), "The underground economy in Italian economic accounts", Annali di Statistica, serie X, vol. 2.

Jarque C.M. and A.K. Bera (1980), "Efficient tests for normality, homoscedasticity and serial correlation", Economics Letters, 6, pp. 255-259.

Johansen S. (1988), "Statistical analysis of cointegration vectors", Journal of Economic Dynamics and Control, 12, pp. 231-254.

Johansen S. (1989), Likelihood based inference on cointegration. Theory and applications, Venezia, Cafoscarina.

Johansen S. (1992a), "Determination of the cointegration rank in the presence of a linear trend", Oxford Bulletin of Economics and Statistics, 54, pp. 383-397.

Johansen S. (1992b), The role of the constant term in cointegration analysis of nonstationary variables, Institute of Mathematical Statistics, University of Copenhagen, Preprint 92-1.

Johansen S. and K. Juselius (1992), "Testing structural hypotheses in a multivariate cointegration analysis of the PPP and the UIP for UK", Journal of Econometrics, 53, pp. 211-244.

Ljung G.M. and G.E.P. Box (1978), "On a measure of lack of fit in time series models", Biometrika, 65, pp. 297-303.

Lupi C. (1992), "Measures of persistence, trends and cycles in I(1) time series: a rather skeptical viewpoint", Applied Economics Discussion Paper, 137, Institute of Economics and Statistics, University of Oxford.

Lütkepohl H. (1991), Introduction to multiple time series analysis, Berlin, Springer-Verlag.

MacKinnon J.G. (1991), "Critical values for cointegration tests", in R.F. Engle and C.W.J. Granger (eds.), Long-run economic relationships: readings in cointegration, Oxford, Oxford University Press, pp. 267-276.

Mankiw N.G., D.E. Runkle and M.D. Shapiro (1984), "Are preliminary announcements of the money stock rational forecasts?", Journal of Monetary Economics, 14, pp. 15-27.

Mardia K.V. (1980), "Tests for univariate and multivariate normality", in Handbook of Statistics, vol. I, Amsterdam, North-Holland, pp. 279-320.

Morgenstern O. (1963), On the accuracy of economic observations, Princeton, Princeton University Press.

Nijman T. (1989), "A natural approach to optimal forecasting in case of preliminary observations", mimeo.

Patterson K.D. (1992), "Revisions to the components of the trade balance for the United Kingdom", Oxford Bulletin of Economics and Statistics, 54, 3, pp. 103-120.

Patterson K.D. and S.M. Heravi (1991), "Data revisions and the expenditure components of GDP", The Economic Journal, 101, 407, pp. 887-901.

Patterson K.D. and S.M. Heravi (1992), "Efficient forecasts or measurement errors? Some evidence for revisions to United Kingdom GDP growth rates", The Manchester School, 60, 3, pp. 249-263.

Pisani S. and G. Savio (1994), "Le revisioni delle serie storiche trimestrali di Contabilità Nazionale", in Istat, Avanzamenti metodologici e statistiche ufficiali, Roma, pp. 203-229.

Rao J.N.K., K.P. Srinath and B. Quenneville (1989), "Estimation of level and change using current preliminary data", in D. Kasprzyk, G. Duncan, G. Kalton and M.P. Singh (eds.), Panel surveys, New York, Wiley, pp. 457-479.

Reimers H.E. (1993), "Lag order determination in cointegrated VAR systems with application to small German macro-models", paper presented at the Econometric Society European Meeting '93, Uppsala, August, mimeo.

Rushbrook J.A. (1978), "Rebasing the national accounts - why and how and the effects", Economic Trends, 293, pp. 105-110.

Smith P. (1977), An analysis of the revisions to the Canadian National Accounts, Statistics Canada, Econometric Research Staff Working Paper.

Trivellato U. (ed.) (1986), Errori nei dati preliminari, previsioni e politiche economiche, Padova, Cleup.

Trivellato U. (ed.) (1987), Attendibilità e tempestività delle stime di contabilità nazionale, Padova, Cleup.

Trivellato U. and E. Rettore (1986), "Preliminary data errors and their impact on the forecast error of simultaneous-equation models", Journal of Business & Economic Statistics, 4, pp. 445-453.

Wallis K.F. (1989), "Macroeconomic forecasting: a survey", The Economic Journal, 99, pp. 28-64.

Young A.H. (1993), "Reliability and accuracy of the quarterly estimates of GDP", Survey of Current Business, 73, 10, pp. 29-43.

Zellner A. (1958), "A statistical analysis of provisional estimates of gross national product and its components, of selected national income components, and of personal savings", Journal of the American Statistical Association, 53, 281, pp. 54-65.
APPENDIX

Actual Publication Schedule of Italian Quarterly National Accounts Aggregates

[Chart: for each published version (quarterly, 1984.4-1994.2), the quarters of reference 1970.1-1994.2 are marked "x" where a new estimate was published and "." where no estimate was published.]

x: new estimate
.: no published estimate
La relation entre les comptes annuels et les comptes trimestriels à l’INSEE
Virginie MADELIN
INSEE
1 Introduction
travaillent à un niveau relativement agrégé (12
branches, 13 produits), des déformations sensibles
apparaissent entre la structure de leur TES (Tableaux
Entrées-Sorties) et celle du TES agrégé des
comptables annuels. De même, l’absence d’indicateur
infra-annuel pour certaines branches de l’activité
entraîne des erreurs dans la construction des comptes
trimestriels. Enfin, les comptables annuels privilégient
les comptes aux prix de l’année précédente, alors que
les comptables trimestriels raisonnent aux prix de
l’année de base (1980), ce qui entraîne, lorsqu’on
s’éloigne de l’année de base, des distorsions dans les
estimations.
L’élaboration de la comptabilité nationale française se
caractérise par le fait que les méthodes de construction
des comptes trimestriels et des comptes annuels sont
fondamentalement différentes, mais qu’il existe
néanmoins de fortes interactions et une cohérence
temporelle entre les deux systèmes. Les comptes
annuels sont bâtis à partir d’une approche exhaustive
qui exploite toutes les informations statistiques et
comptables disponibles et travaille à un niveau très fin
de la nomenclature (600 et 90 postes). Les comptes
trimestriels privilégient les enchaînements temporels
entre les grandeurs macro-économiques en s’appuyant
sur une information plus agrégée. Ils reposent sur une
méthode économétrique.
Les comptes trimestriels n’interviennent plus dans le
processus d’élaboration des comptes d’une année lors
des exercices de révision suivants effectués durant
trois années pour prendre en compte l’enrichissement
de l’information disponible. Ils deviennent des clients
des comptables annuels et introduisent dans leur base
de donnée les évaluations révisées des trois dernières
années.
Les comptables trimestriels n’interviennent dans le
processus d’élaboration des comptes annuels que lors
de la première évaluation d’une année en mars de
l’année suivante. L’information dont disposent alors
les comptables annuels est confrontée à celle qui a été
traitée et structurée en termes de comptabilité
nationale par les comptables trimestriels. L’approche
plus conjoncturelle des comptables trimestriels
s’enrichit du détail de l’analyse des comptables
annuels pour aboutir à une évaluation commune, à
quelque détails près, de l’année écoulée. Elle
contribue ainsi à remettre en question certaines
méthodes ou pratiques d’évaluation mais aussi
l’interprétation des comptes nationaux. Lors de cette
période de concertation, des problèmes récurrents sont
rencontrés. Comme les comptables trimestriels
Cette orthogonalité entre les deux méthodes
d’élaboration assure qu’il n’existe pas de biais
auto-entretenu dans l’approche économétrique des
comptes trimestriels.
The French system of national accounts comprises two articulated components: the annual accounts on the one hand, the quarterly accounts on the other. The former aim to provide annual macroeconomic information with the maximum of precision and detail. The latter aim to refine and anticipate this information at an infra-annual frequency. Although they work at different levels of detail, both rest on an integrated accounting system and meet the same consistency and balancing constraints. The fundamental difference between the two systems lies in the way they are compiled: the annual accounts are built by mobilising statistical sources with exhaustive (or made-exhaustive) coverage, which are successively brought into conceptual consistency and then confronted and arbitrated within a single accounting framework, year by year. The quarterly accounts are based on an econometric and accounting method which "quarterises" the annual account and makes forecasting possible; the information is compiled series by series, using indicators, possibly partial ones, whose essential quality is good representativeness with respect to the corresponding annual series.
disparaître l’essentiel. Ceci est décrit en troisième
partie.
Les comptes annuels et les comptes trimestriels
comprennent des tableaux entrées-sorties, des
comptes de secteurs institutionnels et des tableaux des
opérations financières en flux et en encours, dont
l’articulation est plus ou moins élaborée selon qu’il
s’agit des comptes annuels ou trimestriels. Par souci
de simplification, on s’est centré par la suite sur la
synthèse relative aux biens et services.
2
Comment le compte provisoire des
comptables annuels est-il construit ?
Dans le système de comptabilité nationale annuelle,
les comptes de l’année n sont établis au printemps de
l’année n+1, dans une version dite provisoire à partir
de statistiques souvent partielles sur les produits et
d’informations conjoncturelles qualitatives. Ces
comptes sont ensuite révisés trois années de suite et
donnent lieu à des versions dites semi-définitives 1
puis 2 et enfin définitive. Les comptes pour les années
n-3 (définitif), n-2 et n-1 (semi-définitifs) et
n (provisoire), sont élaborés lors de la “campagne de
compte” qui s’étend de juillet n à avril n+1 et sont
publiés dans le rapport sur les comptes de la Nation en
juin n+1.
Malgré ces différences sensibles dans leur
méthodologie, les comptes annuels et trimestriels sont
totalement articulés et cohérents. Pour leur première
version d’un compte (en mars n+1), dite provisoire, les
comptables annuels bénéficient de la réflexion
conjoncturelle qu’ont déjà menée les comptables
trimestriels mais apportent une information plus riche
et plus détaillée. La confrontation des méthodes, des
sources, des résultats et des diagnostics permet une
convergence sur l’essentiel des éléments de compte.
Les révisions sont rendues nécessaires par
l’enrichissement au fil du temps de l’information
statistique et économique ; en outre, la plus grande
disponibilité des données de base permet un plus grand
raffinement des méthodes utilisées.
Seule la version provisoire de l’année n donne lieu à
une confrontation avec les résultats des comptes
trimestriels, ces derniers étant exactement calés pour
les années antérieures sur les comptes nationaux. Ce
sont donc essentiellement les méthodes utilisées au
compte provisoire que l’on va développer.
Au fur et à mesure que l’information statistique et
comptable sur l’année se développe, les comptables
annuels revoient leurs premières estimations ; chaque
année est ainsi révisée trois fois. Les comptes
trimestriels se calent alors sur ces nouvelles
évaluations et revoient leurs relations économétriques.
Cette façon de faire assure que les comptes trimestriels
ne sont pas entachés de biais auto-entretenus.
La méthode générale d’élaboration des comptes
annuels français, consiste à évaluer le produit
intérieur brut (PIB) selon trois approches [1]. La
première est l’approche dite production dans laquelle
on établit les comptes de production en 90 branches
d’activité, c’est-à-dire au niveau 90 de la
nomenclature d’activités et de produits (NAP). Dans
la seconde, dite par la dépense, on élabore les
Dans la pratique, les comptables annuels élaborent une
première version de leurs comptes annuels de façon
presque entièrement indépendante des comptables
trimestriels. C’est pourquoi les méthodes de base de
chacun des intervenants sont d’abord décrites dans les
deux premières parties. Les différences de méthodes
génèrent naturellement des écarts entre les
évaluations, dont la concertation tend à faire
34
TABLE OF CONTENTS
équilibres Ressources-Emplois (ERE) des produits au
niveau 600 de la NAP. Dans la troisième, dite par le
revenu, on établit les comptes d’exploitation des
branches d’activité (CEB) au niveau 40 de la NAP, à
partir des comptes de secteurs institutionnels.
These three approaches are not worked out independently of one another; on the contrary, they are confronted permanently. Such a principle requires numerous intermediate adjustments and arbitrations, which make it possible to arrive at a single estimate of GDP. The accounts are compiled each year within the framework of the input-output table (TES) and of the articulation between the operating accounts by institutional sector and by branch. The TES is evaluated for 90 branches and products, in three price systems (at current prices, at the previous year's prices and at 1980 prices) and in two taxation systems (excluding all taxes, and including non-deductible VAT) (Chart 1).
In the production approach, GDP at market prices is the sum of the value added of the various economic agents, whether they are viewed as branches or as institutional sectors, plus the taxes on products not already included in producers' value added. The synthesis of this approach is carried out within the TES, through the establishment of the production accounts by branch, which give the effective output of the branches, their intermediate consumption and their value added.

This gives:

GDP = sum of the value added of the branches
    + VAT on products
    + import taxes (net of import subsidies)

The evaluation procedures for these items depend on the nature of the available sources. Three different situations can be distinguished:

- complete accounting information exists and is judged sufficiently reliable in itself: an integrated analysis of this information allows it to be carried over to all the non-financial transactions of the national accounts. The values of output and of intermediate consumption are obtained simultaneously. At the time of the provisional account, this method concerns the institutional sectors of general government, credit institutions and insurance enterprises. This information is said to be exogenous;

- output is evaluated from knowledge of the products: this procedure mainly concerns agricultural output;

- for the other institutional sectors, at the time of the provisional account there are no sufficiently developed accounting sources to allow an integrated evaluation of the accounts by sector. Branch output is evaluated at this stage from short-term sources: quarterly or monthly branch surveys, industrial production indices, and so on.

In compiling the provisional account, for lack of sufficient information, the income approach is not used. There is neither an operating account by branch nor a breakdown by activity sub-sector of the transactions of non-financial corporations and quasi-corporations and of unincorporated enterprises.

The table of intermediate uses is drawn up in two stages:

- First, the available partial information is put in place, where the total intermediate consumption of a branch is known, or where the amounts of certain cells are known.

- The other cells of the table are then obtained by projecting the TES at the previous year's prices, under the assumption that the technical coefficients are fixed from one year to the next.

The evaluations of the production accounts by branch are confronted with those obtained in the demand approach.

In the demand approach, GDP at market prices is measured by the value of the goods and services devoted to final uses, net of the value of imports. In practice, supply and demand are evaluated for each product at level 600 of the NAP, within the resources-uses balances that feed the TES (cf. Chart 2). This gives:

GDP = final consumption
    + gross fixed capital formation (GFCF)
    + changes in inventories
    + exports of goods and services (net of imports)
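The fixed-coefficient projection used for the remaining cells of the intermediate-uses table can be sketched numerically. Everything below (the 3x3 table and the output figures) is invented for illustration; only the mechanics of the projection follow the text:

```python
# Sketch (invented figures): projecting the intermediate-use table at the
# previous year's prices under fixed technical coefficients.
import numpy as np

# Year n-1: branch output and intermediate uses (product i used by branch j).
output_prev = np.array([100.0, 80.0, 60.0])
inter_prev = np.array([[10.0,  8.0,  3.0],
                       [ 5.0, 16.0,  6.0],
                       [20.0,  4.0,  9.0]])

# Technical coefficients: intermediate use per unit of output of each branch.
coeff = inter_prev / output_prev

# Year-n output at previous-year prices; unchanged coefficients give the
# projected cells of the intermediate-use table.
output_curr = np.array([104.0, 78.0, 63.0])
inter_proj = coeff * output_curr
print(inter_proj.round(2))
```

Each projected cell simply scales with the output of its using branch, which is exactly the "fixity of technical coefficients from one year to the next" assumption.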
The distributed output of the product is calculated from the effective output of the branch. After taking account of the foreign-trade data supplied by the Directorate-General of Customs, domestic demand is then evaluated, broken down between households (final consumption and changes in trade inventories), the intermediate demand of enterprises and government (intermediate consumption and changes in user and trade inventories) and the final demand of these same agents (gross fixed capital formation and changes in trade inventories). These evaluations are carried out from the available short-term sources.

At this level of detail, there are a good number of cases in which only one of these categories of use exists. These balances are then aggregated to level 90, that of the TES, which then allows a better estimate of changes in inventories.

At the end of these two stages of work, bringing the evaluations into consistency within the TES requires iterative arbitrations. Projecting the table of intermediate exchanges, at the previous year's prices, in the production approach yields an intermediate consumption that is potentially different from the one obtained directly in the demand approach.

In addition, the resources-uses balances provide an estimate of GFCF by product, which has to be confronted with the one obtained in the institutional-sector accounts. The annual accounts are drawn up in value terms, at the previous year's prices, and at 1980 prices by chaining the previous year's price indices.

After these arbitrations, a first complete estimate of GDP is obtained, which is then compared with the estimate established by the quarterly accountants using radically different methods, though very often from very similar sources.

3 Methodology of the quarterly accounts

The accounts of quarter t, describing transactions in goods and services at 1980 prices, are published in the middle of quarter t+1 in a "light" version, the "Premiers Résultats". The "Résultats Détaillés", in quarter t+2, give detailed information at current prices and at 1980 prices for transactions in goods and services, as well as elements of the accounts of corporations and quasi-corporations and of households. A first panorama of year n is thus published in February n+1, and then in detail in April n+1. It is for this last publication that the concertation between annual and quarterly accountants takes place.

The basic method used in the quarterly accounts consists in linking infra-annual or short-term statistical information econometrically to an accounting aggregate [2]. With each accounts series is associated a series of quarterly periodicity, available within the deadlines for compiling the accounts and whose movement is similar to that of the accounting item: this series is called the indicator of the account. Most often, the quarterly indicator used differs from the evaluation of the accounting figure for reasons of definition and of coverage.

The accounts are first compiled in volume terms (in the case of the quarterly accounts, at 1980 prices), from volume indicators. They are then evaluated in value terms. They are, moreover, seasonally adjusted (CVS) accounts, built directly from indicators that are themselves seasonally adjusted. This way of correcting the seasonality of the accounts was preferred to first building raw accounts and then seasonally adjusting them, because the seasonality of the indicator does not necessarily coincide with that of the economic phenomenon being measured. The quarterly accounts exist only in a seasonally adjusted version; they are not calculated in raw form, nor are they adjusted for working days (CJO).

More precisely, the quarterly accounts are compiled in three stages. The first two, benchmarking ("étalonnage") and anchoring to the annual accounts ("calage"), deal with completed years. The last consists in evaluating the accounts for the period separating the last anchored account from the present.

Benchmarking is the transformation of the quarterly indicators into elements of the quarterly accounts. It consists of an econometric adjustment which produces a quarterly national-accounts figure. One assumes that there exists an econometric relation
that is simple (and not dynamic) and that links, over the past, the annual movement of the account to the chosen annualised indicator. Estimated on annual data, this relation is assumed to remain valid on quarterly data. This assumption leads to the seasonal adjustment being carried out not on the benchmarked quarterly accounts but on the indicator itself.
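A minimal numerical sketch of this benchmarking step, with invented figures. An ordinary least-squares fit stands in for the actual econometric adjustment, and spreading the constant evenly over the four quarters is an assumption of this sketch, not a specification taken from the paper:

```python
# Toy "étalonnage": fit a static linear relation on ANNUAL data, then apply
# it to the quarterly (seasonally adjusted) indicator. All figures invented.
import numpy as np

indicator_q = np.array([98.0, 101.0, 103.0, 102.0,    # year 1, SA quarterly
                        104.0, 106.0, 105.0, 107.0])  # year 2
account_a = np.array([820.0, 862.0])                  # annual account values

indicator_a = indicator_q.reshape(-1, 4).sum(axis=1)  # annualised indicator
slope, intercept = np.polyfit(indicator_a, account_a, 1)

# Apply the same relation quarter by quarter; the constant is spread
# evenly over the four quarters (illustrative convention).
account_q = intercept / 4 + slope * indicator_q
print(account_q.round(1))
```

By construction here, the four quarterly values of each past year sum back to the annual account, which is what makes the later anchoring ("calage") step a small correction rather than a large one.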
Anchoring ("calage") is the process of bringing the quarterly account elements thus obtained into line, over the past, with the annual accounts. It is carried out systematically only for the semi-final and final versions of the annual accounts. The observed gap is distributed each year proportionally over each quarter of the year, by what is called the smoothing procedure.

When moving on to the evaluation of the current year, the econometric relation established for compiling the benchmarked quarterly accounts is used to obtain the account of the current quarter. Moreover, to preserve the movement between the account of the last quarter of the year that could be anchored and the accounts of the following quarters, and thus avoid a "staircase step", the last gap is carried over to the subsequent quarterly accounts. The overall procedure consists in articulating these operations so as to obtain accounts in volume, value and price terms; it is organised as described in Diagram 1.

Diagram 1: The overall procedure. The raw volume indicator and the raw price indicator are each seasonally adjusted. The seasonally adjusted volume indicator is benchmarked and anchored to give the volume account. Multiplying the volume account by the seasonally adjusted price indicator gives a seasonally adjusted value indicator, which is in turn benchmarked and anchored to give the value account. The price account is obtained by dividing the value account by the volume account.

The quarterly accounts are subject to revisions as their successive publications appear. First, because the annual accountants also revise their estimates; on those occasions the econometric relations are re-estimated, while preserving the infra-annual profiles. Next, because the quarterly accounting figure itself may be modified, either through the rectification of a first estimate of the indicator or through an update of the seasonal adjustment. Finally, because the calculation methodology of the quarterly accounts may be changed, owing to the disappearance of a source or the adoption of a new indicator regarded as better.

The quarterly accounts are organised first by base domains (production, household consumption, GFCF and foreign trade) and by the synthesis domains of the TES, which serve to evaluate the series not calculated in the base domains. For the base domains, the sources used are very close to those of the annual accounts; the estimates are generally made at least at level 40 of the NAP. In the TES, for a given branch, intermediate consumption by product is for the most part calculated at level 16 of the NAP. It is benchmarked on the output of the corresponding branch, sometimes with a time trend introduced. Only a few intermediate consumption items are treated differently (agriculture, for example, where direct indicators are used). There is also a TEE ("tableau économique d'ensemble", overall economic table), which is fairly rudimentary.
For the provisional account, priority is given to the quality of the macroeconomic analysis of the year that has just ended.
Comme on l’a déjà vu (cf. partie 1), les comptables
annuels
et
trimestriels
élaborent
quasiindépendamment leur première version pour l’année
écoulée n au printemps n+1. La confrontation de ces
premiers résultats complets fait apparaître des écarts
sur le PIB et ses contreparties, aussi bien dans
l’optique production (les valeurs ajoutées des
branches) que dans l’optique de la dépense (emplois
finals nets des importations). Toutes choses égales par
ailleurs, on est alors confronté, lors du rapprochement
entre les deux comptes, à une difficulté mécanique
inhérente à la nature matricielle du TES :
l’intervention sur un poste particulier de celui-ci a des
conséquences sur l’ensemble du tableau, qui plus est
divergentes parfois entre comptes trimestriels et
annuels. C’est pourquoi, là encore, les arbitrages sont
conduits de manière itérative.
The main exception, due to the gaps in statistical and short-term information in this field, arises in services.
In market services, first, the balance is built from the uses side. The intermediate consumption of market services by all branches taken together is evaluated directly, from ad hoc indicators. This then serves as the indicator for estimating the intermediate consumption of the product "market services" by each using branch. The sum of the uses of this product, and hence its output, can then be determined. For the intermediate consumption of the non-market branches, no infra-annual indicators are available. It is therefore estimated annually and then smoothed over the quarters.
Once these treatments have been carried out, the TES is balanced by calculating changes in inventories as the balance between total resources and total uses excluding inventories.
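The balancing rule just described amounts to a one-line identity; the figures below are invented and the resource and use headings are simplified for illustration:

```python
# Toy balancing of a goods-and-services account: changes in inventories are
# taken as the residual between total resources and total uses excluding
# inventories. All figures invented.
production, imports, margins, vat = 500.0, 120.0, 60.0, 45.0
intermediate, final_cons, gfcf, exports = 300.0, 280.0, 95.0, 40.0

resources = production + imports + margins + vat
uses_excl_stocks = intermediate + final_cons + gfcf + exports
stock_change = resources - uses_excl_stocks   # the table balances by construction
print(stock_change)
```

Because inventories absorb the whole residual, any estimation error elsewhere in the table ends up in this item, which is why the text later singles it out as a point on which convergence with the annual accounts is difficult.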
Les écarts sont d’abord générés par les différences
de méthodes qui sont totalement “orthogonales”. Là
où les comptables annuels essaient de serrer au plus
près, avec un grand niveau de détail, les informations
économiques et statistiques, les comptables
trimestriels se placent dans une perspective temporelle
et une logique économétrique. Une fois ces méthodes
mises en oeuvre, on constate des différences dont
certaines sont de nature structurelle (comme le fait de
travailler à un niveau de détail différent dans le TES)
et donc délicates à réduire, tandis que les autres, qui
portent sur des points particuliers (évolution des
marges
commerciales
dans
les
industries
agro-alimentaires par exemple) se résolvent par une
analyse économique approfondie. Ces deux types
d’écarts sont implicitement résolus par une procédure
d’arbitrage en deux phases [3].
4 When the time for concertation comes
The concertation between the annual and quarterly accountants is an eminently practical and concrete exercise. The aim of this part is therefore not to theorise about it but, on the contrary, to retrace how the quality of macroeconomic information is enriched by the concertation.
The systematic reconciliation of the two types of accounts is recent, and could moreover be extended to the semi-final 1 annual account, once the two teams have capitalised on the experience acquired in concertation.
L’objectif actuel n’est pas de caler exactement et en
détail un compte sur l’autre - ce qui ne poserait pas de
problèmes techniques une fois les principaux écarts
réduits -, mais d’obtenir un consensus sur le diagnostic
économique que l’on peut tirer de chacun d’eux
(tableau 1). Ainsi, on vise à obtenir des taux de
croissance globaux égaux, de faibles différences sur
les composantes du PIB étant tolérées. La priorité lors
The first phase involves the domain managers for each type of account (for example, those responsible for the annual and the quarterly evaluations of household consumption). First of all, in order to anticipate and minimise possible divergences in the economic diagnosis, information and concertation meetings are held early. As soon as the quarterly accountants are able to form a diagnosis on the past year, a little before the end of year n, they inform their annual counterparts. Then the "negotiations"
are relaunched, as the latter in turn obtain information that fits into their methodological framework. The concertation proper takes place in March of year n+1.
This necessary reconciliation is made in the second phase of the arbitration, which is a synthesis phase and as such involves those responsible for the synthesis, hence essentially for the TES in the case of GDP and its components. The discussions are then based on putting into perspective, and relating to one another, all the account elements previously examined in a more transversal way. The same problems are encountered in this phase as in the first, but exacerbated by the required coherence of the whole. In addition, recurring structural difficulties are met.
En restant sur l’exemple de la consommation des
ménages, on constatait en mars 1994 un écart spontané
de 0,5 point entre les deux estimations en volume
(tableau 2). L’étude approfondie des sources de base
et de leur interprétation, ainsi que des méthodes fines
employées, à un niveau plus détaillé que celui qui
apparaît dans le tableau, a permis de ramener cet écart
à 0,1 point (soit + 0,8 pour les comptes annuels et + 0,7
pour les comptes trimestriels). L’écart n’est pas
totalement réduit, dans la mesure où sa réduction
entraînait des problèmes d’équilibrage de certains
équilibres ressources-emplois (sur le tabac par
exemple). On butait dans ces cas sur le problème déjà
souligné de l’interconnexion de toutes les évaluations.
Another example is the automobile and transport-equipment branch, where the methods for building the balance are completely different. Output is a balancing item for the annual accounts (the balance is worked backwards), whereas it is estimated directly from the industrial production indices (IPI) for the quarterly accounts.
D’abord, le problème d’agrégation du TES . Pour les
comptes nationaux annuels comme trimestriels, on a
noté (cf. parties 1 et 2) que la synthèse s’articule autour
d’une projection du TES en volume. Du côté des
comptes annuels, elle s’appuie pour partie sur
l’hypothèse de stabilité des coefficients techniques
d’une année sur l’autre. Cette hypothèse est cependant
largement tempérée par le fait que pour un certain
nombre de produits (ou de branche), on connaît
directement les consommations intermédiaires
détaillées à partir des sources spécifiques. Ce sont les
cases fixées, qui représentent près de 60 % du volume
des consommations intermédiaires du TES annuel
pour un compte semi-définitif (et plus encore sur un
compte définitif).
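The interplay between fixed coefficients and fixed cells can be sketched as follows, with invented figures; the override mechanism is an illustration of the idea, not the actual compilation system:

```python
# Toy projection with "fixed cells": most cells follow fixed technical
# coefficients, but cells known directly from specific sources are imposed.
import numpy as np

output_prev = np.array([100.0, 80.0])
inter_prev = np.array([[10.0,  8.0],
                       [ 5.0, 16.0]])
coeff = inter_prev / output_prev

output_curr = np.array([104.0, 78.0])
proj = coeff * output_curr                 # default: fixed-coefficient projection

fixed_cells = {(0, 1): 8.4}                # cell (product 0, branch 1) known directly
for (i, j), value in fixed_cells.items():
    proj[i, j] = value                     # the fixed cell overrides the projection
print(proj.round(2))
```

The larger the share of fixed cells, the less the final table depends on the fixity assumption, which is the point made in the paragraph above.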
The discussion is sometimes complicated by the fact that the quarterly accountants build accounts benchmarked on seasonally adjusted indicators. This is of course not the case for the annual accountants. The same statistical source therefore sometimes leads to slightly different economic interpretations or estimates of account elements. The compilation of raw (non-seasonally-adjusted) quarterly accounts, currently being studied by the team in charge of them, will naturally make the arbitrations with the annual accounts easier.
Finally, an additional difficulty stems from the organisation of the teams, whose domains of coverage are not exactly the same, which complicates the dialogue. By way of illustration, there is one expert for quarterly production and another for quarterly GFCF, whereas the annual accountants are responsible for the whole balance for given products. In such cases it is, in a way, a matter of reconciling a column-wise view of the TES for the former with a row-wise view for the latter.
On the quarterly-accounts side, the assumption is slightly different, because the TES is built at a much more aggregated level (NAP 15) than in the annual accounts: the technical coefficients are calculated by applying the previously observed trend. Fixed cells are also evaluated, based on direct econometric relations. For the annual accounts the projection is carried out at level 90 of the NAP, whereas for the quarterly accounts it is carried out at level 15. When the second phase of the arbitration is reached, the annual TES must therefore be aggregated from level 90 to level 15 to compare it with that of the quarterly accountants. In doing so, structure effects introduce distortions relative to the case where the annual TES had been projected directly at level 15.
Indeed, if the TES is calculated at level 90 (referred to below as the level of the elementary branches) and then aggregated, the
intermediate consumption of a level-15 branch is the sum of the intermediate consumptions of its constituent elementary branches, weighted by their respective growth rates. If the TES is calculated directly at level 15, this amounts to summing the intermediate consumptions of the elementary branches of the aggregated branch while applying to them, uniformly, the growth rate of that aggregated branch.
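The structure effect just described can be reproduced with two invented elementary branches whose intermediate-consumption structure differs from their output structure:

```python
# Toy illustration of the aggregation effect: projecting branch by branch
# and then aggregating differs from projecting directly at the aggregated
# level. All figures invented.
import numpy as np

ic_base = np.array([40.0, 60.0])       # intermediate consumption, two elementary branches
out_base = np.array([100.0, 300.0])    # their output (a different structure)
growth = np.array([0.03, -0.02])       # output growth by elementary branch

# Level-90 route: each elementary branch grows at its own rate, then aggregate.
detailed = (ic_base * (1 + growth)).sum()

# Level-15 route: the aggregate grows uniformly at the aggregated branch's
# rate, here the output-weighted mean growth of its components.
agg_growth = np.average(growth, weights=out_base)
direct = ic_base.sum() * (1 + agg_growth)
print(detailed, direct)
```

The two routes coincide only when the weights implicit in intermediate consumption match those of the aggregated growth rate, which is exactly why the comparison in Table 3 shows dispersed differences across branches.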
Moreover, such distortions become increasingly large as one moves away from the base year.
Finally, narrowing the discrepancies is complicated by the fact that the accounting balances do not play the same role in the annual and in the quarterly accounts. For the former (cf. Part 1), they are at the heart of the assessment of the overall coherence of the various estimates: examining and interpreting the accounting balances is the basis of the arbitration procedures. This importance appears, a contrario, when the absence of arbitration leaves an adjustment item in place, as is the case between the financial and the non-financial accounts. In building the quarterly accounts, drawing up accounting balances sometimes serves to estimate those series for which no usable indicator exists. The accounting balances are, at first, results produced automatically by the system. Examining their level and their movement bears on the plausibility of this result, and may lead to calling into question the indicators used upstream.
In addition, fixing certain intermediate consumptions in the table of intermediate exchanges (the fixed cells) introduces a further distortion. These tables are not built in the same detail in the annual and the quarterly accounts, and the fixed intermediate consumptions then do not necessarily coincide. For example, the annual accounts may fix, in NAP 90, only part of a NAP 15 heading (in branch or in product terms), where the quarterly accounts would fix that heading directly.
To illustrate this aggregation effect, which varies with each branch's own cyclical position, the volume results of projecting the 1991 annual TES (excluding non-market services) at level 90 and then aggregating to level 15 were compared with those that would have been obtained by projecting the table directly at level 15. No appreciable difference appears in total intermediate consumption between the two methods, but the differences across branches are more dispersed (Table 3). Thus, for the intermediate consumption of the agriculture branch the difference is 1.2%, and it is -1.2% for the real-estate rental branch.
C’est le cas par exemple des variations de stocks (cf.
partie 2), calculées comme solde du total des
ressources et des emplois totaux hors stocks, dans les
comptes trimestriels et appréhendées directement dans
les comptes nationaux. Il s’agit typiquement d’un
poste sur lequel la convergence est difficile et qui a un
impact sur l’ensemble du système. Le problème est le
même pour la production dans les services, calculée
par solde pour les comptes trimestriels.
Despite these difficulties, experience shows that it is possible to converge "reasonably" on the main aggregates (Tables 4 and 5). This requires close collaboration and genuine dialogue between all the teams, but the credibility of the economic information made available to users is thereby strengthened.
Another structural problem arises from the difference in price systems: the annual accountants favour the previous year's prices, while the quarterly accountants rely on 1980 prices. To allow comparisons, the annual accountants chain their price indices. Not only is the dialogue delicate because of different "cultures", and not only does this raise the problem of the choice of price indices, but the dialogue is further complicated by the structure-modification effects induced by chaining. To resolve this problem, one must then reason at the finest possible level of detail.
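The chaining operation mentioned here can be sketched with two invented products over three years: each year's volume growth is measured at the previous year's prices, and the growth rates are then chained onto the base-year level:

```python
# Toy chaining at previous-year prices. All quantities and prices invented.
import numpy as np

# quantities and prices of two products over three years (columns = years)
q = np.array([[10.0, 11.0, 11.5],
              [ 5.0,  5.5,  6.0]])
p = np.array([[ 2.0,  2.2,  2.3],
              [ 4.0,  3.8,  3.9]])

# volume growth of year t, valued at the prices of year t-1
growth = [(q[:, t] @ p[:, t - 1]) / (q[:, t - 1] @ p[:, t - 1]) for t in (1, 2)]

# chained volume level, starting from the base year at its own prices
chained = [q[:, 0] @ p[:, 0]]
for g in growth:
    chained.append(chained[-1] * g)
print([round(x, 2) for x in chained])
```

Because the price weights are refreshed every year, the chained aggregate is no longer the simple sum of its components at fixed base-year prices; this is the structure-modification effect that complicates the dialogue with accounts kept at 1980 prices.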
The reconciliation of the annual and the quarterly accounts does not stop there. As seen above, for the years before year n, not only are the quarterly accounts anchored to the annual accounts, but the econometric relations are also re-estimated at each revision, so as better to take account of the three new annual points. This
way of proceeding makes it possible to ensure that there is no systematic bias in the quarterly accounts.
l’équipe des comptes trimestriels comporte 14
personnes et on peut estimer qu’une centaine de
personnes travaillent sur les comptes annuels.
The concertation, easy in the cruising phase of a growth cycle, is more delicate at cyclical turning points. It nevertheless prompts reflection on the methods employed by each team, and on their relevance, with a view to the readability and interpretation of the accounts. Already, avenues of progress for the quarterly-accounts methodology have emerged: working at a finer level (for example NAP 40 for branches and products) and refining the econometric methods would improve the quality of the first estimates, just as the compilation of raw accounts would make the arbitrations easier.
5 Conclusion
Despite radically different methods, the French annual and quarterly accounts give a highly convergent analysis of the country's economic situation, in both qualitative and quantitative terms. This is the result of an early and thorough dialogue between the various teams responsible for each type of accounts.
It must nevertheless be borne in mind that the efficiency of the organisation of the French accounts has a material cost (notably in computing) and a human cost.
References
[1] "Le Produit National Brut", collective work, INSEE Méthodes series, No 34-35-36, 1993.
[2] G. Dureau, "Les comptes nationaux trimestriels", INSEE Méthodes series, No 13, 1991.
[3] INSEE internal notes, 1993 and 1994.
Chart 1
The input-output table ("Tableau Entrées-Sorties")
A synthesis tool for transactions in goods and services

(The chart shows, by branch and product: a resources table combining output, imports, trade margins, customs duties and VAT on products; the table of intermediate inputs; the table of final uses, covering final consumption, GFCF, changes in inventories and exports; the production account by branch; residual sales transfers; GDP; and the operating account by branch.)
Graphique 2
L’équilibre ressources-emplois (ERE) d’un produit
Production
+
Importations
+
Droits de douane
(nets des subventions à l’import.)
+
Marges commerciales
Consommations intermédiaires
+
Consommation finale
+
FBCF
=
+
Variations de stocks
+
TVA grevant les produits
+
Exportations
42
Variation
De
stocks
Export.
TABLE OF CONTENTS
Table 1
GDP growth (in %, at 1980 prices)

Year    Quarterly accounts (1)    Provisional annual accounts
1987          2.3                        2.4
1988          3.8                        3.5
1989          3.6                        3.8
1990          2.8                        2.8
1991          1.2                        1.2
1992          1.4                        1.2
1993         -1.0                       -1.0

(1) Growth rates from the July "notes de conjoncture".
Table 2
Household consumption growth in 1993, before and after the reconciliation ("concertation"), in %

                                  Before reconciliation      After reconciliation
                                  Annual      Quarterly      Annual      Quarterly
Food                                0.6          0.8           0.8          0.7
Energy                              0.5         -0.3           0.7          0.6
Manufactured products              -1.3         -0.8          -1.5         -1.3
  Durable goods                    -6.4         -6.3          -6.6         -6.6
  Textiles and leather             -2.2         -1.6          -2.3         -1.9
  Other manufactured products       2.3          2.0           0.8          0.6
Services                            3.1          2.0           2.2          2.1
  Market services                   1.4          2.2           2.3          2.3
  Non-market services               2.2          2.5           2.2          1.8
Total consumption (volume)          1.3          0.3           2.0          0.7
Total consumption price             2.9          2.2           1.8          2.1
Table 3
Intermediate consumption by branch obtained by projection of the 1991 semi-final input-output table (TES 1991 SD)
(difference in % between the projection made directly at the NAP 15 level and the projection made at level 90 and then aggregated to level 15)

Agriculture                                   1.2
Agri-food industries (IAA)                   -0.2
Energy                                        0.8
Intermediate goods                           -0.2
Professional equipment goods                  =
Household equipment goods                     =
Automobiles                                  -0.2
Everyday consumer goods                      -0.5
Building, civil and agricultural works        0.3
Transport and telecommunications              0.5
Market services                               =
Real-estate renting                          -1.2
Insurance                                    -0.3
Financial institutions                        =
Total                                         =
Table 4
Growth of resources and uses of goods and services in 1993 (in %, at 1980 prices)

                                                    Annual accounts    Quarterly accounts (1)
GDP                                                      -1.0                -1.0
Imports                                                  -3.1                -3.4
Household final consumption                               0.7                 0.6
General government final consumption                      0.5                 0.9
Total GFCF                                               -5.1                -4.4
Changes in inventories (billions of 1980 francs)        -41.5               -49.5
Exports                                                   0.1                 0

(1) Growth rates from the July 1994 "note de conjoncture".
Table 5
Growth of production by branch in 1993 (in %)

                                              Quarterly accounts (1)    Annual accounts
Agri-food products                                  -1.7                      0.3
Energy                                               1.4                      1.4
Manufactured products                               -5.1                     -5.2
Building, civil and agricultural works              -4.1                     -3.9
Trade                                               -0.2                     -1.1
All market services                                 10.3                     10.6

(1) Estimates from the July 1994 "note de conjoncture".
Cyclical patterns of the Spanish Quarterly National Accounts:
a VAR analysis

Ana M. ABAD GARCIA
Enrique Martín QUILIS
Instituto Nacional de Estadística
1 Introduction

The Spanish economy has undergone deep changes in its level of activity and employment, with adverse consequences on welfare and macroeconomic performance. This phenomenon is treated as a result of cyclical forces. Can we identify some basic sources of aggregate fluctuations?

In this paper we examine some evidence provided by the Spanish Quarterly National Accounts using multiple time series techniques. The cyclical signal is extracted by applying annual rates of growth to the original data. This simple filter offers a good measure of the cyclical component of a time series, especially if it is free from the irregular component (the Spanish Quarterly National Accounts are free from seasonality and irregularity); see Melis (1991), Cristóbal and Quilis (1994) and INE (1992, 1993).

All the data are expressed at 1986 constant prices. The sample covers 1970.I to 1994.I (93 observations). The variables are expressed as annual rates of growth.

Section two deals with some descriptive measures of the data, including cointegration analysis. The model is presented in the third section and the structural analysis (impulse-response and variance decomposition) is presented in the fourth section. The paper finishes with a set of conclusions and a graphical appendix.
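As an illustration of this filter (a minimal sketch with invented data, not the INE implementation), the annual rate of growth of a quarterly series can be computed in a few lines:

```python
import numpy as np

def annual_growth_rate(x):
    """Year-on-year growth of a quarterly series: 100 * (x_t / x_{t-4} - 1).

    This is the simple cyclical-signal filter described above; it loses
    the first four observations of the level series.
    """
    x = np.asarray(x, dtype=float)
    return 100.0 * (x[4:] / x[:-4] - 1.0)

# A 97-quarter level series (1970.I-1994.I) yields 93 annual growth rates.
levels = np.linspace(100.0, 180.0, 97)   # invented stand-in for a QNA series
rates = annual_growth_rate(levels)
print(rates.size)  # 93
```

Applied to a series that is already free from seasonality and irregularity, the filter leaves essentially the cyclical component.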
2 Descriptive and cointegration analysis

It is advisable to examine the sample properties of the data before identifying and estimating a VAR model. Special attention is devoted to the issue of unit roots and cointegrating relationships.

The statistical model employed is a vector autoregression (VAR). This type of model allows an easy characterisation of the data without unreliable a priori restrictions (Sims, 1980, 1981).

The list of variables is:

cpn: private national consumption
cpu: public consumption
fbe: fixed investment in equipment
fcs: fixed investment in buildings
xbs: exports of goods and services
mbs: imports of goods and services
pib: gross domestic product

Looking at the variance and the contemporaneous correlation with gross domestic product (see table 1 and graph 1), we detect a group formed by investment in equipment, investment in building and imports, which are closely related to GDP and are remarkably more volatile than it. Private consumption is next to this group in terms of correlation (the highest: 0.9), but is only slightly more volatile than aggregate output.

Public consumption is the least volatile series and its association with GDP is not very high. Exports are very volatile and almost unrelated to GDP.
Table 1: Volatility and correlation with PIB

Series    µ       σ       σ/σPIB    ρ(·,PIB)
PIB      2.92    2.45     1.00       1.00
CPN      2.83    2.88     1.17       0.90
CPU      5.12    1.96     0.80       0.58
FBE      3.09   10.08     4.10       0.80
FCS      2.32    6.79     2.77       0.79
XBS      6.77    4.78     1.95       0.11
MBS      6.95    8.53     3.47       0.72

µ = sample mean; σ = sample standard deviation; ρ = contemporaneous correlation with PIB.

Table 2: Detection of unit roots

Series   ρ(1)    ρ(12)    ADF
PIB      0.96     0.00   -1.88
CPN      0.94     0.15   -1.72
CPU      0.87    -0.19   -0.73
FBE      0.93    -0.15   -1.58
FCS      0.95    -0.03   -2.02
XBS      0.86    -0.33   -2.91
MBS      0.92     0.08   -2.25

The critical value (1% level of significance) is -3.5 (MacKinnon, 1990).
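The descriptive measures of table 1 can be reproduced schematically (a generic numpy sketch following the definitions above, not the authors' code; the data matrix is invented):

```python
import numpy as np

def cyclical_descriptives(X, pib_col=0):
    """Mean, standard deviation, relative volatility and contemporaneous
    correlation with PIB for a (T, m) matrix of annual growth rates."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)                          # sample means (the mu column)
    sigma = X.std(axis=0, ddof=1)                # sample standard deviations
    rel = sigma / sigma[pib_col]                 # sigma / sigma_PIB
    rho = np.corrcoef(X, rowvar=False)[pib_col]  # correlation with PIB
    return mu, sigma, rel, rho

rng = np.random.default_rng(0)
X = rng.standard_normal((93, 7))                 # invented data, 93 observations
mu, sigma, rel, rho = cyclical_descriptives(X)
print(rel[0], rho[0])                            # both equal 1 by construction
```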
The series are non-stationary, since all of them have a unit root. Their means and variances evolve through time (see graph 1) and their simple and partial autocorrelation functions exhibit the usual characteristics of non-stationarity (a high value of the PACF at lag 1 and a slow decay of the ACF). The augmented Dickey-Fuller (ADF) test does not reject the hypothesis of a unit root. The regression used is:

∇x_t = µ + β x_{t-1} + Σ_{j=1}^{4} γ_j ∇x_{t-j} + ε_t,   (1)

where ∇ = 1 − B is the difference operator, B is the backward operator (B^j x_t = x_{t-j}); µ, β and the γ_j are parameters estimated using ordinary least squares (OLS), and ε_t is an iid perturbation with zero mean and constant variance.

Recursive correlations starting in 1975.I are depicted in graph 2. They show that:
- public consumption exhibits increased procyclical behaviour;
- exports are less procyclical;
- the rest of the variables intensify their procyclical behaviour along the sample.

Since all the series are I(1), is there a cointegration relationship among them? We use the Engle-Granger (EG) two-step method (Engle and Granger, 1987) to examine this issue. The regressions used are:

x_{it} = c_i + β_i' x_{(i)t} + ε_{it},   i = 1,…,7,   (2)

where c_i is a constant, β_i is a (6×1) vector of parameters (to be estimated using OLS), x_{(i)t} contains all the variables excluding x_i, and ε_{it} is a perturbation term.

Table 3: Engle-Granger test of cointegration

PIB      CPN      CPU      FBE      FCS      XBS      MBS
-2.79    -3.36    -3.49    -2.31    -2.16    -3.30    -3.32

Critical value (1% level of significance) = -5.54.

Using the EG technique, we conclude that there are no cointegration relationships in the data. Another (non-inferential) way to detect such relationships is the analysis of the eigenvalues of the variance-covariance matrix of the residuals generated by VAR models of growing order (graph 3). The results are:
- there are three eigenvalues close to zero, denoting a cointegration rank of three;
- there is a dominant non-stationary eigenvalue (possibly) associated with the common cyclical pattern of the series;
- there are two intermediate eigenvalues which could be related to the idiosyncratic behaviour of some series (the first) and to a mechanism correcting deviations from the main patterns (the second) (Luenberger, 1979).

We decided to model the cyclical signal of the series (z_t = ∇_4 x_t) without additional differences (that is, without taking y_t = ∇∇_4 x_t) because of the discrepancies between the EG tests and the eigenvalue analysis. Another reason is the inadequacy of differencing to achieve stationarity in multiple time series modelling (Box and Tiao, 1977, 1981, 1982; Tiao and Tsay, 1989; Tiao et al., 1993).
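Regression (1) can be illustrated with a bare-bones OLS implementation (a sketch with simulated data, not the authors' code); the resulting statistic is the one compared with the MacKinnon critical value quoted under table 2:

```python
import numpy as np

def adf_tstat(x, lags=4):
    """t-statistic on beta in:
    diff(x)_t = mu + beta*x_{t-1} + sum_j gamma_j*diff(x)_{t-j} + e_t."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    T = len(dx) - lags                       # usable observations
    y = dx[lags:]
    cols = [np.ones(T), x[lags:-1]]          # constant and lagged level
    cols += [dx[lags - j:-j] for j in range(1, lags + 1)]  # lagged differences
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (T - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

rng = np.random.default_rng(0)
rw = np.cumsum(rng.standard_normal(200))     # a unit-root series
print(adf_tstat(rw))                         # typically well above -3.5: no rejection
```

The statistic follows a non-standard distribution under the unit-root null, which is why the MacKinnon tables are needed rather than the usual t critical values.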
3 A VAR model

We consider the class of VAR models represented by:

φ_p(B) z_t = c + a_t,   (3)

where z_t = [PIB CPN CPU FBE FCS XBS MBS]'_t is the (m×1) vector of variables to be modelled (m = 7); φ_p(B) = I − φ_1 B − φ_2 B² − … − φ_p B^p is a matrix polynomial of order p in the backward operator B; the φ_i, i = 1,…,p, are (m×m) matrices of parameters; all the roots of the determinant of φ_p(B) are on or outside the unit circle; c is an (m×1) vector of constants, and a_t is a multivariate white noise shock with variance-covariance matrix Σ.

This kind of model is very popular in applied macroeconomic analysis because it distinguishes clearly between data encapsulation, via reduced forms like (3), and structural analysis (through a reparameterisation of (3) via the Cholesky decomposition of the estimated Σ).

To specify the order p of the model we have the Akaike information criterion (Akaike, 1982) and the likelihood ratio test (Box and Tiao, 1981, 1982; Liu, 1986), combined with the patterns provided by the cross-correlation matrices and by stepwise autoregressions. Although the message is not completely clear, we have chosen p = 3 as the appropriate order of the system. In graph 4 we have drawn the residual variances of each variable from stepwise autoregressions; they show a stabilisation of the decreases from p = 3 onwards.

4 Structural analysis

In this section we present the results of the structural analysis, performed using the impulse-response functions and the variance decomposition of the prediction errors. In this way we get information about the mechanism of propagation of impulses through the system (impulse-response functions) and about the intensity of the dynamic interactions among the variables (variance decomposition).

In order to perform this analysis we have orthogonalised the innovations via the Cholesky decomposition of the variance-covariance matrix of the residuals. This procedure is equivalent to identifying a structural model with a recursive structure in its contemporaneous interactions (Lütkepohl, 1991; Sims, 1980). The correlation matrix of the innovations is:

        PIB     CPN     CPU     FBE     FCS     XBS     MBS
PIB     1.00
CPN     0.36    1.00
CPU     0.37    0.44    1.00
FBE     0.49    0.32    0.31    1.00
FCS     0.39    0.35    0.11    0.31    1.00
XBS     0.14   -0.30   -0.32   -0.01   -0.08    1.00
MBS     0.44    0.59    0.20    0.37    0.47   -0.08    1.00

Contemporaneous interactions among the variables are significant in most cases. Noticeable are the strong correlation between aggregate output (PIB) and investment, and the negative association of exports with the rest of the variables (except output).

The scheme of identification used in the Cholesky decomposition orders the contemporaneous interactions as follows (adjusted R² in brackets): XBS (.9527), CPU (.9553), FCS (.9577), CPN (.9720), FBE (.9678), MBS (.9732), PIB (.9822).

Variance decomposition of the prediction error is shown in table 4 and summarised visually in graph 5. In the short run all the variables are explained by themselves: there is a clear predominance of own shocks in short-run dynamics. In the long run (40 quarters) these elements are less relevant, although they continue to play a significant role.

The main results from the impulse-response analysis are presented in table 5 and in graphs 6 to 12. They are:

- private consumption (CPN) is the main source of cyclical variability, because of its strong effects on investment (especially in equipment), imports and gross domestic product;
- investment in equipment (FBE) and imports (MBS) are receivers of impulses, because of their intense reactions to impulses in the rest of the system and the weak response of most of the variables to their innovations;
- investment in building (FCS) plays a very important role, since it is a receiver (like FBE) and an emitter at the same time: all the variables respond to its innovations (except public consumption and exports) and it reacts to shocks in the rest of the system (except public consumption and exports, again);
- public consumption (CPU) and exports (XBS) are the most idiosyncratic variables.
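The estimation of (3) and the Cholesky orthogonalisation can be sketched as follows (a generic least-squares implementation on invented data; an illustration, not the code behind tables 4 and 5):

```python
import numpy as np

def var_ols(Z, p=3):
    """Estimate z_t = c + Phi_1 z_{t-1} + ... + Phi_p z_{t-p} + a_t by OLS.

    Returns the constant, the list of Phi matrices and the residual
    covariance matrix Sigma.
    """
    Z = np.asarray(Z, dtype=float)
    T, m = Z.shape
    Y = Z[p:]
    X = np.column_stack([np.ones(T - p)] +
                        [Z[p - i:T - i] for i in range(1, p + 1)])
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B
    Sigma = resid.T @ resid / (T - p)
    Phis = [B[1 + (i - 1) * m:1 + i * m].T for i in range(1, p + 1)]
    return B[0], Phis, Sigma

def orth_irf(Phis, Sigma, horizon=8):
    """Orthogonalised impulse responses via the Cholesky factor of Sigma."""
    m = Sigma.shape[0]
    Psi = [np.eye(m)]                      # moving-average coefficients
    for h in range(1, horizon + 1):
        Psi.append(sum(Phis[i] @ Psi[h - 1 - i]
                       for i in range(min(len(Phis), h))))
    P = np.linalg.cholesky(Sigma)
    return [Ph @ P for Ph in Psi]          # responses to one-s.d. shocks

# Example: a trivariate VAR(3) fitted to invented data, 8-step responses.
rng = np.random.default_rng(0)
Z = rng.standard_normal((200, 3))
c, Phis, Sigma = var_ols(Z, p=3)
irf = orth_irf(Phis, Sigma, horizon=8)
print(len(irf), irf[0].shape)  # 9 (3, 3)
```

The recursive (triangular) identification corresponds to the ordering of the columns of Z; reordering the variables changes the orthogonalised responses.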
The first one is almost neutral, in the sense that it is not affected by the others and does not affect the other variables. The second one is dominated by its own impulses, and its shocks affect the rest of the variables in a negative fashion, except gross domestic product; this is a strange result which deserves further analysis.

Table 4: Decomposition of the variance of prediction errors, short run (1 quarter, upper panel) and long run (40 quarters, lower panel)
(the (i, j) element is the percentage of the variance of the prediction error of variable i explained by an innovation in variable j)

Short run (1 quarter):

        PIB      CPN      CPU      FBE      FCS      XBS      MBS
PIB    77.22     2.60     2.23     6.50     6.38     1.55     3.52
CPN     0.00    73.59    13.43     0.00     7.01     5.97     0.00
CPU     0.00     0.00   100.00     0.00     0.00     0.00     0.00
FBE     0.00     0.66     0.78    85.27    10.67     2.61     0.00
FCS     0.00     0.00     0.01     0.00    96.15     3.85     0.00
XBS     0.00     0.00     0.00     0.00     0.00   100.00     0.00
MBS     0.00     4.88     5.02     8.00    15.16    20.84    46.11

Long run (40 quarters):

        PIB      CPN      CPU      FBE      FCS      XBS      MBS
PIB     8.97    35.69     7.95     8.23    17.64    18.94     2.57
CPN     5.25    40.56     8.92     3.57    13.54    26.31     1.86
CPU     4.68    19.13    22.52    12.84    10.33    19.78    10.71
FBE     6.87    10.06     4.07    27.86    36.01     9.56     5.57
FCS     2.43    11.09     5.49     8.47    64.38     5.26     2.87
XBS    12.60     4.57    13.49    19.34     7.48    38.05     4.47
MBS    12.30    19.70     7.53     7.14    27.55     9.55    16.23

Table 5 illustrates the impulse-response analysis. The (i, j) element represents the maximum response (in absolute value) of variable i to an impulse in variable j. A '+'/'-' indicates the sign of the response: positive (movements in the same direction) or negative (movements in opposite directions). The number of signs denotes the intensity of the response: '+'/'-' weak (between 0 and 0.5), '++'/'--' medium (between 0.5 and 1) and '+++'/'---' strong (greater than 1). The quarter in which the maximum is located determines the delay of the response: short run (cp, between 1 and 4 quarters), medium run (mp, between 5 and 12 quarters) and long run (lp, more than 13 quarters).

Table 5: Impulse-response analysis

        PIB       CPN       CPU       FBE       FCS       XBS       MBS
PIB     +  cp     ++ mp     +  cp     +  mp     ++ mp     +  mp     +  mp
CPN     +  mp     ++ cp     +  cp     +  mp     ++ mp     -  cp     +  mp
CPU     +  lp     +  lp     ++ cp     +  mp     +  mp     -  mp     0
FBE     ++ mp     +++ mp    +++ cp    +++ cp    +++ mp    -  cp     +++ mp
FCS     ++ cp     ++ mp     +  cp     +++ cp    +++ cp    -  mp     +  cp
XBS     +++ mp    ++ mp     -- mp     -- mp     -  lp     +++ cp    0
MBS     ++ mp     +++ cp    +  cp     +++ mp    +++ mp    -  lp     +++ cp

5 Conclusions

The main results of the analysis are:

- Private consumption is a strongly procyclical series. It is a very important source of aggregate fluctuations, and its shocks play an important role in the evolution of investment, imports and gross domestic product.
- Public consumption is a weakly procyclical series with low volatility, and does not play a significant role in the explanation of the cyclical behaviour of the other variables.
- Investment in equipment and investment in building are strongly and increasingly procyclical. Both are very volatile. The first one receives impulses from consumption, aggregate output, imports and building. The second one receives impulses from and sends shocks to these variables, so it plays an intermediate role in the explanation of the co-movements of the series.
- Exports are very idiosyncratic, since they do not share the common cycle of the rest of the variables and affect them in a strange way.
- Imports are very volatile and very sensitive to shocks in the whole system. They are a good summarising variable of the state of the economy.
References

Akaike, H. (1982) "Likelihood of a model and information criteria", Journal of Econometrics, vol. 20, no. 1.
Box, G.E.P. and Tiao, G.C. (1977) "A canonical analysis of multiple time series", Biometrika, no. 64, 2, pp. 355-365.
Box, G.E.P. and Tiao, G.C. (1981) "Multiple time series with applications", Journal of the American Statistical Association, vol. 76, pp. 802-816.
Box, G.E.P. and Tiao, G.C. (1982) "Introduction to applied multiple time series", Technical Report 582, University of Wisconsin, Madison, U.S.A.
Cristóbal, A. and Quilis, E.M. (1994) "Tasas de variación, filtros y análisis de la coyuntura", Boletín Trimestral de Coyuntura no. 52, Instituto Nacional de Estadística.
Engle, R.F. and Granger, C.W.J. (1987) "Co-integration and error correction: representation, estimation and testing", Econometrica, no. 55, pp. 251-276.
INE (1992) "Nota metodológica sobre la Contabilidad Nacional Trimestral", Boletín Trimestral de Coyuntura no. 44, Instituto Nacional de Estadística.
INE (1993) "Contabilidad Nacional Trimestral de España. Metodología y serie trimestral 1970-1992", Instituto Nacional de Estadística.
Liu, L.M. (1986) "Multivariate time series analysis using vector ARMA models", SCA Working Paper.
Luenberger, D.G. (1979) "Introduction to dynamic systems", John Wiley, New York, U.S.A.
Lütkepohl, H. (1991) "Introduction to multiple time series analysis", Springer-Verlag, Berlin, Germany.
MacKinnon, J. (1990) "Critical values for cointegration tests", Working Paper, University of California, San Diego.
Melis, F. (1991) "La estimación del ritmo de variación en series económicas", Estadística Española, vol. 33, no. 126.
Sims, Ch.A. (1980) "Macroeconomics and reality", Econometrica, vol. 48, no. 1, pp. 1-48.
Sims, Ch.A. (1981) "An autoregressive index model for the U.S., 1948-75", in Kmenta, J. and Ramsey, J.B. (Eds) "Large scale macroeconometric models", North-Holland, Amsterdam, The Netherlands.
Tiao, G.C. and Tsay, R.S. (1989) "Model specification in multivariate time series", Journal of the Royal Statistical Society, series B, vol. 51, no. 2, pp. 157-213.
Tiao, G.C., Tsay, R.S. and Wang, T. (1993) "Usefulness of linear transformation in multivariate time series analysis", Empirical Economics, pp. 567-595.
SECTION 2: NEW DEVELOPMENTS IN THE ECONOMETRICS-BASED APPROACH

Temporal disaggregation of a system of time series when the aggregate is known: optimal vs. adjustment methods

Tommaso DI FONZO
Dipartimento di Scienze Statistiche
Università di Padova
1 Introduction

Most of the data obtained by statistical agencies have to be adjusted, corrected or somehow processed by statisticians in order to arrive at useful, consistent and publishable values.

A problem often faced by government agencies that collect and publish quarterly time series is that of obtaining subannual data that simultaneously comply with the relevant annual figures and satisfy accounting constraints. From a formal point of view, one is interested in estimating a number of individual variables which are aggregated over units and over time.

We deal with a number of statistical procedures developed to solve this problem and available in ECOTRIM (Barcellan, 1994), a program for temporally disaggregating one or more time series. Much emphasis is placed on technical characteristics and on operational issues, in order to highlight the distinctive features of the various procedures at disposal.

The subject is a classical missing data problem solved in an indirect estimation framework: for M unknown high-frequency (say, quarterly) time series, we consider two distinct situations:

1. M preliminary quarterly time series p_j, j = 1,…,M, are available, where Σ_{j=1}^{M} p_j ≠ z and/or p_j does not comply with y_0j;
2. a set of quarterly related indicators is used to obtain indirect estimates of the M unknown time series.

It should be noted that the distinction is not necessarily as strict as it seems, in that the preliminary quarterly series could have been individually obtained by using related indicators.

Benchmarking methods that generalize the work by Denton (1971) can be used in both situations (Cholette, 1988): the preliminary estimates are adjusted so that they comply with more reliable and separately obtained aggregated measurements (1). Given the nature of the paper, essentially developed as technical support to ECOTRIM, we do not deal with adjustment according to some data reliability indicator (see Stone et al., 1942, for the seminal idea). There is no doubt, however, that recent contributions on the linear least squares adjustment of economic data subject to accounting constraints (see, among others, Weale, 1992, and Solomou and Weale, 1993) could prove very useful for the problem at hand.

(1) Indeed, the adjustment procedures supported by ECOTRIM refer to a less general concept of benchmarking than that of Cholette (1987, p. 14): "Benchmarking is the process of optimally combining the original sub-annual series with the annual benchmarks and with the sub-annual benchmarks, in order to obtain a more reliable sub-annual series and a more reliable annual series".
2 Terms of the problem

We deal with an indirect estimation problem involving a system of variables rather than a single one (2). More precisely, we wish to estimate M unknown (n×1) vectors of high-frequency data, each pertaining to one of M basic (i.e., disaggregated) time series, which have to satisfy both contemporaneous and temporal aggregation constraints.

Denoting by y_j, j = 1,…,M, the (n×1) vectors to be estimated, the following accounting constraints hold:

Σ_{j=1}^{M} y_j = z,   (1)

D y_j = y_0j,   j = 1,…,M.   (2)

In order to solve this problem, we consider a number of procedures that can be characterized as either pure adjustment or optimal (in the least squares sense). The information basis common to both cases is given by the following M+1 aggregated vectors (3):

• z, the (n×1) vector of contemporaneously aggregated data;
• y_0j, j = 1,…,M, the (N×1) vectors of temporally aggregated data.

Here D is the (N×n) aggregation matrix converting high-frequency into low-frequency data. Each element of y_0j can be viewed as a non-overlapping linear combination of the elements of y_j, with coefficients given by the (m×1) vector c, m being the aggregation order. Thus, in general, matrix D is equal to

D = [ c' ⊗ I_N | 0 ],   (3)

where 0 is a null (N×h) matrix (4), h = n − mN. A graphical overview of such a data configuration (5) can be found in table 1.

Constraint (1) can be re-written in the following way:

(1'_M ⊗ I_n) y = z,   (4)

where 1_M is an (M×1) vector of ones and y = [y_1' … y_j' … y_M']'. As far as the temporal aggregation constraints (2) are concerned, in more compact form we have

(I_M ⊗ D) y = y_0,   (5)

where y_0 = [y_01' … y_0j' … y_0M']'.

Note that the contemporaneous aggregation of the temporally aggregated series implies

Σ_{j=1}^{M} y_0j,T = Σ_{k=1}^{m} z_{m(T-1)+k} = z_0,T,   T = 1,…,N,

or, in matrix form,

(1'_M ⊗ I_N) y_0 = D* z,   (6)

where D* is the (N×n) aggregation matrix (3) with c = 1_m. Let H be the following [(n+NM) × nM] aggregation matrix,

H = [ 1'_M ⊗ I_n ]   =   [ H_1 ]
    [ I_M ⊗ D    ]       [ H_2 ],

and y_a the [(n+NM) × 1] vector

y_a = [ z   ]
      [ y_0 ].

The complete set of constraints between the unknown values and the available aggregated information can be expressed in matrix form as

H y = y_a,   (7)

that is, H_1 y = z and H_2 y = y_0. It has to be noted that, given relationship (6) between the M+1 aggregated vectors, matrix H has rank r = n + N(M−1), N aggregated observations being redundant.

(2) See Di Fonzo (1987) for a review of univariate procedures.
(3) The notation is as close as possible to that of Barcellan (1994).
(4) The presence of the null matrix in D permits dealing with extrapolation. Obviously, when n = mN, D = c' ⊗ I_N.
(5) The available data are represented in italic font, whereas for the unknown y_{j,t}'s the normal font is used.
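A small numerical sketch of (3)-(7) follows (an illustration only, assuming time-major ordering of the quarters, which is equivalent to the c' ⊗ I_N layout up to a permutation; this is not the ECOTRIM implementation):

```python
import numpy as np

def make_D(c, N, n):
    """Temporal aggregation matrix of eq. (3), laid out year by year.

    c is the (m x 1) vector of aggregation coefficients (all ones for an
    annual total of quarters); the trailing null block covers the
    h = n - m*N extrapolation periods.
    """
    c = np.asarray(c, dtype=float)
    m = c.size
    D = np.zeros((N, n))
    D[:, :m * N] = np.kron(np.eye(N), c)   # one c' block per year
    return D

def make_H(c, N, n, M):
    """Stacked constraint matrix H = [H_1 ; H_2] of eq. (7)."""
    H1 = np.kron(np.ones((1, M)), np.eye(n))   # contemporaneous constraints
    H2 = np.kron(np.eye(M), make_D(c, N, n))   # temporal constraints
    return np.vstack([H1, H2])

# m = 4 quarters per year, N = 2 years, one extrapolation quarter (n = 9).
D = make_D(np.ones(4), N=2, n=9)
print(D.shape)  # (2, 9)
```

As the paper notes, H built this way has n + NM rows but rank n + N(M−1), so solution methods must tolerate the redundant constraints.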
Tab. 1: The elements of a multivariate indirect estimation problem
[Layout table: for each year T = 1,…,N and quarter k = 1,…,4, the unknown values y_{1,t}, …, y_{j,t}, …, y_{M,t} are shown alongside their annual benchmarks y_{01,T}, …, y_{0M,T}, the known quarterly aggregate z_t and its annual total z_{0,T}; the extrapolation quarters t = n−1, n carry no annual benchmark. Available data appear in italics, the unknown y_{j,t} in normal font.]

3 Multivariate adjustment procedures

Suppose we have M preliminary series p_j, j = 1,…,M, that need to be adjusted in order to satisfy the accounting constraints. This has to be accomplished by distributing the discrepancies

z − H_1 p,   y_0 − H_2 p,

according to some reasonable criterion, where p = [p_1' … p_M']' is the stacked vector of preliminary series.
In this section we deal with two multivariate adjustment procedures:

• proportional adjustment;
• Denton's multivariate adjustment.

The former is a very simple and widely used technique, while the latter generalizes the univariate procedure worked out by Denton (1971) by taking into account some technical devices concerning (i) the treatment of starting values (Cholette, 1984, 1988) and (ii) the nature of the accounting constraints.

As we shall see, proportional adjustment is less generally applicable than Denton's, because it assumes that only the contemporaneous aggregation constraints remain to be fulfilled (that is, D p_j = y_0j, j = 1,…,M, already holds).

3.1 The proportional adjustment

A simple and fairly reasonable way to eliminate the discrepancy between a contemporaneously aggregated value and the corresponding sum of disaggregated preliminary estimates consists in distributing such discrepancy according to the weight of each single temporally aggregated series with respect to the contemporaneously aggregated one. Thus, let

w_j,T = y_0j,T / z_0,T,   j = 1,…,M,   T = 1,…,N,

be a set of coefficients such that Σ_{j=1}^{M} w_j,T = 1. If D p_j = y_0j, the adjusted estimates are given by (6):

ŷ_{j,m(T-1)+k} = p_{j,m(T-1)+k} + w_j,T ( z_{m(T-1)+k} − Σ_{r=1}^{M} p_{r,m(T-1)+k} ),   (8)

for k = 1,…,m, T = 1,…,N, j = 1,…,M; that is,

ŷ_{j,m(T-1)+k} = p_{j,m(T-1)+k} + w_j,T d_{m(T-1)+k},

where d_{m(T-1)+k} denotes the quarterly discrepancy. It is immediate to recognize that

Σ_{j=1}^{M} ŷ_{j,m(T-1)+k} = Σ_{j=1}^{M} p_{j,m(T-1)+k} + d_{m(T-1)+k} Σ_{j=1}^{M} w_j,T = z_{m(T-1)+k}

and

Σ_{k=1}^{m} ŷ_{j,m(T-1)+k} = y_0j,T + w_j,T ( z_0,T − z_0,T ) = y_0j,T,

so that both sets of constraints are satisfied. The adjusted estimates can be expressed in matrix form as follows:

ŷ = p + ( [ diag(y_01); … ; diag(y_0M) ] [diag(z_0)]^{-1} ⊗ I_m ) ( z − Σ_{j=1}^{M} p_j ).

If n > mN, disaggregated figures that comply only with the contemporaneous constraints are to be estimated. In this case the discrepancy can be modulated according to the relative weight of each preliminary series, that is

w_j,r = p_j,r / Σ_{q=1}^{M} p_q,r,   j = 1,…,M,   r = mN+1,…,n,

with Σ_{j=1}^{M} w_j,r = 1, and

ŷ_j,r = p_j,r + w_j,r ( z_r − Σ_{q=1}^{M} p_q,r ),   j = 1,…,M,   r = mN+1,…,n.

(6) We assume n = mN.
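Equation (8) admits a compact vectorised sketch (invented example data; assumes n = mN as in footnote 6, and is not the ECOTRIM code):

```python
import numpy as np

def proportional_adjust(p, y0, z, m=4):
    """Multivariate proportional adjustment, eq. (8).

    p  : (M, n) preliminary series, assumed to satisfy D p_j = y0_j
    y0 : (M, N) temporal benchmarks of each series
    z  : (n,)   known contemporaneous aggregate
    The quarterly discrepancy z_t - sum_j p_{j,t} is shared out with the
    weights w_{j,T} = y0_{j,T} / z0_{T}, constant within each year T.
    """
    p, y0, z = (np.asarray(a, dtype=float) for a in (p, y0, z))
    z0 = y0.sum(axis=0)                  # annual totals of the aggregate
    w = y0 / z0                          # (M, N) weights, columns sum to 1
    d = z - p.sum(axis=0)                # quarterly discrepancies
    return p + np.repeat(w, m, axis=1) * d

# Invented example: two series, one year of four quarters.
p = np.array([[10., 12., 11., 11.],      # annual sum 44
              [20., 24., 22., 22.]])     # annual sum 88
y0 = np.array([[44.], [88.]])
z = np.array([33., 33., 33., 33.])       # aggregate, annual total 132
yhat = proportional_adjust(p, y0, z)
print(yhat.sum(axis=0))  # [33. 33. 33. 33.]
```

Since the weights are constant within each year and sum to one across series, the adjusted estimates meet the contemporaneous constraints while leaving the annual benchmarks intact.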
3.2 Denton's multivariate adjustment

Denton's multivariate procedure is a straightforward extension of the univariate counterpart which, as is well known, is based on a movement preservation principle (Cholette, 1984). In general, like any other least squares technique, the adjusted estimates are found by solving the following quadratic loss minimization:

min_y ( y − p )' M ( y − p )   (9)

subject to the contemporaneous and temporal aggregation constraints (7). These constraints being linearly dependent, particular attention has to be paid to the rank of the matrices involved in the procedure. The solution to problem (9) is given by

ŷ = p + M^{-1} H' ( H M^{-1} H' )^− ( y_a − H p ),   (10)

where (H M^{-1} H')^− denotes the Moore-Penrose generalized inverse of (H M^{-1} H'). A solution equivalent to (10), and which does not involve singular matrices to be inverted, can be expressed in terms of n + m(N−1) "free" observations (see the Appendix).

Both the additive and the proportional variants of Denton's procedure operate according to a movement preservation principle focusing on:

• the simple period-to-period change in the additive case;
• the period-to-period percentage change in the proportional case.

In practice the choice of M is restricted to the following matrices:

• I_M ⊗ ∆'_{1,n} ∆_{1,n}   (Denton additive first difference, AFD);
• I_M ⊗ (∆_{1,n} ∆_{1,n})' (∆_{1,n} ∆_{1,n})   (Denton additive second difference, ASD);
• P^{-1} ( I_M ⊗ ∆'_{1,n} ∆_{1,n} ) P^{-1}   (Denton proportional first difference, PFD);
• P^{-1} ( I_M ⊗ (∆_{1,n} ∆_{1,n})' (∆_{1,n} ∆_{1,n}) ) P^{-1}   (Denton proportional second difference, PSD);
• (S ⊗ I_n)^{-1}   (Rossi (1982) multivariate adjustment);

where ∆_{1,n} is the following (n×n) matrix performing first differences (7):

∆_{1,n} = [  1   0  …   0   0 ]
          [ −1   1  …   0   0 ]
          [  ⋮   ⋮  ⋱   ⋮   ⋮ ]
          [  0   0  …  −1   1 ],

P = blockdiag( diag(p_1), diag(p_2), …, diag(p_M) ) is the (nM×nM) matrix with the preliminary series on the diagonal, and S is a positive definite (M×M) matrix.

Finally, it should be noted that, in order to perform correctly, Rossi's procedure requires that the preliminary estimates comply with the temporal aggregation constraints. In this case (Di Fonzo, 1987, 1990), the adjusted estimates are given by

ŷ = p + s^{-1} ( S 1_M ⊗ I_n ) ( z − Σ_{j=1}^{M} p_j ),

with s = 1'_M S 1_M.

(7) For notational convenience, we present the original difference matrix used by Denton (1971). As Cholette (1984, 1987) pointed out, the computational reasons behind this specification, which involve p_{0,j} − y_{0,j} = 0 and p_{0,j} − y_{0,j} = p_{−1,j} − y_{−1,j} = 0 for the FD and SD variants, respectively, have become obsolete. ECOTRIM is currently being upgraded to correctly handle starting values.
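The quadratic program (9) with its pseudo-inverse solution (10) can be sketched numerically (a toy example with invented data and the AFD penalty; an illustration, not the ECOTRIM code):

```python
import numpy as np

def constrained_adjust(p, H, ya, Mmat):
    """Solve min (y - p)' M (y - p) subject to H y = ya, as in (9)-(10).

    The Moore-Penrose pseudo-inverse handles the linearly dependent
    constraints stacked in H.
    """
    Minv = np.linalg.inv(Mmat)
    G = H @ Minv @ H.T
    return p + Minv @ H.T @ np.linalg.pinv(G) @ (ya - H @ p)

# Toy problem: 2 series, one year of m = 4 quarters (n = 4, N = 1).
m, n, M_series = 4, 4, 2
D = np.kron(np.eye(1), np.ones(m))                  # annual-sum aggregation
H = np.vstack([np.kron(np.ones((1, M_series)), np.eye(n)),   # contemporaneous
               np.kron(np.eye(M_series), D)])                # temporal
Delta = np.eye(n) - np.eye(n, k=-1)                 # Denton first differences
Mmat = np.kron(np.eye(M_series), Delta.T @ Delta)   # additive FD penalty (AFD)
p = np.array([10., 10., 10., 10., 20., 20., 20., 20.])   # preliminary series
z = np.array([31., 33., 33., 35.])                  # quarterly aggregate
y0 = np.array([44., 88.])                           # annual benchmarks (sum 132)
ya = np.concatenate([z, y0])
yhat = constrained_adjust(p, H, ya, Mmat)
print(np.allclose(H @ yhat, ya))  # True: all constraints met
```

Because the aggregated vectors are mutually consistent (the annual benchmarks sum to the annual total of z), the redundant rows of H cause no trouble under the pseudo-inverse.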
The estimation errors covariance matrix,
4 The BLUE approach
Ε [( y$ − y)( y$ − y) '] = ( Ιn − LH)V
If we have at disposal a set of high-frequency
indicators related to the unknown disaggregated
series, we may set up M regression models
y j = X jβ j + u j ,
(
+( X − LX a ) X a' V a− X a
(
Ε ui u 'j
) =V ,
ij
(12)
0  β 1   u 1 
   
0  β 2   u 2 
+
M M   M 
   
X M  β M  u M 
• Multivariate white noise
We assume
(
i, j = 1,K, M
Ε (uu ') = Σ ⊗ Ιn ,
(13)
ya = X a β + ua ,
• Multivariate random walk
A straightforward generalization of the univariate
approach of Fernández (1981) is the following:
where X a = HX . The aggregated disturbance vector,
u a = Hu, has zero mean and singular covariance
matrix Ε u a u a' = V a = HVH '.
(
)
The high-frequency estimates can be obtained as a
solution of a linear prediction problem in a generalized
regression model with singular covariance matrix (Di
Fonzo, 1990):
(
)
y$ = Xβ + L y a − X a β$ ,
(
β$ = X a' Va− X a
)
−1
X a' Va− ya ,
By suitable partition of Va only an
for details).
Σ = {σ ij }.
The elements of Σ can be estimated using the OLS
residuals of the temporally aggregated regressions
y0 j = X 0 j β j + u0 j . Furthermore, in this case the
inversion of Va can be notably semplified 8.
where
Ε ( u) = 0
and
Ε ( uu ') = V = {Vij }.
Pre-multiplying (13) by the aggregation matrix H we
obtain the observed, aggregated regression model:
8
)
Ε u i u 'j = σ ij Ιn ,
or
y = Xβ + u,
( X − LX a )'
In most practical problems matrices Vij are unknown
and must, therefore, be estimated according to proper
assumptions on the u j ‘s. Two cases seem to be
interesting from both a theoretical and a computational
point of view .
i , j = 1K
, , M,
where X j ,
j = 1,K , M are ( n × p j ) matrices of
related series. Models (12) can be grouped as follows:
 y1   X 1 0 L
  
 y2  =  0 X 2 L
 M  M
M O
  
 yM   0 0 L
−1
Depends on two components: the former is only
related to H and V, the latter is a systematic one and
rises with ( X − LX a ) (Bournay and Laroque, 1979).
with
Ε ( u j ) = 0,
)
u t = ut −1 + ε t ,
t = 1,K , n,
u 0 = 0,
Ε ( ε t ) = 0,
0
Ε ε l ε 's = 
Σ
(
(14)
)
if l ≠ s
if l = s
r, s = 1K
, , n,
where u t and ε t are (M × 1) contemporaneous
disturbances vector. These assumptions imply
(15)
[(M − 1) × (M − 1)] matrix needs to be inverted (see Di Fonzo, 1990, p.180,
68
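To make the estimator (14)-(15) concrete, the following NumPy fragment builds a tiny two-series example. It is an illustrative sketch only (it is not ECOTRIM code, and all names and toy dimensions are invented here): H stacks the contemporaneous and temporal constraints, a generalized inverse handles the singular V_a, and a final check confirms that the estimates reproduce every observed aggregate.

```python
import numpy as np

rng = np.random.default_rng(0)
M, m, N = 2, 4, 2                 # series, aggregation order, low-freq periods
n = m * N                         # high-frequency length per series

# temporal aggregation of one series: D sums m consecutive values
D = np.kron(np.eye(N), np.ones((1, m)))               # (N, n)
# H stacks contemporaneous sums and the temporal sums of each series
H = np.vstack([np.kron(np.ones((1, M)), np.eye(n)),   # (n, Mn)
               np.kron(np.eye(M), D)])                # (MN, Mn)

# one constant plus one indicator per series, block-diagonal as in (13)
X = np.kron(np.eye(M), np.column_stack([np.ones(n), rng.normal(size=n)]))
beta = np.array([1.0, 2.0, -1.0, 0.5])
Sigma = np.array([[1.0, 0.3], [0.3, 0.5]])            # contemporaneous cov.
V = np.kron(Sigma, np.eye(n))                         # white noise case
y = X @ beta + rng.multivariate_normal(np.zeros(M * n), V)

y_a, X_a = H @ y, H @ X                               # observed aggregates
V_a = H @ V @ H.T                                     # singular covariance
V_a_g = np.linalg.pinv(V_a)                           # generalized inverse

beta_hat = np.linalg.solve(X_a.T @ V_a_g @ X_a, X_a.T @ V_a_g @ y_a)  # (15)
y_hat = X @ beta_hat + V @ H.T @ V_a_g @ (y_a - X_a @ beta_hat)       # (14)

assert np.allclose(H @ y_hat, y_a)   # all aggregation constraints fulfilled
```

The final check holds by construction: the range of V_a = HVH' coincides with that of H, so H·ŷ = y_a exactly, whatever the data.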
E(u_t) = 0 and E(u_l u_s') = Σ min(l, s), that is,

    E(uu') = Σ ⊗ (Δ'_{1,n}Δ_{1,n})^{-1},

where Σ can be estimated as in the multivariate white noise approach.

4.1 Extrapolation

When n > mN we need to estimate data for which the relevant temporally aggregated values are not available. We distinguish whether the contemporaneously aggregated information is or is not available: in the former case we talk about constrained extrapolation, while in the latter we have a pure extrapolation problem. In both cases we look for the BLU estimator of

    y_{j,mN+r} = x'_{j,mN+r} β_j + u_{j,mN+r},    j = 1, …, M,  r = 1, …, h,

where x_{j,mN+r} is a (p_j × 1) vector of related indicators for the j-th series at time mN + r and u_{j,mN+r} is a zero-mean unobservable random error. Following Di Fonzo (1990), denote by y_e = [y_{e1}' … y_{ej}' … y_{eM}']' the (hM × 1) vector of values to be estimated and, with obvious notation, by X_e and u_e the relevant matrix of related series and vector of stochastic disturbances, respectively. Now, let us consider the enlarged model

    y* = X*β + u*,

where

    y* = [y' y_e']',    X* = [X' X_e']',    u* = [u' u_e']',

E(u*) = 0 and

    E[u*(u*)'] = V* = | V  Ω' |
                      | Ω  V_e |,

with Ω = E(u_e u') and V_e = E(u_e u_e').

In the pure extrapolation case the temporally constrained data need not be re-estimated, while the extrapolations are given by

    ŷ_e = X_e β̂ + Ω H'V_a^- (y_a − X_a β̂),

with β̂ as in (15).

If a contemporaneously aggregated series z_e is available, such that H_e y_e = z_e, the complete set of estimates must be re-calculated⁹ as follows:

    ŷ* = X*β̂* + L*(y*_a − X*_a β̂*),    (16)

    β̂* = [(X*_a)'(V*_a)^- X*_a]^{-1} (X*_a)'(V*_a)^- y*_a,    (17)

    L* = V*(H*)'(V*_a)^-,

where

    H* = | H  0   |
         | 0  H_e |,

y*_a = H*y*, X*_a = H*X* and V*_a = H*V*(H*)'.

⁹ In the multivariate white noise case expression (16) can be notably simplified (Di Fonzo, 1990, p. 181).

5 A stylized empirical application

In this section we present a typical session of multivariate indirect estimation. Three annual Italian employment series, each pertaining to the transport and communication branch of economic activity, have been disaggregated so that the quarterly estimated series comply with the known benchmark z. The series at hand, observed over the period 1970-1990, are:

    Y01: labour units, transports;
    Y02: labour units, activities related to transport;
    Y03: labour units, communication.

We used three related indicators supplied (and used) by Istat¹⁰: the first one (X1) comes from the Italian Quarterly Labour Force Survey and has been seasonally adjusted, while the remaining ones are a deterministic trend (X2) and a dummy variable (X3) equal to 1 from 1987.1 to 1991.1, respectively. Each preliminary series has been estimated according to the AR(1) version of the optimal univariate approach¹¹
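The min(l, s) covariance structure stated above can be verified directly. In the sketch below (our notation; it uses the square first-difference matrix of Fernández, 1981, which embeds the starting condition u_0 = 0), the implied covariance of a unit-variance random walk is exactly the matrix of min(l, s) values, and the multivariate version follows by a Kronecker product:

```python
import numpy as np

n = 6
# square first-difference matrix (Fernández, 1981): embeds u_0 = 0
Delta = np.eye(n) - np.diag(np.ones(n - 1), -1)
C = np.linalg.inv(Delta.T @ Delta)          # implied random walk covariance
min_ls = np.minimum.outer(np.arange(1, n + 1), np.arange(1, n + 1))

assert np.allclose(C, min_ls)               # cov(u_l, u_s) = min(l, s)

Sigma = np.array([[1.0, 0.3], [0.3, 0.5]])  # contemporaneous covariance
V = np.kron(Sigma, C)                       # E(uu') for the multivariate case
assert V.shape == (2 * n, 2 * n)
```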
worked out by Chow and Lin (1971). The GLS regression results¹² are given in table 2, while table 3 shows the OLS regression results¹³. In both cases Y02 gave the worst fit (R² < 0.8), even if, on the whole, the results seem acceptable from a purely statistical point of view.

The multivariate regression estimates are reported in tables 4 and 5, respectively. Both in terms of significance of the estimated parameters and of global fit, the results appear generally rather satisfactory.

Tab. 2: Estimated parameters of the auxiliary annual univariate regressions. AR(1)

Series | Constant           | X1              | X2               | X3                 | p̂     | R²     | F
Y01    | 450.2315 (5.9400)  | 0.1037 (1.1837) | 2.4453 (11.4299) |                    | 0.8316 | 0.9247 | 123.86
Y02    | -24.5809 (-1.1277) | 0.1692 (7.1464) |                  | -12.6245 (-4.3378) | 0.7524 | 0.7185 | 26.52
Y03    | 144.1347 (5.2019)  | 0.0513 (1.6061) | 1.4635 (20.2552) |                    | 0.7524 | 0.9774 | 432.97

Tab. 3: Estimated parameters of the auxiliary annual univariate regressions. OLS

Series | Constant           | X1              | X2               | X3                 | R²     | F
Y01    | 463.3512 (6.5410)  | 0.0869 (1.0693) | 2.4612 (14.6587) |                    | 0.9599 | 240.41
Y02    | -30.9303 (-1.6905) | 0.1759 (8.8592) |                  | -12.8261 (-5.0653) | 0.7946 | 39.69
Y03    | 152.5776 (6.6925)  | 0.0421 (1.6105) | 1.4753 (27.3027) |                    | 0.9879 | 817.70

Tab. 4: Estimated parameters of the multivariate white noise regression

Series | Constant           | X1              | X2               | X3
Y01    | 425.4570 (5.2928)  | 0.1300 (1.4108) | 2.4117 (12.5843) |
Y02    | -33.5542 (-1.6206) | 0.1789 (7.9824) |                  | -13.2682 (-5.3142)
Y03    | 147.7660 (5.8801)  | 0.0484 (1.6965) | 1.4498 (26.4409) |

R² = 0.9984,  F = 11422.17

¹⁰ No specification strategy was adopted. Besides the error term structure, the annual auxiliary regressions are as close as possible to those used by Istat. As is clear, this is a classical "poor indicators" situation.
¹¹ See Istat (1985) for details.
¹² t-ratios are in parentheses.
¹³ The OLS residuals are used to estimate Σ.
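Footnote 13 above states that Σ is estimated from the OLS residuals of the temporally aggregated regressions. A minimal sketch of that moment estimate on simulated data (the regressors and coefficients are toy values, not the Istat series):

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 21, 3                        # 21 annual observations, 3 series

# toy aggregated regressors and series (invented, not the Istat data)
X0 = [np.column_stack([np.ones(N), rng.normal(size=N)]) for _ in range(M)]
y0 = [X0[j] @ rng.normal(size=2) + rng.normal(size=N) for j in range(M)]

E = np.empty((N, M))
for j in range(M):                  # OLS residuals, one aggregated regression each
    b, *_ = np.linalg.lstsq(X0[j], y0[j], rcond=None)
    E[:, j] = y0[j] - X0[j] @ b

Sigma_hat = E.T @ E / N             # moment estimate of the (M x M) matrix
assert np.allclose(Sigma_hat, Sigma_hat.T)
```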
Tab. 5: Estimated parameters of the multivariate random walk regression

Series | Constant           | X1              | X2              | X3
Y01    | -6.6290 (-0.1008)  | 0.6886 (8.5019) | 1.2496 (1.4415) |
Y02    | -43.0699 (-1.3820) | 0.1984 (5.2268) |                 | -12.7544 (-4.4399)
Y03    | 177.0377 (5.4066)  | 0.0027 (0.0678) | 1.5020 (6.2557) |

R² = 0.9358,  F = 268.98

The discrepancies to be distributed by the adjustment methods are represented in fig. 1 as a percentage of the contemporaneous benchmark. In fig. 2 the series estimated by the multivariate random walk procedure are represented along with their confidence intervals. Unlike the other two, series Ŷ2 does not exhibit a marked upward trend and is characterized by relatively wider confidence intervals. This fact confirms the perplexities raised by the annual auxiliary regression results, and advises caution when using this variable.

A simple graphical comparison of the performances of the two approaches described so far is conducted in terms of quarterly growth rates (figs. 3, 4 and 5). Two adjustment procedures (proportional and Denton AFD) and two optimal multivariate procedures (white noise and random walk) have been considered. We also consider the preliminary series, whose dynamics should be in some sense "closer" to that of the related series. The results are not univocal and can be summarized as follows:

• The final estimates of Y1 give rise to generally more irregular growth rates than those generated by the preliminary series. However, Denton's AFD procedure behaves well as compared to the remaining ones, in that the growth rates are characterized by less marked peaks.

• As for series 2, both multivariate procedures work well as compared to the preliminarily estimated growth rates, while the adjustment procedures, notably Denton's, show unexpected marked peaks.

• Multivariate random walk estimates behave fairly well for series 3, in the sense that they are close enough to the preliminary counterparts, while very irregular dynamics originate from the remaining procedures.

Fig. 1: Discrepancy as percentage of the contemporaneous benchmark

[Figure: line chart over 1970-1990; the discrepancies range between about -3% and +2% of the benchmark.]
Fig. 2: Estimated series and confidence intervals: multivariate random walk

[Figure: three panels over 1970-1990, with confidence bands: Series 1 (scale roughly 500-800), Series 2 (roughly 110-160), Series 3 (roughly 160-340).]
Fig. 3: Quarterly growth rates: series 1

[Figure: two panels over 1972-1990. Upper panel: univariate AR(1) vs. proportional vs. Denton AFD estimates. Lower panel: univariate AR(1) vs. multivariate white noise vs. multivariate random walk estimates.]
Fig. 4: Quarterly growth rates: series 2

[Figure: two panels over 1972-1990. Upper panel: univariate AR(1) vs. proportional vs. Denton AFD estimates. Lower panel: univariate AR(1) vs. multivariate white noise vs. multivariate random walk estimates.]
Fig. 5: Quarterly growth rates: series 3

[Figure: two panels over 1972-1990. Upper panel: univariate AR(1) vs. proportional vs. Denton AFD estimates. Lower panel: univariate AR(1) vs. multivariate white noise vs. multivariate random walk estimates.]
Appendix

Following Di Fonzo (1990), the matrix H can be partitioned as

        | 1'_M ⊗ I_n          |   | H_w |
    H = | I_{M-1} ⊗ D    0    | = |     |
        | 0     …    0   D    |   | H_M |

where the first two row blocks form H_w (the "free" constraints) and the last row block is H_M. Denoting by W the (N × r) matrix, r = n + N(M − 1),

    W = [ D    -(1'_{M-1} ⊗ I_N) ],

it can be checked that H_M = W H_w, so that

    H = R H_w,    with    R = | I_r |
                              | W   |

an [(r + N) × r] matrix. The singular matrix H M^{-1} H' can thus be written as R H_w M^{-1} H_w' R' = R M_w R', where M_w = H_w M^{-1} H_w' is a full-rank (r × r) matrix. Furthermore, it can be readily checked that

    (H M^{-1} H')^- = R (R'R)^{-1} M_w^{-1} (R'R)^{-1} R'.

Given that R'H = R'R H_w, the adjusted estimates (10) can thus be expressed as

    ŷ = p + M^{-1} H_w' M_w^{-1} (y_w − H_w p),

where y_w = H_w y is the (r × 1) vector containing the r "free" aggregated observations.
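The partition above is easy to verify numerically. The sketch below (toy dimensions of our choosing; the matrix A plays the role of the weight matrix M of the text) checks both that H = R·H_w and that the adjusted estimates computed with the generalized inverse coincide with the "free observations" form:

```python
import numpy as np

rng = np.random.default_rng(1)
M, m, N = 3, 4, 2                  # series, aggregation order, low-freq periods
n = m * N
r = n + N * (M - 1)                # number of "free" aggregated observations

D = np.kron(np.eye(N), np.ones((1, m)))                   # temporal aggregator

# H_w: contemporaneous constraints plus temporal constraints for j < M
H_w = np.vstack([np.kron(np.ones((1, M)), np.eye(n)),
                 np.hstack([np.kron(np.eye(M - 1), D),
                            np.zeros((N * (M - 1), n))])])
H_M = np.hstack([np.zeros((N, n * (M - 1))), D])          # redundant rows
W = np.hstack([D, -np.kron(np.ones((1, M - 1)), np.eye(N))])
R = np.vstack([np.eye(r), W])
H = np.vstack([H_w, H_M])

assert np.allclose(H_M, W @ H_w)   # the last N constraints are implied
assert np.allclose(H, R @ H_w)     # H = R H_w

# the generalized-inverse form and the "free observations" form coincide
A = rng.normal(size=(M * n, M * n))
Mmat = A @ A.T + M * n * np.eye(M * n)                    # role of M in the text
Minv = np.linalg.inv(Mmat)
p = rng.normal(size=M * n)                                # preliminary estimates
y_true = rng.normal(size=M * n)
y_a, y_w = H @ y_true, H_w @ y_true
M_w = H_w @ Minv @ H_w.T
full = p + Minv @ H.T @ np.linalg.pinv(H @ Minv @ H.T) @ (y_a - H @ p)
free = p + Minv @ H_w.T @ np.linalg.inv(M_w) @ (y_w - H_w @ p)
assert np.allclose(full, free)
```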
References

Barcellan, R. (1994), "ECOTRIM: a program for temporal disaggregation of time series", paper presented at the INSEE-EUROSTAT workshop on Quarterly National Accounts, Paris, December 5-6, 1994.

Bournay, J. and Laroque, G. (1979), "Réflexions sur la méthode d'élaboration des comptes trimestriels", Annales de l'INSEE, 36, pp. 3-30.

Cholette, P.A. (1984), "Adjusting sub-annual series to yearly benchmarks", Survey Methodology, 10, 1, pp. 35-49.

Cholette, P.A. (1987), "Concepts, definitions and principles of benchmarking and interpolation of time series", Working Paper, Time Series Research and Analysis Division, Statistics Canada.

Cholette, P.A. (1988), "Benchmarking systems of socio-economic time series", Working Paper, Time Series Research and Analysis Division, Statistics Canada.

Chow, G.C. and Lin, A. (1971), "Best linear unbiased interpolation, distribution and extrapolation of time series by related series", The Review of Economics and Statistics, 53, 4, pp. 372-375.

Denton, F.T. (1971), "Adjustment of monthly or quarterly series to annual totals: an approach based on quadratic minimization", Journal of the American Statistical Association, 66, 333, pp. 99-102.

Di Fonzo, T. (1987), La stima indiretta di serie economiche trimestrali, Padova, Cleup.

Di Fonzo, T. (1990), "The estimation of M disaggregated time series when contemporaneous and temporal aggregates are known", The Review of Economics and Statistics, 72, 1, pp. 178-182.

Fernández, R.B. (1981), "A methodological note on the estimation of time series", The Review of Economics and Statistics, 63, 3, pp. 471-478.

Istat (1985), "I conti economici trimestrali. Anni 1970-1984", Supplemento al Bollettino Mensile di Statistica, 12.

Rossi, N. (1982), "A note on the estimation of disaggregate time series when the aggregate is known", The Review of Economics and Statistics, 64, 4, pp. 695-696.

Solomou, S. and Weale, M. (1993), "Balanced estimates of national accounts when measurement errors are autocorrelated: the UK, 1920-38", Journal of the Royal Statistical Society, A, 156, 1, pp. 89-105.

Stone, J.R.N., Champernowne, D.A. and Meade, J.E. (1942), "The precision of national income accounting estimates", The Review of Economic Studies, 9, pp. 111-125.

Weale, M. (1992), "Estimation of data measured with error and subject to linear restrictions", Journal of Applied Econometrics, 7, pp. 167-174.
Ecotrim: A program for temporal disaggregation of time series
Roberto BARCELLAN
Università di Padova
Applied researchers often face the problem of deriving disaggregated time series data when only information in temporally aggregated form is available. ECOTRIM is a program which supplies a set of mathematical and statistical techniques to carry out temporal disaggregation. It takes an indirect approach to the disaggregation problem: the disaggregated series are estimated by using the available aggregated series and, where available, a set of known related indicators. This paper offers methodological support for the use of the package.
1 Introduction

Many economic and social variables are in principle generated by continuous processes, but are observed in discrete temporal units. Even if one assumes (as we shall do in the rest of the paper) that the original (or basic) data generating process is a discrete one, in time series analysis we often deal with temporally aggregated data. Thus we refer to a set of observations which are recorded in a lower-frequency time unit than that of the relevant original process.

The aggregated data are generally available in the form of (i) sums over, or (ii) systematic sampling on, m (m > 1) consecutive original observations. In the former case we have a distribution problem, which arises with reference to flow variables, while in the latter we treat an interpolation problem, which deals with stock variables.

We will refer to the process underlying the basic (high-frequency) series as the original (or disaggregated) process, as opposed to the derived (or aggregated) process. The analyst, therefore, faces the problem of deriving the disaggregated data in a reasonable way from the available information given by the low-frequency time series at hand and, sometimes, by one or more related series.

Several approaches to the disaggregation problem have been proposed in the literature. We can group them in the following way:

• Mathematical approach: the disaggregated values are estimated according to a mathematical criterion. We usually consider a minimum quadratic loss function problem whose solution gives rise to the estimated series (Boot, Feibes and Lisman, 1967, Cohen, Müller and Padberg, 1971);

• Adjustment approach: these methods force a preliminary series of estimates which does not fulfill the aggregation constraints to fulfill them by modulating the discrepancies (Denton, 1971, Ginsburgh, 1973, Cholette, 1988);

• Optimal statistical approach: the estimates are BLU with respect to a regression model which involves the unknown series and the related series (Chow and Lin, 1971, Bournay and Laroque, 1979, Barbone, Bodo and Visco, 1981, Fernandez, 1981, Litterman, 1983, Di Fonzo, 1987);
• ARIMA model based approach: the relationships between the aggregated and the disaggregated ARIMA models, which describe respectively the original and the derived process, are used in order to estimate the derived series. We can use the relationships between the autocovariance structures, or work with a preliminary series, to solve the disaggregation problem (Stram and Wei, 1986b, Al-Osh, 1989, Wei and Stram, 1990, Guerrero, 1990, Barcellan and Di Fonzo, 1994).

The techniques considered above refer only to the univariate disaggregation problem, while in practice we often deal with a multivariate problem, that is, we want to estimate M disaggregated series which fulfill an accounting constraint. In this case the series must fulfil both temporal and contemporaneous aggregation constraints. In this case too, adjustment methods or a BLUE approach can be used.

1.1 Ecotrim

Economic data estimation issues like those discussed so far strongly concern most statistical agencies, which may find it useful to have at their disposal a computational tool supporting the disaggregation techniques proposed in the literature.

ECOTRIM is a program system which supplies a set of mathematical and statistical techniques to carry out temporal disaggregation. It has been created by R. Barcellan and T. Di Fonzo on behalf of the European Commission, Statistical Office¹. This paper represents a methodological support to the use of the package.

The present version of ECOTRIM is written in Fortran 90. It offers a menu driven approach in which the user is asked to choose the disaggregation procedure he wants to run. It can be used in interactive mode or in batch mode, according to the researcher's requirements. A typical ECOTRIM session proposes to the user an initial choice between

• interactive mode;
• batch mode.

The former option allows the user to run an interactive session. This mode requires an active role of the researcher: using a menu driven path, he has to specify to the program all the information needed, such as the aggregated data files, the related series files (if any), the disaggregation procedure he wants to use, the contemporaneously aggregated series, if required, and so on.

The latter option corresponds to the choice of a batch session. In this case the user has only to write, using an editor, a batch file complying with a set of rules, and then to run it from the appropriate menu. The batch mode is particularly suitable for treating a large number of series to be disaggregated. In fact the batch command file can contain more than one set of instructions, so the program can be used in an automatic way to disaggregate several series according to the procedures chosen by the user and specified in the batch command file.

In order to give an idea of the opportunities that ECOTRIM offers, and to introduce the methodological aspects which will be treated in this paper, we now briefly consider a typical interactive ECOTRIM session, showing the main choices the user is called to make (for more details see the manual).

Besides the options which allow the user to load the series required by the analysis (the series to be disaggregated, related series, and other series to be used for graphical comparison between original, if available, and estimated series), the user must choose the procedure he wants to use, according to the problem he wants to solve and to the set of information at his disposal.

First of all he has to specify whether he is dealing with the univariate or the multivariate case, choosing between

• univariate methods;
• multivariate methods.

¹ Directorate B, Economic Statistics and Economic and Monetary Convergence.
The univariate methods option allows the user to treat the univariate disaggregation problem, that is, to disaggregate a single series in order to fulfill temporal aggregation constraints. According to the information available, we can choose between

• disaggregation without related series;
• disaggregation by related series.

In the former case we suppose to deal with problems where the only available information corresponds to the aggregated series and, sometimes, the ARIMA model of its generating process. The procedures offered by ECOTRIM are, therefore, mathematical procedures or ARIMA model based procedures:

• Boot, Feibes and Lisman's procedure;
• Denton's procedure;
• Wei and Stram's procedure;
• Al-Osh's procedure.

While the first two methods are mathematical ones, the last two are ARIMA model based techniques and require knowledge of the aggregated ARIMA model. The methodological concepts underlying these methods are illustrated in this paper in a slightly different manner, that is, according to the theoretical approach used (either mathematical adjustment or ARIMA model based).

Disaggregation by related series rests on the idea that a logically correlated high-frequency series is available (note that we do not pose the problem of verifying this hypothesis). In this case ECOTRIM offers several techniques: some correspond to the mathematical adjustment approach, some to the optimal (in the least squares sense) approach and some to the ARIMA model based approach. Thus we may choose among:

• Denton's procedure;
• Ginsburgh's procedure;
• AR(1) procedure;
• Fernandez's procedure;
• Litterman's procedure;
• Guerrero's procedure.

While Denton's and Ginsburgh's are adjustment methods, the AR(1), Fernandez's and Litterman's are optimal procedures and Guerrero's is an ARIMA model based technique. All the techniques considered until now produce an estimated disaggregated series, and some of them offer further information such as confidence intervals (not available for the mathematical/adjustment approach), diagnostics or the estimated disaggregated ARIMA model (obviously, only for the ARIMA model based approach).

ECOTRIM interactive mode also allows the user to plot the available series (aggregated series, related series, other series and estimated series) in order to inspect them graphically and make comparisons between them. It is also possible to save the obtained results.

As to the multivariate methods, ECOTRIM permits the estimation of disaggregated series fulfilling both temporal and contemporaneous aggregation constraints. Both pure adjustment and optimal (in the least squares sense) techniques are offered:

• white noise procedure;
• random walk procedure;
• Rossi's procedure;
• Denton's procedure;
• weighted adjustment procedure.

The first two options correspond to the optimal approach, the remaining ones to the adjustment approach. It should be noted that in the last three cases preliminary estimated series are needed and, as to the Rossi and weighted adjustment procedures, these series must fulfill the temporal aggregation constraints.

ECOTRIM batch mode essentially uses the same ideas stated above, properly indicated in the batch command file (for more technical details see the manual). The present version of ECOTRIM is a preliminary version to be developed further. ECOTRIM is also available in a more advanced version written in GAUSS, which requires GAUSS version 3.1.5. We are going to implement several improvements, such as data manipulation facilities, a Fortran Sun version running under UNIX, a more detailed diagnostic analysis, and state space and Kalman filter based methods.

All the univariate techniques available in the program are described from a theoretical point of view, pointing out the approaches and the solutions proposed by the authors. It should be stressed that ECOTRIM offers the opportunity to obtain
extrapolations, that is, to estimate future values when the related aggregated value is not yet available. The issue is addressed for the methods which offer this option. As far as the multivariate procedures are concerned, we refer to Di Fonzo (1994).

Section 2 is devoted to establishing the notation. The mathematical and adjustment procedures available in ECOTRIM are analyzed in section 3, while in section 4 the ARIMA model based techniques are described. Optimal procedures and the multivariate disaggregation problem are considered in sections 5 and 6, respectively. Finally, section 7 contains some concluding remarks.
2 Notation

Let y_t, t = 1, …, n, be an equally spaced time series that admits the ARIMA(p, d, q) representation

    φ_p(B)(1 − B)^d y_t = θ_q(B) a_t,    (1)

where B is the backshift operator, such that By_t = y_{t-1}, φ_p(B) = 1 − φ_1 B − … − φ_p B^p, θ_q(B) = 1 − θ_1 B − … − θ_q B^q and {a_t} is a white noise process with variance σ²_a. Further assume that φ_p(B) and θ_q(B) have roots that lie outside the unit circle and do not share common roots.

Let Y_t, t = m, …, n, be the series obtained by a linear combination of m consecutive values of the basic series y_t:

    Y_t = Σ_{j=0}^{m-1} c_{m-j} y_{t-j},    t = m, …, n,    (2)

where c_j is the j-th coefficient of the linear combination considered, j = 1, …, m.

Let y_{0,T}, T = 1, …, N, be the series obtained by systematic sampling of Y_t every m time periods, N = ⌊n/m⌋:

    y_{0,T} = Y_{mT},    T = 1, …, N.    (3)

It is easy to recognise that each value of the basic series y_t concurs to the determination of only one of the values of the series y_{0,T}, whereas each y_t can concur to form several values of Y_t. We will refer to the transformations that give rise to Y_t and y_{0,T} as overlapping and non-overlapping linear combinations, respectively. In this paper we consider only non-overlapping linear combinations.

Let y_{0,T}, T = 1, …, N, admit the ARIMA(P, D, Q) representation

    ψ_P(B̃)(1 − B̃)^D y_{0,T} = η_Q(B̃) e_T,

where B̃ is the backshift operator at the aggregated frequency, such that B̃ y_{0,T} = y_{0,T-1}, that is B̃ = B^m, ψ_P(B̃) = 1 − ψ_1 B̃ − … − ψ_P B̃^P, η_Q(B̃) = 1 − η_1 B̃ − … − η_Q B̃^Q and {e_T} is a white noise process with variance σ²_e.

The relation between y_t and y_{0,T} can be expressed as follows:

    y_{0,T} = Σ_{j=0}^{m-1} c_{m-j} y_{mT-j} = (c_1 B^{m-1} + c_2 B^{m-2} + … + c_m) y_{mT} = c' y_{mT},    (4)

where m is the aggregation order, c = [c_1 c_2 … c_m]' is the vector of the aggregation coefficients and y_{mT} = [y_{m(T-1)+1} … y_{mT}]'. The (m × 1) vector c generally assumes one of the following forms:

    c = (1 1 … 1)',
    c = 1/m (1 1 … 1)',
    c = (1 0 … 0)',
    c = (0 0 … 1)'.    (5)

The first configuration, typical of flow variables such as consumption, production and so on, is referred to as linear temporal aggregation in the strict sense. The second configuration, suitable for index variables (e.g. prices), gives rise to temporal averages of the basic series. The last two configurations, finally, correspond to the systematic sampling of the first and the last, respectively, of m consecutive values of a stock variable. In practice we may find other kinds of coefficient vectors: think, for example, of the weights used to
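The four configurations in (5) can be tried on a toy series. In this sketch (the helper name agg_matrix and the sample values are ours), y is stored in natural time order, so the non-overlapping aggregator is block-diagonal in c':

```python
import numpy as np

m, N = 4, 3
n = m * N
y = np.arange(1.0, n + 1)             # toy basic series y_1, ..., y_12

def agg_matrix(c, N):
    """(N x mN) non-overlapping aggregation matrix, y in natural time order."""
    return np.kron(np.eye(N), c)

c_sum   = np.ones(m)                  # flows: strict-sense aggregation
c_mean  = np.ones(m) / m              # index variables: temporal averages
c_first = np.eye(m)[0]                # stocks: first value of each period
c_last  = np.eye(m)[m - 1]            # stocks: last value of each period

print(agg_matrix(c_sum, N) @ y)       # [10. 26. 42.]
print(agg_matrix(c_mean, N) @ y)      # [ 2.5  6.5 10.5]
print(agg_matrix(c_first, N) @ y)     # [1. 5. 9.]
print(agg_matrix(c_last, N) @ y)      # [ 4.  8. 12.]
```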
calculate an annual figure as a linear combination of monthly values.

We can represent all the above relationships in a more compact form. Let y = [y_1 y_2 … y_n]' be the (n × 1) vector of basic values, y_0 = [y_{0,1} … y_{0,N}]' the corresponding (N × 1) aggregated vector and D = c' ⊗ I_N the (N × mN = n) aggregation matrix (where ⊗ denotes the Kronecker product). Then

    y_0 = Dy.    (6)

We often face problems where, besides the available low-frequency series, we dispose of other information such as related series, that is, high-frequency series logically correlated with the unknown basic one. Let x_{j,t}, t = 1, …, n, be a related high-frequency series, x_j, j = 1, …, p, the corresponding (n × 1) vector, and X the associated (n × p) matrix of all the available related series. Using the same notation as for the original series, we denote by x_{0j} the aggregated (N × 1) vector corresponding to the related series x_j and by X_0 the aggregated (N × p) matrix of all related series.

We suppose that the unknown series and the available high-frequency series are related by the linear regression model

    y = Xβ + u,    (7)

with u an (n × 1) zero-mean vector of stochastic disturbances.

If the basic and the aggregated processes are described by non-stationary ARIMA models, we pose

    w_t = (1 − B)^d y_t,    t = d, …, n,

and

    u_T = (1 − B̃)^D y_{0,T},    T = D, …, N;

note that D = d due to the relationships between the aggregated and disaggregated ARIMA models (see Barcellan and Di Fonzo, 1993). Thus the following relationships hold:

    w = Δ^d_n y,    u = Δ^D_N y_0,    (8)

with Δ^d_l the [(l − d) × l] matrix of differentiation of the form

              | δ_1 δ_2 …  δ_{d+1} 0   …   0       |
    Δ^d_l =   | 0   δ_1 δ_2 …  δ_{d+1} …  0        |    (9)
              | ⋮                           ⋮       |
              | 0   …   0   δ_1 δ_2 …  δ_{d+1}     |

where the δ_i are the coefficients of (L − 1)^d, with L equal to B or B̃ according to the kind of difference involved.

We are usually interested, besides the distribution or interpolation problem, in forecasting future values of the unknown derived series (extrapolation). In this case no corresponding aggregated value is available, so we have to forecast future disaggregated values by using all past information. Let y_F be the ((n + r) × 1) vector of unknown data, where r is the forecast horizon. Then the temporal aggregation constraint becomes

    y_0 = [D 0_r] y_F = D_F y_F.    (10)

3 Mathematical and adjustment procedures

ECOTRIM offers a set of univariate mathematical and adjustment procedures to obtain the disaggregated series from an aggregated one. These methods usually correspond to a quadratic loss function minimisation problem with reference to the disaggregated series values. While, due to their mathematical nature, they do not offer, like other approaches, related information such as diagnostics or confidence intervals, they represent a simple and quick way of obtaining series which can be used as preliminary input for other techniques (for example, in multivariate adjustment methods). The series thus obtained often supply good estimates for the problem at hand.

3.1 Boot, Feibes and Lisman's procedure

Boot, Feibes and Lisman (1967, hereafter BFL) proposed a mathematical procedure to solve the univariate problem of temporal disaggregation. The main idea is that a plausible criterion is to minimise the sum of squared differences between successive disaggregated values (first differences model, FD) or to minimise the sum of squared second differences (second differences model, SD), subject to the aggregation constraints. The subperiod values are determined on the
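The differencing matrices of (8)-(9) can be built by composing first differences. A quick check (our names) that Δ² carries the coefficients of (L − 1)² and returns the constant second differences of a quadratic:

```python
import numpy as np

n = 8

def d1(k):
    """First-difference matrix Delta^1_k, of size (k-1) x k."""
    return np.diff(np.eye(k), axis=0)

D2 = d1(n - 1) @ d1(n)                # Delta^2_n by composition

# each row carries the coefficients of (L - 1)^2 = 1 - 2L + L^2
assert np.allclose(D2[0, :3], [1.0, -2.0, 1.0])

y = np.arange(n, dtype=float) ** 2    # y_t quadratic in t
print(D2 @ y)                         # constant second differences: all 2.0
```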
TABLE OF CONTENTS
basis of the values of the previous subperiods, that is,
in an autoregressive manner.
It is possible to show that the system matrix has full
rank so the system (11) can be easily solved.
BFL developed this idea for the particular case of
temporal aggregation in strict sense of quarterly
values. A generalised approach, for the strict sense
case, has been proposed by Cohen, Müller and
Padberg (1971). More generally, here we consider
linear combinations of m disaggregated values. From
a mathematical point of view the disaggregation
problem corresponds to the following
3.2 Ginsburgh’s procedure
∑ [(1− B)
n= mN
min
yj
Ginsburgh (1973) proposed an adjustment procedure
to solve the disaggregation problem by using related
series.
He supposed that a linear regression model, which
involves the aggregated values y 0 , Τ and the
aggregated related values x 0,Τ , Τ = 1,K, N, holds:
],
2
d
j =d + 1
yj
y0,T = β0 + β1 x 0,T + u 0,T ,
(12)
Then he supposed that the unknown y t ' s can be
estimated by
subject to the aggregation constraints (4) or, in matrix
form,
~y = β$ + β$ x ,
t
0
1 t
( )
min y' ∆dn ' ∆dn y,
y
T = 1,K , N .
t =1,K , n.
Obviously these estimates do not generally fulfill the
aggregation constraints. Thus the ~y' s have to be
adjusted in order to fulfill them 2.
subject to
y0,T = D ' y,
In practicethe required steps are:
where (1− B) y j is a general expression for the dth
d
1 Interpolation by BFL method of both the aggregated
series and the related aggregated indicators;
differences version. In the special case of the first
second differences versions to be treated here d
assumes the values 1 and 2 respectively.
2 Computation of the disaggregated final values as
follows
The problem is solved by considering the Lagrangean
expression
BFL
BFL

y$ m (t − 1) + j = ym (t − 1 ) + j + β$ 1  xm (t − 1 )+ j − x m (t − 1) + j

j =1,K , m.
( )
Ld ( y, y0 , λ) = y' ∆dn ' ∆dn y − λ( Dy − y0 ).
( )
aggregated regression equation (12), y BFL and
xBFL represent the preliminary estimates obtained by
BFL.
It can be easily shown that the estimates $y' s are
obtained by using an adjustment method. The
preliminary estimates ~y t = β$ 0 + β$ 1 xt , t = 1,K , n, with
β$ and β$ OLS parameter estimates in (12), are
(11)
0
2
(13)
where β$ 1 represents the OLS estimate of β1 in the
Upon differentiating with respect to y' s and λ' s and
equating the resulting expression to zero we obtain a
system of n + N linear equations of the general form
2 ∆dn ' ∆dn − D   y  0 

   =  .
D
0  λ   y 0 


,

1
Such a kind of method was first proposed by Vangrevelinghe (1966). The main difference between the two
methods consists of using a different mathematical procedure for the first step (BFL for Ginsburgh’ s method
and Lisman and Sandee, 1964, for Vangrevelinghe’s method).
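As a numerical illustration of the BFL step, the following sketch (a toy example with assumed sizes and data, not part of the original text) solves the Lagrangean system (11) for the first-difference case, disaggregating three annual totals into quarters:

```python
import numpy as np

# Assumed toy sizes: N = 3 annual totals, m = 4 quarters, d = 1
N, m = 3, 4
n = m * N

# (n-1) x n first-difference matrix: row t maps y to y[t+1] - y[t]
Delta = np.eye(n, k=1)[:n - 1] - np.eye(n)[:n - 1]
# N x n aggregation matrix D: each annual value is the sum of its m quarters
D = np.kron(np.eye(N), np.ones((1, m)))

y0 = np.array([100.0, 112.0, 130.0])   # illustrative annual totals

# System (11): [[2 Delta'Delta, -D'], [D, 0]] [y; lambda] = [0; y0]
A = np.block([[2.0 * Delta.T @ Delta, -D.T],
              [D, np.zeros((N, N))]])
y = np.linalg.solve(A, np.concatenate([np.zeros(n), y0]))[:n]

print(np.allclose(D @ y, y0))   # True: the quarterly path meets the totals
```

Replacing the first-difference matrix by its square gives the second-difference (d = 2) variant of the same system.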
3.3 Denton's procedure

Denton (1971) developed a pure adjustment method. In fact, he did not consider the issue of estimating the preliminary series, but dealt with the problem of distributing the discrepancies in order to fulfil the aggregation constraints.

The problem is to adjust a vector p of preliminary data to obtain estimates which minimise, in some sense, the distortion of the preliminary series and which are jointly coherent with the temporally aggregated data.

More generally, we specify a penalty function f(y, p) and express the problem as that of choosing y in order to minimise this penalty function. We wish to solve the problem of minimising a quadratic loss function in the differences between the preliminary and the adjusted series:

min_y (y − p)' M (y − p),   (14)

subject to the aggregation constraints, where M is a symmetric (n × n) nonsingular matrix to be specified. We set up a Lagrangean expression and write

L = (y − p)' M (y − p) − 2λ' (y_0 − D y).

Differentiating with respect to the y's and λ's and equating the resulting expressions to zero, we obtain a system of n + N linear equations of the general form

[ y ]   [ M   D' ]^{-1} [ M p ]
[ λ ] = [ D   0  ]      [ y_0 ],

whose solution is

y = p + Κ (y_0 − D p),   Κ = M^{-1} D' (D M^{-1} D')^{-1}.

Therefore, the preliminary estimates are adjusted by distributing the aggregated discrepancies according to the matrix of weights Κ.

If we suppose that M is the identity matrix, we simply distribute the discrepancies in equal amounts among the m subperiod values. A more interesting possibility is to employ a penalty function based on the dth differences of the preliminary and adjusted series. Let

Δ_{d,n} = (Δ_{1,n})^d,   with   Δ_{1,n} = [ k ; Δ_1 ],

where k is the (1 × n) vector [1 0 … 0] stacked on top of the first-difference matrix Δ_1; thus the penalty function is

f(y, p) = (y − p)' Δ'_{d,n} Δ_{d,n} (y − p),

with M = Δ'_{d,n} Δ_{d,n}. Thus for the additive first differences and additive second differences cases, which are of interest to us, the matrix M assumes the form

M = Δ'_{1,n} Δ_{1,n}

and

M = Δ'_{1,n} Δ'_{1,n} Δ_{1,n} Δ_{1,n},

respectively.

The calculation of the estimated values, as pointed out by Denton, can be carried out without difficulty, since the form of the matrices involved makes them easy to deal with (see Denton, 1971, and Di Fonzo, 1987). Note that we have to eliminate from the penalty function all values outside the adjustment range; thus we have set y_t = p_t for t = 0, −1, …, 1 − d³.

³ That is, we assume that (y_0 − p_0) = (y_{−1} − p_{−1}) = … = 0.

Finally, we note that the main problem of this approach is the choice of initial values. Cholette (1988) gives a solution to this problem.

4 ARIMA model based approach

The main idea of the ARIMA model based approach is that, to estimate the high-frequency series, we can use the information supplied by the ARIMA model associated to the available aggregated low-frequency series and by the theoretical relationships which link aggregated and disaggregated ARIMA models.
As a result of the ARIMA model based approach, some further information can be obtained using these methods. In particular, all the techniques considered provide confidence intervals, and some of them supply an estimate of the disaggregated ARIMA model, besides the opportunity to predict future values. Let us note that, with reference to the first two procedures we are going to analyse, the only information available corresponds to the aggregated series and to its ARIMA model, so we deal with a problem with a strong lack of information.

Despite this, in practice these methods require more computational resources and a more active role of the researcher.

4.1 Wei and Stram's procedure

Since the ARIMA model and its autocovariance structure are closely related, Stram and Wei (1986b) and Wei and Stram (1990) considered the possibility of estimating the autocovariance structure of the unknown series from the available autocovariances of the aggregated model.

The relationship between the disaggregated and the aggregated covariance structure is the basis of the approach. We refer to the stationary series w_t = (1 − B)^d y_t and u_T = (1 − B)^d y_{0,T}, obtained by differencing the basic and the aggregated series respectively. The following result (Barcellan and Di Fonzo, 1993) states the relationship between the covariances of {w_t} and {u_T}:

γ_u(k) = (1 + B + … + B^{m−1})^{2d} Σ_{j=0}^{m−1} Σ_{i=0}^{m−1} c_{m−j} c_{m−i} γ_w[mk + (m − 1)d + i − j].   (15)

Thus each autocovariance γ_u(k) can be expressed as a linear combination of the γ_w(j)'s, with j varying from km − (d + 1)(m − 1) to km + (d + 1)(m − 1). We can use a matrix form to express (15), that is,

γ*_u = Α* γ*_w,

where γ*_u = [γ_u(0) γ_u(1) …]', γ*_w = [γ_w(−(d + 1)(m − 1)) … γ_w(0) …]' and Α* is the banded matrix whose first row is [v' 0' 0' …] and whose each subsequent row shifts v' by m positions to the right, with v the [(2(d + 1)(m − 1) + 1) × 1] vector of the coefficients of B^i, i = 0, …, 2(d + 1)(m − 1), in the polynomial

(1 + B + … + B^{m−1})^{2d} Σ_{i=0}^{m−1} Σ_{j=0}^{m−1} c_{m−i} c_{m−j} B^{(m−1)−i+j}.

Let us suppose we are interested only in the first h autocovariances of {u_T}. Then we consider only the "adjusted" upper left corner of the matrix Α*; let us indicate it with A, of dimension [(h + 1) × (l + 1)], with l = hm + (m − 1)(d + 1) (Stram and Wei, 1986a, Barcellan and Di Fonzo, 1993). Let γ_u = [γ_u(0) … γ_u(h)]' and γ_w = [γ_w(0) … γ_w(l)]'; then

γ_u = Α γ_w.   (16)

Then the disaggregation problem is dealt with according to two components:

• the identification and estimation of the basic ARIMA model;
• the estimation of the disaggregated series values.

In order to obtain the estimates for the y's, Stram and Wei (1986b) and Wei and Stram (1990) developed a generalised least squares procedure, for the strict sense aggregation case, which minimises the quadratic form w' V_w^{-1} w subject to the temporal aggregation constraint y_0 = D y. Barcellan and Di Fonzo (1994) extended these results to general linear combinations, obtaining

ŵ = V_w (C^d)' V_u^{-1} u = V_w (C^d)' V_u^{-1} Δ_N^d y_0.
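The entries of the coefficient vector v introduced above are plain polynomial coefficients, so they can be generated by discrete convolution. The sketch below (assumed m, d and flow weights c_i = 1, purely illustrative) builds v and checks that it has the stated length 2(d + 1)(m − 1) + 1:

```python
import numpy as np

m, d = 4, 1
c = np.ones(m)               # assumed aggregation weights c_1, ..., c_m

# coefficients of (1 + B + ... + B^{m-1})^{2d}
poly = np.array([1.0])
for _ in range(2 * d):
    poly = np.convolve(poly, np.ones(m))

# coefficients of sum_{i,j} c_{m-i} c_{m-j} B^{(m-1)-i+j}
q = np.zeros(2 * (m - 1) + 1)
for i in range(m):
    for j in range(m):
        q[(m - 1) - i + j] += c[m - 1 - i] * c[m - 1 - j]

v = np.convolve(poly, q)     # the [(2(d+1)(m-1)+1) x 1] vector v
print(len(v))                # 13 = 2(d+1)(m-1) + 1 for m = 4, d = 1
```

With c_i = 1 this is simply the coefficient vector of (1 + B + … + B^{m−1})^{2(d+1)}, which provides a quick cross-check of the stated degree.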
The estimate of the disaggregated series is then

     [      Δ_n^d     ]^{-1} [ V_w (C^d)' V_u^{-1} Δ_N^d ]
ŷ =  [ 0 ⋮ Ι_d ⊗ c'   ]      [          0 ⋮ Ι_d          ] y_0,   (17)

where C^d is the [(N − d) × (n − d)] banded matrix whose first row is [g_0 g_1 … g_{(m−1)(d+1)} 0 … 0] and whose each subsequent row shifts these coefficients by m positions to the right, g_j representing the coefficient of B^j, j = 0, …, (m − 1)(d + 1), in the polynomial

(1 + B + … + B^{m−1})^d Σ_{i=0}^{m−1} c_{m−i} B^i.

In practice the matrices V_w and V_u are usually unknown and have to be estimated using the available data. While V_u can easily be calculated by using the aggregated model parameter estimates, the estimation of V_w is not so immediate. Wei and Stram (1990) founded their disaggregation procedure on the relationship (16). This is not generally a one-to-one relationship but, under suitable assumptions, it permits one to estimate the disaggregated ARIMA model and then the associated autocovariance matrix (see Wei and Stram, 1990).

4.2 Al-Osh's procedure

Al-Osh (1989) considered a dynamic linear model approach for disaggregating time series. He used an appropriate state space representation of the ARIMA model which describes the unknown series and which takes care of the temporal aggregation constraint.

Let φ_p(B)(1 − B)^d = Ξ(B) = 1 − ξ_1 B − … − ξ_{p+d} B^{p+d}, r = max(p + d, q + 1) and h = m + r − 1. Thus

y_T = z' α_{mT},   T = 1, 2, …, N,   (18)

with z a (h × 1) vector whose elements are

z_i = c_m for i = 1;  z_i = 0 for i = 2, …, r;  z_i = c_{m−j} for i = r + j, j = 1, …, m − 1,

and α_{mT} a (h × 1) vector of h state variables with the following form:

α_{mT}(i) = y_{mT}   for i = 1;
α_{mT}(i) = Σ_{j=i−1}^{r} ξ_j y_{mT+i−1−j} − Σ_{j=i−1}^{r−1} ϑ_j a_{mT+i−1−j}   for i = 2, …, r;
α_{mT}(i) = y_{mT+r−i}   for i = r + 1, …, h.

Note that some ξ_i or ϑ_j are zero unless p + d = q + 1. The vector α_{mT} is dynamic and changes according to the state equation

α_{mT} = F α_{m(T−1)} + G η_{mT},   T = 1, 2, …, N,   (19)

with η_{mT} = [a_{mT} a_{mT−1} … a_{m(T−1)+1}]' a (m × 1) Gaussian vector with zero mean and covariance matrix Q; F and G are a (h × h) and a (h × m) matrix respectively, derived (Al-Osh, 1989, p. 88) from the state space representation of the disaggregated ARIMA model and from the temporal aggregation constraints.

The estimated state vectors and the corresponding covariance matrices are obtained by using the Kalman filter with reference to the state space representation (18) and (19). In order to estimate the values of the unknown series we need the maximum likelihood estimates of the parameters of the identified model, obtained by using the available information and the recursive Kalman filter equations (Al-Osh, 1989, p. 90).

The extrapolation problem can be treated by noting that the aggregation constraints are fulfilled if we use the relationship (7). As we shall see for the optimal approach (see par. 5.5), the extrapolation problem can be solved by using the augmented aggregation matrix [0_r | D] instead of D.

4.3 Guerrero's procedure

Let y_t, t = 1, …, n, be the unknown high-frequency series, which we suppose admits the ARIMA representation (1). We consider the minimum mean square error (MMSE) linear estimator of y_t based on the information up to time 0 < t (Box and Jenkins, 1976), given by the conditional expectation

Ε(y_t | y_0, y_{−1}, …) = Ε_0(y_t).
The forecast error, in terms of the pure MA representation, is

y_t − Ε_0(y_t) = Σ_{j=0}^{t−1} τ_j a_{t−j},   (τ_0 ≡ 1),

with the τ's obtained by equating coefficients of the powers of B in the equation τ(B) φ(B) (1 − B)^d = θ(B). In matrix form,

y − Ε_0(y) = Τ a,   (20)

where T is a lower triangular matrix formed by the sequence of weights 1, τ_1, …, τ_{n−1} in the first column, the weights 0, 1, …, τ_{n−2} in the second column, and so on. The linear MMSE estimator of y is

ŷ = Ε_0(y) + Â [y_0 − D Ε_0(y)],

with

Â = T T' D' (D T T' D')^{-1},

and the estimation error covariance matrix is

Ε_0[(ŷ − y)(ŷ − y)'] = σ_a² (Ι − Â D) T T'.

In practice, we assume that a preliminary estimate p exists and that {p_t} and {y_t} have essentially the same autocorrelation structure, so that they admit basically the same ARIMA model but p does not fulfil the aggregation constraints.

Let us assume that Ε(y_t | p) = p_t and that Ε_0(y) is independent of p. Then, taking the conditional expectation of (20), given p, we have

p − Ε_0(y) = Ε(y | p) − Ε[Ε_0(y) | p] = T Ε(a | p),

and subtracting this from (20) we obtain

y − p = T a*,   (21)

with a* = a − Ε(a | p) a random vector such that

Ε(a* | p) = 0,   Ε(a* a*' | p) = σ² Q,

with Q a positive definite matrix.

The BLUE of y, given p and the aggregation constraints, is

ŷ = p + Â [y_0 − D p]   (22)

with

Â = T Q T' D' (D T Q T' D')^{-1},

and the estimation error covariance matrix is

cov[(ŷ − y) | p] = σ² (Ι − Â D) T Q T'.   (23)

As we can see, this estimator consists of two components: the former corresponds to the preliminary estimated series; the latter can be interpreted as a term which adjusts the first component by using a weighted combination, whose weights are derived from the ARIMA representation by Â, of the discrepancy between the aggregated and the preliminary aggregated series.

Obviously, we should be able to estimate σ² and T Q T' in order to obtain the estimator. A simple case occurs when Q = Ι, since expressions (22) and (23) become easier and σ² can be estimated from

(n − k) σ̂² = a*' a* = (ŷ − p)' (T T')^{-1} (ŷ − p),

where k is the number of estimated parameters in the ARIMA model. If this is not the case, Guerrero (1990) proposed the following EGLS procedure. Given (21), let

e = Κ a* = Κ T^{-1} (y − p) = Ω^{-1} (y − p),

with Κ a non-singular matrix such that Κ Q Κ' = Ι, and proceed according to the steps:

1. assume Q = Ι and calculate the BLU estimator. Construct the series â = T^{-1} (ŷ − p). If this series can be assumed to be generated by a white noise, the assumption Q = Ι is valid, and so are the estimator and the estimated error covariance matrix. If it is not, we must build an ARIMA model for {y_t − p_t} and obtain its pure MA representation, in order to get an estimate of the matrix Ω;

2. make use of the relationship

Ω Ω' = T (Κ' Κ)^{-1} T' = T Q T'

to calculate Â for the BLU estimator and then calculate ŷ. Finally, estimate σ² by means of

(n − k) σ̂² = e*' e* = (ŷ − p)' (T Q T')^{-1} (ŷ − p),

with k equal to the number of parameters of the ARIMA model used to estimate Ω.

5 The best linear unbiased estimator approach

This approach was developed by Chow and Lin (1971). The idea is to relate the regression model for the disaggregated series to the corresponding aggregated regression model. Obviously, this method requires related series.

Let us consider the disaggregated regression model (10), which relates the series to be estimated to the available related series, and the corresponding aggregated regression model obtained by pre-multiplying this equation by the aggregation matrix D:

D y = D X β + D u,

that is,

y_0 = X_0 β + u_0.

We consider the problem of predicting y in a general linear model (Goldberger, 1962). A linear unbiased estimator ŷ of y satisfies, for some matrix A,

ŷ = A y_0 = A (X_0 β + u_0)   (24)

and

Ε(ŷ − y) = Ε[A (X_0 β + u_0) − (X β + u)] = (A X_0 − X) β = (A D X − X) β = 0.   (25)

The conditions (24) and (25) imply

A X_0 − X = 0,   (26)

ŷ − y = A u_0 − u.

The covariance matrix of (ŷ − y) is therefore

Cov(ŷ − y) = Ε[(ŷ − y)(ŷ − y)'] = A V_0 A' − A V_u' − V_u A' + V,   (27)

where V_0 denotes Ε(u_0 u_0'), V denotes Ε(u u') and V_u denotes Ε(u u_0').

To find the best linear unbiased estimator ŷ, Chow and Lin minimised the trace of (27) with respect to A subject to the matrix equation A X_0 − X = 0. Using a matrix H of Lagrange multipliers, we form the Lagrangean expression

L = (1/2) tr[A V_0 A' − A V_u' − V_u A' + V] − tr[H' (A X_0 − X)]

and, setting its partial derivatives with respect to A equal to 0, we obtain

A V_0 − V_u = H X_0'.   (28)

Solving (28) for A gives A = H X_0' V_0^{-1} + V_u V_0^{-1}, which, when substituted in (26), gives the solution for H:

H X_0' V_0^{-1} X_0 + V_u V_0^{-1} X_0 − X = 0,

or

H = X (X_0' V_0^{-1} X_0)^{-1} − (V_u V_0^{-1} X_0)(X_0' V_0^{-1} X_0)^{-1}.

The solution for A is then

A = X (X_0' V_0^{-1} X_0)^{-1} X_0' V_0^{-1} + V_u V_0^{-1} [Ι − X_0 (X_0' V_0^{-1} X_0)^{-1} X_0' V_0^{-1}].   (29)

The resulting estimator is

ŷ = A y_0 = X β̂ + V_u V_0^{-1} û_0,   (30)

where

β̂ = (X_0' V_0^{-1} X_0)^{-1} X_0' V_0^{-1} y_0   (31)

is the generalised least squares estimate of the regression coefficients for the aggregated model, and

û_0 = [Ι − X_0 (X_0' V_0^{-1} X_0)^{-1} X_0' V_0^{-1}] y_0 = y_0 − X_0 β̂

represents the (N × 1) vector of residuals in the aggregated regression.

Since

V_u = Ε(u u_0') = Ε[(y − X β)(y_0 − X_0 β)'] = Ε[(y − X β)(D y − D X β)'] = V D',

then

ŷ = X β̂ + V D' V_0^{-1} û_0 = X β̂ + L û_0,   (32)

with L = V D' V_0^{-1}.

We note that the estimate (32) consists of two components:

1. the former, X β̂, applies the estimated regression coefficients to the disaggregated observations of the related variables in order to obtain a preliminary estimate of the dependent variable;

2. the latter can be interpreted as a term which adjusts the first component by using a weighted combination, whose weights are given by the matrix L, of the estimated residuals of the aggregated regression model.

It is easy to show that the estimates produced by this method fulfil the aggregation constraints:

D ŷ = D X β̂ + D V D' V_0^{-1} û_0 = X_0 β̂ + û_0 = y_0,

recalling that V_0 = Cov(D u) = D V D'.

The covariance matrix of the errors (ŷ − y) is (Bournay and Laroque, 1979)

Cov(ŷ − y) = Ε[(ŷ − y)(ŷ − y)'] = (Ι − L D) V + (X − L X_0)(X_0' V_0^{-1} X_0)^{-1} (X − L X_0)'.

As we can see, it is formed by two components: the former depends only on D and V; the latter increases with (X − L X_0). This result can be used to evaluate reliability indicators (Di Fonzo, 1987, p. 44) and confidence intervals.

5.1 Estimation of the covariance matrix of residuals

Like any other generalised least squares estimator, we need to know the covariance matrix V in order to calculate the estimated series (32). In practice this matrix is unknown and has to be estimated. The usual practice is to assume some structure for the high-frequency disturbances.

The simplest case assumes that the disturbances are serially uncorrelated, each with variance σ²; in this case V = σ² Ι. A more attractive case is that of an autoregressive structure.

5.2 AR(1) model

If the disaggregated disturbances follow a first order autoregressive process

u_t = ρ u_{t−1} + v_t,   t = 1, …, n,

with |ρ| < 1 and v_t a white noise of variance σ_v², the variance-covariance matrix is equal to

                       [ 1         ρ         ρ²   …   ρ^{n−1} ]
V = σ_v² / (1 − ρ²) ·  [ ρ         1         ρ    …   ρ^{n−2} ]
                       [ ρ²        ρ         1    …   ρ^{n−3} ]
                       [ …         …         …    …   …       ]
                       [ ρ^{n−1}   ρ^{n−2}   …    …   1       ].

In practice ρ is unknown and has to be estimated in a suitable way. If we suppose that the disaggregated disturbances are normally distributed, we can estimate ρ, β and σ_v² using the likelihood function

(2π σ_v²)^{-N/2} |V_0|^{-1/2} exp{ −(1/(2σ_v²)) (y_0 − X_0 β)' V_0^{-1} (y_0 − X_0 β) }.   (33)

As noted by Bournay and Laroque (1979), this corresponds to a problem of maximisation with respect to ρ of the log-likelihood function. From a technical point of view, a scanning procedure is adopted to estimate ρ: we assign a set of values between −1 and 1 to ρ, then determine V, β̂ and σ̂_v², and choose the value which maximises the log-likelihood function (for more details see Di Fonzo, 1987).

Barbone, Bodo and Visco (1981) proposed to minimise the weighted sum of squared residuals, that is,

û_0' V_0^{-1} û_0,   (34)

which corresponds to the argument of the exponential function in (33). As we can see, this is an EGLS approach.

5.3 Fernandez's procedure

Fernandez (1981) considered the usual regression model, assuming that the disturbance u_t follows a random walk process

u_t = u_{t−1} + v_t,   t = 1, …, n,

with v_t a white noise with variance σ². Thus

Cov(u_t u_l) = σ² min(t, l),   t, l = 1, …, n.

In matrix form it can be shown that

V = σ² (Δ'_{1,n} Δ_{1,n})^{-1}.   (35)

Substituting the right-hand side of (35) in (31) and (32) yields

ŷ = X β̂ + (Δ'_{1,n} Δ_{1,n})^{-1} D' [D (Δ'_{1,n} Δ_{1,n})^{-1} D']^{-1} û_0

with

β̂ = { X_0' [D (Δ'_{1,n} Δ_{1,n})^{-1} D']^{-1} X_0 }^{-1} X_0' [D (Δ'_{1,n} Δ_{1,n})^{-1} D']^{-1} y_0.

This procedure presents some advantages, like its computational ease: it allows one to estimate β and σ² in a generalised regression model framework with a known disturbance covariance matrix.

5.4 Litterman's procedure

Litterman (1983) supposed that the disturbance process is described by

u_t = u_{t−1} + e_t,   t = 1, …, n,
e_t = μ e_{t−1} + v_t,   t = 1, …, n,

with |μ| < 1 and initial conditions u_0 = e_0 = 0. Letting H be the (n × n) matrix

    [  1    0    0   …   0   0 ]
    [ −μ    1    0   …   0   0 ]
H = [  0   −μ    1   …   0   0 ]
    [  …    …    …   …   …   … ]
    [  0    0    0   …  −μ   1 ],

the covariance matrix of u is equal to

V = σ² (Δ'_{1,n} Η' Η Δ_{1,n})^{-1}.   (36)

Substituting the right-hand side of (36) in (31) and (32) yields

ŷ = X β̂ + (Δ'_{1,n} Η' Η Δ_{1,n})^{-1} D' [D (Δ'_{1,n} Η' Η Δ_{1,n})^{-1} D']^{-1} û_0

with

β̂ = { X_0' [D (Δ'_{1,n} H' H Δ_{1,n})^{-1} D']^{-1} X_0 }^{-1} X_0' [D (Δ'_{1,n} H' H Δ_{1,n})^{-1} D']^{-1} y_0.
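To make the optimal approach concrete, the following sketch implements the Chow-Lin estimator (31)-(32) with an AR(1) disturbance. All data, sizes and the value of ρ are assumed for illustration (in practice ρ is obtained by the scanning/likelihood procedure described above); replacing V gives the Fernandez and Litterman variants:

```python
import numpy as np

# Assumed toy setup: N = 5 annual values, quarterly target, fixed rho
N, m = 5, 4
n = m * N
rho, sigv2 = 0.7, 1.0

D = np.kron(np.eye(N), np.ones((1, m)))                   # aggregation matrix
rng = np.random.default_rng(3)
x = np.linspace(10.0, 30.0, n) + rng.normal(0.0, 0.5, n)  # related series
X = np.column_stack([np.ones(n), x])                      # with intercept
X0 = D @ X
y0 = D @ (2.0 + 1.5 * x) + rng.normal(0.0, 1.0, N)        # aggregated y

# AR(1) covariance: V[t, l] = sigma_v^2 / (1 - rho^2) * rho^|t - l|
idx = np.arange(n)
V = sigv2 / (1.0 - rho**2) * rho ** np.abs(idx[:, None] - idx[None, :])
V0 = D @ V @ D.T                                          # Cov(Du) = DVD'

V0inv = np.linalg.inv(V0)
beta = np.linalg.solve(X0.T @ V0inv @ X0, X0.T @ V0inv @ y0)   # (31)
u0 = y0 - X0 @ beta                                            # residuals
y = X @ beta + V @ D.T @ V0inv @ u0                            # (32)

print(np.allclose(D @ y, y0))   # True: aggregation constraints are fulfilled
```

The final check mirrors the constraint-fulfilment result D ŷ = y_0 proved above.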
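As a quick numerical check of the random-walk and Litterman covariance structures (sizes and μ assumed, σ² set to 1), the sketch below builds (36) and verifies that the μ = 0 (Fernandez) case indeed reproduces Cov(u_t, u_l) = min(t, l):

```python
import numpy as np

n, mu = 12, 0.5   # assumed series length and Litterman parameter

# Delta_{1,n}: the row [1, 0, ..., 0] stacked on the first-difference matrix
e1 = np.zeros((1, n)); e1[0, 0] = 1.0
Delta1n = np.vstack([e1, np.eye(n, k=1)[:n - 1] - np.eye(n)[:n - 1]])

H = np.eye(n) - mu * np.eye(n, k=-1)               # 1 on diagonal, -mu below
V = np.linalg.inv(Delta1n.T @ H.T @ H @ Delta1n)   # (36) with sigma^2 = 1

# Fernandez's random-walk case (mu = 0): V = (Delta'Delta)^{-1} = min(t, l)
V_rw = np.linalg.inv(Delta1n.T @ Delta1n)
t = np.arange(1, n + 1)
print(np.allclose(V_rw, np.minimum.outer(t, t)))   # True
```

This confirms relation (35) numerically; with μ ≠ 0 the matrix V above can be plugged directly into the Chow-Lin formulas (31)-(32).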
Some computational problems can arise when estimating μ. Di Fonzo (1987) proposed a scanning approach like the one followed for the AR(1) model⁴.

⁴ The Stram and Wei procedure offers a general form for the optimal approach which handles a large class of error assumptions, corresponding to the ARIMA models of which the procedures illustrated in this section are sub-cases, up to initial conditions. The optimal solution to the disaggregation problem is formally equal to (32) with β̂ and L given by

β̂ = { (Δ_N^d X_0)' [C^d V_w (C^d)']^{-1} (Δ_N^d X_0) }^{-1} (Δ_N^d X_0)' [C^d V_w (C^d)']^{-1} Δ_N^d y_0   (37)

and

     [      Δ_n^d     ]^{-1} [ V_w (C^d)' [C^d V_w (C^d)']^{-1} Δ_N^d ]
L =  [ 0 ⋮ Ι_d ⊗ c'   ]      [               0 ⋮ Ι_d                 ].   (38)

We immediately recognise in β̂ the generalised least squares estimator of β in the regression model

Δ_N^d y_0 = Δ_N^d X_0 β + Δ_N^d u_0,

with error covariance matrix

Ε[Δ_N^d u_0 u_0' (Δ_N^d)'] = Ε[C^d w w' (C^d)'] = C^d V_w (C^d)'.

5.5 Extrapolation

To solve the extrapolation problem we look for the best linear predictor, based on y_0, of

y_{n+j} = x'_{n+j} β + u_{n+j},   j = 1, 2, …,

with Ε(u_{n+j}) = 0 and Ε(u²_{n+j}) = σ²_{n+j} < +∞.

According to sec. 5 the solution is

ŷ_{n+j} = x'_{n+j} β̂ + w'_{n+j} V_0^{-1} û_0,

with w_{n+j} the (N × 1) vector of the covariances between u_{n+j} and u_0.

Di Fonzo (1987, p. 55) illustrates how to calculate forecasts according to the chosen estimation procedure. In practice we use the same approach as for the estimation problem, substituting the aggregation matrix D with the augmented matrix D_F in (7).

6 Multivariate disaggregation

ECOTRIM permits the estimation of M high-frequency series which have to satisfy both contemporaneous and temporal aggregation constraints (for technical details see Di Fonzo, 1994). Two different kinds of multivariate procedures have been implemented:

• adjustment methods;
• optimal methods.

In the former case the options are the following:

• Denton's multivariate procedure;
• weighted adjustment procedure;
• Rossi's procedure.

Denton's procedure represents a multivariate version of the univariate one, and ECOTRIM offers the following choices for the matrix of weights:

• Denton additive first differences;
• Denton additive second differences;
• Denton proportional first differences;
• Denton proportional second differences.

Rossi's procedure can be viewed as a sub-case of Denton's. The user can supply preliminary series, which may or may not satisfy the temporal aggregation constraints; otherwise he can calculate them in a preliminary univariate session of ECOTRIM. Note that Rossi's
and weighted adjustment procedures require preliminary series that fulfil the temporal aggregation constraints, while Denton's does not. In fact, both methods force the estimated series only to satisfy the contemporaneous constraints; therefore, in order to fulfil the temporal constraints too, the preliminary series must already fulfil them. On the other hand, Denton's procedure forces the estimated series to satisfy both aggregation constraints.

The adjustment step is related to a high-frequency series, say z, with which the estimated series have to be coherent. The adjustment procedures then modulate the discrepancies between this contemporaneously aggregated series and the sum of the available preliminary series. As far as the multivariate methods are concerned, ECOTRIM has two options:

• white noise;
• random walk,

which can be viewed as straightforward extensions of their univariate counterparts. Multivariate AR(1) models are being developed to enlarge this section of the program. Finally, in the multivariate case too, ECOTRIM offers the extrapolation option for the optimal methods and the weighted adjustment method.

7 Concluding remarks

In this paper I have presented the techniques offered by ECOTRIM to estimate the unknown original series by using the available information. Both univariate and multivariate issues have been considered, and the related solutions that ECOTRIM offers have been analysed.

According to the disaggregation method chosen, ECOTRIM can offer further information, such as diagnostics.

In particular, if we use an ARIMA model based approach we obtain, besides the estimated disaggregated series, an estimated ARIMA model for the high-frequency process and an estimate of the covariance matrix of the estimated values. Note that we do not give a significance level for the ARIMA parameters since, due to the non-linear relationships which link aggregated and disaggregated models, it is not straightforward to estimate their covariances using the corresponding aggregated information.

The optimal approach supplies much more information about regression results, such as regression diagnostics, parameter significance levels, confidence intervals and some other useful statistics. Similar indicators are obtained for the multivariate optimal approach.

Mathematical and adjustment procedures neither allow the calculation of diagnostics nor permit any extrapolation, except for the multivariate adjustment method, as stated in sec. 6.1.1.

The choice of the technique to be used in practice is left to the analyst, who must also consider the computational time required, besides the information at his disposal and the problem he is facing.

If related series are available, the optimal approach seems to offer many advantages: diagnostics, several approaches, extrapolation. If not, ARIMA model based procedures seem to offer the same opportunities (extrapolation, several approaches, some diagnostics), but they require a preliminary analysis in order to estimate the aggregated ARIMA model. This may represent a problem if the analysis must be carried out in an automatic way.
References

Al-Osh, M. (1989), "A dynamic linear model approach for disaggregating time series data", Journal of Forecasting, 8, pp. 65-96.

Barbone, L., Bodo, G. and Visco, I. (1981), "Costi e profitti in senso stretto: un'analisi su serie trimestrali, 1970-1980", Bollettino della Banca d'Italia, 36, numero unico, pp. 465-510.

Barcellan, R. and Di Fonzo, T. (1993), Temporal aggregation and time series: a survey, unpublished paper.

Barcellan, R. and Di Fonzo, T. (1994), "Disaggregazione temporale e modelli ARIMA", Atti della XXXVII Riunione Scientifica della S.I.S., San Remo, 6-8 April 1994, vol. II, pp. 355-362.

Bassie, V.L. (1958), Economic Forecasting, New York, McGraw-Hill.

Boot, J.C.G., Feibes, W. and Lisman, J.H.C. (1967), "Further methods of derivation of quarterly figures from annual data", Applied Statistics, 16 (1), pp. 65-75.

Bournay, J. and Laroque, G. (1979), "Réflexions sur la méthode d'élaboration des comptes trimestriels", Annales de l'INSEE, 36, pp. 3-30.

Box, G.E.P. and Jenkins, G.M. (1976), Time Series Analysis, Forecasting and Control, San Francisco, Holden-Day.

Cholette, P.A. (1988), "Benchmarking and interpolation of time series", Working Paper, Time Series Research and Analysis Division, Statistics Canada.

Chow, G.C. and Lin, A. (1971), "Best linear unbiased interpolation, distribution and extrapolation of time series by related series", The Review of Economics and Statistics, 53 (4), pp. 372-375.

Cohen, G.C., Müller, W.M. and Padberg, M.W. (1971), "Autoregressive approaches to disaggregation of time series data", Applied Statistics, 20 (2), pp. 119-129.

Denton, F.T. (1971), "Adjustment of monthly or quarterly series to annual totals: an approach based on quadratic minimisation", Journal of the American Statistical Association, 66 (333), pp. 99-102.

Di Fonzo, T. (1987), La stima indiretta di serie economiche trimestrali, Padova, Cleup.

Di Fonzo, T. (1990a), "The estimation of M disaggregated time series when contemporaneous and temporal aggregates are known", The Review of Economics and Statistics, 72 (1), pp. 178-182.

Di Fonzo, T. (1990b), "Stima indiretta di più serie economiche legate da un vincolo contabile", Statistica Applicata, vol. 2, n. 2, pp. 149-161.

Di Fonzo, T. (1994), "Temporal disaggregation of a system of time series when the aggregate is known: optimal vs adjustment methods", Proceedings of the Quarterly National Accounts Workshop, Paris, 5-6 December 1994.

Fernandez, R.B. (1981), "A methodological note on the estimation of time series", The Review of Economics and Statistics, 63 (3), pp. 471-478.

Ginsburgh, V.A. (1973), "A further note on the derivation of quarterly figures consistent with annual data", Applied Statistics, 22 (3), pp. 368-374.

Goldberger, A.S. (1962), "Best linear unbiased prediction in the generalised linear regression model", Journal of the American Statistical Association, 57, pp. 369-375.

Guerrero, V.M. (1990), "Temporal disaggregation of time series: an ARIMA-based approach", International Statistical Review, 58 (1), pp. 29-46.

Lisman, J.H.C. and Sandee, J. (1964), "Derivation of quarterly figures from annual data", Applied Statistics, 13 (2), pp. 87-90.

Litterman, R.B. (1983), "A random walk, Markov model for the distribution of time series", Journal of Business and Economic Statistics, 1 (2), pp. 169-173.

Rossi, N. (1982), "A note on the estimation of disaggregate time series when the aggregate is known", The Review of Economics and Statistics, 64, pp. 695-696.

Stram, D.O. and Wei, W.W.S. (1986a), "Temporal aggregation in the ARIMA process", Journal of Time Series Analysis, 7 (4), pp. 279-292.

Stram, D.O. and Wei, W.W.S. (1986b), "A methodological note on the disaggregation of time series totals", Journal of Time Series Analysis, 7 (4), pp. 293-302.

Vangrevelinghe, G. (1966), "L'évolution à court terme de la consommation des ménages: connaissance, analyse et prévision", Études et Conjoncture, 9, pp. 54-102.

Wei, W.W.S. and Stram, D.O. (1990), "Disaggregation of time series models", Journal of the Royal Statistical Society, B, 52 (3), pp. 453-467.
Estimating observations in ARIMA models with the Kalman filter
Víctor GÓMEZ
Two problems regarding missing observations are considered. The first concerns the
estimation of missing observations in time series. The second concerns the
disaggregation of time series totals, like, for example, the disaggregation of annual
time series data to quarterly figures. Both problems can be solved by setting up the
model in state space form and applying the Kalman filter.
KEY WORDS: Time Series, Missing Observations, Nonstationarity, ARIMA Models,
Temporal Aggregation
1 Introduction
This paper considers two problems that are usually
encountered when dealing with economic time series.
The first concerns the interpolation of missing
observations in a series, while the second has to do
with the distribution of time series data subject to
temporal aggregation.
As an example of the first problem, one can think of a stock variable, such as the money supply, which is published at monthly intervals but is only available on a quarterly basis. An example of the second problem is a flow variable, like investment, which is published at quarterly intervals but is only available on an annual basis. Note that a stock is the quantity of something at a particular point in time, while a flow is a quantity that accumulates over a given period of time.

For the problem of missing observations on a stock variable, the methodology of Gómez and Maravall (1994b) is used, which is implemented in a program called "Time Series Regression with ARIMA noise, Missing observations and Outliers" (or TRAMO, in short). This methodology is a generalization to non-stationary series of the skipping strategy of Jones (1980) for ARMA models. It uses the Kalman filter for prediction and likelihood evaluation, skipping the missing observations whenever they are encountered. In this way, estimation of model parameters is possible and then, at a later stage, the fixed point smoother can be used to interpolate unobserved values of the series. The methodology can be easily extended to the case of regression models with ARIMA noise and missing observations. Initial missing observations (i.e., missing observations among the first observations of the series lost by differencing) are considered fixed and are, therefore, treated as regression parameters.

To handle the problem of temporal aggregation of a flow variable, the same methodology of Gómez and Maravall (1994b) is applicable. Only a change in the state space representation is needed. To illustrate, a program in Fortran has been written and applied to some regression models with ARIMA noise. It is to be noted that this methodology does not require the construction and inversion of large covariance matrices. Also, the specification of the model for the noise is not constrained to an AR(1) process or a random walk, like in Chow and Lin (1971, 1976), Denton (1971) and Fernández (1981).
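The stock/flow distinction determines how the data are observed; the following toy numpy sketch (variable names and numbers are mine, not from the paper) shows the two observation patterns: a monthly stock sampled only at quarter ends leaves gaps to interpolate, while a quarterly flow observed only as annual totals must be distributed.

```python
import numpy as np

# Illustrative sketch only: the two observation patterns of the paper.
rng = np.random.default_rng(0)

# A monthly stock series, observed only at the end of each quarter.
monthly_stock = rng.normal(size=24).cumsum()        # 24 months of a stock
observed_stock = np.full(24, np.nan)
observed_stock[2::3] = monthly_stock[2::3]          # 8 quarter-end values seen

# A quarterly flow series, observed only as annual totals.
quarterly_flow = rng.normal(loc=10.0, size=8)       # 8 quarters of a flow
annual_totals = quarterly_flow.reshape(2, 4).sum(axis=1)  # only yearly sums seen

# 16 of the 24 monthly stock values are missing; the flow problem must
# distribute each of the 2 annual totals over its 4 quarters.
```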
2 State space representation and the Kalman filter

The following example will be used throughout the paper to illustrate matters:

z_t − z_{t−4} = a_t + θ a_{t−1}.

2.1 State space representation

Let the process {z_t} follow the ARIMA(p, d, q) model

φ(L) α(L) z_t = θ(L) a_t,

where L is the lag operator, L(z_t) = z_{t−1}, all the roots of the polynomial in L, α(L) = 1 + α_1 L + … + α_d L^d, are on the unit circle, all the roots of φ(L) = 1 + φ_1 L + … + φ_p L^p are outside the unit circle and those of θ(L) = 1 + θ_1 L + … + θ_q L^q are outside or on the unit circle. It is assumed that φ(L), α(L) and θ(L) have no common factors and that {a_t} is a sequence of i.i.d. N(0, σ²) variables. Let r = max{p + d, q + 1} and define φ*(L) = φ(L)α(L), ψ*(L) = θ(L)/φ*(L) = ∑_{i=0}^∞ ψ*_i L^i and φ_i = 0 if i > p.

Then, one state space representation for {z_t} which has minimum dimension is given by

x_t = F x_{t−1} + G a_t,   (2.1a)
z_t = H' x_t,   (2.1b)

t = 1, …, N, where x_t = (z_t, z_{t+1,t}, …, z_{t+r−1,t})', G = (1, ψ*_1, …, ψ*_{r−1})', H' = (1, 0, …, 0),

    | 0       1           0           …  0      |
    | 0       0           1           …  0      |
F = | …       …           …           …  …      |
    | 0       0           0           …  1      |
    | −φ*_r   −φ*_{r−1}   −φ*_{r−2}   …  −φ*_1  |

and z_{t+i,t} = z_{t+i} − ψ*_0 a_{t+i} − … − ψ*_{i−1} a_{t+1}, i = 1, …, r − 1. The expression z_{t+i,t} denotes the prediction of z_{t+i} based on {z_s : s ≤ t}. Thus, the state vector x_t contains z_t and its (r − 1)-periods-ahead forecast function with respect to the semi-infinite sample {z_s : s ≤ t}. Lemma 3 of Gómez and Maravall (1994b) ensures that the state space representation (2.1) is correct.

For the example model (p = 0, d = 4, q = 1, so r = 4), G = (1, θ, 0, 0)' and

    | 0 1 0 0 |
F = | 0 0 1 0 |
    | 0 0 0 1 |
    | 1 0 0 0 |.

The state space representation (2.1) is defined in terms of the original series. This will allow for the use of the Kalman filter for prediction and likelihood evaluation, and the fixed point smoother for interpolation, even in the case where there are missing observations.

2.2 Prediction error decomposition and the Kalman filter

Suppose first that there are no missing observations in the series z = (z_1, …, z_N)'. Letting u = (u_{d+1}, …, u_N)', where u_t = α(L)z_t denotes the transformation that renders {z_t} stationary, the likelihood of the ARIMA model is usually defined as the likelihood of the differenced series, L(u) (see Box and Jenkins 1976, chap. 7). In order to define the likelihood in terms of the original series, the following two assumptions will be made:

Assumption A: the variables {z_1, …, z_d} are independent of the variables {u_t}.

Assumption B: the variables {z_1, …, z_d} are jointly normally distributed.

The first assumption is a standard one when forecasting with ARIMA models (see Brockwell and Davis, 1992, pp. 314-317).

It is shown in Gómez and Maravall (1994b), pg. 614, that, under assumptions A and B, the conditional density

L(z_{d+1}, …, z_N | Z_d),   (2.2)

where Z_d = {z_1, …, z_d}, is a well defined density and coincides with the Box-Jenkins likelihood L(u). The conditional likelihood (2.2) admits the prediction error decomposition
L(z_{d+1}, …, z_N | Z_d) = ∏_{t=d+1}^{N} L(z_t | z_{t−1}, …, z_1),

which in turn suggests the application of the Kalman filter to the state space representation (2.1) to evaluate (2.2).

Following Bell (1984), pg. 650, the variable z_t can be expressed as

z_t = A'_t z_I + ∑_{i=0}^{t−d−1} ξ_i u_{t−i},   t > d,   (2.3)

where z_I = (z_1, …, z_d)', the A_t = (A_{1t}, …, A_{dt})' can be calculated recursively from

A_{ij} = δ_{ij},   i, j = 1, …, d,
A_{it} = −α_1 A_{i,t−1} − … − α_d A_{i,t−d},   t > d, i = 1, …, d,

δ_{ij} is the Kronecker delta and ξ(L) = 1/α(L) = ∑_{i=0}^∞ ξ_i L^i. From this, it is obtained that x_{d+1} = A_I z_I + Ξ U, where A_I = [A_{d+1}, …, A_{d+r}]', Ξ is the lower triangular r × r matrix with rows the vectors (ξ_{j−1}, ξ_{j−2}, …, 1, 0, …, 0), j = 1, …, r, U = (u_{d+1}, u_{d+2,d+1}, …, u_{d+r,d+1})' and u_{d+i,d+1} = E(u_{d+i} | u_s : s ≤ d + 1). The Kalman filter can then be initialized with

x̂_{d+1,d} = A_I z_I   and   Σ_{d+1,d} = Ξ E(UU')Ξ',   (2.4)

where E(UU') can be computed from the stationary process {u_t} as in Jones (1980), x̂_{d+1,d} = E(x_{d+1} | Z_d) and Σ_{d+1,d} = Var(x_{d+1} − x̂_{d+1,d}).

For the example of Section 2.1, it is not difficult to check that z_I = (z_1, z_2, z_3, z_4)', A_I = I_4 and

Σ_{5,4} = Ξ E(UU')Ξ' = I_4 E(UU') I_4 =

    | 1+θ²  θ   0  0 |
    | θ     θ²  0  0 |
    | 0     0   0  0 |
    | 0     0   0  0 |.

2.3 Missing observations

Suppose now that there are missing observations in the series z = (z_1, …, z_N)' and let z_o = (z_{t_1}, …, z_{t_M})', 1 ≤ t_1, …, t_M ≤ N, be the observed series. Then the observation equation (2.1b) can be replaced by

z_t = H'_t x_t + γ_t W_t,   t = 1, …, N,   (2.5)

where H'_t = (1, 0, …, 0), γ_t = 0 if z_t is observed, and H'_t = (0, 0, …, 0), γ_t = 1 if z_t is missing (Brockwell and Davis 1992, p. 483). The variable W_t represents a sequence of i.i.d. N(0, 1) variables independent of z_t, t = 1, …, d, and a_t, t = 0, ±1, ….

If there are no missing observations among the first d values of the series, the Kalman filter can be applied to the state space representation given by (2.1a) and (2.5), with starting conditions (2.4), like in Jones (1980), to evaluate the likelihood. At a later stage, after having estimated all unknown parameters in the model, the fixed point smoother can be used to interpolate the missing observations.

The general case, where there may be missing observations among the first d values of the series, can be handled as follows. Let z_{I0} = (z_{t_1}, …, z_{t_k})' be the vector of observations in z_I = (z_1, …, z_d)'. If z_{Im} denotes the vector of missing observations in z_I, then the observed values in the series z can be expressed as

z_{t_i} = A'_{t_i} z_I = B'_{t_i} z_{I0} + C'_{t_i} z_{Im},   i = 1, …, k,

and

z_{t_i} = A'_{t_i} z_I + ũ_{t_i} = B'_{t_i} z_{I0} + C'_{t_i} z_{Im} + ũ_{t_i},   i = k + 1, …, M,

where ũ_t = ∑_{j=0}^{t−d−1} ξ_j u_{t−j}, t > d, the vectors A'_s have been defined in (2.3), and the B'_s and C'_s are the subvectors of A'_s such that A'_s z_I = B'_s z_{I0} + C'_s z_{Im}, s > 0. Let z_{II} and ũ denote the vectors (z_{t_{k+1}}, …, z_{t_M})' and (ũ_{t_{k+1}}, …, ũ_{t_M})', and let B and C denote the (M − k) × k and (M − k) × (d − k) matrices with rows B'_s and C'_s, s = t_{k+1}, …, t_M. Then, the preceding equations can be written as
      | z_{I0} |   | I_k  0 | | z_{I0} |   | 0 |
z_o = |        | = |        | |        | + |   |.   (2.6)
      | z_{II} |   | B    C | | z_{Im} |   | ũ |

Defining y = z_{II} − B z_{I0}, (2.6) implies that

y = C z_{Im} + ũ.   (2.7)

A natural way of extending the definition of the conditional likelihood (2.2) to the case of missing observations is to consider the likelihood of the observations z_{II} conditional on Z_d = {z_1, …, z_d} and to treat the vector of initial missing observations z_{Im} as additional parameters. This is the definition of Gómez and Maravall (1994b). It is equivalent to considering (2.7) as a regression model whose errors ũ have a known covariance matrix σ²Δ. The likelihood associated with (2.7) is

(2πσ²)^{−(M−k)/2} |Δ|^{−1/2} exp( −(1/(2σ²)) (y − C z_{Im})' Δ^{−1} (y − C z_{Im}) ),   (2.9)

where the unknown parameters are σ², z_{Im} and the coefficients (φ, θ) = (φ_1, …, φ_p, θ_1, …, θ_q) of the ARIMA model.

The parameters σ² and z_{Im} can be concentrated out of the likelihood (2.9) using the Kalman filter. Specifically, let Δ = LL', with L lower triangular, be the Cholesky decomposition of Δ and suppose that z_{Im} = 0. The Kalman filter, applied as in Jones (1980) to (2.1a) and (2.5), with starting conditions (2.4), yields L^{−1}y and |L|. Note that x̂_{d+1,d} = [B_{d+1}, …, B_{d+r}]' z_{I0} in this case because of the assumption z_{Im} = 0 (see Gómez and Maravall, 1994b). The same algorithm applied to the columns of the matrix C, with starting conditions x̂_{d+1,d} = 0 and Σ_{d+1,d} as given by (2.4), also permits the computation of L^{−1}C. Then, it is possible to move from the generalised least squares regression model (2.7) to the model

L^{−1}y = L^{−1}C z_{Im} + L^{−1}ũ,   (2.10)

where Var(L^{−1}ũ) = σ² I_{M−k}. Therefore, (2.10) is an ordinary least squares regression model. The maximum likelihood estimators σ̂² and ẑ_{Im} of σ² and z_{Im} in (2.7) can now be efficiently and accurately obtained using the QR algorithm. Supposing C is of full column rank, this last algorithm premultiplies both L^{−1}y and L^{−1}C by an orthogonal matrix Q to obtain v = Q L^{−1}y and (U', 0')' = Q L^{−1}C, where U is a nonsingular (d − k) × (d − k) upper triangular matrix. Then ẑ_{Im} = U^{−1}v_1 and σ̂² = v'_2 v_2 / (M − k), where v = (v'_1, v'_2)', v_1 has dimension d − k and v_2 has dimension M − d. Replacing σ² and z_{Im} in (2.9) by their estimators σ̂² and ẑ_{Im} yields the concentrated likelihood. It is not difficult to verify that maximizing the concentrated likelihood with respect to the parameters (φ, θ) is equivalent to minimizing the non-linear sum of squares

S(φ, θ) = |L|^{1/(M−k)} v'_2 v_2 |L|^{1/(M−k)}.

Minimising S(φ, θ), one can obtain the estimators (φ̂, θ̂); then, σ² and z_{Im} are estimated by σ̂² and ẑ_{Im}.

Once the parameters of the model have been estimated, one can use the Kalman filter for prediction and the fixed point smoother for interpolation of the missing observations z_t with t_k < t ≤ N. Note that the missing observations z_t with 1 < t ≤ t_k are estimated by ẑ_{Im}. See Gómez and Maravall (1994b), pg. 617, for the details. See also Kohn and Ansley (1985).

Suppose that for the example of Section 2.1 the observed series is z_o = (z_1, z_4, z_5, z_6, z_7, z_8, z_{10})'. Then, z_{I0} = (z_1, z_4)', z_{Im} = (z_2, z_3)', z_{II} = (z_5, z_6, z_7, z_8, z_{10})' and

    | 1 0 |        | 0 0 |
    | 0 0 |        | 1 0 |
B = | 0 0 |,   C = | 0 1 |.
    | 0 1 |        | 0 0 |
    | 0 0 |        | 1 0 |

To obtain L^{−1}y, the Kalman filter should be initialised with x̂_{5,4} = (z_1, 0, 0, z_4)' and Σ_{5,4} as given in Section 2.2.

2.4 Regression Models with Missing Observations and ARIMA errors

The methodology described in the preceding sections can be extended easily to the case of regression models with missing observations and ARIMA errors. Consider the regression model

z_t = ω'_t β + ν_t,   (2.11)
where β = (β_1, …, β_h)' is a vector of parameters, ω'_t is a vector of h independent variables, z_t is the dependent variable, and ν_t is assumed to follow the ARIMA model

φ(L) α(L) ν_t = θ(L) a_t.   (2.12)

The state space representation is given by (2.1a), where x_t is given by the state vector of Section 2.1 with z_t replaced by ν_t, and the observation equation

z_t = ω'_t β + H'_t x_t + γ_t W_t,   t = 1, …, N,

where H'_t and γ_t are as in Section 2.3. Let z_o = (z_{t_1}, …, z_{t_M})', 1 ≤ t_1, …, t_M ≤ N, let z_I, z_{I0}, z_{Im} and z_{II} be as in Section 2.3, and define the corresponding vectors ν_{I0}, ν_{Im} and ν_{II}. Let W_{I0}, W_{Im} and W_{II} be the matrices with rows the vectors ω'_t corresponding to the vectors ν_{I0}, ν_{Im} and ν_{II}. Then, with the notation of Section 2.3, one can write ν_{II} = B ν_{I0} + C ν_{Im} + ũ and, replacing ν_{I0} by z_{I0} − W_{I0}β, ν_{Im} by z_{Im} − W_{Im}β and ν_{II} by z_{II} − W_{II}β, the following regression model is obtained:

z_{II} = B z_{I0} + C z_{Im} + W_{II}β − B W_{I0}β − C W_{Im}β + ũ,

where the regression parameters are z_{Im} and β. Letting y = z_{II} − B z_{I0}, this model can be rewritten as

y = [C, W_{II} − B W_{I0} − C W_{Im}][z'_{Im}, β']' + ũ
  = [C, W_{II} − A_{II} W_I][z'_{Im}, β']' + ũ,   (2.13)

where W_I is the d × h matrix formed by the rows ω'_t, t = 1, …, d, and A_{II} is the (M − k) × d matrix with rows the vectors A'_s, s = t_{k+1}, …, t_M, defined in (2.3). The parameters σ², z_{Im} and β can be concentrated out of the likelihood. Specifically, the Kalman filter applied to the model y = z_{II} − B z_{I0} = ũ, which coincides with (2.13) under the assumption that z_{Im} = 0 and β = 0, yields L^{−1}y and |L|, where L is as in Section 2.3. The starting conditions to use are x̂_{d+1,d} = [B_{d+1}, …, B_{d+r}]' z_{I0} and Σ_{d+1,d} as given by (2.4). The same algorithm, applied to the columns of the matrix [C, W_{II} − A_{II} W_I] with starting conditions x̂_{d+1,d} = 0 and Σ_{d+1,d} as given by (2.4), permits the calculation of the product of L^{−1} by this matrix. Then the QR algorithm can be applied to the transformed model

L^{−1}y = L^{−1}[C, W_{II} − A_{II} W_I][z'_{Im}, β']' + L^{−1}ũ,

and one can proceed as described after having obtained (2.10) in Section 2.3.

3 Temporal Aggregation

Consider again the regression model (2.11), where the errors {ν_t} follow the ARIMA model (2.12). Suppose that for t = t_1, …, t_M the aggregate

z̃_t = ∑_{j=1}^{n(t)} z_{t−j+1}

is observed, where 1 ≤ n(t) ≤ n and n is the maximum number of time periods over which the flow variable z_t is aggregated. Let r̃ = r + n − 1, where r is as in Section 2.1, and define the r̃-dimensional state vector x̃_t = (ν_{t−n+1}, …, ν_{t−1}, x'_t)', where x_t is the state vector of Section 2.1 with z_t replaced by ν_t. Then, one state space representation is given by

x̃_t = F̃ x̃_{t−1} + G̃ a_t,
z̃_t = ∑_{j=1}^{n(t)} ω'_{t−j+1} β + H̃'_t x̃_t + γ_t W_t,   (3.1)

where G̃ = (0, …, 0, G')', G is as in Section 2.1, H̃'_t = (0, …, 0, 1, …, 1, 0, …, 0), γ_t = 0 if z̃_t is observed, H̃'_t = 0, γ_t = 1 if z̃_t is missing, W_t is a sequence of i.i.d. N(0, 1) variables independent of ν_t, t = 1, …, d, and a_t, t = 0, ±1, …,

     | 0       1           0           …  0      |
     | 0       0           1           …  0      |
F̃ =  | …       …           …           …  …      |
     | 0       0           0           …  1      |
     | −φ*_r̃   −φ*_{r̃−1}   −φ*_{r̃−2}   …  −φ*_1  |

and the φ*_i are as in Section 2.1. Note that φ_i = 0 if i > p, which implies φ*_j = 0 if j > p + d, and that there are n(t) ones in H̃_t ≠ 0. When q = 0, a more economical state space representation is obtained by redefining r̃ as max{p + d, n} and letting the state vector be x̃_t = (ν_{t−r̃+1}, …, ν_t)' and G̃ = (0, …, 0, 1)'.
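For the example model of Section 2.1 aggregated to yearly totals (n = 4, r̃ = 7, β = 0), the augmented representation above can be checked numerically. The following is a minimal sketch, assuming zero starting values; all variable names are mine, not from the paper.

```python
import numpy as np

# Sketch: augmented state space of Section 3 for z_t - z_{t-4} = a_t + theta*a_{t-1}
# aggregated to yearly totals (n = 4, r~ = 7).  Zero initial conditions assumed.
theta, N = 0.6, 60
rng = np.random.default_rng(1)
a = rng.normal(size=N)

# Disaggregated series from the difference equation (zero starting values).
z = np.zeros(N)
for t in range(N):
    z[t] = (z[t - 4] if t >= 4 else 0.0) + a[t] + theta * (a[t - 1] if t >= 1 else 0.0)

# F~ is the 7x7 companion matrix of phi*(L) = 1 - L^4; G~ = (0, 0, 0, G')'.
F = np.zeros((7, 7))
F[np.arange(6), np.arange(1, 7)] = 1.0             # shift structure
F[6, 3] = 1.0                                      # -phi*_4 = 1; all other phi*_j = 0
G = np.array([0.0, 0.0, 0.0, 1.0, theta, 0.0, 0.0])
H = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0])  # n(t) = 4 ones: picks z_{t-3},...,z_t

x = np.zeros(7)
agg = np.zeros(N)
for t in range(N):
    x = F @ x + G * a[t]
    agg[t] = H @ x                                 # yearly total when t closes a year

# H' x~_t indeed reproduces z_{t-3} + z_{t-2} + z_{t-1} + z_t.
assert np.allclose(agg[3:], [z[t - 3 : t + 1].sum() for t in range(3, N)])
```

In practice the observation would only be used when γ_t = 0, i.e. at the time points t that close a year; the sketch computes it at every t merely to verify the algebra.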
As in Section 2.3, the likelihood is defined by conditioning on Z_d = {z_1, …, z_d}. In order to get a regression model similar to (2.13), whose likelihood will be the likelihood of the aggregated series, suppose first that β = 0 and let z = (z_1, …, z_N)' be the disaggregated series. Then, z = A z_I + [0, ũ']', where A is the N × d matrix with rows the vectors A'_t, t = 1, …, N, defined in Section 2.3, ũ is the (N − d) × 1 vector with elements ũ_t, t = d + 1, …, N, also defined in Section 2.3, and z_I = (z_1, …, z_d)'. The observed aggregated series z̃ = (z̃_{t_1}, …, z̃_{t_M})' is

z̃ = Jz = JA z_I + J[0, ũ']',   (3.2)

where the rows of the matrix J are formed with zeros and ones. For example, in the case of a quarterly series aggregated to yearly totals, if N = 4M, then J is the M × 4M matrix

    | 1 1 1 1 0 0 0 0 … 0 0 0 0 |
J = | 0 0 0 0 1 1 1 1 … 0 0 0 0 |
    | …                         |
    | 0 0 0 0 0 0 0 0 … 1 1 1 1 |.

Equation (3.1) should be used each time there is an observation z̃_t with t ≤ d to reduce the number of initial missing values z_1, …, z_d, which are considered fixed and are treated as parameters. That is, each time there is an observation z̃_t with t ≤ d, equation (3.1) should be used to put one of the initial missing values as a linear combination of z̃_t and the other initial missing values. Let z̃_I = (z̃_{t_1}, …, z̃_{t_k})', where t_k ≤ d and t_{k+1} > d. By selecting k appropriate variables in z_I, it is always possible to find a representation of the form z̃_I = z_{I1} + P z_{I2}, where P is a k × (d − k) matrix and z_{I1} and z_{I2} are k × 1 and (d − k) × 1 subvectors of z_I which have no elements in common. Then one can write z_{I1} = z̃_I − P z_{I2}. Let A_1 and A_2 be the submatrices of A such that A z_I = A_1 z_{I1} + A_2 z_{I2} and define z̃_{II} = (z̃_{t_{k+1}}, …, z̃_{t_M})'. Partition J = [J'_I, J'_{II}]' conforming to z̃ = (z̃'_I, z̃'_{II})'. Then,

          | z̃_I    |   | I_k  0 | | z̃_I    |   | 0  |
z̃ = Jz =  |        | = |        | |        | + |    |,
          | z̃_{II} |   | B̃    C̃ | | z_{I2} |   | ũ̃ |

where B̃ = J_{II} A_1, C̃ = J_{II} A_2 − J_{II} A_1 P and ũ̃ = J_{II}[0, ũ']'. Defining ỹ = z̃_{II} − B̃ z̃_I, the following regression model is obtained:

ỹ = C̃ z_{I2} + ũ̃.   (3.3)

If t_1 > d, then ỹ = z̃, C̃ = JA and z_{I2} = z_I in (3.3). For the general case, where β ≠ 0, an argument similar to that used in Section 2.4 to obtain (2.13) leads to

ỹ = [C̃, W̃_{II} − Ã_{II} W_I][z'_{I2}, β']' + ũ̃,   (3.4)

where W̃_{II} = J_{II} W, Ã_{II} = J_{II} A, W is the matrix with rows the vectors ω'_t of (2.11), t = 1, …, N, and W_I is, like in Section 2.4, the submatrix of W formed with its first d rows. Again, if t_1 > d, then ỹ = z̃, C̃ = JA and z_{I2} = z_I in (3.4).

As in the case of missing observations of a stock variable, the Kalman filter is always initialized at t = s, where s = max{t_1, d + 1}. The initial state vector is

x̃_s = Ã_I ν_I + Ξ̃ Ũ,   (3.5)

where ν_I = (ν_1, …, ν_d)' and the matrices Ã_I, Ξ̃ and the vector Ũ can be constructed in a similar manner to that used in Section 2.2 to build up the matrices A_I, Ξ and the vector U. On very rare occasions, when some first elements of the initial state vector have a non-positive temporal subindex, Bell's backward representation (Bell, 1984) will be needed as well as the forward representation (2.3). Replacing ν_I with z_I − W_I β in (3.5) yields

x̃_s = Ã_I z_I − Ã_I W_I β + Ξ̃ Ũ
    = Ã_{I1} z̃_I + (Ã_{I2} − Ã_{I1} P) z_{I2} − Ã_I W_I β + Ξ̃ Ũ,

where Ã_{I1} and Ã_{I2} are the submatrices of Ã_I formed with the columns of Ã_I which correspond to the subvectors z_{I1} and z_{I2} of z_I. From this last expression, initial conditions for the Kalman filter can be obtained.

For likelihood evaluation, one can proceed like in Section 2.4. The parameters σ², z_{I2} and β can be concentrated out of the likelihood. The Kalman filter to be applied to the model ỹ = ũ̃, which coincides with (3.4) under the assumption that z_{I2} = 0 and β = 0, should be initialised with x̃̂_{s,s−1} = Ã_{I1} z̃_I and Σ̃_{s,s−1} = Ξ̃ E(ŨŨ')Ξ̃'. The starting conditions for the Kalman filter to be applied to the columns of
[C̃, W̃_{II} − Ã_{II} W_I] are x̃̂_{s,s−1} = 0 and Σ̃_{s,s−1}. After all unknown parameters in the model have been estimated, the Kalman filter can be used for prediction of the future observations and the fixed point smoother for interpolation of the missing observations. Note that the initial missing observations are estimated by linear regression.

Suppose that a quarterly series follows the model of the example of Section 2.1 and it is aggregated to yearly totals. Then, n = 4, r̃ = 7, and if x_t = (z_t, z_{t+1,t}, z_{t+2,t}, z_{t+3,t})' is the state vector corresponding to the disaggregated series, the state vector for the aggregated data is x̃_t = (z_{t−3}, z_{t−2}, z_{t−1}, x'_t)'. The Kalman filter should be initialized at s = 5 and

      | 0 1 0 0 |
      | 0 0 1 0 |
      | 0 0 0 1 |
x̃_5 = | 1 0 0 0 | (z_1, z_2, z_3, z_4)' + (0, 0, 0, v')',   (3.6)
      | 0 1 0 0 |
      | 0 0 1 0 |
      | 0 0 0 1 |

where v = (u_5, u_6, u_7, u_8)'. Given that z̃_4 = ∑_{j=1}^4 z_{5−j}, one can put, for example, z_1 = z̃_4 − ∑_{j=1}^3 z_{5−j} to reduce the number of initial missing observations. Substituting back in (3.6), initial conditions for the Kalman filter can be obtained. The initial conditions for the Kalman filter to be applied to the model ỹ = ũ̃, which is (3.3) under the assumption z_{I2} = 0, are x̃̂_{5,4} = (0, 0, 0, z̃_4, 0, 0, 0)' and

           | 0  0       |
Σ̃_{5,4} =  |            |,   (3.7)
           | 0  Σ_{5,4} |

where the upper-left zero block is 3 × 3 and Σ_{5,4} is the matrix given in Section 2.2. The starting conditions for the Kalman filter to be applied to the columns of the matrix C̃ in the regression model (3.3), where z_{I2} = (z_2, z_3, z_4)', are x̃̂_{5,4} = 0 and (3.7).

4 Applications

The methodology of Gómez and Maravall (1994b) to handle the problem of missing observations on a stock variable, as described in the paper, has been implemented in a computer program, written in Fortran, called TRAMO ("Time Series Regression with ARIMA Noise, Missing Observations, and Outliers"). See Gómez and Maravall (1994a). The program performs estimation, forecasting and interpolation of regression models with missing observations and ARIMA errors, in the presence of possibly several types of outliers. The ARIMA model can be identified automatically (no restriction is imposed on the location of the missing observations in the series). The program fits the regression model (2.11), where the errors ν_t follow the general ARIMA model (2.12). The polynomials δ(L), φ(L) and θ(L) in TRAMO are assumed to have the following multiplicative form:

δ(L) = (1 − L)^d (1 − L^s)^D,
φ(L) = (1 + φ_1 L + … + φ_p L^p)(1 + Φ_1 L^s + … + Φ_P L^{s×P}),
θ(L) = (1 + θ_1 L + … + θ_q L^q)(1 + Θ_1 L^s + … + Θ_Q L^{s×Q}),

where s denotes the number of observations per year. The regression variables can be input by the user (such as economic variables thought to be related with z_t), or generated by the program. The variables that can be generated are trading day, Easter effect and intervention variables of the type:

a) dummy variables;
b) any possible sequence of ones and zeros;
c) 1/(1 − δL) of any sequence of ones and zeros, where 0 < δ ≤ 1;
d) 1/(1 − δ_s L^s) of any sequence of ones and zeros, where 0 < δ_s ≤ 1;
e) 1/((1 − δL)(1 − δ_s L^s)) of any sequence of ones and zeros.

Several algorithms are available for computing the likelihood or, more precisely, the nonlinear sum of squares to be minimized. When the differenced series can be used, the algorithm of Morf, Sidhu and Kailath (1974) (with a simplification similar to that of Mélard, 1984) is employed. For the nondifferenced series, it is possible to use the ordinary Kalman filter (default option), or its square root version (see Anderson and Moore, 1979). The latter is adequate when numerical difficulties arise; however, it is markedly slower.
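Two of the ingredients just described can be sketched in a few lines: the multiplicative polynomials expand by convolution of their coefficient vectors, and an intervention variable of type c) is a pulse passed through the recursive filter 1/(1 − δL), giving a geometrically decaying transitory effect. The parameter values below are my own illustrations, not TRAMO defaults.

```python
import numpy as np

# Sketch (illustrative values): expanding a multiplicative polynomial
# (1 + theta_1 L)(1 + Theta_1 L^s) by convolution of coefficient vectors.
theta1, Theta1, s = 0.4, -0.6, 12
regular = np.array([1.0, theta1])              # 1 + theta_1 L
seasonal = np.zeros(s + 1)
seasonal[0], seasonal[s] = 1.0, Theta1         # 1 + Theta_1 L^s
full_poly = np.convolve(regular, seasonal)     # nonzero at lags 0, 1, s, s+1

# Intervention variable of type c): pulse at t0 filtered by 1/(1 - delta*L),
# i.e. v_t = delta * v_{t-1} + x_t, which decays like delta^k after the pulse.
delta, t0, N = 0.7, 5, 20
x = np.zeros(N)
x[t0] = 1.0
v = np.zeros(N)
for t in range(N):
    v[t] = x[t] + (delta * v[t - 1] if t >= 1 else 0.0)
```

The same convolution applies to the AR side, and type d) and e) variables only change the recursion to seasonal or combined lags.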
By default, the exact maximum likelihood method is employed, and the unconditional and conditional least squares methods are available as options. Nonlinear maximization of the likelihood function and computation of the standard errors of the parameter estimates is made using Marquardt's method and first numerical derivatives.

Estimation of regression parameters is made, as described in the paper, by using first the Cholesky decomposition of the inverse error covariance matrix to transform the regression equation (the Kalman filter provides an efficient algorithm to compute the variables in this transformed regression). Then, the resulting least squares problem is solved by orthogonal matrix factorisation using Householder transformations. This procedure yields an efficient and numerically stable method to compute GLS estimators of the regression parameters, which avoids matrix inversion.

For forecasting, the ordinary Kalman filter or the square root filter options are available. Interpolation of missing values is made by a simplified fixed point smoother, and yields results identical to Kohn and Ansley (1986); for a more detailed discussion, see Gómez and Maravall (1993). When concentrating the regression parameters out of the likelihood, mean squared errors of the forecasts and interpolations are obtained following the approach of Kohn and Ansley (1985).

The program also contains a facility for automatic identification of the ARIMA model. This is done in two steps. The first one yields the nonstationary polynomial δ(L) and detects whether a mean should be specified in the model (2.12). This is done by iterating on a sequence of AR and ARMA(1,1) models (with mean), which have a multiplicative structure when the data is seasonal. The procedure is based on results of Tiao and Tsay (1983, Theor. 3.2 and 4.1), and Tsay (1984, Corol. 2.1). Regular and seasonal differences are obtained, up to a maximum order of ∇²∇_s. The program also checks for possible complex unit roots at nonzero nonseasonal frequencies.

The second step identifies an ARMA model for the stationary series (corrected for outliers and regression-type effects) following the Hannan-Rissanen procedure, with some modifications. For the general multiplicative model

φ_p(L) Φ_P(L^s) x_t = θ_q(L) Θ_Q(L^s) a_t,

the search is made over the range 0 ≤ (p, q) ≤ 3, 0 ≤ (P, Q) ≤ 2. This is done sequentially (for fixed regular polynomials, the seasonal ones are obtained, and vice versa), and the final orders of the polynomials are chosen according to the BIC criterion, with some possible constraints aimed at increasing parsimony and favoring "balanced" models (similar AR and MA orders).

The program has a facility for detecting outliers and for removing their effect; the outliers can be entered by the user or they can be automatically detected by the program, using an approach similar to that of Chen and Liu (1993), with some important modifications incorporated. In brief, regression parameters are initialized by OLS, and then the ARMA model parameters are estimated with two regressions, as in Hannan and Rissanen (1982). Next, the Kalman filter provides the series residuals, and new regression parameter estimates are obtained. For each observation, t-tests are computed for four types of outliers, as in Chen and Liu (1993). Outliers are removed one by one and, each time, new model parameter estimates are obtained. Once this first sequence has been completed, a multiple regression is performed and, if some outliers are eliminated, the program goes back to the first sequence, and iterates in this way until no more outliers are eliminated in the multiple regression. A notable feature of this algorithm is that all calculations are based on linear regression techniques, which reduces computational time. The four types of outliers considered are additive outlier, innovational outlier, level shift and transitory change.

TRAMO has been designed so that it can be used with a companion program named SEATS ("Signal Extraction in ARIMA Time Series"), described in Maravall and Gómez (1994). SEATS is an ARIMA-model-based method for estimation of unobserved components (trend, cyclical, seasonal and irregular component), and in particular for seasonal adjustment. Since the method applies to linear time series, TRAMO can be seen as a preadjustment program that produces the linear series for SEATS. Since both programs can handle routine applications to a large number of series, they provide a fully model-based alternative to REGARIMA and X11ARIMA, which form the new Census X12 procedure (see Findley et al., 1992). Programs TRAMO and SEATS, and more detailed documentation on both programs, are available from the authors.
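The two-regression idea of Hannan and Rissanen (1982) mentioned above can be sketched as follows. This is my own minimal version for a single ARMA(1,1), not TRAMO's implementation: a long autoregression proxies the innovations, after which the ARMA parameters come from one ordinary least squares regression.

```python
import numpy as np

# Sketch of the Hannan-Rissanen two-regression idea (my own minimal version):
# all estimation is done by linear regression, as in the outlier routine above.
rng = np.random.default_rng(2)
phi, theta, N = 0.7, 0.4, 20000
a = rng.normal(size=N)
z = np.zeros(N)
for t in range(1, N):                      # simulate z_t = phi*z_{t-1} + a_t + theta*a_{t-1}
    z[t] = phi * z[t - 1] + a[t] + theta * a[t - 1]

# Step 1: long AR(m) regression; its residuals approximate the innovations a_t.
m = 25
X = np.column_stack([z[m - j - 1 : N - j - 1] for j in range(m)])
y = z[m:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a_hat = y - X @ coef                       # innovation proxies, aligned with t = m..N-1

# Step 2: regress z_t on z_{t-1} and a_hat_{t-1}; the slopes estimate (phi, theta).
Z = np.column_stack([z[m : N - 1], a_hat[:-1]])
phi_hat, theta_hat = np.linalg.lstsq(Z, z[m + 1 :], rcond=None)[0]
```

With a long enough sample, phi_hat and theta_hat land close to the true (0.7, 0.4); TRAMO then refines such starting values by exact maximum likelihood.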
The first example illustrates the case of missing observations on a stock variable, where regression variables (intervention variables constructed by the program) are present. It is the series of ozone (O_3) levels of Box and Tiao (1975). The model is given by the equations

z_t = ω_0/(1 − L) d_1t + ω_2/(1 − L^12) d_2t + ω_3/(1 − L^12) d_3t + ν_t,   (4.1)
∇_12 ν_t = (1 + θ_1 L)(1 + θ_12 L^12) a_t,

where

d_1t = 1 if t = January 1960, 0 otherwise;
d_2t = 1 for the months June-October, beginning in 1966, 0 otherwise;
d_3t = 1 for the months November-May, beginning in 1966, 0 otherwise.

8 observations were deleted from the 216 observations of the original data set. These were observations number 3, 21, 39, 43, 113, 142, 170 and 201. Note that the first missing observation falls among the first 12 values.

The results of exact maximum likelihood estimation are shown in Table 1. Table 2 shows the estimates of the missing observations.

Table 1. Maximum Likelihood Estimates of Parameters for Ozone Model (4.1)

Data Set                    θ_1             θ_12
Full Data                   .241 (.068)     -.765 (.061)
Missing Observations        .265 (.074)     -.770 (.058)

* Figures in parentheses are standard errors.

To implement the approach proposed in this paper to handle the problem of temporal aggregation of a flow variable, a computer program has been written in Fortran by the author. To illustrate, the first 196 observations of Series A of Box and Jenkins (1976) have been aggregated by groups of four and assigned to the last observation in each group. As in Box and Jenkins (1976), two models have been used for the disaggregated series: ARMA(1,1) with mean μ and IMA(1,1) without mean. The first one is a stationary model, whereas the second one is not. The stationary model is a regression model, with the mean as regression parameter. The results of exact maximum likelihood estimation are shown in Table 3. Table 4 shows the estimates of the last 8 missing observations.
Table 2. Estimates of Missing Observations and Associated Root Mean Squared Errors

Observation Number        3        21       39       43       113      142      170      210
Missing Observations      4.364    6.178    4.322    5.716    3.507    4.631    2.333    3.043
                          (.752)   (.725)   (.716)   (.709)   (.701)   (.702)   (.707)   (.763)
Actual Values             3.600    8.700    2.500    4.400    3.500    4.800    1.700    3.400
Table 3. Maximum Likelihood Estimates of Parameters for Series A

Data Set                              μ                φ               θ
Full Data: ARMA(1,1)                  17.065 (.098)    -.905 (.047)    -.566 (.089)
Full Data: IMA(1,1)                   -                -               -.695 (.051)
Missing Observations: ARMA(1,1)       17.061 (.129)    -.944 (.012)    -.609 (.052)
Missing Observations: IMA(1,1)        -                -               -.688 (.030)

* Figures in parentheses are standard errors.
Table 4. Estimates of the Last 8 Missing Observations and Associated Root Mean Squared Errors

Observation Number               189      190      191      192      193      194      195      196
Missing Observations ARMA(1,1)   17.576   17.638   17.675   17.711   17.601   17.586   17.568   17.546
                                 (.334)   (.204)   (.205)   (.200)   (.340)   (.206)   (.205)   (.217)
Missing Observations IMA(1,1)    17.591   17.641   17.675   17.692   17.575   17.575   17.575   17.575
                                 (.346)   (.213)   (.214)   (.208)   (.349)   (.215)   (.214)   (.223)
Actual Values                    17.400   17.000   18.000   18.200   17.600   17.800   17.700   17.200
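The interpolation reported in Tables 2 and 4 rests on the filtering machinery of Section 2. As a far simpler illustration of the skipping strategy, the following is my own toy example on a stationary AR(1) with exact observations (not the paper's nonstationary regression setting): the update is skipped at missing time points, the prediction error decomposition accumulates the log-likelihood at observed points, and the one-step prediction at a skipped point is its conditional mean given the past.

```python
import numpy as np

# Toy sketch of the skipping strategy on a stationary AR(1); a smoother
# would also use future observations, as the fixed point smoother does.
phi, sigma2, N = 0.8, 1.0, 400
rng = np.random.default_rng(3)
z = np.zeros(N)
for t in range(1, N):
    z[t] = phi * z[t - 1] + rng.normal(scale=np.sqrt(sigma2))

missing = {50, 51, 120, 300}
x, P = 0.0, sigma2 / (1 - phi ** 2)          # stationary initial conditions
loglik, interp = 0.0, {}
for t in range(N):
    x_pred, P_pred = phi * x, phi ** 2 * P + sigma2   # prediction step
    if t in missing:
        interp[t] = x_pred                   # skip the update; keep the prediction
        x, P = x_pred, P_pred
    else:
        e, f = z[t] - x_pred, P_pred         # innovation and its variance
        loglik += -0.5 * (np.log(2 * np.pi * f) + e * e / f)
        k = P_pred / f                       # Kalman gain (H = 1, no noise here)
        x, P = x_pred + k * e, (1 - k) * P_pred

# interp[t] is E(z_t | observed past), e.g. phi * z_{t-1} after an observed point.
```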
References
Anderson, B.D.O. and Moore, J.B. (1979), Optimal Filtering, Prentice-Hall.
Ansley, C.F. (1979), "An algorithm for the exact likelihood of a mixed autoregressive-moving average process," Biometrika, 66, 59-65.
Bell, W. (1984), "Signal Extraction for Nonstationary Series," The Annals of Statistics, 12, 646-664.
Box, G.E.P. and Jenkins, G.M. (1976), Time Series Analysis, Forecasting and Control, Holden-Day, San Francisco.
Box, G.E.P. and Tiao, G.C. (1975), "Intervention Analysis with Applications to Economic and Environmental Problems," Journal of the American Statistical Association, 70, 70-79.
Brockwell, P. and Davis, R. (1992), Time Series: Theory and Methods, Springer-Verlag, Berlin.
Chen, C. and Liu, L. (1993), "Joint Estimation of Model Parameters and Outlier Effects in Time Series," Journal of the American Statistical Association, 88, 284-297.
Chow, G.C. and Lin, A. (1971), "Best Linear Unbiased Interpolation, Distribution and Extrapolation of Time Series by Related Series," Review of Economics and Statistics, 53, 372-375.
Chow, G.C. and Lin, A. (1976), "Best Linear Unbiased Estimation of Missing Observations in an Economic Time Series," Journal of the American Statistical Association, 71, 719-721.
Denton, F.T. (1971), "Adjustment of Monthly or Quarterly Series to Annual Totals: An Approach Based on Quadratic Minimization," Journal of the American Statistical Association, 66, 99-102.
Fernández, R. (1981), "A Methodological Note on the Estimation of Time Series," Review of Economics and Statistics, LXIII, 471-475.
Findley, D.F., Monsell, B., Otto, M., Bell, W. and Pugh, M. (1992), "Towards X-12 ARIMA," mimeo, Bureau of the Census.
Gómez, V. and Maravall, A. (1993), "Initializing the Kalman Filter with Incompletely Specified Initial Conditions," in Chen, G.R. (ed.), Approximate Kalman Filtering (Series on Approximation and Decomposition), World Scientific Publishing Co., London.
Gómez, V. and Maravall, A. (1994a), "Program TRAMO (Time Series Regression with ARIMA noise, Missing observations and Outliers) - Instructions for the user," EUI Working Paper ECO No. 94/31, Department of Economics, European University Institute.
Gómez, V. and Maravall, A. (1994b), "Estimation, Prediction and Interpolation for Nonstationary Series With the Kalman Filter," Journal of the American Statistical Association, 89, 611-624.
Hannan, E.J. and Rissanen, J. (1982), "Recursive Estimation of Mixed Autoregressive-Moving Average Order," Biometrika, 69, 81-94.
Harvey, A.C. (1989), Forecasting, structural time series models and the Kalman filter, Cambridge University Press, Cambridge.
Jones, R. (1980), "Maximum Likelihood Fitting of ARMA Models to Time Series With Missing Observations," Technometrics, 22, 389-395.
Kohn, R. and Ansley, C.F. (1985), "Efficient Estimation and Prediction in Time Series Regression Models," Biometrika, 72, 694-697.
Kohn, R. and Ansley, C.F. (1986), "Estimation, Prediction and Interpolation for ARIMA Models With Missing Data," Journal of the American Statistical Association, 81, 751-761.
Maravall, A. and Gómez, V. (1994), "Program SEATS (Signal Extraction in ARIMA Time Series) - Instructions for the user," EUI Working Paper ECO No. 94/28, Department of Economics, European University Institute.
Mélard, G. (1984), "A Fast Algorithm for the Exact Likelihood of Autoregressive-Moving Average Models," Applied Statistics, 35, 104-114.
Morf, M., Sidhu, G.S. and Kailath, T. (1974), "Some New Algorithms for Recursive Estimation on Constant, Linear, Discrete-Time Systems," IEEE Transactions on Automatic Control, AC-19, 315-323.
Pearlman, J.G. (1980), "An Algorithm for the Exact Likelihood of a High-Order Autoregressive-Moving Average Process," Biometrika, 67, 232-233.
Tiao, G.C. and Tsay, R.S. (1983), "Consistency Properties of Least Squares Estimates of Autoregressive Parameters in ARMA Models," The Annals of Statistics, 11, 856-871.
Tsay, R.S. (1984), "Regression Models With Time Series Errors," Journal of the American Statistical Association, 79, 118-124.
Temporal disaggregation of economic time series: Some econometric issues
Claudio LUPI; Istituto di Studi e Analisi Econometrica (ISAE)
Giuseppe PARIGI; Banca d'Italia, Servizio Studi

The paper critically reviews the most commonly used techniques to disaggregate economic time series on a temporal basis. Different approaches are discussed in an attempt to assess their respective advantages and disadvantages. Relatively new procedures are also described, such as the "missing data" approach, which sheds light on the general solutions offered by the Kalman filter estimator. The emphasis of the paper is on the econometric features of the different methods, and in particular on their characteristics and implications with respect to the integration/cointegration literature. This allows us to highlight the inadequacy of some procedures to provide reliable results.

JEL Classification Numbers: C22, C28
Key Words: Time Disaggregation, Missing Data, Cointegration
1 Introduction 1

Knowledge of the short term evolution of most macroeconomic variables is crucial for the policy maker. In this context it is important to have high frequency information (quarterly, monthly or even daily in some cases). As it is not always possible to obtain direct measures at high frequency, it is then necessary to apply statistical and mathematical procedures to temporally disaggregate the data. Several methods have been proposed in the literature, from purely mathematical to more sophisticated statistical procedures (a survey is in Di Fonzo, 1987; DiF henceforth).

1 This paper was written while Claudio Lupi was working at ISTAT. The authors would like to thank Tommaso Di Fonzo, Enrico Giovannini, Stefano Pisani, Patrizia Ordine and Giovanni Savio for interesting observations and fruitful discussions. However, none of them is responsible for any remaining error. We are grateful also to Maria Rosaria Marino for her editing skill. The paper is the result of a close collaboration between the two authors. However, sections 1-3 are to be attributed mainly to Giuseppe Parigi, sections 4-6 and the appendix to Claudio Lupi. The opinions expressed in the paper are those of the authors only and do not involve any responsibility on the part of the Bank of Italy, ISTAT and ISAE.

A common feature of these methods is the neglect of proper econometric analysis, in the sense that economic interrelationships among variables are generally ignored. The aim of this paper is to re-examine the most influential and widely used procedures in the light of recent econometric developments, such as the unit root and cointegration
analysis. In this way, some well known methods for temporal disaggregation are critically reviewed.

The importance of the concept of cointegration, which is usually dependent on the particular theoretical model under examination, should not obscure the fact that the problem of temporal disaggregation has to be analysed within the framework of statistical accounts, without the influence of strong structural economic information.

The paper is organised as follows. A general description of the temporal disaggregation problem and the related methods is presented in the second section. The third one briefly sketches the problem of temporal disaggregation in the presence of a contemporaneous adding-up constraint. In the fourth section we focus on inference, with special emphasis on the econometric testing of hypotheses. The relationships among the different methods and the theory of integration and cointegration are described in the fifth section, while an empirical application is illustrated in the following one. Some concluding remarks are contained in the last section.

We try to maintain a uniform notation throughout the whole paper: y is the aggregated series of T observations; z is the unknown series with frequency m > 1; W is the (mT × k) matrix of k indicators; ε and u are two random disturbances with frequency m and 1 (e.g. yearly), respectively; i = 1,...,mT and t = 1,...,T are the temporal indices of the highest and lowest frequency respectively.

2 Indirect estimation methods

2.1 General remarks

The literature on temporal disaggregation is characterised by estimation methods which compute observations at high frequency by distributing or interpolating the low-frequency data. More specifically, there is a problem of distribution when the aggregated series is computed by adding or averaging the high frequency data; this is the case of flow variables, such as the gross national product (GNP) and its components. The case of interpolation arises for stock variables, when the aggregated data coincide with one of the high frequency observations; the total of banking deposits or the initial capital stock of firms fall into this category.

In more formal terms, let z_i be the unknown series to be estimated for the periods i = 1,...,mT, and y_t (t = 1,...,T) the aggregated series; if z_{m(t-1),t} = (z_{m(t-1)+1},...,z_{mt})' and c = (c_1,...,c_m)' are column vectors of length m (the elements of c are known constants), then:

    y_t = c' z_{m(t-1),t}    (t = 1,...,T)    (1)

where c is an (m × 1) vector which can be represented as:

    interpolation:  c = (1,0,...,0)'  (initial period)   or   c = (0,0,...,1)'  (final period)
    distribution:   c = (1,1,...,1)'  (sum)              or   c = (1/m)(1,1,...,1)'  (mean)    (2)

Indirect estimation methods may be classified according to the following categories: methods which do not use auxiliary information and methods based on indicators. In the first case, the high frequency series is obtained with a generally arbitrary mathematical-statistical procedure, so that it can be applied when the information set is very limited (i.e. when no indicator is available). The methods based on auxiliary information may be divided into two classes:
a) adjustment methods;
b) optimal methods.
Both exploit the idea that the dynamics of the series to be estimated is correlated with that of the indicators; in this case, z_i may be related to the indicator matrix W (mT × k) through the statistical model:
    z_i = W_i β + ε_i    (i = 1,...,mT)    (3)

where W_i is the i-th row of W, β is the (k × 1) parameter vector and ε_i a random disturbance. For instance, when we want to disaggregate the value added, W may be composed of the industrial production index (usually measured at high frequencies) and possibly of some trend and/or dummy variables for peculiar episodes. By aggregating (3) we have:

    y = CWβ + u    (4)

where C is the (T × mT) aggregation matrix given by I_T ⊗ c' (⊗ is the Kronecker product); u = Cε is the aggregated error. Notice that the hypothesis of a static relationship between z and the indicators is imposed a priori. Dynamic elements may be contained in the error term, implicitly assuming the existence of common factors. For quarterly flow variables:

    C = | 1 1 1 1   0 0 0 0   ...   0 0 0 0 |
        | 0 0 0 0   1 1 1 1   ...   0 0 0 0 |
        | ...                               |
        | 0 0 0 0   0 0 0 0   ...   1 1 1 1 |

Disaggregation procedures may generally be interpreted as problems of constrained optimisation (see Stram and Wei, 1986b). If ε has mean 0 and covariance matrix Ω_ε (when ε_i follows an ARIMA(p,d,q) stochastic process, then Ω_ε is the covariance matrix of the d-th differences of ε, with dimension (mT - d) × (mT - d); see the Appendix), then:

    min_{z,β} (z - Wβ)' Ω_ε^(-1) (z - Wβ)    s.t.  y = Cz    (5)

The minimand in (5) is the weighted sum of the squared residuals given by the unknown series z and its projection on the space generated by the indicators: ẑ = Wβ̂, where β̂ is any estimate of the parameter vector.

It can be shown that the solution for β in (5) corresponds to the Generalised Least Squares (GLS) estimate obtained from the aggregate model (4) and is given by:

    β̃ = [(CW)' Ω_u^(-1) (CW)]^(-1) (CW)' Ω_u^(-1) y    (6)

where Ω_u is the covariance matrix of u (if ε is integrated of order d, the same is true for u, and Ω_u becomes the covariance matrix of the d-th differences of u). If Ω_u is known, β̃ is BLUE (best, linear, unbiased, GLS estimator).

The solution of (5) includes the simultaneous adjustment step, which consists of distributing the discrepancies at the aggregated level

    ũ = Cz - Cz̃ = y - CWβ̃    (7)

among the disaggregated observations so as to satisfy the constraint in (1). It is the smoothing matrix (L) which optimally distributes the aggregated errors among the high frequency observations. The final optimal solution for z is:

    z̃ = Wβ̃ + Lũ    (8)

(8) is efficient only if L reflects the stochastic properties of ε (u): this is the main difference between adjustment and optimal methods. In the former, the preliminary estimate and the adjustment are distinct phases, while in the latter they are obtained simultaneously. In other words, the aggregation constraint in adjustment methods is imposed according to arbitrary hypotheses, not related to the procedure employed to compute the preliminary estimate. On the contrary, with optimal methods the properties of ε determine both the estimate of β and the form of the L matrix; moreover, β and z are estimated simultaneously.

Indirect estimation methods also allow one to compute the values of z when the aggregate is not yet known. This is the case of the extrapolation of z through the optimal solution (5) and the available information contained in the indicators. Extrapolated values may be considered as efficient forecasts if and only if the postulated model is based on a causal, theoretical relationship among the variables. In the case of temporal disaggregation the link between the unknown series and the indicators should derive from statistical-accounting considerations and not from a particular economic theory. Moreover, the values extrapolated with indirect methods are generally liable to revisions because of the lack of aggregated information.
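The mechanics above, the aggregation matrix C = I_T ⊗ c' of (1)-(2) and the GLS solution (6)-(8), can be sketched numerically. The following Python fragment is only an illustrative sketch (all function and variable names are ours, and a white-noise Ω_ε = I is assumed for simplicity); it builds C for a quarterly flow variable and checks that the optimal solution reproduces the annual totals:

```python
import numpy as np

def aggregation_matrix(T, m, c):
    """C = I_T kron c', mapping mT high-frequency values to T aggregates."""
    return np.kron(np.eye(T), np.asarray(c, dtype=float).reshape(1, -1))

def optimal_disaggregation(y, W, C, Omega_eps):
    """GLS solution of (6)-(8): beta, then smoothing of aggregate residuals."""
    CW = C @ W
    Omega_u = C @ Omega_eps @ C.T                 # covariance of aggregated error u
    Oinv = np.linalg.inv(Omega_u)
    beta = np.linalg.solve(CW.T @ Oinv @ CW, CW.T @ Oinv @ y)   # eq. (6)
    u = y - CW @ beta                             # aggregate residuals, eq. (7)
    L = Omega_eps @ C.T @ Oinv                    # smoothing matrix
    z = W @ beta + L @ u                          # eq. (8)
    return beta, z

# Quarterly flow variable: c = (1,1,1,1) (sum); white-noise eps => Omega_eps = I
T, m = 4, 4
rng = np.random.default_rng(0)
W = np.column_stack([np.ones(T * m), rng.normal(10, 1, T * m)])  # constant + indicator
C = aggregation_matrix(T, m, [1, 1, 1, 1])
y = C @ (W @ np.array([1.0, 2.0]))                # synthetic annual totals
beta, z = optimal_disaggregation(y, W, C, np.eye(T * m))
assert np.allclose(C @ z, y)                      # temporal constraint (1) holds
```

Note that the constraint C z = y holds by construction whatever Ω_ε is used, since C L = I; only the way the aggregate residuals are spread over the quarters changes.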
2.2 Methods which do not use indicators

These procedures do not employ auxiliary information to compute high frequency data. Generally the estimates must satisfy the following requirements:
a) the aggregation constraint (1);
b) symmetry conditions: if y_{t-1} < y_t < y_{t+1}, then the disaggregated observations must evolve in the opposite way to the case when y_{t-1} > y_t > y_{t+1};
c) zero variation condition: if y_{t-1} = y_t = y_{t+1}, then the m-th frequency observations must be equal to (1/m) y_t;
d) constant variation condition: if y_t - y_{t-1} = y_{t+1} - y_t = a, then the high frequency data must grow in the same proportion (1/m²)a.

Most of these methods may be derived from that proposed by Boot et al. (1967), where the matrix W coincides with a constant value (z̄). Following the procedure described in the preceding section, if:

    z_i = z̄ + ε_i,    ε_i = ε_{i-1} + v_i    (9)

the estimate z̃ is given by the solution to:

    min_z ∆z' Ω_v^(-1) ∆z    s.t.  y = Cz    (10)

When v is a white noise process (mean zero and variance σ_v² = 1), then Ω_v = I. In the general case, (9) coincides with the representation of time series integrated of order 1 (see Beveridge and Nelson, 1981).

The estimates derived from Boot et al. are not completely satisfactory for economic applications. First, when a variate grows, a constant preliminary estimate does not seem acceptable; moreover, the use of first differences does not allow the condition of constant variation (point d above) to be satisfied. The alternative proposal of Boot et al. is to employ second differences, which corresponds to the inclusion of a linear trend in (9). The trend, however, may be considered as an indicator, so that a more efficient solution is given by optimal methods. Another flaw of the method just described is that the estimates of z may change dramatically according to the period, and especially when a new aggregate value becomes available 3.

More recently, new disaggregation methods have been proposed, based on the relationships existing among the autocovariance matrices of the ARIMA processes of z and the temporal aggregate (see Stram and Wei, 1986a). In this context, the disaggregated series is obtained as the solution to a constrained minimisation problem similar to that described in the next section. What is needed are the autocovariance matrices of the d-th differences of y and z. While the matrices for ∆^d y (∆ is the difference operator) may be obtained through the identification and estimation of an ARIMA model for y, this is not possible for z, which is not known. The problem is that there is not a one-to-one mapping from the aggregate ARIMA model to the disaggregate one. Only maximum orders may be obtained for the disaggregate model, when the orders (p,d,q) of the aggregate are known. The identification problem is tackled by Al-Osh (1989) by restricting the relevant class of models and by selecting the simplest and most likely models for the disaggregate series. Wei and Stram (1990) solve this problem elegantly by proposing an approximate transformation to convert the autocovariances of ∆^d y into those of ∆^d z. Barcellan and Di Fonzo (1994) apply the last two procedures to true economic time series and to simulated series, showing a slight superiority of the Wei and Stram method with respect to that of Al-Osh.

3 Alternative suggestions to overcome these problems have been proposed; see DiF for a survey.
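The no-indicator problem (10) has a simple closed-form solution, z = M^(-1) C' (C M^(-1) C')^(-1) y with M = D'D, where D is the first-difference matrix used later in section 2.3. The sketch below (our own illustrative code, not the authors') also checks that the inverse of D'D has the convenient min(i,j) structure that makes these methods easy to implement:

```python
import numpy as np

def difference_matrix(n):
    """Square first-difference matrix: unit diagonal, -1 on the subdiagonal."""
    return np.eye(n) - np.eye(n, k=-1)

def smooth_no_indicator(y, m):
    """Solve min dz' dz  s.t.  y = C z  (eq. (10) with Omega_v = I)."""
    T = len(y)
    n = m * T
    C = np.kron(np.eye(T), np.ones((1, m)))       # quarterly sums of a flow
    D = difference_matrix(n)
    Minv = np.linalg.inv(D.T @ D)                 # M = D'D
    return Minv @ C.T @ np.linalg.solve(C @ Minv @ C.T, y)

# (D'D)^{-1} has entries min(i,j)
n = 5
D = difference_matrix(n)
Minv = np.linalg.inv(D.T @ D)
expected = np.minimum.outer(np.arange(1, n + 1), np.arange(1, n + 1))
assert np.allclose(Minv, expected)

y = np.array([40.0, 44.0, 44.0, 48.0])            # four annual totals
z = smooth_no_indicator(y, m=4)
assert np.allclose(z.reshape(4, 4).sum(axis=1), y)   # constraint (1) satisfied
```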
2.3 Methods based on indicators: pure adjustment methods

The most famous and most applied adjustment method is that by Denton (1971; other adjustment methods are those of Bassie (1958), Vangrevelinghe (1966) and Ginsburgh (1973)), conceptually similar to that by Boot et al. According to this procedure, once a preliminary estimate z̃ of z has been obtained, the smoothing matrix can be computed by solving:

    min_z (z - z̃)' M (z - z̃)    s.t.  y = Cz    (11)

M is a symmetric (mT × mT) matrix, whose form depends on the hypotheses about the minimand or loss function. The smoothing matrix has the following form:

    L = M^(-1) C' (C M^(-1) C')^(-1)    (12)

When the loss function is based on first differences (a case coinciding with problem (10) analysed by Boot et al. with a constant preliminary estimate of z), the matrix M (in this case, Ω_v^(-1) = M) is given by the product D'D (when the loss function is based on second differences, M = (DD)'(DD)), where D is:

    D = |  1  0  0 ...  0  0 |
        | -1  1  0 ...  0  0 |
        |  0 -1  1 ...  0  0 |
        | ...                |
        |  0  0  0 ... -1  1 |
    (13)

(11) therefore becomes:

    min_z (z - z̃)' (D'D) (z - z̃)    s.t.  y = Cz    (14)

and the solution for L is:

    L = (D'D)^(-1) C' [C (D'D)^(-1) C']^(-1)    (15)

Denton's method is very simple to implement; in effect,

    (D'D)^(-1) = | 1 1 1 ... 1  |
                 | 1 2 2 ... 2  |
                 | 1 2 3 ... 3  |
                 | ...          |
                 | 1 2 3 ... mT |
    (16)

Denton's proposal may be interpreted as a two stage procedure, because only after the preliminary estimate is obtained are the values corrected so as to satisfy the aggregation constraint. Generally speaking, problem (5) may be seen as decomposed into two phases, the first one related to the estimate of β, without any consideration of Ω_ε; the second one related to the derivation of L. There is clearly an efficiency loss with respect to GLS methods, where β and L are estimated simultaneously (see DiF for a formal proof). Intuitively, the inefficiency of this method stems from ignoring the information contained in the covariance matrix Ω_ε, which is the basis on which to compute the smoothing matrix. On the contrary, optimal methods solve this problem, but crucially depend on the hypotheses about the form of the matrix Ω_ε.

2.4 Methods based on indicators: the optimal methods

Optimal methods allow one to compute the estimate of z subject to the aggregation constraint. The high frequency series is obtained through a statistical model similar to (5), where it is possible to apply the most common estimation techniques and, by computing the covariance matrix of the estimators, inference analysis. In this section we will describe three different models, which stem from some restrictive hypotheses imposed on the stochastic process for ε. In particular, we will consider ε generated as:
- a white noise, with zero mean and variance σ_ε² (section 2.4.1);
- an AR(1) process (section 2.4.2);
- a random walk (section 2.4.3).

The first case does not differ from the two stage methods; when the disturbance is distributed as a white noise, the optimal method coincides with the
OLS estimator and an equal distribution of the aggregate discrepancies. The other two cases have been analysed by Chow and Lin (1971) and by Fernandez (1981) and Litterman (1983). These are the most widely applied models in the literature: the Chow and Lin method, modified by Barbone et al. (1981) and successively revised, is currently employed by ISTAT in the construction of quarterly national accounts data (see ISTAT, 1992).

2.4.1 ε distributed as a white noise

When ε follows a white noise, the covariance matrix Ω_ε is given by σ_ε² I and (5) may be rewritten as:

    min_{β,z} (z - Wβ)' (z - Wβ)    s.t.  y = Cz    (17)

β may be estimated with OLS applied to the aggregate model without any efficiency loss; in this case, the smoothing matrix is equal to (1/m)C' and the optimal solution is:

    ẑ = Wβ̂ + (1/m) C' û    (18)

It may occur that, as û is equally distributed over the high frequency data, the estimate of z is characterised by undesirable "jumps" between adjacent observations. This is due to the fact that, as with adjustment methods, it is assumed that there is no correlation among the disturbances. In empirical work the solution is to assume much more complex stochastic processes, which allow the discrepancies to be distributed in a more realistic way.

2.4.2 ε distributed as an AR(1)

An alternative to the restrictive white noise hypothesis is proposed by Chow and Lin (1971), who suppose that ε might be generated by an ARMA process, simplified in their application and in most of the following literature into the AR(1):

    ε_i = ρ ε_{i-1} + ν_i    (19)

where ν_i is white noise and |ρ| < 1 to guarantee stationarity of ε_i. The idea is that this specification may capture dynamic elements ignored in the model (see Hendry and Mizon, 1978). In effect, when ε_i is generated as in (19), model (3) may be rewritten as:

    z_i = ρ z_{i-1} + W_i β - W_{i-1} (ρβ) + ν_i    (20)

This hypothesis is however quite arbitrary and should be tested (see section 4 on this point). The covariance matrix of ε_i is:

    V_ε = σ_ε²/(1 - ρ²) · |  1          ρ          ρ²        ...  ρ^(mT-1) |
                          |  ρ          1          ρ         ...  ρ^(mT-2) |
                          |  ρ²         ρ          1         ...  ρ^(mT-3) |
                          |  ...                                           |
                          |  ρ^(mT-1)   ρ^(mT-2)   ρ^(mT-3)  ...  1        |
    (21)

and (5) becomes:

    min (z - Wβ)' V_ε^(-1) (z - Wβ)    s.t.  y = Cz    (22)

The optimal solution is very similar to (6):

    β̃ = [W'C' V_u^(-1) CW]^(-1) W'C' V_u^(-1) y,    L = V_ε C' V_u^(-1)    (23)

where V_u = C V_ε C'. Notice that the smoothing matrix distributes the discrepancies over the high frequency data according to decreasing powers of ρ. As ρ is less than unity in absolute terms, more distant observations are only marginally affected by the updating of the series due to the availability of new aggregate values.

The estimation of the autoregressive parameter ρ is rather difficult; iterative methods à la Cochrane-Orcutt (1949) cannot be applied because only the aggregate error is known (in general, when m is odd the Cochrane-Orcutt procedure may be used: see Wei and Stram, 1990). Under the assumption of normality, however, a possible solution
is provided by the maximum likelihood estimation technique. β, σ_ε² and ρ may therefore be jointly estimated by solving:

    max_{β,σ_ε²,ρ} (2π σ_ε²)^(-n/2) |V_u|^(-1/2) exp[ -(y - CWβ)' V_u^(-1) (y - CWβ) / (2σ_ε²) ]    (24)

The computation of these estimates is not free of problems: when the high frequency data present sharp fluctuations, the estimated values may not be acceptable (see DiF, chapter 5). To overcome these problems Barbone et al. (1981) propose a new method called "estimated generalised least squares" (Visco, 1983), based on the minimisation, with respect to β and ρ, of the following expression:

    (y - CWβ̃)' V_u^(-1) (y - CWβ̃)    (25)

The estimate of ρ is obtained by scanning over a grid of values in the interval (-1, 1) according to the procedure proposed by Hildreth and Lu (1960). Comparing (24) and (25), it may be observed that the estimate of ρ is independent of |V_u|^(-1/2), so that it is not really a maximum likelihood estimate.

The extrapolation of n values, when the aggregate is not yet known, is easy to obtain by using the optimal solution (8). The proposal of Bournay and Laroque (1979) is equivalent to augmenting C with n vectors of zeros as:

    C⁺ = (C ⋮ 0)    (26)

where 0 is an (mT × n) matrix. Further, V_ε is substituted by V_ε⁺, where:

    V_ε⁺ = Ε[(ε' ε⁺')' (ε' ε⁺')] = | V_ε        Ε(ε ε⁺')  |
                                   | Ε(ε⁺ ε')   Ε(ε⁺ ε⁺') |
    (27)

where ε⁺ = (ε_{Tm+1},...,ε_{Tm+n})'. The estimate of the unknown values is then given by:

    ŷ = CWβ̂ + V_ε⁺ C⁺' V_u^(-1) û    (28)

2.4.3 ε distributed as a random walk

Fernandez (1981) reinterprets Denton's procedure (1971; section 2.3) in the framework of optimal methods. The crucial point is the hypothesis that ε is a random walk:

    ε_i = ε_{i-1} + ν_i    (i = 1,...,mT)    (29)

where ν_i is a white noise with mean 0 and variance σ_ν². In this case, the proposal of Fernandez is similar to that discussed in the previous section, with the additional assumption that ρ = 1. This allows the computations to be simplified, because the autoregressive parameter does not have to be estimated; given the initial conditions, the covariance matrices are:

    V_ε = σ_ν² (D'D)^(-1),    V_u = σ_ν² C (D'D)^(-1) C'    (30)

Once a preliminary estimate z̃ of z has been obtained, the optimisation problem is the same as Denton's and the solution for β̃ and L is similar to (23), with the matrices given in (30).

2.5 The missing data approach

In many empirical cases, it may occur that the high frequency data are available only for some sub-period of the sample (i.e. yearly data up to some point and quarterly data thereafter). If there exist one or more high frequency indicators for the whole period, then missing data procedures may be used to construct the whole set of high frequency observations. It should be noticed that the missing data problem is somehow more difficult to solve in econometrics than in other disciplines, such as biometrics 7. The principal reason for this is the intrinsically dynamic structure of economic systems.

A first attempt to estimate an econometric model in the presence of missing data is proposed by Sargan and Drettakis (1974). Essentially, this is an iterative maximum likelihood method where the missing data are seen as parameters to be estimated 8. At first, the structural parameters of the model are estimated by employing the sample period for which the high frequency data are available. Missing data are then
computed by extrapolation from the model just
estimated. The procedure is repeated by completing
the information set with the extrapolated observations
and the parameters are re-estimated 9 . If the model is
correctly specified, the estimates are asymptotically
consistent and efficient. In this context the estimate of
missing data has to be considered as a by-product of
the more general estimation problem. The practical
application of this method is however quite complex;
Drettakis
(1973)
explicitly
considers
the
disaggregation problem with the imposition of
constraint (1).
observe that this estimator is equivalent to that of
Chow and Lin (1971 and 1976), when the disturbances
are distributed as an AR(1) 12.
Gilbert (1977) concentrates only on the case of single
equation estimators. He shows that in the case of flow
variables, by substituting the disaggregation mean for
y
the missing data and/or the aggregate data ( for each
m
10
of the m observations), the OLS
estimator is
equivalent to the GLS on the observed data and to that
of Theil-Goldberger (1961). The approach may be
extended so that the aggregation constraint may be
imposed 11.
The estimation methods just seen may be correctly
defined as ML only when the likelihood function is
marginalised for the missing data (see Pena and Tiao,
1991). In dynamic models, however, the
marginalisation is very difficult to perform so that
iterative estimation techniques have been employed
which cannot be defined as ML, even if this term is
currently used.
Palm and Nijman (1984) analyse the more general
dynamic case with ARMA disturbances. In particular,
they show the superiority of maximum likelihood
(ML) estimates without substitution of the missing
data with some linear transforms of the aggregate
values. However, they warn against possible problems
of parameter identification, especially when residuals
have MA components.
A way to avoid explicit marginalisation is to evaluate
the (log) likelihood function and its derivatives in
prediction error decomposition form by means of the
Kalman filter. In this context, (see Ansley and Kohn,
1983) the model has to be transformed into the
state-space form and the (high frequency) missing data
may be computed through the smoothed estimator of
the state vector (see Harvey and McKenzie, 1983 for
an application of Kalman filter technique to the model
of Sargan and Drettakis). This approach has
successively been generalised by Harvey and Pierse
(1984), who show that by employing a state-space
transformation of (3) is possible to modify and extend
the Chow and Lin method to the case when the
Gilbert (1977) shows also that when in a static
simultaneous system there is a problem of missing
data for all the endogenous variables in the same
period, the 2SLS estimator is asymptotically
equivalent to the LIML estimator if the period with
missing observations grows with the sample.
The dynamic case is tackled by Gilbert through an
autoregressive model and the application of an
iterative procedure similar to that advocated by Sargan
and Drettakis (again the maximum likelihood estimate
is obtained on convergence). It is interesting to
7
8
9
10
11
12
See Afifi e Elashoff (1966) for a classical survey of missing data problems in biometrics.
The necessity of iterative methods stems from the impossibility in the dynamic case of partitioning the
likelihood according to the data observed at different frequencies.
7
However, notice that there is in the likelihood a correcting term to take into account the estimate d nature of
some observations.
8
The estimate must be corrected for the number of degrees of freedom, which are the same of a regres sion on
the observed data.
9
See Tzerkezos (1993) for an application with distributed lags.
1
Acutally the equivalence is obtained through an approximation of the matrix Vε .
11
118
TABLE OF CONTENTS
residual ε is generated by a stationary ARMA(p,q)
process1 13. With this approach, using an extended
Kalman filter, it is also possible to consider non linear
transformations of the data, such as the logarithmic
one (see Anderson and Moore, 1979).
A direct, but quite arbitrary, solution to this problem,
consists in disaggregating all the series but one to be
determined as a residual. Needless to say, this series
will be characterised by the errors of the
disaggregation procedure applied to the other
variables. Another possibility is to obtain the quarterly
series for all the variables and then to balance them in
order to satisfy the adding-up constraints (see Stone et
al., 1942; Stone, 1990 for a general treatment of this
problem in the context of national accounts).
Kalman filter based methods seem to be more easily applicable than ML ones; however, a deep knowledge of the properties of this estimator is required for a proper implementation. In particular, the non-uniqueness of the state space representation requires a specific analysis in each case, whereas, for instance, the Chow and Lin method does not.

It is useful to observe that with the missing data approach the estimates crucially depend on the hypotheses underlying the generating model of the data. As we have already seen, the estimates are asymptotically efficient and consistent only if the structural model is correctly specified. Here the point is the interpretation of the missing data as fixed parameters to be estimated. This seems a contradiction in terms: it would be much more natural to consider the missing data as random variables, with the same probabilistic structure as the observations already available (see Pena and Tiao, 1991).

3 Indirect estimate of several series with contemporaneous adding-up constraints

In many instances, the disaggregation problem refers to economic variables characterised by adding-up constraints. In the case of the resources and uses account:

GDP_t = C_t + I_t + S_t + E_t − M_t    (32)

where GDP is the gross domestic product, C is consumption, I investments, S the change in stocks, E exports and M imports. (32) is an identity valid in every time period. In general, the application of the disaggregation methods discussed so far allows one to satisfy the aggregation constraint (1), but not (32).

A more efficient alternative is based on an explicit consideration of the adding-up constraints in the disaggregation procedures. In this case, there are three kinds of constraints:

a) Temporal aggregation constraint. This is the usual constraint analysed up to now:

Σ_{k=1}^{m} l_{j,m(t−1)+k} = y_{j,t},    j = 1,…,S    (33)

b) Contemporaneous adding-up constraint. Let l be a vector of S series, and let z be the sum of its components at each time period. The constraint may then be written as follows:

Σ_{j=1}^{S} l_{j,m(t−1)+k} = z_{m(t−1)+k},    k = 1,…,m    (34)

c) Adding-up constraint at the aggregate level. For this case it is assumed that, as the aggregate value of the series is known, the adding-up constraint is automatically satisfied.

Once the constraints have been specified, the procedure is similar to the univariate case. Di Fonzo proposes a generalisation of the Chow and Lin method: the difficulty is how to estimate the covariance matrix of the residuals, given the hypotheses on the different error terms; for instance, in the situation where the disturbance follows a multivariate AR(1), with S series to disaggregate, S(S+1)/2 autoregressive parameters should be estimated with iterative techniques.

13 In this case, β should be part of the state vector. On this, see Harvey and Phillips (1979).
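The residual-component device and the two adding-up constraints can be checked numerically (a sketch with invented data; the naive even spreading of annual totals stands in for a proper disaggregation method):

```python
import numpy as np

# Sketch (invented data): m = 4 quarters, T = 3 years, S = 3 components whose
# quarterly aggregate z is known. Two components are "disaggregated" naively;
# the third is obtained as a residual, as in the first approach in the text.
rng = np.random.default_rng(1)
m, T, S = 4, 3, 3
z = rng.uniform(90, 110, size=m * T)               # known quarterly aggregate

# annual totals of the three components, summing to the annual aggregate
shares = np.array([0.5, 0.3, 0.2])
y = np.outer(shares, z.reshape(T, m).sum(axis=1))  # y[j, t]

def naive_quarterly(annual):                       # spread each year evenly
    return np.repeat(annual / m, m)

l1, l2 = naive_quarterly(y[0]), naive_quarterly(y[1])
l3 = z - l1 - l2                                   # residual component

l = np.vstack([l1, l2, l3])
# contemporaneous adding-up constraint holds by construction...
assert np.allclose(l.sum(axis=0), z)
# ...and the temporal constraint holds for every component, residual included
assert np.allclose(l.reshape(S, T, m).sum(axis=2), y)
```

Note that the residual series l3 absorbs all the quarterly variation of the aggregate, which illustrates why it inherits the errors of the procedures applied to the other components.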
To avoid this kind of complications, Rossi (1982) suggests a two stage procedure, a restatement of Denton's method (section 2.3), where the two phases consist of the preliminary estimate and the subsequent adjustment. For the first one, Rossi proposes to compute the S disaggregated series with the univariate procedure of Chow and Lin (section 2.4.2) 14; at this stage the single series do not satisfy the constraint in (34). In the second step a kind of smoothing matrix to adjust the series is computed. Di Fonzo (1990) describes Rossi's method and shows that the adjustment phase is based on the hypothesis that the multivariate disturbance is a white noise, thus contradicting the assumption of an AR(1) for the single series.

The same logic of the univariate case applies for extrapolating the series when the aggregate values are not known. It is important to notice that every piece of information must be taken into account: for instance, the case when only one of the aggregate observations is available.

4 Inference

The results of the disaggregation procedure depend crucially on the model used for the temporally aggregated variables. In fact, the initial step in a Chow-Lin-like disaggregation procedure is the regression model

y_t = X_t β + u_t    (t = 1,…,T)    (35)

where y_t represents the variable to be disaggregated, and X_t is the t-th row of the matrix of temporally aggregated indicators, i.e. X = CW.

Because of the small number of indicators (we are not trying to "explain" the true DGP) and of the static nature of (35), the residuals u_t are generally autocorrelated and heteroskedastic. For example, assume that the true relation between the variable and the indicator is

z_i = γ + ψ(B) W_i + ζ(B) ε_i    (i = 1,…,mT)    (36)

where ψ(B) ≡ 1 + Σ_j ψ_j B^j and ζ(B) ≡ 1 + Σ_j ζ_j B^j are lag polynomials whose roots are all outside the unit circle: then it is possible to show that the residuals of equation (35) are in fact autocorrelated (Tiao and Wei, 1976).

If E(uu′) = σ²Ω ≠ σ²I, then it is well known that the OLS estimator of β in (35) is not efficient and inference based on OLS estimates is not correct in general 15. Of course, it is possible to use Aitken's (1935) GLS or equivalent methods, but in this case the usual "goodness of fit" statistics have to be computed consistently (see e.g. Buse, 1973). The theory is well known and we will not treat it here 16. However, the main issue is that the GLS estimator is unbiased and consistent if and only if the covariance matrix used in the estimation is the true one and if autocorrelation and/or heteroskedasticity are genuine features of the disturbances in the DGP. If residual autocorrelation and/or heteroskedasticity are induced by model misspecification (as is often the case in a temporal disaggregation framework), then the GLS estimator is no longer unbiased and consistent. Therefore, an empirical assessment of the nature of the autocorrelation using e.g. the COMFAC test (Hendry and Mizon, 1978) is recommended. A general problem, common to GLS and maximum likelihood methods, is that at least the structure of Ω must be specified a priori, unless a Kalman filter approach, which does not consider Ω explicitly (see Harvey and Phillips, 1979), is used. An alternative is offered by GMM estimators (Hansen, 1982), based on consistent estimates of the covariance matrix 17. However, it

14 In Rossi's procedure the disturbances are generated by a process free of correlation. The approach may however be extended to the case analysed by Fernández (paragraph 2.4.3).
15 See for example Nicholls and Pagan (1977) and Dufour (1988). There are particular cases in which OLS is efficient and/or consistent (see Fiebig et al., 1992). However, these cases do not seem to find natural applications in the present context.
16 Interested readers might refer to Amemiya (1973) for an extensive discussion of the properties of the GLS estimator in the presence of an estimated variance-covariance matrix.
would be necessary to study further the properties of these estimators in small samples. Finally, solutions can be sought in applications of bootstrap methods (Efron, 1979). In this case the main difficulty is given by the necessity of using consistent bootstrap techniques in the presence of dependent data. However, there are rather general conditions under which the autocorrelation-heteroskedasticity problems are alleviated, as we will see in Section 5.

Inference should also be used in order to assess the consistency of preliminary estimates of disaggregated variables arising from the aggregate regression model (35). Under stationarity, and in the case of only one indicator, Guerrero (1990) suggests a method based on the following hypotheses:

(i) the indicator and the variable have the same ARMA representation;
(ii) E(z_i | z̃_i) = z̃_i;
(iii) E(z_i | ℑ_{i−1}) is independent of W, where ℑ_j is the information set as of time j.

If (i) holds, then the preliminary estimate z̃_i follows the same ARMA process as the true series. This ARMA model can be written according to the Wold representation as

z̃_i = θ(B) ε_i    (37)

and Guerrero derives the minimum variance estimator as

ẑ = W β̃ + L (Y − CW β̃),    L = Θ Θ′ C′ (C Θ Θ′ C′)^{−1}    (38)

where Θ is a matrix whose elements are functions of the parameters of the Wold representation (37).

A consistency test is then derived under normality, computing the statistic

K̂ = (y − CW β̃)′ (C ϑ̂ ϑ̂′ C′)^{−1} (y − CW β̃) / (β̃² σ̂²_ε)    (39)

where ϑ̂ is a triangular matrix whose non-zero elements are the estimates of the coefficients of the Wold representation (37). In practice, in the presence of one indicator, the estimate of σ²_ε can be derived from the ARMA model φ(B) W_i = θ(B) ε_i. Under the null that the preliminary aggregated estimate, CW β̃, is consistent with the observed data, K̂ is asymptotically distributed as a χ² with T degrees of freedom.

Unfortunately, this procedure is rather complex and becomes extremely complicated when more than one indicator is used. Furthermore, and this is perhaps the main issue at hand, the emphasis put on the stochastic structure of the residuals in the Chow-Lin method (and its generalisations) is posed directly on the relationship among the unknown disaggregated variable, its preliminary estimate and the indicators. There is a crucial problem: the preliminary estimate z̃ and the disaggregated series z cannot have the same ARIMA model if z̃_i ~ I(1) and z_i ~ I(1). In fact, assume

φ(B)(1 − B) z_i = θ(B) ε1i    (40)

φ(B)(1 − B) z̃_i = θ(B) ε2i    (41)

from which (1 − B) φ(B)(z_i − z̃_i) = θ(B) η_i, with η_i ≡ ε1i − ε2i. Therefore, in this case the differences between the preliminary estimate and the true series are non stationary. Alternatively, suppose that z_i and z̃_i are generated by the simple bivariate cointegrated process

[ 1   −β  ] [ z_i ]   [ θ11(B)    0    ] [ ε1i ]
[ 0   1−B ] [ z̃_i ] = [   0     θ22(B) ] [ ε2i ]    (42)

It is easy to see that

∆z_i = θ11(B)(1 − B) ε1i + β θ22(B) ε2i    (43)

The simplest case is when θ11(B) ≡ 1 and β ≡ 1. However, even in this rather special case

∆z_i = ∆ε1i + θ22(B) ε2i    (44)

17 Consistent covariance estimators are studied, among others, by White (1984), Newey and West (1987), Andrews (1991), Andrews and Monahan (1992) and Hansen (1992).
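The annual regression (35) and the flavour of the consistency check (39) can be sketched numerically (invented data; the Wold matrix ϑ̂ is set to the identity, i.e. white-noise errors, and the normalisation is adapted to this toy DGP, so this is an illustration rather than the exact statistic):

```python
import numpy as np

# Sketch (invented data): annual regression (35) with X = CW, followed by a
# simplified consistency check in the spirit of (39), with theta-hat = I.
rng = np.random.default_rng(3)
m, T = 4, 30
C = np.kron(np.eye(T), np.ones(m))            # annual summation matrix

W = rng.normal(10.0, 1.0, m * T)              # quarterly indicator
sigma = 0.3
z = 1.5 * W + rng.normal(0.0, sigma, m * T)   # true (unobserved) quarterly series
y = C @ z                                     # observed annual series

X = C @ W                                     # aggregated indicator
beta_t = (X @ y) / (X @ X)                    # beta-tilde from the annual data
resid = y - C @ (W * beta_t)                  # y - C W beta-tilde

# with theta-hat = I, C theta theta' C' = C C' = m * I_T
K = resid @ resid / (m * sigma**2)
# under the null of consistency, K behaves like a chi-square with about T dof
assert 0.2 * T < K < 3.0 * T
```

With a genuine ARMA Wold representation, ϑ̂ would be the triangular matrix of estimated Wold coefficients rather than the identity, which is where the practical complexity noted in the text arises.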
It follows that the true series and the preliminary estimate cannot be generated by the same ARIMA process.

5 Non-stationarity and temporal disaggregation

Cointegration is of paramount importance for estimation and inference in models with integrated time series. The intuition is very simple: if two (or more) variables are cointegrated, they cannot drift apart too much as time passes, and a stationary linear combination must exist. From the economic point of view, cointegration arises from the existence of a long-run equilibrium relationship.

In temporal disaggregation contexts, the indicator X_t usually measures substantially the same phenomenon as the variable y_t to be disaggregated. However, important differences can exist, for example in the way the two variables measure the same quantity. A typical case is when X_t is measured on a subsample with respect to the sample used to measure y_t. These differences notwithstanding, the general behaviour of the two variables should be very similar.

In what follows we will assume that y_t ~ I(1). For X_t to be a valid indicator for y_t the two variables must be cointegrated, i.e., (y_t, X_t) ~ CI(1,1). This of course implies that in (35) u_t ~ I(0), in apparent contradiction with some disaggregation procedures, notably those by Fernández (1981) and Litterman (1983), but also with the general approach analysed by Stram and Wei (1986b). In other words, while according to Fernández (1981) random walk residuals should be considered a realistic hypothesis, in our opinion this is a misleading one, meaning that the indicators used are not valid. This contradiction is partly explained by the fact that most studies on time disaggregation are purely statistical, and consider residuals as autonomous processes rather than, more correctly, as model-based derived quantities.

In this section we will try to explain our viewpoint in much greater detail, using well known results concerning integrated variables. Remember that, in the case of regressions between stationary variables, the asymptotic properties of the most common estimators are based on the assumptions (cf. White, 1984):

plim T^{−1}(X′X) = M,    plim(X′u) = 0    (45)

with M a finite positive definite matrix. Most of the difficulties with integrated variables stem from the fact that second moments and cross-products, when appropriately normalised, have non standard asymptotic distributions, which can in general be expressed as functionals of standard Wiener processes.

The most common cases are examined below.

a) X_t ~ I(0)

Actually, this is not a very interesting case, in that it is self-evident that (35) is not well balanced, in the sense that the dependent and the explanatory variables have different orders of integration. This implies that the residuals are non stationary and the estimate of β tends to zero.

b) Linear trend used as indicator

In this case (35) becomes

y_t = α + β t + e_t    (46)

Durlauf and Phillips (1988) treat this case extensively and demonstrate that

b.i) the distributions of α̂, F_{α=0} and t_α diverge;
b.ii) β̂ tends to the true value of the drift of y_t, but the distributions of F_{β=0} and t_β diverge;
b.iii) the estimated R² has a non degenerate asymptotic distribution.

The empirical results from (46) could in general induce the investigator to assume that the trend is a "good" indicator for y_t, having a significant t-statistic and being able to produce a relatively high R². However, the Durbin-Watson (1950, 1951) DW statistic gives in this case an important indication that something must be wrong. In fact:

b.iv) DW is asymptotically zero.

The convergence to zero of the DW statistic is due to the nonstationarity of the equation residuals.
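Case b) above can be reproduced in a few lines (a sketch with invented data; OLS, R² and the DW statistic are computed directly with NumPy):

```python
import numpy as np

# Sketch: regress a random walk with drift on a linear trend, case b) in the
# text, and inspect the fit statistics and the Durbin-Watson statistic.
rng = np.random.default_rng(4)
T = 400
y = np.cumsum(1.0 + rng.normal(0.0, 1.0, T))   # y_t = y_{t-1} + 1 + eps_t

t = np.arange(1, T + 1, dtype=float)
X = np.column_stack([np.ones(T), t])           # regressors of eq. (46)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ coef                               # OLS residuals

r2 = 1.0 - (e @ e) / np.sum((y - y.mean()) ** 2)
dw = np.sum(np.diff(e) ** 2) / (e @ e)         # Durbin-Watson statistic

# the fit looks excellent, yet DW collapses towards zero (b.iv)
assert r2 > 0.9
assert dw < 0.5
```

The seemingly high R² together with a DW statistic near zero is exactly the warning sign discussed in the text: the "good fit" reflects nonstationary residuals, not a valid indicator.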
d.i) β̂ is super-consistent (T-consistent).

This fundamental result is proved in different contexts by Andrews (1987), Phillips (1987a), Phillips and Durlauf (1986) and Stock (1987), and highlights the consistency of OLS estimates among cointegrated variables. Furthermore, this result continues to hold even if the residuals are correlated with the regressors, and also in the presence of measurement errors. OLS on stationary variables would have produced inconsistent estimates under the same circumstances 20. However,

d.ii) T(β̂ − β) is asymptotically distributed according to a mixture of normals centred on zero,

which implies that non standard distributions have to be used to draw inference (Park and Phillips, 1988). A further asymptotic advantage offered by the presence of a cointegrating relationship is given by the fact that (Phillips and Park, 1988)

d.iii) T(β̂ − β) and T(β̂_GLS − β) have the same asymptotic distribution, where β̂_GLS is Aitken's (1935) GLS estimator.

Therefore, in contrast with the stationary case, OLS is asymptotically efficient even in the presence of autocorrelation. Furthermore, Phillips and Park (1988) show that, if X_t is strictly exogenous (in the sense of Engle et al., 1983) 21, then

d.iv) the Wald test for q linear restrictions is asymptotically distributed, as in the stationary case, as a χ² with q degrees of freedom.

In most temporal disaggregation problems, the application of cointegration theory requires great care, since sample sizes are typically too small for the validity of asymptotic results; however, cointegration analysis should continue to play an important role as a "guide" also in finite samples 22. It has been observed that, despite super-consistency results, static regressions among cointegrated variables can give biased results in small samples. Two main solutions have been suggested. The first is to avoid static regressions and use dynamic models in the form of unrestricted ECMs (Banerjee et al., 1986; Boswijk, 1992); the second is to use semiparametric corrections (Phillips and Hansen, 1990) 23. It is known that greater biases correspond to smaller values of the R² statistic 24. For this reason a very good fit may matter. Further, the bias does not converge to zero at the super-consistency rate in finite samples, and may persist even if the sample size is relatively large 25.

6 An example with real data

In order to be able to compare our results with preceding studies and different methodologies, we chose to examine the disaggregation of Italian National Accounts production using industrial

20 A classical reference is Haavelmo (1943).
21 Strict exogeneity is a rather peculiar feature. In general, if this condition is not satisfied, conventional tests cannot be used. For more on this, see Phillips (1988, 1991), Park and Phillips (1988, 1989), Johansen (1992). Even lack of weak exogeneity can produce non trivial complications also in extremely simplified models (Hendry, 1993).
22 See e.g. Hendry (1973, 1982) on the role of asymptotic theory in finite samples.
23 There is Monte Carlo evidence that indicates that in many realistic cases the unrestricted ECM estimator is to be preferred to OLS and to modified OLS with Phillips and Hansen's (1990) semiparametric correction. Details are contained in Inder (1993).
24 However, remember that here R² tends asymptotically to a non degenerate random variable.
25 This result has been found by Banerjee et al. (1986) using Monte Carlo experiments. It is justified theoretically by Abadir (1993a), who shows that if Y0/σ_ε ≠ 0, Y0 being the initial condition and σ_ε the standard deviation of the stationary series of shocks, then in small samples the bias is inversely proportional to Y0/σ_ε but converges to zero at a rate which is itself inversely proportional to the same quantity. Of course, the asymptotic convergence rate to zero is O_p(T^{−1}).
production and price indices as indicators. The same problem is examined, under a different viewpoint, in Gennari and Giovannini (1993) (GG, henceforth) 26.

Let YC^(k) be the current price production of branch k, and denote by IPI^(k) and POUT^(k) the industrial production index and the output price of the same branch. The annual equation is

YC_t^(k) = α + β (IPI_t^(k) × POUT_t^(k)) + e_t^(k)    (48)

where e_t^(k) is the equation residual. In order to obtain a better fit, estimates are often corrected by introducing dummy variables and/or broken trends. However, the choice of dummies and trends is usually arbitrary, and pre-testing problems are likely to occur in practice. The branches considered here are the same as those examined in GG, namely branches 21 and 41 of the NACE classification, corresponding to "Agricultural and Industrial Machinery" and "Textiles and Clothing", respectively. Our OLS estimates, relative to the equations proposed in GG 27, are reported in table 1 28. The diagnostic statistics indicate the presence of misspecification problems in both equations. GG's proposal is to represent the broken trends in the equations by means of changing parameter models. Their estimates are based on the Kalman filter 29, and seem to be decidedly better than the OLS estimates. The authors also give an interesting structural interpretation of their estimates in terms of the different behaviour of large and small firms, and of the composition of the industrial production index.

In our approach, consistently with the discussion contained in the preceding section, the misspecification problems are treated as arising from the lack of cointegration between integrated variables. Table 2 lists the results of the Dickey-Fuller (1979, 1981) tests for stationarity: they seem to indicate that both the variables YC^(21), YC^(41) and the indicators (IPI^(21) × POUT^(21), IPI^(41) × POUT^(41)) are well approximated by I(1) processes. Note that the Dickey-Fuller tests on the OLS residuals reported in table 1 do not reject the null hypothesis of no cointegration 30. Of course, the two-step Engle-Granger method applied here is not necessarily the most efficient one to test for cointegration 31. However, given that only one cointegrating vector can be present in each of our equations, and since our basic model is a static one, the Engle and Granger two-step method is naturally nested in our problem. Further, this choice allows us to use finite sample critical values (see MacKinnon, 1991).

The OLS estimates of the basic equations (48) are listed in table 3, from which other important pieces of information can be gathered. Note that, while it is important to obtain a very good fit at this stage of the disaggregation process, the use of broken trends and dummy variables may nevertheless weaken the relationship between variable and indicator, as can be clearly seen by comparing tables 1 and 3. In particular, note the highly significant value of the W_{β=1} test.

26 Strictly speaking, our estimates are not exactly comparable with those in GG in that they are based on a temporal sample which also includes the year 1992 (instead of just 1991) and embodies the annual revision of National Accounts 1990-91 data carried out in March 1993.
27 In fact, our second equation is a reparameterization of that proposed in GG. Our parameterization is closer to that commonly used in the analysis of structural breaks (see for example Perron, 1989 and Zivot and Andrews, 1992).
28 The table reports the estimated models, the diagnostic statistics and the Dickey-Fuller tests for the residuals. The diagnostics of the Dickey-Fuller equations are also listed in the same table.
29 The precise algorithm is discussed in Carraro and Sartore (1987).
30 This occurs despite the fact that pre-testing problems are present in these equations and that the critical values should be greater in absolute value in order to take account of this. See for example Zivot and Andrews (1992) for similar considerations in the unit root tests framework.
31 On this problem see for example Boswijk and Franses (1992), Johansen (1988), Phillips (1991), Phillips and Loretan (1991) and Phillips and Ouliaris (1990).
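A minimal version of the Engle-Granger two-step check used in this section can be sketched as follows (invented data standing in for YC and IPI × POUT; the Dickey-Fuller regression here has no constant and no lags, and a real application should use appropriate finite-sample critical values, e.g. MacKinnon, 1991):

```python
import numpy as np

# Sketch of the Engle-Granger two-step check: static OLS of the variable on
# its indicator, then a Dickey-Fuller regression on the residuals.
rng = np.random.default_rng(6)
T = 200
ind = np.cumsum(rng.normal(0.5, 1.0, T))        # I(1) indicator
yc = 3.0 + 1.2 * ind + rng.normal(0.0, 1.0, T)  # cointegrated with the indicator

# step 1: static regression
X = np.column_stack([np.ones(T), ind])
coef, *_ = np.linalg.lstsq(X, yc, rcond=None)
e = yc - X @ coef

# step 2: DF regression  delta e_t = rho * e_{t-1} + error
de, lag = np.diff(e), e[:-1]
rho = (lag @ de) / (lag @ lag)
s2 = np.sum((de - rho * lag) ** 2) / (len(de) - 1)
t_df = rho / np.sqrt(s2 / (lag @ lag))          # DF t-statistic (no constant)

# strongly negative here: the null of no cointegration would be rejected
assert t_df < -3.0
```

With a genuinely invalid indicator (no cointegration), the same statistic would stay close to zero, mirroring the non-rejections reported for table 1.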
A crucial aspect is the result of Ramsey's RESET test (Ramsey, 1969; Ramsey and Schmidt, 1979), which is significant in both equations in tables 1 and 3. Indeed, it has been observed (Godfrey et al., 1988) that the RESET test has good properties as far as the choice between linear and log-linear parameterizations is concerned. There are also theoretical reasons which make the log-linear alternative particularly appealing for economic variables. Consider for example the simple model for GDP:

∆y_t = µ + ε_t,    µ > 0,    ε_t ~ NID(0, σ²_ε)    (49)

In this case σ²_ε / Var(y_t) = O_p(T^{−1}), so that the series tends to be "asymptotically deterministic". Furthermore, E(∆y_t) = µ ∀t, and E(∆y_t)/y_{t−1} = O_p(T^{−1}), so that the ∆y_t's become less and less important. These are unappealing properties for economic time series. On the contrary, if the series were generated by the following process in logarithms

∆log(y_t) = γ + ν_t,    γ > 0,    ν_t ~ NID(0, σ²_ε)    (50)

then, using the approximation log(1 + γ) ≅ γ, we would have

y_t = (1 + γ) y_{t−1} e_t,    log(e_t) ~ NID(0, σ²_ε)    (51)

Here E(∆log y_t) = E{log(y_t / y_{t−1})} = γ ∀t, so that the relative increments, and not the absolute ones, have a constant mean. Further, σ_e varies proportionally to y_t and the series does not become "asymptotically deterministic".

In general, integration and cointegration properties of time series are not invariant to logarithmic transforms of the data. In particular, while the existence of a cointegrating relationship in the levels implies the existence of an analogous relation in the logarithms, the converse is not generally true 32.

This justifies the analysis of the logs of our variables. Table 4 reports the Dickey-Fuller tests of the transformed variables. Once more, the tests cannot reject the null of non-stationarity. The results of the OLS regressions for the variables expressed in logs are listed in table 5: the estimates are noticeably better than those arising from the non-transformed OLS regressions, without having to use arbitrary dummy variables and broken trends. Variables and indicators appear to be cointegrated, and this ensures the T-consistency of the relevant parameters. The bias should be moderate, given the high R² and the absence, at least in the second equation, of significant residual autocorrelation 33.

Though our results are very preliminary and partial, they nevertheless seem to indicate that more effort should be devoted to the analysis of time disaggregation problems in the presence of non-linearly transformed data. It could be useful to examine further the possibilities offered in this context by the extended Kalman filter, as suggested by Harvey and Pierse (1984).

7 Concluding remarks

The paper discusses the main temporal disaggregation procedures, focusing especially on the optimal methods which, being based on indicators, exploit a larger information set. Barcellan and Di Fonzo (1994) provide preliminary evidence on the reliability of these methods in comparison with the univariate ones. In this context our effort is directed towards a critical appraisal of the several procedures in the light of the recent econometric literature. This allows us to highlight the main problems of each method which deserve further analysis.

Some areas for further research have been identified: for instance, the application of the optimal methods in the presence of non linear transformations of variables and of general forms for the covariance matrix of the

32 These and related aspects are treated for example in Granger and Hallman (1991) and Ermini and Granger (1993).
33 However, remember that in the present framework R² is a non-degenerate random variable.
error term; the development of suitable inference tools in order to test the consistency of the disaggregated estimates with the observed, aggregate, series; and the application of cointegration analysis. These issues are particular aspects of the more general problem of finding a well specified model to be employed in the disaggregation of economic time series. While in applied economic analysis model specification exploits the information provided by economic theory, in this framework the same information should be of very limited use. Cointegration analysis should then be used mainly as a statistical device in order to find meaningful relationships among variables and indicators. This raises the crucial point of defining when to stop in the specification search for the "best" model.

The approach followed in the paper helps to show that the assumption of a random walk process for the high frequency disturbance is not appropriate, even if it greatly simplifies the computations. In this case the problem of spurious regression emerges quite clearly, with the danger of obtaining very biased results.

From the discussion above it is clear that the model to be employed for the disaggregation is aimed at providing a reliable measure of the phenomenon under analysis, and not at explaining it. This may help in understanding that, when the aggregate is not yet known, the extrapolated values are not to be considered efficient forecasts, but simply anticipated, provisional measures.
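The spurious-regression danger referred to above can be reproduced directly (a sketch with invented data: the "indicator" is a random walk independent of the target, i.e. an invalid indicator in the sense of Section 5):

```python
import numpy as np

# Sketch: y and x are INDEPENDENT random walks, so no cointegration exists;
# the static regression nevertheless tends to produce a large conventional
# t-statistic -- the classical spurious regression problem.
rng = np.random.default_rng(9)
T = 500
y = np.cumsum(rng.normal(0.0, 1.0, T))
x = np.cumsum(rng.normal(0.0, 1.0, T))        # independent of y: invalid indicator

X = np.column_stack([np.ones(T), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ coef

s2 = (e @ e) / (T - 2)
se_b = np.sqrt(s2 / np.sum((x - x.mean()) ** 2))
t_b = coef[1] / se_b                           # conventional (misleading) t-ratio

dw = np.sum(np.diff(e) ** 2) / (e @ e)         # Durbin-Watson flags the problem
assert dw < 1.0                                # residuals are nonstationary
```

A disaggregation built on such a relationship would distribute the annual series according to a meaningless regression, which is the "very biased results" scenario of the concluding remarks.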
Appendix. The statistical approach to temporal disaggregation

Following Stram and Wei (1986b), let

z_i = W_i β + ε_i    (i = 1,…,mT)    (A.1)

be the true relationship at the disaggregated level, where W_i is the i-th row of the (mT × k) indicators matrix W and ε_i is a stationary component. Under the strong hypothesis that there is no omitted dynamics, (A.1) implies 34

y_t = X_t β + u_t    (t = 1,…,T)    (A.2)

where X_t is the t-th row of the indicators at the aggregated level. Further, let β̃ be a consistent estimate of β in (A.2) and define the preliminary estimate of the unknown disaggregated series {z_i} as

z̃_i = W_i β̃    (A.3)

In general, the difference {ũ_t} between the preliminary estimate and the true series {y_t} is non-null. Stram and Wei (1986b) assume u_t ~ ARIMA(p, d, q) and show that the aggregated discrepancies can be distributed by solving the constrained minimisation problem

min_ε r′ Ω_r^{−1} r    (A.4)

s.t.  Σ_{j=(t−1)m+1}^{tm} ε_j = ũ_t    (t = 1,…,T)    (A.5)

where

r = D^d_mT ε    (A.6)

s = D^d_T ũ    (A.7)

with r = (r_{d+1}, …, r_mT)′ and ε = (ε_1, …, ε_mT)′. D^d_mT is the (mT − d) × mT matrix

          [ δ0  δ1  …  δd  0  …  0 ]
D^d_mT =  [ 0   δ0  δ1  …  δd  …  0 ]
          [ …                      ]
          [ 0   …   0   δ0  δ1  …  δd ]    (A.8)

where δ_j is the coefficient of B^j in (B − 1)^d; in practice, D^d_mT is the matrix difference operator of order d. For example, if d = 1,

          [ −1  1  0  0  …  0  0 ]
D^1_mT =  [ 0  −1  1  0  …  0  0 ]
          [ …                    ]
          [ 0   0  0  0  …  −1  1 ]    (A.9)

Further,

s = C_d r    (A.10)

where C_d is a (T − d) × (mT − d) matrix given by

       [ c′            0    …  0  ]
C_d =  [ 0′_m          c′   …  0  ]
       [ …                        ]
       [ 0′_{(T−d−1)m} …    …  c′ ]    (A.11)

where 0′_j is a (1 × j) vector of zeros and c′ = (c0, c1, …, c_{(d+1)(m−1)}), c_j being the coefficient associated with B^j in (1 + B + … + B^{m−1})^{d+1}. In practice, when d = 1 and m = 4, we have

       [ 1 2 3 4 3 2 1 0 0 0 0 … 0 ]
C_1 =  [ 0 0 0 0 1 2 3 4 3 2 1 … 0 ]
       [ …                         ]
       [ 0 … 0 1 2 3 4 3 2 1       ]    (A.12)

The covariance matrix of (r, s) is

Cov(r, s) = [ Ω_r       Ω_r C_d′ ]
            [ C_d Ω_r   Ω_s      ],    Ω_s = C_d Ω_r C_d′    (A.13)

where Ω_r is the (mT − d) × (mT − d) covariance matrix of r. Usually Ω_r is specified a priori, and this is equivalent to imposing a particular stochastic structure on {ε_i}: the major differences among the alternative optimal methods of temporal disaggregation are essentially based on the specification of the structure of Ω_r. If the r's are normal, the conditional expectation of r given s is

r̂ = E(r|s) = Ω_r C_d′ Ω_s^{−1} s = Ω_r C_d′ (C_d Ω_r C_d′)^{−1} s    (A.14)

Let ũ* = (ũ_{T−d+1}, …, ũ_T)′, denote by I_d the identity matrix of order d, and let J′_m be a (1 × m) vector of ones; then

[ r  ]   [ D^d_mT         ]
[ ũ* ] = [ 0 ⋮ I_d ⊗ J′_m ] ε    (A.15)

Stram and Wei (1986b) show that the matrix [D^d_mT ; 0 ⋮ I_d ⊗ J′_m] is invertible, so that ε̂ can be derived from (A.15) as

ε̂ = [ D^d_mT ; 0 ⋮ I_d ⊗ J′_m ]^{−1} [ r̂ ; ũ* ]    (A.16)

Using (A.14), it is possible to obtain

ε̂ = [ D^d_mT ; 0 ⋮ I_d ⊗ J′_m ]^{−1} [ Ω_r C_d′ Ω_s^{−1} D^d_T ; 0 ⋮ I_d ] ũ = L ũ    (A.17)

where L is the smoothing matrix. The solution {ε̂_i} of (A.4)–(A.5) satisfies the aggregation constraint Σ_{j=(t−1)m+1}^{tm} ε̂_j = ũ_t (t = 1,…,T). Therefore the unknown series {z_i} can be computed as

ẑ = W β̃ + ε̂    (A.18)

When {ũ_t} is stationary, d = 0 and (A.16) becomes

ε̂ = Ω_ε C_0′ (C_0 Ω_ε C_0′)^{−1} ũ    (A.19)

where C_0 corresponds to C in section 2.1. Stationary residuals are not only theoretically more appropriate, but also offer advantages in terms of computing algorithms.

The smoothing matrix L in (A.17) can be used to derive, as particular cases, all the smoothing matrices examined in section 2. Furthermore, this method can also be used to disaggregate a series y without using indicators; in this case it is sufficient to substitute z for ε in (A.6) and y for ũ in (A.7).

34 A static relationship is necessary to write (A.2); see Tiao and Wei (1976).
TABLE OF CONTENTS
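The matrices of the appendix are mechanical to construct, so the key identity behind (A.5)–(A.7) can be checked numerically. Below is a minimal NumPy sketch; the function names are ours, and the covariance matrix Ω is taken as the identity purely for illustration.

```python
import numpy as np

def diff_matrix(n, d):
    """D_n^d: the (n-d) x n matrix difference operator of order d."""
    D = np.eye(n)
    for _ in range(d):
        k = D.shape[0]
        F = np.eye(k - 1, k, k=1) - np.eye(k - 1, k)  # first differences
        D = F @ D
    return D

def c_matrix(T, m, d):
    """C^d: (T-d) x (mT-d); each row holds the coefficients of
    (1 + B + ... + B^(m-1))^(d+1), shifted by m positions per row."""
    c = np.ones(m)
    for _ in range(d):
        c = np.convolve(c, np.ones(m))
    C = np.zeros((T - d, m * T - d))
    for t in range(T - d):
        C[t, t * m:t * m + len(c)] = c
    return C

# For d = 1, m = 4 the row pattern of (A.12) is 1 2 3 4 3 2 1
T, m, d = 5, 4, 1
print(c_matrix(T, m, d)[0, :7])                    # [1. 2. 3. 4. 3. 2. 1.]

# Check that s = C^d r restates the aggregation constraint:
rng = np.random.default_rng(0)
eps = rng.normal(size=m * T)                       # disaggregated discrepancies
u = eps.reshape(T, m).sum(axis=1)                  # annual discrepancies
r = diff_matrix(m * T, d) @ eps                    # (A.6)
s = diff_matrix(T, d) @ u                          # (A.7)
assert np.allclose(c_matrix(T, m, d) @ r, s)

# d = 0 adjustment in the spirit of (A.18), with Omega = I:
C0 = c_matrix(T, m, 0)
eps_hat = C0.T @ np.linalg.solve(C0 @ C0.T, u)
assert np.allclose(C0 @ eps_hat, u)                # aggregation constraint holds
```

With Ω equal to the identity the d = 0 adjustment simply spreads each annual discrepancy evenly over its quarters; richer choices of Ω (the point of the appendix) change how the spread is smoothed.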
Table 1. OLS estimates with trends and dummy variables.

YC_t(21) = −3241.4 + 477.97 · (IPI_t(21) · POUT_t(21)) − 3036.1 · D87 + 1548.3 · TR8492
           (872.60)  (16.176)                            (1912.0)       (262.72)
           [629.25]  [17.524]                            [678.19]       [322.35]
R² = 0.996   F(3,19) = 1648.4 [0.000]   σ = 1830.16 [5.224%]   DW = 0.922   RSS = 63.64·10^6
LM(2) = 7.229 [0.005]   ARCH(1) = 3.413 [0.082]   Norm. = 1.444 [0.486]   Het. = 1.307 [0.320]   RESET = 4.540 [0.047]

Standardised:
YC_t(21) = −0.108 + 0.859 · (IPI_t(21) · POUT_t(21)) − 0.113 · D87 + 0.0576 · TR8492
           (0.0241)  (0.0289)                          (0.0711)      (0.00977)
           [0.0308]  [0.0313]                          [0.0252]      [0.0120]
LM(2) = 0.985 [0.395]   ARCH(1) = 1.187 [0.292]   Norm. = 1.600 [0.449]   Het. = 1.709 [0.208]
ADF_t(1) = −2.717 [−3.253]   W_{β=1} = 26.106 [0.000]

YC_t(41) = −494.30 + 416.96 · (IPI_t(41) · POUT_t(41)) − 3474.8 · D86 + 652.95 · t + 2433.2 · TR7992
           (980.24)  (61.728)                            (1574.5)      (264.00)     (307.31)
           [541.25]  [73.107]                            [743.03]      [314.93]     [315.32]
R² = 0.996   F(4,18) = 3045.9 [0.000]   σ = 1448.69 [3.163%]   DW = 1.320   RSS = 37.8·10^6
LM(2) = 0.992 [0.392]   ARCH(1) = 0.124 [0.729]   Norm. = 0.278 [0.870]   Het. = 1.037 [0.463]   RESET = 5.253 [0.035]

Standardised:
YC_t(41) = −0.563 + 0.530 · (IPI_t(41) · POUT_t(41)) − 0.104 · D86 + 0.0196 · t + 0.0729 · TR7992
           (0.124)   (0.0784)                          (0.0472)      (0.00921)    (0.00791)
           [0.142]   [0.0929]                          [0.0223]      [0.00945]    [0.00944]
LM(2) = 0.300 [0.744]   ARCH(1) = 0.915 [0.352]   Norm. = 0.070 [0.966]   Het. = 0.341 [0.716]
DF_t = −3.001 [−3.243]   W_{β=1} = 35.933 [0.000]

Notes:
The second equation of each pair is estimated on standardised variables. "D86" and "D87" are dummy variables for 1986 and 1987, respectively. "t" indicates a linear trend over the whole period, while "TR8492" and "TR7992" represent two broken trends for the periods 1984-1992 and 1979-1992, respectively. Standard errors are in parentheses and heteroskedasticity-consistent standard errors in brackets (White, 1980). Diagnostics include R², the joint-significance F test and the equation standard error (σ); the Durbin-Watson statistic (DW [Durbin and Watson, 1950; 1951]); the residual sum of squares (RSS); residual autocorrelation tests up to order two (LM(2) [Breusch and Pagan, 1980]); a first-order ARCH test on the residuals (ARCH(1) [Engle, 1982]); normality tests (Norm. [Jarque and Bera, 1980]); heteroskedasticity tests (Het. [White, 1980]); RESET tests (RESET [Ramsey, 1969; Ramsey and Schmidt, 1976]); and Wald tests for the null β = 1 in the standardised equations (W_{β=1}). Dickey-Fuller t-tests (DF [Dickey and Fuller, 1979; 1981]) and augmented Dickey-Fuller t-tests with k lags (ADF(k) [Said and Dickey, 1984]) on the equation residuals are also listed. Under each test the probability level is reported in square brackets; for the DF and ADF(k) tests the 10% critical value is given (MacKinnon, 1991). For σ, the value expressed as a percentage of the mean of the dependent variable is shown in brackets.
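Each coefficient in the tables carries two standard errors: a conventional one in parentheses and a White (1980) heteroskedasticity-consistent one in brackets. A minimal sketch of how both sets are computed (the data and the function name are illustrative, not taken from the paper):

```python
import numpy as np

def ols_se(X, y):
    """OLS estimates with conventional and White (1980, HC0) standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    se_conv = np.sqrt(np.diag(e @ e / (n - k) * XtX_inv))
    meat = X.T @ (X * (e ** 2)[:, None])        # sum_i e_i^2 * x_i x_i'
    se_white = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, se_conv, se_white

# Illustrative data: heteroskedastic noise makes the two sets of
# standard errors diverge, as they do in the tables.
rng = np.random.default_rng(1)
x = rng.normal(size=500)
X = np.column_stack([np.ones(500), x])
y = 1.0 + 2.0 * x + rng.normal(size=500) * (1.0 + np.abs(x))
beta, se_conv, se_white = ols_se(X, y)
```

The "sandwich" form XtX_inv · meat · XtX_inv is what makes the bracketed errors robust to the error variance varying with the regressors.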
Table 2. Dickey-Fuller tests

Variable              k   ADF(k)            LM(2)           ARCH(1)         Norm.           Het.
YC(21)                3   −2.986 [−3.276]   1.418 [0.283]   0.346 [0.568]   1.372 [0.504]   0.279 [0.933]
YC(41)                1   −2.550 [−3.260]   1.453 [0.265]   2.557 [0.131]   0.375 [0.829]   1.281 [0.347]
IPI(21)·POUT(21)      2   −2.490 [−3.268]   1.770 [0.209]   0.009 [0.925]   0.622 [0.733]   0.441 [0.859]
IPI(41)·POUT(41)      0   −2.420 [−3.253]   0.082 [0.921]   0.166 [0.689]   1.287 [0.525]   0.857 [0.513]

Notes:
Diagnostics of the Dickey-Fuller equations and the reporting conventions follow the notes to Table 1: under each test the probability level is reported in square brackets, while for the ADF(k) statistics (Said and Dickey, 1984) the 10% critical value is given (MacKinnon, 1991).
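The ADF(k) statistics above come from a standard Dickey-Fuller regression of the first difference on the lagged level and k lagged differences. A stripped-down sketch (constant, no trend; the exact deterministic terms used in the tables may differ, so this is illustrative only):

```python
import numpy as np

def adf_tstat(y, k=0):
    """t-statistic on the level term in the Dickey-Fuller regression
    dy_t = a + rho*y_{t-1} + b_1*dy_{t-1} + ... + b_k*dy_{t-k} + e_t."""
    dy = np.diff(y)
    n = len(dy)
    Y = dy[k:]                                  # dependent variable
    cols = [np.ones(n - k), y[k:-1]]            # constant, lagged level
    for j in range(1, k + 1):
        cols.append(dy[k - j:n - j])            # lagged differences
    X = np.column_stack(cols)
    beta = np.linalg.solve(X.T @ X, X.T @ Y)
    e = Y - X @ beta
    s2 = e @ e / (len(Y) - X.shape[1])          # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

# White noise is strongly mean-reverting, a random walk is not;
# the 10% critical value with a constant is roughly -2.57 (MacKinnon, 1991).
rng = np.random.default_rng(2)
noise = rng.normal(size=400)
print(adf_tstat(noise, k=1) < adf_tstat(np.cumsum(noise), k=1))  # True
```

Comparing the statistic with a MacKinnon-style critical value, rather than an ordinary t table, is what distinguishes the unit-root test from a standard significance test.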
Table 3. OLS estimates

YC_t(21) = −6221.1 + 557.38 · (IPI_t(21) · POUT_t(21))
           (1158.9)  (13.141)
           [932.55]  [14.738]
R² = 0.998   F(1,21) = 1799.1 [0.000]   σ = 3022.48 [8.628%]   DW = 0.519   RSS = 191.8·10^6
LM(2) = 13.244 [0.005]   ARCH(1) = 6.262 [0.082]   Norm. = 4.675 [0.486]   Het. = 1.078 [0.320]   RESET = 6.944 [0.047]

Standardised:
YC_t(21) = −1.640·10^−16 + 0.994 · (IPI_t(21) · POUT_t(21))
           (0.0234)         (0.0234)
           [0.0234]         [0.0263]
LM(2) = 2.632 [0.101]   ARCH(1) = 0.025 [0.877]   Norm. = 0.872 [0.647]   Het. = 1.210 [0.350]
DF_t = −0.014 [−3.832]   W_{β=1} = 0.0609 [0.807]

YC_t(41) = −5850.0 + 783.07 · (IPI_t(41) · POUT_t(41))
           (1339.8)  (17.087)
           [966.37]  [18.109]
R² = 0.990   F(1,21) = 2100.3 [0.000]   σ = 3474.42 [7.586%]   DW = 0.493   RSS = 253.5·10^6
LM(2) = 13.33 [0.000]   ARCH(1) = 1.004 [0.329]   Norm. = 0.859 [0.651]   Het. = 4.891 [0.020]   RESET = 47.357 [0.000]

Standardised:
YC_t(41) = 2.039·10^−16 + 0.995 · (IPI_t(41) · POUT_t(41))
           (0.0217)        (0.0217)
           [0.0217]        [0.0230]
LM(2) = 1.240 [0.313]   ARCH(1) = 0.385 [0.543]   Norm. = 0.152 [0.927]   Het. = 1.240 [0.314]
DF_t = −1.533 [−3.243]   W_{β=1} = 0.0522 [0.8214]

Notes:
The second equation of each pair is estimated on standardised variables; standard errors, heteroskedasticity-consistent standard errors and diagnostics follow the conventions described in the notes to Table 1.
Table 4. Dickey-Fuller tests on logarithms

Variable                   k   ADF(k)            LM(2)           ARCH(1)         Norm.           Het.
log(YC(21))                2   −0.096 [−3.268]   1.437 [0.273]   1.379 [0.261]   0.984 [0.611]   0.784 [0.635]
log(YC(41))                2    0.904 [−3.268]   0.704 [0.513]   0.223 [0.645]   0.289 [0.865]   0.434 [0.864]
log(IPI(21)·POUT(21))      2   −0.144 [−3.268]   1.588 [0.242]   0.673 [0.427]   0.722 [0.697]   1.013 [0.508]
log(IPI(41)·POUT(41))      2   −2.092 [−3.268]   0.403 [0.676]   0.875 [0.367]   0.576 [0.750]   0.475 [0.837]

Notes:
Diagnostics of the Dickey-Fuller equations and the reporting conventions follow the notes to Table 1: under each test the probability level is reported in square brackets, while for the ADF(k) statistics (Said and Dickey, 1984) the 10% critical value is given (MacKinnon, 1991).
Table 5. OLS estimates of the logarithms

log YC_t(21) = −4.8821 + 1.279 · log(IPI_t(21) · POUT_t(21))
               (0.0599)  (0.0146)
               [0.0605]  [0.0146]
R² = 0.997   F(1,21) = 7668.5 [0.000]   σ = 0.0589 [0.589%]   DW = 1.320   RSS = 0.0729
LM(2) = 5.053 [0.0174]   ARCH(1) = 4.936 [0.0386]   Norm. = 2.106 [0.349]   Het. = 0.0193 [0.981]   RESET = 0.00371 [0.952]

Standardised:
log YC_t(21) = −1.440·10^−15 + 0.999 · log(IPI_t(21) · POUT_t(21))
               (0.0234)         (0.0114)
               [0.0234]         [0.0114]
LM(2) = 2.014 [0.176]   ARCH(1) = 0.015 [0.904]   Norm. = 0.388 [0.842]   Het. = 0.707 [0.685]
ADF_t(3) = −4.686 [−3.276]   W_{β=1} = 0.0144 [0.906]

log YC_t(41) = 5.8119 + 1.1634 · log(IPI_t(41) · POUT_t(41))
               (0.0442)  (0.0111)
               [0.0339]  [0.0100]
R² = 0.998   F(1,21) = 1.1002·10^4 [0.000]   σ = 0.0457 [0.437%]   DW = 1.640   RSS = 0.043
LM(2) = 0.237 [0.791]   ARCH(1) = 0.0499 [0.826]   Norm. = 0.717 [0.699]   Het. = 0.317 [0.732]   RESET = 2.889 [0.104]

Standardised:
log YC_t(41) = 2.199·10^−16 + 0.999 · log(IPI_t(41) · POUT_t(41))
               (0.00952)       (0.00952)
               [0.00952]       [0.00863]
LM(2) = 1.240 [0.726]   ARCH(1) = 0.385 [0.939]   Norm. = 0.152 [0.999]   Het. = 1.240 [0.9402]
DF_t = −1.533 [−3.243]   W_{β=1} = 0.0100 [0.921]

Notes:
The second equation of each pair is estimated on standardised variables; standard errors, heteroskedasticity-consistent standard errors and diagnostics follow the conventions described in the notes to Table 1.
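The W_{β=1} entries in Tables 1, 3 and 5 are one-degree-of-freedom Wald statistics for the null that the slope of a standardised equation equals one. A minimal stdlib-only sketch (the inputs 0.994 and 0.0263 are taken from Table 3 for illustration; the printed statistics may rest on a different variance estimate, so an exact match is not claimed):

```python
import math

def wald_beta_equals_one(beta_hat, se):
    """Wald statistic for H0: beta = 1 and its chi-square(1) probability level."""
    w = ((beta_hat - 1.0) / se) ** 2
    p = math.erfc(math.sqrt(w / 2.0))   # P(chi2_1 > w) via the error function
    return w, p

w, p = wald_beta_equals_one(0.994, 0.0263)   # slope and SE as in Table 3
```

With one restriction the Wald statistic is just the squared t-ratio against β = 1, so its probability level follows directly from the χ²(1) tail.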
References
ABADIR, K.M. (1992), "A Distribution Generating Equation for Unit Root Statistics", Oxford Bulletin of Economics and Statistics, 54, 305-323.
ABADIR, K.M. (1993a), "OLS Bias in a Nonstationary Autoregression", Econometric Theory, 9, 81-93.
ABADIR, K.M. (1993b), "The Limiting Distribution of the Autocorrelation Coefficient Under a Unit Root", Annals of Statistics, 21, 1058-1070.
AFIFI, A.A. – ELASHOFF, R.M. (1966), "Missing Observations in Multivariate Statistics, I: Review of the Literature", Journal of the American Statistical Association, 61, 595-604.
AITKEN, A.C. (1935), "On Least Squares and Linear Combinations of Observations", Proceedings of the Royal Society of Edinburgh, 55, 42-48.
AL-OSH, M. (1989), "A Dynamic Linear Model Approach for Disaggregating Time Series Data", Journal of Forecasting, 8, 85-96.
AMEMIYA, T. (1973), "Generalised Least Squares with an Estimated Autocovariance Matrix", Econometrica, 41, 723-732.
ANDERSON, B.D.O. – MOORE, J.B. (1979), Optimal Filtering. Englewood Cliffs, N.J., Prentice Hall.
ANDREWS, D.W.K. (1987), "Least-Squares Regression with Integrated or Dynamic Regressors under Weak Error Assumptions", Econometric Theory, 3, 98-116.
ANDREWS, D.W.K. – MONAHAN, J.C. (1992), "An Improved Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimator", Econometrica, 60, 953-966.
ANSLEY, C.F. – KOHN, R. (1983), "Exact Likelihood of Vector Autoregressive Moving Average Process with Missing or Aggregated Data", Biometrika, 70, 275-278.
BANERJEE, A. – DOLADO, J.J. – HENDRY, D.F. – SMITH, G.W. (1986), "Exploring Equilibrium Relationships in Econometrics Through Static Models: Some Monte Carlo Evidence", Oxford Bulletin of Economics and Statistics, 48, 253-277.
BARBONE, L. – BODO, G. – VISCO, I. (1981), "Costi e profitti in senso stretto: un'analisi su serie trimestrali, 1970-1980", Bollettino della Banca d'Italia, 36, 465-510.
BARCELLAN, R. – DI FONZO, T. (1994), "Disaggregazione temporale e modelli ARIMA", Atti della XXXVII Riunione Scientifica della SIS, 355-362.
BASSIE, V.L. (1958), Economic Forecasting. New York, McGraw-Hill.
BEVERIDGE, S. – NELSON, C.R. (1981), "A New Approach to Decomposition of Economic Time Series into Permanent and Transitory Components with Particular Attention to Measurement of the 'Business Cycle'", Journal of Monetary Economics, 7, 151-174.
BILLINGSLEY, P. (1968), Convergence of Probability Measures. New York, Wiley.
BOOT, J.C.G. – FEIBES, W. – LISMAN, J.H.C. (1967), "Further Methods of Derivation of Quarterly Figures from Annual Data", Applied Statistics, 16, 65-75.
BOSWIJK, H.P. (1992), "Efficient Inference on Cointegration Parameters in Structural Error Correction Models", University of Amsterdam, Econometrics Discussion Paper.
BOSWIJK, H.P. – FRANSES, P.H. (1992), "Dynamic Specification and Cointegration", Oxford Bulletin of Economics and Statistics, 54, 369-381.
BOURNAY, J. – LAROQUE, G. (1979), "Réflexions sur la méthode d'élaboration des comptes trimestriels", Annales de l'INSEE, 36, 3-30.
BOX, G.E.P. – JENKINS, G.M. (1976), Time Series Analysis: Forecasting and Control (2nd edition). San Francisco, Holden Day.
BREUSCH, T.S. – PAGAN, A.R. (1980), "The Lagrange Multiplier Test and Its Applications to Model Specification in Econometrics", Review of Economic Studies, 47, 239-253.
BUSE, A. (1973), "Goodness of Fit in Generalised Least Squares Estimation", The American Statistician, 27, 106-108.
CARRARO, C. – SARTORE, D. (1987), "Square Root Iterative Filter: Theory and Applications to Econometric Models", Annales d'Economie et de Statistique, 6-7, 435-459.
CHOW, G. – LIN, A.L. (1971), "Best Linear Unbiased Interpolation, Distribution and Extrapolation of Time Series by Related Series", Review of Economics and Statistics, 53, 372-375.
CHOW, G. – LIN, A.L. (1976), "Best Linear Unbiased Estimation of Missing Observations in an Economic Time Series", Journal of the American Statistical Association, 71, 719-721.
COCHRANE, D. – ORCUTT, G.H. (1949), "Application of Least Squares Regression to Relationships Containing Autocorrelated Error Terms", Journal of the American Statistical Association, 44, 32-61.
DAVIDSON, J.E.H. – HENDRY, D.F. – SRBA, F. – YEO, S. (1978), "Econometric Modelling of the Aggregate Time-Series Relationship Between Consumers' Expenditure and Income in the United Kingdom", Economic Journal, 88, 661-692.
DAVIDSON, R. – MACKINNON, J.G. (1993), Estimation and Inference in Econometrics. Oxford, Oxford University Press.
DENTON, F.T. (1971), "Adjustment of Monthly or Quarterly Series to Annual Totals: An Approach Based on Quadratic Minimization", Journal of the American Statistical Association, 66, 99-102.
DI FONZO, T. (1987), La stima indiretta di serie economiche trimestrali. Padova, Cleup Editore.
DI FONZO, T. (1990), "The Disaggregation of M Time Series When Contemporaneous and Temporal Aggregates are Known", Review of Economics and Statistics, 72, 178-182.
DICKEY, D.A. – FULLER, W.A. (1979), "Distribution of the Estimators for Autoregressive Time Series with a Unit Root", Journal of the American Statistical Association, 74, 427-431.
DICKEY, D.A. – FULLER, W.A. (1981), "Likelihood Ratio Statistics for Autoregressive Time Series with a Unit Root", Econometrica, 49, 1057-1072.
DRETTAKIS, E.G. (1973), "Missing Data in Econometric Estimation", Review of Economic Studies, 40, 537-552.
DUFOUR, J.M. (1988), "Estimators of the Disturbance Variance in Econometric Models: Small-sample Bias and the Existence of Moments", Journal of Econometrics, 37, 277-292.
DURBIN, J. – WATSON, G.S. (1950), "Testing for Serial Correlation in Least Squares Regression I", Biometrika, 37, 409-428.
DURBIN, J. – WATSON, G.S. (1951), "Testing for Serial Correlation in Least Squares Regression II", Biometrika, 38, 159-178.
DURLAUF, S.N. – PHILLIPS, P.C.B. (1988), "Trends versus Random Walks in Time Series Analysis", Econometrica, 56, 1333-1354.
EFRON, B. (1979), "Bootstrap Methods: Another Look at the Jackknife", Annals of Statistics, 7, 1-26.
ENGLE, R.F. (1982), "Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation", Econometrica, 50, 987-1008.
ENGLE, R.F. – GRANGER, C.W.J. (1987), "Co-integration and Error Correction: Representation, Estimation and Testing", Econometrica, 55, 251-276.
ENGLE, R.F. – HENDRY, D.F. – RICHARD, J.-F. (1983), "Exogeneity", Econometrica, 51, 277-304.
ERMINI, L. – GRANGER, C.W.J. (1993), "Some Generalizations on the Algebra of I(1) Processes", Journal of Econometrics, 58, 369-384.
FERNANDEZ, R.B. (1981), "A Methodological Note on the Estimation of Time Series", Review of Economics and Statistics, 63, 471-478.
FIEBIG, D.G. – MCALEER, M. – BARTELS, R. (1992), "Properties of Ordinary Least Squares in Regression Models with Nonspherical Disturbances", Journal of Econometrics, 54, 321-334.
GENNARI, P. – GIOVANNINI, E. (1993), "La stima trimestrale dei conti nazionali mediante modelli a parametri variabili", Quaderni di Ricerca, Serie Economica e Ambiente, ISTAT, n. 5/1993.
GILBERT, C.L. (1977), "Regression Using Mixed Annual and Quarterly Data", Journal of Econometrics, 5, 221-239.
GINSBURGH, V.A. (1973), "A Further Note on the Derivation of Quarterly Figures Consistent with Annual Data", Applied Statistics, 22, 368-374.
GODFREY, L.G. – MCALEER, M. – MCKENZIE, C.R. (1988), "Variable Addition and Lagrange Multiplier Tests for Linear and Logarithmic Regression Models", Review of Economics and Statistics, 70, 492-503.
GRANGER, C.W.J. – HALLMAN, J. (1991), "Nonlinear Transformations of Integrated Time Series", Journal of Time Series Analysis, 12, 207-224.
GRANGER, C.W.J. – NEWBOLD, P. (1974), "Spurious Regressions in Econometrics", Journal of Econometrics, 2, 111-120.
GUERRERO, V.M. (1990), "Temporal Disaggregation of Time Series: An ARIMA-based Approach", International Statistical Review, 58, 29-36.
HAAVELMO, T. (1943), "The Statistical Implications of a System of Simultaneous Equations", Econometrica, 11, 1-12.
HANSEN, B.E. (1992), "Consistent Covariance Matrix Estimation for Dependent Heterogeneous Processes", Econometrica, 60, 967-972.
HANSEN, L.P. (1982), "Large Sample Properties of Generalised Method of Moments Estimators", Econometrica, 50, 1029-1054.
HARVEY, A.C. (1981), The Econometric Analysis of Time Series. Oxford, Philip Allan.
HARVEY, A.C. – MCKENZIE, C.R. (1984), "Missing Observations in Dynamic Econometric Models: A Partial Synthesis", in E. Parzen (ed.), Time Series Analysis of Irregularly Observed Data. Berlin, Springer-Verlag.
HARVEY, A.C. – PHILLIPS, G.D.A. (1979), "Maximum Likelihood Estimation of Regression Models with Autoregressive-Moving Average Disturbances", Biometrika, 66, 49-58.
HARVEY, A.C. – PIERSE, R.G. (1984), "Estimating Missing Observations in Economic Time Series", Journal of the American Statistical Association, 79, 125-131.
HENDRY, D.F. (1973), "On Asymptotic Theory and Finite Sample Experiments", Economica, 40, 210-217.
HENDRY, D.F. (1982), "A Reply to Professors Maasoumi and Phillips", Journal of Econometrics, 19, 203-213.
HENDRY, D.F. (1993), "On the Interactions of Unit Roots and Exogeneity", Nuffield College, mimeo.
HENDRY, D.F. – MIZON, G.E. (1978), "Serial Correlation as a Convenient Simplification, not a Nuisance: A Comment on a Study of the Demand for Money by the Bank of England", Economic Journal, 88, 549-563.
HERRNDORF, N. (1984), "A Functional Central Limit Theorem for Weakly Dependent Sequences of Random Variables", Annals of Probability, 12, 141-153.
HILDRETH, C. – LU, J.Y. (1960), "Demand Relations with Autocorrelated Disturbances", Michigan State University, Research Bulletin 76.
HYLLEBERG, S. – MIZON, G.E. (1989), "Cointegration and Error Correction Mechanisms", Economic Journal (Supplement), 99, 113-125.
INDER, B. (1993), "Estimating Long-Run Relationships in Economics: A Comparison of Different Approaches", Journal of Econometrics, 57, 53-68.
ISTAT (1992), "I conti economici trimestrali con base 1980", Note e Relazioni, 1.
JARQUE, C.M. – BERA, A.K. (1980), "Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals", Economics Letters, 6, 255-259.
JOHANSEN, S. (1988), "Statistical Analysis of Cointegration Vectors", Journal of Economic Dynamics and Control, 12, 231-254.
JOHANSEN, S. (1992), "Cointegration in Partial Systems and the Efficiency of Single-Equation Analysis", Journal of Econometrics, 52, 389-402.
LITTERMAN, R.B. (1983), "A Random Walk, Markov Model for the Distribution of Time Series", Journal of Business and Economic Statistics, 1, 169-173.
LUPI, C. (1992), Measures of Persistence, Trends and Cycles in I(1) Time Series: A Rather Skeptical Viewpoint. University of Oxford, Institute of Economics and Statistics, Applied Economics Discussion Paper No. 137.
MACKINNON, J.G. (1991), "Critical Values for Cointegration Tests", in R.F. Engle and C.W.J. Granger (eds.), Long-Run Economic Relationships: Readings in Cointegration. Oxford, Oxford University Press.
MCLEISH, D.L. (1975a), "A Maximal Inequality and Dependent Strong Laws", Annals of Probability, 3, 829-839.
MCLEISH, D.L. (1975b), "Invariance Principles for Dependent Variables", Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 32, 165-168.
NELSON, C.R. – PLOSSER, C.I. (1982), "Trends and Random Walks in Macroeconomic Time Series: Some Evidence and Implications", Journal of Monetary Economics, 10, 139-162.
NEWEY, W.K. – WEST, K.D. (1987), "A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix", Econometrica, 55, 703-708.
NICHOLLS, D.F. – PAGAN, A.R. (1977), "Specification of the Disturbance for Efficient Estimation: An Extended Analysis", Econometrica, 45, 211-217.
NICKELL, S.J. (1985), "Error Correction, Partial Adjustment and All That: An Expository Note", Oxford Bulletin of Economics and Statistics, 47, 119-129.
ONOFRI, P. (1992), "Osservazione empirica e analisi economica: esperienze di indagine sulle fluttuazioni cicliche", SIS Bollettino, n. 26, 134-164.
PALM, F.C. – NIJMAN, T.E. (1984), "Missing Observations in the Dynamic Regression Model", Econometrica, 52, 1415-1435.
PARK, J.Y. – PHILLIPS, P.C.B. (1988), "Statistical Inference in Regressions with Integrated Processes: Part 1", Econometric Theory, 4, 468-497.
PARK, J.Y. – PHILLIPS, P.C.B. (1989), "Statistical Inference in Regressions with Integrated Processes: Part 2", Econometric Theory, 5, 95-131.
PENA, D. – TIAO, G.C. (1991), "A Note on Likelihood Estimation of Missing Values in Time Series", The American Statistician, 45, 212-213.
PHILLIPS, P.C.B. (1986), "Understanding Spurious Regressions in Econometrics", Journal of Econometrics, 33, 311-340.
PHILLIPS, P.C.B. (1987a), "Time Series Regression with a Unit Root", Econometrica, 55, 277-301.
PHILLIPS, P.C.B. (1987b), "Towards a Unified Asymptotic Theory for Autoregression", Biometrika, 74, 535-547.
PHILLIPS, P.C.B. (1988), "Reflections on Econometric Methodology", Economic Record, 64, 544-559.
PHILLIPS, P.C.B. (1991), "Optimal Inference in Cointegrated Systems", Econometrica, 59, 283-306.
PHILLIPS, P.C.B. – DURLAUF, S.N. (1986), "Multiple Time Series Regression with Integrated Processes", Review of Economic Studies, 53, 473-495.
PHILLIPS, P.C.B. – HANSEN, B.E. (1990), "Statistical Inference in Instrumental Variables Regression with I(1) Processes", Review of Economic Studies, 57, 99-125.
PHILLIPS, P.C.B. – LORETAN, M. (1991), "Estimating Long-Run Economic Equilibria", Review of Economic Studies, 58, 407-436.
PHILLIPS, P.C.B. – OULIARIS, S. (1990), "Asymptotic Properties of Residual Based Tests for Cointegration", Econometrica, 58, 165-193.
PHILLIPS, P.C.B. – PARK, J.Y. (1988), "Asymptotic Equivalence of Ordinary Least Squares and Generalised Least Squares in Regressions with Integrated Variables", Journal of the American Statistical Association, 83, 111-115.
RAMSEY, J.B. (1969), "Tests for Specification Errors in Classical Linear Least Squares Regression Analysis", Journal of the Royal Statistical Society, Series B, 31, 350-371.
RAMSEY, J.B. – SCHMIDT, P. (1976), "Some Further Results on the Use of OLS and BLUS Residuals in Specification Error Tests", Journal of the American Statistical Association, 71, 389-390.
ROSSI, N. (1982), "A Note on the Estimation of Disaggregated Time Series When the Aggregate is Known", Review of Economics and Statistics, 64, 695-696.
SAID, S.E. – DICKEY, D.A. (1984), "Testing for Unit Roots in Autoregressive-Moving Average Models of Unknown Order", Biometrika, 71, 599-607.
SARGAN, J.D. – DRETTAKIS, E.G. (1974), "Missing Data in an Autoregressive Model", International Economic Review, 15, 39-58.
SIMS, C.A. – STOCK, J.H. – WATSON, M.W. (1990), "Inference in Linear Time Series Models with Some Unit Roots", Econometrica, 58, 113-144.
STOCK, J.H. (1987), "Asymptotic Properties of Least Squares Estimators of Cointegrating Vectors", Econometrica, 55, 1035-1056.
STONE, R. (1990), "Adjusting the National Accounts", Annali di Statistica, 9, 311-325.
STONE, R. – CHAMPERNOWNE, D.G. – MEADE, J.E. (1942), "The Precision of National Income Estimates", Review of Economic Studies, 9, 111-125.
STRAM, D.O. – WEI, W.W.S. (1986a), "Temporal Aggregation in the ARIMA Process", Journal of Time Series Analysis, 7, 279-292.
STRAM, D.O. – WEI, W.W.S. (1986b), "A Methodological Note on the Disaggregation of Time Series Totals", Journal of Time Series Analysis, 7, 293-302.
TAYLOR, W.E. (1981), "On the Efficiency of the Cochrane-Orcutt Estimator", Journal of Econometrics, 17, 67-82.
THEIL, H. – GOLDBERGER, A.S. (1961), "On Pure and Mixed Statistical Estimation in Economics", International Economic Review, 2, 65-78.
TIAO, G.C. – WEI, W.W.S. (1976), "Effect of Temporal Aggregation on the Dynamic Relationship of Two Time Series Variables", Biometrika, 63, 513-523.
TSERKEZOS, D.E. (1993), "Quarterly Disaggregation of the Annually Known Greek Gross Industrial Product Using Related Series", Rivista Internazionale di Scienze Economiche e Commerciali, 40, 457-474.
VANGREVELINGHE, G. (1966), "L'évolution à court terme de la consommation des ménages : connaissance, analyse et prévision", Études et Conjoncture, 9, 54-102.
VISCO, I. (1983), "Materiale per un seminario ISTAT sulla trimestralizzazione", mimeo, Roma.
WEI, W.W.S. – STRAM, D.O. (1990), "Disaggregation of Time Series Models", Journal of the Royal Statistical Society, Series B, 52, 453-467.
WHITE, H. (1980), "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity", Econometrica, 48, 817-838.
WHITE, H. (1984), Asymptotic Theory for Econometricians. London, Academic Press.
ZIVOT, E. – ANDREWS, D.W.K. (1992), "Further Evidence on the Great Crash, the Oil-Price Shock, and the Unit-Root Hypothesis", Journal of Business and Economic Statistics, 10, 251-270.
Propositions pour une désagrégation temporelle basée sur des modèles
dynamiques simples
Stéphane GREGOIR
INSEE
La construction de comptes nationaux trimestriels français se fait de manière indirecte.
Elle repose sur l’utilisation de méthodes économétriques qui permettent de quantifier les
relations entre des indicateurs infra-annuels et des postes comptables évalués
annuellement à partir d’informations très fines. Dans le passé récent, les comptes
trimestriels français n’ont pas su décrire avec précision l’amplitude des mouvements
macro-économiques. Cette faiblesse est peut-être liée au fait que les relations
économétriques utilisées dans ce processus sont statiques et présentent parfois des
spécifications de faible qualité au vu des apports récents de la théorie des séries
temporelles. Ce travail analyse quelques aspects de l’emploi de modèles dynamiques
simples dans la construction de séries désagrégées en sous-périodes. Différentes
spécifications avec peu de retards sont considérées, la construction de comptes
trimestriels bruts semble être une approche possédant un certain nombre d’avantages.
1 Introduction
The purpose of quarterly accounts is to describe the evolution of economic aggregates at an infra-annual frequency within a coherent accounting framework, so as to make it possible to observe business cycles and to evaluate the lags between events, and thereby to monitor, understand and forecast the dynamics of economic movements more effectively. They also allow the estimation of macroeconomic models. Finally, they seek to construct an evaluation of the current year that is as robust and as precise as possible given the available infra-annual information.

The national institutes in charge of constructing quarterly accounts can be divided into two categories according to the method they use. The first approach relies on a system of surveys of economic agents and applies the same method to construct the annual and the quarterly data. The second approach uses indirect econometric methods to construct the quarterly evaluations, which are benchmarked, over the past, to the annual data obtained by methods of the survey type and/or the exploitation of fiscal sources (see Di Fonzo, 1986, or Di Fonzo and Filosa, 1987, for a survey of the literature). These two approaches are not so far apart: the second can be interpreted as a survey-based method in which the adjustment procedures rely on regression models. A more general, model-based theory of survey sampling therefore makes it possible to reconcile the two approaches if necessary.

In France, the second approach is used by INSEE to construct the quarterly account evaluations. In recent years, during which the French economy has experienced a marked cyclical movement, one problem has taken on noticeable proportions. It has been observed that the revisions that the quarterly accountants have to make to their first estimates are strongly correlated with the phase of the cycle: upward revisions in the expansion phase, downward revisions in the contraction phase. Among several possible explanations, two have attracted the accountants' attention. The first concerns the type of indicator used to construct the evolution of an aggregate: when this indicator covers a constant population of units, it cannot capture the demography of firms and may therefore underestimate growth in the expansion phase, and conversely. Another possible explanation is that the econometric models used to construct the current evaluations are static and do not describe the unfolding of the dynamic movements.
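The indirect approach described above can be sketched in a few lines; the following is a minimal illustration with synthetic data (the series, coefficients and the naive final distribution step are assumptions made for the example, not the INSEE production method):

```python
import numpy as np

# Illustrative sketch of the "indirect" approach: regress annual
# accounting values on an annualised quarterly indicator (synthetic data).
rng = np.random.default_rng(0)

A = 20                                          # number of years
T = 4 * A                                       # number of quarters
x = 100 + np.cumsum(rng.normal(0.5, 1.0, T))    # quarterly indicator (synthetic)
X_annual = x.reshape(A, 4).sum(axis=1)          # annualised indicator
Y = 2.0 * X_annual + rng.normal(0.0, 5.0, A)    # annual account (synthetic)

# OLS of the annual account on the annualised indicator
Z = np.column_stack([np.ones(A), X_annual])
coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)

# A naive quarterly evaluation then distributes the fit over the quarters
y_quarterly = coef[0] / 4 + coef[1] * x
print(coef)
```

By construction, the annual sums of this naive quarterly evaluation reproduce the fitted annual values; the benchmarking issues discussed in the rest of the paper arise precisely because the real problem also involves dynamics and annual consistency constraints.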
The aim of this work is to address this last criticism by seeking dynamic specifications that capture both the short-term and the medium- and long-term properties. To do so, the recent developments of time-series analysis (integration, cointegration) must be taken into account in order to obtain appropriate specifications. Above all, it seems important to avoid using models whose errors are integrated and therefore have a variance that grows with time. After proposing a specification with few lags that is general enough to accommodate these properties, we show that several quarterly specifications are compatible with a given annual specification while satisfying certain intertemporal consistency constraints (such as the sum of the four quarters being equal to the year for flow variables). In this work we look for simple, easily replicable methods that do not demand too much statistical or algorithmic expertise, so that they can be used by many people with heterogeneous levels of knowledge. Depending on whether one wants to construct seasonally adjusted accounts or raw accounts, the simple models that we propose do not exhibit similar properties.

The outline of the paper is as follows. First, we recall the definitions of integration and cointegration and show how these properties carry over from annual series to quarterly series (preservation of properties), but we also stress the impossibility of recovering properties of the same kind at frequencies other than 0. In other words, when one wants to construct raw accounts, all the information on seasonality can only be supplied by the indicators, and the method must transmit it to the account variables with as little distortion as possible. We then show, with the help of an example, the consequences that neglecting the integration properties of the variables can have on estimation and inference. Next, we propose a general specification which, through appropriate constraints, can encompass a large number of cases. We then turn to the use of a simple dynamic model compatible with this specification for constructing seasonally adjusted quarterly accounting variables (flows and stocks). The search for a simple estimation method based on a least-squares principle does not yield good statistical properties and turns out to be costly in computing time; some simulations illustrate this point. We then turn to the construction of seasonal (raw) accounts, for which we obtain better properties: ease of use and optimality. Finally, we carry out some full-scale trials.

2 Some properties of time series

2.1 Definitions

The econometric approach to the construction of the quarterly national accounts relies on a field of statistics that has developed considerably over the last twenty years, that of time series. The use of these mathematical techniques in the quantitative analysis of economic phenomena has prompted the introduction of new concepts and the study of particular classes of time series. Indeed, empirical work in the early eighties (Nelson and Plosser (1982) inter alia) showed that many macroeconomic series could be adequately described in statistical terms by non-stationary processes. The study of these processes has led to the introduction of numerous testing and estimation techniques. More precisely, the most commonly used statistical theory of time processes relies on the concept of second-order stationary series. These processes are such that their first- and second-order moments satisfy time-invariance properties that allow the usual inference. A purely non-deterministic second-order stationary process thus admits a unique Wold representation of the form:

x_t = c(B) ε_t

where B is the lag operator, B x_t = x_{t-1}, ε_t is a white noise with zero mean, c(0) = 1, c(u) = Σ_{i=0}^{+∞} c_i u^i satisfies Σ_{i=1}^{+∞} c_i² < +∞, and the zeros of this polynomial lie on or outside the unit circle. The usual analysis of such a process is carried out by means of its autocorrelation structure and its spectrum. Many economic series do not seem to satisfy this representation, but are better described by processes integrated of order 1, or even 2, whose definition is the following (cf. Engle and Granger (1987)):

Definition 1: a purely non-deterministic stochastic process is said to be integrated of order h (h an integer), denoted I(h), if (1-B)^h x_t is a second-order stationary process whose polynomial c(u), associated with its Wold representation, satisfies c(1) ≠ 0.

A second-order stationary time series is sometimes called I(0). There exist other forms of integration associated with non-stationarities linked to seasonality. This type of seasonal integration cannot be taken into account directly in the modelling of quarterly series from the sole observation of the series of annual quantities. On the other hand, the use of related series observed at an infra-annual frequency provides information on these properties. We therefore refer the reader interested in these aspects to other works (Hylleberg, Engle, Granger and Yoo (1991) or Gregoir (1994)).

Another concept has also gained importance in recent years owing to its interpretability in economic terms: cointegration (Engle and Granger (1987)). This concept characterises the situation in which there exists a linear combination of at least two series integrated of the same order that is integrated of a lower order. Mathematically, this is stated as follows:

Definition 2: a multivariate process x_t of dimension n, integrated of order h, is said to be cointegrated of degree m (m an integer) if there exists a vector α (1×n) such that α x_t is a process integrated of order h-m.

The situation most commonly observed in empirical studies corresponds to cointegration of degree 1 between variables integrated of order 1, thus yielding cointegrating relationships that are second-order stationary. We shall restrict ourselves to this situation in what follows, but an extension of the propositions stated below can easily be obtained in more general settings. Moreover, there exist other forms of cointegration linked to seasonality, which we shall not treat explicitly in this work but to which we shall occasionally refer later.

In this paragraph, and by default in the remainder of this text, we have placed ourselves in the usual framework of time-series analysis. A broader framework could be used: we could consider the framework of periodic processes (Gladishev, 1961) or, more generally, that of harmonisable processes (Loève, 1963, Karhunen 1947) to model the structure of the quarterly series in the absence of information on the seasonal nature of the series. But the use of this framework would not be parsimonious (a proliferation of parameters to isolate the non-stationarities) and would reduce the quality of the inferences.

2.2 Notation

In practice, we shall work from series of annual values of the national accounts and from series of indicators observed at an infra-annual frequency in order to construct quarterly evaluations. It is therefore important to know whether the non-stationarity properties are preserved between the two kinds of series. We must distinguish two situations, according to whether we are dealing with flow variables or stock variables. But before that, we need to introduce some notation that will be used in the remainder of this work.
In what follows we shall denote annual quantities by capital letters (X, Y, ...) and the corresponding quarterly quantities by lower-case letters (x, y, ...). "a" will be the index of annual time, "t" the index of quarterly time, and "i" the number of the quarter within the year (i = 1, 2, 3, 4). Thus, by construction, for any value of "t" there exist a unique value of "a" and a unique value of "i" such that t = 4(a-1) + i.

Flow variables satisfy the equation

Y_a = Σ_{i=1}^{4} y_{4(a-1)+i}

whereas stock variables satisfy Y_a = y_{4a}.

2.3 Preservation of integration and cointegration properties between quarterly and annual series

Before anything else, it is important to ask whether the properties described above are preserved in the temporal aggregation process. In fact, the answer is relatively simple: all the forms of integration and cointegration not linked to seasonality carry over from the quarterly observations to the annual observations, and conversely.

Proposition 1: let x_t be a univariate quarterly series and X_a the corresponding annual series; then, provided the two suitably differenced series admit a Wold representation,

x_t ≈ I(h) ⇔ X_a ≈ I(h)

Proof: see Annex 1.

This property is satisfied whether one considers a flow variable or a stock variable. On the other hand, and quite obviously, the integration properties linked to seasonality are not preserved between quarterly and annual observations, since the information on the seasonal integration structure is not preserved.

Proposition 2: let x_t be a multivariate series of dimension n, I(1), and X_a the corresponding annual series; then, provided the two suitably differenced series admit a Wold representation,

∃ α, α x_t ≈ I(0) ⇔ ∃ α, α X_a ≈ I(0)

Proof: see Annex 1.

Here again, this property is satisfied both for stock variables and for flow variables.

Remark 1: we have not considered in this part processes with a deterministic component, but the properties just stated for the orders of integration remain valid for the degrees of the polynomials of the deterministic parts of the quarterly and annual quantities, as well as for the cointegration properties. Likewise, the loss of information on the integration structure linked to seasonality persists when one considers the associated deterministic trends, namely polynomials that are linear combinations of monomials of the form t^k cos ωt, t^k sin ωt, ω = π/2, π.

2.4 Consequences of omitting these properties

In practice, the integration and cointegration properties of the variables must be taken into account in order to carry out a sound econometric analysis. In the work that concerns us, we look for situations in which the information contributed by economic series available at an infra-annual level makes it possible to construct relevant infra-annual evaluations of the accounting items. In general, this comes down to a least-squares regression of the annual evaluations of the accounting items on annualised infra-annual indicators. If, for example, these variables are integrated of order 1 but are not cointegrated, working on the levels of the variables has non-negligible consequences for the quality of the econometric fit. The following result gives an illustration of these situations:

Example 1: let X_t be an I(1) variable satisfying the relation (1-B)X_t = β(1-B)Y_t + ε_t for t = 1, ..., T, where ε_t is i.i.d. with variance σ²_ε, Y_t is defined by (1-B)Y_t = η_t with η_t i.i.d. with variance σ²_η and independent of ε_t at all dates, and X_0 = Y_0 = ε_0 = 0; let β̂_T be the ordinary least squares
estimator of the regression of X_t on Y_t; then, as T tends to infinity:

β̂_T = β + O_p(1)

(β̂_{T+1} / β̂_T) - 1 = O_p(1/T)

In other words, if one mistakenly runs a regression in levels when it should be run in first differences, one may obtain an unbiased estimator that does not converge and for which the usual rules of normal inference do not apply. The residuals of the equation are not stationary and their variance grows linearly with time. The second result, on the convergence of the coefficient estimator, shows that if one wants to analyse the sources of revisions of the forecasts, the integrated nature of the error terms plays an important role in the size of the revision. Indeed, we can write:

x_{T+1} - x̂_{T+1} = ((β̂_{T+1} - β̂_T) / β̂_T) x̂_{T+1} + ε̂_{T+1}

and we obtain the following convergence:

(x_{T+1} - x̂_{T+1}) / √T = O_p(1)

which means that the revision error grows as O_p(√T).

2.5 General specification of a dynamic equation on annual data

We have just illustrated the fact that a poor specification of an equation between possibly integrated variables can have non-negligible consequences for the inference one may carry out and for the forecasts one may establish. A large number of macroeconomic series seem to be fairly well described by an integrated-variable modelling, so the various possible situations must be accommodated through a sufficiently flexible specification. For the sake of simplicity, we shall restrict ourselves here to specifications with few lags, which is relatively well suited to models for annual quantities, somewhat less so for quarterly quantities. The extension of this formulation to more complex linear models is straightforward.

The following form:

Y_t = α Y_{t-1} + β X_t + γ X_{t-1} + ε_t

encompasses the various situations that can be encountered with series integrated of order 1. Indeed:

• if X_t ≈ I(1), Y_t ≈ I(1) and they are cointegrated, then we must use an error-correction representation (representation theorem of Granger and Weiss (1983)). The parametrisation α = 1+a, β = c, γ = ab - c corresponds to this situation. Ordinary least squares applied to this form, under the assumption that the error term is orthogonal to the variables on the right-hand side of the equation, gives consistent estimators of the parameters α, β, γ.

• if X_t ≈ I(1), Y_t ≈ I(1) and they are not cointegrated, then the representation must involve the first differences of the variables. The appropriate parametrisation is α = 1, β = a, γ = -a. Ordinary least squares applied to this form, under the assumption that the error term is orthogonal to the variables on the right-hand side of the equation, gives a super-consistent estimator of the parameter α and consistent estimators of β, γ.

• if X_t ≈ I(0), Y_t ≈ I(1), the integrated variable must enter in first differences: α = 1, β = a, γ = b. Ordinary least squares applied to this form, under the assumption that the error term is orthogonal to the variables on the right-hand side of the equation, gives a super-consistent estimator of the parameter α and consistent estimators of the parameters β, γ.

• if X_t ≈ I(1), Y_t ≈ I(0), the integrated variable must enter in first differences: α = a, β = b, γ = -b. Ordinary least squares applied to this form, under the assumption that the error term is orthogonal to the variables on the right-hand side of the equation, gives consistent estimators of the parameters α, β, γ.
• if X_t ≈ I(0), Y_t ≈ I(0), there is no problem with the representation; we are in a standard setting.

For models with variables integrated of order two, it suffices to increase the number of lags in the equation by 1. In general, nothing can be asserted about the nature of ε_t. There may be a correlation between the endogenous variables and the error terms; if so, the form of this correlation must be modelled.

The different types of convergence stated above imply that the usual rules of inference cannot be used systematically. Depending on the nature of the variables and on the existence or not of a cointegrating relationship, the properties of the usual statistics are not the same and the tables to be used differ. To decide which situation one is in, a number of tests (integration and cointegration tests) must be carried out, and their power is low. Nevertheless, recent work (Kitamura and Phillips (1994), Kitamura (1994), Phillips (1993a, 1993b)) makes it possible to construct test statistics for linear constraints on the coefficients whose asymptotic distribution is a chi-square whatever the individual or collective properties of the variables, as long as one manipulates series integrated at most of order 1. These tests are nevertheless slightly conservative.

Quarterly accountants face the problem of "creating" a quarterly dynamics consistent with an annual dynamics, which means that the integration and cointegration properties must be respected. One could start from an annual dynamics and look for a compatible quarterly dynamics, but this route is relatively complicated: in some cases we would be led to consider infinite-moving-average dynamics on the quarterly variables in order to obtain simple autoregressive dynamics on the annual quantities. We prefer to proceed the other way round: start from the quarterly level to construct an annual dynamics and deduce what can be inferred about the infra-annual dynamics. Schematically, two solutions can be envisaged. We can construct seasonally adjusted quarterly accounts modelled by a simple infra-annual dynamics of ARI type, using seasonally adjusted indicators, and look for the compatible aggregate dynamics. Another approach consists in constructing raw accounts from a SARI modelling of the quarterly accounts, using raw indicators, and studying the compatible annual dynamics. The latter solution has the advantage of allowing the construction of working-day-adjusted quarterly accounts as a later stage of the production process. We propose to study these two solutions with parsimonious specifications of the dynamics mentioned above.

3 Construction of a seasonally adjusted quarterly model

3.1 The model

3.1.1 Flow variables

We first consider flow variables. We postulate a simple dynamics of the following form:

y_t = α y_{t-1} + x_t β + ε_t,  |α| ≤ 1   (1)

where ε_t ≈ i.i.d. with E[ε_t | x^t, y^{t-1}] = 0 and V[ε_t | x^t, y^{t-1}] = σ²_ε; x_t is a vector of indicators observed quarterly (it may contain first differences or lagged variables) and x^t denotes the set of past values taken by the stochastic process x up to date t. We deduce from this the annual dynamics:

Y_a = α⁴ Y_{a-1} + X_a β + α X_{a,1} β + α² X_{a,2} β + α³ X_{a,3} β + E_a

with X_{a,j} = Σ_{i=1}^{4} x_{4(a-1)+i-j}

and

E_a = ε_{4a} + (α + 1) ε_{4a-1} + (α² + α + 1) ε_{4a-2} + (α³ + α² + α + 1) ε_{4a-3} + (α³ + α² + α) ε_{4a-4} + (α³ + α²) ε_{4a-5} + α³ ε_{4a-6}
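The annual dynamics just derived is an exact algebraic identity and can be checked numerically; the sketch below does so on simulated data (the parameter values and series are illustrative assumptions):

```python
import numpy as np

# Numerical check (synthetic data, illustrative parameters) of the annual
# dynamics implied by the quarterly flow model (1):
#   Y_a = alpha^4 Y_{a-1} + sum_{j=0}^{3} alpha^j X_{a,j} beta + E_a
rng = np.random.default_rng(1)
alpha, beta = 0.8, 1.5
A = 8
N = 4 * A                        # quarters t = 1..N (index 0 unused, y_0 = 0)
x, eps, y = np.zeros(N + 1), np.zeros(N + 1), np.zeros(N + 1)
x[1:] = rng.normal(0, 1, N)
eps[1:] = rng.normal(0, 1, N)
for t in range(1, N + 1):
    y[t] = alpha * y[t - 1] + beta * x[t] + eps[t]

def Ya(a):                       # annual sum of year a (flow variable)
    return y[4 * (a - 1) + 1: 4 * a + 1].sum()

def Xaj(a, j):                   # X_{a,j} = sum_{i=1}^{4} x_{4(a-1)+i-j}
    return x[4 * (a - 1) + 1 - j: 4 * a + 1 - j].sum()

w = [1, alpha + 1, alpha**2 + alpha + 1, alpha**3 + alpha**2 + alpha + 1,
     alpha**3 + alpha**2 + alpha, alpha**3 + alpha**2, alpha**3]

def Ea(a):                       # moving average of the quarterly errors
    return sum(w[k] * eps[4 * a - k] for k in range(7))

max_err = max(abs(Ya(a) - (alpha**4 * Ya(a - 1)
                           + beta * sum(alpha**j * Xaj(a, j) for j in range(4))
                           + Ea(a)))
              for a in range(2, A + 1))
print(max_err)
```

The maximal discrepancy is at the level of floating-point rounding, confirming the term-by-term derivation of X_{a,j} and of the error weights in E_a.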
This representation is not a canonical representation of the annual process; in particular, we have not isolated the innovation of the process, since Cov(Y_{a-1}, E_a) ≠ 0. Moreover, the simple autoregressive dynamics on the quarterly variable of interest entails a notable modification of the dynamics of the indicator, which we have not made explicit so far in the deduced annual form.

We cannot use the annual representation in a simple and direct way to estimate the parameters of interest of the quarterly dynamics. Using all the available information would require non-linear methods, and rather methods of the instrumental-variable type, because of the correlation between the error term and the lagged endogenous variable. Ignoring the non-linear constraints and estimating the model by an instrumental-variable method leads to an identifiability problem on the sign of α.¹

3.1.2 Stock variables

We now consider stock variables. As before, we postulate a simple dynamics of the following form:

y_t = α y_{t-1} + x_t β + ε_t,  |α| ≤ 1   (2)

where ε_t ≈ i.i.d. with E[ε_t | x^t, y^{t-1}] = 0, V[ε_t | x^t, y^{t-1}] = σ²_ε, and x_t is a vector of indicators observed quarterly. We deduce from this the annual dynamics:

Y_a = α⁴ Y_{a-1} + x_{4a} β + α x_{4a-1} β + α² x_{4a-2} β + α³ x_{4a-3} β + E_a

with E_a = ε_{4a} + α ε_{4a-1} + α² ε_{4a-2} + α³ ε_{4a-3}.

This representation is a canonical representation of the annual process: we have isolated the innovation of the process, since Cov(Y_{a-1}, E_a) = 0. The simple autoregressive dynamics on the quarterly variable of interest entails an annual dynamics of the indicator in which each of the observations made during the year appears.

As for flow variables, we cannot use the annual representation in a simple and direct way to estimate the parameters of interest of the quarterly dynamics. Using all the available information would require non-linear methods. Ignoring the non-linear constraints and estimating the model by a least-squares method here again leads to an identifiability problem on the sign of α.

3.2 Estimation method and its properties

We do not want to consider sophisticated methods for estimating these models. We need simple, robust methods that can be used by all the people involved in the construction of quarterly accounts. Moreover, methods that are too costly in computing time would be penalising: it must be possible to re-estimate quickly the few thousand equations used in the construction of an account. We therefore turn to methods of the ordinary-least-squares type (at least in the way the problem is written).

We thus propose to work directly from the quarterly representation and to minimise the sum of squared error terms under the constraint of consistency between the quarterly evaluations and the annual accounts. We do not take into account the positivity constraints that exist on accounting variables, insofar as a binding constraint would mean that an accounting value must be set equal to zero, which has little relevance unless exogenous information allows us to establish that fact. Likewise, we do not impose the constraint |α| ≤ 1. In practice, if the absolute value of the estimator takes a value greater than 1, this may be information on the order of integration of the series (a series

¹ A parallel can be drawn here between the autoregressive model and the model with autocorrelated errors as considered by Bournay and Laroque (1979). Under that specification there is also an identifiability problem, since a ratio of cross-products of residuals allows the estimation of the autocorrelation coefficient of the residuals, but raised to the power 4.
intégrée d’ordre 2) et inciter à introduire un retard
supplémentaire dans la représentation autorégressive
considérée. En revanche, si la valeur est faiblement
supérieure à 1, il faudrait être en mesure de tester son
égalité afin d’imposer la valeur 1 pour éviter des
processus explosifs. La fonctionnelle que nous devons
minimiser est quadratique en les paramètres {y, α , β}
où est le vecteur des variables trimestrielles
inobservées. Nous considérons donc pour des
variables de flux le problème :

Min ∑ ( yt − αyt − 1 − x t β)
Ι
( ) y ,α , β t = 2
 Ι ⊗ (1111) y = Y
 A
T
(b)
i+ 1
i+ 1
où i V$ =
(
i
$y−1
)
x et
i
i
V$
i+ 1
)
(
α$ x i + 1 β$ − v
i+ 1
α$
)
i+ 1
$y0 

(
) ( Ι−
(
)(
$J
y$ =  Ι − i + 1 α

 Ι − i + 1 α$ J

'
'
i+ 1
i+ 1
)
α$ J 

−1
)
i+ 1
α$ et $y0 + x i + 1β$ + Φ ' i + 1 Λ$ 

est
une
variable
de
flux}
,
variables de flux après concaténation à e a , un vecteur
de taille A avec un 1 en première position et des 0
ensuite, on obtient la matrice d’annualisation
((ea Φ ) = Ιa ⊗ (1 1 1 1) ) et pour des variables
de stocks, après concaténation à un vecteur nul de
dimension A, on obtient la matrice d’annualisation
appropriée Ιa ⊗ ( 0 0 0 1).
2
Le fait que les fonctions que l’on minimise sont
quadratiques en les paramètres et que les contraintes
sont linéaires en les paramètres assure que ce type
élémentaire d’algorithme converge (plus ou moins
vite). Néanmoins les propriétés statistiques des
estimateurs que l’on obtient sont mauvaises. Il est
simple de remarquer les estimateurs y$ ne sont pas
convergents dans la mesure où le nombre de
paramètres augmente avec le nombre d’observations.
Ceci est toujours vérifié dans ce type de problème. En
revanche l’étude des biais apporte de l’information sur
la qualité des estimations.
i
i
)
y$
y$ −1 =i $y 1 e t + J i $y, J est la
Proposition 3: les estimateurs
{ y$ , α$ ,β$} solutions du
matrice de Jordan de taille (4A-1,4A-1) (avec des 1
sous la diagonale principale et des 0 ailleurs) et e t est
un vecteur de taille 4A-1 avec un 1 en première
position et des 0 ailleurs,
programme (I) ou (II) sont des estimateurs biaisés de
{y, α , β}.
étape 2(i+1) :
Les défauts de cette approche sont relativement
nombreux. En plus du biais possible des estimateurs, il
faut remarquer qu’au cours d’une nouvelle année,
lorsque l’on utilise le modèle sans information sur la
valeur annuelle, on utilise généralement des
indicateurs susceptibles de révisions au fur et à mesure
que les réponses aux enquêtes peuvent être exploitées.
Preuve : voir annexe 2
(a)
i+ 1
(
−1
Φ (α) = Φ (Ι − α J) , où Φ est telle que pour des
) ( V$ '
−1
))
α$ '
−1
étape 1(i+1) :
i
i+ 1
avec v(α) = αΦ ( α) et + e a 1{(y
où ΙA est la matrice identité de dimension A = T ÷ 4. La
résolution numérique de ces problèmes est
relativement simple et peut être obtenue en résolvant
de façon itérative les conditions du premier ordre
déduites du Lagrangien du problème. Ceci s’écrit sous
la forme suivante où l’on a séparé la condition initiale
y1 du reste du vecteur y:
( V$ '
) (
$ Φ
α
(c)
et pour des variables de stock, le problème suivant:
 i + 1 α$ 

=
 i + 1 β$ 


i+ 1
 Y − Φ

2
T

( yt − αyt − 1 − x t β)
∑
 Min
(ΙΙ) y ,α ,β t = 2
 Ι ⊗ ( 0001) y = Y
 A
((
Λ$ = Φ
y$ 1 =
(
V
i+ 1
) ( ( α$ )Φ ( α$ )') (Φ ( α$ )x β$ − Y )
V ( α$ )' (Φ ( α$ )Φ ( α$ )') v( α$ )
α$ ' Φ
i+ 1
i+ 1
i+ 1
i+ 1
−1
i+ 1
i+ 1
−1
i+ 1
i+ 1
148
TABLE OF CONTENTS
Il s’ensuit que la construction du compte serait
fragilisée par l’accumulation des erreurs faites sur
l’évaluation de chaque trimestre du fait du caractère
auto régressif du modèle.
avec x. Sous H3 et H5, les deux variables sont
intégrées et cointégrées entre elles.
Nous avons effectué 1000 simulations d’échantillons
de 100 observations et représenté la distribution des
paramètres estimés à l’aide d’un estimateur à noyau
gaussien.
3.3 Expériences de simulations
Nous avons procédé à quelques exercices de
simulations pour illustrer cette méthode et ses défauts.
Pour ce faire, nous avons simuler cinq différents
modèles dynamiques en trimestriel, puis nous avons
annualisé ces observations et effectué une estimation à
l’aide de la méthode décrite ci-dessus. Les modèles
simulés sont cohérents avec la nulle de la méthode
d’estimation proposée, ils correspondent simplement à
différentes configurations de stationnarité ou
d’intégration des variables intervenant dans la
relation. Les cinq modèles simulés sont les suivants :
H1:
y t = 0.8 y t − 1 + xt + ε t , x ≈ Ι( 0), i.i.d.N(0,10),
ε i.i.d.N (0,1)
H2:
y t = yt − 1 + x t + ε t , x ≈ Ι(0), i.i.d.N(0,10),
ε i.i.d.N (0,1)
H3:
y t = 0.8 y t − 1 + xt + ε t , x ≈ Ι(1),
(1-B)xt i.i.d.N (0,10), ε i.i.d.N(0,1)
H4:
H5:
yt
ε
yt
ε
Les distributions que nous obtenons sont
asymétriques, la masse à gauche du mode est plus
importante que la masse à sa droite pour les
hypothèses H1,2,3,5 et inversement pour H4. Même si
le mode semble bien positionné, un biais peut
subsister. La valeur du biais empirique que nous
obtenons sur ces échantillons est illustré dans le
tableau 1.
Les biais empiriques sont relativement faibles et
d’autant plus faibles que les variables sont intégrées
d’ordre 1.
Ceci est compréhensible dans la mesure où la
divergence des moments croisés est plus rapide pour
ces séries.
Les distributions que nous obtenons pour le paramètre
de l’indicateur présentent une asymétrie inverse à celle
des distributions des α, la masse à droite du mode est
plus importante que la masse à sa gauche pour les
hypothèses H1,2,3,5 et inversement pour H4. Le mode
semble correspondre à la valeur sous l’hypothèse
nulle. Le biais empirique que nous obtenons sur ces
échantillons est illustré dans le tableau 2.
= xt + ε t , x ≈ Ι(0), i.i.d.N(0,10),
i.i.d.N (0,1)
= xt + ε t , x ≈ Ι(1),(1-B)x t i.i.d.N (0,10),
i.i.d.N(0,1)
Dans H1 et H4, la série simulée est stationnaire. Dans
H2, elle est intégrée d’ordre 1 mais non cointégrée
Tableau 1: bias empirique
biais sur α
H1
H2
H3
H4
H5
-0.012
-0.003
-0.002
0.038
-0.003
H1
H2
H3
H4
H5
0.033
0.020
0.010
-0.027
0.003
Tableau 2: bias empirique
biais sur β
149
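A minimal sketch of this estimation method is given below for programme (I) with a single indicator and a flow variable, on synthetic H1-type data; the initialisation (spreading each year evenly) and the fixed number of iterations are assumptions made for the example, and step 2 is solved here as one KKT linear system rather than through the separated updates of y_1, Λ and y:

```python
import numpy as np

# Sketch (synthetic data, one indicator, flow variable) of programme (I):
# minimise sum_{t>=2} (y_t - alpha*y_{t-1} - x_t*beta)^2 subject to the
# annual sums of y matching Y, alternating an OLS step for (alpha, beta)
# with a constrained quadratic step for y.
rng = np.random.default_rng(2)
A, alpha0, beta0 = 25, 0.8, 1.0
T = 4 * A
x = rng.normal(0, np.sqrt(10), T)
eps = rng.normal(0, 1, T)
y_true = np.zeros(T)
for t in range(1, T):
    y_true[t] = alpha0 * y_true[t - 1] + beta0 * x[t] + eps[t]
C = np.kron(np.eye(A), np.ones((1, 4)))       # annualisation matrix (flows)
Y = C @ y_true                                # observed annual accounts

y = np.repeat(Y / 4.0, 4)                     # initial evaluation: even spread
for _ in range(200):
    # step 1: OLS of y_t on (y_{t-1}, x_t) for t = 2..T
    V = np.column_stack([y[:-1], x[1:]])
    ab, *_ = np.linalg.lstsq(V, y[1:], rcond=None)
    a_hat, b_hat = ab
    # step 2: minimise ||D y - b_hat x||^2 s.t. C y = Y via the KKT system
    D = np.eye(T)[1:] - a_hat * np.eye(T)[:-1]    # rows: y_t - a_hat*y_{t-1}
    KKT = np.block([[2 * D.T @ D, C.T],
                    [C, np.zeros((A, A))]])
    rhs = np.concatenate([2 * D.T @ (b_hat * x[1:]), Y])
    y = np.linalg.solve(KKT, rhs)[:T]
print(a_hat, b_hat)
```

Since the objective is quadratic and the constraints linear, each step lowers the criterion, which illustrates the convergence remark made in the text; the annual constraints hold exactly at every iteration.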
[Figure: kernel density estimates of the distribution of alpha under H1, H2, H3, H4 and H5]

[Figure: kernel density estimates of the distribution of beta under H1, H2, H3, H4 and H5]
The empirical biases are larger for the coefficient of the variable playing the role of the indicator. They are smaller (and opposite in sign) when the variables are integrated of order 1 and not cointegrated with each other. Cointegration introduces stationary terms that pollute the speed of convergence for all the coefficients. The biases observed on the β are of opposite sign to the one observed on the α; this bias comes from the interaction of the annual constraints imposed in the estimation. There is thus a kind of compensation between the two errors, so that the error made on the quarterly evaluations may be smaller.

4 Construction of a seasonal quarterly model

If two raw indicators are cointegrated, it is desirable that the accounting variables constructed from their observation satisfy the same property. Qualitatively, one sees that this preservation of the seasonal cointegration properties will be satisfied if the accounting variables are constructed from the same one-sided "stationary" linear filterings of the indicators. By linearity, the properties will then be preserved.

4.1 The model

Writing a seasonal model is understood here in the sense that the behaviour of the quarterly variables can be described reasonably well by a SAR representation with few lags. In this representation, the variable of interest exhibits strong autocorrelation of order 4. If we consider variables integrated at seasonal frequencies, we then have to work with the fourth differences of the variables.

4.1.1 Flow variables

We therefore propose to consider dynamic models of the following form:

y_t = α y_{t-4} + x_t β + ε_t,   |α| ≤ 1                    (3)

where ε_t is i.i.d. with E(ε_t | x_t, y_{t-1}) = 0 and V(ε_t | x_t, y_{t-1}) = σ²_ε, and x_t is a vector of indicators observed quarterly as non-seasonally-adjusted data (it may contain fourth differences or variables lagged in time). We deduce from it the annual dynamics:

Y_a = α Y_{a-1} + X_a β + E_a

The great simplicity of the passage from the quarterly model to the annual model makes it possible to consider models with seasonal autocorrelation of the residuals (it is preserved at the annual level), as proposed by Bournay and Laroque (1979). The autocorrelation of the residuals is then measured directly on the annual residuals.

What was said about a specification able to encompass different cointegration configurations between the variables remains valid, except that this specification cannot accommodate the forms of seasonal cointegration. In that case we would indeed have to bring in the level variables lagged by 1, 2 and 3 dates (cf. the representation theorem stated in Gregoir (1994)). But for the work we want to carry out, this is of little relevance, because such a seasonal cointegration structure cannot be inferred from the information available at the annual level. In other words, it is not identifiable. Indeed, in the simplest case of seasonal cointegration we have, when we consider a vector of endogenous variables:

y_t − y_{t-4} = α_0 β_0 (y_{t-1} + y_{t-2} + y_{t-3} + y_{t-4})
            + α_π β_π (y_{t-1} − y_{t-2} + y_{t-3} − y_{t-4})
            + α_{1π} [ β_{1π} (y_{t-1} − y_{t-3}) + β_{2π} (y_{t-2} − y_{t-4}) ]
            + α_{2π} [ β_{1π} (y_{t-2} − y_{t-4}) + β_{2π} (y_{t-3} − y_{t-1}) ] + ε_t

but the only cointegration relation we can observe is the one described by β_0 (associated with frequency 0), which survives in the annual quantities (the proof of this statement is relatively long; it is not given so as not to weigh down the text). It in fact seems more important to construct accounting series that reproduce the seasonal cointegration properties found in the indicators.
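The passage from the quarterly flow model (3) to its annual counterpart can be checked numerically. The sketch below is ours, with illustrative parameter values: summing the four quarters of each year reproduces the annual dynamics exactly.

```python
import numpy as np

# Check that the quarterly flow model  y_t = alpha*y_{t-4} + x_t*beta + e_t
# aggregates exactly to the annual model  Y_a = alpha*Y_{a-1} + X_a*beta + E_a,
# where annual quantities are sums of the four quarters of each year.
rng = np.random.default_rng(1)
alpha, beta, A = 0.9, 1.5, 12          # 12 years of quarterly data (illustrative)
T = 4 * A
x = rng.normal(size=T)
e = rng.normal(scale=0.1, size=T)
y = np.zeros(T)
y[:4] = x[:4] * beta + e[:4]           # arbitrary start-up quarters
for t in range(4, T):
    y[t] = alpha * y[t - 4] + x[t] * beta + e[t]

annual = lambda z: z.reshape(A, 4).sum(axis=1)  # flow aggregation: sum of quarters
Y, X, E = annual(y), annual(x), annual(e)
lhs = Y[1:]
rhs = alpha * Y[:-1] + beta * X[1:] + E[1:]
print(np.allclose(lhs, rhs))           # prints True
```

The equality is exact, not approximate: each quarter of year a satisfies the recursion against the matching quarter of year a−1, and summation is linear.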
4.1.2 Stock variables

For stock variables, we propose to consider dynamic models of the following form:

y_t = α y_{t-4} + x_t β + ε_t,   |α| ≤ 1                    (4)

where ε_t is i.i.d. with E(ε_t | x_t, y_{t-1}) = 0 and V(ε_t | x_t, y_{t-1}) = σ²_ε, and x_t is a vector of indicators observed quarterly as non-seasonally-adjusted data. We deduce from it the annual dynamics:

Y_a = α Y_{a-1} + x_{4a} β + ε_{4a}

Here again, the simplicity of the passage from the quarterly model to the annual model makes it possible to consider models with seasonal autocorrelation of the residuals, which is measured on the annual residuals.

4.2 Identifiability of the quarterly series

Nevertheless, the use of such a modelling is not without drawbacks. The most important one concerns the identifiability of the quarterly series from the annual information. Indeed, there exists an infinite number of quarterly series satisfying both such an autoregressive specification and the accounting constraint linking the quarterly quantities to the annual quantity. This is particularly obvious for stock variables, since we impose no constraint on the values of the first three quarters of the year. But it is also true for the most general model of flow variables.

Proposition 4: if y_t satisfies the autoregressive dynamics φ(B⁴) y_t = z_t, where φ(u) is of degree n, and satisfies the constraints ∀a, Σ_{i=1}^{4} y_{4(a-1)+i} = Y_a, then

y*_{4(a-1)+i} = y_{4(a-1)+i} + Σ_{j=1}^{n} λ_j^a k_{j,i}

also satisfies the constraints and the autoregressive dynamics when (λ_j)_{j=1,…,n} are the roots of the polynomial φ(u) and the real constants k_{j,i}, j = 1,…,n, i = 1,2,3,4, are such that ∀j, Σ_{i=1}^{4} k_{j,i} = 0.

Example 2: In the case of model (3), the processes of the form y*_{4(a-1)+i} = y_{4(a-1)+i} + Σ_{j=1}^{n} λ_j^a k_{j,i} satisfy both the autoregressive dynamics and the constraints on the annual sums. The space of possible solutions is thus of dimension three, but these degrees of freedom play a smaller and smaller role in the evolution of the series as one moves away from the initial date.

This means that we must impose additional information in order to be able to estimate a quarterly series. Various criteria can be considered; we shall restrict ourselves to criteria associated with the variability of the series. Insofar as we do not observe the true series, it seems prudent to us to look for series with low variability, or at least series for which the initial conditions contribute as little as possible to the total variability of the series.

4.3 The estimation method and its properties

We look for simple methods that are easy to implement, cheap in computing time and consistent with standard statistical approaches. The previous paragraph showed that additional information must be introduced to obtain the estimate of a quarterly series. In general terms, we state the program for flow variables in the following form, without taking into account the constraint on the absolute value of α:

(III)   Min_{y_0}  Criterion( ŷ(y_0) )
        subject to  (1 1 1 1) y_0 = Y_0,
        with  ŷ(y_0) ∈ arg min_{y,α,β}  Σ_{t=5}^{T} (y_t − α y_{t-4} − x_t β)²
        subject to  I_A ⊗ (1 1 1 1) y = Y
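The identifiability problem stated in Proposition 4 and Example 2 can be demonstrated numerically. In this sketch of ours for model (3), the perturbation λ^a k_i with λ equal to the autoregressive root α and Σ_i k_i = 0 changes the quarterly profile while leaving both the annual sums and the autoregressive residuals untouched.

```python
import numpy as np

# Non-identifiability demo: perturb a quarterly path satisfying
# y_t = alpha*y_{t-4} + x_t by lambda^a * k_i (lambda = alpha, sum_i k_i = 0);
# the annual sums and the residuals y_t - alpha*y_{t-4} are unchanged.
rng = np.random.default_rng(2)
alpha, A = 0.8, 10
T = 4 * A
x = rng.normal(size=T)
y = np.zeros(T)
y[:4] = x[:4]
for t in range(4, T):
    y[t] = alpha * y[t - 4] + x[t]

k = np.array([3.0, -1.0, -1.0, -1.0])     # any k summing to zero (3 free dims)
lam = alpha                               # the autoregressive root
pert = np.concatenate([lam ** a * k for a in range(1, A + 1)])
y_star = y + pert                         # a different quarterly profile

# same annual sums ...
print(np.allclose(y.reshape(A, 4).sum(1), y_star.reshape(A, 4).sum(1)))
# ... and the same autoregressive residuals
print(np.allclose(y[4:] - alpha * y[:-4], y_star[4:] - alpha * y_star[:-4]))
```

Both checks print True: the cancellation λ^a k_i − α λ^{a-1} k_i = 0 holds term by term, and each year's perturbation sums to λ^a Σ_i k_i = 0, which is exactly why extra information on the initial conditions is needed.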
where y_0 represents the initial conditions onto which the indeterminacy is shifted. The problem for stock variables is written in the same way, up to the constraints:

(IV)    Min_{y_0}  Criterion( ŷ(y_0) )
        subject to  (0 0 0 1) y_0 = Y_0,
        with  ŷ(y_0) ∈ arg min_{y,α,β}  Σ_{t=5}^{T} (y_t − α y_{t-4} − x_t β)²
        subject to  I_A ⊗ (0 0 0 1) y = Y

The properties of the solutions of the "primary" program of minimisation of the sum of squared residuals are close to those found by Chow and Lin (1971) as far as the estimators are concerned.

Proposition 5: the estimators of (α, β) obtained from the "primary" program are the least squares estimators of the annual model.

Proof: see Annex 2.

Moreover, the quadratic character of the functions being minimised ensures that the families of solutions ŷ(y_0) are linear functions of the initial conditions:

Proposition 6: the solutions ŷ(y_0) of the "primary" program are given by the following equations:

ŷ(y_0) = [ (I_{A-1} − α̂ J_{A-1})^{-1} ⊗ I_4 ] ( α̂ (e_{A-1} ⊗ I_4) y_0 + x β̂ )
       + [ (I_{A-1} − α̂ J_{A-1})^{-1} ⊗ (1 1 1 1)' ] Ê / 4

for flow variables, and

ŷ(y_0) = [ (I_{A-1} − α̂ J_{A-1})^{-1} ⊗ I_4 ] ( α̂ (e_{A-1} ⊗ I_4) y_0 + x β̂ )
       + [ (I_{A-1} − α̂ J_{A-1})^{-1} ⊗ (0 0 0 1)' ] Ê

for stock variables, where e_{A-1} is a vector of size A−1 whose entries are all equal to zero except the first, which equals 1, J_{A-1} is the first Jordan matrix, whose first subdiagonal equals 1 and whose other entries are zero, and Ê is the vector of estimated residuals of the corresponding annual model.

Proof: see Annex 2.

These properties are preserved if we consider models with slightly more complicated specifications, i.e. with more lags or with fourth-order autocorrelation of the error terms. In the latter case it is simply necessary to estimate in several steps, so as to construct a generalised least squares estimator of the parameters of the model on annual quantities.

The solutions of this program are simple (cf. Annex 2); nevertheless, the interactions that may exist between the initial conditions and the indicator series can lead to revisions at the start of the period during the re-estimations required after the introduction of new observations.

We must now look at the problems raised by the complete program under consideration. The first natural criterion we can consider is the one selecting the trajectory with the lowest variability². The criterion is therefore:

V( y_0 , ŷ(y_0) )

the empirical variance of the complete quarterly path.

² We leave aside here the problems of non-stationarity of the solution obtained; the variance of the I(1) variable in level is an unsatisfactory criterion, insofar as it diverges in T (where T is the sample size) and, properly redefined by dividing it by T (which is of no consequence at finite distance), it converges to a random variable. This criterion therefore does not always have statistical relevance. One would need a satisfactory criterion converging to a constant whatever the nature of the series, integrated or not. Note however that the divergence of the variance cannot be due to the initial conditions, but to the presence of integrated variables among the indicators.

The fact that the solution of the "primary" problem is linear in the initial conditions also makes it possible to take as criterion the contribution of the initial conditions to the variance of the series³. The criterion becomes:

Cov( (y_0 , ŷ(y_0)) , [ (I_{A-1} − α̂ J_{A-1})^{-1} α̂ e_{A-1} ⊗ I_4 ] y_0 )  /  V( y_0 , ŷ(y_0) )

³ A remark similar to the one made for the previous criterion remains relevant.

The first-order conditions of this program generally characterise two extrema, a minimum and a maximum (cf. Annex 2). Computing the objective function allows the appropriate initial conditions to be chosen. Note that solving these programs with an autoregressive model with more lags raises no major technical difficulty.

When the number of lags in the dynamic equation is greater than one, the drawbacks of such a method lie in the fact that one works first and foremost on changes. The successive evaluations of the years become sensitive to the growth rates of earlier years. This also fits the idea of working on stationary variables. By contrast, when there is a single lag and the lag coefficient has a value close to 1, revising the previous year's value shifts the current value by the same amount and preserves the growth rate to first order (absent revisions of the indicators).

Moreover, another simple, easily implemented method would consist in constructing initial values satisfying the annual constraint and reproducing the profile of the raw indicator used.

Finally, a concern for the readability of the accounts may lead quarterly accountants to construct the series showing the fewest possible changes between two publications⁴. The criterion then becomes the minimisation of the distance between the two latest estimates over a selected period, which is written:

( ŷ_1 − ŷ_2(y_0) )' P ( ŷ_1 − ŷ_2(y_0) )

where ŷ_1 comes from the previous year's data, ŷ_2(y_0) is computed with the revised data and restricted to the corresponding observation period, and P selects the period over which one wishes to make the observations as close as possible, possibly with weights to single out the periods of interest. The problem with this approach lies in its initialisation.

⁴ This approach was suggested by Jacques Bournay.

5 Study of the empirical results

5.1 Choice of the evaluation criteria

It is difficult to define criteria that make it possible, at the quarterly level, to judge the quality of an econometric fit among several available models. Only the "performance" of the model at the annual level can be evaluated, insofar as only the annual quantities of the dependent variable are observed. Nevertheless, in national accounting terms, it takes four years to know the "final" value of an aggregate, and quarterly accounts evaluations, based on relatively simple indicators corresponding to aggregated levels of the nomenclature, cannot claim to describe with precision changes that will be obtained at the end of a four-year process based on firms' accounts at a fine level. It therefore seems useful to me to compare the performance of the competing methods against the first available evaluation of the annual accounts (the provisional account), which results from a confrontation of the annual and quarterly accountants' data. Nevertheless, one should not attach too much significance to these comparisons, insofar as the relative weight of each of the parties taking part in the consultation varies across variables and years.
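The inner step of the "primary" program, minimising the sum of squared quarterly residuals subject to the annual accounting constraint, can be sketched as an equality-constrained least squares problem. This is our own simplified illustration: α and β are held fixed (the paper estimates them jointly), the names and toy data are ours, and flow aggregation (quarters sum to the year) is assumed.

```python
import numpy as np

def benchmark_flows(Y_annual, x, alpha, beta, y0):
    """Quarterly path minimizing sum_t (y_t - alpha*y_{t-4} - beta*x_t)^2
    subject to each year's four quarters summing to Y_annual.
    y0: the four first-year quarters (initial conditions, assumed given)."""
    A = len(Y_annual)                          # constrained years
    n = 4 * A                                  # unknown quarters
    # residual operator: r = D y - c, with D y giving y_t - alpha*y_{t-4}
    D = np.eye(n) - alpha * np.eye(n, k=-4)
    c = beta * x.copy()
    c[:4] += alpha * y0                        # year-lag of the first unknowns is y0
    psi = np.kron(np.eye(A), np.ones(4))       # annualisation matrix (sums quarters)
    # KKT system of the equality-constrained least squares problem
    K = np.block([[2 * D.T @ D, psi.T], [psi, np.zeros((A, A))]])
    rhs = np.concatenate([2 * D.T @ c, Y_annual])
    sol = np.linalg.solve(K, rhs)
    return sol[:n]                             # the benchmarked quarterly path

# toy data: a smooth indicator and annual totals to honour
rng = np.random.default_rng(3)
A = 6
x = rng.normal(size=4 * A).cumsum()
y0 = np.array([1.0, 2.0, 1.5, 2.5])
Y_annual = np.abs(rng.normal(10.0, 2.0, size=A))
y = benchmark_flows(Y_annual, x, alpha=0.8, beta=1.0, y0=y0)
print(np.allclose(y.reshape(A, 4).sum(1), Y_annual))   # prints True
```

The outer step of program (III), choosing y_0 by one of the variability criteria above, would then wrap this solver in a one-dimensional-per-quarter optimisation over the initial conditions.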
5.2 The case of textile production on French data

For the period 1970-1992, the calibration retained in the quarterly accounts is the following:

y = 25.57 x + 1283.88 t

The root mean squared error (RMSE) on annual data is 3769.9 and the Durbin-Watson statistic is 0.44, showing a strong correlation of the annual residuals. The search for a fourth-order autoregressive model on the same series leads to the following model (a dummy variable was introduced in 1986, the starting year of the backward calculation of the 1980 base):

y = 1.04 y_{-4} + 35.2 x − 35.6 x_{-4}

The RMSE is 2121.7 and the Durbin-Watson statistic is 1.82. The statistics obtained for the two models allow us to say that the dynamic model dominates the static model from a statistical point of view.

In Table 3 we have simulated at horizon one the two models retained in the two approaches, for different vintages of the annual accounts. We compared the rates given by these equations with the value published after the consultation of the quarterly and annual accountants. The benchmarked dynamic model gives reasonable results except in 1992 and 1993. The benchmarked static model, for its part, gives more satisfactory rates for those two years, but is dominated by the benchmarked dynamic model for the first three years.

One must nevertheless be cautious about the meaning of these results: they depend on the weight of each participant in the consultation phase.

We used the results presented above to compute the corresponding quarterly series, imposing the constraint |α| ≤ 1 in the estimation.

In Graph 1, the series obtained for the two retained criteria (minimisation of the variance and minimisation of the contribution of the initial conditions to the variance) are displayed. We observe that the series obtained by minimising the contribution of the initial conditions to the variance has a practically flat first year. In the absence of an indicator used implicitly in the estimation of the infra-annual dynamics of this first year, it seems natural that minimising the contribution to the variance should correspond to a relatively flat profile in the initialisation year. The profile obtained for the other years differs between the two methods insofar as the autoregressive coefficient on the accounting variable equals 1, which entails a persistent influence of the first year's profile on the rest of the series. Thus the series obtained by minimising the variance appears much less variable than the series obtained by minimising the contribution of the initial conditions to the variance.

We then compare the series obtained by minimising the contribution to the variance with the series published in the quarterly accounts in the detailed results of the fourth quarter of 1992. The series obtained from the dynamic model plays the role of a raw series, since it is calibrated on a raw indicator; it therefore shows more marked fluctuations.

Table 3: Simulation at horizon 1

year   static model    static model   dynamic model   dynamic model   provisional
       unbenchmarked   benchmarked    unbenchmarked   benchmarked     publication
1989   -2.7            -1.2           -2.6            -0.2            -0.3
1990   -5.5            -2.1           -2.6            -1.9            -1.1
1991   -7.0            -3.1           -3.6            -2.6            -1.9
1992   -6.1            -2.2           -2.3            -0.8            -2.9
1993   -7.2            -3.6           -4.1            -5.3            -1.7
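The two fit statistics quoted for each calibration can be computed as follows. This is a generic sketch of ours on simulated residuals, not the French series:

```python
import numpy as np

def rmse(resid):
    # root mean squared error of a residual vector
    return np.sqrt(np.mean(resid ** 2))

def durbin_watson(resid):
    # DW = sum of squared first differences over sum of squares;
    # close to 2 for uncorrelated residuals, close to 0 under strong
    # positive autocorrelation (as with the static calibrations above)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(4)
white = rng.normal(size=200)           # uncorrelated residuals
persistent = np.cumsum(white) * 0.1    # strongly autocorrelated residuals
print(round(durbin_watson(white), 2))       # close to 2
print(round(durbin_watson(persistent), 2))  # close to 0
```

Read against the tables: a DW of 0.44 or 0.46 signals the heavily correlated annual residuals of the static calibrations, while the dynamic models' values near 2 indicate residuals much closer to white noise.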
Graph 1: Textile production
Graph 2: Comparison with the quarterly accounts
5.3 The case of the production of non-ferrous ores and metals on French data

For the period 1970-1992, the calibration retained in the quarterly accounts is the following:

y = 10.24 x + 730.26 t

The root mean squared error (RMSE) on annual data is 2901.3 and the Durbin-Watson statistic is 0.73. The search for a simple fourth-order autoregressive model on the same series leads to the following model:

y = 0.89 y_{-4} + 13.68 x − 12.29 x_{-4}

The RMSE is 2114.3 and the Durbin-Watson statistic is 2.18. The dynamic model obtained is close to the following model:

(1 − 0.89 B⁴)(y − 13.74 x) = ε

which points to a possible cointegration relation between the variables, a relation poorly captured in the static model because of the presence of a deterministic trend. Here again, the statistics obtained for the dynamic model are much more satisfactory than those obtained for the static model.

We used the results presented above to compute the corresponding quarterly series. In the graph below, the series obtained for the two retained criteria (minimisation of the variance and minimisation of the contribution of the initial conditions to the variance) are displayed. From the third year onwards, the profiles obtained by the two methods are very close. The influence of the initial conditions disappears with the fast convergence to 0 of the powers of 0.89.

We then compare the series obtained by minimising the contribution to the variance with the series published in the quarterly accounts in the detailed results of the fourth quarter of 1992 (Graph 4). It should be noted that the profile of the published quarterly accounts is particularly jagged at the end of the period, pointing to a problem in the treatment of seasonality.

In Table 4 we have simulated at horizon one the two models retained in the two approaches, for different vintages of the annual accounts. As before, we compared the rates given by these equations with the value published after the consultation of the quarterly and annual accountants. Neither model seems of superior quality; slightly complicating the dynamic model might perhaps improve its results.

5.4 The case of the production of foundry and metal working on French data

For the period 1970-1992, the calibration retained in the quarterly accounts is the following:

y = 32.78 x

The root mean squared error (RMSE) on annual data is 4412.7 and the Durbin-Watson statistic is 0.46, showing a strong correlation of the annual residuals. The search for a simple fourth-order autoregressive model on the same series leads to the following model:

y = 0.82 y_{-4} + 29.27 x − 23.24 x_{-4}

Table 4: Simulation at horizon 1

year   static model    static model   dynamic model   dynamic model   provisional
       unbenchmarked   benchmarked    unbenchmarked   benchmarked     publication
1989    6.9             3.5            2.4             2.7             6.9
1990    3.6             2.3            0.8             1.6             0.9
1991    3.9            -0.1           -2.5            -5.7            -3.3
1992    5.7             2.6            1.4             2.9            -3.1
1993   -1.0            -4.4           -7.8            -7.0            -1.9
Graph 3: Non-ferrous ores and metals
Graph 4: Comparison with the quarterly accounts
The RMSE is 2958.2 and the Durbin-Watson statistic is 1.15. We note that the dynamic model is close to a model of the type

(1 − 0.82 B⁴)(y − 28.90 x) = ε

which suggests the possibility of a cointegration relation between the accounting item and the indicator, which corresponded to the choice made in the static method. Here again, the statistics obtained for the dynamic model are much more satisfactory than those obtained for the static model. Nevertheless, the strong correlation of the residuals of the dynamic model may prompt a search for a model with autocorrelated residuals. This is possible with a slight modification of the algorithms considered, but will not be done here so as not to overload the discussion.

As before, we used the results presented above to compute the corresponding quarterly series. In the graph below, the series obtained for the two retained criteria (minimisation of the variance and minimisation of the contribution of the initial conditions to the variance) are displayed. As we can observe, the two methods give very similar series from the fifth year onwards; the profile of the initial conditions of the series obtained by minimising the variance is quite jagged and plays a more persistent role in the profile of the rest of the series.

We finally compare the series obtained by minimising the contribution to the variance with the series published in the quarterly accounts in the detailed results of the fourth quarter of 1992 (Graph 6). The series resulting from the dynamic model shows a seasonal profile oscillating around the series published in the quarterly accounts.

In Table 5 we have simulated at horizon one the two models retained in the two approaches, for different vintages of the annual accounts. As before, we compared the rates given by these equations with the value published after the consultation of the quarterly and annual accountants. The decision taken for the evolution of the last year seems hardly in line with either type of model.

Table 5: Simulation at horizon 1

year   static model    static model   dynamic model   dynamic model   provisional
       unbenchmarked   benchmarked    unbenchmarked   benchmarked     publication
1989   10.6             7.1            5.9             7.1             6.8
1990    2.4             0.6            0.9            -1.5             0.9
1991   -6.0            -4.4           -3.9            -3.5            -3.4
1992   -7.6            -2.9           -2.8            -2.3            -3.1
1993  -10.8            -8.4           -7.7            -7.9            -2.2
Graph 5: Foundry and metal working
Graph 6: Comparison with the quarterly accounts
ANNEX 1

Proposition 1:

The consequences of temporal aggregation for the stationarity properties of series have been the subject of numerous papers. Insofar as we are interested in variables aggregated over finite periods of time, the properties are preserved. For more details and bibliographical elements, we refer the reader to a recent paper by Pierse R.G. and A.J. Snell (1995).

Proposition 2:

Proposition 2 is a direct consequence of Proposition 1.
ANNEX 2

Proof of Proposition 3:

We use the notation introduced in paragraph 2.2. Direct computations from the first-order conditions allow us to write:

ŷ − y = (I − α̂ J)^{-1} [ ( α̂ e_t − Φ(α̂)' (Φ(α̂) Φ(α̂)')^{-1} v(α̂) ) ( ŷ_0 − y_0 )
      + ( I − Φ(α̂)' (Φ(α̂) Φ(α̂)')^{-1} Φ(α̂) ) ( x (β̂ − β) + (α̂ − α)(e_t y_0 + J y) − ε ) ]

which illustrates the absence of convergence of the estimators of the quarterly values, but also the fact that the bias on ŷ is a function of the biases on the parameters (α, β, y_0). Using the same orthogonality conditions, we obtain:

M ( ŷ_0 − y_0 , α̂ − α , β̂ − β )' = ( v(α̂)' (Φ(α̂) Φ(α̂)')^{-1} Φ(α̂) ε ,
                                      (J ŷ + e_t ŷ_0)' Φ(α̂)' (Φ(α̂) Φ(α̂)')^{-1} Φ(α̂) ε ,
                                      x' Φ(α̂)' (Φ(α̂) Φ(α̂)')^{-1} Φ(α̂) ε )'

where, writing (e_t ŷ_0 + J ŷ , x) for the matrix whose blocks are e_t ŷ_0 + J ŷ and x,

M = [ v(α̂)' (Φ(α̂) Φ(α̂)')^{-1} v(α̂)                          v(α̂)' (Φ(α̂) Φ(α̂)')^{-1} Φ(α̂) (e_t ŷ_0 + J ŷ , x)
      (e_t ŷ_0 + J ŷ , x)' Φ(α̂)' (Φ(α̂) Φ(α̂)')^{-1} v(α̂)     (e_t ŷ_0 + J ŷ , x)' Φ(α̂)' (Φ(α̂) Φ(α̂)')^{-1} Φ(α̂) (e_t ŷ_0 + J ŷ , x) ]

It thus appears that even in the absence of correlation at all dates between the indicators and the error terms, the second element of the right-hand vector, namely

(J ŷ + e_t ŷ_0)' Φ(α̂)' (Φ(α̂) Φ(α̂)')^{-1} Φ(α̂) ε

introduces a bias which propagates to all the terms. Indeed, this term is not always of zero expectation, owing to the action of the operator Φ(α̂), which mixes the observations by annualising weighted sums.
Proof of Proposition 4:

The proof is left to the reader; it suffices to verify that the proposed solution satisfies the equation.

Proof of Proposition 5:

The model is written in vector form as follows:

y_1 = α ( J_4 y_1 + (e_{A-1} ⊗ I_4) y_0 ) + x β + ε

where y_1 is the vector of the quarterly values for dates 5 to 4A and y_0 the vector of initial conditions. In what follows, Y_1 denotes the vector of the annual values corresponding to y_1. The Lagrangian of the system is:

L = ( y_1 − α ( J_4 y_1 + (e_{A-1} ⊗ I_4) y_0 ) − x β )' ( y_1 − α ( J_4 y_1 + (e_{A-1} ⊗ I_4) y_0 ) − x β ) + λ' ( ψ y_1 − Y_1 )

where ψ is the appropriate annualisation matrix, depending on whether one works with flow variables or with stock variables. The first-order conditions follow directly from the Lagrangian and give:

(1)  ( I − α̂ J_4 )' ( ŷ_1 − α̂ ( J_4 ŷ_1 + (e_{A-1} ⊗ I_4) y_0 ) − x β̂ ) + ψ' λ̂ = 0

(2)  x' ( ŷ_1 − α̂ ( J_4 ŷ_1 + (e_{A-1} ⊗ I_4) y_0 ) − x β̂ ) = 0

(3)  ψ ŷ_1 − Y_1 = 0

(4)  ( J_4 ŷ_1 + (e_{A-1} ⊗ I_4) y_0 )' ( ŷ_1 − α̂ ( J_4 ŷ_1 + (e_{A-1} ⊗ I_4) y_0 ) − x β̂ ) = 0

Substituting (1) into (2) and (4), we draw:

(5)  x' ( I − α̂ J_4 )'^{-1} ψ' λ̂ = 0

and

(6)  ( J_4 ŷ_1 + (e_{A-1} ⊗ I_4) y_0 )' ( I − α̂ J_4 )'^{-1} ψ' λ̂ = 0

and from (3) and (4) we obtain the expression of λ̂:

(7)  λ̂ = − (1 / e'e) ( I_{A-1} − α̂ J_{A-1} )' ( I_{A-1} − α̂ J_{A-1} ) [ Y_1 − ( I_{A-1} − α̂ J_{A-1} )^{-1} ( α̂ e_{A-1} Y_0 + X β̂ ) ]

where e equals (1 1 1 1) or (0 0 0 1) as the case may be. The result is deduced by noting that the following equalities hold:

ψ ( I − α̂ J_4 )^{-1} x = ( I_{A-1} − α̂ J_{A-1} )^{-1} X

and

ψ ( I − α̂ J_4 )^{-1} ( J_4 ŷ_1 + (e_{A-1} ⊗ I_4) y_0 ) = ( I_{A-1} − α̂ J_{A-1} )^{-1} ( J_{A-1} Y_1 + e_{A-1} Y_0 )

This result can be extended without any problem to models with longer dynamics.
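The first aggregation equality used in the proof of Proposition 5 can be checked numerically. The sketch below is ours: J_4 lags a quarterly vector by four quarters, J_{A-1} is the annual first Jordan matrix, ψ sums quarters into years, and X = ψx.

```python
import numpy as np

# Check:  psi (I - alpha*J4)^{-1} x  =  (I_{A-1} - alpha*J_{A-1})^{-1} X
rng = np.random.default_rng(5)
A = 7
n = 4 * (A - 1)                        # quarterly dates 5 .. 4A
alpha = 0.85
x = rng.normal(size=n)

J4 = np.eye(n, k=-4)                   # quarterly lag of one year
Jann = np.eye(A - 1, k=-1)             # first Jordan matrix: annual lag
psi = np.kron(np.eye(A - 1), np.ones(4))   # annualisation (sum of quarters)

lhs = psi @ np.linalg.solve(np.eye(n) - alpha * J4, x)
rhs = np.linalg.solve(np.eye(A - 1) - alpha * Jann, psi @ x)
print(np.allclose(lhs, rhs))           # prints True
```

The identity holds because ψ J_4^k = J_{A-1}^k ψ for every k, so the Neumann series of the two resolvents commute with the annualisation.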
The solution of this problem is simple; the Lagrangian is written in the following form:

L = y_0' y_0 + ŷ_1(y_0)' ŷ_1(y_0) − Ȳ² + λ ( e y_0 − Y_0 )

where Ȳ represents the mean over the whole set of observations of the reference quantities (annual or fourth-quarter) and e is the appropriate annualisation vector. If we denote by c(α) the quantity:

c(α) = 1 + α² e_{A-1}' ( I_{A-1} − α J_{A-1} )'^{-1} ( I_{A-1} − α J_{A-1} )^{-1} e_{A-1}

the first-order conditions derived from the Lagrangian read:

c(α̂) ŷ_0 + Ẑ + λ̂ e' = 0

where

Ẑ = α̂ ( e_{A-1}' ⊗ I_4 ) ( I − α̂ J_4 )'^{-1} { ( I − α̂ J_4 )^{-1} x β̂ + ( I_{A-1} − α̂ J_{A-1} )^{-1} ⊗ e' Ê / (e e') }

from which, using the accounting constraint on the year, we draw the expression of the quarterly values of the initial year:

ŷ_0 = ( e' Y_0 / (e e') ) − { ( I_4 − e' e / (e e') ) Ẑ / c(α̂) }

Solution of the problem of minimising the contribution to the variance:

We rewrite the expression of the solutions of the primary program given by Proposition 6 in the following form:

ŷ(y_0) = P y_0 + Q

The objective function, namely the contribution to the variance, is then written:

( y_0' T y_0 + S y_0 + y_0' S' + U )^{-1} ( y_0' T y_0 + S y_0 )

where

T = ( I_4  P' ) ( I_{4A} − e_{4A} e_{4A}' / 4A ) ( I_4 ; P )
S = ( 0  Q' ) ( I_{4A} − e_{4A} e_{4A}' / 4A ) ( I_4 ; P )
U = ( 0  Q' ) ( I_{4A} − e_{4A} e_{4A}' / 4A ) ( 0 ; Q )

The first-order conditions give the following equations:

2 ( S ŷ_0 + U ) T ŷ_0 + ( U − ŷ_0' T ŷ_0 ) S' − ( ŷ_0' T ŷ_0 + 2 S ŷ_0 + U )² e' λ_0 = 0
e ŷ_0 = Y_0

whence:

(1) If (S', e') is a family of independent vectors, we set ŷ_0 = ν T^{-1} S' + μ T^{-1} e' and, from the expressions of S ŷ_0, ŷ_0' T ŷ_0 and e ŷ_0, we draw the following equations:

ν² ss + ν μ se + (2ν + 1) U − μ Y_0 = 0
ν es + μ ee = Y_0

where ss = S T^{-1} S', se = S T^{-1} e', es = e T^{-1} S', ee = e T^{-1} e'. The solution of this system of equations consists of two pairs of solutions. The objective function being a ratio of quadratic forms whose dominant terms are equal and whose denominator is a positive definite form, it has two extrema: a minimum and a maximum. Computing the values of the function at the solutions allows the minimum to be chosen.

(2) If (S', e') is a family of dependent vectors, then:

ŷ_0 = ( Y_0 / (e T^{-1} e') ) T^{-1} e'

The second derivative computed at this point equals 2UT, which is positive definite (T is a variance). The solution therefore corresponds to a minimum.
References

Bournay J. and G. Laroque (1979), "Réflexions sur la méthode d'élaboration des comptes trimestriels", Annales de l'INSEE, 36, pp. 3-30.

Chow G.C. and A.L. Lin (1971), "Best Linear Unbiased Interpolation, Distribution and Extrapolation of Time Series by Related Series", The Review of Economics and Statistics, 53, pp. 372-375.

Di Fonzo T. (1986), "La stima indiretta di serie economiche trimestrali", thesis, University of Padua, December.

Di Fonzo T. and R. Filosa (1987), "Methods of estimation of quarterly national accounts series: a comparison", paper presented at the "Journées franco-italiennes de comptabilité nationale", Lausanne, 18-20 May 1987.

Engle R.F. and C.W.J. Granger (1987), "Co-Integration and Error Correction: Representation, Estimation and Testing", Econometrica, 55, pp. 45-62.

Gladishev E.G. (1961), "Periodically correlated random sequences", Soviet Math., 2, pp. 383-388.

Granger C.W.J. and A.A. Weiss (1983), "Time Series Analysis of Error Correcting Models", in Studies in Econometrics, Time Series and Multivariate Statistics, New York, Academic Press, pp. 225-278.

Gregoir S. (1994), "Multivariate Time Series with Various Unit Roots: Part I: Integral Operator Algebra and Representation Theorem", mimeo.

Hylleberg S., R.F. Engle, C.W.J. Granger and B.S. Yoo (1991), "Seasonal Integration and Cointegration", Journal of Econometrics, 44, pp. 215-238.

Karhunen K. (1947), "Über lineare Methoden in der Wahrscheinlichkeitsrechnung", Ann. Acad. Sci. Fenn., Ser. A I, Math., 37, pp. 3-79.

Kitamura Y. (1994), "Hypothesis testing in models with possibly non-stationary processes", working paper, Department of Economics, University of Minnesota.

Kitamura Y. and P.C.B. Phillips (1994), "Fully Modified IV, GIVE and GMM Estimation with Possibly non-Stationary Regressors and Instruments", Cowles Foundation discussion paper #1082.

Loève M. (1963), Probability Theory, Van Nostrand, N.Y.

Nelson C.R. and C.I. Plosser (1982), "Trends and Random Walks in Macro-economic Time Series: Some Evidence and Implications", Journal of Monetary Economics, 10, pp. 139-162.

Phillips P.C.B. (1993a), "Fully Modified Least Squares and Vector Autoregression", Cowles Foundation discussion paper #1047.

Phillips P.C.B. (1993b).

Pierse R.G. and A.J. Snell (1995), "Temporal Aggregation and the power of tests for a Unit Root", Journal of Econometrics, 65, pp. 333-345.
SECTION 3 SURVEY-BASED APPROACH I
Quarterly National Accounts of the Federal Statistical Office of Germany
Heinrich LÜTZEL
Statistisches Bundesamt
Since 1978 the Federal Statistical Office (FSO) has published quarterly main aggregates of
German national accounts. The time series start with 1968. The time-lag of the publication is
about nine weeks. These data are of greatest interest to specialists as well as to the general
public in Germany.
Quarterly data are mainly used for short-term economic analysis, the analysis of business
cycles, and as the basis for economic forecasts. The internal use of quarterly data is also
important especially as a basis for timely annual estimates. Compared with these purposes the
use of econometric models may be considered less important.
At the FSO the quarterly national accounts aggregates are prepared by the same persons who
also prepare annual estimates. In some cases, quarterly and annual methods are the same up to
the final estimates. In other cases different indicator methods for quarterly and first annual
estimates are used. In some cases only special rough estimates are possible. Preparing
quarterly and annual data “in one hand” is in many respects preferable. The original quarterly
estimates should be prepared seasonally unadjusted.
There is no “best” method for seasonal adjustment. No method can meet all requirements at
one time. It must be decided whether accounting rules (additivity over time and of aggregates)
or the quality of the adjustment should be given priority. Taking the main purposes of
seasonally adjusted quarterly national accounts into account, the FSO gives priority to the
quality and stability of seasonal adjustment.
A handbook on quarterly national accounts should discuss typical conceptual problems of
quarterly accounting and should give advice on how to handle them. Special problems
referring to the time of recording transactions like income, taxes or fixed capital formation
have to be dealt with. Other questions concern the treatment of typical seasonal changes in
prices and problems of estimating the natural growth of crops or of holding gains in stocks of
inventories.
1 Introduction

The FSO provides quarterly estimates for the production side of GDP (value added of 5 industries), the expenditure side (7 main aggregates and 14 additional aggregates), employment (5 industries), and unemployment. Analysts and the general public are interested above all in the expenditure side (at constant prices) and employment data. The annual growth rate of real GDP is often (mis)used as "the" indicator of the success of the government's economic policy.

In 1978, the first quarterly national accounts data for the Federal Republic of Germany (FRG) were published by the Federal Statistical Office (FSO). Before 1978 quarterly estimates were published by several German economic research institutes, for instance by the DIW in Berlin. At that time the FSO prepared quarterly estimates only for internal purposes, while half-yearly data were published.

In November 1994 the time series for the former federal territory cover the period from the first quarter of 1968 to the second quarter of 1994; data for unified Germany are available up to the fourth quarter of 1993. From 1995 onwards the quarterly accounts will be calculated only for unified Germany.

2 Main purposes of quarterly accounts

Three main purposes of quarterly data can be distinguished (in Germany):
• Economic analysis and policies
• Econometric models
• Internal uses

2.1 Economic analysis

The use of quarterly data for economic analysis and, more specifically, the analysis of business cycles is the most important purpose of quarterly national accounts in Germany. The data should meet the following requirements:
– timely estimates must be available;
– highly aggregated main indicators are needed;
– original and seasonally adjusted time series should be provided;
– reliable estimates with small revisions are important.

The FSO publishes the quarterly accounts data about nine weeks after the end of the latest quarter. At that time monthly basic statistics for all three months of the quarter are available. The DIW publishes first estimates about two weeks earlier.

2.2 Econometric models

Many model builders prefer quarterly accounts because the number of data points in the time series is bigger and thus the t-statistics of their estimators are better. It is less important to them that the accuracy of annual results is better than the accuracy of quarterly results. Model builders demand long time series and a complete system of quarterly national accounts. Compared with internal uses and with the use of quarterly data for short-term economic policy, the use for econometric models is of minor importance.

2.3 Internal uses

Quarterly national accounts are also very important for internal purposes because they make it possible to prepare very timely annual data using monthly and quarterly statistics. The FSO publishes first annual estimates just ten days after the end of the year. It is possible to produce these estimates only on the basis of quarterly accounts. However, no quarterly figures are presented at that time, because only one month of the fourth quarter has been covered by statistics. At the press conferences, journalists urgently demand information on the fourth quarter every year. This seems to be the most important information from national accounts.

Another important internal use of quarterly data concerns the preparation of short-term economic forecasts. The FSO does not publish any forecasts. But it is an important task to provide a picture of the economy at the moment when the government prepares its forecasts.
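The point made above about model builders, that quadrupling the number of observations improves t-statistics, can be made concrete with a back-of-the-envelope calculation; the sample sizes below are illustrative, not taken from the paper.

```python
import math

# With k times as many observations and unchanged error variance, the
# standard error of a regression coefficient shrinks by roughly 1/sqrt(k),
# so its t-statistic grows by roughly sqrt(k).
n_annual = 25      # e.g. 25 years of annual data (illustrative)
n_quarterly = 100  # the same span observed quarterly
t_gain = math.sqrt(n_quarterly / n_annual)
print(t_gain)  # 2.0
```

This is of course only the sampling-theory side of the argument; it ignores the point, also made above, that quarterly figures are individually less accurate than annual ones.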
The strong demand of economic analysts for timely quarterly data, the increasing interest of journalists in our press conferences, as well as the positive reaction of the general public, show that at least quarterly main aggregates of national accounts are extremely important. Timeliness of national accounts is most important. To achieve this goal, quarterly national accounts are necessary.

3 Preparation of quarterly national accounts

The way and the method used in quarterly accounts depend above all on the availability of timely monthly and quarterly statistics. There is no "best method" of calculating quarterly accounts data.

At the FSO the quarterly estimates are prepared by the same staff members who produce annual estimates. This is preferable because these persons
– know the data and methods of "their" aggregate or industry;
– know what is going on in "their" economic sector;
– prepare quarterly data which are always consistent with the annual data provided at the end of the year;
– use quarterly results for the first annual estimates.

In addition, some annual results are originally calculated on the basis of quarterly statistics. In these cases it is obvious that quarterly and annual calculations should be prepared by the same persons. The two tables of "Quarterly National Accounts in Germany" show in the column "Dominating method" which aggregates are originally calculated on a quarterly basis in Germany.

Other principles for quarterly estimates at the FSO are:
– They are prepared on an original basis and not seasonally adjusted.
– They always add up to annual data, by integrating the quarters into the annual data, if available.
– They always add up to the GDP. Adjustments are made in such a way that no residuals have to be published.
– Timeliness and accuracy are given priority over comprehensiveness of the system. The FSO does not present a full quarterly system of accounts.

In most cases quarterly data of the previous year are extrapolated with indices. For example, the value added at constant prices of the third quarter of 1994 is estimated by multiplying the value added at constant prices of the third quarter of 1993 by the growth rate of the production index for the same periods. But this is not done in a strictly mathematical way. This relation can be modified "manually", taking into account that in some industries, or in special phases of business cycles, the indicator normally shows a faster (or slower) change than the national accounts aggregate. When preparing national accounts, experience is often more important than highly sophisticated mathematical methods.

For some aggregates (industries) the statistical basis for the quarterly estimates is very poor. Here weak information (employment data from the Federal Institute for Employment), plausibility checks (change in stocks) or a quarterly breakdown of (estimated) annual data (consumption of fixed capital) are used, too. For transforming annual data into quarterly data without using any indicator, the following formulae are applied:

1. Q_t = (12 A_t + 5 A_t-1 - 1 A_t+1) : 64
2. Q_t = (20 A_t - 1 A_t-1 - 3 A_t+1) : 64
3. Q_t = (20 A_t - 3 A_t-1 - 1 A_t+1) : 64
4. Q_t = (12 A_t - 1 A_t-1 + 5 A_t+1) : 64

Q: quarterly results (the number denotes the quarter)
A: annual results
t: year of recording

These formulae have a good smoothing effect and make sure that the four quarters add up to the annual result. But at the end of the time series the annual data A_t and A_t+1 must be estimated.
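The two devices described above, extrapolation with an indicator and the fixed-weight quarterly breakdown with divisor 64, can be sketched as follows; all figures are hypothetical.

```python
def quarterly_breakdown(a_prev, a_cur, a_next):
    """Split the annual value a_cur into four quarters with the FSO's
    fixed weights (divisor 64); the neighbouring years give the slope."""
    return [
        (12 * a_cur + 5 * a_prev - 1 * a_next) / 64,  # Q1
        (20 * a_cur - 1 * a_prev - 3 * a_next) / 64,  # Q2
        (20 * a_cur - 3 * a_prev - 1 * a_next) / 64,  # Q3
        (12 * a_cur - 1 * a_prev + 5 * a_next) / 64,  # Q4
    ]

def extrapolate(value_year_ago, indicator_growth):
    """Extrapolate last year's quarterly figure with the growth factor of
    an indicator for the same quarters (before any manual adjustment)."""
    return value_year_ago * indicator_growth

quarters = quarterly_breakdown(400.0, 440.0, 480.0)
print(quarters)      # [106.25, 108.75, 111.25, 113.75]
print(sum(quarters)) # 440.0 -- the quarters add up to the annual result
```

The weights on the neighbouring years cancel out (5 - 1 - 3 - 1 = 0 and -1 - 3 - 1 + 5 = 0), while the weights on the current year sum to 64, which is why additivity over time holds exactly.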
4 Seasonal adjustment

In Germany two main methods of seasonal adjustment are used:
– "Berlin Method", fourth variant;
– "Census X-11 Method", with and without working day adjustments.

At the beginning of its seasonal adjustment exercises the FSO tried to find out which is the "best" adjustment method for quarterly national accounts. An ideal method should meet the following requirements:
– the seasonal component should be neither too stable nor too flexible over time;
– the trend should be smooth but true;
– the first estimates of the seasonal component should not be corrected excessively (stability of adjustment);
– the adjusted quarters should add up to the original result (additivity over time);
– the adjusted aggregates should add up to the adjusted sum (additivity of aggregates);
– adjusted nominal aggregates divided by adjusted deflators should equal adjusted real aggregates.

Hundreds of comparable calculations were made. The findings are clear: there is no "best" method. It is possible to meet only three of these requirements at the same time. Stability and quality of adjustment were given priority over the other requirements.

A search for the "best" method is more or less useless. The differences in results caused by employing different versions of the same method are often greater than the differences between the results of different methods. A very smooth trend component and a clear theoretical foundation are the advantages of the Berlin Method, which is based on the theory of spectral analysis.

The main purpose of seasonally adjusted time series is the analysis of economic trends and business cycles at the end of the time series. Thus, the quality of adjustment should be given priority over the accounting rules of the system (additivity over time and of aggregates). In addition, other adjustments should be made and published separately; working day adjustment and weather adjustment are typical examples.

The FSO publishes seasonally adjusted quarterly data in three steps:
• In the harmonised press releases of the FSO and the Ministry of Economic Affairs, a uniform growth rate is published for real GDP. A working day adjustment is included. The adjustment is made by the Deutsche Bundesbank using the original data of the FSO.
• In the subject-matter series on quarterly national accounts, seasonally adjusted data based on the Berlin Method are published. All series are adjusted with a uniform version of this method. No adjustments are made for extremes or working days.
• In the generally available working document "Materialien zur Zeitreihenzerlegung", the FSO adjusts the series with the Berlin Method and the Census X-11 Method in a special uniform version for quarterly accounts. All components are presented here, as well as growth rates versus the previous year (original data) and the previous quarter (adjusted data), also as an "annual rate".

This practice of publication meets the special needs of the different users:
• The general public may be confused if different results (according to the method used) are published.
• The "ordinary" economic analyst wants to obtain seasonally adjusted data to find out about the latest trend of the economy or the phase of the business cycle.
• The specialist wants to have all the details. He knows that different methods or different versions of the same method will result in different adjusted data. These differences will show him that in some quarters it is dangerous to rely on seasonally adjusted data.

5 General conceptual problems

In general, the same definitions and accounting rules should be used in quarterly national accounts as in the annual accounts of the SNA 1993 and the ESA 1995, respectively. But there are some special points which should be discussed and explained in detail in a handbook of quarterly national accounts. The problems concern the time at which transactions are recorded and special seasonal effects within a year. This paper raises some of these problems; other issues should be included and discussed.
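Two of the accounting requirements discussed in the seasonal adjustment section, additivity over time and additivity of aggregates, are easy to state as mechanical checks; the adjusted figures below are made up purely for illustration.

```python
# Raw and seasonally adjusted quarters of one year (hypothetical figures).
original = [100.0, 120.0, 110.0, 130.0]
adjusted = [113.0, 116.0, 114.0, 117.0]

# Additivity over time: the adjusted quarters add up to the raw annual total.
assert abs(sum(adjusted) - sum(original)) < 1e-9

# Additivity of aggregates: adjusted components add up to the adjusted total
# (a stylised three-way split of one adjusted quarter).
adjusted_components = {"consumption": 70.0, "investment": 25.0, "net_exports": 18.0}
adjusted_gdp = 113.0
assert abs(sum(adjusted_components.values()) - adjusted_gdp) < 1e-9
print("both additivity checks hold")
```

As the text notes, a method tuned for stability and quality of adjustment will generally fail such checks, which is why the FSO accepts the violation rather than forcing additivity.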
Problems:

» Special timing problems of recording income flows should be explained:
A. Wages and salaries often paid before (after) the month for which they are paid
B. Normal annual payments like salaries for a 13th month at the end of the year
C. Quarterly estimates of interest income
D. Time of recording dividends paid on shares

» Special timing problems of recording taxes:
A. Are advance payments of taxes (in Germany typical of VAT and income tax) an acceptable proxy for the accrual basis?
B. Recording the results of annual tax declarations in the following year
C. Recording the results of tax inspections some years later
D. Practical advice on how to adjust cash data on taxes to an accrual basis

» Special timing problems of recording fixed capital formation:
A. New buildings and other construction have to be recorded as fixed capital formation in the period when they are produced. The commodity-flow approach provides good estimates.
B. New machinery and equipment have to be recorded as fixed capital formation at the time when the change in ownership takes place. The commodity-flow estimates must be corrected for changes in stocks of such investment goods. What information can be used for such corrections?

» Should typical seasonal price movements over the year (mostly prices of agricultural products) be treated as changes in prices or in volume? According to the SNA 1993 and the ESA 1995 these are changes in volume (different products in different months). What is the consequence for quarterly accounts, especially for changes in stocks?

» The SNA 1993 and the ESA 1995 recommend that in agriculture the natural growth of crops should be included in output. How can this be estimated in quarterly accounts? Is there any experience in national accounts of other countries?

» For intermediate consumption, usually only annual information is available. Is it acceptable to assume that there is no seasonal influence on the intermediate consumption ratio by quarters, except, of course, for crop production in agriculture?

» Is there any experience in calculating quarterly changes in stocks? Which methods are applied in other countries for adjusting holding gains and losses?

These are some examples of typical problems which should be explained in the handbook on quarterly national accounts.
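One simple way to approach the cash-to-accrual adjustment of taxes raised above is to shift cash receipts back by an assumed collection lag; the one-quarter lag and all figures are illustrative assumptions, not a recommendation from the paper.

```python
# Quarterly cash tax receipts (hypothetical figures).
cash = [50.0, 52.0, 48.0, 60.0, 51.0]

# Assume taxes are collected one quarter after they accrue, so the cash
# figure of quarter t is attributed to quarter t-1 on an accrual basis.
LAG = 1
accrual = cash[LAG:]
print(accrual)  # [52.0, 48.0, 60.0, 51.0]
```

In practice the lag varies by tax type (the paper mentions that advance payments of VAT and income tax may already be close to accrual), so any fixed-lag shift is at best a rough proxy.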
Quarterly National Accounts in Germany (former FRG, t+70 days estimates)

1. Gross Value Added (GVA) at Market Prices
(m: monthly, q: quarterly; d.: deflation, i.: inflation)

Agriculture
– Dominating method: original (valuation of quantities)
– Sources, indicators: animals slaughtered (m), stocks of animals, crops harvested (m), sales prices (basic year)
– Breakdown: 34 types of commodities; i. with the price index of agricultural products

Forestry, fishing
– Dominating method: estimates
– Sources, indicators: felled timber (m), catches of fish (m), sales prices (basic year)
– Breakdown: 2 industries; i. with the price index of timber (m)

Manufacturing, mining, electricity
– Dominating method: indicators (extrapolation of GVA)
– Sources, indicators: production indices (m)
– Breakdown: 3 industries; i. with the producers' price index

Construction
– Dominating method: indicators (extrapolation of GVA)
– Sources, indicators: hours worked (m), estimates of change in productivity
– Breakdown: 2 industries; i. with the construction price index (q)

Trade
– Dominating method: indicators
– Sources, indicators: turnover (m)
– Breakdown: 2 industries; d. with retail and wholesale price index (m)

Transport
– Dominating method: indicators
– Sources, indicators: information from railways, post offices, road transport (m); balance of payments (m)
– Breakdown: 9 industries; partly i. with CPI (consumer price index) (m)

Other market services
– Dominating method: indicators
– Sources, indicators: information from banks, hotels, insurances (m); employment (m)
– Breakdown: 7 industries; i. with CPI (consumer price index) (m)

General government
– Dominating method: indicators
– Sources, indicators: finance statistics (q)
– Breakdown: 3 cost components, 4 levels of government; i. with compensation index
Quarterly National Accounts in Germany (former FRG, t+70 days estimates)

2. Expenditure of GDP
(m: monthly, q: quarterly; d.: deflation, i.: inflation)

Household final consumption
– Dominating method: original, indicators (supplier approach)
– Sources, indicators: statistics on turnover of trade, restaurants, energy (m)
– Breakdown: 120 branches of suppliers, 300 purposes; d. with CPI (consumer price index) (m), 300 commodities by purpose

Final consumption of general government
– Dominating method: indicators
– Sources, indicators: finance statistics (q)
– Breakdown: 6 cost components, 1 sale; d. with indices for cost components

Fixed capital formation in machinery and equipment
– Dominating method: original (commodity flow method)
– Sources, indicators: production statistics (q), foreign trade statistics (m)
– Breakdown: 1 600 commodities; d. with producers' price index (m) and foreign trade price index (m), 180 commodities

Fixed capital formation in building and construction
– Dominating method: indicators (commodity flow method)
– Sources, indicators: hours worked (m) plus estimated change in productivity
– Breakdown: 8 types of construction; i. with price index of buildings (q), 8 types of building

Change in stocks
– Dominating method: estimates
– Sources, indicators: difference between production and expenditure side of GDP

Exports and imports of goods and services
– Dominating method: original
– Sources, indicators: foreign trade statistics (m), balance of payments statistics (m)
– Breakdown: 1 800 commodities; d. with foreign trade price index by 1 800 commodities
Quarterly National Accounts in Germany (former FRG, t+70 days estimates)

3. Distribution of income
(m: monthly, q: quarterly; d.: deflation, i.: inflation)

Consumption of fixed capital
– Dominating method: estimates
– Sources, indicators: annual perpetual inventory estimates and formula (1)
– Breakdown: 2 commodities; i. with deflators of gross fixed capital formation

Taxes/subsidies
– Dominating method: original/indicators
– Sources, indicators: tax receipts (m)
– Breakdown: 3 items

Compensation of employees
– Dominating method: indicators
– Sources, indicators: statistics on employment (m), wage scale index
– Breakdown: 23 industries

Social contributions
– Dominating method: original
– Sources, indicators: information from social insurance institutions (m)
– Breakdown: 11 social insurance schemes

4. Employment

Employment
– Dominating method: indicators
– Sources, indicators: statistics on employment (m), estimates for self-employed persons
– Breakdown: 31 industries

Unemployed persons
– Dominating method: original
– Sources, indicators: information from the Federal Institute for Employment (m)

(1) Q1_t = (12 D_t + 5 D_t-1 - 1 D_t+1) : 64; Q2_t = (20 D_t - 1 D_t-1 - 3 D_t+1) : 64; Q3_t = (20 D_t - 3 D_t-1 - 1 D_t+1) : 64; Q4_t = (12 D_t - 1 D_t-1 + 5 D_t+1) : 64.
Q: quarterly results; D: annual data; t: year of recording.
Special aspects of the Quarterly GDP Accounts in East Germany in the
initial years after German Unification
Ralf HAIN
Statistisches Bundesamt
1 Introduction

With the unification of Germany on 3 October 1990 the statistical system of the Federal Republic of Germany was adopted in the new Laender. This meant the adoption of the SNA calculations for the national accounts at short notice. Owing to the differences in economic level and the varying database and data quality in the new Laender as compared with the former federal territory, separate accounts for West and East Germany have been compiled for some years. From the very beginning efforts have been made to adapt the methods, the depth of calculation and the periodicity as completely as possible to those applied for the former federal territory. From 1995 the transition to an all-German presentation is envisaged; only a few variables will then still be calculated separately for West and East Germany.

The annual accounts of the German national accounts are, for the most part, based on statistical surveys carried out once a year, or at intervals of a few years, and only for full calendar years. The database for quarterly accounts is essentially smaller; for the most part it consists of the results of short-term (monthly and quarterly) economic statistics. These do not cover the whole economy or all components of the variables to be measured, and in general they correspond less closely to the definitions and delimitations of the national accounts than the annual statistics do. Owing to the higher uncertainties thus arising, the publication of quarterly results of the national accounts is restricted to a few variables.

The present article is aimed at outlining the methods of the quarterly calculation of the gross domestic product (GDP) for East Germany and describing some special aspects that arise from the economic situation.

2 Sequence and Basis of Quarterly Accounts

The calculations are carried out separately for the production and the expenditure side of the GDP. The co-ordination of the overall aggregates is made on the basis of qualitative criteria.

For the first time, the results achieved in the second half-year of 1990 in East Germany were published in April 1991. The currency conversion effected on 1 July 1990 did not allow an account to be provided for the whole year. Current annual and semi-annual results have, in principle, been calculated or revised simultaneously with the West German results from September 1991 on. From 1992 quarterly results have been published once a year, in September, for the preceding year. The uncertainties of the database have not yet allowed current quarterly results to be presented at the same time as those for the former federal territory, i.e. approximately two months after the end of the quarter, although from the very beginning the methods of calculation have been designed for quarters, analogously to those applied for West Germany. The determination of seasonally adjusted series for East Germany was not possible either.

Problems for the national accounts in East Germany resulted, in particular, from the fundamental changes in the economy that started directly with the adoption of the economic and currency union. Carrying out surveys according to the Federal
Statistics Act meant a complete radical change in the statistical database. To avoid data gaps or an unjustifiable delay in the supply of important results, provisional regulations for carrying out various surveys on the territory of the new Laender were adopted with the unification treaty and in a special statistics adaptation ordinance.

3 Survey of the Methods of Production and Expenditure Calculation

As regards method, the quarterly calculations of the expenditure side of the GDP do not differ essentially from the annual account, as the biggest part of the statistical data is available quarterly.

The quarterly calculation of the production side of the GDP is, in principle, based on the allocation of the final annual results or, in the case of current calculations, on updating the results with the aid of the current economic indicators available for quarters or months. As a rule, the same indicators are applied in allocating the final annual results to quarters and in the current calculations. Statistical data on changes in stocks, self-constructed equipment and, in particular, the cost structure, which in general are recorded only for whole years, are lacking.

The gross value added of the manufacturing industry is determined for the new Laender from the turnover trend of the enterprises with 20 and more employees and of the handicraft enterprises. For the former federal territory the net production index is applied to update the real gross value added.

As for West Germany, the calculation of the gross value added of the building industry is effected on the basis of the data on hours worked and, additionally, for preliminary calculations, on the basis of estimates of the development of labour productivity and an estimation factor for newly established companies. As to the wholesale trade, the gross value added is distributed over the quarters with the aid of the turnover data.

It was not possible to determine the gross value added of the retail trade on the basis of the monthly business statistics, which, through the lack of a current trade census, were based on an outdated register from 1990. As a result, the turnover volume indicated was much too low. Therefore, it was necessary to make rough estimates proceeding from private consumption in the first years after unification.

For other service enterprises only very few statistical data were available for individual branches, e.g. transport, credit and insurance companies and health care. It is only possible to make a rough quarterly estimate of the gross value added of most of the service branches through the development of the employment figures. In 1991 and 1992 quarterly service statistics were additionally compiled (see item 3).

Until 1993 private consumption in the new Laender was mainly determined on the basis of the household surveys carried out quarterly. After evaluation of the 1993 trade census, a complete account from the supply side will be possible, as is common practice for the old Laender.

Investments in fixed assets are calculated from the production, import and export data according to the commodity flow method, as for the former federal territory. The quarterly export and import account, which is mainly based on the monthly data of the foreign trade statistics and on service transactions data, also corresponds to that for West Germany.

4 Special Problems

4.1 Effects of the Economic Situation on the Database and Methods of Calculation

Immediately after the currency union was established, production in East Germany declined strongly. A large number of enterprises were closed down. Many others worked at a loss. Altogether, an enormous structural break took place in the economy.

The net production index was not applied for quarterly accounts in the manufacturing industry. The structures of economic sectors and costs existing in the second half-year of 1990, which were applied to determine the weighting scheme, were already outdated in 1991. That is why the
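The commodity flow method used above for fixed investment can be sketched with hypothetical figures; the stock-change correction is the one discussed for machinery and equipment in the Lützel paper.

```python
# Domestic supply of investment goods by the commodity flow method
# (hypothetical figures, one commodity group, one quarter).
production = 800.0   # domestic production of machinery
imports = 300.0
exports = 250.0
stock_change = 20.0  # increase in dealers' stocks of such goods

# Fixed capital formation = supply available for domestic use,
# corrected for goods that went into stocks instead of investment.
fixed_capital_formation = production + imports - exports - stock_change
print(fixed_capital_formation)  # 830.0
```

In the actual accounts this balance is struck commodity by commodity (1 600 commodity groups in the West German table above) rather than in one aggregate sum as here.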
calculations for the new Laender are made from the determined turnover and supplementary estimates for the lacking variables.

Problems have also arisen in evaluating the surveys carried out in the manufacturing industry. For the turnover of companies with 20 and more employees, the monthly and yearly surveys and the cost structure statistics produced very different results. That is why an extrapolation was made on the basis of all companies included in the surveys. Thus, an additional level correction of more than 10 per cent had to be made to the gross value added of the manufacturing industry for 1991.

Due to the strong structural changes it was not possible to make an estimate in one sum analogously to the West German account for the manufacturing industry. That is why until 1994 the calculations have been made subdivided into 35 sectors of the manufacturing industry. The data of the handicraft enterprises, whose turnover rose much faster than that of the remaining manufacturing industry during the last few years, were considered separately.

The calculation in constant prices was also effected in a more detailed subdivision than has been common practice for West Germany, mainly to exclude the effects of the outdated weighting scheme of the producers' price index of 1989 applied in the first years. Owing to the strong quarterly variations of the production value level, when going over to the 1991 price basis the price index was calculated as a weighted average.

The suitability of indicators for updating also had to be examined thoroughly. Thus, the number of gainfully employed persons was not applied as an indicator for the development of the gross value added of service enterprises. Owing to the strong backlog of demand, a very strong increase in production, partly with saturation setting in slowly afterwards, was observed in individual sectors. In such cases it was scarcely possible to assess the development of productivity.

Also, the assumption of constant input ratios for all quarters was not appropriate, as they varied by more than ten percentage points in individual sectors from year to year. That is why "curves" were estimated for the quarterly input ratios on the basis of the available yearly information.

Many methods of calculation involved uncertainties based on the updating of the data of a base year or of the same period of the preceding year. This refers notably to sectors with random sampling (e.g. the building industry, handicraft and retail trade). For instance, for handicraft enterprises of the manufacturing industry a corrective element was estimated for individual quarters from the number of newly established companies, with the aid of which the turnover was estimated.

4.2 Additional Surveys in the Transitional Period

Service statistics and household budget surveys were of special importance among the surveys carried out additionally. The surveys of cost structures carried out in all sectors supplied very important information on the input ratios, yet only as annual data.

In 1991 and 1992 the calculations of the gross value added in the service sectors were based on the results of the random sample surveys carried out quarterly in this sector. As the sampling plan and the extrapolation range of the survey were primarily based on the survey of gainfully employed persons carried out in 1990, many newly established enterprises or subsidiaries of West German companies were covered only incompletely. Thus, strong corrections of the sample results were required. Owing to additional information on individual sectors, the gross value added of other service companies was, in principle, doubled in preliminary estimates. The first presentation of the results of the 1992 turnover tax statistics showed, however, that a further remarkable increase in the gross value added, by 50 per cent, was necessary. Given the conditions of fast economic change, better results could only have been achieved if the survey had been based on a constantly updated register.

The results of the quarterly budget survey carried out for 3 500 households, in connection with the population structure obtained through the micro census, proved to be useful in estimating private consumption.
5 Conclusions
The methods of calculation applied for quarterly
national accounts depend essentially on the available
database, but also on the special features of economic
development. Given the radical change taking place in
East Germany, it was necessary to examine the
methods of calculation especially thoroughly.
Statistical results that would serve as basic data or
indicators in the event of a "normal" development
proved to be partly useless or incomplete.
The determination of the absolute level and the
development of gross value added in the service
branches involved great difficulties. The sample
surveys carried out in retail trade, transport and other
service branches supplied results which, being based
on outdated registers, were only conditionally useful.
It would have been necessary to keep the registers up
to date and to adapt the samples accordingly.
Owing to the strong structural changes going on in the
economy, accounting had to be performed with a more
detailed subdivision, and additional estimates of
changes in the operating sector had to be made. As the
necessary information was frequently not available, it
had to be supplemented to a large degree by estimates.
SURVEY-BASED APPROACH II
An outline of the Swedish Quarterly National Accounts
Helena RUNNQUIST
Eddie KARLSSON
Statistics Sweden
1 Introduction
This outline of the Swedish quarterly National
Accounts (NA) deals with the scope of the accounts
and the methods used in balancing and analysing data
at an aggregated level. It is not a documentation of
data sources. Swedish quarterly NA have been
compiled since 1972. The time series run from 1970,
but the present series are consistent from 1980.
Foreign trade is divided into goods and services in
accordance with the HS classification system.
Changes in stocks are calculated for a wide range of
branches in mining and manufacturing. There are also
stock data for retail and wholesale trade, agriculture
and forestry (concerning classification levels for
calculations, see appendix II).
2 Calculation
Calculations are made from both the production and
the expenditure side. The system has been extended in
that estimates are now also made at current prices (at
the start, only at constant prices). The variables are in
principle the same as from the start: GDP by
expenditure and by kind of activity, disposable income
of households, and the number of hours worked and
persons employed (for more details on labour
statistics, see appendix I).
For expenditures like private final consumption
expenditure, changes in stocks and foreign trade in
goods, the sources are quarterly and are used for the
annual estimates as well. The other parts of the
expenditure estimates are covered quarterly by
surveys measuring data at a more aggregated level
than the annual ones, but for an essential part of the
expenditures we have information on the kind of use.
For the production estimates, information on
intermediate consumption is mostly not available.
Measures of production (volume or value) are
therefore used as indicators for value added, under the
assumptions that input coefficients in volume terms
are unchanged compared with the previous yearly
estimate and that the coefficients are constant during
all four quarters.
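The fixed-input-coefficient extrapolation described above can be sketched as follows. This is a stylised illustration, not Statistics Sweden's implementation; the series and the annual figures are invented.

```python
def quarterly_value_added(production_indicator, annual_output, annual_inputs):
    """Quarterly value added extrapolated from a production indicator,
    assuming the input coefficient of the previous yearly estimate is
    unchanged and constant during all four quarters."""
    input_coefficient = annual_inputs / annual_output
    return [q * (1.0 - input_coefficient) for q in production_indicator]

# previous annual estimate: output 400, intermediate inputs 160
va = quarterly_value_added([95.0, 100.0, 102.0, 103.0], 400.0, 160.0)
```

Each quarter's value added simply follows the production measure, scaled by the value-added share implied by the latest annual accounts.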
The number of activities, expenditures and other
details is much more comprehensive now than in the
first calculations. Value added is computed for 58
activities of industries, producers of government
services, private non-profit institutions serving
households, and domestic services. For expenditures
there is a great amount of detail: in household
consumption, for example, we have more than 100
items. Government final consumption expenditures
are calculated for central and local government and by
type of cost. Figures on gross fixed capital formation
are obtained for different types of capital goods, by
sector, and for nine groups of activities of industries.
In the process of calculating the quarterly NA the
same staff is involved as in the annual accounts. One
of the advantages of this, we think, is the thorough
knowledge of their special fields that each person can
develop and make use of in the quarterly calculations.
We want to keep a very close relation between the
annual and the quarterly accounts, and the quarterly
accounts are always adjusted to the yearly levels. In
general this is a satisfying solution. This is however
not always the case with the estimates of value added
at constant prices: the result is that the time series
show discontinuities at the turn of the year. To avoid
this unwanted effect Statistics Sweden has developed
a method called MIND4. This least-squares method
minimises D4 (see below), combined with the
conditions that the yearly total is unchanged and that
the first adjusted quarter is linked to the last quarter of
the previously adjusted period.
D4 = ∑ (Yi / Xi − Yi−1 / Xi−1)²
where Xi = the original quarterly series and Yi = the
adjusted quarterly series. This method of adjusting the
quarterly series to a yearly total is used once a year, in
autumn, when the annual accounts are completed. It is
applied to a period of two years at a time, so each year
is adjusted twice. Practice shows, however, that the
problem of identifying the « best » quarterly pattern is
the most important one to overcome in calculating
quarterly time series.
The key between activities and commodities is taken
from the annual system. Quarterly, only production
estimates for activities are available. As mentioned
before, input coefficients are also unchanged
compared with the annual system, and consequently
inputs are spread out over the various commodities in
consistency with the annual key. Exports and imports
are easily distributed over goods and, by using some
common sense, also over services. Import duties are
calculated by applying a percentage figure to imports.
Government sales and intermediate consumption of
governments are divided over commodities in the
same way as in the previous or present yearly
calculation.
For gross capital formation there is fairly reliable
information on type and kind of activity for the nine
one-digit activity groups. Stock figures must be
divided over goods by using keys, both yearly and
quarterly. This information is, as in most countries, of
poor quality, and consequently the variable is of much
use in the input-output handling. Private consumption
expenditures are calculated by purpose and are
therefore easily distributed.
Trade margins are calculated in the system as an
outcome of the balancing process. For different kinds
of use and kinds of commodity there are constant
trade margin rates. For example, take the commodity
cars and the uses private consumption, gross fixed
capital formation and exports: by applying the fixed
trade margins, which vary among the different kinds
of use, it is possible to get a total margin value for the
commodity as well as for the total economy.
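A minimal sketch of this fixed-rate margin calculation, with the car example from the text; the rates and flow values are invented for illustration.

```python
# Illustrative fixed margin rates per (commodity, kind of use).
MARGIN_RATES = {
    ("cars", "private consumption"): 0.18,
    ("cars", "gross fixed capital formation"): 0.12,
    ("cars", "exports"): 0.05,
}

def trade_margins(flows):
    """Apply the fixed trade-margin rates to flows valued before margins;
    return the margin per (commodity, use) and the overall total."""
    margins = {key: value * MARGIN_RATES[key] for key, value in flows.items()}
    return margins, sum(margins.values())

flows = {("cars", "private consumption"): 1000.0,
         ("cars", "gross fixed capital formation"): 400.0,
         ("cars", "exports"): 600.0}
per_use, total = trade_margins(flows)
```

Summing the margins over uses gives the commodity total; summing over commodities would give the economy-wide margin.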
3 Reconciliation and analysis of results
Some years ago we adopted an input-output system
functioning on a half-year basis. In recent years it has
been transformed to operate quarterly as well.
Balances are tuned quarterly at constant prices and, on
a half-year basis, at both current and constant prices.
(The comments below refer to the constant price
calculations.) Supply and demand are divided into 58
commodity groups (see appendix II).
On the supply side this means gross output of
industries, government sales, sales of non-profit
institutions (NPI) serving households, imports and
import duties, trade margins, and indirect taxes and
subsidies.
On the demand side there are intermediate
consumption, government intermediate consumption,
private consumption expenditure, gross fixed capital
formation, changes in stocks and exports.
Indirect taxes and subsidies are calculated in the same
way as trade margins, as shares for different
commodities and kinds of use.
Residuals will occur, in varying degrees, for the 58
commodity groups. A problem in handling the
residuals in the quarterly estimates is the appearance
of seasonal patterns in the divergences. The IO system
was introduced only a few years ago, and only then
were these patterns disclosed. The aim of the analysis
is of course to minimise residuals at the commodity
level and thereby also at the total level.
The information available quarterly is limited and
cannot offer the full range of information required for
making input-output calculations for the previous or
present year.
Current price estimates are produced on a half-year
basis.
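One step in producing the current price estimates, reflating constant-price intermediate consumption with the implicit deflator of total supply, can be sketched as follows; the figures are invented and the function is only a stylised reading of the procedure.

```python
def reflate_intermediate_consumption(supply_current, supply_constant,
                                     ic_constant):
    """Current-price intermediate consumption for one commodity, obtained
    by reflating the constant-price figure with the implicit price index
    of total supply (current-price supply over constant-price supply)."""
    implicit_price_index = supply_current / supply_constant
    return ic_constant * implicit_price_index

# invented commodity: supply 530 at current prices, 500 at constant prices
ic_current = reflate_intermediate_consumption(530.0, 500.0, 200.0)
```

Subtracting the reflated intermediate consumption from current-price output then yields value added at current prices, as described below for the activity level.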
On the demand side, current price figures are directly
calculated and distributed. On the supply side, output
price indices are used for deriving commodity figures.
Total supply in current and constant prices yields an
implicit price index, which is used for reflating
intermediate consumption at the commodity level.
The outcomes are figures in current prices for both
intermediate consumption and value added at the
activity level.
At the yearly level GDP is determined from the
expenditures; the positive or negative residual is
hence on the production side. For a single quarter the
corrections are a bit more complicated. For the first
and third quarters GDP is determined by using the
average growth rate of expenditures and production,
hereby obtaining a GDP level. If there is a complete
half year, GDP for both quarters is determined from
the production side by taking the weights of each
quarter within its half year and applying them to the
half-year GDP. Consequently, on a quarterly basis
there are discrepancies on both sides.
The principles mentioned in table 1 are valid for both
current and constant price estimates. In practice the
method is a bit different in current prices, as the
production estimates are available only on a half-year
basis. The solution for getting current price estimates
is to construct a relevant price index for reflation of
the correction item on the expenditure side. GDP at
current prices is then the sum of the expenditures and
the correction item.
Table 1: Rules on quarterly tuning
Yearly GDP = Expenditures = Production +(-) Discrepancy
Half-yearly GDP = Expenditures = Production +(-) Discrepancy
Quarterly GDP = Expenditures +(-) Correction +(-) Discrepancy
To facilitate analysis of the quarterly time series, we
calculate series adjusted for calendar effects and for
seasonal variations.
The adjustment for calendar effects is done for
production data at constant prices and for the number
of hours worked only. For each activity a quota is
calculated. It is based on the actual number of
working days in the present quarter compared with a
"normal" quarter, an average of the latest twelve
years. The number of working days varies between
different working schedules. We use four different
kinds: five days a week; six days a week; shifts of 29
days a month; and 30 days a month. The quota is also
weighted with information on what kinds of schedules
are used in each activity; several kinds can apply to
one activity. This information is obtained only by
telephoning branch organisations etc. and is not
systematised, which is probably a weak link in the
calculations. By tradition, no yearly totals for the time
series adjusted for the number of working days are
published or presented, since they differ from the
original ones.
Seasonally adjusted series are calculated for a period
of the latest twelve years (at present, for the years
1983 to 1994). We recalculate the adjustments for the
whole period when we introduce a new quarter. Since
we have found that seasonal patterns can be hard to
distinguish in many of the time series, owing to
relatively big irregular components, we adjust the
series at a rather aggregated level (for levels of
adjustment, see appendix III). This may be a problem
specific to relatively small economies like Sweden,
compared with the United States for example.
Seasonal adjustments are calculated for values added
on series adjusted for calendar effects, and for the
expenditure series of GDP. We calculate
readjustments for extreme irregular values that we
know of, for example the big conflict on the labour
market in 1980. The problem is of course to estimate
the impact of an event like that on values added.
These readjustments are made because we do not
want the seasonal adjustment factors to be influenced
by so-called extreme values; the adjusted series still
contains the irregularities. Estimation of the factors
for seasonal adjustment is made with a computer
programme developed by Statistics Canada,
X-11 ARIMA/88. We use a function for correcting
extreme values (other than the ones mentioned above)
and the ARIMA
forecasting model in the programme. Each data series
is adjusted separately, which means that no form of
additivity exists in the figures. Both additive and
multiplicative methods are used, depending on what is
most appropriate for each series. The lack of
additivity does not seem to have been a problem for
the users so far. The yearly totals are not affected by
the seasonal adjustments.
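The calendar-effect quota described above (actual working days relative to a "normal" quarter, weighted by the schedule mix of each activity) might be sketched like this; all figures are invented, and the schedule names are illustrative only.

```python
def working_day_quota(actual_days, normal_days, schedule_weights):
    """Calendar correction factor for one activity: actual working days
    in the present quarter relative to a 'normal' quarter (a twelve-year
    average), weighted over the working schedules used in the activity."""
    return sum(weight * actual_days[s] / normal_days[s]
               for s, weight in schedule_weights.items())

actual = {"five-day week": 62, "six-day week": 75}     # present quarter
normal = {"five-day week": 61.0, "six-day week": 74.2} # twelve-year averages
weights = {"five-day week": 0.8, "six-day week": 0.2}  # schedule mix

quota = working_day_quota(actual, normal, weights)
calendar_adjusted = 100.0 / quota   # output index with the effect removed
```

A quarter with more working days than normal gets a quota above one, so dividing by it removes the extra calendar-driven output.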
4 Revisions and publishing
GDP figures are presented in current and constant
prices (1991 prices from this autumn) from the first
quarter of 1980. Expenditures are presented quarterly
in both current and constant prices, while production
data are shown quarterly in constant prices and, on a
half-year basis, in current prices. GDP, expenditures
and value added are also presented in a seasonally
adjusted version. GDP and value added are also
available in a version corrected for the number of
working days as well as for seasonal patterns.
In the tables both levels and changes in volume are
given. In the seasonally adjusted series the volume
changes are given with regard to the previous quarter,
without any upgrading to yearly levels or changes.
There are some fixed rules for revisions in the
quarterly series. Data are published about 2.5 months
after the quarter in question. The estimating process
starts approximately one month earlier for the early
estimates, such as employment and household
expenditures. Recently a project has been started
aiming at quicker estimates.
In June the figures for the first quarter are published.
There are no revisions for preceding years. In
September the second quarter is completed and the
first quarter revised. In October/November the annual
accounts for the previous two years are completed and
the corresponding quarterly estimates are adjusted to
the new levels. When the third quarter is published in
the middle of December, the revised series are also
published. In March the four quarters of the preceding
year are published and the first three quarters are
revised. An exception from this scheme are the
quarterly data on disposable income of households,
which are compiled and published only twice a year,
in September and March.
The main results are always presented in a press
release. A short version of the press release is also
entered in a database (Key Economic Indicators)
containing a wide range of economic statistics.
Clients are mostly banks, stockbrokers and companies
in the financial market. There is also a more
comprehensive database for more general use
(TSDB). The division also compiles and publishes a
comprehensive quarterly publication, and Statistics
Sweden issues two more (monthly) publications
containing all kinds of economic statistics, including
NA.
APPENDIX I
Labour statistics used in the National Accounts
The main source for estimating the number of persons
employed and the number of hours worked, quarterly
as well as yearly, is the Labour Force Survey (LFS).
The LFS is a monthly inquiry of the total labour force
with a representative sample of 18 000 persons. The
inquiry gives both the number of persons employed
and the number of hours worked for all of the
activities and sectors in the Swedish system of labour
accounting. The LFS also determines the total level of
employment.
For some activities where more reliable sources are at
hand, the LFS is replaced with other available
statistics. This concerns employment within
manufacturing and local governments. The source for
manufacturing is two separate monthly inquiries
which measure, for example, absence and salaries for
employees, but also the number of persons employed
and hours worked. Information for local governments
is received from monthly statistics produced by their
associated organisations. This information is the only
one used in labour accounting which is not produced
by Statistics Sweden.
APPENDIX II
Classification of household goods and services
Function
1000 Food, beverages and tobacco
1100 Food
1110 Bread and cereals
1120 Meat
1130 Fish
1140 Milk, cheese and eggs
1150 Oils and fats
1160 Fruits and vegetables other than potatoes and similar tubers
1170 Potatoes, manioc and other tubers
1180 Sugar
1191 Coffee, tea, cocoa
1192 Confectionery, etc.
1200 Non-alcoholic beverages
1300 Alcoholic beverages
1310 Spirits, wine and export beer
1311 Spirits
1312 Wine
1313 Export beer
1320 Beer n.e.c.
1400 Tobacco
2000 Clothing and footwear
2100 Clothing including repairs
2110 Clothing, fabrics and yarn
2120 Repairs to clothing
2200 Footwear including repairs
2210 Footwear
2220 Repairs to footwear
3000 Gross rent, fuel and power
3100 Gross rent and water charges
3110 Gross rent
3120 Imputed rents of owner occupied dwellings
3130 Imputed rents of secondary dwellings
3140 Tenants' repair costs
3200 Fuel and power
3210 Electricity
3220 Gas
3230 Liquid fuels
3240 Other fuels
3250 Purchased heat
4000 Furniture, furnishings, household equipment and operation
4100 Furniture, carpets and repairs
4110 Furniture, carpets and lamps
4120 Repairs to furniture
4200 Household textiles and other furnishings incl. repairs
4300 Heating and cooking appliances, refrigerators, washing machines and similar major household appliances including fittings and repairs
4310 Heating and cooking appliances, refrigerators, washing machines and similar major household appliances including fittings
4320 Repairs to major household appliances
4400 Glassware, tableware and household utensils
4500 Household operation except domestic services
4510 Non-durable household goods
4520 Household services excluding domestic service
4600 Domestic services
4610 Private welfare services, children
4620 Local government services, children
4630 Welfare services, elderly
5000 Medical care and health expenses
5100 Medical and pharmaceutical products
5110 Medicines
5120 Other products
5200 Therapeutic appliances and equipment
5300 Fees paid to physicians, dentists and related practitioners
6000 Transport and communication
6100 Personal transport equipment
6110 Purchases of motor cars
6120 Other transport equipment
6121 Motorcycles and bicycles
6122 Caravans
6200 Operation of personal transport equipment
6210 Repair charges, parts and accessories
6220 Gasoline, oils and greases
6230 Other expenditures on cars
6231 Compulsory tests of cars
6232 Driving lessons
6233 Garaging
6234 Parking
6235 Car leasing
6300 Purchased transport
6310 Railways
6320 Bus and local traffic
6340 Cabs
6350 Ships
6360 Airlines
6370 Services of travel agencies and air charter
6380 Removal
6400 Communication
6411 Postal services
6412 Telephone services
7000 Recreation, entertainment, education and cultural services
7100 Equipment and accessories, including repairs
7110 Radio and television
7120 Photographic equipment, musical instruments, boats and other major durables
7121 Photographic equipment
7122 Boats
7130 Other recreational goods
7140 Repairs to recreational goods, etc.
7141 Parts and accessories for and repairs to recreational goods
7142 Port services
7200 Entertainment, recreational and cultural services excluding hotels, restaurants and cafés
7211 Entertainment and photo services
7212 Television licences
7213 Gambling, lotteries etc.
7214 Veterinary services
7300 Books, newspapers and magazines
7310 Books
7320 Newspapers and magazines
7330 Other printed matter
7400 Education
7410 Music school
7420 Education
8000 Miscellaneous goods and services
8100 Services of barber and beauty shops etc.
8110 Services of barber and beauty shops etc.
8120 Goods for personal care
8200 Goods n.e.c.
8210 Jewellery, watches, rings and precious stones
8220 Other personal goods
8230 Writing and drawing equipment and supplies
8300 Expenditure in restaurants, cafés and hotels
8310 Expenditure in restaurants and cafés
8320 Expenditure for hotels and similar lodging services
8500 Financial services
8600 Services n.e.c.
8611 Undertaking
8612 Other services
Sum
Direct purchases abroad by resident households
Total private final consumption expenditure by households
Private non-profit organisations
Total private final consumption expenditure

By durability
1100 Goods
1110 Durable goods
1111 Of which: purchases of motor cars
1112 Other durable goods
1120 Semi-durable goods
1130 Non-durable goods
1131 Of which: food and beverages
1132 Other non-durable goods
1200 Services
1210 Of which: gross rent
1220 Other services
Classification of industries by kind of economic activity

SNR-REV (1) | Description | SNR (2) | SNI (3)
1000 | Agriculture, forestry, fishing | 1000 | 1
1100 | Agriculture | 1100 | 11
1200 | Forestry and logging | 1200 | 12
1300 | Fishing | 1300 | 13
2000 | Mining and quarrying | 2000 | 2
2100 | Iron ore mining | 2100 | 2301
2200 | Non-ferrous ore mining | 2200 | 2302
2900 | Other mining | 2300 | 21, 22, 29
3000 | Manufacturing | 3000 | 3
3100 | Manufacture of food, beverages and tobacco | 3100 | 31
3110 | Protected food manufacturing | 3111 | 3111/2, 3116/8
3120 | Import-competing food manufacturing | 3112 | 3113/5, 3119, 312
3130 | Beverage and tobacco manufactures | 3120 | 313/4
3200 | Textile, wearing apparel and leather industries | 3200 | 32
3300 | Manufacture of wood and wood products, incl. furniture | 3410 | 33
3310 | Saw mills and planing mills | 3411 | 33111
3320 | Other wood industry | 3412 | 33 exkl 33111
3400 | Manufacture of paper and paper products, printing and publishing | | 34
3410 | Manufacture of pulp | 3421 | 34111
3420 | Manufacture of paper and paperboard | 3422 | 34112
3430 | Manufacture of pulp, paper and paperboard products | 3423 | 34113, 3412/9
3440 | Printing, publishing and allied industries | 3430 | 342
3500 | Manufacture of chemicals, plastic products and petroleum | 3500 | 35
3510 | Manufacture of industrial chemicals, incl. plastic materials | 3521 | 351
3520 | Manufacture of other chemical products | 3522 | 352
3530 | Petroleum refineries and manufacture of products of petroleum and coal | 3530 | 353/4
3550 | Manufacture of rubber products | 3510 | 355
3560 | Manufacture of plastic products | 3523 | 356
3600 | Manufacture of non-metallic mineral products, except products of petroleum and coal | 3600 | 36
3700 | Basic metal industries | 3700 | 37
3710 | Iron and steel basic industries | 3710 | 371
3720 | Non-ferrous metal basic industries | 3720 | 372
3800 | Manufacture of fabricated metal products, machinery and equipment | 3800 | 38
3810 | Manufacture of fabricated metal products, except machinery and equipment | 3811 | 381
3820 | Manufacture of machinery and equipment | 3812 | 382
3830 | Manufacture of electrical machinery, apparatus, appliances and supplies | 3830 | 383
3840 | Manufacture of transport equipment, except ship building | 3813 | 384 exkl 3841
3850 | Manufacture of professional, scientific, measuring and controlling equipment, and of photographic and optical goods | 3814 | 385
3860 | Ship building and repairing | 3843 | 3841
3900 | Other manufacturing | 3900 | 39
4000 | Electricity, gas and water, incl. steam and hot water supply and sewage disposal | 4000 | 4
4100 | Electric light and power, steam and hot water supply | 4100 | 4101, 4103
4200 | Gas manufacture and distribution | 4200 | 4102
4300 | Water works and supply, incl. sewage disposal | 4410, 9200 del | 42, 92001
5000 | Construction | 5000 | 5
6000 | Wholesale and retail trade, hotels and restaurants | 6000 | 6
6100 | Wholesale and retail trade | 6100 | 61/62
6300 | Restaurants and hotels | 6300 | 63
7000 | Transport, storage and communication | 7000 | 7
7100 | Transport and storage | 7100 | 71
7110 | Land transport | | 711
7111 | Railway transport | 7110 | 7111
7112 | Urban, suburban and interurban highway passenger transport | 7120 | 7112
7113 | Other passenger land transport | 7130 | 7113
7114 | Freight transport by road | 7140 | 7114
7116 | Supporting services to land transport | 7180 del | 7116
7120 | Water transport and allied services | | 712/9
7121 | Ocean, coastal and inland water transport | 7150 | 7121/2
7123 | Supporting services to water transport | 7160 | 7123
7130 | Air transport | 7170 | 713
7190 | Services allied to transport | 7180 del | 719
7200 | Communication | 7200 | 72
7210 | Postal services | 7210 | 72001
7220 | Telecommunication | 7220 | 72002
8000 | Financing, insurance, real estate and business services | 8000 | 8
8100 | Financial institutions | 8110 | 81
8200 | Insurance | 8210 | 82
8300 | Real estate and business services | | 83
8310 | Real estate | | 831
8311 | One- and two-family houses and leisure houses | 83, 84 | 831 a
8312 | Other real estate | 83, 84 | 831 b
8320 | Business services | 8500 | 832/3
9000 | Other personal services | 9000 | 9
9200 | Sanitary and similar services, except sewage disposal | 9200 del | 92 exkl 92001
9300 | Education and health services | 9300 | 93
9400 | Recreational and cultural services | 9400 | 94
9500 | Personal and household services | 9500 | 95
9510 | Repair services not elsewhere classified | 9510 | 951
9520 | Other services | 9520 | 952/9

(1) Revised code of classification by kind of activity in the Swedish national accounts (SNR).
(2) The former SNR classification.
(3) Swedish standard industrial classification of all economic activities (see Reports on statistical co-ordination 1969:8 and 1977:9). SNI is a Swedish version of the ISIC and is equal to it at the four-digit level.
Classification of Gross Fixed Capital Formation

Kind of activity: ISIC 1-9; Local Government; Central Government
By type of capital good: Residential buildings; Other buildings and constructions; Machinery and other
APPENDIX III
Levels for calculating seasonal adjustment

Expenditure of GDP:
Private final consumption expenditure, total and by durability (see appendix II)
General government consumption, total
Central government consumption
Local government consumption
Gross fixed capital formation, total
Gross fixed capital formation, buildings
Gross fixed capital formation, machinery and other
Changes in stocks
Exports, total
Exports of goods
Exports of services
Imports, total
Imports of goods
Imports of services
GDP

Value added:
Activity 1-9 ISIC
Unallocated banking services
Indirect commodity taxes, net
Central government
Local government
NPI
Aggregated levels, for example the sum of ISIC, GDP etc.

Hours worked:
Activity 1-9 ISIC
Unallocated banking services
Central government
Local government
NPI
Aggregated levels, for example the sum of ISIC, total number of hours worked etc.
Data flows in Dutch Quarterly National Accounts 1
Ronald JANSSEN
Peter OOMENS
Nico van STOKROM
Statistics Netherlands
This paper describes the dataflows in the Dutch Quarterly National
Accounts. It gives insight into methods, organisation and publications.
New developments and future fields of study are also discussed.
1 Introduction
The subject of this paper is to describe the
transformation of basic short-term statistics into a
consistent set of Quarterly National Accounts (QNA):
in terms of a flow chart of data, from non-integrated
or single statistics (basic information) to fully
integrated and consistent National Accounts. An
outline of the philosophy of the Dutch system of
national accounting will be given. Remarks will also
be made on the organisation, the publications, and
new developments and new fields of study for the
Quarterly National Accounts in the Netherlands.
In the process of compiling the QNA in the
Netherlands there are three main phases. The first
phase is that of data collection and adjustment. In the
second phase all data are brought together in an
input-output table. In the Dutch system of (Q)NA an
input-output table (or a make-use table) is an
instrument for statistical integration. The result of the
second phase is a fully balanced input-output table
from which the relevant macro-economic totals can be
derived. The third phase concerns publications.
The organisation of the Dutch QNA reflects the
dataflows described. There are three project groups
working on the QNA: basic information, integration
and publication.
The output in terms of publications of the Dutch QNA
is quite varied. There are press reports highlighting
the changes in important macro-economic variables.
More detailed analysis follows in bulletins and other
printed publications. There are also electronic
publications available to the public.
A recent development is the preliminary (flash)
estimate of GDP. Furthermore, the influence of
variations in the number of working days on output is
being studied, and seasonal adjustment procedures are
under scrutiny. In the field of publications, the
electronic publication will be renewed and Statistics
Netherlands is investigating alternative ways, such as
electronic mail, to distribute the electronic
publications.
1 The views expressed in this paper are those of the authors and do not necessarily reflect the views of Statistics Netherlands.
output table. The output approach is applied by
aggregating gross value added in the columns of the
table. The breakdown of primary costs in the table is a
result of the income approach and the breakdown in
destination of goods and services produced by
industry reflects the expenditure approach. In practice,
it is important to keep the table as simple as possible,
which means that most breakdowns necessary for the
three methods do not appear in full detail in the input
output table. For example, the income approach
requires a breakdown of incomes into their elements.
The expenditure approach requires a division of
household consumption by distribution channel or a
breakdown of investments by type of asset. No such
details are shown in the input output table, although
they do play a role in the process of statistical
integration.
Some remarks on macro economic
2 statistics
The many different statistics produced by Statistics
Netherlands can be thought of as a coherent system.
On one hand there is long-term (mainly annual)
statistical information which describes the economic
structure or another phenomenon with great detail.
This information is characterized as reliable and
detailed but available only after a relatively long
period of time. On the other extreme there is
short-term information. This usually is rough and
aggregated but relatively fast. There are of course lots
of statistics which can be put in categories between
these extremes. Specifically, the system of
macro-economic statistics contains annual and
quarterly national accounts and monthly indicators.
The annual national accounts go in the first (detailed
but slow) category. The monthly indicators are of the
rough and ready kind . In between, we have the QNA.
The extrapolation of statistical structures to more
recent periods is a kind of short-term statistical
analysis. Changes in (economic) structures cannot be
quantified very reliably in this way. Of course, it is
possible to obtain indications, for example about the
nature and the direction of structural changes. The
results of such short-term statistical analysis depend
on, among other things, a judgement about the set of
used indicators; both separately and in mutual
relationships. Is is clear that this kind of short-term
statistical analysis takes more time than producing
separate indicators.
One of the main macro-variables in national
accounting is total production (Gross Domestic
Product, GDP). Production (value added of
production) is generated in the various branches of
industry. This approach to the macro-economic
aggregate of GDP is called the output approach .
Calculating the same macro-economic aggregate by
way of the process of income distribution and
redistribution, where all individual incomes are added
up, is known as the income approach. The third
sub-process is the expenditure process. Here the
variables of the confrontation of supply and demand
are measured separately. GDP can be determined as a
balancing item if information is available about total
consumption (by households and government), total
investments (by government and enterprises,
including changes in stocks and work in progress),
total exports (of goods, services and primary income)
and total imports (ditto). This estimation of GDP is the
expenditure approach. In the Dutch National Accounts
the input output table is traditionally an important tool
for integrating most of the information on the
economic process. The three above-mentioned
approaches come together very efficiently in the input
output table.
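The equivalence of the three approaches can be illustrated with a toy two-industry economy (all numbers hypothetical, purely for illustration):

```python
# Toy two-industry economy showing that the output, income and
# expenditure approaches yield the same GDP.

# Output approach: value added = gross output - intermediate consumption.
gross_output = {"manufacturing": 500.0, "services": 400.0}
intermediate = {"manufacturing": 300.0, "services": 150.0}
gdp_output = sum(gross_output[b] - intermediate[b] for b in gross_output)

# Income approach: all primary incomes generated in production add up.
wages = 280.0
operating_surplus = 170.0
gdp_income = wages + operating_surplus

# Expenditure approach: GDP as the balancing item of final demand.
consumption = 300.0   # households and government
investment = 90.0     # incl. changes in stocks and work in progress
exports = 160.0
imports = 100.0
gdp_expenditure = consumption + investment + exports - imports

print(gdp_output, gdp_income, gdp_expenditure)  # all three equal 450.0
```

In the input output table the three routes are reconciled cell by cell rather than only at this aggregate level.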
In the Dutch system of QNA, an extrapolation of the
structure of the Dutch economy is the result of the
exercise. It is done to make consistent all available
statistical information about changes in production,
income and expenditure. The input output table of a
base period can be seen as a two-dimensional
weighting scheme for integrating all short-term
information. In other words: the input output
table is an instrument for the integration of short-term
statistics. It is not a goal in itself.
It goes without saying that the input output table can
be compiled on a more disaggregated level and in
more detail as more time passes after the end of the
reporting period. Integration of very short-term
(monthly) statistics is not considered possible (yet).
The monthly figures describe aspects of one specific
row or column of the input-output table. Generally
speaking, monthly indicators are like a thermometer;
they do not describe the state of health, but they spot
any significant changes in it.
A research project was started to try to find a new
optimum between the availability of short-term
indicators, the possibility of making additional
estimates and the wish to shorten the time lag. The
possibility to use the extrapolation method of the
regular QNA in the fast QNA was also investigated.
In the last decade the emphasis in the development of
Dutch macro economic statistics has been on the
short-term statistics. In the early eighties this led,
among other things, to the resumption of the QNA.
QNA had been compiled for 1948 to the second quarter
of 1953, but were then discontinued. The statisticians
in those days also had to work on the annual national
accounts, and the differences between the annual
national accounts and the sum of four quarters were
found to be very significant. The reliability of the
quarterly data was considered insufficient, and the
budget restraints at that time were very strict.
Priority was given to the annual national accounts. The
motto was: first get your annual statistical structure
in order and then you can extend and extrapolate to
quarters.
The research led to the fast quarterly GDP estimate.
Generally, it appears that some of the most important
monthly indicators were available about 6 weeks after
the end of the month or quarter under report.
The timeliness of the basic statistical data for the
fast QNA was therefore set at 6 weeks as well. It
is hard to cut down on the 2 weeks needed for the
integrating and balancing process. The process was
even becoming more complex because of the larger
statistical margins. Some time was cut by performing
the balancing process on a more aggregated level (a
smaller input output table). All in all the starting point
was that the first results of the fast QNA should be
published 7 to 8 weeks after the end of the quarter under
review. The so-called flash estimate covers economic
growth only: GDP volume and some information on the
production side of the economy.
The designers of the new QNA were confronted with a
demand for optimal timeliness. On the one hand the
availability of short-term statistical information was
important, while on the other the timeliness of the
process of statistical integration was also an issue.
With respect to the former, initially the necessary
basic statistical information was complete about 16 to
17 weeks after the end of the quarter under report. At
that moment the process of integration could start.
This process – building up the input output table,
confronting statistically calculated supply and demand,
deriving statistical discrepancies, deliberating these
differences and consulting specialists and, last but
not least, taking decisions – takes about two weeks.
Later the timeliness of
necessary statistical information came to be somewhat
shorter, so that the regular QNA could be published in
full detail at a fairly steady 17 weeks after the end of
the quarter.
3 Data collection and adjustment in the QNA
The purpose of the QNA is to create, from unstructured
and sometimes inconsistent information, a
comprehensive, complete and consistent synopsis of
production, consumption and income generation from
productive activities in the economy (scheme 1).
Therefore we first need data from economic actors.
Basic data is derived from a diverse number of sources
(e.g. production, turnover and investment statistics per
industry, inquiries within households, foreign trade
statistics, government accounts). In practice it is
common that the data is not completely consistent.
These inconsistencies between supply and demand of
commodities must be balanced.
This delay, however, is quite large. Several users of
QNA have asked for figures with better timeliness.
Also, other countries produce their QNA faster. The
timeliness of the underlying statistics is often shorter.
It is our impression that the compilation of QNA in
many other countries is usually based on one of the
three approaches, or on a combination of the
approaches in which the integration takes place on the
macro-level. For the Netherlands CBS the relatively
large delay led to a reconsideration of this problem.
The system of extrapolation of Quarterly National
Accounts provides quarterly input output tables in both
current and constant prices. The coherence between
these tables and the method of extrapolation is outlined
in scheme 2. The same method of extrapolation is
applied for the fast QNA, although there is less
statistical information and the balancing process is
less disaggregated.

4 Statistical integration
The integration of data in compiling QNA takes place
in an input output framework. The quarterly economic
structure of a base quarter is updated. This is done
simultaneously in current prices and in constant prices
(average prices of the previous year). The first task of
the integration specialist is to build up an unbalanced
input output table. The input output table in the QNA
is an industry-industry type table. It consists of 86
rows, which are branches of industry. On the input side
63 columns are distinguished, which correspond with
63 branches of industry (at a slightly higher level of
aggregation).
Scheme 2 shows that there is a relation between the
table for the base-quarter (t-1) in current prices and the
table for the reporting quarter t (the same quarter
one year later) in current prices. This relationship is
represented by a dotted line. It is usual to decompose
information about value changes into a price and a
quantity (volume) component. With these two
components the continuous lines are followed.
Furthermore, the scheme shows that the balancing
process of the two tables (in current and in constant
prices) is a simultaneous process. Balancing decisions
are based on volumes as well as prices and values.
The tables are compiled in current as well as in
constant prices. To extrapolate the economic structure
a production function with fixed coefficients is
assumed. The change in gross production is used to
calculate the change in intermediate consumption of
the branches of industry. Basically, to produce 10%
more cars, 10% more metals, glass, rubber products,
etc. are needed. The next step is to add data on household
consumption, capital formation, exports and imports
and, if available, data or estimates on the change in
stocks and work in progress. The result is an
estimation of total demand. This is confronted with the
estimation of gross production (= total supply).
Subsequently the various output and expenditure
indicators at constant and current prices are used to
calculate a price index for intermediate demand (row).
The final result is a column of statistical discrepancies
at current and constant prices. That column regrettably
hardly ever contains cells with a value of zero.
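The fixed-coefficient extrapolation described above can be sketched as follows (hypothetical base-quarter figures; the actual calculation runs over the full 86-row by 63-column table):

```python
# Extrapolating one column of the input output table with fixed
# technical coefficients: if a branch produces 10% more, each of its
# intermediate inputs is scaled up by 10% as well.

base_output = 100.0                                        # gross output of "cars", base quarter
base_inputs = {"metals": 40.0, "glass": 5.0, "rubber": 10.0}  # intermediate inputs (I/O column)

growth = 1.10  # volume indicator: car production up 10% on the base quarter

# Fixed coefficients imply every input grows at the same rate as output.
new_output = base_output * growth
new_inputs = {product: use * growth for product, use in base_inputs.items()}

print(new_output, new_inputs["metals"])  # 110.0 and 44.0
```

Confronting the supply so extrapolated with the separately estimated demand is what produces the column of statistical discrepancies.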
Branch specialists in the quarterly national accounts
are responsible for the data collection. Most data are
gathered by other divisions of Statistics Netherlands.
An important task of the specialist is to transfer the
basic data into data suitable as input in the quarterly
input output table. A major element in this task is to
check the data for continuity. The key question is: do
year-to-year changes in the data (quarter under report
compared to the corresponding quarter a year before)
represent real growth rates or are they perhaps caused
partly by changes in the classifications of statistical
units?
The branch specialists also complete the data: often
small firms are not included in the statistical surveys
of basic statistics. Since the QNA require, just as the
annual National accounts, complete estimates for an
industry, data from basic statistics must be grossed up.
Also some estimates for hidden transactions are best
made with a view to the total values in the
input-output tables.
Now the real work of statistical integration can start.
The statistical discrepancies are to be resolved. Much
depends on the quality of the basic data and the
stubbornness or flexibility of the branch and
integration specialists. The process is one of
negotiation. The real questions in this process are:
what is the quality of the indicator of production; how
were changes in stocks and other final expenditure
variables estimated; and, nowadays a very relevant
point, what about the level of imports and exports? If
gross production is changed because demand is found
to be higher than production, then the whole column
(input) of this branch must be recalculated. That of
course leads to new statistical discrepancies. This is
called the process of iteration.
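A mechanical analogue of this iteration, stripped of the negotiation element, is a fixed-point recalculation (hypothetical two-branch coefficients; this is an illustration, not the actual CBS procedure):

```python
# Iterating "raise production to meet demand, recalculate the input
# columns with fixed coefficients, which raises intermediate demand
# again" until the discrepancies vanish.

# Hypothetical technical coefficients A[i][j]: input of branch i needed
# per unit of output of branch j; f: final demand by branch.
A = [[0.2, 0.3],
     [0.1, 0.4]]
f = [100.0, 80.0]

x = f[:]  # first guess: production equals final demand only
for _ in range(200):
    # intermediate demand implied by the current production levels
    intermediate = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    new_x = [intermediate[i] + f[i] for i in range(2)]
    if max(abs(new_x[i] - x[i]) for i in range(2)) < 1e-9:
        x = new_x
        break
    x = new_x

# x now satisfies x = A x + f (total supply equals total demand)
# up to the tolerance, so the statistical discrepancies are zero.
```

In practice, of course, the QNA iteration also changes the data themselves, not just the production levels.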
Furthermore, the data are subjected to some
plausibility checks: e.g. the year-to-year quantity
changes of total output, total input, total value added
and the number of employees are compared. If necessary
the data are corrected.
Scheme 3 sums up the activities by the branch
specialists in the Netherlands carried out before the
results from basic statistics can be included in the
integration process.
5 Publications of QNA
Finally, the statistical differences are brought to
zero. The input output table in its final stage may not
contain residual statistical differences.
The above-mentioned dataset is the starting point for
various calculations.
The advantage of this method of working is that the
consequences of a decision concerning a statistical
difference immediately become clear. If intermediate
demand is considered to be too low and is
subsequently raised, then value added decreases and
the operating surplus goes down
correspondingly.
The first calculations are about time series. The time
series at constant and current prices must be updated
with the most recent quarterly data. Calculations in the
input output table are done at current prices and at
constant (average) prices of the previous year. For the
QNA publication time series with a fixed base year are
wanted. The base year at this moment is 1990. From
the values at constant prices in the input output tables
the volume changes of all variables are calculated.
These changes are then chained onto the values at
constant prices in the base year. The outcome is not a
time series with constant weights as in a Laspeyres
index, but a time series of chained volume changes.
This is done for all variables for which values at
constant prices are calculated. These problems of
course do not exist for the values at current prices.
These are derived directly from the input output tables.
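The chaining of volume changes onto the base-year level can be sketched as follows (growth rates are hypothetical):

```python
# Chaining: each period's volume change, measured at the average prices
# of the previous year, is applied multiplicatively to the constant-price
# level of the 1990 base year. The result is a chained volume series,
# not a fixed-weight Laspeyres series.

base_level_1990 = 100.0
# volume changes per period, each at previous-year prices
volume_changes = [0.02, 0.015, -0.005, 0.03]

series = [base_level_1990]
for g in volume_changes:
    series.append(series[-1] * (1.0 + g))

print(series[-1])  # about 106.10
```

The weights implicit in each link come from the previous year, which is why the chained series cannot simply be re-derived from base-year prices.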
Fully integrated and well-structured data are the
outcome of this statistical process. Of great
importance for improving the quality of the relevant
basic statistics is the "feed-back" of the adjusted data
to the surveying divisions. If production for a branch
of industry is consistently too high or too low relative
to other indicators, that is reported to the statistical
departments responsible for gathering the data.
Regrettably, there is much feedback on the foreign
trade statistics these days.
There are also several variables which are not found in
the input output tables. Data on primary incomes and
unrequited income transfers to and from the rest of the
world are obtained from the Netherlands Central Bank.
These income flows are part of the balance of
payments. Data is adjusted to national accounts
definitions and then national product, national income
and the surplus of the nation on current transactions is
calculated. The third part in the additional calculations
done by the publications group is the seasonal
adjustment of the time series. The method used by
Statistics Netherlands is CENSUS X-11. Each variable
is adjusted separately. Statistical discrepancies
between seasonally adjusted aggregates and the sum of
partial series will occur.
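Why adjusting each series separately produces such discrepancies can be shown with a crude multiplicative adjustment (ratio-to-annual-average seasonal factors) standing in for X-11, which is not reimplemented here; all series are hypothetical:

```python
# Two component series with different seasonal patterns are adjusted
# separately; the directly adjusted aggregate then differs from the sum
# of the adjusted parts.

def seasonal_factors(series, period=4):
    """Average ratio of each quarter to its own year's mean."""
    years = [series[i:i + period] for i in range(0, len(series), period)]
    factors = []
    for q in range(period):
        ratios = [year[q] / (sum(year) / period) for year in years]
        factors.append(sum(ratios) / len(ratios))
    return factors

def adjust(series, period=4):
    f = seasonal_factors(series, period)
    return [x / f[i % period] for i, x in enumerate(series)]

a = [12, 8, 12, 8, 24, 16, 24, 16]   # stable seasonal pattern
b = [5, 15, 5, 15, 20, 20, 20, 20]   # seasonal pattern fades in year 2
total = [x + y for x, y in zip(a, b)]

indirect = [x + y for x, y in zip(adjust(a), adjust(b))]
direct = adjust(total)
discrepancy = [d - i for d, i in zip(direct, indirect)]
print(round(discrepancy[0], 3))  # 0.769: direct and indirect adjustment differ
```

Because adjustment is multiplicative (and X-11 itself is non-linear), the operation does not distribute over a sum, so the aggregate and its parts cannot all be adjusted consistently.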
The input output table or an aggregated version of it
does not give all the needed information. Capital
formation by type of asset cannot be derived from the
input output table. Additional calculations are
necessary. For example a matrix of capital formation
could have rows showing branches of industry where
the capital assets are produced (or imported) and
columns which show the type of capital good.
Changes in capital formation resulting from the
integration process lead automatically to a change in
capital formation by branch of industry. Matrix
calculations can be made for the consequences to the
values of investment of various types of capital goods.
Household consumption and other expenditure
variables are handled in the same way.
Once the whole set of data in all its dimensions has
been completed, work on the publications can begin.
The input output table is the common core of a set of
matrices relating to various economic variables. In the
end the integration specialist declares himself satisfied
with the data in the balanced input output table. The
results are then examined and discussed. This
sometimes leads to further corrections. Finally, the
aggregated dataset is made and presented to the
publications group.
The first publication is always a press release. The
press report has two functions. The first is of course to
inform the general public. The second purpose is to
signal to our customers that the QNA data are
available.
The press report typically consists of about three
pages. It gives a one page abstract of recent economic
developments in the Dutch economy and two pages
with some tables and notes (name of spokesman,
telephone numbers, etc.).
The present ascii format is not very customer
friendly, so the publication format is being changed to
a different standard. The
Netherlands CBS has recently updated its software for
electronic publications. This is CBSview, release 2.0.
This software is presented free of charge to customers
and must be installed on their computer system. It uses
menu bars and has a tree structure to select variables
from a database.
There usually is quite a lot of media attention for these
press reports. We have interviews on radio, articles in
newspapers and occasionally ("Holland going into
recession") a television appearance. In the case of our
fully based estimates the press report highlights the
expenditure approach to GDP and extra details are
given on capital formation. The press report in the case
of the flash estimate of GDP gives fewer details. The
change in GDP volume (seasonally adjusted and
original) and some details on the output approach to
GDP are presented. The volume changes of value
added are given for three groups of industries. These
are the producers of goods, commercial services and
non commercial services. Government value added is
included in the last category.
The electronic publication QNA will be presented in
its new format in the near future. It will enable
customers to select data from the database of QNA
with relative ease. The selection can furthermore be
put in a format preferred by the customer. For
example: a spreadsheet in WK1 format, a DBASE
file, an SPSS file, DIF files, plain ascii format, etc.
In the new version of our electronic publication many
new features have been added to the existing
possibilities. There are three partial publications: time
series, publication tables and investment matrices.
The second outlet for the QNA is on Videotext.
Videotext is a kind of databank with pages of
information accessible by computer modem. The
tables of the printed publication (full details) are
presented. Each table can be downloaded but there is
no possibility to get other data from the QNA. There is
no publication delay for Videotext. Both Videotext
and the press report are published on the same day at
10 o’clock in the morning. The exact publication dates
are known to press and other customers well in
advance.
The time series publication has been extended. Many
previously unpublished time series have been added.
The second part of the electronic publication is based
on the tables in the printed publication. The format of
those tables has been adapted so as to make it look
right on a computer screen. One of the simplest
changes is putting the newest figures in the first
data column. In a printed publication one would put the
newest figure in the last column. On a computer screen
that just does not work.
The investment matrices are published electronically
for the first time. These are matrices where the capital
formation of one quarter is given in further detail.
There are two sets of matrices. One gives capital
formation by type of capital good and by industry of
origin. The other gives capital formation by type of
capital good and industry of destination. Each table is
presented in current as well as in constant prices. Data
are available from 1987 to the present.
The immediate needs of the clients can be fulfilled
with the press report and Videotext. The aim is to have
the full set of QNA data with the customers within
days of the press release. The electronic publication is
the means to do so. At this moment QNA has an
electronic publication which consists of a lot of large
tables with time series of all published variables.
There are 28 (ascii) files with the tables of the printed
publication and nearly 50 files with times series on
specific variables. The time series typically go back to
1977. Values at constant prices of 1990, values in
current prices and (implicit) price indices are
presented. For all these there are original and
seasonally adjusted time series. Also the changes of
values, etc. are (electronically) printed. The present
electronic publication of QNA in its ascii format is
not very customer friendly.
The new electronic publication not only holds data. It
also has text on the quarterly national accounts in
general and the variables and tables in particular.
Explanations can be called on screen at almost all
levels in the tree structure.
A test version of the new electronic publication has
been completed. Some of the most favoured customers
(central bank, various commercial banks, ministry of
economic affairs) have been asked to study and criticize
the result. Their comments will be used for the final
publication. That will be introduced to all Dutch
customers in early 1995. An English version will be
made immediately afterwards.
The last part of the printed publication is the tables
section. These tables show the variables of the
expenditure approach to GDP and details of
expenditure, the variables of the output approach,
variables of the income approach and a breakdown of
total wages. Also there is a table giving details about
the economic relations with the rest of the world.
6 Future fields of study
Lots of questions about the economic interpretation of
the figures are posed immediately after the press
release. An extended version of the press release
would be useful. Statistics Netherlands publishes a
weekly bulletin. In this bulletin there is room for text
and tables with data. The production of the bulletin
and distribution to the customers takes about one
week. It is therefore fast enough for our purpose.
Every quarter we fill two pages in this bulletin with the
main results and comments on the QNA. Publication
takes place some days after the press report.
There are several issues which need further study.
The first calculations for the Dutch QNA were made
for 1977. So, for many important variables the time
series go back to 1977. Up to the mid eighties the
quarterly input output tables were only compiled in
current prices. The resulting values were then deflated
to get at volume changes.
A major change in working practice led to a system
where input output tables were compiled in current as
well as in constant prices. For some variables the time
series in constant prices start only in 1987. Work has
started to extend the coverage of the time series back
to 1977. Specifically, work is done on value added at
constant prices by branch of industry. Now this is
available only in current prices for the quarters of 1977
to 1986. The data on capital formation nowadays is
fully consistent with the QNA. In the years before
1987 this was not the case. The changes of capital
formation were calculated independently from QNA.
This so-called investment index was subsequently
used as an input in the QNA. The value in millions of
guilders from the investment index however did not
correspond with the levels published in the QNA.
From 1987 onwards the investment index was
integrated in the QNA. Inconsistencies no longer exist.
What is lacking however, is a time series of the details
of capital formation by type of capital good and branch
of industry of destination for the quarters of 1977 to
1986. Work is in progress to reconstruct from old data a
consistent set of data on these subjects.
The last publication is the printed publication. This
50-page magazine has all the information necessary for a
proper interpretation of the figures. The publication
opens with a special section called ‘Topics’. In this
section the main macro-economic trends are
highlighted. Next, we have a section with explanatory
notes on the results of the quarter. In this section
economic growth is analyzed in many ways. Which
expenditure category contributes most to GDP growth
and why, for example. The contribution of branches of
industry to GDP growth is calculated and analyzed.
Details of gross fixed capital formation and
consumption expenditure are highlighted and
relationships with other trends in the economy are
shown. Sometimes methodological comments are
made.
In this publication there is also a section about
international trends. Developments in the economies
of the other countries of the European Union are
important to the Dutch economy. Recently, special
attention was given to developments in the German
economy. Also in the printed publication there is room
for articles. Subjects have included a monthly
production index of construction, economic
developments in agriculture and economic relations
with the rest of the world.
Another subject which needs more attention is the
matter of working days or more precisely production
days in a quarter. The number of production days
changes between quarters. In the first quarter of 1992
economic growth in the Netherlands was 3.2%.
Roughly half of this growth had nothing to do with the
business cycle but was caused by the leap day in
combination with an extra production day in that
quarter. These factors must be measured more exactly
for a correct interpretation of the changes in GDP. The
solution to this problem probably needs calculations of
the effect of changes in the number of working days on
value added for each branch of industry. That is being
studied. In fact, we think a GDP-figure really
describing the business cycle will result. This will
make interpretation of the QNA in an economic sense
more meaningful. To this end, the seasonal adjustment
procedures used in the Dutch QNA are being
scrutinized. A more sophisticated seasonal adjustment
procedure would for example keep track of production
days in value added and shopping days in the case of
household consumption. These effects are not
seasonal in nature. The point is to subtract (or add)
from the time series the special effects and only then
adjust the time series for seasonal influences. This
probably would make the interpretation of seasonally
adjusted data easier. The annual national accounts in
the Netherlands also use input output tables. Recently
however, the usual input output table has been
replaced by use and make matrices. The use of the
matrices in the QNA is to be investigated.
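The prior correction for production days discussed above might, in its simplest form, look like this (day counts are hypothetical; the per-branch calculations under study are more refined):

```python
# Removing a working-day effect before seasonal adjustment by scaling
# each quarter to the average number of production days. The corrected
# series, not the raw one, would then be seasonally adjusted.

value = [100.0, 104.0, 98.0, 102.0]   # raw quarterly value added
working_days = [62, 61, 64, 61]       # production days per quarter

avg_days = sum(working_days) / len(working_days)
calendar_adjusted = [v * avg_days / d for v, d in zip(value, working_days)]

print(calendar_adjusted[2])  # 98 scaled down: Q3 had more days than average
```

The point made in the text is precisely this ordering: calendar effects are not seasonal, so they are subtracted first and only then is the series adjusted for seasonal influences.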
And last but not least : distribution of publications.
The national and international use of Internet is
growing very fast. Statistics Netherlands has E-mail
facilities which are being used more and more. It is
worth investigating the possibilities of distributing
electronic publications by E-mail. Another even more
promising possibility is to allow customers to access a
computer (for example a GOPHER-like system) of
Statistics Netherlands. The customer would of course
need a username and password to get in. Subsequently,
the customer can download or update the publication
files. This service to customers certainly deserves
further study.
Scheme 1: Flow-chart of data

Basic Statistics
↓
QNA:
– adjustments by specialists to NA definitions
– balancing of the input output table
– publication of tables and time series
↓
Customers
Scheme 2: Relations between quarterly input output tables

Before balancing, four tables are related:
– the base quarter (t-1) at current prices and the base
quarter (t-1) at constant prices of year (t-1), linked
by the ratio of prices of the base quarter;
– value indicators (quarter under report compared with
the corresponding base quarter) link the base-quarter
table at current prices to the table for the quarter
under report t at current prices (dotted line);
– volume indicators (quarter under report compared with
the corresponding base quarter) link the base-quarter
table at constant prices to the table for quarter t at
constant prices of year (t-1);
– price indicators (quarter under report compared with
year (t-1)) link the two tables for quarter t.

The balancing process then turns the two tables for the
quarter under report t – at current prices and at
constant prices of year (t-1) – from their state before
balancing into their state after balancing, still
linked by the ratio of prices.
Scheme 3: Functions of specialists in NA and in QNA department

Functions of the specialists in I/O, for industries and
for final expenditure:
– continuity of figures over time
– adaptation to definitions of NA
– completeness for cut-off statistics and for the
hidden economy
– plausibility of figures
– participation in the balancing
SECTION 4 SEASONAL ADJUSTMENT
Trend-cycle versus seasonal adjustment in the
quarterly national accounts
Alfredo CRISTÓBAL CRISTÓBAL
Enrique Martín QUILIS
Instituto Nacional de Estadística, Madrid
1 Introduction (1)
The use of the trend-cycle component of short-term
indicators in the Spanish Quarterly National Accounts
(CNTR) is one of their essential characteristics (INE,
1992, 1993). Indeed, this differs notably from the
current practice of other statistical offices, which
adjust the series for seasonal variation only.
Today it should be replaced by other, more efficient
and refined techniques (3). With this in mind, we
present a method for estimating the trend-cycle used at
INE. It is a technique which combines the design of
tailor-made filters (Melis 1983, 1986, 1988, 1989, 1991
and 1992) with model-based procedures (Hillmer and
Tiao, 1982; Maravall, 1987; Bell et al., 1983).
The proposed filter, called the "Modified Airline"
filter ("Lignes Aériennes Modifié", LAM), is robust
with respect to alternative specifications of the model
generating the data, has a low information cost and a
remarkable capacity to extract the trend-cycle signal.
The relative novelty of trend-cycle signal extraction
methods in economics (2), and the spread of "black box"
methods such as X11 or X11-ARIMA, very easy to use but
technically difficult to understand, have steered
statistical production towards a hybrid product – the
seasonally adjusted series – of little use to analysts
and resting on statistical principles that could be
improved.
The article is organised as follows: first, we discuss
the reasons that justify seasonal adjustment and the
procedure most commonly employed (the X11 method);
then we set out the foundations of trend-cycle signal
extraction, criticise the technique used by X11 (above
all the Henderson moving averages) and propose some
solutions. In particular, we present the alternative
chosen by INE. Finally, we offer some conclusions.
The aim is to question the limitations of the X11
method as a procedure for extracting the trend-cycle
signal. This method has been very important in the
development of time series analysis, but today it
needs to be replaced.
(1) The views expressed in this article are those of the
authors, not necessarily those of INE.
(2) This contrasts with the routine use of these
techniques in other scientific fields, for example in
telecommunications engineering.
(3) It is worth noting the forthcoming replacement of
the X11 method by another called "X12-ARIMA",
developed by the U.S. Bureau of the Census.
2 Why adjust for seasonal variation?
(b) cycle: the cyclical component of a time series is
characterised by oscillations whose duration lies
between two and five years or, more precisely,
between twenty and sixty months. Like the trend, it
is a component associated with the low frequencies,
but it originates in different factors, in which
aspects specific to the short term appear.
Seasonal adjustment is common practice in short-term
analysis and even in quarterly accounts. In this
section we examine the economic and statistical
reasons that justify the use of this technique. All
results will be stated for monthly series, but the
generalisation to quarterly series is straightforward.
These movements can usually be characterised as the
optimal response of rational agents to various
exogenous shocks, taking prices and/or quantities as
instruments. This is the essential domain of
macroeconomics and short-term analysis, and
theoretical explanations for this kind of phenomenon
abound. Consequently, we shall regard as the cycle all
oscillations whose frequency, in radians, lies between
2π/60 and 2π/20.
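The frequency bands used here follow from w = 2π/p; a minimal sketch classifying a monthly oscillation by its period:

```python
import math

# Classify a monthly oscillation by its period p (in months), using the
# bands from the text: trend for p > 60 months, cycle for 20 <= p <= 60,
# everything faster belongs to the seasonal/irregular range.

def classify(period_months):
    w = 2 * math.pi / period_months   # frequency in radians
    if w < 2 * math.pi / 60:
        return "trend"
    if w <= 2 * math.pi / 20:
        return "cycle"
    return "seasonal/irregular range"

print(classify(120), classify(36), classify(12))
# trend cycle seasonal/irregular range
```

Since w is a decreasing function of p, the low-frequency end of the spectrum corresponds to the long-period components (trend, then cycle).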
The components hypothesis in the time domain expresses
a series X_t as the sum of four orthogonal components:
the trend, the cycle, the seasonal component and the
irregular:

   X_t = T_t + C_t + S_t + I_t                     [1]

Expression [1] is also valid for a multiplicative
scheme X_t = T_t* · C_t* · S_t* · I_t* if logarithms
are applied to the original series.
It is often difficult to discriminate between trend
and cycle, owing to oscillations of very long period.
Most of the macroeconomic series we analyse are very
short, and it may be impossible to find long cycles.
This is why designing ideal filters exclusively for
the trend or for the cycle is a very difficult problem.
Moreover, from a theoretical point of view it is
accepted that several of the factors affecting the
trend are also responsible for cyclical behaviour (5),
which makes a very sharp distinction inadvisable. We
shall therefore work with a mixed trend-cycle
component P_t = T_t + C_t, and [1] then becomes
expression [1'].
logarithmes sur la série originale. Chaque composante
est associée à une bande de fréquence:
(a) la tendance: est associée aux basses fréquences,
c’est à dire, aux mouvements de longue durée,
oscillations dont la période est plus grande que 60
mois (cinq ans). Cette composante peut être
associée aux déterminants de la croissance
économique: le développement technique;
l’évolution du stock de capital physique; niveau,
composition et qualification de la force du travail.
En somme, tout ce qu’on appelle “Théorie de la
Croissance” dans la Théorie Economique. Alors,
on parlera de la tendance pour les oscillations dont
la fréquence, exprimée en radians, soit entre 0 et
2π / 604. Il convient de souligner, dans cette bande
de fréquence, sa limite inférieure, w=0, qui est
associée aux oscillations dont la période est
infinie, ou bien, dans les échantillons finis, avec
des périodes dont la durée est plus grande que le
nombre de données de la série. C’est la fréquence
de la tendance absolue.
[1']
X t = Pt + S t + I t
(c) composante saisonnière:
il s’agit d’un
mouvement périodique dont la durée est d’une
année. Elle est causé, essentiellement, par facteurs
institutionnels, climatiques et techniques qui
évoluent doucement. La variation très faible de
tels facteurs déterminent que cette composante ne
soit pas la plus remarquable dans l’analyse à court
4
w = 2π / p dont w est la fréquence (en radians) et p la période.
5
Surtout ceux qui touchent au stock de capital et à la productivité des facteurs.
202
TABLE OF CONTENTS
terme, mais elle peut être importante dans un
analyse structurel. Alors, on appelera composante
saisonnière les oscillations dont la fréquence en
radians soit 2π / 12, période d’une année (sans
oublier les harmoniques 2kπ / 12, k = 2..6).
la tendance et le cycle seulement, dû à sa
fréquence annuelle.
La manière la plus développée pour résoudre ces
problèmes consiste à éliminer les variations
saisonnières de la série originale, c’est à dire 6:
(d) irrégularité: mouvements errants et très difficiles
à prédire. Ils constituent généralement une source
de bruit dans l’analyse à court terme, et c’est
pourquoi on ne doit pas en tenir compte. Alors, on
attribue à l’irrégularité aux oscillations dont la
durée est inférieure à douze mois (une année),
c’est à dire, inférieure à 2π / 12 radians, les
harmoniques saisonniers exceptés.
[2]
Cette procédure est la “correction des variations
saisonnières” et la méthode la plus populaire qui
l’automatise c’est la X11 du U.S. Bureau of the
Census et sa variante X11-ARIMA de Statistics
Canada.
Les composantes les plus importantes pour l’analyse à
court terme sont la tendance et le cycle. Les variations
saisonnières sont liées à des caractéristiques
structurelles de l’économie, qui changent lentement.
Le terme irrégulier ne transmet aucune information et
c’est un élément de perturbation qui dénature un
rapport stable.
Cette méthode apporte un filtre simple, valable pour
un grand nombre de séries et très facile d’interpréter
dans le domaine du temps. On peut voir le tableau
numéro 2 pour avoir une description complète de la
méthode. La procédure basique de filtrage consiste
dans un ensemble de filtres de moyennes mobiles
symétriques tels que:
[3]
Au contraire, d’un point de vue économique la
tendance et le cycle captent ce qui est essentiel dans
l’évolution d’une série. Ces composantes reprennent
les effets des décisions des agents économiques sur
l’accumulation de capital (physique et humain) ainsi
que le rhytme et la manière de s’approcher à
l’évolution à long terme.
(
H 1 ( B) = (1 / 2) B 1/ 2 + B −1/ 2
⋅
[∑
j = 1K6
)(1/ 12)
]
(B j − 1/ 2 + B 1/ 2− j )
Après avoir appliqué [3] à la série originelle, on a une
première estimation de la composante de
cycle-tendance. On peut trouver la fonction de
puissance de ce filtre dans le graphique 1, où on peut
voir l’annulation des variations saisonnières, la
relative atténuation du signal cyclique et la perceptible
réduction de l’apport de la composante irrégulière.
Par rapport aux comptes trimestriels, on doit mettre en
relief:
(a) les variations saisonnières des estimations
trimestrielles obtenues par la méthode de
Chow-Lin sont déterminées par celles de
l’indicateur employé dans la régression. La
plupart des fois, ces variations saisonnières ne
seront pas les vraies puisque l’indicateur ne
contiendra pas toute l’information de la variable
annuelle.
La tendance reste invariable. Le filtre complémentaire
de H 1 (B), H 2 (B) = 1 − H 1 ( B), permet d’avoir une
première estimation des composantes saisonnière et
irrégulière. Sa fonction de puissance est dans le
graphique 2.
On peut éliminer l’irrégularité par l’intermédiaire de
filtres de moyennes mobiles saisonnières:
(b) De plus, on ne peut pas vérifier de telles variations
puisque la méthode de Chow-Lin met en rapport
l’indicateur avec une série qui a l’information sur
6
xst = X t − st
[4 ]
En general z t est l’estimateur de Z t .
203
H 3 ( B) =
[(1/ 3)(∑
j = −1K 1
B 12 j
)]
2
TABLE OF CONTENTS
[5]
H 5 ( B) = [(1 / 5)(∑ j = −2K2 B 12 j )][(1 / 3)(∑ j = −1K1 B 12 j )]
La figure 4 contient la fonction de puissance du filtre.
On peut voir que après [7], les oscillations dont la
période est supérieure à vingt mois restent intactes
(puissance unitaire). L’irrégularité reste intacte aussi
et les variations saisonnières s’évaporent.
La X11 a une fonction de décision qui permet de
choisir entre [4] et [5] (Dagum, 1988). En avant, on va
employer [5] sans perte de généralité.
La méthode X11 effectue une deuxième réitération
dont la plus importante contribution est une estimation
plus fine du cycle-tendance, fondée sur l’emploi
comme input de la série corrigée de variations
saisonnières. Pour cela, on emploie un filtre de
passage bas (de basse fréquence) appelé “moyennes
mobiles de Henderson”, symétrique, dont la formule
est:
On trouve dans le graphique 3 la fonction de puissance
de [5]. Après avoir appliqué le filtre, il reste presque
seulement la tendance et les variations saisonnières.
Avec H 2(B) y H 5(B) et la restriction de que le filtre
soit inoffensif pour les séries sans variations
saisonnières (que la somme des termes intra-annuels
soit nulle), on aura une première estimation de cette
composante saisonnière:
[6]
[8]
H6(B) = H 2 (B)H 5(B)H 2(B)
On va l’analyser [8] en détail dans la prochaine
section. La X11 considère trois grandeurs pour [8]:
m=4, 6 et 11. Le choix dépend du quotient signal/bruit,
de telle sorte que pour les séries plus (moins)
irrégulières, le filtre est plus grand (petit) (Dagum,
1988). Le tableau 1 expose les coefficients de [8] pour
toutes les grandeurs de m.
Alors, le filtre qui corrige les variations saisonnières
sera:
[7]
H 4 ( B) = ∑ j =− mKm c j B j
H7(B) = I - H 6(B)
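Before turning to the Henderson stage in detail, the properties claimed above for H_1(B) can be checked numerically: it has unit gain at the origin and zeros at the seasonal harmonics. A minimal sketch (the function names are ours, not X11 terminology):

```python
import numpy as np

def h1():
    """Centred 12-term moving average H1(B) of eq. [3] (a 2x12 MA).
    Returns the 13 coefficients for lags -6..6."""
    w = np.ones(13) / 12.0
    w[0] = w[-1] = 1.0 / 24.0   # half weight on the two extreme lags
    return w

def gain(coefs, freq):
    """Gain (modulus of the frequency response) of a symmetric filter
    whose coefficients cover lags -(n-1)/2 .. (n-1)/2."""
    half = (len(coefs) - 1) // 2
    lags = np.arange(-half, half + 1)
    return abs(np.sum(coefs * np.exp(-1j * freq * lags)))

# H1 passes the trend untouched and annihilates the seasonal harmonics:
print(round(gain(h1(), 0.0), 6))             # 1.0 at w = 0
print(round(gain(h1(), 2 * np.pi / 12), 6))  # 0.0 at w = 2*pi/12
```

The same `gain` helper applied at 2kπ/12 for k = 1..6 returns zero for every seasonal harmonic, which is exactly why [3] serves as the first trend-cycle estimator.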
Table 1: Coefficients of the Henderson filter

  j        m=4        m=6       m=11
  0     .32144     .23381     .13994
  1     .26297     .21038     .13472
  2     .12540     .14854     .11977
  3    -.00296     .07059     .09716
  4    -.04612     .00422     .07003
  5         --    -.02869     .04206
  6         --    -.02195     .01697
  7         --         --    -.00217
  8         --         --    -.01346
  9         --         --    -.01661
 10         --         --    -.01302
 11         --         --    -.00542
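These coefficients can be checked against the invariance property that defines the Henderson average: it reproduces a polynomial trend up to a cubic essentially unchanged (exactly, up to the rounding of the published weights). A sketch using the m = 6 column of Table 1 (variable names are ours):

```python
import numpy as np

# 13-term (m = 6) Henderson weights from Table 1; symmetric: c_{-j} = c_j.
half = np.array([0.23381, 0.21038, 0.14854, 0.07059, 0.00422,
                 -0.02869, -0.02195])
c = np.concatenate([half[:0:-1], half])      # coefficients for lags -6..6

t = np.arange(80, dtype=float)
cubic = 0.01 * t**3 - 0.5 * t**2 + 3.0 * t + 10.0

# 'valid' convolution: the filtered value at time t uses t-6 .. t+6.
smoothed = np.convolve(cubic, c[::-1], mode="valid")
err = np.max(np.abs(smoothed - cubic[6:-6]))
print(err)   # small: the cubic passes through essentially unchanged
```

The residual is of the order of 1e-2 on values of order 1e3, entirely attributable to the five-decimal rounding of the tabulated weights.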
Table 2: Basic equations of X11

A: FIRST ESTIMATION
Series                  Formula
Trend-cycle:            p(1)_t = H_1(B) X_t
Seasonal + irregular:   si_t = X_t - p(1)_t = [I - H_1(B)] X_t = H_2(B) X_t
Seasonal component:     s(1)_t = H_5(B) si_t = H_5(B) H_2(B) X_t
                        s(2)_t = H_2(B) s(1)_t = H_2(B) H_5(B) H_2(B) X_t = H_6(B) X_t
Irregular:              i(1)_t = X_t - p(1)_t - s(2)_t = [I - H_1(B) - H_6(B)] X_t = [H_2(B) - H_6(B)] X_t
Seasonally adjusted:    xs(1)_t = X_t - s(2)_t = [I - H_6(B)] X_t = H_7(B) X_t

B: FINAL ESTIMATION
Series                  Formula
Trend-cycle:            p(2)_t = H_4(B) xs(1)_t = H_4(B) H_7(B) X_t = H_8(B) X_t
Seasonal component:     s(3)_t = H_5(B) [X_t - p(2)_t] = H_5(B) [I - H_8(B)] X_t
                        s(4)_t = H_2(B) s(3)_t = H_2(B) H_5(B) [I - H_8(B)] X_t = H_9(B) X_t
Irregular:              i(2)_t = X_t - p(2)_t - s(4)_t = [I - H_8(B) - H_9(B)] X_t
Seasonally adjusted:    xs(2)_t = X_t - s(4)_t = [I - H_9(B)] X_t = H_11(B) X_t
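The first-pass equations in panel A of Table 2 can be exercised end to end by representing each filter as a coefficient array and composing them by convolution. A toy check, with our own helper names, that the first-pass adjustment filter H_7 = I - H_6 annihilates a pure seasonal sinusoid:

```python
import numpy as np

def identity_like(filt):
    """Delta (identity) filter with the same odd length as `filt`."""
    d = np.zeros(len(filt))
    d[len(filt) // 2] = 1.0
    return d

def compose(f, g):
    """Filter composition = convolution of coefficient arrays."""
    return np.convolve(f, g)

# H1: centred 12-term MA;  H2 = I - H1
h1 = np.ones(13) / 12.0
h1[0] = h1[-1] = 1.0 / 24.0
h2 = identity_like(h1) - h1
# H5 of eq. [5]: 3x5 seasonal MA acting at lags that are multiples of 12
five = np.zeros(49);  five[::12] = 1.0 / 5.0    # lags -24..24
three = np.zeros(25); three[::12] = 1.0 / 3.0   # lags -12..12
h5 = compose(five, three)
# H6 = H2 H5 H2  and  H7 = I - H6 (first-pass seasonal and SA filters)
h6 = compose(compose(h2, h5), h2)
h7 = identity_like(h6) - h6

t = np.arange(400)
seasonal = np.sin(2 * np.pi * t / 12)            # pure seasonal wave
sa = np.convolve(seasonal, h7, mode="valid")     # seasonally adjusted
print(np.max(np.abs(sa)))                        # ~0: seasonality removed
```

Because H_2 and H_5 both have unit gain at the seasonal harmonics, H_6 passes them untouched and H_7 removes them completely, which is what the near-zero output confirms.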
The final estimate of the seasonal component is obtained in the same way as at the start; one need only replace H_1 by H_8, so that:

[9]  H_9(B) = H_2(B) H_5(B) [I - H_8(B)],  where H_8(B) = H_4(B) [I - H_6(B)]

The final seasonal-adjustment filter is then:

[10]  H_11(B) = I - H_9(B)

Graph 5 shows the power function of filter [10]. It is very similar to that of [7] (graph 4), except that:

(a) it has a narrower rejection band;
(b) it offers a wider pass band at the cyclical frequencies; and
(c) the lobes in the irregular band are smaller.

The irregular term must not be confused with the innovations, surprises or unanticipated values. A time series can be regarded as the aggregation of an infinite number of uncorrelated shocks with zero mean and finite variance (white noise), according to the expression:

[11]  X_t = Σ_{j=0..∞} ψ_j a_{t-j} = ψ(B) a_t

where a_t ~ Niid(0, σ_a) are the shocks or innovations of the series which, once estimated, represent the prediction errors (Box and Jenkins, 1970).

The hypothesis of underlying components in the time domain states that the observed series can also be expressed as in [1]. From [1] and [11] we obtain:

[12]  ψ(B) a_t = T_t + C_t + S_t + I_t

From [12] one deduces that only under extremely restrictive and rather improbable conditions is I_t = a_t. In the general case, therefore, an innovation is distributed among all the components, not only the irregular one.

Ordinarily, seasonal adjustment is followed by an adjustment for calendar effects (the weekly cycle, that is, the correction for trading days, and also for moving festivals, for example Easter). The weekly cycle produces an aliasing effect in the monthly series: a peak appears in its spectrum at the 2.873-month frequency, within the irregular domain. Correcting this effect amounts to introducing a zero into the power function of the filter (graph 6). The question then is: why eliminate this piece of the irregular and let the rest through? The reason for correcting the first effect is known and the origin of the others is not, but this does not prevent both effects from being equally harmful to the clarity of the estimated short-term signal.

3 Why extract the trend-cycle signal?

The systematic use of seasonally adjusted series contrasts with the little interest shown in trend-cycle series, especially considering that the reasons given for seasonal adjustment are the same ones that recommend eliminating the irregular. Using seasonally adjusted series amounts to working with the trend-cycle contaminated by the irregular. Since the latter component carries no information of value for short-term analysis, this treatment is inconsistent: only part of the information extraneous to the analysis (the seasonal variations) is eliminated. Consequently, only statistical reasons can be put forward to justify this lack of interest in trend-cycle series. We shall now analyse in depth the X11 filtering used to estimate this component.

In the first estimation a centred twelve-term moving average [3] is used (see graph 1 for its power function). This filter performs poorly at extracting the low-frequency signal, except for the pure trend, because it falls to zero very quickly: the power over the cyclical domain is attenuated by almost 70 per cent. This estimator is a defective tool for extracting the trend-cycle.

The final estimate of this component is obtained with the filter:

[13]  H_8(B) = H_4(B) [I - H_6(B)]
The filter [I - H_6(B)] is the preliminary seasonal-adjustment filter, and H_4(B) is the Henderson moving average, whose formula is:

[8]  H_4(B) = Σ_{j=-m..m} c_j B^j

It is a symmetric low-pass filter of variable length, the length depending on the signal-to-noise ratio. Graph 7 shows the power function for the filters of sizes m = 6 (the intermediate one) and m = 11 (the longest).

This filter is better than the preliminary one [3], and the attenuation at the low frequencies is less pronounced. Nevertheless, it suffers from two serious defects:

(a) its power function has no roots at the seasonal frequencies. It can therefore be used only after seasonal adjustment has been performed, in this case by [I - H_6(B)]. If the Henderson filter is applied to a series with seasonal variations, the resulting series will be fairly smooth but will still contain them. This dependence on prior filtering limits its application.

(b) the pass band of the filter is very wide: with m = 6 it does not sufficiently attenuate the oscillations whose period lies between 6 and 12 months, so the result will not be very smooth. With m = 11 this effect disappears, but at an additional cost.

Graph 8 shows the power function of X11's final trend-cycle filter with Henderson moving averages of 13 and 23 terms. Both are zero at the seasonal harmonic frequencies. Filtering with the 23-term average gives a smoother result than the other, but attenuates the oscillations of the cyclical domain more. The operating scheme of X11 should be stressed: if the series is very irregular, the method chooses a 23-term average; the irregularity becomes nil in the filtered series and the oscillations of the cyclical domain are considerably attenuated. If the series is not very irregular, a 13-term average is chosen; the oscillations with periods between 6 and 12 months are then not eliminated by the filtering, only slightly attenuated, while the oscillations of the cyclical domain remain almost unchanged, and the result is a rather rough trend-cycle series. Consequently, if the series is very irregular the output contains almost nothing but trend information, and if it is less irregular the information of the cyclical domain remains unchanged but hidden behind a residual, uneliminated irregularity.

A further criticism can be made of the Henderson filter. It has the formula H_4(B) = Σ_{j=-m..m} c_j B^j, where the c_j are its coefficients, computed by a constrained optimisation procedure. The filter depends on a very particular specification of the trend-cycle component:

[14]  X_t = P_t + U_t

where P_t = a + bt + ct^2 + dt^3 (a deterministic cubic trend) and U_t ~ iid(0, σ_U) (a white-noise disturbance). Two conditions are imposed:

(i) symmetry: c_j = c_{-j} for all j
(ii) invariance: H_4(B) P_t = P_t for all t

The objective function of the minimisation procedure is the norm of the series of third differences of the trend-cycle signal P_t, subject to (i) and (ii). The whole procedure is quite arbitrary, for two reasons:

(1) the choice of objective function is very particular: taking a third difference of a series is rather unusual in time-series analysis and, moreover, its statistical interpretation is difficult.

(2) the trend-cycle model is very restrictive, arbitrary and hardly consistent with modern analysis, in which it is customary to use ARIMA models with stochastic or mixed trends.

Another very important element should be stressed: the cost of the filtering. This is the number of end-of-sample predictions needed to apply the filter in such a way that input and output are simultaneous, that is, the phase shift of the filter. From Table 2 one deduces that the final estimate of the trend-cycle needs 54 predictions if the 13-term average is used (m = 6), and five more if the 23-term average is used (m = 11). Of these, 48 predictions (almost 90 per cent) are caused by the (compulsory) seasonal-adjustment procedure that must be carried out before applying the Henderson filter. The final seasonally adjusted series needs 96 predictions (m = 6), 1.7 times more than the estimate of the trend-cycle.
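The prediction counts quoted above are simply the half-lengths of the composed filters, so they can be verified mechanically by composing coefficient arrays (helper names are ours; the 13-term Henderson weights are those of the m = 6 column of Table 1):

```python
import numpy as np

def compose(f, g):
    """Filter composition = convolution of coefficient arrays."""
    return np.convolve(f, g)

def minus(f, g):
    """f - g for centred filters of odd lengths (pad the shorter one)."""
    n = max(len(f), len(g))
    def pad(x):
        k = (n - len(x)) // 2
        return np.pad(x, (k, k))
    return pad(f) - pad(g)

delta = np.array([1.0])                      # identity filter I
h1 = np.ones(13) / 12.0
h1[0] = h1[-1] = 1.0 / 24.0
h2 = minus(delta, h1)                        # I - H1, 13 terms
five = np.zeros(49);  five[::12] = 0.2
three = np.zeros(25); three[::12] = 1.0 / 3.0
h5 = compose(five, three)                    # 73 terms
h6 = compose(compose(h2, h5), h2)            # 97 terms
h7 = minus(delta, h6)                        # SA filter: 48 predictions per end
c = np.array([-0.02195, -0.02869, 0.00422, 0.07059, 0.14854, 0.21038,
              0.23381, 0.21038, 0.14854, 0.07059, 0.00422, -0.02869, -0.02195])
h8 = compose(c, h7)                          # final trend-cycle: 109 terms
h9 = compose(compose(h2, h5), minus(delta, h8))
h11 = minus(delta, h9)                       # final SA filter: 193 terms

for f in (h7, h8, h11):
    print(len(f), (len(f) - 1) // 2)         # length, predictions per end
# 97 48  /  109 54  /  193 96
```

The printed half-lengths reproduce the figures in the text: 48 end-point predictions from the seasonal-adjustment stage, 54 for the final trend-cycle filter, and 96 (a 193-term filter) for the final seasonally adjusted series.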
This difference should lead to the use of the latter, above all because the use of very long filters (193 terms for the seasonally adjusted series if m = 6) implies a large number of revisions, especially at the ends of the series. As a result, the most recent evolution cannot be perceived clearly; it becomes unreliable and of little use for short-term analysis.

In view of all this, X11 seems an inadequate tool for obtaining the trend-cycle signal of a series, owing to the deficiencies of its filters, above all the Henderson filter, and to their information costs (in terms of end-of-sample predictions). For this reason the INE chose to work with other signal-extraction techniques in an attempt to correct these defects.

We now present the signal-estimation method used in the Spanish quarterly accounts system.

Let X_t be a time series built as the sum of three orthogonal unobservable components: the trend-cycle (P_t), the seasonal variations (S_t) and noise (I_t):

[15]  X_t = P_t + S_t + I_t

All the components can be formulated as ARIMA models:

[16]  Φ_p(B) P_t = θ_p(B) b_t
      Φ_s(B) S_t = θ_s(B) c_t
      Φ_n(B) I_t = θ_n(B) d_t

where b_t ~ Niid(0, σ_b), c_t ~ Niid(0, σ_c) and d_t ~ Niid(0, σ_d) are independent white noises. None of these representations has common factors, and the roots of the polynomials in B lie on or outside the unit circle.

From [15] and [16] we obtain the familiar ARIMA expression of the observed series (the reduced form):

[17]  Φ(B) X_t = θ(B) a_t,  a_t ~ Niid(0, σ_a)

The relation between [15] and [17] is given by:

[18]  Φ(B) = Φ_p(B) Φ_s(B) Φ_n(B)
      θ(B) a_t = [Φ_s(B) Φ_n(B)] θ_p(B) b_t + [Φ_p(B) Φ_n(B)] θ_s(B) c_t + [Φ_p(B) Φ_s(B)] θ_n(B) d_t

There is an obvious identification problem, since it is possible to find an infinite number of models in accordance with [15] and [16] that are compatible with the structure [17]. It is therefore advisable to introduce a priori specifications for the models of the components.

We shall require the trend-cycle component and the seasonal variations to follow stochastic schemes that respect the usual statistical properties attributed to them in empirical signal extraction.

In the spectral domain, the low frequencies can be associated with P_t (oscillations lasting more than twenty months), and the peaks at the seasonal frequencies (2kπ/12, k = 1..6) with S_t (oscillations with a twelve-month period).

An adequate model for the trend-cycle is an IMA(d, q_p) with q_p ≤ d, since its spectrum tends to infinity at frequency zero and then decreases monotonically towards zero at the high frequencies. The formulation of this model is:

[19]  (1 - B)^d P_t = θ_p(B) b_t

Similarly, an ARMA(11, q_s) model with q_s ≤ 11 adequately reflects the frequency-domain properties of the seasonal variations, because it has no information at frequency zero and tends to infinity at the seasonal harmonics. Its mean value (mathematical expectation) is zero:

[20]  U(B) S_t = θ_s(B) c_t,  with U(B) = 1 + B + B^2 + ... + B^11

Considering [18], [19] and [20], the autoregressive part of the observed series must be:

[21]  Φ(B) = (1 - B)^d U(B) Φ_n(B)

Expression [21] states that the autoregressive part of the model of the observed series must contain the term U(B) for a valid decomposition to exist.

To restrict the framework a little further, we shall assume that the series X_t evolves according to an "airline" model:

[22]  (1 - B)(1 - B^12) X_t = (1 - θ_1 B)(1 - θ_12 B^12) a_t

Owing to the canonical requirement: θ_1 < 1, θ_12 > 0. The condition θ_12 > 0 is imposed so that the series has information at the seasonal harmonics (Burman, 1980).

In line with [21]:

[23]  (1 - B)(1 - B^12) = (1 - B)(1 - B) U(B) = (1 - B)^2 U(B)

and consequently:

[24]  Φ_p(B) = (1 - B)^2,  Φ_s(B) = U(B),  Φ_n(B) = 1

Since θ(B) a_t is a thirteen-term moving average, the following restrictions must be satisfied: order of θ_p(B) ≤ 2; order of θ_s(B) ≤ 11; order of θ_n(B) = 0, that is, θ_n(B) = 1. If we take [18] into account, we have:

[25]  θ(B) a_t = U(B) θ_p(B) b_t + (1 - B)^2 θ_s(B) c_t + U(B)(1 - B)^2 θ_n(B) d_t

The general model of the trend-cycle component is therefore an IMA(2,2):

[26]  (1 - B)^2 P_t = (1 - α_1 B - α_2 B^2) b_t

The estimator of the trend-cycle that minimises the mean squared error is (Maravall, 1987):

[27]  p_t = k V(B) V(F) X_t = k V(B, F) X_t,  where k = σ_b^2 / σ_a^2

The imposition of the canonical requirement (Bell, Hillmer and Tiao, 1983; Hillmer and Tiao, 1982) determines that the spectrum of P_t touches the abscissa axis at the frequency w = π. Write:

[28]  (1 - α_1 B - α_2 B^2) = (1 - h_1 B)(1 - h_2 B)

where h_i, i = 1, 2, are the roots of the MA(2) polynomial. Consequently, the power function of the filter generating P_t is:

[29]  g_p(w) = (σ_a / σ_b)^2 (1 - h_1 cos w)(1 - h_2 cos w) / (1 - cos w)

The canonical requirement is then:

[30]  g_p(w = π) = 0,  which determines h_1 = -1.

The canonical model of the trend-cycle component of an "airline" series is therefore given by:

[31]  (1 - B)^2 P_t = (1 + B)(1 - αB) b_t

and, consequently:

[32]  V(B) = [(1 - α_1 B - α_2 B^2) U(B)] / [(1 - θ_1 B)(1 - θ_12 B^12)]
      V(F) = [(1 - α_1 F - α_2 F^2) U(F)] / [(1 - θ_1 F)(1 - θ_12 F^12)]

where F = B^{-1} is the forward-shift operator.

The trend-cycle extraction filter is symmetric, centred and has infinite tails. A large number of predictions (forwards and backwards) is therefore needed if θ_1 and/or θ_12 are close to the invertibility boundary. Likewise, to compute the parameters α_1 and α_2 of the filter, a system of equations derived from the method of moments must be solved (Maravall, 1987). This system also carries additional restrictions, owing to the canonical-decomposition principle.
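Identity [23], on which the assignment of the component polynomials rests, is easy to confirm by multiplying out the coefficient polynomials. A quick check:

```python
import numpy as np

# (1 - B)(1 - B^12) expanded in powers of B
lhs = np.convolve([1.0, -1.0],
                  np.concatenate([[1.0], np.zeros(11), [-1.0]]))

# (1 - B)^2 U(B), with U(B) = 1 + B + ... + B^11
rhs = np.convolve(np.convolve([1.0, -1.0], [1.0, -1.0]), np.ones(12))

print(np.allclose(lhs, rhs))   # True: eq. [23] holds
```

Both sides expand to 1 - B - B^12 + B^13, which is why the airline model's autoregressive part contains the U(B) factor required by [21].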
Recently a variant of filter [32] called the "modified airline" filter (LAM) has been proposed (Melis and Gómez, 1989; Melis, 1991, 1992). This filter avoids the partial-fraction decomposition proposed by Burman and Maravall, and its use is therefore very simple.

The coefficients α_i, i = 1, 2, depend on the other parameters θ_1 and θ_12:

[35]  α_i = α_i(θ_1, θ_12),  i = 1, 2

Two conditions are applied to the filter:

a. Canonical requirement: the power must be zero at w = π (absence of noise). Then, with h_1 = -1, we have 1 - α_1(B = -1) - α_2(B = -1)^2 = 0, that is:

[36]  α_2 - α_1 = 1

b. Condition of high tangency at the origin (Melis, 1992): the second derivative of the power function must be zero at w = 0 (in this way, maximum similarity with the ideal low-pass filter is obtained):

[37]  d^2 g(w)/dw^2 = 0 at w = 0,  with g(w) = ||V(e^{-iw})||^2

From conditions (a) and (b), the parameters α_i, i = 1, 2, can be computed as functions of θ_1 and θ_12, according to the following expressions (Melis, 1992):

[38]  C = -{ [θ_1 / (1 - θ_1)^2] + [s^2 θ_12 / (1 - θ_12)^2] },  with s = 12

[39]  α_1 = [2 - √(2(2 - 4C))] / (4C - 1)

[40]  α_2 = 1 + α_1  (canonical requirement)

[41]  k = [(1 - θ_1)(1 - θ_12)] / [s (1 - α_1 - α_2)],  with s = 12

Expression [41] follows from the normalisation of the power function, setting its value at the origin to unity. The main advantages of the filter V(B, F) are two:

(a) it adapts to the characteristics of the series, in that the pass band is widened (narrowed) if the input is slightly (very) irregular.

(b) the model of the trend-cycle component is known, so tests of the adequacy of the extraction can be carried out (for example, an analysis of the similarity between the theoretical autocorrelation function and that of the estimated signal).

The main disadvantage of this filter appears when the series has a positive value of θ_1 (the usual situation for series with moderate or high irregularity). In this case the power function closes the pass band too much and strongly attenuates the information of the cyclical domain (graph 9). The filter then becomes a good estimator of the trend, but not of the cycle.

The parameters of the filter depend on those of the modelled series; however, θ_1 and θ_12 can be fixed, and a fixed filter constructed in such a way that its power function is appropriate for trend-cycle extraction and its cost, the number of end-of-sample predictions, is low.

From the filter designer's point of view, it is not necessary for the filters to originate from a good prediction model; it is enough for the designed filter to have a power function adequate for extracting the signal and a minimum phase shift (end-of-sample cost). We shall not seek a direct link between the model of the series and the extraction filter. We shall nevertheless use the designed filter V(B, F), but with some modifications:

(1) θ_1 < 0 is kept fixed in the filter. In this way, a good selection of the cyclical signal is ensured. θ_12 is also fixed in the filter. The automatic procedure assumes θ_1 = -0.80 and θ_12 = 0.85.

(2) after V(B, F), and in order to eliminate the irregularity remaining in the filtered series, another low-pass filter of the Butterworth family is applied: autoregressive of order four, with half power at sixteen months, tailor-made according to:

[42]  Ω(B) = Ω_0 / (1 + Ω_1 B + Ω_2 B^2 + Ω_3 B^3 + Ω_4 B^4)

where Ω_0 = 0.0139, Ω_1 = -2.9889, Ω_2 = 3.4456, Ω_3 = -1.8029, Ω_4 = 0.3598.
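Independently of how α_1 is obtained from [38]-[39], the structure imposed by [36], [40] and [41] can be checked on its own: once α_2 = 1 + α_1, the MA(2) factor has a zero at w = π, and k normalises the gain of kV(B) to one at the origin. A sketch with an arbitrary illustrative α_1 (not the value the automatic procedure would produce):

```python
import numpy as np

theta1, theta12, s = -0.80, 0.85, 12
alpha1 = -0.0135          # illustrative value only, for the structural checks
alpha2 = 1 + alpha1       # canonical requirement, eq. [40]
k = (1 - theta1) * (1 - theta12) / (s * (1 - alpha1 - alpha2))   # eq. [41]

def V(z):
    """One-sided factor V of eq. [32], evaluated at z = e^{-iw}."""
    U = sum(z ** j for j in range(12))          # U(z) = 1 + z + ... + z^11
    num = (1 - alpha1 * z - alpha2 * z ** 2) * U
    den = (1 - theta1 * z) * (1 - theta12 * z ** 12)
    return num / den

# canonical zero at w = pi (z = -1) and unit gain at the origin (z = 1):
print(abs(k * V(np.exp(-1j * np.pi))))   # ~0
print(abs(k * V(1.0)))                   # ~1
```

Note that both U(B) and the MA(2) factor vanish at z = -1, so the zero at w = π is in fact a double one; the unit gain at the origin holds identically given [41], whatever α_1.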
The power and phase-shift functions can be seen in graphs 10a and 10b.

(3) finally, note that the filter is not symmetric, so the number of predictions needed at the two ends of the series differs. At the beginning of the series, 17 observations must be predicted: 13 for the filter V(B) (an ARMA(13,13)) and 4 for the AR(4) smoother. At the end, 18 observations must be predicted: 13 due to the filter V(F) and 5 due to the phase shift of the AR(4) over the cyclical domain. If only V(B)Ω(B) is applied, that is, ignoring V(F), the information cost at the end of the series is the sum of the phase shifts of the filters V(B) and Ω(B) over the cyclical domain, respectively one and five months, that is, six months in all: a very low cost.

The proposed filter, used in the Spanish quarterly accounts (CNTR), is therefore:

[43]  H(B) = k V(B) Ω(B)

Graphs 11a and 11b show the power and phase-shift functions of H(B), with the cyclical domain highlighted. The input must be advanced six months in order to correct the phase shift that the filter induces on the output. The complete expression of the process is:

[44]  p_t = k V(B) Ω(B) F^6 X_t

A SAS procedure has been written to automate the filter so that it can be applied to any series; it is available to anyone who requests it.

Graph 12a shows the power functions of the three trend-cycle filters analysed in this article: the two corresponding to X11 with Henderson moving averages of 13 (m = 6) and 23 (m = 11) terms, and the proposed filter used in the Spanish CNTR. All three eliminate the seasonal variations and almost all the irregularity produced by oscillations shorter than six months.

Graph 12a and, in more detail, graph 12b show the differences between the power functions of the three filters over the domain of oscillations with periods longer than six months. The X11 filter with the 13-term average is the best estimator of the cycle, but it does not greatly attenuate the oscillations with periods between six and twelve months, above all those of eight months' duration, so the signal produced will be rather irregular. The other two filters annul such oscillations.

Turning to the cyclical domain, the proposed filter is always more powerful than the X11 filter built with the 23-term average. For this reason the proposed filter is preferable to that of X11.

In addition, this filter has a far lower information cost: 6 observations against 59 for the X11 filter. In other words, the X11 filter produces variability in the last 59 observations of the signal (five years!). This works against short-term analysis, which is most interested in the most recent observations, precisely the most heavily revised ones. It reveals a rather low reliability of the estimated signal and therefore introduces a high risk of an incorrect short-term economic diagnosis.

4 Conclusion

There are no consistent reasons for using seasonally adjusted series in short-term analysis instead of trend-cycle series. The filters that remove seasonal variations are worse than those that extract the trend-cycle: they let all the irregularity through (the evolution of the series cannot be seen clearly, so a short-term diagnosis becomes difficult), and they need a large number of end-of-sample predictions, so that the latest filtered data must be revised, in some cases for almost five years.

Of all the methods analysed for the extraction of the trend-cycle, the proposed filter is the most adequate, since it has the best behaviour (power) over the cyclical domain and, in addition, its end-of-sample information cost is the lowest.
References

Bell W.R., Hillmer S.C. and Tiao G.C. (1983) Modeling considerations in the seasonal adjustment of economic time series, in Zellner, A. (Ed.) Applied Time Series Analysis of Economic Data, U.S. Department of Commerce, Bureau of the Census, Washington, U.S.A.

Box G.E.P. and Jenkins G.M. (1970) Time series analysis, forecasting and control, Holden Day, San Francisco, U.S.A.

Burman J.P. (1980) Seasonal adjustment by signal extraction, Journal of the Royal Statistical Society, series A, n. 143, p. 321-337.

Dagum E.B. (1988) The X11-ARIMA seasonal adjustment method, Statistics Canada, Ottawa.

Hillmer S.C. and Tiao G.C. (1982) An ARIMA model-based approach to seasonal adjustment, Journal of the American Statistical Association, vol. 77, n. 377, p. 63-70.

INE (1992) Nota metodológica sobre la Contabilidad Nacional Trimestral, Boletín Trimestral de Coyuntura n. 44, Instituto Nacional de Estadística.

INE (1993a) Contabilidad Nacional Trimestral de España. Metodología y serie trimestral 1970-1992, Instituto Nacional de Estadística.

Maravall A. (1987) Descomposición de series temporales. Especificación, estimación e inferencia, Estadística Española, vol. 29, n. 114.

Melis F. (1983) Construcción de indicadores cíclicos mediante ecuaciones en diferencias, Estadística Española, n. 98, p. 45-89.

Melis F. (1986) Apuntes de Series Temporales, Documento Interno, Instituto Nacional de Estadística.

Melis F. (1988) La extracción del componente cíclico mediante filtros de paso bajo, Documento Interno, Instituto Nacional de Estadística.

Melis F. (1989) Sobre la hipótesis de componentes y la extracción de la señal de coyuntura sin previa desestacionalización, Revista Española de Economía, vol. 6, ns. 1 and 2.

Melis F. (1991) La estimación del ritmo de variación en series económicas, Estadística Española, vol. 33, n. 126.

Melis F. (1992) Agregación temporal y solapamiento o "aliasing", Estadística Española, n. 130, p. 309-346.

Melis F. and Gómez V. (1989) Sobre los filtros de ciclo-tendencia obtenidos por descomposición de modelos ARIMA, Documento Interno, Instituto Nacional de Estadística.
Appendix: graphs

[Graphs 1-9, 10a/10b, 11a/11b, 12a/12b: not reproduced in this transcription.]
SECTION 5 INDIRECT METHODS
The monthly GDP indicator
Erkki LÄÄKÄRI
Statistics Finland
1 The purpose and structure of the monthly indicator

The monthly indicator of total output is a quick way of indicating the trend in overall economic development. It is based mainly on monthly statistics, speed being one of its key objectives in comparison with the quarterly accounts. The monthly indicator is completed within two months from the end of the month concerned.

The other objectives are simultaneousness and a high degree of correlation with the reference series. The quarterly GDP series, the widest and most familiar index describing total output, has been selected as the reference series. Linear interpolation is used to convert the series to a monthly one.

The structure of the monthly index is based on the principle of representativeness, under which single indicators, or indicators composed of a few series, represent the output of the different sectors of the economy. This method was selected in order to make it possible to follow not only total output but also the trend in the output of the most important economic sectors.

In addition, the production method used resembles the method by which the GDP series is compiled in the quarterly accounts, thus facilitating the use of this series for reference.

2 Sectoral review

Primary production

Primary production, the first main sector of the economy, consists of agriculture and forestry. Its contribution to total output in 1993 was 6.3 per cent. Milk and meat production, which accounts for approx. 60 per cent of agricultural output, is used to represent agricultural output. The output of crop cultivation is taken into account during the harvest season, i.e. in August, September and October. Market fellings, which cover 80 to 90 per cent of the value added in forestry, are used to represent the output of forestry.

Manufacturing

The monthly volume index of industrial output is used as the indicator of industrial output. Manufacturing accounts for approx. 27.6 per cent of total output. To enable output comparisons between counterpart months in different years, the monthly volume index of industrial output is calculated per working day.

Construction

The construction industry accounts for approx. 6.5 per cent of total output. The index of construction material inputs (1990=100) is used as the indicator of the output of the construction industry. It is modified with the help of the construction industry series on employment and on hours worked, which are derived from Statistics Finland's Labour Force Survey.

Trade

Trade accounts for approx. 10.7 per cent of GDP. The volume indices of wholesale and retail sales are used as the indicator of the output of trade. The monthly volume index of retail sales is calculated per trading day.

Transport and communications

Transport and communications accounted for 8.7 per cent of total output last year. The monthly indicator of transport and communications consists of three indicators: the real income of the Finnish P&T; the gross ton kilometres of goods transport on the State Railways; and the consumption of diesel oil, which mainly describes the development of professional road transport. Railway and road transport account for approx. 65 per cent of the output of transport and storage and for approx. 40 per cent of the output of the whole sector. Communications account for just over a quarter of the output of the whole sector.

Public and other services

The sector accounts for approx. 40.2 per cent of total output. The linear trend calculated from quarterly accounts data has turned out to be the best indicator of the sector for the purposes of the monthly indicator.

3 Calculation of the monthly indicator

The series of indicators describing an economic sector are weighted together to serve as the indicator of the sector. The indicators of the six economic sectors are made commensurable by indexing them according to a base year, currently 1990 (1990=100).

The monthly indicator of total output is formed by weighting the sectoral indicators together by means of their relative contributions to GDP at fixed prices in the previous year and by calculating the regression estimate of the weighted indicator with the help of the GDP series serving as the reference series.

4 The trend-cycle component of the monthly indicator

The monthly indicator is calculated in two versions: one based on the original series and the other, the so-called trend-cycle component, adjusted for seasonal and random variation. The calculation of the trend-cycle component is analogous to the version based on the original series in that each series is separately adjusted for seasonal variation and, after the series have been weighted together, the regression estimate is calculated with the help of the seasonally adjusted quarterly series serving as the reference series.

The method of seasonal adjustment used is the one developed by the Bank of Finland, a slight modification of X-11, the second version of the method developed by the US Bureau of the Census. In October 1994 we changed the method to X-11-ARIMA.

5 The monthly indicator and quarterly national accounts

The quarterly national accounts are compiled independently from the GDP indicator, but because the data sources are partly the same, there is a close connection between these two estimates of GDP.

The quarterly accounts are published about two months and three weeks after the end of the quarter. The data available at that moment is much larger than is available for the monthly indicator. The results of the indicator are used as comparison data for the quarterly accounts. The indicator is also useful in estimating economic development, because of the production lag in the quarterly accounts.

The indicator measures GDP, or its changes, in basic values. Therefore we have to compare the change in the GDP indicator with the change of quarterly GDP in basic values. Table 1 shows the annual volume changes of the indicator and the quarterly GDP. Because the figures calculated in both ways are revised when new data becomes available, the comparison is made using the first-release-day figures.

When comparing the first estimates given by the monthly indicator with the first estimates given by the quarterly national accounts, we can observe that the indicator overestimates the development in 1992 and in the first half of 1993. With closer examination we have observed that the monthly indicator has failed to forecast the development in two sectors of the economy, namely wholesale and retail trade, and public and other services.

During a depression, the linear regression used in estimating the production of the public sector is no longer good. The problem with the wholesale and retail sales statistics was merely a problem of timing: the sales statistics are released one and a half weeks later than the monthly indicator is published, so at the time the indicator was compiled the actual sales figures were not known.

These two problems have been solved. The method of estimating the public sector's development was changed so that we now receive information on the salaries paid by the public sector. The problem with the sales statistics is also solved: we now receive information from the sales statistics as soon as they receive it from the firms.

As can be seen, the development given by the monthly indicator after the first half of 1993 is quite similar to the development given by the quarterly national accounts.

6 Publication

The key results of the monthly indicator of total output are published as a four-page monthly report appearing two months after the end of the month in question.
Table 1: Annual volume changes of the monthly indicator and quarterly GDP

Year  Quarter   Monthly GDP indicator    GDP in basic values
                (first release day)      (first release day of quarterly accounts)
1992  I         -2.2                     -2.4
      II        -0.5                     -1.4
      III       -0.1                     -1.8
      IV        -0.3                     -1.5
1993  I         -0.5                     -1.9
      II        -1.6                     -2.9
      III       -1.7                      0.0
      IV         0.7                      0.0
1994  I          1.5                      2.0
      II         5.2                      5.2
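The construction described in sections 2-3 — six sector indicators weighted by their GDP contributions, then calibrated against the (monthly-interpolated) reference GDP series by regression — can be sketched as follows. This is an illustrative Python sketch: the index values, shares and series below are hypothetical numbers built from the percentages quoted in the text, not Statistics Finland data.

```python
import numpy as np

# Hypothetical sector index values for one month (1990=100); the shares are
# the rough sector weights quoted in the text, renormalised to sum to one.
sector_index = {
    "primary": 102.1, "manufacturing": 104.8, "construction": 95.3,
    "trade": 99.0, "transport_comm": 103.2, "services": 101.5,
}
gdp_share = {
    "primary": 0.063, "manufacturing": 0.276, "construction": 0.065,
    "trade": 0.107, "transport_comm": 0.087, "services": 0.402,
}

def weighted_indicator(indices, shares):
    """Weight the sectoral indices together by their relative GDP shares."""
    total = sum(shares.values())
    return sum(indices[s] * shares[s] / total for s in indices)

def calibrate(indicator, reference):
    """Regression estimate of the reference GDP series (here already
    interpolated to monthly frequency) on the weighted indicator."""
    indicator = np.asarray(indicator, dtype=float)
    X = np.column_stack([np.ones_like(indicator), indicator])
    beta, *_ = np.linalg.lstsq(X, reference, rcond=None)
    return X @ beta

monthly_value = weighted_indicator(sector_index, gdp_share)

# Toy 12-month calibration: a made-up indicator path and a reference series.
indicator_series = np.linspace(98.0, 104.0, 12)
reference_gdp = 0.9 * indicator_series + 8.0
fitted = calibrate(indicator_series, reference_gdp)
```

The regression step plays the role described in the text: it maps the weighted index onto the scale of the quarterly GDP reference series.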
The flash estimate of the Italian Real Gross Domestic Product
Giancarlo BRUNO, Eurostat
Gianluca CUBADDA, Università di Roma “La Sapienza”
Enrico GIOVANNINI, ISTAT
Claudio LUPI, ISTAT
1 Introduction

It is evident that the timely availability of high-frequency indicators is of crucial importance for the estimation of the quarterly national accounts. In Italy, the problem of having quick access to the high-frequency indicators is complicated by the number of the indicators themselves. Indeed, the present version of the quarterly national accounts covers estimates of the Resources and Uses accounts, value added, standard labour units, and various measures of labour cost and productivity, all evaluated at the detail of 44 branches of economic activity. Furthermore, household consumption is estimated for 50 consumption goods and services 1. The amount of information needed to estimate the quarterly figures is therefore impressive. Unfortunately, this also implies that many of the indicators are available only with a considerable delay from the end of the reference quarter, so that the quarterly national accounts are currently released with a mean lag of 95 days from the end of the reference quarter. However, economic analysis calls for a more rapid evaluation of at least the main aggregates.
In Italy it is the Istituto Nazionale di Statistica (ISTAT, the Italian Central Statistical Office) which releases the official estimates of the quarterly national accounts. As in many other European countries, in Italy the quarterly national accounts are estimated by resorting to indirect disaggregation methods. These methods, which can be traced back to Chow and Lin (1971), are based on the idea that it is possible to temporally disaggregate a low-frequency variable by using high-frequency related series. The literature on this topic is vast and we will not try to review it here; a critical appraisal is contained in Lupi and Parigi (1995) in this volume. However, the main idea is simple and can be loosely described as follows. Let
{Y_t}_1^(T_a) be an annual time series to be disaggregated at quarterly frequency and let {x_t}_1^T (where T = 4 T_a) be a quarterly time series whose annual value is related to Y_t according to the relation:

    Y_t = β Σ_{j=4t-3}^{4t} x_j + ε_t        (t = 1, …, T_a;  j = 1, …, T)

Then the quarterly estimate of Y_t can be derived by imposing a similar relationship among the quarterly variables.

In this paper we describe the methodology followed by ISTAT for producing anticipated quarterly estimates of the main aggregates of the Resources and Uses Account. In section 2 we review various aspects
1
A complete description of the Italian quarterly national accounts can be found in Istat (1992)
concerning the compilation of quarterly national
accounts in several European countries. Section 3
deals with the timing of the information needed in the
final estimation of the Italian quarterly accounts.
Section 4 is devoted to the discussion of the main
methodological issues. Section 5 illustrates some
experimental results relative to the flash estimate of
the Italian GDP for the first quarter 1995. Section 6
concludes.
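The disaggregation idea behind the annual-sum relation above can be sketched in a few lines. This is a deliberately simplified illustration: each annual residual is spread uniformly over its four quarters, whereas the actual Chow-Lin procedure distributes residuals by GLS under an assumed residual covariance structure.

```python
import numpy as np

def disaggregate(Y_annual, x_quarterly):
    """Regress annual Y on the annual sums of the quarterly indicator x,
    then spread each annual residual evenly over its four quarters so the
    quarterly estimates respect the annual totals (simplified Chow-Lin)."""
    Y = np.asarray(Y_annual, dtype=float)
    x = np.asarray(x_quarterly, dtype=float).reshape(len(Y), 4)
    x_annual = x.sum(axis=1)
    beta = (x_annual @ Y) / (x_annual @ x_annual)   # OLS through the origin
    resid = Y - beta * x_annual
    return (beta * x + resid[:, None] / 4.0).ravel(), beta

# Two annual totals and a related quarterly indicator (toy numbers).
Y = np.array([22.0, 57.0])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y_quarterly, beta = disaggregate(Y, x)
```

By construction the four quarterly estimates of each year add up exactly to the annual figure, which is the constraint the indirect methods must satisfy.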
2 Situation of quarterly national accounts in European countries

The situation of quarterly accounts is quite different among the various European countries. Some differences relate to concepts that are common to the annual accounts, while others arise from the specific nature of quarterly accounts. Here we present some of the main discrepancies in comparison with the Italian situation, how these influence the comparability of the figures among different countries, and how they affect the opportunity and feasibility of a flash estimate. We are particularly interested in the release timing of quarterly accounts in the different countries and in the revision process.

The main differences among the countries concern the following subjects:
• the calculation method
• the types of series produced
• the timing of publication
• the calendar and the extent of the revisions
• the extent of disaggregation

2.1 The calculation method

Quarterly accounts may be compiled using mainly two approaches: the direct approach and the indirect one. The former provides for a set of surveys so as to obtain “directly” the quarterly account aggregates, usually relying on the information provided by these surveys. However, there are countries, like the United Kingdom, which do not build quarterly accounts with a procedure separate from the annual one; that is, the quarterly accounts represent the main object of national accounts, while annual data are produced by aggregating the quarterly figures.

Considering only EU countries, we note that Belgium, Ireland, Greece and Luxembourg do not produce quarterly accounts. Nevertheless, some of these countries have already carried out preliminary studies to compile quarterly accounts, possibly using indirect methods.

An alternative approach is the indirect one: the annual figures are disaggregated by means of various techniques, which can use indicators related to the variable we want to disaggregate; in recent years many enhancements have been made in this field (Di Fonzo, 1987).

In table 1 we can see the methods used in the European countries. One should be aware that many countries that actually use direct methods integrate them with indirect ones for the items where they lack survey-based quarterly information.

Table 1: Quarterly national accounts in the main European countries

Country          Methodology    Starting date of time series
Austria          direct         1970
Denmark          direct         1977
Finland          mixed          1970
France           indirect       1970
Germany          direct         1968
Italy            indirect       1970
Netherlands      direct (I/O)   1977
Norway           direct         1978
Portugal         indirect       1977
Spain            indirect       1970
Sweden           direct         1980
United Kingdom   direct         1955

Some countries which base quarterly national accounts on direct methods use annual input-output tables as a check on the consistency of the quarterly figures; a peculiar case is the Netherlands, where a complete system of quarterly input-output tables, both at current and constant prices, is built. Concerning Finland, we indicated “mixed” in table 1 because it is difficult to identify a prevailing method in this case.

The choice between direct and indirect methods owes more to statistical tradition and empirical considerations than to theoretical ones. Anyway, it has some consequences for flash estimation. Indirect methods already provide relations between quarterly aggregates and related series, so the problem of flash estimation consists of selecting some of these related series on the basis of their early availability and/or the possibility of forecasting them. In the case of direct methods, one must first find such relationships. In addition, once they have been identified, they are likely to be less stable than the corresponding relations used with indirect methods, given the ad-hoc subjective adjustments more frequently made by countries that use direct methods.

2.2 Types of produced series

Quarterly data, unlike annual figures, are affected by the “problem” of seasonality. Given the use of quarterly data for short-term analysis, statistical agencies are often required to produce, in addition to the raw figures, seasonally adjusted ones. This raises some comparability problems among the adjusted figures of the different countries, due to the use of different seasonal adjustment methods. Moreover, even when the methods are the same, countries can use different options (sometimes the choice is among a large number of options, as in the X11 method), leading to quite different resulting series. Some countries also publish cycle-trend data, i.e. data from which the irregular component has also been removed.

In table 2 we report the different types of series produced by the European countries and the corresponding seasonal adjustment methods.

The main objective of flash estimates is to calculate the seasonally adjusted quarterly figures. The question then arises whether it is better to seasonally adjust the forecasted raw quarterly figures, or to estimate the seasonally adjusted ones directly. This heavily depends on the method already in use in each country. In our case the procedure to produce adjusted data consists of disaggregating annual figures by means of seasonally adjusted related series. So, when the adjusted indicator is available, we can directly estimate the seasonally adjusted quarterly series. Nevertheless, in case we have to forecast the related series, it is worth noting that a forecasting model can efficiently take into account the seasonal pattern of the series so as to improve the forecast itself. So the “best” procedure in this case can be sketched as in figure 1.
Table 2: Types of quarterly series supplied by the countries and method of seasonal adjustment used

Country          Raw data   Seasonally adjusted data   Cycle-trend data   Method
Austria          O          O*                         -                  SEATS
Denmark          O          O                          -                  X-11 ARIMA
Finland          O          O*                         -                  X-11 ARIMA
France           O          O                          -                  X-11 ARIMA
Germany          O          O*                         -                  BV, X-11
Italy            O          O                          O                  X-11 ARIMA
Netherlands      O          O                          -                  X-11
Norway           O          O                          O                  X-11 ARIMA
Portugal         O          O                          -                  -
Spain            O          -                          O                  Ad hoc
Sweden           O          O*                         -                  X-11 ARIMA
United Kingdom   O          O                          -                  X-11 (CSO version)

* Not available for all quarterly series
Figure 1: “Ideal” process of production of the flash estimate of a seasonally adjusted quarterly aggregate

[Flow chart: raw related series + additional information → forecasting model → forecast of raw related series → X-11 ARIMA seasonal adjustment → seasonally adjusted related series → quarterly disaggregation model (together with annual data) → seasonally adjusted quarterly aggregate]
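The pipeline of figure 1 can be mimicked with toy stand-ins for two of its boxes: a seasonal-naive model for forecasting the raw related series, and a crude additive adjustment in place of X-11 ARIMA. Both functions are illustrative assumptions, not the procedures actually used by ISTAT.

```python
import numpy as np

def seasonal_naive_forecast(x, period=4, horizon=1):
    """Forecast the raw related series: same quarter of the previous year
    plus the average year-on-year change (the 'forecasting model' box)."""
    x = np.asarray(x, dtype=float)
    drift = np.mean(x[period:] - x[:-period])
    return np.array([x[len(x) - period + h % period] + drift * (1 + h // period)
                     for h in range(horizon)])

def seasonally_adjust(x, period=4):
    """Remove additive seasonal factors, i.e. each quarter's mean deviation
    from the overall mean (a crude stand-in for the X-11 ARIMA box)."""
    x = np.asarray(x, dtype=float)
    factors = np.array([x[q::period].mean() for q in range(period)]) - x.mean()
    return x - np.tile(factors, len(x) // period + 1)[:len(x)]

# A strongly seasonal toy series: trend of 10 per quarter plus a fixed pattern.
raw = 10.0 * np.arange(12) + np.tile([5.0, -5.0, 3.0, -3.0], 3)
one_step = seasonal_naive_forecast(raw)             # forecast of the raw series
adjusted = seasonally_adjust(np.tile([5.0, -5.0, 3.0, -3.0], 3))
```

The point made in the text is visible here: because the forecasting model exploits the seasonal pattern, forecasting the raw series first and adjusting afterwards loses nothing.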
2.3 Timing of publication and process of revision

The timing of publication is also quite variable among the different countries, due to the different trade-offs between early availability of quarterly data and the extent of the revisions. We cannot give here a complete comparative study of the timing of publication and the extent of the revisions in the different countries, but we can show some figures that give an idea of the main differences between the various countries (figure 2).

We can see from figure 2 that some countries, like the USA, already provide very early estimates of quarterly accounts. In these cases the revisions are larger and can substantially affect the reliability of the first released figures. This effect can be even larger when the revisions due to seasonal adjustment are also taken into account. Among European countries, the UK provides the earliest estimate of quarterly GDP, three weeks after the end of the reference period (the figures provided are index numbers of output-based real GDP and its main components).
Figure 2: Timeliness of publication of the first five revisions of quarterly accounts
2.4 The extent of disaggregation

While ISTAT produces anticipated quarterly accounts series at a disaggregated level, the flash estimate is released at a fairly aggregated level, so that errors deriving from independent sources should compensate to a certain extent, leading to a more accurate estimate.

3 The situation of basic information in the Italian quarterly national accounts

Table 3 shows the current situation of the basic statistical information concerning the Italian quarterly national accounts at different delays after the end of the reference period, expressed as a percentage of the total information needed to calculate seasonally adjusted value added at constant prices for each branch. The total is calculated as a weighted average of the branch values, where each branch weight is the corresponding branch share of total real value added in 1990.

ISTAT releases its estimates about 95 days after the end of the reference period. Despite the long publication delay, these data are subject to revisions too, due to the incomplete set of information used at that delay and to the subsequent revisions of the annual figures (Di Fonzo et al., 1995). If a systematic pattern could be found in the subsequent revisions, it could be efficiently used to improve the first release and consequently the flash estimates of quarterly accounts (Patterson, 1995).

We can see that between 45 and 50 days there is a substantial increase in the available information, mainly because a complete set of information for manufacturing and the energy industry, consisting of industrial production indices, becomes available.

In addition we can compare the available information in 1986 and in 1994. In figure 3 we observe an apparent worsening in the delay (Giovannini, 1988); while this result is due to the substantial increase in the set of base information used, it nevertheless makes it more difficult to obtain a flash estimate of GDP. Figure 3 also shows that the availability of the base information concerning industry value added has two “steps”, corresponding to the availability of
Table 3: Available information at different time delays from the end of the reference period
(percentage of total information available after the given number of days)

Branch                                         0      15     30     45     60     75     90
Agriculture, forestry, fishing               100.0  100.0  100.0  100.0  100.0  100.0  100.0
Industry, strict sense                        33.3   33.3   66.7   66.7  100.0  100.0  100.0
Building and construction                      8.3    8.3   16.7   16.7   25.0   25.0   50.0
Recovered goods and reparation                16.7   33.3   50.0   66.7   83.3  100.0  100.0
Trade, lodging and catering                   16.7   33.3   50.0   66.7   83.3  100.0  100.0
Inland transport                               0.0    0.0   33.3   33.3   66.7   66.7  100.0
Air and water transport                        8.3   16.7   16.7   41.7   41.7   58.3   66.7
Supporting services to transport               1.4    2.7   30.6   34.7   62.6   65.3   94.6
Communication                                  0.0    0.0    0.0    0.0    0.0    0.0   50.0
Banking, finance and insurance services        0.0    0.0   16.7   16.7   33.3   33.3  100.0
Business services                              0.0    0.0    0.0    0.0    0.0    0.0    0.0
Building lease                                 0.0    0.0    0.0    0.0    0.0    0.0    0.0
Health, education and recreational services    0.0    0.0   11.1   11.1   55.6  100.0  100.0
Non-market services                          100.0  100.0  100.0  100.0  100.0  100.0  100.0
Total                                         29.3   32.5   48.1   51.4   67.6   70.9   92.7
the industrial production index, thirty and sixty days after the reference period. This means that if we were able to produce a reliable estimate of the third-month industrial production index, we could have a flash estimate of GDP with a 30/45-day delay from the end of the reference period. One way to do that is by means of business survey results, which are compiled monthly by ISCO. They are quickly available, about 30 days after the end of the reference period. A way to use them to forecast the industrial production index is shown in Gennari (1991).

The strategy of the flash estimate exercise is then to forecast the industry value added by means of business surveys. The same applies to the building sector. Part of market and most of non-market services are forecasted using ARIMA models. This strategy is consistent with the observation that the turning points in GDP are well identified by the turning points of the industrial production index (Parigi and Schlitzer, 1995).

Figure 3: Available information at different time delays in 1986 and 1994

4 Anticipating the quarterly national accounts estimates

Since the quarterly national accounts series are estimated by indirect methods, two possible forecasting routes are conceivable:
i) direct forecast of the quarterly series;
ii) forecast of the indicator series and time disaggregation along the standard procedure.

In general there is no solid theoretical foundation to assert whether (i) or (ii) is to be preferred. In fact, let {y_t}_1^T be the quarterly series, {x_t}_1^T the vector of indicator series, {u_t}_1^T the quarterly residuals, β the vector of coefficients which relates y_t to x_t, and {z_t}_1^(T+1) a vector of exogenous variables; further, let F_T = ({x_t}_1^T, {u_t}_1^T, {z_t}_1^(T+1)) be the information set and E(·|F_T) the conditional expectation as of time T. Then

    E(y_(T+1) | F_T) = β E(x_(T+1) | F_T) + E(u_(T+1) | F_T)

so that, if E(u_(T+1) | {z_t}_1^(T+1)) = 0, then strategy (i) and strategy (ii) are equivalent.

Figure 4: Percentage “ignorance” at different time lags for base statistics information used to estimate value added at constant prices
However, the fact that the final estimates are revised quarterly constitutes a good reason why (ii) can be preferred to (i). Revisions are motivated both by revisions of the seasonally unadjusted indicator series and by the seasonal adjustment carried out on the indicators themselves. Strategy (ii) has the advantage of exploiting the information conveyed by the revisions of the indicator series. Consistency of the flash estimate with the final quarterly one also suggests the opportunity of forecasting the missing indicators and using the same procedure to derive both the anticipated estimate and the final one. Therefore, we decided to follow this route in the present paper.

A particular situation arises when the indicator series are themselves the result of a preceding disaggregation step. Given the limited number of high-frequency indicators, this is not an infrequent case in the estimation of the Italian quarterly national accounts, and in the present study we have been forced to consider these disaggregated series as truly observed ones. However, it should be highlighted that doing so can cause problems in estimation, inference, and forecasting, both because of the “generated regressors” problem (Pagan, 1985) and because of noticeable data revisions 2.

Let us now consider strategy (ii) in greater detail. Two different settings can be distinguished:

(ii-a) a vector of exogenous variables {z_t}_1^(T+1) is available which is useful in forecasting x_(T+1);
(ii-b) no exogenous information is available to forecast x_(T+1).

Note that by “exogenous information” we mean essentially a proxy of the variable to be forecasted. Therefore, consistently with current quarterly national accounts estimation, we do not use behavioural models that rely on particular economic theories. In this sense our exercise differs considerably from a typical economic forecasting problem. In all those cases in which (ii-a) applies, we estimate single-equation dynamic models of the kind

    α_0(L) x_t = α_1(L) z_t + v_t

where the α_i(L)'s are finite polynomials in the lag operator L with all roots not less than 1 and v_t is a white noise process. We identify these models using a general-to-specific approach (Hendry, 1995). In particular, the model is estimated by OLS, usually using eight lags for the lag polynomials α_i(L). Then the usual misspecification tests are carried out on each reduction of the model. In particular, we test for residual serial correlation using the LM test (in the F form; see Breusch and Pagan, 1980, and Harvey, 1981) over various lags (usually from 4 to 12). We then check for ARCH structure in the residuals (Engle, 1982) and for residual normality (Jarque and Bera, 1980). The null of linearity is tested against polynomial alternatives, while stability is checked using recursive Chow tests. In general the selected models satisfy all these tests at the usual confidence levels. Those cases in which the normality, stability, and sometimes autocorrelation statistics are significant are considered as possibly affected by the presence of outliers or structural breaks. Impulse or step dummy variables are then included according to the indications arising from the recursive Chow tests and the graphical analysis of the residuals. Finally, in order to test the forecasting ability of the final (restricted) model, its forecasting performance is checked against the one-step-ahead forecasts of the unrestricted model over the last eight quarters. These comparisons generally highlight a clear superiority of the restricted model.

Up to this point we did not address two major issues. The first is related to the choice of using single-equation instead of multi-equation models; the second concerns the possible presence of unit roots in the series under examination. As is well known, weak exogeneity in the sense of Engle et al. (1983) is the necessary condition for correct inference in conditional (single-equation) models. In the present case, parameter stability is regarded as indirect evidence of weak exogeneity (Favero and Hendry, 1992). However, it should be stressed that the
2 Note that this is true also for all the models which use disaggregated time series. However, this fact does not appear to have been fully realized by model proprietors.
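The estimation-and-testing loop described above can be sketched as an OLS distributed-lag regression plus a Breusch-Godfrey-style LM statistic. This is a simplified illustration: the auxiliary regression uses only lagged residuals, omitting the original regressors, and none of the other diagnostics (ARCH, normality, RESET, recursive Chow) are reproduced.

```python
import numpy as np

def lagmat(v, lags):
    """Rows [v_{t-1}, ..., v_{t-lags}] for t = lags, ..., n-1."""
    n = len(v)
    return np.column_stack([v[lags - k:n - k] for k in range(1, lags + 1)])

def fit_adl(x, z, p=8):
    """OLS estimation of a0(L) x_t = a1(L) z_t + v_t with p lags on each
    polynomial (the paper's general model starts from eight lags)."""
    X = np.column_stack([np.ones(len(x) - p), lagmat(x, p),
                         z[p:], lagmat(z, p)])
    beta, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    return beta, x[p:] - X @ beta

def bg_lm(resid, lags=4):
    """LM statistic n*R^2 from regressing residuals on their own lags."""
    y, R = resid[lags:], lagmat(resid, lags)
    R = np.column_stack([np.ones(len(y)), R])
    b, *_ = np.linalg.lstsq(R, y, rcond=None)
    ss_res = np.sum((y - R @ b) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return len(y) * (1.0 - ss_res / ss_tot)

# Noise-free toy data generated by x_t = 0.5 x_{t-1} + z_t: the coefficients
# (no intercept, one lag of x, current z, no lag of z) are recovered exactly.
z = np.sin(np.arange(60.0))
x = np.zeros(60)
for t in range(1, 60):
    x[t] = 0.5 * x[t - 1] + z[t]
beta, resid = fit_adl(x, z, p=1)
```

A large LM statistic relative to the chi-squared critical value for the chosen number of lags would, as in the paper, send the modeller back to respecify the equation.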
variable-indicator setting can induce nontrivial complications as far as exogeneity is concerned (Lupi, 1995). As far as the unit root problem is concerned, various aspects should be considered. First, unrestricted models (in the sense of not having imposed unit roots) do not necessarily forecast worse than restricted ones in the presence of I(1) variables (see e.g. Clements and Hendry, 1994). Secondly, our dynamic equation can be interpreted as a single row of a VAR model, and the parameter estimates are consistent (Sims et al., 1990) 3. Finally, it is well known that standard unit root tests have low power with the small temporal samples typical of the applications considered here 4.

When situation (ii-b) is relevant, there is not much choice about the methods to be used, and the standard ARIMA(p,d,q) model

    φ(L) δ(L) x_t = µ + θ(L) u_t

is utilised. In this model φ(L) and θ(L) are polynomials in L of order p and q respectively, having all their roots outside the unit circle; δ(L) is a polynomial of order d in L with all roots equal to one; µ is a constant term.

The forecasting performance of ARIMA models depends crucially not only on the correct identification of the polynomial orders (p,d,q) but also on the preliminary correction of the outliers potentially present in the series. Correct identification and correction of the outliers would presume the use of a vast number of statistical and graphical tools and the use of qualitative information on the single economic phenomena. However, the flash estimate of the quarterly national accounts needs a large number of forecasts to be obtained in a relatively short period of time, so this route would be impractical. In order to make outlier identification as automatic as possible, we decided to resort to the software TRAMO (Gomez and Maravall, 1994b). This software in fact allows the automatic selection of the orders p, d, q of the ARIMA model and the correction of four types of outliers: additive and innovative outliers are considered, along with transitory changes and level shifts. Loosely speaking, the automatic procedure operates in three phases. In the first, the order d necessary to make the series stationary is selected; this is done avoiding over-differencing, i.e. introducing unit roots in the MA polynomial θ(L). In the second phase the orders p and q are chosen by means of the Bayes Information Criterion (BIC). Finally, the kind and position of the outliers are identified using LM tests for each observation. When outliers are found, the procedure starts again from the first step. Further details can be found in Gomez and Maravall (1994a, 1994b).

5 Some results

In order to further motivate our choice of forecasting the indicators instead of the final estimate, in table 5 we report an example based on the forecasts obtained by using a simple dynamic model for the value added of the building sector. The model exploits the information deriving from the ISCO monthly business survey. In particular, value added is forecasted using the level of orders in the building sector. From the statistics reported in table 4 it is not possible to detect any clear sign of misspecification.
Table 4: Building Sector Model
Diagnostic Test
value
p-value
R2
0.9614
F(4,48)
298.64
DW
2.3000
AR 1-4 F(4,44)
1.8237
0.1413
ARCH 4 F(4,40)
0.0211
0.9991
Normality Chi2(2)
1.6415
0.4401
Xi2 F(7,40)
0.8054
0.5878
RESET F(1,47)
0.0421
0.8384
0.0000
3
However, we are aware of the fact that in the presence of unit roots nontrivial complications for i nference may
arise.
4
The typical sample in this study is 1980q1-1994q4
232
TABLE OF CONTENTS
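The BIC-based order selection used in the second phase can be sketched in pure Python. This is a simplified illustration on simulated data, not the actual TRAMO procedure: it fits AR(p) models by OLS for a few candidate orders and keeps the one with the smallest BIC (the differencing and LM outlier phases are omitted, and the simulated coefficient 0.8 is an arbitrary illustrative value).

```python
import math
import random

def fit_ar(x, p):
    """OLS fit of an AR(p) with intercept; returns (coefficients, RSS)."""
    if p == 0:
        m = sum(x) / len(x)
        return [m], sum((v - m) ** 2 for v in x)
    rows = [[1.0] + [x[t - i] for i in range(1, p + 1)] for t in range(p, len(x))]
    y = x[p:]
    k = p + 1
    # normal equations (X'X) beta = X'y, solved by Gaussian elimination
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    M = [row[:] + [b] for row, b in zip(XtX, Xty)]
    for c in range(k):
        piv = max(range(c, k), key=lambda i: abs(M[i][c]))
        M[c], M[piv] = M[piv], M[c]
        for i in range(c + 1, k):
            f = M[i][c] / M[c][c]
            for j in range(c, k + 1):
                M[i][j] -= f * M[c][j]
    beta = [0.0] * k
    for i in range(k - 1, -1, -1):
        beta[i] = (M[i][k] - sum(M[i][j] * beta[j] for j in range(i + 1, k))) / M[i][i]
    rss = sum((yi - sum(b * r for b, r in zip(beta, row))) ** 2
              for row, yi in zip(rows, y))
    return beta, rss

def bic(rss, n, k):
    # standard form: n*log(RSS/n) + k*log(n); extra parameters are penalised
    return n * math.log(rss / n) + k * math.log(n)

# Simulated AR(1) series with coefficient 0.8 (illustrative data only).
random.seed(0)
x, prev = [], 0.0
for _ in range(500):
    prev = 0.8 * prev + random.gauss(0.0, 1.0)
    x.append(prev)

# Choose p as the BIC minimiser over p = 0, 1, 2.
fits = {p: fit_ar(x, p) for p in range(3)}
best_p = min(fits, key=lambda p: bic(fits[p][1], len(x) - p, p + 1))
print("selected p =", best_p)
```

Note that, for simplicity, the candidate models are compared on slightly different effective samples (n − p observations); TRAMO's actual identification step is more careful than this sketch.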
The one-step ahead forecasting performance of the model, taking the value added series as fixed, is good (see table 5). However, the real value added series, as well as all the other quarterly national accounts series, is revised quarterly and is not fixed 5.

When the true series is used for estimating and forecasting the model, much worse predictions arise (table 5). This fact clearly reinforces our decision of forecasting the indicator instead of the final aggregate. In table 6 the model diagnostics relative to the indicator series of the value added of the building sector are reported. Again, these tests do not suggest any misspecification of the model. Its forecasting performance is satisfactory apart, perhaps, from the result associated with the second quarter of 1993 (see table 7).

Table 5: One-step ahead forecasts of the building sector value added changes

Date    Var% a    Var% b    Var% c    Var% d
93.1    -2.1      -2.1      -0.2      -1.7
93.2    -1.7      -1.7      -0.3      -1.2
93.3    -1.3      -0.3      -0.5      -2.5
93.4    -1.2      -1.2      -0.4      -2.4
94.1    -0.9      -1.1      -0.7      -1.7
94.2    -1.2      -1.9      -1.0      -1.2
94.3    -1.3      -2.2      -1.2      -0.4
94.4    -0.8      -0.5      -0.8      -0.3

Notes:
(a) as of 1994.4
(b) one step ahead forecast treating the series as fixed as of 1994.4
(c) as of the reference date
(d) one step ahead forecast using the series as of the reference date

Table 6: Building Sector Indicator Model

Diagnostic test        value     p-value
R2                     0.9477
F(8,46)                104.24    0.0000
DW                     2.3400
AR 1-4 F(4,42)         1.1908    0.3289
ARCH 4 F(4,38)         0.1107    0.9780
Normality Chi2(2)      4.1824    0.1235
Xi2                    0.5163    0.9114
RESET                  0.7536    0.3899

Table 7: One-step ahead forecasts of the building sector indicator changes

Date    True    Forecast
93.1    -2.1    -3.6
93.2    -1.8    -0.2
93.3    -1.4    -2.9
93.4    -1.4    -1.2
94.1    -1.2    -1.7
94.2    -1.4    -1.5
94.3    -1.5    -0.8
94.4    -0.5    -0.7

Similar situations arise in the other cases for which exogenous information is available to forecast the indicator variables, such as manufacturing, energy and gas, agriculture, and trade. These sectors in 1994 represented about 67% of the market goods and services value added. Where no exogenous information could be used in forecasting the indicator series, we used univariate forecasts of the indicators themselves. On this basis, in order to produce an anticipated estimate of the supply side growth of the economy, we decided to adopt the strategy of using the indicator series augmented by the one-step ahead predictions (where necessary) in the disaggregation procedure actually used to produce the final estimate.

5 The last two years of the quarterly national accounts are revised quarterly. In addition, when the fourth quarter of a specific year is estimated, the quarterly national accounts series are revised back to five years before. A study of the revisions of the Italian quarterly national accounts is in Di Fonzo et al. (1995).

Unfortunately, there are cases in which a disaggregated variable is used as an indicator of a further disaggregation step. This happens in particular
in some cases related to the service sector, where there are more problems in finding relevant exogenous information. In these cases, at least until we have completed the flash estimates for the demand side, we are forced to treat the disaggregated indicator as a truly observed one, and the caveats exposed before apply.

As a final example we provide the results relative to the flash estimate of sectoral value added growth rates for the first quarter of 1995. These are reported in table 8. At the time the experiment was performed, the available indicators were essentially the industrial production index and most of the price indices, along with some minor indicators. On the contrary, the service and building sector indicators had to be forecast, together with the remaining price indices. A particular case is represented by the agricultural sector, where an indicator is usually readily available but is revised very frequently. The information set just described is typically available 60 days after the end of the reference quarter. This would allow us to produce a first estimate of the market economy value added growth rate 35 days in advance with respect to the publication of the complete set of the quarterly national accounts.

Table 8: Value added % changes of quarterly national accounts and flash estimate, 1995.1

Sector                              QNA     Flash
Agriculture, forestry and fishing   6.0     6.0
Manufacturing                       3.0     3.2
Energy and gas                      1.6     1.6
Building                           -0.7    -0.6
Market Services                     0.6     0.5
Market Goods & Services             1.5     1.5

6 Conclusions

This paper describes the methodology adopted by ISTAT for the flash estimate of Italian real GDP. As it stands, the estimate is derived mainly from the supply side, given that reliable information related to production is more readily available. The study highlights the opportunity of using forecasts of the indicator series in the standard disaggregation procedures instead of producing direct forecasts of the sectoral value added series. The forecasts of the indicator series are obtained by utilising dynamic single-equation models, which exploit the information embodied in the business survey data provided by ISCO. When there is no genuine exogenous information, we rely on ARIMA-based procedures with outlier correction. Following this approach, we are able to provide an early estimate of constant prices GDP growth rates 60 days after the reference quarter, anticipating the release of the ordinary estimate by about 35 days. The reported results suggest that this methodology is able to produce accurate anticipations. This is encouraging towards the implementation of a flash estimate of the whole resources and uses account.
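As a rough summary of the one-step-ahead accuracy reported in table 7, the mean absolute error of the indicator forecasts can be computed directly; a minimal sketch with the values transcribed from the table:

```python
# True vs. forecast quarterly % changes of the building sector
# indicator, transcribed from table 7 (1993.1-1994.4).
table7 = [
    ("93.1", -2.1, -3.6),
    ("93.2", -1.8, -0.2),
    ("93.3", -1.4, -2.9),
    ("93.4", -1.4, -1.2),
    ("94.1", -1.2, -1.7),
    ("94.2", -1.4, -1.5),
    ("94.3", -1.5, -0.8),
    ("94.4", -0.5, -0.7),
]

# Mean absolute error of the one-step-ahead forecasts.
errors = [abs(true - fc) for _, true, fc in table7]
mae = sum(errors) / len(errors)
print(f"MAE = {mae:.3f} percentage points")
```

The largest misses are concentrated in 1993, consistent with the remark about the second quarter of 1993 in the text.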
References

Breusch, T.S., and Pagan, A.R. (1980): "The Lagrange Multiplier Test and Its Applications to Model Specification in Econometrics", Review of Economic Studies, 47, 239-253.
Chow, G., and Lin, A.L. (1971): "Best Linear Unbiased Interpolation, Distribution and Extrapolation of Time Series by Related Series", Review of Economics and Statistics, 53, 372-375.
Clements, M., and Hendry, D.F. (1994): "Towards a Theory of Economic Forecasting", Economic Journal.
Di Fonzo, T. (1987): La stima indiretta di serie economiche trimestrali. Padova: CLEUP Editore.
Di Fonzo, T., Pisani, S., and Savio, G. (1995): "Revisions to Italian Quarterly National Accounts", in this volume.
Engle, R.F. (1982): "Autoregressive Conditional Heteroscedasticity, with Estimates of the Variance of United Kingdom Inflation", Econometrica, 50, 987-1008.
Engle, R.F., Hendry, D.F., and Richard, J.-F. (1983): "Exogeneity", Econometrica, 51, 277-304.
Favero, C., and Hendry, D.F. (1992): "Testing the Lucas Critique: A Review", Econometric Reviews, 11, 256-306.
Gennari, P. (1991): "L'uso delle Indagini Congiunturali ISCO per la Previsione degli Indici della Produzione Industriale", Rassegna dei Lavori dell'ISCO, n. 13.
Giovannini, E. (1988): "A Methodology for an Early Estimate of Quarterly National Accounts", Economia Internazionale, 41, 197-215.
Gomez, V., and Maravall, A. (1994a): "Estimation, Prediction, and Interpolation for Nonstationary Series with the Kalman Filter", Journal of the American Statistical Association, 89, 611-624.
Gomez, V., and Maravall, A. (1994b): "Program TRAMO: Time Series Regression with ARIMA Noise, Missing Observations, and Outliers", European University Institute, Florence, mimeo.
Harvey, A.C. (1981): The Econometric Analysis of Time Series. London: Philip Allan.
Hendry, D.F. (1995): Dynamic Econometrics. Oxford: Oxford University Press.
Istat (1992): "I conti economici trimestrali con base 1980", Note e Relazioni, 1992, n. 1.
Jarque, C.M., and Bera, A.K. (1980): "Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals", Economics Letters, 6, 255-259.
Lupi, C. (1995): "Stationary Measurement Errors Among Cointegrating Variables", ISTAT, mimeo.
Lupi, C., and Parigi, G. (1995): "Time Series Disaggregation of Economic Variables: Some Econometric Issues", in this volume.
Pagan, A.R. (1985): "Econometric Issues in the Analysis of Regressions with Generated Regressors", International Economic Review, 25, 221-247.
Parigi, G., and Schiltzer, G. (1995): "Quarterly Forecasts of the Italian Business Cycle by Means of Monthly Economic Indicators", Journal of Forecasting, 14, 117-141.
Patterson, K.D. (1995): "An Integrated Model of the Data Measurement and Data Generation Process with an Application to Consumers' Expenditure", Economic Journal, 105, 54-76.
Sims, C.A., Stock, J.H., and Watson, M.W. (1990): "Inference in Linear Time Series Models with Some Unit Roots", Econometrica, 58, 113-144.
Smith, P. (1995): "The Timeliness of Quarterly Income and Expenditure Accounts: An International Comparison", Joint OECD-Hungarian Central Statistical Office Workshop on quarterly national accounts, Budapest, 21-24 February.
Using data of qualitative business surveys for the estimation of Quarterly National Accounts

Carlos Coimbra 1
Gabinete de estudos económicos do INE & Inst. superior ciências do trabalho e da empresa

The data from qualitative business surveys are usually available earlier than the data from conventional quantitative statistical sources, so the qualitative data may be useful for the estimation of quarterly accounts. The problem of how to use the data from qualitative surveys for the estimation of quarterly national accounts is very relevant in the Portuguese case. Often, we do not have all the intra-annual quantitative data usually available in the other countries of the European Union, or we have it with a relative delay. In this paper we present one of the research directions we are developing on this problem, which does not follow the Carlson and Parkin approach. In general terms, we study the possibility of obtaining quantitative indicators from the qualitative results of business surveys, using an econometrically simple method, in a way that preserves all the aggregated information they provide.
1 Introduction

The data from qualitative business surveys are usually available before data from conventional quantitative statistical sources. Some of these qualitative data may be used to produce quantitative related indicators for the estimation of quarterly accounts. The usual form of quantification, the balance, is rather unsatisfactory. However, its simplicity probably explains why it is so popular. Maybe it is an impossible task to find another form of quantification that maintains that characteristic of simplicity and, at the same time, does not have the problem of insufficiency affecting the balance quantification.

In particular, in this paper we are concerned with how to use the data from qualitative surveys for the estimation of quarterly national accounts or, more precisely, for the first (fast) estimates of quarterly accounts. This problem is very relevant in the Portuguese case because we do not have all the intra-annual quantitative data usually available in the other countries of the European Union, or we have quantitative information with a considerable delay. In this paper we present one of the research directions we are developing, which is not in the line of the Carlson and Parkin approach.

In this paper we develop another way of quantification which does not have the same degree of simplicity as the balance, but which intends to correct its insufficiency problem.

1 The author wishes to thank J. Santos Silva for his assistance in developing the final version of the econometric model presented in this paper.
In the next section we discuss an alternative way of quantifying the business survey information. In the last section we provide an illustration for the case of Portuguese exports.

2 An approach less naive than the balance

In general, qualitative business surveys pose questions about the sign of the variation of some variables that are important for short term economic analysis. As a result, the responses are aggregated in three complementary percentages. Consider P1 as the percentage of positive responses, P2 the percentage of neutral responses, i.e., the percentage of responses that report no variation, and P3 the percentage of negative responses.

The variation V of the variable questioned can be seen as a weighted average of the average variations inside the three groups of answers. For example, if the question is about the production evolution, V corresponds to a weighted average of the production variation rate of the concerned economic sector. Denoting by V1 the average variation rate inside the positive responders (firms), V2 the average variation rate inside the "neutral" responders, and V3 the average variation inside the negative responders (firms), in a given month or quarter we can write the following identity equation:

V = V1·P1 + V2·P2 + V3·P3   (1)

Supposing that, like the percentages, V is observable, it is impossible to apply OLS to estimate the implicit average variations V1, V2 and V3 because of the perfect collinearity of the regressors.

One easy way to surpass this problem, without losing information, is transforming (1) into:

V = β0 + β1 S + β2 P2   (2)

where S = P1 − P3, β0 = 1/2(V1 + V3), β1 = 1/2(V1 − V3) and β2 = V2 − 1/2(V1 + V3).

If P2 is not zero, the use of the balance S will not be misleading only if V2 = 1/2(V1 + V3). In fact, even if V2 = 0, P2 is relevant to define V, unless V1 and V3 are symmetric. So, in order to monitor V, one should pay attention both to the balance and to the P2 answers. The question is how to combine these two indicators in a proper manner.

An obvious way is to use equation (2) to estimate a composite indicator, taking as proxy regressors the extrapolated proportions from the surveys, which does not have the problem of perfect collinearity present in (1).

However, the β coefficients are linear combinations of the variation rates implicit in each group of responders. Those rates will not be constant; in fact, they are probably affected by the business cycle. Consequently, the same should apply to the β coefficients. But the P2 answers and S are also affected by the business cycle. Thus, one possible way to take into account the variability over time of the β is to consider, for a given month or quarter, the three following stochastic equations:

βi = α0,i + α1,i S + α2,i P2 + εi,   (i = 0, 1, 2)   (3)

where, in each equation, there is a residual term εi. After substituting the βi in (2) by (3) we obtain:

V = α0,0 + (α1,0 + α0,1)S + (α2,0 + α0,2)P2 + α1,1 S² + (α2,1 + α1,2)S·P2 + α2,2 P2² + v   (4)

where v = ε0 + ε1 S + ε2 P2.

Even if it were possible to run a regression in (4), there would be an identification problem on some of the α parameters. Fortunately, this problem is not relevant in practical terms, since the objective is not to estimate the α parameters but simply to approximate V as a function of the available information. Moreover, we do not observe the true proportions, i.e., we ignore the effective values of S and P2. We only obtain from the qualitative surveys proxy proportions, the proportions extrapolated from their reference samples. That is, we only obtain S̃ and P̃2. So we can rewrite (4) as:

V = γ0 + γ1 S̃ + γ2 P̃2 + γ3 S̃² + γ4 S̃·P̃2 + γ5 P̃2² + η   (5)
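Before moving on, the reparameterisation of identity (1) into (2) can be checked numerically; a small sketch with hypothetical within-group rates and response shares (illustrative values, not survey data):

```python
# Hypothetical within-group variation rates and response shares
# (made-up numbers chosen only to illustrate the algebra).
V1, V2, V3 = 2.0, 0.5, -1.0          # positive / neutral / negative groups
P1, P2, P3 = 0.5, 0.3, 0.2           # shares, summing to one

# Identity (1): V as a weighted average of the group rates.
V_direct = V1 * P1 + V2 * P2 + V3 * P3

# Reparameterisation (2): V = b0 + b1*S + b2*P2 with S = P1 - P3.
S = P1 - P3
b0 = 0.5 * (V1 + V3)
b1 = 0.5 * (V1 - V3)
b2 = V2 - 0.5 * (V1 + V3)
V_reparam = b0 + b1 * S + b2 * P2

print(V_direct, V_reparam)  # the two formulations coincide
```

The equality holds exactly because P1 + P2 + P3 = 1, which is what allows the three collinear regressors of (1) to be folded into the balance S and the neutral share P2.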
Supposing η is a homoscedastic and not serially correlated residual variable 2, one may apply OLS to estimate V. This estimate can be seen as a quantitative indicator potentially useful for quarterly accounts estimation.

Next, taking the Portuguese exports as an example, we first show that one should not ignore the P2 answers and then use equation (5) to estimate the exports, with satisfactory results.

3 An empirical illustration

In the manufacturing industry monthly business survey there is a question about the variation of new orders from abroad. We have the extrapolated proportions of answers to that question since 1987. Unfortunately, we only have the exports, at constant prices, at a quarterly frequency. For that reason, we have aggregated the monthly data into quarterly data. After studying the appropriate lags, we estimated three models using the TSP package.

First we estimated the following model by OLS:

Ṽt = δ0 + δ1 S̃t + µt   (6)

The estimates for this model are (t-ratios in brackets):

δ̂0 = 1.11 (73.0); δ̂1 = 0.15 (4.7)
R² = 0.46; White's heteroscedasticity test = 0.85; DW = 1.08.

To assess the importance of the neutral answers proportion, we estimated the new model:

Ṽt = δ0 + δ1 S̃t + δ2 P̃2t + εt   (7)

The estimates for this model are (t-ratios in brackets):

δ̂0 = 0.61 (2.4); δ̂1 = −0.04 (−0.4); δ̂2 = 0.35 (1.9)
R² = 0.52; White's heteroscedasticity test = 5.5; DW = 1.07.

Even if it is not clear that we do not have a serial correlation problem, the above results suggest that P2 is a variable that should be considered. However, this model is not satisfactory because there is some evidence of collinearity between the regressors and we are supposing the parameters constant. So we have estimated model (5), and obtained the following estimates:

γ̂0 = −10.4 (−2.0); γ̂1 = −8.0 (−2.2); γ̂2 = 15.5 (2.1);
γ̂3 = −2.9 (−2.1); γ̂4 = 11.0 (2.2); γ̂5 = −10.5 (−2.0)
R² = 0.55; White's heteroscedasticity test = 13.5; DW = 1.40.

Apparently, this model is a step in the right direction. The non-linearities seem to be rather significant 3, suggesting that one should account for the time variation in the coefficients of equation (2).

Naturally, the approach adopted here to model the variation of the coefficients is just one among other alternatives. The main points are that one should not waste the neutral answers proportion when deriving proper related indicators for the quarterly accounts estimation, and that equation (2) may be an interesting starting point.

2 These are not indispensable hypotheses. In fact, they should be tested case by case and, if they do not hold in a particular case, the conventional alternatives should be applied.
3 Due to the possibility of serially correlated errors, we have estimated a covariance matrix that is robust to this type of problem. However, the results obtained are qualitatively equivalent.
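A regression like (6) or (7) is plain OLS on the survey proportions. A minimal pure-Python sketch on synthetic, noise-free data (the series and coefficients below are hypothetical, not the Portuguese export data):

```python
# OLS via the normal equations (X'X) b = X'y, solved by Gaussian
# elimination. Synthetic data with known coefficients.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ols(X, y):
    """Least squares coefficients for y = X b."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Quarterly balance S and neutral share P2 (made-up values).
S  = [0.10, -0.05, 0.20, 0.15, -0.10, 0.05, 0.25, 0.00]
P2 = [0.40, 0.50, 0.35, 0.45, 0.55, 0.40, 0.30, 0.50]
true_b = [0.60, 0.10, -0.05]            # intercept, S and P2 coefficients
X = [[1.0, s, p] for s, p in zip(S, P2)]
y = [true_b[0] + true_b[1] * s + true_b[2] * p for s, p in zip(S, P2)]

b = ols(X, y)
print(b)  # recovers the true coefficients up to rounding error
```

On real survey data the fit would of course not be exact, and, as footnote 2 stresses, the residual assumptions behind OLS standard errors should be tested case by case.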
References

Batchelor, R.A., 1982, "Expectations, output and inflation: The European Experience", European Economic Review, 17.
Batchelor, R.A., and A.B. Orr, 1987, "Inflation Expectations Revisited", Economica, 55.
Carlson, J.A., and M. Parkin, 1975, "Inflation Expectations", Economica, 42.
Dasgupta, S., and K. Lahiri, 1992, "A comparative study of alternative methods of quantifying qualitative survey responses using NAPM data", Journal of Business & Economic Statistics, vol. 10.
Person, B.A., 1977, "Balance Series as true and false indicators of qualitative change: Theory and Practice", CIRET, 13th Conference.
Person, B.A., 1985, "On the relationship between qualitative and quantitative indicators: Outline of an explanatory model", CIRET, 17th Conference.
Soares, M.C., 1993, "Indicadores quantitativos a partir de respostas qualitativas: evolução das exportações em volume", Boletim Trimestral do Banco de Portugal, first quarter of 1993.
Estimating Quarterly Regional Accounts for Southern Italy

Margherita CARLUCCI and Roberto ZELLI 1
University of Rome "La Sapienza"
1 Introduction

The availability of sub-national data in EU Countries is crucial to analyse regional disparities and to identify region-specific policies. This is especially true in Italy, where the South has experienced different patterns of growth with respect to the Northern part of the Country. National accounting data issued by the National Statistical Institute (Istat) at regional level concern only annual data, hence no attempt has been made so far to capture the conjunctural differences between North and South Italy.

To study Italian dualism on the eve of the 1992 European unification, we made a first attempt at disaggregating regional annual data by deriving quarterly figures of the domestic product account and of value added by kind of activity for the Northern and Southern economies. As far as we know, no other regional disaggregation of the kind has been performed by either institutional or private researchers, at least for Italy. Therefore, this attempt may give helpful suggestions for users who are interested in building quarterly regional data for conjunctural purposes. Moreover, since the official regional data are released with considerable delay, the construction of a quarterly model allows one to provide forecasts of annual data.

The paper is set out as follows. In the second section a brief review of the methodology used is given. The third section concerns the choice of quarterly indicators available at sub-national level and the estimation procedure to obtain quarterly data. The fourth section describes the results obtained both in terms of historical quarterly profile and in terms of extrapolation of annual data.

2 Methodology

The methodology used is based on Barbone, Bodo and Visco (1981, BBV from now on), who modified the Chow and Lin (1971) method. This method is consistent with the one currently used by Istat to derive quarterly data of national accounts (Istat, 1992). As is well known, this method is based on a linear relation between the unknown quarterly data of interest and known quarterly indicators. While other "smoothing" methods also based on indicators imply two subsequent steps, first the estimate of quarterly values and then their correction according to the annual constraint, this one automatically takes the annual constraint into account in the estimation procedure.

1 This paper is the outcome of joint work of the Authors; however, Roberto Zelli wrote sections 1 and 3, Margherita Carlucci wrote sections 2 and 4.
Denoting with Y the unknown (4n × 1) vector of observations, with X the (4n × k) matrix of related indicators and with β the (k × 1) vector of parameters, the following relation applies:

Y = Xβ + u

where the (4n × 1) random vector u has mean zero and covariance matrix V. Defining the aggregation matrix B, we obtain the estimable relation on the annual data y:

y = BY = BXβ + Bu

In this case the Best Linear Unbiased Estimator consistent with the aggregate time series is given by:

Y* = Xb + VB'(BVB')⁻¹(y − BXb)

where

b = (X'B'(BVB')⁻¹BX)⁻¹ X'B'(BVB')⁻¹ y.

Therefore, the estimated Y* are obtained as the sum of a systematic component, deriving from the estimated regression, and an "error" one, which for each year distributes the bias between the sum of the regression quarterly data and the observed annual aggregate among the four estimated values according to the values of V. In other terms, higher variances suggest less reliability of the regression quarterly values, thus leading to a higher correction.

In most cases, the relationship between short-run movements in Y and Xβ is fairly stable, but the levels of Y and Xβ may vary over time: the residuals will then exhibit serial correlation (Litterman, 1983). Thus, random errors are supposed to be generated by a first-order Markov process (Chow and Lin, 1971), a random walk (Fernandez, 1981), an ARIMA(1,1,0) process (Litterman, 1983), or a generalised ARIMA(p,d,q) process (Stram and Wei, 1986). Here, to avoid step discontinuities of quarterly estimates between years, the first-order autoregressive process is adopted, with:

ut = ρ ut−1 + εt,   εt ≈ i.i.d.

Since ρ is generally unknown, it has to be estimated. Chow and Lin (1971) suggested a maximum likelihood procedure, whereas BBV adopted a feasible generalised least squares procedure. It consists of an iterative scanning over the values of ρ, ranging in the interval (−1, 1), to build the likelihood function point by point. Here, we started with a step size of .01; in correspondence of the maximum likelihood value observed, we squeezed the step size to .001. The BBV procedure is shown to be preferable when the maximum likelihood quarterly time series are, for instance, characterised by dramatic fluctuations of uncertain economic meaning.

Moreover, given the assumption of autocorrelation of order one, seasonally adjusted quarterly data have been estimated.

"Optimal" methods, such as the Chow-Lin one, permit the extrapolation of the quarterly series outside the sample period. Extrapolated estimates are considered temporary data, to be revised when the annual constraint is available. We extend the extrapolation period to no more than 4 quarters, to avoid the accumulation of forecast errors.

Our series must satisfy two kinds of constraint: a temporal identity (adding up of quarterly values to the annual one) and a contemporaneous one (between resources and uses aggregates). The balancing of aggregates was then achieved with a double-proportional algorithm.

3 Available information at sub-national level: sources and timeliness

Concrete applications have shown that when the annual profiles of the indicators and of the series to be disaggregated are similar, quarterly estimates obtained by alternative methods tend to be very close. Therefore the main effort of the research has been the identification of reliable indicators. This problem is particularly evident when the purpose is the temporal disaggregation of regional series, since the availability of quarterly variables is in this case extremely poor. Moreover, the "official" annual data that represent the constraint of the model are released with considerable delay, whereas the first unofficial annual data of the Southern
economy are given in July of the next year. Since the availability of quarterly indicators anticipates the estimation of annual data (the indicators are available within 5 months), modelling quarterly data allows one to obtain provisional estimates of annual figures by the aggregation of quarterly figures.

At sub-national level, it is possible to obtain information from the following Istat surveys:
• monthly survey on household consumption;
• monthly survey on consumer prices;
• quarterly survey on labour force.

Additional sources of information for the Southern economy are the monthly survey on industrial electricity consumption led by the national electric authority (Enel) and the monthly survey on entrepreneurs' expectations on economic activity conducted by the institute for short-term economic analysis (ISCO).

It should be noted that each survey has, at regional level, a different degree of reliability due to the different coverage of the regional samples.

Unpublished elaborations of regional consumption are given for the following groups of goods and services: food, drink and tobacco; clothing; housing; fuel; transport and communication; other goods and services.

The survey on consumer prices gives provincial data. Therefore they have to be aggregated in order to obtain regional figures.

The labour force survey provides, at regional level, figures on the workforce in employment by kind of activity (agriculture, industry, private and public services) and by condition (employees in employment and self-employed), and on unemployment, divided into first employment claimants and unemployed.

The choice of indicators is also linked to the timeliness with which they are released. In this regard, it is possible to distinguish three orders of delay:
• delay on annual data: 7 months;
• delay on Istat and Enel indicators: between 3 and 7 months;
• delay on qualitative Isco indicators: 1 month.

The structure of delays described above suggested an outline for the procedure of estimation of quarterly accounts in three steps:
• temporary estimation of a restricted number of aggregates based on Isco indicators;
• first revision of the estimates when all the indicators are available;
• final revision when the annual data are available.

Utilisation of the information gathered by Isco leads to some interpretative difficulties due to the qualitative nature of the data. Expectations on variables such as production, inventories, orders, prices and so on are given according to modalities such as "normal", "more than normal", "less than normal". The procedure of quantification is the percentage of answers relating to each modality and eventually the balance between the percentage of answers "more than normal" and answers "less than normal".

The problem is that each entrepreneur could give a different meaning to these modalities. However, Isco has implemented a procedure in order to make the results coherent, taking into account the long run average of the balance.

Isco results can be used in two different ways: as leading indicators of the independent variables of the model or directly as indicators of the model itself. Since the coverage of the sample for the Southern regions is reliable enough only starting from 1986, the second way of utilisation has been preferred 2.

The estimation spans the period 1980-1991. In the following outline (Table 1) the quarterly aggregates estimated, their main indicators, and the average delay with which these indicators are released are presented.

Given the hypothesis of the model adopted (autoregressive process of order one in the error term), the indicators have been seasonally adjusted. For monthly indicators, we applied the X-11 method: quarterly data are obtained by the sum of the seasonally adjusted monthly ones. Household expenditure data have first been expressed in 5-term moving averages, then interpolated with a third degree polynomial form ("spline" method, see Kendall and Stuart, 1976).
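Returning to the estimator of section 2, its mechanics can be sketched numerically. The pure-Python sketch below takes the special case V = I (i.e. ρ = 0), with made-up indicator values and annual totals, and shows the key property of the method: the estimated quarters add up exactly to the annual constraint.

```python
# Chow-Lin-type temporal disaggregation with V = I (rho = 0):
#   Y* = X b + B'(B B')^{-1} (y - B X b).  All numbers are illustrative.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def inverse(A):
    """Gauss-Jordan inverse of a small square matrix."""
    n = len(A)
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        d = M[k][k]
        M[k] = [v / d for v in M[k]]
        for i in range(n):
            if i != k:
                f = M[i][k]
                M[i] = [v - f * w for v, w in zip(M[i], M[k])]
    return [row[n:] for row in M]

# Three years, twelve quarters; one indicator plus a constant.
x = [10.0, 11.0, 12.0, 13.0,
     14.0, 13.0, 12.0, 15.0,
     16.0, 15.0, 14.0, 17.0]
X = [[1.0, xi] for xi in x]
y = [[120.0], [140.0], [150.0]]              # annual totals to respect
B = [[1.0] * 4 + [0.0] * 8,                  # aggregation matrix
     [0.0] * 4 + [1.0] * 4 + [0.0] * 4,
     [0.0] * 8 + [1.0] * 4]

Bt = transpose(B)
BX = matmul(B, X)
BBt_inv = inverse(matmul(B, Bt))
# GLS coefficients b = (X'B'(BB')^-1 BX)^-1 X'B'(BB')^-1 y  (with V = I)
XtBt = matmul(transpose(X), Bt)
b = matmul(inverse(matmul(matmul(XtBt, BBt_inv), BX)),
           matmul(matmul(XtBt, BBt_inv), y))
# annual residuals, distributed back over the quarters of each year
resid = [[yi[0] - r[0]] for yi, r in zip(y, matmul(BX, b))]
Y_star = [[xb[0] + c[0]] for xb, c in
          zip(matmul(X, b), matmul(matmul(Bt, BBt_inv), resid))]

print([round(v[0], 3) for v in Y_star])
```

With V = I each year's residual is spread evenly over its four quarters; the BBV variant with a first-order autoregressive error changes V (and hence the spreading), but the annual adding-up property BY* = y holds by construction in both cases.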
Table 1: Outline of quarterly aggregates estimated, main indicators and average delay

Aggregates                                Indicators                              Delay
Value added at factor cost:
• Agriculture                             Workforce in employment                 5 months
• Industry                                Industrial electricity consumption;     4 months;
                                          National indicator                      3 months
• Private Services                        Workforce in employment                 5 months
• Public Services                         Workforce in employment                 5 months
GDP at market prices                      Values added by kind of activity        5 months
Net Imports                               National indicator                      4 months
Consumers' expenditure                    Household consumption survey            7 months
General government final consumption      Workforce in employment                 5 months
Gross domestic fixed capital formation    National indicator                      4 months
Consumers' expenditure deflator           Consumer prices survey;                 6 months;
                                          National indicator                      4 months
GDP deflator                              National GDP deflator                   4 months

4 Short-term behaviour of the Southern Economy
At the beginning of the Eighties, Southern economy
already presented negative growth rate in real terms;
thus, neither it suffered the slowing down of the
second part of the 1980, nor it fully benefited of the
recovery of 1981.
The availability of quarterly series of main regional
economic aggregates allowed us to analyse the
conjunctural differences between North and South
Italy immediately before the European unification. A
brief sketch of the main differences is reported in the
following 3.
After the slump of the third quarter of 1982, the
Southern economy seemed to grow faster than the
Northern one in the following three years. But, with
the spreading of more favourable international
conditions, the stronger economy of the industrialised
North started an accelerating growth path, leading
to an increasing gap between the two areas
until the second half of 1989.
Due to the different structural composition of the two
economies, with a higher incidence of
non-competitive sectors (agriculture, public services)
in the Southern one, international cyclical fluctuations
seem to arrive in South Italy “smoothed” and with a
certain delay, with some cases of counter-cycle between
North and South (see Table 2).
While the Northern economy experienced a stable GDP
growth rate pattern during this period, quarterly
variations in Southern GDP growth showed wider
fluctuations and shorter persistence of positive
swings.
Consumption cycles are quite similar in the two areas,
even though the Southern one seemed less sensitive to
international cyclical fluctuations; the same
behaviour characterises the investment cycle.
Growth disparities persisted during the whole
turndown period of the end of the decade; in 1991, due
to an exceptional crop and a better performance of
Southern industry than Northern industry (the latter
more affected by a fall in international investment
demand), for the first time growth rates were higher in
the South.
It is interesting to note that our model allowed us to
forecast this unexpected gain in 1991 Southern GDP
as compared to the Northern one, which was confirmed
only 3 months later by the annual data.
This piece of work should be interpreted as a first
attempt at deriving quarterly regional accounts.
However, the results obtained seem interesting enough
for future investigations.

(2) On the basis of a sample of 74 monthly observations, a linear relation between electricity consumption and the
Isco indicator has been estimated, where the error term is assumed to follow a seasonal moving average process.
Results seem satisfying enough both in terms of fitting and forecasting.
However, the direct utilisation of ISCO subjective indicators seems more interesting in perspective and will be
adopted as soon as the time series is long enough. As a matter of fact, the utilisation of industrial electricity
consumption cannot be extended to further regional disaggregation, since data are collected at
“compartimental” level, according to the sources of electricity supply, which is not coherent with the regional
disaggregation.

(3) From a different point of view, a discussion of more detailed economic results was first presented in: Centro
Europa Ricerche, “Ciclo economico ed attività creditizia nel Mezzogiorno” (“Business cycle and banking
activity in South Italy”), Rapporto n. 6, 1992.
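The relation described in footnote 2, industrial electricity consumption regressed on the ISCO survey indicator, might be approximated as below. This is a simplified OLS sketch in which monthly dummy intercepts stand in for the seasonal moving-average error process actually used in the paper; the function name and the synthetic data are purely illustrative.

```python
import numpy as np

def fit_electricity_relation(consumption, indicator, period=12):
    """Regress electricity consumption on a survey indicator plus one
    intercept per calendar month (a crude proxy for seasonality).
    Returns the OLS coefficients and the fitted values."""
    n = len(consumption)
    dummies = np.eye(period)[np.arange(n) % period]   # monthly intercepts
    X = np.column_stack([indicator, dummies])
    beta, *_ = np.linalg.lstsq(X, consumption, rcond=None)
    return beta, X @ beta

# Illustrative use on 74 synthetic monthly observations (the sample
# size mentioned in footnote 2):
rng = np.random.default_rng(0)
ind = rng.normal(100.0, 10.0, 74)
seasonal = np.tile(np.sin(2 * np.pi * np.arange(12) / 12), 7)[:74]
cons = 2.0 * ind + 5.0 * seasonal
beta, fitted = fit_electricity_relation(cons, ind)
```

With noise-free synthetic data the slope coefficient `beta[0]` recovers the true value exactly, since the monthly dummies absorb the whole seasonal pattern.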
Table 1: Main economic aggregates of South Italy, quarterly series 1980-1991
(billions of 1985 lire)

Quarter   GDP      Net       Gross fixed   Household     Collective    Changes in
                   imports   investments   consumption   consumption   inventories
80.1      46.955    8.766    12.434        31.730        10.072         1.484
80.2      46.411    8.462    12.383        32.130        10.352             8
80.3      46.417    8.502    12.294        32.207        10.360            58
80.4      46.681    8.744    12.256        32.296        10.476           398
81.1      46.363    8.783    12.253        32.312        10.613           -32
81.2      46.281    8.883    12.108        32.482        10.535            39
81.3      46.285    8.993    11.911        32.783        10.573            11
81.4      46.602    9.246    11.737        32.966        10.643           502
82.1      47.073    9.580    11.682        32.987        10.658         1.326
82.2      46.895    9.520    11.629        33.083        10.882           819
82.3      46.358    9.219    11.662        33.417        10.898          -401
82.4      46.475    9.184    11.840        33.620        10.960          -760
83.1      47.137    9.345    12.032        33.632        11.245          -428
83.2      47.582    9.439    12.272        33.740        11.224          -215
83.3      48.505    9.796    12.477        34.036        11.368           420
83.4      49.027    9.952    12.636        34.344        11.312           688
84.1      49.159    9.893    12.906        34.640        11.507            -1
84.2      49.573   10.003    12.939        34.823        11.251           563
84.3      50.012   10.134    12.962        34.854        11.593           738
84.4      50.129   10.094    12.914        35.047        11.941           322
85.1      49.964    9.888    12.557        35.458        11.836             0
85.2      50.832   10.302    12.553        35.996        11.770           815
85.3      51.509   10.714    12.592        36.252        12.286         1.092
85.4      50.965   10.613    12.461        37.019        12.206          -107
86.1      50.746   10.814    12.425        37.198        12.113          -176
86.2      51.494   11.387    12.607        37.348        12.104           822
86.3      51.806   11.655    12.722        37.469        12.309           961
86.4      51.941   11.770    12.831        38.168        12.707             4
87.1      51.780   11.679    12.994        38.639        12.636          -811
87.2      52.966   12.297    13.265        39.034        12.606           359
87.3      53.703   12.718    13.211        39.642        12.751           818
87.4      54.172   13.044    13.310        40.006        12.947           952
88.1      54.047   13.123    13.548        40.380        12.994           247
88.2      54.564   13.431    13.595        40.786        13.198           416
88.3      54.892   13.555    13.655        41.284        13.215           293
88.4      55.327   13.638    13.719        41.689        12.988           570
89.1      55.555   13.498    13.763        41.922        13.048           320
89.2      55.925   13.580    13.746        42.233        13.302           225
89.3      56.164   13.729    13.882        42.626        13.362            23
89.4      56.496   14.063    14.247        42.577        13.222           513
90.1      56.526   14.410    14.439        42.613        13.225           659
90.2      56.811   14.687    14.511        43.299        13.386           301
90.3      57.648   15.066    14.534        44.050        13.615           515
90.4      56.981   14.500    14.401        44.238        13.506          -663
91.1      57.711   14.416    14.213        43.972        13.594           348
91.2      58.259   14.434    14.247        44.371        13.683           392
91.3      58.758   14.575    14.474        45.208        13.700           -49
91.4      58.922   14.675    14.560        44.944        13.749           344
References

Barbone L., Bodo G., Visco I. (1981), “Costi e profitti in senso stretto: un’analisi su serie trimestrali,
1970-80”, Bollettino della Banca d’Italia, 36.
Chow G., Lin A. (1971), “Best Linear Unbiased Interpolation, Distribution and Extrapolation of Time
Series by Related Series”, The Review of Economics and Statistics, 53.
Di Fonzo T. (1990), “The Estimation of M Disaggregate Time Series when Contemporaneous and
Temporal Aggregates Are Known”, The Review of Economics and Statistics, 72.
Fernández R.B. (1981), “A Methodological Note on the Estimation of Time Series”, The Review of
Economics and Statistics, 63.
Istat (1992), “I conti economici trimestrali con base 1980”, Note e Relazioni, 1.
Kendall M.G., Stuart A. (1976), The Advanced Theory of Statistics, vol. III, London, Griffin.
Litterman R.B. (1983), “A Random Walk, Markov Model for the Distribution of Time Series”, Journal of
Business and Economic Statistics, 1.
Stram D.O., Wei W.W.S. (1986), “A Methodological Note on the Disaggregation of Time Series Totals”,
Journal of Time Series Analysis, 7.