Using the text of 200 million pages from 13,000 US local newspapers and state-of-the-art machine learning methods, we construct a novel 170-year time series measure of economic sentiment at the country and state levels. It extends the time span of existing measures by more than a century and is the first to provide cross-sectional (regional) variation. Our corpus includes approximately 1 billion newspaper articles, a major increase over the Wall Street Journal corpus, a popular source of text data in economics and finance, which contains about 1 million articles.
To measure text-based economic sentiment, we customize a new state-of-the-art machine learning technique. We create a fully automated topic-specific dictionary by leveraging Word2vec, a neural-network-based algorithm that allows us to capture the meaning of words and phrases from the context in which they are used. Furthermore, instead of using a binary positive/negative connotation, we produce a continuous measure of sentiment for each word and phrase in the dictionary. As a result, our method automatically overcomes many common challenges faced by simple word-count techniques (e.g., detecting negation or measuring word/phrase intensity). Empirically, our measure is highly correlated with observable survey-based expectations (e.g., the Michigan Consumer Sentiment survey), yet it significantly extends the time span available for analysis and provides a source of cross-sectional variation.
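As an illustration only, one way a continuous word-level score could be derived from word embeddings is to compare each term with the centroids of a few positive and negative seed words. The sketch below is not the dictionary construction used in the paper: the seed words, the toy three-dimensional vectors (standing in for trained Word2vec embeddings), and the function names `sentiment_scores` and `document_sentiment` are all hypothetical.

```python
import numpy as np


def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)


def sentiment_scores(embeddings, positive_seeds, negative_seeds):
    """Score each term as cosine similarity to the positive seed centroid
    minus cosine similarity to the negative seed centroid (range [-2, 2])."""
    pos = unit(np.mean([unit(embeddings[w]) for w in positive_seeds], axis=0))
    neg = unit(np.mean([unit(embeddings[w]) for w in negative_seeds], axis=0))
    return {w: float(unit(v) @ pos - unit(v) @ neg) for w, v in embeddings.items()}


def document_sentiment(tokens, scores):
    """Average the scores of the tokens found in the dictionary."""
    hits = [scores[t] for t in tokens if t in scores]
    return sum(hits) / len(hits) if hits else 0.0


# Hypothetical toy vectors; real embeddings would come from a trained model.
toy_embeddings = {
    "boom": np.array([0.9, 0.1, 0.0]),
    "prosperity": np.array([0.8, 0.2, 0.1]),
    "panic": np.array([-0.9, 0.1, 0.0]),
    "depression": np.array([-0.8, 0.2, 0.1]),
    "recovery": np.array([0.5, 0.5, 0.0]),
    "slump": np.array([-0.4, 0.6, 0.1]),
}

scores = sentiment_scores(
    toy_embeddings,
    positive_seeds=["boom", "prosperity"],
    negative_seeds=["panic", "depression"],
)
article_score = document_sentiment(["the", "recovery", "follows", "the", "slump"], scores)
```

Because every term receives a graded score rather than a +1/-1 label, a strongly positive term such as "boom" outweighs a mildly positive one such as "recovery" when scores are averaged over an article.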
Our measure predicts future economic growth even after controlling for current macroeconomic fundamentals: a one-standard-deviation increase in sentiment corresponds to 2% additional annual growth in GDP per capita during 1850-2017. Over the recent sample with observed quarterly data (1947Q1-2019Q4), a one-standard-deviation increase in sentiment leads to 0.29% of additional GDP growth one quarter ahead (corresponding to 1.1% annualized growth). This predictability remains even after controlling for the slope of the yield curve, past GDP dynamics, and the consensus forecast, implying that our measure captures important information not spanned by leading predictive macroeconomic indicators.
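A predictive regression of this kind can be sketched as an OLS of next-period growth on standardized current sentiment, lagged growth, and optional controls. The code below is a minimal sketch on simulated data, not the paper's specification: the function name `predictive_regression`, the choice of controls, and the data-generating parameters (including the 0.29 coefficient, borrowed from the text purely so the simulation has a target) are assumptions.

```python
import numpy as np


def predictive_regression(growth, sentiment, controls=None):
    """OLS of growth[t+1] on a constant, standardized sentiment[t], growth[t],
    and optional controls[t]. Returns the coefficient vector in that order."""
    growth = np.asarray(growth, dtype=float)
    s = np.asarray(sentiment, dtype=float)
    s = (s - s.mean()) / s.std()  # coefficient is then per one standard deviation
    columns = [np.ones(len(s) - 1), s[:-1], growth[:-1]]
    if controls is not None:
        columns.append(np.asarray(controls, dtype=float)[:-1])
    X = np.column_stack(columns)
    y = growth[1:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Simulated quarterly data (hypothetical): growth responds to lagged sentiment.
rng = np.random.default_rng(0)
T = 2000
sentiment = rng.standard_normal(T)
growth = np.zeros(T)
for t in range(T - 1):
    growth[t + 1] = 0.5 + 0.29 * sentiment[t] + 0.2 * growth[t] + 0.5 * rng.standard_normal()

coef = predictive_regression(growth, sentiment)
sentiment_effect = coef[1]  # estimated effect of a one-standard-deviation increase
```

In an application, the `controls` argument would carry series such as the yield-curve slope or a consensus forecast, aligned to the same dates as `sentiment`.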
We then examine the sub-components of GDP that our measure is able to predict. We show that our results are mainly consistent with the labour channel rather than the capital channel of sentiment propagation in the economy, as our measure predicts employment, consumption, and services but neither investment nor industrial production. Similarly, we also show that economic sentiment operates through the real economy and does not affect inflation.
Next, we evaluate the extent to which sentiment affects monetary policy decisions. To this end, we quantify the importance of sentiment in explaining changes in the federal funds rate relative to what the forward-looking Taylor rule proposed by Romer and Romer (2004) would imply. We find that sentiment has a large influence on the key policy rate: a one-standard-deviation decrease in sentiment over the past two quarters leads to a 25-basis-point (5-basis-point) decrease in the policy rate during recessions (expansions). Furthermore, we find that sentiment has significant predictive power for the federal funds rate during recessions even after controlling for its predictive power on ex ante (Tealbook projections) and ex post (realized) GDP growth.
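The state-dependent response described here can be illustrated with a regression that lets the sentiment coefficient differ between recessions and expansions. The sketch below takes the deviation of the policy rate from the rule-implied rate as a given input (it does not compute the Romer and Romer (2004) rule) and uses simulated data; the function names, the recession indicator, and the simulated magnitudes are assumptions chosen to mirror the numbers in the text.

```python
import numpy as np


def two_quarter_average(sentiment):
    """Standardized average of sentiment over the previous two quarters,
    aligned with observations t = 2, ..., T-1."""
    s = np.asarray(sentiment, dtype=float)
    avg = (s[1:-1] + s[:-2]) / 2.0  # (s[t-1] + s[t-2]) / 2
    return (avg - avg.mean()) / avg.std()


def state_dependent_response(rate_deviation_bp, sentiment, recession):
    """OLS of the rate deviation (in basis points) on past sentiment, with
    separate slopes for recession and expansion quarters."""
    z = two_quarter_average(sentiment)
    d = np.asarray(rate_deviation_bp, dtype=float)[2:]
    r = np.asarray(recession, dtype=float)[2:]
    X = np.column_stack([np.ones(len(z)), r, z * r, z * (1.0 - r)])
    coef, *_ = np.linalg.lstsq(X, d, rcond=None)
    return {"recession": float(coef[2]), "expansion": float(coef[3])}


# Simulated data (hypothetical): stronger response in recession quarters.
rng = np.random.default_rng(1)
T = 4000
sentiment = rng.standard_normal(T)
recession = (rng.random(T) < 0.2).astype(float)
z = two_quarter_average(sentiment)
deviation = np.zeros(T)
deviation[2:] = (
    25.0 * z * recession[2:]
    + 5.0 * z * (1.0 - recession[2:])
    + 10.0 * rng.standard_normal(T - 2)
)

response = state_dependent_response(deviation, sentiment, recession)
```

A positive slope means that lower sentiment goes with a lower policy rate relative to the rule, which is the direction of the effect reported above.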
Our data also allows us to measure local sentiment, for example, at the state level. This variation is important, as it reveals significant heterogeneity in sentiment across states, with the common component driving only approximately 35% of the state-level sentiment. Local sentiment predicts state-level GDP growth even after controlling for national sentiment and both national and state-level fundamentals. Furthermore, using the dispersion in sentiment across states as a measure of heterogeneity, we find that higher dispersion is a significant predictor of low economic growth at the national level.
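To make the two state-level statistics concrete, the sketch below computes the share of variance captured by a single common component and the cross-state dispersion of sentiment in each period. Using the first principal component as the "common component" is an assumption made for illustration, since the text does not say how the 35% figure is obtained; the simulated panel, its dimensions, and the function names are likewise hypothetical.

```python
import numpy as np


def common_component_share(panel):
    """Share of total variance explained by the first principal component
    of a (periods x states) panel, after standardizing each state series."""
    panel = np.asarray(panel, dtype=float)
    z = (panel - panel.mean(axis=0)) / panel.std(axis=0)
    singular_values = np.linalg.svd(z, compute_uv=False)
    variances = singular_values**2
    return float(variances[0] / variances.sum())


def cross_state_dispersion(panel):
    """Cross-sectional standard deviation of sentiment in each period."""
    return np.asarray(panel, dtype=float).std(axis=1)


# Simulated panel (hypothetical): one national factor plus state-specific noise.
rng = np.random.default_rng(2)
T, N = 680, 50
national = rng.standard_normal(T)
idiosyncratic = rng.standard_normal((T, N))
panel = np.sqrt(0.35) * national[:, None] + np.sqrt(0.65) * idiosyncratic

share = common_component_share(panel)
dispersion = cross_state_dispersion(panel)  # one value per period
```

The resulting `dispersion` series could then enter a national growth regression as an additional predictor, alongside national sentiment.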
Overall, our results indicate the importance of sentiment for understanding business cycles (at both the national and local levels), and provide a set of robust empirical facts that could indicate potential channels of its impact.
Mayukh Mukhopadhyay is a PhD student in Financial Economics at the London Business School. He holds a BA in Economics and an MPhil in Economic Research from the University of Cambridge. Mayukh’s research interests include the use of big data and machine learning methods for the prediction of asset returns and macroeconomic forecasting. Prior to joining London Business School, Mayukh worked for the International Finance Division at the Bank of England.
Research Interests: Asset Pricing, Macroeconomics, Big Data, Machine Learning