Today, professional and institutional investors have access to at least 100 times more consumer transaction data than they did just a decade ago.
That nets out to at least one trillion rows’ worth of consumer transaction data that can be parsed to gain insights into where a company, the economy, or markets are headed.
And this only represents a drop in a global ocean of alternative data that includes email receipts, social media posts, mobile phone location and other data that exists outside the realm of most corporate filings and official government reports. The $4.5 billion alternative data market has grown 50% annually for the past four years and it shows no signs of slowing down.
This data has become an increasingly important source of competitive advantage, especially for the quant-focused hedge funds that together manage well over $1 trillion, representing 29% of all hedge fund assets. Other fund managers we have spoken to have expressed concern that they can’t keep up with this alternative data arms race, but recent advances in artificial intelligence are suddenly leveling the playing field by:
- Closing the Data Scientist Gap: As the amount of data has grown, so too has the number of data scientists investment firms need to hire to keep up with it. Imagine you had two data scientists to extract value out of U.S. consumer data and then you got access to reams of consumer data from France. Now you need a third data scientist, and ideally one who speaks French. Some firms have the resources to keep hiring. Most don’t. Fortunately, the advent of tools like OpenAI’s Code Interpreter is making it easier for any investor to glean data insights. Some are already using Code Interpreter to upload and analyze data and to create tables and charts. You don’t even need to know exactly what you’re looking for. You could, for example, upload information on a retailer’s foot traffic, the weather in the region and social media data and prompt Code Interpreter by asking it, “Tell me what’s unusual or interesting about this data set.” A few years ago, only a trained data scientist could have done this exercise. Now, almost anyone can do it.
- Finding a Needle in a Research Haystack: If you are a buy-side analyst who covers 100 energy companies, you could be getting reports on them from 20 sell-side analysts publishing twice a quarter. That’s 4,000 reports a quarter (100 companies × 20 analysts × 2 reports), which collectively could exceed 40,000 pages. You will never read it all, but you don’t have to if you are using AI summarization tools, which can distill reams of research reports down into concise summaries.
- Cleaning up the Messy Work: Alternative data can be messy. Even if the data is structured, it may have inconsistent merchant names or product descriptions. A single dataset can combine multiple data sources (for example, data from different point-of-sale systems) that are labeled differently. Making sense of unstructured data can be even more challenging, as a significant amount of language processing may be required to extract insights. All of these tasks are time-consuming, require precision, and are more monotonous than higher-level insight generation. Thankfully, many of them are perfect applications for large language models. For example, entity resolution (mapping merchant descriptions to tickers in consumer transaction data) can be largely automated with AI tools. This allows analysts to focus on higher-level analysis of the data instead of menial data cleansing.
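To make the entity resolution task concrete, here is a minimal sketch of what mapping messy merchant descriptions to tickers involves. It is an illustration only: it swaps in simple string similarity from Python's standard library (`difflib.SequenceMatcher`) as a stand-in for the large language models discussed above, and the `MERCHANT_TICKERS` table, the `normalize` and `resolve_ticker` helpers, and the 0.85 threshold are all hypothetical choices rather than anything a particular vendor uses.

```python
import re
from difflib import SequenceMatcher

# Hypothetical reference table: canonical merchant name -> ticker.
MERCHANT_TICKERS = {
    "walmart": "WMT",
    "home depot": "HD",
    "starbucks": "SBUX",
}


def normalize(description):
    """Lowercase, drop digits and punctuation, and collapse whitespace."""
    letters_only = re.sub(r"[^a-z\s]", "", description.lower())
    return " ".join(letters_only.split())


def resolve_ticker(description, threshold=0.85):
    """Return the ticker whose merchant name best matches, or None.

    Each canonical name is compared with every run of words of the same
    length in the cleaned description, so store numbers and city names
    around the merchant name do not drag the score down.
    """
    tokens = normalize(description).split()
    best_ticker, best_score = None, 0.0
    for name, ticker in MERCHANT_TICKERS.items():
        n = len(name.split())
        windows = [" ".join(tokens[i:i + n])
                   for i in range(len(tokens) - n + 1)]
        if not windows:  # description is shorter than the merchant name
            windows = [" ".join(tokens)]
        score = max(SequenceMatcher(None, name, w).ratio() for w in windows)
        if score > best_score:
            best_ticker, best_score = ticker, score
    # Assumed cutoff: below it, leave the row unmapped for human review.
    return best_ticker if best_score >= threshold else None


if __name__ == "__main__":
    for raw in ["WAL-MART #1234 AUSTIN TX", "THE HOME DEPOT 6543",
                "STARBUCK COFFEE", "JOES CORNER DELI"]:
        print(raw, "->", resolve_ticker(raw))
```

Returning `None` below the threshold is a deliberate choice in this sketch: an unmapped row can be routed to a person or a stronger model, whereas a wrong ticker quietly contaminates the analysis built on top of it.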
In recent years, investors have been both wowed and overwhelmed by the amount and type of alternative data that’s available. But AI is democratizing access to alternative data, making it easier to find, to make sense of, and, most importantly, to turn into actionable investing insights.
Spenser Marshall is the Chief Data Officer of Sundial Data, a direct data sales subsidiary of M Science, a portfolio company of Leucadia Investments, a division of Jefferies Financial Group.