What is Big Data? And What Does it Mean to Hedge Funds?
The phrase ‘Big Data’ is the rising star of industry buzzwords, but what exactly does it mean? In this article we’ll define big data and, perhaps more importantly, discuss its implications for the hedge fund market.
Wikipedia defines big data as a “collection of data sets so large and complex that it becomes awkward to work with using on-hand database management tools.” As a result, the top software companies (e.g., Oracle, Microsoft, HP) as well as financial application vendors are investing heavily in building systems to help companies harness the power of big data.
And big data just keeps getting bigger. According to IBM, each day we create 2.5 quintillion bytes of data (roughly 2.5 exabytes) from everyday activities including social media, digital pictures and videos, online transactions, GPS signals and more. To highlight the scale of this explosion, an estimated 90% of the data in the world today was created in the last two years alone.
What is the Significance of Big Data?
If big data can be harnessed, it offers the opportunity to spot trends, find new insights or trading ideas, and answer questions that were previously considered out of reach.
Signaling the importance of big data, the World Economic Forum released a report earlier this year outlining the significant impact big data will have on international development. According to the report, “researchers and policymakers are beginning to realize the potential for channeling these torrents of data into actionable information that can be used to identify needs and provide services for the benefit of low-income populations.”
But harnessing the data is easier said than done. A report last year by the McKinsey Global Institute, the research arm of consulting firm McKinsey & Company, projected that to capitalize on big data the United States needs 140,000 to 190,000 more workers with “deep analytical” expertise and 1.5 million more data-literate managers, whether retrained or hired.
Big Data and Wall Street
Quantitative hedge funds and investment strategies are the most obvious application for big data. In a recent Forbes.com article, David Leinweber explains that “many of the ideas from quant investing make sense in [big data] context; histories are huge, and experimentation is easy. There’s an underlying behavioral model, plus, you know your counter-parties. The large volume and variety of data allows use of new ‘data voracious’ statistical and machine learning methods that, in finance, are useful for high-frequency trading, but are worthless on daily or monthly market data.”
Most large Wall Street banks are also looking for better ways to capitalize on large datasets. Bank of America Merrill Lynch, for example, is using Hadoop, an open-source framework for the distributed processing of large data sets. With Hadoop, Bank of America Merrill Lynch is applying big data strategies to manage petabytes of data for regulatory compliance and advanced analytics.
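To make “distributed processing” a little more concrete, here is a minimal sketch of the split-process-merge pattern that Hadoop’s MapReduce model is built around. It is plain Python, not Hadoop’s API, and it says nothing about how any bank actually runs its systems. The trade records, ticker volumes and function names are all made up for illustration, and threads stand in for the separate machines a real cluster would use.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical trade records, pre-split into chunks the way a cluster
# would split one large file across machines. All values are invented.
CHUNKS = [
    [("AAPL", 100), ("MSFT", 50), ("AAPL", 25)],
    [("MSFT", 75), ("IBM", 10)],
    [("AAPL", 5), ("IBM", 40), ("IBM", 1)],
]

def map_chunk(chunk):
    """Map step (with local combining): total shares per ticker in one chunk."""
    partial = defaultdict(int)
    for ticker, shares in chunk:
        partial[ticker] += shares
    return dict(partial)

def reduce_partials(partials):
    """Reduce step: merge the per-chunk totals into one overall total."""
    totals = defaultdict(int)
    for partial in partials:
        for ticker, shares in partial.items():
            totals[ticker] += shares
    return dict(totals)

def total_volume(chunks, workers=3):
    # Each chunk is processed independently, so the work can run in parallel.
    # Threads are only a stand-in here for separate machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce_partials(partials)

totals = total_volume(CHUNKS)
print(totals)
```

The point of the pattern is that no single worker ever needs to see the whole data set: each one handles its own chunk, and only the small per-chunk summaries are brought together at the end.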
Bigger and Bigger and Bigger
We can expect both the amount of data and the market touting big data solutions to keep growing. Just as cloud computing has gone mainstream, so too will big data. The question is how long it will take for these solutions to become viable options for traditional hedge funds.
Photo Credit: Deviantart