The Bid podcast

Episode 2: Big Data: The race is on

It’s no secret that technology has led to an explosion of data – but how do you make meaning out of it? Jeff Shen, Co-Chief Investment Officer of Active Equity and Co-Head of Systematic Active Equity, goes behind the hype to discuss Big Data in investing, and whether it creates a bigger or a more competitive opportunity for investors.



    Mary-Catherine Lader: Ninety percent of the world's data was created in the last two years. That was true last year, and the year before that. So, in other words, there's an explosion of data. This creates a massive opportunity, but it also raises new challenges. With so much data out there, how do you value it? How do you harness it? How do you make meaning out of it?

    Welcome back to The Bid and to our mini-series Behind the Hype: Demystifying Fintech. On our last episode, BlackRock's Chief Operating Officer, Rob Goldstein, set the stage for us and talked about how transformations in technology are disrupting industries and influencing how companies scale their business.

    Today, demystifying the buzz around big data is Jeff Shen. He's Co-Chief Investment Officer of Active Equity, and Co-Head of Systematic Active Equity, BlackRock's quantitative investing platform. With Jeff, we'll get to the root of what exactly big data in investing is, how it's used, and whether it creates a bigger or a smaller and more competitive opportunity for investors. I'm your host, Mary-Catherine Lader. We hope you enjoy.

    Mary-Catherine Lader: Hi, Jeff. Thanks so much for joining us today.

    Jeff Shen: M.C., thanks very much for having me.

    Mary-Catherine Lader: Today we're talking about big data, machine learning, and what that means for your team and for the investors whose money you manage. We're all familiar with how smart phones, GPS, and smart home devices are totally changing our lives. They're changing communication, transportation, daily tasks. I know I get my news, my weather from a device in my home. So, as that technology grows, new forms of data, of course, grow with it. The term "alternative data" or "Big Data" gets thrown around a lot. What exactly does that mean, from your perspective as an investor? And what kinds of data sets can your team now interact with that you might not have been able to when you started out as a manager 14 years ago?

    Jeff Shen: I think it's an extremely exciting time for us to look at data and the implication for investment. What I like to call it is "bits over atoms." Now, what do we mean by that? We used to collect information by moving atoms around. So if I wanted to see where the factory is being built, I fly over it to check out if the factory has been built. And if I want to get a sense of the real estate development that's coming online, I do a field trip. So, there are a lot of atoms moving around to collect some of this information. Fast forward, today, a lot of this information now can be collected through bits. Bits really means zeros and ones; we're using a digital form to collect some of this data and some of this information. And then, when we look at how portfolio managers are processing this information, increasingly, the information is coming at us through these bits. And to the extent we can leverage data and technology for our investments, we're certainly in a bit of a brave new world.

    Mary-Catherine Lader: So what are some specific examples? I mean, do you now have access to foot traffic data? That's a famous example of using that to assess, like, retail demand, to social media; you know, what in the past year or two can you now look at to get a sense of investment opportunities that you couldn't before?

    Jeff Shen: I think I would classify this probably into three major categories. Number one is really about fundamentals. If you think about -- to your point, a person comes to the store, if that results eventually in a sale, then tracking the foot traffic data through Wi-Fi beacon information will certainly be quite helpful to capture fundamental information. It could also be through the second category that I like to call sentiment. If we want to get a sense of how 100 million retail investors in China are thinking about the equity market, then social media information gives us a glimpse of the human intentions and emotions. We get a sense of what they are thinking about the Chinese stocks locally in Shanghai and Shenzhen. The third category of information that we also like to track is certainly to try to look at policymakers' potential movements. And it could be fiscal policy; it could be monetary policy movements. From this, we're also using natural language processing to process a lot of these data and text to get a sense of what are the potential policy movements going forward. To sum it up, what we are looking at is certainly using these alternative datasets to track fundamentals, sentiment, and also macro policy; to see whether that can give us better answers to a set of classic questions that we've been tracking for a long time.

    Mary-Catherine Lader: Jeff, that policy example is so interesting, because there's so much bad information about what's going to happen with policy. There's so much bad information about what management teams or government leaders, human leaders might do that could influence companies. So, how do you parse through that kind of information for either policy indicators or sentiment to really figure out and discern the signal from the noise, what's good information versus bad information?

    Jeff Shen: Signal to noise is certainly a huge challenge. What we do over here on the policy front is certainly try to have a bit of supervised machine learning, if you will. So on one hand, we want to give the machine a bit of direction on where to look for information, especially related to policy. It could be monetary policy. It could be fiscal policy. The world is certainly right now switching from monetary policy to a regime where fiscal policy is becoming much more important. So what we are doing is to guide the machine to look for -- in a sense, away from the monetary policy stance, into the fiscal policy stance. And then, the natural language processing technique is quite critical. What we are looking for is not only to get a sense of whether the policymakers are becoming positive or negative on a particular set of policy directions, but also importantly what are the emerging topics that can be coming up from some of their speeches and some of their discussions? So, to a certain extent, it's no different from you and me reading a corpus of text and figuring out where the policy could be, except in this case here we like to hire a lot of PhDs, who I joke sometimes may not like to read, but would like to use a machine to read a lot of these texts.

    Mary-Catherine Lader: So there's basically a person reviewing all of this. That's what you mean by "supervised machine learning," that there's like a human check at the end of it all?

    Jeff Shen: A human will set up the framework of thinking through this, and if you will, humans really determine what the algorithm is. And the machine is coming in to provide scalability to read through thousands and thousands of documents in one go. And we process seven major languages in the world. So not only being able to read in English, but also try to read Spanish, Portuguese, Chinese, and Japanese. Across languages there are a lot of nuances. So when you get into some of these details, some of these techniques, the machine nuances become quite important. But ultimately, it is really about human-plus-machine that we think can solve the overall puzzle.

    Mary-Catherine Lader: So are there any examples of really strange outcomes, either strange investment recommendations or even weird sets of data that you've found less useful, but that human check was critical to identifying?

    Jeff Shen: We like strange things. Our perspective here is that, if the questions or the answers are too obvious, it's probably priced into a market too quickly. And to a greater extent, it is about potentially asking a different set of questions that nobody has actually asked before. One specific example that I have is what we like to call signal combination. So essentially what we're trying to do is to look at 200-plus reasonably generic insights that can predict stock returns. But rather than combining them using common weights across every single signal, what we want to do is to look at each individual stock and to have potentially a different weighting according to the characteristics of the stock: which country it is in, which sector it is in, which market capitalization it is in. So, it essentially gives us a very individualized combination. This, to a certain extent, is instead of having a forest view, what we want to have is to zoom from the forest into the tree, to have a very individualized combination. Now, that particular type of question is certainly a bit of a strange question in the sense that without the leverage and scalability of a machine, we were never able to ask these types of questions before. But fast-forward to 2018, we are able to ask this set of questions and also come up with pretty interesting answers.
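
The signal combination Jeff describes can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not BlackRock's actual method: the tickers, signal names, scores, and weights are all hypothetical, and a real system would presumably estimate the weights from data across the 200-plus signals he mentions.

```python
# Hypothetical signal scores per stock (made-up numbers, not real data).
signals = {
    "AAA": {"value": 1.0, "momentum": -0.5, "sentiment": 0.2},
    "BBB": {"value": -0.3, "momentum": 0.8, "sentiment": 0.5},
}

# Hypothetical stock characteristics: (country, sector, size bucket).
characteristics = {
    "AAA": ("US", "tech", "large"),
    "BBB": ("CN", "retail", "small"),
}

# The "forest view": one common set of weights applied to every stock.
COMMON_WEIGHTS = {"value": 1 / 3, "momentum": 1 / 3, "sentiment": 1 / 3}

# The "tree view": weights that depend on a stock's characteristics.
# These values are assumptions chosen only for illustration.
group_weights = {
    ("US", "tech", "large"): {"value": 0.2, "momentum": 0.5, "sentiment": 0.3},
    ("CN", "retail", "small"): {"value": 0.1, "momentum": 0.3, "sentiment": 0.6},
}


def combine(scores, weights):
    """Weighted sum of a stock's signal scores."""
    return sum(weights[name] * score for name, score in scores.items())


def individualized_score(ticker):
    """Combine signals with weights picked by the stock's characteristics.

    Falls back to the common weights when no group-specific weights exist.
    """
    weights = group_weights.get(characteristics[ticker], COMMON_WEIGHTS)
    return combine(signals[ticker], weights)


if __name__ == "__main__":
    for ticker in signals:
        common = combine(signals[ticker], COMMON_WEIGHTS)
        individual = individualized_score(ticker)
        print(f"{ticker}: common={common:.3f} individualized={individual:.3f}")
```

The only design choice that matters here is that the weights are looked up per stock rather than fixed once for the whole universe. With hundreds of signals and thousands of stocks, that lookup is the part that needs a machine's scale.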

    Mary-Catherine Lader: I mean, it does make you wonder about the importance of evolving all investors' skill sets. I started out as an investor at the beginning of my career and had to key in every company's financials, build my own model, kind of project their financials, call up and do research. And so, to be able to do that at scale so quickly and just tweak the things that require your own judgment sounds extremely compelling. In that context, I mean, what is the future of an investor who's using an Excel spreadsheet and a 10K, and how will they compete?

    Jeff Shen: I think investors need to evolve, especially given the new context of data and the algorithm revolution that's sitting in front of us. And I think to a great extent, historically, a lot of discretionary fundamental-oriented managers certainly have a lot of depth of knowledge, knowing one or two or a dozen companies extremely well, and being able to achieve depth of understanding. The systematic quantitative investors historically tend to be very wide in terms of understanding, but sometimes can be a bit shallow. So they know a lot of different things. But at the same time, at the individual company level, they may not know that much. I think we're into a phase and world where it's potentially possible to have both breadth of knowledge, but also with depth of understanding. And I think that's where the future state of an investor -- for active management -- that's extraordinarily exciting. And it's not really limited to systematic quantitative managers, per se. Fundamental discretionary-oriented managers can certainly leverage these new data and new technology to evolve. So, I think the race is on. And ultimately, it all comes down to evolving the investment process and leveraging data and technology going forward.

    Mary-Catherine Lader: Can I ask you a slightly personal question?

    Jeff Shen: Absolutely.

    Mary-Catherine Lader: Do you have kids?

    Jeff Shen: Two daughters, 10-year-old and 7-year-old.

    Mary-Catherine Lader: So if your 10-year-old said, "Dad, I really want to be an investor like you. What do I need to learn?" What would you tell her?

    Jeff Shen: Funny enough that you ask. She's actually a big fan of Shark Tank on CNBC, for whatever strange reason. But I think to answer your question, I think it's going to be a bit of a combination of, on one hand, I think the kids today need to get a better understanding of how a machine would work. So, understanding of computer science, of algorithms, and understanding the subtleties of how to use this algorithm for what specific setting and context is going to be quite an important skill set to have. But on the other hand, I do think that liberal arts majors would have a bright future for the future state of investment, because ultimately it is not only about coming up with a set of answers to a set of questions, but it's also more importantly coming up with better questions to ask, and making sure that we have a critical mindset to think about some of these issues. So ultimately, the questions may matter even more than the answers.

    Mary-Catherine Lader: I know we at BlackRock are a big fan of hiring a mix of people with liberal arts backgrounds for exactly that reason. Even if you're, for example, a student of history, you're studying judgment over the course of human history and thinking about what causes certain unpredictable events, and therefore the right questions to ask to think about the future and project. So, if your daughter is going to be learning how machines work, and possibly learning how to code -- and she might want to be an investor, given how much she already loves Shark Tank -- then are we just at -- we're already in an arms race, and what today is cutting-edge, what today is being used by funds like SAE is just going to be the norm in the future?

    Jeff Shen: I think so. And it's not only investment in the machine, in the data, that is critical -- it's table stakes for sure. At the same time, it is also about talent, about the people that we can attract. But it's also about making linkages. I do think that the diversity issue in investment is a critical one. I think not only do we want to have people with very strong computer science/engineering backgrounds, but also making sure that we can make connections across some of these answers and try to answer these questions in a holistic fashion. Technology companies are certainly pretty strong competitors in this space -- I mean, San Francisco I certainly see it outside my window. At the same time, I would say that to solve the investment problem, some of the challenges are quite fundamentally different from solving a regular engineering problem. The fuzzy objective function for investment is certainly quite different in the sense that when we think about a better investment result, not only higher return, lower risk is important; the journey also matters. The only thing that we know for the future is that it's going to be different from the past. Whatever we've done in the past 30 years in systematic quantitative investment, we know that for the future the requirement is going to be higher, and we're going to need to make sure that we have the right talent to tackle some of the new challenges.

    Mary-Catherine Lader: So we think of financial markets and investing as being so precise, but it's funny you mention that actually a lot of the questions we're asking, the data is kind of fuzzy. And that's part of what appeals to talent. I remember when we were setting up the artificial intelligence lab with Stephen Boyd, who's of course a professor at Stanford, and some of his colleagues from Stanford and Berkeley earlier this year, they mentioned that that was a big part of why they started working with you and the SAE team, is because these questions are so undefined, and markets are so unpredictable. So can you talk a little bit about what Stephen and some of his colleagues are working on at the AI lab in Palo Alto?

    Jeff Shen: I think there are two very important elements that Stephen and the AI lab at Palo Alto are helping us with. I think number one is that, coming from investment and finance background, I think all of us are very much used to a data-poor environment, historically. And I think what Stephen and the advisors bring to the table is certainly a mindset that allows the data to tell us a bit more of what's going on in the world. So rather than going from theory to the world, this is actually going the opposite way; it's actually coming from data to inform us what's going on, really going on in the world. I think the second part is also this concept of machine learning, in the sense that the machine can really learn. And I think that's a bit of a new-new world, in the sense that historically we're very used to telling a machine what to do, with human judgment and human intuition. In the new world, I do think that the machine, with a very simple, fundamental algorithm embedded in there, can potentially learn things that we've never thought about as possible. And this concept of "machine can learn on its own" I think is going to have an increasingly greater role in investment and how we operate our business going forward. So, both -- whether it's data to tell us a bit more, or letting the machine learn, I think are going to be quite disruptive in our own investment business.

    Mary-Catherine Lader: We're going to end with a rapid-fire round, and since you spend all day thinking about the future, and you have a pretty unique breadth of data with which to assess and analyze the future, I'm going to ask you how many years you think it'll take for some of these things to become reality. So five, 10, 30, never. You know, whatever you think. Ready?

    Jeff Shen: Yes.

    Mary-Catherine Lader: Okay. Autonomous vehicles.

    Jeff Shen: Five years.

    Mary-Catherine Lader: Okay. I'm going to ask some follow-up questions. So, why five years?

    Jeff Shen: I already see them on the street corners in San Francisco, and I think the challenge there is more to do with how do you put autonomous vehicles alongside human driving in a cohesive fashion? The technology itself is getting to be quite mature. But I think the challenge is really how to fit that into the existing framework. If the world is driven completely by autonomous vehicles, I would say that's probably within one or two years we can do it. But the problems are human.

    Mary-Catherine Lader: Human life on Mars.

    Jeff Shen: Thirty years. Mars is pretty far away. [Laughs] And I think it's not only getting there that's difficult, but it's also how do you stay there over the intermediate horizon. I think it's going to take a while.

    Mary-Catherine Lader: What about commonplace use of gene editing?

    Jeff Shen: Ten years. I think that's a part that is surprising, in the sense that the technology breakthrough there is quite real. But at the same time, I think the realization is that you may have a dictionary of a particular language, in this case here, for DNA. But to truly be able to solve the puzzle with one dictionary is actually not enough. The human body and human evolution are much more complex than we thought. At the same time, the computational power is probably going to allow us to make some breakthrough over the next 10 years.

    Mary-Catherine Lader: How about when electric cars might outnumber gasoline-fueled cars?

    Jeff Shen: Twenty years will be my guess. I think ultimately it comes down to the cost of battery and the regulatory environment. We may see a bit greater adoption in markets like China or India, where the existing infrastructure is such that existing gasoline cars are still on the rise, and people may have much greater adoption in some of these emerging markets than developed markets.

    Mary-Catherine Lader: That's fascinating. And a little discouraging. Anyway, thank you, Jeff, so much for joining us today. It's been such a pleasure to talk with you about what you're doing at SAE, how you and your team use machine learning, big data, these often-used but not-that-well-understood terms. And what it means for the future of investing. Thanks again.

    Jeff Shen: Thanks very much, M.C.
