A prescription for healthcare investing

Advances in technology are transforming patient care, easing the burden of disease, and generating compelling new opportunities for investors. In this article, we investigate how advances in artificial intelligence and data science can identify areas of healthcare innovation and potential investment opportunities.

Key points


Biomedicine is at an inflection point

As the healthcare sector grows in importance, relevance, and size against a backdrop of an aging global population and rapid scientific innovation, so too does the potential opportunity for investors.


Sector-specific risks remain

Risks in therapeutic development – high cost, long duration, and high failure rates – translate to challenges in healthcare investing. We believe a data-driven approach is ideally suited to address those challenges.


A data-driven approach

By leveraging a wealth of data, a systematic approach to healthcare investing can offer the distinct advantages of a scalable and transparent investment process. Our approach seeks to identify investment opportunities, construct portfolios, and manage risks—while striving to deliver positive outcomes for both patients and investors.

Biomedicine is at an inflection point

When discussing cancer treatments, oncologists rarely use the word “cure.” Cancers often mutate and adapt, and patients who are initially treated successfully can relapse months or years later. But today’s oncologists, and researchers in many other diseases, are using the word “cure” more often, thanks to advances in computation, genomics, immunology, and precision medical technologies. In fact, experts observe that we’re experiencing an inflection point in biomedicine: a truly exceptional period in history where breakthroughs in biology, medicine, and technology are converging to save lives and ease the burden of disease.

Despite generally favorable fundamentals and a rich potential opportunity set, healthcare investing has traditionally been the domain of a relatively small community of sector-specific specialists. Most healthcare investors rely on a traditional fundamental approach, leveraging key opinion leaders and their own scientific assessment to determine a company’s attractiveness.1 This manual process can mean that the coverage area is narrow, scale is harder to achieve, reaction times are slower, and insights can be subject to human bias.

We believe that leveraging data science can transform healthcare investing by codifying the insights of domain-specific experts into a modern, scalable investment process. We introduce a data-driven investment framework that combines expert scientific insights with robust portfolio construction techniques that seek to better select the most promising treatments and provide greater diversification potential. Thanks to recent advances in artificial intelligence, data science, and computing power, tasks that were previously best accomplished manually — such as estimating probabilities of clinical success, conducting scientific literature reviews, and performing discounted cash flow analyses — can now be automated to a significant degree and systematically applied to thousands of drugs, clinical trials, and therapeutic areas. We believe these advances are poised to provide an unprecedented level of access and transparency to a wider spectrum of investors, attract greater amounts of capital to this important sector, and lead to even more breakthroughs for patients in need — all while delivering potentially compelling returns to investors.

A specific set of risks

Although healthcare presents a potentially compelling set of opportunities for investors, it is not without risk. There are challenges and attributes specific to this sector, some of which are directly related to the life-saving nature of biomedical innovation.

By their very nature, healthcare investments tend to have binary outcomes: either a drug is safe and effective or it’s not; there is rarely any middle ground. Given the rigorous safety standards to which therapeutics are held, the likelihood of success for a new therapeutic has historically been quite low across the industry, with fewer than 10% of therapies ultimately receiving approval.2 Certain sectors have even lower approval rates: the historical success rate of a cancer drug from early clinical trials through final approval is only 5.3%.3 Exacerbating the investment challenge is the often-unappreciated correlation between these clinical trials: outcomes are not necessarily independent, and without proper risk management, the innate correlations can compound the high level of risk already present when investing in clinical-stage biotech companies.4

Figure 1: Overall Likelihood of Approval from phase 1 by disease area

Clinical Development Success Rates 2011-2020

Source: Clinical Development Success Rates and Contributing Factors 2011-2020. Authors: Biotechnology Innovation Organization, PharmaIntelligence, QLS Advisors. Data covers individual drug program phase transitions from January 1, 2011 to November 30, 2020.

As a result, the financial risks associated with an investment in a clinical trial tend to be binary as well—a trial either succeeds or fails, and stock prices tend to experience significant moves in response to such news. Therefore, a clinical-stage biopharma company faces significant event risk on a regular basis. While most publicly traded equities may experience dramatic volatility on occasion, for clinical-stage therapeutics companies this is the norm rather than the exception (see Figure 2).

Figure 2: Average stock price return in anticipation of and in response to clinical trial outcomes

Stock price returns of public healthcare companies that had phase 2 and 3 clinical trial events between January 2012 and April 2022

Source: QLS. *Stock price return of all public healthcare companies which had phase 2 and 3 clinical trial events between January 1, 2012-April 30, 2022. **Average holding period of 131 calendar days (starting the first day of the quarter before the readout quarter). Past performance is not a reliable indicator of future results and should not be the sole factor of consideration when selecting a product or strategy.

The binary nature of healthcare risk implies that even a well-diversified portfolio of such companies will exhibit higher levels of risk than less-binary investments, risk that is temporally concentrated near and on clinical trial readouts. This underscores the importance of robust portfolio construction and thoughtful risk management.5

Measuring and managing the specific types of risk of the healthcare sector requires a new set of tools to supplement those used in standard asset management contexts. We discuss some of these tools in the following sections.

A data-driven approach

To better understand the nature of risk and reward, one natural starting point is the “Fundamental Law of Healthcare Finance.”6 This is a stylized expression of the economic value of a therapeutic program, written as a function of just four terms:

$$\mathbb{E}[\mathrm{NPV}] = \mathrm{PV} \times \mathrm{PoS} - \mathrm{Costs}$$

The expected net present value (NPV) of a therapeutic candidate is equal to the product of the present value (PV) of all future cash flows from sales of the therapy if approved times the probability of success (PoS), less the cost of developing, manufacturing, and delivering the therapy to patients (Costs).

In practice, the Fundamental Law of Healthcare Finance can be applied many times over, asset by asset through the well-known discounted cash flow calculations that lie at the core of fundamental equity analysis. Information and assumptions about revenues, costs, and PoS are combined to yield an estimated price per share which can then be compared to the current market price to determine whether the company is attractively valued. We illustrate this concept through a hypothetical example in Figure 3: our PoS-implied price estimate for Stock 1 leads to a higher assessment of fair value, suggesting the market is currently undervaluing the stock and it might therefore be an attractive position to hold. Conversely, our PoS-implied estimate for Stock 2 suggests the market is overvaluing the stock, while we think current pricing is reasonably fair for Stock 3.
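The comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the valuation model used in practice: the function names, the 10% fair-value band (`tolerance`), and every input figure are hypothetical and chosen only to reproduce the three cases (undervalued, overvalued, fairly valued).

```python
def pos_implied_price(pv_if_approved, pos, costs, shares_outstanding):
    """Per-share value implied by expected NPV = PV * PoS - Costs."""
    expected_npv = pv_if_approved * pos - costs
    return expected_npv / shares_outstanding


def classify(market_price, implied_price, tolerance=0.10):
    """Compare prices; `tolerance` is an arbitrary band treated as fair value."""
    gap = (implied_price - market_price) / market_price
    if gap > tolerance:
        return "undervalued"
    if gap < -tolerance:
        return "overvalued"
    return "fairly valued"


# Hypothetical inputs only: PV and costs in $ millions, shares in millions.
stocks = {
    "Stock 1": {"pv": 2000.0, "pos": 0.30, "costs": 200.0, "shares": 10.0, "market": 25.0},
    "Stock 2": {"pv": 1500.0, "pos": 0.10, "costs": 100.0, "shares": 10.0, "market": 12.0},
    "Stock 3": {"pv": 1000.0, "pos": 0.25, "costs": 50.0, "shares": 10.0, "market": 19.0},
}

labels = {}
for name, s in stocks.items():
    implied = pos_implied_price(s["pv"], s["pos"], s["costs"], s["shares"])
    labels[name] = classify(s["market"], implied)
    print(f"{name}: market {s['market']:.2f}, implied {implied:.2f}, {labels[name]}")
```

A real discounted cash flow analysis would model cash flows year by year and across a company's whole pipeline; the single-asset form here only mirrors the stylized expression.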

Figure 3: Market vs. PoS-implied pricing

Market price for public healthcare companies vs. implied probability of success pricing

Source: BlackRock, as of December 31, 2022. Graphic is shown for illustrative purposes only and does not depict actual data. Orange bars are meant to represent a hypothetical market price for each hypothetical stock; the yellow lines represent the hypothetical price calculated using a hypothetical PoS estimate. None of the stocks represents an actual stock, holding, or position; all data points are selected to illustrate the concept.

More accurate predictions with machine learning

Given that clinical trial outcomes tend to have a significant impact on stock returns, accurately estimating the PoS for any given clinical trial can be a major factor in successful healthcare investing. Sophisticated analytical techniques can improve the accuracy of PoS estimations by leveraging data to achieve scale, update estimates in real time, and mitigate human bias.

A systematic approach to healthcare investing draws on insights and analytics similar to those used by more traditional, fundamental managers, but uses data-driven tools to add value at scale by augmenting the effective throughput of the conventional healthcare analyst and exploiting empirical regularities that are not apparent or tractable through fundamental analysis. Unlike a traditional approach, a data-driven approach to PoS may help mitigate human bias while allowing for the consideration of many more data points, in real time, across more therapies. Advanced technologies such as machine learning and artificial intelligence afford the opportunity to leverage data from hundreds of thousands of clinical trial events across an array of therapeutics. Using many features of a company’s therapeutic candidate (including the target disease, the biological and chemical nature of the candidate, the design aspects of the clinical trial, and the researchers’ and companies’ track records, among others) can allow us to form a more accurate PoS estimate.

Given the magnitudes of the cash flows involved in approved drugs, small differences in PoS estimates can lead to very large differences in valuations and, therefore, investment decisions. Moreover, this systematic approach to estimating PoS may be easily automated and scaled to encompass the broader universe of potential therapeutic investments, allowing large portfolios to be monitored and managed efficiently and transparently.
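To see why small PoS differences matter, consider a rough sensitivity check using the stylized expression expected NPV = PV x PoS - Costs. The figures below are hypothetical and picked only to make the arithmetic easy to follow; they do not describe any actual therapy.

```python
def expected_npv(pv_if_approved, pos, costs):
    """Stylized expected NPV of a therapeutic program."""
    return pv_if_approved * pos - costs


# Hypothetical program: $5,000M present value if approved, $400M in costs.
pv, costs = 5000.0, 400.0

base = expected_npv(pv, 0.10, costs)    # PoS estimated at 10%
bumped = expected_npv(pv, 0.12, costs)  # PoS estimated at 12%

# A two-percentage-point change in PoS roughly doubles the expected NPV here,
# because the cost term is fixed while the revenue term scales with PoS.
print(f"NPV at 10% PoS: {base:.0f}M, at 12% PoS: {bumped:.0f}M")
```

The effect is strongest when PV x PoS is close to Costs, which is often the situation for early-stage programs with low success probabilities.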

Why use machine learning?

Recent advances in machine learning and artificial intelligence can identify non-linear predictive patterns with many different sources of data. For example, when training a decision tree, one task is to learn the most informative questions to ask at each branch point of the tree. The algorithm might ask, “Is the sponsor a big pharma company?” If the answer is “yes,” then the algorithm would move to the “yes” branch of the decision tree and then ask, “Is the trial double-blind?” On the other hand, if the answer is “no,” then the algorithm would move to the “no” branch of the decision tree, and ask, “Is the drug a small molecule?” and so on, until enough information has been collected about the drug-program to make a prediction about its potential success or failure.
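The sequence of questions above maps directly onto nested conditionals. The sketch below hard-codes one such tree; in a trained model the questions and the leaf values would be learned from data. The class name, field names, and leaf probabilities are all hypothetical placeholders, not estimates from any model.

```python
from dataclasses import dataclass


@dataclass
class DrugProgram:
    sponsor_is_big_pharma: bool
    trial_is_double_blind: bool
    drug_is_small_molecule: bool


def tree_predict_pos(program):
    """Walk one hand-written decision tree and return a leaf probability."""
    if program.sponsor_is_big_pharma:
        # "yes" branch: the next question concerns the trial design
        if program.trial_is_double_blind:
            return 0.20  # hypothetical leaf value
        return 0.12      # hypothetical leaf value
    # "no" branch: the next question concerns the drug modality
    if program.drug_is_small_molecule:
        return 0.08      # hypothetical leaf value
    return 0.05          # hypothetical leaf value


example = DrugProgram(
    sponsor_is_big_pharma=True,
    trial_is_double_blind=True,
    drug_is_small_molecule=False,
)
print(tree_predict_pos(example))
```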

The motivation for a random forest (a commonly used machine learning algorithm) is then to aggregate the predictions of many different decision trees, each of which is trained over a randomized subset of variables. As in the “wisdom of crowds” phenomenon, while each individual tree may be a weak predictor, their collective average yields a much stronger predictor. By identifying subtle patterns in a sufficiently large database of historical drug transitions, machine learning algorithms can potentially improve the accuracy of predictions.
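The “wisdom of crowds” intuition can be made concrete with a simple binomial calculation. The sketch assumes each tree is right with the same probability and that trees err independently; real trees in a forest are correlated, so the gain in practice is smaller. It also uses a majority vote rather than an average of probabilities, which is a simplification. The function name is illustrative.

```python
from math import comb


def majority_vote_accuracy(n_trees, p_correct):
    """Probability that more than half of n independent voters are correct."""
    threshold = n_trees // 2 + 1
    return sum(
        comb(n_trees, k) * p_correct**k * (1 - p_correct) ** (n_trees - k)
        for k in range(threshold, n_trees + 1)
    )


# Each weak predictor is right 60% of the time (an assumed figure).
for n in (1, 11, 101):
    print(n, round(majority_vote_accuracy(n, 0.60), 3))
```

Under these assumptions, accuracy rises steadily with the number of trees, which is why randomizing the trees (to reduce their correlation) is central to the method.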

Figure 4: Illustrative example of machine learning decision tree to analyze clinical trial data

Machine learning random forest decision tree to analyze clinical trial data

Random forest is a machine learning algorithm for classification and prediction in which a collection of decision trees (such as the one illustrated above) is created to estimate the likelihood of a certain event. Each decision tree is distinct, differing in the ordering of the “questions” (i.e., model features), the set of features used, and other attributes, which seeks to limit the model’s tendency to overfit the data used to train it. For the purposes of forecasting, the resulting estimate (in this case, the probability of clinical success) is a blended prediction (e.g., the mean prediction) across hundreds or thousands of individual decision trees. Source: BlackRock and QLS as of May 1, 2023. The machine learning process success rate is being provided for illustrative purposes only as a hypothetical example of what the process seeks to potentially achieve. The information is not a prediction of future performance of any investments selected by the process and does not represent any actual success rates of the process. Note that the above process information is hypothetical and illustrative based on the current market environment; it does not reflect actual positions proposed. Strategies and targets depend upon a variety of factors, including prevailing market conditions and investment availability. There is no guarantee that they will be achieved or that any particular investment will meet the target criteria.

Predicting clinical success

Figure 5 compares our models’ point-in-time predicted PoS for over 4,000 clinical trials versus the realized outcome of each trial. The positive relationships capture the predictive power of our models. We look at clinical trials across four stages, each represented by one of the four histograms below: phase 1 to approval, phase 2 to approval, phase 3 to approval, and regulatory filing to approval. The x-axis of each histogram groups our PoS predictions into quintiles. The height of each bar represents the quintile’s realized frequency of success in the 2020–2021 out-of-sample period. The upward-sloping pattern in each histogram demonstrates a direct relation between the predictions and subsequent performance: for each stage, the clinical trials with lower forecasted PoS have lower realized success rates than the trials with higher forecasted PoS. Although the out-of-sample realized success rates aren’t precisely equal to the forecasted rates, the similarity is suggestive of significant forecast power in these machine learning estimates.
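The quintile comparison described above can be reproduced with a short routine: sort trials by predicted PoS, split them into five equal-sized groups, and compute the realized success frequency in each. The sketch below uses ten made-up trials purely to show the mechanics; the function name and all data are hypothetical and unrelated to the results in Figure 5.

```python
def realized_rate_by_quintile(predicted_pos, outcomes):
    """Return the realized success frequency for each predicted-PoS quintile."""
    if len(predicted_pos) != len(outcomes):
        raise ValueError("predictions and outcomes must have the same length")
    if len(predicted_pos) < 5:
        raise ValueError("need at least five trials to form quintiles")
    ranked = sorted(zip(predicted_pos, outcomes), key=lambda pair: pair[0])
    n = len(ranked)
    rates = []
    for q in range(5):
        group = ranked[q * n // 5:(q + 1) * n // 5]
        rates.append(sum(outcome for _, outcome in group) / len(group))
    return rates


# Hypothetical predictions and outcomes (1 = approved, 0 = discontinued).
predictions = [0.50, 0.05, 0.80, 0.20, 0.35, 0.10, 0.70, 0.15, 0.55, 0.30]
outcomes = [1, 0, 1, 1, 1, 0, 1, 0, 0, 0]

rates = realized_rate_by_quintile(predictions, outcomes)
print(rates)
```

A model with forecast power should produce rates that generally increase from the lowest quintile to the highest, as in this toy example.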

Figure 5: Estimates of clinical success are positively correlated with realized clinical success

Probability of clinical success vs realized clinical success from 2000 to 2019

Source: BlackRock using data from QLS as of 12/31/2021. The graph plots the model’s predicted probability of clinical trial success (trials grouped into ranges) vs. the realized clinical success rates for the same groups of trials. Predictions are shown for programs which were discontinued or received approval in 2019 or later; the predictions are based on a model trained on data from the years 2000 to 2019 (no overlap). The results show the model’s forecast power on out-of-sample outcomes: trials with higher probability estimates were more likely to be approved, and vice versa.

Accounting for correlations

In addition to transforming a sea of raw data into useful investment insights, a systematic investment approach can mitigate downside risk by building a more robust portfolio that accounts for the correlations between clinical trial outcomes and, therefore, companies. Diversification in biotech is predicated on the ability to measure the correlation between healthcare companies and between pipeline assets within a company.7 The same systematic techniques that may potentially better predict the PoS for a given treatment can also be used to estimate the correlations between any pair of treatments. Intuitively, two therapeutic programs targeting the same disease mechanism using the same or similar biological agents are more highly correlated—a failure of the first program doesn’t bode well for the prospects of the second. On the other hand, two programs focused on completely different diseases and involving different drug compounds may be much less correlated. We can use this information to help construct a portfolio of diversified assets and more robustly manage the risk of the portfolio.
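One way to see how correlation erodes diversification is to compute the probability that two programs both fail. For two binary outcomes with success probabilities p_a and p_b and correlation rho, the joint failure probability equals (1 - p_a)(1 - p_b) plus the covariance rho * sqrt(p_a(1 - p_a)p_b(1 - p_b)). The sketch below applies that identity; the function name and the input values are hypothetical, and it is not a description of how correlations are estimated in practice.

```python
from math import sqrt


def prob_both_fail(pos_a, pos_b, rho):
    """Joint failure probability of two binary trials with correlation rho."""
    q_a, q_b = 1.0 - pos_a, 1.0 - pos_b
    covariance = rho * sqrt(pos_a * q_a * pos_b * q_b)
    joint = q_a * q_b + covariance
    # Not every rho is attainable for a given pair of probabilities.
    lower, upper = max(0.0, q_a + q_b - 1.0), min(q_a, q_b)
    if not (lower - 1e-12 <= joint <= upper + 1e-12):
        raise ValueError("rho is not feasible for these success probabilities")
    return joint


# Two hypothetical programs, each with a 50% chance of success.
independent = prob_both_fail(0.5, 0.5, 0.0)  # unrelated diseases and compounds
correlated = prob_both_fail(0.5, 0.5, 0.6)   # same mechanism, similar agents
print(independent, correlated)
```

With these inputs the chance of a double failure rises from about 25% to about 40% as the correlation moves from 0 to 0.6, which is the kind of concentration a portfolio construction process would aim to limit.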

Complementary insights

While clinical trial outcomes are an important driver of the financial success of therapeutic companies, there are several other factors that impact valuations and stock price. Creating a robust view about the company’s prospects—whether a public company or private company—requires assessing the company’s fundamentals, sentiment from market participants, the competitive landscape, and industry trends alongside healthcare-specific insights. Alternative data can be a rich and granular source of information. Applying advanced technologies to big data allows us to uncover insights amidst market complexity, discovering information that might not otherwise be obvious to investors. For example, demographic data coupled with historical health insurance claims can help identify growing unmet medical needs. We also leverage data which gives us insight into a company’s broader financial picture, such as employee sentiment, balance sheet strength, or broker attention. In aggregate, these diverse and complementary insights can provide a robust profile to evaluate a given company.

Creating value for investors and society

As the healthcare sector grows in importance, relevance, and size against a backdrop of an aging global population and rapid scientific innovation, so too does the potential opportunity for investors. Therapeutic development programs have the potential to deliver attractive investment returns, but as biomedicine has grown more complex, so have the challenges in identifying and assessing risks for investors.

By leveraging a wealth of available data, applying a systematic approach to healthcare investing can offer a scalable, transparent, and risk-managed process to invest in the healthcare sector. Systematic tools can help scale the breadth and depth of analysis, enhancing more traditional approaches, while remaining anchored in fundamental science. A machine-learned approach—which can analyze hundreds of thousands of clinical trial data points in real time—can account for unapparent linkages to estimate probabilities of clinical success more comprehensively, while the application of sophisticated technologies to alternative data sets can help glean useful information about complementary drivers of valuations, such as fundamentals, sentiment, and trends. This novel approach has the potential to generate better outcomes for investors, increase the availability of capital to fund underinvested but high-need areas of healthcare innovation and research, and improve our society’s future wellbeing.


Raffaele Savi
Global Head of BlackRock Systematic
Jeff Shen, PhD
co-CIO of Systematic Active Equity
Andrew W. Lo, PhD
Co-founder and Chairman of QLS
Ali Almufti, CFA
Andrew Ang, PhD
Shomesh E. Chaudhuri, PhD
Travis Cooke, CFA
Alexandra Eldemir
Linus Franngard
Katelyn Gallagher
Alan Kwan, PhD
