SYSTEMATIC INSIGHT

Alternative data for systematic investing

In an era where data is abundant and accessible, discernment — not information — defines the competitive edge.

Key points

  • 01

    Finding the competitive edge

    With readily available data, the challenge is no longer acquiring data but identifying what matters.

  • 02

    Data evaluation

    Every dataset selected in our systematic process undergoes rigorous evaluation to ensure it adds real, incremental value.

  • 03

    Assessing alpha additivity

    Each candidate dataset is tested for incremental, non-redundant alpha; disciplined selection, robust architecture, and domain expertise transform information overload into a structured path toward alpha.

Navigating the data deluge

Over the last decade, the explosion of alternative datasets — from satellite imagery and job postings to shipping records and scraped web content — has radically expanded what’s measurable. Yet with this abundance comes a new challenge: discernment. Without a rigorous process, it’s easy to succumb to confirmation bias, leaning on a single dataset that ‘proves’ a hypothesis rather than generating genuinely additive insights.

As Figure 1 illustrates, the number of differentiated datasets available to our research team continues to grow across asset classes and geographies, reflecting both the increasing availability of alternative data and the escalating complexity in separating signals from noise. As illustrated in Figure 2, the inflow of datasets surged, yet our onboarding selectivity tightened — with the number of datasets rejected by our research team increasing fivefold from 2019 to 2024.

How we evaluate data quality

Originality

Does the dataset offer new, distinct insights not already captured by existing signals?

Completeness and coverage

Is the data broad, deep, and representative across time, sectors, and geographies?

Timeliness and latency

How quickly is the data updated? Are timestamps reliable and consistent?

Transparency and lineage

Can we trace the source, processing methods, and version history with confidence?
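To make the checklist concrete, here is a minimal sketch (in Python) of how these four criteria could be recorded as a simple scoring rubric during an initial screen. The class name, fields, weights, and example scores are hypothetical illustrations, not a description of our actual evaluation framework.

```python
from dataclasses import dataclass

@dataclass
class DatasetScorecard:
    """Hypothetical rubric for the four quality criteria above, each scored 0-1."""
    originality: float   # offers insight not already captured by existing signals
    coverage: float      # breadth and depth across time, sectors, and geographies
    timeliness: float    # update frequency and timestamp reliability
    lineage: float       # traceability of source, processing methods, and versions

    def overall(self) -> float:
        # Equal weights are purely illustrative; a production process would
        # calibrate weights (and likely use hard gates rather than averages).
        return 0.25 * (self.originality + self.coverage + self.timeliness + self.lineage)

# Example screen of a fictitious web-scraped job-postings feed
card = DatasetScorecard(originality=0.7, coverage=0.5, timeliness=0.9, lineage=0.6)
print(f"Overall quality score: {card.overall():.2f}")
```

In practice such a rubric would sit at the front of the pipeline, filtering candidates before the more expensive signal testing described below.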

Distinguishing signals from noise 

We evaluate prospective signals using a combination of quantitative testing, economic reasoning, and cross-signal validation. Our process seeks datasets that deliver economically intuitive signals with consistent predictive power over asset returns.

  • Testing for statistical power: Metrics such as the Information Coefficient (IC), predictive R-squared, and horizon-decayed information ratio (IR) help quantify a dataset's predictive effectiveness (a simple sketch follows this list).
  • Economic intuition testing: Event studies, cross-sectional regression tests, and integration of the signal into a broader alpha model help validate that the signal is consistent with rational investor behavior.
  • Additivity and redundancy checks: Orthogonality testing against existing signals ensures the new data contributes differentiated, non-redundant insight.
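The sketch below illustrates the first and third checks on synthetic data: it computes an Information Coefficient as the rank correlation between a candidate signal and forward returns, then orthogonalizes the candidate against an existing signal to see how much incremental predictive power survives. All data, coefficients, and variable names are invented for the example and assume a single cross-section; they do not represent any actual dataset or model.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 1000

# Synthetic setup: an existing signal, a candidate that heavily overlaps it,
# and next-period returns driven only by the existing signal.
existing = rng.standard_normal(n)
candidate = 0.8 * existing + 0.6 * rng.standard_normal(n)
fwd_returns = 0.5 * existing + rng.standard_normal(n)

# 1) Statistical power: the Information Coefficient, taken here as the
#    Spearman (rank) correlation between the candidate and forward returns.
ic, _ = stats.spearmanr(candidate, fwd_returns)
print(f"Raw IC of candidate signal:    {ic:.3f}")

# 2) Additivity / redundancy: regress the candidate on the existing signal and
#    test whether the orthogonalized residual still predicts returns. Because
#    the candidate predicts returns only through its overlap with the existing
#    signal, the residual IC should be close to zero.
slope, intercept = np.polyfit(existing, candidate, 1)
residual = candidate - (slope * existing + intercept)
residual_ic, _ = stats.spearmanr(residual, fwd_returns)
print(f"IC of orthogonalized residual: {residual_ic:.3f}")
```

A full research process would extend this to panel data, multiple horizons, horizon-decayed information ratios, and out-of-sample validation; the sketch only isolates the mechanics of the IC and orthogonality checks.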

Finding alpha potential in a sea of raw data 

Sourcing, selecting, and scaling high-quality alternative data is no longer a niche capability — it’s a strategic imperative. In a world where signal libraries number in the thousands, finding the next source of alpha isn’t about being faster — it’s about being smarter.

We believe the future belongs to those who combine human insight, operational precision, and organizational scale. When every marginal dataset matters, how you source data becomes just as important as how you use it.

Read the full report

Discover how BlackRock Systematic is identifying what matters in a sea of raw data.

Authors

Raffaele Savi
Global Head of BlackRock Systematic
Jeff Shen, PhD
Co-Head and Co-CIO of Systematic Active Equities
Gerald Garvey
Co-Head of Equity Research, BlackRock Systematic Active Equities
Linus Franngard
Head of Product Research and Innovation, BlackRock Systematic Active Equities
Alex Remorov
Portfolio Manager and Researcher, BlackRock Systematic Active Equities
Dominique DeRubeis
Researcher, BlackRock Systematic Active Equities
Vassiliki Carter
Investment Strategist, BlackRock Systematic Active Equities