How Sony AI’s Scientific Discovery Team is Reimagining How Researchers Evaluate Hypotheses

Sony AI

January 26, 2026

In today’s research landscape, thousands of scientific papers are published each day, forming a metaphorical sea of knowledge. Even domain experts struggle to keep up. As Pablo Sánchez Martín, Researcher at Sony AI, explained, “Doing a proper literature review can easily become a full-time job, and that’s only one part of a scientist’s work.”

Recognizing this challenge, the team set out to help researchers navigate and leverage that overwhelming scale. Rather than focus solely on managing the volume, the team is exploring how the entire body of scientific literature can be transformed into a resource: one that allows AI to harness existing scientific knowledge and surface connections that the literature implies but has not yet articulated.

Instead of treating research papers as isolated documents, the team analyzes them as part of a vast, interconnected, and evolving system of ideas. By modeling how knowledge accumulates and shifts over time, their work aims to uncover the hidden relationships that link discoveries across domains, from biology to chemistry to medicine.

For Uchenna Akujuobi, Senior Research Scientist at Sony AI, the origins of this project trace back years. In early conversations between Sony and the Systems Biology Institute in Tokyo, a recurring theme emerged: experts have no practical way to keep up with the accelerating pace of scientific publication. “Even domain experts specialize in very narrow areas,” Akujuobi said. “So the question became: can AI help researchers access more information across domains and become more efficient?” That question eventually led to the team’s first NeurIPS paper.

From the outset, however, the goal has reached beyond efficiency. The team’s ambition is to build systems that help researchers uncover knowledge that should exist: scientifically plausible links that are supported by decades of findings but have not yet appeared in published literature.

Framing the Approach

At the center of the team’s work is literature-based hypothesis generation (LHG), an approach that uses the structure and evolution of scientific literature to infer what knowledge is missing but scientifically plausible.

Traditional machine learning is built to detect patterns in existing data. LHG, by contrast, aims to identify relationships that should or could exist: relationships implied by decades of scientific findings but not yet articulated in the literature. To do this, the team relies heavily on knowledge graphs (KGs), which model scientific concepts and their relationships over time.

This temporal view is essential. Ideas rarely appear fully formed; they emerge gradually as new studies reinforce, refine, debunk, or contradict earlier work. As Tarek R. Besold, Senior Staff Research Scientist at Sony AI, explained, their system reconstructs the scientific record year by year, allowing the model to reason about how ideas evolve rather than treat knowledge as static.
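To make the year-by-year idea concrete, here is a minimal sketch, not the team’s actual system. It assumes a toy list of links with hypothetical concept names, years, and paper ids, and it swaps in a simple shared-neighbor heuristic where a real system would use a trained predictive model. The function names `snapshot` and `candidate_links` are illustrative only.

```python
from collections import defaultdict

# Hypothetical toy records: (concept_a, concept_b, year first reported, paper id).
EDGES = [
    ("gene_X", "protein_Y", 2001, "paper-1"),
    ("protein_Y", "disease_Z", 2005, "paper-2"),
    ("gene_X", "disease_Z", 2012, "paper-3"),
    ("compound_Q", "protein_Y", 2008, "paper-4"),
]


def snapshot(edges, year):
    """Return the undirected graph of links published up to and including `year`."""
    graph = defaultdict(set)
    for a, b, published, _paper in edges:
        if published <= year:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def candidate_links(graph):
    """Pairs with no direct link yet that share at least one neighbor.

    This shared-neighbor rule is a stand-in assumption, not the method
    described in the article.
    """
    nodes = sorted(graph)
    found = {}
    for i, a in enumerate(nodes):
        for c in nodes[i + 1:]:
            if c in graph[a]:
                continue  # already linked in the literature
            shared = graph[a] & graph[c]
            if shared:
                found[(a, c)] = sorted(shared)
    return found


candidates_2005 = candidate_links(snapshot(EDGES, 2005))
candidates_2012 = candidate_links(snapshot(EDGES, 2012))
```

In this toy data, the 2005 snapshot suggests a gene_X to disease_Z link through protein_Y, and the 2012 snapshot no longer lists it because the link has since been published. Holding out later years in this way is one plausible way to check whether suggested links eventually appear.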

Knowledge graphs also provide interpretability. “We know exactly where each predicted connection comes from because it’s grounded in literature,” Sánchez Martín said. This transparency is a crucial advantage over language models, which often lack clear reasoning paths and require significant computational resources to process thousands of papers at once.
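The interpretability point can be illustrated with a small sketch. It assumes each link in the graph stores the ids of the papers that reported it, so a suggested connection can be traced hop by hop back to its sources. The data, the `EVIDENCE` table, and the `reasoning_path` helper are hypothetical, and breadth-first search is used here only as a simple stand-in for whatever path selection the real system applies.

```python
from collections import deque

# Hypothetical toy evidence: each link maps to the papers that reported it.
EVIDENCE = {
    ("gene_X", "protein_Y"): ["paper-1", "paper-7"],
    ("protein_Y", "pathway_P"): ["paper-2"],
    ("pathway_P", "disease_Z"): ["paper-3"],
}


def build_adjacency(evidence):
    """Index links in both directions, keeping the supporting papers."""
    adj = {}
    for (a, b), papers in evidence.items():
        adj.setdefault(a, {})[b] = papers
        adj.setdefault(b, {})[a] = papers
    return adj


def reasoning_path(evidence, source, target):
    """Shortest chain of literature-backed links from source to target.

    Returns a list of (from, to, papers) hops, or None if no chain exists.
    """
    adj = build_adjacency(evidence)
    if source not in adj or target not in adj:
        return None
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nxt in adj[node]:
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    if target not in previous:
        return None
    # Walk back from target to source, collecting the papers behind each hop.
    hops = []
    node = target
    while previous[node] is not None:
        prev = previous[node]
        hops.append((prev, node, adj[prev][node]))
        node = prev
    return list(reversed(hops))


path = reasoning_path(EVIDENCE, "gene_X", "disease_Z")
for start, end, papers in path:
    print(f"{start} -> {end} (supported by {', '.join(papers)})")
```

Each hop carries its own citations, so a reader can inspect the individual papers behind a suggestion rather than accept an unexplained score.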

The team’s approach reflects a broader scientific lineage.

In a 2016 article in AI Magazine, Hiroaki Kitano framed the challenge succinctly: “Biomedical research is flooded with data and publications at a rate far beyond human information-processing capabilities. Over one million papers are published each year, and this rate is increasing rapidly.”

That observation forms part of the philosophical foundation for Sony AI’s work, reinforcing why structured, machine-assisted reasoning has become essential.

Process & System Design

Building a system capable of modeling decades of evolving literature required expertise spanning AI, ontology design, data engineering, and scientific reasoning. Much of the team’s work involves transforming unstructured scientific text into temporal knowledge graphs that can support predictive models.

The data is inherently messy: terminology is inconsistent, key details vary between authors, scientific viewpoints sometimes conflict, and even formatting varies. Yet these inconsistencies are not obstacles; they are signals. They reveal how ideas diverge and converge over time.
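As a rough illustration of one step in turning messy text into a temporal graph, the sketch below maps inconsistent surface terms to canonical concepts and keeps the earliest year each pair was seen. Everything here is an assumption for illustration: the synonym table, the concept names, and the `canonical` and `build_temporal_edges` helpers are hypothetical, and a real pipeline would likely rely on curated ontologies and entity-linking models rather than a hand-written dictionary.

```python
# Hypothetical synonym table mapping normalized surface forms to concept ids.
SYNONYMS = {
    "protein y": "protein_Y",
    "prot-y": "protein_Y",
    "py": "protein_Y",
    "disease z": "disease_Z",
    "dz": "disease_Z",
}


def canonical(term):
    """Lowercase, collapse whitespace, then look up a canonical concept id."""
    key = " ".join(term.lower().split())
    return SYNONYMS.get(key, key)


def build_temporal_edges(mentions):
    """Collapse raw (term_a, term_b, year) mentions into canonical pairs.

    Returns a dict mapping each sorted concept pair to the first year seen.
    """
    first_seen = {}
    for a, b, year in mentions:
        pair = tuple(sorted((canonical(a), canonical(b))))
        if pair[0] == pair[1]:
            continue  # both terms name the same concept, so there is no link
        first_seen[pair] = min(year, first_seen.get(pair, year))
    return first_seen


# Toy mentions with inconsistent spelling, casing, and spacing.
mentions = [
    ("Protein Y", "DZ", 1998),
    ("prot-Y", "Disease  Z", 1995),
    ("PY", "protein y", 2000),
]
edges = build_temporal_edges(mentions)
```

Here three differently written mentions collapse into a single dated link between two concepts, which is the kind of record a year-by-year graph needs.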

Frederick Gifford, Senior Product Manager at Sony AI, emphasized that the team’s approach aligns with a long-standing effort to use computational tools to explore scientific possibility space.

“A lot of what we’re doing sits on a continuum with earlier attempts to help scientists reason across vast, complex bodies of knowledge,” he said.

That continuum traces back to Kitano’s argument that “the critical aspect of scientific discovery is how many hypotheses can be generated and tested, including examples that may seem highly unlikely.” Sony AI’s system builds on that perspective by enabling researchers to identify not just existing relationships in the literature, but the ones that are likely to emerge next.

Equally important was ensuring the system would be useful to real researchers. Generating predictions is one thing; making them actionable or useful is another. The team evaluates not only model accuracy, but also whether suggestions provide meaningful scientific direction, and, most importantly, whether they help researchers ask better questions, explore new areas, or avoid unproductive paths. This required building interfaces that surface clear, literature-backed reasoning chains rather than opaque predictions.

Challenges & Learnings

All members of the team agree that the most significant challenge is the data itself. Scientific literature is inconsistent, contradictory, and often incomplete. “It can feel like the edge cases are infinite,” Sánchez Martín said. These inconsistencies make scientific reasoning difficult for humans, and they create challenges for machines as well.

Resource constraints added further complexity. Unlike large industry labs with vast compute budgets, Sony AI had to design models that work efficiently under practical limitations. But Akujuobi noted that these constraints actually pushed the team to innovate, forcing them to find solutions that were both computationally lean and scientifically effective.

Another important challenge is trust. Machine-generated hypotheses can be met with skepticism, especially in fields where scientific intuition develops over years of study. To address this, Gifford explained that Sony AI’s system surfaces reasoning paths behind each prediction. “This is a major benefit of models built on Knowledge Graphs,” he said, “you can go back to the Graph to understand the architecture of the prediction.” Kitano described this environment of noisy and imperfect knowledge as the “twilight zone of scientific discovery,” where AI systems must be designed to reason through uncertainty rather than avoid it.

The team also learned valuable lessons from collaborations with biologists and domain experts. In early iterations, researchers found some models too complex, preferring simpler paths through the literature rather than deeply nested contextual layers. These insights helped refine the system to better align with real-world scientific workflows.

Future Outlook

The researchers share a common belief: AI should augment human scientific thought, not replace it. “The goal isn’t to stop scientists from thinking,” Akujuobi said. “It’s to give them more confidence and make them more efficient.” Besold added that efficiency is especially important given limited scientific resources.

In practice, that means giving researchers a wider field of view by surfacing literature-grounded connections and clear reasoning paths they can evaluate, challenge, and build on.

Looking ahead, Gifford described the team’s ambition as enabling researchers to reason across the full breadth of scientific knowledge, a goal consistent with earlier visions for computational discovery systems. What excites the team most is the potential for deeper, faster scientific discovery.

“There are diseases without cures,” Akujuobi said. “But that doesn’t mean cures don’t exist—just that they haven’t been discovered.” Sony AI hopes to play a meaningful role in helping researchers uncover those missing connections.

Ultimately, their work signals a shift in how scientific literature can be used: not simply as a record of what is already known, but as a structured foundation for identifying discoveries that the accumulated body of knowledge suggests should exist. By putting the full power of scientific knowledge into the hands of researchers, the team aims to support a new era of scientific reasoning grounded in both human insight and the collective evidence of decades of scientific thought.
