Improving LLM Reasoning in Materials Science

A modular, DAG‑based framework for experimenting with pipelines that test and improve the reasoning capabilities of LLMs.

GitHub repository

This project reimagines how LLMs can reason about science. It transforms raw materials-science papers into structured causal contracts, then uses those contracts to generate and evaluate reasoning tasks in a fully reproducible pipeline. The result is a research framework that makes scientific AI more transparent, testable, and ultimately more useful for discovery.
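The paper-to-evaluation flow described above can be sketched as a small DAG of stages run in dependency order. Everything here is illustrative: the stage names, the state dictionary, and the stubbed LLM step are assumptions for the sketch, not the project's actual code.

```python
from graphlib import TopologicalSorter

# Hypothetical stages; in the real pipeline, extract_contract would call an LLM.
def ingest(state):
    state["paper"] = "Doping X increases conductivity via mechanism Y."
    return state

def extract_contract(state):
    # Stand-in for an LLM call that emits a structured causal contract.
    state["contract"] = {
        "evidence": state["paper"],
        "mechanism": "Y",
        "outcome": "higher conductivity",
    }
    return state

def generate_tasks(state):
    c = state["contract"]
    state["tasks"] = [
        f"Explain how {c['mechanism']} leads to {c['outcome']}, citing the evidence."
    ]
    return state

def evaluate(state):
    # Model answers would be scored here; left unscored in this sketch.
    state["scores"] = {task: None for task in state["tasks"]}
    return state

# Edges map each stage to the stages it depends on.
DAG = {
    "ingest": set(),
    "extract_contract": {"ingest"},
    "generate_tasks": {"extract_contract"},
    "evaluate": {"generate_tasks"},
}
FUNCS = {
    "ingest": ingest,
    "extract_contract": extract_contract,
    "generate_tasks": generate_tasks,
    "evaluate": evaluate,
}

def run_pipeline():
    state = {}
    # static_order() yields stages only after all their dependencies.
    for stage in TopologicalSorter(DAG).static_order():
        state = FUNCS[stage](state)
    return state
```

Because every stage reads and writes an explicit state object and the execution order is derived from the DAG, each intermediate artifact can be logged and replayed, which is what makes the pipeline reproducible.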

This matters because materials discovery depends on linking evidence, mechanisms, and outcomes with precision. Instead of treating LLMs as black boxes, this framework makes their reasoning traceable, reproducible, and open to controlled comparison across different context settings and models.

What makes it novel is the combination of agentic, DAG-orchestrated research pipelines with schema-constrained contracts and end-to-end evaluation artifacts. In other words, it is a modular research system for studying, challenging, and improving scientific reasoning itself.
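One way to picture a schema-constrained contract is as a typed record that rejects malformed LLM output instead of silently accepting it. This is a minimal sketch assuming a three-field evidence/mechanism/outcome shape; the field names are illustrative, not the project's actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CausalContract:
    """Illustrative causal contract extracted from a paper."""
    evidence: str   # observation reported in the source paper
    mechanism: str  # proposed causal mechanism
    outcome: str    # claimed effect

    def __post_init__(self):
        # Enforce the schema: every field must be a non-empty string,
        # so a truncated or malformed LLM response fails loudly here.
        for name in ("evidence", "mechanism", "outcome"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"field {name!r} must be a non-empty string")
```

Validating at construction time is what keeps downstream task generation and evaluation comparable across models: every contract that enters the pipeline is guaranteed to have the same structure.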

This project was a bifurcation point in my research on applications of AI in science. Some of its ideas became the foundation of AIR, a smart autonomous maintainer that harnesses AI to create self-healing, self-maintaining software.