Thursday, 8 January 2026, 00:00 HKT/SGT

Source: Science and Technology of Advanced Materials: Methods (STAM-M)
Unearthing experimental data buried in scientific papers
Large language models accelerate construction of materials property databases.

TSUKUBA, Japan, Jan 8, 2026 - (ACN Newswire) - Technologies that underpin modern society, such as smartphones and automobiles, rely on a diverse range of functional materials. Materials scientists are therefore working to develop new materials and improve existing ones, but predicting material properties is no simple task. Data science is key to transforming this field, and new tools powered by artificial intelligence are expected to accelerate the exploration, collection, and management of materials property data worldwide.

Researchers and artificial intelligence work together to collect experimental materials science data from papers worldwide and build a database. (Copyright: Kenji Tashiro. Instagram: ripplemarkmaker. CC-BY-4.0)

The relationship between functional materials and their properties is complex. Even slight differences in composition or synthesis methods can affect electronic states and microstructures, often resulting in entirely different properties. For this reason, theoretical models alone cannot provide reliable predictions, and the intuition of researchers and engineers built on years of experience has played a significant role.

Machine learning is a technology that can learn empirical trends rather than relying on theory. By applying machine learning to experimental data in materials science, it may be possible to replicate such intuition computationally. Large language models (LLMs), such as ChatGPT, now support the daily lives of many people and are capable of flexible information extraction that takes background knowledge and context into account. This opens up the possibility of automating the process of converting complex information sources like scientific papers into structured data. If large-scale datasets of experimental data can be built through this approach, researchers are expected to gain inspiration from a bird's-eye view of the data and to predict properties from empirical trends using machine learning.

A team led by Dr. Yukari Katsura, a Senior Researcher at the National Institute for Materials Science (NIMS), has focused on this potential and developed two new tools to accelerate the construction of Starrydata, a materials property database built from data collected from scientific papers. This work was recently published in the journal Science and Technology of Advanced Materials: Methods.

"Graphs in the millions of papers published to date contain valuable experimental data collected by past researchers, and much of it remains untapped," says Prof. Katsura. In the Starrydata project, which she launched in 2015, data collection from papers was performed manually and supported by the independently developed Starrydata2 web system, successfully amassing an unprecedented volume of experimental data. The new tools are designed to further streamline this data collection process. "We found that by specifying a data structure and giving instructions to an LLM, we can accurately and comprehensively extract information about figures, tables, and samples from the text of paper PDFs across a wide range of fields."

Prof. Katsura added, "Many publishers prohibit the use of artificial intelligence on paper PDFs, so we are currently developing the system to target open-access papers."

The first tool, Starrydata Auto-Suggestion for Sample Information, is a function that reads the text of a paper and suggests candidate entries for data fields pre-designed for each materials domain; it is already integrated into the Starrydata2 web system. When a user pastes text from a paper's abstract or experimental methods section, it is sent to OpenAI's GPT via API, and candidate entries in English are automatically displayed below each input field.
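
The suggestion flow described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Starrydata2 implementation: the field names in FIELDS, the prompt wording, and the helper names are hypothetical, and fake_llm is a stand-in that returns a canned reply where a real system would call an LLM API.

```python
import json
from typing import Callable

# Hypothetical input fields for one materials domain (assumed for illustration;
# not the actual Starrydata2 field list).
FIELDS = ["composition", "synthesis_method", "sample_form"]


def build_prompt(fields: list[str], pasted_text: str) -> str:
    """Ask the model for one candidate value per field, as a JSON object."""
    return (
        "Extract sample information from the text below.\n"
        f"Return a JSON object with exactly these keys: {', '.join(fields)}.\n"
        "Use null when the text does not state a value. Answer in English.\n\n"
        f"Text:\n{pasted_text}"
    )


def suggest(fields: list[str], pasted_text: str, llm: Callable[[str], str]) -> dict:
    """Return {field: candidate or None}; keys the form does not know are dropped."""
    reply = llm(build_prompt(fields, pasted_text))
    try:
        parsed = json.loads(reply)
    except json.JSONDecodeError:
        parsed = {}  # an unparseable reply yields no suggestions
    if not isinstance(parsed, dict):
        parsed = {}
    return {field: parsed.get(field) for field in fields}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call (assumption): returns a canned reply."""
    return json.dumps({
        "composition": "Bi2Te3",
        "synthesis_method": "spark plasma sintering",
        "unrelated_key": "ignored",
    })


suggestions = suggest(
    FIELDS, "Bi2Te3 pellets were prepared by spark plasma sintering.", fake_llm
)
for field, candidate in suggestions.items():
    shown = candidate if candidate is not None else "(no suggestion)"
    print(f"{field}: {shown}")
```

Keeping the field list separate from the prompt text mirrors the idea of data fields pre-designed for each materials domain: switching domains would only mean passing a different list.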

The second tool, Starrydata Auto-Summary GPT, deconstructs an entire open-access paper PDF uploaded by the user and automatically summarizes all descriptions of figures, tables, and samples appearing in the paper as structured data in JSON format. The JSON output is generated using ChatGPT's custom GPT feature, and the resulting data can be viewed as an easy-to-read table in a web browser. Although this data is not currently incorporated directly into the Starrydata database, it dramatically accelerates the work of data collectors by helping them quickly locate target data and enter information. Note that reading data points from graph images is difficult for LLMs, so this task is performed by data collectors using an independently developed semi-automated tool.
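
To illustrate how a JSON summary of figures, tables, and samples might be turned into a readable table, here is a short Python sketch. The JSON layout, key names, and sample values are assumptions made for this example; the schema actually produced by Starrydata Auto-Summary GPT may differ.

```python
import json

# Hypothetical summary layout (assumed): each entry has an id and a description,
# and figures and tables list the samples they report on.
summary_json = """
{
  "samples": [{"id": "S1", "description": "Bi2Te3, hot-pressed"}],
  "figures": [{"id": "Fig. 1", "description": "Seebeck coefficient vs temperature",
               "samples": ["S1"]}],
  "tables":  [{"id": "Table 1", "description": "Lattice parameters",
               "samples": ["S1"]}]
}
"""

# Maps each top-level key to the label shown in the "kind" column.
KINDS = {"samples": "sample", "figures": "figure", "tables": "table"}


def to_rows(summary: dict) -> list[tuple[str, str, str, str]]:
    """Flatten the summary into (kind, id, description, linked samples) rows."""
    rows = []
    for key, label in KINDS.items():
        for item in summary.get(key, []):
            linked = ", ".join(item.get("samples", []))
            rows.append((label, item["id"], item["description"], linked))
    return rows


def render(rows: list[tuple[str, str, str, str]]) -> str:
    """Render the rows as a fixed-width text table with a header line."""
    table = [("kind", "id", "description", "samples"), *rows]
    widths = [max(len(row[i]) for row in table) for i in range(4)]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    )


rows = to_rows(json.loads(summary_json))
print(render(rows))
```

A flat row per figure, table, or sample is one simple way to let a data collector scan a paper's contents and find the target data quickly.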

"A paper is a logical structure assembled to convey the author's claims, but by deconstructing it and returning it to the form of experimental data, other researchers can also use it for their own research," says Dr. Katsura. "In this way, we are aiming for a future where experimental data from all materials science fields can be shared in digital format and viewed from a bird's-eye perspective."

At present, Starrydata has only progressed in building databases for certain materials science fields, such as magnets and thermoelectric materials, which convert between heat and electricity. However, as an open dataset that can be used for new materials development, it is beginning to be utilized, primarily by leading researchers around the world. The team is advancing its research with the aim of raising broader awareness of the potential of such large-scale experimental data and establishing paper data collection as a recognized form of research within the scientific community.

Further information
Yukari Katsura
Senior Researcher, National Institute for Materials Science (NIMS)
KATSURA.Yukari@nims.go.jp
(Yukari Katsura is also an associate professor at University of Tsukuba and guest researcher at RIKEN)

Paper: https://doi.org/10.1080/27660400.2025.2590811 

About Science and Technology of Advanced Materials: Methods (STAM-M)

STAM Methods is an open access sister journal of Science and Technology of Advanced Materials (STAM), and focuses on emergent methods and tools for improving and/or accelerating materials developments, such as methodology, apparatus, instrumentation, modeling, high-throughput data collection, materials/process informatics, databases, and programming. https://www.tandfonline.com/STAM-M

Dr Kazuya Saito
STAM Methods Publishing Director
SAITO.Kazuya@nims.go.jp

Press release distributed by Asia Research News for Science and Technology of Advanced Materials.






Copyright © 2026 ACN Newswire. All rights reserved. A division of Asia Corporate News Network.


