Discovery of complex oxides via automated experiments and data science



Automation is accelerating the discovery of useful materials, yet testing even a small fraction of the billions of possible materials for a desired property is beyond the reach of workflows involving resource-intensive property measurements. Due to relationships among composition, structure, and properties, identifying a complex material with one interesting property makes it the proverbial needle in a haystack that merits testing for additional properties. We accelerate materials synthesis and optical characterization by employing physics-aware data science to identify materials for further investigation. With this approach, one does not need high-throughput methods for measuring every material property of interest since a single ultra-high–throughput workflow can guide material selection for other properties, which is a new paradigm for accelerated materials discovery.


The quest to identify materials with tailored properties is increasingly expanding into high-order composition spaces, with a corresponding combinatorial explosion in the number of candidate materials. A key challenge is to discover regions in composition space where materials have novel properties. Traditional predictive models for material properties are not accurate enough to guide the search. Herein, we use high-throughput measurements of optical properties to identify novel regions in three-cation metal oxide composition spaces by identifying compositions whose optical trends cannot be explained by simple phase mixtures. We screen 376,752 distinct compositions from 108 three-cation oxide systems based on the cation elements Mg, Fe, Co, Ni, Cu, Y, In, Sn, Ce, and Ta. Data models for candidate phase diagrams and three-cation compositions with emergent optical properties guide the discovery of materials with complex phase-dependent properties, as demonstrated by the discovery of a Co-Ta-Sn substitutional alloy oxide with tunable transparency, catalytic activity, and stability in strong acid electrolytes. These results required close coupling of data validation to experiment design to generate a reliable end-to-end high-throughput workflow for accelerating scientific discovery.

Increased incorporation of data science in materials research is anticipated to accelerate discovery of materials with improved properties and combinations thereof for technological applications requiring multifunctional materials (1, 2). Machine learning is one popular approach for building predictive models, but limited materials training data often compromise the prediction accuracy, especially in composition spaces for which no training data are available (3–5). Training data are particularly limited in high-order composition spaces (e.g., oxides with at least three cations), which offer opportunities for tuning multiple properties through formation of a phase, i.e., a crystal structure or substitutional alloy, that contains all three cations. The vast number of potential high-order compositions exceeds the capacity of current methods of discovery or prediction (6–9), and prediction of substitutional alloy phases and their properties remains a substantial challenge (10, 11).

We develop two data science methods to discover materials in high-order composition spaces. The phase diagram model uses thermodynamic equilibrium assumptions to propose candidate phase diagrams using only optical absorption data. The emergent property model uses the same data to identify compositions whose optical properties cannot be explained by combinations of lower-order compositions of the same elements. The present work additionally describes the design and implementation of the high-throughput workflow that provides data to these models as well as an example use case for guiding discovery. Our primary finding is that appropriately constructed data science models can make inferences about the phase behavior of complex materials using data that are not traditionally used for phase characterization. These inferences add scientific value to existing datasets and guide materials discovery efforts.
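The idea behind the emergent property model can be pictured as a mixture test. The sketch below is not the authors' model, whose details are not given in this section. It assumes that the absorption of a phase mixture is a non-negative linear combination of lower-order spectra, and it fits that combination by projected gradient descent. The function name `mixture_fit` and all nine-channel spectra are made up for illustration.

```python
from math import sqrt

def mixture_fit(basis, target, iters=200):
    """Fit target ~ sum_j w_j * basis[j] with w_j >= 0 (projected gradient descent).

    Returns the weights and the root-mean-square residual over channels.
    """
    n_basis, n_chan = len(basis), len(target)
    # Step size 1/trace(B^T B): the trace bounds the largest eigenvalue of B^T B.
    step = 1.0 / sum(v * v for spectrum in basis for v in spectrum)
    weights = [0.0] * n_basis

    def residual(w):
        return [sum(w[j] * basis[j][c] for j in range(n_basis)) - target[c]
                for c in range(n_chan)]

    for _ in range(iters):
        r = residual(weights)
        grad = [sum(r[c] * basis[j][c] for c in range(n_chan))
                for j in range(n_basis)]
        # Gradient step followed by projection onto w >= 0.
        weights = [max(0.0, weights[j] - step * grad[j]) for j in range(n_basis)]

    r = residual(weights)
    return weights, sqrt(sum(x * x for x in r) / n_chan)

# Hypothetical nine-channel absorption spectra of two lower-order compositions.
phase_a = [1, 1, 1, 0, 0, 0, 0, 0, 0]
phase_b = [0, 0, 0, 1, 1, 1, 0, 0, 0]

# A three-cation sample that is a simple 30/70 mixture of the two.
mixture = [0.3, 0.3, 0.3, 0.7, 0.7, 0.7, 0.0, 0.0, 0.0]
# A sample with extra absorption that neither lower-order spectrum can supply.
candidate = [0.3, 0.3, 0.3, 0.7, 0.7, 0.7, 0.5, 0.5, 0.5]

w_mix, rms_mix = mixture_fit([phase_a, phase_b], mixture)
w_new, rms_new = mixture_fit([phase_a, phase_b], candidate)
print(f"mixture residual:   {rms_mix:.3f}")
print(f"candidate residual: {rms_new:.3f}")
```

A near-zero residual means the spectrum is explainable as a mixture. A large residual flags the composition for follow-up, with the threshold left as a choice that depends on measurement noise.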

We demonstrate this approach for three-cation oxide systems via high-throughput experiments coupled to automated quality control and modeling of spectral microscopy data. The select three-cation oxide compositions whose properties appear unique compared to lower-order oxide compositions are then candidates for more expensive and time-consuming structural and functional characterization. This approach is distinct from computational inverse design wherein a model predicts a material to have a specific property, a promising strategy that is hampered by the dual challenges of computational prediction of experimental properties and the computational generation of synthesizable materials (12, 13). Our approach shifts the strategy from identifying materials with a specific property to rapidly screening materials that may be exceptional for any property. By releasing the database of experiments and analyses alongside this work, we aim to accelerate the community’s selection of composition spaces and compositions therein for discovery of materials exhibiting a broad range of properties (14).

Discovering complex phases with desirable properties, whether by experiment or computation, is highly challenging due to the combinatorics of composition spaces. Searching the Materials Project (15) for entries containing oxygen, having an associated Inorganic Crystal Structure Database entry (16), having unique composition and space group, and excluding inert gas and nonmetallic elements (He, Ne, Ar, Kr, Xe, Rn, C, N, F, P, S, Cl, Se, Br, I, and H) yields 755 one-cation oxide entries from 73 cation elements. Applying the same search to two-cation oxides increases the number of identified materials to 4,345, although the corresponding search for three-cation oxides yields only 3,163 materials. While some two-cation oxide phases undoubtedly remain to be discovered, there has been extensive computational exploration of two-cation oxides, making such materials the focus of recent high-throughput computational (17–19) and machine learning–driven materials discovery (20–23). Higher-order composition spaces enable further tuning of materials properties, but the expense of comprehensive search of combinatorial spaces is clear when considering three-cation oxides. Using the 73 cation elements from the one-cation oxide entries, there are 62,196 (73 choose 3) possible three-cation oxide composition spaces, yet only 2,205 are represented in the Materials Project, leaving over 96% of the composition spaces with no existing data.
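The counting argument above can be checked with a few lines of Python. The numbers 73, 2,205, and 62,196 come from the text; the variable names are illustrative only.

```python
from math import comb

n_cations = 73        # cation elements found in the one-cation oxide search
represented = 2205    # three-cation composition spaces with Materials Project entries

# Number of ways to choose 3 distinct cations out of 73.
total_spaces = comb(n_cations, 3)
unexplored_fraction = 1 - represented / total_spaces

print(total_spaces)                  # 62196
print(f"{unexplored_fraction:.1%}")  # 96.5%
```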

The computational exploration of three-cation phases to date has focused on crystal structures where each cation has a unique crystallographic site. The site substitution of multiple elements on a single crystallographic site is a distinguishing feature of metal substitutional alloys, and a metal oxide structure exhibiting such substitutions on cation sites is referred to herein as a substitutional alloy oxide (or “alloy” for brevity). A three-cation oxide can crystallize in a structure observed in the one- or two-cation subspaces, and the composition-tuned decoration of the cation sublattice provides an opportunity for tuning properties in the three-cation composition space. Since the site substitution is disordered, large unit cells combined with ensemble averaging of different random site decorations are required to explicitly model substitutional alloys. While approximations to computational modeling of alloys have been developed (24–27), alloys in high-order composition space comprise a dramatically underexplored class of materials for discovery efforts. We know from the examples of high-temperature superconductors and catalysis that extremely valuable properties are obtainable via substitutional alloying in high-order composition spaces (8, 28, 29).

We report a high-throughput workflow for discovering candidate compositions for functional properties by coupling high-throughput synthesis and optical characterization with automated data interpretation. Parallel optical screening was recently demonstrated as a proxy for phase behavior in the context of combinatorial thermal processing of individual compositions (30). We extend this approach to high-order composition spaces using inkjet printing (31) to deposit composition-gradient lines of material that are subsequently imaged by a purpose-built hyperspectral microscope that measures optical absorption from the infrared to the ultraviolet (UV). We present a dataset consisting of nine channels of optical absorption data for a series of metal oxide composition samples. Each composition sample is defined by the stoichiometry of cation elements, with oxygen content driven toward equilibrium by calcination at fixed oxygen pressure. The dataset contains 376,752 distinct compositions from 108 three-cation oxide systems based on the cation elements Mg, Fe, Co, Ni, Cu, Y, In, Sn, Ce, and Ta, of which only the Ce-Cu-Fe oxide system has an entry in the Materials Project. We present a data science workflow incorporating cross-validation and other quality control measures to establish confidence in the data, enabling subsequent data modeling to predict aspects of the underlying phase behavior. In the present work, we discuss models that 1) predict candidate phase diagrams along with the absorption spectrum of each phase (the “phase diagram model”) and 2) predict the likelihood that the three-cation composition space contains a three-cation phase whose properties are distinct from those of one- or two-cation phases (the “emergent property model”). These complementary prediction models are emblematic of the use of data from high-throughput experiments to make inferences that accelerate resource-intensive experiments.
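For orientation, the ten cation elements listed above admit 120 possible three-cation systems, so the 108 systems in the dataset cover most but not all of them. The short sketch below enumerates the candidates; it does not know which 12 are absent, since this section does not say.

```python
from itertools import combinations

cations = ["Mg", "Fe", "Co", "Ni", "Cu", "Y", "In", "Sn", "Ce", "Ta"]

# Every unordered choice of three distinct cations defines one candidate system.
systems = [frozenset(trio) for trio in combinations(cations, 3)]

print(len(systems))                               # 120
print(frozenset({"Ce", "Cu", "Fe"}) in systems)   # True
print(frozenset({"Co", "Ta", "Sn"}) in systems)   # True
```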

This implementation of data science–driven analysis of experimental data is complementary to quantum mechanical (32) and…

