Introduction

Human history has evolved together with material innovation: steel production from heating iron with carbon triggered a shift from the Bronze to Iron age, and the growth of high-purity Si crystal ingots led to the burgeoning of the computer age. In modern times, synthetic multicomponent materials are constantly developed to meet the demands of diverse applications. The Inorganic Crystal Structure Database (ICSD), including most of the experimentally synthesized inorganic compounds, has approximately 200,000 materials registered to date, and the data entry steadily increases by ~5000 every year1. The sheer size and active expansion of the database reflect that new materials continue to drive scientific and engineering advances in fields such as electronics, energy harvesting/storage, and high-Tc superconductors2,3,4,5,6,7,8.

Despite the vast material library available today, it is far from being complete in ternary or higher-order (simply multinary hereafter) phases. Based on a rough estimate, only approximately 16 and 1% of ternary and quaternary compounds, respectively, are at least partially revealed9. This inspires reasonable hope that valuable materials can be discovered in the largely unexplored multinary domain. Furthermore, the current material repositories are chemically and synthetically biased. For instance, elemental occurrences within ternary compounds peak at oxygen with 22,476 counts in the ICSD, which is more than three times that of the next most frequent elements (Fe, Si, and S). This is mainly because oxygen is the most earth-abundant element and forms stable compounds with most metals. In addition, oxides are easier to synthesize than other compounds and benefit from synthetic recipes established throughout the long history10,11,12. This indicates that the present material database is biased toward those with facile synthesis, possibly missing promising compounds that are unfamiliar today13.

Considering the very large gap between current experimental throughput and the number of unknown materials, together with rising synthesis barriers, exploring the uncharted chemical space solely by experiment would be inefficient. Alternatively, experimental endeavors can take advantage of computational prescreening based on the density-functional theory (DFT) calculations. This has been demonstrated by a multitude of recent publications in which the discovery of new materials was accelerated by DFT calculations: cathodes for Li-ion batteries14, nitride semiconductors15, metal nitrides13, 18-electron compounds16, boron-based MAX phases17, and high-Tc superconductors18,19,20. Throughout these works, DFT results were able to suggest compositions that are presumably stable under ambient conditions and thus have high synthesizability. In addition, DFT predictions of basic properties could steer experimental resources to materials appropriate for specific applications.

Despite great promise, the current computational exploration of unknown materials inherently suffers from low throughput; in contrast to the high-throughput screening of materials cataloged in the ICSD21,22, the investigation into as-yet-unreported materials should start from the crystal structure prediction (CSP), preferentially for equilibrium phases with the lowest Gibbs free energy (evaluated at the DFT level) under given chemical composition and thermodynamic conditions. However, CSP is a classic NP-hard problem in which the computational load in identifying the global minimum exponentially increases with material complexity23. Compounded further by the high costs of DFT calculations, this results in extremely low throughput of DFT-based CSP. Consequently, computational investigations are often limited to known prototypes13,14 or short evolutionary steps15,16, which risks resulting in metastable structures instead of the ground state. Reflecting this, the compositional territory has been scanned rather ‘conservatively’ over subsets close to known chemical families. If free energies can be evaluated much faster than with DFT, then more aggressive and far-reaching exploration will be viable on a large scale, raising the chance of discovering materials with novel functionalities.

Recently, machine-learning potentials (MLPs) have attracted considerable attention because they can provide close to DFT-accurate energies at a fraction of the cost, which finds immediate applications to structure prediction of crystals24,25,26,27,28,29,30,31, nanoclusters32,33,34,35,36,37,38, and surface structures39. As such, the structure prediction based on machine-learning potentials is by now a well-established approach. In applying machine-learning potentials to CSP, however, one is faced with difficulties in choosing training sets without information on crystal structures. We recently suggested a way to resolve this problem by using melt-quench molecular dynamics (MD) simulations40. The liquid simulation can self-start from a random distribution with rapid equilibration; thus, a priori information on the crystal structure is not required. In addition, the MD trajectory automatically samples diverse local orders that may appear in stable or metastable crystalline structures. We demonstrated that the Behler-Parrinello-type neural network potential (NNP)41 trained with the melt-quench-annealing trajectories can serve as a high-fidelity surrogate model in CSP.

Going beyond the previous achievement, we herein develop a structure prediction framework combining NNPs with evolutionary or random searches, named SPINNER (Structure Prediction of Inorganic crystals using Neural Network potentials with Evolutionary and Random searches). Free from any empirical knowledge on material structures, the program identifies the global minimum in a brute-force style by harnessing the accuracy and speed of NNPs. Furthermore, SPINNER incorporates algorithms tuned for MLPs within the genetic algorithm, thereby maximizing the search efficiency for multinary compounds. In blind tests on ternary compounds with significant complexity and diverse crystal symmetries, SPINNER successfully identifies the experimental (or theoretically more stable) phases for ~80% of the cases. The program also outperforms other favored approaches such as data mining42, evolutionary algorithm43, and particle-swarm optimization44, in the majority of test materials. The average computational throughput is ~4 days per one composition on a 36-core node including overheads for the MD simulation and training procedure, which is estimated to be 102–103 times faster than pure DFT-based approaches. The outstanding performance of SPINNER allows for large-scale, open computational exploration of undiscovered inorganic crystals.

The basic workflow of SPINNER is schematically presented in Fig. 1a. In brief, for an input chemical composition (elements and stoichiometry), SPINNER first carries out a melt-quench-annealing simulation and trains an NNP with MD trajectories. To enhance the accuracy for ordered phases, the NNP is iteratively retrained over low-energy structures in the refining-stage CSP (the upper part of Fig. 1a). Unlike previous works24,25,26, the present approach does not sample high-energy structures and selects only low-energy structures which are relevant for the ground state. This contributes to thinning the final candidate pool, thereby reducing the computational cost of DFT calculations significantly. In the main CSP proceeding up to 5000 generations (the lower part of Fig. 1a), SPINNER collects low-energy candidate structures within 50 meV atom−1, which are finally sorted after full relaxations at the DFT level. For DFT calculations, we use the Vienna Ab initio Simulation Package (VASP)45 with the Perdew-Burke-Ernzerhof (PBE) functional for exchange-correlation energies46. Figure 1b shows a schematic of the present evolutionary algorithms that are based on random generation, crossover, permutation, and lattice mutation. The random generation is heavily used over mutations, which we find to be instrumental in multi-component systems. Detailed descriptions of DFT calculations and evolutionary algorithms are provided in the Methods section. We stress that no empirical information (for instance, prototype data47, bonding topology48, or Pauling’s rules49) is used in the present CSP scheme.

Fig. 1: Schematic workflow of SPINNER.
figure 1

a Schematic depiction of the whole workflow of CSP. Top left: An initial NNP is trained over disordered structures sampled from melt-quench-annealing trajectories of a given composition obtained by DFT calculations. Top right: The initial NNP is iteratively refined over low-energy crystal structures obtained during 400 generations of the structure search. Bottom left: With the refined NNP, the structure search is carried out up to 5000 generations. The quality of the NNP is monitored every 1000 generations. Bottom right: Free energies of final candidates are evaluated by DFT calculations. b Left: Representation of the evolutionary algorithm. Under given fractions of random structure generation and mutations (crossover, permutation, and lattice mutation), new structures are generated for the next generation while low-energy structures additionally survive. Right: The schematic illustration of each generation method is presented. Decision chart in the random generation represents distance constraints.

We introduce several features into the program to consider the unique characteristics of MLPs. Most importantly, the pairwise radial distributions obtained from the melt-quench process sets the minimum distances (named as MQ distance constraints) that are used in random structure generation and structure relaxation. The MQ distance constraint prevents unphysical structures caused by the poor extrapolation accuracy of MLPs, and also contribute to generating low-energy structures (see Supplementary Fig. 1). (Such distance constraints would be unnecessary in MD-based structure prediction50 in which pair-wise distances remain in a physically relevant range). In the crossover algorithm, on the other hand, we utilize atomic energies to suppress unnecessary disruption of stable chemical units (see Supplementary Fig. 2). The effect of these features will be demonstrated in the Discussion section.

Results

Selection of test materials

In benchmarking search algorithms tackling NP-hard problems, the global minimum is usually unknown. The situation is slightly different in CSP because the equilibrium structures that are experimentally resolved by diffraction analysis mostly equate to the global minimum for the given chemical formula and thermodynamic conditions. Therefore, blind tests with the compositions reported in the ICSD can serve as an ultimate evaluation of the performance of a CSP algorithm. Here we focus on ternary compounds because the corresponding database spans a wide range of chemistries and structural prototypes.

Within the ICSD, we first select ternary compounds measured at room temperature or below and at atmospheric pressure, as we are mainly concerned with materials that are stable under ambient conditions. We also consider only high-quality (R < 0.1) ordered crystals with well-defined structures and rule out molecular crystals as well as compounds including 3d transition elements (V–Zn), lanthanides and actinides. In the latter materials with 3d and f electrons, the magnetic ordering influences the free energy, but the current implementation of NNPs has yet to resolve the energy scales of magnetic interactions. (We note that meaningful developments are underway51,52). When distinct crystal structures with the same composition are available in the ICSD, the most stable structure within the PBE functional is regarded as the reference. Among the filtered structures, we randomly select 50 materials under the condition that at least one crystal is selected from the 32 crystallographic point groups (omitting 6/m due to zero occurrence), which secures the diversity of the test structures. To exclude structures that are too simple, we choose compounds with the formula units (Z) in the primitive cell being at least 4. (So, the numbers of atoms are at least 12). We additionally handpick 10 structures to further diversify local motifs and chemistry. The full list of the selected 60 materials is provided in the Supplementary Information. Regarding the bandgap (Eg), there are 18 metals, 13 semiconductors (0 < Eg ≤ 2 eV), and 29 insulators (Eg > 2 eV). (The finite band gaps are calculated within one-shot hybrid functional calculations53,54). Most structures have a hull energy of 0 in the Materials Project database55 (see Supplementary Table 1).

Structure prediction by SPINNER

In running SPINNER, we assume the same Z value as in the ICSD structure. Figure 2a shows ΔEmin, the offset of the PBE energy of the most stable structure after 5000 generations of structure search from that of the experimental structure. Since the pool size is 24–80 (see the Methods section), up to half a million structure relaxations are performed for each composition. The color code of the data indicates the earliest generation at which the minimum energy structure is identified (Ng). For 75% of the compositions (45 of 60), SPINNER predicts the reference (ΔEmin = 0) or a lower-energy (ΔEmin < 0) structure, which is mostly found within 1000 generations (38 of 45). The largest error occurs for Sr2Pt3In4 with a ΔEmin of 36 meV atom−1. Figures at the bottom of Fig. 2a display the unit cells of some materials with ΔEmin = 0.

Fig. 2: Search results of SPINNER against structures in the ICSD.
figure 2

a Distribution of the energy difference between the most stable structure found by SPINNER and the ICSD structure (ΔEmin). The negative, zero, and positive ΔEmin indicate that SPINNER found more stable structures than the ground state, exact ground states, and metastable structures, respectively. The color codes indicate the generation at which the lowest-energy structure was identified (Ng). The circle, square, and sun cross indicate the exchange-correlation functional and diamonds are results obtained with the conventional cell. Equilibrium structures of some compositions with ΔEmin = 0 are displayed below. b Definition and distribution of Nt and Nm for materials with ΔEmin ≤ 0; Nt denotes the number of generations that elapsed from random seeding until the ground state is identified, and Nm is the number of mutations involved in this process. Nt = 0 means that the ground state is directly obtained by relaxation from a random structure. c, Schematic illustration of potential energy surfaces by the DFT and NNP.

In Fig. 2a, six materials have negative ΔEmin down to −16 meV atom−1, meaning that the structure found by SPINNER is more stable than the experimental phase within the PBE functional. We note that both structures share almost the same local order, differing in ranges only beyond ~4 Å. Since the ICSD does not register any other experimental structures for these compositions, the energy ordering by the PBE functional is likely incorrect, calling for further investigation. In ref. 56, the PBE functional was found to incorrectly stabilize metastable phases for some binary materials, which was partly resolved by the SCAN functional57,58. To check whether this is the case for the six materials with negative ΔEmin’s in Fig. 2a, we recalculate their energies as well as those of experimental structures using the SCAN functional (the structures are fully relaxed within SCAN). The empty squares in Fig. 2a are the ΔEmin’s obtained by the SCAN functional. Except for PbOsO3 and LiYSn, the ΔEmin values becomes positive. On the other hand, when tested over 10 compounds with ΔEmin = 0, the SCAN functional correctly produces the lowest energy for the ICSD structures among the low-energy structures found by SPINNER. This confirms that the SCAN functional is more accurate than PBE in the energy ordering. Furthermore, we recalculate energies for Tl3PbCl5 and PbOsO3 with spin-orbit coupling (SOC) on top of PBE because heavy elements such as Tl, Pb, and Bi possess large SOC. As a result, the ΔEmin for Tl3PbCl5 and PbOsO3 increases to −0.2 and 16 meV atom−1, respectively (see sun crosses). The foregoing discussions indicate that the incorrect energy orderings in PBE are effectively rectified by introducing more sophisticated energy functionals. (Nevertheless, the bare PBE functional without SOC correctly reproduces the equilibrium phases in many compositions including Tl, Pb, and Bi).

We assess the inference accuracy of NNP based on two metrics: The first is the absolute energy difference (ΔE0) between DFT and NNP for the experimental structure (relaxed within each method), which relates to how well NNP predicts the structure and energy of the reference structure. The second metric (ΔĒ) is similar to the first one but it is averaged over final candidates within the bottom 50 meV atom−1. ΔĒ indicates how accurately NNP ranks the energies of stable and metastable phases. ΔE0 and ΔĒ are provided for every material in Supplementary Table 1. When averaged over materials with ΔEmin ≤ 0, ΔE0 and ΔĒ are 12.9 and 11.8 meV atom−1, respectively, meaning the potential energy surfaces of DFT and NNP agree well around the equilibrium and low-energy metastable structures. This confirms that the trained NNPs are good surrogate models of DFT. The remaining errors are partly attributed to a low resolution in describing medium- to long-range order beyond cutoff radii of descriptors, which are 6 and 4.5 Å for radial and angular parts, respectively. This can be improved by increasing the cutoff radius, but it adversely affects the computational efficiency due to rising costs in evaluating descriptors. The analysis on the failed cases (ΔEmin > 0) is given in the Discussion section.

To understand how SPINNER effectively navigates the complex configuration space, we analyze the evolutionary steps leading to the experimental structure for materials with ΔEmin ≤ 0 based on two parameters, Nt and Nm. As schematically shown in Fig. 2b, Nt counts the generations from random seeding to the appearance of the equilibrium structure at Ng. Nm indicates the number of mutations within Nt steps. The distribution of Nt in Fig. 2b shows that Nt is 0 for almost half of the cases, meaning that the ground-state structure is obtained directly from relaxing a random structure, without any mutations involved. Even in cases undergoing finite mutations, Nm is mostly within 5. (The permutation and lattice mutation are used in 34 and 66% of cases, respectively). Intriguingly, the randomly generated structures are very close to the global minimum. There are two reasons contributing to this. The first is the MQ distance constraints in which the pair-wise minimum distances during melt-quench-annealing trajectories are imposed as constraints in the structure generation and relaxation (see the Methods section). This condition filters out ~99% of the randomly generated structures that are unlikely to relax into low-energy structures (see the decision tree in Fig. 1b). It also effectively protects atomic configurations from evolving into unphysical structures during relaxation. Note that the minimum distance constraints extend to 2–3 Å for some pairs, which are much stronger than simple conditions preventing too short bonds. Second, we find that when the random structures leading to the global minimum are relaxed within DFT instead of the NNP, more than half of them relax to high-energy metastable structures instead of the global minimum. This implies that the energy landscape around the global minimum is different between DFT and the NNP (see Fig. 2c). That is, DFT calculations adaptively forms chemical bonds that can locally stabilize the initial configuration; thus, the search can be stuck at a high-energy metastable structure. Having not machine-learned such chemical bonds, the NNP unwittingly smooths out the corresponding energy region, directing the structure to the global minimum. Figuratively, the NNP “catalyzes” the relaxation process to the equilibrium, eliminating or lowering energy barriers around certain metastable structures. This also effectively increases the multi-dimensional volume of the configurations that can funnel down to the equilibrium structure.

We also examine whether SPINNER can identify metastable structures that were found experimentally. According to ICSD, Na3PS4, TlSbO3, TlGaSe2 have ordered metastable phases reported with high sample qualities (R < 0.1) and Z less than or equal to the ground state. Within PBE, these structures lie within the energy window of 50 meV atom−1, and it is confirmed that SPINNER identified all of these metastable structures.

Comparison with other approaches

To benchmark SPINNER against other CSP methods, we select ternary structures from the literature that were theoretically predicted by either data-mining known prototypes13,16,59,60,61 or using DFT-based evolutionary approaches such as a genetic algorithm62,63,64 or particle swarm optimization65. The materials are listed in Fig. 3a and they are all nonoxides (nitrides, borides, etc). This is because the structures and chemistries of oxides were thoroughly studied compared to other groups, such as nitrides and sulfides, so most studies exploring new materials have focused on nonoxides.13,66 None of these compositions has ever been synthesized as far as we are aware except for KScS2, Sc2C3N6, and ScPtBi (see below). For the test compositions, we perform CSP with SPINNER up to 1000 generations with Z ranging from 2 to 8. The plot at the top of Fig. 3a shows ΔEmin in reference to the structure in the literature. (The structures from the references are fully relaxed within the present DFT method). SPINNER identifies lower-energy structures for the majority of cases (13 of 21) and the same structures for the rest. In ref. 64, CSP was performed with the PBEsol functional67 instead of PBE for Sn5S4Cl2 and Cd4SF6. As a comparison, we evaluate ΔEmin with PBEsol and find that compounds identified in this study are still more stable than those predicted in the reference by 31 and 4 meV atom−1 for Sn5S4Cl2 and Cd4SF6, respectively. The lower part of Fig. 3a shows Z values of the primitive cells with the lowest energies (solid squares) in comparison with those in the references (solid circles). Lower energies are found mostly at Z values higher than those in the literature. In addition, SPINNER finds the lowest-energy structure often at a larger Z than that of the identified primitive cell. (There are 8 cases in which Z is bigger than that used in the reference. Using the same Z, we obtain the identical structures in the literature for Ti3O3N2, Zr3O3N2, Li2SnN2, Li2TiN2, W4Mo4B15 while lower energies are still found for AlPtSb (−3.5 meV/atom), Cd4SF6 (−4.9 meV), and TaCN3 (−47.8 meV)). This indicates the significance of trials with diverse Z numbers. The NNP is far more advantageous in varying Z than DFT owing to the linear scaling with respect to the number of atoms, in contrast to the cubic scaling of DFT. It is noted that for Sn5S4Cl2, Cd4SF6, and TaCN3, hull energies are positive for both reference structures and those identified by SPINNER. Nevertheless, we do not find any signature of phase separations in the final structures (see Cd4SF6, and TaCN3 in Fig. 3), indicating that the lower-energy structures found by SPINNER are not artifacts of the phase separation.

Fig. 3: Benchmark results of SPINNER against other methods.
figure 3

a Upper part shows energy difference between the structure predicted by SPINNER and the reference structure (ΔEmin) predicted by data mining or evolutionary algorithms. The lower part compares the formula unit in the unit cell (Z) between the reference and the value at which the lowest energy is found by SPINNER. The structures are adopted from ref. 13 (Li2HfN2, Sc2C3N6, Hf2SeN2, Na3OsN2), ref. 59 (Ti3O3N2, Zr3O3N2), ref. 60 (Li2SnN2, Li2SnF6, Li2TiN2), ref. 61 (KScS2), ref. 16 (HfIrP, ScPtBi, AlPtSb), ref. 62 (Hf3PB4, Zr2InB2), ref. 63 (WMo5B6, W4Mo4B15), ref. 64 (Sn5S4Cl2, Cd4SF6), and ref. 65 (Hf2CN4, TaCN3). The chemical formulas with star marks do not have corresponding structural prototypes in the ICSD. b Examples illustrating the structural difference between the crystal structures in references (top) and ones identified by SPINNER (bottom).

While structures found by SPINNER have usually local structures similar to reference structures (such as KScS2 and W4Mo4B15 in Fig. 3b), different short-range orders appear in materials, such as Li2TiN2, Cd4SF6 and TaCN3 (see Fig. 3b). We note that Cd4SF6 and TaCN3 were discovered by DFT-based evolutionary searches64,65. The failure of the previous works in identifying the present lower-energy structures might be attributed to small Z numbers and/or few generations. In Fig. 3a, Na3OsN2 shows the biggest energy drop. Both structures are similar in local order, but the reference structure is distorted from that found by SPINNER. We comment that metastable structures can have distinct physical properties compared to the ground state. In the example of HfO2, the high-temperature tetragonal phase that is metastable by 57 meV atom−1 at 0 K possesses a far higher dielectric constant (70) than that for the low-temperature monoclinic phase (16)68. This underscores the importance of identifying the true global minimum for the accurate prediction of material properties.

For the 13 structures newly identified in Fig. 3a by SPINNER, we check the existence of corresponding prototypes in the ICSD using AFLOW-XtalFinder69. It is found that 10 of 13 materials do not have any matching prototypes (star-marked in Fig. 3a). This suggests that the current prototypes for the ternary phase are not sufficiently cataloged, which is understandable because materials such as nitrides have been synthesized much less frequently than oxides. Notably, the prototype of the SPINNER-identified Zr3O3N2 and Ti3O3N2 is commonly Ti3O5. We think that ref. 59 missed the ground state because they did not fully consider partial ion-exchanges replacing O with N. In passing, the three compositions for which SPINNER and previous evolutionary searches agree have corresponding prototypes in the ICSD.

Sc2C3N6 was recently synthesized with the structure predicted by data mining70, which was confirmed by SPINNER. For KScS2, there was an experimental report71 preceding the theoretical prediction, which was not recognized by ref. 61. The experimental crystal structure is consistent with the present work, which is more stable than that for ref. 61 by 2.5 meV atom−1(see Fig. 3b). The crystal structure of ScPtBi was predicted and confirmed by the synthesis in ref. 16. Although it was grown as a multiphase, SPINNER does not identify any phases within 50 meV atom−1 other than the one predicted in ref. 16.

Discussion

Regarding the computational cost, the whole computation from the melt-quench-annealing MD to 5000 CSP steps takes 3–5 days on a 36-core node (CPU of Intel® Xeon Platinum 8000-series). (For 1000 evolution steps, it takes 2–3 days). This excludes human time in managing data flow and making decisions, which can be executed instantly by automation (under development). In the case of Mg2SiO4 for example, the workload can be split into the DFT-MD (32%), training of the NNP (16%), SPINNER (49%), and DFT relaxation of crystals (3%) (see Fig. 4a). The computational time for the DFT part varies widely depending on the number of electrons. The SPINNER part is highly scalable on massively parallel computers. For SPINNER, it takes 58 h for 5000 generations with the pool size of 56 and Z = 4. When tested with the same computational resources and conditions, a DFT evolutionary program (USPEX)43 could proceed up to only 31 generations under the setting suggested in the manual (Fig. 4b). (The full computational setup is provided in Supplementary Information). Therefore, the entire 5000 generations would take several years for DFT calculations. (We add that actual generations required for identifying the equilibrium phase can be different between DFT and NNP). The estimation of time and cost indicates that it would be viable to construct databases of ~1000 materials targeted for a specific application at a reasonable cost and in a reasonable time frame.

Fig. 4: Computational efficiency of SPINNER.
figure 4

a Schematic diagram showing the relative time spent on each step for Mg2SiO4. b Comparison of the SPINNER to the USPEX code with DFT calculations in the case of Mg2SiO4. In the Figure, ΔEmin of SPINNER and USPEX are evaluated by NNP and DFT, respectively. c Comparison of the SPINNER to the USPEX code with NNP calculations. ΔEmin of filled points are calculated by NNP and the empty points are the lowest DFT energies among the structure candidates within an energy window of 50 meV/atom.

We also compare SPINNER with USPEX43 employing NNP (USPEX-NNP) to directly test the effects of various features developed in the present work such as MQ distance constraint, crossover algorithm using atomic energies, and iterative training scheme. Since USPEX incorporates state-of-the-art genetic algorithms for CSP, this would be an informative test comparing conventional methods with the present method. For a fair comparison, the pool size is the same and the suggested options in the manual are used for USPEX. The results are shown in Fig. 4c. (The empty symbols at the final generation indicate DFT energies). It is seen that SPINNER outperforms USPEX-NNP by 34 meV/atom. Similar performance gaps are observed for CdI2O6, Sr2P7Br, and MgIrB (see Supplementary Fig. 3).

Despite impressive performance, SPINNER failed with 25% of test ternary materials, which merits further discussion. First, ΔE0 and ΔĒ averaged over materials with ΔEmin > 0 (see Supplementary Table 1) are 41.0 and 43.3 meV atom−1, which are substantially larger than 12.5 and 11.9 meV atom−1 for structures with ΔEmin ≤ 0. Most notably, the NNPs for SnGeS3 and Sr2Pt3In4 show the largest ΔE0Ē) of 75.5 (227.3) and 83.5 (140.0) meV atom−1, respectively. For these materials, locating the equilibrium structures would be very difficult, as they are high in the energy scale. The origin of the poor NNP quality requires further investigation. (Increasing the cutoff radii of descriptors was not helpful). Second, among the failed cases, qualities of the NNP are acceptable in several materials, with accuracy metrics on par with those with ΔEmin ≤ 0. In these materials, the experimental structures would be found in principle by extending the evolutionary process beyond 5000 generations. We note that BaGe2S5, Na3SbO3, and YPdGe have rather flat or pointed cell shapes while their conventional unit cells have more isotropic and 3-D-like structures. However, RandSpg72, the random structure generator used in the present study, generates atomic configurations based on the conventional unit cell, which may have affected search efficiency. Therefore, we additionally carry out structure searches with Z equal to that of the conventional cells (see Supplementary Table 1). In all three cases, SPINNER successfully identifies experimental or theoretically more stable structures (see diamonds in Fig. 2a).

Even though the training set is constructed within a specific stoichiometry, dynamic fluctuations during the melt-quench-annealing process extend the transferability of NNPs, enabling transfer learning to other compositions. For example, employing the NNP developed for Mg2SiO4, we try CSP for MgSiO3 and SPINNER successfully identifies the experimental crystal structure (ICSD-ID of 196432). Since the NNP training set has a single stoichiometry, atomic-energy offsets among chemical species become arbitrary, resulting in constant energy shifts between the DFT and NNP energies for crystalline structures of MgSiO373. Nevertheless, energy ordering among the low-energy structures is well maintained, leading to the successful identification of the equilibrium phase. In addition, one can refine the NNP with DFT results for a small number of MgSiO3 crystals, which effectively removes the energy offsets. We also confirm that the transfer learning works even for cases involving valence changes, for example, MoPdO5 → MoPdO4 and InPbO3 → In2PbO4. Such transfer learning would be highly effective when one tries to find relevant stoichiometries for unknown materials.

To test SPINNER on compounds other than ternary phases, we carry out similar studies on TiO2, P3N5, NbPd3, Li10GeP2S122, and InGaZnO43 whose equilibrium structures are experimentally known. In all cases, ground-state structures are found within or around 5000 generations. TiO2 is well known for rich polymorphism.74 With Z = 4 and 8, SPINNER obtained all the experimental polymorphs (C2/m, anatase, brookite, columbite-type, rutile) within the energy window of 50 meV atom−1. In addition, SPINNER finds 7 unreported structures of TiO2 including 3 theoretical structures present in the Materials Project database55. Even though the test sets are small compared to the ternary compounds, these results support that the present CSP framework works regardless of material complexity. However, quaternary or higher-order compounds may generally require longer generations than in the present work, which needs further investigation.

Methods

Training machine-learning potentials

The NNPs used in CSP are first trained with DFT-MD simulations on the melt-quench-annealing trajectory of each compound. All DFT calculations are performed with the Vienna Ab initio Simulation Package (VASP)45 and the Perdew-Burke-Ernzerhof (PBE) functional is used for the exchange-correlation energy of electrons46. The initial structures are prepared by randomly distributing ~80 atoms under the given composition and superheating them at 4000 K for 4.5 ps (premelting). Next, we obtain liquid-phase trajectories for 16 ps at ad hoc melting temperature (Tm) obtained by the scheme proposed in ref. 40. Subsequently, the liquid is quenched at a cooling rate of 200 K ps−1 from Tm to 300 K and then annealed at 500 K for 4 ps to sample amorphous structures. The cutoff energies and k-point meshes are chosen for the pre-melted structures such that the energy, forces, and stress tensors converge to within 20 meV atom−1, 0.3 eV Å−1, and 10 kbar, respectively. As an alternative to the melt-quench-annealing simulation, we find that the metadynamics in the atomic environment space75 can also generate trajectories that can be used as training sets for CSP. (However, computational costs are higher). We also note that using the data-mining method42 can sample diverse local orders contained in existing databases, and may augment the sampling and increase the reliability of NNP.

For training NNPs, we use the SIMPLE-NN package76. Behler-Parrinello-type symmetry function vectors are adopted as input features. For each pair of atomic species, 8 radial and 18 angular components are used with cutoff radii of 6 and 4.5 Å, respectively77. We train the NNPs until the root mean square errors of the validation set reduce to within 20 meV atom−1, 0.3 eV Å−1, and 20 kbar for the energy, force, and stress components, respectively. Other parameters for NNPs are identical to ref. 40.

To enhance accuracy on ordered phases, the NNPs are iteratively retrained over low-energy structures in the refining-state CSP (the upper part of Fig. 1a). With the NNPs trained by the melt-quench-annealing trajectories, we carry out CSP for 50 generations and select 10 structures: 5 lowest-energy structures and 5 structures with the lowest antiseed weights (see below) within the bottom 100 meV atom−1. These structures are further relaxed within DFT by using AMP2 53. The relaxation terminates when atomic forces and stress tensors are less than 0.1 eV Å−1 and 20 kbar, respectively. The atomic positions and energies are sampled during the DFT relaxation, which are augmented to the training set for refining the NNPs. We find that the initial cell structure sometimes undergoes significant deformations, which necessitates adapting the k-point mesh consistently. To this end, the k-point mesh is checked and modified every 10 relaxation steps according to the converged k-point spacing. Then, we continue CSP and retrain the NNPs at 100, 200, and 400 generations in similar ways but excluding structures already learned in the previous iterations. In total, 40 crystal structures are sampled to refine the NNPs.

Workflow of crystal structure prediction

With the refined NNPs, we perform up to 5000 generations of the main CSP (the lower part of Fig. 1a). The structures are generated by random seeding, permutation, lattice mutation, and crossover algorithms. Detailed descriptions of each mode are provided in the next subsection. The generated structures are relaxed within the NNPs using the LAMMPS code78. We use the restraint option during structure relaxation, which prevents atom pairs from being closer than a certain distance limit through repulsive harmonic forces. The pair-wise distance limits are set to the minimum values found for the corresponding pair during the melt-quench-annealing process. The restraint option effectively prohibits the structures from evolving into the untrained domain. In comparison, the structural relaxation by DFT methods typically takes 102–103 times that of the NNP.

To prevent unphysical structures from appearing as candidate structures, we compare the single-shot DFT energy and NNP energy of the lowest-energy structure every 1000 generations. If the discrepancy is greater than 50 meV atom−1, then the NNP is retrained over 10 structures selected similarly to the refining stage (see above). Last, we perform DFT structure relaxation for final candidates that lie within the bottom 50 meV atom−1 using the AMP2. For accurate evaluation of energies, we adopt tight convergence criteria of 2 meV atom−1 and 3 kbar for energy and pressure, respectively.

Evolutionary algorithms

Following ref. 43, SPINNER utilizes evolutionary algorithms to find the global minimum in the configuration-energy space under the given composition and Z number. From extensive tests, we find that random generation, crossover, permutation, and lattice mutation algorithms are particularly effective in the structure prediction of multinary phases, and so the current version of SPINNER is based on these four-generation modes. In generating random structures, the space group, lattice vectors, and Wyckoff positions are selected randomly using the RandSpg package72. The atomic density in the first generation is set to that of the amorphous phase generated by the DFT-MD simulations. Then, the volume of each structure in the ith generation (i > 1) is chosen randomly between 70 and 130% of the volume of the lowest-energy structure in the previous generation. Distance constraints are imposed on all pairs of atoms such that distances between pairs are longer than the minimum values found for the corresponding pair during the melt-quench-annealing process. In applying the crossover algorithm, we note that the conventional approach often unnecessarily disrupts stable chemical units (for instance SiO4 tetrahedra in Mg2SiO4). To avoid this, SPINNER utilizes the atomic energies of the NNP in choosing cutting planes such that the average atomic energy within the slab is sufficiently low. The average atomic energies are also used in adjusting lattice vectors and atomic registries in combining the two slabs. This effectively conserves chemically stable local units. Since atomic energies can be defined at the DFT level73,79, this algorithm would be also applicable with DFT. We note that this approach is not compatible with machine learning potentials including long-range interactions80.

During the evolutionary steps, the structural similarity is measured by the partial radial distribution function (PRDF)81 and if two structures are determined to be too similar, then one of them is discarded from the pool. We note that PRDF becomes less sensitive for large supercells or more than ternary systems because it averages out the local environments of atoms. The atomic descriptor using symmetry functions of individual atoms would provide more detailed information, which is under investigation. We also adopt an antiseed algorithm to diversify the structure pool82. The antiseed weight of ith structure (Ai) is defined as follows:

$$A_i = \mathop {\sum}\limits_a {{{{\mathrm{exp}}}}\left( { - \frac{{d_{ia}^2}}{{2\sigma ^2}}} \right),}$$
(1)

where the summation runs over structures in the inheritable pool (bottom 200 meV atom−1 in the refining stage and 100 meV atom−1 during the main CSP), dia is the distance between the ith and ath structures measured by PRDF, and σ is a Gaussian width. Unlike the original scheme82, we use the antiseed weight in defining the probability (Pi exp(−Ai/σA) where σA is a parameter) for a structure to be inherited in the next generation by mutation (see below).

The number of structures generated by the operations is set to twice the number of atoms in the simulation cell. The inherited structures are chosen within the bottom 200 and 100 meV atom−1 for the refining-stage and main CSP, respectively, and we choose half of them randomly and the rest are chosen according to the probability distribution of {Pi} defined in the above. In the refining stage, ratios of operations of random generation, crossover, permutations, and lattice mutations are 30%, 50%, 10%, and 10%, respectively. The crossover tends to generate diverse local orders, which is useful in training robust NNPs. However, we find that the algorithm is less effective in discovering the ground state of the generated structures, so we exclude the crossover operation in the main CSP in which random (70%), permutation (20%), and lattice mutation (10%) are employed. In addition to the structural pool generated by these operations, we carry low-energy structures in the bottom 100 and 50 meV atom−1 to the next generation in the refining stage and main CSP, respectively.