
Materials and Design 202 (2021) 109532
Contents lists available at ScienceDirect
Materials and Design
journal homepage: www.elsevier.com/locate/matdes
Revealing high-fidelity phase selection rules for high entropy alloys: A combined CALPHAD and machine learning study
Yingzhi Zeng 1, Mengren Man 1, Kewu Bai ⁎, Yong-Wei Zhang ⁎
Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore 1 Fusionopolis Way, #16-16 Connexis, 138632, Singapore
Highlights
• 300,000+ equilibrium data with 8 metal elements in the vast temperature-composition space were generated by CALPHAD calculations.
• Machine learning model developed by generated data was validated by 155 experimental data and used to predict 213 new HEAs.
• High fidelity phase selection rules were established based on large data and five important features identified by the ML model.
• In-depth insights into the paradigm of composition-feature-phase of high entropy alloys were revealed.
Article info
Article history: Received 15 December 2020; Received in revised form 14 January 2021; Accepted 25 January 2021; Available online 29 January 2021
Keywords: Machine learning; High entropy alloy; CALPHAD; Solid solution; Phase selection rules

Abstract
We reveal high-fidelity new phase selection rules for high entropy alloys (HEAs) by combining CALPHAD calculations and the machine learning (ML) method. Employing Thermo-Calc and TCHEA3 database, we first generate more than 300,000 equilibrium phase data from 20 quinary families formed by the 8 elements of Al, Co, Cr, Cu, Fe, Mn, Ni, and Ti, and choose initially 15 materials/physical descriptors. The eXtreme Gradient Boosting (XGBoost) method is then used to identify 5 most important descriptors that best delineate the single and mixed phases in the complex temperature-composition space of HEAs. The ML model trained by the 5 features is validated by 155 annealing experimental data points from 15 publications and then used to predict 213 new single-phase alloys with BCC and FCC structures of the alloy families of AlCrNiFeMn and AlCrCoNiFeTi. We also highlight the importance of equilibrium temperature and offer in-depth insights into the paradigm of composition-feature-phase of HEAs. On the basis of the 5 important features, we establish new phase selection rules for single FCC and BCC phases with a success rate above 90%, significantly outperforming all existing phase selection rules and providing a powerful tool for mapping single-phase in the complex temperature-composition space of HEAs.
© 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
⁎ Corresponding authors.
E-mail addresses: baikw@ihpc.a-star.edu.sg (K. Bai), zhangyw@ihpc.a-sta.edu.sg (Y.-W. Zhang).
1 These authors contributed equally to this work.
https://doi.org/10.1016/j.matdes.2021.109532
0264-1275/© 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
1. Introduction

In recent years, High Entropy Alloys (HEAs) have garnered a great deal of interest in materials science [1–8] due to their exceptional mechanical and physical properties, such as high strength, thermal stability, wear resistance and irradiation resistance [9–15]. The excellent properties of HEAs are mainly attributed to their unique microstructures [16]. Therefore, the phase selection rule has been at the centre of HEA research, yet it remains elusive despite extensive experimental and computational efforts over many years. Detailed reviews of the phase selection rules of HEAs can be found elsewhere [6,8].

Empirical methods have been commonly employed to study the phase selection rules of HEAs. In such methods, the ‘two-dimensional phase stability maps’ of HEAs as a function of materials parameters, such as mixing entropy, size mismatch, valence electron concentrations, electronegativities, and mixing enthalpies, were proposed. Yeh et al. [2] presented the entropy of mixing as the criterion for single-phase HEAs. Based on the Miedema model, Zhang et al. [15] suggested using both enthalpy and atomic size difference to predict the formation of single-phase and mixed-phase. Considering the role of entropy–enthalpy competition in the stabilization of phases in HEAs, Zhang et al. further proposed to use both the entropy vs. enthalpy contribution ratio and the atomic size difference to differentiate single-phase and mixed-phase of HEAs [17,18]. Guo et al. first developed a valence electron concentration (VEC) rule [19] to differentiate face centred cubic (FCC) and body centred cubic (BCC) phases. Recently, Kube et al. [20] and Yang et al. [21] revisited the VEC rule by performing experiments and CALPHAD calculations, respectively. It was found [21] that the thresholds of VEC are not consistent among different systems and processing conditions, and temperature plays an important role in determining the phase selection rules. Overall, while these existing attempts are useful, they are not robust [22], mainly because these rules were developed from the experimental data of limited composition spaces and insufficient data points. Furthermore, these previous studies highlight the necessity of using multiple parameters to develop accurate phase selection rules of HEAs [23].

Machine learning (ML) methods have evolved rapidly in recent years with the promise of exploring the phase formation and stability of HEAs [14,16,24–31]. While these studies achieved various degrees of success, there are several drawbacks in the phase selection studies of HEAs. Firstly, the ML models were trained on datasets with limited composition space that mainly contained only a few hundred experimental data on the microstructure of the as-cast or annealed HEAs. Secondly, while a major task of machine learning is to seek physically meaningful descriptors to best represent the multi-dimensional phase diagram of HEAs, one of the important variables, i.e., the equilibrium temperature, was rarely selected as a descriptor. Often, less physically relevant quantities, such as the average melting temperature of the constituent elements [17,32], were selected. Furthermore, an ML model is often a ‘black box’ to the HEA community, and little attempt has been made to develop easy-to-use yet effective rules to guide HEA design. Therefore, there is a strong need for deriving high-fidelity phase selection rules from ML for HEA design. To achieve this goal, two long-standing challenges in HEA research by ML methods have to be overcome [29,33], namely, to develop databases with a large number of data in the multidimensional temperature-composition space of HEAs and to select a set of effective descriptors for the ML method through feature engineering.

Considering the huge number of possible HEAs, it would be a daunting task to explore the vast design space of HEAs and generate large amounts of data by solely lab-based experiments. Hence, computational approaches present a competitive edge over experiments [34]. First-principles calculations, such as density functional theory (DFT) calculations [11,35,36], have been employed to study the phase formation and mechanical properties of HEAs [10,37]. Such calculations are however computationally too expensive to produce a large database covering the vast composition range of HEAs. For instance, a high-throughput DFT method was developed to examine the thermal stability of the competing binary compounds in the multicomponent HEA alloy system and the likely formation of single-phase HEAs, but the method is still computationally expensive and thus not adequate for generating large amounts of data. Furthermore, the finite temperature contribution to the free energies of phases is ignored in such a method although it plays a fundamental role in phase stability [11,36].

The CALculation of PHAse Diagrams (CALPHAD) method [38] has proved to be a powerful tool in alloy design [11,34,38,39] and has gained great interest in HEA design. So far, the CALPHAD method is considered the most credible tool in predicting the stability of HEAs in a large well-defined composition-temperature space [40–43]. For example, Senkov et al. [44] proposed an approach of combining CALPHAD calculations and property calculations to screen 130,000 equimolar alloy systems and identified 51 single-phase new equimolar alloys comprising 3–6 elements. Abu-Odeh et al. [45] demonstrated a framework to explore the HEA composition-temperature space for phase selection with the integration of CALPHAD with TCHEA1 database. Based on CALPHAD calculation results, Tancret et al. [46] proposed a model to predict the formation of single-phase solid solution.

In this work, we combine CALPHAD calculations and ML method to explore the selection rules of single FCC and BCC phases of HEAs. Our focus is placed on lightweight and high strength HEAs with Al and 7 other elements from the 3d transition metal family, namely, Co, Cr, Cu, Fe, Mn, Ni, and Ti. Unlike previous studies by other groups [44–46], we employ the latest version of HEA database TCHEA3 [47] with much higher credibility. We first perform high-throughput CALPHAD calculations by employing Thermo-Calc Software [39] and TCHEA3 database to generate equilibrium phase data from 20 quinary families formed by the above 8 elements with a composition step of 10 at.% and a temperature step of 10 K. As a result, more than 300,000 data points have been generated and the results are labelled as three classes: single-phase FCC, single-phase BCC, and the others. Furthermore, we investigate the roles of 15 descriptors including the equilibrium temperature and 14 materials/physical descriptors. We employ the state-of-the-art machine learning (ML) algorithm, the eXtreme Gradient Boosting (XGBoost) method [48], to perform a feature selection to identify the top five features that best discriminate the three classes. Subsequently, we develop the machine learning model by using the five features and validate the model with 155 annealing experimental data points. It is worthwhile to mention that the presence of some experimental data containing elements beyond the realm of the alloy systems for training the ML model demonstrates the robustness of our method. More importantly, we establish the new phase selection rules based on the five important descriptors, which can serve as a powerful tool to search for single-phase FCC and BCC for HEA design. In addition, we investigate the relationships in the paradigm of composition-feature-phase, namely, the five features vs the three HEA classes, the compositions vs the three HEA classes, and the compositions vs the five features. These correlations offer additional insights for understanding the formation of HEAs. Finally, we predict a set of 213 new single BCC and FCC phases in quinary AlCrNiFeMn and senary AlCrCoNiFeTi HEAs, which offer a guideline for experimental exploration of new high and medium entropy alloys.

2. Methodologies

In this section, we first present the procedure to generate a large dataset of multidimensional temperature-composition data using Thermo-Calc tool and TCHEA3 database. Subsequently, we describe various descriptors that may govern the phase formation. Finally, we discuss our ML procedure for the classification of different phases, including how to address the imbalance of the three classes in the dataset and choose suitable ML models.
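As a rough illustration of the size of the sampling grid (a composition step of 10 at.% and a temperature step of 10 K over 300 K to 1700 K, as used in Section 2.1), the sketch below enumerates the grid for one quinary family in plain Python. It assumes that every element is present at a minimum of 10 at.%, which is what reproduces the 126 compositions per family quoted in Section 2.1; the function name `quinary_compositions` is hypothetical, and the sketch only counts grid points rather than performing any CALPHAD calculation.

```python
from itertools import product


def quinary_compositions(step_pct=10, n_elements=5):
    """Yield compositions in at.% where every element is a positive multiple of step_pct.

    Assumption (not stated explicitly in the text): each element is present
    at a minimum of one composition step.
    """
    units = 100 // step_pct  # number of composition steps that must sum to 100 at.%
    for combo in product(range(1, units + 1), repeat=n_elements - 1):
        last = units - sum(combo)  # the last element takes the remainder
        if last >= 1:
            yield tuple(c * step_pct for c in combo) + (last * step_pct,)


compositions = list(quinary_compositions())
temperatures = list(range(300, 1701, 10))  # 300 K to 1700 K in 10 K steps
n_families = 20  # quinary families retained in the study

total_points = len(compositions) * len(temperatures) * n_families
print(len(compositions), len(temperatures), total_points)
```

Under these assumptions the counts agree with the 126 compositions, 141 temperatures and 355,320 instances reported in Section 2.1.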
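The class imbalance mentioned above is handled in Section 2.3.1 through sample weights that are inversely proportional to class size. The short sketch below recomputes those weights from the training-set class counts given in that section, using the formula N / (N_classes × N_class). This is the same expression that scikit-learn's "balanced" class-weight heuristic uses; the code here is plain Python and is only a check of the arithmetic, not the authors' implementation.

```python
# Training-set class counts as reported in Section 2.3.1.
class_counts = {
    "Class 1 (FCC)": 37188,
    "Class 2 (BCC)": 7910,
    "Class 3 (other)": 231515,
}

n_samples = sum(class_counts.values())
n_classes = len(class_counts)

# Weight for every sample of a class: N / (N_classes * N_class).
class_weights = {
    name: n_samples / (n_classes * count) for name, count in class_counts.items()
}

for name, weight in class_weights.items():
    print(f"{name}: {weight:.2f}")
```

The rounded values match the weights of 2.48, 11.66 and 0.40 quoted in Section 2.3.1, so errors on the rare BCC class are penalized roughly 29 times more heavily than errors on the majority class.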
2.1. Data generation by CALPHAD method

Eight elements, Al, Co, Cr, Cu, Fe, Mn, Ni, and Ti, are selected to produce light-weight quinary HEA alloys. Furthermore, since Al must be included in each alloy, the number of Al-containing quinary alloy families is 35 (= 7C4). The equilibrium phases of the quinary alloys are computed by Thermo-Calc Software with TCHEA3 database.

The credibility of TCHEA3 can be determined by two parameters [44], namely, the fractions of assessed end-binary (FAB) and end-ternary systems (FAT), respectively (see Section S1 in Supplemental Information for the data assessment). The analysis by Senkov et al. [44] found a good agreement between calculations and experiments with FAB > 0.5, an important criterion that can be used to assess the validity of CALPHAD predictions. While all the end-binary systems have been fully assessed in TCHEA3 for all the 35 quinary families in our study, i.e., FAB = 1, we conservatively select the quinary series with FAT ≥ 0.6 to ensure even higher reliability of the CALPHAD results. Consequently, only the first 20 quinary families in Table S1 are selected in our study.

In the present study, we use Thermo-Calc to generate phase equilibrium data that cover the full composition space of quinary alloy systems with a compositional step of δx = 0.1 (mole fraction). As a result, the number of data points for a quinary system is 126. The temperature range is set to 300 K – 1700 K with a step of 10 K, which results in 141 temperatures. Thus, the total number of instances is 355,320 (= 126 × 141 × 20) for all the 20 quinary systems. The proportions of all the phases of each instance (data point) are calculated by the single point equilibrium calculation. To better study the phase selection (formation) rules, we label all the data points by the 3-class scheme defined as below:
Class-1: disordered single FCC phase with mole fraction >0.999
Class-2: disordered single BCC phase with mole fraction >0.999
Class-3: the other phases

It is noted that the above definitions of single-phase FCC and BCC structures are stricter than those defined in [45], where the cut-off point for a single solid solution phase is 0.99. In the present study, the resultant numbers of the 3 classes of the 20 quinary series (after removing liquid-containing entries) are 1004, 992 and 257,215, respectively. Clearly, the number of instances in Class-3 significantly outnumbers those of the other two classes. This imbalance in the dataset may cause difficulties in ML training. To alleviate the imbalance, we generate two further groups of data by Thermo-Calc calculations. One of them is obtained by calculating the phase equilibrium for the full composition space of the Al-Co-Cr-Fe-Ni series with δx = 0.05, from which only the instances with Class-1 and Class-2 are selected. The other group is generated by conducting ‘stepping’ calculation in Thermo-Calc with the initial compositions of the instances of Class-1 and Class-2 and fixing the FCC and BCC phase, respectively. All the above instances are compiled and referred to as the ML training dataset, in which the total numbers of instances of Class-1, Class-2, and Class-3 are 41,329, 8804, and 257,215, respectively.

2.2. Descriptor selection

A key ingredient of a machine learning algorithm is the selection of physically meaningful descriptors, and generally, the best choice is highly dependent on the target under study [33]. To begin with, we choose a few typical physicochemical/thermodynamic parameters that were previously proposed to identify phase formation [21]. Overall, a total of 15 descriptors, including the equilibrium temperature (Teq), are chosen as the initial features for ML training. Among these descriptors, five are formed by the average values of the five fundamental elemental properties, namely, atomic radius (R), electronegativity (χ), atomic number (AN), melting temperature (Tm), and valence electron count (VE) of the constituent elements. The definitions of the five descriptors are expressed in Eq. (1), where P denotes one of the properties and x_i the mole fraction of the ith element.

\[ P_{\mathrm{avg}} = \sum_{i=1}^{N} x_i P_i \tag{1} \]

It is noted that many material properties cannot be described simply by the compositional averages of their constituent properties, and the differences in these properties are also important [49]. Therefore, the variances of the five properties of the constituent elements are also taken into account. Specifically, we take the atomic radius difference (ΔR) defined in Eq. (2), which is the same as that used in [15,32]. The other four, namely, the electronegativity difference Δχ, the atomic number difference (Δ(AN)), the melting temperature difference (ΔTm) and the valence electron difference (Δ(VE)), are defined in Eq. (3).

\[ \Delta R = \sqrt{\sum_{i=1}^{n} x_i \left(1 - R_i/\bar{R}\right)^2} \tag{2} \]

where \(\bar{R} = \sum_{i=1}^{n} x_i R_i\).

\[ \Delta P = \sqrt{\sum_{i=1}^{n} x_i \left(P_i - \bar{P}\right)^2} \tag{3} \]

where \(\bar{P} = \sum_{i=1}^{n} x_i P_i\).

It is noted that the units and data sources of the above-mentioned elemental properties are the following: melting temperature in Kelvin, atomic radii in Å (Killet scale) [50], electronegativity (Allen scale) [51] and valence electron (Villars scale) [52].

In addition, the other four quantities, i.e., ΔHmix [15,53], the deviation in enthalpy σH [54], ΔSmix [2,15,32], and the parameter Ω that is defined as the entropy of mixing divided by the enthalpy of mixing and multiplied by the average melting temperature of the elements [17,32], are also taken as descriptors, as defined in Eqs. (4–7), respectively.

\[ \Delta H_{\mathrm{mix}} = \sum_{\substack{i=1 \\ i \neq j}}^{{}^{n}C_{2}} \Delta H^{\mathrm{mix}}_{x_i,x_j} \tag{4} \]

where \(\Delta H^{\mathrm{mix}}_{x_i,x_j} = 4 c_i c_j \left( \sum_{k=0}^{3} W_k \left(x_{i,\mathrm{nor}} - x_{j,\mathrm{nor}}\right)^k \right)\), \(x_{i,\mathrm{nor}} = \frac{x_i}{x_i + x_j}\), \(x_{j,\mathrm{nor}} = \frac{x_j}{x_i + x_j}\), and the W_k values can be found in Table 2 of Ref. [53].

\[ \sigma_H = \sqrt{\frac{1}{{}^{n}C_{2}} \sum_{\substack{i=1 \\ i \neq j}}^{{}^{n}C_{2}} \left(\Delta H^{\mathrm{mix}}_{x_i,x_j} - \overline{\Delta H}\right)^2} \tag{5} \]

where \(\overline{\Delta H} = \Delta H_{\mathrm{mix}} / {}^{n}C_{2}\).

\[ \Delta S_{\mathrm{mix}} = -R \sum_{i=1}^{N} x_i \ln x_i \tag{6} \]

\[ \Omega = \frac{\bar{T}_m \, \Delta S_{\mathrm{mix}}}{\left|\Delta H_{\mathrm{mix}}\right|} \tag{7} \]

2.3. Machine learning methodology

2.3.1. Addressing data imbalance

We split the data set of 307,348 samples into a training set (90%) and a test set (10%). Of the 276,613 samples in the training set, 37,188 are Class 1, 7910 are Class 2, and the remaining 231,515 are Class 3. It is clear that, inherent in the data set, there is a severe imbalance despite the effort in balancing these data as described in Section 2.1. The relative under-representation of Classes 1 and 2 can be considered a reflection of the scarcity of single FCC and BCC phases in reality. This imbalance can
cause a high accuracy in majority classes but low accuracy for minority classes. Typically, two types of strategies [55] can be used to address this data imbalance issue: (i) re-balancing the dataset through under-sampling majority classes or over-sampling minority classes, and (ii) cost-sensitive scoring, where sample weights inversely proportional to class sizes are taken into consideration when evaluating the performance metric and objective functions. Rebalancing the dataset by oversampling the minority classes would result in a huge training set consisting of 231,515 × 3 = 694,545 samples, adding to the computational cost. On the other hand, under-sampling the majority class would result in the loss of data. Such considerations have led us to implement cost-sensitive scoring.

For classification problems, the objective or cost function is usually the cross entropy loss, also known as log loss, defined as \(-\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K} \rho_k^{(i)} \ln \hat{\rho}_k^{(i)}\), where N is the total number of samples, K the number of classes, \(\rho_k^{(i)}\) the target probability that the ith sample or instance belongs to class k, which takes the value of either 0 or 1, and \(\hat{\rho}_k^{(i)}\) the predicted probability that the ith sample or instance belongs to class k. In our model-building process, we use the weighted cross entropy loss \(-\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K} w^{(i)} \rho_k^{(i)} \ln \hat{\rho}_k^{(i)}\), where the ith sample is associated with a weight \(w^{(i)}\) that is inversely proportional to the size of the class that the sample belongs to. The sample weight \(w^{(i)}\) is computed using the Scikit-Learn library [56], according to the formula \(w^{(i)} = \frac{N}{N^{(i)} N_{\mathrm{classes}}}\), where N is the total number of samples in the training set, \(N^{(i)}\) is the number of training samples in the class to which the ith sample belongs, and \(N_{\mathrm{classes}}\) is the number of classes. As a result, the samples of Class 1, 2, and 3 are assigned weights of 2.48, 11.66 and 0.40, respectively. During training, such a weighted loss function penalizes errors made on the minority classes more than those on the majority class, forcing the model to achieve the right balance between the classes.

2.3.2. Choice of ML model - XGBoost classifier

XGBoost is widely reported to outperform classical ML models, such as logistic regression, k-nearest neighbours (KNN) and support vector machines (SVM), as well as neural networks [48]. Boosting algorithms proceed iteratively by combining weak predictors to form a strong predictor, with each additional predictor correcting the residual error of the previous one. A sequence of classifiers can thus be built, enabling progressively more accurate training. In addition, the XGBoost implementation of gradient boosting possesses the advantages of scalability, the ability to handle large datasets with ease, as well as support for parallel and distributed computing. Importantly, XGBoost is also compatible with cost-sensitive scoring and allows implementation of weighted cross entropy loss for classification problems. For this specific problem, we carried out a comparison between various machine learning models, and XGBoost was found to deliver the best generalization performance. The comparison of their accuracies can be found in Section S2 of the Supplemental Information.

The base predictors of XGBoost are classification and regression trees (CART), which provide us with a natural and convenient mechanism for measuring the importance of each feature [48,57]. Based on such feature importance scores, feature selection can be carried out, enabling the reduction of model complexity while at the same time improving interpretability and offering insights into the physical relevance of each descriptor in the phase selection. XGBoost is therefore chosen in the current work for building our machine learning model due to its support for cost-sensitive scoring and ability to generate feature importance scores, besides being more accurate than other machine learning methods.

3. Results and discussions

3.1. Results of machine learning

3.1.1. Feature selection

Feature selection is another essential step for establishing a robust machine learning model. We first calculate Pearson correlation coefficients between every pair of features. As can be seen from Fig. S1, some features are highly correlated, which indicates that there is room for feature selection, i.e., selecting a subset consisting of the most significant features.

To rank the importance of the individual features in terms of their contributions towards the classification objective, we train an XGBoost model by using all 15 features of the data set. The resulting importance scores are shown in Fig. 1(a). Such feature importance scores depend on the training data set; it is quite possible that with a different training set, the importance scores can be substantially different even if the problem is still classifying the same three classes. To give an estimate of the extent of variation, we trained 256 XGBoost models, each on a randomly undersampled 90% of the training data set. In Fig. 1(a), the width of each blue stripe corresponds to the mean of the importance scores for a feature generated across the 256 models, and the error bar the standard deviation. The standard deviation allows one to gauge the likelihood that the feature importance ranking can be perturbed. We observe that the standard deviation is low for most features, especially those with low importance scores. Subsequently, we carry out feature selection by eliminating features one by one in the order of increasing importance, starting with the least important feature, σH. At each stage of the process, we tune the hyperparameters of the reduced model (see Section S2 in Supplementary Information) using 5-fold cross-validation grid search and record the weighted cross-entropy loss measured on the training set. Fig. 1(b) shows the weighted cross-entropy loss as a function of the number of input features, which indicates that five input features are necessary since further reduction leads to a significant increase in the cross-entropy. The five most important features identified are Teq, R, Δ(VE), VE and Δχ.

It is noted that some of the five most important features identified from the present work are consistent with previous studies [17–19], where the radius, electronegativity, and valence electron were often chosen based on the Hume-Rothery rule. In contrast to all previous parametric studies in which the equilibrium temperature was not taken into account, our work highlights the importance of the equilibrium temperature in the phase selection of HEAs. In the following, we discuss in greater detail the possible reasons why the five important descriptors, namely, Teq, R, Δ(VE), VE and Δχ, are chosen in our phase selection rules of the multicomponent HEAs. Since three other descriptors, namely ΔSmix, ΔHmix, and Ω, are not chosen in our phase selection rules, we also discuss the possible reasons and their roles in HEA thermodynamics.

The fundamental principle of phase stability is attributed to the Gibbs energy, which can be written as \(\Delta G = \Delta H_{\mathrm{mix}} - T\left(\Delta S_{\mathrm{mix}} + \Delta S^{\mathrm{ex}}_{\mathrm{mix}}\right)\), where T is the equilibrium temperature (Teq), ΔSmix is the configuration entropy as defined by Eq. (6) in the descriptor selection section and \(\Delta S^{\mathrm{ex}}_{\mathrm{mix}}\) is the excess entropy. In principle, all the quantities in the Gibbs energy should be important. Indeed, our study shows that the equilibrium temperature is an important factor. As pointed out by Guo [58] and George et al. [34], however, ΔSmix is not an effective factor for HEA phase selection, which is consistent with our result. Although \(\Delta S^{\mathrm{ex}}_{\mathrm{mix}}\) could be an important factor [34], it is commonly not considered, perhaps due to the difficulties in calculating it. In the present study, ΔHmix is calculated by a commonly used scheme [53], which is based on the thermodynamics of liquid phases. A recent study [59] has shown that the scheme is inadequate to describe the complex thermodynamics of solid HEAs, which may cause the exclusion of ΔHmix in our phase
selection rules. For the same reasons, Ω, which is defined as \(\Omega = \bar{T}_m \Delta S_{\mathrm{mix}} / \left|\Delta H_{\mathrm{mix}}\right|\), is also not included in our phase selection rules.

Fig. 1. (a) The importance scores for all 15 features. (b) Weighted cross-entropy loss on training set as a function of the number of input features. Features are eliminated one by one, starting with the least important one, σH, in the order of increasing importance as indicated in (a).

The significance of the other 4 chosen descriptors R, Δ(VE), VE and Δχ can be explained by their relationships with the mixing enthalpy ΔHmix of HEAs. It is instructive to use a binary AB alloy to illustrate these relationships. According to the well-known Miedema model [60,61], ΔHmix can be written as Eq. (8).

\[ \Delta H_{\mathrm{mix}} = \frac{2 f(c)\left(x_A V_A^{2/3} + x_B V_B^{2/3}\right)}{\left(n_{ws}^{A}\right)^{-1/3} + \left(n_{ws}^{B}\right)^{-1/3}} \left[ -P\left(\Delta\chi\right)^2 + Q\left(\Delta n_{ws}^{1/3}\right)^2 - R \right] \tag{8} \]

where x_A(B) and V_A(B) are the mole fraction and the molar volume of element A(B) in the binary alloy, respectively, f(c) represents the concentration dependence of ΔHmix, n_ws denotes the electron density at the boundary of the Wigner–Seitz cell, R denotes the hybridization parameter for the alloying of a transition metal with a non-transition metal, and P and Q are constants. Clearly, both Δn_ws and R are associated with the redistribution of the valence electron density that contributes to the metallic bonding. Thus these two parameters are closely related to the two chosen descriptors Δ(VE) and VE. The thermodynamic significance of valence electron concentration (VEC) and its crucial roles in phase stability as well as numerous physical properties of alloys have been discussed in [23,62], and the importance of VEC in phase stability has also been highlighted in [19,58]. The Miedema model also suggests that the other two descriptors, namely, the difference in electronegativity (Δχ) and the average radius R (which is related to the element volume), should also be taken into consideration. It is noted that the importance of atomic radius and electronegativity differences was also revealed in [23].

Fig. 2. Confusion matrices of the ML model trained with the five features for (a) the training data set, (b) test data set, and (c) the experimental data set from 15 sources.

3.1.2. Testing on the final model based on the five features

The final ML model is trained on the five most important features, Teq, R, Δ(VE), VE and Δχ. The performance on the training and test sets can be seen from the confusion matrices of predictions [63] as shown in Fig. 2(a) and (b), respectively. In each confusion matrix, the row-wise accuracies are shown in the top three cells of the right-most column, corresponding to the recalls for Classes 1, 2 and 3, respectively. Similarly, the column-wise accuracies shown in the first three cells of the bottom row are the precisions for the three classes. The overall prediction accuracy is given in the bottom right corner. On the training set, this accuracy is 99.95%, while on the test set, it is 99.92%. The precisions
are all above 99% for all the classes. Such high accuracy indicates that the final model built on the five features generalizes well. In the subsequent sections, we will use this model to make predictions on experimental data gathered from literature and guide the design of single-phase FCC and BCC HEAs.

3.2. Testing on experimental data

We have collected 155 annealing experimental data of HEAs published by 15 research groups (see Section S4 and Table S3 in Supplemental Information for details). Testing on these experimental data provides an important means to evaluate the performance of our trained ML model. The testing results in the form of the confusion matrix are shown in Fig. 2(c). It is seen that the recalls of Class 1 (FCC) and Class 3 (multi-phases) are 94% and 76%, respectively. In contrast, the recall of Class 2 (BCC) is low (40%), largely due to a very small sample size (5 samples only). Nevertheless, the overall accuracy on the entire experimental dataset reaches as high as 81%. It is worthwhile to note that the ML model is developed solely from quinary alloys formed by combinations of the 8 elements Al, Cr, Co, Ni, Fe, Cu, Mn, Ti, whereas the assessment of the ML model by the experimental dataset is extrapolated to various alloy systems of 3 to 7 elements, including not only the 8 elements but also other elements, such as Hf, Mo, Nb, Ru, Ta, W, V, Zn and Zr. Therefore, the present ML model demonstrates a good predictive capability for phase classification for HEAs. In fact, the performance of our ML model is comparable to or even better than those reported in [24,26,31].

3.3. Criteria for single-phase formation

To visualize the correlations of the 5 features with the three classes, we generate parallel coordinate plots (PCPs) from the training dataset as shown in red in Fig. 3(a), (b), and (c) for the three classes, respectively. The five features are represented as five vertical axes spaced evenly along the horizontal axis. The connected line segments represent the three classes in the training dataset. The maximum and minimum values of each descriptor on the vertical direction of the PCP for each class are shown in Table 1.

Table 1
Comparison of the ranges of the five features of the training dataset and experimental dataset.

Descriptors            Dataset      FCC             BCC             Other phases
                                    Min     Max     Min     Max     Min     Max
Teq (K)                Training     615     1720    1070    1700    300     1700
                       Experiment   973     1523    973     1473    873     1523
R (Å)                  Training     1.25    1.3     1.27    1.32    1.27    1.4
                       Experiment   1.25    1.29    1.3     1.47    1.25    1.47
VE (Villars scale)     Training     7.4     10.45   5.98    7.85    4.5     9.6
                       Experiment   7.5     9.8     4.5     6.8     4.09    8.88
Δχ (Allen scale)       Training     0.018   0.177   0.056   0.102   0.064   0.23
                       Experiment   0.033   0.145   0.076   0.125   0.034   0.182
Δ(VE) (Villars scale)  Training     0.59    2.68    1.14    2.32    1.61    3.66
                       Experiment   0.75    2.13    0.5     2.44    0.6     3.44

Clearly, the PCPs suggest that lower values of R, Δ(VE) and Δχ are in

consistent with the rule developed by Guo et al. [19]. It is also observed that the five features span nearly the entire ranges for Class 3, implying overlaps in the ranges of the five features for forming Class 3 and the two single-phase classes.

The PCPs of the 5 features obtained from the experimental data are superimposed on the respective classes of the training dataset as shown in grey in Fig. 3. It is seen that the majority of the experimental data fall in the ranges of the training dataset. In addition, the ranges for the five features obtained from the experimental dataset are also shown in Table 1 for comparison. Note that the upper or lower bounds of the experimental dataset that lie outside of the training dataset are indicated in bold. As can be seen, the bounds of the experimental data generally fall within the ranges of the corresponding training dataset, which may well explain the good prediction accuracy (81%).

In order to establish a high-fidelity guideline to search for single FCC and BCC phases, we analyse the success rates, defined as the percentage of the instances of single-phase FCC or BCC among the total instances within the ranges of the 5 features. The feature ranges are set to the [5th, 95th] percentiles and the results are shown in Table 2. As shown in the last column, if only one individual condition is satisfied, i.e., falling within the range of one feature, the success rate of FCC formation varies between 28% and 77% and that of BCC formation between 7% and 15%. For instance, in the temperature range from 1080 K to 1660 K, the success rate to
favour of the formation of both FCC and BCC phases, which is in good form an FCC phase is 33%. However, if all 5 conditions are satisfied si-
agreement with the Hume-Rothery rule. We also notice that Senkov multaneously, the success rate for the formation of FCC and BCC
et al. [44] inferred, on the basis of their CALPHAD calculations, that sim- single-phases increases to 93% and 92%, respectively. It is worthwhile
ilar electronegativity and/or valence of the alloying elements are not al- to note that if the feature ranges are set to [minimum, maximum] as
ways required for the formation of single-phase in HEAs, which listed in Table 1, the success rate for single-phase FCC and BCC decreases
however contradicts the Hume-Rothery rule. It is also observed from to 53% and 48%, respectively. Therefore, the combination of five feature
the PCPs that single FCC and BCC phases are more likely to be stable at ranges defined by the [5th, 95th] percentiles offers more effective rules
higher temperatures, which is consistent with the common knowledge for the search of new single-phase FCC and BCC structures.
that a higher temperature is in favour of single-phase HEA due to the in- It is also interesting to compare the parameters reported in the liter-
crease of (-TΔSmix). It is also seen that FCC is likely to be stable at the ature with the resultant ranges of individual descriptors, although some
higher end while BCC at the middle of VE. This behaviour is also of them are not identified as the 5 important features in the present
Fig. 3. Parallel coordinate plots of FCC (a), BCC (b) and multi-phases (c) from the training and experimental datasets. The units of the features are given in the parentheses as below: Teq (K),
R (Å), VE (Villars scale), Δχ (Allen scale), Δ(VE) (Villars scale).
Table 2
The instance counts of single-phase (FCC and BCC) and the success rates when satisfying the conditions.

| Feature | FCC condition, [5th, 95th] percentiles | Total counts of instances satisfying the condition | Counts of FCC instances satisfying the condition | 100 × (FCC counts) / (Total counts) |
|---|---|---|---|---|
| Teq (K) | [1080, 1660] | 102,127 | 33,922 | 33 |
| R (Å) | [1.26, 1.28] | 44,585 | 34,224 | 77 |
| VE (Villars scale) | [7.80, 8.96] | 87,135 | 33,804 | 39 |
| Δχ (Allen scale) | [0.066, 0.105] | 120,361 | 33,425 | 28 |
| Δ(VE) (Villars scale) | [1.50, 2.26] | 87,314 | 33,575 | 38 |
| All 5 features | All 5 conditions | 27,450 | 25,627 | 93 |

| Feature | BCC condition, [5th, 95th] percentiles | Total counts of instances satisfying the condition | Counts of BCC instances satisfying the condition | 100 × (BCC counts) / (Total counts) |
|---|---|---|---|---|
| Teq (K) | [1330, 1690] | 48,778 | 7345 | 15 |
| R (Å) | [1.28, 1.30] | 80,052 | 7228 | 9 |
| VE (Villars scale) | [6.45, 7.55] | 97,154 | 7150 | 7 |
| Δχ (Allen scale) | [0.071, 0.096] | 80,312 | 7142 | 9 |
| Δ(VE) (Villars scale) | [1.46, 2.10] | 59,661 | 7200 | 12 |
| All 5 features | All 5 conditions | 6105 | 5603 | 92 |
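As a quick consistency check, the success rates in the last column of Table 2 can be recomputed from the listed counts, and the "all 5 conditions" rule can be written as a simple window test. The following is a minimal sketch, not the authors' code: the dictionary and function names are hypothetical, the counts and [5th, 95th] windows are copied from Table 2, the windows are treated as inclusive (an assumption), and the example feature values are made up.

```python
# Counts copied from Table 2: (total instances in range, single-phase instances in range).
TABLE2_COUNTS = {
    "FCC": {"Teq": (102_127, 33_922), "R": (44_585, 34_224), "VE": (87_135, 33_804),
            "dchi": (120_361, 33_425), "dVE": (87_314, 33_575), "all": (27_450, 25_627)},
    "BCC": {"Teq": (48_778, 7_345), "R": (80_052, 7_228), "VE": (97_154, 7_150),
            "dchi": (80_312, 7_142), "dVE": (59_661, 7_200), "all": (6_105, 5_603)},
}

# [5th, 95th] percentile windows copied from Table 2 (treated as inclusive here).
WINDOWS = {
    "FCC": {"Teq": (1080, 1660), "R": (1.26, 1.28), "VE": (7.80, 8.96),
            "dchi": (0.066, 0.105), "dVE": (1.50, 2.26)},
    "BCC": {"Teq": (1330, 1690), "R": (1.28, 1.30), "VE": (6.45, 7.55),
            "dchi": (0.071, 0.096), "dVE": (1.46, 2.10)},
}


def success_rate(total, hits):
    """Success rate in percent, rounded to the nearest integer as in Table 2."""
    return round(100 * hits / total)


def satisfies_all(features, phase):
    """True if every feature lies inside the window for the given phase."""
    return all(lo <= features[name] <= hi
               for name, (lo, hi) in WINDOWS[phase].items())


RATES = {phase: {name: success_rate(*counts) for name, counts in rows.items()}
         for phase, rows in TABLE2_COUNTS.items()}

# Hypothetical feature vector, for illustration only.
example = {"Teq": 1400, "R": 1.27, "VE": 8.2, "dchi": 0.08, "dVE": 1.9}
print(RATES["FCC"]["all"], RATES["BCC"]["all"])
print(satisfies_all(example, "FCC"), satisfies_all(example, "BCC"))
```

The recomputed rates agree with the tabulated percentages, which shows that the jump from 28-77% (single conditions) to 93% (all conditions) for FCC comes from the much smaller denominator once all five windows are imposed together.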
work. It was reported that the solid solution phase stabilizes in the range of Ω ≥ 1.1 and δ ≤ 6.6% [17]. Our results show that for the [5th, 95th] percentiles, Ω is in the range of [1.9, 4.2] for both single-phase FCC and BCC, and δ is in the range of [2.9%, 4.2%] for FCC and [2.8%, 5%] for BCC, in good agreement with [17]. From the study of a quinary AlCoCrFeNi system by CALPHAD, it was found that the VEC for BCC is in the range of [5.7, 7.2], while for FCC, VEC > 8.4 [21]. Also, it was reported in [19] that an FCC solid solution is stable at higher VEC (≥8), while a BCC solid solution is stable at lower VEC (≤6.87). By comparison, the VEC ranges defined by the [5th, 95th] percentiles in our results obtained from 20 quinary families of 276,613 data points are [6.5, 7.6] and [7.8, 9.0] for BCC and FCC, respectively. Clearly, these previous studies support the current findings as these critical VEC values are within the ranges of our phase selection rules. This result implies that our new phase selection rules, established based on a large number of equilibrium data calculated by CALPHAD, can also be used to guide HEA design by as-cast experiments. This conclusion is in agreement with the ML study by Pei et al. [31], in which the CALPHAD verification reached 94% consistency with the prediction from an ML model built on 1252 as-cast data.

It is interesting to point out that our work here clarifies the mystery concerning the role of VEC in phase selection raised recently in a review article [64], which found that the role of VEC in the ML studies [24,26] was significantly different from the empirical study [19]. In the former, the VEC criterion was regarded as very important in the phase selection, whereas in the latter, it was stated that VEC could only distinguish FCC and BCC. Our study here shows that VEC is only one of the five most important descriptors in the phase selection rules and ranks as the third most important factor according to the importance scores shown in Fig. 1(a).

3.4. The element effect on the stability of single-phase solid solutions

In recent years, the concept of the stabilizer of BCC or FCC phases in the conventional alloys has been extended to HEAs and some inconsistent results were reported [20,65–67]. To clarify the inconsistency and further explore the element effect on the stability of single-phase solid solutions, we generate the PCPs of the 8 elements for each of the three classes in the same manner as the descriptors. The PCPs for Classes 1, 2, and 3 are shown in Fig. 4(a), (b), and (c), respectively. To compare our results with those published by others, we define a phase stabilizer as follows: an element is a stabilizer of a phase if its composition is found at higher levels in the PCP of that phase (e.g., FCC) and at lower levels in the other PCP (e.g., BCC). The comparison is shown in Table 3. Clearly, Al and Cr are deemed as BCC stabilizers in all the studies. Although in our study Al appears in both BCC and FCC phases, Al is found at higher levels in BCC phases and is thus regarded as a BCC stabilizer in Table 3. Similarly, Co and Ni are found to be the FCC stabilizers in all the studies. The PCPs of both FCC and BCC show that the Fe composition varies from low to high levels, indicating that Fe is neither an FCC nor a BCC stabilizer. This observation is in agreement with [65], where it is argued that Fe is neutral in stabilizing the solid-solution phases. However, there are inconsistent arguments on the role of Mn as a stabilizer. While Mn was believed to be an FCC stabilizer in [66,67], it was regarded as a BCC stabilizer in [20]. The PCPs show that Mn compositions span from low to high levels in the BCC phase but stay at lower levels only in the FCC phase. Thus our result supports the argument of [20]. From the PCPs, we can conclude that Cu and Ti are neither FCC stabilizers nor BCC stabilizers, although low levels of Cu can exist in FCC phases. Our results suggest that the concept of element stabilizer in the conventional alloys should be used cautiously in the element selection for HEA design. Instead, the new phase selection rules established in our study should be used to choose the phase stabilizing elements.

Fig. 4. Parallel coordinate plots of compositions of the 8 elements in the training dataset for (a) Class 1, (b) Class 2 and (c) Class 3.

Table 3
Comparison of the phase stabilizers reported in the literature and this work.

| Element | FCC | BCC |
|---|---|---|
| Al | | [20,65], this work |
| Cr | | [20,65], this work |
| Co | [20,65], this work | |
| Ni | [20,65], this work | |
| Fe | | |
| Cu | | |
| Mn | [66,67] | [20], this work |
| Ti | | |

3.5. Relationship of element compositions and descriptors

In the following, we analyse how each of the 8 composition elements is correlated with each of the 5 features by the correlation map, i.e., the Pearson matrix. Fig. 5 reveals the correlations in the training dataset. It can be seen that for the average radius R, Ti has the largest impact, followed by Al. In comparison, for the average valence electron VE, the elements with significant impact in descending order are Al, Ti and Ni. The largest correlation with Δ(VE) is that of Al, whereas the largest correlation with Δχ is that of Ti. The correlations of all the elements with temperature are low and thus negligible.

Of particular interest is that the Pearson correlation coefficients shown in Fig. 5 may offer additional insights into the role of different elements in the phase stabilization in HEAs via the element effects on VE. As shown in the third column of Fig. 5, large negative coefficients of both Al and Ti with the VE are observed, suggesting that Al and Ti have greater effects on decreasing VE and thus may strongly stabilize BCC phases. Furthermore, Cr has a relatively small negative coefficient, and thus Cr is a weak BCC stabilizer. On the other hand, moderate positive coefficients are found for Co, Ni and Cu, and thus these three elements have moderate effects on the increase of VE, which leads to stabilizing FCC phases. Finally, a small positive coefficient of Fe and a small negative coefficient of Mn are observed, which implies that Fe is a weak FCC stabilizer and Mn a weak BCC stabilizer. Despite these findings, it should be emphasized that VEC is only one of the five important factors in the correlation map. The strong correlations between elements and other descriptors, such as Al and Ti with R and Δχ, may outweigh VEC factors in the phase selection of HEAs. This conclusion is supported by a recent study on the effect of different elements on phase formation in the alloys [68], where the authors found that while the FCC structure is stable when the radius difference ΔR ≤ 2.8 and VE ≥ 8.27, the intermetallic phases are favoured when Δχ > 0.133. In summary, the phase selection is determined by the overall effects of the five factors identified in the present work, which offers a framework for rationalizing the different empirical physical parameters in the phase selection of high-entropy alloys.

Fig. 5. The correlation map between 8 elements and 5 important features. The value in the grid shows the correlation coefficient between the corresponding element and descriptor. The colour intensity is proportional to the magnitude of the correlation coefficients.

We further analyse the impact of the element compositions on the magnitudes of the five features by the sensitivity analysis, namely, the partial derivatives of each feature with respect to each element composition. Of the five features identified in this work, namely, Teq, R, Δ(VE), VE, Δχ, two of them (R and VE) are obtained by Eq. (1) and the other two (Δχ and Δ(VE)) from Eq. (3). Thus the partial derivatives of R and VE are calculated by Eq. (9), and those of Δχ and Δ(VE) by Eq. (10).

$$\frac{\partial P}{\partial x_i} = P_i \tag{9}$$

$$\frac{\partial \Delta P}{\partial x_i} = \frac{(P_i - P)^2 - 2 P_i \sum_{j=1}^{N} x_j (P_j - P)}{2\sqrt{\sum_{j=1}^{N} x_j (P_j - P)^2}} \tag{10}$$

where xi and Pi are the mole fraction and the element property of the ith element, respectively.

The sensitivity of an element composition to R and VE is readily available from Eq. (9), namely, Ti > Al > Cr = Cu > Fe > Mn > Co = Ni for R, and Cu < Ni < Co < Fe < Mn < Cr < Ti < Al for VE. As can be seen, the elements with a higher sensitivity to both R and VE are in good agreement with those of higher correlation in the Pearson matrix. Since the sensitivity analysis of Δχ and Δ(VE) expressed by Eq. (10) is too complicated, we employ a numerical method to calculate the partial derivatives of the 8 elements in pseudo 8-atom alloys. By setting the composition spacing to 0.02 mole fraction, the resultant number of instances (data points) is 85,900,584. Among them, the highest partial derivative of Δχ with respect to Ti mole fraction accounts for 95.4%, followed by that of Ni (4.6%). These results agree well with the Pearson matrix, where Ti has the highest correlation with Δχ. Similarly, the highest partial derivative of Δ(VE) with respect to Al is found to be ranked at the top, i.e., 63.4%, followed by Cu, 37.5%. These results are also consistent with the Pearson matrix, which shows that Al and Cu have the highest correlations with Δ(VE).

3.6. Prediction of new single-phase FCC and BCC HEAs

By using the newly developed phase selection rules and the trained ML model, we can search for possible single-phase FCC and BCC structures for the design of new medium and high entropy alloys. As a demonstration, we study one quinary family and one senary family of HEAs, namely, AlCrFeMnNi and AlCoCrFeNiTi. The composition spacing is 0.05 mole fraction, and each element composition varies from 0.05 to 0.8 in the quinary family and from 0.05 to 0.75 in the senary family. The temperature range is set to be 800 K to 1700 K with a step of 10 K. The values of the five features are calculated for the two families and the instances are screened for those falling within the suggested feature ranges. By taking advantage of our efficient ML model, a set of pseudo single-phase FCC and BCC formulas are rapidly predicted and then verified by Thermo-Calc calculations. A total of 213 formulas are generated and compiled together with their equilibrium temperature ranges as shown in Table S4 in Section S5 of Supplemental Information. It is seen that the AlCrFeMnNi series mainly results in BCC structures, whereas the AlCoCrFeNiTi series mainly leads to FCC structures. Since a high configurational entropy ΔSmix is believed to be the main stabilizing factor for single-phase BCC and FCC structures [69,70], all the new candidates are presented in Table S4 in decreasing order of ΔSmix/R for easy search for alloys with high entropy (ΔSmix > 1.5R) and medium entropy (1R < ΔSmix < 1.5R).

3.7. Criteria for the formation of duplex phases

While the focus of the present work is the formation of single-phase HEAs, we extend the methodology to establish the criteria for the formation of duplex phases. The duplex phases are defined as those for which the mole fraction of the matrix (disordered FCC or BCC phases) is greater than 0.8 and the sum of the matrix and precipitate (ordered FCC or BCC) is greater than 0.999. A total of 58 instances of FCC duplex phases and 667 instances of BCC duplex phases are found in the training dataset. In the same fashion as described in Section 3.3, where the criteria for the formation of single FCC and BCC phases are suggested, the values of the 5 features that simultaneously fall in the ranges leading to the duplex phases are listed in Table 4.

Table 4
Important feature ranges for producing FCC and BCC duplex phases.

| Duplex phases | Range | Teq (K) | R (Å) | VE (Villars scale) | Δχ (Allen scale) | Δ(VE) (Villars scale) |
|---|---|---|---|---|---|---|
| FCC | min | 1020 | 1.27 | 8.00 | 0.08 | 2.17 |
| | max | 1440 | 1.30 | 9.10 | 0.16 | 2.62 |
| BCC | min | 1050 | 1.28 | 5.90 | 0.07 | 1.66 |
| | max | 1680 | 1.32 | 7.40 | 0.10 | 2.30 |
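Returning to the sensitivity analysis of Section 3.5, the analytical derivatives of Eqs. (9) and (10) can be checked against central finite differences. The sketch below is illustrative only. It assumes that Eq. (1) is the composition-weighted mean, P = Σ xi Pi, and that Eq. (3) is the composition-weighted standard deviation, ΔP = sqrt(Σ xi (Pi − P)²), which is what Eqs. (9) and (10) imply. The radii are placeholder values chosen only to respect the ordering Ti > Al > Cr = Cu > Fe > Mn > Co = Ni quoted in the text; they are not the paper's data, and all function names are hypothetical.

```python
from math import sqrt

# Placeholder radii (Å), chosen only to respect the ordering quoted in the text.
RADII = {"Al": 1.43, "Co": 1.25, "Cr": 1.28, "Cu": 1.28,
         "Fe": 1.27, "Mn": 1.26, "Ni": 1.25, "Ti": 1.46}


def mean_property(x, p):
    """Assumed form of Eq. (1): composition-weighted mean of an element property."""
    return sum(xi * pi for xi, pi in zip(x, p))


def spread(x, p):
    """Assumed form of Eq. (3): composition-weighted standard deviation."""
    m = mean_property(x, p)
    return sqrt(sum(xi * (pi - m) ** 2 for xi, pi in zip(x, p)))


def d_mean(x, p, i):
    """Eq. (9): derivative of the mean with respect to x_i."""
    return p[i]


def d_spread(x, p, i):
    """Eq. (10): derivative of the spread with respect to x_i."""
    m = mean_property(x, p)
    s = sum(xj * (pj - m) for xj, pj in zip(x, p))
    return ((p[i] - m) ** 2 - 2 * p[i] * s) / (2 * spread(x, p))


def central_difference(f, x, p, i, h=1e-6):
    """Numerical derivative of f with respect to x_i, other fractions held fixed."""
    up, down = list(x), list(x)
    up[i] += h
    down[i] -= h
    return (f(up, p) - f(down, p)) / (2 * h)


elements = list(RADII)
p = [RADII[e] for e in elements]
x = [1 / 8] * 8  # equiatomic 8-element composition

for i, name in enumerate(elements):
    analytic = d_spread(x, p, i)
    numeric = central_difference(spread, x, p, i)
    print(f"{name}: Eq.(10) = {analytic:.6f}, finite difference = {numeric:.6f}")
```

Note that the derivatives treat each mole fraction as an independent variable, as Eqs. (9) and (10) do; at a normalized composition the sum in the numerator of Eq. (10) vanishes, so the sensitivity of ΔP is governed by the squared deviation of the element property from the mean.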
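The grid sizes quoted in Sections 3.5 and 3.6 can be reproduced with a short enumeration. The sketch below assumes that every element is present at one grid step or more, so that compositions are positive multiples of the spacing that sum to 1; under that assumption the 0.02 spacing for 8 elements gives exactly the 85,900,584 instances stated in Section 3.5, and the 0.05 spacing gives the stated upper limits of 0.8 (quinary) and 0.75 (senary). The function names are hypothetical, and the entropy thresholds are those given in Section 3.6 with ΔSmix/R = −Σ xi ln xi.

```python
from itertools import combinations
from math import comb, log


def count_compositions(n_elements, steps):
    """Number of compositions whose fractions are positive multiples of 1/steps."""
    return comb(steps - 1, n_elements - 1)


def enumerate_compositions(n_elements, steps):
    """Yield mole-fraction tuples (positive multiples of 1/steps) that sum to 1."""
    # Stars and bars: choose n_elements - 1 cut points among steps - 1 positions.
    for cuts in combinations(range(1, steps), n_elements - 1):
        bounds = (0,) + cuts + (steps,)
        yield tuple((b - a) / steps for a, b in zip(bounds, bounds[1:]))


def mixing_entropy_over_R(x):
    """Ideal configurational entropy of mixing divided by the gas constant R."""
    return -sum(xi * log(xi) for xi in x if xi > 0)


def entropy_class(x):
    """Classify by the thresholds of Section 3.6 (high > 1.5R, medium > 1R)."""
    s = mixing_entropy_over_R(x)
    if s > 1.5:
        return "high"
    if s > 1.0:
        return "medium"
    return "low"


QUINARY = list(enumerate_compositions(5, 20))  # spacing 0.05, 5 elements
N_TEMPERATURES = len(range(800, 1701, 10))     # 800 K to 1700 K in 10 K steps

print(count_compositions(8, 50))               # spacing 0.02, 8 elements
print(len(QUINARY), N_TEMPERATURES)
print(entropy_class((0.2,) * 5))
```

Each composition-temperature pair from such a grid would then be converted to the five features and tested against the feature windows before being passed to the ML model and verified by CALPHAD, as described in Section 3.6.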
4. Conclusion

We develop an ML model using a large amount of phase equilibrium data in the well-defined vast composition-temperature space of HEAs calculated by the CALPHAD method, and the ML model is validated with 81% accuracy by 155 experimental data points from 15 different sources. On the basis of the ML results and the analysis of a large amount of data, we establish new phase selection rules for single-phase FCC and BCC HEAs which achieve success rates of 93% and 92%, respectively. The new rules state that all of the following five conditions must be satisfied simultaneously: for single-phase FCC, 1080 < Teq (K) < 1660, 1.26 < R (Å) < 1.28, 7.80 < VE (Villars scale) < 8.96, 0.066 < Δχ (Allen scale) < 0.105, and 1.50 < Δ(VE) (Villars scale) < 2.26; and for single-phase BCC, 1330 < Teq (K) < 1690, 1.28 < R (Å) < 1.30, 6.45 < VE (Villars scale) < 7.55, 0.071 < Δχ (Allen scale) < 0.096, and 1.46 < Δ(VE) (Villars scale) < 2.10. We further demonstrate that some of the previously proposed phase selection rules are only a subset of the phase selection rules established in this work, which explains why they are only partially successful for HEA phase selection. The subsequent analysis of the relationship between the element compositions and the five important features reveals the relative sensitivities of the constituent elements in the formation of desirable phases. Finally, 213 new single-phase BCC and single-phase FCC structures of high- and medium-entropy alloys are predicted, providing an ample opportunity for experimental validation. The newly proposed phase selection rules together with the relationships in the descriptor-composition-phase paradigm provide powerful tools for the rational design of phase structures in HEAs.

Authors contributions

Yingzhi Zeng: Data curation, Methodology, Validation, Visualization, Writing - review & editing. Mengren Man: Data curation, Methodology, Validation, Writing - review & editing. Kewu Bai: Conceptualization, Supervision, Methodology, Visualization, Writing - review & editing. Yong-Wei Zhang: Conceptualization, Funding acquisition, Methodology, Project administration, Supervision, Writing - review & editing.

Data availability

All data used in this manuscript are available from the authors on reasonable request.

Declaration of Competing Interest

None.

Acknowledgment

This work is supported by RIE2020 AME Programmatic Grant: AMDM (Grant No. A1898b0043) and by the A*STAR Computational Resource Centre and National Supercomputing Centre Singapore through the use of their high-performance computing facilities.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.matdes.2021.109532.

References

[1] J.W. Yeh, S.K. Chen, J.Y. Gan, S.J. Lin, T.S. Chin, T.T. Shun, C.H. Tsau, S.Y. Chang, Formation of simple crystal structures in Cu-Co-Ni-Cr-Al-Fe-Ti-V alloys with multiprincipal metallic elements, Metall. Mater. Trans. A-Phys. Metall. Mater. Sci. 35A (8) (2004) 2533–2536.
[2] J.W. Yeh, S.K. Chen, S.J. Lin, J.Y. Gan, T.S. Chin, T.T. Shun, C.H. Tsau, S.Y. Chang, Nanostructured high-entropy alloys with multiple principal elements: novel alloy design concepts and outcomes, Adv. Eng. Mater. 6 (5) (2004) 299–303.
[3] B. Cantor, I.T.H. Chang, P. Knight, A.J.B. Vincent, Microstructural development in equiatomic multicomponent alloys, Mater. Sci. Eng. A-Struct. Mater. Prop. Microstruct. Process. 375 (2004) 213–218.
[4] E.P. George, W.A. Curtin, C.C. Tasan, High entropy alloys: a focused review of mechanical properties and deformation mechanisms, Acta Mater. 188 (2020) 435–474.
[5] S. Gorsse, J.-P. Couzinié, D.B. Miracle, From high-entropy alloys to complex concentrated alloys, C R Phys. 19 (8) (2018) 721–736.
[6] D.B. Miracle, O.N. Senkov, A critical review of high entropy alloys and related concepts, Acta Mater. 122 (2017) 448–511.
[7] B.S. Murty, J.W. Yeh, S. Ranganathan, High-Entropy Alloys, Butterworth-Heinemann (an Imprint of Elsevier), Amsterdam, 2014.
[8] M.C. Gao, J.-W. Yeh, P.K. Liaw, Y. Zhang, High-Entropy Alloys. Fundamentals and Applications, Springer International Publishing, Switzerland, 2016.
[9] M.H. Tsai, J.W. Yeh, High-entropy alloys: a critical review, Mater. Res. Lett. 2 (3) (2014) 107–123.
[10] Y.F. Ye, Q. Wang, J. Lu, C.T. Liu, Y. Yang, High-entropy alloy: challenges and prospects, Mater. Today 19 (6) (2016) 349–362.
[11] W.R. Zhang, P.K. Liaw, Y. Zhang, Science and technology in high-entropy alloys, Sci. China-Mater. 61 (1) (2018) 2–22.
[12] X. Chang, M. Zeng, K. Liu, L. Fu, Phase engineering of high-entropy alloys, Adv. Mater. 32 (14) (2020) 1907226.
[13] A. Hoffman, L. He, M. Luebbe, H. Pommerenke, J.Q. Duan, P.P. Cao, K. Sridharan, Z.P. Lu, H.M. Wen, Effects of Al and Ti additions on irradiation behavior of FeMnNiCr multi-principal-element alloy, Jom 72 (1) (2020) 150–159.
[14] D.B. Dai, T. Xu, X. Wei, G.T. Ding, Y. Xu, J.C. Zhang, H.R. Zhang, Using machine learning and feature engineering to characterize limited material datasets of high-entropy alloys, Comput. Mater. Sci. 175 (2020) 6.
[15] Y. Zhang, Y.J. Zhou, J.P. Lin, G.L. Chen, P.K. Liaw, Solid-solution phase formation rules for multi-component alloys, Adv. Eng. Mater. 10 (6) (2008) 534–538.
[16] Y. Zhang, C. Wen, C. Wang, S. Antonov, D. Xue, Y. Bai, Y. Su, Phase prediction in high entropy alloys with a rational selection of materials descriptors and machine learning models, Acta Mater. 185 (2020) 528–539.
[17] X. Yang, Y. Zhang, Prediction of high-entropy stabilized solid-solution in multi-component alloys, Mater. Chem. Phys. 132 (2) (2012) 233–238.
[18] Y. Zhang, Z.P. Lu, S.G. Ma, P.K. Liaw, Z. Tang, Y.Q. Cheng, M.C. Gao, Guidelines in predicting phase formation of high-entropy alloys, MRS Commun. 4 (2) (2014) 57–62.
[19] S. Guo, C.T. Liu, Phase stability in high entropy alloys: formation of solid-solution phase or amorphous phase, Prog. Nat. Sci. 21 (6) (2011) 433–446.
[20] S.A. Kube, S. Sohn, D. Uhl, A. Datye, A. Mehta, J. Schroers, Phase selection motifs in high entropy alloys revealed through combinatorial methods: large atomic size difference favors BCC over FCC, Acta Mater. 166 (2019) 677–686.
[21] S. Yang, J. Lu, F. Xing, L. Zhang, Y. Zhong, Revisit the VEC rule in high entropy alloys (HEAs) with high-throughput CALPHAD approach and its applications for material design-a case study with Al–Co–Cr–Fe–Ni system, Acta Mater. 192 (2020) 11–19.
[22] M.C. Troparevsky, J.R. Morris, P.R.C. Kent, A.R. Lupini, G.M. Stocks, Criteria for predicting the formation of single-phase high-entropy alloys, Phys. Rev. X 5 (1) (2015) 011041.
[23] M.G. Poletti, L. Battezzati, Electronic and thermodynamic criteria for the occurrence of high entropy alloys in metallic systems, Acta Mater. 75 (2014) 297–306.
[24] N. Islam, W. Huang, H.L. Zhuang, Machine learning for phase selection in multi-principal element alloys, Comput. Mater. Sci. 150 (2018) 230–235.
[25] Y. Li, W.L. Guo, Machine-learning model for predicting phase formations of high-entropy alloys, Phys. Rev. Mater. 3 (9) (2019).
[26] W.J. Huang, P. Martin, H.L.L. Zhuang, Machine-learning phase prediction of high-entropy alloys, Acta Mater. 169 (2019) 225–236.
[27] C. Wen, Y. Zhang, C. Wang, D. Xue, Y. Bai, S. Antonov, L. Dai, T. Lookman, Y. Su, Machine learning assisted design of high entropy alloys with desired property, Acta Mater. 170 (2019) 109–117.
[28] Z. Zhou, Y. Zhou, Q. He, Z. Ding, F. Li, Y. Yang, Machine learning guided appraisal and exploration of phase design for high entropy alloys, Npj Comput. Mater. 5 (1) (2019) 128.
[29] N. Qu, Y. Chen, Z. Lai, Y. Liu, J. Zhu, The phase selection via machine learning in high entropy alloys, Procedia Manuf. 37 (2019) 299–305.
[30] Q. Wu, Z. Wang, X. Hu, T. Zheng, Z. Yang, F. He, J. Li, J. Wang, Uncovering the eutectics design by machine learning in the Al–Co–Cr–Fe–Ni high entropy system, Acta Mater. 182 (2020) 278–286.
[31] Z. Pei, J. Yin, J.A. Hawk, D.E. Alman, M.C. Gao, Machine-learning informed prediction of high-entropy solid solution formation: beyond the Hume-Rothery rules, Npj Comput. Mater. 6 (1) (2020) 50.
[32] Y. Zhang, S. Guo, C.T. Liu, X. Yang, Phase formation rules, in: M.C. Gao, J.-W. Yeh, P.K. Liaw, Y. Zhang (Eds.), High-Entropy Alloys. Fundamentals and Applications, Springer International Publishing, Switzerland, 2016.
[33] J. Schmidt, M.R.G. Marques, S. Botti, M.A.L. Marques, Recent advances and applications of machine learning in solid-state materials science, Npj Comput. Mater. 5 (1) (2019) 83.
[34] E.P. George, D. Raabe, R.O. Ritchie, High-entropy alloys, Nat. Rev. Mater. 4 (8) (2019) 515–534.
[35] W. Kohn, L.J. Sham, Self-consistent equations including exchange and correlation effects, Phys. Rev. 140 (4A) (1965) A1133–A1138.
[36] M.C. Gao, D. Alman, Searching for next single-phase high-entropy alloy compositions, Entropy 15 (2013) 4504–4519.
[37] Y. Ikeda, B. Grabowski, F. Körmann, Ab initio phase stabilities and mechanical properties of multicomponent alloys: a comprehensive review for high entropy alloys and compositionally complex alloys, Mater. Charact. 147 (2019) 464–511.
[38] C. Zhang, F. Zhang, H.Y. Diao, M.C. Gao, Z. Tang, J.D. Poplawsky, P.K. Liaw, Understanding phase stability of Al-Co-Cr-Fe-Ni high entropy alloys, Mater. Des. 109 (2016) 425–433.
[39] J.O. Andersson, T. Helander, L. Höglund, P. Shi, B. Sundman, Thermo-Calc & DICTRA, computational tools for materials science, Calphad 26 (2) (2002) 273–312.
[40] A. Raturi, J. Aditya, N.P. Gurao, K. Biswas, ICME approach to explore equiatomic and non-equiatomic single phase BCC refractory high entropy alloys, J. Alloys Compd. 806 (2019) 587–595.
[41] D.C. Ma, M.J. Yao, K.G. Pradeep, C.C. Tasan, H. Springer, D. Raabe, Phase stability of non-equiatomic CoCrFeMnNi high entropy alloys, Acta Mater. 98 (2015) 288–296.
[42] C. Ng, S. Guo, J.H. Luan, S.Q. Shi, C.T. Liu, Entropy-driven phase stability and slow diffusion kinetics in an Al0.5CoCrCuFeNi high entropy alloy, Intermetallics 31 (2012) 165–172.
[43] J.I. Lee, K. Tsuchiya, W. Tasaki, H.S. Oh, T. Sawaguchi, H. Murakami, T. Hiroto, Y. Matsushita, E.S. Park, A strategy of designing high-entropy alloys with high-temperature shape memory effect, Sci. Rep. 9 (1) (2019) 13140.
[44] O.N. Senkov, J.D. Miller, D.B. Miracle, C. Woodward, Accelerated exploration of multi-principal element alloys for structural applications, Calphad-Comput. Coupling Ph. Diagrams Thermochem. 50 (2015) 32–48.
[45] A. Abu-Odeh, E. Galvan, T. Kirk, H. Mao, Q. Chen, P. Mason, R. Malak, R. Arróyave, Efficient exploration of the high entropy alloy composition-phase space, Acta Mater. 152 (2018) 41–57.
[46] F. Tancret, I. Toda-Caraballo, E. Menou, P. Diaz-Del-Castillo, Designing high entropy alloys employing thermodynamics and Gaussian process statistical analysis, Mater. Des. 115 (2017) 486–497.
[47] Thermo-Calc Software AB, TCHEA3: TCS High Entropy Alloy Database, Available at: http://www.thermocalc.com/media/35873/tchea3_extended_info.pdf.
[48] T. Chen, C. Guestrin, XGBoost: A Scalable Tree Boosting System, 2016.
[49] L. Ward, A. Agrawal, A. Choudhary, C. Wolverton, A general-purpose machine learning framework for predicting properties of inorganic materials, Npj Comput. Mater. 2 (2016) 7.
[50] C. Kittel, Introduction to Solid State Physics, 8th ed. Wiley, 2004.
[51] Wikipedia, Electronegativity, https://en.wikipedia.org/wiki/Electronegativity#Allen_electronegativity.
[52] P. Villars, F. Hulliger, Structural stability domains for single-coordination intermetallic phases, J. Less Common Met. 132 (2) (1987) 289–315.
[53] A. Takeuchi, A. Inoue, Mixing enthalpy of liquid phase calculated by miedema’s scheme and approximated with sub-regular solution model for assessing forming ability of amorphous and glassy alloys, Intermetallics 18 (9) (2010) 1779–1789.
[54] F. He, W. Zhijun, C. Ai, J. Li, J. Wang, J.J. Kai, Grouping strategy in eutectic multi-principal-component alloys, Mater. Chem. Phys. 221 (2018).
[55] B. Krawczyk, Learning from imbalanced data: open challenges and future directions, Progr. Artif. Intell. 5 (4) (2016) 221–232.
[56] sklearn.utils.class_weight.compute_class_weight, https://scikit-learn.org/stable/modules/generated/sklearn.utils.class_weight.compute_class_weight.html, accessed March 20, 2020.
[57] T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning, Springer series in statistics, New York, 2001.
[58] S. Guo, Phase selection rules for cast high entropy alloys: an overview, Mater. Sci. Technol. 31 (10) (2015) 1223–1230.
[59] R. Feng, M.C. Gao, C. Lee, M. Mathes, T.T. Zuo, S.Y. Chen, J.A. Hawk, Y. Zhang, P.K. Liaw, Design of light-weight high-entropy alloys, Entropy 18 (9) (2016) 21.
[60] G. Arzpeyma, A.E. Gheribi, M. Medraj, On the prediction of Gibbs free energy of mixing of binary liquid alloys, J. Chem. Thermodyn. 57 (2013) 82–91.
[61] A.R. Miedema, F.R. de Boer, R. Boom, Model predictions for the enthalpy of formation of transition metal alloys, Calphad 1 (4) (1977) 341–359.
[62] U. Mizutani, Hume-Rothery Rules for Structurally Complex Alloy Phases, CRC Press Taylor & Francis Group, Boca Raton, FL 33487-2742, 2011.
[63] Z. Guo, Pretty Print A Confusion Matrix With Seaborn, GitHub, 2018, https://gist.github.com/shaypal5/94c53d765083101efc0240d776a23823.
[64] R. Li, L. Xie, W.Y. Wang, P.K. Liaw, Y. Zhang, High-throughput calculations for high-entropy alloys: a brief review, Front. Mater. 10 (2020) 3389.
[65] G.-Y. Ke, G.-Y. Chen, T. Hsu, J.-W. Yeh, FCC and BCC equivalents in as-cast solid solutions of AlxCoyCrzCu0.5FevNiw high-entropy alloys, Eur. J. Control. 31 (2006) 669–684.
[66] B. Ren, Z.X. Liu, D.M. Li, L. Shi, B. Cai, M.X. Wang, Effect of elemental interaction on microstructure of CuCrFeNiMn high entropy alloy system, J. Alloys Compd. 493 (1) (2010) 148–153.
[67] Y. Zhang, T.T. Zuo, Z. Tang, M.C. Gao, K.A. Dahmen, P.K. Liaw, Z.P. Lu, Microstructures and properties of high-entropy alloys, Prog. Mater. Sci. 61 (2014) 1–93.
[68] N. Liu, C. Chen, I. Chang, P.J. Zhou, X.J. Wang, Compositional dependence of phase selection in CoCrCu0.1FeMoNi-based high-entropy alloys, Materials 11 (8) (2018) 11.
[69] F. Otto, Y. Yang, H. Bei, E.P. George, Relative effects of enthalpy and entropy on the phase stability of equiatomic high-entropy alloys, Acta Mater. 61 (7) (2013) 2628–2638.
[70] B. Cantor, Multicomponent and high entropy alloys, Entropy 16 (9) (2014) 4749–4768.