ARTICLE INFO

Article history:
Received 15 December 2020
Received in revised form 14 January 2021
Accepted 25 January 2021
Available online 29 January 2021

Keywords:
Machine learning
High entropy alloy
CALPHAD
Solid solution
Phase selection rules

ABSTRACT

We reveal high-fidelity new phase selection rules for high entropy alloys (HEAs) by combining CALPHAD calculations and the machine learning (ML) method. Employing Thermo-Calc and TCHEA3 database, we first generate more than 300,000 equilibrium phase data from 20 quinary families formed by the 8 elements of Al, Co, Cr, Cu, Fe, Mn, Ni, and Ti, and choose initially 15 materials/physical descriptors. The eXtreme Gradient Boosting (XGBoost) method is then used to identify 5 most important descriptors that best delineate the single and mixed phases in the complex temperature-composition space of HEAs. The ML model trained by the 5 features is validated by 155 annealing experimental data points from 15 publications and then used to predict 213 new single-phase alloys with BCC and FCC structures of the alloy families of AlCrNiFeMn and AlCrCoNiFeTi. We also highlight the importance of equilibrium temperature and offer in-depth insights into the paradigm of composition-feature-phase of HEAs. On the basis of the 5 important features, we establish new phase selection rules for single FCC and BCC phases with a success rate above 90%, significantly outperforming all existing phase selection rules and providing a powerful tool for mapping single-phase in the complex temperature-composition space of HEAs.
© 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://
creativecommons.org/licenses/by-nc-nd/4.0/).
⁎ Corresponding authors.
E-mail addresses: baikw@ihpc.a-star.edu.sg (K. Bai), zhangyw@ihpc.a-sta.edu.sg (Y.-W. Zhang).
1 These authors contributed equally to this work.
https://doi.org/10.1016/j.matdes.2021.109532
0264-1275/© 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Y. Zeng, M. Man, K. Bai et al. Materials and Design 202 (2021) 109532
2.1. Data generation by CALPHAD method

Eight elements, Al, Co, Cr, Cu, Fe, Mn, Ni, and Ti, are selected to produce light-weight quinary HEA alloys. Furthermore, since Al must be included in each alloy, the number of Al-containing quinary alloy families is 35 (= 7C4). The equilibrium phases of the quinary alloys are computed by Thermo-Calc Software with TCHEA3 database.

The credibility of TCHEA3 can be determined by two parameters [44], namely, the fractions of assessed end-binary (FAB) and end-ternary systems (FAT), respectively (see Section S1 in Supplemental Information for the data assessment). The analysis by Senkov et al. [44] found a good agreement between calculations and experiments with FAB > 0.5, an important criterion that can be used to assess the validity of CALPHAD predictions. While all the end-binary systems have been fully assessed in TCHEA3 for all the 35 quinary families in our study, i.e., FAB = 1, we conservatively select the quinary series with FAT ≥ 0.6 to ensure even higher reliability of the CALPHAD results. Consequently, only the first 20 quinary families in Table S1 are selected in our study.

In the present study, we use Thermo-Calc to generate phase equilibrium data that cover the full composition space of quinary alloy systems with a compositional step of δx = 0.1 (mole fraction). As a result, the number of data points for a quinary system is 126. The temperature range is set to 300 K to 1700 K with a step of 10 K, which results in 141 temperatures. Thus, the total number of instances is 355,320 (= 126 × 141 × 20) for all the 20 quinary systems. The proportions of all the phases of each instance (data point) are calculated by the single point equilibrium calculation. To better study the phase selection (formation) rules, we label all the data points by the 3-class scheme defined as below:

Class-1: disordered single FCC phase with mole fraction >0.999
Class-2: disordered single BCC phase with mole fraction >0.999
Class-3: the other phases

It is noted that the above definitions of single-phase FCC and BCC structures are stricter than those defined in [45], where the cut-off point for a single solid solution phase is 0.99. In the present study, the resultant numbers of the 3 classes of the 20 quinary series (after removing liquid-containing entries) are 1004, 992 and 257,215, respectively. Clearly, the number of instances in Class-3 significantly outnumbers those of the other two classes. This imbalance in the dataset may cause difficulties in ML training. To alleviate the imbalance, we generate two further groups of data by Thermo-Calc calculations. One of them is obtained by calculating the phase equilibrium for the full composition space of the Al-Co-Cr-Fe-Ni series with δx = 0.05, from which only the instances of Class-1 and Class-2 are selected. The other group is generated by conducting a 'stepping' calculation in Thermo-Calc with the initial compositions of the instances of Class-1 and Class-2 and fixing the FCC and BCC phase, respectively. All the above instances are compiled and referred to as the ML training dataset, in which the total numbers of instances of Class-1, Class-2, and Class-3 are 41,329, 8804, and 257,215, respectively.

2.2. Descriptor selection

A key ingredient of a machine learning algorithm is to select physically meaningful descriptors, and generally, the best choice is highly dependent on the target under study [33]. To begin with, we choose a few typical physicochemical/thermodynamic parameters that were previously proposed to identify phase formation [21]. Overall, a total of 15 descriptors, including the equilibrium temperature (Teq), are chosen as the initial features for ML training. Among these descriptors, five are formed by the average values of the five fundamental elemental properties, namely, atomic radius (R), electronegativity (χ), atomic number (AN), melting temperature (Tm), and valence electron count (VEC, denoted VE) of the constituent elements. The definitions of the five descriptors are expressed in Eq. (1), where P denotes one of the properties and x_i the mole fraction of the ith element.

$$P_{\mathrm{avg}} = \sum_{i=1}^{N} x_i P_i \tag{1}$$

It is noted that many material properties cannot be described simply by the compositional averages of their constituent properties, and the differences in these properties are also important [49]. Therefore, the variances of the five properties of the constituent elements are also taken into account. Specifically, we take the atomic radius difference (ΔR) defined in Eq. (2), which is the same as that used in [15,32]. The other four, namely, the electronegativity difference Δχ, the atomic number difference (Δ(AN)), the melting temperature difference (ΔTm) and the valence electron difference (Δ(VE)), are defined in Eq. (3).

$$\Delta R = \sqrt{\sum_{i=1}^{n} x_i \left(1 - R_i/\bar{R}\right)^2} \tag{2}$$

where $\bar{R} = \sum_{i=1}^{n} x_i R_i$.

$$\Delta P = \sqrt{\sum_{i=1}^{n} x_i \left(P_i - \bar{P}\right)^2} \tag{3}$$

where $\bar{P} = \sum_{i=1}^{n} x_i P_i$.

It is noted that the units and data sources of the above-mentioned elemental properties are the following: melting temperature in Kelvin, atomic radii in Å (Killet scale) [50], electronegativity (Allen scale) [51] and valence electron (Villars scale) [52].

In addition, the other four quantities, i.e., ΔHmix [15,53], the deviation in enthalpy σH [54], ΔSmix [2,15,32], and the parameter Ω, defined as the entropy of mixing divided by the enthalpy of mixing and multiplied by the average melting temperature of the elements [17,32], are also taken as descriptors as defined in Eqs. (4–7), respectively.

$$\Delta H_{\mathrm{mix}} = \sum_{\substack{i=1 \\ i \neq j}}^{{}_{n}C_{2}} \Delta H^{\mathrm{mix}}_{x_i,x_j} \tag{4}$$

where $\Delta H^{\mathrm{mix}}_{x_i,x_j} = 4 c_i c_j \sum_{k=0}^{3} W_k \left(x_{i,\mathrm{nor}} - x_{j,\mathrm{nor}}\right)^k$, $x_{i,\mathrm{nor}} = \frac{x_i}{x_i + x_j}$, $x_{j,\mathrm{nor}} = \frac{x_j}{x_i + x_j}$, and the $W_k$ values can be found in Table 2 of Ref. [53].

$$\sigma_H = \sqrt{\frac{1}{{}_{n}C_{2}} \sum_{\substack{i=1 \\ i \neq j}}^{{}_{n}C_{2}} \left(\Delta H^{\mathrm{mix}}_{x_i,x_j} - \overline{\Delta H}\right)^2} \tag{5}$$

where $\overline{\Delta H} = \Delta H_{\mathrm{mix}} / {}_{n}C_{2}$.

$$\Delta S_{\mathrm{mix}} = -R \sum_{i=1}^{N} x_i \ln x_i \tag{6}$$

$$\Omega = \frac{\bar{T}_m \, \Delta S_{\mathrm{mix}}}{\left|\Delta H_{\mathrm{mix}}\right|} \tag{7}$$

2.3. Machine learning methodology

2.3.1. Addressing data imbalance

We split the data set of 307,348 samples into a training set (90%) and a test set (10%). Of the 276,613 samples in the training set, 37,188 are Class 1, 7910 are Class 2, and the remaining 231,515 are Class 3. It is clear that a severe imbalance remains inherent in the data set despite the effort in balancing these data as described in Section 2.1. The relative under-representation of Classes 1 and 2 can be considered a reflection of the scarcity of single FCC and BCC phases in reality. This imbalance can
cause a high accuracy in majority classes but low accuracy for minority classes. Typically, two types of strategies [55] can be used to address this data imbalance issue: (i) re-balancing the dataset through under-sampling majority classes or over-sampling minority classes, and (ii) cost-sensitive scoring, where sample weights inversely proportional to class sizes are taken into consideration when evaluating the performance metric and objective functions. It is known that rebalancing the dataset by oversampling the minority classes would result in a huge training set consisting of 231,515 × 3 = 694,545 samples, adding to the computational cost. On the other hand, under-sampling the majority class would result in the loss of data. Such considerations have led us to implement cost-sensitive scoring.

For classification problems, the objective or cost function is usually the cross entropy loss, also known as log loss, defined as $-\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K} \rho_k^{(i)} \ln \hat{\rho}_k^{(i)}$, where N is the total number of samples, K the number of classes, $\rho_k^{(i)}$ the target probability that the ith sample or instance belongs to class k, which takes the value of either 0 or 1, and $\hat{\rho}_k^{(i)}$ the predicted probability that the ith sample or instance belongs to class k. In our model-building process, we use the weighted cross entropy loss $-\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K} w^{(i)} \rho_k^{(i)} \ln \hat{\rho}_k^{(i)}$, where the ith sample is associated with a weight $w^{(i)}$ that is inversely proportional to the size of the class that the sample belongs to. The sample weight $w^{(i)}$ is computed using the Scikit-Learn library [56], according to the formula $w^{(i)} = \frac{N}{N^{(i)} N_{\mathrm{classes}}}$, where N is the total number of samples in the training set, $N^{(i)}$ is the number of training-set samples in the class of the ith sample, and $N_{\mathrm{classes}}$ is the number of classes. As a result, the samples of Class 1, 2, and 3 are assigned weights of 2.48, 11.66 and 0.40, respectively. During training, such a weighted loss function penalizes errors made on the minority classes more than those on the majority class, forcing the model to achieve the right balance between the classes.

2.3.2. Choice of ML model - XGBoost classifier

It is well-known that XGBoost regularly outperforms classical ML models, such as logistic regression, k-nearest neighbours (KNN), support vector machines (SVM), as well as neural networks [48]. Boosting algorithms proceed iteratively by combining weak predictors to form a strong predictor, with each additional predictor correcting the residual error of the previous one. A sequence of classifiers can thus be built, enabling progressively more accurate training. In addition, the XGBoost implementation of gradient boosting possesses the advantages of scalability, the ability to handle large datasets with ease, as well as support for parallel and distributed computing. Importantly, XGBoost is also compatible with cost-sensitive scoring and allows implementation of weighted cross entropy loss for classification problems. For this specific problem, we carried out a comparison between various machine learning models, and XGBoost was found to deliver the best generalization performance. The comparison of their accuracies can be found in Section S2 of the Supplemental Information.

The base predictors of XGBoost are classification and regression trees (CART), which provide us with a natural and convenient mechanism for measuring the importance of each feature [48,57]. Based on such feature importance scores, feature selection can be carried out, enabling the reduction of model complexity while at the same time improving interpretability and offering insights into the physical relevance of each descriptor in the phase selection. XGBoost is therefore chosen in the current work for building our machine learning model due to its support for cost-sensitive scoring and ability to generate feature importance scores, besides being more accurate than other machine learning methods.

3. Results and discussions

3.1. Results of machine learning

3.1.1. Feature selection

Feature selection is another essential step for establishing a robust machine learning model. We first calculate Pearson correlation coefficients between every pair of features. As can be seen from Fig. S1, some features are highly correlated, which indicates that there is room for feature selection, i.e., selecting a subset consisting of the most significant features.

To rank the importance of the individual features in terms of their contributions towards the classification objective, we train an XGBoost model by using all 15 features of the data set. The resulting importance scores are shown in Fig. 1(a). Such feature importance scores depend on the training data set; it is quite possible that with a different training set, the importance scores can be substantially different even if the problem is still classifying the same three classes. To give an estimate of the extent of variation, we trained 256 XGBoost models, each on a randomly undersampled 90% of the training data set. In Fig. 1(a), the width of each blue stripe corresponds to the mean of the importance scores for a feature generated across the 256 models, and the error bar to the standard deviation. The standard deviation allows one to gauge the likelihood that the feature importance ranking can be perturbed. We observe that the standard deviation is low for most features, especially those with low importance scores. Subsequently, we carry out feature selection by eliminating features one by one in the order of increasing importance, starting with the least important feature, σH. At each stage of the process, we tune the hyperparameters of the reduced model (see Section S2 in Supplementary Information) using 5-fold cross-validation grid search and record the weighted cross-entropy loss measured on the training set. Fig. 1(b) shows the weighted cross-entropy loss as a function of the number of input features, which indicates that five input features are necessary since further reduction leads to a significant increase in the cross-entropy. The five most important features identified are Teq, R, Δ(VE), VE and Δχ.

It is noted that some of the five most important features identified from the present work are consistent with previous studies [17–19], where the radius, electronegativity, and valence electron were often chosen based on the Hume-Rothery rule. In contrast to all previous parametric studies in which the equilibrium temperature was not taken into account, our work highlights the importance of the equilibrium temperature in the phase selection of HEAs. In the following, we discuss in greater detail the possible reasons why the five important descriptors, namely, Teq, R, Δ(VE), VE and Δχ, are chosen in our phase selection rules of the multicomponent HEAs. Since three other descriptors, namely ΔSmix, ΔHmix, and Ω, are not chosen in our phase selection rules, we also discuss the possible reasons and their roles in HEA thermodynamics.

The fundamental principle of phase stability is attributed to the Gibbs energy, which can be written as $\Delta G = \Delta H_{\mathrm{mix}} - T(\Delta S_{\mathrm{mix}} + \Delta S^{\mathrm{ex}}_{\mathrm{mix}})$, where T is the equilibrium temperature (Teq), ΔSmix is the configuration entropy as defined by Eq. (6) in the descriptor selection section and $\Delta S^{\mathrm{ex}}_{\mathrm{mix}}$ is the excess entropy. In principle, all the quantities in the Gibbs energy should be important. Indeed, our study shows that the equilibrium temperature is an important factor. As pointed out by Guo [58] and George et al. [34], however, ΔSmix is not an effective factor for HEA phase selection, which is consistent with our result. Although $\Delta S^{\mathrm{ex}}_{\mathrm{mix}}$ could be an important factor [34], it is commonly not considered, perhaps due to the difficulties in calculating it. In the present study, ΔHmix is calculated by a commonly used scheme [53], which is based on the thermodynamics of liquid phases. A recent study [59] has shown that the scheme is inadequate to describe the complex thermodynamics of solid HEAs, which may cause the exclusion of ΔHmix in our phase
selection rules. Due to the same reasons, Ω, which is defined as $\Omega = \bar{T}_m \Delta S_{\mathrm{mix}} / \left|\Delta H_{\mathrm{mix}}\right|$, is also not included in our phase selection rules.

Fig. 1. (a) The importance scores for all 15 features. (b) Weighted cross-entropy loss on training set as a function of the number of input features. Features are eliminated one by one, starting with the least important one, σH, in the order of increasing importance as indicated in (a).

The significance of the other 4 chosen descriptors R, Δ(VE), VE and Δχ can be explained by their relationships with the mixing enthalpy ΔHmix of HEAs. It is instructive to use a binary AB alloy to illustrate these relationships. According to the well-known Miedema model [60,61], ΔHmix can be written as Eq. (8).

$$\Delta H_{\mathrm{mix}} = \frac{2 f(c) \left(x_A V_A^{2/3} + x_B V_B^{2/3}\right)}{\left(n_{ws}^{A}\right)^{-1/3} + \left(n_{ws}^{B}\right)^{-1/3}} \left[ -P (\Delta\chi)^2 + Q \left(\Delta n_{ws}^{1/3}\right)^2 - R \right] \tag{8}$$

where x_A(B) and V_A(B) are the mole fraction of element A(B) and the volume of element A(B) in the binary alloy, respectively, f(c) represents the concentration dependence of ΔHmix, n_ws denotes the electron density at the boundary of the Wigner–Seitz cell, R denotes the hybridization parameter for the alloying of a transition metal with a non-transition metal, and P and Q are constants. Clearly, both Δn_ws and R are associated with the redistribution of the valence electron density that contributes to the metallic bonding. Thus these two parameters are closely related to the two chosen descriptors Δ(VE) and VE. The thermodynamic significance of valence electron concentration (VEC) and its crucial roles in phase stability as well as numerous physical properties of alloys have been discussed in [23,62], and the importance of VEC in phase stability has also been highlighted in [19,58]. The Miedema model also suggests that the other two descriptors, namely, the difference in electronegativity (Δχ) and the average radius R (which is related to the element volume), should also be taken into consideration. It is noted that the importance of atomic radius and electronegativity differences was also revealed in [23].

3.1.2. Testing on the final model based on the five features

The final ML model is trained on the five most important features, Teq, R, Δ(VE), VE and Δχ. The performance on the training and test sets can be seen from the confusion matrices of predictions [63] as shown in Fig. 2(a) and (b), respectively. In each confusion matrix, the row-wise accuracies are shown in the top three cells of the right-most column, corresponding to the recalls for Classes 1, 2 and 3, respectively. Similarly, the column-wise accuracies shown in the first three cells of the bottom row are the precisions for the three classes. The overall prediction accuracy is given in the bottom right corner. On the training set, this accuracy is 99.95%, while on the test set, it is 99.92%. The precisions are all above 99% for all the classes. Such high accuracy indicates that the final model built on the five features generalizes well. In the subsequent sections, we will use this model to make predictions on experimental data gathered from literature and guide the design of single-phase FCC and BCC HEAs.

Fig. 2. Confusion matrices of the ML model trained with the five features for (a) the training data set, (b) test data set, and (c) the experimental data set from 15 sources.

Table 1
Comparison of the ranges of the five features of the training dataset and experimental dataset.
Columns: Descriptors; Phase (FCC, BCC, Other Phases); Range (Min, Max).

Fig. 3. Parallel coordinate plots of FCC (a), BCC (b) and multi-phases (c) from the training and experimental datasets. The units of the features are given in the parentheses as below: Teq (K), R (Å), VE (Villars scale), Δχ (Allen scale), Δ(VE) (Villars scale).
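As a rough illustration of the class weighting in Section 2.3.1 and of how the row-wise and column-wise accuracies of a confusion matrix are read in Section 3.1.2, the following Python sketch recomputes the three class weights from the quoted training-set counts and derives recall, precision and overall accuracy from a confusion matrix. It is not the authors' code: the function names are hypothetical, the weights follow the formula w = N / (N_classes × N_class) stated in the text, and the 3 × 3 matrix is made-up toy data rather than the values of Fig. 2.

```python
# Training-set class sizes quoted in Section 2.3.1.
CLASS_COUNTS = {1: 37188, 2: 7910, 3: 231515}


def balanced_class_weights(counts):
    """Return w = N / (N_classes * N_class) for every class label."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


def recall_precision_accuracy(matrix):
    """Per-class recall and precision plus overall accuracy.

    matrix[i][j] counts the samples of true class i predicted as class j,
    so recall is row-wise and precision is column-wise.
    """
    size = len(matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    recall = [matrix[k][k] / row_sums[k] if row_sums[k] else 0.0
              for k in range(size)]
    precision = [matrix[k][k] / col_sums[k] if col_sums[k] else 0.0
                 for k in range(size)]
    accuracy = sum(matrix[k][k] for k in range(size)) / sum(row_sums)
    return recall, precision, accuracy


weights = balanced_class_weights(CLASS_COUNTS)
print({label: round(w, 2) for label, w in weights.items()})

# Hypothetical confusion matrix (assumption: NOT the data of Fig. 2).
toy_matrix = [[95, 3, 2], [4, 90, 6], [1, 2, 997]]
recall, precision, accuracy = recall_precision_accuracy(toy_matrix)
print(recall, precision, accuracy)
```

Rounded to two decimals, the computed weights agree with the 2.48, 11.66 and 0.40 quoted for Classes 1, 2 and 3.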
Table 2
The instance counts of single-phase (FCC and BCC) and the success rates when satisfying the conditions.
Columns: Feature; FCC Condition, [5th, 95th] percentiles; Total counts of instances satisfying the condition; Counts of FCC instances satisfying the condition; 100 × (FCC counts) / (Total counts).

work. It was reported that the solid solution phase stabilizes in the range of Ω ≥ 1.1 and δ ≤ 6.6% [17]. Our results show that for the [5th, 95th] percentiles, Ω is in the range of [1.9, 4.2] for both single-phase FCC and BCC, and δ is [2.9%, 4.2%] for FCC and [2.8%, 5%] for BCC, in good agreement with [17]. From the study of a quinary AlCoCrFeNi system by CALPHAD, it was found that the VEC for BCC is in the range of [5.7, 7.2], while for FCC, VEC > 8.4 [21]. Also, it was reported in [19] that an FCC solid solution is stable at higher VEC (≥8), while a BCC solid solution is stable at lower VEC (≤6.87). In contrast, the VEC ranges defined by the [5th, 95th] percentiles in our results obtained from 20 quinary families of 276,613 data points are [6.5, 7.6] and [7.8, 9.0] for BCC and FCC, respectively. Clearly, these previous studies support the current findings as these critical VEC values are within the ranges of our phase selection rules. This result implies that our new phase selection rules established based on a large number of equilibrium data calculated by CALPHAD can also be used to guide HEA design by as-cast experiments. This conclusion is in agreement with the ML study by Pei et al. [31], in which the CALPHAD verification reached 94% consistency with the prediction from an ML model built on 1252 as-cast data.

It is interesting to point out that our work here clarifies the mystery concerning the role of VEC in phase selection raised recently in a review article [64], which found that the role of VEC in the ML studies [24,26] was significantly different from the empirical study [19]. In the former, the VEC criterion was regarded as very important in the phase selection, whereas in the latter, it was stated that VEC could only distinguish FCC and BCC. Our study here shows that VEC is only one of the five most important descriptors in the phase selection rules and ranks as the third most important factor according to the importance scores shown in Fig. 1(a).

3.4. The element effect on the stability of single-phase solid solutions

In recent years, the concept of the stabilizer of BCC or FCC phases in the conventional alloys has been extended to HEA alloys and some inconsistent results were reported [20,65–67]. To clarify the inconsistency and further explore the element effect on the stability of single-phase solid solutions, we generate the PCPs of the 8 elements for each of the three classes in the same manner as the descriptors. The PCPs for Classes 1, 2, and 3 are shown in Fig. 4(a), (b), and (c), respectively. To compare our results with those published by others, we define a phase stabilizer as follows: the composition of an element is found at higher levels in one PCP (e.g., FCC) and at lower levels in the other PCP (e.g., BCC). The comparison is shown in Table 3. Clearly, Al and Cr are deemed as BCC stabilizers in all the studies. Although in our study Al appears in both BCC and FCC phases, Al is found at higher levels in BCC phases, thus it is regarded as a BCC stabilizer in Table 3. Similarly, Co and Ni are found to be the FCC stabilizers in all the studies. The PCPs of both FCC and BCC show that the Fe composition varies from low to high levels, indicating that Fe is neither an FCC nor a BCC stabilizer. This observation is in agreement with [65], where it is argued that Fe is neutral in stabilizing the solid-solution phases. However, there are inconsistent arguments on the role of Mn as a stabilizer. While Mn was believed to be an FCC stabilizer in [66,67], it was regarded as a BCC stabilizer in [20]. The PCPs show that Mn compositions span from low to high levels in the BCC phase while being at lower levels only in the FCC phase. Thus our result supports the argument of [20]. From the PCPs, we can conclude that both Cu and Ti are neither FCC stabilizers nor BCC stabilizers, although low levels of Cu can exist in FCC phases. Our results suggest that the concept of element stabilizer in the conventional alloys should be used cautiously in the element selection for HEA design. Instead, the new phase selection rules established in our study should be used to choose the phase stabilizing elements.

3.5. Relationship of element compositions and descriptors

In the following, we analyse how each of the 8 composition elements is correlated with each of the 5 features by the correlation map, i.e., the Pearson matrix. Fig. 5 reveals the correlations in the training dataset. It can be seen that for the average radii R, Ti has the largest impact, followed by Al. In comparison, for the average valence electron VE, the elements with significant impact in descending order are Al, Ti and Ni. The largest correlation with Δ(VE) is Al, whereas with Δχ it is Ti. The correlations of all the elements with temperature are low and thus negligible.

Of particular interest is that the Pearson correlation coefficients shown in Fig. 5 may offer additional insights into the role of different elements in the phase stabilization in HEAs via the element effects on VE. As shown in the third column of Fig. 5, large negative coefficients of both Al and Ti with the VE are observed, suggesting that Al and Ti have greater effects on decreasing VE and thus may strongly stabilize BCC phases. Furthermore, Cr has a relatively small negative coefficient, and thus Cr is a weak BCC stabilizer. On the other hand, moderate positive coefficients are found in Co, Ni and Cu, and thus these three elements have moderate effects on the increase of VE, which leads to stabilizing FCC phases. Finally, a small positive coefficient of Fe and a small negative coefficient of Mn are observed, which implies that Fe is a weak FCC stabilizer and Mn a weak BCC stabilizer. Despite these findings, it should be emphasized that VEC is only one of the five important factors in the
correlation map. The strong correlations between elements and other descriptors, such as Al and Ti with R and Δχ, may outweigh VEC factors in the phase selection of HEAs. This conclusion is supported by a recent study on the effect of different elements on phase formation in the alloys [68] where the authors found that while FCC structure is stable when the radius difference ΔR ≤ 2.8 and VE ≥ 8.27, the intermetallic phases are favoured when Δχ > 0.133. In summary, the phase selection is determined by the overall effects of the five factors identified in the present work which offers a framework in the rationalization of different empirical physical parameters in the phase selection in the high-entropy alloys.

Fig. 4. Parallel coordinate plots of compositions of the 8 elements in the training dataset for (a) Class 1, (b) Class 2 and (c) Class 3.

We further analyse the impact of the element compositions on the magnitudes of the five features by the sensitivity analysis, namely, the partial derivatives of each feature with respect to each element
composition. Of the five features identified in this work, namely, Teq, R, Δ(VE), VE, Δχ, two of them (R and VE) are obtained by Eq. (1) and the other two (Δχ and Δ(VE)) from Eq. (3). Thus the partial derivatives of R and VE are calculated by Eq. (9), and Δχ and Δ(VE) by Eq. (10).

$$\frac{\partial \bar{P}}{\partial x_i} = P_i \tag{9}$$

$$\frac{\partial \Delta P}{\partial x_i} = \frac{\left(P_i - \bar{P}\right)^2 - 2 P_i \sum_{j=1}^{N} x_j \left(P_j - \bar{P}\right)}{2 \sqrt{\sum_{j=1}^{N} x_j \left(P_j - \bar{P}\right)^2}} \tag{10}$$

where x_i and P_i are the mole fraction and the element property of the ith element, respectively.

The sensitivity of an element composition to R and VE is readily available from Eq. (9), namely, Ti > Al > Cr = Cu > Fe > Mn > Co = Ni for R, and Cu < Ni < Co < Fe < Mn < Cr < Ti < Al for VE. As can be seen, the elements with a higher sensitivity to both R and VE are in good agreement with those of higher correlation in the Pearson matrix. Since the sensitivity analysis of Δχ and Δ(VE) expressed by Eq. (10) is too complicated, we employ a numerical method to calculate the partial derivatives of the 8 elements in pseudo 8-atom alloys. By setting the composition spacing to 0.02 mole fraction, the resultant number of instances (data points) is 85,900,584. Among them, the highest partial derivative of Δχ with respect to Ti mole fraction accounts for 95.4%, followed by that of Ni (4.6%). These results agree well with the Pearson matrix, where Ti has the highest correlation with Δχ. Similarly, the highest partial derivative of Δ(VE) with respect to Al is found to be ranked at the top, i.e., 63.4%, followed by Cu, 37.5%. These results are also consistent with the Pearson matrix, which shows that Al and Cu have highest correlations with Δ(VE).

Table 3
Comparison of the phase stabilizers reported in the literature and this work.

Element   FCC                   BCC
Al        -                     [20,65], this work
Cr        -                     [20,65], this work
Co        [20,65], this work    -
Ni        [20,65], this work    -
Fe        -                     -
Cu        -                     -
Mn        [66,67]               [20], this work
Ti        -                     -

Fig. 5. The correlation map between 8 elements and 5 important features. The value in the grid shows the correlation coefficient between the corresponding element and descriptor. The colour intensity is proportional to the magnitude of the correlation coefficients.

3.7. Criteria for the formation of duplex phases

While the focus of the present work is the formation of single-phase HEAs, we extend the methodology to establish the criteria for the formation of duplex phases. The duplex phases are defined as the mole fraction of the matrix (disordered FCC or BCC phases) greater than 0.8 and the sum of the matrix and precipitate (ordered FCC or BCC) greater than 0.999. A total of 58 instances of FCC duplex phases and 667 instances of BCC duplex phases are found in the training dataset. In the same fashion as described in Section 3.3 where the criteria for the formation of single FCC and BCC phases are suggested, the values of the 5 features that simultaneously fall in the ranges leading to the duplex phases are listed in Table 4.

Table 4
Important feature ranges for producing FCC and BCC duplex phases.

Duplex phases   Range   Teq (K)   R (Å)   VE (Villars scale)   Δχ (Allen scale)   Δ(VE) (Villars scale)
FCC             min     1020      1.27    8.00                 0.08               2.17
FCC             max     1440      1.30    9.10                 0.16               2.62
BCC             min     1050      1.28    5.90                 0.07               1.66
BCC             max     1680      1.32    7.40                 0.10               2.30
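A minimal sketch of how the ranges in Table 4 might be applied as a screening filter is given below. It assumes that the five feature values of a candidate have already been computed, treats the tabulated min and max as inclusive bounds (an assumption, since the text does not state this), and uses hypothetical names (DUPLEX_RANGES, matching_duplex_phases) and made-up candidate values purely for illustration.

```python
# (min, max) ranges copied from Table 4; the keys are illustrative names.
DUPLEX_RANGES = {
    "FCC": {"Teq": (1020, 1440), "R": (1.27, 1.30), "VE": (8.00, 9.10),
            "d_chi": (0.08, 0.16), "d_VE": (2.17, 2.62)},
    "BCC": {"Teq": (1050, 1680), "R": (1.28, 1.32), "VE": (5.90, 7.40),
            "d_chi": (0.07, 0.10), "d_VE": (1.66, 2.30)},
}


def matching_duplex_phases(features, ranges=DUPLEX_RANGES):
    """Return the duplex types whose five ranges all contain the features.

    Bounds are treated as inclusive (assumption). All five conditions must
    hold simultaneously, mirroring the wording of the text.
    """
    matches = []
    for phase, limits in ranges.items():
        if all(low <= features[name] <= high
               for name, (low, high) in limits.items()):
            matches.append(phase)
    return matches


# Hypothetical candidates (assumption: not taken from the paper's data).
candidate_a = {"Teq": 1200, "R": 1.28, "VE": 8.5, "d_chi": 0.10, "d_VE": 2.3}
candidate_b = {"Teq": 1500, "R": 1.30, "VE": 6.5, "d_chi": 0.08, "d_VE": 2.0}
candidate_c = {"Teq": 900, "R": 1.28, "VE": 8.5, "d_chi": 0.10, "d_VE": 2.3}

result_a = matching_duplex_phases(candidate_a)
result_b = matching_duplex_phases(candidate_b)
result_c = matching_duplex_phases(candidate_c)
print(result_a, result_b, result_c)
```

Because the FCC and BCC ranges of VE in Table 4 do not overlap, a candidate can match at most one of the two duplex types under this filter.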
4. Conclusion

We develop an ML model using a large amount of phase equilibrium data, calculated by the CALPHAD method, in the well-defined and vast composition-temperature space of HEAs. The ML model is validated with 81% accuracy against 155 experimental data points from 15 different sources. On the basis of the ML results and the analysis of this large dataset, we establish new phase selection rules for single-phase FCC and BCC HEAs, which achieve success rates of 93% and 92%, respectively. The new rules require all of the following five conditions to be satisfied simultaneously: for single-phase FCC, 1080 < Teq (K) < 1660, 1.26 < R (Å) < 1.28, 7.80 < VE (Villars scale) < 8.96, 0.066 < Δχ (Allen scale) < 0.105, and 1.50 < Δ(VE) (Villars scale) < 2.26; and for single-phase BCC, 1330 < Teq (K) < 1690, 1.28 < R (Å) < 1.30, 6.45 < VE (Villars scale) < 7.55, 0.071 < Δχ (Allen scale) < 0.096, and 1.46 < Δ(VE) (Villars scale) < 2.10. We further demonstrate that some previously proposed phase selection rules are only a subset of the rules established in this work, which explains why they are only partially successful for HEA phase selection. The subsequent analysis of the relationship between the element compositions and the five important features reveals the relative sensitivities of the constituent elements in the formation of desirable phases. Finally, 213 new single-phase BCC and single-phase FCC high- and medium-entropy alloys are predicted, providing ample opportunity for experimental validation. The newly proposed phase selection rules, together with the relationships in the descriptor-composition-phase paradigm, provide powerful tools for the rational design of phase structures in HEAs.
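The five-condition rules stated in the Conclusion amount to a simultaneous range check, which can be illustrated as follows. This is a minimal sketch under stated assumptions rather than the authors' code. The feature keys (`Teq`, `R`, `VE`, `d_chi`, `d_VE`) and the function name are hypothetical, the bounds are copied from the Conclusion, and the example feature values are invented for illustration only.

```python
# Open intervals (lower, upper) for each feature, taken from the Conclusion.
# Key names are hypothetical labels chosen for this sketch.
SINGLE_PHASE_RULES = {
    "FCC": {
        "Teq": (1080, 1660),     # equilibrium temperature, K
        "R": (1.26, 1.28),       # mean atomic radius, Angstrom
        "VE": (7.80, 8.96),      # valence electrons, Villars scale
        "d_chi": (0.066, 0.105), # electronegativity spread, Allen scale
        "d_VE": (1.50, 2.26),    # valence-electron spread, Villars scale
    },
    "BCC": {
        "Teq": (1330, 1690),
        "R": (1.28, 1.30),
        "VE": (6.45, 7.55),
        "d_chi": (0.071, 0.096),
        "d_VE": (1.46, 2.10),
    },
}


def satisfied_rules(features):
    """Return the phases whose five conditions are all met simultaneously."""
    matches = []
    for phase, ranges in SINGLE_PHASE_RULES.items():
        if all(lo < features[name] < hi for name, (lo, hi) in ranges.items()):
            matches.append(phase)
    return matches


# Invented feature vector for illustration (not a real alloy from the paper).
example = {"Teq": 1400, "R": 1.27, "VE": 8.2, "d_chi": 0.08, "d_VE": 1.9}
print(satisfied_rules(example))
```

Strict inequalities are used because the rules are stated with "<". An empty list means that neither single-phase rule is satisfied, which does not by itself identify the phases that form.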
Authors contributions

Yingzhi Zeng: Data curation, Methodology, Validation, Visualization, Writing - review & editing. Mengren Man: Data curation, Methodology, Validation, Writing - review & editing. Kewu Bai: Conceptualization, Supervision, Methodology, Visualization, Writing - review & editing. Yong-Wei Zhang: Conceptualization, Funding acquisition, Methodology, Project administration, Supervision, Writing - review & editing.

Data availability

All data used in this manuscript are available from the authors on reasonable request.

Declaration of Competing Interest

None.

Acknowledgment

This work is supported by RIE2020 AME Programmatic Grant: AMDM (Grant No. A1898b0043) and by the A*STAR Computational Resource Centre and National Supercomputing Centre Singapore through the use of their high-performance computing facilities.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.matdes.2021.109532.
10
Y. Zeng, M. Man, K. Bai et al. Materials and Design 202 (2021) 109532
[38] C. Zhang, F. Zhang, H.Y. Diao, M.C. Gao, Z. Tang, J.D. Poplawsky, P.K. Liaw, Under- [54] F. He, W. Zhijun, C. Ai, J. Li, J. Wang, J.J. Kai, Grouping strategy in eutectic multi-
standing phase stability of Al-Co-Cr-Fe-Ni high entropy alloys, Mater. Des. 109 principal-component alloys, Mater. Chem. Phys. 221 (2018).
(2016) 425–433. [55] B. Krawczyk, Learning from imbalanced data: open challenges and future directions,
[39] J.O. Andersson, T. Helander, L. Höglund, P. Shi, B. Sundman, Thermo-Calc & DICTRA, Progr. Artif. Intell. 5 (4) (2016) 221–232.
computational tools for materials science, Calphad 26 (2) (2002) 273–312. [56] sklearn.utils.class_weight.compute_class_weight, https://scikit-learn.org/stable/
[40] A. Raturi, J. Aditya, N.P. Gurao, K. Biswas, ICME approach to explore equiatomic and modules/generated/sklearn.utils.class_weight.compute_class_weight.html,
non-equiatomic single phase BCC refractory high entropy alloys, J. Alloys Compd. accessed March 20, 2020, 2020.
806 (2019) 587–595. [57] T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning, Springer se-
[41] D.C. Ma, M.J. Yao, K.G. Pradeep, C.C. Tasan, H. Springer, D. Raabe, Phase stability of ries in statistics, New York, 2001.
non-equiatomic CoCrFeMnNi high entropy alloys, Acta Mater. 98 (2015) 288–296. [58] S. Guo, Phase selection rules for cast high entropy alloys: an overview, Mater. Sci.
[42] C. Ng, S. Guo, J.H. Luan, S.Q. Shi, C.T. Liu, Entropy-driven phase stability and slow dif- Technol. 31 (10) (2015) 1223–1230.
fusion kinetics in an Al0.5CoCrCuFeNi high entropy alloy, Intermetallics 31 (2012) [59] R. Feng, M.C. Gao, C. Lee, M. Mathes, T.T. Zuo, S.Y. Chen, J.A. Hawk, Y. Zhang, P.K.
165–172. Liaw, Design of light-weight high-entropy alloys, Entropy 18 (9) (2016) 21.
[43] J.I. Lee, K. Tsuchiya, W. Tasaki, H.S. Oh, T. Sawaguchi, H. Murakami, T. Hiroto, Y. [60] G. Arzpeyma, A.E. Gheribi, M. Medraj, On the prediction of Gibbs free energy of
Matsushita, E.S. Park, A strategy of designing high-entropy alloys with high- mixing of binary liquid alloys, J. Chem. Thermodyn. 57 (2013) 82–91.
temperature shape memory effect, Sci. Rep. 9 (1) (2019) 13140. [61] A.R. Miedema, F.R. de Boer, R. Boom, Model predictions for the enthalpy of forma-
[44] O.N. Senkov, J.D. Miller, D.B. Miracle, C. Woodward, Accelerated exploration of tion of transition metal alloys, Calphad 1 (4) (1977) 341–359.
multi-principal element alloys for structural applications, Calphad-Comput. Cou- [62] U. Mizutani, Hume-Rothery Rules for Structurally Complex Alloy Phases, 33487-
pling Ph. Diagrams Thermochem. 50 (2015) 32–48. 2742, CRC Press Taylor & Francis Group, Boca Raton, FL, 2011.
[45] A. Abu-Odeh, E. Galvan, T. Kirk, H. Mao, Q. Chen, P. Mason, R. Malak, R. Arróyave, Ef- [63] Z. Guo, Pretty Print A Confusion Matrix With Seaborn, GitHub, 2018https://gist.
ficient exploration of the high entropy alloy composition-phase space, Acta Mater. github.com/shaypal5/94c53d765083101efc0240d776a23823.
152 (2018) 41–57. [64] R. Li, L. Xie, W.Y. Wang, P.K. Liaw, Y. Zhang, High-throughput calculations for high-
[46] F. Tancret, I. Toda-Caraballo, E. Menou, P. Diaz-Del-Castillo, Designing high entropy entropy alloys: a brief review, Front. Mater. 10 (2020) 3389.
alloys employing thermodynamics and Gaussian process statistical analysis, Mater. [65] G.-Y. Ke, G.-Y. Chen, T. Hsu, J.-W. Yeh, FCC and BCC equivalents in as-cast solid solu-
Des. 115 (2017) 486–497. tions of Al x Co y Cr z Cu 0.5 Fe v Ni w high-entropy alloys, Eur. J. Control. 31 (2006)
[47] Thermo-Calc Software AB, TCHEA3: TCS High Entropy Alloy Database, Available at: 669–684.
http://www.thermocalc.com/media/35873/tchea3_extended_info.pdf. [66] B. Ren, Z.X. Liu, D.M. Li, L. Shi, B. Cai, M.X. Wang, Effect of elemental interaction on
[48] T. Chen, C. Guestrin, XGBoost: A Scalable Tree Boosting System, 2016. microstructure of CuCrFeNiMn high entropy alloy system, J. Alloys Compd. 493
[49] L. Ward, A. Agrawal, A. Choudhary, C. Wolverton, A general-purpose machine learn- (1) (2010) 148–153.
ing framework for predicting properties of inorganic materials, Npj Comput. Mater. [67] Y. Zhang, T.T. Zuo, Z. Tang, M.C. Gao, K.A. Dahmen, P.K. Liaw, Z.P. Lu, Microstructures
2 (2016) 7. and properties of high-entropy alloys, Prog. Mater. Sci. 61 (2014) 1–93.
[50] C. Kittel, Introduction to Solid State Physics, 8th ed. Wiley, 2004. [68] N. Liu, C. Chen, I. Chang, P.J. Zhou, X.J. Wang, Compositional dependence of phase se-
[51] Wikipedia, Electronegativity, https://en.wikipedia.org/wiki/Electronegativity# lection in CoCrCu0.1FeMoNi-based high-entropy alloys, Materials 11 (8) (2018) 11.
Allen_electronegativity. [69] F. Otto, Y. Yang, H. Bei, E.P. George, Relative effects of enthalpy and entropy on the
[52] P. Villars, F. Hulliger, Structural stability domains for single-coordination intermetal- phase stability of equiatomic high-entropy alloys, Acta Mater. 61 (7) (2013)
lic phases, J. Less Common Met. 132 (2) (1987) 289–315. 2628–2638.
[53] A. Takeuchi, A. Inoue, Mixing enthalpy of liquid phase calculated by miedema’s [70] B. Cantor, Multicomponent and high entropy alloys, Entropy 16 (9) (2014)
scheme and approximated with sub-regular solution model for assessing forming 4749–4768.
ability of amorphous and glassy alloys, Intermetallics 18 (9) (2010) 1779–1789.
11