A Generalized Method For Constructing Hypothetical Nanoporous Materials of Any Net Topology From Graph Theory
CrystEngComm
www.rsc.org/crystengcomm
PAPER
Peter G. Boyd and Tom K. Woo
Here we present a method for constructing hypothetical crystalline nanoporous materials, such as metal–organic frameworks (MOFs), using a graph theoretical approach. The method takes as input the discrete secondary (or structural) building units (SBUs) with defined connection points, and a desired 3-dimensional net topology in the form of a labelled quotient graph. The hypothetical materials are constructed based on the principle that using a labelled quotient graph obtained, for example, from the reticular chemistry structure resource (RCSR), one can construct a net embedding in 3-D Euclidean space with an infinite number of different representations. Thus, crystalline structures can be realized by manipulating a net's embedding such that vertices of the net match the geometries of the desired SBUs. To demonstrate the methodology, 46 different network topologies (e.g. tbo, pcu) are used to build MOFs from the same pair of 4-coordinate and 3-coordinate SBUs. We further show that the method can be used to generate hypothetical MOFs where the most common realization of a net, called the barycentric representation, will not produce a viable structure. When combined with a robust force field based geometry optimizer, the method can be used to generate large and structurally diverse hypothetical databases for virtual screening purposes.
Received 19th February 2016, Accepted 21st March 2016
DOI: 10.1039/c6ce00407e
This journal is © The Royal Society of Chemistry 2016 CrystEngComm, 2016, 18, 3777–3792 | 3777
Published on 21 March 2016. Downloaded by University of Ottawa on 1/26/2022 1:47:31 AM.
mathematical enumeration and is not upper bounded. However, databases of nets have sprung up such as the reticular chemistry structure resource (RCSR), which contain many of the nets important in classifying crystal structures.7 On its surface, the database provides three dimensional representations of nets. The vertices and edges of each net are placed to obtain the maximum possible symmetry achievable, which was shown to be when each vertex is at the centre of its edge-connected neighbours.8 This is known as the net's equilibrium or barycentric placement. Fig. 1 demonstrates this for the diamond net, where a) shows the barycentric form and b) shows a distortion in the node geometry, which lowers the maximum achievable symmetry.
There have recently been two examples of computational algorithms which exploit crystallographic nets to generate periodic materials. Wilmer et al.1 constructed a database of 137 000 hypothetical materials by combinatorially snapping SBUs together, as the authors put it, like tinker toys. We have implemented this methodology9,10 and found that one of the limitations of this approach is that, when connecting SBUs together, their relative orientations are pre-defined geometric parameters and the approach can be very sensitive to these parameters. This requires, in some cases, extensive trial-and-error adjustment of orientation parameters to achieve a periodic structure. Additionally, the structures generated by the program will always possess the same underlying topology. This was evidenced in a recent study, which demonstrated that of the 137 000 MOFs in the hypothetical MOF database, the connectivity of each could be classified into only one of 6 underlying nets.11
More recently Martin and Haranczyk reported an algorithm which assembles SBUs together using the barycentric placement of nets as ‘blueprints’.12 In such manner, SBUs with the correct connectivities are aligned to vertices within the net representation. Following this orientation, either the ideal symmetry operations are applied to the SBUs to attempt to form the structure, or the SBUs are connected together iteratively until a unit cell is formed. This algorithm is an innovation in the field of hypothetical structure generation as it circumvents the difficulty of parameterizing SBUs by hand, and also affords the generation of potentially many more MOFs of varying topology. However, to fully realize all possible structures a given set of SBUs can achieve, it would be more advantageous to start from a more abstract definition of the net.
Eon formally demonstrated that one can obtain a unique representation of a net with deviations from its barycentric placement using a series of distortion vectors called co-lattice vectors.13 These vectors, along with a net's lattice vectors, can fully describe a three-dimensional embedding. Herein we report a graph theoretical approach for obtaining hypothetical MOFs from underlying nets. Our algorithm requires the same information as described in the method of Martin and Haranczyk,12 that is to say, the user must input a set of SBUs with arbitrarily defined connection points and a desired net topology. However, our method utilizes the underlying net, not the barycentric representation, to obtain hypothetical MOF structures. This maximizes the number of achievable hypothetical MOF structures from a given set of SBUs.
The paper is outlined as follows. The next section will introduce some of the graph theoretic terminology used in the remainder of this text, as well as define nets and labelled quotient graphs which are fundamental concepts used in the construction of hypothetical MOFs in this work. The Methods section will discuss how a dimensionless labelled quotient graph is embedded in Euclidean space, initially in its
barycentric form, and then in a form which supports SBUs of
arbitrary geometry. In the Results section we will demonstrate the method in a number of applications: i) embedding nets to fit SBUs with varying geometries, ii) construction of a wide variety of structures using only two SBUs and labelled quotient graphs from the RCSR, iii) building a MOF using the labelled quotient graph of an unstable net, which possesses vertex collisions in its barycentric form, and finally iv) construction of hypothetical versions of MOF-210 and NU-110 to demonstrate the complexity of the problem and the potential of this method to build real structures. In addition, we demonstrate the construction of record high surface area hypothetical MOFs using SBUs taken from experimentally determined high surface area MOFs.
Background
As is the case in several studies presenting graph theory in the context of studying the topologies of crystal structures,13–17 these authors find it necessary to define some basic graph theoretical terms to aid the reader in navigating the remainder of this work.
Fig. 1 The diamond (dia) net in its a) barycentric representation where all edge lengths are equal, and b) a non-barycentric representation. The different edge lengths result in different geometries of the nodes in the net.
A graph, G(V, E), consists of a set of vertices V and edges E. Each edge, e ∈ E is terminated by two vertices u, v ∈ V or one vertex in V in the case of loops. If the graph has a finite number of vertices in the set V, then the graph is called finite. If each vertex u ∈ V has a finite number of edges incident upon it, the graph is locally finite.
A walk in a graph G(V, E) is a traversal of edges {e1, e2,⋯, en} ⊆ E from u to v in V.
A path between two vertices u, v ∈ V is a walk over a sequence of vertices and edges in G(V, E) such that no edge is
tween vertices. The definition of the net above implies an important concept: the vertex geometries and spatial distances between neighbouring vertices are irrelevant when classifying MOFs into a net. Topological analysis programs such as Systre8 and TOPOS19 perform analyses on the incidence structure of the MOF's underlying vertices and edges to determine isomorphism with a known net, ignoring the spatial geometries that arise from chemical bonding information. Thus it can be said there are an infinite number of possible 3-dimensional representations, or embeddings, of a net. Here
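As a minimal sketch of how a labelled quotient graph might be held in code, the block below stores one as a list of labelled arcs and unfolds it into the neighbours of a vertex in the periodic net. The `DIA_ARCS` data, the `(tail, head, label)` encoding and both function names are illustrative assumptions, not the authors' data format. The labels 000, 100, 010 and 001 follow the usual description of the dia quotient graph (two vertices joined by four arcs), and the arc orientation is chosen arbitrarily.

```python
from collections import Counter

# Hypothetical encoding: each arc is (tail, head, label), where label is the
# integer lattice translation picked up when traversing the arc tail -> head.
DIA_ARCS = [
    ("A", "B", (0, 0, 0)),
    ("A", "B", (1, 0, 0)),
    ("A", "B", (0, 1, 0)),
    ("A", "B", (0, 0, 1)),
]


def degrees(arcs):
    """Count the arcs incident on each vertex (a loop counts twice)."""
    deg = Counter()
    for tail, head, _ in arcs:
        deg[tail] += 1
        deg[head] += 1
    return dict(deg)


def neighbours(arcs, vertex, cell=(0, 0, 0)):
    """Neighbours of `vertex` in unit cell `cell` of the unfolded periodic net."""
    found = []
    for tail, head, label in arcs:
        if tail == vertex:  # traversing along the arc adds its label
            found.append((head, tuple(c + l for c, l in zip(cell, label))))
        if head == vertex:  # traversing against the arc subtracts its label
            found.append((tail, tuple(c - l for c, l in zip(cell, label))))
    return found


dia_degrees = degrees(DIA_ARCS)
neighbours_of_b = neighbours(DIA_ARCS, "B")
```

Every vertex here has four incident arcs, so the quotient graph is finite and locally finite, while the net it unfolds into is infinite.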
representation of the quotient graph of dia is provided in Fig. 2a.

Methods

The aim of this work is to construct a 3-D net from an abstract labelled quotient graph such that when the 3-D net is realized, the geometry of its nodes will fit exactly with the geometry of the SBUs, which, upon superposition, will generate a novel hypothetical crystalline material. For the sake of clarity, we will say vertices and arcs (or edges) are dimensionless elements of the labelled quotient graph, and their images in 3-D Euclidean space are points and lines, respectively.
There are two main aspects to constructing a unit-cell representation of an abstract labelled quotient graph: i) construction of the metric tensor, which defines the translational symmetry of the net (i.e. the unit cell), and ii) sending each vertex and arc from the labelled quotient graph to a 3-D point and line in the unit cell. The information required to perform these two operations is all contained within the labelled quotient graph and the following sections will demonstrate how this information is extracted.

Periodic lattice representations of the arcs

It was shown13 that one can construct a matrix representation, B*, of the arcs in a labelled quotient graph such that,

$\Omega^{+} = B^{*-1}\alpha$ (1)

where Ω+ are the 3-D lines represented in the fractional coordinates of the unit cell and α is a matrix corresponding to the sum of arc labels in the matrix B*.
To construct B* and the corresponding α matrix for an arbitrary labelled quotient graph with m arcs and n vertices, one must first establish the basis of the graph's cycle and co-cycle (or cut) space. The cycle basis of a graph is an irreducible representation of all of its possible cycles. For connected graphs such as all of those representing crystalline nets, the basis of the cycle space contains a total of m − n + 1 vectors, where each vector is a sum of the arcs (taking orientation into consideration) which form that cycle. An example can be seen for the graph of qtz in Fig. 2b, where a cycle is highlighted in orange. Note, to be described as a cycle, the highlighted orange path must ‘flip’ the orientation of the arc e2, since it is pointing in the wrong direction. Thus the basis cycle vector, represented in terms of the edges (e1, e2, e3, e4, e5, e6) is (1, −1, 1, 0, 0, 0) since it only contains the first three arcs in its path and the second entry is negated to indicate the reversed orientation of that arc in the cycle.
Many of the labelled quotient graphs used to construct MOFs are much larger than the examples shown in Fig. 2 (some with over 100 arcs!), so to systematically construct the cycle basis in these graphs, we use a minimum spanning tree algorithm20–23 from the SAGE mathematical package.24 Upon forming a minimum spanning tree, which defines the minimum number of edges that connect all vertices in a graph, a cycle basis can be realized for a labelled quotient graph by adding the edges not included in the minimum spanning tree.
We are only half done: to construct a 3-D representation of a labelled quotient graph, B* must unambiguously map all of the arcs to 3-D lines, and is therefore an invertible m × m matrix. The cycle basis is only of rank m − n + 1, thus the remaining n − 1 rows of the matrix consist of the graph's co-cycle space. Here we will say that the co-cycle space consists of all outward oriented arcs from the first n − 1 vertices of the graph. An example of a co-cycle basis vector for qtz in Fig. 2b is to represent all the outward oriented arcs of vertex ‘A’, if necessary ‘flipping’ orientations in the same fashion as was done for the cycle bases. Thus one co-cycle vector for the labelled quotient graph of qtz is (1, 0, −1, −1, 0, 1) representing the outgoing arcs from ‘A’: e1, ‘flipped’ e3, ‘flipped’ e4, and e6. We construct the co-cycle basis in a relatively straightforward manner, by iterating over the first n − 1 vertices and taking their outward oriented arcs as the co-cycle basis vectors.
We map the basis vectors in B* to the 3-D lattice space by defining their lattice orientations in the m × 3 matrix α. For the cycle basis vectors, this is accomplished by summing the labels associated with the arcs in that vector. Thus the first m − n + 1 rows of the matrix α are called ‘lattice vectors’ and contain information about what periodic image the corresponding cycle in B* projects into. For example, the lattice vector associated with the orange cycle of qtz in Fig. 2b would be a summation of the labels of +e1, −e2, and e3. Conceptually, this cycle mapping can be described by a walk of the net starting at vertex ‘A’, traversing e1 to vertex ‘B’, traversing −e2 to vertex ‘C’, and finally traversing e3 to arrive at an image of vertex ‘A’ in the periodic direction 001.
We define the co-lattice vectors as the remaining n − 1 entries of α. These correspondingly map the co-cycle basis vectors in B* to the lattice space. Importantly, these co-lattice vectors can be considered a deviation of an embedded point from the centre of mass of its neighbouring points. Eon13 demonstrated that if one maps the cycle basis to the lattice vectors and the co-cycle basis to a set of zero vectors (L* = {000}), one obtains the unique barycentric representation of a net, a representation seen in the RCSR7 as well as publications discussing nets in the context of MOF classification.6,25 We leave the details of identifying the cycle, co-cycle basis and mapping the two graphs shown in Fig. 2 to their barycentric representations in the ESI.† It was also demonstrated that one can obtain an arbitrary, unique, representation of a net by setting the co-cycle basis to a set of non-zero vectors (L* ≠ {000}), which is the general strategy employed in this work to obtain net embeddings supporting the geometries of a set of SBUs. To our knowledge, no other program is this generalized in constructing hypothetical MOFs or other materials.

The metric tensor

We have defined the arcs as lines and the vertices as points in the lattice space. To construct a representation of a net in
Cartesian space for the purposes of building hypothetical MOFs, we must construct a representation of the lattice in $\mathbb{R}^3$. The metric tensor, Z, thus completes this description, representing a mapping from the lattice coordinate system defined by the integer vectors (100), (010), and (001) to some other basis in three dimensional space. It is characterized by the dot products of the lattice basis vectors,

(2)

cycles correspond to the lattice directions (100), (010), and (001). In the qtz graph these are L1 = (e3 − e4), L2 = (e1 − e6), and L3 = (e4 + e6 + e5). The lattice vectors obtained from the remaining cycles which form the basis of the cycle space can then be expressed as a linear combination of L1, L2, and L3. For qtz, the single remaining cycle L4 = (e2 − e5) has a sum of edge labels that can be expressed as a linear combination of the first two lattice vectors, L4 = (−L1 − L2). Thus one can construct a cycle vector for the kernel K by summing the
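The arc mapping of eqn (1) can be sketched numerically for a small case. The example below assumes the dia labelled quotient graph has two vertices A and B joined by four arcs with labels 000, 100, 010 and 001, all oriented from A to B. The orientation, the variable names and the helper `embed_arcs` are illustrative assumptions rather than the authors' implementation, and the sign conventions of the published B* may differ.

```python
import numpy as np

# Assumed dia quotient graph: arcs e1..e4, all oriented A -> B.
labels = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)

# Cycle basis (m - n + 1 = 4 - 2 + 1 = 3 rows): each cycle leaves A along e_k
# and returns along a 'flipped' e1, so e_k gets +1 and e1 gets -1.
cycles = np.array([[-1, 1, 0, 0],
                   [-1, 0, 1, 0],
                   [-1, 0, 0, 1]], dtype=float)

# Co-cycle basis (n - 1 = 1 row): all outward oriented arcs of vertex A.
cocycles = np.array([[1, 1, 1, 1]], dtype=float)


def embed_arcs(cycles, cocycles, labels, colattice):
    """Solve B* . Omega = alpha for the fractional arc vectors Omega."""
    b_star = np.vstack([cycles, cocycles])
    lattice = cycles @ labels  # signed sum of arc labels for each cycle
    alpha = np.vstack([lattice, colattice])
    return np.linalg.solve(b_star, alpha)  # equivalent to inv(B*) @ alpha


# Zero co-lattice vector: barycentric placement.
omega_bary = embed_arcs(cycles, cocycles, labels, np.zeros((1, 3)))
# Non-zero co-lattice vector: vertex A is displaced from the barycentre.
omega_shifted = embed_arcs(cycles, cocycles, labels, np.array([[0.2, 0.0, 0.0]]))
```

With a zero co-lattice vector the four lines leaving A sum to zero, which is the barycentric condition. A non-zero co-lattice vector changes that sum while the lattice vectors of the cycles stay fixed.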
(3)
(4)
(8)
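To illustrate how a metric tensor built from dot products of the lattice basis vectors turns fractional lines into lengths and angles, here is a minimal sketch. It assumes the cell is described by the conventional parameters a, b, c and three cell angles; the function names, the argument names and the rhombohedral test cell are assumptions made for illustration and do not come from the source.

```python
import numpy as np


def metric_tensor(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Matrix of dot products between the three lattice basis vectors."""
    ca, cb, cg = np.cos(np.radians([alpha_deg, beta_deg, gamma_deg]))
    return np.array([[a * a,      a * b * cg, a * c * cb],
                     [a * b * cg, b * b,      b * c * ca],
                     [a * c * cb, b * c * ca, c * c]])


def line_length(omega, z):
    """Cartesian length of a line given in fractional coordinates."""
    omega = np.asarray(omega, dtype=float)
    return float(np.sqrt(omega @ z @ omega))


def line_angle(omega_i, omega_j, z):
    """Angle in degrees between two fractional lines under metric z."""
    omega_i = np.asarray(omega_i, dtype=float)
    omega_j = np.asarray(omega_j, dtype=float)
    cosine = (omega_i @ z @ omega_j) / (line_length(omega_i, z) * line_length(omega_j, z))
    return float(np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0))))


# Assumed test case: a rhombohedral cell (all angles 60 degrees) and two
# diamond-like lines leaving the same vertex.
z_cell = metric_tensor(1.0, 1.0, 1.0, 60.0, 60.0, 60.0)
first_line = [-0.25, -0.25, -0.25]
second_line = [0.75, -0.25, -0.25]
angle = line_angle(first_line, second_line, z_cell)
```

In this assumed cell the two lines have equal length and meet at the tetrahedral angle, so changing the cell parameters alone is enough to change the node geometry of a fixed fractional embedding.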
opted for the L-BFGS optimizer. In favour of using a global optimizer, one can perceive cases where the net is very large, such that significant local geometric errors in the net geometry would not impact the relative value of eqn (11). In these cases a local minimum could result in a net embedding with distorted nodes, yielding a potentially non-physical structure.
As an example, consider the two SBUs in Fig. 4, an octatopic Cd2+ inorganic SBU abbreviated as Cd4(O2CR)8 (ref. 30) and a ditopic diphenyl organic SBU. The net bcu contains a single vertex of degree 8, which matches the degree of the inorganic SBU in Fig. 4a. Likewise, the linear ditopic organic SBU shown in Fig. 4b can be assigned to the edges of the graph. Here we will demonstrate how the method described above will embed the net bcu based on the geometries of the SBUs in Fig. 4.
The initial step is to assign SBUs to specific vertices within the labelled quotient graph. This can be accomplished by matching vertex and SBU coordination numbers. In the case of bcu, this task is straightforward; there is a single node with eight edges incident upon it. Thus the 8-coordinate Cd2+ SBU will be assigned to the vertex. In order to account for particular bonding geometries of SBUs, the labelled quotient graphs are augmented with more vertices and edges. Specifically, edges are cut between each original vertex to include an additional two vertices of degree 2. The extra edge between these vertices represents the bond made between adjacent SBUs, to account for bond orientations which cannot be represented by a single edge originating from the SBU's centre of mass. An example of this can be seen in Fig. 4a for Cd4(O2C)8. The geometry of the SBU and its connection sites can best be described by a vertex of degree 8 located at the SBU centre of mass, and 8 vertices of degree 2 which describe the bonding orientation of the carboxylate moieties.
It should be noted that vertices of degree 2 are typically not found in crystallographic nets, as they can be replaced by a single edge between vertices of a higher degree. However, in the case of structure generation, we find it necessary to include these vertices to ensure a good geometric match with the SBUs. For example, all of the zeolite topologies possess quotient graphs with vertices of degree 4, which represent the tetrahedral Si4+ ions within the cell. However, in order to accurately describe the ‘bent’ 145° angle of the connecting oxygen atoms, these bridging vertices must be inserted between vertices of degree 4. By inserting these vertices of degree 2, we do not alter the topology of the underlying graph. This is ensured by assigning a label of ‘000’ to any new edges such that the sum of edge labels in the cycle basis yields the same value as the cycle in the un-augmented form. A specific example is shown in Fig. 5. The labelled quotient graph is augmented with several new vertices and edges to account for the linear organic SBU and the bond between the inorganic and organic SBUs.
We are now faced with the task of assigning the arcs of the labelled quotient graph to specific SBU connecting sites. We note that there are many possible ways to assign arcs to SBU connection sites, particularly when the symmetry of the SBU is low. Schmid and coworkers have performed genetic algorithm optimizations to find the lowest energy (and therefore probable) SBU orientations within a net representation; this was used in the context of explaining why a particular nano-porous material will assemble to form one net over another.31,32 In this work arc assignment is based on a geometric comparison of the inner products of the SBU vectors with the barycentric placement of the net
vertices. A decision is made by pairing the lines of the barycentric net embedding to a normalized set of vectors representing the SBU connection sites that yields the lowest root mean squared deviation.
Once the connection sites of each SBU have been assigned to specific arcs from the labelled quotient graph, we can compare their geometries to that of the inner products from a net embedding generated using eqn (10). We define elements gij of the net embedding's inner product matrix g as belonging to a particular SBU if the arcs ei and ej in the augmented labelled quotient graph were assigned to that SBU. In the case of the bcu topology, the Cd4(O2CR)8 SBU is assigned all of the arcs connected to the central vertex A in Fig. 5b as well as the adjacent arcs representing the SBU's connection sites. Suppose the n edge indices assigned to a particular SBU are found in the set a = (a1,⋯, ak,⋯, an), the elements of g associated with an SBU would be,

$N_{\mathrm{SBU}} = \{\, g_{a_r a_s} \mid (1 \le r \le n),\ (s \le r) \,\}$ (16)

There are 16 arcs assigned to Cd4(O2CR)8 in the augmented quotient graph of bcu. Thus n = 16 and the set NCd2+ runs over n(n + 1)/2 = 136 elements of the inner product matrix.
It should be noted that when constructing the geometry of the reduced SBU representation in eqn (9), the orientations of the vectors must match the positive orientation of the associated arc in the labelled quotient graph. For example, the arc +e9 in Fig. 5b points toward the vertex A. Thus the vector associated with +e9 in the underlying SBU representation of Cd4(O2C)8 must be oriented into the SBU's centre of mass (which was assigned to the vertex A).
Fig. 6 The graphical representation of progressing from the a) barycentric representation of the net bcu to b) an augmented form of the net with orange vertices representing SBU connection sites, black vertices representing the centre of mass of the SBUs. c) The bcu net is adjusted into a non-barycentric form to match the geometry of the SBUs in Fig. 4. d) The SBUs are superimposed on the net vertices to form the final hypothetical MOF structure.
We are now armed with the necessary elements required to compare the net and SBU geometries and to find the metric tensor parameters which optimize the objective function in eqn (11). Fig. 6 shows a graphical progression from the barycentric embedding of the net bcu as taken from
the RCSR to the embedding which fits the SBUs from Fig. 4. The barycentric representation of bcu is shown in Fig. 6a; the augmented form with lines and points added to accommodate SBU connection sites is seen in Fig. 6b. This is followed by an optimization of the X and P parameters of the net in Fig. 6c, such that its geometry best accommodates the SBUs. Finally, the SBUs are oriented onto their associated vertices to construct the final hypothetical MOF in Fig. 6d.

Implementation

The MOF building program was written primarily in Python, with parts in C++ to interface with the NLOPT33 optimization library. The outline of the program is as follows:
• The program reads in labelled quotient graphs obtained from the RCSR. The format is general, and is used in other collections such as the EPINET database.34 Further details are given in the ESI.†
• SBUs are chosen if they agree with the degree of the net, based on the number of connection sites they possess.
• The labelled quotient graph is augmented to introduce more vertices and edges so they can be matched to each SBU's centre of mass and connection sites. If linear two-connected SBUs are used, they will be accommodated by inserting 3 divalent vertices into a single edge from the labelled quotient graph: a central vertex for the SBU's centre of mass, and two vertices for its connection sites. The edges split in this manner are typically between vertices assigned to opposing SBU type. For example, an edge joining two inorganic SBUs will be split to support a linear organic SBU.
• SBUs and their connection sites are assigned to specific vertices and arcs in the labelled quotient graph based on a lowest RMS score with respect to the barycentric embedding of the net. The authors recognize that one can leverage symmetry elements found in the labelled quotient graph (automorphisms) to assign arcs and vertices to SBUs (assuming the point group symmetry of the SBU is known). However, this is currently not implemented.
• The net embedding is optimized to the geometries of the SBUs. This is accomplished by constructing the objective function for the net and minimizing it using the L-BFGS algorithm discussed above.
• Following optimization, the SBUs are oriented onto their assigned points to complete the hypothetical MOF construction. This orientation consists of a translation of the SBU's centre of mass to the central node, and a rotation to least squares fit the connection sites to the appropriate points.35 At this stage, an atom collision check is performed to ensure that no atoms are closer than the sum of their van der Waals radii multiplied by a user-defined tolerance factor. In this work, the tolerance factor was set to 0.4.

Results and discussion
Fitting different 4-c SBUs to nets

Fig. 7 SBUs used as input into the structure generation program and their underlying vertex representations. a) (4-c) square planar copper paddlewheel SBU, b) (4-c) tetrahedral organic SBU and c) (3-c) trigonal planar organic SBU. Black vertices in the underlying SBU representations correspond to the SBU's centre of mass, and orange vertices correspond to their connection sites.
As an initial demonstration of the methodology, we will construct periodic structures of different topologies using a four coordinate (4-c) inorganic SBU shown in Fig. 7a and a 4-c organic SBU shown in Fig. 7b. The copper paddlewheel can be considered a square planar SBU with angles of 90° between adjacent edges. The organic SBU can be considered tetrahedral-like, which has four angles of 116.5° and two angles of 96.1°. The 3 topologies taken from the RCSR were chosen for instructive purposes: the two nets dia and qtz whose labelled quotient graphs are shown in Fig. 2, and the net pts. The first two nets have only tetrahedral-like vertices in their barycentric form, thus fitting the square planar Cu paddlewheel to these nets represents a challenging case, as it will require significant distortion of the net. The net pts contains both square planar and tetrahedral-like vertices in its barycentric representation, thus it will be less challenging to fit the net to these SBUs. Moreover this net is known to support geometries of the first two SBUs described in Fig. 7 and is thus likely to give the best fit to the SBUs when optimizing the net geometry.36,37 The labelled quotient graphs of dia and pts are classified as bipartite, which means the vertices in the graph can be separated into two sets such that every edge joins a vertex in one set to a vertex in the other. This makes SBU placement easy, as the inorganic SBUs can be placed in one set of vertices and the organic SBUs can be placed in the other set, ensuring that organic and inorganic SBUs connect only to each other. However the qtz net produces a challenge for the choice of SBU placements. It is not bipartite and so we must relax the rule that organic and inorganic SBUs only join to each other
and not themselves. In this case only organic SBUs were placed on adjacent vertices. This effectively creates a larger, more complicated organic SBU in the hypothetical MOF.
The fitting of these nets to SBUs is not always perfect for a number of reasons. The first is that the program attempts to match a net to a set of rigid SBU geometries, which, due to the constraints of their angles, may not support a particular periodic connectivity. A discussion of these types of geometries can be found in ref. 38. Another reason is the constraints set on the optimization parameters. For example, during the optimization, the cell angles are not permitted to fall below 60° or rise above 120°. For these reasons, a ‘fitness’ of the embedded net is calculated once the optimization has converged. This is computed as the average deviation of the net geometry, in terms of its line lengths (line length deviation shown in the first row of Table 1) and angles (angle deviation shown in the second row of Table 1) compared with the underlying geometry of the rigid SBUs the program fits the net to. Upon evaluating the net's fitness, one can use discretion on whether to keep or discard the structure. In some cases the fit can be poor, but produce a reasonable structure after relaxing the atoms and bonds by optimization with a molecular mechanics force field or ab initio calculation. Table 1 reports how well each net embedding fits to the rigid SBUs. While the averages of line length and angle deviations were relatively small across all topologies, the standard deviations appear to be the lowest for the pts net. The stresses on the unit cell were computed from a single point DFT calculation in VASP.39–42 Comparison of the stress tensors shows the pts net provides the lowest amount of strain in the structure. It is clear that the significant distortions required to fit the dia and qtz nets to the square planar Cu paddlewheel SBU have resulted in some unphysical stresses on the system.
While not investigated here, it is possible that expanding the labelled quotient graph (or reducing the translational symmetry) so that the ‘unit cell’ of the net contains more vertices and edges to support more SBUs may reduce the stress values shown in Table 1. For example, 3 of the 4 arcs in the labelled quotient graph of dia correspond to the three lattice vectors 100, 010, and 001. Since these three arcs are assigned to a square planar paddlewheel, there is considerable pressure to make these lattice vectors coplanar. This can be seen by the non-zero stress values in the XY, YZ and ZX values for dia. By increasing the number of SBUs in the unit cell, we can effectively distribute this stress over a larger number of bonds and degrees of freedom. Similar arguments can be made for the qtz net whose lattice vectors are determined by a sum of only two or three arcs. It is notable that the stress values in the XX, YY, and ZZ directions are negative for both the dia and qtz MOFs, indicating a drive towards volume compression of the unit cells. The connectivity of these structures indicates that at least two connection sites of the tetrahedral SBU in Fig. 7b are connected to two sites of the same Cu paddlewheel. While the tetrahedral orientation of the organic SBU is provided by a single sp3 hybridized carbon, the square planar angles of the Cu paddlewheel are much more rigid, being supported by two Cu2+ ions coordinated to conjugated carboxylate moieties. Therefore the compression likely arises from the forced planarity of the tetrahedral SBUs.

Structures built from the RCSR

To demonstrate the breadth of structures the program is able to generate, the (4-c) copper paddlewheel and a (3-c) SBU shown in Fig. 7a and c, respectively, were used to generate 46 MOFs unique in their topology. Note, at the time of this writing, the number of non-augmented, non-catenated nets with mixed degree (3,4) in the RCSR is 49; however, 3 structures contained atomic collisions which could not be eliminated. Many of the topologies shown along the x-axis of Fig. 8 are not bipartite. Thus in some of the MOFs Cu paddlewheels are bonded to Cu paddlewheels, and in others the 3-c SBU is bonded to other 3-c SBUs. All of these structures were optimized using a molecular mechanics force field prior to computing their surface areas.
It is interesting to see such a diversity of surface areas in Fig. 8 achieved from building MOFs with only two types of SBUs. As a rough comparison, the database of over 130 000 MOFs constructed by Wilmer et al. possesses a similar
Table 1 Measurements of the fit of three nets to the organic and inor-
ganic SBUs shown in Fig. 7
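
The kind of fit measure reported in Table 1 can be illustrated with a minimal sketch. It assumes that the edge vectors at a net vertex are listed in the same order as the connection vectors of the SBU assigned to it. The function name `fit_deviation`, that pairing, and the use of the population standard deviation are assumptions made for illustration; they are not details taken from the program.

```python
import math
from itertools import combinations
from statistics import mean, pstdev


def _norm(v):
    return math.sqrt(sum(c * c for c in v))


def _angle_deg(u, v):
    cos = sum(a * b for a, b in zip(u, v)) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def fit_deviation(net_edges, sbu_arms):
    """Return ((mean, std) of length deviations, (mean, std) of angle deviations)."""
    # Line lengths: compare each net edge with the matching SBU connection vector
    length_dev = [abs(_norm(e) - _norm(a)) for e, a in zip(net_edges, sbu_arms)]
    # Angles: compare every pair of edges with the matching pair of SBU arms
    angle_dev = [
        abs(_angle_deg(net_edges[i], net_edges[j]) - _angle_deg(sbu_arms[i], sbu_arms[j]))
        for i, j in combinations(range(len(sbu_arms)), 2)
    ]
    return (mean(length_dev), pstdev(length_dev)), (mean(angle_dev), pstdev(angle_dev))


# Hypothetical data: a square planar 4-c SBU compared with tetrahedral net edges
square = [(1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0)]
tetra = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
(len_mean, len_std), (ang_mean, ang_std) = fit_deviation(tetra, square)
```

In this toy case the angle deviation is large because four of the six tetrahedral angles must close to 90° and two must open to 180°, which is the same kind of mismatch discussed for the dia net and the square planar paddlewheel.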
addition, the symmetry detected in the generated structure was the ideal symmetry space group found in the barycentric representation of the net, Fm3̄m. For the structure built with the SBU with C2v symmetry in Fig. 9d, the resulting net embedding found from our algorithm, when compared with the SBU geometries, was reported to possess an average line length deviation of 0.3 ± 0.4 Å and angle deviation of 6.6 ± 10 degrees. The reported crystal system was found to be monoclinic C2, which possesses a two-fold screw axis and a two-fold rotation axis. Both symmetry operations support the symmetry of the C2v SBU. The structure built from the low symmetry SBU in Fig. 9f was found to have P1 symmetry, as expected. The line length and angle deviations reported from generating this material were 0.1 ± 0.4 Å and 7 ± 10 degrees, respectively. Crystalline structures with this level of strain and low symmetry will likely never be experimentally realized; however, generation of this material demonstrates the robustness of the structure generation code. The displayed structures have not been geometry optimized in any form; they are a direct output of the structure generation program.

Building MOFs with unstable nets
The advantage this program possesses over other methods is that it can be extended beyond the nets found in the RCSR in their barycentric representation. As an example, we will build a structure based on an unstable net, 3(3²,4)2. Unstable nets, so-called because they possess intersecting vertices in their barycentric form, have been shown to be important as descriptions of crystallographic structures. Recently Delgado-Friedrichs et al. discussed several crystalline materials which could be classified as having unstable nets with collisions in their barycentric placement.44 Here we assemble a MOF with the copper paddlewheel (Fig. 7a) and a tritopic organic SBU (Fig. 7b), using only the labelled quotient graph as input to define the topology. The final hypothetical MOF built from this net is shown in Fig. 10 and was confirmed to possess the 3(3²,4)2 topology using the TOPOS program. It should be noted that the embedding of this net produced somewhat unnatural bending within the structure, which can be expected when fitting rigid SBUs to an underlying graph. Potential energy based geometry optimization of the material with an appropriate force field or similar would improve the structure; however, the structure given in Fig. 10 is the ‘raw’ structure obtained with our approach.

Challenging topologies and high surface area MOFs
Recently, Yaghi and coworkers reported a material, MOF-210, with, at the time, a record breaking Brunauer–Emmett–Teller (BET) surface area of 6240 m² g⁻¹ (ref. 48). Building this MOF represents a challenge for current structure generation algorithms for several reasons. The MOF consists of 3 unique SBUs of different geometry: a hexatopic Zn4O inorganic SBU found in other MOFs such as MOF-5,18 a 3-c organic SBU, 4,4′,4″-(benzene-1,3,5-triyl-tris(ethyne-2,1-diyl))tribenzoate (BTE), and biphenyl-4,4′-dicarboxylate (BPDC), which is a linear 2-c SBU. The rhombohedral primitive cell of this material is very large, possessing cell dimensions of 71 Å and containing a total of 18 Zn4O inorganic SBU clusters, 18 BPDC SBUs, and 12 BTE SBUs. These factors alone would pose a particular challenge for the recursion method,1 which initiates a structure by placing a single SBU in space, and samples every possible combination of SBU bonds until a structure is made. To assemble the hypothetical version of MOF-210, which possesses the toz topology, with the tinker-toy-based construction method, one would need to sample a very large number of possibilities before achieving the correct structure. Assuming that a 6-coordinate Zn4O SBU is initially placed, then each of the 6 connection sites would sample the 3 bonds from the BTE organic SBU and the 2 bonds from BPDC, yielding 6⁵ = 7776 possible bonding combinations for the first SBU alone. There
Fig. 10 Hypothetical MOF built with the unstable net 3(3²,4)2. Left: the labelled quotient graph, middle: the SBUs used to build the structure,
right: the hypothetical MOF.
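
To make the notion of an unstable net concrete, the sketch below computes a barycentric placement from a labelled quotient graph and reports vertices that coincide. It is a minimal illustration under stated assumptions: arcs are stored as `(i, j, voltage)` tuples, vertex 0 is pinned at the origin, the graph is connected, and the function names are hypothetical. For dia, the text states only that three of the four arcs carry the labels 100, 010 and 001; the zero label on the fourth arc is an assumption. The one-periodic toy graph is invented for illustration and is not the net used for Fig. 10.

```python
from fractions import Fraction


def barycentric_placement(n_vertices, arcs, dim=3):
    """Fractional coordinates with every vertex at the mean of its neighbours.

    arcs: list of (i, j, voltage); vertex i in the home cell is joined to vertex j
    in the cell displaced by the integer lattice vector `voltage`.
    """
    # Balance condition per vertex: sum over incident arcs of (x_i - x_nbr) = 0
    lap = [[Fraction(0)] * n_vertices for _ in range(n_vertices)]
    rhs = [[Fraction(0)] * dim for _ in range(n_vertices)]
    for i, j, s in arcs:
        lap[i][i] += 1
        lap[j][j] += 1
        lap[i][j] -= 1
        lap[j][i] -= 1
        for k in range(dim):
            rhs[i][k] += s[k]
            rhs[j][k] -= s[k]
    # Pin vertex 0 at the origin and solve the reduced system exactly
    keep = range(1, n_vertices)
    aug = [[lap[r][c] for c in keep] + rhs[r] for r in keep]
    m = n_vertices - 1
    for col in range(m):
        pivot_row = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        for r in range(m):
            factor = aug[r][col]
            if r != col and factor != 0:
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    coords = [[Fraction(0)] * dim] + [row[m:] for row in aug]
    # Wrap into the home cell so that equal positions compare equal
    return [tuple(c % 1 for c in point) for point in coords]


def collisions(positions):
    """Pairs of vertices that share a position (the signature of an unstable net)."""
    seen, hits = {}, []
    for v, p in enumerate(positions):
        if p in seen:
            hits.append((seen[p], v))
        else:
            seen[p] = v
    return hits


# dia: two vertices, four arcs (zero label on the first arc is an assumption)
dia_arcs = [(0, 1, (0, 0, 0)), (0, 1, (1, 0, 0)), (0, 1, (0, 1, 0)), (0, 1, (0, 0, 1))]
dia_positions = barycentric_placement(2, dia_arcs)

# Hypothetical 1-periodic toy graph: vertices 1 and 2 have identical neighbourhoods
toy_arcs = [(0, 1, (0,)), (1, 0, (1,)), (0, 2, (0,)), (2, 0, (1,))]
toy_positions = barycentric_placement(3, toy_arcs, dim=1)
```

Exact rational arithmetic is used so that coincident vertices can be detected by equality rather than by a distance tolerance.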
are a total of 48 SBUs in the primitive cell of this MOF, so one would have to sample on the order of 10⁹⁰ possible bonding combinations, one of which would yield MOF-210. In practice, an upper bound must be placed on the number of bond combinations to sample; thus, even if the tinker-toy algorithm can perform trial bonding moves in seconds, it is unlikely that MOF-210 could be built. In addition, the coordination environments of the Zn4O SBUs are different. Each is coordinated to 4 tritopic BTE SBUs and 2 linear BPDC SBUs; however, some Zn4O SBUs coordinate two BPDC in a cis fashion, others in a trans fashion. Thus, building this structure with the tinker-toy algorithm would require not only an inconceivable amount of compute resources, but also a large amount of information from the experimental crystal structure to ensure the bonding motifs of the Zn SBUs are preserved. For these reasons, the predictive capacity of the tinker-toy algorithm is somewhat limited to relatively small, topologically and chemically simple structures.
These problems can be alleviated with our algorithm. By using the labelled quotient graph of the toz net as structure directing, and defining the 2-c, 3-c and 6-c SBUs, one can assemble a hypothetical version of MOF-210 with relative ease. In this case, all 6-c vertices are assigned to the Zn4O SBU and 3-c vertices are assigned to the BTE SBU. With the placement of Zn4O SBUs on the 6-c vertices, the program subsequently recognizes these vertices as ‘inorganic’; thus, in cases where 6-c vertices in the graph are incident upon other 6-c vertices, it will automatically augment the edge between these vertices with an additional 2-c vertex to support a bridging linear ditopic BPDC SBU. To construct the hypothetical version of MOF-210, the organic SBUs were built using conventional molecule building software. Unlike the SBUs found in the experimental crystal structure of MOF-210, these SBUs were built entirely planar, with typical C–C and C–H bond lengths and angles for aromatic and alkyne systems. The inorganic SBU was extracted from MOF-5 (ref. 18), which contains the same Zn4O cluster, only connected in a different topology to different organic SBUs.
Compared in the first two columns of Table 2 are the experimental unit cell parameters of MOF-210, and those of our generated structure following geometry optimization of both the atomic positions and the cell at the molecular mechanics level. Table 2 also reveals that the calculated surface area and pore volume of the generated structure are within 1.1% of the experimental structure. We believe the agreement in the cell parameters and other properties is excellent considering that only the labelled quotient graph of the net was used as input and that the initial SBU geometries were taken from sources not related to the crystal structure of MOF-210. To compare the generated and experimental structures visually and to depict the complexity of the structure, Fig. 11b shows an overlay of the experimental atomic positions in the unit cell with that of the generated structure. It is worth noting that the ‘raw’ unit cell vectors produced by the algorithm will generally not be the same as the experimental vectors. However, one can define a linear combination of the original vectors to yield cell vectors and dimensions that agree with the experimental counterparts. Further discussion of adjusting the cell vectors in this way is provided in the ESI.† We note that altering the unit cell dimensions does not change the inherent physical and chemical characteristics of the periodic structure. The presented values for surface area and pore volume in Table 2 were identical both before and after the cell parameters were adjusted.
Farha et al.50 recently demonstrated a method for improving the surface area of an existing MOF, NU-100, based on the ntt topology. By increasing the length of the organic SBU arms with extra benzene and acetylene moieties, they were able to generate NU-110, which surpassed MOF-210 as the MOF with record breaking gravimetric surface area.50 The organic SBU of NU-110 is similar to the 3-c BTE SBU in that it contains a central aromatic ring with three arms extending outward with D3h symmetry. The differences are that these arms are approximately four times longer than those of BTE, and that two carboxylate moieties are found at the terminus of each arm, yielding a 6-c SBU (which the authors called LH6 (ref. 50)) instead of the 3-c SBU of BTE. It was shown50 that the computed physical properties of the hypothetical structure generated with the recursion method were in good agreement with the experimentally realized version of this
Table 2 Cell parameters and selected physical properties of two experimental MOFs, NU-110 and MOF-210, along with purely hypothetical MOFs gen-
erated with the ntt, toz and ith-d topologies
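
The adjustment of cell vectors by linear combination can be sketched as follows. This is an illustration under stated assumptions rather than the procedure given in the ESI: the combination is restricted to an integer matrix with determinant ±1, so that the new vectors generate the same lattice and enclose the same volume, and the cell values and function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def recombine_cell(cell, combo):
    """Return new lattice vectors as integer combinations of the old ones.

    cell:  rows are the original lattice vectors (Cartesian components).
    combo: rows are the integer coefficients of each new vector.
    """
    if abs(det3(combo)) != 1:
        raise ValueError("combination is not unimodular; the lattice would change")
    return [
        tuple(sum(combo[r][k] * cell[k][x] for k in range(3)) for x in range(3))
        for r in range(3)
    ]


# Hypothetical cubic cell of edge 10 (arbitrary units)
cell = [(10, 0, 0), (0, 10, 0), (0, 0, 10)]
# New first vector is a + b; the other two are unchanged
combo = [(1, 1, 0), (0, 1, 0), (0, 0, 1)]
new_cell = recombine_cell(cell, combo)
```

Because the volume and contents of the cell are unchanged by such a transformation, intensive properties computed per unit volume or per unit mass are unaffected, consistent with the identical surface area and pore volume reported before and after adjustment.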
Fig. 11 The hypothetical generation of two high surface area experimental MOFs. a) The SBUs used to assemble the hypothetical version of MOF-
210, including the inorganic Zn4O cluster, 4,4′,4″-(benzene-1,3,5-triyl-tris(ethyne-2,1-diyl))tribenzoate (BTE) and biphenyl-4,4′-dicarboxylate
(BPDC). b) Direct comparison of the hypothetical (blue) and experimental (red) MOF-210 structures. c) The organic SBUs used to assemble the hy-
pothetical version of NU-110. The original 6-c LH6 ligand (carboxylate moieties replaced with dotted lines) was fragmented into a 3-c benzene
SBU, and 3-c LH3 containing the remaining arms of the LH6 ligand.50 d) Superposition of the hypothetical (blue) and experimental (red) structures
of NU-110.
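
The automatic augmentation used for MOF-210, in which an edge joining two ‘inorganic’ vertices receives an extra 2-c vertex for the bridging BPDC linker, might be sketched as below. The arc representation `(i, j, voltage)`, the function name, and the choice to keep the new vertex in the home cell (so the first half-arc carries a zero label and the second keeps the original label) are assumptions for illustration. The three-vertex graph is a hypothetical toy, not the toz net.

```python
def augment_inorganic_edges(n_vertices, arcs, inorganic):
    """Split every arc that joins two inorganic vertices with a new 2-c vertex.

    Returns (new vertex count, new arc list, indices of the inserted 2-c vertices).
    """
    new_arcs, linkers = [], []
    for i, j, voltage in arcs:
        if i in inorganic and j in inorganic:
            mid = n_vertices + len(linkers)  # index of the inserted 2-c vertex
            linkers.append(mid)
            zero = tuple(0 for _ in voltage)
            new_arcs.append((i, mid, zero))     # home cell to home cell
            new_arcs.append((mid, j, voltage))  # carries the original label
        else:
            new_arcs.append((i, j, voltage))
    return n_vertices + len(linkers), new_arcs, linkers


# Hypothetical toy graph: vertices 0 and 1 are inorganic, vertex 2 is organic
toy_arcs = [(0, 1, (1, 0, 0)), (0, 2, (0, 0, 0)), (1, 2, (0, 1, 0))]
n_new, augmented, linkers = augment_inorganic_edges(3, toy_arcs, inorganic={0, 1})
```

The labels along the path through the new vertex sum to the original label, so the periodicity encoded by the quotient graph is preserved.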
MOF, and so as a curiosity, in this work we assembled the hypothetical version of NU-110, as well as new MOFs based on the tritopic version of the LH6 SBU.
We constructed the hypothetical version of NU-110 using the labelled quotient graph of ntt, which possesses vertices of degree 3 and 4. This required an alteration of the SBUs used to assemble the MOF: the 4-c Cu paddlewheel, and a fragmentation of the 6-c LH6 organic ligand into large (LH3) and small (benzene) 3-c SBUs to support the separate 3-c vertices in the net. One challenge in constructing the hypothetical NU-110 structure was to ensure the 3-c organic SBUs were placed at specific nodes in the ntt net such that they combine to form the large (6-c) SBU in the final structure. Knowing that only the small 3-c benzene SBUs form bonds with copper paddlewheels in the crystal structure, we imposed a constraint such that only these SBUs were assigned to 3-c vertices adjacent to 4-c vertices in the labelled quotient graph. The remaining 3-c vertices were then assigned LH3 SBUs. We note that relaxing this constraint will still yield valid (and entirely different) hypothetical MOF materials; however, for the purposes of constructing hypothetical NU-110, it was necessary to judiciously assign 3-c SBUs to appropriate vertices. Comparing the computed physical properties for both experimental and hypothetical NU-110 in columns 3 and 4 of Table 2 shows a slight over-prediction with the hypothetical MOF. We note that there is a slight distortion in the SBUs when the hypothetical and experimental MOFs of NU-110 are superposed, as seen in Fig. 11d, which is likely the cause of the larger values in the reported properties for hypothetical NU-110.
The final two columns of Table 2 demonstrate how the algorithm can predict new MOFs with very high surface areas and void volumes. Both the toz and ith-d hypothetical MOFs were constructed using the exact same SBUs, one of which is a 3-c adaptation of the LH6 SBU found in NU-110. The other SBUs were the Zn4O inorganic SBU seen in Fig. 11a and a 2-c organic SBU large enough to support the extended 3-c SBUs in both the toz and ith-d topologies. Hypothetical MOF-toz was found to have a 1.4-fold increase in calculated surface area over that of the NU-110 MOF. In addition, the pore volume is larger by a factor of approximately 5. Hypothetical MOF-ith-d also has potential to break the gravimetric surface area record; however, the computed properties are slightly lower than those of the SBUs assembled in the toz topology. These structures are provided in the ESI† in the crystallographic information file format.
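
The vertex assignment constraint used for hypothetical NU-110 (benzene SBUs only on 3-c vertices adjacent to 4-c vertices, LH3 on the remaining 3-c vertices) can be sketched as a simple rule over the labelled quotient graph. The function name, the arc representation and the string labels are assumptions for illustration, and the five-vertex graph is a hypothetical toy with one 4-c and four 3-c vertices, not the ntt net.

```python
from collections import Counter


def assign_sbus(arcs):
    """Map each vertex to an SBU label from its degree and its neighbours' degrees."""
    degree = Counter()
    neighbours = {}
    for i, j, _voltage in arcs:
        degree[i] += 1
        degree[j] += 1
        neighbours.setdefault(i, set()).add(j)
        neighbours.setdefault(j, set()).add(i)
    assignment = {}
    for v in sorted(degree):
        if degree[v] == 4:
            assignment[v] = "Cu paddlewheel"
        elif degree[v] == 3 and any(degree[n] == 4 for n in neighbours[v]):
            assignment[v] = "benzene"  # 3-c vertex bonded to a paddlewheel
        elif degree[v] == 3:
            assignment[v] = "LH3"      # remaining 3-c vertices
        else:
            raise ValueError(f"no SBU defined for a {degree[v]}-c vertex")
    return assignment


# Hypothetical toy quotient graph: vertex 0 is 4-c, vertices 1 to 4 are 3-c
toy_arcs = [
    (0, 1, (0, 0, 0)), (0, 1, (1, 0, 0)),
    (0, 2, (0, 0, 0)), (0, 2, (0, 1, 0)),
    (1, 3, (0, 0, 0)), (2, 4, (0, 0, 0)),
    (3, 4, (0, 0, 0)), (3, 4, (0, 0, 1)),
]
assignment = assign_sbus(toy_arcs)
```

Dropping the adjacency test would correspond to relaxing the constraint, which the text notes still yields valid but different hypothetical materials.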
placement of any SBUs in space, which poses an advantage over the exhaustive recursion method presented by Wilmer et al.1 In their method, the underlying topology is encoded in the geometries of the parameters assigned to the connection sites of each SBU, effectively limiting the topological space sampled by the program. Our method, similar to the method of Martin and Haranczyk,12 requires only the SBUs, an indication of where they will form bonds with other SBUs, and a blueprint describing the interconnectivity of the SBUs within a periodic lattice. The major difference between the method presented here and that of Martin and Haranczyk is that we propose using the abstract definition of the net over the barycentric realization. This allows us to manipulate the net to match SBU geometries prior to assembly. Additionally, this method allows for assembly of MOFs based on topologies not described in the RCSR database, allowing for a much wider range of hypothetical structures to sample.
In this work we demonstrate the method on MOFs with 3-dimensional periodicity, but it is applicable to the construction of other classes of materials, such as porous polymer networks51 or covalent organic frameworks with 2-dimensional periodicity.52,53 This program can be used for several applications, including the development of new structures based on the modification of SBUs in existing materials, or the generation of a large database of topologically diverse structures for high throughput screening purposes. Although we have used the methodology to generate structures of MOF-210 and NU-110 that were in remarkable agreement with the experimental crystal structures, the topologies of these MOFs were known and used as input information. Thus, the method is not a general crystal structure prediction algorithm. However, TOBASCCO could be used to generate structures of all possible topologies as part of a crystal structure prediction system that would then also compare free energies of all possible topologies and geometric isomers, as well as factor in the kinetics of assembly in a given thermodynamic and experimental environment.

Acknowledgements
The authors would like to thank the Natural Sciences and Engineering Research Council of Canada, the Canada Research Chairs program, and Carbon Management Canada for funding. We are also grateful for computing resources provided by Canada Foundation for Innovation and Compute Canada. PGB would also like to thank NSERC and Ministry of

Jariwala, C. H. Rycroft, A. S. Bhown, M. W. Deem, M. Haranczyk and B. Smit, Nat. Mater., 2012, 11, 633–641.
4 A. F. Wells, Three Dimensional Nets and Polyhedra, John Wiley & Sons Ltd., New York, 1977.
5 M. O'Keeffe and B. G. Hyde, Philos. Trans. R. Soc., A, 1980, 295, 553–618.
6 E. V. Alexandrov, V. A. Blatov, A. V. Kochetkov and D. M. Proserpio, CrystEngComm, 2011, 13, 3947.
7 M. O'Keeffe, M. A. Peskov, S. J. Ramsden and O. M. Yaghi, Acc. Chem. Res., 2008, 41, 1782–1789.
8 O. Delgado-Friedrichs and M. O'Keeffe, Acta Crystallogr., Sect. A: Found. Crystallogr., 2003, 59, 351–360.
9 M. Fernandez, P. G. Boyd, T. D. Daff, M. Z. Aghaji and T. K. Woo, J. Phys. Chem. Lett., 2014, 5, 3056–3060.
10 E. S. Kadantsev, P. G. Boyd, T. D. Daff and T. K. Woo, J. Phys. Chem. Lett., 2013, 4, 3056–3061.
11 B. J. Sikora, R. Winnegar, D. M. Proserpio and R. Q. Snurr, Microporous Mesoporous Mater., 2014, 186, 207–213.
12 R. L. Martin and M. Haranczyk, Cryst. Growth Des., 2014, 14, 2431–2440.
13 J.-G. Eon, Acta Crystallogr., Sect. A: Found. Crystallogr., 2011, 67, 68–86.
14 O. Delgado-Friedrichs and M. O'Keeffe, J. Solid State Chem., 2005, 178, 2480–2485.
15 S. J. Chung, T. Hahn and W. E. Klee, Acta Crystallogr., Sect. A: Found. Crystallogr., 1984, 40, 42–50.
16 J.-G. Eon, J. Solid State Chem., 1998, 138, 55–65.
17 M. O'Keeffe, M. Eddaoudi, H. Li, T. Reineke and O. M. Yaghi, J. Solid State Chem., 2000, 152, 3–20.
18 M. Eddaoudi, J. Kim, N. L. Rosi, D. T. Vodak, J. Wachter, M. O'Keeffe and O. M. Yaghi, Science, 2002, 295, 469–472.
19 V. A. Blatov, A. P. Shevchenko and D. M. Proserpio, Cryst. Growth Des., 2014, 14, 3576–3586.
20 T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein, Introduction to Algorithms, MIT Press, 2nd edn, 2001.
21 M. T. Goodrich and R. Tamassia, Data Structures and Algorithms in Java, John Wiley & Sons Ltd., 2nd edn, 2001.
22 D. Joyner, M. Van Nguyen and N. Cohen, Algorithmic Graph Theory, 2010.
23 S. Sahni, Data Structures, Algorithms, and Applications in Java, McGraw-Hill, 2000.
24 W. A. Stein, sage Dev. team, http://www.sagemath.org, 2013.
25 M. O'Keeffe and O. M. Yaghi, Chem. Rev., 2012, 112, 675–702.
26 C. Godsil and G. Royle, Algebraic Graph Theory, Springer, New York, 1st edn, 2001.
27 J. Nocedal, Math. Comp., 1980, 35, 773.
28 D. C. Liu and J. Nocedal, Mathematical Programming, 1989, 45, 503–528.
29 P. Kaelo and M. M. Ali, J. Optim. Theory Appl., 2006, 130, 253–264.
30 H. Chun, D. Kim, D. N. Dybtsev and K. Kim, Angew. Chem., Int. Ed., 2004, 43, 971–974.
31 S. Bureekaew and R. Schmid, CrystEngComm, 2013, 15, 1551.
32 S. Bureekaew, V. Balwani, S. Amirjalayer and R. Schmid, CrystEngComm, 2015, 17, 344–352.
42 G. Kresse and J. Furthmüller, Phys. Rev. B: Condens. Matter Mater. Phys., 1996, 54, 11169–11186.
43 C. E. Wilmer, O. K. Farha, Y.-S. Bae, J. T. Hupp and R. Q. Snurr, Energy Environ. Sci., 2012, 5, 9849.
44 O. Delgado-Friedrichs, S. T. Hyde, S. W. Mun, M. O'Keeffe and D. M. Proserpio, Acta Crystallogr., Sect. A: Found. Crystallogr., 2013, 69, 535–542.
45 S. S.-Y. Chui, S. M.-F. Lo, J. P. H. Charmant, A. Guy Orpen and I. D. Williams, Science, 1999, 283, 1148–1150.
46 M. A. Addicoat, D. E. Coupry and T. Heine, J. Phys. Chem. A,