Harewood Mchugh Cms 2007
(1.4)

$$\dot{u}^{(i+\frac{1}{2})} = \dot{u}^{(i-\frac{1}{2})} + \frac{\Delta t^{(i+1)} + \Delta t^{(i)}}{2}\,\ddot{u}^{(i)} \qquad (1.5)$$
where $u$ is the displacement and the superscripts refer to the time increment. The term explicit refers to the fact that the state of the analysis is advanced by assuming constant values for the velocities, $\dot{u}$, and the accelerations, $\ddot{u}$, across half time intervals. The accelerations are computed at the start of the increment by

$$\ddot{u}^{(i)} = M^{-1}\cdot\left(F^{(i)} - I^{(i)}\right) \qquad (1.6)$$

where $F$ is the vector of externally applied forces, $I$ is the vector of internal element forces and $M$ is the lumped mass matrix. As the lumped mass matrix is diagonal, inverting it is trivial, unlike the global stiffness matrix in the implicit solution method. Each time increment is therefore computationally inexpensive to solve.
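As a minimal sketch of how Eqs. (1.5) and (1.6) advance the solution, the Python below steps a lumped-mass system through one increment. The function names and the one-degree-of-freedom spring test are hypothetical illustrations, not part of the original work, and the displacement update is the standard central-difference form, assumed here.

```python
import math


def explicit_step(u, v_half, f_ext, f_int, mass, dt_next, dt_curr):
    """Advance every degree of freedom by one explicit increment."""
    # Eq. (1.6): with a diagonal mass matrix, inversion is a per-entry division.
    acc = [(fe - fi) / m for fe, fi, m in zip(f_ext, f_int, mass)]
    # Eq. (1.5): velocity at the next half increment.
    v_new = [v + 0.5 * (dt_next + dt_curr) * a for v, a in zip(v_half, acc)]
    # Assumed standard central-difference displacement update.
    u_new = [ui + dt_next * vi for ui, vi in zip(u, v_new)]
    return u_new, v_new


def simulate_spring(k=1.0, m=1.0, u0=1.0, dt=1.0e-3, n_steps=1000):
    """Hypothetical one-DOF check: undamped spring released from rest."""
    u = [u0]
    a0 = -k * u0 / m
    v_half = [-0.5 * dt * a0]  # velocity at t = -dt/2 for zero initial velocity
    for _ in range(n_steps):
        f_int = [k * u[0]]  # internal (spring) force
        u, v_half = explicit_step(u, v_half, [0.0], f_int, [m], dt, dt)
    return u[0]


if __name__ == "__main__":
    # The analytical displacement at t = 1 is cos(1) for k = m = u0 = 1.
    print(simulate_spring(), math.cos(1.0))
```

No system of equations is assembled or factorised at any point, which is why each increment is cheap.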
A stability limit determines the size of the time increment:

$$\Delta t \le \frac{2}{\omega_{\max}} \qquad (1.7)$$

where $\omega_{\max}$ is the maximum element eigenvalue. A conservative and practical method of implementing the above inequality is:

$$\Delta t = \min\left(\frac{L_e}{c_d}\right) \qquad (1.8)$$

where $L_e$ is the characteristic element length and $c_d$ is the dilatational wave speed:

$$c_d = \sqrt{\frac{\lambda + 2\mu}{\rho}} \qquad (1.9)$$

$\lambda$ and $\mu$ are the Lamé elastic constants and $\rho$ is the material density. A quasi-static problem that is solved using the explicit method would have much smaller time increments than an equivalent problem solved using the implicit method. Although the incremental solution is easy to obtain using the explicit method, it is not unusual for an analysis to take 100,000 increments to solve. To maintain the efficiency of the analyses it is important to keep the element sizes as regular as possible, so that one small element does not reduce the time increment for the whole model.
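The estimate in Eqs. (1.8) and (1.9) can be sketched in a few lines of Python. The material values (a generic steel with E = 200 GPa, Poisson's ratio 0.3 and density 7800 kg/m^3) and the element lengths are illustrative assumptions, not data from this study, and the function names are hypothetical.

```python
import math


def lame_constants(young, poisson):
    """Lame constants from Young's modulus and Poisson's ratio."""
    lam = young * poisson / ((1.0 + poisson) * (1.0 - 2.0 * poisson))
    mu = young / (2.0 * (1.0 + poisson))
    return lam, mu


def dilatational_wave_speed(lam, mu, rho):
    """Eq. (1.9)."""
    return math.sqrt((lam + 2.0 * mu) / rho)


def stable_time_increment(element_lengths, lam, mu, rho):
    """Eq. (1.8): the smallest element governs the whole model."""
    c_d = dilatational_wave_speed(lam, mu, rho)
    return min(length / c_d for length in element_lengths)


if __name__ == "__main__":
    lam, mu = lame_constants(200.0e9, 0.3)  # Pa, assumed values
    regular = [1.0e-6] * 4                  # m, assumed element lengths
    one_small = [1.0e-6, 1.0e-6, 1.0e-6, 1.0e-7]
    print(stable_time_increment(regular, lam, mu, 7800.0))
    print(stable_time_increment(one_small, lam, mu, 7800.0))
```

In the second mesh a single element ten times smaller than its neighbours cuts the stable increment for every element by the same factor of ten.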
It is often impractical to run a quasi-static analysis using its true time scale as the runtime would be very large. A number of methods can be used to artificially reduce the runtime of the simulation. The first involves simply speeding up the applied deformation or loading rate and the second involves scaling the density of the material in the model. According to Eqs. (1.8) and (1.9), when the density is scaled by a factor $f^2$, the runtime is reduced by a factor $f$. The latter method is preferable as it does not affect the strain-rate-dependent response of viscoplastic/rate-dependent materials.
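The factor of $f$ can be checked directly from Eqs. (1.8) and (1.9). In the short worked step below the primed quantities denote the mass-scaled model and $T$ is the simulated duration; both are symbols introduced here for illustration only.

$$c_d' = \sqrt{\frac{\lambda + 2\mu}{f^{2}\rho}} = \frac{c_d}{f}, \qquad \Delta t' = \min\left(\frac{L_e}{c_d'}\right) = f\,\Delta t, \qquad \frac{T}{\Delta t'} = \frac{1}{f}\,\frac{T}{\Delta t}$$

The number of increments, and hence the runtime for a fixed cost per increment, therefore falls by a factor $f$.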
It is important when performing a quasi-static simulation that the inertial forces do not affect the mechanical response and produce unrealistic dynamic results. To reduce the dynamic effects Kutt et al. [16] recommend that the ratio of the duration of the load to the fundamental natural period of the model be greater than five. It has been shown that by keeping the ratio of kinetic energy to the total internal strain energy at <5%, dynamic effects in the model are negligible [6,14]. This is the criterion for quasi-static behaviour that is employed in this paper.
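Both criteria are simple to check from solver output. The sketch below assumes the kinetic and internal energy histories are available as plain lists sampled at the same instants; the function names and the idea of post-processing in Python are illustrative assumptions, not the procedure used by the authors.

```python
def max_energy_ratio(kinetic, internal):
    """Largest kinetic/internal energy ratio over the sampled history."""
    ratios = [ke / ie for ke, ie in zip(kinetic, internal) if ie > 0.0]
    return max(ratios) if ratios else 0.0


def is_quasi_static(kinetic, internal, limit=0.05):
    """Energy criterion: kinetic energy below 5% of internal energy."""
    return max_energy_ratio(kinetic, internal) < limit


def load_duration_ok(load_duration, natural_period, min_ratio=5.0):
    """Duration criterion: load duration over natural period greater than five."""
    return load_duration / natural_period > min_ratio


if __name__ == "__main__":
    kinetic = [0.0, 0.1, 0.2]      # hypothetical energy samples
    internal = [0.0, 10.0, 20.0]
    print(is_quasi_static(kinetic, internal))
    print(load_duration_ok(1.0, 0.1))
```

Samples with zero internal energy (for example, the undeformed start of the analysis) are skipped so that the ratio is always defined.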
During implicit analyses where the material gives a non-linear stress–strain response, many iterations are usually needed to solve for an increment. This leads to progressively smaller time steps being used and, should the code encounter large non-linearities, convergence may be impossible to achieve in practical terms. As there is no iteration involved in the explicit method, convergence problems are not an issue.
F.J. Harewood, P.E. McHugh / Computational Materials Science 39 (2007) 481–494
The solution time of the implicit solver is proportional to the square of the wavefront size in the global stiffness matrix. This has implications when increasing the size of the model and when running 3D simulations. In the case of the explicit solver there is a linear relationship between the size of the model and the solution time, as dictated by the characteristic element length and the number of elements in the model [9,14,15,29,35–37].
2. Theory
The finite element analyses performed in this study incorporate elastic and plastic constitutive laws in the context of finite deformation kinematics. The elasticity is considered to be isotropic and linear in terms of finite deformation quantities and can be described using the Young's modulus, $E$, and Poisson's ratio, $\nu$. This is a reasonable approach as the elastic strains in the model are very small in comparison to the plastic strains. Plasticity is described using rate-dependent crystal plasticity theory [26].
Crystal plasticity theory is a physically based plasticity theory that represents the deformation of a metal at the microscale. The flow of dislocations in a metallic crystal along slip systems is represented in a continuum framework. Plastic strain is assumed to be solely due to crystallographic dislocation flow. The rate-dependent, viscoplastic single crystal theory as presented in Peirce et al. [26] and Huang [10], and employed in several works such as McHugh and Connolly [20], Savage et al. [31] and McGarry et al. [18], is used. The rate-dependence is implemented in the formulation through a power law that relates the resolved shear stress $\tau^{(\alpha)}$ to the slipping rate $\dot{\gamma}^{(\alpha)}$ on each slip system $\alpha$,

$$\dot{\gamma}^{(\alpha)} = \dot{a}\,\mathrm{sgn}\left(\tau^{(\alpha)}\right)\left|\frac{\tau^{(\alpha)}}{g^{(\alpha)}}\right|^{n} \qquad (2.1)$$

where $\dot{a}$ and $n$ are a reference strain rate and rate sensitivity exponent, respectively, and $g^{(\alpha)}$ is the slip system strain hardness. As $n$ tends to infinity the material reaches the rate-independent limit.
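A short sketch of the power law in Eq. (2.1) makes the role of the exponent concrete. The function name is hypothetical; the default reference rate and exponent are the values quoted later in this section for 316L stainless steel.

```python
import math


def slip_rate(tau, g, a_dot=0.001, n=20):
    """Eq. (2.1): slipping rate on one slip system.

    tau is the resolved shear stress and g the slip system hardness,
    both in the same units; a_dot is the reference strain rate in 1/s.
    """
    return a_dot * math.copysign(1.0, tau) * abs(tau / g) ** n


if __name__ == "__main__":
    # Just below the current hardness, a larger n suppresses slip more strongly.
    for n in (20, 50, 100, 1000):
        print(n, slip_rate(49.0, 50.0, n=n))
```

As the exponent grows, slip below the current hardness is switched off ever more sharply, which is the rate-independent limit described above.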
The slip system strain hardness, $g^{(\alpha)}$, is determined by integration of the following evolution equation

$$\dot{g}^{(\alpha)} = \sum_{\beta=1}^{N} h_{\alpha\beta}\left|\dot{\gamma}^{(\beta)}\right| \qquad (2.2)$$

where $h_{\alpha\beta}$ is the strain hardening modulus; $h_{\alpha\alpha}$ and $h_{\alpha\beta}$ ($\alpha \neq \beta$) are the self and latent hardening moduli, respectively, and the summation ranges over the number of slip systems, $N$. In this work, Taylor isotropic hardening is assumed where the self and latent hardening moduli are considered equal. Quantitatively, slip system strain hardening is specified by the following hardness function as defined by Peirce et al. [26]:

$$g(\gamma_a) = g_0 + (g_\infty - g_0)\tanh\left|\frac{h_0\,\gamma_a}{g_\infty - g_0}\right| \qquad (2.3)$$

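From which one can determine the hardening moduli through differentiation:

$$h_{\alpha\alpha} = h_{\alpha\beta} = h(\gamma_a) = \frac{\mathrm{d}g(\gamma_a)}{\mathrm{d}\gamma_a} = h_0\,\mathrm{sech}^{2}\left|\frac{h_0\,\gamma_a}{g_\infty - g_0}\right| \qquad (2.4)$$
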
In the above expressions the material hardening parameters $g_0$, $g_\infty$ and $h_0$ are the initial hardness of a slip system, the maximum slip system hardness when $\partial g/\partial\gamma = 0$ and the value of $\partial g/\partial\gamma$ when $\gamma = 0$, respectively. These material constants can be derived from the strain hardening part of an experimental tensile stress–strain ($\sigma$–$\varepsilon$) curve for a material. The values used in this study are taken from the single crystal tensile test data for 316L stainless steel presented by Okamoto et al. [25], and as calibrated in [31]:

$$g_0 = 50\ \mathrm{MPa};\quad g_\infty = 330\ \mathrm{MPa};\quad h_0 = 225\ \mathrm{MPa};\quad \dot{a} = 0.001\ \mathrm{s}^{-1}$$

A rate sensitivity parameter, n = 20 is used for all of the
analyses except where stated.
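The hardness function of Eq. (2.3) and its derivative, Eq. (2.4), can be evaluated with the parameter values quoted above. The sketch below uses hypothetical function and constant names, takes the accumulated slip as its only argument, and checks numerically that the modulus is the slope of the hardness curve.

```python
import math

G0 = 50.0      # MPa, initial slip system hardness
G_INF = 330.0  # MPa, saturation (maximum) slip system hardness
H0 = 225.0     # MPa, initial hardening modulus


def hardness(gamma_acc):
    """Eq. (2.3): slip system hardness as a function of accumulated slip."""
    return G0 + (G_INF - G0) * math.tanh(abs(H0 * gamma_acc / (G_INF - G0)))


def hardening_modulus(gamma_acc):
    """Eq. (2.4): derivative of the hardness with respect to accumulated slip."""
    return H0 / math.cosh(H0 * gamma_acc / (G_INF - G0)) ** 2


if __name__ == "__main__":
    for gamma_acc in (0.0, 0.5, 1.0, 5.0):
        print(gamma_acc, hardness(gamma_acc), hardening_modulus(gamma_acc))
```

At zero slip the functions return the initial hardness and the initial modulus; at large slip the hardness saturates and the modulus decays to zero.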
A quantity of importance in the above equations is the accumulated slip, $\gamma_a$. This is a measure of the total crystallographic plastic strain at a point and is defined as follows:

$$\gamma_a = \sum_{\alpha=1}^{N}\int_0^t \left|\dot{\gamma}^{(\alpha)}\right|\,\mathrm{d}t \qquad (2.5)$$

where $t$ is the time and the summation is over all the slip systems at a point.
The material chosen, 316L stainless steel, has a face-centred-cubic (FCC) crystalline structure. For this material the slip systems are defined by the following Miller indices: {111}⟨110⟩, i.e. slip on {111} planes in ⟨110⟩ directions.
3. Development of explicit user material subroutine
The crystal plasticity constitutive theory is not provided as standard in any of the commercially available finite element analysis software. It is therefore necessary to implement the theory in the form of a user-defined stress update algorithm. This is implemented in the finite element code ABAQUS/standard, an implicit solver, by means of a UMAT, as coded in [10]. It was necessary to develop a VUMAT for use with ABAQUS/explicit. Much of the coding involved in the two algorithms is the same but there are several key issues that must be addressed to maintain consistency of results between the two solvers.
These subroutines, written in Fortran, implement the theory in the form of a stress update algorithm that is called at each integration point for every iteration during a finite element simulation [9]. Recalling Eq. (1.1), the stress component must be updated during each iteration and this operation is performed by the UMAT/VUMAT. The stress update is calculated thus:

$$\sigma\left(u^{i}_{t+\Delta t}\right) = \sigma\left(u_t + \Delta u^{i}\right) = \sigma(u_t) + \Delta\sigma^{i} \qquad (3.1)$$

During every iteration of an analysis the finite element solver provides the subroutine with quantities such as the strain and time increments, $\Delta\varepsilon^{i}$ and $\Delta t$, and the increment in stress, $\Delta\sigma^{i}$, is calculated [9].
In this section the development of the crystal plasticity VUMAT for use with ABAQUS/explicit is described. Particular attention is paid to the differences between it and the crystal plasticity UMAT [10]. A schematic of the VUMAT layout is presented in the Appendix.
3.1. System equations
The most important difference between the sets of equations used in the two solvers is the absence of a global stiffness matrix in the explicit solver. When writing a UMAT it is vital that the Jacobian matrix be accurately represented to get correct and efficient finite strain problem solutions. It is not necessary to define any such matrix in the VUMAT interface. Although this makes the writing of the subroutine more straightforward, the choice of elements available to the user is restricted to mainly first order.
3.2. Vectorised interface
The explicit solution process involves a large number of
increments, each of which is easily solved for. Consequently,
the explicit finite element calculation procedure
is well suited to being split up and solved by a number of
processors. With this in mind the VUMAT is constructed with
a vectorised interface. At the beginning of each increment
the stress and state variable data are passed in, in the form
of two-dimensional arrays. Each column in an array con-
tains the information relating to an integration point of
the material. When a simulation is performed using multi-
ple processors the analysis data can be split up into blocks
and solved independently. Thus, vectorisation is preserved
in the writing of the subroutine in order that optimal
processor parallelisation can be achieved.
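The benefit of a blocked interface can be illustrated with a deliberately simple sketch. Here each integration point carries a single stress value and is updated with a one-dimensional linear elastic law; this law, the function names and the data are stand-ins chosen for brevity and are not the crystal plasticity update or the actual VUMAT argument list.

```python
def update_block(stress_old, strain_inc, modulus):
    """Update every integration point in one block independently."""
    return [s + modulus * de for s, de in zip(stress_old, strain_inc)]


def update_in_blocks(stress_old, strain_inc, modulus, block_size):
    """Split the points into blocks, as a multi-processor run might."""
    stress_new = []
    for start in range(0, len(stress_old), block_size):
        stop = start + block_size
        stress_new.extend(
            update_block(stress_old[start:stop], strain_inc[start:stop], modulus)
        )
    return stress_new


if __name__ == "__main__":
    stress_old = [0.0, 10.0, 20.0, 30.0, 40.0]  # MPa, hypothetical
    strain_inc = [1.0e-4] * 5
    whole = update_block(stress_old, strain_inc, 200.0e3)
    split = update_in_blocks(stress_old, strain_inc, 200.0e3, block_size=2)
    print(whole == split)
```

Because no point depends on any other within an increment, the result does not depend on how the points are partitioned, which is what allows the blocks to be distributed across processors.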
3.3. Size of time increment
The initial time increment used in ABAQUS/standard is chosen by the user. Subsequent increments are controlled by an automatic incrementation control. However, when implementing the UMAT it was necessary to tighten this control in order to improve time-stepping accuracy. The variation in the accumulated slip, $\gamma_a$, throughout each iteration is measured. If the rate of increase is excessive then the iteration is aborted and a smaller time increment size is used. This is found to be an efficient method of incrementation control and prevents premature termination of an analysis. The same procedure is not necessary using ABAQUS/explicit as the time increments are sufficiently small.
To determine the size of the initial time increment in ABAQUS/explicit a bogus set of tiny strain increments is passed in to the VUMAT at the start of the analysis. From the stress response of the material, a conservative value for the stable time increment is calculated (Eqs. (1.8) and (1.9)). The finite element solver requires that the material be elastic for this initial check. Due to the fact that the crystal plasticity subroutine is computationally intensive, an elastic stress–strain response must be defined that ensures a relatively small time increment. During this initial time increment calculation stage a material response is defined with the same elastic properties as are used to describe the elasticity in the body of the crystal plasticity subroutine. This ensures that a relatively efficient time increment is employed. If a smaller increment is required a stiffer elastic modulus may be used, although the solution time will be longer. This does not affect the response of the material during the analysis; it is purely for the purpose of time increment calculation.
3.4. Time integration scheme
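The forward gradient time integration scheme that forms part of the stress update algorithm in [10] involves the solution of the following non-linear equation for the incremental slip system shear strains using the Newton–Raphson method:

$$\Delta\gamma^{(\alpha)} = (1-\theta)\,\Delta t\,\dot{\gamma}^{(\alpha)}\Big|_{t} + \theta\,\dot{a}\,\mathrm{sgn}\left(\tau^{(\alpha)}\right)\left|\frac{\tau^{(\alpha)} + \Delta\tau^{(\alpha)}}{g^{(\alpha)} + \Delta g^{(\alpha)}}\right|^{n}_{t+\Delta t}\Delta t \qquad (3.2)$$
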
where $\theta$ ranges from 0 to 1. This is a non-linear implicit equation as the increments of resolved shear stress, $\Delta\tau^{(\alpha)}$, and current strength, $\Delta g^{(\alpha)}$, are functions of $\Delta\gamma^{(\alpha)}$. Solving Eq. (3.2) in this way ensures that the convergence rate during the analysis is high and allows for a relatively large time increment to achieve convergence.
The explicit solver does not require the use of iteration. Time rates of change are assumed to be constant throughout each time increment and a value of $\theta = 0$ is used in Eq. (3.2) such that an incremental quantity is calculated, in a simple Euler fashion, as the product of the rate quantity and the time increment, for example:

$$\Delta\gamma^{(\alpha)} = \dot{\gamma}^{(\alpha)}\,\Delta t \qquad (3.3)$$

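The Euler update of Eq. (3.3) can be sketched for a single slip system by combining it with Eqs. (2.1), (2.2) and (2.4). The reduction to one slip system, the function names and the constant-stress loading are simplifying assumptions made for illustration; the actual subroutine treats all slip systems and the full stress state.

```python
import math


def euler_slip_update(tau, g, gamma_acc, dt, a_dot=0.001, n=20,
                      g0=50.0, g_inf=330.0, h0=225.0):
    """One explicit (forward Euler) update for a single slip system."""
    rate = a_dot * math.copysign(1.0, tau) * abs(tau / g) ** n  # Eq. (2.1)
    d_gamma = rate * dt                                         # Eq. (3.3)
    h = h0 / math.cosh(h0 * gamma_acc / (g_inf - g0)) ** 2      # Eq. (2.4)
    d_g = h * abs(d_gamma)                                      # Eq. (2.2), one system
    return d_gamma, g + d_g, gamma_acc + abs(d_gamma)


def hold_stress(tau, t_end, dt):
    """Hypothetical loading: hold the resolved shear stress constant."""
    g, gamma_acc = 50.0, 0.0
    for _ in range(int(round(t_end / dt))):
        _, g, gamma_acc = euler_slip_update(tau, g, gamma_acc, dt)
    return gamma_acc, g


if __name__ == "__main__":
    print(hold_stress(60.0, 1.0, 1.0e-3))
```

Every quantity on the right-hand side is known at the start of the increment, so no iteration is needed; the price is that the increment must be small enough for the rates to be treated as constant.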
3.5. Material rotation
During elastic deformation of a crystal, the crystal lattice undergoes rotation and distortion. This effect is captured by the vectors that define the slip directions, $s^{*(\alpha)}$, and the normals to the slip planes, $m^{*(\alpha)}$, as the deformation continues. These vectors have components $s_i^{(\alpha)}$ and $m_i^{(\alpha)}$ in the reference Cartesian coordinate system. The incremental value of $s^{*(\alpha)}$ is calculated thus:

$$\Delta s_i^{(\alpha)} = \left\{\Delta\varepsilon_{ij} - \sum_{\beta}\mu_{ij}^{(\beta)}\Delta\gamma^{(\beta)} + \omega_{ij}\Delta t - \sum_{\beta}\vartheta_{ij}^{(\beta)}\Delta\gamma^{(\beta)}\right\} s_j^{(\alpha)} \qquad (3.4)$$

where $\sum_{\beta}\mu_{ij}^{(\beta)}\Delta\gamma^{(\beta)}$ and $\sum_{\beta}\vartheta_{ij}^{(\beta)}\Delta\gamma^{(\beta)}$ are the incremental plastic strain and plastic lattice spin, respectively. The quantities in the brackets equal the sum of the elastic strain and the elastic spin increments. The increment of total crystal rotation is denoted by $\omega_{ij}\Delta t$, where $\omega_{ij}$ is the sum of the rotation on each slip system, $\alpha$, and is calculated by

$$\omega_{ij}^{(\alpha)} = \mathrm{asym}\left(s_i^{(\alpha)} m_j^{(\alpha)}\right) \qquad (3.5)$$

where $\mathrm{asym}(A_{ij})$ denotes the asymmetric part of a tensor $A_{ij}$. The incremental value of the slip normal, $\Delta m^{*(\alpha)}$, is calculated similarly to Eq. (3.4).
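The asymmetric part in Eq. (3.5) is straightforward to form from the slip direction and slip plane normal. The sketch below uses one illustrative FCC slip system, with plane normal (1 1 1) and direction [1 -1 0]; the helper names are hypothetical. The symmetric part is shown as well, on the assumption that $\mu_{ij}^{(\alpha)}$ in Eq. (3.4) is the symmetric part of the same product.

```python
import math


def normalise(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def outer(a, b):
    """Dyadic product a_i b_j."""
    return [[ai * bj for bj in b] for ai in a]


def sym(t):
    """Symmetric part of a square tensor."""
    n = len(t)
    return [[0.5 * (t[i][j] + t[j][i]) for j in range(n)] for i in range(n)]


def asym(t):
    """Asymmetric part of a square tensor, as used in Eq. (3.5)."""
    n = len(t)
    return [[0.5 * (t[i][j] - t[j][i]) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    s = normalise([1.0, -1.0, 0.0])  # slip direction (illustrative)
    m = normalise([1.0, 1.0, 1.0])   # slip plane normal (illustrative)
    for row in asym(outer(s, m)):
        print(row)
```

The two parts sum to the full dyadic product, and the symmetric part is traceless because the slip direction lies in the slip plane.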
An important feature of crystal plasticity theory is that both the lattice stretch and rotation influence the amount of plasticity:

$$L^{p} = \sum_{\alpha=1}^{N}\dot{\gamma}^{(\alpha)}\,s^{(\alpha)}\,m^{(\alpha)} \qquad (3.6)$$

where $L^{p}$ is the plastic velocity gradient in the current state and the summation ranges over all the slip systems, $\alpha$, in the crystal.
The most important issue that the programmer must be aware of is the way in which stress rates and, consequently, rotations are dealt with. In the case of FE simulations using solid continuum elements and a user-defined material, ABAQUS/standard employs the Jaumann stress rate, whereas ABAQUS/explicit employs the Green–Naghdi stress rate [9,11]. The stress rates are defined, respectively, as:

$$\overset{\nabla}{\sigma}{}^{J} = \dot{\sigma} - W\cdot\sigma + \sigma\cdot W \qquad (3.7)$$

and

$$\overset{\nabla}{\sigma}{}^{G} = \dot{\sigma} - \Omega\cdot\sigma + \sigma\cdot\Omega \qquad (3.8)$$

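where $\dot{\sigma}$ is the time derivative of stress, $W$ is a spin rate derived from the velocity gradient

$$W = \mathrm{asym}\left(\frac{\partial v}{\partial x}\right) \qquad (3.9)$$

and $\Omega$ is found from the right-hand polar decomposition of the total deformation gradient, $F$;

$$F = R\cdot U \qquad (3.10)$$

and

$$\Omega = \dot{R}\cdot R^{T} \qquad (3.11)$$
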
In the stress update algorithms these quantities must be
calculated incrementally.
In ABAQUS/standard the material is treated as being based in a fixed global coordinate system. Incremental rotations are passed in to the UMAT at the start of each increment. This array, $dR$, is the amount by which the stress and strain arrays have been rotated between the end of the previous increment and the start of the current one. It is treated thus:

$$\sigma_{t+\Delta t} = dR\cdot\sigma_{t}\cdot dR^{T} + d\sigma \qquad (3.12)$$

where $\sigma_{t+\Delta t}$ is the current Cauchy stress and $d\sigma$ is the stress increment that has been calculated in the previous increment. The value passed in at the start of each increment is the Cauchy stress in the model coordinate system. It is rotated, as shown, at the end of each increment. The $dR$ variable is calculated using the Hughes–Winget algorithm [9]

$$dR = \left(I - \tfrac{1}{2}\Delta\omega\right)^{-1}\cdot\left(I + \tfrac{1}{2}\Delta\omega\right) \qquad (3.13)$$

where $I$ is the identity matrix and $\Delta\omega$ is the anti-symmetric part of the incremental velocity gradient, i.e. the incremental form of $W$ (Eq. (3.9)).

$$\Delta\omega = \mathrm{asym}\left(\frac{\partial\,\Delta u}{\partial x_{t+\Delta t/2}}\right) \qquad (3.14)$$

This value is used in the calculations of $s^{*(\alpha)}$ (Eq. (3.4)) and $m^{*(\alpha)}$ and is calculated from the variable $dR$ according to Eq. (3.13).
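A two-dimensional sketch shows what Eqs. (3.12) and (3.13) do in practice. It is restricted to a 2 x 2 spin increment with a single scalar component so that the inverse can be written in closed form; the function names and the stress values are illustrative assumptions.

```python
import math


def matmul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose2(a):
    """Transpose of a 2 x 2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def hughes_winget_2d(w):
    """Eq. (3.13) for the spin increment [[0, -w], [w, 0]]."""
    h = 0.5 * w
    det = 1.0 + h * h
    inv_minus = [[1.0 / det, -h / det], [h / det, 1.0 / det]]  # (I - dw/2)^-1
    plus = [[1.0, -h], [h, 1.0]]                               # I + dw/2
    return matmul2(inv_minus, plus)


def rotate_stress(d_r, stress, d_stress):
    """Eq. (3.12): rotate the old stress, then add the stress increment."""
    rotated = matmul2(matmul2(d_r, stress), transpose2(d_r))
    return [[rotated[i][j] + d_stress[i][j] for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    d_r = hughes_winget_2d(0.01)
    uniaxial = [[100.0, 0.0], [0.0, 0.0]]  # MPa, hypothetical
    zero = [[0.0, 0.0], [0.0, 0.0]]
    print(rotate_stress(d_r, uniaxial, zero))
```

The resulting $dR$ is exactly orthogonal for any spin increment, so the rotation preserves the stress invariants; the rotation angle is 2 atan(w/2), which approaches w only for small increments.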
In the case of the VUMAT the material is taken to lie on a corotational coordinate system, i.e. the coordinate system rotates with the material. As each time increment is so small, it is assumed that, incrementally, all material rotations are rigid body. Hence for each increment the problem can be viewed as a small strain problem. The rotation used in ABAQUS/explicit is the Green–Naghdi spin rate. The Hughes–Winget algorithm is also used but takes a slightly different form

$$dR = \left(I - \tfrac{1}{2}\Delta\Omega\right)^{-1}\cdot\left(I + \tfrac{1}{2}\Delta\Omega\right) \qquad (3.15)$$

where $\Delta\Omega$ is found from

$$\Delta\Omega = \Delta\dot{R}\cdot R^{T} \qquad (3.16)$$

The rotations used by each code are different. However, these differences are only evident when kinematic hardening is employed. Johnson and Bammann [11] showed that when a kinematic hardening model is used with the Jaumann stress rate, the shear stress exhibits a sinusoidal response beyond the yield point. This is not physically realistic. When the Green–Naghdi stress rate is employed the shear stress is shown to increase monotonically with the shear strain. As the crystal plasticity subroutine does not incorporate kinematic hardening, both methods yield the same results.
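Ultimately, during each time increment of the VUMAT, the increment of stress is calculated thus:

$$\Delta\sigma_{ij} = D^{\mathrm{global}}_{ijkl}\,\Delta\varepsilon_{kl} - \sum_{\alpha}\chi_{ij}^{(\alpha)}\Delta\gamma^{(\alpha)} - \sigma_{ij}\,\Delta\varepsilon_{\mathrm{dil}} \qquad (3.17)$$

where

$$\chi_{ij}^{(\alpha)} = D^{\mathrm{global}}_{ijkl}\,\mu_{kl}^{(\alpha)} + \omega_{ik}^{(\alpha)}\sigma_{jk} + \omega_{jk}^{(\alpha)}\sigma_{ik} \qquad (3.18)$$

$D^{\mathrm{global}}_{ijkl}$ is the 4th order elastic constitutive tensor in the global coordinate system and $\Delta\varepsilon_{\mathrm{dil}}$ is the increment of dilatational strain.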
To investigate the way that material rotation is dealt
with in both user material subroutines a simple analysis
is run. In a two-step simulation a square of material is
tensed uniaxially and in the second step, while maintaining
the axial tension, the material is rotated (Fig. 1). The com-
ponents of the stress are monitored in the ABAQUS output
database (odb) and the same components are also written
directly from the material subroutines. The analysis is car-
ried out for the implicit and explicit solvers.
Two components of stress are plotted in Fig. 2. In the
case of the UMAT the components of stress calculated are
directly transmitted to the odb. This is not the case with
the VUMAT. As the formulation is based in a corotational
coordinate system, rotation of the material is not seen
by the VUMAT and there is no change in the orientation of
any tensors. All components are passed into the VUMAT
unrotated and the necessary rotations are performed
following the stress update. Therefore when developing
the crystal plasticity VUMAT it was assumed that there is
zero incremental rotation of the material and the tensor
dR (Eq. (3.13)) is the identity matrix. The calculated stress
components in the VUMAT and UMAT differ but the odb
results are identical (Fig. 2). It is interesting to note that
Weber et al. [38] found that the Hughes–Winget algorithm
is not absolutely objective. A simultaneous rotation and
simple shearing of material was performed and a non-objective
stress response was generated. However, when
the values of $\Delta t$ are small the performance of the algorithm
is satisfactory.
4. Comparative analyses
A range of 2D and 3D boundary value problems incorporating crystal plasticity are analysed using the implicit and explicit finite element solvers, ABAQUS/standard and ABAQUS/explicit, respectively. The mechanical response is used as a means of comparing the performance of both user material subroutines. As the model has the ability to detect localisations in the material it is also desirable that the contour profiles of the accumulated shear strain, $\gamma$, compare well. To ensure a quasi-static analysis in the case of the explicit method the kinetic energy must be negligible (<5%) compared to the internal energy [6].
As shown in Section 1.2, the system of equations used to
solve for each time increment in the explicit code assumes a
constant acceleration and velocity across time steps. For a
practical time increment to be maintained it is necessary to
apply displacements that follow an amplitude wave. A
smooth step time–displacement relationship is used to
ensure that the nodal accelerations and velocities remain
continuous as the model is being strained.
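A smooth step amplitude of this kind can be sketched as below. The source does not state the exact form, so the quintic polynomial used here, which is commonly adopted for smooth-step amplitudes and has zero first and second derivatives at both ends, is an assumption, as are the function name and its arguments.

```python
def smooth_step(t, t0=0.0, t1=1.0, a0=0.0, a1=1.0):
    """Assumed quintic smooth step from amplitude a0 at t0 to a1 at t1."""
    if t <= t0:
        return a0
    if t >= t1:
        return a1
    xi = (t - t0) / (t1 - t0)
    return a0 + (a1 - a0) * xi ** 3 * (10.0 - 15.0 * xi + 6.0 * xi ** 2)


if __name__ == "__main__":
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(t, smooth_step(t))
```

Because the velocity and acceleration both start and finish at zero, the applied displacement does not introduce a jump in nodal acceleration at the beginning or end of the step.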
All computational simulations were performed using an
SGI 3800, 500 MHz processor, high performance com-
puter, on one processor except in the cases where explicitly
stated as being multiple processor. The analyses were
performed using ABAQUS version 6.3.
4.1. 2D analyses
The 2D geometry used in this study is based on that
presented in Savage et al. [31]. A 25 μm thick wire, with
the granular microstructure explicitly represented, is constructed.
Each grain is idealised as a hexagon of area 92 μm² (Fig. 3).
The material modelled in this study,
316L stainless steel, has a face-centred-cubic (FCC) crystal
structure. The orientations of the FCC crystal axes are
randomly generated and assigned to each grain. Orienta-
tion mismatch among the grains causes stress localisations
and the coalescence of plastic strain due to favourable
grain orientations determines the site of eventual failure.
Fig. 1. Schematic of the analysis to investigate the rotation of the
material.
Fig. 2. Plot of two components of stress during tension and rotation of
the material. There are three outputs for each component: from the
ABAQUS odb, directly from the UMAT, and directly from the VUMAT.
Fig. 3. Model of 316L stainless steel wire with a thickness of 25 μm. The
granular microstructure is idealised as a series of hexagons (highlighted).
The model is meshed with first-order reduced integration
plane strain quadrilateral elements (CPE4R).
4.1.1. Tension
The left-hand edge of the model is constrained in the
horizontal direction and a horizontal displacement is
applied to the right-hand edge. Rigid body motion is pre-
vented by constraining the left-hand bottom corner of the
model in all degrees of freedom. The wire is tensed beyond
the failure strain, which is taken as the strain at the point of
ultimate tensile stress (UTS). The strain is defined as the ratio of the increase in length to the original length, $(l_1 - l_0)/l_0$, and the stress is the ratio of the force applied to the wire to the original cross-sectional area of the wire, $F/A_0$. This produces engineering stress–engineering strain ($\sigma_{eng}$–$\varepsilon_{eng}$) curves and is a convenient method for presenting tensile data.
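The two definitions above, together with the failure criterion based on the ultimate tensile stress, can be sketched as follows. The function names and the sample lengths, forces and dimensions are hypothetical values chosen only to exercise the code.

```python
def engineering_curve(lengths, forces, l0, a0):
    """Engineering strain (l1 - l0)/l0 and engineering stress F/A0."""
    strains = [(length - l0) / l0 for length in lengths]
    stresses = [force / a0 for force in forces]
    return strains, stresses


def failure_point(strains, stresses):
    """Strain and stress at the ultimate tensile stress (maximum stress)."""
    i = max(range(len(stresses)), key=stresses.__getitem__)
    return strains[i], stresses[i]


if __name__ == "__main__":
    lengths = [100.0, 110.0, 120.0, 130.0]  # hypothetical current lengths
    forces = [0.0, 50.0, 60.0, 55.0]        # hypothetical applied forces
    strains, stresses = engineering_curve(lengths, forces, l0=100.0, a0=2.0)
    print(failure_point(strains, stresses))
```

The failure strain is read off at the peak of the engineering stress curve, after which the load-carrying capacity falls as the wire necks.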
The $\sigma_{eng}$–$\varepsilon_{eng}$ response of the two models shows good agreement, with the failure points comparing particularly well (Fig. 4). The distribution of plastic shear strain during necking is also very similar for the two analyses (Fig. 5). The solution time of the implicit method is 56 min as compared to 2 h 11 min for the explicit method (Table 1). The shorter implicit solution time can be explained by the fact that the size of the incremental global stiffness matrix is relatively small due to the size of the model. As the solution time in implicit analyses is proportional to the square of the wavefront of the global stiffness matrix in the model, this simulation is computationally easy to solve.
4.1.2. Bending
The left-hand boundary of the model is constrained in all degrees of freedom. The degrees of freedom of the nodes along the right-hand edge are tied to a reference node and a rotation, about the z-axis (normal to the page), is applied to this node. A bending moment is produced that is constant along the length of the wire. The average curvature along the length of the wire is used as a deformation measure, analogous to $\varepsilon_{eng}$ used in the tension analyses. The macroscopic bending moment–curvature curves for the two FE analyses show the same performance (Fig. 6). The ratio of solution times is similar to that of the 2D tension analyses (Table 1).
Fig. 4. Comparison of macroscopic mechanical performance for 2D tension analyses.
Fig. 5. Accumulated shear strain, $\gamma$, in the 25 μm wire under tension solved using (a) the implicit solver and (b) the explicit solver.
Table 1
Computation times for 2D analyses using implicit (UMAT) and explicit (VUMAT) solvers

| Analysis | Runtime (h, min) | Ratio, explicit/standard | Size of time increment, min/max |
| --- | --- | --- | --- |
| Tension UMAT 2D | 0, 56 | 2.34 | 1.86e-4/0.01 |
| Tension VUMAT 2D | 2, 11 | | 1.37e-5/6.36e-5 |
| Bending UMAT 2D | 1, 53 | 2.39 | 1.85e-4/6.0e-3 |
| Bending VUMAT 2D | 4, 30 | | 4.86e-6/1.2e-5 |
| Contact 1 UMAT 2D | 1, 50 | 1.41 | 9.25e-5/1.2e-2 |
| Contact 1 VUMAT 2D | 2, 35 | | 1.86e-5/3.8e-5 |
| Contact 2 UMAT 2D | 0, 40 (a) | 0.875 | 5.74e-4/3.8e-2 |
| Contact 2 VUMAT 2D | 0, 35 (a) | | 2.57e-5/3.81e-5 |

(a) The solution time at equivalent points in the two analyses.
4.1.3. Contact
The introduction of contact between surfaces increases the computational cost for the implicit solver. Smaller time increments are required to achieve convergence and computational expense is further increased by the formation and inversion of the stiffness matrix. Due to the nature of the explicit FE solver and the small time increments involved, simple algorithms may be implemented to negotiate contact conditions with a minimal loss of computational efficiency [9,15,29,37].
Two types of analyses involving contact are performed. The first analysis simulates three-point bending of the wire and models contact between a rigid body and the stainless steel. A rigid semi-circular cylinder is brought into contact with the wire at the midpoint while the left- and right-hand edges provide the reaction forces and are only free to rotate about the z-axis (Fig. 7). A master–slave contact algorithm is employed for both analyses. Frictionless contact is assumed between the rigid surface and the wire.
It is interesting to note the force–displacement response of the rigid body in both analyses (Fig. 8). In the explicit case when the cylinder comes into contact with the wire there is a vibration that dissipates quickly and subsequently both curves follow the same path. This is a transient effect and does not affect the final state of the model. It is caused by the mass scaling factor employed to reduce the solution time of the analysis. Solution times are much closer for this analysis than for the previous two. The solution times are 1 h 50 min and 2 h 35 min for implicit and explicit, respectively. The presence of contact does not affect the solution process of the explicit code as strongly as the implicit code.
The second contact analysis models the interaction between two deformable materials. A block of material ($E$ = 10 GPa, $\nu$ = 0.3) is modelled as compressing a 25 μm² section of the stainless steel wire. The basis for this analysis is purely to investigate the ability of the implicit and explicit codes to solve a more numerically complicated analysis. The geometry of the model is shown in Fig. 9. The bottom edge of the stainless steel is displaced by 0.01 mm in the vertical direction, while the top edge of the stiff block is held. This ensures contact and compression of the wire. The same contact algorithm as in the previous analysis is implemented.
The implicit solver is unable to complete the simulation. A displacement of $7.27 \times 10^{-3}$ mm is achieved before numerical difficulties prevent further analysis. When the implicit iterative solver encounters a highly non-linear response very small time increments must be employed to solve the equilibrium equations. However, in this case, the solver attempts increasingly small time increments without achieving equilibrium. The minimum time increment for the analysis in Table 1 is given as 5.74e-4. However, in the final time increment, prior to termination of the job, time increments in the order of 1.0e-8 are attempted without success. Severe discontinuities at the surfaces in contact are the cause of termination of the analysis. The explicit solver does not encounter such problems and this analysis is completed. The solution time for the implicit simulation at the point of termination is approximately 40 min. The time taken by the explicit solver to solve to the same point is 35 min while the total solution time is 40 min. Prior to the premature termination of the implicit analysis the force–displacement responses and the contour plots of stress compare very well between both models. It is thus evident that the explicit solver is more suited to this analysis in terms of solution time and ability to complete the analysis.
Fig. 6. Comparison of macroscopic mechanical performance for 2D bending analyses.
Fig. 7. 2D contact analysis. The 25 μm wire is deformed by a rigid cylinder. The left- and right-hand edges are free to rotate.
Fig. 8. Comparison of the force–displacement behaviour of the rigid cylinder for 2D contact analyses.
Fig. 9. Model of contact simulation involving two deformable materials.
4.2. 3D analyses
The model used in the 3D analyses is geometrically simpler in comparison to the 2D analyses, with the grains being represented as cubes. The wire has a 25 μm² section with a length of 75 μm. Each grain is meshed with first-order reduced integration brick elements (C3D8R) (Fig. 10). Similarly to the 2D models, a random 3D crystallographic orientation is assigned to each grain.
The 3D analyses are carried out in order to investigate what effect, if any, a more three-dimensional stress state has on the solution times. Three types of analyses are carried out: tension, contact, and a combination of torsion and bending. The two former loading conditions are implemented in the same way as the 2D analyses.
4.2.1. Torsion and bending
A combined loading of rotation around the x-axis (lon-
gitudinal axis of wire) and rotation around the z-axis is
applied to the right-hand end of the wire. The left-hand
end of the wire is constrained in all degrees of freedom.
The implicit code has a shorter solution time than the explicit code. This can be expected for a simple deformation case. The ratio of explicit solution time to implicit solution time is somewhat larger for the 3D displacement-driven analyses compared with the 2D analyses (Table 2). The increase in model size from 2D to 3D has a greater effect on the explicit solution times. The bandwidth size in the implicit analyses remains relatively small, while the combination of the short explicit time increments and the size of the model significantly increases solution times.
It appears that the implicit solver is the quicker solver
for all three types of 3D model loading but it is important
to comment that in the case of the contact analysis the
explicit solver solves for the initial interaction between
the surfaces more quickly. At time, t = 0.4 the solution
time is 18 h 12 min for the implicit solver while it is 14 h
51 min for the explicit solver at the same point in the loading
history. The contact that follows is not as computationally
difficult to solve. Therefore, the implicit solver
increases the size of the time increments after the initial
contact. This shows that in an analysis where the contact
conditions are more complex, i.e. between two deformable
materials, or are constantly changing, the explicit solver
would prove to be the better option.
4.3. Material strain-rate sensitivity
The material in each of the previous analyses uses a rel-
atively rate-dependent value of n = 20 (Eq. (2.1)). By
increasing the value of n the material becomes more rate-
independent. It is known that a material with an increased
rate sensitivity delays the development of shear bands
[24,26]. An analysis employing a material with low rate sensitivity,
i.e. nearly rate-independent, is computationally
more difficult to solve as higher strain gradients must be
resolved.
A more rate-independent parameter, n, is assigned to the
material and the 2D tension case is again simulated. Values
Fig. 10. 3D crystal plasticity model with a cubic crystalline structure. One grain is highlighted; there are 3 × 3
grains in the cross-section and nine grains along the length.
Table 2
Computation times for 3D analyses using implicit (UMAT) and explicit (VUMAT) solvers
Analysis                      Runtime (h, min)   Ratio, explicit/standard   Size of time increment, min/max
Tension UMAT 3D               27, 30             5.04                       4.52e-4/1.24e-2
Tension VUMAT 3D              138, 43                                       1.06e-5/5.01e-5
Torsion + bending UMAT 3D     40, 03             3.78                       2.37e-4/1.21e-2
Torsion + bending VUMAT 3D    151, 18                                       5.06e-7/5.01e-5
Contact UMAT 3D               42, 32             1.46                       1.88e-4/3.42e-3
Contact VUMAT 3D              62, 05                                        2.12e-6/8.47e-6
490 F.J. Harewood, P.E. McHugh / Computational Materials Science 39 (2007) 481–494
of n = 50 and n = 100 are chosen. The resulting runtimes show that as the material becomes more rate-insensitive
the implicit solver has trouble converging to a final solution (Table 3). The explicit solver is affected to a much
smaller degree. Table 3 shows that there is a significant reduction in the size of the time increments using the
implicit solver when the rate sensitivity exponent is increased, whereas there is little change in the size of the
explicit time increments.
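The different sensitivity of the two solvers to the exponent n can be quantified from the runtimes in Table 3. The sketch below is only an illustration: the runtimes are copied from Table 3, and the function name `runtime_increase_percent` and the choice of n = 20 as the baseline are assumptions made here.

```python
# Runtimes (hours, minutes) copied from Table 3, keyed by the exponent n.
RUNTIMES_2D_TENSION = {
    "UMAT": {20: (0, 56), 50: (1, 24), 100: (1, 42)},
    "VUMAT": {20: (2, 11), 50: (2, 22), 100: (2, 23)},
}


def to_minutes(hours_minutes):
    """Convert an (hours, minutes) pair to total minutes."""
    hours, minutes = hours_minutes
    return 60 * hours + minutes


def runtime_increase_percent(solver, n, baseline_n=20):
    """Percentage increase in runtime relative to the baseline exponent."""
    runtimes = RUNTIMES_2D_TENSION[solver]
    baseline = to_minutes(runtimes[baseline_n])
    return 100.0 * (to_minutes(runtimes[n]) - baseline) / baseline


if __name__ == "__main__":
    for solver in ("UMAT", "VUMAT"):
        for n in (50, 100):
            increase = runtime_increase_percent(solver, n)
            print(f"{solver}, n = {n}: +{increase:.1f}%")
```

For these inputs the implicit (UMAT) runtime grows by 50.0% at n = 50 and by about 82.1% at n = 100, whereas the explicit (VUMAT) runtime grows by only about 8.4% and 9.2%.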
4.4. Processor parallelisation
As stated in Section 3.2, the VUMAT is constructed in a vectorised format; the solver in ABAQUS/explicit is
similarly constructed. This ensures that a high level of efficiency can be achieved when multiple processors are
used. The 2D tension case is solved across multiple processors using both the implicit and explicit solvers. The
simulations are solved using 2 and 4 processors on the SGI 3800 multiple processor computer.
The results of these analyses are presented in Table 4. In order to characterise the efficiency of using multiple
processors the following formula was developed:
\[
\eta_{\mathrm{processor}}\,(\%) = \sqrt{|C|}\;\operatorname{sgn}(C) \times 100 \qquad (4.1)
\]
with
\[
C = \frac{t_1/t_x - 1}{x - 1} \qquad (4.2)
\]
where t is the solution time, with the subscript denoting the number of processors used, and x is the number of
processors used. For example, an efficiency of 100% when using two processors indicates a solution time saving of
50%. A negative efficiency indicates that a longer solution time is required when using multiple processors than
when using one processor.
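As a check on Eqs. (4.1) and (4.2), the efficiency can be evaluated from the runtimes in Table 4. The sketch below assumes that Eq. (4.2) has the form C = (t_1/t_x - 1)/(x - 1), with t_1 the single-processor runtime and t_x the runtime on x processors; this reading is an assumption, adopted because it reproduces the tabulated efficiencies for the cases shown. The function name `processor_efficiency` is hypothetical.

```python
import math


def processor_efficiency(t_1, t_x, x):
    """Signed parallel efficiency in percent, following Eq. (4.1).

    Assumes C = (t_1 / t_x - 1) / (x - 1), where t_1 and t_x are the
    runtimes on one and on x processors (any consistent time unit).
    """
    if x < 2:
        raise ValueError("x must be at least 2 processors")
    c = (t_1 / t_x - 1.0) / (x - 1)
    # sqrt(|C|) carries the sign of C, so a slowdown gives a negative value.
    return math.copysign(math.sqrt(abs(c)), c) * 100.0


if __name__ == "__main__":
    # Runtimes in minutes, converted from the (h, min) entries of Table 4.
    cases = [
        ("VUMAT 2D, 2 processors", 131, 73, 2),
        ("VUMAT 2D, 4 processors", 131, 44, 4),
        ("UMAT 3D, 2 processors", 1650, 1051, 2),
        ("VUMAT 3D, 2 processors", 8323, 4475, 2),
        ("UMAT 2D, 2 processors", 56, 57, 2),
    ]
    for label, t_1, t_x, x in cases:
        print(f"{label}: {processor_efficiency(t_1, t_x, x):.1f}%")
```

With these inputs the sketch prints 89.1%, 81.2%, 75.5%, 92.7% and -13.2%, in agreement with the corresponding entries of Table 4. Halving the runtime on two processors gives C = 1 and therefore an efficiency of 100%, consistent with the example in the text.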
There is no speedup with the implicit solver when using multiple processors for this simple analysis. There is
actually an increase in solution time, as indicated by the negative processor efficiency. This can be explained by
the extra computation time required to assemble the system equations from a number of processors. The efficiency
values for each of the explicit analyses are shown to be quite good (Table 4). When four processors are used to
solve the 2D tension analysis with the explicit solver the solution time is 44 min. This compares favourably to the
quickest solution time of 56 min using the implicit solver.
In the 2D analyses a very high speedup efficiency is achieved when the explicit analysis is run across two
processors. The efficiency is not as high when four processors are used. There is a type of model size dependence
associated with the explicit speedup factors. When using the domain parallelisation setting a larger model can be
split amongst a large number of processors. Each domain requires a significant amount of processor capability to
solve the analysis. If a large number of processors is used to solve a relatively small model the computational
power required of each processor may be quite low, such that two domains may be solved by one processor. The
speedup over using fewer processors is not simply a product of the increase in number of processors.
In order to investigate whether a better speedup can be achieved in a large model the 3D tension case is run across
two processors. The solution time is 17 h 31 min using the implicit solver. This gives an efficiency of 75.5%.
Although this solution time compares favourably to the 3D explicit analyses it is worth noting the high efficiency
in the explicit case (Table 4). Given the large size of the model, further increase of the number of processors
would continue to yield a high efficiency and would likely result in a shorter solution time than the implicit
analysis.
5. Conclusions
A crystal plasticity material subroutine has been developed by the authors for use with the explicit finite element
solver ABAQUS/explicit. The results from a variety of 2D and 3D loading conditions are shown to be the same as those
from the original UMAT. Each solver is shown to be more efficient when solving certain types of simulations. In the
simpler analyses, where deformation is directly applied to the material, the implicit FE solver is shown to solve
more quickly. The ratio of explicit runtime to implicit runtime is approximately 2.35 for the simple 2D analyses.
The ratio for analyses in 3D is approximately double that value. The inclusion of time increment sizes in Tables 1
and 2 reveals the influence of different loading conditions on similar models. It is noteworthy, when comparing the
tension analyses to the bending analyses in Table 1, that the bending
Table 3
Computation times for the 2D tension analysis using different values for the rate sensitivity exponent
Tension analysis   Rate sensitivity exponent, n   Runtime (h, min)   Size of time increment, min/max
UMAT               20                             0, 56              1.89e-3/0.01
UMAT               50                             1, 24              7.05e-5/2.3e-3
UMAT               100                            1, 42              5.06e-5/1.0e-3
VUMAT              20                             2, 11              1.37e-5/6.36e-5
VUMAT              50                             2, 22              1.24e-5/6.36e-5
VUMAT              100                            2, 23              1.22e-5/6.36e-5
Table 4
Computation times and parallelisation efficiency for analyses solved using multiple processors
Tension analysis   # Processors   Runtime (h, min)   Efficiency (%)
UMAT 2D            1              0, 56
UMAT 2D            2              0, 57              -13.2
UMAT 2D            4              1, 02              -20.0
UMAT 3D            1              27, 30
UMAT 3D            2              17, 31             75.5
VUMAT 2D           1              2, 11
VUMAT 2D           2              1, 13              89.1
VUMAT 2D           4              0, 44              81.2
VUMAT 3D           1              138, 43
VUMAT 3D           2              74, 35             92.7
loading has a far greater effect on the explicit analyses than on the implicit analyses. A similar trend is clear in
the 3D analyses from Table 2. The opposite is true of the contact analyses; the introduction of contact has little
effect on the size of the explicit time increments but significantly reduces the size of the implicit increments.
As regards the implicit solution iteration process, for each increment ABAQUS chooses an initial guess,
$u_0^{t+\Delta t}$, assuming an incrementally linear response for the material, based on the tangent stiffness
calculated at the end of the previous increment, $K(u^t_{\mathrm{final}})$. In the case of a linear elastic material
this would provide the correct solution directly. For non-linear problems this should yield a good initial guess
for small increments, $\Delta t$. Increasingly non-linear problems require increasingly small time increments to
maintain solution accuracy. Within the constraints of ABAQUS, in the context of the scope of this paper, few
options are provided for improving the process. In the problems of interest here the non-linearities due to contact
and material response are of such a severity as to require time step sizes far smaller than are practically usable
to generate solutions with the implicit method.
The explicit solver is better suited to deal with complex contact and sliding conditions, particularly in cases of
large element deformation. In the contact analyses that include a rigid body the runtimes using the implicit code
are shorter than those using the explicit code. This disguises the fact that the explicit code actually solves the
contact condition more quickly. The second 2D contact analysis, involving more complex contact conditions and
greater element deformation, provides a clearer indication of the benefits of the explicit solver.
It is interesting to note the increase in solution times
when a more rate-independent material is used in the 2D
tension analyses. The small time increments used in the
explicit solver ensure that the highly non-linear material
behaviour is dealt with, with a minimal increase in time.
This is not the case with the implicit solver where smaller
time increments must be employed to achieve convergence.
When one considers the high multiple processor efficiency that is achieved by the explicit solver compared to the
implicit solver for every situation considered in this study, the former option would prove to be the more
favourable when the user has a multiple processor computer at his/her disposal. Given the ongoing technological
advances, multiple processor computers are becoming more commonly used. Therefore it is important to recognise the
link between the speedup efficiency, the number of processors assigned to solve an analysis, and the size of the
model. The optimal number of processors should be determined for each analysis so that a high parallelisation
efficiency can be maintained.
Acknowledgements
The authors wish to acknowledge funding from
Embark, Irish Research Council for Science, Engineering
and Technology: Funded by the National Development
Plan. The simulations in this work were performed on
the SGI 3800 high performance computer at NUI, Galway.
Appendix
References
[1] L. Anand, M. Kothari, Journal of the Mechanics and Physics of Solids 44 (4) (1996) 525–558.
[2] R.J. Asaro, A. Needleman, Acta Metallurgica 33 (6) (1985) 923–953.
[3] D.J. Bammann, Materials Science and Engineering A 309–310 (2001) 406–410.
[4] R. Becker, Acta Metallurgica et Materialia 39 (6) (1991) 1211–1230.
[5] S.-H. Choi, Acta Materialia 51 (6) (2003) 1775–1788.
[6] W.J. Chung, J.W. Cho, T. Belytschko, Engineering Computations 15 (6) (1998) 750–776.
[7] J.D. Clayton, D.L. McDowell, International Journal of Plasticity 19 (9) (2003) 1401–1444.
[8] M. Grujicic, Y. Zhang, Materials Science and Engineering A265 (1–2) (1999) 285–300.
[9] Hibbitt, Karlsson, and Sorenson, ABAQUS Theory Manual, Pawtucket, RI, USA, 1997.
[10] Y. Huang, Harvard University Report, MECH 178, 1991.
[11] G.C. Johnson, D.J. Bammann, International Journal of Solids and Structures 20 (8) (1984) 725–737.
[12] J. Kaczmarek, International Journal of Plasticity 19 (10) (2003) 1585–1603.
[13] H.-K. Kim, S.-I. Oh, International Journal of Plasticity 19 (8) (2003) 1245–1270.
[14] J. Kim, Y.H. Kang, H.-H. Choi, S.-M. Hwang, B.S. Kang, The International Journal of Advanced Manufacturing Technology 20 (6) (2002) 407–413.
[15] S. Kugener, AMP Journal of Technology 4 (1995) 8–15.
[16] L.M. Kutt, A.B. Pifko, J.A. Nardiello, J.M. Papazian, Computers & Structures 66 (1) (1998) 1–17.
[17] J.W. Kysar, Journal of the Mechanics and Physics of Solids 49 (5) (2001) 1099–1128.
[18] J.P. McGarry, B.P. O'Donnell, P.E. McHugh, J.G. McGarry, Computational Materials Science 31 (3–4) (2004) 421–438.
[19] P.E. McHugh, in: IUTAM Summer School on Mechanics of Microstructured Materials, Udine, Italy, 2003.
[20] P.E. McHugh, P.J. Connolly, Computational Materials Science 27 (4) (2003) 423–436.
[21] F.T. Meissonnier, E.P. Busso, N.P. O'Dowd, International Journal of Plasticity 17 (4) (2001) 601–640.
[22] C. Miehe, J. Schröder, International Journal for Numerical Methods in Engineering 50 (2001) 273–298.
[23] K.W. Neale, K. Inal, P.D. Wu, International Journal of Mechanical Sciences 45 (10) (2003) 1671–1686.
[24] A. Needleman, Computer Methods in Applied Mechanics and Engineering 67 (1) (1988) 69–85.
[25] K. Okamoto, A. Yoshinari, J. Kaneda, Y. Aono, T. Kato, Materials Transactions 41 (7) (2000) 806–814.
[26] D. Peirce, R. Asaro, A. Needleman, Acta Metallurgica et Materialia 31 (12) (1983) 1951–1976.
[27] D. Raabe, Acta Metallurgica et Materialia 43 (3) (1995) 1023–1028.
[28] J.L. Raphanel, C. Teodusiu, L. Tabourot, in: C. Teodusiu, J.L. Raphanel, F. Sidoroff (Eds.), Large Plastic Deformations: Fundamental Aspects and Applications to Metal Forming, A.A. Balkema, Rotterdam, 1993, pp. 153–168.
[29] N. Rebelo, J.C. Nagtegaal, L.M. Taylor, Numerical Methods in Industrial Forming Processes (1992) 99–108.
[30] W. Rust, K. Schweizerhof, Thin-Walled Structures 41 (2–3) (2003) 227–244.
[31] P. Savage, B.P. O'Donnell, P.E. McHugh, B.P. Murphy, D.F. Quinn, Annals of Biomedical Engineering 32 (2) (2004) 202–211.
[32] B.M. Schroeter, D.L. McDowell, International Journal of Plasticity 19 (9) (2003) 1355–1376.
[33] N.J. Sorensen, B.S. Andersen, Computer Methods in Applied Mechanics and Engineering 132 (3–4) (1996) 345–357.
[34] A. Staroselsky, L. Anand, Journal of the Mechanics and Physics of Solids 46 (4) (1998) 671–696.
[35] J.S. Sun, K.H. Lee, H.P. Lee, Journal of Materials Processing Technology 105 (1–2) (2000) 110–118.
[36] L. Taylor, J. Cao, A.P. Karafillis, M.C. Boyce, Journal of Materials Processing Technology 50 (1–4) (1995) 168–179.
[37] S.P. Wang, S. Choudhry, T.B. Wertheimer, White Papers in MARC Corp., 1997.
[38] G.G. Weber, A.M. Lush, A. Zavaliangos, L. Anand, International Journal of Plasticity 6 (6) (1990) 701–744.
[39] A.G. Youtsos, E. Gutierrez, G. Verzeletti, Acta Mechanica 84 (1990) 109–125.