
Determination and Enhancement of the Forming Limit Curve for Sheet Metal Materials using Machine Learning

Bestimmung und Erweiterung der Grenzformänderungskurve für Blechwerkstoffe mittels Maschinellem Lernen

Submitted to the Faculty of Engineering of Friedrich-Alexander-Universität Erlangen-Nürnberg for the degree of Doktor-Ingenieur (Dr.-Ing.)

by Christian Jaremenko
from Wittmund, Germany

Approved as a dissertation by the Faculty of Engineering of Friedrich-Alexander-Universität Erlangen-Nürnberg

Date of the oral examination: 30 July 2020

Chair of the doctoral committee: Prof. Dr.-Ing. habil. Andreas Paul Fröba
Reviewers: Prof. Dr.-Ing. habil. Andreas Maier
Prof. Dr.-Ing. habil. Marion Merklein
Abstract

Future legal standards for European automobiles will require a considerable reduction in CO2 emissions by 2021. Meeting these requirements demands an optimization of the automobiles, comprising technological improvements of the engine and aerodynamics and, even more importantly, weight reductions through the use of light-weight components. The properties of light-weight materials differ considerably from those of conventional materials; it is therefore essential to correctly define the formability of high-strength steels or aluminum alloys. In sheet metal forming, the forming capacity is determined by means of the forming limit curve, which specifies the maximum forming limits of a material. However, current methods are based on heuristics and have the disadvantage that only a very limited portion of the evaluation area is considered. Moreover, the methodology of the industry standard is user-dependent, with correspondingly varying reproducibility of the results. Consequently, a large safety margin from the experimentally determined forming limit curves is required in process design.
This thesis introduces pattern recognition methods for the determination of the forming limit curve. The focus of this work is the development of a methodology that circumvents the previous location, time, user and material dependencies. The dependency on the required a priori knowledge is successively reduced by incrementally improving the proposed methods.
The initial concept proposes a supervised classification approach based on established textural features in combination with a classifier and addresses a four-class problem consisting of homogeneous forming, diffuse and local necking, and the crack class. In particular for the relevant class of local necking, a sensitivity of up to 92% is obtained for high-strength materials. Since a supervised procedure would require expert annotations for each new material, an unsupervised classification method is preferred for determining local necking, so that anomaly detection becomes feasible by means of predefined features. A probabilistic forming limit curve can thus be defined in combination with Gaussian distributions and consideration of the forming progression. To further reduce the necessary prior knowledge, data-driven features are learned using unsupervised deep learning methods. These features are adapted specifically to the forming sequences of the individual materials and are potentially more robust and characteristic than the predefined features. However, it was discovered that the resulting feature space is not well regularized and thus not suitable for unsupervised clustering procedures. Consequently, the last methodology introduces a weakly supervised deep learning approach. For this purpose, several images from the beginning and end of the forming sequences are used to learn optimal features in a supervised setup while regularizing the feature space. Through unsupervised clustering, this facilitates the determination of class membership for individual frames of the forming sequences and the definition of the probabilistic forming limit curve. Moreover, this approach enables a visual examination and interpretation of the actual necking area.
Zusammenfassung

Zukünftige gesetzliche Standards erfordern für europäische Automobile eine erhebliche Reduktion der CO2-Emissionen bis 2021. Um diese Auflagen zu erfüllen, bedarf es einer Optimierung der Fahrzeuge, die sich aus technologischen Verbesserungen des Motors und der Aerodynamik zusammensetzt, oder, viel wichtiger, einer signifikanten Gewichtsreduktion durch den Einsatz von Leichtbau. Die Eigenschaften von Leichtbaumaterialien weichen erheblich von denen herkömmlicher Werkstoffe ab, weshalb für hochfesten Stahl oder Aluminiumlegierungen eine korrekte Definition ihrer Umformfähigkeit erforderlich ist. In der Blechumformung wird die Umformfähigkeit mittels Grenzformänderungskurve bestimmt, welche die maximalen Umformungsgrenzen für ein Material definiert. Alle derzeitigen Methoden basieren auf Heuristiken und haben den Nachteil, dass nur ein sehr geringer Anteil des Auswertungsbereiches berücksichtigt wird. Darüber hinaus ist die Methodik des Industriestandards benutzerabhängig bei gleichzeitig schwankender Reproduzierbarkeit der Ergebnisse. Als Konsequenz kommt in der Prozessauslegung ein großer Sicherheitsabstand von den experimentell bestimmten Grenzformänderungskurven zur Anwendung.
Mit dieser Arbeit werden erstmalig Methoden der Mustererkennung zur Bestimmung der Grenzformänderungskurve eingesetzt. Mittelpunkt der Arbeit ist die Entwicklung einer Methodik, welche die bisherigen Nachteile der Orts-, Zeit-, Benutzer- und Materialabhängigkeiten umgeht. In einer inkrementellen Herangehensweise wird die Abhängigkeit vom benötigten a priori Wissen sukzessive verringert.
Ausgangspunkt ist ein überwachter Klassifikationsansatz basierend auf etablierten Texturmerkmalen im Zusammenspiel mit einem Klassifikator. Dieser löst ein Vierklassenproblem bestehend aus der homogenen Umformung, der diffusen und lokalen Einschnürung sowie der Rissklasse. Insbesondere für die relevante Klasse der lokalen Einschnürung wird für hochfeste Materialien eine Sensitivität von bis zu 92% erreicht. Da eine überwachte Vorgehensweise für jedes neue Material Expertenannotationen benötigen würde, wird eine unüberwachte Klassifikationsmethode zur Bestimmung der lokalen Einschnürung bevorzugt, sodass mit Hilfe von vordefinierten Merkmalen eine Anomalie-Erkennung möglich ist. Hierbei wird die Abweichung von der homogenen Umformungsphase in den Umformsequenzen festgestellt. In Verbindung mit Verteilungsfunktionen und unter Zuhilfenahme des Umformungsfortschrittes kann somit eine probabilistische Grenzformänderungskurve definiert werden. Um das notwendige Vorwissen weiter zu reduzieren, werden mittels unüberwachter Deep-Learning-Verfahren datengetriebene Merkmale gelernt. Diese sind charakteristisch an die jeweiligen Umformsequenzen der einzelnen Materialien angepasst und potentiell robuster als die vordefinierten Merkmale. In diesem Zusammenhang hat sich herausgestellt, dass der resultierende Merkmalsraum für unüberwachte Clusterverfahren nicht geeignet ist. Als Konsequenz wird in der letzten Methodik ein schwach überwachtes Deep-Learning-Verfahren eingeführt. Dieses verwendet nur einen Bruchteil der zur Verfügung stehenden Bilder des Beginns und des Endes der Umformungssequenz, um einen optimalen Merkmalsraum zu lernen. Dies ermöglicht mittels unüberwachter Clusterverfahren neben der Bestimmung von Versagensklassen für einzelne Frames der Umformsequenzen und der probabilistischen Grenzformänderungskurve ebenso eine Abschätzung des tatsächlichen Einschnürungsbereiches.
Acknowledgment

First of all, I would like to thank my supervisor Prof. Dr.-Ing. habil. Andreas Maier
for his belief in me and his support for the successful realization of this complex
interdisciplinary topic. Besides my dissertation, I was also given the opportunity for
personal development and flexibility to work independently on a broad spectrum of
industrial and medical research topics.
I would also like to thank Prof. Dr.-Ing. habil. Marion Merklein for the scientific
supervision of my work. It was a great pleasure for me to jointly develop and implement ideas and elaborations for the collaborative DFG projects.
Special appreciation is also owed to Dr.-Ing. Emanuela Affronti for her many untiring explanations and motivating encounters. Without them, we would never have been able to realize our successful publications and DFG proposals.
Next, I would like to thank my colleagues of the chair, who have always created a
positive and motivating atmosphere that has spurred us to top performance every
day. It was an unforgettable experience that I would not want to miss. Altogether, I enjoyed the open, communicative and supportive environment even beyond professional topics, which was evident in our group and chair events and conferences.
In particular, I would like to thank Christopher Syben, Leonid Mill, Prathmesh
Madhu, Ronak Kosti and Nishant Ravikumar for their interest in my topic and the
endless discussions and helpful suggestions.
For the omnipresent great atmosphere at our office I would like to thank my friends
and colleagues Bastian Bier, Jennifer Maier, Marc Aubreville, Christian Marzahl,
Patrick Mullan, Martin Berger and Yan Xia.
Most important of all are the support, understanding and trust of my family, and of all the people who have accompanied me throughout my life and finally led me to this point. I owe special thanks to Julia, whose loving, understanding and supportive nature made all this work possible in the first place.

Christian Jaremenko
Contents

Chapter 1 Introduction 1
1.1 Motivation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Organization of the Thesis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

Chapter 2 Theoretical Background: Sheet Metal Forming 9


2.1 Fundamentals of Sheet Metal Forming . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1.1 Tensile Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.2 Stress-Strain Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2 Forming Limit Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.1 Factors Influencing the FLC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.2 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2.3 Digital Image Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3 State-of-the-Art . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3.1 Location-Dependent Determination of the FLC . . . . . . . . . . . . . . . . . . 23
2.3.2 Time- and Location-Dependent Determination of the FLC . . . . . . . . . . 23

Chapter 3 Theoretical Background: Machine Learning 27


3.1 Introduction to Pattern Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2.1 Intensity-level Histogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2.2 Homogeneity Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2.3 Local Binary Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.2.4 Histogram of Oriented Gradients . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.3 Classification Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.1 Random Forest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.2 Support Vector Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.3.3 One-Class Support Vector Machine . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.4 Gaussian Mixture Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.3.5 Student's t Mixture Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.4 Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.4.1 Feed-forward Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.4.2 Stochastic Gradient Descent. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.4.3 Back-propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.4.4 Activation Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.4.5 Network Structures and Topologies . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.4.6 Network Architectures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

3.5 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.5.1 Receiver Operating Characteristic . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.5.2 Dice Coefficient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.6 Assessment of the Expert Annotation Quality. . . . . . . . . . . . . . . . . . . . . . . . 59

Chapter 4 Data Acquisition & Materials 61


4.1 Materials & Process Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2 Annotation Guidelines & Failure Stages . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3 Strain Distributions in Dependence of Loading Conditions . . . . . . . . . . . . . . . 68
4.4 Signal Impairments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.5 Software for Expert Annotations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

Chapter 5 Supervised Determination of Forming Limits using Conventional Machine Learning 73
5.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.1.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.1.2 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.1.3 Classification: Random Forest. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.2 Experiments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.3.1 Inter-Rater-Reliability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.3.2 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.3.3 Comparison Experts vs. Machine. . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.4 Discussion and Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

Chapter 6 Unsupervised Determination of Forming Limits using Conventional Machine Learning 91
6.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.1.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.1.2 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.1.3 Classification: One-Class-SVM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.2 Experiments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.3.1 Deterministic FLC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.3.2 Probabilistic FLC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.3.3 Comparison with Time-Dependent Evaluation Method . . . . . . . . . . . . . 100
6.3.4 Comparison of Deterministic and Probabilistic FLC . . . . . . . . . . . . . . . 104
6.3.5 Comparison with Metallography . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.3.6 Factors of Influence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.4 Conclusion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

Chapter 7 Unsupervised Determination of Forming Limits using Deep Learning 111
7.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7.2 Experiments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
7.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

Chapter 8 Weakly Supervised Determination of Forming Limits using Deep Learning 117
8.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
8.1.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
8.1.2 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
8.1.3 Supervised Siamese Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
8.1.4 Clustering using Student's t Mixture Models . . . . . . . . . . . . . . . . . . . . 121
8.2 Experiments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
8.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.3.1 Comparison of Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.3.2 Comparison with State-of-the-Art . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
8.3.3 Metallography – Strain Path Quantification . . . . . . . . . . . . . . . . . . . . 129
8.3.4 Interpretation of Weakly Supervised Network Activations . . . . . . . . . . . 131
8.3.5 Comparison with Supervised Network Activations . . . . . . . . . . . . . . . . 135
8.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
8.5 Conclusion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

Chapter 9 Weakly Supervised Approximation of the Localized Necking Region 143
9.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
9.1.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
9.1.2 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
9.2 Experiments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
9.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
9.4 Conclusion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152

Chapter 10 Outlook 153

Chapter 11 Summary 157

List of Acronyms 163

List of Symbols 167

List of Figures 173

List of Tables 177

Bibliography 179

CHAPTER 1

Introduction
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Organization of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

In the context of stricter legal regulations on emissions, the automotive industry sees itself obliged to introduce innovations. By 2021, new European automobiles must comply with the new CO2 emission limit of 95 g/km, corresponding to a 37% reduction relative to the former limit of 130 g/km in 2015 [1]. In order to meet the new limit values, the optimization of vehicles is necessary. In this context, a more efficient combustion process of the engine or lower air resistance through optimized aerodynamics may support efficiency, although the main contribution to reducing emissions is made by the use of light-weight construction. According to the German Federal Ministry of Economics and Energy, reducing the mass of a vehicle with a combustion engine by 100 kg improves its fuel consumption by 0.5 l/100 km. Even more drastically, reducing the mass of an Airbus A320 airplane by 100 kg would decrease its kerosene consumption by 10 000 l/year. As a result, industries are turning towards light-weight construction, as it potentially reduces material consumption [2]. In the automotive industry, especially the Body-in-White process (the creation of the vehicle body) contributes significantly to the weight-saving potential, by about 50%. The key to light-weight construction in general is the selection of the appropriate material for each component in the vehicle [3]. Hence, the automotive industry has an increasing interest in light-weight construction, and the complex demands on design and efficiency further encourage the use of new materials with very specific properties. These properties usually deviate considerably from those of conventional materials such as ductile steels. The complexity of modern materials, such as high-strength steels or aluminum alloys, therefore requires an optimized process design by means of Finite Element Analysis (FEA) and, more importantly, a correct definition of their forming capacity. In sheet metal forming, this forming capacity is determined using the Forming Limit Curve (FLC). The FLC visualizes the limit strains in terms of major and minor strain pairs for different loading conditions and therefore denotes the maximum endurable loading conditions, or forming limits, of a material, up to which defect-free components can be produced.
[1] https://www.vda.de/de/themen/umwelt-und-klima/co2-regulierung-bei-pkw-und-leichten-nfz/co2-regulierung-bei-pkw-und-leichten-nutzfahrzeugen.html — last accessed 20.03.2019
[2] https://www.bmwi.de/Redaktion/DE/Dossier/leichtbau.html — last accessed 21.03.2019
[3] https://www.autoform.com/de/glossar/leichtbau/ — last accessed 21.03.2019


1.1 Motivation
In general, the forming process of sheet metal is separated into two phases: a homogeneous forming phase, in which the strain is evenly distributed over the entire sheet, and an inhomogeneous forming phase, in which the strain begins to concentrate on a small area, the actual forming limit, prior to failure of the material. Preceding this localization of strain, the onset of localized necking, a diffuse concentration of strain occurs that affects a larger area. The baseline method to determine the FLC is standardized in DIN EN ISO 12004-2 [DIN 08] and is mainly used for ductile materials. Especially for brittle materials without a pronounced constriction phase, the results are difficult to reproduce and exhibit a high standard deviation [Merk 17]. The standardized test procedure is based on an experimental setup consisting of a blank holder and a die, between which the sheet metal is clamped. The forming is carried out up to fracture using a hemispherical or a flat punch, according to Nakajima [Naka 68] or Marciniak [Marc 65], respectively. Until the end of the 1990s, the forming limit was determined by applying a circular mark to the specimen prior to deformation and measuring the resulting ellipse after deformation. Nowadays, the specimens are prepared with a stochastic speckle pattern and the forming procedure is recorded using a stereo-camera setup. The strains are computed using the Digital Image Correlation (DIC) [Bruc 89] technique; hence, a fine-grained determination of the strain distribution on the surface of the material is achieved, as well as an assessment of the strain development over time. Current methods, such as the time-dependent method by Volk and Hora [Volk 11] or the cross-correlation method by Merklein et al. [Merk 10], exploit the forming history and determine the onset of necking as a sudden decrease of sheet thickness. However, these methods only evaluate heuristically predefined, very small areas of the available strain distributions, which limits their focus and adversely affects the overall evaluation. Additionally, the result of the industry standard is susceptible to user dependency and difficult to reproduce. All mentioned methods determine the forming limit for individual loading conditions from three repeated forming operations and the average of their principal strains, and consequently lack a measure of certainty.
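The time-dependent idea sketched above, detecting the onset of necking as a sudden change in the thinning behavior, can be illustrated with a toy example. This is not the implementation of Volk and Hora [Volk 11]; it is only a minimal sketch on synthetic data that fits two lines to a thinning-rate curve for every candidate breakpoint and keeps the split with the smallest residual:

```python
import numpy as np

def onset_of_necking(t, thinning_rate):
    """Toy two-line fit: return the index of the breakpoint that best splits
    the thinning-rate curve into two linear segments (stable vs. necking)."""
    best_idx, best_err = None, np.inf
    for i in range(2, len(t) - 2):  # at least two points per segment
        err = 0.0
        for seg in (slice(None, i), slice(i, None)):
            coeffs = np.polyfit(t[seg], thinning_rate[seg], 1)
            err += np.sum((np.polyval(coeffs, t[seg]) - thinning_rate[seg]) ** 2)
        if err < best_err:
            best_idx, best_err = i, err
    return best_idx

# Synthetic thinning-rate signal: flat during stable forming, then a
# sharp linear increase once localization begins at t = 3.0 s.
t = np.linspace(0.0, 4.0, 81)
rate = np.where(t < 3.0, 0.01, 0.01 + 0.5 * (t - 3.0))
idx = onset_of_necking(t, rate)
print(f"estimated onset of necking at t = {t[idx]:.2f} s")
```

The actual methods operate on measured strain distributions from DIC and use more elaborate criteria; the sketch only conveys why such breakpoint heuristics depend on the evaluated area and signal quality.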
The aim of this thesis is to introduce machine learning methods into the research area of sheet metal forming. In particular, it develops a robust evaluation method that improves the determination of the FLC by exploiting image information, thus establishing temporal and spatial independence. The thesis specifically focuses on introducing a measure of certainty for the determined forming limits, which may reduce the safety margins required in process design. This potentially expands the range of applications of materials and in turn increases material savings. Another key aspect is transferability from one material to another, such that the method is independent of material characteristics and generalizes to new, unseen data.

1.2 Contributions
In the course of this thesis, several fundamental contributions to the determination of the FLC based on pattern recognition methods have been made. Multiple studies have been published that take different approaches to detecting the onset of localized necking, which is critical for the process design of manufacturing pipelines. The presented concepts differ methodically and, in particular, solve the problem by arranging the data in different ways. With every publication, the prior knowledge necessary to create the FLC is reduced. In the following, the main achievements are introduced with accompanying references to the literature.

Analysis of Forming Limits in Sheet Metal Forming with Pattern Recognition Methods. Part 1: Characterization of the Onset of Necking and Expert Evaluation
Since we are the first to determine the FLC based on image classification, the initial method focuses on identifying appropriate handcrafted textural image features that support the use of a supervised classification approach and reflect the current understanding of the necking phenomenon of materials during forming. A database comprising multiple materials and forming sequences was created and annotated by experts using four classes: homogeneous, diffuse necking, localized necking and crack, which serve as ground truth in the classification process. Figure 1.1 schematically visualizes, for three sequences, the data imbalance between the different classes, which is one of the major problems in forming processes, since the critical localized necking is underrepresented. The quality of the expert annotations is statistically evaluated and compared to the results of a supervised classification algorithm, with special emphasis on the process parameters of the experimental forming setup. The methods are detailed in Chapter 5 and were presented in one conference publication and one shared first authorship journal publication.
[Figure: three schematic sequences (seq1–seq3), each divided into a homogeneous and an inhomogeneous part and labeled with the classes homogeneous, diffuse necking, localized necking and crack.]

Figure 1.1: Schematic data structure and class distribution for three sequences in the supervised classification approach. The data is split into failure classes according to the expert-assigned annotations.
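The supervised four-class setup can be sketched in a few lines. The snippet below is only an illustration on synthetic data, not the thesis pipeline: a normalized intensity-level histogram stands in for the textural features of Chapter 5, and a random forest (one of the classifiers discussed in Chapter 3) separates the four failure classes:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def histogram_feature(img, bins=16):
    """Simple stand-in for the textural features of Chapter 5:
    a normalized intensity-level histogram of a strain image."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()

# Synthetic "strain images" for the four classes: later failure stages
# concentrate strain in a narrow band, which shifts the histogram.
X, y = [], []
for label in range(4):  # homogeneous, diffuse, localized necking, crack
    for _ in range(40):
        img = rng.uniform(0.1, 0.3, size=(32, 32))
        img[:, 14:18] += 0.15 * label  # localization grows with the stage
        X.append(histogram_feature(np.clip(img, 0.0, 1.0)))
        y.append(label)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, np.array(X), np.array(y), cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")
```

On real forming sequences, the strong class imbalance shown in Figure 1.1 makes this considerably harder than the balanced toy data suggests.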

[Jare 17] C. Jaremenko, X. Huang, E. Affronti, M. Merklein, and A. Maier. "Sheet Metal Forming Limits as Classification Problem". In: Proceedings of the IAPR International Conference on Machine Vision Applications (MVA), pp. 100–103, IEEE, 2017.

[Affr 18] E. Affronti, C. Jaremenko, M. Merklein, and A. Maier. "Analysis of Forming Limits in Sheet Metal Forming with Pattern Recognition Methods. Part 1: Characterization of Onset of Necking and Expert Evaluation". Materials, Vol. 11, No. 9, 2018.

Analysis of Forming Limits in Sheet Metal Forming with Pattern Recognition Methods. Part 2: Unsupervised Methodology and Application
An unsupervised anomaly detection strategy is proposed that uses only data from the homogeneous forming phase to train a classifier and find the optimal hyperplane in feature space. Again, handcrafted features are employed to assess the status of the forming process; within the remaining parts of the forming sequences, the onset of localized necking is identified as a deviation from the homogeneous class. To facilitate the processing, the crack class is removed from the data, and the video sequences are aligned at the end of the localized necking class (cf. Figure 1.2) and made equal in length by removing parts of the homogeneous forming phase. A mixture model approximates two distributions, corresponding to the homogeneous and the inhomogeneous forming phase, using the confidence scores of the trained classifier together with the normalized time of the forming sequences. This enables the lookup of strain values and supports the generation of multiple FLC candidates with different likelihoods, thereby introducing the probabilistic FLC.
The approach is described in Chapter 6 and was published as a shared first authorship journal publication.
[Figure: three schematic sequences of normalized length; only the homogeneous part of each sequence is used for training, the remainder (diffuse and localized necking) serves as test data.]

Figure 1.2: Schematic data structure for the anomaly detection method. The classifier is trained using exclusively the data from the homogeneous forming phase.

[Jare 18] C. Jaremenko, E. Affronti, A. Maier, and M. Merklein. "Analysis of Forming Limits in Sheet Metal Forming with Pattern Recognition Methods. Part 2: Unsupervised Methodology and Application". Materials, Vol. 11, No. 10, 2018.
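The anomaly detection idea can be sketched with a one-class support vector machine, one of the classifiers covered in Chapter 3. The features and drift model below are synthetic stand-ins, not the thesis implementation: the classifier is fitted only on homogeneous-phase frames, and later frames are flagged by their decision score.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)

# Stand-in per-frame feature vectors: the homogeneous phase scatters
# around a common mean, while the inhomogeneous phase drifts away from
# it over time (in the thesis, features are computed from strain images).
homogeneous = rng.normal(0.0, 1.0, size=(80, 8))
drift = np.linspace(0.0, 4.0, 40)[:, None]
inhomogeneous = rng.normal(0.0, 1.0, size=(40, 8)) + drift

# Train exclusively on the homogeneous forming phase (cf. Figure 1.2).
ocsvm = OneClassSVM(nu=0.05, gamma="scale").fit(homogeneous)

# Score every frame of the sequence; negative scores mark anomalies.
scores = ocsvm.decision_function(np.vstack([homogeneous, inhomogeneous]))
flagged = scores < 0
print("anomalous frames in the inhomogeneous phase:",
      int(flagged[80:].sum()), "of 40")
```

In the thesis, such confidence scores over the normalized forming time are then modeled with a mixture of distributions to derive FLC candidates with different likelihoods, i.e., the probabilistic FLC.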

Determination of Forming Limits in Sheet Metal Forming using Deep Learning
A weakly supervised classification approach for the detection of the onset of localized necking is proposed, combining elements of supervised and unsupervised classification strategies. Major differences to the previous contributions are the independence from handcrafted features, due to the use of a Convolutional Neural Network (CNN), as well as time and location independence. Overall, the approach comprises two steps: (1) a supervised, data-driven feature learning step using a Siamese CNN to generate optimal feature representations of the forming sequences; only the extreme cases of the forming sequences, the beginning of the homogeneous and the end of the inhomogeneous forming phase, are employed (cf. Figure 1.3), leading to an optimal separation of the two classes in feature space; (2) an unsupervised clustering step using mixture models to group the remaining frames of each sequence as belonging to the homogeneous, transition or inhomogeneous forming phase. The location and time independence of the proposed method enables the on-line identification of a distinct time point for the onset of localized necking and could be used for the supervision of forming processes. Furthermore, for the first time, it is possible to relate microscopic findings of prematurely stopped forming processes and their metallographic investigations with macroscopic frames of the corresponding forming video sequences.
Additionally, this procedure was extended to include a segmentation approach, so that besides the temporal determination of the onset of necking, a spatial approximation of the actual necking region is provided. Both methods are described in Chapter 8 and Chapter 9 and were published as first authorship journal publications.

Figure 1.3: Schematic data structure for the weakly supervised approach. Only the
extreme and certain cases of the homogeneous and inhomogeneous forming phase are
used for learning.

[Jare 19]  C. Jaremenko, N. Ravikumar, E. Affronti, A. Maier, and M. Merklein. “Determination of Forming Limits in Sheet Metal Forming using Deep Learning”. Materials, Vol. 12, No. 4, 2019

[Jare 20]  C. Jaremenko, E. Affronti, M. Merklein, and A. Maier. “Temporal and Spatial Detection of the Onset of Local Necking and Assessment of its Growth Behavior”. Materials, Vol. 13, No. 11, 2020

1.3 Organization of the Thesis


The organization of the thesis is schematically visualized in Figure 1.4. The thesis
starts with a short motivation and general overview of the topic in Chapter 1, fol-
lowed by the main contributions with a coarse outline and the main differences of
the individual proposed evaluation strategies. In order to facilitate the understanding
of the proposed methods and why certain decisions were made, the necessary
fundamentals from the field of mechanical engineering are introduced in Chapter 2.
It elaborates on strain and stress and their relationship to continuum mechanics
and introduces the fundamental behavior of specimen under loading conditions by
means of the stress-strain diagram, whose comprehension is required to understand
the different characteristic material parameters. Additionally, the DIC technique to
measure strain distributions during forming, and the state-of-the-art methods to de-
termine the onset of localized necking are presented in detail, as their results serve
as baseline throughout the rest of the thesis. Machine learning and the differences
to Deep Learning (DL) are described in accordance with the pattern recognition
pipeline in Chapter 3. General well-known concepts, algorithms and classifiers of
conventional machine learning are presented, followed by the theory of DL with feed
forward neural networks, their components and optimization algorithms. Further-
more, CNNs are derived and multiple well-established architectures are outlined.
Chapter 4 presents the data acquisition strategy along with the different evaluated
materials and their characteristic metrics. Furthermore, the strain distributions for
different loading conditions and possible signal impairments are demonstrated. The
aforementioned contributions and realized strategies are evaluated and discussed in
Chapters 5-9. In Chapter 5, a supervised conventional machine learning approach is
proposed, and an evaluation of the annotation quality of the experts is conducted.
Chapter 6 proposes an unsupervised conventional pattern recognition approach that
uses handcrafted features and requires no expert annotations or knowledge. This
approach is extended to a purely data-driven deep learning method without any nec-
essary prior knowledge in Chapter 7. Several findings made it inevitable to reduce this
approach to a data-driven, weakly supervised method, using some prior knowledge
as elaborated on in Chapter 8. Up to this point, the focus of the work concentrates
on the precise temporal determination of the onset of necking without explicit ap-
proximation of the necking area. However, minor changes in the weakly supervised
methodology enable a segmentation of the relevant region, so that an approach for
the spatial determination of the necking region is presented in Chapter 9. Finally, an
outlook is provided in Chapter 10, which points out possible future research directions.
The thesis is concluded with a summary in Chapter 11.
Figure 1.4: Schematic outline of the thesis.


CHAPTER 2

Theoretical Background: Sheet Metal Forming

2.1 Fundamentals of Sheet Metal Forming
2.2 Forming Limit Diagram
2.3 State-of-the-Art

This chapter covers the essential fundamentals of forming technology in the area
of sheet metal forming, which are necessary for a comprehensive understanding of
the thesis and the proposed methods for the determination of the FLC. Fundamental
concepts of metal forming are introduced alongside with elementary methods for the
characterization of materials and their forming behavior associated with material
descriptive metrics. A general description of the FLC, the necessary experimental
setup for the determination and influencing factors are outlined in Section 2.2. The
chapter concludes with the state-of-the-art methods in Section 2.3, which are considered
as baseline for comparison throughout the thesis. The presented concepts are common
knowledge in the area of sheet metal forming and covered by diverse books of the
field [Lang 85, Sieg 15, Alta 12, Bana 10].

2.1 Fundamentals of Sheet Metal Forming


Forming is the intentional modification of the shape, the surface, and the properties
of a metallic body while maintaining the mass and material cohesion, as defined in
DIN 8580 [DIN 03a]. In general, forming processes are categorized according to the
structure of the semi-finished products to be formed. For this reason, one distin-
guishes two categories, sheet metal forming (up to 10 mm sheet metal thickness) and
bulk metal forming. If considering the underlying temperature, a differentiation into
cold and hot forming is possible as well [Doeg 10]. According to DIN 8582 [DIN 03b],
the varying forming processes are classified into multiple categories according to the
predominant loading conditions or stresses, such as tensile-, tensile-compressive- and
pressure forming, as well as forming by bending or under shearing conditions. To
manufacture a component with the desired shape and properties, a suitable forming
process needs to be designed that is dependent on multiple process parameters and
constraints, such as the metal forming machine tool, the lubricant, or the ambient
medium. Particularly important are the specific material characteristics, e. g., the
microstructure, the geometry, the surface, together with the technological character-
istics of the workpiece material such as yield strength or tensile strength [Grot 18].


These parameters are subdivided into input, process, and output parameters, where
the input parameters are valid at the beginning of the forming process and the output
parameters are valid immediately after the end of the forming process [Sieg 15].
This complexity of materials and structures requires an optimized process design by
means of FEA to realize the individual specifications and, consequently, a correct
determination of the material forming limits. These limits must not be exceeded in
simulations, as exceeding them is expected to result in defective parts. In order to
understand the effects that occur during the forming of metallic materials and to develop a
new method for the determination of sheet metal forming limits, it is essential to have
a fundamental knowledge of the material behavior under loading conditions. Effects
that occur on the microscopic scale or affect the microstructure are not taken into
consideration, since the thesis focuses on the forming behavior of different materials
on a macroscopic scale. The material category depends on the material composition,
e. g. deep drawing steel or aluminum alloy, whereas the forming behavior and ability
are described using multiple technological characteristics that influence the process
design.

2.1.1 Tensile Test


The tensile test is an experimental setup, as defined in DIN 50125 [DIN 16], that
enables the characterization of materials in terms of technical characteristics with high
reproducibility. A uniaxial tensile test uses standardized specimens as visualized
in Figure 2.1, where l0 and l define the initial and current length, w0 and
w the initial and current width, and t0 and t the initial and current
thickness of the object. The specimen is held by fixtures at both sides without influencing the
forming process, and an extensometer measures the elongation of the gage length
while applying a tensile load or force F [Alta 12]. After removal of F and if F was
smaller than a certain threshold, the specimen will return to its original shape. This
effect is defined as an elastic deformation. When F exceeds the threshold, the forming
process results in an irreversible, plastic deformation. As a result, the specimen will
remain extended after the removal of F . This leads to the definition of engineering
stress σe and engineering strain e according to [Alta 12, p. 28]:
σe = F / A0 ,   where A0 = w0 · t0                                        (2.1)

de = dl / l0 ,   yielding   e = ∫_{l0}^{l} dl / l0 = (l − l0) / l0 = ∆l / l0     (2.2)
One drawback of the engineering strain e is that it cannot be used to correctly describe
extensions for large plastic strains or compression and additionally, it is not suitable

Figure 2.1: Schematic tensile test specimen at the beginning (left) and during
forming (right). Adapted from [Sieg 15].

for sequences of deformation. When referring to the current length l instead of the
initial length l0 , the infinitesimal strain can be written as:
dε = dl / l                                                               (2.3)
Integration over the stretching period yields the true strain ε, which resolves the
limitations of engineering strain:

ε = ∫_{l0}^{l} dl / l = ln(l / l0)                                        (2.4)
The true strain ε can be related to the engineering strain:

ε = ln(l / l0) = ln((l0 + ∆l) / l0) = ln(1 + ∆l / l0) = ln(1 + e)         (2.5)
True stress σt can be related to engineering stress σe in the same way:

σt = σe (1 + e) (2.6)
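The conversions in Equations 2.5 and 2.6 translate directly into code. The following is an illustrative sketch with made-up values, not measured data:

```python
import math

def true_strain(e):
    """True strain from engineering strain (Equation 2.5): eps = ln(1 + e)."""
    return math.log(1.0 + e)

def true_stress(sigma_e, e):
    """True stress from engineering stress (Equation 2.6): sigma_t = sigma_e * (1 + e).

    Only valid while the deformation is uniform, i.e. before necking starts.
    """
    return sigma_e * (1.0 + e)

# A specimen stretched from l0 = 50 mm to l = 55 mm at 300 MPa engineering stress
e = (55.0 - 50.0) / 50.0          # engineering strain, Equation 2.2
print(true_strain(e))             # ln(1.1) ≈ 0.0953
print(true_stress(300.0, e))      # 330.0 MPa
```

Note that both conversions rest on the uniform-deformation assumption, which is exactly what breaks down at the onset of necking discussed below.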

The effect of an applied tensile load on the specimen is visualized in Figure 2.2. With
increasing force, the strain increases and the specimen changes its shape, uniformly,
passing the elastic forming region at εe until reaching the ultimate strain εu within
the plastic forming region. When additional force is applied to the specimen, the
material composite dissolves and a necking effect, a local reduction in width, becomes
visible, with subsequent rupture of the specimen at εr. This deformation process can
be better expressed with the stress-strain diagram, which is used to explain several
basic mechanical material properties employed for design purposes.
Figure 2.2: Schematic development of a tensile specimen during forming. Elonga-
tion first develops homogeneously, until reaching ultimate strength at εu . Additional
forming introduces inhomogeneity such that only a limited area contributes to further
elongation, the necking, followed by rupture at εr . Adapted from [Alta 12].

Figure 2.3: Schematic stress-strain diagram. In the beginning, linear elastic deformation
according to Hooke's law takes place. The yield strength defines the elastic
limit, the theoretical minimum value with measurable permanent deformation. With
further forming, a maximum value is reached, which is denoted as ultimate strength,
after which necking starts, followed by rupture. Adapted from [Alta 12].

2.1.2 Stress-Strain Diagram


An exemplary engineering stress-strain curve is visualized in Figure 2.3. It describes
the relationship between stress and strain, dependent on the applied load, and depicts
the deformation of a specimen, as shown in Figure 2.2. At first, in the linear elastic
phase, the curve increases linearly following Hooke's law. This law states that
stress is proportional to strain, with the slope denoted as Young's
modulus E, a material-dependent parameter of elasticity [Sieg 15]:

σe = Ee (2.7)

If the proportional stress at the end of the linear elastic behavior is exceeded, the
curve becomes non-linear and the plastic deformation or hardening region begins. This
means that, starting from this point, the microstructure changes permanently, as atoms
are moved to new equilibrium positions. This behavior is material dependent; if the
internal microstructure blocks the dislocation motion, the material becomes increasingly
brittle, resulting in a longer linearly behaving stress-strain curve that is eventually
terminated by fracture without significant plastic deformation. If there is no pronounced
yield point at the transition to the hardening region, an offset of 0.2 % is used
to define the start of the plastic deformation, which is denoted as the yield strength.
When reaching the maximum value, the so-called ultimate strength, no more uni-
form elongation is possible and the strain starts to localize on a small area due to the
occurrence of plastic instability, referred to as localized necking. This means, plas-
tic deformation only emerges in a limited area of the specimen, while the remaining

Figure 2.4: Specimen prepared with primer and paint during forming. The specimen
is uniformly formed until necking occurs, followed by rupture of the material.

regions of the specimen no longer contribute to the further deformation, such that
fracture starts in the localized necking area. If the forming process is stopped at a
certain stage, the stress reduces to 0 according to Hooke's law, such that only the plastic
strain remains [Sieg 15]. The complete forming behavior for a tensile test specimen,
starting from the original sample, over localized necking to rupture of the specimen, is
visualized in Figure 2.4. For process design purposes, Hooke’s law (Equation 2.7) can
be used to relate stress to strain for the elastic region, whereas in the plastic region
the relationship between stress and strain is nonlinear and hence true stress-strain
definitions are used and related according to Hollomon’s law [Alta 12]:

σt = K · ε^(nH)                                                           (2.8)

where K is the strength coefficient and nH is the strain-hardening exponent.
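Since ln σt = ln K + nH ln ε, the Hollomon parameters can be estimated from true stress-strain pairs by a linear fit in log-log space. A minimal sketch with synthetic data (the function name and values are illustrative, not from the thesis):

```python
import numpy as np

def fit_hollomon(eps, sigma_t):
    """Estimate K and n_H in sigma_t = K * eps**n_H (Equation 2.8) via
    linear regression of ln(sigma_t) over ln(eps), plastic region only."""
    n_H, ln_K = np.polyfit(np.log(eps), np.log(sigma_t), 1)
    return np.exp(ln_K), n_H

# Synthetic plastic-region data generated with K = 500 MPa, n_H = 0.2
eps = np.linspace(0.02, 0.20, 50)
sigma = 500.0 * eps ** 0.2
K, n_H = fit_hollomon(eps, sigma)
print(K, n_H)  # recovers approximately 500.0 and 0.2
```

In practice the fit would be restricted to measured data between the yield strength and the ultimate strength, where Hollomon's law is a reasonable approximation.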


The introduced mechanical properties are dependent on the rolling direction of
the sheet metal materials, which is referred to as anisotropy. The reason is the
texture of the microstructure that is generated by the preceding hot and cold rolling
process [Grot 18]. During the processing of sheet metal, the grains are aligned and
elongated in the rolling direction while being compressed in the thickness direction.
This leads to different strengths and properties for the specimen dependent on the
orientation of the extracted specimen with respect to the rolling direction. The degree
of anisotropy is defined using the plastic-strain ratio, or Lankford coefficient rL :
rL = εw / εt                                                              (2.9)
where εw is the strain in width direction and εt the strain in the thickness direction
of the specimen (cf. Figure 2.1). A Lankford coefficient equal to one refers to the
isotropic behavior of the material. Consequently, the lower the value, the earlier
the thinning of the material starts, and hence high values of the ratio are desirable
[Alta 12]. The Lankford coefficient is determined at multiple angles relative to the
rolling direction of the sheet metal as visualized in Figure 2.5.
Figure 2.5: Anisotropy is defined in the plane direction of the sheet metal with respect
to the rolling direction. Specimens are extracted at different angles with respect
to the rolling direction to determine multiple Lankford coefficients.
Adapted from [Sieg 15].
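The Lankford coefficient of Equation 2.9 and an averaged anisotropy over the three extraction directions can be sketched as follows. Note that the weighted mean r̄ = (r0 + 2 r45 + r90)/4 is a common convention from the literature, not a definition given in this section, and the strain values are hypothetical:

```python
def lankford(eps_w, eps_t):
    """Plastic-strain ratio r_L = eps_w / eps_t (Equation 2.9)."""
    return eps_w / eps_t

def normal_anisotropy(r0, r45, r90):
    """Weighted mean over the 0/45/90 degree extraction directions;
    a common literature convention (assumption, not defined in the text)."""
    return (r0 + 2.0 * r45 + r90) / 4.0

# Hypothetical width/thickness strains of a 0-degree specimen
r0 = lankford(-0.12, -0.08)             # both directions thin -> r0 = 1.5
print(normal_anisotropy(r0, 1.2, 1.8))  # mean anisotropy of the sheet
```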

2.2 Forming Limit Diagram


The forming limit diagram provides a graphical description of the forming behav-
ior of materials. Essentially, the diagram consists of two curves, the actual FLCs,
and incorporates the maximum tolerable forming limits in dependence of the princi-
pal strains. Herein, the upper curve depicts the failure of the material by rupture,
whereas the lower curve describes the failure of the material due to localized necking
(Figure 2.6) [Sieg 15]. In theory, defect-free workpieces are expected if the process takes
into account the localized necking forming limits. The FLC is expressed in terms of
in-plane principal strains, the larger (major strain) and smaller (minor strain) loga-
rithmic longitudinal and transversal deformation. The major strain (ε1 ) and minor
strain (ε2 ) pairs are determined based on a circular grid that is etched on the surface
of the specimen. During the deformation process, the circles with diameter d0 are
deformed to ellipses with the larger semi-major axis d1 and smaller semi-minor axis
d2 , which leads to the definition of ε1 and ε2 , that are perpendicular to each other:
ε1 = ln(d1 / d0) ,   ε2 = ln(d2 / d0)                                     (2.10)
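Equation 2.10 translates directly into code; the grid circle and ellipse diameters below are made-up values for illustration:

```python
import math

def grid_strains(d0, d1, d2):
    """Major and minor strain from the deformed grid circle (Equation 2.10):
    eps_1 = ln(d1/d0), eps_2 = ln(d2/d0)."""
    return math.log(d1 / d0), math.log(d2 / d0)

# A 2.0 mm etched circle deformed into a 2.6 mm x 1.8 mm ellipse
eps1, eps2 = grid_strains(2.0, 2.6, 1.8)
print(eps1, eps2)  # eps1 > 0 and eps2 < 0: a point on the deep-drawing side
```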
The ellipsoidal diameters are determined either incrementally or
continuously while forming the specimen up to fracture [Sieg 15]. Since the position
and curvature of the FLC are dependent on the test setup, e. g. the shape of the
punch used to form the specimen or the geometry of the specimen (with lateral
cutouts or straight sheet metals), there exists no single, definite FLC. In general, it is
beneficial to use frictionless test setups, e. g. mechanical stretching for the right part
of the FLC that correspond to biaxial loading conditions and where the relationship
ε1 = ε2 is valid. Tensile tests are used for the left part of the FLC that correspond
to uniaxial loading conditions with the relationship ε1 = −2ε2 . The plane strain
region is derived by tensile tests with suppressed lateral extension and possesses
the relationship ε2 = 0. The individual loading conditions and their characteristic
Figure 2.6: Schematic of the forming limit diagram with one FLC for necking and
rupture and two strain paths. The blue circle denotes the shape prior to forming,
while the dashed ellipse depicts the shape after forming. The relation between ε1 and
ε2 changes, depending on the loading conditions. Adapted from [Sieg 15].

interrelationships between ε1 and ε2 are emphasized in Figure 2.6. Another aspect that is
evaluated using the forming limit diagram is the strain progression of specimens, the
so-called strain paths. The paths are described by ε1 and ε2 pairs, while their ratio
has to remain constant to be considered a valid experiment for the determination of
the FLC:

ε1 / ε2 = const.                                                          (2.11)
Otherwise, with non-linear strain paths, it would be possible to create an unlimited
number of FLCs [Naka 68]. A comprehensive description of the determination of
forming limit diagrams can be found in [Hase 78], while usually test setups, according
to Nakajima [Naka 68] or Marciniak [Marc 65], are considered for the experimental
determination of the FLC. During the design of a production process, coarse knowl-
edge of the position of the FLC and the strain paths may be of advantage as the
process can be adapted incrementally. Nevertheless, a sufficiently large safety margin
is required to produce defect-free components, as visualized in Figure 2.7 [Sieg 15]. A
precise and consistent determination of the forming limit curve therefore reduces the
required safety margin and increases the possible applications of materials.

2.2.1 Factors Influencing the FLC


Besides the dependency of the FLC on the test setup, multiple factors influence the
formability of the material specimen and lead to a transition of the corresponding
FLC. These characteristics are categorized into material and process parameters,
which are summarized in Table 2.1 [Doeg 10]. While most parameters were already
introduced, the subset size and resolution are discussed in Subsection 2.2.3.
Figure 2.7: Manufacturing processes include safety margins well below the FLC as
they imply uncertainty, to guarantee defect-free components. Adapted from [Sieg 15].

Table 2.1: Parameters with influence on the position of the FLC.

Material parameter      Process parameter
sheet thickness         friction
yield strength          forming velocity
tensile strength        geometry of specimen
anisotropy              forming history
hardening exponent      subset size of the measurement system
                        resolution of the optical measurement system

2.2.2 Experimental Setup


The FLC can be determined experimentally, where stretch tests are carried out ac-
cording to Nakajima [Naka 68] or Marciniak [Marc 65] setups. The setups use a punch,
either hemispherical (Nakajima) or flat (Marciniak), to deform a sheet metal that is
clamped between a blank holder and a die, until failure of the material. During the
deformation of the specimen, an optical measurement system is used to record the
forming process. A schematic overview of the setup is visualized in Figure 2.8 (a).
The different loading conditions and strain paths, from uni-axial over plane strain
to biaxial, are derived by deforming varying sample geometries as visualized in Fig-
ure 2.8 (b). The sample geometries are specified by their cutouts from the material,
that define the remaining conjunction width, e. g. a S050 geometry corresponds to a
conjunction width of 50 mm, whereas S245 depicts a complete circular sheet metal
without cutouts. Additionally, a lubrication system is used to reduce friction between
the punch and the specimen and to induce a rupture at the center of the specimen,
or at the top of the punch, respectively. Prior to forming, specimens are prepared
with a white primer and black paint to generate a stochastic speckle pattern, such
that the deformation process can be evaluated using an optical measurement system
with DIC as proposed by [Anut 70] or [Keat 75] in the 1970s, and with applications
to experimental mechanics as introduced by [Chu 85].
(a) Schematic of Nakajima test setup. (b) Specimen geometries.

Figure 2.8: The Nakajima test setup in (a) with multiple specimen geometries in
(b), that are used for the different loading conditions. The number denotes the width
of the conjunction in mm. Source: [Jare 18] (CC BY 4.0).

2.2.3 Digital Image Correlation


In general, DIC is an optical measurement method that can be used for accurate
determination of changes in images using well-understood computer vision techniques
such as registration, epipolar geometry, lens undistortion, or camera calibration to
generate 2D or 3D measurements. It is employed to measure full-field displacement
and enables the computation of local strain information. In combination with a
Nakajima test setup, it is possible to generate high-resolution strain distributions
of the formed specimen and their progression over time. This thesis focuses on the
fundamental principles of DIC, which are necessary to understand the generation of
2D strain distributions that are used in the determination of the FLC. A
comprehensive overview and complete coverage of the topic, including an extensive
literature review of the very active research area is found in [Sutt 09].

Basic Workflow
In general, DIC is a measurement method that compares gray-level intensity changes
of images at two different states, before and after the deformation process. To track
the deformation and to determine the displacements of the specimen, a stochastic
speckle pattern is applied on the surface of the sheet metal. To achieve a fine-grained
resolution of the strain distribution, the image of the prepared, undeformed specimen
is subdivided into rectangular subsets that are unique with respect to their neigh-
borhoods and hence traceable over time. This concept has been applied successfully,
since neighboring pixels remain approximately constant over time, independent of
the deformation process. This effect is exemplarily visualized in Figure 2.9 for one
subset with its original position and its new position 30 frames later. Despite the
significant time difference, the neighboring pixels appear constant. Consequently, the
subset size needs to be chosen adequately with respect to the camera resolution and

Figure 2.9: Reference subset position and new position with a difference of 30
frames in between. Despite the forming of the specimen, the neighborhood remains
constant.

the size of the specimen in order to derive a fine-grained resolution of the individual
displacements. Too large subsets may impede the displacement resolution, such that
small changes may not be traceable, whereas too small subsets may not be unam-
biguous and are computationally expensive to track. Tracking of the movement and
calculation of the displacements of the subsets is achieved by evaluation of a similarity
metric between the reference subset and the deformed subset.
Multiple criteria have been suggested, such as Cross-Correlation (CC), Sum of
Absolute Differences (SAD), Sum of Squared Differences (SSD), Zero-Normalized
Cross-Correlation (ZNCC), Zero-Normalized Sum of Squared Differences (ZNSSD),
Normalized Cross-Correlation (NCC), or Normalized Sum of Squared Differences (NSSD)
[Pan 09]. Especially the NCC and NSSD are of importance, as they are robust to local
illumination changes.
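As an illustration, the ZNCC between two equally sized gray-level subsets takes only a few lines of numpy; this is a sketch, not the implementation of the cited systems, and it also demonstrates the robustness to affine illumination changes:

```python
import numpy as np

def zncc(f_subset, g_subset):
    """Zero-normalized cross-correlation of two equally sized subsets.
    Returns a value in [-1, 1]; 1 indicates a perfect match."""
    f = f_subset - f_subset.mean()
    g = g_subset - g_subset.mean()
    return float((f * g).sum() / (np.linalg.norm(f) * np.linalg.norm(g)))

rng = np.random.default_rng(0)
ref = rng.random((21, 21))        # reference subset (random speckle pattern)
brighter = 1.7 * ref + 25.0       # same subset under changed illumination
print(zncc(ref, brighter))        # ≈ 1.0: invariant to brightness and contrast
```

In a full DIC pipeline, this score would be evaluated over candidate displacements and warps of the subset to locate the best match in the deformed image.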
Considering a small reference subset of an image with center coordinate P(x_0^c, y_0^c)
prior to deformation, as visualized in Figure 2.10, the subset moves to a new position
P*(x_0^c', y_0^c') and deforms. Consequently, an arbitrary point Q(x^c, y^c) inside the
reference subset prior to deformation is found at a new position Q*(x^c', y^c') after
deformation according to the following displacement mapping function:

x^c' = x_0^c + ∆x^c + u + (∂u/∂x^c) ∆x^c + (∂u/∂y^c) ∆y^c                 (2.12)

y^c' = y_0^c + ∆y^c + v + (∂v/∂x^c) ∆x^c + (∂v/∂y^c) ∆y^c                 (2.13)

where u, v are the displacement components of the subset center P in x^c and y^c
direction, and ∆x^c, ∆y^c the distances from the subset center to the point Q(x^c, y^c).
The partial derivatives ∂u/∂x^c, ∂u/∂y^c, ∂v/∂x^c, and ∂v/∂y^c denote the first-order
displacement gradient components as depicted in Figure 2.10.
Figure 2.10: Relationship between the reference subset and deformed subset. The
rigid translation in u and v direction does not contribute to the strain calculation,
while the deformation of the blue subset is described with respect to the reference
subset.

A typical similarity criterion, such as the ZNCC, is evaluated to measure how well
subsets match:

CC_{fs,gs}(p) = Σ_{i=−M}^{M} Σ_{j=−M}^{M} [ (fs(x_i^c, y_j^c) − fm) · (gs(x_i^c', y_j^c') − gm) / (∆fs · ∆gs) ]

∆fs = sqrt( Σ_{i=−M}^{M} Σ_{j=−M}^{M} [ fs(x_i^c, y_j^c) − fm ]² )        (2.14)

∆gs = sqrt( Σ_{i=−M}^{M} Σ_{j=−M}^{M} [ gs(x_i^c', y_j^c') − gm ]² )

where fs(x_i^c, y_j^c) are the gray-level reference subsets of the undeformed image,
gs(x_i^c', y_j^c') the gray-level subsets of the target image, and fm, gm the ensemble
averages of reference and target subsets, respectively, while the vector
p = (u, u_x, u_y, v, v_x, v_y) comprises the
displacement mapping parameters. The reference and target subsets reach maximum
correlation when optimally aligned with maximum similarity. To determine the un-
known displacement parameters, a Newton-Raphson algorithm is employed for fast
convergence [Bruc 89]:
   
p^(k) = p^(k−1) − H^(−1)(CC(p^(k−1))) ∇CC(p^(k−1))                        (2.15)

where p^(k) is the k-th iterate of the solution and p^(k−1) the previous solution,
∇CC(p^(k−1)) is the gradient of the correlation criterion, and H^(−1)(CC(p^(k−1)))
is the inverse Hessian matrix, the second-order derivative of the correlation criterion.
Generation of the strain measurements from the displacements is achieved, after a
smoothing procedure, using a least-squares method that applies a rectangular moving
window approach as proposed in [Pan 07] and [Pan 09]. First, a square window
is defined that contains (2m + 1) × (2m + 1) pixels for strain calculation. The
displacement distributions included in these windows are then used to approximate a
linear plane:
u_plane(x^c, y^c) = a_{u,plane} + (∂u/∂x)_plane x^c + (∂u/∂y)_plane y^c
                                                                          (2.16)
v_plane(x^c, y^c) = b_{v,plane} + (∂v/∂x)_plane x^c + (∂v/∂y)_plane y^c

or more generally formulated as

u(i, j) = a0 + a1 xc + a2 y c
(2.17)
v(i, j) = b0 + b1 xc + b2 y c

where i, j = −m : m denote the local coordinates, u(i, j) and v(i, j) denote the
displacements at location (i, j) as derived by DIC, and a_{0,1,2}, b_{0,1,2} are the
polynomial coefficients to be determined. This is rewritten for u(i, j) as follows:

[ 1   −m     −m ]            [ u(−m, −m)    ]
[ 1   −m+1   −m ]   [ a0 ]   [ u(−m+1, −m)  ]
[ :    :      : ]   [ a1 ] = [      :       ]                             (2.18)
[ 1    0      0 ]   [ a2 ]   [ u(0, 0)      ]
[ :    :      : ]            [      :       ]
[ 1   m−1     m ]            [ u(m−1, m)    ]
[ 1    m      m ]            [ u(m, m)      ]
while v(i, j) is derived equivalently. The Green-Lagrangian strains are directly
computed from the determined coefficients as [Blab 15]:

ε_xx = 1/2 [ 2 ∂u/∂x^c + (∂u/∂x^c)² + (∂v/∂x^c)² ]

ε_xy = 1/2 [ ∂u/∂y^c + ∂v/∂x^c + (∂u/∂x^c)(∂u/∂y^c) + (∂v/∂x^c)(∂v/∂y^c) ]   (2.19)

ε_yy = 1/2 [ 2 ∂v/∂y^c + (∂u/∂y^c)² + (∂v/∂y^c)² ]
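The moving-window plane fit (Equations 2.17/2.18) and the Green-Lagrangian strains (Equation 2.19) can be sketched with numpy; the synthetic displacement fields and window size below are made up for illustration, not taken from the thesis setup:

```python
import numpy as np

def fit_plane(window):
    """Least-squares plane fit over a (2m+1)x(2m+1) displacement window,
    returning (a0, a1, a2) of Equation 2.17, i.e. offset and gradients."""
    m = window.shape[0] // 2
    ys, xs = np.mgrid[-m:m + 1, -m:m + 1]
    A = np.column_stack([np.ones(xs.size), xs.ravel(), ys.ravel()])
    coeffs, *_ = np.linalg.lstsq(A, window.ravel(), rcond=None)
    return coeffs

def green_lagrange(ux, uy, vx, vy):
    """In-plane Green-Lagrangian strains from the displacement gradients
    (Equation 2.19)."""
    e_xx = 0.5 * (2.0 * ux + ux ** 2 + vx ** 2)
    e_xy = 0.5 * (uy + vx + ux * uy + vx * vy)
    e_yy = 0.5 * (2.0 * vy + uy ** 2 + vy ** 2)
    return e_xx, e_xy, e_yy

# Synthetic window (m = 7): u = 0.5 + 0.01 x + 0.002 y, v = 0.2 + 0.004 y
m = 7
ys, xs = np.mgrid[-m:m + 1, -m:m + 1]
_, ux, uy = fit_plane(0.5 + 0.01 * xs + 0.002 * ys)
_, vx, vy = fit_plane(0.2 + 0.0 * xs + 0.004 * ys)
print(green_lagrange(ux, uy, vx, vy))  # e_xx ≈ 0.01005, e_yy ≈ 0.00401
```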

Since the determined strains depend on the coordinate system itself, the principal
strains, major and minor strain, are derived by determining the eigenvalues λ_{1,2}:

λ_{1,2} = 1 + (ε_xx + ε_yy)/2 ± sqrt( ((ε_xx + ε_yy)/2)² − (ε_xx ε_yy − ε_xy²) )   (2.20)

where the larger eigenvalue defines the stretch ratio of the major strain component,
and the smaller eigenvalue defines the stretch ratio of the minor strain component.
As the thickness direction cannot be obtained directly using an optical measurement
system that only assesses surface deformation, volume constancy is assumed to
derive the thickness information (ε3) according to:

λ1 · λ2 · λ3 = 1 (2.21)

which, in logarithmic form, yields the principal strain relationship:

ε1 + ε2 + ε3 = 0 (2.22)
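Combining Equations 2.20-2.22, the principal strains and the thickness strain follow from the in-plane strain components; a sketch with made-up strain values:

```python
import math

def principal_and_thickness(e_xx, e_yy, e_xy):
    """Stretch ratios lambda_1,2 (Equation 2.20), logarithmic principal
    strains, and the thickness strain eps_3 from volume constancy
    (Equations 2.21 and 2.22)."""
    mean = 0.5 * (e_xx + e_yy)
    root = math.sqrt(mean ** 2 - (e_xx * e_yy - e_xy ** 2))
    lam1, lam2 = 1.0 + mean + root, 1.0 + mean - root
    eps1, eps2 = math.log(lam1), math.log(lam2)
    eps3 = -(eps1 + eps2)   # ln(lambda_3) with lambda_1 * lambda_2 * lambda_3 = 1
    return eps1, eps2, eps3

eps1, eps2, eps3 = principal_and_thickness(0.08, 0.02, 0.01)
print(eps1, eps2, eps3)   # the three strains sum to zero by construction
```

Applied per measurement point and per frame, exactly such (ε1, ε2) pairs form the strain distributions and strain paths evaluated in the forming limit diagram.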

Figure 2.11 visualizes an exemplary strain distribution, derived by DIC, and its pro-
gression over time for a S060 specimen geometry.


Figure 2.11: Strain distribution during forming and its progression (left to right)
from homogeneous forming to localized necking with separate visualization of ε1 – ε3 .

2.3 State-of-the-Art

The term “forming limit” was first introduced by Keeler in 1961 [Keel 61]. It assumes
a change of the initial circular shape to an ellipsoidal contour as a result of the plastic
elongation of the material. The major and minor strains are derived as the natural
logarithm of the ratio between the main and secondary diameters of the ellipse after
plastic deformation and the diameter of the initial circle. For identification and measurement,
a homogeneous circular pattern is etched onto the workpiece prior to forming and
measured after deformation. Keeler used a hemispherical punch for the experimental
determination [Keel 77]. The workpieces consisting of aluminum, brass, copper and
steel were clamped and tested under biaxial tensile load until fracture. As a result of
the biaxial load, Keeler was only able to determine positive values for major and minor
strain. Goodwin adopted this concept and extended the experimental forming limit
determination to include the tensile-compression condition, where the minor strain
assumes negative values compared to the biaxial tensile loading condition [Good 68].
For this purpose, strip samples with different sheet widths and thicknesses along the
longitudinal axis were tested under uniaxial tension up to fracture of the specimen.
The variation of the specimen dimensions in width and thickness enabled the modeling
of different stress states and hence, the resulting forming limit curve describes the
tensile-tensile and tensile-compression range. These represent almost all relevant
strain states of formed sheet metal parts as e. g. used for the deep drawing of chassis
parts. Marciniak investigated in analogy to Keeler, in particular, the formability of
the biaxial stress state [Marc 65].
An analytical model was developed based on experimental stretch forming tests
with a ring-shaped punch attachment to predict the local thinning as a function of the
yield strength, the anisotropy, the strain hardening exponent and the local necking or
local thinning of the specimen. The basis for the current determination of the forming
limit capacity was introduced by Nakajima, by establishing the forming limit curve
for different steels using a spherical and ellipsoidal punch [Naka 67]. Nakajima was
able to specify a complete forming limit curve by varying the specimen and punch
geometry, whereby a relationship between stress and strain conditions was presented
for the first time. Hasek [Hase 78] varied the sample geometry for the Nakajima test
with various cutout radii, which, due to the geometric dimensions, forced the specimen
onto different strain paths. Traditionally, the forming limit diagram is used in sheet
metal forming to describe the failure as a result of instability in order to investigate
the feasibility of real components.
In Europe, the procedure used for the determination of the FLC is summarized in
DIN EN ISO 12004-2 [DIN 08], that is based on the Bragard study of 1972 [Brag 72]
and evaluates multiple cross-sections through the strain distribution of the specimen
prior to failure, without considering forming progression. Another method that over-
comes this limitation and makes use of the forming history, was proposed by Volk and
Hora [Volk 11]. Both methods are presented and contrasted in the following, as they
serve as a baseline throughout the remainder of the thesis. Additionally, both meth-
ods make use of a Nakajima test setup to deform multiple specimens, coupled with
an optical measurement system and employ DIC to calculate the strain distributions.

Figure 2.12: Different evaluation areas. (a) Five cross-sections are extracted per-
pendicular to crack occurrence (location-dependent method). (b) Up to 20 connected
pixels are determined with respect to a 90% threshold of ε3 (time-dependent method).

2.3.1 Location-Dependent Determination of the FLC


The ISO standard [DIN 08] specifies the location-dependent so-called “cross-section”
method for deriving the FLC. It uses the major and minor strain values along multiple
cross-sections perpendicular to the crack development of the specimen as visualized
in Figure 2.12 (a). The aim of the method is to estimate the theoretical maximum of
the forming process using a quadratic function. For this purpose, the last valid frame
of the forming procedure is identified and the strain values along the cross-sections
are subdivided into two areas wl and wr, the left and right side of the discontinuity
(cf. Figure 2.13). Heuristics are used to derive the width of two consistent and constantly
rising approximation windows, which are employed to regress a quadratic function.
The maximum of the estimated function determines the location of the ε1 and ε2 pair,
being used for the generation of the FLC. Three forming experiments are realized
per geometry with five averaged cross-sections for the regression process for each
specimen, whereas the mean value of the repetitions is utilized to create the FLC.
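The regression step at the core of the cross-section method can be sketched as follows; this is a simplified illustration only, since the heuristic window-width determination and the averaging over five cross-sections and three repetitions are omitted:

```python
import numpy as np

def parabola_peak(positions, strains):
    """Fit a quadratic function to the strain values of the fit windows
    and return the location and value of its maximum (the forming limit
    estimate for this cross-section)."""
    a, b, c = np.polyfit(positions, strains, 2)
    x_max = -b / (2.0 * a)               # vertex of the parabola
    return x_max, a * x_max**2 + b * x_max + c
```

Applied to window samples of an ideal parabolic strain profile with a gap at the discontinuity, the fit recovers the vertex location and peak strain exactly.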

2.3.2 Time- and Location-Dependent Determination of the FLC


Volk and Hora [Volk 11] proposed the time-dependent method, referred to as “line-fit”
method in the remainder of the thesis, that heuristically defines a localization area on
the strain distributions of the last valid frame of the forming sequence as visualized
in Figure 2.12 (b). The mean value of the defined necking area is calculated for
ε1 , ε2 and ε3 of each frame of the forming sequences and arranged over time. The
progression of the two principal strains is depicted in Figure 2.14. Typically, the
principal strains exhibit a linear profile up to the time of occurrence of localized
necking. The ε3 -rate, the first derivative, is used to derive the point in time when
necking occurs. For this purpose, two regression lines are approximated, while one
line is estimated for the homogeneous forming region and one line is estimated for
the inhomogeneous forming region. The intersection of both lines defines the onset
of localized necking and lookup point in time for the ε1 , ε2 pair of the FLC. One
disadvantage of the line-fit method is the dependence on the sampling rate, as it
determines the number of available points that can be used for regression and hence
the intersection lines. For this reason, the transferability and reproducibility of the
results is not guaranteed [Merk 14].
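The two-line regression can be sketched as follows, assuming the thinning rate has already been averaged over the necking area per frame; how many samples belong to the homogeneous and inhomogeneous phases is a placeholder choice here, which is exactly the kind of heuristic the method depends on:

```python
import numpy as np

def linefit_onset(time, rate, n_homogeneous, n_inhomogeneous):
    """Regress one line on the first samples (homogeneous phase) and one
    on the last samples (inhomogeneous phase) of the thinning rate; the
    intersection of both lines marks the onset of localized necking."""
    a1, b1 = np.polyfit(time[:n_homogeneous], rate[:n_homogeneous], 1)
    a2, b2 = np.polyfit(time[-n_inhomogeneous:], rate[-n_inhomogeneous:], 1)
    return (b2 - b1) / (a1 - a2)         # time coordinate of the intersection
```

For a synthetic thinning rate with a kink at t = 6, the sketch recovers the onset at t = 6; with fewer samples per phase, the regressed slopes and hence the intersection become less stable, which reflects the sampling-rate dependence noted above.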

Figure 2.13: Schematic of the location-dependent evaluation method. The ε1 cross-
sections are subdivided into two windows, one on the left and one on the right side of
the discontinuity, which are used to approximate a parabola. The maximum value of ε1 is
used as forming limit, while the same procedure is performed for ε2.


Figure 2.14: Schematic of the time-dependent evaluation method. Two straight
lines are regressed using the thinning rate, one for the homogeneous and inhomoge-
neous forming phase. The intersection of the two lines defines the point in time when
localized necking occurs, and hence is used to define the ε1 and ε2 values.


Figure 2.15: FLC curves of the ISO and line-fit method for AA6014. For the
same material, a difference of approx. 10% is observable, depending on the specimen
geometry.

Comparison of State-of-the-Art Methods


The evaluation method according to the DIN standard provides good and consistent
results for ductile materials with distinct necking phases that develop a single strain
maximum, even though only one frame of the forming sequence is evaluated. For
materials that do not exhibit pronounced necking behavior, i.e. whose strain
distribution develops multiple local maxima, an unambiguous localization of the
cross-sections is not possible.
The line-fit method shows promising results for light-weight materials by investiga-
tion of the forming history but has a tendency to overestimate the actual forming
limit. Consequently, the evaluation method influences the position of the FLC, and
different FLC are derived for the same material as visualized in Figure 2.15, which
affects the required safety margin during process planning.

In summary, the following shared disadvantages can be identified for both methods:

• Limited evaluation area: Only a limited area of the available information is
evaluated in order to determine the beginning of local necking.

• Approximation of the onset of localized necking: Both methods define
partial areas with constant forming character, upon which either a parabola is
estimated or two intersecting lines are regressed, such that only an approximation
of varying quality of the maximum forming limit is possible.

• Material dependency: The evaluation method according to the DIN standard
provides reliable results for ductile materials, whereas too conservative estimates
are obtained for modern light-weight materials. Similar behavior can be observed
for the time-dependent method, which particularly fails for materials with abrupt
fracture behavior.

In addition to the methods presented and established in the industry, further method-
ologies were proposed that are applicable for the determination of the FLC [Merk 10,
Wang 14, Silv 15, Vyso 16]. As with the established methods, the recently proposed
approaches share the same disadvantage of the heuristically determined and limited
evaluation areas and thus neglect a large proportion of the information available. In
order to avoid the aforementioned disadvantages for the determination of the FLC,
a new evaluation method is required that makes use and exploits the available strain
information provided. This method should consider most of the available strain dis-
tributions without limitation to heuristically predefined evaluation areas. At the
same time, the method should guarantee general validity and thus be independent of
the material. One possibility to realize this is the development of a new evaluation
strategy based on pattern recognition methods, whose principles with a focus on the
applied methods are presented in the following.
CHAPTER 3

Theoretical Background:
Machine Learning
3.1 Introduction to Pattern Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.3 Classification Algorithms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.4 Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.5 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.6 Assessment of the Expert Annotation Quality . . . . . . . . . . . . . . . . . . . . . . . . . 59

This chapter introduces the fundamental principles of pattern recognition and
machine learning from the perspective of the typical pattern recognition pipeline
[Niem 83]. It covers well-established traditional machine learning algorithms, which
provide the conceptual basis for multiple proposed methods. Subsequently, novel
DL techniques are presented, which avoid several disadvantages of the traditional
approaches. The chapter concludes with common evaluation metrics for classification
algorithms and additional statistics used in the context of quality assessment of expert
annotations.

3.1 Introduction to Pattern Recognition


In general, machine learning (traditional and DL) focuses on the task of learning
representative models or separation hypotheses by exploiting features extracted from
data to automate and objectify decision-making processes. To derive these decisions,
a classification system is realized by implementation of the well-known steps of the
pattern recognition pipeline. The pipeline consists of four sequential steps: data
acquisition, preprocessing, feature extraction, and classification, whose sequential
structure is visualized in Figure 3.1. The initial step of the pipeline is the acquisition
of a signal, such as time-series or image data by means of a sensor. This sensor is
used to transform a continuous signal into a discrete representation. Subsequently,
preprocessing steps are performed, which may encompass the suppression of noise,
the removal of outliers, or the interpolation of missing values. Moreover, in order to
facilitate further processing of the signal, a subdivision into smaller sections or areas
can be considered optionally. Within the feature extraction step, the preprocessed
signal is encoded into smaller representations, the feature vectors, so that the con-
tained information is concise and yet describes the original signal, ideally without
losing relevant information. These features are usually selected in advance based on



Figure 3.1: Pattern recognition pipeline according to Niemann [Niem 83].

prior knowledge of experts in the field. Within the classification step, a differentia-
tion is made between supervised and unsupervised approaches. In case of supervised
algorithms, the data is annotated by experts and provided with class labels, whereby
the unsupervised methods do not require any expert annotation and ideally no prior
knowledge. Independent of the underlying approach, the available data is separated
into disjoint training and test datasets. Within the learning phase, the training
data is used to train a classifier to find an optimal decision boundary that separates
the instances of one class from another. This separation hypothesis is evaluated in
the classification phase with the disjoint test set, whereby the class assignments are
compared with the ground truth labels of the expert annotations. Quantification of
the separation hypothesis is assessed by evaluation metrics that enable comparisons
between different classification algorithms. In the following sections, the individual
steps of the pipeline are described in detail with reference to the strain distributions.

3.2 Feature Extraction


In general, features are extracted from images, or in this case strain distributions,
to derive a characteristic feature vector. The features are chosen to reflect the char-
acteristics of the data and suitable to support the differentiation of multiple classes,
i. e., the varying forming behavior of sheet metals. A variety of feature descriptors
exist that either describe the image entirely or first subdivide the image into multiple
evaluation areas to assess the image information locally. This is followed by subse-
quent combination of the features to generate a global descriptor. In the course of
this work, different features were examined, which are presented in the following.

3.2.1 Intensity-level Histogram


An intensity-level histogram represents an image by using the relative frequency of its
gray-level occurrences. It is used to describe the characteristics of an image without
consideration of structural arrangement or relationships of individual pixels with their
neighborhood. For instance, if the resulting histogram possesses a bimodal character,
it may include an object with a narrow intensity range against a background of varying
intensities [Mate 98]. In case of strain analysis, one would expect a homogeneous
Gaussian-like histogram within the homogeneous forming region. When reaching
the inhomogeneous forming phase, higher intensities occur locally that affect the

relative frequencies and hence the skewness and kurtosis of the distribution. Intensity
histograms are utilized as features, e. g. in medical application areas such as magnetic
resonance imaging [Agga 12] or computed tomography [Moug 07]. Every pixel, more
precisely its gray-level intensity, contributes to the histogram generation procedure.
Assuming an image as a function fI (x, y) of two variables x ∈ [0, . . . , N ] and y ∈
[0, . . . , M ], with discrete intensity values gv ∈ [0, . . . , Ngv ], where Ngv describes the
number of gray-levels, the amount of pixels having the same intensity gv is calculated
by
hI(gv) = Σ_{x=0}^{N} Σ_{y=0}^{M} δk(fI(x, y), gv)    (3.1)

where δk(gv^cur., gv) is the Kronecker delta function

δk(gv^cur., gv) = 1, if gv^cur. = gv; 0, otherwise    (3.2)

The relative frequencies are converted into a probability distribution p(gv) by con-
sideration of the total number of pixels according to:

p(gv) = hI(gv) / (N M),    gv ∈ [0, . . . , Ngv]    (3.3)
while the concatenation of probabilities per intensity represents the histogram.
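Equations 3.1–3.3 amount to a normalized gray-level count; a compact sketch:

```python
import numpy as np

def intensity_histogram(image, n_gray=256):
    """Relative gray-level frequencies p(g_v) of an integer-valued image
    (Eqs. 3.1-3.3): count occurrences per intensity, then normalize by
    the total number of pixels."""
    counts = np.bincount(image.ravel(), minlength=n_gray)   # Eq. 3.1
    return counts / image.size                              # Eq. 3.3
```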

3.2.2 Homogeneity Features


This feature extractor evaluates two types of image representations. The first infor-
mation domain considers the gray-scale intensity (strain) distribution, whereas the
second information domain consists of the edge information. Both domains are com-
bined to assess the homogeneity of an image. Again, at the beginning of the forming
procedure, a smooth and homogeneous intensity distribution is expected, that trans-
forms into an inhomogeneous distribution. This comprises localization effects within
the first domain and thus leads to the development of edges in the second domain.
Sobel-Operators as described in Equation 3.4 and Equation 3.5 are used to derive the
horizontal and vertical gradients Gx and Gy of an image I via convolution [Paul 03]:

Hxs = [−1 0 1; −2 0 2; −1 0 1]    (3.4)        Hys = [−1 −2 −1; 0 0 0; 1 2 1]    (3.5)

Gx = Hxs ∗ I        Gy = Hys ∗ I

The resulting gradient approximations are combined to calculate the orientation in-
dependent gradient magnitude Imagn. using:
Imagn. = √(Gx² + Gy²)    (3.6)
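Equations 3.4–3.6 can be sketched with a hand-rolled valid-mode convolution (kept dependency-free on purpose; a library convolution routine would serve equally well):

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude (Eq. 3.6) from the Sobel responses of
    Eqs. 3.4/3.5, evaluated on the valid interior of the image."""
    hx = np.array([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]])
    hy = hx.T                            # vertical Sobel kernel
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(3):                   # convolution: kernel flipped in both axes
        for j in range(3):
            patch = img[i:h - 2 + i, j:w - 2 + j]
            gx += hx[2 - i, 2 - j] * patch
            gy += hy[2 - i, 2 - j] * patch
    return np.sqrt(gx**2 + gy**2)
```

For a vertical step edge, the magnitude responds with the full kernel weight (1 + 2 + 1 = 4) along the edge and vanishes in flat regions.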

An example of the original strain distribution and the corresponding edge information
is visualized in Figure 3.2. Herein, the strain distribution already reveals necking
characteristics with higher strain values towards the center, whereas the magnitude
representation highlights the occurrence of edges.

Figure 3.2: Original strain distribution (left) and magnitude representation (right).

Identical features are calculated in
both information domains and combined afterwards. They consist of basic statistical
moments up to the fourth-order and assess the level of homogeneity of the image.
In addition to the statistical moments, the median, minimum and maximum of both
domains are taken into account. To assess the level of localization, a ratio between
two areas is generated in dependence of two threshold values. In both domains, the
top 1% and top 10% maximum values are defined as thresholds that are used to
determine the number of pixels being larger than the threshold. Until the end of the
homogeneous forming phase, the two areas are expected to proceed evenly so that
their fraction remains constant. If one of the two areas changes significantly with
regard to its expansion, its proportion decreases.
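The localization ratio can be sketched as follows; note that reading the "top 1% and top 10% maximum values" as the 99th and 90th percentile of the strain values is an assumption of this sketch:

```python
import numpy as np

def localization_ratio(strain):
    """Ratio of the pixel areas exceeding the top-1% and top-10%
    thresholds (percentile interpretation assumed).  The ratio stays
    roughly constant during homogeneous forming and changes once the
    strain localizes."""
    t1, t10 = np.percentile(strain, [99.0, 90.0])
    area_1 = np.count_nonzero(strain > t1)
    area_10 = np.count_nonzero(strain > t10)
    return area_1 / area_10
```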

3.2.3 Local Binary Patterns


Another strategy to describe the strain distributions are Local Binary Patterns (LBP),
introduced by Ojala et al. [Ojal 96]. These image descriptors are used in various
applications, ranging from computed tomography [Soer 10] and biometrics like face-
detection [Ahon 06] to eye-detection [Kroo 09] or biomedical image analysis [Jare 15,
Vo 17]. The major advantage of LBP is that it is able to accommodate the large
intensity variation within images, as one pixel is always evaluated in the context of
its neighborhood. Within the original approach by Ojala et al., a 3×3 local neighbor-
hood is considered to form a pattern that encodes the central pixel by using a binary
weighting scheme with its surrounding pixels. Therefore, the central pixel is used as
a threshold value for binary comparisons with the neighboring pixels, and if larger,
the neighboring pixel is assigned a 1 and 0 otherwise. The results of the comparisons
are subsequently combined in a given order and referred to as LBP that by itself
describes the central pixel. Together with a weighting scheme (2i , where i ∈ [0 . . . 8]),
this binary pattern is encoded into a scalar that defines the position in a histogram.
The process of deriving a LBP is exemplarily demonstrated in Figure 3.3. Using the
central pixel as threshold together with the binary weighting scheme leads to an LBP
of 00011011b , which is encoded to 27 using the weighting scheme, starting from the
upper left corner. As a result of the constant neighborhood size, there exist 28 = 256
different possible LBP. Similar to the previous feature, the image is described by
a histogram, while now relative frequencies of LBP are used to describe the image
characteristics. One drawback of LBP is its rotation dependence, as the neighbor-
hood with respect to the central pixel will change if the image is rotated. Ojala et
al. [Ojal 02] have extended this approach to become gray-scale and rotation invariant
using a circular neighborhood with the possibility of varying the neighborhood size
and radii.

Figure 3.3: Generation of a LBP from the 3×3 image region [113 176 9; 85 100 110;
60 30 105]: (a) image region, (b) thresholding results, (c) weighting of pixels,
(d) contribution to pixel. Adapted from [Nixo 12].

Figure 3.4: Different neighborhoods and radii (8,1), (12,2), (8,2).

Figure 3.4 visualizes a circular neighborhood with eight neighboring
pixels and a radius of one, denoted as (8,1). Additionally, another neighborhood is
illustrated with a larger radius (12,2) and an increased amount of neighbors, together
with an (8,2) example. As visualized, the sampling density depends on the number
of neighbors and the radius, whereby the sampling position with respect to a central
point and neighborhood is derived according to [Piet 11, p. 14]:
xNp(i) = x + r cos(2πi/Np),    yNp(i) = y − r sin(2πi/Np)    (3.7)
where Np denotes the size of the neighborhood, r the radius and i ∈ [0 . . . Np − 1] the
index of the sampling point. Sub-pixel accuracy is derived by bilinear interpolation
of neighboring pixels. Uniform Local Binary Patterns (LBPu ) were introduced to
reduce the amount of possible binary patterns, which may not contain more than
two zero-to-one or one-to-zero transitions. For example, 11110000₂ or 00110000₂ are
uniform patterns, while 01010100₂, 10000101₂ are non-uniform. For an (8,1) LBP,
there exist nine uniform patterns as depicted by Figure 3.5. Each pattern describes a
certain local image characteristic such as a line, spot, line-end or corner, as presented
in Figure 3.6. Every pattern, excluding the first and the last one, can be rotated seven
times around its origin, which leads to a total amount of 58 uniform patterns and one

Figure 3.5: Different uniform patterns. Black circles indicate that the intensity value
is greater than the center pixel; non-filled circles are smaller. Adapted from [Piet 11].
Figure 3.6: Every uniform pattern describes a different image characteristic (spot,
spot/flat, line end, edge, corner). Adapted from [Piet 11].

(a) Original strain distribution. (b) Classical LBP. (c) LBPu. (d) LBPriu.

Figure 3.7: Original strain distribution (left) and the corresponding different LBP
representations: classical LBP, LBPu and LBPriu . The original strain distribution
was Gaussian filtered to highlight the differences between the LBP approaches.

additional reject pattern, which is assigned if the pattern is non-uniform. In order to
introduce rotation invariance, the nine uniform patterns are rotated circularly to their
minimum and referred to as Rotation Independent Uniform Local Binary Patterns
(LBPriu ):
 
LBP^riu_{Np,r}(x) = min_{i=0,...,Np−1} fror(LBP^u_{Np,r}(x), i)    (3.8)

 
where fror(LBP^u_{Np,r}(x), i) denotes the circular right bitwise rotation of the bit
sequence x by i steps. For instance, the LBP codes 10000011₂, 00111000₂, and 00001110₂
are all rotated to the minimum code 00000111₂ [Piet 11]. Examples of the different
LBP strategies are visualized in Figure 3.7. The rotation dependent classical LBP
reveal more detail of the image with the drawback of many unused binary patterns
and rare patterns that occur only occasionally. The LBPu approach removes a lot
of those unused patterns and focuses on the most important and frequently used binary
patterns. Rotation independence is introduced with the LBPriu that further reduces
the level of details, while still describing the most important information.
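The two core operations of this section, computing a classical 3×3 LBP code and the circular rotation to the minimum code of Equation 3.8, can be sketched as follows; the only convention assumed is the traversal order, clockwise from the upper-left corner as in Figure 3.3:

```python
import numpy as np

def lbp_code(region):
    """Classical 3x3 LBP: threshold the eight neighbors against the
    central pixel and weight them with 2^i, clockwise from the
    upper-left corner (traversal order as in Figure 3.3)."""
    center = region[1, 1]
    neighbors = [(0, 0), (0, 1), (0, 2), (1, 2),
                 (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for i, (r, c) in enumerate(neighbors):
        if region[r, c] > center:        # binary comparison with the center
            code |= 1 << i
    return code

def rotate_to_min(code, n_bits=8):
    """Rotation-invariant code of Eq. 3.8: minimum over all circular
    right bitwise rotations f_ror of the pattern."""
    mask = (1 << n_bits) - 1
    return min(((code >> i) | (code << (n_bits - i))) & mask
               for i in range(n_bits))
```

For the 3×3 region of Figure 3.3 this reproduces the code 27, and the codes 10000011₂, 00111000₂ and 00001110₂ all map to the minimum code 00000111₂.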

3.2.4 Histogram of Oriented Gradients


In contrast to the aforementioned homogeneity features, Histogram of oriented Gradi-
ents (HoG) [Dala 05] were introduced in the scope of pedestrian detection and exploit
the gradient orientation together with the magnitude of the edges. The intuition be-
hind the feature extractor is that humans in an upright position can be represented
by the orientation of gradients of their outline with dependence on their pose. Similar
to the homogeneity features, a Sobel-Operator is used to generate the horizontal and
vertical gradient approximations via convolution (cf. Equation 3.4 and Equation 3.5).

Figure 3.8: Schematic determination of HoG features. The original image is con-
volved with, e.g., the Sobel filter and subdivided into blocks. These blocks are further
subdivided into cells, while the gradient orientations in each of the cells are expressed
in terms of histograms.
The orientation Θ at each position is then derived by:

Θ = atan2 (Gy , Gx ) (3.9)

To generate the HoG feature descriptor, the image is subdivided into rectangular
cells, originally of size 8 × 8 pixels. To generate a histogram of the respective cell,
an appropriate angular resolution needs to be specified, which defines the number of
available bins for the histogram. In case of unsigned angles ([0◦ , . . . , 180◦ ]), this would
lead to nine bins and a resolution of 20◦ . The weighted magnitude is assigned to a
histogram bin according to its gradient orientation. To improve robustness against
illumination changes, multiple cells are combined into a larger block and normalized
via L1 or L2 norm. An overview of the procedure to derive the HoG feature descriptor
is visualized in Figure 3.8 and an exemplary visualization of the HoG descriptor is
depicted in Figure 3.9. As can be seen, the magnitude of the orientations is higher
towards the center of the image. Additionally, the most important orientation changes
when comparing the central with the outer regions.
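The binning of one cell can be sketched as follows (hard assignment of each magnitude-weighted vote to one of nine unsigned-orientation bins; the original descriptor additionally interpolates votes between neighboring bins and normalizes over blocks):

```python
import numpy as np

def cell_histogram(theta, magnitude, n_bins=9):
    """Orientation histogram of one HoG cell: every pixel votes with
    its gradient magnitude into the bin of its unsigned orientation."""
    angles = np.rad2deg(theta) % 180.0                    # unsigned: [0, 180)
    bins = (angles // (180.0 / n_bins)).astype(int)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), magnitude.ravel())      # magnitude-weighted votes
    return hist
```

With a 20° resolution, an orientation of 0° falls into bin 0 and one of 90° into bin 4, each weighted by its gradient magnitude.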

(a) Original strain distribution. (b) HoG representation.

Figure 3.9: Original strain distribution (left) and the corresponding HoG image.

3.3 Classification Algorithms


Before the introduction of the theory of DL, widely used methods that have proven
their reliability and efficacy over many years are presented in the following. These
comprise supervised and unsupervised classification algorithms, which in turn can be
distinguished into discriminative and generative models. In supervised classification,
samples x, described by their characteristic feature vectors, and their class
labels y emerge from an unknown joint probability distribution p(x, y). Within
discriminative models, an optimal separation between classes is found by estimation of
the conditional probability p(y|x) and minimization of a classification loss. Generative
models approximate the parameters of the distribution, e. g. by using Maximum
Likelihood estimation. Both procedures are applicable to unsupervised methods that
do not require class labels. Irrespective of the underlying classification approach, su-
pervised or unsupervised, only a subset of the available data is used for training of the
classification algorithm. This practice is motivated by the fact that the separation
hypothesis and thus generalization to unseen data is evaluated with the remaining
disjoint test dataset.

3.3.1 Random Forest


Breiman et al. [Brei 01] introduced Random Forest (RF) in 2001. A comparison
of decision forests with respect to other classification algorithms can be found in
[Caru 08]. RF naturally adapt to multi-class situations and show good generalization
to high-dimensional problems. A broad field of application is covered by the RF,
including organ localization in computed tomography [Crim 09], keypoint recognition
[Lepe 06], pose detection [Roge 08], motion artifact detection [Lorc 17], cell detection
[Mual 13], super-resolution [Sind 18], or image segmentation [Shot 08]. RF fall into
the category of ensemble classifiers and are realized by combination of many decision
trees such as Classification and Regression Trees (CART) [Brei 84]. The result for an
image describing feature vector is obtained by either implementing a majority vote
over the individual decision trees in case of a classification problem or by averaging
the individual tree results in case of a regression problem. To prevent the creation of
identical decision trees, randomness is introduced into the forest by randomly sam-
pling with replacement from the annotated training dataset. This strategy produces
multiple varying datasets, whose evaluation will lead to different decision trees, a
procedure that is referred to as bagging. Additional randomness is derived by only
using random subsets of the available features instead of using the complete feature
vector. The basis of the RF is an individual decision tree as visualized in Figure 3.10.
The tree is a set of nodes that are arranged hierarchically, without any loops, while
each node has two child nodes. The top of the tree is defined as root node, fol-
lowed by internal or split nodes and terminating with leaf nodes. Using the training
set, a decision is being made at each internal node that subdivides the dataset into
smaller parts until reaching a leaf node. These contain the final result and assign
an output to the input feature vector x, i. e. the class membership or a continuous
variable. At test time, an unseen sample traverses the tree according to the learned
split criteria until reaching a leaf node, which assigns the output to the sample. Let
Figure 3.10: Principal generation of a decision tree. The input data is represented
as a collection of points in the 2-D feature space (x1 , x2 ). During training of the
hierarchical structure of connected nodes, the training data v is passed into the tree
to find optimal parameters that split the data at the internal nodes until no further
subdivision is possible. At test time, the data instance is passed through the tree
according to the determined parameters until reaching a leaf node, corresponding to
a certain class.

X be a dataset of feature vectors x with xi = (x1, x2, . . . , xd) ∈ Rd consisting of d
scalar features and one corresponding class label defined by y. The decision function
hD (x, θj ) ∈ {0, 1} determines whether hD is true or false, where θ = (φ, ψ, τ ) are
parameters of the j-th split node. The function φ(x) randomly samples features of
the feature vector and the geometric primitive ψ separates the data in feature space,
e. g. by axis-aligned hyperplanes [Crim 12] where τ defines the thresholds for the
binary splits. The parameters necessary for obtaining the binary splits are derived
during training according to:
 
θj∗ = arg maxJj Sj , SjL , SjR θj , (3.10)
θj

where Sj , SjL , SjR denotes the training data before and after the split (cf. Figure 3.10).
Possible options for the objective function J are the information gain or Gini impurity:
Jj = H(Sj) − Σ_{i∈{L,R}} (|Sj^i| / |Sj|) H(Sj^i)    (3.11)

with Shannon entropy

H(S) = − Σ_{y∈Y} p(y) log p(y).    (3.12)

The Gini impurity is defined as:


G (S) = (3.13)
X
p(y)(1 − p(y))
y∈Y
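The split criteria of Equations 3.11 to 3.13 can be sketched in a few lines of NumPy; the function names and the toy labels below are illustrative assumptions, not part of the thesis implementation:

```python
import numpy as np

def entropy(labels):
    """Shannon entropy H(S) of a label array (cf. Eq. 3.12), using log2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gini(labels):
    """Gini impurity G(S) (cf. Eq. 3.13)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return np.sum(p * (1.0 - p))

def information_gain(parent, left, right):
    """Information gain J_j of a binary split (cf. Eq. 3.11)."""
    n = len(parent)
    child = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - child

# A pure split separates the two classes perfectly and yields the maximal gain.
parent = np.array([0, 0, 1, 1])
print(information_gain(parent, parent[:2], parent[2:]))  # → 1.0
```

A split node would evaluate this gain for candidate thresholds τ and keep the maximizing one.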

Figure 3.11 shows a dataset with four uniformly distributed classes. The geometric
primitive allows axis-aligned separation of the dataset, which leads to multiple possi-
ble decision boundaries. Using information gain as the objective function, a horizontal
36 Chapter 3. Theoretical Background: Machine Learning


Figure 3.11: Information gain as a result of different data splits: (a) Initial dataset
with a uniform class distribution, (b) Horizontal split with the corresponding class
distribution, (c) Vertical split with the corresponding class distribution.

split would lead to lower entropy in the resulting subsets, and thus a larger information gain, than a vertical split. By maximizing the information gain, the entropy decreases starting from the root towards the leaf nodes, which increases the
confidence of the prediction. When using decision trees for classification, each of the
leaf nodes defines the empirical distribution of class y related to x. This leads to the
probabilistic leaf predictor model for a tree t as:

pt (y|x) (3.14)

In case of RF, many unpruned decision trees are derived in parallel. During train-
ing, randomness is introduced to avoid over-fitting and derive better generalization
to unseen data. This is mainly achieved by the bagging procedure that employs
slightly varying subsets of the data and therefore leads to a generation of different,
independent decision trees. Besides bagging, features are randomly subsampled at
the internal nodes to further increase randomness. During testing, an unseen data
instance is passed through all decision trees, where each leaf yields the posterior for
the individual tree. The forest output is derived with a majority vote or in case of
regression determined by the average over all trees:

p(y|x) = (1/Nt) Σ_{t=1}^{Nt} pt(y|x)    (3.15)

In comparison with other classification algorithms, RF have a couple of useful advantages, as they can solve multi-class problems naturally, provide a probabilistic
output, are efficient to train by reasons of their mutual independence and also show
robustness against over-fitting due to bagging [Crim 12].
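As an illustrative sketch (not part of the thesis code), this scheme maps directly onto scikit-learn's RandomForestClassifier, an assumed dependency: bootstrap controls the bagging, max_features the random feature subsampling at the internal nodes, and predict_proba averages the per-tree posteriors as in Equation 3.15:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, size=(100, 2)),    # class 0 cluster
               rng.normal(+2.0, 1.0, size=(100, 2))])   # class 1 cluster
y = np.array([0] * 100 + [1] * 100)

forest = RandomForestClassifier(
    n_estimators=50,       # number of trees N_t
    bootstrap=True,        # bagging: each tree sees a bootstrap sample
    max_features="sqrt",   # random feature subsampling at the internal nodes
    random_state=0,
).fit(X, y)

# Averaged posterior over all trees (cf. Eq. 3.15) for two unseen samples.
print(forest.predict_proba([[2.0, 2.0], [-2.0, -2.0]]))
```

The mutual independence of the trees also allows this fit to be parallelized over the forest.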


Figure 3.12: Exemplary separation of data consisting of two classes using SVM
with maximum margin separation hyperplane.

3.3.2 Support Vector Machine


The Support Vector Machine (SVM) [Cort 95] algorithm computes a decision bound-
ary and maximizes the margin between datasets that belong to different classes, such
that the data is optimally separated from each other. Typical applications are found
in the area of speech recognition, such as age and gender determination [Bock 08],
the detection of articulation disorders [Maie 09a], or language learning [Maie 09b]. An
example of a maximum margin hyperplane or decision boundary that separates two
classes is visualized in Figure 3.12. As the example depicts, no sample of either class
falls within the margin on either side of the separation hyperplane. Hence, this is
referred to as hard margin SVM. If it is not possible to find an optimal solution or to
separate the datasets without errors, one can relax this constraint to allow samples
to lie within the margin, which is referred to as soft margin SVM. Both categories
require class labels for the data points and hence belong to the area of supervised
classification algorithms.

Hard Margin Support Vector Machine


In order to introduce the working principle of the SVM, a binary classification problem
with two classes is considered. In this case, the samples of the classes can be separated
with a linear hyperplane as visualized in Figure 3.12 according to [Bish 06, p. 326]:

ŷ(x) = w⊤x + b    (3.16)

which separates the training data into two classes and where xn ∈ RD , n = 1, . . . , N
denotes the data samples with corresponding class labels y ∈ {−1, +1}. New data
samples would be classified in accordance with the sign of ŷ(x). Additionally, weights
are described by w, whereas b depicts a bias parameter. With the assumption of
linearly separable datasets in feature space and no misclassifications, there exists at
least one set of parameters such that yn ŷ (xn ) > 0 ∀xn . The aim is to find the optimal

parameters, the ones that maximize the margin between data points which are closest
to the hyperplane. In general, the perpendicular distance of a data point x to the
hyperplane is defined as |y(x)|/||w|| and hence the distance of a point xn to the
decision boundary is derived as:

yn ŷ(xn) / ‖w‖ = yn (w⊤xn + b) / ‖w‖.    (3.17)
Since the margin is determined by the points closest to the decision boundary, the
so-called support vectors, the aim is to optimize the parameters in order to maximize
these distances and hence find a maximum margin solution for points with minimal
distance to the decision boundary [Bish 06, p. 327]:
arg max_{w,b} { (1/‖w‖) min_n [ yn (w⊤xn + b) ] }    (3.18)
As this is a non-convex optimization problem, conversion into an easier to solve convex
optimization problem is encouraged. To derive this, the data is rescaled by an arbi-
trary factor, such that the distance from any data sample to the decision boundary
remains unchanged. For the samples closest to the decision boundary one derives

yn (w⊤xn + b) = 1.    (3.19)
From this it follows that the remaining data points satisfy the constraints

yn (w⊤xn + b) ≥ 1, n = 1, . . . , N.    (3.20)
Since there are at least two data samples closest to the hyperplane, one for each
class, the optimization problem requires maximization of kwk−1 which is equivalent
to minimization of kwk2 , and hence one derives the convex optimization problem
[Bish 06, p. 328]:
arg min_{w,b} (1/2)‖w‖²  s.t. ∀n : yn (w⊤xn + b) ≥ 1,    (3.21)
which can be solved efficiently using Lagrangian multipliers αn :
L(w, b, α) = (1/2)‖w‖² − Σ_{n=1}^{N} αn { yn (w⊤xn + b) − 1 }    (3.22)

while minimizing with respect to w and b and maximizing with respect to α. Elimi-
nation of w, b from Equation 3.22 by setting the derivatives of w, b to zero leads to
the dual representation [Bish 06, p. 329]:
L̃(α) = Σ_{n=1}^{N} αn − (1/2) Σ_{n=1}^{N} Σ_{m=1}^{N} αn αm yn ym k(xn, xm)    (3.23)
with respect to the constraints:
αn ≥ 0, n = 1, . . . , N

Σ_{n=1}^{N} αn yn = 0    (3.24)

where the kernel function in case of a linear kernel is defined by k(x, x′) = x⊤x′.


These kernel functions are one of the main reasons why SVM became so popular as
they provide outstanding performance for non-linearly separable data points. Since
the decision function only relies on the inner product of the vectors in the feature
space, it is sufficient to evaluate the kernel function rather than performing an explicit projection into the feature space. Popular kernel functions are the linear kernel, the polynomial kernel k(x, x′) = (x⊤x′ + 1)^k, and the radial basis function kernel k(x, x′) = exp(−γ‖x − x′‖₂²). In order to classify new data samples, the sign of the trained
model is evaluated with respect to the parameters αn according to:
ŷ(x) = Σ_{n=1}^{N} αn yn k(x, xn) + b    (3.25)

The demonstrated constrained optimization problem satisfies the Karush-Kuhn-Tucker


(KKT) conditions, which require the following properties to hold [Bish 06, p. 330]:

αn ≥ 0
yn (w⊤xn + b) − 1 ≥ 0    (3.26)
αn { yn (w⊤xn + b) − 1 } = 0

From the complementary slackness it follows that samples either fulfill αn = 0 or yn (w⊤xn + b) = 1. Consequently, when predicting new data samples, any sample for which αn = 0 holds is neglected in Equation 3.25, since it has no effect on the decision boundary. The remaining samples are the so-called support vectors that define the decision boundary and satisfy yn (w⊤xn + b) = 1. This is of fundamental
importance for SVM as most of the data samples, apart from the support vectors,
can be discarded.

Soft Margin Support Vector Machine


Kernel functions introduced the possibility to separate datasets with a non-linear
decision boundary. However, if the data is not separable, an optimal separation
hyperplane can still be found when relaxing the separation criterion. This means,
that data samples are allowed to lie within the margin on both sides of the decision
boundary, leading to the following optimization problem [Bish 06, p. 332]:

min_{w,b,ξn} (1/2)‖w‖² + C Σ_{n=1}^{N} ξn    (3.27)

with the constraints


 
s.t. ∀n : yn w> xn + b ≥ 1 − ξn (3.28)
∀n : ξn ≥ 0
where ξn are the slack variables that penalize the points lying within the margin.
The parameter C is used to trade-off the margin and the penalty. Again, as within the

hard margin case, this can be transformed to the dual form based on the Lagrangian,
which leads to the same optimization problem as within the hard margin case, but
depending on other constraints:
L̃(α) = Σ_{n=1}^{N} αn − (1/2) Σ_{n=1}^{N} Σ_{m=1}^{N} αn αm yn ym k(xn, xm)

s.t. Σ_{n=1}^{N} αn yn = 0, ∀n : 0 ≤ αn ≤ C    (3.29)

where all data points with αn = 0 do not contribute to the predictive model. If αn = C, the points lie inside the margin, and if 0 < αn < C, the points lie exactly on the margin and are considered as support vectors.
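As a minimal sketch of the soft margin formulation (using scikit-learn's SVC as an assumed dependency, not the thesis implementation), where C corresponds to the trade-off parameter of Equation 3.27 and gamma to the RBF kernel parameter γ:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
# Radial class boundary: not separable by a linear hyperplane in input space.
y = (np.linalg.norm(X, axis=1) > 1.0).astype(int)

clf = SVC(kernel="rbf", C=1.0, gamma=1.0).fit(X, y)

# Only the support vectors (samples with alpha_n > 0) define the boundary;
# the remaining training samples could be discarded.
print(len(clf.support_vectors_), "of", len(X), "training samples are support vectors")
```

Larger values of C penalize slack more heavily and shrink the margin, whereas smaller values admit more margin violations.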

3.3.3 One-Class Support Vector Machine


One of the major challenges for classification problems is the uneven distribution of
data points per class. For example, in case of a disease, it is much easier to derive
physiological data than pathological data. Another example is the capturing of the
expected behavior of a manufacturing process, whereas it is rather difficult to estab-
lish a significant representation of the occasional error cases. In other words, it is
easier to obtain data of the common class versus the anomaly class. In such situa-
tions, it is beneficial to over- or undersample the data of the respective class to obtain
an equal distribution of samples per class and simply use one of the presented SVM
approaches. When the imbalance is too severe, or in cases where it is impossible to
collect representative data of a faulty system, it would be convenient only to use data
of the majority class to determine a decision boundary. Ideally, samples of the mi-
nority class would be classified as outliers or anomaly, as they would fall outside the
decision boundary of the majority class. Such a classification algorithm falls in the
area of unsupervised classification, as no labels are required to find a decision bound-
ary. Two equivalent approaches were proposed that derive a comparable decision
hyperplane, although implementing different solutions. The more intuitive approach
by Tax and Duin [Tax 04] introduced the Support Vector Data Description (SVDD)
that finds a minimal enclosing hypersphere for the majority class, whereas the One-
class Support Vector Machine (O-SVM) approach by Schölkopf et al. [Scho 00] solves
the problem by maximizing the distance of a hyperplane to the origin. A comparison
of both approaches is visualized in Figure 3.13 and gives an impression of their
transferability. The O-SVM approach uses the feature space, similar to the other
SVM methods, to separate the data points from the origin with a maximum margin
hyperplane. Therefore, a binary function is used that assigns positive labels to data
points of the majority class in a small region, and negative labels elsewhere. This
can be described as optimization problem:
min_{w,ξn,ρ} (1/2)‖w‖² + (1/(νN)) Σ_{n=1}^{N} ξn − ρ    (3.30)
with the constraints:

s.t. ∀n : w⊤xn ≥ ρ − ξn    (3.31)
∀n : ξn ≥ 0

(a) Minimal enclosing circle. (b) Separating hyperplane.

Figure 3.13: Outlier detection according to SVDD with the minimal enclosing hull
in (a). Separating hyperplane with maximal distance from origin with the O-SVM in
(b). Both approaches deliver equivalent results if all samples have the same distance
from the origin and are linearly separable from it [Lamp 09].

where ν is a trade-off parameter comparable to C of the soft margin method, and


ρ a bias term. The parameter ν characterizes the decision boundary and is used to
trade-off two characteristics of the decision boundary. On the one hand, it defines
the upper bound on the fraction of outliers, i. e. the amount of data points that are
classified as outliers during training, and on the other hand provides the lower bound
on the number of training samples used as support vectors. Using Lagrangian, the
decision function is derived as [Lamp 09, p. 55]:
ŷ(x) = Σ_{n=1}^{N} αn k(x, xn) − ρ.    (3.32)
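The behavior described above can be sketched with scikit-learn's OneClassSVM (an assumed dependency); nu corresponds to the trade-off parameter ν, and the data is an illustrative stand-in for the majority class:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 2))        # only majority-class data, no labels

# nu = 0.05: at most ~5% of the training data may be flagged as outliers,
# and at least ~5% of the samples become support vectors.
ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma=0.5).fit(X_train)

print(ocsvm.predict([[0.0, 0.0]]))   # near the training data: inlier (+1)
print(ocsvm.predict([[6.0, 6.0]]))   # far from the training data: outlier (-1)
```

Samples of a hypothetical minority class would thus be rejected without ever having been observed during training.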

3.3.4 Gaussian Mixture Models


Other unsupervised classification algorithms are the Gaussian Mixture Models (GMM)
or the Student's t Mixture Models (SMM). In contrast to the O-SVM approach, no assumptions on the data, such as prior knowledge, are required. Before discussing the
details of GMM and SMM, a general clustering problem is considered. Given a data
set {x1 , . . . , xN } that consists of varying classes, it can be separated into K clusters.
Herein, one cluster comprises a group of data points whose inter-point distances are
smaller in comparison to points outside the cluster. The prototype of one cluster is
depicted by µk and represents the cluster center. An assignment of data points to
clusters is then derived by minimization of the sum of squared distances of each data
point to its closest cluster center. This requires a binary indicator r̂nk ∈ {0, 1}, where k = 1, . . . , K, that indicates whether a data point is assigned to the cluster. The objective function is then defined by [Bish 06, p. 424]:

J = Σ_{n=1}^{N} Σ_{k=1}^{K} r̂nk ‖xn − µk‖²    (3.33)

which describes the sum of squared distances of each data point to the assigned cluster
center. This is a two-stage optimization problem since r̂nk is dependent on the cluster

centers µk , which are randomly initialized. As a result, one of the parameters is kept
fixed, while optimizing the other parameter, starting with r̂nk, according to:

r̂nk = 1 if k = arg min_j ‖xn − µj‖², and r̂nk = 0 otherwise.    (3.34)
This is subsequently followed by the optimization of µk according to:
µk = (Σn r̂nk xn) / (Σn r̂nk)    (3.35)
which sets µk to the mean of the data points xn assigned to the cluster k, and explains
why the procedure is referred to as K-means algorithm. This iterative scheme is
repeated until convergence. Figure 3.14 visualizes the K-means result for an unlabeled
dataset. The circles depict the support of the distance, i. e. points that lie within
the radius would be assigned to the corresponding cluster center. Therefore, a good
clustering result may be derived in case of circular distributed data by using the hard
assignments to cluster centers, which are solely dependent on the distance to the
cluster mean. Here, this approach leads to bad separation results as the distributions
comprise orientations. For this reason, GMM are introduced, which consider the
variance or covariance of a mixture of Gaussian distributions in addition to the mean
and thus lead to soft assignments of the data points with a certain probability. These
cluster membership probabilities are visualized in Figure 3.14 by the contour lines,
while lines closer to the center correspond to a higher probability.
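The two alternating K-means updates of Equations 3.34 and 3.35 can be sketched directly in NumPy; the dataset and function names below are illustrative assumptions:

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)]   # random initialization
    for _ in range(n_iter):
        # Assignment step (cf. Eq. 3.34): hard-assign each point to its
        # closest cluster center.
        d = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)
        r = np.argmin(d, axis=1)
        # Update step (cf. Eq. 3.35): each center becomes the mean of its
        # assigned points; an empty cluster keeps its previous center.
        mu = np.array([X[r == k].mean(axis=0) if np.any(r == k) else mu[k]
                       for k in range(K)])
    return mu, r

rng = np.random.default_rng(1)
X = np.vstack([rng.normal([0.0, 0.0], 0.1, size=(50, 2)),
               rng.normal([1.0, 1.0], 0.1, size=(50, 2))])
centers, labels = kmeans(X, K=2)
print(np.round(centers, 2))
```

For these two circular clusters the recovered centers approach the true means; for oriented distributions the hard, distance-only assignment fails, motivating the GMM below.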
As already mentioned, GMM generate soft decisions, such that data points can
belong to multiple clusters at the same time with varying probabilities. In this
context, GMM make use of the univariate Gaussian distribution according to:
N(x|µ, σs²) = (1/√(2πσs²)) exp(−(x − µ)²/(2σs²)),    (3.36)
where x refers to the random observations, µ defines the mean of the distribution and
σs depicts the standard deviation. Similarly, the multivariate Gaussian distribution
for a D dimensional vector x is defined as:
N(x|µ, Σ) = (1/((2π)^{D/2} |Σ|^{1/2})) exp(−(1/2)(x − µ)⊤ Σ⁻¹ (x − µ)),    (3.37)
where a Gaussian is fully specified by a mean vector µ of dimension D and a D × D
covariance matrix Σ. As in the previous K-means example, the mean vector describes
the position of the center of the distribution, while the covariance matrix defines the
spread and orientation of the distribution. Consequently, a distribution, consisting of
a mixture of K Gaussians, is a linear combination of Gaussians according to [Bish 06,
p. 430]:
p(x) = Σ_{k=1}^{K} πk N(x|µk, Σk),    (3.38)

where πk is a weighting factor of the individual Gaussians, subject to the constraints Σ_{k=1}^{K} πk = 1 and 0 ≤ πk ≤ 1. The parameters of the GMM are derived iteratively by
optimization of a maximum likelihood criterion [Demp 77]:

ln p(X|µ, Σ, π) = Σ_{n=1}^{N} ln { Σ_{k=1}^{K} πk N(xn|µk, Σk) }    (3.39)

(a) Data distribution. (b) K-means result. (c) GMM result. (d) SMM result.

Figure 3.14: Comparison of K-means and GMM results. The GMM approach covers
the distribution of the dataset in a more convenient way, while SMM provide a larger
support of the likelihood.

As previously, the optimization scheme consists of two individual steps, the expecta-
tion and maximization step, and is referred to as Expectation-Maximization (EM).
The intuition behind this approach is to derive the parameters for the Gaussians
that best explain the distribution of the dataset. This is defined as generative mod-
eling as the parameters that maximize the likelihood of observing the data have to
be determined. Typically, the parameters are initialized with K-means and updated
iteratively by means of the EM algorithm.
Within the expectation step, the posterior probability or responsibility that a data
point was generated by the respective Gaussian is determined according to [Bish 06,
pp. 438-439]:
γk(x) = πk N(x|µk, Σk) / Σ_{j=1}^{K} πj N(x|µj, Σj)    (3.40)

These are kept fixed during the maximization step, to update the parameters accord-
ing to:
µ̂k = (1/nk) Σ_{n=1}^{N} γk(xn) xn    (3.41)

Σ̂k = (1/nk) Σ_{n=1}^{N} γk(xn)(xn − µ̂k)(xn − µ̂k)⊤    (3.42)

π̂k = nk / N  with  nk = Σ_{n=1}^{N} γk(xn)    (3.43)

Here, nk is interpreted as the effective number of samples assigned to cluster k.


The mean µ̂k of the k-th Gaussian component is derived as a weighted average over all points of the dataset according to their posterior probabilities. The iterative
optimization procedure is stopped once the log likelihood of the data points under the current model parameters has improved by less than a predefined tolerance value, i. e. has converged in terms of the log likelihood.
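The EM scheme above is implemented, for example, in scikit-learn's GaussianMixture (an assumed dependency), which by default also uses a K-means initialization and a log-likelihood tolerance as stopping criterion; the dataset is an illustrative assumption:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.0, 0.0], 0.2, size=(200, 2)),
               rng.normal([2.0, 2.0], 0.2, size=(200, 2))])

gmm = GaussianMixture(n_components=2,
                      covariance_type="full",   # full covariance matrices Sigma_k
                      tol=1e-3,                 # log-likelihood stopping tolerance
                      random_state=0).fit(X)    # K-means initialization by default

# Soft assignments: the responsibilities gamma_k(x) of Eq. 3.40.
print(np.round(gmm.predict_proba([[0.0, 0.0], [2.0, 2.0]]), 3))
```

In contrast to the hard K-means assignment, each sample receives a membership probability for every component.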

3.3.5 Student's t Mixture Models


Gaussian Mixture Models, as described, might not always be a suitable choice for statistical modeling of distributions. For example, if a dataset contains data points that require heavier tails than a Gaussian, or if the dataset is impaired by noise and atypical observations, the model fit using Gaussians might be negatively affected [Peel 00].
The quality of the model using Gaussians is mainly determined by the approximation of the mean and standard deviation of the distribution, which are particularly compromised by the presence of outliers. For this reason, Peel and McLachlan [Peel 00] propose the use of multivariate Student's t distributions instead of Gaussians according to [Bish 06, p. 105]:
St(x|µ, Λ, ι) = (Γ((D + ι)/2) / Γ(ι/2)) · (|Λ|^{1/2} / (πι)^{D/2}) · [1 + (x − µ)⊤Λ(x − µ)/ι]^{−(D+ι)/2}    (3.44)

where D is the dimension of x, Γ defines the gamma function, ι depicts the degrees of freedom, and Λ describes the precision matrix, the inverse of the covariance matrix. The effect of
the ι parameter is visualized in Figure 3.15. For ι → ∞ the t distribution converges
towards a Gaussian distribution. In comparison to the Gaussian distribution, the
longer support of the t distribution due to the longer tails is noticeable. This behavior
can also be observed in Figure 3.14 (d), where the likelihood of the t distribution
has larger support, as visualized by the contour lines. Since the data of the example
was generated using Gaussians without presence of outliers, the Gaussian fit would
be the preferred choice. The adverse effect of outliers on the parameter estimation
of the Gaussian distribution for a univariate model is depicted in Figure 3.16. The
mean value is shifted in the direction of the outliers, and the standard deviation
increases. This effect cannot be observed among the t distributions, as the longer tails render them more robust against outliers, which has been demonstrated in the areas of medical image segmentation [Nguy 12], registration [Gero 09], and group-wise alignment of shapes [Ravi 16]. Similar to the GMM approach, the parameters of the SMM are
derived by employing the iterative EM procedure and the maximization of the log
likelihood.
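The heavier tails can be made concrete with SciPy (an assumed dependency, not part of the thesis implementation) by comparing the two-sided tail mass P(|x| > 3) of a standard Gaussian with that of Student's t distributions for several degrees of freedom ι:

```python
from scipy.stats import norm, t

# Two-sided tail mass P(|x| > 3): heavier tails yield larger values, so the
# tail mass shrinks towards the Gaussian value as the degrees of freedom grow.
for name, dist in [("Gaussian", norm()),
                   ("St(iota=1)", t(df=1.0)),
                   ("St(iota=5)", t(df=5.0))]:
    print(f"{name:10s} {2 * dist.sf(3.0):.4f}")
```

This larger tail mass is precisely why single outliers pull the fitted t parameters far less than the Gaussian mean and standard deviation in Figure 3.16.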


Figure 3.15: Comparison of the Gaussian and Student's t distribution. If ι approaches infinity, the Student's t distribution is equal to the Gaussian distribution.


Figure 3.16: Comparison of the Gaussian and Student's t distribution in the presence of outliers. The mean of the Gaussian distribution is shifted towards the outliers, whereas the position of the Student's t distribution is unaffected by the outliers.

3.4 Deep Learning

So far, this thesis introduced the theory behind the conventional type of pattern
recognition methods according to the pattern recognition pipeline (cf. Figure 3.1).
One disadvantage of these approaches concerns the feature extraction and classifica-
tion part, which involves human-made decisions that do not necessarily lead to an
optimal solution of the problem. Even though these decisions comprise well-thought
choices of relevant features that suit the problem, these are still non-optimal and
combined with a specific classification or regression algorithm, that is somewhat ar-
bitrarily chosen [Duda 00]. Furthermore, the classification algorithms are independent
of the feature generation step and hence cannot influence or modify the quality of
the features. Consequently, the solution is derived using non-optimal characteristics
of the problem to be solved. This limitation is mitigated with the advent of DL since
it combines the feature extraction and classification step by minimizing a suitable
problem-specific loss function. As a consequence, the automatic data-driven learning
of optimal relevant features provides an intuition why DL substantially improved the
state-of-the-art results in machine learning tasks such as classification, speech recog-
nition, or object detection [LeCu 15]. The step-wise improvements introduced by DL
can particularly be observed by the advances achieved in the ImageNet Large Scale
Visual Recognition Challenge (ILSVRC) [Russ 15]. This challenge uses the ImageNet
database designed for visual object detection that consists of around 14 million man-
ually annotated images with more than 20000 classes. Typically, the Top-1 or Top-5
classification error is used to benchmark and compare different approaches. Herein,
the Top-5 classification error is the fraction of test images, for which the actual ground
truth label is not included within the Top-5 predictions of the classifier. While in
2011, the best traditional machine learning approach based on handcrafted features
in combination with an SVM achieved a Top-5 error of 25% [Sanc 11], this approach
was outperformed by a DL method that derived a Top-5 error of 16.7% [Kriz 12]. In
2015, He et al. [He 15] achieved a Top-5 error of 4.94% that surpassed the 5.1% Top-5
error of humans for the first time and was further improved to 3.67% [He 16].
This steep performance improvement within a short period of time had multiple rea-
sons. First of all, with the use of Graphic Processing Units (GPUs), it was possible
to efficiently train the different DL architectures that increased in depth and complexity over time. While in 2012, the network depth was only eight layers (AlexNet) [Kriz 12], the depth grew to 19 layers (VGG19) in 2014 [Simo 14], and finally reaching
152 layers (ResNet) in 2015. Overall, deeper architectures are considered advanta-
geous, specifically because their complexity is increased, as they encourage reuse of
features that allow the extraction of abstract features on deeper layers [Beng 13]. The
architectural changes, especially the increased depth, would not have been possible
without algorithmic improvements such as Rectified Linear Units (RELUs) [Nair 10]
to mitigate vanishing gradients or regularization techniques such as dropout to pre-
vent overfitting [Sriv 14]. Besides the architectural and algorithmic advances, together
with the increased computation power of GPUs, the availability of large labeled data
sets like Common Objects in Context (COCO) [Lin 14], Pascal Visual Object Classes
Challenge (PASCAL) [Ever 15] or ImageNet [Deng 09] supported the advent of new
learning-based approaches. Especially the software libraries such as Theano [Thea 16],

Torch [Coll 11], Caffe [Jia 14], TensorFlow [Abad 16], and Keras facilitated the use and
implementation of DL models and thus simplified access to this technology. These
advances led to various applications in the field of medical imaging and mechanical
engineering, such as computer-aided diagnosis [Shin 16, Aubr 17], non-rigid registra-
tion [Kreb 17], CT-reconstruction [Wurf 16, Hamm 17], landmark detection [Bier 18],
deep denoising [Aubr 18], or in mechanical engineering for example in the field of
defect detection for photovoltaic module cells [Deit 19] or event detection [Sanc 18].
Another trend in DL and signal processing is to embed known parts of the process-
ing chain into the network architecture so that the maximum error bounds may be
reduced [Maie 19b]. The remainder of this section introduces the main principles be-
hind DL, including the different layers and blocks typically used in DL architectures,
as well as optimization techniques and algorithms necessary for the training of neural
networks.

3.4.1 Feed-forward Neural Networks


Feed-forward neural networks or Multilayer Perceptrons (MLPs) constitute the basis
of DL models. The smallest building block of an MLP is the computational neuron,
while multiple stacked neurons, referred to as layers, form an MLP. The depth of the
model is defined by the number of layers, which led to the term “deep learning” or
“deep neural networks”. In general, one refers to these models as feed-forward since
the information flows through the network without recurrent connections, thereby
constituting a directed acyclic graph. A simple MLP model comprising multiple
neurons is visualized in Figure 3.17 (b). It consists of three different layers: (1) the
input layer corresponds either to the input data by itself or its feature representation;
(2) intermediate layers that are referred to as hidden layers since they are not visible
to external systems or private to the network and (3) the output layer, that generates
an output that resembles the ground truth. An MLP learns the model parameters
θ of a function f , such that the mapping of an input x to an output y is optimally
approximated according to y = f (x; θ) [Good 16, p. 168]. In a forward network,
the neurons of each layer receive as input the output of all neurons of the preceding
layer, as depicted by the computational neuron in Figure 3.17 (a). Formally, a layer
consists of M neurons that have to be set in advance, such that the output of a layer
can be expressed as M linear combinations of the input variables x1 , . . . xD [Bish 06,
p. 227]:
zj = Σ_{i=1}^{D} wji^(1) xi + wj0^(1) = wj⊤x + bj    (3.45)

where j = 1, . . . M , and the superscript (1) denotes that parameters correspond to the
first layer of the network. For reasons of clarity, this superscript notation is omitted
for the remainder of the thesis, where possible. The remaining parameters, wji and
wj0 or bj are referred to as weights and biases. Furthermore, with integration of the
bias parameters into the weights as w0 and extension of the input vector by 1 (cf.
Figure 3.17 (a)), Equation 3.45 can be expressed as vector-matrix multiplication for
the entire layer:
z = Wx + b (3.46)

(a) Computational neuron. (b) MLP.

Figure 3.17: Computational neuron (a) and MLP (b). The network model is
represented by an acyclic graph of multiple layers, while each neuron is depicted by
a circle.

To derive the outputs of the respective layers, the parameters are aggregated in zj
and transformed using differentiable, nonlinear activation functions h according to:

aj = h (zj ) (3.47)

where aj defines the output of the layer, which, more specifically, is referred to as the activations. These activations then serve as input for the subsequent layers, while at the output layer, the choice of activation function depends on the nature of the target variables, i. e. whether a prediction for a regression or classification problem is determined.
A single neuron can already be used as a classifier by itself using the sign function, as in Rosenblatt's perceptron [Rose 58], whereas the modeling capacity drastically increases in combination with additional neurons, such that any continuous function can be approximated by a single-layer neural network [Cybe 89].

Optimization
The most critical part of DL involves the training of the network. Since this is the
most expensive and time-consuming part of DL algorithms, multiple optimization
techniques have been developed to increase performance. Overall, the aim is to
iteratively find optimal parameters θ∗ that significantly minimize a cost function C
according to [Good 16, p. 274]:

θ∗ = arg min_θ C(θ)    (3.48)

The cost function C(θ) uses the current model parameters to evaluate the error func-
tion Le for each sample of the empirical distribution p̂data (x, y) of size Nd :

C(θ) = E_{(x,y)∼p̂data} [Le(f(x; θ), y)] = (1/Nd) Σ_{i=1}^{Nd} Le(f(xi; θ), yi)    (3.49)

where Le is the per-example loss of the network's predicted outputs f(xi; θ)
with respect to the corresponding sample ground truth yi . While y depicts the class
membership in supervised classification approaches, the optimization scheme can be
reformulated without target variables in order to optimize a regression or unsuper-
vised classification problem. Furthermore, additional regularization terms constrain-
ing the parameters θ may be added to the cost function in order to achieve desired
properties as for example sparsity. When considering the L2 loss, a common loss
function for regression problems, the cost function becomes:
C(θ) = Σ_{i=1}^{Nd} ‖yi − f(xi; θ)‖₂²    (3.50)

When considering a binary classification task, the cross-entropy is employed, typically combined with a softmax that derives the class probabilities according to [Good 16, p. 179]:

C(θ|y, x) = −E_{x,y∼p̂data} [log pmodel(y|x)]
p(y = i|x) = softmax(z)i = e^{zi} / Σj e^{zj}    (3.51)

As already mentioned, the network is optimized iteratively by re-investigating the


empirical dataset. Since the network parameters θ are randomly initialized, they are
optimized by using a gradient scheme that minimizes the loss function and thereby
leads to convergence towards an optimal solution:

θt+1 = θt − η∇C (θt ) (3.52)

where η is the learning rate that controls the step-size in the gradient direction and
therefore defines the influence on the parameters. Similar to the number of neurons,
the initial learning rate needs to be set in advance of the training procedure.
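The gradient update of Equation 3.52 can be sketched in a few lines of Python; the convex cost, its gradient and all parameter values below are hypothetical illustrations rather than quantities from this thesis:

```python
import numpy as np

def gradient_descent(grad_C, theta0, eta=0.1, n_steps=200):
    """Iteratively apply theta_{t+1} = theta_t - eta * grad_C(theta_t)."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_steps):
        theta = theta - eta * grad_C(theta)
    return theta

# Hypothetical convex cost C(theta) = ||theta - 3||^2 with gradient 2 (theta - 3);
# the iteration converges towards the minimizer theta* = 3.
theta_star = gradient_descent(lambda th: 2.0 * (th - 3.0), theta0=[0.0])
```

With a learning rate that is too large the iteration would diverge, mirroring the trade-off discussed in the following.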

3.4.2 Stochastic Gradient Descent


The Stochastic Gradient Descent (SGD) optimization scheme is one of the most im-
portant algorithms used for training and optimization of DL models, which iteratively
optimizes an objective function. In general, it might be possible to evaluate the cost
function using the whole empirical dataset (cf. Equation 3.49), at least for small
datasets, whereas with the increasing sizes of common datasets such as COCO or
PASCAL, this strategy becomes infeasible due to memory limitations. Additionally,
this optimization approach might result in slow convergence and long run-times when
only a single image or data sample is assessed. To mitigate these limitations, mini-
batches, i. e. random subsets of the data are considered to approximate the cost
function and to update the parameters in dependence of the gradient according to
[Good 16, p. 298]:
∇C(θt) = (1/Nm) Σ_{i=1}^{Nm} ∇_θ Le(f(xi; θt), yi)    (3.53)
where Nm determines the size of the mini-batch. One of the major benefits of the
mini-batch SGD is that, in general, a faster convergence is achieved when rapidly
50 Chapter 3. Theoretical Background: Machine Learning

using approximations of the gradients rather than slowly deriving the exact gradients
[Good 16, p. 278]. Larger mini-batches lead to more accurate estimates of the gradi-
ents, while multi-core architectures can efficiently be utilized with a sufficiently large
mini-batch to enhance convergence. Conversely, small batches offer a regularization
effect [Wils 03] and often achieve the best generalization error. Since this additionally
increases the variance, a lower learning rate should be considered [Good 16, p. 279].
Besides the mini-batch size, the learning rate η is a crucial parameter that influ-
ences the optimization procedure, as indicated by Equation 3.52. If the learning rate
is too large, convergence might be impossible or lead to a non-optimal solution. If
the learning rate is too low, a non-optimal solution is again possible, as the
optimization procedure might get stuck in a local optimum, while the run-time
additionally increases. As the learning rate needs to be set in advance and to trade
off the described limitations, the initial learning rate is set to a rather large value. If
no more improvement is observed on the validation dataset for several iterations, it
is commonly reduced by a constant factor.
To further improve the run-time and convergence of the optimization, another
solution that incorporates momentum was introduced by Polyak [Poly 64]. Herein,
the preceding gradients are accumulated with an exponentially decaying moving av-
erage to avoid large jumps of the gradients and to stabilize the gradient direction.
Other optimization algorithms, such as AdaGrad [Duch 11] or RMSProp [Hint 12],
were introduced, while the Adam [King 14] algorithm can be considered as the stan-
dard method for optimizing DL models, as it automatically adapts the learning rate
with progression of the training procedure.
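A minimal sketch of mini-batch SGD with Polyak momentum on a least-squares problem may look as follows; the synthetic data, learning rate, momentum coefficient and batch size are illustrative assumptions, not settings used in this thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noiseless regression data y = X w_true with w_true = [2, -1].
X = rng.normal(size=(256, 2))
w_true = np.array([2.0, -1.0])
y = X @ w_true

def sgd_momentum(X, y, eta=0.05, beta=0.9, batch_size=32, epochs=50):
    """Mini-batch SGD with an exponentially decaying accumulation of gradients."""
    w = np.zeros(X.shape[1])
    v = np.zeros_like(w)
    n = len(X)
    for _ in range(epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            # Gradient of the mean squared error over the current mini-batch.
            grad = 2.0 / len(idx) * X[idx].T @ (X[idx] @ w - y[idx])
            v = beta * v - eta * grad   # momentum stabilizes the gradient direction
            w = w + v
    return w

w_hat = sgd_momentum(X, y)
```

Each epoch shuffles the data and visits every mini-batch once; the momentum term smooths the noisy per-batch gradients.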

3.4.3 Back-propagation
An efficient technique is necessary to evaluate the gradients for a feed-forward network
with respect to the weights and the loss function. As already introduced, the cost
optimization is derived by iteratively adjusting the parameters of the network. The
intuition behind the back-propagation approach, as introduced by Rumelhart et al.
[Rume 86], is that the weight of each neuron of the network is updated according to
its contribution to the overall error, which is evaluated by the loss function of the
last layer.
Therefore, at each iteration, two different steps have to be considered. The first
step involves a forward pass of the network, comprising subsequently applying Equation 3.45
and Equation 3.47 to derive zj and its activation aj for all neurons. Considering
an MLP as visualized in Figure 3.17 (b), one difficulty is the derivation
of good internal representations for the hidden layers. Since the nodes of these layers
do not have a target output, optimization of an error function that is specific to
that node is not possible. This is addressed by the second step, the so-called back-
propagation, which recursively applies the chain rule to calculate the derivatives,
while successively, the weights of each neuron are adapted according to its contribu-
tions to the loss. After a forward pass, all activations of layer l = {1, . . . , L} have
been set accordingly, so that back-propagation starts with the weights of the output
layer. Consequently, the partial derivative of Le with respect to a weight w_{jn}^L
between node j of the second last layer and node n of the last layer L is derived
according to [Bish 06, p. 243]:

∂Le/∂w_{jn}^L = (∂Le/∂z_n^L) (∂z_n^L/∂w_{jn}^L) = δ_n^L a_j^{L−1}    (3.54)

Since the activations were already set by the forward pass, it follows that only the
individual error δ_n^L needs to be determined with additional use of the chain rule
according to:

δ_n^L := ∂Le/∂z_n^L = (∂Le/∂a_n^L) (∂a_n^L/∂z_n^L) = (∂Le/∂a_n^L) h′(z_n^L)    (3.55)

with h′(x) being the derivative of the activation function. So far, only the partial
derivatives of the last layer are calculated. However, for any node k of an intermediate
layer l = 1, . . . , L − 1 the individual errors δ_k^l must incorporate all previous backward
paths up to node k. This is realized with the multivariate chain rule according to:

δ_k^l := ∂Le/∂z_k^l = Σ_{i=1}^{M^{l+1}} (∂Le/∂z_i^{l+1}) (∂z_i^{l+1}/∂z_k^l) = Σ_{i=1}^{M^{l+1}} δ_i^{l+1} (∂z_i^{l+1}/∂a_k^l) (∂a_k^l/∂z_k^l)
      = Σ_{i=1}^{M^{l+1}} δ_i^{l+1} w_{ik}^{l+1} h′(z_k^l)    (3.56)
where i = 1, . . . , M^{l+1} with M^{l+1} being the number of nodes in layer l + 1. Rearranging the
derivative of the activation function leads to the back-propagation formula according
to [Bish 06, p. 244], which is applied recursively to determine the δ's of the hidden
units:

δ_k^l = h′(z_k^l) Σ_{i=1}^{M^{l+1}} δ_i^{l+1} w_{ik}^{l+1}    (3.57)

One important aspect of the activation functions is already emphasized by this for-
mula. Since they are necessary to derive the activations and in the end to determine
the loss, they are required to be differentiable, as otherwise, back-propagation is
infeasible.
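The forward and backward steps can be illustrated numerically for a small two-layer network with a sigmoid hidden layer, a linear output and an L2 loss; all sizes, weights and inputs are arbitrary assumptions, and the analytic gradient of one weight is checked against a finite-difference approximation:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(size=(3, 4))   # weights between input (4 units) and hidden layer (3 units)
W2 = rng.normal(size=(1, 3))   # weights between hidden and output layer
x = rng.normal(size=4)
y = np.array([0.5])

# Forward pass: pre-activations z and activations a for all neurons.
z1 = W1 @ x
a1 = sigmoid(z1)
z2 = W2 @ a1                                 # linear output
loss = 0.5 * np.sum((z2 - y) ** 2)

# Backward pass: output error first, then the recursion for the hidden layer.
delta2 = z2 - y                              # dLe/dz2 for the L2 loss
delta1 = (W2.T @ delta2) * a1 * (1.0 - a1)   # h'(z1) times the accumulated errors
grad_W2 = np.outer(delta2, a1)               # per-weight gradient: delta times activation
grad_W1 = np.outer(delta1, x)

# Finite-difference check of a single weight.
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
loss_p = 0.5 * np.sum((W2 @ sigmoid(W1p @ x) - y) ** 2)
num_grad = (loss_p - loss) / eps
```

The numerical gradient agrees with the back-propagated one, which is the standard sanity check for a backward-pass implementation.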

3.4.4 Activation Functions


Activation functions or transfer functions are essential to introduce nonlinearities
into the architecture, and in particular, reveal their strengths when several layers
are stacked consecutively to approximate arbitrary functions. Until 2012, tanh and
sigmoid were the preferred choices, while nowadays, RELUs are considered as stan-
dard since they improved convergence and performance of networks [Kriz 12]. An
illustration of the different activation functions is visualized in Figure 3.18, whereby
the individual functions are described in the following.


Figure 3.18: Different examples of activation functions: ReLu, tanh and sigmoid.

Sigmoid and Tanh


The sigmoid function can be considered as a differentiable version of the step-function
[Rume 86] and is defined as:
h(z) = 1 / (1 + e^{−z})
h′(z) = h(z)(1 − h(z))    (3.58)

Although it is differentiable, the sigmoid has some drawbacks, as it saturates in most
parts of its domain. This saturation effect at both tails of the function may impair
gradient-based learning, since the gradient vanishes in these regions. This is usually
referred to as a vanishing gradient, which can be observed especially within larger
network structures, since the gradients are multiplied with each other during
back-propagation and thus lead to very small or zero gradients. Consequently, the
parameters of the network are no longer updated, such that the learning process of
the network stops. Besides the vanishing gradient, the function produces strictly
positive values and is not centered at zero. This impedes optimization and shifts
zero-mean input distributions towards more positive values, which need to be
compensated by subsequent layers. To mitigate these limitations, the tanh activation
function is introduced:
h(z) = (1 − e^{−2z}) / (1 + e^{−2z})
h′(z) = 1 − h(z)²    (3.59)

In general, the tanh is a shifted and scaled version of the sigmoid function with a
zero mean. However, the saturation effects are still present.

Rectified Linear Unit


Major improvements for parameter optimization of DL architectures have been de-
rived by using a RELU as activation function [Jarr 09, Nair 10] according to:
h(z) = z for z > 0, 0 otherwise
h′(z) = 1 for z > 0, 0 otherwise    (3.60)

The RELU function, as illustrated in Figure 3.18, behaves linearly for positive
inputs and returns zero otherwise. Therefore, the problem of the vanishing
gradient is resolved, since the derivative is constant for positive inputs, which improves
optimization and reduces training time. However, a RELU is not differentiable at
z = 0, so that a sub-gradient in the range [0, 1] is utilized to mitigate this limitation
during optimization [Maie 19a]. In addition, the behavior for negative inputs provides a
positive side effect, since unlike the sigmoid or tanh activation functions, not all
neurons have to fire. Consequently, activations may be set to zero, which on the
one hand enforces sparsity, while on the other hand may lead to dying neurons
as a consequence of the zero activations and their zero gradients. To mitigate this issue,
leaky RELUs were proposed that replace the negative part by a linear function with a
small slope [Maas 13], such that the gradient remains non-zero and neurons may recover during training.
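The discussed activation functions and their derivatives (cf. Equations 3.58–3.60) can be sketched directly; the leaky slope of 0.01 is a common but assumed choice:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def d_sigmoid(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def tanh(z):
    # Shifted and scaled version of the sigmoid with zero mean.
    return (1.0 - np.exp(-2.0 * z)) / (1.0 + np.exp(-2.0 * z))

def relu(z):
    return np.where(z > 0.0, z, 0.0)

def leaky_relu(z, alpha=0.01):
    # The small negative slope keeps the gradient non-zero for z < 0.
    return np.where(z > 0.0, z, alpha * z)
```

All four functions operate elementwise on NumPy arrays and therefore on whole activation maps at once.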

3.4.5 Network Structures and Topologies


Typical MLP architectures such as the one illustrated in Figure 3.17 (b) comprise mul-
tiple hidden or so-called Fully Connected (FC) layers, wherein each neuron receives
the input of all units of the previous layer. Consequently, the number of parameters
that need to be optimized quickly becomes infeasible. Considering an image of size
256 × 256 px, a common resolution used in ImageNet challenges, and 100 neurons in
one hidden layer, this would lead to more than 6 million weights for only one layer.
Since a network architecture usually consists of several stacked layers, the number of
weights would no longer be manageable, or would at least require significantly more
training data to optimize all weights accordingly, which in turn increases training
time and slows convergence. Furthermore, it is not optimal to process an image
pixel-wise, as local dependencies, such as object-related structures, might not be taken
into account by the FC layer. In order to reduce computational costs and exploit local
dependencies in images, convolutional layers as well as pooling layers were introduced,
which represent important components of CNNs [LeCu 89] and are presented in the following.

Convolutional Layers
Convolutional layers are the core components of CNNs, as they preserve local
dependencies and perform convolutions between two matrices. The smaller matrix
is referred to as kernel, which comprises the learnable parameters, whereas the other
matrix depicts the input signal. As a result of this operation, one obtains the activation
map, which corresponds to the kernel responses at each spatial position. Similar to
single neurons, the activation maps are subsequently processed by non-linear activation
functions. The dimensions of the filter kernel play an important role, as they define
the receptive field, the region of the signal which is provided to the kernel, which is
influenced by the width, height and depth of the kernel. An example of the convolution
operation is visualized in Figure 3.19 (a). Herein, a padded 6 × 6 image is
convolved with a 3 × 3 kernel, resulting in a 6 × 6 output. If no padding were applied,
a smaller image of size 4 × 4 would be the result. Of course, the kernel is not
restricted to only two dimensions. For example, if an RGB image is processed,
the depth dimension can be set to three, such that all color channels are taken into

(a) Convolutional layer (b) Pooling layer

Figure 3.19: Convolution and pooling layer adapted from [Dumo 16]. The 2D-input
(blue) is convolved with a 3 × 3 kernel whose values are combined to one value on
the subsequent layer (light blue). Within pooling, neighboring pixels from a small
neighborhood are combined to one value with a pooling operation.

account at the same time to calculate a single output value.

Convolutional Layers possess three important properties [Good 16, p. 335]: (1)
Sparse interactions, also known as sparse connectivity or sparse weights, which is a
result of the kernel being much smaller than the input. Small, representative features
such as edges are detected by kernels that consist of only a few pixels instead of
the whole image dimension. As a result, only few parameters need to be stored,
which reduces the memory consumption, increases the efficiency and requires fewer
operations to derive the output. (2) Parameter sharing, which means that the kernel
and its weights are reused at every spatial position of the image. However, this is not
the case in fully connected layers, where typically, each weight corresponding to each
input in a hidden layer is unique. Consequently, parameter sharing further reduces
memory consumption and prevents overfitting. (3) Equivariance to translation, which
means that the output changes in the same way as the input does. Consequently,
a shift of the input in one direction leads to a corresponding shift in
the activation map, which is beneficial since, in early layers, edges or other basic
features may occur at varying locations of the image. In addition, the convolution
layer can be mapped to matrix operations, so that efficiency is improved through
sparsity and parameter sharing while maintaining the gradient flow according to the
back-propagation as presented for fully connected layers.
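The core operation of a convolutional layer can be sketched with explicit loops; strictly speaking, the sketch computes a cross-correlation, as is common in CNN frameworks, and the averaging kernel and input are arbitrary examples matching the 6 × 6 and 3 × 3 dimensions of Figure 3.19 (a):

```python
import numpy as np

def conv2d(image, kernel, pad=1):
    """Slide a small kernel over a zero-padded 2D input and collect
    the responses into an activation map."""
    img = np.pad(image, pad)
    kh, kw = kernel.shape
    oh = img.shape[0] - kh + 1
    ow = img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for r in range(oh):
        for c in range(ow):
            # Kernel response at this spatial position.
            out[r, c] = np.sum(img[r:r + kh, c:c + kw] * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.ones((3, 3)) / 9.0          # simple averaging kernel as example
act = conv2d(image, kernel, pad=1)      # 6x6 output, as in Figure 3.19 (a)
```

Without padding the same call yields a 4 × 4 output, illustrating the size reduction described above.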

Pooling Layers
Pooling Layers are typically applied after the non-linear activation functions to re-
duce the size of the activation maps with a summary statistic of adjacent outputs
and thereby reduce the number of parameters. There exist multiple strategies like
maximum pooling, average pooling or the l2 norm. For example, maximum pool-
ing with a kernel of size 2 × 2 and a stride of 2 would return the maximum values
of non-overlapping kernel neighborhoods and consequently result in an output map

with only one fourth of the input dimension as visualized in Figure 3.19 (b). As a
side effect of the dimensionality reduction, pooling operations introduce translational
invariance to small variations of the input while additionally increasing the receptive
field of the subsequent layers. Other operations, such as the average operation, can
be expressed as a matrix with hard-coded weights, whereas non-linear operations such as
the maximum or median exploit sub-gradients. The required matrices for these
operations are created during the forward pass so that the correct element is selected
during the backward pass [Maie 19a].
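Maximum pooling with a 2 × 2 kernel and a stride of 2, as described above, reduces to a few lines; the input map is a hypothetical example:

```python
import numpy as np

def max_pool(act_map, k=2, stride=2):
    """Keep the maximum of each non-overlapping k x k neighborhood,
    reducing a 2D activation map to a quarter of its values."""
    h = (act_map.shape[0] - k) // stride + 1
    w = (act_map.shape[1] - k) // stride + 1
    out = np.zeros((h, w))
    for r in range(h):
        for c in range(w):
            out[r, c] = act_map[r * stride:r * stride + k,
                                c * stride:c * stride + k].max()
    return out

act = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool(act)   # 2x2 output containing the neighborhood maxima
```

Small shifts of the input within a neighborhood leave the output unchanged, which is the translational invariance mentioned above.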

3.4.6 Network Architectures


Over the years, many architectures have been introduced that use the presented building
blocks of neural networks in different combinations and quantities. These
architectures include AlexNet [Kriz 12], VGGnet [Simo 14], Inception net [Szeg 15] or ResNet
[He 16], whose performance is usually compared in competitions such as ILSVRC. A
comprehensive overview of current architectures and their applications is provided
in [Maie 19a]. Besides classification or object detection tasks, other encoder-decoder
structures such as autoencoders [Bour 88, Hint 94] were proposed, that traditionally
were used for dimensionality reduction or learning of representative features. In the
following, two architectures are presented that serve as a basis for the developed
methodologies.

VGG16

VGG16, named after the Visual Geometry Group, is a CNN architecture comprising
16 weight layers that achieved 92.7% top-5 test accuracy in the ILSVRC competition.
Typically, with an increasing depth of the network, more abstract and high level
features are learned. An overview of the general structure and the sequence of the
individual layers is provided in Figure 3.20.

RGB images of size 224 × 224 serve as input, while subsequently, the network
consists of multiple blocks of convolutional layers whose depth dimension increases
with the depth of the network. While in the first convolutional block only 64
filters are learned, the number of filters increases to 512, starting from the fourth
convolutional block. The filter kernel size is 3 × 3, with a stride length of 1 and zero
padding. Each convolutional block ends with a max-pooling layer that uses a 2 × 2
kernel and stride of 2, such that the dimension is significantly reduced. This leads
to an increased receptive field of the kernels in the subsequent layers, such that a
more global context is incorporated, and the network is able to learn relationships
among large distances of the image. After the fifth convolutional block, each image
is represented by a feature map of dimension 7 × 7 × 512, which is further processed by
two subsequent FC layers of size 4096. The final classification result is derived
using another FC layer that comprises 1000 neurons, where each neuron represents a
specific class of the ImageNet database. All layers use RELUs as activation functions
except for the last layer, which employs softmax due to its classification purpose.
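The feature-map dimensions described above can be traced with a short sketch; since 3 × 3 convolutions with zero padding preserve the spatial size, only the pooling layers halve it. The filter counts of the second and third blocks (128 and 256) follow the standard VGG16 configuration and are assumptions beyond the text:

```python
# Spatial size and filter count after each of the five convolutional blocks.
size = 224
filters = [64, 128, 256, 512, 512]
shapes = []
for f in filters:
    size //= 2                # 2x2 max-pooling with stride 2 at the end of each block
    shapes.append((size, size, f))

feature_map = shapes[-1]      # (7, 7, 512) after the fifth block
# Weights of the first FC layer alone: every feature connects to 4096 neurons.
fc1_weights = 7 * 7 * 512 * 4096
```

The count of over 100 million weights in the first FC layer alone illustrates why parameter sharing in the convolutional blocks is essential.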

Figure 3.20: VGG16 architecture: stacked convolutional + ReLU blocks with max-pooling, followed by fully connected layers and a softmax output.

Autoencoder

In contrast to the VGG16 structure, which is used in supervised classification tasks,


autoencoders are employed in unsupervised methods. This has the advantage that
no expert knowledge or ground truth annotation is required. For unsupervised ap-
plications there exist no defined architectures, as these are often problem-dependent.
However, Figure 3.21 illustrates the general structure of an autoencoder. An
autoencoder consists of two parts, the encoder and the decoder, which follow a typical
feed-forward network concept while pursuing opposing objectives. The
encoder in the illustrated example consists of the first three convolutional blocks
with subsequent pooling layers, similar to those of the VGG16 structure. The en-
coder block is terminated with two FC layers of size 512, followed by an additional
FC layer of size 256. This layer is referred to as bottleneck since the input image is
compressed into a lower dimensional representation, also known as latent space. Sub-
sequently, the decoder uses these representations and tries to reconstruct the input,
while the structure of the convolutional blocks is mirrored from the encoding path.
Typically, the mean squared error between the input and the reconstruction is used
as loss function and minimized during training.
The intuition behind this procedure is that with increasing depth, the dimen-
sionality decreases and consequently, the encoder has to prioritize which parts of the
input are considered as important. This ideally leads to representative features in the
latent space, that can be used for further processing of the data [Vinc 08]. As already
mentioned, autoencoders are mainly used for dimensionality reduction in an unsupervised
setup, with two important properties.

Figure 3.21: Autoencoder (convolutional encoder with pooling, fully connected bottleneck, and mirrored decoder with unpooling).

(1) They provide data-specific solutions,
i. e. that contrary to the broad VGG16 features, the latent space representations are
not expected to work on different datasets. (2) The reconstruction will deviate from
the input since by design of the encoder only lossy compression is possible.
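The principle can be illustrated with a minimal linear autoencoder (not the convolutional architecture of Figure 3.21); the synthetic data, the bottleneck size of 2 and the learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data with intrinsic dimensionality 2, embedded in 8 dimensions.
Z = rng.normal(size=(200, 2))
X = Z @ rng.normal(size=(2, 8))

W_enc = rng.normal(size=(8, 2)) * 0.1    # encoder weights (bottleneck of size 2)
W_dec = rng.normal(size=(2, 8)) * 0.1    # decoder mirrors the encoder

eta = 0.01
for _ in range(2000):
    latent = X @ W_enc                   # encoder: compression into the latent space
    recon = latent @ W_dec               # decoder: reconstruction of the input
    err = recon - X
    loss = np.mean(err ** 2)             # mean squared reconstruction error
    # Gradient steps on both weight matrices.
    g_dec = latent.T @ err / len(X)
    g_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= eta * g_dec
    W_enc -= eta * g_enc
```

Because the data have intrinsic dimensionality 2, a bottleneck of size 2 allows an almost lossless reconstruction; a smaller bottleneck would enforce genuinely lossy compression.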

3.5 Evaluation Metrics


Supervised classification algorithms can be assessed with evaluation metrics that
originate from confusion matrices [Witt 16]. Table 3.1 visualizes a confusion matrix
for a binary classification problem contrasting the actual ground truth labels with
the predicted classes of the algorithm. Typical metrics that can be derived from this
matrix comprise accuracy, precision and recall according to:

accuracy = (TP + TN) / (TP + FP + TN + FN)    (3.61)

precision = TP / (TP + FP)    (3.62)

recall = TP / (TP + FN)    (3.63)
In particular, the accuracy metric, which depicts the total share of correctly classified
samples, can be misleading for imbalanced datasets, since always predicting
the majority class might already lead to good accuracy. For example, if 97% of the
data belongs to the majority class, an accuracy of 97% can be achieved without even
correctly classifying a single sample of the minority class. Precision is a metric that

Table 3.1: Confusion matrix of a two-class prediction.

                                 Actual Class
                          Yes                      No
  Predicted   Yes   # true positive (TP)     # false positive (FP)
              No    # false negative (FN)    # true negative (TN)

measures the proportion of correctly classified positive samples relative to all samples
predicted as positive. Recall or sensitivity measures the fraction of correctly classified
positive samples relative to the actual number of positive samples, e. g. the proportion of sick
persons correctly classified as having the disease. The principle of confusion matrices
is not only applicable to two-class problems but can easily be extended to additional
classes.
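Equations 3.61–3.63 follow directly from the confusion-matrix counts; the counts below are a hypothetical example:

```python
def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall derived from a two-class confusion matrix."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall

# Hypothetical counts: high accuracy despite 20 missed positive samples.
acc, prec, rec = confusion_metrics(tp=80, fp=10, fn=20, tn=890)
```

The example also illustrates the imbalance effect discussed above: the accuracy of 0.97 hides a recall of only 0.8.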

3.5.1 Receiver Operating Characteristic


In principle, the confusion matrices or the metrics that can be generated from these
are suitable to compare different supervised classification algorithms. However, the
determined indicators are dependent on a threshold value, which is typically set to
50% for a two-class problem. Since it is usually not possible to decide in advance on a
certain threshold value, it is beneficial to evaluate the overall model performance. The
Receiver Operating Characteristic (ROC) curve is one graphical evaluation scheme
that summarizes the trade-off between the True Positive Rate (TPR), also referred
to as sensitivity or recall, and the False Positive Rate (FPR) at different thresholds
[Hast 05]. These indicators are again derived from the confusion matrix as:

tpr = TP / (TP + FN)
fpr = FP / (FP + TN)    (3.64)

An illustration of a ROC curve is depicted for a four-class problem in Figure 3.22,


where each point on the curves corresponds to a specific confusion matrix. The
diagonal depicts a random process, i. e. the accuracy corresponds to random guessing,
whereas curves below the diagonal indicate that the samples are systematically misinterpreted.
An ideal curve would reach a TPR of 100 percent at a simultaneous FPR of 0 percent, which
is rarely achievable. Usually, a ROC curve is employed as an optimization tool to
define a suitable threshold that can be utilized, for example, in product development.
The Area Under ROC Curve (AUC) is another performance indicator that is used
to compare classification methods. This metric makes it possible, for example, to represent
the results of a multi-class problem by the mean value among classes as a single
indicator, which in particular reduces the complexity of comparisons. However,
classes are considered equally important and consequently, misclassifications produce
the same costs.
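Both the ROC points of Equation 3.64 and the AUC, interpreted as the probability that a random positive sample is scored higher than a random negative one, can be sketched as follows; the scores and labels are hypothetical:

```python
import numpy as np

def roc_points(scores, labels, thresholds):
    """TPR and FPR of a binary classifier at several score thresholds."""
    pts = []
    for t in thresholds:
        pred = scores >= t
        tp = np.sum(pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        fn = np.sum(~pred & (labels == 1))
        tn = np.sum(~pred & (labels == 0))
        pts.append((fp / (fp + tn), tp / (tp + fn)))   # (FPR, TPR)
    return pts

# Hypothetical scores: positives tend to receive higher scores than negatives.
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2])
labels = np.array([1, 1, 1, 0, 1, 0, 0, 0])
curve = roc_points(scores, labels, thresholds=np.linspace(0.0, 1.0, 101))

# AUC as a rank statistic over all positive-negative pairs.
pos, neg = scores[labels == 1], scores[labels == 0]
auc = np.mean([p > n for p in pos for n in neg])
```

At the lowest threshold every sample is predicted positive, so the curve starts at FPR = TPR = 1, the upper right corner of the plot.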
3.6. Assessment of the Expert Annotation Quality 59

Figure 3.22: ROC curves of a multi-class classification problem with corresponding AUC scores for each class (Class 0: AUC = 0.965, Class 1: AUC = 0.903, Class 2: AUC = 0.979, Class 3: AUC = 0.987).

3.5.2 Dice Coefficient


The previous metrics are used to evaluate entire instances of objects, i.e., the whole
image is classified as belonging to a certain class. However, such an evaluation scheme
is not favorable for segmentation problems, since often only a portion of an image or
an object is of interest in the investigation so that an alternative evaluation method
has to be employed. For this purpose, manually created ground truth annotations
are provided by domain experts for the relevant image areas so that the quality of the
segmentation methodology is assessed based on a comparison between the estimates
and the ground truth annotations. This enables a classification and an evaluation on
pixel scale, which is typically provided by the dice coefficient [Dice 45]:
Dice = 2 TP / ((TP + FP) + (TP + FN))    (3.65)
The dice coefficient determines the spatial overlap of the ground truth mask with the
prediction and thus represents a metric for assessing reproducibility. The value of the
dice coefficient ranges from 0 (no spatial overlap) to 1 (complete spatial overlap
between the ground truth mask and the result of the binary segmentation). The
number of pixels classified as foreground or background is obtained by using a
threshold value of 0.5, and the pixels are interpreted as TP, FP and FN per prediction
accordingly.
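A sketch of the dice computation for a binary segmentation (Equation 3.65); the probability map and ground truth mask are hypothetical 4 × 4 examples:

```python
import numpy as np

def dice_coefficient(pred_prob, gt_mask, threshold=0.5):
    """Spatial overlap between a binary ground truth mask and a
    predicted probability map, binarized at the given threshold."""
    pred = pred_prob >= threshold
    gt = gt_mask.astype(bool)
    tp = np.sum(pred & gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    return 2.0 * tp / ((tp + fp) + (tp + fn))

# Hypothetical example: the prediction is shifted by one column.
gt = np.zeros((4, 4)); gt[1:3, 1:3] = 1          # ground truth covers 4 pixels
prob = np.zeros((4, 4)); prob[1:3, 0:2] = 0.9    # prediction overlaps half of them
score = dice_coefficient(prob, gt)
```

Two of the four ground truth pixels are recovered, which yields a dice score of 0.5.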

3.6 Assessment of the Expert Annotation Quality


In order to perform supervised classification experiments, ground truth annotations
are required that are usually assigned by experts in the respective field. In order
to evaluate the inter-rater reliability, one robust measure that assesses a pair of
raters is Cohen's kappa statistic [Cohe 60]. Similar to the previous metrics,
60 Chapter 3. Theoretical Background: Machine Learning

a confusion matrix (cf. Table 3.1) is utilized, whereby instead of employing the
predictions of a classifier, the agreement with another rater is examined. Generally,
Cohen's kappa (κ) evaluates the agreement between two raters, i. e. whether the
raters assign identical classes to the same data samples, in relation to the
agreement expected to occur by chance. It is defined as:
κc = (po − pe) / (1 − pe)    (3.66)
where po is the observed agreement among raters, identical to the previously defined
accuracy, and pe the hypothetical probability of agreement by chance, which is also
referred to as expected accuracy. More formally, pe is defined as:
pe = (1/Ns²) Σ_k n_{k1} n_{k2}    (3.67)
where Ns denotes the number of samples and nki the number of assignments of
category k made by rater i. Since Cohen's kappa is only valid in the context of two
raters, the scheme can be extended to multiple raters and multiple categories using
Fleiss' kappa [Flei 73]. While within Cohen's kappa, two raters are expected to
assess the same set of data samples, Fleiss’ kappa allows that different samples may
be rated by different individuals [Flei 71]. It is defined as:
κf = (po − pe) / (1 − pe)    (3.68)
where similarly po is the observed accuracy and pe the level of agreement by chance. In
order to determine the two metrics, the assignments of multiple raters are compiled
into a matrix, where Ns denotes the number of samples, Nr the number of ratings
per sample and k the number of different categories. The samples are indexed by
i = 1, . . . , Ns, whereas the categories range from j = 1, . . . , k. Consequently, the
proportion p_j of all assignments to the j-th category is derived as:

p_j = (1 / (Ns Nr)) Σ_{i=1}^{Ns} n_{ij}    (3.69)
where Nr refers to the total number of raters and nij denotes the number of raters
who assigned sample i to category j. The extent of agreement p∗i for the sample i is
calculated as:
  
p∗_i = (1 / (Nr(Nr − 1))) Σ_{j=1}^{k} n_{ij}(n_{ij} − 1) = (1 / (Nr(Nr − 1))) [ ( Σ_{j=1}^{k} n_{ij}² ) − Nr ]    (3.70)

which finally leads to the observed and expected accuracy according to:
po = (1/Ns) Σ_{i=1}^{Ns} p∗_i ,    pe = Σ_{j=1}^{k} p_j²    (3.71)

Perfect agreement of raters is depicted by κ = 1 for both kappa metrics, whereas


disagreement among experts would lead to κ ≤ 0. Further gradations of kappa were
suggested by Landis and Koch [Land 77], who describe a kappa value of 0.4–0.6 as
moderate agreement. It should be noted that the gradations of the kappa values
must generally be determined on a problem-specific basis.
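Cohen's kappa (Equation 3.66) can be sketched as follows; the annotations of the two hypothetical raters are illustrative:

```python
import numpy as np

def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters labeling the same set of samples."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    ns = len(r1)
    po = np.mean(r1 == r2)                       # observed agreement
    # Expected chance agreement from the raters' marginal label frequencies.
    cats = np.union1d(r1, r2)
    pe = sum(np.sum(r1 == c) * np.sum(r2 == c) for c in cats) / ns ** 2
    return (po - pe) / (1.0 - pe)

# Hypothetical annotations of ten samples by two experts.
rater1 = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1]
rater2 = [0, 0, 1, 1, 0, 0, 1, 0, 1, 0]
kappa = cohens_kappa(rater1, rater2)
```

With po = 0.8 and pe = 0.48 this yields κ ≈ 0.62, which the gradation of Landis and Koch would place just into the range of substantial agreement.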
CHAPTER 4

Data Acquisition & Materials


4.1 Materials & Process Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2 Annotation Guidelines & Failure Stages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3 Strain Distributions in Dependence of Loading Conditions. . . . . . . . . . . . . 68
4.4 Signal Impairments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.5 Software for Expert Annotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

This chapter covers the data generation procedure, both with regard to the ex-
perimental parameters and the peculiarities of the image data. In particular, the
materials and their properties, as well as their formability with regard to the
forming limit curves, are presented. In addition, exemplary failure classes are intro-
duced by means of image material, as well as common signal impairments that occur
during the forming experiments. The chapter concludes with the presentation of
the implemented Graphical User Interface (GUI) that was used for data annotations
by multiple experts from the field. The main content of the following is part of a
shared first-authorship publication that, besides machine learning aspects, presented
the course of the study and the materials [Affr 18].

4.1 Materials & Process Parameters


All the forming processes and the determination of material key characteristics were
performed at the Institute of Manufacturing Technology at the Friedrich-Alexander-
University of Erlangen-Nuremberg using a Nakajima test setup as introduced in Sub-
section 2.2.2. An optical measurement system (ARAMIS v6.3.0-7 gom GmbH, Braun-
schweig, Germany) was used to record the forming procedures and to calculate the
strain distributions using a proprietary DIC method as described in Subsection 2.2.3.
Multiple materials with varying characteristics are investigated throughout the thesis
to evaluate the transferability and generalization of the methodologies.

DX54D
The DX54D is a deep drawing steel that is often used in the automotive industry.
The ductility is described by a uniform elongation between 22% and 23%, with a
relatively low yield strength of 164–170 MPa. The classical FLC, according to DIN
EN ISO 12004-2 as described in Subsection 2.3.1 shows good agreement with the
experimental determination [Merk 17]. According to the norm, at least 5 sample
geometries (cf. Figure 2.8) need to be investigated to create the FLC, whereby a



Figure 4.1: Determined FLCs for DX54D and two different material thicknesses. For
both thicknesses, the cross-section method is considerably conservative, while good
agreement between the methods is observable for a thickness of 2 mm, especially for
the uniaxial part of the curve. Source: [Affr 18] (CC BY 4.0).

more exact determination is possible by examination of additional geometries. As


the material is ductile and the strain localization is visible several forming steps
before crack initiation, it is expected that the necking phase may vary more substan-
tially between different geometries of this material in comparison to high strength
materials. Figure 4.1 depicts the FLCs determined by the introduced methods (cf.
Section 2.3) for two different material thicknesses. It can be observed that there
exists a large deviation between the two FLCs of the different evaluation methods
for both thicknesses, since the necking transition is gradual for a few seconds before
crack initiation. Generally, the line-fit method is less conservative in comparison
to the cross-section approach. In order to minimize these deviations and to reduce
the spatial and temporal dependency on the actual evaluation area and lookup time
point, it is therefore necessary to determine the onset of necking as well as its extent
independently of heuristics, which in turn will increase the reproducibility of
measurements. Overall, these requirements are also applicable to the remaining investigated
materials.

DP800
The DP800 is a dual-phase steel that is mainly used for structural parts and is
characterized by a ferritic matrix with martensite precipitations. It is a high-strength
material with a uniform elongation of 14–16% and a yield strength of 465 MPa.
During Nakajima tests, several local maxima are observed due to the inhomogeneous
structure of ferrite and martensite [Merk 17]. Figure 4.2 (a) depicts the different
FLCs. Again, the FLC determined by the cross-section method is more conservative
in comparison to the FLC of the line-fit method. Additionally, it can be observed
that both methods are unstable, as the individual forming experiments (denoted
by dots of the same color) possess a pronounced spread independently of the method.


Figure 4.2: Determined FLCs for DP800 and AA6014. (a) A high deviation can be
observed between the two methods for DP800. Both methods exhibit large deviations
for individual experiments per specimen geometry, which is especially emphasized
by the biaxial geometry on the right side of the curve. (b) In case of AA6014,
both methods provide more consistent results with good correlation and a smaller
deviation. Source: [Affr 18] (CC BY 4.0).

This confirms the weaknesses of both methods for sheet metal materials with a
sudden crack onset behavior.

AA6014
The AA6014 is a light-weight aluminum alloy of the 6xxx series that, in the T4
condition, exhibits good formability and is used in car-body structures. The yield
strength is close to 140 MPa and the uniform elongation is comparable to that of
the ductile DX54D. However, this aluminum alloy possesses lower formability, which
is reflected in the low Lankford coefficient. The FLCs derived with both methods
produce similar results with a small deviation, as visualized in Figure 4.2 (b).

AA5182
The AA5182 is an aluminum alloy of the 5xxx series with several local maxima in the
form of shear bands due to Portevin–Le Chatelier (PLC) effects [Yilm 11]. The yield
strength ranges from 130–132 MPa and the uniform elongation ranges from 21% to
23%, which is comparable to that of the ductile DX54D. Similarly to the AA6014,
it also possesses lower formability than the DX54D, which is again reflected by the
Lankford coefficient. In contrast to the other materials, the FLC determined by
the line-fit method yields lower values than the results of the cross-section method,
which seems to be a result of the PLC effect and hence questions the reliability of
the determined FLCs in Figure 4.3.


Figure 4.3: Determined FLC for AA5182. In contrast to the other materials, the
line-fit method leads to a more conservative FLC. Source: [Affr 18] (CC BY 4.0).

Table 4.1: Material properties of the investigated materials.

Material   t0 (mm)     n      YS (MPa)   TS (MPa)   UE (%)   r0     r90
DX54D      0.75/2.00   0.23   164–170    297–322    22–23    1.80   2.22
DP800      1.00        0.16   465        775–797    14–16    0.76   0.90
AA6014     1.00        0.24   140–143    244–239    21–23    0.69   0.67
AA5182     1.00        0.32   130–132    265–275    20–21    0.72   0.67

An overview of the different key characteristics (cf. Subsection 2.1.2) of every
material is summarized in Table 4.1. In order to not only evaluate the transferability
of the methods to different materials, the process parameters, i.e., the thickness of the
sheet material, the punch speed and the recording frequency are additionally varied
to determine the dependency on these parameters. Especially the sampling frequency
is an essential parameter, since on the one hand, the calculated strains depend on the
specific displacements of the individual subsets and on the other hand, the observable
effects of the local necking may develop spontaneously.
In order to approximate the required sampling frequency and thus to prevent under-
sampling of the necking effect, the forming progression of a subset of a DX54D-S050-1
(0.75 mm) experiment is investigated in detail. Figure 4.4 (a) visualizes the forming
history of the subset. At the beginning, a monotonic, steady increase in strain is
observable, whereas towards the end, this increase changes to an exponential devel-
opment of strain with subsequent failure of the material.
In order to be able to simulate an ongoing forming process, this signal is approxi-
mated with an exponential function based on the samples starting at second 7.0 to
the end (cf. Figure 4.4 (b)).
In comparison to the original signal, the estimation is extrapolated by five additional
samples that simulate a continued forming process for the analyzed experiment. The
estimated exponential function has the form:

a · e^(b·x) + c · e^(d·x)    (4.1)



Figure 4.4: Original signal and power density spectrum for DX54D-S050-1: (a)
Maximum thinning value as determined on the last frame and its development over
the whole forming process. (b) The exponential region of the original signal is ap-
proximated with an exponential function and extrapolated. (c) The power density
spectrum of the approximated and original signal.

with the determined coefficients:

a = 0.0010, b = 0.1527, c = 0.3673, d = 0.0065 (4.2)
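As a sketch of how such a double-exponential fit and the five-sample extrapolation could be reproduced, the following uses `scipy.optimize.curve_fit` on a signal generated from the coefficients of Eq. (4.2); scipy is an assumption here, not the software used in the thesis, and the sampling grid (40 Hz from second 7.0) is illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

def double_exp(t, a, b, c, d):
    # Sum of two exponentials modeling the thinning history, cf. Eq. (4.1)
    return a * np.exp(b * t) + c * np.exp(d * t)

# Synthetic thinning signal on the exponential region (from second 7.0 on),
# sampled at 40 Hz and generated with the coefficients of Eq. (4.2)
t = np.arange(7.0, 9.0, 1.0 / 40.0)
coeffs = (0.0010, 0.1527, 0.3673, 0.0065)
signal = double_exp(t, *coeffs)

# Recover the coefficients from the samples
popt, _ = curve_fit(double_exp, t, signal, p0=(0.001, 0.1, 0.3, 0.01), maxfev=20000)

# Extrapolate five additional samples to simulate a continued forming process
t_ext = t[-1] + np.arange(1, 6) / 40.0
extrapolated = double_exp(t_ext, *popt)
```

On noiseless data the fitted curve reproduces the signal closely; on real measurements a reasonable initial guess `p0` matters, since sums of exponentials are ill-conditioned to fit.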

The individual exponential functions can be analytically transformed into the fre-
quency domain by means of Fourier transform in order to highlight the frequencies
contained in the signal. Undersampling of the signal is avoided by compliance with
the Nyquist-Shannon sampling theorem [Scha 89].
This theorem states that a band-limited signal must be sampled at a frequency larger
than twice the maximum contained frequency, fsampling ≥ 2 · fmax , to guarantee an
error-free reconstruction of the signal. On account of the smaller parameter d, it is
immediately apparent that high frequencies can only be induced by the first exponen-
tial function. Transformation into the frequency domain, by neglecting the parameter
a leads to
g(t) = e^(0.1527·t) · u(t),   G(ω) = 1 / (0.1527 − jω)    (4.3)
where g(t) denotes the exponential function, u(t) represents the step function, t the
time with respect to the forming range, G(ω) the transformed exponential function
and ω the angular frequency. Furthermore, it follows from G(ω) that the magnitude
|G(ω)| decreases monotonically with increasing frequency. The limitation of the signal
bandwidth is based on the theorem of Parseval [Scha 89]:
∫_{−∞}^{∞} |g(t)|² dt = (1 / 2π) ∫_{−∞}^{∞} |G(ω)|² dω    (4.4)
which states that the energy of the signal is identical in the time and frequency
domain. Therefore, it is possible to define a threshold value with respect to the
magnitude, above which the included frequencies may be neglected while still covering
most of the signal energy [Rupp 13]. Here, this threshold value is set to 1% of the
maximum magnitude, η_threshold = 0.01 · |G(ω)|_max, so that 99% of the signal energy
is contained in the remaining frequencies. Consequently, the frequency B can be
determined according to:

|1 / (0.1527 − jω)| ≤ 0.01 · |G(ω)|_max
|1 / (0.1527 − jω)| ≤ 0.01 · (1 / 0.1527)
√(0.1527² + (2πB)²) ≥ 15.27    (4.5)
2πB ≥ 15.24
B ≥ 2.42 Hz

This results in a minimum sampling rate of 4.84 Hz, according to the Nyquist-
Shannon theorem. The majority of the magnitude, both for the approximated and
the original signal, lies in the low-frequency range f ≤ 4 Hz, as illustrated by
the frequency spectra in Figure 4.4 (c). However, in order to evaluate the dependency
on the sampling rate, multiple sampling frequencies ranging from 15 to 40 Hz are investigated.
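The band-limit derivation above can be checked numerically; the small discrepancy to Eq. (4.5) (2.43 Hz instead of 2.42 Hz) stems from the rounding of intermediate values in the thesis.

```python
import numpy as np

b = 0.1527           # rate of the dominant exponential term, cf. Eq. (4.2)
G_max = 1.0 / b      # |G(omega)| from Eq. (4.3) is maximal at omega = 0

# Band limit B where |G(omega)| = 1/sqrt(b^2 + omega^2) drops to 1 % of its maximum
threshold = 0.01 * G_max
omega_B = np.sqrt((1.0 / threshold) ** 2 - b ** 2)
B = omega_B / (2.0 * np.pi)     # ~2.43 Hz
f_sampling = 2.0 * B            # Nyquist-Shannon minimum sampling rate, ~4.86 Hz
```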

The remaining parameters are summarized in Table 4.2, with additional informa-
tion about the available geometries per material. From this it follows that a varying
amount of specimen geometries is investigated throughout the proposed methods and
used for the generation of the FLCs. Especially the varying sampling frequency is
of importance from a data scientist perspective, since it directly affects the amount
of available images that can be included within the methodologies. Since the de-
velopment of necking is a rather fast phenomenon, a slow punch speed with a fast
sampling frequency covers most of the information at the cost of very small differ-
ences between subsequent images. This is critical since too little displacement might
lead to erroneous traceability of blocks and consequently affects the DIC algorithm
and the computation of strain.

Table 4.2: Specimen geometries per material with process parameters.

Material (Thickness mm)   Frequency (Hz)   Punch Velocity (mm/s)   Available Geometries
DX54D (2.00)              20               1.5                     S060, S080, S100, S125, S245
AA6014 (1.00)             15               1.0                     S050, S060, S080, S100, S110, S125, S245
DP800 (1.00)              40               1.0                     S050, S060, S110, S125, S245
DX54D (0.75)              40               1.0                     S050, S060, S070, S080, S090, S100, S110, S125, S245
AA5182 (1.00)             40               1.0                     S050, S060, S110, S125, S245

4.2 Annotation Guidelines & Failure Stages


Expert annotations are essential for the deployment of supervised pattern recogni-
tion techniques, as the knowledge of the experts is used to associate different classes
with the available data, which are subsequently used to train the machine learning
algorithms. While expert knowledge is often used in clinical machine learning
applications, where physicians learn to recognize anomalies and annotate the data so
that their knowledge is transferable to learning algorithms, it has never been used
in the context of sheet metal forming processes. In sheet metal forming and especially in Nakajima
tests, the anomaly is represented by the onset of necking or, more generally, by a devi-
ation from the homogeneous strain distribution. The diversity of forming conditions,
i. e. the specimen geometries and the material behavior makes it difficult to define
the classes unambiguously. However, an initial assessment of the general classes may
be made by considering a simple forming condition, such as the uniaxial tensile test
(cf. Subsection 2.1.2). The test specimen is subjected to an uniaxial force over its
entire length and due to the relatively small thickness compared to length and width,
an uniaxial stress condition can be assumed whose principal strain is aligned along
the direction of force. After a linear increase in strain (elastic behavior), the yield
point is reached and the material begins to deform plastically, whereby the required
stress increases as the material hardens and the strain develops homogeneously. The
maximum tolerable stress is then reached and the material begins to neck, with the
necking typically affecting the width. At the end of the forming process, the necking
is concentrated on a small area in the order of magnitude of the sheet thickness with
subsequent fracture. These considerations are used within the guidelines for experts’
annotations and lead to the following four classes:
• Homogeneous forming: strain is homogeneously distributed on the entire image
• Diffuse necking: inhomogeneity of the strain distribution emerges between the
two principal directions, whereby the area of interest is still covered by the
entire image
• Local necking: localization of strain occurs only in a small area in the order of
the sheet thickness accompanied by thinning or sharp increase of strain
• Crack: increasing amount of defect pixels due to surface discontinuities.
An example of the four introduced classes that occur during Nakajima tests is visu-
alized in Figure 4.5. The guidelines provide no further distinctions for the individual
geometries. However, it is expected that the defined classes are applicable to geome-
tries with a small width and negative minor strain. In the case of biaxial straining
(full geometry) or plane strain (120-125 mm width), the analyzed surface recorded
by the optical measuring system is limited to the area from the top to the inner
diameter of the die (110 mm). Therefore, it may not be possible to recognize diffuse
necking since it occurs on the entire image. Additionally, the different geometries
cause slightly different strain distributions that deviate from the ideal class defini-
tions of the tensile test. Especially in case of high strength materials that do not
present a clear transition between uniform elongation and the onset of necking, the
assignment of classes can be difficult.
Figure 4.5: Different classes as defined by the guidelines (shown for DX54D, S050,
t0 = 0.75 mm): homogeneous strain distribution, diffuse necking, localized necking,
crack. Source: [Affr 18] (CC BY 4.0).

4.3 Strain Distributions in Dependence of Loading Conditions
The different material properties lead to variations in the strain distributions during
forming processes. This is visualized in Figure 4.6. Even though the same geome-
try is evaluated and contrasted, one can observe the different localization behavior
in dependence on the underlying material. Both DX54D and DP800 exhibit visu-
ally comparable strain distributions including their localization behavior. AA6014
already slightly deviates from DX54D and DP800, while cross like structures that
occur during necking seem to be present in all the mentioned materials. A signif-
icant deviation can be observed for AA5182 that is a result of the PLC effect. It
is particularly difficult to differentiate between the homogeneous forming phase and
the inhomogeneous forming phase, which eventually contains the necking effect and
is necessary for the determination of the FLC. The crack is prominent in all materials
and defined by the defect pixels, whose number increases on the last few
frames of the forming sequences due to the loss of material cohesion and
hence failure of the DIC method. Multiple prototypes of the uniaxial, plane strain
and biaxial samples are visualized in Figure 4.7 to further improve the impression
of the dataset and to highlight the influence of the different specimen geometries on
the strain distribution. Here, only samples from the inhomogeneous forming phase
with a well-defined localized necking effect are presented. Across all materials, with
the exception of AA5182, comparable strain distributions are observed for S060 and
S125, although the width of the localization effect is visible with varying degrees of
intensity and slightly different shapes. The localization effect for S245 is completely
different from the previous geometries. This is not only characterized by the differ-
ent localization shapes but also depicted by the changing amount of dark areas that
significantly vary depending on the material as a result of its forming properties.

Figure 4.6: Different strain progressions (ε1) for the uniaxial S060 geometry depen-
dent on the material (rows). The images show an excerpt of the forming sequences
and emphasize how differently the materials behave over time.

Figure 4.7: Localized necking examples for all materials (rows) and different ge-
ometries (columns). The image area with usable information may decrease during
forming as the black areas increase due to the varying material characteristics.

4.4 Signal Impairments


Specimen preparation is very important and can be considered as a critical step as
it heavily affects the quality of the DIC result. Even with perfectly applied speckle
patterns, it is possible that artifacts occur during the forming of specimens. These
artifacts comprise: (1) defect pixels, that are either static or only occur occasionally
due to measurement error as a consequence of a failure of the DIC calculation; (2)
defect pixels, that occur block-wise and affect large parts of the image; (3) measure-
ment errors that lead to regions where consistently higher strain values are derived
together with occasionally or static defect pixels; (4) defect pixels that may be in-
terchanged with the increasing amount of defect pixels that are a sign of the crack
class. Examples for these image artifacts are visualized in Figure 4.8. Single defect
pixels, that occur occasionally can be removed easily, whereas block-wise occurring
defect pixels, as well as measurement errors, may impede methods that use the image
for the determination of the FLC. These artifacts are especially critical when they
emerge near or in the region of interest.

Figure 4.8: Exemplary visualization of possible image artifacts (defect pixels,
block-wise defect pixels, measurement errors, defect pixels vs crack). Especially
block-wise occurring defect pixels and measurement errors may deteriorate methods
that make use of the image information.

4.5 Software for Expert Annotations


In order to support the experts during the process of data annotation, a GUI was
developed that simultaneously displays the individual principal strains as well as the
thinning distribution. A visual impression of the software is depicted in Figure 4.9.
The individual frames of the strain sequences are accessible via the timeline and the
experts are supported by interactive, freely positionable cross-sections through the
strain distributions. These cross-sections visualize the intensity profile and therefore
are useful to detect sudden increases or decreases in one of the principal strains. Five
experts were interviewed for annotation purposes, which already enables good cor-
relation according to Hönig et al. [Honi 11], whereby systematic annotation errors,
especially for similar classes, cannot be avoided even with an increasing number of
raters [Stei 05]. The experts are scientific researchers in the field of material
characterization and possess extensive experience, ranging from 4 to 20 years, with
different materials in combination with the usage of an optical measurement system.

Figure 4.9: Image of the developed software used by experts for the annotation
procedure. Single frames of the forming sequences are assigned to the different failure
classes based on visual inspection. To support the experts, it is possible to place cross-
sections through regions of interest (red lines) and to inspect the strain / intensity
profile.
CHAPTER 5

Supervised Determination of Forming Limits using Conventional Machine Learning

5.1 Method
5.2 Experiments
5.3 Results
5.4 Discussion and Conclusion

Pattern recognition methods for the determination of the FLC have not been
taken into consideration so far. As described in Section 2.3, the assessment of the
onset of necking by well-established methods such as the cross-section method or the
time-dependent method relies on the evaluation of spatially or temporally limited
information. While the cross-section method uses multiple cross-sections to deter-
mine the onset of necking, the time-dependent method evaluates a limited area that
is found using an arbitrary threshold. Additionally, only one of the principal strains
is investigated, such as the major strain or the thickness reduction. Instead of artifi-
cially limiting the available information to only a few pixels, the pattern recognition
approach makes use of the spatial information of all principal strains at the same
time while focusing on the extremal region and its vicinity. With this methodology,
it is feasible to incorporate interactions in the assessment, which ultimately lead to
the onset of necking rather than just considering a sudden change within the evalua-
tion area. The approach presented in the following adheres to the principle of supervised machine
learning and can be considered a baseline or feasibility study. Most of the findings
were published in the shared first author publication [Affr 18], while only the pattern
recognition part is covered in the following.

5.1 Method
Irrespective of the material class, materials are subjected to four different phases of
forming behavior, which varies in duration depending on the properties of the ma-
terial, as introduced in Subsection 2.1.2. These forming behavior classes consist of:
homogeneous forming (C0), diffuse necking (C1), local necking (C2) and crack (C3),


as visualized in Figure 4.5. Overall, the method follows the introduced pattern recog-
nition pipeline (cf. Section 3.1). The acquired video sequences of different materials
and geometries (cf. Section 4.1) serve as inputs to the pipeline. These signals are
preprocessed by reducing noise and interpolating missing data resulting from defect
pixels (cf. Section 4.4) to improve the classification performance. Since the proposed
method employs a supervised classification scheme, the experts’ knowledge is utilized
for training a classifier that separates the different forming stages into the respec-
tive classes. Consequently, each image in a video sequence is assigned a label by
the experts that represents a specific failure class (C0-C3). Five experts with 5-20
years experience with Nakajima tests are considered in this study, that follow the
annotation guidelines (cf. Section 4.2) and make use of the annotation software (cf.
Section 4.5). This procedure results in a label vector for each sequence and serves as
ground truth for supervised classification. Within feature extraction, a characteristic
vector is created that describes the image in a compressed representation. In order
to simulate a realistic scenario that involves a learned classification model, the data
set is divided into disjoint training and test sets. The extracted information, i.e.,
the characteristic and ground truth vectors, is used to train a classifier that learns
an optimal separation of the instances into the different classes on the training set.
This separation hypothesis is assessed using the disjoint test set, whereby
the class predictions of the individual instances are compared to the correspond-
ing ground truth labels as defined by the experts. The classification performance is
quantitatively evaluated using the AUC, precision and recall metrics.

5.1.1 Preprocessing
The output of the optical measurement system is converted into three-channel (ma-
jor strain, minor strain, thinning) video sequences that serve as input to the algo-
rithm. Processing of the images and calculation of strain values based on DIC may
introduce artifacts, that should be removed before feature extraction (cf. Section 4.4).
Temporal defect pixels may occur as a result of a high sampling rate or minor er-
rors during probe preparation, such that DIC temporarily fails to correlate blocks on
subsequent images and hence would impede feature extraction. Since the feature rep-
resentations are essential for assessing the localized necking condition, missing values
are interpolated using the forming history of the individual strain values. Another
signal impairment are the static defect pixels that deliver no measurement signal
during the whole forming procedure. These artifacts are removed by iteratively cal-
culating the mean value of a square 3 × 3 neighborhood. In addition to these two
types of artifacts, defect pixels can also indicate crack initiation. Towards the end of
the forming process, the DIC system is not able to further correlate the individual
blocks in the event of material failure, which is a result of the loss of material
cohesion. Consequently, this leads to a sudden, increasing amount of defect pixels
towards the end of the forming procedure. This indicator is used by the experts to
assign the failure class, and since the expert annotations are based on visual
appearance, removing or interpolating these values would adversely affect the
subsequent classification task.
As a result, defect pixels that do not recover are substituted with a negative value to
provide a strong edge response describing the material failure or crack initiation.


Figure 5.1: Progression of strain information in feature space: (a) Major strain
distribution as gray-scale image. (b) Sobel filtered gradient representation. (c) Vi-
sualization of LBPu . (d) Visualization of LBPriu . Source: [Affr 18] (CC BY 4.0)

5.1.2 Feature Extraction


The state-of-the-art methods only consider limited evaluation areas. To overcome
these limitations and instead of focusing on a subjectively defined area with only
strain information, numerous rectangular patches are extracted from the 2D images,
which include extremal regions and their vicinity. In order to reduce their dimension-
ality, these patches are encoded with local descriptors. Specifically, the characteristic
vector for the classification algorithm is generated by concatenating the extracted
features of each principal strain in order to use information from all three domains si-
multaneously. This study employs the introduced features (cf. Section 3.2) to encode
the images. These consist of LBPs and their variants, HoG, homogeneity features
as well as histograms of gray-scale intensities. Figure 5.1 exemplarily visualizes the
strain progression as well as the corresponding progression in feature space.
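As one concrete sketch of this concatenation, the following builds a characteristic vector from uniform-LBP histograms of the three strain channels using `scikit-image` (an assumption; the thesis feature set additionally covers LBP variants, HoG, homogeneity, and intensity histograms).

```python
import numpy as np
from skimage.feature import local_binary_pattern

def characteristic_vector(major, minor, thinning, P=8, R=1.0):
    """Concatenate uniform-LBP histograms of all three principal strain
    channels into one characteristic vector (sketch of the feature step)."""
    feats = []
    for channel in (major, minor, thinning):
        lbp = local_binary_pattern(channel, P, R, method="uniform")
        # Uniform LBP yields codes in [0, P + 1] -> P + 2 histogram bins
        hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
        feats.append(hist)
    return np.concatenate(feats)
```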

5.1.3 Classification: Random Forest


This study uses an RF classifier (cf. Subsection 3.3.1) to solve the multi-class classi-
fication problem. The RF utilizes the Gini impurity to learn the decision boundary
based on the training data, while the separation hypothesis is evaluated using the
test set.
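A minimal sketch of such a classifier with the settings named in the text (200 trees, Gini impurity) using scikit-learn, on synthetic stand-ins for the characteristic vectors and the four labels C0–C3:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical data: 200 characteristic vectors of length 30, labels C0-C3
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
y = rng.integers(0, 4, size=200)

clf = RandomForestClassifier(n_estimators=200, criterion="gini", random_state=0)
clf.fit(X, y)
proba = clf.predict_proba(X[:5])   # class-membership probabilities per image
```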

5.2 Experiments
To derive the datasets for the supervised classification method, three forming ex-
periments per geometry and material are investigated using a Nakajima test setup
(cf. Subsection 2.2.2), with varying process parameters (cf. Table 4.2). The process
parameters are varied to evaluate if the algorithm is sensitive to these changes or if
the method is able to generalize. Overall, three different materials are examined to
further evaluate the generalize-ability of the method. These materials consist of: (1)
a ductile deep drawing steel DX54D with 0.75 mm and 2.0 mm sheet thickness, (2)
a light-weight aluminum alloy AA6014 (1.0 mm) and (3) a dual-phase steel DP800
(1.0 mm). The key characteristics of these materials are summarized in Table 4.1,
while their individual FLCs and forming behaviors are described in Section 4.1.

Database
The distribution of failure stages of each material is depicted in Table 5.1. In contrast
to the homogeneous and diffuse necking class, the localized necking and crack class
are significantly underrepresented in all materials. While in general,
a classifier is trained to achieve the best classification results, the presence of data
imbalance (cf. Table 5.1) may introduce a bias towards a preferred classification of
the majority class. This can be addressed by a weighting scheme. However, such a
solution does not increase the variance of the minority classes.
Consequently, data augmentation is used to increase the variation of the minority
classes and therefore, each class of the dataset is artificially increased by a factor of 12,
using vertical and horizontal flipping as well as random rotations (2–12 degrees). These
random rotations also address the possible slight differences regarding the orienta-
tion of investigated probes that might occur during preparation or execution of the
forming experiments. However, the augmentation scheme would overall increase the
data imbalance in favor of the majority classes. In order to resolve the class imbal-
ance, random sub- or up-sampling is applied to the data in relation to the number of
instances (50%) of the augmented localized necking class. This procedure increases
the variety of each class and enables a uniformly distributed amount of instances per
class that is used for training and evaluation of the RF classifier.
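The augmentation scheme could look like the following sketch; the exact mix of flips and rotations per variant is an assumption, and `scipy.ndimage` stands in for whatever image library was actually used.

```python
import numpy as np
from scipy.ndimage import rotate

def augment_patch(patch, rng, factor=12):
    """Create `factor` variants of a patch: vertical/horizontal flips plus
    small random rotations of 2-12 degrees in either direction."""
    variants = [patch, np.flipud(patch), np.fliplr(patch)]
    while len(variants) < factor:
        angle = float(rng.uniform(2, 12)) * rng.choice([-1, 1])
        source = variants[rng.integers(3)]   # rotate an original or flipped patch
        variants.append(rotate(source, angle, reshape=False, mode="nearest"))
    return variants
```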

Inter-Rater-Reliability
The consistency and agreement among the annotations of five raters is examined
using the Fleiss’ kappa metric [Flei 73] (cf. Section 3.6). This is a robust measure
of consistency that takes into account the possibility that agreement between several
raters may be achieved by chance. With this methodology, the consistency of the

Table 5.1: Images per failure class – summarized over all geometries per material.

Material           Incon.   Diff. Neck.   Loc. Neck.   Crack   Total
DX54D (2.00 mm)    5303     1246          273          35      6857
AA6014             435      430           195          30      1090
DP800              3156     1335          292          45      4828
DX54D (0.75 mm)    2802     4953          773          220     8748

human experts among each other can be investigated with regard to their assignments
to different defect classes. Values below 0 would represent no consistency, while values
between 0.41–0.61 would correspond to a moderate agreement and 0.81–1.00 would
be considered as almost perfect agreement [Land 77].
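Fleiss' kappa for an item-by-category matrix of rating counts can be computed directly from its standard definition; this is a generic sketch, not the thesis code.

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for an (n_items x n_categories) matrix of rating counts,
    where every row sums to the number of raters."""
    n_items = counts.shape[0]
    n_raters = counts[0].sum()
    p_cat = counts.sum(axis=0) / (n_items * n_raters)   # category proportions
    # Per-item agreement, then mean observed vs chance agreement
    p_item = ((counts ** 2).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar, p_e = p_item.mean(), (p_cat ** 2).sum()
    return (p_bar - p_e) / (1.0 - p_e)
```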

Classification
The performance of the classification algorithm is evaluated via Leave-One-Sequence
Out Cross-Validation (LOSO-CV), whereby consistent data labels are derived by
means of a majority voting scheme over the experts’ annotations that serves as ground
truth. In addition, to reduce the influence of different geometries and materials, the
LOSO-CV is assessed geometry-wise for each material. Following the cross-validation
scheme, each sequence of the geometry is left-out once to assess the separation hy-
pothesis, while the remaining two-thirds of the data are used for training. A fixed
amount of 200 decision trees is used for the RF classifier, whereas suitable values for
two other parameters, the maximum depth per tree and the minimum amount of sam-
ples per leaf are determined via grid-search (15–30, 2–12, respectively) individually
for each geometry to improve the generalization capabilities of the classifier.
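The LOSO-CV grid search could be set up as follows with scikit-learn, using sequence identifiers as groups; the data here are synthetic stand-ins and the grid is a subsample of the ranges named above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, LeaveOneGroupOut

# Hypothetical data: three forming sequences (groups) of one geometry
rng = np.random.default_rng(1)
X = rng.normal(size=(90, 30))
y = rng.integers(0, 4, size=90)
sequence_id = np.repeat([0, 1, 2], 30)

# Leave-One-Sequence-Out folds: each sequence is held out once
folds = list(LeaveOneGroupOut().split(X, y, groups=sequence_id))
search = GridSearchCV(
    RandomForestClassifier(n_estimators=200, random_state=0),
    param_grid={"max_depth": [15, 20, 25, 30], "min_samples_leaf": [2, 6, 12]},
    cv=folds,
)
search.fit(X, y)
best = search.best_params_
```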
In order to resolve the uneven distribution of uniaxial, biaxial and plane
strain geometries and to generate unbiased classification results, the performance of
the different features is assessed in multiple ways. The uniaxial to plane strain ge-
ometries are jointly evaluated using the S060, S080 and S100 geometries, or, if S080 and
S100 are unavailable, using the S050 and S110 geometries. The necking behavior and
localization effect develops similarly in these geometries, whereas the strain distribu-
tion of the S245 geometry is evaluated independently. The S245 strain distribution
appears very different in comparison to the uniaxial to plane strain distributions and
thus might require another evaluation area or feature descriptors for good separation.

All misclassifications have the same weight and no particular class is preferred
over the other, i. e. a misclassification of the crack or diffuse necking class has the
same cost. As a result, the overall performance and quality of the four-class classifi-
cation problem is assessed with the mean AUC (cf. Section 3.5), which describes the
classifier’s ability to meet all false-positive rate thresholds without having to choose
the best operating point, that usually is defined on the basis of costs. Therefore, a
comparison between the different features is facilitated as only one evaluation metric
has to be considered for each four-class classification problem. The individual best
performing features of each material are then evaluated on the respective unrestricted
dataset, i. e. including all geometries per material, to assess the overall performance of

Figure 5.2: Different evaluation strategies for S245-equibiax. (a) Nine patches (9
× 24) are distributed uniformly around the maximum (Patch-wise). (b) One patch
strategy (1 × 36) covering the same area (Single-patch). (c) Centered patch strategy
with a large centered evaluation area (Centered). Source: [Affr 18] (CC BY 4.0)

the classification algorithm. Following this procedure, the classifier is assessed with
a confusion matrix that uses a 50% probability threshold for the class memberships.
This facilitates investigation of misclassifications per class and provides class-wise
assessment in terms of precision and recall.
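The mean AUC of the four-class problem can be obtained by averaging one-vs-rest AUC values over the classes. A minimal sketch using the rank-statistic (Mann-Whitney) form of the AUC, with an illustrative function name rather than the thesis implementation:

```python
import numpy as np

def auc_ovr_mean(y_true, scores):
    """Mean one-vs-rest AUC over all classes.

    scores[i, k] is the classifier's probability of sample i for class k."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    aucs = []
    for k in range(scores.shape[1]):
        pos = scores[y_true == k, k]   # scores of class-k instances
        neg = scores[y_true != k, k]   # scores of all other instances
        if len(pos) == 0 or len(neg) == 0:
            continue
        # AUC = P(score_pos > score_neg) + 0.5 * P(tie)
        greater = (pos[:, None] > neg[None, :]).sum()
        ties = (pos[:, None] == neg[None, :]).sum()
        aucs.append((greater + 0.5 * ties) / (len(pos) * len(neg)))
    return float(np.mean(aucs))
```

A perfectly separating classifier yields a mean AUC of 1.0, while uninformative scores yield 0.5, independent of any operating-point threshold.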

An additional aspect that needs clarification is the definition of an evaluation area.
On the one hand, it might be advantageous to concentrate on local image details
only; on the other hand, unused information in the vicinity might be included that
could enhance the classification process. Consequently, as visualized in Figure 5.2,
three different patch extraction techniques are investigated that cover a varying
evaluation area.

The first approach extracts multiple patches with an overlap of 50%, a side-length
of 16–32 px and a step-size of 8 px. The patches are evenly distributed and incorporate
the centered maximum value (Patch-wise). The second approach covers the identical
area with only one patch, resulting in a side-length of 24, 36, or 48 px (Single-patch).
The third extraction procedure covers as much information as possible with respect
to the underlying sample geometry. Herein, the evaluation area is defined on the last
image without a defect pixel to maintain a consistent image content, as the size of
the image may change over time. Unlike the other two approaches, the maximum
value cannot be used as the central pixel, since this would lead to image patches that
contain non-comparable image information. For example, if the maximum value is
found close to the image border in one sequence and near the center within another
sequence, this would lead to invalid comparisons of patches with unequal information.
Consequently, the center of mass is employed as the center of the evaluation region
(Centered). However, a direct comparison of the patch-wise approach to the other
concepts is not a feasible option, since each of the nine patches is classified with an
individual probability. To facilitate a comparison, the mean image class probability
is calculated as the average probability over the individual patches.
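Assuming each strain distribution is stored as a 2-D array, the three extraction strategies might be sketched as follows; function and parameter names are illustrative, and the maximum is assumed to lie far enough from the image border for the slices to be valid:

```python
import numpy as np

def patchwise(strain, side=16, step=8):
    """Nine overlapping patches (50% overlap) on a 3x3 grid around the maximum."""
    r, c = np.unravel_index(np.argmax(strain), strain.shape)
    patches = []
    for dr in (-step, 0, step):
        for dc in (-step, 0, step):
            r0, c0 = r + dr - side // 2, c + dc - side // 2
            patches.append(strain[r0:r0 + side, c0:c0 + side])
    return patches

def single_patch(strain, side=24):
    """One patch of the same coverage, centered on the maximum."""
    r, c = np.unravel_index(np.argmax(strain), strain.shape)
    return strain[r - side // 2:r + side // 2, c - side // 2:c + side // 2]

def centered(strain, side=48):
    """One large patch centered on the center of mass of the strain field."""
    total = strain.sum()
    r = int(round(float((strain.sum(axis=1) * np.arange(strain.shape[0])).sum() / total)))
    c = int(round(float((strain.sum(axis=0) * np.arange(strain.shape[1])).sum() / total)))
    return strain[r - side // 2:r + side // 2, c - side // 2:c + side // 2]
```

For the patch-wise variant, the mean image class probability is then simply the average of the nine per-patch probabilities, as described above.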


Figure 5.3: Distribution of the expert annotations for the individual geometries of
DP800: (a) S050-uniaxial strain. (b) S110-near plane strain. (c) S245-equibiaxial
strain. Source: [Affr 18] (CC BY 4.0)

5.3 Results
5.3.1 Inter-Rater-Reliability
Expert annotations were ascertained to be rather divergent, leading to varying con-
sistencies. This divergence depends on the material and geometry, as reflected by the
marginal distributions of the individual experts, and is exemplified by S050, S110 and
S245 of DP800 in Figure 5.3. Each diagram combines the annotations of three different
forming experiments for the S050, S110 and S245 geometry, respectively. In particular,
the separation between the homogeneous and diffuse necking class appears challeng-
ing, whereas the experts are more consistent with regard to the crack and localized
necking class. The degree of consensus among the raters on the other geometries and
materials is provided in Table 5.2. The maximum consistency is obtained with a sub-
stantial agreement on the geometries < S125 of DX54D (2.00 mm) that was acquired
with a 20 Hz sampling rate. A moderate agreement is achieved in the case of DP800.
Once again, the best results are obtained in geometries with uniaxial to plane strain
loading conditions < S125, recorded with a sampling rate of 40 Hz. Within AA6014
(15 Hz sampling rate), a rather moderate agreement is obtained irrespective of the
underlying geometry, while consistency among experts drops to poor agreement for
DX54D (0.75 mm) (40 Hz sampling rate), with no evident trend.

5.3.2 Classification
The ten best results for the uniaxial to plane strain geometries (S060, S050/S080 and
S100/S110) of the individual materials are presented in Figure 5.4, where pS./Cent.
denotes the evaluation strategy, 360/180 the signed/unsigned HoG mode and 5°,
10° the resolution of the HoG in degrees. The average AUC of the different evaluation
approaches is compared with reference to the underlying feature. Generally, edge-
based features such as HoG and Homogeneity are predominant, resulting in an average
AUC above 0.9 for all materials. Only within DX54D (2.00 mm), LBP coupled with
variance performs comparably (0.97 AUC). A trend towards a patch-wise evaluation

Table 5.2: Inter-Rater-Reliability in terms of Fleiss-Kappa statistics.

Geometry   DX54D (2.00 mm)   DP800   DX54D (0.75 mm)   AA6014
S050       –                 0.56    0.27              0.39
S060       0.70              0.59    0.24              0.38
S070       –                 –       0.22              –
S080       0.70              –       0.30              0.52
S090       –                 –       0.24              –
S100       0.70              –       0.33              0.51
S110       –                 0.56    0.34              0.43
S125       0.38              0.38    0.24              0.37
S245       0.33              0.35    0.05              0.41

strategy is evident, as the best performances among all materials originate from this
category. For DX54D (2.00 mm) and DX54D (0.75 mm), the differences between
patch-wise, single-patch and centered are negligible, while for DP800, the patch-wise
method outperforms the other approaches (0.93 AUC vs. 0.91 AUC). The ten best
results of DX54D (0.75 mm) are predominantly derived by the patch-wise evaluation
strategy, whereas the differences between patch sizes and classification results are
insignificant as the different patch sizes (pS24-pS48) of the homogeneity features
achieve comparable performance (≈ 0.93 AUC).
The evaluation of the biaxial S245 geometry differs from the previous behavior
as visualized in Figure 5.5. In the case of DX54D (2.00 mm) and DP800, very good
results with an AUC of 0.95 and 0.88 are derived that outperform DX54D (0.75 mm)
and AA6014 with an AUC of 0.725 and 0.825, respectively. In none of the materials
are the best results obtained with an LBP-based feature extractor. Con-
sequently, it can be inferred that edge information is more descriptive and relevant
for the classification task. This is supported by the fact that superior results are
achieved either with Homogeneity or HoG features. Furthermore, these results reveal
that the application of larger evaluation areas provides a significant advantage. The
patch-wise approach slightly exceeds the single-patch and centered approach only
within the DX54D (2.00 mm). The AUC reduces the results of the multi-class clas-
sification to one single performance indicator and facilitates the retrieval of the most
efficient feature per material. Since the uniaxial to plane strain geometries dominate
the data distribution, the best performing feature is individually selected for each
material across the entire classification experiment. For all materials except DX54D
(2.00 mm), the best performing feature of the uniaxial to plane strain experiment in
the S245 geometry is at least in the top ten and therefore, only has a slightly negative
effect on the overall result. Consequently, this leads to the following material/feature
pairs that are evaluated:
• DX54D (2.00 mm)/pS32-HoG-180-5°

• DP800 (1.00 mm)/pS24-Homogeneity

• DX54D (0.75 mm)/pS32-Homogeneity

• AA6014 (1.00 mm)/pS24-HoG-180-5°




Figure 5.4: Average AUC calculated using S060, S080/S050-uniaxial and
S110/S100-near plane strain geometries: (a) DX54D (2.00 mm). (b) DP800. (c)
DX54D (0.75 mm). (d) AA6014. Source: [Affr 18] (CC BY 4.0)


Figure 5.5: Average AUC of S245-equibiaxial geometry: (a) DX54D (2.00 mm).
(b) DP800. (c) DX54D (0.75 mm). (d) AA6014. Source: [Affr 18] (CC BY 4.0)

Table 5.3: Confusion matrices of the selected features for each material.

(a) DX54D (2.00 mm) – pS32-HoG-180

     C0    C1    C2    C3    Pr.   Rec.
C0   0.95  0.05  0     0     0.99  0.95
C1   0.02  0.95  0.03  0     0.81  0.95
C2   0     0.05  0.92  0.03  0.85  0.92
C3   0     0     0.14  0.86  0.79  0.86

(b) DP800 – pS24-Homog.

     C0    C1    C2    C3    Pr.   Rec.
C0   0.89  0.11  0     0     0.90  0.89
C1   0.28  0.66  0.06  0     0.67  0.69
C2   0     0.14  0.84  0.02  0.76  0.85
C3   0     0     0.04  0.96  0.93  0.96

(c) DX54D (0.75 mm) – pS32-Homog.

     C0    C1    C2    C3    Pr.   Rec.
C0   0.80  0.20  0     0     0.79  0.80
C1   0.12  0.84  0.04  0     0.86  0.84
C2   0     0.14  0.85  0.01  0.77  0.85
C3   0     0     0.01  0.99  0.97  1.00

(d) AA6014 – pS24-HoG-180

     C0    C1    C2    C3    Pr.   Rec.
C0   0.85  0.15  0     0     0.86  0.85
C1   0.14  0.78  0.08  0     0.81  0.79
C2   0     0.07  0.92  0.01  0.82  0.92
C3   0     0     0.16  0.84  0.93  0.83

The per class performance is highlighted by the confusion matrices (cf. Sec-
tion 3.5) in Table 5.3 in terms of classification errors, precision and recall. Each row
of the confusion matrix specifies the relative frequency of instances in the expected
failure categories, while the columns reflect the relative frequency of instances of pre-
dictions made by the RF. With perfect agreement between classifier and experts, only
diagonal elements of the confusion matrices would contain nonzero entries. This
per-class evaluation provides evidence of which classes are accurately determined by
the classifier and which are not. In general, all matrices and materials share the similarity
that the off-diagonal misclassifications occur at the transitions between successive
classes (i.e. transition between failure stages). Most of the errors are encountered
between C0 and C1, especially in case of DP800, DX54D (0.75 mm) and AA6014 as
the C0/C1 and C1/C0 confusions add up to > 29%, whereas the error decreases to
< 10% for DX54D (2.00 mm). The error between C1 and C2 is significantly lower for
all materials as the confusions sum up to < 20%. The smallest error occurs between
C2 and C3, as it consistently remains below 20%. Herein, the result is influenced by
the low amount of instances and thus, one misclassification contributes more to the
error in comparison to the other classes. The best results among all failure classes
except the crack class are obtained within DX54D (2.00 mm) with a recall above 92%
primarily attributable to the low sampling frequency and good inter-rater-reliability.
The diffuse necking is classified with the lowest reliability for DP800 and AA6014
since the C1 recall only reaches 69% and 79%, respectively. A comparison of the
recall for the diffuse necking class of DX54D (2.00 mm) and DX54D (0.75 mm) re-
veals that the DX54D (2.00 mm) recall is significantly higher than the DX54D (0.75 mm)
recall with 95% to 84%, respectively. Again, this might be an effect of the sampling
rate differences. The same effect is observable in case of the localized necking class
as a higher recall with 92% is achieved within AA6014 and DX54D (2.00 mm) and
only 85% in case of DP800 and DX54D (0.75 mm) that were recorded with a higher
sampling rate.
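The row-normalised matrices and per-class precision/recall values in Table 5.3 follow directly from raw confusion counts. A minimal sketch (the function name and example numbers are illustrative, not the thesis data):

```python
import numpy as np

def confusion_stats(counts):
    """Row-normalised confusion matrix plus per-class precision and recall.

    counts[i, j] = number of instances of true class i predicted as class j."""
    counts = np.asarray(counts, dtype=float)
    norm = counts / counts.sum(axis=1, keepdims=True)  # rows sum to 1 (true classes)
    recall = np.diag(counts) / counts.sum(axis=1)      # per true class
    precision = np.diag(counts) / counts.sum(axis=0)   # per predicted class
    return norm, precision, recall
```

This makes explicit why the rows of Table 5.3 sum to one while precision additionally depends on the column sums, i. e. on how often other classes are predicted as the class in question.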

Table 5.4: Average error per class (± # images/± mm punch movement).

C0/C1 C1/C2 C2/C3


DX54D (2.00 mm) 20.7/1.55 3.9/0.29 0.9/0.07
DP800 46.3/1.15 7.8/0.20 0.3/0.01
DX54D (0.75 mm) 42.0/1.05 11.4/0.29 0.25/0.01
AA6014 6.0/0.60 2.2/0.22 0.3/0.03


Figure 5.6: Class affiliation of two features evaluated with DP800-S060: (a) PS 24
Homogeneity (b) PS 24 HoG 180 10°. Source: [Affr 18] (CC BY 4.0)

Based on the absolute number of confusions between class transitions, the expected
average deviation in terms of images and mm punch movement per sequence is
approximated by summing the misclassifications among all geometries and dividing
by the number of underlying sample geometries and sequences (cf. Table 5.4).
Again, as expected, the largest deviation occurs between the homogeneous
and the diffuse necking class, while the deviation for DP800 and DX54D (0.75 mm)
is equivalent with > 40 images or > 1 mm punch movement. AA6014 has the lowest
deviation with six images and 0.6 mm punch movement, while DX54D (2.00 mm)
exhibits the largest deviation in terms of punch movement with 1.55 mm (20.7 images). Taking
into account the confusion between diffuse necking and localized necking, this devia-
tion decreases significantly to 7.8/2.2 images and 0.2/0.22 mm punch movement for
DP800/AA6014. The deviation is identical for DX54D (2.00 mm) and DX54D (0.75
mm), as both reveal a deviation of 0.29 mm punch movement, corresponding to 3.9
and 11.4 images, respectively. The smallest error on the basis of class confusions
occurs between the localized necking and the crack class with a deviation of < 1
image or < 0.1 mm punch movement. In addition to these quantitative results, a
more qualitative interpretation is possible that takes into account the time or phase
of the forming process. Figure 5.6 illustrates the probability of affiliation to one of
the four classes for two different features.

Figure 5.6 (a) depicts the result of the best performing PS24-Homogeneity feature,
whereas Figure 5.6 (b) depicts the result of the slightly worse performing PS24-HoG-
180 feature. In both images, the transitions between subsequent classes depict the
area of interest. Especially the clear separation between C0 and C1 on the left side is
advantageous as there exists a clear cutoff point. This separation is not as pronounced
in case of the HoG as the C0 and C1 curves oscillate. This renders the Homogeneity
feature more suitable for separating the homogeneous from the diffuse necking class,
which is reasonable since no oriented edge information is to be expected at the tran-
sition from C0 to C1. Both features are appropriate considering the transition from
C1 to C2, as a smooth separation without oscillation can be observed. However, the
slope of the curve indicates that HoG achieves a more reliable separation in this case.
This emphasizes that in particular the edge information is a suitable feature to distin-
guish diffuse necking from localized necking. Finally, both features are found
to be suitable to separate the localized necking from the crack class as depicted in
both images by the sudden and steep transition between C2 and C3.

5.3.3 Comparison Experts vs. Machine


The oscillating behavior of the class transitions makes it difficult to find the appro-
priate point in time at which to look up the strain values for determining
the FLC. For example, the classifier may provide an oscillating result vector, such as
yb = 0101010111222233, where 0, 1, 2, 3 depict the respective classes from homoge-
neous forming to crack, whereas their positions correspond to the point in time. It
is not possible to determine when the transition from C0 to C1 takes place since
the probabilities vary around 50%. This fluctuation is compensated by assigning
each time-step to a single class only once the class membership is sustained, i. e. the
previous result vector would be interpreted as yb = 0000000111222233. Consequently,
the first occurrence of each class is used as the time-step to lookup the principal
strain values. To realize a comparison with the line-fit method, the FLCs are created
using the same localization area and strain values to ascertain whether the point in
time that is used as lookup is responsible for the deviation between the methods.
A comparison with the expert decisions is additionally pursued, that uses the same
strain history and evaluation area as the previous two methods. Of course, as a result
of the majority voting scheme, the ground truth vector does not exhibit an oscillat-
ing behavior. A graphical representation comprising the three failure stages for each
FLC candidate and material is visualized in Figure 5.7.
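One plausible realization of this post-processing, consistent with the example above but not necessarily the thesis implementation (the function names are hypothetical): the onset of class k is taken as the first time-step from which the prediction never falls below k again, and the first occurrence of each class then serves as the lookup time-step for the principal strains.

```python
def monotonize(preds):
    """Enforce a non-decreasing failure-class sequence.

    The onset of class k becomes the first index from which the
    prediction never falls below k again (running minimum from the right)."""
    n = len(preds)
    suffix_min = [0] * n
    m = preds[-1]
    for i in range(n - 1, -1, -1):
        m = min(m, preds[i])
        suffix_min[i] = m
    return suffix_min

def first_occurrences(preds):
    """Time-step of the first occurrence of each class, used to look up strains."""
    onsets = {}
    for t, c in enumerate(preds):
        onsets.setdefault(c, t)
    return onsets
```

Applied to the oscillating vector 0101010111222233, this yields 0000000111222233 and the class onsets at time-steps 0, 7, 10 and 14, respectively.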
A forming curve is created for every failure class of the classification results and
compared to the expert ratings. The solid lines represent the results of the classi-
fier, while the dashed lines depict the results of the annotations. In order to assess
the results of the experts and the classifier, both the standard ISO method and the
line-fit (LF) method are used for comparison. A high level of correlation is observed
between the diffuse necking class of the classifier and the experts’ annotations for
DX54D (2.00 mm) among all evaluated geometries with only a small deviation. With
the exception of S125, this also applies to DP800 as well. Within DX54D (0.75 mm),
the situation is different as the classification results fluctuate around the expert re-
sults with a pronounced discrepancy for S245. This is presumably to be attributed


Figure 5.7: Comparison between the expert determined FLCs and the classification
results: (a) DX54D (2.00 mm). (b) DP800. (c) DX54D (0.75 mm). (d) AA6014.
Source: [Affr 18] (CC BY 4.0)

to the poor agreement among experts, as indicated by the low Fleiss’ kappa values
in Table 5.2. Since the majority vote is used to train the classifier, poor agreement
among experts likely leads to majority votes across multiple forming experiments of
the same geometry that are mutually inconsistent. Hence, they correspond to fail-
ure stages that are less comparable in terms of class occurrences in time, as well as
regarding their forming progression and strain distributions. The localized necking
curve of the experts and the classified result coincide, independent of the underlying
sample geometry and evaluated material. Only for S050 and S060 within DX54D
(0.75 mm), minor deviations can be observed.
Overall, the curves generated by the experts and the curves of the classification ap-
proach are comparable to the one determined according to DIN EN ISO 12004-2. The FLC
using the line-fit method is, in general, less conservative and proposes higher forming
limits. This is especially emphasized within DP800 or AA6014 and is particularly
apparent in the uniaxial to plane strain geometries. In all materials except AA6014,
the curves representing the crack class of the classifier and the experts' annotations are at
an equivalent level. The rather large deviation in AA6014 might again result from the
low sampling rate, as this would lead to large differences within the strain distribu-
tions of subsequent frames. In addition, incorrectly annotated instances (generated
by the experts) may also contribute to the observed deviations. However, this is
supported by the fact that not all experts consistently used an increasing amount of
defect pixels as an indicator for the initiation of the crack class. The small distance
between the localized necking and the crack class in case of DX54D (2.00 mm) and
DP800 can be interpreted that only very little additional forming is achievable prior
to failure of the specimen. Conversely, this distance is rather large for the remaining
two materials, which suggests further forming capacity. For AA6014, this again may
be introduced by the low sampling rate.

5.4 Discussion and Conclusion


Determining the necking behavior by means of pattern recognition requires a precise
definition of defect classes. This study therefore presents a novel approach for catego-
rizing forming processes into several failure classes based on the principles of pattern
recognition. An optical strain measurement system was used to acquire strain dis-
tributions of forming experiments using a Nakajima test setup. These distributions
are characterized by major strain, minor strain and thinning and were interpreted
by experts to visually assess forming sequences and assign failure classes. The het-
erogeneity of the necking behavior was emphasized by the expert annotations for the
different materials and strain conditions. However, misinterpretations between the
diffuse and localized necking class may occur, whereas the crack class is well-defined
and easily detectable. Considering the difference between the strain distributions
(Figure 5.3), it is evident that for the local and diffuse necking classes, the experts’
knowledge exhibits a small deviation for the S050 geometry. The diffuse necking class
deviates considerably for the S110 and S245 geometries, confirming the heterogeneous
material behavior under different loading conditions. Since the S050 geometry is
subject to a near-uniaxial loading condition, the experts could take the negative
minor strain and the diffuse necking along the width into account. The specimen
geometries under near plane strain (S110) and biaxial strain (S245) conditions exhibit
discrepancies between the experts' decisions, in particular for diffuse necking. Another
observation is the data imbalance of the different failure classes, as usually, the
number of images with homogeneous and diffuse necking class strain distributions
is many times higher in comparison to the localized necking and crack classes that
contain the instability. This demonstrates that the onset of instability occurs only a few
millimeters of punch displacement before crack initiation, which correlates well with
the findings of [Volk 11] and their definition of instability. The most challenging
geometry is the S245 geometry under biaxial stretching. This loading
condition affects the entire evaluation area with a homogeneous thinning and irre-
spective of the material, the strain distribution develops gradually and causes varying
degrees of thinning until the sudden development of a crack. Consequently, experts
detected the localized necking rather late, only a few stages prior to crack initiation.
Another difficulty is the definition of the diffuse necking as it can be considered to
start at very early stages without explicit indications.
Furthermore, the guidelines used by the experts for annotations were based on
the tensile test. In this experimental setup, the stress can be considered uniaxial
and it can be conveniently described by consideration of only the planar components.
However, in Nakajima test setups for specimens with increasing widths (cf. Subsec-
tion 2.2.2), the stress conditions are more complex. For this reason, it is easier to
determine the defect classes for smaller specimen geometries that behave comparably
to the tensile test. Larger specimen geometries lead to increasingly complex strain
developments that are more difficult to detect. This is supported by Figure 5.3, as the
geometry under near plane strain condition was interpreted differently by the experts
for the diffuse and localized necking classes, and is additionally emphasized by the
Fleiss-Kappa values in Table 5.2. The quality of agreement decreases starting from S110
while reaching its minimum at S245. This degradation is primarily a result of the low
consistency of the homogeneous and diffuse necking classes, as they predominate the
forming process and consequently distort the statistics. In addition, it might also be
a side effect of the localized necking appearance, as it is not concentrated in a small
area but distributed over larger parts of the image, rendering it increasingly difficult
for experts to pinpoint the exact onset of necking in time (cf. Section 4.3).
Furthermore, this phenomenon also accounts for the consistently low Fleiss-Kappa
values of DX54D (0.75 mm), since the ductility of the material combined with a
high frequency exacerbates the problem for experts to distinguish the homogeneous
class from diffuse necking as well as the diffuse from localized necking. Conversely,
when using a low sampling rate (AA6014 and DX54D (2.00 mm)), or when analyzing
a less-ductile material (DP800), a moderate to good agreement is achievable. Fig-
ure 5.3 emphasizes that the annotations for localized necking are consistent despite
the low kappa value for the specimen under biaxial loading (S245). The consistently good
agreement throughout the localized necking class might be introduced by the usage
of the GUI, which provides the evaluator with virtual cross-sections along the sample
and consequently reduces the deviation between experts (cf. Section 4.5).
Generally, the classification results (cf. Table 5.3) emphasize that the combination
of expert knowledge with a classification algorithm is a convenient approach for the
assessment of the failure behavior of sheet metals. This is especially valid for the
identification of localized necking, since the ability of the experts to distinguish different
failure classes by visual inspection proved to be a valuable resource, as demonstrated
by the achieved 85% recall independent of the material. The consistency among
experts is dependent on the sampling rate and the ductility of the material and con-
sequently affects the ground truth vector that is used for training of the classification
algorithm and therefore additionally impacts the classification results. Inconsistent
annotation of sequences consequently leads to increased misclassifications in the tran-
sition areas between classes. This is illustrated by the consistently lower recall rates
for the diffuse necking class compared to the localized necking class in all materials
except DX54D (2.00 mm). Since this is a ductile material, the under-performance
might also be a limitation of the feature space, which focuses on edge information
that is better suited for the localized necking class.
Overall, optimal features are of interest in future work, since with the inconsistencies
between experts, this cannot be investigated unambiguously.
This study proposes three different evaluation areas, with the patch-wise pro-
cedure being the superior approach for the evaluation of uniaxial to plane strain
geometries with a patch size of 24 px or 32 px using Homogeneity or HoG features.
However, for the biaxial condition S245, a larger evaluation range was considered
advantageous, either with a single patch centered around the maximum of the last
valid stage or based on the centered approach using the center of mass. For opti-
mal classification results, the evaluation strategy would have to be applied separately
depending on the sample geometry, but since the uniaxial geometries dominate the
data distribution, only the most efficient features were examined. Overall, precise
and consistent annotations of the onset of localized necking in time are more
important than the choice of features. The FLCs of the experts coincide very
well with the ones of the classification algorithm (cf. Figure 5.7), which applies espe-
cially for the localized necking candidate with a high correlation. This is reasonable
since the experts were supported with on-line virtual cross-sections along the strain
distribution that facilitated the determination of a sudden increase in the strain dis-
tribution, which might also explain the good agreement with the FLC determined
according to the ISO. This observation additionally explains why features exploiting
edge information outperform the LBP features since the latter are unable to capture
rising intensity differences in the neighborhood as only binary comparisons are pos-
sible. However, this could be overcome by including additional information into the
LBP, such as the variance.
Despite the encouraging results of this study, the proposed method still has some
limitations: (1) one must decide to use a particular feature in advance; (2) the se-
lection of the evaluation area is ambiguous, while in case of uniaxial geometries
patch-wise evaluation regions seem appropriate, biaxial geometries require a larger
evaluation area, and overall a mixed evaluation procedure would be necessary for best
results; (3) geometry dependent evaluation might be beneficial for the classification
process as side effects may be excluded (4) dependence on multiple experts for the
generation of ground truth annotations is expensive, time-consuming and would be
necessary for every new material; (5) lack of consistency between experts’ annotations
in case of the diffuse necking class. The main disadvantage of the proposed method
is the fourth limitation, which could be addressed with unsupervised classification
algorithms that focus only on the localized necking class. Consequently, the low
consistency within the diffuse necking class, as well as the dependency on expert
knowledge, would be eliminated. So far, it seems that if the experts were able to
define consistent points in time within the strain distributions, the proposed super-
vised classification approach would easily be able to separate classes, as the individual
repetitions of each geometry are comparable in feature space. Therefore, the overall
classification performance is limited by the expected error of the experts. Conse-
quently, an improvement in the consistency of the ground truth might be achieved by
assessing metallographic investigations as suggested for DX54D [Affr 17] and DP800
[Affr 18].
Despite these limitations, the presented work highlights the potential of conven-
tional pattern recognition methods in the field of forming limits determination. It
enables staging of the failure behavior of sheet metals based on feature representa-
tions and specifically, it has been demonstrated that experts are able to assess the
stage of material failure of specimens during forming processes. In addition, it was
inferred that their knowledge could be transferred and used to create FLCs that
support multiple failure stages, without restricting the evaluation area or focusing on a
particular principal strain.
CHAPTER 6

Unsupervised Determination of Forming Limits using Conventional Machine Learning

6.1 Method
6.2 Experiments
6.3 Results and Discussion
6.4 Conclusion

Chapter 5 introduced pattern recognition methods in the field of sheet metal
forming and the determination of FLCs. Instead of focusing on pre-determined principal
strains in a limited evaluation area, multiple rectangular patches of the image
were extracted and assessed in combination with HoG features and a classifier. Since
the principles of supervised pattern recognition were pursued, the ground truth an-
notations of multiple failure classes (homogeneous forming, diffuse and local neck-
ing, crack) made by experts, could be related to the obtained classification results.
One implication was that expert knowledge can be integrated into the classification
model for consistent determination of localized necking. However, the accuracy and
efficiency of the results obtained are significantly affected by the intrinsically subjective
nature of the expertise and annotations. Additionally, the quality of these anno-
tations may be impeded by external factors such as the sampling frequency or the
punch velocity, especially in case of the diffuse necking class. Within this class, the
experts obtained the least consistency. Another drawback, which applies to many
supervised classification applications, is the necessity of expert-assisted ground truth
annotations, which are time-consuming and expensive to derive and therefore reduce
applicability. To address these limitations, this chapter presents an unsupervised
classification approach based on an O-SVM (cf. Subsection 3.3.3) and focuses on the
most important failure class, the localized necking. The main contributions are the
independence from expert knowledge and the introduction of a probabilistic assessment
of localized necking using objective gradient features. Again, the proposed method
follows the principle of the pattern recognition pipeline (cf. Figure 3.1) using the
same dataset as in Chapter 5. One major modification of the dataset is the use of the
time derivative to emphasize the progression of strain across successive images and to
highlight the localization effect, as suggested by Vacher et al. [Vach 99]. For comparison,
the results of the unsupervised pattern recognition approach are contrasted with
the expert annotations and the FLC of the time-dependent state-of-the-art method.
Most of the findings were published in the shared first author publication [Jare 18],
while only the pattern recognition part is covered in the following.

6.1 Method
The proposed unsupervised classification approach follows the typical pipeline used
in pattern recognition approaches (cf. Figure 3.1), which consists of four sequential
steps: data acquisition, preprocessing, feature extraction, and classification. In the
learning phase of the classification step, the data is subdivided into a disjoint training
and test set. Supervised classification algorithms utilize ground truth annotations to-
gether with the training set to learn a decision boundary that optimally separates the
class members from each other. The unseen test data is used to evaluate the quality
of the separation hypothesis, simulating a real-world scenario. Since this study em-
ploys an unsupervised classification approach based on O-SVM (cf. Subsection 3.3.3),
the classifier can be trained without the need for expert annotations or ground truth
labels. As emphasized in the previous chapter, gradient-based features, such as HoG
(cf. Subsection 3.2.4), proved to be well suited to capture the localized necking effect
in a supervised classification setup. The best performance was achieved in combina-
tion with the use of nine patches. In addition, it was demonstrated that the experts
annotated the data based on sudden cross-sectional changes through strain distri-
butions. Consequently, this study relies on the previous findings and employs the
same patch and feature extraction scheme. In contrast to the previous study, the
time derivative of the video sequences is used, which highlights small changes in the strain
distributions between successive images, while the evaluation area is kept sufficiently
large to cover the necking region.

6.1.1 Preprocessing
Although this study uses the same dataset as in the supervised classification ap-
proach (cf. Chapter 5), the preprocessing step is slightly modified and adapted to
the processing of the time derivative sequences. Again, the video sequences represent
the principal strains comprising major strain, minor strain and thinning. These se-
quences may be impeded by defect pixels, which adversely affect the time derivative
of the video sequences and may be introduced by specimen preparation or DIC fail-
ure. Consequently, the temporal information of the strain progression of single pixels
is used to interpolate occasional missing values, where possible. Static defect pixels
that contain no information across the entire forming procedure are replaced by the
average strain value of a 3 × 3 neighborhood.
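The two repair steps described above can be sketched as follows. This is a minimal illustration rather than the thesis implementation; the function name, the linear temporal interpolation, and the use of a sentinel `defect_value` marking failed DIC pixels are assumptions:

```python
import numpy as np

def repair_defect_pixels(seq, defect_value=0.0):
    """Repair defect pixels in a strain video sequence (T, H, W).

    Occasionally failing pixels are filled by linear interpolation along
    the time axis; pixels defective in every frame are replaced by the
    mean of their valid 3x3 spatial neighborhood.
    """
    seq = seq.astype(float).copy()
    t_idx = np.arange(seq.shape[0])
    static = np.all(seq == defect_value, axis=0)                 # defective in all frames
    occasional = np.any(seq == defect_value, axis=0) & ~static

    # temporal interpolation for occasionally missing values
    for y, x in zip(*np.nonzero(occasional)):
        trace = seq[:, y, x]
        bad = trace == defect_value
        trace[bad] = np.interp(t_idx[bad], t_idx[~bad], trace[~bad])

    # spatial 3x3 average for permanently defective pixels
    H, W = seq.shape[1:]
    for y, x in zip(*np.nonzero(static)):
        y0, y1, x0, x1 = max(y - 1, 0), min(y + 2, H), max(x - 1, 0), min(x + 2, W)
        for t in range(seq.shape[0]):
            patch = seq[t, y0:y1, x0:x1]
            valid = patch[patch != defect_value]
            if valid.size:
                seq[t, y, x] = valid.mean()
    return seq
```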
Furthermore, as frames of the crack class deviate significantly from the other classes,
these instances are automatically removed from the individual sequences by exploiting
and identifying the increasing number of defect pixels. As their presence is indicative
of the onset of material failure, defect pixels are used to shorten the video sequences
and to guarantee that feature extraction and subsequent analyses are conducted using
only valid and representative image information. Based on the backward difference,
Figure 6.1: Different loading conditions with their spatially varying evaluation areas: (a) DP800-S050-3 uniaxial; (b) DP800-S110-2 plane strain; (c) DP800-S245-1 biaxial. Source: [Jare 18] (CC BY 4.0)

the time derivative of the video sequences is computed after the defect pixel
interpolation. Consequently, only the differences between successive images are
emphasized; these serve as inputs to the algorithm.
A quantile normalization scheme is applied to each individual image in order to reduce
the influence of measurement errors, e.g. introduced by single pixels that are off by a
large magnitude. The normalization scheme uses the 0.5% and 99.5% percentiles
of the image intensities as lower and upper bounds to convert the intensity range to
0–1. Moreover, the anticipated necking region is specified by the maximum strain on
the last valid frame of the forming sequence without any defect pixels. With respect
to the extreme value, the vicinity is subdivided into nine patches with a side length
of 32 px and a step-size of 8 px, such that the maximum is uniformly distributed
across the patches as visualized in Figure 6.1.
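The backward difference and quantile normalization can be sketched compactly, together with a hypothetical helper for the nine-patch layout around the strain maximum. The epsilon guard in the denominator and the exact corner arithmetic of the grid are assumptions:

```python
import numpy as np

def preprocess_sequence(seq):
    """Backward difference over time followed by per-image quantile
    normalization to the 0-1 range (0.5% / 99.5% percentile bounds)."""
    diff = np.diff(seq, axis=0)   # frame[t] - frame[t-1]
    lo = np.percentile(diff, 0.5, axis=(1, 2), keepdims=True)
    hi = np.percentile(diff, 99.5, axis=(1, 2), keepdims=True)
    return np.clip((diff - lo) / (hi - lo + 1e-12), 0.0, 1.0)

def patch_grid(center, patch=32, step=8, n=3):
    """Top-left corners of a 3x3 grid of 32 px patches (step 8 px), laid
    out so that `center` (the strain maximum) lies inside every patch."""
    cy, cx = center
    offs = [(i - n // 2) * step for i in range(n)]
    return [(cy - patch // 2 + dy, cx - patch // 2 + dx)
            for dy in offs for dx in offs]
```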

6.1.2 Feature Extraction


HoG features (cf. Subsection 3.2.4) have consistently proven to provide the best results in a
supervised classification task (cf. Chapter 5). The image content of each rectangular
patch is encoded by the introduced feature descriptor, whereas the subdivision into
cells and blocks is omitted, since the patches already cover a very limited region of
32×32 px. Since the images are already quantile normalized during preprocessing, the
normalization procedure is omitted as well. In order to support fine-grained sensing,
the orientation resolution is set to 5°.
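Since the thesis does not give the exact implementation, a hand-rolled single-cell orientation histogram can illustrate the idea; the assumption of unsigned gradient orientations over 0–180° (giving 36 bins at 5° resolution) and the magnitude weighting are choices made here for illustration:

```python
import numpy as np

def hog_patch(patch, bin_deg=5):
    """Orientation histogram of gradients for one patch: a minimal HoG
    without cell/block subdivision or normalization, as each 32x32
    patch is treated as a single cell."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0     # unsigned orientation
    n_bins = int(180 / bin_deg)                      # 36 bins at 5 degrees
    hist = np.zeros(n_bins)
    idx = np.minimum((ang / bin_deg).astype(int), n_bins - 1)
    np.add.at(hist, idx.ravel(), mag.ravel())        # magnitude-weighted votes
    return hist
```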

6.1.3 Classification: One-Class-SVM


As introduced in Subsection 3.3.2, an SVM approximates a hyperplane that separates
instances of one class from another. This is derived by using the available training
data and the corresponding ground truth label vector and maximizing the margin
between the individual decision boundaries. Since the ground truth labels are not
available in an unsupervised classification setup, the generation of a separation hy-
perplane is rather difficult. O-SVM (cf. Subsection 3.3.3) solves this problem by
estimation of compact regions in feature space, which contain only a selected fraction
of the training data. This can be visualized in feature space as finding the center and
radius of a sphere [Tax 99] that covers a defined fraction of the training data describ-
ing the inlier class. The remaining instances lying outside the sphere are considered
as outliers (cf. Figure 3.13). Consequently, most of the unknown data distribution is
covered without estimation of the distribution and its parameters. Again, the sepa-
ration hypothesis is evaluated using the test set and, in general, the confidence of being
classified as an inlier decreases with increasing distance to the center. The shape and
support of the O-SVM is affected by two parameters (ν, γ). The first parameter
describes the fraction of tolerated outliers in the training set and the second param-
eter defines the width of the Gaussian radial basis function kernel, which controls the
influence and support of the individual features. Within this study, solely data of the
homogeneous forming phase (2 mm punch movement) is used to train the classifier,
while the remaining data is considered as unknown forming phase (homogeneous and
inhomogeneous, 2 mm punch movement) and serves as test set. Only clean data of
the homogeneous phase, without necking behavior and with only a small amount of
measurement errors is used for training of the classifier. The fraction of outliers as
indicated by the ν parameter is naively set to 0.05 and the individual support of
features, determined by the γ parameter is set to 1/n, where n denotes the number
of features per image. Generally, the problem can be considered as anomaly detec-
tion, where an anomaly is defined as the deviation from the expected gradient range
occurring within the homogeneous forming phase.
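This training setup can be sketched with scikit-learn's OneClassSVM, assuming the per-patch HoG vectors have been stacked into feature matrices; the synthetic data here merely stands in for the homogeneous training phase and a drifted test phase:

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Hypothetical feature matrices: rows are 36-dimensional HoG vectors per patch.
rng = np.random.default_rng(42)
X_train = rng.normal(0.0, 1.0, size=(500, 36))               # homogeneous phase
X_test = np.vstack([rng.normal(0.0, 1.0, size=(100, 36)),    # still homogeneous
                    rng.normal(4.0, 1.0, size=(100, 36))])   # drifted ("necking")

# nu = tolerated fraction of outliers in the training set,
# gamma = 1/n_features ("auto" in scikit-learn), as described above.
clf = OneClassSVM(kernel="rbf", nu=0.05, gamma="auto").fit(X_train)

scores = clf.decision_function(X_test)   # unbounded confidence scores
labels = clf.predict(X_test)             # +1 inlier, -1 outlier
```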

Deterministic FLC
A deterministic FLC is derived using the confidence scores of the O-SVM. These are
binarized using a 50% threshold, such that an instance can either be an inlier or an
outlier independent of the actual level of confidence. Nine patches are extracted from
each frame and classified independently. Overall, it is impossible to prove whether the onset
of necking within a specimen is indicated by a single patch classified as an outlier, or
whether all nine patches must be classified as outliers. Consequently, two
FLC candidates are introduced. The first candidate (SVMe) is determined by the
last valid frame of the test set sequences that does not contain patches classified as
outliers. Hence, the individual point in time of each test sequence is used to lookup
the strain value pairs. The second candidate (SVMf) is determined by the first frame
with all patches being classified as outliers. Consequently, the two candidates can
be interpreted as the onset and end of the localization phase. Single patches being
erroneously classified as outliers are inevitable due to the presence of measurement
errors, which is particularly problematic for the SVMe candidate. For this reason,
the first stage without any patch classified as an outlier is found by parsing backwards
through each test video sequence, starting from the last frame.
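The two candidate definitions can be expressed over a boolean outlier mask per frame and patch. The helper name `flc_candidates` is hypothetical, and returning `None` when no qualifying frame exists is an assumption:

```python
import numpy as np

def flc_candidates(outlier_mask):
    """Frame indices of the two deterministic FLC candidates.

    outlier_mask: bool array (n_frames, 9), True where a patch was
    classified as an outlier.
    SVMe: last frame with no outlier patch, found by parsing the sequence
    backwards so that isolated spurious outliers in early frames are ignored.
    SVMf: first frame where all nine patches are outliers.
    """
    per_frame_any = outlier_mask.any(axis=1)
    per_frame_all = outlier_mask.all(axis=1)

    svme = None
    for t in range(len(per_frame_any) - 1, -1, -1):   # backwards search
        if not per_frame_any[t]:
            svme = t
            break
    svmf = int(np.argmax(per_frame_all)) if per_frame_all.any() else None
    return svme, svmf
```

The returned frame indices would then be used to look up the strain value pairs for the FLC.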

Probabilistic FLC
In addition to the deterministic FLC candidates, this study proposes a probabilistic
FLC. This candidate is derived using a post-processing strategy that exploits the fact
that the test sequences are shortened to the same length (in time). Time can thus be
integrated as a feature/variable and used as an additional source of information. The
confidence scores of the O-SVM are usually unbounded, as they represent distances
from the decision boundary. Since the test set covers the remaining homogeneous and
the entire inhomogeneous part of the forming process of each experiment, the mini-
mum and maximum values are used to normalize the confidence scores into the 0–1
range. Consequently, as the three individual test set sequences per geometry possess
the same length and duration, it is feasible to combine them into one distribution
that depends on the normalized confidence scores over time.
This distribution consists of two parts, the homogeneous forming and the inhomo-
geneous forming phase. A GMM (cf. Subsection 3.3.4) comprising two Gaussians is
used to approximate the parameters of the distributions based on EM (cf. Subsec-
tion 3.3.4). The GMM centroids of the algorithm were initialized with the empirically
determined means of the inlier and outlier classes. After deriving the parameters of
the distributions, each of the nine patches receives an individual probability that
determines its class affiliation. Consequently, the average of nine patches per image
is used to calculate the average probability of each image in the test sequence. An
overview of the evaluation pipeline is visualized in Figure 6.2.
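The post-processing can be sketched with scikit-learn's GaussianMixture; the synthetic confidence scores stand in for real O-SVM outputs, and both the shape of that data and the 0.7 split used for the empirical initialization of the centroids are assumptions:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical data: columns are (normalized time, normalized confidence score)
# for the combined repetitions of one geometry.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 240)
score = np.where(t < 0.7, -0.1, -0.1 - 4.0 * (t - 0.7))      # drop towards the end
score += rng.normal(0.0, 0.05, size=t.size)
score = (score - score.min()) / (score.max() - score.min()) - 1.0
X = np.column_stack([t, score])

# Two Gaussians, one per forming phase, with centroids initialized at the
# empirical means of the assumed homogeneous and inhomogeneous parts.
init = np.array([X[t < 0.7].mean(axis=0), X[t >= 0.7].mean(axis=0)])
gmm = GaussianMixture(n_components=2, means_init=init, random_state=0).fit(X)
p_outlier = gmm.predict_proba(X)[:, 1]   # per-patch outlier probability
```

Averaging `p_outlier` over the nine patches of a frame then yields the per-image probability described above.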
The pipeline begins with the unbounded confidence scores estimated per patch for
the three forming repetitions of one geometry in Figure 6.2 (a). Towards the end of
each experiment, the decreasing confidence is a sign of the onset of necking, although
samples are still considered to be within the inlier class/homogeneous phase. This is
highlighted in the combined normalized distribution in Figure 6.2 (b). Additionally,
at the beginning of the distribution and towards the end, outliers may occur which
are primarily attributable to temporal measurement noise. Figure 6.2 (c) illustrates
the negative log-likelihood space of the GMM for both associated Gaussian distri-
butions. Obviously, the outliers at the beginning and end of the sequence are now
considered correctly to be part of the inlier class/distribution. Figure 6.2 (d) visual-
izes the average probability per image of each test sequence, wherein the probability
of belonging to the outlier class increases towards the end of each experiment.
In general, the onset of necking is expected to occur in the elbow region of the
curves as depicted in Figure 6.2, where the confidence of instances decreases, while
still being classified as inliers. This is emphasized by the color-coded probability in
Figure 6.3 that facilitates to highlight the transition from the homogeneous to the
inhomogeneous forming phase.
Figure 6.2: Anomaly detection pipeline exemplified by DP800-S050: (a) Confidence scores per patch and experiment together with their individual averages; (b) Combined and normalized confidence scores for all experiments over the normalized time; (c) Confidence scores with respect to the negative log-likelihood of the determined GMM; (d) Average probability of being classified as an outlier and its progression over time for each repetition of the DP800-S050 experiment. Source: [Jare 18] (CC BY 4.0)
Figure 6.3: GMM decision boundaries. The transition between the inlier and outlier class is highlighted by the color-coded probability. Effectively, the inliers in the curved region would now be considered as outliers according to the GMM. Source: [Jare 18] (CC BY 4.0)

6.2 Experiments
In general, forming experiments can be subdivided into two forming phases, the ho-
mogeneous and inhomogeneous forming phase. Usually, the majority of the data
corresponds to the homogeneous phase, while the data of the inhomogeneous forming
phase is underrepresented. Consequently, the determination of the forming limit can
be interpreted as an anomaly detection problem, where the deviation from the ho-
mogeneous phase is classified as the onset of localized necking. The line-fit method,
as described in Section 2.3, uses the last 4 mm of the punch movement to approx-
imate the two forming phases with two regression lines. Herein, the regression line
that corresponds to the unstable condition is estimated using the last 2 mm of punch
displacement, while the regression line that corresponds to the stable condition is
approximated using the preceding 2 mm [Merk 17].
The same assumptions are considered in the present investigation and thus, the
dataset is limited to the last 4 mm of the forming process as well. The first 2 mm
of the sequences, that correspond to the homogeneous phase, are used to train the
O-SVM, whereas the last 2 mm of the sequences are used for the detection of the
anomaly. Essentially, the amount of data for training the classifier and estimating
the parameters of the Gaussian distributions can be chosen arbitrarily. The only
limitation is that data from the homogeneous forming phase must be included in the
training set. For this reason, for DX54D (2.00 mm), the restriction using only 4 mm
is extended to 6 mm, in order to maintain the 50% training and test split.
The materials with their available geometries and the corresponding process parame-
ters are summarized in Table 4.2, while every geometry was evaluated separately and
individually for each material. As a result of the different sampling frequencies, the
number of available frames varies per material. Overall, the dataset consists of 160 images for
DX54D (0.75 mm) and DP800, whereas 80 and 60 images are available for DX54D
(2.00 mm) and AA6014, respectively. Per geometry, the first 2 mm of the three
forming experiments are combined into one dataset to train the O-SVM. The combination
of three repetitions per geometry enables the estimation of an average FLC,
based on the failure behavior and support of multiple different forming experiments.
Specifically, this aspect is of high relevance for most applications.
However, as mentioned before, the extent of the evaluation area should generalize
to the necking behavior of the different geometries (cf. Figure 6.1) such that the
area of interest is included for each geometry and hence the patch size is set to a
side length of 32 px. Additionally, the size and variation of the training dataset
are increased by applying multiple variants of data augmentation. This means the
images are flipped (horizontal, vertical, or both) and randomly rotated (2–15°) to
take into account the various possible orientations of specimens and consequently
of their necking behavior. For a DP800 forming experiment, this results in a training
dataset of 3 × 80 × 12 × 9 = 25,920 patches.
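The augmentation factor of 12 is consistent with, for instance, the four flip variants (including the original) combined with two random rotations each; this exact composition is an assumption, since the text only states the flip and rotation ranges and the resulting count:

```python
import numpy as np
from scipy.ndimage import rotate

def augment(patch, rng):
    """Flipped (horizontal, vertical, both) and randomly rotated
    (2-15 degree) variants of a patch: 12 training samples per input."""
    flips = [patch, patch[:, ::-1], patch[::-1, :], patch[::-1, ::-1]]
    out = []
    for f in flips:
        out.append(f)
        for _ in range(2):
            angle = rng.uniform(2.0, 15.0)
            out.append(rotate(f, angle, reshape=False, mode="nearest"))
    return out
```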
Yet, a quantitative evaluation of the method is rather difficult, as no ground truth
based on metallographic investigation of the sheet metal is available. In order to
realize these investigations, it would be necessary to stop the forming process at the
exact point in time when necking occurs. So far, this point in time can only be deter-
mined approximately due to the spontaneous nature of material failure. In order to
still evaluate the FLC candidates of the proposed methods, a qualitative comparison
of the results with the corresponding time-dependent line-fit method and with the
experts’ annotations of Chapter 5 is presented in the following.

6.3 Results and Discussion

6.3.1 Deterministic FLC

The deterministic FLC for each material is visualized in Figure 6.4. All materials
except DX54D (2.00 mm) exhibit high consistency between the SVMf candidate and
the line-fit method. This is reasonable, since both strategies exploit the difference
between subsequent images and therefore evaluate comparable information domains.
For all materials, as expected, the SVMe estimate is consistently lower than that of
SVMf since an earlier point in time is used to lookup the strain value pairs. The three
candidates SVMe, SVMf and line-fit coincide under uniaxial loading conditions, in
particular for DP800. The notably large distance between the line-fit method and
the SVM candidates in case of DX54D (2.00 mm) could either be a consequence of
the rather low sampling frequency or be induced by the different evaluation areas used
to determine the onset of necking. A comparison with the experts’ annotations of
Chapter 5 and the SVMe candidates reveals high agreement, with the former lying
marginally below the latter for all materials except the uniaxial loading conditions
of DX54D (2.00 mm). Consequently, the experts have consistently defined an earlier
stage in the strain distribution as the onset of localized necking. This seems rea-
sonable since the experts were using sudden increases between two subsequent strain
distributions as an indicator for localized necking in the supervised experiment (cf.
Chapter 5).
Figure 6.4: Deterministic FLCs in comparison to the results of the line-fit method and the experts’ annotations: (a) DX54D (2.00 mm); (b) DP800; (c) DX54D (0.75 mm); (d) AA6014. Source: [Jare 18] (CC BY 4.0)
6.3.2 Probabilistic FLC


In general, the FLC candidates determined by the experts are found consistently below
the SVMe candidates, indicating that there is room for improvement in determining
the point in time at which necking begins. The experts themselves might
constitute a source of error in defining the exact point in time, whether due to differ-
ences between the experts or a high sampling frequency. As visualized by Figure 6.3,
the confidence of being classified as inlier decreases towards the end of the forming
process, before actually being classified as outlier. This decrease is expressed by
probability quantiles for a data point in terms of its membership to the outlier class
and leads to the definition of the probabilistic FLC. The resulting probabilistic FLCs
for each material are depicted in Figure 6.5. Again, in order to assess the quality of
the probabilistic FLC, the line-fit and the expert determined FLCs are presented for
comparison.
An excellent agreement between the < 0.01 quantile and the experts’ annotations
was observed among all geometries in case of DX54D (0.75 mm). The same de-
gree of agreement is found for the uniaxial loading conditions in the case of AA6014
and DP800. A rather large deviation from the < 0.01 quantile is observable for
DX54D (2.00 mm), which again might be explained by the low sampling rate in
combination with the ductility of the material. Furthermore, major deviations are
found within the plane strain to biaxial strain conditions of AA6014 and DP800,
which result from the low consistency in the experts’ annotations. The > 0.99 quantile
exhibits very good agreement with the line-fit result in all cases with the
exception of DX54D (2.00 mm). The other quantiles lie between the < 0.01 and the
> 0.99 quantile.

6.3.3 Comparison with Time-Dependent Evaluation Method


The line-fit method (cf. Section 2.3) evaluates information similar to that used by
the proposed method to determine the onset of localized necking and therefore consti-
tutes a suitable choice for comparison. The main difference lies in the definition of the
evaluation area, which is restricted to 15–20 connected pixels of the first derivative in
the thinning direction and is determined by using a threshold. This area is averaged
for each frame and its progression over time is then used to define a homogeneous
and inhomogeneous forming phase. These phases are regressed by two lines and the
intersection of these lines defines the onset of necking. Consequently, the determina-
tion of the evaluation area is crucial for the line-fit method as it directly affects the
shape and steepness of the thinning rate and hence the intersection of the regression
lines.
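Stripped of the evaluation-area details, the core of the line-fit method reduces to two least-squares lines and their intersection. This sketch uses stage indices and a synthetic thinning rate rather than the punch displacement and averaged evaluation area of the real method:

```python
import numpy as np

def line_fit_onset(stages, thinning_rate, split):
    """Onset of necking via the line-fit principle: fit one regression line
    to the stable phase (before `split`) and one to the instable phase
    (after), and return the stage at which the two lines intersect."""
    a1, b1 = np.polyfit(stages[:split], thinning_rate[:split], 1)
    a2, b2 = np.polyfit(stages[split:], thinning_rate[split:], 1)
    return (b2 - b1) / (a1 - a2)      # x where a1*x + b1 == a2*x + b2
```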
Conversely, the proposed method identifies a certain point in time when some-
thing abnormal emerges within the images of the homogeneous forming phase. The
major difference is that the gradients are expressed in terms of HoG and that the
abnormality is found within the feature space. Together with the GMM, changes of
the HoG are used to assess the failure probability. From this it follows that, depending
on the progression of the forming process, the likelihood of being in the necking phase
can be defined with a certain confidence. For the purpose of facilitating comparisons
between the proposed method and the line-fit approach, a threshold value of 0.9 of
Figure 6.5: Probabilistic FLCs in comparison to the results of the line-fit method and the experts’ annotations: (a) DX54D (2.00 mm); (b) DP800; (c) DX54D (0.75 mm); (d) AA6014. Source: [Jare 18] (CC BY 4.0)
Figure 6.6: Progression of a DX54D-S030-1 (2.00 mm) forming experiment: (a) Z-displacements of consecutive frames with indicated cross-sections at stages 350, 355, 362 and 370; (b) Illustration of the z-displacement cross-sections for the respective stages; (c) Cross-section profiles of the corresponding strain distributions. Source: [Jare 18] (CC BY 4.0)

the maximum thinning (φ3 ) was employed, with an additional constraint of 15–20
connected pixels. Since the proposed method derives the onset of localized necking
without restrictions of the evaluation area, different thresholds such as the maximum
value would be possible. However, this certainly would affect the location of the FLC.
An in-depth investigation of the z-displacement differences of DX54D-S030-1 (2.00 mm),
as visualized in Figure 6.6 (a), supports the general hypothesis that the line-fit method
overestimates the onset of localized necking. Within this forming sequence, the line-
fit method returns stage 370 as the onset of necking. However, this seems to be
overly optimistic as the darker region already implies some reduction of sheet metal
thickness along the z-axis, as visualized in Figure 6.6 (a) (right). These shadowed
areas can already be perceived in earlier stages, while the earliest reduction
in sheet metal thickness is observed at stage 355. Consequently, the z-displacement
of the sample is a significant source of information that can be used in the evaluation
of ductile materials. Moreover, this is illustrated by the z-displacement cross-section
profiles depicted in Figure 6.6 (b). Additionally, Figure 6.6 (c) depicts the corre-
sponding cross-section profiles of the strain distribution.
As already mentioned, the choice of the measurement area to determine the onset
of necking is crucial. At the beginning of the forming procedure, the thresholded region is
distributed over the entire evaluation area. As forming progresses, this region decreases
in size and concentrates towards the center, which in turn leads to a single connected
particle. This process is visualized in Figure 6.7 (a), which contrasts the extent of the
area as determined by the threshold with the extent of the largest coherent particle.
The relationship between the size of the varying evaluation area and the largest
Figure 6.7: Analysis of the thresholded area and its development in comparison to the HoG feature: (a) Consolidation and growth of the area of DX54D-S030-1 (2.00 mm) (φ3) difference images at stages 300, 330, 360 and 370. In the beginning, the area is spread over the entire image before converging to a single particle; (b) Progression of the size of the thresholded image area (orange) contrasted with the area of the largest connected particle (blue) and the HoG feature (green); (c) An additional example for DP800-S050-1. Once the maximum coherent particle size has been reached, no further forming is feasible, resulting in an increasing gradient for different HoG bins. Source: [Jare 18] (CC BY 4.0)
coherent particle is highlighted for DX54D (2.00 mm) in Figure 6.7 (b). Herein,
the maximum of the thresholded area is reached long before the maximum of the
single connected particle, after which both areas decrease in size. When comparing
the HoG feature with the size of the connected particle, it is obvious that soon
after the particle reaches its maximum, the HoG feature increases. This indicates
that the necking area is unable to spread or concentrate any further, leading to an
increasing gradient and feature response. From this it can be concluded that HoG
features are a reasonable choice as a feature descriptor. The aforementioned
observation depends on the material properties, in particular ductility. As illustrated
by Figure 6.7 (c), similar behavior can also be observed for DP800-S050-1. Here,
the maximum of the thresholded area and connected component nearly coincide,
while the HoG response is delayed by a couple of frames. Additionally, this
comparison reveals the dependence on the threshold value, as choosing, e.g., 0.95
instead of 0.9 would result in a maximum at a different point in time, especially
with respect to the connected component.
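The interplay between the thresholded area and the largest coherent particle can be sketched with standard image labeling. The following Python snippet is an illustrative assumption (threshold value, toy strain map, and default 4-connectivity), not the thesis implementation:

```python
import numpy as np
from scipy import ndimage

def area_statistics(strain_map, threshold=0.9):
    """Total thresholded area and size of the largest connected particle."""
    norm = strain_map / strain_map.max()        # threshold relative to the frame maximum
    mask = norm >= threshold
    labels, n = ndimage.label(mask)             # 4-connectivity by default
    if n == 0:
        return 0, 0
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
    return int(mask.sum()), int(sizes.max())

# Toy frame: two separated high-strain spots, as before they merge into one particle.
frame = np.zeros((20, 20))
frame[5:8, 5:8] = 1.0
frame[12:14, 12:14] = 0.95
total_area, largest_particle = area_statistics(frame)
```

Tracking both quantities per frame reproduces the kind of curves contrasted in Figure 6.7 (b).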
This emphasizes the need to find a material-dependent evaluation area in order
to determine the onset of localized necking, rather than using a fixed threshold in
combination with a minimum number of connected pixels. Considering Figures 6.6
and 6.7 of DX54D-S030-1 (2.00 mm), an optimal evaluation region would include
the complete necking area. This would require a width of around 15 px, as defined
in the cross-section plot shown in Figure 6.6 (b). This width is inferred from
Figure 6.6 (c) and based on the fact that at positions ≈15 and ≈27, the strain
remains static. Since the dependency on a threshold value always affects the
evaluation area and thus the location of the FLC, it might be advantageous to
determine the necking area using computer-assisted image segmentation techniques
(cf. Chapter 9).
Since the evaluation area is not of major interest in this study, a comparison
is provided that considers the determined points in time at which necking is
detected, without consideration of the evaluation area. Herein, the average points in
time of the failure quantiles of the probabilistic FLC are contrasted with the average
of the line-fit method and the experts' decisions, as visualized in Figure 6.8 (b). At
stage 350, the probability that localized necking has initiated is still < 1%. It
increases steadily to a 50% probability by stage 356 and exceeds 99% from stage 362
onwards. In contrast, the line-fit method identifies only one stage (368) as the onset
of necking for DX54D-S030-1 (2.00 mm), whereas the majority of experts identified
stage 364. The repeating time points in Figure 6.8 (b) are a result of the varying
slopes of the probability progression curves per forming experiment, as depicted in
Figure 6.8 (a). Since the experiments are of different lengths, the figure considers
only the last 40 frames of each video sequence.
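The quantile lookup described above can be sketched as follows. The two-component GMM, the 1-D surrogate feature, and the synthetic frame counts are illustrative assumptions rather than the actual pipeline:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# 1-D surrogate feature per frame: low in the homogeneous phase, high once necking starts.
homogeneous = rng.normal(0.0, 0.05, size=(300, 1))
necking = rng.normal(1.0, 0.05, size=(60, 1))
features = np.vstack([homogeneous, necking])

gmm = GaussianMixture(n_components=2, random_state=0).fit(features)
necking_comp = int(np.argmax(gmm.means_.ravel()))      # component with the larger mean
prob = gmm.predict_proba(features)[:, necking_comp]    # per-frame failure probability

def first_stage_exceeding(prob, q):
    """First frame index whose necking probability exceeds quantile q."""
    idx = np.flatnonzero(prob > q)
    return int(idx[0]) if idx.size else None

stages = {q: first_stage_exceeding(prob, q) for q in (0.01, 0.5, 0.99)}
```

For each quantile, the looked-up stage then indexes the strain values that form one point of the probabilistic FLC.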

6.3.4 Comparison of Deterministic and Probabilistic FLC


Both proposed methods, the deterministic and the probabilistic FLC, predict the
onset of localized necking. The deterministic FLC only uses the binary
affiliation of patches to the outlier class without considering neighboring patches,
whereas the probabilistic FLC uses GMM to model the homogeneous and inhomoge-

Figure 6.8: Membership progression and comparison of the averaged lookup time
steps per quantile with the line-fit result of DX54D-S030-1 (2.00 mm): (a) Mem-
bership progression of the last 40 frames for each repetition of the experiment; (b)
Average major and minor strain of the quantiles in combination with the respective
lookup times. Source: [Jare 18] (CC BY 4.0)

neous class and is thus capable of extracting failure probabilities for each time step.
Specifically, this is relevant for the curved region, the transition zone from the
inlier to the outlier class, since it facilitates the detection of the onset of necking by
means of the change in gradient information. Moreover, this detection is expressed
in the form of a likelihood of necking, which allows a probabilistic interpretation
of the forming process. For DX54D, the > 0.99 quantile estimate of the probabilistic
FLC demonstrates close agreement with the SVMf candidate, and in the case of
DP800, the line-fit method additionally correlates strongly with the proposed
approaches, as visualized in Figure 6.9.
One advantage of the probabilistic approach is the possibility to detect the
development towards the outlier class, since it assesses the instances of the curved
region. Consequently, the < 0.01 quantile is found below the SVMe curve. While the
difference for DX54D (2.00 mm) is rather large, it vanishes for DP800 due to its lower
ductility. In general, the < 0.01 quantile and the SVMe candidate need not agree, as
depicted in Figure 6.9 (a). An explanation for this behavior is measurement noise
or unstable outliers, which impede the modeling of the distributions and thus the
lookup times of the approaches.
A substantial difference between these two assessment strategies is that the deter-
ministic FLC could be extended to an online evaluation approach capable of stopping
the forming process when trained incrementally. With such a real-time capable sys-
tem, it would be possible to generate ground truth labels for supervised classification
algorithms, since correlation with metallographic examinations would be enabled. As
a result of possible misclassified outliers during real-time classification, it would be

Figure 6.9: Comparison of the proposed methods D-FLC and P-FLC: (a) DX54D
(2.00 mm); (b) DP800. Very high consistency between the SVMf and the > 0.99
quantile is obtained. As anticipated, the < 0.01 quantile remains below the SVMe
curve. In contrast to the SVMe results, the quantiles are based on the GMM, so
that the S245 conditions of SVMe need not agree due to signal impairments and
unstable outliers. Source: [Jare 18] (CC BY 4.0)

advantageous to employ the SVMf approach for this purpose, since it is unlikely to
classify nine patches as outliers during the homogeneous forming phase. In contrast to
SVMf, such a procedure is not feasible for the probabilistic FLC since it is dependent
on time information.
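How such O-SVM confidence scores separate the homogeneous phase from outlier frames can be illustrated as below. The stand-in features, the `nu` value, and the single-outlier onset rule are assumptions and deliberately simpler than the SVMe/SVMf criteria:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
train = rng.normal(0.0, 0.1, size=(200, 4))               # patches from the homogeneous phase
test = np.vstack([rng.normal(0.0, 0.1, size=(50, 4)),     # still homogeneous
                  rng.normal(2.0, 0.1, size=(10, 4))])    # clearly necked frames

svm = OneClassSVM(nu=0.05, gamma="scale").fit(train)
scores = svm.decision_function(test)            # > 0 inlier, < 0 outlier
onset = int(np.argmax(scores < 0))              # first frame scored as an outlier
```

A stricter rule, e.g. requiring several consecutive outlier patches as in SVMf, would suppress spurious single-frame detections during the homogeneous phase.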
6.3.5 Comparison with Metallography
Another qualitative assessment of the method can be achieved by analyzing the mate-
rial behavior using metallography. Affronti et al. [Affr 18] presented a metallographic
analysis of a DP800 with several forming steps and different strain paths using Naka-
jima tests. This analysis of the surface and thickness indicates that the onset of
necking for this dual-phase steel begins at the surface with micro-cracks determined
on a microscopic scale. These micro-cracks are represented in the strain distributions
with multiple localizations and therefore might serve as ground truth for supervised
classification approaches. Since the sequences up to the forming step of the metal-
lographic examination were recorded, it is feasible to assess the strain paths and to
compare these with the introduced probabilistic FLC, as visualized in Figure 6.10. It
can be observed that the individual strain paths, especially their endpoints that corre-
spond to the metallographic examinations, are in good agreement with the outcome of
the probabilistic FLC. In particular, the average FLC derived from the metallographic
investigations (solid black line) is found above the expert-determined FLC, except
for the S245 geometry. Under biaxial straining, the strain distribution behaves ho-
mogeneously on the surface for most of the processing time and straining occurs in
the sheet metal thickness. Consequently, the evaluation of the onset of necking for
this geometry remains challenging. However, the qualitative evaluation using strain
paths revealed that the metallography agrees with the quantiles of the probabilis-
tic FLC. This indicates that the results of the proposed unsupervised methods are
plausible. In comparison to the metallographically derived FLC, the line-fit FLC is
located above and coincides with the > 0.99 quantile of the probabilistic FLC. Con-
sequently, the evaluated necking phase of the line-fit method is at an advanced stage
of development and therefore less conservative, whereas the experts are conservative
and detect an early stage for the onset of localization. In general, the comparison
with metallographic analysis confirms the validity of the unsupervised approach and
provides an overview of the state of development in terms of failure quantiles.


Figure 6.10: Probabilistic FLC of DP800 in comparison to the LF and EXP
method with additional strain paths of the metallographic investigations of [Affr 18].
Source: [Jare 18] (CC BY 4.0)

6.3.6 Factors of Influence


As already introduced in Section 4.4, signal impairments may degrade the quality of
the result. In particular, incorrect measurements that lead to alternating or fluctuating
strain values are problematic, since they cannot be distinguished from correct
measurements. As a result, individual pixels with incorrectly increased values are
generated for a short period of time. Since the proposed method uses the time
derivative in combination with gradient features (HoG), those errors are even ampli-
fied. Therefore, optimal preparation of the specimen is crucial, as it directly improves
the traceability of blocks using DIC. Although perfect preparation mitigates this type
of artifact, it is still challenging to completely avoid its occurrence. An example of the
fluctuating pixels or erroneous measurements within a strain distribution is visualized
in Figure 6.11, together with its effect on the time derivative.
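The amplification of a single erroneous measurement by the time derivative can be demonstrated with a minimal numpy example; the strain values and the defect magnitude are invented for illustration:

```python
import numpy as np

frame_t0 = np.full((9, 9), 0.30)          # smooth strain distribution at time t
frame_t1 = frame_t0 + 0.01                # homogeneous forming increment at t+1
frame_t1_noisy = frame_t1.copy()
frame_t1_noisy[4, 4] += 0.20              # one erroneous DIC measurement

diff_clean = frame_t1 - frame_t0          # constant 0.01 everywhere
diff_noisy = frame_t1_noisy - frame_t0    # spike of 0.21 at the defect pixel

gy, gx = np.gradient(diff_noisy)          # gradient-based features see the spike strongly
grad_mag = np.hypot(gx, gy)
```

In the clean difference image the gradient is zero everywhere, so even a single defect pixel dominates the gradient response of the whole patch.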
Besides signal impairments, further consideration is required for the geometry
S245 under biaxial straining. The symmetry of the strain condition and the geometry
does not permit a preference for the determination of the strain development, so
that the crack can occur in an arbitrary direction. On the one hand, consistent
placement of the S245 test specimens could improve the evaluation. Since the crack tends

Figure 6.11: Visualization of pixel perturbations as a result of measurement noise
and its influence on the difference: (a) Strain distribution with measurement noise;
(b) Effect on the time difference. Source: [Jare 18] (CC BY 4.0)

Figure 6.12: Different orientations of the three forming experiments of DP800-S245-
biaxial: (a) Exp. 1; (b) Exp. 2; (c) Exp. 3. Source: [Jare 18] (CC BY 4.0)

to occur in the weakest direction, this would correspond to occurrence perpendicular
to the rolling direction for steel and parallel to the rolling direction for aluminum
alloys. On the other hand, due to material anisotropy and material structure
defects, the crack can originate independently of the rolling direction. This has
no effect on the line-fit method, since it does not take into account the orientation
of the gradient, while the proposed method uses rotation-dependent HoG features.
Still, the specimen orientation and its placement seem important, as the gradient
develops with a specific orientation during forming. This behavior is depicted in
Figure 6.12, using the last valid stage of the forming sequence with emphasis on the
necking region. Two experiments have corresponding or slightly varying necking
orientations, whereas the third deviates significantly. However, the orientation
dependency is mitigated during training using augmentation with random rotations
and flipping of the data. Since only the homogeneous area is used for training, an
additional preprocessing step, such as pose normalization according to the last valid
frame, might address the orientation dependency for the biaxial loading condition.
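For square patches, the rotation and flipping augmentation mentioned above could be realized as an eightfold dihedral augmentation; this is an assumed sketch, not the training code of the thesis:

```python
import numpy as np

def augment(patch, rng):
    """Random 90-degree rotation plus optional horizontal flip of a square patch."""
    out = np.rot90(patch, k=int(rng.integers(0, 4)))
    if rng.integers(0, 2):
        out = np.flip(out, axis=1)
    return out

rng = np.random.default_rng(42)
patch = np.arange(16, dtype=float).reshape(4, 4)
augmented = [augment(patch, rng) for _ in range(8)]
```

Rotations and flips only permute pixels, so strain statistics of the patch are preserved while its orientation varies.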
The effect of both factors, measurement errors and specimen orientation, on the
probability progression and decision boundary is depicted in Figure 6.13. The first
forming experiment deviates from the other two repetitions and consequently
introduces outliers in the decision boundary (cf. Figure 6.13 (b)), hence influencing
the quantile-based determined strain values. The spontaneous failure of this
geometry, however, is still consistently captured at stage ≈76 for each repetition, as

Figure 6.13: Measurement noise, defect pixels or varying sample orientations may
affect the probability progression of S245 forming experiments: (a) Probability pro-
gression of DP800-S245 determined with HoG 180–10°; (b) Corresponding distribu-
tion with the decision boundary. Two of the three experiments coincide, whereas the
first experiment is not supported by the distribution. Source: [Jare 18] (CC BY 4.0)

illustrated by the steep increase in the probability progression curves of Figure 6.13
(a).

6.4 Conclusion
Previous studies have demonstrated that Nakajima-based forming processes produce
visible characteristics on the surface of sheet materials. Those patterns were identified
by experts within the video sequences and annotated into multiple failure categories
(cf. Chapter 5). Based on these annotations, a classifier was trained to automatically
identify these patterns on images of unseen video sequences belonging to one of three
different failure classes. This evaluation strategy, however, requires expert knowledge
and thus expensive data annotation to enable the training of a supervised classifi-
cation algorithm. Consequently, this study introduces an unsupervised classification
method capable of automatically detecting the onset of localized necking without the
necessity for annotated data. Two different evaluation methods were proposed based
on O-SVM confidence scores: (1) a deterministic FLC, which defines a lower and up-
per bound for the onset of localized necking; (2) a probabilistic FLC, which models
the forming process with two Gaussians and thus provides a probabilistic assessment
of the individual forming stages.
Despite the encouraging results of this study, the proposed method still has some
limitations: (1) it is feature dependent, which means that the procedure only works
if an increasing gradient is observable and if the material exhibits a growing necking
effect; (2) the rotation dependency of the biaxial S245 geometry, which might be
addressed by pose normalization or extended augmentation schemes; (3) anomaly
detection determines the point in time well, but the selection of the evaluation area
for determining the strain values used in the FLC is heuristic and should be chosen
automatically in relation to the geometry. In particular, for S100-S125, the evaluation
area appears extended and should therefore be adapted flexibly; (4) sample
preparation is critical, since measurement errors or defective pixels can interfere with
the result, although the quantile normalization of the time derivative of the video
sequences is used to mitigate the effect. In future work, DL techniques could be
employed to reduce the feature dependency, the lack of robustness caused by rotation
dependency in S245, and the susceptibility to measurement noise or defect pixels. In
addition, automatic segmentation of the evaluation area would be possible
(cf. Chapter 9). In spite of these limitations, the proposed method emphasizes the
potential of an unsupervised classification algorithm in the field of FLC determination.
Furthermore, it proposes the deterministic and probabilistic FLC, which determine the
onset of necking independently of the evaluation area. The latter, in particular, enables
the interpretation of the necking phase in terms of failure quantiles and provides new
opportunities for risk and process management.
CHAPTER 7

Unsupervised Determination of Forming Limits using Deep Learning

7.1 Method
7.2 Experiments
7.3 Results and Discussion

Chapter 6 proposed an unsupervised traditional classification approach while
incorporating a priori knowledge. With HoG, features were used that consider the
gradients or, more precisely, their orientation and magnitude, and therefore leverage
characteristics similar to those of the line-fit method. However, in contrast to the line-
fit method, a precise restriction of the evaluation area is unnecessary, whereas certain
other decisions had to be made in advance. The features to be evaluated needed to
be selected, and additionally, the evaluation area near the extreme value had to be
specified prior to evaluation. Another disadvantage to be mentioned is the temporal
dependence on the forming process, since without using the temporal information, it
would not have been possible to create the probabilistic FLC using the GMM.
This chapter addresses the mentioned problems and proposes an unsupervised deep
learning approach, so that the FLC can be determined without explicit selection
of the evaluation area and without time dependency. However, for various reasons,
it was not possible to determine consistent and reliable FLCs. Nevertheless, since the
findings form the basis of fundamental changes in the upcoming approach, which will
be introduced and explained in Chapter 8, the method is presented in the following
together with the outcomes and challenges.

7.1 Method

In general, the method follows the principle of the pattern recognition pipeline (cf.
Figure 3.1) and consists of two steps: (1) unsupervised, data-driven learning of com-
pact and representative features using DL; (2) unsupervised clustering of the encoded
forming sequences, using their low-dimensional representatives to determine the
actual status of the forming process, which is associated with one of the following
three conditions: homogeneous, transition and necking. The features to be used in the
classification step are automatically learned from the data by means of an autoencoder


(cf. Figure 3.21) rather than using predefined features [Beng 09]. As a result of the
encoder-decoder structure, single images of the video sequences are compressed and
then optimally reconstructed. Consequently, the autoencoder takes an input image
$x^I$ and maps it to a latent representation in the bottleneck layer using a mapping
function $x_b = f_{\theta_e}(x^I)$, parameterized by the parameters $\theta_e$. The decoder function
reverts this mapping and generates reconstructions of the same dimension as the in-
put image with a new mapping function $\hat{x}^I = f_{\theta_d}(x_b)$, parameterized by $\theta_d$. The
parameters of the model are then derived by minimizing the average reconstruction
error:

$$\hat{\theta}_e, \hat{\theta}_d = \operatorname*{arg\,min}_{\theta_e, \theta_d} \frac{1}{n} \sum_{i=1}^{n} L_e\!\left(x_i^I, \hat{x}_i^I\right) \tag{7.1}$$

where $L_e$ is the squared error $L_e(x^I, \hat{x}^I) = \lVert x^I - \hat{x}^I \rVert^2$. Ideally, this approach will
result in representative features at the bottleneck layer that capture the most impor-
tant characteristics of the images [Vinc 08].
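The minimization of Eq. (7.1) can be illustrated with a toy linear autoencoder trained by plain gradient descent in numpy. This is a didactic stand-in for the convolutional architecture used in this chapter; all dimensions, the learning rate, and the data are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 2))
G = rng.normal(scale=0.5, size=(2, 6))
X = np.hstack([Z, Z @ G])                 # data close to a 2-D subspace of R^8

W_e = rng.normal(scale=0.1, size=(8, 2))  # encoder:  x_b = x W_e
W_d = rng.normal(scale=0.1, size=(2, 8))  # decoder:  reconstruction = x_b W_d

def recon_loss(W_e, W_d):
    """Average squared reconstruction error, i.e. the objective of Eq. (7.1)."""
    X_hat = X @ W_e @ W_d
    return float(np.mean(np.sum((X - X_hat) ** 2, axis=1)))

loss_before = recon_loss(W_e, W_d)
lr = 0.02
for _ in range(1000):                     # plain gradient descent on Eq. (7.1)
    X_b = X @ W_e
    err = X_b @ W_d - X
    W_d -= lr * X_b.T @ err / len(X)
    W_e -= lr * X.T @ (err @ W_d.T) / len(X)

loss_after = recon_loss(W_e, W_d)
```

Since the data is (near) rank two and the bottleneck is 2-D, the reconstruction error can be driven close to zero, mirroring the intended behavior of the bottleneck features.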
In the second step, the forming sequences are encoded into their low-dimensional
representations by means of the encoder part of the autoencoder. Additionally, Prin-
cipal Component Analysis (PCA) is applied to further reduce the dimensionality of
the representations in order to facilitate the clustering procedure. Here, the number
of components is selected in such a way that more than 95% of the variance of
the data is covered, which yields two components. Subsequently, the data is clus-
tered using the introduced GMM. This enables a probabilistic assignment of the cluster
membership, or failure stage, to each individual frame of the sequences. Both steps of
the approach use disjoint datasets: two out of three sequences are used
to train the features and for clustering, whereas the third sequence is used to test the
hypothesis and assess the generalization.
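The second step can be sketched with scikit-learn; the synthetic 32-D "bottleneck codes" and the three well-separated phases below are stand-ins for real encoded sequences:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Synthetic stand-ins for 32-D bottleneck codes of 240 frames in three phases.
centers = np.array([[0.0, 0.0], [3.0, 0.0], [6.0, 0.0]])
latent = np.vstack([rng.normal(c, 0.3, size=(80, 2)) for c in centers])
codes = latent @ rng.normal(size=(2, 32))

pca = PCA(n_components=0.95).fit(codes)       # keep >= 95% of the variance
low = pca.transform(codes)
gmm = GaussianMixture(n_components=3, random_state=0).fit(low)
stage_prob = gmm.predict_proba(low)           # per-frame membership probabilities
labels = gmm.predict(low)
```

Passing a fraction to `n_components` makes PCA select the smallest number of components reaching that explained variance, matching the 95% criterion above.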
An overview of the complete processing pipeline is visualized in Figure 7.1 for a
uniaxial loading condition of AA6014-S050. Two sequences comprise the training
data, whereas the third sequence serves as test data. As visualized, the individual
samples of the three sequences coincide in the PCA visualization, so that it is possible
to derive well-defined clusters using the GMM (cf. Figure 7.1 (b)). Consequently, each
frame of the test set is assigned to a specific cluster, such that the failure state of the
test sequence can be visualized for the whole forming sequence (cf. Figure 7.1 (c)).
The autoencoder structure is inspired by the VGG16 architecture (cf. Figure 3.20).
The encoder consists of the first three convolutional blocks of the VGG16 architec-
ture, followed by two fully connected layers of size 512. The bottleneck is again a
fully connected layer of size 256. The decoder has the same structure as the
encoder, except that the order is inverted. All layers use leaky ReLU activations.
Prior to the feature learning and clustering part of the proposed method, the data is
preprocessed as in Chapter 6, whereby in addition to percentile normalization, a stan-
dardization procedure is applied to the individual images of each forming sequence,
such that each image is zero-centered and possesses a standard deviation of one.
Furthermore, the time derivative, i.e., the difference of two successive frames, is
used for evaluation, similar to the proposed method in Chapter 6.
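An assumed implementation of this preprocessing, per-image standardization followed by temporal differencing, could look as follows:

```python
import numpy as np

def preprocess(sequence):
    """(T, H, W) strain maps -> (T-1, H, W) differences of standardized images."""
    seq = np.asarray(sequence, dtype=float)
    mean = seq.mean(axis=(1, 2), keepdims=True)   # per-image mean
    std = seq.std(axis=(1, 2), keepdims=True)     # per-image standard deviation
    standardized = (seq - mean) / np.where(std > 0, std, 1.0)
    return np.diff(standardized, axis=0)          # time derivative of the sequence

rng = np.random.default_rng(7)
toy_sequence = rng.random((5, 16, 16))
diffs = preprocess(toy_sequence)
```

Because every standardized frame has zero mean, each difference image is zero-mean as well, leaving only the relative change between successive frames.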

Figure 7.1: Exemplary procedure of the detection pipeline based on AA6014-S050:
(a) Combined first two PCA components of the three forming sequences of AA6014-
S050. (b) Same data with color-coded cluster memberships. (c) Cluster membership
progression for the test sequence, with respect to the two training sequences.

7.2 Experiments

A LOSO-CV scheme is employed to evaluate the methodology. This means that
two sequences per geometry and material are used as the training dataset, while each
held-out sequence serves as the test dataset once. The limitation of the data to one
geometry and one material is enforced to reduce complexity and avoid possible side
effects that may be introduced by other geometries or materials and deteriorate the
autoencoder during feature learning. In order to further facilitate the learning
process and to remove the dependency on the yet to be determined localization area,
images are center-cropped and limited to 72×72 px, while ascertaining that the necking
effect is included in all geometries and sequences of each material. Similar to the
previous chapter, forming sequences are restricted to the last 4 mm of the forming
process, so that the number of frames per material is determined by the sampling
frequencies (cf. Table 4.2).
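The LOSO-CV folds described above amount to a simple hold-out loop; the sequence names are placeholders:

```python
# Three Nakajima repetitions per material/geometry (placeholder names).
sequences = ["Exp1", "Exp2", "Exp3"]

folds = []
for i, held_out in enumerate(sequences):
    # Each sequence is the test set exactly once; the other two are used
    # for feature learning and clustering.
    train = [s for j, s in enumerate(sequences) if j != i]
    folds.append((train, held_out))
```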


Figure 7.2: Original input image with its reconstruction and cross-sections: (a)
Original image. (b) Corresponding cross-section. (c) Reconstructed image of the
decoder. (d) Corresponding cross-section.

Keras, a high-level API of the TensorFlow framework [Abad 16], is used for the imple-
mentation of the autoencoder architecture. The experiment employs the Adam op-
timizer [King 14] for loss minimization with an initial learning rate of 0.0001. Adam
has been demonstrated to improve the convergence of CNN training, owing to its
ability to estimate adaptive learning rates based on the observed data.

7.3 Results and Discussion

Altogether, the autoencoder generates reconstructions that look comparable to the
original input image and contain the most important characteristics. A comparison
between the input image and its reconstruction for AA6014-S050, a uniaxial loading
condition, is visualized in Figure 7.2, together with their cross-sections. The images
correspond to an already necked specimen, particularly evident from the localization
of strain values in the center of the image. Considering the cross-sections, it becomes
apparent that irregularly oscillating values occur, especially beside the necking effect.
In comparison to this, it is first of all remarkable that the reconstruction replicates
the characteristics of the original image correctly. A detailed inspection of the cross-
sections furthermore reveals that the increase in strain values is reproduced in
accordance with the original image, even though the magnitude is not exactly met.
The most significant difference between the cross-sections of the original image and
the reconstruction can be found in the less noisy side area, which does not oscillate
in the reconstruction but is constantly decreasing. Based on the

Figure 7.3: Failure examples of the GMM clustering: (a) AA6014-S070 failure
cases with partly consistent clusters. (b) AA6014-S100 failure cases with meaning-
less clusters, since the individual sequences end at different points in feature space.
The arrows highlight the ideal end of sequences for both cases that would lead to
reasonable clusters.

quality of the reconstruction and the similarity of the cross-sections, it is inferred that
the autoencoder learns representative features. Thus, the features of the bottleneck
layer can be utilized for further processing by means of clustering operations.
In a variety of experiments across different geometries, materials and loading condi-
tions, it was possible to obtain consistent cluster results similar to the ones derived
and visualized in Figure 7.1. As a result, it was possible in these experiments to
assign the cluster membership to each frame of the forming sequence accordingly (cf.
Figure 7.1), which would in principle enable the generation of the probabilistic FLC.
However, for all materials there existed sequences of certain geometries for which it
was not possible to derive consistent cluster results, and consequently, it was not pos-
sible to generate reliable FLCs. Two examples of failed experiments are illustrated
in Figure 7.3 for AA6014-S070 and AA6014-S110. As visualized by Figure 7.3 (a),
it is almost possible to obtain consistent clusters. However, it is evident that one
experiment differs from the other two. While the features of the three sequences still
overlap at the beginning of the forming process, they deviate considerably from each
other towards the end of the forming procedure. Ideally, as emphasized by the arrow,
the ends of the sequences should coincide, since the characteristic or class that is de-
scribed by the images is the same. Consequently, since the sequences do not coincide
and appear somewhat rotated, or at least not consistent with each other, in the necking
phase, the distances used in the clustering procedure will lead to unreliable cluster
affiliations. Figure 7.3 (b) demonstrates an even worse example of this effect. One
of the three sequences deviates so drastically from the other two that it
generates its own cluster and consequently leads to completely meaningless clus-
ters. Again, as highlighted by the arrow, the ends of the sequences, or ideally the
complete forming sequences, should coincide to derive reliable cluster results. This
unexpected behavior is most likely induced in the feature learning part of the
processing pipeline, although the sequences appear to be similar and no significant
differences are apparent, at least in a variety of the failed experiments.
Consequently, even though the
network is able to reconstruct the images, or at least the important characteristics,
based on the features derived in the bottleneck layer, some regularization seems to be
missing that would remove the divergent behavior and lead to consistent and com-
parable features throughout all sequences.
Different approaches have been explored to resolve this shortcoming, such as the
incorporation of l1 or l2 regularization terms into the loss function. Furthermore,
structural changes in the architecture were made, e. g. replacing the fully connected
layers with convolutional layers. Additionally, changes to the experimental setup
were investigated, so that either multiple geometries of the same material or even
other materials were included during training of the features. Overall, none of the
attempts was successful, since the network needs to be regularized in such a way that
similar instances are located close to each other in the feature space, which seems to
be difficult without employing class labels. Even though the approach was not en-
tirely successful, the derived insights are considered in the subsequent methodologies
(cf. Chapter 8 and Chapter 9).
However, as especially emphasized by Figure 7.3, the beginnings of all sequences co-
incide, which is highlighted by the accumulation of instances on the left side of the
image (red samples). Consequently, the learned features of the homogeneous forming
phase can be used to substitute the HoG features of the previously presented approach
(cf. Chapter 6). This is visualized in Figure 7.4 and illustrates the exact same be-
havior as the confidence scores in Figure 6.2 (b) and Figure 6.3. In contrast to the
previous approach, however, the characteristics are not explicitly selected. Instead,
they are learned automatically, so that they potentially capture the characteristics of
the material during forming more effectively. However, this aspect was not further
investigated, as the objective of this methodology focused on reducing the time and
location dependency.


Figure 7.4: Confidence scores using the features of the autoencoder instead of HoG
features, while the evaluation followed the method of Chapter 6. Similarly, the onset
of necking will start in the elbow region of the confidence scores.
CHAPTER 8

Weakly Supervised
Determination of Forming
Limits using Deep Learning
8.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
8.2 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
8.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
8.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

In Chapter 7, an unsupervised classification approach using DL was pursued to
learn optimal material- and geometry-dependent features that ultimately generalize to
unseen new data. Together with a clustering algorithm, the aim was to derive the FLC
by realizing a completely unsupervised approach without the need for a priori knowl-
edge. Consequently, the purpose was to remove the time and feature dependency
of the conventional unsupervised method as introduced in Chapter 6. However, the
network could not be regularized such that features of individual frames of forming
sequences from the same geometry are consistently found close to each other in the
latent space. This property is mandatory for unsupervised cluster algorithms since
they employ distance metrics to assess the similarity of data points. To enforce the
property that similar instances of different forming sequences, with respect to their
forming status, are close to each other in the feature space, the proposed method com-
bines supervised and unsupervised pattern recognition techniques to determine the
FLC and was published as a first-author publication [Jare 19]. In contrast to the preceding chapters, the AA5182 material characterized by PLC effects (cf. Section 4.1) was
added to the dataset, to further evaluate the generalization to differently behaving
materials.

8.1 Method
The supervised part uses the extreme cases of the homogeneous and inhomogeneous
forming phase, which corresponds to a few images from the beginning and end of each
forming sequence. The actual number of available images and class labels per mate-
rial are found in Table 8.1, where # images indicates the overall amount of images per
forming sequence of the respective material, # homog. the number of images of the

117
118 Chapter 8. Weakly Supervised Determination of the FLC – Deep Learning

Table 8.1: Image database per material.

Material (thickness mm)    # images    # homog.    # neck.
DX54D (2.00)                  80          20          5
AA6014 (1.00)                 55          15          3
DP800 (1.00)                 160          20          3
DX54D (0.75)                 160          20          5
AA5182 (1.00)                160          30          3

homogeneous forming phase and # neck. the number of images of the inhomogeneous
forming phase. The remaining process parameters and details of the material proper-
ties together with the available geometries were already introduced in Section 4.1 and
are found in Table 4.1 and Table 4.2. Only these extreme cases, i.e. the first
20 frames from the certainly homogeneous phase and the last three images from the certainly
inhomogeneous or pre-cracking phase, are used to train two identical CNNs. These
networks share the same weights and hence can be used to assess the similarity of
images and to take into account the difficulty of reliably assigning individual images
of a sequence. Consequently, images guaranteed to belong to either the homogeneous
or inhomogeneous class were used to extract features by employing a Siamese CNN.
This type of CNN enables pairwise comparisons between images of both classes using
a similarity metric in a low-dimensional feature space. As a result, if two instances
belong to the same class, the similarity metric would return low values and higher
values otherwise. For this reason, the supervised classification setup separates the
two classes optimally, while simultaneously learning compact representations of both
classes.
While the pairwise comparisons are necessary to create a well-defined feature space,
the unsupervised part departs from this concept. Herein, the complete forming sequences,
consisting of the extreme cases as well as the unseen data between the beginning
of the homogeneous and the end of the inhomogeneous phase, are assessed using
the trained network to create their low-dimensional representations. By means of a
clustering approach, these representations are then assigned to one of the three clus-
ters corresponding to the current forming phase: homogeneous, transition, necking.
For clarification, the datasets are disjoint throughout the experiments unless stated
otherwise. Consequently, the extreme cases of all but one material are used during
training and validation of the network, whereas the actual evaluation and clustering
is carried out using the held-out material.
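The disjoint split described above can be sketched as follows. This is an illustrative reconstruction, not the original experiment code; the material names and sequence containers are placeholders.

```python
# Illustrative sketch of the leave-one-material-out split described above;
# the sequence containers are placeholders, not the original dataset.
def lomo_split(sequences_by_material, held_out):
    """Split sequences into a training/validation pool and a held-out test set."""
    train_val = [seq for mat, seqs in sequences_by_material.items()
                 if mat != held_out for seq in seqs]
    test = list(sequences_by_material[held_out])
    return train_val, test
```

During training, only the labeled extreme cases of the pooled materials are used, whereas the complete sequences of the held-out material are later clustered.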

8.1.1 Preprocessing
Overall, the study employs the same preprocessing method as introduced in Chap-
ter 6 to mitigate the influence of defect pixels. Again, the difference between two
successive images is exploited to remove the correlation with the punch displacement
and to emphasize the incremental changes. Additionally, the frames that contain
the fracture information, identified by an increasing amount of defect pixels, were

Figure 8.1: Varying image dimensions with respect to the forming progression.
The center cropped evaluation areas of DX54D and DP800 illustrate the extreme
cases of the forming process: (a) DX54D-S245 homogeneous phase. (b) DX54D-S245
localization phase. (c) DP800-S245 localization phase. Source: [Jare 19] (CC BY 4.0)

removed from the sequences. Each image of the dataset was normalized using its
0.5% and 99.5% percentile of the intensities and standardized afterwards. In order
to facilitate the procedure of the experiments, the data was center cropped with a
square area with a side length of 72 px. This was necessary to address the varying
resolutions of the sequences with respect to their specimen geometries and to support
the interchangeability of materials during training of the network. The S245 geome-
try of the ductile DX54D material was used to determine the largest possible size of
the inner rectangle, as this type of material can be deformed the most. The selection
of the correct size is critical, as in particular, the image content of the S245 geom-
etry changes its size during forming. This process is visualized in Figure 8.1. The
perturbing dark homogeneous regions near the borders must be avoided to prevent
the network from directly evaluating the shape of the border regions. These depict a
strong indicator of the necking state and would bias the network. Consequently, the
network is forced to identify and learn the localized necking characteristics based on
the material changes in terms of structures and intensities that are included within
each patch.
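A minimal sketch of this preprocessing chain (difference images, robust percentile normalization, standardization, and a 72 px center crop) could look as follows; array shapes and the function name are illustrative assumptions, not the original implementation.

```python
import numpy as np

def preprocess_sequence(frames, crop=72):
    """frames: (T, H, W) gray-scale sequence; returns (T-1, crop, crop)."""
    diffs = np.diff(frames.astype(np.float64), axis=0)   # incremental changes only
    out = []
    for img in diffs:
        lo, hi = np.percentile(img, [0.5, 99.5])         # robust intensity bounds
        img = np.clip(img, lo, hi)
        img = (img - img.mean()) / (img.std() + 1e-8)    # standardization
        top = (img.shape[0] - crop) // 2                 # center crop avoids the
        left = (img.shape[1] - crop) // 2                # perturbing border regions
        out.append(img[top:top + crop, left:left + crop])
    return np.stack(out)
```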

8.1.2 Feature Extraction


Two identical sub-networks, each consisting of the first three convolutional blocks of
a VGG16 network [Simo 14], are combined in order to form a Siamese architecture
[Chop 05]. As an alternative to training the networks from scratch with the limited
amount of data, the method uses VGG16 networks pre-trained on
the large-scale ImageNet database [Deng 09]. Furthermore, to improve generalization
and to adapt to new data, one dropout layer (0.5 dropout rate) between the two
fully connected layers (512, 256 neurons) followed by one L2 normalization layer was
added. The Siamese structure of the two CNNs is visualized in Figure 8.2 and denoted
by feature learning, since these are used to derive the low-dimensional representations
of the input data by pairwise evaluation of the images. Simultaneously, two inputs
(x1 , x2 ) are evaluated via an identical network structure (Gθ ) with shared weights
(θ). As a result, this structure enables pairwise comparisons between the two L2
normalized low-dimensional outputs of the network (Gθ (x1 ), Gθ (x2 )) since different
input images are evaluated identically based on an appropriate distance metric.
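The weight sharing can be illustrated framework-agnostically: a single parameterized map Gθ, here a placeholder linear layer standing in for the truncated VGG16 with its dense layers, is applied to both inputs, and the L2-normalized outputs are compared with the Euclidean distance. Sizes and the random map are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 784))        # shared weights theta (illustrative sizes)

def G(x):
    """Placeholder for the sub-network G_theta; both inputs use the same W."""
    z = W @ x.ravel()
    return z / np.linalg.norm(z)           # final L2 normalization layer

def E(x1, x2):
    """Parameterized distance E_theta of Eq. (8.1)."""
    return float(np.linalg.norm(G(x1) - G(x2)))
```

Because the embeddings lie on the unit sphere, E is bounded by 2, which makes a fixed margin such as m = 1 in the contrastive loss meaningful.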

[Diagram: during feature learning, two inputs x1, x2 pass through weight-sharing convolutional networks Gθ (VGG16) and are compared via Eθ = ||Gθ(x1) − Gθ(x2)||2; during clustering, a test input xT passes through the same network Gθ, followed by PCA + SMM to obtain the cluster membership.]
Figure 8.2: Illustration of the supervised feature learning and unsupervised clus-
tering phase, adapted from [Chop 05].

8.1.3 Supervised Siamese Optimization


Unlike other loss functions that sum over the individual input samples, this loss
function evaluates the performance between pairs of inputs x1 , x2 , while learning
the parameterized distance function Eθ based on the Euclidean distance between the
low-dimensional outputs of Gθ .

Eθ (x1 , x2 ) = ||Gθ (x1 ) − Gθ (x2 )||2 (8.1)

The overall loss function is composed of two losses that penalize the model differently,
whether the pairs of examples are from the similar (LS ) or different (LD ) class, and
is described as

L(θ, y, x1, x2) = (1 − y) LS(Eθ(x1, x2)) + y LD(Eθ(x1, x2)) (8.2)

where (y, x1 , x2 ) denotes a labeled input sample pair. The label y refers to the
extreme cases of the input sequence, where y = 0 if the samples are from the same
class and y = 1 otherwise. The final loss function, the contrastive loss [Hads 06], is
denoted as
L(θ, y, x1, x2) = (1 − y) · ½ (Eθ(x1, x2))² + y · ½ max(0, m − Eθ(x1, x2))² (8.3)
where m is a margin parameter that specifies a radius around Gθ (x), the threshold
distance for dissimilar pairs. This parameter cannot be optimized due to the incom-
pleteness of the data, since only extreme cases are used for training and thus is set
to 1.0 naively. Overall, the individual losses are designed such that minimizing the
contrastive loss encourages the model to output feature vectors that are more similar
for low values of Eθ , whereas if the classes are different, high values of Eθ will lead to
less similar feature vectors. Consequently, the data is separated in the latent space by
reducing the intra-class variations and maximizing the inter-class distances, without
directly enforcing this condition.
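Eq. (8.3) translates directly into code; this is a plain transcription for a single pair, with the margin fixed to m = 1 as in the text.

```python
def contrastive_loss(y, e, m=1.0):
    """Contrastive loss of Eq. (8.3): y = 0 for similar pairs, y = 1 for
    dissimilar pairs, e = E_theta(x1, x2)."""
    return (1 - y) * 0.5 * e ** 2 + y * 0.5 * max(0.0, m - e) ** 2
```

Similar pairs are penalized quadratically with their distance, whereas dissimilar pairs contribute only while their distance is still below the margin.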

8.1.4 Clustering using Student's t Mixture Models


So far, the network has learned to create low-dimensional feature vectors that provide
optimal separation of the extreme cases of the forming phases. Although complete
sequences have never been made available to the network, discriminative features are
derived for complete forming sequences, which may be used to assess and cluster the individual frames of video sequences. In order to facilitate the unsupervised
clustering procedure, PCA is applied to further reduce the dimensionality of the
manifolds. The number of components is chosen such that more than 95% of the
variance of the data is covered, which led to two components. During training of
the network, generalized feature representations were learned by combining varying
geometries and different materials to increase the amount and variance of the data.
However, in contrast to the training procedure, the individual geometries and ma-
terials are not combined during the assessment of the held-out data. Consequently,
the available information and amount of instances that can be used to cluster the
data is reduced, which may result in sparsely populated regions. In particular, the
transition phase from homogeneous to localized necking is rather short, and hence
only a limited number of samples is available in this period, which is exacerbated by the
low sampling frequency. In order to compensate for this limitation, the sample
density is increased by artificially augmenting the data. This means that each image
of the held-out test set is randomly cropped, rotated (by 15 deg.), translated (by 5
px) and flipped (horizontally and vertically), with its effects on the feature space well
described and visualized in [Hads 06]. Within the evaluation, the loading conditions
and hence geometries of the held-out material are assessed separately. Therefore, for
each geometry, three sequences defined as XT are converted by the network into their
low-dimensional representations. Following on this, after applying the PCA dimen-
sionality reduction, the data is clustered using SMM (cf. Subsection 3.3.5) to obtain
the distributions. Similar to the GMM in Chapter 6, the forming sequences can now
be described by probabilities, with the difference that the time information is not
necessary. Each frame of the forming sequences, more precisely the center cropped
unaltered part of the image, is now depicted by a probability that represents the
membership likelihood to the clusters of the mixture model. At the same time, these
clusters represent distinct phases of the forming process (homogeneous, transition
and necking), so that the individual frames of sequences can be classified into the
corresponding ones.
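The PCA step, with the number of components chosen to cover more than 95% of the variance, can be sketched via the SVD. This is a generic implementation for illustration, not the original pipeline code.

```python
import numpy as np

def pca_reduce(X, var_threshold=0.95):
    """Project X (n_samples, n_features) onto the fewest principal components
    whose cumulative explained variance exceeds var_threshold."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratios = S ** 2 / np.sum(S ** 2)                    # explained variance ratios
    k = int(np.searchsorted(np.cumsum(ratios), var_threshold)) + 1
    return Xc @ Vt[:k].T, k
```

For the feature distributions at hand this criterion yields two components, matching the two-dimensional visualizations in Figure 8.3.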
The complete unsupervised part of the pipeline is illustrated in Figure 8.3. The
unclustered PCA reduced features of the center cropped frames are depicted in
Figure 8.3 (a) together with the augmented data of three forming experiments of
AA6014-S070. In Figure 8.3 (b), this data is clustered with SMM, whereas Fig-
ure 8.3 (c) visualizes the individual probability progression of each forming sequence
with respect to its mixture component membership.

8.2 Experiments
Several experiments are carried out to investigate whether the method can generalize
and infer the learned characteristics from one material class to another. Further-


Figure 8.3: Detection pipeline exemplarily visualized for AA6014-S070: (a) Com-
bined presentation of the three forming experiments of AA6014-S070 with the two
main PCA components. (b) Color-coded cluster memberships of the identical data.
(c) The temporal progression of cluster memberships for the three forming sequences
and their average. Source: [Jare 19] (CC BY 4.0)

more, forming sequences recorded prior to metallographic examinations are validated by categorizing their individual frames as belonging to one of the defined forming
phases. Since the sequences are center cropped, no prior knowledge of the localiza-
tion region is required, unlike previous methods.

Leave-One-Material-Out Cross-Validation (LOMO-CV): This experiment considers all materials except AA5182, since its behavior and appearance deviate significantly from the other materials (cf. Section 4.1) and would deteriorate the training of the network. An example of the different forming behaviors is visualized in
Figure 8.4. Especially, the localization effects in the images of the homogeneous
forming phase of AA5182-S050 appear comparable with the necking effects of the
inhomogeneous forming phase of DP800-S050. Overall, this experiment examines the
generalization and transferability of knowledge to unseen data. One material is held
out and evaluated after the network was trained using all geometries and sequences of
the remaining materials. Consequently, two-thirds of this data is used as the training
set, while the remaining data serves as validation set. The FLC for the held-out

Figure 8.4: Comparison of samples from the homogeneous and localization phase
of DP800-S050 and AA5182-S050: (a) DP800 homogeneous phase. (b) DP800 lo-
calization phase. (c) AA5182 homogeneous phase. (d) AA5182 localization phase.
Source: [Jare 19] (CC BY 4.0)

material is generated by processing its sequences with the network to derive the low
dimensional manifolds. Subsequently, the unsupervised part of the proposed method
is used to evaluate the geometries separately by determining the cluster memberships
for each forming sequence. Consequently, the cross-sections of the probability pro-
gression curves (50% probability) serve as lookup time points for the actual strain
values.
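The lookup of the 50% cross-section can be sketched as a linear interpolation between frames; this is an illustrative reconstruction of the described lookup, not the original evaluation code.

```python
def crossing_time(probs, level=0.5):
    """Return the interpolated frame index at which the membership
    probability first rises through `level`, or None if it never does."""
    for i in range(1, len(probs)):
        p0, p1 = probs[i - 1], probs[i]
        if p0 < level <= p1:
            return (i - 1) + (level - p0) / (p1 - p0)   # linear interpolation
    return None
```

The returned (fractional) frame index serves as the lookup time point at which the major and minor strain values for the FLC candidate are read off.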

Leave-One-Sequence-Out Cross-Validation (LOSO-CV): The data used to train the network in this experiment is restricted to one geometry and material.
Hence, features are derived, which will be characteristic for this particular loading
condition and material. Two thirds of the available data per geometry are used for
training and validation of the network, while the remaining sequence serves as test
set. As a result, it is possible to assess the loading conditions separately and at the
same time evaluate the generalization to unseen, comparable data.

Over-fit to Sequences: A theoretical upper bound of the FLC is defined by this experiment since all three video sequences per geometry are used for training of the
network. Of course, no generalization can be assessed or expected by this setup.
Usually, an overfit to the training data is discouraged in pattern recognition ap-
proaches, since the learned separation hypothesis would not generalize to unseen
data. Nonetheless, the intentional overfit to the data seems valid in the application
of defining optimal forming limits for the generation of the FLC based on limited
data.

Metallography: This experiment evaluates the spatial and temporal independence, as it assesses video sequences of forming processes that were intentionally stopped

prior to the onset of localized necking. Hence, the presence of material fatigue can be
investigated by means of metallographic examinations, while their forming sequences
can be assessed with the proposed method. Consequently, the unsupervised approach
is used to validate the examination results. Additionally, it is ascertained whether it
is possible to assign frames without a well-defined localization region to failure clus-
ters. Such an assessment is not feasible using existing techniques since all of them
require a prior definition of the localization area. As a side effect, this approach would
render evaluations of strain paths unnecessary, since these vary anyway depending
on the defined evaluation area. The whole evaluation is again carried out using the
held-out material.

Interpretation of Weakly Supervised Network Activations: So far, only the unsupervised cluster results have been considered to generate the FLC or strain paths.
However, the learned manifolds comprise non-linear combinations of activations of
the different network layers, which can be associated with varying structures on the
individual frames and thus with the different cluster memberships. Consequently, it
is possible to determine which image areas are important for the respective points
in time, at least from the perspective of the network, which ultimately lead to the
class affiliations. Therefore, dependent on the geometry and test sequences, the ex-
ternally estimated SMM and calculated PCA components are attached to the feature
learning part of the network. Since the network cannot be trained end-to-end, this
workaround enables back-propagation of the gradient and thus the derivation of the
class-dependent activations via Grad-CAM [Selv 17]. These activations highlight the
areas on the images that have contributed the most to the classification result and
thus provide a qualitative visual interpretation of the current state of the forming
process.
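The Grad-CAM weighting [Selv 17] reduces to a few lines once the activations of a convolutional layer and their gradients with respect to the class score are available; both arrays are placeholders below, since obtaining them requires the attached network.

```python
import numpy as np

def grad_cam_map(activations, gradients):
    """activations, gradients: (K, H, W) arrays for one convolutional layer."""
    weights = gradients.mean(axis=(1, 2))            # global average pooling
    cam = np.tensordot(weights, activations, axes=1) # weighted activation sum
    cam = np.maximum(cam, 0.0)                       # ReLU: keep positive evidence
    return cam / cam.max() if cam.max() > 0 else cam # normalize for visualization
```

The resulting map is upsampled to the input resolution and overlaid on the frame to highlight the decisive image regions.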

Interpretation of Supervised Network Activations: The cluster results of the weakly supervised approach can optionally be used as labels in a supervised classification method. As a consequence, it is possible to infer class affiliations by means
of the individual frames of the video sequences instead of relying on the Euclidean
distances. It is thus possible to consider all frames of a sequence in
the training process, so that the homogeneous, the transition and the inhomogeneous
forming phase can be explicitly separated from each other. This may potentially
lead to differences in the activation functions of the individual classes as a result of
the consideration of the transition phase. This means that instead of a two-class
problem, all three classes are considered during learning. In order to guarantee com-
parability with the weakly supervised results, the same architecture is used to train
the supervised classification network while employing the same LOMO-CV experi-
mental approach.

The FLCs are generated for all experiments by using the intersection between the dif-
ferent forming phases as lookup time points for the strain values. These intersections
refer to the 50% transition probability between the clusters as visualized in Figure 8.3
(c). Keras, a high-level API of the TensorFlow framework [Abad 16], is used for the
implementation of the network architecture. All experiments employed the Adam optimizer [King 14] for contrastive loss minimization. This optimizer has been shown to improve the convergence of CNN training, owing to its ability to estimate
adaptive learning rates based on the observed data. However, the learning rates to
train the network were initialized differently dependent on the underlying experiment
and hence the size of the dataset. LOSO-CV uses a learning rate of 0.0001, whereas
the LOMO-CV and overfit experiment employ a learning rate of 0.00001.
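For illustration, a single Adam update [King 14] in plain numpy; the hyperparameters are the common defaults from the paper, and the function is a sketch rather than the framework implementation actually used.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-4, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: moment estimates yield per-parameter step sizes."""
    m = b1 * m + (1 - b1) * grad            # biased first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2       # biased second-moment estimate
    m_hat = m / (1 - b1 ** t)               # bias correction (t counts from 1)
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

The default learning rate `lr` would be set to 0.0001 (LOSO-CV) or 0.00001 (LOMO-CV, overfit) as stated above.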

8.3 Results
8.3.1 Comparison of Experimental Results
The FLC candidates corresponding to the onset of the transition and localization
phase of DP800 are visualized in Figure 8.5. The results of the different experimental
approaches coincide when comparing the localized necking FLC candidates (loc.).
This is comprehensible, as necking develops similarly throughout all materials except
AA5182. Consequently, the same behavior is learned by the network independent
of the experimental setup being used for training of the network, while especially
consistent manifolds are expected if only one material and one geometry is provided
as in the LOSO-CV and overfit experiment. In the case of LOMO-CV, however, the
network has never seen a sequence of this material, but is still able to determine the
beginning of the localized necking with almost perfect agreement to the other experi-
ments. Consequently, this proves that the materials exhibit similar localized necking
behavior in terms of their image characteristics and it is therefore possible for the
network to learn and infer the onset of localized necking from the remaining materials
(AA6014, DX54D). When comparing the transition phase FLC candidates (trans.),
a slightly different picture emerges. Herein, the LOSO-CV and overfit experiment
nearly perfectly coincide, whereas the LOMO-CV experiment proposes a less conser-
vative transition phase FLC. One explanation for this rather large difference could be
the reduced amount of data available for the LOSO-CV and overfit experiments. As
a consequence, the variation of the samples describing the homogeneous and necking
class decreases, which may result in a degraded separation of the classes.
Another possible explanation for this phenomenon is the presence of dissimilar or slightly different forming sequences within the same geometry, which prevent the proposed method from generalizing to the held-out sequence. This is exemplarily visualized in Figure 8.6
for DX54D-S060 (0.75mm), where Figure 8.6 (a) depicts the result of the LOMO-
CV experiment. Especially when taking into consideration the average of the three
transition curves of the held-out sequences, it is obvious that the transition phase is
shorter in comparison to the one derived by the LOSO-CV experiment as visualized
in Figure 8.6 (b). Here, the solution is biased towards the training sequences, since
the curves of the held-out sequence overall seem displaced in comparison to the av-
erage curves as derived from the training sequences.
This deviation of single sequences can be especially observed when artifacts, as
visualized in Figure 8.4, are not consistently present throughout all datasets. On
the one hand, these artifacts affect the standardization procedure, while on the other
hand, aside from image artifacts, the localization behavior is simply very different from the
training dataset, which might occur occasionally. However, increasing the number of


Figure 8.5: FLC candidates of DP800 (1.00 mm). (a) localization phase. (b)
transition phase. Source: [Jare 19] (CC BY 4.0)

forming sequences per geometry that are used to train the network would attenuate
these effects. So far, three repetitions for one geometry per material do not seem
sufficient to create a representative dataset, at least not when image artifacts are en-
countered simultaneously as these impede the training of the network. The dataset in
the case of the LOMO-CV experiment consists of different geometries, materials and
more sequences. Consequently, it contains more variation and the network is able to
learn more robust or generalized features, which in turn leads to a less conservative
transition candidate. However, the outcome of the LOMO-CV approach may also
be deteriorated by the effect of artifacts, if the sequences of the held-out material
contain artifacts that are significantly different from those in the training set. So far,
the results indicate that it is advantageous to pursue a LOMO-CV approach rather
than performing a LOSO-CV setup. Owing to the large differences between AA5182
and the other materials, only a limited amount of data is available for this material
and therefore only a LOSO-CV and overfit experiment can be considered for its eval-
uation.
As visualized in Figure 8.7, the proposed method is able to generalize to more complex
materials, while both experimental setups again nearly perfectly coincide. Nonethe-
less, especially for this material, it additionally might be beneficial to increase the
evaluation area. As depicted by Figure 8.4 (c), a large rectangular evaluation area
would cover the provided image information up to the periphery of the specimen,
which in turn may improve the training procedure of the network. However, a
geometry-dependent evaluation area would generally be preferable, but would compli-
cate the training process of the network and require more consideration in designing
the network architecture accordingly.


Figure 8.6: Differences between the class membership progression of the LOMO-
CV and LOSO-CV experiment for DX54D-S060 (0.75mm). (a) All three hold-out
sequences (LOMO-CV). (b) One hold-out sequence vs. the two training sequences
(LOSO-CV). Source: [Jare 19] (CC BY 4.0)


Figure 8.7: LOSO-CV and overfit FLC candidates of AA5182. Source: [Jare 19] (CC
BY 4.0)

8.3.2 Comparison with State-of-the-Art


A comparison of the different FLCs derived from the line-fit method (cf. Section 2.3),
the unsupervised O-SVM method (cf. Chapter 6) and the proposed method is car-
ried out in this study. As a short reminder, the line-fit method focuses on a small
heuristically defined evaluation area, only using the material thickness reduction to
determine the onset of necking. The unsupervised O-SVM approach analyzes all
three principal strains simultaneously by evaluation of multiple rectangular areas in
the vicinity of the maximum strain value by exploiting HoG features. Subsequently,
the confidence scores together with the time information, are assessed by means of
GMM. Consequently, the FLC is extended by failure quantiles and thus provides
a probabilistic FLC, which is compared with the probabilistic FLC candidates pro-
posed in this method. An exemplary comparison for DP800 is illustrated in Figure 8.8
(a), in which the probabilistic FLC candidates of the unsupervised O-SVM approach


Figure 8.8: Comparison with the line-fit method and P-FLC for DP800 (a) and
AA5182 (b). Source: [Jare 19] (CC BY 4.0)

are represented by their probability quantiles. In order to facilitate the comparison and to reduce the number of visualized curves, the LOMO-CV candidate representatively replaces the other two experiments as they provide comparable results. The
LOMO-CV localized necking curve is in line with the 0.99 and 0.6 quantiles, and
additionally coincides with the line-fit candidate.
In general, the visualized FLC candidates can be considered as being rather con-
servative, since the intersections between the homogeneous/transition and transi-
tion/necking class are considered as lookup time points for the strain values and thus
represent the 50% probability. Fine-grained quantile profiles would also be possible,
however, they have been omitted for visualization reasons. The LOMO-CV trans.
candidate provides strain values comparable to the 0.01 quantile candidate, while
both are found far below the results of the line-fit method. Generally, the width of
the transition phase and its covered time span depends on the material and evaluated
geometry. For example, a broader transition phase is characteristic for the ductile
DX54D-S060 material and geometry (cf. Figure 8.6 (a)), whereas the transition
phase is rather thin in case of the light-weight AA6014-S070 material and geometry
(cf. Figure 8.3 (c)). The homogeneous/transition intersection lookup for S245 was
omitted in all experiments, which explains why the transition and localization curves
end with the same strain value pairs. Overall, the transition phase tends to not occur
within this geometry or is negligibly short with respect to the time span.
In case of DP800-S245, the transition phase is superimposed by the other two phases
(cf. Figure 8.9). In case of the AA5182 material, the localized necking candidates,
with the exception of the S245 geometry, are found above the line-fit candidate (cf.
Figure 8.8 (b)), while the transition candidates are found below the line-fit method,
at least in near plane strain conditions.


Figure 8.9: Class affiliations of DP800-S245. Source: [Jare 19] (CC BY 4.0)

8.3.3 Metallography – Strain Path Quantification


Metallographic investigations enable qualitative and quantitative assessments of the
influence of forming processes on the material and its microstructure. Consequently,
if forming processes are stopped at an early stage, microstructural changes can be
assessed and the state of material fatigue can be determined by metallographic ex-
aminations. If the forming process was recorded with an optical measuring system,
the macroscopic level is described by strain distributions, and the last image of the
sequence can be associated with the microscopic level by metallographic investiga-
tions. With a multitude of experiments and sufficiently small steps with respect to
the punch progression, it is possible to describe the material behavior and finally, the
onset of the localized necking more precisely (cf. [Affr 17] for DX54D and [Affr 18]
for DP800). However, due to the absence of a well-defined necking area in case of
prematurely stopped forming processes, a direct comparison with specimens that
have been deformed until fracture is not possible. Strain paths, as well as the unsupervised
method proposed in Chapter 6, require the definition of either connected
pixels or a well-defined necking area together with a fixed sequence length, which
prevents a valid comparison. In contrast to this, the proposed method possesses no
time dependency, and furthermore is location-independent, since a large central area
is used for evaluation. This independence from the presence of localization effects
makes it possible to infer the cluster membership for a frame of an incompletely
formed specimen from the clustered distributions of sequences that were deformed
up to fracture. Ultimately, the proposed procedure makes it feasible to monitor and,
if necessary, stop a forming process in real-time based on a learned model. Two se-
quences of prematurely stopped forming experiments of DP800-S110 are depicted in
Figure 8.10 (a) and (b) with their individual probabilistic cluster memberships with
respect to the forming progression. While the first sequence reaches the localized
necking phase, the second forming process was stopped within the transition phase
close to the onset of localized necking. A comparison of the strain paths between the
completely and incompletely formed specimen is visualized in Figure 8.10 (c). Since
it is impossible to determine the evaluation area exactly, the top 90% of ε3 serves as
threshold and is used to derive the strain value pairs. Each sample of the strain path
is color-coded using the 50% threshold for each class, while the determined onset
of localized necking according to the proposed classification method (CL) and the
line-fit method are highlighted additionally.
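The derivation of strain value pairs from such a thinning threshold might be sketched as follows. This is a hedged reconstruction: the interpretation of the top 90% of thinning as a fraction of the frame maximum, as well as the function name, are assumptions for illustration.

```python
import numpy as np

def limit_strain_pair(major, minor, thinning, frac=0.90):
    """Average the (minor, major) strain pair over all facets whose
    thinning exceeds `frac` of the frame maximum (proxy evaluation area)."""
    mask = thinning >= frac * thinning.max()
    return float(minor[mask].mean()), float(major[mask].mean())

# Toy facet values of one frame
major = np.array([0.15, 0.20, 0.28, 0.27])
minor = np.array([0.03, 0.04, 0.045, 0.043])
thinning = np.array([0.10, 0.12, 0.30, 0.29])
print(limit_strain_pair(major, minor, thinning))
```

Only the two facets above the thinning threshold contribute to the averaged strain value pair.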

[Plots (a) and (b): class-membership probability over the frame index for the two sequences, with homogeneous, transition, and necking curves. Plot (c): major strain vs. minor strain paths of the training and metallographic sequences, with the avg. limit strains (CL, LF) marked.]

Figure 8.10: Color-coded probability progressions and strain paths of two incom-
pletely formed sequences of DP800-S110 with respect to the strain paths of the com-
pletely formed training sequences: (a) Probability progression with all classes as-
signed to the forming sequence. (b) The forming process was stopped prior to reach-
ing the necking phase. (c) Corresponding color-coded strain paths of all sequences.
The position of the classified avg. limit strain (CL) is adjustable and depends on the
selected quantile or probability threshold. Source: [Jare 19] (CC BY 4.0)

8.3.4 Interpretation of Weakly Supervised Network Activations


Depending on the forming phase, different parts of the image are representative from
the perspective of the network. These representative class activations from the homo-
geneous, transition and localization phases are illustrated individually in the following
for three experiments of the S050 uniaxial loading condition of AA6014. Besides the
original time-derivative of strain distributions (ε1 ) and the class activation heat-maps,
which highlight the important parts that contributed to the decision, a Gaussian
filtered version of the time-derivative is provided, which is easier to interpret as it contains
less noise. Each figure presents three experiments, always at the same
forming stage. In case of the homogeneous forming phase, as visualized
in Figure 8.11, no structures seem to be important, as the cluster membership is
mainly derived using the border parts of the images which exhibit high activations.
In contrast, the inner parts of the image are uniformly unimportant, which is reasonable
since no activations are expected in this area. Furthermore, the measurement
errors in experiment one (upper right) and two (lower right) are considered impor-
tant since the network provides high activations at these positions. These artifacts
are suppressed within the blurred version, which additionally highlights the homoge-
neous character of the strain distributions. Additional examples of the homogeneous
forming phase are visualized in Figure 8.12. Since this forming phase covers a large
time-span, the same cluster membership can be derived even though different parts
of the image are considered as important. Since the forming process is close to the
transition phase, the activations are spread across the images, without focusing on
a specific structure as emphasized by the heat-maps. Contrary to the previous ex-
amples, the measurement error in experiment 2 is not considered as important and
additionally, the blurred strain distributions highlight the diffuse concentration of
strain, which by itself is distributed across the images.
Figure 8.13 visualizes the transition phase of the experiments. Contrary to the expec-
tations, the network does not consider the central localization of strain, but instead
focuses on the triangular areas below and above the central region with higher strain
values. Especially the blurred images emphasize the triangle regions with lower strain
values, while again, the measurement artifacts in experiment 1 and 2 have no impact
on the cluster memberships.
In case of the localization phase, similar to the homogeneous forming phase, the net-
work focuses on different parts to derive the cluster membership. At the beginning
of this phase, as expected and visualized in Figure 8.14, the network considers the
localization of strain at the center of the images in all the experiments. This local-
ization of strain is clearly visible in the original noisy time-derivatives as well as in
the Gaussian filtered visualizations. Larger parts of the images are highlighted by
the network, which supports the hypothesis that considering additional information
is beneficial rather than focusing on limited areas or cross-sections as proposed by
state-of-the-art methods (cf. Section 2.3). However, the network activations are dif-
ferent at the end of the localization phase (close to rupture of the specimen). Again
contrary to the expectations, as visualized in Figure 8.15, the network does not focus
on the central, localized part of the strain distributions. Instead, the areas beside the
strain localizations contribute to the decision-making process among all experiments.

[Image grid: Experiments 1-3 at stage 10, shown as original, heat-map, and blurred strain distributions.]

Figure 8.11: Strain distributions together with heat-maps. According to the acti-
vations, the important parts are found in the border regions.

[Image grid: Experiments 1-3 at stage 36, shown as original, heat-map, and blurred strain distributions.]

Figure 8.12: At an advanced forming progression, the homogeneous forming phase
is represented by distributed activations across the image.

[Image grid: Experiments 1-3 at stage 46, shown as original, heat-map, and blurred strain distributions.]

Figure 8.13: For the transition phase, the important parts are triangular areas
above and below the concentrated regions with high strain values.

[Image grid: Experiments 1-3 at stage 48, shown as original, heat-map, and blurred strain distributions.]

Figure 8.14: At the beginning of the localization phase, the activations consider
the central concentration of high strain values.

[Image grid: Experiments 1-3 at stage 52, shown as original, heat-map, and blurred strain distributions.]

Figure 8.15: At the end of the localization phase, the activations consider the areas
adjacent to the concentrated high strain values in the center.

When investigating a biaxial loading condition, the homogeneous forming phase
images appear similar to those already presented for the uniaxial loading condition.
As the transition phase is negligible for the S245 geometry (cf. Figure 8.9), multiple
examples for the localization phase are presented in Figure 8.16. In contrast to
the previous visualizations, individual stages are presented for each experiment, as
the localization membership varies in time. Similar to the previous examples, the
same classification result can be derived using varying activations of the network. In
experiments two and three, the network focuses on the strain structures located at
the center and border regions of the images, whereas in experiment one, the network
considers varying parts of the image that are not related to the localized strain
structure. Again, the blurred versions of the original distributions visualize the strain
concentrations in a more appealing manner than the noisy derivatives.

[Image grid: Experiment 1 (stage 53), Experiment 2 (stage 52), Experiment 3 (stage 51), shown as original, heat-map, and blurred strain distributions.]

Figure 8.16: For a biaxial loading condition, the localization phase is represented
either by distributed activations across the image or by focusing on the central region
with higher strain values.

8.3.5 Comparison with Supervised Network Activations


The same time points are considered for both the weakly supervised and the supervised
experiment, such that the relevant areas are highlighted by means of Grad-CAM.
Within the figures, heat-maps of the weakly supervised (weakly) and supervised (su-
pervised) approach are contrasted for AA6014-S050 and visualized together with the
blurred strain distribution for each of the three experiments.
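The core Grad-CAM computation behind these heat-maps can be sketched compactly. This is a schematic NumPy version operating on stand-in arrays; in practice, the feature maps and gradients are taken from the trained network via automatic differentiation.

```python
import numpy as np

def grad_cam(feature_maps, grads):
    """Grad-CAM heat-map: ReLU of the gradient-weighted sum of feature maps.

    feature_maps: (K, H, W) activations of the last convolutional layer
    grads:        (K, H, W) gradients of the class score w.r.t. these activations
    """
    alphas = grads.mean(axis=(1, 2))                 # global-average-pooled weights
    cam = np.tensordot(alphas, feature_maps, axes=1)  # weighted sum over channels
    return np.maximum(cam, 0.0)                      # keep positive contributions

rng = np.random.default_rng(0)
cam = grad_cam(rng.normal(size=(8, 18, 18)), rng.normal(size=(8, 18, 18)))
print(cam.shape, cam.min() >= 0.0)  # (18, 18) True
```

The resulting map is typically upsampled to the input resolution and overlaid on the strain distribution.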
In the homogeneous phase, no structural information is evaluated, while the artifacts
in the peripheral areas again cause network activations. It is noticeable that, in
addition, no activation can be detected in the central area, recognizable by the
homogeneous blue coloration (cf. Figure 8.17). As the homogeneous forming
progresses, the heat-maps of the two approaches deviate (cf. Figure 8.18).
The weakly supervised approach obtains distributed activations, whereas the super-
vised approach focuses on smaller image areas, with additional focus on the peripheral
region.
This emphasis on the peripheral area is intensified especially in the transition phase
(cf. Figure 8.19), although the classification is also derived by exploiting the two central
areas near the incipient localization. Experiment three in particular already
reveals activations close to the expected necking region.
At the beginning of the localization phase (cf. Figure 8.20), both approaches focus
on the strain concentration in the center of the image, whereas the supervised approach
highlights the emerging structure considerably more explicitly. Consequently,
by means of the activation maps, it is possible to approximate the area of the
onset of necking together with the vicinity that is involved in the necking process.

Generally, within all the previous figures, the activation maps of the two networks
are somewhat comparable to each other. This was not to be expected, since in the
weakly supervised approach, significantly fewer samples are used without any infor-
mation about the transition phase and consequently a larger deviation was expected.
For this reason, the different activation maps at the end of the inhomogeneous form-
ing phase are surprising (cf. Figure 8.21). Both networks made their decision not on
the basis of the structure that occurred, but based on information located beside the
structure. Especially in this example, the opposite was expected specifically in case
of the supervised approach.

[Image grid: Experiments 1-3 at stage 10, shown as blurred strain distribution, weakly supervised heat-map, and supervised heat-map.]

Figure 8.17: Similar to the weakly supervised heat-maps, within the homogeneous
forming phase, the important parts are found at the border regions, while most parts
of the image comprise no activations.

[Image grid: Experiments 1-3 at stage 36, shown as blurred strain distribution, weakly supervised heat-map, and supervised heat-map.]

Figure 8.18: At an advanced forming progression, the homogeneous forming phase
is represented by distributed activations across the image. The supervised approach
additionally highlights parts of the border regions.

[Image grid: Experiments 1-3 at stage 46, shown as blurred strain distribution, weakly supervised heat-map, and supervised heat-map.]

Figure 8.19: For the transition phase, the important parts are triangular areas
above and below the concentrated regions with high strain values. The supervised
approach additionally emphasizes the border regions on both sides and may also focus
on small areas beside the incipient localization.

[Image grid: Experiments 1-3 at stage 48, shown as blurred strain distribution, weakly supervised heat-map, and supervised heat-map.]

Figure 8.20: At the beginning of the localization phase, the activations consider
the central concentration of high strain values, whereas the supervised approach
highlights the localization more explicitly.

[Image grid: Experiments 1-3 at stage 52, shown as blurred strain distribution, weakly supervised heat-map, and supervised heat-map.]

Figure 8.21: At the end of the localization phase, the activations consider the areas
adjacent to the concentrated high strain values at the center.

[Plot: major strain vs. minor strain FLC candidates of AA6014 (1.00 mm), comparing the weakly supervised and supervised localization and transition candidates.]

Figure 8.22: FLC candidates of AA6014 (1.00 mm). Both approaches coincide in
case of the localization candidate, whereas the transition candidate of the supervised
approach is more conservative.

In contrast to the weakly supervised method, the supervised experiment uses im-
ages from all phases, while especially the transition phase is included during training.
Consequently, it is potentially feasible to determine the beginning of this phase more
precisely. In particular, the use of transition frames provides a more robust determi-
nation with potentially higher and faster increasing confidence. This can be explained
by the integration of image information during the training process, which takes into
account the different characteristics and structures of the individual geometries and
might reduce uncertainty. A comparison of the FLC candidates of both approaches
is visualized in Figure 8.22. While the localization candidates of both approaches
coincide, the transition candidates deviate slightly. This deviation is introduced by
misclassifications, since the labels for training the network are generated only based
on the Euclidean distance of a PCA reduced feature space. Conversely, the network
uses these annotations together with the images. Since the image characteristics are
employed instead of the distances to determine the class affiliations, it is understandable
that divergent results are obtained. Consequently, the transition FLC candidate
of the supervised approach is more conservative.

8.4 Discussion
Forming processes based on Nakajima test setups generate patterns on the surface
of sheet metal materials that are observable within the strain distributions when
using a DIC measurement device. Chapter 5 introduced a supervised classification
algorithm with expert knowledge/annotations that achieved good agreement between
experts and classification results. In order to remove the subjective and costly expert
annotations, an unsupervised classification algorithm based on conventional pattern
recognition was introduced in Chapter 6, which combined O-SVM and HoG features
of a predefined region together with GMM. Overall consistent results were achieved
throughout the experiments conducted, while additionally introducing a probabilis-
tic FLC. The intuition of this procedure was to introduce well-established pattern

recognition methods and specific features for the evaluation of edge responses that
support the common hypothesis that localized necking is a result of sudden changes
in the principal strains. As long as this assumption is correct, the method can be
used as a baseline approach for comparison, since it already provides a probabilistic
FLC using quantiles, that define the certainty of necking.
However, the method has four disadvantages: (1) location dependency, since only
the vicinity of the maximum strain area is evaluated; (2) time-dependency, since the
method requires specimen to be formed until fracture. Consequently, a comparison
with prematurely terminated forming processes is impossible; (3) the features are
predefined in accordance with the expected behavior of the material and therefore
limited to edge information; (4) knowledge transfer to other materials is not possi-
ble and therefore no generalization to new materials is provided. In Chapter 7, an
unsupervised deep learning procedure was proposed, which automatically extracts
characteristic features of the forming sequences by means of an autoencoder. Subse-
quently, these characteristics were used in cluster procedures, so that temporal and
spatial independence was realized. However, all materials comprised sequences of
varying geometries, such that it was not possible to derive consistent cluster results
and, consequently, no reliable FLCs could be generated.
In order to circumvent the mentioned limitations, the proposed method pursues a
two-step approach. First, a Siamese CNN is trained following a supervised scheme
that employs only the extreme cases of the homogeneous and inhomogeneous form-
ing phase (begin/end of forming process). The individual frames of the two classes
are separated optimally by minimization of the contrastive loss function, while the
amount and variation of data is increased using augmentation techniques. The sec-
ond step assesses complete forming sequences that are transformed by the network
into their low-dimensional representations. The dimensionality of the manifolds is
further reduced by employing PCA, while the individual frames of each sequence are
clustered in an unsupervised manner via SMM. As a consequence, by application of
the proposed method, the disadvantages are avoided as follows: (1) the dependence
on the extremal region or the location dependency in general is reduced since the
maximum possible square evaluation area is employed; (2) the overall framework is
now independent of time, as the O-SVM together with the GMM is replaced with
SMM. This allows assessment of individual frames of incomplete forming sequences
and a comparison of strain paths; (3) instead of predefined features, optimal features
are derived automatically as they are learned by the Siamese CNN; (4) inference from
one material to another is possible as proven with the LOMO-CV experiment, which
enables real-time supervision of forming processes even for unknown materials; and
(5) generalization of the method to materials with more complex forming behavior
such as AA5182 seems possible even with limited data. Furthermore, evaluation of
the network activations revealed the image regions the network focuses on. Con-
sequently, a coarse approximation of the localized necking region was accomplished
completely data-driven, without using heuristics. However, especially the detection
and accommodation of measurement artifacts and defect pixels remains a challenge,
as their presence or absence affects the normalization process, which can interfere
with network training or impede data clustering.
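The second, unsupervised step of the pipeline, dimensionality reduction of the bottleneck features followed by probabilistic clustering into three forming phases, might look roughly as follows. Note that this sketch substitutes scikit-learn's Gaussian mixture for the Student's t mixture (SMM) used in the thesis, since no SMM implementation is available in scikit-learn, and the embeddings are random stand-in data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-in for the Siamese-CNN bottleneck features of all frames of a sequence set
embeddings = rng.normal(size=(400, 128))

# Further reduce the dimensionality of the learned manifold
reduced = PCA(n_components=10).fit_transform(embeddings)

# Three clusters: homogeneous, transition, and localized necking phase
mixture = GaussianMixture(n_components=3, random_state=0).fit(reduced)
frame_probs = mixture.predict_proba(reduced)  # per-frame probabilistic memberships
print(frame_probs.shape)  # (400, 3)
```

Because inference operates on individual frames, the same fitted model can also assign memberships to frames of prematurely stopped forming sequences.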

8.5 Conclusion
A weakly supervised classification approach for the detection of the onset of localized
necking was proposed in this study. It comprises two steps: (1) a supervised classi-
fication part that employs data-driven learning of optimal features using a Siamese
CNN. During training, the contrastive loss function is minimized and therefore the
extreme cases of the forming sequences are separated optimally; (2) an unsupervised
clustering part that is used to classify the frames of disjoint sequences as belonging
to the homogeneous, transition, or inhomogeneous forming phase. Overall, the results
are consistent with previous methods. However, the proposed method requires no def-
inition of the necking region or suitable features. Consequently, the main advantages
of the proposed approach are its location and time independency, thereby rendering
real-time monitoring of forming processes possible. Similarly, a trained model could
be used in Nakajima setups together with a measurement system, in turn, to inter-
rupt the forming process upon detection of localized necking and therefore enable
validation of the method by metallographic investigations. A remaining question is
the definition of localized necking by means of a strain distribution. So far, heuristics
or, in this case, the upper 90% of thinning are used naively to define an area that is
considered as localized necking. To address this open issue, a data-driven approach
that automatically segments localized necking regions is presented in Chapter 9.
CHAPTER 9

Weakly Supervised Approximation of the Localized Necking Region using Deep Learning
9.1 Method
9.2 Experiments
9.3 Results and Discussion
9.4 Conclusion

So far, the focus of the work has been the precise temporal determination of the
onset of localized necking by means of unsupervised or weakly supervised classifica-
tion methods. This is primarily due to the fact that already the temporal onset of
localized necking cannot be determined unambiguously by experts, as discussed in
Chapter 5. For this reason, the spatial determination of the localized necking area
was mostly neglected. By means of the activation maps of the weakly supervised or
supervised approach (cf. Subsection 8.3.4 and Subsection 8.3.5), it was possible to
highlight the areas of the image which are significantly involved in the classification
part, such that the temporal beginning of localized necking can be unambiguously de-
rived from the focus on the emerging structure. However, this procedure provides no
high-resolution determination of the necking area. Additionally, shortly after focusing
on the emerging structure, the class affiliation is derived based on information
beside the structure (cf. Figure 8.21). Consequently, the necking region cannot
be tracked until the end of the forming process using the activation maps.
Nevertheless, a high-resolution determination of the necking area is made possible
by means of small architectural changes. The proposed method again combines su-
pervised and unsupervised pattern recognition techniques to determine the FLC and
to segment the critical necking region, and was published as first author publication
[Jare 20].

9.1 Method
In principle, the method follows the weakly supervised methodology as proposed in
Chapter 8. However, a decoding path, comparable to the autoencoder approach of


Chapter 7, is attached to the network. This decoding path consists of the same number
of layers with a structure identical to that of the encoding path, but in mirrored order.
Consequently, an output with identical resolution is generated based on the features
of the bottleneck layer. Additionally, in order to emphasize finer structures, the fea-
tures of the convolutional layers preceding the max-pooling layers of the encoding
path are added to the corresponding layer in the decoding path (cf. Figure 3.21).
These so-called skip-connections preserve the high-resolution features and lead to
finer segmentations [Maie 19a].
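The effect of such a skip connection can be illustrated schematically. This is a toy NumPy sketch of the principle, not the actual Keras layers of the thesis: the decoder activation is upsampled back to the encoder resolution and the preserved high-resolution encoder features are added element-wise.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour upsampling: the decoder's resolution-doubling step."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(0)
enc_features = rng.random((72, 72))   # encoder features before max-pooling
dec_features = rng.random((36, 36))   # decoder activation at half resolution

# Skip connection: add the preserved high-resolution encoder features
# to the upsampled decoder features of matching size
fused = upsample2x(dec_features) + enc_features
print(fused.shape)  # (72, 72)
```

In the real network, this fusion happens per feature channel at each resolution level of the decoding path.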
While the weakly supervised approach exclusively established a separation of the
homogeneous and inhomogeneous forming phase, the proposed method additionally
pursues segmentation of the necking area. This segmentation is derived by the decod-
ing path, while in contrast to the autoencoder, the input image is not reconstructed.
Instead, a pixel-wise probabilistic class affiliation is determined by means of a softmax
function that is employed on the last layer of the decoding path. Consequently, every
pixel receives a probability of either belonging to the homogeneous or inhomogeneous
forming phase.
In order to realize these segmentations, binary masks are made available in the train-
ing process in addition to the individual images, which again consist only of the
extreme cases of the homogeneous and inhomogeneous phase (cf. Table 8.1). The
masks are generated automatically using a threshold value of 99.0% of the maximum
ε3 to guarantee that the determined area covers the necking region on the frames of
the inhomogeneous forming phase. Specifically, this value is determined to cover a
larger portion of the image area with strain localization, such that the actual necking
region is included in the segmentation masks together with its neighboring pixels.
Since no localization effects are present in the strain distribution of the homogeneous
phase, the binary masks consist exclusively of background pixels (class 0), whereas
the structures for frames of the inhomogeneous phase are emphasized by foreground
pixels (class 1). Closing, a classical morphological transformation, is applied to the
masks to remove single pixels above the threshold that are not connected to the
approximated necking region. Three examples of the differential strain distribution,
together with their corresponding binary masks, are visualized in Figure 9.1 for three
forming experiments of DP800-S050. As visualized, the necking region plus a few pix-
els aside the strain localization serves as binary mask, so that the network is forced
to focus on this particular region.
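The mask generation described above might be sketched as follows. This is a hedged reconstruction: the function name is made up, and SciPy's `binary_closing` stands in for the morphological cleanup step, following the description in the text.

```python
import numpy as np
from scipy.ndimage import binary_closing

def necking_mask(eps3, frac=0.99):
    """Binary mask of the approximate necking region: foreground (class 1)
    where thinning exceeds `frac` of the frame maximum, cleaned by a
    morphological closing as described in the text."""
    mask = eps3 >= frac * eps3.max()
    return binary_closing(mask)

# Toy thinning distribution with a vertical high-strain band
eps3 = np.full((8, 8), 0.05)
eps3[:, 4] = 0.30
mask = necking_mask(eps3)
print(bool(mask[4, 4]), bool(mask[4, 0]))  # True False
```

Frames of the homogeneous phase contain no values above the threshold, so their masks consist exclusively of background pixels.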
Consequently, the network optimizes two different criteria. On the one hand, an op-
timal separation of the classes is again pursued by means of the contrastive loss (CCL )
(cf. Equation 8.3) based on the features of the bottleneck layer of the encoding path,
whereas on the other hand the segmentation of the localized necking region is derived
by optimizing the binary cross-entropy (CBCE ) of the decoding path according to:

$$C_\mathrm{BCE}(\theta_d) = -\frac{1}{N_I}\sum_{i=1}^{N_I}\left[\, y_i^I \log\!\left(\hat{y}_i^I\right) + \left(1 - y_i^I\right)\log\!\left(1 - \hat{y}_i^I\right)\right] \qquad (9.1)$$

where y I and ŷ I denote the pixel-wise label and its prediction of an image with
resolution NI , so that the entire loss function is composed as:
$$C_\mathrm{total}\!\left(\theta, y, y_1^I, x_1, x_2\right) = \frac{1}{2}\, C_\mathrm{CL}\!\left(\theta_e, y, x_1, x_2\right) + \frac{1}{2}\, C_\mathrm{BCE}\!\left(\theta_d, y_1^I, x_1\right) \qquad (9.2)$$
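The two loss terms can be written out directly in NumPy. This is a plain illustrative sketch; the contrastive loss is given in its standard margin-based form, which is an assumption here since Equation 8.3 is not restated in this chapter, and the convention `y_pair = 1` marks similar pairs.

```python
import numpy as np

def bce_loss(y, y_hat, eps=1e-7):
    """Pixel-wise binary cross-entropy, Eq. (9.1)."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return -np.mean(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))

def contrastive_loss(dist, y_pair, margin=1.0):
    """Margin-based contrastive loss on embedding distances (assumed form
    of Eq. 8.3); y_pair = 1 for similar pairs, 0 for dissimilar pairs."""
    return np.mean(y_pair * dist**2
                   + (1.0 - y_pair) * np.maximum(margin - dist, 0.0)**2)

def total_loss(dist, y_pair, y_mask, y_mask_hat):
    """Equally weighted combination, Eq. (9.2)."""
    return 0.5 * contrastive_loss(dist, y_pair) + 0.5 * bce_loss(y_mask, y_mask_hat)
```

During training, the contrastive term acts on the bottleneck features of the encoding path, while the cross-entropy term supervises the pixel-wise output of the decoding path.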


Figure 9.1: Difference images of DP800-S050-(1-3) (a) - (c) together with their
corresponding segmentation masks (d) - (f ). White foreground pixels character-
ize the local necking effect, whereas black pixels are considered as background.
Source: [Jare 20] (CC BY 4.0)

In other words, within the training phase, the network learns to recognize struc-
tures and critical behaviors that are characteristic for the specific regions of the
inhomogeneous class while at the same time optimally separating the extreme cases
of the homogeneous and inhomogeneous forming phase. Within the test phase, all
unseen, intermediate frames are evaluated by the network, such that critical regions
are emphasized and highlighted accordingly. Consequently, the temporal, as well as
the spatial determination of the onset of local necking, is feasible while additionally
providing the development of the necking region for the entire forming process.

9.1.1 Preprocessing
In principle, the data is preprocessed as described in Subsection 6.1.1. Thus, missing
pixels are interpolated, whereby the temporal derivatives, i.e., difference images, are
investigated again. Deviating from the presented preprocessing in Subsection 8.1.1,
a robust scaling of the data is employed. Outliers are therefore not removed from the
data or truncated with a saturation value (0.0 or 1.0), so that the value range
only approximately covers the interval 0 - 1. This is necessary to preserve the character of local necking
and to facilitate the segmentation.
With respect to difference images, the local necking can be identified as a local max-
imum with falling flanks or metaphorically described as a mountain summit. In the
absence of outliers, the previous preprocessing methods would saturate the maximum
values in the relevant region, whereby the tapered character would be lost and replaced
by a plateau. Consequently, a robust scaling using the 0.25 and 99.75 percentiles is
employed for each frame, so that the majority of the distribution is approximately
scaled into 0 - 1 range, without limiting the minimum and maximum values.
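A minimal sketch of this robust, non-clipping scaling, assuming per-frame percentiles (the function name is illustrative):

```python
import numpy as np

def robust_scale(frame, lo=0.25, hi=99.75):
    """Map the [lo, hi] percentile range of a strain frame to [0, 1]
    without clipping, so outliers keep values outside that range
    and the tapered peak of local necking is preserved."""
    p_lo, p_hi = np.percentile(frame, [lo, hi])
    return (frame - p_lo) / (p_hi - p_lo)

rng = np.random.default_rng(0)
scaled = robust_scale(rng.normal(size=(72, 72)))
print(scaled.min() < 0.0, scaled.max() > 1.0)  # True True
```

Values beyond the percentile range deliberately fall outside 0 - 1, in contrast to the saturating normalization of the previous chapters.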

9.1.2 Feature Extraction


The proposed method also provides an image-level classification based on the features
of the bottleneck layer similar to the previous method, such that the feature learning
process follows the same principle as presented in Subsection 8.1.2.

9.2 Experiments
In order to examine the generalization to other materials, the LOMO-CV experiment
from Section 8.2 is re-investigated. Square areas with a side length of 72 px
are center-cropped from each frame of the video sequences (cf. Figure 9.1), whereby
again only a limited amount of images and corresponding masks of the homogeneous
and inhomogeneous forming phase are used for training the network (cf. Table 8.1).
In order to determine the quality of the segmentation results, the dice coefficient as
introduced in Subsection 3.5.2 is utilized.
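The dice coefficient used for this evaluation follows the standard definition; the small epsilon guarding against empty masks is an implementation choice in this sketch.

```python
import numpy as np

def dice_coefficient(pred, truth, eps=1e-7):
    """Dice coefficient between two binary masks: 2|A intersect B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + eps)

a = np.array([[1, 1, 0], [0, 1, 0]])
print(round(dice_coefficient(a, a), 4))  # 1.0
```

It is computed separately for the foreground (necking) and background class of each predicted segmentation.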
Additionally, to investigate whether the bottleneck features of the proposed method
can also be employed for clustering, they are evaluated accordingly to the unsuper-
vised clustering part as presented in Subsection 8.1.4. Consequently, this enables a
comparison of the results on image scale with the segmentation results on pixel scale,
as well as a comparison with the weakly supervised FLC candidates of Chapter 8.
Keras, a high-level API of the TensorFlow framework [Abad 16], is used for implementation
of the network architecture. The experiment employs the Adam optimizer
[King 14] for loss minimization at a learning rate of 0.00001.

9.3 Results and Discussion


Overall, very good dice coefficients are obtained throughout all geometries and mate-
rials, resulting in an average dice coefficient of above 0.90 among all materials in case
of the foreground class, whereas a dice coefficient of 0.99 is derived for the background
class. Figure 9.2 provides an overview of the incremental strain images together with
their ground truth segmentations and predictions.
Even though only coarse segmentation masks were provided that included the lo-
calized necking plus its vicinity, the network was able to focus and emphasize the
intended region. In particular, the ground truth masks of AA6014-S060 and AA6014-S100 contain discontinuities that impede the dice coefficient computation. It is difficult to argue whether these discontinuities are actually correct, as these regions clearly exhibit high intensities when compared with the original strain distributions and might therefore also be considered as foreground. Consequently, the dice coefficient only provides an indication of whether the network focuses on the correct parts of the image, but cannot be used as a direct measure of quality, since the ground truth is only approximately
correct. As there exists no explicit definition of localized necking, it is difficult to
only segment the actual localized necking region automatically. However, the net-
work learns the specific characteristics of material failure and localized necking based
on the extreme cases and thus infers the necking probability to individual pixels.
This is in particular well visualized for AA6014-S245-1 and S245-2 as the ground

[Figure 9.2: image grid with columns S060, S100, S245-1 and S245-2 and rows Original, Ground truth and Prediction]

Figure 9.2: Difference strain images together with the corresponding ground truth
and predictions. Correct segmentation results are generated despite the presence of
outliers in S245-1 and S245-2. For S245-2, outliers are included in the generated
ground truth mask and predicted accordingly. Source: [Jare 20] (CC BY 4.0)

truth mask contains a larger region and thus pixels that do not really reveal necking
characteristics. Consequently, the predictions only highlight few pixels with necking
characteristics and thus result in lower dice coefficients. In order to better approxi-
mate the beginning of localized necking, the temporal development of the maximum
of the difference images (ε3 ) is considered first. This is illustrated for DP800-S050-1
in Figure 9.3 (a), together with its line profile in Figure 9.3 (e). At the beginning of
the considered part of the forming process, the line profile reveals a slightly increasing
tendency in consecutive frames, which increases significantly towards the end of the
forming process. In the associated segmentation (cf. Figure 9.3 (b)), or more pre-
cisely in the line profile through the maximum of the prediction (cf. Figure 9.3 (f )),
no necking probability has been predicted for the majority of the forming process.
Only towards the end of the forming process does the necking probability increase rapidly and significantly. This deformation behavior can also be observed in the repetitions
of the experiments for DP800-S050-2 and DP800-S050-3.
A detailed investigation of DP800-S050-1 (cf. Figure 9.4 (a) and (b)) compares the
critical forming period that emerges around time step 120 of the forming process,
by contrasting the line profile of the difference images (ε3 ) to the corresponding line
profile of the necking probability to emphasize their different development. For this
purpose, the difference images are compared with the predicted segmented masks at
several points in time, so that the necking area and its extent are visually highlighted.
The line profile of the difference images in Figure 9.4 (a) reveals no significant increase
up to the time point 155 (cf. Figure 9.4 (f )). The difference images at the respective

[Figure 9.3: panels (a)-(d) show x-z representations over time steps 120-160; panels (e)-(h) show the corresponding line profiles of intensity/probability over position]

Figure 9.3: The images emphasize the temporal development of the strain dis-
tributions and their necking probability. The red line highlights the evaluation re-
gion of the line-profiles: (a) and (e) x-z representation of the difference images of
DP800-S050-1 together with its incremental strain development as line-profile. (b)
and (f ) x-z representation of the segmented necking probability for DP800-S050-1
together with its line-profile through the maximum value. (c) and (g), (d) and (h)
x-z representation and line-profiles for DP800-S050-2 and DP800-S050-3, respectively.
Source: [Jare 20] (CC BY 4.0)

[Figure 9.4: panels (a) and (b) show line profiles of the strain difference and the segmentation/cluster probabilities over position; panels (c)-(g) and (h)-(l) show time points 142, 148, 150, 155 and 160]

Figure 9.4: Line profiles for DP800-S050-1 of the forming process of frame 120
to 160: (a) Progression of the incremental strain development through the max-
imum. (b) Corresponding probability progression through the maximum value of
the prediction. (c) - (g) Difference strain distributions at different time points. (h) - (l) Corresponding development of the segmented area, or approximated necking region.
Source: [Jare 20] (CC BY 4.0)

points in time 142-150 (cf. Figure 9.4 (c-f)) also do not exhibit a clear necking that is easy to identify. Visually, one could suspect that localized necking starts at time point 150 (cf. Figure 9.4 (e)), although this cannot be recognized distinctly until time point 155 (cf. Figure 9.4 (f)). At time point 160 (cf. Figure 9.4 (g)), i.e., the last
image of the forming process, a well-defined necking region can be identified. When
investigating the corresponding predictions, individual pixels are already classified as
belonging to the necking class starting at time point 142 (cf. Figure 9.4 (h)), whereby
the localization effect occurs at another position in the image area in comparison to
the remaining predictions. Beginning with time point 148 (cf. Figure 9.4 (i)), a small
connected area is identified, which constantly increases until the end of the forming
process at time point 160 (cf. Figure 9.4 (l)). In order to obtain a stable and clearly depictable necking area, the predictions were spatially and temporally smoothed using a 3D mean filter.
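Such spatio-temporal smoothing can be sketched with SciPy (the window size of 3 is an assumption, since the thesis does not state the kernel extent):

```python
import numpy as np
from scipy.ndimage import uniform_filter

# Stack of per-frame necking probability maps, shape (time, height, width).
predictions = np.random.rand(10, 72, 72)

# A 3x3x3 mean filter smooths jointly over time and both spatial dimensions,
# suppressing isolated spurious pixels in individual frames.
smoothed = uniform_filter(predictions, size=3, mode="nearest")

# A constant field is left unchanged by mean filtering.
print(np.allclose(uniform_filter(np.ones((5, 5, 5)), size=3), 1.0))  # → True
```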

Although the necking area is clearly emphasized visually, the question arises to
what extent individual pixels with low probability actually permit a reliable classifica-
tion of the beginning of necking. The predicted mask at time point 148 (cf. Figure 9.4
(i)) illustrates a contiguous area classified as necking, whereby the probability of neck-
ing according to the line profile in Figure 9.4 (b) reaches only 15%. Although this is
considerably below the usual 50% that serves as the decision threshold in classification
experiments, due to the consistency of the area and the absence of misclassified back-
ground pixels, the 15% failure probability can be considered as a significant change.
Consequently, necking can be assumed to begin at this point in time or even slightly
earlier. Since the proposed method possesses the same capabilities as the weakly
supervised approach as a result of the encoding path, it is feasible to contrast the
onset of necking based on image scale with the segmentation of the necking region on
pixel scale. For this purpose, the features of the bottleneck layer are again utilized,
such that individual frames of the forming process are assigned to the failure classes
by means of clustering using SMM (cf. Chapter 8).
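The clustering of bottleneck features into failure classes can be sketched as follows; since scikit-learn provides no Student's-t mixture (SMM), a Gaussian mixture serves as a stand-in here, and the two-dimensional features are synthetic placeholders:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical bottleneck features of the homogeneous, transition
# and necking phases (in reality these come from the encoding path).
features = np.vstack([
    rng.normal(0.0, 0.1, (50, 2)),
    rng.normal(0.5, 0.1, (50, 2)),
    rng.normal(1.0, 0.1, (50, 2)),
])

gmm = GaussianMixture(n_components=3, random_state=0).fit(features)
probs = gmm.predict_proba(features)  # per-frame cluster membership probabilities
print(np.allclose(probs.sum(axis=1), 1.0))  # → True
```

The per-frame membership probabilities are what yields the probabilistic failure assessment described in the text.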
The resulting probability progression curve is additionally provided in Figure 9.4
(b) and depicted by cluster probability. At time point 148, the evaluation based
on the bottleneck features yields a failure probability of 50%, which corresponds
to the 15% probability of the segmentation approach. The proposed methodology
thus closes the gap of the weakly supervised approach, which used features of the
entire image without providing an estimate of the expected necking region. With
the presented method, it is thus possible to either use the whole image information,
directly use the probability of the segmented area, or use both evaluation options in
combination. Furthermore, it can be inferred from the two curves that they begin to
rise significantly at about the same time point (approx. 145 in Figure 9.4 (b)), so
that an earlier or smaller quantile (< 50%) should be used to define the beginning of
necking in the probabilistic FLC. The difference between the two curves is particularly
noticeable in the premature and temporary increase of the clustering approach. This
can be attributed to the untreated outliers, which are also visible in the upper part
of the difference images in Figure 9.4 (c)-(g).

These outliers affect the features in the bottleneck layer, so that individual sam-
ples exhibit an increased noise behavior within the clustering procedure. This is
illustrated in Figure 9.5 (b) and especially emphasized by the samples of the transi-
tion region. This noise behavior impairs the clustering procedure, so that the cluster
centers deviate marginally and thus affect the likelihoods. A comparison of the prob-
abilistic FLC between the weakly supervised (cf. Chapter 8) and the cluster method
based on the encoding path of the proposed method is provided in Figure 9.5 (a) and
Figure 9.6. For DP800, a high degree of agreement prevails, with slight deviations for
the biaxial loading condition on the right side of the curve, whereby the differences
between the two FLC candidates can be attributed to the varying preprocessing of
the outliers.
In contrast to the weakly supervised method, the data is now preprocessed using
robust scaling. As a result, there is no saturation effect of the extreme values. Con-
sequently, the necking character is maintained both in the presence and absence of
outliers. This leads to marginal deviations of the generated FLCs of the proposed

[Figure 9.5: (a) DP800 FLC candidates, major strain over minor strain (weakly sup., prop. method, line-fit, ISO); (b) PCA1 over PCA2 with centered data, cluster centers and homogeneous, transition and necking samples]

Figure 9.5: Comparison of the weakly supervised approach and the proposed
method together with the impact of outliers: (a) DP800 FLC candidates. (b)
Especially the transition region contains outliers that affect the cluster position.
Source: [Jare 20] (CC BY 4.0)

method in comparison to the weakly supervised results (cf. Figure 9.6 (a) and (b)).
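The robust scaling applied here centers each feature by its median and scales by the interquartile range, so extreme values retain their necking character instead of saturating; a minimal sketch with scikit-learn (the data values are illustrative):

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

data = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])  # one extreme outlier

# (x - median) / IQR: insensitive to the outlier, unlike min-max scaling.
scaled = RobustScaler().fit_transform(data)

# The median maps to zero, while the outlier keeps its extreme character.
print(np.isclose(np.median(scaled), 0.0), scaled[-1, 0] > scaled[-2, 0])
```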
In order to facilitate the interpretation of the results, the FLCs determined by the
ISO industry standard and the line-fit method are also provided (cf. Subsection 2.3.1
and Subsection 2.3.2). Note that the intersection between the transitions and the
necking phase (cf. Figure 8.3 (c)) is employed again to generate the FLCs. This
corresponds to the 50% quantile of the local necking phase (cf. Figure 9.4 (b)). To
improve visibility, additional quantiles (cf. Chapter 6) are not illustrated. The 50%
quantiles of both proposed methods lie below the FLCs of the line-fit method for
AA6014 and DX54D, whereas they coincide in case of DP800.
Overall, the generated FLCs are located above the FLCs of the ISO method. This
seems reasonable, since this method tends to underestimate the forming capacity,
especially for light-weight materials, whereas the line-fit method tends to overestimate the forming capacity. Furthermore, the position of the generated FLCs can be
customized using other failure probabilities to meet the individual requirements.

[Figure 9.6: (a) AA6014 and (b) DX54D FLC candidates, major strain over minor strain, comparing weakly sup., prop. method, line-fit and ISO]

Figure 9.6: Comparison of the weakly supervised classification with the weakly
supervised segmentation method. (a) AA6014 FLC candidates. (b) DX54D FLC
candidates. Source: [Jare 20] (CC BY 4.0)

9.4 Conclusion
The proposed method incorporates all the advantages of the weakly supervised ap-
proach outlined in Chapter 8, which comprises the data-driven learning of appropriate
features and the time- and location-independent determination of the onset of local-
ized necking. Consequently, using the features of the bottleneck layer provides the
possibility to determine the temporal onset of localized necking based on image scale
features. Additionally, the spatial determination of localized necking is realized by the decoding path of the network, such that a probabilistic assessment of the necking region on pixel scale is feasible. As a consequence, it is possible to generate strain
paths by only evaluating the specific area that is actually involved in the necking
process rather than heuristically specifying the necking area on the last image of the
forming process. Furthermore, it was determined that a lower failure quantile of the
probabilistic FLC (< 50%) should be considered in process design that defines the
onset of localized necking. Future work should incorporate temporal information to
facilitate a smooth determination of the necking region without the necessity of mean
filtering. However, in order to provide a final evaluation of the presented methodol-
ogy, it is essential to produce real components and to examine their quality by means
of metallographic investigations.
CHAPTER 10

Outlook
In this work, machine learning was used for the first time in the area of sheet
metal forming and the determination of the FLCs. For this purpose, five studies
were carried out, ranging from a supervised approach via unsupervised to a weakly
supervised methodology, whereby the required prior knowledge was successively re-
duced. Further extensions and alternative techniques are provided for each study in
detail and outlined below before proposing general modifications.
In principle, the traditional approach with classic HoG features and RF performs very
well. Of course, one would expect slightly better results with other approaches based
on CNNs [He 16]. Especially for the homogeneous and diffuse necking class, large
deviations would still be expected since partly the labels are contradictory between
the individual sequences of the same loading condition. Consequently, a majority
voting scheme was employed for this study, while it would also be possible to apply
a multilabel approach [Madj 12, Yeh 17], since the properties of two different classes
may occur on individual frames due to the smooth transitions from one class to the
other. It would also be reasonable to introduce a self-supervised concept that uses
only a few certain instances per class while extending them incrementally, thus fur-
ther reducing the uncertainty between the experts [Lee 13, Oliv 18]. Furthermore, it
would be desirable to get new annotations from experts with respect to the spatial
extent of the individual classes, such that instead of relying on class activations or
coarse approximations, an explicit segmentation and assessment by means of U-Net
would be possible [Ronn 15]. Consequently, this would allow improved segmentation
of the extent of the local necking, so that besides the comparison with the ground
truth annotations, an additional comparison with the coarse segmentations would be
worth investigating.
The second study was the logical evolution based on HoG features and O-SVM with-
out the need of expert knowledge. Explicitly it was focused on spontaneous differences
in successive images. However, there was no attempt to cluster the features in the
PCA reduced feature space directly. In contrast to CNN activations, the handcrafted
features possess the property that changes in the feature space can be directly at-
tributed to changes on the material surface. Similarly, it is to be expected that the
position in the feature space will not vary as much as it was the case with autoen-
coder features. However, as indicated, the autoencoder features could be used and
substitute the HoG features.
In the third study, the location and time dependency was further reduced, utilizing
a deep learning approach for the first time. As demonstrated, it was not possible to
regularize the feature space so that sequences of the same loading condition are not
mapped side by side similarly in the feature space. One possible option to minimize


this effect might be to use perceptual loss, which in contrast to per-pixel loss functions
is more robust against transitions or small changes in the images and thus locates
similar instances potentially closer together [John 16, Doso 16]. Another possible en-
hancement would be to replace the two-step approach with an end-to-end solution
using existing cluster approaches [Xie 16, Ghas 17]. However, since these are likewise
based on autoencoders, it must first be examined whether the feature space can be
regularized accordingly. Otherwise, the previous problem of divergent features would
occur again.
The fourth study provides a well-regularized feature space as a result of the semi-supervised procedure. However, the feature space is derived by employing only two
classes during training and does not explicitly aim at the detection of localized neck-
ing. Rather than using only the extreme cases, the homogeneous phase could be
extended by additional certain instances which are located near but still before the
beginning of the localized necking. Consequently, application of the triplet loss func-
tion would divide the latent space differently [Schr 15]. Especially when using three
classes, e.g., the beginning of the homogeneous phase, the end of the homogeneous
phase and the end of the inhomogeneous phase, an incremental procedure with soft
assignment and hard case mining would be advantageous, so that all data might be
included during training.
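The triplet loss referred to here can be sketched as follows (the margin value is an illustrative choice):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss pulling the positive closer to the anchor than the negative."""
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance to same class
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance to other class
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # same class, nearby
n = np.array([3.0, 0.0])   # different class, far away
print(triplet_loss(a, p, n))  # → 0.0 (margin constraint already satisfied)
```

The loss only becomes positive when the negative intrudes within the margin, which is what drives the latent space to separate the three proposed classes.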
Since all proposed deep learning methods are based on the VGG16 architecture and
only use the first three convolutional blocks, it would be generally interesting to inves-
tigate whether and to what extent the results would deviate by using deeper models
or whether they behave consistently [He 16].
In general, the temporal aspect has been neglected with the assumption that local necking occurs spontaneously. Specifically for brittle materials, this hypothesis does not seem appropriate. Although a spontaneous increase in strain together with material failure occurs as well, visual saturation in combination with only slightly increasing strain values tends to emerge earlier, at least for AA5182. Therefore, it
seems to be reasonable, rather than focusing on a spontaneous change between two or
a few images, to consider and evaluate a longer time-frame of material behavior (20-
30 frames). This would again enable the determination of several forming phases,
whereby these most likely would again possess a smooth transition phase between
neighboring classes. Rather than determining the maximum value on the images just
before fracture, which seems to be too optimistic, the transition from one material
behavioral condition to the next should be used, which will be more conservative,
but more consistent. Consequently, it would be reasonable to use recurrent neural
networks for this purpose, whereby the behavioral conditions could be assessed using
cluster procedures [Hoch 97]. A comparable approach would also address the open
question of the exact extent of local necking and how it evolves [Greg 15]. In partic-
ular, an adaptation of the algorithm in temporal instead of spatial dimension would
be necessary.
A completely alternative solution could be realized with variational autoencoders
[Hou 17]. They directly estimate the distribution in the feature space, so that the
entire forming process, as well as additional variations, can be simulated and sam-
pled, thus providing unlimited training data and consequently might improve cluster
results.
CHAPTER 11

Summary
The automotive industry is currently facing many challenges. In addition to the
general move towards electromobility, more stringent legal regulations are being in-
troduced for combustion engines, which require a 37 percent reduction of the emission
limits by 2021. In order to reach this goal, it is necessary to reduce the weight of
vehicles. In this context, the focus for light-weight construction is generally the se-
lection of the appropriate material for each component in the vehicle. This requires a
correct determination of the FLCs for each material, in order to provide an optimal
selection. Yet, current state-of-the-art methods have multiple limitations that lead
to considerably different FLCs for identical materials. In this thesis, several contri-
butions are presented for the determination of the FLC to address current limitations
while introducing machine learning methods for the first time.
Chapter 2 introduces the essential fundamentals of forming technology in the area of
sheet metal forming and the determination of the FLC. Especially the tensile test is
of particular interest, since this experimental setup is suitable for the illustration of
different forming phases, as well as for the determination of material characteristics.
The forming capacity is usually determined experimentally using a Nakajima setup that uses a stereo camera system in combination with a DIC algorithm to calculate
local strains that are used for the generation of the FLC. The FLC represents the
limit strains in the form of major and minor strain pairs for different loading con-
ditions and thus defines the forming limits for a material that must be maintained
in order to manufacture defect-free components. A precise determination of this
curve is therefore crucial in order to exploit the forming capacity of the material and
simultaneously achieve the maximum reduction in material consumption. Current
state-of-the-art methods such as the cross-section or line-fit method rely on heuris-
tics and show particular weaknesses for light-weight materials while being limited
with respect to their evaluation areas and additionally possess a temporal depen-
dency. Consequently, a large safety margin is required within process planning to
guarantee defect-free components.
This thesis introduces machine learning methods in the field of sheet metal forming
and the determination of the FLC by exploiting 2D strain distributions. Chapter 3
presents the fundamentals of machine learning in accordance with the pattern re-
cognition pipeline. This pipeline comprises four sequential steps: data acquisition,
preprocessing, feature extraction and classification. The initial step of the pipeline
is the acquisition of a signal, for example image data. Subsequently, preprocessing
steps may be performed to facilitate further processing. Within the feature extraction
step, the signal is encoded into smaller representations, so that the contained infor-
mation is concise and yet describes the original signal, ideally without losing relevant


information. For this process step, different features are presented, which potentially
emphasize the specific behaviors on the individual frames of the forming sequences.
These comprise features that are suitable to highlight areas with the same strain
behavior (LBP) and features which are appropriate to identify spontaneous mate-
rial changes based on gradients and their orientation (HoG). These features are used
within the classification step to learn a decision boundary for samples of different
classes in order to correctly assign class memberships to future data samples. Funda-
mentally, a differentiation is made between supervised and unsupervised classification
procedures. Supervised classification methods require manually annotated data sam-
ples, whereas unsupervised classification algorithms learn a decision boundary purely
data-driven. Several classification methods are presented, on the one hand, RF as
part of supervised learning and on the other hand, O-SVM and cluster procedures
such as GMM and SMM as part of unsupervised learning. In addition to the tra-
ditional methods of machine learning, this chapter introduces the fundamentals of
deep learning. Generally, DL still follows the pattern recognition pipeline, whereas
the limitation of the expert-based selection of the extracted features is avoided. Al-
ternatively, DL combines the feature extraction and classification step by optimizing
a problem specific optimization function to generate features that are derived com-
pletely data-driven and optimally adapted to the problem. In order to understand
this technology, the necessary concepts of feed-forward networks, network optimiza-
tion, stochastic gradient descent and back-propagation are introduced. Furthermore,
the building blocks of neural networks, such as fully connected, pooling and convolu-
tional layers, are presented in addition to well-established network architectures like
VGG16 or autoencoders. Moreover, this chapter presents evaluation metrics as for
example, confusion matrices or ROC analyses that are applicable for the evaluation
of traditional machine learning and DL methods.
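The ROC analysis mentioned above summarizes a classifier's ranking quality in a single number; the AUC can be computed from the Mann-Whitney rank statistic (a minimal NumPy sketch, with illustrative scores):

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC = probability that a random positive is ranked above a random negative."""
    scores, labels = np.asarray(scores, float), np.asarray(labels, bool)
    pos, neg = scores[labels], scores[~labels]
    # Count positive/negative pairs where the positive scores higher; ties count half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))  # → 1.0 (perfect separation)
```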
Chapter 4 introduces the materials, their specific properties and characteristics, as
well as the FLCs determined by state-of-the-art methods. Overall, four different ma-
terials are investigated: a ductile DX54D steel with two different sheet thicknesses; a
dual-phase steel DP800, which is used for structural components; a light-weight alu-
minum alloy AA6014 used for car-body structures and an aluminum alloy AA5182
that develops multiple maxima in the form of shear bands with PLC effects. A pro-
cedure to calculate the minimum sampling rate according to the Nyquist-Shannon
theorem is presented and additionally, the process parameters such as the sheet thick-
ness or the sampling frequencies used per material are outlined. Furthermore, the
image database used for the following studies is introduced in this chapter, so that the
forming phases of the materials are contrasted with strain distributions for the differ-
ent loading conditions in order to emphasize the individual characteristic differences.
By means of strain distributions, the four failure classes, homogeneous forming, dif-
fuse necking, local necking and crack are presented and illustrated with exemplary
strain distributions together with the annotation guidelines that were used by the
experts for the data annotation procedure.
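The minimum sampling rate mentioned above follows the Nyquist-Shannon criterion, f_s ≥ 2 f_max; a trivial sketch (the example frequency is illustrative, not one of the process parameters from the thesis):

```python
def min_sampling_rate(f_max_hz):
    """Nyquist-Shannon: sample at least twice the highest frequency of interest."""
    return 2.0 * f_max_hz

# E.g., if the fastest relevant strain change corresponds to 20 Hz,
# the cameras must record at 40 frames per second or more.
print(min_sampling_rate(20.0))  # → 40.0
```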
Fundamentally, the aim is to avoid the shortcomings of existing approaches and to
develop a method that can be used independently of the material class. For this
purpose, different machine learning methods are presented, which build upon each
other while gradually reducing the necessary prior knowledge. The proposed methods

consist of supervised, weakly supervised and unsupervised concepts and are outlined
in the following.
Chapter 5 covers a supervised classification approach to determine the FLC. In order
to extend the FLC with additional failure conditions, the database was annotated by
five experts to derive the ground truth for the introduced four failure classes. In ad-
dition to the evaluation of the classification algorithm, it is consequently also feasible
to assess the quality of the ground truth annotations by means of inter-rater relia-
bility with respect to the process parameters and the material class. The proposed
method follows the pattern recognition pipeline and utilizes HoG as well as LBP in
the feature extraction step. A RF is used as classifier, while the classification exper-
iments are restricted to the individual materials. Additionally, different evaluation
areas are examined to determine whether an evaluation strategy with multiple eval-
uation regions has advantages over a single large area of evaluation. The position of
the evaluation area is determined in dependence of the maximum strain value before
crack occurrence, such that only the relevant region is assessed. The results reveal
that for uniaxial to plane strain loading conditions gradient-based features with mul-
tiple small evaluation ranges are advantageous, since on average, an AUC of 0.91 to
0.97 is achieved for all classes. For the biaxial loading conditions, however, a larger
evaluation area is advantageous, whereby a lower AUC of 0.72 to 0.92 is obtained.
In the separate examination of the classification results, it becomes apparent that
in particular good consistencies are observed in the local necking and crack classes.
Hence, occurring errors are mainly associated with the difficult distinction between
the homogeneous and the diffuse necking class. This is also supported by the inter-
rater reliability. Especially low coincidences for ductile materials are observed when
a high sampling rate is used, which impedes differentiation between the individual
classes. In general, the study demonstrated that expert knowledge specifically for the
local necking class, can be leveraged for the development of a classification approach
for different material classes.
Chapter 6 covers an unsupervised classification approach to eliminate the dependence
on expert knowledge and annotations, while focusing exclusively on the detection of
local necking, since this class possesses a higher relevance in forming processes. In or-
der to highlight differences between successive images, the time derivative of the video
sequences is employed, as the local necking is generally understood as a spontaneous
change in sheet metal thickness direction. Consequently, local necking recognition
can also be considered as an anomaly detection problem. To facilitate comparability
with the line-fit method, only the last 4 mm of the punch movement of the forming
sequences are considered. Of these 4 mm, the first 2 mm describe the homogeneous
forming phase and serve as the training dataset. The remaining 2 mm serve as the
test dataset that will include the emerging local necking. For each loading condition, three forming sequences exist that are evaluated simultaneously to derive a combined assessment of the necking condition. Based on the findings of the first study, gradient-based features (HoG) are used in combination with several small evaluation
areas. Once again, these regions are determined in relation to the extreme values, so
that only relevant regions containing the necking effect are used for evaluation. An
O-SVM is used as classifier that defines a hypersphere as decision boundary within
feature space and covers the entire homogeneous forming phase. This decision bound-

ary is then assessed by means of the test data so that a transition region between the
homogeneous and inhomogeneous forming phase can be determined by consideration
of a GMM and time information. The transition interval is utilized to generate a
probabilistic FLC that returns failure quantiles, potentially allowing a lower safety
margin. A high level of agreement with the line-fit method is obtained, whereby more
conservative FLC candidates can be selected based on the individual failure quantiles. Additionally, the results are verified using sequences that were not deformed
up to fracture but examined with metallographic investigations.
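The one-class SVM step can be sketched with scikit-learn (kernel and ν are illustrative choices, and the actual HoG features are replaced by synthetic data here):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Synthetic stand-in for HoG features of the homogeneous forming phase.
train = rng.normal(0.0, 0.1, (100, 2))

# Fit a boundary that encloses the homogeneous phase in feature space.
ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(train)

# Samples far outside the learned boundary are flagged as anomalies (-1).
test = np.array([[0.0, 0.0], [5.0, 5.0]])
print(ocsvm.predict(test))  # inlier/outlier labels in {+1, -1}
```

Frames of the test data classified as outliers correspond to the emerging inhomogeneous forming phase.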
Chapter 7 proposes an unsupervised deep learning approach that assesses a centered
evaluation area, thereby avoiding dependence on the spatial determination of extreme
values. Overall, it comprises two steps. In the first step, optimal features are learned
by means of an autoencoder that potentially provides superior features in terms of
adaptation to the individual forming sequences and materials. The second step again
involves a clustering procedure based on a GMM that omits the intermediate step of
anomaly detection based on an O-SVM. Ideally, it is therefore possible to differentiate
between three different forming phases, i.e., homogeneous, transition and inhomogeneous forming phase. Unfortunately, for several loading conditions, no consistent
feature space could be generated. This means that the individual forming sequences
begin at the same point and hence share a cluster, but often end at divergent points
in the feature space. Consequently, the clustering procedure fails and it was not pos-
sible to generate probabilistic FLCs.
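The feature-learning idea behind this chapter can be sketched with a linear autoencoder trained by plain gradient descent on synthetic data. This is an illustrative reduction, not the convolutional architecture used in Chapter 7: the data, dimensions, tied weights, and learning rate below are all assumptions chosen so the example stays self-contained.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for flattened strain images: 100 samples in 16 dimensions that
# actually live on a 2-D subspace, which a linear autoencoder can recover.
latent = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 16))
X = latent @ mixing + 0.01 * rng.normal(size=(100, 16))

# Linear autoencoder with tied weights: encode z = x W, decode x_hat = z W^T.
W = 0.1 * rng.normal(size=(16, 2))
lr = 0.005
for _ in range(1000):
    Z = X @ W                 # encoder: project onto the 2-D bottleneck
    X_hat = Z @ W.T           # decoder: reconstruct the input
    err = X_hat - X           # reconstruction error
    # Gradient of the mean squared reconstruction error w.r.t. the tied weights.
    grad = (X.T @ err @ W + err.T @ X @ W) / len(X)
    W -= lr * grad

recon_mse = float(np.mean((X - (X @ W) @ W.T) ** 2))
```

After training, the bottleneck activations `X @ W` are the learned features on which the subsequent GMM clustering would operate; the failure mode described above corresponds to such features ending at divergent points for repetitions of the same loading condition.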
Chapter 8 covers a weakly supervised deep learning approach that again comprises
two steps. In the first step, optimal features are learned, this time by employing
a supervised classification approach. Naively, it uses up to 30 images from the be-
ginning of the homogeneous forming phase and up to five images from the end of the
inhomogeneous forming phase close to material failure. Based on a Siamese network
topology, representative features are learned for these two classes and at the same time
optimally separated from each other. Furthermore, by this annotation of the extreme
regions of the sequences, it is guaranteed that the sequences of the same loading con-
dition begin and end at similar points within the feature space. The second step again
involves a clustering procedure, this time using an SMM, which yields more robust
outputs in the presence of outliers. Consequently, an assignment of the individual frames
to the respective clusters is made possible, so that a probabilistic assessment of the
cluster membership and thus the failure class is realized. Even though the features
have only been learned using two classes, a three-class classification is obtained using
the SMM. Intuitively, this is reasonable, since without spontaneous failure, it is not
possible to reach the inhomogeneous forming phase without passing the transition
phase. In contrast to Chapter 6, this approach is completely independent of time, so
that failure probabilities are obtainable for strain paths. Consequently, this method
enables validation by means of sequences that have not been deformed to fracture
but investigated by metallographic evaluations. Overall, two probabilistic FLCs are
generated, which represent the beginning of the transition as well as the onset of
the localized necking phase. Additionally, an interpretation of the activation maps is
carried out in order to approximate the extent of localized necking.
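The pairwise training objective of the Siamese topology is the contrastive loss of Hadsell et al., which pulls same-phase pairs together and pushes different-phase pairs apart beyond a margin. The sketch below evaluates it on hand-picked toy embeddings; the arrays and the margin value are illustrative assumptions, not learned features from the thesis experiments.

```python
import numpy as np

def contrastive_loss(z1, z2, same, margin=1.0):
    """Contrastive loss (Hadsell et al.): same-phase pairs are attracted,
    different-phase pairs are repelled until they exceed `margin`."""
    d = np.linalg.norm(z1 - z2, axis=1)                 # Euclidean distance per pair
    pos = same * d ** 2                                 # attract same-phase pairs
    neg = (1 - same) * np.maximum(margin - d, 0) ** 2   # repel different-phase pairs
    return 0.5 * np.mean(pos + neg)

# Embeddings for three pairs: two same-phase pairs and one cross-phase pair.
z1 = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
z2 = np.array([[0.0, 0.1], [1.0, 1.0], [3.0, 4.0]])
same = np.array([1, 1, 0])   # 1 = same forming phase, 0 = different phases

loss = contrastive_loss(z1, z2, same)
```

Note that the cross-phase pair at distance 5 already lies beyond the margin and contributes nothing, whereas a cross-phase pair closer than the margin would be penalized; this is the mechanism that separates the homogeneous and inhomogeneous classes in the feature space.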
Chapter 9 introduces a deep learning-based segmentation approach that, similar to
the previous method, provides a temporal determination of the onset of necking on
the image scale. Additionally, a spatial determination on the pixel scale, precisely
localizing the necking region, is realized. Basically, it uses the same
setup as the previous method and therefore again only uses up to 30 images from the
homogeneous forming phase and up to five images from the inhomogeneous forming
phase. Additionally, segmentation masks are provided for these images that use a
99% threshold of the maximum thinning, such that the top 1% coarsely segments
the necking region. The same Siamese network topology of Chapter 8 is reused for
training, while an additional decoding path is added to realize the segmentations.
Evaluation on the image scale using the same SMM clustering approach confirmed the
previously derived FLCs, whereas the results on the pixel scale suggest that a lower
failure quantile (< 50%) should be considered in process design.
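The mask-generation rule can be sketched directly. The thinning map below is a synthetic stand-in for a measured ε3 distribution (an illustrative assumption, not real Nakajima data); reading the 99% threshold as a fraction of the maximum thinning, only the most strongly thinned pixels are marked as the coarse necking region.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy thinning map: mild background thinning plus a vertical band of
# localized necking, mimicking a Nakajima strain distribution.
thinning = 0.05 + 0.01 * rng.random((64, 64))
thinning[:, 30:34] += 0.20          # necking band with strong thinning

# Coarse segmentation mask: threshold at 99% of the maximum thinning, so
# only the top 1% of the value range is labeled as the necking region.
mask = thinning >= 0.99 * thinning.max()
```

Such coarse masks serve as weak segmentation labels for the decoding path added to the Siamese network, which then learns a much more precise delineation of the necking region than the threshold itself provides.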
List of Acronyms
AUC
Area Under ROC Curve 58, 74, 77, 79–82, 159

CART
Classification and Regression Trees 34

CC
Cross-Correlation 18

CNN
Convolutional Neural Network 5, 53, 55, 114, 118, 119, 125, 140, 141, 153

COCO
Common Objects in Context 46, 49

Convolutional Layers
Convolutional Layers 53, 54

DIC
Digital Image Correlation 2, 6, 16, 17, 20–22, 61, 66, 68, 71, 74, 92, 107, 139,
157

DL
Deep Learning 27, 34, 46–50, 52, 111, 117, 158

EM
Expectation-Maximization 43, 44, 95

FC
Fully Connected 53, 55, 56

FEA
Finite Element Analysis 1, 10

FLC
Forming Limit Curve 1–4, 9, 14–17, 22, 23, 25, 26, 61–64, 66, 68, 71, 73, 76,
85–87, 89–92, 94, 98–102, 104–107, 109–111, 115, 117, 122–128, 139, 140, 146,
150, 151, 153, 157–161, 177


FPR
False Positive Rate 58

GMM
Gaussian Mixture Models 41–44, 95–97, 100, 104, 106, 111, 112, 115, 121, 127,
139, 140, 158, 160

GPU
Graphic Processing Unit 46

GUI
Graphical User Interface 61, 71, 88

HoG
Histogram of Oriented Gradients 32, 33, 75, 79, 80, 85, 89, 91–93, 100, 103, 104,
107–109, 111, 116, 127, 139, 153, 158, 159

ILSVRC
ImageNet Large Scale Visual Recognition Challenge 46, 55

KKT
Karush-Kuhn-Tucker 39

LBP
Local Binary Patterns 30–32, 75, 79, 80, 89, 158, 159

LBPriu
Rotation Independent Uniform Local Binary Patterns 32, 75

LBPu
Uniform Local Binary Patterns 31, 32, 75

LOMO-CV
Leave-One-Material-Out Cross-Validation 122, 124–128, 140, 146, 175

LOSO-CV
Leave-One-Sequence-Out Cross-Validation 77, 113, 123, 125–127, 175

MLP
Multilayer Perceptron 47, 48, 50, 53

NCC
Normalized Cross-Correlation 18

NSSD
Normalized Squared Sum of Differences 18

O-SVM
One-class Support Vector Machine 40, 41, 91–94, 97, 109, 127, 139, 140, 153,
158–160

PASCAL
Pascal Visual Object Classes Challenge 46, 49

PCA
Principal Component Analysis 112, 121, 124, 139, 140, 153

PLC
Portevin-Le Chatelier 63, 68, 117, 158

Pooling Layers
Pooling Layers 53, 54

RELU
Rectified Linear Unit 46, 51–53, 55, 112

RF
Random Forest 34, 36, 76, 77, 83, 153, 158, 159

ROC
Receiver Operating Characteristic 58, 59, 158

SAD
Sum of Absolute Differences 18

SGD
Stochastic Gradient Descent 49

SMM
Student's t Mixture Models 41, 43, 44, 121, 124, 140, 150, 158, 160, 161

SSD
Squared Sum of Differences 18

SVDD
Support Vector Data Description 40, 41

SVM
Support Vector Machine 37, 39, 40, 46, 93, 98

TPR
True Positive Rate 58

ZNCC
Zero-Normalized Cross-Correlation 18, 19

ZNSSD
Zero-Normalized Squared Sum of Differences 18
List of Symbols
A0 Initial cross-sectional area.

C Soft margin SVM trade-off parameter.

CC Correlation criteria.

D Dimension of the sample.

E Modulus of elasticity.

F Force.

Gx Horizontal gradients of the image.

Gy Vertical gradients of the image.

Γ Gamma function.

G Gini impurity.

H Shannon entropy.

Hxs Sobel filter operator in the x-direction.

Hys Sobel filter operator in the y-direction.

I Image.

Imagn. Gradient magnitude image.

C Cost function.

J Objective function.

K Strength coefficient.

L Lagrangian function.

L̃ Dual representation of the Lagrangian function.

Λ Precision of the Student t distribution.

Le Error function.

N Normal distribution.

Np Neighborhood size of an LBP.


Nr Number of raters.

Ns Number of samples.

Nt Number of trees.

P Center coordinate of the reference subset.

P∗ New position of the center coordinate of the reference subset.

Q Arbitrary point in the reference subset.

Q∗ Arbitrary point in the reference subset after deformation.

S Subset before split.

Sl Left children subset.

SlL Left children subset of the left children.

SlR Right children subset of the left children.

St Student t distribution.

Θ Orientation of gradients.

X Set of feature vectors.

aj Activation of layer j.

α Lagrangian multiplier.

b Bias parameter.

d0 Initial circular diameter.

d1 Semi-major axis of an ellipse.

d2 Semi-minor axis of an ellipse.

δj Partial error of layer j.

δk Kronecker delta function.

e Engineering strain.

ε Strain.

ε1 Major strain.

ε2 Minor strain.

ε3 Thinning.

εxx Normal strain of the Green-Lagrange strain tensor in the x-direction.



εxy Shear component of the Green-Lagrange strain tensor.

εyy Normal strain of the Green-Lagrange strain tensor in the y-direction.

η Learning rate.

fI Image as a function.

fm Ensemble average of the reference subset.

fror Bitwise rotate right function.

fs Gray-level reference subset of the undeformed image.

f θd Mapping function parametrized by the model parameters of the decoder.

f θe Mapping function parametrized by the model parameters of the encoder.

γ Responsibility or posterior probability.

gm Ensemble average of the target subset.

gs Gray-level reference subset of the target image.

gv Gray-level intensity value.

hD Decision function.

h Activation function.

hI Relative frequency of the intensity value.

ι Degrees of freedom of the Student t distribution.

k Kernel function.

κc Cohen’s kappa.

κf Fleiss kappa.

l0 Initial length of the object.

l Current length of the object.

lf Complete length of the object.

∆l Elongation of the initial length.

λ Eigenvalue.

µ Mean value.

nH Hardening coefficient.

nij Number of raters assigning sample i to category j.



ν Trade-off parameter of the O-SVM.

p Density distribution.

p̂ Empirical distribution.

pej Proportion of assignments to category j.

p∗i Extent of agreement for a sample i.

pe Hypothetical probability of the agreement by chance.

φ Sampling function of features.

πk Weighting factor of Gaussian k.

π̂k Estimated weighting factor of Gaussian k.

po Observed agreement among two raters.

p̄o Observed accuracy among multiple raters.

ψ Geometric primitive.

pt Probabilistic leaf predictor model of a tree.

r Radius of an LBP.

rb Binary function to determine the cluster membership.

rL Lankford coefficient.

ρ Bias parameter of the O-SVM.

σ Stress.

σe Engineering stress.

σs Standard deviation.

σt True stress.

t0 Initial thickness of the object.

t Current thickness of the object.

τ Threshold for binary split.

u Translation u in the x-direction.

v Translation v in the y-direction.

w0 Initial width of the object.

w Current width of the object.



wl Window to the left of the discontinuity.

wr Window to the right of the discontinuity.

ξ Slack variable of the SVM.

y Class label.

ŷ Predicted class label.

zj Aggregated parameters of layer j.

b Bias vector.

µk Cluster center of cluster k.

µ̂k Estimated cluster center of cluster k.

p Displacement mapping parameters.

θ Model parameters.

θ∗ Optimal model parameters.

θd Model parameters of the decoder.

θ̂d Estimated model parameters of the decoder.

θe Model parameters of the encoder.

θ̂e Estimated model parameters of the encoder.

w Weights.

x Feature vector.

xI Input image.

x̂I Estimated input image.

xb Feature vector at the bottleneck layer.

y Output vector.

yI Ground truth segmentation.

ŷ I Predicted segmentation.

H Hessian matrix.

Σ Covariance matrix.

Σ̂ Estimated covariance matrix.

W Weights as matrix.
List of Figures
1.1 Supervised method overview . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Unsupervised method overview . . . . . . . . . . . . . . . . . . . . . 4
1.3 Weakly supervised method overview . . . . . . . . . . . . . . . . . . . 5
1.4 Thesis outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.1 Tensile test scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10


2.2 Specimen development during tensile tests . . . . . . . . . . . . . . . 11
2.3 Stress-strain diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 Prepared specimen with speckle pattern . . . . . . . . . . . . . . . . 13
2.5 Anisotropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.6 Forming limit diagram . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.7 Safety margin of forming limit curves . . . . . . . . . . . . . . . . . . 16
2.8 Nakajima test setup . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.9 Example of a transition of a DIC subset . . . . . . . . . . . . . . . . 18
2.10 Deformation example of a DIC subset . . . . . . . . . . . . . . . . . . 19
2.11 Principal strains (ε1 –ε3 ) examples . . . . . . . . . . . . . . . . . . . . 21
2.12 Evaluation areas of state-of-the-art methods . . . . . . . . . . . . . . 23
2.13 Location-dependent method . . . . . . . . . . . . . . . . . . . . . . . 24
2.14 Time-dependent method . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.15 FLC comparison of state-of-the-art methods . . . . . . . . . . . . . . 25

3.1 Pattern recognition pipeline . . . . . . . . . . . . . . . . . . . . . . . 28


3.2 Strain distribution and edge response . . . . . . . . . . . . . . . . . . 30
3.3 LBP example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.4 LBP neighborhoods and radii . . . . . . . . . . . . . . . . . . . . . . 31
3.5 Uniform LBP variants . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.6 Structures represented by LBPs . . . . . . . . . . . . . . . . . . . . . 32
3.7 Different examples of LBP visualizations . . . . . . . . . . . . . . . . 32
3.8 Derivation of HoG . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.9 HoG example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.10 Random Forest generation . . . . . . . . . . . . . . . . . . . . . . . . 35
3.11 Information gain as split criteria . . . . . . . . . . . . . . . . . . . . . 36
3.12 Separation hyperplane of an SVM . . . . . . . . . . . . . . . . . . . . 37
3.13 Separation hyperplane of an O-SVM . . . . . . . . . . . . . . . . . . 41
3.14 Comparison of K-means, GMM and SMM . . . . . . . . . . . . . . . 43
3.15 Student t distribution with varying degree of freedom . . . . . . . . . 45
3.16 Student t distribution with respect to outliers . . . . . . . . . . . . . 45
3.17 Computational neuron and MLP structure . . . . . . . . . . . . . . . 48
3.18 Activation functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.19 Convolution and pooling layer . . . . . . . . . . . . . . . . . . . . . . 54


3.20 VGG16 architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . 56


3.21 Autoencoder architecture . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.22 ROC curve example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

4.1 DX54D FLCs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62


4.2 DP800 FLC & AA6014 FLC . . . . . . . . . . . . . . . . . . . . . . . 63
4.3 AA5182 FLC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.4 Original signal and power density spectrum . . . . . . . . . . . . . . 65
4.5 Defined forming classes as strain distributions . . . . . . . . . . . . . 68
4.6 Overview of strain progressions for the uniaxial loading condition . . 69
4.7 Overview of localized necking examples . . . . . . . . . . . . . . . . . 70
4.8 Signal impairments examples . . . . . . . . . . . . . . . . . . . . . . . 71
4.9 Annotation software . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

5.1 Feature visualizations overview . . . . . . . . . . . . . . . . . . . . . 75


5.2 Evaluation areas of the supervised approach . . . . . . . . . . . . . . 78
5.3 Expert distributions per geometry . . . . . . . . . . . . . . . . . . . . 79
5.4 Average AUC for the uniaxial to plane strain loading condition . . . . 81
5.5 Average AUC for the biaxial loading condition . . . . . . . . . . . . . 82
5.6 Probability and feature progression for DP800-S060-2 . . . . . . . . . 84
5.7 Derived FLCs using expert annotations and the supervised method . 86

6.1 Evaluation areas of the unsupervised approach . . . . . . . . . . . . . 93


6.2 Overview of the unsupervised approach . . . . . . . . . . . . . . . . . 96
6.3 Decision boundaries of the GMM . . . . . . . . . . . . . . . . . . . . 97
6.4 Deterministic FLC in comparison to experts and line-fit method . . . 99
6.5 Probabilistic FLC in comparison to experts and line-fit method . . . 101
6.6 Z-displacement progression in relation to cross-sections and strain dis-
tributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.7 Area progression in relation to feature progression . . . . . . . . . . . 103
6.8 Probability progression of quantiles with respect to the strain devel-
opment of individual time steps . . . . . . . . . . . . . . . . . . . . . 105
6.9 Comparison of the deterministic and probabilistic FLC . . . . . . . . 106
6.10 Probabilistic FLC with strain paths of metallography experiments . . 107
6.11 Influencing factors with respect to the strain distributions . . . . . . 108
6.12 Influencing factors with respect to the specimen orientation . . . . . . 108
6.13 Impacts of signal impairments on the probability progression curves . 109

7.1 Visualization of the cluster part of the unsupervised method . . . . . 113


7.2 Original input image with its reconstruction and cross-sections . . . . 114
7.3 Failure examples of the autoencoder approach . . . . . . . . . . . . . 115
7.4 Confidence scores using learned features . . . . . . . . . . . . . . . . 116

8.1 Evaluation area of the weakly supervised approach . . . . . . . . . . 119


8.2 Overview of the weakly supervised approach . . . . . . . . . . . . . . 120
8.3 Visualization of the cluster part of the weakly supervised method . . 122
8.4 Comparison of samples from the homogeneous and localization phase 123
8.5 FLC candidates of the weakly supervised approach . . . . . . . . . . 126

8.6 Differences between the LOMO-CV and LOSO-CV experiment results 127
8.7 FLC candidates for AA5182 . . . . . . . . . . . . . . . . . . . . . . . 127
8.8 Comparison of the weakly supervised with state-of-the-art results . . 128
8.9 Class affiliations of DP800-S245 . . . . . . . . . . . . . . . . . . . . . 129
8.10 Color-coded probability progressions and strain paths of incompletely
formed specimen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
8.11 Network activations of the homogeneous forming phase at stage 10 . . 132
8.12 Network activations of the homogeneous forming phase at stage 36 . . 132
8.13 Network activations of the transition forming phase at stage 46 . . . . 133
8.14 Network activations of the localized forming phase at stage 48 . . . . 133
8.15 Network activations of the localized forming phase at stage 52 . . . . 134
8.16 Network activations of the localized forming phase of S245 . . . . . . 135
8.17 Comparison with supervised Grad-cam activations: homogeneous phase
at stage 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
8.18 Comparison with supervised Grad-cam activations: homogeneous phase
at stage 36 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
8.19 Comparison with supervised Grad-cam activations: transition phase
at stage 46 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
8.20 Comparison with supervised Grad-cam activations: localization phase
at stage 48 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.21 Comparison with supervised Grad-cam activations: localization phase
at stage 52 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.22 Comparison of the FLC candidates between the supervised and weakly
supervised method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

9.1 Difference images with corresponding segmentation masks . . . . . . 145


9.2 Segmentation results of localized necking . . . . . . . . . . . . . . . . 147
9.3 X-z representations of the strain difference images and predicted seg-
mentations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
9.4 Detailed comparison of x-z representations and segmentation results . 149
9.5 Comparison of FLC candidates and the impact of outliers . . . . . . . 151
9.6 Comparison of the weakly supervised classification with the weakly
supervised segmentation method . . . . . . . . . . . . . . . . . . . . . 152
List of Tables
2.1 Parameters with influence on the position of the FLC. . . . . . . . . . 16

3.1 Confusion matrix of a two-class prediction. . . . . . . . . . . . . . . . 58

4.1 Material properties of the investigated materials. . . . . . . . . . . . . 64


4.2 Specimen geometries per material with process parameters. . . . . . . 66

5.1 Images per failure class . . . . . . . . . . . . . . . . . . . . . . . . . . 77


5.2 Inter-Rater-Reliability in terms of Fleiss-Kappa statistics. . . . . . . . 80
5.3 Confusion matrices of the best performing feature per material . . . . 83
5.4 Average error per sequence and class . . . . . . . . . . . . . . . . . . 84

8.1 Image database per material. . . . . . . . . . . . . . . . . . . . . . . 118

Bibliography
[Abad 16] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin,
S. Ghemawat, G. Irving, and M. Isard. “Tensorflow: a System for Large-
scale Machine Learning”. In: Proceedings of the Symposium on Operating
Systems Design and Implementation, pp. 265–283, 2016.
[Affr 17] E. Affronti and M. Merklein. “Metallographic Analysis of Nakajima Tests
for the Evaluation of the Failure Developments”. Procedia Engineering,
Vol. 183, pp. 83–88, 2017.
[Affr 18] E. Affronti, C. Jaremenko, M. Merklein, and A. Maier. “Analysis of Form-
ing Limits in Sheet Metal Forming with Pattern Recognition Methods.
Part 1: Characterization of Onset of Necking and Expert Evaluation”.
Materials, Vol. 11, No. 9, 2018.
[Agga 12] N. Aggarwal. “First and Second Order Statistics Features for Classi-
fication of Magnetic Resonance Brain Images”. Journal of Signal and
Information Processing, Vol. 3, No. 2, pp. 146–153, 2012.
[Ahon 06] T. Ahonen, A. Hadid, and M. Pietikäinen. “Face Description with Local
Binary Patterns: Application to Face Recognition”. IEEE Transactions
on Pattern Analysis and Machine Intelligence, Vol. 28, No. 12, pp. 2037–
41, 2006.
[Alta 12] T. Altan and A. E. Tekkaya. Sheet Metal Forming: Fundamentals. ASM
International, Materials Park, 2012.
[Anut 70] P. E. Anuta. “Spatial Registration of Multispectral and Multitemporal
Digital Imagery Using Fast Fourier Transform Techniques”. IEEE Trans-
actions on Geoscience Electronics, Vol. 8, No. 4, pp. 353–368, 1970.
[Aubr 17] M. Aubreville, C. Knipfer, N. Oetter, C. Jaremenko, E. Rodner, J. Den-
zler, C. Bohr, H. Neumann, F. Stelzle, and A. Maier. “Automatic Classi-
fication of Cancerous Tissue in Laserendomicroscopy Images of the Oral
Cavity using Deep Learning”. Scientific Reports, Vol. 7, No. 1, p. 11979,
2017.
[Aubr 18] M. Aubreville, K. Ehrensperger, A. Maier, T. Rosenkranz, B. Graf, and
H. Puder. “Deep Denoising for Hearing Aid Applications”. In: Interna-
tional Workshop on Acoustic Signal Enhancement (IWAENC), pp. 361–
365, IEEE, 2018.
[Bana 10] D. Banabic. Sheet Metal Forming Processes: Constitutive Modelling and
Numerical Simulation. Springer-Verlag, Berlin, Heidelberg, 2010.
[Beng 09] Y. Bengio. “Learning Deep Architectures for AI”. Foundations and Trends
in Machine Learning, Vol. 2, No. 1, pp. 1–127, 2009.
[Beng 13] Y. Bengio, A. Courville, and P. Vincent. “Representation Learning: A
Review and new Perspectives”. IEEE Transactions on Pattern Analysis
and Machine Intelligence, Vol. 35, No. 8, pp. 1798–1828, 2013.


[Bier 18] B. Bier, M. Unberath, J.-N. Zaech, J. Fotouhi, M. Armand, G. Osgood,


N. Navab, and A. Maier. “X-ray-transform Invariant Anatomical Land-
mark Detection for Pelvic Trauma Surgery”. In: Proceedings of the Inter-
national Conference on Medical Image Computing and Computer-Assisted
Intervention (MICCAI), pp. 55–63, Springer, 2018.
[Bish 06] C. M. Bishop. Pattern Recognition and Machine Learning (Information
Science and Statistics). Springer-Verlag, Berlin, Heidelberg, 2006.
[Blab 15] J. Blaber, B. Adair, and A. Antoniou. “Ncorr: Open-source 2D Digital
Image Correlation Matlab Software”. Experimental Mechanics, Vol. 55,
No. 6, pp. 1105–1122, 2015.
[Bock 08] T. Bocklet, A. Maier, J. G. Bauer, F. Burkhardt, and E. Nöth. “Age and
Gender Recognition for Telephone Applications Based on GMM Supervec-
tors and Support Vector Machines”. In: Proceedings of the IEEE Interna-
tional Conference on Acoustics, Speech, and Signal Processing (ICASSP),
pp. 1605–1608, IEEE, 2008.
[Bour 88] H. Bourlard and Y. Kamp. “Auto-association by Multilayer Percep-
trons and Singular Value Decomposition”. Biological Cybernetics, Vol. 59,
No. 4-5, pp. 291–294, 1988.
[Brag 72] A. Bragard, J.-C. Baret, and H. Bonnarens. “Simplified Technique to
Determine the FLD on the Onset of Necking”. C. R. M., No. 33, pp. 53–
63, 1972.
[Brei 01] L. Breiman. “Random Forests”. Machine Learning, Vol. 45, No. 1, pp. 5–
32, 2001.
[Brei 84] L. Breiman, J. Friedman, R. Olshen, and C. J. Stone. Classification and
Regression Trees. Chapman and Hall/CRC, New York, 1984.
[Bruc 89] H. Bruck, S. McNeill, M. A. Sutton, and W. Peters. “Digital Image Corre-
lation using Newton-Raphson Method of Partial Differential Correction”.
Experimental Mechanics, Vol. 29, No. 3, pp. 261–267, 1989.
[Caru 08] R. Caruana, N. Karampatziakis, and A. Yessenalina. “An Empirical Eval-
uation of Supervised Learning in High Dimensions”. In: Proceedings of
the International Conference on Machine Learning (ICML), pp. 96–103,
ACM, 2008.
[Chop 05] S. Chopra, R. Hadsell, and Y. LeCun. “Learning a Similarity Metric
Discriminatively, with Application to Face Verification”. In: Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), pp. 539–546, IEEE, 2005.
[Chu 85] T. Chu, W. Ranson, and M. A. Sutton. “Applications of Digital-Image-
Correlation Techniques to Experimental Mechanics”. Experimental Me-
chanics, Vol. 25, No. 3, pp. 232–244, 1985.
[Cohe 60] J. Cohen. “A Coefficient of Agreement for Nominal Scales”. Educational
and Psychological Measurement, Vol. 20, No. 1, pp. 37–46, 1960.
[Coll 11] R. Collobert, K. Kavukcuoglu, and C. Farabet. “Torch7: A Matlab-
like Environment for Machine Learning”. In: BigLearn, NIPS Workshop,
2011.

[Cort 95] C. Cortes and V. Vapnik. “Support Vector Networks”. Machine learning,
Vol. 20, No. 3, pp. 273–297, 1995.
[Crim 09] A. Criminisi, J. Shotton, and S. Bucciarelli. “Decision Forests with
Long-range Spatial Context for Organ Localization in CT volumes”. In:
Proceedings of the International Conference on Medical Image Comput-
ing and Computer-Assisted Intervention (MICCAI), pp. 69–80, Springer,
2009.
[Crim 12] A. Criminisi, J. Shotton, and E. Konukoglu. “Decision Forests: A Uni-
fied Framework for Classification, Regression, Density Estimation, Mani-
fold Learning and Semi-supervised Learning”. Foundations and Trend in
Computer Graphics and Vision, Vol. 7, No. 2–3, pp. 81–227, 2012.
[Cybe 89] G. Cybenko. “Approximation by Superpositions of a Sigmoidal Function”.
Mathematics of Control, Signals and Systems, Vol. 2, No. 4, pp. 303–314,
1989.
[Dala 05] N. Dalal and B. Triggs. “Histograms of Oriented Gradients for Human
Detection”. In: Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition (CVPR), pp. 886–893, IEEE, 2005.
[Deit 19] S. Deitsch, V. Christlein, S. Berger, C. Buerhop-Lutz, A. K. Maier,
F. Gallwitz, and C. Riess. “Automatic Classification of Defective Pho-
tovoltaic Module Cells in Electroluminescence Images”. Solar Energy,
Vol. 185, pp. 455–468, 2019.
[Demp 77] A. P. Dempster, N. M. Laird, and D. B. Rubin. “Maximum Likelihood
from Incomplete Data via the EM Algorithm”. Journal of the Royal
Statistical Society: Series B, Vol. 39, No. 1, pp. 1–22, 1977.
[Deng 09] J. Deng, W. Dong, R. Socher, L. Li, K. Li, and L. Fei-Fei. “Ima-
geNet: A Large-scale Hierarchical Image Database”. In: Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), pp. 248–255, IEEE, 2009.
[Dice 45] L. R. Dice. “Measures of the Amount of Ecologic Association Between
Species”. Ecology, Vol. 26, No. 3, pp. 297–302, 1945.
[DIN 03a] DIN Deutsches Institut für Normung e.V. “Fertigungsverfahren - Begriffe,
Einteilung”. 2003.
[DIN 03b] DIN Deutsches Institut für Normung e.V. “Fertigungsverfahren Um-
formen - Einordnung; Unterteilung, Begriffe, Alphabetische Übersicht”.
2003.
[DIN 08] DIN Deutsches Institut für Normung e.V. “Metallische Werkstoffe –
Bleche und Bänder - Bestimmung der Grenzformänderungskurve –Teil
2: Bestimmung von Grenzformänderungskurven im Labor”. 2008.
[DIN 16] DIN Deutsches Institut für Normung e.V. “Prüfung metallischer Werk-
stoffe - Zugproben”. 2016.
[Doeg 10] E. Doege and B.-A. Behrens. Handbuch Umformtechnik. Springer-Verlag,
Berlin, Heidelberg, 2 Ed., 2010.
[Doso 16] A. Dosovitskiy and T. Brox. “Generating images with Perceptual Similar-
ity Metrics based on Deep Networks”. In: Advances in Neural Information
Processing Systems, pp. 658–666, 2016.

[Duch 11] J. Duchi, E. Hazan, and Y. Singer. “Adaptive Subgradient Methods


for Online Learning and Stochastic Optimization”. Journal of Machine
Learning Research, Vol. 12, No. Jul, pp. 2121–2159, 2011.

[Duda 00] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley-


Interscience, New York, 2000.

[Dumo 16] V. Dumoulin and F. Visin. “A Guide to Convolution Arithmetic for Deep
Learning”. arXiv preprint arXiv:1603.07285, 2016.

[Ever 15] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn,


and A. Zisserman. “The Pascal Visual Object Classes Challenge: A Ret-
rospective”. International Journal of Computer Vision, Vol. 111, No. 1,
pp. 98–136, Jan. 2015.

[Flei 71] J. L. Fleiss. “Measuring Nominal Scale Agreement Among Many Raters”.
Psychological Bulletin, Vol. 76, No. 5, p. 378, 1971.

[Flei 73] J. L. Fleiss and J. Cohen. “The Equivalence of Weighted Kappa and the
Intraclass Correlation Coefficient as Measures of Reliability”. Educational
and Psychological Measurement, Vol. 33, No. 3, pp. 613–619, 1973.

[Gero 09] D. Gerogiannis, C. Nikou, and A. Likas. “The Mixtures of Student’s t-


Distributions as a Robust Framework for Rigid Registration”. Image and
Vision Computing, Vol. 27, No. 9, pp. 1285–1294, 2009.

[Ghas 17] K. Ghasedi Dizaji, A. Herandi, C. Deng, W. Cai, and H. Huang. “Deep
Clustering via Joint Convolutional Autoencoder Embedding and Rela-
tive Entropy Minimization”. In: Proceedings of the IEEE International
Conference on Computer Vision (ICCV), pp. 5736–5745, IEEE, 2017.

[Good 16] I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. MIT Press,
Cambridge, 2016.

[Good 68] G. M. Goodwin. “Application of Strain Analysis to Sheet Metal Forming


Problems in the Press Shop”. SAE Transactions, pp. 380–387, 1968.

[Greg 15] K. Gregor, I. Danihelka, A. Graves, D. J. Rezende, and D. Wierstra.


“Draw: A Recurrent Neural Network for Image Generation”. arXiv
preprint arXiv:1502.04623, 2015.

[Grot 18] K.-H. Grote, B. Bender, and D. Göhlich. Dubbel: Taschenbuch für den
Maschinenbau. Springer Vieweg, Berlin, Heidelberg, 25 Ed., 2018.

[Hads 06] R. Hadsell, S. Chopra, and Y. LeCun. “Dimensionality Reduction by


Learning an Invariant Mapping”. In: Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), pp. 1735–1742,
IEEE, 2006.

[Hamm 17] K. Hammernik, T. Würfl, T. Pock, and A. Maier. “A Deep Learning


Architecture for Limited-Angle Computed Tomography Reconstruction”.
In: Bildverarbeitung für die Medizin (BVM), pp. 92–97, Springer, 2017.

[Hase 78] V. Hasek. “Untersuchung und theoretische Beschreibung wichtiger Ein-
flussgrößen auf das Grenzformänderungsschaubild”. Institute of Metal
Forming Report, pp. 213–220, 1978.

[Hast 05] T. Hastie, R. Tibshirani, J. Friedman, and J. Franklin. “The Elements


of Statistical Learning: Data Mining, Inference and Prediction”. The
Mathematical Intelligencer, Vol. 27, No. 2, pp. 83–85, 2005.
[He 15] K. He, X. Zhang, S. Ren, and J. Sun. “Delving Deep into Rectifiers:
Surpassing Human-Level Performance on ImageNet Classification”. In:
Proceedings of the IEEE International Conference on Computer Vision
(ICCV), pp. 1026–1034, IEEE, 2015.
[He 16] K. He, X. Zhang, S. Ren, and J. Sun. “Deep Residual Learning for Im-
age Recognition”. In: Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), pp. 770–778, IEEE, 2016.
[Hint 12] G. Hinton. “Neural Networks for Machine Learning”. Coursera Video
Lectures, 2012.
[Hint 94] G. E. Hinton and R. S. Zemel. “Autoencoders, Minimum Description
Length and Helmholtz Free Energy”. In: Advances in Neural Information
Processing Systems, pp. 3–10, 1994.
[Hoch 97] S. Hochreiter and J. Schmidhuber. “Long Short-Term Memory”. Neural
Computation, Vol. 9, No. 8, pp. 1735–1780, 1997.
[Honi 11] F. Hönig, A. Batliner, and E. Nöth. “How Many Labellers Revisited –
Naïves, Experts, and Real Experts”. In: Speech and Language Technology
in Education, 2011.
[Hou 17] X. Hou, L. Shen, K. Sun, and G. Qiu. “Deep Feature Consistent Varia-
tional Autoencoder”. In: Proceedings of the IEEE Winter Conference on
Applications of Computer Vision (WACV), pp. 1133–1141, IEEE, 2017.
[Jare 15] C. Jaremenko, A. Maier, S. Steidl, J. Hornegger, N. Oetter, C. Knipfer,
F. Stelzle, and H. Neumann. “Classification of confocal laser endomi-
croscopic images of the oral cavity to distinguish pathological from
healthy tissue”. In: Bildverarbeitung für die Medizin (BVM), pp. 479–
485, Springer, 2015.
[Jare 17] C. Jaremenko, X. Huang, E. Affronti, M. Merklein, and A. Maier. “Sheet
Metal Forming Limits as Classification Problem”. In: Proceedings of the
IAPR International Conference on Machine Vision Applications (MVA),
pp. 100–103, IEEE, 2017.
[Jare 18] C. Jaremenko, E. Affronti, A. Maier, and M. Merklein. “Analysis of Form-
ing Limits in Sheet Metal Forming with Pattern Recognition Methods.
Part 2: Unsupervised Methodology and Application”. Materials, Vol. 11,
No. 10, 2018.
[Jare 19] C. Jaremenko, N. Ravikumar, E. Affronti, A. Maier, and M. Merklein.
“Determination of Forming Limits in Sheet Metal Forming using Deep
Learning”. Materials, Vol. 12, No. 4, 2019.
[Jare 20] C. Jaremenko, E. Affronti, M. Merklein, and A. Maier. “Temporal and
Spatial Detection of the Onset of Local Necking and Assessment of its
Growth Behavior”. Materials, Vol. 13, No. 11, 2020.
[Jarr 09] K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun. “What is the
Best Multi-stage Architecture for Object Recognition?”. In: Proceed-
ings of the IEEE International Conference on Computer Vision (ICCV),
pp. 2146–2153, IEEE, 2009.

[Jia 14] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick,
S. Guadarrama, and T. Darrell. “Caffe: Convolutional Architecture for
Fast Feature Embedding”. In: Proceedings of the ACM International
Conference on Multimedia, pp. 675–678, ACM, 2014.
[John 16] J. Johnson, A. Alahi, and L. Fei-Fei. “Perceptual Losses for Real-time
Style Transfer and Super-resolution”. In: Proceedings of the European
Conference on Computer Vision (ECCV), pp. 694–711, Springer, 2016.
[Keat 75] T. J. Keating, P. Wolf, and F. Scarpace. “An Improved Method of Digital
Image Correlation”. Photogrammetric Engineering and Remote Sensing,
Vol. 41, No. 8, pp. 993–1002, 1975.
[Keel 61] S. P. Keeler. Plastic Instability and Fracture in Sheets Stretched over
Rigid Punches. PhD thesis, Massachusetts Institute of Technology, 1961.
[Keel 77] S. Keeler and W. Brazier. “Relationship between Laboratory Material
Characterization and Press-shop Formability”. In: Proceedings of the
Conference on Microalloying, pp. 517–528, 1977.
[King 14] D. P. Kingma and J. Ba. “Adam: A Method for Stochastic Optimization”.
arXiv preprint arXiv:1412.6980, 2014.
[Kreb 17] J. Krebs, T. Mansi, H. Delingette, L. Zhang, F. C. Ghesu, S. Miao, A. K.
Maier, N. Ayache, R. Liao, and A. Kamen. “Robust Non-rigid Regis-
tration through Agent-based Action Learning”. In: Proceedings of the
International Conference on Medical Image Computing and Computer-
Assisted Intervention (MICCAI), pp. 344–352, Springer, 2017.
[Kriz 12] A. Krizhevsky, I. Sutskever, and G. E. Hinton. “ImageNet Classification
with Deep Convolutional Neural Networks”. In: Advances in Neural In-
formation Processing Systems, pp. 1097–1105, Curran Associates, Inc,
2012.
[Kroo 09] B. Kroon, S. Maas, S. Boughorbel, and A. Hanjalic. “Eye Localiza-
tion in Low and Standard Definition Content with Application to Face
Matching”. Computer Vision and Image Understanding, Vol. 113, No. 8,
pp. 921–933, 2009.
[Lamp 09] C. H. Lampert. “Kernel Methods in Computer Vision”. Foundations and
Trends in Computer Graphics and Vision, Vol. 4, No. 3, pp. 193–285,
2009.
[Land 77] J. R. Landis and G. G. Koch. “The Measurement of Observer Agreement
for Categorical Data”. Biometrics, pp. 159–174, 1977.
[Lang 85] K. Lange. Handbook of Metal Forming. McGraw-Hill Book Company,
New York, 1985.
[LeCu 15] Y. LeCun, Y. Bengio, and G. Hinton. “Deep Learning”. Nature, Vol. 521,
No. 7553, p. 436, 2015.
[LeCu 89] Y. LeCun. “Generalization and Network Design Strategies”. In: Connec-
tionism in Perspective, Citeseer, 1989.
[Lee 13] D.-H. Lee. “Pseudo-label: The Simple and Efficient Semi-supervised
Learning Method for Deep Neural Networks”. In: Workshop on Chal-
lenges in Representation Learning (ICML), p. 2, 2013.

[Lepe 06] V. Lepetit and P. Fua. “Keypoint Recognition using Randomized
Trees”. IEEE Transactions on Pattern Analysis and Machine Intelligence,
Vol. 28, No. 9, pp. 1465–1479, 2006.

[Lin 14] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan,
P. Dollár, and C. L. Zitnick. “Microsoft COCO: Common Objects in
Context”. In: Proceedings of the European Conference on Computer
Vision (ECCV), pp. 740–755, 2014.

[Lorc 17] B. Lorch, G. Vaillant, C. Baumgartner, W. Bai, D. Rueckert, and
A. Maier. “Automated Detection of Motion Artefacts in MR Imaging
using Decision Forests”. Journal of Medical Engineering, Vol. 2017, 2017.

[Maas 13] A. L. Maas, A. Y. Hannun, and A. Y. Ng. “Rectifier Nonlinearities
Improve Neural Network Acoustic Models”. In: Proceedings of the
International Conference on Machine Learning (ICML), p. 3, 2013.

[Madj 12] G. Madjarov, D. Kocev, D. Gjorgjevikj, and S. Džeroski. “An Extensive
Experimental Comparison of Methods for Multi-label Learning”. Pattern
Recognition, Vol. 45, No. 9, pp. 3084–3104, 2012.

[Maie 09a] A. Maier, F. Hönig, T. Bocklet, E. Nöth, F. Stelzle, E. Nkenke, and
M. Schuster. “Automatic Detection of Articulation Disorders in Children
with Cleft Lip and Palate”. The Journal of the Acoustical Society of
America, Vol. 126, No. 5, pp. 2589–2602, 2009.

[Maie 09b] A. Maier, F. Hönig, V. Zeißler, A. Batliner, E. Körner, N. Yamanaka,
P. Ackermann, and E. Nöth. “A Language-independent Feature Set for
the Automatic Evaluation of Prosody”. In: Conference of the International
Speech Communication Association (Interspeech), 2009.

[Maie 19a] A. Maier, C. Syben, T. Lasser, and C. Riess. “A Gentle Introduction to
Deep Learning in Medical Image Processing”. Zeitschrift für Medizinische
Physik, Vol. 29, No. 2, pp. 86–101, 2019.

[Maie 19b] A. K. Maier, C. Syben, B. Stimpel, T. Würfl, M. Hoffmann, F. Schebesch,
W. Fu, L. Mill, L. Kling, and S. Christiansen. “Learning with Known
Operators Reduces Maximum Error Bounds”. Nature Machine Intelligence,
Vol. 1, No. 8, pp. 373–380, 2019.

[Marc 65] Z. Marciniak. “Stability of Plastic Shells under Tension with Kinematic
Boundary Condition”. Archiwum Mechaniki Stosowanej, pp. 577–592,
1965.

[Mate 98] A. Materka and M. Strzelecki. “Texture Analysis Methods – A Review”.
Technical University of Lodz, COST B11 Report, pp. 1–33, 1998.

[Merk 10] M. Merklein, A. Kuppert, S. Mütze, and A. Geffer. “New Time Dependent
Method for Determination of FLC Applied to SZBS800”. In: Proceedings
of the International Deep Drawing Research Group (IDDRG), pp. 489–
498, 2010.

[Merk 14] M. Merklein, A. Kuppert, and E. Affronti. “An Improvement of the
Time Dependent Method based on the Coefficient of Correlation for the
Determination of the Forming Limit Curve”. In: Advanced Materials
Research, pp. 215–222, Trans Tech Publications, 2014.

[Merk 17] M. Merklein, E. Affronti, W. Volk, and D. Jocham. Verbesserung der
zeitlichen Auswertemethoden von Versuchen zur Ermittlung der
Grenzformänderung und Ableitung eines virtuellen Ersatzmodells: Nr. 469.
Europäische Forschungsgesellschaft für Blechverarbeitung e.V., 2017.
[Moug 07] S. G. Mougiakakou, I. K. Valavanis, A. Nikita, and K. S. Nikita. “Differ-
ential Diagnosis of CT Focal Liver Lesions using Texture Features, Fea-
ture Selection and Ensemble Driven Classifiers”. Artificial Intelligence in
Medicine, Vol. 41, No. 1, pp. 25–37, 2007.
[Mual 13] F. Mualla, S. Schöll, B. Sommerfeldt, A. Maier, and J. Hornegger.
“Automatic Cell Detection in Bright-field Microscope Images using SIFT,
Random Forests, and Hierarchical Clustering”. IEEE Transactions on
Medical Imaging, Vol. 32, No. 12, pp. 2274–2286, 2013.
[Nair 10] V. Nair and G. E. Hinton. “Rectified Linear Units improve Restricted
Boltzmann Machines”. In: Proceedings of the International Conference
on Machine Learning (ICML), pp. 807–814, Omnipress, 2010.
[Naka 67] K. Nakazima and T. Kikuma. “Forming Limits under Biaxial Stretching
of Sheet Metals”. Tetsu-to-Hagane, Vol. 53, pp. 455–458, 1967.
[Naka 68] K. Nakajima, T. Kikuma, and K. Hasuka. “Study of Formability of Steel
Sheets”. Yawata Tech Rep, No. 264, pp. 8517–8530, 1968.
[Nguy 12] T. M. Nguyen and Q. J. Wu. “Robust Student’s-t Mixture Model with
Spatial Constraints and its Application in Medical Image Segmentation”.
IEEE Transactions on Medical Imaging, Vol. 31, No. 1, pp. 103–116, 2012.
[Niem 83] H. Niemann. Klassifikation von Mustern. Springer-Verlag, Berlin, Hei-
delberg, 1983.
[Nixo 12] M. S. Nixon and A. S. Aguado. Feature Extraction and Image Processing
for Computer Vision. Academic Press, London, 3 Ed., 2012.
[Ojal 02] T. Ojala, M. Pietikainen, and T. Maenpaa. “Multiresolution Gray-scale
and Rotation Invariant Texture Classification with Local Binary Pat-
terns”. IEEE Transactions on Pattern Analysis and Machine Intelligence,
Vol. 24, No. 7, pp. 971–987, 2002.
[Ojal 96] T. Ojala, M. Pietikäinen, and D. Harwood. “A Comparative Study of
Texture Measures with Classification based on Featured Distributions”.
Pattern Recognition, Vol. 29, No. 1, pp. 51–59, 1996.
[Oliv 18] A. Oliver, A. Odena, C. A. Raffel, E. D. Cubuk, and I. Goodfellow. “Re-
alistic Evaluation of Deep Semi-supervised Learning Algorithms”. In: Ad-
vances in Neural Information Processing Systems, pp. 3235–3246, 2018.
[Pan 07] B. Pan, H. Xie, Z. Guo, and T. Hua. “Full-field Strain Measurement
using a Two-dimensional Savitzky-Golay Digital Differentiator in Digital
Image Correlation”. Optical Engineering, Vol. 46, No. 3, 2007.
[Pan 09] B. Pan, K. Qian, H. Xie, and A. Asundi. “Two-dimensional Digital Im-
age Correlation for In-plane Displacement and Strain Measurement: a
Review”. Measurement Science and Technology, Vol. 20, No. 6, 2009.
[Paul 03] D. Paulus and J. Hornegger. Applied Pattern Recognition: Algorithms and
Implementation in C++. Springer Vieweg, Wiesbaden, 2003.

[Peel 00] D. Peel and G. J. McLachlan. “Robust Mixture Modelling using the t
Distribution”. Statistics and Computing, Vol. 10, No. 4, pp. 339–348,
2000.
[Piet 11] M. Pietikäinen, A. Hadid, G. Zhao, and T. Ahonen. Computer Vision
Using Local Binary Patterns. Springer-Verlag, London, 2011.
[Poly 64] B. T. Polyak. “Some Methods of Speeding up the Convergence of Iteration
Methods”. USSR Computational Mathematics and Mathematical Physics,
Vol. 4, No. 5, pp. 1–17, 1964.
[Ravi 16] N. Ravikumar, A. Gooya, S. Cimen, A. F. Frangi, and Z. A. Tay-
lor. “A multi-resolution t-mixture model approach to robust group-wise
alignment of shapes”. In: International Conference on Medical Image
Computing and Computer-Assisted Intervention (MICCAI), pp. 142–149,
Springer, 2016.
[Roge 08] G. Rogez, J. Rihan, S. Ramalingam, C. Orrite, and P. H. Torr. “Ran-
domized Trees for Human Pose Detection”. In: Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–
8, IEEE, 2008.
[Ronn 15] O. Ronneberger, P. Fischer, and T. Brox. “U-Net: Convolutional Networks
for Biomedical Image Segmentation”. In: Proceedings of the International
Conference on Medical Image Computing and Computer-assisted
Intervention (MICCAI), pp. 234–241, Springer, 2015.
[Rose 58] F. Rosenblatt. “The Perceptron: A Probabilistic Model for Information
Storage and Organization in the Brain”. Psychological Review, Vol. 65,
No. 6, p. 386, 1958.
[Rume 86] D. E. Rumelhart, G. E. Hinton, and R. J. Williams. “Learning
Representations by Back-propagating Errors”. Nature, Vol. 323,
pp. 533–536, 1986.

[Rupp 13] W. Rupprecht. Signale und Übertragungssysteme: Modelle und Verfahren
für die Informationstechnik. Springer-Verlag, Berlin, Heidelberg, 2013.
[Russ 15] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang,
A. Karpathy, A. Khosla, M. Bernstein, et al. “ImageNet Large Scale
Visual Recognition Challenge”. International Journal of Computer Vision,
Vol. 115, No. 3, pp. 211–252, 2015.
[Sanc 11] J. Sanchez and F. Perronnin. “High-dimensional Signature Compression
for Large-scale Image Classification”. In: Proceedings of the IEEE Con-
ference on Computer Vision and Pattern Recognition (CVPR), pp. 1665–
1672, IEEE, 2011.
[Sanc 18] J. A. Sanchez, A. Conde, A. Arriandiaga, J. Wang, and S. Plaza. “Un-
expected Event Prediction in Wire Electrical Discharge Machining Using
Deep Learning Techniques”. Materials, Vol. 11, No. 7, 2018.
[Scha 89] R. W. Schafer and A. V. Oppenheim. Discrete-Time Signal Processing.
Prentice Hall, Englewood Cliffs, 1989.
[Scho 00] B. Schölkopf, R. C. Williamson, A. J. Smola, J. Shawe-Taylor, and J. C.
Platt. “Support Vector Method for Novelty Detection”. In: Advances in
Neural Information Processing Systems, pp. 582–588, 2000.

[Schr 15] F. Schroff, D. Kalenichenko, and J. Philbin. “FaceNet: A Unified
Embedding for Face Recognition and Clustering”. In: Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
pp. 815–823, IEEE, 2015.

[Selv 17] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and
D. Batra. “Grad-CAM: Visual Explanations from Deep Networks via
Gradient-based Localization”. In: Proceedings of the IEEE International
Conference on Computer Vision (ICCV), pp. 618–626, IEEE, 2017.

[Shin 16] H. Shin, H. R. Roth, M. Gao, L. Lu, Z. Xu, I. Nogues, J. Yao, D. Mol-
lura, and R. M. Summers. “Deep Convolutional Neural Networks for
Computer-Aided Detection: CNN Architectures, Dataset Characteristics
and Transfer Learning”. IEEE Transactions on Medical Imaging, Vol. 35,
No. 5, pp. 1285–1298, 2016.

[Shot 08] J. Shotton, M. Johnson, and R. Cipolla. “Semantic Texton Forests for
Image Categorization and Segmentation”. In: Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–
8, IEEE, 2008.

[Sieg 15] K. Siegert. Blechumformung: Verfahren, Werkzeuge und Maschinen.
Springer-Verlag, Berlin, Heidelberg, 2015.

[Silv 15] M. Silva, A. J. Martínez-Donaire, G. Centeno, D. Morales-Palma,
C. Vallellano, and P. Martins. “Recent Approaches for the Determination
of Forming Limits by Necking and Fracture in Sheet Metal Forming”.
Procedia Engineering, Vol. 132, pp. 342–349, 2015.

[Simo 14] K. Simonyan and A. Zisserman. “Very Deep Convolutional Networks for
Large-scale Image Recognition”. arXiv preprint arXiv:1409.1556, 2014.

[Sind 18] A. Sindel, K. Breininger, J. Käßer, A. Hess, A. Maier, and T. Köhler.
“Learning from a Handful Volumes: MRI Resolution Enhancement with
Volumetric Super-resolution Forests”. In: Proceedings of the IEEE
International Conference on Image Processing (ICIP), pp. 1453–1457,
IEEE, 2018.

[Soer 10] L. Soerensen, S. B. Shaker, and M. de Bruijne. “Quantitative Analysis
of Pulmonary Emphysema using Local Binary Patterns”. IEEE
Transactions on Medical Imaging, Vol. 29, No. 2, pp. 559–569, 2010.

[Sriv 14] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and
R. Salakhutdinov. “Dropout: A Simple Way to Prevent Neural Networks
from Overfitting”. Journal of Machine Learning Research, Vol. 15,
pp. 1929–1958, 2014.

[Stei 05] S. Steidl, M. Levit, A. Batliner, E. Nöth, and H. Niemann. “‘Of All
Things the Measure is Man’ – Automatic Classification of Emotions
and Inter-labeler Consistency [Speech-based Emotion Recognition]”. In:
Proceedings of the IEEE International Conference on Acoustics, Speech,
and Signal Processing (ICASSP), pp. I–317, IEEE, 2005.

[Sutt 09] M. A. Sutton, J. J. Orteu, and H. Schreier. Image Correlation for Shape,
Motion and Deformation Measurements: Basic Concepts, Theory and
Applications. Springer, New York, 2009.

[Szeg 15] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan,
V. Vanhoucke, and A. Rabinovich. “Going Deeper with Convolutions”.
In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), pp. 1–9, IEEE, 2015.
[Tax 04] D. M. Tax and R. P. Duin. “Support Vector Data Description”. Machine
Learning, Vol. 54, No. 1, pp. 45–66, 2004.
[Tax 99] D. M. Tax and R. P. Duin. “Support Vector Domain Description”. Pattern
Recognition Letters, Vol. 20, No. 11-13, pp. 1191–1199, 1999.
[Thea 16] Theano Development Team. “Theano: A Python Framework for
Fast Computation of Mathematical Expressions”. arXiv preprint
arXiv:1605.02688, 2016.
[Vach 99] P. Vacher, A. Haddad, and R. Arrieux. “Determination of the Forming
Limit Diagrams Using Image Analysis by the Correlation Method”. CIRP
Annals – Manufacturing Technology, Vol. 48, No. 2, pp. 227–230, 1999.
[Vinc 08] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol. “Extracting
and Composing Robust Features with Denoising Autoencoders”. In: Pro-
ceedings of the International Conference on Machine Learning (ICML),
pp. 1096–1103, ACM, 2008.
[Vo 17] K. Vo, C. Jaremenko, C. Bohr, H. Neumann, and A. Maier. “Automatic
classification and pathological staging of confocal laser endomicroscopic
images of the vocal cords”. In: Bildverarbeitung für die Medizin (BVM),
pp. 312–317, Springer, 2017.
[Volk 11] W. Volk and P. Hora. “New Algorithm for a Robust User-independent
Evaluation of Beginning Instability for the Experimental FLC Determi-
nation”. International Journal of Material Forming, No. 3, pp. 339–346,
2011.
[Vyso 16] D. Vysochinskiy, T. Coudert, O. S. Hopperstad, O.-G. Lademo, and
A. Reyes. “Experimental Detection of Forming Limit Strains on Samples
with Multiple Local Necks”. Journal of Materials Processing Technology,
Vol. 227, pp. 216–226, 2016.
[Wang 14] K. Wang, J. E. Carsley, B. He, J. Li, and L. Zhang. “Measuring Form-
ing Limit Strains with Digital Image Correlation Analysis”. Journal of
Materials Processing Technology, Vol. 214, No. 5, pp. 1120–1130, 2014.
[Wils 03] D. R. Wilson and T. R. Martinez. “The General Inefficiency of Batch
Training for Gradient Descent Learning”. Neural Networks, Vol. 16,
No. 10, pp. 1429–1451, 2003.
[Witt 16] I. H. Witten, E. Frank, M. A. Hall, and C. J. Pal. Data Mining: Practical
Machine Learning Tools and Techniques. Morgan Kaufmann, Amster-
dam, 2016.
[Wurf 16] T. Würfl, F. C. Ghesu, V. Christlein, and A. Maier. “Deep Learning
Computed Tomography”. In: Proceedings of the International Confer-
ence on Medical Image Computing and Computer-Assisted Intervention
(MICCAI), pp. 432–440, Springer, 2016.
[Xie 16] J. Xie, R. Girshick, and A. Farhadi. “Unsupervised Deep Embedding for
Clustering Analysis”. In: Proceedings of the International Conference on
Machine Learning (ICML), pp. 478–487, 2016.

[Yeh 17] C.-K. Yeh, W.-C. Wu, W.-J. Ko, and Y.-C. F. Wang. “Learning Deep
Latent Space for Multi-label Classification”. In: Proceedings of the AAAI
Conference on Artificial Intelligence, 2017.
[Yilm 11] A. Yilmaz. “The Portevin–Le Chatelier Effect: a Review of Experimental
Findings”. Science and Technology of Advanced Materials, Vol. 12, No. 6,
2011.
