overcome many problems that previously seemed unsolvable. And the development of the Internet of Things in the coming years will provide enormous amounts of data upon which data scientists can build many applications. The development of machine learning has also been made possible by the improvements in computational capacity we have witnessed in recent years.
To solve an engineering problem with a conventional approach, we have to gain as much knowledge as we can about the questions the problem raises, in order to build a model that represents it faithfully enough. The next step is to come up with an efficient algorithm that yields numerical solutions to the problem. But sometimes we are unable to follow this method, either because the problem is too complex, preventing us from devising a model that captures its properties satisfactorily, or because the best algorithm we can think of is not efficient enough to deliver the numerical solutions we need in a reasonable amount of time. Osvaldo Simeone refers to these situations as a "model deficit" and an "algorithm deficit", respectively.
Machine learning can be very helpful in both situations, because we normally do not need a deep understanding of, for example, the physics underlying the problem, as we do when solving it conventionally. However, such knowledge can be helpful when choosing the hypothesis class, or, to put it differently, the "machines" we want to train. In his article, Osvaldo Simeone gives the example of convolutional neural networks used in image processing. Another example is the use of LSTM networks to detect anomalies in a signal (obtained, for example, from a sensor). When using machine learning, however, data mining and cleaning are of fundamental importance, because the quality of the results the algorithm delivers depends heavily on the quality of the data it is fed.
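The shape of such a pipeline, clean the data first, then let a data-driven detector decide, can be sketched in a few lines of Python. This is only an illustration: a simple z-score rule stands in for a trained model such as an LSTM, and the function and variable names are my own, not from the article.

```python
from statistics import mean, stdev

def clean(readings):
    """Drop missing or clearly invalid sensor values before any learning step."""
    return [r for r in readings if r is not None and -1e6 < r < 1e6]

def detect_anomalies(readings, threshold=2.5):
    """Flag readings more than `threshold` standard deviations from the mean.

    A crude stand-in for a learned detector (e.g. an LSTM); the pipeline
    shape is the point: clean the data first, then apply the model.
    """
    data = clean(readings)
    mu, sigma = mean(data), stdev(data)
    return [r for r in data if sigma > 0 and abs(r - mu) / sigma > threshold]

# Toy sensor trace with one missing value and one spike.
sensor = [10.2, 9.8, None, 10.1, 10.0, 55.0, 9.9, 10.3, 10.1, 9.7]
print(detect_anomalies(sensor))  # → [55.0]
```

Note how the `None` entry would crash the statistics if it were not removed first, a small instance of why the quality of the data bounds the quality of the result.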
Osvaldo Simeone insists that not every problem is suitable for machine learning techniques, and I agree with him. We should make sure that we are dealing with either a "model deficit" or an "algorithm deficit", and that the problem does not require us to explain every result the program produces. Machine learning techniques are a kind of black box: they take inputs and deliver outputs, but in general we cannot access an explicit relationship between, for example, the features and the labels. When using data-driven approaches, we should keep in mind the limits inherent to these techniques.