1. Feature Engineering
a. Feature representation
- Static code features: the number and types of instructions (arithmetic
instructions, memory operations, branch instructions), loop information, and
parallelism information
- Tree-based and graph-based approaches: A tree built from the features was used
to construct a cost function with a neural network. A graph-based approach was
used with an SVM whose kernel is based on a graph similarity metric.
- Dynamic features: A drawback of static code features is that they may describe
code segments that are rarely executed, which can confuse machine learning
models. Dynamic features capture information that cannot be decided at compile
time: loop iteration counts, the dynamic memory and I/O behavior of the
application, CPU load and thread contention, control flow, frequently executed
code regions, how many instructions have been executed and of what types, and
the number of cache loads/stores and branch misses.
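The contrast between static and dynamic features can be sketched as below. This is a minimal illustration, not a real compiler pass: the opcode names, the CATEGORIES mapping, the toy program, and the execution counts are all made up for the example.

```python
from collections import Counter

# Hypothetical mapping from opcode to feature category (an assumption).
CATEGORIES = {
    "add": "arithmetic", "sub": "arithmetic", "mul": "arithmetic",
    "load": "memory", "store": "memory",
    "br": "branch", "jmp": "branch",
}
KEYS = ["arithmetic", "memory", "branch", "other"]


def category(instruction):
    """Map one textual instruction to its category via its opcode."""
    return CATEGORIES.get(instruction.split()[0], "other")


def static_features(instructions):
    """Count instructions per category, regardless of how often they run."""
    counts = Counter(category(line) for line in instructions)
    return {k: counts.get(k, 0) for k in KEYS}


def dynamic_features(instructions, exec_counts):
    """Weight each instruction by how many times it was executed."""
    counts = Counter()
    for line, n in zip(instructions, exec_counts):
        counts[category(line)] += n
    return {k: counts.get(k, 0) for k in KEYS}


# Toy program: a hot loop followed by a rarely executed error path.
toy_program = [
    "load r1, [a]",
    "add r2, r1, r1",
    "store [b], r2",
    "br loop",
    "mul r3, r2, r2",   # error path
    "store [err], r3",  # error path
]
# Made-up profile: the loop body runs 1000 times, the error path once.
toy_exec_counts = [1000, 1000, 1000, 1000, 1, 1]

static = static_features(toy_program)
dynamic = dynamic_features(toy_program, toy_exec_counts)
print(static)
print(dynamic)
```

In the static view the error path accounts for a third of the instructions, while in the dynamic view it is negligible. This is the kind of distortion the dynamic-features bullet describes.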
b. Reaction based features
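- The target program is profiled under a few selected compiler options to see how
its runtime changes. These program “reactions” are then used to predict the best
available application speedup.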
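A rough sketch of the idea follows. The notes do not say which predictive model is used, so the 1-nearest-neighbor lookup here is an assumption chosen for brevity. The function names and all runtimes and speedups are hypothetical toy values.

```python
import math


def reactions(baseline_time, option_times):
    """Reaction vector: speedup over the baseline for each probed option."""
    return [baseline_time / t for t in option_times]


def predict_best_speedup(query, training):
    """Return the best known speedup of the training program whose
    reaction vector is closest (Euclidean distance) to the query."""
    nearest = min(training, key=lambda item: math.dist(item[0], query))
    return nearest[1]


# Toy training set: (reaction vector, best speedup found by exhaustive search).
training = [
    ([1.0, 1.1, 0.9], 1.2),  # program that barely reacts to the options
    ([1.5, 2.0, 1.8], 2.4),  # program that reacts strongly
]

# New program: 10.0 s baseline, then three runs with the probe options.
query = reactions(10.0, [6.5, 5.0, 5.5])
predicted = predict_best_speedup(query, training)
print(query, predicted)
```

The feature vector here is built from measured behavior under a few probe configurations instead of from the code itself, so the cost is a handful of extra profiling runs per program.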
c. Automatic feature generation
- Instead of the predictive model taking in a set of human-crafted features,
program code is used directly in the training data. Programs are fed through a
series of neural-network-based language models, which learn how code correlates
with the desired optimization options.
d. Feature selection and dimensionality reduction
- Feature selection: Feature selection requires understanding how a particular
feature affects the prediction accuracy. Methods that can be used include the
Pearson correlation coefficient, mutual information, and LASSO.
- Feature dimensionality reduction: Reducing the number of features matters both
for model efficiency and for avoiding the curse of dimensionality (which
affects, for example, KNN). PCA linearly combines the original features to
construct new features, and LDA is a similar technique. An alternative is to
use autoencoders: the encoder tries to compress the original input into a
low-dimensional representation, while the decoder tries to reconstruct the
original input from the low-dimensional representation generated by the
encoder.
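Two of the techniques above can be sketched with the standard library alone: ranking features by Pearson correlation with the target, and finding the first PCA direction. Power iteration is used for the PCA step only to keep the example dependency-free. In practice a linear algebra library would be the usual choice. The function names and the toy data are assumptions for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def rank_features(rows, target):
    """Feature indices sorted by |r| with the target, strongest first."""
    columns = list(zip(*rows))
    scores = [abs(pearson(col, target)) for col in columns]
    return sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)


def first_principal_component(rows, iterations=200):
    """Leading PCA direction via power iteration on the covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered sample onto the direction: one new feature.
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected


# Toy data: feature 1 is exactly 2x feature 0 (redundant), and feature 2
# is only weakly related to the target.
rows = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 6.0],
    [4.0, 8.0, 4.0],
]
target = [1.1, 1.9, 3.2, 3.9]

ranking = rank_features(rows, target)
direction, projected = first_principal_component(rows)
print(ranking)
print(direction)
```

On this data the weakly related feature ranks last, and the first principal component points along the two redundant features, which shows how PCA merges them into one new feature.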