By César Barreto
Advance of Phishing
Phishing is a typical social engineering attack: attackers exploit users'
instincts, such as curiosity, trust, fear and greed, to commit crimes.
According to cybersecurity research reports, phishing increased 350% during
the COVID-19 quarantine. The cost of mounting a phishing attack is estimated
to be one quarter of the cost of a traditional cyberattack, while the revenue
is double what it was in the past. Midsize businesses paid an average of $1.6
million to deal with phishing attacks, and a business can lose customers
faster than it gains them because of this threat.
Leaking privacy
Receiving deceptive information and losing property
• Existing algorithms treat all websites in the same way, which makes the
statistical models inefficient. In other words, the models are not suited to
the real web environment, which contains a large number of complex web pages.
• Most data sets do not contain enough samples and do not account for sample
diversity; furthermore, the proportion of positive to negative samples is
unrealistic. Models trained on such data sets tend to overfit heavily, and
their robustness needs further improvement.
In recent years, efforts have been made to develop large-scale, robust and
efficient antiphishing methods for the real web environment, based on
statistical machine learning algorithms. Their main innovations are the
following:
• The new antiphishing models are built on data sets that resemble the real
web environment as closely as possible, covering different languages, content
qualities and brands. Furthermore, because phishing detection is a class
imbalance problem, these data sets now include a large proportion of easily
confused positive and negative samples, which are very difficult to detect.
All of these properties deliberately raise the difficulty of detection, so
that the resulting antiphishing detection models are robust, effective and
practical in the real web environment.
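
The class imbalance point can be illustrated with a small sketch. It is not taken from any of the models discussed here; the function names and the 10-in-1000 ratio are assumptions chosen only to show why accuracy alone says little when phishing pages are rare.

```python
def confusion_counts(y_true, y_pred):
    """Count outcomes for binary labels, where 1 = phishing and 0 = legitimate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_accuracy(y_true, y_pred):
    """Return (precision, recall, accuracy); undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy


# Hypothetical imbalanced sample: 10 phishing pages among 1000 pages.
y_true = [1] * 10 + [0] * 990
# A useless detector that labels every page as legitimate.
always_legit = [0] * 1000

precision, recall, accuracy = precision_recall_accuracy(y_true, always_legit)
print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```

The detector that never flags anything reaches 99% accuracy on this sample while catching no phishing page at all (recall 0), which is why a data set with an unrealistic class ratio can make a weak model look strong.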
Many antiphishing models rely on URLs, titles, hyperlinks, login boxes,
copyright information, confidential terms and search engine information, and
even on comparing the brand logos of websites; all of these have been shown to
help identify phishing websites. In addition, visual spoofing features and
evaluation features have received more attention in recent years. However,
these features have not been robust enough to determine whether a website is
a phishing website.
APWG statistics show that "more than 98% of Phishing websites use fake
domain names." Researchers use information from the URL string to build
antiphishing models, but extracting the information behind the domain name,
such as domain registration and resolution data, is also very important for
phishing detection. This information can often indicate whether a domain name
is entitled to provide services related to a brand. Extracting effective
features is therefore the main task of antiphishing researchers, who analyze
social engineering attacks and propose a comprehensive and interpretable
feature framework that covers not only all aspects of phishing attacks but
also the quality and relevance of web content.
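
As a rough illustration of features taken from the URL string, the sketch below derives a few lexical signals with Python's standard library. The feature names, the brand `examplebank` and its official domain `examplebank.example` are hypothetical and chosen only for the example; registration and resolution data (WHOIS, DNS) would need network lookups and are not covered here.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(host):
    """Return True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url, brand, official_domains):
    """Extract simple lexical features from a URL.

    `brand` and `official_domains` are assumptions supplied by the caller,
    for example a brand name and the domains that brand is known to operate.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # The host counts as official if it equals, or is a subdomain of, a listed domain.
    on_official = any(host == d or host.endswith("." + d) for d in official_domains)
    return {
        "host_length": len(host),
        "num_dots": host.count("."),
        "host_is_ip": host_is_ip(host),
        "has_at_sign": "@" in parts.netloc,
        "uses_https": parts.scheme == "https",
        "brand_in_host": brand.lower() in host,
        "on_official_domain": on_official,
    }


features = url_features(
    "http://examplebank.example.secure-login.test/signin",
    brand="examplebank",
    official_domains={"examplebank.example"},
)
print(features)
```

In this example the brand name appears in the host although the host is not on the brand's official domain, which is the kind of mismatch a fake domain name tends to produce. Such a signal is only a hint and would normally be combined with other features.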
The approach consists of the following stages:
Stage 1. Whitelist filtering. This stage separates legitimate web pages from
suspicious ones based on the domain name of the target brand's website.
Stage 2. Fast counterfeit filtering. This stage extracts the counterfeit
title feature, the counterfeit text feature and the visual counterfeit
feature, and applies a detection model to them.
Final stage. Accurate recognition. This stage extracts and fuses the
counterfeit, theft, affiliation and evaluation features, then trains the
models and performs detection. These four feature groups make up the CASE
feature framework on which the phishing detection models are based.
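
The staged flow can be sketched as a short pipeline. This is only an illustration under stated assumptions: the decision rule in each stage, the page dictionary layout, the result labels and the `classifier` callable are all hypothetical, and the source does not specify how each stage decides.

```python
from urllib.parse import urlsplit


def on_whitelist(url, whitelist):
    """Stage 1 (assumed rule): the host equals or is a subdomain of a whitelisted domain."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in whitelist)


def looks_counterfeit(page, brand_terms):
    """Stage 2 (assumed rule): cheap check for brand terms in the title or text."""
    content = (page.get("title", "") + " " + page.get("text", "")).lower()
    return any(term.lower() in content for term in brand_terms)


def detect(page, whitelist, brand_terms, classifier):
    """Run the stages in order and stop as soon as one of them decides.

    `classifier` stands in for the final model trained on the fused features;
    it is a placeholder callable that returns True for phishing.
    """
    if on_whitelist(page["url"], whitelist):
        return "legitimate"
    if not looks_counterfeit(page, brand_terms):
        return "filtered_out"
    return "phishing" if classifier(page) else "not_phishing"


page = {
    "url": "http://examplebank.example.login.test/",
    "title": "ExampleBank sign in",
    "text": "Enter your password to continue",
}
# Stub classifier used only for the demonstration; a real one would be a trained model.
result = detect(page, {"examplebank.example"}, ["examplebank"], lambda p: True)
print(result)
```

Ordering the stages from cheapest to most expensive means most pages are decided before the costly feature extraction and model run, which is the usual motivation for a multistage design of this kind.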
Conclusion