1) Bayesian Filtering for Location Estimation :

Dieter Fox, Jeffrey Hightower, Gaetano Borriello JULY- 2003


Bayesian-filter techniques provide a powerful statistical tool to help manage
measurement uncertainty and perform multisensor fusion and identity estimation. The
authors survey Bayes filter implementations and show their application to real-world
location-estimation tasks common in pervasive computing.

Location awareness is important to many pervasive computing applications.
Unfortunately, no location sensor takes perfect measurements or works well in all
situations. Thus, the motivation behind this article is twofold. First, we believe the
pervasive computing community will benefit from a concise survey of Bayesian-filter
techniques. Because no sensor is perfect, representing and operating on uncertainty with a
statistical tool such as Bayes filters is key in any system using many sensors. Second,
estimating an object’s location is arguably the most fundamental sensing task in
many pervasive computing scenarios.

2) Increasing the Accuracy of a Spam-Detecting Artificial Immune
System – JAN 2004

Terri Oda, Tony White


Spam, the electronic equivalent of junk mail, affects over 600 million users worldwide.
Even as anti-spam solutions change to limit the amount of spam sent to users, the senders
adapt to make sure their messages are seen. This paper looks at application of the
artificial immune system model to protect email users effectively from spam. In
particular, it tests the spam immune system against the publicly available SpamAssassin
corpus of spam and non-spam, and extends the original system by looking at several
methods of classifying email messages with the detectors produced by the immune
system. The resulting system classifies the messages with similar accuracy to other spam
filters, but uses fewer detectors to do so, making it an attractive solution for
circumstances where processing time is at a premium.

3) Current and New Developments in Spam Filtering NOV-2004

Ray Hunt and James Carpinter


This paper provides an overview of current and potential future spam filtering techniques.
We examine the problems spam introduces, what spam is and how we can measure it.
The paper primarily focuses on automated, noninteractive filters, with a broad review
ranging from commercial implementations to ideas confined to current research papers.
Both machine learning and non-machine learning based filters are reviewed
as potential solutions and a taxonomy of known approaches is presented. While
a range of different techniques have been and continue to be evaluated in
academic research, heuristic and Bayesian filtering - along with its variants -
provide the greatest potential for future spam prevention.

4) ALPACAS: A Large-scale Privacy-Aware Collaborative Anti-spam
System : APRIL -2008
Zhenyu Zhong, Lakshmish Ramaswamy, Kang Li

While the concept of collaboration provides a natural defense against massive spam
emails directed at large numbers of recipients, designing effective collaborative
anti-spam systems raises several important research challenges. First and foremost, since
emails may contain confidential information, any collaborative anti-spam approach has to
guarantee strong privacy protection to the participating entities. Second, the continuously
evolving nature of spam demands that the collaborative techniques be resilient to various
kinds of camouflage attacks. Third, the collaboration has to be lightweight, efficient, and
scalable. Towards addressing these challenges, this paper presents ALPACAS - a
privacy-aware framework for collaborative spam filtering. In designing the ALPACAS
framework, we make two unique contributions. The first is a feature-preserving message
transformation technique that is highly resilient against the latest kinds of spam attacks.
The second is a privacy-preserving protocol that provides enhanced privacy guarantees to
the participating entities. Our experimental results, obtained on a real email dataset,
show that the proposed framework provides a 10-fold improvement in the false negative
rate over the Bayesian-based Bogofilter when faced with one of the recent kinds of spam
attacks. Further, privacy breaches are extremely rare, which demonstrates the
strong privacy protection provided by the ALPACAS system.

5) Detecting Word Substitutions in Text :

SzeWang Fong, Dmitri Roussinov, and David B. Skillicorn AUG- 2008

Searching for words on a watchlist is one way in which large-scale surveillance of
communication can be done, for example, in intelligence and counterterrorism settings.
One obvious defense is to replace words that might attract attention to a message with
other more innocuous words. For example, the sentence “the attack will be tomorrow”
might be altered to “the complex will be tomorrow,” since “complex” is a word whose
frequency is close to that of “attack.” Such substitutions are readily detectable by
humans since they do not make sense. We address the problem of detecting such
substitutions automatically by looking for discrepancies between words and their contexts
and using only syntactic information. We define a set of measures, each of which is
quite weak, but which together produce per-sentence detection rates around 90 percent
with false positive rates around 10 percent. Rules for combining per-sentence detection
into per-message detection can reduce the false positive and false negative rates for
messages to practical levels. We test the approach using sentences from the
Enron e-mail and Brown corpora, representing informal and formal text,
respectively.
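
To see why combining per-sentence decisions can push message-level error rates down,
consider the following back-of-the-envelope Python sketch. It is not the combination
rule from the paper: it assumes a hypothetical "flag the message if at least k of its
n sentences are flagged" rule, assumes sentences are judged independently, and assumes
that every sentence of a suspicious message contains a substitution. Only the
90 percent and 10 percent per-sentence rates are taken from the abstract.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of at least k successes in n independent trials,
    each succeeding with probability p (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


n_sentences = 5   # hypothetical message length
threshold = 3     # hypothetical rule: flag if >= 3 sentences are flagged

# Message where every sentence has a substitution (per-sentence rate 0.90).
message_detection = prob_at_least(threshold, n_sentences, 0.90)

# Innocent message (per-sentence false positive rate 0.10).
message_false_positive = prob_at_least(threshold, n_sentences, 0.10)

print(f"message detection rate:      {message_detection:.4f}")
print(f"message false positive rate: {message_false_positive:.4f}")
```

Under these assumptions the message-level detection rate rises to about 0.991 and the
false positive rate falls to about 0.0086. Real messages violate the independence
assumption to some degree, so these figures are only indicative.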

6) Design and Evaluation of a Bayesian-filter-based Image Spam
Filtering Method: OCT- 2008

In recent years, with the spread of the Internet, spam e-mail has become
one of the most serious problems. A recent report reveals that 91% of all e-mail
exchanged in 2006 was spam. The Bayesian filter is a popular approach to
distinguishing between spam and legitimate e-mails. It applies Bayes' theorem to identify
spam. This filter offers high filtering precision and can detect spam according to
personal preferences. However, the volume of image spam, which carries the spam
message as an image, has been increasing rapidly. The Bayesian filter cannot
distinguish between image spam and legitimate e-mails because it learns from and
examines only text data. Therefore, in this study, we propose an anti-image-spam
technique that uses image information such as file size. This technique can be easily
implemented on top of an existing Bayesian filter. In addition, we report the results of
evaluating this technique.
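
One plausible way to add image information to a text-only Bayesian filter is to turn
image attributes into extra pseudo-tokens. The Python sketch below does this for file
size. It is an assumption-laden illustration, not the method evaluated in the paper:
the bucket width, the token format, the Laplace smoothing, and the toy training
messages are all invented for the example.

```python
import math
from collections import Counter


def image_token(size_bytes, bucket_kb=10):
    """Map an image file size to a coarse pseudo-token (hypothetical format)."""
    low = (size_bytes // 1024) // bucket_kb * bucket_kb
    return f"img_size:{low}-{low + bucket_kb}KB"


def email_tokens(text, image_sizes=()):
    """Word tokens plus one pseudo-token per attached image."""
    return text.lower().split() + [image_token(s) for s in image_sizes]


class NaiveBayesFilter:
    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, tokens, label):
        self.counts[label].update(set(tokens))  # count each token once per mail
        self.docs[label] += 1

    def spam_probability(self, tokens):
        # Work in log-odds; only tokens present in the message contribute.
        log_odds = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for tok in set(tokens):
            p_spam = (self.counts["spam"][tok] + 1) / (self.docs["spam"] + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (self.docs["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


nb = NaiveBayesFilter()
# Toy training data: image spam with no text and a small image attachment.
nb.train(email_tokens("", [15000]), "spam")
nb.train(email_tokens("", [12500]), "spam")
nb.train(email_tokens("meeting agenda attached"), "ham")
nb.train(email_tokens("holiday photo", [350000]), "ham")

image_only = nb.spam_probability(email_tokens("", [14000]))
plain_text = nb.spam_probability(email_tokens("meeting notes"))
print(f"image-only mail: {image_only:.2f}, plain-text mail: {plain_text:.2f}")
```

Because the image size arrives as just another token, the learning and scoring code
of the filter does not change, which matches the abstract's claim that the technique
is easy to fit onto an existing Bayesian filter.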

7) Privacy-Aware Collaborative Spam Filtering :

Kang Li, Member, IEEE, Zhenyu Zhong, Student Member, IEEE, and
Lakshmish Ramaswamy, Member, IEEE MAY -2009

While the concept of collaboration provides a natural defense against massive spam
e-mails directed at large numbers of recipients, designing effective collaborative anti-spam
systems raises several important research challenges. First and foremost, since e-mails
may contain confidential information, any collaborative anti-spam approach has to
guarantee strong privacy protection to the participating entities. Second, the continuously
evolving nature of spam demands that the collaborative techniques be resilient to various
kinds of camouflage attacks. Third, the collaboration has to be lightweight,
efficient, and scalable. Toward addressing these challenges, this paper
presents ALPACAS - a privacy-aware framework for collaborative spam
filtering. In designing the ALPACAS framework, we make two unique
contributions. The first is a feature-preserving message transformation
technique that is highly resilient against the latest kinds of spam attacks. The
second is a privacy-preserving protocol that provides enhanced privacy
guarantees to the participating entities. Our experimental results, obtained
on a real e-mail data set, show that the proposed framework provides a
10-fold improvement in the false negative rate over the Bayesian-based
Bogofilter when faced with one of the recent kinds of spam attacks. Further,
privacy breaches are extremely rare, which demonstrates the strong
privacy protection provided by the ALPACAS system.

8) Feature transformation based on discriminant analysis preserving
local structure for speech recognition:

Makoto Sakai, Norihide Kitaoka, Kazuya Takeda SEP- 2009

To improve speech recognition performance, a feature transformation based on
discriminant analysis has been widely used to reduce redundant dimensions of features.
Linear discriminant analysis (LDA) and heteroscedastic discriminant analysis (HDA) are
often used for this purpose, and a generalization method for LDA and HDA called Power
LDA (PLDA) has been proposed. However, these methods may result in unexpected
dimensionality reduction for multimodal data. It is important to preserve the local
structure of the data in reducing the dimensionality of multimodal data. In this paper
we introduce two methods, locality preserving HDA and locality preserving
PLDA. We also give an efficient calculation scheme to obtain an optimal
projection.

9) Anti-Spam Solutions and Security:

(by Dr. Neal Krawetz) MAY-2010

In a recent survey, 93% of respondents reported dissatisfaction with
the large volume of unsolicited email (spam) they receive. [ref 1] The
problem has grown to the point where nearly 50% of the world's email
is spam [ref 2], yet only a few hundred groups are responsible. [ref 3]
Many anti-spam solutions have been proposed and a few have been
implemented. Unfortunately, these solutions do not prevent spam as
much as they interfere with everyday email communications.

The problems posed by spam have grown from simple annoyances to
significant security issues. The deluge of spam costs up to an
estimated $20 billion each year in lost productivity -- according to the
same document, spam within a company can cost between $600 and
$1,000 per year for every user.

10) The application of decision tree in Chinese email classification :

HAO CHEN, YAN ZHAN, YAN LI JULY- 2010

Email is a kind of semi-structured document: some important attributes are contained in
its structure, and using spam-specific features in particular can improve email
classification results. In this paper, we apply a decision tree data mining technique to
uncover the potential association rules among these email attributes, and then identify
an unknown email’s category based on these rules. In experiments that applied our
classifier to a large number of Chinese emails, the efficiency of our method was not
lower than that of existing methods that check the whole email content text.
Meanwhile, our method can reduce the computation cost and the consumption
of system resources.
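
The abstract does not say which decision tree algorithm is used. As a minimal sketch
of the general idea, the Python code below picks the root split of a tree by
information gain (the ID3 criterion) over structural email attributes. The attribute
names has_html, sender_known, and many_recipients and the six labelled emails are
hypothetical and exist only to make the example runnable.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in label entropy obtained by splitting on one attribute."""
    n = len(rows)
    remainder = 0.0
    for value in set(row[attribute] for row in rows):
        subset = [l for row, l in zip(rows, labels) if row[attribute] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical structural attributes extracted from email headers and bodies.
emails = [
    {"has_html": True,  "sender_known": False, "many_recipients": True},
    {"has_html": True,  "sender_known": False, "many_recipients": False},
    {"has_html": False, "sender_known": False, "many_recipients": True},
    {"has_html": True,  "sender_known": True,  "many_recipients": False},
    {"has_html": False, "sender_known": True,  "many_recipients": False},
    {"has_html": False, "sender_known": True,  "many_recipients": True},
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

gains = {a: information_gain(emails, labels, a) for a in emails[0]}
best_attribute = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name}: {gain:.3f}")
print("root split:", best_attribute)
```

In this toy data sender_known separates the classes perfectly, so it is chosen as the
root. Working from a handful of structural attributes rather than the full message
text is what keeps the computation cost low.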