Lec-6 Spam-1

Uploaded by

Adish garg

SPAM Detection

SPAM
• Originating from the name of Hormel's canned meat, "spam" now also
refers to junk e-mail or irrelevant postings to a newsgroup or
bulletin board.
• The unsolicited e-mail messages you receive about refinancing
your home, reversing aging, and losing those extra pounds are
all considered to be spam.
• Spamming other people is definitely not cool and is one of the
most notorious violations of Internet etiquette (or
"netiquette").
• So if you ever get the urge to let thousands of people know
about that hot new guaranteed way to make money on the
Internet, please reconsider.
One Solution to Spam Detection
• Machine Learning
– Learn spam versus good/ham

• Naïve Bayes

Advantages of Bayesian Method
• The Bayesian approach is self-adapting: it keeps learning from new
spam.
• The Bayesian method takes the whole message into account.
• The Bayesian method is easy to use and very accurate (claimed accuracy
is 97%).
• The Bayesian approach is multilingual.
• It reduces the number of false positives.

A Spam Filter
• Naïve Bayes spam filter
• Data:
– Collection of emails, labeled spam or ham
– Note: someone has to hand label all this data!
– Split into training, testing sets
• Classifiers
– Learn on the training set
– Test it on new emails

Example emails:
• Dear Sir. First, I must solicit your confidence in this transaction, this is by virture of its nature as being utterly confidencial and top secret. …
• TO BE REMOVED FROM FUTURE MAILINGS, SIMPLY REPLY TO THIS MESSAGE AND PUT "REMOVE" IN THE SUBJECT. 99 MILLION EMAIL ADDRESSES FOR ONLY $99
• Ok, I know this is blatantly OT but I'm beginning to go insane. Had an old Dell Dimension XPS sitting in the corner and decided to put it to use, I know it was working pre being stuck in the corner, but when I plugged it in, hit the power nothing happened.
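The data steps on this slide (labeled emails, split into training and testing sets) can be sketched as below. This is a minimal illustration, not the lecture's own code: the function name `split_labeled_emails`, the 25% test fraction, and the placeholder email texts are all assumptions made for the example.

```python
import random

def split_labeled_emails(emails, test_fraction=0.25, seed=0):
    """Shuffle (text, label) pairs and split them into (train, test) lists."""
    shuffled = list(emails)                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split reproducible
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Placeholder hand-labeled data (hypothetical texts, alternating labels).
labeled = [(f"email {i}", "spam" if i % 2 else "ham") for i in range(8)]

train, test = split_labeled_emails(labeled)
print(len(train), "training emails,", len(test), "test emails")
```

The classifier is then fit on `train` only, and `test` is held back to estimate how it behaves on new emails.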
Posterior: later in time

Prior: coming before or earlier

Discrete example
Separate spam from valid email, attributes=words
• D1: “send us your password” spam
• D2: “send us your review” ham
• D3: “review password” ham
• D4: “review us” spam
• D5: “send your password” spam
• D6: “send us your account” spam
Construct Vocabulary

Word       P(word | spam)   P(word | ham)
password   2/4              1/2
review     1/4              2/2
send       3/4              1/2
us         3/4              1/2
your       3/4              1/2
account    1/4              0/2
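As a rough check, the table can be recomputed by counting, for each class, the fraction of emails that contain each word. The sketch below assumes D3 is “review password”, the form that agrees with the ham column; the names `TRAINING`, `VOCABULARY`, and `word_probabilities` are made up for this example.

```python
from fractions import Fraction

# The six labeled emails; D3 is taken as "review password" (assumption, see lead-in).
TRAINING = [
    ("send us your password", "spam"),
    ("send us your review", "ham"),
    ("review password", "ham"),
    ("review us", "spam"),
    ("send your password", "spam"),
    ("send us your account", "spam"),
]
VOCABULARY = ["password", "review", "send", "us", "your", "account"]

def word_probabilities(training, vocabulary):
    """Return {label: {word: fraction of that label's emails containing the word}}."""
    docs_by_label = {}
    for text, label in training:
        docs_by_label.setdefault(label, []).append(set(text.split()))
    return {
        label: {w: Fraction(sum(w in d for d in docs), len(docs)) for w in vocabulary}
        for label, docs in docs_by_label.items()
    }

probs = word_probabilities(TRAINING, VOCABULARY)
for w in VOCABULARY:
    # Fraction reduces automatically, so 2/4 prints as 1/2 and 2/2 as 1.
    print(f"{w:10s} {str(probs['spam'][w]):>5s} {str(probs['ham'][w]):>5s}")
```

Note that “account” never appears in a ham email, so its ham probability is exactly 0; the presence/absence model used in this example still gives a nonzero likelihood whenever that word is absent from the message.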

Class priors, estimated from the six labeled emails (4 spam, 2 ham):

P(spam) = 4/6
P(ham) = 2/6
Naïve Bayes
• Want P( spam | words)
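• Use Bayes' rule:

$$P(\text{spam} \mid \text{words}) = \frac{P(\text{words} \mid \text{spam})\, P(\text{spam})}{P(\text{words})}$$

$$P(\text{words}) = P(\text{words} \mid \text{spam})\, P(\text{spam}) + P(\text{words} \mid \text{ham})\, P(\text{ham})$$

• Assume independence: the probability of each word is independent of the others

$$P(\text{words} \mid \text{spam}) = P(\text{word}_1 \mid \text{spam}) \times P(\text{word}_2 \mid \text{spam}) \times \cdots \times P(\text{word}_n \mid \text{spam})$$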
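The three formulas combine into a short computation: multiply the per-word likelihoods for each class, weight by the prior, and normalize by the sum over classes. The sketch below is one possible way to write that; the function name `naive_bayes_posterior` and the numbers passed to it are hypothetical and chosen only to exercise the code.

```python
from math import prod

def naive_bayes_posterior(word_likelihoods, priors):
    """Posterior P(class | words) from per-word likelihoods and class priors.

    word_likelihoods: {class: [P(word_1 | class), ..., P(word_n | class)]}
    priors:           {class: P(class)}
    """
    # Independence assumption: P(words | class) is the product of per-word terms.
    joint = {c: prod(word_likelihoods[c]) * priors[c] for c in priors}
    evidence = sum(joint.values())  # P(words), summed over all classes
    return {c: joint[c] / evidence for c in joint}

# Hypothetical likelihoods and priors, not taken from the slides.
posterior = naive_bayes_posterior(
    {"spam": [0.8, 0.5], "ham": [0.2, 0.5]},
    {"spam": 0.5, "ham": 0.5},
)
print(posterior)
```

The function assumes at least one class has a nonzero joint probability; otherwise the normalization would divide by zero.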

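New email: “review us now”

• “now” is not in the vocabulary, so it is ignored and the message is scored as “review us”.
• In vocabulary order (password, review, send, us, your, account), the message is the presence vector (0,1,0,1,0,0).
• Priors: P(spam) = 4/6, P(ham) = 2/6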

P(review us | spam) = P(0,1,0,1,0,0 | spam) = (1-2/4)(1/4)(1-3/4)(3/4)(1-3/4)(1-1/4) ≈ 0.0044

P(review us | ham) = P(0,1,0,1,0,0 | ham) = (1-1/2)(2/2)(1-1/2)(1/2)(1-1/2)(1-0/2) = 0.0625
$$P(\text{ham} \mid \text{words}) = \frac{P(\text{words} \mid \text{ham})\, P(\text{ham})}{P(\text{words})}$$

P(ham | review us) = (0.0625 × 2/6) / (0.0625 × 2/6 + 0.0044 × 4/6) ≈ 0.88
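The arithmetic can be checked with exact fractions. The sketch below takes the per-word probabilities straight from the vocabulary table (so the absent word “account” contributes 1 − 0/2 = 1 for ham) and uses p for a present word and 1 − p for an absent one; the variable and function names are illustrative only.

```python
from fractions import Fraction as F

VOCABULARY = ["password", "review", "send", "us", "your", "account"]
# Per-word probabilities in vocabulary order, copied from the table.
P_WORD = {
    "spam": [F(2, 4), F(1, 4), F(3, 4), F(3, 4), F(3, 4), F(1, 4)],
    "ham":  [F(1, 2), F(2, 2), F(1, 2), F(1, 2), F(1, 2), F(0, 2)],
}
PRIOR = {"spam": F(4, 6), "ham": F(2, 6)}

def likelihood(present, label):
    """P(presence vector | label): p if the word is present, 1 - p if absent."""
    result = F(1)
    for is_present, p in zip(present, P_WORD[label]):
        result *= p if is_present else 1 - p
    return result

message = "review us now".split()
present = [w in message for w in VOCABULARY]  # "now" is outside the vocabulary

joint = {c: likelihood(present, c) * PRIOR[c] for c in PRIOR}
p_ham = joint["ham"] / sum(joint.values())

print(likelihood(present, "spam"))  # 9/2048, about 0.0044
print(likelihood(present, "ham"))   # 1/16 = 0.0625
print(p_ham, float(p_ham))          # 64/73, about 0.877
```

Under these assumptions the exact posterior is 64/73, so the classifier labels “review us” as ham even though the identical training email D4 was labeled spam.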

Is this correct?
