Algorithms for Duplicate Documents (Princeton)

Published by: .xml on Nov 19, 2011
Copyright:Attribution Non-commercial

• [B ’98]
• Advantages
    - Simpler math ⇒ better understanding
    - Better for filtering
• Disadvantage
    - Time consuming
• Similar approach independently proposed by [Indyk & Motwani ’99]


A. Broder, Algorithms for near-duplicate documents, February 18, 2005

Sketch construction
