Algorithms for Duplicate Documents (Princeton)





• Linear transformations are not good in the worst case but work reasonably well in practice.
    ◦ See [BCFM ’97], [Bohman, Cooper, & Frieze ’00]

• Matrix transformations
    ◦ [B & Feige ’00]

• Some code available from http://www.icsi.berkeley.edu/~zhao/minwise/ [Zhao ’05]
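
As a rough illustration of the linear transformations mentioned above, the following Python sketch uses random maps of the form x -> (a*x + b) mod p in place of truly random permutations and compares the per-map minima of two sets. The prime P, the function names, the number of maps, and the sample sets are all assumptions made for this example; they are not taken from the slides or from the [Zhao ’05] code.

```python
import random

# Assumption: a Mersenne prime comfortably larger than any element id.
P = (1 << 61) - 1


def make_linear_hashes(k, seed=0):
    """Draw k random linear maps x -> (a*x + b) mod P with a != 0."""
    rng = random.Random(seed)
    return [(rng.randrange(1, P), rng.randrange(P)) for _ in range(k)]


def minhash_signature(items, hashes):
    """For each linear map, keep the smallest image over the (non-empty) set."""
    return [min((a * x + b) % P for x in items) for a, b in hashes]


def estimate_resemblance(sig_a, sig_b):
    """Fraction of positions where the two signatures agree."""
    matches = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return matches / len(sig_a)


if __name__ == "__main__":
    rng = random.Random(1)
    ids = rng.sample(range(10**6), 1500)  # hypothetical shingle ids
    doc_a = set(ids[:1000])
    doc_b = set(ids[500:])  # shares 500 of the 1500 distinct ids with doc_a
    hashes = make_linear_hashes(200)
    sig_a = minhash_signature(doc_a, hashes)
    sig_b = minhash_signature(doc_b, hashes)
    print(estimate_resemblance(sig_a, sig_b))
```

With these sample sets the true resemblance is 500/1500, so the printed estimate should land in the neighbourhood of 1/3. Because each map is a bijection modulo P, identical sets always agree in every position and disjoint sets never do; the approximation only enters for partially overlapping sets, which is where the worst-case caveat on linear transformations applies.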


A. Broder – Algorithms for near-duplicate documents, February 18, 2005

The filtering mechanism
