Recent Documents


Removing Manually Generated Boilerplate from Electronic Texts: Experiments with Project Gutenberg e

Collaborative work on unstructured or semistructured documents, such as in literature corpora or source code, often involves agreed upon templates containing metadata. These templates are not consistent across users and over time. Rule-based parsing of these templates is expensive to maintain and tends to fail as new documents are added. Statistical techniques based on frequent occurren...
  • PDF, 4 Pages
  • 880 views
  • 0 Comments

A technique for isolating differences between files

A simple algorithm is described for isolating the differences between two files. One application is the comparing of two versions of a source program or other file in order to display all differences. The algorithm isolates differences in a way that corresponds closely to our intuitive notion of difference, is easy to implement, and is computationally efficient, with time linear in the file len...
  • PDF, 5 Pages
  • 673 views
  • 2 Comments
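
The application named in the abstract above, comparing two versions of a file to display their differences, can be illustrated with a minimal sketch. This is not the algorithm from the listed paper: it assumes Python's standard `difflib.SequenceMatcher` as a stand-in matcher, and the helper name `changed_lines` is hypothetical.

```python
import difflib


def changed_lines(old_text, new_text):
    """Return the non-matching regions between two versions of a text.

    Assumption: difflib's matcher is used as a stand-in; it is not the
    linear-time algorithm described in the listed paper.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # autojunk=False disables difflib's popularity heuristic so that
    # frequently repeated lines are still considered for matching.
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # tag is one of "replace", "delete", or "insert"
            changes.append((tag, old_lines[i1:i2], new_lines[j1:j2]))
    return changes


if __name__ == "__main__":
    for tag, old, new in changed_lines("a\nb\nc", "a\nx\nc"):
        print(tag, old, new)
```

Each returned tuple holds the kind of change, the affected lines from the old version, and the corresponding lines from the new version; unchanged regions are omitted.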

Collections

No collections

Comments

No comments

joyceschan


2 Documents

1553 Viewers

3 Likes

Basic Info

Friends

No friends yet.