
Matching records with titles

Make an out-file based on the title field

Next, run Edit out-files/Keep only 0-9 and A to Z characters
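For readers who want to reproduce this cleaning step outside the program, here is a minimal Python sketch. It is not the program's own code. It assumes that titles are upper-cased first, that every removed character is replaced by a blank, and that runs of blanks are collapsed; the function name keep_alnum is made up for illustration.

```python
import re

def keep_alnum(title: str) -> str:
    """Upper-case a title and keep only 0-9, A-Z and single blanks.

    Assumption: removed characters become blanks so that words
    stay separated for the later word-by-word steps.
    """
    cleaned = re.sub(r"[^0-9A-Z ]", " ", title.upper())
    # Collapse repeated blanks and strip leading/trailing ones.
    return " ".join(cleaned.split())

print(keep_alnum("Co-citation analysis: a review (2nd ed.)"))
# prints: CO CITATION ANALYSIS A REVIEW 2ND ED
```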


The atz-file now contains only the characters 0-9 and A to Z.
The next step is to truncate the words to a maximum length:
1. Select the atz-file and run Edit out-files/Decompress blank separated out-file
2. Type, for example, 5 in the Min number box
3. Select the nnu-file and run Edit out-files/Keep only first n characters
4. Select the chr-file and run Edit out-files/Compress outfile (1 row per field)
and then answer no for semicolon separation
The nou-file now has titles with words no longer than 5 characters.
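The split, truncate and rejoin sequence above can be sketched in a few lines of Python. This only illustrates the idea on a plain string; it does not read or write the out-file formats, and the function name truncate_words is hypothetical.

```python
def truncate_words(title: str, n: int = 5) -> str:
    """Cut every blank-separated word in a title to at most n characters."""
    words = title.split()               # "decompress": one word per item
    short = [word[:n] for word in words]  # keep only the first n characters
    return " ".join(short)              # "compress": back to one row per title

print(truncate_words("CO CITATION ANALYSIS A REVIEW"))
# prints: CO CITAT ANALY A REVIE
```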
Let’s assume we wish to match the truncated titles with the full titles:
1. Select the atz-file and run Edit out-files/Swap two columns
2. Make sure the swp-file is in the box “Type new file name here” and
then select the nou-file (with truncated titles)
3. Run Add data classify/Match abbreviated words in full string
and answer no to finding all matches
4. The cll-file has the full titles to the right. Of course we get 100
percent hits, since the truncated titles are matched against their own full titles.
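To make the matching idea concrete, here is a small Python sketch. How the program decides that an abbreviated title matches a full string is not described here, so the rule below is an assumption: a truncated title matches when it has the same number of words as the full title and each truncated word is a prefix of the word in the same position. Only the first match per truncated title is kept, mirroring the "no" answer to finding all matches. The sample titles and all names are made up.

```python
def matches_truncated(truncated: str, full: str) -> bool:
    """Assumed rule: word-by-word prefix match with equal word counts."""
    short_words = truncated.split()
    full_words = full.split()
    if len(short_words) != len(full_words):
        return False
    return all(f.startswith(s) for s, f in zip(short_words, full_words))

# Hypothetical sample data: full titles and their truncated forms.
full_titles = ["CO CITATION ANALYSIS A REVIEW", "MAPPING SCIENCE WITH KEYWORDS"]
truncated_titles = ["CO CITAT ANALY A REVIE", "MAPPI SCIEN WITH KEYWO"]

first_match = {}
for short in truncated_titles:
    # Keep only the first full title that matches, or None if there is none.
    first_match[short] = next(
        (full for full in full_titles if matches_truncated(short, full)), None
    )

hits = sum(1 for full in first_match.values() if full is not None)
hit_rate = hits / len(truncated_titles)
print(hit_rate)  # prints: 1.0
```

Because both lists come from the same titles, every truncated title finds its full form, which corresponds to the 100 percent hit rate mentioned in step 4.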
In a more realistic setting you may need to match titles from
two different sets, for example Web of Science titles with
those in a university's publication database. Then you
will probably not find all titles from one set represented in the
other. The not-file will contain the missing titles in truncated
form.
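The two-set case can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's procedure: both sets are assumed to be cleaned already, titles are compared by their truncated forms, and the unmatched list plays the role of the not-file. The function names and the sample titles are hypothetical.

```python
def truncate_title(title: str, n: int = 5) -> str:
    """Cut every word in a title to at most n characters."""
    return " ".join(word[:n] for word in title.split())

def split_matches(set_a, set_b, n=5):
    """Match titles in set_a against set_b via their truncated forms.

    Returns (matched, missing): matched maps a truncated title from
    set_a to a full title in set_b; missing lists the truncated
    titles from set_a that have no counterpart in set_b.
    """
    keys_b = {}
    for title in set_b:
        # Keep the first full title seen for each truncated key.
        keys_b.setdefault(truncate_title(title, n), title)

    matched, missing = {}, []
    for title in set_a:
        key = truncate_title(title, n)
        if key in keys_b:
            matched[key] = keys_b[key]
        else:
            missing.append(key)
    return matched, missing

# Hypothetical sample data standing in for the two sources.
wos_titles = ["CO CITATION ANALYSIS A REVIEW", "MAPPING SCIENCE WITH KEYWORDS"]
local_titles = ["CO CITATION ANALYSIS A REVIEW"]

matched, missing = split_matches(wos_titles, local_titles)
print(matched)  # prints: {'CO CITAT ANALY A REVIE': 'CO CITATION ANALYSIS A REVIEW'}
print(missing)  # prints: ['MAPPI SCIEN WITH KEYWO']
```

Note that matching on truncated forms can pair two different titles whose words share the same first n characters, so a larger n gives fewer false matches at the cost of less tolerance for spelling variants.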
