Working with Large Datasets in VantagePoint

Dave Schoeneck, Senior Research Analyst, Search Technology, Inc.

Overview:

When you work with datasets of more than ~20,000 records in VantagePoint (VP), you may see an error message saying that you are running out of RAM. The following guidelines may help you free up system memory and continue working. Which guidelines to apply depends on your analytical needs and where you are in the workflow. Strategies discussed in this document include:

• Use a 64-bit operating system and install the maximum amount of RAM supported by your computer.
• Close other programs that are not essential to your analysis.
• Import a small number of fields at first; use “Import More Fields” to add other fields later.
• Use VantagePoint’s “Memory Manager” tool.

Use a 64-bit OS and Install the Maximum RAM

VantagePoint is a 32-bit application and is subject to the per-process memory limits of the operating system. These limits apply regardless of how much physical memory the computer has installed. On a 32-bit version of Windows, the maximum amount of memory that VP can use is 2 GB. On a 64-bit Windows system, VantagePoint can use up to 3 GB.

Close Non-essential Programs and *.vpt Files

If you have other applications running that are not essential to your workflow, close them to make more system memory available to VP. If you have more than one VantagePoint data file (*.vpt) open, close all open data files except the one in which you are currently working.

Maintain a Dataset with as Few Fields as Possible

When you maintain a dataset with only the essential fields, you also keep the size (in MB) of the *.vpt file on disk as small as possible. This is especially important when you import raw data files: it is advisable to import only the “Title” field at first, so that you do not run out of memory before you save the *.vpt file to disk. Once you have saved your dataset as a *.vpt file, exit and restart VP (to free up as much memory as possible) and open your saved dataset. You can then use “Import More Fields” (from VP’s “Fields” menu) to add the other fields you need.

Use discretion when choosing which fields to add. Whenever possible, avoid importing fields with long text (e.g., Patent Claims, Abstracts). Fields with a very large number of items also consume a lot of system resources. Examples of such fields include:

• Fields with “NLP” Words or Phrases

• “Cited References” fields and fields derived from Cited References (e.g., “Cited Authors” or “Cited Journals”)
• Authors, Inventors, Full Organization Names, or fields with uncontrolled vocabulary terms (see note below)

Note: Fields that include a lot of items also tend to have long tails on their record frequency distributions. That is, a vast majority of the terms will occur in only one or two records. When this is the case, consider creating a group of all terms that occur in at least N records. You can then use “Create Field using Group Items” to make a new field with far fewer items, and delete the originating, much larger field.

Delete existing large fields that are not in use, but only if they can be readily imported again using “Import More Fields.” Use caution not to delete fields that have Groups you want to keep, or “Cleaned” fields. “Cleaned” fields cannot be readily re-imported with “Import More Fields” [1], but the originating field on which the cleaning was done can usually be safely deleted. You can use the “Minimize Memory Use” button after deleting fields, lists, matrices, or other types of sheets to make more system memory available.

[1] If you saved your List Cleanup work as a thesaurus, you can re-import the original field and run your saved thesaurus on that field.

Use VantagePoint’s Memory Manager Tool

VantagePoint includes a Memory Manager tool, which you can use to unload fields from memory when they are not in use. This feature is accessible only through a user-defined keyboard shortcut, which you will need to set up. Instructions for setting up a hotkey are as follows.

How To: Set up a Hotkey to Open VantagePoint’s Memory Manager Tool

VantagePoint’s Memory Manager tool can be accessed only through a keyboard shortcut. The following illustrations walk through the steps to set a hotkey. Launch VantagePoint and select: Tools > Edit Keyboard Shortcuts

Figure 1

Follow the steps in Figure 2 to configure the hotkey:

Figure 2

Press the hotkey combination to open the Memory Manager window. Fields can be unloaded from memory one by one, or you can use the “Minimize Memory Use” button to unload all fields that are not loaded in detail windows or used by the currently viewed list, matrix, or other sheet.

Figure 3
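
The earlier note about fields with long-tailed record frequency distributions suggests keeping only the terms that occur in at least N records. The following Python sketch illustrates that idea outside VantagePoint; it is not VantagePoint code and does not use any VantagePoint API. The function names, the sample records, and the threshold of 2 are all hypothetical, chosen only to show how much a field can shrink once the long tail is dropped.

```python
from collections import Counter


def terms_in_at_least_n_records(records, n):
    """Return the set of terms that occur in at least n records.

    `records` is a list of records, each a list of terms for one field
    (hypothetical stand-in for a field such as Authors).
    """
    counts = Counter()
    for terms in records:
        # Count each term once per record, even if it repeats within it.
        counts.update(set(terms))
    return {term for term, count in counts.items() if count >= n}


def reduce_field(records, keep):
    """Build a smaller field that retains only the terms in `keep`."""
    return [[term for term in terms if term in keep] for terms in records]


# Hypothetical sample data: four records with author names.
records = [
    ["Smith, J", "Lee, K"],
    ["Smith, J", "Garcia, M"],
    ["Smith, J", "Lee, K", "Nowak, P"],
    ["Chen, L"],
]

keep = terms_in_at_least_n_records(records, 2)
reduced = reduce_field(records, keep)

all_terms = {term for terms in records for term in terms}
print(f"Distinct terms before: {len(all_terms)}, after: {len(keep)}")
```

In this sample, three of the five distinct names appear in only one record, so a threshold of N = 2 removes more than half of the items while keeping the names that carry most of the record counts. This is analogous to creating a group of frequent terms and then building a new, smaller field from that group.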