
__________________________________________________________________________

FACULTY OF SCIENCE AND TECHNOLOGY

COURSEWORK FOR THE BSC (HONS) INFORMATION SYSTEMS & BSC (HONS) IN INFORMATION TECHNOLOGY; YEAR 3
ACADEMIC SESSION MARCH 2013; SEMESTER 7 and 8

BIS 3218: Business Intelligence
BIS 3216 / CTB 3201: Data Mining & Knowledge Discovery

DEADLINE: 5th of JULY 2013
PRESENTATION: 8th - 19th of JULY 2013

STUDENT NAME: __________________________________________________________
NRIC/PASSPORT NO: ______________________________________________________

INSTRUCTIONS TO CANDIDATES
• This assignment carries 100 marks. It is a group assignment with an individual oral presentation and a question-and-answer session.
• The group should consist of no more than 4 people.

IMPORTANT
The University requires students to adhere to submission deadlines for any form of assessment. Penalties are applied in relation to unauthorized late submission of work.
- Coursework submitted after the deadline but within 1 week will be accepted for a maximum mark of 40%.
- Work handed in following the extension of 1 week after the original deadline will be regarded as a non-submission and marked zero.

Lecturer’s Remark (Use additional sheet if required)

I .............................. (Name) ................... std. ID received the assignment and read the comments.

....................................... (Signature/date)

Academic Honesty Acknowledgement

“I ......................................... (student name) verify that this paper contains entirely my own work. I have not consulted with any outside person or materials other than what was specified (an interviewee, for example) in the assignment or the syllabus requirements. Further, I have not copied or inadvertently copied ideas, sentences, or paragraphs from another student. I realize the penalties (refer to page 16, 5.5, Appendix 2, page 44 of the student handbook diploma and undergraduate programme) for any kind of copying or collaboration on any assignment.”

.................................. (Student’s signature / Date)
Updated: SCT/020211

GROUP ASSIGNMENT

The goal of this project is to estimate the return from a direct mailing in order to maximize donation profits.

+--------------------------------------------------------------------+
| PROJECT OVERVIEW: A Fund Raising Net Return Prediction Model       |
+--------------------------------------------------------------------+

BACKGROUND AND OBJECTIVES
-------------------------
The data set for this project has been generously provided by the Paralyzed Veterans of America (PVA). PVA is a not-for-profit organization that provides programs and services for US veterans with spinal cord injuries or disease. With an in-house database of over 13 million donors, PVA is also one of the largest direct mail fund raisers in the country.

You are required to demonstrate the performance of your model by analyzing the results of one of PVA's recent fund raising appeals. This mailing was sent to a total of 3.5 million PVA donors who were on the PVA database as of June 1997. Everyone included in this mailing had made at least one prior donation to PVA.

The mailing included a gift (or "premium") of personalized name & address labels plus an assortment of 10 note cards and envelopes. All of the donors who received this mailing were acquired by PVA through similar premium-oriented appeals such as this.

One group that is of particular interest to PVA is "Lapsed" donors. These are individuals who made their last donation to PVA 13 to 24 months ago. They represent an important group to PVA, since the longer someone goes without donating, the less likely they will be to give again. Therefore, recapture of these former donors is a critical aspect of PVA's fund raising efforts.

However, PVA has found that there is often an inverse correlation between likelihood to respond and the dollar amount of the gift, so a straight response model (a classification or discrimination task) will most likely net only very low dollar donors. High dollar donors will fall into the lower deciles, which would most likely be suppressed from future mailings. The lost revenue of these suppressed donors would then offset any gains due to the increased response rate of the low dollar donors.

Therefore, to improve the cost-effectiveness of future direct marketing efforts, PVA wishes to develop a model that will help them maximize the net revenue (a regression or estimation task) generated from future renewal mailings to Lapsed donors.

POPULATION
----------
The population for this analysis will be Lapsed PVA donors who received the June '97 renewal mailing (appeal code "97NK"). Therefore, the analysis data set contains a subset of the total universe who received the mailing.

The analysis file includes all 191,779 Lapsed donors who received the mailing, with responders to the mailing marked with a flag in the TARGET_B field. The total dollar amount of each responder's gift is in the TARGET_D field. The overall response rate for this direct mail promotion is 5.1%.

The distribution of the target fields in the learning and validation files is as follows:


Learning Data Set

Target Variable: Binary Indicator of Response to 97NK Mailing

                                    Cumulative  Cumulative
TARGET_B   Frequency    Percent     Frequency    Percent
------------------------------------------------------
       0       90569       94.9         90569       94.9
       1        4843        5.1         95412      100.0

Target Variable: Donation Amount (in $) to 97NK Mailing

Variable        N         Mean    Minimum        Maximum
------------------------------------------------------
TARGET_D    95412    0.7930732          0    200.0000000
------------------------------------------------------

Validation Data Set

Target Variable: Binary Indicator of Response to 97NK Mailing

                                    Cumulative  Cumulative
TARGET_B   Frequency    Percent     Frequency    Percent
------------------------------------------------------
       0       91494       94.9         91494       94.9
       1        4873        5.1         96367      100.0

Target Variable: Donation Amount (in $) to 97NK Mailing

Variable        N         Mean    Minimum        Maximum
------------------------------------------------------
TARGET_D    96367    0.7895819          0    500.0000000
------------------------------------------------------

The average donation amount (in $) among the responders is:

Learning Data Set
Target Variable: Donation Amount (in $) to 97NK Mailing

    N          Mean      Minimum        Maximum
-----------------------------------------------
 4843    15.6243444    1.0000000    200.0000000
-----------------------------------------------

Validation Data Set
Target Variable: Donation Amount (in $) to 97NK Mailing

    N          Mean      Minimum        Maximum
-----------------------------------------------
 4873    15.6145372    0.3200000    500.0000000
-----------------------------------------------
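The two sets of tables above are linked by simple arithmetic: the mean of TARGET_D over everyone mailed equals the response rate multiplied by the mean gift among responders. The following minimal Python sketch checks that relationship using only the figures printed in the tables; the dictionary and function names are illustrative and not part of the data set.

```python
# Figures copied from the frequency and summary tables in the brief.
LEARNING = {"mailed": 95412, "responders": 4843, "mean_gift_responders": 15.6243444}
VALIDATION = {"mailed": 96367, "responders": 4873, "mean_gift_responders": 15.6145372}


def response_rate(stats):
    """Share of mailed donors who responded (TARGET_B = 1)."""
    return stats["responders"] / stats["mailed"]


def mean_gift_per_piece(stats):
    """Average TARGET_D over all mailed donors.

    Non-responders contribute $0, so the overall mean is the
    response rate times the mean gift among responders.
    """
    return response_rate(stats) * stats["mean_gift_responders"]


for name, stats in (("learning", LEARNING), ("validation", VALIDATION)):
    print(
        f"{name}: response rate {response_rate(stats):.1%}, "
        f"mean TARGET_D per piece ${mean_gift_per_piece(stats):.4f}"
    )
```

The computed per-piece means should agree with the TARGET_D means reported in the tables (about $0.79 in both files), which is a quick sanity check to run after loading the data.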


COST MATRIX
-----------
The package cost (including the mail cost) is $0.68 per piece mailed.

ANALYSIS TIME FRAME AND REFERENCE DATE
--------------------------------------
The 97NK mailing was sent out in June 1997. All information included in the file (excluding the giving history date fields) is reflective of behavior prior to 6/97. This date may be used as the reference date in generating the "number of months since" or "time since" or "elapsed time" variables. The participants could also find the reference date information in the field ADATE_2. This field contains the dates the 97NK promotion was mailed.

+--------------------------------------------------------------------+
| EVALUATION RULES                                                   |
+--------------------------------------------------------------------+

Once again, the objective of the analysis will be to maximize the net revenue generated from this mailing (a censored regression or estimation problem). The response variable is, thus, continuous (for the lack of a better common term.) Although we are releasing the binary and the continuous versions of the target variable (TARGET_B and TARGET_D respectively), the program committee will use the predicted value of the donation (dollar) amount (for the target variable TARGET_D) in evaluating the results. So, returning the predicted value of the binary target variable TARGET_B and its associated probability/strength will not be sufficient.

The typical outcome of predictive modeling in database marketing is an estimate of the expected response/return per customer in the database. A marketer will mail to a customer so long as the expected return from an order exceeds the cost invested in generating the order, i.e., the cost of promotion. For our purpose, the package cost (including the mail cost) is $0.68 per piece mailed.
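The mailing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the brief says a donor is mailed when the expected return exceeds the $0.68 package cost, and the sketch assumes net revenue is then the sum of (actual gift minus cost) over the donors selected. The function names and the toy numbers are hypothetical.

```python
PACKAGE_COST = 0.68  # dollars per piece mailed, as stated in the cost matrix


def should_mail(predicted_donation, cost=PACKAGE_COST):
    """Mail only when the predicted TARGET_D exceeds the cost of the package."""
    return predicted_donation > cost


def net_revenue(predicted, actual, cost=PACKAGE_COST):
    """Assumed scoring: sum of (actual gift - cost) over the donors chosen for mailing."""
    return sum(a - cost for p, a in zip(predicted, actual) if should_mail(p, cost))


# Hypothetical toy example: four donors with predicted and actual donations.
predicted = [0.20, 1.50, 0.90, 0.10]
actual = [0.00, 20.00, 0.00, 5.00]

print(f"Net revenue from selected donors: ${net_revenue(predicted, actual):.2f}")
print(f"Net revenue from mailing everyone: ${sum(a - PACKAGE_COST for a in actual):.2f}")
```

Comparing the model's net revenue with the "mail everyone" baseline is a convenient way to judge whether a model adds value, since a model that suppresses high dollar donors can score below the baseline even with a good response rate.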

+--------------------------------------------------------------------+
| DATA SOURCES and ORDER & TYPE OF THE VARIABLES IN THE DATA SETS    |
+--------------------------------------------------------------------+

The dataset includes:

o 24 months of detailed PVA promotion and giving history (covering the period 12 to 36 months prior to the "97NK" mailing)
o A summary of the promotions sent to the donors over the most recent 12 months prior to the "97NK" mailing (by definition, none of these donors responded to any of these promotions)
o Summary variables reflecting each donor's lifetime giving history (e.g., total # of donations prior to "97NK" mailing, total $ amount of the donations, etc.)
o Overlay demographics, including a mix of household and area level data
o All other available data from the PVA database (e.g., date of first gift, state, origin source, etc.)

The names of the variables in the learning and validation data sets are included in each file as the top (header) record. For your information, they are listed below again (ordered by data set position) along with the field type information (Num: numeric, Char: string/character.)

Field Name  Type
----------  ----
ODATEDW   Num
OSOURCE   Char
TCODE     Num
STATE     Char
ZIP       Char
MAILCODE  Char
PVASTATE  Char
DOB       Num
NOEXCH    Char
RECINHSE  Char
RECP3     Char
RECPGVG   Char
RECSWEEP  Char
MDMAUD    Char
DOMAIN    Char
CLUSTER   Char
AGE       Num
AGEFLAG   Char
HOMEOWNR  Char
CHILD03   Char
CHILD07   Char
CHILD12   Char
CHILD18   Char
NUMCHLD   Num
INCOME    Num
GENDER    Char
WEALTH1   Num
HIT       Num
MBCRAFT   Num
MBGARDEN  Num
MBBOOKS   Num
MBCOLECT  Num
MAGFAML   Num
MAGFEM    Num

MAGMALE   Num
PUBGARDN  Num
PUBCULIN  Num
PUBHLTH   Num
PUBDOITY  Num
PUBNEWFN  Num
PUBPHOTO  Num
PUBOPP    Num
DATASRCE  Char
MALEMILI  Num
MALEVET   Num
VIETVETS  Num
WWIIVETS  Num
LOCALGOV  Num
STATEGOV  Num
FEDGOV    Num
SOLP3     Char
SOLIH     Char
MAJOR     Char
WEALTH2   Num
GEOCODE   Char
COLLECT1  Char
VETERANS  Char
BIBLE     Char
CATLG     Char
HOMEE     Char
PETS      Char
CDPLAY    Char
STEREO    Char
PCOWNERS  Char
PHOTO     Char
CRAFTS    Char
FISHER    Char
GARDENIN  Char
BOATS     Char
WALKER    Char
KIDSTUFF  Char
CARDS     Char
PLATES    Char
LIFESRC   Char
PEPSTRFL  Char
POP901    Num
POP902    Num
POP903    Num
POP90C1   Num
POP90C2   Num
POP90C3   Num
POP90C4   Num
POP90C5   Num
ETH1      Num
ETH2      Num
ETH3      Num
ETH4      Num
ETH5      Num
ETH6      Num
ETH7      Num
ETH8      Num
ETH9      Num
ETH10     Num
ETH11     Num
ETH12     Num
ETH13     Num
ETH14     Num
ETH15     Num
ETH16     Num
AGE901    Num
AGE902    Num
AGE903    Num

AGE904    Num
AGE905    Num
AGE906    Num
AGE907    Num
CHIL1     Num
CHIL2     Num
CHIL3     Num
AGEC1     Num
AGEC2     Num
AGEC3     Num
AGEC4     Num
AGEC5     Num
AGEC6     Num
AGEC7     Num
CHILC1    Num
CHILC2    Num
CHILC3    Num
CHILC4    Num
CHILC5    Num
HHAGE1    Num
HHAGE2    Num
HHAGE3    Num
HHN1      Num
HHN2      Num
HHN3      Num
HHN4      Num
HHN5      Num
HHN6      Num
MARR1     Num
MARR2     Num
MARR3     Num
MARR4     Num
HHP1      Num
HHP2      Num
DW1       Num
DW2       Num
DW3       Num
DW4       Num
DW5       Num
DW6       Num
DW7       Num
DW8       Num
DW9       Num
HV1       Num
HV2       Num
HV3       Num
HV4       Num
HU1       Num
HU2       Num
HU3       Num
HU4       Num
HU5       Num
HHD1      Num
HHD2      Num
HHD3      Num
HHD4      Num
HHD5      Num
HHD6      Num
HHD7      Num
HHD8      Num
HHD9      Num
HHD10     Num
HHD11     Num
HHD12     Num
ETHC1     Num
ETHC2     Num
ETHC3     Num
ETHC4     Num

ETHC5     Num
ETHC6     Num
HVP1      Num
HVP2      Num
HVP3      Num
HVP4      Num
HVP5      Num
HVP6      Num
HUR1      Num
HUR2      Num
RHP1      Num
RHP2      Num
RHP3      Num
RHP4      Num
HUPA1     Num
HUPA2     Num
HUPA3     Num
HUPA4     Num
HUPA5     Num
HUPA6     Num
HUPA7     Num
RP1       Num
RP2       Num
RP3       Num
RP4       Num
MSA       Num
ADI       Num
DMA       Num
IC1       Num
IC2       Num
IC3       Num
IC4       Num
IC5       Num
IC6       Num
IC7       Num
IC8       Num
IC9       Num
IC10      Num
IC11      Num
IC12      Num
IC13      Num
IC14      Num
IC15      Num
IC16      Num
IC17      Num
IC18      Num
IC19      Num
IC20      Num
IC21      Num
IC22      Num
IC23      Num
HHAS1     Num
HHAS2     Num
HHAS3     Num
HHAS4     Num
MC1       Num
MC2       Num
MC3       Num
TPE1      Num
TPE2      Num
TPE3      Num
TPE4      Num
TPE5      Num
TPE6      Num
TPE7      Num
TPE8      Num
TPE9      Num
PEC1      Num

PEC2      Num
TPE10     Num
TPE11     Num
TPE12     Num
TPE13     Num
LFC1      Num
LFC2      Num
LFC3      Num
LFC4      Num
LFC5      Num
LFC6      Num
LFC7      Num
LFC8      Num
LFC9      Num
LFC10     Num
OCC1      Num
OCC2      Num
OCC3      Num
OCC4      Num
OCC5      Num
OCC6      Num
OCC7      Num
OCC8      Num
OCC9      Num
OCC10     Num
OCC11     Num
OCC12     Num
OCC13     Num
EIC1      Num
EIC2      Num
EIC3      Num
EIC4      Num
EIC5      Num
EIC6      Num
EIC7      Num
EIC8      Num
EIC9      Num
EIC10     Num
EIC11     Num
EIC12     Num
EIC13     Num
EIC14     Num
EIC15     Num
EIC16     Num
OEDC1     Num
OEDC2     Num
OEDC3     Num
OEDC4     Num
OEDC5     Num
OEDC6     Num
OEDC7     Num
EC1       Num
EC2       Num
EC3       Num
EC4       Num
EC5       Num
EC6       Num
EC7       Num
EC8       Num
SEC1      Num
SEC2      Num
SEC3      Num
SEC4      Num
SEC5      Num
AFC1      Num
AFC2      Num
AFC3      Num
AFC4      Num

AFC5      Num
AFC6      Num
VC1       Num
VC2       Num
VC3       Num
VC4       Num
ANC1      Num
ANC2      Num
ANC3      Num
ANC4      Num
ANC5      Num
ANC6      Num
ANC7      Num
ANC8      Num
ANC9      Num
ANC10     Num
ANC11     Num
ANC12     Num
ANC13     Num
ANC14     Num
ANC15     Num
POBC1     Num
POBC2     Num
LSC1      Num
LSC2      Num
LSC3      Num
LSC4      Num
VOC1      Num
VOC2      Num
VOC3      Num
HC1       Num
HC2       Num
HC3       Num
HC4       Num
HC5       Num
HC6       Num
HC7       Num
HC8       Num
HC9       Num
HC10      Num
HC11      Num
HC12      Num
HC13      Num
HC14      Num
HC15      Num
HC16      Num
HC17      Num
HC18      Num
HC19      Num
HC20      Num
HC21      Num
MHUC1     Num
MHUC2     Num
AC1       Num
AC2       Num
ADATE_2   Num
ADATE_3   Num
ADATE_4   Num
ADATE_5   Num
ADATE_6   Num
ADATE_7   Num
ADATE_8   Num
ADATE_9   Num
ADATE_10  Num
ADATE_11  Num
ADATE_12  Num
ADATE_13  Num
ADATE_14  Num

ADATE_15  Num
ADATE_16  Num
ADATE_17  Num
ADATE_18  Num
ADATE_19  Num
ADATE_20  Num
ADATE_21  Num
ADATE_22  Num
ADATE_23  Num
ADATE_24  Num
RFA_2     Char
RFA_3     Char
RFA_4     Char
RFA_5     Char
RFA_6     Char
RFA_7     Char
RFA_8     Char
RFA_9     Char
RFA_10    Char
RFA_11    Char
RFA_12    Char
RFA_13    Char
RFA_14    Char
RFA_15    Char
RFA_16    Char
RFA_17    Char
RFA_18    Char
RFA_19    Char
RFA_20    Char
RFA_21    Char
RFA_22    Char
RFA_23    Char
RFA_24    Char
CARDPROM  Num
MAXADATE  Num
NUMPROM   Num
CARDPM12  Num
NUMPRM12  Num
RDATE_3   Num
RDATE_4   Num
RDATE_5   Num
RDATE_6   Num
RDATE_7   Num
RDATE_8   Num
RDATE_9   Num
RDATE_10  Num
RDATE_11  Num
RDATE_12  Num
RDATE_13  Num
RDATE_14  Num
RDATE_15  Num
RDATE_16  Num
RDATE_17  Num
RDATE_18  Num
RDATE_19  Num
RDATE_20  Num
RDATE_21  Num
RDATE_22  Num
RDATE_23  Num
RDATE_24  Num
RAMNT_3   Num
RAMNT_4   Num
RAMNT_5   Num
RAMNT_6   Num
RAMNT_7   Num
RAMNT_8   Num
RAMNT_9   Num
RAMNT_10  Num

RAMNT_11  Num
RAMNT_12  Num
RAMNT_13  Num
RAMNT_14  Num
RAMNT_15  Num
RAMNT_16  Num
RAMNT_17  Num
RAMNT_18  Num
RAMNT_19  Num
RAMNT_20  Num
RAMNT_21  Num
RAMNT_22  Num
RAMNT_23  Num
RAMNT_24  Num
RAMNTALL  Num
NGIFTALL  Num
CARDGIFT  Num
MINRAMNT  Num
MINRDATE  Num
MAXRAMNT  Num
MAXRDATE  Num
LASTGIFT  Num
LASTDATE  Num
FISTDATE  Num
NEXTDATE  Num
TIMELAG   Num
AVGGIFT   Num
CONTROLN  Num
TARGET_B  Num   /* not included in the validation file */
TARGET_D  Num   /* not included in the validation file */
HPHONE_D  Num
RFA_2R    Char
RFA_2F    Char
RFA_2A    Char
MDMAUD_R  Char
MDMAUD_F  Char
MDMAUD_A  Char
CLUSTER2  Num
GEOCODE2  Char

+--------------------------------------------------------------------+
| SUMMARY STATISTICS (MIN & MAX)                                     |
+--------------------------------------------------------------------+

Summary statistics are provided for the numeric variables only.

                    Learning Data Set            Validation Data Set
                  -------------------------    -------------------------
Variable          Minimum       Maximum        Minimum       Maximum
---------------   -------------------------    -------------------------

+--------------------------------------------------------------------+
| DATA (PRE)PROCESSING                                               |
+--------------------------------------------------------------------+

General
-------
o The field CONTROLN is a unique record identifier (an index) and should not be used in modeling.
o Response flag (field name: TARGET_B) indicates whether or not the lapsed donor responded to the campaign. THIS FIELD SHOULD NOT BE USED DURING MODEL BUILDING.

o Blanks in string or character variables correspond to missing values. Periods and/or blanks in the numeric variables correspond to missing values.

Time Frame and Date Fields
--------------------------
This mailing was mailed to a total of 3.5 million PVA donors who were on the PVA database as of June 1997. All information contained in the analysis dataset reflects the donor status prior to 6/97 (except the gift receipt dates, which will follow the promotion dates.) This date could be used as the "end date" or "reference date" in the calculation of "number of months since" variables.

Fields Containing Constants
---------------------------
Fields containing a constant value (i.e., there is only one value for all the records) should be dropped from the analysis. Attributes containing missing and one valid level (e.g., 'Y') are not considered as constants and should be included in the analysis.

ATTRIBUTE TYPE
--------------
See the data dictionary to determine the attribute types.

Data preprocessing tasks include the following:

Noisy Data
----------
Some of the fields in the analysis file may contain data entry and/or formatting errors. You are expected to clean these fields (without excluding the records.)

Records and Fields with Missing and Sparse Data
-----------------------------------------------
Discovery methods vary in the way they treat the missing values. While some simply disregard missing values or omit the corresponding records, others infer missing values from known values, or treat missing data as a special value to be included additionally in the attribute domain.

For the purposes of KDD-CUP-98 the records and/or fields should not be omitted from analysis because they contain missing data. Instead, the missing data should be inferred from known values (e.g., mean, median, mode, a modeled value, or any other way supported by your tool.) One exception to this rule is the attributes containing 99.5 percent or more missings. You are expected to omit these attributes from the analysis.

You are also expected to drop attributes with 'sparse' distributions. Sparse data occur when the events actually represented in given data make only a very small subset of the event space.

+--------------------------------------------------------------------+
| TERMINOLOGY-GLOSSARY                                               |
+--------------------------------------------------------------------+

[GLOSSARY] For more information on the terminology used throughout this documentation, refer to the questionnaire documentation (file name: cup98QUE.txt.)

o attribute = field = variable = feature
o responders = targets

o non-responders = non-targets
o output = target = dependent variable
o inputs = independent variables
o analysis file = analysis sample = combined learning and validation files

DICTIONARY AND VARIABLES

Variable      Description
------------  ------------------------------------------
ODATEDW       Origin Date. Date of donor's first gift
              to PVA YYMM format (Year/Month).
OSOURCE       Origin Source
              - (Only 1st 3 bytes are used)
              - Defaulted to 00000 for conversion
              - Code indicating which mailing list the
                donor was originally acquired from
              - A nominal or symbolic field.
TCODE         Donor title code
              000    = _
              001    = MR.
              001001 = MESSRS.
              001002 = MR. & MRS.
              002    = MRS.
              002002 = MESDAMES
              003    = MISS
              003003 = MISSES
              004    = DR.
              004002 = DR. & MRS.
              004004 = DOCTORS
              005    = MADAME
              006    = SERGEANT
              009    = RABBI
              010    = PROFESSOR
              010002 = PROFESSOR & MRS.
              010010 = PROFESSORS
              011    = ADMIRAL
              011002 = ADMIRAL & MRS.
              012    = GENERAL
              012002 = GENERAL & MRS.
              013    = COLONEL
              013002 = COLONEL & MRS.
              014    = CAPTAIN
              014002 = CAPTAIN & MRS.
              015    = COMMANDER
              015002 = COMMANDER & MRS.
              016    = DEAN
              017    = JUDGE
              017002 = JUDGE & MRS.
              018    = MAJOR
              018002 = MAJOR & MRS.
              019    = SENATOR
              020    = GOVERNOR
              021002 = SERGEANT & MRS.
              022002 = COLNEL & MRS.
              024    = LIEUTENANT
              026    = MONSIGNOR
              027    = REVEREND
              028    = MS.
              028028 = MSS.
              029    = BISHOP
              031    = AMBASSADOR
              031002 = AMBASSADOR & MRS.
              033    = CANTOR

              036    = BROTHER
              037    = SIR
              038    = COMMODORE
              040    = FATHER
              042    = SISTER
              043    = PRESIDENT
              044    = MASTER
              046    = MOTHER
              047    = CHAPLAIN
              048    = CORPORAL
              050    = ELDER
              056    = MAYOR
              059002 = LIEUTENANT & MRS.
              062    = LORD
              063    = CARDINAL
              064    = FRIEND
              065    = FRIENDS
              068    = ARCHDEACON
              069    = CANON
              070    = BISHOP
              072002 = REVEREND & MRS.
              073    = PASTOR
              075    = ARCHBISHOP
              085    = SPECIALIST
              087    = PRIVATE
              089    = SEAMAN
              090    = AIRMAN
              091    = JUSTICE
              092    = MR. JUSTICE
              100    = M.
              103    = MLLE.
              104    = CHANCELLOR
              106    = REPRESENTATIVE
              107    = SECRETARY
              108    = LT. GOVERNOR
              109    = LIC.
              111    = SA.
              114    = DA.
              116    = SR.
              117    = SRA.
              118    = SRTA.
              120    = YOUR MAJESTY
              122    = HIS HIGHNESS
              123    = HER HIGHNESS
              124    = COUNT
              125    = LADY
              126    = PRINCE
              127    = PRINCESS
              128    = CHIEF
              129    = BARON
              130    = SHEIK
              131    = PRINCE AND PRINCESS
              132    = YOUR IMPERIAL MAJEST
              135    = M. ET MME.
              210    = PROF.
STATE         State abbreviation (a nominal/symbolic field)
ZIP           Zipcode (a nominal/symbolic field)
MAILCODE      Mail Code
              " " = Address is OK
              B   = Bad Address
PVASTATE      EPVA State or PVA State
              Indicates whether the donor lives in a state
              served by the organization's EPVA chapter
              P = PVA State
              E = EPVA State (Northeastern US)
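
The preprocessing rules on missing values and constant fields stated before the glossary can be sketched in code. The following is a minimal illustration in Python, assuming the records have already been loaded as a list of dictionaries keyed by field name; the function names and the choice of the mode as the fill value are assumptions made for this sketch, not part of the KDD-CUP-98 documentation.

```python
from collections import Counter

SPARSE_MISSING_SHARE = 0.995  # "99.5 percent or more missings"


def is_missing(value):
    """Blanks (string fields) and periods or blanks (numeric fields) are missing."""
    return value is None or str(value).strip() in {"", "."}


def select_fields(records):
    """Return the field names that survive the two dropping rules."""
    keep = []
    for field in records[0]:
        values = [r[field] for r in records]
        present = [v for v in values if not is_missing(v)]
        missing_share = 1 - len(present) / len(values)
        if missing_share >= SPARSE_MISSING_SHARE:
            continue  # almost entirely missing: omit the attribute
        if len(present) == len(values) and len(set(present)) == 1:
            continue  # a true constant: one value for all the records
        # Missing plus one valid level (e.g. 'Y') is NOT a constant: keep it.
        keep.append(field)
    return keep


def impute_mode(records, field):
    """Fill missing entries of one field with its most frequent known value."""
    present = [r[field] for r in records if not is_missing(r[field])]
    fill = Counter(present).most_common(1)[0][0]
    return [fill if is_missing(r[field]) else r[field] for r in records]
```

Fields such as MAILCODE, where a blank is itself a documented value ("Address is OK"), would have to be exempted from `is_missing` before applying these rules.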

DOB           Date of birth (YYMM, Year/Month format.)
NOEXCH        Do Not Exchange Flag (For list rental)
              _ = can be exchanged
              X = do not exchange
RECINHSE      In House File Flag
              _ = Not an In House Record
              X = Donor has given to PVA's In House program
RECP3         P3 File Flag
              _ = Not a P3 Record
              X = Donor has given to PVA's P3 program
RECPGVG       Planned Giving File Flag
              _ = Not a Planned Giving Record
              X = Planned Giving Record
RECSWEEP      Sweepstakes file flag
              _ = Not a Sweepstakes Record
              X = Sweepstakes Record
MDMAUD        The Major Donor Matrix code
              The codes describe frequency and amount of
              giving for donors who have given a $100+
              gift at any time in their giving history.
              An RFA (recency/frequency/monetary) field.

              The (current) concatenated version is a nominal
              or symbolic field. The individual bytes could
              separately be used as fields and refer to the
              following:

              First byte: Recency of Giving
                C=Current Donor
                L=Lapsed Donor
                I=Inactive Donor
                D=Dormant Donor

              2nd byte: Frequency of Giving
                1=One gift in the period of recency
                2=Two-Four gifts in the period of recency
                5=Five+ gifts in the period of recency

              3rd byte: Amount of Giving
                L=Less than $100(Low Dollar)
                C=$100-499(Core)
                M=$500-999(Major)
                T=$1,000+(Top)

              4th byte: Blank/meaningless/filler

              'X' indicates that the donor is not a major donor.

              For more information regarding the RFA codes, see
              the promotion history field definitions.
DOMAIN        DOMAIN/Cluster code. A nominal or symbolic field.
              The field could be broken down by bytes as
              explained below.

              1st byte = Urbanicity level of the donor's neighborhood
                U=Urban
                C=City
                S=Suburban
                T=Town
                R=Rural

              2nd byte = Socio-Economic status of the neighborhood
                1 = Highest SES
                2 = Average SES
                3 = Lowest SES
                (except for Urban communities, where
                1 = Highest SES, 2 = Above average SES,
                3 = Below average SES, 4 = Lowest SES.)
CLUSTER       Code indicating which cluster group the donor
              falls into. Each cluster is unique in terms of
              socioeconomic status, urbanicity, ethnicity and
              a variety of other demographic characteristics.
              A nominal or symbolic field.
AGE           Overlay Age
              0 = missing
AGEFLAG       Age Flag
              E = Exact
              I = Inferred from Date of Birth Field
HOMEOWNR      Home Owner Flag
              H = Home owner
              U = Unknown
CHILD03       Presence of Children age 0-3
              B = Both, F = Female, M = Male
CHILD07       Presence of Children age 4-7
CHILD12       Presence of Children age 8-12
CHILD18       Presence of Children age 13-18
NUMCHLD       NUMBER OF CHILDREN
INCOME        HOUSEHOLD INCOME
GENDER        Gender
              M = Male
              F = Female
              U = Unknown
              J = Joint Account, unknown gender
WEALTH1       Wealth Rating
HIT           MOR Flag # HIT (Mail Order Response)
              Indicates total number of known times the donor has
              responded to a mail order offer other than PVA's.
------------  ------------------------------------------
The following variables indicate the number of known times the donor has responded to other types of mail order offers.

MBCRAFT       Buy Craft Hobby
MBGARDEN      Buy Gardening
MBBOOKS       Buy Books
MBCOLECT      Buy Collectables
MAGFAML       Buy General Family Mags
MAGFEM        Buy Female Mags
MAGMALE       Buy Sports Mags
PUBGARDN      Gardening Pubs
PUBCULIN      Culinary Pubs
PUBHLTH       Health Pubs
PUBDOITY      Do It Yourself Pubs
PUBNEWFN      News / Finance Pubs
PUBPHOTO      Photography Pubs
PUBOPP        Opportunity Seekers Pubs
------------  ------------------------------------------
DATASRCE      Source of Overlay Data
              Indicates which third-party data source the donor
              matched against
              1 = MetroMail
              2 = Polk
              3 = Both
MALEMILI      % Males active in the Military
MALEVET       % Males Veterans
VIETVETS      % Vietnam Vets
WWIIVETS      % WWII Vets
LOCALGOV      % Employed by Local Gov
STATEGOV      % Employed by State Gov
FEDGOV        % Employed by Fed Gov
SOLP3         SOLICIT LIMITATION CODE P3
              " " = can be mailed (Default)
              00  = Do Not Solicit or Mail
              01  = one solicitation per year
              02  = two solicitations per year
              03  = three solicitations per year
              04  = four solicitations per year
              05  = five solicitations per year
              06  = six solicitations per year
              12  = twelve solicitations per year
SOLIH         SOLICITATION LIMIT CODE IN HOUSE
              " " = can be mailed (Default)
              00  = Do Not Solicit
              01  = one solicitation per year
              02  = two solicitations per year
              03  = three solicitations per year
              04  = four solicitations per year
              05  = five solicitations per year
              06  = six solicitations per year
              12  = twelve solicitations per year
MAJOR         Major ($$) Donor Flag
              _ = Not a Major Donor
              X = Major Donor
WEALTH2       Wealth Rating
              Wealth rating uses median family income and
              population statistics from each area to
              index relative wealth within each state.
              The segments are denoted 0-9, with 9 being
              the highest income group and zero being the
              lowest. Each rating has a different meaning
              within each state.
GEOCODE       Geo Cluster Code indicating the level of geography
              at which a record matches the census data.
              A nominal or symbolic field.
              Blank=No code has been assigned or did not
              match at any level.
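
The MDMAUD and DOMAIN entries above note that the individual bytes of the concatenated codes could be used as separate fields. A minimal Python sketch of that split follows; the lookup tables are copied from the byte definitions above, while the function names, the returned dictionary keys and the handling of blank or short codes are assumptions made for illustration.

```python
MDMAUD_RECENCY = {"C": "Current Donor", "L": "Lapsed Donor",
                  "I": "Inactive Donor", "D": "Dormant Donor"}
MDMAUD_FREQUENCY = {"1": "One gift", "2": "Two-Four gifts", "5": "Five+ gifts"}
MDMAUD_AMOUNT = {"L": "Less than $100", "C": "$100-499",
                 "M": "$500-999", "T": "$1,000+"}
URBANICITY = {"U": "Urban", "C": "City", "S": "Suburban",
              "T": "Town", "R": "Rural"}


def decode_mdmaud(code):
    """Split MDMAUD into its first three bytes; the 4th byte is filler.

    Returns None for blank codes and for 'X' codes (not a major donor).
    """
    code = code.strip()
    if len(code) < 3 or code.startswith("X"):
        return None
    return {
        "recency": MDMAUD_RECENCY.get(code[0]),
        "frequency": MDMAUD_FREQUENCY.get(code[1]),
        "amount": MDMAUD_AMOUNT.get(code[2]),
    }


def decode_domain(code):
    """Split DOMAIN into urbanicity (1st byte) and SES level (2nd byte)."""
    code = code.strip()
    if len(code) < 2 or not code[1].isdigit():
        return None
    return {"urbanicity": URBANICITY.get(code[0]), "ses": int(code[1])}
```

Because the SES scale runs 1-4 for Urban neighborhoods and 1-3 elsewhere, the numeric `ses` value is only comparable across records once the urbanicity byte is taken into account.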

------------  ------------------------------------------
The following variables reflect donor interests, as collected from third-party data sources.

COLLECT1      COLLECTABLE (Y/N)
VETERANS      VETERANS (Y/N)
BIBLE         BIBLE READING (Y/N)
CATLG         SHOP BY CATALOG (Y/N)
HOMEE         WORK FROM HOME (Y/N)
PETS          HOUSEHOLD PETS (Y/N)
CDPLAY        CD PLAYER OWNERS (Y/N)
STEREO        STEREO/RECORDS/TAPES/CD (Y/N)
PCOWNERS      HOME PC OWNERS/USERS
PHOTO         PHOTOGRAPHY (Y/N)
CRAFTS        CRAFTS (Y/N)
FISHER        FISHING (Y/N)
GARDENIN      GARDENING (Y/N)
BOATS         POWER BOATING (Y/N)
WALKER        WALK FOR HEALTH (Y/N)
KIDSTUFF      BUYS CHILDREN'S PRODUCTS (Y/N)
CARDS         STATIONERY/CARDS BUYER (Y/N)
PLATES        PLATE COLLECTOR (Y/N)
LIFESRC       LIFE STYLE DATA SOURCE
              Indicates source of the lifestyle variables listed
              above
              1 = MATCHED ON METRO MAIL ONLY
              2 = MATCHED ON POLK ONLY
              3 = MATCHED BOTH MM AND POLK
------------  ------------------------------------------
PEPSTRFL      Indicates PEP Star RFA Status
              blank = Not considered to be a PEP Star
              'X'   = Has PEP Star RFA Status
------------  ------------------------------------------
The following variables reflect characteristics of the donor's neighborhood, as collected from the 1990 US Census.

POP901        Number of Persons
POP902        Number of Families
POP903        Number of Households
POP90C1       Percent Population in Urbanized Area
POP90C2       Percent Population Outside Urbanized Area
POP90C3       Percent Population Inside Rural Area
POP90C4       Percent Male
POP90C5       Percent Female
ETH1          Percent White
ETH2          Percent Black
ETH3          Percent Native American
ETH4          Percent Pacific Islander/Asian
ETH5          Percent Hispanic
ETH6          Percent Asian Indian
ETH7          Percent Japanese
ETH8          Percent Chinese
ETH9          Percent Philipino
ETH10         Percent Korean
ETH11         Percent Vietnamese
ETH12         Percent Hawaiian

ETH13         Percent Mexican
ETH14         Percent Puerto Rican
ETH15         Percent Cuban
ETH16         Percent Other Hispanic
AGE901        Median Age of Population
AGE902        Median Age of Adults 18 or Older
AGE903        Median Age of Adults 25 or Older
AGE904        Average Age of Population
AGE905        Average Age of Adults >= 18
AGE906        Average Age of Adults >= 25
AGE907        Percent Population Under Age 18
CHIL1         Percent Children Under Age 7
CHIL2         Percent Children Age 7 - 13
CHIL3         Percent Children Age 14-17
AGEC1         Percent Adults Age 18-24
AGEC2         Percent Adults Age 25-34
AGEC3         Percent Adults Age 35-44
AGEC4         Percent Adults Age 45-54
AGEC5         Percent Adults Age 55-64
AGEC6         Percent Adults Age 65-74
AGEC7         Percent Adults Age >= 75
CHILC1        Percent Children Age <=2
CHILC2        Percent Children Age 3-5
CHILC3        Percent Children Age 6-11
CHILC4        Percent Children Age 12-15
CHILC5        Percent Children Age 16-18
HHAGE1        Percent Households w/ Person 65+
HHAGE2        Percent Households w/ Person 65+ Living Alone
HHAGE3        Percent Households Headed by an Elderly Person
              Age 65+
HHN1          Percent 1 Person Households
HHN2          Percent 2 Person Households
HHN3          Percent 3 or More Person Households
HHN4          Percent 4 or More Person Households
HHN5          Percent 5 or More Person Households
HHN6          Percent 6 Person Households
MARR1         Percent Married
MARR2         Percent Separated or Divorced
MARR3         Percent Widowed
MARR4         Percent Never Married
HHP1          Median Person Per Household
HHP2          Average Person Per Household
DW1           Percent Single Unit Structure
DW2           Percent Detached Single Unit Structure
DW3           Percent Duplex Structure
DW4           Percent Multi (2+) Unit Structures
DW5           Percent 3+ Unit Structures
DW6           Percent Housing Units in 5+ Unit Structure
DW7           Percent Group Quarters
DW8           Percent Institutional Group Quarters
DW9           Non-Institutional Group Quarters
HV1           Median Home Value in hundreds
HV2           Average Home Value in hundreds
HV3           Median Contract Rent in hundreds
HV4           Average Contract Rent in hundreds
HU1           Percent Owner Occupied Housing Units
HU2           Percent Renter Occupied Housing Units
HU3           Percent Occupied Housing Units
HU4           Percent Vacant Housing Units
HU5           Percent Seasonal/Recreational Vacant Units
HHD1          Percent Households w/ Related Children
HHD2          Percent Households w/ Families
HHD3          Percent Married Couple Families
HHD4          Percent Married Couples w/ Related Children
HHD5          Percent Persons in Family Household
HHD6          Percent Persons in Non-Family Household
HHD7          Percent Single Parent Households
HHD8          Percent Male Householder w/ Child

HHD9          Percent Female Householder w/ Child
HHD10         Percent Single Male Householder
HHD11         Percent Single Female Householder
HHD12         Percent Households w/ Non-Family Living
              Arrangements
ETHC1         Percent White < Age 15
ETHC2         Percent White Age 15 - 59
ETHC3         Percent White Age 60+
ETHC4         Percent Black < Age 15
ETHC5         Percent Black Age 15 - 59
ETHC6         Percent Black Age 60+
HVP1          Percent Home Value >= $200,000
HVP2          Percent Home Value >= $150,000
HVP3          Percent Home Value >= $100,000
HVP4          Percent Home Value >= $75,000
HVP5          Percent Home Value >= $50,000
HVP6          Percent Home Value >= $300,000
HUR1          Percent 1 or 2 Room Housing Units
HUR2          Percent >= 6 Room Housing Units
RHP1          Median Number of Rooms per Housing Unit
RHP2          Average Number of Rooms per Housing Unit
RHP3          Median Number of Persons per Housing Unit
RHP4          Average Number of Persons per Room
HUPA1         Percent Housing Units w/ 2 thru 9 Units at the
              Address
HUPA2         Percent Housing Units w/ >= 10 Units at the
              Address
HUPA3         Percent Mobile Homes or Trailers
HUPA4         Percent Renter Occupied Single Unit Structure
HUPA5         Percent Renter Occupied, 2 - 4 Units
HUPA6         Percent Renter Occupied, 5+ Units
HUPA7         Percent Renter Occupied Mobile Homes or
              Trailers
RP1           Percent Renters Paying >= $500 per Month
RP2           Percent Renters Paying >= $400 per Month
RP3           Percent Renters Paying >= $300 per Month
RP4           Percent Renters Paying >= $200 per Month
MSA           MSA Code
ADI           ADI Code
DMA           DMA Code
IC1           Median Household Income in hundreds
IC2           Median Family Income in hundreds
IC3           Average Household Income in hundreds
IC4           Average Family Income in hundreds
IC5           Per Capita Income
IC6           Percent Households w/ Income < $15,000
IC7           Percent Households w/ Income $15,000 - $24,999
IC8           Percent Households w/ Income $25,000 - $34,999
IC9           Percent Households w/ Income $35,000 - $49,999
IC10          Percent Households w/ Income $50,000 - $74,999
IC11          Percent Households w/ Income $75,000 - $99,999
IC12          Percent Households w/ Income $100,000 -
              $124,999
IC13          Percent Households w/ Income $125,000 -
              $149,999
IC14          Percent Households w/ Income >= $150,000
IC15          Percent Families w/ Income < $15,000
IC16          Percent Families w/ Income $15,000 - $24,999
IC17          Percent Families w/ Income $25,000 - $34,999
IC18          Percent Families w/ Income $35,000 - $49,999
IC19          Percent Families w/ Income $50,000 - $74,999
IC20          Percent Families w/ Income $75,000 - $99,999
IC21          Percent Families w/ Income $100,000 - $124,999
IC22          Percent Families w/ Income $125,000 - $149,999
IC23          Percent Families w/ Income >= $150,000
HHAS1         Percent Households on Social Security
HHAS2         Percent Households on Public Assistance

HHAS3         Percent Households w/ Interest, Rental or
              Dividend Income
HHAS4         Percent Persons Below Poverty Level
MC1           Percent Persons Move in Since 1985
MC2           Percent Persons in Same House in 1985
MC3           Percent Persons in Different State/Country in
              1985
TPE1          Percent Driving to Work Alone Car/Truck/Van
TPE2          Percent Carpooling (Car/Truck/Van)
TPE3          Percent Using Public Transportation
TPE4          Percent Using Bus/Trolley
TPE5          Percent Using Railways
TPE6          Percent Using Taxi/Ferry
TPE7          Percent Using Motorcycles
TPE8          Percent Using Other Transportation
TPE9          Percent Working at Home/No Transportation
PEC1          Percent Working Outside State of Residence
PEC2          Percent Working Outside County of Residence in
              State
TPE10         Median Travel Time to Work in minutes
TPE11         Mean Travel Time to Work in minutes
TPE12         Percent Traveling 60+ Minutes to Work
TPE13         Percent Traveling 15 - 59 Minutes to Work
LFC1          Percent Adults in Labor Force
LFC2          Percent Adult Males in Labor Force
LFC3          Percent Females in Labor Force
LFC4          Percent Adult Males Employed
LFC5          Percent Adult Females Employed
LFC6          Percent Mothers Employed Married and Single
LFC7          Percent 2 Parent Earner Families
LFC8          Percent Single Mother w/ Child in Labor Force
LFC9          Percent Single Father w/ Child in Labor Force
LFC10         Percent Families w/ Child w/ no Workers
OCC1          Percent Professional
OCC2          Percent Managerial
OCC3          Percent Technical
OCC4          Percent Sales
OCC5          Percent Clerical/Administrative Support
OCC6          Percent Private Household Service Occ.
OCC7          Percent Protective Service Occ.
OCC8          Percent Other Service Occ.
OCC9          Percent Farmers
OCC10         Percent Craftsmen, Precision, Repair
OCC11         Percent Operatives, Machine
OCC12         Percent Transportation
OCC13         Percent Laborers, Handlers, Helpers
EIC1          Percent Employed in Agriculture
EIC2          Percent Employed in Mining
EIC3          Percent Employed in Construction
EIC4          Percent Employed in Manufacturing
EIC5          Percent Employed in Transportation
EIC6          Percent Employed in Communications
EIC7          Percent Employed in Wholesale Trade
EIC8          Percent Employed in Retail Industry
EIC9          Percent Employed in Finance, Insurance, Real
              Estate
EIC10         Percent Employed in Business and Repair
EIC11         Percent Employed in Personal Services
EIC12         Percent Employed in Entertainment and
              Recreation
EIC13         Percent Employed in Health Services
EIC14         Percent Employed in Educational Services
EIC15         Percent Employed in Other Professional Services
EIC16         Percent Employed in Public Administration
OEDC1         Percent Employed by Local Government
OEDC2         Percent Employed by State Government
OEDC3         Percent Employed by Federal Government
OEDC4         Percent Self Employed

OEDC5         Percent Private Profit Wage or Salaried Worker
OEDC6         Percent Private Non-Profit Wage or Salaried
              Worker
OEDC7         Percent Unpaid Family Workers
EC1           Median Years of School Completed by Adults 25+
EC2           Percent Adults 25+ Grades 0-8
EC3           Percent Adults 25+ w/ some High School
EC4           Percent Adults 25+ Completed High School or
              Equivalency
EC5           Percent Adults 25+ w/ some College
EC6           Percent Adults 25+ w/ Associates Degree
EC7           Percent Adults 25+ w/ Bachelors Degree
EC8           Percent Adults 25+ Graduate Degree
SEC1          Percent Persons Enrolled in Private Schools
SEC2          Percent Persons Enrolled in Public Schools
SEC3          Percent Persons Enrolled in Preschool
SEC4          Percent Persons Enrolled in Elementary or High
              School
SEC5          Percent Persons in College
AFC1          Percent Adults in Active Military Service
AFC2          Percent Males in Active Military Service
AFC3          Percent Females in Active Military Service
AFC4          Percent Adult Veterans Age 16+
AFC5          Percent Male Veterans Age 16+
AFC6          Percent Female Veterans Age 16+
VC1           Percent Vietnam Veterans Age 16+
VC2           Percent Korean Veterans Age 16+
VC3           Percent WW2 Veterans Age 16+
VC4           Percent Veterans Serving After May 1975 Only
ANC1          Percent Dutch Ancestry
ANC2          Percent English Ancestry
ANC3          Percent French Ancestry
ANC4          Percent German Ancestry
ANC5          Percent Greek Ancestry
ANC6          Percent Hungarian Ancestry
ANC7          Percent Irish Ancestry
ANC8          Percent Italian Ancestry
ANC9          Percent Norwegian Ancestry
ANC10         Percent Polish Ancestry
ANC11         Percent Portuguese Ancestry
ANC12         Percent Russian Ancestry
ANC13         Percent Scottish Ancestry
ANC14         Percent Swedish Ancestry
ANC15         Percent Ukrainian Ancestry
POBC1         Percent Foreign Born
POBC2         Percent Born in State of Residence
LSC1          Percent English Only Speaking
LSC2          Percent Spanish Speaking
LSC3          Percent Asian Speaking
LSC4          Percent Other Language Speaking
VOC1          Percent Households w/ 1+ Vehicles
VOC2          Percent Households w/ 2+ Vehicles
VOC3          Percent Households w/ 3+ Vehicles
HC1           Median Length of Residence
HC2           Median Age of Occupied Dwellings in years
HC3           Percent Owner Occupied Structures Built Since
              1989
HC4           Percent Owner Occupied Structures Built Since
              1985
HC5           Percent Owner Occupied Structures Built Since
              1980
HC6           Percent Owner Occupied Structures Built Since
              1970
HC7           Percent Owner Occupied Structures Built Since
              1960
HC8           Percent Owner Occupied Structures Built Prior
              to 1960

HC9           Percent Owner Occupied Condominiums
HC10          Percent Renter Occupied Condominiums
HC11          Percent Occupied Housing Units Heated by
              Utility Gas
HC12          Percent Occupied Housing Units Heated by
              Bottled, Tank or LP
HC13          Percent Occupied Housing Units Heated by
              Electricity
HC14          Percent Occupied Housing Units Heated by Fuel
              Oil
HC15          Percent Occupied Housing Units Heated by Solar
              Energy
HC16          Percent Occupied Housing Units Heated by Coal,
              Wood, Other
HC17          Percent Housing Units w/ Public Water Source
HC18          Percent Housing Units w/ Well Water Source
HC19          Percent Housing Units w/ Public Sewer Source
HC20          Percent Housing Units w/ Complete Plumbing
              Facilities
HC21          Percent Housing Units w/ Telephones
MHUC1         Median Homeowner Cost w/ Mortgage per Month
              dollars
MHUC2         Median Homeowner Cost w/out Mortgage per Month
              dollars
AC1           Percent Adults Age 55-59
AC2           Percent Adults Age 60-64
------------  ------------------------------------------
The fields listed below are from the promotion history file.

PROMOTION CODES:
----------------
The following lists the promotion codes and their respective field names (where XXXX refers to ADATE, RFA, RDATE and RAMNT.)

'97NK' ==> xxxx_2 (mailing was used to construct the target fields)
'96NK' ==> xxxx_3
'96TK' ==> xxxx_4
'96SK' ==> xxxx_5
'96LL' ==> xxxx_6
'96G1' ==> xxxx_7
'96GK' ==> xxxx_8
'96CC' ==> xxxx_9
'96WL' ==> xxxx_10
'96X1' ==> xxxx_11
'96XK' ==> xxxx_12
'95FS' ==> xxxx_13
'95NK' ==> xxxx_14
'95TK' ==> xxxx_15
'95LL' ==> xxxx_16
'95G1' ==> xxxx_17
'95GK' ==> xxxx_18
'95CC' ==> xxxx_19
'95WL' ==> xxxx_20
'95X1' ==> xxxx_21
'95XK' ==> xxxx_22
'94FS' ==> xxxx_23
'94NK' ==> xxxx_24
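
The mapping from promotion codes to field names is positional, so it can be generated rather than typed out. A minimal Python sketch follows; the list order is taken from the mapping above, while the names `PROMOTION_CODES`, `FIELD_PREFIXES` and `field_name` are hypothetical helpers introduced for this illustration.

```python
# Promotion codes in the order listed above; the first one maps to suffix _2.
PROMOTION_CODES = [
    "97NK", "96NK", "96TK", "96SK", "96LL", "96G1", "96GK", "96CC",
    "96WL", "96X1", "96XK", "95FS", "95NK", "95TK", "95LL", "95G1",
    "95GK", "95CC", "95WL", "95X1", "95XK", "94FS", "94NK",
]
FIELD_PREFIXES = ("ADATE", "RFA", "RDATE", "RAMNT")


def field_name(prefix, promotion_code):
    """Return the field name for a prefix and promotion code, e.g. ADATE_2."""
    if prefix not in FIELD_PREFIXES:
        raise ValueError(f"unknown field prefix: {prefix}")
    suffix = PROMOTION_CODES.index(promotion_code) + 2
    return f"{prefix}_{suffix}"
```

For example, `field_name("RFA", "95FS")` yields `RFA_13`. Since the 97NK mailing was used to construct the target fields, a caller would normally exclude the `_2` response fields from the model inputs.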

1st 2 bytes of the code refer to the year of the mailing while 3rd and 4th bytes refer to the following promotion codes/types:

LL mailings had labels only
WL mailings had labels only
CC mailings are calendars with stickers but do not have labels
FS mailings are blank cards that fold into thirds with labels
NK mailings are blank cards with labels
SK mailings are blank cards with labels
TK mailings have thank you printed on the outside with labels
GK mailings are general greeting cards (an assortment of birthday, sympathy, blank, & get well) with labels
XK mailings are Christmas cards with labels
X1 mailings have labels and a notepad
G1 mailings have labels and a notepad

This information could certainly be used to calculate several summary variables that count the number of occurrences of various types of promotions received in the most recent 12-36 months, etc.

RFA (RECENCY/FREQUENCY/AMOUNT)
------------------------------
The RFA (recency/frequency/amount) status of the donors (as of the promotion dates) is included in the RFA fields. The (current) concatenated version is a nominal or symbolic field. The individual bytes could separately be used as fields and refer to the following:

First Byte of code is concerned with RECENCY based on Date of the last Gift

F=FIRST TIME DONOR  Anyone who has made their first donation in the last 6 months and has made just one donation.
N=NEW DONOR         Anyone who has made their first donation in the last 12 months and is not a First time donor. This is everyone who made their first donation 7-12 months ago, or people who made their first donation between 0-6 months ago and have made 2 or more donations.
A=ACTIVE DONOR      Anyone who made their first donation more than 12 months ago and has made a donation in the last 12 months.
L=LAPSING DONOR     A previous donor who made their last donation between 13-24 months ago.
I=INACTIVE DONOR    A previous donor who has not made a donation in the last 24 months. It is people who made a donation 25+ months ago.
S=STAR DONOR        STAR Donors are individuals who have given to 3 consecutive card mailings.

Second Byte of code is concerned with FREQUENCY based on the period of recency. The period of recency for all groups except L and I is the last 12 months. For L it is 13-24 months ago, and for I it is 25-36 months ago. There are four valid frequency codes.

1=One gift in the period of recency
2=Two gifts in the period of recency
3=Three gifts in the period of recency
4=Four or more gifts in the period of recency

Third byte of the code is the Amount of the last gift.

A=$0.01 - $1.99
B=$2.00 - $2.99
C=$3.00 - $4.99
D=$5.00 - $9.99
E=$10.00 - $14.99
F=$15.00 - $24.99
G=$25.00 and above

ADATE_2       Date the 97NK promotion was mailed
ADATE_3       Date the 96NK promotion was mailed
ADATE_4       Date the 96TK promotion was mailed
ADATE_5       Date the 96SK promotion was mailed
ADATE_6       Date the 96LL promotion was mailed
ADATE_7       Date the 96G1 promotion was mailed
ADATE_8       Date the 96GK promotion was mailed
ADATE_9       Date the 96CC promotion was mailed
ADATE_10      Date the 96WL promotion was mailed
ADATE_11      Date the 96X1 promotion was mailed
ADATE_12      Date the 96XK promotion was mailed
ADATE_13      Date the 95FS promotion was mailed
ADATE_14      Date the 95NK promotion was mailed
ADATE_15      Date the 95TK promotion was mailed
ADATE_16      Date the 95LL promotion was mailed
ADATE_17      Date the 95G1 promotion was mailed
ADATE_18      Date the 95GK promotion was mailed
ADATE_19      Date the 95CC promotion was mailed
ADATE_20      Date the 95WL promotion was mailed
ADATE_21      Date the 95X1 promotion was mailed
ADATE_22      Date the 95XK promotion was mailed
ADATE_23      Date the 94FS promotion was mailed
ADATE_24      Date the 94NK promotion was mailed
RFA_2         Donor's RFA status as of 97NK promotion date
RFA_3         Donor's RFA status as of 96NK promotion date
RFA_4         Donor's RFA status as of 96TK promotion date
RFA_5         Donor's RFA status as of 96SK promotion date
RFA_6         Donor's RFA status as of 96LL promotion date
RFA_7         Donor's RFA status as of 96G1 promotion date
RFA_8         Donor's RFA status as of 96GK promotion date
RFA_9         Donor's RFA status as of 96CC promotion date
RFA_10        Donor's RFA status as of 96WL promotion date
RFA_11        Donor's RFA status as of 96X1 promotion date
RFA_12        Donor's RFA status as of 96XK promotion date
RFA_13        Donor's RFA status as of 95FS promotion date

RFA_14        Donor's RFA status as of 95NK promotion date
RFA_15        Donor's RFA status as of 95TK promotion date
RFA_16        Donor's RFA status as of 95LL promotion date
RFA_17        Donor's RFA status as of 95G1 promotion date
RFA_18        Donor's RFA status as of 95GK promotion date
RFA_19        Donor's RFA status as of 95CC promotion date
RFA_20        Donor's RFA status as of 95WL promotion date
RFA_21        Donor's RFA status as of 95X1 promotion date
RFA_22        Donor's RFA status as of 95XK promotion date
RFA_23        Donor's RFA status as of 94FS promotion date
RFA_24        Donor's RFA status as of 94NK promotion date
------------  ------------------------------------------
The following fields are summary variables from the promotion history file.

CARDPROM      Lifetime number of card promotions received to
              date. Card promotions are promotion type FS, GK,
              TK, SK, NK, XK, UF, UU.
MAXADATE      Date of the most recent promotion received (in
              YYMM, Year/Month format)
NUMPROM       Lifetime number of promotions received to date
CARDPM12      Number of card promotions received in the last
              12 months (in terms of calendar months translates
              into 9603-9702)
NUMPRM12      Number of promotions received in the last 12
              months (in terms of calendar months translates
              into 9603-9702)
------------  ------------------------------------------
The following fields are from the giving history file.

RDATE_3       Date the gift was received for 96NK
RDATE_4       Date the gift was received for 96TK
RDATE_5       Date the gift was received for 96SK
RDATE_6       Date the gift was received for 96LL
RDATE_7       Date the gift was received for 96G1
RDATE_8       Date the gift was received for 96GK
RDATE_9       Date the gift was received for 96CC
RDATE_10      Date the gift was received for 96WL
RDATE_11      Date the gift was received for 96X1
RDATE_12      Date the gift was received for 96XK
RDATE_13      Date the gift was received for 95FS
RDATE_14      Date the gift was received for 95NK
RDATE_15      Date the gift was received for 95TK
RDATE_16      Date the gift was received for 95LL
RDATE_17      Date the gift was received for 95G1
RDATE_18      Date the gift was received for 95GK
RDATE_19      Date the gift was received for 95CC
RDATE_20      Date the gift was received for 95WL
RDATE_21      Date the gift was received for 95X1
RDATE_22      Date the gift was received for 95XK
RDATE_23      Date the gift was received for 94FS
RDATE_24      Date the gift was received for 94NK
RAMNT_3       Dollar amount of the gift for 96NK
RAMNT_4       Dollar amount of the gift for 96TK
RAMNT_5       Dollar amount of the gift for 96SK
RAMNT_6       Dollar amount of the gift for 96LL
RAMNT_7       Dollar amount of the gift for 96G1
RAMNT_8       Dollar amount of the gift for 96GK
RAMNT_9       Dollar amount of the gift for 96CC

RAMNT_10      Dollar amount of the gift for 96WL
RAMNT_11      Dollar amount of the gift for 96X1
RAMNT_12      Dollar amount of the gift for 96XK
RAMNT_13      Dollar amount of the gift for 95FS
RAMNT_14      Dollar amount of the gift for 95NK
RAMNT_15      Dollar amount of the gift for 95TK
RAMNT_16      Dollar amount of the gift for 95LL
RAMNT_17      Dollar amount of the gift for 95G1
RAMNT_18      Dollar amount of the gift for 95GK
RAMNT_19      Dollar amount of the gift for 95CC
RAMNT_20      Dollar amount of the gift for 95WL
RAMNT_21      Dollar amount of the gift for 95X1
RAMNT_22      Dollar amount of the gift for 95XK
RAMNT_23      Dollar amount of the gift for 94FS
RAMNT_24      Dollar amount of the gift for 94NK
------------  ------------------------------------------
The following fields are summary variables from the giving history file.

RAMNTALL      Dollar amount of lifetime gifts to date
NGIFTALL      Number of lifetime gifts to date
CARDGIFT      Number of lifetime gifts to card promotions to
              date
MINRAMNT      Dollar amount of smallest gift to date
MINRDATE      Date associated with the smallest gift to date
MAXRAMNT      Dollar amount of largest gift to date
MAXRDATE      Date associated with the largest gift to date
LASTGIFT      Dollar amount of most recent gift
LASTDATE      Date associated with the most recent gift
FISTDATE      Date of first gift
NEXTDATE      Date of second gift
TIMELAG       Number of months between first and second gift
AVGGIFT       Average dollar amount of gifts to date
------------  ------------------------------------------
CONTROLN      Control number (unique record identifier)
TARGET_B      Target Variable: Binary Indicator for Response
              to 97NK Mailing
TARGET_D      Target Variable: Donation Amount (in $)
              associated with the Response to 97NK Mailing
HPHONE_D      Indicator for presence of a published home
              phone number
------------  ------------------------------------------
(See the section on RFA for the meaning of the codes)

RFA_2R        Recency code for RFA_2
RFA_2F        Frequency code for RFA_2
RFA_2A        Donation Amount code for RFA_2
MDMAUD_R      Recency code for MDMAUD
MDMAUD_F      Frequency code for MDMAUD
MDMAUD_A      Donation Amount code for MDMAUD
------------  ------------------------------------------
CLUSTER2      Classic Cluster Code (a nominal symbolic field)
GEOCODE2      County Size Code
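
Several fields in this dictionary are YYMM dates, and the documentation suggests June 1997 as the reference date for "number of months since" variables. The sketch below shows one way to compute such month differences in Python and to split a three-byte RFA code into the parts stored in RFA_2R, RFA_2F and RFA_2A. It assumes every two-digit year belongs to the 1900s; the function names and the `REFERENCE_DATE` constant are hypothetical and chosen for this illustration.

```python
REFERENCE_DATE = 9706  # June 1997, the suggested "end date" (YYMM)


def parse_yymm(value):
    """Parse a YYMM value such as 9706 into (year, month); assumes 19YY."""
    text = str(value).strip().zfill(4)
    year, month = 1900 + int(text[:2]), int(text[2:])
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in YYMM value: {value}")
    return year, month


def months_between(start, end=REFERENCE_DATE):
    """Number of calendar months from start to end, both in YYMM format."""
    (y1, m1), (y2, m2) = parse_yymm(start), parse_yymm(end)
    return (y2 - y1) * 12 + (m2 - m1)


def split_rfa(code):
    """Split a three-byte RFA code into (recency, frequency, amount)."""
    code = code.strip()
    if len(code) != 3:
        return None  # blank or malformed code
    return code[0], code[1], code[2]
```

Under these assumptions, `months_between(FISTDATE, NEXTDATE)` gives a value that can be checked against TIMELAG, and `months_between(LASTDATE)` gives the months since the most recent gift relative to June 1997.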
