Introduction Use Functions! Example: Calculate Entropy Parameters and Returns Cut and Paste: Wrong!

R Style Object Oriented Programming Repetit

Writing Functions In R
Necessity Really is a Mother

Paul E. Johnson (1, 2)

(1) University of Kansas, Department of Political Science
(2) Center for Research Methods and Data Analysis

2012


Outline
1 Introduction
2 Use Functions!
3 Example: Calculate Entropy
4 Parameters and Returns
5 Cut and Paste: Wrong!
6 R Style
7 Object Oriented Programming
8 Repetition
9 Bootstrapping


Did You Ever Write a Program?

If Yes: R’s different than that.
If No: It’s not like other things you’ve done.
In either case, don’t worry about it :)


R is a little bit like an elephant

It’s a tree trunk! It’s a snake! It’s a brush!


The R Language is like S, of course

The S Language: John Chambers, et al. at Bell Labs, mid 1970s.
See Richard Becker’s “Brief History of S” about the AT&T years
There have been 4 generations of the S language.
Now we are in a transition between S3 and S4
Current rumor: a new programming approach dubbed “R5” will supersede S4
S3: The New S Language, 1988


Is R a Branch from S?
Or is S coming back to R?

Is R a “branch” from S?
Ross Ihaka and Robert Gentleman. 1996. “R: A language for data analysis and graphics.” Journal of Computational and Graphical Statistics, 5(3):299-314.
Open Source, Open Community
S4: John Chambers, Software for Data Analysis: Programming with R, Springer, 2008


The Main Idea: Separate Calculations Meaningfully

Do NOT write a giant sequence of 1000 commands
Many novices knit together a giant sequence of commands. Just Don’t!
Problem: No other human can comprehend that mess
Solution: Write functions to calculate separate parts


Your R program should look like this

    myfn1 <- function(parms) {
        lines here
    }
    myfn2 <- function(parms) {
        lines here
    }
    ## perfect above, don't edit it again
    ## At end, the part you revise while working
    a <- 7
    b <- c(4, 4, 4, 4, 2)
    great1 <- myfn1(a, b, parm3 = TRUE)
    great2 <- myfn2(b, great1, parm9 = FALSE)


Avoid for loops with lots of meat inside

Instead of this:
    for (i in 1:1000) {
        ## 1000s of lines here,
        ## full of x[i], z[i], and so forth
    }

We Want:
    fn1 <- function(params) { ... }
    fn2 <- function(params) { ... }
    for (i in 1:1000) {
        fn1(x[i], ...)
    }
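
A minimal sketch of the "We Want" pattern. The helper name distOne and the data are made up for illustration; they are not from the slides. The work for one observation lives in a small function, and the loop only calls it.

```r
## Hypothetical helper: all the work for one observation goes here
distOne <- function(xi, zi) {
    (xi - zi)^2
}

x <- c(1, 2, 3, 4)   # made-up data for illustration
z <- c(0, 1, 1, 2)

res <- numeric(length(x))        # pre-allocate the result
for (i in seq_along(x)) {
    res[i] <- distOne(x[i], z[i])
}
res
```

Because the loop body is a single call, the helper can be tested on one pair of values before the loop is ever run.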


R Functions pass information “by value”

We should organize our information “here” in the current environment
We send it “over there” to a function
We get back something we can work with
The function DOES NOT change variables we give to the function
The super assignment <<- allows an exception to this, but R Core recommends we avoid it when possible.
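
A small sketch of what "by value" means in practice. The function addOne is a made-up example: it changes its argument internally, yet the caller's variable keeps its value.

```r
## Hypothetical function: modifies its argument internally
addOne <- function(x) {
    x <- x + 1    # changes only the local copy named x
    x
}

y <- 5
z <- addOne(y)
y    # unchanged by the call
z    # the returned value
```

To keep the change, the caller has to assign the result, for example y <- addOne(y).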


A simple example of a new function: doubleMe

    doubleMe <- function(input = 0) {
        newval <- 2 * input
    }
The function’s name is “doubleMe”
I am allowed to name the incoming variable “input” anything I want
Could as well have used:
    doubleMe <- function(x) {
        out <- 2 * x
    }
Note, explicit use of a return function is NOT REQUIRED.


Key Elements of doubleMe

    doubleMe <- function(input = 0) {
        newval <- 2 * input
    }
doubleMe: a name with which to access this function. Because of my background in “Objective C”, I like this style of name. C++ programmers prefer double.me or such.
input: a name to be used INTERNALLY while making calculations
= 0: An optional but recommended default value
newval: Last calculation is returned.
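
One detail worth seeing in action: when the last calculation in a function is an assignment, R returns the value invisibly, so a bare call prints nothing. The sketch below uses the doubleMe definition from the slide; withVisible is a base R function that reports whether a value would auto-print.

```r
doubleMe <- function(input = 0) {
    newval <- 2 * input
}

doubleMe(7)      # prints nothing: the value of an assignment is invisible
(doubleMe(7))    # wrapping the call in parentheses makes the value print

res <- doubleMe(7)                          # the value is still returned
res
vis <- withVisible(doubleMe(7))$visible     # FALSE for this definition
vis
```

Ending the function with a line that just names the result (newval) would make the value print without the extra parentheses.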


How to Call doubleMe

What is 2 * 7?
    (doubleMe(7))
    [1] 14


Clarity versus Verbosity
I prefer to be clear about the input variable on both ends
This would work
    doubleMe <- function(input) {
        newval <- 2 * input
    }
But nicer to set a default, even if just 0 or NULL
    doubleMe <- function(input = NULL) {
        newval <- 2 * input
    }
Here’s why: with a default, a call that omits the argument behaves predictably. Suppose you run
    doubleMe()
If “input” has no default in the definition, R stops with an error (argument "input" is missing, with no default). R does not search outside doubleMe for a missing argument; it searches outward only for variables that are not arguments of the function.
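
A quick way to see what happens when an argument has no default is to try it. In this sketch the function names noDefault and withDefault are made up. A variable named input exists in the calling environment, yet the call without arguments still stops with an error rather than picking it up.

```r
## Hypothetical variants of doubleMe, for comparison only
noDefault <- function(input) {
    2 * input
}
withDefault <- function(input = NULL) {
    2 * input
}

input <- 100    # same name, but outside the functions

## tryCatch turns the error into its message so the script keeps running
msg <- tryCatch(noDefault(), error = function(e) conditionMessage(e))
msg

out <- withDefault()    # 2 * NULL is a zero-length numeric vector
out
```

A NULL default avoids the error but returns an empty result, so a default such as 0, or an explicit check with a clear message, may be easier on the user.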

Similarly, I prefer Clarity in the Call

This works
    doubleMe(10)
But wouldn’t you rather be clear?
    doubleMe(input = 10)


Function Calls
But if you name it wrong, it breaks
    > doubleMe(myInput = 7)
    Error in doubleMe(myInput = 7) : unused argument(s) (myInput = 7)
What if you feed it something unsuitable?
    > doubleMe(lm(rnorm(100) ~ rnorm(100)))
    Error in 2 * input : non-numeric argument to binary operator
    In addition: Warning messages:
    1: In model.matrix.default(mt, mf, contrasts) :
      the response appeared on the right-hand side and was dropped
    2: In model.matrix.default(mt, mf, contrasts) :
      problem with term 1 in model.matrix: no columns are assigned

Function Calls ...
We get “free” “vectorization”
    (doubleMe(c(1, 2, 3, 4, 5)))
    [1]  2  4  6  8 10

But it won’t allow you to specify too many inputs:
    > doubleMe(1, 2, 3, 4, 5)
    Error in doubleMe(1, 2, 3, 4, 5) : unused argument(s) (2, 3, 4, 5)
Oops. I forgot the input
    doubleMe()
Uses the default value for input (0), so the result is 0.


Function Calls
Oops. I forgot the parentheses
    doubleMe
    function(input = 0) {
        newval <- 2 * input
    }
Tip: Code for most functions can be reviewed by typing its name. R will “beautify” the format and show you what it sees in your function. Type “lm” if you don’t believe it.
Note, after all this, that “input” does not exist in the current environment.
    > ls()
    [1] "doubleMe" "op"       "pjmar"
    > input
    Error: object 'input' not found

Check Point: write your own function

Write a function “myGreatFunction” that takes a vector and returns a vector of 3 values: the maximum, the minimum, and the median.
Generate some data, for testing, and run
    x1 <- rnorm(10000, m = 7, sd = 19)
    myGreatFunction(x1)
Now stress test your function by changing x1
    x1[c(13, 44, 99, 343, 555)] <- NA
    myGreatFunction(x1)
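
One possible answer, offered as a sketch rather than the intended solution. It assumes that missing values should be ignored (na.rm = TRUE); a function that instead stops with an error when it sees NA would be an equally defensible design. The seed is arbitrary.

```r
## One possible answer; ignoring NAs is an assumption, not a requirement
myGreatFunction <- function(x) {
    c(max = max(x, na.rm = TRUE),
      min = min(x, na.rm = TRUE),
      median = median(x, na.rm = TRUE))
}

set.seed(42)    # arbitrary seed, only for reproducibility
x1 <- rnorm(10000, mean = 7, sd = 19)
x1[c(13, 44, 99, 343, 555)] <- NA    # the stress test from the slide
res <- myGreatFunction(x1)
res

small <- myGreatFunction(c(3, 1, NA, 2))    # easy case to check by hand
small
```

Without na.rm = TRUE, all three summaries would come back as NA after the stress test.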


Entropy can summarize diversity for a categorical variable

Entropy in Physics means disorganization
Sometimes called Shannon’s Information Index
Basic idea. List the possible outcomes and their probabilities
The amount of diversity in a collection of observations depends on the equality of the proportions of cases observed within each type.


A Reasonable Person Would Agree . . .

This distribution is “less diverse”
    outcome name    t1    t2    t3    t4    t5
    prob(outcome)   0.1   0.3   0.05  0.55  0.0
than this distribution:
    outcome name    t1    t2    t3    t4    t5
    prob(outcome)   0.2   0.2   0.2   0.2   0.2


The Information Index
For each type, calculate the following information (or can I say “diversity”?) value

\[ -p_t \log_2(p_t) \tag{1} \]

Note that if $p_t = 0$, the diversity value is 0. If $p_t = 1$, then diversity is also 0.
Sum those values across the m categories

\[ \sum_{t=1}^{m} -p_t \log_2(p_t) \tag{2} \]

Diversity is at a maximum when the $p_t$ are all equal


Calculate Diversity for One Type

    divr <- function(p = 0) {
        ifelse(p > 0 & p < 1, -p * log2(p), 0)
    }
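
A short check of why the ifelse guard is there, using the divr definition from the slide. Computed directly, 0 * log2(0) is NaN in R, so the endpoints have to be set to 0 by hand. The test values are made up.

```r
divr <- function(p = 0) {
    ifelse(p > 0 & p < 1, -p * log2(p), 0)
}

0 * log2(0)    # NaN: the case the guard protects against

vals <- divr(c(0, 0.25, 0.5, 1))
vals           # endpoints are 0; 0.25 and 0.5 both contribute 0.5
```

Because ifelse works element by element, divr accepts a whole vector of proportions at once.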


Let’s plot that

    pseq <- seq(0.001, 0.999, length = 999)
    plot(pseq, divr(pseq), xlab = "p",
         ylab = "Diversity Contribution of One Observation",
         main = expression(paste("Diversity: ", -p * log[2](p))),
         type = "l")


[Figure: line plot titled “Diversity: -p log2(p)”. x axis: p, from 0 to 1. y axis: Diversity Contribution of One Observation.]


Diversity Function
Define an Entropy function that sums those values
    entropy <- function(p) {
        sum(divr(p))
    }
Calculate some test cases
    entropy(c(1/5, 1/5, 1/5, 1/5, 1/5))
    [1] 2.321928
    entropy(c(3/5, 1/5, 1/5, 0/5, 0/5))
    [1] 1.370951


There’s a Little Problem With This Approach
Diversity is sensitive to the number of categories
8 equally likely outcomes (rep(x, y): repeats x y times.)
    entropy(rep(1/8, 8))
    [1] 3
14 equally likely outcomes
    entropy(rep(1/14, 14))
    [1] 3.807355
Write it out for a 3 category case

\[ -\frac{1}{3}\log_2\left(\frac{1}{3}\right) - \frac{1}{3}\log_2\left(\frac{1}{3}\right) - \frac{1}{3}\log_2\left(\frac{1}{3}\right) = -\log_2\left(\frac{1}{3}\right) \tag{3} \]

The highest possible diversity with 3 types is $-\log_2(1/3)$
The highest possible diversity for N types is $-\log_2(1/N)$

We Might As Well Plot That

    maximumEntropy <- function(N) -log2(1/N)
    Nmax <- 15
    M <- 2:Nmax
    plot(M, maximumEntropy(M), xlab = "N of Possible Types",
         ylab = "Maximum Possible Diversity",
         main = "Maximum Possible Entropy For N Categories",
         type = "h", axes = FALSE)
    axis(1)
    axis(2)
    points(M, maximumEntropy(M), pch = 19)


Maximum Entropy as a Function of the Number of Types
[Figure: “Maximum Possible Entropy For N Categories”. x axis: N of Possible Types. y axis: Maximum Possible Diversity.]


Final Result: Normed Entropy as a Diversity Summary
    normedEntropy <- function(x) entropy(x) / maximumEntropy(length(x))
Compare some cases with 4 possible outcomes
    normedEntropy(c(1/4, 1/4, 1/4, 1/4))
    [1] 1
    normedEntropy(c(1/2, 1/2, 0, 0))
    [1] 0.5
    normedEntropy(c(1, 0, 0, 0))
    [1] 0


How about 7 types of outcomes:
    normedEntropy(rep(1/7, 7))
    [1] 1
    normedEntropy((1:7) / (sum(1:7)))
    [1] 0.9297027
    normedEntropy(c(2/7, 2/7, 3/7, 0, 0, 0, 0))
    [1] 0.5544923
    normedEntropy(c(5/7, 2/7, 0, 0, 0, 0, 0))
    [1] 0.3074497


Compare 3 test cases
[Figure: bar charts of three test-case distributions, with panel titles “Normed Entropy= 0.82”, “Normed Entropy= 0.66”, and “Normed Entropy= 0.93”.]


In the rockchalk package, see “summarize” and “summarizeFactors”
Manufacture a variable to re-produce testcase3
    round(testcase3, 2)
    [1] 0.04 0.07 0.11 0.14 0.18 0.21 0.25
    library(rockchalk)
    testcase3v <- factor(c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5,
                           6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7))
    round((table(testcase3v) / length(testcase3v)), 2)
    testcase3v
       1    2    3    4    5    6    7
    0.04 0.07 0.11 0.14 0.18 0.21 0.25
    dat <- data.frame(testcase3v)
    summarizeFactors(dat)

    testcase3v
     7             :  7.0000
     6             :  6.0000
     5             :  5.0000
     4             :  4.0000
     (All Others)  :  6.0000
     NA's          :  0.0000
     entropy       :  2.6100
     normedEntropy :  0.9297
     N             : 28.0000
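
The entropy rows in that summary can be cross-checked with base R alone. This sketch repeats the four small functions from the earlier slides so it is self-contained, and builds the same 28 observations with rep (a shortcut, not the slide's own code).

```r
## Definitions repeated from the slides so the sketch is self-contained
divr <- function(p = 0) ifelse(p > 0 & p < 1, -p * log2(p), 0)
entropy <- function(p) sum(divr(p))
maximumEntropy <- function(N) -log2(1 / N)
normedEntropy <- function(x) entropy(x) / maximumEntropy(length(x))

## Same 28 observations as testcase3v: level k appears k times
testcase3v <- factor(rep(1:7, times = 1:7))

## Observed proportions, as a plain numeric vector
p <- as.vector(table(testcase3v) / length(testcase3v))

round(entropy(p), 2)          # compare with the "entropy" row
round(normedEntropy(p), 4)    # compare with the "normedEntropy" row
```

Note that length(p) counts the levels of the factor, so a level with no observations would still count toward the maximum possible entropy.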


Function Anatomy

Arguments: The parenthesized parameters of a function
    someWork <- function(what1, what2, what3)

what1, what2, and what3 become “local variables” inside the function.


Declaring function arguments
R is lenient in function declaration
  Arguments are not type-defined
  May declare default values, but need not
    someWork <- function(what1 = 0, what2 = NULL, what3 = c(1, 2, 3), what4 = 3 * what1, what5)

R is lenient on format of function calls
    someWork(1)
    someWork(what = 1, someObject)
    someWork(what5 = fred, what4 = jim, what3 = joe)
  Not all parameters must be provided
  Partial argument matching

R is (very!) lenient on undefined variables inside functions
  If a variable is used, but not defined inside the function, R will “look outward” for it (and may fill in values you do not intend)
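
A compact sketch of three of these leniencies at once. The function scaleIt and the variable bonus are made up for illustration. The default for shift refers to another argument, the call abbreviates an argument name, and bonus is found by looking outward because it is not an argument.

```r
## Hypothetical function; "bonus" is deliberately NOT an argument
scaleIt <- function(values, multiplier = 2, shift = 10 * multiplier) {
    values * multiplier + shift + bonus
}

bonus <- 100    # defined outside; the function finds it by "looking outward"

r1 <- scaleIt(1)              # 1 * 2 + 20 + 100
r2 <- scaleIt(1, mult = 3)    # "mult" partially matches "multiplier": 1 * 3 + 30 + 100
r1
r2
```

Partial matching only works when the abbreviation fits exactly one argument name; an abbreviation that fits several stops with an error. Changing bonus later changes what scaleIt returns, which is the "values you do not intend" risk.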


Arguments: named R variables

    someWork <- function(what1, what2, what3)

In R, everything is an “object”
what1, what2, what3 can be ANYTHING!
That is a blessing and a curse
  Blessing: Flexibility! Let an argument be a vector, matrix, list, whatever. The function declaration does not care.
  Curse: Difficulty managing the infinite array of possible ways users might break your functions.


Inside the function, you have work to . . .

Check and Re-format
  “argument checking”: diagnose what arguments the user provided
  Re-organization and re-structuring: take what they have given and convert it to something else.

There is no mandatory standard that stipulates when a function should receive
  2 integers (x, y), or
  one vector of 2 integers (x) that can be unpacked
For examples of this, look at the arguments of plot.default, arrows, segments, lines, and matplot.


Arguments: When Should We Provide Defaults?

I don’t know, but . . .
I used to be very worried that R functions would “reach outward” and find values for undefined variables
A conservative strategy in that case is to set all defaults at NULL or some other value.
    function(what1 = NULL, what2 = NULL, what3 = NULL, what4 = NULL)

That way, if a user forgets to provide “what3”, then the system will not go looking for it.


Functions that Help while Writing Functions and Checking Arguments

missing, stop, and stopifnot
missing: inside a function, missing is a way to ask whether an argument was supplied in the call.
    doSomething <- function(what1, what2, what3, what4) {
        if (missing(what1)) stop("you forgot to specify what1")
    }

class, is.vector, length, . . .
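
A minimal sketch combining these helpers. The function safeDouble is a made-up variant of doubleMe: missing catches the forgotten argument, and stopifnot rejects input that is not numeric. tryCatch is used only so the failing calls can be shown without halting the script.

```r
## Hypothetical function with argument checking
safeDouble <- function(input) {
    if (missing(input)) stop("you forgot to specify input")
    stopifnot(is.numeric(input), length(input) > 0)
    2 * input
}

ok <- safeDouble(c(1, 2))
ok

## Unsuitable input: stopifnot signals an error, caught here for display
bad <- tryCatch(safeDouble("a"), error = function(e) "error caught")
bad

## No input at all: our own message comes back
none <- tryCatch(safeDouble(), error = function(e) conditionMessage(e))
none
```

Checking early gives the user a message about the argument rather than an obscure failure from deep inside the calculation.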


The Return Has To Be Singular
When you use a function, it is necessary to “catch” the output with a single object name, as in
    newthing <- doubleMe(32)
    newthing
    [1] 64
    is.numeric(newthing)
    [1] TRUE
    is.vector(newthing)
    [1] TRUE
We expect “doubleMe(32)” should return 64, and it does.
R allows more freedom than most languages in returning complicated structures.

Generalization 1. Return a vector
One can return a vector with many values
    newthing2 <- doubleMe(c(1, 40, 190))
    newthing2
    [1]   2  80 380
    is.numeric(newthing2)
    [1] TRUE
    is.vector(newthing2)
    [1] TRUE


Generalization 2. Return a list

A list may include numbers, characters, vectors, etc., or data frames or other lists
Read code for function “lm”
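
A small sketch of a function that returns a list. The function describe and its element names are made up for illustration. The caller catches one object and then pulls out pieces by name.

```r
## Hypothetical summary function returning several kinds of things at once
describe <- function(x) {
    list(n = length(x),        # a single number
         range = range(x),     # a vector of two numbers
         label = "my data")    # a character string
}

res <- describe(c(4, 8, 15))
res$n            # access by name with $
res$range
res[["label"]]   # or with double brackets
```

Naming the elements in the list call is what makes the res$n style of access possible.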


Example: Return a data frame or a Whole Regression Model

This function returns an object
    aRegMod <- function(x1, x2, y) {
        lm(y ~ x1 + x2)
    }
That is a “single object”, but it is a very complicated structure. “Inside” the object returned by aRegMod there are parameter estimates, whole data matrices, and such.


Almost All Substantial R Functions Return A Diverse List of Items
An R function can return an R “list” object. Basically, that means a combination of anything.
Look at the end of R’s glm function, for example. After it has done a bunch of calculations, it has an object “fit” and then more and more details are wedged together with “fit”
    fit <- eval(call(if (is.function(method)) "method" else method,
        x = X, y = Y, weights = weights, start = start,
        etastart = etastart, mustart = mustart, offset = offset,
        family = family, control = control,
        intercept = attr(mt, "intercept") > 0L))
    if (length(offset) && attr(mt, "intercept") > 0L) {
        fit$null.deviance <- eval(call(if (is.function(method)) "method" else method,
            x = X[, "(Intercept)", drop = FALSE], y = Y,
            weights = weights, offset = offset, family = family,
            control = control, intercept = TRUE))$deviance
    }
    if (model) fit$model <- mf
    fit$na.action <- attr(mf, "na.action")
    if (x) fit$x <- X
    if (!y) fit$y <- NULL
    fit <- c(fit, list(call = call, formula = formula, terms = mt,
        data = data, offset = offset, control = control, method = method,
        contrasts = attr(X, "contrasts"), xlevels = .getXlevels(mt, mf)))
    class(fit) <- c(fit$class, c("glm", "lm"))
    fit


Here’s an example: aRegThing

I’ll throw together a function that runs a regression and returns a bunch of characteristics. aRegThing returns
  a regression model (and everything in it)
  a regression summary object
  a separate numeric estimate of b2
  the estimated standard error of b2
  a vector of t values
  and the Rsquare.


The aRegThing function declaration

    aRegThing <- function(x1, x2, y) {
        mymod <- lm(y ~ x1 + x2)
        mysum <- summary(mymod)
        b2hat <- coef(mymod)[3]
        ## row 3 of the coefficient table is x2; column 2 is "Std. Error"
        b2hatse <- coef(mysum)[3, 2]
        alltvals <- coef(mysum)[, 3]
        rsq <- mysum$r.squared
        list(theModel = mymod, theSummary = mysum, estb2 = b2hat,
             seb2hat = b2hatse, allt = alltvals, rsquare = rsq)
    }


Here’s an example usage of aRegThing

    x1 <- rnorm(1000)
    x2 <- rnorm(1000, m = 44, sd = 7)
    y <- 4 + 5 * x1 + 0.2 * x2 + rnorm(1000, m = 0, sd = 63)
    aRegThing(x1, x2, y)


If I wanted to run aRegThing over and over, I could

    myList <- vector(mode = "list", length = 1000)
    for (i in 1:1000) {
        x1 <- rnorm(1000)
        x2 <- rnorm(1000, m = 44, sd = 7)
        y <- 4 + 5 * x1 + 0.2 * x2 + rnorm(1000, m = 0, sd = 63)
        myList[[i]] <- aRegThing(x1, x2, y)
    }
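
Once the results are collected in a list, individual pieces can be pulled out of every run. The sketch below is an illustration, not the author's code: it repeats aRegThing in condensed form (with the standard error taken from row 3, column 2 of the coefficient table), uses an arbitrary seed, and runs only 50 replications to stay quick.

```r
## Condensed version of aRegThing so the sketch is self-contained
aRegThing <- function(x1, x2, y) {
    mymod <- lm(y ~ x1 + x2)
    mysum <- summary(mymod)
    list(theModel = mymod, theSummary = mysum,
         estb2 = coef(mymod)[3],
         seb2hat = coef(mysum)[3, 2],
         allt = coef(mysum)[, 3],
         rsquare = mysum$r.squared)
}

set.seed(2012)    # arbitrary seed, only for reproducibility
nReps <- 50       # fewer than 1000 runs, to keep the sketch quick
myList <- vector(mode = "list", length = nReps)
for (i in seq_len(nReps)) {
    x1 <- rnorm(1000)
    x2 <- rnorm(1000, mean = 44, sd = 7)
    y <- 4 + 5 * x1 + 0.2 * x2 + rnorm(1000, mean = 0, sd = 63)
    myList[[i]] <- aRegThing(x1, x2, y)
}

## Pull one element out of every result
b2 <- sapply(myList, function(res) res$estb2)
rsq <- sapply(myList, function(res) res$rsquare)
summary(b2)
mean(rsq)
```

The spread of b2 across runs gives a rough picture of the sampling variability of the slope estimate.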


Functions replace “cut and paste” editing

If you find yourself using “copy and paste” to repeat stanzas with slight variations, you are almost certainly doing the wrong thing.
Re-conceptualize, write a function that does the right thing
Use the function over and over
Why is this important: AVOIDING mistakes due to editing mistakes
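
As a sketch of the idea, suppose a script needs a set of 0/1 indicator flags where the first k are switched on, for each of several conditions. Rather than pasting one stanza of assignments per condition, a single function can produce any of them. The name flagsUpTo is hypothetical.

```r
## Hypothetical helper: the first k of n indicator flags are 1, the rest 0
flagsUpTo <- function(k, n) {
    stopifnot(k >= 0, k <= n)
    as.numeric(seq_len(n) <= k)
}

f3 <- flagsUpTo(3, 8)
f3

## One call per condition replaces one pasted stanza per condition
allFlags <- lapply(1:8, flagsUpTo, n = 8)
allFlags[[5]]
```

A typo in a pasted stanza affects one condition silently; a mistake in the function shows up in every condition and is fixed in one place.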


A Bad Example
I see a lot of user code that needs to be tightened up.
    ## Writes Mplus code to generate the data ##
    ## then runs the Mplus code it generates ##
    # ------------------Clear All-------------------------- #
    rm(list = ls())
    # ---------------SPECIFICATIONS------------------------ #
    # Specify root directory and location of mplus; use double slashes; leave end open
    dirroot <- "D:\\Users\username\Desktop\simulation" # where to place data, etc.
    iter = 1000 # how many iterations per condition
    set.seed(7913025) # set random seed
    # --------------END SPECIFICATIONS--------------------- #
    # n per cluster sample size
    for (perclust in c(100)) {
    # number of clusters - later for MLM application
    for (nclust in c(1)) {
    # common correlation
    for (setcorr in c(1:8)) {
    if (setcorr == 1) {
    corr.10 <- 1
    corr.20 <- 0
    corr.30 <- 0
    corr.40 <- 0
    corr.50 <- 0
    corr.60 <- 0
    corr.70 <- 0
    corr.80 <- 0
    }
    if (setcorr == 2) {
    corr.10 <- 1
    corr.20 <- 1
    corr.30 <- 0
    corr.40 <- 0
    corr.50 <- 0
    corr.60 <- 0
    corr.70 <- 0
    corr.80 <- 0
    }
    if (setcorr == 3) {
    corr.10 <- 1
    corr.20 <- 1
    corr.30 <- 1
    corr.40 <- 0
    corr.50 <- 0
    corr.60 <- 0
    corr.70 <- 0
    corr.80 <- 0
    }
    if (setcorr == 4) {
    corr.10 <- 1
    corr.20 <- 1
    corr.30 <- 1
    corr.40 <- 1
    corr.50 <- 0
    corr.60 <- 0
    corr.70 <- 0
    corr.80 <- 0
    }
    if (setcorr == 5) {
    corr.10 <- 1
    corr.20 <- 1
    corr.30 <- 1
    corr.40 <- 1
    corr.50 <- 1
    corr.60 <- 0
    corr.70 <- 0
    corr.80 <- 0
    }
    if (setcorr == 6) {
    corr.10 <- 1
    corr.20 <- 1
    corr.30 <- 1
    corr.40 <- 1
    corr.50 <- 1
    corr.60 <- 1
    corr.70 <- 0
    corr.80 <- 0
    }
    if (setcorr == 7) {
    corr.10 <- 1
    corr.20 <- 1
    corr.30 <- 1
    corr.40 <- 1
    corr.50 <- 1
    corr.60 <- 1
    corr.70 <- 1
    corr.80 <- 0
    }
    if (setcorr == 8) {
    corr.10 <- 1
    corr.20 <- 1
    corr.30 <- 1
    corr.40 <- 1
    corr.50 <- 1
    corr.60 <- 1
    corr.70 <- 1
    corr.80 <- 1
    }
    # missing pattern
    for (setpattern in c(1:4)) {
    if (setpattern == 1) {
    mcar <- 1
    mar <- 0
    mnar <- 0
    }
    if (setpattern == 2) {
    mcar <- 0
    mar <- 1
    mnar <- 0
    }
    if (setpattern == 3) {
    mcar <- 0
    mar <- 0
    mnar <- 1
    }
    if (setpattern == 4) {
    mcar <- 0
    mar <- 1
    mnar <- 1
    }
    # percent missing
    for (percentmiss in c(1:6)) {
    if (percentmiss == 1) {
    miss.10 <- 1
    miss.20 <- 0
    miss.30 <- 0

A Bad Example ...
    miss.40 <- 0
    miss.50 <- 0
    miss.60 <- 0
    miss.70 <- 0
}
if (percentmiss == 2) {
    miss.10 <- 1
    miss.20 <- 1
    miss.30 <- 0
    miss.40 <- 0
    miss.50 <- 0
    miss.60 <- 0
    miss.70 <- 0
}
if (percentmiss == 3) {
    miss.10 <- 1
    miss.20 <- 1
    miss.30 <- 1
    miss.40 <- 0
    miss.50 <- 0
    miss.60 <- 0
    miss.70 <- 0

A Bad Example ...
}
if (percentmiss == 4) {
    miss.10 <- 1
    miss.20 <- 1
    miss.30 <- 1
    miss.40 <- 1
    miss.50 <- 0
    miss.60 <- 0
    miss.70 <- 0
}
if (percentmiss == 5) {
    miss.10 <- 1
    miss.20 <- 1
    miss.30 <- 1
    miss.40 <- 1
    miss.50 <- 1
    miss.60 <- 0
    miss.70 <- 0
}
if (percentmiss == 6) {
    miss.10 <- 1
    miss.20 <- 1

A Bad Example ...

    miss.30 <- 1
    miss.40 <- 1
    miss.50 <- 1
    miss.60 <- 1
    miss.70 <- 0
}

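The `if` ladders for `corr.10` ... `corr.80` and `miss.10` ... `miss.70` each encode one rule: the first k indicators are 1 and the rest are 0. A minimal sketch of that rule as a function follows. The helper name `stepIndicators` and the lookup table `patterns` are illustrative assumptions, not part of the original script.

```r
# Hypothetical helper (not in the original script): return a named 0/1
# vector in which the first `level` entries are 1 and the rest are 0.
stepIndicators <- function(level, labels) {
    x <- as.integer(seq_along(labels) <= level)
    names(x) <- labels
    x
}

corr <- stepIndicators(3, paste0("corr.", seq(10, 80, by = 10)))
miss <- stepIndicators(6, paste0("miss.", seq(10, 70, by = 10)))

# The four missing-data patterns fit in a small lookup table,
# one row per value of setpattern.
patterns <- matrix(c(1, 0, 0,
                     0, 1, 0,
                     0, 0, 1,
                     0, 1, 1),
                   ncol = 3, byrow = TRUE,
                   dimnames = list(NULL, c("mcar", "mar", "mnar")))
patterns[4, ]  # the row used when setpattern == 4
```

Later code could then use `corr["corr.40"]` or `sum(corr)` instead of eight separate variables.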


Don’t Let Your Code Look Like This
# aux method
for (aux in c(1:3)){
if (aux == 1) {
    var <- 1
    pca <- 0
}
if (aux == 2) {
    var <- 0
    pca <- 1
}
if (aux == 3) {
    var <- 0
    pca <- 0
}
# number of auxiliary variables
for (auxnumber in c(1:7)){
if (auxnumber == 1) {
    aux.1 <- 1
    aux.2 <- 0
    aux.3 <- 0
    aux.4 <- 0

Don’t Let Your Code Look Like This ...
    aux.5 <- 0
    aux.6 <- 0
    aux.7 <- 0
}
if (auxnumber == 2) {
    aux.1 <- 1
    aux.2 <- 1
    aux.3 <- 0
    aux.4 <- 0
    aux.5 <- 0
    aux.6 <- 0
    aux.7 <- 0
}
if (auxnumber == 3) {
    aux.1 <- 1
    aux.2 <- 1
    aux.3 <- 1
    aux.4 <- 0
    aux.5 <- 0
    aux.6 <- 0
    aux.7 <- 0
}

Don’t Let Your Code Look Like This ...
if (auxnumber == 4) {
    aux.1 <- 1
    aux.2 <- 1
    aux.3 <- 1
    aux.4 <- 1
    aux.5 <- 0
    aux.6 <- 0
    aux.7 <- 0
}
if (auxnumber == 5) {
    aux.1 <- 1
    aux.2 <- 1
    aux.3 <- 1
    aux.4 <- 1
    aux.5 <- 1
    aux.6 <- 0
    aux.7 <- 0
}
if (auxnumber == 6) {
    aux.1 <- 1
    aux.2 <- 1
    aux.3 <- 1

Don’t Let Your Code Look Like This ...
    aux.4 <- 1
    aux.5 <- 1
    aux.6 <- 1
    aux.7 <- 0
}
if (auxnumber == 7) {
    aux.1 <- 1
    aux.2 <- 1
    aux.3 <- 1
    aux.4 <- 1
    aux.5 <- 1
    aux.6 <- 1
    aux.7 <- 1
}
# CREATE DIRECTORIES
path <- paste(dirroot, perclust, nclust, setcorr, setpattern, percentmiss, aux, auxnumber, sep="\")
shell(paste("mkdir", path, sep=" "))
# -------------- seperate missing data mechanisms in SAS -------------- #
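The directory step pastes path pieces together by hand and shells out to `mkdir`, which ties the script to one operating system. A minimal sketch using base R's `file.path()` and `dir.create()` instead is shown here. The helper name `conditionDir` and the example values are illustrative assumptions, not part of the original script.

```r
# Hypothetical helper (not in the original script): build the directory
# for one simulation condition and create it, including parent folders.
conditionDir <- function(dirroot, levels) {
    path <- do.call(file.path, as.list(c(dirroot, as.character(levels))))
    dir.create(path, recursive = TRUE, showWarnings = FALSE)
    path
}

# Example under a temporary root; the seven numbers stand in for
# perclust, nclust, setcorr, setpattern, percentmiss, aux, auxnumber.
root <- file.path(tempdir(), "simdemo")
path <- conditionDir(root, c(50, 10, 3, 1, 2, 1, 4))
```

Because `file.path()` supplies the separator, the same call works on Windows, Mac, and Linux.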


Don’t Let Your Code Look Like This
##########################################
#          WRITE GENERATION CODE         #
##########################################
pathGEN <- paste(dirroot, perclust, nclust, setcorr, sep="\")
gen <- paste(pathGEN, "generatecorrdata.inp", sep="\")
testdata1 <- paste(pathGEN, "data1.dat", sep="\")
cat('MONTECARLO: \n', file=gen)
cat('NAMES ARE x y q1-q8; \n', file=gen, append=T)
cat('NOBSERVATIONS = 1000; \n', file=gen, append=T)
cat('NREPS = ', iter, '; \n', file=gen, append=T)
cat('SEED = ', round(runif(1)*10000000), '; \n', file=gen, append=T)
cat('REPSAVE=ALL; \n', file=gen, append=T)
cat('SAVE=\n', pathGEN, '\data*.dat; \n', file=gen, append=T, sep="")
cat('MODEL POPULATION: \n', file=gen, append=T)
cat('[q1-q8*0 x*0 y*0]; \n', file=gen, append=T)
cat('q1-q8*1; x*1; y*1; \n', file=gen, append=T)
cat('q1-q8 with q1-q8*.50; \n', file=gen, append=T)
cat('x with y*.50; \n', file=gen, append=T)

Don’t Let Your Code Look Like This ...
cat('x with q1-q8*.50; \n', file=gen, append=T)
cat('y with q1-q8*', .0+(corr.10*(.10+corr.20*.10+corr.30*.10+corr.40*+corr.40*.10+corr.50*.10+corr.60*.10+corr.70*.10+corr.80*.10)), '; \n', file=gen, append=T, sep="")
if (file.exists(testdata1)){
} else {
    shell(paste("mplus.exe", gen, paste(pathGEN, "save1.out", sep="\"), sep=" "))
}
# -------------------------------------------------------------- #
##########################################
#            GENERATE SAS CODE           #
##########################################
pathSAS <- paste(dirroot, perclust, nclust, setcorr, setpattern, percentmiss, aux, auxnumber, sep="\")
pathSASdata <- paste(dirroot, perclust, nclust, setcorr, setpattern, percentmiss, sep="\")

Don’t Let Your Code Look Like This ...
smcarmiss <- paste(pathSAS, "modifydata.sas", sep="\")
testdata2 <- paste(pathSASdata, "data1.dat", sep="\")
# ------------------------------------------------------- #
# ------------- import SIM data into SAS ---------------- #
# ------------------------------------------------------- #
if (file.exists(testdata2)){
} else {
cat('proc printto \n', file=smcarmiss, append=T)
cat('log="R:\\users\username\data\simLOG2\LOGLOG.log" \n', file=smcarmiss, append=T)
cat('print="R:\\users\username\data\simLOG2\LSTLST.lst" \n', file=smcarmiss, append=T)
cat('new; \n', file=smcarmiss, append=T)
cat('run; \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
cat('%macro importMPLUS; \n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('; \n', file=smcarmiss, append=T)
cat('data work.data&i; \n', file=smcarmiss, append=T)
cat('infile ', file=smcarmiss, append=T)
cat('"', file=smcarmiss, append=T)
cat(paste(pathGEN, "data&i..dat", sep="\"), file=smcarmiss, append=T)
cat('"; \n', file=smcarmiss, append=T)
cat('INPUT x y q1 q2 q3 q4 q5 q6 q7 q8; /*<--- insert variables here*/ \n', file=smcarmiss, append=T)
cat('RUN; \n', file=smcarmiss, append=T)
cat('%end; \n', file=smcarmiss, append=T)
cat('%mend; \n', file=smcarmiss, append=T)
cat('%importMPLUS \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
##################################################
#          Draw random sample of size N          #
##################################################
cat('%macro samplesize; \n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat('; \n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('Proc surveyselect data=work.data&i out=work.sampledata&i method=SRS \n', file=smcarmiss, append=T)
if (perclust == 50) {
    cat('sampsize=50 \n', file=smcarmiss, append=T)
} else if (perclust == 75) {
    cat('sampsize=75 \n', file=smcarmiss, append=T)
} else if (perclust == 100) {
    cat('sampsize=100 \n', file=smcarmiss, append=T)
} else if (perclust == 200) {
    cat('sampsize=200 \n', file=smcarmiss, append=T)
} else if (perclust == 400) {
    cat('sampsize=400 \n', file=smcarmiss, append=T)
} else if (perclust == 800) {
    cat('sampsize=800 \n', file=smcarmiss, append=T)
} else if (perclust == 1000) {
    cat('sampsize=1000 \n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...

}
cat('SEED = ', round(runif(1)*10000000), '; \n', file=smcarmiss, append=T)
cat('RUN; \n', file=smcarmiss, append=T)
cat('%end; \n', file=smcarmiss, append=T)
cat('%mend; \n', file=smcarmiss, append=T)
cat('%samplesize \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
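Every branch of the `sampsize` ladder writes the value of `perclust` back out, so the seven branches can collapse into one formatted line. A minimal sketch is shown here. The helper name `sampsizeLine` and its `allowed` argument are illustrative assumptions, not part of the original script.

```r
# Hypothetical helper (not in the original script): one line of SAS text
# for any of the cluster sizes the simulation uses.
sampsizeLine <- function(perclust,
                         allowed = c(50, 75, 100, 200, 400, 800, 1000)) {
    stopifnot(perclust %in% allowed)
    sprintf("sampsize=%d \n", as.integer(perclust))
}

cat(sampsizeLine(200))
```

One design difference: the original ladder silently writes nothing for an unexpected `perclust`, while this sketch stops with an error.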


Don’t Let Your Code Look Like This
# ------------------------------------------------------------ #
# ------------- Export sample data for later use ------------- #
# ------------------------------------------------------------ #
# cat('LIBNAME mysim "R:\users\username\data\mysim"; \n', file=smcarmiss, append=T)
# cat('%macro samplesave; \n', file=smcarmiss, append=T)
# cat('%do i=1 %to ', file=smcarmiss, append=T)
# cat(paste(iter), file=smcarmiss, append=T)
# cat('; \n', file=smcarmiss, append=T)
# cat('data mysim.sampledata&i; set work.sampledata&i; \n', file=smcarmiss, append=T)
# cat('RUN; \n', file=smcarmiss, append=T)
# cat('%end; \n', file=smcarmiss, append=T)
# cat('%mend; \n', file=smcarmiss, append=T)
# cat('%samplesave \n', file=smcarmiss, append=T)


Don’t Let Your Code Look Like This ...
##################################################
#                interaction MACRO               #
##################################################
cat('\n', file=smcarmiss, append=T)
cat('%MACRO interact(vars, quadr = 1, prefix = INT); \n', file=smcarmiss, append=T)
cat('%LET c=1; \n', file=smcarmiss, append=T)
cat('%DO %WHILE(%SCAN(&vars,&c) NE); \n', file=smcarmiss, append=T)
cat('%LET c=%EVAL(&c+1); \n', file=smcarmiss, append=T)
cat('%END; \n', file=smcarmiss, append=T)
cat('%LET nvars=%EVAL(&c-1); \n', file=smcarmiss, append=T)
cat('%DO i = 1 %TO &nvars; \n', file=smcarmiss, append=T)
cat('%DO j = %EVAL(&i+1-&quadr) %TO &nvars; \n', file=smcarmiss, append=T)
cat('&prefix.%SCAN(&vars,&i)%SCAN(&vars,&j) = \n', file=smcarmiss, append=T)
cat('%SCAN(&vars,&i) * %SCAN(&vars,&j); \n', file=smcarmiss, append=T)
cat('%END; \n', file=smcarmiss, append=T)
cat('%END; \n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...

cat('%MEND; \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)


Don’t Let Your Code Look Like This
# ------------------------------------------------------------ #
# ------------------------ call MACRO ------------------------ #
# ------------------------------------------------------------ #
cat('%macro inter; \n', file=smcarmiss, append=T)
cat('%do k=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat('; \n', file=smcarmiss, append=T)
cat('data interdata&k; SET work.sampledata&k; \n', file=smcarmiss, append=T)
cat('%INTERACT(q1 q2, quadr=1); \n', file=smcarmiss, append=T)
cat('%end; \n', file=smcarmiss, append=T)
cat('RUN; \n', file=smcarmiss, append=T)
cat('%mend; \n', file=smcarmiss, append=T)
cat('%inter \n', file=smcarmiss, append=T)
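The same wrapper appears again and again in this script: open a SAS macro, loop from 1 to `iter`, write a body, close the loop, close the macro, call it. A minimal sketch of that wrapper as one function, written out with a single `writeLines()` call rather than many `cat(..., append=T)` calls, follows. The helper name `sasLoopMacro`, the value of `iter`, the temporary file, and the exact statement order inside the loop are illustrative assumptions, not part of the original script.

```r
# Hypothetical helper (not in the original script): wrap `body` in a SAS
# macro that loops from 1 to `iter`, then call the macro once.
sasLoopMacro <- function(name, iter, body, index = "i") {
    c(paste0("%macro ", name, ";"),
      paste0("%do ", index, "=1 %to ", iter, ";"),
      body,
      "%end;",
      "%mend;",
      paste0("%", name))
}

lines <- sasLoopMacro("inter", iter = 100, index = "k",
                      body = c("data interdata&k; SET work.sampledata&k;",
                               "%INTERACT(q1 q2, quadr=1);",
                               "RUN;"))
sasfile <- tempfile(fileext = ".sas")   # stand-in for smcarmiss
writeLines(lines, sasfile)
```

Building the text as a character vector first also makes it easy to inspect or test before anything is written to disk.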


Don’t Let Your Code Look Like This ...
# -------------------------------------------------------------- #
# ------------------------ SET P(miss) ------------------------- #
# -------------------------------------------------------------- #
cat('\n', file=smcarmiss, append=T)
cat('proc format; \n', file=smcarmiss, append=T)
cat('value user 0.00-0.10 = 1 \n', file=smcarmiss, append=T)
cat('0.10<-0.20 = 2 \n', file=smcarmiss, append=T)
cat('0.20<-0.30 = 3 \n', file=smcarmiss, append=T)
cat('0.30<-0.40 = 4 \n', file=smcarmiss, append=T)
cat('0.40<-0.50 = 5 \n', file=smcarmiss, append=T)
cat('0.50<-0.60 = 6 \n', file=smcarmiss, append=T)
cat('0.60<-0.70 = 7 \n', file=smcarmiss, append=T)
cat('0.70<-0.80 = 8 \n', file=smcarmiss, append=T)
cat('0.80<-0.90 = 9 \n', file=smcarmiss, append=T)
cat('0.90<-1.00 = 10 \n', file=smcarmiss, append=T)
cat('; \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('data myrandomdata(drop= :); \n', file=smcarmiss, append=T)
cat('do i=1 to 1e5; \n', file=smcarmiss, append=T)
cat('random = rantbl(999, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10); \n', file=smcarmiss, append=T)
cat('output; \n', file=smcarmiss, append=T)
cat('end; \n', file=smcarmiss, append=T)
cat('run; \n', file=smcarmiss, append=T)
# -------------------------------------------------------------- #
# ----------------------- MCAR MECHANISM ----------------------- #
# -------------------------------------------------------------- #
if (setpattern == 1){
cat('\n', file=smcarmiss, append=T)
cat('%macro MCAR; \n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('; \n', file=smcarmiss, append=T)
cat('/*MCAR*/ \n', file=smcarmiss, append=T)
cat('Data MCAR&i; set interdata&i; \n', file=smcarmiss, append=T)
cat('array vars(10) x y q1 q2 q3 q4 q5 q6 q7 q8; \n', file=smcarmiss, append=T)
cat('random=put(ranuni(12345&i), username.); \n', file=smcarmiss, append=T)
cat('drop random; \n', file=smcarmiss, append=T)
cat('if random=', 0+((miss.10*1)), ' then do; vars(2)=.; end; else \n', file=smcarmiss, append=T, sep="")
cat('if random=', 0+((miss.10*1+miss.20*1)), ' then do; vars(2)=.; end; else \n', file=smcarmiss, append=T, sep="")
cat('if random=', 0+((miss.10*1+miss.20*1+miss.30*1)), ' then do; vars(2)=.; end; else \n', file=smcarmiss, append=T, sep="")
cat('if random=', 0+((miss.10*1+miss.20*1+miss.30*1+miss.40*1)), ' then do; vars(2)=.; end; else \n', file=smcarmiss, append=T, sep="")


Don’t Let Your Code Look Like This ...
cat('if random=', 0+((miss.10*1+miss.20*1+miss.30*1+miss.40*1+miss.50*1)), ' then do; vars(2)=.; end; else \n', file=smcarmiss, append=T, sep="")
cat('if random=', 0+((miss.10*1+miss.20*1+miss.30*1+miss.40*1+miss.50*1+miss.60*1)), ' then do; vars(2)=.; end; \n', file=smcarmiss, append=T, sep="")
cat('RUN; \n', file=smcarmiss, append=T)
cat('%end; \n', file=smcarmiss, append=T)
cat('%mend; \n', file=smcarmiss, append=T)
cat('%MCAR \n', file=smcarmiss, append=T)
##################################################
#              generate PCAs regular             #
##################################################
cat('\n', file=smcarmiss, append=T)
cat('%macro PCA; \n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat('; \n', file=smcarmiss, append=T)
cat('proc princomp data=MCAR&i out=work.prin&i N=7; \n', file=smcarmiss, append=T)


Don’t Let Your Code Look Like This
cat('var q1 q2 q3 q4 q5 q6 q7 q8; \n', file=smcarmiss, append=T)
cat('run; \n', file=smcarmiss, append=T)
cat('%end; \n', file=smcarmiss, append=T)
cat('%mend; \n', file=smcarmiss, append=T)
cat('%PCA \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
# ------------------------------------------------------------ #
# ------------------ Create data list file ------------------- #
# ------------------------------------------------------------ #
cat('%let folder = ', file=smcarmiss, append=T)
cat(paste(pathSASdata, sep="\"), file=smcarmiss, append=T)
cat('\\; \n', file=smcarmiss, append=T)
cat('%macro importlist; \n', file=smcarmiss, append=T)
cat('proc iml; \n', file=smcarmiss, append=T)
cat('file ', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('"', file=smcarmiss, append=T)
cat('&folder.datalist.dat', file=smcarmiss, append=T)
cat('"; \n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat('; \n', file=smcarmiss, append=T)
cat('put ("data&i..dat"); \n', file=smcarmiss, append=T)
cat('%end; \n', file=smcarmiss, append=T)
cat('closefile ', file=smcarmiss, append=T)
cat('"', file=smcarmiss, append=T)
cat('&folder.datalist.dat', file=smcarmiss, append=T)
cat('"; \n', file=smcarmiss, append=T)
cat('%mend; \n', file=smcarmiss, append=T)
cat('%importlist \n', file=smcarmiss, append=T)
cat('quit; \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
# ------------------------------------------------------------ #
# --------------- Convert . to 999 for MPLUS ----------------- #


Don’t Let Your Code Look Like This ...
# ------------------------------------------------------------ #
cat('%macro codemissing; \n', file=smcarmiss, append=T)
cat('%do j=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat('; \n', file=smcarmiss, append=T)
cat('/*recode missing number to 999*/ \n', file=smcarmiss, append=T)
cat('data prin&j; set prin&j; \n', file=smcarmiss, append=T)
cat('array vars{*} _numeric_; \n', file=smcarmiss, append=T)
cat('do i = 1 to dim(vars); \n', file=smcarmiss, append=T)
cat('if vars{i} = . then vars{i} = 999; \n', file=smcarmiss, append=T)
cat('end; \n', file=smcarmiss, append=T)
cat('drop i; \n', file=smcarmiss, append=T)
cat('run; \n', file=smcarmiss, append=T)
cat('%end; \n', file=smcarmiss, append=T)
cat('%mend; \n', file=smcarmiss, append=T)
cat('%codemissing \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)


Don’t Let Your Code Look Like This ...
# ------------------------------------------------------------ #
# ---------- Export data with Principal Components ----------- #
# ------------------------------------------------------------ #
cat('%macro exportpca; \n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat('; \n', file=smcarmiss, append=T)
cat('PROC EXPORT DATA=work.prin&i OUTFILE = ', file=smcarmiss, append=T)
cat('"', file=smcarmiss, append=T)
cat(paste(pathSASdata, "data&i..dat", sep="\"), file=smcarmiss, append=T)
cat('" \n', file=smcarmiss, append=T)
cat('DBMS= dlm replace; \n', file=smcarmiss, append=T)
cat('putnames=no; \n', file=smcarmiss, append=T)
cat('RUN; \n', file=smcarmiss, append=T)
cat('%end; \n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...

cat('%mend; \n', file=smcarmiss, append=T)
cat('%exportpca \n', file=smcarmiss, append=T)
}


Don’t Let Your Code Look Like This
# -------------------------------------------------------------- #
# ----------------------- MAR MECHANISM ------------------------ #
# -------------------------------------------------------------- #
else if (setpattern == 2){
cat('\n', file=smcarmiss, append=T)
cat('%macro MAR; \n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat('; \n', file=smcarmiss, append=T)
cat('/*MAR*/ \n', file=smcarmiss, append=T)
cat('proc univariate ', file=smcarmiss, append=T)
cat('Data=interdata&i; \n', file=smcarmiss, append=T)
cat('var q1; \n', file=smcarmiss, append=T)
cat('output out= deciles&i pctlpts=10 20 30 40 50 60 70 80 90 pctlpre=pct; \n', file=smcarmiss, append=T)
cat('run; \n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('\n', file=smcarmiss, append=T)
cat('/*write the cutpoints to macro variables*/ \n', file=smcarmiss, append=T)
cat('data _null_; \n', file=smcarmiss, append=T)
cat('set deciles&i; \n', file=smcarmiss, append=T)
cat('call symput("qu1", pct10); \n', file=smcarmiss, append=T)
cat('call symput("qu2", pct20); \n', file=smcarmiss, append=T)
cat('call symput("qu3", pct30); \n', file=smcarmiss, append=T)
cat('call symput("qu4", pct40); \n', file=smcarmiss, append=T)
cat('call symput("qu5", pct50); \n', file=smcarmiss, append=T)
cat('call symput("qu6", pct60); \n', file=smcarmiss, append=T)
cat('call symput("qu7", pct70); \n', file=smcarmiss, append=T)
cat('call symput("qu8", pct80); \n', file=smcarmiss, append=T)


Don’t Let Your Code Look Like This ...
cat('call symput("qu9", pct90); \n', file=smcarmiss, append=T)
cat('run; \n', file=smcarmiss, append=T)
cat('Data MAR&i; set interdata&i; \n', file=smcarmiss, append=T)
cat('array vars(10) x y q1 q2 q3 q4 q5 q6 q7 q8; \n', file=smcarmiss, append=T)
cat('if &qu', 0+((miss.10*1+miss.20*1+miss.30*1+miss.40*1+miss.50*1+miss.60*1)), ' >= Q1 then do; vars(2)=.; end;', file=smcarmiss, append=T, sep="")
cat('RUN;', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
cat('%end; \n', file=smcarmiss, append=T)
cat('%mend; \n', file=smcarmiss, append=T)
cat('%MAR \n', file=smcarmiss, append=T)


Don’t Let Your Code Look Like This
##################################################
#              generate PCAs regular             #
##################################################
cat('\n', file=smcarmiss, append=T)
cat('%macro PCA; \n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat('; \n', file=smcarmiss, append=T)
cat('proc princomp data=MAR&i out=work.prin&i N=7; \n', file=smcarmiss, append=T)
cat('var q1 q2 q3 q4 q5 q6 q7 q8; \n', file=smcarmiss, append=T)
cat('run; \n', file=smcarmiss, append=T)
cat('%end; \n', file=smcarmiss, append=T)
cat('%mend; \n', file=smcarmiss, append=T)
cat('%PCA \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
# ------------------------------------------------------------ #
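This PCA block is a copy of the one written for the MCAR branch. Only the input data set name changes, from `MCAR&i` to `MAR&i`. A minimal sketch that makes the changing piece an argument is shown here. The helper name `pcaMacro`, its defaults, and the value of `iter` are illustrative assumptions, not part of the original script.

```r
# Hypothetical helper (not in the original script): SAS text for the PCA
# step, parameterized by the data set prefix ("MCAR" or "MAR").
pcaMacro <- function(prefix, iter, ncomp = 7,
                     vars = paste0("q", 1:8)) {
    c("%macro PCA;",
      paste0("%do i=1 %to ", iter, ";"),
      paste0("proc princomp data=", prefix, "&i out=work.prin&i N=", ncomp, ";"),
      paste0("var ", paste(vars, collapse = " "), ";"),
      "run;",
      "%end;",
      "%mend;",
      "%PCA")
}

mcarLines <- pcaMacro("MCAR", iter = 100)
marLines  <- pcaMacro("MAR",  iter = 100)
```

With one function, a fix to the PCA step is made once, not once per copy.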

Don’t Let Your Code Look Like This ...
#-----------------Create datalist file-----------------------#
#------------------------------------------------------------#
cat('%let folder = ', file=smcarmiss, append=T)
cat(paste(pathSASdata, sep="\"), file=smcarmiss, append=T)
cat('\\;\n', file=smcarmiss, append=T)
cat('%macro importlist;\n', file=smcarmiss, append=T)
cat('proc iml;\n', file=smcarmiss, append=T)
cat('file ', file=smcarmiss, append=T)
cat('"', file=smcarmiss, append=T)
cat('&folder.datalist.dat', file=smcarmiss, append=T)
cat('";\n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat(';\n', file=smcarmiss, append=T)
cat('put("data&i..dat");\n', file=smcarmiss, append=T)
cat('%end;\n', file=smcarmiss, append=T)
cat('closefile ', file=smcarmiss, append=T)
cat('"', file=smcarmiss, append=T)
cat('&folder.datalist.dat', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('";\n', file=smcarmiss, append=T)
cat('%mend;\n', file=smcarmiss, append=T)
cat('%importlist \n', file=smcarmiss, append=T)
cat('quit;\n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
#------------------------------------------------------------#
#-------------Convert . to 999 for MPLUS---------------------#
#------------------------------------------------------------#
cat('%macro codemissing;\n', file=smcarmiss, append=T)
cat('%do j=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat(';\n', file=smcarmiss, append=T)
cat('/* recode missing number to 999 */ \n', file=smcarmiss, append=T)
cat('data prin&j; set prin&j;\n', file=smcarmiss, append=T)
cat('array vars{*} _numeric_;\n', file=smcarmiss, append=T)
cat('do i = 1 to dim(vars);\n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...

cat('if vars{i} = . then vars{i} = 999;\n', file=smcarmiss, append=T)
cat('end;\n', file=smcarmiss, append=T)
cat('drop i;\n', file=smcarmiss, append=T)
cat('run;\n', file=smcarmiss, append=T)
cat('%end;\n', file=smcarmiss, append=T)
cat('%mend;\n', file=smcarmiss, append=T)
cat('%codemissing \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)


Don’t Let Your Code Look Like This
#------------------------------------------------------------#
#---------Export data with Principal Components--------------#
#------------------------------------------------------------#
cat('%macro exportpca;\n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat(';\n', file=smcarmiss, append=T)
cat('PROC EXPORT DATA=work.prin&i OUTFILE = ', file=smcarmiss, append=T)
cat('"', file=smcarmiss, append=T)
cat(paste(pathSASdata, "data&i..dat", sep="\"), file=smcarmiss, append=T)
cat('" \n', file=smcarmiss, append=T)
cat('DBMS= dlm replace;\n', file=smcarmiss, append=T)
cat('putnames=no;\n', file=smcarmiss, append=T)
cat('RUN;\n', file=smcarmiss, append=T)
cat('%end;\n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('%mend;\n', file=smcarmiss, append=T)
cat('%exportpca \n', file=smcarmiss, append=T)
}
#--------------------------------------------------------------#
#------------------------MNAR MECHANISM------------------------#
#--------------------------------------------------------------#
else if (setpattern == 3) {
cat('\n', file=smcarmiss, append=T)
cat('%macro MNAR;\n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat(';\n', file=smcarmiss, append=T)
cat('/*MAR*/ \n', file=smcarmiss, append=T)
cat('proc univariate ', file=smcarmiss, append=T)
cat('Data=interdata&i;\n', file=smcarmiss, append=T)
cat('var y;\n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('output out=deciles&i pctlpts=10 20 30 40 50 60 70 80 90 pctlpre=pct;\n', file=smcarmiss, append=T)
cat('run;\n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
cat('/* write the cutpoints to macro variables */ \n', file=smcarmiss, append=T)
cat('data _null_;\n', file=smcarmiss, append=T)
cat('set deciles&i;\n', file=smcarmiss, append=T)
cat('call symput("qu1", pct10);\n', file=smcarmiss, append=T)
cat('call symput("qu2", pct20);\n', file=smcarmiss, append=T)
cat('call symput("qu3", pct30);\n', file=smcarmiss, append=T)
cat('call symput("qu4", pct40);\n', file=smcarmiss, append=T)
cat('call symput("qu5", pct50);\n', file=smcarmiss, append=T)
cat('call symput("qu6", pct60);\n', file=smcarmiss, append=T)
cat('call symput("qu7", pct70);\n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('call symput("qu8", pct80);\n', file=smcarmiss, append=T)
cat('call symput("qu9", pct90);\n', file=smcarmiss, append=T)
cat('run;\n', file=smcarmiss, append=T)
cat('Data MNAR&i; set interdata&i;\n', file=smcarmiss, append=T)
cat('array vars(10) x y q1 q2 q3 q4 q5 q6 q7 q8;\n', file=smcarmiss, append=T)
cat('if &qu', 0+((miss.10*1+miss.20*1+miss.30*1+miss.40*1+miss.50*1+miss.60*1)), ' >= y then do; vars(2)=.; end;', file=smcarmiss, append=T, sep="")
cat('RUN;', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
cat('%end;\n', file=smcarmiss, append=T)
cat('%mend;\n', file=smcarmiss, append=T)
cat('% MNAR \n', file=smcarmiss, append=T)
##################################################
# generate PCAs regular #
##################################################
cat('\n', file=smcarmiss, append=T)
cat('%macro PCA;\n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...

cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat(';\n', file=smcarmiss, append=T)
cat('proc princomp data=MNAR&i out=work.prin&i N=7;\n', file=smcarmiss, append=T)
cat('var q1 q2 q3 q4 q5 q6 q7 q8;\n', file=smcarmiss, append=T)
cat('run;\n', file=smcarmiss, append=T)
cat('%end;\n', file=smcarmiss, append=T)
cat('%mend;\n', file=smcarmiss, append=T)
cat('%PCA \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)


Don’t Let Your Code Look Like This
#------------------------------------------------------------#
#-----------------Create datalist file-----------------------#
#------------------------------------------------------------#
cat('%let folder = ', file=smcarmiss, append=T)
cat(paste(pathSASdata, sep="\"), file=smcarmiss, append=T)
cat('\\;\n', file=smcarmiss, append=T)
cat('%macro importlist;\n', file=smcarmiss, append=T)
cat('proc iml;\n', file=smcarmiss, append=T)
cat('file ', file=smcarmiss, append=T)
cat('"', file=smcarmiss, append=T)
cat('&folder.datalist.dat', file=smcarmiss, append=T)
cat('";\n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat(';\n', file=smcarmiss, append=T)
cat('put("data&i..dat");\n', file=smcarmiss, append=T)
cat('%end;\n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('closefile ', file=smcarmiss, append=T)
cat('"', file=smcarmiss, append=T)
cat('&folder.datalist.dat', file=smcarmiss, append=T)
cat('";\n', file=smcarmiss, append=T)
cat('%mend;\n', file=smcarmiss, append=T)
cat('%importlist \n', file=smcarmiss, append=T)
cat('quit;\n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
#------------------------------------------------------------#
#-------------Convert . to 999 for MPLUS---------------------#
#------------------------------------------------------------#
cat('%macro codemissing;\n', file=smcarmiss, append=T)
cat('%do j=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat(';\n', file=smcarmiss, append=T)
cat('/* recode missing number to 999 */ \n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('data prin&j; set prin&j;\n', file=smcarmiss, append=T)
cat('array vars{*} _numeric_;\n', file=smcarmiss, append=T)
cat('do i = 1 to dim(vars);\n', file=smcarmiss, append=T)
cat('if vars{i} = . then vars{i} = 999;\n', file=smcarmiss, append=T)
cat('end;\n', file=smcarmiss, append=T)
cat('drop i;\n', file=smcarmiss, append=T)
cat('run;\n', file=smcarmiss, append=T)
cat('%end;\n', file=smcarmiss, append=T)
cat('%mend;\n', file=smcarmiss, append=T)
cat('%codemissing \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
#------------------------------------------------------------#
#---------Export data with Principal Components--------------#
#------------------------------------------------------------#
cat('%macro exportpca;\n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat(paste(iter), file=smcarmiss, append=T)
cat(';\n', file=smcarmiss, append=T)
cat('PROC EXPORT DATA=work.prin&i OUTFILE = ', file=smcarmiss, append=T)
cat('"', file=smcarmiss, append=T)
cat(paste(pathSASdata, "data&i..dat", sep="\"), file=smcarmiss, append=T)
cat('" \n', file=smcarmiss, append=T)
cat('DBMS= dlm replace;\n', file=smcarmiss, append=T)
cat('putnames=no;\n', file=smcarmiss, append=T)
cat('RUN;\n', file=smcarmiss, append=T)
cat('%end;\n', file=smcarmiss, append=T)
cat('%mend;\n', file=smcarmiss, append=T)
cat('%exportpca \n', file=smcarmiss, append=T)
}


Don’t Let Your Code Look Like This
#--------------------------------------------------------------#
#-----------------nonlinear MAR MECHANISM----------------------#
#--------------------------------------------------------------#
else if (setpattern == 4) {
cat('\n', file=smcarmiss, append=T)
cat('%macro nonlinearMAR;\n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat(';\n', file=smcarmiss, append=T)
cat('/*MAR*/ \n', file=smcarmiss, append=T)
cat('proc univariate ', file=smcarmiss, append=T)
cat('Data=interdata&i;\n', file=smcarmiss, append=T)
cat('var INT_q1_q2;\n', file=smcarmiss, append=T)
cat('output out=deciles&i pctlpts=10 20 30 40 50 60 70 80 90 pctlpre=pct;\n', file=smcarmiss, append=T)
cat('run;\n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('\n', file=smcarmiss, append=T)
cat('/* write the cutpoints to macro variables */ \n', file=smcarmiss, append=T)
cat('data _null_;\n', file=smcarmiss, append=T)
cat('set deciles&i;\n', file=smcarmiss, append=T)
cat('call symput("qu1", pct10);\n', file=smcarmiss, append=T)
cat('call symput("qu2", pct20);\n', file=smcarmiss, append=T)
cat('call symput("qu3", pct30);\n', file=smcarmiss, append=T)
cat('call symput("qu4", pct40);\n', file=smcarmiss, append=T)
cat('call symput("qu5", pct50);\n', file=smcarmiss, append=T)
cat('call symput("qu6", pct60);\n', file=smcarmiss, append=T)
cat('call symput("qu7", pct70);\n', file=smcarmiss, append=T)
cat('call symput("qu8", pct80);\n', file=smcarmiss, append=T)


Don’t Let Your Code Look Like This ...
cat('call symput("qu9", pct90);\n', file=smcarmiss, append=T)
cat('run;\n', file=smcarmiss, append=T)
cat('Data MARnon&i; set interdata&i;\n', file=smcarmiss, append=T)
cat('array vars(10) x y q1 q2 q3 q4 q5 q6 q7 q8;\n', file=smcarmiss, append=T)
cat('if &qu', 0+((miss.10*1+miss.20*1+miss.30*1+miss.40*1+miss.50*1+miss.60*1)), ' >= INT_q1_q2 then do; vars(2)=.; end;', file=smcarmiss, append=T, sep="")
cat('RUN;', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)
cat('%end;\n', file=smcarmiss, append=T)
cat('%mend;\n', file=smcarmiss, append=T)
cat('%nonlinearMAR \n', file=smcarmiss, append=T)
##################################################
############ generate PCAs interaction ###########
##################################################
cat('\n', file=smcarmiss, append=T)
cat('%macro PCA;\n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat(';\n', file=smcarmiss, append=T)
cat('proc princomp data=MARnon&i out=prin&i N=7;\n', file=smcarmiss, append=T)
cat('var q1 q2 q3 q4 q5 q6 q7 q8 INT_q1_q1 INT_q1_q2 INT_q2_q2;\n', file=smcarmiss, append=T)
cat('run;\n', file=smcarmiss, append=T)
cat('%end;\n', file=smcarmiss, append=T)
cat('%mend;\n', file=smcarmiss, append=T)
cat('%PCA \n', file=smcarmiss, append=T)
#------------------------------------------------------------#
#-----------------Create datalist file-----------------------#
#------------------------------------------------------------#
cat('%let folder = ', file=smcarmiss, append=T)
cat(paste(pathSASdata, sep="\"), file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('\\;\n', file=smcarmiss, append=T)
cat('%macro importlist;\n', file=smcarmiss, append=T)
cat('proc iml;\n', file=smcarmiss, append=T)
cat('file ', file=smcarmiss, append=T)
cat('"', file=smcarmiss, append=T)
cat('&folder.datalist.dat', file=smcarmiss, append=T)
cat('";\n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat(';\n', file=smcarmiss, append=T)
cat('put("data&i..dat");\n', file=smcarmiss, append=T)
cat('%end;\n', file=smcarmiss, append=T)
cat('closefile ', file=smcarmiss, append=T)
cat('"', file=smcarmiss, append=T)
cat('&folder.datalist.dat', file=smcarmiss, append=T)
cat('";\n', file=smcarmiss, append=T)
cat('%mend;\n', file=smcarmiss, append=T)
cat('%importlist \n', file=smcarmiss, append=T)
cat('quit;\n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)


Don’t Let Your Code Look Like This ...
#------------------------------------------------------------#
#-------------Convert . to 999 for MPLUS---------------------#
#------------------------------------------------------------#
cat('%macro codemissing;\n', file=smcarmiss, append=T)
cat('%do j=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat(';\n', file=smcarmiss, append=T)
cat('/* recode missing number to 999 */ \n', file=smcarmiss, append=T)
cat('data prin&j; set prin&j;\n', file=smcarmiss, append=T)
cat('array vars{*} _numeric_;\n', file=smcarmiss, append=T)
cat('do i = 1 to dim(vars);\n', file=smcarmiss, append=T)
cat('if vars{i} = . then vars{i} = 999;\n', file=smcarmiss, append=T)
cat('end;\n', file=smcarmiss, append=T)
cat('drop i;\n', file=smcarmiss, append=T)
cat('run;\n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...

cat('%end;\n', file=smcarmiss, append=T)
cat('%mend;\n', file=smcarmiss, append=T)
cat('%codemissing \n', file=smcarmiss, append=T)
cat('\n', file=smcarmiss, append=T)


Don’t Let Your Code Look Like This
#------------------------------------------------------------#
#---------Export data with Principal Components--------------#
#------------------------------------------------------------#
cat('%macro exportpca;\n', file=smcarmiss, append=T)
cat('%do i=1 %to ', file=smcarmiss, append=T)
cat(paste(iter), file=smcarmiss, append=T)
cat(';\n', file=smcarmiss, append=T)
cat('PROC EXPORT DATA=work.prin&i OUTFILE = ', file=smcarmiss, append=T)
cat('"', file=smcarmiss, append=T)
cat(paste(pathSASdata, "data&i..dat", sep="\"), file=smcarmiss, append=T)
cat('" \n', file=smcarmiss, append=T)
cat('DBMS= dlm replace;\n', file=smcarmiss, append=T)
cat('putnames=no;\n', file=smcarmiss, append=T)
cat('RUN;\n', file=smcarmiss, append=T)

Don’t Let Your Code Look Like This ...
cat('%end;\n', file=smcarmiss, append=T)
cat('%mend;\n', file=smcarmiss, append=T)
cat('%exportpca \n', file=smcarmiss, append=T)
}
##########################################
# RUN SAS code in all files #
##########################################
# shell(paste("sas.exe", smcarmiss))
SASexe <- paste('"C:\\Program Files\SASHome\SASFoundation\9.3\sas.exe"')
shell(paste(SASexe, smcarmiss))
}
#--------------------------------------------------------------#
##########################################
# WRITE analysis CODE #
##########################################

Don’t Let Your Code Look Like This ...
#--------------------------------------------------------------#
#------code writes MPLUS Monte Carlo code to "Path" directory----------#
#--------------------------------------------------------------#
#-------------data drawn from "PERCENTMISS" folder---------------------#
#--------------------------------------------------------------#
pathsample <- paste(dirroot, perclust, nclust, setcorr, setpattern, percentmiss, sep="\")
ana <- paste(path, "analyze.inp", sep="\")
cat('DATA: FILE = ', file=ana, append=T)
cat(paste(pathsample, "datalist.dat;", sep="\"), file=ana, append=T)
cat('\n', file=ana, append=T)
cat('TYPE = MONTECARLO;\n', file=ana, append=T)

Don’t Let Your Code Look Like This ...
cat('VARIABLE: \n', file=ana, append=T)
cat('NAMES = x y q1-q8 inter1 inter2 inter3 prin1-prin7;\n', file=ana, append=T)
cat('USEVARIABLES = x y;\n', file=ana, append=T)
cat('MISSING = ALL (999);\n', file=ana, append=T)
##################################################
######## CODE IN AUXILIARY VARIABLES #############
##################################################
if (pca == 1) {
cat('auxiliary = (m) prin1-prin', 0+(aux.1*(1+aux.2*1+aux.3*1+aux.4*+aux.4*1+aux.5*1+aux.6*1+aux.7*1)), ';\n', file=ana, append=T, sep="")
}
if (var == 1) {
cat('auxiliary = (m) q1-q', 0+(aux.1*(1+aux.2*1+aux.3*1+aux.4*+aux.4*1+aux.5*1+aux.6*1+aux.7*1)), ';\n', file=ana, append=T, sep="")
}
#------------------------------------------------#
cat('savedata: \n', file=ana, append=T)
cat('results are ', file=ana, append=T)
cat(paste(path, "results1.txt", sep="\"), file=ana, append=T)

Don’t Let Your Code Look Like This ...

cat(';\n', file=ana, append=T)
cat('MODEL: \n', file=ana, append=T)
cat('[x*0 y*0];\n', file=ana, append=T)
cat('x*1; y*1;\n', file=ana, append=T)
cat('x with y*.50;\n', file=ana, append=T)
cat('OUTPUT: \n', file=ana, append=T)
##########################################
# RUN ANALYSIS CODE #
##########################################
shell(paste("mplus.exe", ana, paste(path, "saveout.out", sep="\"), sep=" "))
#--------------------------------------------------------------#


Don’t Let Your Code Look Like This
##########################################
# Zip original data FILES #
##########################################
# setwd(path)
# exe <- paste('"C:\Program\ Files\7-Zip\7z.exe"')
# shell(paste(exe, " a ", path, "\data.zip ", path, "\*.dat", sep=""))
# shell(paste("del ", path, "\*.dat", sep=''))
} # number of auxiliary variables
} # aux method
} # percent missing
} # missing pattern
} # common correlation
} # number of clusters
} # n per cluster sample size
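
The closing braces above end seven nested loops, one per design condition. One common way to flatten that kind of nesting is to build a table of conditions with expand.grid() and process it row by row. The sketch below is only an illustration: the factor levels are made up, and runCondition() is a hypothetical placeholder that just builds a directory name.

```r
# Hypothetical design factors; the levels are placeholders, not the study's values.
conditions <- expand.grid(
    perclust    = c(5, 10),
    nclust      = c(30, 50),
    setcorr     = c(0.3, 0.5),
    setpattern  = 1:4,
    percentmiss = c(10, 20),
    stringsAsFactors = FALSE
)

# Placeholder for the real work: build the output directory for one condition.
runCondition <- function(cond) {
    parts <- as.character(unlist(cond, use.names = FALSE))
    do.call(file.path, as.list(c("sim", parts)))
}

# One call per row of the table replaces the nested for loops.
paths <- vapply(seq_len(nrow(conditions)),
                function(i) runCondition(conditions[i, ]),
                character(1))
```

Each row of conditions is one cell of the design, so the loop body becomes a function of a single argument and the indentation never goes deeper than one level.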


There are many mistakes in there

Weird indentation.
Use "/", not the backslash, in file paths, even on Windows.
Use vectors.
Prolific copying and pasting of "cat" lines.
Avoid system() calls except where truly necessary. R has OS-neutral functions to create directories and such.
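
A minimal sketch of how those points might be applied to one piece of the code above, the PCA macro. The function name writePCAmacro(), its arguments, and the output file name are hypothetical, chosen only for illustration. The lines are collected in a character vector and written with a single writeLines() call, and the path is built with file.path() and dir.create() rather than pasted backslashes.

```r
# Hypothetical helper: build the SAS PCA macro as a character vector and
# write it in one call, instead of one cat(..., append=T) per fragment.
writePCAmacro <- function(file, iter, dataPrefix = "MAR") {
    lines <- c(
        "%macro PCA;",
        sprintf("%%do i=1 %%to %d;", iter),
        sprintf("proc princomp data=%s&i out=work.prin&i N=7;", dataPrefix),
        paste0("var ", paste0("q", 1:8, collapse = " "), ";"),
        "run;",
        "%end;",
        "%mend;",
        "%PCA"
    )
    writeLines(lines, con = file)
    invisible(lines)
}

# OS-neutral path handling; the directory name is an assumption.
outdir <- file.path(tempdir(), "sasjobs")
dir.create(outdir, recursive = TRUE, showWarnings = FALSE)
sasfile <- file.path(outdir, "pca.sas")
writePCAmacro(sasfile, iter = 100, dataPrefix = "MNAR")
```

Because the data set prefix is an argument, the same function could serve the MAR, MNAR, and nonlinear MAR branches, which is the point of replacing the pasted copies.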


Unfortunately, Students Learn Mistakes From Each Other

Here’s a danger I perceive: one student writes a program that “runs” and passes it on to the others. My suspicion: students like complicated, unreadable code because they think nobody will ever check their work.


Why Did That Code Look Like That?
I found the prototype in an earlier dissertation project
gen <- paste(path, "generate.inp", sep="\\")
cat('MONTECARLO:\n', file=gen)
cat('NAMES ARE y1-y6;\n', file=gen, append=T)
cat('NOBSERVATIONS = ', perclust*nclust, ';\n', file=gen, append=T)
if (perclust != 7.5) {
cat('NCSIZES = 1;\n', file=gen, append=T)
cat('CSIZES = ', nclust, ' (', perclust, ');\n', file=gen, append=T)
}
if (perclust == 7.5) {
cat('NCSIZES = 2;\n', file=gen, append=T)
cat('CSIZES = ', nclust/2, ' (7) ', nclust/2, ' (8);\n', file=gen, append=T)
}
# user-specified iterations
cat('NREPS = ', iter, ';\n', file=gen, append=T)

cat('SEED = 791305;\n', file=gen, append=T)
cat('REPSAVE=ALL;\n', file=gen, append=T)
cat('SAVE=\n', path, '\\data*.dat;\n', file=gen, append=T, sep="")
cat('ANALYSIS: TYPE = TWOLEVEL;\n', file=gen, append=T)
cat('MODEL POPULATION:\n', file=gen, append=T)
cat('%WITHIN%\n', file=gen, append=T)
# baseline loadings are all .3, add .4 if 'within' = 1, but change based on mod/strong
cat('FW BY y1-y2*', .3+(within*(.4+strong*.1)), '\n', file=gen, append=T, sep="")
cat('y3-y4*', .3+(within*(.4-mod*.3)), '\n', file=gen, append=T, sep="")
cat('y5-y6*', .3+(within*(.4-strong*.1)), ';\n', file=gen, append=T, sep="")
cat('FW@1;\n', file=gen, append=T)
# residuals are just 1 - loading^2
cat('y1-y2*', 1-(.3+(within*(.4+strong*.1)))^2, ';\n', file=gen, append=T, sep="")

cat('y3-y4*', 1-(.3+(within*(.4-mod*.3)))^2, ';\n', file=gen, append=T, sep="")
cat('y5-y6*', 1-(.3+(within*(.4-strong*.1)))^2, ';\n', file=gen, append=T, sep="")
cat('\n', file=gen, append=T, sep="")
cat('%BETWEEN%\n', file=gen, append=T)
# the ICC bit multiplies by .053 when ICC is low, by 1 when ICC is high
cat('FB BY y1-y2*', sqrt((.3+(between*(.4+strong*.1)))^2*(1+(icc*(.053-1)))), '\n', file=gen, append=T, sep="")
cat('y3-y4*', sqrt((.3+(between*(.4-mod*.3)))^2*(1+(icc*(.053-1)))), '\n', file=gen, append=T, sep="")
cat('y5-y6*', sqrt((.3+(between*(.4-strong*.1)))^2*(1+(icc*(.053-1)))), ';\n', file=gen, append=T, sep="")
cat('FB@1;\n', file=gen, append=T)
# residuals are total variance (1 or .053, depending on ICC) - loading^2
cat('y1-y2*', 1+(icc*(.053-1))-sqrt((.3+(between*(.4+strong*.1)))^2*(1+(icc*(.053-1))))^2, ';\n', file=gen, append=T, sep="")


cat('y3-y4*', 1+(icc*(.053-1))-sqrt((.3+(between*(.4-mod*.3)))^2*(1+(icc*(.053-1))))^2, ';\n', file=gen, append=T, sep="")
cat('y5-y6*', 1+(icc*(.053-1))-sqrt((.3+(between*(.4-strong*.1)))^2*(1+(icc*(.053-1))))^2, ';\n', file=gen, append=T, sep="")
cat('\n', file=gen, append=T, sep="")
# run the above syntax using Mplus
# shell(paste("cd ", mplus, sep=" "))
shell(paste("mplus.exe", gen, paste(path, "save.out", sep="\\"), sep=" "))


Here was my Counter-Suggestion
Below you see a cleaner R approach, just 3 cat commands, but it still has some messy stuff in it.
I isolate the work of creating a file into one function, createInpFile.
It appears that copying code into this file may have damaged line endings (still checking).
## Create one MPlus Input file corresponding to following parameters.
createInpFile <- function(path="apath", gen="afilename.inp", perclust=2,
                          nclust=100, iter=1000, mod=1, strong=1,
                          between=1, within=1){
    cat("MONTECARLO:
NAMES ARE y1-y6;
NOBSERVATIONS = ", perclust*nclust, ";\n",
        ifelse(perclust != 7.5,
               paste("NCSIZES = 1;\n CSIZES =", nclust, "(", perclust, ");\n"),
               paste("NCSIZES = 2;\n CSIZES = ", nclust/2, " (7) ", nclust/2, " (8);\n")),
        file=gen, append=T, sep="")
    ## user-specified iterations
    cat("NREPS = ", iter, ";

SEED = 791305;
REPSAVE=ALL;
SAVE=", path, "\\data*.dat;
ANALYSIS: TYPE = TWOLEVEL;
MODEL POPULATION:
%WITHIN%\n", file=gen, append=T, sep="")
    ## baseline loadings are all .3, add .4 if "within" = 1, but change based on mod/strong
    cat("FW BY y1-y2*", .3+(within*(.4+strong*.1)),
        "y3-y4*", .3+(within*(.4-mod*.3)),
        "y5-y6*", .3+(within*(.4-strong*.1)),
        "FW@1;\n",
        "y1-y2*", 1-(.3+(within*(.4+strong*.1)))^2, ";\n",
        "y3-y4*", 1-(.3+(within*(.4-mod*.3)))^2, ";\n",
        "y5-y6*", 1-(.3+(within*(.4-strong*.1)))^2, ";\n",
        "%BETWEEN%",
        "FB BY y1-y2*", sqrt((.3+(between*(.4+strong*.1)))^2*(1+(icc*(.053-1)))),
        "y3-y4*", sqrt((.3+(between*(.4-mod*.3)))^2*(1+(icc*(.053-1)))),
        "y5-y6*", sqrt((.3+(between*(.4-strong*.1)))^2*(1+(icc*(.053-1)))),
        "; FB@1;\n",


        "y1-y2*", 1+(icc*(.053-1))-sqrt((.3+(between*(.4+strong*.1)))^2*(1+(icc*(.053-1))))^2, ";",
        "y3-y4*", 1+(icc*(.053-1))-sqrt((.3+(between*(.4-mod*.3)))^2*(1+(icc*(.053-1))))^2, ";",
        "y5-y6*", 1+(icc*(.053-1))-sqrt((.3+(between*(.4-strong*.1)))^2*(1+(icc*(.053-1))))^2, ";",
        "\n", file=gen, append=T, sep="")
}


Critique that!

Benefit of re-write is isolation of code writing into a separate function
We need to work on cleaning up use of “cat” to write files.
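One possible direction for that cleanup, offered only as a sketch: build the lines of the input file as a character vector and write them in a single step. The helper name loadingLines, the two-decimal format, and the reduced set of lines are assumptions made for illustration; this is not the author's eventual solution.

```r
## Hypothetical helper: compute the three "within" loadings as a vector
## and format one line per item block, instead of three pasted cat() calls
loadingLines <- function(within = 1, strong = 1, mod = 1) {
    items <- c("y1-y2", "y3-y4", "y5-y6")
    loads <- .3 + within * c(.4 + strong * .1, .4 - mod * .3, .4 - strong * .1)
    sprintf("%s*%.2f", items, loads)
}

perclust <- 2L
nclust <- 100L

## Every element of the vector becomes one line of the file
inpLines <- c("MONTECARLO:",
              "NAMES ARE y1-y6;",
              sprintf("NOBSERVATIONS = %d;", perclust * nclust),
              loadingLines())

## Written under tempdir() so the sketch is harmless to run
gen <- file.path(tempdir(), "generate.inp")
writeLines(inpLines, con = gen)
```

With this layout there are no embedded "\n" characters and no append=T arguments to keep track of, and the numeric work is separated from the file writing.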


The Unofficial Official R Style Guide

This is discussed in the rockchalk vignette Rchaeology
There is not much “official style guidance” from the R Core Team
Don’t mistake that as permission to write however you want.
There ARE very widely accepted standards for the way that code should look


Inductive Style Guide
To see that R really does have an implicitly stated Style, inspect the R source code. The code follows a uniform pattern of indentation, the use of white space, and so forth. For a quick demonstration, run the command “lm” inside R:
lm
function (formula, data, subset, weights, na.action, method = "qr",
    model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
    contrasts = NULL, offset, ...)
{
    ret.x <- x
    ret.y <- y
    cl <- match.call()
    mf <- match.call(expand.dots = FALSE)
    m <- match(c("formula", "data", "subset", "weights", "na.action",
        "offset"), names(mf), 0L)
    mf <- mf[c(1L, m)]
    mf$drop.unused.levels <- TRUE

    mf[[1L]] <- as.name("model.frame")
    mf <- eval(mf, parent.frame())
    if (method == "model.frame")
        return(mf)
    else if (method != "qr")
        warning(gettextf("method = '%s' is not supported. Using 'qr'",
            method), domain = NA)
    mt <- attr(mf, "terms")
    y <- model.response(mf, "numeric")
    w <- as.vector(model.weights(mf))
    if (!is.null(w) && !is.numeric(w))
        stop("'weights' must be a numeric vector")
    offset <- as.vector(model.offset(mf))
    if (!is.null(offset)) {
        if (length(offset) != NROW(y))
            stop(gettextf("number of offsets is %d, should equal %d (number of observations)",
                length(offset), NROW(y)), domain = NA)
    }
    if (is.empty.model(mt)) {
        x <- NULL
        z <- list(coefficients = if (is.matrix(y)) matrix(, 0,
            3) else numeric(), residuals = y, fitted.values = 0 *
            y, weights = w, rank = 0L, df.residual = if (!is.null(w)) sum(w !=

            0) else if (is.matrix(y)) nrow(y) else length(y))
        if (!is.null(offset)) {
            z$fitted.values <- offset
            z$residuals <- y - offset
        }
    }
    else {
        x <- model.matrix(mt, mf, contrasts)
        z <- if (is.null(w))
            lm.fit(x, y, offset = offset, singular.ok = singular.ok,
                ...)
        else lm.wfit(x, y, w, offset = offset, singular.ok = singular.ok,
            ...)
    }
    class(z) <- c(if (is.matrix(y)) "mlm", "lm")
    z$na.action <- attr(mf, "na.action")
    z$offset <- offset
    z$contrasts <- attr(x, "contrasts")
    z$xlevels <- .getXlevels(mt, mf)
    z$call <- cl
    z$terms <- mt
    if (model)
        z$model <- mf
    if (ret.x)
        z$x <- x


    if (ret.y)
        z$y <- y
    if (!qr)
        z$qr <- NULL
    z
}
<bytecode: 0x1948b88>
<environment: namespace:stats>


Why Neatness Counts

You can write messy code, but you can’t make anybody read it.
Finding bugs in a giant undifferentiated mass of commands is difficult
Chance of error rises as clarity decreases
If you want people to help you, or use your code, you should write neatly!


Style Fundamentals
WHITE SPACE:
    indentation. R Core Team recommends 4 spaces
    one space around operators like <- = *
“<-” should be used for assignments. “=” was used by mistake so often by novices that the R interpreter was re-written to allow =. However, it may still fail in some cases.
Use helpful variable names
Separate calculations into functions. Sage advice from one of my programming mentors: Don’t allow a calculation to grow longer than the space on one screen. Break it down into smaller, well defined pieces.
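One case where “=” and “<-” are not interchangeable is inside a function call. The sketch below is an illustration only; the variable name valsA is made up, and there may be other cases the author had in mind.

```r
## "<-" inside a call assigns first, then passes the value along
m1 <- mean(valsA <- c(2, 4, 6))

## "=" inside a call only names the argument x of mean();
## this line does not create a variable in the workspace
m2 <- mean(x = c(2, 4, 6))

## valsA now exists because "<-" was used
exists("valsA")

## system.time() has no argument called z, so "=" is read as a
## (mis)named argument and the call fails; "<-" would have worked
res <- tryCatch(system.time(z = sum(1:10)),
                error = function(e) "error")
```

Both m1 and m2 hold the same mean; the difference is only in what each call leaves behind in the workspace.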


Be Careful about line endings
Unlike C (or other languages), R does not require an “end of line character” like “;”. That’s convenient, but sometimes code can “fool” R into believing that a command is finished.
From the help page for “if”:
    Note that it is a common mistake to forget to put braces (‘{ .. }’) around your statements, e.g., after ‘if(..)’ or ‘for(....)’. In particular, you should not have a newline between ‘}’ and ‘else’ to avoid a syntax error in entering a ‘if ... else’ construct at the keyboard or via ‘source’. For that reason, one (somewhat extreme) attitude of defensive programming is to always use braces, e.g., for ‘if’ clauses.
Thus, it is HIGHLY ADVISABLE to put the left squiggly bracket on the same line as the preceding material when a command is not finished.

With else statements, I have fallen back on the defensive approach of writing }else{ to close the previous if and begin else on the same line.
“if-else” is very troublesome. If R thinks the “if” is finished, it may not notice the else.
if (x > 7) y <- 3
else y <- 2
Causes: Error: unexpected 'else' in "else"
Do THIS:
if (whatever > 0) {
    do.something()
}else{
    do.other.thing()
}

NOT THIS:
if (whatever > 0)
{
    do.something()
}
else
{
    do.other.thing()
}
The “squiggle on the end of a line” style is called K&R style, after Kernighan and Ritchie, authors of The C Programming Language (Ritchie created C).
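The line ending problem can be checked without typing at the console by handing code to parse() as text. In this sketch the helper name parses is made up; it simply reports whether R accepts a piece of code at top level.

```r
## Hypothetical helper: TRUE if R can parse the text, FALSE on a syntax error
parses <- function(txt) {
    tryCatch({
        parse(text = txt)
        TRUE
    }, error = function(e) FALSE)
}

## "else" starts a new line after a finished "if": a syntax error at top level
bad <- "if (x > 7) y <- 3\nelse y <- 2"

## closing brace and "else" share a line: accepted
good <- "if (x > 7) {\n    y <- 3\n} else {\n    y <- 2\n}"

## the same "bad" lines wrapped in braces (as in a function body) are
## accepted, because R keeps reading until the closing brace
wrapped <- paste0("{\n", bad, "\n}")

parses(bad)
parses(good)
parses(wrapped)
```

The third case is why code that works inside a function can fail when the same lines are pasted at the prompt.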


Google Doc on R Coding is Just “somebody’s” Opinion
The Easily Googled Google R Standards


How To Name Functions

Don’t use names for functions that are already in widespread use, like lm, seq, rep, etc.
I like Objective C style variable and function names that smash words together, as in myRegression, myCode
R uses periods in function names to represent “object orientation” or “subclassing”, thus I avoid periods for simple punctuation. Ex: doSomething() is better than do.something()
Underscores are now allowed in function names. Ex: do_something() would be OK
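The reason to avoid periods can be seen in a small experiment. The class name “stuff” and the function below are invented for this sketch; the point is only that a name of the form generic.class can be picked up by R’s method dispatch even when the author meant it as an ordinary helper.

```r
## Intended as a plain helper, but the name looks like
## "the summary method for class stuff"
summary.stuff <- function(object, ...) "helper called by accident"

## Some object that happens to carry the class label "stuff"
x <- structure(list(a = 1), class = "stuff")

## The generic summary() finds summary.stuff and calls it
res <- summary(x)
res
```

A name such as summaryStuff or summary_stuff would not be mistaken for a method.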


How To Name Variables

Use clear names always
Use short names where possible
Never use names of common functions for variables
Never name a variable T or F (doing so tramples R symbols)
“Name by suffix” strategy I’m using now:
m1 <- lm(y ~ x, data=dat)
m1sum <- summary(m1)
m1vif <- vif(m1)
m1inf <- influence.measures(m1)


Expect Some Variations in My Code
I don’t mind adding squiggly braces, even if not required
if (whatever == TRUE) {x <- y}
Sometimes I will use 3 lines where one would do
if (whatever == TRUE){
    x <- y
}
When in doubt, I like to explicitly name options to functions because I believe it makes code more clear and reduces error.
I don’t often remember to write TRUE and FALSE when T and F will suffice
But I’m trying to remember to write TRUE or FALSE, and here’s why. If some user mistakenly redefines T or F:
T <- 7
F <- myTerrificFunction
then my functions will not be fooled if I avoid “T” and “F”.
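A minimal sketch of the danger, with made-up variable names: T is an ordinary variable that can be overwritten, while TRUE is a reserved word that cannot.

```r
## A user mistakenly redefines T
T <- 0

## Code that relied on the abbreviation is now fooled ...
flagShort <- if (T) "on" else "off"

## ... while code that spelled out TRUE is not
flagLong <- if (TRUE) "on" else "off"

## TRUE itself is protected: trying to assign to it is an error
attempt <- tryCatch(eval(parse(text = "TRUE <- 0")),
                    error = function(e) "not allowed")

## Clean up so T means TRUE again
rm(T)
```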


Object Oriented Programming

A re-conceptualization of programming to reduce programmer error
OO includes a broad set of ideas, only a few of which are directly applicable to R programming
The “rise” to pre-eminence of OO is indicated by the
    introduction of object frameworks in existing languages (C++, Objective-C)
    growth of wholly new object-oriented languages (Java)


Decipher R OO by Intuition
Run the command
methods(print)
What do you see? I see 170 lines, like so:
  [1] print.acf*
  [2] print.anova
  [3] print.aov*
  [4] print.aovlist*
...
[167] print.warnings
[168] print.xgettext*
[169] print.xngettext*
[170] print.xtabs*

Non-visible functions are asterisked


170 print.??? Methods.

Yes: there are really 170 print “methods”
No: the R team does not expect or want you to know all of them. Users just run print(x)
Try not to worry about “how” the printing is achieved.
Yes: R team wants package writers to create specialized print methods to control presentation of their output for the objects they create.


The R Runtime System Handles the Details

The user runs print(x)
The R runtime system
    1. notices that x is of a certain type, say “classOfX”
    2. and then the runtime system uses print.classOfX to handle the user’s request
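Those two steps can be watched in action with a toy class. The class name classOfX comes from the slide; the contents of the object and the wording of the printed message are invented for this sketch.

```r
## An object whose class attribute is "classOfX"
x <- structure(list(value = 42), class = "classOfX")

## The method: named generic.class, it does the actual printing
print.classOfX <- function(x, ...) {
    cat("A classOfX object holding", x$value, "\n")
    invisible(x)
}

## The user only ever calls the generic; R finds print.classOfX
print(x)
```

Typing x by itself at the prompt has the same effect, because autoprinting also goes through print().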


print is a “Generic Function”
Definition of “Generic Function”: the function that users call, which causes an object-specific function to be called for the work to get done.
Examples: “print”, “plot”, “summary”, “anova”
Generic Function is terminology unique to R (AFAIK)
In the standard case, a generic function does not do any work. It sends the work to the appropriate “implementation” in a method.
“A standard generic function does no computation other than dispatching a method, but R generic functions can do other computations as well before and/or after method dispatch” (Chambers, Software for Data Analysis, p. 398)
UseMethod() is the function that declares a function as generic: The R runtime system is warned to “be alert” to usage of the function.
Example: the print generic function from R source (base package).
print <- function(x, ...) UseMethod("print")


Example: the plot generic function from R source (graphics package).
plot <- function(x, y, ...) UseMethod("plot")
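Writing a new generic follows the same one-line pattern. In this sketch the generic name describe and the text its methods return are made up; only UseMethod() and the generic.class naming rule come from R itself.

```r
## The generic: does no work, only dispatches
describe <- function(x, ...) UseMethod("describe")

## Fallback used when no class-specific method exists
describe.default <- function(x, ...) {
    paste("an object of class", class(x)[1])
}

## Method for objects of class "lm"
describe.lm <- function(x, ...) {
    paste("a linear model with", length(coef(x)), "coefficients")
}

m1 <- lm(dist ~ speed, data = cars)   # cars ships with R
describe(m1)     # handled by describe.lm
describe(1:3)    # handled by describe.default
```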


Here’s Where R Gains its Analytical Power

The generic is just a place holder. User runs print(x), then R knows it is supposed to ask x for its class and then the appropriate thing is supposed to happen.
No Big Deal.
But the statisticians in the S & R projects saw enormous simplifying potential in developing a battery of standard generic accessor functions
    summary()
    anova()
    predict()
    plot()
    AIC()
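The payoff is that the same few function names work on very different fitted models. The sketch below uses the cars data that ships with R; the choice of a Poisson family for the second model is arbitrary and made only so that the example runs without fuss.

```r
m1 <- lm(dist ~ speed, data = cars)
m2 <- glm(dist ~ speed, data = cars, family = poisson)

newdat <- data.frame(speed = c(10, 20))

## Identical user code for both models; R picks the right method each time
for (m in list(m1, m2)) {
    print(class(summary(m)))            # "summary.lm" vs "summary.glm"
    print(AIC(m))
    print(predict(m, newdata = newdat))
}
```

The user never needs to know that summary.lm and summary.glm are separate functions.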


Object

Object: self-contained “thing”. A container for data.
Operationally, in R: just about anything on the left hand side in an assignment “<-”
Each “thing” in R carries with it enough information so that generic functions “know what to do.”
If there is no function specific to an object, the work is sent to a default method (see print.default).


Class

Definition: As far as R with S3 is concerned, class is a characteristic label assigned to an object (an object can have a vector of classes, such as c("glm", "lm")).
The class information is used by R to decide which method should be used to fulfill the request.
Run class(x), ask x what class it inherits from.
In R, the principal importance of the “class” of an object is that it is used to decide which function should be used to carry out the required work when a generic function is used.
Classes called “numeric”, “integer”, “character” are all vector classes

> y <- c(1, 10, 23)
> class(y)
[1] "numeric"
> x <- c("a", "b", "c")
> x
[1] "a" "b" "c"
> class(x)
[1] "character"
> x <- factor(x)
> class(x)
[1] "factor"
> m1 <- lm(y ~ x)
> class(m1)
[1] "lm"
> m2 <- glm(y ~ x, family=Gamma)
> class(m2)
[1] "glm" "lm"


Method, a.k.a. “Method Function”
Definition: The “implementation”: the function that does the work for an object of a particular type.
When the user runs print(m1), and m1 is from class “lm”, the work is sent to a method print.lm()
Methods are always named in the format “generic.class”, such as “print.default”, “print.lm”, etc.
Note: Most methods do not “double-check” whether the object they are given is from the proper class. They count on R’s runtime system to check and then call print.whatever for objects of type whatever
That’s why many methods are “hidden” (can only access via ::: notation)
Accessing methods directly
    If a method is “exported”, it can be called directly via “package::method.class()” format
    If a package is “attached” to the search path, then “method.class()” will suffice, but is not as clear
    If a method is NOT exported, then user can reach into the package and grab it by running “package:::method.class()”
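The three access routes can be tried with methods from the stats package. This sketch assumes, as is the case in the R versions I know, that predict.lm is exported from stats while print.lm is registered as a method but not exported.

```r
m1 <- lm(dist ~ speed, data = cars)

## Exported method: reachable with package::generic.class
p1 <- stats::predict.lm(m1, newdata = data.frame(speed = 10))

## Non-exported method: reach into the namespace with :::
printLm <- stats:::print.lm

## getS3method() finds a registered method without needing :::
printLm2 <- getS3method("print", "lm")

## Ordinary use still goes through the generic
p2 <- predict(m1, newdata = data.frame(speed = 10))
```

Calling methods directly is mostly useful for reading their source; in everyday code the generic is the safer entry point.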


Detour: attributes() Function and Confusing Output
The class is stored as an attribute in many object types. Run attributes()
> attributes(x)
$levels
[1] "a" "b" "c"

$class
[1] "factor"

> attributes(m1)
$names
 [1] "coefficients"  "residuals"     "effects"       "rank"
 [5] "fitted.values" "assign"        "qr"            "df.residual"
 [9] "contrasts"     "xlevels"       "call"          "terms"
[13] "model"

$class
[1] "lm"

> attributes(y)
NULL
> is.object(y)
[1] FALSE

puzzle: why has y no attribute? Why is it not an object?


Honestly, I’m baffled, I thought “everything in R is an object.” (And I still do.)
If the object does not have a class attribute, it has an implicit class, “matrix”, “array” or the result of “mode(x)” (except that integer vectors have the implicit class “integer”). (from ?class in R-2.15.1)
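The distinction between a stored class attribute and an implicit class can be inspected directly. This is a small sketch with made-up values; note that the exact implicit class reported for a matrix depends on the R version (newer versions report both “matrix” and “array”).

```r
y <- c(1, 10, 23)

attributes(y)    # NULL: nothing is stored on a plain numeric vector
is.object(y)     # FALSE: is.object() only reports a class attribute
class(y)         # "numeric": the implicit class, derived from mode(y)
class(1:3)       # "integer": the exception mentioned in ?class

## A matrix has a dim attribute but still no class attribute
mat <- matrix(1:4, nrow = 2)
names(attributes(mat))   # "dim"
class(mat)               # implicit class includes "matrix"

## A factor does carry a class attribute, so is.object() is TRUE
f <- factor(c("a", "b"))
is.object(f)
```

So is.object() answers a narrower question than the name suggests: it asks whether a class attribute has been set, not whether the thing is an R object.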


How Objects get “into” Classes
In older S3 terminology, user is allowed to simply claim that x is from one or more classes
class(x) <- c("lm", "glm", class(x))
That would say x’s class includes “lm” and “glm” as new classes, and also would keep x’s old classes as well.
The class is an attribute, can be set thusly
attr(x, "class") <- c("lm", "whateverISay")
When a generic method “run” is called with x, the R runtime will first try to use run.lm. If run.lm is not found, then run.whateverISay will be tried, and if that fails, it falls back to run.default.
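That search order can be verified with a toy generic. The generic run and the class whateverISay come from the slide; the method bodies, which just report their own names, are invented for this sketch.

```r
run <- function(x, ...) UseMethod("run")
run.default <- function(x, ...) "run.default"
run.whateverISay <- function(x, ...) "run.whateverISay"

x <- 1:3
attr(x, "class") <- c("lm", "whateverISay")

## No run.lm exists yet, so dispatch moves on to the second class
first <- run(x)

## Once run.lm is defined, it wins because "lm" is listed first
run.lm <- function(x, ...) "run.lm"
second <- run(x)

## With the class attribute removed, only the default is left
third <- run(unclass(x))
```

Note that nothing checks whether x really looks like an lm object; S3 takes the claim on trust.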


How Objects get “into” Classes: S4
S4 has more structure, makes classes & methods work more like truly object oriented programs.
S4 classes are defined with a list of variables BEFORE objects are created. Variables are typed!
Example imitates Matloff, p. 223
setClass("pjfriend",
         representation(name="character",
                        gender="factor",
                        food="factor",
                        age="integer"))
Create an instance of class pjfriend (Note: to declare an integer, add letter “L” to end of number).

william <- new("pjfriend", name = "william", gender = factor("male"),
               food=factor("pizza"), age=33L)
william
An object of class "pjfriend"
Slot "name":
[1] "william"

Slot "gender":
[1] male
Levels: male

Slot "food":
[1] pizza
Levels: pizza

Slot "age":
[1] 33

jane <- new("pjfriend", name="pumpkin", gender = factor("female"),
            food=factor("hamburger"), age=21L)
jane
An object of class "pjfriend"
Slot "name":
[1] "pumpkin"

Slot "gender":
[1] female
Levels: female

Slot "food":
[1] hamburger


Levels: hamburger

Slot "age":
[1] 21

jane and william are instances of class “pjfriend”
The variables inside an S4 object are called “slots” in R (“slot” would be called “instance variable” in most OO-languages)
values in slots can be retrieved with symbol @, not $
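Slot access and slot typing can be tried in a few lines. The class definition is repeated from the slides so that the sketch runs on its own; the object named bad and the name “bill” are made up to show what happens when a slot receives the wrong type.

```r
library(methods)

## Same class definition as on the slides
setClass("pjfriend",
         representation(name = "character", gender = "factor",
                        food = "factor", age = "integer"))

william <- new("pjfriend", name = "william", gender = factor("male"),
               food = factor("pizza"), age = 33L)

william@age               # slot access with @
slot(william, "name")     # the same idea in function form
slotNames("pjfriend")     # which slots the class declares

## Slots are typed: 33 without the L is a double, so new() refuses it
bad <- tryCatch(new("pjfriend", name = "bill", gender = factor("male"),
                    food = factor("pizza"), age = 33),
                error = function(e) "rejected")
bad
```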


Implement an S4 method
Step 1. Write a function that can receive an object of class “pjfriend” and do something with it.
Step 2. Use setMethod to tell the R system that the function implements the method that is called for. setMethod “wraps” a function.
setMethod("some-generic-function-name", "pjfriend",
          function(x) {
              # do something with x
          })
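A filled-in version of those two steps might look like the sketch below. The generic name greet and the text it returns are invented; setGeneric() is added because a brand new generic has to exist before setMethod() can attach a method to it. The class definition is repeated so that the block runs on its own.

```r
library(methods)

setClass("pjfriend",
         representation(name = "character", gender = "factor",
                        food = "factor", age = "integer"))
jane <- new("pjfriend", name = "pumpkin", gender = factor("female"),
            food = factor("hamburger"), age = 21L)

## Create a new (made-up) generic called greet
setGeneric("greet", function(x, ...) standardGeneric("greet"))

## Steps 1 and 2 together: the function, wrapped by setMethod
setMethod("greet", "pjfriend",
          function(x, ...) paste("Hello,", x@name))

greeting <- greet(jane)

## show() already exists as a generic, so only the method is needed;
## it controls what appears when the object's name is typed
setMethod("show", "pjfriend",
          function(object) {
              cat("<pjfriend: ", object@name, ", age ", object@age, ">\n",
                  sep = "")
          })
jane
```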


Difficult to Account For Changes between S3 and S4
I think it is difficult to explain some of the notational and terminological changes between S3 and S4.
If you type an S4 object’s name on the command line
> x
the R runtime looks for a “show” method for the class of x.
Why change from “print” to “show” (IDK)?
Why change the “accessor” symbol from $ to @?
Why call things accessed with @ “slots” rather than instance variables?


R has lots of ways to do things over and over

for loop: process by “i” or by “element”
apply: process rows and/or columns in a matrix
lapply: process each element in a list
sapply: attempts to simplify output from lapply
replicate: shorthand for sapply for simple simulations
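A one-line taste of each tool, as a sketch with made-up data; the seed and the sizes are arbitrary.

```r
## for loop: fill a pre-sized vector one element at a time
sq1 <- numeric(5)
for (i in 1:5) sq1[i] <- i^2

## lapply returns a list; sapply simplifies the same result to a vector
sqList <- lapply(1:5, function(i) i^2)
sq2 <- sapply(1:5, function(i) i^2)

## apply works over the rows (1) or columns (2) of a matrix
m <- matrix(1:6, nrow = 2)
rowTotals <- apply(m, 1, sum)

## replicate repeats an expression, handy for small simulations
set.seed(1234)
sims <- replicate(100, mean(rnorm(20)))
```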


for looping
Loop over elements in a sequence:

    x1 <- vector()
    for (i in 1:57) {
        x1[i] <- doubleMe(i)
    }

This creates an empty vector, then sends the integers i from 1 to 57 to doubleMe.
Note: it is not necessary to actually do this for loop in R, because R is vectorized.

    x2 <- doubleMe(1:57)
    all.equal(x1, x2)
    [1] TRUE

Using vectorized code is much faster.
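A self-contained version of that comparison is sketched below. The slides define doubleMe earlier; the definition here is an assumed stand-in that simply multiplies its input by two.

```r
# Assumed stand-in for the doubleMe() defined earlier in the slides
doubleMe <- function(input) {
    2 * input
}

# Loop version: preallocate the result, then fill it element by element
x1 <- numeric(57)
for (i in seq_len(57)) {
    x1[i] <- doubleMe(i)
}

# Vectorized version: one call handles the whole vector
x2 <- doubleMe(1:57)

isTRUE(all.equal(x1, x2))
```

Preallocating with `numeric(57)` avoids growing the vector one element at a time, which is one of the costs of the loop as written on the slide.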


“apply()”

useRs are urged to avoid “for loops” when possible.
Why? Accessing particular values with “[” (vector or matrix indexes) is SLOW. Better to exploit R’s “vectorization”.
apply() is one of a family of functions that can replace a for loop.
apply() takes a matrix and does “the same FUN” to all of its rows or columns (or both).
Definition: MARGIN=1 means “work row by row”; MARGIN=2 means “column by column”.


Example of “apply()” With a Built-In FUN
Given a matrix xyz with columns “x”, “y”, and “z”: on the columns (MARGIN=2), apply the R “mean” function.

    xyz <- matrix(rnorm(9), ncol=3)
    xyz
               [,1]       [,2]       [,3]
    [1,]  0.5855288 -0.4534972  0.6300986
    [2,]  0.7094660  0.6058875 -0.2761841
    [3,] -0.1093033 -1.8179560 -0.2841597
    colnames(xyz) <- c("x", "y", "z")
    apply(xyz, MARGIN=2, FUN=mean)
              x           y           z
     0.39523051 -0.55518856  0.02325157

If there is no “built in” function that does what you want, then you have to write your own.

Write your own Function for apply
Suppose you want the second-highest score from each column. Write a little function called “second()”. Note that sort() is ascending by default, so decreasing=TRUE is needed to count down from the top.

    second <- function(acol=NULL){
        sort(acol, decreasing=TRUE)[2]
    }
    print(xyz)
                  x          y          z
    [1,]  0.5855288 -0.4534972  0.6300986
    [2,]  0.7094660  0.6058875 -0.2761841
    [3,] -0.1093033 -1.8179560 -0.2841597
    apply(xyz, MARGIN=2, FUN=second)
             x          y          z
     0.5855288 -0.4534972 -0.2761841
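With only three rows, the second-highest and second-lowest values in a column coincide, so a taller matrix shows the behavior more clearly. The sketch below uses made-up scores; the names `secondHighest` and `scores` are hypothetical.

```r
# Return the second-highest value in a vector (NA if fewer than two values)
secondHighest <- function(acol) {
    sort(acol, decreasing = TRUE)[2]
}

# Made-up data: matrix() fills column by column
scores <- matrix(c(10, 40, 30, 20,
                    5,  1,  9,  7),
                 ncol = 2, dimnames = list(NULL, c("a", "b")))

res <- apply(scores, MARGIN = 2, FUN = secondHighest)
res
```

Here column “a” holds 10, 40, 30, 20 and column “b” holds 5, 1, 9, 7, so the call picks out 30 and 7.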

Apply the normedEntropy function to rows
First, create a matrix in which the sum of each row is 1.0

    xmat <- matrix(rmultinom(6, size=20, prob=c(1,2,3,4,5)), byrow=T, ncol=5)
    xmat <- prop.table(xmat, 1)
    print(round(xmat, 3))
         [,1] [,2] [,3] [,4] [,5]
    [1,] 0.00 0.30 0.15 0.20 0.35
    [2,] 0.20 0.15 0.20 0.20 0.25
    [3,] 0.10 0.15 0.10 0.30 0.35
    [4,] 0.10 0.00 0.15 0.40 0.35
    [5,] 0.05 0.10 0.30 0.35 0.20
    [6,] 0.10 0.05 0.30 0.25 0.30


Entropy for each row!

Apply normedEntropy to each row with apply:

    apply(xmat, MARGIN=1, FUN=normedEntropy)
    [1] 0.8295351 0.9921503 0.9156704 0.7759110 0.8888583 0.9003158
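normedEntropy is defined earlier in the slides. As a rough check on the numbers, the sketch below assumes it is Shannon entropy divided by its maximum, log(k) for k categories, with 0 * log(0) counted as 0. That definition is an assumption made here, although it does reproduce the value printed for the fourth row.

```r
# Assumed reconstruction: Shannon entropy divided by its maximum, log(k)
normedEntropy <- function(p) {
    k <- length(p)
    pos <- p[p > 0]                  # convention: 0 * log(0) counts as 0
    -sum(pos * log(pos)) / log(k)
}

# Fourth row of the rounded xmat shown on the previous slide
xrow <- c(0.10, 0.00, 0.15, 0.40, 0.35)
ne4 <- normedEntropy(xrow)
ne4

# A uniform distribution has the maximum normed entropy
uniformNE <- normedEntropy(rep(0.2, 5))
uniformNE
```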


“lapply()” Do same thing to all Elements of a List :

lapply() will take a list of things and apply a given function to each item, returning a new list.
Generally,

    aNewList <- lapply(someList, FUN = someFunction)

someFunction MUST accept the elements from someList as the first argument.
Additional arguments to someFunction are allowed.


Example Use of lapply
Create a list with 5 sets of random normal variables

    sampleList <- lapply(rep(1000, 5), rnorm)
    sampleList[[1]][888]
    [1] -0.3101479

Same as

    sampleList <- list()  ## or <- vector("list", 5)
    sampleList[[1]] <- rnorm(1000)
    sampleList[[2]] <- rnorm(1000)
    sampleList[[3]] <- rnorm(1000)
    sampleList[[4]] <- rnorm(1000)
    sampleList[[5]] <- rnorm(1000)
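The “same as” claim can be checked directly by resetting the random seed before each approach. This is a small sketch; the seed value and the names `viaLapply` and `viaLoop` are arbitrary choices made here.

```r
# lapply version
set.seed(1234)
viaLapply <- lapply(rep(1000, 5), rnorm)

# Explicit loop version, starting from the same seed
set.seed(1234)
viaLoop <- vector("list", 5)
for (i in 1:5) {
    viaLoop[[i]] <- rnorm(1000)
}

identical(viaLapply, viaLoop)
```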


Example Use of lapply
Get the mean of sets 1 and 2 individually

    mean(sampleList[[1]])
    [1] 0.04081866
    mean(sampleList[[2]])
    [1] -0.02739241

Grab means of all sets with lapply

    (aNewList <- lapply(sampleList, mean))
    [[1]]
    [1] 0.04081866

    [[2]]
    [1] -0.02739241

    [[3]]
    [1] -0.0255273

Why lapply, Not apply?

Sometimes our “data” is not an even set of columns that fits in a data.frame or matrix

    xlist <- list(x1 = c(1, 1, 1, 2, 3, 3),
                  x2 = rpois(10, lambda=3),
                  x3 = round(rnorm(20, mean=100, sd=1), 0))
    elist <- lapply(xlist, function(x){
        y <- table(x) / length(x)
        normedEntropy(y)
    })


Why lapply, not apply?
    for (i in 1:length(xlist)){
        cat("Given List")
        print(xlist[[i]])
        cat("Normed Entropy")
        print(round(elist[[i]], 3))
        cat("\n")
    }

    Given List[1] 1 1 1 2 3 3
    Normed Entropy[1] 0.921

    Given List[1] 3 2 5 2 5 2 1 6 2 4
    Normed Entropy[1] 0.898

    Given List[1] 101 101 100 101 100 99 101 100 100 102 100 102 100 99 100 101 100 100 101 100
    Normed Entropy[1] 0.843

Example with additional arguments
One NA wrecks mean (by default)

    sampleList <- lapply(rep(1000, 5), rnorm)
    sampleList[[1]][77] <- NA
    (aNewList <- lapply(sampleList, mean))
    [[1]]
    [1] NA

    [[2]]
    [1] -0.008354005

    [[3]]
    [1] -0.003276648

    [[4]]
    [1] -0.003438522

    [[5]]
    [1] 0.05110267

Example (cont.): Fix that Missing Value Problem
    (aNewList <- lapply(sampleList, mean, na.rm=T))
    [[1]]
    [1] -0.03336209

    [[2]]
    [1] -0.008354005

    [[3]]
    [1] -0.003276648

    [[4]]
    [1] -0.003438522

    [[5]]
    [1] 0.05110267
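How the extra argument travels can be seen on a tiny list. In this sketch (made-up data, hypothetical names), anything written after the function in the lapply call is handed on to every call of that function, which is equivalent to wrapping the call in an inline function.

```r
scores <- list(a = c(1, 2, NA), b = c(4, 5, 6))

# Default behavior: the NA propagates into the mean of element "a"
meansDefault <- lapply(scores, mean)

# Extra arguments after FUN are passed along to each call of FUN
meansNoNA <- lapply(scores, mean, na.rm = TRUE)

# The same thing written with an explicit inline function
meansInline <- lapply(scores, function(v) mean(v, na.rm = TRUE))

identical(meansNoNA, meansInline)
```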


Example: lapply to Simulate Regressions.

The question:
Create 100 regression models from 100 data sets.
Study the sampling distribution of the R² statistic from those regressions.


Step 1.
The following generates 100 data frames in a list, “mydatasets”.

    exs <- 10
    exq <- 0.345
    exstde <- 20
    createOneDF <- function(run, s=NA, q=NA, stde=NA){
        x <- 18 + 43 * runif(1000)
        y <- s + q * x + rnorm(1000, mean=0, sd=stde)
        mydf <- data.frame(run, x, y)
    }
    mydatasets <- lapply(1:100, createOneDF, exs, exq, exstde)

Here the “list” is just a sequence 1, 2, 3, ...
lapply automatically gives each list element to the function as its first argument (in this case, the “run” number).

Step 2.
Now apply a function to each data frame, make list “myregressions”

    myregressions <- lapply(mydatasets, FUN = function(mydf) lm(y ~ x, data=mydf))

Note: small functions can be written “inline”.
Could as well have written

    calcReg <- function(adf=NULL){
        mod <- lm(y ~ x, data=adf)
    }
    myregressions <- lapply(mydatasets, FUN = calcReg)


Take Stock of What We Have
Each element in the list “mydatasets” really is a data frame:

    head(mydatasets[[33]])
      run        x          y
    1  33 41.47315  30.817774
    2  33 48.78788  48.229489
    3  33 31.71107  45.515414
    4  33 50.28991 -22.129543
    5  33 60.13310  33.632953
    6  33 35.67771   9.532895

Each element in “myregressions” really is a regression result object

    myregressions[[33]]


    Call:
    lm(formula = y ~ x, data = mydf)

    Coefficients:
    (Intercept)            x
        10.5261       0.3371

Which can be summarized thus:

    summary(myregressions[[33]])

    Call:
    lm(formula = y ~ x, data = mydf)

    Residuals:
        Min      1Q  Median      3Q     Max
    -56.643 -11.595   0.873  12.462  57.854

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 10.52613    1.94869   5.402 8.26e-08 ***
    x            0.33713    0.04737   7.117 2.10e-12 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


    Residual standard error: 18.79 on 998 degrees of freedom
    Multiple R2: 0.0483,  Adjusted R2: 0.04735
    F-statistic: 50.66 on 1 and 998 DF,  p-value: 2.101e-12

Note, the R² value that we need is sitting there, in the middle of the summary output. We’ll need that.


Step 3.
Grab the R² from each regression in the list.
The estimate of the R² is an element, named r.squared, in the object returned by summary.
One strategy: create an R list of summary objects

    mysummaries <- lapply(myregressions, FUN = summary)

Getting the R² out of each one of those requires some tedious grabbing, such as

    myrsq <- lapply(mysummaries, FUN = function(mr) { mr$r.squared })
    myrsq[1:5]

    [[1]]
    [1] 0.03758218

    [[2]]
    [1] 0.03746384

    [[3]]
    [1] 0.02569663

    [[4]]
    [1] 0.03390325

    [[5]]
    [1] 0.04059477

    myrsq <- unlist(myrsq)
    str(myrsq)

     num [1:100] 0.0376 0.0375 0.0257 0.0339 0.0406 ...


Sapply will do that in one shot
sapply is the “simplified apply”: it attempts to convert the list of results into a vector or matrix.
Snoop through the regressions, grab the R².

    myrsq <- sapply(mysummaries, FUN = function(mr) { mr$r.squared })
    mean(myrsq)
    [1] 0.04510022
    sd(myrsq)
    [1] 0.01280801
    median(myrsq)
    [1] 0.04424352
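The whole pipeline fits in a few lines. The sketch below is a scaled-down version (20 data sets of 200 rows; these sizes, the seed, and the names `makeDF`, `dats`, `mods`, `rsqS`, `rsqV` are choices made here, not the slides' values). It also shows vapply, a stricter relative of sapply that states up front what each result must look like.

```r
set.seed(2012)

# Scaled-down data generator, patterned on createOneDF
makeDF <- function(run, s, q, stde) {
    x <- 18 + 43 * runif(200)
    y <- s + q * x + rnorm(200, mean = 0, sd = stde)
    data.frame(run, x, y)
}

dats <- lapply(1:20, makeDF, s = 10, q = 0.345, stde = 20)
mods <- lapply(dats, function(d) lm(y ~ x, data = d))

# sapply simplifies the results to a numeric vector
rsqS <- sapply(mods, function(m) summary(m)$r.squared)

# vapply does the same, but stops with an error if any result
# is not a single number
rsqV <- vapply(mods, function(m) summary(m)$r.squared, numeric(1))

summary(rsqS)
```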

Everybody Still Loves Histograms
[Figure: “Histogram of myrsq” with an observed density curve. x-axis: R-Squares From 100 Regressions (0.00 to 0.10); y-axis: Density (0 to 30).]


Example: Balance in Logistic Regression

Last year I wondered (while auditing the categorical class), “what if we run a logistic regression comparing men and women and there are not very many men?”
Write functions to
    manufacture data
    analyze data
    summarize & plot data


Create Output Data: Need to convert real numbers to 0’s and 1’s

η (“eta”) is the input, the proclivity to “vote democratic”

    simLogit <- function(myeta){
        mypi <- exp(myeta) / (1 + exp(myeta))  ## SAME AS 1 / (1 + exp(-myeta))
        myunif <- runif(length(myeta))
        y <- ifelse(myunif < mypi, 1, 0)
    }
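An alternative way to write the same conversion is sketched below, using the base R helpers plogis (the inverse logit) and rbinom. This is a swapped-in variant, not the slides' function; the name `simLogit2` and the test values of eta are made up here.

```r
# Same idea as simLogit, using built-in helpers
simLogit2 <- function(myeta) {
    mypi <- plogis(myeta)            # equals 1 / (1 + exp(-myeta))
    rbinom(length(myeta), size = 1, prob = mypi)
}

set.seed(42)
eta <- c(-50, 0, 50)                 # extreme values make the outcome nearly certain
draws <- simLogit2(rep(eta, each = 100))

# Share of 1's within each block of 100 draws
tapply(draws, rep(eta, each = 100), mean)
```

At eta = -50 essentially every draw is 0, at eta = 50 essentially every draw is 1, and at eta = 0 the draws are a fair coin.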


Example Use: Creates 1000 Observations

    N <- 1000
    A <- -1
    B <- 0.3
    x <- 1 + 10 * rnorm(N)
    myeta <- A + B * x
    y <- simLogit(myeta)


Illustration of Simulated Data
    plot(x, y, main=bquote(eta[i] == .(A) + .(B) * x[i]))
    text(0.5 * max(x), 0.5,
         expression(Prob(y[i] == 1) == frac(1, 1 + exp(-eta[i]))))

[Figure: scatterplot of y (0 or 1) against x (about −30 to 30), titled η_i = −1 + 0.3 x_i, annotated with Prob(y_i = 1) = 1 / (1 + exp(−η_i)).]


The Fitted Line from glm
[Figure: the same scatterplot of y against x, titled η_i = −1 + 0.3 x_i and annotated with Prob(y_i = 1) = 1 / (1 + exp(−η_i)), now showing the fitted line from glm.]
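The slide shows only the picture. One way such a fitted line might be produced is sketched below; it is an assumption about the approach, not the code behind the figure. The names `myglm1`, `xgrid`, and `phat` are hypothetical, and the outcome is drawn with rbinom and plogis rather than the slides' simLogit so the block stands alone.

```r
set.seed(2012)
N <- 1000
A <- -1
B <- 0.3
x <- 1 + 10 * rnorm(N)
y <- rbinom(N, size = 1, prob = plogis(A + B * x))

# Fit the logistic regression
myglm1 <- glm(y ~ x, family = binomial)
coef(myglm1)

# Predicted probabilities along a grid of x values, for drawing the curve
xgrid <- data.frame(x = seq(min(x), max(x), length.out = 200))
phat <- predict(myglm1, newdata = xgrid, type = "response")

# In an interactive session, draw the points and overlay the fitted curve:
# plot(x, y)
# lines(xgrid$x, phat, lwd = 2)
```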


We are Interested in the Difference Between Two Groups
[Figure: scatterplot of y against x, titled η_i = −1 + 0.3 x_i and annotated with Prob(y_i = 1) = 1 / (1 + exp(−η_i)).]


Now Automate That Process
Manufacture data
Run regression
Return row of estimates

    simUnbalanced <- function(iter=0, parm){
        ## Uses N, x, and simLogit from the workspace
        A <- parm$A; B <- parm$B; C <- parm$C; PrFem <- parm$PrFem
        ## sex is 1 (female) with probability PrFem, else 0 (male)
        sex <- ifelse(runif(N) < PrFem, 1, 0)
        myeta <- A + B * x + C * sex
        sex <- factor(sex, levels=c(0, 1), labels=c("M", "F"))
        y <- simLogit(myeta)
        myglm2 <- glm(y ~ x + sex, family=binomial)
        myglm2sum <- coef(summary(myglm2))
        est <- myglm2sum[3, ]  ## row 3: estimate, std. error, z, p for sex
    }

Use sapply to run 1000 Regressions

    p <- list()
    p$A <- -1; p$B <- 0.3; p$C <- 0.4
    p$PrFem <- 0.5
    result45 <- list(sapply(1:1000, simUnbalanced, parm=p), parm=p)

Note: I’m combining the sapply result along with “p”, for record-keeping.

    p$PrFem <- 0.9
    result49 <- list(sapply(1:1000, simUnbalanced, parm=p), parm=p)
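When every call returns a vector of the same length, sapply binds the results into a matrix with one column per run. The sketch below uses a toy stand-in (`oneRun`, a hypothetical function that fabricates its four numbers) to show the shape of what comes back, which is why later code reads estimates from row 1 and p-values from row 4.

```r
# Toy stand-in for simUnbalanced: returns a named vector of length 4
oneRun <- function(iter, parm) {
    c(Estimate = parm$C + iter, SE = 1, z = 2, p = 0.05)
}

p <- list(C = 0.4)
res <- sapply(1:3, oneRun, parm = p)

dim(res)            # 4 rows (one per statistic), 3 columns (one per run)
res["Estimate", ]   # the first row holds the estimate from each run
```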


Now Plan to Draw Some Figures
    createFigs <- function(result){
        dat <- result[[1]]
        C <- result$parm$C
        PrFem <- result$parm$PrFem
        mybeta <- dat[1, ]
        hrow1 <- hist(mybeta, breaks=50, plot=F)
        mybreaks <- hrow1$breaks
        breakMember <- cut(dat[1, ], mybreaks)
        mypval <- dat[4, ]
        mysignif <- ifelse((mypval < 0.05), 1, 0)
        df <- data.frame(mybeta, mypval, mysignif, breakMember)

        propsig <- by(df$mysignif, INDICES=list(df$breakMember), mean, simplify=T)
        mytrat <- dat[3, ]
        mycounts <- hrow1$counts
        plot(dat[1, ], dat[4, ], xlab="beta estimate", ylab="estimated p", cex=0.7,
             main=paste("True Beta=", C, "Prop. Fem.=", PrFem))
        gc <- c("gray98", "gray70", "gray50", "gray40")
        cut(propsig, breaks=c(-1, 0.1, 0.5, 0.9, 1.1))
        catpropsig <- cut(propsig, breaks=c(-1, 0.1, 0.5, 0.9, 1.1), ordered=T,
                          labels=c("0", "lth", "mth", "1"))
        barplot(hrow1$density, col=gc[as.numeric(catpropsig)], names=hrow1$mids)
    }

For Balanced Data
[Figure: “True Beta= 0.4 Prop. Fem.= 0.5”. Scatterplot of estimated p (0.0 to 1.0) against beta estimate, with a bar plot of the density of the beta estimates.]


For Unbalanced Data
[Figure: “True Beta= 0.4 Prop. Fem.= 0.9”. Scatterplot of estimated p (0.0 to 1.0) against beta estimate, with a bar plot of the density of the beta estimates.]


Bootstrapping: Some “Do it Yourself” Work Is Required

Many R functions require users to write little functions that do little things.
In many cases (like lapply or apply), look for FUN as an argument.
Sometimes no built-in exists. useR must create!


boot Function Requires a Special Function “statistic”

    library(boot)
    ?boot

    Bootstrap Resampling

    Description:
    Generate 'R' bootstrap replicates of a statistic applied to data.
    Both parametric and nonparametric resampling are possible.
    ...
    boot(data, statistic, R, sim="ordinary", stype="i",
         strata=rep(1, n), L=NULL, m=0, weights=NULL,
         ran.gen=function(d, p) d, mle=NULL, simple=FALSE, ...)

    statistic: A function which when applied to data returns a vector
    containing the statistic(s) of interest. ...


Bootstrap: Background Explanation

Bootstrap: draw samples repeatedly and re-estimate θ.
The resulting values approximate the sampling distribution of the estimate of θ.
The “boot” package asks for the data (a vector, matrix, or data frame) and a special function, “statistic”.
statistic must
    accept the data as the first argument
    accept an “index vector” as the second argument


Don’t Panic: This is Confusing to Everybody

Example usage

    boot(data, statistic=yourFunction, R=1000)

boot will iterate 1000 times, and yourFunction will provide the statistic of interest.
You write yourFunction to make the required calculation.
boot will tell yourFunction which lines to use in the data frame, over and over.


The Median of a Poisson Distribution

Suppose you have a sample from a Poisson process:

    samp <- rpois(20, lambda=3)

And you calculate the median:

    median(samp)
    [1] 4

How confident are you in that estimate of the median?


Bootstrap Your Median

Here is yourFunction:

    calcMed <- function(dat, ind){
        median(dat[ind])
    }

dat[ind] has the effect of “pulling” the cases that match “ind” from “dat” (here “dat” is a vector; for a data frame, dat[ind, ] would pull rows).
The boot function will send 1000 “case indexes” to your function.

    library(boot)
    bpois <- boot(samp, calcMed, R=1000)
    bpois
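A runnable version with a fixed seed is sketched below, along with a few ways to look inside the result. The seed is an arbitrary choice made here, so the numbers will differ from those on the slides. The components `t0` (the statistic on the original data) and `t` (a matrix with one row per replicate) are part of the object that boot returns.

```r
library(boot)

set.seed(2012)
samp <- rpois(20, lambda = 3)

# statistic(data, index): boot supplies the index vector on each replicate
calcMed <- function(dat, ind) {
    median(dat[ind])
}

bpois <- boot(samp, calcMed, R = 1000)

bpois$t0                                   # median of the original sample
sd(bpois$t[, 1])                           # bootstrap standard error
quantile(bpois$t[, 1], c(0.025, 0.975))    # rough percentile interval
```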


    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = samp, statistic = calcMed, R = 1000)

    Bootstrap Statistics :
        original   bias    std. error
    t1*        4  -0.5015   0.7181668


Let’s plot that
[Figure: two panels. “Histogram of t” (x-axis: t*, 2.0 to 4.5; y-axis: Density) and a normal Q-Q plot of t* against Quantiles of Standard Normal (−3 to 3).]


Why Do They Do It That Way?

Your instinct is to do this the “simple” way:
    (Just) “manually” draw new random samples of rows from a data frame.
But:
    Creating 1000s of “new” re-sampled data sets would “waste” (exhaust?) memory.
    It would be especially slow if separate data sets have to be copied between systems.

More efficient to keep 1 data frame, but 1000s of vectors of row numbers.
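The index-vector idea can be imitated in base R. The sketch below illustrates the principle only; it is not how the boot package is implemented internally. The names `indexList` and `tstar`, and the seed, are choices made here.

```r
set.seed(2012)
samp <- rpois(20, lambda = 3)
calcMed <- function(dat, ind) median(dat[ind])

# One vector of case numbers per replicate, drawn with replacement
R <- 1000
n <- length(samp)
indexList <- replicate(R, sample.int(n, n, replace = TRUE), simplify = FALSE)

# The data stay in one place; only the index vectors change
tstar <- vapply(indexList, function(ind) calcMed(samp, ind), numeric(1))

sd(tstar)   # bootstrap standard error of the median
```

Each index vector is just n integers, so storing many of them costs far less than storing the same number of full copies of a wide data frame.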
