
Package maintenance

With great power comes great responsibility

Jaime Pizarroso Gonzalo


Universidad Pontificia de Comillas
Escuela Técnica Superior de Ingeniería (ICAI)

Index
1 DESCRIPTION file
2 R code (R/)
    2.1 Function.R
    2.2 Package.R and data.R
    2.3 zzz.R
3 NAMESPACE file
4 cran-comments.md
5 Create library Workflow
6 Extra files
    6.1 .Rbuildignore
    6.2 README.Rmd
    6.3 .travis.yml
    6.4 appveyor.yml
References

IIT – Instituto de investigación tecnológica 1



1 DESCRIPTION file
As its name says, this file contains a brief description about the package. An example of this file is:
Package: IITML
Type: Package
Title: Useful functions for Machine Learning done by IIT (Institute for Research in Technology)
Version: 0.0.3.9000
Date: 2018-09-24
Author: José Portela González [aut], Antonio Muñoz San Roque [aut], Jaime Pizarroso Gonzalo [ctb, cre]
Maintainer: Jaime Pizarroso Gonzalo <jpizarroso@alu.comillas.edu>
Authors@R: c(
    person(given = "José", family = "Portela González",
           email = "Jose.Portela@iit.comillas.edu", role = "aut"),
    person(given = "Antonio", family = "Muñoz San Roque",
           email = "antonio.munoz@iit.comillas.edu", role = "aut"),
    person(given = "Jaime", family = "Pizarroso Gonzalo",
           email = "jpizarroso@alu.comillas.edu", role = c("ctb", "cre"))
    )
Description: Useful functions for Machine Learning to analyze data.
    These functions are useful for supervised and unsupervised models.
    The code in this package has been done based on Google's R style guide.
Imports:
    ROCR,
    directlabels,
    gridExtra,
    ggplot2,
    mvtnorm,
    caret,
    pROC,
    reshape2,
    NeuralNetTools,
    splines
Suggests:
    h2o,
    neural,
    RSNNS,
    nnet,
    neuralnet
RoxygenNote: 6.1.0
VignetteBuilder: knitr
NeedsCompilation: no
URL: https://github.com/JaiPizGon/IITML
BugReports: https://github.com/JaiPizGon/IITML/issues
License: GPL (>= 2)
Encoding: UTF-8
LazyData: true
• Package: name of the package
• Title: short description of the package
• Version: current version of the package
• Date: creation date of the version of the package

• Author: the authors (role = "aut"), the contributors (role = "ctb") and the creator (role = "cre")
of the package. If the Maintainer field is blank, the creator is taken as the maintainer.
• Maintainer: maintainer of the package, who should be notified of any bugs or problems detected
in the package in order to correct them.
• Authors@R: the same persons as in the Author field, specified with person() calls in the format
shown above.
• Description: the objective of the package and the reason for it. It must be longer and more
explanatory than the title.
• Imports: the libraries that the package needs in order to work. They are installed together with
the package by install.packages("package.name") and are loaded when the package is loaded. What
is actually imported from them is declared in the NAMESPACE file. If a certain version is
needed, it can be specified next to the name: ROCR (>= 3.0.0), ggplot2 (== 5.1.0)
• Suggests: the libraries that are not essential for the package to work but could be useful.
They are not installed with the package by default;
install.packages("package.name", dependencies = TRUE) installs them as well.
• RoxygenNote: version of roxygen2 used to generate the documentation of the functions.
• VignetteBuilder: package used to build the vignettes, if the package has any.
• NeedsCompilation: whether the package contains files in external languages, like C, C++ or
Fortran, that need to be compiled during the installation of the package. Be careful, because these
files must compile on both 32-bit and 64-bit architectures. If either of these architectures is not
supported, the package would not be accepted in CRAN.
• URL: URL of the package, usually a repository on GitHub or similar.
• BugReports: URL where users of the package can report errors or bugs.
• License: either a standard license like GPL-2 or an extra file with a custom license.
• Encoding: encoding used to save the .R files, usually UTF-8.
• LazyData: if true, the datasets are lazily loaded: they do not occupy memory until they
are used.
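
The fields described above can also be inspected from R itself. The following is a minimal sketch, assuming a reduced DESCRIPTION written to a temporary file; the field values (including the ggplot2 version constraint) are illustrative, and split_deps() is a hypothetical helper that is not part of any package.

```r
# Write a reduced DESCRIPTION to a temporary file (illustrative contents).
desc_path <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: IITML",
  "Version: 0.0.3.9000",
  "Imports:",
  "    ROCR,",
  "    ggplot2 (>= 3.0.0)",
  "Suggests:",
  "    nnet"
), desc_path)

# read.dcf() returns a character matrix with one column per field.
desc <- read.dcf(desc_path)

# Hypothetical helper: split a dependency field into one entry per package.
split_deps <- function(field) {
  trimws(strsplit(field, ",")[[1]])
}

imports  <- split_deps(desc[1, "Imports"])
suggests <- split_deps(desc[1, "Suggests"])

# package_version() parses the version; a fourth component marks a
# development version such as 0.0.3.9000.
version <- package_version(desc[1, "Version"])
is_dev  <- length(unclass(version)[[1]]) == 4
```

Note that the continuation lines of a multi-line field must start with whitespace, otherwise the file cannot be parsed.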

2 R code (R/)
All the functions and the source of their documentation must be in the R/ directory. A good practice is
to keep code with different objectives in separate files, so that it is easily debuggable (yep, that word
exists, google it).

2.1 Function.R
The following code is an example of a well-written function. Use it as a template for new functions:
#' Add together two numbers.
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of \code{x} and \code{y}.
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}

As can be seen in the code, the documentation of the function must come right before the function,
explaining its objective and all the information needed to use it.
The roxygen2 package renders that documentation and produces an .Rd file. The previous code would
produce the following .Rd file:
% Generated by roxygen2 (4.0.0): do not edit by hand
\name{add}
\alias{add}
\title{Add together two numbers}
\usage{
add(x, y)
}
\arguments{
\item{x}{A number}

\item{y}{A number}
}
\value{
The sum of \code{x} and \code{y}
}
\description{
Add together two numbers
}
\examples{
add(1, 1)
add(10, 1)
}
Calling the help of that function in R (for example, with ?add) displays this documentation as a formatted help page.

Other important attributes are:


• @import library: import a whole library needed for the function to work.
• @importFrom library function: import only the needed function of the library. Useful to avoid
name conflicts between libraries.

• @export function: export the function defined in the .R file. If this attribute is not used, the function
is not made available to the user when the library is loaded. It stays internal to the package, and
every time it is used from outside it must be called as library:::function() (annoying way).
• @section new.section: create a non-standard section in the documentation.
• @examples: code with examples of how to use the function. They should take no longer than 5 seconds
to run, so be careful if these examples produce plots or similar.
An example of all of these attributes is:
#' Analyze results of classification models
#'
#' @description Custom function for plotting the results of a classification model with two
#' input variables and a classification output variable. It lets you choose the class whose
#' probabilities are plotted.
#' @param X Data frame with two input variables
#' @param Y Vector with real output variable, should be a factor
#' @param model Fitted model using caret
#' @param var1 \code{character}, name of the x-axis variable
#' @param var2 \code{character}, name of the y-axis variable
#' @param sel.class \code{character}, class to be analyzed (probabilities). If \code{NULL},
#' an interactive list lets you choose the class analyzed.
#' @param np.grid \code{integer}, number of discretization points in each dimension
#' @section Output:
#' \itemize{
#'   \item Plot 1: colorful plot with the classification of the classes in a 2D map
#'   \item Plot 2: b/w plot with probability of the chosen class in a 2D map
#'   \item Plot 3: plot with the predictions of the data provided
#' }
#' It does not return any object.
#' @examples
#' Plot2DClass(fdata, fdata[, "Y"], lm.fit, var1 = "X1", var2 = "X2", sel.class = "YES", np.grid = 300)
#' @import caret
#' @import ggplot2
#' @import pROC
#' @importFrom gridExtra grid.arrange
#' @export Plot2DClass
Plot2DClass <- function(X, Y, model, var1 = NULL, var2 = NULL, sel.class = NULL, np.grid = 200) { … }
The documentation accepts Rd markup commands like \itemize (to create bulleted lists) or \code (to write
in code font).
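
As a rough illustration of what @export changes, the exported and the internal objects of an installed package can be compared. This sketch uses the stats package only because it ships with R; the variable names are illustrative.

```r
# Names that the stats package exports (what library(stats) makes visible).
exported <- getNamespaceExports("stats")

# Everything defined inside the package namespace, exported or not.
all_objects <- ls(asNamespace("stats"))

# Internal objects: defined in the package but not exported.
internal <- setdiff(all_objects, exported)

# An exported function can be called with the package::function() form.
middle <- stats::median(c(1, 5, 9))

# An internal object is only reachable through the namespace itself,
# which is what the package:::function() form looks up.
first_internal <- get(internal[1], envir = asNamespace("stats"))
```

Exporting only the functions meant for users keeps the interface of the package small, while the helpers stay reachable for debugging through the ::: form.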
2.1.1 Bonus tips
To format the function and get a nice, to-be-proud-of script in RStudio, use Ctrl+A and then Ctrl+Shift+A
(two commands, in that order) to select and reformat the code of the script. Be polite to others and
comment your code (others include your future self, who will have no idea what you were thinking when
you wrote the function and will be cursing your ancestors because of your non-commenting policy).

2.2 Package.R and data.R


These files are optional but highly recommended. The Package.R file can hold a longer description of
the package, in order to better explain its objective and the functions inside it. It uses the same
syntax as the documentation of a function, but a @docType attribute must be used to specify that
the documentation is of a package:
#' IITML: A package with useful functions for learning machine learning.
#' @docType package
#' @name IITML
NULL
Documentation must always be attached to an R object, so the documentation of things that are not R
objects in the code (like the package itself) is attached to a NULL object.
Similar to the package documentation, the data documentation can be done in a file called data.R.
However, the actual data must be saved in .RData files in the data/ directory of the package. An
example of this file is:
#' Data frame with 3 variables
#'
#' @description A dataset containing values to train and test classification models with a
#' two-level output
#' @name plot2Dclass_2classes
#' @author José Portela González
#' @keywords data
#' @format A data frame with 1000 rows and 3 variables:
#' \describe{
#'   \item{X1}{numeric coordinate}
#'   \item{X2}{numeric coordinate}
#'   \item{Y}{output 2 levels YES and NO}
#' }
NULL
In this case, the @docType attribute is not needed.
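
A data file matching that documentation could be produced roughly as follows. This is only a sketch: the values are random and purely illustrative (not the real IITML dataset), and a temporary directory stands in for the package directory.

```r
set.seed(1)  # reproducible illustrative values

# Data frame with the documented shape: 1000 rows, variables X1, X2 and Y.
plot2Dclass_2classes <- data.frame(
  X1 = rnorm(1000),
  X2 = rnorm(1000)
)
plot2Dclass_2classes$Y <- factor(
  ifelse(plot2Dclass_2classes$X1 + plot2Dclass_2classes$X2 > 0, "YES", "NO"),
  levels = c("YES", "NO")
)

# A temporary directory stands in for the package root in this sketch.
pkg_root <- file.path(tempdir(), "IITML")
data_dir <- file.path(pkg_root, "data")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

# Save one object per file, with the file named after the object.
data_file <- file.path(data_dir, "plot2Dclass_2classes.RData")
save(plot2Dclass_2classes, file = data_file)

# Check the round trip by loading the file into a fresh environment.
check_env <- new.env()
load(data_file, envir = check_env)
loaded_dim <- dim(check_env$plot2Dclass_2classes)
```

The object name inside the file is what @name in data.R has to match.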

2.3 zzz.R
This file holds the functions related to the loading of the library (like the powerful and inspirational
Spider-Man message), typically the .onLoad() and .onAttach() hooks, and other start-up or exit actions
(like loading some libraries). Some examples can be seen in the ggplot2 repository.
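
A minimal zzz.R could look like the sketch below. The hook names .onLoad() and .onAttach() are the ones R calls; the option name IITML.verbose and the text of the message are hypothetical and only serve as illustration.

```r
# zzz.R (sketch): hooks that R calls when the package is loaded or attached.

.onLoad <- function(libname, pkgname) {
  # Set a package option only if the user has not set it already.
  # "IITML.verbose" is a hypothetical option name used for illustration.
  if (is.null(getOption("IITML.verbose"))) {
    options(IITML.verbose = FALSE)
  }
  invisible(NULL)
}

.onAttach <- function(libname, pkgname) {
  # packageStartupMessage() lets users silence the message with
  # suppressPackageStartupMessages(library(IITML)).
  packageStartupMessage("With great power comes great responsibility.")
}
```

Start-up messages belong in .onAttach() rather than .onLoad(), so that they only appear when a user attaches the package with library().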

3 NAMESPACE file
This file is automatically generated by roxygen2 when the documentation is built with devtools::document().
It contains all the directives to import libraries or to export the functions of the library, following the
documentation of each function. IT MUST NOT BE EDITED BY HAND.
This file is the reason why it is important to use the @import, @importFrom and @export attributes in the
documentation of each function. devtools recognizes these attributes and overwrites the NAMESPACE
file so that the correct libraries are loaded and the functions are exported, so yes, think about the
libraries that a function needs when you write it.
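
For reference, the roxygen tags of the Plot2DClass example in section 2.1 would be expected to generate NAMESPACE directives along these lines. This is a sketch of the generated file, not a copy of the real IITML NAMESPACE:

```
# Generated by roxygen2: do not edit by hand

export(Plot2DClass)
import(caret)
import(ggplot2)
import(pROC)
importFrom(gridExtra,grid.arrange)
```

Each @export, @import and @importFrom tag maps to one directive, which is why the tags have to be kept up to date in the function documentation.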

4 cran-comments.md
This file should contain all the comments relevant to the CRAN volunteers when they test the package. A
good cran-comments.md file looks like this:
# cran-comments.md

## donttest{} examples

The `SensAnalysisMLP.R` function contains `\donttest{}` examples which produce animations that take >5sec.
Users need to know about them, but they cause issues in examples and checks.
The manipulation done before the animation rendering is already tested in the example which is not wrapped by
`\donttest{}`.

## Test environments

* Windows 10 Home x64, R 3.6.0
* Ubuntu 14.04.5 (on travis-ci), R (oldrel, release and devel)
* Mac OS X 10.13.3 (on travis-ci), R (oldrel, release and devel)
* win-builder, R (oldrelease, release and devel)
* macOS Sierra 10.12.6, R 3.5.1

### R CMD check

-- R CMD check results ----------------------------------- NeuralSens 0.0.2 ----


Duration: 27.5s

0 errors √ | 0 warnings √ | 0 notes √

## Travis CI
- linux
  - oldrel pass
  - release pass
  - devel pass
- osx
  - oldrel pass
  - release pass
  - devel ERROR:
    Installing R
    292.31s$ brew update >/dev/null
    4.05s$ curl -fLo /tmp/R.pkg https://r.research.att.com/el-capitan/R-devel/R-devel-el-capitan-signed.pkg
    curl: (22) The requested URL returned error: 404 Not Found
    The command "eval curl -fLo /tmp/R.pkg https://r.research.att.com/el-capitan/R-devel/R-devel-el-capitan-signed.pkg " failed. Retrying, 2 of 3.

## win-builder
- oldrelease pass
- release pass
- devel pass

5 Create library Workflow


In order to create a new version of the library, the devtools package must be installed on the computer. A
good workflow would be:
1. Change the version number in the DESCRIPTION file. This version number should be
major.minor.patch.develop (for example: 1.0.2.9000). 9000 is a conventional number to indicate
that this version is under development and is totally optional (CRAN does not accept that type
of number and forces the use of major.minor.patch). The next development version would be
1.0.2.9001.
a. If there are any files/directories that must not be included in the built library but are useful
to keep in the library directory (like functions to be included in future versions), they
must be listed in the .Rbuildignore file of the library directory.
2. Use devtools::document() to create the documentation of the package. This creates a
directory called man/ with all the .Rd files of the documentation. This step is optional
when the new version does not include any change in the documentation.
3. In RStudio, the Build pane has buttons for some of the commands needed to check the package:
• The Install and Restart button builds the package and installs it on the computer. It is
useful to test the functions as they are created.
• The Check button does the same as Install and Restart, but it also checks that the package
follows the rules for R packages (limits on line length in the documentation, for example)
and runs all the examples in the @examples fields of the function documentation. All the
NOTES, WARNINGS and ERRORS of this check should be corrected or documented
afterwards in cran-comments.md.
4. Use devtools::build() to create a .tar.gz file with the built package. This file can be shared and
installed on other devices by:
a. setwd("dir") where dir is the directory of the .tar.gz file.
b. install.packages("file.tar.gz", repos = NULL, dep = TRUE)
5. Push the new version to the GitHub repository. For this step, it is recommended to have the
Travis and AppVeyor automatic checks configured, which check the package on Linux and Mac, and
on Windows, respectively. The automatic checks should run on R-oldrelease, R-release and R-devel.
6. Upload the .tar.gz file to win-builder (https://win-builder.r-project.org/) to check the package on
a Windows virtual machine with R-oldrelease, R-release and R-devel.
7. Update the cran-comments.md file.
8. Submit the library to CRAN. This triggers an automatic check on the CRAN servers, and some
emails with the results are sent to the maintainer of the package. If the package passes
these automated checks, a CRAN member reviews the package and sends any questions
to the maintainer. This process takes 4-5 days the first time a package is submitted, and 1-2 days
for later submissions.
9. Improve the library and repeat.
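
The version arithmetic of step 1 and the commands of steps 2 to 4 can be sketched in R. The helper names below (version_parts, next_dev_version, release_version, build_checked_package) are hypothetical and only illustrate the convention described above. The last function needs devtools and a package directory, so it is only defined here, not run.

```r
# Split a version string into its integer components.
version_parts <- function(version) {
  unclass(package_version(version))[[1]]
}

# Next development version: 1.0.2 -> 1.0.2.9000, 1.0.2.9000 -> 1.0.2.9001.
next_dev_version <- function(version) {
  parts <- version_parts(version)
  if (length(parts) == 3) {
    parts <- c(parts, 9000L)
  } else if (length(parts) == 4) {
    parts[4] <- parts[4] + 1L
  } else {
    stop("expected major.minor.patch or major.minor.patch.develop")
  }
  paste(parts, collapse = ".")
}

# Version for a CRAN submission: drop the development component.
release_version <- function(version) {
  paste(version_parts(version)[1:3], collapse = ".")
}

# Steps 2 to 4 in one call (requires devtools; defined but not run here).
build_checked_package <- function(pkg_dir = ".") {
  devtools::document(pkg_dir)  # regenerate man/ and NAMESPACE
  devtools::check(pkg_dir)     # run R CMD check on the package
  devtools::build(pkg_dir)     # create the .tar.gz file
}

dev_version  <- next_dev_version("1.0.2.9000")  # "1.0.2.9001"
cran_version <- release_version("1.0.2.9001")   # "1.0.2"
```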

6 Extra files
6.1 .Rbuildignore
This file lists all the files that should not be taken into account when building the package. Each line
is a regular expression that is matched against the file paths. An example of this file is:
^IITML\.Rproj$
^\.Rproj\.user$
^next$
^Papers$
^revdep$
^cran-comments\.md$
^\.travis\.yml$
^README\.md$
^README\.Rmd$
^README\.html$
^README_files/.*
^appveyor\.yml$
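
How such patterns select files can be approximated with grepl(). This is a sketch under the assumption that the patterns are Perl-like regular expressions matched case-insensitively against paths relative to the package directory; the paths and the is_ignored() helper are hypothetical.

```r
# Patterns as they could appear in .Rbuildignore (illustrative subset).
patterns <- c(
  "^IITML\\.Rproj$",
  "^next$",
  "README.md",         # unanchored: matches anywhere in the path
  "^README_files/.*"
)

# Hypothetical paths relative to the package directory.
paths <- c(
  "IITML.Rproj",
  "next",
  "R/next.R",
  "README.md",
  "inst/README.md",
  "README_files/plot.png",
  "R/Function.R"
)

# A path is ignored when at least one pattern matches it.
is_ignored <- function(path, patterns) {
  any(vapply(patterns, grepl, logical(1), x = path,
             perl = TRUE, ignore.case = TRUE))
}

ignored <- vapply(paths, is_ignored, logical(1), patterns = patterns)
paths[ignored]
```

In this sketch the unanchored README.md pattern also excludes inst/README.md, which is why anchoring patterns with ^ and $ and escaping the dots is the safer habit.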
6.2 README.Rmd
README file of the GitHub repository written in R Markdown. This makes it possible to include graphics
made with ggplot2 or similar, but the file must be rendered with the rmarkdown package before pushing
the new version to GitHub.

6.3 .travis.yml
Configuration file for Travis CI, which checks the R package automatically every time a new version is
pushed to GitHub. An example of this file is:
# R for travis: see documentation at https://docs.travis-ci.com/user/languages/r

language: R
r:
- oldrel
- release
- devel
cache: packages
warnings_are_errors: true
os:
- linux
- osx
6.4 appveyor.yml
Configuration file for AppVeyor, which checks the R package automatically every time a new version is
pushed to GitHub. An example of this file is:
# DO NOT CHANGE the "init" and "install" sections below

# Download script file from GitHub
init:
  ps: |
    $ErrorActionPreference = "Stop"
    Invoke-WebRequest http://raw.github.com/krlmlr/r-appveyor/master/scripts/appveyor-tool.ps1 -OutFile "..\appveyor-tool.ps1"
    Import-Module '..\appveyor-tool.ps1'

install:
  ps: Bootstrap

cache:
  - C:\RLibrary

environment:
  global:
    WARNINGS_ARE_ERRORS: 1
    USE_RTOOLS: true
    NOT_CRAN: true
  # env vars that may need to be set, at least temporarily, from time to time
  # see https://github.com/krlmlr/r-appveyor#readme for details
  # USE_RTOOLS: true
  # R_REMOTES_STANDALONE: true

# Adapt as necessary starting from here

build_script:
  - travis-tool.sh install_deps

test_script:
  - travis-tool.sh run_tests

on_failure:
  - travis-tool.sh dump_logs

artifacts:
  - path: '*.Rcheck\**\*.log'
    name: Logs

  - path: '*.Rcheck\**\*.out'
    name: Logs

  - path: '*.Rcheck\**\*.fail'
    name: Logs

  - path: '*.Rcheck\**\*.Rout'
    name: Logs

  - path: '\*_*.tar.gz'
    name: Bits

  - path: '\*_*.zip'
    name: Bits

References
[1] R packages. Organize, test, document and share your code, H. Wickham. URL: http://r-pkgs.had.co.nz
