The topic of this talk, as its name suggests, is the importance and effectiveness of public
work, specifically in the statistics and data science community. David Robinson elaborates on
the benefits of collaboration when dealing with code or data science-related problems, and
expands not just on his own experience but the experience of many others throughout their
One comment by David Robinson that I found particularly interesting and unexpected is
the statistic that a blog post will, on average, be read 10 to 100 times more often than a published paper.
In my personal experience, I have definitely read or at least skimmed more scientific papers than
blog posts, which may simply be a result of my specific tastes, but either way I was still
surprised at this statistic. This goes to show how important it is to publicly share one’s work, no
matter how insignificant it may be, because of the possibility of it reaching so many more people
This event connects to our use of GitHub throughout STA 210 this semester. One of the
fundamental aspects of this talk was to encourage collaboration and open-source ideas, which is
one of the key purposes of GitHub besides version control. In class, we have been using GitHub
in order to maintain our group’s R code, further showing how GitHub is essential in fostering
collaboration in the data science community. In addition, GitHub contains many public
repositories from a multitude of users, which allows anyone, even without a GitHub account, to view their
code and ideas. Again, this ties into one of the central ideas of David Robinson’s talk, where he
highlights how new findings and research in the scientific community largely build upon
free and public access to important code and insights produced by others.
Without GitHub, a lot of code would be housed in private databases and communities, which
would significantly reduce the extent to which we are able to foster growth and innovation in this
world today.
Bryan Tong
STA 210 Extra Credit Assignment
Something new I learned is that there is a package in R, blogdown, that gives you the
tools necessary to create a blog or website in R Markdown. The introduction of this package goes
to show how important open-source collaboration has become, as a lot of
the blogs that are being created using this package are data-related blogs, where authors analyze
a dataset or even just talk about new advances in the data science community. By creating
blogdown, the authors are now equipping more and more people with the ability to contribute
code and insight to the community, with the hope of ultimately using the collective contributions
of as many data scientists as possible to provide the most state-of-the-art and innovative