
Presentation by

Jenni Crossley, Library Research Support
Jen Quah, RBI Research Administration

24 May 2016

Practical housekeeping for your data

Overview

Brief introduction to research data management


Organising your data
Data storage
Data documentation
Working with collaborators
Personal, sensitive and confidential data
Long term preservation and sharing

What is research data?


(From UWE Draft Data Management Policy)
The evidence collected, created or observed,
upon which knowledge from research is based.
Textual data, numerical data, databases,
geospatial data, images, audio-visual recordings
and data generated by machines and
instruments.
Research data will include data held in electronic
and paper form as well as physical data such as
samples and physical artefacts.

Research data management is


A. Planning for data
B. Curating data
C. Using data safely
D. Data preservation
E. Data sharing and re-use
F. All of the above


What is research data management?
The active management and appraisal of data
over the lifecycle of scholarly and scientific
interest.
Research data management concerns the
organisation of data, from its entry to the
research life cycle through to dissemination and
archiving of valuable results.

Or in other words
Doing your housework and looking after your
stuff!
Is it valuable?
Is it current?
Keep, archive, dispose?

Why manage data?


Data is an asset, unless it isn't properly cared for
Mitigates risk of data loss
Makes data easier to find and use
Makes your life easier
Allows validation of results
Is good research practice

Data threats
Human error
Fire
Flood
Virus attack
Equipment failure
Theft/hacking

Who is responsible for data management?
Overall project manager
Often delegated
For PhD candidates, on a day-to-day basis this
means you!

Data management plan


Think about/define:

What kind of data do you want/need?


How will the data be generated?
How will the data be documented?
What do you need to do with your data?
Permissions

Human participants? Ethics clearance required


Re-using data? What can you do with it?

How much?
Storage
Back-up
Who can access? Anyone apart from you?
Long term value v disposal

File structure/format
Structure folders hierarchically:

Restrict the level of folders to three or four deep

Establish naming conventions:

Descriptive or non-descriptive filenames (http://bit.ly/1TrljWL)

Choose suitable file formats:

Optimal data formats for long-term preservation of data (http://bit.ly/1M65rqu)

Use version control


Review materials regularly: restructure, delete or archive
Be consistent: folder structures and file naming are only as effective as your ability to follow them!
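
The folder-depth guideline above can be checked automatically. The following is a minimal Python sketch, not part of the original slides. The function names, the example paths and the choice of four levels as the limit are all illustrative assumptions.

```python
from pathlib import PurePosixPath

# Assumption: the slide suggests "three or four" levels; four is used here.
MAX_DEPTH = 4


def folder_depth(relative_path: str) -> int:
    """Count the folder levels above a file in a project-relative path."""
    # Every path component except the file name itself is a folder level.
    return len(PurePosixPath(relative_path).parts) - 1


def too_deep(paths, max_depth=MAX_DEPTH):
    """Return the paths whose files sit more than max_depth folders down."""
    return [p for p in paths if folder_depth(p) > max_depth]


# Hypothetical project layout, for illustration only.
EXAMPLE_PATHS = [
    "thesis/data/raw/interviews/2016-05/farm01.txt",  # five folders deep
    "thesis/data/raw/farm02.txt",                     # three folders deep
]

if __name__ == "__main__":
    for path in too_deep(EXAMPLE_PATHS):
        print("Consider flattening:", path)
```

Running a check like this as part of a regular review makes it easier to stay consistent as a project grows.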

Three-level versioning:
Major.Minor.Revision
Major numbers begin at 1. Minor and revision numbers always start with 0. The first version of a data file is "1.0.0".
Numbers change when the file changes; which number changes depends on the relevance of the changes.

Example from: http://bit.ly/1TtWXg7
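
The Major.Minor.Revision scheme described above can be sketched in a few lines of Python. This is an illustration only. The `bump` function is hypothetical, and resetting the lower numbers to 0 when a higher number changes is an assumption based on common practice, not something the slide states.

```python
FIRST_VERSION = "1.0.0"  # per the slide: major starts at 1, minor and revision at 0


def bump(version: str, level: str) -> str:
    """Return the next version string after a change of the given relevance.

    level is one of "major", "minor" or "revision".
    Assumption: lower-level numbers reset to 0 when a higher level changes.
    """
    major, minor, revision = (int(part) for part in version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "revision":
        return f"{major}.{minor}.{revision + 1}"
    raise ValueError(f"unknown level: {level!r}")


if __name__ == "__main__":
    v = FIRST_VERSION
    v = bump(v, "revision")  # small correction
    v = bump(v, "minor")     # more relevant change
    print(v)
```

What counts as a major, minor or revision-level change is a decision for the project; the point is to decide once and apply it consistently.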

Version control table
Included at the beginning or end of a document.
Updated each time a change is made to the document.
Records versions, dates, authors and details of changes to the document.
Useful for tracking if multiple authors or contributors are involved.
Example from: http://bit.ly/25dgl8J
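
As a rough illustration of the version control table described above, the Python sketch below keeps the four recorded items (version, date, author, details of changes) as rows and renders them as plain text that could be pasted at the beginning or end of a document. The class name, function name, authors and dates are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Change:
    version: str
    date: str     # assumed ISO format, YYYY-MM-DD
    author: str
    details: str


def render_table(changes):
    """Render the change history as an aligned plain-text table."""
    header = ("Version", "Date", "Author", "Details of changes")
    rows = [header] + [(c.version, c.date, c.author, c.details) for c in changes]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(len(header))]
    lines = [
        " | ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    ]
    return "\n".join(lines)


# Hypothetical history, for illustration only.
HISTORY = [
    Change("1.0.0", "2016-05-24", "A. Researcher", "First version"),
    Change("1.0.1", "2016-06-02", "B. Collaborator", "Corrected typos in section 2"),
]

if __name__ == "__main__":
    print(render_table(HISTORY))
```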

Other version control options


Automated versioning:
Version control facilities within existing software
Using versioning software (e.g. Subversion, Git), often 3rd party!
Using file sharing services such as OneDrive
Manual versioning:
Controlling rights to file editing
Manual merging of entries or edits by multiple users

Documenting research data


Allows easy understanding of data
Adds context
Best done when data is created

What to record?
Title, date, creator, format, subject (rights, conditions of access)
Details about how the data was created, analysed, anonymised etc.
Explanations of any codes or abbreviations
Any notables: e.g. exceptions, quirks or questionable results
Information about the project and creators of the data

Where or how to document?


Maintain a readme.txt file
Lab book
Register
Database
If data is relegated, record this also, and the reasons why
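
One way to keep a readme.txt consistent across datasets is to generate it from a fixed set of fields. The Python sketch below is illustrative only. The field labels follow the "What to record?" slide, while the function names and all example values are hypothetical.

```python
from pathlib import Path


def build_readme(fields: dict) -> str:
    """Format the documentation fields as 'Label: value' lines."""
    return "\n".join(f"{label}: {value}" for label, value in fields.items()) + "\n"


def write_readme(folder, fields: dict) -> Path:
    """Write readme.txt into the given data folder and return its path."""
    path = Path(folder) / "readme.txt"
    path.write_text(build_readme(fields), encoding="utf-8")
    return path


# Hypothetical values, for illustration only.
EXAMPLE_FIELDS = {
    "Title": "Farm biodiversity survey (hypothetical)",
    "Date": "2016-05-24",
    "Creator": "A. Researcher",
    "Format": "CSV",
    "Subject": "Biodiversity on farms",
    "Rights and conditions of access": "Project team only",
    "How the data was created": "Field counts entered by hand",
    "Codes and abbreviations": "NA = not recorded",
    "Notes": "Site 3 counted a week late",
}

if __name__ == "__main__":
    print(build_readme(EXAMPLE_FIELDS))
```

Calling `write_readme("path/to/data", EXAMPLE_FIELDS)` would place the file alongside the data it describes, which is easiest to do at the point the data is created.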

Where do you store (or plan to store) your data?

A. UWE network
B. Personal laptop
C. External hard drive
D. Dropbox or other cloud storage
E. Memory stick
F. In a folder
G. In a box
H. In a locked filing cabinet or similar
I. Other

Where does UWE guidance say to store data? Choose 2.

A. UWE network
B. Personal laptop
C. External hard drive
D. Dropbox or other cloud service
E. Memory stick
F. In a folder
G. In a box
H. In a locked filing cabinet or similar
I. Other

How long do you think a computer hard drive is expected to last, under optimal circumstances?

A. 10 years
B. 8 years
C. 6 years
D. 4 years

Data storage considerations


UWE Data Security Guidance
http://bit.ly/1WDkuwz
Should store data on H or S drive
Students have 1 TB of space on OneDrive

Where not to store your research data
On personal laptops or devices
http://bit.ly/1VDNrtf for remote access
On removable media
Unless encrypted, password protected and stored securely
On externally-hosted cloud storage (e.g. Dropbox)
Data must not leave European Economic Area (Data Protection Act)
Syncing of devices makes the data vulnerable

How often do ITS back up the H and S drives?

A. Once a month
B. Once a week
C. Every day
D. Hourly

How long do ITS keep backups?

A. 365 days
B. 182 days
C. 30 days
D. 7 days

Backing up
H and S drives backed up daily
Kept for 30 days
Large data backed up to removable media
Secure storage

Collaborating?
Share via OneDrive
Staff should share via SharePoint
Be very careful about email
Should not use for personal/sensitive/confidential data
If this is the only option:
Encrypt data
Add "Confidential" to message line
Password protect
Send password by other media (e.g. text)

Encryption
ITS guidance:
http://www1.uwe.ac.uk/its/informationsecurity/guidance/encryption.aspx
http://www1.uwe.ac.uk/its/informationsecurity/tasks.aspx

Data in transit?
Consider carefully how to move data: physically
or electronically
Don't leave removable media containing research data unattended
Consider points re collaborating (see previous
slide)
If possible, upload to UWE servers using secure
remote access.

Personal data
Data that relate to a living individual.
Make identification of that individual possible on their own, or in combination with any other information held, or likely to be held, by the data controller.
Includes any expression of opinion about the individual.
Includes any indication of the intentions of the data controller, or of any other person, in respect of the individual (Data Protection Act 1998).

Which of these are personal data? Choose 4.

A. Name
B. Eye colour
C. Address
D. Race
E. Bank details
F. Date of birth

Which of these are sensitive personal data? Choose as many as you think apply.

A. Ethnic origin
B. Political opinion
C. Criminal conviction
D. Trade union membership
E. All of the above

Sensitive personal data


Defined in the Data Protection Act (1998) as data on a person's:
Race
Ethnic origin
Political opinion
Religious or similar beliefs
Trade union membership
Physical or mental health or condition
Sexual life
Commission or alleged commission of an offence
Proceedings for an offence (alleged to have been) committed
Disposal of such proceedings or the sentence of any court in such proceedings.

Which of these could be confidential data? Choose as many as you think apply.

A. Income
B. Health
C. Medical details
D. Political opinion
E. Business information

Confidential data
Data given in confidence, or agreed to be kept confidential (i.e. secret) between two parties, that is not in the public domain, such as information on:
business
income
health
medical details
political opinion

Anonymisation
Removing direct identifiers
Removing or modifying indirect identifiers
Risk of identification
Risk of data-linking
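
The two anonymisation steps listed above can be illustrated with a short Python sketch. It is a simplified example under stated assumptions, not a complete method. The field names, the participant record, and the particular generalisations (date of birth reduced to year, postcode reduced to its first part) are all hypothetical choices. Generalised values can still identify someone in a small population or when linked with other datasets, so the risks of identification and data-linking still need to be assessed.

```python
# Assumption: these field names mark the direct identifiers in this example.
DIRECT_IDENTIFIERS = {"name", "address"}


def anonymise(record: dict, participant_id: str) -> dict:
    """Drop direct identifiers and coarsen indirect ones.

    Assumes date_of_birth is an ISO string (YYYY-MM-DD) and postcode has a
    space between its two parts.
    """
    # Step 1: remove direct identifiers.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Step 2: modify indirect identifiers so they are less specific.
    if "date_of_birth" in cleaned:
        cleaned["birth_year"] = cleaned.pop("date_of_birth")[:4]
    if "postcode" in cleaned:
        cleaned["postcode_district"] = cleaned.pop("postcode").split()[0]
    # Replace the identity with a study-specific code.
    return {"participant_id": participant_id, **cleaned}


# Entirely fictitious record, for illustration only.
EXAMPLE_RECORD = {
    "name": "Example Person",
    "address": "1 Example Street",
    "postcode": "AB1 2CD",
    "date_of_birth": "1980-01-31",
    "occupation": "Nurse",
}

if __name__ == "__main__":
    print(anonymise(EXAMPLE_RECORD, "P001"))
```

Any key linking participant codes back to names would itself be personal data and would need to be stored separately and securely.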

Scenario 1
Mrs A. Badcrumble
Children's Nurse
29 Eastville Street, Eastville, Bristol, BS16 8AF
DoB 29th May 1978
What do you need to do?
How do you store?

Scenario 2
Dr X
He is a plastic surgeon, specialising in children's hand injuries
Works in St Bernards Hospital, London
Income 25000 pa
Registered member of Conservative party

Scenario 3

You interview some farmers

You have their names, farm addresses

You have information on their incomes

You record the type of farming done

You record biodiversity on the farms

Long term preservation and sharing
Sometimes required by funders
Sometimes required by journals
Part of good research practice
Allows validation of results
Allows for value for money
Data repositories available
Core data with long term value, correct permissions etc.

After today

Library Research Support lib.rke@uwe.ac.uk

Code of Good Research Conduct http://bit.ly/1OTEucG

Research Governance RDM pages http://bit.ly/1OKEGG2

Data Security Guidance http://bit.ly/1s6ILiU

Library Services RDM guidance (shortly to be replaced) http://bit.ly/1sNLChx

Supervisor or project manager.

Acknowledgements
Bristol University
London School of Economics and Political Science
UK Data Archive
JISC Digital Media
