
Craig

(Large mammal ecologist)

• ZOT 28 – Zoology Building


• Office number: 040 602 2339
• E-mail: ctambling@ufh.ac.za
• Blackboard communication
• https://learn.ufh.ac.za/
Next section of the course
1. Spreadsheets
2. Types of graphs
3. Central tendencies
4. Variation
5. Interpretation of data
Spreadsheets in Science
Spreadsheets – what are they?
• An electronic document in which
data is arranged in the rows and
columns of a grid and can be
manipulated and used in
calculations
Uses of spreadsheets
• Data entry – Data that was collected in the field can be entered into
an electronic format
• Organizing data – Sorting and arranging data so that the data are
easier to use
• Subsetting and sorting data – Basic manipulation of your data for
doing analysis on different sections of your dataset
• Statistics – Finding patterns and fitting models in the data so that you
can draw inferences from the data you have collected
• Plotting – Graphically presenting your data to make it easy to
understand
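
The same uses can be sketched outside a spreadsheet program. The Python example below is only an illustration: the table, its column names (site, species, mass_kg) and its values are made up, and it walks through data entry, sorting, subsetting and a simple summary statistic.

```python
import csv
import io
from statistics import mean

# Hypothetical field data: one row per observation (values are made up).
raw = """site,species,mass_kg
A,kudu,210
B,impala,52
A,impala,48
B,kudu,195
"""

# Data entry: read the table into a list of dictionaries, one per row.
rows = list(csv.DictReader(io.StringIO(raw)))
for row in rows:
    row["mass_kg"] = float(row["mass_kg"])  # store numbers as numbers, not text

# Organizing: sort the observations by mass, lightest first.
by_mass = sorted(rows, key=lambda r: r["mass_kg"])

# Subsetting: keep only the impala records.
impala = [r for r in rows if r["species"] == "impala"]

# Statistics: a simple summary of the subset.
impala_mean = mean(r["mass_kg"] for r in impala)
print(by_mass[0]["species"], impala_mean)
```

Each step maps onto a spreadsheet operation: sorted() plays the role of Sort, the list filter plays the role of Filter, and mean() plays the role of a function such as AVERAGE.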
What is a spreadsheet composed of?
Cell: A single data point or element in a spreadsheet

Column: A vertical set of cells

Row: A horizontal set of cells


Range: A selection of cells extending across a row, column, or
both

Function: A built-in operation from the spreadsheet app,
which can be used to calculate cell, row, column, or range
values, manipulate data and more

Formula: The combination of functions, cells, rows, columns,
and ranges used to obtain a specific result

Worksheet (Sheet): The named sets of rows and columns
making up your spreadsheet; one spreadsheet can have
multiple sheets

Spreadsheet: The entire document containing your
worksheets
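
As a rough analogy only, these terms can be mapped onto a small grid in Python. The grid values and the variable names below are hypothetical, and the cell references in the comments (A1, B1 and so on) assume the usual letter-for-column, number-for-row addressing.

```python
# A hypothetical worksheet held as a grid: a list of rows, each a list of cells.
sheet = [
    [4, 7, 1],
    [3, 8, 2],
    [5, 6, 9],
]

cell = sheet[0][1]                         # one cell: row 1, column 2 (like B1)
row = sheet[1]                             # one row: every cell in row 2
column = [r[0] for r in sheet]             # one column: column 1 (like A1:A3)
cell_range = [r[0:2] for r in sheet[0:2]]  # a range across rows and columns (like A1:B2)

# sum() and len() stand in for built-in functions; combining them with a range
# gives a formula, here playing the role of =SUM(A1:A3)/COUNT(A1:A3).
column_mean = sum(column) / len(column)
print(cell, row, column, cell_range, column_mean)
```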
Good data entry practices – formatting data tables
• Spreadsheets need to be well formatted before any data are
entered
• If you don’t organize your data properly you might as well just scan
your datasheet and store it as a PDF
• Two simple rules when entering data
1) Each data cell is an observation that must have all the relevant
information connected to it for it to stand on its own
2) You must make it clear to the computer how the data cells relate to
the relevant information and each other

https://datacarpentry.org/2015-03-09-ISI-CODATA/lessons/excel/ecology-examples/00-intro.html
Columns are the variables of
interest

Rows are most often the
observations, some of which have
multiple values, measurements or
categories
Try to avoid having multiple tables
in a single spreadsheet (above)
or an excessive number of sheets
(right). Ask yourself, “Self, could I
avoid adding this tab by adding
another column to my original
spreadsheet?”
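
A minimal sketch of that idea in Python, assuming counts that were kept on one tab per year. The tab names, column names and numbers are invented for illustration; the point is that the information carried by the tab name becomes an ordinary column.

```python
# Hypothetical counts that were kept on two separate tabs, one per year.
tab_2013 = [
    {"plot": 1, "species": "kudu", "count": 3},
    {"plot": 2, "species": "impala", "count": 11},
]
tab_2014 = [
    {"plot": 1, "species": "kudu", "count": 5},
]

# Combine the tabs into one table, recording the year in its own column
# instead of in the tab name.
combined = []
for year, tab in [(2013, tab_2013), (2014, tab_2014)]:
    for record in tab:
        combined.append({"year": year, **record})

for record in combined:
    print(record)
```

With everything in one table, every row carries its own year, so the data can be sorted or subset by year without opening a second sheet.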
Avoiding common formatting mistakes
• Not filling in zeros
• Using formatting to make the data sheet look pretty (e.g. merging cells
is a no-no if you want your data to be readable by another
application)
• Placing comments or units in cells (create another variable called
comments)
• More than one piece of information in a cell
• Field name problems (no spaces or special characters)
• Special characters in data
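
Several of these mistakes can be checked for automatically. The Python sketch below is one possible approach, not a standard tool: the function names are hypothetical, and the rule for a valid field name (letters, digits and underscores, starting with a letter) is an assumed convention.

```python
import re

# Assumed convention: letters, digits and underscores only, starting with a letter.
VALID_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")

def bad_field_names(header):
    """Return the field names that contain spaces or special characters."""
    return [name for name in header if not VALID_NAME.match(name)]

def split_value_and_unit(cell):
    """Split a cell such as '12.5 kg' into (12.5, 'kg') so the unit gets its own column."""
    match = re.match(r"^\s*(-?\d+(?:\.\d+)?)\s*([A-Za-z%]*)\s*$", cell)
    if match is None:
        raise ValueError(f"cannot parse {cell!r}")
    return float(match.group(1)), match.group(2)

def blank_cells(rows):
    """Return (row_index, column_index) for every empty cell.

    A blank is ambiguous: it could mean zero or it could mean 'not measured'.
    """
    return [
        (i, j)
        for i, row in enumerate(rows)
        for j, value in enumerate(row)
        if value.strip() == ""
    ]

print(bad_field_names(["species", "mass (kg)", "plot_id", "2nd count"]))
print(split_value_and_unit("12.5 kg"))
print(blank_cells([["kudu", "3"], ["impala", ""]]))
```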
Dates as data – beware!

• Excel stores dates as a serial number: a count of days in which 1 is
the 1st of January 1900.
• Problem if dates are not entered properly and completely
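
To show what the stored number means, here is a small Python sketch that converts a whole-number serial back to a calendar date. It assumes the workbook uses Excel's 1900 date system, and the function name is hypothetical. It also has to allow for a known quirk: Excel treats 1900 as a leap year, so serial 60 corresponds to 29 February 1900, a date that never existed.

```python
from datetime import date, timedelta

def excel_serial_to_date(serial):
    """Convert a whole-number serial from Excel's 1900 date system to a date."""
    if serial < 1:
        raise ValueError("serial numbers start at 1 (1 January 1900)")
    if serial == 60:
        # Excel counts a 29 February 1900 that never existed.
        raise ValueError("serial 60 has no real calendar date")
    # Serials above 60 are one day ahead because of that phantom leap day.
    offset = serial if serial < 60 else serial - 1
    return date(1899, 12, 31) + timedelta(days=offset)

print(excel_serial_to_date(1))      # 1900-01-01
print(excel_serial_to_date(44927))  # 2023-01-01
```

If a date column is exported while formatted as a plain number, values like 44927 are what another program will see, which is one reason dates need to be entered and checked carefully.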
Basic quality control
• Tip! Before doing any quality control operations, save your original file with the
formulas and a name indicating it is the original data. Create a separate file with
appropriate naming and versioning, and ensure your data is stored as “values”
and not as formulas. Because formulas refer to other cells, and you may be
moving cells around, you may compromise the integrity of your data if you do not
take this step!

Once the data has been entered, don’t #$%@ with it. Work on copies of your
raw data
• readMe files: As you start manipulating your data files, create a readMe
document / text file to keep track of your files and document your manipulations
so that they may be easily understood and replicated, either by your future self
or by an independent researcher. Your readMe file should document all of the
files in your data set (including documentation), describe their content and
format, and lay out the organizing principles of folders and subfolders. For each of
the separate files listed, it is a good idea to document the manipulations or
analyses that were carried out on those data.
Exporting data from spreadsheets
• Excel is owned by Microsoft, so at some time in the future its file
formats may not be accessible to all programs
• Better to store the data in a universal, open, static format such as a
CSV file. CSV = comma-separated values: a plain text file where data
are separated by commas. CSV can be opened and read
by almost any software
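
A minimal sketch of what a CSV file looks like, using Python's standard csv module. The table contents are invented, and an in-memory buffer stands in for a file on disk so the example is self-contained.

```python
import csv
import io

# Hypothetical table: a header row followed by one row per observation.
table = [
    ["site", "species", "comments"],
    ["A", "kudu", "adult, male"],  # a comma inside a cell
    ["B", "impala", ""],
]

# Write the table as CSV text. With a real file, open it with newline=""
# as the csv module documentation recommends.
buffer = io.StringIO()
csv.writer(buffer).writerows(table)
csv_text = buffer.getvalue()
print(csv_text)

# Read the text back to confirm nothing was lost in the export.
round_trip = list(csv.reader(io.StringIO(csv_text)))
print(round_trip == table)
```

The cell that contains a comma is written inside double quotes, which is how a CSV reader tells a comma in the data apart from a comma that separates two cells.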
Questions
• What do we use a spreadsheet program for?
• List three things that you should avoid doing when entering your data
into a spreadsheet program and for each explain why you should
avoid doing it.
• Why should you be careful when entering dates into Excel?
