
This file contains the following worksheets:

1. Checklist for high-throughput sequence submissions to NCBI GEO
2. Metadata Template (Please completely annotate the template. Failure to do so may result in significant delays in processing.)
3. MD5 Checksums for the raw data files referenced in Metadata Template
HTS-seq EXAMPLES: RNA-seq, ChIP-seq, scRNA-seq, CITE-seq (Refer to the examples provided. They can help you get started.)

Full online instructions and pointers to more examples are at: https://www.ncbi.nlm.nih.gov/geo/info/seq.html
For questions or concerns, please email us at geo@ncbi.nlm.nih.gov.

Quick Instructions
Step 1: Check that GEO accepts your data type

Step 2: Gather raw data files

Step 3: Gather processed data files

Step 4: Fill in Metadata Template (one seq type per template)

Step 5: Fill in MD5 Checksums sheet for the raw data files referenced in Metadata Template
Step 6: Create a folder on your computer that contains all raw and processed files, and your
completed Metadata Template in Excel format
Step 7: FTP the entire data folder to GEO

Step 8: Notify GEO by email once the FTP transfer is complete; unannounced files will no

Step 9: Your submission is placed into the processing queue and reviewed within 5 business days.
Expect to receive an email from GEO curators with questions about your submission or the
accession numbers.
* Updating GEO records (that have been processed and approved) can be labor-intensive,
so please carefully prepare your submission before you transfer your files to the GEO FTP server.
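
Step 5 asks for an MD5 checksum for each raw file. As a minimal sketch, assuming Python 3 is available and the raw files sit directly in one folder, the checksums could be computed as shown below. The folder name geo_submission, the suffix list, and the function names are illustrative assumptions and are not part of the GEO instructions.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_rows(folder, suffixes=(".fastq.gz", ".fq.gz", ".bam")):
    """Return (file name, MD5) pairs for raw files found directly in `folder`.

    The suffix list is an assumption; adjust it to your own raw file types.
    """
    rows = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.name.endswith(suffixes):
            rows.append((path.name, md5_of_file(path)))
    return rows


if __name__ == "__main__":
    folder = Path("geo_submission")  # hypothetical folder name
    if folder.is_dir():
        for name, checksum in checksum_rows(folder):
            # Tab-separated output can be pasted into a two-column sheet.
            print(f"{name}\t{checksum}")
```

Reading in chunks keeps memory use flat for multi-gigabyte sequence files. The file names printed here should be the same names entered in the Metadata Template.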

FAQ - For more FAQs, see https://www.ncbi.nlm.nih.gov/geo/info/seq.html

Human data warning


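DO NOT SUBMIT TO GEO if there are patient privacy concerns regarding making data fully public. Rather, submit to NCBI's dbGaP (http://www.ncbi.nlm.nih.gov/gap/) database,
which has controlled access mechanisms and is an appropriate resource for hosting sensitive patient data. It is the submitter's responsibility to ensure that the submitted information
does not compromise participant privacy and is in accord with the original consent in addition to all applicable laws, regulations, and institutional policies. See
https://www.ncbi.nlm.nih.gov/geo/info/faq.html#patient for more information.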

Multi-part studies
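If you have a multi-part study (e.g., RNA-seq and ChIP-seq data that are part of the same publication), please prepare each part as a separate submission in its own folder (with
metadata sheet, raw files, and processed files). Each part will be assigned a Series accession number. After you finish transferring the multi-part study to the GEO FTP server,
please request that we group the parts under a SuperSeries. A SuperSeries will facilitate access to the related data and establish a convenient reference entry that can be quoted in
your manuscript. Please provide a title for the SuperSeries (for example, the publication title) in your request.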
Raw data already submitted to SRA
It is not necessary for you to submit raw data directly to SRA, but if you did and you want corresponding GEO entries, please include additional information in the Metadata template
(Second tab, at the foot of this page) as follows:

[1] List the BioProject accession (PRJNAnnnn) in a 'BioProject' field of the SERIES section.

[2] Add an 'experiment' column to the SAMPLES section and include the corresponding SRA Experiment accessions (SRXnnnnnn) or SRA Run accessions (SRRnnnnnn) so that
we can create the appropriate links between the SRA Experiments and GEO Samples.

[3] Add a 'BioSample' column to the SAMPLES section and include the corresponding BioSample accessions (SAMNnnnnnn).

You do not need to list the raw files in the Metadata template.
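
Before entering SRA-related accessions in the template, it can help to check that each value at least has the expected prefix. The sketch below is an illustration only: it assumes the prefixes shown in the placeholders above (PRJNA, SRX, SRR, SAMN) followed by one or more digits, since the exact digit counts are not specified here. The function names and the sample values are hypothetical, and accessions issued with other prefixes would be rejected by this check.

```python
import re

# Patterns mirror the placeholders in the instructions above. The number of
# digits is an assumption: any run of one or more digits is accepted.
ACCESSION_PATTERNS = {
    "BioProject": re.compile(r"PRJNA\d+"),
    "experiment": re.compile(r"SR[XR]\d+"),
    "BioSample": re.compile(r"SAMN\d+"),
}


def is_valid_accession(field, value):
    """Return True if `value` matches the expected pattern for `field`."""
    return ACCESSION_PATTERNS[field].fullmatch(value.strip()) is not None


def invalid_entries(entries):
    """Return the (field, value) pairs that fail the pattern check."""
    return [(field, value) for field, value in entries
            if not is_valid_accession(field, value)]


if __name__ == "__main__":
    # Made-up accessions, for illustration only.
    example = [
        ("BioProject", "PRJNA123456"),
        ("experiment", "SRX1234567"),
        ("BioSample", "SAMN 0123"),  # embedded space, so this one fails
    ]
    for field, value in invalid_entries(example):
        print(f"Check the {field} value: {value!r}")
```

A pattern check like this only catches typing mistakes. It does not confirm that an accession exists or belongs to your study.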

Before completing your Metadata Template, please read the following:

All fields in the template must be completed. Failure to do so may result in
significant delays in processing.

Example submissions are found in the EXAMPLE tabs at the foot of this sheet.

Field names should not be edited; any field with a red triangle in the upper-right
corner has pop-up instructions (hover over the field).

SAMPLES Section

Each row represents a GEO Sample record.

Add as many Sample rows as needed for your study.

In the "processed data file" column, the file names must exactly match the
submitted file names.

In the "raw file" column, the file names must exactly match the submitted file
names. A raw file can be listed only once in the SAMPLES section.

Additional "processed data file" or "raw file" columns may be included to
accommodate all of the file names.
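
The two file-name rules above (exact match with the submitted file names, and each raw file listed only once) can be checked before upload. The following is a minimal sketch under stated assumptions: the SAMPLES rows have already been read into Python dictionaries, with a hypothetical "name" key and with the "raw file" and "processed data file" columns collected into lists. It is not a GEO-provided validator.

```python
from collections import Counter


def check_sample_files(samples, submitted_names):
    """Return a list of problems found in the SAMPLES file-name columns.

    samples: list of dicts with keys "name", "raw file", and
             "processed data file" (the last two hold lists of file names).
    submitted_names: names of the files present in the submission folder.
    """
    problems = []
    submitted = set(submitted_names)
    raw_counts = Counter()

    for sample in samples:
        label = sample.get("name", "(unnamed sample)")
        for column in ("raw file", "processed data file"):
            for file_name in sample.get(column, []):
                # Comparison is exact, including case and extension.
                if file_name not in submitted:
                    problems.append(
                        f"{label}: {column} '{file_name}' is not among the submitted files"
                    )
        raw_counts.update(sample.get("raw file", []))

    for file_name, count in sorted(raw_counts.items()):
        if count > 1:
            problems.append(f"raw file '{file_name}' is listed {count} times")
    return problems


if __name__ == "__main__":
    # Hypothetical rows and file names, for illustration only.
    rows = [
        {"name": "ctrl_1", "raw file": ["ctrl_1.fastq.gz"],
         "processed data file": ["counts.txt"]},
        {"name": "ctrl_2", "raw file": ["ctrl_1.fastq.gz"],
         "processed data file": ["Counts.txt"]},
    ]
    for problem in check_sample_files(rows, ["ctrl_1.fastq.gz", "counts.txt"]):
        print(problem)
```

A processed data file may legitimately appear in several rows, so only the raw file column is checked for repeats.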

PROTOCOLS Section

If any of the protocols are applicable to only a subset of Samples, include them as an
additional column in the SAMPLES section instead.

"Data processing steps" include base-calling, alignment, filtering, peak-calling,
generation of normalized abundance measurements, etc.

For each step, provide a description, as well as the software name, version, and
parameters, if applicable.

Include additional steps, as needed.

PAIRED-END EXPERIMENTS Section

For paired-end experiments, list the 2 associated raw files. For experiments that
may have 3 or 4 file names (single cell, SOLiD), include "file name 3" and "file
name 4" columns. Skip this section if you have single-end raw files.
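
Rows for this section can be drafted from the file names themselves when the files follow a read-tag naming convention. The sketch below assumes names contain a tag such as _R1 or _R2 (up to _R4) followed by "_" or "."; that convention is an assumption about your files, not a GEO requirement, and the function name is hypothetical. Files without such a tag, or without a mate, are left out of the result.

```python
import re
from collections import defaultdict

# Matches "_R1" to "_R4" when followed by "_" or "." (an assumed convention).
READ_TAG = re.compile(r"_R([1-4])(?=[_.])")


def paired_rows(file_names):
    """Group raw file names into rows: [file name 1, file name 2, ...]."""
    groups = defaultdict(dict)
    for name in file_names:
        match = READ_TAG.search(name)
        if match is None:
            continue  # no read tag: treated as single-end and skipped
        # Files that differ only in the read tag share the same key.
        key = name[:match.start()] + name[match.end():]
        groups[key][int(match.group(1))] = name

    rows = []
    for key in sorted(groups):
        reads = groups[key]
        if len(reads) >= 2:  # keep only groups with at least two files
            rows.append([reads[number] for number in sorted(reads)])
    return rows


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    names = ["ko_R2.fastq.gz", "ko_R1.fastq.gz", "wt_R1.fastq.gz", "single.fastq.gz"]
    for row in paired_rows(names):
        print("\t".join(row))
```

Review the drafted rows by hand before copying them into the template, since a sample name that itself contains a read-like tag would be grouped incorrectly.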

Links
Categories of data we do and do not accept

Raw data file instructions

Processed data file instructions

Second tab, at the foot of this page

Create folder

FTP instructions

Notify GEO instructions

When will my data receive GEO accession numbers?


