
RSC|ChemSpider – The Online Chemistry Database Where Community Contributions Count
ChemSpider
 The RSC’s Online Chemical Database

 A central hub for chemists to source information


 >28 million unique chemical records
 Aggregated from >400 data sources
 Chemicals, spectra, CIF files, movies, images, podcasts, links to patents, publications, predictions

 A central hub for chemists to deposit & curate data


Answer Questions with ChemSpider
 Questions a chemist might ask…
 What is the melting point of n-heptanol?
 What is the chemical structure of Xanax?
 Chemically, what is phenolphthalein?
 What are the stereocenters of cholesterol?
 Where can I find publications about xylene?
 What are the different trade names for Ketoconazole?
 What is the NMR spectrum of Aspirin?
 What are the safety handling issues for Thymol Blue?
I want to know about “Vincristine”

If all algorithms work, then everything on the page is correct by default except the name!
Vincristine: Identifiers and Properties
Vincristine: Vendors and Sources
Vincristine: Patents
Vincristine: Articles
ChemSpider: Spectra Linked
Spectra Linked
Multiple Spectra for One Structure
ChemSpider ID 24528095: 1H NMR
ChemSpider ID 24528095: 13C NMR
ChemSpider ID 24528095: H-H COSY
ChemSpider ID 24528095: HSQC
ChemSpider ID 24528095: HMBC
About Structures
The InChI Standard
InChIKeys
Search the Web by Structure
InChIs
Searches: The INTERNET

All ChemSpider and Internet searches are “simply algorithms”, but synonym searching is based on an assertion
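The contrast above can be sketched in a few lines of Python. This is a minimal illustration, not ChemSpider's implementation: the function names and the `SYNONYMS` table are hypothetical, and the "pain reliever" entry is invented to show an unvalidated assertion. An InChIKey can be checked by algorithm because its layout is fixed (14 letters, hyphen, 10 letters, hyphen, 1 letter), whereas a name only points to a structure because somebody asserted that it does.

```python
import re
from urllib.parse import quote_plus

# A standard InChIKey is 27 characters: 14 letters, a hyphen,
# 10 letters, a hyphen, and 1 final letter.
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def is_wellformed_inchikey(key: str) -> bool:
    """Check the layout only; this does not prove the key maps to a real structure."""
    return INCHIKEY_PATTERN.fullmatch(key) is not None


def connectivity_block(key: str) -> str:
    """Return the first 14-letter block, which is derived from the connectivity."""
    if not is_wellformed_inchikey(key):
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    return key.split("-")[0]


def web_search_query(key: str) -> str:
    """URL-encode the key in double quotes so it is searched as one exact token."""
    return quote_plus(f'"{key}"')


# Hypothetical synonym table: each name maps to (asserted InChIKey, validated?).
ASPIRIN_KEY = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"
SYNONYMS = {
    "aspirin": (ASPIRIN_KEY, True),
    "acetylsalicylic acid": (ASPIRIN_KEY, True),
    "pain reliever": (ASPIRIN_KEY, False),  # invented, unvalidated assertion
}


def lookup_name(name: str, validated_only: bool = True):
    """Return the asserted InChIKey for a name, or None if absent or unvalidated."""
    entry = SYNONYMS.get(name.strip().lower())
    if entry is None:
        return None
    key, validated = entry
    if validated_only and not validated:
        return None
    return key


if __name__ == "__main__":
    key = lookup_name("Aspirin")
    print(key, is_wellformed_inchikey(key), web_search_query(key))
```

The structure side needs no human judgement, while `lookup_name` is only as good as the people who validated the names behind it.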
Validated Names for Searching…
Scientists are measured by…
 Impact
 Citations
 Papers
 Patents
 Funding

 and increasingly by “Alt-Metrics” – what you say, what you contribute, your data depositions, your code in repositories, your voice in the network, your activities on Facebook (be careful!)
If it was not just about me…
 We might have a community built encyclopedia
 I might know where the best restaurants are
 I might get good advice on books to read
 I might know which movies to watch
 I might know which plumber to call
 Data might just be Open
The Social Network
 Career-wise, NOT having a personal presence online will be a detriment
 Self-marketing
 Establishing a profile
 Getting on the record
 Collaborative Science
 Demonstrating a skill set
 Measured using alternative metrics
 Contributing to the public peer review process
Social Networking Tools
 A growing number of social networking tools:

 Facebook
 Twitter
 LinkedIn
 Flickr
 YouTube
 Blogs
 Communities
 Collaborative environments
Chemistry Social Networking
 Methods of sharing MY chemistry online include:
 Wikis or blogs
 Slideshare for presentations
 YouTube for videos
 Flickr, Wikimedia etc. for images
 PubChem for assay data
 NMRShiftDB for NMR assignments
 Google Docs for data
Your profile online…
Establish a Mendeley Account
http://www.mendeley.com/profiles/antony-williams/
ResearchGate
http://www.researchgate.net/profile/Antony_Williams/
Microsoft Academic Search
http://academic.research.microsoft.com/Author/12789419
The Alt-Metrics Manifesto
 http://altmetrics.org/manifesto/
What is my ImpactStory?
ImpactStory
Enabled by ORCID…
The Linked Network
There is much to be linked
The World of Contribution

 Times have changed


 Immediacy of social networks
 Commenting on articles/data is here
 The “participating scientist” has high profile
 And who can be a scientist now???
A Ten Year Old Scientist
Share Science!!! Not Just Yourself
 If you have time, and the inclination, become a community contributor

 Share your expertise in the new world of openness


 Share your Open Source code
 Share your data and your model
 Share your Figures
 Contribute to Wikis – Wikipedia and others
 Become an Open Notebook Scientist
Expose Data and Figures on FigShare
ChemSpider SyntheticPages
 Many syntheses are not published but are of value

 A database of synthesis procedures built for the community, by the community.

 Peer-reviewed by the community

 Each contribution is assigned a DOI. Develop an online scientific reputation at a time of “micro-publications”

 Integrates semantic mark-up and visualization tools


ChemSpider SyntheticPages
http://cssp.chemspider.com
Submission process
 Register as a user
 Use the Submit button and fill in the fields…
Submission Process
 Submissions reviewed by editorial board

 Published as is or comments sent to author

 Online Peer Review process – engage chemists in ongoing discussions and a feedback loop

 Supported data include web movies, images, live spectra, etc.
Recent Submissions
Interactive Data
Most Accessed
Is it working?
 Show of hands…
 How many of you know ChemSpider?
 How many of you know CSSP?
 Have any of you submitted to CSSP?

 Low submissions but some dedicated authors


Popular Authors
Is it working?
 Show of hands…
 How many of you know CSSP?
 Have any of you submitted to CSSP?

 Low submissions but some dedicated authors

 What reasons would keep you from publishing?


 Time
 Approval from supervisor
 Need to keep the science quiet
 Publishing on CSSP prevents future publishing?
Contributing to The Quality of Data
What is the Structure of Vitamin K?

A lipid cofactor that is required for normal blood clotting. Several forms of vitamin K have been identified: VITAMIN K1 (phytomenadione), derived from plants; VITAMIN K2 (menaquinone), from bacteria; and synthetic naphthoquinone provitamins, VITAMIN K3 (menadione).
What is the Structure of Vitamin K1?
CAS’s Common Chemistry
Wikipedia
Wolfram Alpha
DailyMed
People Use Trusted Resources…
Quality police…
How will it improve?

Participation and contribution
ALL Different, ALL “Domoic Acids”
The EXPERTS must get it right?!
Question Everything Online
Deposition, Annotation and Validation

 ANYBODY can annotate a record on ChemSpider

 Registered users can deposit new data

 Registered users can validate existing data
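The three rules above amount to a small permission table. The sketch below is only an illustration under that reading of the slide; the `Action` enum and `is_allowed` function are hypothetical names, not part of any ChemSpider API.

```python
from enum import Enum


class Action(Enum):
    ANNOTATE = "annotate"
    DEPOSIT = "deposit"
    VALIDATE = "validate"


# Per the slide: anybody can annotate; depositing and validating need registration.
REQUIRES_REGISTRATION = {Action.DEPOSIT, Action.VALIDATE}


def is_allowed(action: Action, registered: bool) -> bool:
    """Return True if a user with the given registration status may perform the action."""
    return registered or action not in REQUIRES_REGISTRATION


if __name__ == "__main__":
    for action in Action:
        print(action.value, "anonymous:", is_allowed(action, False),
              "registered:", is_allowed(action, True))
```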


Curation: Search “Vitamin H”
“Curate” Identifiers
Spectra Linked
Spectral Uploading
 Locate the structure of interest and deposit the spectrum
Spectral Uploading
 Various types of NMR spectra supported
Regular Updates
Web Services
www.SpectralGame.com
http://www.jcheminf.com/content/1/1/9
Spectral Game
Increasing Complexity
SpectralGame in the hand
Work in Progress – 300k Reactions
Data Enabling the RSC Archive
 An archive going back to 1841. A project is underway to “data enable” the archive:

 Extract chemistry – chemicals, reactions, experimental data points, complex data

 Semantic enrichment of the articles for interactive viewing and crowdsourced annotation/curation

 Dramatically expands the types of queries possible across the archive
A model for data segregation

Integrate with institutional repositories


Access to Theses and Dissertations
Model Building with Community Data
 Community data can be the basis of model building

 Consume data from available databases, the RSC archive and new publications, and build predictive algorithms for the community

 Accept research data from the community and include it in predictions
An Open Data-Centric Chemistry Hub
Small organic molecules, Undefined materials, Organometallics, Nanomaterials, Polymers, Minerals, Particle bound, Links to Biologicals
Internet Data, Commercial Software, Pre-competitive Data, Open Science, Open Data, Publishers, Educators, Open Databases, Chemical Vendors
Wikipedia
http://en.wikipedia.org/wiki/Antony_John_Williams
An Interesting Read
http://tinyurl.com/7e3l6rz
ScientistsDB
http://tinyurl.com/7cqylsp
ScientistsDB
 Write your OWN article about yourself on ScientistsDB

 It is a community-policed site, so anything you write might be challenged or edited. It is “your” page but edited by all

 An article, once approved by the community, can, in theory, be moved to Wikipedia

 All content is licensed under the standard CC-BY-SA 3.0 license used by Wikipedia
Acknowledgments
 RSC|ChemSpider team
 CSSP Editorial Team
 All data source providers
 Curators and annotators
 Service providers:
 ACD/Labs
 OpenEye
 GGA Software Services
 Many others…
Communicating Science
 As scientists, one of our primary roles is contribution

 The internet enables contribution in different ways, benefiting the scientist and the community

 Share your data and experience – it can enhance your public profile as a scientist, make you more discoverable and contribute data to the community

 AltMetrics will be a measure of scientists…


Thank you

Email: williamsa@rsc.org
Twitter: ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams
