The idea is to create a store, similar to what we just had presented, for scientific
bioinformation or data products. These sit in multiple different systems right now, and the
limit is still a work in progress, so all of this is manually collected. The scientific
information out of our labs is also used by the research community quite extensively. The
genome portal is freely accessible to commercial, academic, and government customers. They
have to register and have an approved account with ATCC to log in and download anything,
but there is no charge for it. So we were gonna transition to a paid subscription model,
and that requires creating a product in the product catalogue, a SKU if you will, that
represents that digital license.

All of this data will go to the genome portal, and it would be, you know, denoted to the
end users as this is verified followed by data, but it's not something vacancies last.
Nonetheless, we have applied some level of curation to that, which also provides
significant other value. If you're buying from petition, you can actually download the
genome data itself; it's just a text file. There's a built-in genome browser where you can
sort of scroll through the genome, and each of these little boxes here is a different
gene.

ATCC's genome portal allows scientists to search for their favorite organism or cell
line or whatever, bacteria or virus, environment partitions. There's some limited tagging
built in, and there's a limited, rather crude sequence similarity tool that allows you to
use DNA sequence data to search for an organism. The company is working on a
subscription-based access model for sponsored genomes. We'll be the only company in the
world that is doing that; no other biomaterials supplier in the market is doing anything
like that right now. So we'll be able to pull data directly out of our manufacturing labs
to populate a knowledge base for our customers.

The top priority feature that I wanna add to the genome portal is what I'm thinking of as
a taxonomy browser, just so you can visually search and explore what items are in our
content. Then new visualizations to sort of bin or cluster or group certain organisms or
certain products together based on antibiotic resistance data. Trees and metadata.
follow those strict binomial rules so that creates some complexities on the data side but
you can imagine that when you start taking trees and metadata together, you can start
having all kinds of interesting visualizations, where you might have a group of bacteria
that are all related to one another.
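
As a rough illustration of the binning idea described above, here is a minimal Python sketch that groups organisms sharing the same antibiotic resistance profile. Everything in it is an assumption made for illustration: the function name `group_by_resistance`, the strain names, and the resistance records are hypothetical and are not taken from the genome portal or from any ATCC data.

```python
from collections import defaultdict


def group_by_resistance(profiles):
    """Group organism names that share an identical resistance profile.

    profiles maps an organism name to an iterable of antibiotics it resists.
    Returns a dict mapping frozenset(antibiotics) to a sorted list of names.
    """
    groups = defaultdict(list)
    for organism, antibiotics in profiles.items():
        # frozenset ignores ordering, so equal profiles land in the same bin
        groups[frozenset(antibiotics)].append(organism)
    return {profile: sorted(names) for profile, names in groups.items()}


# Hypothetical records, invented purely for illustration.
example_profiles = {
    "strain_A": ["ampicillin", "tetracycline"],
    "strain_B": ["tetracycline", "ampicillin"],
    "strain_C": ["vancomycin"],
    "strain_D": [],
}

bins = group_by_resistance(example_profiles)
for profile, names in bins.items():
    print(sorted(profile), names)
```

Exact-match binning is the simplest possible choice; a real visualization would more likely cluster on partial similarity between profiles, which this sketch does not attempt.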

I don't necessarily need to go into this too much, but I mentioned earlier taking external
data, curating it, and bringing it in. The added advantage of doing that for us is that it
creates an opportunity where I don't have to sequence everything in the catalog. We would
ingest that data into the genome portal, make it available to our engineers, and create
some data visualizations. We talked about clustering or binning or grouping different
organisms together based on some of this data. That would be a great opportunity from a
database perspective, and then my team would obviously have to justify the business case.

ATCC has a high performance computing cluster where we're gonna have all of our pipelines,
and we'll basically be doing all the data processing ourselves. Customers don't have the
option to download the raw data; it's stored in the cloud just for our own backup
purposes, but I would wanna add that to the subscription model as an option. The API and
this taxonomy browser will only allow users to download multiple genomes at the same
time, or in aggregate.

When my lab is done sequencing and that data is ready, we do some very light QC work on
it, just to make sure that, you know, there aren't some obvious things wrong, that
something bad didn't happen in the lab. If everything passes QC, we push to One Codex.
Cryogenic has built a $100 million a year business just taking public data and sharing it
with the biopharma industry. Their contract with Merck alone is worth $2,000,000 a year
just for the license, just to get access to the data 50,000.

The vast majority of publicly available genomics data is produced by academic labs.
There's been a huge amount of diversity in the quality of the data. This makes it really
really challenging for researchers especially in the biopharma space to analyze large
amounts of data. Everybody's using the same tools to do it, or they're home-brewing their
own pipelines in Python or whatever. I mean, some could argue that we're also doing the
same thing. We're doing it better, we're not being everybody else, and that's true, we
are.

DNAnexus is using One Codex's pipeline, and internally our own pipeline is even better
than that. We've benchmarked it; we haven't published it yet, though. looking at supposed
to build an arrow yeah familiar with those. So DNAnexus is like, it's a platform as a
service, like Seven Bridges. There's nothing to stop it. I mean, if a scientist wants to
buy a bacterial strain from us and have his buddy down the hall at his university who has
a sequencer sequence it for him, knock yourself out, but we're probably gonna be doing a
better job than that person is.

MLS failed because the labs that developed the models were using public data that was all
produced using different approaches, methods, and quality, and that creates noise. I mean,
there'll be a lot of those details that I don't have the expertise in knowing, and that's
why I wanna get those two technical developers on two sides. I would like to be completely
free of One Codex by Q3 of 2023. The back end should be ready, but more importantly, the
number of products that you're servicing should exceed what you're currently servicing
with One Codex. Develop a replacement back end, right, that's one strategy; that would be
a multiphase approach.

One Codex's main business is actually totally disconnected from this. That's another
reason why we wanna separate from them, because they're not a web development shop. This
is not their, you know, for them it was a way of sharing the benefits of ATCC's very
strong global brand. I just lost audio for some reason, stop, so I think he was asking
about metadata and hosting. We are doing it now: all the metadata, that's the scientific
kind of technical data that's associated with our products in their catalog, we pull that
data into the genome portal.

One Codex is providing the infrastructure for the front end and the back end. They don't
actually have access to that customer information; it's all through us. The requirement
from my end would be that the subscription model is a self-service kind of platform where
they can log in to ATCC's main website. You can manage who has access yourself; that's
also self-service. I don't know, I haven't seen it, but Sarah was telling me that in there
there's already an off-the-shelf solution and insight where you can say I'm gonna buy five
licenses or 10 licenses.
