
Colon Modeling with Open Source Tool (CMOST)


 
Notes  and  User  Manual  
Program  generated  by  Meher  Prakash  and  Benjamin  Misselwitz  

 
Table of contents:
I. PROGRAM AND REQUIREMENTS
II. CMOST Folder Contents
II.2 CMOST applications
III. Opening the program
IV. CMOST Main Window
IV.1. MAIN WINDOW: PATHS AND SETTINGS
IV.2. Start, patients and additional settings
IV.3. Screening and Surveillance
IV.4. Output
IV.5. Scan variables
IV.6. Automated optimization
IV.6. Options
IV.6.1. Write CMOST batch instructions – Automatic tests
IV.6.2. Write CMOST batch instructions – Repeat identical settings
IV.6.3. Calibration Step 1, 2, 3
IV.6.3. Autocalibration Step 4 rectosigmo
V. CMOST output
V.1. CMOST graphical output
V.1.1. First page of CMOST graphical Results – adenoma prevalence, CRC incidence
V.1.2. Second page of CMOST graphical results: adenoma characteristics
V.1.3. Third page of CMOST graphical results: Cancer characteristics
V.1.4. Fourth page of CMOST graphical results - CRC effects and CRC Screening
V.2. CMOST Results (Matlab variable)
V.3. CMOST Results (Excel File)
VI. Adjusting CMOST
VI.1. Risk
VI.2. Location
VI.3. Mortality
VI.4. Colonoscopy
VI.5. Costs
VI.6. Screening
VI.7. Adenoma dwell time
VII. Parameter optimization
Prakash  et  al.  Manual  for  CMOST:  Colon  modeling  with  an  open  source  tool.                  1  
 
VII.1. Step 1: Early adenomas
VII.1.1. Early adenoma benchmarks
VII.1.2. Optimizing early adenomas
VII.2. Step 2: Advanced adenomas
VII.2.1. Advanced adenoma benchmarks
VII.2.2. Optimizing advanced adenomas
VII.3. Step 3: Optimizing cancer
VII.3.1. Benchmarks for cancer optimization
VII.3.1. Optimizing cancer parameters
VII.4. Step 4: Adjusting direct cancer
VII.4.1. Benchmarks for adjusting direct cancer
VII.4.1. Optimizing direct cancer parameters
VII.5. Suggested scheme for automated calibration
VII.6. Alternative strategy for Step 4, adjustment of direct cancer
VII.7. Concomitant adjustment of Steps 1-3
VII.8. Manual adjustments
VII.9. Default benchmarks
VIII. Speed of calculations
VIII.1. Speed limit
VIII.2. Accelerating speed: the coder option
IX. Running CMOST in batch mode
IX.1. Using the STARTER option
IX.2. Parallel Calculations on a Linux cluster
IX.3. Generating multiple settings files with the “Scan Variables” option
IX.4. Simulating large populations
 
   

 
I.  PROGRAM  AND  REQUIREMENTS  
 
Checking out: The CMOST source code is hosted on GitLab and is available to researchers under the GNU General Public License. The latest version of the code can be checked out in a Linux/UNIX environment with the command:
$ git clone https://gitlab.com/cmostmodel/CMOST.git  

 
or, in a web browser, visit the URL
 
https://gitlab.com/cmostmodel/CMOST  

and  click  on  the  “Download  Zip”  option  as  shown  below.  
   

 
 
This will download the CMOST source code to your machine. Unzip the folder to access the source files.
 
Software  requirements:    
 
The CMOST program runs on Matlab R2011b or later (versions later than R2015b have not been tested).

The computationally intensive part of the Matlab program is converted into a C++ executable (MEX file; for details see Section VIII.2, “Accelerating speed: the coder option”, below). We provide compiled versions for Mac64, Windows64 and Linux64. On any other operating system the MEX file has to be generated with the coder; otherwise calculations for large population sizes will be slow. For this conversion Matlab requires the “coder toolbox” and a supported compiler to be installed.
If no compiled version of the time-critical NumberCrunching subroutine is available, an error message is displayed:
 

 
Suitable compilers include Microsoft Windows SDK 7.1 and Microsoft Visual C++ 2010 on Windows, GNU gcc 4.3.x on Linux, and Apple Xcode 4.1 on Mac OS X.

 
The list of supported compilers for each version of Matlab can be checked on the MathWorks website. For example, the compilers supported by Matlab R2011b are listed at:
http://www.mathworks.com/support/compilers/R2011b/win32.html  

 
System requirements: A standard calculation with 100,000 individuals requires 2 GB of RAM and a 2.5 GHz processor. Calculations with larger populations (a few million individuals might be a routine calculation of interest for health economists) require more RAM. To keep such calculations manageable, the job is split into sub-populations of 100,000 individuals, which are run in parallel on a large computer cluster.
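The splitting described above can be sketched in a few lines. This is an illustrative Python sketch, not part of CMOST (which is written in Matlab); the function name split_population and the example population of 2,350,000 are hypothetical.

```python
def split_population(total, chunk=100_000):
    """Return the sizes of the sub-populations needed to cover `total` individuals.

    Assumption for illustration: every job simulates `chunk` individuals,
    and a final smaller job covers whatever remains.
    """
    if total <= 0 or chunk <= 0:
        raise ValueError("total and chunk must be positive")
    full_jobs, remainder = divmod(total, chunk)
    return [chunk] * full_jobs + ([remainder] if remainder else [])


sizes = split_population(2_350_000)  # hypothetical population size
print(len(sizes), "jobs; last job simulates", sizes[-1], "individuals")
```

Note that the graphical user interface only offers certain run sizes (see Section IV.2), so in practice the remainder may have to be rounded to one of those options.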

II.  CMOST  Folder  Contents  


   
CMOST  folder  includes:    
 
1. Software package implemented in MATLAB

1a. CMOST_Main GUI: Graphical user interface, which serves as the starting point for the microsimulation calculations

1b. CalculateSub: This subroutine processes the inputs, including the risks of adenomas and cancers as well as the preferred screening method, and passes these data to the next module

1c. NumberCrunching: This module handles the most computationally intensive part of the software. It performs the microsimulation by tracking, over the entire population, all adenomas that appear and evolve in each individual

1d. Evaluation: This module processes the results of the microsimulation and presents them as graphical comparisons to benchmarks, as a Matlab data file for further processing and as Excel sheets.

1e. Other CMOST subroutines (to be explained below)

1f. Scripts to run the calculations on a cluster. The scripts are useful for submitting multiple calculations simultaneously on a Linux cluster.

 
 
 
2. Settings folder with pre-calibrated settings files
 
2a.  Parameter  set  with  dwell  time  8  years  (CMOST8.mat)  
2b.  Parameter  set  with  dwell  time  13  years  (CMOST13.mat)  
2c.  Parameter  set  with  dwell  time  19  years  (CMOST19.mat)  
 
3.  Cluster  folder  with  scripts  for  submission  to  cluster  
4.  Codegen  folder  with  MEX  files  compiled  for  running  CMOST  Matlab  code  
efficiently  
5.  Manual  

II.2  CMOST  applications  


CMOST is an open source program with pre-calibrated settings. It caters to the following use cases, which differ in their level of complexity.

| User interest | Manual sections of interest | How to do it | Technical difficulty |
|---|---|---|---|
| Repeat calculations with the same natural history for 100,000 individuals | Sections III, IV, V | Use any one of the pre-calibrated settings from the Settings folder | None |
| Repeat calculations with the same natural history for 100,000 individuals with updated costs, screening scenarios, etc. | Sections III, IV, V. Additionally Section VI for screening options, costs, etc. | Use pre-calibrated settings and update the required parameters | None |
| Repeat calculations with the same benchmarks, but optimize the performance for faster calculations | Sections III, IV, V. Additionally, Section VIII to convert time-consuming parts of the program into a faster C++ equivalent | Use the coder module in Matlab to convert one part of the complete program into a faster MEX format | Low. Requires checking the supported C++ compilers for Matlab (and installing one if needed) and Matlab command line usage |
| Repeat calculations with the same or a different natural history for many more than 100,000 individuals | Sections III, IV, V. Additionally Section IX for cluster scripts | Use the pre-calibrated settings from the Settings folder | Medium. Requires access to a Linux cluster and sufficient Linux knowledge to use the BASH scripts we provide to execute multiple jobs on a cluster |
| Calibrate CMOST to a different natural history using a new choice of benchmarks | Sections III, IV, V. Additionally, Section VII for parameter calibration | Update the benchmarks in the specified formats (early adenomas, advanced adenomas, cancer, etc.). Run the automatic parameter calibration | Medium. Our automatic parameter calibration works with a broad range of benchmarks. We cannot exclude that for some parameter settings the heuristics used in the greedy algorithm may need to be modified |
 
 

III.  Opening  the  program  


 
The   program   CMOST_Main.m   can   be   opened   with   the   Matlab   Editor   and   started  
(press  run,  see  red  arrow).  
 

 
 
This starts the function CMOST_Main.m and opens its graphical user interface.
 
 

   

 
IV.  CMOST  Main  Window  
After  starting  CMOST  the  following  graphical  user  interface  will  appear:  

 
Most  basic  functions  of  CMOST  are  accessible  via  the  graphical  user  interface.    

IV.1.  MAIN  WINDOW:  PATHS  AND  SETTINGS    


 

 
 
Settings  name:     Name  of  the  current  settings  of  CMOST    
Save  data  path:     Results  of  CMOST  calculations  will  be  saved  under  this  path  
Browse  button:     Browse  for  another  path  to  save  data  
Comment:       These  comments  will  be  saved  with  the  settings    
Default  Settings:     Since  the  current  settings  might  be  derived  from  a  different  set  
of  settings,  the  name  of  the  original  settings  will  be  saved.  
Load  Settings  button:  Browse  to  load  saved  settings  
Save  Settings  button:  Save  current  settings  for  future  use  
 
 
Comment: Saved settings can be regarded as instructions for an automated run of CMOST. They can be executed either with the Starter option in the graphical user interface or on the cluster.
 

IV.2.  Start,  patients  and  additional  settings    


(upper  left  part  of  main  window)  
 

 
 
Number patients:   Adjust the number of patients for the next run of CMOST. For technical reasons (usage of the coder programming module in Matlab) only 4 options are available: 10,000, 25,000, 50,000 and 100,000 patients. Larger numbers of patients are difficult for Matlab to handle due to memory limitations. If necessary, several runs with 100,000 patients should be combined.
START:   Start a run of CMOST with the current settings
 
Starter:     Starter  option  allows  loading  several  saved  runs  of  CMOST  which  can  
be  calculated  sequentially  
Start  batch:   Start  calculations  of  the  files  selected  with  Starter  option  
 
Risk:     Opens menu to adjust the distribution of individual risk of adenoma development and of early and advanced adenoma progression (see below)
Location:   Opens  menu  to  adjust  behavior  of  CMOST  in  various  locations  within  
the   colon   (i.e.   adenoma   appearance,   progression,   detection   by  
colonoscopy);  see  below.  
Mortality:   Opens  menu  to  adjust  CRC  mortality  according  to  age    
Colonoscopy:  Opens  menu  to  adjust  CMOST  settings  regarding  colonoscopy    
(for  instance  complications)  
Costs:     Opens  menu  to  adjust  costs  for  interventions  and  CRC  treatment  
Screening:   Opens  a  menu  to  adjust  settings  for  CRC  screening  

 
IV.3.  Screening  and  Surveillance    
(lower  left  in  main  window)  
 

 
Enable CRC screening: If this box is checked, CRC screening will be performed (screening parameters are adjusted in the Screening window)
Enable adenoma surveillance: If this box is checked, surveillance colonoscopies will be performed (5 years after detection of 1 or 2 early adenomas, 3 years after detection of an advanced adenoma, and every 5 years thereafter)
Enable cancer surveillance: If this box is checked, CRC surveillance colonoscopies will be performed 1 and 4 years after CRC detection and every 5 years thereafter.
Special scenarios: If this box is checked, one of the special scenarios below will be considered during CMOST calculations
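The surveillance intervals stated above can be written down as a small schedule. The following Python sketch only illustrates the rules as worded in this manual; it is not taken from the CMOST source code. The function names, the max_age cut-off, the precedence of advanced over early adenomas and the reading of "every 5 years thereafter" as starting from the 4-year examination are assumptions.

```python
def first_adenoma_interval(n_early, has_advanced):
    """Years until the first surveillance colonoscopy, or None if no rule is stated."""
    if has_advanced:
        return 3  # advanced adenoma detected (assumed to take precedence)
    if n_early in (1, 2):
        return 5  # 1 or 2 early adenomas detected
    return None  # case not covered by the text above


def cancer_surveillance_ages(detection_age, max_age=100):
    """Ages of surveillance colonoscopies after CRC detection.

    Assumed reading: +1 and +4 years, then every 5 years (i.e. +9, +14, ...).
    max_age is a hypothetical cut-off for the illustration.
    """
    ages = [detection_age + 1, detection_age + 4]
    next_age = detection_age + 9
    while next_age <= max_age:
        ages.append(next_age)
        next_age += 5
    return [age for age in ages if age <= max_age]


print(first_adenoma_interval(2, False))          # 5
print(cancer_surveillance_ages(60, max_age=80))  # [61, 64, 69, 74, 79]
```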
 
Special scenarios: The text indicates the special scenario. Several strings are known to CMOST (see below). If a different string is entered, the special scenario will be ignored. The following special scenarios are hard coded:
- “RS-Atkin”: simulation of the rectosigmoidoscopy study by Atkin et al. (Lancet 2012), treatment arm
- “RS-Atkin_Mock”: simulation of the rectosigmoidoscopy study by Atkin et al. (Lancet 2012), control arm without screening
- “RS-Schoen”: simulation of the rectosigmoidoscopy study by Schoen et al. (NEJM 2012), treatment arm
- “RS-Schoen_Mock”: simulation of the rectosigmoidoscopy study by Schoen et al. (NEJM 2012), control arm without screening
- “RS-Holme”: simulation of the rectosigmoidoscopy study by Holme et al. (JAMA 2014), treatment arm
- “RS-Holme_Mock”: simulation of the rectosigmoidoscopy study by Holme et al. (JAMA 2014), control arm without screening
- “RS-Segnan”: simulation of the rectosigmoidoscopy study by Segnan et al. (JNCI 2011), treatment arm
- “RS-Segnan_Mock”: simulation of the rectosigmoidoscopy study by Segnan et al. (JNCI 2011), control arm without screening
- “perfect”: perfect intervention, i.e. simulation of the “maximum clinical incidence reduction”. At age 65 all colonic lesions will be removed and the evolution of new lesions will be followed (van Ballegooijen et al., Med. Dec Making 2011)
- “Po+-55”: at age 55 all individuals with and without adenomas will be marked, so that differences between both populations can be distinguished
- “Po+-55treated”: as above, but the adenomas will be removed; thereby, individuals with treated adenomas at age 55 can be compared with individuals without adenomas (Kuntz et al., Med. Dec Making 2011)
- “Kolo1”: A single colonoscopy will be performed at a given age (technical comment: the age can be found at the third position in the colonoscopy screening window or at handles.Variables.Screening.Colonoscopy(3))
- “Kolo2”: Two colonoscopies will be performed at pre-specified ages. The age (year) of the second colonoscopy can be found at the fourth position in the colonoscopy screening window or at handles.Variables.Screening.Colonoscopy(4)
- “Kolo3”: Three colonoscopies will be performed at pre-specified ages. The age (year) of the third colonoscopy can be found at the fifth position in the colonoscopy screening window or at handles.Variables.Screening.Colonoscopy(5)
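The handling of scenario strings can be pictured as a simple lookup. The Python sketch below is an illustration only and does not reproduce the CMOST code: the function names are hypothetical, exact and case-sensitive matching is an assumption, and the example settings vector is invented. The only facts taken from this manual are the scenario names and the Matlab positions 3, 4 and 5 of handles.Variables.Screening.Colonoscopy.

```python
KNOWN_SCENARIOS = {
    "RS-Atkin", "RS-Atkin_Mock", "RS-Schoen", "RS-Schoen_Mock",
    "RS-Holme", "RS-Holme_Mock", "RS-Segnan", "RS-Segnan_Mock",
    "perfect", "Po+-55", "Po+-55treated", "Kolo1", "Kolo2", "Kolo3",
}


def active_scenario(box_checked, text):
    """Return the scenario name, or None if the box is unchecked or the string is unknown."""
    if box_checked and text in KNOWN_SCENARIOS:  # assumed: exact match
        return text
    return None


def kolo_ages(scenario, colonoscopy_settings):
    """Ages for Kolo1..Kolo3, read from Matlab positions 3, 4, 5.

    Matlab indices start at 1, Python indices at 0, hence the offset of 2.
    """
    count = {"Kolo1": 1, "Kolo2": 2, "Kolo3": 3}[scenario]
    return list(colonoscopy_settings[2:2 + count])


settings = [0, 0, 50, 60, 70]  # hypothetical content of the settings vector
print(active_scenario(True, "RS-Atkin"))  # RS-Atkin
print(active_scenario(True, "unknown"))   # None
print(kolo_ages("Kolo2", settings))       # [50, 60]
```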

IV.4.  Output  
 

 
During a run of CMOST the program can generate a results file (Matlab variable), an Excel file and/or PDF figures. Output will be saved in the results folder (see Paths and Settings above).
 

IV.5.  Scan  variables  


 

 
Variables Scan: enables systematic scanning of CMOST parameters. Up to 5 parameters can be modified in parallel, or 2 or 3 individual parameters can be scanned independently (thus generating a 2D or 3D matrix of parameters). The result is a set of settings files, which can be run on a Linux cluster or with the Starter option.
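The difference between the two scan modes can be illustrated as follows. This Python sketch is not CMOST code; it assumes that "in parallel" means all parameters change together from one settings file to the next, while "independently" means every combination of values is generated. The parameter names param_a and param_b are hypothetical.

```python
from itertools import product


def scan_in_parallel(values_by_name):
    """Vary all parameters together: setting i takes the i-th value of every list."""
    lengths = {len(values) for values in values_by_name.values()}
    if len(lengths) != 1:
        raise ValueError("a parallel scan needs value lists of equal length")
    names = list(values_by_name)
    return [dict(zip(names, combo)) for combo in zip(*values_by_name.values())]


def scan_independently(values_by_name):
    """Vary parameters independently: one setting per combination (2D or 3D grid)."""
    names = list(values_by_name)
    return [dict(zip(names, combo)) for combo in product(*values_by_name.values())]


grid = {"param_a": [1, 2, 3], "param_b": [10, 20, 30]}
print(len(scan_in_parallel(grid)))    # 3 settings files
print(len(scan_independently(grid)))  # 9 settings files (3 x 3 matrix)
```

An independent scan grows multiplicatively with the number of values per parameter, which is one reason to run the resulting settings files on a cluster.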
 
IV.6.  Automated  optimization  
 

 
 
Optimization   of   parameters   required   for   calibrating   the   model   will   be   done   in   4  
separate   steps.   Step   1   deals   with   early   adenoma   prevalence,   progression   and  
distribution.   Step   2   deals   with   advanced   adenoma   prevalence   and   distribution.   Step  
3   deals   with   carcinoma   incidence,   fast   cancer   (derived   from   adenoma   precursors  
other  than  P6)  and  rectum  carcinoma.  Step  4  adjusts  the  direct  cancer  rate  (cancer  
without  adenomatous  precursors).        
Step  1  bench  marks:  opens  a  window  to  adjust  benchmarks  for  Step  1  
Step  2  bench  marks:  opens  a  window  to  adjust  benchmarks  for  Step  2  
Step  3  bench  marks:  opens  a  window  to  adjust  benchmarks  for  Step  3  
Step  4  bench  marks:  opens  a  window  to  adjust  benchmarks  for  Step  4  
 
Optimize  step  1:  opens  a  window  for  automatic  parameter  adjustment  step  1      
Optimize  step  2:  opens  a  window  for  automatic  parameter  adjustment  step  2      
Optimize  step  3:  opens  a  window  for  automatic  parameter  adjustment  step  3      
Optimize  step  4:  opens  a  window  for  automatic  parameter  adjustment  step  4      
 
Manual   adjustments:   Opens   a   window   to   visualize   and   manually   adjust   all  
parameters  important  for  a  run  of  CMOST  (these  are  the  parameters  automatically  
adjusted  by  Step  1  –  Step  4).  
 
Default  benchmarks:  No  window  will  be  opened.  The  current  benchmarks  will  be  
replaced  by  benchmarks  provided  by  the  authors    
   

 
IV.6.  Options  

 
 
Some  functionality  is  available  via  this  dropdown  menu  

IV.6.1.  Write  CMOST  batch  instructions  –  Automatic  tests  


A larger series of instruction files for CMOST is written, enabling testing of standard features of CMOST. These include most of the tests from the CMOST manuscripts. The user is asked to select the folder where the files will be saved and the number of replicas.
Tests include:
- Screening colonoscopy (ages 50-70, every 10 years with follow up)
- Rectosigmoidoscopy screening (ages 50-70, every 10 years with follow up)
- Screening with FOBT, I-FOBT, Hemoccult Sensa, no intervention for comparison
- Simulation of randomized controlled trials for rectosigmoidoscopy screening (according to Atkin et al., Schoen et al., Holme et al. and Segnan et al.) with intervention and Mock
- Comparison of individuals with and without adenomas at age 55, treated and untreated
- “perfect intervention” (i.e. “maximum clinical incidence reduction”, see CMOST for details)
These instruction files can either be transferred to a Linux cluster or calculated using the Starter option (see the appropriate section of this Manual). In the folder “Scripts” in the CMOST folder, the file “Evaluate standard settings” will be helpful for evaluations (please note that writing Excel files does not work on a Mac computer).

IV.6.2.  Write  CMOST  batch  instructions  –  Repeat  identical  settings  


You are asked to select a folder and the number of replicas. The current settings of CMOST will then be saved in the chosen number of copies. This is useful, for instance, if several parallel calibrations are needed.

IV.6.3.  Calibration  Step  1,  2,  3  


Autocalibration  123  

 
Performs automated calibration Steps 1-3 in one procedure, without graphical output and without detailed reporting of results. This may take several hours. See Section VII.7 for more detail.

Autocalibration 123 bootstrapping
As above, with bootstrapping (see Section VII.7 for more detail). Please note that bootstrapping has not been sufficiently tested and its usage is not recommended.

IV.6.3.  Autocalibration  Step  4  rectosigmo  


Autocalibration  RS  tests  writing  
Writes a battery of tests for titrating the best CancerVariable for direct cancer, i.e. cancer which appears without adenomatous precursors (Step 4 of Autocalibration). These files can be calculated using the Starter option or the cluster. See Section “VII.6. Alternative strategy for Step 4, adjustment of direct cancer” for more detail.

Autocalibration RS tests writing
The user is asked to select a folder with calculated files for RS tests (see above). The results will be plotted, and the best CancerVariable will be selected and transferred to the current settings (if the user agrees). See Section “VII.6. Alternative strategy for Step 4, adjustment of direct cancer” for more detail.

V.  CMOST  output  
 
Running a CMOST calculation generates three different types of output:

1. Graphical output of the natural history, compared to the benchmarks used. The output is saved in PDF format as OUTPUT_NAME_1.pdf, …_2.pdf, …_3.pdf, OUTPUT_NAME_4.pdf.
2. Excel format results file, summarizing the number of colonoscopies, number of cancers, etc. This is saved as OUTPUT_NAME.xls.
3. Matlab format results file, which contains all detailed results including developed or detected adenomas, cancers, costs, etc. This results file is saved under the name OUTPUT_NAME_Results.mat and can be used for incidence calculations, cost calculations, etc.
All these results are saved in the results folder (see IV.1).
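When post-processing many runs it can help to list the files a run is expected to produce. The Python sketch below merely applies the naming pattern described above; the function name, its arguments and the example folder are hypothetical, and the sketch does not check which output options were actually enabled in CMOST.

```python
from pathlib import PurePosixPath


def expected_outputs(results_folder, output_name, pdf=True, excel=True, matlab=True):
    """List the output files of one run, following the naming pattern in this section."""
    folder = PurePosixPath(results_folder)
    files = []
    if pdf:
        # four pages of graphical output: OUTPUT_NAME_1.pdf ... OUTPUT_NAME_4.pdf
        files += [folder / f"{output_name}_{page}.pdf" for page in range(1, 5)]
    if excel:
        files.append(folder / f"{output_name}.xls")
    if matlab:
        files.append(folder / f"{output_name}_Results.mat")
    return [str(path) for path in files]


for name in expected_outputs("results", "run1"):  # hypothetical folder and run name
    print(name)
```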

V.1.  CMOST  graphical  output  


If  the  “generate  pdf  file”  option  is  checked,  after  a  run  of  CMOST  (i.e.  after  pressing  
“Start”)  4  different  pdf  files  are  generated.  CMOST  also  generates  Matlab  figures.  For  
convenience  these  figures  are  not  closed  after  saving  the  pdf  files.  
 
 

V.1.1. First page of CMOST graphical Results – adenoma prevalence, CRC incidence
The   first   page   deals   with   early   adenoma   prevalence   (left   column),   advanced  
adenoma   prevalence   (middle   column)   and   carcinoma   incidence   (right   column).  
Results   are   indicated   separately   for   males   (upper   row),   females   (middle   row)   and  
overall  (lower  row).    
   
[Figure: first page of graphical results. Panels “early polyps present”, “advanced polyps present” and “cancer incidence”, each shown for the overall, male and female population. x-axis: year (0 to 100); y-axis: % of survivors (adenoma panels) or cases per 100'000 per year (cancer incidence).]

Early and advanced adenoma prevalence and CRC incidence for the overall population. Similar graphs are generated for males and females. Benchmarks (as adjusted in the benchmark windows) are indicated in blue; actual results of CMOST calculations are indicated by the black line. Data points corresponding to the benchmarks are plotted separately in green (if the respective point is situated within an arbitrary 20% tolerance from the benchmark) or in red (if the data point is outside this tolerance range). Note: no tolerance is indicated for cancer incidence <20 years.

V.1.2. Second page of CMOST graphical results: adenoma characteristics

 
Page 2 upper row: distribution of adenomas-1

[Figure: panels “all adenomas present” (x-axis: year; y-axis: % of survivors; legend: all early, at least P2, at least P3, at least P4, at least P5, P6), adenoma stage distribution (y-axis: % of adenomas; bars: Ear.Ad., BM, Adv.Ad., BM; legend: Adenoma 3mm, 5mm, 7mm, 9mm, Adv Adenoma P5, Adv Adenoma P6) and “number of adenomas” (x-axis: year; y-axis: % of patients with adenomas; legend: 1, 2, 3, 4, >4 adenomas).]

Left: The fraction of patients with at least 1 early adenoma of at least the indicated stage is shown as a function of age (P1: 3mm, P2: 5mm, P3: 7mm, P4: 9mm, P5: 1cm or advanced histology, P6: 2cm).
Middle: Adenoma stage distribution. The fraction of 3mm, 5mm, 7mm and 9mm early adenomas among all early adenomas is indicated; BM: benchmark. The green line indicates agreement with the benchmark, the red line indicates a deviation of at least 20%. The right bars indicate the fraction of >1cm (or advanced histology) and >2cm adenomas and the respective benchmarks (BM).
Right: Fraction of adenoma patients with only 1 adenoma, 2 adenomas etc., plotted as a function of age.

Page 2 middle row: distribution of adenomas-2

[Figure: panels “young population (40-54y)”, “intermediate population (55-74y)” and “old population (75-90y)”. x-axis: min number polyps (1 to 5); y-axis: % of population.]

Left: Distribution of adenomas in the young population (age 40-54 years). The fraction of the population with 1, 2, 3, 4 and 5 or more adenomas is indicated. Benchmarks are indicated in blue. Please note that for the young population benchmarking has been inactivated and the color of the markers is meaningless.
Middle: Fraction of individuals with a given number of adenomas for the intermediate population (55-74 years) with benchmarks (blue) and agreement with benchmarks (green) or disagreement (red) with a tolerance of 20%. Please note that 5 indicates 5 or more adenomas. Only this plot (but not the young or the old population) will be used for automated calibration in Step 1.
Right: Fraction of individuals with a given number of adenomas for the old population (75-90 years). Colors and labels as for the intermediate population.

Page 2 lower row: adenoma summary, origin of cancer

[Figure: panels “summary number polyps” (number of adenomas per decade of age, plus adenoma, advanced adenoma and carcinoma prevalence in the screening population, 50-80y), “Relative danger adenomas” (one row per precursor stage: Ad 3mm to Adv P6) and “origin of cancer” (x-axis: decade; y-axis: % of all cancer; legend: Adenoma 3mm, 5mm, 7mm, 9mm, Adv Ad P5, Adv Ad P6, direct).]
 
Left: Average number of adenomas for individuals of a given age (mean and standard deviation). The fraction of individuals of the screening population (50-80 years) with adenomas, advanced adenomas and cancer is also indicated.
Middle:    Relative  danger  adenomas.  CMOST  assumes  that  most  cancers  derive  from  
advanced   adenomas   (P6).   However,   a   fraction   of   cancers   will   also   derive   from  
smaller  adenomas  (“fast  cancer”).  Relative  danger  indicates  the  origin  of  all  cancers  
with  adenomatous  precursors  (most  from  large  adenomas,  very  few  also  from  small  
adenomas).   The   left   column   indicates   numbers   from   the   current   run,   the   middle  
column  benchmarks  and  the  right  column  agreement  with  benchmarks  (within  20%  
tolerance  –  green,  outside  20%  tolerance  –  red).  
Right: Origin of carcinomas. The origin of carcinomas, either "direct" (i.e. without adenomatous precursors) or an adenoma of a given stage (3, 5, 7, 9 mm, P5 or P6), is indicated as a function of age.
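
The green/red marking used throughout the graphical output can be illustrated with a short sketch. This is not CMOST code: the function name `classify` is hypothetical, and the sketch assumes that the 20% tolerance is measured relative to the benchmark value. The number pairs are the current-run and benchmark values shown in the example "Relative danger adenomas" panel.

```python
TOLERANCE = 0.20  # 20% tolerance as stated in the manual


def classify(current, benchmark, tolerance=TOLERANCE):
    """Return 'green' if current lies within tolerance of benchmark, else 'red'.

    Assumption: the deviation is measured relative to the benchmark value.
    """
    if benchmark == 0:
        return "green" if current == 0 else "red"
    deviation = abs(current - benchmark) / abs(benchmark)
    return "green" if deviation <= tolerance else "red"


# (current run, benchmark) pairs from the example panel
example_rows = [
    (0.009, 0.021),
    (0.12, 0.214),
    (0.285, 0.342),
    (0.593, 1.069),
    (8.571, 12.829),
    (90.421, 85.525),
]

colors = [classify(cur, bm) for cur, bm in example_rows]
for (cur, bm), color in zip(example_rows, colors):
    print(f"{cur:>8} vs {bm:>8}: {color}")
```

Under this assumption the sketch reproduces the colors printed in the example panel.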
 

 
V.1.3. Third page of CMOST graphical results: Cancer characteristics
Page 3 upper row: stage distribution of colorectal cancer

[Figure: three panels, "stage distribution screening", "stage distribution symptomatic cancer", "stage distribution follow up"; x-axis: <50, 50+, 60+, 70+, 80+, 90+, all, b-mark; y-axis: % of affected patients; legend: Stage I, Stage II, Stage III, Stage IV]

The distribution of cancer stages (stage I ... stage IV) is labeled by colors as indicated in the legend. Stage distribution is provided per decade (<50y, 50-59, 60-69, 70-79, 80-89, ≥90) and for all patients; the benchmark is indicated in each panel (b-mark). Green color indicates agreement with the benchmark (within 20% tolerance), red color indicates disagreement with the benchmark.
CMOST provides results for cancers detected during screening (left), for spontaneous (i.e. symptomatic) cancer (middle) and for cancer detected during surveillance (right).
Page 3 middle row: CRC location

[Figure: three panels, "Stage distribution per location" (x-axis: all, Rectum, Right, Rest; y-axis: % of affected patients), "fraction rectum carcinoma male" and "fraction rectum carcinoma female" (x-axis: year; y-axis: % rectum of all ca)]

Left: Percentage of carcinoma and stage distribution of cancer according to location (all, rectum, right colon, remaining colon) are indicated.
Middle/right: Rectum carcinoma as a fraction of all carcinoma in males (middle) and females (right) is indicated according to age. The black line indicates (noisy) raw numbers, blue markers indicate benchmarks. Red or green markers indicate an average of 5 years around the year of the benchmark. Green color indicates that the value is within 20% tolerance of the benchmark, red color indicates a value outside that tolerance.

Page 3 lower row: direct cancer, dwell time, cumulative cancer

[Figure: three panels, "fraction of all carcinoma without polyp precursor" (bars: all Ca, right side; y-axis: % direct cancer), "dwell time all Ca" (x-axis: decade; y-axis: years), "cumulative cancer" (x-axis: year; y-axis: % of all patients; legend: present, diagnosed, multiple cancer)]

Left: Fraction of direct cancer (i.e. cancer without adenomatous precursors) among all cancer (left bar) and among right-sided cancer (right bar). The green line indicates a 20% value, the red line a 50% value.
Please note that CMOST assumes that many serrated/flat/hard-to-detect lesions preferentially reside in the right colon. Since direct cancer also represents these lesions, direct cancer is located preferentially within the right colon.
Middle: Box plots of adenoma dwell time (defined as the time from appearance of an adenoma to appearance of colon cancer) are shown for each decade. The box indicates the 25th and 75th percentile of adenoma dwell times. Outliers are indicated by red crosses.
Right: Cumulative cancer as a function of age. The black line indicates the fraction of patients with cancer (present or past, diagnosed or undiagnosed). The blue line indicates the fraction of patients with a past or present diagnosis of cancer. The green line indicates the fraction of patients with multiple cancers.
 

 
V.1.4. Fourth page of CMOST graphical results - CRC effects and CRC Screening
Page 4 upper row: CRC mortality

[Figure: three panels, "Cancer mortality per year male", "Cancer mortality per year female", "Cancer mortality per year overall"; x-axis: year; y-axis: per 100 000 per year]

CRC related mortality (according to the SEER database) is indicated as a function of the age of the patient for males (left), females (middle) and overall (right). Benchmarks are indicated as blue markers; agreement (within 20%) is indicated by a green color.

Page 4 middle row: Survival, cause of death, indication colonoscopy

[Figure: three panels, "Survival" (x-axis: year; y-axis: % of all patients), "cause of death" (x-axis: decade; y-axis: number patients; legend: natural, cancer, colonoscopy), "reasons for colonoscopies" (x-axis: year; y-axis: number patients; legend: symptoms, screening, follow up, baseline)]

Left: Survival. Percentage of all surviving patients (black), males (blue) and females (red) at a given age. Please note that due to the low CRC incidence, changes in CRC related survival are usually not detectable in this plot.
Middle: Causes of death (natural, CRC related, endoscopy related) are indicated. Please note the logarithmic y-scale.
Right: Indications (reasons) for colonoscopy (CRC related symptoms, CRC screening, or CRC surveillance, i.e. follow up of adenoma or carcinoma) are indicated. Please note the logarithmic y-scale.

Page 4 lower row: adenomas removed, dollars spent, summary

[Figure: three panels, "adenomas removed" (x-axis: year; y-axis: number adenomas; legend: early, advanced), "dollars spent per person" (x-axis: year; y-axis: US Dollar; legend: all cost, treatment, screening, follow up), and a text summary panel]

Example text of the summary panel:
population: 100000 patients
age: all: 74.6, male: 73.1, female: 76.11
0 screening colos performed
5488 symptom colos performed
13022 follow up colos performed
0 custom tests performed
2182 patients died of CRC
26711.2 years lost to CRC
0 pat. died due to colo
0 years lost to colo
3.2324e+08 total CRR rel costs
comment: no comment please
settings: Default_Settings_March_2016

Left:  Total  number  of  early  (red)  and  advanced  (black)  adenomas  removed.  
Middle:  CRC  related  health  care  costs  (related  to  CRC  treatment,  CRC  screening  and  
CRC  follow  up)  are  indicated.    
Right:  Verbal  summary  of  the  current  run  of  CMOST.  

V.2.  CMOST  Results  (Matlab  variable)  


Most of the results presented by the graphical user interface are also available as a Matlab output file. The results file OUTPUT_NAME_Results.mat stores the results in a Matlab structure. The variables stored in this structure can be seen by first loading the results file into Matlab and then looking up the names of its fields.
 

 
 
 
The descriptions of the benchmarks, the benchmark values and the corresponding results from CMOST calculations are stored in the fields BM_Description, Benchmark, and BM_Value. The number of early cancers appearing, for example, is retrieved by typing OUTPUT_NAME_Results.Early_Cancer. See the Matlab documentation for more detail on working with structures.
 

V.3.  CMOST  Results  (Excel  File)  


CMOST also provides the results of a given calculation as an Excel file with self-explanatory column headers. Please note that the Mac version of Matlab does not support the xlswrite function, so Excel files cannot be generated on a Macintosh.

   

 
VI.  Adjusting  CMOST  
VI.1.  Risk  
 

 
 
Individual  risk  (upper  left):  At  birth,  an  individual  adenoma  risk  is  assigned  to  each  
patient.   This   risk   determines   the   likelihood   of   the   appearance   of   a   new   adenoma  
(other  factors  determining  adenoma  risk  are  age,  gender).  A  distribution  of  relative  
risks  is  plotted  and  the  relative  risk  for  a  given  percentage  of  individuals  is  indicated  
(12  bins:  0%,  10%,  20%...  97%,  100%).    
At birth, a specific risk is drawn for each individual from the distribution indicated by the graph. A value of 0 indicates no risk of ever getting an adenoma. A high number (compared to the other numbers in this plot) indicates a high adenoma risk (and subsequently a high cancer risk).
Note: Individual risk is adjusted during Step 1 of automated parameter calibration. During this step, the anchor points are adjusted automatically.
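
Drawing a risk from a distribution that is defined by anchor points can be sketched as follows. This is an illustration only and not taken from the CMOST source: the anchor values are placeholders, the function names are hypothetical, and linear interpolation between anchor points is an assumption.

```python
import random
from bisect import bisect_right

# Hypothetical anchor points: (fraction of population, relative risk).
# The real CMOST anchors and their values differ; these are placeholders.
ANCHORS = [(0.0, 0.0), (0.5, 0.5), (0.9, 2.0), (1.0, 10.0)]


def risk_at_percentile(p, anchors=ANCHORS):
    """Linearly interpolate the relative risk at population fraction p (0..1)."""
    xs = [x for x, _ in anchors]
    ys = [y for _, y in anchors]
    if p <= xs[0]:
        return ys[0]
    if p >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, p)  # xs[i - 1] <= p < xs[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (p - x0) / (x1 - x0)


def draw_individual_risk(rng):
    """Draw one risk 'at birth': pick a uniform percentile, then look up its risk."""
    return risk_at_percentile(rng.random())


rng = random.Random(0)  # fixed seed so the sketch is reproducible
risks = [draw_individual_risk(rng) for _ in range(5)]
print(risks)
```

A steeper curve in the upper percentiles produces a few individuals with a much higher risk than the rest of the population.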
Adenoma risk, early progression (upper right): Risk of progression for an early adenoma. At adenoma initiation, an early adenoma progression risk and an advanced adenoma progression risk are assigned to each adenoma. Each risk is randomly drawn from the distribution of risks indicated by the graph and remains constant throughout the lifetime of the adenoma. A low number indicates slow progression of an early adenoma, a high number fast progression (i.e. a dangerous adenoma quickly transforming into an advanced adenoma). The distribution of risks can be adjusted as described for individual adenoma risk.
 
Adenoma   risk   –   advanced   progression   (lower   right):   Risk   for   progression   of   an  
advanced  adenoma  and  subsequent  transformation  to  cancer.  The  progression  risk  
is  randomly  drawn  at  the  initiation  of  an  adenoma  and  remains  constant  throughout  
the  lifetime  of  an  adenoma.  Adjustment  as  described  above.  
Correlate: if this box is checked, early and advanced progression for a given adenoma will correlate (i.e. if the 50th percentile of early adenoma progression is selected, the 50th percentile of advanced adenoma progression will also be selected, resulting in an adenoma of intermediate progression speed). Default and recommended option: ON
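
The effect of the Correlate option can be sketched as follows. The function name and its arguments are hypothetical; the sketch assumes that "correlate" means using the same percentile for both risk curves, as in the 50th percentile example above.

```python
import random


def draw_progression_percentiles(rng, correlate=True):
    """Return (early_percentile, advanced_percentile) for one new adenoma.

    With correlate=True the same percentile is used for both curves, so an
    adenoma at the 50th percentile of early progression is also at the 50th
    percentile of advanced progression. With correlate=False the two
    percentiles are drawn independently.
    """
    early = rng.random()
    advanced = early if correlate else rng.random()
    return early, advanced


rng = random.Random(1)  # fixed seed so the sketch is reproducible
e, a = draw_progression_percentiles(rng, correlate=True)
e2, a2 = draw_progression_percentiles(rng, correlate=False)
print(e, a)
print(e2, a2)
```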
Note 1: The risk distribution for early and advanced adenomas is a critical parameter. A steep curve, i.e. an uneven distribution of early and advanced progression risks, results in a short adenoma dwell time. Vice versa, a flat curve results in a long adenoma dwell time.
Note 2: The risk distributions for early and advanced adenomas have to be adjusted manually (no automated adjustment is available). The only way to adjust adenoma dwell time is to adjust the risk curves and then run automated parameter calibration. Since parameter calibration also influences adenoma dwell time, the effect of adjusting the risk curves might partially bounce back, and several rounds of risk curve adjustment and parameter calibration might be necessary.
Note 3: We provide 3 preset versions of CMOST with 8, 13 and 19 years dwell time. We did not succeed in calibrating a version of CMOST with <8 years dwell time.
 
Technical comment: The chance of actual adenoma progression is the product of the individual adenoma-specific progression rate, the age-specific progression rate, the gender-specific progression rate and the location-specific progression rate (this is true for early and advanced adenomas).
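
As a small worked example of this product, the following sketch multiplies four factors. The function name and all numeric values are hypothetical and chosen only for illustration.

```python
from math import prod


def progression_chance(adenoma_rate, age_rate, gender_rate, location_rate):
    """Chance of progression as the product of the four factors named in the text."""
    return prod([adenoma_rate, age_rate, gender_rate, location_rate])


# Hypothetical factor values for one adenoma in one simulated year
chance = progression_chance(
    adenoma_rate=0.02, age_rate=1.5, gender_rate=1.0, location_rate=2.0
)
print(chance)
```

Because the factors multiply, doubling any single factor doubles the resulting chance of progression.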

VI.2.  Location  

 
 
 
This  window  allows  for  adjustments  in  adenoma  behavior  according  to  the  location  
of   the   adenoma.   In   CMOST,   13   colon   segments   are   considered   (1:   cecum   –   13:  
rectum).      
New adenoma: the numbers indicate the fraction of new adenomas situated in each colon segment.
Adenoma progression, early/advanced: Progression of early and advanced adenomas can be adjusted. In the current version of CMOST, progression of rectal adenomas differs from that of adenomas in the remaining colon to account for a high rate of rectal carcinomas.
Direct cancer: Location-specific probability for the appearance of direct cancer (i.e. cancer without adenomatous precursors). CMOST assumes direct cancer to be situated preferentially within the right colon.
Detection during colonoscopy: Allows for the adjustment of different adenoma detection rates in various segments of the colon.
Detection during rectosigmoidoscopy: Allows for the adjustment of different adenoma detection rates during rectosigmoidoscopy in various segments of the colon.
Note: With the current settings, CMOST assumes an identical location-specific detection rate for rectosigmoidoscopy and colonoscopy.
Reach of colonoscopy: Probability that a colonoscopy will reach a certain colon segment.
Reach of rectosigmoidoscopy: Probability that a rectosigmoidoscopy will reach a certain colon segment.
Return:  Return  to  main  window.  

VI.3.  Mortality  
Allows for the adjustment of age-dependent CRC-specific mortality. To reduce discrepancies between calculated and reported mortality rates, CMOST assumes a 45% increase in CRC mortality beyond age 86 years.
 

 
 
 
 

 
VI.4.  Colonoscopy    
 

 
 
Allows for the adjustment of critical settings for colonoscopy: the risks of perforation during colonoscopy and rectosigmoidoscopy, death following perforation, serosa burn (i.e. post-polypectomy syndrome), bleeding and severe bleeding.
Colo-detection: allows for the adjustment of the probability of detecting the indicated lesions during colonoscopy.
Rectosigmo-detection: allows for the adjustment of the probability of detecting the indicated lesions during rectosigmoidoscopy.
Note: With the current settings, CMOST assumes the detection rate of rectosigmoidoscopy to be 25% lower for early adenomas and 12.5% lower for advanced adenomas than that of colonoscopy.
Colonoscopy rate (baseline colonoscopy rate) will be implemented in a later version of CMOST.

 
VI.5.  Costs  

 
Allows for the adjustment of the costs of the indicated interventions (upper left corner), complications of interventions (lower left corner) and cancer treatment. Cancer treatment is divided into an initial year (1st year), a terminal year (the year before the patient dies, if applicable) and continuous care (the time after the first year until year 5 or the terminal year).
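
The division of cancer treatment into phases can be sketched as follows. The function name and its arguments are hypothetical. The sketch also makes two assumptions that the manual does not state: the terminal year takes precedence over the initial year, and no treatment cost accrues after year 5.

```python
def treatment_phase(years_since_diagnosis, years_until_death=None):
    """Classify one year of cancer care.

    years_since_diagnosis: 0 for the first year after diagnosis, 1 for the second, ...
    years_until_death: 0 if the patient dies at the end of this year,
                       None if not applicable.

    Assumptions (not stated in the manual): the terminal year takes precedence
    over the initial year, and no phase applies from year 5 onward.
    """
    if years_until_death is not None and years_until_death == 0:
        return "terminal"
    if years_since_diagnosis == 0:
        return "initial"
    if years_since_diagnosis < 5:
        return "continuous"
    return None


# Patient who dies at the end of the third year after diagnosis
dying = [treatment_phase(y, 2 - y) for y in range(3)]
# Patient who survives at least seven years after diagnosis
surviving = [treatment_phase(y) for y in range(7)]
print(dying)
print(surviving)
```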
 

VI.6.  Screening  

 
 
 
Screening: allows for the adjustment of screening interventions.
7 screening interventions are implemented (colonoscopy, rectosigmoidoscopy, FOBT, I-FOBT, Septin-9, other). These interventions can be combined, but the selected fraction of the population will adhere to the indicated screening test for the rest of its life. For each screening intervention the following parameters can be adjusted:
Left side:
- Fraction of population (0…1): fraction of the population that will choose a given screening test
- adherence: fraction of the population that will use the screening test at each indicated time point (0…1)
- follow-up: adherence to the follow up investigation (i.e. colonoscopy after a positive screening test; 0…1)
- y start: year at which screening will be started
- y end: year at which screening will end
- interval: interval at which screening tests will be applied (for instance 1, 2 or 5 years)
- y after colo: time interval in years that CMOST will wait before another round of screening after the last colonoscopy (performed for whatever reason)
- specificity: for detection of colonic lesions (determines the false positive rate)
Right side: Sensitivity for the detection of the indicated lesions.
Fraction not screened: determines the fraction of the population which will not undergo a screening test.
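
How the parameters y start, y end, interval, y after colo and adherence interact can be sketched as follows. This is an illustration under stated assumptions, not CMOST code: the function names are hypothetical, y start and y end are treated as inclusive, and a single known age of the last colonoscopy is used.

```python
import random


def screening_ages(y_start, y_end, interval, y_after_colo=0, last_colo_age=None):
    """Ages at which a screening test is due under the parameters described above.

    Assumptions: y_start and y_end are both inclusive, and a test is skipped
    if fewer than y_after_colo years have passed since the last colonoscopy.
    """
    ages = []
    for age in range(y_start, y_end + 1, interval):
        if last_colo_age is not None and 0 <= age - last_colo_age < y_after_colo:
            continue  # too soon after the last colonoscopy
        ages.append(age)
    return ages


def attends(rng, adherence):
    """One adherence draw: True with probability `adherence` (0..1)."""
    return rng.random() < adherence


every_five_years = screening_ages(50, 75, 5)
after_colo_at_52 = screening_ages(50, 75, 5, y_after_colo=10, last_colo_age=52)
print(every_five_years)
print(after_colo_at_52)
```

In a full simulation the age of the last colonoscopy changes over the lifetime of a patient, so this check would be repeated at every scheduled time point.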

VI.7.  Adenoma  dwell  time  


Dwell time is the time from the appearance of a clinically detectable adenoma to its development into cancer. The graphical output shows dwell time on page 3. Further, loading the results file OUTPUT_NAME_Results.mat loads a Matlab structured variable Results. In it, Results.Variable{18} gives the dwell time.

Dwell time results from the time spent by the adenoma in several stages. If a model with a different dwell time is desired, it is not straightforward to adjust the dwell time precisely. CMOST8, CMOST13 and CMOST19 are preset with 8 years, 13 years and 19 years of dwell time. This range spans the dwell times used by other microsimulations.
 
We recommend adjusting the risks for early and advanced adenoma progression (see VI.1. Risk). However, this step will not only change adenoma dwell time but also several additional parameters. Therefore, all remaining parameters need to be re-adjusted (preferably automatically, see VII. Parameter optimization). The re-adjustment will partially reverse the change in dwell time. Therefore, several rounds of risk adjustment and parameter optimization might be necessary to achieve the desired dwell time.

VII.  Parameter  optimization  


This section is specifically for advanced users who would like to use their own benchmark parameters to create the CMOST settings file and to perform calculations using them. A new user wanting to reproduce the calculations with the default CMOST settings can ignore this section.
 
Parameterization  approach:  CMOST  is  a  very  detailed  model  with  many  different  
parameters.   For   example,   the   new   polyp   appearance   is   a   sigmoid   function  
parameterized  by  three  parameters.  The  early  polyp  progression  is  assumed  to  be  of  
Gaussian   form   and   is   parameterized   by   three   additional   parameters.   Inclusion   of   all  
parameters  used  for  describing  the  model  would  make  it  difficult  if  not  impossible  
to  obtain  an  automated  parameter  estimation.    
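
A sigmoid function with three parameters can be sketched as follows. The exact functional form used by CMOST is given in the manuscript and is not reproduced here; the form below (scale, midpoint, width) is a generic assumption, and the coefficient values are placeholders.

```python
import math


def new_adenoma_rate(age, scale, midpoint, width):
    """Generic three-parameter sigmoid in age (a hypothetical form, see text above)."""
    return scale / (1.0 + math.exp(-(age - midpoint) / width))


# Hypothetical coefficients; CMOST's calibrated values are not reproduced here.
# One value per year of age from 0 to 99.
rates = [new_adenoma_rate(age, scale=0.05, midpoint=50.0, width=8.0) for age in range(100)]
print(rates[50])
```

With only three coefficients describing a curve over 100 years of age, the number of values to be calibrated stays small.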
 
In order to achieve results quickly, two strategies were adopted:
●   The comparison of the calculated results to benchmarks is made in four steps. It was observed during the calculations that these four steps behaved almost independently of each other. This step-wise parameterization reduces the complexity from simultaneously optimizing twenty parameters to optimizing fewer parameters in each step.
●   The parameter space was not explored with all possible combinations of the parameter values. Instead, in an initial step the parameters were corrected iteratively, depending on how far the calculated results were from the benchmarks. This greedy approach directs the exploration of the parameter space towards achieving the benchmarks provided. For some key parameters this is followed by a systematic unbiased search using the Nelder-Mead simplex algorithm.
 
While the biology of colorectal cancer is complicated, at a very simple level of description colorectal cancer has two kinds of origins: adenomatous polyps that can be detected in a colonoscopy, and flat/serrated polyps (or non-polyp lesions) that are hard to detect by colonoscopy. The latter kind of lesions is accounted for by the "direct cancer" pathway, i.e. cancer without adenomatous precursors.
 
Evolution   from   adenomas   or   without   adenomas   as   well   as   the   incidence   of  
colorectal  cancer  defines  the  natural  history  of  colorectal  cancer,  in  the  absence  
of  any  screening  interventions.    
 

 
Parameter  calibration  is  done  in  4  separate  steps:  
1.   Early  adenoma  prevalence  and  distribution  
2.   Advanced  adenoma  prevalence  and  distribution  
3.   Cancer  incidence  
4.   Direct  cancer  
Number and size (or stage) distribution of adenomas and cancer as a function of age and gender, observed either in colonoscopy studies or in autopsy studies, are used as benchmarks. Direct cancer (step 4) is benchmarked in an indirect way using rectosigmoidoscopy randomized controlled trial data.

VII.1.  Step  1:  Early  adenomas  


VII.1.1.  Early  adenoma  benchmarks  

 
 
Left (overall): Prevalence of early adenoma overall (i.e. males and females). The number indicates the fraction of individuals with at least one (early or advanced) adenoma. To add another benchmark, just add a new year and an adenoma prevalence. The sort button (lower right in this window) will sort all benchmarks according to age. Below, all benchmarks are illustrated with a graph.
Second from left (male): As above, male population
Middle (female): As above, female population
Second from right, multiple adenoma: Numbers indicate the fraction of individuals with exactly 1 adenoma, 2, 3, 4 adenomas or 5 or more adenomas. Here the population 55-74 years of age is considered.
Right graph, distribution of adenoma stage: Distribution of adenoma stages, i.e. fraction or percentage of adenomas at a given stage (3mm, 5mm, 7mm, 9mm).
 
Technical comment: for optimization step 1 only adenoma stages I-IV are considered (the sum of the numbers of adenomas in stages I-IV is used as 100%); for optimization step 2 only adenoma stages V and VI will be considered.
Sort:  sort  benchmarks  according  to  age.  
Return:  return  to  main  window.    
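
The normalization described in the technical comment above can be sketched as follows. The function name and the adenoma counts are hypothetical; the sketch only shows how a subset of stages is rescaled so that it sums to 100%.

```python
def relative_stage_distribution(counts, stages):
    """Percentages of the selected stages, with their sum taken as 100%."""
    total = sum(counts[s] for s in stages)
    if total == 0:
        return {s: 0.0 for s in stages}
    return {s: 100.0 * counts[s] / total for s in stages}


# Hypothetical adenoma counts per stage (I-IV: early adenomas, V-VI: advanced adenomas)
counts = {"I": 600, "II": 250, "III": 100, "IV": 50, "V": 40, "VI": 10}

early = relative_stage_distribution(counts, ["I", "II", "III", "IV"])  # step 1
advanced = relative_stage_distribution(counts, ["V", "VI"])            # step 2
print(early)
print(advanced)
```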

VII.1.2.  Optimizing  early  adenomas    

 
 
CMOST will try to optimize overall, male and female adenoma prevalence, the distribution of multiple adenomas within the population and the adenoma stage distribution I-IV.
In the first set of iterations, heuristic, customized algorithms will be used to approach an optimum. In the second set of iterations, a Nelder-Mead simplex algorithm will be used to refine this optimum. After an optimum is found, the user will be asked whether the parameters found should be kept.
1st iteration: number of calculations for the first set of iterations (heuristic algorithms)
2nd iteration: number of calculations for the second set of iterations (Nelder-Mead)
Start:  start  optimization  procedure  
Stop:  stop  optimization  procedure  (the  current  run  will  be  finished)  
Return:  return  to  main  window.    
Technical comment 1:
- For adenoma prevalence, the age-dependent new adenoma rate is adjusted (3 coefficients of a sigmoid curve, see manuscript). In addition, CMOST adjusts a scaling factor for adenoma appearance in females to minimize the differences to the benchmarks in males and females.
- For the distribution of multiple adenomas, the distribution of individual risks for the population is adjusted.
- For the adenoma stage distribution, the stage-specific adenoma progression rates are adjusted according to the square root of the ratio of observed vs. expected relative prevalence.
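
The square root rule in the last item can be illustrated with a short sketch. The function name is hypothetical. The manual does not state which rate is multiplied or divided by this factor, so the sketch only computes the factor itself and shows its damping effect.

```python
import math


def damped_correction(observed, expected):
    """Square root of the observed/expected ratio.

    Taking the square root damps the step: a 4-fold mismatch yields only a
    2-fold correction factor.
    """
    if observed <= 0 or expected <= 0:
        raise ValueError("prevalences must be positive")
    return math.sqrt(observed / expected)


factor = damped_correction(observed=40.0, expected=10.0)
no_change = damped_correction(observed=10.0, expected=10.0)
print(factor, no_change)
```

A damped step of this kind presumably reduces the risk of overshooting when the correction is repeated over several iterations.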
Technical  comment  2:  
Using  these  benchmarks,  the  following  two  internal  vectors  are  adjusted:  
-­‐‑    New   polyp   appearance,   which   is   a   vector   with   the   relative   likelihood   of  
appearance  of  adenomas  for  each  year  between  ages  0  and  99.    
-­‐‑    Individual   Risk   (vector   with   500   numbers):   This   vector   determines   the  
distribution   of   individual   risks.   This   distribution   might   be   flat   (i.e.   very  
similar   cancer   and   adenoma   risks   between   individuals)   or   steep   (i.e.  
uneven,  highly  different  adenoma  and  carcinoma  risks).  CMOST  assumes  
each   simulated   individual   to   have   a   person   specific   risk   for   adenoma  
appearance.  This  individual  risk  is  drawn  from  a  distribution  of  these  500  
numbers.  This  risk  may  be  due  to  heredity,  nutrition,  or  other  factors.  A  
distribution of individual risks is calculated and benchmarked using the
distribution of multiple adenomas (a high frequency of multiple
adenomas can only be achieved if an uneven distribution of individual
risks is assumed, i.e., many high-risk individuals compared to average).
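Drawing a person-specific risk from a vector of 500 numbers can be sketched in Python as follows. This is an illustration only: the power-law shape used to build a "flat" or "steep" vector and all names are assumptions, not the distribution CMOST actually calibrates.

```python
import random


def make_risk_vector(steepness, size=500):
    """Hypothetical risk vector, normalised to mean 1.

    steepness = 0 gives a flat vector (identical risks);
    larger values give an increasingly uneven distribution.
    """
    raw = [((i + 1) / size) ** steepness for i in range(size)]
    mean = sum(raw) / size
    return [r / mean for r in raw]


def draw_individual_risks(risk_vector, n_individuals, rng):
    """Assign each simulated individual one risk drawn from the vector."""
    return [rng.choice(risk_vector) for _ in range(n_individuals)]


flat = make_risk_vector(0)
steep = make_risk_vector(4)
risks = draw_individual_risks(steep, 1000, random.Random(0))
```

With the steep vector a small group of individuals carries several times the average risk, which is the situation needed to reproduce a high frequency of multiple adenomas.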

 
   

 
VII.2.  Step  2:  Advanced  adenomas  
VII.2.1.  Advanced  adenoma  benchmarks    
 

 
Left: prevalence of advanced adenomas (percentage of individuals with at least 1
advanced adenoma >1cm or >2cm; i.e. P5 or P6).
Middle  left:  same  as  above,  male  population  
Middle  right:  same  as  above,  female  population  
Right:  distribution  of  adenoma,  compare  Benchmarks  Step  1,  above      
Sort:  will  sort  benchmarks  according  to  the  year  

 
VII.2.2.  Optimizing  advanced  adenomas  

 
 
General  strategy:  CMOST  will  optimize  overall,  male  and  female  advanced  adenoma  
prevalence,   and   adenoma   stage   distribution   of   advanced   adenomas   (P5:   advanced  
adenoma  >1cm  or  advanced  histology,  P6:  advanced  adenoma  >2cm).    
In the first set of iterations, customized heuristic algorithms will be used to approach
an optimum. In the second set of iterations, a Nelder-Mead simplex algorithm will be
used to refine this optimum. After an optimum is found, the user will be asked
whether the parameters found should be kept.
In   addition,   a   scaling   factor   for   adenoma   progression   for   females   will   be   adjusted   to  
account   for   different   advanced   adenoma   distributions   for   females   and   males.   This  
scaling  factor  is  not  reported.  
 
1st  iteration:  number  of  calculations  for  first  set  of  iterations  
2nd  iteration:  number  of  calculations  for  second  set  of  iterations  
Adjust  Adv.  Adenoma,  Adjust  adenoma  distribution:  adjustments  can  be  activated/  
inactivated  separately.    
Start:  start  optimization  procedure  
Stop:  stop  optimization  procedure  (the  current  run  will  be  finished)  
Return:  return  to  main  window.    
Technical  comments:    

 
-­‐‑   For   advanced   adenoma   prevalence   the   age   dependent   early   adenoma  
progression   rate   is   adjusted   (3   coefficients   of   a   Gaussian   curve,   see  
manuscript).  In  addition,  CMOST  adjusts  a  correction  factor  for  adenoma  
appearance  in  females  to  minimize  the  differences  to  the  benchmarks  in  
males  and  females.      
-   For the adenoma stage distribution P5/P6, the stage-specific adenoma
progression rates are adjusted according to the square root of the ratio of
observed vs. expected relative prevalence.

VII.3.  Step  3:  Optimizing  cancer  


VII.3.1.  Benchmarks  for  cancer  optimization  
 

 
 
Left:  incidence  of  carcinoma  whole  population  
Middle  left:  same  as  above,  male  population  
Middle  right:  same  as  above,  female  population  
Upper right - relative danger of adenoma. This parameter is benchmarked reversely;
we estimate that the indicated fraction of carcinomas was derived from an adenoma
of the indicated stage. For instance, 1% of all carcinomas originated directly within
an adenoma of 9mm size (stage IV = P4), skipping the P5 stage (>1cm) and the P6
stage (>2cm).
Lower   right   –   fraction   of   rectum   carcinoma:   A   disproportionate   fraction   of  
carcinoma   is   observed   in   the   rectum.   Benchmarks   can   be   added   for   adjustment   of  
the  fraction  of  rectum  carcinoma  (only  the  second  and  the  third  benchmark,  i.e.  the  
5  year  span  around  year  62  and  72    will  be  used).  
Sort:  will  sort  incidence  benchmarks  according  to  the  year  
Return:  return  to  main  window  
 

VII.3.2.  Optimizing  cancer  parameters  

 
 
General strategy: CMOST will optimize overall, male and female carcinoma
incidence, the relative danger of adenomas (i.e. the fraction of carcinomas derived
from each type of adenoma other than P6 adenoma (>2cm)) and the fraction of
rectum carcinoma.
In the first set of iterations, customized heuristic algorithms will be used to approach
an optimum. In the second set of iterations, a Nelder-Mead simplex algorithm will be
used to refine this optimum. After an optimum is found, the user will be asked
whether the parameters found should be kept.
1st iteration: number of calculations for the first set of iterations
2nd iteration: number of calculations for the second set of iterations
Adjust   fraction   rectum,   Adjust   cancer,   Adjust   relative   danger   adenoma:   adjustments  
can  be  activated/  inactivated  separately.    
Start:  start  optimization  procedure  
Stop:  stop  optimization  procedure  (the  current  run  will  be  finished)  
Return:  return  to  main  window.    
Technical  comments:    
-­‐‑   For   carcinoma   incidence   the   age   dependent   advanced   adenoma  
progression   rate   is   adjusted   (3   coefficients   of   a   Gaussian   curve,   see  
manuscript).  In  addition,  CMOST  adjusts  a  correction  factor  for  advanced  
adenoma   progression   in   females   to   minimize   the   differences   to   the  
benchmarks  in  males  and  females.      
-­‐‑   For   the   relative   danger   adenoma   a   conversion   factor   for   each   type   of  
adenoma  is  adjusted  according  to  the  square  root  of  the  ratio  of  observed  
vs.  expected  relative  prevalences.    
-­‐‑   For   adjusting   the   fraction   of   rectum   carcinoma   the   location   specific  
progression   rate   of   rectum   adenomas   (early   and   advanced)   will   be  
adjusted.  
 

VII.4.  Step  4:  Adjusting  direct  cancer  


VII.4.1.  Benchmarks  for  adjusting  direct  cancer    
 

 
 

 
Incidence  reduction  overall:  incidence  reduction  by  sigmoidoscopy  screening.  This  
parameter  will  be  used  for  benchmarking.  
Incidence  reduction  right/  left,  mortality  reduction:  the  respective  outcomes  in  the  
rectosigmoidoscopy  study.  These  parameters  are  not  used  for  benchmarking.  
Comment: due to specific differences between the rectosigmoidoscopy studies (i.e. age of
population, 1 or 2 screening interventions (Schoen et al., NEJM 2012), adherence to
screening etc.), incidence reduction only makes sense in connection with the
selection of the study (see VII.4.2., below).

VII.4.2.  Optimizing  direct  cancer  parameters  

 
 
General  strategy:  CMOST  uses  data  from  randomized  controlled  rectosigmoidoscopy  
studies  for  the  adjustment  of  direct  cancer.  Direct  cancer  is  assumed  to  be  localized  
mainly   in   the   right-­‐‑sided   colon,   modeled   by   a   sigmoid   curve   (see   manuscript).  
CMOST   adjusts   only   the   maximum   of   this   sigmoid   curve.   Thereby,   the   variable  
“DirectCancerSpeed”  (direct  cancer  variable)  will  be  adjusted  using  the  square  root  
of  the  expected  vs.  observed  incidence  reduction  rates  for  overall  carcinoma.  
Graph   incidence   reduction:   for   the   calculations   the   trends   in   incidence   reduction  
overall  (yellow),  incidence  reduction  left  colon  (blue)  and  incidence  reduction  right  
colon  (red)  are  indicated.  
Graph   direct   cancer   variable:   Trends   in   the   variable   “DirectCancerSpeed”   are  
indicated.  
Max.  iterations:  Number  of  iterations    
Number  averaging:  At  the  end  of  the  calculations  the  last  n  individual  direct  cancer  
speeds  will  be  averaged  (to  reduce  noise)  and  the  resulting  number  used  further  on.  
Scenario: 4 scenarios are modeled by CMOST; the user can choose between “Atkin et
al., Lancet 2010”, “Schoen et al., NEJM 2012”, “Holme et al., JAMA 2014” and “Segnan,
JNCI 2011”.
Start:  start  calculations  
Stop:  stop  calculations  (the  current  run  will  be  finished)  
Return:  return  to  main  window  
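The interplay of "Max. iterations" and "Number averaging" can be sketched in Python as follows. This is a hedged illustration, not CMOST's Matlab code: the direction of the square-root ratio (simulated over benchmark incidence reduction) and the toy response function are assumptions, and the function name is hypothetical.

```python
import math


def adjust_direct_cancer_speed(speed, simulate_reduction, benchmark_reduction,
                               max_iterations, n_average):
    """Iteratively rescale the direct cancer speed, then average the last values.

    Assumption: more direct cancer lowers the incidence reduction achieved by
    sigmoidoscopy, so the speed is scaled by sqrt(simulated / benchmark).
    """
    history = []
    for _ in range(max_iterations):
        simulated = simulate_reduction(speed)
        speed = speed * math.sqrt(simulated / benchmark_reduction)
        history.append(speed)
    last = history[-n_average:]
    return sum(last) / len(last), history


# Purely hypothetical response: the reduction falls as the speed rises.
toy_model = lambda s: 0.4 / (1.0 + s)
final_speed, trace = adjust_direct_cancer_speed(0.5, toy_model, 0.2, 40, 10)
```

In this noise-free toy model the averaging has little effect; with a stochastic simulation the individual values fluctuate, and averaging the last n values reduces that noise.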
 

 
VII.5.  Suggested  scheme  for  automated  calibration  
We suggest running all 4 optimization steps sequentially and adjusting the number of
trials and the size of the patient population:
•   Step  1:  25’000  patients,  10  trials  first  iteration,  30  trials  second  iteration  
•   Step  2:  50’000  patients,  30  trials  first  iteration,  60  trials  second  iteration  
•   Step  3:  100’000  patients,  50  trials  first  iteration,  60  trials  second  iteration  
•   Step   4:   100’000   patients,   30   trials   first   iteration,   averaging   the   last   10  
iterations  
Comments:    
-­‐‑   Variables  will  only  be  transferred  if  the  user  applies  the  “Return”  button  
in  each  window,  not  simply  by  closing  the  windows.  
-   Calculations for step 1 and 2 will be fast. Step 3 can take one hour. If
calculations are excessively slow, please check that the coder option has
been appropriately used (section VIII.2).

VII.6.  Alternative  strategy  for  Step  4,  adjustment  of  direct  cancer  
Since  simulation  of  even  a  population  as  large  as  100’000  results  in  some  noise  in  
the   data,   we   designed   another   way   to   adjust   direct   cancer.   Thereby   the   “direct  
cancer  variable”  will  be  systematically  varied  to  span  the  range  of  reasonable  values  
with   a   selected   number   of   repeats;   results   will   be   averaged   and   the   most  
appropriate  value  used.  
 
Load  the  settings  for  which  you  want  to  adjust  the  direct  cancer  variable.  
→ Menu Option
  → autocalibration step 4 rectosigmo
    → Automatic RS tests writing
You will be able to choose the number of repeats. For the settings calculated for
CMOST we used 25 repeats, resulting in 15x25 = 375 calculations.
Calculations can be done on a Linux cluster (recommended, see section IX.2) or
using the Starter option (section IX.1).
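How 15 tested values and 25 repeats add up to 375 calculations can be sketched in Python. The range of values (0.0 to 1.4) and the function name are hypothetical; the manual only states the number of values and repeats.

```python
def build_job_list(low, high, n_values, n_repeats):
    """One job per (value of the direct cancer variable, repeat number)."""
    step = (high - low) / (n_values - 1)
    values = [low + i * step for i in range(n_values)]
    return [(value, repeat)
            for value in values
            for repeat in range(1, n_repeats + 1)]


jobs = build_job_list(0.0, 1.4, n_values=15, n_repeats=25)
n_calculations = len(jobs)
```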
 
After  calculations  are  finished,  results  can  be  summarized:  
→ Menu Option
  → autocalibration step 4 rectosigmo
    → Automatic RS tests reading
 
Results  might  look  as  follows:  

 
     
 
In the plot above, results of an individual repeat are shown in blue, the averaged
value in red, and the trendline is indicated by a blue line. The benchmark for overall
incidence reduction is indicated by a thin blue line. The point where both lines
cross is calculated and the best value for the direct cancer variable is extracted.
The user will be asked whether the calculated direct cancer variable should be kept;
if yes, it will replace the current value in the settings.
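The evaluation described above (average the repeats, fit a trendline, find the crossing with the benchmark) can be sketched in Python. The type of trendline is not stated in this manual; a straight least-squares line is assumed here, and all names and numbers are hypothetical.

```python
def average_repeats(results):
    """results maps each tested value to the list of outcomes of its repeats."""
    values = sorted(results)
    means = [sum(results[v]) / len(results[v]) for v in values]
    return values, means


def linear_fit(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def crossing_point(slope, intercept, benchmark):
    """Value of the variable at which the trendline meets the benchmark."""
    return (benchmark - intercept) / slope


# Toy data: incidence reduction (%) per tested value, 2 repeats each.
toy_results = {1.0: [41.0, 39.0], 2.0: [31.0, 29.0], 3.0: [19.0, 21.0]}
xs, ys = average_repeats(toy_results)
slope, intercept = linear_fit(xs, ys)
best_value = crossing_point(slope, intercept, benchmark=30.0)
```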

VII.7.  Concomitant  adjustment  of  Steps  1-­‐‑3  


Steps 1-3 can be run sequentially (see above) with graphical representation of key
parameters. However, CMOST also supports optimization of Steps 1-3 in a single
procedure. This can be done from the CMOST main window or on the cluster.
 
→ Menu Option
  → Calibration step 1, 2, 3
    → Autocalibration step 123
The   current   settings   will   be   used   as   a   starting   point.   CMOST   will   perform   the  
following  steps:  
-   Step 1: with population size 25’000, 10 and 30 repeats for the first and
second set of iterations, respectively
-   Step 2: with population size 50’000, 30 and 60 repeats for the first and
second set of iterations, respectively
-   Step 3: with population size 100’000, 30 and 60 repeats for the first and
second set of iterations, respectively
-   Step 2+3: with population size 100’000, 100 iterations
CMOST will start running the calculations; no graphical output is provided. The
results of the calculations will be transferred back to the CMOST main window.
Calculations might need several hours. For best results we suggest performing this
automated calibration on a Linux cluster; thereby, several calculations can be run in
parallel (see section IX.2).
 
Note: For the combination of step 2 + 3, only the Nelder-Mead simplex algorithm will
be used. It can be useful to achieve even more precise fitting since some
interference between steps 2 and 3 is present. This combined mode is not
available with a graphical user interface.
 
We   also   developed   automated   calibration   using   bootstrapping.   Thereby,   for   each  
settings  parameter  an  artificial  population  with  the  size  of  the  original  study  (source  
data)   with   the   same   distribution   of   the   parameter   as   in   the   original   study   (source  
data)  will  be  generated.  For  each  parameter  individuals  will  be  randomly  drawn  and  
a   new   study   population   is   generated   (with   the   size   of   the   original   study).   Since   each  
individual  can  be  drawn  more  than  once,  some  fluctuation  of  each  parameter  will  be  
present.   Bootstrapping   addresses   the   uncertainty   of   calculations.   Bootstrapping   has  
not  been  sufficiently  tested  and  usage  is  currently  not  recommended.      
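The resampling idea behind bootstrapping can be sketched in Python for a single benchmark expressed as a proportion (for example, a prevalence reported as n_positive out of n_total participants). This is a generic illustration under that assumption and not CMOST's implementation; the function name and numbers are hypothetical.

```python
import random


def bootstrap_benchmark(n_positive, n_total, rng):
    """Rebuild the study population and resample it with replacement."""
    population = [1] * n_positive + [0] * (n_total - n_positive)
    # Same size as the original study; individuals may be drawn repeatedly.
    resample = [rng.choice(population) for _ in range(n_total)]
    return sum(resample) / n_total


rng = random.Random(42)
# 200 bootstrap versions of a benchmark of 300 positives among 1000 individuals.
replicates = [bootstrap_benchmark(300, 1000, rng) for _ in range(200)]
mean_replicate = sum(replicates) / len(replicates)
```

The spread of the replicates reflects the sampling uncertainty of the original study; calibrating against each replicate would propagate that uncertainty into the fitted parameters.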

 
VII.8.  Manual  adjustments  

 
 
CMOST   also   supports   visualization   and   manual   adjustment   of   all   parameters.  
This   would   be   an   alternative   strategy   to   automated   calibration.   The   user   would  
need   to   manually   change   parameters   in   this   window,   press   Return   to   transfer  
parameters   to   main   window   and   press   Start   in   the   main   window,   followed   by  
inspection of the Matlab figures or .pdf files. With experience, good agreement of
all parameters can be achieved within 30-50 iterations.
Manual calibration results in slightly better agreement with benchmarks and
ultimately provides an intuitive understanding of the model parameters. However,
we recommend the objective approach provided by the automated calibration.
New  polyp  rate:  for  each  increment  of  5  years  an  age-­‐‑specific  adenoma  appearance  
rate  can  be  entered.  CMOST  will  interpolate  the  rates  for  the  years  in  between.    
Technical  comment:  for  the  automated  settings  a  sigmoid  curve  with  3  parameters  
is  assumed  and  the  individual  points  will  be  calculated.    
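Interpolation between the 5-year entries can be sketched in Python. The manual does not state the interpolation method or the exact ages of the entries; linear interpolation and entries at ages 0, 5, 10, ... are assumed here, and the function name is hypothetical.

```python
def interpolate_yearly(rates_5y, max_age=99):
    """Expand rates given every 5 years into one rate per year of age.

    Assumptions: rates_5y[k] belongs to age 5*k, values in between are
    interpolated linearly, and the last entry is held constant afterwards.
    """
    yearly = []
    for age in range(max_age + 1):
        k, remainder = divmod(age, 5)
        if k + 1 < len(rates_5y):
            low, high = rates_5y[k], rates_5y[k + 1]
            yearly.append(low + (high - low) * remainder / 5)
        else:
            yearly.append(rates_5y[-1])
    return yearly


# Toy input with entries for ages 0, 5 and 10.
toy_yearly = interpolate_yearly([0, 10, 20], max_age=10)
```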
Early   polyp   progression:   as   for   new   polyp,   only   the   age   specific   early   adenoma  
progression  rate  can  be  adjusted.  
Advanced   polyp   progression:   as   for   new   polyp,   only   the   age   specific   advanced  
adenoma  progression  rate  can  be  adjusted.  

 
Direct   cancer:   The   age   specific   direct   cancer   rate   can   be   adjusted   separately   for  
males   and   for   females.   For   the   settings   provided   by   the   authors,   the   direct   cancer  
rate   follows   the   cancer   incidence   benchmarks.   Direct   cancer   refers   to   cancer  
without  adenomatous  precursors.  
Technical  comment:    
-­‐‑   The  probability  of  direct  cancer  will  be  the  result  of  the  direct  cancer  rate  
and  “direct  cancer  speed”.    
-­‐‑   In   the   current   settings   the   direct   cancer   rate   equals   the   carcinoma  
incidence  
Progression:  adenoma  stage  specific  progression  rates  can  be  adjusted.  These  rates  
determine   the   probability   of   progression   for   instance   from   a   polyp   5mm   (stage   II,  
P2)  to  a  polyp  7mm  (stage  III,  P3).  
Technical   comment:   The   rate   of   actual   progression   will   be   the   product   of   stage-­‐‑
specific  progression  rate,  age-­‐‑specific  early  or  advanced  adenoma  progression  rate,  
gender-­‐‑specific  correction  factor  and  a  location  specific  correction  factor.    
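The product described in this technical comment can be written as a one-line Python function. The function and argument names and the example numbers are hypothetical; only the multiplication of the four factors is taken from the text.

```python
def progression_probability(stage_rate, age_rate, gender_factor, location_factor):
    """Actual progression rate as the product of the four factors."""
    return stage_rate * age_rate * gender_factor * location_factor


# Toy numbers: stage-specific rate 0.5, age-specific rate 0.2,
# no gender correction (1.0), location with doubled progression (2.0).
example_rate = progression_probability(0.5, 0.2, 1.0, 2.0)
```

Because the factors are multiplied, changing one of them (for instance the location-specific factor for rectum adenomas) rescales the progression of all stages and ages at that location.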
Fast  cancer:    Refers  to  cancer  directly  derived  from  adenoma  stages  other  than  P6  
(advanced   adenoma   >2cm).   The   probabilities   of   conversion   to   cancer   for   each  
adenoma  stage  can  be  adjusted  here  (very  low  for  small  adenomas  and  relevant  only  
for  large  adenomas  P5,  >1cm).    
Healing: CMOST assumes a low probability of adenoma regression, which can be
adjusted in this window. Adenoma healing is not calibrated.
Female   gender:   Correction   factors   to   control   for   the   fraction   of   females   within   the  
general   population   (at   birth),   appearance   of   new   adenoma,   progression   of   early  
adenoma  and  progression  of  advanced  adenoma.      
Direct   cancer   speed:   allows   for   adjusting   the   fraction   of   direct   cancer   (i.e.   without  
adenomatous  precursors)  in  the  population.  

VII.9.  Default  benchmarks  


No   new   window   will   open,   but   all   benchmarks   will   be   reverted   to   the   default  
settings  provided  by  the  authors.  

VIII.    Speed  of  calculations  


VIII.1.  Speed  limit    
CMOST calculations handle large arrays. The program loops over a population of (say)
100,000 individuals, following each of them over 100 years, and over the adenomas of
each individual (up to 50). Handling this amount of data in Matlab is slow: such a
calculation requires about 20 minutes on a 2.5 GHz machine with 4 GB RAM. This is
because Matlab code can be much slower than its C or C++ counterparts.
 
 

VIII.2.  Accelerating  speed:  the  coder  option  


The   computationally   intensive   part   of   CMOST   is   a   function   NumberCrunching.m  
which  is  called  from  the  subroutine  Calculate_Sub.m.    
Matlab offers a way to accelerate the calculations to achieve speeds comparable to C
programs. This option is called coder and is invoked with the Matlab function “codegen”.
Important note: CMOST is provided with converted files for PC and Macintosh for
some machines (64-bit machines). For best performance it might be necessary to
generate NumberCrunching.mex anew. Conversion is also necessary after each
modification of the NumberCrunching function. If no appropriate .mex file is
available, a warning will appear.
Due   to   technical   requirements   of   the   Matlab   function   “codegen”,   it   is   necessary   to  
compile   NumberCrunching   for   each   number   of   patients   (i.e.   10’000,   25’000,   50’000,  
100’000).   One   additional   routine   (QuickRS,   evaluating   results   of   the   simulated  
randomized  rectosigmoidoscopy  studies)  also  needs  to  be  converted.  
Specifically,  the  conversion  is  achieved  by  the  following  steps:  
1.   Start  Matlab  
2.   Change  path  to  the  CMOST  folder  
cd  /Users/USERNAME/Documents/CMOST    
(the command pwd will print the current path, cd .. moves one step up in the folder hierarchy,
cd FOLDERNAME one step down; consult the Matlab documentation for details)

3.   Change  to  the  codegen  folder  


cd  codegen  

4.   Load  variables  
load  Variables_100000.mat  

5.   Change  to  CMOST  path  


cd  ..  

6.   Use Matlab command “codegen” on NumberCrunching
codegen NumberCrunching_100000 -args All

7.   Repeat these steps for 50000, 25000, 10000 patients
cd codegen, load Variables_50000.mat, cd .., codegen NumberCrunching_50000 -args All
cd codegen, load Variables_25000.mat, cd .., codegen NumberCrunching_25000 -args All
cd codegen, load Variables_10000.mat, cd .., codegen NumberCrunching_10000 -args All

8.   Do codegen for one subprogram evaluating results of the simulated
randomized rectosigmoidoscopy studies
cd codegen, load RS_Variables_100000, cd .., codegen QuickRS -args All
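Since the conversion repeats the same pattern for every population size, the Matlab command lines can be generated rather than typed. The following Python helper is a hypothetical convenience and not part of CMOST; it only assembles the command lines listed in the steps above so that they can be pasted into Matlab.

```python
def codegen_commands(sizes=(100000, 50000, 25000, 10000)):
    """Return the Matlab command lines for all population sizes plus QuickRS."""
    commands = [
        f"cd codegen, load Variables_{n}.mat, cd .., "
        f"codegen NumberCrunching_{n} -args All"
        for n in sizes
    ]
    # One additional routine for the simulated rectosigmoidoscopy studies.
    commands.append(
        "cd codegen, load RS_Variables_100000, cd .., codegen QuickRS -args All"
    )
    return commands


for line in codegen_commands():
    print(line)
```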

 
After the NumberCrunching.m module is converted to NumberCrunching_mex.mex, it is
called in the function Calculate_Subroutine.m with exactly the same arguments; only
the name differs. NumberCrunching_mex performs the calculations considerably
faster (in our experience 20x faster).

 
IX.  Running  CMOST  in  batch  mode  
IX.1.  Using  the  STARTER  option  
Press the “starter” button in the main window to open the following window:

 
 
The starter option allows individual saved settings files to be arranged as a
pipeline. These settings files are then run one after another by CMOST (settings
files can also be considered instruction files for a run of CMOST). One convenient
way to generate settings files is the “Scan Variable” option.
Add item: browse for and select settings files
Add  folder:  a  folder  with  all  settings  files  can  be  added  to  the  pipeline  
Delete:  delete  individual  files  
 
Up,  Down:  change  order  of  files    
Return:  return  to  main  window.  
Comment:  If  files  have  been  selected  the  “start  batch”  button  will  be  active  and  the  
pipeline  can  be  started.  

IX.2.  Parallel  Calculations  on  a  Linux  cluster  


Calculations can be performed on a Linux cluster. This allows parallel processing of
many jobs and tremendously increases the power of CMOST. However, the setup of
each cluster is different and no general solution can be provided. The following
steps will work on the Linux cluster Euler at ETH Zurich. If the setup of your cluster
differs, consult the cluster administration for support. Matlab and an appropriate
compiler need to be installed on the cluster.
1.   Adjust  Foldernames  in  CMOSTCluster.m    
Open  CMOSTCluster.m  in  the  Matlab  Editor  (if  you  do  not  know  what  this  
means,  consult  Matlab  documentation)  
Adapt  line  25:  
addpath  ('/cluster/home/USERNAME/CMOST')  
2.   Adjust  Foldernames  in  CMOSTCluster_Calibration_Step123.m  
Open  CMOSTCluster_Calibration_Step123.m  in  the  Matlab  Editor    
Adapt  line  25:  
addpath  ('/cluster/home/USERNAME/CMOST')  
3.   Copy CMOST from your computer to the Linux cluster
From a Mac Terminal the following command will work:
scp -r /Users/USERNAME/Documents/CMOST USERNAME@euler.ethz.ch:/cluster/home/USERNAME
Note: paths on your computer or Linux cluster might look quite different.
4.   Copy  bash  scripts  to  cluster  
We  will  setup  the  cluster  to  be  able  to  do  2  things:  1)  calculations  using  a  provided  settings-­‐‑
file  (i.e.  instruction  file).  2)  automated  parameter  calibration.  For  that  4  scripts  need  to  be  
copied:    
a)  submit_standard,  Calib_submit  (for  automated  submission  of  jobs)  
b)  CMOSTClusterSub,  CMOSTClusterSub_Calib    (subroutine  called  by  the  previous  programs)  
 
From  a  Mac  terminal  the  following  commands  would  work:  
scp  /Users/USERNAME/Documents/CMOST/Cluster/bash_Scripts/submit_standard  
USERNAME@euler.ethz.ch:/cluster/home/USERNAME  
scp  /Users/USERNAME/Documents/CMOST/Cluster/bash_Scripts/Calib_submit  
USERNAME@euler.ethz.ch:/cluster/home/USERNAME  
scp  /Users/USERNAME/Documents/CMOST/Cluster/bash_Scripts/CMOSTClusterSub  
USERNAME@euler.ethz.ch:/cluster/home/USERNAME  
scp  /Users/USERNAME/Documents/CMOST/Cluster/bash_Scripts/CMOSTClusterSub_Calib    
USERNAME@euler.ethz.ch:/cluster/home/USERNAME  

 
5.   Login  to  the  cluster  
ssh  USERNAME@euler.ethz.ch  

6.   Make  bash  scripts  executable  


chmod  +x  submit_standard  
chmod  +x  Calib_submit  
chmod  +x  CMOSTClusterSub    
chmod  +x  CMOSTClusterSub_Calib  

7.   Generate  Data  folder  and  folders  for  output  


mkdir  Data  
mkdir  JobOutput  
mkdir  Data_Calib  
mkdir  JobOutput_Calib  

8.   Copy  data  to  the  cluster  


Copy files for calculations:
scp -r /Users/USERNAME/Documents/YOURDATAFOLDER USERNAME@euler.ethz.ch:/cluster/home/USERNAME/Data
(this will copy a whole folder with settings files = instruction files to the cluster; these might
be settings files systematically varying the year of the screening colonoscopy)

AND/OR copy files for automated parameter calibration:
scp -r /Users/USERNAME/Documents/YOURDATAFOLDER USERNAME@euler.ethz.ch:/cluster/home/USERNAME/Data_Calib
(these files will be used as starting points for automated parameter calibration)

9.   Start  Matlab  
module load matlab/8.5 (loads Matlab so that it can be started)
matlab -nojvm
(this will start Matlab without the Java virtual machine and thus without the graphical user
interface, which usually does not work on a busy Linux login node. However, operating
Matlab from the command line will work)

10.  Do codegen for all necessary files (see also VIII.2, the coder option)
in Matlab:
cd /cluster/home/USERNAME/CMOST
cd codegen, load Variables_100000.mat, cd .., codegen NumberCrunching_100000 -args All
cd codegen, load Variables_50000.mat, cd .., codegen NumberCrunching_50000 -args All
cd codegen, load Variables_25000.mat, cd .., codegen NumberCrunching_25000 -args All
cd codegen, load Variables_10000.mat, cd .., codegen NumberCrunching_10000 -args All
cd codegen, load RS_Variables_100000, cd .., codegen QuickRS -args All

11.  Start  calculations  


./submit_standard  (for  submitting  files  for  calculations)  
./Calib_submit  (for  submitting  files  for  calibrations)  
 
Notes:    
-­‐‑  You  can  use  bjobs  to  follow  how  many  jobs  are  currently  running  
-­‐‑  During  calculations  for  each  settings  file  one  results  file  in  the  form  
ORIGINALFILENAME_Results.mat  will  be  generated.    
- By checking the data folder you can also follow progress
cd ./Data/YOURDATAFOLDER
ls
For each original settings file one results file should be present.
-­‐‑  You  can  follow  the  JobOutput  folder  for  troubleshooting  
cd  ./JobOutput  
vi  YOURDATAFILE  

12.  Copy  data  back  to  your  local  computer  and  summarize  data  
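The progress check from the notes to step 11 (one results file per settings file) can also be automated once the data folder has been copied back. The Python sketch below is a hypothetical helper, not part of CMOST; it assumes that settings files carry the extension .mat and that results files follow the pattern ORIGINALFILENAME_Results.mat.

```python
from pathlib import Path


def missing_results(data_folder):
    """List settings files for which no results file exists yet."""
    names = {p.name for p in Path(data_folder).glob("*.mat")}
    settings = [n for n in names if not n.endswith("_Results.mat")]
    # "run1.mat" is expected to produce "run1_Results.mat".
    return sorted(n for n in settings if n[:-4] + "_Results.mat" not in names)
```

Calling missing_results("Data/YOURDATAFOLDER") returns an empty list once all calculations have finished.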

IX.3.  Generating  multiple  settings  files  with  the  “Scan  Variables”  option  

   
Purpose:   “Scan   variables”   generates   a   number   of   settings   files   which   differ  
systematically  in  only  1  parameter  (or  several  parameters  as  specified  below).  This  
parameter   could   for   instance   be   the   year   of   a   screening   colonoscopy.   This   option  
efficiently  generates  settings  files  (=instruction  files)  without  the  need  of  manually  
changing  one  parameter  and  saving  the  file.      
Scan:  1  CMOST  parameter  (or  2…5  parameters)  will  be  changed  
2D   scan:   2   parameters   will   be   individually   changed   (generating   a   2D   matrix   of  
parameters)  
3D   scan:   3   parameters   will   be   individually   changed   (generating   a   3D   matrix   of  
parameters)  
Number   variables:   only   active   for   “scan”   (single   variable).   Up   to   5   individual  
variables  can  be  chosen  (Variable  1…5).    
Number steps: Determines the number of individual steps in which a variable will be
tested (for instance, a single screening colonoscopy could be tested in 50
individual steps from age 30 to age 79). In this example, 50 settings files would be
generated.
Number steps 2 and 3: Number of individual steps for the 2nd and 3rd variable (for
instance, 3 individual screening colonoscopies could each be tested at age 30 to 79,
and 50x50x50 = 125’000 individual settings files would be generated).
Choose: Choose an individual parameter that will be varied. This requires
navigating through the variables of CMOST. However, the naming of the variables is
usually self-explanatory and mainly follows the structure of the various graphical
user interfaces. In case of doubt, the settings files generated can be reloaded in
CMOST and the graphical user interfaces can be inspected.
Subposition: Several variables have subpositions (for instance, screening has
options for percentage of population, adherence etc., as in the “Screening”
graphical user interface). In the example with 3 screening colonoscopies the
variable name would be “Screening.Colonoscopy” and the subpositions 3, 4 and 5.
Minimum, Maximum: minimum and maximum value of the respective variable (for the
example with screening colonoscopies, minimum would be 30 and maximum 79).
Adjust: before files can be generated, CMOST asks the user to confirm (possibly
after adjustment) the individual values for each step (for instance, for screening
colonoscopies a year of 29.1 would be meaningless).
Filename: base filename for the files to be generated.
Linker: string used in the file names, positioned between the base filename and the
number of an individual file.
Browse: Browse for a path in which to save the settings files to be generated.
Economy: if checked, duplicate combinations (for instance, 2 colonoscopies at ages
50/60 and 60/50) will be avoided. This reduces the number of files and the time for
calculations.
Number repeats: Each settings file can be generated once or with a given number of
repeats. Repeating calculations (and subsequently averaging the results) will
reduce noise.
Linker repeats: Each file generated as a repeat will have the repeat number
attached to its file name, with a linker string between the filename and this
number. This linker can be selected.
Create files: create settings files in the respective folder.
Return: return to main menu.
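
The way these options determine the number of settings files can be sketched in a few lines of Python. This is only an illustration and not CMOST code: the function names (scan_values, scan_grid, file_names), the rounding to whole numbers and the file-name pattern are assumptions made for this example, and dropping tuples with a repeated value in economy mode is likewise an assumption.

```python
from itertools import combinations, product


def scan_values(minimum, maximum, steps):
    """Evenly spaced values from minimum to maximum, rounded to whole numbers."""
    if steps == 1:
        return [round(minimum)]
    width = (maximum - minimum) / (steps - 1)
    return [round(minimum + i * width) for i in range(steps)]


def scan_grid(values, dimensions, economy=False):
    """Parameter tuples for a 1D, 2D or 3D scan over one list of values.

    With economy=True, tuples that differ only in order (50/60 and 60/50)
    are listed once. Tuples with a repeated value are dropped as well,
    which is an assumption of this sketch.
    """
    if economy:
        return list(combinations(values, dimensions))
    return list(product(values, repeat=dimensions))


def file_names(base, linker, count, repeats=1, linker_repeats="_"):
    """Hypothetical naming scheme: base + linker + number (+ repeat linker + repeat)."""
    names = []
    for number in range(1, count + 1):
        stem = f"{base}{linker}{number}"
        if repeats == 1:
            names.append(stem)
        else:
            names.extend(f"{stem}{linker_repeats}{r}" for r in range(1, repeats + 1))
    return names


ages = scan_values(30, 79, 50)                 # 50 steps from age 30 to age 79
print(len(scan_grid(ages, 3)))                 # 125000 files for a full 3D scan
print(len(scan_grid(ages, 3, economy=True)))   # 19600 files without duplicates
```

Under these assumptions, the economy setting reduces the 3D example from 125,000 to 19,600 parameter combinations, which shows why it shortens calculation time considerably.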

IX.4. Simulating large populations


Colorectal cancer incidence is approximately 5%, and the incidence reduction could
be about 30% of this, which is about 1.5% of the population. The risks of
colonoscopy are even smaller. Because these percentages are small, calculations on
a small population of 100,000 can be affected by stochastic errors. Since CMOST is
a theoretical model, it is better to perform the calculations on large populations
of a few million in order to obtain stable predictions.
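
The size of these stochastic errors can be estimated with a short Python sketch. It is not part of CMOST and rests on simplifying assumptions: each simulated individual is treated as an independent binomial event, and the incidence reduction is taken as the difference between two independent runs (one without and one with screening). The function names are chosen for this example only.

```python
import math


def relative_standard_error(population, probability):
    """Relative standard error of a binomial event count: sqrt((1 - p) / (n * p))."""
    return math.sqrt((1.0 - probability) / (population * probability))


def reduction_relative_error(population, p_without, p_with):
    """Relative standard error of the difference between two independent runs."""
    variance = population * (
        p_without * (1.0 - p_without) + p_with * (1.0 - p_with)
    )
    return math.sqrt(variance) / (population * (p_without - p_with))


for population in (100_000, 1_000_000, 4_000_000):
    incidence = relative_standard_error(population, 0.05)
    reduction = reduction_relative_error(population, 0.05, 0.035)
    print(f"n={population:>9,}  incidence: {incidence:.2%}  reduction: {reduction:.2%}")
```

Under these assumptions, the estimated incidence reduction carries a relative error of roughly 6% for a population of 100,000, falling below 1% for a population of 4 million, since the error shrinks with the square root of the population size.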
 

However, the memory requirements are very high, and a typical calculation for a
population of 100,000 requires more than 2 GB of memory. To avoid memory and
processing limitations when performing calculations on large populations, we split
the population into multiple jobs. For example, to perform a calculation for a
population of 300,000 using the settings of CMOST13.mat, CMOST13 can be saved as
CMOST13_1, CMOST13_2 and CMOST13_3 (each covering a population of 100,000) and
calculated using the cluster or the Starter option.
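
The bookkeeping for such a split can be sketched as follows. This Python helper is hypothetical and not part of CMOST; it only lists the job names and the population each job would cover, following the CMOST13_1, CMOST13_2, ... naming used in the example above.

```python
def split_jobs(base_name, total_population, population_per_job):
    """Return (settings name, population) pairs that together cover the total."""
    # Ceiling division with integers avoids floating-point rounding issues.
    n_jobs = (total_population + population_per_job - 1) // population_per_job
    jobs = []
    remaining = total_population
    for index in range(1, n_jobs + 1):
        size = min(population_per_job, remaining)
        jobs.append((f"{base_name}_{index}", size))
        remaining -= size
    return jobs


for name, size in split_jobs("CMOST13", 300_000, 100_000):
    print(name, size)
```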
 
Results have to be summarized using custom-made scripts. Examples of such scripts
can be found in the Scripts folder in the CMOST folder. These scripts can be used
as a guide and modified to create scripts for other purposes.
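
Independently of the scripts in the Scripts folder, the basic idea of summarizing split jobs can be illustrated in Python. The sketch assumes that each job reports its population size and an event count; the dictionary keys and the numbers are placeholders for illustration only and are not CMOST output or CMOST variable names.

```python
def pool_results(job_results):
    """Combine per-job event counts into totals and an overall rate."""
    population = sum(job["population"] for job in job_results)
    cases = sum(job["crc_cases"] for job in job_results)
    return {
        "population": population,
        "crc_cases": cases,
        "crc_rate": cases / population,
    }


# Placeholder numbers for illustration only, not results of a CMOST run.
example_jobs = [
    {"population": 100_000, "crc_cases": 5_010},
    {"population": 100_000, "crc_cases": 4_975},
    {"population": 100_000, "crc_cases": 5_015},
]

print(pool_results(example_jobs))
```

Summing counts before dividing, rather than averaging the per-job rates, weights every job by its population, so the result stays correct even if the jobs differ in size.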
 
 
