Data Management & Warehousing

http://www.datamgmt.com

An introduction to Process Neutral Data Modelling
© 2006 Data Management & Warehousing Speaker: David M. Walker UKOUG: Business Intelligence & Reporting Tools SIG Institute of Physics, 76 Portland Place, London Page 1 of 19 31 January 2006

Data Management & Warehousing
•  Founded 1995 by David Walker
–  Operates with up to 15 consultants

•  Specialists in Enterprise Data Warehousing
•  Clients have included:
–  Manufacturing: Diageo, Mars ISI
–  Retail: Albert Heijn, Nectar
–  Financial: Virgin Money
–  Transport: Network Rail, Swissair
–  Telco: Turkcell, Swisscom Mobile, Telkom SA

What is Process Neutral Modelling?
•  A method of designing a data model for a data warehouse that is less affected by changes in source system and/or business process
•  A technique that incorporates the metadata within the data model (in a similar way to XML, which incorporates metadata in a data file)
•  A consistent, self-similar modelling method that allows easy model management in data warehouses

Where would you use it?
•  Data Warehouses that:
–  Feed multiple data marts
–  Have many source systems that are poorly integrated
–  Are in organisations undergoing large business process change
–  Support a recognised need for integrated business intelligence

•  But not in organisations that:
–  are small and can’t afford Enterprise Data Warehousing
–  have only one or a few source systems with little external data
–  have very stable business processes
–  want to build an Online Transaction Processing (OLTP) system for reporting

Overcomes Some DWH Requirements Issues
•  Removes the need to define certain terms from the requirements precisely in the data model, e.g.
•  Define CUSTOMER
–  Marketing say it is everyone they communicate with
–  Sales say it is everyone in their prospect database
–  Customer Support say it is people who have bought the product
–  Service Team say it is people who have a support contract


Major Entities
•  Rules
–  Lifetime value attributes only
–  Always has a start date and an optional end date

•  Examples
–  Party
–  Geography
–  Calendar
–  Electronic Address
–  Product
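
The two rules above can be sketched in Python. This is only an illustration, not the implementation behind the talk: the class name Party, its fields, the sample values and the choice to treat the end date as inclusive are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Party:
    """A major entity row: lifetime attributes plus a validity period."""
    party_id: int
    party_type: str                  # e.g. "Individual"
    date_of_birth: Optional[date]    # hypothetical lifetime attribute
    start_date: date                 # always present
    end_date: Optional[date] = None  # optional: None means still current

    def is_current(self, on: date) -> bool:
        # Assumption: the end date, when set, is inclusive.
        return self.start_date <= on and (
            self.end_date is None or on <= self.end_date
        )


# Invented example row.
alice = Party(1, "Individual", date(1970, 5, 17), start_date=date(2001, 3, 1))
```

Attributes that change over time are deliberately absent from this row; only values that hold for the lifetime of the entity are stored on it.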


Major Entity Types

•  Rules
–  List of valid types and when they are valid (metadata)

•  Examples
–  Party
•  Individual, Sole Trader, Partnership, Ltd Co, Plc, Trust

–  Geography
•  PAF Address, Co-ordinate Point
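
A type table can be pictured as a list of codes with validity dates. The sketch below assumes hypothetical names (EntityType, valid_codes) and invented dates; it only shows how "valid types and when they are valid" might be checked.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class EntityType:
    """One row of a _TYPE table: a valid type and when it is valid."""
    code: str
    valid_from: date
    valid_to: Optional[date] = None  # None means no end of validity yet


def valid_codes(types: List[EntityType], on: date) -> List[str]:
    """Return the type codes that are valid on the given date."""
    return [
        t.code
        for t in types
        if t.valid_from <= on and (t.valid_to is None or on <= t.valid_to)
    ]


# Hypothetical PARTY_TYPE rows; the dates are invented for illustration.
party_types = [
    EntityType("Individual", date(1995, 1, 1)),
    EntityType("Plc", date(1995, 1, 1)),
    EntityType("Trust", date(1995, 1, 1), valid_to=date(2004, 12, 31)),
]
```

Because the list of types is data rather than structure, adding a new party type is an insert into this table, not a change to the model.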

Major Entity Properties

•  Rules
–  Attributes of the major entity that change over time; the attributes are listed in the ‘Type table’ and the property table holds their association with the major entity

•  Examples
–  Party
•  Individual: Marital Status, Income
•  Plc: Turnover, Number of employees
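
A property table can be sketched as one row per entity, property and period. The names below (PartyProperty, property_as_of) and the sample values are assumptions made for the example; the end date is treated as inclusive.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class PartyProperty:
    party_id: int
    property_type: str   # expected to be listed in the property type table
    value: str
    start_date: date
    end_date: Optional[date] = None  # None means the value is still current


def property_as_of(
    rows: List[PartyProperty], party_id: int, property_type: str, on: date
) -> Optional[str]:
    """Return the value of one property for one party on a given date."""
    for row in rows:
        if (
            row.party_id == party_id
            and row.property_type == property_type
            and row.start_date <= on
            and (row.end_date is None or on <= row.end_date)
        ):
            return row.value
    return None


# Invented history: the individual's marital status changes once.
rows = [
    PartyProperty(1, "Marital Status", "Single", date(2001, 3, 1), date(2004, 6, 11)),
    PartyProperty(1, "Marital Status", "Married", date(2004, 6, 12)),
]
```

Each change of value adds a new row, so the history of the attribute is kept without altering the major entity itself.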

Major Entity Events

•  Rules
–  Things that happen to a major entity

•  Examples
–  Party
•  Individual: Marriage

–  Address
•  Change of use approved

Major Entity Links

•  Rules
–  Relates two entries in a major entity; the relationship is defined by the type table

•  Examples
–  Party
•  Individual 1 is married to individual 2
•  Individual 1 is employed by Organisation 3
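
The two examples can be written as link rows whose meaning comes from a link type. This is a sketch under assumptions: PartyLink, LINK_TYPES and linked_parties are hypothetical names, and the dates are invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical link type table; it defines which relationships are allowed.
LINK_TYPES = {"is married to", "is employed by"}


@dataclass(frozen=True)
class PartyLink:
    from_party_id: int
    link_type: str
    to_party_id: int
    start_date: date
    end_date: Optional[date] = None

    def __post_init__(self) -> None:
        # A link is only meaningful if its type is defined in the type table.
        if self.link_type not in LINK_TYPES:
            raise ValueError(f"unknown link type: {self.link_type!r}")


def linked_parties(links: List[PartyLink], party_id: int, link_type: str) -> List[int]:
    """Return the parties that party_id is linked to by the given link type."""
    return [
        link.to_party_id
        for link in links
        if link.from_party_id == party_id and link.link_type == link_type
    ]


links = [
    PartyLink(1, "is married to", 2, date(2004, 6, 12)),
    PartyLink(1, "is employed by", 3, date(2002, 9, 1)),
]
```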

Major Entity Segments

•  Rules
–  Creates a collection of entries from a major entity

•  Examples
–  Party
•  Marketing Group 1: Males >40 with 1 or more children (data derived from the other tables, e.g. properties and links)
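
The Marketing Group 1 rule could be derived roughly as follows. The data layout is an assumption made to keep the sketch short: gender and date of birth are held in a dictionary per party, and children are represented by hypothetical (parent, child) link pairs.

```python
from datetime import date
from typing import Dict, List, Tuple


def age_on(date_of_birth: date, on: date) -> int:
    """Age in whole years on the given date."""
    before_birthday = (on.month, on.day) < (date_of_birth.month, date_of_birth.day)
    return on.year - date_of_birth.year - int(before_birthday)


def marketing_group_1(
    parties: Dict[int, dict], parent_links: List[Tuple[int, int]], on: date
) -> List[int]:
    """Males over 40 with one or more children, as at the given date."""
    parents = {parent_id for parent_id, _child_id in parent_links}
    return sorted(
        party_id
        for party_id, attrs in parties.items()
        if attrs["gender"] == "M"
        and age_on(attrs["date_of_birth"], on) > 40
        and party_id in parents
    )


# Invented sample data.
parties = {
    1: {"gender": "M", "date_of_birth": date(1960, 2, 1)},
    2: {"gender": "M", "date_of_birth": date(1980, 8, 9)},
    3: {"gender": "F", "date_of_birth": date(1955, 4, 3)},
    4: {"gender": "M", "date_of_birth": date(1962, 7, 4)},
}
parent_links = [(1, 10), (2, 12), (3, 11)]  # (parent party, child party)
```

The segment stores only the resulting collection of parties; the rule that produced it stays outside the major entity.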

The Major Entity Collection


Major Entity / Major Entity History

•  Rules
–  Relates two different major entities via a history type

•  Examples
–  Party / Address
•  Individual 1 lives at Address 2
•  Individual 3 works at Address 4

Occurrences and Major Entities

•  Rules
–  These are the tables which define interactions between all the major entities

•  Examples
–  Sales
•  Party 1 is the supplier
•  Party 2 is the customer
•  Address 3 is the store location
•  Product 4 is the item purchased
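
The Sales example can be sketched as an occurrence row that references each major entity in a named role. The field names and the amount measure (held in pence) are assumptions for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class SaleOccurrence:
    supplier_party_id: int   # Party in the supplier role
    customer_party_id: int   # Party in the customer role
    store_address_id: int    # Address in the store location role
    product_id: int          # Product purchased
    occurred_on: date
    amount_pence: int        # hypothetical measure


def sales_by_customer(sales: List[SaleOccurrence]) -> Dict[int, int]:
    """Total sales amount per customer party."""
    totals: Dict[int, int] = defaultdict(int)
    for sale in sales:
        totals[sale.customer_party_id] += sale.amount_pence
    return dict(totals)


# Invented rows; the first mirrors the roles listed in the example above.
sales = [
    SaleOccurrence(1, 2, 3, 4, date(2006, 1, 30), 1299),
    SaleOccurrence(1, 2, 3, 4, date(2006, 1, 31), 701),
    SaleOccurrence(1, 5, 3, 4, date(2006, 1, 31), 500),
]
```

Note that the same Party entity appears twice in one occurrence, once per role, which is why the roles are named on the occurrence rather than on the party.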

Key Elements
•  Self Similar modelling
–  All _TYPE tables have the same structure, etc.
–  Naming conventions are consistent everywhere

•  Insert ‘heavy’ / Update ‘light’
–  Most ETL will result in an insert; there will be very few updates

•  Manages ‘Slowly Changing Dimensions’
–  Inherent in the Major Entity Collection
–  Significantly reduces overhead in the Data Mart build

•  Data Driven
–  Types provide metadata

•  Natural Star Schemas
–  Occurrences will map to FACTS, Major Entity Collections will collapse into DIMENSIONS
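
Self-similar modelling means the same table shape can be generated for every major entity. The sketch below uses Python's sqlite3 module with an in-memory database; the column names in the template are assumptions, not the naming convention used in the talk.

```python
import sqlite3

# One template for every _TYPE table; only the entity name varies.
# Column names are hypothetical.
TYPE_TABLE_DDL = """
CREATE TABLE {name}_TYPE (
    {name}_TYPE_ID INTEGER PRIMARY KEY,
    TYPE_NAME      TEXT NOT NULL,
    START_DATE     TEXT NOT NULL,
    END_DATE       TEXT
)
"""


def create_type_tables(conn: sqlite3.Connection, entities) -> None:
    """Create one identically structured _TYPE table per major entity."""
    for name in entities:
        conn.execute(TYPE_TABLE_DDL.format(name=name))


def column_shapes(conn: sqlite3.Connection, table: str):
    """Return (type, notnull, pk) per column, ignoring column names."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return [(row[2], row[3], row[5]) for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
create_type_tables(conn, ["PARTY", "GEOGRAPHY", "PRODUCT"])
```

Comparing column_shapes for PARTY_TYPE and GEOGRAPHY_TYPE shows the two tables differ only in their names, which is what makes generic ETL and model management code possible.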

Pros & Cons
•  Development Cost front-loaded
–  Most of the costs are in the early part of the (ETL) development; later stages are then quicker. This will put some organisations off

•  Pivoting Data vs. Slowly Changing Dimensions
–  Questions are raised about the cost of loading ‘property tables’ and ‘pivoting’ data. In reality this cost is easily offset by avoiding the extra code and effort of managing slowly changing dimensions
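
‘Pivoting’ here means turning the narrow property rows into one wide row per entity for the data mart. A minimal sketch, assuming property rows are tuples of (party id, property, value, start date, end date) with an inclusive end date:

```python
from datetime import date
from typing import Dict, Iterable, Optional, Tuple

PropertyRow = Tuple[int, str, str, date, Optional[date]]


def pivot_properties(property_rows: Iterable[PropertyRow], on: date) -> Dict[int, Dict[str, str]]:
    """Collapse property rows into one wide record per party, as at a date."""
    dimension: Dict[int, Dict[str, str]] = {}
    for party_id, prop, value, start, end in property_rows:
        if start <= on and (end is None or on <= end):
            dimension.setdefault(party_id, {})[prop] = value
    return dimension


# Invented sample rows.
property_rows = [
    (1, "Marital Status", "Single", date(2001, 3, 1), date(2004, 6, 11)),
    (1, "Marital Status", "Married", date(2004, 6, 12), None),
    (1, "Income", "30000", date(2001, 3, 1), None),
]
```

Running the same pivot for different dates gives the dimension as it stood at each point in time, which is the behaviour a slowly changing dimension would otherwise need dedicated code to maintain.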


Pros & Cons (cont.)
•  Two stage process: Source -> TR -> Mart
–  Design patterns exist to mitigate this
–  Allows loading whilst users continue to work
–  Allows for the development of flip-flop marts
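
The slides do not define ‘flip-flop marts’. The sketch below assumes the usual reading: two copies of a mart are kept, users query the active copy while the other is reloaded, and the two are then swapped. The class and method names are hypothetical.

```python
from typing import List


class FlipFlopMart:
    """Two copies of a mart: one is queried while the other is loaded."""

    def __init__(self) -> None:
        self._copies: List[list] = [[], []]
        self._active = 0  # index of the copy users currently query

    def query(self) -> list:
        # Users always read the active copy.
        return list(self._copies[self._active])

    def load(self, rows: list) -> None:
        # Loading replaces the offline copy; the active copy is untouched.
        self._copies[1 - self._active] = list(rows)

    def flip(self) -> None:
        # Swap roles once the offline copy is fully loaded.
        self._active = 1 - self._active


mart = FlipFlopMart()
mart.load(["sales up to 30 January"])
mart.flip()
mart.load(["sales up to 30 January", "sales for 31 January"])
visible_during_load = mart.query()  # still the previous load
mart.flip()
visible_after_flip = mart.query()
```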

•  Larger Initial Data Volumes
–  But smaller over the long term due to data sparsity


Is this all there is to it?
•  At a high level – YES
•  BUT:
–  There are methods for dealing with data quality
–  Special case methods for some lifetime attributes
•  e.g. Handling women changing their names at marriage

–  Insert/Update methods for performance
–  Design Patterns for implementation
–  Other detailed techniques

•  This talk could only ever be:

“An introduction to Process Neutral Data Modelling”

Data Management & Warehousing
Thank you!
•  For more information:
–  Visit our website at http://www.datamgmt.com
–  Call us on 07050 028 911
–  E-mail davidw@datamgmt.com

Winning Teams - Great Team Players
Data Management & Warehousing are proud player sponsors for the 2005/06 season of Joe Worsley, utility back row with the English Rugby Premiership Champions London Wasps. Joe has helped London Wasps win the Zurich Premiership in 2002-03, 2003-04 and 2004-05 as well as the Heineken Cup in 2003-04. Joe was also a member of the England World Cup squad and was awarded an MBE by the Queen.