Dec 09 PDF

B.E. Seventh Semester Examination, Dec.-2009
Data Warehousing and Data Mining
Note : Attempt any five questions. All questions carry equal marks.

Q.1.(a) Differentiate between the following : Database, Data Warehouse, Data Mining, KDD.

Ans. Database : A database is an organized body of related information. It is a collection of data for one or more uses. One way of classifying databases involves the type of content, for example bibliographic, full-text, numeric or image.

Data Warehouse : A data warehouse is a repository of an organization's electronically stored data. Data warehouses are designed to facilitate reporting and analysis. This definition of the data warehouse focuses on data storage. However, the means to retrieve and analyze data, to extract, transform and load data, and to manage the data dictionary are also considered essential components of a data warehousing system.

Data Mining : Data mining is the process of extracting patterns from data. It is becoming an increasingly important tool to transform data into information. It is commonly used in a wide range of profiling practices, such as marketing, surveillance, fraud detection and scientific discovery.

KDD : KDD stands for knowledge discovery in databases. The term is closely associated with large databases and the automated discovery of patterns and relationships. KDD is "the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data."

Fig. KDD process (stage labels legible in the source: reduction and coding, transformed data, data mining, visualization and reporting, knowledge)

Q.1.(b) Explain the concept of star, snowflake and galaxy schema with the help of suitable example.

Ans. Star Schema : A single fact table with a set of dimension tables linked to it.
(i) There is a central large fact table with no redundancy.
(ii) Each tuple in the fact table has a foreign key to a dimension table, which describes the details of that dimension.
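The star layout described above can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch: the table names (sales_fact, time_dim, branch_dim, location_dim), the reduced column lists and every row of data are assumptions made for the example, not values taken from the paper.

```python
import sqlite3


def build_star_schema() -> sqlite3.Connection:
    """Create a small in-memory star schema with made-up rows."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE time_dim (
            time_key INTEGER PRIMARY KEY, month INTEGER, quarter TEXT, year INTEGER);
        CREATE TABLE branch_dim (
            branch_key INTEGER PRIMARY KEY, branch_name TEXT, branch_type TEXT);
        CREATE TABLE location_dim (
            location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
        -- Central fact table: one foreign key per dimension, plus the measures.
        CREATE TABLE sales_fact (
            time_key     INTEGER NOT NULL REFERENCES time_dim(time_key),
            branch_key   INTEGER NOT NULL REFERENCES branch_dim(branch_key),
            location_key INTEGER NOT NULL REFERENCES location_dim(location_key),
            dollars_sold REAL,
            units_sold   INTEGER);
        """
    )
    # Hypothetical rows, for illustration only.
    conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?, ?)",
                     [(1, 1, "Q1", 2009), (2, 4, "Q2", 2009)])
    conn.executemany("INSERT INTO branch_dim VALUES (?, ?, ?)",
                     [(1, "B1", "retail")])
    conn.executemany("INSERT INTO location_dim VALUES (?, ?, ?)",
                     [(1, "Delhi", "India"), (2, "Mumbai", "India")])
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
                     [(1, 1, 1, 100.0, 10), (1, 1, 2, 50.0, 5), (2, 1, 1, 200.0, 20)])
    conn.commit()
    return conn


def sales_by_quarter(conn: sqlite3.Connection) -> list:
    """Join the fact table to the time dimension and total the sales per quarter."""
    rows = conn.execute(
        """
        SELECT t.quarter, SUM(f.dollars_sold)
        FROM sales_fact AS f
        JOIN time_dim AS t ON f.time_key = t.time_key
        GROUP BY t.quarter
        ORDER BY t.quarter
        """
    )
    return rows.fetchall()


if __name__ == "__main__":
    print(sales_by_quarter(build_star_schema()))
```

Every analytical query follows the same shape: start from the fact table and join outwards to whichever dimension supplies the grouping attribute. A snowflake variant would split a dimension such as location_dim into further normalized tables, at the cost of one more join.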
Fig. Star schema of a sales data warehouse (source: Data Mining: Concepts and Techniques; attributes legible in the source)
  Time dimension table : Time-key, Day, Day-of-the-week, Month, Quarter, Year
  Sales fact table : Time-key, Branch-key, Location-key, Dollars-sold, Units-sold
  Item dimension table : (attributes only partly legible)
  Branch dimension table : Branch-key, Branch-name, Branch-type
  Location dimension table : City, Province-or-state, Country

Q.2.(a) Explain in detail the three-tier Data Warehouse architecture. How a query is mapped between three tiers, explain.

Ans. Data warehouses adopt a three-tier architecture. The tiers are :

Bottom Tier (data warehouse server) :
(i) This is the warehouse database server.
(ii) Data is fed into it using back-end tools and utilities.
(iii) Data is extracted using programs called gateways.
(iv) It also contains the metadata repository.

Middle Tier : The middle tier is an OLAP server that is typically implemented using either a Relational OLAP (ROLAP) model, that is, an extended relational DBMS that maps operations on multidimensional data to standard relational operations; or a Multidimensional OLAP (MOLAP) model, that is, a special-purpose server that directly implements multidimensional data and operations.

Top Tier : The top tier is a front-end client layer, which contains query and reporting tools, analysis tools and/or data mining tools.

Fig. Three-tier data warehouse architecture (labels legible in the source: Analysis, Data Mining, Top Tier, Front-end Tools, Operational Databases, External, Server)

Snowflake Schema (Q.1.(b), continued) : A single fact table with n dimension tables organized as a hierarchy. Some of the dimension tables are normalized, thus splitting data into additional tables.

Fig. Snowflake schema (attributes legible in the source)
  Time dimension table : Day, Day-of-the-week, Month, Quarter, Year
  Sales fact table : Branch-key, Location-key, Dollars-sold, Units-sold
  Item dimension table : Item-name (other attributes only partly legible)
  Supplier dimension table : Supplier-type
  Branch dimension table : Branch-key, Branch-name, Branch-type
  Location dimension table : Location-key, Street, City

Galaxy Schema :
* Also known as fact constellation schema.
* Multiple fact tables share dimension tables.
In the figure given below, the 'sales' fact table and the 'shipping' fact table share the dimension tables.

Fig. Galaxy (fact constellation) schema (attributes legible in the source)
  Time dimension table : Time-key, Day, Day-of-the-week, Quarter
  Sales fact table : (attributes not legible)
  Item dimension table : Item-key, Item-name, Brand, Type, Supplier-type
  Shipping fact table : Time-key, Shipper-key, From-location, To-location, Dollars-cost, Units-shipped
  Branch dimension table : Branch-key, Branch-name, Branch-type
  Location dimension table : Location-key, Province-or-state, Country
  Shipper dimension table : Shipper-key, Shipper-name

Q.2.(b) Discuss various OLAP operations which can be performed on a multidimensional data cube.

Ans. OLAP Operations : The analyst can understand the meaning contained in the databases using multidimensional analysis. By aligning the data content with the analyst's mental model, the chances of confusion and erroneous interpretations are reduced. The analyst can navigate through the database and screen for a particular subset of the data, changing the data's orientations and defining analytical relations. The user-initiated process of navigating by calling for page displays interactively, through the specification of slices via rotations and drill down/up, is sometimes called "slice and dice". Common operations include slice and dice, drill down, roll up and pivot.

Slice : A slice is a subset of a multidimensional array corresponding to a single value for one or more members of the dimensions not in the subset.

Dice : The dice operation is a slice on more than two dimensions of a data cube.

Drill Down/Up : Drilling down or up is a specific analytical technique whereby the user navigates among levels of data ranging from the most summarized (up) to the most detailed (down).

Roll up : A roll up involves computing all of the data relationships for one or more dimensions. To do this, a computational relationship or formula might be defined.

Pivot : This operation is also called the rotate operation. It rotates the data in order to provide an alternative presentation of the data, that is, it changes the dimensional orientation of a report or page display.

Q.3. Suppose a database has four transactions. Let min-support = 60%, min-confidence = 80%.

  TID    Date         Items bought
  T100   (illegible)  {A, B, ...}
  T200   15/10/08     {D, A, ?, E, B}
  T300   19/10/08     {?, A, B, E}
  T400   22/10/08     {B, A, D}

(i) Find all frequent itemsets using the Apriori algorithm.
(ii) List all strong association rules matching the following meta-rule, where X is a variable representing customers and item_i denotes variables representing items (e.g., A, B, etc.) :

$\forall X \in \text{transaction},\ \text{buys}(X, \text{item}_1) \wedge \text{buys}(X, \text{item}_2) \Rightarrow \text{buys}(X, \text{item}_3)$

Ans. (i) The Apriori algorithm employs breadth-first search (BFS) and uses a hash tree structure to count candidate itemsets.
FI (Frequent Itemset) = {A, B, D}

(ii) Association Rules : For a rule $X \Rightarrow Y$, where $X, Y \subseteq I$ and $X \cap Y = \emptyset$,

$\text{conf}(X \Rightarrow Y) = \dfrac{\text{supp}(X \cup Y)}{\text{supp}(X)}$

Meta Rules : Related measures are

$\text{lift}(X \Rightarrow Y) = \dfrac{\text{supp}(X \cup Y)}{\text{supp}(X)\,\text{supp}(Y)}$

$\text{conv}(X \Rightarrow Y) = \dfrac{1 - \text{supp}(Y)}{1 - \text{conf}(X \Rightarrow Y)}$

Q.4. Explain the concept of Query Language employed in data mining and standardization of data mining. How pattern presentation and visualization specification can be carried out in Data Mining Query Language?

Ans. Data mining query languages are based on MINE RULE. MINE RULE has been designed at the University of Torino and the Politecnico di Milano. It is an extension of SQL, which is coupled with a relational DBMS. Data can be selected using the full power of SQL. Mined association rules are materialized into relational tables as well. MINE RULE extracts association rules between values of attributes in a relational table. However, it is up to the user to specify the form of the rules to be extracted. The user can specify the cardinality of the body and head of the desired rules and the attributes on which rule components can be built.

An interesting aspect of MINE RULE is that it is possible to work on different levels of grouping during the extraction. If there is one level of grouping, rule support will be computed w.r.t. the number of groups in the table. Defining a second level of grouping leads to the definition of clusters. Rule components can be taken from two different clusters, possibly ordered, inside the same group. It is thus possible to extract some elementary sequential patterns (by clustering on a time-related attribute). For instance, grouping purchases by customer, one can find that customers who first buy butter and milk tend to buy oil afterwards. Concerning interestingness measures, MINE RULE enables the user to specify minimal frequency and confidence thresholds. The general syntax of a MINE RULE query for extracting rules is :

  MINE RULE ... [...] ... AS BODY, [...] ... AS HEAD [, SUPPORT] [, CONFIDENCE]
  FROM <Table> [WHERE ...]
  ... [CLUSTER BY ... [HAVING ...]]
  ... CONFIDENCE: ...
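The role of the minimum support and confidence thresholds, both in Q.3 and in a MINE RULE query, can be illustrated with a small level-wise (Apriori-style) search in Python. This is a minimal sketch under stated assumptions: the four transactions below are hypothetical and are not the ones printed in Q.3, the function names apriori and strong_rules are invented for the example, and candidates are counted by a plain scan rather than with a hash tree.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (level-wise search)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        k += 1
        previous = set(level)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [
            c for c in candidates
            if all(frozenset(s) in previous for s in combinations(c, k - 1))
            and support(c) >= min_support
        ]
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (body, head, support, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, supp in frequent.items():
        for size in range(1, len(itemset)):
            for body_items in combinations(sorted(itemset), size):
                body = frozenset(body_items)
                head = itemset - body
                # conf(body => head) = supp(body U head) / supp(body)
                confidence = supp / frequent[body]
                if confidence >= min_confidence:
                    rules.append((body, head, supp, confidence))
    return rules


if __name__ == "__main__":
    # Hypothetical transactions, for illustration only.
    demo = [{"A", "B", "C"}, {"A", "B", "D"}, {"A", "B", "D", "E"}, {"B", "D"}]
    found = apriori(demo, min_support=0.6)
    for body, head, supp, conf in strong_rules(found, min_confidence=0.8):
        print(sorted(body), "=>", sorted(head), supp, conf)
```

With these made-up transactions and the thresholds from Q.3 (60% support, 80% confidence), the frequent itemsets are {A}, {B}, {D}, {A, B} and {B, D}; the candidate {A, B, D} is pruned because its subset {A, D} is not frequent. Only the rules A => B and D => B reach the confidence threshold.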