Data Warehousing Fundamentals

Course:  DWHFUN
Duration:  3 Days
Level:  I
Course Summary

This dimensional modelling techniques course is designed to answer questions such as the following:

  • What is data warehousing?
  • What is a data mart?
  • What are the data modelling options?
  • What is Extract, Transform and Load (ETL)?
  • What are the terms and concepts specific to data warehousing and OLAP design?
  • How to plan and implement a data warehouse with high availability, simplified manageability and optimal performance
  • What are common statistics, analytic and OLAP SQL queries?

The course is suitable for all popular DBMS platforms, including DB2, Oracle,SQL Server, Sybase, etc.

« Hide The Details
Topics Covered In This Course

DATA WAREHOUSING OVERVIEW

  • Overview
  • Typical uses

DEFINITION, ARCHITECTURE AND CONCEPTS

  • Enterprise Data Model
  • Operational vs. historical data
  • Extract Transform Load (ETL)
  • Metadata
  • Data warehouse vs. data mart
  • Data mining
  • OLAP vs. OLTP
  • Massive size implementation
  • Logical design vs. physical design
  • Normalization vs. denormalization
  • Referential constraints

DATA MODELLING OPTIONS

  • Entity model
  • Star schema
  • Snowflake schema

DIMENSIONAL MODELLING DEVELOPMENT LIFE CYCLE

  • Requirements analysis
  • Requirements gathering
  • Requirements validation
  • Requirements modelling
  • Schema design
  • Project definition
  • Warehouse design
  • Implementation
  • Follow-up and review

DIMENSIONAL MODELLING DESIGN

  • Overview
  • Metadata properties
  • Star schema
  • Snowflake schema
  • Cubes
  • Measures and facts
  • Attributes and relationships
  • Dimension
  • Hierarchies
  • Role-playing dimensions
  • Joins
  • Summary tables and aggregation
  • Exercises

CASE STUDY

  • Project definition and scoping
  • Specify the requirements
  • Specify the grain (e.g., fact table types)
  • Specify the dimensions (e.g., handling slowly changing dimensions)
  • Specify the facts (e.g., conformed facts)

IMPLEMENTATION OPTIONS

  • Overview
  • Top down
  • Bottom up
  • Sizing
  • Cleaning
  • Populating the data warehouse

EXTRACT, TRANSFORM, LOAD (ETL) TERMS AND CONCEPTS

  • Options
  • Extraction options
  • Transformation options
  • Loading options
  • Change Data Capture and publishing
  • Staging areas

EXTRACTING

  • Logical-to-physical data mapping
  • Disparate (heterogeneous) data sources
  • Extracting changes data – delta or other

DATA CLEANING AND CONFORMING

  • Data quality criteria
  • Design methods and alternatives
  • Cleaning deliverables
  • Conforming dimension tables
  • Conforming fact tables

DIMENSION TABLE DELIVERY

  • Dimension table structure
  • Surrogate key generation
  • Dimension table grain
  • Flat (denormalized) or snowflake?
  • Data and time dimensions
  • 'Big' vs. 'small' dimensions
  • Dimensional roles
  • Dimensions as subdimensions
  • Degenerate dimensions

SLOWLY CHANGING DIMENSIONS

  • Type 1
  • Type 2
  • Type 3
  • Hybrid
  • Late arrivals

MULTIVALUED DIMENSIONS

  • Definition
  • Bridge tables

FACT TABLE DELIVERY

  • Fact table structure
  • Referential integrity (RI)
  • Surrogate key derivation and flow
  • Fundamental grain
  • Transaction fact tables
  • Factless fact tables
  • Periodic snapshots
  • Accumulating snapshots

FACT TABLE LOAD CONSIDERATIONS

  • Index management
  • Partition management
  • Updates, deletes and inserts
  • Recovery
  • Summary tables
  • Parallelism

DATA WAREHOUSE PERFORMANCE DESIGN

  • Materialized views
  • Large concurrent reports
  • Short running queries
  • Long running queries
  • Random queries
  • Occasional updates
  • On-line utilities
  • Index options
  • Partitioning and parallelism (e.g., LOADs)

INTRODUCTION TO STATISTICS, ANALYTIC AND OLAP SQL QUERIES

  • AVG
  • CORRELATION
  • COUNT
  • COUNT_BIG
  • CONVARIANCE
  • MAX
  • MIN
  • RAND
  • STDDEV
  • SUM
  • VARIANCE
  • Regression function
  • GROUPING, ROLLUP AND CUBE

PHYSICAL DESIGN CONSIDERATIONS

  • Denormalization
  • Index choices
  • Data placement
  • Free space
  • Summary tables
  • Data compression
What You Can Expect

Upon completion of this course, the participant should be able to design a data warehouse using both star and snowflake schemas. And the delegate should understand the implication of such terms as cubes, dimensions, attributes, joins, hierarchies, measures, etc.

Who Should Take This Course

Would-be data warehouse architects, IT developers, database administrators or anyone responsible for a data warehouse or related discipline.

« Hide The Details
Related Courses
Code Course Title Duration Level
DWHTCA
Data Warehousing Terms, Concepts and Architecture
1 Day
I
Details
DWCONI
Introduction to Data Warehouse Concepts
1 Day
I
Details
SSASBI
Implementing and Maintaining SQL Server 2008 Analysis Services and Business Intelligence Solutions
5 Days
I
Details
HADOOPIP
Introduction to Hadoop Programming
5 Days
I
Details
OBIEE
Oracle BI Enterprise Edition (OBIEE) 11g: Reports and Dashboards
4 Days
I
Details
OBIBC
Oracle BI Enterprise Edition (OBIEE) 11g: Boot Camp
5 Days
I
Details
ORDWA
Oracle Database 11g: Administer a Data Warehouse
5 Days
II
Details
DWMODEL
Dimensional Modeling for Data Warehouse Projects
5 Days
II
Details
OBIBR
Oracle BI Enterprise Edition (OBIEE) 11g: Building Repositories
5 Days
II
Details

Every student attending a Verhoef Training class will receive a certificate good for $100 toward their next public class taken within a year.

You can also buy "Verhoef Vouchers" to get a discounted rate for a single student in any of our public or web-based classes. Contact your account manager or our sales office for details.

Schedule For This Course
There are currently no public sessions scheduled for this course. We can schedule a private class for your organization just a couple of weeks from now. Or we can let you know the next time we do schedule a public session.
Notify me the next time this course is confirmed!
Can't find the course you want?
Call us at 800.533.3893, or
email us at [email protected]