In last article we learned about ETL: Extraction, Transformation, and Loading and in today’s class we are shred light on different types of Data Warehouse Schema, advantages and differences.You can get full list of Data Warehouse Tutorial articles here.
So lets start with basic question:
What Is A Schema?
A schema is a model which help is organizing, understanding and interpreting the information. Schema are used to structure of different types of data. With the help of schemas, it is easy to interpret a result from large amount of data. Some examples of schema are XML schema and database schema.
The relational database schemas are built as an ER (entity-relationship model) which entities and relationships between them. This data model is suitable for transactional processing.
The data warehouse schema model is a multidimensional schema model suitable for data analysis and decision making. A data warehouse has subject oriented, time variant data for which a multidimensional model is best suited.
Multidimensional Model: Star and Snowflake
A multidimensional model is presented in form of data cube. A multidimensional data model in DW exists in form of star schema, snowflake schema and fact constellation schema.
Star Schema:
A star schema consists of one large centralized fact table and smaller dimension tables, one table for each dimension. The fact table consists of large amount of non-redundant data and dimension table consists of attributes of a dimension. The graphical representation of this schema resembles a star shape as fact table is placed the center with dimension tables spread around the radius of the central fact table.
Diagram below represents a star schema model:
The diagram above has a central fact table for sales based on four dimensions City, Employees, Country and Product. The fact has keys for four dimensions and two measures: total quantity total sales.
The dimension table has attributes of dimension. For table city, the key is CityID and other attribute is City Description. The attributes of a dimension table are in form of a hierarchy (total order amongst attributes) or form a lattice (partial ordering amongst attributes).
Snowflake Schema:
A snowflake schema is a type of star schema where the dimension tables are normalized. The normalization takes place by further splitting the tables into other tables. The graph becomes like a snowflake.
The diagram below shows a snowflake schema:
The above diagram has a central fact table representing sales upon four dimensions: city, country, employees and product. The dimension, product in star schema is now normalized to snowflake schema with two tables, product and vendor. The products table contains attributes ProductID and ProductName. The ProductID is linked to vendor dimension table which contains Vendorname and ProductID.
Fact Constellation:
A collection of star schemas forma galaxy schema or fact constellation schema. Some applications are complex in nature and require multiple fact tables or sharing of dimension tables. Such schema can be visualized as collection of stars thus named galaxy schema.
The diagram below shows a fact constellation schema:
The diagram above shows two fact tables: Sales and Shipping. The dimension represented are Product and Store. Both these fact tables share common dimensions, product and store. The Sales fact table and Shipping fact table both contain the Product Key and Store key of Product and Store dimension table respectively.
Difference in Star Schema and Snowflake Schema
Star Schema | Snowflake Schema | |
Dimension Table | The dimension tables in star schema are not normalized so they may contain redundancies | This schema has normalized dimension tables |
Queries | The execution of queries is relatively faster as there are less joins needed in forming a query. | The execution of snowflake schema complex queries is slower than star schema as many joins and foreign key relations are needed to form a query. Thus performance is affected. |
Performance | Star schema model has faster execution and response time | It has slow performance as compared to star schema |
Storage Space | This type of schema requires more storage space as compared to snowflake due to unnormalised tables. | Snowflake schema tables are easy to maintain and save storage space due to normalized tables. |
Usage | Star schema is preferred when the dimension tables have lesser rows | If the dimension table contains large number of rows, snowflake schema is preferred |
Type of DW | This schema is suitable for 1:1 or 1: many relationships such as data marts. | It is used for complex relationships such as many: many in enterprise Data warehouses. |
Dimension Tables | Star schema has a single table for each dimension | Snowflake schema may have more than one dimension table for each dimension. |
Advantages and Disadvantages of Star Schema Data Warehouses
The star schema has lesser number of joins therefore simple queries can be used to query the database. It has better performance. The business logics are very much simplified in star schema. | Redundancy of data. Data integrity issues prevail Many: Many relationships are not supported. |
Advantages and Disadvantages of Fact Constellation Schema Data Warehouse
It fact constellation schema offers more flexibility. Multiple fact tables are explicitly assigned to dimension tables. |
The structure is more complex and sophisticated. It is difficult to maintain. The number of aggregations are high in constellation. |
Advantages and Disadvantages of The Snowflake Schema Data warehouse
Data integrity issues are reduced. Data is easy to maintain and more structured. Data quality is better than star schema. Disk space is optimized and storage space is saved. |
Queries to access the database schema is complex in nature. The performance of system is degraded due to complex queries as number of joins and foreign key constraints are increased. |
Next tutorial we are going to discuss the OLAP operations.
⇓ Subscribe Us ⇓
If you are not regular reader of this website then highly recommends you to Sign up for our free email newsletter!! Sign up just providing your email address below:
Happy Testing!!!
- Automation Testing Vs Manual Testing
- Data Warehouse and Database and OLTP Difference and Similarities
- What is difference between Unit Testing and Integration Testing?
- Recovery Testing
- Introduction To WebDriver And Its Comparison With Selenium RC – [Learn Selenium Basics]
- What is Structural Testing?
- Model Based Testing: Testing Type You Must Know!
- ETL Process and Data Mart
- Difference between MOLAP vs ROLAP vs HOLAP
- Volume Testing In Software Testing
1 thought on “Data Warehouse Schema: Star and Snowflake”
Thanks for the nice tutorial series. Till tutorial 5 looks good and great points I learned.
Keep up the good work.