help to design ETL process
I am working with retail system, and I would like to design fact and dimension tables for basic sale information. I am not sure which will be the best schma I should opt for (snow flake or Star schema). The information that I want to store in datawarehouse is given below, 1) Transaction number 2) Transaction time 3) Item Number 4) Item Description 5) Unit Price 6) Qty Sold 7) Sold At Price 8) Discount Amount 9) Tax I also want to include discount code and method of payment information. Data is going to come from 6 data sources. I have following questions, 1) what is the best design (schema) to hold this information in data warehouse? 2) I want to store item with its classification. In data sources classification is not stored, how can I manage this info in datawarehouse? 3) Should I bring Item Number and description separately to store in data warehouse, along with sales.csv flat file? That is how dimensions are populated. I know that dimension tables are like discount no and description, Item No and description etc. Can I bring only one file from data source with sales information and populate dimension tables? 4) How can I create Item Classification in data warehouse, if the data is coming from different data sources? Data sources doesn't hold classification (some places partial info is present)?
January 30th, 2011 9:23am

The schema to use, star or snowflake, depends on your OTLP's structure. Please post some info about the underlying database and especially the tables, their relationships (FK-PK) and the tables that currently hold the data you listed. Please mark as answer if this helps. Thank you! http://thebipalace.wordpress.com/
Free Windows Admin Tool Kit Click here and download it now
January 30th, 2011 12:27pm

The schema to use, star or snowflake, depends on your OTLP's structure. Please post some info about the underlying database and especially the tables, their relationships (FK-PK) and the tables that currently hold the data you listed. Please mark as answer if this helps. Thank you! http://thebipalace.wordpress.com/
January 30th, 2011 12:27pm

in each data source there is header and detail table for retail system transaction.
Free Windows Admin Tool Kit Click here and download it now
January 30th, 2011 6:03pm

in each data source there is header and detail table for retail system transaction.
January 30th, 2011 6:03pm

I dont think you need to worry about model in which you are going to design the database. Just design the database based on your requirement and it will get into the required shape either into star or snowflake schema. Dont worry about this jargon. Focus on design based on your requirement and rest will be taken care - My personal view. Let me try to answer as much as I can to your questions. 1) what is the best design (schema) to hold this information in data warehouse? Answered it above. 2) I want to store item with its classification. In data sources classification is not stored, how can I manage this info in datawarehouse? Have tables designed as appropriate so that you hold the classification as well. You may even maintain the hierarchy or get the data loaded into the same row as requried. Provide us with more details by giving an example to be more specific in my answer. 3) Should I bring Item Number and description separately to store in data warehouse, along with sales.csv flat file? That is how dimensions are populated.I know that dimension tables are like discount no and description, Item No and description etc. Can I bring only one file from data source with sales information and populate dimension tables? You can have the Item table as a dimension with all columns listed like ItemNumber, Description, discount etc. Since discount will be varying, I am not sure if that will be part of your dimension. If you are getting the Item details from different sources, then have staging tables, load initially data into staging tables and then take it forward to getting loaded into the final tables. 4) How can I create Item Classification in data warehouse, if the data is coming from different data sources? Data sources doesn't hold classification (some places partial info is present)? As mentioned above, try to make use of staging tables. Ensure that you have appropriate Business rules to perform validations and then move the data into final table. You may have to handle the partial data part by deciding on nullability of the columns. Overall, ensure that the data model contains only that data which makes sense to business. That would help you in maintaining the data better. Regards,Phani Note: Please vote/mark the post as answered if it answers your question/helps to solve your problem.
Free Windows Admin Tool Kit Click here and download it now
January 31st, 2011 6:26pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics