cs614 Graded Discussion Board will be launched on February 12, 2013 and it will be remained open for two days (February 12 to February 13, 2013).
Dear Students!
Graded Discussion Board will be launched on February 12, 2013 and it will be remained open for two days (February 12 to February 13, 2013).The Topic of Graded Discussion Board:
Many organizations are maintaining Data Warehouse as well as Online Transaction Processing System simultaneously. In your opinion, is it possible that the Online Transactions Processing Systems can be replaced by the Data Warehouse? Support your opinion with strong reasons.
Instructions:
1. Books, websites and other reading material may be consulted before posting your comments.
2. The comments should not be copied from other student or from any other source like internet etc.
3. Your comments should be in your own wordings.
4. A concise, coherent and to the point comment is preferred over lengthy comment having irrelevant details.
5. Your comments must not be more than 6-8 lines.
6. Comments posted on regular Lesson's MDB or sent through emails will not be considered in any case.
In case of any query, email at CS614@vu.edu.pk
Differences in requirements
ReplyDeleteData warehouses and OLTP systems have very different requirements. Here are some examples of differences between typical data warehouses and OLTP systems:
*Workload
Data warehouses are designed to accommodate ad hoc queries. You might not know the workload of your data warehouse in advance, so a data warehouse should be optimized to perform well for a wide variety of possible query operations. OLTP systems support only predefined operations. Your applications might be specifically tuned or designed to support only these operations.
*Data modifications
A data warehouse is updated on a regular basis by the ETL process (run every 15 minutes, nightly, weekly). The end users of a data warehouse do not directly update the data warehouse. In OLTP systems, end users routinely issue individual data modification statements to the database. The OLTP database is always up to date, and reflects the current state of each business transaction.
*Schema design
Data warehouses often use denormalized or partially denormalized schemas (such as a star schema) to optimize query performance by avoiding join and using techniques as star_transformation. OLTP systems often use fully normalized schemas to optimize update/insert/delete performance, and to guarantee data consistency.
*Typical operations
A typical data warehouse query scans thousands or millions of rows. For example, “Find the total sales for all customers last month.” A typical OLTP operation accesses only a handful of records. For example, “Retrieve the current order for this customer.”
*Historical data
Data warehouses usually store many months or years of data. This is to support historical analysis. OLTP systems usually store data from only a few weeks or months. The OLTP system stores only historical data as needed to successfully meet the requirements of the current transaction.
Decision Support System
ReplyDeleteDecision Support Systems (DSS) helps organizations in strategic and operational decision making process.
Organizations use On-Line Transaction Processing (OLTP) servers for their day-to-day business operations. This only stores current data and very recent transactional data.
Strategic decision making require historical data.Data Warehouse(DW) are large storage systems which stores historical data that can be used for strategic decision making.
Business Intelligence (BI)
The technologies, tools, and practices for collecting, integrating, analyzing,and presenting large volumes of information to enable better decision making is referred as Business Intelligence (BI).
How to Build a Data Warehouse
Data from multiple Sources(Sources are usually OLTP Systems or Flat Files) are extracted and integrated at a common repository called Data warehouse.
ETL (Extract - Transform - Load)
Extract data from multiple sources.
Transform the extracted data according to the requirements of the Data warehouse
Load the transformed data into the warehouse.
Why ETL ( Extract - Transform - Load)
To build the Data Warehouse we need to integrate data from multiple data sources.These sources may be a database or flat files.
Even though these sources may have similar kind of data,there may exist considerable differences in the following ways.
Different attribute names will be used by different sources to represent same data element. So we need to find the semantically equivalent attributes from different data sources so that we can represent all of these as a single attribute in the Data warehouse.
Some times different data sources may use same attribute names to represent semantically different data elements. We should resolve this before we load data in to the data warehouse.
We can do various transformation on the source data before loading it into the warehouse. Some of them are
De-duplication: Remove duplicate records.
Sort: Sort Records based on certain key attributes.
Summarization: Some times Data warehouse does not require the actual detailed transaction level data,instead it may require only a summary data. For Example instead of detailed sales data,give a record that specifies total sales.
OLTP System deals with operational data. Operational data are those data involved in the operation of a particular system.
ReplyDeleteExample: In a banking System, you withdraw amount through an ATM. Then account Number,ATM PIN Number,Amount you are withdrawing, Balance amount in account etc are operational data elements.
Operational Data
Operational data are usually of local relevance
Frequent Updates
Normalized Tables
Point Query
In an OLTP system data are frequently updated and queried. To prevent data redundancy and to prevent update anomalies the database tables are normalized.Set of tables that are normalized are fragmented.Normalization makes the write operation in the database tables more efficient.Operational data are usually of local relevance.It involves Queries accessing individual tuple(individual record).These type of queries are termed as point queries.
Examples for OLTP Queries:
What is the Salary of Mr.John?
What is the address and email id of the person who is the head of maths department?
OLAP Description and Use
ReplyDeleteAn OLAP database, which is an On-line Analytical Processing database, is used to perform data analysis. An OLAP database is based on dimensions a dimension is a single detail record about a data item. For example, a product can have a quantity, a price, a time of sale and a place sold. These four items are the dimensions of the item product in this example. Where the dimensions of an object intersect is a single data item, for example, the sales of all apples in Atlanta Georgia for the month of May, 1999 at a price greater than 59 cents a pound. One problem with OLAP databases is that the cubes formed by the relations between items and their dimensions can be sparse, that is, not all intersections contain data. This can lead to performance problems. There are two versions of OLAP at last count, MOLAP and ROLAP. MOLAP stands for Multidimensional OLAP and ROLAP stands for Relational OLAP.
The problem with MOLAP is that there is a physical limit on the size of data cube which can be easily specified. ROLAP allows the structure to be extended almost to infinity (petabytes in Oracle8i). In addition to the space issues a MOLAP uses mathematical processes to load the data cube, which can be quite time intensive. The time to load a MOLAP varies with the amount of data and number of dimensions. In the situation where a data set can be broken into small pieces a MOLAP database can perform quite well, but the larger and more complex the data set, the poorer the performance. MOLAPs are generally restricted to just a few types of aggregation.
In a ROLAP the same performance limits that apply to a large OLTP come into play. ROLAP is a good choice for large data sets with complex relations. Data loads in a ROLAP can be done in parallel so they can be done quickly in comparison to a MOLAP which performs the same function.
Some applications, such as Oracle Express use a combination of ROLAP and MOLAP.
The primary purpose of OLAP architecture is to allow analysis of data whether comes from OLTP, DSS or Data warehouse sources.
LTP Stands for On-Line Transaction Processing. In an OLTP system the transaction size is generally small affecting single or few rows at a time. OLTP systems generally have large numbers of users that are generally not skilled in query usage and access the system through an application interface. Generally OLTP systems are designed as normalized where every column in a tuple is related to the unique identifier and only the unique identifier.
ReplyDeleteOLTP systems use the primary-secondary key relationship to relate entities (tables) to each other.
OLTP systems are usually created for a specific use such as order processing, ticket tracking, or personnel file systems. Sometimes multiple related functions a re performed in a single unified OLTP structure such as with Oracle Financials.
OLTP Tuning
OLTP tuning is usually based around a few key transactions. Small range queries or single item queries are the norm and tuning is to speed retrieval of single rows. The major tuning methods consist of indexing at the database level and using pre-tuned queries at the application level. Disk sorts are minimized and shared code is maximized. In many cases closely related tables may be merged (denormalized) for performance reasons.
A fully normalized database usually doesn't perform as well as a slightly de-normalized system. Usually if tables are constantly accessed together they are denormalized into a single table. While denormalization may require careful application construction to avoid insert/update/delete anomalies, usually the performance gain is worth the effort.
:p
ReplyDeleteab me es naam se admin hon hahahhaaaaaaa
ReplyDelete