Relational Database Technology for Data Warehousing
Select and Use the Right Relational Database
Technology for Your Data Warehousing and Business Intelligence
Project
Which relational database should you choose for the staging,
data warehouse and data mart databases that make up the data
warehousing system? In addition, which version and feature set
are the best choice? As you can see by this data warehouse
architecture diagram there are a number of databases
involved:
First look at your goals and objectives. Requirements you
may consider when evaluating include:
- Scalability - can you start small and grow?
- Portability / multi-platform and multi OS
- Volume of data stored
- Volume of data loaded / changed
- Number of users
- Analytics requirements
- Volatility of data
- Organization standards
- Availability of pre-packaged models
- Availability of tools
- Availability of trained personnel
- Existing database and hardware used
- Relationships and contracts with vendors
- Best of Breed vs Total Product Line
Performance features to consider:
- Table/index partitioning (mysql5.1 has merge table
feature)
- Fast drop/create index (mysql5.1)
- Fast data import / bulk load (mysql5.1)
- Efficient full text search (mysql5.1)
- High availability (mysql5.1)
- High data compression for archive (mysql5.1)
Financial Considerations:
- Software Base License Costs
- Hardware Costs
- Trained people costs
- Financial viability of vendor
- Start up costs
- Total Cost of Ownership (TCO)
There are a number of databases that you may want to
consider. The following databases and their respective vendors
are market leaders that you are likely to evaluate:
- DB2 from IBM Corporation
- MySql from Sun Software
- Oracle from Oracle Corporation
- SQL Server from Microsoft Corporation
- Teradata from Teradata Corporation
There are other very good database systems such as Sybase-IQ
and Postgres.
Business Intelligence with DB2
DB2 from IBM is definitely an alternative to consider if
your organization runs an IBM shop with AIX or z/OS operating
system.
|
Strengths
|
Weaknesses
|
- Strong on IBM platforms (AIX and
Z/OS)
- Highly scalable
- Integrates with related IBM
offerings
- Supported by IBM and partners
- Good installed base
|
- Not as strong on Linux and Windows
- Add on cost is high
|
DB2 has a number of versions and supports multiple
platforms. The Express-C version of the system differs from
other express versions in that there is no limit on database
size. The Data Warehouse Edition (DWE) packages DB2 Enterprise
Server along with:
- DB2 Cube Views
-
DB2 Intelligent Miner Modeling,
-
Visualization, and Scoring
- DB2 Warehouse Manager Standard Edition
IBM has developed a means of speeding queries by storing
pre-computing values called Materialized Query Tables
(MQT). This helps DB2 to be highly scalable.
Business Intelligence with MySql
MySQL from Sun Software is the world's most popular open
source DBMS. It has proven itself in many situations with
successes with companies like LinkedIn, Dow Corning and the NBC
Olympics.
|
Strengths
|
Weaknesses
|
- Many users
- Low cost
- Scalable
- Multi-platform
- Many partners
- Many tools
- Easy to use
- Supported by Sun Software
|
- Features may be slow in coming
- OLAP is not provided by Sun
|
With MySQL, you can start with the Community Edition and
then upgrade to the Enterprise Edition for production work.
Business Intelligence with Oracle
Oracle provides an industry leading relational database plus
a suite of business intelligence products.
|
Strengths
|
Weaknesses
|
- Data warehousing and business
intelligence features available
- Very scalable
- Multi-platform
- Many partners
- Many tools
- Supported by Oracle
|
- High price
- Add-ons cost extra
|
Oracle does provide a free Express version, so it is
possible to start at a low cost.
Business Intelligence with SQL Server
SQL Server 2005 from Microsoft is a great choice when the
environment is Windows and moderate scalability is
required.
|
Strengths
|
Weaknesses
|
- Ease of use
- High customer satisfaction
- Strong Microsoft supprt
- Widely used
- Availability of people
- Scales up to 3 TB
- ETL suite included
- OLAP suite included
- Reporting suite included
- Low TCO compared to other commercial
software
|
- Limited to Windows platform
- Scalability beyond 3 TB is a
challenge
|
SQL Server 2005 has a number of versions ranging in price
from free to $25,000. If you want to learn SQL Server or do a
proof of concept, the Developer version is a bargain at $50.
Keep in mind that if multiple platforms such as development,
test and production are needed the cost will increase.
|
Express
|
Developer
|
Work
Group
|
Standard
|
Enterprise |
| CPU |
1 |
1 |
2 |
4 |
No Limit |
| RAM |
1 GB |
3 GB |
3 GB |
No Limit |
No Limit |
|
Database
Size
|
4 GB |
4 GB |
No
Limit
|
No Limit |
No Limit |
Advanced
Features |
No |
No |
No |
No |
Yes |
| UI |
Basic |
Studio |
Studio |
Studio |
Studio |
| OLAP |
No |
Yes |
No |
Yes |
Yes |
| Reporting |
No |
Yes |
No |
Yes |
Yes |
| Approx Price |
Free |
$50 |
$3.9K |
$6K |
$25K |
Please check with vendor for most current prices and
features. The prices and features presented on this page are
for information only.
Business Intelligence with Teradata
Teradata has a long history of supporting large scale
databases aimed at data warehousing and business
intelligence. Teradata provides both the relational
database software and the hardware that supports it.
|
Strengths
|
Weaknesses
|
- Can support large data warehouses -
over 10 TB
- Industry data models
|
- High price point
- Not portable - requires specialized
hardware
|
Consider Teradata when large databases that
support many users are needed.
Database Choice for Data Warehousing and Business
Intelligence
In conclusion, there a number of relational database choices
available to support data warehousing and business
intelligence. These database choices cover a wide range
of scalability and price.
|