Businesses and organizations of all sizes are becoming increasingly dependent on data analytics, and data warehouses and business analytics infrastructure have become business-critical applications for many (if not most) companies. These companies have always searched for better ways to understand their customers and anticipate their needs, and they have long sought to improve the speed and accuracy of operational decision-making. The depth of the data analysis is as important as its timeliness.
Generally, companies want to uncover the information hidden within massive, ever-increasing amounts of data. A data warehouse appliance, which is an integrated collection of hardware and software designed for a specific purpose that typically involves high data throughput and analytic functions, can be used by organizations to optimize various areas of data processing. Its main purpose is to supplant conventional business intelligence functions such as warehousing, extract-transform-load (ETL), analysis, and reporting.
Due to its cost-effectiveness and efficiency, the data warehouse appliance has become an important segment of the data warehousing market. In this paper, I examine data warehouse appliances and describe their positive impact on business enterprises.
Introduction
Since its introduction in the early 1990s, the data warehouse (DW) has proven to be the key platform for strategic and tactical decision support systems in today's competitive business environment.
It has become a major technology for building data management infrastructure and has resulted in many benefits for various organizations, including providing “a single version of the truth, better data analysis and time savings for users, reductions in head count, facilitation of the development of new applications, better data, and support for customer-focused business strategies” (Rahman, 2007). The technology has become extremely important in an environment where increasing competition, unpredictable market fluctuations, and changing regulatory environments are putting pressure on business organizations.
Data warehouses are also becoming the central repositories of company information, which is obtained from a variety of operational data sources. As business applications mature, they will find data warehouses more beneficial and rely on them as their main source of information. These applications can perform many kinds of data analysis, and customers increasingly demand that the most up-to-date information be available in the data warehouse. Improving data freshness within short time frames is essential to meeting such demands.
According to Hong et al., virtually all Fortune 1000 companies today have data warehouses, and many medium-sized and small firms are developing them. The desire to improve decision-making and organizational performance is the fundamental business driver behind data warehouses. Data warehouses help managers discover problems and opportunities sooner and widen the scope of their analysis. Hong et al. also note that the data warehouse is user-driven, meaning that users are in control of the data and are responsible for determining and finding the data they need.
However, data warehouses have to be designed and evaluated from the user's perspective in order to motivate users to take responsibility for finding the data they need. The data warehouse is said to be “one of the most powerful decision-support tools to have emerged in the last decade” (Ramamurthy, 2008). Firms develop data warehouses to help managers answer important business questions that require analytics, including data slicing and dicing, pivoting, drill-downs, roll-ups, and aggregations.
These analytics are best supported by online analytical processing (OLAP) tools. A data warehouse appliance, which is the main topic of this paper, is an integrated collection of hardware and software designed for specific purposes involving high data throughput and analytic functions. Due to its cost-effectiveness and efficiency, the data warehouse appliance has become an important segment of the data warehousing market. A business or organization can use a data warehouse appliance to optimize various areas of data processing.
In general, the main purpose of the DW appliance is to supplant conventional business intelligence (BI) functions, including warehousing, extract-transform-load (ETL), analysis, and reporting. A data warehouse appliance can have a huge positive impact on a business enterprise. It allows large organizations to staff their data warehouses more efficiently, and it helps mid-sized companies solve business intelligence challenges. Data warehouse appliances are fundamentally changing the way businesses operate as they are increasingly adopted across various companies.
The purpose of this paper is to present data warehouse appliances and how they impact businesses and organizations. In the following sections, I present a brief overview of data warehousing and the current state of BI, then define and discuss DW appliances and their benefits, and finally describe the positive impact of DW appliances on businesses.
Data Warehousing
A data warehouse can be defined as a subject-oriented, integrated, non-volatile, and time-variant collection of data in support of management’s decisions.
Unlike online transaction processing (OLTP) database systems, data warehouses are organized around subjects and store historical and summarized data for business purposes. According to O’Brien and Marakas, a data warehouse is a central source of data that have been cleaned, transformed, and cataloged so that they can be used by managers and business professionals for data mining, online analytical processing, market research, and decision support. These stored data are usually extracted from the various operational, external, and other databases of an organization.
A DW can be subdivided into data marts, which hold subsets of data from the warehouse that focus on specific aspects of a company, such as a department. In general, all data warehouse systems comprise the following layers: data source, data extraction, staging area, ETL, data storage, data logic, data presentation, metadata, and system operations. The four major components, however, are the multidimensional database, ETL, OLAP, and metadata. The dimensional database applies the standard star schema, including dimension and fact tables, hierarchies for drill-down, role models, aggregates, and snowflaking.
It optimizes database design for better performance. The ETL process involves the extraction, transformation, and loading of data with appropriate ETL tools. Data integration is one of the most important aspects of data warehousing: data is extracted from multiple heterogeneous source systems and placed in a staging area, where it is cleaned, transformed, pruned, reformatted, standardized, combined, and summarized before being loaded into the warehouse.
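The ETL flow described above can be illustrated with a minimal Python sketch. It is not taken from any particular ETL tool: the two sources (ORDERS_CSV and WEB_ORDERS), their field names, and the sales_by_region table are hypothetical and exist only for illustration, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CSV export from an order-entry system.
ORDERS_CSV = """order_id,region,amount
1, north ,100.50
2,South,200.00
3,north,
"""

# Hypothetical source 2: rows from a web shop, with different field names.
WEB_ORDERS = [
    {"id": 10, "area": "SOUTH", "total": "50.25"},
    {"id": 11, "area": "East", "total": "75.00"},
]


def extract():
    """Pull raw rows from both sources into one staging list."""
    staged = [dict(row, source="csv") for row in csv.DictReader(io.StringIO(ORDERS_CSV))]
    staged += [
        {"order_id": r["id"], "region": r["area"], "amount": r["total"], "source": "web"}
        for r in WEB_ORDERS
    ]
    return staged


def transform(staged):
    """Clean, standardize, and summarize the staged rows."""
    totals = {}
    for row in staged:
        amount = (row["amount"] or "").strip()
        if not amount:  # prune rows with a missing amount
            continue
        region = row["region"].strip().lower()  # standardize region names
        totals[region] = totals.get(region, 0.0) + float(amount)
    return sorted(totals.items())


def load(summary, conn):
    """Write the summarized rows into the (stand-in) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region (region TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany("INSERT INTO sales_by_region VALUES (?, ?)", summary)
    conn.commit()


def run_etl():
    """Run extract, transform, and load, then read the result back."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract()), conn)
    return conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()
```

Calling run_etl() returns one summarized row per region. The row with the missing amount is pruned in the staging step, and the differently spelled region names from the two sources are combined under one standardized name.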
OLAP (online analytical processing) tools provide the front-end analytical capabilities, including slice and dice, drill-up, drill-down, drill-across, pivoting, and trend analysis over time. Metadata stores information about the data in the warehouse system. The components of a complete data warehouse architecture are illustrated in Figure 1 below.
Figure 1
An important characteristic of the data in a data warehouse is that they are static, unlike the data in a typical database, which change constantly.
Once the data are gathered, formatted for storage, and stored in the data warehouse, they never change. This restriction allows queries to search for and analyze complex patterns and historical trends. Data warehouses are also non-volatile in the sense that end users cannot update the data directly, which makes it possible to maintain a history of the data. A major use of data warehouse databases is data mining, in which the data are analyzed to reveal hidden patterns and trends in historical business activity.
Such analysis could be used to help managers make decisions about strategic changes in business operations in order to gain competitive advantages in the marketplace. Data warehousing is a relatively new technology that “brings the vision of an entirely new (customer-centric) way of conducting business to reality”, and can provide “environments promising a revolution in organizational creativity and innovation” (Ramamurthy, 2008).
Ramamurthy also notes that the data warehouse generally serves as an IT infrastructure technology focused on data architecture: it provides a foundation for integrating a diverse set of internal and external data sources, enabling enterprise-wide data access and sharing, enforcing data quality standards, providing answers to business questions, and promoting strategic thinking through CRM, data mining, and other front-end BI applications. Users of data warehouses come from virtually every business unit; the heaviest users are information systems, marketing and sales, finance, and production and operations.
Current State of Business Intelligence
Business intelligence (BI) refers to computer-based techniques used to identify, extract, and analyze business data, such as sales revenue by product, department, time, region, or income. BI technologies provide historical, current, and predictive views of business operations. Common functions of BI technologies include reporting, online analytical processing, analytics, data mining, text mining, and predictive analytics. Because BI aims to support better business decision-making, a BI system can also be referred to as a decision support system.
BI applications often use data gathered from data warehouses or data marts; however, not all BI applications require a data warehouse. According to Wikipedia, business intelligence can be applied to several business purposes in order to drive business value: measurement, analytics, reporting, collaboration, and knowledge management. The term BI is widely used today, mainly to describe analytic applications. According to Watson, BI is currently the top priority of many chief information officers.
A Gartner Group survey of 1,400 CIOs found that BI projects were the number one technology priority for 2007. Watson further explains that BI is a process consisting of two primary activities: “getting data in and getting data out”. Getting data in, also referred to as data warehousing, delivers limited value to a business enterprise by itself. Organizations realize the full value of data from data warehouses only when users and applications access the data and use it to make decisions.
Getting data out receives the most attention, as it consists of business users and applications accessing data from the DW to perform enterprise reporting, OLAP, querying, and analytics. The business intelligence framework is depicted in Figure 2. Current BI infrastructure is a patchwork of hardware, software, and storage that is growing ever more complex.
Figure 2: BI framework
BI continues to evolve, and several recent developments are generating widespread interest, including real-time BI, business performance management, and pervasive BI.
Data Warehouse Appliance
A data warehouse is developed to support a broad range of organizational tasks. It can be described as an organized collection of large amounts of structured data, designed and intended to support decision making in organizations. Deriving information and knowledge from a data warehouse is a complex process that requires an understanding of the logical schema structure and the underlying business environment.
According to Hinshaw, a data warehouse appliance, applied to business intelligence, “is a machine capable of retrieving valuable decision-aiding intelligence from terabytes of data in seconds or minutes versus hours or days”. Appliances represent the difference between making decisions with stale data and making them with the freshest information possible. According to Wikipedia, a more standard definition of the data warehouse appliance is an integrated collection of hardware and software designed for a specific purpose that typically involves high data throughput and analytic functions.
It typically consists of an integrated set of servers, operating systems, data storage facilities, database management systems (DBMS), and software that is pre-installed and pre-optimized for data warehousing. DW appliances provide solutions for the mid-to-large-volume data warehouse market, offering low-cost performance, usually on data volumes in the terabyte range. Due to its cost-effectiveness and efficiency, the data warehouse appliance has become a critical segment of the data warehousing market.
A business or an organization can use a data warehouse appliance to optimize various areas of data processing. The main purpose of a DW appliance, in general, is to supplant conventional business intelligence functions such as warehousing, extract-transform-load (ETL), analysis, and reporting. A true DW appliance is defined as one that does not require fine-tuning, indexing, partitioning, or aggregating, whereas some other DW appliances use languages such as SQL to facilitate interaction with the appliance at a database request level.
According to Wikipedia, most data warehouse appliance vendors use massively parallel processing (MPP) architectures to provide high query performance and platform scalability. MPP architectures consist of independent processors or servers executing in parallel and implement a “shared nothing” architecture, which provides an effective way to combine multiple nodes within a highly parallel environment.
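The shared-nothing idea can be sketched in a few lines of Python. This is only a simulation of the data flow under my own assumptions, not a description of any vendor's product: the node count, the use of CRC32 as the distribution hash, and the function names are all hypothetical, and threads stand in for what would be separate servers with their own disks and memory.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_NODES = 4  # assumed node count, for illustration only


def node_for(key, num_nodes=NUM_NODES):
    """Pick the node that owns a row by hashing its distribution key."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def distribute(rows, num_nodes=NUM_NODES):
    """Split rows into per-node partitions; no row lives on two nodes."""
    partitions = [[] for _ in range(num_nodes)]
    for customer, amount in rows:
        partitions[node_for(customer, num_nodes)].append((customer, amount))
    return partitions


def local_aggregate(partition):
    """Work done by one node, using only its own partition."""
    totals = Counter()
    for customer, amount in partition:
        totals[customer] += amount
    return totals


def parallel_total_by_customer(rows, num_nodes=NUM_NODES):
    """Aggregate on every node in parallel, then merge the partial results."""
    partitions = distribute(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(local_aggregate, partitions))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # adds the per-node totals together
    return dict(merged)
```

Because every row for a given customer hashes to the same node, each node can compute its totals without consulting any other node, and the final merge only has to combine disjoint partial results. In this sketch the answer is the same for any number of nodes; what changes in a real MPP system is how much work each node has to do.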
A DW appliance is capable of deploying up to thousands of query processing nodes in one appliance package, whereas in traditional solutions the cost and complexity of each additional node prevent a high level of hardware parallelism. By leveraging a fully integrated data warehouse architecture, a data warehouse appliance can deliver a significant performance advantage, performing up to 100 times faster than general-purpose data warehousing systems.
Maturation
According to Hinshaw, the data warehouse appliance is specifically designed for the streaming workload of business intelligence and is built from commodity components.
It integrates hardware, DBMS, and storage into one opaque device and combines the best elements of the symmetric multiprocessing (SMP) and massively parallel processing (MPP) approaches into a single design that allows a query to be processed in the most optimized way possible. A data warehouse appliance is fully compatible with existing BI applications, tools, and data through standard interfaces. It is simple to use and has an extremely low cost of ownership. The development of standardized interfaces, protocols, and functionality is one of the most important trends in BI.
Compared with about a decade ago, there is now a wealth of tools and applications that use these standardized interfaces, including MicroStrategy, Business Objects, Cognos, SAS, and SPSS. These are coupled with ETL tools that have standardized interfaces, such as Ab Initio, Ascential, and Informatica. The appliances work seamlessly with these tools and other in-house applications. A data warehouse appliance is truly scalable. In BI workloads, the bottlenecks are the speeds of the internal buses, internal networks, and disk transfers, whereas in transactional workloads scalability is limited primarily by the CPU.
Reliability, which is provided by the homogeneous nature of an appliance (all parts of the system come from one vendor), is also critical. A data warehouse appliance also provides simplicity for administrators, in that it lets them use the time they spend troubleshooting complex database systems more productively. DBAs can then be deployed to assist end users doing real-time BI. A data warehouse appliance offers the lowest cost of ownership because it has one source and one vendor, which reduces the costs associated with support.
Businesses and organizations will run more efficiently with the simple, efficient solution provided by a data warehouse appliance.
Benefits
Data warehouse appliances provide freedom to the business user. With patchwork systems, users are limited in the queries they can run because of the time required to run them. With the time required to run a complex query reduced to seconds, users can not only run their old analyses with more iterations but also have time to devise and run entirely new sets of analyses on granular data.
According to Wikipedia, some of the benefits of DW appliances are as follows:
Reduction in costs: As a data warehouse grows, its total cost of ownership consists of initial entry costs, maintenance costs, and the cost of changing capacity. DW appliances offer low entry and maintenance costs.
Parallel performance: DW appliances provide a compelling price/performance ratio. Vendors use several distribution and partitioning methods to provide parallel performance. With high performance on highly granular data, DW appliances can address analytics that previously could not meet performance requirements.
Reduced administration: DW appliances can provide a single-vendor solution in which the vendor takes ownership of optimizing the parts and software within the appliance, thereby eliminating the customer’s costs for integration and regression testing of the DBMS, OS, and storage on a terabyte scale. DW appliances also reduce administration through automated space allocation, reduced index maintenance, and reduced tuning and performance analysis.
Scalability: DW appliances scale for both capacity and performance. In massively parallel processing architectures, adding servers increases performance as well as capacity.
Built-in high availability: Vendors of massively parallel processing DW appliances provide built-in high availability through redundancy of components within the appliance. Many offer warm-standby servers, dual networks, dual power supplies, disk mirroring with failover, and solutions for server failure.
Increasingly, business analytics are expected to be used to improve the current cycle, and DW appliances provide quick implementations without the need for regression and integration testing.
DW appliances also provide solutions for many analytic application uses, including enterprise data warehousing; super-sized sandboxes that isolate power users with resource-intensive queries; pilot projects; projects off-loaded from the enterprise data warehouse; applications with specific performance or loading requirements; data marts that have outgrown their present environment; turnkey data warehouses; solutions for applications with high data growth and high performance requirements; and applications needing data warehouse encryption.
Impact of Data Warehouse Appliances on Businesses and Organizations
Demand for data warehouse appliances is increasing, and the businesses taking advantage of the benefits of this hardware range from large worldwide enterprises to the smallest individual businesses. Data virtualization could be a useful partner to appliances, providing a single view of information across multiple appliances. Data virtualization is also useful because it provides a stable reporting layer during normal migration exercises, such as the addition of data warehouse appliances to the information infrastructure.
As businesses today continue to process extremely large volumes of data, there is a constant need to keep data warehousing costs under control while ensuring superior BI and application performance. Scalability, flexibility, and affordability are essential requirements for designing an infrastructure capable of supporting next-generation BI performance. When asked in an interview why demand for data warehouse appliances is increasing, Robert Eve (executive vice president of marketing for Composite Software Inc.) stated that it is the confluence of three primary drivers at the macro level.
The first is “the well-reported information explosion, and the technical challenges involved in making this information accessible in forms that business decision-makers can easily use”. Second, data warehouse appliances are more affordable and appealing, as the costs per terabyte and for support are coming down. Finally, recent advancements in analytics technology, notably in predictive analytics, promise to cope with the massive data volumes. Data warehouse appliances offer numerous advantages, some of which overlap with the benefits discussed earlier.
These advantages include the following:
More reporting and analytical capabilities: because it executes queries quickly, a data warehouse appliance is able to handle a bigger and more complex query workload.
Cost reductions: a data warehouse appliance requires a minimal amount of tuning and optimization of the database server and database design, and it is still able to run most queries quickly.
Flexibility: new user requests are easier to implement when less tuning and optimization is needed. With other database servers, a new query might lead to a number of technical changes, such as creating and dropping indexes or repartitioning tables. Sometimes the decision is made not to implement the new request at all because of the amount of work involved. The need for these additional technical changes is smaller with a data warehouse appliance.
Data warehouse appliances help support impressive BI deployments. Drawing on Hinshaw, the following real-world examples illustrate the positive impact of DW appliances on businesses. In the telecommunications industry, the rapid growth of call detail records creates an imposing amount of data, which makes it difficult for companies to analyze customer and call plan information quickly and efficiently.
Traditional approaches have been inefficient at processing queries on even a month’s data, seriously hampering an organization’s ability to perform trend analysis to reduce customer churn and generate timely reports. With a DW appliance, however, the telecom user can analyze customer activity down to the call detail record level over a full year’s worth of detailed data. Another industry where data warehouse appliances have begun to prove their worth, and are poised to play a bigger role in the future, is retail.
Hinshaw states that brick-and-mortar and online retailers are capturing great amounts of customer transaction and supply chain information, creating a data explosion that threatens to overwhelm the average retail organization and its current IT infrastructure. Data warehouse appliances enable these retailers to manage and analyze terabytes of information in near-real time. They are able to use the information to forecast buying patterns effectively, generate targeted promotions quickly, and optimize their inventory and supply chain. Business intelligence remains the foundation for successful decision making in any company.
BI, in turn, relies on the underlying database architecture. Eve also presents other real-world examples of positive business impact across a broad range of industries. A leading worldwide convenience foods business uses data warehouse appliances and analytic applications to gain major business benefits in two specific areas. First, the company optimizes its international network of delivery routes, making the system more efficient and ensuring timely delivery of its products. Second, it refines its merchandising mix daily, on a retail basis, in order to maximize sales and margins.
Major League Baseball captures information about every pitch, at-bat, and fielding play within a data warehouse appliance and uses this data to predict players’ future on-field performance. This can help teams evaluate current and free-agent talent, refine coaching and development methods, and determine salaries, and thus maximize their wins. In addition, a global freight, transportation, and logistics company uses data warehouse appliances to identify behavioral patterns that indicate potential dissatisfaction within its existing customer base.
The customer care group then proactively takes steps to improve satisfaction before the company loses those customers. Currently, smaller data warehouse appliance vendors seem to be focusing on adding functionality to their products in order to compete with the mega-vendors. However, it is anticipated that all appliance vendors will be affected by the trend toward inexpensive, high-performance, and scalable virtualized data warehouse implementations that use regular hardware and open source software.
Conclusion
In general, a data warehouse appliance is a combined hardware and software product specifically designed for analytical processing. In a traditional data warehouse implementation, the database administrator can spend a significant amount of time tuning and putting structures around the data to get the database to perform well for large sets of users. With a data warehouse appliance, it is the vendor who is responsible for simplifying the physical database design layer and making sure that the software is tuned for the hardware.
This paper presented a review of data warehouse appliances, their benefits, and how they positively impact businesses and organizations. Based on this research, the negative impacts of DW appliances on businesses are negligible compared with their positive impact, and there is an increasing demand for DW appliances. I believe that, in the near future, DW appliances will become the sole platform for all business intelligence applications and requirements. I gained much knowledge and insight from researching this topic, and I intend to further my research on the future impact of DW appliances on businesses.