Thinking about switching cloud data warehouse providers? Or looking for your first-ever cloud service provider?
This savvy little guide walks you through the different data warehouse options on the market, both on-premise and cloud, and gives you tips on the right questions to ask and the red flags to watch out for.
Why (and when) do you need a data warehouse?
Let’s begin with a definition of a data warehouse: a data management system that aggregates data from multiple sources into a single, centralized repository to support data analysis, business intelligence, and informed decision making.
But data warehouses require structured data to deliver these valuable insights. Consequently, the raw data they receive must be filtered and given shape and structure before it can be stored, via a process called ETL (Extract, Transform and Load).
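To make the ETL idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the sample CSV, the `orders` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (inline so the sketch is self-contained).
RAW_CSV = """order_id,customer,amount,order_date
1, Alice ,19.90,2023-01-05
2,bob,,2023-01-06
3,Carol,42.00,2023-01-07
"""


def extract(raw_text):
    """Extract: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: filter out incomplete rows and standardize types and formats."""
    clean = []
    for row in rows:
        if not row["amount"].strip():
            continue  # skip rows with a missing amount
        clean.append((
            int(row["order_id"]),
            row["customer"].strip().title(),
            float(row["amount"]),
            row["order_date"].strip(),
        ))
    return clean


def load(conn, rows):
    """Load: write the structured rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_etl(raw_text=RAW_CSV):
    """Run the three steps in order against an in-memory stand-in warehouse."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(raw_text)))
    return conn
```

Calling `run_etl()` returns a connection whose `orders` table holds only the rows that survived the transform step, already in a consistent format. Real pipelines typically add logging, error handling, and incremental loads on top of this basic shape.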
If your company’s information is stored in various source systems and shared across your organization, ask yourself:
- Are you seeing inconsistencies in data and reports?
- Are you experiencing difficulties in sharing data?
- Does your company lack a single source of truth?
If you’ve answered “Yes” to all three questions, a data warehouse is likely the solution for your business.
Because it integrates data from multiple sources, a data warehouse enables you to make better-informed business decisions based on all of your data. And since a data warehouse standardizes and stores all of your organization’s data in the same format – the famous “single source of truth” – your qualified end-users benefit from enhanced data quality and consistency. Data centralization also provides timely access to data, an advantage both when dealing with time-sensitive issues and when turning information into insights faster.
Of course, not all data warehouses function in the same way, and aside from critical features such as data storage, choosing the right data warehouse depends on what you need to do with your data, what your budget allows, and the size of your business.
Let’s dive a little deeper into the options.
Different architectures for a customized solution
The three main types of data warehouse models are designed to meet different business needs.
An enterprise data warehouse (EDW) is a centralized warehouse that provides decision support services across the enterprise. EDWs are usually a collection of databases that offer a unified approach to organizing and classifying data by subject.
A data mart is a subset of a data warehouse and is used for business-line specific reporting and analysis. Subject-oriented, this model aggregates data from source systems relevant to a specific business area, such as sales or finance.
A virtual warehouse, on the other hand, is actually a set of separate databases that can be queried together, so users can access all data as if it were stored in one data warehouse. Data virtualization makes all data, regardless of location and configuration, appear as if it’s in one place and in a consistent format.
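As a rough illustration of the virtual warehouse idea, the sketch below keeps sales and finance data in two separate databases and queries them together through one connection. Everything here is an assumption made for the example: the database aliases (`sales_db`, `finance_db`), table names, and figures are hypothetical, and SQLite’s `ATTACH DATABASE` stands in for a real data virtualization layer.

```python
import sqlite3


def build_virtual_warehouse():
    """Attach two independent databases to a single connection."""
    conn = sqlite3.connect(":memory:")
    # Each attached in-memory database stands in for a separate source system.
    conn.execute("ATTACH DATABASE ':memory:' AS sales_db")
    conn.execute("ATTACH DATABASE ':memory:' AS finance_db")
    conn.execute("CREATE TABLE sales_db.orders (customer TEXT, amount REAL)")
    conn.execute("CREATE TABLE finance_db.invoices (customer TEXT, paid REAL)")
    conn.executemany(
        "INSERT INTO sales_db.orders VALUES (?, ?)",
        [("Alice", 100.0), ("Bob", 250.0)],
    )
    conn.executemany(
        "INSERT INTO finance_db.invoices VALUES (?, ?)",
        [("Alice", 100.0), ("Bob", 200.0)],
    )
    conn.commit()
    return conn


def outstanding_balances(conn):
    """Join across both databases as if they were one warehouse."""
    query = """
        SELECT o.customer, o.amount - i.paid AS outstanding
        FROM sales_db.orders AS o
        JOIN finance_db.invoices AS i ON i.customer = o.customer
        ORDER BY o.customer
    """
    return conn.execute(query).fetchall()
```

The data never moves: each database keeps its own tables, and only the query treats them as a single source.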
Data warehouse models: on-premise and cloud
When shopping for a data warehouse, the first thing to decide is whether you want an on-premise or a cloud-based option. Each has its own distinct advantages and disadvantages.
On-premise data warehouse
This traditional type of data warehouse runs on self-owned hardware, housed on-site by the company. Due to the high volume of data to aggregate and organize, it tends to require significant investment and skilled workers. IT may prefer this solution as it gives them complete control over the entire repository, although depending on their legacy appliance, hardware, and software combinations, they may need the help of outside appliance and managed services to keep things running.
But today, with the advent of the cloud, this option is no longer a popular choice for new deployments. According to Gartner, by 2023, 75% of all company databases will run on cloud platforms.
Cloud data warehouse
Cloud-based data warehouses, with their inherent flexibility and cost-effectiveness, have become the more attractive option. In the cloud, data is collected, stored, queried, and analyzed without the need for upfront investments in hardware or infrastructure.
But not all cloud data warehouses operate in the same way:
- Bring Your Own License (BYOL) lets companies use their existing licenses flexibly. BYOL lowers the cost and minimizes the risk often associated with transferring to a cloud service. The lack of one-to-one connections between licenses and hardware devices means a strong increase in dedicated resource availability, which helps businesses meet compliance requirements.
- Data warehouse as a service (DWaaS) is an outsourcing model in which a service provider configures and manages the hardware and software resources a data warehouse requires, and the customer provides the data and pays for the managed service. The customer doesn’t have to worry about staffing the data warehouse, making DWaaS a good choice for organizations with small or limited IT departments.
- Hybrid data warehouses combine the power of cloud storage with on-premise data platforms. A hybrid platform enables companies to keep up with the latest technologies and create efficiencies without losing the investment made in their current setups. Hybrid models can be dedicated or shared.
Criteria to consider before buying a data warehouse
Now that you’ve decided you’re in the market for a data warehouse, here are some of the things you need to explore before choosing the one that’s right for your business needs.
Data types
Start by deciding whether you’re going to be storing structured or unstructured data. For the former, a relational database, where information is organized in tables with dependencies, should serve you well. For the latter, a non-relational database (where information is stored in a “laundry list” order) will likely better meet your needs.
Performance
How quickly do you really need your data? The higher the performance, the higher the price. That’s not to say you should accept poor performance, but think carefully about the processing speed you need, the size of the data you want to process, and so on, so you can choose an option that suits your needs.
Cost
This is obviously one of the most important considerations when choosing your data warehouse. The cost will depend on whether you choose a cloud or on-premise option. Vendor pricing tables are a good place to start, but be sure to request a quote with your exact configuration: some vendors offer a pay-as-you-go model, where you pay per TB or per hour of usage, while others offer flat-rate pricing.
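A quick back-of-the-envelope comparison can help when reading vendor pricing tables. The sketch below is a simplified model, not any vendor’s actual pricing: the function names are hypothetical, and the per-TB price and monthly fee are placeholders you would replace with figures from a real quote.

```python
def pay_as_you_go_cost(tb_used, price_per_tb):
    """Monthly cost when billed purely on usage."""
    return tb_used * price_per_tb


def break_even_tb(flat_monthly_fee, price_per_tb):
    """Usage level at which both pricing models cost the same."""
    return flat_monthly_fee / price_per_tb


def cheaper_model(tb_used, price_per_tb, flat_monthly_fee):
    """Name the cheaper pricing model for a given monthly usage."""
    usage_cost = pay_as_you_go_cost(tb_used, price_per_tb)
    return "pay-as-you-go" if usage_cost < flat_monthly_fee else "flat rate"


# Placeholder figures: 5 per TB on usage pricing versus a flat fee of 2000 per month.
print(break_even_tb(2000, 5))       # usage level where the two models meet
print(cheaper_model(100, 5, 2000))  # light usage
print(cheaper_model(600, 5, 2000))  # heavy usage
```

Below the break-even point, usage-based pricing tends to win; above it, a flat rate does. Hidden costs such as overage fees would need to be added to either side of the comparison.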
Side note: Beware of hidden costs. Most vendors are transparent about their offers, but make sure you won’t be charged for “extras” that are part of running a data warehouse, such as occasionally exceeding limitations. The best way to avoid hidden costs is to write a requirement guide for your vendor listing every eventuality for inclusion in your contract.
Implementation
There are several things to consider when looking at data warehouse implementation. Cost matters, but time may matter more, depending on your goals. If one data warehouse costs slightly less than another but takes five months longer to implement, that’s five months of being less competitive due to lack of business insights. Then there’s the question of ease of implementation: can you implement your data warehouse yourself or would you need a consultant to do it?
Scalability
Ask yourself how much data you’re currently accessing and how much growth your warehouse will need to support. Cloud-based data warehouses can store massive amounts of data without much overhead. Based on your previous data warehouse usage, your Business Intelligence & data strategy, and your company’s growth, you should be able to plan for your needs accordingly. You may also want to consider how a particular warehouse scales during times of peak demand.
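One simple way to plan for growth is to project your data volume forward at a steady monthly rate. The sketch below assumes compound growth, which is only an approximation; the function names and the sample figures are hypothetical.

```python
def projected_storage_tb(current_tb, monthly_growth_rate, months):
    """Project storage needs assuming steady compound monthly growth."""
    return current_tb * (1 + monthly_growth_rate) ** months


def months_until_capacity(current_tb, monthly_growth_rate, capacity_tb):
    """Count the months until projected storage reaches a capacity limit."""
    if monthly_growth_rate <= 0:
        raise ValueError("monthly_growth_rate must be positive")
    months = 0
    size = current_tb
    while size < capacity_tb:
        size *= 1 + monthly_growth_rate
        months += 1
    return months


# Example: 10 TB today, growing 5% per month.
print(round(projected_storage_tb(10, 0.05, 12), 2))  # projected size in a year
print(months_until_capacity(10, 0.05, 50))           # months until 50 TB is reached
```

Running the numbers with your own usage history gives you a concrete figure to put in front of vendors when you discuss storage tiers and scaling limits.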
Security
Your data warehouse will constantly be centralizing information from various locations, so implementing sound, effective, and cost-optimized security controls to protect that data is a must. These include intelligent user access controls, proper categorization of information, highly secure encryption techniques, and other current methods to protect your data assets.
Support and community
If you run into a problem with your data warehouse, you may want support, especially if you’re new to the game. If two data warehouse systems are fairly equal, the one that offers better support could be the better choice. Check out their online support communities to see what kind of help they offer, and find out if live support is included in your contract. Even if documentation may solve most of your support issues, being able to contact a real person can sometimes be a lifesaver.
Elasticity
Elasticity is the ability of system resources to scale independently and transparently, without impacting data availability or performance. This characteristic of cloud computing makes it an ideal candidate for data warehousing and for analytical and scientific workloads. An elastic, cloud-based data warehouse is therefore scalable, enables high-speed analytics and data model flexibility, and may be the best solution for your company’s evolving needs.
Backup & Recovery
While the complete loss of data is less common with data warehouses, it’s crucial to choose a platform that safely stores your backups (to Amazon S3, for example) and allows you to use that information at any time. Cloud systems are unique in that they provide a level of redundancy in case of emergencies. If something happens to the company, data can be restored from cloud servers located safely in another location.
How can ClicData help you make the right choice?
We offer an integrated, smart data warehouse that brings all of your company data into one place to ensure data consistency, data quality, and ease of reporting.
There’s a ClicData solution that’s perfect for you. We can take care of everything with our dedicated plan and host your data on our Cloud Azure Server, a secure and scalable database complete with all necessary maintenance, backup, and monitoring. Or, if you prefer, we offer an on-premise plan in which your data is hosted on your own SQL Server and you use our cloud application to add data to it. Your database and your server, wherever you want them.
You can read more here about the bespoke data warehouses we offer, from shared or dedicated to on-premise, designed to best meet your needs and implement your company’s BI strategy.
Ready to test a real 100% cloud data warehouse solution for yourself?
Start your free trial now! Or, if you have any questions, ask them here.