Data Trends & Technology: A FAQ for Marketers
The last several years have seen an explosion of cloud-enabled data infrastructure. Snowflake made this official in October 2022 with the release of its inaugural report, “The Modern Marketing Data Stack.” This shift is unlocking new opportunities, but also new complexities.
This FAQ is designed to help today’s marketers acclimate to this world of cloud-enabled data infrastructure.
What is a Cloud Data Warehouse – and why is the “cloud” aspect of this important?
The “Cloud Data Warehouse” is just that – a data warehouse deployed on today’s cloud infrastructure, such as Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure. Popular Cloud Data Warehouses today include Snowflake, Databricks, and Google’s BigQuery.
Predecessors to these systems were deployed on-premises in data centers, which limited both storage and compute capacity. As with everything before cloud computing, scaling these systems was both expensive and slow. Common problems included the warehouse running out of storage or becoming overloaded by too many simultaneous queries.
With the move to the cloud, these limitations are largely gone. Storage can scale to hold virtually any data a business wants to retain, and cloud compute allows many applications and end users to query the data warehouse at the same time.
Applications built on the Cloud Data Warehouse started out as primarily Business Intelligence (BI) and analytics tools such as Tableau or MicroStrategy. They are quickly expanding into other areas, including security applications, marketing applications, and much more.
What are the benefits of connecting a Cloud Data Warehouse to my marketing systems?
The Cloud Data Warehouse has enabled the next generation of data capabilities. Whereas data warehouse capacity was historically limited, the cloud model can house orders of magnitude more data and support usage by many more people and teams.
The rise of the data scientist, and huge investments in data teams more broadly, have led many modern businesses to make massive efforts to build and maintain their core data infrastructure and Cloud Data Warehouses. These investments aim to develop a data asset that can provide a competitive advantage across the entire business.
For many organizations today, the Cloud Data Warehouse represents the highest-quality and most complete view of the customer and the broader business. Connecting this data repository to your marketing systems can open up a next generation of data-driven segmentation and personalization.
I already have a Customer Data Platform (CDP) – why do I also need a Cloud Data Warehouse?
Historically, many CDPs have roots in tag management and data collection. Segment’s core product, for example, is designed to collect data from your website and mobile applications, and then aggregate, unify, and stream this data directly to your marketing applications.
While these systems may be sufficient for businesses that are completely web and mobile-based, most businesses are not. For example, you may purchase a pair of shoes on your iPhone, but the shoes may take three weeks to arrive (data provided by a shipping provider), may arrive damaged (data submitted via a call to a support center), or may get returned (via a third-party provider that handles returns).
In this example, critical data required to stitch together the true customer 360 wouldn’t be available from a tag manager. Furthermore, the business logic needed to properly account for total revenue (including returns), average shipping times (including lost or delayed packages), and other factors like per-product margins requires a degree of complex bookkeeping and transformation that is best maintained in a Cloud Data Warehouse.
So for many businesses, a CDP on its own will not be able to scale to the data needs and complexities across key customer touch points – and this is where the Cloud Data Warehouse comes in.
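To make the shoe example concrete, here is a minimal sketch of the kind of bookkeeping described above. It uses Python's built-in sqlite3 module as a stand-in for a Cloud Data Warehouse. The table names (orders, shipments, returns), their columns, and the sample rows are all hypothetical, and the query assumes at most one shipment and one return per order.

```python
import sqlite3


def build_demo_warehouse():
    """Create an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id TEXT, amount REAL);
        CREATE TABLE shipments (order_id INTEGER, days_to_deliver INTEGER);
        CREATE TABLE returns (order_id INTEGER, refund_amount REAL);
        INSERT INTO orders VALUES (1, 'c1', 100.0), (2, 'c1', 50.0), (3, 'c2', 80.0);
        INSERT INTO shipments VALUES (1, 21), (2, 3), (3, 5);
        INSERT INTO returns VALUES (1, 100.0);
    """)
    return conn


# Net revenue subtracts refunds; shipping time is averaged per customer.
# Assumes at most one shipment row and one return row per order, so the
# joins do not duplicate order amounts.
CUSTOMER_SUMMARY_SQL = """
    SELECT o.customer_id,
           SUM(o.amount) - COALESCE(SUM(r.refund_amount), 0) AS net_revenue,
           AVG(s.days_to_deliver) AS avg_days_to_deliver
    FROM orders o
    LEFT JOIN shipments s ON s.order_id = o.order_id
    LEFT JOIN returns r ON r.order_id = o.order_id
    GROUP BY o.customer_id
    ORDER BY o.customer_id
"""


def customer_summary(conn):
    """Return (customer_id, net_revenue, avg_days_to_deliver) rows."""
    return conn.execute(CUSTOMER_SUMMARY_SQL).fetchall()


if __name__ == "__main__":
    for row in customer_summary(build_demo_warehouse()):
        print(row)
```

None of the shipping or returns data in this sketch would come from a website tag; it only becomes usable once those sources land next to the order data.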
How is cloud data related to the death of cookies?
As third-party cookies are phased out, publishers who previously shared behavioral and demographic data over cookies in the browser are now starting to store this very same data in the cloud, and specifically in the Cloud Data Warehouse.
Initiatives such as Unified ID 2.0 that standardize customer identity, coupled with the emergence of cloud storage-based marketplaces, including Snowflake’s data marketplace, have enabled many use cases that were formerly cookie-based to now operate in the cloud and the Cloud Data Warehouse.
In the coming quarters and years, as cookies continue to be phased out, cloud-enabled data will not just support existing use cases. It will also open up machine learning-powered applications that leverage the compute capabilities of the Cloud Data Warehouse – capabilities that weren’t available in a browser-based, third-party cookie world.
What does it mean for my data to have a “single source of truth” and why does this matter?
For many MarTech systems and architectures today, data is fragmented and siloed across multiple products. This makes it very difficult to find the right information and to know which data points and fields are correct and which ones are not.
A single source of truth is an organization-wide investment to create a single, accurate, and maintained view of key data points that describe the business, customer behavior, and everything in between.
Many organizations today have defined their Cloud Data Warehouse as the central source of truth for data in the business.
How can a Cloud Data Warehouse help provide better compliance across emerging privacy laws including GDPR and CCPA?
A core requirement of GDPR and CCPA is the ability to delete customer data on request. Yet if your customer data is fragmented across multiple MarTech tools, it can be incredibly difficult to know which data lives where, much less delete it across a complex set of systems.
With centralized data models, this problem can be simplified greatly – all data on your customers is tracked in a single place and just as you have visibility over your customer 360, you also have visibility across the sources that create your customer 360.
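As a rough illustration of why centralization helps, the sketch below handles a deletion request against a single database, again using Python's built-in sqlite3 module as a stand-in for the warehouse. The CUSTOMER_TABLES registry, the table and column names, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical registry: every table that holds customer data, mapped to
# the column that identifies the customer. These names are trusted
# constants, never user input, so they are safe to place in the SQL text.
CUSTOMER_TABLES = {
    "customers": "customer_id",
    "orders": "customer_id",
    "support_calls": "customer_id",
}


def build_demo_warehouse():
    """Create an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (customer_id TEXT, email TEXT);
        CREATE TABLE orders (order_id INTEGER, customer_id TEXT);
        CREATE TABLE support_calls (call_id INTEGER, customer_id TEXT);
        INSERT INTO customers VALUES ('c1', 'a@example.com'), ('c2', 'b@example.com');
        INSERT INTO orders VALUES (1, 'c1'), (2, 'c1'), (3, 'c2');
        INSERT INTO support_calls VALUES (10, 'c1');
    """)
    return conn


def delete_customer(conn, customer_id):
    """Delete one customer's rows from every registered table.

    Runs in a single transaction and returns rows deleted per table,
    which can double as an audit record for the request.
    """
    deleted = {}
    with conn:  # commits on success, rolls back on error
        for table, column in CUSTOMER_TABLES.items():
            cur = conn.execute(
                f"DELETE FROM {table} WHERE {column} = ?", (customer_id,)
            )
            deleted[table] = cur.rowcount
    return deleted
```

Calling delete_customer(conn, "c1") on the demo data removes that customer from all three tables in one step. Copies of the data already pushed to downstream tools would still need their own handling; the sketch only covers the central store.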
What’s special about Snowflake and how is a Cloud Data Platform different from a Cloud Data Warehouse?
In recent years, Snowflake has emerged as a winner in the competitive Cloud Data Warehouse space. Historically, the Cloud Data Warehouse and its predecessor (the “enterprise data warehouse,” deployed on-prem in a data center) existed primarily for reporting purposes.
Snowflake’s move to so-called “data applications” has been accompanied by a rebranding of their database as a “data platform.” With this has come an expanded set of functionality and partners to enable these new applications.
Marketing and MarTech systems are one class of applications in which Snowflake is investing further. As mentioned, in October of 2022, they published a comprehensive report called “The Modern Marketing Data Stack,” which gives an overview of their platform capabilities and preferred set of partners.
What’s “Reverse ETL” and how does it fit into all of this?
Reverse ETL is a way to move data from your Cloud Data Warehouse into your MarTech tools. The category has gotten quite a bit of attention from data engineers and analysts, since it lets them integrate data using SQL instead of custom code written against APIs.
We’re seeing some of the larger marketing players integrate reverse ETL functionality directly into their platforms, including Salesforce (with their proposed Snowflake integration) and Braze.
A big restriction of reverse ETL is the data limits of the destinations. Most MarTech systems can’t accommodate even a fraction of the data in your Cloud Data Warehouse. So while reverse ETL may make initial integration easier for your technical teams, it will not solve the problem of data still being extremely hard to use and access from your end systems.
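For readers who want to see the mechanics, here is a minimal sketch of a reverse ETL sync, using Python's built-in sqlite3 module in place of a real warehouse. It is not how any particular vendor implements it. The customer_360 table, the SQL segment, the send callback (standing in for a MarTech tool's API), and the allowed_fields limit are all hypothetical.

```python
import sqlite3

# The audience is defined in SQL rather than in custom integration code.
SEGMENT_SQL = """
    SELECT customer_id, email, lifetime_value
    FROM customer_360
    WHERE lifetime_value >= 100
    ORDER BY customer_id
"""


def build_demo_warehouse():
    """Create an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer_360 (customer_id TEXT, email TEXT, lifetime_value REAL);
        INSERT INTO customer_360 VALUES
            ('c1', 'a@example.com', 250.0),
            ('c2', 'b@example.com', 40.0),
            ('c3', 'c@example.com', 100.0);
    """)
    return conn


def reverse_etl_sync(conn, sql, send, allowed_fields):
    """Run the query and pass each row to `send` as a dict.

    Only fields the destination accepts are kept, which models the
    data limits of the end system. Returns the number of records sent.
    """
    cur = conn.execute(sql)
    columns = [description[0] for description in cur.description]
    sent = 0
    for row in cur:
        record = dict(zip(columns, row))
        payload = {k: v for k, v in record.items() if k in allowed_fields}
        send(payload)
        sent += 1
    return sent


if __name__ == "__main__":
    outbox = []  # stands in for the destination tool
    count = reverse_etl_sync(
        build_demo_warehouse(), SEGMENT_SQL, outbox.append,
        allowed_fields={"customer_id", "email"},
    )
    print(count, outbox)
```

Note that lifetime_value is used to choose the segment but never reaches the destination, because this hypothetical tool only accepts two fields. That is the restriction described above in miniature.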