Cloud data warehouses offer a way for ecommerce companies to scale as their data grows, promising virtually unlimited storage space, cost optimization and analytics horsepower. But where do you start? Are there no-code solutions that are also best-in-class?
Segment is an increasingly popular way to connect website data to a data warehouse such as AWS Redshift. In this guide we’ll take a close look at exactly how this works, and at the pros and cons for your long-term company data needs.
Using Segment to connect Shopify to AWS Redshift
What is Segment? Segment is a powerful Customer Data Platform (CDP) solution, but it’s also much more than that. Segment gives businesses the ability to route customer activity events from various platforms to a broad range of destinations.
One of those destinations can be a data warehouse – an ecosystem that serves as the centralized source of data collection. This includes the big three: BigQuery, Redshift, and Snowflake. The technology focuses on the tasks of collection, storage, and management of business data – with the purpose of turning operational data into meaningful information.
For any company looking to harness the value of the activities gathered inside their CDP, it’s a no-brainer that bringing a data warehouse into the mix is the next best step. Redshift, the data warehouse offering from Amazon Web Services (AWS), remains the market leader in this space because of its compatibility with data integration pipelines and analytics tools.
For ecommerce sites this can be difficult to implement manually (not to mention maintenance time, costs and complexity!), but Littledata’s Shopify source for Segment does this automatically. With Littledata’s capabilities, you can direct, track, and identify custom events across all critical customer activities on your Shopify website, whether that’s a simple Shopify instance, a headless Shopify setup or multiple country stores doing international sales.
Coupling that with Segment’s unified CDP takes powerful data to activation, and the ability to direct platform data to marketing channels for increased engagement, conversion and retention. Whether you want to use a data warehouse for deep analysis, audience building or real-time recommendations, Littledata + Segment + Redshift is a proven solution for Shopify stores.
Setting up your Redshift data warehouse
Segment’s documentation portal gives a step-by-step breakdown of provisioning a Redshift cluster, configuring a database user, securing data ingestion, and providing a path to data collection into your Redshift instance.
Breaking the process down into digestible chunks, here are the necessary steps to go from data to data warehouse:
- Choose the best instance for your needs: Dense Compute vs. Dense Storage
- Provision a new Redshift Cluster: 5 simple steps from start to finish
- Create a database user: Creating a user to manage your instance
- Connect Redshift to Segment: Select sources, credentials, and go
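To make the "create a database user" step more concrete, here is a minimal sketch that builds the kind of SQL statements typically run against a new cluster. The helper name `build_segment_user_sql`, the default user name `segment`, and the example database name are assumptions for illustration only; Segment's Redshift setup guide is the place to confirm the exact statements and permissions it expects.

```python
from typing import List


def build_segment_user_sql(database: str, password: str, user: str = "segment") -> List[str]:
    """Return SQL statements that create a dedicated warehouse user.

    Hypothetical helper: it only builds strings and does not connect to Redshift.
    """
    # Reject characters that would break out of the quoted values below.
    if "'" in password:
        raise ValueError("password must not contain a single quote")
    if '"' in database or '"' in user:
        raise ValueError("database and user names must not contain double quotes")

    return [
        # A dedicated user keeps the loader's access separate from your own.
        f"CREATE USER \"{user}\" PASSWORD '{password}';",
        # Lets that user create new schemas in the chosen database.
        f'GRANT CREATE ON DATABASE "{database}" TO "{user}";',
    ]


if __name__ == "__main__":
    for statement in build_segment_user_sql("analytics", "Str0ngPassw0rd"):
        print(statement)
```

You would run the resulting statements in your own SQL client while connected to the cluster as an admin user.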
Collecting events in Segment
Event tracking is a critical part of the data collection process. Creating a tracking plan associated with measurable business outcomes, such as acquiring new customers, increasing retention and activating new leads, and mapping those outcomes to business goals, is an important step in the data journey.
Understanding this relationship will guide you to the relevant events or actions that must be configured for successful tracking. With Littledata’s automated solution, you can avoid the blocking-and-tackling of configuring best-in-class event strategies for (client side) device-mode and (server side) cloud-mode events:
- Device-Mode events include Cart Viewed, Page Viewed, Product Clicked, Product Image Clicked, Product List Viewed, Product Shared, Product Viewed, Products Searched, Registration Viewed, Thank you Page Viewed
- Cloud-Mode events include Checkout Started, Checkout Step Completed, Coupon Applied, Customer Created, Customer Enabled, Fulfillment Created, Fulfillment Updated, Order Cancelled, Order Completed, Order Refunded, POS Order Placed, Payment Failure, Payment Info Entered, Product Added, Product Removed
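The split between the two event lists above can be sketched as a simple lookup, together with the general shape of a Segment Track message (`type`, `event`, `userId`, `properties`, `timestamp`). The function names, the example customer ID and the example properties are assumptions for illustration; this is not Littledata's actual implementation.

```python
from datetime import datetime, timezone

# Event names taken from the two lists above.
DEVICE_MODE_EVENTS = {
    "Cart Viewed", "Page Viewed", "Product Clicked", "Product Image Clicked",
    "Product List Viewed", "Product Shared", "Product Viewed",
    "Products Searched", "Registration Viewed", "Thank you Page Viewed",
}
CLOUD_MODE_EVENTS = {
    "Checkout Started", "Checkout Step Completed", "Coupon Applied",
    "Customer Created", "Customer Enabled", "Fulfillment Created",
    "Fulfillment Updated", "Order Cancelled", "Order Completed",
    "Order Refunded", "POS Order Placed", "Payment Failure",
    "Payment Info Entered", "Product Added", "Product Removed",
}


def tracking_mode(event_name: str) -> str:
    """Return 'device' (client side) or 'cloud' (server side) for a known event."""
    if event_name in DEVICE_MODE_EVENTS:
        return "device"
    if event_name in CLOUD_MODE_EVENTS:
        return "cloud"
    raise ValueError(f"unknown event: {event_name}")


def build_track_payload(event_name: str, user_id: str, properties: dict) -> dict:
    """Build a dict shaped like a Segment Track message for a known event."""
    tracking_mode(event_name)  # raises ValueError if the event is not in either list
    return {
        "type": "track",
        "event": event_name,
        "userId": user_id,
        "properties": dict(properties),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    print(tracking_mode("Product Viewed"))
    print(build_track_payload("Order Completed", "customer-123", {"order_id": "1001", "total": 59.0}))
```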
To streamline the process for ecommerce sites, Littledata’s tracking script automatically sends events to Segment through its analytics.js library, making it easy to collect all the critical event activities associated with a customer’s store journey – from browsing behavior through the checkout funnel and repeat purchases (including recurring billing for stores selling by subscription).
Additionally, for every event where there is an identifiable customer (from both device-mode and cloud-mode), Littledata will send an Identify call. This happens when the customer logs into your storefront, at the last step of the checkout process, with the order, and again after a purchase with a customer update. With Littledata’s streamlined modeling, data can be accurately represented and pushed to downstream destinations, such as marketing activation channels and data warehouses.
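The rule described above (send an Identify call only when the event carries an identifiable customer) might be sketched as follows. The input event shape with a nested `customer` dict and the helper name `identify_for_event` are assumptions for illustration; only the output fields (`type`, `userId`, `anonymousId`, `traits`) follow the general shape of a Segment Identify message.

```python
from typing import Optional


def identify_for_event(event: dict) -> Optional[dict]:
    """Return an Identify-shaped dict if the event has a known customer, else None."""
    customer = event.get("customer") or {}
    customer_id = customer.get("id")
    if customer_id is None:
        # Anonymous browsing: nothing to identify yet.
        return None

    # Everything known about the customer except the ID becomes a trait.
    traits = {k: v for k, v in customer.items() if k != "id" and v is not None}
    payload = {"type": "identify", "userId": str(customer_id), "traits": traits}

    # Keeping the anonymous ID lets earlier anonymous activity be tied to this user later.
    if event.get("anonymousId"):
        payload["anonymousId"] = event["anonymousId"]
    return payload


if __name__ == "__main__":
    order_event = {
        "event": "Order Completed",
        "anonymousId": "anon-1",
        "customer": {"id": 42, "email": "shopper@example.com", "phone": None},
    }
    print(identify_for_event(order_event))
    print(identify_for_event({"event": "Page Viewed", "anonymousId": "anon-2"}))
```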
Littledata connects Shopify to Segment and your data warehouse
Connecting Segment data to your data warehouse
Now that your Redshift instance is up and running, the next step is to connect to Segment and start collecting data into your data warehouse. There are two ways to complete this step – one, through Segment’s native migration, and the other, utilizing no-code data pipeline tools (recommended). Whichever process you choose, you will have the opportunity to push data out of Segment into your data warehouse environment and start utilizing it across your business.
Option 1: Segment’s native migration
As mentioned, Redshift data warehouse is one of the many destinations that Segment can send data to. You can directly connect to Redshift from within Segment to stream event data.
Essentially, it’s as simple as:
- Log in to your Segment App and proceed to the Catalog section
- In the top menu, choose Destinations
- Select Redshift in the Storage Destinations list
After configuring your user permissions and selecting the data sources you would like to sync, you’ll enter your credentials and connect to your data warehouse. Voila! Data will now be continuously replicated into your Redshift instance on a schedule that depends on your plan:
- Free: Data refreshed (synced) 1x per day
- Team: Data refreshed (synced) 2x per day
- Business: Data refreshed (synced) as fast as hourly
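To see what those sync frequencies mean in practice, here is a small worked calculation of how old the newest data in the warehouse can get just before the next sync. It assumes syncs are evenly spaced over 24 hours and treats "as fast as hourly" as 24 syncs per day; the names `SYNCS_PER_DAY` and `max_staleness_hours` are made up for this sketch.

```python
# Syncs per day for each plan, taken from the list above.
SYNCS_PER_DAY = {"free": 1, "team": 2, "business": 24}


def max_staleness_hours(plan: str) -> float:
    """Longest gap between syncs, in hours, assuming evenly spaced syncs."""
    return 24 / SYNCS_PER_DAY[plan.lower()]


if __name__ == "__main__":
    for plan in SYNCS_PER_DAY:
        print(f"{plan}: data can be up to {max_staleness_hours(plan):.0f} hours old")
```

In other words, a dashboard built on the Free plan may lag a full day behind the store, while the Business plan narrows that gap to about an hour.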
As for historical data, all plans allow loading up to 2 months of your historical data, with the Business plan allowing for full historical backfills. Since Segment’s environment supports many customers, a premium plan is required to collect complete history and to sync data at the fastest rate.
Option 2: Leverage data pipeline services
The second way to get data out of Segment into your data warehouse is through data pipeline platforms. Data pipeline or ETL (Extract, Transform, Load) platforms provide prebuilt integrations to more than 100 enterprise software sources and focus on a maintenance-free structure where replicated data is automatically transformed, standardized, and normalized on collection. Automated adjustment to schema and API changes lets business users take on what would otherwise be developer tasks, with no coding required. Companies like Stitchdata (“Stitch”) and Fivetran, leaders in the space, offer frictionless, subscription-based plans that make integrating data into data warehouse destinations convenient for a business of any size.
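As a rough idea of what "normalized on collection" can mean, the sketch below flattens a nested event record into flat, column-like keys that fit a warehouse table. This is a generic illustration, not the actual logic used by Stitch or Fivetran; the function name `flatten_record` and the underscore separator are assumptions.

```python
def flatten_record(record: dict, parent_key: str = "", sep: str = "_") -> dict:
    """Flatten nested dicts into a single level of column-like keys."""
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse so that {"a": {"b": 1}} becomes {"a_b": 1}.
            flat.update(flatten_record(value, column, sep))
        else:
            flat[column] = value
    return flat


if __name__ == "__main__":
    raw = {
        "event": "Order Completed",
        "properties": {"total": 59.0, "shipping": {"city": "Austin"}},
    }
    print(flatten_record(raw))
```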
To set up, simply sign into your console, click on the Segment icon in the available integrations, and enable. You will automatically be pushed into the Segment tool to confirm authorization and (another voila!) data will begin replicating.
The benefits of cloud-ETL platforms include not only their out-of-the-box integrations, but also a list of features that help visualize, maintain, and support ongoing data integration tasks:
- More than 100 database and SaaS platform integrations
- In-app support including email alert monitoring and support SLAs
- 14-day free trial to kick-off and vet the platform prior to fully onboarding
- SOC2 security compliant, encrypted communication and an AWS cloud backed environment
With the appropriate event tracking configured at data collection by Littledata, your data can be properly analyzed for ecommerce store performance. Downstream reports can then be broken down by:
- Customer behavior before, during and after purchase
- Order performance relative to average order value, add-to-carts, average order size, and cart abandonment
- Shopper engagement including product views and purchases
- Coupon and discounting activities
- Customer checkout funnel and stage of drop-off
- Conversion rate and lifetime value
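Two of the metrics listed above, average order value and checkout conversion, can be sketched as simple queries. The example uses Python's built-in `sqlite3` as a stand-in for Redshift so that it runs anywhere; the table names `checkout_started` and `order_completed`, their columns, and the sample rows are assumptions for illustration and may not match the schema in your warehouse.

```python
import sqlite3

# In-memory database standing in for the warehouse, with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE checkout_started (anonymous_id TEXT);
    CREATE TABLE order_completed (anonymous_id TEXT, order_id TEXT, total REAL);
    INSERT INTO checkout_started VALUES ('a'), ('b'), ('c'), ('d');
    INSERT INTO order_completed VALUES ('a', '1001', 40.0), ('b', '1002', 60.0);
    """
)


def average_order_value(db: sqlite3.Connection) -> float:
    """Mean order total across all completed orders."""
    (aov,) = db.execute("SELECT AVG(total) FROM order_completed").fetchone()
    return aov


def checkout_conversion_rate(db: sqlite3.Connection) -> float:
    """Share of visitors who started checkout and went on to complete an order."""
    (started,) = db.execute(
        "SELECT COUNT(DISTINCT anonymous_id) FROM checkout_started"
    ).fetchone()
    (completed,) = db.execute(
        "SELECT COUNT(DISTINCT anonymous_id) FROM order_completed"
    ).fetchone()
    return completed / started if started else 0.0


if __name__ == "__main__":
    print("Average order value:", average_order_value(conn))
    print("Checkout conversion:", checkout_conversion_rate(conn))
```

The complement of the conversion rate gives the drop-off at that stage of the funnel; the same pattern extends to earlier stages such as product views and add-to-carts.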
With the emphasis on accuracy placed at the initial data collection stage, reporting on the above areas of performance becomes that much more straightforward. This means spending more time analyzing and visualizing data than transforming and modeling data for analytical use.
Empowering your data
Once your data is available in your data warehouse, replicating frequently, and building history, it’s time to utilize it. That can take a number of forms, depending on your business needs. Most notably, companies will focus on transforming data into actionable blocks and pushing it into business intelligence (BI) tools.
To properly stitch event data together (say, to tie together all interactions by a site visitor to achieve multi-channel attribution), companies can leverage existing packages that transform, marry and enrich data points. These packages, or prebuilt libraries, produce powerful results, restructuring data from its raw state into an analysis-ready one. Fishtown Analytics’ product dbt does just that, performing user-stitching, simplifying data structures, and speeding up data modeling so the data can be used right away within reporting, analytics, or machine learning applications.
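The idea behind user-stitching can be sketched in a few lines: build a map from anonymous visitor IDs to known user IDs using Identify records, then apply it to earlier anonymous events. This is a simplified illustration of the concept in plain Python, not dbt's implementation; the field names (`anonymous_id`, `user_id`, `timestamp`) and function names are assumptions.

```python
from typing import Dict, List


def build_identity_map(identifies: List[dict]) -> Dict[str, str]:
    """Map each anonymous_id to the most recently identified user_id."""
    identity: Dict[str, str] = {}
    # ISO 8601 timestamps in the same timezone sort correctly as strings.
    for row in sorted(identifies, key=lambda r: r["timestamp"]):
        identity[row["anonymous_id"]] = row["user_id"]
    return identity


def stitch_events(events: List[dict], identity: Dict[str, str]) -> List[dict]:
    """Add a stitched_user_id to every event, falling back to the anonymous ID."""
    stitched = []
    for event in events:
        user = event.get("user_id") or identity.get(
            event["anonymous_id"], event["anonymous_id"]
        )
        stitched.append({**event, "stitched_user_id": user})
    return stitched


if __name__ == "__main__":
    identifies = [
        {"anonymous_id": "anon-1", "user_id": "u-42", "timestamp": "2021-01-02T10:00:00Z"},
    ]
    events = [
        {"anonymous_id": "anon-1", "user_id": None, "event": "Product Viewed"},
        {"anonymous_id": "anon-1", "user_id": "u-42", "event": "Order Completed"},
        {"anonymous_id": "anon-2", "user_id": None, "event": "Product Viewed"},
    ]
    for row in stitch_events(events, build_identity_map(identifies)):
        print(row["event"], "->", row["stitched_user_id"])
```

Once the browsing event and the order share one stitched ID, the pre-purchase activity can be attributed to the same customer as the purchase.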
Companies usually begin the conversation here, “I’d like to see a dashboard like X” or “Can we get a report showing Y?”. In fact, what they are looking for is a way to properly view data in digestible, actionable views.
BI (Business Intelligence) tools do just that – whether it’s through data visualizations (dashboards), self-service analytics, or prebuilt reporting. Enterprise BI and SaaS tools like Looker and Tableau (as outlined in the table below) create a speedy path to data viewing. They can simply be connected to a data warehouse and publish dynamic views for instant performance tracking.
BI Tools Breakdown
|Tools|Category|
|---|---|
|Tableau, Looker, PowerBI, Mode, Databricks|Enterprise tier platforms with extended features|
|Domo, Klipfolio, Kissmetrics, Sigma|SaaS-oriented products with cost on user and dashboard use|
|Glew, Daasity, Dashthis, Rubix3|Ecommerce focused with prebuilt visuals|
Another option is connecting reporting tools directly to Google Analytics data in parallel with your Redshift setup (for example, use Tableau on top of GA for marketing analysis and Looker on top of Redshift for deeper analysis and predictive analytics).
Building for the future
Companies that put an emphasis on building the foundational components of data ingestion, management and analytics early on see many benefits. Primarily, you increase your ability to measure and understand your business properly. Data warehouses provide an opportunity to collect all of your store, site, customer, marketing, and other relational data in one place. This creates a centralized view of your business and gives an upper hand to companies looking to take a data-driven approach to growth.
Cloud tools and no-code options remove the need for technical resources, freeing up dollars that can go elsewhere without sacrificing the ability to use and analyze data. No matter the size of your business, taking data seriously is the first step to empowering your business for the future. Data warehouses are no longer the property of only mega enterprises.
Want to build a modern ecommerce data stack but not sure where to start? Get in touch for a free consultation.