The Ultimate Guide to connecting Segment to Redshift (and other powerful analytics tools)
Cloud data warehouses offer a way for ecommerce companies to scale as the size of their data increases, promising unlimited storage space, cost optimization and analytics horsepower. But where do you start? Are there no-code solutions that are also best-in-class?
Segment is an increasingly popular way to connect website data to a data warehouse such as AWS Redshift. In this guide we'll take a close look at exactly how this works, and the pros and cons for your long-term company data needs.
Using Segment to connect Shopify to AWS Redshift
What is Segment?
Segment is a powerful Customer Data Platform (CDP) solution, but it's also much more than that. Segment gives businesses the ability to route customer activity events from various platforms to a broad range of destinations. One of those destinations can be a data warehouse - a system that serves as the centralized store for collected data. This includes the big three: BigQuery, Redshift, and Snowflake. The technology focuses on the collection, storage, and management of business data, with the purpose of turning operational data into meaningful information.
For any company looking to harness the value of the activities gathered inside their CDP, it’s a no-brainer that bringing a data warehouse into the mix is the best next step. Amazon Web Services (AWS) and its data warehouse offering, Redshift, remain the market leader in this space because of their compatibility with data integration pipelines and analytics tools.
For ecommerce sites, sending store data through Segment to a warehouse can be difficult to implement manually (not to mention maintenance time, costs and complexity!), but Littledata's Shopify source for Segment does this automatically.
With Littledata’s capabilities, you can direct, track, and identify custom events across all critical customer activities on your Shopify website, whether that's a simple Shopify instance, a headless Shopify setup or multiple country stores doing international sales. Coupling that with Segment’s unified CDP takes that data through to activation, directing platform data to marketing channels for increased engagement, conversion and retention.
Whether you want to use a data warehouse for deep analysis, audience building or real-time recommendations, Littledata + Segment + Redshift is a proven solution for Shopify stores.
Setting up your Redshift data warehouse
Segment's documentation portal gives a step-by-step breakdown of provisioning a Redshift cluster, configuring a database user, securing data ingestion, and providing a path to data collection into your Redshift instance. Broken down into digestible chunks, here are the necessary steps to go from data to data warehouse:
1. Choose the best instance for your needs: Dense Compute vs. Dense Storage
2. Provision a new Redshift cluster: 5 simple steps from start to finish
3. Create a database user: creating a user to manage your instance
4. Connect Redshift to Segment: select sources, credentials, and go
Redshift allows users to start small and scale up on demand as needs grow.
Collecting events in Segment
Event tracking is a critical part of the data collection process. Creating a tracking plan tied to measurable business outcomes, such as acquiring new customers, increasing retention and activating new leads, and mapping those outcomes to business goals, is an important step in the data journey. Understanding this relationship will guide you to the relevant events or actions that must be configured for tracking.
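To make the idea of a tracking plan concrete, here is a minimal sketch in Python. It assumes a hypothetical plan that maps business outcomes to event names (borrowed from the event lists in this guide) and a hypothetical helper, `build_track`, that shapes a Segment-style track payload using the common `type`, `event`, `userId`, `properties` and `timestamp` fields. It is an illustration only, not Littledata's or Segment's implementation, and the order values are made up.

```python
from datetime import datetime, timezone

# Hypothetical tracking plan: business outcome -> events that measure it.
TRACKING_PLAN = {
    "acquire new customers": ["Customer Created", "Order Completed"],
    "increase retention": ["Order Completed", "Order Refunded"],
}


def planned_events(plan):
    """Return the set of every event name mentioned in the plan."""
    return {event for events in plan.values() for event in events}


def build_track(event, properties, user_id, plan=TRACKING_PLAN):
    """Build a Segment-style track payload, rejecting unplanned events."""
    if event not in planned_events(plan):
        raise ValueError(f"'{event}' is not in the tracking plan")
    return {
        "type": "track",
        "event": event,
        "userId": user_id,
        "properties": dict(properties),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Example with made-up values.
order = build_track(
    "Order Completed",
    {"order_id": "1001", "total": 59.9, "currency": "USD"},
    user_id="cust_42",
)
print(order["event"], order["properties"]["total"])
```

Rejecting events that are not in the plan is one way to keep stray event names out of the warehouse; a real setup might log and drop them instead.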
With Littledata's automated solution, you can avoid the blocking-and-tackling of configuring best-in-class event strategies for (client-side) device-mode and (server-side) cloud-mode events:
- Device-mode events include Cart Viewed, Page Viewed, Product Clicked, Product Image Clicked, Product List Viewed, Product Shared, Product Viewed, Products Searched, Registration Viewed, Thank you Page Viewed
- Cloud-mode events include Checkout Started, Checkout Step Completed, Coupon Applied, Customer Created, Customer Enabled, Fulfillment Created, Fulfillment Updated, Order Cancelled, Order Completed, Order Refunded, POS Order Placed, Payment Failure, Payment Info Entered, Product Added, Product Removed
To streamline the process for ecommerce sites, Littledata's tracking script automatically sends events to Segment through its analytics.js library, making it easy to collect all the critical event activities associated with a customer’s store journey - from browsing behavior through the checkout funnel and repeat purchases (including recurring billing for stores selling by subscription).
Additionally, for every event where there is an identifiable customer (in both device-mode and cloud-mode), Littledata will send an Identify call: when the customer logs into your storefront, at the last step of the checkout process, with the order, and after a purchase with a customer update.
With Littledata’s streamlined modeling, data can be accurately represented and pushed to downstream destinations, such as marketing activation channels and data warehouses.
Connecting Segment data to your data warehouse
Now that your Redshift instance is up and running, the next step is to connect to Segment and start collecting data into your data warehouse.
There are two ways to complete this step: through Segment’s native migration, or by using no-code data pipeline tools (recommended). Whichever process you choose, you will be able to push data out of Segment into your data warehouse environment and start using it across your business.
Option 1: Segment’s native migration
As mentioned, a Redshift data warehouse is one of the many destinations that Segment can send data to. You can connect directly to Redshift from within Segment to stream event data.
Segment’s catalog provides direct integration to best-in-class data warehouses.
Essentially, it’s as simple as:
1. Log in to your Segment App and proceed to the Catalog section
2. In the top menu, choose Destinations
3. Select Redshift in the Storage Destinations list
After configuring your user permissions and selecting the data sources you would like to sync, you’ll enter your credentials and connect to your data warehouse. Voila! Data will now be continuously replicated into your Redshift instance based on your plan:
- Free: Data refreshed (synced) 1x per day
- Team: Data refreshed (synced) 2x per day
- Business: Data refreshed (synced) as fast as hourly
As for historical data, all plans allow loading up to 2 months of history, with the Business plan allowing full historical backfills. In other words, a premium plan is required to collect complete history and to sync as frequently as hourly.
Segment’s infrastructure is suitable for instantaneous data collection to downstream points.
Option 2: Leverage data pipeline services
The second way to get data out of Segment into your data warehouse is through data pipeline platforms. Data pipeline or ETL (Extract, Transform, Load) platforms provide prebuilt integrations to more than 100 enterprise software sources, and focus on a maintenance-free structure where replica data is automatically transformed, standardized, and normalized on collection.
The automated adjustment to schema and API changes allows business users to streamline developer tasks in an environment that requires no coding. Companies like Stitchdata ("Stitch") and Fivetran, leaders in the space, provide frictionless, subscription-based plans that make integrating data into data warehouse destinations convenient for a business of any size.
ETL platforms streamline data from end to end and require limited technical lift.
To set up, simply sign into your console, click on the Segment icon in the available integrations, and enable it. You will automatically be redirected to Segment to confirm authorization and (another voila!) data will begin replicating.
Stitchdata’s user-friendly interface for connecting platforms to destinations
The benefits of cloud ETL platforms include not only their out-of-the-box integrations, but also the features that help visualize, maintain, and support ongoing data integration tasks:
- More than 100 database and SaaS platform integrations
- In-app support including email alert monitoring and support SLAs
- 14-day free trial to kick off and vet the platform prior to fully onboarding
- SOC2 security compliance, encrypted communication and an AWS cloud-backed environment
Ecommerce data
With the appropriate event tracking configured at data collection by Littledata, your data can be properly analyzed for ecommerce store performance. The downstream output can be broken down by:
- Customer behavior before, during and after purchase
- Order performance relative to average order value, add-to-carts, average order size, and cart abandonment
- Shopper engagement including product views and purchases
- Coupon and discounting activities
- Customer checkout funnel and stage of drop-off
- Conversion rate and lifetime value
With accuracy taken care of at the initial data collection stage, producing the performance views above becomes that much more straightforward.
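As a rough illustration of how two of the measures above fall out of clean event data, here is a minimal sketch in Python. It assumes each event is a plain dict with an `event` name and, for orders, a `total`; the function name `funnel_metrics` and the sample data are hypothetical. Abandonment is defined here as the share of started checkouts that never became an order, which is only one of several possible definitions.

```python
from collections import Counter


def funnel_metrics(events):
    """Summarize order performance from a list of event dicts."""
    counts = Counter(e["event"] for e in events)
    orders = counts["Order Completed"]
    checkouts = counts["Checkout Started"]
    revenue = sum(
        e.get("total", 0.0) for e in events if e["event"] == "Order Completed"
    )
    return {
        "orders": orders,
        "average_order_value": revenue / orders if orders else 0.0,
        # Share of started checkouts that never became an order.
        "checkout_abandonment_rate": (
            1 - orders / checkouts if checkouts else 0.0
        ),
    }


# Made-up sample data: four checkouts started, two completed.
sample_events = [
    {"event": "Checkout Started"},
    {"event": "Checkout Started"},
    {"event": "Checkout Started"},
    {"event": "Checkout Started"},
    {"event": "Order Completed", "total": 40.0},
    {"event": "Order Completed", "total": 60.0},
]
metrics = funnel_metrics(sample_events)
print(metrics)
```

Because the event names arrive consistent from collection, the calculation needs no cleanup step before counting.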
Accurate collection means spending more time analyzing and visualizing data than transforming and modeling it for analytical use.
Empowering your data
Once your data is available in your data warehouse, replicating frequently, and building history, it’s time to use it. That can take a number of forms, depending on your business needs. Most notably, companies focus on transforming data into actionable blocks and pushing it into business intelligence (BI) tools.
Transformation
To properly stitch event data together - say, to tie all interactions by a site visitor together to achieve multi-channel attribution - companies can leverage existing packages that transform, marry and enrich data points. These packages - or prebuilt libraries - restructure data from its raw state into an analysis-ready form. Fishtown Analytics’ product dbt does just that, performing user-stitching, simplifying data structures, and speeding up data modeling so the data can be used right away in reporting, analytics, or machine learning applications.
Leveraging transformation can streamline data modeling and enrich data for analytical use.
BI Tools
Companies usually begin the conversation here: “I’d like to see a dashboard like X” or “Can we get a report showing Y?”. In fact, what they are looking for is a way to see data in digestible, actionable views. BI (Business Intelligence) tools do just that - whether it’s through data visualizations (dashboards), self-service analytics, or prebuilt reporting. Enterprise BI and SaaS tools like Looker and Tableau (as outlined in the table below) create a speedy path to data viewing. They can simply be connected to a data warehouse and publish dynamic views for instant performance tracking.
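To make the user-stitching step from the Transformation section concrete, here is a minimal sketch in Python. It is a stand-in for what would normally be SQL models in the warehouse, not dbt's actual code. It assumes rows carrying `anonymous_id`, `user_id` and `timestamp` fields; the function names and the sample rows are hypothetical.

```python
def build_identity_map(identifies):
    """Map each anonymous_id to the most recently identified user_id."""
    mapping = {}
    # ISO 8601 timestamps in one format sort correctly as strings.
    for row in sorted(identifies, key=lambda r: r["timestamp"]):
        mapping[row["anonymous_id"]] = row["user_id"]
    return mapping


def stitch(events, identity_map):
    """Attach a stitched_id to every event.

    Prefer the event's own user_id, then the identity map, and fall
    back to the anonymous_id for visitors who never identified.
    """
    stitched = []
    for e in events:
        resolved = e.get("user_id") or identity_map.get(
            e["anonymous_id"], e["anonymous_id"]
        )
        stitched.append({**e, "stitched_id": resolved})
    return stitched


# Made-up sample rows.
identifies = [
    {"anonymous_id": "a1", "user_id": "u1", "timestamp": "2021-01-02T00:00:00Z"},
]
events = [
    {"anonymous_id": "a1", "user_id": None, "event": "Product Viewed"},
    {"anonymous_id": "a2", "user_id": None, "event": "Page Viewed"},
]
stitched = stitch(events, build_identity_map(identifies))
print([e["stitched_id"] for e in stitched])
```

The pre-login Product Viewed event ends up attributed to the known customer, which is what makes multi-channel attribution possible on top of raw event tables.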
Data can be presented in dashboards across many dynamic charts, tables, and graphs.
BI Tools Breakdown
Category | Vendors | Breakdown
Market Leaders | Tableau, Looker, PowerBI, Mode, Databricks | Enterprise tier platforms with extended features
Risers | Domo, Klipfolio, Kissmetrics, Sigma | SaaS-oriented products with cost based on user and dashboard use
Prebuilt | Glew, Daasity, Dashthis, Rubix3 | Ecommerce focused with prebuilt visuals
Open | DataStudio, Metabase | Open-source/no-cost platforms
So a straightforward reporting and visualization solution with the setup we've described in this article would be to connect Shopify to Segment, then Segment to Redshift, then Redshift to Tableau.
Learn more about how to connect BI tools to your Shopify data in Segment, whether as a Segment destination using alias calls or a dynamic view pulling from data in your warehouse. Another option is connecting reporting tools directly to Google Analytics data in parallel with your Redshift setup (for example, use Tableau on top of GA for marketing analysis and Looker on top of Redshift for deeper analysis and predictive analytics).
Building for the future
Companies that put an emphasis on building the foundational components of data ingestion, management and analytics early on see many benefits. Primarily, you increase your ability to measure and understand your business properly. Data warehouses provide an opportunity to collect all of your store, site, customer, marketing, and other relational data in one place. This creates a centralized view of your business and gives an upper hand to companies looking to take a data-driven approach to growth.
Cloud tools and no-code options reduce the need for technical resources, freeing up dollars that can go elsewhere without sacrificing the ability to use and analyze data. No matter the size of your business, taking data seriously is the first step to empowering your business for the future. Data warehouses are no longer the property of only mega enterprises.
Want to build a modern ecommerce data stack but not sure where to start? Get in touch for a free consultation.
Measuring screen resolution versus viewport size
There’s a difference between the ‘screen size’ measured as standard in Google Analytics and the ‘browser size’ or ‘browser viewport’. Especially on mobile devices, there are pitfalls in comparing the two. The browser viewport is the actual visible area of the HTML, after the width of scroll bars and the height of button, address, plugin and status bars have been allowed for. Desktop computer screens have got much bigger over the last decade, but browser viewports (the visible area within the browser window) have not. The CSS-Tricks site found that only 1% of users browse with their window at full screen. While only 9% of visitors to that site had a monitor less than 1200px wide in 2011, around 21% of users had a browser viewport of less than that width. Simply put, on a huge monitor you don’t browse the web using your full screen. Therefore, 'screen resolution' may be much larger than 'viewport size'. The best solution is to post browser viewport size to GA as a custom dimension. P.S. Google Analytics does have a feature within In Page Analytics (under the Behaviour section) to overlay Browser Size, but it doesn’t work for any of the sites I look at.
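One way to picture "posting viewport size as a custom dimension" is as an extra parameter on the pageview hit. The sketch below, in Python, builds a Universal Analytics Measurement Protocol query string with the screen resolution in `sr` and the viewport in a custom dimension (`cd1`). It assumes the viewport width and height have already been measured in the browser, that a custom dimension with index 1 has been created in the GA admin, and that the tracking ID and client ID shown are placeholders. The function name `build_pageview_hit` is hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urlencode


def build_pageview_hit(tracking_id, client_id, page, screen, viewport,
                       dimension_index=1):
    """Build a pageview hit carrying viewport size as a custom dimension."""
    params = {
        "v": "1",                # protocol version
        "tid": tracking_id,      # property ID (placeholder below)
        "cid": client_id,        # anonymous client ID
        "t": "pageview",         # hit type
        "dp": page,              # document path
        "sr": f"{screen[0]}x{screen[1]}",
        # Viewport goes into the custom dimension at the given index.
        f"cd{dimension_index}": f"{viewport[0]}x{viewport[1]}",
    }
    return urlencode(params)


# Made-up measurements: a large monitor with a much smaller browser window.
hit = build_pageview_hit(
    "UA-XXXXX-Y", "555", "/pricing",
    screen=(2560, 1440), viewport=(1180, 820),
)
print(hit)
```

With both values on the same hit, a report can compare screen resolution and viewport size side by side.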
How many websites use Google Analytics?
Google Analytics is clearly the number one web analytics tool globally. From a meta-analysis of different surveys, we estimate it is currently installed on over 50% of all websites, or 80% of operational websites using any kind of analytics tracking.
We looked at the following sources for this chart:
- Datanyze survey of Alexa top 1m sites (04/2014)
- BuiltWith survey of all websites (04/2014)
- MetricMail survey of Alexa top 1m sites
- Pingdom survey of Alexa top 10k sites (07/2012)
- W3Techs survey of their own sites (04/2014)
- LeadLedger survey of Fortune 500 sites (04/2014)
What's included in Analytics traffic sources?
The Channel report in Google Analytics (under the 'Acquisition' section) splits visits into 6 or more channel types:
Direct
Where a visitor has:
- typed the URL into the address bar
- clicked on a link which is NOT in another web page (e.g. in a mobile app)
- visited a bookmarked link
Organic Search
All visits from search engines (i.e. Google, Bing, Yahoo) which were not an advertisement. You used to be able to filter out people searching for your brand (which are more like Direct visits), but now the search terms are not provided.
Paid Search
Visits from search engines where the visitor clicked on an advert.
Referral
Where a visitor has clicked on a link in another website (not your own domain), but not including search engines or social networks.
Social Networks
Specifically links from known social network websites (including Facebook, Twitter, LinkedIn etc).
Email
From links tagged as medium = 'email'. Your email software needs to be configured correctly to add this tag.
Display
Links tagged as 'display' or 'cpm'.
FAQs
Can I change the channel groupings?
Yes, you can change this under Admin > (Selected View) > Channel Grouping. But we recommend you don't do this for your default view, as you won't be able to compare the historical data.
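The channel definitions above can be read as a small set of rules on a visit's source and medium. The sketch below, in Python, is a simplified approximation for illustration, not Google Analytics' exact grouping logic. The social network list is a hypothetical short sample, and treating a medium of 'cpc' or 'ppc' as Paid Search is an assumption beyond what the text states.

```python
# Hypothetical short list; GA maintains its own list of social sources.
SOCIAL_NETWORKS = {"facebook.com", "twitter.com", "linkedin.com"}


def classify_channel(source, medium):
    """Approximate the default channel for a visit's source and medium."""
    source = (source or "").lower()
    medium = (medium or "").lower()
    if medium == "email":
        return "Email"
    if medium in {"display", "cpm"}:
        return "Display"
    if medium in {"cpc", "ppc"}:  # assumed paid-search mediums
        return "Paid Search"
    if medium == "organic":
        return "Organic Search"
    if source in SOCIAL_NETWORKS:
        return "Social"
    if medium == "referral":
        return "Referral"
    if source in {"", "(direct)"} and medium in {"", "(none)", "(not set)"}:
        return "Direct"
    return "(Other)"


print(classify_channel("google", "organic"))
print(classify_channel("facebook.com", "referral"))
```

Social is checked before Referral because a click from a social network would otherwise match the generic referral rule.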