From Data Chaos to Zen: How a Strong Development and Data Team Can Optimize Your Data Strategy

Introduction:

This post covers how to go from data chaos to zen: why data is vital, how to prepare to consume it, and how a strong development and data team can optimize your data strategy.

If data is the lifeblood of an organization, then the “product” is the heart. A company’s products, mobile apps, websites, APIs, software, hardware, systems, etc., produce mountains of data, much of which goes unnoticed, let alone used. Without proper planning, management, and organization, it can quickly become chaotic and overwhelming. Often, companies that are not accustomed to the constant data inundation feel a sense of analysis paralysis when they start down the road of making data a primary driver of their business. It feels like there is so much to consume, so much to understand, so much to utilize, and, at worst, so much ignored.

Getting to a point where a business feels it has a grasp on its data is in and of itself a daunting task. Researching all the options out there… Should they build their own solution? Can they develop their own system? Should they just buy a solution from one of the many options that showed up in their search? If they buy one of those solutions, how do they know it’s right? Can they maintain it? …So many questions, where do we start?…

One of our partners, Hightouch, wrote a fantastic article stating there is a third option in the build vs buy debate, at least regarding data ecosystems. I agree with the outcome in the article that composability is the future, and I would like to add that while a Composable CDP (Customer Data Platform) is indeed a salve to cure many of the data woes I stated above, it is only part of the solution.

At Coalesced, we believe these are the core areas involved in reaching (or at least striving towards) a zen state of data:

The product(s) producing the data
The systems facilitating data
The operationalization strategy for data

I’d like to lay out the value of these three areas and how and where Coalesced can come along to assist in reaching your goals.

Product:

We have worked with numerous clients who felt that simply adding Google Analytics (GA) to their marketing site was sufficient to understand their customers’ behavior, and that in having GA, they had their data needs covered. Yet, after having discovery calls and understanding the client’s vision and strategy, we find they are woefully blind to data generated within their systems that just vanishes into the ether that is “uncaptured data.” We work with them to help educate their teams on the wealth of “behavioral” data often locked in their sites, apps, etc.: user purchasing patterns, content viewing patterns, exit intents, upsell opportunities, predictive cancellations, user experience blockers, and the list goes on. Still, none of this is visible if it isn’t thought about and planned for in the design and development phases. Intentionality is required before a single pixel is added to a prototype or one line of code is written, to determine the data we need to capture “here.” Much of this data is only accessible at a moment in time. How many visitors did you have on your site yesterday? Don’t know? If you didn’t capture it, you’ll never know. This is the simplest example, and it is why I think many companies default to just a GA implementation. Don’t get me wrong, that is better than nothing, but it is massively insufficient for the data world we live in today. The user actions within your systems must be proactively contemplated and implemented to ensure you truly have the knowledge you need to operationalize your data.

Systems:

Much like the intentionality required to plan the data capture within the “product,” equal planning must go into the data systems themselves. If the data ecosystem isn’t in place, there is nowhere for the data to go. This system is just as important as the actual capture itself.

Without belaboring the full buildout of the data stack, I’ll cover the highlights and dive into the data utilization.

Let’s start with where the data will be stored. You should consider platform and system architecture, scalability, and data volume to select your destination, but the data warehouse is the central location for all data to live. Examples are Snowflake, Databricks, Google BigQuery, and Amazon Redshift. This system should become your single source of truth for all your data. Having this single source will grease the proverbial wheels for everything when it comes to operationalizing your data.

Getting to the single source of truth involves syncing your generated data and third-party data into your warehouse. I’ll circle back to the “product” data in a moment and cover third-party quickly. Systems such as Fivetran and Dataddo are great solutions for integrating data for systems you do not manage the data architecture of, for example, Google Analytics, YouTube, Stripe, Shopify, Mailchimp, etc. The value of syncing this external data to your warehouse becomes immense when you can then link it to your internal systems. Mixing these data sets is where operationalization starts to take form.
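To make the “linking” step concrete, here is a minimal sketch of what mixing a synced third-party data set with internal data can look like once both sit in the warehouse. It uses Python’s built-in sqlite3 as a stand-in for a real warehouse, and the table names, column names, and sample rows (app_users, stripe_charges, and so on) are assumptions made up for illustration, not the schema any particular connector produces.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_users (user_id INTEGER, email TEXT, plan TEXT);
    CREATE TABLE stripe_charges (customer_email TEXT, amount_cents INTEGER);

    INSERT INTO app_users VALUES
        (1, 'ana@example.com', 'pro'),
        (2, 'ben@example.com', 'free');
    INSERT INTO stripe_charges VALUES
        ('ana@example.com', 4900),
        ('ana@example.com', 4900),
        ('ben@example.com', 900);
""")

# Link synced third-party payments to internal user records on a shared key.
revenue_by_user = conn.execute("""
    SELECT u.user_id, u.plan, SUM(c.amount_cents) AS total_cents
    FROM app_users AS u
    JOIN stripe_charges AS c ON c.customer_email = u.email
    GROUP BY u.user_id, u.plan
    ORDER BY u.user_id
""").fetchall()

print(revenue_by_user)
```

The join key here is an email address purely for simplicity; in practice you would pick whichever identifier both systems reliably share.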

Your “product” data typically comes from two sources, assuming you have built something custom: your application database and the actions within your applications. Syncing the data from your database can be reasonably straightforward with tools such as Fivetran.

Syncing your users’ actions is where all the intentionality mentioned earlier comes into play. Platforms like Hightouch are great solutions for getting the data to your warehouse.

You’ll be able to add JavaScript snippets to your website and web app to start capturing all those page views, along with a wealth of additional data about the user. You can add explicit “track” calls on the actions you want to know have been performed. You’ll be capturing an anonymous identifier for the users on your websites. Later, when a user performs an identifiable action, such as login, sign up, or purchase, you can call an identify function to tie that anonymous identifier to a known user, so you can build a robust “map” of the journey a user took to get to a goal you define.
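The track-then-identify idea is easier to see in a toy model. The sketch below is not any vendor’s SDK; it is a small in-memory Python illustration, and the class, method names, event names, and ids (EventLog, journey, "anon-123", "user-42") are all hypothetical. It shows how events recorded under an anonymous identifier become part of a known user’s journey once an identify call links the two.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Toy in-memory model of track/identify; not any vendor's actual SDK."""
    events: list = field(default_factory=list)      # (anonymous_id, event_name)
    identities: dict = field(default_factory=dict)  # anonymous_id -> user_id

    def track(self, anonymous_id: str, event_name: str) -> None:
        # Record an action against the visitor's anonymous identifier.
        self.events.append((anonymous_id, event_name))

    def identify(self, anonymous_id: str, user_id: str) -> None:
        # Link the anonymous identifier to a known user (login, sign up, purchase).
        self.identities[anonymous_id] = user_id

    def journey(self, user_id: str) -> list:
        # Every event, in order, from anonymous ids linked to this user,
        # including events recorded before identify was called.
        return [name for anon, name in self.events
                if self.identities.get(anon) == user_id]


log = EventLog()
log.track("anon-123", "Page Viewed: /pricing")
log.track("anon-123", "Clicked: Start Trial")
log.identify("anon-123", "user-42")
log.track("anon-123", "Signed Up")
log.track("anon-999", "Page Viewed: /blog")  # a different, still-anonymous visitor

print(log.journey("user-42"))
```

The point of the design is that nothing is lost by tracking before you know who the user is: the pricing page view and the trial click are attributed to user-42 after the fact, while the other visitor’s events stay anonymous.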

You can add platform-specific SDKs to track all user actions within your applications. Tracking on both the web and apps allows broken user journeys to be brought back together; for example, you served an ad on social media that took a user to your site, who ultimately left without a signup and subsequently downloaded your app where they signed up. Now, with that Identify action, you can see the entire journey.

Operationalization:

Ok, so we have planned what data our “products” need to be concerned about. We have determined which warehouse we will store the data in and have systems in place to send data to the warehouse. Now what?

We need to use the data, thus operationalizing it to yield value to the business. Capturing and storing data is not the goal… Using the data to glean insights for the company and ultimately improve user experiences to grow revenue and scale is the goal. So how do we do that?

An excellent place to start is to let business users access the data and see graphs, charts, and tables that help them make decisions. There are numerous visualization tools, like Klipfolio, Looker, Sigma, Preset, and Tableau, that let users see pre-built dashboards and ultimately explore the data. This type of access creates data curiosity, which is a great place to be as a company. You want your team to be very data-curious. This means they want to know what is happening and, more importantly, why.

A tool to “see” the data is a core requirement, but a prerequisite to visualizing the data is preparing and modeling it for consumption. For this, you’ll want a solid tool to model the data into structures that are meaningful to business users; this will involve data cleanup steps, such as deduplication, unioning of historical data, integration of third-party data, and data aggregation. One of the best tools on the market for these tasks is dbt. Not only does dbt accomplish all the itemized tasks, but it is incredibly developer-friendly and has fantastic integrations with systems like Klipfolio to extract all the power from the models, with access to benefits like PowerMetrics and the semantic models of dbt.
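Since dbt models are written as SQL SELECT statements, the cleanup steps listed above can be sketched as a single query. The example below runs that kind of query through Python’s built-in sqlite3 rather than through dbt itself, and the tables (orders_legacy, orders_current) and their contents are invented for illustration. It unions historical and current data, removes a duplicated row, and aggregates to a daily summary.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; table names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_legacy (order_id INTEGER, order_date TEXT, amount REAL);
    CREATE TABLE orders_current (order_id INTEGER, order_date TEXT, amount REAL);

    INSERT INTO orders_legacy VALUES
        (1, '2023-12-31', 20.0),
        (2, '2023-12-31', 35.0);
    -- order 3 was loaded twice, a typical sync artifact
    INSERT INTO orders_current VALUES
        (3, '2024-01-01', 50.0),
        (3, '2024-01-01', 50.0),
        (4, '2024-01-01', 15.0);
""")

daily_revenue = conn.execute("""
    WITH all_orders AS (      -- union historical and current data
        SELECT order_id, order_date, amount FROM orders_legacy
        UNION ALL
        SELECT order_id, order_date, amount FROM orders_current
    ),
    deduped AS (              -- keep one row per order_id
        SELECT order_id, MIN(order_date) AS order_date, MAX(amount) AS amount
        FROM all_orders
        GROUP BY order_id
    )
    SELECT order_date, COUNT(*) AS orders, SUM(amount) AS revenue   -- aggregate
    FROM deduped
    GROUP BY order_date
    ORDER BY order_date
""").fetchall()

print(daily_revenue)
```

In a dbt project, each of these steps would more likely live in its own model so it can be tested and reused; they are collapsed into one query here only to keep the sketch short.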

Once you have met the requirements of base business dashboards and extended internal user access to data with strong data models, it’s time to use this single source of truth to send data to external systems. Re-enter Hightouch with its reverse ETL product and warehouse-first approach for data sources and hundreds of destination integrations. You can use SQL statements to query and configure data to meet near-infinite requirements to send your data to your third-party partners, like Social Media Ad platforms, email platforms, and even back into your application database to bring data full circle.
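As a rough picture of the reverse ETL pattern, the sketch below uses a SQL statement to define which rows should leave the warehouse and then shapes each row into a record for a destination. It is an illustration only: Python’s built-in sqlite3 stands in for the warehouse, the user_metrics table, the "win_back" segment, and the build_payloads helper are all hypothetical, and the actual delivery to a destination (which a tool like Hightouch would handle) is deliberately left out.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_metrics (email TEXT, plan TEXT, days_since_last_login INTEGER);
    INSERT INTO user_metrics VALUES
        ('ana@example.com', 'pro', 2),
        ('ben@example.com', 'pro', 45),
        ('cy@example.com', 'free', 60);
""")

# The SQL statement defines the audience: paying users who have gone quiet.
AUDIENCE_SQL = """
    SELECT email, days_since_last_login
    FROM user_metrics
    WHERE plan = 'pro' AND days_since_last_login > 30
    ORDER BY email
"""


def build_payloads(connection):
    # Shape each row into the record a destination (e.g. an email platform)
    # would receive. Sending the records is out of scope for this sketch.
    rows = connection.execute(AUDIENCE_SQL).fetchall()
    return [{"email": email, "segment": "win_back", "inactive_days": days}
            for email, days in rows]


payloads = build_payloads(conn)
print(payloads)
```

Because the audience is just a query against the single source of truth, changing who receives a campaign means changing a WHERE clause rather than rebuilding an integration.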

Coalesced:

Coalesced has years of experience in development, data, and strategy. We have implemented numerous sites/applications and data systems, intending to bring these processes into a self-feeding loop of data generation and consumption. We have built systems ranging from the most basic to those with incredibly intricate custom requirements. We understand how these systems interact and, more importantly, how they impact the client and their customers. Let us help you take the next steps to a zen state with your data. Get in touch here or with the chat in the bottom right to see how we can help you with your data planning.

Conclusion:

Data is everywhere.

Some you control, some you don’t.

But all of it is important. Without it, you’re flying blind; with too little, you’re starved for knowledge; and with too much, you’re potentially drowning. It requires planning and intentionality to know what you need to be paying attention to, and it requires systems that support scalability and flexibility when things change. Coalesced is ready to help!