Growth Hacking, Retention, Acquisition, Conversion: Growthmint

Day 6: Analysis

You know which metrics you’re going to track. Your analytics software is set up. Now you need to make sense of your data.

Besides analytics software, it’s common for analysis to involve spreadsheets at some point. Excel is by far the most popular option, but it is heavy-duty. Google Sheets is a lightweight, free alternative that works well for collaboration.

You want a tool that allows you to sort metrics by other metrics. A key spreadsheet feature for this is the pivot table (both Excel and Google Sheets have them).

Pro Tip

School of Data offers free courses for people new to working with data. Take a look for further learning on the topics I mention in this lesson.

Cleaning up your data

Before you jump into analysis, you need to make sure your data is clean. Extra spaces, test data mixed in with production data and poorly formatted data can all make a data source unreliable.

Watch out for whitespace and other non-printable characters; your spreadsheet software will count them as data. This means that “data”, “ data” and “data ” (notice the extra spaces in the last two examples) will be treated as different data points, even though they should be treated identically.

Non-printable characters include:

  • White spaces in your spreadsheet cells, before and after your data points
  • Tabs inserted at the ends of lines
  • Line breaks and carriage returns
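If you clean your data outside the spreadsheet, the same idea can be sketched in a few lines of Python. This is a minimal sketch, not a complete cleaning routine: the helper name clean_cell and the sample cells are made up for illustration.

```python
import re

def clean_cell(value: str) -> str:
    """Collapse runs of whitespace into one space and trim both ends.

    In a regular expression, \\s matches spaces, tabs, line breaks
    and carriage returns.
    """
    return re.sub(r"\s+", " ", value).strip()

# Hypothetical cells that should all count as the same data point
raw_cells = ["data", " data", "data ", "data\t", "data\r\n"]
cleaned = [clean_cell(cell) for cell in raw_cells]

distinct_before = len(set(raw_cells))  # each variant counts separately
distinct_after = len(set(cleaned))     # all variants collapse together
print(distinct_before, distinct_after)
```

Counting distinct values before and after cleaning is a quick way to check whether stray whitespace was inflating the number of categories in a column.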

You’ll want to give your numbers identical formatting and normalize your text entries. With numbers, this means taking “50.1”, “50.10” and “50.100” and making them all “50.10”.
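As a rough sketch of the number formatting step, Python’s decimal module can rewrite every entry with the same number of decimal places. The function name normalize_number and the choice of two decimal places are assumptions for this example.

```python
from decimal import Decimal

def normalize_number(text: str, places: int = 2) -> str:
    """Return the number written with a fixed count of decimal places."""
    # Decimal parses the text exactly, so no binary float noise creeps in
    return format(Decimal(text.strip()), f".{places}f")

entries = ["50.1", "50.10", "50.100"]
normalized = [normalize_number(entry) for entry in entries]
print(normalized)
```

Note that entries with more digits than the chosen precision are rounded, so pick the number of places based on the most precise value you need to keep.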

For text entries, especially with data that has been entered manually by customers, you can end up with the same thing written in different ways. For example, “NYC”, “New York” and “New York City”. These all refer to the same thing, but your spreadsheet software will treat them differently.
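One common way to handle this is a lookup table that maps every known spelling to a single canonical label. The sketch below assumes you have already listed the variants by hand; CITY_ALIASES and normalize_city are hypothetical names, and entries that are not in the table are returned as-is (trimmed) so you can review them later.

```python
# Hypothetical alias table: lowercase variant -> canonical label
CITY_ALIASES = {
    "nyc": "New York City",
    "new york": "New York City",
    "new york city": "New York City",
}

def normalize_city(entry: str) -> str:
    """Map a manually typed city to its canonical spelling if known."""
    # Collapse extra whitespace and ignore case before the lookup
    key = " ".join(entry.split()).lower()
    return CITY_ALIASES.get(key, entry.strip())

typed = ["NYC", "New York", " new york city ", "Boston"]
canonical = [normalize_city(entry) for entry in typed]
print(canonical)
```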

Example

I worked on a data project that involved stats from top startup accelerators. One of our data points involved investors and one of the accelerators had their own investors listed in our data. Because I wanted to know what the data for outside investors looked like, the data from the accelerator’s own investors was unnecessary and I removed it for better results.

Pro Tip

Relying on poorly formatted data can render your analysis useless and really hurt your business when you’re making large strategic decisions based on it.

Pro Tip

Take a look at Data Wrangler. It’s a tool developed at Stanford for cleaning up your data.

Analyzing your data

Now you have a nice clean data set to work with. Before starting your analysis, it's ideal to have benchmarks for your product/industry. This way, it’s much easier to know if you’re seeing something out of the ordinary that needs further investigation.

Sorting data is the easiest way to start analyzing it. You just sort data by column in ascending or descending order and look at the top 5, 10 and 20 entries vs. the bottom entries. Is there anything that sticks out? What do the top entries or bottom entries have in common?
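Outside a spreadsheet, the same sort-and-compare step might look like the sketch below. The rows, the column name and the helper rank_by are invented for illustration, and the slices use 3 entries instead of 5, 10 or 20 only because the sample is tiny.

```python
# Made-up sample: one row per customer
rows = [
    {"customer": "A", "visits": 42},
    {"customer": "B", "visits": 3},
    {"customer": "C", "visits": 17},
    {"customer": "D", "visits": 8},
    {"customer": "E", "visits": 25},
    {"customer": "F", "visits": 1},
    {"customer": "G", "visits": 12},
    {"customer": "H", "visits": 30},
]

def rank_by(rows, column, descending=True):
    """Return a new list of rows sorted by the given column."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)

ranked = rank_by(rows, "visits")
top_3 = [row["customer"] for row in ranked[:3]]      # highest values first
bottom_3 = [row["customer"] for row in ranked[-3:]]  # lowest values last
print(top_3, bottom_3)
```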

Finding averages would be the next step. Look up the average of each numeric column and compare it with the averages of the top 5, 10 and 20 and the bottom 5, 10 and 20. How much higher is the average of the top 20 versus the entire set?
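Here is a small sketch of that comparison using Python’s statistics module. The visit counts are made up, and top_vs_overall is a hypothetical helper that returns how many times higher the top-n average is than the average of the whole set.

```python
from statistics import mean

# Made-up visit counts, one per customer
visits = [42, 3, 17, 8, 25, 1, 12, 30, 5, 9]

def top_vs_overall(values, n):
    """Ratio of the average of the n largest values to the overall average."""
    ranked = sorted(values, reverse=True)
    return mean(ranked[:n]) / mean(values)

overall_average = mean(visits)
ratio_top_3 = top_vs_overall(visits, 3)
print(overall_average, round(ratio_top_3, 2))
```

A ratio far above 1 tells you the top group behaves very differently from the typical entry, which is usually worth a closer look.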

You have to be cautious with averages, though. Extreme outliers pull averages away from normal behavioral trends and can make them unreliable. To give yourself a better picture, find the median and look at the average alongside it.
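To see how much a single outlier can move the average, here is a short illustration with invented numbers: five ordinary customers and one extreme one.

```python
from statistics import mean, median

# Made-up sessions per customer; the last value is an extreme outlier
sessions = [4, 5, 5, 6, 7, 250]

average = mean(sessions)   # pulled far upward by the single outlier
middle = median(sessions)  # stays close to typical behavior
print(round(average, 1), middle)
```

When the two numbers are far apart, as they are here, the median is usually the better description of a typical customer.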

Next up is the pivot table. A pivot table can automatically sort, count, total or average your data. It displays the results in a second table, the pivot table itself, which shows the summarized data.
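If it helps to see what a pivot table does under the hood, the sketch below groups rows by one column and then counts, totals and averages another. It is a simplified stand-in for the spreadsheet feature, not a replacement for it; the rows, the column names and the pivot function are all made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows: one row per order
rows = [
    {"channel": "organic", "revenue": 40},
    {"channel": "paid", "revenue": 25},
    {"channel": "organic", "revenue": 60},
    {"channel": "paid", "revenue": 35},
    {"channel": "email", "revenue": 50},
]

def pivot(rows, group_by, value):
    """Summarize `value` per distinct entry in the `group_by` column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row[value])
    # One summary row per group, sorted by group name
    return {
        name: {"count": len(vals), "total": sum(vals), "average": mean(vals)}
        for name, vals in sorted(groups.items())
    }

summary = pivot(rows, "channel", "revenue")
for name, stats in summary.items():
    print(name, stats)
```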

Pivot tables are a great way to look at the distribution of a metric. For example, they can be very helpful for finding the activation point of your customers. You could use a pivot table to look at how many people made a first visit, a second visit and so on. When the conversion rate from one visit to the next plateaus, there’s a good chance you’ve found your customers’ activation point.
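A rough sketch of that visit distribution, assuming you have one total visit count per customer. The data is invented, and reached_counts and step_conversion are hypothetical helper names.

```python
from collections import Counter

# Made-up data: total number of visits for each of 20 customers
visits_per_customer = [1] * 10 + [2] * 4 + [3] + [4] + [5] * 4

# How many customers stopped at exactly n visits
distribution = Counter(visits_per_customer)

def reached_counts(visit_counts):
    """Number of customers who made at least n visits, for each n."""
    return {
        n: sum(1 for v in visit_counts if v >= n)
        for n in range(1, max(visit_counts) + 1)
    }

def step_conversion(reached):
    """Share of customers at visit n who also came back for visit n + 1."""
    return {n: reached[n + 1] / reached[n] for n in reached if n + 1 in reached}

reached = reached_counts(visits_per_customer)
conversion = step_conversion(reached)
for n, rate in conversion.items():
    print(f"visit {n} -> {n + 1}: {rate:.0%}")
```

Reading the rates from top to bottom shows where the biggest drop-off happens and where the rate starts to level off.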

Example

I have used distribution charts many times to find powerful insights. For one client, it was only when I looked at their retention metrics with a distribution chart that it became clear how bad their conversion rate from first to second visit was. This meant that while they were putting money and time into acquisition, their customers weren’t being retained and never came back after their first visit.

Example

One of my clients had one customer who used their product vastly more than any other customer. While it’s great to have a super fan like that, including the customer in our data analysis skewed our results. We took the customer out and were able to determine normal customer behavior more accurately.

Pro Tip

Look at how each channel performs all the way through your relationship with customers. Maybe your organic search (SEO) traffic converts better than your paid search (PPC) traffic at first, but after several months or longer you may find that you retain fewer SEO customers and end up with more revenue from PPC.
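To make the comparison concrete, here is a sketch with invented numbers in which one channel wins on signup conversion and the other wins on revenue after six months. The figures, field names and the six-month window are assumptions for illustration only, not benchmarks.

```python
# Made-up channel totals; revenue is measured six months after signup
channels = {
    "organic": {"visitors": 1000, "signups": 50, "revenue_6_months": 1500},
    "paid": {"visitors": 1000, "signups": 30, "revenue_6_months": 2400},
}

def signup_rate(stats):
    """Share of visitors who signed up."""
    return stats["signups"] / stats["visitors"]

def revenue_per_visitor(stats):
    """Revenue after six months divided by the visitors it took to earn it."""
    return stats["revenue_6_months"] / stats["visitors"]

best_by_conversion = max(channels, key=lambda name: signup_rate(channels[name]))
best_by_revenue = max(channels, key=lambda name: revenue_per_visitor(channels[name]))
print(best_by_conversion, best_by_revenue)
```

If the two winners differ, judging channels on early conversion alone would point your budget in the wrong direction.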