
CRO-ology: What is data sampling?

Lucky Orange isn’t into data sampling, but other products on the market are. Does data sampling really matter? We’ll break it down to show why it does.


Data sampling means collecting or reporting on a subset of your data rather than the whole set, delivering it in smaller, bite-sized nuggets of information. Kind of like those delicious samples that get passed out at your local warehouse club! For websites that use sampled data, a percentage of overall traffic is collected and/or reported rather than the full 100 percent. Sampling is also commonly used in statistics, research and politics.


For example, say your website has 50,000 daily sessions to analyze. Instead of capturing and displaying 50,000 recordings, a system using data sampling would show you a smaller percentage of the total, such as 5,000 of the 50,000 actual sessions, for a sampling rate of 10 percent.
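To make the arithmetic concrete, here is a minimal Python sketch of what a 10 percent sample of 50,000 sessions looks like. The function name `sample_sessions` and the use of simple random sampling are assumptions for illustration only; real products may pick their samples differently.

```python
import random

def sample_sessions(session_ids, rate, seed=0):
    """Return a simple random sample holding `rate` of the sessions.

    Hypothetical helper: assumes the tool samples uniformly at random.
    """
    rng = random.Random(seed)
    k = round(len(session_ids) * rate)  # number of sessions to keep
    return rng.sample(session_ids, k)

daily_sessions = list(range(50_000))   # stand-in IDs for one day of traffic
recorded = sample_sessions(daily_sessions, rate=0.10)

print(len(recorded))                        # 5000
print(len(recorded) / len(daily_sessions))  # 0.1
```

The other 45,000 sessions never make it into a recording, heatmap or report.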

For conversion optimization products in particular, data sampling impacts all levels of reporting, including recordings, form analytics, heatmaps, conversion funnels and emailed reports. In addition, options such as chat, polls and surveys would not be available to visitors who were not included in the sample.

Let’s say you wanted to email a discount to every customer who didn’t convert and instead abandoned their cart. A system that uses a sampling methodology would only be able to send that discount to the sample of customers it was tracking. This means that not every customer who abandoned their cart would get the discount, only the small number who were being tracked.
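A rough Python sketch of the abandoned-cart scenario follows. The figures (50,000 visitors, 2,000 abandoned carts, a 10 percent sample) and the helper `reachable_abandoners` are hypothetical, and the sketch assumes the tracked sample is chosen at random.

```python
import random

def reachable_abandoners(abandoner_ids, tracked_ids):
    """Abandoners the tool could email: only those whose session was tracked."""
    tracked = set(tracked_ids)
    return [v for v in abandoner_ids if v in tracked]

rng = random.Random(42)
visitors = list(range(50_000))
abandoners = rng.sample(visitors, 2_000)  # hypothetical: 2,000 abandoned carts
tracked = rng.sample(visitors, 5_000)     # hypothetical: 10 percent sample

reached = reachable_abandoners(abandoners, tracked)
print(f"{len(reached)} of {len(abandoners)} abandoners can be emailed")
```

Under these assumptions, only about one in ten of the customers who abandoned a cart would ever see the discount.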

Important note: Lucky Orange does not sample data. But we do like warehouse club samples.

Benefits to Sampling

  • Faster reporting
  • Lower price point

Drawbacks to Sampling

  • Decreased accuracy. You’re making assumptions that affect all of your customers based on the actions of a few. Do you trust that the 10 percent of customers being sampled reflect all of your customers?
  • No control over who gets recorded. Data sampling is typically random by nature. You usually can’t control who does or doesn’t get recorded, meaning your results can be skewed to overweight or completely miss an important customer segment.
  • Missed crucial issues. Uh-oh! Users are having problems on your site that you can’t identify because they weren’t recorded.
  • Lost opportunities to improve customer support. If you’re not recording all of your user sessions, you’ll never know about the users who left your site without converting while their sessions went unrecorded. You may have missed out on a great opportunity to save that sale by engaging that customer through live chat or an emailed customer support query.
  • Lowered confidence in poll and survey data. Let’s say you’re running a poll on what inventory to offer next season. Your sampled survey respondent base of 10 percent tells you they really want vegan-friendly adult unicorn onesies. Are you willing to bet that vegan-friendly unicorn onesies will be a hot item based on the opinions of a sample of your website visitors?
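The risk of missing a customer segment entirely can be estimated with a quick back-of-the-envelope calculation. This Python sketch assumes each session is kept independently with the same probability (the sampling rate), which is a simplification and not a description of how any particular product works. The function name `chance_segment_missed` is made up for illustration.

```python
def chance_segment_missed(segment_sessions, rate):
    """Probability that no session from a segment lands in the sample.

    Assumes each session is kept independently with probability `rate`.
    """
    return (1 - rate) ** segment_sessions

# Hypothetical segment sizes, sampled at 10 percent
for n in (5, 20, 50):
    print(n, f"{chance_segment_missed(n, 0.10):.1%}")
# 5 59.0%
# 20 12.2%
# 50 0.5%
```

Under these assumptions, a small but valuable segment with only 20 sessions a day would be absent from the sample roughly one day in eight.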

As Ruben Ugarte with SuperMetrics wrote in an article:

“If you can’t trust your data, you won’t take it seriously and you might as well not have it. Sampling can create a distrust of the data and make it irrelevant inside your company.”

Does Lucky Orange Sample Data?

Lucky Orange does not sample data. Out of the box, you get a product that is “always on” and always gathering data for all visitors and interactions. This means that 100 percent of visitors to your website will be recorded.

In today’s data-driven world, accuracy matters, and if you’re sampling data, you can’t confirm that your data are accurate. Unlike the warehouse clubs and other CRO suite platforms, we give you the whole product.

As we learned from the botched sampling during the 2016 U.S. presidential election polls, the more people who are polled (or in our case, the more traffic data collected), the fewer errors or discrepancies we tend to encounter.

While we aren’t in the business of politics or political polling, we are in the business of providing you with accurate data about your visitors. Our main objective is to provide you, our users, with the best insight to make informed, data-backed decisions for your websites.

Lucky Orange does this by providing 100 percent of your data all the time, every time.

All of your data are captured, which means dynamic heatmaps can be fully analyzed and filtered for more insight, and recordings can be used for more than just analytics.

For example: One of our users was able to use recordings to fight a fraud claim with her bank. Another used a recording to show a customer that she had input her address incorrectly while checking out, resolving a claim that the product was never shipped to the customer.

Had either user been using a product that sampled data, these recordings might never have been captured.

How Much Sampling is Okay?

For those still interested in data sampling, what percentage should be considered “accurate”? Will 25 percent give you a large enough data set? Or is 40 percent better?

Sayf Sharif with Lunametrics argues that even 50 percent sampling may not be enough to really get an accurate look at data.

“I once described the Google Analytics sample rate as akin to my personal confidence in the data. At 90 percent sample, I’m 90 percent confident that the data are close to correct. At 50 percent, I’m 50 percent confident. At 1 percent sample, I’m 1 percent confident.”

He added, “I don’t generally base opinions on things when I’m as close to a coin flip in regards to my confidence.”

Accurately understanding all of your visitors should mean more than just a coin flip. Your conversions and future customers are counting on it.
