Guy Tytunovich is the founder and CEO of CHEQ, a leader in customer acquisition security. 

When we think of operating costs for medium and large businesses, we immediately think of heavy-expense line items like payroll, commercial real estate, vendors and suppliers. Over the past decade, however, another costly line item has been on the rise: data management. A recent study by McKinsey found that "a midsize institution with $5 billion of operating costs spends more than $250 million on data across third-party data sourcing, architecture, governance and consumption." In fact, the annual cost of data consumption alone (report generation, business and marketing intelligence, data analysis, distribution) for a midsize organization can be as high as $90 million.

Why the massive investment? Because inaccurate data costs companies money.

Back in 2016, IBM had already estimated (via Harvard Business Review) that "the yearly cost of poor-quality data, in the U.S. alone" reached an astonishing $3.1 trillion, and in 2018, Gartner, Inc. found that "organizations believe poor data quality to be responsible for an average of $15 million per year in losses." Now, with increased reliance on data, we can only imagine the macro impact that bad data is having. With all this in mind, organizations are looking to invest more in good data management to drive good business decisions. This trend becomes apparent when looking at the rapid growth of the data management market, which is projected to increase by nearly 60% from 2020 to 2025, from $78 billion to $123 billion.

The rise of bots, fake users and invalid traffic is jeopardizing that investment.

It's no secret that a large portion of today's web traffic is driven by crawlers, scrapers, automation tools, fake accounts, proxy users, malicious botnets, hackers, fraudsters and click farms. In 2017, The Atlantic cited an Imperva report claiming that bots accounted for 52% of all internet traffic, and some estimates run even higher.

For a truly data-driven organization, this poses a strategic threat. If a large percentage of all site visits, ad clicks, form fills, sign-ups, chat requests and page engagements are coming from bots and fake users, then the data that BI and marketing intelligence teams are looking at is completely skewed. Just this past Black Friday, we found that over one-third of all online shoppers were fake. Think what that means if you're running analytics for a large e-commerce site and you're not aware that one in every three site visitors isn't real.
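To make that skew concrete, here is a minimal sketch in Python of what one in three fake visitors does to a conversion rate. Every figure here is hypothetical, invented purely for illustration:

```python
# Illustrative only: how undetected bot traffic skews a conversion metric.
# All numbers below are hypothetical.

total_visits = 300_000   # visits as reported by analytics
bot_share = 1 / 3        # "one in every three site visitors isn't real"
purchases = 6_000        # conversions (bots rarely buy)

human_visits = total_visits * (1 - bot_share)

observed_rate = purchases / total_visits   # what the dashboard shows
true_rate = purchases / human_visits       # what humans actually did

print(f"Observed conversion rate: {observed_rate:.2%}")       # 2.00%
print(f"True (human) conversion rate: {true_rate:.2%}")       # 3.00%
```

In this toy example, the site's real conversion rate is half again higher than what the analytics team sees, and every downstream decision inherits that error.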

Sophisticated data-driven organizations are the most exposed.

Perhaps counterintuitively, the more sophisticated an organization's data operation, the more exposed it is to data contamination. Think of it as a body with many limbs, where each limb is susceptible to infection. Now, take your typical midsize or enterprise data operation. We're talking about a robust setup that includes customer data platforms (CDPs) and data management platforms (DMPs), business and marketing intelligence software, customer relationship management (CRM) software containing all of the sales and pipeline data, audience segments for retargeting, lookalike models and much more. Every one of these databases can become contaminated if invalid users and bot traffic are allowed in.

Imagine a CRM filled with 10% to 15% invalid leads. That's countless hours of the sales organization's time wasted chasing bad leads, as well as increased costs for storing all those contacts. Think about audience segments polluted by invalid site visitors. Marketing organizations will end up remarketing to these audiences, wasting money and further polluting their funnels. BI tools are another high-risk area, as the business's entire view of reality as it pertains to web traffic, conversions, revenue and pipeline is thrown off by the presence of undetected invalid traffic.

The estimated cost of bot-skewed data is now almost $700 billion.

Data skewed by bots and fake traffic can cause material harm to an organization's health. A business's ability to forecast revenue, plan budgets and headcount growth, understand trends, optimize marketing and sales performance — all of these crucial functions are at risk when the data is off by 10% to 20% (and sometimes even higher). Our own recent study found that 27% of all organic and direct traffic was invalid and that midsize and enterprise organizations lose $697 billion a year to data skewed by fake traffic. A study by LeadJen found that sales and marketing departments lose 546 hours and over $20,000 per sales rep annually from using bad data, and research by MIT estimated "the cost of bad data to be 15% to 25% of revenue for most companies."

Ultimately, the problem of fake traffic and bots is becoming a strategic threat to the well-being of data-driven organizations, and as a result, many companies are recognizing the need to secure their investment in data, ensuring their operation remains clean. Unusual behavior and movement patterns on-site, sharp spikes in traffic at atypical times and increases in traffic from certain sources or geolocations can all raise red flags for data-driven professionals.
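One of those red flags, the sharp spike at an atypical time, lends itself to a simple automated check. The sketch below uses median absolute deviation, which holds up better than a plain average when the data contains exactly the outliers you're hunting for. It is a crude heuristic with an arbitrary threshold, not a bot detector:

```python
import statistics

def flag_traffic_spikes(hourly_sessions, threshold=5.0):
    """Flag hours whose session count sits far from the median.

    Uses median absolute deviation (MAD), which is robust to the very
    outliers we're looking for. A rough first-pass heuristic only.
    """
    med = statistics.median(hourly_sessions)
    mad = statistics.median(abs(x - med) for x in hourly_sessions)
    if mad == 0:
        return []  # traffic is flat; nothing stands out
    return [
        (hour, count)
        for hour, count in enumerate(hourly_sessions)
        if abs(count - med) / mad > threshold
    ]

# Example: a quiet site suddenly receives 4,800 sessions in one hour.
sessions = [120, 115, 130, 4800, 125, 118, 122, 119]
print(flag_traffic_spikes(sessions))  # [(3, 4800)]
```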

While the best step is to deploy proper security measures, organizations should — at the very least — be on the lookout for suspicious traffic trends, conversion rates that don't add up, CRM leads that seem off and any other activity that could be indicative of data contamination. Some manual work identifying and filtering out potentially harmful traffic from advertising campaigns, website pages and other digital assets is also a good start. Still, some false positives can occur, and bots do their best to appear human-like. Consider incorporating go-to-market cybersecurity, as it is typically the most reliable way to ensure decisions are being made based on accurate data.
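As a hypothetical starting point for that manual filtering, the sketch below strips sessions whose user agent self-identifies as automated. Sophisticated bots spoof browser user agents, so this catches only the honest crawlers — which is exactly why manual filtering is a start rather than a solution:

```python
import re

# Catches self-declared automation only; spoofed browser UAs pass through.
BOT_UA_PATTERN = re.compile(
    r"bot|crawler|spider|scraper|headless|curl|python-requests",
    re.IGNORECASE,
)

def split_sessions(sessions):
    """Partition analytics rows into (likely_human, likely_bot) lists."""
    humans, bots = [], []
    for s in sessions:
        target = bots if BOT_UA_PATTERN.search(s.get("user_agent", "")) else humans
        target.append(s)
    return humans, bots

sessions = [
    {"user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
    {"user_agent": "python-requests/2.31"},
    {"user_agent": "Mozilla/5.0 (compatible; ExampleBot/1.0)"},
]
humans, bots = split_sessions(sessions)
print(len(humans), len(bots))  # 1 2
```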

