How to Identify Bad Data When Conducting Market Research

Market research is a critical tool for businesses to gain insights into the competitive landscape and make informed strategic decisions. However, the quality of data collected is crucial for overall success. Bad data can lead to erroneous conclusions, wasted resources, and potentially disastrous business outcomes.

Unfortunately, bad data is ubiquitous on the internet, and it’s very easy for unqualified content creators to publish faulty business information that looks attractive and persuasive. True expertise can be difficult to gauge online, and flawed data can be repeated, amplified, spun, and skewed until it is embedded in the psyche and becomes a piece of accepted industry folklore.  

Maintaining a healthy dose of skepticism is a good first step, but differentiating between good and bad data can be challenging and time-consuming, particularly for those with limited experience. It requires critical thinking and evaluation skills, as well as the ability to identify credible sources and validate information.

In this article, we turn to long-time Industry Analyst Gleb Mytko from MarketResearch.com’s partner The Freedonia Group for specific advice on navigating the thorny challenge of data quality. Based on his 10+ years of experience producing authoritative market research on everything from motorcycles to mining equipment, Mytko explains where bad data comes from and what red flags to watch out for during the research process. 

What Is Bad Data? 

Data is vulnerable to human error, technical glitches, bias, and intentional manipulation. Even reliable sources can have significant issues if you take the information at face value without understanding how the projections were estimated. While a variety of issues are at play, the most common hallmarks of bad data, according to Mytko, include: 

  • Data developed using flawed information or improper assumptions 
  • Data based on a faulty methodology 
  • Data compromised by human error 
  • Data that seems reasonable from one perspective but does not line up with what is known about related fields 
  • Data that isn’t consistent over time 
  • Data that contradicts reliable sources without explaining why 
  • Data that is unclear about its scope and can be easily misinterpreted 

Data found on the internet can be faulty, and so can research found in large, hefty reports produced by low-quality research firms. If you are relying on sources that value quick answers over accuracy, you may have to wade through segmentation errors, insufficient analysis, out-of-date assumptions, and information that is out of context.

Even worse, if you are using generative AI models such as ChatGPT, the information you receive may sound logical and look compelling but be completely fabricated and have no basis in reality. AI developers have dubbed this issue “AI hallucination.” As has been widely reported, ChatGPT has a tendency to invent phony anonymous sources and make up direct quotations, names, and dates, so it’s not exactly a fact checker’s dream.    

Whatever the source of information, always pause to consider if the data makes sense. “If the data is not consistent with historical trends and doesn’t line up with what we know about related fields, it is likely to have issues,” Mytko states.  

What Key Factors Contribute to Bad Data? 

When it comes to producing accurate market research, longevity in the field and experience matter. Analysts who specialize in one industry for a long time have the historical perspective to put current developments in context and better predict where the industry is headed next. In contrast, analysts with limited experience are more likely to overlook something or make mistakes, and they may lack the necessary knowledge to work with the data.  

When knowledge, training, and experience are lacking, the quality of research may be hindered by several different stumbling blocks: 

  • Taking too narrow a perspective and not accounting for all relevant factors 
  • Failing to understand what drives the trends in the data and overlooking historical patterns and developments 
  • Lacking an understanding of the scope of the data 
  • Not having a comprehensive and multidimensional review process  
  • Producing technical errors and other oversights
  • Overlooking or missing a key data point or source that contradicts your data 
  • Not updating or improving the data series over long periods of time  

If reliable and actionable research is a priority, analysts should not work in a bubble drawing their own conclusions. Instead, take a team approach to quality control and have layers of review to ensure everything makes sense and is consistent. Multiple sets of eyes should be in place to catch technical errors and cross-check findings. 

“A comprehensive and multidimensional review process is essential for developing high-quality data, as is taking a long-term perspective and consulting a wide range of sources,” Mytko advises.  

Research firms such as The Freedonia Group use a team of editors, economists, and managers all working together to produce high-quality market research reports. In addition, analysts specialize in specific industry verticals so they become familiar with the landscape and how it changes during various business cycles. These practices help ensure quality research. 

What Are Some Examples of Bad Data? 

Bad data is often sneaky and can take many forms. As the examples below illustrate, it is important to carefully consider the sources and scope of data to ensure that it is accurate and properly applied.

Overhyped Predictions About New Technologies 

It’s all too easy to forget that we live in a world that’s awash in “click-bait” headlines designed to capture attention. Sensationalist predictions often accompany new technologies, such as electric vehicles or automated equipment. 

“Data issues are common in new fields that are developing rapidly because there is often a lack of reputable data sources and consensus,” Mytko explains. “Instead, you frequently encounter sources with eye-catching headlines that offer little published data to back up their conclusions and don’t explain their methodology.” 

For example, a source may assert that “50% of all buses sold in the U.S. will be electric” by a certain date, but what they are really talking about is transit buses, which are only one segment of the market. The source may not consider how feasible this figure is, or whether enough electric buses will even be made available. Most likely, the author never researched which companies offer electric buses in the U.S. or how many models exist.
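To see how much a scope mix-up like this can distort a headline number, it helps to work through the arithmetic. The sketch below is only an illustration: the unit counts are hypothetical placeholders invented for the example, not actual U.S. bus sales, and the helper name `share_of_total` is an assumption rather than anything from the sources discussed here.

```python
def share_of_total(segment_units, segment_share, total_units):
    """Convert a share of one segment into a share of the whole market."""
    return segment_units * segment_share / total_units


# Hypothetical unit counts, invented purely for illustration.
transit_buses_sold = 6_000   # assumed size of the transit bus segment
all_buses_sold = 40_000      # assumed size of the total bus market
claimed_share = 0.50         # the "50% will be electric" headline figure

# If the 50% claim really applies only to transit buses, this is the
# electric share of ALL buses that it actually implies.
implied_share = share_of_total(transit_buses_sold, claimed_share, all_buses_sold)
print(f"Implied electric share of all buses: {implied_share:.1%}")
```

Under these made-up figures, a claim that sounds like half the market shrinks to a single-digit share once the scope is stated correctly, which is why it pays to check what a percentage is a percentage of.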

Generalizations That Overlook Regional Differences

Understanding technological developments in other regions of the world can be complex as well. For example, Mytko traveled to India and saw the challenges facing the electric grid firsthand. “Then I read an article that says tons of farmers will use electric and hybrid cars in the country in the next five years,” he says. These types of unrealistic predictions may be based on government announcements or marketing hype.

Inconsistent Categorization

Media publicity about disruptive new technologies may be overblown, but information about other major industries can also be misconstrued even within reliable sources. Although U.S. government data is often considered the gold standard, it too can be wrong and give a false impression of reality. For example, if you aren’t aware that the products assigned to specific NAICS codes by the US International Trade Commission have changed, you might have a skewed view of import trends in a specific category from year to year.

Data with Scope Issues

Along these lines, keep in mind that good data can be “bad data” if you do not have a clear understanding of its scope. You need to know what is included in the data. For example, does the data focus on certain product types, specific market segments, pricing levels, or geographic regions? If this information isn’t clear, the data can easily be misinterpreted or improperly used. To get a proper apples-to-apples comparison, a researcher must always be sure that the data they are looking at matches the scope of what they are thinking about. 

How Can You Identify Bad Data? 

Even if you are not an expert in the field, or you are studying an unfamiliar market, keep these considerations in mind to help identify bad data: 

  • Common sense: if something doesn’t seem right, it probably isn’t.  
  • Does this data line up with what we know about historical trends? 
  • Is the data in line with what we know about related fields?  
  • Is the data incomplete? Are there issues with the methodology? Is the scope clear? 
  • Is the data actionable? Does it use standard units that are possible to cross-check with other sources? Does it provide sufficient information? 
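For readers who keep their figures in a spreadsheet or script, the historical-trend question above can be roughed out in code. This is a minimal sketch and not a method attributed to Mytko or The Freedonia Group: the function name `flag_inconsistent_years`, the 25% tolerance, and the sample figures are all hypothetical assumptions chosen for illustration.

```python
def flag_inconsistent_years(series, max_change=0.25):
    """Return (year, change) pairs whose year-over-year change exceeds max_change.

    series: dict mapping year -> value (hypothetical market-size figures).
    max_change: assumed tolerance for a plausible annual swing (0.25 = 25%).
    """
    flagged = []
    years = sorted(series)
    for prev_year, curr_year in zip(years, years[1:]):
        prev_value = series[prev_year]
        if prev_value == 0:
            continue  # growth rate is undefined against a zero base
        change = (series[curr_year] - prev_value) / prev_value
        if abs(change) > max_change:
            flagged.append((curr_year, round(change, 3)))
    return flagged


# Hypothetical figures, invented purely for illustration.
sample = {2018: 100.0, 2019: 104.0, 2020: 98.0, 2021: 160.0, 2022: 165.0}
suspicious = flag_inconsistent_years(sample)
print(suspicious)
```

A flagged year is not proof of bad data. It is a prompt to look for an explanation, such as a change in scope, a reclassification, or a genuine market shock, before relying on the number.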

By watching out for inaccurate, misleading, or incomplete data, businesses can avoid pitfalls and make better informed decisions. Relying on multiple sources, employing well-trained experienced analysts, developing a rigorous review process, and partnering with reputable market research firms that follow these same practices can also go a long way in ensuring high-quality market data.  


About the author: Sarah Schmidt is a Managing Editor at MarketResearch.com, a leading provider of global market intelligence products and services.
