“Let’s see what the numbers say …”
“The data will tell us …”
We hear these phrases all the time, usually tacked on at the end of an argument. In the messy world of working with people, it’s tempting to treat data as infallible compared to opinions and feelings. Herein lies the problem: We think of data as unbiased, but how unbiased it is depends on what data people choose to collect and how they analyze, interpret, highlight, and share the results.
In the workplace, data can guide the investment of resources, impacting individual employees, teams, and the organization as a whole. It’s embedded in almost all of the tools we use for assessment, from AI algorithms to employee engagement surveys and product popularity polls. With so much data at our fingertips, it’s important to think critically about the objectivity of data, something often left unquestioned.
Data carries inherent bias because people, often those in power, decide which data to include, how to interpret it, who’s involved, and how to share the results. Once you understand that, what can you do? Here are a few tips to help uncover the real story behind the data.
Look beyond the spreadsheet
When we hear “data” and “data visualization,” we may automatically think of spreadsheets and graphs. However, this narrow view hasn’t always been the norm, and it ignores the rich history of indigenous data visualization. New research highlights the power dynamics behind data through a historical review of data collection, specifically record keeping and data visualization among indigenous populations.
The researchers found evidence of data visualization dating back to the 1500s, when qualitative and quantitative data, stories, and numbers were incorporated into textiles, government treaties, and ceremonies. For example, in the 1600s, Dutch settlers and the Five Nations of the Haudenosaunee recorded a treaty with a beaded document. Two rows of purple beads represented two rivers and the two separate ways of life between the groups, living harmoniously and without interfering with each other. The treaty, along with the beauty and genius of indigenous storytelling, was later disregarded by Dutch and other European settlers, and a different prevailing narrative took shape. As colonizers took power, the narrative shifted, and the Five Nations people lost both power and voice: They were intentionally removed from the practice of sharing data and shaping the narrative.
So when you’re looking at cold, hard numbers in a spreadsheet, remember they’re not the only way — or perhaps even the most accurate way — of representing the truth. Is there another way the story could have been told?
Examine inclusiveness
Start by looking for information about who’s included in the sample, and who isn’t. A report might read, “500 full-time workers in the industry,” but gaining a better understanding of the demographic characteristics of those 500 full-time workers is essential. For example, the U.S. Census data used to calculate the gender pay gap considers only full-time workers. While this may accurately reflect the disparity in earnings between men and women who work full-time, it fails to capture the complete picture.
According to the Women’s Bureau of the U.S. Department of Labor, about 65% of working women are employed full-time year-round, compared to 75% of working men, so the sample is skewed from the start. The decision to exclude the salaries of part-time workers further biases the picture and doesn’t capture the experiences of different groups. Think about how demographic characteristics can intersect to make working conditions extremely difficult for specific groups, like working Black mothers or working mothers who have recently immigrated to the U.S. Exclusions, intentional or not, shape the picture we see, and the picture is often far from comprehensive. We’re left with a limited understanding of what we’re examining.
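For readers who like to see the arithmetic, here is a rough sketch of how a full-time-only sample can change the story. Only the full-time shares (65% of working women, 75% of working men) come from the figures above. Every earnings number, and the helper names excluded_share and earnings_ratio, are invented purely for illustration and are not real statistics.

```python
# Illustration only: the earnings figures below are INVENTED, not real statistics.
# The full-time shares (0.65 and 0.75) are the only values taken from the article.

def excluded_share(full_time_share):
    """Fraction of a group's workers left out of a full-time-only sample."""
    return 1.0 - full_time_share


def earnings_ratio(women_segments, men_segments):
    """Women's mean earnings divided by men's, for lists of (share, mean_pay) segments."""
    def weighted_mean(segments):
        total_share = sum(share for share, _ in segments)
        return sum(share * pay for share, pay in segments) / total_share

    return weighted_mean(women_segments) / weighted_mean(men_segments)


# (share of the group's workers, hypothetical mean annual earnings)
women = [(0.65, 50_000), (0.35, 20_000)]  # full-time first, then everyone else
men = [(0.75, 60_000), (0.25, 22_000)]

full_time_only = earnings_ratio(women[:1], men[:1])  # what the filtered sample shows
all_workers = earnings_ratio(women, men)             # what including everyone shows

print(f"Women left out of the sample: {excluded_share(0.65):.0%}")
print(f"Men left out of the sample:   {excluded_share(0.75):.0%}")
print(f"Earnings ratio, full-time only: {full_time_only:.2f}")
print(f"Earnings ratio, all workers:    {all_workers:.2f}")
```

With these made-up numbers, the gap looks smaller when part-time workers are left out. Real data could behave differently; the point is only that the choice of who counts changes the result.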
Identify process biases
As consumers of data, it’s important for us to understand how data is collected in order to evaluate its objectivity. For example, in 1940 the U.S. Department of Agriculture (USDA) set out to publish a pamphlet of dress and clothing patterns for women. At the time, patterns were unreliable because there weren’t standardized measurements available for women, a problem that cost manufacturers $10 million annually. Ruth O’Brien, the author of the pamphlet, and a team of researchers collected measurements from 15,000 women across the U.S.
O’Brien purposefully excluded 5,000 data points from the dataset because they were measurements from women of color. Some researchers had encouraged the inclusion of non-Caucasian women in the sample, and O’Brien agreed to it to “keep peace” in the group. But she made sure those measurements were flagged so they could easily be omitted, meaning the patterns were based solely on data collected from white women, as though they were the true representation of women in the U.S.
What’s the lesson here? Data collection processes matter, and someone is making purposeful decisions. Part of why we love to rely on data is the totality that’s promised: We’ve comprehensively gathered representative data in a way that allows us to capture a fact that can be generalized broadly. But when evaluating data, it’s important to ensure the data collection process has been ethical and doesn’t perpetuate biases.
Understand the impact of visualizations
Understanding what happens in the brain when looking at data visualizations is essential. Recent neuroscience research demonstrates we’re better at remembering symbols than words. When presented with the graphic symbol “$” and the word “dollar,” people were more likely to recall the symbol than the word. With this knowledge, consider how data visualization can impact perceptions.
When data is easily understood, it democratizes information and helps more people join the conversation. It enables researchers to build on each other’s work, policymakers to craft evidence-based solutions, and citizens to hold their governments and institutions accountable. Researcher Alberto Cairo and journalist Scott Klein developed a font called Wee People while they worked for ProPublica. Cairo sketched silhouettes, as an alternative to commonly used dots, to represent people in their data visualizations. Over the years, they’ve expanded the font to include silhouettes of children and people with mobility aids, broadening who gets to see themselves in data visualizations and quickly understand the message being communicated. When you find yourself looking at wordy and confusing data, it’s not you; it’s just not a brain-friendly way to show results. It’s important to pause and reflect on who else may be confused by the story and why it matters.
Critically examining these issues is an essential step in unlocking the full potential of data for organizations, researchers, and society as a whole. As consumers of data, it’s crucial to adopt a mindset that extends beyond the numbers themselves, to pause, and to ask ourselves what part of the story is missing and what part is being amplified. By doing so, we can unearth hidden biases, uncover marginalized voices, and build an understanding of an issue that is comprehensive and just.