The Data You Do — and Don’t — Need for Your First Pay Equity Analysis

February 23, 2023 | 8 min read

When it comes to pay equity, one of the most common myths that makes employers hesitant to start is that they need complete data first. But the truth is, you don’t need completely clean data to begin your first pay equity analysis (after all, nobody has perfect data).

In our experience working with hundreds of customers to build, refine and update their pay equity work, one major takeaway is that building a pay equity program is an iterative process. Employers should update their pay equity analysis each round based on lessons they learned from the previous one. It is true that the less data you have, the more work you’ll need to do — but that doesn’t mean you shouldn’t start! In fact, you’ll learn so much from your first go-around that you’ll have specific insights into how you can improve the next one.

In this blog post we’ll discuss what data you need — and what you don’t need — in order to get started on your pay equity journey. 


The pay equity analysis data you actually need (if you have the following, we’re good to start!)

When it comes to conducting a pay equity analysis, there are certain foundational pieces of information you definitely need in order to make sure the process is successful, including:

1. Sufficient coverage of identity variables: First and foremost, you will need sufficient coverage for the variables of interest, such as gender, race/ethnicity, national origin, disability, sexual orientation, etc. That means you should have enough individuals in each group to distinguish significant differences in outcomes (e.g., salary) from random noise. So while you may not have a complete dataset, so long as you have enough employees on each side of the variable of interest (the most common being gender) in most of your employee groups to run statistical tests, you have enough to begin!

In the U.S., organizations usually have great data on gender and race/ethnicity due to government reporting requirements, and the UK is catching up on collecting ethnicity data. In Europe, coverage for gender tends to be good while data on race, ethnicity, and national origin are less commonly collected. Other variables may yield even spottier coverage, but we have seen organizations conduct pay equity analyses for veterans, employees with disabilities, and members of the LGBTQIA+ community. It’s important that you only analyze groups that have enough relevant data for an accurate statistical analysis, something our PayEQ™ software accounts for automatically.
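As a rough illustration of what “sufficient coverage” can mean in practice, here is a minimal Python sketch that counts employees per category within each job group and marks which groups have enough people on at least two sides of a variable. The record layout (`job_group`, `gender`) and the threshold of 5 are assumptions made for this example only; the right minimum depends on the statistical test you run, and this is not how any particular software makes the call.

```python
from collections import Counter, defaultdict

# Hypothetical threshold chosen for illustration; not a statistical rule.
MIN_PER_CATEGORY = 5


def coverage_by_job_group(employees, variable, min_per_category=MIN_PER_CATEGORY):
    """Return {job_group: bool} saying whether the group looks testable.

    A job group passes when at least two categories of `variable`
    (for example "F" and "M") each contain `min_per_category` or more
    employees with a recorded value. Missing values (None) are ignored.
    """
    counts = defaultdict(Counter)
    for emp in employees:
        group_counts = counts[emp["job_group"]]  # registers the group even if the value is missing
        value = emp.get(variable)
        if value is not None:
            group_counts[value] += 1
    return {
        group: sum(1 for n in counter.values() if n >= min_per_category) >= 2
        for group, counter in counts.items()
    }


# Made-up sample data: Engineering has 6 women and 8 men, Legal has 2 women and 7 men.
employees = (
    [{"id": i, "job_group": "Engineering", "gender": "F"} for i in range(6)]
    + [{"id": 100 + i, "job_group": "Engineering", "gender": "M"} for i in range(8)]
    + [{"id": 200 + i, "job_group": "Legal", "gender": "F"} for i in range(2)]
    + [{"id": 300 + i, "job_group": "Legal", "gender": "M"} for i in range(7)]
)

coverage = coverage_by_job_group(employees, "gender")
print(coverage)
```

With this sample, Engineering passes and Legal does not, which matches the idea that you only need coverage in most of your employee groups to begin.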

2. Employee identifier: Additionally, you need an employee identifier so that everyone can be identified uniquely throughout the process. This one is typically a given for organizations with a modern HRIS, as databases require some form of unique identifier. The ability to bring any actions back to your business and clearly identify who you’re talking about is critical, so if you don’t have an employee ID available, you likely have some other bridges to cross before you begin your analysis.

3. Compensation data: Every organization pays its people, so this isn’t a tricky one, right? You need pay data in a form that can be translated into meaningful “apples-to-apples” rates. Possible compensation types include pay rates, full-time equivalent salary, on-target earnings, or total discretionary compensation. While it can be helpful to analyze multiple compensation types, particularly when it comes time for remediation, to start you just need at least one. Occasionally companies have somewhat messy data when it comes to translating part-time employees to their full-time equivalents. Usually a pay equity analysis will highlight these issues, surfacing employees as underpaid who are in fact unadjusted part-time workers, so you may be able to start and use the analysis to help clean up your full-time equivalency data.
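To make the “apples-to-apples” idea concrete, the sketch below converts pay to a full-time equivalent before comparing it. The field names (`salary`, `fte`) and the 2,080-hour work year (40 hours times 52 weeks) are assumptions for illustration; your own conventions for hours and FTE fractions may differ.

```python
# Assumption for this example: a full-time year is 40 hours x 52 weeks.
HOURS_PER_YEAR = 2080


def annual_from_hourly(hourly_rate, hours_per_year=HOURS_PER_YEAR):
    """Convert an hourly pay rate to an annual full-time amount."""
    return hourly_rate * hours_per_year


def fte_salary(actual_salary, fte_fraction):
    """Scale actual salary up to what it would be at 1.0 FTE."""
    if not 0 < fte_fraction <= 1:
        raise ValueError("fte_fraction must be greater than 0 and at most 1")
    return actual_salary / fte_fraction


# Made-up records: E3 works half time and looks underpaid until adjusted.
employees = [
    {"id": "E1", "salary": 88000, "fte": 1.0},
    {"id": "E2", "salary": 92000, "fte": 1.0},
    {"id": "E3", "salary": 45000, "fte": 0.5},
]

adjusted = {e["id"]: fte_salary(e["salary"], e["fte"]) for e in employees}
print(adjusted)
```

Before adjustment, E3 appears to earn about half of what peers earn. After adjustment, E3 sits between E1 and E2, which is the kind of false “underpaid” flag the analysis tends to surface when FTE data is missing.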


Data that’s nice to have

Having the following data leads to more robust results, but it is not absolutely necessary for conducting a baseline pay equity analysis.

1. Groupings of comparable employees: A job architecture you feel confident in makes it easier to group employees together who perform similar work. That said, as long as you have some information about employees’ jobs (this could be as simple as job titles), you can start the process. Even limited data here can get you to meaningful results faster than you might expect: the “80/20” rule often applies, where grouping the most common 20% of your job titles allows you to create meaningful groups for 80% of your employees.

Additionally, if the fear of not having the “perfect” job architecture is holding you back, know that the pay equity process actually provides a great opportunity for improvement! You can use insights from your baseline pay equity analysis to inform the design of a better job architecture. Check out our advice for how to build your job architecture and ensure pay equity all at once.
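If you want to check how far the “80/20” rule holds for your own job titles, a quick count is enough. The sketch below is one way to do it, using made-up titles constructed so that the top 20% of titles cover 80% of employees; real data will land somewhere else, and the function name and parameters are illustrative assumptions.

```python
from collections import Counter


def coverage_of_top_titles(job_titles, top_percent=20):
    """Return (top_titles, share_of_employees_covered).

    `top_titles` are the most common `top_percent` percent of distinct
    titles (rounded up, at least one title); the share is the fraction
    of all employees who hold one of those titles.
    """
    counts = Counter(job_titles)
    # Ceiling division in integers avoids floating point rounding surprises.
    n_top = max(1, (len(counts) * top_percent + 99) // 100)
    top = counts.most_common(n_top)
    covered = sum(count for _, count in top)
    return [title for title, _ in top], covered / len(job_titles)


# Made-up workforce: 10 distinct titles, 100 employees.
titles = ["Software Engineer"] * 50 + ["Account Executive"] * 30
for i, n in enumerate([3, 3, 3, 3, 2, 2, 2, 2]):
    titles += [f"Specialist Role {i + 1}"] * n

top_titles, share = coverage_of_top_titles(titles)
print(top_titles, share)
```

Running the same count on your real title list shows where to spend grouping effort first, and how many employees remain in the long tail of rare titles.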

2. Good internal control data: Internal control variables include information about geographic differentials, management responsibilities, tenure with the company, and other neutral, job-related factors. The more reliable data you have here, the better. That said, many pay equity analyses begin without applying controls — and even the basic analysis of comparing mean pay between communities within a job group can give you a good place to start digging deeper and identifying legitimate reasons why pay differs, guiding the work that needs to be done to refine your program moving forward.

Eventually you will want better data here for a more robust pay equity analysis, but conducting an analysis with limited data is still a good place to start. Remember, incomplete data is not necessarily a deal breaker! As long as the data are largely complete, it is a legitimate approach to “impute” data, meaning you set the variable to have a neutral or average impact on employees who are missing data. For example, companies frequently have performance rating data for tenured employees but not for new hires. Imputing the average performance rating value for newer hires allows you to leverage the information you do have (employees with histories of low performance ratings earn less than those with histories of high performance ratings) without that information impacting the predictions for recent hires.
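Two of the ideas in this item, the uncontrolled comparison of mean pay within a job group and average-value imputation for a partly missing control, can be sketched in a few lines of Python. The record fields (`job_group`, `gender`, `salary`, `rating`) and the numbers are invented for illustration, and a real analysis would go on to add controls and statistical tests.

```python
from collections import defaultdict
from statistics import mean


def raw_mean_gap(employees, job_group, variable, group_a, group_b):
    """Uncontrolled difference in mean pay (group_a minus group_b) in one job group."""
    pay = defaultdict(list)
    for e in employees:
        if e["job_group"] == job_group:
            pay[e[variable]].append(e["salary"])
    return mean(pay[group_a]) - mean(pay[group_b])


def impute_mean(employees, field):
    """Return copies of the records with missing `field` set to the observed mean."""
    fill = mean(e[field] for e in employees if e[field] is not None)
    return [dict(e, **{field: fill if e[field] is None else e[field]}) for e in employees]


# Made-up records: E4 is a recent hire with no performance rating yet.
employees = [
    {"id": "E1", "job_group": "Analyst", "gender": "F", "salary": 70000, "rating": 4},
    {"id": "E2", "job_group": "Analyst", "gender": "F", "salary": 74000, "rating": 3},
    {"id": "E3", "job_group": "Analyst", "gender": "M", "salary": 76000, "rating": 2},
    {"id": "E4", "job_group": "Analyst", "gender": "M", "salary": 78000, "rating": None},
]

gap = raw_mean_gap(employees, "Analyst", "gender", "F", "M")
filled = impute_mean(employees, "rating")
print(gap, filled[3]["rating"])
```

The raw gap is only a starting point for digging into legitimate reasons why pay differs. The imputed rating gives the new hire the average value, so the rating variable neither raises nor lowers that person’s expected pay relative to a typical employee.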


“Wish we had” data that you don’t need to start

The following data can enrich your pay equity analysis, but they are not necessary and can come with challenges of their own.

1. Performance ratings: While performance ratings may seem like useful information for understanding wage disparities between groups, they are often based on subjective criteria. Using a biased control can unintentionally bake bias into your pay equity analysis.

Performance ratings can only explain a gender- or race/ethnicity-based pay discrepancy if ratings are correlated with both pay and group membership. In other words, you run the risk that your performance process might be as biased as your compensation process.

There are good reasons to include performance ratings in your analysis (e.g. excluding employees with very low performance ratings from salary adjustments), but we typically approach performance ratings with care. Lacking this data is certainly not a reason to put off your analysis!

2. Good external control data: There are a bunch of external variables that organizations would love to have for their workforce but don’t: years of relevant experience from other employers, education, certifications, specific “hot skills”, and more.

In the process of your first pay equity analysis, you will identify which groups really need these data, and then you can capture them on a limited basis. For example, I worked with a company on an analysis of their engineering organization. After reviewing their preliminary results, they identified a handful of select skill sets for which they were paying a premium (e.g., cloud computing and data engineering, among others). Identifying roles with these skill requirements within this job group was a manageable task.

The key here is to gather these data neutrally rather than only applying this differential based on who is paid more than expected. If you identify an employee paid more than expected, and realize it is because they have a particularly valuable skill set, make sure to identify all workers with that skill set in your dataset rather than just highlighting the few individuals who flagged. If men negotiate for skills differentials more often than women with the same skills (as research suggests), you may disproportionately identify the men with the hot skill in question. This is another way you could bake bias into your pay equity analysis.
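The difference between gathering skill data neutrally and gathering it only from flagged outliers is easy to see in a toy example. In the sketch below, the fields `skills` and `paid_above_expected` and the four records are hypothetical; the second function is shown only as the pattern to avoid.

```python
def flag_skill_neutrally(employees, skill):
    """Flag every employee who has `skill`, regardless of how they are paid."""
    return [dict(e, has_skill=skill in e["skills"]) for e in employees]


def flag_skill_from_outliers(employees, skill):
    """Anti-pattern: only record the skill for employees already paid above expected."""
    return [
        dict(e, has_skill=e["paid_above_expected"] and skill in e["skills"])
        for e in employees
    ]


# Made-up engineers: all four have the skill, but only one was flagged as highly paid.
engineers = [
    {"id": "E1", "gender": "M", "skills": {"cloud"}, "paid_above_expected": True},
    {"id": "E2", "gender": "M", "skills": {"cloud"}, "paid_above_expected": False},
    {"id": "E3", "gender": "F", "skills": {"cloud"}, "paid_above_expected": False},
    {"id": "E4", "gender": "F", "skills": {"cloud"}, "paid_above_expected": False},
]

neutral = flag_skill_neutrally(engineers, "cloud")
biased = flag_skill_from_outliers(engineers, "cloud")
print(sum(e["has_skill"] for e in neutral), sum(e["has_skill"] for e in biased))
```

The neutral pass records the skill for all four engineers, so the model can ask whether everyone with the skill is paid a premium. The outlier-driven pass records it for one man only, which would make his higher pay look explained while hiding that three colleagues with the same skill are not receiving the premium.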


What will you get out of your first pay equity analysis?

Pay equity should be an ongoing, iterative process that builds upon itself. You should use the process learnings from each analysis to make subsequent analyses more efficient and robust. As you conduct your first pay equity analysis, you’ll learn:

  • How the analysis is done. You’ll become familiar with the analysis process. Some of the key questions that confront you along the way may have good answers at the start, while others will require further investigation in order to get a complete picture. 
  • What the key outputs are. Pay equity work extends beyond the statistical analysis: a mature pay equity program is ongoing and embedded into every pay decision. Take the learnings from your first analysis to identify what you can apply to your broader pay equity program. The appropriate follow-through process will depend on the particulars of your analysis, as well as your company culture. Once you’ve gone through a preliminary analysis, you’ll know what you’re dealing with, and you can think through how to strengthen your pay equity program so that it works for everyone in your organization.
  • A list of compensation changes you feel good about. Your preliminary model may not be perfect for all job groups, but for some groups it will be totally adequate. For example, I worked with an organization with a distribution center. The fulfillment jobs in this center were clearly similar to each other and different from others. Seniority, supervisory responsibilities, and shift differentials did a good job of explaining most people’s pay — all data the employer already had on hand. Though they had some work to do before they had an actionable analysis for their whole organization, they identified real issues in their distribution centers with this initial analysis. Even on the first pass, you are likely to get some groups right enough that you are ready to take actions on any issues surfaced in those groups.
  • A list of other, specific issues that you want to dig deeper into. Some groups will require a closer look, or a second pass, as in the example of the software engineering group where they realized many of the highest paid folks had specific skills that they wanted to incorporate into the pay equity model. Your initial pay equity analysis will help you prioritize those groups, allowing you to take a data-driven approach to your investigations.
  • What data you want to start collecting to conduct more robust future analyses. Your first pay equity exercise is an excellent opportunity to clean up your data and create a plan for what your company needs to collect to conduct more robust analyses going forward. This can become a touchpoint with leadership about the need to put rigor behind employee data integrity. Additionally, when you identify additional data points to start collecting, you create an avenue to communicate with employees about your pay equity commitment, efforts, and progress.

Often, we see organizations start with a subset of their organization. Employers may feel good about the data they have in the U.S. and/or in the UK or in certain European countries, but have reservations about the accuracy of data in other markets. Starting with a subset of your organization is another way to identify actionable issues you want to address, while learning important lessons that can inform a broader rollout of a comprehensive pay equity program.


Don’t let perfect be the enemy of good. The important thing is to start — and to learn along the way.

Don’t let missing, messy, or decentralized data hold you back from starting pay equity work. The reality is that every company has room for improvement. I work with companies that have conducted pay equity analyses on the same workforce for years, and they still find opportunities to improve their models and processes.

Taking on a pay equity analysis doesn’t require perfectly clean and complete data in order to get actionable results. Having an expert partner to guide you through the process will help ensure that even with limited data available, you are able to conduct a meaningful analysis and identify ways to improve your methodology and process moving forward. Syndio can help you identify the best approach for your organization, so you can start taking steps towards lasting change. Don’t be afraid to get started — the journey of progress begins with the first step!


