# Introduction to Crosstabs

You are investigating whether women in a given company are being discriminated against. You look at the salaries of women and find this:

Company 1

| | Female |
|---|---|
| Under 50K | 400 |
| Over 50K | 200 |
| Total | 600 |

At another company, you find this:

Company 2

| | Female |
|---|---|
| Under 50K | 300 |
| Over 50K | 300 |
| Total | 600 |

What would you conclude about the first company? It looks like the first company discriminates while the second doesn't, right?

Let's look at it from another point of view. Let's just consider all the people who are making good money, starting with the first company:

Company 1

| | Male | Female | Total |
|---|---|---|---|
| Over 50K | 100 | 200 | 300 |

Notice that twice as many women as men are making good money. If we look at the second company, we see more or less the same story:

Company 2

| | Male | Female | Total |
|---|---|---|---|
| high pay | 200 | 300 | 500 |

So, in both companies, it looks like more women than men are making good money. But from the first two tables, it appears that in Company 1 most women are making low pay, while in Company 2 it's about half and half.

Now let's bring the men into the picture and look at the full table for each company:

Company 1

| | Male | Female | Total |
|---|---|---|---|
| Under 50K | 200 | 400 | 600 |
| Over 50K | 100 | 200 | 300 |
| Total | 300 | 600 | 900 |

Is there discrimination at Company 1? The biggest group of people is females who make less than 50K!

Company 2

| | Male | Female | Total |
|---|---|---|---|
| low pay | 100 | 300 | 400 |
| high pay | 200 | 300 | 500 |
| Total | 300 | 600 | 900 |
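The row and column totals in these two tables can be derived from the four cell counts alone. The following is a minimal Python sketch of that bookkeeping, not part of the original handout; the dictionary layout and the helper name `margins` are illustrative assumptions.

```python
# Cell counts keyed by (pay level, sex), copied from the two full tables above.
company1 = {
    ("Under 50K", "Male"): 200,
    ("Under 50K", "Female"): 400,
    ("Over 50K", "Male"): 100,
    ("Over 50K", "Female"): 200,
}
company2 = {
    ("low pay", "Male"): 100,
    ("low pay", "Female"): 300,
    ("high pay", "Male"): 200,
    ("high pay", "Female"): 300,
}


def margins(table):
    """Return (row_totals, column_totals, grand_total) for a dict of cell counts.

    `margins` is a hypothetical helper written for this illustration.
    """
    row_totals, col_totals = {}, {}
    for (row, col), count in table.items():
        row_totals[row] = row_totals.get(row, 0) + count
        col_totals[col] = col_totals.get(col, 0) + count
    return row_totals, col_totals, sum(table.values())


rows1, cols1, total1 = margins(company1)
rows2, cols2, total2 = margins(company2)

print(rows1, cols1, total1)
# {'Under 50K': 600, 'Over 50K': 300} {'Male': 300, 'Female': 600} 900
print(rows2, cols2, total2)
# {'low pay': 400, 'high pay': 500} {'Male': 300, 'Female': 600} 900
```

Both companies have the same column totals (300 men, 600 women) but different row totals, which is part of why the raw counts are hard to compare by eye.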

If we convert to column percentages we get:

Company 1

| | Male | Female |
|---|---|---|
| Under 50K | 66.67% | 66.67% |
| Over 50K | 33.33% | 33.33% |
| Total | 100.00% | 100.00% |

Company 2

| | Male | Female |
|---|---|---|
| low pay | 33.33% | 50.00% |
| high pay | 66.67% | 50.00% |
| Total | 100.00% | 100.00% |
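A column percentage is each cell count divided by the total of its column (here, the total number of men or of women). As a rough sketch of that arithmetic in Python, assuming the same cell counts as the raw frequency tables for the two companies; the helper name `column_percentages` is hypothetical:

```python
# Cell counts keyed by (pay level, sex), taken from the raw frequency tables.
company1 = {
    ("Under 50K", "Male"): 200,
    ("Under 50K", "Female"): 400,
    ("Over 50K", "Male"): 100,
    ("Over 50K", "Female"): 200,
}
company2 = {
    ("low pay", "Male"): 100,
    ("low pay", "Female"): 300,
    ("high pay", "Male"): 200,
    ("high pay", "Female"): 300,
}


def column_percentages(table):
    """Divide every cell by its column (sex) total and express it as a percentage."""
    col_totals = {}
    for (_, col), count in table.items():
        col_totals[col] = col_totals.get(col, 0) + count
    return {
        (row, col): 100 * count / col_totals[col]
        for (row, col), count in table.items()
    }


pct1 = column_percentages(company1)
pct2 = column_percentages(company2)

for name, pct in (("Company 1", pct1), ("Company 2", pct2)):
    print(name)
    for (row, col), value in pct.items():
        print(f"  {row:<10} {col:<7} {value:6.2f}%")
```

Within each column the percentages add up to 100, so a column can be read on its own as "of all the men (or women), what share falls in each pay level".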

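In Company 1, men and women are equally likely to earn over 50K (33.33% each). In Company 2, 66.67% of the men earn high pay but only 50.00% of the women do, so it is Company 2 where women fare worse. So it seems like we should always look at the percentages, right?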

Have a closer look at the raw frequencies for Company 2. What if we had sampled three times as many low-paying people as high-paying people? Then the results would have looked like this:

Company 2

| | Male | Female | Total |
|---|---|---|---|
| low pay | 300 | 900 | 1200 |
| high pay | 200 | 150 | 350 |
| Total | 500 | 1050 | 1550 |

And the percentages are these:

Company 2

| | Male | Female |
|---|---|---|
| low pay | 60.00% | 85.71% |
| high pay | 40.00% | 14.29% |
| Total | 100.00% | 100.00% |
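To see how much the picture shifts, the share of each sex earning high pay can be compared between the original Company 2 sample and the resampled one. The sketch below is only an illustration, using the counts exactly as they appear in the two Company 2 frequency tables; the helper name `high_pay_share` is hypothetical.

```python
# Company 2 cell counts keyed by (pay level, sex).
original = {
    ("low pay", "Male"): 100,
    ("low pay", "Female"): 300,
    ("high pay", "Male"): 200,
    ("high pay", "Female"): 300,
}
# Counts as given in the resampled table.
resampled = {
    ("low pay", "Male"): 300,
    ("low pay", "Female"): 900,
    ("high pay", "Male"): 200,
    ("high pay", "Female"): 150,
}


def high_pay_share(table, sex):
    """Percentage of the given sex that falls in the 'high pay' row."""
    total = sum(count for (_, col), count in table.items() if col == sex)
    return 100 * table[("high pay", sex)] / total


for label, table in (("original", original), ("resampled", resampled)):
    male = high_pay_share(table, "Male")
    female = high_pay_share(table, "Female")
    print(f"{label:<10} male {male:.2f}%  female {female:.2f}%")
# original   male 66.67%  female 50.00%
# resampled  male 40.00%  female 14.29%
```

The same company yields very different column percentages depending on how many people were drawn from each pay level, so the percentages alone do not settle the question.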

These percentages are different! Somehow, we need to get a handle on the fact that there are different numbers of males and females, AND different numbers of high- and low-paying jobs. We can't see the pattern in the data because the groups differ in size. A way to deal with this is given in the next handout.