Social scientists have long wrestled with how to present the results of statistical models to a lay audience. If there were a straightforward way of doing so that had widespread support, we'd be doing it (apart from the contentious few who always muck things up). But there isn't, and one can still find a range of opinion on fundamental questions. For example, in March 2009, the political methodology listserv (where all the stats geeks hang out) had an active debate about whether a traditional regression table with coefficients and standard errors was better or worse for displaying results than a graphical presentation. As another example, King (1986) argued that one should never use standardized regression coefficients. And finally, there are a number of problems with the widespread practice of reporting p-values on variables of interest, not least that comparing a coefficient to the null hypothesis is an incredibly weak test.

Most readers, however, want to be able to do something that should be simple: they want to understand the relative magnitude of the different factors in a model. In our book, we present a number of different applications of the TRAITS model and compare it to the more common (and we argue, incomplete) practice of relying on demographics supplemented by a number of behavioral variables (i.e., microtrends). Given that there's no set answer to how one should present results (especially since we are attempting to do so in our book without getting bogged down in the raw statistical results), how did we approach this problem?

Essentially, the problem boils down to this. Across a range of dependent variables, we ask: what is the relative power of the TRAITS model versus the supplemented demographics model? One further and important caveat is that we want to know the systematic power of each model. Any particular sample is always muddied by noise; we want to know (or at least be as certain as we can be) that our models aren't picking up on this noise. In the statistics literature this is known as overfitting the sample; within social science, one introductory explanation of the problem is found in de Marchi (2005).

There are thus three questions we had to answer before we included any model's results in our book:
  • First, did the model perform well overall, or was it simply fitting noise?
  • Second, are we certain about specific variables and their impact on the dependent variable?
  • Third, what is the relative contribution of the TRAITS model versus the supplemented demographics model?

For all of the results in the book, we took extraordinary steps to satisfy ourselves that all three questions were answered convincingly.

Overall model fit

We took the following steps to determine whether or not to include a model in the book:
  1. We divided our samples (primarily the data from Knowledge Networks) into two parts: a training set to initially fit our models and an out-of-sample set to see how well they perform on novel data. Results only made it into our book if the performance out-of-sample came close to matching the performance on the training set. This is easy enough to do given our samples were typically several thousand observations - we would randomly select somewhere near 10% of the data as an out-of-sample set and hold it aside for generating results.
  2. The danger we were guarding against is overfitting. The figure below is taken from de Marchi (2005); it shows a small sample of random data points centered on a mean of 0. No model should fit this sample, but one can nonetheless find models that appear to fit it if one works at it hard enough. As an example, the line is a complicated function that fits the data, but it's not a model one would want to bet money on if one were asked to predict the future...

(Note: the points are observations in a sample; the line is a "model".)
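The same point can be made numerically. What follows is a minimal sketch, assuming ten simulated noise points and a degree-9 polynomial standing in for the "complicated function"; neither choice is taken from de Marchi's figure. The polynomial reproduces the sample almost exactly, yet it has learned nothing that carries over to a fresh draw from the same process.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Ten observations of pure noise centered on 0: there is nothing systematic to find.
x = np.linspace(-1.0, 1.0, 10)
y_sample = rng.normal(loc=0.0, scale=1.0, size=x.size)

# A degree-9 polynomial has ten free coefficients, so it can pass through all ten points.
coefs = P.polyfit(x, y_sample, deg=9)
fitted = P.polyval(x, coefs)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length arrays."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


# A fresh draw from the same noise process stands in for "the future".
y_future = rng.normal(loc=0.0, scale=1.0, size=x.size)

rmse_in_sample = rmse(y_sample, fitted)
rmse_out_of_sample = rmse(y_future, fitted)
rmse_flat_line = rmse(y_future, np.zeros_like(x))  # the honest model: always predict 0

print(f"polynomial, in-sample RMSE:     {rmse_in_sample:.2e}")
print(f"polynomial, out-of-sample RMSE: {rmse_out_of_sample:.2f}")
print(f"flat line,  out-of-sample RMSE: {rmse_flat_line:.2f}")
```

An in-sample error near zero alongside a large out-of-sample error is the signature of a model that has fit the noise.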

Individual variable effects

We often, in the text, point out effects of particularly salient variables. Unlike an academic article, we can't provide the full results (i.e., a sense of the magnitude of the variable plus the standard errors). We're also wary that, as with the first question, one can often find what seem to be significant or important effects for variables that are in fact artifacts of the sample (i.e., your model has fit the noise present in the data rather than the systematic component). While there are a number of ways to provide additional security that the effects you've found are real, we relied on the bootstrap.

The nonparametric bootstrap is a technique that constructs a very large number of bootstrap samples by resampling, with replacement, from the empirical probability distribution of the original sample. The model is re-estimated on each bootstrap sample, and the spread of the resulting estimates indicates how stable each coefficient is. We only comment on variables in the text that remain important (i.e., they have the expected sign and significance) when using the bootstrap with 1000 replications.
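For readers who want to see the mechanics, here is a minimal sketch of a nonparametric bootstrap for regression coefficients. It runs on simulated stand-in data rather than the survey data. The variable names, sample size, and true coefficients are assumptions made for illustration, and the percentile interval used here is one common choice (the Stata output in the addendum reports normal-based intervals instead).

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated stand-in data (NOT the survey data): y depends on x1 but not on x2.
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + 0.0 * x2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])  # columns: intercept, x1, x2


def ols(design, outcome):
    """Ordinary least squares coefficients via a least-squares solve."""
    beta, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return beta


observed = ols(X, y)

# Nonparametric bootstrap: resample rows with replacement and refit each time.
n_reps = 1000
boot = np.empty((n_reps, X.shape[1]))
for r in range(n_reps):
    idx = rng.integers(0, n, size=n)  # row indices drawn with replacement
    boot[r] = ols(X[idx], y[idx])

boot_se = boot.std(axis=0, ddof=1)                           # bootstrap standard errors
ci_low, ci_high = np.percentile(boot, [2.5, 97.5], axis=0)   # percentile 95% intervals
same_sign = (np.sign(boot) == np.sign(observed)).mean(axis=0)

for name, b, se, lo, hi, s in zip(["_cons", "x1", "x2"], observed, boot_se,
                                  ci_low, ci_high, same_sign):
    print(f"{name:>6}: coef={b: .3f}  se={se:.3f}  "
          f"95% CI=[{lo: .3f}, {hi: .3f}]  same sign in {s:.0%} of replications")
```

A variable whose interval excludes zero and whose sign never flips across replications is the kind of effect the text treats as real; one whose sign wobbles is more likely an artifact of the sample.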

TRAITS vs. supplemented demographics

There's an active debate about the utility of standardized coefficients, but for our purpose - comparing the relative power of the TRAITS model vs. the supplemented demographics model - they are a reasonable choice. The basic idea is that different variables are measured on different units, and this makes comparison of the magnitude of the effects of variables difficult. To address this, standardized coefficients tell you how much the dependent variable changes if the independent variable in question moves one standard deviation (whatever the unit). The most cited critique of the use of standardized betas within social science is King (1986). He has two main complaints. The first is that there is no natural underlying unit that would allow one to make comparisons - but this is a matter more of aesthetics than anything else. His second complaint is that if one changes the sample and thereby changes the variance of the independent variables, then the standardized coefficients can also change dramatically.

Following King, imagine you have a dependent variable Y and an independent variable X. The three observations in your first sample are {Y,X} = {5,2}, {5, 4}, and {6, 4}. Below is the result of an OLS regression:

      Source |         SS    df          MS        Number of obs =       3
-------------+--------------------------------     F(1, 1)       =    0.33
       Model | .166666667     1  .166666667        Prob > F      =  0.6667
    Residual |         .5     1          .5        R-squared     =  0.2500
-------------+--------------------------------     Adj R-squared = -0.5000
       Total | .666666667     2  .333333333        Root MSE      =  .70711

           y |  Coef.  Std. Err.     t   P>|t|   [95% Conf. Interval]   Beta
-------------+---------------------------------------------------------------
           x |    .25   .4330127  0.58   0.667   -5.251948   5.751948     .5
       _cons |    4.5        1.5  3.00   0.205   -14.55931   23.55931


The slope for X is .25, which means that Y increases by .25 for every unit increase of X.

The regression line looks like this (superimposed on the observations):



As one can immediately tell from the significance of X and its standard error as well as the R-squared, this is not a good model. There is little overall variance in X, and the slope that we've discovered is more a product of chance than anything systematic. Adding or subtracting a single observation would produce dramatic changes to the model and the poor results reflect this.

King proposes that we add a new observation: {9.5, 20}. Since this both increases the variance of X and is exactly on the regression line from the last model, we find that our results improve. The variable X now is significant at the customary level and its standard error has dramatically improved. The model fit has also improved by a huge margin, and the graph below shows that the line now matches the data very well:



      Source |         SS    df          MS        Number of obs =       4
-------------+--------------------------------     F(1, 2)       =   52.75
       Model |    13.1875     1     13.1875        Prob > F      =  0.0184
    Residual |         .5     2         .25        R-squared     =  0.9635
-------------+--------------------------------     Adj R-squared =  0.9452
       Total |    13.6875     3      4.5625        Root MSE      =      .5

           y |  Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]   Beta
-------------+----------------------------------------------------------------
           x |    .25   .0344214   7.26   0.018    .1018966   .3981034    .98
       _cons |    4.5   .3593702  12.52   0.006    2.953755   6.046245






King's complaint is that the standardized coefficient changes from .5 in the first model to .98 in the second (listed as the "Beta" in the above results). First, it should be remembered that the point of standardized coefficients is not usually to compare variables from different samples; rather, it's to compare the magnitude of variables within a sample. Second, it must be remembered that the standardized coefficient depends on the variance of the independent variable - if that changes, so too does the coefficient. And to the extent King's critique bothers anyone, they'd have to be equally bothered by significance testing and the use of standard errors - both of these also change quite dramatically from the first sample to the second.
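Both regressions are small enough to check by hand or in a few lines of code. The sketch below uses the observations given above and the standard definition of a standardized coefficient for a single regressor: the slope multiplied by the ratio of the standard deviation of X to that of Y. The function name is ours, chosen for illustration.

```python
import numpy as np


def slope_and_beta(y, x):
    """Return the OLS slope and the standardized coefficient for one regressor."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    beta = slope * np.std(x, ddof=1) / np.std(y, ddof=1)
    return float(slope), float(beta)


# First sample: three observations with little variance in X.
y_first, x_first = [5, 5, 6], [2, 4, 4]
slope_first, beta_first = slope_and_beta(y_first, x_first)

# Second sample: the same three observations plus {9.5, 20}, which lies on the original line.
y_second, x_second = y_first + [9.5], x_first + [20]
slope_second, beta_second = slope_and_beta(y_second, x_second)

print(f"first sample:  slope={slope_first:.2f}  beta={beta_first:.2f}")
print(f"second sample: slope={slope_second:.2f}  beta={beta_second:.2f}")
```

The slope stays at .25 in both samples. Only the standardized coefficient moves, and it moves because the new observation raises the standard deviation of X much more than that of Y.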

For more information on this topic, see Gelman (http://www.stat.columbia.edu/~cook/movabletype/archives/2009/03/displaying_regr.html).

This is a long-winded way of saying that when we present graphs in our book, we are comparing the sum of the absolute values of the standardized coefficients of the TRAITS model versus the supplemented demographics model. For all of our dependent variables, we run the same model to compare TRAITS with the supplemented demographics unless noted in the text.

Resources:

Bring, Johan. "How to Standardize Regression Coefficients." The American Statistician, 1994.
de Marchi, Scott. Computational and Mathematical Modeling in the Social Sciences. Cambridge University Press, 2005.
Friedman, J., T. Hastie, and R. Tibshirani. The Elements of Statistical Learning. Springer Series in Statistics, 2001.
Delgado, Miguel and Wenceslao Gonzalez Manteiga. "Testing in Nonparametric Regression Based on the Bootstrap." The Annals of Statistics, 2001.
Gelman, Andrew. "Scaling regression inputs by dividing by two standard deviations." Statistics in Medicine, 2008.
King, Gary, Michael Tomz, and Jason Wittenberg. "Making the most of statistical analyses: Improving interpretation and presentation." American Journal of Political Science, 2000.
King, Gary. "How not to lie with statistics: avoiding common mistakes in quantitative political science." American Journal of Political Science, 1986.


Addendum: Predicting Political Activity (see pp. 115-116 of You Are What You Choose)

To see how we estimated our choice model, consider the discussion of political action in our chapter on "Consuming Politics." For a sample of respondents to the Knowledge Networks surveys, we created a scale that measured their political activity across many different decisions. These included voting, going to a rally, working in a campaign, donating money, contacting officials, and writing a letter to the editor. We then used our variables representing the TRAITS, Party ID, and Demographics + Microtargeting to predict who is politically active. The graph on page 116 indicates that for 88 out of 100 people we correctly predicted their level of political activity. To build the graph, we sum the absolute values of the standardized betas for all of the variables; each colored segment of the bar is then sized by the percentage of that sum accounted for by a particular type of variable. While political consultants treat Demographics + Microtargeting variables as destiny, our model shows that the TRAITS play an equally large role in determining a person's level of political activity.

If you're high on the Information trait, you may have the following additional questions about this model:

Question 1: Does the overall model generalize out-of-sample?

Original Sample, N=4725
Training Set, N=4260
Out-of-sample Set, N=465

Root MSE of model fit on the training set: 1.3
Root MSE of that same model applied to the out-of-sample set: 1.25 (!)

Answer: Yes -- in fact, the model does slightly better on the out-of-sample set than it did on the training set, which is an excellent outcome. We are clearly capturing something systematic about the data rather than overfitting the sample.
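The check itself is simple to reproduce on any data set. Here is a minimal sketch on simulated stand-in data: the sample sizes match the ones reported above, but the predictors, coefficients, and noise level are assumptions made for illustration. Note also that this sketch divides by the number of observations, whereas Stata's Root MSE divides by the residual degrees of freedom; with samples this large the difference is negligible.

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated stand-in data with a genuine systematic component plus noise.
n_total, n_holdout = 4725, 465
x = rng.normal(size=(n_total, 3))
assumed_coefs = np.array([0.8, -0.3, 0.0])  # illustrative values only
y = 1.0 + x @ assumed_coefs + rng.normal(scale=1.3, size=n_total)
X = np.column_stack([np.ones(n_total), x])

# Randomly set aside roughly 10% of the rows; the model never sees them while fitting.
shuffled = rng.permutation(n_total)
holdout_idx, train_idx = shuffled[:n_holdout], shuffled[n_holdout:]

# Fit on the training set only.
coefs, *_ = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)


def rmse(design, outcome):
    """Root mean squared error of the fitted model on the given rows."""
    resid = outcome - design @ coefs
    return float(np.sqrt(np.mean(resid ** 2)))


rmse_train = rmse(X[train_idx], y[train_idx])
rmse_holdout = rmse(X[holdout_idx], y[holdout_idx])

print(f"training set   (N={len(train_idx)}): RMSE = {rmse_train:.3f}")
print(f"out-of-sample  (N={len(holdout_idx)}): RMSE = {rmse_holdout:.3f}")
```

When the two numbers are close, as they are for the political action model, the fit reflects structure in the data. A holdout error far above the training error would point to overfitting.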

Question 2: Are we reasonably certain the variables we highlight in the text are genuinely important?

Compare the coefficients recovered from the regression on the training set with the subsequent bootstrap results.

OLS Regression on training set for the dependent variable Political Action (Chapter 5, p. 116):

      Source |         SS      df          MS        Number of obs =    4260
-------------+----------------------------------     F(23, 4236)   =   75.03
       Model | 2941.40591      23  127.887213        Prob > F      =  0.0000
    Residual | 7219.73471    4236  1.70437552        R-squared     =  0.2895
-------------+----------------------------------     Adj R-squared =  0.2856
       Total | 10161.1406    4259  2.38580432        Root MSE      =  1.3055

     pol_act |      Coef.  Std. Err.       t   P>|t|        Beta
-------------+---------------------------------------------------
        info |   4.211418   .4928954    8.54   0.000    .1189927
        risk |   .0593032   .1915891    0.31   0.757    .0043778
    discount |   .2750255   .1242289    2.21   0.027    .0305679
    altruism |    5.11929   .1937676   26.42   0.000    .3888472
       other |   .0396026    .182587    0.22   0.828    .0029363
       loyal |  -.1986226    .268059   -0.74   0.459   -.0102238
     partyid |   .0448009   .0123549    3.63   0.000    .0502457
      gender |  -.0414336   .0424795   -0.98   0.329   -.0133915
      agecat |   .1190375   .0176885    6.73   0.000     .104108
      hispan |  -.0261893   .0270792   -0.97   0.334   -.0126481
       black |  -.0243126   .1023843   -0.24   0.812   -.0032085
   education |   .2674262   .0279696    9.56   0.000    .1431566
      income |   .0381839   .0259019    1.47   0.141     .021977
  house_head |   .0676508   .0868603    0.78   0.436    .0113613
        kids |  -.0992315   .0296444   -3.35   0.001   -.0482974
  home owner |   .0114617   .0567192    0.20   0.840    .0032059
       urban |  -.0869962   .0607824   -1.43   0.152   -.0190046
     married |  -.0346019   .0590884   -0.59   0.558   -.0104588
    divorced |  -.0521756   .0715278   -0.73   0.466   -.0119373
    Fox news |   .1431671   .0457964    3.13   0.002    .0429032
   gun owner |   .1256663   .0430543    2.92   0.004    .0401986
  single_fam |   .0584629   .0534912    1.09   0.274    .0157985
  church_att |    .007942   .0138807    0.57   0.567    .0083093
       _cons |  -.2156179   .3531037   -0.61   0.541           .




Note: most of the above independent variables are self-explanatory, but house_head is household head, single_fam is living in a single family home, and church_att is church attendance.

Bootstrap replications (1000):

Linear regression                                Number of obs   =    4725
                                                 Replications    =    1000
                                                 Wald chi2(23)   = 1426.65
                                                 Prob > chi2     =  0.0000
                                                 R-squared       =  0.2907
                                                 Adj R-squared   =  0.2872
                                                 Root MSE        =  1.3002

             |   Observed  Bootstrap                        Normal-based
     pol_act |      Coef.  Std. Err.       z   P>|z|    [95% Conf. Interval]
-------------+---------------------------------------------------------------
        info |   4.034178   .5256567    7.67   0.000    3.003909    5.064446
        risk |   .0213319   .1760631    0.12   0.904   -.3237454    .3664092
    discount |   .2429872   .1232276    1.97   0.049    .0014655    .4845088
    altruism |    5.13132   .2259805   22.71   0.000    4.688406    5.574233
       other |   .0712412   .1829331    0.39   0.697    -.287301    .4297835
       loyal |  -.2854712   .2737294   -1.04   0.297   -.8219711    .2510286
     partyid |   .0468967   .0114132    4.11   0.000    .0245273    .0692662
      gender |  -.0474893   .0412135   -1.15   0.249   -.1282663    .0332878
      agecat |   .1267478   .0156742    8.09   0.000     .096027    .1574686
      hispan |  -.0050949   .0220281   -0.23   0.817   -.0482692    .0380793
       black |  -.0501854   .0955833   -0.53   0.600   -.2375252    .1371544
   education |    .272836   .0238029   11.46   0.000    .2261832    .3194887
      income |   .0329165   .0249986    1.32   0.188   -.0160799    .0819129
  house_head |   .0824984   .0757868    1.09   0.276    -.066041    .2310378
        kids |  -.0691018   .0268534   -2.57   0.010   -.1217336     -.01647
  home owner |   .0169316   .0535751    0.32   0.752   -.0880738    .1219369
       urban |  -.0911029   .0586117   -1.55   0.120   -.2059797    .0237739
     married |   -.050785   .0555436   -0.91   0.361   -.1596484    .0580784
    divorced |  -.0527845   .0665132   -0.79   0.427   -.1831479    .0775789
    Fox news |   .1420787   .0437425    3.25   0.001    .0563449    .2278124
   gun owner |   .1372335   .0408524    3.36   0.001    .0571642    .2173028
  single_fam |   .0533655   .0527446    1.01   0.312   -.0500121    .1567431
  church_att |   .0046954   .0133789    0.35   0.726   -.0215266    .0309175
       _cons |  -.1636907    .335815   -0.49   0.626    -.821876    .4944946




Answer: all coefficients are stable, which indicates we're probably right - for example, altruism, information, and time matter a great deal.

Question 3: what are the relative contributions of the models?

As the figure on p. 116 of our book shows, both the TRAITS and the supplemented demographics model have an impact on political action. The top three individual effects (see the column labeled "Beta" in the foregoing training-set regression) are, in order: altruism, education, and information.
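The arithmetic behind that kind of comparison can be sketched from the "Beta" column of the training-set regression. The grouping of variables below is an assumption made for illustration: info, risk, discount, altruism, other, and loyal are treated as the TRAITS block, partyid stands alone, and everything else is treated as Demographics + Microtargeting. The book's figure may group or scale the variables differently.

```python
# Standardized betas copied from the "Beta" column of the training-set regression.
# The assignment of variables to groups is an assumption made for this sketch.
betas_by_group = {
    "TRAITS": {
        "info": 0.1189927, "risk": 0.0043778, "discount": 0.0305679,
        "altruism": 0.3888472, "other": 0.0029363, "loyal": -0.0102238,
    },
    "Party ID": {"partyid": 0.0502457},
    "Demographics + Microtargeting": {
        "gender": -0.0133915, "agecat": 0.104108, "hispan": -0.0126481,
        "black": -0.0032085, "education": 0.1431566, "income": 0.021977,
        "house_head": 0.0113613, "kids": -0.0482974, "home owner": 0.0032059,
        "urban": -0.0190046, "married": -0.0104588, "divorced": -0.0119373,
        "Fox news": 0.0429032, "gun owner": 0.0401986, "single_fam": 0.0157985,
        "church_att": 0.0083093,
    },
}

# Sum the absolute standardized betas within each group, then convert to shares.
group_totals = {
    group: sum(abs(b) for b in variables.values())
    for group, variables in betas_by_group.items()
}
grand_total = sum(group_totals.values())
shares = {group: total / grand_total for group, total in group_totals.items()}

# Rank individual variables by the size of their standardized effect.
all_betas = {name: b for variables in betas_by_group.values() for name, b in variables.items()}
top_three = sorted(all_betas, key=lambda name: abs(all_betas[name]), reverse=True)[:3]

for group, share in shares.items():
    print(f"{group:<30} {share:6.1%}")
print("largest individual effects:", ", ".join(top_three))
```

Under this assumed grouping, the TRAITS block and the demographic block account for broadly similar shares of the total, which is consistent with the "equally large role" described in the addendum.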

About the Knowledge Networks Surveys

We are fortunate to have worked with Knowledge Networks survey data in our analyses. For general information about the firm and its services, see www.knowledgenetworks.com. The company describes the general approach for a Knowledge Networks survey this way:

METHODOLOGY

The survey was conducted using the web-enabled KnowledgePanel®, a probability-based panel designed to be representative of the U.S. population. Initially, participants are chosen scientifically by a random selection of telephone numbers and residential addresses. Persons in selected households are then invited by telephone or by mail to participate in the web-enabled KnowledgePanel®. For those who agree to participate, but do not already have Internet access, Knowledge Networks provides at no cost a laptop and ISP connection. People who already have computers and Internet service are permitted to participate using their own equipment. Panelists then receive unique log-in information for accessing surveys online, and then are sent emails throughout each month inviting them to participate in research. More technical information is available at http://www.knowledgenetworks.com/ganp/reviewer-info.html.

