Social science has long been concerned with how to demonstrate the results of statistical models to a lay audience. If there were a straightforward way of doing so that had widespread support, we'd be doing it (apart from the contentious few that always muck things up). But there isn't, and one can still find a range of opinions on fundamental questions. For example, in March 2009, the political methodology listserv (where all the stats geeks hang out) had an active debate about whether a traditional regression table with coefficients and standard errors was better or worse for displaying results than a graphical presentation. As another example, King (1986) argued that one should never use standardized regression coefficients. And finally, there are a number of problems with the widespread practice of reporting p-values on variables of interest -- not least of which is that comparing a coefficient to the null hypothesis is an incredibly weak test.
Most readers, however, want to be able to do something that should be simple: they want to understand the relative magnitude of the different factors in a model. In our book, we present a number of different applications of the TRAITS model and compare it to the more common (and we argue, incomplete) practice of relying on demographics supplemented by a number of behavioral variables (i.e., microtrends). Given that there's no set answer to how one should present results (especially since we are attempting to do so in our book without getting bogged down in the raw statistical results), how did we approach this problem?
Essentially, the problem boils down to this: across a range of dependent variables, what is the relative power of the TRAITS model versus the supplemented demographics model? One further and important caveat is that we want to know the systematic power of each model. Any particular sample is always muddied by noise; we want to know (or at least be as certain as we can be) that our models aren't picking up on this noise. In the statistics literature this is known as overfitting the sample, and in the realm of social science, one introductory explanation of this problem is found in de Marchi (2005).
There are thus three questions we had to answer before we included any model's results in our book:
- The first question is whether the model performed well overall or was simply fitting noise.
- The second question is whether we are certain about specific variables and their impact on the dependent variable.
- The final question is the relative contribution of the TRAITS model versus the supplemented demographics model.
For all of our results in the book, we took extraordinary steps to satisfy ourselves that all three questions were answered convincingly.
Overall model fit
We took the following steps to determine whether or not to include a model in the book:
- We divided our samples (primarily the data from Knowledge Networks) into two parts: a training set to initially fit our models and an out-of-sample set to see how well they perform on novel data. Results only made it into our book if the out-of-sample performance came close to matching the performance on the training set. This is easy enough to do given that our samples typically contained several thousand observations - we would randomly select roughly 10% of the data as an out-of-sample set and hold it aside, fitting the model only on the remainder.
- The danger we were wary of is called overfitting. The figure below is taken from de Marchi (2005) - it's a small sample of random data points centered on a mean of 0. No model should fit this sample - but one can nonetheless find models that appear to fit the sample if one works at it hard enough. As an example, the line is a complicated function that fits the data, but it's not a model one would want to bet money on if one were asked to predict the future...
(Note: the points are observations in a sample; the line is a "model".)
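The holdout procedure described above can be sketched as follows. This is only an illustration in Python with NumPy, run on synthetic data; the function names, the fixed 10% fraction, the seeds, and the data-generating process are all assumptions made for the example, not the code or data used for the book.

```python
import numpy as np


def train_holdout_split(n_obs, holdout_frac=0.10, seed=0):
    """Randomly split row indices into a training set and an out-of-sample set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_obs)
    n_holdout = int(round(n_obs * holdout_frac))
    return order[n_holdout:], order[:n_holdout]


def ols_fit(X, y):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(X)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


def rmse(X, y, coefs):
    """Root mean squared prediction error (plain mean, no degrees-of-freedom
    correction, so it differs slightly from a regression package's Root MSE)."""
    design = np.column_stack([np.ones(len(X)), X])
    resid = y - design @ coefs
    return float(np.sqrt(np.mean(resid ** 2)))


# Synthetic stand-in for a survey sample: three predictors, one with no effect.
rng = np.random.default_rng(42)
n = 4000
X = rng.normal(size=(n, 3))
y = 1.0 + X @ np.array([0.5, -0.3, 0.0]) + rng.normal(scale=1.0, size=n)

train_idx, holdout_idx = train_holdout_split(n)
coefs = ols_fit(X[train_idx], y[train_idx])          # fit on training rows only
rmse_train = rmse(X[train_idx], y[train_idx], coefs)
rmse_holdout = rmse(X[holdout_idx], y[holdout_idx], coefs)

print(f"training RMSE:      {rmse_train:.3f}")
print(f"out-of-sample RMSE: {rmse_holdout:.3f}")
```

If the out-of-sample error is close to the training error, the model is probably capturing something systematic; a much larger out-of-sample error is the usual symptom of overfitting.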
Individual variable effects
In the text, we often point out the effects of particularly salient variables. Unlike in an academic article, we can't provide the full results (i.e., a sense of the magnitude of the variable plus the standard errors). We're also wary that, as with the first question, one can often find what seem to be significant or important effects for variables that are in fact artifacts of the sample (i.e., your model has fit the noise present in the data rather than the systematic component). While there are a number of ways to gain additional assurance that the effects you've found are real, we relied on the bootstrap.
The nonparametric bootstrap is a technique that constructs a very large number of bootstrap samples by resampling observations, with replacement, from the original sample (that is, from its empirical probability distribution) and re-estimates the model on each one. We only comment on variables in the text that remain important (i.e., they have the expected sign and significance) when using the bootstrap with 1000 replications.
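A minimal sketch of a nonparametric bootstrap for regression coefficients is given below, in Python with NumPy. The data are synthetic and the function names are hypothetical; the normal-based interval (observed coefficient plus or minus 1.96 bootstrap standard errors) is one common way to summarize the replications and is assumed here for illustration.

```python
import numpy as np


def ols_coefs(X, y):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(X)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


def bootstrap_ols(X, y, n_reps=1000, seed=0):
    """Re-estimate the regression on n_reps resamples drawn with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_reps, X.shape[1] + 1))
    for r in range(n_reps):
        rows = rng.integers(0, n, size=n)   # row indices, sampled with replacement
        draws[r] = ols_coefs(X[rows], y[rows])
    return draws


# Synthetic data: the first predictor matters, the second does not.
rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 2))
y = 0.5 + 1.0 * X[:, 0] + 0.0 * X[:, 1] + rng.normal(size=n)

observed = ols_coefs(X, y)                  # coefficients on the original sample
draws = bootstrap_ols(X, y, n_reps=1000)
boot_se = draws.std(axis=0, ddof=1)         # bootstrap standard errors
lower = observed - 1.96 * boot_se           # normal-based 95% interval
upper = observed + 1.96 * boot_se

for name, b, se, lo, hi in zip(["_cons", "x1", "x2"], observed, boot_se, lower, upper):
    print(f"{name:6s} coef={b: .3f}  boot_se={se:.3f}  95% CI=[{lo: .3f}, {hi: .3f}]")
```

A variable whose sign and significance survive this resampling is less likely to be an artifact of one particular sample.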
TRAITS vs. supplemented demographics
There's an active debate about the utility of standardized coefficients, but for our purpose - comparing the relative power of the TRAITS model vs. the supplemented demographics model - they are a reasonable choice. The basic idea is that different variables are measured in different units, and this makes it difficult to compare the magnitudes of their effects. To address this, standardized coefficients tell you how much the dependent variable changes (measured in standard deviations of the dependent variable) if the independent variable in question moves one standard deviation (whatever the unit).
The most cited critique of the use of standardized betas within social science is King (1986). He has two main complaints. The first is that there is no natural underlying unit that would allow one to make comparisons - but this is a matter more of aesthetics than anything else. His second complaint is that if one changes the sample and thereby changes the variance of the independent variables, then the standardized coefficients can also change dramatically.
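For reference, the standardized coefficient can be written out in symbols. The notation is not from the original text and is introduced only for illustration: b_j is the ordinary (unstandardized) slope on variable x_j, and s_{x_j} and s_y are the sample standard deviations of x_j and of the dependent variable.

```latex
\beta_j^{*} = b_j \, \frac{s_{x_j}}{s_y}
```

Because both standard deviations are computed from the sample at hand, changing the sample can change the standardized coefficient even when the slope b_j itself does not move, which is the mechanism behind King's second complaint.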
Following King, imagine you have a dependent variable Y and an independent variable X. The three observations in your first sample are {Y,X} = {5,2}, {5, 4}, and {6, 4}. Below is the result of an OLS regression:
| Source | SS | df | MS |
|---|---|---|---|
| Model | .166666667 | 1 | .166666667 |
| Residual | .5 | 1 | .5 |
| Total | .666666667 | 2 | .333333333 |

| Statistic | Value |
|---|---|
| Number of obs | 3 |
| F(1, 1) | 0.33 |
| Prob > F | 0.6667 |
| R-squared | 0.2500 |
| Adj R-squared | -0.5000 |
| Root MSE | .70711 |

| y | Coef. | Std. Err. | t | P>\|t\| | 95% Conf. Interval (lower) | 95% Conf. Interval (upper) | Beta |
|---|---|---|---|---|---|---|---|
| x | .25 | .4330127 | 0.58 | 0.667 | -5.251948 | 5.751948 | .5 |
| _cons | 4.5 | 1.5 | 3.00 | 0.205 | -14.55931 | 23.55931 | |
The slope for X is .25, which means that Y increases by .25 for every unit increase of X.
The regression line looks like this (superimposed on the observations):
As one can immediately tell from the significance of X and its standard error as well as the R-squared, this is not a good model. There is little overall variance in X, and the slope that we've discovered is more a product of chance than anything systematic. Adding or subtracting a single observation would produce dramatic changes to the model and the poor results reflect this.
King proposes that we add a new observation: {9.5, 20}. This point both increases the variance of X and lies exactly on the regression line from the last model (4.5 + .25 × 20 = 9.5), so our results improve. The variable X is now significant at the customary level and its standard error has shrunk dramatically. The model fit has also improved by a huge margin, and the graph below shows that the line now matches the data very well:
| Source | SS | df | MS |
|---|---|---|---|
| Model | 13.1875 | 1 | 13.1875 |
| Residual | .5 | 2 | .25 |
| Total | 13.6875 | 3 | 4.5625 |

| Statistic | Value |
|---|---|
| Number of obs | 4 |
| F(1, 2) | 52.75 |
| Prob > F | 0.0184 |
| R-squared | 0.9635 |
| Adj R-squared | 0.9452 |
| Root MSE | .5 |

| y | Coef. | Std. Err. | t | P>\|t\| | 95% Conf. Interval (lower) | 95% Conf. Interval (upper) | Beta |
|---|---|---|---|---|---|---|---|
| x | .25 | .0344214 | 7.26 | 0.018 | .1018966 | .3981034 | .98 |
| _cons | 4.5 | .3593702 | 12.52 | 0.006 | 2.953755 | 6.046245 | |
King's complaint is that the standardized coefficient changes from .5 in the first model to .98 in the second (listed as the "Beta" in the above results). First, it should be remembered that the point of standardized coefficients is not usually to compare variables from different samples; rather, it's to compare the magnitude of variables within a sample. Second, it must be remembered that the standardized coefficient depends on the variance of the independent variable - if that changes, so too does the coefficient. And to the extent King's critique bothers anyone, they'd have to be equally bothered by significance testing and the use of standard errors - both of these also change quite dramatically from the first sample to the second.
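The two regressions in King's example are small enough to reproduce by hand. The sketch below does so in Python with NumPy, using the four observations given above; the function name and the returned dictionary keys are hypothetical, and the code is meant only to show where the slope, standard error, and "Beta" values come from.

```python
import numpy as np


def simple_ols(x, y):
    """Bivariate OLS: slope, intercept, slope std. error, R-squared, standardized beta."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sxx = np.sum((x - x.mean()) ** 2)
    syy = np.sum((y - y.mean()) ** 2)
    sxy = np.sum((x - x.mean()) * (y - y.mean()))
    slope = sxy / sxx
    intercept = y.mean() - slope * x.mean()
    resid_ss = np.sum((y - (intercept + slope * x)) ** 2)
    se_slope = np.sqrt(resid_ss / (len(x) - 2) / sxx)   # residual df = n - 2
    beta = slope * np.sqrt(sxx / syy)                   # slope * sd(x) / sd(y)
    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": se_slope,
        "r2": 1.0 - resid_ss / syy,
        "beta": beta,
    }


# First sample: three observations with little variance in X.
first = simple_ols(x=[2, 4, 4], y=[5, 5, 6])
# Second sample: the same points plus {Y, X} = {9.5, 20}, which lies on the fitted line.
second = simple_ols(x=[2, 4, 4, 20], y=[5, 5, 6, 9.5])

for label, fit in [("first sample", first), ("second sample", second)]:
    print(label, {key: round(float(value), 4) for key, value in fit.items()})
```

The slope is .25 in both fits; only the standard error, the R-squared, and the standardized coefficient move, which is the point of the paragraph above.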
For more information on this topic, see Gelman (http://www.stat.columbia.edu/~cook/movabletype/archives/2009/03/displaying_regr.html).
This is a long-winded way of saying that when we present graphs in our book, we are comparing the sum of the absolute values of the standardized coefficients of the TRAITS model versus the supplemented demographics model. For all of our dependent variables, we run the same model to compare TRAITS with the supplemented demographics unless noted in the text.
Resources:
Bring, Johan. "How to Standardize Regression Coefficients." The American Statistician, 1994.
de Marchi, Scott. Computational and Mathematical Modeling in the Social Sciences. Cambridge University Press, 2005.
Friedman, J., T. Hastie, and R. Tibshirani. The Elements of Statistical Learning. Springer Series in Statistics, 2001.
Delgado, Miguel and Wenceslao Gonzalez Manteiga. "Testing in Nonparametric Regression Based on the Bootstrap." The Annals of Statistics, 2001.
Gelman, Andrew. "Scaling regression inputs by dividing by two standard deviations." Statistics in Medicine, 2008.
King, Gary, Michael Tomz, and Jason Wittenberg. "Making the most of statistical analyses: Improving interpretation and presentation." American Journal of Political Science, 2000.
King, Gary. "How not to lie with statistics: avoiding common mistakes in quantitative political science." American Journal of Political Science, 1986.
Addendum: Predicting Political Activity (see pp. 115-116 of You Are What You Choose)
To see how we estimated our choice model, consider the discussion of political action in our chapter on "Consuming Politics." For a sample of respondents to the Knowledge Networks surveys, we created a scale that measured their political activity across many different decisions. These included voting, going to a rally, working in a campaign, donating money, contacting officials, and writing a letter to the editor. We then used our variables representing the TRAITS, Party ID, and Demographics + Microtargeting to predict who is politically active. The graph on page 116 indicates that for 88 out of 100 people we correctly predicted their level of political activity. To build the graph, we sum the absolute values of the standardized betas across all variables; the share of the bar drawn in a particular color is the percentage of that sum accounted for by a particular type of variable. While political consultants treat Demographics + Microtargeting variables as destiny, our model shows that the TRAITS play an equally large role in determining a person's level of political activity.
If you're high on the Information trait, you may have the following additional questions about this model:
Question 1: Does the overall model generalize out-of-sample?
Original Sample, N=4725
Training Set, N=4260
Out-of-sample Set, N=465
Root MSE of model fit on the training set: 1.3
Root MSE of that same model applied to the out-of-sample set: 1.25 (!)
Answer: Yes -- in fact, the model does slightly better on the out-of-sample set than it did on the training set, which is an excellent outcome. We are clearly capturing something systematic about the data rather than overfitting the sample.
Question 2: Are we reasonably certain the variables we highlight in the text are genuinely important?
Compare the coefficients recovered from the regression on the training set with the subsequent bootstrap results.
OLS Regression on training set for the dependent variable Political Action (Chapter 5, p. 116):
| Source | SS | df | MS |
|---|---|---|---|
| Model | 2941.40591 | 23 | 127.887213 |
| Residual | 7219.73471 | 4236 | 1.70437552 |
| Total | 10161.1406 | 4259 | 2.38580432 |

| Statistic | Value |
|---|---|
| Number of obs | 4260 |
| F(23, 4236) | 75.03 |
| Prob > F | 0.0000 |
| R-squared | 0.2895 |
| Adj R-squared | 0.2856 |
| Root MSE | 1.3055 |

| pol_act | Coef. | Std. Err. | t | P>\|t\| | Beta |
|---|---|---|---|---|---|
| info | 4.211418 | .4928954 | 8.54 | 0.000 | .1189927 |
| risk | .0593032 | .1915891 | 0.31 | 0.757 | .0043778 |
| discount | .2750255 | .1242289 | 2.21 | 0.027 | .0305679 |
| altruism | 5.11929 | .1937676 | 26.42 | 0.000 | .3888472 |
| other | .0396026 | .182587 | 0.22 | 0.828 | .0029363 |
| loyal | -.1986226 | .268059 | -0.74 | 0.459 | -.0102238 |
| partyid | .0448009 | .0123549 | 3.63 | 0.000 | .0502457 |
| gender | -.0414336 | .0424795 | -0.98 | 0.329 | -.0133915 |
| agecat | .1190375 | .0176885 | 6.73 | 0.000 | .104108 |
| hispan | -.0261893 | .0270792 | -0.97 | 0.334 | -.0126481 |
| black | -.0243126 | .1023843 | -0.24 | 0.812 | -.0032085 |
| education | .2674262 | .0279696 | 9.56 | 0.000 | .1431566 |
| income | .0381839 | .0259019 | 1.47 | 0.141 | .021977 |
| house_head | .0676508 | .0868603 | 0.78 | 0.436 | .0113613 |
| kids | -.0992315 | .0296444 | -3.35 | 0.001 | -.0482974 |
| home owner | .0114617 | .0567192 | 0.20 | 0.840 | .0032059 |
| urban | -.0869962 | .0607824 | -1.43 | 0.152 | -.0190046 |
| married | -.0346019 | .0590884 | -0.59 | 0.558 | -.0104588 |
| divorced | -.0521756 | .0715278 | -0.73 | 0.466 | -.0119373 |
| Fox news | .1431671 | .0457964 | 3.13 | 0.002 | .0429032 |
| gun owner | .1256663 | .0430543 | 2.92 | 0.004 | .0401986 |
| single_fam | .0584629 | .0534912 | 1.09 | 0.274 | .0157985 |
| church_att | .007942 | .0138807 | 0.57 | 0.567 | .0083093 |
| _cons | -.2156179 | .3531037 | -0.61 | 0.541 | . |
Note: most of the above independent variables are self-explanatory, but house_head is household head, single_fam is living in a single family home, and church_att is church attendance.
Bootstrap replications (1000): Linear regression

| Statistic | Value |
|---|---|
| Number of obs | 4725 |
| Replications | 1000 |
| Wald chi2(23) | 1426.65 |
| Prob > chi2 | 0.0000 |
| R-squared | 0.2907 |
| Adj R-squared | 0.2872 |
| Root MSE | 1.3002 |

| pol_act | Observed Coef. | Bootstrap Std. Err. | z | P>\|z\| | Normal-based 95% Conf. Interval (lower) | Normal-based 95% Conf. Interval (upper) |
|---|---|---|---|---|---|---|
| info | 4.034178 | .5256567 | 7.67 | 0.000 | 3.003909 | 5.064446 |
| risk | .0213319 | .1760631 | 0.12 | 0.904 | -.3237454 | .3664092 |
| discount | .2429872 | .1232276 | 1.97 | 0.049 | .0014655 | .4845088 |
| altruism | 5.13132 | .2259805 | 22.71 | 0.000 | 4.688406 | 5.574233 |
| other | .0712412 | .1829331 | 0.39 | 0.697 | -.287301 | .4297835 |
| loyal | -.2854712 | .2737294 | -1.04 | 0.297 | -.8219711 | .2510286 |
| partyid | .0468967 | .0114132 | 4.11 | 0.000 | .0245273 | .0692662 |
| gender | -.0474893 | .0412135 | -1.15 | 0.249 | -.1282663 | .0332878 |
| agecat | .1267478 | .0156742 | 8.09 | 0.000 | .096027 | .1574686 |
| hispan | -.0050949 | .0220281 | -0.23 | 0.817 | -.0482692 | .0380793 |
| black | -.0501854 | .0955833 | -0.53 | 0.600 | -.2375252 | .1371544 |
| education | .272836 | .0238029 | 11.46 | 0.000 | .2261832 | .3194887 |
| income | .0329165 | .0249986 | 1.32 | 0.188 | -.0160799 | .0819129 |
| house_head | .0824984 | .0757868 | 1.09 | 0.276 | -.066041 | .2310378 |
| kids | -.0691018 | .0268534 | -2.57 | 0.010 | -.1217336 | -.01647 |
| home owner | .0169316 | .0535751 | 0.32 | 0.752 | -.0880738 | .1219369 |
| urban | -.0911029 | .0586117 | -1.55 | 0.120 | -.2059797 | .0237739 |
| married | -.050785 | .0555436 | -0.91 | 0.361 | -.1596484 | .0580784 |
| divorced | -.0527845 | .0665132 | -0.79 | 0.427 | -.1831479 | .0775789 |
| Fox news | .1420787 | .0437425 | 3.25 | 0.001 | .0563449 | .2278124 |
| gun owner | .1372335 | .0408524 | 3.36 | 0.001 | .0571642 | .2173028 |
| single_fam | .0533655 | .0527446 | 1.01 | 0.312 | -.0500121 | .1567431 |
| church_att | .0046954 | .0133789 | 0.35 | 0.726 | -.0215266 | .0309175 |
| _cons | -.1636907 | .335815 | -0.49 | 0.626 | -.821876 | .4944946 |
Answer: All coefficients are stable, which indicates we're probably right - for example, altruism, information, and time (the discount variable) matter a great deal.
Question 3: What are the relative contributions of the models?
As we show in the figure on p. 116 of our book, both TRAITS and the supplemented demographics model have an impact on political action. The top three individual effects (see the column labeled "Beta" in the foregoing OLS regression) are, in order: altruism, education, and information.
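The sketch below shows one way the shares behind such a bar graph could be computed from the "Beta" column of the training-set regression above. The beta values are copied from that table, but the assignment of variables to groups is an assumption made for illustration (in particular, treating info, risk, discount, altruism, other, and loyal as the TRAITS variables and everything except partyid as Demographics + Microtargeting); the page does not spell out the grouping, so the shares need not match the book's figure exactly. The function and dictionary names are hypothetical.

```python
# Standardized betas copied from the training-set OLS table.
betas = {
    "info": 0.1189927, "risk": 0.0043778, "discount": 0.0305679,
    "altruism": 0.3888472, "other": 0.0029363, "loyal": -0.0102238,
    "partyid": 0.0502457,
    "gender": -0.0133915, "agecat": 0.104108, "hispan": -0.0126481,
    "black": -0.0032085, "education": 0.1431566, "income": 0.021977,
    "house_head": 0.0113613, "kids": -0.0482974, "home owner": 0.0032059,
    "urban": -0.0190046, "married": -0.0104588, "divorced": -0.0119373,
    "Fox news": 0.0429032, "gun owner": 0.0401986, "single_fam": 0.0157985,
    "church_att": 0.0083093,
}

# ASSUMED grouping, for illustration only.
groups = {
    "TRAITS": ["info", "risk", "discount", "altruism", "other", "loyal"],
    "Party ID": ["partyid"],
    "Demographics + Microtargeting": [
        "gender", "agecat", "hispan", "black", "education", "income",
        "house_head", "kids", "home owner", "urban", "married", "divorced",
        "Fox news", "gun owner", "single_fam", "church_att",
    ],
}


def beta_shares(betas, groups):
    """Each group's share of the summed absolute standardized betas."""
    totals = {
        group: sum(abs(betas[name]) for name in names)
        for group, names in groups.items()
    }
    grand_total = sum(totals.values())
    return {group: total / grand_total for group, total in totals.items()}


shares = beta_shares(betas, groups)
for group, share in shares.items():
    print(f"{group:30s} {share:6.1%}")
```

Under this assumed grouping, the TRAITS variables and the demographic and microtargeting variables each account for a bit under half of the total, with Party ID making up the small remainder.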
About the Knowledge Networks Surveys
We are fortunate to have worked with Knowledge Networks survey data in our analyses. For general information about the firm and its services, see www.knowledgenetworks.com. The company describes the general approach for a Knowledge Networks survey this way:
METHODOLOGY
The survey was conducted using the web-enabled KnowledgePanel®, a
probability-based panel designed to be representative of the U.S.
population. Initially, participants are chosen scientifically by a
random selection of telephone numbers and residential addresses. Persons
in selected households are then invited by telephone or by mail to
participate in the web-enabled KnowledgePanel®. For those who agree to
participate, but do not already have Internet access, Knowledge Networks
provides at no cost a laptop and ISP connection. People who already have
computers and Internet service are permitted to participate using
their own equipment. Panelists then receive unique log-in information
for accessing surveys online, and then are sent emails throughout each
month inviting them to participate in research. More technical
information is available at
http://www.knowledgenetworks.com/ganp/reviewer-info.html.
All contents of the site © 2009 Scott de Marchi and James T. Hamilton.