Report to the FCC: How the Ownership Structure of Media Markets affects Civic Engagement and Political Knowledge, 2006-2008

Lynn Vavreck,1 Simon Jackman,2 and Jeffrey B. Lewis3

Saturday 23rd April, 2011 4

1 Associate Professor, Department of Political Science, University of California, Los Angeles. e-mail: lvavreck@ucla.edu
2 Professor, Department of Political Science, Stanford University. e-mail: jackman@stanford.edu
3 Associate Professor, Department of Political Science, University of California, Los Angeles. e-mail: jblewis@ucla.edu
4 DMA is a registered trademark of, and DMA region boundaries and names are the proprietary information of, The Nielsen Company, used under license.

Contents

I  Introduction
1  Data
   1.1  The Cooperative Election Studies
   1.2  Validated Turnout Data from U.S. Secretaries of States
   1.3  Nielsen Gross Ratings Points Data for Political Advertising
2  Civic Engagement and Political Information

II  Analyses: Political Interest, Knowledge, Uncertainty, and Participation (Project A)
3  Analyses of Variation
   3.1  Civic Engagement: Dependent Variables
   3.2  DMA region Fixed Effects
4  Bayesian Analysis of Hierarchical and Multilevel Models
   4.1  Interest in Politics
        4.1.1  Multilevel model
   4.2  Overall Political Knowledge/Information
        4.2.1  Multilevel model
   4.3  Ability/Willingness to Place Obama
        4.3.1  Multilevel Model
   4.4  Validated Voter Turnout
        4.4.1  Multilevel Model
   4.5  Conclusion: Project A

III  Analyses: Identification of Local Candidates — A Study Representative of DMA regions (Project B)
5  The Pattern across DMA regions and its Explanations
   5.1  How many people can recognize the candidates?
   5.2  Is There a Pattern to the Variation at the DMA region Level?
   5.3  Individual-Level Attributes
   5.4  Explaining the DMA region-level Fixed Effects

IV  Report Conclusion
References

List of Figures

1  Map highlighting the nine Midwest designated market areas (DMA regions) analyzed in this study. From the Northwest to the Southeast, these DMA regions are: Minneapolis—St. Paul (MN), Madison (WI), Milwaukee (WI), Chicago (IL), Champaign&Sprngfld–Decatur (IL), Lansing (MI), Detroit (MI), Cleveland-Akron (Canton) (OH), and Columbus, OH (OH). The black boundaries on the maps outline Congressional districts in the 109th Congress. Black dots highlight Congressional districts that intersect the 9 studied DMA regions.
2  Both maps show the representation ratio of the CCAP sample within each of the 210 Designated Market Areas (DMA regions) in 2007. Reds reflect DMA regions that are overrepresented in the CCAP sample. Blues reflect DMA regions that are underrepresented in the CCAP sample. Darker colors reflect larger degrees of over- or under-representation. Panel (b) is a cartogram that deforms the US map such that each DMA region's area is proportional to the size of its adult (18+) population. Alaska's two DMA regions are shown to the Southwest of Texas. The Hawaii DMA region is shown to the Southeast of Texas.
3  Both maps show the weighted representation ratio of the CCAP sample within each of the 211 Designated Market Areas (DMA regions) in 2007. In these plots the CCAP data are weighted to correct for the oversample of battleground states and for other incidental demographic imbalances. The color scale is the same as in Figure 2.
4  Histograms of representation ratios for the CCAP sample across DMA regions. In Panel (a) the ratios are based on the unweighted CCAP observations. In Panel (b) the ratios are based on the weighted CCAP sample. The weights adjust for CCAP's battleground-state oversample and for demographic imbalances.
5  Registration Status of CCAP respondents in voter files, by state
6  Scatterplot Matrix of Fixed Effects
7  Number of Independently Owned TV Stations by Fixed Effects
8  Number of Independent Radio Stations by Fixed Effects
9  Number of Multiple-TV Station Parents by Fixed Effects
10  Number of Parents Owning at least one Television and one Radio Station by Fixed Effects
11  Number of Parents Owning TV and Radio Stations and a Newspaper by Fixed Effects
12  Fraction of Households that Subscribe to 200 KBS Internet Service by Fixed Effects
13  Log of Total Presidential Campaign Advertising GRPs by Fixed Effects, 2008
14  Number of Radio Parents with News/Talk Format by Fixed Effects
15  Log of Population over 18 by Fixed Effects
16  Scatterplot Matrix of Level Two Variables
17  Estimates and 95% Credible Intervals, DMA region-specific offsets (αj), multilevel ordinal logistic regression model for respondent self-reported levels of interest in politics. The estimates have been sorted from low to high and by construction have zero mean.
18  Estimates and 95% Credible Intervals, DMA region-specific offsets (αj), multilevel regression model for political information scores. The estimates have been sorted from low to high and by construction have zero mean.
19  Estimates and 95% Credible Intervals, DMA region-specific offsets (αj), multilevel logistic regression model for respondent ability/willingness to place Obama on health policy. The estimates have been sorted from low to high and by construction have zero mean.
20  Estimates and 95% Credible Intervals, DMA region-specific offsets (αj), multilevel logistic regression model for validated voter turnout. The estimates have been sorted from low to high and by construction have zero mean.
21  Differences in Ability to Identify Incumbents by DMA regions
22  Ability to Identify Candidates by Congressional District
23  Scatterplot Matrix of DMA region Fixed Effects from Challenger Identification Model and Ownership Variables
24  DMA region-Level Fixed Effects and Number of Independent TV Voices
25  DMA region-Level Fixed Effects and Number of Independent Radio Voices
26  DMA region-Level Fixed Effects and Number of Multi-Ownership Parents
27  DMA region-Level Fixed Effects and Number of Cross-Ownership Parents
28  DMA region-Level Fixed Effects and Gross Ratings Points
29  Explaining DMA region-Level Fixed Effects with Incumbents' Ad Buys
30  Explaining DMA region-Level Fixed Effects with Challengers' Ad Buys

List of Tables

1  Number of CCES respondents by DMA region
2  Zip code matches, CCAP and Vote Validation File from Third-Party Firm
3  Least squares regression analysis of Political Information (mean 0, sd 1) and Political Interest (1, 2, 3), October wave. All models include unreported fixed effects for income levels.
4  Least squares regression analysis of Ability/willingness to locate Obama on health care issue, September wave of CCAP (0,1) and Validated Turnout (0,1). All models include unreported fixed effects for income levels.
5  Estimates of Multilevel Ordinal Logistic Regression Model of Respondent Self-Reported Levels of Political Interest. Entries above the line are estimates of micro-level parameters, β; entries below the line are estimates of DMA-level parameters, γ. τ are threshold parameters in the ordinal logistic regression model. σ is the standard deviation of the error component at the DMA level of the model.
6  Estimates of Multilevel Regression Model of Political Information Scores. Entries above the line are estimates of micro-level parameters, β; entries below the line are estimates of DMA-level parameters, γ.
σ is the standard deviation of the error component at the micro, individual level of the model; ω is the standard deviation of the error component at the DMA level of the model.
7  Estimates of Multilevel Model of Ability/Willingness to Place Obama on Health Care. Entries above the line are estimates of micro-level parameters, β; entries below the line are estimates of DMA-level parameters, γ.
8  Estimates of Multilevel Model of Validated Voter Turnout. Entries above the line are estimates of micro-level parameters, β; entries below the line are estimates of DMA-level parameters, γ.
9  Percent Correctly Identifying Congressional Candidates, 2006
10  Ability to Identify Incumbent Images as Function of DMA region Fixed Effects
11  Ability to Identify Challenger Images as Function of DMA region Fixed Effects
12  Ability to Identify Incumbent Images as Function of Demographics
13  Ability to Identify Challenger Images as Function of Demographics
14  Reliance on Local News for Information about Elections as a Function of Employment
15  Ability to Identify Challenger Images as Function of Demographics and DMA region Fixed Effects
16  Correctly Identifying the Challenger in a District from His/Her Image, DMA region-Level

Abstract

This report investigates whether the structure of media ownership in television markets affects the levels of civic/political engagement and political information of people living within those markets.
The FCC provided data on the structure of market ownership, to which we appended three types of data:

- Proprietary survey data, collected in 2006 (CCES) and 2008 (CCAP)
- Validated turnout information collected from Secretaries of States' offices
- Political advertising data, in the form of gross ratings points for each television media market, purchased from Nielsen, Inc.

The FCC's localism and diversity goals are designed, in part, to provide citizens with enough local information and a sufficient diversity of viewpoints that media consumption increases both the likelihood of participating in politics and people's levels of information about politics. In this study, engagement, participation, and knowledge serve as market-level indicators of success in reaching the localism and diversity goals.

Localism  We measure the effects of localism using 3,000 respondents from a portion of a larger survey fielded during the 2006 midterm elections (Cooperative Congressional Election Study, CCES). This module of the CCES targets 9 Midwestern TV markets covering 63 congressional districts. The focus on midterm elections lets us determine whether media ownership structure is directly related to knowledge about local Members of Congress and the people running against them. Essentially, we ask whether the local media provide citizens with information that helps them learn about candidates for Congress, and whether media market structure has anything to do with the answer. We find that there is variation across media markets in people's knowledge about local candidates, but that these differences are due mainly to features of the congressional district (how much advertising the candidates do in any particular race) and not, in any appreciable way, to the ownership characteristics of the market.

Diversity  We measure the effects of diversity using a 20,000-person, 6-wave nationwide survey fielded during 2007-8 (Cooperative Campaign Analysis Project, CCAP).
We use campaign interest, knowledge of candidates' positions on issues, general levels of political information, and turnout in the election to examine whether media markets are in any way related to citizens' levels of civic or political engagement, and whether the ownership structure of the markets is related to these differences. We find that there is variation in political engagement across media markets, but that these differences cannot be explained by the structural conditions related to ownership.

Part I
Introduction

The FCC has commissioned this investigation into the way that media ownership structure affects political and civic engagement. By ownership structure, we refer here to the number of independent television or radio voices in a given market,1 how many parent corporations own multiple broadcast stations in the market,2 whether any of the broadcast outlets are also owned by a parent company that owns a newspaper in the market,3 the number of unique radio stations with a news or talk format,4 and the fraction of households in the DMA region that subscribe to Internet service of at least 200KBS.5 These measures are meant to convey where the markets align, along an imaginary continuum, in terms of the amount of unique political information that broadcast outlets in the market provide to viewers. Hypothetically speaking, on one extreme is a market with one television or radio station. The people living in this DMA region do not receive very much political information via broadcast media at all. On the other extreme is a market with 20 television or radio stations, all uniquely owned and all providing some unique content about politics. People in this DMA region may be exposed to a lot of political information via the media available to them. Somewhere in the middle is the market with 20 stations, but also a set of four parent companies that own five stations each.
If the coverage across the five stations owned by each conglomerate is the same, then the people living in this market have more opportunities to hear political news, but the chances that they hear unique points of view are diminished relative to people living in the 20-station market described above. Our goal was to exploit this structure in the analyses that follow; however, our inability to relate any of these ownership dimensions to political engagement limits the need to bring this level of nuance to the analyses.

The FCC provided data on the structure of media ownership for television stations, radio stations, and the ownership of newspapers within television markets.6 We use these data to construct our characterization of the market ownership environment using the variables listed above. To these data we appended three additional sources of information. We describe the data in detail in the next section. These data cover the fall of 2006, just before the midterm elections to Congress, and the year leading up to the 2008 presidential election. We use the FCC data on media ownership from 2005 and 2007 to investigate the effects of ownership structure on political engagement in the 2006 and 2008 studies, respectively. Our general approach is to determine whether there is any variation in political engagement or knowledge across markets and to investigate whether we can characterize the variation in terms of the ownership structure of markets.

1 As measured in the FCC-provided dataset named TVMarkets.dta, using the variables called TVVOICES and RADIOVOICES.
2 Variables MULTICOMTVPARENTS and COMRADIOCOMTVPARENTS.
3 Variable NEWSPAPERTVPARENTS.
4 Variable RADIONTPARENTS.
5 Variable BROADBAND200PCT.
6 These data are contained in the file TVMarkets.dta, delivered originally to us on October 22, 2010 and updated by the FCC on December 22, 2010.
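The market-level ownership measures described above can be assembled directly from the FCC file's variables. A minimal sketch follows; the variable names (TVVOICES, RADIOVOICES, etc.) come from TVMarkets.dta as cited in the footnotes, but the two example markets and the way the variables are grouped into a profile are our own illustration, not the report's exact coding:

```python
# Sketch: assembling market-level ownership measures for each DMA.
# Only the variable names come from the FCC file TVMarkets.dta; the
# grouping into a "profile" and the two toy markets are illustrative.

def ownership_profile(market):
    """Summarize one DMA's ownership structure along the dimensions above."""
    return {
        "independent_voices": market["TVVOICES"] + market["RADIOVOICES"],
        "multi_station_parents": market["MULTICOMTVPARENTS"]
                                 + market["COMRADIOCOMTVPARENTS"],
        "cross_ownership_parents": market["NEWSPAPERTVPARENTS"],
        "news_talk_radio_parents": market["RADIONTPARENTS"],
        "broadband_200kbs_share": market["BROADBAND200PCT"],
    }

markets = [
    {"DMA": "One-station market", "TVVOICES": 1, "RADIOVOICES": 0,
     "MULTICOMTVPARENTS": 0, "COMRADIOCOMTVPARENTS": 0,
     "NEWSPAPERTVPARENTS": 0, "RADIONTPARENTS": 0, "BROADBAND200PCT": 0.15},
    {"DMA": "Twenty-station market", "TVVOICES": 12, "RADIOVOICES": 8,
     "MULTICOMTVPARENTS": 2, "COMRADIOCOMTVPARENTS": 1,
     "NEWSPAPERTVPARENTS": 1, "RADIONTPARENTS": 3, "BROADBAND200PCT": 0.40},
]

profiles = {m["DMA"]: ownership_profile(m) for m in markets}
```

In practice the Stata file would be loaded first (e.g., with pandas' read_stata) and each row treated as one market record.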
In no case do we find that the ownership structure of the local media market affects levels of civic or political engagement or knowledge. We conclude that there is significant within-market variation, a portion of which we can explain with individual-level demographics, but that the across-market variation is not explained by the ownership structure of the market. Although there is between-market variation in civic interest, it appears to be driven by two things: the local political context in the DMA region and the level of Internet penetration in the DMA region. Specifically, we measure the local political context at the DMA region level with the total advertising Gross Ratings Points purchased at the market level for the election in question. There are market-level factors that drive political engagement and participation, but they are not tied to the ownership structure of the market to any appreciable degree.

1 Data

1.1 The Cooperative Election Studies

The Cooperative Congressional Election Study (CCES) and the Cooperative Campaign Analysis Project (CCAP) belong to the family of election studies known as the Cooperative Election Studies. These projects bring together political scientists, economists, communication scholars, sociologists, and psychologists from around the world in order to pool financial resources for the purposes of running an election study with a large amount of statistical power (through the acquisition of tens of thousands of cases). Each research team buys into the project in exchange for the opportunity to field original content to a portion of the study's respondents. Vavreck served as the study director for the first cooperative project in 2006 (the CCES) (Vavreck and Rivers 2008) and, along with Jackman, served as principal investigator of the CCAP in 2008 (Jackman and Vavreck 2009). We use these two datasets in our analysis of media ownership and political engagement.
The data were gathered by the survey research firm YouGov/Polimetrix, Inc. of Palo Alto, California. Interviews were conducted via the Internet. The data are representative of target populations as described below. Details on the process by which panelists are recruited and samples are made to be representative can be found in Vavreck and Rivers (2008) and Jackman and Vavreck (2010).7

7 Studies using data from each of these projects have been published across a wide variety of peer-reviewed journals in Political Science, including the profession's most prominent journal, the American Political Science Review.

Figure 1: Map highlighting the nine Midwest designated market areas (DMA regions) analyzed in this study. From the Northwest to the Southeast, these DMA regions are: Minneapolis—St. Paul (MN), Madison (WI), Milwaukee (WI), Chicago (IL), Champaign&Sprngfld–Decatur (IL), Lansing (MI), Detroit (MI), Cleveland-Akron (Canton) (OH), and Columbus, OH (OH). The black boundaries on the maps outline Congressional districts in the 109th Congress. Black dots highlight Congressional districts that intersect the 9 studied DMA regions.

CCES  The subset of the CCES data with which we work was commissioned by teams from the University of California, Los Angeles (UCLA) and the University of Wisconsin. These data cover nine Midwestern Designated Market Areas (DMA regions) and include a total of 3,000 respondents.8 The UCLA/Wisconsin project was specifically designed to track media effectiveness. The teams asked a number of questions about people's media habits but also asked about things citizens are likely to learn from local media. The University of Wisconsin team was particularly interested in the effects of local television news on attitudes and information, while the UCLA team was most interested in the effects of local campaigns. The project, therefore, contains unique questions asking respondents to identify pictures of their local candidates for U.S.
House and their sitting Members of Congress. Tying these measures to structural factors within the market provides a rare opportunity to assess whether geography — in terms of media markets — affects local political knowledge at all, and if it does, to what extent the ownership environment structures this effect. The nine DMA regions and the Congressional districts that they include are shown in Figure 1. The data were constructed to be representative of the general population with respect to a number of demographic and political variables and were gathered in October of 2006, in the weeks leading up to the 2006 midterm Congressional elections (see Vavreck and Rivers (2008) for a complete description of the project). The nine Midwestern DMA regions range from large markets such as Chicago (3rd largest in adult population) and Detroit (11th largest), to mid-sized markets, Columbus (33rd largest) and Milwaukee (34th largest), to smaller markets, Champaign (82nd largest) and Lansing (111th largest).9 The DMA regions cover, in whole or in part, 65 Congressional districts. While most of these 65 districts intersect with only one of the nine DMA regions, eight districts intersect two of the DMA regions.10 A breakdown of the number of CCES respondents by DMA region is given in Table 1. Table 1 also shows the size of the adult population in each of the nine DMA regions and enumerates the Congressional districts that intersect each DMA region. The final column of the table shows the representation ratio of the CCES sample with respect to each DMA region. If the CCES sample sizes for each DMA region had been in perfect proportion to the size of the DMA region, each of these ratios would be one. Ratios larger than one are associated with DMA regions in which the CCES sample over-represents the DMA region, while ratios less than one are associated with DMA regions that are underrepresented in the CCES.
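The representation ratio in Table 1's final column is simply a DMA region's share of the (weighted) sample divided by its share of the adult population. A minimal sketch; the function name is ours, and the Madison figures and column totals below are taken or derived from Table 1:

```python
# Sketch: the representation ratio reported in Table 1 -- a DMA's share of
# the (weighted) sample divided by its share of the adult population.

def representation_ratio(dma_n, total_n, dma_pop, total_pop):
    """Ratio > 1: DMA over-represented; ratio < 1: under-represented."""
    sample_share = dma_n / total_n
    population_share = dma_pop / total_pop
    return sample_share / population_share

# Madison, WI: 236.00 weighted respondents out of 2,989.21 total weighted
# respondents (summing Table 1's weighted column); adult population 699,538
# out of 22,369,458 across the nine DMAs (summing the population column).
madison = representation_ratio(236.00, 2989.21, 699_538, 22_369_458)
```

Rounding `madison` to two decimals recovers the 2.52 reported for Madison in Table 1.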
The deviations from proportionality in the DMA region-level sample sizes are not cause for serious concern because we will be inferring differences among the DMA regions rather than aggregate features of these nine DMA regions as a whole. However, it is reassuring that most of the representation ratios fall between 0.75 and 1.25. Conveniently, the most over-represented DMA regions are those that are the smallest in population. Thus, we have many more respondents to leverage in the study of the two DMA regions in which we have the smallest number of observations (Madison and Champaign–Springfield–Decatur) than would have been generated by a sample that was perfectly representative of the DMA regions.

8 The nine Nielsen markets are: Champaign&Sprngfld-Decatur, Chicago, Cleveland-Akron (Canton), Columbus, OH, Detroit, Lansing, Madison, Milwaukee, and Minneapolis/St. Paul.
9 The markets are ranked by their 2005 eighteen-and-older population as provided by the variable POP18PLUS in the file tvmarkets.dta furnished by the FCC.
10 These districts are Michigan's 7th and 8th; Ohio's 4th, 5th, and 18th; and Wisconsin's 1st, 2nd, and 3rd.

Designated Market Area             Observations    Adult        Representation
  (Congressional districts)        (Weighted)      Population   Ratio

Champaign&Sprngfld-Decatur, IL     164 (166.93)       736,831   1.70
  (IL 15,17,18,19)
Chicago, IL                        685 (680.51)     7,079,787   0.72
  (IL 1–11, 13–16; IN 1,2)
Cleveland-Akron (Canton), OH       491 (485.38)     2,966,440   1.22
  (OH 4,5,9,10–18)
Columbus, OH                       271 (272.82)     1,701,532   1.20
  (OH 4,5,7,12,15,18)
Detroit, MI                        479 (476.31)     3,766,926   0.95
  (MI 7–15)
Lansing, MI                         77 (76.53)        512,131   1.12
  (MI 7,8)
Madison, WI                        234 (236.00)       699,538   2.52
  (WI 1–3)
Milwaukee, WI                      260 (258.00)     1,682,118   1.15
  (WI 1,2,4–6)
Minneapolis—St. Paul, MN           336 (336.73)     3,224,155   0.78
  (MN 1–8; WI 3,7)

Table 1: Number of CCES respondents by DMA region

CCAP  The CCAP data consist of 20,000 impaneled observations representative of registered voters with respect to a number of demographic and political variables.11 Registered voters from 15 states in which the 2008 Presidential election was expected to be closely contested, and in which extensive campaign activity was likely to take place, were over-sampled.12 This "battleground" oversample can be corrected for by applying sampling weights and is not a cause for concern in this study because we are seeking to make comparisons among media markets, and only 47 of the 210 media markets include territory from both battleground and non-battleground states.13 The data collection began in December of 2007 and continued across 5 additional panel waves in January, March, September, October, and November of 2008. The CCAP data contain questions on general civic responsibilities such as turnout in elections, knowledge of candidates' positions on issues, overall levels of general political information, and interest in public affairs. Once we merge these data with the FCC data on media markets, we are able to place 19,159 respondents in 210 unique Nielsen media markets (DMA regions) and 279 unique radio metro areas.14 CCAP respondents hailed from all 210 DMA regions that existed in 2007. As we would expect, small DMA regions such as Glendive (MT) and Harrisonburg (VA) have small numbers of respondents (1 and 4, respectively), but all of the DMA regions have at least one respondent. The largest numbers of respondents are found in the New York, Los Angeles, Tampa—St. Petersburg—Sarasota, Philadelphia, and Chicago DMA regions, with 691, 680, 573, 569, and 526 respondents respectively. The median DMA region had 40 CCAP respondents, and 50 percent of DMA regions have between 18 and 110 CCAP respondents. Also as expected, the Tampa—St. Petersburg—Sarasota DMA region has more respondents than larger markets such as Boston, Dallas—Fort Worth, Washington (DC), and Atlanta, because the CCAP battleground oversample led to disproportionately more Florida respondents and fewer respondents from non-battleground states such as Texas, Georgia, and Washington (DC).

Figure 2 shows representation ratios for the CCAP sample for the 210 DMA regions. DMA regions shaded in blue are under-represented in CCAP, whereas DMA regions shaded in red are over-represented. Darker shades of red or blue reflect larger degrees of over- or under-representation, respectively. DMA regions from Alaska and Hawaii appear to the Southwest and Southeast of Texas, respectively. Some of Alaska is not shown because those areas of Alaska were not assigned to any DMA region in the data provided to us by the FCC. The representation ratios ranged from an under-representation of 1 to 8.25 in the St. Joseph (MO) DMA region to an over-representation of 2.7 to 1 in the Marquette, MI DMA region. This means that the one St. Joseph respondent represents 8 times as many adults in the DMA region as the average CCAP respondent. Conversely, each of the 41 Marquette DMA region CCAP respondents represents about one-third as many residents of their DMA region as the average CCAP respondent. Ninety percent of DMA regions have representation ratios that fall between 1 to 2.4 and 1.8 to 1. Because of the oversample of battleground states, the median DMA region representation ratio is 1 to 1.09. The effect of the battleground oversample can be seen in Figure 2. Most of the electorally competitive Midwestern states, along with Florida, Colorado, New Mexico, Nevada, and Oregon, are shaded in red (over-represented). The states that were electorally uncompetitive in the Presidential election, such as those of the deep South, New York, and California, are shaded in blue (under-represented).

The overall sense of the distribution of representation ratios given in Panel (a) of Figure 2 is somewhat misleading because there are large differences in population size across DMA regions and those differences are only weakly related to the geographic extent (area) of the DMA region (ρ = 0.2). Thus, Panel (a) over-emphasizes the smaller DMA regions, in which the representation ratios are inherently less stable (DMA regions in which only a few respondents are to be expected). The map presented in Panel (b) of Figure 2 deforms the DMA region map such that each DMA region remains in the same relative location (maintains all of its neighbors), but the area of each DMA region is adjusted to be proportional to the population of the DMA region. Displayed in this way, DMA regions like Los Angeles and New York become much larger while the DMA regions of the mountain West are compressed. Once the larger DMA regions are emphasized, we see much less variation in the degree of over- or under-representation across DMA regions.

11 See Appendix A for a complete description of the sampling attributes of CCAP.
12 These "battleground" states were selected based upon the likely competitiveness within each state and are: CO, FL, IA, ME, MI, MN, NC, NH, NM, NV, OH, OR, PA, WI, and WV.
13 In these 47 DMA regions, residents are not usually evenly split among the battleground and non-battleground states. Rather, large majorities of these DMA regions' populations are generally either battleground or non-battleground residents.
14 We drop 841 cases from the analysis due to an inability to place the respondent in a single DMA region or radio metro area. Our conversations with FCC representatives Tracy Waldon and Jonathan Levy on December 22, 2010 about acquiring shape-files to place the remaining respondents in DMA regions led to the conclusion that they were not worth using or purchasing for these purposes.
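The population-versus-area relationship cited above (ρ = 0.2) is an ordinary Pearson correlation computed across DMA regions. A minimal sketch of that computation; the toy values are illustrative, not the report's data:

```python
# Sketch: the weak population-area relationship (rho ~ 0.2) is a plain
# Pearson correlation across DMAs. Toy inputs below are for illustration.
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Applied to the DMA-level population and land-area columns, a value near 0.2 would indicate exactly the weak association described above: geographically large markets are not necessarily populous ones.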
Two clear remaining outliers are New York, which is under-represented, and Palm Springs, which is over-represented. Once variation due to small DMA regions is de-emphasized, the role of the battleground oversample becomes even more clear. Figure 3 presents representation ratios based on the number of weighted CCAP respondents. These weights correct for the oversample of voters from battleground states and for some demographic imbalances. Once weighted, the CCAP sample becomes more representative of the DMA regions. Ninety percent of the weighted representation ratios fall between 1:1.7 and 1.6:1. Weighting to adjust for the battleground-state oversample, the median representation ratio across DMA regions is 1.01:1. The regional pattern of over- and under-sampling becomes much less pronounced, and the frequency with which the darkest shades of red and blue appear on the DMA region map is greatly reduced. Interestingly, even after weighting, the battleground-state DMA regions appear to be somewhat overrepresented in the sample relative to non-battleground DMA regions.

(a) Designated Market Areas (DMA regions)   (b) DMA regions distorted so that their areas are proportional to their adult populations

Figure 2: Both maps show the representation ratio of the CCAP sample within each of the 210 Designated Market Areas (DMA regions) in 2007. Reds reflect DMA regions that are overrepresented in the CCAP sample. Blues reflect DMA regions that are underrepresented in the CCAP sample. Darker colors reflect larger degrees of over- or under-representation. Panel (b) is a cartogram that deforms the US map such that each DMA region's area is proportional to the size of its adult (18+) population. Alaska's two DMA regions are shown to the Southwest of Texas. The Hawaii DMA region is shown to the Southeast of Texas.
(a) Designated Market Areas (DMA regions) (b) DMA regions distorted so that their areas are proportional to their adult populations

Figure 3: Both maps show the weighted representation ratio of the CCAP sample within each of the 210 Designated Market Areas (DMA regions) in 2007. In these plots the CCAP data are weighted to correct for the oversample of battleground states and for other incidental demographic imbalances. The color scale is the same as in Figure 2.

(a) Representation ratios (b) Weighted representation ratios

Figure 4: Histograms of representation ratios for the CCAP sample across DMA regions. In Panel (a) the ratios are based on the unweighted CCAP observations. In Panel (b) the ratios are based on the weighted CCAP sample. The weights adjust for CCAP's battleground-state oversample and for demographic imbalances.

The effect that applying the weights has on the DMA region representation ratios can be seen more clearly in the two histograms presented in Figure 4. In the left panel, we see a bimodal distribution of ratios, reflecting the separate populations of over-sampled battleground-state DMA regions and under-sampled non-battleground-state DMA regions. Once the weights are applied, correcting for the oversample (among other things), the distribution of representation ratios becomes unimodal and centered at 1. A few large outliers remain (and a few outliers are exaggerated by the weighting); however, these are largely confined to small-population DMA regions (as can be verified by examining the lower panel of Figure 3).

1.2 Validated Turnout Data from U.S. Secretaries of States

To the 2008 survey data described above we append an indicator for whether the Secretary of State in the respondent's state of residence validated, through state-based records of participation, that the respondent was registered to vote and cast a ballot in the respective election. These data are publicly available in most states.
States vary in terms of the costs associated with obtaining these data. Our validated turnout data were gathered by the survey research firm YouGov/Polimetrix, Inc. The process of actually doing the matching and validating is itself interesting and warrants some brief elaboration. Vavreck and Jackman were the PIs of CCAP and, in effect, clients of YG/PMX. As such, information sufficient to identify the respondents on a voter file is not delivered to the PIs and is not their intellectual property. Like almost any survey research firm, YG/PMX makes quite stringent guarantees of privacy to its panelists. Aware of these circumstances, we did not seek (nor were we offered) identifying information such as respondent names and addresses from YG/PMX. Accordingly, the PIs were "firewalled" from the vote validation process. YG/PMX sent a file of identifying information (e.g., names, addresses, dates of birth) to a third-party firm that specializes in vote validation.15 YG/PMX received a file with turnout history for the respondents whom the contractor was able to match in its extensive databases.16 Ninety-seven out of 20,000 CCAP respondents appear twice in the voter file data returned to us by YG/PMX. In these instances we chose the voter file record with an entry for the 2008 general election; in every instance this choice is unambiguous. For another three CCAP respondents we have no corresponding records in the vote validation file, and we presume that no match could be found for these respondents.

                                                n      %   On File  Unmatched  Unregistered
  5-digit zip match                        15,127   75.6    15,127          0             0
  No zip on validation file                 3,236   16.2        35      2,603           598
  Zips do not match                         1,599    8.0     1,599          0             0
  Bad zip from YG/PMX,
    good zip from validation file              31    0.2        31          0             0
  Bad zip from YG/PMX,
    no zip from validation file                 7    0.0         0          5             2

Table 2: Zip code matches, CCAP and Vote Validation File from Third-Party Firm.
YG/PMX did provide the respondents' zip codes (at least as reported by the respondents to YG/PMX); YG/PMX also forwarded the zip code found by the third-party vote validation firm. Table 2 summarizes the congruence between the zip codes provided by the YG/PMX respondents and the zip codes found by the vote validation firm.

15This is consistent with YG/PMX's relationship with its panelists, in that YG/PMX was not selling the data, nor disclosing anything about the panelists other than that they might possibly be YG/PMX panelists. In fact, the passage of data between YG/PMX and its contractors is subject to a non-disclosure agreement; under the terms of this agreement the contractor returns the registration and vote history data and destroys the data set of identifying information received from YG/PMX. On the respondent side, YG/PMX's agreement with its panelists allows it to perform matching with data from third parties who agree to abide by YG/PMX's privacy policies.

16Beyond the voter files, vote-matching firms also use consumer databases to help ascertain residency and registration if someone is not in the voter files; e.g., if a person with a certain set of identifying characteristics can be found in the consumer databases, but not on the voter file, then this is strong evidence that this person is not a registered voter (the alternative hypothesis is that the respondent is "real", but the voter file records are incomplete or contain errors sufficient to make a match impossible).

For three-quarters of our (ostensibly) registered voter sample, the 5-digit zip supplied by the respondent corresponds with the 5-digit zip on the match found by the validation contractor. For 16.2% of our respondents, the firm could not find a matching record or a zip code in the voter files; for about one-fifth of these cases (n = 598), the firm had enough information to identify the individual and conclude that this person was not on a voter file, and hence designated the person as "unregistered".
For a small number of individuals (n = 35), the firm did find the individual in the voter files, but did not return the zip code from the voter file (presumably because the particular voter file did not have that information). For 1,600 respondents (8%), the zip code provided to YG/PMX by the respondent does not match the zip code on the matching record on the voter file found by the validating firm. Intriguingly, there are 31 cases where the respondent provided a "bad" zip code (e.g., fewer than five digits in a state without a "leading zero" in its zip codes); nonetheless, in each of these cases the firm was able to find a matching record on the voter files with a "good" zip code. Finally, there are another 7 cases with "bad" zip codes from the respondent; 5 of these respondents could not be found in the voter files, and for 2 of these 7 cases enough information was found to conclude that the person was in fact not registered.

In 600 out of 20,000 cases (3%), we conclude that, contrary to the respondent's assertion, the respondent was in fact not a registered voter. In another 2,603 cases (13%), the respondent could not be found on a state voter file. Of course, this alone does not prove that the respondent is not a registered voter; but a professional data-matching firm with one of the best-maintained collections of voter files in the country utilized a good deal of identifying information from YG/PMX (at least name, address, gender, and date of birth), drew on other sources as well, and still could not find a matching entry on any state voter file for these respondents. This leads us to conclude that these respondents are not registered to vote.
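The five zip-code congruence categories of Table 2 amount to a small decision rule. The Python sketch below is our reconstruction for exposition only (the function and its "good zip" test are hypothetical; the actual logic used by YG/PMX and its contractor was surely more involved):

```python
# Our reconstruction of the Table 2 categories, for exposition only.
# "Good" here simply means a five-digit zip as reported; the real
# validation logic was more involved than this sketch.

def zip_category(respondent_zip, validation_zip):
    """Classify a (respondent zip, validation-file zip) pair per Table 2."""
    def good(z):
        return z is not None and len(z) == 5 and z.isdigit()

    if not good(respondent_zip):
        if good(validation_zip):
            return "bad zip from YG/PMX, good zip from validation file"
        return "bad zip from YG/PMX, no zip from validation file"
    if validation_zip is None:
        return "no zip on validation file"
    if respondent_zip == validation_zip:
        return "5 digit zip match"
    return "zips do not match"

print(zip_category("90024", "90024"))  # 5 digit zip match
print(zip_category("90024", "10001"))  # zips do not match
```

Applying such a rule to each respondent and tabulating the results yields the row counts reported in Table 2.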
Rates of matching reveal that YG/PMX is able to track respondents across state lines; these are almost all cases where (a) the respondent moved across state lines after completing a YG/PMX profile survey at the time of initial recruitment and subsequently updated their voter registration records, or (b) the respondent gave erroneous state-of-residence information to YG/PMX. The voter file data appear to exclude Nevada; the "state of registration" variable returned to us has zero cases for Nevada, and on the YG/PMX side, the only matches for the 213 respondents thought to reside in Nevada are from 16 respondents matched on voter files in other states, or identified on consumer files as residing elsewhere.

Nevada aside, rates of matching our ostensibly registered voter sample on voter files display considerable variation across states. In Mississippi, just 67 of 100 CCAP panelists were found on voter files from that state. Similarly low rates come from Wyoming (21 out of 31, or 67.7%), the District of Columbia (29 out of 40, or 72.5%), and Alaska (38 out of 52, or 73.1%). States with high rates of matching include the Dakotas (SD: 46 out of 51 respondents, or 90.2%; ND: 40 out of 43 respondents, or 93%) and Montana (53 out of 55 respondents, or 96.4%). Across the entire set of 20,000 respondents, 84.0% were found on state voter files, and another 600 respondents (3%) were determined to be unregistered. Figure 5 provides a graphical display of the data, dropping the problematic case of Nevada. Most of the state-specific rates of verified voter registration for CCAP respondents are clustered around the average of 84%.

Figure 5: Registration Status of CCAP respondents in voter files, by state.
1.3 Nielsen Gross Ratings Points Data for Political Advertising

In addition to the validated vote information from the voter files described above, we contracted with the Nielsen Corporation to provide data on the advertising gross ratings points purchased and aired by political candidates or parties running for any office in each DMA region in 2006 and running for president in 2008. A gross rating point (GRP) is a measure of the penetration of an advertisement into its targeted market. For example, one GRP indicates that one percent of the targeted population saw the ad in question one time.17 The GRP data on political ad buys are an important control in a study of civic or political engagement that aims to explain the effect of characteristics of media markets unrelated to politics (ownership structure). The ads that candidates run in their DMA regions are meant to increase political participation, engagement, and knowledge among the population. If variation in GRPs is correlated with variation in ownership structure at the DMA region level, then without controls for political advertising we run the risk of concluding that market structure affects civic engagement when in fact it was the political advertising that was driving political interest, participation, and knowledge, not the other attributes of the markets. Nielsen, Inc. provided these data beginning August 1 of each election year under investigation and running through election day. The data are broken out by candidate, day, creative, and outlet. The data we use here are limited to broadcast television outlets in the market. We collapse the data to the level of candidates in 2006 because we are interested in controlling for the effects of advertising on knowledge about each Congressional candidate over the course of the campaign.
In 2008, we collapse the ad data for all candidate- or party-sponsored political advertising for the presidency in each market in order to control for the amount of candidate-provided political information that was being broadcast to voters during the 2008 presidential campaign in any given market.

2 Civic Engagement and Political Information

Political scientists have long been interested in questions of civic engagement, political participation, and political knowledge. Figuring out how citizens acquire the information needed to discharge the duties of citizenship is a large part of the investigations of many prominent political scientists (Rosenstone, Hansen and Reeves 2003; Wolfinger and Rosenstone 1980; Delli Carpini and Keeter 1997; Lupia and McCubbins 1998; Gerber and Green 2000; Verba, Schlozman and Brady 1995). The media have not been overlooked in these investigations. But, while the media play a role in shaping the kinds of things people think about (Iyengar and Kinder 1988) and the criteria on which voters evaluate candidates (Iyengar and Kinder 1988), the main drivers of participation are well established as individual-level covariates such as age, race, gender, education level, income, marital status, employment status, and home ownership (Rosenstone, Hansen and Reeves 2003; Wolfinger and Rosenstone 1980; Verba, Schlozman and Brady 1995).

17Technically, GRPs are defined as an advertisement's "REACH" multiplied by its "FREQUENCY". Thus there are different ways to arrive at 10 GRPs, for example. One possibility is that 10% of the population saw the ad a single time. Another possibility is that one percent of the population saw the ad 10 times.
Recently, a productive line of field experimentation has demonstrated the positive and robust effects of canvassing and leafleting to remind voters that an election is coming up (Gerber and Green 2000), of shaming voters into participating by publishing the names of those who do not vote in newspapers or newsletters (Gerber, Green and Larimer 2008), and of non-partisan cable TV advertising, such as that done by groups like Rock the Vote (Green and Vavreck 2008). These field tests have shown effects that range from 3 points for the cable TV study to more than 10 points for the shaming study. The extant literature on participation suggests that both individual-level attributes and campaign efforts at stimulation will be important drivers of political engagement.

With respect to media and market structures, what can we say about their role in fostering civic engagement? Communication studies scholars have shown that despite the increases in available political information (through cable TV news and increased outlets), political knowledge and participation have not increased dramatically (Sunstein 2002; Prior 2007, 2005). Sunstein proposes that people's increasing ability to customize their political information will have an unexpected effect on democracy because people are simply less likely to encounter political news as a byproduct of turning on the television. This is not unlike Sam Popkin's conception of the byproduct theory of information (Popkin 1991). Popkin and Sunstein suggest that even those who are uninterested in politics may come across some snippets of political information simply because they are watching television or listening to the radio. Take, for example, a typical household in the 1960s. If the television was on in the evening, chances are some form of news was part of the programming on all three broadcast networks. People could not escape the news in this setting.
In contrast, today it is easy to escape all forms of news when watching television; in fact, people's ability to expose themselves to only the type of television programming they like is impressive and has led to an entire industry devoted to understanding market segmentation and micro-targeting. Narrowcasting has replaced broadcasting. The implication of the Sunstein argument is that greater choice allows politically interested people to access more information and increase their already impressive amounts of political knowledge, while people who prefer pure entertainment or sports to politics can avoid being exposed to political information altogether. Markus Prior (2005) demonstrates that this pattern exists and that the gaps in knowledge and participation between those who prefer news on television and those who prefer entertainment shows are widening over time. He attributes this to the increasing choice over programming in the media environment. For our purposes, if choice breeds a widening gap between types of viewers in terms of participation and knowledge, we may be able to see the pattern in markets in which there are many broadcast outlets compared to those in which there are fewer. Of course, cable and satellite television provide ample numbers of choices for respondents across media markets, so perhaps the pattern will be difficult to detect given our focus on the effects of broadcast stations alone; however, we note that cable programming is typically not local in orientation.
We present the evidence on whether the structure of ownership affects civic engagement in the following manner:

Structure — We operationalize structure as it relates to unique information providers in the market by investigating the role of:

- the number of independently owned television stations in the market,
- the number of radio stations in the market,
- the number of parent entities that own more than one television station in the market,
- the number of parent entities that own at least one television station and a radio station in the market,
- the number of parent entities that own at least one television station, a radio station, and a newspaper in the market,
- the number of Internet service providers providing service at 200 kbps, and
- the number of unique radio stations with a news or talk format in the market.18

Engagement — We operationalize civic engagement in two forms, participation and knowledge. In terms of participation, we use the information from the vote validation in 2008 to investigate the drivers of turnout in the presidential election. In terms of knowledge, using the 2008 data we investigate people's overall levels of general political information, their ability to place the candidates on important political issues relevant to the campaign, and their level of interest in public affairs generally. In 2006, we analyze people's ability to identify the candidates running for local office (Congress).

We begin with the 2008 study, which we refer to as "Project A," and from there move to the 2006 evidence, "Project B." In each study, we proceed in the following manner:

- Start with simple models of engagement using only the individual-level attributes that have been shown to drive participation and knowledge.
- Add fixed effects for the DMA regions.
- Assess whether any of the variation in engagement is explained by the DMA region indicators. If so, we attempt to explain that structure.
- Recover the appropriate regression/logistic parameters describing the fixed effects for the markets and treat those as the "biggest possible net effects" due to market-level attributes.
- Individually demonstrate the relationship between the structural characteristics of markets (number of voices, cross-ownership features, etc.) and the fixed-effect parameters for each measure of engagement.
- Model engagement hierarchically using all the market-level indicators to assess the role of structural factors across the measures of engagement when all indicators are allowed to work simultaneously.

18The exact names of the variables and the datasets from which they were drawn are listed on page X.

We turn now to Project A, which focuses on people's levels of interest, knowledge, participation, and uncertainty regarding politics and candidates.

Part II

Analyses: Political Interest, Knowledge, Uncertainty, and Participation (Project A)

3 Analyses of Variation

In Project A we examine four general indicators of civic engagement: levels of interest in politics and current affairs, overall levels of political knowledge, people's willingness to place the candidates on important campaign issues (their level of uncertainty about the candidates' positions), and turnout in the 2008 presidential election. We use the 2008 CCAP data, which cover 210 DMA regions as described earlier. For each of the four dependent variables, we begin the analyses with the approach described above: we use the fixed effects from a model with basic demographics and DMA region indicators, and we attempt to model the structure of those fixed effects with the variables detailing the ownership characteristics of markets. From there, we move to a full hierarchical model allowing the effects of covariates to change based on characteristics of the markets.
3.1 Civic Engagement: Dependent Variables

We use the following four measures as dependent variables tapping into civic engagement:

Interest — We begin with a general measure of interest in politics. We asked people, "How interested are you in politics and current affairs?" People could answer in three decreasingly interested categories (very much, somewhat, not that much).

Willingness to Place Candidates on Issues — We also asked people to place the candidates on important political issues that were being discussed during the 2008 presidential campaign. One of those issues was health care reform. We asked people:

Which comes closest to Barack Obama's view about providing health care in the United States? (Choose one of the following)

- The Government should provide everyone with health care and pay for it with tax dollars.
- Companies should be required to provide health insurance for their employees and the government should provide subsidies for those who are not working or retired.
- Health insurance should be voluntary. Individuals should either buy insurance or obtain it through their employers as they do currently. The elderly and the very poor should be covered by Medicare and Medicaid as they are currently.
- I'm not sure, I haven't thought much about this.

We take the health care question and dichotomize it, giving people a one if they will not place Barack Obama on health care and a zero if they will place him at a position. This "unwillingness" to place the candidate on an important campaign issue is best thought of as a measure of uncertainty about the candidate's position on the issue. These kinds of uncertainty measures are known to be robust predictors of vote choice and favorability (Bartels 1986; Vavreck 2009), and they correlate in predictable ways with demographic attributes such as education, age, gender, and income.19
19People who were not asked this question are dropped from the analysis, as they cannot be considered unwilling or too uncertain to place the candidate.

Political Knowledge — In order to characterize people's overall level of political sophistication or knowledge more generally, we asked a series of 12 questions that we use together in a scale. Ten of these questions take the same form: they ask people to place a prominent politician or business person in one of three possible jobs — a Member of the House of Representatives, a Senator, or neither of those things. People's responses are coded in binary fashion as either correct or incorrect. We asked about the following ten people: John Dingell, Nancy Pelosi, Bill Gates, John Boehner, Susan Collins, Henry Waxman, Jon Kyl, Dennis Kucinich, Patrick Leahy, and Ted Kennedy. Additionally, we asked people to choose which job Condoleezza Rice held (from four choices) and whether people could correctly identify why Guantánamo Bay had been in the news lately (from four choices). There is some interesting variation in the item difficulties, with the item about Senator Susan Collins being the hardest (only 39% of respondents got that one right in a setup where 33% could get it right by guessing). The item about Bill Gates, CEO of Microsoft Corporation, is the easiest — 97% of respondents got that right. To form the scale we use an Item Response Theory (IRT) model.
We pass this set of binary-coded items to the following item-response theory model, a model widely used in the social sciences to recover a latent measure from a set of binary items:

$$\pi_{ij} = \Pr(y_{ij} = 1 \mid \xi_i, \beta_j, \alpha_j) = F(\xi_i \beta_j - \alpha_j) \tag{1}$$

where

- $y_{ij} \in \{0, 1\}$ is the $i$-th subject's answer to the $j$-th item (e.g., $y_{ij} = 1$ if correct, $y_{ij} = 0$ if incorrect), where $i = 1, \ldots, n$ indexes respondents and $j = 1, \ldots, m$ indexes items;
- $\xi_i \in \mathbb{R}$ is an unobserved attribute of subject $i$ (typically considered ability in the test-taking context, or ideology in the analysis of legislative data);
- $\beta_j$ is an unknown parameter tapping the item discrimination of the $j$-th item, the extent to which the probability of a correct answer responds to change in the latent trait $\xi_i$;
- $\alpha_j$ is an unknown item difficulty parameter, tapping the probability of a correct answer irrespective of levels of political information;
- $F(\cdot)$ is a monotone function mapping from the real line to the unit probability interval, typically the logistic or normal CDF.

A one-parameter version of the model results from setting $\beta_j = 1, \forall j$; i.e., items vary in difficulty, but not in terms of their discrimination. This version is often called a Rasch model. Connections between IRT models for binary indicators and the factor analysis model for continuous indicators have been noted by Takane and de Leeuw (1987) and Reckase (1997).

The statistical problem here is inference for $\xi = (\xi_1, \ldots, \xi_n)'$, $\beta = (\beta_1, \ldots, \beta_m)'$ and $\alpha = (\alpha_1, \ldots, \alpha_m)'$. We form a likelihood for the binary data by assuming that given $\xi_i$, $\beta_j$ and $\alpha_j$, the binary responses are conditionally independent across subjects and items; this assumption is called "local independence" in the argot of IRT. That is,

$$L = \prod_{i=1}^{n} \prod_{j=1}^{m} \pi_{ij}^{y_{ij}} (1 - \pi_{ij})^{1 - y_{ij}} \tag{2}$$

where $\pi_{ij}$ is defined in equation 1. The model parameters are unidentified.
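As a concrete illustration of equations (1) and (2), the following Python sketch (ours, not the report's code; it takes $F$ to be the logistic CDF and uses hypothetical data) evaluates the two-parameter model's response probabilities and log-likelihood:

```python
import math

# Illustrative sketch of the two-parameter IRT model in equations (1) and (2),
# taking F to be the logistic CDF. The xi, beta, alpha values below are
# hypothetical; the report's actual estimation was Bayesian, via MCMC.

def p_correct(xi, beta, alpha):
    """Equation (1): Pr(y_ij = 1) = F(xi_i * beta_j - alpha_j)."""
    return 1.0 / (1.0 + math.exp(-(xi * beta - alpha)))

def log_likelihood(y, xi, beta, alpha):
    """Log of equation (2): local independence turns products into sums."""
    ll = 0.0
    for i in range(len(xi)):
        for j in range(len(beta)):
            p = p_correct(xi[i], beta[j], alpha[j])
            ll += y[i][j] * math.log(p) + (1 - y[i][j]) * math.log(1.0 - p)
    return ll

# Two hypothetical respondents answering two items:
y = [[1, 0], [1, 1]]     # 1 = correct, 0 = incorrect
xi = [0.5, 1.5]          # latent knowledge (the trait being recovered)
beta = [1.0, 1.2]        # item discrimination
alpha = [0.0, 0.8]       # item difficulty
print(log_likelihood(y, xi, beta, alpha))
```

Raising a respondent's $\xi_i$ raises the probability of a correct answer on items with positive discrimination, which is the sense in which $\xi_i$ measures latent political knowledge.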
For instance, any linear transformation of the $\xi_i$ can be offset by appropriate linear transformations of the $\beta_j$ and $\alpha_j$; an obvious case is scale invariance, in which $\pi_{ij} = F(\xi_i \beta_j - \alpha_j)$ is indistinguishable from the model with $\pi_{ij}^* = F(\xi_i^* \beta_j^* - \alpha_j)$, where $\xi_i^* = c\xi_i$ and $\beta_j^* = \beta_j / c$, $c \neq 0$. A special type of rotational invariance arises with $c = -1$. Any two linearly independent restrictions on the latent traits are sufficient for at least local identification, in the sense of Rothenberg (1971); a typical example is a mean-zero, unit-variance restriction on the $\xi_i$, while setting at least one pair of $(\beta_j, \alpha_j)$ item parameters to fixed values is one way of obtaining global identification. Here we impose the identifying restriction that the latent traits (the $\xi_i$) have mean zero and standard deviation one across respondents.

We use a Gibbs sampler to generate a Monte Carlo based exploration of the posterior density of the model parameters $\Theta = (\xi', \beta', \alpha')'$. By Bayes Rule, this posterior density is

$$p(\Theta \mid y) \propto p(y \mid \Theta)\, p(\Theta)$$

where $p(y \mid \Theta)$ is the likelihood defined above (equation 2) and $p(\Theta)$ is a set of prior densities. We assume a priori independence across all elements of $\Theta$; specifically, $\xi_i \sim N(0, 1)$, $\beta_j \sim N(0, 5^2)$ and $\alpha_j \sim N(0, 5^2)$, each iid across $i$ and $j$, respectively. Note also that post-estimation, we impose the identifying restriction that the Bayes estimates of the $\xi_i$ (the means of their respective marginal posterior densities) have mean zero and standard deviation one across respondents; this normalization is functionally equivalent to imposing some additional prior structure on the parameters. The Gibbs sampler is implemented in the ideal function in the pscl package in R; further details appear in Jackman (2009, section 9.3). We use the resulting scores from this item response model as a dependent variable assessing people's underlying level of general political knowledge or sophistication.
Turnout — Finally, as described in Subsection 1.2 above, we acquired validated turnout information from Secretaries of State offices. We append these data to our survey data and use them as our final measure of civic engagement.

3.2 DMA region Fixed Effects

We begin the analysis with a purely exploratory investigation of the relationship between DMA regions and engagement. We are interested in decomposing the variation in engagement into within-DMA region and between-DMA region components. The DMA region fixed effects give us an indication of how much of the variation in each of the engagement dependent variables comes from between-DMA region variation. The individual-level demographics explain the within-DMA region variation. It is worth noting at this point that we do not view these analyses as structural models for interpreting the effects of market-level features. We are, at this point, merely interested in discovering whether accounting for DMA region-level variation increases the overall level of explained variation in each of the engagement variables. Because of this, we estimate the models for the variance decomposition using a simple least squares linear probability model (LPM) across all of the dependent variables, even the dichotomous and ordered ones. For political interest, willingness to place Obama on health care, and turnout, the LPM will not be appropriate when we move to a structural model of the market ownership frameworks; logistic regression will replace the LPM at that point. In other words, for this exploratory analysis, we disregard levels of measurement, fitting linear regressions irrespective of whether we have a continuous, binary, or ordinal dependent variable. Our goal at this stage is merely to understand the structure of the data.
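This variance-decomposition step can be sketched in a few lines. The Python function below is ours, for illustration only (the report's computations were done in R); it exploits the fact that regressing $y$ on a full set of group dummies simply fits each group's mean:

```python
# Illustrative sketch, not the report's code: the r-squared obtained from
# regressing y on a full set of "fixed effect" dummies for a grouping
# variable (state, DMA region, birth year, ...). OLS on a full set of group
# dummies is equivalent to fitting each group's mean.

def fe_r_squared(y, groups):
    """Share of variance in y explained by group indicators alone."""
    means = {}
    for g in set(groups):
        vals = [yi for yi, gi in zip(y, groups) if gi == g]
        means[g] = sum(vals) / len(vals)
    ybar = sum(y) / len(y)
    rss = sum((yi - means[gi]) ** 2 for yi, gi in zip(y, groups))  # residual SS
    tss = sum((yi - ybar) ** 2 for yi in y)                        # total SS
    return 1.0 - rss / tss

# Hypothetical toy data: y is constant within group, so the group dummies
# explain all of the variation (r-squared of 1, up to rounding). In practice
# the groups would be DMA region labels and y an engagement measure.
print(fe_r_squared([1, 1, 2, 2], ["a", "a", "b", "b"]))
```

Comparing the r-squared from demographics alone against the r-squared after adding such dummies is exactly the comparison reported in Tables 3 and 4.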
Specifically, all we want to know at this stage is how much variation in a given y can be accounted for by a set of "fixed effects" for DMA regions, versus, say, other plausible sources of variation in the data (e.g., fixed effects for state, fixed effects for birth year, etc.). We define a function that we use repeatedly in this exploratory mode of analysis: a function that regresses a given y on various fixed effects and reports the resulting r². We ask the following questions: How much of the variation in each of the engagement dependent variables is explained by DMA region-level fixed effects alone? And what do the demographics add? We present these results in Table 3 and Table 4.

There appears to be modest structure to the civic engagement markers based on DMA region indicators. In other words, we conclude there is modest between-DMA region variation to explain in these data. But most of the variation explained comes from within-DMA region indicators that take the form of individual-level demographics, including: gender, race (black, white, hispanic, other), education (college or not), age, and income (5 levels). These demographics explain 19% of the variation in overall levels of political information.

Table 3: Least squares regression analysis of Political Information (mean 0, sd 1) and Political Interest (1, 2, 3), October wave. All models include unreported fixed effects for income levels.
                    Info 1    Info 2    Interest 1   Interest 2
  Female            −0.41     −0.41     −0.18        −0.18
                    (0.01)    (0.01)    (0.01)       (0.01)
  Black             −0.23     −0.26     −0.05        −0.07
                    (0.02)    (0.02)    (0.02)       (0.02)
  Hispanic          −0.20     −0.26     −0.03        −0.09
                    (0.03)    (0.03)    (0.02)       (0.02)
  Other Race        −0.05     −0.07      0.01        −0.02
                    (0.03)    (0.03)    (0.02)       (0.02)
  College            0.46      0.44      0.18         0.18
                    (0.02)    (0.02)    (0.01)       (0.01)
  Age (Years/10)     0.09      0.09      0.06         0.06
                    (0.00)    (0.00)    (0.00)       (0.00)
  DMA Fixed Effects  No        Yes       No           Yes
  r²                 0.19      0.22      0.12         0.16
  N                 19,650    19,650    16,404       16,404

  Standard errors in parentheses.

Adding DMA region indicators increases the percent of variation explained to 22%; the between-DMA region indicators increase explanatory power by 3 points. Similar trends exist for political interest (12% explained by demographics alone and 16% with DMA region indicators) and the respondent's willingness to place Obama on health care (8% rising to 13% with the markets accounted for). For turnout, however, we see an opposite trend. Only 4% of the variation in validated turnout is explained by the individual-level characteristics of respondents. This increases to 10% when the DMA region indicators are added. Here, the DMA region fixed effects increase the percent of variation in turnout that is explained by the model by 6 points. While the size of this effect is not that different from the size of the increases due to DMA region fixed effects for the other dependent variables, the level of this effect — being larger than the percent of explained variation from the demographics — is what draws our attention.

Despite the modest levels of between-DMA region variation in engagement, we can ascertain whether the fixed effects are related to one another across the four dependent variables measuring engagement. If the fixed effect estimates are related across the dependent variables, it opens the possibility that a common factor is driving the variation across all the engagement measures. We present these comparisons in Figure 6.
Figure 6: Scatterplot Matrix of Fixed Effects

Table 4: Least squares regression analysis of Ability/willingness to locate Obama on health care issue, September wave of CCAP (0,1), and Validated Turnout (0,1). All models include unreported fixed effects for income levels.

                     Placement 1   Placement 2   Turnout 1   Turnout 2
Female                 −0.10         −0.10          0.00        0.01
                       (0.01)        (0.01)        (0.01)      (0.01)
Black                   0.01         −0.01          0.00        0.02
                       (0.01)        (0.01)        (0.01)      (0.01)
Hispanic               −0.06         −0.09         −0.04       −0.05
                       (0.01)        (0.01)        (0.01)      (0.01)
Other Race             −0.01         −0.02         −0.01       −0.02
                       (0.02)        (0.02)        (0.01)      (0.01)
College                 0.14          0.13          0.06        0.06
                       (0.01)        (0.01)        (0.01)      (0.01)
Age (Years/10)          0.02          0.02          0.03        0.03
                       (0.00)        (0.00)        (0.00)      (0.00)
DMA Fixed Effects       No            Yes           No          Yes
R²                      0.08          0.13          0.04        0.10
N                      15,509        15,509        16,929      16,929
Standard errors in parentheses

Figure 6 is a scatterplot matrix showing how the coefficients on the DMA region fixed effects for each measure of engagement are related to one another. These coefficients can be thought of as representing the exceptionalism of each DMA region on the dependent variable under analysis, net of individual-level predictors. In other words, these offsets can be thought of as the “remainders” in each DMA region, mapping directly onto the dependent variable and due specifically to something occurring in the physical space of that market. The first column shows the relationships involving the fixed effects from the information model, the second column is for interest in politics, the third is willingness to place Obama on healthcare, and the last column is turnout. The rows are ordered similarly. The lower triangle plots the offsets against one another; as such, there are roughly 210 datapoints in each square. The upper triangle reports the Pearson correlation between the two sets of estimates. The purpose of this analysis is to search for patterns of any functional form between the DMA region offsets on different measures of engagement.
The method does not require us to have theoretical expectations about whether and how the DMA region-level offsets might be related to one another. We are looking for patterns. What might some of these patterns look like? If the estimates were perfectly correlated, we would expect to see all of the datapoints on a line. This type of relationship would suggest that something at the DMA region level was driving the observations of both dependent variables in the same manner. To the extent that there is no relationship between the estimates, we would expect to see datapoints scattered throughout the squares. This type of relationship would suggest no structure to the DMA region-level offsets across the dependent variables: the markets may very well differ from one another on one measure, but the way that they differ bears no relationship to the way they differ on other measures, net of individual-level predictors. This lack of patterning would suggest no common driver of political engagement at the DMA region level; it does not, however, rule out the possibility that different DMA region-level factors affect different measures of engagement. To assist readers in visualizing these relationships, we fit a loess smoother, drawn in red, in each square.[20] The loess curves show slight positive associations between the market offsets across measures of engagement. The strongest relationship is between political interest and willingness to place Obama on healthcare. This association indicates that something ordering the markets from high to low levels of political interest also orders them similarly from willing to unwilling to place Obama. The offsets for interest are also highly correlated with the offsets for political information more generally. The offsets for turnout, in contrast, seem to have little relationship to any of the other market-level offsets (with the exception of political information).
Net of individual-level predictors, the relationships among these market offsets suggest an underlying structure at the market level that is worth further exploration. We attempt to define this underlying structure by taking each dependent variable and asking whether the ownership structure of the DMA region has any relation to these offsets and their patterns of correlation. We do this by presenting similar scatterplot matrices relating the market-level offsets for each measure of engagement to each of our measures of market ownership. Once more, we evaluate the non-parametric loess curves, searching for any indication of a pattern in the data. In Figure 7 and Figure 8 we present, separately, the relationships between the fixed effects and the number of independent commercially owned television and radio stations. In both cases, the relationships across all four dependent variables are flat. The data span most of the X-axis, but there is no clear pattern to the effects of increasing market voices, whether television or radio, on political interest, knowledge, uncertainty, or participation.

[20] Loess, sometimes called locally weighted scatterplot smoothing, combines much of the simplicity of linear least squares regression with the flexibility of nonlinear regression. It does this by fitting simple models to localized subsets of the data to build up a function that describes the deterministic part of the variation in the data, point by point. One of the chief attractions of this method is that the data analyst is not required to specify a global function of any form to fit to the data, only to fit segments of the data.
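The loess idea described in the footnote, fitting a simple model to a weighted local neighborhood around each point, can be sketched in a few lines. This is our own minimal local-linear variant with tricube weights, not the exact weighted-quadratic, bandwidth-0.8 fit used in the figures; the function name and data are illustrative.

```python
import numpy as np

def loess_1d(x, y, frac=0.8):
    """Minimal loess sketch: local linear fits with tricube weights,
    evaluated at each observed x. frac sets the share of points in
    each neighborhood, analogous to a bandwidth setting."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    k = max(2, int(np.ceil(frac * len(x))))
    fitted = np.empty_like(x)
    for i, x0 in enumerate(x):
        d = np.abs(x - x0)
        h = np.sort(d)[k - 1] or 1.0                    # neighborhood radius
        w = np.clip(1.0 - (d / h) ** 3, 0.0, 1.0) ** 3  # tricube weights
        sw = np.sqrt(w)                                 # weighted least squares
        A = np.column_stack([np.ones_like(x), x - x0])
        beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        fitted[i] = beta[0]                             # local fit at x0
    return fitted

# Sanity check: a local linear smoother reproduces exactly linear data.
x = np.linspace(0.0, 1.0, 60)
y_line = 2.0 * x + 1.0
smooth = loess_1d(x, y_line, frac=0.8)
```

Because no global functional form is imposed, the same routine traces out flat, monotone, or curved patterns equally well, which is why it suits the pattern-hunting described above.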
Figure 7: Number of Independently Owned TV Stations by Fixed Effects

Figure 8: Number of Independent Radio Stations by Fixed Effects

The only slightly interesting movement is in the relationship between radio stations and political interest at the low end of the scale, where increasing numbers of stations are associated with decreasing levels of interest in politics (remember that interest is coded in reverse), but this trend is not clear enough, even with this large amount of data, to support any inferences. If ownership is related to the structure of these DMA region-level effects, it will have to come from multiple- and cross-ownership and not from the sheer numbers of voices. Looking at Figure 9, Figure 10, and Figure 11, however, we see that these trends are flat as well. Whether we examine multiple television station ownership, television-radio cross-ownership, or this cross-ownership plus a newspaper, there is simply no pattern to the relationships. The FCC provided data on the penetration, at the market level, of Internet usage. In Figure 12 we present the percentage of households with Internet access faster than 200 kilobits per second by the DMA region-level fixed effects. While these trends show more shape than the ownership-structure patterns, they are essentially presenting equally null relationships. There may be a slight association between Internet penetration rates and the fixed effects for validated turnout, but the movement is slight and only for the lower half of the penetration scale. Finally, using Nielsen data on the total number of gross ratings points purchased by the presidential candidates in each DMA region, we look for patterns between these ad-buy levels and the DMA region offsets.
The measures of GRP in each DMA region are simply the sum of all the political advertising purchased in a given DMA region by any presidential candidate, or by a party on behalf of a presidential candidate, from August 1, 2008 to Election Day. These data are presented in Figure 13. The patterns are not striking, but each of the loess lines slopes downward, suggesting that areas with a lot of presidential advertising are areas with low levels of interest, turnout, information, and knowledge. This is not surprising when one considers that the design of this test is not experimental; we cannot conclude that increased advertising leads to declines in participation, for example, since candidates decide where to place these ads strategically, and they may be placing ads in exactly those places where people are less engaged and less likely to participate.

[20, continued] In this case, we fit a weighted quadratic least squares regression over the span of values on the y-axis using a bandwidth setting of .8.

Figure 9: Number of Multiple-TV Station Parents by Fixed Effects

We also examined the relationship between the number of parents owning radio stations with a news or talk format and the DMA region offsets, as well as the size of the adult population and the offsets. We see the familiar flat line for the news/talk radio relationship, but a slight hint of a downward-trending relationship for adult population size. Thus the non-ownership measures (Internet penetration, advertising, and adult population) gave us the strongest patterns, although all the relationships were weak in general. These results led us to analyze the relationships among all of these DMA region-level factors.
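Examining the relationships among DMA region-level factors amounts to computing a Pearson correlation matrix over the market-level indicators. The sketch below uses simulated stand-ins for the indicators (names and values are ours, not the FCC or Nielsen data), built so that the voice counts co-move with population while Internet penetration and ad buys do not.

```python
import numpy as np

# Hypothetical DMA region-level indicators, one row per market; real
# values come from the FCC ownership data and Nielsen GRP data.
rng = np.random.default_rng(1)
n = 210
log_pop = rng.normal(13.0, 1.0, n)                   # log adult population
radio_voices = 5.0 + 2.0 * log_pop + rng.normal(0, 1, n)
tv_voices = 2.0 + 1.0 * log_pop + rng.normal(0, 1, n)
internet = rng.uniform(0.2, 0.8, n)                  # penetration share
grps = rng.uniform(0, 20000, n)                      # presidential ad buys

M = np.column_stack([np.exp(log_pop), radio_voices, tv_voices, internet, grps])
R = np.corrcoef(M, rowvar=False)                     # Pearson correlation matrix
```

The off-diagonal entries of `R` are the values that would populate the upper triangle of a figure like Figure 16; by construction here, the two voice counts are strongly correlated while the penetration and advertising columns are not.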
We suspected that many of the market ownership indicators were highly correlated with the size of the adult population; in fact, by definition, some of these measures must be highly correlated with independent voices because they are transformations of the voices count based on FCC rules and regulations. We present the relationships among these indicators in Figure 16. The correlations in Figure 16 can be read in two ways: by examining the Pearson correlations in the upper triangle or by focusing on the loess lines in the lower triangle. Either way, the dependencies are obvious. Adult population, in particular, is correlated with news/talk radio parents at .76 and with radio and TV voices at .84. In contrast, Internet penetration and presidential advertising do not seem to be highly correlated with the market ownership structure. This correlation among ownership factors is likely to make it difficult, even with the heightened power of our 20,000-respondent survey, to find strong effects. We turn our attention now to a set of multilevel and hierarchical models that estimate the overall effect of ownership structure on civic engagement, even though we did not find that much of the variation across DMA regions was explained by these indicators.

4 Bayesian Analysis of Hierarchical and Multilevel Models

Descriptions of the specific multilevel models we fit appear below. Generally, we fit models of the sort

E(y_i | x_i) = x_i β + α_{j(i)}   (3)
V(y_i | x_i) = σ²_ε,   (4)

where i indexes observations and j indexes DMA regions; i.e., respondent i is located in DMA region j(i). Regression, logistic regression, and ordinal logistic regression are the particular cases we encounter below. The models we fit here have a hierarchical component, sometimes taking the simple form

α_j ∼ N(0, σ²_α)   (5)

but of greater interest are multilevel models of the sort

α_j ∼ N(z_j γ, σ²_α),   (6)

where z_j is a vector of DMA region-level covariates. The densities used in the hierarchical component need not be normal; here we focus on normal densities, since α_j ∈ R and the normal is a convenient choice when modeling continuous-valued quantities without restrictions on their support. Note that it is straightforward to introduce covariates into the model for the mean of the normal via the familiar linear, additive regression model. We adopt a Bayesian approach to inference for these models; Bayesian analysis of hierarchical and multilevel models is now well established as one of the preferred methods for dealing with what classical statistics and econometrics would refer to as “random coefficients” or “varying coefficients” models. Other branches of the social sciences sometimes refer to models of this sort as “mixed” models. Discussion and further details appear in Gelman and Hill (2007) or Jackman (2009, Ch 7). In the Bayesian approach to inference, we seek to characterize the posterior density of the model parameters θ ∈ Θ ⊆ R^m. By Bayes Rule, the posterior density of θ is

p(θ | data) ∝ p(data | θ) p(θ),

where p(data | θ) is the likelihood function for the data and p(θ) is the prior density of θ. Note immediately that with a “vague” or approximately locally-constant prior density, the prior is absorbed into the constant of proportionality in Bayes Rule, and the posterior density has the same shape as the likelihood function. Moreover, in large samples, posterior densities for many parameters are approximately normal; the symmetry of the normal in turn ensures that the mean of a (marginal) posterior density will often be close to the maximum likelihood estimate. For this reason, many practitioners have adopted Bayesian approaches out of convenience, especially in light of the computational details discussed in the next paragraph, as a means of generating MLEs “by any other means”.
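The claim that a flat prior leaves the posterior with the likelihood's shape can be checked numerically on a grid. The toy normal-mean example below is ours: with a flat prior, the posterior mode lands on the maximum likelihood estimate (the sample mean).

```python
import numpy as np

# Toy data: 50 draws from a normal with unknown mean and known unit variance.
rng = np.random.default_rng(2)
y = rng.normal(1.5, 1.0, size=50)

mu = np.linspace(-2.0, 5.0, 701)                 # grid over the unknown mean
# Log-likelihood of each candidate mean (known variance 1).
loglik = np.array([-0.5 * np.sum((y - m) ** 2) for m in mu])
prior = np.ones_like(mu)                         # flat ("vague") prior

post = np.exp(loglik - loglik.max()) * prior     # Bayes Rule, up to a constant
post /= post.sum()                               # normalize on the grid

mu_mode = mu[post.argmax()]                      # posterior mode ~ MLE
```

Grid evaluation is only practical in one or two dimensions; with the 200-plus DMA region parameters of the hierarchical models, this is exactly why simulation methods like the Gibbs sampler (discussed next) are needed instead.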
Hierarchical models give rise to many parameters; e.g., in this case we have over 200 DMA region-specific parameters α_j. Modern approaches to characterizing high-dimensional posterior densities make use of the Gibbs sampler, an algorithm that samples successively from the lower-dimensional conditional densities that together constitute some joint density of interest (e.g., a high-dimensional posterior density). Details of the mechanics of the Gibbs sampler appear in many places: see Robert and Casella (2004) or Jackman (2009, Chapter 5). Of great practical significance is that Gibbs sampling for hierarchical models (and other Bayesian models) is easily implemented using freely available software, e.g., JAGS (Plummer 2010) or OpenBUGS/WinBUGS (Spiegelhalter et al. 2003). In fact, it is no exaggeration to observe that the Gibbs sampler and its implementation in user-customizable programs lie behind the surge of interest in and utilization of hierarchical models over the last decade or two.

4.1 Interest in Politics

We model this variable with an ordinal logistic regression model, utilizing the micro-level covariates introduced earlier. Let the ordinal self-report of level of political interest be y_i ∈ {1, 2, 3} and let x_i be a vector of covariates, i = 1, …, n. The ordinal response model presumes that the observed responses are an interval-censored version of a latent continuous variable:

y_i = 1 ⟺ y*_i ≤ τ_1
y_i = 2 ⟺ τ_1 < y*_i ≤ τ_2
y_i = 3 ⟺ y*_i > τ_2,

where y*_i = x_i β + α_{j(i)} + ε_i is a regression model relating the covariates to the latent response, with β a vector of unknown parameters and ε_i ∼ Logistic, giving an ordinal logit model. The term α_{j(i)} is an unobserved term specific to the j-th DMA region; the notation j(i) simply denotes the index of the DMA region in which respondent i is located. The thresholds τ_1 < τ_2 are also unobserved parameters. A common parameterization (which we adopt here) omits an intercept term from x_i, which means both thresholds can be estimated.
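The interval-censoring scheme just described implies category probabilities that can be computed directly. The sketch below is our own illustration; the coefficient, offset, and threshold values are made up, and the logistic CDF plays the role of the error distribution.

```python
import numpy as np

def logistic_cdf(z):
    """CDF of the standard logistic distribution (the F of the text)."""
    return 1.0 / (1.0 + np.exp(-z))

def interest_probs(xb, alpha_j, tau1, tau2):
    """P(y = 1, 2, 3) under the latent-variable ordinal logit:
    y* = x'beta + alpha_j + eps, eps ~ Logistic, thresholds tau1 < tau2."""
    eta = xb + alpha_j
    p1 = logistic_cdf(tau1 - eta)                       # y* <= tau1
    p2 = logistic_cdf(tau2 - eta) - logistic_cdf(tau1 - eta)  # tau1 < y* <= tau2
    p3 = 1.0 - logistic_cdf(tau2 - eta)                 # y* > tau2
    return np.array([p1, p2, p3])

# Made-up values: linear predictor 0.4, a DMA region offset of -0.2,
# thresholds at -1 and 1.
p = interest_probs(0.4, -0.2, -1.0, 1.0)
```

By construction the three probabilities sum to one, and raising the linear predictor shifts mass toward the highest interest category.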
With these definitions, we can define the likelihood for the model as follows. The probabilities of each of the three outcomes are:

Pr[y_i = 1] = Pr[y*_i ≤ τ_1] = Pr[x_i β + α_{j(i)} + ε_i ≤ τ_1] = Pr[ε_i ≤ τ_1 − x_i β − α_{j(i)}] = F[τ_1 − x_i β − α_{j(i)}]
Pr[y_i = 2] = Pr[τ_1 < y*_i ≤ τ_2] = F[τ_2 − x_i β − α_{j(i)}] − F[τ_1 − x_i β − α_{j(i)}]
Pr[y_i = 3] = Pr[y*_i > τ_2] = 1 − F[τ_2 − x_i β − α_{j(i)}],

where F is the logistic cumulative distribution function.

                               Estimate   (SE)
> 1 Station                      .11     (.084)
Parents with TV, Radio, News    −.022    (.078)
Constant                         .612†   (.248)
R² = 0.84    N = 9
Note: Cell entries are OLS regression estimates on 9 DMA region observations. Significance levels: †: 10%; ∗: 5%; ∗∗: 1%.

In terms of being able to recognize local candidates, specifically the challenger in a Congressional election, the only significant DMA region-level driver is that candidate’s advertising penetration as measured by Nielsen gross ratings points.31 Moving from no advertising in a market to a typical 1000 GRP ad buy (100% of the target audience sees the ad 20 times) boosts the probability of being able to recognize the candidate from his or her image by 20 points, from .19 to .39.32 Even with controls for the number of independent television voices, the number of parent companies owning more than one TV station, and the number of parent companies owning TV, radio, and newspaper outlets, the effect of advertising remains strong; in fact, it is virtually unchanged. These findings are consistent with other work in communication and political science demonstrating the effectiveness of political advertisements relative to news coverage in educating voters (Vavreck 2009; Gilens, Vavreck and Cohen 2007). When it comes to providing the information necessary to discharge the duties of citizenship, it appears that television is doing its part: through advertising, however, not through the ownership structure of the news environment.
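The reported advertising effect is straightforward to reproduce from the model's scale, since GRPs are denominated in units of 100 so a 1000-GRP buy is 10 units. The arithmetic below treats the fit as a linear-probability calculation; the coefficient is backed out from the reported .19-to-.39 shift rather than copied from the table.

```python
# Back-of-the-envelope check of the reported challenger-recognition effect.
baseline = 0.19                 # predicted recognition with no advertising
grp_units = 10                  # a 1000-GRP buy, with GRPs in units of 100
implied_coef = (0.39 - baseline) / grp_units   # ~.02 per 100 GRPs

predicted = baseline + implied_coef * grp_units
```

The same arithmetic applied to the footnoted incumbent figures (.52 to .58) implies a much smaller per-GRP coefficient, consistent with the text's contrast between challengers and incumbents.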
Part IV Report Conclusion

We conclude that while there is a pattern to civic and political engagement in media markets across the country, we are unable to explain this pattern with market-ownership indicators like TV and radio voices or multiple- and cross-ownership. In terms of political and civic engagement, there is some within-DMA region variation that can be explained at the individual level; beyond that, what drives participation, learning, knowledge, and interest in politics and public affairs seems to be the media context as cultivated by the candidates in the market through paid advertising (or other efforts with which paid advertising is correlated, like a ground campaign or direct mail effort), and sometimes the level of Internet penetration in the market. The advertising effect is complicated by the fact that any political messaging is likely to have electioneering as its goal, which means that one side is trying to persuade voters to do one thing while the other is trying to persuade them to do the opposite. To the extent that we can show any effects of political advertising effort at all in these analyses without accounting for the content of the ads, we are likely underestimating the effects of the ads, since in most cases the competing messages are likely to cancel each other out. The clear and direct effect of challenger advertising on respondents' ability to recognize images of the candidate demonstrates the power of television to affect political knowledge and engagement. That we cannot show this kind of effect for television news, as measured by number of voices or conglomerate cross-ownership, suggests that the intensity of political television advertising is so great that its effects drown out any effects of multiple- or cross-ownership.

31 In these models, GRPs are denominated in units of 100.
32 For incumbents, the movement is a much smaller 6 points, from .52 to .58.

References

Bartels, L.M. 1986.
“Issue voting under uncertainty: An empirical test.” American Journal of Political Science pp. 709–728.

Delli Carpini, M.X. and S. Keeter. 1997. What Americans Know About Politics and Why It Matters. Yale University Press.

Gelman, Andrew and Jennifer Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. New York: Cambridge University Press.

Gerber, Alan S. and Donald C. Green. 2000. “The Effects of Canvassing, Telephone Calls, and Direct Mail on Voter Turnout: A Field Experiment.” American Political Science Review 94(3):653–663.

Gerber, A.S., D.P. Green and C.W. Larimer. 2008. “Social pressure and voter turnout: Evidence from a large-scale field experiment.” American Political Science Review 102(1):33–48.

Gilens, M., L. Vavreck and M. Cohen. 2007. “The Mass Media and the Public’s Assessments of Presidential Candidates, 1952–2000.” Journal of Politics 69(4):1160–1175.

Green, D.P. and L. Vavreck. 2008. “Analysis of cluster-randomized experiments: A comparison of alternative estimation approaches.” Political Analysis 16(2):138.

Iyengar, S. and D.R. Kinder. 1988. News That Matters: Television and American Opinion. University of Chicago Press.

Jackman, S. and L. Vavreck. 2010. “Primary politics: race, gender, and age in the 2008 Democratic primary.” Journal of Elections, Public Opinion & Parties 20(2):153–186.

Jackman, Simon. 2009. Bayesian Analysis for the Social Sciences. Hoboken, New Jersey: Wiley.

Jackman, Simon and Lynn Vavreck. 2009. “The 2007-8 Cooperative Campaign Analysis Project, Release 2.0.”

Lupia, A. and M.D. McCubbins. 1998. The Democratic Dilemma: Can Citizens Learn What They Need to Know? Cambridge University Press.

Plummer, Martyn. 2010. JAGS Version 2.20 manual. URL: http://mcmc-jags.sourceforge.net

Popkin, S.L. 1991. The Reasoning Voter: Communication and Persuasion in Presidential Campaigns. University of Chicago Press.

Prior, M. 2005. “News vs. entertainment: How increasing media choice widens gaps in political knowledge and turnout.” American Journal of Political Science 49(3):577–592.

Prior, M. 2007. Post-Broadcast Democracy: How Media Choice Increases Inequality in Political Involvement and Polarizes Elections. Cambridge University Press.

Reckase, Mark D. 1997. “The Past and Future of Multidimensional Item Response Theory.” Applied Psychological Measurement 21(1):25–36.

Robert, Christian P. and George Casella. 2004. Monte Carlo Statistical Methods. Second ed. New York: Springer.

Rosenstone, S.J., J.M. Hansen and K. Reeves. 2003. Mobilization, Participation, and Democracy in America. Longman.

Rothenberg, Thomas J. 1971. “Identification in Parametric Models.” Econometrica 39(3):577–591.

Spiegelhalter, David J., Andrew Thomas, Nicky G. Best and Dave Lunn. 2003. WinBUGS User Manual Version 1.4. Cambridge, UK: MRC Biostatistics Unit.

Sunstein, C.R. 2002. Republic.com. Princeton University Press.

Takane, Yoshio and Jan de Leeuw. 1987. “On the relationship between item response theory and factor analysis of discretized variables.” Psychometrika 52:393–408.

Vavreck, L. 2009. The Message Matters: The Economy and Presidential Campaigns. Princeton University Press.

Vavreck, Lynn and Douglas Rivers. 2008. “The 2006 Cooperative Congressional Election Study.” Journal of Elections, Public Opinion & Parties 18(4):355–366.

Verba, Sidney, Kay Lehman Schlozman and Henry Brady. 1995. Voice and Equality: Civic Voluntarism in American Politics. Cambridge, MA: Harvard University Press.

Wolfinger, R.E. and S.J. Rosenstone. 1980. Who Votes? Yale University Press.