"""American National Election Survey 1996"""
from numpy import log

from statsmodels.datasets import utils as du

__docformat__ = 'restructuredtext'

COPYRIGHT = """This is public domain."""
TITLE = __doc__
SOURCE = """
http://www.electionstudies.org/

The American National Election Studies.
"""

DESCRSHORT = """This data is a subset of the American National Election Studies of 1996."""

DESCRLONG = DESCRSHORT

NOTE = """::

    Number of observations - 944
    Number of variables - 10

    Variable name definitions::

            popul - Census place population in 1000s
            TVnews - Number of times per week that respondent watches TV news.
            PID - Party identification of respondent.
                0 - Strong Democrat
                1 - Weak Democrat
                2 - Independent-Democrat
                3 - Independent-Independent
                4 - Independent-Republican
                5 - Weak Republican
                6 - Strong Republican
            age - Age of respondent.
            educ - Education level of respondent
                1 - 1-8 grades
                2 - Some high school
                3 - High school graduate
                4 - Some college
                5 - College degree
                6 - Master's degree
                7 - PhD
            income - Income of household
                1  - None or less than $2,999
                2  - $3,000-$4,999
                3  - $5,000-$6,999
                4  - $7,000-$8,999
                5  - $9,000-$9,999
                6  - $10,000-$10,999
                7  - $11,000-$11,999
                8  - $12,000-$12,999
                9  - $13,000-$13,999
                10 - $14,000-$14,999
                11 - $15,000-$16,999
                12 - $17,000-$19,999
                13 - $20,000-$21,999
                14 - $22,000-$24,999
                15 - $25,000-$29,999
                16 - $30,000-$34,999
                17 - $35,000-$39,999
                18 - $40,000-$44,999
                19 - $45,000-$49,999
                20 - $50,000-$59,999
                21 - $60,000-$74,999
                22 - $75,000-$89,999
                23 - $90,000-$104,999
                24 - $105,000 and over
            vote - Expected vote
                0 - Clinton
                1 - Dole
            The following 3 variables all take the values:
                1 - Extremely liberal
                2 - Liberal
                3 - Slightly liberal
                4 - Moderate
                5 - Slightly conservative
                6 - Conservative
                7 - Extremely Conservative
            selfLR - Respondent's self-reported political leanings from "Left"
                to "Right".
            ClinLR - Respondent's impression of Bill Clinton's political
                leanings from "Left" to "Right".
            DoleLR - Respondent's impression of Bob Dole's political leanings
                from "Left" to "Right".
            logpopul - log(popul + .1)
"""

def load_pandas():
    """Load the anes96 data and return a Dataset instance.

    Returns
    -------
    Dataset instance:
        See DATASET_PROPOSAL.txt for more information.
    """
    data = _get_data()
    # Index 10 is the derived ``logpopul`` column appended by _get_data().
    return du.process_pandas(data, endog_idx=5, exog_idx=[10, 2, 6, 7, 8])

def load(as_pandas=None):
    """Load the anes96 data and return a Dataset instance.

    Parameters
    ----------
    as_pandas : bool
        Flag indicating whether to return pandas DataFrames and Series
        or numpy recarrays and arrays. If True, returns pandas.

    Returns
    -------
    Dataset instance:
        See DATASET_PROPOSAL.txt for more information.
    """
    return du.as_numpy_dataset(load_pandas(), as_pandas=as_pandas)

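def _get_data():
    # The data file is whitespace-delimited, hence the regex separator.
    data = du.load_csv(__file__, 'anes96.csv', sep=r'\s')
    data = du.strip_column_names(data)
    # Add the log-population column; the .1 offset keeps log() finite at popul == 0.
    data['logpopul'] = log(data['popul'] + .1)
    return data.astype(float)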