The graph I attempted to reproduce below comes from the FiveThirtyEight article “The National Parks Have Never Been More Popular” by Andrew Flowers, which visualizes the rise in popularity of US national parks over the years. The graph was built from annual visitor counts that date back to 1904. Flowers ranked the national parks by these counts within each year (the park with the most visitors is ranked first). The National Park Service (NPS) publishes several detailed data sets, which can be found here. The data set I used for this visualization exercise is on that NPS page under “Annual Visitation and Record Year by Park (1904-Last Calendar Year)”. Selecting it takes you to a page with three drop-down menus. For “Region(s)”, check “(Select All)”. Next, for “Park(s)”, check “(Select All)”. Finally, for “Park Type”, scroll and check “National Park”. This returns the data needed to replicate the graph below.
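As a rough sketch of the import step, which is not shown here: assuming the NPS export was saved as a CSV, the file could be read without treating its first line as a header, since the str() output further down shows a title row and an empty row sitting above the real header row. The inline text below is a made-up stand-in for the downloaded file (the visitor number is hypothetical), and base R's read.csv() is used only to keep the example self-contained; the tibble classes in the output suggest the original import used readr instead.

```r
# Made-up stand-in for the downloaded NPS export (values are hypothetical)
csv_text <- c(
  "Annual Visitation and Record Year by Park,,,",
  ",,,",
  "Region,Park Name,Park Type,1904",
  "Alaska Region,Denali NP & PRES,National Park,",
  "Intermountain Region,Yellowstone NP,National Park,1000"
)

# header = FALSE keeps the title row and the real header row as data,
# which mirrors the shape of orig_nps; empty cells become NA
raw <- read.csv(text = csv_text, header = FALSE, na.strings = "",
                stringsAsFactors = FALSE)

str(raw)
```

With a real download, `text = csv_text` would be replaced by the path to the saved file.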
library(ggthemes)
library(dplyr)
library(janitor)
library(tidyr) # needed for pivot_longer() below
library(ggplot2) # needed for ggplot() below
str(orig_nps) #Exploring the NPS data a bit with these commands
spc_tbl_ [67 × 122] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ ...1 : chr [1:67] "Annual Visitation and Record Year by Park (1904 - Last Calendar Year)" NA "Region" "Alaska Region" ...
$ ...2 : chr [1:67] NA NA "Park Name" "Denali NP & PRES" ...
$ ...3 : chr [1:67] NA NA "Park Type" "National Park" ...
$ ...4 : num [1:67] NA NA 1904 NA NA ...
$ ...5 : num [1:67] NA NA 1905 NA NA ...
$ ...6 : logi [1:67] NA NA NA NA NA NA ...
$ ...7 : num [1:67] NA NA 1906 NA NA ...
$ ...8 : num [1:67] NA NA 1907 NA NA ...
$ ...9 : num [1:67] NA NA 1908 NA NA ...
$ ...10 : num [1:67] NA NA 1909 NA NA ...
$ ...11 : num [1:67] NA NA 1910 NA NA NA NA NA NA NA ...
$ ...12 : num [1:67] NA NA 1911 NA NA ...
$ ...13 : num [1:67] NA NA 1912 NA NA ...
$ ...14 : num [1:67] NA NA 1913 NA NA ...
$ ...15 : num [1:67] NA NA 1914 NA NA ...
$ ...16 : num [1:67] NA NA 1915 NA NA ...
$ ...17 : num [1:67] NA NA 1916 NA NA ...
$ ...18 : num [1:67] NA NA 1917 NA NA ...
$ ...19 : num [1:67] NA NA 1918 NA NA ...
$ ...20 : num [1:67] NA NA 1919 NA NA ...
$ ...21 : num [1:67] NA NA 1920 NA NA NA NA NA NA NA ...
$ ...22 : num [1:67] NA NA 1921 NA NA ...
$ ...23 : num [1:67] NA NA 1922 7 NA ...
$ ...24 : num [1:67] NA NA 1923 34 NA ...
$ ...25 : num [1:67] NA NA 1924 62 NA ...
$ ...26 : num [1:67] NA NA 1925 206 NA ...
$ ...27 : num [1:67] NA NA 1926 533 NA ...
$ ...28 : num [1:67] NA NA 1927 651 NA ...
$ ...29 : num [1:67] NA NA 1928 802 NA ...
$ ...30 : num [1:67] NA NA 1929 1038 NA ...
$ ...31 : num [1:67] NA NA 1930 951 NA NA NA NA NA NA ...
$ ...32 : num [1:67] NA NA 1931 771 NA ...
$ ...33 : num [1:67] NA NA 1932 357 NA ...
$ ...34 : num [1:67] NA NA 1933 386 NA ...
$ ...35 : num [1:67] NA NA 1934 628 NA ...
$ ...36 : num [1:67] NA NA 1935 877 NA ...
$ ...37 : num [1:67] NA NA 1936 1073 NA ...
$ ...38 : num [1:67] NA NA 1937 1378 NA ...
$ ...39 : num [1:67] NA NA 1938 1487 NA ...
$ ...40 : num [1:67] NA NA 1939 2262 NA ...
$ ...41 : num [1:67] NA NA 1940 1201 NA ...
$ ...42 : num [1:67] NA NA 1941 1688 NA ...
$ ...43 : num [1:67] NA NA 1942 5 NA ...
$ ...44 : num [1:67] NA NA 1943 12 NA ...
$ ...45 : num [1:67] NA NA 1944 0 NA ...
$ ...46 : num [1:67] NA NA 1945 19 NA ...
$ ...47 : num [1:67] NA NA 1946 1134 NA ...
$ ...48 : num [1:67] NA NA 1947 3466 NA ...
$ ...49 : num [1:67] NA NA 1948 4512 NA ...
$ ...50 : num [1:67] NA NA 1949 4831 NA ...
$ ...51 : num [1:67] NA NA 1950 6672 NA ...
$ ...52 : num [1:67] NA NA 1951 7807 NA ...
$ ...53 : num [1:67] NA NA 1952 7310 NA ...
$ ...54 : num [1:67] NA NA 1953 6839 NA ...
$ ...55 : num [1:67] NA NA 1954 5000 NA ...
$ ...56 : num [1:67] NA NA 1955 3400 NA ...
$ ...57 : num [1:67] NA NA 1956 5200 NA ...
$ ...58 : num [1:67] NA NA 1957 10700 NA ...
$ ...59 : num [1:67] NA NA 1958 25900 NA ...
$ ...60 : num [1:67] NA NA 1959 25800 NA ...
$ ...61 : num [1:67] NA NA 1960 22500 NA 900 600 NA NA NA ...
$ ...62 : num [1:67] NA NA 1961 18300 NA ...
$ ...63 : num [1:67] NA NA 1962 16600 NA ...
$ ...64 : num [1:67] NA NA 1963 18400 NA ...
$ ...65 : num [1:67] NA NA 1964 19200 NA ...
$ ...66 : num [1:67] NA NA 1965 21400 NA ...
$ ...67 : num [1:67] NA NA 1966 31300 NA ...
$ ...68 : num [1:67] NA NA 1967 39800 NA ...
$ ...69 : num [1:67] NA NA 1968 33300 NA ...
$ ...70 : num [1:67] NA NA 1969 45400 NA ...
$ ...71 : num [1:67] NA NA 1970 46000 NA 29700 11800 NA NA NA ...
$ ...72 : num [1:67] NA NA 1971 44500 NA ...
$ ...73 : num [1:67] NA NA 1972 88625 NA ...
$ ...74 : num [1:67] NA NA 1973 137300 NA ...
$ ...75 : num [1:67] NA NA 1974 161400 NA ...
$ ...76 : num [1:67] NA NA 1975 160600 NA ...
$ ...77 : num [1:67] NA NA 1976 157600 NA ...
$ ...78 : num [1:67] NA NA 1977 183200 NA ...
$ ...79 : num [1:67] NA NA 1978 223000 NA ...
$ ...80 : num [1:67] NA NA 1979 251105 NA ...
$ ...81 : num [1:67] NA NA 1980 216361 NA ...
$ ...82 : num [1:67] NA NA 1981 256593 NA ...
$ ...83 : num [1:67] NA NA 1982 321868 1381 ...
$ ...84 : num [1:67] NA NA 1983 346082 2138 ...
$ ...85 : num [1:67] NA NA 1984 395099 2440 ...
$ ...86 : num [1:67] NA NA 1985 436545 1381 ...
$ ...87 : num [1:67] NA NA 1986 529749 2801 ...
$ ...88 : num [1:67] NA NA 1987 575013 1060 ...
$ ...89 : num [1:67] NA NA 1988 592431 1258 ...
$ ...90 : num [1:67] NA NA 1989 543640 822 ...
$ ...91 : num [1:67] NA NA 1990 546693 1010 ...
$ ...92 : num [1:67] NA NA 1991 558870 1154 ...
$ ...93 : num [1:67] NA NA 1992 503674 2116 ...
$ ...94 : num [1:67] NA NA 1993 505565 2245 ...
$ ...95 : num [1:67] NA NA 1994 490311 1726 ...
$ ...96 : num [1:67] NA NA 1995 543309 7074 ...
$ ...97 : num [1:67] NA NA 1996 341385 6448 ...
$ ...98 : num [1:67] NA NA 1997 354278 6949 ...
$ ...99 : num [1:67] NA NA 1998 372519 8266 ...
[list output truncated]
- attr(*, "spec")=
.. cols(
.. ...1 = col_character(),
.. ...2 = col_character(),
.. ...3 = col_character(),
.. ...4 = col_number(),
.. ...5 = col_number(),
.. ...6 = col_logical(),
.. ...7 = col_number(),
.. ...8 = col_number(),
.. ...9 = col_number(),
.. ...10 = col_number(),
.. ...11 = col_number(),
.. ...12 = col_number(),
.. ...13 = col_number(),
.. ...14 = col_number(),
.. ...15 = col_number(),
.. ...16 = col_number(),
.. ...17 = col_number(),
.. ...18 = col_number(),
.. ...19 = col_number(),
.. ...20 = col_number(),
.. ...21 = col_number(),
.. ...22 = col_number(),
.. ...23 = col_number(),
.. ...24 = col_number(),
.. ...25 = col_number(),
.. ...26 = col_number(),
.. ...27 = col_number(),
.. ...28 = col_number(),
.. ...29 = col_number(),
.. ...30 = col_number(),
.. ...31 = col_number(),
.. ...32 = col_number(),
.. ...33 = col_number(),
.. ...34 = col_number(),
.. ...35 = col_number(),
.. ...36 = col_number(),
.. ...37 = col_number(),
.. ...38 = col_number(),
.. ...39 = col_number(),
.. ...40 = col_number(),
.. ...41 = col_number(),
.. ...42 = col_number(),
.. ...43 = col_number(),
.. ...44 = col_number(),
.. ...45 = col_number(),
.. ...46 = col_number(),
.. ...47 = col_number(),
.. ...48 = col_number(),
.. ...49 = col_number(),
.. ...50 = col_number(),
.. ...51 = col_number(),
.. ...52 = col_number(),
.. ...53 = col_number(),
.. ...54 = col_number(),
.. ...55 = col_number(),
.. ...56 = col_number(),
.. ...57 = col_number(),
.. ...58 = col_number(),
.. ...59 = col_number(),
.. ...60 = col_number(),
.. ...61 = col_number(),
.. ...62 = col_number(),
.. ...63 = col_number(),
.. ...64 = col_number(),
.. ...65 = col_number(),
.. ...66 = col_number(),
.. ...67 = col_number(),
.. ...68 = col_number(),
.. ...69 = col_number(),
.. ...70 = col_number(),
.. ...71 = col_number(),
.. ...72 = col_number(),
.. ...73 = col_number(),
.. ...74 = col_number(),
.. ...75 = col_number(),
.. ...76 = col_number(),
.. ...77 = col_number(),
.. ...78 = col_number(),
.. ...79 = col_number(),
.. ...80 = col_number(),
.. ...81 = col_number(),
.. ...82 = col_number(),
.. ...83 = col_number(),
.. ...84 = col_number(),
.. ...85 = col_number(),
.. ...86 = col_number(),
.. ...87 = col_number(),
.. ...88 = col_number(),
.. ...89 = col_number(),
.. ...90 = col_number(),
.. ...91 = col_number(),
.. ...92 = col_number(),
.. ...93 = col_number(),
.. ...94 = col_number(),
.. ...95 = col_number(),
.. ...96 = col_number(),
.. ...97 = col_number(),
.. ...98 = col_number(),
.. ...99 = col_number(),
.. ...100 = col_number(),
.. ...101 = col_number(),
.. ...102 = col_number(),
.. ...103 = col_number(),
.. ...104 = col_number(),
.. ...105 = col_number(),
.. ...106 = col_number(),
.. ...107 = col_number(),
.. ...108 = col_number(),
.. ...109 = col_number(),
.. ...110 = col_number(),
.. ...111 = col_number(),
.. ...112 = col_number(),
.. ...113 = col_number(),
.. ...114 = col_number(),
.. ...115 = col_number(),
.. ...116 = col_number(),
.. ...117 = col_number(),
.. ...118 = col_number(),
.. ...119 = col_number(),
.. ...120 = col_number(),
.. ...121 = col_number(),
.. ...122 = col_number()
.. )
- attr(*, "problems")=<externalptr>
summary(orig_nps)
...1 ...2 ...3 ...4
Length:67 Length:67 Length:67 Min. : 563
Class :character Class :character Class :character 1st Qu.: 1250
Mode :character Mode :character Mode :character Median : 1904
Mean : 17513
3rd Qu.: 8314
Max. :101000
NA's :60
...5 ...6 ...7 ...8
Min. : 928 Mode:logical Min. : 700 Min. : 900
1st Qu.: 1200 NA's:67 1st Qu.: 1564 1st Qu.: 1705
Median : 1905 Median : 1853 Median : 2334
Mean : 20408 Mean : 4059 Mean : 4355
3rd Qu.: 14313 3rd Qu.: 3444 3rd Qu.: 3839
Max. :109000 Max. :17182 Max. :16414
NA's :60 NA's :59 NA's :59
...9 ...10 ...11 ...12
Min. : 80 Min. : 165 Min. : 250 Min. : 206
1st Qu.: 1773 1st Qu.: 854 1st Qu.: 2034 1st Qu.: 2637
Median : 2826 Median : 3216 Median : 4194 Median : 4000
Mean : 4964 Mean : 6979 Mean : 17533 Mean : 17788
3rd Qu.: 5275 3rd Qu.: 5968 3rd Qu.: 12214 3rd Qu.: 11418
Max. :19542 Max. :32545 Max. :120000 Max. :130000
NA's :58 NA's :58 NA's :57 NA's :56
...13 ...14 ...15 ...16
Min. : 230 Min. : 280 Min. : 502 Min. : 663
1st Qu.: 2582 1st Qu.: 3290 1st Qu.: 3664 1st Qu.: 6440
Median : 5235 Median : 6253 Median : 7096 Median : 12818
Mean : 18163 Mean : 19847 Mean : 19192 Mean : 26310
3rd Qu.: 9915 3rd Qu.: 13618 3rd Qu.: 15092 3rd Qu.: 33881
Max. :135000 Max. :135000 Max. :125000 Max. :115000
NA's :56 NA's :56 NA's :56 NA's :55
...17 ...18 ...19 ...20
Min. : 1385 Min. : 1917 Min. : 1918 Min. : 1814
1st Qu.: 10335 1st Qu.: 11645 1st Qu.: 9086 1st Qu.: 3000
Median : 14100 Median : 18387 Median : 15496 Median : 25000
Mean : 27209 Mean : 34844 Mean : 33461 Mean : 43042
3rd Qu.: 34005 3rd Qu.: 35400 3rd Qu.: 36000 3rd Qu.: 58362
Max. :118740 Max. :135000 Max. :140000 Max. :169492
NA's :55 NA's :54 NA's :54 NA's :50
...21 ...22 ...23 ...24
Min. : 1920 Min. : 1921 Min. : 7 Min. : 15
1st Qu.: 8665 1st Qu.: 13036 1st Qu.: 9500 1st Qu.: 6431
Median : 30949 Median : 28617 Median : 31177 Median : 41328
Mean : 51136 Mean : 51361 Mean : 50311 Mean : 55210
3rd Qu.: 67111 3rd Qu.: 68661 3rd Qu.: 76509 3rd Qu.: 92675
Max. :240966 Max. :273737 Max. :219164 Max. :218000
NA's :49 NA's :48 NA's :47 NA's :45
...25 ...26 ...27 ...28
Min. : 17 Min. : 206 Min. : 533 Min. : 651
1st Qu.: 8686 1st Qu.: 13651 1st Qu.: 19545 1st Qu.: 24836
Median : 35020 Median : 50952 Median : 53173 Median : 61151
Mean : 58453 Mean : 77586 Mean : 87095 Mean : 99954
3rd Qu.: 88826 3rd Qu.:118958 3rd Qu.:130503 3rd Qu.:152692
Max. :224211 Max. :265500 Max. :274209 Max. :490430
NA's :44 NA's :45 NA's :45 NA's :45
...29 ...30 ...31 ...32
Min. : 802 Min. : 500 Min. : 400 Min. : 405
1st Qu.: 34096 1st Qu.: 26106 1st Qu.: 35982 1st Qu.: 51995
Median : 76820 Median : 76822 Median : 88000 Median : 85000
Mean :109988 Mean :108078 Mean :109788 Mean :117479
3rd Qu.:159144 3rd Qu.:149554 3rd Qu.:157693 3rd Qu.:156964
Max. :460619 Max. :461257 Max. :458566 Max. :461855
NA's :45 NA's :42 NA's :42 NA's :42
...33 ...34 ...35 ...36
Min. : 357 Min. : 386 Min. : 275 Min. : 300
1st Qu.: 20356 1st Qu.: 11615 1st Qu.: 19791 1st Qu.: 15879
Median : 57338 Median : 51925 Median : 71901 Median : 83321
Mean :109593 Mean :103797 Mean :119617 Mean :125076
3rd Qu.:153134 3rd Qu.:163980 3rd Qu.:218391 3rd Qu.:206316
Max. :498289 Max. :375000 Max. :420000 Max. :500000
NA's :41 NA's :39 NA's :37 NA's :35
...37 ...38 ...39 ...40
Min. : 400 Min. : 706 Min. : 1130 Min. : 1500
1st Qu.: 20111 1st Qu.: 21213 1st Qu.: 20100 1st Qu.: 18000
Median :124697 Median : 124365 Median :121301 Median :116516
Mean :173314 Mean : 192804 Mean :188685 Mean :191680
3rd Qu.:258988 3rd Qu.: 223413 3rd Qu.:224445 3rd Qu.:226741
Max. :694098 Max. :1041204 Max. :954967 Max. :911612
NA's :33 NA's :32 NA's :31 NA's :30
...41 ...42 ...43 ...44
Min. : 1141 Min. : 0 Min. : 0 Min. : 0
1st Qu.: 18348 1st Qu.: 15506 1st Qu.: 7855 1st Qu.: 4570
Median :111185 Median : 124563 Median : 48144 Median : 18971
Mean :201245 Mean : 217969 Mean : 97481 Mean : 52331
3rd Qu.:274769 3rd Qu.: 274002 3rd Qu.:124809 3rd Qu.: 60651
Max. :950807 Max. :1310101 Max. :728706 Max. :394140
NA's :29 NA's :26 NA's :26 NA's :26
...45 ...46 ...47 ...48
Min. : 0 Min. : 0 Min. : 0 Min. : 0
1st Qu.: 3759 1st Qu.: 6397 1st Qu.: 16426 1st Qu.: 19374
Median : 19519 Median : 38624 Median : 124763 Median : 162563
Mean : 62445 Mean :108193 Mean : 222159 Mean : 264001
3rd Qu.: 55707 3rd Qu.:130091 3rd Qu.: 308593 3rd Qu.: 376973
Max. :534586 Max. :750690 Max. :1157930 Max. :1204017
NA's :25 NA's :25 NA's :25 NA's :25
...49 ...50 ...51 ...52
Min. : 1948 Min. : 1949 Min. : 1950 Min. : 1951
1st Qu.: 26703 1st Qu.: 37842 1st Qu.: 31636 1st Qu.: 36043
Median : 169063 Median : 200555 Median : 189286 Median : 224801
Mean : 281803 Mean : 325711 Mean : 328327 Mean : 371395
3rd Qu.: 386848 3rd Qu.: 404359 3rd Qu.: 425890 3rd Qu.: 497083
Max. :1469749 Max. :1539641 Max. :1843620 Max. :1945100
NA's :25 NA's :25 NA's :24 NA's :24
...53 ...54 ...55 ...56
Min. : 1952 Min. : 1953 Min. : 1954 Min. : 1955
1st Qu.: 49458 1st Qu.: 58464 1st Qu.: 59350 1st Qu.: 69100
Median : 312677 Median : 332835 Median : 337900 Median : 342200
Mean : 425016 Mean : 437933 Mean : 452945 Mean : 467239
3rd Qu.: 564989 3rd Qu.: 590949 3rd Qu.: 581000 3rd Qu.: 642950
Max. :2322152 Max. :2250772 Max. :2526900 Max. :2581500
NA's :24 NA's :24 NA's :24 NA's :24
...57 ...58 ...59 ...60
Min. : 500 Min. : 600 Min. : 700 Min. : 1100
1st Qu.: 62000 1st Qu.: 49975 1st Qu.: 57425 1st Qu.: 70650
Median : 303800 Median : 328900 Median : 340500 Median : 360450
Mean : 488572 Mean : 497117 Mean : 519619 Mean : 539682
3rd Qu.: 669800 3rd Qu.: 692250 3rd Qu.: 711525 3rd Qu.: 778475
Max. :2885800 Max. :2943700 Max. :3168900 Max. :3162300
NA's :22 NA's :21 NA's :21 NA's :21
...61 ...62 ...63 ...64
Min. : 600 Min. : 600 Min. : 300 Min. : 700
1st Qu.: 71800 1st Qu.: 83750 1st Qu.: 93450 1st Qu.: 103525
Median : 397700 Median : 415600 Median : 399000 Median : 410800
Mean : 621101 Mean : 644369 Mean : 742316 Mean : 740547
3rd Qu.: 871600 3rd Qu.: 801450 3rd Qu.:1005450 3rd Qu.: 966825
Max. :4528600 Max. :4762100 Max. :5209800 Max. :5258700
NA's :20 NA's :20 NA's :20 NA's :19
...65 ...66 ...67 ...68
Min. : 500 Min. : 800 Min. : 300 Min. : 1200
1st Qu.: 102325 1st Qu.: 118000 1st Qu.: 117900 1st Qu.: 146600
Median : 432250 Median : 480500 Median : 513000 Median : 516300
Mean : 759212 Mean : 852191 Mean : 936444 Mean : 913177
3rd Qu.: 940675 3rd Qu.:1091300 3rd Qu.:1143800 3rd Qu.:1282800
Max. :5321100 Max. :5954900 Max. :6466100 Max. :6710100
NA's :19 NA's :18 NA's :18 NA's :18
...69 ...70 ...71 ...72
Min. : 1600 Min. : 1969 Min. : 1970 Min. : 1971
1st Qu.: 147000 1st Qu.: 162600 1st Qu.: 183225 1st Qu.: 193875
Median : 578300 Median : 550300 Median : 611750 Median : 542550
Mean : 967624 Mean : 975214 Mean :1018211 Mean : 907303
3rd Qu.:1540200 3rd Qu.:1299700 3rd Qu.:1367225 3rd Qu.:1306500
Max. :6667100 Max. :6331100 Max. :6778500 Max. :7173000
NA's :18 NA's :18 NA's :17 NA's :15
...73 ...74 ...75 ...76
Min. : 1972 Min. : 1973 Min. : 1974 Min. : 1975
1st Qu.: 168896 1st Qu.: 196850 1st Qu.: 162775 1st Qu.: 242925
Median : 546286 Median : 500750 Median : 470750 Median : 558150
Mean :1000728 Mean : 960740 Mean : 906212 Mean : 994583
3rd Qu.:1391299 3rd Qu.:1350050 3rd Qu.:1213925 3rd Qu.:1307375
Max. :8034753 Max. :7586300 Max. :7807800 Max. :8541500
NA's :14 NA's :13 NA's :13 NA's :13
...77 ...78 ...79 ...80
Min. : 1976 Min. : 1977 Min. : 1978 Min. : 1979
1st Qu.: 215300 1st Qu.: 246000 1st Qu.: 276200 1st Qu.: 246900
Median : 625600 Median : 622300 Median : 663331 Median : 585678
Mean :1068945 Mean :1093285 Mean :1073324 Mean : 939412
3rd Qu.:1312300 3rd Qu.:1375800 3rd Qu.:1321844 3rd Qu.:1400204
Max. :8991500 Max. :9173600 Max. :8695534 Max. :8019788
NA's :12 NA's :12 NA's :11 NA's :11
...81 ...82 ...83 ...84
Min. : 1980 Min. : 1981 Min. : 1381 Min. : 1983
1st Qu.: 262219 1st Qu.: 264326 1st Qu.: 172287 1st Qu.: 164926
Median : 567420 Median : 592454 Median : 566807 Median : 577439
Mean : 962913 Mean :1020821 Mean : 941126 Mean : 958910
3rd Qu.:1234220 3rd Qu.:1247455 3rd Qu.:1030484 3rd Qu.:1160008
Max. :8440953 Max. :8312884 Max. :8177869 Max. :8435475
NA's :11 NA's :11 NA's :6 NA's :6
...85 ...86 ...87 ...88
Min. : 1075 Min. : 1305 Min. : 1085 Min. : 230
1st Qu.: 166610 1st Qu.: 183348 1st Qu.: 185612 1st Qu.: 203216
Median : 539476 Median : 505791 Median : 586668 Median : 651606
Mean : 924798 Mean : 925230 Mean : 990115 Mean : 1041905
3rd Qu.:1142727 3rd Qu.:1139202 3rd Qu.:1228194 3rd Qu.: 1233212
Max. :8508390 Max. :9319290 Max. :9836306 Max. :10209841
NA's :5 NA's :4 NA's :4 NA's :4
...89 ...90 ...91 ...92
Min. : 1258 Min. : 822 Min. : 1010 Min. : 1154
1st Qu.: 217918 1st Qu.: 238006 1st Qu.: 240466 1st Qu.: 212755
Median : 675397 Median : 600045 Median : 611375 Median : 679034
Mean :1049132 Mean :1061917 Mean :1018962 Mean :1083886
3rd Qu.:1231204 3rd Qu.:1302215 3rd Qu.:1293538 3rd Qu.:1438690
Max. :8770781 Max. :8333553 Max. :8151769 Max. :8654459
NA's :4 NA's :4 NA's :4 NA's :4
...93 ...94 ...95 ...96
Min. : 1992 Min. : 1993 Min. : 1726 Min. : 0
1st Qu.: 219461 1st Qu.: 214598 1st Qu.: 211855 1st Qu.: 241461
Median : 688742 Median : 666054 Median : 685031 Median : 663794
Mean :1108917 Mean :1141519 Mean :1157714 Mean :1197706
3rd Qu.:1467198 3rd Qu.:1421256 3rd Qu.:1573088 3rd Qu.:1605836
Max. :8931690 Max. :9283848 Max. :8628174 Max. :9080420
NA's :4 NA's :4 NA's :4 NA's :4
...97 ...98 ...99 ...100
Min. : 1996 Min. : 1997 Min. : 1998 Min. : 1999
1st Qu.: 285684 1st Qu.: 306023 1st Qu.: 271858 1st Qu.: 288709
Median : 644502 Median : 627720 Median : 604556 Median : 635736
Mean :1198161 Mean :1211313 Mean :1204959 Mean : 1195782
3rd Qu.:1535120 3rd Qu.:1548693 3rd Qu.:1474971 3rd Qu.: 1443662
Max. :9265667 Max. :9965075 Max. :9989395 Max. :10283598
NA's :5 NA's :4 NA's :4 NA's :4
...101 ...102 ...103 ...104
Min. : 2000 Min. : 2001 Min. : 1938 Min. : 0
1st Qu.: 257790 1st Qu.: 269938 1st Qu.: 237364 1st Qu.: 241347
Median : 605192 Median : 541787 Median : 558503 Median : 570953
Mean : 1167022 Mean :1133359 Mean :1125762 Mean :1097179
3rd Qu.: 1467108 3rd Qu.:1377130 3rd Qu.:1401990 3rd Qu.:1323676
Max. :10175812 Max. :9197697 Max. :9316420 Max. :9366845
NA's :4 NA's :4 NA's :3 NA's :3
...105 ...106 ...107 ...108
Min. : 2004 Min. : 2005 Min. : 1239 Min. : 847
1st Qu.: 258122 1st Qu.: 268943 1st Qu.: 246691 1st Qu.: 268616
Median : 549708 Median : 594893 Median : 569464 Median : 548004
Mean :1112941 Mean :1115680 Mean :1040673 Mean :1068904
3rd Qu.:1363063 3rd Qu.:1424896 3rd Qu.:1260680 3rd Qu.:1299272
Max. :9167046 Max. :9192477 Max. :9289215 Max. :9372253
NA's :4 NA's :4 NA's :3 NA's :3
...109 ...110 ...111 ...112
Min. : 1565 Min. : 1879 Min. : 2010 Min. : 2011
1st Qu.: 259539 1st Qu.: 221411 1st Qu.: 271609 1st Qu.: 270732
Median : 547580 Median : 568652 Median : 568426 Median : 550900
Mean :1043258 Mean :1078816 Mean :1110641 Mean :1072016
3rd Qu.:1219177 3rd Qu.:1220559 3rd Qu.:1290286 3rd Qu.:1310031
Max. :9044010 Max. :9491437 Max. :9463538 Max. :9008830
NA's :3 NA's :3 NA's :3 NA's :3
...113 ...114 ...115 ...116
Min. : 2012 Min. : 2013 Min. : 0 Min. : 0
1st Qu.: 243314 1st Qu.: 236605 1st Qu.: 262790 1st Qu.: 282101
Median : 518568 Median : 526974 Median : 538970 Median : 596116
Mean :1110429 Mean :1080282 Mean : 1155141 Mean : 1254802
3rd Qu.:1323217 3rd Qu.:1315336 3rd Qu.: 1427298 3rd Qu.: 1473670
Max. :9685829 Max. :9354695 Max. :10099276 Max. :10712674
NA's :3 NA's :3 NA's :3 NA's :3
...117 ...118 ...119 ...120
Min. : 2016 Min. : 2017 Min. : 2018 Min. : 2019
1st Qu.: 320378 1st Qu.: 304206 1st Qu.: 291636 1st Qu.: 325694
Median : 630326 Median : 667870 Median : 650660 Median : 681872
Mean : 1369082 Mean : 1396772 Mean : 1370565 Mean : 1422075
3rd Qu.: 1554654 3rd Qu.: 1544675 3rd Qu.: 1667333 3rd Qu.: 1680013
Max. :11312786 Max. :11338893 Max. :11421200 Max. :12547743
NA's :3 NA's :3 NA's :3 NA's :3
...121 ...122
Min. : 2020 Min. : 2021
1st Qu.: 162119 1st Qu.: 292505
Median : 454968 Median : 707328
Mean : 1061464 Mean : 1441467
3rd Qu.: 1265616 3rd Qu.: 1713756
Max. :12095720 Max. :14161548
NA's :3 NA's :3
class(orig_nps)
[1] "spec_tbl_df" "tbl_df" "tbl" "data.frame"
names(orig_nps) # Exploring the column names; they look to be all numbers
nps1 <- orig_nps[-c(1, 3, 6, 118:122)] # Here I will remove the columns that I do not need by selecting the column number
nps2 <- nps1[-c(1, 2, 65), ] # I am going to clean the data a bit more by removing the two rows at the top of the dataset (and row 65)
print(nps2) # Much more succinct
# A tibble: 64 × 114
...2 ...4 ...5 ...7 ...8 ...9 ...10 ...11 ...12 ...13 ...14 ...15 ...16
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Park… 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915
2 Dena… NA NA NA NA NA NA NA NA NA NA NA NA
3 Gate… NA NA NA NA NA NA NA NA NA NA NA NA
4 Glac… NA NA NA NA NA NA NA NA NA NA NA NA
5 Katm… NA NA NA NA NA NA NA NA NA NA NA NA
6 Kena… NA NA NA NA NA NA NA NA NA NA NA NA
7 Kobu… NA NA NA NA NA NA NA NA NA NA NA NA
8 Lake… NA NA NA NA NA NA NA NA NA NA NA NA
9 Wran… NA NA NA NA NA NA NA NA NA NA NA NA
10 Arch… NA NA NA NA NA NA NA NA NA NA NA NA
# … with 54 more rows, and 101 more variables: ...17 <dbl>, ...18 <dbl>,
# ...19 <dbl>, ...20 <dbl>, ...21 <dbl>, ...22 <dbl>, ...23 <dbl>,
# ...24 <dbl>, ...25 <dbl>, ...26 <dbl>, ...27 <dbl>, ...28 <dbl>,
# ...29 <dbl>, ...30 <dbl>, ...31 <dbl>, ...32 <dbl>, ...33 <dbl>,
# ...34 <dbl>, ...35 <dbl>, ...36 <dbl>, ...37 <dbl>, ...38 <dbl>,
# ...39 <dbl>, ...40 <dbl>, ...41 <dbl>, ...42 <dbl>, ...43 <dbl>,
# ...44 <dbl>, ...45 <dbl>, ...46 <dbl>, ...47 <dbl>, ...48 <dbl>, …
While our dataset is cleaner and more succinct, it still has no meaningful column names: the real headers (“Park Name” and the years) sit in the first row.
nps3 <- nps2 %>% # Using first row of dataset for column names
  row_to_names(row_number = 1)

nps4 <- nps3[!is.na(nps3$`Park Name`), ] # Removing NA from Park Name column

nps5 <- nps4 %>% # Converting dataset from wide to long
  pivot_longer(cols = c(-`Park Name`), names_to = "year", values_to = "visitors", values_drop_na = TRUE)

nps_clean <- nps5 %>% # Ranking the parks
  group_by(year) %>%
  mutate(Rank = order(order(visitors, decreasing = TRUE))) %>%
  ungroup() %>%
  rename(park_name = `Park Name`)
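The order(order(...)) idiom in the ranking step can be hard to read, so here is a small base R illustration on made-up visitor counts (not NPS data). The inner order() lists the positions from most to fewest visitors, and the outer order() turns that listing into each element's rank. Ties are broken by position, so this should match rank() with ties.method = "first".

```r
# Made-up visitor counts for four parks in a single year
visitors <- c(500, 1200, 800, 1200)

# Inner order(): positions sorted from most to fewest visitors
# Outer order(): converts that ordering into a rank per element
rank_desc <- order(order(visitors, decreasing = TRUE))

# Equivalent ranking written with rank(); ties go to the earlier element
rank_first <- rank(-visitors, ties.method = "first")

# Alternative: tied parks share the same (lowest) rank
rank_min <- rank(-visitors, ties.method = "min")

print(rank_desc)
print(rank_first)
print(rank_min)
```

Whether tied parks should share a rank is a design choice; the code above keeps ranks unique within a year, which avoids overlapping lines in the plot.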
Beginning Data Visualization
library(ggthemes)
nps_clean$year <- as.numeric(as.character(nps_clean$year)) # Making year numeric

# Creating a df for each of the top parks to overlay on nps_g1
GSM <- nps_clean[which(nps_clean$park_name == "Great Smoky Mountains NP"), ]
GC <- nps_clean[which(nps_clean$park_name == "Grand Canyon NP"), ]
RMNP <- nps_clean[which(nps_clean$park_name == "Rocky Mountain NP"), ]
YNP <- nps_clean[which(nps_clean$park_name == "Yosemite NP"), ]
YSNP <- nps_clean[which(nps_clean$park_name == "Yellowstone NP"), ]
ZNP <- nps_clean[which(nps_clean$park_name == "Zion NP"), ]
ANP <- nps_clean[which(nps_clean$park_name == "Acadia NP"), ]
HSNP <- nps_clean[which(nps_clean$park_name == "Hot Springs NP"), ]
DNP <- nps_clean[which(nps_clean$park_name == "Denali NP & PRES"), ]
CCNP <- nps_clean[which(nps_clean$park_name == "Carlsbad Caverns NP"), ]
GBNP <- nps_clean[which(nps_clean$park_name == "Great Basin NP"), ]

nps_g1 <- nps_clean %>%
  ggplot() +
  geom_line(aes(x = year, y = Rank, color = park_name, group = park_name), color = "grey") +
  theme_fivethirtyeight() +
  theme(legend.position = "none") +
  scale_y_reverse(breaks = seq(50, 1, -25), limits = c(62, -1)) + # setting y axis
  scale_x_continuous(breaks = seq(1925, 2000, 25), limits = c(1904, 2030)) + # setting x axis
  xlab("Year") +
  ylab("Rank") +
  labs(title = "The most popular national parks", subtitle = "National parks ranked by number of visitors in a given year") +
  geom_line(data = GSM, aes(x = year, y = Rank, color = park_name, group = park_name), color = "darkolivegreen") +
  annotate("text", x = 2007, y = 0, label = "Great Smoky Mountains", color = "darkolivegreen", fontface = 2, size = 2) +
  geom_line(data = GC, aes(x = year, y = Rank, color = park_name, group = park_name), color = "deepskyblue4") +
  annotate("text", x = 2022, y = 1.6, label = "Grand Canyon", color = "deepskyblue4", fontface = 2, size = 2) +
  geom_line(data = RMNP, aes(x = year, y = Rank, color = park_name, group = park_name), color = "cyan4") +
  annotate("text", x = 2022, y = 2.6, label = "Rocky Mountain", color = "cyan4", fontface = 2, size = 2) +
  geom_line(data = YNP, aes(x = year, y = Rank, color = park_name, group = park_name), color = "chartreuse4") +
  annotate("text", x = 2021, y = 3.6, label = "Yosemite", color = "chartreuse4", fontface = 2, size = 2) +
  geom_line(data = YSNP, aes(x = year, y = Rank, color = park_name, group = park_name), color = "orange2") +
  annotate("text", x = 2021, y = 4.7, label = "Yellowstone", color = "orange2", fontface = 2, size = 2) +
  geom_line(data = ZNP, aes(x = year, y = Rank, color = park_name, group = park_name), color = "tomato") +
  annotate("text", x = 2018.3, y = 5.8, label = "Zion", color = "tomato", fontface = 2, size = 2) +
  geom_line(data = ANP, aes(x = year, y = Rank, color = park_name, group = park_name), color = "gold1") +
  annotate("text", x = 2018.9, y = 8.2, label = "Acadia", color = "gold1", fontface = 2, size = 2) +
  geom_line(data = HSNP, aes(x = year, y = Rank, color = park_name, group = park_name), color = "plum3") +
  annotate("text", x = 2020.8, y = 16.8, label = "Hot Springs", color = "plum3", fontface = 2, size = 2) +
  geom_line(data = DNP, aes(x = year, y = Rank, color = park_name, group = park_name), color = "mediumpurple1") +
  annotate("text", x = 2019, y = 34.7, label = "Denali", color = "mediumpurple1", fontface = 2, size = 2) +
  geom_line(data = CCNP, aes(x = year, y = Rank, color = park_name, group = park_name), color = "lightskyblue2") +
  annotate("text", x = 2023.3, y = 42, label = "Carlsbad Caverns", color = "lightskyblue2", fontface = 2, size = 2) +
  geom_line(data = GBNP, aes(x = year, y = Rank, color = park_name, group = park_name), color = "maroon2") +
  annotate("text", x = 2020.8, y = 52, label = "Great Basin", color = "maroon2", fontface = 2, size = 2)

nps_g1
Here is an update on my data visualization (3/10/2023)! I think I bit off a bit more than I could chew with this graph, but I am pretty proud of how far I got with it. To make things a little more cohesive, I overrode the line color to grey. I plan to continue working on it tonight in the hope of adding the 11 parks in color. It is definitely a work in progress!
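One possible way to add the highlighted parks in color without a separate data frame per park is to keep the park-to-color mapping in a single named vector and let scale_color_manual() apply it. This is only a sketch on made-up data: the `toy` data frame, the park names "Park A" to "Park C", and the object names `highlight_cols`, `highlighted`, and `labels` are all hypothetical stand-ins, and the label positions would still need tuning on the real plot.

```r
library(ggplot2)

# Toy stand-in for nps_clean (made-up ranks, not NPS data)
toy <- data.frame(
  park_name = rep(c("Park A", "Park B", "Park C"), each = 3),
  year = rep(c(2000, 2001, 2002), times = 3),
  Rank = c(1, 1, 2, 2, 3, 1, 3, 2, 3)
)

# Named vector: which parks to highlight and in which color
highlight_cols <- c("Park A" = "darkolivegreen", "Park B" = "tomato")

# One data frame holding every highlighted park
highlighted <- toy[toy$park_name %in% names(highlight_cols), ]

# One label per highlighted park, placed at the last year
labels <- highlighted[highlighted$year == max(highlighted$year), ]

p <- ggplot(toy, aes(x = year, y = Rank, group = park_name)) +
  geom_line(color = "grey") + # all parks in grey
  geom_line(data = highlighted, aes(color = park_name)) + # colored overlay
  geom_text(data = labels, aes(label = park_name, color = park_name),
            hjust = 0, nudge_x = 0.1, size = 2, fontface = "bold") +
  scale_color_manual(values = highlight_cols) +
  scale_y_reverse() +
  theme(legend.position = "none")

p
```

With this layout, adding or recoloring a park means editing one entry in `highlight_cols` rather than adding another geom_line() and annotate() pair.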