flights

reference:
- map of New York airports
- Central Park, NY

data

How many flights are in this database? From where did these flights originate? When?

SOLUTION:

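library(nycflights13)  # provides the flights data set
library(dplyr)         # filter(), group_by(), summarize(), %>%
library(ggplot2)       # ggplot(), geom_histogram()
?nycflights13
data <- flights
str(data)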
## Classes 'tbl_df', 'tbl' and 'data.frame':    336776 obs. of  19 variables:
##  $ year          : int  2013 2013 2013 2013 2013 2013 2013 2013 2013 2013 ...
##  $ month         : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ day           : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ dep_time      : int  517 533 542 544 554 554 555 557 557 558 ...
##  $ sched_dep_time: int  515 529 540 545 600 558 600 600 600 600 ...
##  $ dep_delay     : num  2 4 2 -1 -6 -4 -5 -3 -3 -2 ...
##  $ arr_time      : int  830 850 923 1004 812 740 913 709 838 753 ...
##  $ sched_arr_time: int  819 830 850 1022 837 728 854 723 846 745 ...
##  $ arr_delay     : num  11 20 33 -18 -25 12 19 -14 -8 8 ...
##  $ carrier       : chr  "UA" "UA" "AA" "B6" ...
##  $ flight        : int  1545 1714 1141 725 461 1696 507 5708 79 301 ...
##  $ tailnum       : chr  "N14228" "N24211" "N619AA" "N804JB" ...
##  $ origin        : chr  "EWR" "LGA" "JFK" "JFK" ...
##  $ dest          : chr  "IAH" "IAH" "MIA" "BQN" ...
##  $ air_time      : num  227 227 160 183 116 150 158 53 140 138 ...
##  $ distance      : num  1400 1416 1089 1576 762 ...
##  $ hour          : num  5 5 5 5 6 5 6 6 6 6 ...
##  $ minute        : num  15 29 40 45 0 58 0 0 0 0 ...
##  $ time_hour     : POSIXct, format: "2013-01-01 05:00:00" "2013-01-01 05:00:00" ...
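
The str() output gives the row count, but it only hints at the answers to "from where" and "when". A minimal sketch of one way to answer all three questions directly, assuming the nycflights13 and dplyr packages are installed:

```r
library(nycflights13)
library(dplyr)

# How many flights? The table has one row per flight.
nrow(flights)

# From where? Count the flights leaving each origin airport.
flights %>% count(origin)

# When? Earliest and latest scheduled departure hour in the data.
range(flights$time_hour)
```

The origin counts list every departure airport in the table, and range() on time_hour shows the first and last scheduled hours the data covers.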

flights to Nashville

How many of these flights came to Nashville?

SOLUTION:

flights.to.Nashville <- filter(flights, dest == 'BNA')
flights.to.Nashville
## # A tibble: 6,333 x 19
##     year month   day dep_time sched_dep_time dep_delay arr_time
##    <int> <int> <int>    <int>          <int>     <dbl>    <int>
##  1  2013     1     1      903            820        43     1045
##  2  2013     1     1      953            959        -6     1141
##  3  2013     1     1     1255           1200        55     1451
##  4  2013     1     1     1552           1600        -8     1749
##  5  2013     1     1     1558           1534        24     1808
##  6  2013     1     1     1803           1620       103     2008
##  7  2013     1     1     1806           1810        -4     2002
##  8  2013     1     1     1910           1909         1     2126
##  9  2013     1     1     1934           1725       129     2126
## 10  2013     1     1     2122           2125        -3     2312
## # ... with 6,323 more rows, and 12 more variables: sched_arr_time <int>,
## #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
## #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
## #   minute <dbl>, time_hour <dttm>
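
The tibble header above already reports 6,333 rows. To get that count as a plain number rather than reading it off the printout, something like the following should work (a sketch, assuming nycflights13 and dplyr are installed):

```r
library(nycflights13)
library(dplyr)

# Keep only flights whose destination is Nashville (airport code BNA).
flights.to.Nashville <- filter(flights, dest == "BNA")

# Number of rows = number of flights to Nashville.
nrow(flights.to.Nashville)

# The same count as a one-row tibble, without the intermediate object.
flights %>% filter(dest == "BNA") %>% count()
```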

origin of Nashville flights

From which airports did these Nashville flights originate? Find these airports on a map. Describe their locations relative to Central Park.

SOLUTION:

flights.to.Nashville %>%
  group_by(origin) %>%
  summarize(n = n())
## # A tibble: 3 x 2
##   origin     n
##    <chr> <int>
## 1    EWR  2336
## 2    JFK   730
## 3    LGA  3267
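
To help find these three airports on a map, the counts can be joined to the airports table that ships with nycflights13, which holds each airport's name, latitude, and longitude. This is a sketch of one possible approach, not part of the original solution; the object name origin.locations is made up for illustration.

```r
library(nycflights13)
library(dplyr)

origin.locations <- flights %>%
  filter(dest == "BNA") %>%                        # Nashville flights only
  count(origin) %>%                                # flights per origin airport
  left_join(airports, by = c("origin" = "faa")) %>% # look up each airport by its FAA code
  select(origin, name, lat, lon, n)

origin.locations
```

The lat and lon columns can then be compared with Central Park's position on the reference map to describe where each airport lies.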

arrival times

When did they arrive? Describe the shape of this distribution, its center, spread, and range.

SOLUTION:

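# arr_time is clock time stored as HHMM, so binwidth = 100 gives one bar per hour
ggplot(flights.to.Nashville, aes(arr_time)) +
  geom_histogram(binwidth = 100, boundary = 0, na.rm = TRUE,
                 color = "saddlebrown", fill = "wheat") +
  labs(title = "Flights from NY to Nashville, 2013", x = "arrival time (HHMM)")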
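
The histogram shows the shape; numeric summaries help with center, spread, and range. Because arr_time is stored as HHMM (for example 1045 means 10:45), the sketch below first converts it to minutes after midnight so that differences are meaningful. It assumes nycflights13 and dplyr are installed; arr_minutes and arrival.summary are illustrative names, not part of the original solution.

```r
library(nycflights13)
library(dplyr)

flights.to.Nashville <- filter(flights, dest == "BNA")

arrival.summary <- flights.to.Nashville %>%
  filter(!is.na(arr_time)) %>%   # drop flights with no recorded arrival
  mutate(arr_minutes = (arr_time %/% 100) * 60 + arr_time %% 100) %>%
  summarize(
    n        = n(),                # flights with a recorded arrival
    earliest = min(arr_minutes),   # range: lower end
    median   = median(arr_minutes), # center
    latest   = max(arr_minutes),   # range: upper end
    iqr      = IQR(arr_minutes)    # spread
  )

arrival.summary
```

All values are in minutes after midnight; dividing by 60 gives hours. Flights that land shortly after midnight appear as very small values, so the minimum may reflect a late arrival rather than an early one.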