For this activity, work in groups of 2-3. Work together to come to a solution, and help each other out when stuck! The goal is to use the journey as a vessel for your own and your peers’ learning, not to make it to the ‘correct’ answer as fast as possible.
Problem 0: load (and install) packages
# Packages (I got you started)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.2.0 ✔ readr 2.1.5
✔ forcats 1.0.1 ✔ stringr 1.6.0
✔ ggplot2 4.0.2 ✔ tibble 3.3.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.2
✔ purrr 1.2.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(Lahman) # the baseball data
library(nycflights13) # the flights data
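If any of these packages is not installed yet, `library()` will fail with an error. A minimal sketch of one way to install only what is missing is shown below; the variable names `pkgs` and `to_install` are illustrative, and the code assumes a CRAN mirror is reachable from your session.

```r
# Packages this activity needs
pkgs <- c("tidyverse", "Lahman", "nycflights13")

# Keep only the packages that are not already installed
to_install <- setdiff(pkgs, rownames(installed.packages()))

# Install the missing ones (does nothing if everything is present)
if (length(to_install) > 0) {
  install.packages(to_install)
}
```

Run this once in the console rather than inside the document, so the packages are not reinstalled on every render.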
Problem 1: Baseball
The Major League Baseball Angels have at times been called the California Angels (CAL), the Anaheim Angels (ANA), and the Los Angeles Angels of Anaheim (LAA). Using the Teams data frame in the Lahman package:
Find the 10 most successful seasons in Angels history, defining “successful” as the fraction of regular-season games won in the year. In the table you create, include the yearID, teamID, lgID, W, L, and WSWin. See the documentation for Teams (help(Teams)) for the definitions of these variables.
Have the Angels ever won the World Series? If so, when?
data(Teams)
head(Teams)
yearID lgID teamID franchID divID Rank G Ghome W L DivWin WCWin LgWin
1 1884 UA ALT ALT <NA> 10 25 NA 6 19 <NA> <NA> N
2 1961 AL LAA ANA <NA> 8 162 82 70 91 <NA> <NA> N
3 1962 AL LAA ANA <NA> 3 162 81 86 76 <NA> <NA> N
4 1963 AL LAA ANA <NA> 9 161 81 70 91 <NA> <NA> N
5 1964 AL LAA ANA <NA> 5 162 81 82 80 <NA> <NA> N
6 1965 AL CAL ANA <NA> 7 162 80 75 87 <NA> <NA> N
WSWin R AB H X2B X3B HR BB SO SB CS HBP SF RA ER ERA CG SHO SV
1 <NA> 90 899 223 30 6 2 22 130 NA NA NA NA 216 114 4.67 20 0 0
2 N 744 5424 1331 218 22 189 681 1068 37 28 NA NA 784 689 4.31 25 5 34
3 N 718 5499 1377 232 35 137 602 917 46 27 NA NA 706 603 3.70 23 15 47
4 N 597 5506 1378 208 38 95 448 916 43 30 NA NA 660 569 3.52 30 13 31
5 N 544 5362 1297 186 27 102 472 920 49 39 NA NA 551 469 2.91 30 28 41
6 N 527 5354 1279 200 36 92 443 973 107 59 NA NA 569 508 3.17 39 14 33
IPouts HA HRA BBA SOA E DP FP name
1 659 292 3 52 93 156 4 0.862 Altoona Mountain City
2 4314 1391 180 713 973 192 154 0.969 Los Angeles Angels
3 4398 1412 118 616 858 175 153 0.972 Los Angeles Angels
4 4365 1317 120 578 889 163 155 0.974 Los Angeles Angels
5 4350 1273 100 530 965 138 168 0.978 Los Angeles Angels
6 4323 1259 91 563 847 123 149 0.981 California Angels
park attendance BPF PPF teamIDBR teamIDlahman45 teamIDretro
1 <NA> NA 101 109 ALT ALT ALT
2 Wrigley Field (LA) 603510 111 112 LAA LAA LAA
3 Dodger Stadium 1144063 97 97 LAA LAA LAA
4 Dodger Stadium 821015 94 94 LAA LAA LAA
5 Dodger Stadium 760439 90 90 LAA LAA LAA
6 Dodger Stadium 566727 97 98 CAL CAL CAL
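One possible way to approach both Angels questions is sketched below. It is a starting point under stated assumptions, not the only valid answer: the winning fraction is computed as W / (W + L), which ignores any ties (W / G would be an equally defensible choice), and the names `angels`, `win_pct`, `top10`, and `ws_wins` are made up for illustration.

```r
library(dplyr)
library(Lahman)

# Keep only the seasons played under the three Angels team IDs
angels <- Teams %>%
  filter(teamID %in% c("CAL", "ANA", "LAA")) %>%
  mutate(win_pct = W / (W + L))  # assumption: ties are ignored

# Ten best seasons by winning fraction, with the requested columns
top10 <- angels %>%
  arrange(desc(win_pct)) %>%
  select(yearID, teamID, lgID, W, L, WSWin, win_pct) %>%
  head(10)

# Seasons (if any) in which the Angels won the World Series
ws_wins <- angels %>%
  filter(WSWin == "Y") %>%
  select(yearID, teamID, W, L, WSWin)

top10
ws_wins
```

Note that `filter()` drops rows where the condition is `NA`, so seasons with a missing WSWin value do not appear in `ws_wins`. Compare your group's result with this one and discuss any differences.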
Problem 2: Flights
Use the nycflights13 package and the flights data frame to answer the following questions:
What plane (specified by the tailnum variable) traveled the most times from New York City airports in 2013?
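A hedged sketch of one approach to this question: count the rows of flights for each tailnum and sort the counts in descending order. It assumes that rows with a missing tailnum should be excluded, since `NA` does not identify a specific plane; the name `by_plane` is illustrative.

```r
library(dplyr)
library(nycflights13)

# Number of departures per plane, most frequent first
by_plane <- flights %>%
  filter(!is.na(tailnum)) %>%   # assumption: drop flights with no recorded tail number
  count(tailnum, sort = TRUE)

# The plane at the top of this table made the most trips
head(by_plane, 5)
```

Try removing the `filter()` line and see how the top of the table changes; that is a useful reminder to check for missing values before ranking.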