2022 Beijing Olympic Data from NBC's API • FHSTR

This package provides data from the 2022 Beijing Olympics, pulled from NBC’s API. The package name comes from the Olympic motto: “Faster, Higher, Stronger - Together”.

Installation

You can install the development version of FHSTR from GitHub with:

# install.packages("devtools")
devtools::install_github("billyfryer/FHSTR")

Preloaded Datasets

So far, there are 3 preloaded datasets in the FHSTR package to help you get started. To learn more about them, open their help pages with the code below:

library(FHSTR)
?sport_list
?date_list
?schedule_matrix

Determining Match IDs

Every event has a unique matchID. To find out which matchID corresponds to which game, use the following code:

hockey_matchIDs <- load_olympic_matchID_key(sportID = 113) # Hockey

Sport Schedules

The sport schedules are similar to the matchID keys but contain more information, such as whether medals were awarded and where events took place. Load a sport schedule with the code below:

hockey_sport_schedule <- load_olympic_sport_schedules(sportID = 113)

Loading Event Data

Event data is available in 2 formats: raw JSON files and pre-parsed CSVs. The JSON files let you dig through and find more data than the CSVs contain, but working with them is typically more time-consuming, which is why the pre-parsed CSVs are available. Sample code:

# W Hockey USA vs Canada CSV Data
csv_data <- load_olympic_csv_data(sportID = 113, matchID = 746587)

# W Hockey USA vs Canada JSON Data
json_data <- load_olympic_json_data(sportID = 113, matchID = 746587)
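Digging through the JSON usually starts with a look at its top-level structure. The sketch below uses base R's `names()` and `str()` with `max.level`. It assumes that `load_olympic_json_data()` returns a nested list, which is an assumption rather than something stated here, and the `demo_json` object and its field names are made up for illustration.

```r
# Made-up nested list standing in for the object returned by
# load_olympic_json_data(); the field names are NOT from NBC's API
demo_json <- list(
  meta   = list(source = "example"),
  events = list(list(id = 1), list(id = 2))
)

# Names of the top-level elements
top_fields <- names(demo_json)

# Compact overview: show only the first level of nesting
str(demo_json, max.level = 1)
```

With the package loaded, the same two calls could be tried on `json_data` in place of `demo_json`.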

Sample Workflow for Hockey and Curling

Due to the nature of their tournaments, Hockey and Curling function differently from the rest of the sports. An example workflow for Hockey and Curling is shown below:

# Call the Package
library(FHSTR)

# Get Sport List
sport_list <- FHSTR::sport_list

# Filter to desired Sport
hockey <- dplyr::filter(sport_list, sport_list$c_Sport == "Hockey")

# Pull the Sport ID
sportID <- hockey$n_SportID

# Sport Schedule
hockey_schedule <- load_olympic_sport_schedules(sportID)
 
# Find Match ID for USA vs Canada Gold Medal Match in Hockey
gold_medal_match <- dplyr::filter(.data = hockey_schedule, 
              hockey_schedule$c_ContainerMatch == "Gold Medal Game" &
              hockey_schedule$GenderEvent.c_Name == "Women's Tournament")

gold_id <- gold_medal_match$Match.n_ID

# Load CSV and JSON Data
csv_data <- load_olympic_csv_data(sportID = sportID, 
                                  matchID = gold_id)
json_data <- load_olympic_json_data(sportID = sportID, 
                                    matchID = gold_id)
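The filtering steps in this workflow assume that exactly one row matches. A minimal sketch of a base R helper that makes that assumption explicit is shown here. `pick_one()` is a hypothetical name, not part of FHSTR, and the demo data frame is a stand-in that only borrows the `c_Sport` and `n_SportID` column names and the Hockey ID of 113 from the examples above.

```r
# Hypothetical helper (not part of FHSTR): return the value in `id_col`
# for the single row where `keep` is TRUE, and stop otherwise.
pick_one <- function(df, keep, id_col) {
  keep <- !is.na(keep) & keep          # treat NA comparisons as "no match"
  if (sum(keep) != 1) {
    stop("Expected exactly 1 matching row, found ", sum(keep))
  }
  df[[id_col]][keep]
}

# Stand-in for FHSTR::sport_list; the second row is a made-up placeholder
demo_sports <- data.frame(
  c_Sport   = c("Hockey", "Placeholder Sport"),
  n_SportID = c(113, 999)
)

sportID <- pick_one(demo_sports, demo_sports$c_Sport == "Hockey", "n_SportID")
```

With the real schedule, the same helper could take the gold medal condition as `keep` and `"Match.n_ID"` as `id_col`, so that an unexpected number of matching rows fails loudly instead of passing several IDs to the loaders.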

Sample Workflow for All Other Sports

The following code can be applied to all other sports. Bobsled is used as the example.

# Call the Package
library(FHSTR)

# Get Sport List
sport_list <- FHSTR::sport_list

# Filter to desired Sport
bobsled <- dplyr::filter(sport_list, sport_list$c_Sport == "Bobsled")

# Pull the Sport ID
sportID <- bobsled$n_SportID

# Sport Schedule
bobsled_schedule <- load_olympic_sport_schedules(sportID)

# Monobob Run 4 Data
monobob <- dplyr::filter(.data = bobsled_schedule, 
              bobsled_schedule$c_ContainerMatch == "Monobob" &
              bobsled_schedule$c_Match == "Run 4")
  
# Get Match ID
run4_id <- monobob$n_MatchID

# Load CSV and JSON Data
csv_data <- load_olympic_csv_data(sportID = sportID, 
                                  matchID = run4_id)
json_data <- load_olympic_json_data(sportID = sportID, 
                                    matchID = run4_id)
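If several matches are needed, for example every Monobob run rather than only Run 4, the single-match loaders can be applied to a vector of IDs. The sketch below assumes the loaders take one `matchID` per call, as in the examples above. `load_many()` and `fake_loader()` are hypothetical names, and the stand-in loader exists only so the block runs without the package or a network connection.

```r
# Hypothetical helper (not part of FHSTR): call `loader` once per match ID
# and return a list named by ID.
load_many <- function(match_ids, loader) {
  results <- lapply(match_ids, loader)
  names(results) <- as.character(match_ids)
  results
}

# Stand-in loader; it just echoes the ID back in a data frame
fake_loader <- function(id) data.frame(matchID = id)

demo_ids  <- c(1, 2, 3)  # made-up IDs, not real match IDs
demo_runs <- load_many(demo_ids, fake_loader)
names(demo_runs)
```

With the package loaded, the loader could be `function(id) load_olympic_csv_data(sportID = sportID, matchID = id)`, and the IDs could come from the `n_MatchID` column of the filtered schedule.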

If you want to see a project that uses the FHSTR package, check out the Biathlon Mass Start Clustering Project that I did!

Raw Datasets

If you want to use the CSV or JSON datasets outside of R, they are available at this link: Beijing 2022 Data Repository