Classes with the Flights Dataset
This example is from TDM 102 Project 11 Spring 2024.
These example(s) depend on the dataset:
- /anvil/projects/tdm/data/flights/2014.csv
Learn more about the dataset here.
1a. Create a class named Flight, which contains attributes for the flight number, origin airport ID, destination airport ID, departure time, arrival time, departure delay, and arrival delay.
class Flight:
    def __init__(self, flight_number, origin_airport_id, dest_airport_id, dep_time, arr_time, dep_delay, arr_delay):
        self.flight_number = flight_number
        self.origin_airport_id = origin_airport_id
        self.dest_airport_id = dest_airport_id
        self.dep_time = dep_time
        self.arr_time = arr_time
        self.dep_delay = dep_delay
        self.arr_delay = arr_delay
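As a quick check of the class, the sketch below builds a single Flight and reads two of its attributes back. The values are illustrative and mirror the first row of the 2014.csv sample used in Question 2a; the variable name `example_flight` is an assumption made only for this example.

```python
class Flight:
    def __init__(self, flight_number, origin_airport_id, dest_airport_id,
                 dep_time, arr_time, dep_delay, arr_delay):
        self.flight_number = flight_number
        self.origin_airport_id = origin_airport_id
        self.dest_airport_id = dest_airport_id
        self.dep_time = dep_time
        self.arr_time = arr_time
        self.dep_delay = dep_delay
        self.arr_delay = arr_delay


# Hypothetical instance; keyword arguments make each field explicit
example_flight = Flight(
    flight_number=2377,
    origin_airport_id=11298,
    dest_airport_id=12278,
    dep_time=935.0,
    arr_time=1051.0,
    dep_delay=-5.0,
    arr_delay=-4.0,
)

# Attributes are read with dot notation
print(example_flight.flight_number, example_flight.arr_delay)
```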
1b. Add a function called get_arrDelay() to the class, which returns the arrival delay.
class Flight:
    def __init__(self, flight_number, origin_airport_id, dest_airport_id, dep_time, arr_time, dep_delay, arr_delay):
        self.flight_number = flight_number
        self.origin_airport_id = origin_airport_id
        self.dest_airport_id = dest_airport_id
        self.dep_time = dep_time
        self.arr_time = arr_time
        self.dep_delay = dep_delay
        self.arr_delay = arr_delay

    def get_arrDelay(self):
        return self.arr_delay
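To see the method in use, here is a minimal sketch that calls get_arrDelay() on two flights and labels each one. The two instances (`early` and `late`) and the "late if the delay is positive" rule are assumptions for illustration; the numbers mirror the first two rows of the data sample.

```python
class Flight:
    def __init__(self, flight_number, origin_airport_id, dest_airport_id,
                 dep_time, arr_time, dep_delay, arr_delay):
        self.flight_number = flight_number
        self.origin_airport_id = origin_airport_id
        self.dest_airport_id = dest_airport_id
        self.dep_time = dep_time
        self.arr_time = arr_time
        self.dep_delay = dep_delay
        self.arr_delay = arr_delay

    def get_arrDelay(self):
        return self.arr_delay


# Hypothetical instances, passed positionally in the order of __init__
early = Flight(2377, 11298, 12278, 935.0, 1051.0, -5.0, -4.0)
late = Flight(2377, 11298, 12278, 951.0, 1115.0, 11.0, 20.0)

for f in (early, late):
    # A negative arrival delay means the flight arrived ahead of schedule
    status = "late" if f.get_arrDelay() > 0 else "on time or early"
    print(f.flight_number, f.get_arrDelay(), status)
```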
2a. Create a DataFrame named myDF to store data from the 2014.csv data set. It suffices to import (only) the columns listed below, and to (only) read in the first 100 rows. Although we provide the columns_to_read, please make (and use) a dictionary of col_types like we did in Question 1 of Project 10.
import pandas as pd

pth = "/anvil/projects/tdm/data/flights/2014.csv"

cols = [
    'DepDelay', 'ArrDelay', 'Flight_Number_Reporting_Airline', 'Distance',
    'CarrierDelay', 'WeatherDelay',
    'DepTime', 'ArrTime', 'OriginAirportID',
    'DestAirportID', 'AirTime'
]

col_dtypes = {
    'DepDelay': 'float64',
    'ArrDelay': 'float64',
    'Flight_Number_Reporting_Airline': 'int64',
    'Distance': 'float64',
    'CarrierDelay': 'float64',
    'WeatherDelay': 'float64',
    'DepTime': 'float64',
    'ArrTime': 'float64',
    'OriginAirportID': 'int64',
    'DestAirportID': 'int64',
    'AirTime': 'float64'
}

# Reading the CSV file with only the specified columns
myDF = pd.read_csv(pth, usecols=cols, dtype=col_dtypes, nrows=100)
print(myDF.head())
Flight_Number_Reporting_Airline | OriginAirportID | DestAirportID | DepTime | DepDelay | ArrTime | ArrDelay | AirTime | Distance | CarrierDelay | WeatherDelay |
---|---|---|---|---|---|---|---|---|---|---|
2377 | 11298 | 12278 | 935.0 | -5.0 | 1051.0 | -4.0 | 56.0 | 328.0 | NaN | NaN |
2377 | 11298 | 12278 | 951.0 | 11.0 | 1115.0 | 20.0 | 54.0 | 328.0 | 11.0 | 0.0 |
2377 | 12278 | 11298 | 1144.0 | 9.0 | 1302.0 | 2.0 | 57.0 | 328.0 | NaN | NaN |
2377 | 12278 | 11298 | 1134.0 | -1.0 | 1253.0 | -7.0 | 53.0 | 328.0 | NaN | NaN |
2377 | 12278 | 11298 | 1129.0 | -6.0 | ... | ... | ... | ... | ... | ... |
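The three read_csv arguments used above (usecols, dtype, nrows) can be tried without access to the Anvil file. The sketch below is a minimal stand-in: the in-memory CSV text, the column named `Extra`, and the names `sample_csv` and `small_df` are all hypothetical and only exist to show which columns and rows survive the read.

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for 2014.csv, with one unwanted column
sample_csv = io.StringIO(
    "Flight_Number_Reporting_Airline,Extra,OriginAirportID,ArrDelay\n"
    "2377,a,11298,-4.0\n"
    "2377,b,11298,20.0\n"
    "2377,c,12278,2.0\n"
)

wanted_cols = ["Flight_Number_Reporting_Airline", "OriginAirportID", "ArrDelay"]
wanted_dtypes = {
    "Flight_Number_Reporting_Airline": "int64",
    "OriginAirportID": "int64",
    "ArrDelay": "float64",
}

# usecols drops "Extra", dtype fixes the column types, nrows keeps the first 2 rows
small_df = pd.read_csv(sample_csv, usecols=wanted_cols, dtype=wanted_dtypes, nrows=2)

print(small_df.shape)   # (2, 3)
print(small_df.dtypes)
```

Declaring the dtypes up front means pandas does not have to infer them, and it makes a mistyped column fail loudly at read time rather than later in the analysis.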
2b. Load the data from myDF into Flight class instances. (When you are finished, you should have a list of 100 Flight instances.)
flights = []
for index, row in myDF.iterrows():
    flight = Flight(
        flight_number=row['Flight_Number_Reporting_Airline'],
        origin_airport_id=row['OriginAirportID'],
        dest_airport_id=row['DestAirportID'],
        dep_time=row['DepTime'],
        arr_time=row['ArrTime'],
        dep_delay=row['DepDelay'],
        arr_delay=row['ArrDelay']
    )
    flights.append(flight)
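One detail worth knowing about the loop above: iterrows() returns each row as a Series, and the pandas documentation notes that dtypes are not preserved across a row, so integer columns such as the flight number come back as floats when the row also holds float columns. The sketch below shows one possible alternative using itertuples(), which keeps each column's own type. It is not the project's required approach; `demo_df` is a hypothetical two-row stand-in for myDF built from the first two sample rows.

```python
import numbers

import pandas as pd


class Flight:
    def __init__(self, flight_number, origin_airport_id, dest_airport_id,
                 dep_time, arr_time, dep_delay, arr_delay):
        self.flight_number = flight_number
        self.origin_airport_id = origin_airport_id
        self.dest_airport_id = dest_airport_id
        self.dep_time = dep_time
        self.arr_time = arr_time
        self.dep_delay = dep_delay
        self.arr_delay = arr_delay

    def get_arrDelay(self):
        return self.arr_delay


# Hypothetical stand-in for myDF (two rows instead of 100)
demo_df = pd.DataFrame({
    "Flight_Number_Reporting_Airline": [2377, 2377],
    "OriginAirportID": [11298, 11298],
    "DestAirportID": [12278, 12278],
    "DepTime": [935.0, 951.0],
    "ArrTime": [1051.0, 1115.0],
    "DepDelay": [-5.0, 11.0],
    "ArrDelay": [-4.0, 20.0],
})

# itertuples yields one namedtuple per row; fields are read by column name
flights = [
    Flight(r.Flight_Number_Reporting_Airline, r.OriginAirportID, r.DestAirportID,
           r.DepTime, r.ArrTime, r.DepDelay, r.ArrDelay)
    for r in demo_df.itertuples(index=False)
]

# Compare with iterrows, where the mixed int/float row is upcast to float
first_row = next(demo_df.iterrows())[1]
print(isinstance(first_row["Flight_Number_Reporting_Airline"], numbers.Integral))
print(isinstance(flights[0].flight_number, numbers.Integral))

# The list of instances can then be queried through the class method
late_count = sum(1 for f in flights if f.get_arrDelay() > 0)
print(len(flights), late_count)
```

If the float-valued flight numbers are acceptable for the task at hand, the iterrows() version is fine as written; wrapping the ID fields in int() inside that loop is another option.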