Classes with the Flights Dataset

This example is from TDM 102 Project 11 Spring 2024.

This example depends on the following dataset:

  • /anvil/projects/tdm/data/flights/2014.csv

Learn more about the dataset here.

1a. Create a class named Flight, which contains attributes for the flight number, origin airport ID, destination airport ID, departure time, arrival time, departure delay, and arrival delay.

class Flight:
    def __init__(self, flight_number, origin_airport_id, dest_airport_id, dep_time, arr_time, dep_delay, arr_delay):
        self.flight_number = flight_number
        self.origin_airport_id = origin_airport_id
        self.dest_airport_id = dest_airport_id
        self.dep_time = dep_time
        self.arr_time = arr_time
        self.dep_delay = dep_delay
        self.arr_delay = arr_delay

1b. Add a method called get_arrDelay() to the class, which returns the arrival delay.

class Flight:
    def __init__(self, flight_number, origin_airport_id, dest_airport_id, dep_time, arr_time, dep_delay, arr_delay):
        self.flight_number = flight_number
        self.origin_airport_id = origin_airport_id
        self.dest_airport_id = dest_airport_id
        self.dep_time = dep_time
        self.arr_time = arr_time
        self.dep_delay = dep_delay
        self.arr_delay = arr_delay

    def get_arrDelay(self):
        return self.arr_delay
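
As a quick check of the method, the minimal sketch below builds one Flight by hand and calls get_arrDelay(). The argument values are copied from the second row of the sample output in Question 2a; the variable name sample_flight is only illustrative and is not part of the project.

```python
class Flight:
    def __init__(self, flight_number, origin_airport_id, dest_airport_id,
                 dep_time, arr_time, dep_delay, arr_delay):
        self.flight_number = flight_number
        self.origin_airport_id = origin_airport_id
        self.dest_airport_id = dest_airport_id
        self.dep_time = dep_time
        self.arr_time = arr_time
        self.dep_delay = dep_delay
        self.arr_delay = arr_delay

    def get_arrDelay(self):
        return self.arr_delay


# Illustrative instance (name and hand-typed values are assumptions for this sketch)
sample_flight = Flight(2377, 11298, 12278, 951.0, 1115.0, 11.0, 20.0)

# The method simply hands back the stored arr_delay attribute
print(sample_flight.get_arrDelay())  # 20.0
```

Going through the method rather than reading sample_flight.arr_delay directly gives the same value here; the method just provides a single named access point for the arrival delay.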

2a. Create a DataFrame named myDF, to store data from the 2014.csv data set. It suffices to import (only) the columns listed below, and to (only) read in the first 100 rows. Although we provide the columns_to_read, please make (and use) a dictionary of col_types like we did in Question 1 of Project 10.

import pandas as pd

pth = "/anvil/projects/tdm/data/flights/2014.csv"

cols = [
    'DepDelay', 'ArrDelay', 'Flight_Number_Reporting_Airline', 'Distance',
    'CarrierDelay', 'WeatherDelay',
    'DepTime', 'ArrTime', 'OriginAirportID',
    'DestAirportID', 'AirTime'
]

col_dtypes = {
    'DepDelay': 'float64',
    'ArrDelay': 'float64',
    'Flight_Number_Reporting_Airline': 'int64',
    'Distance': 'float64',
    'CarrierDelay': 'float64',
    'WeatherDelay': 'float64',
    'DepTime': 'float64',
    'ArrTime': 'float64',
    'OriginAirportID': 'int64',
    'DestAirportID': 'int64',
    'AirTime': 'float64'
}

# Read the CSV file with only the specified columns and the first 100 rows
myDF = pd.read_csv(pth, usecols=cols, dtype=col_dtypes, nrows=100)

print(myDF.head())

Flight_Number_Reporting_Airline  OriginAirportID  DestAirportID  DepTime  DepDelay  ArrTime  ArrDelay  AirTime  Distance  CarrierDelay  WeatherDelay
                           2377            11298          12278    935.0      -5.0   1051.0      -4.0     56.0     328.0
                           2377            11298          12278    951.0      11.0   1115.0      20.0     54.0     328.0          11.0           0.0
                           2377            12278          11298   1144.0       9.0   1302.0       2.0     57.0     328.0
                           2377            12278          11298   1134.0      -1.0   1253.0      -7.0     53.0     328.0
                           2377            12278          11298   1129.0      -6.0

2b. Load the data from myDF into instances of the Flight class. (When you are finished, you should have a list of 100 Flight instances.)

flights = []
for _, row in myDF.iterrows():
    # iterrows() returns each row as a single-dtype Series, so the int64
    # columns arrive as floats here; cast them back to int.
    flight = Flight(
        flight_number=int(row['Flight_Number_Reporting_Airline']),
        origin_airport_id=int(row['OriginAirportID']),
        dest_airport_id=int(row['DestAirportID']),
        dep_time=row['DepTime'],
        arr_time=row['ArrTime'],
        dep_delay=row['DepDelay'],
        arr_delay=row['ArrDelay']
    )
    flights.append(flight)
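
Once the flights are in a list, the get_arrDelay() method can be used to summarize them. The sketch below is one possible follow-up, not part of the original project: the helper average_arr_delay and the list sample_flights are hypothetical names. The first four flights are typed in by hand from the sample output in Question 2a so that the block runs without the CSV file, and the last flight, with every time and delay missing, is invented to show how NaN values are skipped.

```python
import math


class Flight:
    def __init__(self, flight_number, origin_airport_id, dest_airport_id,
                 dep_time, arr_time, dep_delay, arr_delay):
        self.flight_number = flight_number
        self.origin_airport_id = origin_airport_id
        self.dest_airport_id = dest_airport_id
        self.dep_time = dep_time
        self.arr_time = arr_time
        self.dep_delay = dep_delay
        self.arr_delay = arr_delay

    def get_arrDelay(self):
        return self.arr_delay


def average_arr_delay(flights):
    """Mean arrival delay, ignoring flights whose arrival delay is NaN."""
    delays = [f.get_arrDelay() for f in flights
              if not math.isnan(f.get_arrDelay())]
    # Return NaN rather than dividing by zero when no delay is recorded
    return sum(delays) / len(delays) if delays else float("nan")


nan = float("nan")
sample_flights = [
    Flight(2377, 11298, 12278, 935.0, 1051.0, -5.0, -4.0),
    Flight(2377, 11298, 12278, 951.0, 1115.0, 11.0, 20.0),
    Flight(2377, 12278, 11298, 1144.0, 1302.0, 9.0, 2.0),
    Flight(2377, 12278, 11298, 1134.0, 1253.0, -1.0, -7.0),
    # Hypothetical flight with no recorded times or delays (invented for this sketch)
    Flight(2377, 12278, 11298, nan, nan, nan, nan),
]

print(average_arr_delay(sample_flights))  # 2.75
```

With the real data, the same helper could be called as average_arr_delay(flights) on the list built in Question 2b.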