Superstore Sales

Source

The original Superstore Sales dataset is available for download from Kaggle.

Description of the Data

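A modified version of the Superstore Sales dataset is stored at 'anvil/projects/tdm/data/sales/Superstore_modified.csv'. It adds an 'Order Status' column. To create it, 15% of the unique 'Order ID' values were selected at random, the 'Ship Date' and 'Ship Mode' values were cleared on every row belonging to those orders, and those rows were labeled "Pending". All remaining rows were labeled "Shipped". Because the selection is made per order rather than per row, every row of a given order has the same status: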

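import numpy as np

# myDF is assumed to already hold the original Superstore Sales data
# as a pandas DataFrame.
order_ids = myDF["Order ID"].unique()

# Randomly pick 15% of the unique orders to treat as not yet shipped.
# No random seed is set, so each run selects a different set of orders.
unshipped_orders = np.random.choice(
    order_ids,
    size=int(0.15 * len(order_ids)),
    replace=False
)

# Clear the shipping details on every row of the selected orders.
myDF.loc[
    myDF["Order ID"].isin(unshipped_orders),
    ["Ship Date", "Ship Mode"]
] = np.nan

# Rows without a ship date are "Pending"; all other rows are "Shipped".
myDF["Order Status"] = np.where(
    myDF["Ship Date"].isna(),
    "Pending",
    "Shipped"
)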
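
To see how this procedure behaves, the sketch below applies the same steps to a small made-up table. The `toy` DataFrame, its values, and the seeded generator `rng` are assumptions for illustration only; they are not part of the dataset or of the code that produced the modified file. It checks that a multi-row order never ends up with mixed statuses.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Superstore data (made-up values, 8 distinct orders).
toy = pd.DataFrame({
    "Order ID": ["A-1", "A-1", "B-2", "C-3", "C-3",
                 "D-4", "E-5", "F-6", "G-7", "H-8"],
    "Ship Date": ["2017-01-05"] * 10,
    "Ship Mode": ["Standard Class"] * 10,
})

# Seeded generator so the illustration is reproducible (an assumption;
# the code above uses the unseeded np.random.choice).
rng = np.random.default_rng(seed=0)

order_ids = toy["Order ID"].unique()
n_pending = int(0.15 * len(order_ids))  # int(0.15 * 8) = int(1.2) = 1
unshipped = rng.choice(order_ids, size=n_pending, replace=False)

# Clear shipping details for every row of the selected order(s).
toy.loc[toy["Order ID"].isin(unshipped), ["Ship Date", "Ship Mode"]] = np.nan

# Derive the status from whether a ship date is present.
toy["Order Status"] = np.where(toy["Ship Date"].isna(), "Pending", "Shipped")

# Each order should carry exactly one distinct status across its rows.
statuses_per_order = toy.groupby("Order ID")["Order Status"].nunique()
print(toy["Order Status"].value_counts())
```

Note that `int(...)` truncates, so the number of pending orders is rounded down; with very few orders it can even be zero.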