TDM 20200: Project 14 — Spring 2024

Motivation: We covered a lot this year! When dealing with data-driven projects, it is crucial to explore the data thoroughly and to answer different questions to get a feel for it. There are always different ways one can go about this. Proper preparation prevents poor performance. As this is our final project for the semester, it is primarily a survey. You will answer a few questions, mostly by revisiting the projects you have completed.

Context: This is the last project, in which we revisit our previous work to consolidate our learning and insights. This reflection also helps us set our expectations for the upcoming semester.


Question 1 (2 pts)

  1. Reflecting on your experience working with different datasets, which one did you find most enjoyable, and why? Discuss how this dataset’s features influenced your analysis and visualization strategies. Illustrate your explanation with an example from one question that you worked on, using the dataset.

Question 2 (2 pts)

  1. Reflecting on your experience working with different commands, functions, and packages, which one is your favorite, and why do you enjoy learning about it? Please provide an example from one question that you worked on, using this command, function, or package.

Question 3 (2 pts)

  1. While working on the projects, including web scraping, data visualization, machine learning, and containerization, what steps did you take to ensure that the results were correct? Please illustrate your approach using an example from one problem that you addressed this semester.

Question 4 (2 pts)

  1. Reflecting on the projects that you completed, which question(s) did you feel were most confusing, and how could they be made clearer? Please use a specific question to illustrate your points.

Question 5 (2 pts)

  1. Please identify three data science skills or topics that you are interested in. You may choose from the following list or create your own. Please briefly explain, with examples, why you think each topic will be beneficial.

    • database optimization

    • containerization

    • machine learning

    • generative AI

    • deep learning

    • cloud computing

    • DevOps

    • GPU computing

    • data visualization

    • time series and spatial statistics

    • predictive analytics

    • (if you have other topics that you want Dr. Ward to add, please feel welcome to post on Piazza, and/or just add your own topics when you answer this question)

Project 14 Assignment Checklist

  • JupyterLab notebook with your answers and examples. You may just use Markdown format for all questions.

    • firstname-lastname-project14.ipynb

  • Submit files through Gradescope

Please make sure to double-check that your submission is complete and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended that you download your submission after submitting it, to make sure that what you think you submitted is what you actually submitted.

In addition, please review our submission guidelines before submitting your project.