TDM 20200: Python Project 14 — Spring 2025

Motivation: We hope that you have had the opportunity to learn a lot, and to improve your data science skills. For our final project of the semester, we want to provide you with the opportunity to give us your feedback on how we connected different concepts, built up skills, and incorporated real-world data throughout the semester, along with showcasing the skills you learned throughout the past 13 projects!

Context: This last project will work as a consolidation of everything we’ve learned thus far, and may require you to back-reference your work from earlier in the semester.

Scope: Python, Data Science

Learning Objectives:
  • Reflect on the semester’s content as a whole
  • Offer your thoughts on how the class could be improved in the future

Make sure to read about and use the template found here, and review the important information about project submissions here.

Questions

Question 1 (2 pts)

The Data Mine team is writing a Data Mine book to be (hopefully) published in 2025. We would love to have a couple of paragraphs about your Data Mine experience. What aspects of The Data Mine made the biggest impact on your academic, personal, and/or professional career? Would you recommend The Data Mine to a friend and/or would you recommend The Data Mine to colleagues in industry, and why? You are welcome to cover other topics too! Please also indicate (yes/no) whether it would be OK to publish your comments in our forthcoming Data Mine book in 2025.

Deliverables

  • Feedback and reflections about The Data Mine that we can potentially publish in a book in 2025.

Question 2 (2 pts)

Reflecting on your experience working with different datasets, which one did you find most enjoyable, and why? Discuss how this dataset’s features influenced your analysis and visualization strategies. Illustrate your explanation with an example from one question that you worked on, using the dataset.

Deliverables
  • A markdown cell describing your favorite dataset, why you enjoyed it, and a working example from a question you completed with that dataset.

Question 3 (2 pts)

While working on the projects, how did you validate the results that your code produced? For instance, did you try to solve problems in 2 different ways? Or did you try to make summaries and/or visualizations? How did you prefer to explore data and learn about data? Are there better ways that you would suggest for future students (and for our team too)? Please illustrate your approach using an example from one problem that you addressed this semester.

Deliverables
  • A few sentences in a markdown cell on how you conducted your work, and a relevant working example.
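
As one illustration of the "solve it in 2 different ways" idea from this question, here is a minimal sketch that uses only the Python standard library. The data and the name `delays` are made up for illustration and do not come from any course dataset; your own example should use a problem you actually worked on.

```python
import math
import statistics

# Hypothetical sample data (not from any course dataset): delays in minutes.
delays = [12.0, 0.0, 45.5, 3.0, 7.5, 0.0, 30.0]

# Method 1: compute the mean by hand.
mean_manual = sum(delays) / len(delays)

# Method 2: compute the mean with the standard library.
mean_library = statistics.mean(delays)

# The two approaches should agree up to floating-point rounding.
assert math.isclose(mean_manual, mean_library), "the two methods disagree"

# A quick summary is another sanity check: the mean must lie between min and max.
summary = {
    "count": len(delays),
    "min": min(delays),
    "max": max(delays),
    "mean": mean_manual,
}
assert summary["min"] <= summary["mean"] <= summary["max"]
print(summary)
```

If the two methods disagree, that is a signal to look for missing values, mismatched filters, or an off-by-one error before trusting either answer.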

Question 4 (2 pts)

Reflecting on the projects that you completed, which question(s) did you feel were most confusing, and how could they be made clearer? Please cite specific questions and explain both how they confused you and how you would recommend improving them.

Deliverables
  • A few sentences in a markdown cell on which questions from projects you found confusing, and how they could be written better/more clearly, along with specific examples.

Question 5 (2 pts)

Please identify 3 skills or topics related to the Python language or data science (in general) that you wish we had covered in our projects. For each, please provide an example that illustrates your interests, and the reason that you think they would be beneficial.

Deliverables
  • A markdown cell containing 3 skills/topics that you think we should’ve covered in the projects, and an example of why you believe these topics or skills could be relevant and beneficial to students going through the course.

OPTIONAL but encouraged:

Please connect with Dr. Ward on LinkedIn: www.linkedin.com/in/mdw333/

and also please follow our Data Mine LinkedIn page: www.linkedin.com/company/purduedatamine/

and join our Data Mine alumni page: www.linkedin.com/groups/14550101/

Submitting your Work

Please make sure that you have added comments for each question that explain your thinking and your method of solving it. Please also make sure that your work is your own, and that any outside sources (people, internet pages, generative AI, etc.) are cited properly in the project template.

If you have any questions or issues regarding this project, please feel free to ask in seminar, over Piazza, or during office hours.

Prior to submitting your work, you need to put your work into the project template, re-run all of the code in your Jupyter notebook, and make sure that the results of running that code are visible in your template. Please check the detailed instructions on how to ensure that your submission is formatted correctly. To download your completed project, you can right-click on the file in the file explorer and click 'download'.

Once you upload your submission to Gradescope, make sure that everything appears as you would expect to ensure that you don’t lose any points.

Items to submit
  • firstname_lastname_project14.ipynb

It is necessary to document your work, with comments about each solution. All of your work needs to be your own work, with citations to any source that you used.

You must double check your .ipynb after submitting it in Gradescope. A very common mistake is to assume that your .ipynb file has rendered properly and contains your code, markdown, and code output when it may not.

Please take the time to double check your work. See here for instructions on how to double check this.

You will not receive full credit if your .ipynb file does not contain all of the information you expect it to, or if it does not render properly in Gradescope. Please ask a TA if you need help with this.
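
Before uploading, a quick local check can catch code cells whose output was never saved. The sketch below is not the official checking procedure and does not replace looking at the rendered file in Gradescope. It assumes the standard notebook JSON layout (a top-level `cells` list where each code cell has an `outputs` list), and the helper name `cells_missing_output` is hypothetical.

```python
import json

def cells_missing_output(notebook: dict) -> list:
    """Return indices of non-empty code cells that have no saved output."""
    missing = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells never carry outputs
        # "source" may be stored as a single string or a list of lines.
        source = "".join(cell.get("source", []))
        if source.strip() and not cell.get("outputs"):
            missing.append(index)
    return missing

# Tiny in-memory example in the notebook JSON layout (illustrative only).
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Question 1"]},
        {"cell_type": "code", "source": ["print(1 + 1)"],
         "outputs": [{"output_type": "stream", "name": "stdout", "text": ["2\n"]}]},
        {"cell_type": "code", "source": ["x = 5"], "outputs": []},
    ]
}
print(cells_missing_output(example))  # prints [2]

# To check a real file, load it first, for example:
# with open("firstname_lastname_project14.ipynb", encoding="utf-8") as f:
#     print(cells_missing_output(json.load(f)))
```

Some cells legitimately produce no output (for example, a plain assignment like the `x = 5` cell above), so treat the returned indices as cells to review rather than as errors.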