Fall 2022 Syllabus - The Data Mine Seminar

Course Information

Course Number and Title: CRN

TDM 10100 - The Data Mine I: possible CRNs are 12067 or 12072 or 12073 or 12071

TDM 20100 - The Data Mine III: possible CRNs are 12117 or 12106 or 12113 or 12118

TDM 30100 - The Data Mine V: possible CRNs are 12104 or 12112 or 12115 or 12120

TDM 40100 - The Data Mine VII: possible CRNs are 12103 or 12111 or 12114 or 12119

TDM 50100 - The Data Mine Seminar: CRN 15644

Course credit hours: 1 credit hour, so you should expect to spend about 3 hours per week doing work for the class

Prerequisites: None for TDM 10100. All students, regardless of background, are welcome. Typically, students new to The Data Mine sign up for TDM 10100, and students in their second, third, or fourth years of The Data Mine sign up for TDM 20100, TDM 30100, or TDM 40100, respectively. TDM 50100 is geared toward graduate students. However, during the first week of the semester (only), if a student new to The Data Mine has several years of data science experience and would prefer to switch from TDM 10100 to TDM 20100, we can make adjustments on an individual basis.

Course Web Pages

Meeting Times

There are officially 4 Monday class times: 8:30 am, 9:30 am, 10:30 am (all in the Hillenbrand Dining Court atrium; no meal swipe required), and 4:30 pm (synchronous online, recorded and posted later). Attendance is not required.

All the information you need to work on the projects each week will be provided online on the Thursday of the previous week, and we encourage you to get a head start on the projects before class time. Dr. Ward does not lecture during the class meetings, but this is a good time to ask questions and get help from Dr. Ward, the T.A.s, and your classmates. The T.A.s will have many daytime and evening office hours throughout the week.

Course Description

The Data Mine is a supportive environment for students in any major from any background who want to learn some data science skills. Students will have hands-on experience with computational tools for representing, extracting, manipulating, interpreting, transforming, and visualizing data, especially big data sets, and in effectively communicating insights about data. Topics include: the R environment, Python, visualizing data, UNIX, bash, regular expressions, SQL, XML and scraping data from the internet, as well as selected advanced topics, as time permits.

Learning Outcomes

By the end of the course, you will be able to:

  1. Discover data science and professional development opportunities in order to prepare for a career.

  2. Explain the difference between research computing and basic personal computing data science capabilities in order to know which system is appropriate for a data science project.

  3. Design efficient search strategies in order to acquire new data science skills.

  4. Devise the most appropriate data science strategy in order to answer a research question.

  5. Apply data science techniques in order to answer a research question about a big data set.

Required Materials

  • A laptop so that you can easily work with others. Having audio/video capabilities is useful.

  • Brightspace and Gradescope course pages.

  • Access to Jupyter Lab at the On Demand Gateway on Anvil: ondemand.anvil.rcac.purdue.edu/

  • "The Examples Book": the-examples-book.com

  • Good internet connection.

Attendance Policy

Attendance is not required.

When conflicts or absences can be anticipated, such as for many University-sponsored activities and religious observations, the student should inform the instructor of the situation as far in advance as possible.

For unanticipated or emergency absences when advance notification to the instructor is not possible, the student should contact the instructor or TA as soon as possible by email or phone. When the student is unable to make direct contact with the instructor and is unable to leave word with the instructor’s department because of circumstances beyond the student’s control, and in cases falling under excused absence regulations, the student or the student’s representative should contact or go to the Office of the Dean of Students website to complete appropriate forms for instructor notification. Under academic regulations, excused absences may be granted for cases of grief/bereavement, military service, jury duty, parenting leave, and medical excuse. For details, see the Academic Regulations & Student Conduct section of the University Catalog website.

How to succeed in this course

If you would like to be a successful Data Mine student:

  • Start on the weekly projects on or before Mondays so that you have plenty of time to get help from your classmates, TAs, and Data Mine staff. Don’t wait until the due date to start!

  • Be excited to challenge yourself and learn impressive new skills. Don’t get discouraged if something is difficult. You’re here because you want to learn, not because you already know everything!

  • Remember that Data Mine staff and TAs are excited to work with you! Take advantage of us as resources.

  • Network! Get to know your classmates, even if you don’t see them in an actual classroom. You are all part of The Data Mine because you share interests and goals. You have over 800 potential new friends!

  • Use "The Examples Book", which has lots of explanations and examples to get you started. Google, Stack Overflow, etc. are all great, but "The Examples Book" has been carefully put together to be the most useful to you: the-examples-book.com

  • Expect to spend approximately 3 hours per week on the projects. Some might take less time, and occasionally some might take more.

  • Don’t forget about the syllabus quiz, academic integrity quiz, and outside event reflections. They all contribute to your grade and are part of the course for a reason.

  • If you get behind or feel overwhelmed about this course or anything else, please talk to us!

  • Stay on top of deadlines. Announcements will also be sent out every Monday morning, but you should keep a copy of the course schedule where you can see it easily.

  • Read your emails!

Information about the Instructors

The Data Mine Staff

Name - Title

Shared email we all read: [email protected]

Kevin Amstutz - Senior Data Scientist and Instruction Specialist

Maggie Betz - Managing Director of Corporate Partnerships

Shuennhau Chang - Corporate Partners Senior Manager

David Glass - Managing Director of Data Science

Kali Lacy - Associate Research Engineer

Naomi Mersinger - ASL Interpreter / Strategic Initiatives Coordinator

Kim Rechkemmer - Senior Program Administration Specialist

Nick Rosenorn - Corporate Partners Technical Specialist

Katie Sanders - Operations Manager

Rebecca Sharples - Managing Director of Academic Programs & Outreach

Dr. Mark Daniel Ward - Director

The Data Mine Team uses a shared email which functions as a ticketing system. Using a shared email helps the team manage the influx of questions, better distribute questions across the team, and send out faster responses.

For the purposes of getting help with this 1-credit seminar class, your most important people are:

  • T.A.s: Visit their office hours and use the Piazza site

  • Mr. Kevin Amstutz, Senior Data Scientist and Instruction Specialist: Piazza is the preferred method for questions

  • Dr. Mark Daniel Ward, Director: Dr. Ward responds to questions on Piazza faster than by email

Communication Guidance

  • For questions about how to do the homework, use Piazza or visit office hours. You will receive the fastest response by using Piazza rather than emailing us.

  • For general Data Mine questions, email [email protected]

  • For regrade requests, use Gradescope’s regrade feature within Brightspace. Regrades should be requested within 1 week of the grade being posted.

Office Hours

Office hours are held in person in the Hillenbrand lobby and on Zoom. Check the schedule to see the available times.

Piazza is an online discussion board where students can post questions at any time, and Data Mine staff or T.A.s will respond. Piazza is available through Brightspace. There are private and public postings. Last year we had over 11,000 interactions on Piazza, and the typical response time was around 5-10 minutes!

Assignments and Grades

Course Schedule & Due Dates

See the schedule and later parts of the syllabus for more details, but here is an overview of how the course works:

In the first week of the semester, you will have some "housekeeping" tasks to do, which include taking the Syllabus quiz and Academic Integrity quiz.

Generally, every week from the very beginning of the semester, you will have your new projects released on a Thursday, and they are due 8 days later on the Friday at 11:55 pm Purdue West Lafayette (Eastern) time. You will need to do 3 Outside Event reflections.
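The release and due-date rule above can be sketched in a few lines of Python. This is only an illustration, not an official tool: the function name `project_due` and the example release date are hypothetical, and the returned time is a plain local (Eastern) time with no time zone handling. The course schedule remains the authority for actual dates.

```python
from datetime import date, datetime, time, timedelta


def project_due(release_day: date) -> datetime:
    """Return the due date and time (local Eastern time) for a project released on a Thursday."""
    if release_day.weekday() != 3:  # Monday is 0, so Thursday is 3
        raise ValueError("projects are released on Thursdays")
    due_day = release_day + timedelta(days=8)  # 8 days after a Thursday is a Friday
    return datetime.combine(due_day, time(23, 55))  # 11:55 pm


# Hypothetical release date, chosen only because it falls on a Thursday.
due = project_due(date(2022, 8, 25))
print(due.strftime("%Y-%m-%d %H:%M"))  # prints 2022-09-02 23:55
```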

We will have 13 weekly projects available, but we only count your best 10. This means you could miss up to 3 projects due to illness or other reasons, and it won’t hurt your grade. We suggest trying to do as many projects as possible so that you can keep up with the material. The projects are much less stressful if they aren’t done at the last minute, and it is possible that our systems will be stressed if you wait until Friday night, causing unexpected behavior and long wait times. Try to start your projects on or before Monday each week to leave yourself time to ask questions.

The Data Mine does not conduct or collect an assessment during the final exam period. Therefore, TDM Courses are not required to follow the Quiet Period in the Academic Calendar.

Projects (best 10 out of Projects #1-13): 86%

Outside event reflections (3 total): 12%

Academic Integrity Quiz: 1%

Syllabus Quiz: 1%

Total: 100%
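As a rough illustration of how these weights combine, here is a minimal Python sketch. It assumes each project is scored out of 10 points (as stated in the Projects section) and that the reflections and quizzes are expressed as fractions between 0 and 1; that scale, the function name `course_percentage`, and the sample scores are assumptions made for the example, not something the syllabus specifies. Brightspace remains the official record of your grade.

```python
def course_percentage(project_scores, reflection_scores, integrity_quiz, syllabus_quiz):
    """Combine the course components using the syllabus weights (86/12/1/1).

    project_scores: points out of 10 for each submitted project (at most 13).
    reflection_scores, integrity_quiz, syllabus_quiz: fractions between 0 and 1
    (an assumed scale; the syllabus does not say how these are scored).
    """
    # Keep only the best 10 projects; unsubmitted projects simply contribute nothing.
    best_ten = sorted(project_scores, reverse=True)[:10]
    project_fraction = sum(best_ten) / 100  # 10 projects x 10 points each
    # Three reflections are required, so a missing one counts as 0.
    reflection_fraction = sum(reflection_scores[:3]) / 3
    return (86 * project_fraction
            + 12 * reflection_fraction
            + 1 * integrity_quiz
            + 1 * syllabus_quiz)


# Made-up scores for illustration: two missed projects and one low score are dropped.
projects = [10, 9, 10, 8, 0, 10, 10, 7, 10, 9, 10, 0, 10]
print(round(course_percentage(projects, [1, 1, 1], 1, 1), 2))  # prints 96.56
```

In this example the two zeros and the 7 are the three dropped scores, so the best 10 projects total 96 of 100 points.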

Grading Scale

In this class, grades reflect your achievement throughout the semester in the various course components listed above. Your grades will be maintained in Brightspace. This course will follow the 90-80-70-60 grading scale for A, B, C, D cut-offs. If you earn a 90.000 in the class, for example, that is a solid A. +/- grades will be given at the instructor's discretion below these cut-offs. If you earn an 89.11 in the class, for example, this may be an A- or a B.

  • A: 100.000% - 90.000%

  • B: 89.999% - 80.000%

  • C: 79.999% - 70.000%

  • D: 69.999% - 60.000%

  • F: 59.999% - 0.000%
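The cut-offs above amount to a simple lookup, sketched below in Python. The function name `base_letter` is hypothetical, and the sketch returns only the base letter: plus and minus grades are given at the instructor's discretion, so they are deliberately not modeled here.

```python
def base_letter(percentage: float) -> str:
    """Map a course percentage to its base letter using the 90-80-70-60 cut-offs."""
    for cutoff, letter in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if percentage >= cutoff:
            return letter
    return "F"


print(base_letter(90.000))  # prints A
print(base_letter(89.999))  # prints B
```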

Late Policy

We generally do NOT accept late work. For the projects, we count only your best 10 out of 13, so that gives you a lot of flexibility. We need to be able to post answer keys for the rest of the class in a timely manner, and we can’t do this if we are waiting for other students to turn their work in.

Projects

  • The projects will help you achieve Learning Outcomes #2-5.

  • Each weekly programming project is worth 10 points.

  • There will be 13 projects available over the semester, and your best 10 will count.

  • The 3 project grades that are dropped could be from illnesses, absences, travel, family emergencies, or simply low scores. No excuses necessary.

  • No late work will be accepted, even if you are having technical difficulties, so do not work at the last minute.

  • There are many opportunities to get help throughout the week, either through Piazza or office hours. We’re waiting for you! Ask questions!

  • Follow the instructions for how to submit your projects properly through Gradescope in Brightspace.

  • It is ok to get help from others or online, although it is important to document this help in the comment sections of your project submission. You need to say who helped you and how they helped you.

  • Each week, the project will be posted on the Thursday before the seminar. It will be the topic of the seminar and of any office hours that week, and it will be due by 11:55 pm Eastern time on the following Friday. See the schedule for specific dates.

  • If you need to request a regrade on any part of your project, use the regrade request feature inside Gradescope. The regrade request needs to be submitted within one week of the grade being posted (we send an announcement about this).

Outside Event Reflections

  • The Outside Event reflections will help you achieve Learning Outcome #1. They are an opportunity for you to learn more about data science applications, career development, and diversity.

  • You are required to attend 3 of these over the semester, with 1 due each month. See the schedule for specific due dates. Feel free to complete them early.

    • Outside Event Reflections must be submitted within 1 week of attending the event or watching the recording.

    • At least one of these events should be on the topic of Professional Development (designated by "PD" on the schedule).

  • Outside events are posted on The Data Mine’s website (datamine.purdue.edu/events/), which is updated frequently. Let us know about any good events you hear about.

  • Format of Outside Events:

    • Often in person so you can interact with the presenter!

    • Occasionally online and possibly recorded

  • Follow the instructions in Gradescope for writing and submitting these reflections. Include:

    • Name of the event and speaker

    • The time and date of the event

    • What was discussed at the event

    • What you learned from it

    • What new ideas you would like to explore as a result of what you learned at the event

    • AND what question(s) you would like to ask the presenter if you met them at an after-presentation reception.

  • We read every single reflection! We care about what you write! We have used these connections to provide new opportunities for you, to thank our speakers, and to learn more about what interests you.

Academic Integrity

Academic integrity is one of the highest values that Purdue University holds. Individuals are encouraged to alert university officials to potential breaches of this value by either emailing or by calling 765-494-8778. While information may be submitted anonymously, the more information that is submitted provides the greatest opportunity for the university to investigate the concern.

In TDM 10100/20100/30100/40100/50100, we encourage students to work together. However, there is a difference between good collaboration and academic misconduct. We expect you to read over this list, and you will be held responsible for violating these rules. We are serious about protecting the hard-working students in this course. We want a grade for The Data Mine seminar to have value for everyone and to represent what you truly know. We may punish both the student who cheats and the student who allows or enables another student to cheat. Punishment could include receiving a 0 on a project, receiving an F for the course, and having the incident of academic misconduct reported to the Office of the Dean of Students.

Good Collaboration:

  • First try the project yourself, on your own.

  • After trying the project yourself, then get together with a small group of other students who have also tried the project themselves to discuss ideas for how to do the more difficult problems. Document in the comments section any suggestions you took from your classmates or your TA.

  • Finish the project on your own so that what you turn in truly represents your own understanding of the material.

  • Look up potential solutions for how to do part of the project online, but document in the comments section where you found the information.

  • If the assignment involves writing a long, worded explanation, you may proofread somebody’s completed written work and allow them to proofread your work. Do this only after you have both completed your own assignments, though.

Academic Misconduct:

  • Dividing up the problems among a group. (You do #1, I’ll do #2, and he’ll do #3: then we’ll share our work to get the assignment done more quickly.)

  • Attending a group work session without having first worked all of the problems yourself.

  • Allowing your partners to do all of the work while you copy answers down, or allowing an unprepared partner to copy your answers.

  • Letting another student copy your work or doing the work for them.

  • Sharing files or typing on somebody else’s computer or in their computing account.

  • Getting help from a classmate or a TA without documenting that help in the comments section.

  • Looking up a potential solution online without documenting that help in the comments section.

  • Reading someone else’s answers before you have completed your work.

  • Having a tutor or TA work through all (or some) of your problems for you.

  • Uploading, downloading, or using old course materials from Course Hero, Chegg, or similar sites.

  • Using the same outside event reflection (or parts of it) more than once. Using an outside event reflection from a previous semester.

  • Using somebody else’s outside event reflection rather than attending the event yourself.

The Purdue Honor Pledge: "As a Boilermaker pursuing academic excellence, I pledge to be honest and true in all that I do. Accountable together - we are Purdue."

Please refer to the student guide for academic integrity for more details.

Disclaimer

This syllabus is subject to change. Changes will be made by an announcement in Brightspace and the corresponding course content will be updated.