TA Training Module 1: Exploring The Data Mine

An Innovative, Immersive Living Learning Community

The sheer number of personal technology devices has led to an explosion in the amount of raw data produced in society. Twenty billion devices are now connected to the internet; it is estimated that by 2030, that number will rise to 1 trillion. All of these devices are constantly producing data.

Data scientists are the critical link that transforms this data into information that can be applied to solve real-world problems. The study of this data affects every citizen and is needed in every sector. Data science solutions are applied to everything from preventive maintenance in manufacturing to food science initiatives that address food insecurity.

The Data Mine is a living, learning and research-based community created to introduce students to data science concepts and equip them to create solutions to real-world problems. Members of The Data Mine will be part of a team, living, studying and, ultimately, performing data-driven research together. The Data Mine is part of Purdue University’s Integrative Data Science Initiative, which is designed to train students across all majors with the data literacy needed to succeed in a data-driven world.

The Data Mine experience gives students the opportunity to live and learn in a community that revolves around a particular topic or theme. The topics and themes studied within The Data Mine incorporate components relevant to any field of study. Students in each community will live in Hillenbrand Hall, where they will have access to renovated study space, upgraded learning technology and hands-on interaction with faculty within the hall.

The Data Mine is open to students from any major of study. Students will learn some of the skills most sought after by companies and graduate programs. No computational background is required. The key trait for joining The Data Mine is the desire to learn data science in a rigorous but welcoming environment.

Living and Learning

The National Science Foundation awarded Purdue University $1.5 million to implement the Statistics Living Learning Community (STAT-LLC) in Hillenbrand Hall from 2014 to 2019.

We have come a long way since we posted our first flyer in 2013, announcing the student application process for the STAT-LLC.

Figure 1. Hillenbrand Hall

Hillenbrand was selected as the University Residence of the Year for 2016-17.

Our students in the STAT-LLC published papers and delivered talks at conferences more than 150 times during the 2014-19 life of our NSF grant.

In 2018-19, we scaled this effort up by introducing The Data Mine to approximately 100 students on campus. In 2019-20, we scaled it up again, to more than 600 students. We have continued to grow, with about 1000 students in 2021-22 and 1300+ students in 2022-23.

In many ways, Hillenbrand is one of the best places to live on campus. It features two student organizations: The Phoenix Club and The Data Mine Advisory Board. Hillenbrand has its own dining court. Fifty students live on each floor of Hillenbrand’s two towers. Each tower has 8 floors, for a total of 800 students. The supportive faculty and staff members who work in Hillenbrand are also crucial to the success of this student-centered initiative.

The Data Mine is structured into 20 learning communities of approximately 25 or 50 students each. The students participate in courses, seminars, research, and professional development experiences, which are offered by departments, research centers, and colleges throughout the university.

In addition, roughly 600 students participate in The Data Mine’s Corporate Partners program, which enables the students to work directly with employees of companies or national laboratories on data-driven projects.

What all students have in common

  • All have the desire to learn data science in a rigorous but welcoming environment.

  • All take a 1-credit-hour seminar (TDM 101000) one day a week to learn new skills.

  • A growing number of students are in TDM 201000, TDM 301000, and TDM 401000. These are students in their second, third, and fourth year in The Data Mine.

  • The Data Mine is open to students from any major of study. No computational background is required.

  • Whatever their backgrounds, all are welcome and have the potential to do great things! This diversity is a strength of this community.

  • Many may be nervous, and our job is to teach and encourage them.

  • There are 1300+ students in The Data Mine this year!

The National Data Mine Network

The National Data Mine Network (NDMN) is a collaborative project between The Purdue Data Mine and the American Statistical Association that will enable undergraduate students at minority-serving institutions to learn fundamental and advanced data science. These undergraduates will engage with industry partners and Purdue researchers on academic-year-long data science projects. To learn more about the NDMN, please read this featured article from Purdue Research News.

The Indiana Data Mine

The Indiana Data Mine is another extension of The Data Mine, this one to colleges and universities within Indiana. Funded by a $10 million grant from Lilly Endowment Inc., The Indiana Data Mine will give students at these institutions access to fundamental data science education and unique engagement opportunities with researchers and corporate partners. A primary goal of The Indiana Data Mine is to strengthen Indiana's already growing tech sector. More information can be found in this article from Purdue Research News.

Specialty Learning Communities

In specialty learning communities, students may take classes as a cohort, perform undergraduate research projects, or work with a corporate partner within some of the following research and academic fields:

  • Actuarial Science

  • Agriculture

  • Analyzing Digital Gaming

  • Biology

  • Computational Investigation of Living Systems

  • Corporate Partners

  • Data Visualization

  • Data in the Health and Human Sciences

  • Earth & Atmospheric Sciences

  • Nursing

  • Krannert

  • Pharmacy and Drug Discovery

  • Physics

  • Scalable Asymmetric Lifecycle Management

  • Statistics

  • Vertically Integrated Projects

Seminar Courses: TDM 101000, 201000, 301000, 401000

  • Class normally meets in the Hillenbrand dining hall atrium at lunchtime or dinnertime.

  • Students work on weekly projects (usually using R, Python, SQL, or UNIX) with approximately 3-5 questions.

  • TAs help students while they work, during online office hours, or through the Piazza online discussion board.

Leadership Introductions

To learn more about The Data Mine Leadership, please refer to datamine.purdue.edu/about/welcome.html