Anvil Resource Guide

Purdue’s Anvil high performance computing (HPC) cluster allows users to take advantage of world-class resources for use in their experiential learning projects. Anvil contains both CPU and GPU resources that are available to student teams.

The sections below are designed to help users understand the resources that are available on Anvil and the different request processes to ensure that The Data Mine can provide the tools needed for the experiential learning projects.

Requesting GPU Resources

By default, students are only given access to the CPU resources on Anvil. This is due to two factors.

One is that Anvil is designed as a primarily CPU first system. Due to this, it has less GPU resources available when compared with other HPC clusters that are designed for primarily GPU usage.

The second is that GPU resources are more expensive when compared to their CPU counterparts. On the Anvil system, the GPU resources utilize credits roughly 66 times faster than CPU resources.

If a team would like to use GPU resources as part of their project, please send an email to [email protected] with your Team name and a brief overview of the requirement. The Data Mine team will follow up to add students to the correct permissions groups.

Data Processing Guidelines

Anvil is an amazing system, with a plethora of resources. However, The Data Mine does need to be good stewards of the resources that are provided to us. Every year we work with the Rosen Center for Advanced Computing (RCAC) to submit an allocation request to the ACCESS review board. This request gives us a set number of resources and storage that we provide to students for use on Anvil.

If you know that your project is going to require a large amount of resources (memory, storage, processing, etc.) please reach out to The Data Mine at [email protected] before starting anything. Please include your team’s name and a brief overview of your use case in the ticket description. This allows our technical team to work with you to ensure that the resources are available without limiting the resources that are available to any of the other teams on the shared HPC environment.

Additional Processing Power

The Jupyter Notebook sessions that The Data Mine provides have a limited number of cores available to them at launch. This is by design to ensure that we are good stewards of the HPC resources. In addition, its unlikely additional cores will improve performance unless you are utilizing specific libraries designed to bypass Python’s GIL and/or launch multiple Python processes.

However, we do recognize that teams will need the ability to launch sessions with larger number of cores in special situations.

If you feel that your team requires additional capacity, please email [email protected] with your team’s name and a description of your use case. Our team will reach out to you to discuss and ensure that we can provide the resources required.