The field of Machine Learning and Artificial Intelligence (ML and AI) is rapidly evolving and is increasingly being applied to diverse research domains. One challenge faced by the ML community is the growing need for scalable resources to make training and deploying models efficient or, in some cases, possible at all. Storage needs increase as datasets grow in size, training models demands more CPUs, GPUs, and RAM, and longer training times make it more attractive to run computation somewhere other than your own machine. As a result, many researchers using ML are moving onto High Performance Computing (HPC) systems to gain access to storage and compute resources.
Because ML is applied in a variety of scientific domains beyond computer science, many researchers who wish to harness the power of ML and AI are confronted with two learning curves: one for data science and ML, and one for using high performance computing facilities. Existing HPC training does not always cater for the needs of the Machine Learning community.
As such, this training course acknowledges the confluence of Machine Learning and High Performance Computing, and provides the tools and resources necessary to start moving your ML models onto HPC. This course is not an introduction to ML, as many great courses on that topic already exist. Instead, it focusses on the challenge of moving from your local environment (laptop, desktop, workstation) into an HPC environment.
This course has been designed as a series of self-contained modules, to make it easy for learners to “choose their own adventure”. Some learners may want to work through every module; others may want only the bare essentials to get going, and return to these materials when they need more information. These materials have also been designed so that they can be completed without an instructor, but if you get stuck you can always reach out for assistance by emailing help@massive.org.au.
[list of the most core content]
This course has been developed as part of the ARDC-funded “Environments to Accelerate Machine Learning Based Discovery” project.
Prerequisites
This lesson assumes you are familiar with the Unix command line, and that you have introductory knowledge of Machine Learning. Example Machine Learning scripts in this course will be in Python. If you can navigate a filesystem on the command line, including making files, and run some basic ML models in Python, you’re the target audience for this course. We assume no HPC knowledge.
We also assume you have access to the M3 cluster for running the exercises we provide as part of the course.
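As a rough self-check for the Python prerequisite above, the sketch below shows the kind of "basic ML model" we have in mind: a straight-line fit trained with gradient descent. It is only an illustration and not part of the course exercises. The function name `fit_line`, the toy data, and the learning rate and epoch count are all assumptions chosen for this example, and it uses only the Python standard library.

```python
# Illustrative self-check only: fit y = w*x + b with plain gradient descent.
# The function name, toy data, and hyperparameters are hypothetical choices.

def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Return (w, b) minimising mean squared error on the given points."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should land near w=2, b=1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(f"w={w:.2f}, b={b:.2f}")
```

If you can read this script, save it to a file from the command line, and run it with `python`, you likely have the background this course assumes.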