GPU Usage

Problem Statement: MSOE's Rosie Supercomputer is a powerful computing resource used by students and faculty for research and academic projects. However, the existing visualization of workloads was limited in functionality and user experience. Hardware specifications and workload distribution were available via an API hosted on Rosie, but there was no user-friendly interface to visualize this data effectively.
Solution: I developed a comprehensive web dashboard using React.js and TailwindCSS that interfaces with Rosie's existing API to provide real-time visualization of hardware specifications and workload distribution. The dashboard features interactive charts and graphs that allow users to easily monitor system performance, job statuses, and resource allocation. The interface is designed to be intuitive and user-friendly, making it accessible for both technical and non-technical users. You can check it out live at https://alex-j-lopez.github.io/MSOE-Rosie-Supercomputer-Dashboard/
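The write-up does not say how the dashboard keeps its charts current, so the following is only a minimal sketch of one common approach: polling the API on a fixed interval. The function name `startPolling`, its parameters, and the endpoint in the usage comment are all hypothetical and are not taken from Rosie's actual API.

```typescript
// Hypothetical helper: calls `load` right away and then every `intervalMs`
// milliseconds, handing each result to `onData`. Returns a function that
// stops the polling loop.
function startPolling<T>(
  load: () => Promise<T>,
  onData: (data: T) => void,
  onError: (err: unknown) => void,
  intervalMs: number
): () => void {
  let stopped = false;

  const tick = (): void => {
    load().then(
      (data) => {
        // Ignore responses that arrive after polling was stopped.
        if (!stopped) onData(data);
      },
      (err) => {
        if (!stopped) onError(err);
      }
    );
  };

  tick(); // first request is issued immediately
  const timer = setInterval(tick, intervalMs);

  return () => {
    stopped = true;
    clearInterval(timer);
  };
}

// Possible usage (the URL is a placeholder, not Rosie's real endpoint):
// const stop = startPolling(
//   () => fetch("https://example.invalid/api/nodes").then((r) => r.json()),
//   (nodes) => console.log("fresh data", nodes),
//   (err) => console.error("request failed", err),
//   5000
// );
// ...later, e.g. when the component unmounts: stop();
```

Passing the loader in as a parameter keeps the helper independent of any particular endpoint. In a React component, the returned stop function would typically be called from an effect cleanup.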
Skills Used:

The GPU visualizations show the temperature and usage of each GPU on a per-node basis. This allows users to quickly identify which nodes are under heavy load and which ones are idle.
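As an illustration of how per-GPU readings could be rolled up into a per-node "heavy" or "idle" label, here is a small sketch. The `GpuReading` shape, the field names, and the 10% and 80% thresholds are assumptions made for the example; the real API fields and any thresholds used by the dashboard are not documented here.

```typescript
// Hypothetical shape of one GPU reading.
interface GpuReading {
  node: string;
  gpuIndex: number;
  utilizationPct: number; // 0 to 100
  temperatureC: number;
}

type LoadLevel = "idle" | "moderate" | "heavy";

interface NodeGpuSummary {
  meanUtilizationPct: number;
  maxTemperatureC: number;
  level: LoadLevel;
}

// Illustrative thresholds, chosen only for this example.
const IDLE_BELOW_PCT = 10;
const HEAVY_FROM_PCT = 80;

function classifyLoad(utilizationPct: number): LoadLevel {
  if (utilizationPct < IDLE_BELOW_PCT) return "idle";
  if (utilizationPct >= HEAVY_FROM_PCT) return "heavy";
  return "moderate";
}

// Groups readings by node, then reports the mean utilization, the hottest
// GPU temperature, and a load label for each node.
function summarizeGpuLoad(readings: GpuReading[]): Record<string, NodeGpuSummary> {
  const byNode: Record<string, GpuReading[]> = {};
  for (const reading of readings) {
    if (!byNode[reading.node]) byNode[reading.node] = [];
    byNode[reading.node].push(reading);
  }

  const summary: Record<string, NodeGpuSummary> = {};
  for (const node of Object.keys(byNode)) {
    const gpus = byNode[node];
    const mean = gpus.reduce((sum, g) => sum + g.utilizationPct, 0) / gpus.length;
    const maxTemp = Math.max(...gpus.map((g) => g.temperatureC));
    summary[node] = {
      meanUtilizationPct: mean,
      maxTemperatureC: maxTemp,
      level: classifyLoad(mean),
    };
  }
  return summary;
}
```

The same grouping pattern would apply to CPU, memory, or disk readings with a different input shape.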

The CPU visualizations show the computational load of each CPU on a per-node basis. This allows users to quickly identify which nodes are under heavy load and which ones are idle.

The memory visualizations show the memory usage of each node. This allows users to quickly identify which nodes are under heavy load and which ones have available memory.

The disk visualizations show the disk usage of each node. This allows users to quickly identify which nodes are under heavy load and which ones have available disk space.

The user sessions visualizations show the active jobs per user on the Rosie cluster and their statuses. This allows users to quickly identify which users are using the cluster and what jobs they are running.

The node overview visualizations provide a summary of the status and health of each node in the cluster. This allows users to quickly assess the overall state of the cluster and identify any nodes that may require attention.

The leaderboard visualizations show the top users based on their time used and total jobs run. This allows users to see who the most active and efficient users are on the cluster.
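A leaderboard like this boils down to sorting per-user totals by a chosen metric. The sketch below assumes a hypothetical `UserUsage` record with hours of time used and a job count. The field names, the choice of hours as the unit, and the tie-break on user name are illustrative assumptions, not details taken from the dashboard.

```typescript
// Hypothetical per-user totals.
interface UserUsage {
  user: string;
  timeUsedHours: number;
  totalJobs: number;
}

type LeaderboardMetric = "timeUsedHours" | "totalJobs";

// Returns the top `limit` users for the given metric, highest first.
// Ties are broken alphabetically by user name so the order is stable.
function rankUsers(
  usage: UserUsage[],
  metric: LeaderboardMetric,
  limit: number
): UserUsage[] {
  return usage
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b[metric] - a[metric] || a.user.localeCompare(b.user))
    .slice(0, limit);
}

// Sample data with made-up user names.
const sampleUsage: UserUsage[] = [
  { user: "alice", timeUsedHours: 10, totalJobs: 3 },
  { user: "bob", timeUsedHours: 5, totalJobs: 9 },
  { user: "carol", timeUsedHours: 10, totalJobs: 1 },
];

const byTime = rankUsers(sampleUsage, "timeUsedHours", 2);
const byJobs = rankUsers(sampleUsage, "totalJobs", 3);
```

Keeping the metric as a parameter lets one function back both a "time used" and a "total jobs" leaderboard.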

The network summary visualizations provide an overview of the network traffic and connectivity between nodes in the cluster. This allows users to quickly identify any network issues or bottlenecks.

The active jobs visualizations show the currently running jobs on the cluster. This allows users to monitor the progress and status of active jobs.

The job distribution visualizations show the distribution of jobs across the cluster. This allows users to see how jobs are distributed and identify any imbalances or hotspots.
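One simple way to surface imbalances is to count running jobs per node and flag any node whose count is well above the cluster average. The sketch below does this under stated assumptions: the `RunningJob` shape is hypothetical, and the "more than twice the mean" rule is an arbitrary threshold picked for illustration, not the dashboard's actual definition of a hotspot.

```typescript
// Hypothetical shape of a running job.
interface RunningJob {
  jobId: string;
  node: string;
}

// Counts jobs per node. Nodes listed in `allNodes` but running nothing
// are included with a count of 0 so idle nodes show up in the result.
function jobsPerNode(jobs: RunningJob[], allNodes: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const node of allNodes) counts[node] = 0;
  for (const job of jobs) counts[job.node] = (counts[job.node] ?? 0) + 1;
  return counts;
}

// Flags nodes whose job count exceeds `factor` times the mean count.
function findHotspots(counts: Record<string, number>, factor: number = 2): string[] {
  const nodes = Object.keys(counts);
  if (nodes.length === 0) return [];
  const mean = nodes.reduce((sum, n) => sum + counts[n], 0) / nodes.length;
  return nodes.filter((n) => counts[n] > factor * mean);
}
```

For example, with four nodes running 6, 1, 1, and 0 jobs, the mean is 2 jobs per node, so only the node with 6 jobs exceeds the default threshold of 4.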