12 Nov

New Cluster Installation FAQ

With all the news about a new cluster being built at the HPCf, you no doubt have many questions about Nanobio and the new cluster.  To help answer them, we have put together this FAQ.  We hope you find it useful.  As always, any further questions can be sent to help@hpcf.upr.edu.

  • So, what's going on?
  • Why is Nanobio so slow?
  • Will Nanobio get faster again?
  • So, you're killing Nanobio?
  • Will my Nanobio data be transferred to the new cluster?
  • Who will have access to the new cluster? Do I have to register for a new account?
  • Will I keep the same username and password?
  • When will the new cluster be up?
  • Will the new cluster work the same way as Nanobio?
  • What's taking you so long?
  • How powerful is the new cluster?
  • Is there anything I can do to help?
  • What if I have more questions?

So, what's going on?

Nanobio is an old cluster.  After many months of planning how best to upgrade it, we at the HPCf decided to build a new, separate cluster rather than update the current one.  At the time of writing, we are installing this new cluster.

Why is Nanobio so slow?

Many of Nanobio’s compute nodes could be integrated into the new cluster.  Rather than leave them behind, we removed them from Nanobio and installed them in the new cluster.  This has an obvious impact on the performance of Nanobio, which was already running over capacity.

Will Nanobio get faster again?

No.  The hardware that has been migrated to the new cluster will stay there for good.  In fact, more compute nodes from Nanobio will be migrated to the new cluster.

So, you're killing Nanobio?

Not quite.  The new cluster will be Nanobio’s successor, but we currently have no plans to shut down Nanobio for good (subject to change).  We have yet to decide the fate of Nanobio, but one possibility is to keep it running as a separate, less powerful cluster for jobs that do not require many resources.

Will my Nanobio data be transferred to the new cluster?

No.  The new cluster is completely separate from Nanobio.  Users will have to carry out any data migration themselves.  Users are ultimately responsible for ensuring that their HPCf data is properly backed up.

Will I keep the same username and password?

Your username will be the same as your current Nanobio username.  You will likely be required to set a new password, though.  We will let you know more details later.

When will the new cluster be up?

The current tentative timeline is for the new cluster to be up sometime in February.

Will the new cluster work the same way as Nanobio?

The new cluster is built much like Nanobio, but some changes will be evident to users.  The most noticeable one is that the new cluster will use different software for managing the job queues: Nanobio uses SGE and the new cluster will use Slurm.  There will be a bit of a learning curve for users, but we will have documentation and tutorials up on our site for your reference, and as always we’ll be glad to answer your questions at help@hpcf.upr.edu.
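To give a feel for the SGE-to-Slurm change, here is a minimal sketch that rewrites a few common SGE job-script directives (`#$ ...`) as their Slurm counterparts (`#SBATCH ...`).  It covers only a handful of standard options, and the helper names (`translate_directive`, `translate_script`) are made up for this illustration.  Site-specific settings such as queue or partition names and parallel environments on the new cluster are not known here, so those lines are flagged for manual review rather than guessed.

```python
# Illustrative sketch only: helper names are hypothetical, and only a few
# standard SGE options are handled. Everything else is flagged for review.

# Everyday command equivalents between the two schedulers.
COMMANDS = {"qsub": "sbatch", "qstat": "squeue", "qdel": "scancel"}


def _to_slurm_time(value):
    """SGE h_rt may be plain seconds; Slurm would read a bare number as minutes."""
    if value.isdigit():
        hours, rest = divmod(int(value), 3600)
        minutes, secs = divmod(rest, 60)
        return f"{hours:02d}:{minutes:02d}:{secs:02d}"
    return value  # already in H:MM:SS form


def translate_directive(line):
    """Translate one '#$' directive line; return other lines unchanged."""
    stripped = line.strip()
    if not stripped.startswith("#$"):
        return line
    parts = stripped[2:].split()
    if not parts:
        return line
    flag, args = parts[0], parts[1:]
    if flag == "-N" and args:
        return f"#SBATCH --job-name={args[0]}"
    if flag == "-o" and args:
        return f"#SBATCH --output={args[0]}"
    if flag == "-e" and args:
        return f"#SBATCH --error={args[0]}"
    if (flag == "-l" and len(args) == 1
            and args[0].startswith("h_rt=") and "," not in args[0]):
        return f"#SBATCH --time={_to_slurm_time(args[0][len('h_rt='):])}"
    if flag == "-cwd":
        # Slurm already starts batch jobs in the submission directory.
        return "# (-cwd not needed under Slurm)"
    # Queues, parallel environments, etc. depend on the site configuration.
    return f"# TODO translate by hand: {stripped}"


def translate_script(text):
    """Apply translate_directive to every line of a job script."""
    return "\n".join(translate_directive(line) for line in text.splitlines())
```

For example, `translate_directive("#$ -l h_rt=3600")` yields `#SBATCH --time=01:00:00`, while a line such as `#$ -pe smp 4` is left as a TODO comment because the right Slurm equivalent depends on how the new cluster is configured.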

Another big change will be the addition of quotas.  A cluster’s resources are limited, and in order to ensure that these resources are shared fairly (or as close to fairly as possible) we will be enforcing quotas throughout the new cluster.  There will be quotas for both /home consumption and /work consumption, as well as for the running time of jobs.  The quotas on /home and /work will be applied not to individual users, but to entire research groups.

Otherwise, the cluster interface should feel mostly the same to users: its OS is based on CentOS Linux, and it will have modules for loading specific software, a high-performance /work filesystem for running jobs on, and a slower /home filesystem for temporarily storing files.

What's taking you so long?

Computer clusters are complex systems that combine many different software and hardware components and try to make them play nicely together.  These components include a Linux OS, InfiniBand networking, Ethernet networking, a Lustre parallel file system, cluster management software, the actual machines that run all these components, and many others.  There’s also the added work of physically assembling the cluster in our data center.  Sometimes getting all the components to play nicely takes a while, and so we thank you for your patience as we continue to work hard to get the new cluster ready for our users.

How powerful is the new cluster?

We’ll have final numbers soon.  In general, though, it is much bigger, faster, and more efficient than Nanobio in virtually every meaningful way.

Is there anything I can do to help?

Yes.  You can tell your research colleagues about the new cluster and ask them to check whether they need to create a new account.  You could also start identifying the data that you know for sure you will need to transfer to the new cluster to continue your work.  Keep in mind that the new cluster is not a storage server.  The data that you transfer should be data that you need for actually running jobs on the cluster.  High-performance computing clusters in general are not designed to be reliable, long-term storage solutions, and as such users will be ultimately responsible for making sure their data is properly backed up.
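If you want a rough picture of where your data lives before deciding what to move, a small script along these lines may help.  It is only a sketch: the function names are invented for this example, it uses nothing beyond the Python standard library, and it reports apparent file sizes, which may differ from what a filesystem quota tool would count.

```python
import os


def directory_size(path):
    """Sum the sizes, in bytes, of regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # File vanished or is unreadable; leave it out of the tally.
                pass
    return total


def largest_subdirectories(base, top=10):
    """Return up to `top` (name, bytes) pairs for the biggest directories in base."""
    sizes = []
    with os.scandir(base) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((entry.name, directory_size(entry.path)))
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]
```

Calling `largest_subdirectories(os.path.expanduser("~"))` lists the largest directories in your home directory, which is a reasonable starting point for separating the data you need for running jobs from the data that belongs in long-term storage elsewhere.  Walking a large directory tree can take a while and puts load on the filesystem, so it is best run sparingly.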

You can also share this FAQ with other Nanobio users so they’ll be up to date with what’s going on.

What if I have more questions?

As always, we’ll be glad to answer them at help@hpcf.upr.edu.  Any other questions you ask may be incorporated into this FAQ.