Storage Options for Diskless Servers

I am trying to run a simulation of a neural network on several high-performance diskless instances. I plan to use a persistent disk to store the simulation code and training data, and mount it on all server instances. This is basically a map-reduce style task (several nodes work on the same training data, and the results from all nodes must be collected into one result file).

Now my only question is: what options do I have at all to store the simulation results from the different servers (either at certain points during the simulation, or once at the end)? Ideally, I would write them to the one persistent disk attached to all servers, but that is not possible, because a disk can only be attached read-only to multiple servers.

What is the smartest (and cheapest) way to collect the simulation results from all servers on one persistent disk?

+2
4 answers

Google Cloud Storage is a great way to store data durably in Google Cloud. All you have to do is enable the product for your project, and you can access Cloud Storage directly from your Compute Engine virtual machines. If you create your instances with the storage-rw service account scope, access is even easier, because you can use the gsutil command built into your virtual machines without any explicit authorization steps.

To be more specific, go to the Google Cloud Console, select the project you want to use with Compute Engine and Cloud Storage, and make sure both services are enabled. Then, when creating the virtual machine, use the storage-rw service account scope. If you use gcutil to create your virtual machine, just add --storage_account_scope=storage-rw (there is also an intuitive way to set the service account scope if you use the Cloud Console to start your virtual machine). Once your virtual machine is running, you can use the gsutil command freely without worrying about interactive logins or OAuth steps. You can also script it, by embedding whatever gsutil invocations you need into your application (gsutil works fine when called from a script).
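As a rough sketch of that workflow (the instance, zone, and bucket names are placeholders, and the scope flag is spelled as described above — check gcutil --help for the exact flag name on your version):

    # Create an instance with read/write access to Cloud Storage
    gcutil addinstance simulation-node-1 \
        --zone=us-central1-a \
        --storage_account_scope=storage-rw

    # On the instance: create a bucket once, then copy results into it
    gsutil mb gs://my-simulation-results
    gsutil cp /tmp/results-node1.csv gs://my-simulation-results/

    # Later, from any machine, gather everything in one place
    gsutil cp gs://my-simulation-results/* ./all-results/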

You can learn more about the features of GCE service accounts here.

+5

Marc's answer is definitely the right approach for long-term storage of your results. Depending on your I/O and reliability needs, you can also set up one server as an NFS server and use it to export a directory that the other nodes mount remotely.

Typically, the NFS server would be your "master node", and it can serve both binaries and configuration. Workers would periodically re-sync the directories exported from the master to pick up new binaries or configuration. If you don't need much disk I/O (you mentioned neural network simulation, so I assume the dataset fits in memory and you only emit final results), it may well be fast enough to simply write your output to the NFS share on the master node, and then have the master back the results up to somewhere like GCS.
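A minimal sketch of that layout, assuming the master exports /srv/share to workers on a private 10.x network (all paths, hostnames, and addresses here are placeholders):

    # On the master (NFS server), e.g. Debian/Ubuntu:
    apt-get install nfs-kernel-server
    echo "/srv/share 10.0.0.0/8(rw,sync,no_subtree_check)" >> /etc/exports
    exportfs -ra

    # On each worker (NFS client):
    apt-get install nfs-common
    mount -t nfs master:/srv/share /mnt/share

    # Workers write their results under the shared mount:
    cp results-node1.csv /mnt/share/results/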

The main advantage of NFS over GCS is that it offers familiar filesystem semantics, which helps if you are using third-party software that expects to read its files from a filesystem. It is also quite easy to periodically sync files between GCS and local storage, but that requires running an extra agent on the host.
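That periodic sync can be as simple as a cron entry running gsutil rsync, for example (the bucket and directory names are placeholders):

    # Mirror the shared results directory to a bucket every 5 minutes
    */5 * * * * gsutil rsync -r /srv/share/results gs://my-simulation-results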

The downsides of setting up NFS are that you probably need to keep UIDs in sync between hosts, NFS can be a security hole (I would only expose NFS on my private network, and never to anything outside 10/8), and you will need to install extra packages on both client and server to set up the shares. Also, NFS will only ever be as reliable as the machine hosting it, whereas an object store such as GCS or S3 is implemented with redundant servers and possibly even geographic diversity.

+4

If you want to stay within the Google ecosystem, what about Google Cloud Storage?

Otherwise, I have used S3 and boto for tasks like this.

0

More generally, what you are asking for is some kind of shared storage. Google, as the previous answers note, has a good offering, but almost every cloud provider offers some form of storage. Make sure you understand the two basic kinds your provider has: volume storage, which behaves like a virtual disk, and object storage, which is a key/value store. Each has its strengths and weaknesses. Volume stores are drop-in replacements for virtual disks: if you can use stdio, you can probably use a remote volume store. The problem is that they still have the structure of a disk; if you want something more, you are really asking for a database. Object storage is the "middle ground" between a disk and a database: it is fast and semi-structured.

I myself am an OpenStack user, firstly because it provides both storage families, and secondly because it is supported by a range of providers, so if you decide to move from provider A to provider B your code can stay unchanged. You can even run your own copy of it on your own machines (see www.openstack.org). Note, however, that OpenStack is very memory-hungry: you are not going to run a personal cloud on a 4 GB laptop. Plan on two 16 GB machines.
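For OpenStack object storage, the Swift command-line client gives you roughly the same workflow as gsutil; a minimal sketch, assuming python-swiftclient is installed and your OS_* credentials are already set in the environment (the container and file names are placeholders):

    # Upload a result file into a container (created if it does not exist)
    swift upload simulation-results results-node1.csv

    # List and download everything later
    swift list simulation-results
    swift download simulation-results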

0
