Private Cloud AWS S3 Alternatives

We now have a requirement to move from AWS to a private data center, so we need to find an alternative to AWS S3 for storage. Currently, S3 is used as follows:

  • Total storage size: 10 TB;
  • Min / avg / max object size: 0.5 / 2 / 100 MB;
  • We have N application instances that write and read objects concurrently,
    at approximately 50 writes / sec and 30 reads / sec;
  • The storage should be redundant (highly available), fault tolerant, and scalable.
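To put these numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (1 TB = 1,000,000 MB) and that every object has the average size of 2 MB; the function name and the "sustained around the clock" figure are illustrative assumptions, not something stated above.

```python
MB_PER_TB = 1_000_000      # assumption: decimal units
SECONDS_PER_DAY = 86_400


def workload_estimate(total_tb, avg_object_mb, writes_per_sec, reads_per_sec):
    """Rough sizing figures derived from the requirements listed above."""
    write_mb_per_sec = writes_per_sec * avg_object_mb
    read_mb_per_sec = reads_per_sec * avg_object_mb
    return {
        # Average bandwidth the storage has to absorb and serve.
        "write_mb_per_sec": write_mb_per_sec,
        "read_mb_per_sec": read_mb_per_sec,
        # Approximate number of objects if all were of average size.
        "object_count": total_tb * MB_PER_TB // avg_object_mb,
        # Hypothetical: data written per day if the write rate never dropped.
        "write_tb_per_day": write_mb_per_sec * SECONDS_PER_DAY / MB_PER_TB,
    }


estimate = workload_estimate(total_tb=10, avg_object_mb=2,
                             writes_per_sec=50, reads_per_sec=30)
for name, value in estimate.items():
    print(f"{name}: {value}")
```

Under these assumptions the average write bandwidth is about 100 MB/s and the read bandwidth about 60 MB/s, with roughly 5 million objects in total. If the write rate really were sustained all day, it would add several TB per day, so the 10 TB total suggests either bursty traffic or objects with a limited lifetime.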

Naive ways to store this data would be:

  • Simple NFS storage with some replication added on top;
  • Just store the objects in a NoSQL database (for example, Cassandra). However, Cassandra would need multiple instances to hold this much data (it is not recommended to store more than 1 TB on one Cassandra node; see "Planning for Cassandra capacity").
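As a rough illustration of why the Cassandra option needs many instances, the node count can be estimated from the 1 TB per node guideline. The replication factor of 3 used below is an assumption (the question does not state one), and the helper name is hypothetical.

```python
import math


def cassandra_nodes_needed(total_tb, replication_factor, max_tb_per_node):
    """Minimum node count so that no node holds more than max_tb_per_node."""
    # Every object is stored replication_factor times across the ring.
    raw_tb = total_tb * replication_factor
    return math.ceil(raw_tb / max_tb_per_node)


# 10 TB of objects, assumed replication factor 3, at most 1 TB per node.
print(cassandra_nodes_needed(total_tb=10, replication_factor=3, max_tb_per_node=1))
```

With these inputs the estimate comes to 30 nodes, before allowing any headroom for compaction or growth.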

What solution would you recommend for such a scenario?

+6
4 answers

There are tons of options for S3-compatible private cloud storage. If you like open source solutions, OpenStack and the Cassandra option mentioned above are good. Note that whatever you use, you will usually end up setting up a cluster with multiple nodes; this is unavoidable if you want redundancy and availability. There are also good, cost-effective commercial products, such as the one from Cloudian.

+3

If you need an object store, I can recommend Elliptics. As far as I know, it has no restrictions on disk storage.

In our case, we use SSD disks of 200-500 GB for Cassandra (for better performance). The ring size depends on your requirements (read / write latency, replication speed, object lifetime).

50 writes / sec, 30 reads / sec

That is a really light load for Cassandra, judging by our setup. In this case, it depends on the lifetime of your objects.

For a distributed file system in general, you can also look at GlusterFS.

+1

You can use OpenStack Swift.

Swift is a highly available, distributed, eventually consistent object / blob store. Organizations can use Swift to store large amounts of data efficiently, safely, and cheaply.

Read more: https://docs.openstack.org/swift/latest/
and https://oldhenhut.com/2016/05/31/s3-vs-swift/

+1

Using MinIO is your best bet if you want private cloud storage. It is compatible with AWS S3, which means that applications using AWS S3 can be easily ported to MinIO. You can try it against the public MinIO server at https://play.min.io:9000. They also have a guide on using the AWS CLI with a MinIO server.
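Because MinIO speaks the S3 API, porting an application is often mostly a matter of pointing the existing S3 client at a different endpoint. Below is a minimal sketch using boto3 (a third-party package, installed separately). The helper names, the credentials, and the bucket and key names are placeholders chosen for illustration; they are not taken from the MinIO documentation.

```python
def minio_client_kwargs(endpoint_url, access_key, secret_key):
    """Build keyword arguments for boto3.client("s3", ...)."""
    return {
        # The only MinIO-specific part: a custom endpoint instead of AWS.
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def make_minio_client(endpoint_url, access_key, secret_key):
    """Create an S3 client that talks to a MinIO server."""
    import boto3  # third-party: pip install boto3
    return boto3.client(
        "s3", **minio_client_kwargs(endpoint_url, access_key, secret_key)
    )


def roundtrip(client, bucket, key, data):
    """Write an object and read it back with the standard S3 calls."""
    client.put_object(Bucket=bucket, Key=key, Body=data)
    return client.get_object(Bucket=bucket, Key=key)["Body"].read()
```

A call such as make_minio_client("https://play.min.io:9000", "YOUR-ACCESS-KEY", "YOUR-SECRET-KEY") followed by roundtrip(client, "example-bucket", "hello.txt", b"hello") would exercise it, assuming the bucket already exists and the credentials are valid for that server.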

You can build a highly available storage system using a distributed MinIO installation. Keep in mind that dynamic expansion is not a feature of the distributed MinIO setup. If you want to expand your cluster, you end up launching a new cluster with the number of servers / disks you need, and then migrating the data from the old cluster to the new one.

I find this much easier to use than HDFS. In addition, many technologies outside the Hadoop ecosystem lack HDFS integration. For example, the Docker Registry does not have a built-in HDFS storage driver. However, it does have an S3 driver, so you can use MinIO as its storage backend.

0

Source: https://habr.com/ru/post/1271987/

