Service Fabric Logs Consuming All Available Disk Space

I have several clusters that have been running for a month or so, and I have found that the temporary storage is being completely swallowed by Service Fabric log files. On a small fleet of F1 VMs, which have only 16 GB of local storage, I am simply out of space: some of them are now down to about 30 MB free, even though my application, in all its versions, consumes less than 1 GB.

When I look at disk usage on the cluster VMs, I can clearly see that the SvcFab\Log and SvcFab\ReplicatorLog folders consume more than 90% of the available space. Surely SF can do better than this. Is there something I can switch on or configure to make it purge some of this data? Or, even better, to move it to memory or table storage?
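For anyone who wants to reproduce this measurement on their own nodes, here is a minimal Python sketch of the check described above. It only reads file sizes and deletes nothing. The function names `dir_size_bytes` and `report` are made up for illustration, and the location of the SvcFab folder is an assumption you have to supply yourself, since it can differ between VMs.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path, recursively."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # A log file may be locked or rotated away during the scan.
                pass
    return total


def report(fabric_root, subfolders=("Log", "ReplicatorLog")):
    """Return {subfolder: (size in bytes, fraction of the volume's total size)}.

    fabric_root is the SvcFab folder on the node; its location is an
    assumption the caller provides.
    """
    disk_total = shutil.disk_usage(fabric_root).total
    result = {}
    for sub in subfolders:
        size = dir_size_bytes(os.path.join(fabric_root, sub))
        result[sub] = (size, size / disk_total)
    return result
```

Calling `report()` with the path of the SvcFab folder on a node returns, for each of the two folders, its size in bytes and its share of the whole volume, which makes it easy to track the growth over time before the disk fills up.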

This must be a problem for others too. What is everyone else doing? And to the Service Fabric team: what is the best practice here?

+5
3 answers

So, no useful help on this. I resorted to tearing down the cluster and rebuilding it. Fortunately for me, the cluster was one of a pair, and I was able to simply redirect all traffic through TrafficMgr to the other cluster while I destroyed the failed one and created a new one.

Pretty disappointing. If I had not had this redundancy, it would have been a pretty big problem. :-(

0

If the replicator log is full, that implies you are using the F1 nodes to store data. 16 GB is not much for a data store, and you might be better off splitting the application into processing and storage services on different sets of nodes.

I am not an expert on how SF stores things (I will leave that, and the question of trimming, to others; there is not much information out there), but if it works like similar solutions, the replicator log holds some of your data, so it is not clear that it can be trimmed safely. Also, instead of F1 you might be better off with F2 or F4: they have 2x or 4x the IO and cost, so you lose nothing but gain additional storage, and it means less replication (unless you do a lot of partitioning).

0

I am not sure whether the steps below count as tearing down the cluster! I tested this on a stateless service in a dummy Service Fabric application.

The Service Fabric cluster that we deployed on Standard_DS1_V2 suffered a quorum loss, and the health analysis service also failed due to insufficient disk space. Instead of tearing down the cluster, I stopped the VM scale set using ARM PowerShell:

Stop-AzureRmVmss -ResourceGroupName "RG" -VMScaleSetName "VMSS"

Then I went to the Azure portal > Resource Groups > Virtual Machine Scale Set > Scale, changed the SKU to Standard_D1_V2, and started the VM scale set again:

Start-AzureRmVmss -ResourceGroupName "RG" -VMScaleSetName "VMSS"

After that I redeployed the application, and it works as expected!

0

Source: https://habr.com/ru/post/1257705/

