Slow SSD performance on Azure VM

I am using a Windows Azure virtual machine running Windows Server 2012 Datacenter on a D2 instance (one of the new SSD instances) to unzip a 1.8 GB zip archive that contains an XML file that is 51 GB when unpacked. Needless to say, this process benefits from a fast disk, which is why I am testing a D2 instance.

However, the disk performance I get is unimpressive and falls short of what I expect from an SSD: the average write speed is only 20-30 MB/s.

The program I use to decompress the file is a regular .NET console application designed for this single purpose. The source code is as follows:

    using System;
    using System.IO;
    using System.Linq;
    using ICSharpCode.SharpZipLib.Zip; // ZipFile, ZipEntry and UseZip64 come from SharpZipLib

    class Program
    {
        static void Main(string[] args)
        {
            if (args.Count() < 1)
            {
                Console.WriteLine("Missing file parameter.");
                return;
            }

            string zipFilePath = args.First();
            if (!File.Exists(zipFilePath))
            {
                Console.WriteLine("File does not exist.");
                return;
            }

            // Files are extracted next to the archive, so on the same drive.
            string targetPath = Path.GetDirectoryName(zipFilePath);

            var start = DateTime.Now;
            Console.WriteLine("Starting extraction (" + start.ToLongTimeString() + ")");

            var zipFile = new ZipFile(zipFilePath);
            zipFile.UseZip64 = UseZip64.On;

            foreach (ZipEntry zipEntry in zipFile)
            {
                byte[] buffer = new byte[4096]; // 4K is optimum
                Stream zipStream = zipFile.GetInputStream(zipEntry);
                String entryFileName = zipEntry.Name;
                Console.WriteLine("Extracting " + entryFileName + " ...");

                String fullZipToPath = Path.Combine(targetPath, entryFileName);
                string directoryName = Path.GetDirectoryName(fullZipToPath);
                if (directoryName.Length > 0)
                {
                    Directory.CreateDirectory(directoryName);
                }

                // Unzip file in buffered chunks. This is just as fast as unpacking to a buffer the full size
                // of the file, but does not waste memory.
                // The "using" will close the stream even if an exception occurs.
                long dataWritten = 0;
                long dataWrittenSinceLastOutput = 0;
                const long dataOutputThreshold = 100 * 1024 * 1024; // 100 mb
                var timer = System.Diagnostics.Stopwatch.StartNew();

                using (FileStream streamWriter = File.Create(fullZipToPath))
                {
                    bool moreDataAvailable = true;
                    while (moreDataAvailable)
                    {
                        int count = zipStream.Read(buffer, 0, buffer.Length);
                        if (count > 0)
                        {
                            streamWriter.Write(buffer, 0, count);
                            dataWritten += count;
                            dataWrittenSinceLastOutput += count;

                            // Report the write rate after every 100 MB.
                            if (dataWrittenSinceLastOutput > dataOutputThreshold)
                            {
                                timer.Stop();
                                double megabytesPerSecond = (dataWrittenSinceLastOutput / timer.Elapsed.TotalSeconds) / 1024 / 1024;
                                Console.WriteLine(dataWritten.ToString("#,0") + " bytes written (" + megabytesPerSecond.ToString("#,0.##") + " MB/s)");
                                dataWrittenSinceLastOutput = 0;
                                timer.Restart();
                            }
                        }
                        else
                        {
                            streamWriter.Flush();
                            moreDataAvailable = false;
                        }
                    }

                    Console.WriteLine(dataWritten.ToString("#,0") + " bytes written");
                }
            }

            zipFile.IsStreamOwner = true; // Makes close also shut the underlying stream
            zipFile.Close(); // Ensure we release resources

            Console.WriteLine("Done. (Time taken: " + (DateTime.Now - start).ToString() + ")");
            Console.ReadKey();
        }
    }

When I run this application locally on my machine with an SSD, I get a steady 180-200 MB/s throughout the unpacking process. But when I run it on the Azure VM, I get good performance (100-150 MB/s) for about the first 10 seconds, and then it drops to about 20 MB/s and stays there, with periodic further dips to 8-9 MB/s. It never recovers. The whole unpacking process takes about 42 minutes on the Azure VM, while my local machine does it in about 10 minutes.

What's going on here? Why is disk performance so bad? Is my application doing something wrong?

Both locally and on the Azure VM, the zip file is placed on the SSD, and the file is extracted to the same SSD. (On the Azure VM, I use the Temporary Storage drive, as it is the SSD.)

Here is a screenshot from the Azure VM extracting the file:

[Screenshot: Azure Virtual Machine disk performance]

Notice how performance is great at the start, then suddenly drops and does not recover. I assume some caching is going on, and performance drops once the cache misses.

Here is a screenshot of my local machine extracting the file:

[Screenshot: The performance of my local development machine]

Performance varies slightly but stays above 160 MB/s.

I use the same binary on both machines, compiled for x64 (not AnyCPU). The SSD in my local machine is about 1.5 years old, so it is nothing new or special. I also don't think this is a memory problem, since the D2 instance has about 7 GB of RAM and my local machine has 12 GB. 7 GB should be enough, right?

Does anyone know what is going on?

Thank you so much for any help.

Added
I monitored memory usage during the extraction and noticed that as soon as the application started, the amount of Modified memory exploded and kept growing. While this was happening, the throughput reported by my application was excellent (100+ MB/s). Then the Modified memory started to shrink (which, as far as I know, means that the memory is being flushed to disk), and throughput immediately dropped to 20-30 MB/s. A few times throughput picked up again, and whenever that happened I could see Modified memory growing. A few moments later throughput dropped again, and I saw Modified memory shrinking. So it seems that flushing data to disk is what hurts my application's performance. But why? And how can I solve this?
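One way to test the idea that the fast initial phase is the file system cache absorbing writes is to compare a cached write with a write-through write of the same data. The following is only a minimal sketch under that assumption, not part of the original program: the class name `WriteProbe`, the scratch file name and the 1 GB size are hypothetical choices. `FileOptions.WriteThrough` is the standard .NET flag that asks Windows to send writes through to the disk.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class WriteProbe
{
    // Writes totalBytes of zeros to path in chunkSize pieces and returns the
    // average rate in MB/s. With writeThrough = true the file is opened with
    // FileOptions.WriteThrough, so writes are not left sitting in the cache.
    public static double MeasureMegabytesPerSecond(string path, long totalBytes, int chunkSize, bool writeThrough)
    {
        FileOptions options = writeThrough ? FileOptions.WriteThrough : FileOptions.None;
        byte[] chunk = new byte[chunkSize];
        Stopwatch timer = Stopwatch.StartNew();

        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write,
                                           FileShare.None, chunkSize, options))
        {
            long written = 0;
            while (written < totalBytes)
            {
                int count = (int)Math.Min(chunk.Length, totalBytes - written);
                stream.Write(chunk, 0, count);
                written += count;
            }
            stream.Flush();
        }

        timer.Stop();
        return totalBytes / timer.Elapsed.TotalSeconds / 1024 / 1024;
    }

    static void Main(string[] args)
    {
        // Hypothetical defaults: a scratch file in the temp directory, 1 GB of data.
        string path = args.Length > 0 ? args[0] : Path.Combine(Path.GetTempPath(), "write-probe.bin");
        const long totalBytes = 1024L * 1024 * 1024;
        const int chunkSize = 64 * 1024;

        double cached = MeasureMegabytesPerSecond(path, totalBytes, chunkSize, false);
        double writeThrough = MeasureMegabytesPerSecond(path, totalBytes, chunkSize, true);

        Console.WriteLine("Cached:        " + cached.ToString("#,0.##") + " MB/s");
        Console.WriteLine("Write-through: " + writeThrough.ToString("#,0.##") + " MB/s");

        File.Delete(path);
    }
}
```

If the write-through figure on the VM is close to the 20-30 MB/s seen after the first seconds of extraction, that would suggest the drop reflects the sustained rate of the disk itself rather than something in the extraction loop. Pass a path on the drive under test as the first argument.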

Added
OK, so I tried David's suggestion and ran the application on a D14 instance, and now I get really good disk performance, a stable 180-200+ MB/s. I will keep testing on different instance sizes to see how low I can go and still get good disk performance. It still seems strange that a virtual machine with a local SSD, such as the D2 instance, gives such poor disk performance.

1 answer

Where is the file located, on C: or D:? On D-series virtual machines, only the D: drive is an SSD. All other drives are regular disks.
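A quick way to confirm which volume a given path lives on is sketched below. It assumes the volume label is enough to recognise the drive (the question refers to it as the Temporary Storage drive); the class name `DriveCheck` is hypothetical. Note that `DriveInfo.DriveType` does not distinguish an SSD from a spinning disk, so the label and drive letter are the useful parts.

```csharp
using System;
using System.IO;

static class DriveCheck
{
    // Returns a one-line description of the volume that holds the given path.
    public static string Describe(string path)
    {
        string root = Path.GetPathRoot(Path.GetFullPath(path));
        var drive = new DriveInfo(root);
        if (!drive.IsReady)
        {
            return root + " (not ready)";
        }

        long freeGigabytes = drive.AvailableFreeSpace / (1024L * 1024 * 1024);
        return root + " label=\"" + drive.VolumeLabel + "\" type=" + drive.DriveType
            + " free=" + freeGigabytes + " GB";
    }

    static void Main(string[] args)
    {
        // Pass the zip file path as the first argument; defaults to the current directory.
        string path = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();
        Console.WriteLine(Describe(path));
    }
}
```

Because the extraction program writes next to the archive, running this with the zip file path shows the drive used for both reading and writing.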

If you need an attached disk as another drive, you need to move to a premium storage account, which is in preview with the G-series virtual machines.

Thanks,
Subodh


Source: https://habr.com/ru/post/975821/

