Setting the FileStream.Seek Position / Index to Read "Blocks" of Data in VB.NET

I am currently working on a method that accepts a text file and shrinks it to ~10 MB. The method is used to trim log files and keep them within 10 MB.

The logic of the code is basically this ... if the file is 250 MB or larger, then read the bytes until the array reaches 250 MB. Save this in a StringBuilder , set the position for the next read and repeat until the StringBuilder contains ~ 10 MB of data. Then write to the file, deleting all the data and leaving only 10 MB of the most recent entries.

To avoid cutting a line in half, the code looks for the last CrLf and then writes all the data from that point forward.

My problem: I cannot get Seek to position the stream correctly after the first read. The first read returns the data correctly, but when I use the position from that read for the next iteration, it "ignores" the position and reads again from the beginning of the file.

    If logFile.Length > (1024 * 1024 * 250) Then
        Dim DataToDelete As Integer = logFile.Length - (1024 * 1024 * 250)
        Dim ArrayIndex As Integer = 0
        While DataToDelete > 0
            Using fs As FileStream = New FileStream(logFile.FullName, FileMode.Open, FileAccess.ReadWrite)
                fs.Seek(ArrayIndex, SeekOrigin.Begin)
                If strBuilder.Length < (1024 * 1024 * 250) Then
                    Dim bytes() As Byte = New Byte((1024 * 1024 * 250)) {}
                    Dim n As Integer = fs.Read(bytes, 0, (1024 * 1024 * 250))
                    ArrayIndex = bytes.Length
                    Dim enc As Encoding = Encoding.UTF8
                    strBuilder.Append(enc.GetString(bytes))
                Else
                    If DataToDelete - strBuilder.Length < 0 And strBuilder.Length > (1024 * 1024 * My.Settings.Threshold) Then
                        Dim DataToCut As Integer = strBuilder.Length - (1024 * 1024 * My.Settings.Threshold)
                        While Not (strBuilder.Chars(DataToCut).ToString.Equals(vbCr)) And DataToCut <> 0
                            DataToCut -= 1
                        End While
                        strBuilder.Remove(0, DataToCut)
                        File.WriteAllText(logFile.FullName, strBuilder.ToString)
                    Else
                        DataToDelete -= strBuilder.Length
                        strBuilder.Clear()
                    End If
                End If
            End Using
        End While
    End If
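
For comparison, here is a minimal sketch of reading a file in consecutive blocks. It is not the code above but an illustration under stated assumptions: the module name ChunkedReadSketch and the function ReadInBlocks are hypothetical. It keeps one stream open and advances a running offset by the number of bytes that Read actually returned, rather than overwriting the offset with the buffer length. Note also that a VB.NET array declaration takes the upper bound, so `New Byte(n) {}` allocates n + 1 elements.

```vbnet
Imports System.IO

Module ChunkedReadSketch
    ' Hypothetical helper: reads the file block by block and returns
    ' the total number of bytes read.
    Function ReadInBlocks(ByVal path As String, ByVal blockSize As Integer) As Long
        Dim position As Long = 0
        ' The declared value is the upper bound, so this holds exactly blockSize bytes.
        Dim buffer(blockSize - 1) As Byte

        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read)
            While position < fs.Length
                fs.Seek(position, SeekOrigin.Begin)
                Dim n As Integer = fs.Read(buffer, 0, buffer.Length)
                If n = 0 Then Exit While

                ' Only the first n bytes of buffer are valid here, e.g.
                ' Encoding.UTF8.GetString(buffer, 0, n)

                ' Advance by the bytes actually read so the next Seek
                ' lands on the following block.
                position += n
            End While
        End Using

        Return position
    End Function
End Module
```

Because the offset accumulates, each iteration starts where the previous one stopped. The explicit Seek is redundant on a stream that stays open (Read already advances the position) and is kept only to make the offset visible.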
2 answers

This is my end result, it works like a charm!

    Dim Maxsize As Integer = (1024 * 1024 * My.Settings.Threshold)
    For Each logfile In filesToTrim
        Dim sb As New StringBuilder
        Dim buffer As String = String.Empty
        If logfile.Length > Maxsize Then
            Using reader As New StreamReader(logfile.FullName)
                reader.BaseStream.Seek(-Maxsize, SeekOrigin.End)
                buffer = reader.ReadToEnd()
                sb.Append(buffer)
            End Using
            Dim Midpoint As Integer = 0
            While Not (sb.Chars(Midpoint).ToString.Equals(vbCr)) And Midpoint <> sb.Length - 1
                Midpoint += 1
            End While
            sb.Remove(0, Midpoint)
            File.WriteAllText(logfile.FullName, sb.ToString)
        End If
    Next

For what you are doing, it is unnecessary and, indeed, not a good idea, to load the entire file into memory. It would be much better to just read the part of the log file that you intend to save (last 10 MB). For example, it would be much easier and more efficient to do something like this:

    Private Sub ShrinkLog(ByVal filePath As String, ByVal maxSize As Integer)
        Dim buffer As String
        If New FileInfo(filePath).Length > maxSize Then
            Using reader As New StreamReader(filePath)
                reader.BaseStream.Seek(-maxSize, SeekOrigin.End)
                buffer = reader.ReadToEnd()
            End Using
            File.WriteAllText(filePath, buffer)
        End If
    End Sub

There are other ways to do this. If you were going to keep most of the file, it would be even more efficient not to load it into memory at all, but simply to copy it from one stream to another. Also, this simple example does not show how to avoid cutting a line in half at the start of the retained section, but I'm sure you could keep reading one byte at a time until you find the first line break.
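
As a rough illustration of those two ideas together, the sketch below seeks to the last maxSize bytes, skips forward to just past the first line feed, and copies the rest to a temporary file stream to stream. The module name ShrinkLogSketch, the procedure ShrinkLogAtLineBreak, and the ".tmp" suffix are all hypothetical, and the code assumes lines end in LF or CrLf and that the file is ASCII or UTF-8 (where byte 10 can only be a line feed).

```vbnet
Imports System.IO

Module ShrinkLogSketch
    ' Hypothetical variant of ShrinkLog: keeps at most the last maxSize bytes,
    ' starting at the first complete line, without loading them into a String.
    Sub ShrinkLogAtLineBreak(ByVal filePath As String, ByVal maxSize As Long)
        If New FileInfo(filePath).Length <= maxSize Then Return

        Dim tempPath As String = filePath & ".tmp"   ' assumed temp-file name

        Using source As New FileStream(filePath, FileMode.Open, FileAccess.Read)
            source.Seek(-maxSize, SeekOrigin.End)

            ' Read one byte at a time until just past the first line feed (byte 10)
            ' or the end of the file (-1).
            Dim b As Integer = source.ReadByte()
            While b <> -1 AndAlso b <> 10
                b = source.ReadByte()
            End While

            ' Copy everything after that line break into the temporary file.
            Using target As New FileStream(tempPath, FileMode.Create, FileAccess.Write)
                source.CopyTo(target)
            End Using
        End Using

        ' Replace the original log with the trimmed copy.
        File.Delete(filePath)
        File.Move(tempPath, filePath)
    End Sub
End Module
```

A call such as `ShrinkLogSketch.ShrinkLogAtLineBreak(path, 10 * 1024 * 1024)` would then leave at most 10 MB of whole lines. One consequence of this design is that the partial (or whole) line at the seek position is always dropped, so the result is slightly smaller than maxSize.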


Source: https://habr.com/ru/post/1438701/

