VBA FileSystemObject: speed, advantages, and disadvantages

This has turned into a rather long post, and there may be no single “answer”. I'm looking for an explanation rather than a silver bullet, so input on any aspect you can address would be greatly appreciated. Thanks in advance!


I came across what could be a “problem” with the FileSystemObject (FSO), and it raises a question about how FSO in VBA works compared with “something else”, such as the equivalent in .NET (I don't know whether there is an alternative I could use from Excel for what I am doing). I don't know of a better place to ask, and I'm not sure what to search for to study this myself, so here I am!

So, to the problem. In brief, I iterate over folders, collect information about each file (name, extension, full path, etc.) and put it in a spreadsheet. I ultimately use this information to copy the files to a new location. At a large scale (more than 1,000 files), this works fine locally but is much slower on the network (at work). It will chew through about 1,500 files, wait a while, do another 1,500, and so on, both while listing and while copying files. Again, this does not happen when it runs locally, where it goes through without problems, so I assume it probably has nothing to do with my code. It is almost as if the network opened and closed the gates intermittently.

By contrast, other programs (I tried one against the same files on our production network that I used with my program) are, from an end-user perspective, much faster and show none of the delays above. I assume the other program uses some version of .NET, if that matters. In short, I don't think I can simply blame our network for the speed problems I am seeing.

So my question / curiosity / problem comes down to a few key points:

- What is the difference between FSO in VBA and the default libraries in .NET, and could that difference be the cause of the problem I am facing? Clearly, this kind of data can be read much faster than I am reading it.

- Is FSO simply not intended to be used this way (over a network, with a lot of remote data, or ...)? Is it just outdated / obsolete? And is there an alternative that can be used from VBA?

- I have only a foggy understanding of how our network functions differently from a local disk. It stores many terabytes of data, and I'm not sure what the difference is, at a very deep level, between accessing a local drive and a network location. I know I am not giving details about the network that would likely be very useful for a diagnosis; unfortunately, I just don't know them. So I'll simply ask whether this could “potentially” be explained by FSO not being intended for use in this way on some or all kinds of networks. Is it possible that the network is set up in a way that limits how I am trying to interact with it?

- Even though I did not encounter any problems running it locally, is it possible that something in my code is much more taxing for a network location than for a local drive?

Thanks for any information you can provide.

+4
4 answers

Finch042 admits to being only “foggy” on the specifics of how accessing a network server's file system differs from accessing a local one, and his question is really about the relative speed of the two. The other posts here all suggest that the problem lies in his design choices and/or coding methods, but I think the main question remains unanswered: why can network file operations be so much slower?

The short answer is that a network file system lives on another computer's disk at the far end of a LAN cable (or, even worse, a Wi-Fi signal), and that intermediate technology is much more limited in its data transfer capacity than the electronics between the computer's processor and its local drive. It's true that modern LANs are blindingly fast compared with those of the stone age, but they are still slower than the disk interface electronics on a PC motherboard. Thus, you will always see some performance degradation when accessing remote files.

In addition, many modern server farms include mirroring (i.e., storage redundancy) to maintain data integrity, and may also include automatic versioned-backup features that add access time to some server operations, especially when writing new files or updating existing ones.

Regarding the fluctuations in data transfer speed to and from the server, which Finch042 describes as the network seeming to open and close its gates: whenever you use shared technology, such as LANs and shared servers, you are usually competing with others who are trying to do the same things. For example, LAN technologies such as traditional Ethernet actually allow different users' transmission attempts to collide with each other, and when a collision makes an attempt fail, the sender retransmits until it succeeds. This design buys simplicity, and therefore high overall reliability, at the cost of (usually) a slight loss of throughput. But when demand on the network is high, it can lead to a sharp drop in throughput for all users.

Similarly, a file server has limited capacity to serve access requests to the file system, and it can also be overloaded during periods of high demand.

I suspect that Finch042's experience is most likely related to these problems, especially if his organization's network and server system have grown gradually, and so have not been optimized for a long time, and/or are at or near a capacity limit. And the inconsistent data rates he sees are most likely the ebb and flow of demand on shared network/server systems.

Also, keep in mind that virus protection systems can interfere with file access speeds, especially for network server files.

+3

(I am posting this as an answer because it is too long for a comment.)

I get the impression that you are writing values to Excel cells one at a time, or perhaps one row at a time. I would instead declare an array such as Dim arr(100, 4) As String, fill it with values, and then write it to a large range in one go: Range("A1:E101") = arr. I would experiment with the size of 100, as I suspect it could be much larger. Instead of FSO, I would use the native VBA functions Dir, FileCopy and Kill, falling back on FSO only if necessary.
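A minimal sketch of that idea, assuming a single folder and three output columns (name, extension, full path). The procedure name ListFolderToSheet, its parameters, and the chunk size are illustrative assumptions, not something from the question:

```vba
Option Explicit

' Sketch: list the files in one folder with Dir, buffer the results in
' arrays, and write them to the sheet in a single assignment.
' ListFolderToSheet and its parameters are hypothetical names.
Public Sub ListFolderToSheet(ByVal folderPath As String, ByVal ws As Worksheet)
    Const CHUNK As Long = 1000      ' grow the name buffer this many entries at a time
    Dim names() As String
    Dim out() As Variant
    Dim fileName As String
    Dim n As Long
    Dim i As Long
    Dim dotPos As Long

    If Right$(folderPath, 1) <> "\" Then folderPath = folderPath & "\"

    ' Collect the file names first; Dir without vbDirectory skips subfolders.
    ReDim names(1 To CHUNK)
    fileName = Dir(folderPath & "*.*")
    Do While Len(fileName) > 0
        n = n + 1
        If n > UBound(names) Then ReDim Preserve names(1 To UBound(names) + CHUNK)
        names(n) = fileName
        fileName = Dir()            ' next match from the same search
    Loop
    If n = 0 Then Exit Sub

    ' Build a 2-D array: name, extension, full path.
    ReDim out(1 To n, 1 To 3)
    For i = 1 To n
        dotPos = InStrRev(names(i), ".")
        out(i, 1) = names(i)
        If dotPos > 0 Then out(i, 2) = Mid$(names(i), dotPos + 1) Else out(i, 2) = ""
        out(i, 3) = folderPath & names(i)
    Next i

    ' One write to the worksheet instead of one write per cell.
    ws.Range("A1").Resize(n, 3).Value = out
End Sub
```

It could be called as ListFolderToSheet "C:\Temp", ActiveSheet. The point of the design is that the worksheet is touched once, however many files are found.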

VB.NET has a number of other options, such as lists (of a class, possibly), in-memory streams, and StringBuilder. However, if Excel Interop is still needed, the advantage of these approaches may be lost. In that case, I would consider writing to a CSV file, which can be opened directly in Excel. Excel Interop can still be used, but I would write the CSV first and then open it (as a single statement) in Excel.
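The CSV route also works from VBA itself. The sketch below assumes the rows are already held in a two-dimensional String array; WriteCsv and QuoteCsv are hypothetical helper names, and every field is quoted so that commas in file names do not break the columns:

```vba
Option Explicit

' Sketch: write a 2-D String array to a CSV file with native VBA file I/O.
' WriteCsv and QuoteCsv are hypothetical helper names.
Public Sub WriteCsv(ByVal csvPath As String, ByRef rows() As String)
    Dim f As Integer
    Dim r As Long
    Dim c As Long
    Dim fields() As String

    ReDim fields(LBound(rows, 2) To UBound(rows, 2))

    f = FreeFile
    Open csvPath For Output As #f
    For r = LBound(rows, 1) To UBound(rows, 1)
        For c = LBound(rows, 2) To UBound(rows, 2)
            fields(c) = QuoteCsv(rows(r, c))
        Next c
        Print #f, Join(fields, ",")     ' one line per row
    Next r
    Close #f
End Sub

' Wrap a field in double quotes and double any quotes inside it.
Private Function QuoteCsv(ByVal s As String) As String
    QuoteCsv = """" & Replace(s, """", """""") & """"
End Function
```

Once the file is written, Workbooks.Open csvPath loads it into Excel in a single statement.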

Logically, I would expect it to be more efficient to create this text file in the same location as the network files and move it afterwards, but someone may correct me on that assumption.

+2

Instead of using FSO, I would use Dir() if I needed more speed.
However, it is not as safe, so you need to run a few tests and make sure it works in all cases.
For example, you might need to check each parent folder separately to make sure it exists.

In any case, DIR() should be faster, because it is a native function.
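As a rough sketch of a Dir-based listing, assuming only the built-in Dir and GetAttr functions: ListFilesWithDir and IsFolder are hypothetical names. Dir keeps a single global search state, so the sketch queues subfolders and scans them one at a time instead of recursing inside the Dir loop.

```vba
Option Explicit

' Sketch: collect every file path under rootPath using Dir instead of FSO.
' Hidden and system files are skipped because only vbDirectory is requested.
Public Function ListFilesWithDir(ByVal rootPath As String) As Collection
    Dim results As New Collection   ' full paths of the files found
    Dim pending As New Collection   ' folders still to be scanned
    Dim folderPath As String
    Dim entry As String

    Set ListFilesWithDir = results
    If Not IsFolder(rootPath) Then Exit Function    ' missing root: return an empty list
    If Right$(rootPath, 1) <> "\" Then rootPath = rootPath & "\"

    pending.Add rootPath
    Do While pending.Count > 0
        folderPath = pending(1)
        pending.Remove 1

        entry = Dir(folderPath & "*", vbDirectory)
        Do While Len(entry) > 0
            If entry <> "." And entry <> ".." Then
                If IsFolder(folderPath & entry) Then
                    pending.Add folderPath & entry & "\"    ' scan it later
                Else
                    results.Add folderPath & entry
                End If
            End If
            entry = Dir()           ' next entry in the current folder
        Loop
    Loop
End Function

' True if path exists and is a directory. If GetAttr fails, the result
' stays False, so an unreadable entry is treated as a file.
Private Function IsFolder(ByVal path As String) As Boolean
    On Error Resume Next
    IsFolder = (GetAttr(path) And vbDirectory) = vbDirectory
    On Error GoTo 0
End Function
```

The returned Collection can then be looped over with For Each, or copied into an array and written to the sheet in one assignment.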

Another way to solve this problem would be to use a batch file (if you are on Windows, of course!) or the command line to copy from one location to another. You should see a sharp increase in speed, and you do not need to worry about checking that each subfolder exists!

I have VBA code that uses the Windows command line to do what I want. I got it from the Internet, but adjusted some of the error checking to suit what I wanted to do:

    Option Explicit
    Option Base 0
    Option Compare Text

    ' Note: these declarations are 32-bit. 64-bit Office requires PtrSafe
    ' declarations, with LongPtr for handles and pointers.

    Private Type SECURITY_ATTRIBUTES
        nLength As Long
        lpSecurityDescriptor As Long
        bInheritHandle As Long
    End Type

    Private Type PROCESS_INFORMATION
        hProcess As Long
        hThread As Long
        dwProcessId As Long
        dwThreadId As Long
    End Type

    Private Type STARTUPINFO
        cb As Long
        lpReserved As Long
        lpDesktop As Long
        lpTitle As Long
        dwX As Long
        dwY As Long
        dwXSize As Long
        dwYSize As Long
        dwXCountChars As Long
        dwYCountChars As Long
        dwFillAttribute As Long
        dwFlags As Long
        wShowWindow As Integer
        cbReserved2 As Integer
        lpReserved2 As Long     ' a pointer (LPBYTE), so Long in 32-bit VBA
        hStdInput As Long
        hStdOutput As Long
        hStdError As Long
    End Type

    Private Const WAIT_INFINITE As Long = (-1&)
    Private Const STARTF_USESHOWWINDOW As Long = &H1
    Private Const STARTF_USESTDHANDLES As Long = &H100
    Private Const SW_HIDE As Long = 0&

    Private Declare Function CreatePipe Lib "kernel32" (phReadPipe As Long, phWritePipe As Long, _
        lpPipeAttributes As SECURITY_ATTRIBUTES, ByVal nSize As Long) As Long
    Private Declare Function CreateProcess Lib "kernel32" Alias "CreateProcessA" ( _
        ByVal lpApplicationName As Long, ByVal lpCommandLine As String, _
        lpProcessAttributes As Any, lpThreadAttributes As Any, _
        ByVal bInheritHandles As Long, ByVal dwCreationFlags As Long, _
        lpEnvironment As Any, ByVal lpCurrentDriectory As String, _
        lpStartupInfo As STARTUPINFO, lpProcessInformation As PROCESS_INFORMATION) As Long
    Private Declare Function ReadFile Lib "kernel32" (ByVal hFile As Long, lpBuffer As Any, _
        ByVal nNumberOfBytesToRead As Long, lpNumberOfBytesRead As Long, lpOverlapped As Any) As Long
    Private Declare Function CloseHandle Lib "kernel32" (ByVal hObject As Long) As Long
    Private Declare Function WaitForSingleObject Lib "kernel32" (ByVal hHandle As Long, _
        ByVal dwMilliseconds As Long) As Long
    Private Declare Function GetExitCodeProcess Lib "kernel32" (ByVal hProcess As Long, _
        lpExitCode As Long) As Long
    Private Declare Sub GetStartupInfo Lib "kernel32" Alias "GetStartupInfoA" (lpStartupInfo As STARTUPINFO)
    Private Declare Function GetFileSize Lib "kernel32" (ByVal hFile As Long, lpFileSizeHigh As Long) As Long

    ' Runs szBinaryPath with szCommandLn in a hidden window, waits for it to
    ' finish, and returns whatever it wrote to stdout/stderr.
    Public Function Redirect(szBinaryPath As String, szCommandLn As String) As String
        Dim tSA_CreatePipe As SECURITY_ATTRIBUTES
        Dim tSA_CreateProcessPrc As SECURITY_ATTRIBUTES
        Dim tSA_CreateProcessThrd As SECURITY_ATTRIBUTES
        Dim tSA_CreateProcessPrcInfo As PROCESS_INFORMATION
        Dim tStartupInfo As STARTUPINFO
        Dim hRead As Long
        Dim hWrite As Long
        Dim bRead As Long
        Dim abytBuff() As Byte
        Dim lngResult As Long
        Dim szFullCommand As String
        Dim lngExitCode As Long
        Dim lngSizeOf As Long

        ' The pipe handles must be inheritable so the child process can write to them.
        tSA_CreatePipe.nLength = Len(tSA_CreatePipe)
        tSA_CreatePipe.lpSecurityDescriptor = 0&
        tSA_CreatePipe.bInheritHandle = True
        tSA_CreateProcessPrc.nLength = Len(tSA_CreateProcessPrc)
        tSA_CreateProcessThrd.nLength = Len(tSA_CreateProcessThrd)

        If (CreatePipe(hRead, hWrite, tSA_CreatePipe, 0&) <> 0&) Then
            tStartupInfo.cb = Len(tStartupInfo)
            GetStartupInfo tStartupInfo
            With tStartupInfo
                ' Send the child's stdout and stderr to the write end of the pipe.
                .hStdOutput = hWrite
                .hStdError = hWrite
                .dwFlags = STARTF_USESHOWWINDOW Or STARTF_USESTDHANDLES
                .wShowWindow = SW_HIDE
            End With

            szFullCommand = """" & szBinaryPath & """" & " " & szCommandLn
            lngResult = CreateProcess(0&, szFullCommand, tSA_CreateProcessPrc, tSA_CreateProcessThrd, _
                                      True, 0&, 0&, vbNullString, tStartupInfo, tSA_CreateProcessPrcInfo)
            If (lngResult <> 0&) Then
                ' Wait for the process to exit, then read everything it wrote.
                lngResult = WaitForSingleObject(tSA_CreateProcessPrcInfo.hProcess, WAIT_INFINITE)
                lngSizeOf = GetFileSize(hRead, 0&)
                If (lngSizeOf > 0) Then
                    ReDim abytBuff(lngSizeOf - 1)
                    If ReadFile(hRead, abytBuff(0), UBound(abytBuff) + 1, bRead, ByVal 0&) Then
                        Redirect = StrConv(abytBuff, vbUnicode)
                    End If
                End If
                Call GetExitCodeProcess(tSA_CreateProcessPrcInfo.hProcess, lngExitCode)
                CloseHandle tSA_CreateProcessPrcInfo.hThread
                CloseHandle tSA_CreateProcessPrcInfo.hProcess
                'If (lngExitCode <> 0&) Then Err.Raise vbObject + 1235&, "GetExitCodeProcess", "Non-zero Application exist code"
                CloseHandle hWrite
                CloseHandle hRead
            Else
                Err.Raise vbObject + 1236&, "CreateProcess", "CreateProcess Failed, Code: " & Err.LastDllError
            End If
        End If
    End Function

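You would run a command through
resp = Redirect("cmd", strCmd)
where "cmd" is the program to launch, as if you had pressed Windows + R, and strCmd is the rest of the line you would enter at the Run prompt. For cmd, that means starting strCmd with /c so that the interpreter exits once the command finishes.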
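If only the exit code matters and the command's output does not need to be captured, a shorter route is to launch the copy through the WScript.Shell object. This swaps in a different technique from the Redirect function above and is only a sketch: the choice of robocopy, the names CopyTreeWithRobocopy and QuoteArg, and the assumption that neither path is a drive root are all mine.

```vba
Option Explicit

' Sketch: copy a folder tree by shelling out to robocopy and waiting for it.
' CopyTreeWithRobocopy and QuoteArg are hypothetical names.
Public Function CopyTreeWithRobocopy(ByVal sourceDir As String, ByVal targetDir As String) As Boolean
    Dim sh As Object
    Dim cmd As String
    Dim exitCode As Long

    ' /E copies subfolders, including empty ones.
    cmd = "robocopy " & QuoteArg(sourceDir) & " " & QuoteArg(targetDir) & " /E"

    Set sh = CreateObject("WScript.Shell")
    exitCode = sh.Run(cmd, 0, True)     ' 0 = hidden window, True = wait for exit

    ' robocopy exit codes below 8 mean the copy finished without failures.
    CopyTreeWithRobocopy = (exitCode < 8)
End Function

' Quote a path for the command line. A trailing backslash is removed first,
' because a backslash right before the closing quote would escape the quote.
' Assumes the path is not a drive root such as C:\.
Private Function QuoteArg(ByVal p As String) As String
    If Len(p) > 3 And Right$(p, 1) = "\" Then p = Left$(p, Len(p) - 1)
    QuoteArg = """" & p & """"
End Function
```

robocopy creates missing target folders itself, which is why no per-folder existence check is needed here.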

As for your question about the performance difference between local and network drives: working with network drives will always be slower, whatever the code. The code that runs behind the scenes when accessing a network drive is complicated, but I don't know the specifics.

Hope this helps,
Greetings
kpark

0

What do you mean by fast? For 1,500 files on the network, I don't think the following implementation using FSO is too slow, but how fast are you hoping for?

    ' Requires a reference to Microsoft Scripting Runtime
    ' (for FileSystemObject, Folder and File).

    Sub TestBuildFileStructure()
        ' Call to test the BuildFileStructure function.
        Const sDIRECTORYTOCHECK As String = "<enter path to check from as string>"
        Dim wkbOutputFile As Workbook
        Dim shtOutputSheet As Worksheet
        Dim sDate As String
        Dim sPath As String
        Dim lRowNumber As Long

        sPath = ThisWorkbook.Path

        ' Timestamp for the output file name, built without relying on the
        ' locale's date and time separators.
        sDate = "Check " & Format$(Now, "yyyymmdd hhnnss")

        Set wkbOutputFile = Workbooks.Add
        'wkbOutputFile.Name = sDate
        Set shtOutputSheet = wkbOutputFile.Sheets.Add
        shtOutputSheet.Name = "Output"
        lRowNumber = 1

        Call BuildFileStructure(sDIRECTORYTOCHECK, shtOutputSheet, lRowNumber, True)

        wkbOutputFile.SaveAs (sPath & "\" & sDate)

    Cleanup:
        Set shtOutputSheet = Nothing
        Set wkbOutputFile = Nothing
    End Sub

    Function BuildFileStructure(ByVal strPath As String, _
                                ByRef shtOutputSheet As Worksheet, _
                                ByRef lRowNumber As Long, _
                                Optional ByVal blnRecursive As Boolean) As Boolean
        ' This procedure writes the paths of all the files in a directory
        ' to an Excel sheet. If called recursively, it also writes
        ' all files in subfolders.
        Const iNAMECOLUMN As Integer = 1
        Dim fsoSysObj As FileSystemObject
        Dim fdrFolder As Folder
        Dim fdrSubFolder As Folder
        Dim filFile As File

        ' Return new FileSystemObject.
        Set fsoSysObj = New FileSystemObject

        On Error Resume Next
        ' Get folder.
        Set fdrFolder = fsoSysObj.GetFolder(strPath)
        If Err.Number <> 0 Then
            ' Incorrect path.
            BuildFileStructure = False
            GoTo BuildFileStructure_End
        End If
        On Error GoTo 0

        ' Loop through Files collection, writing each path to the sheet.
        For Each filFile In fdrFolder.Files
            shtOutputSheet.Cells(lRowNumber, iNAMECOLUMN).Value = filFile.Path
            lRowNumber = lRowNumber + 1
        Next filFile

        ' If Recursive flag is true, call recursively.
        If blnRecursive Then
            For Each fdrSubFolder In fdrFolder.SubFolders
                Call BuildFileStructure(fdrSubFolder.Path, shtOutputSheet, lRowNumber, True)
            Next fdrSubFolder
        End If

        ' Return True if no error occurred.
        BuildFileStructure = True

    BuildFileStructure_End:
        Set fdrSubFolder = Nothing
        Set fdrFolder = Nothing
        Set filFile = Nothing
        Set fsoSysObj = Nothing
        Exit Function
    End Function

0

Source: https://habr.com/ru/post/1498446/