Erlang: create a file manager

Question

I need to implement file monitoring in Erlang: there must be a process that lists the files in a specified directory and does something when new files appear.

I am looking at OTP, and at the moment I have the following ideas:

1. Create a supervisor that will manage the gen_servers (one server per folder).
2. Create a WatchServer, a gen_server for each folder that I want to monitor.
3. Create a ProcessFileServer, a gen_server that does something with the files (say, copies them to another folder).

So, the first problem: WatchServer should not wait for a request, it should generate one at predetermined intervals.

Currently, I create a timer in the init/1 function and handle the on_timer event in the handle_info function.

Now the questions:

1. Are there any better ideas?
2. How can I tell ProcessFileServer that a file was found? It seems to me that it would be much more convenient to create WatchServers and ProcessServers independently of each other, but in that case I do not know whom to send the message to.

Maybe there are similar projects or libraries?

+6

erlang

Anton Prokofiev Apr 15 '11 at 20:40

3 answers

If you use Linux, you can use inotify. This is a kernel service that allows you to subscribe to file system events. Do not poll the file system; let the file system call you.

You can try https://github.com/massemanet/inotify to monitor your directory.

Ulf

+4

Ulf Apr 16 '11 at 4:46

I wrote a polling-based library. (It would be nice to extend it to use inotify on platforms where it is supported.) It was originally intended for use in EUnit, but I turned it into a separate project instead. You can find it here:

https://github.com/richcarl/file_monitor

+3

RichardC Apr 17 '11 at 20:40

Peer stritzinger · Accepted Answer · 2011-04-16T07:08:34+0000

Creating processes in Erlang is very cheap (orders of magnitude cheaper than in other systems).

Therefore, I recommend creating a new ProcessFileServer every time a new file to process appears. When it is done, just terminate the process with exit reason normal.

I would suggest the following structure:

                          top_supervisor
                                |
        +-----------------------+-------------------------+
        |                                                 |
  directory_supervisor                          processing_supervisor
        |                                         simple_one_for_one
  +----------+-----...-----+                              |
  |          |             |                  starts children transient
  |          |             |                              |
  dir_watcher_1 dir_watcher_2 dir_watcher_n  +-------------+------+---...----+
                                             |                    |          |
                                        proc_file_1         proc_file_2  proc_file_n

When a dir_watcher notices that a new file has appeared, it calls supervisor:start_child/2 on the processing_supervisor, passing an additional parameter such as the path to the file.

processing_supervisor should start its children with a transient restart policy.

That way, if one of the proc_file servers crashes, it will be restarted, but when it terminates with exit reason normal, it will not be restarted. So you just exit with reason normal when the work is done, and crash when anything else happens.
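
The processing side described above could be sketched roughly as follows. This is a minimal illustration, assuming OTP 18 or later for the map-based supervisor flags. The function names process_file/1 and start_worker/1, the restart limits, and the "processed" target directory are assumptions made for the example, not part of the original answer.

```erlang
-module(processing_supervisor).
-behaviour(supervisor).

-export([start_link/0, process_file/1]).
-export([init/1, start_worker/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Hypothetical helper a dir_watcher could call for every new file.
%% The list [Path] is appended to the argument list in the child spec.
process_file(Path) ->
    supervisor:start_child(?MODULE, [Path]).

init([]) ->
    %% Restart limits are arbitrary example values.
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 5,
                 period => 10},
    ChildSpec = #{id => proc_file,
                  start => {?MODULE, start_worker, []},
                  restart => transient,   % restart only on abnormal exit
                  shutdown => 5000,
                  type => worker},
    {ok, {SupFlags, [ChildSpec]}}.

%% One short-lived process per file. When the fun returns, the process
%% exits with reason normal, so the supervisor does not restart it.
start_worker(Path) ->
    {ok, proc_lib:spawn_link(fun() -> handle_file(Path) end)}.

%% Example job (an assumption): copy the file into "processed".
%% A failed match crashes the worker, which triggers a restart.
handle_file(Path) ->
    Dest = filename:join("processed", filename:basename(Path)),
    ok = filelib:ensure_dir(Dest),
    {ok, _BytesCopied} = file:copy(Path, Dest),
    ok.
```

Note that a file which can never be processed would crash its worker repeatedly until the restart intensity is exceeded and the supervisor itself terminates, so a real worker would probably need to distinguish permanent errors from transient ones.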

If you do not overdo it, polling for files is OK. If the system becomes loaded because of this polling, you can look into kernel notification systems (for example, FreeBSD kqueue or the higher-level services built on it in Mac OS X) that send you a message when a file appears in the directory. However, these services have their complications, because they have to give up if too many events happen (otherwise they would not be a performance improvement, but the opposite). So in any case you will need a robust polling solution as a fallback.

Therefore, do not optimize prematurely: start with polling, and add improvements (which would be isolated in the dir_watcher servers) when necessary.

Regarding the comment about which behaviour should be used for dir_watcher, since it does not use most of gen_server's functionality:

There is no problem with using only part of gen_server's capabilities; in fact, it is very common not to use all of them. In your case, you only set up a timer in init and use handle_info to do your work. The rest of the gen_server is just unchanged template code.
If you later want to change parameters, such as the polling rate, it is easy to add that.
gen_fsm is used much less, since it fits only a rather limited model and is not very flexible. I use it only when it really fits the requirements 100% (which it almost never does).
If you just want a plain Erlang server, you can use the spawn functions in proc_lib to get just the minimum functionality needed to run under a supervisor.
An interesting way to write more natural Erlang code and still have the benefits of OTP is plain_fsm. There you get the advantages of selective receive and flexible message handling, which are needed especially when handling protocols, combined with the good features of OTP.
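
The proc_lib option mentioned in the list above might look roughly like this. It is a minimal sketch: the module name plain_watcher, the {ticks, From} message, and the tick counter are made up for illustration. It uses only proc_lib:start_link/3 and proc_lib:init_ack/2, and it does not handle system messages, which a complete OTP special process would also do.

```erlang
-module(plain_watcher).

-export([start_link/1, init/2]).

%% Returns {ok, Pid}, so it can be used as a supervisor child start
%% function. proc_lib:start_link/3 blocks until init_ack/2 is called.
start_link(IntervalMs) ->
    proc_lib:start_link(?MODULE, init, [self(), IntervalMs]).

init(Parent, IntervalMs) ->
    proc_lib:init_ack(Parent, {ok, self()}),
    loop(IntervalMs, 0).

%% A plain receive loop. The 'after' clause fires when no message
%% arrives for IntervalMs milliseconds; this is where the periodic
%% work (for example a directory scan) would go.
loop(IntervalMs, Ticks) ->
    receive
        {ticks, From} ->
            %% Hypothetical query message, only here for illustration.
            From ! {ticks, Ticks},
            loop(IntervalMs, Ticks)
    after IntervalMs ->
        loop(IntervalMs, Ticks + 1)
    end.
```

Because the timeout restarts after every received message, a steady stream of messages would postpone the periodic work; a timer message sent with erlang:send_after/3 avoids that.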

Having said all this: if I wrote dir_watcher , I would just use gen_server and use only what I need. Unused functionality really costs you nothing, and everyone understands what it does.
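
A polling dir_watcher along these lines could be sketched as below. This is one possible shape, not the answer author's code: the OnNew callback argument, the state record, and the polling details are assumptions chosen to keep the example self-contained. In the structure above, the callback would typically call supervisor:start_child/2 on the processing_supervisor.

```erlang
-module(dir_watcher).
-behaviour(gen_server).

-export([start_link/3]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-record(state, {dir, interval, on_new, seen = []}).

%% OnNew is a hypothetical fun/1 that is called with the full path
%% of every newly seen file.
start_link(Dir, IntervalMs, OnNew) when is_function(OnNew, 1) ->
    gen_server:start_link(?MODULE, {Dir, IntervalMs, OnNew}, []).

init({Dir, IntervalMs, OnNew}) ->
    self() ! poll,   % trigger the first scan right after init
    {ok, #state{dir = Dir, interval = IntervalMs, on_new = OnNew}}.

%% Unused parts of the behaviour: just template code.
handle_call(_Request, _From, State) ->
    {reply, {error, unsupported}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(poll, #state{dir = Dir, interval = IntervalMs,
                         on_new = OnNew, seen = Seen} = State) ->
    Files = case file:list_dir(Dir) of
                {ok, Names} -> Names;
                {error, _Reason} -> Seen   % keep old view, retry later
            end,
    %% Names present now but not at the previous poll are new.
    lists:foreach(fun(Name) -> OnNew(filename:join(Dir, Name)) end,
                  Files -- Seen),
    erlang:send_after(IntervalMs, self(), poll),
    {noreply, State#state{seen = Files}};
handle_info(_Other, State) ->
    {noreply, State}.
```

With this sketch, files that already exist when the watcher starts are reported on the first poll, and a file is reported as soon as its name appears, even if another program is still writing it.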
