Mixing Erlang and Haskell

If you have bought into the functional programming paradigm, chances are that you like both Erlang and Haskell. Both have purely functional cores and other goodness, such as lightweight threads, which make them a good fit for a multi-core world. But there are differences.

Erlang is a commercially proven, fault-tolerant language with a mature distribution model. It has a seemingly unique feature in its ability to upgrade a running system through hot code loading. (Very cool!)

Haskell, on the other hand, has the most sophisticated type system of any mainstream language. (Where I define "mainstream" as any language that has a published O'Reilly book, so Haskell counts.) Its straight-line, single-threaded performance looks superior to Erlang's, and its lightweight threads look even lighter.

I am trying to put together a development platform for the rest of my coding life and wondered whether Erlang and Haskell can be mixed to achieve a best-of-breed platform. This question has two parts:

  • I would like to use Erlang as a kind of fault-tolerant MPI to glue GHC runtime instances together. There would be one Erlang process per GHC runtime. If "the impossible happened" and the GHC runtime died, the Erlang process would detect this somehow and die too. Erlang's hot code loading and distribution features would continue to work. The GHC runtime could be configured to use only one core, all of the cores on the local machine, or any combination in between. Once the Erlang library has been written, the rest of the Erlang-level code should be pure boilerplate, generated automatically for each application. (Perhaps by a Haskell DSL, for example.) How can I achieve at least some of these things?
  • I would like Erlang and Haskell to share the same garbage collector. (This is a much further-out idea than part 1.) Languages that run on the JVM and CLR achieve greater mass by sharing a runtime. I understand that there are technical limitations to running Erlang (hot code loading) and Haskell (higher-kinded polymorphism) on the JVM or CLR. But what about unbundling only the garbage collector? (Sort of the start of a runtime for functional languages.) Allocation would obviously still have to be really fast, so perhaps that bit should be statically linked. And there must be some mechanism to distinguish the mutable heap from the immutable heap (including lazy write-once memory), as GHC needs it. Would it be feasible to modify both HiPE and GHC so that the garbage collectors can share a heap?

Please respond with any experiences (positive or negative), ideas or suggestions. In fact, any feedback (short of outright abuse!) is welcome.

Update

Thank you for all 4 answers today - each one taught me at least one useful thing that I did not know.

Regarding the "rest of my coding life" remark: I included it slightly tongue in cheek to provoke debate, but it is actually true. There is a project I have in mind that I intend to work on until I die, and it needs a stable platform.

In the platform that I proposed above, I would write only Haskell, since the boilerplate Erlang would be generated automatically. So how long will Haskell last? Well, Lisp is still with us and does not look like it is going away soon. Haskell is BSD3 open source and has reached critical mass. If programming itself is still around in 50 years, I would expect Haskell, or some continuous evolution of Haskell, to still be here.

Update 2, in response to rvirding's answer

Agreed - implementing a complete universal "Erskell/Haslang" virtual machine may not be absolutely impossible, but it would certainly be very difficult. Sharing only the garbage collector level, as something like a VM, is still difficult, but it sounds an order of magnitude less difficult to me. At the level of the garbage collection model, functional languages must have a lot in common: the ubiquity of immutable data (including thunks) and the requirement for very fast allocation. So the fact that this commonality is bundled tightly into monolithic virtual machines seems odd.

VMs do help achieve critical mass. Just look at how "lite" functional languages like F# and Scala have taken off. Scala may not have Erlang's absolute fault tolerance, but it offers an escape route for the very many people who are tied to the JVM.

"While having a single heap makes message passing very fast, it introduces a number of other problems, mainly that doing GC becomes more difficult, as it has to be interactive and globally non-interruptive, so you cannot use the same simpler algorithms as the per-process heap model."

Absolutely, that makes perfect sense to me. The very smart people on the GHC development team appear to be trying to solve part of the problem with a parallel "stop the world" GC.

http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel-gc/par-gc-ismm08.pdf

(Obviously, "stop the world" would not fly for Erlang in general, given its main use case.) But even in the use cases where "stop the world" is acceptable, their speedups do not appear to be universal. So I agree with you: it is unlikely that there is a universally best GC, which is why I specified in the first part of my question that

"The GHC runtime could be configured to use only one core, all of the cores on the local machine, or any combination in between."

That way, for a given use case, I could, after benchmarking, choose to go the Erlang way and run one GHC runtime (with a single-threaded GC) plus one Erlang process per core, and let Erlang copy memory between the cores for good locality.

Alternatively, on a dual-processor machine with 4 cores per processor and good memory bandwidth within each processor, benchmarking might suggest that I run one GHC runtime (with a parallel GC) plus one Erlang process per processor.

In both cases, if Erlang and GHC could share a heap, the sharing would probably be bound somehow to a single OS thread running on a single core. (I am getting out of my depth here, which is why I asked the question.)
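As a rough illustration of the "one core, all cores, or anything in between" choice on the GHC side, here is a minimal sketch. The `Layout` type and the functions `capabilitiesFor` and `applyLayout` are hypothetical names invented for this example; only `getNumProcessors`, `setNumCapabilities` and `getNumCapabilities` come from GHC's base library. It assumes the program is linked with `-threaded`, since the non-threaded runtime stays at one capability.

```haskell
import Control.Concurrent (getNumCapabilities, setNumCapabilities)
import GHC.Conc (getNumProcessors)

-- Hypothetical description of how one GHC runtime should use the machine.
data Layout
  = PerCore           -- one single-threaded runtime per core
  | PerProcessor Int  -- one runtime per processor that has this many cores
  | WholeMachine      -- one runtime spanning every core

-- Pure decision: how many capabilities this runtime should ask for,
-- given the number of cores the machine reports.
capabilitiesFor :: Int -> Layout -> Int
capabilitiesFor _ PerCore = 1
capabilitiesFor available (PerProcessor n) = max 1 (min n available)
capabilitiesFor available WholeMachine = max 1 available

-- Apply the decision to the running program and report what the
-- runtime actually ended up with.
applyLayout :: Layout -> IO Int
applyLayout layout = do
  available <- getNumProcessors
  setNumCapabilities (capabilitiesFor available layout)
  getNumCapabilities
```

The same effect can be had without code by passing `+RTS -N<n>` at startup; doing it programmatically would let generated boilerplate pick the layout after benchmarking.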

I also have another agenda - benchmarking functional languages independently of the GC. Often I read results of OCaml v GHC v Erlang v ... benchmarks and wonder how much the results are confounded by the different GCs. What if the choice of GC could be orthogonal to the choice of functional language? How expensive is GC anyway? See this devil's-advocate blog post

http://john.freml.in/garbage-collection-harmful

by my Lisp friend John Fremlin, who has charmingly titled his post "Automated garbage collection is garbage." When John claims that GC is slow and has not really sped up that much, I would like to be able to counter with some numbers.
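For GHC at least, some of those numbers can be collected from inside the program. The sketch below uses `GHC.Stats` from the base library to estimate what share of elapsed time a piece of work spent in the collector. The names `gcShare` and `measureGcShare` are hypothetical, and the sketch assumes the program is started with `+RTS -T` so that the runtime gathers statistics; without that flag it returns `Nothing`.

```haskell
import Data.Int (Int64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Fraction of elapsed time spent in GC, given GC and mutator
-- elapsed times in nanoseconds.
gcShare :: Int64 -> Int64 -> Double
gcShare gcNs mutNs
  | total <= 0 = 0
  | otherwise = fromIntegral gcNs / fromIntegral total
  where
    total = gcNs + mutNs

-- Run an action and report the GC share of the time it took.
-- The action must force its own result, otherwise laziness defers
-- the real work (and its garbage) until after the measurement.
measureGcShare :: IO a -> IO (Maybe Double)
measureGcShare action = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then pure Nothing
    else do
      before <- getRTSStats
      _ <- action
      after <- getRTSStats
      let gcNs = gc_elapsed_ns after - gc_elapsed_ns before
          mutNs = mutator_elapsed_ns after - mutator_elapsed_ns before
      pure (Just (gcShare gcNs mutNs))
```

This only measures one collector on one workload; comparing collectors across languages would still need the same benchmark run on each runtime.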

+48
garbage-collection erlang functional-programming haskell ghc
Sep 09 '09 at 5:10
6 answers

Many Haskell and Erlang people are interested in a model in which Erlang supervises distribution while Haskell runs the shared-memory nodes in parallel, doing all the number crunching and logic.

A start towards this is the haskell-erlang library: http://hackage.haskell.org/package/erlang

And we have similar efforts in Ruby land, via Hubris: http://github.com/mwotton/Hubris/tree/master

The question now is to find someone who will actually push through the Erlang/Haskell interop to find out where the tricky issues are.

+28
Sep 09 '09 at 10:45

Although this is a pretty old thread, if readers are still interested, it is worth taking a look at Cloud Haskell, which brings Erlang-style concurrency and distribution to the GHC stable.

The upcoming distributed-process-platform library adds support for OTP-esque constructs such as gen_servers, supervision trees, and various other "Haskell-flavoured" abstractions borrowed from and inspired by Erlang/OTP.

+5
Jan 23 '13 at 14:59

  • You can use an OTP gen_supervisor process to monitor the Haskell instances that you spawn with open_port(). Depending on how the "port" exited, you would then be able to restart it, or decide that it stopped on purpose and let the corresponding Erlang process die too.

  • Fugheddaboudit. Even those language-independent virtual machines you mention sometimes have trouble with data passed between languages. You should just serialize the data between the two somehow: a database, XML-RPC, something like that.
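The two points above (a Haskell instance behind an Erlang port, with data serialized between the two sides) can be sketched from the Haskell end as follows. This is only a sketch under stated assumptions: it assumes the Erlang side opens the port with something like `open_port({spawn_executable, Path}, [{packet, 4}, binary, exit_status])`, so every message carries a 4-byte big-endian length prefix. The names `frame`, `decodeLen`, `readFrame` and `portLoop` are hypothetical and not part of any library, and the handler stands in for whatever real work the Haskell node does.

```haskell
import Data.Bits (shiftL, shiftR, (.|.))
import qualified Data.ByteString as BS
import Data.Word (Word32)
import System.IO (Handle, hFlush, hSetBinaryMode)

-- Prefix a payload with its length as 4 big-endian bytes, the framing
-- an Erlang port opened with {packet, 4} expects.
frame :: BS.ByteString -> BS.ByteString
frame payload = BS.append (encodeLen (fromIntegral (BS.length payload))) payload

encodeLen :: Word32 -> BS.ByteString
encodeLen n = BS.pack [fromIntegral (n `shiftR` s) | s <- [24, 16, 8, 0]]

-- Decode a big-endian length header.
decodeLen :: BS.ByteString -> Int
decodeLen = BS.foldl' (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- Read one framed message. Nothing means the header was incomplete,
-- which is what happens when the Erlang side has closed the port.
readFrame :: Handle -> IO (Maybe BS.ByteString)
readFrame h = do
  header <- BS.hGet h 4
  if BS.length header < 4
    then pure Nothing
    else Just <$> BS.hGet h (decodeLen header)

-- Answer requests until end-of-file, then return so the program can exit.
portLoop :: Handle -> Handle -> (BS.ByteString -> BS.ByteString) -> IO ()
portLoop inH outH handler = do
  hSetBinaryMode inH True
  hSetBinaryMode outH True
  go
  where
    go = do
      msg <- readFrame inH
      case msg of
        Nothing -> pure ()
        Just req -> do
          BS.hPut outH (frame (handler req))
          hFlush outH
          go
```

A port program's `main` would then be roughly `portLoop stdin stdout handler`. Because the loop ends when its input closes, the Haskell process goes away with its Erlang owner, and with `exit_status` set the Erlang side learns the exit code when the Haskell process dies first.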

By the way, the idea of a single platform for the rest of your life is probably impractical too. Computing technology and fashion change too often to expect that you can keep using just one language forever. Your very question points this out: no single language does everything we might wish for, even today.

+4
Sep 09 '09 at 5:20

You are going to have an interesting time mixing GC between Haskell and Erlang. Erlang uses a per-process heap and copies data between processes. Since Haskell does not even have a concept of processes, I am not sure how you would map this "universal" GC between the two. In addition, for best performance, Erlang uses a variety of allocators, each with slightly tweaked behaviour, which I am sure would affect the GC subsystem.

As with all things in software, abstraction comes at a cost. In this case, I rather suspect that you would have to introduce so many layers to get both languages over their impedance mismatch that you would end up with a not very performant (or useful) common virtual machine.

Bottom line - embrace the difference! There are huge advantages to NOT running everything in the same process, especially in terms of reliability. Also, I think it is a little naive to expect one language/VM to last you for the rest of your life (unless you plan on a.) living a short time or b.) becoming some sort of code monk who ONLY works on a single project). Software development is all about mental agility and being willing to use the best available tools to build fast, reliable code.

+4
Sep 09 '09 at 12:39

As dizzyd mentioned in his comment, not all data in messages is copied: large binaries exist outside the process heaps and are not copied.

Using a different memory structure to avoid having a separate heap for each process is certainly possible and has been done in a number of earlier implementations. While having a single heap makes message passing very fast, it introduces a number of other problems, mainly that doing GC becomes more difficult, as it has to be interactive and globally non-interruptive, so you cannot use the same simpler algorithms as the per-process heap model.

As long as we use immutable data structures, there is no problem with robustness and safety. Deciding which memory and GC models to use is a big trade-off, and unfortunately there is no universally best model.

While Haskell and Erlang are both functional languages, they are in many respects very different languages and have very different implementations. It would be difficult to come up with an "Erskell" (or "Haslang") machine that could handle both languages efficiently. I personally think it is much better to keep them separate and make sure that you have a really good interface between them.

+4
Sep 09 '09 at 15:34

The CLR supports tail call optimization with an explicit tail opcode (as used by F#), for which the JVM does not (yet) have an equivalent, and that limits the implementation of this style of language. The use of separate AppDomains does allow the CLR to hot-swap code (see, for example, this blog post showing how it can be done).

With Simon Peyton Jones working just down the corridor from Don Syme and the F# team at Microsoft Research, it would be a great disappointment if we did not eventually see an IronHaskell with some kind of official status. An IronErlang would be an interesting project - the biggest piece of work would probably be porting the green-thread scheduler without becoming as heavyweight as the Windows Workflow engine, or having to run a BEAM virtual machine on top of the CLR.

+2
Sep 09 '09 at 6:57


