I recommend pure-lang for these pedagogical purposes. It is also plenty powerful. If you want something more popular, or with more community support, then I would recommend Scheme or OCaml, depending on whether you would rather deal with unfamiliar syntax (go with Scheme) or with an unfamiliar type system (go with OCaml). SML and F# are only slightly different from OCaml. Others have mentioned Clojure, Scala, and Haskell.
Clojure is a variant of Scheme with its own idiosyncrasies (for example, no tail-call optimization), so using it would be one way of starting with Scheme. I would expect you to have an easier time with a less idiosyncratic Scheme implementation, though. Racket is what is often used for teaching. Scala looks fundamentally similar to OCaml, but I say this based only on a passing acquaintance.
Unlike Haskell, the other languages mentioned all have two advantages: (1) Evaluation is eager by default, although you can get lazy evaluation by specifically requesting it. In Haskell, it's the other way around. (2) Mutation is available, although most of the libraries and code you will see do not use it. I actually believe that it is pedagogically better to learn functional programming while keeping an eye on how it interacts with side effects, and to work your way toward monadic-style composition somewhere along the way. Therefore, I consider this an advantage. Some will tell you that it is better to be thrown into Haskell's more quarantined handling of mutation first, though.
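To make point (1) concrete, here is a minimal sketch in Python, which is not one of the languages recommended here but is likewise eager by default. The names `Lazy`, `expensive`, `first_eager`, and `first_lazy` are hypothetical, introduced only for this illustration; the languages above have their own built-in ways to request laziness.

```python
log = []  # records which computations actually ran, in order


def expensive(label, value):
    """Stand-in for a costly computation; notes that it was evaluated."""
    log.append(label)
    return value


class Lazy:
    """Hypothetical helper: delays a computation until force() is called."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._done = False
        self._value = None

    def force(self):
        # Run the delayed computation at most once and cache its result.
        if not self._done:
            self._value = self._thunk()
            self._done = True
            self._thunk = None
        return self._value


def first_eager(a, b):
    # Both arguments were already evaluated before this body runs.
    return a


def first_lazy(a, b):
    # Only the argument that is actually needed gets forced.
    return a.force()


# Eager (the default): both arguments are computed, even though b is unused.
eager_result = first_eager(expensive("a", 1), expensive("b", 2))
eager_log = list(log)

log.clear()

# Lazy (requested explicitly): the unused argument is never computed.
lazy_result = first_lazy(
    Lazy(lambda: expensive("a", 1)),
    Lazy(lambda: expensive("b", 2)),
)
lazy_log = list(log)
```

In the eager call both `expensive` computations run before `first_eager` is entered; in the lazy call only the one that is forced runs. In Haskell the second behavior is the default and you have to ask for the first.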
Robert Harper at CMU has some good blog posts on teaching functional programming. As I understand it, he also prefers languages like OCaml for teaching.
Among the three classes of languages that I recommended (Pure, Scheme and friends, OCaml and friends), the first two have dynamic typing. The first and third have explicit reference cells (as if, in Python, you restricted yourself to never reassigning a variable, but you could still change what is stored at a list index). Scheme has implicit reference cells: the variables themselves look mutable, as in C and Python, and the handling of reference cells is done under the covers. In languages like that, you often have some form of explicit reference cell available too (as in the Python example I just gave, or using mutable pairs/lists in Racket; in other Schemes, including the Scheme standard, these are the default pairs/lists).
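The Python analogy for the two styles of reference cell can be sketched as follows. This is only an illustration; the function names `make_counter_explicit` and `make_counter_implicit` are hypothetical.

```python
def make_counter_explicit():
    # Explicit reference cell: `cell` is bound once and never reassigned.
    # Only the content stored at index 0 changes.
    cell = [0]

    def increment():
        cell[0] = cell[0] + 1
        return cell[0]

    return increment


def make_counter_implicit():
    # Implicit style: the variable itself is reassigned, and the language
    # handles the underlying cell for you.
    count = 0

    def increment():
        nonlocal count  # needed to rebind the enclosing variable
        count = count + 1
        return count

    return increment


explicit = make_counter_explicit()
implicit = make_counter_implicit()

explicit_values = [explicit(), explicit(), explicit()]
implicit_values = [implicit(), implicit()]
```

The first version roughly corresponds to how a language with explicit reference cells makes you write it: the mutable thing is a separate value that you read and update deliberately. The second corresponds to the Scheme-like style, where assigning to a variable is all you see.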
One of Haskell's merits is that there are some textbooks for it. (I mean this sincerely, not snidely.) Which books/resources to use is another controversial issue, with many flame wars and closed questions. SICP, as others have recommended, has a lot of fans as well as some critics. It seems to me that there are many good choices. I will not venture further into this debate.