Note

Please be advised that this article is based on personal experimentation. The information may be incorrect. Please use at your own discretion.

In this article, I will set out what I have learned about the Haskell language from a Python developer's perspective.

This is a follow-up to Getting Started with Haskell on Fedora, and it is similar to my previous React for Python developers post.

Toolchain

Runtime

Python                                      Haskell
python (the REPL)                           ghci
#!/usr/bin/python (the script interpreter)  runhaskell
                                            ghc (the compiler)

In practice, Haskell programs are usually compiled using a package manager.

Read Eval Print Loop

A typical developer environment uses a text editor along with a REPL terminal to evaluate expressions.

Given a file named a_file.py (or a_file.hs) in the current working directory:

Python Haskell
# a_file.py
def greet(name):
    print("Hello " + name + "!")
-- a_file.hs
greet name =
    print("Hello " ++ name ++ "!")

You can evaluate expressions:

Python Haskell
$ python
Python 3.8.3 (default, May 29 2020, 00:00:00)
>>> from a_file import *
>>> greet("Python")
Hello Python!
$ ghci
GHCi, version 8.6.5: http://www.haskell.org/ghc/
Prelude> :load a_file
*Main> greet("Haskell")
"Hello Haskell!"

Useful ghci commands include:

  • :reload reloads all the loaded files.
  • :info prints info about a name.
  • :type prints the type of an expression.
  • :browse lists the types and functions of a module.
  • :quit exits ghci.

More information about ghci is available in this typeclass post.

Package Manager

Python                          Haskell
setup.cfg and requirements.txt  project-name.cabal
setuptools and pip              cabal-install
tox and (lts) pip               stack

To learn about the history of these tools, check this post.

  • .cabal is a file format that describes most Haskell packages and programs.
  • cabal-install is a package manager that uses the Hackage registry.
  • stack is another package manager that uses the Stackage registry, which features Long Term Support package sets.

Install stack on Fedora using this command:

$ sudo dnf copr enable -y petersen/stack2 && sudo dnf install -y stack && sudo stack upgrade

Example stack usage:

$ stack new my-playground; cd my-playground
$ stack build
$ stack test
$ stack ghci
$ stack ls dependencies

Developer tools

                Python  Haskell
code formatter  black   ormolu
linter          flake8  hlint
documentation   sphinx  haddock
api search              hoogle

Documentation can be found on Hackage directly or it can be built locally using the stack haddock command:

$ stack haddock
# Open the documentation of the base package:
$ stack haddock --open base
  • Most packages use Haddock, click on a module name to access the module documentation.
  • Look for a Tutorial or Prelude module, otherwise start with the top level name.
  • Click Contents from the top menu to browse back to the index.

Hoogle is the Haskell API search engine. Visit https://hoogle.haskell.org/ or run it locally using the stack hoogle command:

$ stack hoogle -- generate --local
$ stack hoogle -- server --local --port=8080
# Or use the command line like this:
$ stack hoogle -- '[a] -> a'
Prelude head :: [a] -> a
Prelude last :: [a] -> a

I recommend running all the above stack commands before reading the rest of this article. Then start a ghci REPL and try the examples, using the :info and :type commands along the way.

Language Features

Before starting, let's see what makes Haskell special.

For more details, check out this blog post that explains why Haskell is nice to program in.

Statically typed

Every expression has a type and ghc ensures that types match at compile time:

Python Haskell
var = "Hello!"
print(var + 42)
# Runtime type error
var = "Hello!"
print(var + 42)
-- Compile error

Type inference

Most of the time, you don't have to declare the types:

Python Haskell
def list_to_upper(s):
    return map(str.upper, s)
# What is the type of `list_to_upper` ?
list_to_upper s =
    map Data.Char.toUpper s
-- list_to_upper :: [Char] -> [Char]

Lazy

Expressions are evaluated only when needed:

Python Haskell
res = 42 / 0
print("Done.")
# The program halts before the print
res = 42 `div` 0  -- would fail if evaluated (42 / 0 would just be Infinity)
print("Done.")
-- res is never used, so it is never evaluated
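
To make laziness more concrete, here is a minimal sketch that can be loaded in ghci. The names naturals, firstSquares and safeFirst are my own and are not part of any library; the point is only that values are computed when something demands them.

```haskell
-- An infinite list: only the demanded prefix is ever computed.
naturals :: [Integer]
naturals = [1 ..]

-- Takes the first n squares of the infinite list.
firstSquares :: Int -> [Integer]
firstSquares n = take n (map (^ 2) naturals)

-- The second component would raise "divide by zero" if evaluated,
-- but fst never looks at it.
safeFirst :: Integer
safeFirst = fst (42, 1 `div` 0)
```

In ghci, firstSquares 5 terminates even though naturals never ends, and safeFirst evaluates without raising an exception.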

Immutable

The content of a variable cannot be modified.

Python Haskell
class A:
  b = 0

a = A()
a.b = 42
# The attribute b of `a` now contains 42
data A =
  A { b :: Integer }

a = A 0
a { b = 42 }
-- The last expression creates a new record

Purely functional

Haskell programs are made out of function compositions and applications whereas imperative languages use procedural statements.
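
As a rough illustration of building programs out of composition, the sketch below defines two functions without naming their argument. shout and countLongWords are hypothetical names chosen for this example; (.) composes functions from right to left.

```haskell
import Data.Char (toUpper)

-- Upper-cases a string, then appends an exclamation mark.
shout :: String -> String
shout = (++ "!") . map toUpper

-- Splits into words, keeps those longer than 3 characters, counts them.
countLongWords :: String -> Int
countLongWords = length . filter ((> 3) . length) . words
```

Each definition is a pipeline of smaller functions rather than a sequence of statements.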

Language Syntax

In this section, let's go over the Haskell syntax.

Comments

Python Haskell
# A comment
""" A multiline comment
"""
-- A comment
{- A multiline comment
-}

Imports

Python Haskell
import os
import os as NewName
from os import getenv
from os import *
from os import *; del getenv
import qualified System.Environment
import qualified System.Environment as NewName
import System.Environment (getEnv)
import System.Environment
import System.Environment hiding (getEnv)
  • Multiple modules can be imported under the same name, which merges all their functions into a single namespace:
import qualified Data.Text as T
import qualified Data.Text.IO as T

Operators

Python Haskell
10 / 3  # 3.3333
10 // 3 # 3
10 % 3
1 != 2
42 in [1, 42, 3]
10 / 3
div 10 3
mod 10 3
1 /= 2
elem 42 [1, 42, 3]

Haskell operators are regular functions used in infix notation. To query them from the REPL, they need to be put in parentheses:

ghci> :info (/)
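
Since operators are regular functions, they can be defined and passed around like any other function. The following is a small sketch; the <+> operator is a made-up name for this example and does not come from the standard library.

```haskell
-- A user-defined operator (hypothetical name): the integer mean of two numbers.
infixl 6 <+>
(<+>) :: Int -> Int -> Int
x <+> y = div (x + y) 2

-- An operator wrapped in parentheses is an ordinary prefix function.
prefixSum :: Int
prefixSum = (+) 40 2

-- Operators can be passed to higher-order functions.
total :: Int
total = foldr (+) 0 [1, 2, 3, 4]

mean :: Int
mean = 10 <+> 20
```

After loading the file, :info (<+>) in ghci shows the type and the fixity declared above.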

Haskell functions can also be used in infix notation using backticks:

Python Haskell
21 * 2
84 // 2
15 % 7
"Apple" in ["Apple", "Peach", "Berry"]
(*) 21 2
84 `div` 2
15 `mod` 7
"Apple" `elem` ["Apple", "Peach", "Berry"]

List comprehension

List generators:

Python Haskell
range(1, 6)
[1, 2, 3, 4, 5, 6, 7, 8, ...]
range(1, 6, 2)
[1..5]
[1..]
[1,3..5]

List comprehension:

Python Haskell
[x for x in range(1, 10) if x % 3 == 0]
# [3, 6, 9]
[(x, y) for x in range(1, 3) for y in range(1, 3)]
# [(1, 1), (1, 2), (2, 1), (2, 2)]
[x | x <- [1..10], mod x 3 == 0 ]
-- [3,6,9]
[(x, y) | x <- [1..2], y <- [1..2]]
-- [(1,1),(1,2),(2,1),(2,2)]
  • Lists can be infinite.
  • <- is syntactic sugar for the bind operation.
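
Here is a slightly larger sketch combining several generators, a filter, and an infinite list. The function names triples and evenSquares are mine and only serve the example.

```haskell
-- Pythagorean triples (a, b, c) with a <= b <= c <= n.
triples :: Int -> [(Int, Int, Int)]
triples n =
  [ (a, b, c)
  | c <- [1 .. n]
  , b <- [1 .. c]
  , a <- [1 .. b]
  , a * a + b * b == c * c
  ]

-- An infinite comprehension; callers consume it lazily, e.g. with take.
evenSquares :: [Int]
evenSquares = [x * x | x <- [1 ..], even x]
```

For instance, take 3 evenSquares only computes the first three matching elements.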

Function

Python Haskell
def add_and_double(m, n):
    return 2 * (m + n)

add_and_double(20, 1)
add_and_double m n =
    2 * (m + n)

add_and_double 20 1
  • Parentheses and commas are not required.
  • Return is implicit.

Anonymous function

Python Haskell
lambda x, y: 2 * (x + y)
lambda tup: tup[0]
\x y -> 2 * (x + y)
\(x, y) -> x
  • Argument separators are not needed.
  • A tuple argument can be destructured using pattern matching.

Concrete type

Types that are not abstract:

Python Haskell
True
1
1.0
'a'
['a', 'b', 'c']
(True, 'd')
True
1
1.0
'a'
"abc"
(True, 'd')
  • Strings are lists of characters (more on that later).
  • Haskell Int is bounded (fixed width) while Integer has arbitrary precision; use a type annotation to force the type.

Basic conversion:

Python Haskell
int(0.5)   # float to int
float(1)   # int to float
int("42")
truncate 0.5
fromIntegral 1 :: Float
read "42"      :: Int

Read more about numbers in the tutorial.

Type annotations

Python Haskell
def lines(s: str) -> List[str]:
    return s.split("\n")
-- ghci> :type lines
lines :: String -> [String]
  • Type annotations are prefixed by ::.
  • lines is a function that takes a String, and it returns a list of Strings, denoted [String].
Python Haskell
def add_and_double(m: int, n: int) -> int:
add_and_double :: Num a => a -> a -> a
  • Type-variable constraints come before =>. Here Num a is a constraint on the type variable a.
  • The type is a -> a -> a, which means a function that takes two values of type a and returns a value of type a.
  • a is a type variable. It can be an Int, a Float, or anything that satisfies the Num type class (more on that later).
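
To see the Num constraint at work, the sketch below uses the same add_and_double definition at two different concrete types. The names asInt and asDouble are only there for the example.

```haskell
add_and_double :: Num a => a -> a -> a
add_and_double m n = 2 * (m + n)

-- The type variable a is instantiated to Int here...
asInt :: Int
asInt = add_and_double 20 1

-- ...and to Double here, with no change to the function.
asDouble :: Double
asDouble = add_and_double 0.5 0.25
```

Both arguments must have the same type: mixing an Int and a Double in one call is rejected at compile time.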

Partial application

Python Haskell
def add20_and_double(n):
    return add_and_double(20, n)

add20_and_double(1)
add20_and_double =
    add_and_double 20

add20_and_double 1

For example, the map function type annotation is:

  • map :: (a -> b) -> [a] -> [b]
  • map takes a function that goes from a to b, denoted (a -> b), and a list of values of type a, and it returns a list of values of type b:
Python Haskell
map(lambda x: x * 2, [1, 2, 3])
# [2, 4, 6]
map (* 2) [1, 2, 3]
-- [2,4,6]

Here are the annotations for each subexpression:

(*)         :: Num a => a -> a -> a
(* 2)       :: Num a => a -> a
map         :: (a -> b) -> [a] -> [b]
(map (* 2)) :: Num b => [b] -> [b]
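
The same idea applies to any function with several arguments. In this sketch, clamp, toPercent and percents are hypothetical helpers: supplying only some of the arguments yields a new function waiting for the rest.

```haskell
-- Limits x to the range [lo, hi].
clamp :: Int -> Int -> Int -> Int
clamp lo hi x = max lo (min hi x)

-- Supplying the first two arguments yields a function Int -> Int.
toPercent :: Int -> Int
toPercent = clamp 0 100

-- map applied to a single argument is itself a function over lists.
percents :: [Int] -> [Int]
percents = map toPercent
```

In ghci, :type clamp 0 100 reports Int -> Int, matching the annotation of toPercent.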

Record

A group of values is defined using a record:

Python Haskell
class Person:
    def __init__(self, name):
        self.name = name


person = Person("alice")
print(person.name)
data Person =
    Person {
      name :: String
    }

person = Person "alice"
print(name person)
  • The first line defines a Person type with a single Person constructor that takes a String attribute.
  • Record attributes are actually functions.

Here are the annotations of the record functions automatically created:

Person :: String -> Person
name :: Person -> String

Record values can be updated:

Python Haskell
import copy
new_person = copy.copy(person)
new_person.name = "bob"
new_person =
  person { name = "bob" }

See this blog post for more details about record syntax.
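
Putting construction, field access and update together, here is a small sketch. The age field and the alice and renamed values are additions of mine for the example.

```haskell
data Person = Person
  { name :: String
  , age :: Int
  } deriving (Show, Eq)

-- Construction with named fields.
alice :: Person
alice = Person { name = "alice", age = 30 }

-- Record update copies alice and replaces one field; alice is unchanged.
renamed :: Person
renamed = alice { name = "bob" }
```

Evaluating name alice after defining renamed still returns "alice", since the update produced a new value.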

(Type) class

Shared behavior is declared with type classes. For example, objects that can be compared:

Python Haskell
# The `==` operator uses the object's `__eq__` method:
class Person:
    def __eq__(self, other):
        return self.name == other.name
-- The `==` operator works with Eq type class:
data Person = Person { name :: String }
instance Eq Person where
    self == other = name self == name other

Type classes can also have constraints:

Python Haskell
# The `>` operator uses the object's `__gt__` method:
class ComparablePerson(Person):
    def __gt__(self, other):
        return self.age > other.age
-- ghci> :info Ord
class Eq a => Ord a where
    compare :: a -> a -> Ordering

Haskell can derive most type classes automatically using the deriving keyword:

data Person =
  Person {
    name :: String,
    age :: Int
  } deriving (Show, Eq, Ord)

Common type classes are:

  • Read
  • Show
  • Eq
  • Ord
  • Semigroup
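
You can also declare your own type class. The following is a minimal sketch: Describable and describe are invented names for this example, while Show, Eq and Ord are derived as described above.

```haskell
-- A user-defined type class with a single method.
class Describable a where
  describe :: a -> String

data Person = Person
  { name :: String
  , age :: Int
  } deriving (Show, Eq, Ord)

-- Each type provides its own implementation of the method.
instance Describable Person where
  describe p = name p ++ " (" ++ show (age p) ++ ")"

instance Describable Bool where
  describe True = "yes"
  describe False = "no"
```

The derived Ord instance compares fields in declaration order, so two Person values are ordered by name first and by age second.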

Do notation

Expressions that perform side effects are IO actions: values that describe what to do. For example, such a description can be assigned to a name and run only when needed:

Python Haskell
deferred = lambda : print("Hello")

deferred()
deferred = print("Hello")

deferred

Such expressions are often defined using the do notation:

Python Haskell
def welcome():
    print("What is your name? ")
    name = input()
    print("Welcome " + name)
welcome = do
    putStrLn "What is your name?"
    name <- getLine
    print ("Welcome " ++ name)
  • <- binds a name to the result of an IO action.
  • The last expression must have the type of the whole block; use pure to wrap a value that is not already an IO action.
  • The do notation can also be used for other, non-IO computations.

The do notation is syntactic sugar; here is an equivalent implementation using regular operators:

welcome =
    putStrLn "What is your name?" >>
    getLine >>= \name ->
        print ("Welcome " ++ name)
  • >> discards the previous value, while >>= passes it as the argument to the function on its right.
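
As a sketch of the last two points, the example below uses do notation in the Maybe monad and ends an IO block with pure. safeDiv, compute and greeting are hypothetical names introduced for this illustration.

```haskell
-- Integer division that returns Nothing instead of failing.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- do notation in the Maybe monad: any Nothing short-circuits the block.
compute :: Int -> Int -> Int -> Maybe Int
compute a b c = do
  q <- safeDiv a b
  r <- safeDiv q c
  pure (q + r)

-- An IO block whose last expression wraps a plain value with pure.
greeting :: String -> IO String
greeting who = do
  let message = "Welcome " ++ who
  pure message
```

compute 100 5 2 yields Just 30, while compute 1 0 2 yields Nothing because the first division fails.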

Algebraic Data Type (ADT)

Here the Bool type has two constructors, True and False. We can say that Bool is the sum of True and False:

data Bool = True | False

Here the Person type has one constructor MakePerson that takes two concrete values. We can say that Person is the product of String and Int:

data Person = MakePerson String Int

Data types can be polymorphic:

data Maybe  a   = Just a | Nothing
data Either a b = Left a | Right b
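
Sums and products can be mixed in a single declaration. Below is a minimal sketch with a made-up Shape type; area and mkCircle are likewise names chosen for this example.

```haskell
-- A sum of two constructors, each a product of Double fields.
data Shape
  = Circle Double
  | Rectangle Double Double
  deriving (Show, Eq)

-- One equation per constructor.
area :: Shape -> Double
area (Circle r) = pi * r * r
area (Rectangle w h) = w * h

-- Either carries an error message on the Left or a result on the Right.
mkCircle :: Double -> Either String Shape
mkCircle r
  | r < 0 = Left "negative radius"
  | otherwise = Right (Circle r)
```

The compiler can warn when a function such as area forgets one of the constructors (with the -Wall flag).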

Pattern matching

Multiple function bodies can be defined for different arguments using patterns:

Python Haskell
def factorial(n):
    if n == 0: return 1
    else:      return n * factorial(n - 1)
--
factorial 0 = 1
factorial n = n * factorial(n - 1)

Values can also be matched using case expression:

Python Haskell
def first_elem(l):
    if len(l) > 0: return l[0]
    else:          return None
first_elem l = case l of
    (x:_) -> Just x
    _     -> Nothing
  • _ matches anything.
  • See this section of Why Haskell Matters to learn more about list pattern match.
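
Here is a short sketch of matching on the shape of a list, both with multiple equations and with a case expression. sumList and describeList are hypothetical names.

```haskell
-- Matches the empty list, or a head x followed by a tail xs.
sumList :: [Int] -> Int
sumList [] = 0
sumList (x : xs) = x + sumList xs

-- Patterns are tried from top to bottom.
describeList :: [a] -> String
describeList l = case l of
  [] -> "empty"
  [_] -> "singleton"
  (_ : _) -> "longer"
```

Because strings are lists of characters, describeList also works on "" and "abc".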

Nested Scope

Nesting the scope of definitions is a commonly used pattern, for example with .. where ..:

Python Haskell
def main_fun(arg):
    value = 42
    def sub_fun(sub_arg):
        return value
    return sub_fun(arg)
main_fun arg = sub_fun arg
  where
    value = 42
    sub_fun sub_arg = value

where clauses can be nested. Another pattern is to use let .. in .. :

Python Haskell
def a_fun(arg):
    (x, y) = arg
    return x + y
a_fun arg =
    let (x, y) = arg
    in x + y

For more details see Let vs. Where.
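
Both forms in one sketch: realRoots shares a where binding across several guards, and hypotenuse names intermediate values with let. The function names are mine, and realRoots assumes its first argument is not zero.

```haskell
-- Number of real roots of a*x^2 + b*x + c (assumes a /= 0).
realRoots :: Double -> Double -> Double -> Int
realRoots a b c
  | discriminant > 0 = 2
  | discriminant == 0 = 1
  | otherwise = 0
  where
    -- Bindings in where are visible in every guard above.
    discriminant = square b - 4 * a * c
    square x = x * x

-- let .. in .. introduces bindings for a single expression.
hypotenuse :: Double -> Double -> Double
hypotenuse a b =
  let a2 = a * a
      b2 = b * b
  in sqrt (a2 + b2)
```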

Standard library

Note that the standard library is likely not enough. Add these extra libraries to the build-depends list of your playground cabal file, then restart stack ghci:

  • aeson
  • bytestring
  • containers
  • text

Prelude

By default, Haskell programs have access to the base library:

Python Haskell
f(g(x))
print(len([1, 2]))
[1, 2] + [3]
"Hello" + "World"
(True, 0)[0]
tuples = [(True, 2), (False, 3)]
map(lambda x:    x[1], tuples)
filter(lambda x: x[0], tuples)
(f . g) x
print $ length $ [1, 2]
[1, 2] <> [3]
"Hello" <> "World"
fst (True, 0)
tuples = [(True, 2), (False, 3)]
map snd tuples
filter fst tuples
  • The $ operator is function application with the lowest precedence: everything on its right is grouped first and passed as the argument, so we can avoid parentheses around the right-hand side operand.
  • The <> operator works on all semigroups (while ++ only works on lists).
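
To compare the two styles side by side, here is a small sketch; withParens, withDollar and joined are names I picked for the example.

```haskell
-- Nested parentheses...
withParens :: Int
withParens = length (filter even (map (* 3) [1 .. 10]))

-- ...and the same expression written with $.
withDollar :: Int
withDollar = length $ filter even $ map (* 3) [1 .. 10]

-- <> works on any Semigroup, including String and lists.
joined :: String
joined = "Hello" <> " " <> "World"

merged :: [Int]
merged = [1, 2] <> [3]
```

withParens and withDollar are the same expression; only the grouping syntax differs.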

Data.List

Python Haskell
l = [1, 2, 3, 4]
l[0]
l[1:]
l[:2]
l[2:]
l[2]
sorted([3, 2, 1])
import Data.List (sort)
l = [1, 2, 3, 4]
head l
tail l
take 2 l
drop 2 l
l !! 2
sort [3, 2, 1]

Data.Maybe

Functions to manipulate optional values: data Maybe a = Just a | Nothing.

Python Haskell
pred = True
value = 42 if pred else None
print(value if value else 0)

values = [21, None, 7]
[value for value in values if value is not None]
import Data.Maybe
value = Just 42
print(fromMaybe 0 value)

values = [Just 21, Nothing, Just 7]
catMaybes values
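
A common source of Maybe values is lookup, which returns Nothing when a key is missing. The sketch below assumes a made-up association list named ages; ageOf and knownAges are also hypothetical.

```haskell
import Data.Maybe (fromMaybe, mapMaybe)

-- A small association list used as a lookup table.
ages :: [(String, Int)]
ages = [("alice", 30), ("bob", 25)]

-- Falls back to 0 when the name is unknown.
ageOf :: String -> Int
ageOf who = fromMaybe 0 (lookup who ages)

-- Keeps only the successful lookups.
knownAges :: [String] -> [Int]
knownAges = mapMaybe (`lookup` ages)
```

mapMaybe combines map and catMaybes in a single pass.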

Data.Either

Functions to manipulate either type: data Either a b = Left a | Right b.

Python Haskell
def safe_div(x, y):
    if y == 0: return "Division by zero"
    else:      return x / y

values = [safe_div(1, y) for y in range(-5, 11)]
[v for v in values if isinstance(v, float)]
[v for v in values if isinstance(v, str)]
import Data.Either
safe_div _ 0 = Left "Division by zero"
safe_div x y = Right $ x / y

values = [safe_div 1 y | y <- [-5..10]]
rights values
lefts values
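
When both sides are needed, partitionEithers from Data.Either splits a list in one call. This sketch reuses the safe_div function from above with an explicit type annotation; the name results is mine.

```haskell
import Data.Either (partitionEithers)

safe_div :: Double -> Double -> Either String Double
safe_div _ 0 = Left "Division by zero"
safe_div x y = Right (x / y)

-- Errors end up in the first list, successful values in the second.
results :: ([String], [Double])
results = partitionEithers [safe_div 1 y | y <- [-1, 0, 1]]
```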

Data.Text

The default type for a string is a list of characters; Text provides a more efficient alternative:

Python Haskell
#
a_string = "Hello world!"
a_string.replace("world", "universe")
a_string.split(" ")
list(a_string)
import qualified Data.Text as T
a_string = T.pack "Hello world!"
T.replace "world" "universe" a_string
T.splitOn " " a_string
T.unpack a_string

Data.Text can also be used to read files:

Python Haskell
#
cpus = open("/proc/cpuinfo").read()
lines = cpus.splitlines()
filter(lambda s: s.startswith("processor\t"), lines)
import qualified Data.Text as T
import qualified Data.Text.IO as T
cpus <- T.readFile "/proc/cpuinfo"
cpus_lines = T.lines cpus
filter (T.isPrefixOf "processor\t") cpus_lines
  • Use :set -XOverloadedStrings in ghci to ensure the "string" values are Text.

Data.ByteString

Use ByteString to work with raw data bytes. Both Data.Text and Data.ByteString come in two flavors, strict and lazy.

Strict version, to and from String:

Data.Text.pack                :: String -> Text
Data.Text.unpack              :: Text   -> String

Data.ByteString.Char8.pack    :: String     -> ByteString
Data.ByteString.Char8.unpack  :: ByteString -> String

Strict version between Text and ByteString:

Data.Text.Encoding.encodeUtf8 :: Text       -> ByteString
Data.Text.Encoding.decodeUtf8 :: ByteString -> Text

Conversion between strict and lazy:

Data.Text.Lazy.fromStrict       :: Data.Text.Text      -> Data.Text.Lazy.Text
Data.Text.Lazy.toStrict         :: Data.Text.Lazy.Text -> Data.Text.Text

Data.ByteString.Lazy.fromStrict :: Data.ByteString.ByteString      -> Data.ByteString.Lazy.ByteString
Data.ByteString.Lazy.toStrict   :: Data.ByteString.Lazy.ByteString -> Data.ByteString.ByteString

To avoid using fully qualified type names, these libraries are usually imported like so:

import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.Text (Text)
import qualified Data.Text as T
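
Here is a sketch of a round trip between Text and ByteString using the functions listed above. It needs the text and bytestring packages; the names sample, encoded, roundTrip, charCount and byteCount are hypothetical. Note that decodeUtf8 raises an exception on invalid input.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- "\233" is the character e with an acute accent.
sample :: Text
sample = T.pack "h\233llo"

encoded :: ByteString
encoded = TE.encodeUtf8 sample

roundTrip :: Text
roundTrip = TE.decodeUtf8 encoded

-- Text counts characters, ByteString counts bytes.
charCount :: Int
charCount = T.length sample

byteCount :: Int
byteCount = B.length encoded
```

The accented character takes two bytes in UTF-8, which is why the two counts differ.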

Containers

The containers library offers useful container types. For example Map:

Python Haskell
#
d = dict(key="value")
d["key"]
d["other"] = "another"
import qualified Data.Map as M
d = M.fromList [("key", "value")]
M.lookup "key" d
M.insert "other" "another" d

Set:

Python Haskell
#
s = set(("Alice", "Bob", "Eve"))
"Foo" in s
len(s)
import qualified Data.Set as S
s = S.fromList ["Alice", "Bob", "Eve"]
"Foo" `S.member` s
S.size s

Check out the documentation by running stack haddock --open containers.
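
As a slightly larger sketch, here is a word counter built on Data.Map.Strict from the containers package. wordCount, original and extended are names invented for this example.

```haskell
import qualified Data.Map.Strict as M

-- Counts occurrences of each word; insertWith adds 1 to an existing count.
wordCount :: String -> M.Map String Int
wordCount = foldr (\w -> M.insertWith (+) w 1) M.empty . words

original :: M.Map String String
original = M.fromList [("key", "value")]

-- insert returns a new map; original keeps its single entry.
extended :: M.Map String String
extended = M.insert "other" "another" original
```

Unlike the Python dict assignment, M.insert does not modify its argument, so both maps remain usable.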

When unsure, use the strict version (for example Data.Map.Strict).

Language Extensions

The main compiler ghc supports some useful language extensions. They can be enabled:

  • Per file using this syntax: {-# LANGUAGE ExtensionName #-}.
  • Per project using the default-extensions: ExtensionName cabal configuration.
  • Per ghci session using the :set -XExtensionName command.

Note that the ghci :set - command can be auto-completed using Tab.

OverloadedStrings

Enables automatic conversion of "string" literals to the appropriate type.

NumericUnderscores

Enables underscore separators in numeric literals, e.g. 1_000_000.

NoImplicitPrelude

Disables the implicit import of Prelude.
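
The sketch below enables two of these extensions per file. To stay within the base library it defines its own Name type with an IsString instance; Name, million and defaultName are hypothetical names, and the pragmas must stay at the very top of the file.

```haskell
{-# LANGUAGE NumericUnderscores #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- A wrapper type that string literals can convert to.
newtype Name = Name String
  deriving (Show, Eq)

instance IsString Name where
  fromString = Name

-- NumericUnderscores: underscores are ignored in numeric literals.
million :: Int
million = 1_000_000

-- OverloadedStrings: the literal is converted through fromString.
defaultName :: Name
defaultName = "alice"
```

Text and ByteString provide IsString instances too, which is why OverloadedStrings is commonly enabled alongside them.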

Please check What I Wish I Knew When Learning Haskell for a complete overview of Language Extensions, or this post from the kowainik team.

Further Resources

To delve in further, I recommend digging through the links I shared above. These videos are worth a watch:

These introductory books are often mentioned:

Finally, if you need help, please join the #haskell-beginners IRC channel on Freenode.

Thank you for reading!