Haskell for Python developers

Note

Please be advised that this article is based on personal experimentation. The information may be incorrect. Please use at your own discretion.

In this article, I will set out what I have learned about the Haskell language from a Python developer's perspective.

This is a follow-up to Getting Started with Haskell on Fedora, and it is similar to my previous React for python developers post.

Toolchain

Runtime

Python	Haskell
python (the REPL)	ghci
#!/usr/bin/python (the script interpreter)	runhaskell
	ghc (the compiler)

In practice, Haskell programs are usually compiled using a package manager.

Read Eval Print Loop

A typical developer environment uses a text editor along with a REPL terminal to evaluate expressions.

Given a file named a_file in the current working directory:

Python:

# a_file.py
def greet(name):
    print("Hello " + name + "!")

Haskell:

-- a_file.hs
greet name = print("Hello " ++ name ++ "!")

You can evaluate expressions:

Python:

$ python
Python 3.8.3 (default, May 29 2020, 00:00:00)
>>> from a_file import *
>>> greet("Python")
Hello Python!

Haskell:

$ ghci
GHCi, version 8.6.5: http://www.haskell.org/ghc/
Prelude> :load a_file
*Main> greet("Haskell")
"Hello Haskell!"

Useful ghci commands include:

:reload reloads all the loaded files.
:info prints info about a name.
:type prints the type of an expression.
:browse lists the types and functions of a module.
:quit exits ghci.

More information about ghci can be found in this typeclass post.

Package Manager

Python	Haskell
setup.cfg and requirements.txt	project-name.cabal
setuptools and pip	cabal-install
tox and (lts) pip	stack

To learn about the history of these tools, check this post.

.cabal is a file format that describes most Haskell packages and programs.
cabal-install is a package manager that uses the Hackage registry.
stack is another package manager that uses the Stackage registry, which features Long Term Support package sets.

Install stack on Fedora using this command:

$ sudo dnf copr enable -y petersen/stack2 && sudo dnf install -y stack && sudo stack upgrade

Example stack usage:

$ stack new my-playground; cd my-playground
$ stack build
$ stack test
$ stack ghci
$ stack ls dependencies

Developer tools

	Python	Haskell
code formatter	black	ormolu
linter	flake8	hlint
documentation	sphinx	haddock
api search		hoogle

Documentation can be found on Hackage directly or it can be built locally using the stack haddock command:

$ stack haddock
# Open the documentation of the base package:
$ stack haddock --open base

Most packages use Haddock; click on a module name to access the module documentation.
Look for a Tutorial or Prelude module, otherwise start with the top-level name.
Click Contents from the top menu to browse back to the index.

Hoogle is the Haskell API search engine. Visit https://hoogle.haskell.org/ or run it locally using the stack hoogle command:

$ stack hoogle -- generate --local
$ stack hoogle -- server --local --port=8080
# Or use the command line like this:
$ stack hoogle -- '[a] -> a'
Prelude head :: [a] -> a
Prelude last :: [a] -> a

I recommend running all the above stack commands before reading the rest of this article. Then start a ghci REPL, try the examples, and use the :info and :type commands to explore them.

Language Features

Before starting, let's see what makes Haskell special.

For more details, check out this blog post that explains why Haskell is nice to program in.

Statically typed

Every expression has a type and ghc ensures that types match at compile time:

Python:

var = "Hello!"
print(var + 42)
# Runtime type error

Haskell:

var = "Hello!"
print(var + 42)
-- Compile error

Type inference

Most of the time, you don't have to define the types:

Python:

def list_to_upper(s):
    return map(str.upper, s)
# What is the type of `list_to_upper`?

Haskell:

list_to_upper s = map Data.Char.toUpper s
-- list_to_upper :: [Char] -> [Char]

Lazy

Expressions are evaluated only when needed:

Python:

res = 42 / 0
print("Done.")
# Program halts before the print

Haskell:

res = 42 / 0
print("Done.")
-- res is not used or evaluated
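Laziness is also what makes infinite lists usable: only the elements that are actually requested get computed. Here is a minimal sketch; the names evens, firstEvens and smallSquares are made up for this illustration:

```haskell
-- All the positive even numbers; the list is never built in full.
evens :: [Integer]
evens = [2, 4 ..]

-- Only the first five elements are ever evaluated.
firstEvens :: [Integer]
firstEvens = take 5 evens  -- [2,4,6,8,10]

-- takeWhile stops at the first square that is not below 50.
smallSquares :: [Integer]
smallSquares = takeWhile (< 50) [x * x | x <- [1 ..]]  -- [1,4,9,16,25,36,49]
```

Load the file in ghci and evaluate firstEvens or smallSquares to see the results. Evaluating evens itself prints numbers until interrupted.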

Immutable

Variable content cannot be modified.

Python:

class A:
    b = 0
a = A()
a.b = 42
# The attribute b of `a` now contains 42

Haskell:

data A = A { b :: Integer }
a = A 0
a { b = 42 }
-- The last statement creates a new record

Purely functional

Haskell programs are made out of function compositions and applications whereas imperative languages use procedural statements.
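As a rough illustration, here is a small program built only from function definitions and composition. The helper names (shout, exclaim, greeting, printGreeting) are invented for this sketch:

```haskell
import Data.Char (toUpper)

-- Hypothetical helper: upper-case every character.
shout :: String -> String
shout = map toUpper

-- Hypothetical helper: append an exclamation mark.
exclaim :: String -> String
exclaim s = s ++ "!"

-- Function composition: (exclaim . shout) x is exclaim (shout x).
greeting :: String -> String
greeting = exclaim . shout

-- Side effects are confined to IO values such as this one.
printGreeting :: String -> IO ()
printGreeting = putStrLn . greeting
```

The pure functions can be tested without running any IO; only printGreeting touches the terminal.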

Language Syntax

In this section, let's go over the Haskell syntax.

Comments

Python:

# A comment
""" A multiline comment """

Haskell:

-- A comment
{- A multiline comment -}

Imports

Python	Haskell
import os	import qualified System.Environment
import os as NewName	import qualified System.Environment as NewName
from os import getenv	import System.Environment (getEnv)
from os import *	import System.Environment
from os import *; del getenv	import System.Environment hiding (getEnv)

Multiple modules can be imported under the same name, which merges all of their functions into a single namespace:

import qualified Data.Text as T
import qualified Data.Text.IO as T

Operators

Python	Haskell
10 / 3 # 3.3333	10 / 3
10 // 3 # 3	div 10 3
10 % 3	mod 10 3
1 != 2	1 /= 2
42 in [1, 42, 3]	elem 42 [1, 42, 3]

Haskell operators are regular functions used in infix notation. To query them from the REPL, they need to be put in parentheses:

ghci> :info (/)

Haskell functions can also be used in infix notation using backticks:

Python	Haskell
21 * 2	(*) 21 2
84 // 2	84 `div` 2
15 % 7	15 `mod` 7
"Apple" in ["Apple", "Peach", "Berry"]	"Apple" `elem` ["Apple", "Peach", "Berry"]
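Since operators are ordinary functions, you can define your own. The sketch below uses a made-up operator |+| that adds pairs component-wise, and shows a backticked function being partially applied:

```haskell
-- A hypothetical operator that adds pairs component-wise.
infixl 6 |+|
(|+|) :: (Int, Int) -> (Int, Int) -> (Int, Int)
(a, b) |+| (c, d) = (a + c, b + d)

-- Infix and prefix uses of the same operator.
infixUse, prefixUse :: (Int, Int)
infixUse  = (1, 2) |+| (3, 4)    -- (4,6)
prefixUse = (|+|) (1, 2) (3, 4)  -- (4,6)

-- A backticked function can be partially applied like an operator.
halves :: [Int]
halves = map (`div` 2) [10, 20, 30]  -- [5,10,15]
```

The infixl 6 line is optional; it only sets the associativity and precedence of the new operator.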

List comprehension

List generators:

Python	Haskell
range(1, 6)	[1..5]
[1, 2, 3, 4, 5, 6, 7, 8, ...]	[1..]
range(1, 6, 2)	[1,3..5]

List comprehension:

Python:

[x for x in range(1, 11) if x % 3 == 0]
# [3, 6, 9]
[(x, y) for x in range(1, 3) for y in range(1, 3)]
# [(1, 1), (1, 2), (2, 1), (2, 2)]

Haskell:

[x | x <- [1..10], mod x 3 == 0]
-- [3,6,9]
[(x, y) | x <- [1..2], y <- [1..2]]
-- [(1,1),(1,2),(2,1),(2,2)]

Lists can be infinite.
<- is syntactic sugar for the bind operation.

Function

Python:

def add_and_double(m, n):
    return 2 * (m + n)
add_and_double(20, 1)

Haskell:

add_and_double m n = 2 * (m + n)
add_and_double 20 1

Parentheses and commas are not required.
Return is implicit.

Anonymous function

Python	Haskell
lambda x, y: 2 * (x + y)	\x y -> 2 * (x + y)
lambda tup: tup[0]	\(x, y) -> x

Argument separators are not needed.
Tuple arguments can be deconstructed using pattern matching.

Concrete type

Types that are not abstract:

Python	Haskell
True	True
1	1
1.0	1.0
'a'	'a'
['a', 'b', 'c']	"abc"
(True, 'd')	(True, 'd')

Strings are lists of characters (more on that later).
Haskell Int values are bounded while Integer values are unbounded; use a type annotation to force the type.
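To make the difference concrete, here is a small sketch. The names largestInt, big and asDouble are only for illustration, and the exact value of maxBound depends on the platform:

```haskell
-- Int is a fixed-width machine integer; its limits come from the Bounded class.
largestInt :: Int
largestInt = maxBound

-- Integer grows as needed, so this value is exact.
big :: Integer
big = 2 ^ 100  -- 1267650600228229401496703205376

-- The same literal gets a different type depending on the annotation.
asDouble :: Double
asDouble = 42
```

In ghci, :type largestInt and :type big show which type each annotation selected.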

Basic conversion:

Python	Haskell
int(0.5) # float to int	truncate 0.5
float(1) # int to float	fromIntegral 1 :: Float
int("42")	read "42" :: Int

Type annotations

Python:

def lines(s: str) -> List[str]:
    return s.split("\n")

Haskell:

-- ghci> :type lines
lines :: String -> [String]

Type annotations are prefixed by ::.
lines is a function that takes a String, and it returns a list of Strings, denoted [String].

Python	Haskell
def add_and_double(m : int, n: int) -> int:	add_and_double :: Num a => a -> a -> a

Constraints on type variables come before =>; here Num a constrains the type variable a.
The type is a -> a -> a: a function that takes two values of type a and returns a value of type a.
a is a type variable. It can be an Int, a Float, or any type that is an instance of the Num type class (more on that later).
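The following sketch shows the same constrained function used at two different types. withInts and withDoubles are names made up for this example:

```haskell
add_and_double :: Num a => a -> a -> a
add_and_double m n = 2 * (m + n)

-- Here the type variable a is instantiated to Int.
withInts :: Int
withInts = add_and_double 20 1  -- 42

-- Here the type variable a is instantiated to Double.
withDoubles :: Double
withDoubles = add_and_double 1.5 2  -- 7.0
```

Calling add_and_double with a String would be rejected at compile time, because String has no Num instance.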

Partial application

Python:

def add20_and_double(n):
    return add_and_double(20, n)
add20_and_double(1)

Haskell:

add20_and_double = add_and_double 20
add20_and_double 1

For example, the map function type annotation is:

map :: (a -> b) -> [a] -> [b]
map takes a function that goes from a to b, denoted (a -> b), a list of as and it returns a list of bs:

Python:

list(map(lambda x: x * 2, [1, 2, 3]))
# [2, 4, 6]

Haskell:

map (* 2) [1, 2, 3]
-- [2, 4, 6]

Here are the annotations for each sub-expression:

(*)         :: Num a => a -> a -> a
(* 2)       :: Num a => a -> a
map         :: (a -> b) -> [a] -> [b]
(map (* 2)) :: Num b => [b] -> [b]

Record

A group of values is defined using a record:

Python:

class Person:
    def __init__(self, name):
        self.name = name
person = Person("alice")
print(person.name)

Haskell:

data Person = Person { name :: String }
person = Person "alice"
print(name person)

The first line defines a Person type with a single Person constructor that takes a String attribute.
Record attributes are actually functions.

Here are the annotations of the record functions automatically created:

Person :: String -> Person
name :: Person -> String
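Because both the constructor and the field accessor are plain functions, they can be passed around like any other function. A minimal sketch, where people, names and fromNames are illustrative names:

```haskell
data Person = Person { name :: String }

people :: [Person]
people = [Person "alice", Person "bob"]

-- name is an ordinary function, so it can be passed to map.
names :: [String]
names = map name people  -- ["alice","bob"]

-- The constructor is a function too.
fromNames :: [Person]
fromNames = map Person ["carol", "dave"]
```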

Record values can be updated:

Python:

new_person = copy.copy(person)
new_person.name = "bob"

Haskell:

new_person = person { name = "bob" }

See this blog post for more details about record syntax.

(Type) class

The closest Haskell equivalent of a class interface is a type class. For example, objects that can be compared for equality:

Python:

# The `==` operator uses the object's `__eq__` method:
class Person:
    def __eq__(self, other):
        return self.name == other.name

Haskell:

-- The `==` operator works with the Eq type class:
data Person = Person { name :: String }
instance Eq Person where
    self == other = name self == name other

Type classes can also have constraints:

Python:

# The `>` operator uses the object's `__gt__` method:
class ComparablePerson(Person):
    def __gt__(self, other):
        return self.age > other.age

Haskell:

-- ghci> :info Ord
class Eq a => Ord a where
    compare :: a -> a -> Ordering

Haskell can derive most type classes automatically using the deriving keyword:

data Person =
  Person {
    name :: String,
    age :: Int
  } deriving (Show, Eq, Ord)
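Here is a sketch of what the derived instances provide. The values alice and bob and the helper names are assumptions made for this example:

```haskell
data Person = Person
  { name :: String
  , age  :: Int
  } deriving (Show, Eq, Ord)

alice, bob :: Person
alice = Person "alice" 30
bob   = Person "bob" 25

-- Derived Eq compares field by field.
sameAlice :: Bool
sameAlice = alice == Person "alice" 30  -- True

-- Derived Ord compares fields in declaration order: name first, then age.
aliceFirst :: Bool
aliceFirst = alice < bob  -- True, since "alice" < "bob"

-- Derived Show renders the record syntax.
shown :: String
shown = show alice  -- Person {name = "alice", age = 30}
```

Note that the derived ordering looks at name before age. To sort by age alone, a hand-written Ord instance or a function such as sortOn age from Data.List would be needed.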

Common type classes are:

Read
Show
Eq
Ord
Semigroup

Do notation

Expressions that produce side-effecting IO operations are descriptions of what they do. For example, the description can be assigned and evaluated when needed:

Python:

deferred = lambda: print("Hello")
deferred()

Haskell:

deferred = print("Hello")
deferred

Such expressions are often defined using do notation:

Python:

def welcome():
    print("What is your name? ")
    name = input()
    print("Welcome " + name)

Haskell:

welcome = do
    putStrLn "What is your name?"
    name <- getLine
    print ("Welcome " ++ name)

The <- binds a name to the result of an IO action.
The last expression must be an IO action too; use pure to wrap a value that is not already in IO.
do notation can also be used for non-IO computations.
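As an example of a non-IO use, here is a sketch of do notation with Maybe, where the first Nothing short-circuits the rest of the block. safeDiv and compute are hypothetical helpers:

```haskell
-- Hypothetical helper: integer division that fails on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- do notation in Maybe: each <- unwraps a Just, and a Nothing aborts the block.
compute :: Int -> Int -> Int -> Maybe Int
compute a b c = do
  q <- safeDiv a b
  r <- safeDiv q c
  pure (q + r)
```

With these definitions, compute 100 5 2 gives Just 30, while compute 1 0 2 gives Nothing without evaluating the second division.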

do notation is syntactic sugar; here is an equivalent implementation using regular operators:

welcome =
    putStrLn "What is your name?" >>
    getLine >>= \name ->
        print ("Welcome " ++ name)

>> discards the previous value while >>= binds it as the first argument of the operand function.

Algebraic Data Type (ADT)

Here the Bool type has two constructors True or False. We can say that Bool is the sum of True and False:

data Bool = True | False

Here the Person type has one constructor MakePerson that takes two concrete values. We can say that Person is the product of String and Int:

data Person = MakePerson String Int

Data types can be polymorphic:

data Maybe  a   = Just a | Nothing
data Either a b = Left a | Right b

Pattern matching

Multiple function bodies can be defined for different arguments using patterns:

Python:

def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

Haskell:

factorial 0 = 1
factorial n = n * factorial(n - 1)

Values can also be matched using a case expression:

Python:

def first_elem(l):
    if len(l) > 0:
        return l[0]
    else:
        return None

Haskell:

first_elem l = case l of
    (x:_) -> Just x
    _     -> Nothing

_ matches anything.
See this section of Why Haskell Matters to learn more about list pattern match.
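Pattern matching also works on the constructors of your own algebraic data types. The Shape type and the functions below are invented for this sketch:

```haskell
-- A sum of two constructors, each a product of fields.
data Shape
  = Circle Double            -- radius
  | Rectangle Double Double  -- width and height

-- One equation per constructor; the fields are bound by the pattern.
area :: Shape -> Double
area (Circle r)      = pi * r * r
area (Rectangle w h) = w * h

-- List patterns: the empty list, or a head followed by a tail.
totalArea :: [Shape] -> Double
totalArea []       = 0
totalArea (s : ss) = area s + totalArea ss
```

When compiling with -Wall, ghc warns if a constructor is left unhandled, which is helpful when a new case is added to Shape later.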

Nested Scope

Nesting the scope of definitions is a commonly used pattern, for example with .. where ..:

Python:

def main_fun(arg):
    value = 42
    def sub_fun(sub_arg):
        return value
    return sub_fun(arg)

Haskell:

main_fun arg = sub_fun arg
  where
    value = 42
    sub_fun sub_arg = value

Where clauses can be nested. Another pattern is to use let .. in ..:

Python:

def a_fun(arg):
    (x, y) = arg
    return x + y

Haskell:

a_fun arg =
  let (x, y) = arg
  in x + y

For more details see Let vs. Where.

Standard library

Note that the standard library is likely not enough. Add these extra libraries to the build-depends list of your playground cabal file, then reload stack ghci:

aeson
bytestring
containers
text

Prelude

By default, Haskell programs have access to the Prelude module of the base library:

Python	Haskell
f(g(x))	(f . g) x
print(len([1, 2]))	print $ length $ [1, 2]
[1, 2] + [3]	[1, 2] <> [3]
"Hello" + "World"	"Hello" <> "World"
(True, 0)[0]	fst (True, 0)
tuples = [(True, 2), (False, 3)]	tuples = [(True, 2), (False, 3)]
map(lambda x: x[1], tuples)	map snd tuples
filter(lambda x: x[0], tuples)	filter fst tuples

The $ operator is function application with the lowest precedence: everything to its right is grouped into a single argument, so the right-hand operand does not need parentheses.
The <> operator works on all semigroups (while ++ only works on lists).
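To compare the styles side by side, here is a sketch with three equivalent definitions and one use of <> on a type other than a list. All names are illustrative:

```haskell
-- These three definitions compute the same value.
withParens, withDollar, withCompose :: Int
withParens  = length (filter even (map (* 3) [1 .. 10]))
withDollar  = length $ filter even $ map (* 3) [1 .. 10]
withCompose = (length . filter even . map (* 3)) [1 .. 10]

-- <> works on any Semigroup, for example Maybe wrapping a String.
joined :: Maybe String
joined = Just "Hello " <> Just "World"  -- Just "Hello World"
```

Which style to use is mostly a matter of taste; $ tends to read well for a single pipeline, while composition is handy when the pipeline itself is passed to another function.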

Data.List

Python	Haskell
l = [1, 2, 3, 4]	l = [1, 2, 3, 4]
l[0]	head l
l[1:]	tail l
l[:2]	take 2 l
l[2:]	drop 2 l
l[2]	l !! 2
sorted([3, 2, 1])	sort [3, 2, 1]

Data.Maybe

Functions to manipulate optional values: data Maybe a = Just a | Nothing.

Python:

pred = True
value = 42 if pred else None
print(value if value else 0)

values = [21, None, 7]
[value for value in values if value is not None]

Haskell:

import Data.Maybe
value = Just 42
print(fromMaybe 0 value)

values = [Just 21, Nothing, Just 7]
catMaybes values
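Two other functions I find handy are mapMaybe from Data.Maybe and readMaybe from Text.Read (both in base). A small sketch, where parsePort, validPorts and the default of 8080 are made up for the example:

```haskell
import Data.Maybe (fromMaybe, mapMaybe)
import Text.Read (readMaybe)

-- Parse a port number, falling back to a default when parsing fails.
parsePort :: String -> Int
parsePort s = fromMaybe 8080 (readMaybe s)

-- Keep only the strings that parse; the failures are dropped.
validPorts :: [String] -> [Int]
validPorts = mapMaybe readMaybe
```

For instance, parsePort "abc" falls back to 8080, and validPorts ["80", "x", "443"] keeps only 80 and 443.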

Data.Either

Functions to manipulate either type: data Either a b = Left a | Right b.

Python:

def safe_div(x, y):
    if y == 0: return "Division by zero"
    else:      return x / y

values = [safe_div(1, y) for y in range(-5, 11)]
[v for v in values if isinstance(v, float)]
[v for v in values if isinstance(v, str)]

Haskell:

import Data.Either
safe_div _ 0 = Left "Division by zero"
safe_div x y = Right $ x / y

values = [safe_div 1 y | y <- [-5..10]]
rights values
lefts values
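Data.Either also offers partitionEithers to split both sides in one pass, and the Prelude has either to collapse the two cases into a single result. The sketch below reuses safe_div with explicit types; split and describe are hypothetical names:

```haskell
import Data.Either (partitionEithers)

safe_div :: Double -> Double -> Either String Double
safe_div _ 0 = Left "Division by zero"
safe_div x y = Right (x / y)

values :: [Either String Double]
values = [safe_div 1 y | y <- [-1, 0, 1]]

-- Errors on the left, results on the right.
split :: ([String], [Double])
split = partitionEithers values  -- (["Division by zero"],[-1.0,1.0])

-- either applies the first function to a Left and the second to a Right.
describe :: Either String Double -> String
describe = either ("error: " ++) show
```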

Data.Text

The default type for a string is a list of characters; Text provides a more efficient alternative:

Python	Haskell
	import qualified Data.Text as T
a_string = "Hello world!"	a_string = T.pack "Hello world!"
a_string.replace("world", "universe")	T.replace "world" "universe" a_string
a_string.split(" ")	T.splitOn " " a_string
list(a_string)	T.unpack a_string

Data.Text can also be used to read files:

Python	Haskell
	import qualified Data.Text as T
	import qualified Data.Text.IO as T
cpus = open("/proc/cpuinfo").read()	cpus <- T.readFile "/proc/cpuinfo"
lines = cpus.splitlines()	cpus_lines = T.lines cpus
filter(lambda s: s.startswith("processor\t"), lines)	filter (T.isPrefixOf "processor\t") cpus_lines

Use :set -XOverloadedStrings in ghci so that "string" literals such as "processor\t" are read as Text.

Data.ByteString

Use ByteString to work with raw data bytes. Both Data.Text and Data.ByteString come in two flavors, strict and lazy.

Strict version, to and from String:

Data.Text.pack                :: String -> Text
Data.Text.unpack              :: Text   -> String

Data.ByteString.Char8.pack    :: String     -> ByteString
Data.ByteString.Char8.unpack  :: ByteString -> String

Strict version between Text and ByteString:

Data.Text.Encoding.encodeUtf8 :: Text       -> ByteString
Data.Text.Encoding.decodeUtf8 :: ByteString -> Text

Conversion between strict and lazy:

Data.Text.Lazy.fromStrict       :: Data.Text.Text      -> Data.Text.Lazy.Text
Data.Text.Lazy.toStrict         :: Data.Text.Lazy.Text -> Data.Text.Text

Data.ByteString.Lazy.fromStrict :: Data.ByteString.ByteString      -> Data.ByteString.Lazy.ByteString
Data.ByteString.Lazy.toStrict   :: Data.ByteString.Lazy.ByteString -> Data.ByteString.ByteString

To avoid using fully qualified type names, these libraries are usually imported like so:

import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.Text (Text)
import qualified Data.Text as T
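Putting these conversions together, here is a round trip between String, Text and ByteString. This is only a sketch that assumes the text and bytestring packages are available; the TE alias and the value names are my own choices:

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- "\233" is the escape for the character e with an acute accent.
greeting :: Text
greeting = T.pack "h\233llo"

-- Encode the text as UTF-8 bytes.
encoded :: ByteString
encoded = TE.encodeUtf8 greeting

-- Text counts characters, ByteString counts bytes.
charCount, byteCount :: Int
charCount = T.length greeting  -- 5
byteCount = B.length encoded   -- 6, the accented character takes two bytes

-- Decode back; decodeUtf8 throws an exception on invalid input.
roundTrip :: Text
roundTrip = TE.decodeUtf8 encoded
```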

Containers

The containers library offers useful container types. For example Map:

Python	Haskell
	import qualified Data.Map as M
d = dict(key="value")	d = M.fromList [("key", "value")]
d["key"]	M.lookup "key" d
d["other"] = "another"	M.insert "other" "another" d

Set:

Python	Haskell
	import qualified Data.Set as S
s = set(("Alice", "Bob", "Eve"))	s = S.fromList ["Alice", "Bob", "Eve"]
"Foo" in s	"Foo" `S.member` s
len(s)	S.size s

Check out the documentation by running stack haddock --open containers.

When unsure, use the strict version (for example Data.Map.Strict).
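As a slightly larger sketch, here is a word counter built on Data.Map.Strict, roughly what collections.Counter does in Python. The function wordCounts and the sample sentence are made up for this example:

```haskell
import qualified Data.Map.Strict as M

-- Count how many times each word occurs; fromListWith merges duplicate keys.
wordCounts :: String -> M.Map String Int
wordCounts = M.fromListWith (+) . map (\w -> (w, 1)) . words

counts :: M.Map String Int
counts = wordCounts "a b a c a b"

-- lookup returns a Maybe, findWithDefault supplies a fallback value.
countOfA :: Maybe Int
countOfA = M.lookup "a" counts  -- Just 3

countOfZ :: Int
countOfZ = M.findWithDefault 0 "z" counts  -- 0
```

M.toList counts returns the pairs sorted by key, here [("a",3),("b",2),("c",1)].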

Language Extensions

The main compiler ghc supports some useful language extensions. They can be enabled:

Per file using this syntax: {-# LANGUAGE ExtensionName #-}.
Per project using the default-extensions: ExtensionName cabal configuration.
Per ghci session using the :set -XExtensionName command.

Note that in ghci, the :set - command can be auto-completed using Tab.

OverloadedStrings

Enables automatic conversion of "string" literals to the appropriate type.
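A minimal sketch of the per-file form, assuming the text package is available. The value names are only illustrative:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T

-- With the extension, the literal takes the type given by its annotation.
title :: Text
title = "Haskell for Python developers"

-- Plain String literals keep working as before.
asString :: String
asString = "still a plain String"

-- No T.pack is needed before calling Text functions.
shortTitle :: Text
shortTitle = T.take 7 title  -- "Haskell"
```

Without the pragma, the definition of title would be rejected because a String literal is not a Text.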

NumericUnderscores

Enables underscores as digit separators, e.g. 1_000_000.

NoImplicitPrelude

Disables the implicit import Prelude.

Please check What I Wish I Knew When Learning Haskell for a complete overview of Language Extensions, or this post from the kowainik team.

Further Resources

To delve in further, I recommend digging through the links I shared above.

These introductory books are often mentioned:

A Type of Programming by Renzo Carbonara.
Learn Haskell by Chris Allen.
Get Programming with Haskell by Will Kurt (Manning).
Graham Hutton’s textbook Programming in Haskell (2nd ed).

Finally, if you need help, please join the #haskell-beginners IRC channel on Freenode.

Thank you for reading!