Applying Mathematics to Programming

In mathematics, functions like equality, addition, concatenation, etc. have been studied in detail, and so when we use these functions in programming it makes sense to consider what mathematicians have learned about them.

This turns out to be a very fruitful and deep approach to programming, and we will only scratch the surface in this note.

Equality

To get the flavor of the sort of mathematical approach Haskell allows for programming, let’s consider mathematical equality.

Testing if two values are the same is a common programming task. For example, C++ lets you define your own equality operator on some type T like this:

template<class T>            // C++
bool operator==(T x, T y)

Here, T is a user-defined C++ type. For example:

struct Color {              // C++
    uint32_t r, g, b;    // uint32_t is from <cstdint>
};

bool operator==(Color x, Color y) {
    return x.r == y.r && x.g == y.g && x.b == y.b;
}

C++ puts no restrictions on what operator== does other than requiring that it take a Color value as input and return a bool as output. A programmer is free to write a nonsensical equality function like this:

bool operator==(Color x, Color y) {
    return x.r + y.b + x.g == 104;    // Huh?
}

In mathematics, however, equality functions are expected to satisfy the rules for an equivalence relation. A C++ == function is considered mathematically sensible if the following rules all hold (x, y, and z are all assumed to be values of the same type):

  1. Reflexivity: For all x, x == x.
  2. Symmetry: For all x and y, if x == y then y == x.
  3. Transitivity: For all x, y, and z, if x == y and y == z, then x == z.

The nonsensical == function above doesn’t satisfy all these rules. In particular, it is not symmetric. Writing colors as (r, g, b) triples, (100, 3, 5) == (20, 7, 1) is true (since 100 + 1 + 3 == 104), but (20, 7, 1) == (100, 3, 5) is false (since 20 + 5 + 7 == 32).

For many equality functions, especially simple ones, it is obvious that these rules hold, and so we often don’t explicitly check them. However, if you ever implement an unusual or tricky equality function, it is worth ensuring that these three rules hold — otherwise it might not work the way you expect!
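One way to check the three rules is a brute-force test over a small sample of values. The following sketch (all names hypothetical) models colors as triples and tests both a sensible equality function and a nonsensical one like the C++ example above:

```haskell
-- Brute-force check of the equivalence-relation laws over a small
-- sample of values; not a proof, but it catches many mistakes.
type Color = (Int, Int, Int)

sensibleEq :: Color -> Color -> Bool
sensibleEq (r1,g1,b1) (r2,g2,b2) = r1 == r2 && g1 == g2 && b1 == b2

badEq :: Color -> Color -> Bool
badEq (r1,g1,_) (_,_,b2) = r1 + b2 + g1 == 104   -- like the "Huh?" example

sample :: [Color]
sample = [(r,g,b) | r <- [0,52,104], g <- [0,52], b <- [0,52]]

isEquivalence :: (Color -> Color -> Bool) -> Bool
isEquivalence eq =
       and [x `eq` x | x <- sample]                                 -- reflexive
    && and [not (x `eq` y) || y `eq` x | x <- sample, y <- sample]  -- symmetric
    && and [not (x `eq` y && y `eq` z) || x `eq` z                  -- transitive
           | x <- sample, y <- sample, z <- sample]

main :: IO ()
main = do
    print (isEquivalence sensibleEq)   -- True
    print (isEquivalence badEq)        -- False (reflexivity already fails)
```

Note that badEq fails reflexivity too: (0, 0, 0) does not equal itself, since 0 + 0 + 0 is not 104.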

By the way, the default == for double in C++ is not an equivalence relation. It is not reflexive, i.e. there are some double values that don’t equal themselves. For instance:

double x = 0.0 / 0.0;
if (x == x) {
    cout << "x equals x\n";
} else {
    cout << "x does not equal x (!)\n";
}

This compiles, runs, and prints “x does not equal x (!)”.
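Haskell’s Double behaves the same way: its == follows the IEEE floating-point rules, so NaN does not equal itself there either. A quick check:

```haskell
-- NaN breaks reflexivity of == for Double in Haskell, too.
nan :: Double
nan = 0 / 0

main :: IO ()
main = do
    print (isNaN nan)     -- True
    print (nan == nan)    -- False: == on Double is not reflexive
```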

In addition to being an equivalence relation, there are other things we usually require of a good == function:

  • After x == y is called, neither x nor y has been modified. You can enforce this in C++ by passing x and y by value (or by const reference).

  • == should have no side-effects, i.e. nothing outside of == should be modified. For example, this could be bad:

    int a = 0;  // global variable
    
    bool operator==(Color x, Color y) {
        ++a;
        return x.r == y.r && x.g == y.g && x.b == y.b;
    }
    

    There could be cases where this is useful, though. For instance, a is a count of how many times == is called, which provides useful profiling information.

    Haskell doesn’t allow regular functions to have side-effects, and so this is not a problem. Side-effects are only permitted in special situations, e.g. inside a do environment.

  • x == y should always return the same value every time we call it. For instance, this would not be a good ==:

    bool operator==(Color x, Color y) {
        return (rand_bit() == 0);  // rand_bit() randomly returns 0 or 1
    }
    

    Again, the lack of side effects in Haskell solves this problem. A function like rand_bit() cannot be written as a regular Haskell function.

  • Both x and y have the same type T, i.e. they are from the same set of values. C++ sometimes lets you compare different types, e.g. 3 == 3.0 evaluates to true, but 3 is an int and 3.0 is a double. C++ is not directly comparing 3 and 3.0; instead, it automatically converts the 3 to 3.0, and then evaluates the expression 3.0 == 3.0.

    The definition of an equivalence relation implies that the values being compared are from the same set, which implies they ought to be the same type.

  • x == y doesn’t ever get stuck in an infinite loop. In other words, x == y either returns true or false, or perhaps throws an error. You could argue that eventually returning a value is a requirement for any correct function, but it is worth pointing out because infinite loops are not an issue in ordinary mathematical functions.

  • And last, but not least, == should usually be implemented efficiently in both time and memory. In mathematics, time/memory is not usually a concern for function evaluation, but it is a major concern in programming.

A Python Monoid

A monoid is another example of an abstract system that has many applications in programming.

One of the many interesting things about Python is that it lets you use + and * to build strings. For example:

>>> "cat" + "dog"            # Python
'catdog'
>>> "dog" + "cat"
'dogcat'
>>> 3 * 'dog'
'dogdogdog'
>>> 'cat' * 3
'catcatcat'
>>> 3 * ('cat' + 'dog')
'catdogcatdogcatdog'
>>> '' + 'cat'
'cat'
>>> 'cat' + ''
'cat'

When used with strings, Python’s + is the concatenation operator. Notice that, in general, if s and t are strings, then s + t is not equal to t + s. Also, if s is a string, then n * s and s * n are only legal if n is an integer, e.g. an expression like 'a' * 'n' is an error.

It turns out that Python’s + for string concatenation satisfies the rules of an algebraic structure called a monoid. For strings, the monoid rules can be written like this:

  1. Associativity: For all strings s, t, and u, (s + t) + u = s + (t + u).
  2. Identity element: For all strings s, s + '' = '' + s = s, where '' is the empty string.

A consequence of these rules is that an expression like 3 * 'dog' makes sense: it equals 'dog' + 'dog' + 'dog', and associativity guarantees that the grouping of the appends doesn’t matter.
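The same computations can be done in Haskell, where ++ is the append operator for strings (i.e. lists of characters). Here times is a hypothetical helper corresponding to Python’s n * s:

```haskell
-- A Haskell analogue of Python's n * s: append s to itself n times.
times :: Int -> String -> String
times n s = concat (replicate n s)

main :: IO ()
main = do
    putStrLn (times 3 "dog")     -- dogdogdog
    putStrLn ("cat" ++ "")       -- cat: "" is a right identity for ++
    putStrLn ("" ++ "cat")       -- cat: "" is a left identity for ++
```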

Many types other than strings satisfy the monoid rules, and so Haskell has the following pre-defined Monoid type class:

class Monoid m where
  mempty :: m
  mappend :: m -> m -> m
  mconcat :: [m] -> m
  mconcat = foldr mappend mempty  -- default implementation of mconcat

The name mappend is read as “monoid append”. The essential idea of a monoid in Haskell is that it describes a type whose values can be appended in a way that satisfies the monoid rules.

For example, the list monoid in Haskell is defined like this:

instance Monoid [a] where
  mempty = []
  mappend x y = x ++ y

Since strings in Haskell are just lists of characters, strings are monoids as well.

Haskell leaves it entirely up to the programmer to ensure that the two monoid rules hold. Haskell does not check this for you (how could it?).
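As a sketch of a user-defined monoid, here is integer addition wrapped in a newtype: 0 is the identity element and (+) plays the role of append. One caveat: recent versions of GHC require a Semigroup instance (defining the append operator <>) alongside Monoid, a detail the older, simpler class definition above omits. The name Add is hypothetical:

```haskell
-- Integer addition as a monoid: mempty is 0, append is (+).
newtype Add = Add Int deriving (Show, Eq)

instance Semigroup Add where
    Add x <> Add y = Add (x + y)

instance Monoid Add where
    mempty = Add 0

main :: IO ()
main = do
    print (mconcat [Add 1, Add 2, Add 3])   -- Add 6
    print (mconcat ["cat", "dog"])          -- "catdog": strings are monoids
```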

Review: The List map Function

Recall that the expression map f lst applies f to every element of lst. It’s a standard Haskell function, and can be used like this:

> map (+1) [1,2,3]
[2,3,4]

> map (>1) [1,2,3]
[False,True,True]

> map (== 'p') "apple"
[False,True,True,False,False]

Its type signature is:

map :: (a -> b) -> [a] -> [b]

Take the time to make sure you really understand this type signature. The first input is a function of type a -> b, i.e. a function that converts a value of type a to a value of type b. The second input is a list of a values. The result of the mapping will be a list of values of type b, and so the output of map will be a list of b values.

Sometimes it is useful to pass only the first argument to map. For example:

> twice :: Int -> Int
> twice n = 2 * n
>
> doubler = map twice  -- doubler :: [Int] -> [Int]

For example:

> doubler [1,2,3,4]
[2,4,6,8]

doubler is an example of a lifted function. The twice function operates on integers, and map twice operates on a list of integers. So map can be thought of as taking a function that applies to single values of type a and returning a new function that applies to lists of a values.

Generalizing map

The essential idea of map f lst is to apply a function f to every element of lst in such a way that it modifies individual elements without changing the list structure. In contrast, foldl and foldr don’t promise to return a list, e.g. a fold could change the structure of a list into a number, a boolean, or any other type of value.

What would a map function on a tree structure do? It would apply a function f to every element in the tree, preserving the structure of the tree.

Even more generally, suppose you have a container of values structured in some way. If you map a function g onto this container, then we would expect that every value in it would have g applied to it, and the structure of the container wouldn’t change.

This is the general notion of mapping that a functor encapsulates. In Haskell, a functor can be represented by a type class like this:

class Functor f where               -- Functor is in the standard Prelude
    fmap :: (a -> b) -> f a -> f b
    -- ...

In Functor f, the f is a type constructor that takes one input, e.g. f could be the Maybe type constructor. The function fmap is the generalized mapping function. Notice that the first input to fmap is a function of type a -> b. But the second input is of type f a, not [a] as with a regular list-based map. f a is a generalization of [a]. For lists, f would in fact be equal to [], but that’s just one instance of a functor:

instance Functor [] where  -- Functor instance for built-in lists
    fmap = map

Consider this user-defined list data type:

> data List a = Nil | Cons a (List a)
>   deriving Show

We could define a functor instance for it as follows:

> instance Functor List where
>     fmap g Nil               = Nil
>     fmap g (Cons first rest) = Cons (g first) (fmap g rest)

Now we can write code like this:

> fmap (+1) (Cons 1 (Cons 2 (Cons 3 Nil)))
Cons 2 (Cons 3 (Cons 4 Nil))

Trees are another example of a type where a functor makes sense:

> -- binary search tree (BST)
> data BST a = EmptyBST | Node a (BST a) (BST a)
>      deriving (Show, Read, Eq)
>
> instance Functor BST where
>     fmap g EmptyBST            = EmptyBST
>     fmap g (Node x left right) = (Node (g x)
>                                        (fmap g left)
>                                        (fmap g right))

Now we can write code that uses fmap to apply a function to every value in a BST:

> fmap (+1) (Node 1 (Node 2 EmptyBST EmptyBST) (Node 3 EmptyBST EmptyBST))
Node 2 (Node 3 EmptyBST EmptyBST) (Node 4 EmptyBST EmptyBST)

An interesting example of a functor is the Maybe type:

data Maybe a = Nothing | Just a   -- defined in the prelude

A Maybe value can be thought of as a container that holds either Nothing, or a value of type a. So we could map over it like this:

instance Functor Maybe where  -- already defined in the prelude
    fmap f (Just x) = Just (f x)
    fmap f Nothing  = Nothing

This says that mapping over Nothing gives you Nothing, and mapping over Just x gives you Just (f x).

For example:

> fmap (+1) (Just 2)
Just 3

> fmap (+1) Nothing
Nothing

Rules for Functors

In addition to satisfying the fmap signature, a functor is usually expected to obey these two rules:

fmap id       ==  id               -- identity
fmap (f . g)  ==  fmap f . fmap g  -- composition

A functor that doesn’t follow these two rules can end up doing some unexpected things. Haskell doesn’t automatically check that these two rules are satisfied — it’s up to the programmer to make sure they are followed.
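The two rules can be spot-checked on sample values (a sanity check, not a proof):

```haskell
main :: IO ()
main = do
    -- identity law on a sample list
    print (fmap id [1,2,3] == id [1,2,3])               -- True
    -- composition law on a sample Maybe value: mapping a composed
    -- function in one pass equals mapping in two passes
    print (fmap ((+1) . (*2)) (Just 5)
             == (fmap (+1) . fmap (*2)) (Just 5))       -- True
```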

Imperative Programming in Haskell

The reason we care about these abstract mathematical descriptions of programming is that Haskell uses an abstraction known as a monad to do most of its imperative (i.e. non-functional or impure) programming.

Monads are an infamous topic that many programmers find tricky to master. Monads are often presented in a highly mathematical and abstract way that makes their utility in programming hard to see. So we will approach this from a practical point of view, and not worry much about the mathematics.

The IO Monad

Consider putChar, a standard Haskell function with this signature:

putChar :: Char -> IO ()

It takes a single Char as input, and performs the action of printing that character on the screen. The IO in the return type indicates that this is an action. No value of interest is returned by putChar, so the type of the value inside the IO is the unit type ().

Another primitive action is done. It performs no action and returns no value of interest. It is not part of the standard Haskell prelude, but we can define it like this:

done :: Monad m => m ()  -- ignore the type signature for now
done = return ()

The name return is unfortunate, since it doesn’t mean the same thing as in most other languages.

Action Sequencing

We typically want to do multiple actions in sequence, so Haskell provides the sequence operator (>>). For the moment, we can think of (>>) as having this signature:

(>>) :: IO () -> IO () -> IO ()   -- not the most general type signature!
                                  -- will be generalized shortly

If p and q are actions, then p >> q first does p, and then does q. For example:

> putChar 'a' >> putChar 'b' >> putChar '\n'
ab
>

With this we can define the putStrLn function, which prints an entire string on the screen with a trailing newline:

myPutStrLn :: String -> IO ()
myPutStrLn xs = (foldr (>>) done (map putChar xs)) >> putChar '\n'

To understand this, let’s trace the call myPutStrLn "cat". First, (map putChar "cat") is evaluated, which returns this list:

[putChar 'c', putChar 'a', putChar 't']

We want to do the three actions in the order they are given in the list, e.g.:

putChar 'c' >> (putChar 'a' >> (putChar 't' >> done))

Recall that done is an action that does nothing, and returns no value of interest.

This expression has the form of a right fold, and so we can use foldr to calculate it:

foldr (>>) done [putChar 'c', putChar 'a', putChar 't']

Finally, after this fold is finished, a final putChar '\n' is added to the end.
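This fold-over-actions pattern is common enough that the standard prelude provides it directly: mapM_ performs each action in a list in order, discarding their results. So myPutStrLn could equivalently be sketched like this:

```haskell
-- Equivalent definition using the standard mapM_ sequencing function,
-- which performs a list of actions in order and throws away the results.
myPutStrLn :: String -> IO ()
myPutStrLn xs = mapM_ putChar xs >> putChar '\n'

main :: IO ()
main = myPutStrLn "cat"   -- prints cat followed by a newline
```

mapM_ works for any monad, not just IO, which is one payoff of the abstract treatment of sequencing.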

Another example of an action is getChar:

getChar :: IO Char

When run, getChar reads a character from the keyboard:

> getChar
m'm'       -- first m typed by user; 'm' is returned from getChar

It does not return a Char. Instead, as its type signature shows, it returns an IO Char.

Above we defined the do-nothing action done like this:

done = return ()

The return function itself has this type:

return :: a -> IO a   -- not the most general type: specialized for IO

You can think of return as a generalization of done. return takes a value of type a as input, performs no action, and returns an IO a value.

Now we can generalize (>>) a little more:

(>>) :: IO a -> IO b -> IO b

So if p and q are actions, then p >> q works as follows. First, p is performed, and its return value is thrown away. Second, q is performed, and the value of q is returned. For example:

> return "hot" >> return "dog"
"dog"

Since the return value of return "hot" is thrown away, there is no way that following actions can depend on it. That’s a significant limitation, as we almost always want later expressions in a program to depend upon earlier ones.

The Bind Operator

So (>>) is not quite the right operator for putting actions in sequence. We need a more general operator that allows the second action to get the result of the first action. The (>>=) operator is the one we want:

(>>=) :: IO a -> (a -> IO b) -> IO b

Suppose we evaluate p >>= f. First, p is performed, and it returns a value x of type a. Then f is applied to x, producing a new action of type IO b; that action is performed, and its value y of type b is the final value returned.

(>>=) is known as the bind operator, and is sometimes pronounced “then apply”.

Notice that we can define (>>) using (>>=):

p >> q = p >>= const q

const is a standard prelude function defined as const x y = x, and so const q is a function that takes one input (in this example, the value from p) and always returns q. const throws away its second argument, and that’s what happens here: the value returned by p is ignored.
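A quick check of const’s behavior, and of the claim that sequencing with it ignores the first action’s value:

```haskell
main :: IO ()
main = do
    print (const 1 2)      -- 1: const ignores its second argument
    -- p >>= const q behaves like p >> q: the value "hot" is discarded
    r <- return "hot" >>= const (return "dog")
    print r                -- "dog"
```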

(>>=) is the key operator for understanding imperative programming in Haskell. It lets you do two actions in sequence in such a way that the second action can depend upon the result of the first action.

For example, we can write our own version of the standard getLine function like this:

getLine :: IO String
getLine = getChar >>= f
          where f x = if x == '\n' then return []
                      else getLine >>= g
                           where g xs = return (x:xs)

The first thing getLine does is get a single character. If that character is a newline, then it’s finished and the “do nothing” action is returned with a value of the empty string (i.e. []). Otherwise, if the character is not a newline, we get the rest of the line (called xs) and add x to the front.

Another way to write getLine is like this:

getLine = getChar >>= \x ->
              if x == '\n'
              then return []
              else getLine >>= \xs ->
                     return (x:xs)

Here, the second argument to (>>=) is written as a lambda function in each case.

Yet another way of writing this is to use the special do-notation, which is designed to simplify the use of (>>=):

getLine = do {x <- getChar;
              if x == '\n'
              then return []
              else do {xs <- getLine;
                      return (x:xs)}}

The braces and semi-colons are used to help clarify the layout.

Note

You don’t need to know any details of using braces or semi-colons in Haskell for this course.

Finally, we mention the Monad type class where return and (>>=) operators can be defined for any suitable type m:

class Monad m where
  return :: a -> m a
  (>>=)  :: m a -> (a -> m b) -> m b

In the examples above, m was the type IO. But m can be a wide variety of types. It has proven to be a very useful pattern for functional programming, and any type that is an instance of Monad can be used with do-notation.

For example, the Maybe type is an instance of Monad, defined like this:

instance Monad Maybe where
  return         = Just
  (Just x) >>= g = g x
  Nothing  >>= _ = Nothing

In addition to the two required Monad functions, there are also a few rules that a Monad type should follow to be sensible. We won’t get into them here, but you can look up these Monad laws in the book Thinking Functionally with Haskell, or most other good Haskell references.
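Because Maybe is a Monad, do-notation works with it too: a chain of computations that may each fail short-circuits to Nothing at the first failure. A sketch, where safeDiv is a hypothetical helper:

```haskell
-- Division that returns Nothing instead of crashing on zero.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Divide a by b, then the result by c; any failure aborts the chain.
calc :: Int -> Int -> Int -> Maybe Int
calc a b c = do
    q <- a `safeDiv` b     -- Nothing here ends the whole do block
    q `safeDiv` c

main :: IO ()
main = do
    print (calc 100 5 2)   -- Just 10
    print (calc 100 0 2)   -- Nothing: first division fails
```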

do Notation

In practice, most Haskell programmers use the do notation when doing imperative programming. For example:

main = do
    putStrLn "Hello, what's your name?"
    name <- getLine
    putStrLn ("Hello " ++ name ++ ", how is it going?")

Since putStrLn returns IO (), we put it in a do environment so that it can be sequenced with the other actions.

The line name <- getLine gets a string from the console and assigns it to the variable name. The type of getLine is important:

> :type getLine
getLine :: IO String

Inside a do environment, you can use getLine like an ordinary function. But outside a do environment, you can’t get the string getLine returns — you can only get IO String, i.e. a container that holds the string.

The <- notation is special to do environments: it binds the value produced by the action on its right to the name on its left.

Here’s another example of a do environment that shows that you can use regular pure Haskell features, such as let and map, within it:

import Data.Char

main = do
    putStrLn "What's your first name?"
    fname <- getLine
    putStrLn "What's your last name?"
    lname <- getLine
    let firstName = map toUpper fname
        lastName = map toUpper lname
    putStrLn $ "Hey " ++ firstName ++ " " ++ lastName ++ ", how are you?"