Set-theory in programming

Set-theory in programming

Von Ramon Brullo am 19.06.2022

Introduction

It is often said, that programming is simply a form of applied mathematics and so it is no surprise that many concepts usually used in the field of mathematics, such as variables, functions or algebra are also found in various programming languages.

One area of mathematics which is integral to many programming languages, especially the more functional ones, is set-theory. In this article I will briefly describe what set-theory is and how to apply it to programming.

Sets and their theory

First, I will give a brief rundown of what a set it and what operations can be performed on it.

In mathematics, a set is a group of distinct items. As an example, { 1 2 3 } is a set containing, three numbers. Sets must be distinct so this set { 1 2 3 1 } is equivalent to the first one, as duplicates are simply removed.

While in mathematics sets are often numbers, such as the set of all real numbers R, sets may contain all sorts of things. For example, I could create a set of all lakes inside Austria, or a set of all words that start with the letter “b”.

Sets may be defined in a few different ways. The most straight forward one is to simply list all elements, such as in { a b c }. This may however be unpractical or even impossible for large or infinite sets. In these cases, sets may also be defined using rules which must be satisfied by all members. As an example, I could define a set of all people who are younger than 20 years.

Notice that when defining a set using a rule, this often takes the form of defining a sub-set of an existing set which satisfy the condition. A sub-set is a set in which all items are also contained inside some parent set. For example, the set of all pugs is a sub-set of the set of all dogs. This shows us one of the most important operations which may be performed on sets, which is filtering, or creating a new set by taking some subset of an existing set.

Pugs are filtered from the set of all dogs

The next operation I want to touch on is mapping. Mapping takes each element in a set and transforms it in a consistent way to create another set. As an example, I may take a set of cars and map it by taking the color of each car. This will create a new set of colors. Notice that if multiple cars had the same color, they will be collapsed, as a set may only contain distinct elements.

Mapping the colors of cars creates a new set

Sets may also be merged, also called forming a union, which takes all elements of both sets to create a new one. In a similar vein, you may take the intersection of two sets, or all elements which exist in both sets.

How sets relate to programming

Though it may not be immediately obvious, types in modern programming languages are essentially sets. I will use TypeScript as an example, but feel free to find parallels in your programming language of choice. Consider the following line.

let x: number = 5

Here I declare a variable of the type number. Let’s think about what number means in this context. It denotes a set of all possible values which are considered a number in TypeScript and as luck would have it, 5 is an element of that set.

let x: string = 5

In contrast, this line does not work, as 5 is not an element in the set of all possible strings. More interesting parallels can be found when declaring custom types. For example, the following.

type stringOrNumber = string | number

In this line, I declare a new type and as such a new set. In this case I form the union of all strings and all numbers. As such all elements which are either a string or number are also in the set stringOrNumber. We call this adding the two types, because, if there are x possible strings and y possible numbers than this set will have x + y possible values.

type stringAndNumber = {
    s: string
    n: number
}

This in contrast, is a multiplied type. Let’s imagine that there were only 3 possible strings, “a”, “b” and “c” and only two possible numbers 1 and 2. In that case all possible values of this set would be:

{ s: “a”, n: 1 }
{ s: “b”, n: 1 }
{ s: “c”, n: 1 }
{ s: “a”, n: 2 }
{ s: “b”, n: 2 }
{ s: “c”, n: 2 }

Here we can see that what we typically call “Objects” are simply the sets of all their properties multiplied. In this case x strings * y numbers for x * y possibilities.

Next, we will look at functions, which essentially mirror the operations which I have previously described. Specifically, they are usually mapping functions. Here are a few examples.

// number -> number
function inc(x: number) {
    return x + 1
}

// Person -> name
function getName(person: Person) {
    return person.name
}

// T -> T[]
function singletonArray<T>(thing: T) {
    return [thing]
}

As you can see, all of these functions take an element of one set and convert it to an element of another one. In the case of the first function the result-set is the same as the start-set. We can also view these functions as being mappings from one set to another, as you can put any element of one set into them and receive an element of another, and so if you would send all elements of the first set through the function you would receive the whole other set.

Conclusion

This very mathematical way of thinking about types can be very useful in a few, admittedly somewhat niche situations, such as in property-based testing or when designing new programming languages. However, I will also say, that it has been useful to me in everyday programming. This quite specific angle of looking at your types and problem-domain in general, has often helped me to see connections and possible solutions that otherwise would have maybe escaped me. I hope, that by simply knowing about sets now, you may also see these connections in your projects, and that it may be helpful to you.