MarkCC posted an insert algorithm for red-black trees in Haskell yesterday. Now I don’t want to come across as anything resembling an expert here but I felt honor-bound to raise the point that there is a far simpler algorithm described in Chris Okasaki’s 1999 paper Red-Black Trees in a Functional Setting.
It goes basically like this:
data Color = Red | Black deriving (Show, Eq) data RedBlackTree a = Empty | Node Color (RedBlackTree a) a (RedBlackTree a) deriving (Show, Eq) insert :: Ord a => a -> RedBlackTree a -> RedBlackTree a insert elem t = makeBlack (insert' elem t) where makeBlack (Node _ a y b) = Node Black a y b insert' elem Empty = Node Red Empty elem Empty insert' elem s@(Node color a y b) | elem < y = balance color (insert' elem a) y b | elem > y = balance color a y (insert' elem b) | otherwise = s balance Black (Node Red (Node Red a x b) y c) z d = Node Red (Node Black a x b) y (Node Black c z d) balance Black (Node Red a x (Node Red b y c)) z d = Node Red (Node Black a x b) y (Node Black c z d) balance Black a x (Node Red (Node Red b y c) z d) = Node Red (Node Black a x b) y (Node Black c z d) balance Black a x (Node Red b y (Node Red c z d)) = Node Red (Node Black a x b) y (Node Black c z d) balance color a x b = Node color a x b
I will not try to replicate the contents of the paper here, it’s very short so I encourage you to go read it. The gist of it is as follows.
The reason the Haskell algorithm is simpler is because in functional languages, destructive updates to the tree are impossible. This renders the (relatively complex) optimizations that are commonplace in imperative implementations pointless.
The functional implementation has other advantages, namely: beauty, simplicity, persistence, and all of the other things we choose when we give up pointer manipulation.