433 lines
13 KiB
Markdown
433 lines
13 KiB
Markdown
|
# Was ist das hier?
|
||
|
|
||
|
Hier schreiben wir ein paar Code-Highlights auf, die uns begegnet sind.
|
||
|
|
||
|
## Monoid? Da war doch was...
|
||
|
|
||
|
Stellen wir uns vor, dass wir eine Funktion schreiben, die einen String bekommt (mehrere Lines mit ACSII-Text) und dieses Wort-für-Wort rückwärts ausgeben soll. Das ist ein einfacher Einzeiler:
|
||
|
|
||
|
~~~ { .haskell .numberLines }
|
||
|
module Main where
|
||
|
|
||
|
import System.Environment (getArgs)
|
||
|
import Data.Monoid (mconcat)
|
||
|
import Data.Functor ((<$>))
|
||
|
|
||
|
main = do
|
||
|
ls <- readFile =<< head <$> getArgs
|
||
|
mconcat <$> mapM (putStrLn . unwords . reverse . words) (lines ls) --die eigentliche Funktion, ls ist das argument.
|
||
|
~~~~~~~~~~~~~~~~~~
|
||
|
|
||
|
Was passiert hier an Vodoo? Und was machen die ganzen wilden Zeichen da?
|
||
|
|
||
|
Gehen wir die Main zeilenweise durch:
|
||
|
Wir lesen die Datei, die im ersten Kommandozeilen-Argument gegeben wird. getArgs hat folgende Signatur:
|
||
|
|
||
|
```haskell
|
||
|
getArgs :: IO [String]
|
||
|
```
|
||
|
|
||
|
Wir bekommen als eine Liste der Argumente. Wir wollen nur das erste. Also machen wir head getArgs. Allerdings fliegt uns dann ein Fehler. head sieht nämlich so aus:
|
||
|
|
||
|
```haskell
|
||
|
head :: [a] -> a
|
||
|
```
|
||
|
|
||
|
Irgendwie müssen wird as **in** das IO bekommen. Hierzu gibt es fmap. Somit ist
|
||
|
|
||
|
```haskell
|
||
|
fmap head :: IO [a] -> IO a
|
||
|
```
|
||
|
|
||
|
Ein inline-Alias (um die Funktion links und das Argument rechts zu schreiben und sich ne Menge Klammern zu sparen) ist <$>. Somit ist schlussendlich der Inhalt der Datei aus dem ersten Argument (lazy) in ls.
|
||
|
|
||
|
Eine andere Möglichkeit sich das (in diesem Fall) zu merken, bzw. drauf zu kommen ist, dass [] AUCH ein Funktor (sogar eine Monade) ist. Man könnte das also auch so schreiben:
|
||
|
|
||
|
```haskell
|
||
|
head :: [] a -> a
|
||
|
head :: Functor f => [] (f a) -> f a -- das "a" geschickt ersetzt zur Verdeutlichung
|
||
|
getArgs :: IO [] String
|
||
|
fmap head :: Functor f => f [] a -> f a
|
||
|
```
|
||
|
|
||
|
fmap "packt" die Funktion quasi 1 Umgebung (Funktor, Monade, ..) weiter rein - Sei es nun in Maybe, Either oder irgendwas anderes.
|
||
|
|
||
|
Alternatives (ausführliches) Beispiel am Ende.
|
||
|
|
||
|
Wenn wir uns die Signatur ansehen, dann haben wir nun
|
||
|
|
||
|
```haskell
|
||
|
head <$> getArgs :: IO String
|
||
|
```
|
||
|
|
||
|
readFile will aber nun ein String haben. Man kann nun
|
||
|
|
||
|
```haskell
|
||
|
f <- head <$> getArgs
|
||
|
ls <- readFile f
|
||
|
```
|
||
|
|
||
|
kann man auch "inline" mit =<< die Sachen "auspacken".
|
||
|
|
||
|
Die 2. Zeile lesen wir nun einfach "von hinten", wie man das meistens tun sollte. Hier ist ein
|
||
|
|
||
|
```haskell
|
||
|
lines ls :: [String]
|
||
|
```
|
||
|
|
||
|
was uns den Inhalt der Datei zeilenweise gibt. Mit jeder Zeile möchten wir nun folgendes machen:
|
||
|
|
||
|
1. nach Wörtern trennen (words)
|
||
|
2. Wörter in der reihenfolge umkehren (reverse)
|
||
|
3. Wörter wider zu einer Zeile zusammensetzen (unwords)
|
||
|
4. diese Zeile ausgeben (putStrLn)
|
||
|
|
||
|
Wenn wir uns die Signatur ansehen:
|
||
|
|
||
|
```haskell
|
||
|
(putStrLn . unwords . reverse . words) :: String -> IO ()
|
||
|
```
|
||
|
|
||
|
Das mag im ersten Moment verwirren, daher noch die Signaturen der Einzelfunktionen:
|
||
|
|
||
|
```haskell
|
||
|
words :: String -> [String]
|
||
|
reverse :: [a] -> [a]
|
||
|
unwords :: [String] -> String
|
||
|
putStrLn :: String -> IO ()
|
||
|
```
|
||
|
|
||
|
Da wir am Ende in der IO-Monade landen müssen wir das auf unsere Zeilen mit mapM statt map anwenden. Dies sorgt auch dafür, dass die Liste der reihe nach durchgegangen wird. mapM mit unserer Funktion schaut dann so aus:
|
||
|
|
||
|
```haskell
|
||
|
mapM (putStrLn . unwords . reverse . words) :: [String] -> [IO ()]
|
||
|
```
|
||
|
|
||
|
eek! Das [IO ()] sieht ekelig aus. Wir haben eine Liste von IO-gar nichts. Das können wir eigentlich entsorgen. Da wir innerhalb der main-Funktion in einer IO-Monade sind, wollen wir IO () anstatt [IO ()] zurück haben.
|
||
|
|
||
|
Wenn wir uns jetzt erinnern, dass [] auch nur eine Monade ist und dass jede Monade ein Monoid ist, dann ist die Lösung einfach. Monoide haben eine "append"-funktion (mappend oder (<>) genannt). Wenn wir "nichts" an "nichts" anhängen, dann erhalten wir .... *Trommelwirbel* "nichts"! Wir müssen die [IO ()]-Liste also "nur noch" mit mappend falten. Hierzu gibt es schon eine vorgefertigte Funktion:
|
||
|
|
||
|
```haskell
|
||
|
mconcat :: [a] -> a
|
||
|
mconcat = foldr mappend mempty
|
||
|
```
|
||
|
|
||
|
Was genau die gewünschte Faltung macht. Wir müssen nun wieder fmap nehmen, da wir die Liste selbst falten wollen - und nicht map, welches auf den IO () innerhalb der Liste arbeiten würde. Durch die Faltung fällt die Liste nun auf IO () zusammen.
|
||
|
|
||
|
Viel Voodoo in wenig Code, aber wenn man sich dran gewöhnt hat, sind Monaden in Monaden auch nicht schlimm. Man muss sich immer nur richtig "rein" fmap'en.
|
||
|
|
||
|
---
|
||
|
|
||
|
Kleinen Tipp gab es noch: mapM_ macht genau das, was oben mit mconcat erreicht werden sollte. Somit kann man auch
|
||
|
|
||
|
```haskell
|
||
|
mapM_ (putStrLn . unwords . reverse . words) (lines ls)
|
||
|
```
|
||
|
|
||
|
schreiben. Ich hab es aber mal wegen der klarheit oben so gelassen.
|
||
|
|
||
|
### Alternatives fmap-Beispiel
|
||
|
|
||
|
Nehmen wir als alternatives Beispiel mal an:
|
||
|
|
||
|
```haskell
|
||
|
a :: IO Maybe State t
|
||
|
```
|
||
|
|
||
|
Um Funktionen vom Typ
|
||
|
|
||
|
```haskell
|
||
|
f :: IO a -> IO a
|
||
|
f a -- valide
|
||
|
```
|
||
|
|
||
|
zu nehmen, brauchen wir nichts machen. Bei
|
||
|
|
||
|
```haskell
|
||
|
f' :: Maybe a -> Maybe a
|
||
|
```
|
||
|
|
||
|
brauchen wir 1 fmap, also ein
|
||
|
|
||
|
```haskell
|
||
|
f' a -- error
|
||
|
f' <$> a
|
||
|
```
|
||
|
|
||
|
um eine Funktion
|
||
|
|
||
|
```haskell
|
||
|
f'' :: State t -> State t
|
||
|
```
|
||
|
|
||
|
zu benutzen folglich:
|
||
|
|
||
|
```haskell
|
||
|
f'' a -- error
|
||
|
f'' <$> a -- error
|
||
|
fmap f'' <$> a
|
||
|
```
|
||
|
|
||
|
## *-Morpisms
|
||
|
|
||
|
Backup eines Blogposts eines Kommilitonen:
|
||
|
|
||
|
|
||
|
This weekend I spend some time on Morphisms.
|
||
|
|
||
|
Knowing that this might sound daunting to many
|
||
|
dabbling Haskellers (like I am), I decided to
|
||
|
write a real short MergeSort hylomorphism quickstarter.
|
||
|
|
||
|
----------------------------------------------------------
|
||
|
|
||
|
For those who need a refresher: MergeSort works by creating
|
||
|
a balanced binary tree from the input list and directly
|
||
|
collapsing it back into itself while treating the children
|
||
|
as sorted lists and merging these with an O(n) algorithm.
|
||
|
|
||
|
----------------------------------------------------------
|
||
|
|
||
|
First the usual prelude:
|
||
|
```haskell
|
||
|
{-# LANGUAGE DeriveFunctor #-}
|
||
|
{-# LANGUAGE TypeFamilies #-}
|
||
|
|
||
|
import Data.Functor.Foldable
|
||
|
import Data.List (splitAt, unfoldr)
|
||
|
```
|
||
|
|
||
|
----------------------------------------------------------
|
||
|
|
||
|
We will use a binary tree like this. Note that
|
||
|
there is no explicit recursion used, but `NodeF` has
|
||
|
two *holes*. These will eventually filled later.
|
||
|
|
||
|
```haskell
|
||
|
data TreeF c f = EmptyF | LeafF c | NodeF f f
|
||
|
deriving (Eq, Show, Functor)
|
||
|
```
|
||
|
|
||
|
--------------------------------------------------
|
||
|
|
||
|
Aside: We could use this as a *normal* binary tree by
|
||
|
wrapping it in `Fix`: `type Tree a = Fix (TreeF a)`
|
||
|
But this would require us to write our tree like
|
||
|
`Fix (NodeF (Fix (LeafF 'l')) (Fix (LeafF 'r')))`
|
||
|
which would get tedious fast. Luckily Edward build
|
||
|
a much better way to do this into *recursion-schemes*.
|
||
|
I will touch on this later.
|
||
|
|
||
|
--------------------------------------------------
|
||
|
|
||
|
Without further ado we start to write a Coalgebra,
|
||
|
which in my book is just a scary name for
|
||
|
"function that is used to construct datastructures".
|
||
|
|
||
|
```haskell
|
||
|
unflatten :: [a] -> TreeF a [a]
|
||
|
unflatten ( []) = EmptyF
|
||
|
unflatten (x:[]) = LeafF x
|
||
|
unflatten ( xs) = NodeF l r where (l,r) = splitAt (length xs `div` 2) xs
|
||
|
```
|
||
|
|
||
|
From the type signature it's immediately obvious,
|
||
|
that we take a list of 'a's and use it to create
|
||
|
a part of our tree.
|
||
|
|
||
|
The nice thing is that due to the fact that we
|
||
|
haven't commited to a type in our tree nodes
|
||
|
we can just put lists in there.
|
||
|
|
||
|
--------------------------------------------------
|
||
|
|
||
|
Aside: At this point we could use this Coalgebra to
|
||
|
construct (unsorted) binary trees from lists:
|
||
|
|
||
|
```haskell
|
||
|
example1 = ana unflatten [1,3] == Fix (NodeF (Fix (LeafF 1)) (Fix (LeafF 3)))
|
||
|
```
|
||
|
|
||
|
--------------------------------------------------
|
||
|
|
||
|
On to our sorting, tree-collapsing Algebra.
|
||
|
Which again is just a creepy word for
|
||
|
"function that is used to deconstruct datastructures".
|
||
|
|
||
|
The function `mergeList` is defined below and
|
||
|
just merges two sorted lists into one sorted list
|
||
|
in O(n), I would probably take this from the `ordlist`
|
||
|
package if I were to implement this *for real*.
|
||
|
|
||
|
Again we see that we can just construct our
|
||
|
sorted output list from a `TreeF` that
|
||
|
apparently contains just lists.
|
||
|
|
||
|
```haskell
|
||
|
flatten :: Ord a => TreeF a [a] -> [a]
|
||
|
flatten EmptyF = []
|
||
|
flatten (LeafF c) = [c]
|
||
|
flatten (NodeF l r) = mergeLists l r
|
||
|
```
|
||
|
|
||
|
--------------------------------------------------
|
||
|
|
||
|
Aside: We could use a Coalgebra to deconstruct trees:
|
||
|
|
||
|
```haskell
|
||
|
example2 = cata flatten (Fix (NodeF (Fix (LeafF 3)) (Fix (LeafF 1)))) == [1,3]
|
||
|
```
|
||
|
|
||
|
--------------------------------------------------
|
||
|
|
||
|
Now we just combine the Coalgebra and the Algebra
|
||
|
with one from the functions from Edwards `recursion-schemes`
|
||
|
library:
|
||
|
```haskell
|
||
|
mergeSort :: Ord a => [a] -> [a]
|
||
|
mergeSort = hylo flatten unflatten
|
||
|
|
||
|
example3 = mergeSort [5,2,7,9,1,4] == [1,2,4,5,7,9]
|
||
|
```
|
||
|
--------------------------------------------------
|
||
|
|
||
|
What have we gained?
|
||
|
|
||
|
We have implemented a MergeSort variant in 9 lines of
|
||
|
code, not counting the `mergeLists` function below.
|
||
|
Not bad, but
|
||
|
[this implementation](http://en.literateprograms.org/Merge_sort_(Haskell))
|
||
|
is not much longer.
|
||
|
|
||
|
On the other hand the morphism based implementation
|
||
|
cleanly describes what happens during construction
|
||
|
and deconstruction of our intermediate structure.
|
||
|
|
||
|
My guess is that, as soon as the algortihms get
|
||
|
more complex, this will really make a difference.
|
||
|
|
||
|
--------------------------------------------------
|
||
|
|
||
|
At this point I wasn't sure if this was useful or
|
||
|
remotely applicable. Telling someone "I spend a
|
||
|
whole weekend learning about Hylomorphism" isn't
|
||
|
something the cool developer kids do.
|
||
|
|
||
|
It appeared to me that maybe I should have a look
|
||
|
at the Core to see what the compiler finally comes
|
||
|
up with (edited for brevity):
|
||
|
|
||
|
```haskell
|
||
|
mergeSort :: [Integer] -> [Integer]
|
||
|
mergeSort =
|
||
|
\ (x :: [Integer]) ->
|
||
|
case x of wild {
|
||
|
[] -> [];
|
||
|
: x1 ds ->
|
||
|
case ds of _ {
|
||
|
[] -> : x1 ([]);
|
||
|
: ipv ipv1 ->
|
||
|
unfoldr
|
||
|
lvl9
|
||
|
(let {
|
||
|
p :: ([Integer], [Integer])
|
||
|
p =
|
||
|
case $wlenAcc wild 0 of ww { __DEFAULT ->
|
||
|
case divInt# ww 2 of ww4 { __DEFAULT ->
|
||
|
case tagToEnum# (<# ww4 0) of _ {
|
||
|
False ->
|
||
|
case $wsplitAt# ww4 wild of _ { (# ww2, ww3 #) -> (ww2, ww3) };
|
||
|
True -> ([], wild)
|
||
|
}
|
||
|
}
|
||
|
} } in
|
||
|
(case p of _ { (x2, ds1) -> mergeSort x2 },
|
||
|
case p of _ { (ds1, y) -> mergeSort y }))
|
||
|
}
|
||
|
}
|
||
|
end Rec }
|
||
|
```
|
||
|
|
||
|
While I am not really competent in reading Core and
|
||
|
this is actually the first time I bothered to try,
|
||
|
it is immediately obvious that there is no trace
|
||
|
of any intermediate tree structure.
|
||
|
|
||
|
This is when it struck me. I was dazzled and amazed.
|
||
|
And am still. Although we are writing our algorithm
|
||
|
as if we are working on a real tree structure the
|
||
|
library and the compiler are able to just remove
|
||
|
the whole intermediate step.
|
||
|
|
||
|
|
||
|
--------------------------------------------------
|
||
|
|
||
|
Aftermath:
|
||
|
|
||
|
In the beginning I promised a way to work on
|
||
|
non-functor data structures. Actually that
|
||
|
was how I began to work with the `recursion-schemes`
|
||
|
library.
|
||
|
|
||
|
We are able to create a 'normal' version of our tree
|
||
|
from above:
|
||
|
```haskell
|
||
|
data Tree c = Empty | Leaf c | Node (Tree c) (Tree c)
|
||
|
deriving (Eq, Show)
|
||
|
```
|
||
|
But we can not use this directly with our (Co-)Algebras.
|
||
|
Luckily Edward build a little bit of type magic into
|
||
|
the library:
|
||
|
|
||
|
```haskell
|
||
|
type instance Base (Tree c) = (TreeF c)
|
||
|
|
||
|
instance Unfoldable (Tree c) where
|
||
|
embed EmptyF = Empty
|
||
|
embed (LeafF c) = Leaf c
|
||
|
embed (NodeF l r) = Node l r
|
||
|
|
||
|
instance Foldable (Tree c) where
|
||
|
project Empty = EmptyF
|
||
|
project (Leaf c) = LeafF c
|
||
|
project (Node l r) = NodeF l r
|
||
|
```
|
||
|
|
||
|
Without going into detail by doing this we establish
|
||
|
a relationship between `Tree` and `TreeF` and teach
|
||
|
the compiler how to translate between these types.
|
||
|
|
||
|
Now we can use our Alebra on our non functor type:
|
||
|
|
||
|
```haskell
|
||
|
example4 = cata flatten (Node (Leaf 'l') (Leaf 'r')) == "lr"
|
||
|
```
|
||
|
|
||
|
The great thing about this is that, looking at the
|
||
|
Core output again, there is no traces of the `TreeF`
|
||
|
structure to be found. As far as I can tell, the
|
||
|
algorithm is working directly on our `Tree` type.
|
||
|
|
||
|
--------------------------------------------------
|
||
|
|
||
|
Literature:
|
||
|
|
||
|
* [Understanding F-Algebras](https://www.fpcomplete.com/user/bartosz/understanding-algebras)
|
||
|
* [Recursion Schemes by Example](http://www.timphilipwilliams.com/slides.html)
|
||
|
* [Recursion Schemes: A Field Guide](http://comonad.com/reader/2009/recursion-schemes/)
|
||
|
* [This StackOverflow question](http://stackoverflow.com/questions/6941904/recursion-schemes-for-dummies)
|
||
|
|
||
|
|
||
|
--------------------------------------------------
|
||
|
|
||
|
Appendix:
|
||
|
|
||
|
```haskell
|
||
|
mergeLists :: Ord a => [a] -> [a] -> [a]
|
||
|
mergeLists = curry $ unfoldr c where
|
||
|
c ([], []) = Nothing
|
||
|
c ([], y:ys) = Just (y, ([], ys))
|
||
|
c (x:xs, []) = Just (x, (xs, []))
|
||
|
c (x:xs, y:ys) | x <= y = Just (x, (xs, y:ys))
|
||
|
| x > y = Just (y, (x:xs, ys))
|
||
|
```
|