Wednesday, 27 October 2010

Very Simple F# Compiler (Part 2 Revisited)

Just a quick note today to say that I have done about 30 seconds reading of some F# documentation and found out that reference cells can indeed be used in closures while local mutable values cannot. As it is a trivial exercise to re-write my character source to use local reference cells rather that global mutable values I have done so. Below is the updated code - I've also been able to remove those globals from my program.
  1 let getNextCharFromString (source : string) =
2
let sourceList = ref (List.ofSeq source)
3
let endOfString = ref (false)
4
fun () ->
5
match (!sourceList, !endOfString) with
6
| _, true -> failwith "Past End of String"
7
| [], _ ->
8
endOfString := true
9
char 0
10
| h :: t, _ ->
11
sourceList := t
12
h

Tuesday, 26 October 2010

Very Simple F# Compiler (Part 2 - The Character Generator)

Last time I defined the data structures for my simple compiler. This time I'll define the code that generates the stream of characters that will be consumed by the token generator. The specifics of this code are not really addressed in the book, and I want to keep things simple by just specifying a string that will be a source of the characters. I'd also like to make the code fairly flexible so I can easily swap out this code and replace it with code that will read characters from a file.

The following code is a bit messy and its the part of the application that I'm least happy with, but it does the job. Perhaps if I have enough time I'll be able to rewrite the code with something more appropriate such as a Sequence or something safer like using a couple of Reference Cells.
  1 let mutable sourceList : char list = []
2
let mutable endOfString = false
3
4
let getNextCharFromString (source : string) =
5
sourceList <- List.ofSeq source
6
endOfString <- false
7
fun () ->
8
match (sourceList, endOfString) with
9
| _, true -> failwith "Past End of String"
10
| [], _ ->
11
endOfString <- true
12
char 0
13
| h :: t, _ ->
14
sourceList <- t
15
h

The function has the following type:

string -> (unit -> char)

First of all I have declared two mutable values that hold the string to be served and a flag to indicate if the end of the string has been reached. The string is stored as a char list so that the cons (::) operator can later be used to decompose it. Next a function is defined that will take a string and return a function that returns a char. Each time the function is called it will check so see if the end of the string has been reached, serves up a 0 if there are no more characters, or serves the head of the list and updates the list with its tail.

A function that will provide the next character from a string each time it is called can be constructed in the following way:
 16 let getNextChar = getNextCharFromString "(2*((3*4)+9))"

This new function charSource has the type

unit -> char

See you next time!

Very Simple F# Compiler

I have decided to try and implement a version of the Demo Compiler in Chapter 1 of Modern Compiler Design by Grune, Bal, Jacobs & Langendoen. To make it harder for myself I'd like to use as much functional code as possible and as little object oriented code. I want to try and change as little of the design as possible so I'll be taking their "narrow" compiler approach.

Let's start by defining the data structured used in the design.

The design in the book has two data structures, token and expression, which are both implemented as structs. To make the code more F#y these structures will be implemented as Discriminated Unions rather than the struct's direct equivalent the Record. This is because both of the structures had redundancy built into them as they both represent data that is could be in more than one mutually exclusive form. This is the kind of structure that Discriminated Unions are made for. This decision will also make our pattern matching code much simpler to write later on. I've also decided to add a further structure the operator which will allow me to keep character checking to a minimum outside of the lexing and parsing code.

Here are my data structures:
  1 type token =
2
| Digit of char
3
| Char of char
4
| Eof
5
6
type operator =
7
| Add
8
| Multiply
9
10
type expression =
11
| DigitExpression of int
12
| ParenthesizedExpression of expression * operator * expression

Formatting F# code on Blogger

There doesn't appear to be too many tools on the internet for nicely formatting F# code on Blogger. I recently stumbed across CodeHTMLer which produces output like this:
  1 let rec interpret =
2
function
3
| DigitExpression a -> a
4
| ParenthesizedExpression (l, o, r) ->
5
let left = interpret l
6
let right = interpret r
7
match o with
8
| Add -> left + right
9
| Multiply -> left * right

See my last post for an example in C#