Tuesday 26 October 2010

Very Simple F# Compiler (Part 2 - The Character Generator)

Last time I defined the data structures for my simple compiler. This time I'll define the code that generates the stream of characters consumed by the token generator. The book doesn't really address the specifics of this code, so I'll keep things simple and use a string as the source of the characters. I'd also like the code to be flexible enough that I can later swap it for code that reads characters from a file.

The following code is a bit messy, and it's the part of the application that I'm least happy with, but it does the job. If I have enough time I may rewrite it with something more appropriate, such as a sequence, or something safer, such as a couple of reference cells.

    // Characters still to be served, and a flag set once the end has been reached.
    let mutable sourceList : char list = []
    let mutable endOfString = false

    let getNextCharFromString (source : string) =
        sourceList <- List.ofSeq source
        endOfString <- false
        fun () ->
            match (sourceList, endOfString) with
            | _, true -> failwith "Past End of String"
            | [], _ ->
                // No characters left: serve a single 0 as the end marker.
                endOfString <- true
                char 0
            | h :: t, _ ->
                // Serve the head and keep the tail for the next call.
                sourceList <- t
                h

The function has the following type:

string -> (unit -> char)

First of all I have declared two mutable values: one holds the string to be served, and the other is a flag that indicates whether the end of the string has been reached. The string is stored as a char list so that the cons (::) pattern can later be used to decompose it. Next, a function is defined that takes a string and returns a function that returns a char. Each time the returned function is called it checks whether the end of the string has been reached. If there are no more characters it serves up a 0 (char 0); otherwise it serves the head of the list and replaces the list with its tail. Calling it again after the 0 has been served fails with "Past End of String".
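One thing to watch with the version above is that the two mutable values live at module level, so every call to getNextCharFromString resets the same shared state. As a rough sketch of the reference-cell idea mentioned earlier (not a tested replacement), the state can be captured by the returned closure instead. The name makeCharSource is made up for this illustration.

```fsharp
// Sketch only: makeCharSource is a hypothetical name, not part of the original code.
// The behaviour matches getNextCharFromString, but each returned function
// owns its own reference cells instead of sharing module-level mutables.
let makeCharSource (source : string) : unit -> char =
    let remaining = ref (List.ofSeq source)   // characters still to be served
    let finished = ref false                  // set once the terminating 0 has been served
    fun () ->
        match remaining.Value, finished.Value with
        | _, true -> failwith "Past End of String"
        | [], _ ->
            finished.Value <- true
            char 0
        | h :: t, _ ->
            remaining.Value <- t
            h

// Two sources created side by side do not interfere with each other.
let first = makeCharSource "ab"
let second = makeCharSource "xy"
let a = first ()
let x = second ()
let b = first ()
let y = second ()
printfn "%c %c %c %c" a x b y   // prints "a x b y"
```

The function still has the type string -> (unit -> char), so the rest of the compiler should not need to know which version it is talking to.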

A function that will provide the next character from a string each time it is called can be constructed in the following way:
    let getNextChar = getNextCharFromString "(2*((3*4)+9))"

This new function, getNextChar, has the type:

unit -> char
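Because the token generator only needs something of type unit -> char, the source behind it can be swapped. Here is a minimal sketch of how that might look with a System.IO.TextReader, which covers both strings (StringReader) and files (StreamReader). The names getNextCharFromReader and readAll are invented for this example, and unlike the string version this sketch keeps returning 0 after the end instead of failing.

```fsharp
open System.IO
open System.Text

// Hypothetical helper: wraps any TextReader as a unit -> char source.
// TextReader.Read() returns -1 at the end of the input, which is mapped
// to the same 0 marker that the string version serves.
let getNextCharFromReader (reader : TextReader) : unit -> char =
    fun () ->
        match reader.Read() with
        | -1 -> char 0
        | code -> char code

// Hypothetical helper: pulls characters until the 0 marker and collects them.
let readAll (next : unit -> char) : string =
    let buffer = StringBuilder()
    let rec loop () =
        match next () with
        | c when c = char 0 -> buffer.ToString()
        | c ->
            buffer.Append(c) |> ignore
            loop ()
    loop ()

// A StringReader stands in for a file here; a StreamReader would be passed the same way.
let fromReader = getNextCharFromReader (new StringReader("(2*((3*4)+9))") :> TextReader)
printfn "%s" (readAll fromReader)   // prints "(2*((3*4)+9))"
```

readAll works with any unit -> char function that ends with a 0, so it could equally be handed a function built by getNextCharFromString.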

See you next time!
