SableCC Example
The code in this directory illustrates working with SableCC to create a
simple expression evaluator (and printer). The elements:
To build, save all the files to a directory, download sablecc-3.7.jar (or
later), and type
java -jar sablecc.jar exp.grammar
javac Echo.java
javac Eval.java
The first step creates various subdirectories; these are used to implement
Echo and Eval. Once the programs are compiled, you can
execute them by typing
java Echo exp1
java Eval exp1
An alternative build method is to use the makefile:
make
make clean
Notes about SableCC
Miscellaneous things I've worked out about SableCC from doing this example:
- Be careful what names you use for productions and the various
alternatives in the grammar; you'll be using these names in your code, and
spending some time thinking about the names will make your code easier to
read and debug.
- Strongly consider breaking large productions into pieces that might be
easier to manage. A production with 10 symbols in it may lead to very long
method bodies.
- You should not need to modify any of the files generated by SableCC.
What SableCC does is create a framework for various "visitor" classes, and
you specialize these classes to your task.
- Visitors are generally derived from class DepthFirstAdapter in
the analysis package. This results in a visitor which processes
the entire parse tree (unless you override specific methods). By default,
DepthFirstAdapter applies the visitor to each element of each
production, including the terminal symbols. The application to the
terminal symbols allows you to do things like capture identifier names or
evaluate integer constants (since the actual text for the token is stored with
each token in the parse tree). The application to nonterminals allows you
to define how to process multiple elements. You don't need to specify how
to process every element in every production (or even every
production): the default operation is to do nothing, and a good compiler
should be able to completely remove the method call in such cases.
- When you do override the processing for a production or terminal, you
can override one or more of three methods for that item. For example, the
production
exp = {add} exp plus term ;
results in the following methods being defined in
DepthFirstAdapter (and derived classes):
- public void inAAddExp(AAddExp node) - what happens when you
enter a parse tree node for the above production; processing which
is placed here maps to a preorder traversal of the tree.
- public void outAAddExp(AAddExp node) - what happens when you
exit a parse tree node for the above production; processing which
is placed here maps to a postorder traversal of the tree.
- public void caseAAddExp(AAddExp node) - how the node itself is
processed; you can redefine this to perform some alternative way of
processing this node. In Echo.java and Eval.java, I
redefine this to control how expressions are added, subtracted, etc. You
could also redefine this operation so that you visit only a portion of the
parse tree, pruning the visit in some way.
It is very instructive to look at the file
analysis/DepthFirstAdapter.java after running sablecc on
some source file.
- You can also redefine defaultIn and defaultOut to
define what happens when you enter or leave each node. The default
definitions of these methods do nothing, and the default in and
out operations call this method. For example, one could build a
simple tool to compute the sizes of parse trees by redefining
defaultIn to add one to some counter (where the counter would be
declared as a class variable).
- As an alternative to DepthFirstAdapter, you can derive your
class from Analysis or AnalysisAdapter.
Analysis just defines an interface for all visitors, and
AnalysisAdapter gives you a visitor which does nothing at any
node. Note that if you use AnalysisAdapter, you are responsible
for writing any code which processes any children; AnalysisAdapter
contains no code in itself which visits any children for any productions.
If you're going to override (almost) every operation in
DepthFirstAdpater anyways, you might as well use
AnalysisAdapter instead!
- None of the visitor operations return values (that is, they are all
void operations). This means you must use the visitor itself to
handle any results. For example, the Eval class contains an
integer value into which all results are stored. In the case of a binary
operation such as +, a "temporary" visitor is created to evaluate the
right-hand-side of the expression; when that value is obtained, it is added
to the value retrieved from the left-hand-side. Note how apply is
used to apply the visitor to various subtrees.
(Echo.java and Eval.java could be written to create
visitors for both sides of each expression. This would have been clearer
at a cost to efficiency.)
- A lot of the productions will be processed in much the same way, so
you'll probably want to write (non-public) methods for the common bits of
processing. This was not done in Eval.java and Echo.java
just for ease of reading, but the result is less maintainable.
- Note the definitions of main for both Eval.java and
Echo.java are almost exactly the same. The main for
almost every SableCC processor will be very similar: create a
Lexer to read the source file, a Parser to parse it, and
apply a visitor to the result. The error handling in these examples is not
very sophisticated; this would need to be addressed for a
production-quality system.
- Note the set of package includes in the examples. Two are obvious:
lexer for class Lexer and parser for class
Parser. Package analysis provides the
DepthFirstAdapter and other analysis interface tools. Package
node provides the definitions of all nodes in the parse tree; you
might want to take a look at class Node and Token to gain
insite into what's going on.