CS
3040, Assignment 2: Designing a Programming Language
This assignment is intended to give you experience in writing
grammars. Develop a grammar for the following programming language. In this
language, the basic data type is a matrix. A sample program is
define cube(values)
assert values has dimension [4, 4];
return values * values * values;
end
dim [4, 4] z, y;
z = [1, 2, 3, 4,
5, 6, 7, 8,
9, 10,11,12,
13,14,15,16];
y = cube(z);
if y > z then
y = z - y;
else
y = y - z;
end
display y;
Specifically,
- Whitespace is not significant in the language.
- All input is in lower case; there should be no upper-case symbols in
a program.
- The keywords are
assert
define
dim
dimension
display
else
end
has
if
then
- Functions can be defined with zero or more arguments.
- The language supports assertions which can be boolean expressions
(using >, >=, <, <=, ==, and !=) or the special test
"variable has dimension [n1, n2, ..., nk]". The special test
allows checking the structure of the matrix.
- A function's result will be the last computed value expression. This
means a function is a set of declarations and statements followed by a
final expression.
dim
lines declare (possibly multi-dimensional) matrices
with the given shape. Allow an arbitrary number of dimensions.
- Square brackets surround matrix constants. Since whitespace is not
significant, splitting it across multiple lines is done for the
reader. The intent is that the values would be inserted into a matrix
based on the declared dimensions, but that is an implementation issue for
another day. A single number by itself is treated as a 1 by 1 matrix. All
values are decimal values like 1.5, -12, and 4.0003.
- There are four types of statements: an assignment
(using
=
), assert
,
if
, and display
. Each
(non-compound) statement (as well as the declaration dim
)
ends in a semicolon.
- The keyword
end
terminates blocks like function
definitions and if/then/else statements.
- The test for an if statement is the same as an
assert
statement (given above).
- Expressions in this language will consist of addition
(
+
), multiplication (*
), and subtraction
(-
). Multiplication takes precedence over the other two.
Parentheses can be used to override this precedence. See the expression
grammar in the notes for hints on handling this.
There are likely aspects of this language that the above list fails to
specify. In those cases, you can make your own choices. This language is
inspired by the data science
language R,
but avoids some of its quirkier aspects. There is no need to
research R for this assignment!
Working with one or two partners, you are to write a grammar for this
language. Use initial capital letters for nonterminals, single quote marks around
terminals, and no quote marks around regular expressions. Thus a couple
rules might look like
Aaa → Bbb 'xyz' '.'
Bbb → [a-z]+
As usual, the non-terminal being defined in the first production will be
interpreted as being the start symbol, so there is no need to write out the
full grammar in tuple form.
In addition, add one other feature to your language. Possible features:
- Adding
and
, or
, and not
to
predicates. Parallel the structure of the expression grammar (noting that
all of the boolean operators have lower precedence than addition and
multiplication).
- Prompting the user for data and reading it in. This would add strings
to the language, though you do not need to support strings in other
contexts.
- Add more control constructs like a for loop and a while loop. Note
adding just while loops is not sufficient. You might also consider adding
a case statement.
- Add hash tables as a built-in datatype similar to Ruby.
- Matrix processing languages are ripe for parallelization. Add
threading and synchronization constructs to the language. You could model
your solution after Java's threading model (using monitors for
synchronization), a simplified form of Ada's rendezvous, or some other
language.
Alternatively, you can add a similar feature of your own choosing. Feel
free to borrow ideas from other languages you know - good programming
language designers do that all the time! If you do, it would be a good idea
to name your source in your report.
Teams with three students are to add two additional features.
Note that this assignment is focused on just specifying the syntax of
programs. A full implementation would include checks on dimensions (to
ensure multiplication, addition are legal) as well as other aspects. Those
are not part of this assignment. We will may implement some of this
language later, but you would have opportunity to revise your grammar when
you do that.
Additional Notes
- Think of a grammar like a program and follow the adage "Don't Repeat
Yourself" (DRY). Avoid repeating constructs by introducing appropriate
nonterminals and reusing those nonterminals in appropriate places.
- It is critical to use descriptive names for nonterminals. Using "A"
or "ANonTerminal" makes a real grammar too hard to read. For example, use
terms like
"Statement" or "FunctionDefinition". You might take a
look at grammars for real programming languages like C, Python, or Ruby
to see what names make sense.
- For readability, you should write rules so that they are either
involving nonterminals and terminals in quotes or regular expressions.
That is, have a separate production for each regular expression.
- You can use extended BNF (EBNF) in your answer. That is, you can use
parentheses and | to group items and capture alternatives.
- A sample grammar for a small version of Java is
available here. The
difference is this grammar uses colons instead of →.
Submission
Type up your grammar as a word-processed document (Word, Google Docs)
or use a tool like OneNote. You can simply use -> for arrows and do not
have to italicize non-terminals. You should be able to use ε
to signal an empty string, but if for some reason your word-processing
environment does not allow epsilon you can use a different symbol (that's
well defined) for that. Sometimes it's convenient to have a grammar
rule Empty -> ε
.
Include the following in your document:
- Two other non-trivial sample programs incorporating your extensions.
- The production rules of your grammar (see above for details).
- A parse tree for a short program with at least two statements.
- A short paragraph capturing when the team met, who was at each
meeting, and who did other work. I do not expect all students to do the
exact same amount of work, but I do expect all to participate. Feel free
to message me if you believe the paragraph is not accurate and wish to
add a clarification such as "we met, everyone else was working on
another assignment instead of this one." I will generally give everyone
in the group the same score, but reserve the right to give different
scores if situations warrant.
Upload your document as a PDF.