Fix some typos/grammar in the paper.

Commit 02545e55 authored Dec 10, 2020 by gyuri
--- a/Documentation/paper.tex
+++ b/Documentation/paper.tex
@@ -52,7 +52,7 @@ used to represent the lexical rules. I used \texttt{flex}~\cite{flex} to
 generate my lexer.

 Fortunately, the lexical rules for C are quite simple. (The preprocessor is not
-implemented here, the lexer expects a preprocessed C source.)  We need to
+implemented here, my lexer expects a preprocessed C source.)  We need to
 recognize comments, string literals, numeric literals, keywords, identifiers,
 and operators. Comments and whitespace are ignored (we simply don't emit any
 tokens). (Whitespace is only used to separate otherwise ambiguous tokens.)
@@ -105,11 +105,7 @@ unary_operator : AND { $$ = AST_UNARY_REF; }
 \end{center}

 These are then used in higher level rules. Here, we recognize unary
-expressions. Note, that increment and decrement operators are handled
-separately, since for these, there is no direct correspondence between a token,
-and the operation it denotes. A \texttt{++} token can denote either a pre, or a
-post increment operator. This grammar rule only recognizes prefix operators.
-(The postfix variants are handled by a different rule.)
+expressions:
 \begin{center}
 \begin{BVerbatim}
 unary_expression : postfix_expression { $$ = $1; }
@@ -119,9 +115,18 @@ unary_expression : postfix_expression { $$ = $1; }
 		 ;
 \end{BVerbatim}
 \end{center}
+Note, that increment and decrement operators are handled separately, since for
+these, there is no direct correspondence between a token, and the operation it
+denotes. A \texttt{++} token can denote either a pre, or a post increment
+operator. This grammar rule only recognizes prefix operators.  (The postfix
+variants are handled by a different rule.)

 At the top, we arrive at the \texttt{translation\_unit} rule. It simply says
-that a C file is a list of function definitions, and declarations.
+that a C file is a list of function definitions, and declarations. (The actions
+for constructing the syntax tree can seem a bit complicated at first, because
+of recursive definition the grammar uses for lists. Basically, at the first
+\texttt{external\_declaration} we initialize the list with a single item, and
+for each subsequent item, we append it to the end.)
 \begin{center}
 \begin{BVerbatim}
 translation_unit : external_declaration { $$ = ast_translation_unit($1); }
@@ -147,7 +152,7 @@ a * b;
 \end{BVerbatim}
 \end{center}

-Whether this is a multiplication experssion, or a declaration depends on
+Whether this is a multiplication expression, or a declaration depends on
 whether \texttt{a} is a typedef name.

 To make matters worse, typedef names are also have to adhere to scoping:
@@ -173,7 +178,11 @@ For an in-depth discussion about \emph{correct} C parsing, see

 \begin{figure}
 \centering
-\begin{tikzpicture}[ scale=0.6, level 2/.style={sibling distance=80mm} ]
+\begin{tikzpicture}[
+	scale=0.6,
+	level 2/.style={sibling distance=75mm},
+	level 3/.style={sibling distance=50mm}
+]
 	\node (a) {translation\_unit}
 	child {
 		node (b) {function\_definition}
@@ -224,8 +233,9 @@ The syntax tree is made up of polymorphic nodes. The leaves of the tree are
 usually identifiers or literals. (We can also have for example an empty
 expression statement, but this is unusual.)

-Let's look at the following program.~\footnote{This program is only
-syntactically correct. \texttt{a} and \texttt{b} are undeclared identifiers.}
+Let's look at the following program.~\footnote{This program is syntactically
+correct, but it would not compile: \texttt{a} and \texttt{b} are undeclared
+identifiers.}
 (See Figure~\ref{fig:ast} for the syntax tree.)
 \begin{center}
 \begin{BVerbatim}
@@ -307,7 +317,8 @@ mov dword [rbp-8], eax ; store
 If the expression is an identifier, we look up the corresponding variable in
 our current stack of scopes, and return its value handle.

-If the expression is a unary or binary operator, the generally following scheme is followed:
+If the expression is a unary or binary operator, generally the following scheme
+is followed:
 \begin{enumerate}
 	\item Call the expression code generator for each operand.
 	\item Emit code for performing the operation.
@@ -352,7 +363,7 @@ mov dword [rbp-8], eax ; store
 Along with the data flow of the values, we also track types.

 Some operations don't change the type of their operands, and the resulting
-value simply hase the same type as the operand.
+value simply has the same type as the operand.

 Some operations can change the type. The three main ones are:
 \begin{itemize}
@@ -451,8 +462,8 @@ jmp label\_0 \\
 label\_1:
 }

-For compound statements, we iterate its children, and call either the statement
-or the declaration code generator.
+For compound statements, we iterate through its children, and call either the
+statement or the declaration code generator.

 \subsection{Declarations}
 Every declaration consists of a set of declaration specifiers (such as storage