Commit 02545e55 authored Dec 10, 2020 by gyuri
Fix some typos/grammar in the paper.
parent 88b61f4d
Documentation/paper.tex (26 additions, 15 deletions)
...
...
@@ -52,7 +52,7 @@ used to represent the lexical rules. I used \texttt{flex}~\cite{flex} to
generate my lexer.
Fortunately, the lexical rules for C are quite simple. (The preprocessor is not
-implemented here, the lexer expects a preprocessed C source.) We need to
+implemented here, my lexer expects a preprocessed C source.) We need to
recognize comments, string literals, numeric literals, keywords, identifiers,
and operators. Comments and whitespace are ignored (we simply don't emit any
tokens). (Whitespace is only used to separate otherwise ambiguous tokens.)
...
...
@@ -105,11 +105,7 @@ unary_operator : AND { $$ = AST_UNARY_REF; }
\end{center}
These are then used in higher level rules. Here, we recognize unary
-expressions. Note, that increment and decrement operators are handled
-separately, since for these, there is no direct correspondence between a token,
-and the operation it denotes. A \texttt{++} token can denote either a pre, or a
-post increment operator. This grammar rule only recognizes prefix operators.
-(The postfix variants are handled by a different rule.)
+expressions:
\begin{center}
\begin{BVerbatim}
unary_expression : postfix_expression { $$ = $1; }
...
...
@@ -119,9 +115,18 @@ unary_expression : postfix_expression { $$ = $1; }
;
\end{BVerbatim}
\end{center}
+Note, that increment and decrement operators are handled separately, since for
+these, there is no direct correspondence between a token, and the operation it
+denotes. A \texttt{++} token can denote either a pre, or a post increment
+operator. This grammar rule only recognizes prefix operators. (The postfix
+variants are handled by a different rule.)
At the top, we arrive at the \texttt{translation\_unit} rule. It simply says
-that a C file is a list of function definitions, and declarations.
+that a C file is a list of function definitions, and declarations. (The actions
+for constructing the syntax tree can seem a bit complicated at first, because
+of recursive definition the grammar uses for lists. Basically, at the first
+\texttt{external\_declaration} we initialize the list with a single item, and
+for each subsequent item, we append it to the end.)
\begin{center}
\begin{BVerbatim}
translation_unit : external_declaration { $$ = ast_translation_unit($1); }
...
...
@@ -147,7 +152,7 @@ a * b;
\end{BVerbatim}
\end{center}
-Whether this is a multiplication experssion, or a declaration depends on
+Whether this is a multiplication expression, or a declaration depends on
whether \texttt{a} is a typedef name.
To make matters worse, typedef names are also have to adhere to scoping:
...
...
@@ -173,7 +178,11 @@ For an in-depth discussion about \emph{correct} C parsing, see
\begin{figure}
\centering
-\begin{tikzpicture}[ scale=0.6, level 2/.style={sibling distance=80mm} ]
+\begin{tikzpicture}[
+scale=0.6,
+level 2/.style={sibling distance=75mm},
+level 3/.style={sibling distance=50mm}
+]
\node (a) {translation\_unit}
child { node (b) {function\_definition}
...
...
@@ -224,8 +233,9 @@ The syntax tree is made up of polymorphic nodes. The leaves of the tree are
usually identifiers or literals. (We can also have for example an empty
expression statement, but this is unusual.)
-Let's look at the following program.~\footnote{This program is only
-syntactically correct. \texttt{a} and \texttt{b} are undeclared identifiers.}
+Let's look at the following program.~\footnote{This program is syntactically
+correct, but it would not compile: \texttt{a} and \texttt{b} are undeclared
+identifiers.}
(See Figure~\ref{fig:ast} for the syntax tree.)
\begin{center}
\begin{BVerbatim}
...
...
@@ -307,7 +317,8 @@ mov dword [rbp-8], eax ; store
If the expression is an identifier, we look up the corresponding variable in
our current stack of scopes, and return its value handle.
-If the expression is a unary or binary operator, the generally following scheme is followed:
+If the expression is a unary or binary operator, generally the following scheme
+is followed:
\begin{enumerate}
\item Call the expression code generator for each operand.
\item Emit code for performing the operation.
...
...
@@ -352,7 +363,7 @@ mov dword [rbp-8], eax ; store
Along with the data flow of the values, we also track types.
Some operations don't change the type of their operands, and the resulting
-value simply hase the same type as the operand.
+value simply has the same type as the operand.
Some operations can change the type. The three main ones are:
\begin{itemize}
...
...
@@ -451,8 +462,8 @@ jmp label\_0 \\
label\_1:
}
-For compound statements, we iterate its children, and call either the statement
-or the declaration code generator.
+For compound statements, we iterate through its children, and call either the
+statement or the declaration code generator.
\subsection{Declarations}
Every declaration consists of a set of declaration specifiers (such as storage
...
...