diff options
author | dnovillo <dnovillo@138bc75d-0d04-0410-961f-82ee72b054a4> | 2004-05-13 06:41:07 +0000 |
---|---|---|
committer | dnovillo <dnovillo@138bc75d-0d04-0410-961f-82ee72b054a4> | 2004-05-13 06:41:07 +0000 |
commit | 4ee9c6840ad3fc92a9034343278a1e476ad6872a (patch) | |
tree | a2568888a519c077427b133de9ece5879a8484a5 /gcc/doc/tree-ssa.texi | |
parent | ebb338380ab170c91e64d38038e6b5ce930d69a1 (diff) | |
download | gcc-4ee9c6840ad3fc92a9034343278a1e476ad6872a.tar.gz |
Merge tree-ssa-20020619-branch into mainline.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@81764 138bc75d-0d04-0410-961f-82ee72b054a4
Diffstat (limited to 'gcc/doc/tree-ssa.texi')
-rw-r--r-- | gcc/doc/tree-ssa.texi | 1189 |
1 files changed, 1189 insertions, 0 deletions
diff --git a/gcc/doc/tree-ssa.texi b/gcc/doc/tree-ssa.texi new file mode 100644 index 00000000000..56ccd6109c2 --- /dev/null +++ b/gcc/doc/tree-ssa.texi @@ -0,0 +1,1189 @@ +@c Copyright (c) 2004 Free Software Foundation, Inc. +@c Free Software Foundation, Inc. +@c This is part of the GCC manual. +@c For copying conditions, see the file gcc.texi. + +@c --------------------------------------------------------------------- +@c Tree SSA +@c --------------------------------------------------------------------- + +@node Tree SSA +@chapter Analysis and Optimization of GIMPLE Trees +@cindex Tree SSA +@cindex Optimization infrastructure for GIMPLE + +GCC uses three main intermediate languages to represent the program +during compilation: GENERIC, GIMPLE and RTL@. GENERIC is a +language-independent representation generated by each front end. It +is used to serve as an interface between the parser and optimizer. +GENERIC is a common representation that is able to represent programs +written in all the languages supported by GCC@. + +GIMPLE and RTL are used to optimize the program. GIMPLE is used for +target and language independent optimizations (e.g., inlining, +constant propagation, tail call elimination, redundancy elimination, +etc). Much like GENERIC, GIMPLE is a language independent, tree based +representation. However, it differs from GENERIC in that the GIMPLE +grammar is more restrictive: expressions contain no more than 3 +operands (except function calls), it has no control flow structures +and expressions with side-effects are only allowed on the right hand +side of assignments. See the chapter describing GENERIC and GIMPLE +for more details. + +This chapter describes the data structures and functions used in the +GIMPLE optimizers (also known as ``tree optimizers'' or ``middle +end''). In particular, it focuses on all the macros, data structures, +functions and programming constructs needed to implement optimization +passes for GIMPLE@. + +@menu +* GENERIC:: A high-level language-independent representation. +* GIMPLE:: A lower-level factored tree representation. +* Annotations:: Attributes for statements and variables. +* Statement Operands:: Variables referenced by GIMPLE statements. +* SSA:: Static Single Assignment representation. +* Alias analysis:: Representing aliased loads and stores. +@end menu + +@node GENERIC +@section GENERIC +@cindex GENERIC + +The purpose of GENERIC is simply to provide a language-independent way of +representing an entire function in trees. To this end, it was necessary to +add a few new tree codes to the back end, but most everything was already +there. If you can express it with the codes in @code{gcc/tree.def}, it's +GENERIC. + +Early on, there was a great deal of debate about how to think about +statements in a tree IL. In GENERIC, a statement is defined as any +expression whose value, if any, is ignored. A statement will always +have @code{TREE_SIDE_EFFECTS} set (or it will be discarded), but a +non-statement expression may also have side effects. A +@code{CALL_EXPR}, for instance. + +It would be possible for some local optimizations to work on the +GENERIC form of a function; indeed, the adapted tree inliner works +fine on GENERIC, but the current compiler performs inlining after +lowering to GIMPLE (a restricted form described in the next section). +Indeed, currently the frontends perform this lowering before handing +off to @code{tree_rest_of_compilation}, but this seems inelegant. + +If necessary, a front end can use some language-dependent tree codes +in its GENERIC representation, so long as it provides a hook for +converting them to GIMPLE and doesn't expect them to work with any +(hypothetical) optimizers that run before the conversion to GIMPLE. +The intermediate representation used while parsing C and C++ looks +very little like GENERIC, but the C and C++ gimplifier hooks are +perfectly happy to take it as input and spit out GIMPLE. + +@node GIMPLE +@section GIMPLE +@cindex GIMPLE + +GIMPLE is a simplified subset of GENERIC for use in optimization. The +particular subset chosen (and the name) was heavily influenced by the +SIMPLE IL used by the McCAT compiler project at McGill University, +though we have made some different choices. For one thing, SIMPLE +doesn't support @code{goto}; a production compiler can't afford that +kind of restriction. + +GIMPLE retains much of the structure of the parse trees: lexical +scopes are represented as containers, rather than markers. However, +expressions are broken down into a 3-address form, using temporary +variables to hold intermediate values. Also, control structures are +lowered to gotos. + +In GIMPLE no container node is ever used for its value; if a +@code{COND_EXPR} or @code{BIND_EXPR} has a value, it is stored into a +temporary within the controlled blocks, and that temporary is used in +place of the container. + +The compiler pass which lowers GENERIC to GIMPLE is referred to as the +@samp{gimplifier}. The gimplifier works recursively, replacing complex +statements with sequences of simple statements. + +@c Currently, the only way to +@c tell whether or not an expression is in GIMPLE form is by recursively +@c examining it; in the future there will probably be a flag to help avoid +@c redundant work. FIXME FIXME + +@menu +* Interfaces:: +* Temporaries:: +* GIMPLE Expressions:: +* Statements:: +* GIMPLE Example:: +* Rough GIMPLE Grammar:: +@end menu + +@node Interfaces +@subsection Interfaces +@cindex gimplification + +The tree representation of a function is stored in +@code{DECL_SAVED_TREE}. It is lowered to GIMPLE by a call to +@code{gimplify_function_tree}. + +If a front end wants to include language-specific tree codes in the tree +representation which it provides to the back end, it must provide a +definition of @code{LANG_HOOKS_GIMPLIFY_EXPR} which knows how to +convert the front end trees to GIMPLE. Usually such a hook will involve +much of the same code for expanding front end trees to RTL. This function +can return fully lowered GIMPLE, or it can return GENERIC trees and let the +main gimplifier lower them the rest of the way; this is often simpler. + +The C and C++ front ends currently convert directly from front end +trees to GIMPLE, and hand that off to the back end rather than first +converting to GENERIC. Their gimplifier hooks know about all the +@code{_STMT} nodes and how to convert them to GENERIC forms. There +was some work done on a genericization pass which would run first, but +the existence of @code{STMT_EXPR} meant that in order to convert all +of the C statements into GENERIC equivalents would involve walking the +entire tree anyway, so it was simpler to lower all the way. This +might change in the future if someone writes an optimization pass +which would work better with higher-level trees, but currently the +optimizers all expect GIMPLE. + +A front end which wants to use the tree optimizers (and already has +some sort of whole-function tree representation) only needs to provide +a definition of @code{LANG_HOOKS_GIMPLIFY_EXPR}, call +@code{gimplify_function_tree} to lower to GIMPLE, and then hand off to +@code{tree_rest_of_compilation} to compile and output the function. + +You can tell the compiler to dump a C-like representation of the GIMPLE +form with the flag @code{-fdump-tree-gimple}. + +@node Temporaries +@subsection Temporaries +@cindex Temporaries + +When gimplification encounters a subexpression which is too complex, it +creates a new temporary variable to hold the value of the subexpression, +and adds a new statement to initialize it before the current statement. +These special temporaries are known as @samp{expression temporaries}, and are +allocated using @code{get_formal_tmp_var}. The compiler tries to +always evaluate identical expressions into the same temporary, to simplify +elimination of redundant calculations. + +We can only use expression temporaries when we know that it will not be +reevaluated before its value is used, and that it will not be otherwise +modified@footnote{These restrictions are derived from those in Morgan 4.8.}. +Other temporaries can be allocated using +@code{get_initialized_tmp_var} or @code{create_tmp_var}. + +Currently, an expression like @code{a = b + 5} is not reduced any +further. We tried converting it to something like +@smallexample + T1 = b + 5; + a = T1; +@end smallexample +but this bloated the representation for minimal benefit. However, a +variable which must live in memory cannot appear in an expression; its +value is explicitly loaded into a temporary first. Similarly, storing +the value of an expression to a memory variable goes through a +temporary. + +@node GIMPLE Expressions +@subsection Expressions +@cindex GIMPLE Expressions + +In general, expressions in GIMPLE consist of an operation and the +appropriate number of simple operands; these operands must either be a +GIMPLE rvalue (@code{is_gimple_val}), i.e. a constant or a register +variable. More complex operands are factored out into temporaries, so +that +@smallexample + a = b + c + d +@end smallexample +becomes +@smallexample + T1 = b + c; + a = T1 + d; +@end smallexample + +The same rule holds for arguments to a @code{CALL_EXPR}. + +The target of an assignment is usually a variable, but can also be an +@code{INDIRECT_REF} or a compound lvalue as described below. + +@menu +* Compound Expressions:: +* Compound Lvalues:: +* Conditional Expressions:: +* Logical Operators:: +@end menu + +@node Compound Expressions +@subsubsection Compound Expressions +@cindex Compound Expressions + +The left-hand side of a C comma expression is simply moved into a separate +statement. + +@node Compound Lvalues +@subsubsection Compound Lvalues +@cindex Compound Lvalues + +Currently compound lvalues involving array and structure field references +are not broken down; an expression like @code{a.b[2] = 42} is not reduced +any further (though complex array subscripts are). This restriction is a +workaround for limitations in later optimizers; if we were to convert this +to + +@smallexample + T1 = &a.b; + T1[2] = 42; +@end smallexample + +alias analysis would not remember that the reference to @code{T1[2]} came +by way of @code{a.b}, so it would think that the assignment could alias +another member of @code{a}; this broke @code{struct-alias-1.c}. Future +optimizer improvements may make this limitation unnecessary. + +@node Conditional Expressions +@subsubsection Conditional Expressions +@cindex Conditional Expressions + +A C @code{?:} expression is converted into an @code{if} statement with +each branch assigning to the same temporary. So, + +@smallexample + a = b ? c : d; +@end smallexample +becomes +@smallexample + if (b) + T1 = c; + else + T1 = d; + a = T1; +@end smallexample + +Note that in GIMPLE, @code{if} statements are also represented using +@code{COND_EXPR}, as described below. + +@node Logical Operators +@subsubsection Logical Operators +@cindex Logical Operators + +Except when they appear in the condition operand of a @code{COND_EXPR}, +logical `and' and `or' operators are simplified as follows: +@code{a = b && c} becomes + +@smallexample + T1 = (bool)b; + if (T1) + T1 = (bool)c; + a = T1; +@end smallexample + +Note that @code{T1} in this example cannot be an expression temporary, +because it has two different assignments. + +@node Statements +@subsection Statements +@cindex Statements + +Most statements will be assignment statements, represented by +@code{MODIFY_EXPR}. A @code{CALL_EXPR} whose value is ignored can +also be a statement. No other C expressions can appear at statement level; +a reference to a volatile object is converted into a @code{MODIFY_EXPR}. + +There are also several varieties of complex statements. + +@menu +* Blocks:: +* Statement Sequences:: +* Empty Statements:: +* Loops:: +* Selection Statements:: +* Jumps:: +* Cleanups:: +* GIMPLE Exception Handling:: +@end menu + +@node Blocks +@subsubsection Blocks +@cindex Blocks + +Block scopes and the variables they declare in GENERIC and GIMPLE are +expressed using the @code{BIND_EXPR} code, which in previous versions of +GCC was primarily used for the C statement-expression extension. + +Variables in a block are collected into @code{BIND_EXPR_VARS} in +declaration order. Any runtime initialization is moved out of +@code{DECL_INITIAL} and into a statement in the controlled block. When +gimplifying from C or C++, this initialization replaces the +@code{DECL_STMT}. + +Variable-length arrays (VLAs) complicate this process, as their size often +refers to variables initialized earlier in the block. To handle this, we +currently split the block at that point, and move the VLA into a new, inner +@code{BIND_EXPR}. This strategy may change in the future. + +@code{DECL_SAVED_TREE} for a GIMPLE function will always be a +@code{BIND_EXPR} which contains declarations for the temporary variables +used in the function. + +A C++ program will usually contain more @code{BIND_EXPR}s than there are +syntactic blocks in the source code, since several C++ constructs have +implicit scopes associated with them. On the other hand, although the C++ +front end uses pseudo-scopes to handle cleanups for objects with +destructors, these don't translate into the GIMPLE form; multiple +declarations at the same level use the same BIND_EXPR. + +@node Statement Sequences +@subsubsection Statement Sequences +@cindex Statement Sequences + +Multiple statements at the same nesting level are collected into a +@code{STATEMENT_LIST}. Statement lists are modified and traversed +using the interface in @samp{tree-iterator.h}. + +@node Empty Statements +@subsubsection Empty Statements +@cindex Empty Statements + +Whenever possible, statements with no effect are discarded. But if they +are nested within another construct which cannot be discarded for some +reason, they are instead replaced with an empty statement, generated by +@code{build_empty_stmt}. Initially, all empty statements were shared, +after the pattern of the Java front end, but this caused a lot of trouble in +practice. + +An empty statement is represented as @code{(void)0}. + +@node Loops +@subsubsection Loops +@cindex Loops + +At one time loops were expressed in GIMPLE using @code{LOOP_EXPR}, but +now they are lowered to explicit gotos. + +@node Selection Statements +@subsubsection Selection Statements +@cindex Selection Statements + +A simple selection statement, such as the C @code{if} statement, is +expressed in GIMPLE using a void @code{COND_EXPR}. If only one branch is +used, the other is filled with an empty statement. + +Normally, the condition expression is reduced to a simple comparison. If +it is a shortcut (@code{&&} or @code{||}) expression, however, we try to +break up the @code{if} into multiple @code{if}s so that the implied shortcut +is taken directly, much like the transformation done by @code{do_jump} in +the RTL expander. + +A @code{SWITCH_EXPR} in GIMPLE contains the condition and a +@code{TREE_VEC} of @code{CASE_LABEL_EXPR}s describing the case values +and corresponding @code{LABEL_DECL}s to jump to. The body of the +@code{switch} is moved after the @code{SWITCH_EXPR}. + +@node Jumps +@subsubsection Jumps +@cindex Jumps + +Other jumps are expressed by either @code{GOTO_EXPR} or @code{RETURN_EXPR}. + +The operand of a @code{GOTO_EXPR} must be either a label or a variable +containing the address to jump to. + +The operand of a @code{RETURN_EXPR} is either @code{NULL_TREE} or a +@code{MODIFY_EXPR} which sets the return value. It would be nice to +move the @code{MODIFY_EXPR} into a separate statement, but the special +return semantics in @code{expand_return} make that difficult. It may +still happen in the future, perhaps by moving most of that logic into +@code{expand_assignment}. + +@node Cleanups +@subsubsection Cleanups +@cindex Cleanups + +Destructors for local C++ objects and similar dynamic cleanups are +represented in GIMPLE by a @code{TRY_FINALLY_EXPR}. When the controlled +block exits, the cleanup is run. + +@code{TRY_FINALLY_EXPR} complicates the flow graph, since the cleanup +needs to appear on every edge out of the controlled block; this +reduces our freedom to move code across these edges. Therefore, the +EH lowering pass which runs before most of the optimization passes +eliminates these expressions by explicitly adding the cleanup to each +edge. + +@node GIMPLE Exception Handling +@subsubsection Exception Handling +@cindex GIMPLE Exception Handling + +Other exception handling constructs are represented using +@code{TRY_CATCH_EXPR}. The handler operand of a @code{TRY_CATCH_EXPR} +can be a normal statement to be executed if the controlled block throws an +exception, or it can have one of two special forms: + +@enumerate +@item A @code{CATCH_EXPR} executes its handler if the thrown exception + matches one of the allowed types. Multiple handlers can be + expressed by a sequence of @code{CATCH_EXPR} statements. +@item An @code{EH_FILTER_EXPR} executes its handler if the thrown + exception does not match one of the allowed types. +@end enumerate + +Currently throwing an exception is not directly represented in GIMPLE, +since it is implemented by calling a function. At some point in the future +we will want to add some way to express that the call will throw an +exception of a known type. + +Just before running the optimizers, the compiler lowers the high-level +EH constructs above into a set of @samp{goto}s, magic labels, and EH +regions. Continuing to unwind at the end of a cleanup is represented +with a @code{RESX_EXPR}. + +@node GIMPLE Example +@subsection GIMPLE Example +@cindex GIMPLE Example + +@smallexample +struct A @{ A(); ~A(); @}; + +int i; +int g(); +void f() +@{ + A a; + int j = (--i, i ? 0 : 1); + + for (int x = 42; x > 0; --x) + @{ + i += g()*4 + 32; + @} +@} +@end smallexample + +becomes + +@smallexample +void f() +@{ + int i.0; + int T.1; + int iftmp.2; + int T.3; + int T.4; + int T.5; + int T.6; + + @{ + struct A a; + int j; + + __comp_ctor (&a); + try + @{ + i.0 = i; + T.1 = i.0 - 1; + i = T.1; + i.0 = i; + if (i.0 == 0) + iftmp.2 = 1; + else + iftmp.2 = 0; + j = iftmp.2; + @{ + int x; + + x = 42; + goto test; + loop:; + + T.3 = g (); + T.4 = T.3 * 4; + i.0 = i; + T.5 = T.4 + i.0; + T.6 = T.5 + 32; + i = T.6; + x = x - 1; + + test:; + if (x > 0) + goto loop; + else + goto break_; + break_:; + @} + @} + finally + @{ + __comp_dtor (&a); + @} + @} +@} +@end smallexample + +@node Rough GIMPLE Grammar +@subsection Rough GIMPLE Grammar +@cindex Rough GIMPLE Grammar + +@smallexample +function: + FUNCTION_DECL + DECL_SAVED_TREE -> block +block: + BIND_EXPR + BIND_EXPR_VARS -> DECL chain + BIND_EXPR_BLOCK -> BLOCK + BIND_EXPR_BODY + -> compound-stmt +compound-stmt: + COMPOUND_EXPR + op0 -> non-compound-stmt + op1 -> stmt +stmt: compound-stmt + | non-compound-stmt +non-compound-stmt: + block + | if-stmt + | switch-stmt + | jump-stmt + | label-stmt + | try-stmt + | modify-stmt + | call-stmt +if-stmt: + COND_EXPR + op0 -> condition + op1 -> stmt + op2 -> stmt +switch-stmt: + SWITCH_EXPR + op0 -> val + op1 -> NULL_TREE + op2 -> TREE_VEC of CASE_LABEL_EXPRs +jump-stmt: + GOTO_EXPR + op0 -> LABEL_DECL | '*' ID + | RETURN_EXPR + op0 -> modify-stmt + | NULL_TREE +label-stmt: + LABEL_EXPR + op0 -> LABEL_DECL +try-stmt: + TRY_CATCH_EXPR + op0 -> stmt + op1 -> handler + | TRY_FINALLY_EXPR + op0 -> stmt + op1 -> stmt +handler: + catch-seq + | EH_FILTER_EXPR + | stmt +catch-seq: + CATCH_EXPR + | COMPOUND_EXPR + op0 -> CATCH_EXPR + op1 -> catch-seq +modify-stmt: + MODIFY_EXPR + op0 -> lhs + op1 -> rhs +call-stmt: CALL_EXPR + op0 -> _DECL | '&' _DECL + op1 -> arglist +arglist: + NULL_TREE + | TREE_LIST + op0 -> val + op1 -> arglist + +varname : compref | _DECL +lhs: varname | '*' _DECL +pseudo-lval: _DECL | '*' _DECL +compref : + COMPONENT_REF + op0 -> compref | pseudo-lval + | ARRAY_REF + op0 -> compref | pseudo-lval + op1 -> val + +condition : val | val relop val +val : _DECL | CONST + +rhs: varname | CONST + | '*' _DECL + | '&' varname + | call_expr + | unop val + | val binop val + | '(' cast ')' val + +unop: '+' | '-' | '!' | '~' + +binop: relop | '-' | '+' | '/' | '*' + | '%' | '&' | '|' | '<<' | '>>' | '^' + +relop: All tree codes of class '<' +@end smallexample + +@node Annotations +@section Annotations +@cindex annotations + +The optimizers need to associate attributes with statements and +variables during the optimization process. For instance, we need to +know what basic block does a statement belong to or whether a variable +has aliases. All these attributes are stored in data structures +called annotations which are then linked to the field @code{ann} in +@code{struct tree_common}. + +Presently, we define annotations for statements (@code{stmt_ann_t}), +variables (@code{var_ann_t}) and SSA names (@code{ssa_name_ann_t}). +Annotations are defined and documented in @file{tree-flow.h}. + + +@node Statement Operands +@section Statement Operands +@cindex operands +@cindex virtual operands +@cindex real operands +@findex get_stmt_operands +@findex modify_stmt + +Almost every GIMPLE statement will contain a reference to a variable +or memory location. Since statements come in different shapes and +sizes, their operands are going to be located at various spots inside +the statement's tree. To facilitate access to the statement's +operands, they are organized into arrays associated inside each +statement's annotation. Each element in an operand array is a pointer +to a @code{VAR_DECL}, @code{PARM_DECL} or @code{SSA_NAME} tree node. +This provides a very convenient way of examining and replacing +operands. + +Data flow analysis and optimization is done on all tree nodes +representing variables. Any node for which @code{SSA_VAR_P} returns +nonzero is considered when scanning statement operands. However, not +all @code{SSA_VAR_P} variables are processed in the same way. For the +purposes of optimization, we need to distinguish between references to +local scalar variables and references to globals, statics, structures, +arrays, aliased variables, etc. The reason is simple, the compiler +can gather complete data flow information for a local scalar. On the +other hand, a global variable may be modified by a function call, it +may not be possible to keep track of all the elements of an array or +the fields of a structure, etc. + +The operand scanner gathers two kinds of operands: @dfn{real} and +@dfn{virtual}. An operand for which @code{is_gimple_reg} returns true +is considered real, otherwise it is a virtual operand. We also +distinguish between uses and definitions. An operand is used if its +value is loaded by the statement (e.g., the operand at the RHS of an +assignment). If the statement assigns a new value to the operand, the +operand is considered a definition (e.g., the operand at the LHS of +an assignment). + +Virtual and real operands also have very different data flow +properties. Real operands are unambiguous references to the +full object that they represent. For instance, given + +@smallexample +@{ + int a, b; + a = b +@} +@end smallexample + +Since @code{a} and @code{b} are non-aliased locals, the statement +@code{a = b} will have one real definition and one real use because +variable @code{b} is completely modified with the contents of +variable @code{a}. Real definition are also known as @dfn{killing +definitions}. Similarly, the use of @code{a} reads all its bits. + +In contrast, virtual operands represent partial or ambiguous +references to a variable. For instance, given + +@smallexample +@{ + int a, b, *p; + + if (...) + p = &a; + else + p = &b; + *p = 5; + return *p; +@} +@end smallexample + +The assignment @code{*p = 5} may be a definition of @code{a} or +@code{b}. If we cannot determine statically where @code{p} is +pointing to at the time of the store operation, we create virtual +definitions to mark that statement as a potential definition site for +@code{a} and @code{b}. Memory loads are similarly marked with virtual +use operands. Virtual operands are shown in tree dumps right before +the statement that contains them. To request a tree dump with virtual +operands, use the @option{-vops} option to @option{-fdump-tree}: + +@smallexample +@{ + int a, b, *p; + + if (...) + p = &a; + else + p = &b; + # a = VDEF <a> + # b = VDEF <b> + *p = 5; + + # VUSE <a> + # VUSE <b> + return *p; +@} +@end smallexample + +Notice that @code{VDEF} operands have two copies of the referenced +variable. This indicates that this is not a killing definition of +that variable. In this case we refer to it as a @dfn{may definition} +or @dfn{aliased store}. The presence of the second copy of the +variable in the @code{VDEF} operand will become important when the +function is converted into SSA form. This will be used to link all +the non-killing definitions to prevent optimizations from making +incorrect assumptions about them. + +Operands are collected by @file{tree-ssa-operands.c}. They are stored +inside each statement's annotation and can be accessed with +@code{DEF_OPS}, @code{USE_OPS}, @code{VDEF_OPS} and @code{VUSE_OPS}. +The following are all the accessor macros available to access USE +operands. To access all the other operand arrays, just change the +name accordingly: + +@defmac USE_OPS (@var{ann}) +Returns the array of operands used by the statement with annotation +@var{ann}. +@end defmac + +@defmac STMT_USE_OPS (@var{stmt}) +Alternate version of USE_OPS that takes the statement @var{stmt} as +input. +@end defmac + +@defmac NUM_USES (@var{ops}) +Return the number of USE operands in array @var{ops}. +@end defmac + +@defmac USE_OP_PTR (@var{ops}, @var{i}) +Return a pointer to the @var{i}th operand in array @var{ops}. +@end defmac + +@defmac USE_OP (@var{ops}, @var{i}) +Return the @var{i}th operand in array @var{ops}. +@end defmac + +The following function shows how to print all the operands of a given +statement: + +@smallexample +void +print_ops (tree stmt) +@{ + vuse_optype vuses; + vdef_optype vdefs; + def_optype defs; + use_optype uses; + stmt_ann_t ann; + size_t i; + + get_stmt_operands (stmt); + ann = stmt_ann (stmt); + + defs = DEF_OPS (ann); + for (i = 0; i < NUM_DEFS (defs); i++) + print_generic_expr (stderr, DEF_OP (defs, i), 0); + + uses = USE_OPS (ann); + for (i = 0; i < NUM_USES (uses); i++) + print_generic_expr (stderr, USE_OP (uses, i), 0); + + vdefs = VDEF_OPS (ann); + for (i = 0; i < NUM_VDEFS (vdefs); i++) + print_generic_expr (stderr, VDEF_OP (vdefs, i), 0); + + vuses = VUSE_OPS (ann); + for (i = 0; i < NUM_VUSES (vuses); i++) + print_generic_expr (stderr, VUSE_OP (vuses, i), 0); +@} +@end smallexample + +To collect the operands, you first need to call +@code{get_stmt_operands}. Since that is a potentially expensive +operation, statements are only scanned if they have been marked +modified by a call to @code{modify_stmt}. So, if your pass replaces +operands in a statement, make sure to call @code{modify_stmt}. + + +@node SSA +@section Static Single Assignment +@cindex SSA +@cindex static single assignment + +Most of the tree optimizers rely on the data flow information provided +by the Static Single Assignment (SSA) form. We implement the SSA form +as described in @cite{R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and +K. Zadeck. Efficiently Computing Static Single Assignment Form and the +Control Dependence Graph. ACM Transactions on Programming Languages +and Systems, 13(4):451-490, October 1991}. + +The SSA form is based on the premise that program variables are +assigned in exactly one location in the program. Multiple assignments +to the same variable create new versions of that variable. Naturally, +actual programs are seldom in SSA form initially because variables +tend to be assigned multiple times. The compiler modifies the program +representation so that every time a variable is assigned in the code, +a new version of the variable is created. Different versions of the +same variable are distinguished by subscripting the variable name with +its version number. Variables used in the right-hand side of +expressions are renamed so that their version number matches that of +the most recent assignment. + +We represent variable versions using @code{SSA_NAME} nodes. The +renaming process in @file{tree-ssa.c} wraps every real and +virtual operand with an @code{SSA_NAME} node which contains +the version number and the statement that created the +@code{SSA_NAME}. Only definitions and virtual definitions may +create new @code{SSA_NAME} nodes. + +Sometimes, flow of control makes it impossible to determine what is the +most recent version of a variable. In these cases, the compiler +inserts an artificial definition for that variable called +@dfn{PHI function} or @dfn{PHI node}. This new definition merges +all the incoming versions of the variable to create a new name +for it. For instance, + +@smallexample +if (...) + a_1 = 5; +else if (...) + a_2 = 2; +else + a_3 = 13; + +# a_4 = PHI <a_1, a_2, a_3> +return a_4; +@end smallexample + +Since it is not possible to determine which of the three branches +will be taken at runtime, we don't know which of @code{a_1}, +@code{a_2} or @code{a_3} to use at the return statement. So, the +SSA renamer creates a new version @code{a_4} which is assigned +the result of ``merging'' @code{a_1}, @code{a_2} and @code{a_3}. +Hence, PHI nodes mean ``one of these operands. I don't know +which''. + +The following macros can be used to examine PHI nodes + +@defmac PHI_RESULT (@var{phi}) +Returns the @code{SSA_NAME} created by PHI node @var{phi} (i.e., +@var{phi}'s LHS). +@end defmac + +@defmac PHI_NUM_ARGS (@var{phi}) +Returns the number of arguments in @var{phi}. This number is exactly +the number of incoming edges to the basic block holding @var{phi}@. +@end defmac + +@defmac PHI_ARG_ELT (@var{phi}, @var{i}) +Returns a tuple representing the @var{i}th argument of @var{phi}@. +Each element of this tuple contains an @code{SSA_NAME} @var{var} and +the incoming edge through which @var{var} flows. +@end defmac + +@defmac PHI_ARG_EDGE (@var{phi}, @var{i}) +Returns the incoming edge for the @var{i}th argument of @var{phi}. +@end defmac + +@defmac PHI_ARG_DEF (@var{phi}, @var{i}) +Returns the @code{SSA_NAME} for the @var{i}th argument of @var{phi}. +@end defmac + + +@subsection Preserving the SSA form +@findex vars_to_rename +@cindex preserving SSA form +Some optimization passes make changes to the function that +invalidate the SSA property. This can happen when a pass has +added new variables or changed the program so that variables that +were previously aliased aren't anymore. + +Whenever something like this happens, the affected variables must +be renamed into SSA form again. To do this, you should mark the +new variables in the global bitmap @code{vars_to_rename}. Once +your pass has finished, the pass manager will invoke the SSA +renamer to put the program into SSA once more. + +@subsection Examining @code{SSA_NAME} nodes +@cindex examining SSA_NAMEs + +The following macros can be used to examine @code{SSA_NAME} nodes + +@defmac SSA_NAME_DEF_STMT (@var{var}) +Returns the statement @var{s} that creates the @code{SSA_NAME} +@var{var}. If @var{s} is an empty statement (i.e., @code{IS_EMPTY_STMT +(@var{s})} returns @code{true}), it means that the first reference to +this variable is a USE or a VUSE@. +@end defmac + +@defmac SSA_NAME_VERSION (@var{var}) +Returns the version number of the @code{SSA_NAME} object @var{var}. +@end defmac + + +@subsection Walking use-def chains + +@deftypefn {Tree SSA function} void walk_use_def_chains (@var{var}, @var{fn}, @var{data}) + +Walks use-def chains starting at the @code{SSA_NAME} node @var{var}. +Calls function @var{fn} at each reaching definition found. Function +@var{FN} takes three arguments: @var{var}, its defining statement +(@var{def_stmt}) and a generic pointer to whatever state information +that @var{fn} may want to maintain (@var{data}). Function @var{fn} is +able to stop the walk by returning @code{true}, otherwise in order to +continue the walk, @var{fn} should return @code{false}. + +Note, that if @var{def_stmt} is a @code{PHI} node, the semantics are +slightly different. For each argument @var{arg} of the PHI node, this +function will: + +@enumerate +@item Walk the use-def chains for @var{arg}. +@item Call @code{FN (@var{arg}, @var{phi}, @var{data})}. +@end enumerate + +Note how the first argument to @var{fn} is no longer the original +variable @var{var}, but the PHI argument currently being examined. +If @var{fn} wants to get at @var{var}, it should call +@code{PHI_RESULT} (@var{phi}). +@end deftypefn + +@subsection Walking the dominator tree + +@deftypefn {Tree SSA function} void walk_dominator_tree (@var{walk_data}, @var{bb}) + +This function walks the dominator tree for the current CFG calling a +set of callback functions defined in @var{struct dom_walk_data} in +@file{domwalk.h}. The call back functions you need to define give you +hooks to execute custom code at various points during traversal: + +@enumerate +@item Once to initialize any local data needed while processing + @var{bb} and its children. This local data is pushed into an + internal stack which is automatically pushed and popped as the + walker traverses the dominator tree. + +@item Once before traversing all the statements in the @var{bb}. + +@item Once for every statement inside @var{bb}. + +@item Once after traversing all the statements and before recursing + into @var{bb}'s dominator children. + +@item It then recurses into all the dominator children of @var{bb}. + +@item After recursing into all the dominator children of @var{bb} it + can, optionally, traverse every statement in @var{bb} again + (i.e., repeating steps 2 and 3). + +@item Once after walking the statements in @var{bb} and @var{bb}'s + dominator children. At this stage, the block local data stack + is popped. +@end enumerate +@end deftypefn + +@node Alias analysis +@section Alias analysis +@cindex alias +@cindex flow-sensitive alias analysis +@cindex flow-insensitive alias analysis + +Alias analysis proceeds in 3 main phases: + +@enumerate +@item Points-to and escape analysis. + +This phase walks the use-def chains in the SSA web looking for +three things: + + @itemize @bullet + @item Assignments of the form @code{P_i = &VAR} + @item Assignments of the form P_i = malloc() + @item Pointers and ADDR_EXPR that escape the current function. + @end itemize + +The concept of `escaping' is the same one used in the Java world. +When a pointer or an ADDR_EXPR escapes, it means that it has been +exposed outside of the current function. So, assignment to +global variables, function arguments and returning a pointer are +all escape sites. + +This is where we are currently limited. Since not everything is +renamed into SSA, we lose track of escape properties when a +pointer is stashed inside a field in a structure, for instance. +In those cases, we are assuming that the pointer does escape. + +We use escape analysis to determine whether a variable is +call-clobbered. Simply put, if an ADDR_EXPR escapes, then the +variable is call-clobbered. If a pointer P_i escapes, then all +the variables pointed-to by P_i (and its memory tag) also escape. + +@item Compute flow-sensitive aliases + +We have two classes of memory tags. Memory tags associated with +the pointed-to data type of the pointers in the program. These +tags are called ``type memory tag'' (TMT). The other class are +those associated with SSA_NAMEs, called ``name memory tag'' (NMT). +The basic idea is that when adding operands for an INDIRECT_REF +*P_i, we will first check whether P_i has a name tag, if it does +we use it, because that will have more precise aliasing +information. Otherwise, we use the standard type tag. + +In this phase, we go through all the pointers we found in +points-to analysis and create alias sets for the name memory tags +associated with each pointer P_i. If P_i escapes, we mark +call-clobbered the variables it points to and its tag. + + +@item Compute flow-insensitive aliases + +This pass will compare the alias set of every type memory tag and +every addressable variable found in the program. Given a type +memory tag TMT and an addressable variable V@. If the alias sets +of TMT and V conflict (as computed by may_alias_p), then V is +marked as an alias tag and added to the alias set of TMT@. +@end enumerate + +For instance, consider the following function: + +@example +foo (int i) +@{ + int *p, *q, a, b; + + if (i > 10) + p = &a; + else + q = &b; + + *p = 3; + *q = 5; + a = b + 2; + return *p; +@} +@end example + +After aliasing analysis has finished, the type memory tag for +pointer @code{p} will have two aliases, namely variables @code{a} and +@code{b}. +Every time pointer @code{p} is dereferenced, we want to mark the +operation as a potential reference to @code{a} and @code{b}. + +@example +foo (int i) +@{ + int *p, a, b; + + if (i_2 > 10) + p_4 = &a; + else + p_6 = &b; + # p_1 = PHI <p_4(1), p_6(2)>; + + # a_7 = VDEF <a_3>; + # b_8 = VDEF <b_5>; + *p_1 = 3; + + # a_9 = VDEF <a_7> + # VUSE <b_8> + a_9 = b_8 + 2; + + # VUSE <a_9>; + # VUSE <b_8>; + return *p_1; +@} +@end example + +In certain cases, the list of may aliases for a pointer may grow +too large. This may cause an explosion in the number of virtual +operands inserted in the code. Resulting in increased memory +consumption and compilation time. + +When the number of virtual operands needed to represent aliased +loads and stores grows too large (configurable with @option{--param +max-aliased-vops}), alias sets are grouped to avoid severe +compile-time slow downs and memory consumption. The alias +grouping heuristic proceeds as follows: + +@enumerate +@item Sort the list of pointers in decreasing number of contributed +virtual operands. + +@item Take the first pointer from the list and reverse the role +of the memory tag and its aliases. Usually, whenever an +aliased variable Vi is found to alias with a memory tag +T, we add Vi to the may-aliases set for T@. Meaning that +after alias analysis, we will have: + +@smallexample +may-aliases(T) = @{ V1, V2, V3, ..., Vn @} +@end smallexample + +This means that every statement that references T, will get +@code{n} virtual operands for each of the Vi tags. But, when +alias grouping is enabled, we make T an alias tag and add it +to the alias set of all the Vi variables: + +@smallexample +may-aliases(V1) = @{ T @} +may-aliases(V2) = @{ T @} +... +may-aliases(Vn) = @{ T @} +@end smallexample + +This has two effects: (a) statements referencing T will only get +a single virtual operand, and, (b) all the variables Vi will now +appear to alias each other. So, we lose alias precision to +improve compile time. But, in theory, a program with such a high +level of aliasing should not be very optimizable in the first +place. + +@item Since variables may be in the alias set of more than one +memory tag, the grouping done in step (2) needs to be extended +to all the memory tags that have a non-empty intersection with +the may-aliases set of tag T@. For instance, if we originally +had these may-aliases sets: + +@smallexample +may-aliases(T) = @{ V1, V2, V3 @} +may-aliases(R) = @{ V2, V4 @} +@end smallexample + +In step (2) we would have reverted the aliases for T as: + +@smallexample +may-aliases(V1) = @{ T @} +may-aliases(V2) = @{ T @} +may-aliases(V3) = @{ T @} +@end smallexample + +But note that now V2 is no longer aliased with R@. We could +add R to may-aliases(V2), but we are in the process of +grouping aliases to reduce virtual operands so what we do is +add V4 to the grouping to obtain: + +@smallexample +may-aliases(V1) = @{ T @} +may-aliases(V2) = @{ T @} +may-aliases(V3) = @{ T @} +may-aliases(V4) = @{ T @} +@end smallexample + +@item If the total number of virtual operands due to aliasing is +still above the threshold set by max-alias-vops, go back to (2). +@end enumerate |