summaryrefslogtreecommitdiff
path: root/docs/IntroductionToTheClangAST.rst
diff options
context:
space:
mode:
authorSean Silva <silvas@purdue.edu>2012-12-13 01:10:46 +0000
committerSean Silva <silvas@purdue.edu>2012-12-13 01:10:46 +0000
commit93ca0217a2aa3047c10518e991ab8578e90829e7 (patch)
tree5ec29392cfd8361d38f2ccaaa810c2f81e3fc268 /docs/IntroductionToTheClangAST.rst
parent1812652c24c5a7847654cef9b0875414000af27f (diff)
downloadclang-93ca0217a2aa3047c10518e991ab8578e90829e7.tar.gz
docs: More reST conversion.
Sorry for the large commit, but it is much faster to convert in batches. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@170067 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs/IntroductionToTheClangAST.rst')
-rw-r--r--docs/IntroductionToTheClangAST.rst135
1 files changed, 135 insertions, 0 deletions
diff --git a/docs/IntroductionToTheClangAST.rst b/docs/IntroductionToTheClangAST.rst
new file mode 100644
index 0000000000..a23fb42b71
--- /dev/null
+++ b/docs/IntroductionToTheClangAST.rst
@@ -0,0 +1,135 @@
+=============================
+Introduction to the Clang AST
+=============================
+
+This document gives a gentle introduction to the mysteries of the Clang
+AST. It is targeted at developers who either want to contribute to
+Clang, or use tools that work based on Clang's AST, like the AST
+matchers.
+
+Introduction
+============
+
+Clang's AST is different from ASTs produced by some other compilers in
+that it closely resembles both the written C++ code and the C++
+standard. For example, parenthesis expressions and compile time
+constants are available in an unreduced form in the AST. This makes
+Clang's AST a good fit for refactoring tools.
+
+Documentation for all Clang AST nodes is available via the generated
+`Doxygen <http://clang.llvm.org/doxygen>`_. The doxygen online
+documentation is also indexed by your favorite search engine, which will
+make a search for clang and the AST node's class name usually turn up
+the doxygen of the class you're looking for (for example, search for:
+clang ParenExpr).
+
+Examining the AST
+=================
+
+A good way to familarize yourself with the Clang AST is to actually look
+at it on some simple example code. Clang has a builtin AST-dump modes,
+which can be enabled with the flags -ast-dump and -ast-dump-xml. Note
+that -ast-dump-xml currently only works with debug-builds of clang.
+
+Let's look at a simple example AST:
+
+::
+
+ # cat test.cc
+ int f(int x) {
+ int result = (x / 42);
+ return result;
+ }
+
+ # Clang by default is a frontend for many tools; -cc1 tells it to directly
+ # use the C++ compiler mode. -undef leaves out some internal declarations.
+ $ clang -cc1 -undef -ast-dump-xml test.cc
+ ... cutting out internal declarations of clang ...
+ <TranslationUnit ptr="0x4871160">
+ <Function ptr="0x48a5800" name="f" prototype="true">
+ <FunctionProtoType ptr="0x4871de0" canonical="0x4871de0">
+ <BuiltinType ptr="0x4871250" canonical="0x4871250"/>
+ <parameters>
+ <BuiltinType ptr="0x4871250" canonical="0x4871250"/>
+ </parameters>
+ </FunctionProtoType>
+ <ParmVar ptr="0x4871d80" name="x" initstyle="c">
+ <BuiltinType ptr="0x4871250" canonical="0x4871250"/>
+ </ParmVar>
+ <Stmt>
+ (CompoundStmt 0x48a5a38 <t2.cc:1:14, line:4:1>
+ (DeclStmt 0x48a59c0 <line:2:3, col:24>
+ 0x48a58c0 "int result =
+ (ParenExpr 0x48a59a0 <col:16, col:23> 'int'
+ (BinaryOperator 0x48a5978 <col:17, col:21> 'int' '/'
+ (ImplicitCastExpr 0x48a5960 <col:17> 'int' <LValueToRValue>
+ (DeclRefExpr 0x48a5918 <col:17> 'int' lvalue ParmVar 0x4871d80 'x' 'int'))
+ (IntegerLiteral 0x48a5940 <col:21> 'int' 42)))")
+ (ReturnStmt 0x48a5a18 <line:3:3, col:10>
+ (ImplicitCastExpr 0x48a5a00 <col:10> 'int' <LValueToRValue>
+ (DeclRefExpr 0x48a59d8 <col:10> 'int' lvalue Var 0x48a58c0 'result' 'int'))))
+
+ </Stmt>
+ </Function>
+ </TranslationUnit>
+
+In general, -ast-dump-xml dumps declarations in an XML-style format and
+statements in an S-expression-style format. The toplevel declaration in
+a translation unit is always the `translation unit
+declaration <http://clang.llvm.org/doxygen/classclang_1_1TranslationUnitDecl.html>`_.
+In this example, our first user written declaration is the `function
+declaration <http://clang.llvm.org/doxygen/classclang_1_1FunctionDecl.html>`_
+of 'f'. The body of 'f' is a `compound
+statement <http://clang.llvm.org/doxygen/classclang_1_1CompoundStmt.html>`_,
+whose child nodes are a `declaration
+statement <http://clang.llvm.org/doxygen/classclang_1_1DeclStmt.html>`_
+that declares our result variable, and the `return
+statement <http://clang.llvm.org/doxygen/classclang_1_1ReturnStmt.html>`_.
+
+AST Context
+===========
+
+All information about the AST for a translation unit is bundled up in
+the class
+`ASTContext <http://clang.llvm.org/doxygen/classclang_1_1ASTContext.html>`_.
+It allows traversal of the whole translation unit starting from
+`getTranslationUnitDecl <http://clang.llvm.org/doxygen/classclang_1_1ASTContext.html#abd909fb01ef10cfd0244832a67b1dd64>`_,
+or to access Clang's `table of
+identifiers <http://clang.llvm.org/doxygen/classclang_1_1ASTContext.html#a4f95adb9958e22fbe55212ae6482feb4>`_
+for the parsed translation unit.
+
+AST Nodes
+=========
+
+Clang's AST nodes are modeled on a class hierarchy that does not have a
+common ancestor. Instead, there are multiple larger hierarchies for
+basic node types like
+`Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_ and
+`Stmt <http://clang.llvm.org/doxygen/classclang_1_1Stmt.html>`_. Many
+important AST nodes derive from
+`Type <http://clang.llvm.org/doxygen/classclang_1_1Type.html>`_,
+`Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_,
+`DeclContext <http://clang.llvm.org/doxygen/classclang_1_1DeclContext.html>`_
+or `Stmt <http://clang.llvm.org/doxygen/classclang_1_1Stmt.html>`_, with
+some classes deriving from both Decl and DeclContext.
+
+There are also a multitude of nodes in the AST that are not part of a
+larger hierarchy, and are only reachable from specific other nodes, like
+`CXXBaseSpecifier <http://clang.llvm.org/doxygen/classclang_1_1CXXBaseSpecifier.html>`_.
+
+Thus, to traverse the full AST, one starts from the
+`TranslationUnitDecl <http://clang.llvm.org/doxygen/classclang_1_1TranslationUnitDecl.html>`_
+and then recursively traverses everything that can be reached from that
+node - this information has to be encoded for each specific node type.
+This algorithm is encoded in the
+`RecursiveASTVisitor <http://clang.llvm.org/doxygen/classclang_1_1RecursiveASTVisitor.html>`_.
+See the `RecursiveASTVisitor
+tutorial <http://clang.llvm.org/docs/RAVFrontendAction.html>`_.
+
+The two most basic nodes in the Clang AST are statements
+(`Stmt <http://clang.llvm.org/doxygen/classclang_1_1Stmt.html>`_) and
+declarations
+(`Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_). Note
+that expressions
+(`Expr <http://clang.llvm.org/doxygen/classclang_1_1Expr.html>`_) are
+also statements in Clang's AST.