author Cyrill Gorcunov <gorcunov@gmail.com> 2019-03-31 19:33:08 +0300
committer Cyrill Gorcunov <gorcunov@gmail.com> 2019-03-31 19:34:50 +0300
commit a384068a04a5cf92a4564fa39b061b7539cb94f9
tree 711eb1fdc3892eb6f966b5418d3f2a4917aa078b /doc/latex/src/ndisasm.tex
parent 982186a1a3139763f2aa2710b32236009f64270d
doc: latex -- Initial import
It is an initial import for conversion of our documentation to latex format. Note that additional latex packages need to be preinstalled; xelatex is used for pdf generation. While I've been very careful while converting the docs, there is a big probability that some indices might be screwed up, so we need to review everything once again. Then we need to create a converter for the html backend; I started working on it but haven't succeeded yet, and I fear I won't have enough spare time in the near future. Also we need to autogenerate the instruction table and warnings from insns.dat and probably from scanning nasm sources. To build nasm.pdf just run make -C doc/latex/; it doesn't require configuration and is rather a standalone builder outside of our traditional build engine. Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Diffstat (limited to 'doc/latex/src/ndisasm.tex')
-rw-r--r-- doc/latex/src/ndisasm.tex 174
1 files changed, 174 insertions, 0 deletions
diff --git a/doc/latex/src/ndisasm.tex b/doc/latex/src/ndisasm.tex
new file mode 100644
index 00000000..d350a2c9
--- /dev/null
+++ b/doc/latex/src/ndisasm.tex
@@ -0,0 +1,174 @@
+%
+% vim: ts=4 sw=4 et
+%
+\xchapter{ndisasm}{Ndisasm}
+
+The Netwide Disassembler, NDISASM.
+
+\xsection{ndisintro}{Introduction}
+
+The Netwide Disassembler is a small companion program to the Netwide
+Assembler, NASM. It seemed a shame to have an x86 assembler,
+complete with a full instruction table, and not make as much use of
+it as possible, so here's a disassembler which shares the
+instruction table (and some other bits of code) with NASM.
+
+The Netwide Disassembler does nothing except produce
+disassemblies of \emph{binary} source files. Unlike \code{objdump},
+NDISASM has no understanding of object file formats, and unlike
+\code{debug} it will not understand \code{DOS .EXE} files. It just
+disassembles.
+
+\xsection{ndisrun}{Running NDISASM}
+
+To disassemble a file, you will typically use a command of the form
+
+\begin{lstlisting}
+ndisasm -b {16|32|64} filename
+\end{lstlisting}
+
+NDISASM can disassemble 16-, 32- or 64-bit code equally easily,
+provided of course that you remember to specify which it is to work
+with. If no \codeindex{-b} switch is present, NDISASM works in 16-bit mode
+by default. The \codeindex{-u} switch (for USE32) also invokes 32-bit mode.
+
+Two more command line options are \codeindex{-r}, which reports the version
+number of NDISASM you are running, and \codeindex{-h}, which gives a short
+summary of command line options.
+
+\xsubsection{ndiscom}{COM Files: Specifying an Origin}
+
+To disassemble a \code{DOS .COM} file correctly, a disassembler must
+assume that the first instruction in the file is loaded at address
+\code{0x100}, rather than at zero. NDISASM, which assumes by default
+that any file you give it is loaded at zero, will therefore need
+to be informed of this.
+
+The \codeindex{-o} option allows you to declare a different origin
+for the file you are disassembling. Its argument may be expressed
+in any of the NASM numeric formats: it is decimal by default; if it
+begins with `\code{\$}' or `\code{0x}' or ends in `\code{H}' it is
+\code{hex}; if it ends in `\code{Q}' it is \code{octal}; and if it
+ends in `\code{B}' it is \code{binary}.
+
+Hence, to disassemble a \code{.COM} file:
+
+\begin{lstlisting}
+ndisasm -o100h filename.com
+\end{lstlisting}
+
+will do the trick.
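+
+As a sketch of the numeric formats listed above, each of the following
+invocations should declare the same origin, since decimal \code{256},
+hex \code{100}, octal \code{400} and binary \code{100000000} all denote
+the same value (\code{filename.com} is only a placeholder):
+
+\begin{lstlisting}
+ndisasm -o256 filename.com
+ndisasm -o0x100 filename.com
+ndisasm -o400Q filename.com
+ndisasm -o100000000B filename.com
+\end{lstlisting}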
+
+\xsubsection{ndissync}{Code Following Data: Synchronisation}
+
+Suppose you are disassembling a file which contains some data which
+isn't machine code, and \emph{then} contains some machine code. NDISASM
+will faithfully plough through the data section, producing machine
+instructions wherever it can (although most of them will look
+bizarre, and some may have unusual prefixes, e.g. `\code{FS OR AX,0x240A}'),
+and generating `\code{DB}' instructions every so often if it's totally stumped.
+Then it will reach the code section.
+
+Suppose NDISASM has just finished generating a strange machine
+instruction from part of the data section, and its file position is
+now one byte \emph{before} the beginning of the code section. It's
+entirely possible that another spurious instruction will get
+generated, starting with the final byte of the data section, and
+then the correct first instruction in the code section will not be
+seen because the starting point skipped over it. This isn't really
+ideal.
+
+To avoid this, you can specify a `\codeindex{synchronisation}' point, or indeed
+as many synchronisation points as you like (although NDISASM can
+only handle 2147483647 sync points internally). The definition of a sync
+point is this: NDISASM guarantees to hit sync points exactly during
+disassembly. If it is thinking about generating an instruction which
+would cause it to jump over a sync point, it will discard that
+instruction and output a `\code{db}' instead. So it \emph{will} start
+disassembly exactly from the sync point, and so you \emph{will} see all
+the instructions in your code section.
+
+Sync points are specified using the \codeindex{-s} option: they are measured
+in terms of the program origin, not the file position. So if you
+want to synchronise after 32 bytes of a \codeindex{.COM} file, you would have to
+do
+
+\begin{lstlisting}
+ndisasm -o100h -s120h file.com
+\end{lstlisting}
+
+rather than
+
+\begin{lstlisting}
+ndisasm -o100h -s20h file.com
+\end{lstlisting}
+
+As stated above, you can specify multiple sync markers if you need
+to, just by repeating the \code{-s} option.
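+
+For instance, assuming a \code{.COM} file whose code resumes at offsets
+\code{120h} and \code{200h} (both values are purely illustrative), the
+two sync points could be given as:
+
+\begin{lstlisting}
+ndisasm -o100h -s120h -s200h file.com
+\end{lstlisting}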
+
+
+\xsubsection{ndisisync}{Mixed Code and Data: Automatic (Intelligent)
+Synchronisation}
+\indexcode{auto-sync}
+
+Suppose you are disassembling the boot sector of a \code{DOS} floppy (maybe
+it has a virus, and you need to understand the virus so that you
+know what kinds of damage it might have done you). Typically, this
+will contain a \code{JMP} instruction, then some data, then the rest of the
+code. So there is a very good chance of NDISASM being \emph{misaligned}
+when the data ends and the code begins. Hence a sync point is
+needed.
+
+On the other hand, why should you have to specify the sync point
+manually? What you'd do in order to find where the sync point would
+be, surely, would be to read the \code{JMP} instruction, and then to use
+its target address as a sync point. So can NDISASM do that for you?
+
+The answer, of course, is yes: using either of the synonymous
+switches \codeindex{-a} (for automatic sync) or \codeindex{-i}
+(for intelligent sync) will enable \code{auto-sync} mode. Auto-sync
+mode automatically generates a sync point for any forward-referring
+PC-relative jump or call instruction that NDISASM encounters. (Since
+NDISASM is one-pass, if it encounters a PC-relative jump whose target
+has already been processed, there isn't much it can do about it...)
+
+Only PC-relative jumps are processed, since an absolute jump is
+either through a register (in which case NDISASM doesn't know what
+the register contains) or involves a segment address (in which case
+the target code isn't in the same segment that NDISASM is working
+in, and so the sync point can't be placed anywhere useful).
+
+For some kinds of file, this mechanism will automatically put sync
+points in all the right places, and save you from having to place
+any sync points manually. However, it should be stressed that
+auto-sync mode is \emph{not} guaranteed to catch all the sync points, and
+you may still have to place some manually.
+
+Auto-sync mode doesn't prevent you from declaring manual sync
+points: it just adds automatically generated ones to the ones you
+provide. It's perfectly feasible to specify \code{-i} \emph{and}
+some \code{-s} options.
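+
+A minimal sketch of such a combination, reusing the illustrative origin
+and sync point from the \code{.COM} example in the previous section:
+
+\begin{lstlisting}
+ndisasm -o100h -i -s120h file.com
+\end{lstlisting}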
+
+Another caveat with auto-sync mode is that if, by some unpleasant
+fluke, something in your data section should disassemble to a
+PC-relative call or jump instruction, NDISASM may obediently place a
+sync point in a totally random place, for example in the middle of
+one of the instructions in your code section. So you may end up with
+a wrong disassembly even if you use auto-sync. Again, there isn't
+much I can do about this. If you have problems, you'll have to use
+manual sync points, or use the \code{-k} option (documented below) to
+suppress disassembly of the data area.
+
+\xsubsection{ndisother}{Other Options}
+
+The \codeindex{-e} option skips a header on the file, by ignoring the first N
+bytes. This means that the header is \emph{not} counted towards the
+disassembly offset: if you give \code{-e10 -o10}, disassembly will start
+at byte 10 in the file, and this will be given offset 10, not 20.
+
+The \codeindex{-k} option is provided with two comma-separated numeric
+arguments, the first of which is an assembly offset and the second
+is a number of bytes to skip. This \emph{will} count the skipped bytes
+towards the assembly offset: its use is to suppress disassembly of a
+data section which wouldn't contain anything you wanted to see
+anyway.
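+
+As an illustration only, assume a \code{.COM} file with 32 bytes of data
+at offset \code{103h}, and assume that the two arguments are written
+directly after the switch in the same way as for the other options. The
+data could then be skipped with:
+
+\begin{lstlisting}
+ndisasm -o100h -k103h,32 file.com
+\end{lstlisting}
+
+Disassembly would then resume at offset \code{123h}, since the skipped
+bytes count towards the offset.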