Diffstat (limited to 'ndisasm.doc')
-rw-r--r-- ndisasm.doc 199
1 file changed, 199 insertions, 0 deletions
diff --git a/ndisasm.doc b/ndisasm.doc
new file mode 100644
index 00000000..5b5374af
--- /dev/null
+++ b/ndisasm.doc
@@ -0,0 +1,199 @@
+ The Netwide Disassembler, NDISASM
+ =================================
+
+Introduction
+============
+
+The Netwide Disassembler is a small companion program to the Netwide
+Assembler, NASM. It seemed a shame to have an x86 assembler,
+complete with a full instruction table, and not make as much use of
+it as possible, so here's a disassembler which shares the
+instruction table (and some other bits of code) with NASM.
+
+The Netwide Disassembler does nothing except produce disassemblies
+of _binary_ source files. Unlike `objdump', NDISASM has no
+understanding of object file formats, and unlike `debug' it will
+not understand DOS .EXE files. It just disassembles.
+
+Getting Started: Installation
+=============================
+
+See `nasm.doc' for installation instructions. NDISASM, like NASM,
+has a man page which you may want to put somewhere useful, if you
+are on a Unix system.
+
+Running NDISASM
+===============
+
+To disassemble a file, you will typically use a command of the form
+
+ ndisasm [-b16 | -b32] filename
+
+NDISASM can disassemble 16 bit code or 32 bit code equally easily,
+provided of course that you remember to specify which it is to work
+with. If no `-b' switch is present, NDISASM works in 16-bit mode by
+default. The `-u' switch (for USE32) also invokes 32-bit mode.
+
+Two more command line options are `-r', which reports the version
+number of NDISASM you are running, and `-h', which gives a short
+summary of command line options.
+
+COM Files: Specifying an Origin
+===============================
+
+To disassemble a DOS .COM file correctly, a disassembler must assume
+that the first instruction in the file is loaded at address 0x100,
+rather than at zero. NDISASM, which assumes by default that any file
+you give it is loaded at zero, will therefore need to be informed of
+this.
+
+The `-o' option allows you to declare a different origin for the
+file you are disassembling. Its argument may be expressed in any of
+the NASM numeric formats: it is decimal by default; hex if it begins
+with `$' or `0x' or ends in `H'; octal if it ends in `Q'; and binary
+if it ends in `B'.
+
+Hence, to disassemble a .COM file, the command
+
+ ndisasm -o100h filename.com
+
+will do the trick.
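+
+As a rough illustration of the numeric formats accepted by `-o',
+here is a minimal Python sketch. The function name `parse_number' is
+hypothetical, and this is not NDISASM's own parser: it only encodes
+the prefix and suffix rules described in this section, and it
+assumes that the suffix letters are case-insensitive.
+
```python
def parse_number(text):
    """Parse a number using the prefix/suffix rules described above."""
    s = text.strip().lower()
    # Prefixes are checked first, so that `0x1b' is read as hex and
    # not mistaken for a binary number ending in `b'.
    if s.startswith("0x"):
        return int(s[2:], 16)
    if s.startswith("$"):
        return int(s[1:], 16)
    if s.endswith("h"):
        return int(s[:-1], 16)
    if s.endswith("q"):
        return int(s[:-1], 8)
    if s.endswith("b"):
        return int(s[:-1], 2)
    return int(s, 10)  # decimal by default
```
+
+Under these rules `100h', `0x100', `$100', `400q' and `256' all
+denote the same origin.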
+
+Code Following Data: Synchronisation
+====================================
+
+Suppose you are disassembling a file which contains some data which
+isn't machine code, and _then_ contains some machine code. NDISASM
+will faithfully plough through the data section, producing machine
+instructions wherever it can (although most of them will look
+bizarre, and some may have unusual prefixes, e.g. `fs or
+ax,0x240a'), and generating `db' instructions every so often if it's
+totally stumped. Then it will reach the code section.
+
+Suppose NDISASM has just finished generating a strange machine
+instruction from part of the data section, and its file position is
+now one byte _before_ the beginning of the code section. It's
+entirely possible that another spurious instruction will get
+generated, starting with the final byte of the data section, and
+then the correct first instruction in the code section will not be
+seen because the disassembler skipped over its starting point. This
+isn't really ideal.
+
+To avoid this, you can specify a `synchronisation' point, or indeed
+as many synchronisation points as you like (although NDISASM can
+only handle 8192 sync points internally). The definition of a sync
+point is this: NDISASM guarantees to hit sync points exactly during
+disassembly. If it is thinking about generating an instruction which
+would cause it to jump over a sync point, it will discard that
+instruction and output a `db' instead. So it _will_ start
+disassembly exactly from the sync point, and so you _will_ see all
+the instructions in your code section.
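+
+The sync point rule can be sketched in Python as follows. This is an
+illustration under stated assumptions, not NDISASM's actual code:
+`toy_length' is a hypothetical stand-in for a real instruction
+decoder that knows only three opcodes, and the sketch assumes that
+the `db' emitted in place of a discarded instruction covers exactly
+one byte.
+
```python
import bisect

def toy_length(code, pos):
    """Hypothetical decoder: length of the instruction at pos,
    or 0 if the byte is not recognised."""
    lengths = {0x90: 1, 0xEB: 2, 0xB8: 3}  # nop, jmp short, mov ax,imm16
    return lengths.get(code[pos], 0)

def disassemble(code, origin=0, sync_points=(), length_of=toy_length):
    """Return (address, kind, raw bytes) tuples, honouring sync points."""
    syncs = sorted(sync_points)
    end = origin + len(code)
    out = []
    pos = 0
    while pos < len(code):
        addr = origin + pos
        length = length_of(code, pos)
        # First sync point strictly after the current address, if any;
        # the end of the file acts as a final limit.
        i = bisect.bisect_right(syncs, addr)
        limit = min(syncs[i], end) if i < len(syncs) else end
        if length == 0 or addr + length > limit:
            # Undecodable, or the instruction would jump over a sync
            # point: emit one byte as `db' and try again from the next.
            out.append((addr, "db", code[pos:pos + 1]))
            pos += 1
        else:
            out.append((addr, "insn", code[pos:pos + length]))
            pos += length
    return out

# Two bytes of data followed by code (nop, nop, jmp short).
data_then_code = bytes([0x41, 0xB8, 0x90, 0x90, 0xEB, 0xFE])
without_sync = disassemble(data_then_code, origin=0x100)
with_sync = disassemble(data_then_code, origin=0x100, sync_points=[0x102])
```
+
+In `without_sync' the stray 0xB8 data byte swallows both `nop'
+instructions as its operand. In `with_sync' that instruction is
+replaced by a `db', and decoding restarts exactly at 0x102.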
+
+Sync points are specified using the `-s' option: they are measured
+in terms of the program origin, not the file position. So if you
+want to synchronise after 32 bytes of a .COM file, you would have to
+do
+
+ ndisasm -o100h -s120h file.com
+
+rather than
+
+ ndisasm -o100h -s20h file.com
+
+As stated above, you can specify multiple sync markers if you need
+to, just by repeating the `-s' option.
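+
+The arithmetic behind the `-s120h' example is simply origin plus
+file position. A minimal Python sketch, in which the helper name
+`sync_argument' is hypothetical and no header is assumed to have
+been skipped:
+
```python
def sync_argument(origin, file_offset):
    """Build the `-s' argument for a position given as a file offset."""
    # Sync points are measured from the program origin, so the origin
    # is added to the file position before formatting as hex.
    return "-s%xh" % (origin + file_offset)
```
+
+For a .COM file, `sync_argument(0x100, 32)' gives `-s120h', the
+form used in the example above.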
+
+Mixed Code and Data: Automatic (Intelligent) Synchronisation
+============================================================
+
+Suppose you are disassembling the boot sector of a DOS floppy (maybe
+it has a virus, and you need to understand the virus so that you
+know what kinds of damage it might have done you). Typically, this
+will contain a JMP instruction, then some data, then the rest of the
+code. So there is a very good chance of NDISASM being misaligned
+when the data ends and the code begins. Hence a sync point is
+needed.
+
+On the other hand, why should you have to specify the sync point
+manually? What you'd do in order to find where the sync point would
+be, surely, would be to read the JMP instruction, and then to use
+its target address as a sync point. So can NDISASM do that for you?
+
+The answer, of course, is yes: using either of the synonymous
+switches `-a' (for automatic sync) or `-i' (for intelligent sync)
+will enable auto-sync mode. Auto-sync mode automatically generates a
+sync point for any forward-referring PC-relative jump or call
+instruction that NDISASM encounters. (Since NDISASM is one-pass, if
+it encounters a PC-relative jump whose target has already been
+processed, there isn't much it can do about it...)
+
+Only PC-relative jumps are processed, since an absolute jump is
+either through a register (in which case NDISASM doesn't know what
+the register contains) or involves a segment address (in which case
+the target code isn't in the same segment that NDISASM is working
+in, and so the sync point can't be placed anywhere useful).
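+
+As a sketch of how a sync point could be derived from a jump, the
+Python function below computes the target of a PC-relative jump or
+call in 16-bit mode. It is an illustration only, not NDISASM's own
+logic: the name `forward_jump_target' is hypothetical, only three
+opcodes are handled (conditional jumps are omitted), and `forward'
+is taken to mean a target at or beyond the end of the jump
+instruction, which is an assumption.
+
```python
import struct

def forward_jump_target(addr, insn):
    """Return the target of a 16-bit PC-relative jump or call if it
    lies at or beyond the end of the instruction, else None."""
    opcode = insn[0]
    if opcode == 0xEB and len(insn) == 2:            # jmp short rel8
        (rel,) = struct.unpack("<b", insn[1:2])
    elif opcode in (0xE8, 0xE9) and len(insn) == 3:  # call/jmp near rel16
        (rel,) = struct.unpack("<h", insn[1:3])
    else:
        return None  # absolute, indirect, or not a jump: no sync point
    next_addr = addr + len(insn)
    # The displacement is relative to the following instruction, and
    # the result wraps within the 64K segment.
    target = (next_addr + rel) & 0xFFFF
    return target if target >= next_addr else None
```
+
+For instance, a boot sector loaded at 0x7C00 which begins with the
+bytes EB 3C yields a sync point at 0x7C3E, whereas a backward jump
+such as EB FE yields none.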
+
+For some kinds of file, this mechanism will automatically put sync
+points in all the right places, and save you from having to place
+any sync points manually. However, it should be stressed that
+auto-sync mode is _not_ guaranteed to catch all the sync points, and
+you may still have to place some manually.
+
+Auto-sync mode doesn't prevent you from declaring manual sync
+points: it just adds automatically generated ones to the ones you
+provide. It's perfectly feasible to specify `-i' _and_ some `-s'
+options.
+
+Another caveat with auto-sync mode is that if, by some unpleasant
+fluke, something in your data section should disassemble to a
+PC-relative call or jump instruction, NDISASM may obediently place a
+sync point in a totally random place, for example in the middle of
+one of the instructions in your code section. So you may end up with
+a wrong disassembly even if you use auto-sync. Again, there isn't
+much I can do about this. If you have problems, you'll have to use
+manual sync points, or use the `-k' option (documented below) to
+suppress disassembly of the data area.
+
+Other Options
+=============
+
+The `-e' option skips a header on the file, by ignoring the first N
+bytes. This means that the header is _not_ counted towards the
+disassembly offset: if you give `-e10 -o10', disassembly will start
+at byte 10 in the file, and this will be given offset 10, not 20.
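+
+The relationship between `-e' and `-o' can be written down as a
+small Python sketch. The function name `disassembly_offset' is
+hypothetical; it only restates the rule given in the paragraph
+above.
+
```python
def disassembly_offset(file_pos, skip=0, origin=0):
    """Offset shown for the byte at file_pos, given `-e skip' and
    `-o origin'. Bytes inside the skipped header have no offset."""
    if file_pos < skip:
        return None
    # The header is not counted: the first byte after it is given
    # the origin itself as its offset.
    return origin + (file_pos - skip)
```
+
+With `-e10 -o10', `disassembly_offset(10, skip=10, origin=10)' is
+10, not 20.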
+
+The `-k' option is provided with two comma-separated numeric
+arguments, the first of which is an assembly offset and the second
+is a number of bytes to skip. This _will_ count the skipped bytes
+towards the assembly offset: its use is to suppress disassembly of a
+data section which wouldn't contain anything you wanted to see
+anyway.
+
+Bugs and Improvements
+=====================
+
+There are no known bugs. However, any you find, with patches if
+possible, should be sent to <jules@dcs.warwick.ac.uk> or
+<anakin@pobox.com>, and we'll try to fix them. Feel free to send
+contributions and new features as well.
+
+Future plans include awareness of which processors certain
+instructions will run on, and marking of instructions that are too
+advanced for some processor (or are FPU instructions, or are
+undocumented opcodes, or are privileged protected-mode instructions,
+or whatever).
+
+That's All Folks!
+=================
+
+I hope NDISASM is of some use to somebody. Including me. :-)
+
+I don't recommend taking NDISASM apart to see how an efficient
+disassembler works, because as far as I know, it isn't an efficient
+one anyway. You have been warned.
+
+Please feel free to send comments, suggestions, or chat to
+<anakin@pobox.com>. As with NASM, no flames please.
+
+- Simon Tatham <anakin@pobox.com>, 21-Nov-96