Diffstat (limited to 'Docs/internals.texi')
-rw-r--r-- Docs/internals.texi 730
1 files changed, 435 insertions, 295 deletions
diff --git a/Docs/internals.texi b/Docs/internals.texi
index 2195b42d9a0..871e51c50bd 100644
--- a/Docs/internals.texi
+++ b/Docs/internals.texi
@@ -1,26 +1,30 @@
\input texinfo @c -*-texinfo-*-
-@c Copyright 1998 TcX AB, Detron HB and Monty Program KB
+@c Copyright 2002 MySQL AB, TcX AB, Detron HB and Monty Program KB
@c
@c %**start of header
@setfilename internals.info
+
@c We want the types in the same index
-@c @synindex tp fn cp
@synindex cp fn
+
@iftex
-@c Well this is normal in Europe. Maybe this should go into the include.texi?
@afourpaper
@end iftex
+
@c Get version and other info
@include include.texi
+
@ifclear tex-debug
@c This removes the black squares in the right margin
@finalout
@end ifclear
+
@c Set background for HTML
@set _body_tags BGCOLOR=#FFFFFF TEXT=#000000 LINK=#101090 VLINK=#7030B0
-@settitle @strong{MySQL} internals Manual for version @value{mysql_version}.
-@setchapternewpage off
+@settitle @strong{MySQL} Internals Manual for version @value{mysql_version}.
+@setchapternewpage odd
@paragraphindent 0
+
@c %**end of header
@ifinfo
@@ -35,67 +39,78 @@ END-INFO-DIR-ENTRY
@sp 10
@center @titlefont{@strong{MySQL} Internals Manual}
@sp 10
-@center Copyright @copyright{} 1998 TcX AB, Detron HB and Monty Program KB
+@center Copyright @copyright{} 1998-2002 MySQL AB
+@page
@end titlepage
-@node Top, Introduction, (dir), (dir)
+@node Top, caching, (dir), (dir)
@ifinfo
This is a manual about @strong{MySQL} internals.
@end ifinfo
@menu
+* caching:: How MySQL Handles Caching
+* flush tables:: How MySQL Handles @code{FLUSH TABLES}
+* filesort:: How MySQL Does Sorting (@code{filesort})
+* coding guidelines:: Coding Guidelines
+* mysys functions:: Functions In The @code{mysys} Library
+* DBUG:: DBUG Tags To Use
+* protocol:: MySQL Client/Server Protocol
+* Fulltext Search:: Fulltext Search in MySQL
@end menu
-@node caching,,,
-@chapter How MySQL handles caching
+
+@node caching, flush tables, Top, Top
+@chapter How MySQL Handles Caching
@strong{MySQL} has the following caches:
(Note that the some of the filename have a wrong spelling of cache. :)
-@itemize @bullet
+@table @strong
-@item Key cache
+@item Key Cache
A shared cache for all B-tree index blocks in the different NISAM
files. Uses hashing and reverse linked lists for quick caching of the
last used blocks and quick flushing of changed entries for a specific
table. (@file{mysys/mf_keycash.c})
-@item Record cache
+@item Record Cache
This is used for quick scanning of all records in a table.
(@file{mysys/mf_iocash.c} and @file{isam/_cash.c})
-@item Table cache
+@item Table Cache
This holds the last used tables. (@file{sql/sql_base.cc})
-@item Hostname cache
+@item Hostname Cache
For quick lookup (with reverse name resolving). Is a must when one has a
slow DNS.
(@file{sql/hostname.cc})
-@item Privilege cache
+@item Privilege Cache
To allow quick change between databases the last used privileges are
cached for each user/database combination.
(@file{sql/sql_acl.cc})
-@item Heap table cache
-Many use of GROUP BY or DISTINCT caches all found
-rows in a HEAP table (this is a very quick in-memory table with hash index)
+@item Heap Table Cache
+Many uses of @code{GROUP BY} or @code{DISTINCT} cache all found rows in
+a @code{HEAP} table. (This is a very quick in-memory table with a hash index.)
-@item Join row cache.
-For every full join in a SELECT statement (a full join here means there
-were no keys that one could use to find the next table in a list), the
-found rows are cached in a join cache. One SELECT query can use many
-join caches in the worst case.
-@end itemize
+@item Join Row Cache
+For every full join in a @code{SELECT} statement (a full join here means
+there were no keys that one could use to find the next table in a list),
+the found rows are cached in a join cache. One @code{SELECT} query can
+use many join caches in the worst case.
+@end table
-@node flush tables,,,
-@chapter How MySQL handles flush tables
+
+@node flush tables, filesort, caching, Top
+@chapter How MySQL Handles @code{FLUSH TABLES}
@itemize @bullet
@item
-Flush tables is handled in @code{sql/sql_base.cc::close_cached_tables()}.
+Flush tables is handled in @file{sql/sql_base.cc::close_cached_tables()}.
@item
The idea of flush tables is to force all tables to be closed. This
@@ -109,8 +124,8 @@ all tables)!
When one does a @code{FLUSH TABLES}, the variable @code{refresh_version}
will be incremented. Every time a thread releases a table it checks if
the refresh version of the table (updated at open) is the same as
-the current refresh_version. If not it will close it and broadcast
-a signal on COND_refresh (to wait any thread that is waiting for
+the current @code{refresh_version}. If not it will close it and broadcast
+a signal on @code{COND_refresh} (to wake any thread that is waiting for
all instanses of a table to be closed).
@item
@@ -119,8 +134,8 @@ The current @code{refresh_version} is also compared to the open
refresh version is different the thread will free all locks, reopen the
table and try to get the locks again; This is just to quickly get all
tables to use the newest version. This is handled by
-@code{sql/lock.cc::mysql_lock_tables()} and
-@code{sql/sql_base.cc::wait_for_tables()}.
+@file{sql/lock.cc::mysql_lock_tables()} and
+@file{sql/sql_base.cc::wait_for_tables()}.
@item
When all tables has been closed @code{FLUSH TABLES} will return an ok
@@ -134,8 +149,8 @@ After this it will give other threads a chance to open the same tables.
@end itemize
-@node Filesort,,,
-@chapter How MySQL does sorting (filesort)
+@node filesort, coding guidelines, flush tables, Top
+@chapter How MySQL Does Sorting (@code{filesort})
@itemize @bullet
@@ -146,7 +161,7 @@ Read all rows according to key or by table scanning.
Store the sort-key in a buffer (@code{sort_buffer}).
@item
-When the buffer gets full, run a qsort on it and store the result
+When the buffer gets full, run a @code{qsort} on it and store the result
in a temporary file. Save a pointer to the sorted block.
@item
@@ -170,12 +185,13 @@ Now the code in @file{sql/records.cc} will be used to read through them
in sorted order by using the row pointers in the result file.
To optimize this, we read in a big block of row pointers, sort these
and then we read the rows in the sorted order into a row buffer
-(@code{record_buffer}) .
+(@code{record_buffer}).
@end itemize
-@node Coding guidelines,,,
-@chapter Coding guidelines
+
+@node coding guidelines, mysys functions, filesort, Top
+@chapter Coding Guidelines
@itemize @bullet
@@ -183,24 +199,28 @@ and then we read the rows in the sorted order into a row buffer
We are using @uref{http://www.bitkeeper.com/, BitKeeper} for source management.
@item
-You should use the @strong{MySQL} 3.23 or 4.0 source for all developments.
+You should use the @strong{MySQL} 4.0 source for all development.
@item
If you have any questions about the @strong{MySQL} source, you can post these
-to @email{developers@@mysql.com} and we will answer them.
-Note that we will shortly change the name of this list to
-@email{internals@@mysql.com}, to more accurately reflect what should be
-posted to this list.
+to @email{dev-public@@mysql.com} and we will answer them. Please
+remember to not use this internal email list in public!
@item
-Try to write code in a lot of black boxes that can be reused or at
-least have a clean interface.
+Try to write code as a set of black boxes that can be reused, or that at
+least have a clean, easy to change interface.
@item
Reuse code; There is already a lot of algorithms in MySQL for list handling,
queues, dynamic and hashed arrays, sorting, etc. that can be reused.
@item
+Use the @code{my_*} functions like @code{my_read()}/@code{my_write()}/
+@code{my_malloc()} that you can find in the @code{mysys} library instead
+of the direct system calls; This will make your code easier to debug and
+more portable.
+
+@item
Try to always write optimized code, so that you don't have to
go back and rewrite it a couple of months later. It's better to
spend 3 times as much time designing and writing an optimal function than
@@ -221,25 +241,23 @@ Don't use two commands on the same line.
Do not check the same pointer for @code{NULL} more than once.
@item
-Use long function and variable names in English; This makes your code
-easier to read. Use the 'varible_name' style instead of 'VariableName'.
+Use long function and variable names in English. This makes your code
+easier to read.
@item
-Think assembly - make it easier for the compiler to optimize your code.
+Use @code{my_var} as opposed to @code{myVar} or @code{MyVar} (@samp{_}
+rather than dancing SHIFT to separate words in identifiers).
@item
-Comment your code when you do something that someone else may think
-is not ''trivial''.
+Think assembly - make it easier for the compiler to optimize your code.
@item
-Use the @code{my_*} functions like @code{my_read()}/@code{my_write()}/
-@code{my_malloc()} that you can find in the @code{mysys} library instead
-of the direct system calls; This will make your code easier to debug and
-more portable.
+Comment your code when you do something that someone else may think
+is not ``trivial''.
@item
-Use @code{libstring} functions instead of standard libc string functions
-whenever possible.
+Use @code{libstring} functions (in the @file{strings} directory)
+instead of standard @code{libc} string functions whenever possible.
@item
Avoid using @code{malloc()} (its REAL slow); For memory allocations
@@ -254,10 +272,6 @@ easily discuss it thoroughly if some other developer thinks there is better
way to do the same thing!
@item
-Use my_var as opposed to myVar or MyVar (@samp{_} rather than dancing SHIFT
-to seperate words in identifiers).
-
-@item
Class names start with a capital letter.
@item
@@ -270,29 +284,28 @@ Any @code{#define}'s are in all-caps.
Matching @samp{@{} are in the same column.
@item
-Put the @samp{@{} after a 'switch' on the same line
+Put the @samp{@{} after a @code{switch} on the same line, as this gives
+better overall indentation for the switch statement:
@example
-switch (arg) {
+switch (arg) @{
@end example
-Because this gives better overall indentation for the switch statement.
-
@item
-In all other cases, @{ and @} should be on their own line, except
-if there is nothing inside @{ @}.
+In all other cases, @samp{@{} and @samp{@}} should be on their own line, except
+if there is nothing inside @samp{@{} and @samp{@}}.
@item
-Have a space after 'if'
+Have a space after @code{if}.
@item
-Put a space after ',' for function arguments
+Put a space after @samp{,} for function arguments.
@item
-Functions return 0 on success, and non-zero on error, so you can do:
+Functions return @samp{0} on success, and non-zero on error, so you can do:
@example
-if(a() || b() || c()) { error("something went wrong"); }
+if (a() || b() || c()) @{ error("something went wrong"); @}
@end example
@item
@@ -337,113 +350,110 @@ Suggested mode in emacs:
(setq c-default-style "MY")
@end example
-@node mysys functions,,,
-@chapter mysys functions
-
-Functions i mysys: (For flags se my_sys.h)
-
- int my_copy _A((const char *from,const char *to,myf MyFlags));
- - Copy file
-
- int my_delete _A((const char *name,myf MyFlags));
- - Delete file
-
- int my_getwd _A((string buf,uint size,myf MyFlags));
- int my_setwd _A((const char *dir,myf MyFlags));
- - Get and set working directory
-
- string my_tempnam _A((const char *pfx,myf MyFlags));
- - Make a uniq temp file name by using dir and adding something after
- pfx to make name uniq. Name is made by adding a uniq 6 length-string
- and TMP_EXT after pfx.
- Returns pointer to malloced area for filename. Should be freed by
- free().
-
- File my_open _A((const char *FileName,int Flags,myf MyFlags));
- File my_create _A((const char *FileName,int CreateFlags,
- int AccsesFlags, myf MyFlags));
- int my_close _A((File Filedes,myf MyFlags));
- uint my_read _A((File Filedes,byte *Buffer,uint Count,myf MyFlags));
- uint my_write _A((File Filedes,const byte *Buffer,uint Count,
- myf MyFlags));
- ulong my_seek _A((File fd,ulong pos,int whence,myf MyFlags));
- ulong my_tell _A((File fd,myf MyFlags));
- - Use instead of open,open-with-create-flag, close read and write
- to get automatic error-messages (flag: MYF_WME) and only have
- to test for != 0 if error (flag: MY_NABP).
-
- int my_rename _A((const char *from,const char *to,myf MyFlags));
- - Rename file
-
- FILE *my_fopen _A((const char *FileName,int Flags,myf MyFlags));
- FILE *my_fdopen _A((File Filedes,int Flags,myf MyFlags));
- int my_fclose _A((FILE *fd,myf MyFlags));
- uint my_fread _A((FILE *stream,byte *Buffer,uint Count,myf MyFlags));
- uint my_fwrite _A((FILE *stream,const byte *Buffer,uint Count,
- myf MyFlags));
- ulong my_fseek _A((FILE *stream,ulong pos,int whence,myf MyFlags));
- ulong my_ftell _A((FILE *stream,myf MyFlags));
- - Same read-interface for streams as for files
-
- gptr _mymalloc _A((uint uSize,const char *sFile,
- uint uLine, myf MyFlag));
- gptr _myrealloc _A((string pPtr,uint uSize,const char *sFile,
- uint uLine, myf MyFlag));
- void _myfree _A((gptr pPtr,const char *sFile,uint uLine));
- int _sanity _A((const char *sFile,unsigned int uLine));
- gptr _myget_copy_of_memory _A((const byte *from,uint length,
- const char *sFile, uint uLine,
- myf MyFlag));
- - malloc(size,myflag) is mapped to this functions if not compiled
- with -DSAFEMALLOC
-
- void TERMINATE _A((void));
- - Writes malloc-info on stdout if compiled with -DSAFEMALLOC.
-
- int my_chsize _A((File fd,ulong newlength,myf MyFlags));
- - Change size of file
-
- void my_error _D((int nr,myf MyFlags, ...));
- - Writes message using error number (se mysys/errors.h) on
- stdout or curses if MYSYS_PROGRAM_USES_CURSES() is called.
-
- void my_message _A((const char *str,myf MyFlags));
- - Writes message-string on
- stdout or curses if MYSYS_PROGRAM_USES_CURSES() is called.
-
- void my_init _A((void ));
- - Start each program (in main) with this.
- void my_end _A((int infoflag));
- - Gives info about program.
- - If infoflag & MY_CHECK_ERROR prints if some files are left open
- - If infoflag & MY_GIVE_INFO prints timing info and malloc info
- about prog.
-
- int my_redel _A((const char *from, const char *to, int MyFlags));
- - Delete from before rename of to to from. Copyes state from old
- file to new file. If MY_COPY_TIME is set sets old time.
-
- int my_copystat _A((const char *from, const char *to, int MyFlags));
- - Copye state from old file to new file.
- If MY_COPY_TIME is set sets copy also time.
-
- string my_filename _A((File fd));
- - Give filename of open file.
-
- int dirname _A((string to,const char *name));
- - Copy name of directory from filename.
-
- int test_if_hard_path _A((const char *dir_name));
- - Test if dirname is a hard path (Starts from root)
-
- void convert_dirname _A((string name));
- - Convert dirname acording to system.
- - In MSDOS changes all caracters to capitals and changes '/' to
- '\'
- string fn_ext _A((const char *name));
- - Returns pointer to extension in filename
- string fn_format _A((string to,const char *name,const char *dsk,
- const char *form,int flag));
+
+@node mysys functions, DBUG, coding guidelines, Top
+@chapter Functions In The @code{mysys} Library
+
+Functions in @code{mysys}: (For flags see @file{my_sys.h})
+
+@table @code
+@item int my_copy _A((const char *from, const char *to, myf MyFlags));
+Copy file from @code{from} to @code{to}.
+
+@item int my_delete _A((const char *name, myf MyFlags));
+Delete file @code{name}.
+
+@item int my_getwd _A((string buf, uint size, myf MyFlags));
+@item int my_setwd _A((const char *dir, myf MyFlags));
+Get and set working directory.
+
+@item string my_tempnam _A((const char *pfx, myf MyFlags));
+Make a unique temporary file name by using dir and adding something after
+@code{pfx} to make name unique. The file name is made by adding a unique
+six character string and @code{TMP_EXT} after @code{pfx}.
+Returns pointer to @code{malloc()}'ed area for filename. Should be freed by
+@code{free()}.
+
+@item File my_open _A((const char *FileName,int Flags,myf MyFlags));
+@item File my_create _A((const char *FileName, int CreateFlags, int AccsesFlags, myf MyFlags));
+@item int my_close _A((File Filedes, myf MyFlags));
+@item uint my_read _A((File Filedes, byte *Buffer, uint Count, myf MyFlags));
+@item uint my_write _A((File Filedes, const byte *Buffer, uint Count, myf MyFlags));
+@item ulong my_seek _A((File fd,ulong pos,int whence,myf MyFlags));
+@item ulong my_tell _A((File fd,myf MyFlags));
+Use instead of open, open-with-create-flag, close, read, and write
+to get automatic error messages (flag @code{MYF_WME}) and only have
+to test for != 0 if error (flag @code{MY_NABP}).
+
+@item int my_rename _A((const char *from, const char *to, myf MyFlags));
+Rename file from @code{from} to @code{to}.
+
+@item FILE *my_fopen _A((const char *FileName,int Flags,myf MyFlags));
+@item FILE *my_fdopen _A((File Filedes,int Flags,myf MyFlags));
+@item int my_fclose _A((FILE *fd,myf MyFlags));
+@item uint my_fread _A((FILE *stream,byte *Buffer,uint Count,myf MyFlags));
+@item uint my_fwrite _A((FILE *stream,const byte *Buffer,uint Count, myf MyFlags));
+@item ulong my_fseek _A((FILE *stream,ulong pos,int whence,myf MyFlags));
+@item ulong my_ftell _A((FILE *stream,myf MyFlags));
+Same read-interface for streams as for files.
+
+@item gptr _mymalloc _A((uint uSize,const char *sFile,uint uLine, myf MyFlag));
+@item gptr _myrealloc _A((string pPtr,uint uSize,const char *sFile,uint uLine, myf MyFlag));
+@item void _myfree _A((gptr pPtr,const char *sFile,uint uLine));
+@item int _sanity _A((const char *sFile,unsigned int uLine));
+@item gptr _myget_copy_of_memory _A((const byte *from,uint length,const char *sFile, uint uLine,myf MyFlag));
+@code{malloc(size,myflag)} is mapped to these functions if not compiled
+with @code{-DSAFEMALLOC}.
+
+@item void TERMINATE _A((void));
+Writes @code{malloc()} info on @code{stdout} if compiled with
+@code{-DSAFEMALLOC}.
+
+@item int my_chsize _A((File fd, ulong newlength, myf MyFlags));
+Change size of file @code{fd} to @code{newlength}.
+
+@item void my_error _D((int nr, myf MyFlags, ...));
+Writes message using error number (see @file{mysys/errors.h}) on @code{stdout},
+or using curses, if @code{MYSYS_PROGRAM_USES_CURSES()} has been called.
+
+@item void my_message _A((const char *str, myf MyFlags));
+Writes @code{str} on @code{stdout}, or using curses, if
+@code{MYSYS_PROGRAM_USES_CURSES()} has been called.
+
+@item void my_init _A((void ));
+Start each program (in @code{main()}) with this.
+
+@item void my_end _A((int infoflag));
+Gives info about program.
+If @code{infoflag & MY_CHECK_ERROR}, prints if some files are left open.
+If @code{infoflag & MY_GIVE_INFO}, prints timing info and malloc info
+about program.
+
+@item int my_redel _A((const char *from, const char *to, int MyFlags));
+Delete @code{from} before rename of @code{to} to @code{from}. Copies state
+from old file to new file. If @code{MY_COPY_TIME} is set, sets old time.
+
+@item int my_copystat _A((const char *from, const char *to, int MyFlags));
+Copy state from old file to new file. If @code{MY_COPY_TIME} is set,
+sets old time.
+
+@item string my_filename _A((File fd));
+Returns filename of open file.
+
+@item int dirname _A((string to, const char *name));
+Copy name of directory from filename.
+
+@item int test_if_hard_path _A((const char *dir_name));
+Test if @code{dir_name} is a hard path (starts from root).
+
+@item void convert_dirname _A((string name));
+Convert dirname according to system.
+In MSDOS, changes all characters to capitals and changes @samp{/} to @samp{\}.
+
+@item string fn_ext _A((const char *name));
+Returns pointer to extension in filename.
+
+@item string fn_format _A((string to,const char *name,const char *dsk,const char *form,int flag));
format a filename with replace of library and extension and
converts between different systems.
params to and name may be identicall
@@ -457,117 +467,204 @@ Functions i mysys: (For flags se my_sys.h)
"open(fn_format(temp_buffe,name,"","",4),...)" to unpack home and
convert filename to system-form.
- string fn_same _A((string toname,const char *name,int flag));
- - Copys directory and extension from name to toname if neaded.
- copy can be forced by same flags that in fn_format.
+@item string fn_same _A((string toname, const char *name, int flag));
+Copies directory and extension from @code{name} to @code{toname} if needed.
+Copying can be forced by same flags used in @code{fn_format()}.
+
+@item int wild_compare _A((const char *str, const char *wildstr));
+Compare if @code{str} matches @code{wildstr}. @code{wildstr} can contain
+@samp{*} and @samp{?} as wildcard characters.
+Returns 0 if @code{str} and @code{wildstr} match.
+
+@item void get_date _A((string to, int timeflag));
+Get current date in a form ready for printing.
+
+@item void soundex _A((string out_pntr, string in_pntr))
+Converts @code{in_pntr} to a string 5 characters long. All words that sound
+alike have the same string.
+
+@item int init_key_cache _A((ulong use_mem, ulong leave_this_much_mem));
+Use caching of keys in MISAM, PISAM, and ISAM.
+@code{KEY_CACHE_SIZE} is a good size.
+Remember to lock databases for optimal caching.
+
+@item void end_key_cache _A((void));
+End key caching.
+@end table
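The `wild_compare()` entry above can be illustrated with a minimal sketch. This is not the `mysys` implementation: the name `wild_compare_sketch` is hypothetical, and the sketch assumes `*` matches any run of characters (including none) and `?` matches exactly one character. It keeps the documented convention of returning 0 on a match.

```c
#include <stddef.h>

/*
  Hypothetical sketch, not the mysys code: returns 0 if str matches
  wildstr, non-zero otherwise. Assumes '*' matches any run of
  characters and '?' matches exactly one character.
*/
static int wild_compare_sketch(const char *str, const char *wildstr)
{
  while (*wildstr)
  {
    if (*wildstr == '*')
    {
      /* Collapse consecutive '*' into one */
      while (*wildstr == '*')
        wildstr++;
      if (!*wildstr)
        return 0;                       /* Trailing '*' matches the rest */
      /* Try to match the rest of the pattern at every position */
      for (; *str; str++)
      {
        if (wild_compare_sketch(str, wildstr) == 0)
          return 0;
      }
      return 1;
    }
    if (!*str)
      return 1;                         /* Pattern left, string exhausted */
    if (*wildstr != '?' && *wildstr != *str)
      return 1;                         /* Literal character differs */
    str++;
    wildstr++;
  }
  return *str != 0;                     /* Match only if both are consumed */
}
```

Because 0 means success, a call fits the `if (a() || b())` error-checking idiom from the coding guidelines.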
+
- int wild_compare _A((const char *str,const char *wildstr));
- - Compare if str matches wildstr. Wildstr can contain "*" and "?"
- as match-characters.
- Returns 0 if match.
- void get_date _A((string to,int timeflag));
- - Get current date in a form ready for printing.
+@node DBUG, protocol, mysys functions, Top
+@chapter DBUG Tags To Use
- void soundex _A((string out_pntr, string in_pntr))
- - Makes in_pntr to a 5 chars long string. All words that sounds
- alike have the same string.
+Here are some of the tags we now use:
+(We should probably add a couple of new ones.)
- int init_key_cache _A((ulong use_mem,ulong leave_this_much_mem));
- - Use cacheing of keys in MISAM, PISAM, and ISAM.
- KEY_CACHE_SIZE is a good size.
- - Remember to lock databases for optimal cacheing
+@table @code
+@item enter
+Arguments to the function.
- void end_key_cache _A((void));
- - End key-cacheing.
+@item exit
+Results from the function.
-@node protocol,,,
-@chapter MySQL client/server protocol
+@item info
+Something that may be interesting.
-Raw packet without compression
-==============================
--------------------------------------------------
-| Packet Length | Packet no | Data |
-| 3 Bytes | 1 Byte | n Bytes |
--------------------------------------------------
+@item warning
+When something doesn't go the usual route or may be wrong.
-3 Byte packet length
- The length is calculated with int3store
- See include/global.h for details.
- The max packetsize can be 16 MB.
-1 Byte packet no
+@item error
+When something went wrong.
-If no compression is used the first 4 bytes of each paket
-is the header of the paket.
-The packet number is incremented for each sent packet. The first
-packet starts with 0
+@item loop
+Output written inside a loop, which is probably only useful when debugging
+the loop. These should normally be deleted when one is
+satisfied with the code and it has been in real use for a while.
+@end table
-n Byte data
+Some tags are specific to @code{mysqld}, because we want to watch these carefully:
+
+@table @code
+@item trans
+Starting/stopping transactions.
+
+@item quit
+@code{info} when mysqld is preparing to die.
+
+@item query
+Print query.
+@end table
+
+
+@node protocol, Fulltext Search, DBUG, Top
+@chapter MySQL Client/Server Protocol
+
+@menu
+* raw packet without compression::
+* raw packet with compression::
+* basic packets::
+* communication::
+* fieldtype codes::
+@end menu
+
+@node raw packet without compression, raw packet with compression, protocol, protocol
+@section Raw Packet Without Compression
+
+@example
++-----------------------------------------------+
+| Packet Length | Packet no | Data |
+| 3 Bytes | 1 Byte | n Bytes |
++-----------------------------------------------+
+@end example
+
+@table @asis
+@item 3 Byte packet length
+The length is calculated with @code{int3store}.
+See @file{include/global.h} for details.
+The maximum packet size is 16 MB.
+
+@item 1 Byte packet no
+If no compression is used, the first 4 bytes of each packet are the header
+of the packet. The packet number is incremented for each sent packet.
+The first packet starts with 0.
+@item n Byte data
+
+@end table
The packet length can be recalculated with:
+
+@example
length = byte1 + (256 * byte2) + (256 * 256 * byte3)
-
-Raw packet with compression
-===========================
------------------------------------------------------
-| Packet Length | Packet no | Uncomp. Packet Length |
-| 3 Bytes | 1 Byte | 3 Bytes |
------------------------------------------------------
-
-3 Byte packet length
- The length is calculated with int3store
- See include/global.h for details.
- The max packetsize can be 16 MB.
-1 Byte packet no
-3 Byte uncompressed packet length
-
-If compression is used the first 7 bytes of each paket
-is the header of the paket.
-
-Basic packets
-==============
-OK-packet
- For details see sql/net_pkg.cc
- function send_ok
- -------------------------------------------------
- | Header | No of Rows | Affected Rows |
- | | 1 Byte | 1-8 Byte |
- -------------------------------------------------
- | ID (last_insert_id) | Status | Length |
- | 1-8 Byte | 2 Byte | 1-8 Byte |
- -------------------------------------------------
- | Messagetext |
- | n Byte |
- -------------------------------------------------
-
- Header
- 1 byte number of rows ? (always 0 ?)
- 1-8 bytes affected rows
- 1-8 byte id (last_insert_id)
- 2 byte Status (usually 0)
- If the OK-packege includes a message:
- 1-8 bytes length of message
- n bytes messagetext
-
-Error-packet
- -------------------------------------------------
- | Header | Statuscode | Error no |
- | | 1 Byte | 2 Byte |
- -------------------------------------------------
- | Messagetext | 0x00 |
- | n Byte | 1 Byte |
- -------------------------------------------------
-
- Header
- 1 byte status code (0xFF = ERROR)
- 2 byte error number (is only sent to new 3.23 clients.
- n byte errortext
- 1 byte 0x00
-
-
-
-The communication
-=================
+@end example
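The length formula above can be sketched in C as follows. The helper names `packet_length` and `packet_number` are hypothetical and not taken from the MySQL source. The sketch assumes the uncompressed 4-byte header layout shown in the diagram: three length bytes, least significant first, followed by one packet-number byte.

```c
#include <stdint.h>

/*
  Hypothetical helper: decode the 3-byte packet length,
  length = byte1 + (256 * byte2) + (256 * 256 * byte3).
*/
static uint32_t packet_length(const unsigned char *header)
{
  return (uint32_t) header[0] |
         ((uint32_t) header[1] << 8) |
         ((uint32_t) header[2] << 16);
}

/* Hypothetical helper: the packet number is the fourth header byte. */
static unsigned int packet_number(const unsigned char *header)
{
  return header[3];
}
```

Three length bytes can hold values up to 2^24 - 1, which is consistent with the 16 MB maximum packet size stated above.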
+
+
+@node raw packet with compression, basic packets, raw packet without compression, protocol
+@section Raw Packet With Compression
+
+@example
++---------------------------------------------------+
+| Packet Length | Packet no | Uncomp. Packet Length |
+| 3 Bytes | 1 Byte | 3 Bytes |
++---------------------------------------------------+
+@end example
+
+@table @asis
+@item 3 Byte packet length
+The length is calculated with @code{int3store}.
+See @file{include/global.h} for details.
+The maximum packet size is 16 MB.
+
+@item 1 Byte packet no
+@item 3 Byte uncompressed packet length
+@end table
+
+If compression is used, the first 7 bytes of each packet
+are the header of the packet.
+
+
+@node basic packets, communication, raw packet with compression, protocol
+@section Basic Packets
+
+@menu
+* ok packet::
+* error packet::
+@end menu
+
+
+@node ok packet, error packet, basic packets, basic packets
+@subsection OK Packet
+
+For details, see @file{sql/net_pkg.cc::send_ok()}.
+
+@example
++-----------------------------------------------+
+| Header | No of Rows | Affected Rows |
+| | 1 Byte | 1-8 Byte |
+|-----------------------------------------------|
+| ID (last_insert_id) | Status | Length |
+| 1-8 Byte | 2 Byte | 1-8 Byte |
+|-----------------------------------------------|
+| Messagetext |
+| n Byte |
++-----------------------------------------------+
+@end example
+
+@table @asis
+@item Header
+@item 1 byte number of rows ? (always 0 ?)
+@item 1-8 bytes affected rows
+@item 1-8 byte id (last_insert_id)
+@item 2 byte Status (usually 0)
+@item If the OK packet includes a message:
+@item 1-8 bytes length of message
+@item n bytes messagetext
+@end table
+
+
+@node error packet, , ok packet, basic packets
+@subsection Error Packet
+
+@example
++-----------------------------------------------+
+| Header | Status code | Error no |
+| | 1 Byte | 2 Byte |
+|-----------------------------------------------|
+| Messagetext | 0x00 |
+| n Byte | 1 Byte |
++-----------------------------------------------+
+@end example
+
+@table @asis
+@item Header
+@item 1 byte status code (0xFF = ERROR)
+@item 2 byte error number (is only sent to new 3.23 clients).
+@item n byte errortext
+@item 1 byte 0x00
+@end table
+
+
+@node communication, fieldtype codes, basic packets, protocol
+@section Communication
> Packet from server to client
< Paket from client tor server
@@ -658,28 +755,29 @@ The communication
n data
-Fieldtype Codes:
-================
-
- display_length |enum_field_type |flags
- ----------------------------------------------------
-Blob 03 FF FF 00 |01 FC |03 90 00 00
-Mediumblob 03 FF FF FF |01 FC |03 90 00 00
-Tinyblob 03 FF 00 00 |01 FC |03 90 00 00
-Text 03 FF FF 00 |01 FC |03 10 00 00
-Mediumtext 03 FF FF FF |01 FC |03 10 00 00
-Tinytext 03 FF 00 00 |01 FC |03 10 00 00
-Integer 03 0B 00 00 |01 03 |03 03 42 00
-Mediumint 03 09 00 00 |01 09 |03 00 00 00
-Smallint 03 06 00 00 |01 02 |03 00 00 00
-Tinyint 03 04 00 00 |01 01 |03 00 00 00
-Varchar 03 XX 00 00 |01 FD |03 00 00 00
-Enum 03 05 00 00 |01 FE |03 00 01 00
-Datetime 03 13 00 00 |01 0C |03 00 00 00
-Timestamp 03 0E 00 00 |01 07 |03 61 04 00
-Time 03 08 00 00 |01 0B |03 00 00 00
-Date 03 0A 00 00 |01 0A |03 00 00 00
+@node fieldtype codes, , communication, protocol
+@section Fieldtype Codes
+@example
+ display_length |enum_field_type |flags
+ ----------------------------------------------------
+Blob 03 FF FF 00 |01 FC |03 90 00 00
+Mediumblob 03 FF FF FF |01 FC |03 90 00 00
+Tinyblob 03 FF 00 00 |01 FC |03 90 00 00
+Text 03 FF FF 00 |01 FC |03 10 00 00
+Mediumtext 03 FF FF FF |01 FC |03 10 00 00
+Tinytext 03 FF 00 00 |01 FC |03 10 00 00
+Integer 03 0B 00 00 |01 03 |03 03 42 00
+Mediumint 03 09 00 00 |01 09 |03 00 00 00
+Smallint 03 06 00 00 |01 02 |03 00 00 00
+Tinyint 03 04 00 00 |01 01 |03 00 00 00
+Varchar 03 XX 00 00 |01 FD |03 00 00 00
+Enum 03 05 00 00 |01 FE |03 00 01 00
+Datetime 03 13 00 00 |01 0C |03 00 00 00
+Timestamp 03 0E 00 00 |01 07 |03 61 04 00
+Time 03 08 00 00 |01 0B |03 00 00 00
+Date 03 0A 00 00 |01 0A |03 00 00 00
+@end example
@c The Index was empty, and ugly, so I removed it. (jcole, Sep 7, 2000)
@@ -688,6 +786,48 @@ Date 03 0A 00 00 |01 0A |03 00 00 00
@c @printindex fn
+@node Fulltext Search, , protocol, Top
+@chapter Fulltext Search in MySQL
+
+Hopefully, sometime there will be a complete description of
+fulltext search algorithms.
+For now, these are just unsorted notes.
+
+@menu
+* Weighting in boolean mode::
+@end menu
+
+@node Weighting in boolean mode, , , Fulltext Search
+@section Weighting in boolean mode
+
+The basic idea is as follows: in the expression
+@code{A or B or (C and D and E)}, either @code{A} or @code{B} alone
+is enough to match the whole expression, while @code{C},
+@code{D}, and @code{E} should @strong{all} match. So it's
+reasonable to assign weight 1 to @code{A}, @code{B}, and
+@code{(C and D and E)}, and @code{C}, @code{D}, and @code{E}
+should each get a weight of 1/3.
+
+Things become more complicated when considering boolean
+operators, as used in MySQL FTB. Obviously, @code{+A +B}
+should be treated as @code{A and B}, and @code{A B}
+as @code{A or B}. The problem is that @code{+A B} can @strong{not}
+be rewritten in and/or terms (that's the reason why this extended
+set of operators was chosen). Still, approximations can be used.
+@code{+A B C} can be approximated as @code{A or (A and (B or C))}
+or as @code{A or (A and B) or (A and C) or (A and B and C)}.
+Applying the above logic (and omitting mathematical
+transformations and normalization) one gets that for
+@code{+A_1 +A_2 ... +A_N B_1 B_2 ... B_M} the weights
+should be: @code{A_i = 1/N}, @code{B_j=1} if @code{N==0}, and,
+otherwise, in the first rewriting approach @code{B_j = 1/3},
+and in the second one @code{B_j = (1+(M-1)*2^M)/(M*(2^(M+1)-1))}.
+
+The second expression gives a somewhat steeper increase in total
+weight as the number of matched B's increases, because it assigns
+higher weights to individual B's. Also, the first expression is
+much simpler, so it is the first one that is implemented in MySQL.
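To compare the two weighting formulas numerically, here is a minimal sketch. The function names are hypothetical and this is not the fulltext code itself. It only evaluates the expressions quoted above for N required (`+`) words and M optional words; returning 0 when there are no words of the given kind is an assumption of the sketch.

```c
/* Hypothetical helpers; they only evaluate the formulas from the text. */

/* Weight of each required word: A_i = 1/N (0 if there are none). */
static double weight_required(unsigned n)
{
  return n ? 1.0 / (double) n : 0.0;
}

/* First rewriting approach: B_j = 1 if N == 0, otherwise 1/3. */
static double weight_optional_first(unsigned n)
{
  return n == 0 ? 1.0 : 1.0 / 3.0;
}

/*
  Second rewriting approach:
  B_j = (1 + (M-1) * 2^M) / (M * (2^(M+1) - 1)), or 1 if N == 0.
*/
static double weight_optional_second(unsigned n, unsigned m)
{
  double pow2m= 1.0;
  unsigned i;

  if (m == 0)
    return 0.0;                         /* No optional words at all */
  if (n == 0)
    return 1.0;
  for (i= 0; i < m; i++)
    pow2m*= 2.0;                        /* pow2m = 2^M */
  return (1.0 + ((double) m - 1.0) * pow2m) /
         ((double) m * (2.0 * pow2m - 1.0));
}
```

For M = 1 both approaches give 1/3. For M = 2 the second gives 5/14, slightly above 1/3, which matches the remark that it assigns higher weights to individual B's.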
+
@summarycontents
@contents