author | Richard M. Stallman <rms@gnu.org> | 1998-02-28 01:53:53 +0000 |
---|---|---|
committer | Richard M. Stallman <rms@gnu.org> | 1998-02-28 01:53:53 +0000 |
commit | f9f59935f3518733b46009b9ee40132b1f330cf0 (patch) | |
tree | e932eb7bce20a1b1e30ecc1e494c2818d294a479 /lispref/objects.texi | |
parent | cc6d0d2c9435d5d065121468b3655f4941403685 (diff) | |
*** empty log message ***
Diffstat (limited to 'lispref/objects.texi')
-rw-r--r-- | lispref/objects.texi | 298 |
1 files changed, 212 insertions, 86 deletions
diff --git a/lispref/objects.texi b/lispref/objects.texi index 78412e2c312..66734920de9 100644 --- a/lispref/objects.texi +++ b/lispref/objects.texi @@ -1,6 +1,6 @@ @c -*-texinfo-*- @c This is part of the GNU Emacs Lisp Reference Manual. -@c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc. +@c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998 Free Software Foundation, Inc. @c See the file elisp.texi for copying conditions. @setfilename ../info/objects @node Lisp Data Types, Numbers, Introduction, Top @@ -66,8 +66,10 @@ to use these types can be found in later chapters. output generated by the Lisp printer (the function @code{prin1}) for that object. The @dfn{read syntax} of an object is the format of the input accepted by the Lisp reader (the function @code{read}) for that -object. Most objects have more than one possible read syntax. Some -types of object have no read syntax; except for these cases, the printed +object. @xref{Read and Print}. + + Most objects have more than one possible read syntax. Some types of +object have no read syntax; except for these cases, the printed representation of an object is also a read syntax for it. In other languages, an expression is text; it has no other form. In @@ -143,6 +145,8 @@ latter are unique to Emacs Lisp. * Array Type:: Arrays include strings and vectors. * String Type:: An (efficient) array of characters. * Vector Type:: One-dimensional arrays. +* Char-Table Type:: One-dimensional sparse arrays indexed by characters. +* Bool-Vector Type:: One-dimensional arrays of @code{t} or @code{nil}. * Function Type:: A piece of executable code you can call from elsewhere. * Macro Type:: A method of expanding an expression into another expression, more fundamental but less pretty. @@ -196,9 +200,9 @@ leading @samp{+} or a final @samp{.}. 
@node Floating Point Type @subsection Floating Point Type - Emacs version 19 supports floating point numbers (though there is a -compilation option to disable them). The precise range of floating -point numbers is machine-specific. + Emacs supports floating point numbers (though there is a compilation +option to disable them). The precise range of floating point numbers is +machine-specific. The printed representation for floating point numbers requires either a decimal point (with at least one digit following), an exponent, or @@ -221,9 +225,10 @@ common to work with @emph{strings}, which are sequences composed of characters. @xref{String Type}. Characters in strings, buffers, and files are currently limited to the -range of 0 to 255---eight bits. If you store a larger integer into a -string, buffer or file, it is truncated to that range. Characters that -represent keyboard input have a much wider range. +range of 0 to 524287---nineteen bits. But not all values in that range +are valid character codes. Characters that represent keyboard input +have a much wider range, so they can modifier keys such as Control, Meta +and Shift. @cindex read syntax for characters @cindex printed representation for characters @@ -272,8 +277,7 @@ way to write the space character. If the character is @samp{\}, you You can express the characters Control-g, backspace, tab, newline, vertical tab, formfeed, return, and escape as @samp{?\a}, @samp{?\b}, @samp{?\t}, @samp{?\n}, @samp{?\v}, @samp{?\f}, @samp{?\r}, @samp{?\e}, -respectively. Those values are 7, 8, 9, 10, 11, 12, 13, and 27 in -decimal. Thus, +respectively. Thus, @example ?\a @result{} 7 ; @r{@kbd{C-g}} @@ -306,10 +310,10 @@ equivalent to @samp{?\^I} and to @samp{?\^i}: ?\^I @result{} 9 ?\C-I @result{} 9 @end example - For use in strings and buffers, you are limited to the control -characters that exist in @sc{ASCII}, but for keyboard input purposes, -you can turn any character into a control character with @samp{C-}. 
The -character codes for these non-@sc{ASCII} control characters include the + In strings and buffers, the only control characters allowed are those +that exist in @sc{ASCII}; but for keyboard input purposes, you can turn +any character into a control character with @samp{C-}. The character +codes for these non-@sc{ASCII} control characters include the @iftex $2^{26}$ @end iftex @@ -359,11 +363,11 @@ $2^{7}$ @ifinfo 2**7 @end ifinfo -bit indicates a meta character, so the meta -characters that can fit in a string have codes in the range from 128 to -255, and are the meta versions of the ordinary @sc{ASCII} characters. -(In Emacs versions 18 and older, this convention was used for characters -outside of strings as well.) +bit attached to an ASCII character indicates a meta character; thus, the +meta characters that can fit in a string have codes in the range from +128 to 255, and are the meta versions of the ordinary @sc{ASCII} +characters. (In Emacs versions 18 and older, this convention was used +for characters outside of strings as well.) The read syntax for meta characters uses @samp{\M-}. For example, @samp{?\M-A} stands for @kbd{M-A}. You can use @samp{\M-} together with @@ -372,9 +376,10 @@ syntax for a character. Thus, you can write @kbd{M-A} as @samp{?\M-A}, or as @samp{?\M-\101}. Likewise, you can write @kbd{C-M-b} as @samp{?\M-\C-b}, @samp{?\C-\M-b}, or @samp{?\M-\002}. - The case of an ordinary letter is indicated by its character code as -part of @sc{ASCII}, but @sc{ASCII} has no way to represent whether a -control character is upper case or lower case. Emacs uses the + The case of a graphic character is indicated by its character code; +for example, @sc{ASCII} distinguishes between the characters @samp{a} +and @samp{A}. But @sc{ASCII} has no way to represent whether a control +character is upper case or lower case. Emacs uses the @iftex $2^{25}$ @end iftex @@ -407,8 +412,9 @@ bit values are 2**22 for alt, 2**23 for super and 2**24 for hyper. 
@cindex @samp{\} in character constant @cindex backslash in character constant @cindex octal character code - Finally, the most general read syntax consists of a question mark -followed by a backslash and the character code in octal (up to three + Finally, the most general read syntax for a character represents the +character code in either octal or hex. To use octal, write a question +mark followed by a backslash and the octal character code (up to three octal digits); thus, @samp{?\101} for the character @kbd{A}, @samp{?\001} for the character @kbd{C-a}, and @code{?\002} for the character @kbd{C-b}. Although this syntax can represent any @sc{ASCII} @@ -422,6 +428,18 @@ important than the @sc{ASCII} representation. @end group @end example + To use hex, write a question mark followed by a backslash, @samp{x}, +and the hexadecimal character code. You can use any number of hex +digits, so you can represent any character code in this way. +Thus, @samp{?\x41} for the character @kbd{A}, @samp{?\x1} for the +character @kbd{C-a}, and @code{?\x8c0} for the character +@iftex +@`a. +@end iftex +@ifinfo +@samp{a} with grave accent. +@end ifinfo + A backslash is allowed, and harmless, preceding any character without a special escape meaning; thus, @samp{?\+} is equivalent to @samp{?+}. There is no reason to add a backslash before most characters. However, @@ -788,49 +806,36 @@ the names of Lisp symbols, as messages for the user, and to represent text extracted from buffers. Strings in Lisp are constants: evaluation of a string returns the same string. + @xref{Strings and Characters}, for functions that operate on strings. 
+ +@menu +* Syntax for Strings:: +* Non-ASCII in Strings:: +* Nonprinting Characters:: +* Text Props and Strings:: +@end menu + +@node Syntax for Strings +@subsubsection Syntax for Strings + @cindex @samp{"} in strings @cindex double-quote in strings @cindex @samp{\} in strings @cindex backslash in strings The read syntax for strings is a double-quote, an arbitrary number of -characters, and another double-quote, @code{"like this"}. The Lisp -reader accepts the same formats for reading the characters of a string -as it does for reading single characters (without the question mark that -begins a character literal). You can enter a nonprinting character such -as tab, @kbd{C-a} or @kbd{M-C-A} using the convenient escape sequences, -like this: @code{"\t, \C-a, \M-\C-a"}. You can include a double-quote -in a string by preceding it with a backslash; thus, @code{"\""} is a -string containing just a single double-quote character. -(@xref{Character Type}, for a description of the read syntax for -characters.) - - If you use the @samp{\M-} syntax to indicate a meta character in a -string constant, this sets the -@iftex -$2^{7}$ -@end iftex -@ifinfo -2**7 -@end ifinfo -bit of the character in the string. -This is not the same representation that the meta modifier has in a -character on its own (not inside a string). @xref{Character Type}. - - Strings cannot hold characters that have the hyper, super, or alt -modifiers; they can hold @sc{ASCII} control characters, but no others. -They do not distinguish case in @sc{ASCII} control characters. - - The printed representation of a string consists of a double-quote, the -characters it contains, and another double-quote. However, you must -escape any backslash or double-quote characters in the string with a -backslash, like this: @code{"this \" is an embedded quote"}. +characters, and another double-quote, @code{"like this"}. 
To include a +double-quote in a string, precede it with a backslash; thus, @code{"\""} +is a string containing just a single double-quote character. Likewise, +you can include a backslash by preceding it with another backslash, like +this: @code{"this \\ is a single embedded backslash"}. +@cindex newline in strings The newline character is not special in the read syntax for strings; if you write a new line between the double-quotes, it becomes a character in the string. But an escaped newline---one that is preceded by @samp{\}---does not become part of the string; i.e., the Lisp reader -ignores an escaped newline while reading a string. -@cindex newline in strings +ignores an escaped newline while reading a string. An escaped space +@w{@samp{\ }} is likewise ignored. @example "It is useful to include newlines @@ -842,11 +847,73 @@ in documentation strings, but the newline is ignored if escaped." @end example - A string can hold properties of the text it contains, in addition to -the characters themselves. This enables programs that copy text between -strings and buffers to preserve the properties with no special effort. -@xref{Text Properties}. Strings with text properties have a special -read and print syntax: +@node Non-ASCII in Strings +@subsubsection Non-ASCII Characters in Strings + + You can include a non-@sc{ASCII} international character in a string +constant by writing it literally. There are two text representations +for non-@sc{ASCII} characters in Emacs strings (and in buffers): unibyte +and multibyte. If the string constant is read from a multibyte source, +then the character is read as a multibyte character, and that makes the +string multibyte. If the string constant is read from a unibyte source, +then the character is read as unibyte and that makes the string unibyte. + + You can also represent a multibyte non-@sc{ASCII} character with its +character code, using a hex escape, @samp{\x@var{nnnnnnn}}, with as many +digits as necessary. 
(Multibyte non-@sc{ASCII} character codes are all +greater than 256.) Any character which is not a valid hex digit +terminates this construct. If the character that would follow is a hex +digit, write @samp{\ } to terminate the hex escape---for example, +@samp{\x8c0\ } represents one character, @samp{a} with grave accent. +@samp{\ } in a string constant is just like backslash-newline; it does +not contribute any character to the string, but it does terminate the +preceding hex escape. + + Using a multibyte hex escape forces the string to multibyte. You can +represent a unibyte non-@sc{ASCII} character with its character code, +which must be in the range from 128 (0200 octal) to 255 (0377 octal). +This forces a unibyte string. + + @xref{Text Representations}, for more information about the two +text representations. + +@node Nonprinting Characters +@subsubsection Nonprinting Characters in Strings + + Strings cannot hold characters that have the hyper, super, or alt +modifiers; the only control or meta characters they can hold are the +@sc{ASCII} control characters. Strings do not distinguish case in +@sc{ASCII} control characters. + + You can use the same backslash escape-sequences in a string constant +as in character literals (but do not use the question mark that begins a +character constant). For example, you can write a string containing the +nonprinting characters tab, @kbd{C-a} and @kbd{M-C-a}, with commas and +spaces between them, like this: @code{"\t, \C-a, \M-\C-a"}. +@xref{Character Type}, for a description of the read syntax for +characters. + + If you use the @samp{\M-} syntax to indicate a meta character in a +string constant, this sets the +@iftex +$2^{7}$ +@end iftex +@ifinfo +2**7 +@end ifinfo +bit of the character in the string. This construct works only with +ASCII characters. Note that the same meta characters have a different +representation when not in a string. @xref{Character Type}. 
+ +@node Text Props and Strings +@subsubsection Text Properties in Strings + + A string can hold properties for the characters it contains, in +addition to the characters themselves. This enables programs that copy +text between strings and buffers to copy the text's properties with no +special effort. @xref{Text Properties}, for an explanation of what text +properties mean. Strings with text properties use a special read and +print syntax: @example #("@var{characters}" @var{property-data}...) @@ -863,9 +930,18 @@ of three as follows: @noindent The elements @var{beg} and @var{end} are integers, and together specify a range of indices in the string; @var{plist} is the property list for -that range. +that range. For example, - @xref{Strings and Characters}, for functions that work on strings. +@example +#("foo bar" 0 3 (face bold) 3 4 nil 4 7 (face italic)) +@end example + +@noindent +represents a string whose textual contents are @samp{foo bar}, in which +the first three characters have a @code{face} property with value +@code{bold}, and the last three have a @code{face} property with value +@code{italic}. (The fourth character has no text properties so its +property list is @code{nil}.) @node Vector Type @subsection Vector Type @@ -887,6 +963,44 @@ for evaluation. @xref{Vectors}, for functions that work with vectors. +@node Char-Table Type +@subsection Char-Table Type + + A @dfn{char-table} is a one-dimensional array of elements of any type, +indexed by character codes. Char-tables have certain extra features to +make them more useful for many jobs that involve assigning information +to character codes---for example, a char-table can have a parent to +inherit from, a default value, and a small number of extra slots to use for +special purposes. A char-table can also specify a single value for +a whole character set. + + The printed representation of a char-table is like a vector +except that there is an extra @samp{#} at the beginning. 
+ + @xref{Char-Tables}, for special functions to operate on char-tables. + +@node Bool-Vector Type +@subsection Bool-Vector Type + + A @dfn{bool-vector} is a one-dimensional array of elements that +must be @code{t} or @code{nil}. + + The printed representation of a Bool-vector is like a string, except +that it begins with @samp{#&} followed by the length. The string +constant that follows actually specifies the contents of the bool-vector +as a bitmap---each ``character'' in the string contains 8 bits, which +specify the next 8 elements of the bool-vector (1 stands for @code{t}, +and 0 for @code{nil}). If the length is not a multiple of 8, the +printed representation describes extra elements, but these really +make no difference. + +@example +(make-bool-vector 3 t) + @result{} #&3"\377" +(make-bool-vector 3 nil) + @result{} #&3"\0"" +@end example + @node Function Type @subsection Function Type @@ -922,6 +1036,10 @@ is a Lisp function object, including the @code{lambda} symbol. a macro as far as Emacs is concerned. @xref{Macros}, for an explanation of how to write a macro. + @strong{Warning}: Lisp macros and keyboard macros (@pxref{Keyboard +Macros}) are entirely different things. When we use the word ``macro'' +without qualification, we mean a Lisp macro, not a keyboard macro. + @node Primitive Function Type @subsection Primitive Function Type @cindex special forms @@ -939,7 +1057,8 @@ primitive. However, this does matter if you try to substitute a function written in Lisp for a primitive of the same name. The reason is that the primitive function may be called directly from C code. Calls to the redefined function from Lisp will use the new definition, -but calls from C code may still use the built-in definition. +but calls from C code may still use the built-in definition. Therefore, +@strong{we discourage redefinition of primitive functions}. The term @dfn{function} refers to all Emacs functions, whether written in Lisp or C. 
@xref{Function Type}, for information about the @@ -1227,18 +1346,22 @@ keys, local as well as global keymaps, and changing key bindings. @node Syntax Table Type @subsection Syntax Table Type - A @dfn{syntax table} is a vector of 256 integers. Each element of the -vector defines how one character is interpreted when it appears in a + A @dfn{syntax table} is a char-table which specifies the syntax of +each character, for word and list parsing. Each element of the syntax +table defines how one character is interpreted when it appears in a buffer. For example, in C mode (@pxref{Major Modes}), the @samp{+} character is punctuation, but in Lisp mode it is a valid character in a symbol. These modes specify different interpretations by changing the syntax table entry for @samp{+}, at index 43 in the syntax table. - Syntax tables are used only for scanning text in buffers, not for -reading Lisp expressions. The table the Lisp interpreter uses to read -expressions is built into the Emacs source code and cannot be changed; -thus, to change the list delimiters to be @samp{@{} and @samp{@}} -instead of @samp{(} and @samp{)} would be impossible. + Syntax tables are used only to control primitives that scan text in +buffers, not for reading Lisp expressions. The syntax that the Lisp +interpreter uses to read expressions is built into the Emacs source code +and cannot be changed; thus, to change the list delimiters to be +@samp{@{} and @samp{@}} instead of @samp{(} and @samp{)} would be +impossible. (Some Lisp systems provide ways to redefine the read +syntax, but we decided to leave this feature out of Emacs Lisp for +simplicity.) @xref{Syntax Tables}, for details about syntax classes and how to make and modify syntax tables. @@ -1248,18 +1371,18 @@ and modify syntax tables. A @dfn{display table} specifies how to display each character code. Each buffer and each window can have its own display table. A display -table is actually a vector of length 262. @xref{Display Tables}. 
+table is actually a char-table. @xref{Display Tables}. @node Overlay Type @subsection Overlay Type - An @dfn{overlay} specifies temporary alteration of the display -appearance of a part of a buffer. It contains markers delimiting a -range of the buffer, plus a property list (a list whose elements are -alternating property names and values). Overlays are used to present -parts of the buffer temporarily in a different display style. They have -no read syntax, and print in hash notation, giving the buffer name and -range of positions. + An @dfn{overlay} specifies properties that apply to a part of a +buffer. Each overlay applies to a specified range of the buffer, and +contains a property list (a list whose elements are alternating property +names and values). Overlay properties are used to present parts of the +buffer temporarily in a different display style. Overlays have no read +syntax, and print in hash notation, giving the buffer name and range of +positions. @xref{Overlays}, for how to create and use overlays. @@ -1284,7 +1407,7 @@ pass an argument to @code{+} that it cannot handle: @example @group (+ 2 'a) - @error{} Wrong type argument: integer-or-marker-p, a + @error{} Wrong type argument: number-or-marker-p, a @end group @end example @@ -1355,6 +1478,9 @@ with references to further information. @item framep @xref{Frames, framep}. +@item functionp +@xref{Functions, functionp}. + @item integer-or-marker-p @xref{Predicates on Markers, integer-or-marker-p}. @@ -1572,10 +1698,10 @@ arguments to see if their elements are the same. So, if two objects are @end group @end example -Comparison of strings is case-sensitive and takes account of text -properties as well as the characters in the strings. To compare -two strings' characters without comparing their text properties, -use @code{string=} (@pxref{Text Comparison}). +Comparison of strings is case-sensitive, but does not take account of +text properties---it compares only the characters in the strings. 
+A unibyte string never equals a multibyte string unless the +contents are entirely @sc{ASCII} (@pxref{Text Representations}). @example @group