author Richard M. Stallman <rms@gnu.org> 1998-02-28 01:53:53 +0000
committer Richard M. Stallman <rms@gnu.org> 1998-02-28 01:53:53 +0000
commit f9f59935f3518733b46009b9ee40132b1f330cf0
tree e932eb7bce20a1b1e30ecc1e494c2818d294a479 /lispref/objects.texi
parent cc6d0d2c9435d5d065121468b3655f4941403685
*** empty log message ***
Diffstat (limited to 'lispref/objects.texi')
-rw-r--r-- lispref/objects.texi 298
1 files changed, 212 insertions, 86 deletions
diff --git a/lispref/objects.texi b/lispref/objects.texi
index 78412e2c312..66734920de9 100644
--- a/lispref/objects.texi
+++ b/lispref/objects.texi
@@ -1,6 +1,6 @@
@c -*-texinfo-*-
@c This is part of the GNU Emacs Lisp Reference Manual.
-@c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc.
+@c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998 Free Software Foundation, Inc.
@c See the file elisp.texi for copying conditions.
@setfilename ../info/objects
@node Lisp Data Types, Numbers, Introduction, Top
@@ -66,8 +66,10 @@ to use these types can be found in later chapters.
output generated by the Lisp printer (the function @code{prin1}) for
that object. The @dfn{read syntax} of an object is the format of the
input accepted by the Lisp reader (the function @code{read}) for that
-object. Most objects have more than one possible read syntax. Some
-types of object have no read syntax; except for these cases, the printed
+object. @xref{Read and Print}.
+
+ Most objects have more than one possible read syntax. Some types of
+object have no read syntax; except for these cases, the printed
representation of an object is also a read syntax for it.
In other languages, an expression is text; it has no other form. In
@@ -143,6 +145,8 @@ latter are unique to Emacs Lisp.
* Array Type:: Arrays include strings and vectors.
* String Type:: An (efficient) array of characters.
* Vector Type:: One-dimensional arrays.
+* Char-Table Type:: One-dimensional sparse arrays indexed by characters.
+* Bool-Vector Type:: One-dimensional arrays of @code{t} or @code{nil}.
* Function Type:: A piece of executable code you can call from elsewhere.
* Macro Type:: A method of expanding an expression into another
expression, more fundamental but less pretty.
@@ -196,9 +200,9 @@ leading @samp{+} or a final @samp{.}.
@node Floating Point Type
@subsection Floating Point Type
- Emacs version 19 supports floating point numbers (though there is a
-compilation option to disable them). The precise range of floating
-point numbers is machine-specific.
+ Emacs supports floating point numbers (though there is a compilation
+option to disable them). The precise range of floating point numbers is
+machine-specific.
The printed representation for floating point numbers requires either
a decimal point (with at least one digit following), an exponent, or
@@ -221,9 +225,10 @@ common to work with @emph{strings}, which are sequences composed of
characters. @xref{String Type}.
Characters in strings, buffers, and files are currently limited to the
-range of 0 to 255---eight bits. If you store a larger integer into a
-string, buffer or file, it is truncated to that range. Characters that
-represent keyboard input have a much wider range.
+range of 0 to 524287---nineteen bits. But not all values in that range
+are valid character codes. Characters that represent keyboard input
+have a much wider range, so they can encode modifier keys such as Control, Meta
+and Shift.
@cindex read syntax for characters
@cindex printed representation for characters
@@ -272,8 +277,7 @@ way to write the space character. If the character is @samp{\}, you
You can express the characters Control-g, backspace, tab, newline,
vertical tab, formfeed, return, and escape as @samp{?\a}, @samp{?\b},
@samp{?\t}, @samp{?\n}, @samp{?\v}, @samp{?\f}, @samp{?\r}, @samp{?\e},
-respectively. Those values are 7, 8, 9, 10, 11, 12, 13, and 27 in
-decimal. Thus,
+respectively. Thus,
@example
?\a @result{} 7 ; @r{@kbd{C-g}}
@@ -306,10 +310,10 @@ equivalent to @samp{?\^I} and to @samp{?\^i}:
?\^I @result{} 9 ?\C-I @result{} 9
@end example
- For use in strings and buffers, you are limited to the control
-characters that exist in @sc{ASCII}, but for keyboard input purposes,
-you can turn any character into a control character with @samp{C-}. The
-character codes for these non-@sc{ASCII} control characters include the
+ In strings and buffers, the only control characters allowed are those
+that exist in @sc{ASCII}; but for keyboard input purposes, you can turn
+any character into a control character with @samp{C-}. The character
+codes for these non-@sc{ASCII} control characters include the
@iftex
$2^{26}$
@end iftex
@@ -359,11 +363,11 @@ $2^{7}$
@ifinfo
2**7
@end ifinfo
-bit indicates a meta character, so the meta
-characters that can fit in a string have codes in the range from 128 to
-255, and are the meta versions of the ordinary @sc{ASCII} characters.
-(In Emacs versions 18 and older, this convention was used for characters
-outside of strings as well.)
+bit attached to an @sc{ASCII} character indicates a meta character; thus, the
+meta characters that can fit in a string have codes in the range from
+128 to 255, and are the meta versions of the ordinary @sc{ASCII}
+characters. (In Emacs versions 18 and older, this convention was used
+for characters outside of strings as well.)
The read syntax for meta characters uses @samp{\M-}. For example,
@samp{?\M-A} stands for @kbd{M-A}. You can use @samp{\M-} together with
@@ -372,9 +376,10 @@ syntax for a character. Thus, you can write @kbd{M-A} as @samp{?\M-A},
or as @samp{?\M-\101}. Likewise, you can write @kbd{C-M-b} as
@samp{?\M-\C-b}, @samp{?\C-\M-b}, or @samp{?\M-\002}.
- The case of an ordinary letter is indicated by its character code as
-part of @sc{ASCII}, but @sc{ASCII} has no way to represent whether a
-control character is upper case or lower case. Emacs uses the
+ The case of a graphic character is indicated by its character code;
+for example, @sc{ASCII} distinguishes between the characters @samp{a}
+and @samp{A}. But @sc{ASCII} has no way to represent whether a control
+character is upper case or lower case. Emacs uses the
@iftex
$2^{25}$
@end iftex
@@ -407,8 +412,9 @@ bit values are 2**22 for alt, 2**23 for super and 2**24 for hyper.
@cindex @samp{\} in character constant
@cindex backslash in character constant
@cindex octal character code
- Finally, the most general read syntax consists of a question mark
-followed by a backslash and the character code in octal (up to three
+ Finally, the most general read syntax for a character represents the
+character code in either octal or hex. To use octal, write a question
+mark followed by a backslash and the octal character code (up to three
octal digits); thus, @samp{?\101} for the character @kbd{A},
@samp{?\001} for the character @kbd{C-a}, and @code{?\002} for the
character @kbd{C-b}. Although this syntax can represent any @sc{ASCII}
@@ -422,6 +428,18 @@ important than the @sc{ASCII} representation.
@end group
@end example
+ To use hex, write a question mark followed by a backslash, @samp{x},
+and the hexadecimal character code. You can use any number of hex
+digits, so you can represent any character code in this way.
+Thus, @samp{?\x41} for the character @kbd{A}, @samp{?\x1} for the
+character @kbd{C-a}, and @code{?\x8c0} for the character
+@iftex
+@`a.
+@end iftex
+@ifinfo
+@samp{a} with grave accent.
+@end ifinfo
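
As a quick check of the octal and hex syntaxes described above, the following sketch compares several spellings of the same character codes. It assumes a reasonably recent Emacs. Only @sc{ASCII} codes are used, so the results should not depend on how non-@sc{ASCII} characters are encoded.

```elisp
;; Each form compares two read syntaxes for one character code;
;; all four should evaluate to t.
(= ?\101 ?A)      ; octal escape for A
(= ?\x41 ?A)      ; hex escape for A
(= ?\x1 ?\001)    ; hex and octal escapes for C-a
(= ?\x1 ?\C-a)    ; the same code written with the control syntax
```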
+
A backslash is allowed, and harmless, preceding any character without
a special escape meaning; thus, @samp{?\+} is equivalent to @samp{?+}.
There is no reason to add a backslash before most characters. However,
@@ -788,49 +806,36 @@ the names of Lisp symbols, as messages for the user, and to represent
text extracted from buffers. Strings in Lisp are constants: evaluation
of a string returns the same string.
+ @xref{Strings and Characters}, for functions that operate on strings.
+
+@menu
+* Syntax for Strings::
+* Non-ASCII in Strings::
+* Nonprinting Characters::
+* Text Props and Strings::
+@end menu
+
+@node Syntax for Strings
+@subsubsection Syntax for Strings
+
@cindex @samp{"} in strings
@cindex double-quote in strings
@cindex @samp{\} in strings
@cindex backslash in strings
The read syntax for strings is a double-quote, an arbitrary number of
-characters, and another double-quote, @code{"like this"}. The Lisp
-reader accepts the same formats for reading the characters of a string
-as it does for reading single characters (without the question mark that
-begins a character literal). You can enter a nonprinting character such
-as tab, @kbd{C-a} or @kbd{M-C-A} using the convenient escape sequences,
-like this: @code{"\t, \C-a, \M-\C-a"}. You can include a double-quote
-in a string by preceding it with a backslash; thus, @code{"\""} is a
-string containing just a single double-quote character.
-(@xref{Character Type}, for a description of the read syntax for
-characters.)
-
- If you use the @samp{\M-} syntax to indicate a meta character in a
-string constant, this sets the
-@iftex
-$2^{7}$
-@end iftex
-@ifinfo
-2**7
-@end ifinfo
-bit of the character in the string.
-This is not the same representation that the meta modifier has in a
-character on its own (not inside a string). @xref{Character Type}.
-
- Strings cannot hold characters that have the hyper, super, or alt
-modifiers; they can hold @sc{ASCII} control characters, but no others.
-They do not distinguish case in @sc{ASCII} control characters.
-
- The printed representation of a string consists of a double-quote, the
-characters it contains, and another double-quote. However, you must
-escape any backslash or double-quote characters in the string with a
-backslash, like this: @code{"this \" is an embedded quote"}.
+characters, and another double-quote, @code{"like this"}. To include a
+double-quote in a string, precede it with a backslash; thus, @code{"\""}
+is a string containing just a single double-quote character. Likewise,
+you can include a backslash by preceding it with another backslash, like
+this: @code{"this \\ is a single embedded backslash"}.
+@cindex newline in strings
The newline character is not special in the read syntax for strings;
if you write a new line between the double-quotes, it becomes a
character in the string. But an escaped newline---one that is preceded
by @samp{\}---does not become part of the string; i.e., the Lisp reader
-ignores an escaped newline while reading a string.
-@cindex newline in strings
+ignores an escaped newline while reading a string. An escaped space
+@w{@samp{\ }} is likewise ignored.
@example
"It is useful to include newlines
@@ -842,11 +847,73 @@ in documentation strings,
but the newline is ignored if escaped."
@end example
- A string can hold properties of the text it contains, in addition to
-the characters themselves. This enables programs that copy text between
-strings and buffers to preserve the properties with no special effort.
-@xref{Text Properties}. Strings with text properties have a special
-read and print syntax:
+@node Non-ASCII in Strings
+@subsubsection Non-ASCII Characters in Strings
+
+ You can include a non-@sc{ASCII} international character in a string
+constant by writing it literally. There are two text representations
+for non-@sc{ASCII} characters in Emacs strings (and in buffers): unibyte
+and multibyte. If the string constant is read from a multibyte source,
+then the character is read as a multibyte character, and that makes the
+string multibyte. If the string constant is read from a unibyte source,
+then the character is read as unibyte and that makes the string unibyte.
+
+ You can also represent a multibyte non-@sc{ASCII} character with its
+character code, using a hex escape, @samp{\x@var{nnnnnnn}}, with as many
+digits as necessary. (Multibyte non-@sc{ASCII} character codes are all
+greater than 256.) Any character which is not a valid hex digit
+terminates this construct. If the character that would follow is a hex
+digit, write @samp{\ } to terminate the hex escape---for example,
+@samp{\x8c0\ } represents one character, @samp{a} with grave accent.
+@samp{\ } in a string constant is just like backslash-newline; it does
+not contribute any character to the string, but it does terminate the
+preceding hex escape.
+
+ Using a multibyte hex escape forces the string to multibyte. You can
+represent a unibyte non-@sc{ASCII} character with its character code,
+which must be in the range from 128 (0200 octal) to 255 (0377 octal).
+This forces a unibyte string.
+
+ @xref{Text Representations}, for more information about the two
+text representations.
+
+@node Nonprinting Characters
+@subsubsection Nonprinting Characters in Strings
+
+ Strings cannot hold characters that have the hyper, super, or alt
+modifiers; the only control or meta characters they can hold are the
+@sc{ASCII} control characters. Strings do not distinguish case in
+@sc{ASCII} control characters.
+
+ You can use the same backslash escape-sequences in a string constant
+as in character literals (but do not use the question mark that begins a
+character constant). For example, you can write a string containing the
+nonprinting characters tab, @kbd{C-a} and @kbd{M-C-a}, with commas and
+spaces between them, like this: @code{"\t, \C-a, \M-\C-a"}.
+@xref{Character Type}, for a description of the read syntax for
+characters.
+
+ If you use the @samp{\M-} syntax to indicate a meta character in a
+string constant, this sets the
+@iftex
+$2^{7}$
+@end iftex
+@ifinfo
+2**7
+@end ifinfo
+bit of the character in the string. This construct works only with
+@sc{ASCII} characters. Note that the same meta characters have a different
+representation when not in a string. @xref{Character Type}.
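
The difference between the two representations of a meta character can be checked directly. This is a minimal sketch, assuming a recent Emacs in which the meta modifier of a character outside a string is the 2**27 bit, as described under Character Type.

```elisp
;; Outside a string, the meta modifier is a high bit (2**27).
(= ?\M-a (logior (ash 1 27) ?a))

;; Inside a string constant, \M- sets the 2**7 bit instead.
(= (aref "\M-a" 0) (logior (ash 1 7) ?a))
```

Both forms should evaluate to @code{t}, which shows that the same @samp{\M-a} syntax yields two different integers depending on where it appears.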
+
+@node Text Props and Strings
+@subsubsection Text Properties in Strings
+
+ A string can hold properties for the characters it contains, in
+addition to the characters themselves. This enables programs that copy
+text between strings and buffers to copy the text's properties with no
+special effort. @xref{Text Properties}, for an explanation of what text
+properties mean. Strings with text properties use a special read and
+print syntax:
@example
#("@var{characters}" @var{property-data}...)
@@ -863,9 +930,18 @@ of three as follows:
@noindent
The elements @var{beg} and @var{end} are integers, and together specify
a range of indices in the string; @var{plist} is the property list for
-that range.
+that range. For example,
- @xref{Strings and Characters}, for functions that work on strings.
+@example
+#("foo bar" 0 3 (face bold) 3 4 nil 4 7 (face italic))
+@end example
+
+@noindent
+represents a string whose textual contents are @samp{foo bar}, in which
+the first three characters have a @code{face} property with value
+@code{bold}, and the last three have a @code{face} property with value
+@code{italic}. (The fourth character has no text properties so its
+property list is @code{nil}.)
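
To confirm how the ranges in this syntax map onto characters, the string can be read back and queried. This is an illustrative sketch using the standard functions @code{get-text-property} and @code{text-properties-at}; the variable name @code{s} is arbitrary.

```elisp
(let ((s #("foo bar" 0 3 (face bold) 3 4 nil 4 7 (face italic))))
  (list (get-text-property 0 'face s)    ; within the first three characters
        (text-properties-at 3 s)         ; the space between the two words
        (get-text-property 4 'face s)))  ; within the last three characters
;; => (bold nil italic)
```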
@node Vector Type
@subsection Vector Type
@@ -887,6 +963,44 @@ for evaluation.
@xref{Vectors}, for functions that work with vectors.
+@node Char-Table Type
+@subsection Char-Table Type
+
+ A @dfn{char-table} is a one-dimensional array of elements of any type,
+indexed by character codes. Char-tables have certain extra features to
+make them more useful for many jobs that involve assigning information
+to character codes---for example, a char-table can have a parent to
+inherit from, a default value, and a small number of extra slots to use for
+special purposes. A char-table can also specify a single value for
+a whole character set.
+
+ The printed representation of a char-table is like a vector
+except that there is an extra @samp{#} at the beginning.
+
+ @xref{Char-Tables}, for special functions to operate on char-tables.
+
+@node Bool-Vector Type
+@subsection Bool-Vector Type
+
+ A @dfn{bool-vector} is a one-dimensional array of elements that
+must be @code{t} or @code{nil}.
+
+ The printed representation of a bool-vector is like a string, except
+that it begins with @samp{#&} followed by the length. The string
+constant that follows actually specifies the contents of the bool-vector
+as a bitmap---each ``character'' in the string contains 8 bits, which
+specify the next 8 elements of the bool-vector (1 stands for @code{t},
+and 0 for @code{nil}). If the length is not a multiple of 8, the
+printed representation describes extra elements, but these really
+make no difference.
+
+@example
+(make-bool-vector 3 t)
+ @result{} #&3"\377"
+(make-bool-vector 3 nil)
+ @result{} #&3"\0"
+@end example
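
Because the printed bitmap can include bits beyond the stated length, it is often easier to inspect a bool-vector element by element. The following sketch assumes only the standard array functions @code{aref}, @code{aset} and @code{length}; the variable name @code{bv} is arbitrary.

```elisp
(let ((bv (make-bool-vector 3 t)))
  (aset bv 1 nil)                        ; elements can only be t or nil
  (list (bool-vector-p bv)
        (length bv)
        (list (aref bv 0) (aref bv 1) (aref bv 2))))
;; => (t 3 (t nil t))
```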
+
@node Function Type
@subsection Function Type
@@ -922,6 +1036,10 @@ is a Lisp function object, including the @code{lambda} symbol.
a macro as far as Emacs is concerned. @xref{Macros}, for an explanation
of how to write a macro.
+ @strong{Warning}: Lisp macros and keyboard macros (@pxref{Keyboard
+Macros}) are entirely different things. When we use the word ``macro''
+without qualification, we mean a Lisp macro, not a keyboard macro.
+
@node Primitive Function Type
@subsection Primitive Function Type
@cindex special forms
@@ -939,7 +1057,8 @@ primitive. However, this does matter if you try to substitute a
function written in Lisp for a primitive of the same name. The reason
is that the primitive function may be called directly from C code.
Calls to the redefined function from Lisp will use the new definition,
-but calls from C code may still use the built-in definition.
+but calls from C code may still use the built-in definition. Therefore,
+@strong{we discourage redefinition of primitive functions}.
The term @dfn{function} refers to all Emacs functions, whether written
in Lisp or C. @xref{Function Type}, for information about the
@@ -1227,18 +1346,22 @@ keys, local as well as global keymaps, and changing key bindings.
@node Syntax Table Type
@subsection Syntax Table Type
- A @dfn{syntax table} is a vector of 256 integers. Each element of the
-vector defines how one character is interpreted when it appears in a
+ A @dfn{syntax table} is a char-table which specifies the syntax of
+each character, for word and list parsing. Each element of the syntax
+table defines how one character is interpreted when it appears in a
buffer. For example, in C mode (@pxref{Major Modes}), the @samp{+}
character is punctuation, but in Lisp mode it is a valid character in a
symbol. These modes specify different interpretations by changing the
syntax table entry for @samp{+}, at index 43 in the syntax table.
- Syntax tables are used only for scanning text in buffers, not for
-reading Lisp expressions. The table the Lisp interpreter uses to read
-expressions is built into the Emacs source code and cannot be changed;
-thus, to change the list delimiters to be @samp{@{} and @samp{@}}
-instead of @samp{(} and @samp{)} would be impossible.
+ Syntax tables are used only to control primitives that scan text in
+buffers, not for reading Lisp expressions. The syntax that the Lisp
+interpreter uses to read expressions is built into the Emacs source code
+and cannot be changed; thus, to change the list delimiters to be
+@samp{@{} and @samp{@}} instead of @samp{(} and @samp{)} would be
+impossible. (Some Lisp systems provide ways to redefine the read
+syntax, but we decided to leave this feature out of Emacs Lisp for
+simplicity.)
@xref{Syntax Tables}, for details about syntax classes and how to make
and modify syntax tables.
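
The points above can be spot-checked from Lisp. This is a rough sketch, assuming a recent Emacs in which @code{emacs-lisp-mode} gives @samp{+} symbol syntax (shown by @code{char-syntax} as @samp{_}).

```elisp
(list (char-table-p (syntax-table))      ; a syntax table is a char-table
      (= ?+ 43)                          ; the index of the entry for +
      (with-temp-buffer
        (emacs-lisp-mode)
        (string (char-syntax ?+))))      ; syntax class of + in this mode
;; => (t t "_")
```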
@@ -1248,18 +1371,18 @@ and modify syntax tables.
A @dfn{display table} specifies how to display each character code.
Each buffer and each window can have its own display table. A display
-table is actually a vector of length 262. @xref{Display Tables}.
+table is actually a char-table. @xref{Display Tables}.
@node Overlay Type
@subsection Overlay Type
- An @dfn{overlay} specifies temporary alteration of the display
-appearance of a part of a buffer. It contains markers delimiting a
-range of the buffer, plus a property list (a list whose elements are
-alternating property names and values). Overlays are used to present
-parts of the buffer temporarily in a different display style. They have
-no read syntax, and print in hash notation, giving the buffer name and
-range of positions.
+ An @dfn{overlay} specifies properties that apply to a part of a
+buffer. Each overlay applies to a specified range of the buffer, and
+contains a property list (a list whose elements are alternating property
+names and values). Overlay properties are used to present parts of the
+buffer temporarily in a different display style. Overlays have no read
+syntax, and print in hash notation, giving the buffer name and range of
+positions.
@xref{Overlays}, for how to create and use overlays.
@@ -1284,7 +1407,7 @@ pass an argument to @code{+} that it cannot handle:
@example
@group
(+ 2 'a)
- @error{} Wrong type argument: integer-or-marker-p, a
+ @error{} Wrong type argument: number-or-marker-p, a
@end group
@end example
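
A program can catch this error and look at which type predicate failed. The sketch below uses @code{condition-case} to trap the @code{wrong-type-argument} error and return its data; the variable name @code{err} is arbitrary.

```elisp
(condition-case err
    (+ 2 'a)
  (wrong-type-argument
   (cdr err)))          ; the failed predicate and the offending value
;; => (number-or-marker-p a)
```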
@@ -1355,6 +1478,9 @@ with references to further information.
@item framep
@xref{Frames, framep}.
+@item functionp
+@xref{Functions, functionp}.
+
@item integer-or-marker-p
@xref{Predicates on Markers, integer-or-marker-p}.
@@ -1572,10 +1698,10 @@ arguments to see if their elements are the same. So, if two objects are
@end group
@end example
-Comparison of strings is case-sensitive and takes account of text
-properties as well as the characters in the strings. To compare
-two strings' characters without comparing their text properties,
-use @code{string=} (@pxref{Text Comparison}).
+Comparison of strings is case-sensitive, but does not take account of
+text properties---it compares only the characters in the strings.
+A unibyte string never equals a multibyte string unless the
+contents are entirely @sc{ASCII} (@pxref{Text Representations}).
@example
@group