\chapter{Batch compilation (ocamlc)} \label{c:camlc}
\pdfchapter{Batch compilation (ocamlc)}
%HEVEA\cutname{comp.html}

This chapter describes the OCaml batch compiler "ocamlc",
which compiles OCaml source files to bytecode object files and links
these object files to produce standalone bytecode executable files.
These executable files are then run by the bytecode interpreter
"ocamlrun".

\section{Overview of the compiler}

The "ocamlc" command has a command-line interface similar to that of
most C compilers. It accepts several types of arguments and processes them
sequentially:

\begin{itemize}
\item
Arguments ending in ".mli" are taken to be source files for
compilation unit interfaces. Interfaces specify the names exported by
compilation units: they declare value names with their types, define
public data types, declare abstract data types, and so on. From the
file \var{x}".mli", the "ocamlc" compiler produces a compiled interface
in the file \var{x}".cmi".

\item
Arguments ending in ".ml" are taken to be source files for compilation
unit implementations. Implementations provide definitions for the
names exported by the unit, and also contain expressions to be
evaluated for their side-effects.  From the file \var{x}".ml", the "ocamlc"
compiler produces compiled object bytecode in the file \var{x}".cmo".

If the interface file \var{x}".mli" exists, the implementation
\var{x}".ml" is checked against the corresponding compiled interface
\var{x}".cmi", which is assumed to exist. If no interface
\var{x}".mli" is provided, the compilation of \var{x}".ml" produces a
compiled interface file \var{x}".cmi" in addition to the compiled
object code file \var{x}".cmo". The file \var{x}".cmi" produced
corresponds to an interface that exports everything that is defined in
the implementation \var{x}".ml".

\item
Arguments ending in ".cmo" are taken to be compiled object bytecode.  These
files are linked together, along with the object files obtained
by compiling ".ml" arguments (if any), and the OCaml standard
library, to produce a standalone executable program. The order in
which ".cmo" and ".ml" arguments are presented on the command line is
relevant: compilation units are initialized in that order at
run-time, and it is a link-time error to use a component of a unit
before having initialized it. Hence, a given \var{x}".cmo" file must come
before all ".cmo" files that refer to the unit \var{x}.

\item
Arguments ending in ".cma" are taken to be libraries of object bytecode.
A library of object bytecode packs in a single file a set of object
bytecode files (".cmo" files). Libraries are built with "ocamlc -a"
(see the description of the "-a" option below). The object files
contained in the library are linked as regular ".cmo" files (see
above), in the order specified when the ".cma" file was built. The
only difference is that if an object file contained in a library is
not referenced anywhere in the program, then it is not linked in.

\item
Arguments ending in ".c" are passed to the C compiler, which generates
a ".o" object file (".obj" under Windows). This object file is linked
with the program if the "-custom" flag is set (see the description of
"-custom" below).

\item
Arguments ending in ".o" or ".a" (".obj" or ".lib" under Windows)
are assumed to be C object files and libraries. They are passed to the
C linker when linking in "-custom" mode (see the description of
"-custom" below).

\item
Arguments ending in ".so" (".dll" under Windows)
are assumed to be C shared libraries (DLLs).  During linking, they are
searched for external C functions referenced from the OCaml code,
and their names are written in the generated bytecode executable.
The run-time system "ocamlrun" then loads them dynamically at program
start-up time.

\end{itemize}
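
To make the roles of interface and implementation files concrete, here
is a minimal sketch in which a signature and a structure stand in for a
hypothetical pair "counter.mli" and "counter.ml" (the names "Counter"
and "next" are illustrative assumptions, not part of the distribution).
It is written as a single file so that it can be compiled on its own:
\begin{verbatim}
    (* Stands in for counter.mli: the interface declares what is exported. *)
    module Counter : sig
      val next : unit -> int   (* successive integers, starting at 1 *)
    end = struct
      (* Stands in for counter.ml: the implementation defines the names. *)
      let count = ref 0        (* hidden: not declared in the interface *)
      let next () = incr count; !count
    end

    (* Stands in for a client unit, e.g. main.ml. *)
    let () =
      let a = Counter.next () in
      let b = Counter.next () in
      Printf.printf "%d %d\n" a b
\end{verbatim}
Split into separate files, the same program would presumably be built
with "ocamlc -c counter.mli", "ocamlc -c counter.ml",
"ocamlc -c main.ml", and finally "ocamlc counter.cmo main.cmo", with
"counter.cmo" listed first because "main.cmo" refers to it.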

The output of the linking phase is a file containing compiled bytecode
that can be executed by the OCaml bytecode interpreter:
the command named "ocamlrun". If "a.out" is the name of the file
produced by the linking phase, the command
\begin{alltt}
        ocamlrun a.out \nth{arg}{1} \nth{arg}{2} \ldots \nth{arg}{n}
\end{alltt}
executes the compiled code contained in "a.out", passing it as
arguments the character strings \nth{arg}{1} to \nth{arg}{n}.
(See chapter~\ref{c:runtime} for more details.)

On most systems, the file produced by the linking
phase can be run directly, as in:
\begin{alltt}
        ./a.out \nth{arg}{1} \nth{arg}{2} \ldots \nth{arg}{n}
\end{alltt}
The produced file has the executable bit set, and it manages to launch
the bytecode interpreter by itself.

\section{Options}\label{s:comp-options}

The following command-line options are recognized by "ocamlc".
The options "-pack", "-a", "-c" and "-output-obj" are mutually exclusive.


\begin{options}

\item["-a"]
Build a library (".cma" file) with the object files (".cmo" files)
given on the command line, instead of linking them into an executable
file. The name of the library must be set with the "-o" option.

If "-custom", "-cclib" or "-ccopt" options are passed on the command
line, these options are stored in the resulting ".cma" library.  Then,
linking with this library automatically adds back the "-custom",
"-cclib" and "-ccopt" options as if they had been provided on the
command line, unless the "-noautolink" option is given.

\item["-absname"]
Force error messages to show absolute paths for file names.

\item["-annot"]
Dump detailed information about the compilation (types, bindings,
tail-calls, etc).  The information for file \var{src}".ml"
is put into file \var{src}".annot".  In case of a type error, dump
all the information inferred by the type-checker before the error.
The \var{src}".annot" file can be used with the Emacs commands given in
"emacs/caml-types.el" to display types and other annotations
interactively.

\item["-bin-annot"]
Dump detailed information about the compilation (types, bindings,
tail-calls, etc) in binary format. The information for file \var{src}".ml"
is put into file \var{src}".cmt".  In case of a type error, dump
all the information inferred by the type-checker before the error.
The "*.cmt" files produced by "-bin-annot" contain more information
and are much more compact than the files produced by "-annot".

\item["-c"]
Compile only. Suppress the linking phase of the
compilation. Source code files are turned into compiled files, but no
executable file is produced. This option is useful to
compile modules separately.

\item["-cc" \var{ccomp}]
Use \var{ccomp} as the C linker when linking in ``custom runtime''
mode (see the "-custom" option)
and as the C compiler for compiling ".c" source files.

\item["-cclib" "-l"\var{libname}]
Pass the "-l"\var{libname} option to the C linker when linking in
``custom runtime'' mode (see the "-custom" option). This causes the
given C library to be linked with the program.

\item["-ccopt" \var{option}]
Pass the given option to the C compiler and linker. When linking in
``custom runtime'' mode, for instance,
"-ccopt -L"\var{dir} causes the C linker to search for C libraries in
directory \var{dir}.   (See the "-custom" option.)

\item["-color" \var{mode}]
Enable or disable colors in compiler messages (especially warnings and errors).
The following modes are supported:
\begin{description}
  \item["auto"] use heuristics to enable colors only if the output supports them (an ANSI-compatible tty terminal);
  \item["always"] enable colors unconditionally;
  \item["never"] disable color output.
\end{description}
The default setting is "auto", and the current heuristic
checks that the "TERM" environment variable exists and is
not empty or "dumb", and that \verb!isatty(stderr)! holds.


\item["-compat-32"]
Check that the generated bytecode executable can run on 32-bit
platforms and signal an error if it cannot. This is useful when
compiling bytecode on a 64-bit machine.

\item["-config"]
Print the version number of "ocamlc" and a detailed summary of its
configuration, then exit.

\item["-custom"]
Link in ``custom runtime'' mode. In the default linking mode, the
linker produces bytecode that is intended to be executed with the
shared runtime system, "ocamlrun". In the custom runtime mode, the
linker produces an output file that contains both the runtime system
and the bytecode for the program. The resulting file is larger, but it
can be executed directly, even if the "ocamlrun" command is not
installed. Moreover, the ``custom runtime'' mode enables static
linking of OCaml code with user-defined C functions, as described in
chapter~\ref{c:intf-c}.
\begin{unix}
Never use the "strip" command on executables produced by "ocamlc -custom":
this would remove the bytecode part of the executable.
\end{unix}

\item["-dllib" "-l"\var{libname}]
Arrange for the C shared library "dll"\var{libname}".so"
("dll"\var{libname}".dll" under Windows) to be loaded dynamically
by the run-time system "ocamlrun" at program start-up time.

\item["-dllpath" \var{dir}]
Add the directory \var{dir} to the run-time search path for shared
C libraries.  At link-time, shared libraries are searched in the
standard search path (the one corresponding to the "-I" option).
The "-dllpath" option simply stores \var{dir} in the produced
executable file, where "ocamlrun" can find it and use it as
described in section~\ref{s-ocamlrun-dllpath}.

\item["-for-pack" \var{module-path}]
Generate an object file (".cmo") that can later be included
as a sub-module (with the given access path) of a compilation unit
constructed with "-pack".  For instance, "ocamlc -for-pack P -c a.ml"
will generate "a.cmo" that can later be used with
"ocamlc -pack -o P.cmo a.cmo".
Note: you can still pack a module that was compiled without
"-for-pack", but in this case exceptions will be printed with the
wrong names.

\item["-g"]
Add debugging information while compiling and linking. This option is
required in order to be able to debug the program with "ocamldebug"
(see chapter~\ref{c:debugger}), and to produce stack backtraces when
the program terminates on an uncaught exception (see
section~\ref{ocamlrun-options}).

\item["-i"]
Cause the compiler to print all defined names (with their inferred
types or their definitions) when compiling an implementation (".ml"
file).  No compiled files (".cmo" and ".cmi" files) are produced.
This can be useful to check the types inferred by the
compiler. Also, since the output follows the syntax of interfaces, it
can help in writing an explicit interface (".mli" file) for a file:
just redirect the standard output of the compiler to a ".mli" file,
and edit that file to remove all declarations of unexported names.

\item["-I" \var{directory}]
Add the given directory to the list of directories searched for
compiled interface files (".cmi"), compiled object code files
(".cmo"), libraries (".cma"), and C libraries specified with
"-cclib -lxxx".  By default, the current directory is
searched first, then the standard library directory. Directories added
with "-I" are searched after the current directory, in the order in
which they were given on the command line, but before the standard
library directory. See also option "-nostdlib".

If the given directory starts with "+", it is taken relative to the
standard library directory.  For instance, "-I +labltk" adds the
subdirectory "labltk" of the standard library to the search path.

\item["-impl" \var{filename}]
Compile the file \var{filename} as an implementation file, even if its
extension is not ".ml".

\item["-intf" \var{filename}]
Compile the file \var{filename} as an interface file, even if its
extension is not ".mli".

\item["-intf-suffix" \var{string}]
Recognize file names ending with \var{string} as interface files
(instead of the default ".mli").

\item["-labels"]
Labels are not ignored in types, labels may be used in applications,
and labelled parameters can be given in any order.  This is the default.

\item["-linkall"]
Force all modules contained in libraries to be linked in. If this
flag is not given, unreferenced modules are not linked in. When
building a library (option "-a"), setting the "-linkall" option forces all
subsequent links of programs involving that library to link all the
modules contained in the library.

\item["-make-runtime"]
Build a custom runtime system (in the file specified by option "-o")
incorporating the C object files and libraries given on the command
line.  This custom runtime system can be used later to execute
bytecode executables produced with the
"ocamlc -use-runtime" \var{runtime-name} option.
See section~\ref{s:custom-runtime} for more information.

\item["-no-alias-deps"]
Do not record dependencies for module aliases. See
section~\ref{s:module-alias} for more information.

\item["-no-app-funct"]
Deactivate the applicative behaviour of functors. With this option,
each functor application generates new types in its result and
applying the same functor twice to the same argument yields two
incompatible structures.

\item["-noassert"]
Do not compile assertion checks.  Note that the special form
"assert false" is always compiled because it is typed specially.
This flag has no effect when linking already-compiled files.

\item["-noautolink"]
When linking ".cma" libraries, ignore "-custom", "-cclib" and "-ccopt"
options potentially contained in the libraries (if these options were
given when building the libraries).  This can be useful if a library
contains incorrect specifications of C libraries or C options; in this
case, during linking, set "-noautolink" and pass the correct C
libraries and options on the command line.

\item["-nolabels"]
Ignore non-optional labels in types. Labels cannot be used in
applications, and parameter order becomes strict.

\item["-nostdlib"]
Do not include the standard library directory in the list of
directories searched for
compiled interface files (".cmi"), compiled object code files
(".cmo"), libraries (".cma"), and C libraries specified with
"-cclib -lxxx". See also option "-I".

\item["-o" \var{exec-file}]
Specify the name of the output file produced by the compiler. The
default output name is "a.out" under Unix and "camlprog.exe" under
Windows. If the "-a" option is given, specify the name of the library
produced.  If the "-pack" option is given, specify the name of the
packed object file produced.  If the "-output-obj" option is given,
specify the name of the output file produced.  If the "-c" option is
given, specify the name of the object file produced for the {\em next}
source file that appears on the command line.

\item["-open" \var{Module}]
Open the given module before processing the interface or
implementation files. If several "-open" options are given,
they are processed in order, just as if
the statements "open!" \var{Module1}";;" "..." "open!" \var{ModuleN}";;"
were added at the top of each file.

\item["-output-obj"]
Cause the linker to produce a C object file instead of a bytecode
executable file. This is useful to wrap OCaml code as a C library,
callable from any C program. See chapter~\ref{c:intf-c},
section~\ref{s:embedded-code}. The name of the output object file
must be set with the "-o" option. This
option can also be used to produce a C source file (".c" extension) or
a compiled shared/dynamic library (".so" extension, ".dll" under Windows).

\item["-pack"]
Build a bytecode object file (".cmo" file) and its associated compiled
interface (".cmi") that combines the object
files given on the command line, making them appear as sub-modules of
the output ".cmo" file.  The name of the output ".cmo" file must be
given with the "-o" option.  For instance,
\begin{verbatim}
        ocamlc -pack -o p.cmo a.cmo b.cmo c.cmo
\end{verbatim}
generates compiled files "p.cmo" and "p.cmi" describing a compilation
unit having three sub-modules "A", "B" and "C", corresponding to the
contents of the object files "a.cmo", "b.cmo" and "c.cmo".  These
contents can be referenced as "P.A", "P.B" and "P.C" in the remainder
of the program.

\item["-pp" \var{command}]
Cause the compiler to call the given \var{command} as a preprocessor
for each source file. The output of \var{command} is redirected to
an intermediate file, which is compiled. If there are no compilation
errors, the intermediate file is deleted afterwards.

\item["-ppx" \var{command}]
After parsing, pipe the abstract syntax tree through the preprocessor
\var{command}. The module "Ast_mapper", described in
chapter~\ref{Ast-underscoremapper}, implements the external interface
of a preprocessor.

\item["-principal"]
Check information path during type-checking, to make sure that all
types are derived in a principal way.  When using labelled arguments
and/or polymorphic methods, this flag is required to ensure future
versions of the compiler will be able to infer types correctly, even
if internal algorithms change.
All programs accepted in "-principal" mode are also accepted in the
default mode with equivalent types, but with different binary signatures.
This mode may slow down type checking, yet it is a good idea to
use it once before publishing source code.

\item["-rectypes"]
Allow arbitrary recursive types during type-checking.  By default,
only recursive types where the recursion goes through an object type
are supported. Note that once you have created an interface using this
flag, you must use it again for all dependencies.

\item["-runtime-variant" \var{suffix}]
Add the \var{suffix} string to the name of the runtime library used by
the program.  Currently, only one such suffix is supported: "d", and
only if the OCaml compiler was configured with option
"-with-debug-runtime".  This suffix gives the debug version of the
runtime, which is useful for debugging pointer problems in low-level
code such as C stubs.

\item["-safe-string"]
Enforce the separation between types "string" and "bytes",
thereby making strings read-only. This will become the default in
a future version of OCaml.

\item["-short-paths"]
When a type is visible under several module-paths, use the shortest
one when printing the type's name in inferred interfaces and error and
warning messages. Identifier names starting with an underscore "_" or
containing double underscores "__" incur a penalty of $+10$ when computing
their length.

\item["-strict-sequence"]
Force the left-hand part of each sequence to have type unit.

\item["-strict-formats"]
Reject invalid formats that were accepted in legacy format
implementations. You should use this flag to detect and fix such
invalid formats, as they will be rejected by future OCaml versions.

\item["-thread"]
Compile or link multithreaded programs, in combination with the
system "threads" library described in chapter~\ref{c:threads}.

\item["-unsafe"]
Turn bounds checking off for array and string accesses (the "v.(i)" and
"s.[i]" constructs). Programs compiled with "-unsafe" are therefore
slightly faster, but unsafe: anything can happen if the program
accesses an array or string outside of its bounds.

\item["-unsafe-string"]
Identify the types "string" and "bytes",
thereby making strings writable. For reasons of backward compatibility,
this is the default setting for the moment, but this will change in a future
version of OCaml.

\item["-use-runtime" \var{runtime-name}]
Generate a bytecode executable file that can be executed on the custom
runtime system \var{runtime-name}, built earlier with
"ocamlc -make-runtime" \var{runtime-name}.
See section~\ref{s:custom-runtime} for more information.

\item["-v"]
Print the version number of the compiler and the location of the
standard library directory, then exit.

\item["-verbose"]
Print all external commands before they are executed, in particular
invocations of the C compiler and linker in "-custom" mode.  Useful to
debug C library problems.

\item["-vmthread"]
Compile or link multithreaded programs, in combination with the
VM-level "threads" library described in chapter~\ref{c:threads}.

\item["-version" or "-vnum"]
Print the version number of the compiler in short form (e.g. "3.11.0"),
then exit.

\item["-w" \var{warning-list}]
Enable, disable, or mark as fatal the warnings specified by the argument
\var{warning-list}.
Each warning can be {\em enabled} or {\em disabled}, and each warning
can be {\em fatal} or {\em non-fatal}.
If a warning is disabled, it isn't displayed and doesn't affect
compilation in any way (even if it is fatal).  If a warning is
enabled, it is displayed normally by the compiler whenever the source
code triggers it.  If it is enabled and fatal, the compiler will also
stop with an error after displaying it.

The \var{warning-list} argument is a sequence of warning specifiers,
with no separators between them.  A warning specifier is one of the
following:

\begin{options}
\item["+"\var{num}] Enable warning number \var{num}.
\item["-"\var{num}] Disable warning number \var{num}.
\item["@"\var{num}] Enable and mark as fatal warning number \var{num}.
\item["+"\var{num1}..\var{num2}] Enable warnings in the given range.
\item["-"\var{num1}..\var{num2}] Disable warnings in the given range.
\item["@"\var{num1}..\var{num2}] Enable and mark as fatal warnings in
the given range.
\item["+"\var{letter}] Enable the set of warnings corresponding to
\var{letter}. The letter may be uppercase or lowercase.
\item["-"\var{letter}] Disable the set of warnings corresponding to
\var{letter}. The letter may be uppercase or lowercase.
\item["@"\var{letter}] Enable and mark as fatal the set of warnings
corresponding to \var{letter}. The letter may be uppercase or
lowercase.
\item[\var{uppercase-letter}] Enable the set of warnings corresponding
to \var{uppercase-letter}.
\item[\var{lowercase-letter}] Disable the set of warnings corresponding
to \var{lowercase-letter}.
\end{options}

Warning numbers and letters which are out of the range of warnings
that are currently defined are ignored. The warnings are as follows.
\begin{options}
\input{warnings-help.tex}
\end{options}
Some warnings are described in more detail in section~\ref{s:comp-warnings}.

The default setting is "-w +a-4-6-7-9-27-29-32..39-41..42-44-45".
It is displayed by "ocamlc -help".
Note that warnings 5 and 10 are not always triggered, depending on
the internals of the type checker.

\item["-warn-error" \var{warning-list}]
Mark as fatal the warnings specified in the argument \var{warning-list}.
The compiler will stop with an error when one of these warnings is
emitted. The \var{warning-list} has the same meaning as for
the "-w" option: a "+" sign (or an uppercase letter) marks the
corresponding warnings as fatal, a "-"
sign (or a lowercase letter) turns them back into non-fatal warnings, and a
"@" sign both enables and marks as fatal the corresponding warnings.

Note: it is not recommended to use warning sets (i.e. letters) as
arguments to "-warn-error"
in production code, because this can break your build when future versions
of OCaml add some new warnings.

The default setting is "-warn-error -a" (all warnings are non-fatal).

\item["-warn-help"]
Show the description of all available warning numbers.

\item["-where"]
Print the location of the standard library, then exit.

\item["-" \var{file}]
Process \var{file} as a file name, even if it starts with a dash ("-")
character.

\item["-help" or "--help"]
Display a short usage summary and exit.
%
\end{options}
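
As an illustration of the "-w" and "-warn-error" options, the following
sketch triggers two warnings that belong to the default set. The type
"color" and the function "name" are hypothetical:
\begin{verbatim}
    type color = Red | Green | Blue

    (* Warning 8 (partial match): the case Blue is not handled. *)
    let name = function
      | Red -> "red"
      | Green -> "green"

    let () =
      (* Warning 26 (unused variable): unused is bound but never used. *)
      let unused = 42 in
      print_endline (name Red)
\end{verbatim}
With the default settings, both warnings should be displayed and
compilation proceeds. Passing "-w -8-26" disables them, whereas
"-w @8" or "-warn-error +8" turns the partial match into an error.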

\noindent
On native Windows, the following environment variable is also consulted:

\begin{options}
\item["OCAML_FLEXLINK"]  Alternative executable to use instead of the
configured value. Primarily used for bootstrapping.
\end{options}
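
Among the command-line options listed earlier in this section, the
effect of "-noassert" and "-unsafe" can be observed with a small sketch
such as the following (the array "v" is only an illustration):
\begin{verbatim}
    let v = [| 1; 2; 3 |]

    let () =
      (* Compiled normally, this assertion fails and raises Assert_failure;
         under -noassert the check is not compiled at all. *)
      (try assert (Array.length v = 0)
       with Assert_failure _ -> print_endline "assertion checked");
      (* Compiled normally, the access below is bounds-checked and raises
         Invalid_argument; under -unsafe anything can happen. *)
      (try ignore v.(3)
       with Invalid_argument _ -> print_endline "bounds checked")
\end{verbatim}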

\section{Modules and the file system}

This short section is intended to clarify the relationship between the
names of the modules corresponding to compilation units and the names
of the files that contain their compiled interface and compiled
implementation.

The compiler always derives the module name by taking the capitalized
base name of the source file (".ml" or ".mli" file).  That is, it
strips the leading directory name, if any, as well as the ".ml" or
".mli" suffix; then, it sets the first letter to uppercase, in order to
comply with the requirement that module names must be capitalized.
For instance, compiling the file "mylib/misc.ml" provides an
implementation for the module named "Misc". Other compilation units
may refer to components defined in "mylib/misc.ml" under the names
"Misc."\var{name}; they can also do "open Misc", then use unqualified
names \var{name}.
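
This naming rule can be sketched in a single file by letting a
sub-module stand in for the compilation unit (the value "greeting" is a
hypothetical component of "mylib/misc.ml"):
\begin{verbatim}
    (* Stands in for the unit obtained by compiling mylib/misc.ml;
       the compiler would name it Misc after the file's base name. *)
    module Misc = struct
      let greeting = "hello"
    end

    (* Another unit can use the qualified name ... *)
    let () = print_endline Misc.greeting

    (* ... or open the module and use the unqualified name. *)
    open Misc
    let () = print_endline greeting
\end{verbatim}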

The ".cmi" and ".cmo" files produced by the compiler have the same
base name as the source file. Hence, the compiled files always have
their base name equal (modulo capitalization of the first letter) to
the name of the module they describe (for ".cmi" files) or implement
(for ".cmo" files).

When the compiler encounters a reference to a free module identifier
"Mod", it looks in the search path for a file named "Mod.cmi" or "mod.cmi"
and loads the compiled interface
contained in that file. As a consequence, renaming ".cmi" files is not
advised: the name of a ".cmi" file must always correspond to the name
of the compilation unit it describes. It is admissible to move them
to another directory, if their base name is preserved, and the correct
"-I" options are given to the compiler. The compiler will flag an
error if it loads a ".cmi" file that has been renamed.

Compiled bytecode files (".cmo" files), on the other hand, can be
freely renamed once created. That's because the linker never attempts
to find by itself the ".cmo" file that implements a module with a
given name: it relies instead on the user providing the list of ".cmo"
files by hand.

\section{Common errors} \label{s:comp-errors}

This section describes and explains the most frequently encountered
error messages.

\begin{options}

\item[Cannot find file \var{filename}]
The named file could not be found in the current directory, nor in the
directories of the search path. The \var{filename} is either a
compiled interface file (".cmi" file), or a compiled bytecode file
(".cmo" file). If \var{filename} has the format \var{mod}".cmi", this
means you are trying to compile a file that references identifiers
from module \var{mod}, but you have not yet compiled an interface for
module \var{mod}. Fix: compile \var{mod}".mli" or \var{mod}".ml"
first, to create the compiled interface \var{mod}".cmi".

If \var{filename} has the format \var{mod}".cmo", this
means you are trying to link a bytecode object file that does not
exist yet. Fix: compile \var{mod}".ml" first.

If your program spans several directories, this error can also appear
because you haven't specified the directories to look into. Fix: add
the correct "-I" options to the command line.

\item[Corrupted compiled interface \var{filename}]
The compiler produces this error when it tries to read a compiled
interface file (".cmi" file) that has the wrong structure. This means
something went wrong when this ".cmi" file was written: the disk was
full, the compiler was interrupted in the middle of the file creation,
and so on. This error can also appear if a ".cmi" file is modified after
its creation by the compiler. Fix: remove the corrupted ".cmi" file,
and rebuild it.

\item[This expression has type \nth{t}{1}, but is used with type \nth{t}{2}]
This is by far the most common type error in programs. Type \nth{t}{1} is
the type inferred for the expression (the part of the program that is
displayed in the error message), by looking at the expression itself.
Type \nth{t}{2} is the type expected by the context of the expression; it
is deduced by looking at how the value of this expression is used in
the rest of the program. If the two types \nth{t}{1} and \nth{t}{2} are not
compatible, then the error above is produced.

In some cases, it is hard to understand why the two types \nth{t}{1} and
\nth{t}{2} are incompatible. For instance, the compiler can report that
``expression of type "foo" cannot be used with type "foo"'', and it
really seems that the two types "foo" are compatible. This is not
always true. Two type constructors can have the same name, but
actually represent different types. This can happen if a type
constructor is redefined. Example:
\begin{verbatim}
        type foo = A | B
        let f = function A -> 0 | B -> 1
        type foo = C | D
        f C
\end{verbatim}
This results in the error message ``expression "C" of type "foo" cannot
be used with type "foo"''.

\item[The type of this expression, \var{t}, contains type variables
      that cannot be generalized]
Type variables ("'a", "'b", \ldots) in a type \var{t} can be in either
of two states: generalized (which means that the type \var{t} is valid
for all possible instantiations of the variables) and not generalized
(which means that the type \var{t} is valid only for one instantiation
of the variables). In a "let" binding "let "\var{name}" = "\var{expr},
the type-checker normally generalizes as many type variables as
possible in the type of \var{expr}. However, this leads to unsoundness
(a well-typed program can crash) in conjunction with polymorphic
mutable data structures. To avoid this, generalization is performed at
"let" bindings only if the bound expression \var{expr} belongs to the
class of ``syntactic values'', which includes constants, identifiers,
functions, tuples of syntactic values, etc. In all other cases (for
instance, when \var{expr} is a function application), a polymorphic
mutable data structure could have been created, and generalization is
therefore turned off for all variables occurring in contravariant or
non-variant branches of the type. For instance, if the type of a
non-value is "'a list", the variable is generalizable ("list" is a
covariant type constructor), but it is not generalizable in
"'a list -> 'a list" (the left branch of "->" is contravariant) or in
"'a ref" ("ref" is non-variant).
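
The variance rule above can be illustrated by the following sketch (the
names "empty" and "cell" are chosen for illustration only). Both
definitions bind the result of a function application, but only the
first one is generalized. Depending on the compiler version, the
non-generalized variable is displayed as "'_a" or "'_weak1".
\begin{verbatim}
    (* 'a list: generalized, because list is covariant *)
    let empty = List.rev []
    let ints = 1 :: empty
    let strings = "a" :: empty   (* accepted: empty is polymorphic *)

    (* '_weak1 list ref: not generalized, because ref is non-variant *)
    let cell = ref []
    let () = cell := [1]         (* fixes the type to int list ref *)
    (* let () = cell := ["a"]       would now be rejected *)
\end{verbatim}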

Non-generalized type variables in a type cause no difficulties inside
a given structure or compilation unit (the contents of a ".ml" file,
or an interactive session), but they cannot be allowed inside
signatures or in compiled interfaces (".cmi" files), because they
could be used inconsistently later. Therefore, the compiler
flags an error when a structure or compilation unit defines a value
\var{name} whose type contains non-generalized type variables. There
are two ways to fix this error:
\begin{itemize}
\item Add a type constraint or a ".mli" file to give a monomorphic
type (without type variables) to \var{name}. For instance, instead of
writing
\begin{verbatim}
    let sort_int_list = Sort.list (<)
    (* inferred type 'a list -> 'a list, with 'a not generalized *)
\end{verbatim}
write
\begin{verbatim}
    let sort_int_list = (Sort.list (<) : int list -> int list);;
\end{verbatim}
\item If you really need \var{name} to have a polymorphic type, turn
its defining expression into a function by adding an extra parameter.
For instance, instead of writing
\begin{verbatim}
    let map_length = List.map Array.length
    (* inferred type 'a array list -> int list, with 'a not generalized *)
\end{verbatim}
write
\begin{verbatim}
    let map_length lv = List.map Array.length lv
\end{verbatim}
\end{itemize}

\item[Reference to undefined global \var{mod}]
This error appears when trying to link an incomplete or incorrectly
ordered set of files. There are two possible causes. Either you have
forgotten to provide an implementation for the compilation unit named
\var{mod} on the command line (typically, the file named \var{mod}".cmo",
or a library containing that file). Fix: add the missing ".ml" or ".cmo"
file to the command line. Or you have provided an implementation for the
module named \var{mod}, but it comes too late on the command line: the
implementation of \var{mod} must come before all bytecode object files
that reference \var{mod}. Fix: change the order of ".ml" and ".cmo"
files on the command line.

Of course, you will always encounter this error if you have mutually
recursive functions across modules. That is, function "Mod1.f" calls
function "Mod2.g", and function "Mod2.g" calls function "Mod1.f".
In this case, no matter what permutations you perform on the command
line, the program will be rejected at link-time. Fixes:
\begin{itemize}
\item Put "f" and "g" in the same module.
\item Parameterize one function by the other.
That is, instead of having
\begin{verbatim}
mod1.ml:    let f x = ... Mod2.g ...
mod2.ml:    let g y = ... Mod1.f ...
\end{verbatim}
define
\begin{verbatim}
mod1.ml:    let f g x = ... g ...
mod2.ml:    let rec g y = ... Mod1.f g ...
\end{verbatim}
and link "mod1.cmo" before "mod2.cmo".
\item Use a reference to hold one of the two functions, as in:
\begin{verbatim}
mod1.ml:    let forward_g =
                ref((fun x -> failwith "forward_g") : <type>)
            let f x = ... !forward_g ...
mod2.ml:    let g y = ... Mod1.f ...
            let _ = Mod1.forward_g := g
\end{verbatim}
\end{itemize}
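
As an illustration, the reference-based fix can be sketched in a single
file, using two submodules to stand in for "mod1.ml" and "mod2.ml". The
parity functions and the type "int -> bool" are assumptions made for
the example; they are not part of the scheme itself.
\begin{verbatim}
module Mod1 = struct
  (* Placeholder that fails until Mod2 installs the real function. *)
  let forward_g : (int -> bool) ref =
    ref (fun _ -> failwith "forward_g")
  (* f tests evenness and calls g through the reference. *)
  let f n = if n = 0 then true else !forward_g (n - 1)
end

module Mod2 = struct
  (* g tests oddness and calls Mod1.f directly. *)
  let g n = if n = 0 then false else Mod1.f (n - 1)
  (* Tie the knot once both functions are defined. *)
  let () = Mod1.forward_g := g
end
\end{verbatim}
With this layout, "Mod1.f" must not be called before the initialization
code of "Mod2" has run; otherwise the placeholder raises "Failure".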

\item[The external function \var{f} is not available]
This error appears when trying to link code that calls external
functions written in C.  As explained in
chapter~\ref{c:intf-c}, such code must be linked with C libraries that
implement the required \var{f} C function.  If the C libraries in
question are not shared libraries (DLLs), the code must be linked in
``custom runtime'' mode.  Fix: add the required C libraries to the
command line, and possibly the "-custom" option.

\end{options}

\section{Warning reference} \label{s:comp-warnings}

This section explains some warnings in detail:

\begin{options}
\item[Warning 52: fragile constant pattern]

  Some constructors, such as the exception constructors "Failure" and
  "Invalid_argument", take as parameter a "string" value holding
  a text message intended for the user.

  These text messages are usually not stable over time: call sites
  building these constructors may refine the message in a future
  version to make it more explicit, etc. Therefore, it is dangerous to
  match over the precise value of the message. For example, until
  OCaml 4.02, "Array.iter2" would raise the exception
\begin{verbatim}
  Invalid_argument "arrays must have the same length"
\end{verbatim}
  Since 4.03 it raises the more helpful message
\begin{verbatim}
  Invalid_argument "Array.iter2: arrays must have the same length"
\end{verbatim}
  but this means that any code of the form
\begin{verbatim}
  try ...
  with Invalid_argument "arrays must have the same length" -> ...
\end{verbatim}
  is now broken and may suffer from uncaught exceptions.

  Warning 52 is there to prevent users from writing such fragile code
  in the first place. It is not raised for every match on a literal
  string, but only when library authors have expressed
  their intent to possibly change the constructor parameter value in
  the future, by using the attribute "ocaml.warn_on_literal_pattern"
  (see the manual section on builtin attributes in
  \ref{ss:builtin-attributes}):
\begin{verbatim}
  type t =
    | Foo of string [@ocaml.warn_on_literal_pattern]
    | Bar of string

  let no_warning = function
    | Bar "specific value" -> 0
    | _ -> 1

  let warning = function
    | Foo "specific value" -> 0
    | _ -> 1

>    | Foo "specific value" -> 0
>          ^^^^^^^^^^^^^^^^
> Warning 52: the argument of this constructor should not be matched against a
> constant pattern; the actual value of the argument could change
> in the future.
\end{verbatim}

  If your code raises this warning, you should {\em not} change the
  way you test for the specific string to avoid the warning (for
  example using a string equality inside the right-hand-side instead
  of a literal pattern), as your code would remain fragile. You should
  instead enlarge the scope of the pattern by matching on all possible
  values. This may require some care: if the scrutinee may return
  several different cases of the same pattern, or raise distinct
  instances of the same exception, you may need to modify your code to
  separate those several cases.

  For example,
\begin{verbatim}
try (int_of_string count_str, bool_of_string choice_str) with
  | Failure "int_of_string" -> (0, true)
  | Failure "bool_of_string" -> (-1, false)
\end{verbatim}
  should be rewritten into more atomic tests. For example,
  using the "exception" patterns documented in Section~\ref{s:exception-match},
  one can write:
\begin{verbatim}
match int_of_string count_str with
  | exception (Failure _) -> (0, true)
  | count ->
    begin match bool_of_string choice_str with
    | exception (Failure _) -> (-1, false)
    | choice -> (count, choice)
    end
\end{verbatim}
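  Another possibility, sketched here under the assumption that the
  standard library functions "int_of_string_opt" and
  "bool_of_string_opt" are available (OCaml 4.05 and later), is to avoid
  exceptions altogether. The function name "parse" is hypothetical, and
  the fallback values are those of the example above:
\begin{verbatim}
let parse count_str choice_str =
  match int_of_string_opt count_str, bool_of_string_opt choice_str with
  | None, _ -> (0, true)               (* count is not an integer *)
  | Some _, None -> (-1, false)        (* choice is not a boolean *)
  | Some count, Some choice -> (count, choice)
\end{verbatim}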

\item[Warning 57: Ambiguous or-pattern variables under guard]
  The semantics of or-patterns in OCaml is specified with
  a left-to-right bias: a value \var{v} matches the pattern \var{p} "|" \var{q}
  if it matches \var{p} or \var{q}, but if it matches both,
  the environment captured by the match is the environment captured by
  \var{p}, never the one captured by \var{q}.

  While this property is generally intuitive, there is at least one specific
  case where a different semantics might be expected.
  Consider a pattern followed by a when-guard:
  "|"~\var{p}~"when"~\var{g}~"->"~\var{e}, for example:
\begin{verbatim}
     | ((Const x, _) | (_, Const x)) when is_neutral x -> branch
\end{verbatim}
  The semantics is clear:
  match the scrutinee against the pattern, if it matches, test the guard,
  and if the guard passes, take the branch.
  In particular, consider the input "(Const"~\var{a}", Const"~\var{b}")", where
  \var{a} fails the test "is_neutral"~\var{a}, while \var{b} passes the test
  "is_neutral"~\var{b}.  With the left-to-right semantics, the clause above is
  {\em not} taken for this input: matching "(Const"~\var{a}", Const"~\var{b}")"
  against the or-pattern succeeds in the left branch and returns the
  environment \var{x}~"->"~\var{a}; the guard
  "is_neutral"~\var{a} is then tested and fails, so the branch is not taken.

  However, another semantics may be considered more natural here:
  any pair that has one side passing the test will take the branch. With this
  semantics the previous code fragment would be equivalent to
\begin{verbatim}
     | (Const x, _) when is_neutral x -> branch
     | (_, Const x) when is_neutral x -> branch
\end{verbatim}
  This is {\em not} the semantics adopted by OCaml.

  Warning 57 is dedicated to these confusing cases where the
  specified left-to-right semantics is not equivalent to a non-deterministic
  semantics (any branch can be taken) relative to a specific guard.
  More precisely, it warns when the guard uses ``ambiguous'' variables, that is,
  variables bound to different parts of the scrutinee by different sides of an
  or-pattern.
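
  The difference between the two semantics can be observed with the
  following sketch. The type "expr", the definition of "is_neutral" and
  the two function names are assumptions made for the example. The first
  function is expected to trigger warning 57 when compiled.
\begin{verbatim}
type expr = Const of int | Var of string

let is_neutral x = (x = 0)

(* Left-to-right semantics: x is bound by the left alternative
   whenever that alternative matches. *)
let with_or_pattern = function
  | ((Const x, _) | (_, Const x)) when is_neutral x -> "branch"
  | _ -> "fallthrough"

(* Two separate clauses: each side gets its own guard test. *)
let with_split_clauses = function
  | (Const x, _) when is_neutral x -> "branch"
  | (_, Const x) when is_neutral x -> "branch"
  | _ -> "fallthrough"
\end{verbatim}
  On the input "(Const 1, Const 0)", "with_or_pattern" binds "x" to "1",
  the guard fails, and the result is the fallthrough case, whereas
  "with_split_clauses" takes the branch through its second clause.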
\end{options}