| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the other users of the fptools build system have migrated to
Cabal, and with the move to darcs we can now flatten the source tree
without losing history, so here goes.
The main change is that the ghc/ subdir is gone, and most of what it
contained is now at the top level. The build system now makes no
pretense at being multi-project, it is just the GHC build system.
No doubt this will break many things, and there will be a period of
instability while we fix the dependencies. A straightforward build
should work, but I haven't yet fixed binary/source distributions.
Changes to the Building Guide will follow, too.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for UTF-8 source files
GHC finally has support for full Unicode in source files. Source
files are now assumed to be UTF-8 encoded, and the full range of
Unicode characters can be used, with classifications recognised using
the implementation from Data.Char. This incedentally means that only
the stage2 compiler will recognise Unicode in source files, because I
was too lazy to port the unicode classifier code into libcompat.
Additionally, the following synonyms for keywords are now recognised:
forall symbol (U+2200) forall
right arrow (U+2192) ->
left arrow (U+2190) <-
horizontal ellipsis (U+22EF) ..
there are probably more things we could add here.
This will break some source files if Latin-1 characters are being used.
In most cases this should result in a UTF-8 decoding error. Later on
if we want to support more encodings (perhaps with a pragma to specify
the encoding), I plan to do it by recoding into UTF-8 before parsing.
Internally, there were some pretty big changes:
- FastStrings are now stored in UTF-8
- Z-encoding has been moved right to the back end. Previously we
used to Z-encode every identifier on the way in for simplicity,
and only decode when we needed to show something to the user.
Instead, we now keep every string in its UTF-8 encoding, and
Z-encode right before printing it out. To avoid Z-encoding the
same string multiple times, the Z-encoding is cached inside the
FastString the first time it is requested.
This speeds up the compiler - I've measured some definite
improvement in parsing at least, and I expect compilations overall
to be faster too. It also cleans up a lot of cruft from the
OccName interface. Z-encoding is nicely hidden inside the
Outputable instance for Names & OccNames now.
- StringBuffers are UTF-8 too, and are now represented as
ForeignPtrs.
- I've put together some test cases, not by any means exhaustive,
but there are some interesting UTF-8 decoding error cases that
aren't obvious. Also, take a look at unicode001.hs for a demo.
|
|
|
|
|
|
|
| |
small performance fix: in via-C mode we previously always created a
switch instead of an conditional-tree for a multi-branch case. Refine
this slightly so that 2-branch switches turn into conditionals again,
since gcc doesn't do a good job of optimising the equivalent switch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Relax the restrictions on conflicting packages. This should address
many of the traps that people have been falling into with the current
package story.
Now, a local module can shadow a module in an exposed package, as long
as the package is not otherwise required by the program. GHC checks
for conflicts when it knows the dependencies of the module being
compiled.
Also, we now check for module conflicts in exposed packages only when
importing a module: if an import can be satisfied from multiple
packages, that's an error. It's not possible to prevent GHC from
starting by installing packages now (unless you install another base
package).
It seems to be possible to confuse GHCi by having a local module
shadowing a package module that goes away and comes back again. I
think it's nearly right, but strange happenings have been observed.
I'll try to merge this into the STABLE branch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When generating a switch for:
case e of
0 -> A
1 -> B
instead of generating
if (e < 1) then goto A
B
generate
if (e >= 1) then goto B
A
because this helps the NCG to generate better code. In particular, if
e is a comparison, then we don't need to reverse the sense of the
comparison to eliminate the comparse against 1 (the NCG does try to
reverse the comparison, but floating-point comparisons can't be
reversed).
|
|
|
|
| |
tweaks to a (commented-out) trace message
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Flags cleanup.
Basically the purpose of this commit is to move more of the compiler's
global state into DynFlags, which is moving in the direction we need
to go for the GHC API which can have multiple active sessions
supported by a single GHC instance.
Before:
$ grep 'global_var' */*hs | wc -l
78
After:
$ grep 'global_var' */*hs | wc -l
27
Well, it's an improvement. Most of what's left won't really affect
our ability to host multiple sessions.
Lots of static flags have become dynamic flags (yay!). Notably lots
of flags that we used to think of as "driver" flags, like -I and -L,
are now dynamic. The most notable static flags left behind are the
"way" flags, eg. -prof. It would be nice to fix this, but it isn't
urgent.
On the way, lots of cleanup has happened. Everything related to
static and dynamic flags lives in StaticFlags and DynFlags
respectively, and they share a common command-line parser library in
CmdLineParser. The flags related to modes (--makde, --interactive
etc.) are now private to the front end: in fact private to Main
itself, for now.
|
|
|
|
|
|
| |
emitSwitch: if we're compiling via C, then always generate a switch
rather than an if-tree. This should work around brokenness in older
versions of GCC.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rationalise the BUILD,HOST,TARGET defines.
Recall that:
- build is the platform we're building on
- host is the platform we're running on
- target is the platform we're generating code for
The change is that now we take these definitions as applying from the
point of view of the particular source code being built, rather than
the point of view of the whole build tree.
For example, in RTS and library code, we were previously testing the
TARGET platform. But under the new rule, the platform on which this
code is going to run is the HOST platform. TARGET only makes sense in
the compiler sources.
In practical terms, this means that the values of BUILD, HOST & TARGET
may vary depending on which part of the build tree we are in.
Actual changes:
- new file: includes/ghcplatform.h contains platform defines for
the RTS and library code.
- new file: includes/ghcautoconf.h contains the autoconf settings
only (HAVE_BLAH). This is so that we can get hold of these
settings independently of the platform defines when necessary
(eg. in GHC).
- ghcconfig.h now #includes both ghcplatform.h and ghcautoconf.h.
- MachRegs.h, which is included into both the compiler and the RTS,
now has to cope with the fact that it might need to test either
_TARGET_ or _HOST_ depending on the context.
- the compiler's Makefile now generates
stage{1,2,3}/ghc_boot_platform.h
which contains platform defines for the compiler. These differ
depending on the stage, of course: in stage2, the HOST is the
TARGET of stage1. This was wrong before.
- The compiler doesn't get platform info from Config.hs any more.
Previously it did (sometimes), but unless we want to generate
a new Config.hs for each stage we can't do this.
- GHC now helpfully defines *_{BUILD,HOST}_{OS,ARCH} automatically
in CPP'd Haskell source.
- ghcplatform.h defines *_TARGET_* for backwards compatibility
(ghcplatform.h is included by ghcconfig.h, which is included by
config.h, so code which still #includes config.h will get the TARGET
settings as before).
- The Users's Guide is updated to mention *_HOST_* rather than
*_TARGET_*.
- coding-style.html in the commentary now contains a section on
platform defines. There are further doc updates to come.
Thanks to Wolfgang Thaller for pointing me in the right direction.
|
|
|
|
|
|
|
|
|
|
| |
Make the NCG distinguish between the read-only data section and the
"relocatable read-only data" section.
Read-only data is supposed to be _really_ read-only, whereas "relrodata"
can have relocations, but should not be modified by the program at runtime.
For Linux, put relrodata into ".data" by default, as the dynamic linker
tends to do evil things to avoid relocating things in read-only sections.
|
|
|
|
| |
Fix a bug in mk_switch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Further integration with the new package story. GHC now supports
pretty much everything in the package proposal.
- GHC now works in terms of PackageIds (<pkg>-<version>) rather than
just package names. You can still specify package names without
versions on the command line, as long as the name is unambiguous.
- GHC understands hidden/exposed modules in a package, and will refuse
to import a hidden module. Also, the hidden/eposed status of packages
is taken into account.
- I had to remove the old package syntax from ghc-pkg, backwards
compatibility isn't really practical.
- All the package.conf.in files have been rewritten in the new syntax,
and contain a complete list of modules in the package. I've set all
the versions to 1.0 for now - please check your package(s) and fix the
version number & other info appropriately.
- New options:
-hide-package P sets the expose flag on package P to False
-ignore-package P unregisters P for this compilation
For comparison, -package P sets the expose flag on package P
to True, and also causes P to be linked in eagerly.
-package-name is no longer officially supported. Unofficially, it's
a synonym for -ignore-package, which has more or less the same effect
as -package-name used to.
Note that a package may be hidden and yet still be linked into
the program, by virtue of being a dependency of some other package.
To completely remove a package from the compiler's internal database,
use -ignore-package.
The compiler will complain if any two packages in the
transitive closure of exposed packages contain the same
module.
You *must* use -ignore-package P when compiling modules for
package P, if package P (or an older version of P) is already
registered. The compiler will helpfully complain if you don't.
The fptools build system does this.
- Note: the Cabal library won't work yet. It still thinks GHC uses
the old package config syntax.
Internal changes/cleanups:
- The ModuleName type has gone away. Modules are now just (a
newtype of) FastStrings, and don't contain any package information.
All the package-related knowledge is in DynFlags, which is passed
down to where it is needed.
- DynFlags manipulation has been cleaned up somewhat: there are no
global variables holding DynFlags any more, instead the DynFlags
are passed around properly.
- There are a few less global variables in GHC. Lots more are
scheduled for removal.
- -i is now a dynamic flag, as are all the package-related flags (but
using them in {-# OPTIONS #-} is Officially Not Recommended).
- make -j now appears to work under fptools/libraries/. Probably
wouldn't take much to get it working for a whole build.
|
|
|
|
| |
Remove debugging trace that I left in when working on mk_switch.
|
|
|
|
| |
Oops, fix bugs in previous commit.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a couple of cases to mk_switch to catch the case when we have a
tag range that has a lot of default cases at either end, and we're not
using a single switch. In situations like this we want to eliminate
the default cases with an if-statement, before dealing with the rest
of the branches, which might then be suitable for a switch.
Also, ignore empty tag slots at either end of the range if there is no
default case.
This might work around a gcc 2.95 bug that we tickled with the
code being generated before.
|
|
|
|
|
|
|
|
|
|
|
| |
Simplify the "impossible branch" handling, and fix a bug in the
process. CmmSwitch encodes the possibility of having impossible
branches (the destinations are Maybe BlockId rather than just BlockId)
so we don't need to encode impossible branches as dummy blocks
containing a jump to an impossible location (currently 0).
However, PprC and PprCmm weren't set up to cope with Nothings in a
CmmSwitch, so this commit fixes that too.
|
|
|
|
|
| |
Give literal string labels a _str suffix, to make it less likely that
they'll clash with a symbol in scope in a C file.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-------------------------------
Use merge-sort not quicksort
Nuke quicksort altogether
-------------------------------
Quicksort has O(n**2) behaviour worst case, and this occasionally bites.
In particular, when compiling large files consisting only of static data,
we get loads of top-level delarations -- and that led to more than half the
total compile time being spent in the strongly connected component analysis
for the occurrence analyser. Switching to merge sort completely solved the
problem.
I've nuked quicksort altogether to make sure this does not happen again.
|
|
Merge backend-hacking-branch onto HEAD. Yay!
|