delta/haskell.git - gitlab.haskell.org: ghc/ghc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	ghc-boot: Fix bootstrapping	Ben Gamari	2022-12-21	1	-3/+6
\|
*	Add a Note summarising GHC's UTF-8 implementations	Ben Gamari	2022-07-22	1	-0/+5
\| \| \| \| \|	GHC has a somewhat dizzying array of UTF-8 implementations. This note describes why this is the case.
*	base: Introduce GHC.Encoding.UTF8	Ben Gamari	2022-07-22	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Here we copy a subset of the UTF-8 implementation living in `ghc-boot` into `base`, with the intent of dropping the former in the future. For this reason, the `ghc-boot` copy is now CPP-guarded on `MIN_VERSION_base(4,18,0)`. Naturally, we can't copy all of the functions defined by `ghc-boot` as some depend upon `bytestring`; we rather just copy those which only depend upon `base` and `ghc-prim`. Further consolidation? ---------------------- Currently GHC ships with at least five UTF-8 implementations: * the implementation used by GHC in `ghc-boot:GHC.Utils.Encoding`; this can be used at a number of types including `Addr#`, `ByteArray#`, `ForeignPtr`, `Ptr`, `ShortByteString`, and `ByteString`. Most of this can be removed in GHC 9.6+2, when the copies in `base` will become available to `ghc-boot`. * the copy of the `ghc-boot` definition now exported by `base:GHC.Encoding.UTF8`. This can be used at `Addr#`, `Ptr`, `ByteArray#`, and `ForeignPtr` * the decoder used by `unpackCStringUtf8#` in `ghc-prim:GHC.CString`; this is specialised at `Addr#`. * the codec used by the IO subsystem in `base:GHC.IO.Encoding.UTF8`; this is specialised at `Addr#` but, unlike the above, supports recovery in the presence of partial codepoints (since in IO contexts codepoints may be broken across buffers) * the implementation provided by the `text` library This does seem a tad silly. On the other hand, these implementations do materially differ from one another (e.g. in the types they support, the detail in errors they can report, and the ability to recover from partial codepoints). Consequently, it's quite unclear that further consolidate would be worthwhile.
*	ghc-boot: Clean up UTF-8 codecs	Ben Gamari	2022-07-22	1	-0/+344
	In preparation for moving the UTF-8 codecs into `base`: * Move them to GHC.Utils.Encoding.UTF8 * Make names more consistent * Add some Haddocks