author: Jarkko Hietaniemi <jhi@iki.fi> 2001-12-27 23:56:20 +0000
committer: Jarkko Hietaniemi <jhi@iki.fi> 2001-12-27 23:56:20 +0000
commit: aaef10c5550c567d82a2f114831f7a5c9e62a4e7
tree: 6e96e49c0f43b8660ff537b87fe018f0b99f8908 /pod/perluniintro.pod
parent: b682381a96c9a55a544f9537d92a562937057c0c
Fast Latin1<->UTF-8 conversion for older Perls.
p4raw-id: //depot/perl@13912
Diffstat (limited to 'pod/perluniintro.pod')
-rw-r--r-- pod/perluniintro.pod | 9
1 files changed, 9 insertions, 0 deletions
diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index 9b447caab9..68f8a01534 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -790,6 +790,15 @@ C<Unicode::Map8>, and C<Unicode::Map>, available from CPAN.
If you have the GNU recode installed, you can also use the
Perl frontend C<Convert::Recode> for character conversions.
+The following are fast conversions between ISO 8859-1 (Latin-1) bytes
+and UTF-8 bytes.  The code works even with older Perl 5 versions.
+
+ # ISO 8859-1 to UTF-8
+ s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
+
+ # UTF-8 to ISO 8859-1
+ s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;
+
=head1 SEE ALSO
L<perlunicode>, L<Encode>, L<encoding>, L<open>, L<utf8>, L<bytes>,
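
The two substitutions added by this patch can be checked with a small round trip. The following is a minimal sketch, not part of the commit: the wrapper names `latin1_to_utf8` and `utf8_to_latin1` are hypothetical, and the input is assumed to be a plain byte string (no characters above 0xFF). Note that the reverse direction only folds sequences led by 0xC2 or 0xC3, which cover U+0080 through U+00FF; any other multi-byte UTF-8 sequence is left untouched.

```perl
use strict;
use warnings;

# Hypothetical wrappers around the two s/// expressions from the patch.

sub latin1_to_utf8 {
    my ($s) = @_;
    # Each byte 0x80-0xFF becomes a lead byte (0xC2 or 0xC3)
    # followed by a continuation byte (0x80-0xBF).
    $s =~ s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
    return $s;
}

sub utf8_to_latin1 {
    my ($s) = @_;
    # Only two-byte sequences led by 0xC2 or 0xC3 map back into
    # Latin-1; everything else passes through unchanged.
    $s =~ s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;
    return $s;
}

# "cafe" with e-acute (0xE9) as Latin-1 bytes.
my $latin1 = "caf\xE9";
my $utf8   = latin1_to_utf8($latin1);

# Show the bytes in hex: 63 61 66 C3 A9
print join(" ", map { sprintf "%02X", ord } split //, $utf8), "\n";

print utf8_to_latin1($utf8) eq $latin1 ? "round trip ok\n" : "mismatch\n";
```

The expressions rely on Perl's operator precedence: the shifts bind tighter than `&`, which in turn binds tighter than `|`, so no extra parentheses are needed.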