ISO/IEC 8859-11

ISO/IEC 8859-11:2001, Information technology — 8-bit single-byte coded graphic character sets — Part 11: Latin/Thai alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 2001. It is informally referred to as Latin/Thai. It is nearly identical to the national Thai standard TIS-620 (1990). The sole difference is that ISO/IEC 8859-11 allocates non-breaking space to code 0xA0, while TIS-620 leaves it undefined. (In practice, this small distinction is usually ignored.)

ISO-8859-11 is not registered as a primary IANA charset name, even though it follows the usual naming pattern for IANA charsets based on the ISO 8859 series. It is instead defined as an alias[1] of the nearly equivalent TIS-620. The TIS-620 label can be used for ISO/IEC 8859-11 data without problems, because the only addition, the no-break space, occupies a code that TIS-620 leaves unallocated. Microsoft has assigned code page 28601 (also known as Windows-28601) to ISO-8859-11 in Windows.[2] A draft of the standard placed the Thai letters at different code positions.[3]

As in all parts of ISO/IEC 8859, the lower 128 codes are identical to ASCII. The additional characters, apart from the no-break space, appear in Unicode in the same order at a constant offset: 0xA1 maps to U+0E01, 0xA2 to U+0E02, and so on.
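The constant offset can be illustrated with a minimal Python sketch. The function names `decode_byte` and `decode` are hypothetical, and the sketch makes two assumptions that the text above does not state: bytes 0xDB to 0xDE and 0xFC to 0xFF are treated as unassigned, and bytes 0x80 to 0x9F are rejected rather than mapped to control characters.

```python
# Distance between a Thai byte value and its Unicode code point (0xA1 -> U+0E01).
THAI_OFFSET = 0x0E01 - 0xA1


def decode_byte(b: int) -> str:
    """Decode one ISO/IEC 8859-11 byte to a one-character string."""
    if b < 0x80:
        return chr(b)  # lower half is identical to ASCII
    if b == 0xA0:
        return "\u00a0"  # no-break space, the one non-Thai addition
    # Assumption: only 0xA1-0xDA and 0xDF-0xFB carry Thai characters.
    if 0xA1 <= b <= 0xDA or 0xDF <= b <= 0xFB:
        return chr(b + THAI_OFFSET)
    raise ValueError(f"byte 0x{b:02X} is not handled by this sketch")


def decode(data: bytes) -> str:
    """Decode a whole byte string by applying decode_byte to each byte."""
    return "".join(decode_byte(b) for b in data)


if __name__ == "__main__":
    # Six bytes spelling a common Thai greeting.
    print(decode(b"\xca\xc7\xd1\xca\xb4\xd5"))
```

For real data, Python's built-in `iso8859_11` codec is the more practical choice; the sketch only makes the arithmetic visible.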

Microsoft Windows code page 874 and MacThai, the code page used in the Thai version of the Apple Macintosh, are both variants of TIS-620, but they are incompatible with each other.

Character set

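ISO/IEC 8859-11[4]
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x
1x
2x  SP  ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~
8x
9x
Ax NBSP ก ข ฃ ค ฅ ฆ ง จ ฉ ช ซ ฌ ญ ฎ ฏ
Bx ฐ ฑ ฒ ณ ด ต ถ ท ธ น บ ป ผ ฝ พ ฟ
Cx ภ ม ย ร ฤ ล ฦ ว ศ ษ ส ห ฬ อ ฮ ฯ
Dx ะ ั า ำ ิ ี ึ ื ุ ู ฺ -- -- -- -- ฿
Ex เ แ โ ใ ไ ๅ ๆ ็ ่ ้ ๊ ๋ ์ ํ ๎ ๏
Fx ๐ ๑ ๒ ๓ ๔ ๕ ๖ ๗ ๘ ๙ ๚ ๛ -- -- -- --
  -- Unassigned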

Code values 0xD1, 0xD4–0xDA and 0xE7–0xEE are combining characters.
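The list of combining positions can be cross-checked against the Unicode Character Database. The following sketch is illustrative only: the function name `combining_bytes` is hypothetical, "combining" is interpreted as general category Mn (nonspacing mark), and the assigned byte ranges are assumed to be 0xA1 to 0xDA and 0xDF to 0xFB. The category is used rather than `unicodedata.combining()`, because several Thai vowel signs have a canonical combining class of 0.

```python
import unicodedata

# Distance between a Thai byte value and its Unicode code point (0xA1 -> U+0E01).
THAI_OFFSET = 0x0E01 - 0xA1


def combining_bytes() -> list:
    """Return the byte values whose Unicode counterparts are nonspacing marks."""
    # Assumption: these two ranges are the assigned Thai positions.
    assigned = list(range(0xA1, 0xDB)) + list(range(0xDF, 0xFC))
    result = []
    for b in assigned:
        char = chr(b + THAI_OFFSET)
        if unicodedata.category(char) == "Mn":  # Mn = Mark, nonspacing
            result.append(b)
    return result


if __name__ == "__main__":
    print(" ".join(f"{b:02X}" for b in combining_bytes()))
```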

Vendor extensions

Code page 874 (IBM) / 9066

IBM code page 874 (CP874, IBM-874, x-IBM874), also known as Code page 9066 (IBM-9066),[5] differs from ISO/IEC 8859-11 at only nine code positions, shown in square brackets in the following table:[6][7][8]

IBM code page 874/9066 (differences from ISO-8859-11)[9][10][11]
0 1 2 3 4 5 6 7 8 9 A B C D E F
Ax [่] ก ข ฃ ค ฅ ฆ ง จ ฉ ช ซ ฌ ญ ฎ ฏ
Bx ฐ ฑ ฒ ณ ด ต ถ ท ธ น บ ป ผ ฝ พ ฟ
Cx ภ ม ย ร ฤ ล ฦ ว ศ ษ ส ห ฬ อ ฮ ฯ
Dx ะ ั า ำ ิ ี ึ ื ุ ู ฺ [้] [๊] [๋] [์] ฿
Ex เ แ โ ใ ไ ๅ ๆ ็ ่ ้ ๊ ๋ ์ ํ ๎ ๏
Fx ๐ ๑ ๒ ๓ ๔ ๕ ๖ ๗ ๘ ๙ ๚ ๛ [¢] [¬] [¦] [NBSP]
  [ ] Differences from ISO 8859-11

Code page 1161

Code page 1161 (CP1161, IBM-1161) is a variant of IBM code page 874. The only difference is the euro sign (€) at position 0xDE (decimal 222).[12][13]
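The relationship between the three tables can be sketched as a base mapping plus small override dictionaries. All names below are hypothetical, Python has no built-in codec for either IBM code page, and the override positions are read from the IBM code page 874 table in the previous section, so they should be checked against an authoritative mapping before use.

```python
# Distance between a Thai byte value and its Unicode code point (0xA1 -> U+0E01).
THAI_OFFSET = 0x0E01 - 0xA1


def iso_8859_11_table() -> dict:
    """Build a byte -> character mapping for ISO/IEC 8859-11 (sketch)."""
    table = {b: chr(b) for b in range(0x80)}  # ASCII half
    table[0xA0] = "\u00a0"  # no-break space
    # Assumption: 0xA1-0xDA and 0xDF-0xFB are the assigned Thai positions.
    for b in list(range(0xA1, 0xDB)) + list(range(0xDF, 0xFC)):
        table[b] = chr(b + THAI_OFFSET)
    return table


# Nine positions where IBM code page 874 differs (assumed from the table above).
IBM_874_OVERRIDES = {
    0xA0: "\u0e48", 0xDB: "\u0e49", 0xDC: "\u0e4a", 0xDD: "\u0e4b",
    0xDE: "\u0e4c", 0xFC: "\u00a2", 0xFD: "\u00ac", 0xFE: "\u00a6",
    0xFF: "\u00a0",
}
# Code page 1161 replaces a single position with the euro sign.
CP1161_OVERRIDES = {0xDE: "\u20ac"}

iso_8859_11 = iso_8859_11_table()
ibm_874 = {**iso_8859_11, **IBM_874_OVERRIDES}
cp1161 = {**ibm_874, **CP1161_OVERRIDES}


def diff(a: dict, b: dict) -> list:
    """Return the sorted byte values at which two mappings disagree."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


if __name__ == "__main__":
    print([hex(b) for b in diff(ibm_874, cp1161)])
    print(len(diff(iso_8859_11, ibm_874)))
```

Layering overrides this way keeps each variant's definition as short as its description in prose.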

Code page 874 (Microsoft) / 1162

Windows code page 874 (windows-874, MS874, x-windows-874), known as Code page 1162 (CP1162, IBM-1162) by IBM,[14][15] is used by Microsoft Windows. It differs from ISO/IEC 8859-11 only by adding the nine symbols shown in the following table:

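Code page 1162 (IBM) / 874 (Microsoft): differences from ISO-8859-11[16][17][18][19]
0 1 2 3 4 5 6 7 8 9 A B C D E F
8x € -- -- -- -- … -- -- -- -- -- -- -- -- -- --
9x -- ‘ ’ “ ” • – — -- -- -- -- -- -- -- --
  Every character shown is an addition relative to ISO 8859-11; cells marked -- are unassigned.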
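The additions can also be listed programmatically by comparing two codecs from the Python standard library. This is a sketch under stated assumptions: the helper names are hypothetical, Python's `cp874` codec is assumed to follow Microsoft's published mapping, and a byte counts as an addition when `cp874` decodes it to something other than what `iso8859_11` yields for the same byte.

```python
def decode_or_none(byte_value: int, codec: str):
    """Decode a single byte, returning None if the codec leaves it undefined."""
    try:
        return bytes([byte_value]).decode(codec)
    except UnicodeDecodeError:
        return None


def windows_874_additions() -> dict:
    """Map each byte that Windows-874 assigns differently to its character."""
    additions = {}
    for b in range(256):
        win = decode_or_none(b, "cp874")
        iso = decode_or_none(b, "iso8859_11")
        # Skip bytes that cp874 leaves undefined or decodes identically.
        if win is not None and win != iso:
            additions[b] = win
    return additions


if __name__ == "__main__":
    for b, char in sorted(windows_874_additions().items()):
        print(f"0x{b:02X} U+{ord(char):04X}")
```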

Mac OS Thai

This is the variant used on the Classic Mac OS.

Mac OS Thai[20]
0 1 2 3 4 5 6 7 8 9 A B C D E F
8x « » ่ ้ ๊ ๋ ์ ่ ้ ๊ ๋ ์ ํ
9x ั ็ ิ ี ึ ื ่ ้ ๊ ๋ ์
Ax NBSP
Bx
Cx
Dx ั ำ ิ ี ึ ื ุ ู ฺ  WJ  ZWSP ฿
Ex ็ ่ ้ ๊ ๋ ์ ํ
Fx ® ©
  Differences from ISO 8859-11

References

  1. "IANA Character Sets".
  2. "js-codepage, Getting codepages". GitHub. 2021-10-12.
  3. Everson, Michael. "Proposed ISO 8859-11".
  4. Whistler, Ken (2002-10-07), ISO/IEC 8859-11:2001 to Unicode, Unicode Consortium
  5. IBM; Unicode Consortium. "convrtrs.txt". International Components for Unicode. v. 59180.0.1. Yes ibm-874 == ibm-9066. ibm-1161 has the euro update.
  6. "Code page 874 information document". Archived from the original on 2017-01-16.
  7. "CCSID 874 information document". Archived from the original on 2016-03-27.
  8. "CCSID 9066 information document". Archived from the original on 2016-03-27.
  9. IBM. "Code Page CPGID 00874" (PDF). REGISTRY: Graphic Character Sets and Code Pages.
  10. Code Page CPGID 00874 (txt), IBM
  11. "Converter Explorer: ibm-874_P100-1995". International Components for Unicode. Unicode Consortium.
  12. "Code Page 01161" (PDF).
  13. "CCSID 1161 information document". Archived from the original on 2016-03-27.
  14. "Code page 1162 information document". Archived from the original on 2016-03-17.
  15. "CCSID 1162 information document". Archived from the original on 2016-03-27.
  16. "Code Page 01162" (PDF).
  17. Steele, Shawn (1998-02-28). "cp874 to Unicode table". Unicode Consortium, Microsoft.
  18. Code Page CPGID 01162 (txt), IBM
  19. International Components for Unicode (ICU), ibm-1162_P100-1999.ucm, 2002-12-03
  20. Apple (2005-04-05). "Map (external version) from Mac OS Thai character set to Unicode 3.2 and later". Unicode Consortium.