Background

Encoding is the process of converting text data into numeric codes that computers can store and process. The most widely used encoding systems for English-language text are ASCII and EBCDIC, both of which map the characters of the Latin alphabet to numeric codes. Other encoding systems do the same for Asian ideographic characters.
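As a small illustration of how one character maps to different numbers under different encoding systems, the following Python sketch encodes the letter A under ASCII and under an EBCDIC code page. It is not SAS code. It assumes Python's built-in `cp037` codec (EBCDIC, US/Canada) as a representative EBCDIC code page, and the helper name `code_values` is hypothetical.

```python
def code_values(text, encoding):
    """Return the numeric byte values that `encoding` assigns to `text`."""
    return list(text.encode(encoding))


# The same letter receives a different numeric code in each system.
ascii_codes = code_values("A", "ascii")
# Assumption: cp037 stands in for EBCDIC; other EBCDIC code pages exist.
ebcdic_codes = code_values("A", "cp037")

print("ASCII :", [hex(b) for b in ascii_codes])
print("EBCDIC:", [hex(b) for b in ebcdic_codes])
```

Because the two systems assign different codes, data must be read with the same encoding that was used to write it.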

Because the Roman alphabet requires relatively few characters, one byte (256 possible values) is adequate to represent each character. Such encodings are called Single-Byte Character Sets (SBCS). Many Asian languages require thousands of characters, so two bytes are needed to represent each character. This is the origin of the term "Double-Byte Character Set" (DBCS).
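The byte counts can be checked with a minimal Python sketch. It assumes Python's `shift_jis` codec as one example of a Japanese DBCS encoding, and the helper name `bytes_per_character` is hypothetical. The two-byte limit shown is a theoretical upper bound; real DBCS encodings use only part of that range.

```python
def bytes_per_character(text, encoding):
    """Return (character, byte count) pairs for `text` under `encoding`."""
    return [(ch, len(ch.encode(encoding))) for ch in text]


# Number of distinct values that one byte and two bytes can hold.
single_byte_limit = 2 ** 8    # 256
double_byte_limit = 2 ** 16   # 65536 (theoretical upper bound)

# Assumption: shift_jis is used here only as a sample DBCS encoding.
# A Latin letter takes one byte; a kanji character takes two.
mixed = bytes_per_character("A漢", "shift_jis")

print(single_byte_limit, double_byte_limit)
print(mixed)
```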

Each Asian language usually has more than one DBCS encoding system because computer manufacturers did not standardize on a single one. The SAS System can process the manufacturer-specific DBCS encodings for the major Asian languages: Japanese, Korean, simplified Chinese (used in mainland China and Singapore), and traditional, or complex, Chinese (used in Taiwan and Hong Kong).
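To illustrate why multiple DBCS encodings for one language matter, the following Python sketch encodes the same Japanese character under two encodings and then reads one encoding's bytes as if they were the other's. The codec names `shift_jis` and `euc_jp` are Python codec names chosen as examples, not SAS option values, and the variable names are hypothetical.

```python
text = "漢"

# Two different double-byte encodings of the same Japanese character.
sjis_bytes = text.encode("shift_jis")
euc_bytes = text.encode("euc_jp")

print("Shift-JIS:", sjis_bytes.hex())
print("EUC-JP   :", euc_bytes.hex())

# Reading the bytes with the matching encoding recovers the character.
round_trip = sjis_bytes.decode("shift_jis")

# Reading them with the wrong encoding does not; errors="replace"
# substitutes a placeholder for undecodable bytes instead of raising.
misread = sjis_bytes.decode("euc_jp", errors="replace")

print(round_trip, repr(misread))
```

Software that processes DBCS data therefore needs to know which encoding produced it.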



Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.