Microsoft Code Pages
A code page is a platform-specific encoding of a character set. It
can be represented as a table that maps characters to single-byte
or multibyte values. Many code pages share the ASCII character set
for characters in the range 0x00 - 0x7F.
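As a rough illustration of the "table" view, the following sketch maps a single byte to a Unicode code point under three code pages. The function name decode_byte and the SKETCH_CP_* constants are hypothetical names introduced only for this example, and the table covers just one high byte (0xE9) in addition to the shared ASCII range; a real code page table has an entry for every byte value.

```c
#include <stdint.h>

/* Hypothetical constants for this sketch; the numeric values are the
   usual Microsoft code page numbers. */
enum { SKETCH_CP_437 = 437, SKETCH_CP_1251 = 1251, SKETCH_CP_1252 = 1252 };

/* Map one byte to a Unicode code point under the given code page.
   Returns 0xFFFD (the replacement character) for anything this
   deliberately tiny table does not cover. */
uint32_t decode_byte(int code_page, unsigned char byte)
{
    if (byte <= 0x7F)
        return byte;                 /* ASCII range shared by all three */

    if (byte == 0xE9) {
        switch (code_page) {
        case SKETCH_CP_437:  return 0x0398; /* GREEK CAPITAL LETTER THETA */
        case SKETCH_CP_1251: return 0x0439; /* CYRILLIC SMALL LETTER SHORT I */
        case SKETCH_CP_1252: return 0x00E9; /* LATIN SMALL LETTER E WITH ACUTE */
        }
    }
    return 0xFFFD;                   /* not covered by this sketch */
}
```

The same byte value 0xE9 therefore names three different characters, while a byte such as 0x41 ('A') decodes identically under all three code pages.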
The Microsoft run-time library uses the following types of
code pages:
- System-default ANSI code page. When an application starts,
the run-time system automatically sets the multibyte code page to
the operating system's default ANSI code page. To set the locale
to the system-default ANSI code page, use the C call:
setlocale(LC_ALL, "");
- Locale code page. Many of the C run-time routines depend on the
current locale setting, which in turn depends on the locale code
page. On application startup, the locale-dependent routines in the
Microsoft run-time library use the code page that corresponds to
the "C" locale. However, you can change or query the locale code
page within your application by calling setlocale.
- Multibyte code page. In addition to locale-sensitive C run-time
functions, Microsoft also supports many multibyte-character
functions that depend on the application's multibyte code page
setting. By default, these routines use the system-default ANSI
code page. However, at run time you can query and change the
multibyte code page by calling _getmbcp and _setmbcp,
respectively.
- "C" locale code page. This is the code page that corresponds to
the ASCII character set. It is the default locale code page of a
C/C++ application.
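The query-and-change behavior of setlocale described in the list above can be sketched as follows. The helper names current_locale_name and use_system_default_locale are hypothetical; only setlocale itself comes from the standard C library. Passing NULL as the locale argument queries the current setting without changing it, and passing "" requests the default taken from the operating system.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Copy the name of the current locale into buf.
   Returns 0 on success, -1 if the name does not fit or cannot be read. */
int current_locale_name(char *buf, size_t size)
{
    const char *name = setlocale(LC_ALL, NULL);  /* NULL: query only */
    if (name == NULL || strlen(name) >= size)
        return -1;
    strcpy(buf, name);
    return 0;
}

/* Switch every locale category to the system default.
   Returns 0 on success, -1 if the run-time library rejects the request. */
int use_system_default_locale(void)
{
    return setlocale(LC_ALL, "") != NULL ? 0 : -1;
}
```

Called before any other setlocale call, current_locale_name is expected to report "C". The name reported after use_system_default_locale depends on the machine's configuration, so the sketch makes no assumption about it.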
Multibyte Code Page Functions
Most multibyte-character routines in the Microsoft run-time library
recognize multibyte-character sequences according to the current
code page setting. This includes the _ismbc
routines. The multibyte code page also affects multibyte processing
in the following set of routines:
| _exec functions | _mktemp | _stat |
| _fullpath | _spawn functions | _tempnam |
| _makepath | _splitpath | tmpnam |
In addition, all run-time library routines that take
multibyte-character argv or envp program arguments (such as the
_exec and _spawn families) process these strings according to the
multibyte code page. Hence, these routines are also affected by a
call to _setmbcp that changes the multibyte code page.
See the MSDN Library for more information on the multibyte code
page-dependent functions.
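A minimal sketch of querying and temporarily changing the multibyte code page is shown below. It assumes the Microsoft run-time library (the _WIN32 branch) and that the requested code page is installed; the helper name is_lead_byte_in is hypothetical. On other platforms the sketch simply reports that the feature is unavailable.

```c
#if defined(_WIN32)
#include <mbctype.h>   /* _getmbcp, _setmbcp, _ismbblead */
#endif

/* Report whether `byte` is a lead byte under code page `cp` by
   temporarily switching the multibyte code page.
   Returns 1 or 0, or -1 if the code page cannot be set or the
   Microsoft run-time library is not available. */
int is_lead_byte_in(int cp, unsigned char byte)
{
#if defined(_WIN32)
    int previous = _getmbcp();      /* 0 means a single-byte code page */
    int result;

    if (_setmbcp(cp) != 0)          /* _setmbcp returns 0 on success */
        return -1;
    result = _ismbblead(byte) ? 1 : 0;
    _setmbcp(previous);             /* restore the earlier setting */
    return result;
#else
    (void)cp;
    (void)byte;
    return -1;                      /* no multibyte code page setting here */
#endif
}
```

For example, with code page 932 (Shift-JIS) available, is_lead_byte_in(932, 0x81) would be expected to return 1 and is_lead_byte_in(932, 0x41) to return 0. Because the change made by _setmbcp also affects the routines listed above, restoring the previous value keeps the side effect local to the helper.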
Locale Code Page Functions
A number of functions depend on the locale code page. As stated
above, call setlocale to ensure that the locale is set properly
before calling one of these functions.
| atof, atoi, atol | mbtowc | strftime, wcsftime |
| is functions | printf functions | _strlwr |
| isleadbyte | scanf functions | strtod, wcstod, strtol, wcstol, strtoul, wcstoul |
| localeconv | setlocale, _wsetlocale | _strupr |
| MB_CUR_MAX | strcoll, wcscoll | strxfrm, wcsxfrm |
| _mbccpy | _stricmp, _wcsicmp, _mbsicmp | tolower, towlower |
| _mbclen | _stricoll, _wcsicoll | toupper, towupper |
| mblen | _strncoll, _wcsncoll | wcstombs |
| _mbstrlen | _strnicmp, _wcsnicmp, _mbsnicmp | wctomb |
| mbstowcs | _strnicoll, _wcsnicoll | _wtoi, _wtol |
See the MSDN Library for more details on C locale-dependent
functions.
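To illustrate why the locale should be set before calling one of these routines, the sketch below parses a number with strtod under an explicitly chosen LC_NUMERIC locale and then restores the previous setting. The helper name parse_double_in_locale is hypothetical; setlocale and strtod are standard C. The current locale name is copied before switching because a later setlocale call may overwrite the string returned by the query.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Parse `text` with strtod under the LC_NUMERIC locale `locale_name`,
   then restore the previous setting.
   Returns 0 on success, -1 if the locale is unavailable or memory
   cannot be allocated. */
int parse_double_in_locale(const char *locale_name, const char *text,
                           double *out)
{
    const char *current = setlocale(LC_NUMERIC, NULL);  /* query only */
    char *saved;

    if (current == NULL)
        return -1;
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, current);          /* keep our own copy of the name */

    if (setlocale(LC_NUMERIC, locale_name) == NULL) {
        free(saved);
        return -1;                   /* requested locale not available */
    }
    *out = strtod(text, NULL);       /* decimal separator follows locale */
    setlocale(LC_NUMERIC, saved);    /* restore the earlier locale */
    free(saved);
    return 0;
}
```

Under the "C" locale the decimal separator is a period, so "1.5" parses as 1.5, whereas "1,5" parses as 1.0 because conversion stops at the comma. Under a locale whose separator is a comma, "1,5" would be expected to parse as 1.5; the name of such a locale is platform-specific and is not assumed here.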
There are also many locale-dependent Win32 functions. See Windows C++
Locale Functions for details.
A comprehensive list of Microsoft code page identifiers is
available in Microsoft's documentation.