|
String Formatting in C and C++
For purposes of this section, we will refer to both single byte and multibyte strings as single byte strings. This section addresses input and output formatting issues stemming from the use of single byte and wide strings in the formatting functions. In addition to the familiar %s and %c specifiers, there are prefixed forms and the capitalized %S, which are described below.

Single and Wide Specifiers

To specify that a parameter is to be treated as a single byte parameter, regardless of whether a single byte or wide function call is being used, use the format specifiers %hs and %hc for strings and characters respectively. To specify that a parameter is to be treated as a wide parameter, regardless of whether a single byte or wide function call is being used, use the format specifiers %ls and %lc for strings and characters respectively. Both Windows and ANSI behave the same with regard to these prefixed specifiers. Strictly speaking, only %ls and %lc are defined by the ISO C standard; %hs and %hc are a Microsoft extension that other C libraries may or may not accept.

Unqualified String Specifiers (small %s)

How Windows and ANSI treat a parameter for the %s specifier depends on the style of function call (single, generic, or wide).
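For comparison with the unqualified forms, the prefixed wide specifiers behave identically in single byte and wide calls. The following is a minimal sketch in C, not a definitive implementation. The helper names format_wide_in_single and format_wide_in_wide are hypothetical and chosen only for illustration. The sketch limits itself to %ls and %lc because those are the prefixed forms that ISO C defines; it also assumes the text contains only characters that the current locale can convert.

```c
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: pass wide arguments through a single byte function.
   %ls and %lc mark the arguments as wide, so the library converts them to
   single byte (multibyte) form while formatting. */
int format_wide_in_single(char *out, size_t size,
                          const wchar_t *ws, wchar_t wc)
{
    /* %lc expects a wint_t argument. */
    return snprintf(out, size, "[%ls] [%lc]", ws, (wint_t)wc);
}

/* Hypothetical helper: pass the same wide arguments through a wide function.
   The specifiers are unchanged; only the function and the format string
   literal (L"...") differ. */
int format_wide_in_wide(wchar_t *out, size_t count,
                        const wchar_t *ws, wchar_t wc)
{
    return swprintf(out, count, L"[%ls] [%lc]", ws, (wint_t)wc);
}
```

Calling format_wide_in_single(buf, sizeof buf, L"wide", L'!') fills buf with the single byte text [wide] [!], and format_wide_in_wide produces the same text as a wide string. Because the prefix states the parameter width explicitly, the format string does not have to change when the call style does.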
Note that ANSI in essence always treats %s in the same way as %hs; in other words, the parameter is always assumed to be a single byte string. Windows, on the other hand, treats %s differently based on the type of function call, and is not ANSI-standard because of this. For single byte function calls, %s acts like the single byte %hs specifier, but for wide function calls, %s acts like the wide %ls specifier.

For Windows Generic calls, the parameter is expected to be of type TCHAR, so if the code is compiled with the _UNICODE flag off it will be assumed to be a single byte string, and with the _UNICODE flag on it will be assumed to be a wide string. (The requirement for %s to mean either single byte or wide depending on the Generic compile flags is probably the reason why Microsoft took a non-ANSI-standard approach to these specifiers.)

Unqualified String Specifiers (large %S)

How Windows and ANSI treat a parameter for the %S specifier also depends on the style of function call (single, generic, or wide).
Both ANSI and Windows treat %S basically as the opposite of %s in terms of single byte or wide, which ironically means that Windows and ANSI again handle these specifiers differently. ANSI in essence always treats %S in the same way as %ls; in other words, the parameter is always assumed to be a wide string. Windows, on the other hand, treats %S differently based on the type of function call. For single byte function calls, %S acts like the wide %ls specifier, but for wide function calls, %S acts like the single byte %hs specifier.

This specifier should not be used for Windows Generic calls. Since %S is the "opposite" of %s, the parameter would need to be wide if the _UNICODE flag is off, and single byte if the _UNICODE flag is on. The TCHAR generic type does not work this way, and there is no "anti-TCHAR" kind of datatype.

Multibyte Notes

With regard to multibyte encodings such as UTF-8 and Shift-JIS, the single byte functions work correctly, with only a few minor notes. One applies to the input functions, and another to functions that take count parameters, where a count is measured in bytes rather than characters.

Locale Influences on Formatting

All of the formatting functions are influenced by the current locale. An example of this is the floating-point decimal separator. The United States uses a period, whereas a locale such as French uses a comma. For output, the issue is primarily just ensuring that the locale is set properly. For input, however, the source of the string being parsed also matters.

Consider the case where a number string is in some canonical form that always uses one particular style and is therefore locale-independent. An example is a numeric string value that is stored in a database, or that comes from a protocol with a locale-independent string format that always uses a period as the decimal separator. If your application is operating in the French locale, for example, you will have to temporarily set the locale to one like United States English in order to parse the number, and afterwards reset the locale to French.
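The temporary locale switch described above can be sketched in C as follows. This is an illustration under stated assumptions, not a definitive implementation. The helper name parse_canonical_double is hypothetical, and the sketch substitutes the standard "C" locale for United States English because it also uses a period and is always available, whereas names of regional locales vary between platforms.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse a number that always uses a period as the
   decimal separator, whatever numeric locale is currently active.
   Returns 1 on success and 0 on failure. */
int parse_canonical_double(const char *text, double *value)
{
    /* setlocale(category, NULL) queries the current locale name. The
       returned string may be overwritten by a later setlocale call,
       so copy it before changing anything. */
    const char *current = setlocale(LC_NUMERIC, NULL);
    char saved[256];
    int converted;

    if (current == NULL || strlen(current) >= sizeof saved)
        return 0;
    strcpy(saved, current);

    /* Assumption: "C" stands in for "a locale like United States English". */
    if (setlocale(LC_NUMERIC, "C") == NULL)
        return 0;

    converted = sscanf(text, "%lf", value);

    /* Restore the locale the application was using (French in the
       scenario above). */
    setlocale(LC_NUMERIC, saved);

    return converted == 1;
}
```

Only the LC_NUMERIC category is switched, so other locale-dependent behavior such as character classification is left alone. Note that setlocale changes process-wide state, so a sketch like this is only suitable where no other thread is formatting or parsing at the same time.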
On the other hand, if the string is retrieved from some source such as a dialog box, where the value is likely to be in the local format, you will probably want to leave the locale as it is currently set for the application.
|