Culture-Sensitive C# Class
using System.Text;
public UTF8Encoding();
Internationalization (I18n) Class Overview
This class encodes Unicode characters using UCS Transformation Format, 8-bit form (UTF-8).
This encoding supports all Unicode character values and surrogates.
For more information, see Microsoft's
MSDN online documentation. Also, see specific MSDN documentation on
Encoding Properties.
I18n Issues
Use of this class probably does not pose an I18n problem.
Globalyzer detects
it by default because during the internationalization process it is important that you
are aware of all of the places in your code where you are performing character encoding
conversions. Further, UTF-8 differs from the form of Unicode that C# uses internally
(UTF-16, which stores each character as one or two 16-bit code units). UTF-8 is the
recommended character encoding for multilingual web pages.
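The size difference between the in-memory UTF-16 form and the UTF-8 form can be sketched as follows. This is a minimal illustration rather than part of Globalyzer; the class name EncodingSizeDemo and its two helper methods are hypothetical names introduced here, and only standard System.Text calls are used.

```csharp
using System;
using System.Text;

// Hypothetical demo class: compares the storage size of the same text
// in UTF-16 (the in-memory form of a C# string) and in UTF-8.
public static class EncodingSizeDemo
{
    // Bytes needed to store the text as UTF-8.
    public static int Utf8ByteCount(string text) =>
        new UTF8Encoding().GetByteCount(text);

    // Bytes needed to store the text as UTF-16 (Encoding.Unicode is little-endian UTF-16).
    public static int Utf16ByteCount(string text) =>
        Encoding.Unicode.GetByteCount(text);

    public static void Main()
    {
        // ASCII text, a Latin-1 letter, the euro sign, and a surrogate pair (U+2A6D6).
        string[] samples = { "abc", "\u00E9", "\u20AC", "\uD869\uDED6" };
        foreach (string sample in samples)
        {
            Console.WriteLine(
                $"{sample.Length} code unit(s): UTF-16 = {Utf16ByteCount(sample)} bytes, " +
                $"UTF-8 = {Utf8ByteCount(sample)} bytes");
        }
    }
}
```

Because the two sizes differ, buffer lengths and string lengths computed in one form should not be assumed to hold in the other.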
UTF-8 encodes Unicode characters with a
variable number of bytes per character.
This encoding is optimized for the 128 ASCII characters (U+0000 through U+007F), each of
which is stored in a single byte, yielding an efficient
mechanism to encode English in an international way. The UTF-8 identifier is the
Unicode byte order mark, hexadecimal 0xFEFF, which is represented in UTF-8 as
hexadecimal 0xEF 0xBB 0xBF. The byte order mark is used to distinguish UTF-8
text from other encodings.
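As a hedged illustration of the variable byte width and the byte order mark, the sketch below prints the UTF-8 bytes of a few code points and the preamble reported by two UTF8Encoding instances. The class name Utf8WidthDemo and its helpers are hypothetical. Note that the parameterless constructor creates an encoder that does not supply the identifier, while new UTF8Encoding(true) does. GetBytes itself never prepends the mark; the preamble is what a writer such as StreamWriter places at the start of a stream.

```csharp
using System;
using System.Text;

// Hypothetical demo class: shows UTF-8 byte sequences and the byte order mark.
public static class Utf8WidthDemo
{
    // UTF-8 bytes of one Unicode code point, formatted as hyphen-separated hex.
    public static string HexBytes(int codePoint)
    {
        string text = char.ConvertFromUtf32(codePoint);
        return BitConverter.ToString(new UTF8Encoding().GetBytes(text));
    }

    // Preamble (byte order mark) the given encoder reports, as hex; empty if none.
    public static string PreambleHex(UTF8Encoding encoding) =>
        BitConverter.ToString(encoding.GetPreamble());

    public static void Main()
    {
        Console.WriteLine(HexBytes(0x41));     // 41           (1 byte, ASCII)
        Console.WriteLine(HexBytes(0xE9));     // C3-A9        (2 bytes)
        Console.WriteLine(HexBytes(0x20AC));   // E2-82-AC     (3 bytes)
        Console.WriteLine(HexBytes(0x2A6D6));  // F0-AA-9B-96  (4 bytes)

        Console.WriteLine(PreambleHex(new UTF8Encoding(true)));  // EF-BB-BF
        Console.WriteLine(PreambleHex(new UTF8Encoding()).Length); // 0 (no identifier)
    }
}
```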
If, once you have examined a particular instantiation of the UTF8Encoding class,
you determine that it does not pose I18n problems, you can
use Globalyzer's Ignore Comment
functionality to ensure that it isn't picked up in a subsequent scan.
Usage Example
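// Requires: using System.Text;
UTF8Encoding utf8 = new UTF8Encoding();
// Six UTF-16 code units; '\uD869' and '\uDED6' form one surrogate pair (U+2A6D6).
char[] chars = new char[] {'a', 'b', 'c', '\uD869', '\uDED6', 'd'};
// Eight bytes result: one per ASCII letter and four for the surrogate pair.
byte[] bytes = utf8.GetBytes(chars);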
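To check that an encoding conversion like the one above is lossless, the bytes can be decoded again and compared with the original characters. The following is a minimal sketch; the class name Utf8RoundTripDemo and its Encode and Decode helpers are hypothetical wrappers around the standard GetBytes and GetString methods.

```csharp
using System;
using System.Text;

// Hypothetical demo class: encodes characters to UTF-8 and decodes them back.
public static class Utf8RoundTripDemo
{
    // Encode UTF-16 characters as UTF-8 bytes.
    public static byte[] Encode(char[] chars) => new UTF8Encoding().GetBytes(chars);

    // Decode UTF-8 bytes back into a string.
    public static string Decode(byte[] bytes) => new UTF8Encoding().GetString(bytes);

    public static void Main()
    {
        char[] chars = { 'a', 'b', 'c', '\uD869', '\uDED6', 'd' };
        byte[] bytes = Encode(chars);

        Console.WriteLine(bytes.Length);                  // 8
        Console.WriteLine(BitConverter.ToString(bytes));  // 61-62-63-F0-AA-9B-96-64

        // The decoded string matches the original six UTF-16 code units.
        string roundTripped = Decode(bytes);
        Console.WriteLine(roundTripped == new string(chars));  // True
    }
}
```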
C# Encoding Information