Collation in Oracle 8i
There are some considerations for handling collation and sorting
of UTF-8 encoded data. Oracle supports all linguistic definitions
(sorts) for UTF-8 the same way they are supported with corresponding
language specific character sets. Each linguistic sort will sensically
sort the characters for the given language and treat the remaining
Unicode characters as punctuation, essentially ignoring them. To
conduct a linguistic sort, select active the sort definition using
ALTER SESSION SET NLS_SORT= or specifying the NLS_SORT
parameter in NLSSORT function.
An NLSSORT query takes the form:
SELECT value FROM table ORDER BY NLSSORT(value, 'NLS_SORT
= LOCALE')
and takes approximately 50% longer to execute than an ASCII binary
sorting routine. Please see prepared
statements for more information regarding SQL queries.
The NLSSORT function replaces a character string with
the equivalent sort string used by the linguistic sort mechanism.
For a binary sort, the sort string is the same as the input string.
The linguistic sort technique operates by replacing each character
string with some other binary values, chosen so that sorting the
resulting string produces the desired sorting sequence. When a linguistic
sort is being used, NLSSORT returns the binary values
that replace the original string.
To handle linguistic indexes for data in multiple languages for
VARCHAR2 columns, build a single linguistic index for
all languages. This can be accomplished by including a language
column (LANG_COL in the example below) that contains
NLS_LANGUAGE values for the corresponding column on
which the index is built as a parameter to the NLSSORT
function.
Please view the complete list of Oracle
NLS parameters for more information.
Collation Example
The following example builds a single linguistic index for multiple
languages. With this index, the rows with the same values for NLS_LANGUAGE
are collated together.
CREATE INDEX i ON t (NLSSORT(col, 'NLS_SORT=' | | lang_col));
Which index to use is based on the argument of the NLSSORT
function you specified in the ORDER BY clause.
SELECT col
FROM t
WHERE NLSSORT(col, 'NLS_SORT = '||lang_col) IS NOT NULL ORDER BY
lang_col, NLSSORT(col, 'NLS_SORT = '||lang_col);
Note: The IS NOT NULL clause is required for
using linguistic indexes. All column names should be in lower case.
With single linguistic index the results are grouped by language.
|