Bangla Sorting in MySQL

From Ankur


Adding Bangla/Bengali collation rules to MySQL database server

  • Locate the “Index.xml” file of the existing MySQL server. It is normally in the character sets directory:
  1. “/usr/share/mysql/charsets/” on GNU/Linux
  2. “<mysql installation folder>\share\charsets\” on Windows
  3. You can also log in to the MySQL server and run the following SQL query, which shows the directory name.
   mysql> show variables like 'character_sets_dir';
   +--------------------+---------------------------------+
   | Variable_name      | Value                           |
   +--------------------+---------------------------------+
   | character_sets_dir | /usr/share/mysql/charsets/      |
   +--------------------+---------------------------------+
  • Find the maximum collation id in the MySQL server's information_schema database.
   mysql> SELECT max( id ) FROM `information_schema`.`COLLATIONS`;
   +-----------+
   | max( id ) |
   +-----------+
   |       210 |
   +-----------+
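   The same query can be extended to suggest the next two free ids directly. This is only a minimal sketch: the column aliases first_new_id and second_new_id are made up for illustration, and it assumes no other collation is added to the server in the meantime. With a maximum id of 210 it returns 211 and 212.

```sql
-- Suggest two unused collation ids just above the current maximum.
-- The aliases are illustrative names, not part of MySQL.
SELECT MAX(id) + 1 AS first_new_id,
       MAX(id) + 2 AS second_new_id
FROM `information_schema`.`COLLATIONS`;
```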
  • Keep a backup copy of the “Index.xml” file. Now open “Index.xml” and search for the <charset name="utf8"> node. There should be two collations in that block:
  1. <collation name="utf8_general_ci" id="33">
  2. <collation name="utf8_bin" id="83">
  • Add two new collations for Bangla to the same block:
  1. <collation name="utf8_bangla_ci" id="211">
  2. <collation name="utf8_bangla_dictionary_ci" id="212">
  • Be careful with the collation ids. Each one must be a unique number. The earlier step gave the maximum id in use on the MySQL server (210 here), so we use the next values, 211 and 212, for Bangla.
  • Update the “Index.xml” file with the Bangla collation rules block accordingly. You can get our working copy from here, though it might not work for you as is. :)
  • Restart the MySQL server.
  • MySQL server should now have two new collation rules for Bangla:
  1. utf8_bangla_ci : follows the original order
  2. utf8_bangla_dictionary_ci : follows the Bangla Academy dictionary order
   mysql> SELECT * FROM `information_schema`.`COLLATIONS` where COLLATION_NAME like '%bangla%';
   +---------------------------+--------------------+-----+------------+-------------+---------+
   | COLLATION_NAME            | CHARACTER_SET_NAME | ID  | IS_DEFAULT | IS_COMPILED | SORTLEN |
   +---------------------------+--------------------+-----+------------+-------------+---------+
   | utf8_bangla_ci            | utf8               | 211 |            |             |       8 |
   | utf8_bangla_dictionary_ci | utf8               | 212 |            |             |       8 |
   +---------------------------+--------------------+-----+------------+-------------+---------+
  • Now you can use your preferred Bangla collation rule. Here is a sample SQL file. You can import it and test Bangla sorting.
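   As a rough illustration of how the new collations might be used, the sketch below sets one as a table default and overrides it for a single query. The table name bangla_words, its columns, and the three sample words are hypothetical and are not taken from the sample SQL file; the statements assume the two collations were installed as described above.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE bangla_words (
  id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  word VARCHAR(100) NOT NULL
) DEFAULT CHARSET=utf8 COLLATE=utf8_bangla_ci;

-- Make sure the client connection sends and receives utf8.
SET NAMES utf8;

INSERT INTO bangla_words (word) VALUES ('কলম'), ('আম'), ('বই');

-- Sort using the table's default collation (utf8_bangla_ci).
SELECT word FROM bangla_words ORDER BY word;

-- Override the collation for one query to get dictionary order.
SELECT word FROM bangla_words
ORDER BY word COLLATE utf8_bangla_dictionary_ci;
```

   Setting the collation on the table keeps ordinary ORDER BY queries short, while the per-query COLLATE clause lets you compare both orderings on the same data.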
  • We tested with MySQL Server versions 5.0.51 and 5.0.67 (Community Edition); our test results are available here.
  • Please send us your feedback at feedback@ankur.org.bd.