From Ankur
Adding Bangla/Bengali collation rules to the MySQL database server
- Locate the “Index.xml” file of your existing MySQL server:
- “/usr/share/mysql/charsets/” in GNU/Linux
- “<mysql installation folder>\share\charsets\” in Windows.
- Alternatively, log in to the MySQL server and run the following SQL query, which shows the directory name:
mysql> show variables like 'character_sets_dir';
+--------------------+----------------------------+
| Variable_name      | Value                      |
+--------------------+----------------------------+
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------+----------------------------+
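The lookup above can also be scripted. This is only a minimal sketch: the function name `find_index_xml` and the `DEFAULT_DIRS` list are assumptions made for illustration, and the only directory listed is the GNU/Linux default mentioned above. Add your Windows installation path, or the value reported by `character_sets_dir`, to the list as needed.

```python
from pathlib import Path

# Candidate directories to try. Only the GNU/Linux default from the text is
# listed; extend this (hypothetical) list with the directory your own server
# reports for character_sets_dir.
DEFAULT_DIRS = ["/usr/share/mysql/charsets/"]


def find_index_xml(candidate_dirs=DEFAULT_DIRS):
    """Return the path of the first Index.xml found, or None if there is none."""
    for directory in candidate_dirs:
        path = Path(directory) / "Index.xml"
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_index_xml()
    print(found if found else "Index.xml not found in the candidate directories")
```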
- Find the highest collation id currently in use, from the MySQL server's information_schema database:
mysql> SELECT max( id ) FROM `information_schema`.`COLLATIONS`;
+-----------+
| max( id ) |
+-----------+
|       210 |
+-----------+
- Make a backup copy of the “Index.xml” file. Then open “Index.xml” and search for the <charset name="utf8"> node. That block should contain two collations:
- <collation name="utf8_general_ci" id="33">
- <collation name="utf8_bin" id="83">
- Add two more collations for Bangla to that block, taken from the “BN-Index.xml” file:
- <collation name="utf8_bangla_ci" id="211">
- <collation name="utf8_bangla_dictionary_ci" id="212">
- Be careful with the collation ids: each one must be a unique number. The earlier step gave the maximum id in use on the server (210 here), so we use the next values, 211 and 212, for Bangla.
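The id choice can be sketched in a few lines of Python. This is an illustration under assumptions, not part of the original procedure: `next_free_ids` and `collation_ids` are hypothetical helper names, `SAMPLE` is a cut-down stand-in for the real “Index.xml”, and the server maximum (210) is the value returned by the query above and has to be supplied by hand.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for Index.xml, holding only the two utf8 collations
# named in the text. The real file contains many more charsets.
SAMPLE = """<charsets>
  <charset name="utf8">
    <collation name="utf8_general_ci" id="33"/>
    <collation name="utf8_bin" id="83"/>
  </charset>
</charsets>"""


def collation_ids(index_xml_text):
    """Collect every collation id that appears in the XML text."""
    root = ET.fromstring(index_xml_text)
    return {int(c.get("id")) for c in root.iter("collation") if c.get("id")}


def next_free_ids(server_max_id, index_xml_text, count=2):
    """Pick `count` ids above the server maximum that Index.xml does not use."""
    used = collation_ids(index_xml_text)
    ids = []
    candidate = server_max_id + 1
    while len(ids) < count:
        if candidate not in used:
            ids.append(candidate)
        candidate += 1
    return ids


if __name__ == "__main__":
    # 210 is the max( id ) value reported by the server in the earlier step.
    print(next_free_ids(210, SAMPLE))  # prints [211, 212]
```

Starting from the server's maximum rather than from the ids in “Index.xml” matters because the server also has compiled-in collations that may not be listed in that file.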
- Update the “Index.xml” file with the Bangla collation rule blocks accordingly. You can get our working copy from here, but it might not work unchanged on your server. :)
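Before restarting the server, it may help to confirm that the edited file is still well-formed XML and that no collation id appears twice. The following is a rough sketch only: `check_index_xml` is a hypothetical helper, and it inspects nothing beyond the `name` and `id` attributes of the `<charset>` and `<collation>` elements mentioned above.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def check_index_xml(path, charset="utf8"):
    """Parse Index.xml and return (collation names in `charset`, duplicated ids).

    ET.parse raises xml.etree.ElementTree.ParseError if the file is not
    well-formed, which is itself a useful check after hand editing.
    """
    root = ET.parse(path).getroot()

    # Ids are compared as the strings written in the file.
    all_ids = [c.get("id") for c in root.iter("collation")]
    duplicates = sorted(i for i, n in Counter(all_ids).items() if i and n > 1)

    names = [
        c.get("name")
        for cs in root.iter("charset")
        if cs.get("name") == charset
        for c in cs.iter("collation")
    ]
    return names, duplicates
```

Calling `check_index_xml("/usr/share/mysql/charsets/Index.xml")` on the edited file should then list the two Bangla collations among the utf8 names and return an empty list of duplicates.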
- Restart the MySQL server.
- The MySQL server should now have two new collation rules for Bangla:
- utf8_bangla_ci : follows the original order
- utf8_bangla_dictionary_ci : follows the Bangla Academy dictionary order
mysql> SELECT * FROM `information_schema`.`COLLATIONS` WHERE COLLATION_NAME LIKE '%bangla%';
+---------------------------+--------------------+-----+------------+-------------+---------+
| COLLATION_NAME            | CHARACTER_SET_NAME | ID  | IS_DEFAULT | IS_COMPILED | SORTLEN |
+---------------------------+--------------------+-----+------------+-------------+---------+
| utf8_bangla_ci            | utf8               | 211 |            |             |       8 |
| utf8_bangla_dictionary_ci | utf8               | 212 |            |             |       8 |
+---------------------------+--------------------+-----+------------+-------------+---------+
- Now you can use your preferred Bangla collation rule. Here is a sample SQL file. You can import it and test Bangla sorting.
- We tested with MySQL Server versions 5.0.51 and 5.0.67 (Community Edition); our test results are available here.
- Please send us your feedback at feedback@ankur.org.bd.