Welcome Guest, Not a member yet? Register   Sign In
sqlite3 and result ordering according to non-ASCII collations
#1

(This post was last modified: 06-17-2024, 02:22 AM by dgvirtual.)

For a while I was not sure how to order query results in Codeigniter when using sqlite3 database, which natively lacks non-ascii text collation functions. So I resorted to sorting php arrays, which is quite suboptimal with larger datasets.

When I took a closer look, I realized that it is quite simple to handle sorting in vanilla PHP case. Below I post an example of creating a Lithuanian collation and using it with sqlite3.

But I see createCollation and createFunction sqlite3 driver methods are not used in Codeigniter4 code. So I have to find a way to implement these myself.

Could someone take a look at the vanilla php example below and maybe throw in a suggestion on how to optimally do that in Codeigniter?

Vanilla php example

PHP Code:
// the locale offers proper collation order, set it if it is not default
setlocale(LC_ALL'lt_LT.UTF-8');

$db = new SQLite3(baz.db);

// create collation
$db->createCollation('LT', function ($a$b) {
   return strcoll($a$b);
});

// create alternative to native LOWER function for case-insensitive comparisons
$db->createFunction('LT_LOWER', function ($str) {
   return mb_strtolower($str);
}); 

And you continue as if you were using native sqlite3 functions:

PHP Code:
$results $db->query('SELECT bar FROM foo where LT_LOWER(foo) LIKE %Žemai%' ORDER BY bar ASC COLLATE LT);
while (
$row $results->fetchArray()) {
    var_dump($row);


You can also assign this LT collation as default to individual columns when creating a table (and then omit it in select statements), but then you won't be able to read that table without declaring those additional collation methods before the basic select queries.
==

Donatas G.
Reply
#2

Try:
PHP Code:
        $db->connID->createCollation('LT', function ($a$b) {
            return strcoll($a$b);
        }); 
Reply
#3

(This post was last modified: 06-17-2024, 04:45 AM by dgvirtual.)

(06-17-2024, 01:38 AM)kenjis Wrote: Try:
PHP Code:
        $db->connID->createCollation('LT', function ($a$b) {
            return strcoll($a$b);
        }); 

Voila!

so I added these three calls to BaseController:

PHP Code:
        setlocale(LC_ALL'lt_LT.UTF-8'); 
        // before I run this the locale was 'lt' and the ordering did not work

        db_connect()->connID->createCollation('LT', function ($a$b) {
            return strcoll($a$b);
        }); 

        db_connect()->connID->createFunction('LT_LOWER', function ($str) {
            return mb_strtolower($str);
        }); 


And then the controller method to test:

PHP Code:
       public function collate()
    {
        echo '<pre>';
        $db db_connect();


        echo 'ORDERING: ' PHP_EOL;
        $query $db->query('SELECT last_name FROM users ORDER BY last_name COLLATE LT DESC');
        $users $query->getResultArray();
        foreach ($users as $user) {
            echo $user['last_name'] . PHP_EOL;
        }

        echo 'CASE-INSENSITIVE LIKE: '  PHP_EOL PHP_EOL;
        $query $db->query('SELECT last_name FROM users WHERE LT_LOWER(last_name) LIKE "%žič%"');
        $users $query->getResultArray();
        foreach ($users as $user) {
            echo $user['last_name'] . PHP_EOL;
        }

        echo ' PHP_EOL . SAME USING QUERY BUILDER, NOT WORKING: '  PHP_EOL PHP_EOL;
        $builder $db->table('users');
        $builder->select('last_name');
        $builder->orderBy('last_name''COLLATE LT DESC'); 
        $query $builder->get(); 
        $users $query->getResultArray();

        foreach ($users as $user) {
            echo $user['last_name'] . PHP_EOL;
        }



        echo 
PHP_EOL 'CASE-INSENSITIVE LIKE WITH BUILDER: '  PHP_EOL PHP_EOL;
        
$builder $db->table('users');
        
$builder->select('last_name');
        
$builder->like('LT_LOWER(last_name)''žič''both'); 
        
$query $builder->get(); 

        
$users $query->getResultArray();

        foreach (
$users as $user) {
            echo 
$user['last_name'] . PHP_EOL;
        }

    

And I see that both ordering and case - insensitive search with utf8 characters is working.

The last question would be - how do I integrate COLLATE LT into query builder. So far I failed (see the third code block, which does not order).
==

Donatas G.
Reply
#4

It seems there is no way to generate the query `ORDER BY last_name COLLATE LT DESC` with orderBy().
https://github.com/codeigniter4/CodeIgni...1449-L1497

You need to extend the orderBy() method.
Reply
#5

(06-17-2024, 04:17 PM)kenjis Wrote: It seems there is no way to generate the query `ORDER BY last_name COLLATE LT DESC` with orderBy().
https://github.com/codeigniter4/CodeIgni...1449-L1497

You need to extend the orderBy() method.
Thanks @kenjis; I think I will simply go with declaring collation on creation of table, so that I do not need to declare it on queries.

I might also change the collation name to utf8_lithuanian_ci to make it compatible with the MySql counterpart for easier migration if I ever decide to move

it also seems that I can overwrite the native LOWER function of sqlite with my own implementation:

PHP Code:
        $db->createFunction('LOWER', function ($str) {
            return mb_strtolower($str);
        }); 


If I do not notice any shortcomings to such approach, then SQlite3 collation and search functionality can be made on par with mysql's using proper migration code and a few statements in the BaseController class.
==

Donatas G.
Reply
#6

I finally managed to compile the ICU plugin for SQLite3, which can then be loaded and used instead of the PHP functions, making the search almost twice as fast. So definitely worth it with larger databases. Here is the tutorial on how to do that on Ubuntu 22.04 (and the process is the same on Ubuntu 24.04):

https://dg.lapas.info/irasas/enhancing-s...extension/
==

Donatas G.
Reply




Theme © iAndrew 2016 - Forum software by © MyBB