Database errors due to UTF-8 filenames


Sebert, Holger.ext
 

Hi,

I've set up Toaster and a MySQL Docker container, all running on Ubuntu 16.04.
I am encountering the following database error when building my Yocto project:

ERROR: (1366, "Incorrect string value: '\\xC5\\x91tan\\xC3...' for column 'path' at row 1")
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/mysql/base.py", line 71, in execute
    return self.cursor.execute(query, args)
  File "/usr/local/lib/python3.7/dist-packages/MySQLdb/cursors.py", line 206, in execute
    res = self._query(query)
  File "/usr/local/lib/python3.7/dist-packages/MySQLdb/cursors.py", line 319, in _query
    db.query(q)
  File "/usr/local/lib/python3.7/dist-packages/MySQLdb/connections.py", line 260, in query
    _mysql.connection.query(self, query)
MySQLdb._exceptions.OperationalError: (1366, "Incorrect string value: '\\xC5\\x91tan\\xC3...' for column 'path' at row 1")

The query that raised this error looks as follows:

INSERT INTO `orm_target_file`
(`target_id`, `path`, `size`, `inodetype`, `permission`,
`owner`, `group`, `directory_id`, `sym_target_id`)
VALUES (19,
'/usr/share/ca-certificates/mozilla/NetLock_Arany_=Class_Gold=_F\xc5\x91tan\xc3\xbas\xc3\xadtv\xc3\xa1ny.crt',
1476, 1, 'rw-r--r--', 'root', 'root', NULL, NULL)

The file causing this error has the following UTF-8 encoded filename:

NetLock_Arany_=Class_Gold=_Főtanúsítvány.crt

When looking into the database I found out that the column `path` of table
`orm_target_file` has the following properties:

CHARACTER_SET_NAME: latin1
COLLATION_NAME: latin1_swedish_ci

Apparently, the column `path` cannot store UTF-8 strings. I can fix that
manually by running the following statement with the `mysql` tool:

ALTER TABLE orm_target_file
CONVERT TO CHARACTER SET utf8
COLLATE utf8_general_ci;

This change makes the database error disappear.
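
To double-check that diagnosis, here is a small Python sketch that tests which
characters of the filename cannot be stored in a latin1 column. It rests on two
assumptions of mine: Python's `cp1252` codec is used as a stand-in for MySQL's
`latin1` character set, and `fits_charset` is just a helper name I made up.

```python
# Filename taken from the failing INSERT statement above.
filename = "NetLock_Arany_=Class_Gold=_Főtanúsítvány.crt"


def fits_charset(text, codec):
    """Return True if every character of `text` can be encoded with `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Bytes that are sent to the database when the connection uses UTF-8.
utf8_bytes = filename.encode("utf-8")

# Assumption: cp1252 approximates MySQL's latin1 character set.
offending = [ch for ch in filename if not fits_charset(ch, "cp1252")]

print(utf8_bytes)
print("fits cp1252:", fits_charset(filename, "cp1252"))
print("fits utf-8: ", fits_charset(filename, "utf-8"))
print("characters outside cp1252:", offending)
```

Only `ő` falls outside that character set, and its UTF-8 encoding is the
`\xC5\x91` sequence at the start of the string quoted in the error message.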

I would like to fix that directly in Toaster's `orm/models.py`. I found the
following definition in class `Target_File`:

path = models.FilePathField()

It seems like I need to pass some clever options to `FilePathField`, but which?
My own research in that direction has brought up nothing useful so far.

My questions are thus:

* How can I parametrize `FilePathField` to properly handle UTF-8 encoded
filenames in the underlying database?

* What should a corresponding migration file in `orm/migrations` look like?
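
For the second question, one direction I can imagine is a migration that simply
replays my manual fix as raw SQL. The sketch below only builds the two
statements. I assume they could then be handed to Django's
`migrations.RunSQL(FORWARD_SQL, reverse_sql=REVERSE_SQL)` in a new file under
`orm/migrations`. The helper name `convert_table_sql` is my own invention, and I
do not know which file name or dependency such a migration would need in Toaster.

```python
import re

# Only plain identifiers are accepted, since the values are pasted into SQL.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def convert_table_sql(table, charset, collation):
    """Build an ALTER TABLE statement that converts a table's character set."""
    for name in (table, charset, collation):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError("unexpected identifier: %r" % (name,))
    return (
        "ALTER TABLE `%s` CONVERT TO CHARACTER SET %s COLLATE %s;"
        % (table, charset, collation)
    )


# Forward direction: the same conversion as my manual fix.
FORWARD_SQL = convert_table_sql("orm_target_file", "utf8", "utf8_general_ci")

# Reverse direction: back to the values the column had before.
# Assumption: this only succeeds while no stored path needs UTF-8.
REVERSE_SQL = convert_table_sql("orm_target_file", "latin1", "latin1_swedish_ci")

print(FORWARD_SQL)
print(REVERSE_SQL)
```

This would not answer the first question, though. It works around the field
definition instead of parametrizing `FilePathField` itself.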

Thanks!

Best,
Holger
