Sometimes I need to search for files whose names contain accented characters (diacritics in general), usually with locate (the mlocate flavor, Merging Locate; see the warning about plocate below). I would like to set it up (maybe in /etc/updatedb.conf) so that it lets me search for these special characters using a certain language mapping, for example:
a == âàáäÂÀÂÄ
e == êèéëÊÈÉË
i == îïíÎÏ
o == ôöóÔÖ
u == ûùüÛÜÙ
c == çÇ
n == ñ
So locate -i liberación should also find file names containing the string liberacion and even liberaciòn.
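The behaviour asked for above amounts to comparing names after removing diacritics. Here is a minimal Python sketch of that idea. It is not how mlocate works; it only illustrates the matching rule. It assumes Unicode NFD decomposition (from the standard unicodedata module) is an acceptable stand-in for the mapping, and the helper names fold_diacritics and matches are hypothetical.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with combining marks removed (é -> e, ñ -> n, Ç -> C)."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep every other character.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, filename: str) -> bool:
    """Case-insensitive, accent-insensitive substring match (like locate -i)."""
    return fold_diacritics(query).casefold() in fold_diacritics(filename).casefold()


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    names = ["liberación.txt", "Liberacion.pdf", "liberaciòn.odt", "libertad.txt"]
    print([name for name in names if matches("liberación", name)])
```

Letters that have no canonical decomposition, such as Æ, Ø, Ð, Þ and ß, pass through this function unchanged, so a real solution would need an explicit table (or a transliteration library) for them.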
Notes and assumptions
- The mapping may need to cover other characters too: ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ.
- This is a common situation in languages such as Spanish, French, and German.
- I always use a fully UTF-8 locale.
- I would rather not have to use regular expressions.
- A patch might use ASCII transliterations of Unicode, as Unidecode/cUnidecode does. Most of mlocate is written in C.
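Until locate can do this itself, one stopgap that avoids writing regular expressions by hand is to generate them from a plain query. The sketch below is only an illustration: the VARIANTS table is adapted from the mapping in the question (with ò added so that liberaciòn matches), and the names VARIANTS, BASE_OF and accent_regex are hypothetical.

```python
import re

# Illustrative table adapted from the question; extend it as needed.
VARIANTS = {
    "a": "âàáäÂÀÁÄ",
    "e": "êèéëÊÈÉË",
    "i": "îïíÎÏ",
    "o": "ôöóòÔÖ",  # ò added so that "liberaciòn" matches
    "u": "ûùüÛÜÙ",
    "c": "çÇ",
    "n": "ñ",
}

# Reverse lookup: accented character -> unaccented base letter.
BASE_OF = {variant: base for base, chars in VARIANTS.items() for variant in chars}


def accent_regex(query: str) -> str:
    """Build a pattern where each mapped letter also matches its accented forms."""
    parts = []
    for ch in query:
        base = BASE_OF.get(ch, ch).lower()
        variants = VARIANTS.get(base)
        if variants:
            # Bracket expression: base letter in both cases plus its variants.
            parts.append("[" + base + base.upper() + variants + "]")
        else:
            # Unmapped characters are matched literally.
            parts.append(re.escape(ch))
    return "".join(parts)


if __name__ == "__main__":
    pattern = re.compile(accent_regex("liberación"), re.IGNORECASE)
    for name in ["liberacion.txt", "Liberaciòn.odt", "libertad.txt"]:
        print(name, bool(pattern.search(name)))
```

The generated pattern could then be handed to locate through its regex options (mlocate documents -r/--regexp and --regex). Whether the bracket expressions behave the same way there depends on the locale and on the regex flavor, which is an assumption this sketch does not verify.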
Related
- Similar question but using find
- Miloslav Trmač (mlocate developer) says here that the official source code is on pagure.io (with a fork on GitHub).
- I filed an issue on the mlocate repo at Pagure.io to add this feature.
- Update 2018-02: This can be fixed with this pull request by marcotrevisan. It will add -t/--transliterate support, using iconv to match accented characters.
- Update 2018-03: mlocate with support for --transliterate is now included in Ubuntu 18.04 LTS Bionic Beaver (v2 and v3.1).