Topic: Sorting is culture-insensitive

Offline TnS

  • Member
  • *
  • Posts: 12
    • View Profile
Sorting is culture-insensitive
« on: February 22, 2009, 09:32:49 pm »
It looks like strings are sorted simply by the Unicode value of their characters instead of using culture-sensitive sorting. Please change this. I have never programmed in Delphi, but I hope it won't be difficult, because in C#, for example, this is just a parameter.

For example in Hungarian O, U, Ú, Ü, Ő, Ű should be sorted as O, Ő, U, Ú, Ü, Ű.
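
To illustrate the difference, here is a minimal Python sketch (not how the application or Firebird does it). The `HU_ALPHABET` string and the `hu_key` helper are hypothetical names made up for this example, and the alphabet is simplified to single letters (Hungarian digraphs such as "cs" or "sz" are ignored).

```python
# Simplified single-letter Hungarian alphabet; digraphs are deliberately ignored.
HU_ALPHABET = "aábcdeéfghiíjklmnoóöőpqrstuúüűvwxyz"
RANK = {ch: i for i, ch in enumerate(HU_ALPHABET)}


def hu_key(text):
    """Sort key: rank each letter by its position in HU_ALPHABET.

    Characters outside the alphabet sort after all known letters,
    ordered by code point.
    """
    return [RANK.get(ch, len(HU_ALPHABET) + ord(ch)) for ch in text.lower()]


letters = ["O", "U", "Ú", "Ü", "Ő", "Ű"]

# Plain sorted() compares Unicode code points, which is the behaviour reported above.
codepoint_order = sorted(letters)

# Sorting with the alphabet-based key gives the order a Hungarian reader expects.
culture_order = sorted(letters, key=hu_key)

print(codepoint_order)
print(culture_order)
```

A real implementation would rely on a proper collation library or the database's own collation rather than a hand-written alphabet.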

Offline nostra

  • Administrator
  • *****
  • Posts: 2852
    • View Profile
    • Personal Video Database
Re: Sorting is culture-insensitive
« Reply #1 on: February 23, 2009, 02:06:55 am »
The problem is that sorting is done by Firebird, not in my code, and I have not yet found a way to change this sorting behavior, but I will keep searching.
Gentlemen, you can’t fight in here! This is the War Room!

Offline cwdean

  • User
  • ***
  • Posts: 46
    • View Profile
Re: Sorting is culture-insensitive
« Reply #2 on: February 23, 2009, 04:36:21 pm »
Quote from nostra: "The Problem is that sorting is done by Firebird, not in my code and I have not found a way yet to change this sorting behavior, but I keep searching."

Hi Nostra,

Not sure if this is helpful, but I did come across a web page that talked about some of the challenges (and possible solutions) related to Firebird INTL Architecture (selecting character set and collation).  See below:

http://www.jodelpeter.de/i18n/fbarch/selecting_charset_and_collation_in_firebird.html

Offline rick.ca

  • Global Moderator
  • *****
  • Posts: 3241
  • "I'm willing to shoot you!"
    • View Profile
Re: Sorting is culture-insensitive
« Reply #3 on: February 24, 2009, 02:26:39 am »
Hmmm. Reads like an explanation of why there is no good solution. And it hasn't been updated since 2004.

Here's a 0.9.9.x workaround for the truly desperate: Create a custom "index" (numeric) field. Export titles to Excel, sort them the way you want them, and number the result in a new column. Import the list back into the database to populate the index. Now the list can be sorted by index. New movies, of course, will not be "indexed," but will at least appear at the top of the list where they won't be missed. This is a lousy solution for any database having movies added to it regularly, but not so bad for collections that don't change much. If gaps are left in the index sequence (i.e., use 10, 20, 30..., rather than 1, 2, 3...), new movies could be "manually" added to the index. The same thing can be done in 0.9.8.20 using the movie Number field, if it is not already being used.
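
The numbering step of this workaround could be sketched in Python instead of Excel. This is only an illustration under assumptions: `build_index` and `index_between` are hypothetical helper names, titles are assumed to be unique, and `str.casefold` stands in for whatever sort order you actually want.

```python
def build_index(titles, key=str.casefold, step=10):
    """Map each title to a sparse sort index (10, 20, 30, ...).

    Assumes titles are unique; duplicates would overwrite each other.
    """
    ordered = sorted(titles, key=key)
    return {title: (pos + 1) * step for pos, title in enumerate(ordered)}


def index_between(lower, upper):
    """Pick an unused index between two neighbours, or None if the gap is full."""
    mid = (lower + upper) // 2
    return mid if lower < mid < upper else None


index = build_index(["Brazil", "alien", "Casablanca"])
print(index)

# A new movie that belongs between "alien" and "Brazil" can take a value from the gap.
print(index_between(index["alien"], index["Brazil"]))
```

Leaving a step of 10 means up to nine later additions fit between two neighbours before the whole list has to be renumbered.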

Oops! Found a bug in 0.9.9.4 (which I'll report here in case anyone wants to try the above): The following error occurs when any filters are set and the list is sorted by a custom field, or vice versa.

Unexpected exception:
Dynamic SQL Error
SQL error code = -206
Column unknown
CUSTOM_VALUES_INT.value
At line 1, column 659
Column does not belong to referenced table
Error Code: 249

Offline svenne

  • Power User
  • ****
  • Posts: 145
    • View Profile
Re: Sorting is culture-insensitive
« Reply #4 on: April 01, 2010, 02:19:25 pm »
First of all: great work, great application. Still has some minor flaws and glitches of course, but it's the only app of this kind that can do (almost) everything I ever wanted. Should sound like a huge compliment!  ;)

Still there is this very annoying issue with its sort order (I'm using v.0.9.9.18, WinXP), still sorting upper and lower case chars and chars with accents in a strange manner (simply by codepoint order).

There also is a second thread on this:
http://www.videodb.info/forum_en/index.php?topic=1531.0

Of course, Firebird was to blame for the unwanted behavior, but with Firebird 2.1 things might have changed. I searched this forum, didn't find anyone mention it, so perhaps no one knows?

As far as I can tell from some forum research (I hope I'm not wrong), all text is stored as UNICODE_FSS within the database? That character set has since been superseded, so it should be (or already was?) changed to UTF8. With UTF8 you can tell Firebird to use one of four collations: UCS_BASIC (sorting by code-point order), UNICODE (using the Unicode Collation Algorithm, which really should do the job the best way possible), UNICODE_CI (case-insensitive), and, with Firebird 2.5, UNICODE_CI_AI (ignoring case and accents, treating "A" as "a", "Ü" as "u", "è" as "e", and so on).
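
To make the idea of a case- and accent-insensitive comparison concrete, here is a rough Python approximation. It is not Firebird's implementation of UNICODE_CI_AI; the `fold_case_and_accents` helper is a hypothetical name, and it simply strips combining marks after Unicode decomposition and then case-folds.

```python
import unicodedata


def fold_case_and_accents(text):
    """Approximate a case- and accent-insensitive sort key."""
    # NFD splits accented letters into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


titles = ["Über", "ubuntu", "Ève", "eden"]
ordered = sorted(titles, key=fold_case_and_accents)
print(ordered)
```

With this key, "Ü" compares like "u" and "è" like "e", which matches the behaviour described above. It does not give language-specific ordering such as the Hungarian example at the top of the thread.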

I'm referring to this page:
http://www.destructor.de/firebird/charsets.htm

The Unicode Collation Algorithm:
To cite Wikipedia: "Multilingual ordering... When lists of names or words need to be ordered, but the context does not define a particular single language or alphabet, the Unicode Collation Algorithm provides a way to put them in sequence."
Sounds good to me.  :)
In detail:
http://en.wikipedia.org/wiki/Unicode_collation_algorithm

Offline nostra

  • Administrator
  • *****
  • Posts: 2852
    • View Profile
    • Personal Video Database
Re: Sorting is culture-insensitive
« Reply #5 on: April 01, 2010, 02:39:34 pm »
That is really interesting information about the new Firebird versions, but the problem I am having is the complex procedure of updating the database to the new version :( I will try to move to the latest Firebird release with version 1.0, and while doing this I can also take a look at the new sorting possibilities.

Offline svenne

  • Power User
  • ****
  • Posts: 145
    • View Profile
Re: Sorting is culture-insensitive
« Reply #6 on: April 01, 2010, 03:25:25 pm »
So I'm looking forward to version 1.0...  :)
Thanks in advance and thanks a lot in general for Personal Video Database!

Offline svenne

  • Power User
  • ****
  • Posts: 145
    • View Profile
Re: Sorting is culture-insensitive
« Reply #7 on: May 27, 2012, 05:24:23 pm »
This might be interesting for changing the collation on existing databases. The command came with Firebird 2.5... don't know the details, but it might be helpful when you are going to tackle this issue.
http://www.firebirdsql.org/refdocs/langrefupd25-ddl-charset.html
I was unsure but just decided to post it. However, I don't want to be pushy on this topic... well, maybe a little bit ;D
Well, kidding aside, don't feel pushed.
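
For reference, the statement documented on that page has the form `ALTER CHARACTER SET charset SET DEFAULT COLLATION collation`. A small Python sketch that builds such a statement follows; the helper name `alter_default_collation_sql` is hypothetical, and choosing UTF8 with UNICODE is only an assumption about what would suit this database. If I read the linked page correctly, the new default applies to future use of the character set, and existing columns keep their collation.

```python
def alter_default_collation_sql(charset, collation):
    """Build the Firebird 2.5 DDL statement that sets a charset's default collation."""
    for name in (charset, collation):
        # Allow only plain identifiers (letters, digits, underscores).
        if not name.replace("_", "").isalnum():
            raise ValueError(f"not a plain identifier: {name!r}")
    return f"ALTER CHARACTER SET {charset} SET DEFAULT COLLATION {collation}"


statement = alter_default_collation_sql("UTF8", "UNICODE")
print(statement)
```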
« Last Edit: May 27, 2012, 05:44:14 pm by svenne »

 
