WIP: Replacement for PR #183 #196

Open
MicheleC wants to merge 8 commits from reworked/feature/konq_listview++_14.1 into master
Owner

This is a cleaned up version of PR #183, for review.

It is not meant to be merged yet.

This is a cleaned up version of PR #183, for review. It is not meant to be merged yet.
MicheleC changed title from Replacement for PR #183 to WIP: Replacement for PR #183 10 months ago
MicheleC added the
PR/rfc
label 10 months ago
Collaborator

Michele,

I tested the cleaned up code and it works. I needed to continue your cleanup in order to restore inadvertently broken dictionary sort feature.

I have a few questions / comments on the cleanup:

  • As you can see, I tend to prefer spaces in function definitions and calls, as well as underscores in variable names. I do that because I find that more readable. However, I will adhere to the "when in Rome, do as Romans do" principle. Is there a TDE coding style manual that I should consult?

  • A lot of the comments you removed were to help me keep track of where things were or observations/questions to my inexperienced self as I learned more about the code. Those of course should be removed. But there are a few that I feel should be left intact for future coders. We can discuss those later.

Ready for next steps :)

Vince

Michele, I tested the cleaned up code and it works. I needed to continue your cleanup in order to restore inadvertently broken dictionary sort feature. I have a few questions / comments on the cleanup: * As you can see, I tend to prefer spaces in function definitions and calls, as well as underscores in variable names. I do that because I find that more readable. However, I will adhere to the "when in Rome, do as Romans do" principle. Is there a TDE coding style manual that I should consult? * A lot of the comments you removed were to help me keep track of where things were or observations/questions to my inexperienced self as I learned more about the code. Those of course should be removed. But there are a few that I feel should be left intact for future coders. We can discuss those later. Ready for next steps :) Vince
Collaborator

...Is there a TDE coding style manual that I should consult?

https://mirror.git.trinitydesktop.org/gitea/TDE/tde/wiki/Code-formatting-style-discussion
Still a draft I believe, but It's a start.

...But there are a few that I feel should be left intact for future coders. We can discuss those later.

You might want to discuss this on the dev channel instead, https://jabb.im/reg/
the channel's name is: tde-devs

>...Is there a TDE coding style manual that I should consult? > https://mirror.git.trinitydesktop.org/gitea/TDE/tde/wiki/Code-formatting-style-discussion Still a draft I believe, but It's a start. >...But there are a few that I feel should be left intact for future coders. We can discuss those later. > You might want to discuss this on the dev channel instead, https://jabb.im/reg/ the channel's name is: tde-devs
Collaborator

Michel,

It looks my attempt to upload a patch to your reworked/feature/konqlistview++14.1 branch unintentionally updated another branch.

Sorry, this was a GIT-newbie mistake -- I did a "git push origin" with reworked/feature/konqlistview++14.1 checked out. To avoid speculation, how should I have done this correctly?

Vince

Michel, It looks my attempt to upload a patch to your reworked/feature/konqlistview++14.1 branch unintentionally updated another branch. Sorry, this was a GIT-newbie mistake -- I did a "git push origin" with reworked/feature/konqlistview++14.1 checked out. To avoid speculation, how should I have done this correctly? Vince
Poster
Owner

Sorry, this was a GIT-newbie mistake -- I did a "git push origin" with reworked/feature/konqlistview++14.1 checked out. To avoid speculation, how should I have done this correctly?

@VinceR
see comment here

> Sorry, this was a GIT-newbie mistake -- I did a "git push origin" with reworked/feature/konqlistview++14.1 checked out. To avoid speculation, how should I have done this correctly? @VinceR see comment [here](https://mirror.git.trinitydesktop.org/gitea/TDE/tdebase/issues/180#issuecomment-12136)
Poster
Owner

Hi @VinceR,

Greg (@cethyel) has already pointed you to the intended TDE code style. It is not yet applied, so as you can see the TDE code is still quite messily (un)formatted. I am slowly working on a code formatting effort to enforce code style at some point, but it won't be finished any time soon. So for the time being, the "when in Rome, do as Romans do" principle is the best way and what I usually do.

But there are a few that I feel should be left intact for future coders.

Sure, we can talk here or on the developer chat room that Greg pointed you too.

Hi @VinceR, Greg (@cethyel) has already pointed you to the intended TDE code style. It is not yet applied, so as you can see the TDE code is still quite messily (un)formatted. I am slowly working on a code formatting effort to enforce code style at some point, but it won't be finished any time soon. So for the time being, the "when in Rome, do as Romans do" principle is the best way and what I usually do. > But there are a few that I feel should be left intact for future coders. Sure, we can talk here or on the developer chat room that Greg pointed you too.
MicheleC force-pushed reworked/feature/konq_listview++_14.1 from c890c7e844 to c501532ee7 9 months ago
Poster
Owner

@VinceR
Thanks for fixing the missing rename in my original version. I have now reworked the tree so I have integrated your latest commit into the original commit where it belongs.
You will need to update your local version of the branch to make sure it is aligned with the new one. The easiest way may be to rename the branch on your side ('git branch -m ') and check out the new branch again. This way you can see they are exactly the same 😄

@VinceR Thanks for fixing the missing rename in my original version. I have now reworked the tree so I have integrated your latest commit into the original commit where it belongs. You will need to update your local version of the branch to make sure it is aligned with the new one. The easiest way may be to rename the branch on your side ('git branch -m <newname>') and check out the new branch again. This way you can see they are exactly the same :smile:
MicheleC force-pushed reworked/feature/konq_listview++_14.1 from c501532ee7 to 974182fe6a 9 months ago
Poster
Owner

@VinceR
I am back at this PR. A few comments to discuss together so that we can further advance this PR (sorry for the long post).

  1. with alternate sorting, one of the two columns is fixed to "Filename". I think it would be even better if we get rid of this limitation and we let the user alternate between the last two columns that were clicked. It should be a relatively small change and extend the functionality further.

  2. during testing alternate sorting, at the beginning when no column was clicked, Ctrl+S would do nothing and the user would be left wondering what's happening. It will be good to select two columns arbitrarily at the very beginning (for example column 0 and 1) if none was previously saved in the settings.

  3. Dictionary sorter as it is implemented at the moment is a no-go. It needs to be reworked. There are three main reasons for that:

  • a. design: 'sortString' is used to store the name of an item in 'updateContents()' and then used in 'compare()' to decide the order of items. Although it works, this is a weak design and not recommended.

  • b. efficiency: dictionary sorting does a lot of string processing (cut and paste char by char). This is no problem with few items, but if you have a big folder with thousands of files, this is likely to have performance implications since string operations are not cheap.

  • c. functionality: dictionary sorting is looking only at the first non-letter non-digit number for ordering the items. While this may cover most of the cases, it does not work for those cases where the difference is after the first character. For example consider the case a~aa, "a~a a", a~ab, a~a+a, a~a~a. I believe (I have not tested directly) that this will "neutralize" the dictionary sorting logic, since the current algorighm will find the first '~' char and then sort based on the rest of the item name, which is exactly what dictionary sorting is trying to fix.

It could be a good idea to split this PR into two parts. First we work on alternate sorting only, fix point 1. and 2. and merge into Konqueror code. Then we can work on the dictionary sorting algorithm and fix the problems in 3. What do you think?

PS: I am not trying to discourage your effort, on the contrary I want to help to improve the PR 😄 For dictionary sorting implementation, I can guide you through when we come to fix that.

@VinceR I am back at this PR. A few comments to discuss together so that we can further advance this PR (sorry for the long post). 1. with alternate sorting, one of the two columns is fixed to "Filename". I think it would be even better if we get rid of this limitation and we let the user alternate between the last two columns that were clicked. It should be a relatively small change and extend the functionality further. 2. during testing alternate sorting, at the beginning when no column was clicked, Ctrl+S would do nothing and the user would be left wondering what's happening. It will be good to select two columns arbitrarily at the very beginning (for example column 0 and 1) if none was previously saved in the settings. 3. Dictionary sorter as it is implemented at the moment is a no-go. It needs to be reworked. There are three main reasons for that: * a. design: 'sortString' is used to store the name of an item in 'updateContents()' and then used in 'compare()' to decide the order of items. Although it works, this is a weak design and not recommended. * b. efficiency: dictionary sorting does a lot of string processing (cut and paste char by char). This is no problem with few items, but if you have a big folder with thousands of files, this is likely to have performance implications since string operations are not cheap. * c. functionality: dictionary sorting is looking only at the first non-letter non-digit number for ordering the items. While this may cover most of the cases, it does not work for those cases where the difference is after the first character. For example consider the case a~aa, "a~a a", a~ab, a~a+a, a~a~a. I believe (I have not tested directly) that this will "neutralize" the dictionary sorting logic, since the current algorighm will find the first '~' char and then sort based on the rest of the item name, which is exactly what dictionary sorting is trying to fix. It could be a good idea to split this PR into two parts. First we work on alternate sorting only, fix point 1. and 2. and merge into Konqueror code. Then we can work on the dictionary sorting algorithm and fix the problems in 3. What do you think? PS: I am not trying to discourage your effort, on the contrary I want to help to improve the PR 😄 For dictionary sorting implementation, I can guide you through when we come to fix that.
Owner

By the way, there is also issue #177, which refers to an existing patch that was created for Dolphin. Does it make sense to take this patch into account as well?

By the way, there is also issue #177, which refers to an existing patch that was created for Dolphin. Does it make sense to take this patch into account as well?
Poster
Owner

By the way, there is also issue #177, which refers to an existing patch that was created for Dolphin. Does it make sense to take this patch into account as well?

The two topics are related, so it is probably good to have a general look of that patch too. But in terms of implementation I expect the two code bases to be quite different, so probably we can't simply take that patch and put it here.

> By the way, there is also issue #177, which refers to an existing patch that was created for Dolphin. Does it make sense to take this patch into account as well? The two topics are related, so it is probably good to have a general look of that patch too. But in terms of implementation I expect the two code bases to be quite different, so probably we can't simply take that patch and put it here.
Collaborator

Michele,

In partial response to your previous email:

with alternate sorting, one of the two columns is fixed to "Filename". I think it would be even better if we get rid of this limitation and we let the user alternate between the last two columns that were clicked. It should be a relatively small change and extend the functionality further.

I completely agree and that was how I had initially intended to do things. It's been a few years, but I recall encountering difficulties when columns were rearranged due to the fact that 'Filename' was being treated differently than the others. My understanding of how things work has come a long way. Could you describe the small change and and can play around with that this weekend.

during testing alternate sorting, at the beginning when no column was clicked, Ctrl+S would do nothing and the user would be left wondering what's happening. It will be good to select two columns arbitrarily at the very beginning (for example column 0 and 1) if none was previously saved in the settings.

I also noticed that. Again, I recall having difficulty initializing default values for those settings in the event that they weren't stored on disk. But is seems that even when those preferences have been stored, they don't seem to take effect with a new listview until the user does that initial column click. Do you see any obvious areas where I should be re-reading the on-disk settings and/or calling updateContents()?

It could be a good idea to split this PR into two parts. First we work on alternate sorting only, fix point 1. and 2. and merge into Konqueror code. Then we can work on the dictionary sorting algorithm and fix the problems in 3. What do you think?

I agree, especially given your concerns :) Should that become a separate PR? I do have responses your three comments, but I will save them for later when we return to reviewing and working on the "dictionary sort" feature.

Vince

Michele, In partial response to your previous email: > with alternate sorting, one of the two columns is fixed to "Filename". I think it would be even better if we get rid of this limitation and we let the user alternate between the last two columns that were clicked. It should be a relatively small change and extend the functionality further. I completely agree and that was how I had initially intended to do things. It's been a few years, but I recall encountering difficulties when columns were rearranged due to the fact that 'Filename' was being treated differently than the others. My understanding of how things work has come a long way. Could you describe the small change and and can play around with that this weekend. > during testing alternate sorting, at the beginning when no column was clicked, Ctrl+S would do nothing and the user would be left wondering what's happening. It will be good to select two columns arbitrarily at the very beginning (for example column 0 and 1) if none was previously saved in the settings. I also noticed that. Again, I recall having difficulty initializing default values for those settings in the event that they weren't stored on disk. But is seems that even when those preferences have been stored, they don't seem to take effect with a new listview until the user does that initial column click. Do you see any obvious areas where I should be re-reading the on-disk settings and/or calling updateContents()? > It could be a good idea to split this PR into two parts. First we work on alternate sorting only, fix point 1. and 2. and merge into Konqueror code. Then we can work on the dictionary sorting algorithm and fix the problems in 3. What do you think? I agree, especially given your concerns :) Should that become a separate PR? I do have responses your three comments, but I will save them for later when we return to reviewing and working on the "dictionary sort" feature. Vince
Collaborator

By the way, there is also issue #177, which refers to an existing patch that was created for Dolphin. Does it make sense to take this patch into account as well?

The two topics are related, so it is probably good to have a general look of that patch too. But in terms of implementation I expect the two code bases to be quite different, so probably we can't simply take that patch and put it here.

Now we are talking "fancy" sorting!

Using Gnu Coreutils sort as a reference, the sort proposed in #177 is comparable to "sort --version-sort". The dictionary sort feature I had proposed in this PR is roughly comparable to "sort --dictionary-order" except that I only deal with the 1st character and the final (tie-breaking) collation is different.

One observation I have about the patch referenced in #177: The string mangling necessary necessary to accomplish the objective is done in the compare function, which sort will call many times. I was concerned enough about this to move that key generation function to updateContents() which is called mainly when a directory is changed.

I do think that having a flexible array of extended sorting features will be useful to those (such as myself) that who use long descriptive file names for documents.

>> By the way, there is also issue #177, which refers to an existing patch that was created for Dolphin. Does it make sense to take this patch into account as well? >The two topics are related, so it is probably good to have a general look of that patch too. But in terms of implementation I expect the two code bases to be quite different, so probably we can't simply take that patch and put it here. Now we are talking "fancy" sorting! Using Gnu Coreutils sort as a reference, the sort proposed in #177 is comparable to "sort --version-sort". The dictionary sort feature I had proposed in this PR is roughly comparable to "sort --dictionary-order" except that I only deal with the 1st character and the final (tie-breaking) collation is different. One observation I have about the patch referenced in #177: The string mangling necessary necessary to accomplish the objective is done in the compare function, which sort will call many times. I was concerned enough about this to move that key generation function to updateContents() which is called mainly when a directory is changed. I do think that having a flexible array of extended sorting features will be useful to those (such as myself) that who use long descriptive file names for documents.
Poster
Owner

Hi @VinceR,
thanks for the answer.
I will put this PR aside for one or two weeks and I will come back to it after the release of R14.0.10, which we are closing off in the coming days.
At that point I can revisit the code and help out on the various points.
In the meantime we can discuss various topics related to ir.

Could you describe the small change and and can play around with that this weekend.

You are using two variables to store the columns to use for primary and alternate sorting, so it should be enough to update those two variables each time a user click on a different column. That is why I think it should be a relatively small fix.

I agree, especially given your concerns :) Should that become a separate PR?

Probably it is the wiser way forward. We can keep this PR for reference, but create a first PR for alternate sorting, merge it and later work on the dictionary sorting and the "fancy" patch too.

Hi @VinceR, thanks for the answer. I will put this PR aside for one or two weeks and I will come back to it after the release of R14.0.10, which we are closing off in the coming days. At that point I can revisit the code and help out on the various points. In the meantime we can discuss various topics related to ir. > Could you describe the small change and and can play around with that this weekend. You are using two variables to store the columns to use for primary and alternate sorting, so it should be enough to update those two variables each time a user click on a different column. That is why I think it should be a relatively small fix. > I agree, especially given your concerns :) Should that become a separate PR? Probably it is the wiser way forward. We can keep this PR for reference, but create a first PR for alternate sorting, merge it and later work on the dictionary sorting and the "fancy" patch too.
Collaborator

with alternate sorting, one of the two columns is fixed to "Filename". I think it would be even better if we get rid of this limitation and we let the user alternate between the last two columns that were clicked. It should be a relatively small change and extend the functionality further.

during testing alternate sorting, at the beginning when no column was clicked, Ctrl+S would do nothing and the user would be left wondering what's happening. It will be good to select two columns arbitrarily at the very beginning (for example column 0 and 1) if none was previously saved in the settings.

OK, hopefully these suggestions were successfully and safely addressed in last commit that I just pushed.

And hopefully that push doesn't cause the headaches that my last one did :)

> with alternate sorting, one of the two columns is fixed to "Filename". I think it would be even better if we get rid of this limitation and we let the user alternate between the last two columns that were clicked. It should be a relatively small change and extend the functionality further. > > during testing alternate sorting, at the beginning when no column was clicked, Ctrl+S would do nothing and the user would be left wondering what's happening. It will be good to select two columns arbitrarily at the very beginning (for example column 0 and 1) if none was previously saved in the settings. > OK, hopefully these suggestions were successfully and safely addressed in last commit that I just pushed. And hopefully that push doesn't cause the headaches that my last one did :)
Poster
Owner

Hi @VinceR, sorry for the long delay, too many thigns to do and not enought time.
Just want to let you know that I have made a list of things I want to do/close in the next 4 months and this PR is part of the list. Not top priority, but it is there.
Please be patient a bit more 😄

Hi @VinceR, sorry for the long delay, too many thigns to do and not enought time. Just want to let you know that I have made a list of things I want to do/close in the next 4 months and this PR is part of the list. Not top priority, but it is there. Please be patient a bit more :smile:
Collaborator

No problem, @MicheleC. In the mean time, I will try to implement #177.

No problem, @MicheleC. In the mean time, I will try to implement #177.
Poster
Owner

No problem, @MicheleC. In the mean time, I will try to implement #177.

Sounds good (y)

> No problem, @MicheleC. In the mean time, I will try to implement #177. Sounds good (y)
Poster
Owner

@VinceR
fyi, I should be back to this PR either next week or the following one 😄

@VinceR fyi, I should be back to this PR either next week or the following one :smile:
Poster
Owner

@VinceR
let's get back to this PR and get things moving forward!!

Going back to this comment and this other one, as I proposed let's focus on the alternate sorting first, then we can look at the dictionary sorting which needs more work.

Could you create a new PR with only the code for alternate sorting? You can use this PR as the starting point and just use the part of code required for alternate sorting. Please use a different branch for the PR to keep things clean (gitea may behave strangely when the same branch is used for two different PRs).

I promise I will look at the new PR in a short time once ready, don't want to drag this another 6 months 😄

@VinceR let's get back to this PR and get things moving forward!! Going back to [this comment](https://mirror.git.trinitydesktop.org/gitea/TDE/tdebase/issues/196#issuecomment-12271) and [this other one](https://mirror.git.trinitydesktop.org/gitea/TDE/tdebase/issues/196#issuecomment-12314), as I proposed let's focus on the alternate sorting first, then we can look at the dictionary sorting which needs more work. Could you create a new PR with only the code for alternate sorting? You can use this PR as the starting point and just use the part of code required for alternate sorting. Please use a different branch for the PR to keep things clean (gitea may behave strangely when the same branch is used for two different PRs). I promise I will look at the new PR in a short time once ready, don't want to drag this another 6 months :smile:
Poster
Owner

I have rebased this PR on top of current master, so make sure you get the latest version of this branch before you start looking into it again. The contents are the same, just the commit hash are different.

I have rebased this PR on top of current master, so make sure you get the latest version of this branch before you start looking into it again. The contents are the same, just the commit hash are different.
MicheleC added the
PR/wip
label 3 months ago
MicheleC force-pushed reworked/feature/konq_listview++_14.1 from 9b28ca939f to 58e239bbb5 3 months ago
Collaborator

MicheleC: since my last message ~6 months ago, I have migrated my home systems from 14.0.x to 14.1 with all of changes made for this PR. I encountered no issues while actively leveraging the new listview features. So, I think the code is solid :)

Based on your concerns, I just pulled out the code that implements "Dictionary Order Sort" and gave it a quick test in a chrooted test environment to make sure that nothing else broke. I will push that change shortly.

There is one side effect of pulling out that feature: Turning off the "Group Hidden First" property may appear to do nothing due to the fact that 'dot' files will still sort together before files that start with an alphanumeric character. The difference will only be visible when sorting in reverse or when there are file names that start with a lower non-alphanumeric character (e.g. ' ', '!', … ,'+', '-'). Dictionary sort order did make the effects more pronounced.

MicheleC: since my last message ~6 months ago, I have migrated my home systems from 14.0.x to 14.1 with all of changes made for this PR. I encountered no issues while actively leveraging the new listview features. So, I think the code is solid :) Based on your concerns, I just pulled out the code that implements "Dictionary Order Sort" and gave it a quick test in a chrooted test environment to make sure that nothing else broke. I will push that change shortly. There is one side effect of pulling out that feature: Turning off the "Group Hidden First" property may appear to do nothing due to the fact that 'dot' files will still sort together before files that start with an alphanumeric character. The difference will only be visible when sorting in reverse or when there are file names that start with a lower non-alphanumeric character (e.g. ' ', '!', … ,'+', '-'). Dictionary sort order did make the effects more pronounced.
VinceR added 1 commit 3 months ago
Poster
Owner

Hi @VinceR,
good to hear you are on R14.1 and there are no issues, it will make integration easier since we work out of master when merging code.

Regarding the dictionaly sort, my concers are not for the functionality, but in the way it is implemented, that will need to be re looked at.

What I was suggesting was to leave this PR untouched for reference and create a separate PR only for the alternate sorting so we can merge that first, then later come back here for the dictionary sorting. What do you think?

Hi @VinceR, good to hear you are on R14.1 and there are no issues, it will make integration easier since we work out of master when merging code. Regarding the dictionaly sort, my concers are not for the functionality, but in the way it is implemented, that will need to be re looked at. What I was suggesting was to leave **this PR** untouched for reference and create a **separate PR** only for the alternate sorting so we can merge that first, then later come back here for the dictionary sorting. What do you think?
Collaborator

OK, so the new PR will contain the sort actions "Alternate Sort Order" (Ctrl+S) and "Reverse Sort Order" (Ctrl+R).

Can it also contain the switchable sort options "Group Directories First" and "Group Hidden First" or should those go into a separate (3rd) PR?

OK, so the new PR will contain the sort actions "Alternate Sort Order" (Ctrl+S) and "Reverse Sort Order" (Ctrl+R). Can it also contain the switchable sort options "Group Directories First" and "Group Hidden First" or should those go into a separate (3rd) PR?
Poster
Owner

OK, so the new PR will contain the sort actions "Alternate Sort Order" (Ctrl+S) and "Reverse Sort Order" (Ctrl+R).

Can it also contain the switchable sort options "Group Directories First" and "Group Hidden First" or should those go into a separate (3rd) PR?

Yes, that can be included in the same PR.

> OK, so the new PR will contain the sort actions "Alternate Sort Order" (Ctrl+S) and "Reverse Sort Order" (Ctrl+R). > > Can it also contain the switchable sort options "Group Directories First" and "Group Hidden First" or should those go into a separate (3rd) PR? > Yes, that can be included in the same PR.
Poster
Owner

PR #240 has been merged, we can now start looking into dictionary sorting.
This comment lists the issue with the original dictionary sorting code.
Point 3.c should be the first to be addressed, then we can look at how to implement it efficiently (3.a and 3.b).
@VinceR how shall we proceed on this? Happy to hear your suggestions.

PR #240 has been merged, we can now start looking into dictionary sorting. This [comment](https://mirror.git.trinitydesktop.org/gitea/TDE/tdebase/pulls/196#issuecomment-12271) lists the issue with the original dictionary sorting code. Point 3.c should be the first to be addressed, then we can look at how to implement it efficiently (3.a and 3.b). @VinceR how shall we proceed on this? Happy to hear your suggestions.
Collaborator

c. functionality: dictionary sorting is looking only at the first non-letter non-digit number for ordering the items. While this may cover most of the cases, it does not work for those cases where the difference is after the first character.

This is a reasonable point. I probably should have described the sort option "Ignore Leading Special Char" instead of the more ambitious "Dictionary Sort Order". There were reasons for that limited focus, including (for example) collating .Xresources next to Xresources when "Group Hidden First" is unchecked.

Before I start modifying TDE code, please take a look at the following code to see if I am on the right track:

#define IS_DICT_CHAR(c) (c.isLetterOrNumber()||c.isSpace())
#define PLACE_HOLDER_CHAR 1

TQString mkDictKey( TQString inKey )
{
  TQString outKey;
  uint keyLength = inKey.length();
  uint inKeyPos, outKeyPos = 0;
  outKey.reserve( keyLength * 2 );

  // Pass 1: Extract only dictionary relevant characters
  for( inKeyPos = 0; inKeyPos < keyLength; inKeyPos++ ) {
    TQChar curChar = inKey.at(inKeyPos);
    if ( IS_DICT_CHAR(curChar) ) {
      outKey += curChar;
    }
  }

  if ( outKey.length() == 0 ) {
    return(inKey);
  }

  /*
     The output key now contains only the dictionary characters from the
     input key. We extend that string with the non-dictionary characters
     from the input key interspersed with a neutral placeholder character
     for each dictionary character. Hopefully that will produce a unique
     output string that will collate predictably.
  */

  // Pass 2: Extract non-dictionary relevant characters + placeholders
  for( inKeyPos = 0; inKeyPos < keyLength; inKeyPos++ ) {
    TQChar curChar = inKey.at(inKeyPos);
    if ( IS_DICT_CHAR(curChar) ) {
      outKey += (TQChar) PLACE_HOLDER_CHAR ; // Append placeholder char
    }
    else {
      outKey += curChar; // Append non-dictionary character
    }
  }

  return outKey ;
}

The function can be called to pre-generate a sort key (like I did originally) or called on the fly in a sort comparison function.

> c. functionality: dictionary sorting is looking only at the first non-letter non-digit number for ordering the items. While this may cover most of the cases, it does not work for those cases where the difference is after the first character. This is a reasonable point. I probably should have described the sort option "Ignore Leading Special Char" instead of the more ambitious "Dictionary Sort Order". There were reasons for that limited focus, including (for example) collating .Xresources next to Xresources when "Group Hidden First" is unchecked. Before I start modifying TDE code, please take a look at the following code to see if I am on the right track: ``` #define IS_DICT_CHAR(c) (c.isLetterOrNumber()||c.isSpace()) #define PLACE_HOLDER_CHAR 1 TQString mkDictKey( TQString inKey ) { TQString outKey; uint keyLength = inKey.length(); uint inKeyPos, outKeyPos = 0; outKey.reserve( keyLength * 2 ); // Pass 1: Extract only dictionary relevant characters for( inKeyPos = 0; inKeyPos < keyLength; inKeyPos++ ) { TQChar curChar = inKey.at(inKeyPos); if ( IS_DICT_CHAR(curChar) ) { outKey += curChar; } } if ( outKey.length() == 0 ) { return(inKey); } /* The output key now contains only the dictionary characters from the input key. We extend that string with the non-dictionary characters from the input key interspersed with a neutral placeholder character for each dictionary character. Hopefully that will produce a unique output string that will collate predictably. */ // Pass 2: Extract non-dictionary relevant characters + placeholders for( inKeyPos = 0; inKeyPos < keyLength; inKeyPos++ ) { TQChar curChar = inKey.at(inKeyPos); if ( IS_DICT_CHAR(curChar) ) { outKey += (TQChar) PLACE_HOLDER_CHAR ; // Append placeholder char } else { outKey += curChar; // Append non-dictionary character } } return outKey ; } ``` The function can be called to pre-generate a sort key (like I did originally) or called on the fly in a sort comparison function.
Poster
Owner

Hi @VinceR
I think we first need to define exactly what we want to achieve with dictionary sorting. What is the objective?

  1. sort ignoring only leading special (non alphanumerical) characters?
  2. sort ignoring all special characters?
  3. others? please specify

I also think the name "dictionary sorting" is a bit unclear, so probably as you suggested "ignore special (laeading) characters" may be a better fit.

Once we define what we what to implement, we will be clearer on how to proceed. One key point is to avoid string concatenations as much as possible. Those are expensive operations, so adding one character at a time to a string is not recommended (which is what you were doing in the original solution and also in the code above).

@SlavekB
would also be good to hear you opinion on the objective

Hi @VinceR I think we first need to define exactly what we want to achieve with dictionary sorting. What is the objective? 1. sort ignoring _only_ leading special (non alphanumerical) characters? 2. sort ignoring _all_ special characters? 3. others? please specify I also think the name "dictionary sorting" is a bit unclear, so probably as you suggested "ignore special (laeading) characters" may be a better fit. Once we define what we what to implement, we will be clearer on how to proceed. One key point is to avoid string concatenations as much as possible. Those are expensive operations, so adding one character at a time to a string is not recommended (which is what you were doing in the original solution and also in the code above). @SlavekB would also be good to hear you opinion on the objective
Collaborator

Hi @VinceR
I think we first need to define exactly what we want to achieve with dictionary sorting. What is the objective?

  1. sort ignoring only leading special (non alphanumerical) characters?
  2. sort ignoring all special characters?
  3. others? please specify

Adding Objective 0. sort ignoring first leading special character.

My first cut was to implement objective (0) - an admittedly limited scope motivated partially by the desire to be able to ignore the '.' in dotfiles when collating. I interpreted your concern with this limited scope as suggesting we implement objective (2), so that's what I tried to do. I was not considering objective (1).

I also think the name "dictionary sorting" is a bit unclear, so probably as you suggested "ignore special (laeading) characters" may be a better fit.

The term "dictionary sorting" was inspired by the Gnu Coreutils sort option --dictionary-order.

Once we define what we what to implement, we will be clearer on how to proceed. One key point is to avoid string concatenations as much as possible. Those are expensive operations, so adding one character at a time to a string is not recommended (which is what you were doing in the original solution and also in the code above).

Of course I have not looked at TQt source code but I am honestly surprised that TQString & operator+= ( TQChar c ) would be so expensive in a preallocated string. How does one build a TQString efficiently one TQChar at a time?

@SlavekB
would also be good to hear you opinion on the objective

> Hi @VinceR > I think we first need to define exactly what we want to achieve with dictionary sorting. What is the objective? > 1. sort ignoring _only_ leading special (non alphanumerical) characters? > 2. sort ignoring _all_ special characters? > 3. others? please specify > Adding Objective 0. sort ignoring *first* leading special character. My first cut was to implement objective (0) - an admittedly limited scope motivated partially by the desire to be able to ignore the '.' in dotfiles when collating. I interpreted your concern with this limited scope as suggesting we implement objective (2), so that's what I tried to do. I was not considering objective (1). > I also think the name "dictionary sorting" is a bit unclear, so probably as you suggested "ignore special (laeading) characters" may be a better fit. > The term "dictionary sorting" was inspired by the Gnu Coreutils sort option **--dictionary-order**. > Once we define what we what to implement, we will be clearer on how to proceed. One key point is to avoid string concatenations as much as possible. Those are expensive operations, so adding one character at a time to a string is not recommended (which is what you were doing in the original solution and also in the code above). Of course I have not looked at TQt source code but I am honestly surprised that **TQString & operator+= ( TQChar c )** would be so expensive in a preallocated string. How does one build a TQString efficiently one TQChar at a time? > > @SlavekB > would also be good to hear you opinion on the objective
Owner

Hi @VinceR
I think we first need to define exactly what we want to achieve with dictionary sorting. What is the objective?

  1. sort ignoring only leading special (non alphanumerical) characters?
  2. sort ignoring all special characters?
  3. others? please specify

Adding Objective 0. sort ignoring first leading special character.

My first cut was to implement objective (0) - an admittedly limited scope motivated partially by the desire to be able to ignore the '.' in dotfiles when collating. I interpreted your concern with this limited scope as suggesting we implement objective (2), so that's what I tried to do. I was not considering objective (1).

Thank you, objective 2 seem like a good idea.

I also think the name "dictionary sorting" is a bit unclear, so probably as you suggested "ignore special (laeading) characters" may be a better fit.

The term "dictionary sorting" was inspired by the Gnu Coreutils sort option --dictionary-order.

Thank you. From the name Dictionary I first didn't understand what it means. If we manage to find the name that will be more obvious, it would be good. If not, here it is good that it has the same name as in other tools for sorting.

Once we define what we what to implement, we will be clearer on how to proceed. One key point is to avoid string concatenations as much as possible. Those are expensive operations, so adding one character at a time to a string is not recommended (which is what you were doing in the original solution and also in the code above).

Of course I have not looked at TQt source code but I am honestly surprised that TQString & operator+= ( TQChar c ) would be so expensive in a preallocated string. How does one build a TQString efficiently one TQChar at a time?

I assume this task can be good job for regular expression.

@SlavekB
would also be good to hear you opinion on the objective

> > Hi @VinceR > > I think we first need to define exactly what we want to achieve with dictionary sorting. What is the objective? > > 1. sort ignoring _only_ leading special (non alphanumerical) characters? > > 2. sort ignoring _all_ special characters? > > 3. others? please specify > > > > Adding Objective 0. sort ignoring *first* leading special character. > > My first cut was to implement objective (0) - an admittedly limited scope motivated partially by the desire to be able to ignore the '.' in dotfiles when collating. I interpreted your concern with this limited scope as suggesting we implement objective (2), so that's what I tried to do. I was not considering objective (1). > Thank you, objective 2 seem like a good idea. > > I also think the name "dictionary sorting" is a bit unclear, so probably as you suggested "ignore special (laeading) characters" may be a better fit. > > > > The term "dictionary sorting" was inspired by the Gnu Coreutils sort option **--dictionary-order**. > Thank you. From the name _Dictionary_ I first didn't understand what it means. If we manage to find the name that will be more obvious, it would be good. If not, here it is good that it has the same name as in other tools for sorting. > > Once we define what we what to implement, we will be clearer on how to proceed. One key point is to avoid string concatenations as much as possible. Those are expensive operations, so adding one character at a time to a string is not recommended (which is what you were doing in the original solution and also in the code above). > > Of course I have not looked at TQt source code but I am honestly surprised that **TQString & operator+= ( TQChar c )** would be so expensive in a preallocated string. How does one build a TQString efficiently one TQChar at a time? > > I assume this task can be good job for regular expression. > > @SlavekB > > would also be good to hear you opinion on the objective > >
Poster
Owner

Hi @VinceR,
Objective 2 seems good to me too.
TQString copies over the existing string and add the new characters when operator += is used. You can check the code if you want to see the details.
To get a proper string order without doing too much copying operations, you could preallocate enough bytes for the full strings and build up a required key string on the spot, this way you only copy characters onces.
Even better would be to find a numerical growing function that you can build by only reading the string without copying anything.
I will think a bit more on the topic and see if I can find a good idea. Will write back in a few days, hopefully with some smart ideas :-)

Hi @VinceR, Objective 2 seems good to me too. TQString copies over the existing string and add the new characters when operator += is used. You can check the code if you want to see the details. To get a proper string order without doing too much copying operations, you could preallocate enough bytes for the full strings and build up a required key string on the spot, this way you only copy characters onces. Even better would be to find a numerical growing function that you can build by only reading the string without copying anything. I will think a bit more on the topic and see if I can find a good idea. Will write back in a few days, hopefully with some smart ideas :-)
Poster
Owner

Hi @VinceR,
I spent some time to think about an effective solution and I think I found one that does not require copy any character at all, just reading.
To compared two entries we will do a smart char to char comparison, with up to two passes.

In the first pass, you compared only standard characters (letters, numbers and possibly underscore, space and dash since they are commonly used to separate words) and you skip over any other special char. For example:

aab   < aac  : trivial
aa~b  < aac  : the ~ is ignored and 'b' will be compared against 'c'
a#b~c < ab^c : 'b' compares with 'b' and 'c' with 'c'

You can start from the beginning of the name and move one char at a time, but skip over special chars. As soon as you detect a string is < or > of the other, comparison stops, no need to continue till the end of the string. Also there is no need to execute the second pass in this case.

If the two strings are equal at the end of the first pass (same number of normal chars and same chars), then you execute the second pass, where you compare only the special chars of the two strings. The comparison process is the same as the first pass, but looking at the special chars instead. For example:

a@b < a~b : trivial
a@ab < aa~b : first pass will give identical strings, second pass '@' compares against '~'

This way we can compare strings in "dictionary order" without having to do any writing at all.
If you have any question or if the explanation is not clear, let me know and I will be happy to provide more info.

Hi @VinceR, I spent some time to think about an effective solution and I think I found one that does not require copy any character at all, just reading. To compared two entries we will do a smart char to char comparison, with up to two passes. In the first pass, you compared only standard characters (letters, numbers and possibly underscore, space and dash since they are commonly used to separate words) and you skip over any other special char. For example: ``` aab < aac : trivial aa~b < aac : the ~ is ignored and 'b' will be compared against 'c' a#b~c < ab^c : 'b' compares with 'b' and 'c' with 'c' ``` You can start from the beginning of the name and move one char at a time, but skip over special chars. As soon as you detect a string is < or > of the other, comparison stops, no need to continue till the end of the string. Also there is no need to execute the second pass in this case. If the two strings are equal at the end of the first pass (same number of normal chars and same chars), then you execute the second pass, where you compare only the special chars of the two strings. The comparison process is the same as the first pass, but looking at the special chars instead. For example: ``` a@b < a~b : trivial a@ab < aa~b : first pass will give identical strings, second pass '@' compares against '~' ``` This way we can compare strings in "dictionary order" without having to do any writing at all. If you have any question or if the explanation is not clear, let me know and I will be happy to provide more info.
Collaborator

I am sorry for delay in responding but I had to take a "Covid" break over the holidays :(

First to respond to SlavekB

Thank you. From the name Dictionary I first didn't understand what it means. If we manage to find the name that will be more obvious, it would be good. If not, here it is good that it has the same name as in other tools for sorting.

I have to agree that the term "Dictionary Order" may not be very meaninful to most people nowadays. Maybe the proposed sort option would be better named "Alphanumeric sort" or "Ignore special characters".

Now to respond to solution sketched by MicheleC

I have been assuming that pre-constructing a sort key associated with a file name would be more efficient because it would only need to be done once (in KonqListViewItem::updateContents) and reused in the more frequently invoked comparison function (KonqBaseListViewItem::compare) which ends up calling the TQString::localeAwareCompare function on the generated keys to do the final comparison.

Your method suggests bypassing construction of a sort key and relying only on the comparison function to accomplish things. Based on the statements in https://unicode-org.github.io/icu/userguide/collation/concepts#sortkeys-vs-comparison, it would appear that (contrary to my intuition) this general approach would be more efficient. Also implementation of "natural sort" as suggested by PR #177 might not be so amenable to a pre-generated sort key approach.

However: what type of comparison function would you suggest using for each pair of characters? If you use the comparison operators '==', '>' or '<', you would not being doing the locale-aware comparison that users might be expecting. Even if TQChar had an appropriate comparison function, I would be suspicious of it because of how surrogate character pairs would be handled.

Maybe the solution is to allocate an array of TQChar, build it, and convert it to a TQString before returning result. I will play around with that and get back to you.

I am sorry for delay in responding but I had to take a "Covid" break over the holidays :( First to respond to SlavekB >Thank you. From the name Dictionary I first didn't understand what it means. If we manage to find the name that will be more obvious, it would be good. If not, here it is good that it has the same name as in other tools for sorting. I have to agree that the term "Dictionary Order" may not be very meaninful to most people nowadays. Maybe the proposed sort option would be better named "Alphanumeric sort" or "Ignore special characters". Now to respond to solution sketched by MicheleC I have been assuming that pre-constructing a sort key associated with a file name would be more efficient because it would only need to be done once (in KonqListViewItem::updateContents) and reused in the more frequently invoked comparison function (KonqBaseListViewItem::compare) which ends up calling the TQString::localeAwareCompare function on the generated keys to do the final comparison. Your method suggests bypassing construction of a sort key and relying only on the comparison function to accomplish things. Based on the statements in https://unicode-org.github.io/icu/userguide/collation/concepts#sortkeys-vs-comparison, it would appear that (contrary to my intuition) this general approach would be more efficient. Also implementation of "natural sort" as suggested by PR https://mirror.git.trinitydesktop.org/gitea/TDE/tdebase/issues/177 might not be so amenable to a pre-generated sort key approach. However: what type of comparison function would you suggest using for each pair of characters? If you use the comparison operators '==', '>' or '<', you would not being doing the locale-aware comparison that users might be expecting. Even if TQChar had an appropriate comparison function, I would be suspicious of it because of how surrogate character pairs would be handled. Maybe the solution is to allocate an array of TQChar, build it, and convert it to a TQString before returning result. I will play around with that and get back to you.
Poster
Owner

Hi @VinceR,

I have been assuming that pre-constructing a sort key associated with a file name would be more efficient because it would only need to be done once

with or without a sort key, you need to compare strings. The steps to do the comparison are the same, the only difference would be what you would be comparing (string or key). In case there are no special characters, key and sort key would be the same string. Building a sort key takes more time and resources, so it is not as efficient in this context. And an additional problem would be to keep the sort keys in sync with the strings to sort.

What type of comparison function would you suggest using for each pair of characters?

You can use operator[] to get unicode TQChar from the strings, then use operator< for comparison. I also recommend you look at a stable sort algorithm, so if two strings match, their relative order remains unchanged - see here for a bried explanation of the differences between the two methods.
TQString supports local enconding through TQTextCodec (see TQTextCodec class), so there shouldn't be any problem with locale awareness. It would be no difference from what is already implemented in Konqueror at the moment.

Maybe the solution is to allocate an array of TQChar, build it, and convert it to a TQString before returning result. I will play around with that and get back to you.

This once again would not be efficient. Keep in mind we may have to sort folders with thousand on files inside, so the more efficiente the solution, the better.

Hi @VinceR, > I have been assuming that pre-constructing a sort key associated with a file name would be more efficient because it would only need to be done once with or without a sort key, you need to compare strings. The steps to do the comparison are the same, the only difference would be what you would be comparing (string or key). In case there are no special characters, key and sort key would be the same string. Building a sort key takes more time and resources, so it is not as efficient in this context. And an additional problem would be to keep the sort keys in sync with the strings to sort. > What type of comparison function would you suggest using for each pair of characters? You can use operator[] to get unicode TQChar from the strings, then use operator< for comparison. I also recommend you look at a stable sort algorithm, so if two strings match, their relative order remains unchanged - see [here](https://www.baeldung.com/cs/stable-sorting-algorithms) for a bried explanation of the differences between the two methods. TQString supports local enconding through TQTextCodec (see TQTextCodec class), so there shouldn't be any problem with locale awareness. It would be no difference from what is already implemented in Konqueror at the moment. > Maybe the solution is to allocate an array of TQChar, build it, and convert it to a TQString before returning result. I will play around with that and get back to you. This once again would not be efficient. Keep in mind we may have to sort folders with thousand on files inside, so the more efficiente the solution, the better.
Collaborator

Hi MicheleC,

At the risk of sounding stubborn and/or dense :) … for the sake of argument, let's ignore the current task at hand (alphanumeric-centric comparison) and the pros and cons of pre-computing & storing a custom comparison key versus deferring everything to a sort-called comparison function.

I am concerned that your suggested method of comparing character-by-character would effectively force all filename sorting to LC_COLLATE="C" or to use older terminology, ASCII sort order. To quote from TQT docs:

int operator> ( TQChar c1, TQChar c2 )
Returns TRUE if the numeric Unicode value of c1 is greater than that of c2; otherwise returns FALSE.

I don't see how TQTextCodec would help. How would a character-by-character comparison method guarantee:

  • "graße" sorts before "grate" when LC_COLLATE="de_DE.utf8"
  • "graße" sorts after "grate" when LC_COLLATE="C"
  • "jet" sorts before "yet" when LC_COLLATE="en_US.utf8"
  • "jet" sorts after "yet" when LC_COLLATE="lt_LT.utf8"

Currently TDE Konqueror tries to honor locale-specific sorting, albeit with what I consider to be a bug (soon to be reported). It does so by using TQString::localeAwareCompare() on whole strings. There is no comparable function for TQChar.

What am I overlooking here?

Hi MicheleC, At the risk of sounding stubborn and/or dense :) … for the sake of argument, let's ignore the current task at hand (alphanumeric-centric comparison) and the pros and cons of pre-computing & storing a custom comparison key versus deferring everything to a sort-called comparison function. I am concerned that your suggested method of comparing character-by-character would effectively force all filename sorting to LC_COLLATE="C" or to use older terminology, ASCII sort order. To quote from TQT docs: >int operator> ( TQChar c1, TQChar c2 ) Returns TRUE if the numeric Unicode value of c1 is greater than that of c2; otherwise returns FALSE. I don't see how TQTextCodec would help. How would a character-by-character comparison method guarantee: * "graße" sorts before "grate" when LC_COLLATE="de_DE.utf8" * "graße" sorts after "grate" when LC_COLLATE="C" * "jet" sorts before "yet" when LC_COLLATE="en_US.utf8" * "jet" sorts after "yet" when LC_COLLATE="lt_LT.utf8" Currently TDE Konqueror tries to honor locale-specific sorting, albeit with what I consider to be a bug (soon to be reported). It does so by using TQString::localeAwareCompare() on whole strings. There is no comparable function for TQChar. What am I overlooking here?
This pull request has changes conflicting with the target branch.
konqueror/listview/konq_listview.cpp
konqueror/listview/konq_listview.kcfg
konqueror/listview/konq_treeview.rc
libkonq/konq_propsview.h
konqueror/listview/konq_detailedlistview.rc
konqueror/listview/konq_listview.h
konqueror/listview/konq_listviewitems.cpp
konqueror/listview/konq_textview.rc
libkonq/konq_propsview.cpp
konqueror/listview/konq_infolistview.rc
Sign in to join this conversation.
No reviewers
No Milestone
No Assignees
4 Participants
Notifications
Due Date

No due date set.

Dependencies

This pull request currently doesn't have any dependencies.

Loading…
There is no content yet.