WIP: Extend meaning of "Hidden Files" (PR version 3) #179

Introduce TDEStringMatcher, a general purpose iterative string matching class. Additional information about the class can be found in tdecore/README.tdestringmatcher. Files involved:

tdecore/tdestringmatcher.cpp # new
tdecore/tdestringmatcher.h # new
tdecore/README.tdestringmatcher # new
tdecore/CMakeLists.txt

Introduce the concept of a hidden file matcher, an instance of the TDEStringMatcher class that can be used to extend the definition of "hidden" files beyond traditional "dotfiles".
Create a global default instance of a hidden file matcher for use in applications that don't need their own. Files involved:

tdecore/tdeglobal.cpp
tdecore/tdeglobal.h

Modify KFileItem class to utilize a hidden file matcher when evaluating whether or not a filesystem object is hidden. Files involved:

tdeio/tdeio/tdefileitem.cpp
tdeio/tdeio/tdefileitem.h

This commit partially addresses issue TDE/tdebase#270. Additional changes to tdelibs, tdebase/libkonq, and tdebase/konqueror will be required to resolve that issue.

Considerations for reviewers:

This code, along with additional code not in this commit, has been successfully tested in a chroot environment running konqueror. To the best of my knowledge, the code successfully addresses issue # 270 and manifests no bugs or noticeable performance penalty.
The code submitted for review has numerous trace statements that have been essential for verification and debugging. These trace statements start with a defined macro TSMTRACE and should be ignored when reviewing - they will be removed when final testing is complete.
Recent changes introduced TDEStringMatcher signals and associated KFileItem slots. The changes are implemented by #define TSMSIGNALS. After further testing, the associated preprocessor conditional directives will be removed.

@MicheleC,

This is in regard to this conversation, in which the merits of using associated private classes were discussed.

As you will see from the TDEStringMatcher code, I did follow your advice and put TQString patternString into a private class instead of declaring it as private: property in the main class. But on second look, this looks contrived and perhaps pointless given how small the private class is. What do you think about this?
As you will see from the KFileItem code, I have added 2 protected class variables (one boolean and one object pointer). I would definitely agree that both of these should be in an associated private class. The problem is that the KFileItemPrivate class seems to be special purpose and instantiated only conditionally in various methods. Should I create another private class, use KFileItemPrivate but allocate it early in the KFileItem constructor, or just leave things as they are?

@VinceR
Thanks for the new PR. FYI, I expect to have a close look at this next weekend. The week ahead is busy (again) and the time available for TDE will be devoted to finalizing R14.0.13, whose code base will be frozen this Friday.

This PR replaces PR #178

First part of the review. Still 3 files to do.

Second and final part of the review.

> As you will see from the TDEStringMatcher code, I did follow your advice and put TQString patternString into a private class instead of declaring it as private: property in the main class. But on second look, this looks contrived and perhaps pointless given how small the private class is. What do you think about this?

I agree with you and as we discussed in #170, we don't need to use a private class for this case.

> As you will see from the KFileItem code, I have added 2 protected class variables (one boolean and one object pointer). I would definitely agree that both of these should be in an associated private class. The problem is that the KFileItemPrivate class seems to be special purpose and instantiated only conditionally in various methods. Should I create another private class, use KFileItemPrivate but allocate it early in the KFileItem constructor, or just leave things as they are?

Just leave things as they are. KFileItem has a lot of other private members, so the two new variables fit well where you placed them.

Overall the PR is good. Most of the comments are related to either minor improvements or some mistakes that can be easily fixed.

There is one point we should discuss, together with @SlavekB, which is how we specify the patterns. Your solution has some pros but the pattern specification is quite convoluted (although ingenious). I had a different, more traditional idea in mind, so I will list the pros and cons of both ideas as I see them and then we can discuss all three together.

Solution in this PR

Pros:

1. allows specifying multiple patterns in a compact way. This is especially good for saving/restoring the patterns from a config file.

Cons:

1. pattern specification is quite convoluted; a user needs to read the documentation to understand how to create a pattern.
2. requires more work to translate from pattern string to GUI options when we need to create a GUI for the user to change the pattern.
3. there is no way to modify the existing pattern list without having to specify all the existing patterns again. Basically there is no addPattern, modifyPattern or removePattern method available. Of course they could be added.
4. need to maintain two independent variables for the patterns (the string and the list of TQRegExp) and make sure they are always in sync.

Traditional solution

Uses only a list of TQRegExp objects, without the pattern string. Patterns are appended/removed from the list as needed, by calling appropriate methods.

Pros:

1. simpler concept, much easier to grasp for normal users
2. should allow easier 1-to-1 correspondence with a GUI dialog
3. the TQRegExp list is the reference for the patterns. No second entry to update and keep in sync.

Cons:

1. requires more work to save/restore the list from a config file.
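
To make the list-only idea concrete, here is a minimal sketch, not code from this PR. It uses `std::string` and `std::regex` as stand-ins for TQString and TQRegExp, and the class name `ListMatcher` and its `count()` helper are hypothetical. Only the method names `addPattern`, `removePattern` and `matchAny` are taken from this discussion, and their exact behaviour here is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical sketch of the list-only approach; not code from this PR.
// std::string and std::regex stand in for TQString and TQRegExp.
class ListMatcher
{
public:
    // Append a pattern unless an identical one is already stored.
    // Returns false for a duplicate. Throws std::regex_error if the
    // pattern does not compile; the list is then left unchanged.
    bool addPattern(const std::string &pattern)
    {
        if (locate(pattern) != m_entries.end())
            return false;
        m_entries.push_back(Entry{pattern, std::regex(pattern)});
        return true;
    }

    // Remove a pattern by its source text. Returns false if it is not stored.
    bool removePattern(const std::string &pattern)
    {
        auto it = locate(pattern);
        if (it == m_entries.end())
            return false;
        m_entries.erase(it);
        return true;
    }

    // True if at least one stored pattern matches the whole text.
    bool matchAny(const std::string &text) const
    {
        return std::any_of(m_entries.begin(), m_entries.end(),
                           [&text](const Entry &e) { return std::regex_match(text, e.compiled); });
    }

    // Number of stored patterns.
    std::size_t count() const { return m_entries.size(); }

private:
    // std::regex cannot report its source text, so each entry keeps it
    // next to the compiled form. The list stays the single reference.
    struct Entry
    {
        std::string source;
        std::regex compiled;
    };

    std::vector<Entry>::iterator locate(const std::string &pattern)
    {
        return std::find_if(m_entries.begin(), m_entries.end(),
                            [&pattern](const Entry &e) { return e.source == pattern; });
    }

    std::vector<Entry> m_entries;
};
```

Saving to a config file would then mean writing one entry per stored pattern, which is the extra work listed under Cons.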

@VinceR feel free to add pros/cons to both solutions as you see them from your point of view. We can then ask @SlavekB what he thinks about both to have a third opinion.

Other than that, as I said at the top, this PR is mostly ready. Well done.

Before I deal with the individual conversation items, I think it is worthwhile to discuss interface design and why I decided to approach things the way I did. I see three separate but related components that are required for a complete solution.

1. Core string matching engine, currently implemented in TDEStringMatcher, includes:
   - List of match patterns and associated options. Currently implemented as TQPtrList<TQRegExp> patternList
   - Functions to manage items in the list. Currently implemented with the single all-at-once function generatePatternList()
   - Functions to perform string matching operations against the list. Currently implemented with matchAny() and matchAll()
2. Methods for converting a match pattern list to and from a string suitable for storage. Currently integrated into core string matching engine with respective functions getPatternString() and generatePatternList().
3. A UI for modifying the match pattern list, either temporarily or permanently. This was formerly integrated with the core string matching engine but was moved to a separate class TDEStringMatcherUI (to be discussed / reviewed later).

I thought that it was important to integrate standardized conversion functions (component # 2) into the core string matching engine. This makes it very easy for applications that need to store and retrieve match settings. They do not need to know anything about patternString syntax to initialize a TDEStringMatcher.

The UI does need to be concerned with details about match patterns and their corresponding options. With the current implementation, it needs to generate a dialog either with direct processing of match pattern list items, or by decoding patternString. On the other hand, a user that uses the UI will never need to understand patternString syntax.

The current patternString syntax is designed to permit the future addition of other match options. It also permits (but does not require) a more compact (less redundant) specification of pattern options. I would also argue that it is quite easy to understand once explained. It is certainly simpler than understanding regex syntax.

Now in response to your pros & cons:

> Solution in this PR
>
> Pros:
>
> 1. allows specifying multiple patterns in a compact way. This is especially good for saving/restoring the patterns from a config file.

I consider this to be critical for the original purpose: hidden file matching. It's all going to start and end with config file settings.

Of course we could move this type of code back to applications & views, letting each one do things its own way. But I personally prefer some uniformity so when we look at different configurations, we only need to know one set of rules.

> Cons:
>
> 1. pattern specification is quite convoluted; a user needs to read the documentation to understand how to create a pattern.

Users don't need to read anything if they use the UI. They will if they want to edit configuration files directly, but we don't encourage that anyway.

> 2. requires more work to translate from pattern string to GUI options when we need to create a GUI for the user to change the pattern.

Not so much work. TDEStringMatcherUI can present a simplified view of "patterns" & "options" in a way that does not require knowledge of TQRegExp properties.

> 3. there is no way to modify the existing pattern list without having to specify all the existing patterns again. Basically there is no addPattern, modifyPattern or removePattern method available. Of course they could be added.

For the relatively simple case of hidden file matching, I think setting all patterns at once is sufficient. But you are correct, those could be added for more complex cases. We could also add a setPatterns() for replacing the entire patternList with the content of another TQPtrList<TQRegExp>.

If these additional pattern list modification functions are added, it will become necessary to make getPatternString() actually decode the patternList into a corresponding patternString.

> 4. need to maintain two independent variables for the patterns (the string and the list of TQRegExp) and make sure they are always in sync.

With the current implementation, that's not an issue. The string is simply a copy of the one used in the last successful call to generatePatternList(). Or, if the suggested additional pattern list modification functions are added, the string becomes a cached copy of the result of the most recent call to getPatternString().

> Before I deal with the individual conversation items,

Let me think a bit longer on this and the pattern string thing. Will write back in a day or two.

> Before I deal with the individual conversation items,

After further consideration, I think your approach is fine, although we should make a couple of refinements.

1. Make the options non-persistent across patterns. This means every pattern will have its own options, like "/ow/p.*/ow/pa?c.txt" for example.
   The reason is simply to make pattern changes easier. A user may want to add patterns to, or remove them from, the existing ones, so by making each pattern independent of the others, those operations become much simpler. Otherwise it would be necessary each time to parse the whole string, keep track of the options and adapt the string as required.
2. Add two methods, addPattern and removePattern, to append/remove an individual option+pattern to/from the existing pattern list.
   Btw, the setPatterns() function that you proposed would be the same as generatePatternList(), so we could as well rename generatePatternList() to setPatternString(), which would also match the corresponding getPatternString() getter method.
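
As a rough illustration of the non-persistent variant, the sketch below parses a string such as "/ow/p.*/ow/pa?c.txt". It is not the PR's parser. It assumes that the first character is the splitter, that each part starts with `o` (option spec) or `p` (pattern spec), and that an option spec applies only to the next pattern spec. `Piece` and `parsePatternString` are hypothetical names, `std::string` stands in for TQString, and the option characters are kept as text without being interpreted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One pattern together with the option characters that apply to it alone.
struct Piece
{
    std::string options;
    std::string pattern;
};

// Hypothetical parser for strings such as "/ow/p.*/ow/pa?c.txt".
// Assumptions: the first character is the splitter; a part starting with
// 'o' holds options for the NEXT 'p' part only; other parts are skipped.
std::vector<Piece> parsePatternString(const std::string &spec)
{
    std::vector<Piece> pieces;
    if (spec.size() < 2)
        return pieces;

    const char splitter = spec[0];
    std::string pendingOptions;
    std::size_t start = 1;
    while (start <= spec.size()) {
        std::size_t end = spec.find(splitter, start);
        if (end == std::string::npos)
            end = spec.size();
        const std::string part = spec.substr(start, end - start);
        if (!part.empty() && part[0] == 'o') {
            pendingOptions = part.substr(1);
        } else if (!part.empty() && part[0] == 'p') {
            pieces.push_back(Piece{pendingOptions, part.substr(1)});
            pendingOptions.clear(); // non-persistent: options do not carry over
        }
        start = end + 1;
    }
    return pieces;
}
```

With this rule a pattern that has no option spec in front of it simply gets an empty option set, whatever the earlier patterns used.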

What do you think?

> make the options non-persistent across patterns. This means every pattern will have its own options, like "/ow/p.*/ow/pa?c.txt" for example.

There is nothing that prohibits a pattern string from being like that, so let's consider the ramifications of making this mandatory:

1. Do we therefore require that a patternString always have an even number of parts (option specs followed by pattern specs)?
2. Do we therefore require that all known options be specified for each pattern, or can we assume that the absence of an option specification has a fixed default (e.g. the TQRegExp default setting)?
3. What happens when we implement a new option (with new option characters)?

I think part of your concern may come from thinking that the UI will have to screw around directly with an arcane patternString. I don't think that needs to be the case. The UI can deal directly with a copy of patternList (or an array derived from that) and communicate the results back to the hidden file matcher.

> A user may want to add/remove additional patterns to the existing ones
>
> add two methods addPattern and removePattern to append/remove an individual option+pattern to the existing pattern lists

I would be reluctant for a UI to have direct access to protected member patternList. If it is a copy that is being processed, then certainly those methods could be implemented within the UI.

While I have been implementing suggestions in this PR, I also made some other changes to the code to support a UI:

typedef TQValueList<TQRegExp> PatternList;
  // Use typedef to help economy of expression
  // It's now a value list, not a pointer list
PatternList getPatternList();
  // Return a copy of m_patternList, intended for use by UI
TQString generatePatternString( PatternList& patternList );
  // Reverse of generatePatternList()
  // Intended for use by a UI on its own copy after user modifications as follows:
  // generatePatternList( generatePatternString( UIPatternListCopy ) )
PatternList m_patternList;
  // main pattern store, renamed as recommended

> Btw, the setPatterns() function that you proposed would be the same as generatePatternList(), so we could as well rename generatePatternList() to setPatternString() which would also match with the corresponding getPatternString() getter method.

I have been using the generate* naming because it connotes a process of transforming one type of entity (e.g. a string) into another (e.g. a list of TQRegExp). To me, set* and get* connote copying of variables of the same type. I guess I could use names like patternString2List and patternList2String instead of generate*.

> 1. Do we therefore require that a patternString always have an even number of parts (option specs followed by pattern specs)?

Yes, each "piece" of the pattern would have options + pattern string.

> 2. Do we therefore require that all known options be specified for each pattern or can we assume that the absence of an option specification has a fixed default (e.g. the TQRegExp default setting).

No, it can be handled as it is now: only the options to be activated are specified, while the inactive ones are omitted. The only difference is that we will need to repeat the options for each "piece" of the string if needed.

> 3. What happens when we implement a new option (with new option characters)?

If we specify the option in the "piece", it will be set, otherwise there is no need to specify it. So all existing patterns would keep working.

> I think part of your concern may come from thinking that the UI will have to screw around directly with an arcane patternString.

The UI shouldn't have to do anything too complicated. Just add or remove "pieces" to the pattern string, or replace the string with a new one at once in the worst case.

> I would be reluctant for a UI to have direct access to protected member patternList.

The UI won't have access to the internals of the implementation. You may have a string like (simplified, not literal) pattern1:pattern2:pattern3 and the UI may say removePattern(pattern2) and addPattern(pattern4), resulting in the string pattern1:pattern3:pattern4. By making each piece independent of the preceding and following pieces of the pattern, such operations are much easier to implement and less error-prone.
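
A minimal sketch of why independent pieces make this easy, using the simplified `pattern1:pattern2:pattern3` notation from above. The function names `splitPieces`, `joinPieces`, `addPiece` and `removePiece` are hypothetical, `std::string` stands in for TQString, and the `:` separator is only the illustrative one from this comment, not a proposal for the real format.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split a pattern string into its pieces, skipping empty ones.
std::vector<std::string> splitPieces(const std::string &s, char sep)
{
    std::vector<std::string> pieces;
    std::size_t start = 0;
    while (start < s.size()) {
        std::size_t end = s.find(sep, start);
        if (end == std::string::npos)
            end = s.size();
        if (end > start)
            pieces.push_back(s.substr(start, end - start));
        start = end + 1;
    }
    return pieces;
}

// Join pieces back into one string with the separator between them.
std::string joinPieces(const std::vector<std::string> &pieces, char sep)
{
    std::string out;
    for (std::size_t i = 0; i < pieces.size(); ++i) {
        if (i > 0)
            out += sep;
        out += pieces[i];
    }
    return out;
}

// Append a piece unless it is already present.
std::string addPiece(const std::string &patternString, const std::string &piece, char sep = ':')
{
    std::vector<std::string> pieces = splitPieces(patternString, sep);
    if (std::find(pieces.begin(), pieces.end(), piece) == pieces.end())
        pieces.push_back(piece);
    return joinPieces(pieces, sep);
}

// Remove every occurrence of a piece; the other pieces are untouched.
std::string removePiece(const std::string &patternString, const std::string &piece, char sep = ':')
{
    std::vector<std::string> pieces = splitPieces(patternString, sep);
    pieces.erase(std::remove(pieces.begin(), pieces.end(), piece), pieces.end());
    return joinPieces(pieces, sep);
}
```

Neither operation needs to know which options were active for the neighbouring pieces, which is the point of making each piece self-contained.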

> I have been using the generate* naming because it connotes a process of transforming one type of entity (e.g. a string) into another (e.g. a list of TQRegExp).

I highly suggest using get/set names when two functions work in reverse for the same functionality. Getters and setters are pretty much standard practice when it comes to getting or setting values. They don't simply copy the values: they can validate the fields, reject wrong values and encapsulate the inner variables.

EDIT: having said that, it is just a suggestion, so feel free to keep current names if you wish.
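
To illustrate the point about setters doing more than copying, here is a small hedged sketch. The class `ValidatingMatcher` is hypothetical, `std::regex` stands in for TQRegExp, and the `:` separator is the simplified one used earlier in this comment rather than the real pattern string syntax. The setter compiles every piece first and only commits if all of them are valid, so a rejected string leaves the previous state intact.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of a validating setter/getter pair; not code from this PR.
class ValidatingMatcher
{
public:
    // Setter: returns false and changes nothing if any piece fails to compile.
    bool setPatternString(const std::string &patternString)
    {
        std::vector<std::regex> compiled;
        std::size_t start = 0;
        while (start <= patternString.size()) {
            std::size_t end = patternString.find(':', start);
            if (end == std::string::npos)
                end = patternString.size();
            const std::string piece = patternString.substr(start, end - start);
            if (!piece.empty()) {
                try {
                    compiled.emplace_back(piece);
                } catch (const std::regex_error &) {
                    return false; // reject; members below are never touched
                }
            }
            start = end + 1;
        }
        m_patternString = patternString;
        m_compiled = std::move(compiled);
        return true;
    }

    // Getter: the string accepted by the last successful setter call.
    const std::string &getPatternString() const { return m_patternString; }

    // True if at least one compiled pattern matches the whole text.
    bool matchAny(const std::string &text) const
    {
        return std::any_of(m_compiled.begin(), m_compiled.end(),
                           [&text](const std::regex &re) { return std::regex_match(text, re); });
    }

private:
    std::string m_patternString;
    std::vector<std::regex> m_compiled;
};
```

Because the string and the compiled list are only ever replaced together, they cannot drift out of sync.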

Continuing the conversation on pattern string format:

I am striving to put all of the logic for encoding / decoding pattern strings exclusively into the TDEStringMatcher implementation.

- Applications should only need to get a pattern string from the object, store/retrieve it to/from disk, and send the string back to the object for (re)building the internal pattern list.
- The UI can be provided a list or array of either TQRegExp objects or of structs containing just the information necessary to create its dialog.
- Users who really want to edit configuration files can consult the readme, and if they mess things up, they can fix things in the UI.

That said, I am not against your idea of formatting the pattern string as option spec, pattern spec, [option spec, pattern spec, …]. Doing so would eliminate the need to explicitly prefix each specification with o or p. We can debate the merits of whether or not each option spec should be fully populated with known option characters, or if not, whether "missing" option characters are inferred from a previous option spec versus a standard default.

In my original implementation, applications could explicitly set the character to be used for splitting up the pattern string. In the current implementation, the 1st character of the pattern string determines the splitter.

I am now considering making the splitter a fixed character that satisfies the following characteristics:

- Character must be one that could reasonably be prohibited from use in a match pattern.
- TDE won't try to backslash the character when storing in a configuration file.
- Character needs to be visible and selectable in various text editors and less.
- Presence of character in text files won't cause them to be treated as binary files.

The control character Vertical Tab (0x0B) looks like a good candidate:

- It appears as ^K in less, nano, vi, and emacs.
- It appears as the I-Don't-Know-How-To-Display-That rectangle in kwrite and kate.
- It appears as the symbol VT in geany.
- I have verified that strings containing it can be stored & retrieved without alteration in a TDE config file.

What do you think of the idea generally (standard separator character) and of my proposed choice of Vertical Tab as that character?
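
For discussion purposes, a fixed separator combined with the alternating layout (option spec, pattern spec, option spec, pattern spec, …) could be encoded and decoded roughly as below. This is only a sketch under those assumptions. The names `kSeparator`, `Piece`, `encode` and `decode` are hypothetical, and `std::string` stands in for TQString.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

constexpr char kSeparator = '\v'; // Vertical Tab, 0x0B

// (option spec, pattern spec)
using Piece = std::pair<std::string, std::string>;

// Build "options VT pattern VT options VT pattern ...".
// Neither field may contain the separator itself.
std::string encode(const std::vector<Piece> &pieces)
{
    std::string out;
    bool first = true;
    for (const Piece &p : pieces) {
        assert(p.first.find(kSeparator) == std::string::npos);
        assert(p.second.find(kSeparator) == std::string::npos);
        if (!first)
            out += kSeparator;
        out += p.first;
        out += kSeparator;
        out += p.second;
        first = false;
    }
    return out;
}

// Reverse of encode(). Returns false (with pieces left empty) when the
// number of fields is odd, i.e. an option spec has no pattern spec.
bool decode(const std::string &s, std::vector<Piece> &pieces)
{
    pieces.clear();
    if (s.empty())
        return true;

    std::vector<std::string> fields;
    std::size_t start = 0;
    for (;;) {
        const std::size_t end = s.find(kSeparator, start);
        if (end == std::string::npos) {
            fields.push_back(s.substr(start));
            break;
        }
        fields.push_back(s.substr(start, end - start));
        start = end + 1;
    }
    if (fields.size() % 2 != 0)
        return false;
    for (std::size_t i = 0; i < fields.size(); i += 2)
        pieces.emplace_back(fields[i], fields[i + 1]);
    return true;
}
```

An empty option spec is still written as an empty field, so the even-number-of-parts rule holds without any `o` or `p` prefixes.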

Hi @VinceR,
apologies for the late reply, somehow I missed your message and only spotted it today.

> Applications should only need to get a pattern string from the object, store/retrieve it to/from disk

Sounds good.

> The UI can be provided a list or array of either TQRegExp objects or of structs containing just the information necessary to create its dialog.

I am not against this, but keep in mind that if the UI has access to the underlying internal structure (for example TQRegExp objects) then we probably need extra code to make sure the pattern string and the internal structure are always in sync (i.e. if the UI changes the internal objects, then we need to re-derive the pattern string). In light of that, I think it would be simpler for the UI to interact with the pattern string only, so that we don't have to add extra syncing code.

> Doing so would eliminate the need to explicitly prefix each specification with o or p.

Nice, I had not thought about it. That makes things even simpler. For example a string could be something like `|wp.*|wpa?c.txt`, where the `o` is gone. The `p` is still used to mark the beginning of the pattern string. Regarding the various options, we can assume that those not specified will use a default value. For wildcard-regex we can default to wildcard, so a simple use case would be for example `|p.*`, which would filter all dot files.

> The control character Vertical Tab (0x0B) looks like a good candidate:

Uhm, many users may not even know what a vertical tab is, nor how to type it 😰 I think we should look at a character which has all the characteristics that you listed above but is common and visible. How about |? It's the pipe character in unix, so it is extremely unlikely that a user will use that as part of a file name.
Btw I liked your idea of being able to specify the splitting character. Why don't we make something like this:

1. if the first character of the overall pattern string is `s`, the second character will specify the splitting char to use
2. if the first character of the overall pattern string is NOT `s`, then the splitting char will default to `|` and the first character will be interpreted as a normal option.

This way we have a simple default but still offer the opportunity to customize the splitting character if needed. For example

1. `wp.*|wpa?c.txt` --> default split char is `|`
2. `s/wp.*/wpa?c.txt` --> split char is `/`
3. `p.*` --> minimal pattern string, default to wildcard (filter dot files in this example)
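
The splitter rule proposed above can be sketched roughly as follows. This is only an illustration in standard C++, not code from the PR: `splitPatternString` is a hypothetical name and `std::string` stands in for `TQString`.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the PR): split a pattern string using the
// proposed rule. A leading 's' means the next character is the splitter;
// otherwise the splitter defaults to '|'.
std::vector<std::string> splitPatternString(const std::string &patternString)
{
    char splitter = '|';
    std::size_t start = 0;
    if (patternString.size() >= 2 && patternString[0] == 's') {
        splitter = patternString[1];
        start = 2; // skip the 's' marker and the splitter that follows it
    }

    std::vector<std::string> specs;
    std::size_t pos = start;
    while (pos <= patternString.size()) {
        std::size_t next = patternString.find(splitter, pos);
        if (next == std::string::npos)
            next = patternString.size();
        if (next > pos) // empty pieces are skipped in this sketch
            specs.push_back(patternString.substr(pos, next - pos));
        pos = next + 1;
    }
    return specs;
}
```

With this sketch, both `wp.*|wpa?c.txt` and `s/wp.*/wpa?c.txt` yield the same two specifications, and `p.*` yields a single one.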

What do you think about this?


>> The UI can be provided a list or array of either TQRegExp objects or of structs containing just the information necessary to create its dialog.

> I am not against this, but keep in mind that if the UI has access to the underlying internal structure (for example TQRegExp objects)...

The more I think about this, the more I favor passing implementation-agnostic structures to the UI. Someday we may want to change the TQRegExp implementation to something better (e.g. a port of QRegularExpression (https://dangelog.wordpress.com/2012/04/07/qregularexpression/) or our own wrapper around PCRE2).
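
As a rough illustration of what "implementation-agnostic" could mean here, the sketch below keeps the regex engine out of the structure handed to the UI. Everything in it is an assumption for illustration: `MatchSpec` and its fields are hypothetical names, and `std::regex` merely stands in for whatever engine (`TQRegExp`, PCRE2) sits behind the matcher.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical, engine-neutral description of one match specification.
// The UI would only ever see plain data like this, never a regex object.
struct MatchSpec {
    std::string pattern;
    bool caseSensitive = true;
    bool wantMatch = true; // false means "succeed when the pattern does NOT match"
};

// Stand-in matcher: std::regex plays the role of the real engine here.
// Returns true as soon as one specification is satisfied by the text.
bool matchAny(const std::vector<MatchSpec> &specs, const std::string &text)
{
    for (const MatchSpec &spec : specs) {
        auto flags = std::regex::ECMAScript;
        if (!spec.caseSensitive)
            flags |= std::regex::icase;
        const std::regex rx(spec.pattern, flags);
        const bool matched = std::regex_search(text, rx);
        if (matched == spec.wantMatch)
            return true;
    }
    return false;
}
```

Swapping the engine would then only touch `matchAny`, while code that builds or displays `MatchSpec` values stays unchanged.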

> ...then we probably need extra code to make sure the pattern string and the internal structure are always in sync

Already done, in revised code to be uploaded.

> Uhm, many users may not even know what a vertical tab is, nor how to type it 😰

Users will never need to know either -- and that is by design. In fact, we want to make it very difficult to accidentally type this internal-use-only character into a regex. If somebody wants to specify a vertical tab in a search pattern, they can use `\v`.

Regarding rest of your suggestion: | is not a good default splitter since that's the regex alternation character. I had originally thought of using the horizontal tab character but TDE converts that to \t when storing a multi-pattern string.

I think I have implemented almost all of the other suggestions made in conversation items although I need to double-check. I need to fine-tune some of the new stuff I added in support of a UI, then I will be ready to upload a revised version of the code.

Although I know that you have wanted to focus on core TDEStringMatcher functionality, I think everything I have done may make more sense if evaluated in a more complete context: KonqListView working with KDirLister and a decent UI for changing patterns. I am overdue for re-doing the UI to accommodate the additional flexibility that comes with setting per-pattern options. My old UI works for my testing but only because I am familiar with patternString rules.


> The more I think about this, the more I favor passing implementation-agnostic structures to the UI. Someday we may want to change the TQRegExp implementation to something better

Exactly the point :-)

> Regarding rest of your suggestion: | is not a good default splitter since that's the regex alternation character.

Very good point, didn't think about it. | is not a good choice at all.
The only problem I see with vertical tab is that if a user edits a config file manually in an editor, it will show up as a broken string (going to the next line), so the user may collapse the string into a single row, effectively losing the vertical tab. Anyway I see your point on users not needing to edit manually in most cases, so we may as well use vertical tab, unless we find a better candidate (@SlavekB any alternative suggestion?)

> Although I know that you have wanted to focus on core TDEStringMatcher functionality

Feel free to upload all the code if you wish, but the order of review will be the same :-) If possible use separate commits for each part (TDEStringMatcher, KDirLister, KonqListView, UI, ...) so it will be easier for me to review.

Hi @MicheleC,

Circumstances this past month forced my attention away from this PR but I am now back at it. Sorry for the long gap.

Vince

> Circumstances this past month forced my attention away from this PR but I am now back at it. Sorry for the long gap.

No worries @VinceR, we are all here in our spare time. No need to apologize for that :-)

MicheleC,

After such a long absence, I finally have some updates. One of the problems with this amount of time passing is that it has given me ample opportunity to brainstorm and implement "improvements". Because of this, the code has changed enough that you will probably want to go through another review cycle - sorry about that.

As before, please ignore the copious TSMTRACE statements that I will remove before this gets pushed to the main branch.

All of the changes in this commit have been tested along with other changes to tdelibs/tdeio, tdebase/libkonq, and tdebase/konqueror that are needed to create a fully functional testing environment. This includes a rewritten UI for setting match patterns and associated options.

Vince

Hi @VinceR ,
thanks for the updated code. I will do a new review cycle next week (this weekend is Chinese New Year and I will be traveling) and feedback as usual.

Hi @VinceR
apologies for the delay, got busy with other stuff. I will start the review tomorrow or Wednesday at most.
I wanted to let you know that since the changes in this PR are quite at the core of tdelibs, the PR won't make it in the final R14.1.0 which is due out at the end of April. We are too close to that date to make such kind of change. But there will be plenty of time to add it to R14.2.0 (we are also putting back some other stuff for the same reason).

First part of the review done. I will continue tomorrow with the rest.

Second part of review. Still need to review the last two files, which I will hopefully do tomorrow.

Hi MicheleC,

Thanks for your review so far. I think I will go back and resolve the earlier conversations on the assumption that if I missed something from those, you will be reminding me of it during the current review cycle.

I wish I knew more about git. I am currently working on 2 independent branches, the current one you are reviewing and a more comprehensive one that I use for RnD and testing. As it is now, I need to make changes, small and large, to both branches. There must be a better way using git magic. Or maybe not :)

I am perfectly fine with not attaining the 14.1 target. I do hope the interval between future feature releases will be shorter than the one between 14.0 and 14.1.

Do you have a sense of how many people there are that actually use the current 14.1 development branch code for their desktop? I hope it's a lot because there is no better test of committed changes than actual daily use outside of the laboratory.

Vince

Hi Vince,
I shall complete the review of the last two files later today (weekend was busy...) and provide an overall take of where we are so far.

> There must be a better way using git magic.

For git, I usually work in the following way. It may not be the best, but it fits my workflow.

* Do R&D/testing on a local copy of the code. This is temporary and I can get rid of it anytime. Or if I don't like a change, I can simply copy back from the git original copy and restart/continue from a different point. If I am working on a big change, I create a temporary git branch and do several local commits that serve as snapshots, so once again if I mess up at some point, I can always go back to a previous snapshot. I tend to have lots of smaller commits (each with a purpose) rather than one huge commit with lots of changes in it.
* For deployment, we have the master branch (R14.1.0-dev) and we usually deploy there.
* Then we backport to R14.0.x as needed, possibly with fine tuning if required.

> I do hope the interval between future feature releases will be shorter than the one between 14.0 and 14.1.

R14.1.0 suffered from lack of planning for the first years. I would say till 2018 there was no real plan. Then we came together with a plan but we had to clean up some mess from the previous 4 years and we ended up where we are. R14.1.0 will be out at the end of April if all goes as per schedule.
For future release I would like to see a R14.x.0 release every 3 or 4 years at most and we will do planning from the very beginning. So it should be a shorter interval.

> Do you have a sense of how many people there are that actually use the current 14.1 development branch code for their desktop?

To be honest I don't have any idea about that. There are for sure a number of people using it, myself included. But I think the majority are on R14.0.x.

Last part of the review.

Hi @VinceR,
I finished this round of review.
Good progress, many things are now in place and the design is more robust compared to the very first version from so long ago.
Most of my comments are minor adjustments and should not take long to rectify.
There are only 2 points which need some discussion:

1. The choice of separator character. As already expressed, the vertical tab is not much of a user friendly choice IMO, although quite unique indeed. I think the very first suggestion from you (using /) is a better choice. I know / can be part of a pattern, but I think we could simply escape it as '\/' as needed. Users should be quite familiar with that since it is a common thing to do. And it avoids "broken lines".
2. Whether to make options carry forward or not. IMO, having independent pieces is easier from a programming and logic view and less prone to mistakes when moving things around. Having said that, see one of my comments because, unless I have misunderstood, it seems the code is not carrying options over but the readme file says so.

We are definitely going in the right direction. Although we will miss the R14.1.0 window, we can merge this into R14.2.0-dev when we are ready.

Hi MicheleC,

Addressing the separator character:

> the choice of separator character. As already expressed, the vertical tab is not much of a user friendly choice IMO, although quite unique indeed.

I too am not thrilled about using <VT>. My standard go-to separator for dividing text strings is the horizontal tab (<HT>) and I had some initial concerns about using that for the divider that may not be founded. But even using <HT> won't make editing config files that easy.

Consider an example where a user wishes to define 3 regular expressions in the matcher: 1\st, 2\nd and 3\rd. Ideally, TDE would store this specification literally in the configuration file as: 1\st<HT>2\nd<HT>3\rd (where <HT> stands for a horizontal tab separating the 3 expressions). But instead, it stores it as 1\\st\t2\\nd\t3\\rd. I don't agree with what TDE does here: why does \ need to be escaped? and why does a perfectly good and visually pleasing horizontal tab need to be converted to \t? But that's the way it works.
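
The escaping described above can be mimicked with a few lines of standard C++. This is only a sketch of the observed behaviour (backslash becomes `\\`, horizontal tab becomes `\t`); `escapeForConfig` is a hypothetical name and this is not TDE's actual config-writing code.

```cpp
#include <string>

// Illustrative only: reproduces the two substitutions described above.
// It is NOT the real TDE config serializer, which may escape more characters.
std::string escapeForConfig(const std::string &value)
{
    std::string out;
    for (char ch : value) {
        switch (ch) {
            case '\\': out += "\\\\"; break; // a backslash is doubled
            case '\t': out += "\\t";  break; // a literal tab becomes backslash + 't'
            default:   out += ch;     break;
        }
    }
    return out;
}
```

Feeding it the three expressions joined by literal tabs produces the stored form quoted above, `1\\st\t2\\nd\t3\\rd`.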

> I think the very first suggestion from you (using /) is a better choice. I know / can be part of a pattern, but I think we could simply escape it as '\/' as needed. Users should be quite familiar with that since it is a common thing to do. And it avoids "broken lines"

I don't think it is reasonable to ask users, working in a UI, to remember to escape / characters in their regular expressions. They don't have to do that anywhere else. We could post-process user input and convert unescaped instances of / to \/ but ... just to support direct editing of configuration files?

Let me propose that we use <HT> instead of <VT>. I was originally concerned that this would present a conflict when users specify \t in their regular expressions but given TDE's overzealous escaping of \, that won't be an issue.

> Let me propose that we use <HT> instead of <VT>. I was originally concerned that this would present a conflict when users specify \t in their regular expressions but given TDE's overzealous escaping of \, that won't be an issue.

<HT> is definitely preferable compared to <VT> (see also other comment I made earlier today). Alternatively, why not use , and make it a CSV list of options and patterns?

I have a few more comments to answer, I will do that probably on Thursday.

> I have a few more comments to answer, I will do that probably on Thursday.

Done with the remaining comments :-)

Hi MicheleC,

It's been a while, so I thought I should provide a status. Despite my silence, I have been pretty busy with this PR. Once I've made some final design decisions, I will respond to the remaining conversation items and push updated code. I will also include the code that implements what I have dubbed "Alphanumeric Equivalence", which takes case-insensitivity to another level.

One of the things I have done is to develop a class that provides a simplified interface to PCRE2 that can be used as a regex engine in TDEStringMatcher instead of (or in addition to) TQRegExp. I still need to do some more development and testing, but if this turns out to be a successful endeavor, I will push updated code later for review.

Vince

Ok, thanks for the update @VinceR. Looking forward to the new code and the next review cycle :-)

>> Let me propose that we use <HT> instead of <VT>. I was originally concerned that this would present a conflict when users specify \t in their regular expressions but given TDE's overzealous escaping of \, that won't be an issue.

> <HT> is definitely preferable compared to <VT> (see also other comment I made earlier today). Alternatively, why not use , and make it a CSV list of options and patterns?

In an effort to settle this point, the new code is using <HT> as the separator. We don't want to use , as the separator because that would require UI users who wanted to write a literal , in a pattern to write \, instead. This is not a problem for <HT> since users already know to specify \t in patterns that require it.
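
To illustrate why a literal tab separator does not clash with `\t` inside a pattern, here is a small sketch in standard C++. `splitOnTab` is a hypothetical helper and `std::string` stands in for `TQString`; the point is only that the two-character sequence backslash + `t` is not the tab character, so it is never treated as a divider.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split on the literal horizontal tab character only.
// Empty pieces are kept, so consecutive tabs yield empty entries.
std::vector<std::string> splitOnTab(const std::string &s)
{
    std::vector<std::string> parts;
    std::size_t pos = 0;
    while (true) {
        const std::size_t next = s.find('\t', pos);
        if (next == std::string::npos) {
            parts.push_back(s.substr(pos)); // last (or only) piece
            break;
        }
        parts.push_back(s.substr(pos, next - pos));
        pos = next + 1;
    }
    return parts;
}
```

A pattern written by the user as `^a\tb$` (backslash followed by `t`) therefore stays in one piece, while the real tab between two patterns splits them.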

Hi @MicheleC,

First, congratulations to you, @SlavekB, and the rest on getting 14.1 released. I am sure that was a big job and hopefully users will find the update both pleasing and bug-free.

As you can see, I have uploaded new code for this PR and also made responses to conversation items that still merited a specific response from me.

@VinceR

Hi @VinceR

> First, congratulations to you, @SlavekB, and the rest on getting 14.1 released. I am sure that was a big job and hopefully users will find the update both pleasing and bug-free.

It's a team effort, so everybody who contributed deserves credit for it, including you :-)

> As you can see, I have uploaded new code for this PR and also made responses to conversation items that still merited a specific response from me.

I will review the code sometime this week or next and give feedback as usual.

See comments

@VinceR
I finally reviewed your latest commit and provided feedback about it.
Step by step, we are moving closer to the point where the PR will be ready for merging.

To summarize where we are and to make sure I did not forget anything, the next steps would be:

1. review the equivalence class code based on latest feedback
2. integrate the equivalence class object into the string matcher, to make sure the characters are replaced by their equivalent form when needed
3. finally review the whole PR and close the remaining pending points.

After that, we should move on to reviewing the rest of the code that will build on this PR. In that respect, does TDE/tdebase#270 need any modification or is it good to go as is? And do we need more code on top of this PR and the one in tdebase? It has been a long time, so I lost a bit of clarity on where we stand with regard to the overall initial idea.

>>> Let me propose that we use <HT> instead of <VT>. I was originally concerned that this would present a conflict when users specify \t in their regular expressions but given TDE's overzealous escaping of \, that won't be an issue.

>> <HT> is definitely preferable compared to <VT> (see also other comment I made earlier today). Alternatively, why not use , and make it a CSV list of options and patterns?

> In an effort to settle this point, the new code is using <HT> as the separator. We don't want to use , as the separator because that would require UI users who wanted to write a literal , in a pattern to write \, instead. This is not a problem for <HT> since users already know to specify \t in patterns that require it.

Ok, I think it is a good compromise. <HT> is definitely more readable than <VT> and your observation on \t makes sense. I didn't see any new code for it though, so maybe something you still haven't published.

> To summarize where we are and to make sure I did not forget anything, the next steps would be:
>
> to review the equivalence class code based on latest feedback

I will take a look at the feedback and respond in the next week.

> integrate the equivalence class object into the string matcher, to make sure the characters are replaced by their equivalent form when needed

Take a look at tdestringmatcher.{h,cpp} There were a number of changes in these, some of which implemented ANCHandling::EQUIVALENCE

> finally review the whole PR and close the remaining pending points.

Yes, I need to push the remaining changes to:

tdeio/tdeio/kdirlister.h
tdeio/tdeio/kdirlister.cpp
tdeio/tdeio/kdirlister_p.h
tdeio/tdeio/tdefilefilter.h
tdeio/tdeio/tdefilefilter.cpp
tdeio/tdefile/tdediroperator.cpp
tdeio/tdefile/tdefiletreebranch.cpp

These will be a lot more straightforward to review, I promise :)

> After that, we should move on to reviewing the rest of the code that will build on this PR. In that respect, does TDE/tdebase#270 need any modification or is it good to go as is?

I still need to push the latest changes for that PR

> And do we need more code on top of this PR and the one in tdebase? It has been a long time, so I lost a bit of clarity on where we stand with regard to the overall initial idea.

Well ... I've been unit testing a new class that wraps PCRE2. I would eventually like to use it instead of TQRegExp for TDEStringMatcher because it supports a richer regex language, it's well supported, it's faster, and does not have the TQRegExp bugs. But in the interest of avoiding more scope creep in this PR, I should probably introduce that in a new PR.

> Ok, I think it is a good compromise. <HT> is definitely more readable than <VT> and your observation on \t makes sense. I didn't see any new code for it though, so maybe something you still haven't published.

tdestringmatcher.h line 57: inline constexpr char PatterStringDivider { '\t' }; and
referenced in a few other modules (namespace TSM).

But I do see an error: it should be spelled PatternStringDivider

>> Ok, I think it is a good compromise. <HT> is definitely more readable than <VT> and your observation on \t makes sense. I didn't see any new code for it though, so maybe something you still haven't published.

> tdestringmatcher.h line 57: inline constexpr char PatterStringDivider { '\t' }; and
> referenced in a few other modules (namespace TSM).
>
> But I do see an error: it should be spelled PatternStringDivider

The screenshot below is taken from the current code on this PR on gitea. I don't see the PatterStringDivider.

And as well I still see this comment where the equivalence conversion should be:

// FIXME TBD: This is where we will be converting each alphanumeric
// character in stringToMatch to its "least" equivalent and storing
// the result in equivalentString. Until then, we'll just do:

Please double check that you pushed all the changes :-)

[Attached screenshot: tdelibs-179.png]

>> integrate the equivalence class object into the string matcher, to make sure the characters are replaced by their equivalent form when needed

> Take a look at tdestringmatcher.{h,cpp} There were a number of changes in these, some of which implemented ANCHandling::EQUIVALENCE

See previous comment about not seeing the changes :-)

>> finally review the whole PR and close the remaining pending points.

> Yes, I need to push the remaining changes to: [...]
> These will be a lot more straightforward to review, I promise :)

>> After that, we should move on to reviewing the rest of the code that will build on this PR. In that respect, does TDE/tdebase#270 need any modification or is it good to go as is?

> I still need to push the latest changes for that PR

Thanks for refreshing my memory

> Well ... I've been unit testing a new class that wraps PCRE2. I would eventually like to use it instead of TQRegExp for TDEStringMatcher because it supports a richer regex language, it's well supported, it's faster, and does not have the TQRegExp bugs. But in the interest of avoiding more scope creep in this PR, I should probably introduce that in a new PR.

Yes, a follow up PR is probably a good idea

> Please double check that you pushed all the changes :-)

Well I'm not sure how this happened, but apparently I committed and pushed only the new files and not the updated existing files.

In order to not confuse things too much, I will review your feedback on the new files, make changes to them accordingly, and then push everything again as a merged commit.

> In order to not confuse things too much, I will review your feedback on the new files, make changes to them accordingly, and then push everything again as a merged commit.

OK, I have done this and am ready to push a replacement commit. Since I don't want to destroy context, I will await your responses to my responses to your conversation items before I do the push.

> OK, I have done this and am ready to push a replacement commit. Since I don't want to destroy context, I will await your responses to my responses to your conversation items before I do the push.

I am done with replying to the various points.
Maybe it could be wiser to push the code in a new PR and close off this one? That way we preserve the comments here and we start a cleaner discussion for the review of the new code? What do you think?

Discussion continues on PR #209.

@VinceR should we mark this PR as closed since we continue on the new version?

>> Discussion continues on PR #209.

> @VinceR should we mark this PR as closed since we continue on the new version?

Yes, I will mark it closed now.

         TSMTRACE << "TDEGlobal::hiddenFileMatcher(): Global HFM initialization STARTED" << endl;
         _hiddenFileMatcher = new TDEStringMatcher();
         TDEGlobal::config()->setGroup( "General" );
         TQString settings = TDEGlobal::config()->readEntry( "globalHiddenFileSpec", "/oW/.*" );

   if ( hiddenFileMatcher == m_pHiddenFileMatcher )
     return;
   if (  hiddenFileMatcher == 0 || hiddenFileMatcher == nullptr ) {
     kdWarning() << "KFileItem::setHiddenFileMatcher: refusing to process null pointer passed by caller" << endl;

 #ifdef TSMSIGNALS
   if ( m_pHiddenFileMatcher != 0 && m_pHiddenFileMatcher != nullptr ) {
     TSMTRACE << "  Attempting to disconnect slots from hidden file matcher signals ... " << endl;
     if ( disconnect( m_pHiddenFileMatcher, 0, 0, 0 ) )

   TSMTRACE << "KFileItem::reEvaluateHidden() called for " << m_url.fileName() <<endl ;
   if ( !m_url.isEmpty() )
       return m_url.fileName()[0] == '.';
     m_bHiddenByMatcher = m_pHiddenFileMatcher->matchAny( m_url.fileName() );

   /**
    * Sets object that encapsulates criteria for determining whether or not
    * a filesystem entity is hidden based on characteristics of its name.
    * Object is stored in @property m_pHiddenFileMatcher.


   }
   // Initialize hidden file matching apparatus
   setHiddenFileMatcher( TDEGlobal::hiddenFileMatcher() );

 }
 bool KFileItem::isHidden() const
 void KFileItem::resetHiddenFileMatcher()

 #ifdef TSMSIGNALS
   TSMTRACE << "  Attempting to reconnect slots to hidden file matcher signals ... " << endl;
   if ( connect( m_pHiddenFileMatcher, TQT_SIGNAL( destroyed() ), this, TQT_SLOT( resetHiddenFileMatcher() ) ) )

   void resetHiddenFileMatcher();
   /**
    * Checks whether or not the current filesystem object is "hidden" by

   bool m_bMimeTypeKnown:1;
   // Auto: check leading dot.
   // Auto: always check if hidden.

   /**
    * Object that encapsulates criteria for determining whether or not
    * this filesystem entity is hidden based on characteristics of its
    * name. This property is set by method setHiddenFileMatcher().

       'w' - Match patterns are to be interpreted as "wildcards"
       'r' - Match patterns are to be interpreted as "regexes" (TQRegExp default)
       'c' - Matching will be case-sensitive (TQRegExp default)
       'c' - Matching will be case-INsensitive

     Files with a version number suffix will be matched via regex
     and dotfiles will be matched via wildcard
 Current and potential use of the TDEStringMatcher class include:

 TDEStringMatcher::~TDEStringMatcher()
 {
   patternList.setAutoDelete( true );

   TQStringList specList = TQStringList::split( patternStringDivider, newPatternString.mid(1), true );
   TQRegExp rxWork;
   TQPtrList<TQRegExp> rxPatternList;


   protected:
     TQPtrList<TQRegExp> patternList;

   for ( TQString &specification : newMatchSpecs ) {
     if ( specification.find( TQChar(SEP) ) >= 0 ) {

     TQChar specificationType = specification[0].lower();
     switch ( specificationType ) {
       case 'o' : {
         TQString optionString = specification.mid(1).lower();

         TSMTRACE << "    Processing match pattern: '" << pattern << "'" << endl;
         if ( pattern.isEmpty() ) {
           TSMTRACE << "      Empty patterns are not allowed" << endl;
           rxPatternList.clear();

         rxWork.setPattern( pattern );
         if (! rxWork.isValid() ) {
           TSMTRACE << "      Invalid pattern" << endl;
           rxPatternList.clear();

     }
   }
   if ( patternList.isEmpty() ) {

 {
    //-Debug: TSMTRACE << "Attempting to match string '" << stringToMatch << "' against ALL stored patterns" << endl;
    for ( const TQRegExp *rxPattern : patternList ) {
      if ( !

    */
   bool generatePatternList( TQString newPatternString );
   /**

    */
   TQString getPatternString();
   /**

   * Desired outcome of matching
       TRUE: match succeeds if a string matches the match pattern.
       FALSE: match fails if a string matches the match pattern.

   'i' - Letter case variants are equivalent (i.e. case-insensitive)
   'e' - All letter & number character variants are equivalent
   '=' - Match succeeds if pattern matches [default]
   '!' - Match fails if pattern matches (inverted match)
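The option characters listed in the README excerpts could be decoded along the following lines. This is only a minimal sketch in standard C++, not the code from this PR: the names `PatternKind`, `CharHandling`, `SketchOptions` and `parseOptionsSketch` are hypothetical, `std::string` stands in for TQString, and only the option characters quoted on this page ('w', 'r', 'c', 'i', 'e', '=', '!') are assumed to exist.

```cpp
#include <cctype>
#include <string>

// Hypothetical types, loosely modelled on the patternType / ancHandling /
// wantMatch members shown in the MatchSpec excerpt. Not the PR's definitions.
enum class PatternKind  { Regex, Wildcard };
enum class CharHandling { CaseSensitive, CaseInsensitive, Equivalence };

struct SketchOptions {
    PatternKind  patternKind  = PatternKind::Regex;          // 'r' assumed default
    CharHandling charHandling = CharHandling::CaseSensitive; // 'c' assumed default
    bool         wantMatch    = true;                        // '=' is the default
};

// Applies each option character in turn. On an unknown character the
// function returns false and leaves 'options' untouched.
bool parseOptionsSketch(const std::string& optionString, SketchOptions& options)
{
    SketchOptions parsed = options;
    for (char raw : optionString) {
        // Option letters are treated as case-insensitive (assumption based on
        // the .lower() call in the quoted parsing code).
        char c = static_cast<char>(std::tolower(static_cast<unsigned char>(raw)));
        switch (c) {
            case 'r': parsed.patternKind  = PatternKind::Regex;            break;
            case 'w': parsed.patternKind  = PatternKind::Wildcard;         break;
            case 'c': parsed.charHandling = CharHandling::CaseSensitive;   break;
            case 'i': parsed.charHandling = CharHandling::CaseInsensitive; break;
            case 'e': parsed.charHandling = CharHandling::Equivalence;     break;
            case '=': parsed.wantMatch    = true;                          break;
            case '!': parsed.wantMatch    = false;                         break;
            default:  return false; // unknown option character
        }
    }
    options = parsed;
    return true;
}
```

Working on a copy and committing only on success mirrors the all-or-nothing behaviour suggested by the quoted code, which clears its work lists and returns false when a specification is rejected.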

 The following is an example of a string representing a match specification list
 intended to apply to file names
    w.*ee*cr~$\\.[0-9]+
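Leaving the encoding of the specification string aside, the kind of matching such a list drives can be sketched as a loop that stops at the first pattern that matches a file name. The sketch below uses `std::regex` instead of TQRegExp, and the three patterns (dotfiles, trailing `~`, numeric version suffix) are illustrative assumptions rather than a decoding of the string above; `hiddenPatternsSketch` and `matchesAnySketch` are hypothetical names.

```cpp
#include <regex>
#include <string>
#include <vector>

// Illustrative pattern set (assumption): dotfiles, backup files ending in '~',
// and names ending in a numeric version suffix such as ".3".
std::vector<std::regex> hiddenPatternsSketch()
{
    return {
        std::regex("^\\."),
        std::regex("~$"),
        std::regex("\\.[0-9]+$")
    };
}

// Returns true as soon as any stored pattern matches the given name.
bool matchesAnySketch(const std::vector<std::regex>& patterns,
                      const std::string& name)
{
    for (const std::regex& pattern : patterns) {
        if (std::regex_search(name, pattern))
            return true;
    }
    return false;
}
```

With these assumed patterns, names such as `.profile`, `draft.txt~` and `libfoo.so.3` would be reported as matching, while `README` would not.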

 #include <tqregexp.h>
 #include <kdebug.h>
 typedef TQValueVector<TQRegExp> RegexList;

     newMatchSpecs.append( optionString );
     newMatchSpecs.append( matchSpec.pattern );
     newRegexList.append( rxWork );
     optionString = "";

      ""
   };
   if ( newMatchSpecString == p->m_matchSpecString )

           break;
         default:
           continue; // should not arise

         newMatchSpecList.clear();
         newRegexList.clear();
         return false;
         continue;

 {
   PatternType      patternType;
   ANCHandling      ancHandling;
   bool             wantMatch; // "matching" vs. "not matching"

  *  Container used in a TDEStringMatcher object
  *  representing multiple match specifications.
  */
 typedef TQValueVector<MatchSpec> MatchSpecList;

   /**
       @return list of currently defined match specifications.
    */
   MatchSpecList getMatchSpecs();

   /**
       @return string encoding list of currently defined match specifications.
    */
   TQString getMatchSpecString();

       Utility function for converting a wildcard pattern string
       to a regular expression pattern string.
    */
   TQString wildcardToRegex( const TQString& wildcardPattern );
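To make the intent of such a conversion concrete, here is a minimal sketch in standard C++. It is not the implementation from this PR: `wildcardToRegexSketch` and `wildcardMatchesSketch` are hypothetical names, `std::string` and `std::regex` stand in for TQString and TQRegExp, only `*` and `?` are treated as wildcards, and bracket expressions are escaped rather than translated.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for a wildcard-to-regex conversion:
//   '*' becomes ".*", '?' becomes ".", regex metacharacters are escaped,
//   and the result is anchored so the whole name must match.
std::string wildcardToRegexSketch(const std::string& wildcardPattern)
{
    static const std::string metaChars = "\\^$.|+()[]{}";
    std::string regexPattern = "^";
    for (char c : wildcardPattern) {
        if (c == '*') {
            regexPattern += ".*";
        } else if (c == '?') {
            regexPattern += '.';
        } else {
            if (metaChars.find(c) != std::string::npos)
                regexPattern += '\\'; // escape literal metacharacter
            regexPattern += c;
        }
    }
    regexPattern += '$';
    return regexPattern;
}

// Convenience wrapper: does 'name' match the wildcard pattern?
bool wildcardMatchesSketch(const std::string& wildcardPattern,
                           const std::string& name)
{
    return std::regex_search(name, std::regex(wildcardToRegexSketch(wildcardPattern)));
}
```

Under these assumptions the traditional dotfile wildcard `.*` is converted to the regular expression `^\..*$`, so it matches `.bashrc` but not `notes.txt`.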