WIP: Extend meaning of "Hidden Files" (PR version 4)
#209
Draft
VinceR
wants to merge 4 commits from issue/270/tdelibs-V4
into master
pull from: issue/270/tdelibs-V4
merge into: TDE:master
TDE:r14.1.x
TDE:master
TDE:feat/kate/php-syntax-heredoc-ident
TDE:fix/tde-75
TDE:feat/new_hwcontrol
TDE:feat/tdeio-xattr-support
TDE:fix/api-for-python
TDE:r14.0.x
TDE:v3.5.13-sru
TDE:feat/tdehtml+svg
TDE:other/string-fixes
TDE:feat/fix-suspend-code
Reviewers
Request review
No reviewers
Labels
General - need additional info from contributor PR/keep-branch
Pull request - do not delete branch after merging PR/not-ok
Pull request - need fixing PR/rfc
Pull request - request for comments PR/update-trans
Pull request - update to translation files needed PR/wip
Pull request - work in progress RS/R14.0.x
Related to R14.0.x series RS/R14.1.x
Related to R14.1.x series SL/critical
Severity level - critical SL/major
Severity level - major SL/minor
Severity level - minor SL/normal
Severity level - normal SL/regression
Severity level - regression from previous version SL/trivial
Severity level - trivial SL/wishlist
Severity level - wishlist request ST/duplicate
Status - duplicate of another issue ST/invalid
Status - invalid report ST/notourproblem
Status - not our problem ST/rejected
Status - rejected ST/wontfix
Status - won't fix ST/worksforme
Status - works for me, unable to reproduce
Apply labels
Clear labels
GE/need-info
General - need additional info from contributor PR/keep-branch
Pull request - do not delete branch after merging PR/not-ok
Pull request - need fixing PR/rfc
Pull request - request for comments PR/update-trans
Pull request - update to translation files needed PR/wip
Pull request - work in progress RS/R14.0.x
Related to R14.0.x series RS/R14.1.x
Related to R14.1.x series SL/critical
Severity level - critical SL/major
Severity level - major SL/minor
Severity level - minor SL/normal
Severity level - normal SL/regression
Severity level - regression from previous version SL/trivial
Severity level - trivial SL/wishlist
Severity level - wishlist request ST/duplicate
Status - duplicate of another issue ST/invalid
Status - invalid report ST/notourproblem
Status - not our problem ST/rejected
Status - rejected ST/wontfix
Status - won't fix ST/worksforme
Status - works for me, unable to reproduce
No Label
GE/need-info
PR/keep-branch
PR/not-ok
PR/rfc
PR/update-trans
PR/wip
RS/R14.0.x
RS/R14.1.x
SL/critical
SL/major
SL/minor
SL/normal
SL/regression
SL/trivial
SL/wishlist
ST/duplicate
ST/invalid
ST/notourproblem
ST/rejected
ST/wontfix
ST/worksforme
Milestone
Set milestone
Clear milestone
No items
No Milestone
Assignees
Assign users
Clear assignees
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.
No due date set.
Dependencies
No dependencies set.
Reference: TDE/tdelibs#209
Reference in new issue
There is no content yet.
Delete Branch 'issue/270/tdelibs-V4'
Deleting a branch is permanent. It CANNOT be undone. Continue?
No
Yes
This is a continuation of PR #179 and is intended to eventually address issue TDE/tdebase#270. To do that, changes to the following additional files will need to be pushed as well as corresponding changes to
tdebase/libkonq
andtdebase/konqueror
:All code has been tested successfully.
//typedef wchar_t CHAR16;
//typedef unsigned short CHAR16;
typedef TQChar CHAR16;
I did follow your suggestion to use
TQChar
instead of an integer representing a code point but I still wonder if might be less efficient. Anyway, all three typedefs work.TQChar
uses aushort
as internal representation, so there should be no differences in terms of efficiently, but the readibility improves because TQStrings are made up of TQChars, not ushorts.const defaultCollation EquivalentsTable // terminating ';' is provided in include file
#include "tequivchars-mapping.h"
uint EquivTableROWS = sizeof(EquivalentsTable)/sizeof(EquivalentsTable[0]);
As you can see, the table is now an array of structs versus a 2 dimensional array.
I defined
EquivalentsTable
the way I did so that I could keep the name declaration in this file but the table content and hard-coded table dimensions in the included file.What I really would have like to do was something like this:
and let the compiler figure out the array size. I could not get anything like that to compile.
continue;
}
#endif
I thought this might be a worthwhile optimization, requiring only 3 comparisons to process upper case ASCII letters and only 1 comparison for all other ASCII characters.
After digging into TQt source code, I noticed that it internally utilizes full (non-sparse) tables of various unicode properties (
src/tools/qunicodetables.cpp
). So another optimization I could try is something like this:ok
else
high = mid - 1;
}
//-Debug: std::cerr << TQString(currChar).utf8().data() << "'" << std::endl;
I looked at that but didn't like the need for a callback function for each comparison. Also, the callback function expects a return code of <1, 0, or >1. Unlike integers, there is no simple
TQChar
operation that can quickly compute that. I suspect that usingbsearch
would end up taking more lines of code.The callback function is a standard way to allow using the same algorithm on many different types of data. It provides design flexibility at the expense of a marginal runtime overhead. Implementation of the function is logically no different from the comparison done in the code above, where TQChars get compared for their partial ordering.
Anyway current code is fine and there is no need to move to
bsearch
.Will review the code this week, possibly tomorrow or Wednesday
WIP: Address issue 270 - Extend meaning of "Hidden Files" (PR version 4)to WIP: Address issue TDE/tdebase#270 - Extend meaning of "Hidden Files" (PR version 4) 11 months agoWIP: Address issue TDE/tdebase#270 - Extend meaning of "Hidden Files" (PR version 4)to WIP: Extend meaning of "Hidden Files" (PR version 4) 11 months agoNext round of feedback :-) Sorry for the delay, finally come around to review the code.
SOURCES ${${target}_SRCS}
VERSION 14.1.0
EMBED tdecorenetwork-static
LINK DCOP-shared tdefx-shared ICE SM ${ZLIB_LIBRARIES} ${RESOLV_LIBRARIES}
${RESOLV_LIBRARIES}
was introduce in a recent commit (2 months ago)m, probably you missed that in your local copy. Always make sure to compared against latest master :-)Sorry, I missed that and I am unable to figure out why that happened.
#include <tdeconfig.h>
#include <tdelocale.h>
#include <kcharsets.h>
#include <tequivchars.h>
Include of tequivchars.h is no longer required see comments below)
return _charsets;
}
TEquivChars *TDEGlobal::equivChars()
No longer required, see comments in
tdecore/tequivchars.h
about makingTEquivChars::replaceChars
a static methodTDEInstance *TDEGlobal::_activeInstance = 0;
TDELocale *TDEGlobal::_locale = 0;
KCharsets *TDEGlobal::_charsets = 0;
TEquivChars *TDEGlobal::_equivChars = 0;
TDEGlobal::_equivChars
is also not required.#endif
class TDELocale;
class TDEStringMatcher;
class TEquivChars;
class TEquivChars;
not required here.* The global alphanumeric character equivalence mapper.
* @return the global alphanumeric character equivalence mapper.
*/
static TEquivChars *equivChars();
No longer required, see comments in
tdecore/tequivchars.h
about makingTEquivChars::replaceChars
a static methodstatic TDELocale *_locale;
static KCharsets *_charsets;
static TDEStringMatcher *_hiddenFileMatcher;
static TEquivChars *_equivChars;
_equivChars
not required here.#include <features.h>
#ifdef __GLIBC__
#include <fnmatch.h>
This will need detection from cmake file and conditional code inclusion. Ok to do it later, but before we merge the code.
I see the problem.
<features.h>
is not guaranteed to always exist. I replaced the above code with:Do you think that is sufficient?
namespace TSM {
class AuxData
One of the member is a pointer, so better to provide a destructor to clear allocated memory. Also better to provide a copy constructor since we are copying an
AuxData
later in the code.Have you verified there are no memory leaks with the
matchEngine
in all possible scenarios (exceptions included)? I would (and will before merging) have to verify that as well, but good if you think about it first.An
AuxData
object functions more like astruct
than aclass
although we use a constructor to start with sane values. The object is meaningful only as a member of anAuxDataList
, which in turn is a member of aTDEStringMatcherPrivate
class. The functionTDEStringMatcherPrivate::clearAll()
has sole responsibility for freeing any dynamically allocated memory associated with each member of itsAuxDataList
before clearing the list itself.I assume that you are referring to copying a
TDEStringMatcherPrivate
object? I’m not sure what I would do in this method that isn’t already done by standard C++ copying.Verified … all possible scenarios … I can’t say that. But I have made sure that
clearAll()
is always called before before any return fromsetMatchSpecs()
. And inspecting the output ofTSMTRACE
reveals that there is always a corresponding ‘free’ that follows a regex engine ‘allocate’ for all cases that I have tested.I just installed
valgrind
(which I hadn't realized was FOSS). Is that the tool of choice?fnmatchFlags = FNM_EXTMATCH; // Bash shell option 'extglob'
#endif
matchEngine = nullptr;
patternConverted = "";
TQString::null
is better than""
void TDEStringMatcher::TDEStringMatcherPrivate::clearAll()
{
m_matchSpecString = "";
TQString::null
is better than""
workArea.clearAll();
return false;
}
if ( matchSpec.pattern.find( TQChar(PatterStringDivider) ) >= 0 ) {
If we make
PatterStringDivider
aTQChar
as explained in another commment, there is no need for additional casting.case ANCHandling::EQUIVALENCE :
inferredOptionString += TQChar('e');
auxWork.isCaseSensitive = true;
auxWork.patternConverted = TDEGlobal::equivChars()->replaceChars( auxWork.patternConverted, true );
Becomes
TEquivChars::replaceChars(...)
if we make the method static.return false;
}
break;
#endif // Otherwise we will test wildcard expression as one converted to a regex
Better to make this more explicit in the code, because if someone misses the comment, he can think that the wildcard case is ignored if
fnmatch
is not available.I thought it was pretty clear (earlier in the code) how wildcard expressions are handled without glibc
fnmatch
but I will try to clarify with an empty#else
block containing the comment#define TSMTRACE kdWarning() << "<TSMTRACE> "
namespace TSM
How about moving the members in this namespace inside the definition of the
TDEStringMatcher
class? This avoid the namespace (TSM
is not a great name for it) and make clear that these members are logically related toTDEStringMatcher
.I was trying to make constant names shorter but now realize that using this namespace doesn’t really save that much. If only I had called the class
TSM
instead ofTDEStringMatcher
in the first place :)typedef TQValueVector<MatchSpec> MatchSpecList;
// Use horizontal tab as m_patternString separator
inline constexpr char PatterStringDivider { '\t' };
This can be a static member of
TDEStringMatcher
if moved into the class definition. It would be better to use aTQChar
instead ofchar
. Also why using braces for settings its value?I generally prefer to declare/define named literal compile-time constants in headers. Even if I used a
TQChar
variable instead, I would still want to use the named constant to initialize it.The braces were mistakenly propagated from an earlier attempt at declaring an array this way.
@return a basic regular expression formed by converting the basic
wildcard expression in @param wildcardPattern.
*/
TQString wildcardToRegex( const TQString& wildcardPattern );
Consider making it a protected method.
expression characters escaped. Useful for regular expression engines
that do not support /Q.../E.
*/
TQString escapeRegexChars( const TQString& basicString );
Consider making it a protected method.
//typedef wchar_t CHAR16;
//typedef unsigned short CHAR16;
typedef TQChar CHAR16;
If you are now ok with the usage of TQChar (see other comment), then the typedef for
CHAR16
becomes superfluous. We can simply useTQChar
in its place.//typedef unsigned short CHAR16;
typedef TQChar CHAR16;
class TEquivChars_Private
class TEquivChars_Private
No longer required, see comments in
tdecore/tequivchars.h
about makingTEquivChars::replaceChars
a static method};
const defaultCollation EquivalentsTable // terminating ';' is provided in include file
#include "tequivchars-mapping.h"
EquivalentsTable
can be a static const member.Still I don't understand the point of splitting the definition statement across two files, making reading the code more cumbersome without much benefit.
The benefit for me is to make
tequivchars.cpp
easier to view and edit without needing to wade through the clutter of ~3000 lines of integer constant pairs.CHAR16 prevChar = 0;
CHAR16 nextChar = 0;
for ( int i = 0 ; i < inStrLen ; outString[i] = CHAR16(currChar), i++ ) {
currChar
is already aCHAR16
so there is no need for explicit casting.if ( inBraceExpr ) {
// Is it time to leave brace expression {___} ?
if ( nextChar == '}' ) inBraceExpr = true;
Should it be
false
instead oftrue
?Oops, yes it should!
// We can process ASCII quickly without using lookup table
unsigned short codepoint = currChar.unicode();
if ( codepoint < 128 ) {
if ( codepoint > 64 && codepoint < 91 ) // convert upper case ASCII
TQChar::lower()
does the same job I believe.That is true but I think my method is faster due to the fact
TQChar::lower()
is doing its own unicode table lookup and table lookups are what I am trying to avoid. But I will grant that it's probably a negligible difference./* FIXME: Possible ideas for optimizing table lookup speed
(1) Detect & handle ASCII (<128) characters separately. *DONE*
(2) Split table into multiple lookup tables and search each
Binary search is very efficient. Splitting and searching across multiple tables is likely to be slower.
expression and the alphabetical characters inside Posix bracket [::]
expressions are left as-is
*/
TQString replaceChars( const TQString &inputString, bool isRegex = false );
Having now seen how this is used in the code, this method can be a
static
method andEquivalentsTable
can be a static const array. This save the needs for having aequivChars()
method and related object inTDEGlobal
, which seems unnecessary since the code in this function does not requires any memory between executions.My main goal was to ensure that there is only a single copy of that table regardless of how many times the class is used or from where. I am hoping to enhance and use this same class to implement a future
listview
sort type. Next iteration will have your suggestion implemented.//-Debug: TSMTRACE << " Attempting to disconnect slots from hidden file matcher (" << m_pHiddenFileMatcher << ") signals ... " << endl;
if (m_pHiddenFileMatcher != nullptr)
disconnect( m_pHiddenFileMatcher, 0, this, 0 );
; //-Debug: TSMTRACE << " ... all slots successfully disconnected" << endl;
; //-Debug: TSMTRACE << "
The debug statement is commented out but note that it would not be part of the
if
body if it was uncommented, due to missing braces.if ( hiddenFileMatcher != nullptr ) {
//-Debug: TSMTRACE << " New pattern string: " << hiddenFileMatcher->getMatchSpecString() << endl ;
//-Debug: TSMTRACE << " Attempting to reconnect slots to hidden file matcher signals ... " << endl;
if ( connect( m_pHiddenFileMatcher, TQT_SIGNAL( destroyed() ), this, TQT_SLOT( setHiddenFileMatcher() ) ) )
if ( connect(
I understand the
if
was used here for debugging purposes and it is fine.For the final version (without TSM tracing), please remove the
if
around theconnect
I did that and have removed all
TSMTRACE
statements from this particular fileDespite a possible appearance to the contrary, I have NOT abandoned this effort! I had to turn my attention to other matters over the summer and those took a lot longer than expected. I am now back on the job :)
I have reviewed the feedback from @MicheleC and have made (for the most part) the recommended changes. I will push the updated code after rebasing. I will also upload the remaining changes to
tdelibs
that are necessary to address the original issue.f0cff95fb1
to683d212303
7 months agoHi @VinceR,
thanks for the update on this PR. I will review the changes after the release of R14.1.1, which is planned at the end of this month.
Reviewers