Regular Expressions


Regular expressions describe patterns in strings and can be used to determine whether a given pattern occurs in a text. In TreeSize, regular expressions can be used to find files and/or folders that match the specified pattern.
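As a minimal sketch of this idea, the following Python snippet tests whether a pattern occurs anywhere in a file name. Python's re module is used here only for illustration, and TreeSize's own regular expression engine may differ in details. The helper name_matches and the file names are hypothetical.

```python
import re

def name_matches(pattern: str, name: str) -> bool:
    """Return True if the pattern occurs anywhere in the name."""
    return re.search(pattern, name) is not None

# Hypothetical file names, used only for illustration.
names = ["report2023.docx", "notes.txt", "setup.exe"]

# Keep the names that contain at least one digit.
with_digit = [n for n in names if name_matches(r"[0-9]", n)]
print(with_digit)
```

Because the pattern is searched for anywhere in the name, anchors such as ^ and $ are needed when the whole name has to match.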

 

The following table shows some of the most commonly used syntax elements and provides a few examples:

Expression

Syntax

Description

 Example

Any character

.

Matches any single character except a line break.

a.o matches "aro" in "around" and "abo" in "about" but not "acro" in "across".

Zero or more  

*

Matches zero or more occurrences of the preceding expression, matching as many characters as possible.

a*b matches "b" in "bat" and "ab" in "about".

e.*e matches the word "enterprise".

One or more

+

Matches at least one occurrence of the preceding expression.

ac+ matches the letter "a" followed directly by at least one "c", such as "ac" in "race" and "ace", and "acc" in "access".

a.+s matches the word "access".

Start of string

^

Matches the start of a string.

^[0-9] matches strings that start with a digit.

End of string

$

Matches the end of a string.

exe$ matches strings that end with "exe".

Beginning of word

[[:<:]]

Matches only when a word starts at this point in the text.

[[:<:]]in matches words such as "inside" and "into" that begin with the letters "in".

End of word

[[:>:]]

Matches only when a word ends at this point in the text.

ss[[:>:]] matches words such as "across" and "loss" that end with the letters "ss".

Any one character in the set

[]

Matches any one of the characters in the []. To specify a range of characters, list the starting and ending characters separated by a dash (-), as in [a-z].

be[n-t] matches "bet" in "between", "ben" in "beneath", and "bes" in "beside" but not "bel" in "below".

Any one character not in the set

[^...]

Matches any character that is not in the set of characters that follows the ^.

be[^n-t] matches "bef" in "before", "beh" in "behind", and "bel" in "below", but not "ben" in "beneath".

Or

|

Matches either the expression before or the one after the OR symbol (|). Mostly used in a group.

(sponge|mud) matches "sponge" in "sponge bath" and "mud" in "mud bath".

Escape character

\

Matches the character that follows the backslash (\) as a literal. This lets you find the characters that are used in regular expression notation, such as { and ^.

\^ searches for the ^ character.

Repeat n times

{n}

Matches n occurrences of the preceding expression.

[0-9]{4} matches any 4-digit sequence.

Grouping

()

Lets you group a set of expressions together. If you want to search for two different expressions in a single search, you can use the Grouping expression to combine them.

If you want to search for [a-z][1-3] or [0-9][a-z], you would combine them: ([a-z][1-3])|([0-9][a-z]).
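The examples in the table above can be tried out with a short script. The following is a sketch in Python, whose re module is assumed here only for illustration and is not necessarily the engine TreeSize uses. Python does not support the [[:<:]] and [[:>:]] word markers, so the two word-boundary cases use \b as a stand-in. The helper first_match and some of the sample texts are hypothetical.

```python
import re

def first_match(pattern, text):
    """Return the first substring matched by pattern, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# (pattern, text, expected first match)
cases = [
    (r"a.o", "around", "aro"),
    (r"a.o", "across", None),
    (r"a*b", "bat", "b"),
    (r"a*b", "about", "ab"),
    (r"e.*e", "enterprise", "enterprise"),
    (r"ac+", "access", "acc"),
    (r"^[0-9]", "7zip", "7"),
    (r"exe$", "setup.exe", "exe"),
    (r"\bin", "go into", "in"),      # \b replaces [[:<:]] in Python
    (r"ss\b", "across", "ss"),       # \b replaces [[:>:]] in Python
    (r"be[n-t]", "between", "bet"),
    (r"be[^n-t]", "below", "bel"),
    (r"(sponge|mud)", "mud bath", "mud"),
    (r"\^", "2^10", "^"),
    (r"[0-9]{4}", "backup_2024", "2024"),
]

for pattern, text, expected in cases:
    found = first_match(pattern, text)
    status = "ok" if found == expected else "differs"
    print(f"{pattern!r:16} in {text!r:14} -> {found!r} ({status})")
```

Each row prints the first substring that the pattern matches, which makes it easy to see why, for instance, e.*e covers the whole word "enterprise" while a.o finds nothing in "across".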

 

More examples:

Regular Expression

Use Case

[0-9] or \d

Find all files/folders with at least one digit in their name.

a|b

Find all files/folders containing "a" or "b" in their name.

[^A-Za-z]

Find all files/folders containing at least one character in their name that is not in the range A-Z or a-z.

^E[0-9]{7}$

Find all files/folders whose name consists of an "E" followed by exactly 7 digits.

[A-Za-z]:\\([^\\]+\\){2,4}[^\\]+$

Find all files/folders with a folder depth of at least 2 and at most 4.

[^\x00-\x7F]

Find all files/folders with non-ASCII characters in their name.

[^\P{C}]

Find all files/folders with Unicode characters which cannot be printed.

[\xA0]

Find all file/folder names that contain the no-break space character (Unicode NBSP, U+00A0) instead of a normal space character.

[~\"#%&\*\:<>\?\/\\{|}]

Find all file and folder names that contain characters which are invalid on SharePoint servers.

^\s+.*

Find all files and folders with a leading space.

\s+(\.[^.]+)$

Find files with an extension that have a trailing space at the end of their name.

.*\s+$

Find folders with a trailing space at the end of their name.
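To see how several of these expressions behave on concrete names, the following Python sketch runs a few of them against sample file names and paths. It assumes Python's re module purely for illustration, so results in TreeSize may differ in edge cases. For example, \s in Python also matches Unicode whitespace, and [^\P{C}] is left out because re does not support \P{...}. The labels, the sample names, and the helper matching_checks are hypothetical.

```python
import re

# Label -> pattern, taken from the examples above.
checks = {
    "E followed by exactly 7 digits": r"^E[0-9]{7}$",
    "non-ASCII character": r"[^\x00-\x7F]",
    "no-break space (U+00A0)": r"\xA0",
    "leading whitespace": r"^\s+.*",
    "whitespace before the extension": r"\s+(\.[^.]+)$",
    "folder depth 2 to 4": r"[A-Za-z]:\\([^\\]+\\){2,4}[^\\]+$",
}

# Hypothetical names and paths, used only for illustration.
samples = [
    "E1234567",
    "E123456",
    "résumé.pdf",
    "my\u00a0file.txt",
    " draft.txt",
    "report .docx",
    r"C:\Users\Public\readme.txt",
    r"C:\readme.txt",
]

def matching_checks(name):
    """Return the labels of all checks whose pattern occurs in name."""
    return [label for label, pattern in checks.items()
            if re.search(pattern, name)]

for name in samples:
    print(repr(name), "->", matching_checks(name))
```

A name can trigger more than one check: the name containing U+00A0 is reported both as containing a no-break space and as containing a non-ASCII character.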

 

Further information and additional examples can be found here.

A description of all special characters that can be used in regular expressions can be found here.

 

The following tools can assist in forming regular expressions:

https://regex101.com/ (online)

http://regexpal.com/ (online)

http://sourceforge.net/projects/regexpeditor/ (download)

http://sourceforge.net/projects/regextester/ (download)

http://sourceforge.net/projects/regaxe/ (download)