FAQ & Knowledge Base

Welcome to our Knowledge Base. Search or browse through the topics below to find answers to your questions.


For this use case, you can use the duplicate search, which is part of the file search integrated into TreeSize.

Open the duplicate search and select both the source directory and the destination directory as search paths.

In the "Duplicate Search" ribbon, choose "Unique files" and combine it with the (default) comparison method "File Content". When you then run the duplicate search, the result list shows only files that have no duplicate, instead of all duplicate files.

If no files are found, every file has a duplicate, which indicates that the copy operation worked correctly.
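
The same check can be sketched outside of TreeSize. The following Python example is only an illustration, not how TreeSize is implemented: it assumes MD5 as the content checksum, and the names `file_digest` and `unique_files` are hypothetical. It lists every file whose content occurs only once across the given directories.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, read in chunks."""
    md5 = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def unique_files(*roots: Path) -> list[Path]:
    """List files whose content occurs only once across all roots."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for root in roots:
        for path in root.rglob("*"):
            if path.is_file():
                by_digest[file_digest(path)].append(path)
    # A digest seen exactly once belongs to a file without a duplicate.
    return sorted(paths[0] for paths in by_digest.values() if len(paths) == 1)
```

Calling `unique_files(source_dir, destination_dir)` and getting an empty list corresponds to the "no files found" case described above. In this sketch, two identical files inside the same directory also count as duplicates of each other.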

To grant permissions on the respective site collections to be scanned via TreeSize/SpaceObServer, the PowerShell cmdlet Grant-PnPAzureADAppSitePermission can be used.

  1. Import-Module PnP.PowerShell
  2. Connect-PnPOnline -Url <SharePointSiteURL> -Interactive
  3. Grant-PnPAzureADAppSitePermission -AppId <AzureAppID> -DisplayName <AzureAppName> -Site <SharePointSiteURL> -Permissions Write

The account used to grant the permission must be a site collection administrator of the respective site collection. The 'Owner' role is not sufficient.

For more information about these cmdlets, see: https://pnp.github.io/powershell/cmdlets/

For files, the value of the file itself is used. For folders, TreeSize uses the highest value of all child elements and, optionally, the value of the folder itself (this setting is enabled by default).

If you do not want TreeSize to use the value of the folder itself, you can enable the option "Include only file dates in folder date calculation" in the "General" options.
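
The rule can be illustrated with a short Python sketch. This is not TreeSize code: the `Node` class, the `effective_date` function, and the `only_file_dates` flag are hypothetical names standing in for the option above, and returning `None` for a folder without any files is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Node:
    """Hypothetical file or folder entry; not a TreeSize data structure."""
    name: str
    modified: datetime
    is_folder: bool = False
    children: list["Node"] = field(default_factory=list)


def effective_date(node: Node, only_file_dates: bool = False) -> Optional[datetime]:
    """Files use their own date; folders use the highest date below them."""
    if not node.is_folder:
        return node.modified
    candidates = [
        date
        for child in node.children
        if (date := effective_date(child, only_file_dates)) is not None
    ]
    # By default the folder's own date takes part in the comparison as well.
    if not only_file_dates:
        candidates.append(node.modified)
    # Assumption: a folder without any files has no date in file-only mode.
    return max(candidates) if candidates else None
```

For a folder changed in 2024 that contains a single file from 2020, the sketch returns the 2024 date by default and the 2020 date when `only_file_dates=True`.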

We have a few pages in our manual that describe how to get TreeSize connected to SharePoint:

1. https://manuals.jam-software.de/treesize/EN/scan_targets.html (general description of all scan targets, including SharePoint)
2. https://manuals.jam-software.de/treesize/EN/scantarget.html (description of the dialog that is used to select the scan target)
3. https://manuals.jam-software.de/treesize/EN/azure-ad-configuration.html (describes multi-factor authentication (MFA/2FA))

In the scan dialog, "Server name" is the URL of your SharePoint site and "Path" is the optional subpath on that site that you want to scan. Enter your credentials in the "User name" and "Password" fields. Alternatively, you can log in using a certificate.

Unfortunately, this is not possible yet, but we plan to implement it in a future release. If you are interested, please consider voting for this feature request on our feature voting platform: https://jam-software.upvoty.com/b/treesize/wildcard-contents-search/

Unfortunately, the default IFilter that Windows uses to handle PDF files cannot provide the necessary information, so TreeSize is unable to read this data.

To fix this, please install a different IFilter. For example, you can use the free version of PDF-XChange Editor, which comes with an IFilter that provides everything TreeSize needs to read all attributes from PDF files.

Select the scan root in the left panel (directory tree), right-click it, and choose "Export Data" -> "Copy list of files" to copy a list of all files inside the scan path to the clipboard.

To do this, open the advanced search or the duplicate search and right-click an empty space in the search configuration area, where you normally add filters. Then select "Import from file" to load a previously saved filter configuration, or "Export to file" to save the current one to a file.

TreeSize allows importing paths from a .csv or a .txt file. To do so, first expand the search path list by clicking the arrow next to it or by clicking an empty space inside it. Then right-click inside the expanded path list and select "Import paths from file".
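
If the path list is generated by a script, a file for this import could be prepared as in the following Python sketch. The exact file layout TreeSize expects is not described here, so the sketch assumes a UTF-8 .txt file with one path per line. `write_path_list` and the example paths are hypothetical.

```python
from pathlib import Path
from typing import Iterable


def write_path_list(paths: Iterable[str], target: Path) -> int:
    """Write the given paths to a text file, one per line (assumed layout)."""
    cleaned = (path.strip() for path in paths)
    # Drop empty entries and repeated paths while keeping the original order.
    unique = list(dict.fromkeys(path for path in cleaned if path))
    target.write_text("\n".join(unique) + "\n", encoding="utf-8")
    return len(unique)
```

For example, `write_path_list([r"C:\Data", r"D:\Backup"], Path("paths.txt"))` writes two lines and returns 2.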

There are a few reasons why this can happen. The two most common ones are:

1. The duplicate files do not have the same MD5 checksum

2. There is an issue accessing the selected search path(s)

Regarding #1: The default comparison method of the duplicate search is "File Content", which means the program compares the MD5 checksum of each file to determine whether the files are identical. To find files whose checksums don't match, you can use "Name and size" as the comparison method, for example. With this method, only the name and size have to match, but it is less accurate than the default setting.

Regarding #2: If TreeSize can't access the search path, it can't find any duplicates, even if they exist and have the same MD5 checksum. This is indicated by an error at the bottom of the program window, which you can click on. It should then tell you that access to the path was denied.

In this case, please start the program as administrator, as this solves the issue in most cases.
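
The difference between the two comparison methods can be illustrated with a small Python sketch. It is an approximation for illustration only, not TreeSize's implementation, and `group_duplicates`, `content_key`, and `name_and_size_key` are hypothetical names. Files are grouped by a key, and every group with more than one member counts as a set of duplicates.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Callable, Hashable, Iterable


def group_duplicates(
    paths: Iterable[Path], key: Callable[[Path], Hashable]
) -> list[list[Path]]:
    """Group files by key and keep only groups with more than one file."""
    groups: dict[Hashable, list[Path]] = defaultdict(list)
    for path in paths:
        groups[key(path)].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]


def content_key(path: Path) -> str:
    """MD5 of the file content (reads the whole file; fine for a sketch)."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def name_and_size_key(path: Path) -> tuple[str, int]:
    """File name plus size in bytes; the content itself is not inspected."""
    return (path.name, path.stat().st_size)
```

Two files named `report.txt` with the same size but a single differing byte form a group under `name_and_size_key`, but not under `content_key`. This is why the name-and-size method finds more candidates and is less accurate.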
