Text Tree¶

Allow to import a complete tree of text files (.txt, .xml, .csv, .html, .rtf by default)
Author¶
Mathieu Mercapide, Augustin Maillefer & Olivier Cavaleri
Signals¶
Inputs: None
Outputs:
Text data
Segmentation covering the content of files filtered in a tree of files
Description¶
This widget is designed to import one, some or all the files contained in a selected folder. The output is a segmentation containing a segment for each imported file. Each segment has annotations with keys : folder name (depth_0), depth level (depth_1, depth_2, ...), file depth level, file encoding and confidence, file name and file path
The interface of Text Tree is available in two versions, according to whether or not the Advanced Settings checkbox is selected.
Basic interface¶
In its basic version (see figure 1 below), the Text Tree widget lets the user browse at once all the files contained in the selected folder.

Figure 1: Text Tree widget (basic interface).
The Options section allows the user to browse on his computer to find desired folder (Output segmentation label). Default filter : the files with extensions .txt, .html, .xml, .csv, .rtf will be opened.
The Info section indicates the number of segments (files) in the output segmentation, or the reasons why no segmentation is emitted (no title selected).
The Send button triggers the emission of a segmentation to the output connection(s). When it is selected, the Send automatically checkbox disables the button and the widget attempts to automatically emit a segmentation at every modification of its interface.
Advanced interface¶
The advanced version of Text Tree (see figure 2 below) offers the same functionality as the basic one, and it adds the possibility of filtering (include or exclude files by filenames) and execute a sampling (0 - 100 %).
Figure 2: Text Tree widget (advanced interface).
The Options section allows the user to browse on his computer to find the desired folders (the samme way as in the basic interface), but with the options to include or excludes types or names of files and to input a level of sampling in %. With the Add button the selection will be added to the list. Exclusion : exclusion in the files list from the basic browse (default filter). Each exclusion should be separated by a comma. Inclusion : inclusion of a new files filter (default filter replaced). Each inclusion should be separeted by a comma. The encoding’s annotation of a file with encoding’s error will simply be omitted. Sampling : the sampling operation selects the input proportion (rounded-up number) of files randomly. Ex : 50% of 9 files will select 5 random files from the file list.
The Info section, as well as the Send button and Send automatically, operate in the same way as in the basic interface.
Messages¶
Information¶
- <n> segments sent to output (<m> characters).
- This confirms that the widget has operated properly.
Warnings¶
- Settings were changed, please click ‘Send’ when ready.
- Settings have changed but the Send automatically checkbox has not been selected, so the user is prompted to click the Send button (or equivalently check the box) in order for computation and data emission to proceed.
- Please select one (or more) folders.
- The widget instance is not able to emit data to output because no folder has been selected.
Errors¶
No added errors